Semi-Riemannian Geometry With Applications to Relativity, 103 (Pure and Applied Mathematics)


Author: Barrett O'Neill


SEMI-RIEMANNIAN GEOMETRY WITH APPLICATIONS TO RELATIVITY

This is a volume in PURE AND APPLIED MATHEMATICS A Series of Monographs and Textbooks

Editors: SAMUEL EILENBERG AND HYMAN BASS

A complete list of titles in this series is available from the Publisher upon request.


BARRETT O’NEILL Department of Mathematics University of California Los Angeles, California

Academic Press
San Diego New York Boston London Sydney Tokyo Toronto

COPYRIGHT © 1983, BY ACADEMIC PRESS. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.

Academic Press, A Division of Harcourt Brace & Company, 525 B Street, Suite 1900, San Diego, California 92101-4495, http://www.apnet.com

United Kingdom Edition published by

ACADEMIC PRESS LIMITED, 24-28 Oval Road, London NW1 7DX

Library of Congress Cataloging in Publication Data

O'Neill, Barrett.
Semi-Riemannian geometry.

(Pure and applied mathematics ; ) Bibliography: p. 1. Geometry, Riemannian. 2. Manifolds (Mathematics) 3. Calculus of tensors. 4. Relativity (Physics) I. Title. II. Series: Pure and applied mathematics (Academic Press) ; QA3.P8 [QA649] 510s [516.3'73] 82-13917 ISBN 0-12-526740-1

Printed in the United States of America 98 99 00 01 BB 11 10 9 8 7

CONTENTS

Preface
Notation and Terminology

1. MANIFOLD THEORY
Smooth Manifolds
Smooth Mappings
Tangent Vectors
Differential Maps
Curves
Vector Fields
One-Forms
Submanifolds
Immersions and Submersions
Topology of Manifolds
Some Special Manifolds
Integral Curves

2. TENSORS
Basic Algebra
Tensor Fields
Interpretations
Tensors at a Point
Tensor Components
Contraction
Covariant Tensors
Tensor Derivations
Symmetric Bilinear Forms
Scalar Products

3. SEMI-RIEMANNIAN MANIFOLDS
Isometries
The Levi-Civita Connection
Parallel Translation
Geodesics
The Exponential Map
Curvature
Sectional Curvature
Semi-Riemannian Surfaces
Type-Changing and Metric Contraction
Frame Fields
Some Differential Operators
Ricci and Scalar Curvature
Semi-Riemannian Product Manifolds
Local Isometries
Levels of Structure

4. SEMI-RIEMANNIAN SUBMANIFOLDS
Tangents and Normals
The Induced Connection
Geodesics in Submanifolds
Totally Geodesic Submanifolds
Semi-Riemannian Hypersurfaces
Hyperquadrics
The Codazzi Equation
Totally Umbilic Hypersurfaces
The Normal Connection
A Congruence Theorem
Isometric Immersions
Two-Parameter Maps

5. RIEMANNIAN AND LORENTZ GEOMETRY
The Gauss Lemma
Convex Open Sets
Arc Length
Riemannian Distance
Riemannian Completeness
Lorentz Causal Character
Timecones
Local Lorentz Geometry
Geodesics in Hyperquadrics
Geodesics in Surfaces
Completeness and Extendibility

6. SPECIAL RELATIVITY
Newtonian Space and Time
Newtonian Space-Time
Minkowski Spacetime
Minkowski Geometry
Particles Observed
Some Relativistic Effects
Lorentz-Fitzgerald Contraction
Energy-Momentum
Collisions
An Accelerating Observer

7. CONSTRUCTIONS
Deck Transformations
Orbit Manifolds
Orientability
Semi-Riemannian Coverings
Lorentz Time-Orientability
Volume Elements
Vector Bundles
Local Isometries
Matched Coverings
Warped Products
Warped Product Geodesics
Curvature of Warped Products
Semi-Riemannian Submersions

8. SYMMETRY AND CONSTANT CURVATURE
Jacobi Fields
Tidal Forces
Locally Symmetric Manifolds
Isometries of Normal Neighborhoods
Symmetric Spaces
Simply Connected Space Forms
Transvections

9. ISOMETRIES
Semiorthogonal Groups
Some Isometry Groups
Time-Orientability and Space-Orientability
Linear Algebra
Space Forms
Killing Vector Fields
The Lie Algebra i(M)
I(M) as Lie Group
Homogeneous Spaces

10. CALCULUS OF VARIATIONS
First Variation
Second Variation
The Index Form
Conjugate Points
Local Minima and Maxima
Some Global Consequences
The Endmanifold Case
Focal Points
Applications
Variation of E
Focal Points along Null Geodesics
A Causality Theorem

11. HOMOGENEOUS AND SYMMETRIC SPACES
More about Lie Groups
Bi-Invariant Metrics
Coset Manifolds
Reductive Homogeneous Spaces
Symmetric Spaces
Riemannian Symmetric Spaces
Duality
Some Complex Geometry

12. GENERAL RELATIVITY; COSMOLOGY
Foundations
The Einstein Equation
Perfect Fluids
Robertson-Walker Spacetimes
The Robertson-Walker Flow
Robertson-Walker Cosmology
Friedmann Models
Geodesics and Redshift
Observer Fields
Static Spacetimes

13. SCHWARZSCHILD GEOMETRY
Building the Model
Geometry of N and B
Schwarzschild Observers
Schwarzschild Geodesics
Free Fall Orbits
Perihelion Advance
Lightlike Orbits
Stellar Collapse
The Kruskal Plane
Kruskal Spacetime
Black Holes
Kruskal Geodesics

14. CAUSALITY IN LORENTZ MANIFOLDS
Causality Relations
Quasi-Limits
Causality Conditions
Time Separation
Achronal Sets
Cauchy Hypersurfaces
Warped Products
Cauchy Developments
Spacelike Hypersurfaces
Cauchy Horizons
Hawking's Singularity Theorem
Penrose's Singularity Theorem

APPENDIX A. FUNDAMENTAL GROUPS AND COVERING MANIFOLDS

APPENDIX B. LIE GROUPS
Lie Algebras
Lie Exponential Map
The Classical Groups

APPENDIX C. NEWTONIAN GRAVITATION

References

Index


PREFACE

This book is an exposition of semi-Riemannian geometry (also called pseudo-Riemannian geometry): the study of a smooth manifold furnished with a metric tensor of arbitrary signature. The principal special cases are Riemannian geometry, where the metric is positive definite, and Lorentz geometry. For many years these two geometries have developed almost independently: Riemannian geometry reformulated in coordinate-free fashion and directed toward global problems, Lorentz geometry in classical tensor notation devoted to general relativity. More recently, this divergence has been reversed as physicists, turning increasingly toward invariant methods, have produced results of compelling mathematical interest.

After establishing the requisite language of manifolds and tensors (Chapters 1 and 2), the plan of the book is to develop the foundations of semi-Riemannian geometry in the simplest way and without regard to signature, allowing the Riemannian and Lorentz cases to appear as needed (Chapters 3-5 and 7). Then in the latter half of the book two threads are followed. One uses the notion of isometry to develop algebraic aspects of semi-Riemannian geometry: manifolds of constant curvature, symmetric spaces, and homogeneous spaces (Chapters 8, 9, and 11); the introductions to these chapters will give a more detailed description of their contents. The other thread applies Lorentz geometry to special and general relativity (Chapters 6, 12, and 13). The fact that relativity theory is expressed in terms of Lorentz geometry is lucky for geometers, who can thus penetrate surprisingly quickly into cosmology (redshift, expanding universe, and big bang) and, a topic no less interesting geometrically, the gravitation of a single star (perihelion precession, bending of light, and black holes). The tendency of the spacetimes in Chapters 12 and 13 to have singularities (big bang and black holes) is accounted for in abstract Lorentz terms by two theorems, due respectively to S. W. Hawking and R. Penrose; these are the goals of Chapter 14.

The general approach of the book is coordinate-free; however, coordinates are not neglected. Typically, geometric objects are defined invariantly and then described in terms of coordinates. In particular, the definition of a tensor I have adopted converts almost automatically into the classical coordinate formulation. A number of key proofs are given in classical notation. This attitude is only reasonable in view of the vast literature in each style.

The basic prerequisites for the book are modest: a good working knowledge of multivariable differential calculus, a firm belief in the existence and uniqueness theorems of ordinary differential equations, and an acquaintance with the fundamentals of point set topology and algebra. Later on, a knowledge of fundamental groups, covering spaces, and Lie groups is required; the necessary background in these topics is outlined briefly in Appendixes A and B. A college course in physics (particularly Newtonian mechanics) is required, not to read this book, but to appreciate the transformation and unification of Newtonian concepts effected by Einstein's relativistic geometry and the remarkable way the old and new theories, so different at base, reach approximate agreement on, say, the running of the solar system (Appendix C versus Chapter 13).

In the early chapters (1-5 and 7) the logical ordering is fairly strict. Thereafter the two branches, 8, 9, 11 and 6, 12, 13, are almost independent. (Chapters 12 and 13 require only an occasional reference to Chapters 9 and perhaps 8.) Chapter 10 is used in Chapters 11 and 14. Otherwise Chapter 14, though strongly motivated by Chapters 12 and 13, depends logically on only the early chapters.

Following each chapter are a number of exercises; these are meant to be workable without undue strain. In each chapter a single sequence of numbers designates collectively the theorems, lemmas, examples, and so on. For instance, Lemma 5.12 is the twelfth designated item in Chapter 5, not the twelfth lemma. Within a given chapter, the chapter number is omitted. Initials in square brackets, e.g., [SW], direct the reader to the References.

It is a pleasure to express my gratitude to the authors of the following brilliant and very different books: S. W. Hawking and G. F. R. Ellis, The Large Scale Structure of Space-time; C. W. Misner, K. S. Thorne, and J. A. Wheeler, Gravitation; R. K. Sachs and H. Wu, General Relativity for Mathematicians.

NOTATION AND TERMINOLOGY

The following notations are among the most frequently used throughout the book:

$M, N$: manifolds
$p, q$: points
$f, g, h$: real-valued functions
$\alpha, \beta, \gamma$: curves
$v, w$: vectors
$V, W, X, Y$: vector fields
$\phi, \psi$: mappings
$\mathcal{U}, \mathcal{V}$: open sets
$\xi = (x^1, \ldots, x^n)$: coordinate system

$\mathbf{R}$ is the real number field, $I$ denotes an open interval in $\mathbf{R}$, and, for example, $[a, b) = \{ r \in \mathbf{R} : a \le r < b \}$. The identity map is id; $\phi \circ \psi$ is the composite mapping that sends $p$ to $\phi(\psi p)$. See Appendix B for Lie group notation such as $GL(n, \mathbf{R})$.

A mapping $\phi\colon M \to N$ is one-to-one (injective) if $p \ne q$ implies $\phi p \ne \phi q$. The image of $\phi$ is $\{ \phi p : p \in M \} \subset N$, and $\phi$ is onto (surjective) if image $\phi = N$. (Inclusion $B \subset N$ does not exclude equality $B = N$.) If $B \subset N$ then $\phi^{-1}(B) = \{ p \in M : \phi p \in B \}$, and when $\phi$ is one-to-one and onto, $\phi^{-1}$ also denotes the inverse mapping of $\phi$. If $\pi \circ \tilde\phi = \phi$, then $\tilde\phi$ is called a lift of $\phi$ through $\pi$. A lift of the identity map is called a cross section (or merely a section).

A linear isomorphism of vector spaces is a linear transformation that is one-to-one and onto, hence is invertible. A subset $A$ of a topological space has closure $\bar{A}$, interior int $A$, and boundary bd $A$.


1. MANIFOLD THEORY

Generally speaking, a manifold is a topological space that locally resembles Euclidean space. A smooth manifold is a manifold $M$ for which this resemblance is sharp enough to permit the establishment of partial differentiation (in fact, all the essential features of calculus) on $M$. Smooth manifolds are thus the natural setting for "calculus in the large."

SMOOTH MANIFOLDS

Euclidean n-space $\mathbf{R}^n$ is the set of all n-tuples $p = (p_1, \ldots, p_n)$ of real numbers. We assume in particular a familiarity with its structure as a vector space and as a topological space. The natural inner product of $\mathbf{R}^n$ is the dot product $p \cdot q = \sum p_i q_i$, with norm $|p| = \sqrt{p \cdot p}$. The resulting metric $d(p, q) = |p - q|$ is compatible with the topology of $\mathbf{R}^n$.

A real-valued function $f$ defined on an open set $\mathcal{U}$ of $\mathbf{R}^n$ is smooth (or equivalently, $C^\infty$) provided all mixed partial derivatives of $f$, of all orders, exist and are continuous at every point of $\mathcal{U}$. For $1 \le i \le n$, let $u^i\colon \mathbf{R}^n \to \mathbf{R}$ be the function that sends each point $p = (p_1, \ldots, p_n)$ to its ith coordinate $p_i$. Then $u^1, \ldots, u^n$ are the natural coordinate functions of $\mathbf{R}^n$. A function $\phi$ from an open set $\mathcal{U}$ of $\mathbf{R}^n$ to $\mathbf{R}^m$ is smooth provided each real-valued function $u^i \circ \phi$ is smooth ($1 \le i \le m$).

We can now make precise the resemblance to Euclidean space mentioned above. A coordinate system (or chart) in a topological space $S$ is a homeomorphism $\xi$ of an open set $\mathcal{U}$ of $S$ onto an open set $\xi(\mathcal{U})$ of $\mathbf{R}^n$. If we write, for each $p \in \mathcal{U}$,

$$\xi(p) = (x^1(p), \ldots, x^n(p)),$$

the resulting functions $x^1, \ldots, x^n$ are called the coordinate functions of $\xi$. Thus

$$\xi = (x^1, \ldots, x^n)\colon \mathcal{U} \to \mathbf{R}^n.$$

Here we call $n$ the dimension of $\xi$. Note the identity $u^i \circ \xi = x^i$.

Two n-dimensional coordinate systems $\xi$ and $\eta$ in $S$ overlap smoothly provided the functions $\xi \circ \eta^{-1}$ and $\eta \circ \xi^{-1}$ are both smooth. Explicitly, if $\xi\colon \mathcal{U} \to \mathbf{R}^n$ and $\eta\colon \mathcal{V} \to \mathbf{R}^n$, then $\eta \circ \xi^{-1}$ is defined on the open set $\xi(\mathcal{U} \cap \mathcal{V})$ and carries it to $\eta(\mathcal{U} \cap \mathcal{V})$, while its inverse function $\xi \circ \eta^{-1}$ runs in the opposite direction (see Figure 1). These functions are then required to be smooth in the usual Euclidean sense defined above. This condition is considered to hold trivially if $\mathcal{U}$ and $\mathcal{V}$ do not meet.

Figure 1

1. Definition. An atlas $\mathcal{A}$ of dimension $n$ on a space $S$ is a collection of n-dimensional coordinate systems in $S$ such that

(A1) each point of $S$ is contained in the domain of some coordinate system in $\mathcal{A}$, and
(A2) any two coordinate systems in $\mathcal{A}$ overlap smoothly.

An atlas on $S$ makes it possible to do calculus consistently on all of $S$. But different atlases may produce the same calculus, a technical difficulty eliminated as follows. Call an atlas $\mathcal{C}$ on $S$ complete if $\mathcal{C}$ contains each coordinate system in $S$ that overlaps smoothly with every coordinate system in $\mathcal{C}$.

2. Lemma. Each atlas $\mathcal{A}$ on $S$ is contained in a unique complete atlas.

Proof. If $\mathcal{A}$ has dimension $n$, let $\mathcal{A}'$ be the set of all n-dimensional coordinate systems in $S$ that overlap smoothly with every one contained in $\mathcal{A}$.

(a) $\mathcal{A}'$ is an atlas (of the same dimension as $\mathcal{A}$). Since (A1) is obvious, consider (A2). If $\eta_1, \eta_2 \in \mathcal{A}'$, then by symmetry we need only prove that the function $\eta_1 \circ \eta_2^{-1}$ is Euclidean smooth. For any point $p \in \mathbf{R}^n$ in its domain, choose a $\xi \in \mathcal{A}$ whose domain contains $\eta_2^{-1}(p)$. As a composition of smooth functions, $(\eta_1 \circ \xi^{-1}) \circ (\xi \circ \eta_2^{-1})$ is smooth. Since this function equals $\eta_1 \circ \eta_2^{-1}$ on a neighborhood of $p$, the latter is smooth on that neighborhood. Smoothness being a local property, (a) follows.

(b) $\mathcal{A}'$ is complete. If a coordinate system $\zeta$ in $S$ overlaps smoothly with every element of $\mathcal{A}' \supset \mathcal{A}$, then by definition $\zeta \in \mathcal{A}'$.

(c) $\mathcal{A}'$ is the unique complete atlas containing $\mathcal{A}$. If $\mathcal{C}$ is another, then since $\mathcal{C}$ contains $\mathcal{A}$, (A2) guarantees that $\mathcal{C} \subset \mathcal{A}'$. But then (A2) for $\mathcal{A}'$, together with the completeness of $\mathcal{C}$, implies $\mathcal{A}' \subset \mathcal{C}$.

3. Definition. A smooth manifold $M$ is a Hausdorff space furnished with a complete atlas.

There are many variants of the notion of manifold, but for us manifold will mean smooth manifold as above. Any atlas $\mathcal{A}$ on a Hausdorff space makes it a manifold, since we agree always to use the unique complete atlas containing $\mathcal{A}$ to fulfill Definition 3. The dimension $n = \dim M$ of a manifold $M$ is the dimension of its atlas, and is often indicated by the notation $M^n$.

A coordinate system $\xi$ in a manifold $M$ is a coordinate system belonging to the complete atlas of $M$. If the domain $\mathcal{U}$ of $\xi$ contains the point $p \in M$, then $\xi$ is called a coordinate system at $p$ and $\mathcal{U}$ a coordinate neighborhood of $p$. If $\xi$ is a coordinate system in $M$ and $\mathcal{V}$ is an open set contained in the domain of $\xi$, then by completeness $\xi \mid \mathcal{V}$ is also a coordinate system in $M$.

4. Examples of Manifolds.

(1) The identity map $(u^1, \ldots, u^n)$ of $\mathbf{R}^n$, by itself, is an atlas. From now on, $\mathbf{R}^n$ will denote the resulting n-dimensional manifold, called Euclidean n-space.

(2) The sphere $S^n$. Let $S^n$ be the subspace $\{ a \in \mathbf{R}^{n+1} : |a| = 1 \}$ of $\mathbf{R}^{n+1}$. For each $1 \le i \le n + 1$, let $\mathcal{U}_i^+$ [$\mathcal{U}_i^-$] be the open hemisphere consisting of all points $a$ with $a_i > 0$ [$a_i < 0$]. The restriction to $\mathcal{U}_i^+$ or $\mathcal{U}_i^-$ of the coordinate functions $u^1, \ldots, u^{i-1}, u^{i+1}, \ldots, u^{n+1}$ gives a coordinate system in the space $S^n$. It is easy to check that the $2(n + 1)$ coordinate systems gotten in this way constitute an atlas on $S^n$ making it an n-dimensional manifold.

(3) A two-dimensional manifold is often called a surface, and generally speaking, the objects called surfaces in elementary calculus (torus, cylinder, paraboloid, etc.) are two-dimensional manifolds.
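As a rough numerical illustration of Example (2), the following Python sketch builds two of the hemisphere charts on $S^2$ and evaluates their overlap map. The names `hemisphere_chart` and `overlap`, the use of tuples for points, and the zero-based index `i` are assumptions made for this illustration; none of them come from the text.

```python
import math

def hemisphere_chart(i, sign):
    """Hypothetical helper: chart on the open hemisphere {a in S^n : sign * a[i] > 0}.

    The index i is zero-based here, whereas the text numbers coordinates from 1.
    Returns the chart (drop coordinate i) and its inverse.
    """
    def chart(a):
        assert sign * a[i] > 0, "point is outside this hemisphere"
        return tuple(a[:i]) + tuple(a[i + 1:])

    def inverse(x):
        # Recover the dropped coordinate from the condition |a| = 1.
        r2 = sum(c * c for c in x)
        assert r2 < 1, "point is outside the open unit ball"
        missing = sign * math.sqrt(1.0 - r2)
        return tuple(x[:i]) + (missing,) + tuple(x[i:])

    return chart, inverse

# Two charts on S^2: xi on {a_3 > 0}, eta on {a_1 > 0}.
xi, xi_inv = hemisphere_chart(2, +1)
eta, eta_inv = hemisphere_chart(0, +1)

def overlap(x):
    """The overlap map eta o xi^{-1}, defined where the two hemispheres meet."""
    return eta(xi_inv(x))

point = (0.6, 0.0)          # xi-coordinates of the point (0.6, 0.0, 0.8) on S^2
image = overlap(point)      # approximately (0.0, 0.8)
back = xi(eta_inv(image))   # approximately the original xi-coordinates
```

On the overlap, $\eta \circ \xi^{-1}$ sends $(x, y)$ to $(y, \sqrt{1 - x^2 - y^2})$, which is smooth because $x^2 + y^2 < 1$ there; the sketch only evaluates this map at one point and does not prove smoothness.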


We now consider two simple ways to get new manifolds from old. Let $\mathcal{U}$ be an open set in a manifold $M$. Let $\mathcal{A}'$ be the set of all coordinate systems $\xi$ in $M$ such that the domain of $\xi$ is contained in $\mathcal{U}$. By the remark preceding Example 4 these domains cover $\mathcal{U}$. Hence $\mathcal{A}'$ is an atlas on $\mathcal{U}$, making it a manifold called an open submanifold of $M$. Open sets of a manifold will always be considered to be open submanifolds.

If $M^m$ and $N^n$ are manifolds, let

$$\xi = (x^1, \ldots, x^m)\colon \mathcal{U} \to \mathbf{R}^m \quad\text{and}\quad \eta = (y^1, \ldots, y^n)\colon \mathcal{V} \to \mathbf{R}^n$$

be coordinate systems in $M$ and $N$, respectively. The product function $\xi \times \eta\colon \mathcal{U} \times \mathcal{V} \to \mathbf{R}^{m+n}$ is defined by

$$(\xi \times \eta)(p, q) = (x^1(p), \ldots, x^m(p), y^1(q), \ldots, y^n(q)).$$

Evidently $\xi \times \eta$ is a coordinate system in the Hausdorff space $M \times N$, and it is easy to see that any two such product coordinate systems in $M \times N$ overlap smoothly.

5. Lemma. If $M$ and $N$ are manifolds, then the set of all product coordinate systems in $M \times N$ is an atlas on $M \times N$ making it the product manifold of $M$ and $N$.

The dimension of $M \times N$ is $\dim M + \dim N$. This construction extends in an obvious way to the product of any finite number of manifolds. Indeed Euclidean space $\mathbf{R}^n$, as in Example 4, is exactly the product manifold $\mathbf{R}^1 \times \cdots \times \mathbf{R}^1$ (n factors).

SMOOTH MAPPINGS

Consider first the special case of a real-valued function $f$ on a manifold $M$. If $\xi\colon \mathcal{U} \to \mathbf{R}^n$ is a coordinate system in $M$, then the composite function $f \circ \xi^{-1}\colon \xi(\mathcal{U}) \to \mathbf{R}^1$ is called the coordinate expression for $f$ in terms of $\xi$. In fact,

$$f = (f \circ \xi^{-1})(x^1, \ldots, x^n) \quad\text{on } \mathcal{U}.$$

(Compare, from elementary calculus, expressing a function $f = f(x, y)$ in terms of polar coordinates.) It is natural then to define a function $f\colon M \to \mathbf{R}$ to be smooth provided that for every coordinate system $\xi$ in $M$ the coordinate expression $f \circ \xi^{-1}$ is smooth in the usual Euclidean sense.

Let $\mathfrak{F}(M)$ be the set of all smooth real-valued functions on $M$. If $f$ and $g$ are smooth functions on $M$, so are their sum $f + g$ and product $fg$. The usual algebraic rules hold for these two operations, making $\mathfrak{F}(M)$ a commutative ring. Multiplicative inverses do not exist in general, but if $f \in \mathfrak{F}(M)$ is never zero, then $1/f \in \mathfrak{F}(M)$.

The notion of smoothness extends from a real-valued function to an arbitrary mapping of manifolds using the same idea: that coordinate expressions must be Euclidean smooth.


6. Definition. Let $M^m$ and $N^n$ be manifolds. A mapping $\phi\colon M \to N$ is smooth provided that for every coordinate system $\xi$ in $M$ and $\eta$ in $N$ the coordinate expression $\eta \circ \phi \circ \xi^{-1}$ is Euclidean smooth (and defined on an open set of $\mathbf{R}^m$).

Explicitly, if $\mathcal{U}$ and $\mathcal{V}$ are the domains of $\xi$ and $\eta$, then for all $p \in \mathcal{U} \cap \phi^{-1}(\mathcal{V})$ the coordinates $y^j(\phi p)$, $1 \le j \le n$, depend smoothly on the coordinates $x^1(p), \ldots, x^m(p)$.

Comments.

(1) It suffices to check the smoothness condition for sufficiently many coordinate systems to cover $M$ and $N$; smooth overlap then takes care of the rest.

(2) A Euclidean smooth mapping $\phi$ from an open set $\mathcal{U} \subset \mathbf{R}^m$ to $\mathbf{R}^n$ is smooth in the above sense, since $\phi$ is its own coordinate expression relative to the identity coordinate systems on $\mathcal{U}$ and $\mathbf{R}^n$. Similarly this definition agrees with the earlier one in the case $f\colon M \to \mathbf{R}$.

(3) The identity map of a manifold is smooth; any composition of smooth mappings is smooth. (Once we have built up a supply of smooth mappings the latter will provide the easiest way to prove smoothness.)

(4) Coordinate systems $\xi$ and coordinate functions $x^i$ are smooth maps on the domain of $\xi$.

(5) Smoothness is a local property. Explicitly, define $\phi\colon M \to N$ to be smooth at $p \in M$ provided the restriction of $\phi$ to some neighborhood of $p$ is smooth. Then evidently $\phi$ is smooth if and only if it is smooth at every point of $M$.

(6) Smooth mappings are continuous.

The following consequence of (5) will be used frequently. For each index $\alpha \in A$ let $\mathcal{U}_\alpha$ be an open set in a manifold $M$ and let $\phi_\alpha\colon \mathcal{U}_\alpha \to N$ be a smooth mapping. If for all $\alpha, \beta \in A$,

$$\phi_\alpha = \phi_\beta \quad\text{on } \mathcal{U}_\alpha \cap \mathcal{U}_\beta,$$

then these mappings combine to give a single mapping $\phi\colon \bigcup_{\alpha \in A} \mathcal{U}_\alpha \to N$ such that $\phi \mid \mathcal{U}_\alpha = \phi_\alpha$ for all $\alpha \in A$. Since smoothness is a local property, $\phi$ is smooth.

7. Definition. A diffeomorphism $\phi\colon M \to N$ is a smooth mapping that has an inverse mapping which is also smooth.

Identity maps of manifolds, compositions of diffeomorphisms, and inverses of diffeomorphisms are all diffeomorphisms. If there exists a diffeomorphism $\phi$ from $M$ to $N$, then $M$ and $N$ are said to be diffeomorphic under $\phi$. For example, any open interval $(a, b)$ in $\mathbf{R}^1$ is diffeomorphic to $(-1, 1)$ under a suitable linear map, and $(-1, 1)$ is diffeomorphic to $\mathbf{R}^1$ under $\phi(t) = t/(1 - t^2)$. Manifold theory can be defined as the study of those objects preserved by diffeomorphisms; thus diffeomorphic manifolds are the same from the viewpoint of manifold theory.
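The example $\phi(t) = t/(1 - t^2)$ can be checked numerically. The Python sketch below is illustrative only: the function names are hypothetical, and the closed form used for the inverse is derived here (by solving $s t^2 + t - s = 0$ for the root in $(-1, 1)$) rather than taken from the text.

```python
import math

def phi(t):
    """phi(t) = t / (1 - t^2), defined on the open interval (-1, 1)."""
    assert -1.0 < t < 1.0
    return t / (1.0 - t * t)

def phi_inverse(s):
    """Root of s*t^2 + t - s = 0 lying in (-1, 1).

    Written as 2s / (1 + sqrt(1 + 4 s^2)), which is algebraically equal to
    (-1 + sqrt(1 + 4 s^2)) / (2s) but also valid at s = 0.
    """
    return 2.0 * s / (1.0 + math.sqrt(1.0 + 4.0 * s * s))

def phi_prime(t):
    """phi'(t) = (1 + t^2) / (1 - t^2)^2, which is positive on (-1, 1)."""
    return (1.0 + t * t) / (1.0 - t * t) ** 2

# Round trips in both directions at a few sample points.
samples_t = [-0.9, -0.5, 0.0, 0.3, 0.99]
round_trip_t = [phi_inverse(phi(t)) for t in samples_t]
samples_s = [-100.0, 0.0, 5.0]
round_trip_s = [phi(phi_inverse(s)) for s in samples_s]
```

Both $\phi$ and this inverse are smooth on their domains, which is what the definition requires; the round trips only confirm that the two formulas are mutually inverse at the sampled points.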


If $\phi$ is a one-to-one function from a set $C$ onto a manifold $M$, then evidently there is a unique way to make $C$ a manifold (that is, a unique topology and complete atlas on $C$) such that $\phi$ is a diffeomorphism.

Since smooth maps are continuous, a diffeomorphism is in particular a homeomorphism. However a smooth homeomorphism need not be a diffeomorphism, since its inverse may not be smooth. The canonical example is the mapping $t \to t^3$ on the real line.

Every coordinate system $\xi$ is a diffeomorphism from its domain $\mathcal{U}$ to $\xi(\mathcal{U}) \subset \mathbf{R}^n$. Conversely, every diffeomorphism $\phi$ from an open set $\mathcal{V} \subset M$ to $\phi(\mathcal{V}) \subset \mathbf{R}^n$ is a coordinate system of $M$. In fact, for any coordinate system $\xi$ the maps $\phi \circ \xi^{-1}$ and $\xi \circ \phi^{-1}$ are smooth, and since the atlas of $M$ is complete, $\phi$ is in it. Thus following a coordinate system by a diffeomorphism of $\mathbf{R}^n$ gives again a coordinate system. In particular, if $p \in M$ there always exist coordinate systems with $\xi(p) = 0 \in \mathbf{R}^n$.

The support supp $f$ of $f \in \mathfrak{F}(M)$ is the closure of the set $\{ p \in M : f(p) \ne 0 \}$. Thus $M - \text{supp } f$ is the largest open set on which $f$ is identically zero.

8. Lemma. Given any neighborhood $\mathcal{U}$ of a point $p$ in $M$ there is a function $f \in \mathfrak{F}(M)$, called a bump function at $p$, such that

(1) $0 \le f \le 1$ on $M$.
(2) $f = 1$ on some neighborhood of $p$.
(3) supp $f \subset \mathcal{U}$.

Proof. Start from the celebrated smooth function equal to $e^{-1/t^2}$ if $t > 0$, and zero elsewhere. Using elementary calculus one can then build, for any $\varepsilon > 0$, a smooth function $h$ on $\mathbf{R}^1$ such that $h(t) = 1$ for $t \le \varepsilon$, $h(t) = 0$ for $t \ge 2\varepsilon$, and $0 \le h \le 1$. Now let $\xi\colon \mathcal{V} \to \mathbf{R}^n$ be a coordinate system in $M$ with $\xi(p) = 0$ and $\mathcal{V} \subset \mathcal{U}$. If $\varepsilon > 0$ is sufficiently small, then $\xi(\mathcal{V})$ contains the neighborhood $\{ q \in \mathbf{R}^n : |q|^2 < 3\varepsilon \}$ of 0 in $\mathbf{R}^n$. Let $N = \sum (x^i)^2$ on $\mathcal{V}$. For $h$ as above, define $f$ to be $h \circ N$ on $\mathcal{V}$ and zero elsewhere. Then $f$ is smooth on $M$ and has the required properties.
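The proof leaves the construction of $h$ to "elementary calculus". One common choice, assumed here and not necessarily the one the author has in mind, is the quotient $h(t) = g(2\varepsilon - t) / (g(2\varepsilon - t) + g(t - \varepsilon))$, where $g$ is the function $e^{-1/t^2}$ cut off at zero. The Python sketch below implements that choice in coordinates; the names `g`, `make_h`, and `bump` are hypothetical.

```python
import math

def g(t):
    """The smooth function equal to exp(-1/t^2) for t > 0 and zero elsewhere."""
    return math.exp(-1.0 / (t * t)) if t > 0 else 0.0

def make_h(eps):
    """Return h with h = 1 for t <= eps, h = 0 for t >= 2*eps, and 0 <= h <= 1.

    Assumed construction: h(t) = a / (a + b) with a = g(2*eps - t), b = g(t - eps).
    Mathematically a + b > 0 for every t. In floating point both terms can
    underflow to 0.0 when eps is small, so this sketch is only used with eps = 1.
    """
    def h(t):
        a = g(2.0 * eps - t)   # positive exactly when t < 2*eps
        b = g(t - eps)         # positive exactly when t > eps
        return a / (a + b)
    return h

def bump(x, eps=1.0):
    """Coordinate expression of f = h o N, where N(x) is the sum of squares."""
    h = make_h(eps)
    return h(sum(c * c for c in x))

h = make_h(1.0)
inside = bump((0.5, 0.5))     # N = 0.5 <= eps, so the value is 1
outside = bump((1.5, 0.0))    # N = 2.25 >= 2*eps, so the value is 0
halfway = h(1.5)              # both terms equal g(0.5), so the value is 1/2
```

In the lemma this coordinate expression is transplanted to $M$ by composing with the coordinate functions $x^i$ and extending by zero outside $\mathcal{V}$.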


TANGENT VECTORS

The crucial step in generalizing calculus from $\mathbf{R}^n$ to an arbitrary manifold is the following elegant definition, which axiomatizes the directional derivative aspect of Euclidean tangent vectors.

9. Definition. Let $p$ be a point of a manifold $M$. A tangent vector to $M$ at $p$ is a real-valued function $v\colon \mathfrak{F}(M) \to \mathbf{R}$ that is

(1) $\mathbf{R}$-linear: $v(af + bg) = a\,v(f) + b\,v(g)$, and
(2) Leibnizian: $v(fg) = v(f)g(p) + f(p)v(g)$

for all $a, b \in \mathbf{R}$ and $f, g \in \mathfrak{F}(M)$.

At each point $p \in M$ let $T_p(M)$ be the set of all tangent vectors to $M$ at $p$. The usual definitions of functional addition and scalar multiplication make $T_p(M)$ a vector space over the real numbers $\mathbf{R}$. Explicitly,

$$(v + w)(f) = v(f) + w(f), \qquad (av)(f) = a\,v(f)$$

for all $f \in \mathfrak{F}(M)$, $a \in \mathbf{R}$, and $T_p(M)$ is called the tangent space to $M$ at $p$.

To define partial differentiation on a manifold, the scheme is to move the function $f$ back to Euclidean space using a coordinate system, and then take the usual partial derivatives.

10. Definition. Let $\xi = (x^1, \ldots, x^n)$ be a coordinate system in $M$ at $p$. If $f \in \mathfrak{F}(M)$, let

$$\frac{\partial f}{\partial x^i}(p) = \frac{\partial (f \circ \xi^{-1})}{\partial u^i}(\xi(p)) \qquad (1 \le i \le n),$$

where $u^1, \ldots, u^n$ are the natural coordinate functions of $\mathbf{R}^n$.

A straightforward computation then shows that the function

$$\partial_i|_p = \frac{\partial}{\partial x^i}\Big|_p\colon \mathfrak{F}(M) \to \mathbf{R}$$

sending each $f \in \mathfrak{F}(M)$ to $(\partial f/\partial x^i)(p)$ is a tangent vector to $M$ at $p$. We can picture $\partial_i|_p$ as an arrow at $p$ tangent to the $x^i$-coordinate curve through $p$.

11. Lemma. Let $v \in T_p(M)$. (1) If $f, g \in \mathfrak{F}(M)$ are equal on a neighborhood of $p$, then $v(f) = v(g)$. (2) If $h \in \mathfrak{F}(M)$ is constant on a neighborhood of $p$, then $v(h) = 0$.

Proof. (1) By linearity it suffices to show that if $f = 0$ on a neighborhood $\mathcal{U}$ of $p$, then $v(f) = 0$. Let $g$ be a bump function at $p$ with support in $\mathcal{U}$; then $fg = 0$ on all of $M$. But $v(0) = v(0 + 0) = v(0) + v(0)$ implies $v(0) = 0$. Thus

$$0 = v(fg) = v(f)g(p) + f(p)v(g) = v(f),$$

since $f(p) = 0$ and $g(p) = 1$.

(2) By (1) we can assume that $h$ has constant value $c$ on all of $M$. If 1 is the constant function of value 1, then

$$v(1) = v(1 \cdot 1) = v(1)\,1 + 1\,v(1) = 2v(1).$$

Hence $v(1) = 0$, and $v(h) = v(c \cdot 1) = c\,v(1) = 0$.


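The preceding lemma is one way to express the fact that tangent vectors are local objects. Another version is this: If $\mathcal{U}$ is an open set in $M$ then, as an open submanifold, $\mathcal{U}$ has tangent space $T_p(\mathcal{U})$ at $p \in \mathcal{U}$. If $v \in T_p(\mathcal{U})$ define

$$\tilde{v}(f) = v(f \mid \mathcal{U}) \quad\text{for all } f \in \mathfrak{F}(M).$$

Evidently $\tilde{v} \in T_p(M)$, and the function $v \to \tilde{v}$ is a linear isomorphism. We ignore this trivial isomorphism henceforth, writing $T_p(\mathcal{U}) = T_p(M)$.

The following result, called the basis theorem, is the fundamental link between coordinates and tangent vectors.

12. Theorem. If $\xi = (x^1, \ldots, x^n)$ is a coordinate system in $M$ at $p$, then its coordinate vectors $\partial_1|_p, \ldots, \partial_n|_p$ form a basis for the tangent space $T_p(M)$; and

$$v = \sum_{i=1}^{n} v(x^i)\,\partial_i|_p \quad\text{for all } v \in T_p(M).$$

Proof. By the preceding remarks we can work solely on the coordinate neighborhood $\mathcal{U}$ of $\xi$. Since $v(c) = 0$ there is no loss of generality in assuming $\xi(p) = 0 \in \mathbf{R}^n$. Shrinking $\mathcal{U}$ if necessary gives $\xi(\mathcal{U}) = \{ q \in \mathbf{R}^n : |q| < \varepsilon \}$ for some $\varepsilon$. If $g$ is a smooth function on $\xi(\mathcal{U})$ then for each $1 \le i \le n$ define

$$g_i(q) = \int_0^1 \frac{\partial g}{\partial u^i}(tq)\,dt \quad\text{for all } q \in \xi(\mathcal{U}).$$

It follows using the fundamental theorem of calculus that

$$g = g(0) + \sum g_i u^i \quad\text{on } \xi(\mathcal{U}).$$

Thus if $f \in \mathfrak{F}(M)$, setting $g = f \circ \xi^{-1}$ and $f_i = g_i \circ \xi$ yields

$$f = f(p) + \sum f_i x^i \quad\text{on } \mathcal{U}.$$

Applying $\partial/\partial x^i$ gives $f_i(p) = (\partial f/\partial x^i)(p)$. Thus applying the tangent vector $v$ to the formula gives

$$v(f) = 0 + \sum v(f_i)\,x^i(p) + \sum f_i(p)\,v(x^i) = \sum \frac{\partial f}{\partial x^i}(p)\,v(x^i),$$

where the middle sum vanishes because $x^i(p) = 0$. Since this holds for all $f \in \mathfrak{F}(M)$, the tangent vectors $v$ and $\sum v(x^i)\,\partial_i|_p$ are equal.

It remains to show that the coordinate vectors are linearly independent. But if $\sum a_i\,\partial_i|_p = 0$, then application to $x^j$ yields

$$0 = \sum_i a_i \frac{\partial x^j}{\partial x^i}(p) = \sum_i a_i \delta_{ij} = a_j.$$

In particular the (vector space) dimension of $T_p(M)$ is the same as the dimension of $M$.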

DIFFERENTIAL MAPS

The basic idea of differential calculus is to approximate smooth objects by linear objects. In the preceding section a manifold $M$ was approximated near each of its points $p$ by the tangent space $T_p(M)$. Now we approximate a smooth mapping $\phi\colon M \to N$ near each point $p \in M$ by a linear transformation of tangent spaces.

Note first that if $v \in T_p(M)$ then the function $v_\phi\colon \mathfrak{F}(N) \to \mathbf{R}$ sending each $g$ to $v(g \circ \phi)$ is a tangent vector to $N$ at $\phi(p)$. To prove the Leibnizian property, for example, let $f, g \in \mathfrak{F}(N)$; then

$$v_\phi(fg) = v((fg) \circ \phi) = v((f \circ \phi)(g \circ \phi)) = v(f \circ \phi)\,g(\phi p) + f(\phi p)\,v(g \circ \phi) = v_\phi(f)\,g(\phi p) + f(\phi p)\,v_\phi(g).$$

13. Definition. Let $\phi\colon M \to N$ be a smooth mapping. For each $p \in M$ the function

$$d\phi_p\colon T_p(M) \to T_{\phi p}(N)$$

sending $v$ to $v_\phi$ (as above) is called the differential map of $\phi$ at $p$ (see Figure 2).

Figure 2.

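Thus $d\phi_p$ is characterized by the equation

$$d\phi_p(v)(g) = v(g \circ \phi) \quad\text{for all } v \in T_p(M) \text{ and } g \in \mathfrak{F}(N).$$

It follows that differential maps are linear.

14. Lemma. Let $\phi\colon M^m \to N^n$ be a smooth mapping. If $\xi = (x^1, \ldots, x^m)$ is a coordinate system at $p$ in $M$, and $\eta = (y^1, \ldots, y^n)$ is a coordinate system at $\phi(p)$ in $N$, then

$$d\phi_p\Big(\frac{\partial}{\partial x^j}\Big|_p\Big) = \sum_{i=1}^{n} \frac{\partial (y^i \circ \phi)}{\partial x^j}(p)\,\frac{\partial}{\partial y^i}\Big|_{\phi p} \qquad (1 \le j \le m).$$

Proof. Let $w \in T_{\phi p}(N)$ be the left-hand side of this equation. By the basis theorem (12), $w = \sum w(y^i)\,\partial/\partial y^i|_{\phi p}$. But by the definition of a differential map,

$$w(y^i) = d\phi_p\Big(\frac{\partial}{\partial x^j}\Big|_p\Big)(y^i) = \frac{\partial (y^i \circ \phi)}{\partial x^j}(p).$$

Thus the matrix of $d\phi_p$ with respect to these coordinate bases is

$$\Big(\frac{\partial (y^i \circ \phi)}{\partial x^j}(p)\Big), \qquad 1 \le i \le n,\ 1 \le j \le m,$$

called the Jacobian matrix of $\phi$ at $p$ relative to $\xi$ and $\eta$. The classical chain-rule formula for the Jacobian matrix of a composite mapping follows immediately from the following expression for it in terms of differential maps.

15. Lemma. If $\phi\colon M \to N$ and $\psi\colon N \to P$ are smooth mappings, then for each $p \in M$,

$$d(\psi \circ \phi)_p = d\psi_{\phi p} \circ d\phi_p.$$

Proof. If $v \in T_p(M)$ and $g \in \mathfrak{F}(P)$, then

$$d(\psi \circ \phi)(v)(g) = v(g \circ \psi \circ \phi) = d\phi(v)(g \circ \psi) = (d\psi(d\phi(v)))(g).$$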
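In natural coordinates on Euclidean spaces, Lemma 15 becomes the statement that the Jacobian matrix of $\psi \circ \phi$ at $p$ is the product of the Jacobian of $\psi$ at $\phi(p)$ with the Jacobian of $\phi$ at $p$. The Python sketch below checks this numerically for two sample maps. Everything in it is an assumption made for illustration: the maps `phi` and `psi`, the helper names, and the use of central differences in place of exact derivatives.

```python
import math

def jacobian(F, p, step=1e-6):
    """Central-difference approximation of the Jacobian matrix (dF^i/du^j)(p).

    Rows are indexed by the components of F, columns by the coordinates of p.
    """
    columns = []
    for j in range(len(p)):
        forward, backward = list(p), list(p)
        forward[j] += step
        backward[j] -= step
        Ff, Fb = F(forward), F(backward)
        columns.append([(a - b) / (2.0 * step) for a, b in zip(Ff, Fb)])
    rows = len(columns[0])
    return [[columns[j][i] for j in range(len(p))] for i in range(rows)]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def phi(p):
    """Sample smooth map R^2 -> R^3 (hypothetical)."""
    x, y = p
    return [x * y, math.sin(x), x + y * y]

def psi(q):
    """Sample smooth map R^3 -> R^2 (hypothetical)."""
    a, b, c = q
    return [a + b * c, math.exp(a) * c]

def composite(p):
    return psi(phi(p))

p = [0.3, -0.7]
lhs = jacobian(composite, p)                            # 2 x 2 matrix
rhs = matmul(jacobian(psi, phi(p)), jacobian(phi, p))   # (2 x 3) times (3 x 2)
max_gap = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

The two matrices agree only up to finite-difference error, so `max_gap` is small but not zero.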

Henceforth we usually omit the subscript $p$ from $d\phi_p$. In terms of manifold theory the inverse function theorem can be stated as follows.

16. Theorem. Let $\phi\colon M \to N$ be a smooth mapping. The differential map $d\phi_p$ at a point $p \in M$ is a linear isomorphism if and only if there is a neighborhood $\mathcal{V}$ of $p$ in $M$ such that $\phi \mid \mathcal{V}$ is a diffeomorphism from $\mathcal{V}$ onto a neighborhood $\phi(\mathcal{V})$ of $\phi(p)$ in $N$.

For a proof, apply the classical theorem to a coordinate expression for $\phi$ near $p$. Because of this result a smooth map $\phi\colon M \to N$ such that every $d\phi_p$ is a linear isomorphism is called a local diffeomorphism. If $\phi$ is also one-to-one and onto, then it is a diffeomorphism.

CURVES

A curve in a manifold $M$ is a smooth mapping $\alpha\colon I \to M$, where $I$ is an open interval in the real line $\mathbf{R}^1$. (We allow $I$ to be half-infinite or all of $\mathbf{R}$.) As an open submanifold of $\mathbf{R}^1$, $I$ has a coordinate system consisting of the identity map $u$ of $I$. At each $t \in I$ we can picture the coordinate vector $(d/du)(t) \in T_t(\mathbf{R}^1)$ as the unit vector at $t$ in the positive $u$ direction.

17. Definition. Let a : I tEIis

+M

be a curve. The velocity uector of

tl

at

Curves 11

Intuitively, a’(t) is the vector rate of change of CI at t E I. We list some of its basic properties :

(1) Directional derivatives. By the definition of da, the tangent vector a’(r) applied to a function f E g ( M ) gives

Thus if a is any curve with say a’(0) = u, then u ( f ) = ( d ( f a)/dt)(O). (2) Coordinate expression. Let x l , . . . , x n be a coordinate system in M at a point a(t) of a. By the basis theorem and (1): 0

(3) Reparametrization. If S I : I -+ M is a curve and h : J -+ I is a smooth M is a curve called a reparafunction on an interval J , then /3 = a(h): J metrizarion of a. Then -+

/3’(s) = (dh/du)(s)cc’(h(s))

for all s E J .

This follows from the chain rule (Lemma 15), as does the following.

(4) Effect of a mapping. If $\alpha\colon I \to M$ is a curve in $M$, then a mapping $\phi\colon M \to N$ carries $\alpha$ to the curve $\phi \circ \alpha\colon I \to N$ in $N$. The differential map of $\phi$ preserves velocities, that is,

$$d\phi(\alpha'(t)) = (\phi \circ \alpha)'(t) \quad\text{for all } t \in I.$$

This is usually a more efficient way to get information about $d\phi$ than are coordinate computations as in Lemma 14.
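As a small worked illustration of properties (1) to (3), consider the unit circle $\alpha(t) = (\cos t, \sin t)$ in $\mathbf{R}^2$ with natural coordinates $x, y$. The curve, the test function $f = x^2 y$, and the reparametrizing function $h(s) = s^2$ are chosen here only for illustration and do not come from the text.

```latex
\begin{align*}
\alpha'(t) &= \frac{d(x\circ\alpha)}{dt}(t)\,\partial_x + \frac{d(y\circ\alpha)}{dt}(t)\,\partial_y
            = -\sin t\,\partial_x + \cos t\,\partial_y, \\
\alpha'(t)f &= \frac{d}{dt}\bigl(\cos^2 t\,\sin t\bigr)
            = -2\sin^2 t\,\cos t + \cos^3 t
            \qquad (f = x^2 y), \\
\beta'(s) &= 2s\,\alpha'(s^2)
            \qquad (\beta = \alpha\circ h,\ h(s) = s^2).
\end{align*}
```

The second line agrees with applying $-\sin t\,\partial_x + \cos t\,\partial_y$ directly to $f$ and evaluating at $\alpha(t)$. The third line shows that $\beta'(0) = 0$, so this particular reparametrization of a regular curve is not itself regular.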

A curve $\alpha$ is regular provided $\alpha'(t) \neq 0$ for all $t$. If $[a, b]$ is a closed interval in $\mathbf{R}^1$ then a curve segment $\alpha\colon [a, b] \to M$ is a function that has a smooth extension to an open interval containing $[a, b]$. Thus $\alpha'$ is well defined even at the endpoints $a$ and $b$.

A function $\beta\colon [a, b] \to M$ is a piecewise smooth curve segment provided there is a partition $a = t_0 < t_1 < \cdots < t_{k+1} = b$ of $[a, b]$ such that each $\beta|[t_i, t_{i+1}]$ is a curve segment. Thus $\beta$ may well have two velocity vectors at each break $t_i$. For an open interval $I$, $\beta\colon I \to M$ is piecewise smooth provided that for all $a < b$ in $I$ the restriction $\beta|[a, b]$ is piecewise smooth. (Thus the breaks have no cluster point in $I$.)

For no class of curves to be considered is the reparametrization $t \to \alpha(t + c)$ important. Thus to simplify notation we shall often assume, without explicit mention, that the domain of a curve contains the number 0.

VECTOR FIELDS

A vector field $V$ on a manifold $M$ is a function that assigns to each point $p \in M$ a tangent vector $V_p$ to $M$ at $p$. Intuitively, $V$ is a collection of arrows, one at each point of $M$. If $V$ is a vector field on $M$ and $f \in \mathfrak{F}(M)$, then $Vf$ denotes the real-valued function on $M$ given by

$$(Vf)(p) = V_p(f) \quad\text{for all } p \in M.$$

Then $V$ is smooth provided $Vf$ is smooth for all $f \in \mathfrak{F}(M)$. Vector fields on $M$ are added together, or multiplied by a function $f \in \mathfrak{F}(M)$, in the obvious way:

$$(V + W)_p = V_p + W_p, \qquad (fV)_p = f(p)V_p \quad\text{for all } p \in M.$$

If $V$ and $W$ are smooth, then the vector fields $V + W$ and $fV$ are also. These two operations make the set $\mathfrak{X}(M)$ of all smooth vector fields on $M$ a module over the ring $\mathfrak{F}(M)$. (The definition of module over a commutative ring with unit is formally the same as that of a vector space over a field.)

If $\xi = (x^1, \ldots, x^n)$ is a coordinate system on $\mathcal{U} \subset M$, then for each $1 \le i \le n$ the vector field $\partial_i$ on $\mathcal{U}$ sending each $p$ to $\partial_i|_p$ is called the $i$th coordinate vector field of $\xi$. These vector fields are smooth, since $\partial_i(f) = \partial f/\partial x^i$. It follows immediately from the basis theorem that for any vector field $V$

$$V = \sum_i V(x^i)\,\partial_i \quad\text{on } \mathcal{U}.$$

For low-dimensional manifolds index notation can be cumbersome. On the Euclidean plane, for example, we often denote the natural coordinate functions by $x, y$ as in elementary calculus; then the coordinate vector fields are denoted by $\partial_x, \partial_y$.

A derivation on $\mathfrak{F}(M)$ is a function $\mathcal{D}\colon \mathfrak{F}(M) \to \mathfrak{F}(M)$ that is

(1) $\mathbf{R}$-linear: $\mathcal{D}(af + bg) = a\mathcal{D}(f) + b\mathcal{D}(g)$ $(a, b \in \mathbf{R})$, and
(2) Leibnizian: $\mathcal{D}(fg) = \mathcal{D}(f)g + f\mathcal{D}(g)$.

The definition of tangent vector shows that for a vector field $V \in \mathfrak{X}(M)$ the function $f \to Vf$ is a derivation on $\mathfrak{F}(M)$. Conversely, every derivation $\mathcal{D}$ on $\mathfrak{F}(M)$ comes from a vector field. In fact, for each $p \in M$ define $V_p\colon \mathfrak{F}(M) \to \mathbf{R}$ by $V_p(f) = \mathcal{D}(f)(p)$. The derivation properties (1) and (2) above imply that $V_p$ is a tangent vector to $M$ at $p$; thus $V$ is a well-defined vector field on $M$. But $Vf = \mathcal{D}(f) \in \mathfrak{F}(M)$ for all $f \in \mathfrak{F}(M)$, so $V$ is smooth and determines the derivation $\mathcal{D}$.

Henceforth whenever convenient we will consider vector fields to be derivations on $\mathfrak{F}(M)$. This interpretation leads to a crucial operation on vector fields. If $V, W \in \mathfrak{X}(M)$ let $[V, W] = VW - WV$. This is the function from $\mathfrak{F}(M)$ to $\mathfrak{F}(M)$ sending each $f$ to $V(Wf) - W(Vf)$. An easy computation shows that $[V, W]$ is a derivation of $\mathfrak{F}(M)$, hence a smooth vector field on $M$. $[V, W]$ is called the bracket of $V$ and $W$. In terms of the original definition of vector field, $[V, W]$ assigns to each $p \in M$ the tangent vector $[V, W]_p$ such that $[V, W]_p(f) = V_p(Wf) - W_p(Vf)$.

18. Lemma. The bracket operation on $\mathfrak{X}(M)$ has the following properties:

(1) $\mathbf{R}$-bilinearity: $[aV + bW, X] = a[V, X] + b[W, X]$, $[X, aV + bW] = a[X, V] + b[X, W]$.
(2) Skew-symmetry: $[W, V] = -[V, W]$.
(3) Jacobi identity: $[X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0$.

Proof. These identities are purely formal, and hold for derivations on any linear algebra: (1) is immediate from the linearity of derivations, and (2) is obvious. For (3), substitute the definition of bracket; then the resulting twelve terms cancel in pairs. ■

The bracket operation on $\mathfrak{X}(M)$, though $\mathbf{R}$-bilinear, is not $\mathfrak{F}(M)$-bilinear. In fact,

$$[fV, gW] = fg[V, W] + f(Vg)W - g(Wf)V.$$
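The check on an arbitrary function $h$ can be written out as follows. This is only a sketch of the routine computation, using the Leibnizian property of $V$ and $W$ as derivations:

```latex
\begin{align*}
[fV, gW]h &= fV(g\,Wh) - gW(f\,Vh) \\
          &= f(Vg)(Wh) + fg\,V(Wh) - g(Wf)(Vh) - gf\,W(Vh) \\
          &= \bigl(fg[V, W] + f(Vg)W - g(Wf)V\bigr)h .
\end{align*}
```

The two second-order terms combine into $fg[V, W]h$, and the remaining first-order terms are exactly the correction to $\mathfrak{F}(M)$-bilinearity.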

To prove this, check that both sides have the same effect on any function $h \in \mathfrak{F}(M)$.

In two special cases, the bracket is always zero. For any $V \in \mathfrak{X}(M)$, $[V, V] = 0$. In fact, in the presence of $\mathbf{R}$-bilinearity this is equivalent to the skew-symmetry (2) above. For any two coordinate vector fields of the same coordinate system, $[\partial_i, \partial_j] = 0$, which is the bracket expression of $\partial^2 f/\partial x^i \partial x^j = \partial^2 f/\partial x^j \partial x^i$ for smooth functions $f$.

19. Example. Let $x, y$ be the natural coordinates of $\mathbf{R}^2$, and consider the vector fields $V = y\,\partial_y$ and $W = x\,\partial_y$. Then $[V, W] = -W$, since for all $f \in \mathfrak{F}(\mathbf{R}^2)$,

$$[V, W]f = [y\,\partial_y, x\,\partial_y]f = y\,\partial_y(x\,\partial_y f) - x\,\partial_y(y\,\partial_y f) = yx\frac{\partial^2 f}{\partial y^2} - x\frac{\partial f}{\partial y} - xy\frac{\partial^2 f}{\partial y^2} = -x\frac{\partial f}{\partial y} = -Wf.$$
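Bracket computations of this kind can also be checked mechanically. The following is a minimal sketch, assuming polynomial vector fields on $\mathbf{R}^2$ and a hypothetical dictionary representation of polynomials; none of the names below come from the text. It uses the fact that the $i$th component of $[V, W]$ is $V(W^i) - W(V^i)$, which follows from applying $[V, W]$ to the coordinate function $x^i$.

```python
# Polynomials in x, y are dicts mapping exponent pairs (i, j) to coefficients,
# so {(1, 0): 1} is x and {(0, 1): 1} is y.  (Illustrative representation only.)

def add(p, q, sign=1):
    """Return p + sign*q, dropping zero coefficients."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + sign * c
    return {m: c for m, c in out.items() if c != 0}

def mul(p, q):
    """Return the product of two polynomials."""
    out = {}
    for (i, j), c in p.items():
        for (k, l), d in q.items():
            key = (i + k, j + l)
            out[key] = out.get(key, 0) + c * d
    return {m: c for m, c in out.items() if c != 0}

def diff(p, var):
    """Partial derivative of p; var = 0 means d/dx, var = 1 means d/dy."""
    out = {}
    for mono, c in p.items():
        e = mono[var]
        if e > 0:
            lowered = list(mono)
            lowered[var] -= 1
            out[tuple(lowered)] = c * e
    return out

def apply_field(V, f):
    """V = (a, b) stands for a d/dx + b d/dy; return the function V f."""
    a, b = V
    return add(mul(a, diff(f, 0)), mul(b, diff(f, 1)))

def bracket(V, W):
    """Components of [V, W], computed as V(W^i) - W(V^i) for i = x, y."""
    return tuple(
        add(apply_field(V, W[i]), apply_field(W, V[i]), sign=-1)
        for i in range(2)
    )

X = {(1, 0): 1}   # the coordinate function x
Y = {(0, 1): 1}   # the coordinate function y
ZERO = {}

V = (ZERO, Y)     # V = y d/dy
W = (ZERO, X)     # W = x d/dy
VW = bracket(V, W)
```

With the fields of Example 19, `VW` has zero $\partial_x$ component and $\partial_y$ component $-x$, which is the statement $[V, W] = -W$.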

Such computations can be abbreviated by omitting the function $f$, and also the second derivatives (since these must cancel).

The differential map of $\phi\colon M \to N$ moves individual tangent vectors from $M$ to $N$, but in general provides no way to move vector fields from $M$ to $N$ (or the reverse). This difficulty can be overcome to some extent as follows.

20. Definition. Let $\phi\colon M \to N$ be a smooth mapping. Vector fields $X$ on $M$ and $Y$ on $N$ are $\phi$-related, written $X \sim_\phi Y$, provided $d\phi(X_p) = Y_{\phi p}$ for all $p \in M$.

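21. Lemma. Vector fields $X \in \mathfrak{X}(M)$ and $Y \in \mathfrak{X}(N)$ are $\phi$-related if and only if $X(g \circ \phi) = Yg \circ \phi$ for all $g \in \mathfrak{F}(N)$.

Proof. The following assertions are equivalent:

$X(g \circ \phi) = Yg \circ \phi$ for all $g \in \mathfrak{F}(N)$;
$X_p(g \circ \phi) = (Yg)(\phi p)$ for all $g \in \mathfrak{F}(N)$, $p \in M$;
$d\phi(X_p)(g) = Y_{\phi p}(g)$ for all $g \in \mathfrak{F}(N)$, $p \in M$;
$d\phi(X_p) = Y_{\phi p}$ for all $p \in M$. ■

22. Lemma. If $X_1 \sim_\phi Y_1$ and $X_2 \sim_\phi Y_2$, then $[X_1, X_2] \sim_\phi [Y_1, Y_2]$.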
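One way to see why the bracket is preserved is to apply the criterion of Lemma 21 twice. The following is a sketch of that argument, not a quotation of the book's proof; $g$ denotes an arbitrary function in $\mathfrak{F}(N)$.

```latex
\begin{align*}
X_1\bigl(X_2(g\circ\phi)\bigr) &= X_1\bigl((Y_2 g)\circ\phi\bigr) = \bigl(Y_1(Y_2 g)\bigr)\circ\phi, \\
X_2\bigl(X_1(g\circ\phi)\bigr) &= X_2\bigl((Y_1 g)\circ\phi\bigr) = \bigl(Y_2(Y_1 g)\bigr)\circ\phi, \\
[X_1, X_2](g\circ\phi) &= \bigl([Y_1, Y_2]\,g\bigr)\circ\phi .
\end{align*}
```

The last line is the difference of the first two, and by Lemma 21 it says exactly that $[X_1, X_2]$ and $[Y_1, Y_2]$ are $\phi$-related.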

Let $\phi\colon M \to N$ be a diffeomorphism. For each $X \in \mathfrak{X}(M)$ there is a unique vector field $d\phi(X) \in \mathfrak{X}(N)$ that is $\phi$-related to $X$. In fact, we have no choice but to define $(d\phi X)_q = d\phi(X_p)$ for all $q = \phi(p) \in N$. Then $d\phi(X)$ is smooth, since if $g \in \mathfrak{F}(N)$ the definition leads to the formula $(d\phi X)g = X(g \circ \phi) \circ \phi^{-1} \in \mathfrak{F}(N)$. We call $d\phi(X)$ the transferred vector field of $X$.

ONE-FORMS

The one-forms on a smooth manifold $M$ are the objects dual to vector fields. At a point $p$ of $M$ the dual space $T_p(M)^*$ of the tangent space $T_p(M)$ is called the cotangent space of $M$ at $p$. Elements of $T_p(M)^*$, sometimes called covectors, are linear maps of $T_p(M)$ into $\mathbf{R}$.

23. Definition. A one-form $\theta$ on a manifold $M$ is a function that assigns to each point $p$ an element $\theta_p$ of the cotangent space $T_p(M)^*$.

Thus $\theta$ assigns a number to every tangent vector and is linear on the tangent vectors at each point. If $\theta$ is a one-form on $M$ and $X$ is a vector field on $M$, denote by $\theta X$ the real-valued function on $M$ whose value at each point $p$ is the value of $\theta_p$ on $X_p$. A one-form $\theta$ is smooth provided $\theta X$ is smooth for all $X \in \mathfrak{X}(M)$. Let $\mathfrak{X}^*(M)$ be the set of all (smooth) one-forms on $M$. Two one-forms are added, and a one-form is multiplied by a real-valued function, just as in the dual case of vector fields. Explicitly,

$$(\theta + \omega)_p = \theta_p + \omega_p, \qquad (f\theta)_p = f(p)\theta_p$$

for all $p \in M$. Thus $\mathfrak{X}^*(M)$ becomes a module over $\mathfrak{F}(M)$. There is a remarkable operation that converts functions into one-forms.

24. Definition. The differential of $f \in \mathfrak{F}(M)$ is the one-form $df$ such that $(df)(v) = v(f)$ for every tangent vector $v$ to $M$.

Clearly $df$ is a one-form since at each point $p$ the function $(df)_p\colon T_p(M) \to \mathbf{R}$ is linear, and if $V \in \mathfrak{X}(M)$ the function $(df)(V) = Vf$ is smooth. If $x^1, \ldots, x^n$ is a coordinate system on $\mathcal{U} \subset M$ we have thus the coordinate one-forms $dx^1, \ldots, dx^n$ on $\mathcal{U}$. At each point of $\mathcal{U}$ these provide a dual basis to the coordinate vector fields $\partial_1, \ldots, \partial_n$ since $dx^i(\partial_j) = \partial x^i/\partial x^j = \delta_{ij}$. It follows that for any one-form $\theta$,

$$\theta = \sum_i \theta(\partial_i)\,dx^i \quad\text{on } \mathcal{U}.$$

This formula corresponds to the basis theorem for vector fields. (To prove it, apply both sides to the coordinate vector fields.) In particular, if $f \in \mathfrak{F}(M)$, then since $df(\partial_i) = \partial f/\partial x^i$,

$$df = \sum_i \frac{\partial f}{\partial x^i}\,dx^i.$$

25. Lemma. The differential has the following properties:

(1) $d\colon \mathfrak{F}(M) \to \mathfrak{X}^*(M)$ is $\mathbf{R}$-linear.
(2) Product rule: If $f, g \in \mathfrak{F}(M)$, then $d(fg) = g\,df + f\,dg$.
(3) If $f \in \mathfrak{F}(M)$ and $h \in \mathfrak{F}(\mathbf{R}^1)$, then $d(h(f)) = h'(f)\,df$.
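For a concrete instance of the coordinate formula for $df$ and of property (3), take $f = x^2 + y^2$ on $\mathbf{R}^2$ with $h = \sin$. The function $f$, the choice of $h$, and the vector field $V = -y\,\partial_x + x\,\partial_y$ are illustrative assumptions, not examples from the text.

```latex
\begin{align*}
df &= \frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = 2x\,dx + 2y\,dy, \\
d(\sin f) &= \cos(f)\,df = 2\cos(x^2 + y^2)\,(x\,dx + y\,dy), \\
df(V) &= Vf = -y\,(2x) + x\,(2y) = 0 .
\end{align*}
```

The last line reflects the fact that $V$ is everywhere tangent to the circles on which $f$ is constant.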

SUBMANIFOLDS

Roughly speaking, a submanifold $P$ of a manifold $M$ is a subset of $M$ that acquires its manifold structure from $M$. In particular we shall require that $P$ be a topological subspace of $M$, that is, have the induced topology, for which a subset $\mathcal{V}$ of $P$ is open if and only if there is an open set $\mathcal{U}$ of $M$ such that $\mathcal{U} \cap P = \mathcal{V}$.

26. Definition. A manifold $P$ is a submanifold of a manifold $M$ provided:

(1) $P$ is a topological subspace of $M$.
(2) The inclusion map $j\colon P \subset M$ is smooth and at each point $p \in P$ its differential map $dj$ is one-to-one.

If $P$ is a submanifold of $M$ and $\phi\colon M \to N$ is a smooth map, then so is the restriction $\phi|P$ of $\phi$ to $P$, since $\phi|P$ is just $\phi \circ j$. In particular, if $f \in \mathfrak{F}(M)$, then $f|P \in \mathfrak{F}(P)$. Since $j$ is such an obvious mapping and each $dj\colon T_p(P) \to T_p(M)$ is one-to-one, it is customary to ignore $dj$ and consider the tangent space $T_p(P)$ as a vector subspace of $T_p(M)$.

Open submanifolds are trivially submanifolds, and the sphere $S^n$ as in Example 4(2) is a submanifold of $\mathbf{R}^{n+1}$. Coordinate systems produce submanifolds; for instance the plane $z = 1$ in $\mathbf{R}^3$ is a submanifold, diffeomorphic to $\mathbf{R}^2$ under $(x, y, 1) \to (x, y)$. More generally, if $\xi\colon \mathcal{U} \to \mathbf{R}^n$ is a coordinate system in a manifold $M$ then holding any $n - m$ of the coordinate functions constant produces an $m$-dimensional submanifold called a $\xi$-coordinate slice $\Sigma$ of $\mathcal{U}$. Our goal is to show that every submanifold can be constructed by gluing together such slices.

27. Definition. Let $P$ be a subset of $M^n$. A coordinate system $\xi\colon \mathcal{U} \to \mathbf{R}^n$ in $M$ is adapted to $P$ provided $\mathcal{U} \cap P$ is a $\xi$-coordinate slice of $\mathcal{U}$.

By permuting the coordinates we can always suppose it is the last $n - m$ that are held constant. Then let $\xi_P\colon \mathcal{U} \cap P \to \mathbf{R}^m$ be the restriction of $(x^1, \ldots, x^m)$ to $\mathcal{U} \cap P$. In case $P$ is a submanifold it follows using the inverse function theorem that $\xi_P$ is a diffeomorphism onto its (open) image in $\mathbf{R}^m$, and hence is a coordinate system in $P$.

28. Proposition. If $P^m$ is a submanifold of $M^n$, then at each point of $P$ there is a coordinate system of $M$ adapted to $P$.

Proof. Let $\xi = (x^1, \ldots, x^n)$ be a coordinate system for $M$ at $p \in P$ and let $y^1, \ldots, y^m$ be a coordinate system for $P$ at $p$. Since the differential map $dj_p$ is one-to-one its Jacobian matrix

$$\left(\frac{\partial(x^i \circ j)}{\partial y^k}(p)\right), \qquad 1 \le i \le n,\ 1 \le k \le m,$$

has rank $m$. Thus (relabeling the $x^i$ if necessary) we can suppose its first $m$ rows are linearly independent. Then by Exercise 7 the restrictions to $P$ of the functions $x^1, \ldots, x^m$ form a coordinate system for $P$ on a neighborhood $\mathcal{W}$ of $p$. If $f^k$ is the coordinate expression for $x^k|P$ relative to these coordinates, then

$$x^k = f^k(x^1, \ldots, x^m) \quad\text{on } \mathcal{W}.$$

It follows that the functions

$$z^k = x^k - f^k(x^1, \ldots, x^m) \qquad (m + 1 \le k \le n)$$

are well defined on some neighborhood of $p$ in $M$. If $\zeta = (x^1, \ldots, x^m, z^{m+1}, \ldots, z^n)$ then at $p$ the Jacobian matrix of $\zeta$ relative to $\xi$ has the form

$$\begin{pmatrix} I_m & 0 \\ * & I_{n-m} \end{pmatrix},$$

where $I_m$ is the $m \times m$ identity matrix. Evidently this matrix is invertible, so $\zeta$ is a coordinate system on a neighborhood $\mathcal{U}$ of $p$ in $M$.

(1) Because $P$ is a topological subspace of $M$ we can choose $\mathcal{U}$ so that $\mathcal{U} \cap P \subset \mathcal{W}$.
(2) Since $\mathcal{O} = (x^1, \ldots, x^m)(\mathcal{U} \cap P)$ is an open set of $\mathbf{R}^m$, by shrinking $\mathcal{U}$ if necessary we can also suppose that $(x^1, \ldots, x^m)(\mathcal{U}) \subset \mathcal{O}$.

Now on $\mathcal{U} \cap P$ the functions $z^k$ are all zero, so $\mathcal{U} \cap P$ is contained in the slice

$$\Sigma\colon\ z^{m+1} = 0, \ldots, z^n = 0$$

of $\mathcal{U}$. Conversely, one can check that $\Sigma \subset \mathcal{U} \cap P$. Hence $\mathcal{U} \cap P = \Sigma$. ■

The following seemingly obvious result depends on the fact that submanifolds have the induced topology.

29. Corollary. Let $P^m$ be a submanifold of $M^n$. If $\phi\colon N \to M$ is a smooth mapping such that $\phi(N) \subset P$, then the induced mapping $\bar\phi\colon N \to P$ is smooth.

Proof. If $q \in N$ let $x^1, \ldots, x^n$ be a coordinate system on a neighborhood $\mathcal{U} \subset M$ of $\phi(q)$ that is adapted to $P$. Being smooth, $\phi$ is continuous, and since $P$ is a topological subspace of $M$, $\bar\phi$ is continuous. Thus there is a neighborhood $\mathcal{V}$ of $q$ in $N$ such that $\bar\phi(\mathcal{V})$ is contained in the neighborhood $\mathcal{U} \cap P$ of $\bar\phi(q)$ in $P$. Now $x^1|P, \ldots, x^m|P$ is a coordinate system on $\mathcal{U} \cap P$. If $j\colon P \subset M$ is the inclusion map, then

$$(x^i|P) \circ \bar\phi = x^i \circ j \circ \bar\phi = x^i \circ \phi.$$

These functions are smooth since $\phi$ and the $x^i$ are smooth. It follows by Exercise 1 that $\bar\phi$ is smooth. ■

30. Corollary. A subset $P$ of a smooth manifold $M$ can be made a submanifold of $M$ in at most one way.

Proof. By definition $P$ must be given the induced topology. Suppose two atlases are assigned to the space $P$, producing submanifolds $P_1$ and $P_2$ of $M$. The inclusion maps of $P_1$ and $P_2$ into $M$ are smooth, hence by the preceding corollary the identity maps $P_1 \to P_2$ and $P_2 \to P_1$ are smooth. Thus these identity maps are inverse diffeomorphisms. It follows that the complete atlases of $P_1$ and $P_2$ are identical. ■

Thus it makes sense to say that a subset $P$ of a manifold $M$ is (or is not) a submanifold of $M$. A basic criterion is as follows.

31. Proposition. A subset $P$ of a manifold $M$ is an $m$-dimensional submanifold if (and only if) at each point $p$ of $P$ there is a coordinate system of $M$ adapted to $P$ by $m$-dimensional slices.

Proof. Assign $P$ the induced topology. Let $\xi\colon \mathcal{U} \to \mathbf{R}^n$ be a coordinate system of $M$ at $p \in P$ such that $\mathcal{U} \cap P$ is the slice $x^j = x^j(p)$ for $m + 1 \le j \le n$. We can suppose $\xi(p) = 0$. Then since $\xi$ is a homeomorphism, the map $\xi_P = (x^1, \ldots, x^m)|P$ is a homeomorphism from $\mathcal{U} \cap P$ to the open set $\xi(\mathcal{U}) \cap \mathbf{R}^m$ of $\mathbf{R}^m$. We assert that all such topological coordinate systems $\xi_P$ in $P$ form an atlas. Certainly they cover $P$, and any two overlap smoothly since for $1 \le i \le m$,

$$u^i \circ \xi_P \circ \eta_P^{-1} = u^i \circ (\xi \circ \eta^{-1})|\mathbf{R}^m,$$

where $\mathbf{R}^m$ is considered as the coordinate $m$-plane of the first $m$ coordinates of $\mathbf{R}^n$.

It remains to show that this atlas makes $P$ a submanifold of $M$. For a coordinate system $\xi$ as above, the functions $x^i|P = x^i \circ j$ for $1 \le i \le m$ are smooth, since they constitute the coordinate system $\xi_P$ (and for $i > m$ they are constant on $\mathcal{U} \cap P$). Hence by Exercise 1 the inclusion map $j\colon P \subset M$ is smooth. Evidently the Jacobian matrix of $j$ relative to $\xi_P$ and $\xi$ contains an $m \times m$ identity matrix, hence $dj$ is always one-to-one. ■

A vector field $X$ on $M$ is tangent to a submanifold $P$ of $M$ provided $X_p \in T_p(P)$ for all $p \in P$. (Recall that for a submanifold, $T_p(P)$ is considered as a subspace of $T_p(M)$.)

32. Proposition. Let $P$ be a submanifold of $M$.

(1) If $X \in \mathfrak{X}(M)$ is tangent to $P$, then its restriction $X|P$ to $P$ is a smooth vector field on $P$.
(2) Furthermore, if $Y \in \mathfrak{X}(M)$ is also tangent to $P$, then the bracket $[X, Y]$ is tangent to $P$ and $[X, Y]|P = [X|P, Y|P]$.

The proof is straightforward, using an adapted coordinate system for (1) and applying Lemma 22 to (2).

IMMERSIONS AND SUBMERSIONS

This section deals with two special types of smooth mappings defined by hypotheses on differential maps.

33. Lemma. Let $\phi\colon M^m \to N^n$ be a smooth map, and let $p$ be a point of $M$. Then the following are equivalent:

(1) The differential map $d\phi_p$ is one-to-one.
(2) The Jacobian matrix of $d\phi_p$ has rank $m$ relative to one (hence every) choice of coordinate systems.
(3) If $y^1, \ldots, y^n$ is a coordinate system in $N$ at $\phi(p)$, then there are integers $1 \le i_1 < \cdots < i_m \le n$ such that the functions $y^{i_1} \circ \phi, \ldots, y^{i_m} \circ \phi$ form a coordinate system on a neighborhood of $p$ in $M$.

The proof is a mild generalization of the first part of the proof of Proposition 28.

34. Definition. An immersion $\phi\colon M \to N$ is a smooth mapping such that $d\phi_p$ is one-to-one for all $p \in M$.

For example, every regular curve ($\alpha'$ never 0) is an immersion. An imbedding of a manifold $P$ into $M$ is a one-to-one immersion $\phi\colon P \to M$ such that the induced map $P \to \phi(P)$ is a homeomorphism onto the subspace $\phi(P)$ of $M$.

The standard example is the mapping $(a_1, \ldots, a_m) \to (a_1, \ldots, a_m, 0, \ldots, 0)$ of $\mathbf{R}^m$ into $\mathbf{R}^n$. Lemma 33(3) says that locally every immersion looks like this. Explicitly, the restriction of an immersion to any sufficiently small open set is an imbedding.

Submanifolds and imbeddings are very closely related: If $P$ is a submanifold of $M$, then the inclusion map $j\colon P \subset M$ is an imbedding. Conversely, if $\phi\colon P \to M$ is an imbedding, make its image $\phi(P)$ a manifold so that the induced map $\bar\phi\colon P \to \phi(P)$ is a diffeomorphism. Then $\phi(P)$ is a subspace of $M$, and the inclusion map $j\colon \phi(P) \subset M$ is just $\phi \circ \bar\phi^{-1}$, which by the chain rule is an immersion. Thus $\phi(P)$ is a submanifold of $M$.

The term "submanifold" is sometimes applied to a more general object: Let $P$ be a manifold that is merely a subset of a manifold $M$. If the inclusion map $j\colon P \subset M$ is an immersion, we call $P$ an immersed submanifold of $M$. Thus submanifolds are immersed submanifolds, but not conversely, since the latter need not have the induced topology (see Exercise 15).

Lemma 33 dualizes as follows.

35. Lemma. Let $\psi\colon M^m \to N^n$ be a smooth map, and let $p \in M$. The following are equivalent:

(1) The differential map $d\psi_p$ is onto.
(2) The Jacobian matrix of $d\psi_p$ has rank $n$ relative to one (hence every) choice of coordinates at $p$ and $\psi(p)$.
(3) If $y^1, \ldots, y^n$ is a coordinate system for $N$ at $\psi(p)$, there is a coordinate system for $M$ at $p$ of the form $(y^1 \circ \psi, \ldots, y^n \circ \psi, x^{n+1}, \ldots, x^m)$.

Proof. It is easy to see that (2), for one choice of coordinates, implies (1), and that (1) implies (2) for every choice of coordinates. For coordinates as in (3), (2) is certainly true. Assume (2) holds for coordinates $y^1, \ldots, y^n$ at $\psi(p)$ and $x^1, \ldots, x^m$ at $p$, where necessarily $m \ge n$. At $p$ the $n \times m$ Jacobian matrix $(\partial(y^i \circ \psi)/\partial x^j)$ has, by hypothesis, $n$ linearly independent columns. Relabeling, if necessary, we may assume these are the first $n$ columns. But then by the Jacobian criterion in Exercise 7, $(y^1 \circ \psi, \ldots, y^n \circ \psi, x^{n+1}, \ldots, x^m)$ is a coordinate system at $p$. ■

A point $q \in N$ is called a regular value of a smooth mapping $\psi\colon M \to N$ provided that $d\psi_p$ is onto for every $p \in \psi^{-1}(q)$.

36. Corollary. If $q \in \psi(M)$ is a regular value of a smooth mapping $\psi\colon M \to N$, then $\psi^{-1}(q)$ is a submanifold of $M$, and $\dim M = \dim N + \dim \psi^{-1}(q)$.

Proof. We use the slice condition from Proposition 31. At $p \in \psi^{-1}(q)$ let the notation be as in the proof above. Reordering the coordinate functions, $(x^{n+1}, \ldots, x^m, y^1 \circ \psi, \ldots, y^n \circ \psi)$ is a coordinate system on a neighborhood $\mathcal{U}$ of $p$. But the slice of $\mathcal{U}$ through $p$ on which the last $n$ coordinates are held constant is just $\mathcal{U} \cap \psi^{-1}(q)$. ■

A hypersurface in a manifold $M$ is a submanifold $P$ whose codimension $\dim M - \dim P$ is 1. Applying the previous result with $N = \mathbf{R}^1$ gives the following efficient way to get hypersurfaces.

37. Corollary. Let $c$ be a value of the function $f \in \mathfrak{F}(M)$. If at each point of $f^{-1}(c) = \{p \in M : f(p) = c\}$ the differential $df$ is nonzero, then $f^{-1}(c)$ is a submanifold of $M$ called a level hypersurface of $f$.

For example, on $\mathbf{R}^{n+1}$ let $f = \sum (u^i)^2$. Then $df = 2\sum u^i\,du^i$, so $f$ and $df$ are each zero only at the origin. Thus the $n$-sphere $S^n(r) = f^{-1}(r^2)$ of radius $r > 0$ is a hypersurface in $\mathbf{R}^{n+1}$.
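The hypothesis on $df$ cannot simply be dropped. A standard illustration, chosen here for contrast and not taken from the text, is the function $f = x^2 - y^2$ on $\mathbf{R}^2$:

```latex
\begin{align*}
df &= 2x\,dx - 2y\,dy, \qquad df = 0 \ \text{only at } (0, 0), \\
f^{-1}(c) &= \{x^2 - y^2 = c\} \ \text{is a level hypersurface (a hyperbola) for every } c \neq 0, \\
f^{-1}(0) &= \{y = x\} \cup \{y = -x\}.
\end{align*}
```

For $c \neq 0$ the origin is not on the level set, so Corollary 37 applies. For $c = 0$ the corollary gives no information, and in fact the two crossing lines do not form a submanifold of $\mathbf{R}^2$, since no neighborhood of the origin in $f^{-1}(0)$ is a one-dimensional coordinate slice.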

38. Definition. A submersion $\psi\colon M \to B$ is a smooth mapping onto $B$ such that $d\psi_p$ is onto for all $p \in M$.

Since every value of a submersion is regular, $M$ is partitioned into the submanifolds $\psi^{-1}(q)$ for all $q \in B$. For $m \ge n$ the projection $\mathbf{R}^m \to \mathbf{R}^n$ sending $(t_1, \ldots, t_m)$ to $(t_1, \ldots, t_n)$ is clearly a submersion. Lemma 35(3) implies that locally every submersion has this form.

TOPOLOGY OF MANIFOLDS

Manifolds are locally Euclidean; that is, each point of a manifold $M$ has a neighborhood homeomorphic (by a coordinate system) to an open set in Euclidean space. Thus manifolds share all the local properties of Euclidean space, notably local connectedness and local compactness. A locally Euclidean space need not be Hausdorff; hence the inclusion of the Hausdorff axiom in Definition 3.

A. Connectedness

As for any topological space, a manifold is connected provided it cannot be expressed as the disjoint union of two nonempty open sets. For a manifold its coordinate neighborhoods show that any point can be connected to a sufficiently nearby point by a (smooth) curve segment. It follows that a manifold is connected if and only if any two of its points can be joined by a piecewise smooth curve segment.

An arbitrary space $S$ is the disjoint union of its connected components, these being the maximal connected subspaces of $S$. Because a manifold is locally Euclidean its components are open, hence are (connected) manifolds.

B. Second Countability

A topological space $S$ is second countable provided its topology has a countable base $\mathcal{B}$, that is, a countable collection of open sets such that every open set is the union of some subcollection of $\mathcal{B}$. For example, $\mathbf{R}^n$ has a countable basis consisting of all sets $\{p \in \mathbf{R}^n : a_i < p_i < b_i\}$ where $a_i$ and $b_i$ are rational numbers. It is easy to verify that (1) any subspace of a second countable space is second countable, (2) a Cartesian product of second countable spaces is second countable, and (3) a space that is a countable union of second countable open subspaces is itself second countable.

An [open] covering $\mathcal{C}$ of a topological space $S$ is a collection of [open] subsets of $S$ whose union is $S$. A second countable space $S$ has the Lindelöf property: Any open covering $\mathcal{C}$ of $S$ has a countable subcovering (subcollection of $\mathcal{C}$ that still covers $S$). In particular, a second countable manifold has only countably many connected components. There do exist connected manifolds that are not second countable but they are merely curiosities. Henceforth we shall assume, whenever convenient, that the manifolds we deal with are second countable.

C. Partitions of Unity

A collection $\mathcal{L}$ of subsets of a space $S$ is locally finite provided each point of $S$ has a neighborhood that meets only finitely many elements of $\mathcal{L}$. Let $\{f_\alpha : \alpha \in A\}$ be a collection of smooth functions on a manifold $M$ such that $\{\operatorname{supp} f_\alpha : \alpha \in A\}$ is locally finite. Then the sum $\sum_\alpha f_\alpha$ is a well-defined smooth function on $M$, since on some neighborhood of each point all but a finite number of the $f_\alpha$ are identically zero.

39. Definition. A smooth partition of unity on a manifold $M$ is a collection $\{f_\alpha : \alpha \in A\}$ of functions $f_\alpha \in \mathfrak{F}(M)$ such that

(1) $0 \le f_\alpha \le 1$ for all $\alpha \in A$.
(2) $\{\operatorname{supp} f_\alpha : \alpha \in A\}$ is locally finite.
(3) $\sum_\alpha f_\alpha = 1$.

The partition is said to be subordinate to an open covering $\mathcal{C}$ of $M$ provided each set $\operatorname{supp} f_\alpha$ is contained in some element of $\mathcal{C}$.

Partitions of unity are an indispensable tool for assembling locally defined objects into a global object (or decomposing a global object into a sum of local objects). For such purposes partitions of unity with "small" supports are needed, as follows.

40. Proposition. If $M$ is a (second countable) manifold then given any open covering $\mathcal{C}$ of $M$ there is a smooth partition of unity subordinate to $\mathcal{C}$.

For a proof of this and related results, see [Mat], [Sp]. The existence of partitions of unity subordinate to arbitrary open coverings is equivalent to the topological property paracompactness. Thus by the proposition, a second countable manifold is paracompact. (The converse fails only when the manifold has uncountably many components, a possibility of scant practical importance.)
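On the real line a partition of unity can be written down explicitly. The sketch below is an illustration under stated assumptions, not a construction from the text: the covering is taken to be the intervals $(k - 1, k + 1)$ for integers $k$, the bump function and its support radius `R = 0.75` are arbitrary choices, and the function names are hypothetical. Each $f_k$ is a translate of a smooth bump divided by the (locally finite, everywhere positive) sum of all translates.

```python
import math

R = 0.75  # support radius of the bump; any value in (1/2, 1) works for this cover

def bump(t):
    """Smooth function on the line, positive on (-R, R) and zero elsewhere."""
    if abs(t) >= R:
        return 0.0
    return math.exp(-1.0 / (R * R - t * t))

def partition_values(x):
    """Return {k: f_k(x)} for the finitely many k with f_k(x) > 0.

    f_k(x) = bump(x - k) / sum_j bump(x - j).  The support of f_k is
    [k - R, k + R], which lies inside the covering interval (k - 1, k + 1),
    so the partition is subordinate to that covering.
    """
    base = math.floor(x)
    # Only integers within distance R < 1 of x can contribute.
    weights = {k: bump(x - k) for k in range(base - 1, base + 3)}
    total = sum(weights.values())  # positive: some integer is within 1/2 of x
    return {k: w / total for k, w in weights.items() if w > 0.0}

samples = [-2.3, 0.0, 0.5, 0.74, 7.25]
values = {x: partition_values(x) for x in samples}
```

At every sample point the nonzero values lie in $[0, 1]$ and sum to 1, and at most two of the functions are nonzero there, which is condition (2) of Definition 39 made concrete.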

D. Orientability

This subtle topological property has many different characterizations; for manifolds the following is simplest.

41. Definition. A manifold $M$ is orientable provided there exists a collection $\mathcal{O}$ of coordinate systems in $M$ whose domains cover $M$ and such that for each $\xi, \eta \in \mathcal{O}$ the Jacobian determinant function $J(\xi, \eta) = \det(\partial y^i/\partial x^j)$ is positive. ($\mathcal{O}$ is called an orientation atlas for $M$.)

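Evidently $\mathbf{R}^n$ is orientable, since it can be covered by a single coordinate system. Experiment shows that the sphere and torus are orientable, but Möbius bands and Klein bottles are not.

The standard definition of manifold starts with a topological space, but actually the atlas of a manifold determines its topology, and this approach is often the most practical way to construct manifolds.

42. Proposition. Let $\mathcal{S}$ be an abstract set, and for each $\alpha \in A$ let $\xi_\alpha$ be a one-to-one function from a subset $\mathcal{U}_\alpha$ of $\mathcal{S}$ onto an open set $\xi_\alpha(\mathcal{U}_\alpha)$ of $\mathbf{R}^n$ such that

(1) The domains $\{\mathcal{U}_\alpha : \alpha \in A\}$ cover $\mathcal{S}$.
(2) For all $\alpha, \beta \in A$ the function $\xi_\beta \circ \xi_\alpha^{-1}$ is Euclidean smooth, its domain $\xi_\alpha(\mathcal{U}_\alpha \cap \mathcal{U}_\beta)$ being an open set of $\mathbf{R}^n$.
(3) If $p \neq q$ in $\mathcal{S}$, then either $p$ and $q$ are in a single $\mathcal{U}_\alpha$ or there are $\alpha, \beta \in A$ such that $p \in \mathcal{U}_\alpha$, $q \in \mathcal{U}_\beta$, with $\mathcal{U}_\alpha$ and $\mathcal{U}_\beta$ disjoint.

Then there is a unique Hausdorff topology and complete atlas on $\mathcal{S}$ such that each $\xi_\alpha$ is a coordinate system in the resulting manifold. Furthermore, if countably many $\mathcal{U}_\alpha$ cover $\mathcal{S}$ the manifold is second countable.

Proof. Since each $\xi_\alpha$ is to be a homeomorphism there is no choice but to define a subset $\mathcal{V}$ of $\mathcal{S}$ to be open if and only if $\xi_\alpha(\mathcal{V} \cap \mathcal{U}_\alpha)$ is open in $\xi_\alpha(\mathcal{U}_\alpha)$, hence in $\mathbf{R}^n$, for all $\alpha \in A$. It is easy to check that such open sets constitute a topology on $\mathcal{S}$. Note that by (2) the domains $\mathcal{U}_\alpha$ are open.

We assert that each $\xi_\alpha$ is a homeomorphism. If $\mathcal{V}$ is open in $\mathcal{U}_\alpha$, hence in $\mathcal{S}$, then by definition $\xi_\alpha(\mathcal{V})$ is open in $\xi_\alpha(\mathcal{U}_\alpha)$. Conversely, if $\mathcal{W}$ is an open set in $\xi_\alpha(\mathcal{U}_\alpha)$ we must show that $\xi_\alpha^{-1}(\mathcal{W})$ is open. If $\beta \in A$ then applying (2) to both $\alpha, \beta$ and $\beta, \alpha$ shows that $\xi_\beta \circ \xi_\alpha^{-1}$ is a homeomorphism from $\xi_\alpha(\mathcal{U}_\alpha \cap \mathcal{U}_\beta)$ to $\xi_\beta(\mathcal{U}_\alpha \cap \mathcal{U}_\beta)$. But then the expression on the right-hand side in the formula

$$\xi_\beta\bigl(\xi_\alpha^{-1}(\mathcal{W}) \cap \mathcal{U}_\beta\bigr) = (\xi_\beta \circ \xi_\alpha^{-1})\bigl(\mathcal{W} \cap \xi_\alpha(\mathcal{U}_\alpha \cap \mathcal{U}_\beta)\bigr)$$

shows that this set is open. Consequently $\xi_\alpha^{-1}(\mathcal{W})$ is open.

Thus $\{\xi_\alpha : \alpha \in A\}$ is an atlas on the topological space $\mathcal{S}$, which by (3) is Hausdorff. The countability assertion is clear, since each $\mathcal{U}_\alpha$, being homeomorphic to an open set in $\mathbf{R}^n$, is second countable. ■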

SOME SPECIAL MANIFOLDS

A. Product Manifolds

We consider how the calculus on a product manifold $M \times N$ (Lemma 5) derives from that of $M$ and $N$ separately. This decomposition is modeled closely on the way the calculus of the plane $\mathbf{R}^2 = \mathbf{R}^1 \times \mathbf{R}^1$ derives from that of the real line. Using product coordinate systems on $M \times N$ it is easy to check that

(a) The projections

$$\pi\colon M \times N \to M \ \text{sending } (p, q) \text{ to } p, \qquad \sigma\colon M \times N \to N \ \text{sending } (p, q) \text{ to } q$$

are smooth mappings; in fact, they are submersions.
(b) A mapping $\phi\colon P \to M \times N$ is smooth if and only if both $\pi \circ \phi$ and $\sigma \circ \phi$ are smooth.
(c) For each $(p, q) \in M \times N$ the subsets

$$M \times q = \{(r, q) \in M \times N : r \in M\}, \qquad p \times N = \{(p, r) \in M \times N : r \in N\}$$

are submanifolds of $M \times N$.
(d) For each $(p, q)$, $\pi|M \times q$ is a diffeomorphism from $M \times q$ to $M$, and $\sigma|p \times N$ is a diffeomorphism from $p \times N$ to $N$.

By (c) the tangent spaces

$$T_{(p,q)}M = T_{(p,q)}(M \times q) \quad\text{and}\quad T_{(p,q)}N = T_{(p,q)}(p \times N)$$

are subspaces of the tangent space to $M \times N$ at $(p, q)$.

43. Lemma. $T_{(p,q)}(M \times N)$ is the direct sum of its subspaces $T_{(p,q)}M$ and $T_{(p,q)}N$; that is, each element of $T_{(p,q)}(M \times N)$ has a unique expression as $x + v$, where $x \in T_{(p,q)}M$ and $v \in T_{(p,q)}N$.

Proof. Since $\pi|p \times N$ is constant, $d\pi$ at $(p, q)$ sends $T_{(p,q)}N$ to 0. But by (d), $d\pi|T_{(p,q)}M$ is an isomorphism. Thus $T_{(p,q)}M \cap T_{(p,q)}N = 0$. The result then follows by linear algebra, since by (d) the sum of the dimensions of these two subspaces is $\dim(M \times N)$. ■

To relate the calculus of $M \times N$ to that of its factors the crucial notion is that of lifting, as follows.

If $f \in \mathfrak{F}(M)$ the lift of $f$ to $M \times N$ is $\tilde f = f \circ \pi \in \mathfrak{F}(M \times N)$. If $x \in T_p(M)$ and $q \in N$ then the lift $\tilde x$ of $x$ to $(p, q)$ is the unique vector in $T_{(p,q)}(M)$ such that $d\pi(\tilde x) = x$. If $X \in \mathfrak{X}(M)$ the lift of $X$ to $M \times N$ is the vector field $\tilde X$ whose value at each $(p, q)$ is the lift of $X_p$ to $(p, q)$. Product coordinate systems show that $\tilde X$ is smooth. Thus the lift of $X \in \mathfrak{X}(M)$ to $M \times N$ is the unique element of $\mathfrak{X}(M \times N)$ that is $\pi$-related to $X$ and $\sigma$-related to the zero vector field on $N$. The set of all such horizontal lifts $\tilde X$ is denoted by $\mathfrak{L}(M)$.

Functions, tangent vectors, and vector fields on $N$ are lifted to $M \times N$ in the same way using the projection $\sigma$. Note that $\mathfrak{L}(M)$ and symmetrically the vertical lifts $\mathfrak{L}(N)$ are vector subspaces of $\mathfrak{X}(M \times N)$ but (except in trivial cases) neither is invariant under multiplication by arbitrary functions $f \in \mathfrak{F}(M \times N)$. For example, on $\mathbf{R}^2$ the natural coordinate vector field $\partial_x = \partial/\partial x$ is the horizontal lift of the vector field $d/dx$ on $\mathbf{R}^1$ (as $x$ axis), but $y\,\partial_x$ is not a lift.

44. Corollary. (1) If $\tilde X, \tilde Y \in \mathfrak{L}(M)$ then $[\tilde X, \tilde Y] = [X, Y]^{\sim} \in \mathfrak{L}(M)$, and similarly for $\mathfrak{L}(N)$. (2) If $\tilde X \in \mathfrak{L}(M)$ and $\tilde V \in \mathfrak{L}(N)$, then $[\tilde X, \tilde V] = 0$.

Proof. Both assertions follow from Lemma 22. In the case of (2) for example, $[\tilde X, \tilde V]$ is $\pi$-related to $[X, 0] = 0$ and $\sigma$-related to $[0, V] = 0$. Thus the result follows by Lemma 43. ■
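On $\mathbf{R}^2 = \mathbf{R}^1 \times \mathbf{R}^1$ these statements can be checked by hand. In the sketch below the fields $X = x\,d/dx$ and $Y = d/dx$ on the first factor and $V = d/dy$ on the second factor are illustrative choices, not examples from the text.

```latex
\begin{align*}
\tilde X &= x\,\partial_x, \quad \tilde Y = \partial_x, \quad \tilde V = \partial_y, \\
[\tilde X, \tilde Y]f &= x\,\partial_x\partial_x f - \partial_x(x\,\partial_x f) = -\partial_x f,
   \qquad\text{so } [\tilde X, \tilde Y] = -\partial_x = [X, Y]^{\sim}, \\
[\tilde X, \tilde V]f &= x\,\partial_x\partial_y f - \partial_y(x\,\partial_x f) = 0, \\
d\pi\bigl((y\,\partial_x)_{(x,y)}\bigr) &= y\,\frac{d}{dx}\Big|_{x} .
\end{align*}
```

The last line shows why $y\,\partial_x$ is not a lift: its image under $d\pi$ depends on $y$ and not only on the point $x = \pi(x, y)$, so it is not $\pi$-related to any vector field on the first factor.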

To establish these facts we have used a rather elaborate notation. In practice, Lemma 43 is usually rendered as $T_{(p,q)}(M \times N) = T_pM \times T_qN$, and the tilde ($\sim$) is omitted from lifts.

B. Vector Spaces as Manifolds

Let $V$ be an $n$-dimensional vector space over $\mathbf{R}$. If $\xi$ and $\eta$ are linear isomorphisms from $V$ to $\mathbf{R}^n$, then $\xi \circ \eta^{-1}\colon \mathbf{R}^n \to \mathbf{R}^n$ is a linear isomorphism, hence a diffeomorphism. Thus Proposition 42 is scarcely needed to show that there is a unique way to make $V$ a manifold such that every linear isomorphism $\xi\colon V \to \mathbf{R}^n$ is a coordinate system. The following convenient notation will be used frequently.

45. Definition. If $p, v \in V$, let $v_p \in T_p(V)$ be the initial velocity $\alpha'(0)$ of the curve $\alpha(t) = p + tv$.

We picture $v_p$ as the arrow running from $p$ to $p + v$.

46. Lemma. If $x^1, \ldots, x^n$ is a linear coordinate system on $V$, then

$$v_p = \sum_i x^i(v)\,\partial_i|_p.$$

Proof. Since the coordinates $x^i$ are linear, $x^i(\alpha(t)) = x^i(p) + t\,x^i(v)$. Hence by the velocity formula,

$$v_p = \alpha'(0) = \sum_i \frac{d(x^i \circ \alpha)}{dt}(0)\,\partial_i|_p = \sum_i x^i(v)\,\partial_i|_p. \qquad ■$$

Thus $v_p$ is the tangent vector at $p$ with the same coordinates as $v \in V$. It follows immediately that (1) for fixed $p \in V$, the function $v \to v_p$ is a linear isomorphism $V \approx T_p(V)$; (2) for $p, q \in V$, the function $v_p \to v_q$ is a linear isomorphism $T_p(V) \approx T_q(V)$.

As in the familiar case $V = \mathbf{R}^3$ these canonical isomorphisms allow free interchange between $v$ (point of $V$), $v_0$ (arrow from 0 to $v$), and $v_p$ (arrow from $p$ to $p + v$).

The position vector field $P \in \mathfrak{X}(V)$ assigns to each $p \in V$ the tangent vector $p_p \in T_p(V)$ (intuitively a duplicate, starting at $p$, of the arrow from 0 to $p$). In terms of a linear coordinate system, $P = \sum x^i\,\partial_i$.

C. The Tangent Bundle

For a manifold $M$, let $TM$ be the set $\bigcup\{T_p(M) : p \in M\}$ of all tangent vectors to $M$. A technicality: For each $p \in M$ replace $0 \in T_p(M)$ by $0_p$ (otherwise the zero tangent vector is in every tangent space). Then each $v \in TM$ is in a unique $T_p(M)$, and the projection $\pi\colon TM \to M$ sends $v$ to $p$. Thus $\pi^{-1}(p) = T_p(M)$.

There is a natural way to make $TM$ a manifold, called the tangent bundle of $M$. Let $\xi = (x^1, \ldots, x^n)$ be a coordinate system on $\mathcal{U} \subset M$. If $v$ is tangent to $M$ at a point $p$ of $\mathcal{U}$, then $v$ is uniquely determined by the coordinates of $p$ and the coordinates of $v$ relative to $\partial_1, \ldots, \partial_n$ at $p$. To formalize this, let $\dot x^i$ be the real-valued function on $\pi^{-1}(\mathcal{U}) \subset TM$ given by $\dot x^i(v) = v(x^i)$. Then define $\tilde\xi\colon \pi^{-1}(\mathcal{U}) \to \mathbf{R}^{2n}$ by

$$\tilde\xi = (x^1 \circ \pi, \ldots, x^n \circ \pi, \dot x^1, \ldots, \dot x^n).$$

We shall now apply Proposition 42 to make $TM$ a manifold with all such functions as coordinate systems. By the basis theorem, if $v \in \pi^{-1}(\mathcal{U})$, then $v = \sum \dot x^i(v)\,\partial_i|_{\pi(v)}$. Thus $\tilde\xi$ is a one-to-one function from $\pi^{-1}(\mathcal{U})$ onto the open set $\xi(\mathcal{U}) \times \mathbf{R}^n$ of $\mathbf{R}^{2n}$.

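Next we show that any two such functions $\tilde\xi$ and $\tilde\eta$ overlap smoothly. Let $\eta = (y^1, \ldots, y^n)$ have domain $\mathcal{V}$. If $(a, b) \in \eta(\mathcal{U} \cap \mathcal{V}) \times \mathbf{R}^n$, then for $1 \le i \le n$

$$u^i \circ \tilde\xi \circ \tilde\eta^{-1}(a, b) = x^i \circ \pi \circ \tilde\eta^{-1}(a, b) = x^i \circ \eta^{-1}(a).$$

Since $\partial/\partial y^i = \sum_k (\partial x^k/\partial y^i)\,\partial/\partial x^k$, we also have

$$u^{n+i} \circ \tilde\xi \circ \tilde\eta^{-1}(a, b) = \dot x^i\Bigl(\sum_k b_k\,\frac{\partial}{\partial y^k}\Big|_{\eta^{-1}(a)}\Bigr) = \sum_k b_k\,\frac{\partial x^i}{\partial y^k}\bigl(\eta^{-1}(a)\bigr).$$

Thus $\tilde\xi \circ \tilde\eta^{-1}$ is Euclidean smooth. It is easy to check the conditions in Proposition 42 that show $TM$ is a Hausdorff and second countable manifold.

A vector field $X \in \mathfrak{X}(M)$ is exactly a smooth section of $TM$, that is, a smooth function $X\colon M \to TM$ such that $\pi \circ X = \mathrm{id}$. This suggests a useful generalization.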
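The overlap formulas for tangent bundle coordinates can be made concrete with polar and Cartesian coordinates on the plane. In this sketch, $\eta = (r, \theta)$ and $\xi = (x, y)$ on a region of $\mathbf{R}^2$ where polar coordinates are defined; the choice of coordinates and the labels $b_1, b_2$ for the fiber coordinates are illustrative and not from the text.

```latex
\begin{align*}
x &= r\cos\theta, \qquad y = r\sin\theta, \\
u^3\circ\tilde\xi\circ\tilde\eta^{-1}(r, \theta, b_1, b_2)
  &= b_1\,\frac{\partial x}{\partial r} + b_2\,\frac{\partial x}{\partial\theta}
   = b_1\cos\theta - b_2\,r\sin\theta, \\
u^4\circ\tilde\xi\circ\tilde\eta^{-1}(r, \theta, b_1, b_2)
  &= b_1\,\frac{\partial y}{\partial r} + b_2\,\frac{\partial y}{\partial\theta}
   = b_1\sin\theta + b_2\,r\cos\theta .
\end{align*}
```

So the tangent vector $b_1\,\partial_r + b_2\,\partial_\theta$ at the point with polar coordinates $(r, \theta)$ has Cartesian components given by the last two lines, and these depend smoothly on $(r, \theta, b_1, b_2)$, as the general argument requires.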

47.

Z :P

+

Definition. A vectorjield Z on a smooth map 4:P + M is mapping T M such that 71 0 Z = 4,where 71 is the projection T M + M .

Thus $Z$ assigns to each point $p \in P$ a tangent vector to $M$ at $\phi(p)$. (For example, its velocity $\alpha'$ is a vector field on a curve $\alpha$ in $M$.) $Z$ is smooth as a mapping $P \to TM$ if and only if $f \in \mathfrak{F}(M)$ implies $Zf \in \mathfrak{F}(P)$, where $(Zf)(p) = Z(p)f$ for all $p \in P$. The set $\mathfrak{X}(\phi)$ of all smooth vector fields on $\phi: P \to M$ is, in a natural way, a module over $\mathfrak{F}(P)$.

INTEGRAL CURVES

A vector field on a manifold can be interpreted as a differential equation for which the appropriate notion of solution is as follows.

48. Definition. A curve $\alpha: I \to M$ is an integral curve of $V \in \mathfrak{X}(M)$ provided $\alpha' = V_\alpha$; that is, $\alpha'(t) = V_{\alpha(t)}$ for all $t \in I$.

Thus at each point the curve $\alpha$ has the velocity prescribed by $V$. If the equation above is expressed in terms of a coordinate system $\xi$, it yields a system of first-order ordinary differential equations

$\dfrac{d(x^i \circ \alpha)}{dt} = F^i(x^1 \circ \alpha, \dots, x^n \circ \alpha) \qquad (1 \le i \le n),$

where $F^i$ is the coordinate expression for $Vx^i$. Since the parameter $t$ does not appear on the right-hand side, we can think of $V$ as giving the velocity of the steady state flow of a fluid through $M$, an idea pursued below. The fundamental existence and uniqueness theorem for the solutions of such systems has the following consequence in terms of manifold theory.

49. Proposition. If $V \in \mathfrak{X}(M)$ then for each $p \in M$ there is an interval $I$ around 0 and a unique integral curve $\alpha: I \to M$ of $V$ such that $\alpha(0) = p$.

Note that if $\alpha$ is an integral curve of $V$, then $t \to \alpha(t + c)$ is also.

50. Corollary. If $\alpha, \beta: I \to M$ are integral curves of $V$ such that $\alpha(a) = \beta(a)$ for some $a \in I$, then $\alpha = \beta$.

Proof. By continuity the agreement set $A = \{t \in I : \alpha(t) = \beta(t)\}$ is closed. If $A$ is also open, then since $A$ is nonempty and $I$ is connected, $A = I$. Fix $t \in A$. Then $s \to \alpha(t + s)$ and $s \to \beta(t + s)$ are integral curves of $V$ that agree at $s = 0$. Hence by Proposition 49 they agree for $s$ sufficiently near 0. ∎

Consider the collection of all integral curves $\alpha: I_\alpha \to M$ of $V$ that start at $p \in M$, that is, for which $\alpha(0) = p$. For any two such, the corollary shows that $\alpha = \beta$ on $I_\alpha \cap I_\beta$. Thus by a remark on page 5, all these curves define a single integral curve $\alpha_p: I_p \to M$ where $I_p = \bigcup I_\alpha$. We call $\alpha_p$ the maximal integral curve of $V$ starting at $p$. Examples as in Exercise 12(c) show that this largest possible domain $I_p$ need not be the entire real line.

51. Example. On the plane $\mathbf{R}^2$ let $V = x\,\partial_x - y\,\partial_y$. Then $\alpha(t) = (x(t), y(t))$ is an integral curve of $V$ if and only if $dx/dt = x$ and $dy/dt = -y$. Hence $x(t) = Ae^t$ and $y(t) = Be^{-t}$. Thus the maximal integral curve starting at $p = (p_1, p_2)$ is

$\alpha_p(t) = (p_1 e^t, p_2 e^{-t})$ for all $t \in \mathbf{R}$.
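As a numerical illustration of Example 51, the following sketch integrates the system $dx/dt = x$, $dy/dt = -y$ with a classical fourth-order Runge-Kutta step and compares the result with the closed form $(p_1 e^t, p_2 e^{-t})$. The function names, the step count, and the starting point are illustrative assumptions, not part of the text.

```python
import math

def V(point):
    """Coordinate expression of V = x d/dx - y d/dy on R^2."""
    x, y = point
    return (x, -y)

def rk4_integral_curve(field, p, t_end, steps=1000):
    """Approximate alpha_p(t_end) for the integral curve of `field` starting at p."""
    h = t_end / steps
    x, y = p
    for _ in range(steps):
        k1 = field((x, y))
        k2 = field((x + 0.5 * h * k1[0], y + 0.5 * h * k1[1]))
        k3 = field((x + 0.5 * h * k2[0], y + 0.5 * h * k2[1]))
        k4 = field((x + h * k3[0], y + h * k3[1]))
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return (x, y)

def exact_curve(p, t):
    """Closed form from Example 51: alpha_p(t) = (p1 e^t, p2 e^-t)."""
    return (p[0] * math.exp(t), p[1] * math.exp(-t))

p = (2.0, 3.0)            # hypothetical starting point
numeric = rk4_integral_curve(V, p, 1.0)
exact = exact_curve(p, 1.0)
```

Along either curve the product $xy$ should stay at its initial value $p_1 p_2$, which is a quick sanity check on the integrator.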

These curves parametrize the hyperbolas $xy$ constant, except for the four coordinate semiaxes and a constant curve at the origin. The following refinement of Corollary 50 shows that if maximal integral curves of $V$ meet at one point then they differ only in parametrization.

52. Lemma. For $V \in \mathfrak{X}(M)$ let $q = \alpha_p(s)$. Then $s + I_q = I_p$, and $\alpha_p(s + t) = \alpha_q(t)$ for all $t \in I_q$.

Proof. Let $\beta(u) = \alpha_q(u - s)$; then $\beta$ is an integral curve of $V$ defined on $s + I_q$. Since $\beta(s) = q = \alpha_p(s)$, Corollary 50 implies that $\beta = \alpha_p$ on $(s + I_q) \cap I_p$. Thus $\beta$ and $\alpha_p$ combine to give a single integral curve on $(s + I_q) \cup I_p$. Since $I_p$ is maximal, $s + I_q \subset I_p$. Hence $\alpha_p(s + t) = \beta(s + t) = \alpha_q(t)$ for all $t \in I_q$. Then since $I_q$ is maximal, $I_q \supset -s + I_p$. Thus $s + I_q = I_p$. ∎

A nonconstant curve $\gamma: \mathbf{R} \to M$ is periodic if there is a number $c > 0$ such that $\gamma(t + c) = \gamma(t)$ for all $t$. Since $\gamma$ is nonconstant it follows that there is a smallest such $c > 0$, called the period of $\gamma$. If $\gamma$ is one-to-one on some interval $[a, a + c)$, then $\gamma$ is simply periodic. Thus a figure-8 curve is periodic but not simply periodic. It follows from the preceding lemma that every maximal integral curve is either one-to-one, simply periodic, or constant. A vector field is complete if each of its maximal integral curves is defined on the entire real line. We shall now assemble all the integral curves of a given vector field $V$ into a single mapping, assuming at first that $V$ is complete.

53. Definition. The flow of a complete vector field $V$ on $M$ is the mapping $\psi: M \times \mathbf{R}^1 \to M$ given by

$\psi(p, t) = \alpha_p(t),$

where $\alpha_p$ is the maximal integral curve starting at $p$.

If $p$ is held constant, then $t \to \psi(p, t)$ is just the integral curve $\alpha_p$. On the other hand, if $t$ is held constant, then $p \to \psi(p, t)$ defines a function $\psi_t: M \to M$ that lets every point $p$ of $M$ flow for exactly time $t$. We call $\psi_t$ the $t$th stage of the flow $\psi$, and sometimes refer to $\{\psi_t : t \in \mathbf{R}\}$ as the flow of $V$.

54. Lemma. If $\psi$ is the flow of a complete vector field, then:

(1) $\psi_0$ is the identity map id of $M$.
(2) $\psi_s \circ \psi_t = \psi_{s+t}$ for all $s, t \in \mathbf{R}$ (so the stages of $\psi$ commute).
(3) Each stage $\psi_t$ is a diffeomorphism, with $\psi_t^{-1} = \psi_{-t}$.

Proof. (1) Obvious, since $\psi_0(p) = \alpha_p(0) = p$. (2) This follows immediately from Lemma 52. (3) By the preceding properties, $\psi_t \circ \psi_{-t} = \mathrm{id} = \psi_{-t} \circ \psi_t$. ∎
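The three properties of Lemma 54 can be checked numerically on the flow of Example 51, where $\psi_t(x, y) = (xe^t, ye^{-t})$. The sketch below is only a spot check at one hypothetical point and one pair of times; the helper names are assumptions introduced for illustration.

```python
import math

def flow(p, t):
    """Flow of V = x d/dx - y d/dy from Example 51: psi(p, t) = alpha_p(t)."""
    return (p[0] * math.exp(t), p[1] * math.exp(-t))

def stage(t):
    """The t-th stage psi_t : R^2 -> R^2 of the flow."""
    return lambda p: flow(p, t)

def close(a, b, tol=1e-12):
    """Componentwise comparison of two points of R^2."""
    return all(abs(x - y) < tol for x, y in zip(a, b))

p = (2.0, -1.0)           # hypothetical point
s, t = 0.7, -1.3          # hypothetical times

identity_ok = close(stage(0.0)(p), p)                      # property (1)
group_ok = close(stage(s)(stage(t)(p)), stage(s + t)(p))   # property (2)
inverse_ok = close(stage(-t)(stage(t)(p)), p)              # property (3)
```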

If the vector field $V$ is not complete, then for each point $p$ of $M$ we at least have a local flow $\psi: \mathcal{U} \times I \to M$ defined by the formula in Definition 53 but with $\mathcal{U}$ a neighborhood of $p$ in $M$ and $I$ an interval around 0 in $\mathbf{R}^1$. By differential equations theory, if $\mathcal{U}$ and $I$ are sufficiently small then $\psi$ is smooth. For such a local flow the analogue of Lemma 54 holds: (1) $\psi_0$ is the identity map of $\mathcal{U}$, (2) $\psi_{s+t} = \psi_s \circ \psi_t$ whenever $s$, $t$, and $s + t$ are in $I$, and (3) for $t \in I$, $\psi_t: \mathcal{U} \to \psi_t(\mathcal{U})$ is a diffeomorphism. If $V$ is not necessarily complete then its flow $\psi(p, t) = \alpha_p(t)$ has largest domain $\mathcal{D} = \{(p, t) \in M \times \mathbf{R}^1 : t \in I_p\}$.
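To see a flow domain $\mathcal{D}$ that is a proper open subset of $M \times \mathbf{R}^1$, consider the vector field $V = (1 + u^2)\,d/du$ on $\mathbf{R}^1$. This field is an illustrative assumption, not an example from the text. Its maximal integral curve through $p$ is $\alpha_p(t) = \tan(t + \arctan p)$, defined only for $-\pi/2 - \arctan p < t < \pi/2 - \arctan p$. The sketch checks the differential equation by a central difference and tests membership in $\mathcal{D}$.

```python
import math

def alpha(p, t):
    """Maximal integral curve of V = (1 + u^2) d/du starting at p (assumed example)."""
    return math.tan(t + math.atan(p))

def domain(p):
    """Endpoints of the maximal interval I_p for the curve above."""
    return (-math.pi / 2 - math.atan(p), math.pi / 2 - math.atan(p))

def in_flow_domain(p, t):
    """True when (p, t) lies in D = {(p, t) : t in I_p}."""
    lo, hi = domain(p)
    return lo < t < hi

def velocity(p, t, h=1e-6):
    """Central-difference approximation of alpha_p'(t)."""
    return (alpha(p, t + h) - alpha(p, t - h)) / (2 * h)

p, t = 1.0, 0.3
residual = velocity(p, t) - (1.0 + alpha(p, t) ** 2)   # should be near 0
```

Since every $I_p$ here is a bounded interval, the field is not complete, yet the local group property of the flow still holds wherever both sides are defined.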


Patching together smooth local flows as above shows that $\mathcal{D}$ is an open set of $M \times \mathbf{R}^1$ and the flow $\psi: \mathcal{D} \to M$ is smooth. (See Theorem 5 in Chapter IV of [L].) In considering the behavior of curves $\alpha: I \to M$ near the ends of the interval $I$ it suffices to focus on the right end by taking $I = [0, B)$; the left end is dealt with analogously. (By convention $b < \infty$ but $B \le \infty$.)

55. Definition. A piecewise smooth curve $\alpha: [0, B) \to M$ is extendible provided it has a continuous extension $\tilde\alpha: [0, B] \to M$. Then $q = \tilde\alpha(B)$ is called an endpoint of $\alpha$.

Equivalently, there exists a point $q \in M$ such that for every sequence $\{s_i\}$ in $[0, B)$ approaching $B$, the sequence $\{\alpha(s_i)\}$ converges to $q$. An extendible curve need not have a piecewise smooth extension; however, this occurs in important special cases.

56. Lemma. Let $\alpha: [0, b) \to M$, $b < \infty$, be an integral curve of $V \in \mathfrak{X}(M)$. The following are equivalent:

(1) $\alpha$ is not maximal; that is, $\alpha$ is extendible as an integral curve of $V$ to a larger interval $[0, b + \varepsilon)$.
(2) $\alpha$ is extendible.
(3) $\alpha$ lies in a compact set of $M$.
(4) There exists a sequence $\{s_i\} \to b$ such that $\{\alpha(s_i)\}$ converges.

Proof. Clearly (1) $\Rightarrow$ (2) $\Rightarrow$ (3) $\Rightarrow$ (4). To prove (4) $\Rightarrow$ (1), let $\mathcal{U}$ be a neighborhood of $\lim \alpha(s_i)$ such that a flow of $V$ is defined on $\mathcal{U} \times (-\delta, \delta)$. There is an $n$ such that $b - \delta < s_n$ and $\alpha(s_n) \in \mathcal{U}$. The integral curve of $V$ starting at $\alpha(s_n)$ is defined on $[0, \delta)$. Hence by Lemma 52 it serves to extend $\alpha$ past $b$. ∎

Finally we give two applications of flows.

57. Lemma. If $V$ is a vector field and $p$ a point such that $V_p \ne 0$, then there is a coordinate system $x^1, \dots, x^n$ at $p$ such that $V = \partial/\partial x^1$ on the coordinate neighborhood.

Proof. Let $\psi: \mathcal{U} \times I \to M$ be a local flow for $V$, where $\mathcal{U}$ is a neighborhood of $p$ on which $V$ is never zero. Let $S$ be a hypersurface through $p$ in $\mathcal{U}$ such that $V_p \notin T_p(S)$, and consider the restriction $\bar\psi: S \times I \to M$ of $\psi$. The map $\bar\psi|(S \times 0)$ is a trivial diffeomorphism to $S$, and $d\bar\psi_{(p,0)}$ merely identifies $T_{(p,0)}(S \times 0)$ with $T_p(S)$. Since $d\bar\psi(\partial/\partial t|_{(p,0)}) = V_p \notin T_p(S)$, it follows that $d\bar\psi_{(p,0)}$ is an isomorphism. Thus $\bar\psi$ is a diffeomorphism of some neighborhood $\mathcal{V} \times J$ of $(p, 0)$ onto a neighborhood $\mathcal{W}$ of $p$ in $M$.

We can suppose that $y^2, \dots, y^n$ is a coordinate system for $S$ on $\mathcal{V}$. Let $y^1 = t$ be the natural coordinate on $J \subset \mathbf{R}^1$. Transferring the product coordinate system $y^1, \dots, y^n$ to $\mathcal{W}$ by means of the diffeomorphism $\bar\psi|(\mathcal{V} \times J)$ gives the required coordinate system $x^1, \dots, x^n$. In fact, $\partial/\partial x^1 = d\bar\psi(\partial/\partial t) = V$, since $\psi(p, t) = \alpha_p(t)$. ∎

The bracket of vector fields can be described in terms of flows. Recall first that a smooth function $F: I \to V$ into a finite-dimensional real vector space has derivative $F': I \to V$ given by

$F'(s) = \lim_{t \to 0} \dfrac{1}{t}[F(s + t) - F(s)] = \sum \dfrac{df^i}{dt}(s)\,e_i,$

where $F = \sum f^i e_i$ relative to a basis $e_1, \dots, e_n$ for $V$. Now take the derivative of a vector field $W$ with respect to a vector field $V$ as follows. If $p \in M$ let $\alpha_p$ be the integral curve of $V$ starting at $p$. Use the flow of $V$ to move the value of $W$ at $\alpha_p(s)$ backward along $\alpha_p$ into $T_p(M)$. Then take the vector derivative at $s = 0$. The result turns out to be $[V, W]_p$.

58. Proposition. If $V, W \in \mathfrak{X}(M)$, let $\psi$ be a local flow of $V$ near $p \in M$. Then

$[V, W]_p = \lim_{t \to 0} \dfrac{1}{t}\left[d\psi_{-t}(W_{\psi_t p}) - W_p\right].$

Proof. Write $F_p(t) = d\psi_{-t}(W_{\psi_t p})$. Then the right-hand side of the above equation is $F_p'(0)$.

Case 1. $V_p \ne 0$. Choose a coordinate system $x^1, \dots, x^n$ as in Lemma 57 so that $V = \partial_1$. Thus the flow of $V$ only changes the $x^1$ coordinate of points $q$ near $p$:

$x^1(\psi_t q) = x^1(q) + t, \qquad x^j(\psi_t q) = x^j(q)$ for $2 \le j \le n$.

It follows that $d\psi_t(\partial_i) = \partial_i$ for all $i$ and $t$. Thus if $W = \sum W^i \partial_i$,

$F_p(t) = \sum W^i(\psi_t p)\,\partial_i|_p.$

Omitting the subscript $p$ on $F$ and $\alpha$ we compute

$F'(0) = \sum \dfrac{d}{dt}(W^i \circ \alpha)(0)\,\partial_i = \sum V_p(W^i)\,\partial_i = [V, W]_p,$

the last equality holding because $V = \partial_1$ has constant components.

Case 2. $V = 0$ on a neighborhood of $p$. Then $[V, W]_p = 0$. Integral curves starting in the neighborhood are constant, hence $\psi_t = \mathrm{id}$ for all $t$. Thus $F_p$ is constant, hence $F_p'(0) = 0$.

Case 3. $V_p = 0$, but $p$ is the limit of a sequence $\{p_i\}$ with $V_{p_i} \ne 0$ for all $i$. Their expressions in terms of coordinates show that both $F_p'(0)$ and $[V, W]_p$ depend continuously on $p$, hence the result follows by Case 1. ∎
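Proposition 58 can be tested numerically. The sketch below takes $V = x\,\partial_x - y\,\partial_y$ from Example 51, whose flow is $\psi_t(x, y) = (xe^t, ye^{-t})$, and a hypothetical second field $W = \partial_x + x\,\partial_y$ chosen only for illustration. It approximates $F_p'(0)$ by a central difference and compares it with the bracket computed from the usual rule $[V, W]^i = V(W^i) - W(V^i)$, which here gives $[V, W] = -\partial_x + 2x\,\partial_y$.

```python
import math

def W(point):
    """Hypothetical field W = d/dx + x d/dy, as a component pair."""
    x, _ = point
    return (1.0, x)

def flow(point, t):
    """Flow psi_t of V = x d/dx - y d/dy."""
    x, y = point
    return (x * math.exp(t), y * math.exp(-t))

def dflow(t, vec):
    """Differential of psi_t; the map is linear, so it is diag(e^t, e^-t) everywhere."""
    return (math.exp(t) * vec[0], math.exp(-t) * vec[1])

def F(p, t):
    """F_p(t) = d psi_{-t} (W at psi_t(p)), a curve in the tangent space at p."""
    return dflow(-t, W(flow(p, t)))

def bracket_via_flow(p, h=1e-5):
    """Central-difference approximation of F_p'(0)."""
    forward, backward = F(p, h), F(p, -h)
    return tuple((a - b) / (2 * h) for a, b in zip(forward, backward))

def bracket_by_formula(p):
    """[V, W] = -d/dx + 2x d/dy, from V(W^i) - W(V^i)."""
    x, _ = p
    return (-1.0, 2.0 * x)

p = (1.5, -0.5)           # hypothetical point
approx = bracket_via_flow(p)
exact = bracket_by_formula(p)
```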

Exercises

1. (a) A vector field $V$ on $M$ is smooth if for sufficiently many coordinate systems $\xi$ to cover $M$ the functions $Vx^i$ are smooth. (b) A map $\phi: M \to N$ is smooth if for sufficiently many coordinate systems $\eta$ to cover $N$ the functions $y^i \circ \phi$ are smooth.

2. A linear transformation $\phi: V \to W$ is smooth, and $d\phi(v_p) = (\phi v)_{\phi p}$.

3. (a) If id is the identity map of $M$, then $d(\mathrm{id})_p$ is the identity map of $T_p(M)$. (b) If $\phi: M \to B$ is a diffeomorphism then for each $p \in M$, $d\phi_p$ is a linear isomorphism with inverse $d(\phi^{-1})_{\phi p}$.

4. On the domain of a coordinate system $\xi$, if $V = \sum V^i \partial_i$ and $W = \sum W^j \partial_j$, then

$[V, W] = \sum_i \Big\{ \sum_j \Big( V^j \dfrac{\partial W^i}{\partial x^j} - W^j \dfrac{\partial V^i}{\partial x^j} \Big) \Big\} \partial_i.$

5. Given $v \in T_p(M)$, there is a vector field $V \in \mathfrak{X}(M)$ such that $V_p = v$ and a curve $\alpha$ such that $\alpha'(0) = v$.

6. (a) A map $\phi: M \to N$ is smooth if and only if $g \in \mathfrak{F}(N)$ implies $g \circ \phi \in \mathfrak{F}(M)$. (b) A smooth map $\phi: M \to N$, with $M$ connected, is constant if and only if $d\phi_p = 0$ for all $p \in M$.

7. Smooth real-valued functions $y^1, \dots, y^n$ on a neighborhood of $p$ in $M$ form a coordinate system on a (possibly smaller) neighborhood of $p$ if and only if for one, hence every, coordinate system $x^1, \dots, x^n$ at $p$, $\det\big((\partial y^i/\partial x^j)(p)\big) \ne 0$.

8. (a) If $P$ and $Q$ are submanifolds of $M$ and $P \subset Q$ then $P$ is a submanifold of $Q$. (b) If $P$ is a submanifold of $M$ such that $\dim P = \dim M$, then $P$ is an open submanifold of $M$.

9. (a) A smooth map $\psi$ of $M$ onto $B$ is a submersion if and only if $\psi$ has local sections; that is, given $p \in M$ there is a neighborhood $\mathcal{U}$ of $\psi(p)$ in $B$ and a smooth map $\lambda: \mathcal{U} \to M$ such that $\psi \circ \lambda = \mathrm{id}$ and $\lambda(\psi p) = p$.

10. Let $\psi: M \to B$ be a submersion. (a) A vector $v \in T_p(M)$ is tangent to the fiber $\psi^{-1}(\psi p)$ if and only if $d\psi(v) = 0$. (b) A map $\phi: B \to N$ is smooth if and only if $\phi \circ \psi$ is smooth.

11. Let $\phi: M \to N$ be a smooth map and let $p \in M$ be a point at which $d\phi_p$ is a linear isomorphism. (a) Given a coordinate system $\xi$ at $p$ or $\eta$ at $\phi(p)$ the other can be selected so that $\phi$ preserves coordinates: $y^i(\phi q) = x^i(q)$ for $q$ near $p$. (b) Then $d\phi(\partial/\partial x^i) = \partial/\partial y^i$, and if $f \in \mathfrak{F}(N)$, $\partial(f \circ \phi)/\partial x^i = (\partial f/\partial y^i) \circ \phi$ near $p$.

12. In each case find a formula for the maximal integral curve $\alpha_p$ starting at an arbitrary point $p$. (a) $V = y\,\partial_x - x\,\partial_y$ on $\mathbf{R}^2$. (b) $V = x\,\partial_x + (x + y)\,\partial_y$ on $\mathbf{R}^2$. Sketch a few integral curves. (c) $V = u^2\,d/du$ on $\mathbf{R}^1$. Note that $V$ is not complete; find the largest domain $\mathcal{D}$ of its flow.

13. For a function $f \in \mathfrak{F}(M)$, its differential $df: T(M) \to \mathbf{R}$ and its differential map $df: T(M) \to T(\mathbf{R}^1)$ differ only by a canonical isomorphism.

14. A critical point of $f \in \mathfrak{F}(M)$ is a point $p \in M$ at which $df_p = 0$. (a) At such a point there exists a Hessian function $H: T_pM \times T_pM \to \mathbf{R}$ such that $H(X_p, Y_p) = X_p(Yf) = Y_p(Xf)$ for all $X, Y \in \mathfrak{X}(M)$. (b) $H$ is bilinear, symmetric, and satisfies $H(\partial_i|_p, \partial_j|_p) = (\partial^2 f/\partial x^i \partial x^j)(p)$ relative to a coordinate system. Also $H(v, v) = (d^2(f \circ \alpha)/ds^2)(0)$ if $\alpha'(0) = v$.

15. (a) Let $\phi: P \to M$ be a one-to-one immersion; make $\phi(P)$ a manifold so that $P \to \phi(P)$ is a diffeomorphism. Show that $\phi(P)$ is an immersed submanifold of $M$, and that if $P$ is compact $\phi(P)$ is a submanifold. (b) By considering a subset of $\mathbf{R}^2$ shaped like a figure 8, show that an immersed submanifold need not have the induced topology and that Corollaries 29 and 30 both fail for immersed submanifolds.

16. Let $\psi$ be the flow of $V \in \mathfrak{X}(M)$. (a) If $\psi_t(p) = p$ for a sequence of $t$-values approaching zero, then $V_p = 0$ (hence $\psi_t(p) = p$ for all $t$). (b) If an integral curve $\alpha: [0, \infty) \to M$ of $V$ is extendible, with endpoint $q$, then $V_q = 0$.

17. (a) Let $a \in I \subset \mathbf{R}$. If $f \in \mathfrak{F}(I)$ and $f(a) = 0$ show that there exists a function $g \in \mathfrak{F}(I)$ such that $f(s) = (s - a)g(s)$ for all $s \in I$. (b) Prove the analogue with $f$ replaced by a vector field on a curve.

18. Let $\phi: P \to M$ be an immersion and fix $p \in P$. (a) If $f \in \mathfrak{F}(P)$ there exists a function $\tilde{f} \in \mathfrak{F}(M)$ such that $f = \tilde{f} \circ \phi$ on some neighborhood of $p$ in $P$. (Hint: See Proposition 33(3).) (b) If $Z \in \mathfrak{X}(\phi)$ there exists $X \in \mathfrak{X}(M)$ such that $Z = X_\phi$ on some neighborhood of $p$ in $P$.

19. An atlas on a space $S$ is maximal if it is not contained in any strictly larger atlas on $S$. Prove that an atlas is maximal if and only if it is complete.

2

TENSORS

The notion of tensor field on a manifold generalizes the notions of real-valued function, vector field, and one-form, and thus provides the mathematical means of describing more complicated objects on a manifold. Tensors occur in many different guises, but their characteristic property is always multilinearity. The definition we use stresses this and converts easily into the classical coordinate description of tensor. The last part of the chapter deals with the generalization of inner product on which semi-Riemannian geometry is based.

BASIC ALGEBRA

The following definitions cover the two main cases we need: the module $\mathfrak{X}(M)$ over $\mathfrak{F}(M)$, and the vector space $T_p(M)$ over $\mathbf{R}$. Let $V_1, \dots, V_s$ be modules over a ring $K$. Then $V_1 \times \cdots \times V_s$ is the set of all $s$-tuples $(v_1, \dots, v_s)$ with $v_i \in V_i$. The usual componentwise definitions of addition and of multiplication by an element of $K$ make $V_1 \times \cdots \times V_s$ a module over $K$, called a direct product (or direct sum if the notation $\times$ is replaced by $\oplus$). If $W$ is also a module over $K$, a function

$A: V_1 \times \cdots \times V_s \to W$

is $K$-multilinear provided $A$ is $K$-linear in each slot, that is, for $1 \le i \le s$ and $v_j \in V_j$ ($j \ne i$), the function

$v \to A(v_1, \dots, v_{i-1}, v, v_{i+1}, \dots, v_s)$

is $K$-linear.

If $V$ is a module over $K$, let $V^*$ be the set of all $K$-linear functions from $V$ to $K$. The usual definition of addition of functions and multiplication by elements of $K$ makes $V^*$ a module over $K$, called the dual module of $V$. If $V_i = V$ for $1 \le i \le s$, the notation $V_1 \times \cdots \times V_s$ is abbreviated to $V^s$.

1. Definition. For integers $r \ge 0$, $s \ge 0$ not both zero, a $K$-multilinear function $A: (V^*)^r \times V^s \to K$ is called a tensor of type $(r, s)$ over $V$. (Here we understand $A: V^s \to K$ if $r = 0$, and $A: (V^*)^r \to K$ if $s = 0$.)

The set $\mathfrak{T}^r_s(V)$ of all tensors of type $(r, s)$ over $V$ is a module over $K$, again with the usual definitions of functional addition and multiplication by an element of $K$. A tensor of type $(0, 0)$ over $V$ is simply an element of $K$.

TENSOR FIELDS

A tensor field $A$ on a manifold $M$ is a tensor over the $\mathfrak{F}(M)$-module $\mathfrak{X}(M)$, as defined above. Thus if $A$ has type $(r, s)$ it is an $\mathfrak{F}(M)$-multilinear function

$A: \mathfrak{X}^*(M)^r \times \mathfrak{X}(M)^s \to \mathfrak{F}(M).$

So $A$ is a multilinear machine which when fed $r$ one-forms $\theta^1, \dots, \theta^r$ and $s$ vector fields $X_1, \dots, X_s$ produces a real-valued function

$f = A(\theta^1, \dots, \theta^r, X_1, \dots, X_s) \in \mathfrak{F}(M).$

Here $\theta^i$ occupies the $i$th contravariant slot, $X_j$ the $j$th covariant slot of $A$. The set $\mathfrak{T}^r_s(M)$ of all tensor fields on $M$ of type $(r, s)$ is then a module over $\mathfrak{F}(M)$. In the exceptional case $r = s = 0$, a tensor field on $M$ of type $(0, 0)$ is just a function $f \in \mathfrak{F}(M)$; that is, $\mathfrak{T}^0_0(M) = \mathfrak{F}(M)$. To show that a given function $A: \mathfrak{X}^*(M)^r \times \mathfrak{X}(M)^s \to \mathfrak{F}(M)$ is a tensor we must show that it is $\mathfrak{F}(M)$-linear in each slot (i.e., in each variable separately). Additivity in each slot is often obvious, so the crucial question is whether functions can be factored out of each slot:

$A(\theta^1, \dots, \theta^r, X_1, \dots, fX_i, \dots, X_s) = fA(\theta^1, \dots, \theta^r, X_1, \dots, X_i, \dots, X_s).$

Consider the following examples.

(1) The evaluation function $E: \mathfrak{X}^*(M) \times \mathfrak{X}(M) \to \mathfrak{F}(M)$ is given by $E(\theta, X) = \theta X$. Clearly $E$ is $\mathfrak{F}(M)$-linear in each slot, so it is a tensor field on $M$ of type $(1, 1)$.

(2) Fix a one-form $\omega \ne 0$ and define $F: \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{F}(M)$ by $F(X, Y) = X(\omega Y)$ for all $X, Y$. Then $F$ is $\mathfrak{F}(M)$-linear in $X$ but only additive in $Y$. In fact, $F(X, fY) = X(\omega(fY)) = X(f\,\omega Y) = (Xf)\,\omega Y + fF(X, Y)$. Thus $F$ is not a tensor field.
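Example (2) can be made concrete on $M = \mathbf{R}^1$ with $\omega = dx$. Writing $X = a\,d/dx$ and $Y = b\,d/dx$, one has $\omega Y = b$ and so $F(X, Y) = a\,b'$. The sketch below, with hypothetical coefficient functions and a central-difference derivative, checks that a function factors out of the first slot but leaves the defect $(Xf)\,\omega Y = a f' b$ in the second.

```python
import math

def derivative(g, x, h=1e-6):
    """Central-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

def F(a, b):
    """F(X, Y) = X(omega(Y)) for X = a d/dx, Y = b d/dx, omega = dx on R^1."""
    return lambda x: a(x) * derivative(b, x)

# Hypothetical coefficient functions and test point.
a = lambda x: 1.0 + x * x
b = math.sin
f = math.exp
x0 = 0.4

fa = lambda x: f(x) * a(x)      # coefficient of fX
fb = lambda x: f(x) * b(x)      # coefficient of fY

first_slot_defect = F(fa, b)(x0) - f(x0) * F(a, b)(x0)    # should vanish
second_slot_defect = F(a, fb)(x0) - f(x0) * F(a, b)(x0)   # equals (Xf) * omega(Y)
expected_defect = a(x0) * derivative(f, x0) * b(x0)
```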

In these examples, note that the tensor (1) is algebraic, while the nontensor (2) involves differentiation. While we only add tensors of the same type, any two tensors can be multiplied as follows: If $A \in \mathfrak{T}^r_s(M)$ and $B \in \mathfrak{T}^{r'}_{s'}(M)$, define

$A \otimes B: \mathfrak{X}^*(M)^{r+r'} \times \mathfrak{X}(M)^{s+s'} \to \mathfrak{F}(M)$

by

$(A \otimes B)(\theta^1, \dots, \theta^{r+r'}, X_1, \dots, X_{s+s'}) = A(\theta^1, \dots, \theta^r, X_1, \dots, X_s)\,B(\theta^{r+1}, \dots, \theta^{r+r'}, X_{s+1}, \dots, X_{s+s'}).$

Then $A \otimes B$ is a tensor of type $(r + r', s + s')$, called the tensor product of $A$ and $B$. If $r' = s' = 0$, so $B$ is a function $f \in \mathfrak{F}(M)$, define

$A \otimes f = f \otimes A = fA.$

Thus if $A$ is also of type $(0, 0)$, the tensor product reduces to ordinary multiplication in $\mathfrak{F}(M)$. Evidently the tensor product is $\mathfrak{F}(M)$-bilinear, that is,

$(fA + gA') \otimes B = fA \otimes B + gA' \otimes B,$

with a similar identity for $B$. Furthermore it is immediate from the definition that the tensor product is associative; thus $A \otimes B \otimes C$ is well defined for tensors of any types. However the tensor product is generally not commutative. For example, on a coordinate neighborhood

$(dx^1 \otimes dx^2)(\partial_1, \partial_2) = dx^1(\partial_1)\,dx^2(\partial_2) = 1,$
$(dx^2 \otimes dx^1)(\partial_1, \partial_2) = dx^2(\partial_1)\,dx^1(\partial_2) = 0,$

so $dx^1 \otimes dx^2 \ne dx^2 \otimes dx^1$. On the other hand, functions commute with everything:

$f(A \otimes B) = fA \otimes B = A \otimes fB.$
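The failure of commutativity is easy to reproduce at a single point of $\mathbf{R}^2$, where one-forms are covectors and vector fields are vectors. The following minimal sketch models $(0, 1)$ tensors as Python functions; the helper names `covector` and `tensor` are assumptions made for the illustration.

```python
def covector(components):
    """Return the linear function v -> sum_i components[i] * v[i]."""
    return lambda v: sum(c * x for c, x in zip(components, v))

def tensor(alpha, beta):
    """Tensor product of two (0,1) tensors, a (0,2) tensor: (v, w) -> alpha(v) beta(w)."""
    return lambda v, w: alpha(v) * beta(w)

# Coordinate one-forms and coordinate vectors at a point of R^2.
dx1, dx2 = covector((1, 0)), covector((0, 1))
d1, d2 = (1, 0), (0, 1)

value_12 = tensor(dx1, dx2)(d1, d2)   # dx1(d1) * dx2(d2)
value_21 = tensor(dx2, dx1)(d1, d2)   # dx2(d1) * dx1(d2)
```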

INTERPRETATIONS

If $\omega$ is a smooth one-form on a manifold $M$, then the function $X \to \omega(X)$ is $\mathfrak{F}(M)$-linear from $\mathfrak{X}(M)$ to $\mathfrak{F}(M)$, hence is a $(0, 1)$ tensor field. Every $(0, 1)$ tensor field arises in this way from a unique one-form (Exercise 4), so we write simply

$\mathfrak{T}^0_1(M) = \mathfrak{X}^*(M).$

There are two less obvious interpretations that will be used frequently.

(1) If $V$ is a (smooth) vector field on $M$, define

$V(\theta) = \theta(V)$ for all $\theta \in \mathfrak{X}^*(M)$.

This function $V: \mathfrak{X}^*(M) \to \mathfrak{F}(M)$ is $\mathfrak{F}(M)$-linear, hence is a $(1, 0)$ tensor field. Every $(1, 0)$ tensor field on $M$ arises in this way from a unique vector field (Exercise 5), so we write

$\mathfrak{T}^1_0(M) = \mathfrak{X}(M).$

(2) If $A: \mathfrak{X}(M)^s \to \mathfrak{X}(M)$ is $\mathfrak{F}(M)$-multilinear, define $\bar{A}: \mathfrak{X}^*(M) \times \mathfrak{X}(M)^s \to \mathfrak{F}(M)$ by

$\bar{A}(\theta, X_1, \dots, X_s) = \theta(A(X_1, \dots, X_s))$ for all $\theta$ and $X_i$.

Evidently $\bar{A}$ is $\mathfrak{F}(M)$-multilinear, hence is a $(1, s)$ tensor field. We shall consider $A$ itself to be a tensor field, using the formula above only when necessary.

Tensors of type $(0, s)$ are said to be covariant, while tensors of type $(r, 0)$ with $r \ge 1$ are contravariant. For example, real-valued functions and one-forms are covariant; vector fields are contravariant. An $(r, s)$ tensor is mixed if neither $r$ nor $s$ is zero. Notice that the definition of tensor product shows that if $A$ is covariant and $B$ contravariant then $A \otimes B = B \otimes A$.

TENSORS AT A POINT

The goal of this section is to show that, just as for a vector field or one-form, any tensor field $A$ on $M$ can indeed be viewed as a field on $M$, assigning a value $A_p$ at each point $p \in M$. The essential fact is that when $A$ is evaluated on one-forms and vector fields to give a real-valued function $A(\theta^1, \dots, \theta^r, X_1, \dots, X_s)$, the value of this function at a point $p \in M$ depends not on the entirety of each one-form and vector field, or even on their values on a neighborhood of $p$, but solely on their values at the point $p$ itself. Formally:

2. Proposition. Let $p \in M$ and $A \in \mathfrak{T}^r_s(M)$. Let $\bar\theta^1, \dots, \bar\theta^r$ and $\theta^1, \dots, \theta^r$ be one-forms such that $\bar\theta^i|_p = \theta^i|_p$ ($1 \le i \le r$); let $\bar{X}_1, \dots, \bar{X}_s$ and $X_1, \dots, X_s$ be vector fields such that $\bar{X}_j|_p = X_j|_p$ ($1 \le j \le s$). Then

$A(\bar\theta^1, \dots, \bar\theta^r, \bar{X}_1, \dots, \bar{X}_s)(p) = A(\theta^1, \dots, \theta^r, X_1, \dots, X_s)(p).$

The proof will be easy once we establish the following fact.

3. Lemma. If any one of the one-forms $\theta^1, \dots, \theta^r$ or vector fields $X_1, \dots, X_s$ is zero at $p$, then $A(\theta^1, \dots, \theta^r, X_1, \dots, X_s)(p) = 0$.

Proof. Suppose that, say, $X_s|_p = 0$. Let $x^1, \dots, x^n$ be a coordinate system on a neighborhood $\mathcal{U}$ of $p$. Then $X_s = \sum X^i \partial_i$ on $\mathcal{U}$, where $X^i = X_s x^i \in \mathfrak{F}(\mathcal{U})$. Let $f$ be a bump function at $p$ with support in $\mathcal{U}$ (Lemma 1.8). Then as usual $fX^i$ is a smooth function on all of $M$, and similarly $f\partial_i \in \mathfrak{X}(M)$. Hence

$f^2 A(\theta^1, \dots, X_s) = A(\theta^1, \dots, f^2 X_s) = A\big(\theta^1, \dots, \sum fX^i\, f\partial_i\big) = \sum_{i=1}^n fX^i A(\theta^1, \dots, f\partial_i).$

Since $X_s|_p = 0$, each $X^i(p) = 0$; also $f(p) = 1$. Hence evaluating the formula above at $p$ yields $A(\theta^1, \dots, X_s)(p) = 0$. ∎

Proof of Proposition 2. For clarity suppose $r = 1$ and $s = 2$. Consider the following telescoping identity (extendible in an obvious way to any $r$, $s$):

$A(\bar\theta, \bar{X}, \bar{Y}) - A(\theta, X, Y) = A(\bar\theta - \theta, \bar{X}, \bar{Y}) + A(\theta, \bar{X} - X, \bar{Y}) + A(\theta, X, \bar{Y} - Y).$

By hypothesis $\bar\theta - \theta$, $\bar{X} - X$, and $\bar{Y} - Y$ all vanish at the point $p$. Hence, by the lemma, $A(\bar\theta, \bar{X}, \bar{Y})(p) = A(\theta, X, Y)(p)$. ∎

It follows immediately from Proposition 2 that a tensor field $A \in \mathfrak{T}^r_s(M)$ has a value $A_p$ at each point $p$ of $M$, namely, the function

$A_p: (T_pM^*)^r \times (T_pM)^s \to \mathbf{R}$

defined as follows. If $\alpha^1, \dots, \alpha^r \in T_pM^*$ and $x_1, \dots, x_s \in T_pM$, let

$A_p(\alpha^1, \dots, \alpha^r, x_1, \dots, x_s) = A(\theta^1, \dots, \theta^r, X_1, \dots, X_s)(p),$

where $\theta^1, \dots, \theta^r$ are any one-forms on $M$ such that $\theta^i|_p = \alpha^i$ ($1 \le i \le r$) and $X_1, \dots, X_s$ are any vector fields such that $X_j|_p = x_j$ ($1 \le j \le s$). It is easy to check that the function $A_p$ is $\mathbf{R}$-multilinear; then by Definition 1 it is an $(r, s)$ tensor over $T_p(M)$. We can thus consider $A \in \mathfrak{T}^r_s(M)$ as a field smoothly assigning to each $p \in M$ the tensor $A_p$. Just as a vector field is a smooth section of the tangent bundle $TM$, such a field $p \to A_p$ is a smooth section of the $(r, s)$ tensor bundle, the latter obtained, roughly speaking, by replacing each $T_p(M)$ in $TM$ by the space $T_p(M)^r_s$ of $(r, s)$ tensors over $T_p(M)$. Conversely, a smooth cross section, say $p \to B_p \in T_p(M)^1_2$ for simplicity, arises from the unique tensor $B \in \mathfrak{T}^1_2(M)$ given by

$B(\theta, X, Y)(p) = B_p(\theta_p, X_p, Y_p)$

for all points $p$, one-forms $\theta$, and vector fields $X$, $Y$. (That the section is smooth means that the values of $B$ are in $\mathfrak{F}(M)$.) In particular the cross-sectional interpretation shows that if $A \in \mathfrak{T}^r_s(M)$ and $\mathcal{U}$ is an open set of $M$, then the restriction $A|\mathcal{U}$ of $A$ to $\mathcal{U}$ is a well-defined tensor field on $\mathcal{U}$.

TENSOR COMPONENTS

The coordinate formulas $X = \sum X(x^i)\,\partial_i$ for a vector field and $\theta = \sum \theta(\partial_i)\,dx^i$ for a one-form extend readily to tensor fields of arbitrary type.

4. Definition. Let $\xi = (x^1, \dots, x^n)$ be a coordinate system on $\mathcal{U} \subset M$. If $A \in \mathfrak{T}^r_s(M)$ the components of $A$ relative to $\xi$ are the real-valued functions

$A^{i_1 \dots i_r}_{j_1 \dots j_s} = A(dx^{i_1}, \dots, dx^{i_r}, \partial_{j_1}, \dots, \partial_{j_s})$ on $\mathcal{U}$,

where all indices run from 1 to $n = \dim M$.

Evidently for a $(0, 1)$ tensor, that is, a one-form, these components are exactly those of the formula $\theta = \sum \theta(\partial_i)\,dx^i$. To see that the corresponding agreement holds for a vector field $X$ we must use its interpretation as a $(1, 0)$ tensor field. By the definition above the $i$th component of $X$ relative to $\xi$ is $X(dx^i)$, which is interpreted as $dx^i(X) = Xx^i$. Similarly, when a $(1, s)$ tensor field is given in the form $A: \mathfrak{X}(M)^s \to \mathfrak{X}(M)$ its components are determined directly by the equation

$A(\partial_{i_1}, \dots, \partial_{i_s}) = \sum_j A^j_{i_1 \dots i_s}\,\partial_j,$

since for its interpretation $\bar{A} \in \mathfrak{T}^1_s(M)$

$\bar{A}(dx^j, \partial_{i_1}, \dots, \partial_{i_s}) = dx^j(A(\partial_{i_1}, \dots, \partial_{i_s})) = \sum_k A^k_{i_1 \dots i_s}\,dx^j(\partial_k) = A^j_{i_1 \dots i_s}.$

The evaluation of a tensor field on one-forms and vector fields can be described in terms of coordinates by writing everything out in components. For example, suppose $A$ is a $(1, 2)$ tensor. Write $\theta = \sum \theta_k\,dx^k$ for an arbitrary one-form and $X = \sum X^i \partial_i$, $Y = \sum Y^j \partial_j$ for arbitrary vector fields. The $\mathfrak{F}(M)$-multilinearity of $A$ then yields

$A(\theta, X, Y) = \sum_{i,j,k} A(dx^k, \partial_i, \partial_j)\,\theta_k X^i Y^j = \sum_{i,j,k} A^k_{ij}\,\theta_k X^i Y^j.$
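The component formula for evaluating a $(1, 2)$ tensor translates directly into a triple sum. The sketch below works at a single point with $n = 2$ and integer components chosen arbitrarily for illustration; the array layout `A[k][i][j]` for $A^k_{ij}$ is an assumption of the example. It recovers a component by evaluating on basis elements and checks additivity in one slot.

```python
from itertools import product

n = 2
# Hypothetical components A^k_{ij}, stored as A[k][i][j].
A = [[[1, 2], [3, 4]],
     [[5, 6], [7, 8]]]

def evaluate(A, theta, X, Y):
    """A(theta, X, Y) = sum over i, j, k of A^k_{ij} theta_k X^i Y^j."""
    return sum(A[k][i][j] * theta[k] * X[i] * Y[j]
               for k, i, j in product(range(n), repeat=3))

def basis(i):
    """Components of the i-th coordinate vector (or coordinate one-form)."""
    return [1 if m == i else 0 for m in range(n)]

# Evaluating on dx^2, d_1, d_2 (0-based indices 1, 0, 1) returns that component.
component = evaluate(A, basis(1), basis(0), basis(1))

theta, X1, X2, Y = [1, 2], [3, 5], [1, 4], [7, 11]
X_sum = [a + b for a, b in zip(X1, X2)]
additive = (evaluate(A, theta, X_sum, Y)
            == evaluate(A, theta, X1, Y) + evaluate(A, theta, X2, Y))
```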

For a fixed coordinate system, the components of a sum of tensors are just the sum of the components. (Recall that only tensors of the same type are added.) The components of a tensor product are given by

$(A \otimes B)^{i_1 \dots i_{r+r'}}_{j_1 \dots j_{s+s'}} = A^{i_1 \dots i_r}_{j_1 \dots j_s}\,B^{i_{r+1} \dots i_{r+r'}}_{j_{s+1} \dots j_{s+s'}},$

where, as usual, all indices run from 1 to $n = \dim M$. To check this formula, suppose that $A$ has type $(1, 2)$ and $B$ has type $(1, 1)$. Then $A \otimes B$ is a $(2, 3)$ tensor with components:

$(A \otimes B)^{kq}_{ijp} = (A \otimes B)(dx^k, dx^q, \partial_i, \partial_j, \partial_p) = A(dx^k, \partial_i, \partial_j) \cdot B(dx^q, \partial_p) = A^k_{ij} B^q_p.$

Let $\xi$ be a coordinate system on $\mathcal{U} \subset M$. Then just as for a vector field or one-form, any tensor has a unique expression on $\mathcal{U}$ in terms of its components relative to $\xi$. Suppose for example that $r = 1$ and $s = 2$. Then $\partial_k \otimes dx^i \otimes dx^j$ is a $(1, 2)$ tensor on $\mathcal{U}$ for all $1 \le i, j, k \le n$. We assert that if $A$ is any $(1, 2)$ tensor, then

$A = \sum A^k_{ij}\,\partial_k \otimes dx^i \otimes dx^j$ on $\mathcal{U}$,

where each index is summed from 1 to $n$. Since both sides are $\mathfrak{F}(\mathcal{U})$-multilinear it suffices to check that they have the same value on $dx^m, \partial_p, \partial_q$ for all $1 \le m, p, q \le n$. This follows immediately from

$(\partial_k \otimes dx^i \otimes dx^j)(dx^m, \partial_p, \partial_q) = dx^m(\partial_k)\,dx^i(\partial_p)\,dx^j(\partial_q) = \delta^m_k \delta^i_p \delta^j_q,$

where for the sake of index balance we use the extended Kronecker delta

$\delta_{ij} = \delta^i_j = \delta^{ij} = 1$ if $i = j$, and $0$ if $i \ne j$.

In general:

5. Lemma. Let $x^1, \dots, x^n$ be a coordinate system on $\mathcal{U} \subset M$. If $A$ is an $(r, s)$ tensor field, then on $\mathcal{U}$,

$A = \sum A^{i_1 \dots i_r}_{j_1 \dots j_s}\,\partial_{i_1} \otimes \cdots \otimes \partial_{i_r} \otimes dx^{j_1} \otimes \cdots \otimes dx^{j_s},$

where each index is summed from 1 to $n$.

CONTRACTION

There is a remarkable operation called contraction that shrinks $(r, s)$ tensors to $(r - 1, s - 1)$ tensors. The general definition derives from the following special case.

6. Lemma. There is a unique $\mathfrak{F}(M)$-linear function $C: \mathfrak{T}^1_1(M) \to \mathfrak{F}(M)$, called $(1, 1)$ contraction, such that $C(X \otimes \theta) = \theta X$ for all $X \in \mathfrak{X}(M)$ and $\theta \in \mathfrak{X}^*(M)$.

Proof. Since it is to be $\mathfrak{F}(M)$-linear, $C$ will be a pointwise operation. On a coordinate neighborhood $\mathcal{U}$ a $(1, 1)$ tensor field $A$ can be written as $\sum A^i_j\,\partial_i \otimes dx^j$. Since $C(\partial_i \otimes dx^j)$ must be $dx^j(\partial_i) = \delta^j_i$, we have no choice but to define

$C(A) = \sum_i A^i_i = \sum_i A(dx^i, \partial_i).$

Then $C$ has the required properties on $\mathcal{U}$. To obtain the required global function it suffices to show that this definition is independent of the choice of coordinate system. But

$\sum_m A\Big(dy^m, \dfrac{\partial}{\partial y^m}\Big) = \sum_m A\Big(\sum_i \dfrac{\partial y^m}{\partial x^i}\,dx^i, \sum_j \dfrac{\partial x^j}{\partial y^m}\,\dfrac{\partial}{\partial x^j}\Big) = \sum_{i,j} \delta^j_i\,A\Big(dx^i, \dfrac{\partial}{\partial x^j}\Big) = \sum_i A\Big(dx^i, \dfrac{\partial}{\partial x^i}\Big).$ ∎

Evidently $(1, 1)$ contraction is closely related to the notion of trace (see Exercise 9). To extend $(1, 1)$ contraction $C$ to tensors of higher type the scheme is to specify one covariant slot and one contravariant slot, and apply $C$ to these. Suppose $A \in \mathfrak{T}^r_s(M)$ and $1 \le i \le r$ and $1 \le j \le s$. Fix one-forms $\theta^1, \dots, \theta^{r-1}$ and vector fields $X_1, \dots, X_{s-1}$. Then the function

$(\theta, X) \to A(\theta^1, \dots, \theta, \dots, \theta^{r-1}, X_1, \dots, X, \dots, X_{s-1}),$

with $\theta$ in the $i$th contravariant slot and $X$ in the $j$th covariant slot, is a $(1, 1)$ tensor that can be written as

$A(\theta^1, \dots, \cdot, \dots, \theta^{r-1}, X_1, \dots, \cdot, \dots, X_{s-1}).$

Applying the $(1, 1)$ contraction to this tensor produces a real-valued function denoted by

$(C^i_j A)(\theta^1, \dots, \theta^{r-1}, X_1, \dots, X_{s-1}).$

Evidently $C^i_j A$ is $\mathfrak{F}(M)$-multilinear in its arguments. Hence it is a tensor of type $(r - 1, s - 1)$ called the contraction of $A$ over $i$, $j$. For example, if $A$ is a $(2, 3)$ tensor field then $C^1_3(A)$ is the $(1, 2)$ tensor field given by

$(C^1_3 A)(\theta, X, Y) = C\{A(\cdot, \theta, X, Y, \cdot)\}.$

Relative to a coordinate system the components of $C^1_3 A$ are

$(C^1_3 A)^k_{ij} = (C^1_3 A)(dx^k, \partial_i, \partial_j) = C(A(\cdot, dx^k, \partial_i, \partial_j, \cdot)) = \sum_m A(dx^m, dx^k, \partial_i, \partial_j, \partial_m) = \sum_m A^{mk}_{ijm},$

where we use the coordinate formula for $C$ from the preceding proof.
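The component formula $\sum_m A^{mk}_{ijm}$ can be tried on a decomposable $(2, 3)$ tensor $X \otimes Z \otimes \alpha \otimes \beta \otimes \gamma$ at a point, for which contracting the first contravariant slot against the third covariant slot should give $\gamma(X)\,Z \otimes \alpha \otimes \beta$. The sketch below uses $n = 2$ and arbitrary integer components; the storage order `T[m][k][i][j][p]` for $A^{mk}_{ijp}$ is an assumption of the example.

```python
n = 2
# Hypothetical vectors X, Z and covectors a, b, c at a point (integer components).
X, Z = [2, -1], [3, 5]
a, b, c = [1, 4], [0, 7], [6, 2]

# Components of X (x) Z (x) a (x) b (x) c, stored as T[m][k][i][j][p] = A^{mk}_{ijp}.
T = [[[[[X[m] * Z[k] * a[i] * b[j] * c[p]
         for p in range(n)] for j in range(n)] for i in range(n)]
      for k in range(n)] for m in range(n)]

def contract_1_3(T):
    """Components (C^1_3 A)^k_{ij} = sum_m A^{mk}_{ijm}."""
    return [[[sum(T[m][k][i][j][m] for m in range(n))
              for j in range(n)] for i in range(n)] for k in range(n)]

C = contract_1_3(T)

# The pairing c(X) that the contraction should pull out as a scalar factor.
pairing = sum(c[m] * X[m] for m in range(n))
expected = [[[pairing * Z[k] * a[i] * b[j]
              for j in range(n)] for i in range(n)] for k in range(n)]
```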

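7. Corollary. Let $1 \le i \le r$ and $1 \le j \le s$. Relative to a coordinate system, if $A \in \mathfrak{T}^r_s(M)$ has components $A^{i_1 \dots i_r}_{j_1 \dots j_s}$, then $C^i_j A$ has components

$\sum_m A^{i_1 \dots m \dots i_r}_{j_1 \dots m \dots j_s},$

where $m$ occupies the $i$th upper index and the $j$th lower index.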

COVARIANT TENSORS

By means of a mapping from $M$ to $N$ any covariant tensor on $N$ can be pulled back to $M$. The following definition uses the cross-sectional view of tensors.

8. Definition. Let $\phi: M \to N$ be a smooth mapping. If $A \in \mathfrak{T}^0_s(N)$ with $s \ge 1$, let

$(\phi^* A)(v_1, \dots, v_s) = A(d\phi\,v_1, \dots, d\phi\,v_s)$

for all $v_i \in T_p(M)$, $p \in M$. Then $\phi^*(A)$ is called the pullback of $A$ by $\phi$.

At each point $p$ of $M$, $\phi^*(A)$ gives an $\mathbf{R}$-multilinear function from $T_p(M)^s$ to $\mathbf{R}$, that is, a $(0, s)$ tensor over $T_p(M)$. Coordinate computations as in Exercise 8 show that $\phi^*(A)$ is a smooth covariant tensor field on $M$. In the special case of a $(0, 0)$ tensor $f \in \mathfrak{F}(N)$, the pullback to $M$ is defined to be $\phi^*(f) = f \circ \phi \in \mathfrak{F}(M)$. Note that $\phi^*(df) = d(\phi^* f)$. The following properties of the pullback operation are easily verified.

9. Lemma. (1) If $\phi: M \to N$ is a smooth mapping, then $\phi^*: \mathfrak{T}^0_s(N) \to \mathfrak{T}^0_s(M)$ is $\mathbf{R}$-linear for each $s \ge 0$, and

$\phi^*(A \otimes B) = \phi^*(A) \otimes \phi^*(B)$

for covariant tensors of arbitrary types $(0, s)$ and $(0, t)$.

(2) If $\psi: N \to P$ is also a smooth map, then

$(\psi \circ \phi)^* = \phi^* \circ \psi^*: \mathfrak{T}^0_s(P) \to \mathfrak{T}^0_s(M)$

for all $s \ge 0$.
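As a worked instance of Definition 8, take the polar-coordinate map $\phi(r, \vartheta) = (r\cos\vartheta, r\sin\vartheta)$ into $\mathbf{R}^2$ and the $(0, 2)$ tensor $A = dx \otimes dx + dy \otimes dy$. This map and tensor are illustrative assumptions, not examples from the text. The pullback should have components $1$, $0$, $0$, $r^2$ relative to $\partial_r, \partial_\vartheta$. The sketch evaluates $(\phi^*A)(v, w) = A(d\phi\,v, d\phi\,w)$ on coordinate vectors at one hypothetical point.

```python
import math

def dphi(r, theta, v):
    """Differential of phi(r, theta) = (r cos theta, r sin theta) applied to v = (v_r, v_theta)."""
    v_r, v_t = v
    return (math.cos(theta) * v_r - r * math.sin(theta) * v_t,
            math.sin(theta) * v_r + r * math.cos(theta) * v_t)

def A(u, w):
    """The (0,2) tensor dx (x) dx + dy (x) dy on R^2, evaluated on vectors u, w."""
    return u[0] * w[0] + u[1] * w[1]

def pullback(r, theta, v, w):
    """(phi^* A)(v, w) = A(dphi v, dphi w) at the point (r, theta)."""
    return A(dphi(r, theta, v), dphi(r, theta, w))

def pullback_components(r, theta):
    """Components of phi^* A relative to the coordinate vectors d_r, d_theta."""
    e = [(1.0, 0.0), (0.0, 1.0)]
    return [[pullback(r, theta, e[i], e[j]) for j in range(2)] for i in range(2)]

g = pullback_components(2.0, 0.7)   # hypothetical point r = 2, theta = 0.7
```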

A tensor field of type $(r, s)$ with $r \ge 1$ can generally be moved neither from $M$ to $N$ nor from $N$ to $M$ by an arbitrary mapping $M \to N$. Let $A$ be a covariant or contravariant tensor of type at least 2. $A$ is symmetric if transposing any two of its arguments leaves its value unchanged. $A$ is skew-symmetric (or alternate) if each such reversal produces a sign change. Functions, one-forms, and vector fields are considered by convention to be both symmetric and skew-symmetric. A differential $s$-form is a skew-symmetric covariant tensor field of type $(0, s)$. For the calculus of differential forms, see, for example, [BG].

10. Lemma. Let $\mu$ be an $n$-form. If $V_i = \sum_j A_{ij} W_j$ for $1 \le i \le n$, then

$\mu(V_1, \dots, V_n) = (\det A)\,\mu(W_1, \dots, W_n).$

The proof is a standard combinatorial argument.
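Lemma 10 can be spot-checked on $\mathbf{R}^3$, taking for $\mu$ the determinant 3-form $\mu(V_1, V_2, V_3) = \det$ of the matrix with rows $V_1, V_2, V_3$. The choice of $\mu$ and the integer matrices below are assumptions made for illustration; integer arithmetic keeps the comparison exact.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def mu(v1, v2, v3):
    """The determinant 3-form on R^3: skew-symmetric and multilinear in its arguments."""
    return det3([v1, v2, v3])

# Hypothetical coefficient matrix A_ij and vectors W_1, W_2, W_3 (rows of W).
A = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]
W = [[1, 0, 2],
     [0, 3, 1],
     [1, 1, 0]]

# V_i = sum_j A_ij W_j, computed componentwise.
V = [[sum(A[i][j] * W[j][k] for j in range(3)) for k in range(3)]
     for i in range(3)]

lhs = mu(*V)
rhs = det3(A) * mu(*W)
```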

TENSOR DERIVATIONS

Previous sections have dealt with tensor algebra; we now consider some tensor calculus.

11. Definition. A tensor derivation $\mathcal{D}$ on a smooth manifold $M$ is a set of $\mathbf{R}$-linear functions

$\mathcal{D} = \mathcal{D}^r_s: \mathfrak{T}^r_s(M) \to \mathfrak{T}^r_s(M) \qquad (r \ge 0,\ s \ge 0)$

such that for any tensors $A$ and $B$:

(1) $\mathcal{D}(A \otimes B) = \mathcal{D}A \otimes B + A \otimes \mathcal{D}B$,
(2) $\mathcal{D}(CA) = C(\mathcal{D}A)$ for any contraction $C$.

Thus $\mathcal{D}$ is $\mathbf{R}$-linear, preserves tensor type, obeys the usual Leibnizian product rule, and commutes with all contractions. For a function $f \in \mathfrak{F}(M)$ recall that $fA = f \otimes A$; hence $\mathcal{D}(fA) = (\mathcal{D}f)A + f\mathcal{D}A$. In the special case $r = s = 0$, $\mathcal{D}^0_0$ is a derivation on $\mathfrak{T}^0_0(M) = \mathfrak{F}(M)$ so, as discussed in Chapter 1, there is a unique vector field $V \in \mathfrak{X}(M)$ such that

$\mathcal{D}f = Vf$ for all $f \in \mathfrak{F}(M)$.

Since tensor derivations are generally not $\mathfrak{F}(M)$-linear the value of $\mathcal{D}A$ at a point $p \in M$ cannot usually be found from $A_p$ alone. However it can be found from the values of $A$ on any arbitrarily small neighborhood of $p$. This local character of tensor derivations can be expressed as follows.

12. Proposition. If $\mathcal{D}$ is a tensor derivation on $M$ and $\mathcal{U}$ is an open set of $M$, then there is a unique tensor derivation $\mathcal{D}_{\mathcal{U}}$ on $\mathcal{U}$ such that

$\mathcal{D}_{\mathcal{U}}(A|\mathcal{U}) = (\mathcal{D}A)|\mathcal{U}$ for all tensors $A$ on $M$.

($\mathcal{D}_{\mathcal{U}}$ is called the restriction of $\mathcal{D}$ to $\mathcal{U}$, and henceforth we omit the subscript $\mathcal{U}$.)


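Scheme of Proof. Let $B \in \mathfrak{T}^r_s(\mathcal{U})$. If $p \in \mathcal{U}$ let f be a bump function at p with support in $\mathcal{U}$. Thus $fB \in \mathfrak{T}^r_s(M)$. Define

$$(\mathcal{D}_{\mathcal{U}} B)_p = \mathcal{D}(fB)_p.$$

Then show: (1) This definition is independent of the choice of bump function. (2) $\mathcal{D}_{\mathcal{U}} B$ is a smooth tensor field on $\mathcal{U}$. (3) $\mathcal{D}_{\mathcal{U}}$ is a tensor derivation on $\mathcal{U}$. (4) $\mathcal{D}_{\mathcal{U}}$ has the stated restriction property. (5) $\mathcal{D}_{\mathcal{U}}$ is unique.

The Leibnizian formula $\mathcal{D}(A \otimes B) = \mathcal{D}A \otimes B + A \otimes \mathcal{D}B$ can be recast as follows.

13. Proposition (The Product Rule). Let $\mathcal{D}$ be a tensor derivation on M. If $A \in \mathfrak{T}^r_s(M)$ then

$$\mathcal{D}[A(\theta^1, \dots, \theta^r, X_1, \dots, X_s)] = (\mathcal{D}A)(\theta^1, \dots, \theta^r, X_1, \dots, X_s) + \sum_{i=1}^{r} A(\theta^1, \dots, \mathcal{D}\theta^i, \dots, \theta^r, X_1, \dots, X_s) + \sum_{j=1}^{s} A(\theta^1, \dots, \theta^r, X_1, \dots, \mathcal{D}X_j, \dots, X_s).$$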

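(The placement of parentheses is crucial here: on the left-hand side $\mathcal{D}$ is applied to a function, on the right-hand side to the tensor A, to one-forms, and to vector fields.)

Proof. For simplicity let $r = s = 1$. We assert that

$$A(\theta, X) = C(A \otimes \theta \otimes X),$$

where C is a composition of two contractions. In fact, relative to a coordinate system $A \otimes \theta \otimes X$ has components $A^i_j \theta_k X^l$, while $A(\theta, X) = \sum A^i_j \theta_i X^j$. Thus

$$\mathcal{D}(A(\theta, X)) = \mathcal{D}C(A \otimes \theta \otimes X) = C\,\mathcal{D}(A \otimes \theta \otimes X) = C(\mathcal{D}A \otimes \theta \otimes X) + C(A \otimes \mathcal{D}\theta \otimes X) + C(A \otimes \theta \otimes \mathcal{D}X) = (\mathcal{D}A)(\theta, X) + A(\mathcal{D}\theta, X) + A(\theta, \mathcal{D}X).$$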

For a (1, s) tensor expressed as an $\mathfrak{F}(M)$-multilinear function $A : \mathfrak{X}(M)^s \to \mathfrak{X}(M)$ the tensor derivation obeys the same formal product rule, namely,

$$\mathcal{D}(A(X_1, \dots, X_s)) = (\mathcal{D}A)(X_1, \dots, X_s) + \sum_{i=1}^{s} A(X_1, \dots, \mathcal{D}X_i, \dots, X_s).$$

Both these versions of the product rule will frequently be solved for the term involving $\mathcal{D}A$. This gives a formula for $\mathcal{D}$ of an arbitrary tensor in terms of $\mathcal{D}$ applied solely to functions, vector fields, and one-forms. But for a one-form $\theta$,

$$(\mathcal{D}\theta)(X) = \mathcal{D}(\theta X) - \theta(\mathcal{D}X).$$

Thus functions and vector fields suffice.

14. Corollary. If tensor derivations $\mathcal{D}_1$ and $\mathcal{D}_2$ agree on functions $\mathfrak{F}(M)$ and vector fields $\mathfrak{X}(M)$, then $\mathcal{D}_1 = \mathcal{D}_2$.

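Furthermore, from suitable data on $\mathfrak{F}(M)$ and $\mathfrak{X}(M)$ we can construct a tensor derivation.

15. Theorem. Given a vector field $V \in \mathfrak{X}(M)$ and an $\mathbf{R}$-linear function $\delta : \mathfrak{X}(M) \to \mathfrak{X}(M)$ such that

$$\delta(fX) = (Vf)X + f\,\delta(X) \quad \text{for all } f \in \mathfrak{F}(M),\ X \in \mathfrak{X}(M),$$

there exists a unique tensor derivation $\mathcal{D}$ on M such that $\mathcal{D}^0_0 = V : \mathfrak{F}(M) \to \mathfrak{F}(M)$ and $\mathcal{D}^1_0 = \delta$.

Proof. $\mathcal{D}^0_0$ and $\mathcal{D}^1_0$ are given. The formula preceding Corollary 14 shows that $\mathcal{D}$ on a one-form $\theta$ must be defined by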

$$(\mathcal{D}\theta)(X) = V(\theta X) - \theta(\delta X) \quad \text{for all } X \in \mathfrak{X}(M).$$

Using the formula given for $\delta$ it is easy to check that $\mathcal{D}\theta$ is $\mathfrak{F}(M)$-linear, hence is a one-form, and that $\mathcal{D} = \mathcal{D}^0_1 : \mathfrak{X}^*(M) \to \mathfrak{X}^*(M)$ is $\mathbf{R}$-linear. By the product rule (13), $\mathcal{D}$ on an (r, s) tensor A with $r + s \ge 2$ must be defined by

$$(\mathcal{D}A)(\theta^1, \dots, \theta^r, X_1, \dots, X_s) = V(A(\theta^1, \dots, \theta^r, X_1, \dots, X_s)) - \sum_{i=1}^{r} A(\theta^1, \dots, \mathcal{D}\theta^i, \dots, \theta^r, X_1, \dots, X_s) - \sum_{j=1}^{s} A(\theta^1, \dots, \theta^r, X_1, \dots, \delta X_j, \dots, X_s).$$

(On the right-hand side, $\mathcal{D}$ of a one-form is defined as above.) Again it is easy to verify that $\mathcal{D}A$ is $\mathfrak{F}(M)$-multilinear, hence is an (r, s) tensor, and that $\mathcal{D}^r_s : \mathfrak{T}^r_s(M) \to \mathfrak{T}^r_s(M)$ is $\mathbf{R}$-linear. Furthermore, a direct computation shows that $\mathcal{D}(A \otimes B) = \mathcal{D}A \otimes B + A \otimes \mathcal{D}B$. (Take A and B of type (1, 1) to see how this works.)

To prove that $\mathcal{D}$ commutes with contraction, consider first the case $C : \mathfrak{T}^1_1(M) \to \mathfrak{F}(M)$. That $\mathcal{D}C = C\mathcal{D}$ on tensor products $\theta \otimes X$ is immediate from the definition of $\mathcal{D}$ on one-forms. Hence $\mathcal{D}C = C\mathcal{D}$ on sums of terms of the form $\theta \otimes X$. Since $\mathcal{D}$ is local and C pointwise it suffices to prove $\mathcal{D}C = C\mathcal{D}$ on coordinate neighborhoods. But there Lemma 5 shows that every (1, 1) tensor can be written as such a sum. The extension to arbitrary contractions is an exercise in parentheses. Taking $A \in \mathfrak{T}^1_2(M)$ for example,

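$$(\mathcal{D}C^1_2 A)(X) = \mathcal{D}((C^1_2 A)(X)) - (C^1_2 A)(\mathcal{D}X) = \mathcal{D}(C\{A(\cdot, X, \cdot)\}) - C\{A(\cdot, \mathcal{D}X, \cdot)\} = C\{\mathcal{D}(A(\cdot, X, \cdot)) - A(\cdot, \mathcal{D}X, \cdot)\} = C\{(\mathcal{D}A)(\cdot, X, \cdot)\} = (C^1_2 \mathcal{D}A)(X).$$

Hence $\mathcal{D}C^1_2 A = C^1_2 \mathcal{D}A$.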

Here is an application of the theorem.

16. Definition. If $V \in \mathfrak{X}(M)$ the tensor derivation $L_V$ such that

$$L_V(f) = Vf \quad \text{for all } f \in \mathfrak{F}(M),$$
$$L_V(X) = [V, X] \quad \text{for all } X \in \mathfrak{X}(M)$$

is called the Lie derivative relative to V. The definition is valid since $L_V$ on vector fields satisfies the hypothesis on $\delta$ in the theorem:

$$L_V(fX) = [V, fX] = (Vf)X + f[V, X] = (Vf)X + f L_V X.$$

SYMMETRIC BILINEAR FORMS

Semi-Riemannian geometry involves a particular kind of (0, 2) tensor on tangent spaces. To study these in general, let V be a real vector space (finite-dimensional where the context so indicates). A bilinear form on V is an $\mathbf{R}$-bilinear function $b : V \times V \to \mathbf{R}$, and we consider only the symmetric case: $b(v, w) = b(w, v)$ for all v, w.

17. Definition. A symmetric bilinear form b on V is

(1) positive [negative] definite provided $v \ne 0$ implies $b(v, v) > 0$ [$< 0$],
(2) positive [negative] semidefinite provided $b(v, v) \ge 0$ [$\le 0$] for all $v \in V$,
(3) nondegenerate provided $b(v, w) = 0$ for all $w \in V$ implies $v = 0$.

Also b is definite [semidefinite] provided either alternative in (1) [(2)] holds. If b is definite then it is obviously both semidefinite and nondegenerate; the converse follows from Exercise 12.


If b is a symmetric bilinear form on V then for any subspace W of V the restriction $b|(W \times W)$, denoted merely by $b|W$, is again symmetric and bilinear. If b is [semi-]definite, so is $b|W$.

18. Definition. The index $\nu$ of a symmetric bilinear form b on V is the largest integer that is the dimension of a subspace $W \subset V$ on which $b|W$ is negative definite.

Thus $0 \le \nu \le \dim V$, and $\nu = 0$ if and only if b is positive semidefinite. The function $q : V \to \mathbf{R}$ given by $q(v) = b(v, v)$ is the associated quadratic form of b. It is often easier to deal with than b, and no information is lost since b can be reconstructed by the polarization identity

$$b(v, w) = \tfrac{1}{2}\,[\,q(v + w) - q(v) - q(w)\,].$$
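The polarization identity can be checked mechanically. The sketch below is illustrative only: the matrix `B` and the vectors `v`, `w` are arbitrary choices, and the function names are hypothetical. It recovers b(v, w) from values of q alone.

```python
from fractions import Fraction

def b(B, v, w):
    """Symmetric bilinear form with matrix B: b(v, w) = sum_ij B[i][j] v_i w_j."""
    return sum(B[i][j] * v[i] * w[j] for i in range(len(v)) for j in range(len(w)))

def q(B, v):
    """Associated quadratic form q(v) = b(v, v)."""
    return b(B, v, v)

def polarize(B, v, w):
    """Recover b(v, w) using only the quadratic form q."""
    s = [vi + wi for vi, wi in zip(v, w)]
    return Fraction(1, 2) * (q(B, s) - q(B, v) - q(B, w))

B = [[1, 2], [2, -3]]      # an arbitrary symmetric matrix
v, w = [3, -1], [2, 5]

print(polarize(B, v, w) == b(B, v, w))
```

The identity relies on the symmetry of `B`; for a nonsymmetric matrix `polarize` would return the symmetrized value instead.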

If $e_1, \dots, e_n$ is a basis for V, the $n \times n$ matrix $(b_{ij}) = (b(e_i, e_j))$ is called the matrix of b relative to $e_1, \dots, e_n$. Since b is symmetric, this matrix is symmetric. Clearly it determines b since

$$b\Big(\sum v_i e_i, \sum w_j e_j\Big) = \sum b_{ij} v_i w_j.$$

19. Lemma. A symmetric bilinear form is nondegenerate if and only if its matrix relative to one (hence every) basis is invertible.

Proof. Let $e_1, \dots, e_n$ be a basis for V. If $v \in V$, then $b(v, w) = 0$ for all $w \in V$ if and only if $b(v, e_i) = 0$ for $i = 1, \dots, n$. Since $(b_{ij})$ is symmetric,

$$b(v, e_i) = b\Big(\sum v_j e_j, e_i\Big) = \sum_j b_{ij} v_j.$$

Thus b is degenerate if and only if there exist numbers $v_1, \dots, v_n$ not all zero such that $\sum_j b_{ij} v_j = 0$ for $i = 1, \dots, n$. But this is equivalent to the linear dependence of the columns of $(b_{ij})$, that is, to $(b_{ij})$ being singular. ∎
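Lemma 19 can be illustrated on two 2 x 2 matrices. This is a minimal sketch with hypothetical names; the two matrices are chosen for illustration and are not taken from the text. The singular matrix admits a nonzero vector paired to zero with every basis vector, the invertible one does not.

```python
def b(B, v, w):
    """Bilinear form with matrix B relative to the standard basis of R^2."""
    return sum(B[i][j] * v[i] * w[j] for i in range(2) for j in range(2))

def det2(B):
    """Determinant of a 2 x 2 matrix."""
    return B[0][0] * B[1][1] - B[0][1] * B[1][0]

E = [(1, 0), (0, 1)]               # standard basis
nondegenerate = [[1, 0], [0, -1]]  # invertible matrix
degenerate = [[1, 1], [1, 1]]      # singular matrix

# v = (1, -1) is a nonzero vector with b(v, e_i) = 0 for both basis vectors,
# so the form with the singular matrix is degenerate.
v = (1, -1)
pairings = [b(degenerate, v, e) for e in E]
print(det2(nondegenerate), det2(degenerate), pairings)
```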

SCALAR PRODUCTS

20. Definition. A scalar product g on a vector space V is a nondegenerate symmetric bilinear form on V.

An inner product is a positive definite scalar product, the canonical example being the dot product on $\mathbf{R}^n$, for which $v \cdot w = \sum v_i w_i$. Many properties of inner products carry over to scalar products; however, some distinctive new phenomena arise when g is indefinite. Changing one sign in the definition of the dot product on $\mathbf{R}^2$ gives the simplest example of an indefinite scalar product.

21. Example. Define $g : \mathbf{R}^2 \times \mathbf{R}^2 \to \mathbf{R}$ by

$$g(v, w) = v_1 w_1 - v_2 w_2.$$

Obviously g is symmetric and bilinear. Taking w to be (1, 0) and then (0, 1) shows that g is nondegenerate. Thus g is a scalar product. The associated quadratic form has $q(v) = v_1^2 - v_2^2$, and g is indefinite.

Henceforth V will denote a scalar product space, that is, a (finite-dimensional, real) vector space furnished with a scalar product g. A vector $v \in V$ is null provided $q(v) = 0$ but $v \ne 0$. Evidently null vectors exist if and only if g is indefinite. In the example above, null vectors fill the two 45° lines with the origin omitted (0 is not null). For $c \ne 0$ the sets $q = c$ and $q = -c$ are hyperbolas asymptotic to the null lines (Figure 1).

Figure 1

Vectors $v, w \in V$ are orthogonal, written $v \perp w$, provided $g(v, w) = 0$. Subsets A and B of V are orthogonal, written $A \perp B$, provided $v \perp w$ for all $v \in A$ and $w \in B$. When the scalar product g is indefinite we can no longer picture orthogonal vectors as being at right angles to each other. For Example 21, Figure 2 shows three pairs of orthogonal vectors $z \perp z'$, the last of which illustrates the fact that a null vector is a nonzero vector that is orthogonal to itself.

Figure 2

If W is a subspace of V, let

$$W^\perp = \{v \in V : v \perp W\}.$$

$W^\perp$ is a subspace of V called W perp. We cannot call $W^\perp$ the orthogonal complement of W since $W + W^\perp$ is generally not all of V. (In Example 21 if W is the subspace spanned by (1, 1), then $W^\perp = W$.) However the perp operation does have two familiar properties:

22. Lemma. If W is a subspace of a scalar product space V, then

(1) $\dim W + \dim W^\perp = n = \dim V$,
(2) $(W^\perp)^\perp = W$.

Proof. (1) Let $e_1, \dots, e_n$ be a basis for V adapted to W, that is, for which $e_1, \dots, e_k$ is a basis for W. Now $v \in W^\perp$ if and only if $g(v, e_i) = 0$ for $1 \le i \le k$, which in coordinate terms is

$$\sum_{j=1}^{n} g_{ij} v_j = 0 \qquad (1 \le i \le k).$$

This is k linear equations in n unknowns, and by Lemma 19 the rows of the coefficient matrix are linearly independent, so the matrix has rank k. Hence by linear algebra the space of solutions has dimension $n - k$. But by construction these n-tuple solutions $(v_1, \dots, v_n)$ give exactly the vectors $v = \sum v_i e_i$ of $W^\perp$.

(2) Since $v \in (W^\perp)^\perp$ means $v \perp W^\perp$, we have $W \subset (W^\perp)^\perp$. By (1) these two subspaces have the same dimension, hence they are equal. ∎

Note that the nondegeneracy of g on the whole space V is equivalent to $V^\perp = 0$. A subspace W of V is called nondegenerate if $g|W$ is nondegenerate. When V is an inner product space, every subspace W is again an inner product space (under $g|W$), hence is nondegenerate. However when g is indefinite there will always be degenerate subspaces; for example, any null vector spans one. Thus a subspace of a scalar product space need not be a scalar product space. This difficulty is linked to the earlier one involving the perp operation.
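The linear system in the proof of Lemma 22 can be solved directly. The sketch below is a hypothetical helper, not part of the text: it computes a basis of W perp by exact Gauss-Jordan elimination over the rationals and reproduces the remark that in Example 21 the span of (1, 1) is its own perp. The matrix `G` is assumed to be the matrix of g relative to the standard basis.

```python
from fractions import Fraction

def nullspace(rows, n):
    """Basis of {x in Q^n : row . x = 0 for every row}, by Gauss-Jordan elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots = []
    r = 0
    for c in range(n):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * p for a, p in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        x = [Fraction(0)] * n
        x[free] = Fraction(1)
        for row_index, c in enumerate(pivots):
            x[c] = -m[row_index][free]
        basis.append(x)
    return basis

def perp(G, W):
    """Basis of W perp, where W is given by a list of spanning vectors."""
    n = len(G)
    # One equation g(w, v) = 0 per spanning vector w of W.
    rows = [[sum(w[i] * G[i][j] for i in range(n)) for j in range(n)] for w in W]
    return nullspace(rows, n)

G = [[1, 0], [0, -1]]          # Example 21
null_line = perp(G, [[1, 1]])  # W spanned by the null vector (1, 1)
x_axis_perp = perp(G, [[1, 0]])
print(null_line, x_axis_perp)
```

In both cases the number of basis vectors returned is n - dim W, as part (1) of the lemma predicts, provided the spanning vectors passed in are linearly independent.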

23. Lemma. A subspace W of V is nondegenerate if and only if V is the direct sum of W and $W^\perp$.

Proof. By a standard vector space identity

$$\dim(W + W^\perp) + \dim(W \cap W^\perp) = \dim W + \dim W^\perp.$$

According to Lemma 22, the right-hand side is $n = \dim V$. Hence $W + W^\perp = V$ if and only if $W \cap W^\perp = 0$. Thus either of these two conditions is equivalent to $V = W \oplus W^\perp$. But $W \cap W^\perp = \{w \in W : w \perp W\}$, so the vanishing of this subspace is equivalent to the nondegeneracy of W. ∎

Since $(W^\perp)^\perp = W$, it follows that W is nondegenerate if and only if $W^\perp$ is.

Because $q(v) = g(v, v)$ may be negative, the norm $|v|$ of a vector is defined to be $|g(v, v)|^{1/2}$. A unit vector u is a vector of norm 1, that is, $g(u, u) = \pm 1$. As usual a set of mutually orthogonal unit vectors is said to be orthonormal, and for $n = \dim V$, any set of n orthonormal vectors in V is necessarily a basis for V.

24. Lemma. A scalar product space $V \ne 0$ has an orthonormal basis.

Proof. Since g is nondegenerate there is a vector $v \in V$ such that $g(v, v) \ne 0$ (otherwise $q = 0$, hence $g = 0$ by polarization). Now $v/|v|$ is a unit vector. Thus it suffices by induction to show that any orthonormal set $e_1, \dots, e_k$ with $k < n$ can be enlarged by one. By Lemma 19 these vectors span a (k-dimensional) nondegenerate subspace W. It remains only to find a unit vector in $W^\perp \ne 0$. But as noted above $W^\perp$ is also nondegenerate, so the preceding argument shows it contains a unit vector. ∎

The matrix of g relative to an orthonormal basis $e_1, \dots, e_n$ for V is diagonal; in fact,

$$g(e_i, e_j) = \delta_{ij}\,\varepsilon_j, \quad \text{where } \varepsilon_j = g(e_j, e_j) = \pm 1.$$

Whenever convenient we shall order the vectors in an orthonormal basis so that the negative signs, if any, come first in the so-called signature $(\varepsilon_1, \dots, \varepsilon_n)$.

Taking these signs into account, orthonormal expansion is still available.

25. Lemma. Let $e_1, \dots, e_n$ be an orthonormal basis for V, with $\varepsilon_i = g(e_i, e_i)$. Then each $v \in V$ has a unique expression

$$v = \sum \varepsilon_i\, g(v, e_i)\, e_i.$$

For the proof it suffices to check that v minus the sum is orthogonal to each $e_i$; thus by the nondegeneracy of g it is zero.

The orthogonal projection $\pi$ of V onto a nondegenerate subspace W is the linear transformation that sends $W^\perp$ to 0 and leaves each vector of W fixed. An orthonormal basis $e_1, \dots, e_k$ for W can always be enlarged to an orthonormal basis for V; thus

$$\pi(v) = \sum_{j=1}^{k} \varepsilon_j\, g(v, e_j)\, e_j.$$
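The role of the signs in orthonormal expansion is easy to see on Example 21. The following sketch uses an illustrative orthonormal basis with rational entries (an assumption made for exact arithmetic, not a basis singled out by the text) and hypothetical function names `expand` and `project`.

```python
from fractions import Fraction as F

def g(v, w):
    """Scalar product of Example 21 on R^2."""
    return v[0] * w[0] - v[1] * w[1]

# An orthonormal basis that is not the standard one:
# g(e1, e1) = 25/9 - 16/9 = 1, g(e2, e2) = 16/9 - 25/9 = -1, g(e1, e2) = 0.
basis = [(F(5, 3), F(4, 3)), (F(4, 3), F(5, 3))]
eps = [g(e, e) for e in basis]

def expand(v):
    """Rebuild v as the sum of eps_i * g(v, e_i) * e_i."""
    out = [F(0), F(0)]
    for e, s in zip(basis, eps):
        coefficient = s * g(v, e)
        out = [o + coefficient * x for o, x in zip(out, e)]
    return out

def project(v, k):
    """Orthogonal projection onto the span of the first k basis vectors."""
    out = [F(0), F(0)]
    for e, s in list(zip(basis, eps))[:k]:
        coefficient = s * g(v, e)
        out = [o + coefficient * x for o, x in zip(out, e)]
    return out

v = (2, 7)
rebuilt = expand(v)
pv = project(v, 1)
residual = [a - p for a, p in zip(v, pv)]
print(rebuilt, pv, g(residual, basis[0]))
```

Dropping the factor `s` in `expand` would give the wrong vector here, because the second basis vector has sign -1.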

It is customary to refer to the index $\nu$ of the scalar product g of V as the index of V, writing $\nu = \operatorname{ind} V$.

26. Lemma. For any orthonormal basis $e_1, \dots, e_n$ for V the number of negative signs in the signature $(\varepsilon_1, \dots, \varepsilon_n)$ is the index $\nu$ of V.

Proof. Assume that the first m signs $\varepsilon_i$ are the negative ones. The result is trivial if g is definite, so $0 < m < n$. Evidently g is negative definite on the subspace S spanned by $e_1, \dots, e_m$; thus $\nu \ge m$. To prove the reverse inequality, let W be an arbitrary subspace on which g is negative definite, and define $\pi : W \to S$ by

$$\pi(w) = -\sum_{i \le m} g(w, e_i)\, e_i.$$

Evidently $\pi$ is linear. Thus it suffices to show that $\pi$ is one-to-one, for then $\dim W \le \dim S = m$, hence $\nu \le m$. If $\pi(w) = 0$, then by orthonormal expansion

$$w = \sum_{j > m} g(w, e_j)\, e_j.$$

But since $w \in W$,

$$0 \ge g(w, w) = \sum_{j > m} g(w, e_j)^2.$$

Hence $g(w, e_j) = 0$ for $j > m$, so $w = 0$. ∎
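Lemma 26 says that the count of negative signs does not depend on the orthonormal basis. As an illustration only, the sketch below checks this for two orthonormal bases of a three-dimensional space of index 1; the scalar product, both bases, and the name `signature` are assumptions made for the example.

```python
from fractions import Fraction as F

def g(v, w):
    """A scalar product of index 1 on R^3: minus sign on the first coordinate."""
    return -v[0] * w[0] + v[1] * w[1] + v[2] * w[2]

def signature(basis):
    """Check orthonormality with respect to g and return the signs g(e_i, e_i)."""
    for i, e in enumerate(basis):
        assert abs(g(e, e)) == 1, "not a unit vector"
        for f in basis[i + 1:]:
            assert g(e, f) == 0, "not orthogonal"
    return [g(e, e) for e in basis]

standard = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
# A second orthonormal basis with rational entries, listed in a different order.
other = [(F(4, 3), F(5, 3), 0), (0, 0, 1), (F(5, 3), F(4, 3), 0)]

index_standard = sum(1 for s in signature(standard) if s < 0)
index_other = sum(1 for s in signature(other) if s < 0)
print(index_standard, index_other)
```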

It follows that for a nondegenerate subspace W of V,

$$\operatorname{ind} V = \operatorname{ind} W + \operatorname{ind} W^\perp,$$

since the proof of Lemma 24 shows that there is an orthonormal basis for V adapted to the direct sum $V = W + W^\perp$.

Let V and $\bar V$ have scalar products g and $\bar g$. A linear transformation $T : V \to \bar V$ preserves scalar products provided

$$\bar g(Tv, Tw) = g(v, w) \quad \text{for all } v, w \in V.$$

In this case T is necessarily one-to-one, because if $Tv = 0$ then $g(v, w) = 0$ for all w, hence $v = 0$. Note that T preserves scalar products if and only if it preserves their associated quadratic forms, that is,

$$\bar q(Tv) = q(v) \quad \text{for all } v \in V.$$

One implication is obvious; the other follows by polarization. A linear isomorphism $T : V \to W$ that preserves scalar products is called a linear isometry. By the preceding remarks a linear transformation $T : V \to W$ is a linear isometry if and only if $\dim V = \dim W$ and T preserves scalar products (or equivalently their quadratic forms).


27. Lemma. Scalar product spaces V and W have the same dimension and index if and only if there exists a linear isometry from V to W.

Proof. Assuming the invariants are the same, pick orthonormal bases $e_1, \dots, e_n$ for V and $e_1', \dots, e_n'$ for W. By Lemma 26 we can suppose that $\langle e_i, e_i \rangle = \langle e_i', e_i' \rangle$ for all i. Let T be the linear transformation such that $Te_i = e_i'$ for all i. Then $\langle Te_i, Te_j \rangle = \langle e_i, e_j \rangle$ for all i, j. Hence by linearity T is a linear isometry.

Conversely, if $T : V \to W$ is a linear isometry, then T carries an orthonormal basis for V to an orthonormal basis for W. Hence $\dim V = \dim W$ and, by Lemma 26, $\operatorname{ind} V = \operatorname{ind} W$. ∎

Exercises

1. A one-form $\theta$ is zero if and only if $\theta X = 0$ for all $X \in \mathfrak{X}(M)$. A vector field X is zero if and only if $\theta X = 0$ for all $\theta \in \mathfrak{X}^*(M)$.

2. (a) The bracket operation on vector fields is $\mathbf{R}$-linear $\mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$, but cannot be interpreted as a (1, 2) tensor field. (b) If $\theta$ is a one-form its exterior derivative $d\theta$ is a tensor field, in fact a two-form. (By definition, $(d\theta)(X, Y) = X(\theta Y) - Y(\theta X) - \theta[X, Y]$.)

3. Tensor transformation rule. Let A be a tensor field, say of type (1, 2). Let $\xi$ and $\eta$ be coordinate systems on $\mathcal{U} \subset M$. Show that the components of A relative to $\eta$ are determined as follows by the components of A relative to $\xi$:

4. Prove that the interpretation on page 36 gives an $\mathfrak{F}(M)$-linear isomorphism from $\mathfrak{X}^*(M)$ to $\mathfrak{T}^0_1(M)$. (Thus $\mathfrak{X}^*(M)$ is identified with the dual module of $\mathfrak{X}(M)$.)

5. (a) If V is a finite-dimensional vector space and $\phi \in (V^*)^*$ there is a unique $v \in V$ such that $\alpha(v) = \phi(\alpha)$ for all $\alpha \in V^*$. (b) The interpretation on page 37 gives an $\mathfrak{F}(M)$-linear isomorphism from $\mathfrak{X}(M)$ to $\mathfrak{T}^1_0(M)$. (Hint: If $Z \in \mathfrak{T}^1_0(M)$, then $Z_p \in T_p(M)^{**}$.) Thus, in view of the preceding exercise, $\mathfrak{X}(M)$ is identified with its double dual module $\mathfrak{T}^1_0(M) = (\mathfrak{X}^*(M))^* = \mathfrak{X}(M)^{**}$.

6. Let $\mathcal{D}$ be a tensor derivation on M. Relative to a coordinate system, (a) if $\mathcal{D}(\partial_i) = \sum_j F^j_i \partial_j$, show that $\mathcal{D}(dx^j) = -\sum_i F^j_i\, dx^i$. (b) If A is a (1, 2) tensor field, find a formula for the components of $\mathcal{D}A$ in terms of $F^j_i$ and the components of A.

7. If $V \in \mathfrak{X}(M)$ and $A \in \mathfrak{T}^1_2(M)$, then relative to a coordinate system express the components of the Lie derivative $L_V A$ in terms of the components of A and V.

8. Let $\phi : M^m \to N^n$ be a smooth map that carries the coordinate neighborhood $\mathcal{U} \subset M$ of $\xi$ into the coordinate neighborhood of $\eta$ in N. If B is a covariant tensor field on N, with say $s = 2$, show that on $\mathcal{U}$,

(Hint: Evaluate the left-hand side at $p \in \mathcal{U}$.)

9. (a) Interpret $A \in \mathfrak{T}^1_1(M)$ as a function smoothly assigning to each $p \in M$ a linear operator $A_p$ on $T_p(M)$. (b) Show that $(CA)(p) = \operatorname{trace} A_p$. (c) If $A, B \in \mathfrak{T}^1_1(M)$, express the function $p \to A_p B_p$ as an element of $\mathfrak{T}^1_1(M)$.

10. (a) Prove that a tensor derivation has $\mathcal{D}^0_0 = 0$ if and only if $\mathcal{D}^1_0$ is $\mathfrak{F}(M)$-linear. Then by interpretation, $\mathcal{D}^1_0 = B \in \mathfrak{T}^1_1(M)$, and we write $\mathcal{D} = \mathcal{D}_B$. (b) If $\mathcal{D}$ is an arbitrary tensor derivation on M, show that there is a unique $V \in \mathfrak{X}(M)$ and a unique $B \in \mathfrak{T}^1_1(M)$ such that $\mathcal{D} = L_V + \mathcal{D}_B$.

11. Establish the following properties of Lie derivatives: (a) $L_{aV + bW} = aL_V + bL_W$ ($a, b \in \mathbf{R}$), (b) $[L_V, L_W] = L_{[V, W]}$, (c) $L_V(df) = d(Vf)$, (d) $L_{fV} = fL_V - \mathcal{D}_{V \otimes df}$ (notation as in Exercise 10).

12. Let b be a symmetric bilinear form on V. The nullspace of b is $N = \{v : b(v, w) = 0 \text{ for all } w\}$. The nullcone of b is the set $\Lambda$ of all null vectors in V. Let $\bar\Lambda = \Lambda \cup 0$, so $\bar\Lambda \supset N$. Prove: (a) N is a subspace, but $\bar\Lambda$ is not unless $\bar\Lambda = 0$ or V. (b) b is nondegenerate $\Leftrightarrow$ $N = 0$; b is definite $\Leftrightarrow$ $\bar\Lambda = 0$. (c) b is semidefinite $\Leftrightarrow$ $N = \bar\Lambda$.

13. Let g be a scalar product of index $\nu$ on an n-dimensional vector space V. Prove that there exists a subspace W of dimension $\min(\nu, n - \nu)$, and no larger, on which $g = 0$.

14. (Dajczer, Nomizu, and others.) Let V have indefinite scalar product g, and let b be a symmetric bilinear form on V with corresponding quadratic form q. Show that the following conditions are equivalent: (i) $b = cg$ for some $c \in \mathbf{R}$, (ii) $q = 0$ on null vectors, (iii) $|q|$ is bounded on timelike unit vectors, and (iv) $|q|$ is bounded on spacelike unit vectors. (Hint: Polarize.)

3

SEMI-RIEMANNIAN MANIFOLDS

The familiar geometry of the Euclidean space $\mathbf{R}^3$ can be traced back to its natural inner product, the dot product. By means of the natural isomorphism $T_p(\mathbf{R}^3) \approx \mathbf{R}^3$ the dot product can be deployed on each tangent space. Then one can perform such basic geometric operations as measuring the length of a tangent vector or the angle between two tangent vectors. The theory of surfaces in $\mathbf{R}^3$ attained its classical form in the work of Gauss, who showed in 1827 that the intrinsic geometry of a surface S in $\mathbf{R}^3$ (roughly, the geometry perceived by the inhabitants of S) derives solely from the dot product as applied to tangent vectors to S. As long ago as 1854 Riemann saw what was needed to generalize these two special cases and introduce geometry on an arbitrary n-dimensional manifold: an inner product must be given on each tangent space. This is thought of as providing, in particular, an infinitesimal measurement of distance. Crudely, if p and $p + dp$ are nearby points, the distance between them is the norm of the "tangent vector" dp. Under the impetus of Einstein's general theory of relativity (1915) a further generalization, technical but far-reaching, appeared: the positive definiteness of the inner product was weakened to nondegeneracy.

1. Definition. A metric tensor g on a smooth manifold M is a symmetric nondegenerate (0, 2) tensor field on M of constant index.

In other words $g \in \mathfrak{T}^0_2(M)$ smoothly assigns to each point p of M a scalar product $g_p$ on the tangent space $T_p(M)$, and the index of $g_p$ is the same for all p.

2. Definition. A semi-Riemannian manifold is a smooth manifold M furnished with a metric tensor g.


Thus strictly speaking a semi-Riemannian manifold is an ordered pair (M, g): two different metric tensors on the same manifold constitute different semi-Riemannian manifolds. Nevertheless we usually denote a semi-Riemannian manifold by the name of its smooth manifold M, N, . . . .

The common value $\nu$ of index $g_p$ on a semi-Riemannian manifold M is called the index of M: $0 \le \nu \le n = \dim M$. If $\nu = 0$, M is a Riemannian manifold; each $g_p$ is then a (positive definite) inner product on $T_p(M)$. If $\nu = 1$ and $n \ge 2$, M is a Lorentz manifold. Semi-Riemannian manifolds are often called pseudo-Riemannian manifolds, or even, in older terminology, Riemannian manifolds, but we reserve the latter term for the distinctive positive definite case.

We use $\langle\ ,\ \rangle$ as an alternative notation for g, writing $g(v, w) = \langle v, w \rangle \in \mathbf{R}$ for tangent vectors, and $g(V, W) = \langle V, W \rangle \in \mathfrak{F}(M)$ for vector fields. If $x^1, \dots, x^n$ is a coordinate system on $\mathcal{U} \subset M$ the components of the metric tensor g on $\mathcal{U}$ are

$$g_{ij} = \langle \partial_i, \partial_j \rangle \qquad (1 \le i, j \le n).$$

Thus for vector fields $V = \sum V^i \partial_i$ and $W = \sum W^j \partial_j$,

$$g(V, W) = \langle V, W \rangle = \sum g_{ij} V^i W^j.$$

Since g is nondegenerate, at each point p of $\mathcal{U}$ the matrix $(g_{ij}(p))$ is invertible, and its inverse matrix is denoted by $(g^{ij}(p))$. The usual formula for the inverse of a matrix shows that the functions $g^{ij}$ are smooth on $\mathcal{U}$. Since g is symmetric, $g_{ij} = g_{ji}$ and hence $g^{ij} = g^{ji}$ for $1 \le i, j \le n$. Finally on $\mathcal{U}$ the metric tensor can be written as

$$g = \sum g_{ij}\, dx^i \otimes dx^j.$$

Recall from Chapter 1 that for each $p \in \mathbf{R}^n$ there is a canonical linear isomorphism from $\mathbf{R}^n$ to $T_p(\mathbf{R}^n)$ that, in terms of natural coordinates, sends v to $v_p = \sum v^i \partial_i$. Thus the dot product on $\mathbf{R}^n$ gives rise to a metric tensor on $\mathbf{R}^n$ with

$$\langle v_p, w_p \rangle = v \cdot w = \sum v^i w^i.$$

Henceforth in any geometric context $\mathbf{R}^n$ will denote the resulting Riemannian manifold, called Euclidean n-space. For an integer $\nu$ with $0 \le \nu \le n$, changing the first $\nu$ plus signs above to minus gives a metric tensor

$$\langle v_p, w_p \rangle = -\sum_{i=1}^{\nu} v^i w^i + \sum_{j=\nu+1}^{n} v^j w^j$$

of index $\nu$. The resulting semi-Euclidean space $\mathbf{R}^n_\nu$ reduces to $\mathbf{R}^n$ if $\nu = 0$. For $n \ge 2$, $\mathbf{R}^n_1$ is called Minkowski n-space; if $n = 4$ it is the simplest example of a relativistic spacetime.
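The semi-Euclidean scalar product is simple enough to state as a one-line function. The sketch below is illustrative; the function name `semi_euclidean` and the sample vectors are hypothetical.

```python
def semi_euclidean(v, w, nu):
    """<v, w> on the semi-Euclidean space of index nu: minus sign on the first nu terms."""
    return sum((-1 if i < nu else 1) * a * b for i, (a, b) in enumerate(zip(v, w)))

v = (1, 2, 0, 3)
w = (4, 0, 1, 1)

dot = semi_euclidean(v, w, 0)        # nu = 0 gives the ordinary dot product
minkowski = semi_euclidean(v, w, 1)  # nu = 1, n = 4: Minkowski 4-space
print(dot, minkowski)
```

With `nu = 1` the vector (1, 1, 0, 0) has scalar product zero with itself although it is not the zero vector, which is the phenomenon of null vectors from Chapter 2.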

Fix the notation

$$\varepsilon_i = \begin{cases} -1 & \text{for } 1 \le i \le \nu, \\ +1 & \text{for } \nu + 1 \le i \le n. \end{cases}$$

Then the metric tensor of $\mathbf{R}^n_\nu$ can be written

$$g = \sum \varepsilon_i\, du^i \otimes du^i.$$

The geometric significance of the index of a semi-Riemannian manifold derives from the following trichotomy.

3. Definition. A tangent vector v to M is

spacelike if $\langle v, v \rangle > 0$ or $v = 0$,
null if $\langle v, v \rangle = 0$ and $v \ne 0$,
timelike if $\langle v, v \rangle < 0$.
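The trichotomy of Definition 3 translates directly into a small classifier. This is a sketch under the assumption that the tangent vector is given by its components in a semi-Euclidean space of index `nu`; the function names are hypothetical.

```python
def scalar_product(v, w, nu):
    """Semi-Euclidean scalar product: minus sign on the first nu coordinates."""
    return sum((-1 if i < nu else 1) * a * b for i, (a, b) in enumerate(zip(v, w)))

def classify(v, nu):
    """Return 'spacelike', 'null', or 'timelike' following Definition 3."""
    value = scalar_product(v, v, nu)
    if value > 0 or all(x == 0 for x in v):
        return "spacelike"   # the zero vector is spacelike by convention
    if value < 0:
        return "timelike"
    return "null"            # <v, v> = 0 and v != 0

samples = [(0, 1, 0, 0), (1, 1, 0, 0), (2, 1, 0, 0), (0, 0, 0, 0)]
labels = [classify(v, 1) for v in samples]
print(labels)
```

Note the ordering of the tests: the zero vector must be caught before the null case, since it also satisfies the equation for null vectors.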

The set of all null vectors in $T_p(M)$ is called the nullcone at $p \in M$. The category into which a given tangent vector falls is called its causal character. This terminology derives from relativity theory, and particularly in the Lorentz case, null vectors are also said to be lightlike.

Let $q(v) = \langle v, v \rangle$ for each tangent vector v to M. At each point p of M, q gives the associated quadratic form of the scalar product at p; thus q determines the metric tensor. If $V \in \mathfrak{X}(M)$ and $f \in \mathfrak{F}(M)$, then $q(fV) = f^2 q(V) \in \mathfrak{F}(M)$, so q is not a tensor field. Classically q is called the line element of M, and denoted by $ds^2$. In terms of a coordinate system,

$$q = ds^2 = \sum g_{ij}\, dx^i\, dx^j.$$

Here the juxtaposition of differentials denotes ordinary multiplication of functions (on each tangent space), so

$$q(V) = \sum g_{ij}\, dx^i(V)\, dx^j(V) = \sum g_{ij} V^i V^j.$$

As in Chapter 2, the norm $|v|$ of a tangent vector is $|q(v)|^{1/2} = |\langle v, v \rangle|^{1/2}$, and unit vectors, orthogonality, and orthonormality are as before.

The origin of the unusual notation $ds^2$ can be seen intuitively as follows. Assume for simplicity that M is Riemannian. If p and p' are nearby points with coordinates $(x^1, \dots, x^n)$ and $(x^1 + \Delta x^1, \dots, x^n + \Delta x^n)$ relative to some coordinate system, then the tangent vector $\Delta p = \sum \Delta x^i\, \partial_i$ at p points approximately to p'. Thus we expect the square of the distance $\Delta s$ from p to p' to be approximately

$$|\Delta p|^2 = \langle \Delta p, \Delta p \rangle = \sum g_{ij}(p)\, \Delta x^i\, \Delta x^j,$$

as in the formula $ds^2 = \sum g_{ij}\, dx^i\, dx^j$.
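The heuristic above can be tried numerically. The sketch below assumes, as an illustration not taken from the text, the Euclidean plane in polar coordinates (r, theta), where the metric components are g_11 = 1, g_22 = r^2, g_12 = 0, and compares the line element with the exact squared distance between two nearby points. All names are hypothetical.

```python
import math

def polar_metric(r, theta):
    """Components g_ij of the Euclidean metric of the plane in polar coordinates."""
    return [[1.0, 0.0], [0.0, r * r]]

def line_element(g, dx):
    """sum_ij g_ij dx^i dx^j for a coordinate displacement dx."""
    return sum(g[i][j] * dx[i] * dx[j] for i in range(2) for j in range(2))

def to_cartesian(r, theta):
    return (r * math.cos(theta), r * math.sin(theta))

p = (2.0, 0.5)        # the point p in coordinates (r, theta)
dx = (1e-3, 2e-3)     # small coordinate displacement
p_next = (p[0] + dx[0], p[1] + dx[1])

approx_sq = line_element(polar_metric(*p), dx)
x0, y0 = to_cartesian(*p)
x1, y1 = to_cartesian(*p_next)
exact_sq = (x1 - x0) ** 2 + (y1 - y0) ** 2
relative_error = abs(approx_sq - exact_sq) / exact_sq
print(approx_sq, exact_sq, relative_error)
```

The two values agree to leading order in the displacement; shrinking `dx` makes the relative error smaller, which is the sense in which the line element measures infinitesimal distance.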


Given a way to get new smooth manifolds from old, there is often a corresponding way to derive a metric tensor on the new manifold from metric tensors on the old. For example, suppose first that P is a submanifold of a Riemannian manifold M. Since each tangent space $T_p(P)$ of P is regarded as a subspace of $T_p(M)$, we obtain a Riemannian metric tensor $g_P$ on P merely by applying the metric tensor g of M to each pair of tangent vectors to P. Formally, $g_P$ is the pullback $j^*(g)$, where $j : P \subset M$ is the inclusion map. For example, the standard n-sphere of radius $r > 0$ is the Riemannian submanifold $S^n(r) = \{p : |p| = r\}$ of $\mathbf{R}^{n+1}$.

However when the metric tensor g of M is indefinite, then $j^*(g)$ need not be a metric on P. It is a smooth symmetric (0, 2) tensor field, hence it is a metric if and only if each $T_p(P)$ is nondegenerate in $T_p(M)$ relative to g, and the index of $T_p(P)$ is the same for all p (see Exercise 10(b)).

4. Definition. Let P be a submanifold of a semi-Riemannian manifold M. If the pullback $j^*(g)$ (as above) is a metric tensor on P it makes P a semi-Riemannian submanifold of M.

(If P is known to be Riemannian or Lorentz, these terms replace semi-Riemannian.) Now we consider product manifolds.

5. Lemma. Let M and N be semi-Riemannian manifolds with metric tensors $g_M$ and $g_N$. If $\pi$ and $\sigma$ are the projections of $M \times N$ onto M and N, respectively, let

$$g = \pi^*(g_M) + \sigma^*(g_N).$$

Then g is a metric tensor on $M \times N$ making it a semi-Riemannian product manifold.

Proof. Translating from the pullback notation: if $v, w \in T_{(p,q)}(M \times N)$, then

$$g(v, w) = g_M(d\pi(v), d\pi(w)) + g_N(d\sigma(v), d\sigma(w)).$$

Thus g is symmetric. To show nondegeneracy, suppose $g(v, w) = 0$ for all $w \in T_{(p,q)}(M \times N)$. Then, in particular, for all w tangent to the slice $M \times q$ at (p, q) we have $g_M(d\pi(v), d\pi(w)) = 0$, since $d\sigma(w) = 0$. But such $d\pi(w)$ fill $T_p(M)$, hence $d\pi(v) = 0$. Similarly $d\sigma(v) = 0$; hence $v = 0$. Orthonormal bases for $T_p(M)$ and $T_q(N)$ combine to give an orthonormal basis for $T_{(p,q)}(M \times N)$. Hence the index of g has constant value $\operatorname{ind} M + \operatorname{ind} N$. ∎
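The last step of the proof can be pictured in matrices: relative to a combined orthonormal basis the product metric is block diagonal, so the negative signs simply add. The sketch below is a hypothetical illustration at a single point; the two sample signatures are assumptions.

```python
def product_metric(g_m, g_n):
    """Block-diagonal matrix of the product metric relative to a combined basis."""
    m, n = len(g_m), len(g_n)
    g = [[0] * (m + n) for _ in range(m + n)]
    for i in range(m):
        for j in range(m):
            g[i][j] = g_m[i][j]
    for i in range(n):
        for j in range(n):
            g[m + i][m + j] = g_n[i][j]
    return g

def index_of_diagonal(g):
    """Index of a diagonal matrix with nonzero diagonal: the number of negative entries."""
    return sum(1 for i in range(len(g)) if g[i][i] < 0)

g_m = [[-1, 0], [0, 1]]                     # orthonormal frame, index 1
g_n = [[-1, 0, 0], [0, -1, 0], [0, 0, 1]]   # orthonormal frame, index 2
g = product_metric(g_m, g_n)
print(index_of_diagonal(g))
```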

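The same scheme extends in an obvious way to any finite product of semi-Riemannian manifolds. For example, the semi-Euclidean space $\mathbf{R}^n_\nu$ is

$$\underbrace{\mathbf{R}^1_1 \times \cdots \times \mathbf{R}^1_1}_{\nu \text{ factors}} \times \underbrace{\mathbf{R}^1 \times \cdots \times \mathbf{R}^1}_{n - \nu \text{ factors}} = \mathbf{R}^\nu_\nu \times \mathbf{R}^{n-\nu},$$

where by definition $\mathbf{R}^1_1$ is the real line with metric tensor the negative of the usual dot product on $\mathbf{R}^1$.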

ISOMETRIES

An isometry is the special type of mapping that expresses the notion of isomorphism for semi-Riemannian manifolds.

6. Definition. Let M and N be semi-Riemannian manifolds with metric tensors $g_M$ and $g_N$. An isometry from M to N is a diffeomorphism $\phi : M \to N$ that preserves metric tensors: $\phi^*(g_N) = g_M$.

Explicitly, $\langle d\phi(v), d\phi(w) \rangle = \langle v, w \rangle$ for all $v, w \in T_p(M)$, $p \in M$. Since $\phi$ is a diffeomorphism, each differential map $d\phi_p$ is a linear isomorphism; thus the metric condition means that each $d\phi_p$ is a linear isometry. The pullback operates in the usual way on line elements, and since these determine their metric tensors, preservation of metrics is equivalent to $\phi^*(q_N) = q_M$.

It is easy to see that (1) The identity map of a semi-Riemannian manifold is an isometry. (2) A composition of isometries is an isometry. (3) The inverse map of an isometry is an isometry.

The interpretation of $q = ds^2$ as the square of infinitesimal distance suggests thinking of an isometry as a rigid motion, by contrast with an arbitrary diffeomorphism which can deform M in applying it to N. An object preserved in an appropriate sense by all isometries is called an isometric invariant; and semi-Riemannian geometry is traditionally described as the study of such invariants. If there exists an isometry between M and N, they are said to be isometric; roughly speaking, isometric manifolds are geometrically the same.

Let V be a scalar product space, that is, a real vector space furnished with a scalar product. Then V is a manifold, and just as in the case $V = \mathbf{R}^n$ the formula $\langle v_p, w_p \rangle = \langle v, w \rangle$ defines a metric tensor on V, making it a semi-Riemannian manifold.


7. Lemma. If $\psi : V \to W$ is a linear isometry of scalar product spaces, then (for V and W semi-Riemannian as above) $\psi : V \to W$ is an isometry.

Proof. Since linear maps are smooth, the linear isomorphism $\psi$ is a diffeomorphism. If $v_p \in T_p(V)$, then Exercise 1.2 gives $d\psi(v_p) = (\psi(v))_{\psi(p)}$. Thus $\psi$ preserves metric tensors, since

$$\langle d\psi(v_p), d\psi(w_p) \rangle = \langle (\psi(v))_{\psi(p)}, (\psi(w))_{\psi(p)} \rangle = \langle \psi(v), \psi(w) \rangle = \langle v, w \rangle = \langle v_p, w_p \rangle.$$ ∎

It follows that if V is a scalar product space of dimension n and index $\nu$, then as a semi-Riemannian manifold, V is isometric to $\mathbf{R}^n_\nu$. In fact, the coordinate isomorphism of any orthonormal basis for V is a (linear) isometry. If M is an arbitrary semi-Riemannian manifold, its metric tensor makes each of its tangent spaces a semi-Euclidean space of the same dimension and index as M itself. This is one view of how semi-Riemannian geometry generalizes semi-Euclidean geometry.

THE LEVI-CIVITA CONNECTION

Let V and W be vector fields on a semi-Riemannian manifold M. The goal of this section is to show how to define a new vector field $D_V W$ on M whose value at each point p is the vector rate of change of W in the $V_p$ direction. There is a natural way to do this on $\mathbf{R}^n_\nu$.

8. Definition. Let $u^1, \dots, u^n$ be the natural coordinates on $\mathbf{R}^n_\nu$. If V and $W = \sum W^i \partial_i$ are vector fields on $\mathbf{R}^n_\nu$, the vector field

$$D_V W = \sum V(W^i)\, \partial_i$$

is called the natural covariant derivative of W with respect to V.

Since this definition uses the distinctive coordinates of $\mathbf{R}^n_\nu$ it is not obvious how to extend it to an arbitrary semi-Riemannian manifold. We begin, therefore, by axiomatizing its key properties.
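Definition 8 says that the natural covariant derivative differentiates the component functions of W in the direction of V. The sketch below approximates this at one point with central differences; the vector fields, the step size `h`, and the function names are assumptions made for illustration, and the result is a numerical approximation rather than an exact value.

```python
def directional_derivative(f, p, v, h=1e-6):
    """Central-difference approximation of the derivative of f at p in direction v."""
    plus = [a + h * b for a, b in zip(p, v)]
    minus = [a - h * b for a, b in zip(p, v)]
    return (f(plus) - f(minus)) / (2 * h)

def natural_covariant_derivative(V, W, p):
    """Components of (D_V W)_p: apply the tangent vector V_p to each component W^i."""
    v = V(p)
    return [
        directional_derivative(lambda x, i=i: W(x)[i], p, v)
        for i in range(len(p))
    ]

def V(p):
    """Rotation field V = -y d/dx + x d/dy on the plane."""
    return (-p[1], p[0])

def W(p):
    """A sample field with components W^1 = x^2, W^2 = x*y."""
    return (p[0] ** 2, p[0] * p[1])

# By hand: V(W^1) = -2xy and V(W^2) = x^2 - y^2, so at (1, 2) the value is (-4, -3).
result = natural_covariant_derivative(V, W, (1.0, 2.0))
print(result)
```

Only the value of V at the point enters the computation, while W is sampled near the point; this asymmetry is what the axioms introduced next make precise.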

9. Definition. A connection D on a smooth manifold M is a function $D : \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)$ such that

(D1) $D_V W$ is $\mathfrak{F}(M)$-linear in V,
(D2) $D_V W$ is $\mathbf{R}$-linear in W,
(D3) $D_V(fW) = (Vf)W + f D_V W$ for $f \in \mathfrak{F}(M)$.

$D_V W$ is called the covariant derivative of W with respect to V for the connection D.


Axiom (D1) asserts that $D_V W$ is tensor in V; hence by Proposition 2.2, for an individual tangent vector $v \in T_p(M)$ we have a well-defined tangent vector $D_v W \in T_p(M)$, namely, $(D_V W)_p$ where V is any vector field such that $V_p = v$. On the other hand, (D3) shows that $D_V W$ is not tensor in W.

We can now state our goal more precisely: it is to show that on every semi-Riemannian manifold there is a unique connection sharing two further properties ((D4) and (D5) below) of the natural connection on $\mathbf{R}^n_\nu$. The next step is algebraic.

10. Proposition. Let M be a semi-Riemannian manifold. If $V \in \mathfrak{X}(M)$ let $V^*$ be the one-form on M such that

$$V^*(X) = \langle V, X \rangle \quad \text{for all } X \in \mathfrak{X}(M).$$

Then the function $V \to V^*$ is an $\mathfrak{F}(M)$-linear isomorphism from $\mathfrak{X}(M)$ to $\mathfrak{X}^*(M)$.

Proof. Since $V^*$ is $\mathfrak{F}(M)$-linear it is indeed a one-form, and the function $V \to V^*$ is also $\mathfrak{F}(M)$-linear. That it is an isomorphism follows from two facts: (a) If $\langle V, X \rangle = \langle W, X \rangle$ for all $X \in \mathfrak{X}(M)$, then $V = W$. (b) Given any one-form $\theta \in \mathfrak{X}^*(M)$ there is a unique vector field $V \in \mathfrak{X}(M)$ such that $\theta(X) = \langle V, X \rangle$ for all X.

Let $U = V - W$. Then assertion (a) amounts to showing that if $\langle U_p, X_p \rangle = 0$ for all $X \in \mathfrak{X}(M)$ and all $p \in M$, then $U = 0$. Since every element of $T_p(M)$ has the form $X_p$, the result follows by the nondegeneracy of the metric tensor.

Now (a) is exactly the uniqueness assertion in (b), hence to prove (b) it suffices to find V on an arbitrary coordinate neighborhood $\mathcal{U}$. (All these local V's will be consistent on overlaps.) If $\theta = \sum \theta_i\, dx^i$ on $\mathcal{U}$, let $V = \sum_{i,j} g^{ij} \theta_i\, \partial_j$. Then since $(g_{ij})$ and $(g^{ij})$ are inverse matrices,

$$\langle V, \partial_k \rangle = \sum_{i,j} g^{ij} \theta_i \langle \partial_j, \partial_k \rangle = \sum_{i,j} \theta_i\, g^{ij} g_{jk} = \sum_i \theta_i\, \delta_{ik} = \theta_k = \theta(\partial_k).$$

It follows by $\mathfrak{F}(M)$-linearity that $\langle V, X \rangle = \theta(X)$ for all X on $\mathcal{U}$. ∎
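At a single point, the isomorphism of Proposition 10 is the familiar lowering and raising of indices. The sketch below uses an arbitrary nondegenerate symmetric 2 x 2 matrix as the metric components at one point (an assumption for illustration) and checks that the two operations are mutually inverse; the function names are hypothetical.

```python
from fractions import Fraction as F

def inverse2(g):
    """Inverse of a 2 x 2 matrix with nonzero determinant, in exact arithmetic."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [
        [F(g[1][1], det), F(-g[0][1], det)],
        [F(-g[1][0], det), F(g[0][0], det)],
    ]

def lower(g, V):
    """Components of the one-form V*: theta_k = sum_j g_jk V^j."""
    return [sum(g[j][k] * V[j] for j in range(2)) for k in range(2)]

def raise_index(g_inv, theta):
    """Components of the metrically equivalent vector: V^j = sum_i g^ij theta_i."""
    return [sum(g_inv[i][j] * theta[i] for i in range(2)) for j in range(2)]

g = [[2, 1], [1, -1]]   # symmetric, determinant -3, hence nondegenerate
V = [3, 5]

theta = lower(g, V)
V_again = raise_index(inverse2(g), theta)
print(theta, V_again)
```

Raising requires the inverse matrix, which is exactly where nondegeneracy of the metric is used.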

Thus in semi-Riemannian geometry we can freely transform a vector field into a one-form, and vice versa. Corresponding pairs V c t 0 contain exactly the same information, and are said to be metrically equivalent. The following fundamental result has been called the miracle of semiRiemannian geometry:


11. Theorem. On a semi-Riemannian manifold $M$ there is a unique connection $D$ such that

(D4) $[V, W] = D_V W - D_W V$, and

(D5) $X\langle V, W\rangle = \langle D_X V, W\rangle + \langle V, D_X W\rangle$,

for all $X, V, W \in \mathfrak{X}(M)$. $D$ is called the Levi-Civita connection of $M$, and is characterized by the Koszul formula

$$2\langle D_V W, X\rangle = V\langle W, X\rangle + W\langle X, V\rangle - X\langle V, W\rangle - \langle V, [W, X]\rangle + \langle W, [X, V]\rangle + \langle X, [V, W]\rangle.$$

Proof. Suppose that $D$ is a connection on $M$ satisfying axioms (D4) and (D5). On the right-hand side of the Koszul formula use (D5) on the first three terms and (D4) on the last three. Most terms cancel in pairs leaving $2\langle D_V W, X\rangle$. Thus $D$ satisfies the Koszul formula, hence by assertion (a) in the preceding proof it is unique.

For the existence define $F(V, W, X)$ to be the right-hand side of the Koszul formula. For fixed $V, W \in \mathfrak{X}(M)$ a straightforward computation shows that the function $X \to F(V, W, X)$ is $\mathfrak{F}(M)$-linear, hence is a one-form. By Proposition 10, there is a unique vector field, which we denote by $D_V W$, such that $2\langle D_V W, X\rangle = F(V, W, X)$ for all $X$. Thus the Koszul formula holds and from it we can deduce (D1)-(D5).

For example, let us prove (D3). For an arbitrary $X$,

$$2\langle D_V(fW), X\rangle = V\langle fW, X\rangle + fW\langle X, V\rangle - X\langle V, fW\rangle - \langle V, [fW, X]\rangle + \langle fW, [X, V]\rangle + \langle X, [V, fW]\rangle.$$

Functions can be factored out of the tensor $\langle\ ,\ \rangle$, and for the bracket operation we have, for example, $[fW, X] = -(Xf)W + f[W, X]$. Thus the expression on the right-hand side above becomes

$$Vf\langle W, X\rangle + Vf\langle X, W\rangle + Xf\langle V, W\rangle - Xf\langle V, W\rangle + fF(V, W, X) = 2\langle (Vf)W + fD_V W, X\rangle.$$

Then by the preceding proof, $D_V(fW) = (Vf)W + fD_V W$.

To prove (D4) start from

$$2\langle D_V W - D_W V, X\rangle = F(V, W, X) - F(W, V, X).$$

The right-hand side reduces to

$$\langle X, [V, W]\rangle - \langle X, [W, V]\rangle = 2\langle [V, W], X\rangle.$$

Hence the result follows. The other verifications are similar.
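The cancellation asserted in the uniqueness part of the proof can be written out term by term. The following worked computation is only a sketch of that step; it uses nothing beyond (D4), (D5), and the symmetry of the metric tensor, in the notation of Theorem 11.

```latex
% Expand the six terms on the right-hand side of the Koszul formula.
% First three terms: apply (D5).  Last three terms: apply (D4).
\begin{align*}
 V\langle W,X\rangle   &= \langle D_V W,X\rangle + \langle W,D_V X\rangle, \\
 W\langle X,V\rangle   &= \langle D_W X,V\rangle + \langle X,D_W V\rangle, \\
-X\langle V,W\rangle   &= -\langle D_X V,W\rangle - \langle V,D_X W\rangle, \\
-\langle V,[W,X]\rangle &= -\langle V,D_W X\rangle + \langle V,D_X W\rangle, \\
 \langle W,[X,V]\rangle &= \langle W,D_X V\rangle - \langle W,D_V X\rangle, \\
 \langle X,[V,W]\rangle &= \langle X,D_V W\rangle - \langle X,D_W V\rangle.
\end{align*}
% Adding the six lines, every term except <D_V W, X> occurs once with
% each sign, so ten terms cancel in pairs and two copies remain:
\[
 \text{sum of the right-hand sides} = 2\langle D_V W, X\rangle .
\]
```

Each of the five cancelling pairs matches a term produced by (D5) with a term produced by (D4), which is why both axioms are needed to pin down $D$.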


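12. Definition. Let $x^1, \ldots, x^n$ be a coordinate system on a neighborhood $\mathscr{U}$ in a semi-Riemannian manifold $M$. The Christoffel symbols for this coordinate system are the real-valued functions $\Gamma^k_{ij}$ on $\mathscr{U}$ such that

$$D_{\partial_i}(\partial_j) = \sum_k \Gamma^k_{ij}\,\partial_k \qquad (1 \le i, j \le n).$$

Since $[\partial_i, \partial_j] = 0$, it follows from (D4) that $D_{\partial_i}(\partial_j) = D_{\partial_j}(\partial_i)$, hence $\Gamma^k_{ij} = \Gamma^k_{ji}$.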

The connection D is not a tensor, so the Christoffel symbols do not obey the usual tensor transformation rule under change of coordinates.

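13. Proposition. For a coordinate system $x^1, \ldots, x^n$ on $\mathscr{U}$,

$$(1)\quad D_{\partial_i}\Big(\sum_j W^j\,\partial_j\Big) = \sum_k \Big\{\frac{\partial W^k}{\partial x^i} + \sum_j \Gamma^k_{ij} W^j\Big\}\,\partial_k,$$

where the Christoffel symbols are given by

$$(2)\quad \Gamma^k_{ij} = \frac{1}{2}\sum_m g^{km}\Big\{\frac{\partial g_{jm}}{\partial x^i} + \frac{\partial g_{im}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^m}\Big\}.$$

Proof. (1) is an immediate consequence of (D3). To derive (2) set $V = \partial_i$, $W = \partial_j$, $X = \partial_m$ in the Koszul formula. The brackets are zero, leaving

$$2\langle D_{\partial_i}(\partial_j), \partial_m\rangle = \frac{\partial g_{jm}}{\partial x^i} + \frac{\partial g_{im}}{\partial x^j} - \frac{\partial g_{ij}}{\partial x^m}.$$

But by the definition of Christoffel symbols,

$$2\langle D_{\partial_i}(\partial_j), \partial_m\rangle = 2\sum_a \Gamma^a_{ij}\, g_{am}.$$

Attacking both equations above with $\sum_m g^{mk}$ gives the required result. ∎

Using (D1) we can compute any $D_V W$ on coordinate neighborhoods by the first formula above, while the second formula is the coordinate description of how the metric tensor determines the Levi-Civita connection.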
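The second formula of Proposition 13 lends itself to a quick numerical check. The sketch below is illustrative only: the function names `inverse`, `christoffel`, and `cylindrical_metric` are hypothetical, partial derivatives are approximated by central differences (an assumption of this sketch, not part of the text), and the test metric is the cylindrical one $ds^2 = dr^2 + r^2\,d\varphi^2 + dz^2$ treated in Example 15.

```python
def inverse(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [x - factor * y for x, y in zip(aug[row], aug[col])]
    return [row[n:] for row in aug]


def christoffel(metric, point, h=1e-5):
    """Return gamma with gamma[k][i][j] approximating Gamma^k_ij at `point`.

    `metric` maps a coordinate list to the matrix (g_ab); indices are 0-based.
    """
    n = len(point)

    def metric_derivative(m):
        # Central-difference estimate of the matrix (d g_ab / d x^m) at `point`.
        plus, minus = list(point), list(point)
        plus[m] += h
        minus[m] -= h
        g_plus, g_minus = metric(plus), metric(minus)
        return [[(g_plus[a][b] - g_minus[a][b]) / (2.0 * h) for b in range(n)]
                for a in range(n)]

    dg = [metric_derivative(m) for m in range(n)]  # dg[m][a][b] = d g_ab / d x^m
    g_inv = inverse(metric(point))
    # Proposition 13(2): Gamma^k_ij = 1/2 sum_m g^{km} (d_i g_jm + d_j g_im - d_m g_ij).
    return [[[0.5 * sum(g_inv[k][m] * (dg[i][j][m] + dg[j][i][m] - dg[m][i][j])
                        for m in range(n))
              for j in range(n)]
             for i in range(n)]
            for k in range(n)]


def cylindrical_metric(y):
    # Coordinates y = (r, phi, z); metric diag(1, r^2, 1).
    r = y[0]
    return [[1.0, 0.0, 0.0], [0.0, r * r, 0.0], [0.0, 0.0, 1.0]]


gamma = christoffel(cylindrical_metric, [2.0, 0.7, -1.0])
# With 0-based indices, gamma[0][1][1] plays the role of Gamma^1_22 and
# gamma[1][0][1] the role of Gamma^2_12; at r = 2 they should be close to
# -r = -2 and 1/r = 0.5 respectively.
print(gamma[0][1][1], gamma[1][0][1])
```

Finite differences keep the sketch free of external dependencies; for symbolic results one would differentiate the metric components exactly instead.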

14. Lemma. The natural connection $D$ of Definition 8 is the Levi-Civita connection of the semi-Euclidean space $\mathbf{R}^n_\nu$ for every $\nu = 0, 1, \ldots, n$. Relative to natural coordinates on $\mathbf{R}^n_\nu$,

(1) $g_{ij} = \delta_{ij}\varepsilon_j$, where $\varepsilon_j = -1$ for $1 \le j \le \nu$ and $\varepsilon_j = +1$ for $\nu + 1 \le j \le n$,

(2) $\Gamma^k_{ij} = 0$, for all $1 \le i, j, k \le n$.


Proof. (1) is essentially the definition of the metric tensor of $\mathbf{R}^n_\nu$. To prove that $D$ is the Levi-Civita connection of $\mathbf{R}^n_\nu$ one must check that it satisfies (D1)-(D5). Take (D5), for example. Since $\langle V, W\rangle = \sum \varepsilon_i V^i W^i$,

$$X\langle V, W\rangle = \sum \varepsilon_i X(V^i)W^i + \sum \varepsilon_i V^i X(W^i) = \langle D_X V, W\rangle + \langle V, D_X W\rangle.$$

Then (2) follows from Proposition 13(2), since the $g_{ij}$'s are constant. ∎

A vector field $V$ is parallel provided its covariant derivatives $D_X V$ are zero for all $X \in \mathfrak{X}(M)$. Thus the vanishing of Christoffel symbols in the lemma means that the natural coordinate vector fields on $\mathbf{R}^n_\nu$ are parallel. In general the Christoffel symbols of a coordinate system measure the failure of its coordinate vector fields to be parallel.

15. Example. Cylindrical Coordinates in $\mathbf{R}^3$. Let $r, \varphi, z$ be the usual cylindrical coordinates in $\mathbf{R}^3$ as indicated in Figure 1. Actually $(r, \varphi, z)$ is a coordinate system only on $\mathbf{R}^3 - H$, where $H$ is, for example, the half-plane $x \ge 0$, $y = 0$. There the coordinate functions are well defined and an inverse mapping exists given by $x = r\cos\varphi$, $y = r\sin\varphi$, $z = z$.

Figure 1

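Hence by the basis theorem

$$\partial_r = \cos\varphi\,\partial_x + \sin\varphi\,\partial_y; \qquad \partial_\varphi = rU, \ \text{where } U = -\sin\varphi\,\partial_x + \cos\varphi\,\partial_y; \qquad \partial_z = \partial_z.$$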


For the sake of indexing, let $y^1 = r$, $y^2 = \varphi$, $y^3 = z$. Then $g_{11} = g_{33} = 1$, $g_{22} = r^2$, and $g_{ij} = 0$ for $i \ne j$, hence

$$ds^2 = dr^2 + r^2\,d\varphi^2 + dz^2.$$

In particular this is an orthogonal coordinate system; that is, the coordinate vector fields are mutually orthogonal. A direct computation shows that the Christoffel symbols of cylindrical coordinates are all zero except $\Gamma^1_{22} = -r$ and $\Gamma^2_{12} = \Gamma^2_{21} = 1/r$. Hence all coordinate covariant derivatives are zero, except

$$D_{\partial_\varphi}(\partial_\varphi) = -r\,\partial_r, \qquad D_{\partial_r}(\partial_\varphi) = D_{\partial_\varphi}(\partial_r) = (1/r)\,\partial_\varphi = U.$$

These formulas are consistent with what we can visualize from Figure 1. Since $\partial_z$ is also a natural coordinate vector field it must be parallel. Similarly we expect $D_{\partial_z}(\partial_r) = D_{\partial_z}(\partial_\varphi) = 0$, since $\partial_r$ and $\partial_\varphi$ remain parallel as the point $p$ moves in the $z$ direction.

The covariant derivative $D_V$ can be extended to operate on arbitrary tensor fields. In fact, axioms (D2) and (D3) are exactly what is needed to apply Theorem 2.15.

16. Definition. Let $V$ be a vector field on a semi-Riemannian manifold $M$. The (Levi-Civita) covariant derivative $D_V$ is the unique tensor derivation on $M$ such that

$$D_V f = Vf \quad \text{for } f \in \mathfrak{F}(M),$$

and $D_V W$ is the Levi-Civita covariant derivative for all $W \in \mathfrak{X}(M)$.

If $A \in \mathfrak{T}^r_s(M)$ then the $(r, s)$ tensor field $D_V A$ is $\mathfrak{F}(M)$-linear in $V \in \mathfrak{X}(M)$. In fact, by Corollary 2.14 it suffices to check that the tensor derivations $D_{fV + gW}$ and $fD_V + gD_W$ agree on $\mathfrak{F}(M)$ and $\mathfrak{X}(M)$. But the former is definitional and the latter is (D1). This remark justifies the following definition.

17. Definition. The covariant differential of an $(r, s)$ tensor $A$ on $M$ is the $(r, s + 1)$ tensor $DA$ such that

$$(DA)(\theta^1, \ldots, \theta^r, X_1, \ldots, X_s, V) = (D_V A)(\theta^1, \ldots, \theta^r, X_1, \ldots, X_s)$$

for all $V, X_i \in \mathfrak{X}(M)$ and $\theta^j \in \mathfrak{X}^*(M)$.

In the exceptional case $r = s = 0$ the covariant differential of a function $f$ is its usual differential $df \in \mathfrak{X}^*(M)$, since

$$(Df)(V) = D_V f = Vf = df(V) \quad \text{for all } V \in \mathfrak{X}(M).$$


$DA$ is simply a convenient way to collect all the covariant derivatives of $A$. The fact that the covariant type of $DA$ is one larger than that of $A$ accounts for the term covariant as applied to both derivatives and differentials. Just as for a vector field, a tensor field $A$ is parallel provided its covariant differential is zero, that is, $D_V A = 0$ for all $V \in \mathfrak{X}(M)$. For example, using the product rule (2.13) it follows that (D5) is equivalent to the parallelism of the metric tensor $g$.

If $A \in \mathfrak{T}^r_s(M)$ the components of $DA$ relative to a coordinate system are denoted by $A^{i_1 \cdots i_r}_{j_1 \cdots j_s;k}$. Exercise 2 shows how to express these components in terms of the components of $A$ and the Christoffel symbols of the coordinate system. The general formula is somewhat complicated, but as we shall see its use can be avoided. In the special case of natural coordinates on $\mathbf{R}^n_\nu$, since the coordinate vector fields and hence the differentials $du^1, \ldots, du^n$ are parallel it follows that

$$A^{i_1 \cdots i_r}_{j_1 \cdots j_s;k} = \frac{\partial}{\partial u^k} A^{i_1 \cdots i_r}_{j_1 \cdots j_s}.$$

PARALLEL TRANSLATION

The simplest case of a vector field on a mapping (Definition 1.47) is a vector field $Z$ on a curve $\alpha: I \to M$. $Z$ smoothly assigns to each $t \in I$ a tangent vector to $M$ at $\alpha(t)$. For example, the velocity $\alpha'$ is a vector field on $\alpha$, as is the restriction $V_\alpha$ of any $V \in \mathfrak{X}(M)$. The set $\mathfrak{X}(\alpha)$ of all (smooth) vector fields on $\alpha$ is a module over $\mathfrak{F}(I)$. When $M$ is a semi-Riemannian manifold there is a natural way to define the vector rate of change $Z'$ of a vector field $Z \in \mathfrak{X}(\alpha)$.

18. Proposition. Let $\alpha: I \to M$ be a curve in a semi-Riemannian manifold $M$. Then there is a unique function $Z \to Z' = DZ/dt$ from $\mathfrak{X}(\alpha)$ to $\mathfrak{X}(\alpha)$, called the induced covariant derivative, such that

(1) $(aZ_1 + bZ_2)' = aZ_1' + bZ_2'$ $\quad (a, b \in \mathbf{R})$,

(2) $(hZ)' = (dh/dt)Z + hZ'$ $\quad (h \in \mathfrak{F}(I))$,

(3) $(V_\alpha)'(t) = D_{\alpha'(t)}(V)$ $\quad (t \in I,\ V \in \mathfrak{X}(M))$.

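Furthermore,

(4) $\dfrac{d}{dt}\langle Z_1, Z_2\rangle = \langle Z_1', Z_2\rangle + \langle Z_1, Z_2'\rangle$ $\quad (Z_1, Z_2 \in \mathfrak{X}(\alpha))$.

Proof. Uniqueness. Suppose an induced connection exists satisfying only the first three properties. We can assume that $\alpha$ lies in the domain of a single coordinate system $x^1, \ldots, x^n$. By the basis theorem, if $Z \in \mathfrak{X}(\alpha)$, then at $\alpha(t)$,

$$Z(t) = \sum Z(t)x^i\,\partial_i = \sum (Zx^i)(t)\,\partial_i.$$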


Denote the component function $Zx^i: I \to \mathbf{R}$ by $Z^i$. By properties (1) and (2),

$$Z' = \sum \frac{dZ^i}{dt}\,\partial_i|_\alpha + \sum Z^i\,(\partial_i|_\alpha)'.$$

But by (3), $(\partial_i|_\alpha)' = D_{\alpha'}(\partial_i)$; thus

$$Z' = \sum \frac{dZ^i}{dt}\,\partial_i + \sum Z^i D_{\alpha'}(\partial_i).$$

Thus $Z'$ is completely determined by the Levi-Civita connection $D$.

Existence. On any subinterval $J$ of $I$ such that $\alpha(J)$ lies in a coordinate neighborhood, define $Z'$ by the formula above. Then straightforward computations show that all four properties hold. By the uniqueness these local definitions of $Z'$ constitute a single vector field in $\mathfrak{X}(\alpha)$. ∎

In the special case $Z = \alpha'$ the derivative $Z' = \alpha''$ is called the acceleration of the curve $\alpha$. More elaborate notations are sometimes used to emphasize that $\alpha''$ involves geometry while $\alpha'$ does not. For a vector field $Z$ on $\alpha$ it is tempting to write $Z' = D_{\alpha'}Z$ and hence also $\alpha'' = D_{\alpha'}(\alpha')$. Though $Z$ and $\alpha'$ are not vector fields on $M$, these formulas can be justified, but only at points $\alpha(t)$ where $\alpha'(t) \ne 0$ (see Exercise 12).

Introducing Christoffel symbols into the coordinate formula above yields

$$Z' = \sum_k \Big\{\frac{dZ^k}{dt} + \sum_{i,j} \Gamma^k_{ij}\,\frac{d(x^i \circ \alpha)}{dt}\,Z^j\Big\}\,\partial_k.$$

If $Z' = 0$, then $Z$ is said to be parallel. This formula shows that the equation $Z' = 0$ is equivalent to a system of linear ordinary differential equations.
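As an illustration of such a linear system, here is a minimal worked sketch. The setting is an assumption chosen for simplicity and is not taken from the text: the Euclidean plane with polar coordinates $(r, \varphi)$, whose only nonzero Christoffel symbols are $\Gamma^r_{\varphi\varphi} = -r$ and $\Gamma^\varphi_{r\varphi} = \Gamma^\varphi_{\varphi r} = 1/r$ (the two-dimensional case of the cylindrical coordinates of Example 15), together with the hypothetical curve $\alpha$ given by $r = r_0$, $\varphi = t$.

```latex
% Component form of Z' = 0 for Z = Z^r \partial_r + Z^\varphi \partial_\varphi:
%   dZ^k/dt + \sum_{i,j} \Gamma^k_{ij} \frac{d(x^i\circ\alpha)}{dt} Z^j = 0 .
% Along r = r_0, \varphi = t we have dr/dt = 0 and d\varphi/dt = 1, so
\begin{align*}
\frac{dZ^r}{dt} - r_0\,Z^\varphi &= 0, &
\frac{dZ^\varphi}{dt} + \frac{1}{r_0}\,Z^r &= 0 .
\end{align*}
% General solution, with arbitrary constants A and B:
\begin{align*}
Z^r(t) &= A\cos t + B\sin t, &
Z^\varphi(t) &= \frac{1}{r_0}\,(-A\sin t + B\cos t).
\end{align*}
% Substituting \partial_r = \cos\varphi\,\partial_x + \sin\varphi\,\partial_y and
% \partial_\varphi = r(-\sin\varphi\,\partial_x + \cos\varphi\,\partial_y) at \varphi = t:
\[
Z(t) = A\,\partial_x + B\,\partial_y .
\]
```

So the parallel fields along this circle are exactly the restrictions of constant vector fields; the polar components vary only because the coordinate frame rotates.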

Thus the fundamental existence and uniqueness theorem for such systems gives:

19. Proposition. For a curve $\alpha: I \to M$, let $a \in I$ and $z \in T_{\alpha(a)}(M)$. Then there is a unique parallel vector field $Z$ on $\alpha$ such that $Z(a) = z$.

Here we take advantage of the fact that solutions of a linear system are defined on the entire interval for which its coefficient functions are given. In the notation of the proposition, if $b \in I$ then the function

$$P = P_a^b(\alpha): T_p(M) \to T_q(M)$$

sending each $z$ to $Z(b)$ is called parallel translation along $\alpha$ from $p = \alpha(a)$ to $q = \alpha(b)$.

20. Lemma. Parallel translation is a linear isometry.

Proof. With notation as above, let $v, w \in T_p(M)$ correspond as in the proposition to parallel vector fields $V, W$. Since $V + W$ is also parallel,


$$P(v + w) = (V + W)(b) = V(b) + W(b) = P(v) + P(w).$$

Similarly, $P(cv) = cP(v)$. Thus $P$ is linear. If $P(v) = 0$ then by the uniqueness in the proposition, $V$ can only be the identically zero vector field on $\alpha$. Hence $v = V(a) = 0$. Thus $P$ is one-to-one, and since tangent spaces to $M$ have the same dimension, $P$ is a linear isomorphism. Finally, for $V, W$ as above,

$$\frac{d}{dt}\langle V, W\rangle = \langle V', W\rangle + \langle V, W'\rangle = 0.$$

Hence $\langle V, W\rangle$ is constant, so

$$\langle P(v), P(w)\rangle = \langle V(b), W(b)\rangle = \langle V(a), W(a)\rangle = \langle v, w\rangle.$$

In general, parallel translation from $p$ to $q$ depends on the particular curve joining $p$ to $q$. On $\mathbf{R}^n_\nu$ the natural coordinate vector fields are parallel and hence so are their restrictions to any curve. Hence parallel translation from $p$ to $q$ along any curve is just the canonical isomorphism $v_p \to v_q$. This phenomenon is called distant parallelism.

GEODESICS

We now generalize the Euclidean notion of straight line. A geodesic in a semi-Riemannian manifold $M$ is a curve $\gamma: I \to M$ whose vector field $\gamma'$ is parallel. Equivalently, geodesics are the curves of acceleration zero: $\gamma'' = 0$.

21. Corollary. Let $x^1, \ldots, x^n$ be a coordinate system on $\mathscr{U} \subset M$. A curve $\gamma$ in $\mathscr{U}$ is a geodesic of $M$ if and only if its coordinate functions $x^k \circ \gamma$ satisfy

$$\frac{d^2(x^k \circ \gamma)}{dt^2} + \sum_{i,j} \Gamma^k_{ij}(\gamma)\,\frac{d(x^i \circ \gamma)}{dt}\,\frac{d(x^j \circ \gamma)}{dt} = 0 \quad \text{for } 1 \le k \le n.$$

In fact, these expressions are the components of $\gamma''$ relative to the coordinate vector fields $\partial_1, \ldots, \partial_n$. In dealing with curves it is often convenient to use a common abbreviation, writing the coordinate functions of $\gamma$ as $x^i$ rather than $x^i \circ \gamma$. In any reasonable context there should be no confusion between these functions on the domain $I$ of $\gamma$ and the coordinate functions on $\mathscr{U} \subset M$. The geodesic equations then become

$$\frac{d^2 x^k}{dt^2} + \sum_{i,j} \Gamma^k_{ij}\,\frac{dx^i}{dt}\,\frac{dx^j}{dt} = 0 \qquad (1 \le k \le n).$$
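The geodesic equations can be integrated numerically once the Christoffel symbols are known. The sketch below is an illustration under stated assumptions rather than anything from the text: it uses polar coordinates on the Euclidean plane (with $\Gamma^r_{\varphi\varphi} = -r$ and $\Gamma^\varphi_{r\varphi} = \Gamma^\varphi_{\varphi r} = 1/r$), a classical fourth-order Runge-Kutta step, and hypothetical function names. Since the plane is flat, the computed geodesic should trace a straight line.

```python
import math


def geodesic_rhs(state):
    """Right-hand side of the geodesic equations as a first-order system.

    state = (r, phi, dr/dt, dphi/dt).  With Gamma^r_{phi phi} = -r and
    Gamma^phi_{r phi} = Gamma^phi_{phi r} = 1/r the equations read
        r''   = r (phi')^2
        phi'' = -(2/r) r' phi'
    """
    r, phi, vr, vphi = state
    return (vr, vphi, r * vphi * vphi, -2.0 * vr * vphi / r)


def rk4_step(f, state, dt):
    """One classical fourth-order Runge-Kutta step for state' = f(state)."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate_geodesic(state, t_end, steps=1000):
    """Integrate the geodesic equations from t = 0 to t = t_end."""
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(geodesic_rhs, state, dt)
    return state


# Start at the point (x, y) = (1, 0) with Cartesian velocity (0, 1);
# in polar coordinates this is r = 1, phi = 0, r' = 0, phi' = 1.
r, phi, vr, vphi = integrate_geodesic((1.0, 0.0, 0.0, 1.0), t_end=1.0)
x, y = r * math.cos(phi), r * math.sin(phi)
# The straight line through (1, 0) with velocity (0, 1) reaches (1, 1) at t = 1,
# so x and y should both be close to 1.
print(x, y)
```

The second-order system is rewritten as a first-order system in position and velocity; this is the same device the text uses later when geodesics are represented as integral curves on the tangent bundle.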


The existence and uniqueness theorem for ordinary differential equations gives the following local result.

22. Lemma. If $v \in T_p(M)$ there exists an interval $I$ about 0 and a unique geodesic $\gamma: I \to M$ such that $\gamma'(0) = v$.

The last equation implies, of course, that $\gamma(0) = p$; we say that $\gamma$ is a geodesic starting at $p$ with initial velocity $v$.

23. Lemma. Let $\alpha, \beta: I \to M$ be geodesics. If there is a number $a \in I$ such that $\alpha'(a) = \beta'(a)$, then $\alpha = \beta$.

Proof. Suppose the conclusion is false; then there is a $t_0 \in I$ such that $\alpha(t_0) \ne \beta(t_0)$, with say $t_0 > a$. Thus the set $\{t \in I : t > a \text{ and } \alpha(t) \ne \beta(t)\}$ has a greatest lower bound $b$, for which $b \ge a$. We assert that $\alpha'(b) = \beta'(b)$. This is given if $b = a$. If $b > a$, then $\alpha$ and $\beta$ agree on the interval $(a, b)$. Coordinate expressions show that the functions $t \to \alpha'(t)$ and $t \to \beta'(t)$ from $(a, b)$ into the tangent manifold $TM$ are continuous (in fact, smooth). Thus as $t$ approaches $b$ from below

$$\alpha'(b) = \lim \alpha'(t) = \lim \beta'(t) = \beta'(b).$$

Since $t \to \alpha(t + b)$ and $t \to \beta(t + b)$ are also geodesics, Lemma 22 shows that $\alpha = \beta$ on some interval around $b$. But this contradicts the definition of $b$.

24. Proposition. Given any tangent vector $v \in T_p(M)$ there is a unique geodesic $\gamma_v$ in $M$ such that

(1) The initial velocity of $\gamma_v$ is $v$; that is, $\gamma_v'(0) = v$.

(2) The domain $I_v$ of $\gamma_v$ is the largest possible. Hence, if $\alpha: J \to M$ is a geodesic with initial velocity $v$, then $J \subset I_v$ and $\alpha = \gamma_v|J$.

Proof. Let $\mathscr{G}$ be the collection of all geodesics $\gamma: I_\gamma \to M$ with initial velocity $v$. (By Lemma 22 there are some.) Lemma 23 shows that $\alpha$ and $\beta$ in $\mathscr{G}$ agree on $I_\alpha \cap I_\beta$. Hence the collection $\mathscr{G}$ consistently defines a single curve $\gamma_v$ on the interval $I = \bigcup I_\gamma$. Evidently $\gamma_v$ has the required properties.

Because of (2) the geodesic $\gamma_v$ is said to be maximal or geodesically inextendible. The notation $\gamma_v$ will be used frequently. Picture $M$ as a surface in $\mathbf{R}^3$ and $p$ as a penny constrained to remain on $M$. Once $p$ is given an initial velocity its motion is completely determined and it traces out a geodesic of $M$.

A semi-Riemannian manifold $M$ for which every maximal geodesic is defined on the entire real line is said to be geodesically complete, or merely complete. (See Exercise 7.) Note that if even a single point $p$ is removed from a complete manifold $M$ then $M - p$ is no longer complete, since geodesics that formerly went through $p$ are now obliged to stop.

25. Example. Geodesics of Semi-Euclidean Space. For natural coordinates the Christoffel symbols vanish, so the geodesic equations become

$$\frac{d^2(u^i \circ \gamma)}{dt^2} = 0 \qquad (1 \le i \le n).$$

Thus $u^i(\gamma(t)) = p^i + tv^i$ for all $t$, where $p^i$ and $v^i$ are arbitrary constants. In vector notation, $\gamma(t) = p + tv$. Hence the geodesics of $\mathbf{R}^n_\nu$ are straight lines. In particular, $\mathbf{R}^n_\nu$ is geodesically complete.

Since its velocity vector field is parallel, a geodesic $\gamma$ has quite uniform behavior. Every constant curve in $M$ is trivially geodesic, but if $\gamma'(t) \ne 0$ for one single $t$ then $\gamma'$ never vanishes. Thus a geodesic cannot slow down and stop.

A curve $\alpha$ in $M$ is spacelike if all of its velocity vectors $\alpha'(s)$ are spacelike; similarly for timelike and null. An arbitrary curve need not have one of these causal characters, but a geodesic $\gamma$ always does since $\gamma'$ is parallel, and parallel translation preserves causal character of vectors.

26. Lemma. Let $\gamma: I \to M$ be a nonconstant geodesic. A reparametrization $\gamma \circ h: J \to M$ is a geodesic if and only if $h$ has the form $h(t) = at + b$.

Proof. For any curve $\gamma$, $(\gamma \circ h)'(t) = (dh/dt)(t)\,\gamma'(h(t))$. Hence by Exercise 3,

$$(\gamma \circ h)''(t) = \frac{d^2h}{dt^2}(t)\,\gamma'(h(t)) + \Big(\frac{dh}{dt}(t)\Big)^2 \gamma''(h(t)).$$

Since $\gamma$ is a geodesic, $\gamma'' = 0$, and $\gamma$ nonconstant implies $\gamma'$ never zero. Thus

$$\gamma \circ h \text{ geodesic} \iff (\gamma \circ h)'' = 0 \iff d^2h/dt^2 = 0 \iff h(t) = at + b.$$

This result shows that geodesic parametrizations have geometric significance. If a curve has a reparametrization as a geodesic we call it a pregeodesic.

If a system of second-order ordinary differential equations is given by smooth functions, then its solutions are smooth not just in the parameter but simultaneously in the parameter, initial values, and initial first derivatives. Applying this fact to the geodesic differential equations gives

27. Lemma. Let $v$ be a tangent vector to $M$, that is, an element of the tangent bundle $TM$. Then there exists a neighborhood $\mathscr{N}$ of $v$ in $TM$ and an interval $I$ around 0 such that $(w, s) \to \gamma_w(s)$ is a well-defined smooth function from $\mathscr{N} \times I$ into $M$.

A second-order differential equation for $y$ can be converted into a pair of first-order equations by taking $y'$ as a new variable. By essentially the same device, geodesics in $M$ can be represented by integral curves in the tangent bundle $TM$.

28. Proposition. There is a vector field $G$ on $TM$ such that the projection $\pi: TM \to M$ establishes a one-to-one correspondence between [maximal] integral curves of $G$ and [maximal] geodesics of $M$.

Proof. If $v \in TM$ let $G_v$ be the initial velocity of the curve $s \to \gamma_v'(s)$ in $TM$. It follows using the preceding lemma that $G$ is a smooth vector field on $TM$.

(a) If $\gamma$ is a geodesic in $M$, then $\gamma'$ is an integral curve of $G$. For all $s$, let $\alpha(s) = \gamma'(s)$. For arbitrary fixed $t$, let $w = \gamma'(t)$ and $\beta(s) = \gamma_w'(s)$. By Lemma 23, $\gamma(t + s) = \gamma_w(s)$. Taking velocities in $M$ gives $\alpha(t + s) = \gamma_w'(s) = \beta(s)$. Then taking velocities in $TM$ gives $\alpha'(t + s) = \beta'(s)$. In particular, $\alpha'(t) = \beta'(0) = G_w = G_{\alpha(t)}$.

(b) If $\alpha$ is an integral curve of $G$, then $\pi \circ \alpha$ is a geodesic in $M$. If $v = \alpha(0)$ then by (a), $s \to \gamma_v'(s)$ is also an integral curve of $G$. Like $\alpha$ it starts at $v$, hence the uniqueness of integral curves implies that, initially at least, $\pi \circ \alpha = \pi \circ \gamma_v' = \gamma_v$. For arbitrary $t$ let $\delta$ be the integral curve of $G$ starting at $\alpha(t)$. By Lemma 1.50, $\alpha(t + s) = \delta(s)$, hence $\pi\alpha(t + s) = \pi\delta(s) = \gamma_{\alpha(t)}(s)$.

The identities $\pi \circ \gamma' = \gamma$ and $(\pi \circ \alpha)' = \alpha$ show that the maps $\alpha \to \pi \circ \alpha$ and $\gamma \to \gamma'$ are inverses, and the result follows. ∎

THE EXPONENTIAL MAP

At each point $o$ of a semi-Riemannian manifold $M$ we collect the geodesics starting at $o$ into a single mapping.

29. Definition. If $o \in M$, let $\mathscr{D}_o$ be the set of vectors $v$ in $T_o(M)$ such that the inextendible geodesic $\gamma_v$ is defined at least on $[0, 1]$. The exponential map of M at o is the function

$$\exp_o: \mathscr{D}_o \to M$$

such that $\exp_o(v) = \gamma_v(1)$ for all $v \in \mathscr{D}_o$.


Obviously $\mathscr{D}_o$ is the largest subset of $T_o(M)$ on which $\exp_o$ can be defined. If $M$ is complete, then $\mathscr{D}_o = T_o(M)$ for every point $o$ of $M$.

Fix $v \in T_o(M)$ and $t \in \mathbf{R}$; then the geodesic $s \to \gamma_v(ts)$ has initial velocity $t\gamma_v'(0) = tv$. Hence $\gamma_{tv}(s) = \gamma_v(ts)$ for all $s$ and $t$ such that either side (hence both) is well defined. In particular, if $v \in \mathscr{D}_o$ then

$$\exp_o(tv) = \gamma_{tv}(1) = \gamma_v(t).$$

Thus the exponential map $\exp_o$ carries lines through the origin of $T_o(M)$ to geodesics of $M$ through $o$.

30. Proposition. For each point $o \in M$ there exists a neighborhood $\tilde{\mathscr{U}}$ of 0 in $T_o(M)$ on which the exponential map $\exp_o$ is a diffeomorphism onto a neighborhood $\mathscr{U}$ of $o$ in $M$.

Proof. It follows from Lemma 27 that $\exp_o$ is a well-defined smooth mapping on some neighborhood of 0 in $T_o(M)$. We assert that the differential map

$$d\exp_o: T_0(T_o M) \to T_o(M)$$

is the canonical isomorphism $v_0 \to v$ (page 26). By definition $v_0 = \rho'(0)$, where $\rho(t) = tv$; as noted above, $\exp_o(tv) = \gamma_v(t)$. Thus

$$d\exp_o(v_0) = d\exp_o(\rho'(0)) = (\exp_o \circ \rho)'(0) = \gamma_v'(0) = v.$$

The result then follows by the inverse function theorem (1.16). ∎

A subset $S$ of a vector space is starshaped about 0 if $v \in S$ implies $tv \in S$ for all $0 \le t \le 1$. Then $S$ is a union of radial line segments. If $\mathscr{U}$ and $\tilde{\mathscr{U}}$ are as in the preceding proposition and $\tilde{\mathscr{U}}$ is starshaped about 0, then $\mathscr{U}$ is called a normal neighborhood of $o$ (see Figure 2). Now we show that $\mathscr{U}$ deserves to be called starshaped about $o$.

Figure 2


31. Proposition. If $\mathscr{U}$ is a normal neighborhood of $o \in M$, then for each point $p \in \mathscr{U}$ there is a unique geodesic $\sigma: [0, 1] \to \mathscr{U}$ from $o$ to $p$ in $\mathscr{U}$. Furthermore $\sigma'(0) = \exp_o^{-1}(p) \in \tilde{\mathscr{U}}$.

Proof. By definition $\tilde{\mathscr{U}}$ is a starshaped neighborhood of 0 in $T_o(M)$ such that $\exp_o|\tilde{\mathscr{U}}$ is a diffeomorphism onto $\mathscr{U}$. For $p \in \mathscr{U}$ let $v = \exp_o^{-1}(p) \in \tilde{\mathscr{U}}$. Since $\tilde{\mathscr{U}}$ is starshaped the ray $\rho(t) = tv$ $(0 \le t \le 1)$ lies in $\tilde{\mathscr{U}}$. Thus the geodesic segment $\sigma = \exp_o \circ \rho$ lies in $\mathscr{U}$ and runs from $o$ to $p$. ($\sigma$ is said to be radial.) At the origin of $T_o(M)$, $d\exp_o$ is the canonical isomorphism $T_0(T_o M) \approx T_o(M)$. But $\rho'(0) = v_0$, hence

$$\sigma'(0) = d\exp_o(\rho'(0)) = d\exp_o(v_0) = v.$$

Suppose $\tau: [0, 1] \to \mathscr{U}$ is an arbitrary geodesic in $\mathscr{U}$ from $o$ to $p$. If $w = \tau'(0)$, then the geodesics $t \to \exp_o(tw)$ and $\tau$ have the same initial velocities, hence are equal. The radial segment $t \to tw$ $(0 \le t \le 1)$ does not leave $\tilde{\mathscr{U}}$, for if it does there is a $0 < t_0 < 1$ such that $t_0 w \in \tilde{\mathscr{U}}$ but $\exp_o(t_0 w) \in \mathscr{U} - \tau([0, 1])$. Thus $w \in \tilde{\mathscr{U}}$. But $\exp_o(w) = \tau(1) = p = \exp_o(v)$ and $\exp_o$ is one-to-one on $\tilde{\mathscr{U}}$, so $w = v$. Hence by the uniqueness of geodesics $\tau = \sigma$.

This proof shows that a normal neighborhood $\mathscr{U}$ of $o$ uniquely determines the neighborhood $\tilde{\mathscr{U}}$ in $T_o(M)$.

A broken geodesic is a piecewise smooth curve segment whose smooth subsegments are geodesics. For example, a broken geodesic in $\mathbf{R}^n_\nu$ is just a polygonal curve.

32. Lemma. A semi-Riemannian manifold $M$ is connected if and only if any two points of $M$ can be joined by a broken geodesic.

Proof. Assume $M$ is connected and fix $p \in M$. Let $\mathscr{C}$ be the set of points that can be connected to $p$ by a broken geodesic. For $q \in M$ let $\mathscr{U}$ be a normal neighborhood of $q$. If $q \in \mathscr{C}$ then clearly $\mathscr{U} \subset \mathscr{C}$. But also if $q \in M - \mathscr{C}$ then $\mathscr{U} \subset M - \mathscr{C}$. Thus by connectedness, $M = \mathscr{C}$. The converse is obvious.

On any normal neighborhood $\mathscr{U}$ of $o \in M$ there is a special type of coordinate system that is particularly simple. Let $e_1, \ldots, e_n$ be an orthonormal basis for $T_o(M)$, so $\langle e_i, e_j\rangle = \delta_{ij}\varepsilon_j$. The normal coordinate system $\xi = (x^1, \ldots, x^n)$ determined by $e_1, \ldots, e_n$ assigns to each point $p \in \mathscr{U}$ the vector coordinates relative to $e_1, \ldots, e_n$ of the corresponding point $\exp_o^{-1}(p) \in \tilde{\mathscr{U}} \subset T_o(M)$. In short,

$$\exp_o^{-1}(p) = \sum x^i(p)\,e_i \qquad (p \in \mathscr{U}).$$

Hence if $f^1, \ldots, f^n$ is the dual basis to $e_1, \ldots, e_n$, then $x^i \circ \exp_o = f^i$ on $\tilde{\mathscr{U}}$.


33. Proposition. If $x^1, \ldots, x^n$ is a normal coordinate system at $o \in M$, then for all $i, j, k$

(1) $g_{ij}(o) = \delta_{ij}\varepsilon_j$;

(2) $\Gamma^k_{ij}(o) = 0$.

Proof. With notation as above, if $v \in T_o(M)$ write $v = \sum a^i e_i$. Since $\exp_o(tv) = \gamma_v(t)$,

$$x^i(\gamma_v(t)) = f^i(tv) = tf^i(v) = ta^i.$$

Hence $v = \gamma_v'(0) = \sum a^i\,\partial_i|_o$. Taking $a^i = \delta_{ij}$ shows that $e_j = \partial_j|_o$, and (1) follows. The expression for $x^i \circ \gamma_v$ shows that the geodesic differential equations for $\gamma_v$ reduce to

$$\sum_{i,j} \Gamma^k_{ij}(\gamma_v(t))\,a^i a^j = 0 \quad \text{for all } k.$$

In particular, $\sum \Gamma^k_{ij}(o)\,a^i a^j = 0$ holds for all $a = (a^1, \ldots, a^n) \in \mathbf{R}^n$. For fixed $k$, this expresses the fact that a certain quadratic form on $\mathbf{R}^n$ is identically zero. Hence by polarization the corresponding symmetric bilinear form is identically zero, that is, $\Gamma^k_{ij}(o) = 0$.

Comparison with Lemma 14 shows that at the point $o$ (though in general not elsewhere) the metric tensor and Christoffel symbols of a normal coordinate system are semi-Euclidean. Since tensors are pointwise creatures this fact is surprisingly powerful in computations. For example, suppose a problem involves the covariant differential $DA$ of $A \in \mathfrak{T}^r_s(M)$. For arbitrary coordinates the components of $DA$ (Exercise 2) are rather clumsy, but if for each point $o \in M$ we use normal coordinates at $o$, then

$$A^{i_1 \cdots i_r}_{j_1 \cdots j_s;k}(o) = \frac{\partial}{\partial x^k} A^{i_1 \cdots i_r}_{j_1 \cdots j_s}(o),$$

just as for natural coordinates in semi-Euclidean space.

Near $o$ the formulas in the proposition are at least approximately true: the closer we get to $o$ the more nearly $M$ resembles $T_o(M) \approx \mathbf{R}^n_\nu$. But this approximation cannot be pushed too far. For example, at $o$ the first derivatives of $\Gamma^k_{ij}$ generally do not vanish (although those of $g_{ij}$ do).

34. Example. Exponential Maps for $\mathbf{R}^n_\nu$. According to Example 25 the geodesic with initial velocity $v_p \in T_p(\mathbf{R}^n_\nu)$ is the straight line $t \to p + tv$. Thus the exponential map at $p$ sends $v_p$ to $p + v$ (the tip of the arrow $v_p$). It follows that $\exp_p: T_p(\mathbf{R}^n_\nu) \to \mathbf{R}^n_\nu$ is a diffeomorphism, since it is the composition of the canonical isomorphism $T_p(\mathbf{R}^n_\nu) \approx \mathbf{R}^n_\nu$ and the translation $x \to p + x$. In fact, when the scalar product space $T_p(\mathbf{R}^n_\nu)$ is given its usual metric tensor, both these maps are isometries, so $\exp_p$ is an isometry.


CURVATURE

In the theory of surfaces in $\mathbf{R}^3$ evolving during the late 1700s, a notion of curvature was defined that gives a very reasonable description of the way the surface is shaped in $\mathbf{R}^3$. It was Gauss who showed ("theorema egregium") that this Gaussian curvature is an isometric invariant of the surface itself, independent of the fact that the surface happens to be in 3-space. This theorem led Riemann to his invention of Riemannian geometry, whose dominant feature is the generalization of Gaussian curvature to arbitrary Riemannian manifolds. No significant changes are required in extending to semi-Riemannian manifolds.

Lie derivatives satisfy the identity $L_{[X,Y]} = [L_X, L_Y]$, where as usual the right-hand side means $L_X L_Y - L_Y L_X$. Hence if $[X, Y] = 0$ (as, for example, for coordinate vector fields) then $L_X$ and $L_Y$ commute. By contrast, these results fail in general for the covariant derivative $D_X$. This failure is measured by a tensor field that plays a central role in all differential geometry.

35. Lemma. Let $M$ be a semi-Riemannian manifold with Levi-Civita connection $D$. The function $R: \mathfrak{X}(M)^3 \to \mathfrak{X}(M)$ given by

$$R_{XY}Z = D_{[X,Y]}Z - [D_X, D_Y]Z$$

is a (1,3) tensor field on $M$ called the Riemannian curvature tensor of $M$.

Proof. As on page 37, $R$ can be interpreted as an element of $\mathfrak{T}^1_3(M)$ provided it is $\mathfrak{F}(M)$-multilinear. Since $\mathbf{R}$-linearity is obvious, this amounts to showing we can "factor out functions." For example, since $[X, fY] = (Xf)Y + f[X, Y]$,

$$\begin{aligned}
R_{X,fY}Z &= D_{[X,fY]}Z - D_X D_{fY}Z + D_{fY}D_X Z \\
&= (Xf)\,D_Y Z + fD_{[X,Y]}Z - D_X(fD_Y Z) + fD_Y D_X Z \\
&= (Xf)\,D_Y Z - (Xf)\,D_Y Z + fR_{XY}Z = fR_{XY}Z.
\end{aligned}$$

The bracket operation on vector fields is not a tensor and the covariant derivative is not a tensor, but in the combination above they produce the tensor $R$. The alternative notation $R(X, Y)Z$ for $R_{XY}Z$ is convenient when $X$ and $Y$ are replaced by more complicated expressions. As shown in Chapter 2 the tensor $R$ can be considered as an $\mathbf{R}$-multilinear function on individual tangent vectors. If $x, y \in T_p(M)$ the linear operator

$$R_{xy}: T_p(M) \to T_p(M)$$

sending each $z$ to $R_{xy}z$ is called a curvature operator. The following identities are the symmetries of curvature.

36. Proposition. If $x, y, z, v, w \in T_p(M)$, then

(1) $R_{xy} = -R_{yx}$,

(2) $\langle R_{xy}v, w\rangle = -\langle R_{xy}w, v\rangle$,

(3) $R_{xy}z + R_{yz}x + R_{zx}y = 0$,

(4) $\langle R_{xy}v, w\rangle = \langle R_{vw}x, y\rangle$.

The first two identities show that the curvature tensor contains considerable skew-symmetry. In particular (2) says that curvature operators are skew-adjoint. Equation (3) is called the first Bianchi identity; note that its vectors are cyclically permuted. Symmetry by pairs, (4), will follow from the earlier identities.

Proof. Since both the covariant derivative $D_X$ and the bracket operation on vector fields are local operations, it suffices to work on any neighborhood of the point $p$. Because the identities to be proved are tensor equations, the tangent vectors $x, y, \ldots$ can be extended to vector fields $X, Y, \ldots$ on some neighborhood in any convenient way. In the case at hand, we choose the extensions so that all their brackets are zero. (This is accomplished by taking them to have constant components relative to a coordinate system.) In particular, $R_{XY}Z$ then reduces to $D_Y(D_X Z) - D_X(D_Y Z)$.

(1) Whenever the bracket $[A, B] = AB - BA$ makes sense, it is skew-symmetric in $A$ and $B$. Thus (1) is immediate from the definition of curvature.

(2) By polarization we need only show that $\langle R_{xy}v, v\rangle = 0$. But using (D5) from Theorem 11,

$$\begin{aligned}
\langle R_{XY}V, V\rangle &= \langle D_Y D_X V, V\rangle - \langle D_X D_Y V, V\rangle \\
&= Y\langle D_X V, V\rangle - \langle D_X V, D_Y V\rangle - X\langle D_Y V, V\rangle + \langle D_Y V, D_X V\rangle \\
&= \tfrac{1}{2}YX\langle V, V\rangle - \tfrac{1}{2}XY\langle V, V\rangle = 0,
\end{aligned}$$

since $[X, Y] = 0$.

(3) Suppose $F: \mathfrak{X}(M)^3 \to \mathfrak{X}(M)$ is a function that is merely $\mathbf{R}$-linear, and let $\mathfrak{S}F(X, Y, Z)$ be the sum over the cyclic permutations of $X, Y, Z$:

$$\mathfrak{S}F(X, Y, Z) = F(X, Y, Z) + F(Y, Z, X) + F(Z, X, Y).$$

A cyclic permutation of $X, Y, Z$ leaves $\mathfrak{S}F(X, Y, Z)$ unchanged. Consequently,

$$\mathfrak{S}R_{XY}Z = \mathfrak{S}D_Y D_X Z - \mathfrak{S}D_X D_Y Z = \mathfrak{S}D_X D_Z Y - \mathfrak{S}D_X D_Y Z = \mathfrak{S}D_X[Z, Y] = 0.$$

(4) The proof is a combinatorial exercise. By (3), $\langle \mathfrak{S}R_{YV}X, W\rangle = 0$, where now $\mathfrak{S}$ acts on whichever three vector fields are attached to $R$. Sum over the four cyclic permutations of $Y, V, X, W$, and then expand each $\mathfrak{S}$ to obtain twelve terms. Using (1) and (2), eight of these will cancel in pairs, leaving

$$2\langle R_{XY}V, W\rangle + 2\langle R_{WV}X, Y\rangle = 0.$$

Hence $\langle R_{XY}V, W\rangle = \langle R_{VW}X, Y\rangle$.

The symmetries of the curvature tensor $R$ lead to a less obvious symmetry of its covariant differential $DR$, called the second Bianchi identity. By definition $DR$ is a (1,4) tensor that we interpret as assigning to four vector fields the (vector field) value $(D_Z R)_{XY}V = (D_Z R)(X, Y)V$. As always, this makes sense for individual tangent vectors, so the summands below are linear operators on $T_p(M)$.

37. Proposition (Second Bianchi Identity). If $x, y, z \in T_p(M)$, then

$$(D_z R)(x, y) + (D_x R)(y, z) + (D_y R)(z, x) = 0.$$

Proof. As in the previous proof extend the tangent vectors $x, y, z$ to vector fields $X, Y, Z$ on a neighborhood of $p$. This time we choose more carefully: for a normal coordinate system at $p$, let the extensions have constant components. Thus not only do all brackets vanish identically, but at the point $p$ (where the Christoffel symbols are zero) it follows from the formula in Proposition 13(1) that all nine covariant derivatives involving only $X, Y, Z$ are zero. By the product rule, applying $(D_Z R)(X, Y)$ to an arbitrary vector field $V$ yields

$$D_Z(R(X, Y)V) - R(D_Z X, Y)V - R(X, D_Z Y)V - R(X, Y)(D_Z V).$$

At the point $p$ the two middle terms are zero, hence dropping the now superfluous vector field $V$ we have

$$(D_Z R)(X, Y) = [D_Z, R(X, Y)] = [D_Z, [D_Y, D_X]] \quad \text{at } p.$$

But the Jacobi identity (as one can see by writing it out) is valid here just as for brackets of vector fields. Thus summing the above formula over the cyclic permutations of $X, Y, Z$ gives the required result $\mathfrak{S}(D_Z R)(X, Y) = 0$ at $p$.

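38. Lemma. On the coordinate neighborhood of a coordinate system $x^1, \ldots, x^n$,

$$R_{\partial_k \partial_l}(\partial_j) = \sum_i R^i_{jkl}\,\partial_i,$$

where the components of $R$ are given by

$$R^i_{jkl} = \frac{\partial}{\partial x^l}\Gamma^i_{kj} - \frac{\partial}{\partial x^k}\Gamma^i_{lj} + \sum_m \Gamma^i_{lm}\Gamma^m_{kj} - \sum_m \Gamma^i_{km}\Gamma^m_{lj}.$$

Proof. For coordinate vector fields

$$R_{\partial_k \partial_l}(\partial_j) = D_{\partial_l}(D_{\partial_k}\partial_j) - D_{\partial_k}(D_{\partial_l}\partial_j).$$

The first term on the right-hand side is

$$D_{\partial_l}\Big(\sum_m \Gamma^m_{kj}\,\partial_m\Big) = \sum_m \frac{\partial \Gamma^m_{kj}}{\partial x^l}\,\partial_m + \sum_{m,r} \Gamma^m_{kj}\Gamma^r_{lm}\,\partial_r.$$

Relabeling two pairs of indices gives

$$\sum_i \Big\{\frac{\partial \Gamma^i_{kj}}{\partial x^l} + \sum_m \Gamma^i_{lm}\Gamma^m_{kj}\Big\}\,\partial_i.$$

Then subtracting the corresponding expression with $k$ and $l$ reversed gives the result. ∎

Substituting from Proposition 13(2) into the formula above gives an explicit formula for curvature in terms of the metric tensor. Even in simple cases such computations are tedious and give minimum information. To compute the curvature of a given manifold $M$ the practical way is to use theoretical results to exploit the distinctive features of $M$. Many examples of this general approach will appear later on.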

SECTIONAL CURVATURE

The Riemannian curvature tensor $R$ is fairly complicated; we now consider a simpler real-valued function which completely determines $R$. A two-dimensional subspace $\Pi$ of the tangent space $T_p(M)$ is called a tangent plane to $M$ at $p$. For tangent vectors $v, w$ define

$$Q(v, w) = \langle v, v\rangle\langle w, w\rangle - \langle v, w\rangle^2.$$

By Lemma 2.19 a tangent plane $\Pi$ is nondegenerate if and only if $Q(v, w) \ne 0$ for one (hence every) basis $v, w$ for $\Pi$. The absolute value $|Q(v, w)|$ is the square of the area of the parallelogram with sides $v$ and $w$. $Q(v, w)$ is positive if $g|\Pi$ is definite, negative if it is indefinite (use an orthonormal basis).

39. Lemma. Let $\Pi$ be a nondegenerate tangent plane to $M$ at $p$. The number

$$K(v,w) = \langle R_{vw}v, w\rangle / Q(v,w)$$

is independent of the choice of basis $v, w$ for $\Pi$, and is called the sectional curvature $K(\Pi)$ of $\Pi$.


Proof. Any two bases for $\Pi$ are related by equations

$$v = ax + by, \qquad w = cx + dy,$$

where the determinant of coefficients $ad - bc$ is not zero. A direct computation shows that

$$\langle R_{vw}v, w\rangle = (ad - bc)^2\langle R_{xy}x, y\rangle,$$

and

$$Q(v,w) = (ad - bc)^2\,Q(x,y).$$
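The second identity in the proof can be checked with a short determinant computation. The following is only a sketch in the notation of the proof; the Gram matrix symbol $G(\cdot,\cdot)$ and the coefficient matrix $P$ are names introduced here for illustration.

```latex
% G(x,y): Gram matrix of the pair x, y;  P: matrix of coefficients (illustrative names).
\[
G(x,y) = \begin{pmatrix} \langle x,x\rangle & \langle x,y\rangle \\ \langle y,x\rangle & \langle y,y\rangle \end{pmatrix},
\qquad
P = \begin{pmatrix} a & b \\ c & d \end{pmatrix},
\qquad
Q(x,y) = \det G(x,y).
\]
% Bilinearity of the scalar product gives G(v,w) = P G(x,y) P^T for v = ax + by, w = cx + dy.
\[
Q(v,w) = \det\bigl(P\,G(x,y)\,P^{\mathsf T}\bigr) = (\det P)^2\,\det G(x,y) = (ad-bc)^2\,Q(x,y).
\]
% The curvature identity follows the same pattern: skew-symmetry in the subscripts gives
% R_{vw} = (ad - bc) R_{xy}, and skew-adjointness of R_{xy} gives
\[
\langle R_{xy}(ax+by),\, cx+dy\rangle = ad\,\langle R_{xy}x,y\rangle + bc\,\langle R_{xy}y,x\rangle = (ad-bc)\,\langle R_{xy}x,y\rangle .
\]
```

Both numerator and denominator of $K$ therefore pick up the same factor $(ad-bc)^2$, which is why the quotient depends only on the plane.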

Thus the sectional curvature $K$ of $M$ is a real-valued function on the set of all nondegenerate tangent planes to $M$. By definition, $R$ determines $K$; to show that $K$ determines $R$, a technicality about indefinite scalar products is needed.

40. Lemma. Given vectors $v, w$ in a scalar product space, there exist vectors $\bar v$ and $\bar w$, arbitrarily close to $v$ and $w$, respectively, that span a nondegenerate plane.

Proof. We can assume that $v$ and $w$ are linearly independent since any pair of vectors can be approximated by independent vectors. Obviously the plane spanned by $v$ and $w$ can be assumed to be degenerate, hence the scalar product is indefinite. If $v$ is null, let $x$ be a vector such that $\langle v, x\rangle \neq 0$; if $v$ is nonnull, pick $x \neq 0$ of opposite causal character. In both cases $Q(v,x) < 0$. It suffices now to show that for all sufficiently small $\delta \neq 0$ the vectors $v$ and $w + \delta x$ span a nondegenerate plane. Expansion of $Q(v, w + \delta x)$ gives an expression of the form

$$2\delta b + \delta^2 Q(v,x).$$

If $b \neq 0$, this will be nonzero since $\delta$ small dominates $\delta^2$. If $b = 0$, it is nonzero since $Q(v,x) < 0$.

41. Proposition. If $K = 0$ at $p \in M$, then $R = 0$ at $p$.

Explicitly, if $K(\Pi) = 0$ for every nondegenerate plane in $T_p(M)$, then $R_{xy}z = 0$ for all $x, y, z$ in $T_p(M)$.

Proof. (1) $\langle R_{vw}v, w\rangle = 0$ for all $v, w \in T_p(M)$.
If $v$ and $w$ span a nondegenerate plane, then $\langle R_{vw}v, w\rangle = 0$. But by the lemma any pair of vectors is a limit of such vectors. Since $\langle R_{vw}x, y\rangle$ is multilinear it is continuous on $T_p(M)^4$, so (1) is true.

(2) $R_{vw}v = 0$ for all $v, w \in T_p(M)$.
For arbitrary $x$, polarize thus:

$$\langle R_{v,w+x}v, w + x\rangle = \langle R_{vw}v, w\rangle + \langle R_{vx}v, w\rangle + \langle R_{vw}v, x\rangle + \langle R_{vx}v, x\rangle.$$

Three of these terms vanish by (1). Symmetry by pairs asserts that the remaining two are equal, so $\langle R_{vw}v, x\rangle = 0$ for all $x$.

(3) $R_{vw}x = R_{wx}v$ for all $v, w, x \in T_p(M)$.
Polarize a second time:

$$R_{v+x,w}(v + x) = R_{vw}v + R_{xw}v + R_{vw}x + R_{xw}x.$$

Again three terms vanish, by (2); hence (3) follows by the skew-symmetry of $R$ in its subscripts.

According to (3), $R_{vw}x$ is unchanged by a cyclic permutation of the vectors $v, w, x$. Thus the first Bianchi identity implies $R_{vw}x = 0$ for all $v, w, x$; so $R = 0$ at $p$.

A semi-Riemannian manifold $M$ for which the curvature tensor $R$ is zero at every point is said to be flat. By the proposition, $M$ is flat if and only if the sectional curvature function $K$ is identically zero. For example, every semi-Euclidean space $\mathbf{R}^n_\nu$ is flat: for natural coordinates the Christoffel symbols all vanish, hence $R = 0$ by Lemma 38.

The preceding proof actually shows somewhat more. Let us say that a multilinear function $F\colon T_p(M)^4 \to \mathbf{R}$ is curvaturelike provided $F$ has the symmetries stated in Proposition 36 for the function $(v, w, x, y) \to \langle R_{vw}x, y\rangle$. The preceding proof used only these abstract properties; thus $F(v, w, v, w) = 0$ for all $v, w \in T_p(M)$ spanning a nondegenerate plane implies $F = 0$. It follows that $K$ determines $R$ in this sense:

42. Corollary. Let $F$ be a curvaturelike function on $T_p(M)$ such that

$$K(v,w) = \frac{F(v,w,v,w)}{Q(v,w)}$$

whenever $v$ and $w$ span a nondegenerate plane. Then $\langle R_{vw}x, y\rangle = F(v, w, x, y)$ for all $v, w, x, y$ in $T_p(M)$.

Proof. The difference function $\Delta(v,w,x,y) = F(v,w,x,y) - \langle R_{vw}x, y\rangle$ is also curvaturelike. By hypothesis, $\Delta(v,w,v,w) = 0$ if $v$ and $w$ span a nondegenerate plane. Thus by the remark preceding this corollary, $\Delta = 0$.

A semi-Riemannian manifold $M$ has constant curvature if its sectional curvature function is constant. In Chapter 4 we shall find many such manifolds, for example, the sphere $S^n(r)$.


The preceding corollary leads to a simple formula for $R$ when $K$ is constant.

43. Corollary. If $M$ has constant curvature $C$, then

$$R_{xy}z = C\{\langle z, x\rangle y - \langle z, y\rangle x\}.$$

Proof. A routine computation shows that the formula

$$F(x, y, v, w) = C\{\langle v, x\rangle\langle y, w\rangle - \langle v, y\rangle\langle x, w\rangle\}$$

defines a curvaturelike function at each point, and $F(x, y, x, y) = C\,Q(x, y)$. So if $x$ and $y$ span a nondegenerate plane,

$$K(x,y) = C = \frac{F(x,y,x,y)}{Q(x,y)},$$

and the result follows by Corollary 42.

In the Riemannian case, this curvature formula has a simple geometric meaning: if $x, y$ is an orthonormal basis for a plane $\Pi$, then $R_{xy}$ is zero on $\Pi^\perp$, and on $\Pi$ is the rotation sending $x$ to $y$ and $y$ to $-x$, followed by scalar multiplication by $C$.

SEMI-RIEMANNIAN SURFACES

Let $M$ be a semi-Riemannian surface, that is, a semi-Riemannian manifold of dimension 2. For a coordinate system $u, v$ in $M$ the components of the metric tensor are traditionally denoted by

$$E = g_{11} = \langle\partial_u, \partial_u\rangle, \qquad F = g_{12} = g_{21} = \langle\partial_u, \partial_v\rangle, \qquad G = g_{22} = \langle\partial_v, \partial_v\rangle,$$

where for indexing purposes, $u = u^1$ and $v = u^2$. The line element is thus

$$ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2,$$

and $Q = Q(\partial_u, \partial_v) = EG - F^2$. Then by Proposition 13 or by differentiation of $E, F, G$ the Christoffel symbols are as follows:
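As a cross-check, these symbols can be computed directly. The sketch below assumes that Proposition 13(2) is the usual formula for Christoffel symbols in terms of the metric components, and writes partial derivatives as subscripts $u, v$; it is a reconstruction from that formula, not a quotation.

```latex
% Assumes Proposition 13(2) is the standard formula
%   \Gamma^k_{ij} = \tfrac12 \sum_m g^{km}(\partial_i g_{jm} + \partial_j g_{im} - \partial_m g_{ij}),
% with inverse metric g^{11} = G/Q, g^{12} = g^{21} = -F/Q, g^{22} = E/Q, where Q = EG - F^2.
\begin{align*}
\Gamma^1_{11} &= \frac{G E_u - 2F F_u + F E_v}{2Q}, &
\Gamma^2_{11} &= \frac{2E F_u - E E_v - F E_u}{2Q},\\[4pt]
\Gamma^1_{12} = \Gamma^1_{21} &= \frac{G E_v - F G_u}{2Q}, &
\Gamma^2_{12} = \Gamma^2_{21} &= \frac{E G_u - F E_v}{2Q},\\[4pt]
\Gamma^1_{22} &= \frac{2G F_v - G G_u - F G_v}{2Q}, &
\Gamma^2_{22} &= \frac{E G_v - 2F F_v + F G_u}{2Q}.
\end{align*}
```

Setting $F = 0$ (so $Q = EG$) collapses each entry to a single quotient such as $\Gamma^1_{11} = E_u/2E$ and $\Gamma^2_{11} = -E_v/2G$.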


Coordinate geodesic equations then follow by substitution in

$$u'' + \Gamma^1_{11}u'^2 + 2\Gamma^1_{12}u'v' + \Gamma^1_{22}v'^2 = 0, \qquad v'' + \Gamma^2_{11}u'^2 + 2\Gamma^2_{12}u'v' + \Gamma^2_{22}v'^2 = 0.$$

Since $M$ is two-dimensional, $T_p(M)$ is the only tangent plane at $p$. Thus the sectional curvature $K$ becomes a real-valued function on $M$, called the Gaussian curvature of $M$. General formulas for $K$ are complicated so we consider a useful special case.

44. Proposition. Let $u, v$ be an orthogonal coordinate system in a semi-Riemannian surface, so $F = \langle\partial_u, \partial_v\rangle = 0$.

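(1)
$$D_{\partial_u}\partial_u = \frac{E_u}{2E}\,\partial_u - \frac{E_v}{2G}\,\partial_v, \qquad D_{\partial_u}\partial_v = D_{\partial_v}\partial_u = \frac{E_v}{2E}\,\partial_u + \frac{G_u}{2G}\,\partial_v, \qquad D_{\partial_v}\partial_v = -\frac{G_u}{2E}\,\partial_u + \frac{G_v}{2G}\,\partial_v.$$

(2) Let $e = |E|^{1/2}$ and $g = |G|^{1/2}$, and let $\varepsilon_1 = \pm 1$ be the sign of $E$ and $\varepsilon_2$ the sign of $G$. Then

$$K = -\frac{1}{eg}\left[\varepsilon_1\left(\frac{g_u}{e}\right)_u + \varepsilon_2\left(\frac{e_v}{g}\right)_v\right].$$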
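As a quick sanity check of the curvature formula in (2), taken here in the form $K = -\frac{1}{eg}\bigl[\varepsilon_1 (g_u/e)_u + \varepsilon_2 (e_v/g)_v\bigr]$, the sketch below evaluates it for the two line elements of Exercise 8. Both metrics are Riemannian, so $\varepsilon_1 = \varepsilon_2 = 1$; the coordinate names on the sphere are those of Exercise 8(b).

```latex
% Poincare half-plane (Exercise 8(a)): ds^2 = (du^2 + dv^2)/v^2 on v > 0.
\begin{align*}
E = G = v^{-2}, \quad e = g = v^{-1}: \qquad
& g_u = 0, \qquad \frac{e_v}{g} = \frac{-v^{-2}}{v^{-1}} = -\frac{1}{v}, \qquad \Big(\frac{e_v}{g}\Big)_v = \frac{1}{v^2},\\
& K = -\frac{1}{eg}\cdot\frac{1}{v^2} = -v^2\cdot\frac{1}{v^2} = -1.
\end{align*}
% Unit sphere (Exercise 8(b)) with u = \vartheta, v = \varphi, 0 < \vartheta < \pi:
% ds^2 = d\vartheta^2 + \sin^2\vartheta \, d\varphi^2.
\begin{align*}
E = 1, \quad G = \sin^2\vartheta, \quad e = 1, \quad g = \sin\vartheta: \qquad
& e_v = 0, \qquad \Big(\frac{g_u}{e}\Big)_u = (\cos\vartheta)_\vartheta = -\sin\vartheta,\\
& K = -\frac{1}{\sin\vartheta}\,(-\sin\vartheta) = +1.
\end{align*}
```

In both cases only one of the two bracketed terms survives, which is typical when the metric coefficients depend on a single coordinate.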

Proof. (1) Set $F = 0$ in the formulas above for the Christoffel symbols.

(2) By definition, $K = \langle R_{\partial_u\partial_v}(\partial_u), \partial_v\rangle / EG$. In computing the numerator, use

$$\langle D_{\partial_u}D_{\partial_v}\partial_u, \partial_v\rangle = \partial_u\langle D_{\partial_v}\partial_u, \partial_v\rangle - \langle D_{\partial_v}\partial_u, D_{\partial_u}\partial_v\rangle,$$

which by (1) becomes $G_{uu}/2 - (E_v)^2/4E - (G_u)^2/4G$. There is one other analogous term. Then the curvature formula can be verified by calculating its right-hand side.

For some examples, see Exercise 8.

TYPE-CHANGING AND METRIC CONTRACTION

In tensor language, Proposition 10 asserts that for a semi-Riemannian manifold $M$ there is a natural $\mathfrak{F}(M)$-linear isomorphism $\mathfrak{T}^1_0(M) \approx \mathfrak{T}^0_1(M)$. This isomorphism is readily extended to higher types as follows. Fix integers $1 \le a \le r$ and $1 \le b \le s$. If $A \in \mathfrak{T}^r_s(M)$ then the value of $\downarrow^a_b A \in \mathfrak{T}^{r-1}_{s+1}(M)$ on arbitrary one-forms and vector fields is defined by

$$(\downarrow^a_b A)(\theta^1, \ldots, \theta^{r-1}, X_1, \ldots, X_{s+1}) = A(\theta^1, \ldots, X_b^*, \ldots, \theta^{r-1}, X_1, \ldots, X_{b-1}, X_{b+1}, \ldots, X_{s+1}),$$

with $X_b^*$ in the $a$th slot, where $X_b^*$ is the one-form metrically equivalent to $X_b$. Thus on the right-hand side we extract the $b$th vector field and insert its metrically equivalent one-form in the $a$th slot among the one-forms.

For example, let $A$ be a $(2,2)$ tensor field. Then $B = \downarrow^1_2 A$ is the $(1,3)$ tensor field such that $B(\theta, X, Y, Z) = A(Y^*, \theta, X, Z)$ for all one-forms $\theta$ and vector fields $X, Y, Z$. In coordinate terms, the one-form dual to $\partial_i$ is $\sum_j g_{ij}\,dx^j$. Hence

$$B^i_{jkl} = B(dx^i, \partial_j, \partial_k, \partial_l) = A\Big(\sum_m g_{km}\,dx^m, dx^i, \partial_j, \partial_l\Big) = \sum_m g_{km}A^{mi}_{jl}.$$

Thus $\downarrow^1_2$ uses the metric tensor to turn first superscripts into second subscripts. The operation $\downarrow^a_b\colon \mathfrak{T}^r_s(M) \to \mathfrak{T}^{r-1}_{s+1}(M)$ is known classically as lowering an index. It is clearly $\mathfrak{F}(M)$-linear, and is in fact an isomorphism, since there is an inverse operation $\uparrow^a_b$ which, with notation essentially as above, extracts the $a$th one-form and inserts its metrically equivalent vector field in the $b$th slot among the vector fields. In coordinates, the vector field metrically equivalent to $dx^i$ is $\sum_j g^{ij}\partial_j$. If $B$ is a $(1,3)$ tensor, then

$$(\uparrow^1_2 B)^{ij}_{kl} = \sum_q g^{iq}B^j_{kql},$$

with the metric tensor turning the second subscript into the first superscript. Thus the operation $\uparrow^a_b$ is classically called raising an index. As an example, in coordinate terms, of the inverse nature of the two operations,

$$(\uparrow^1_2\downarrow^1_2 A)^{ij}_{kl} = \sum_p g^{ip}(\downarrow^1_2 A)^j_{kpl} = \sum_{p,m} g^{ip}g_{pm}A^{mj}_{kl} = \sum_m \delta^i_m A^{mj}_{kl} = A^{ij}_{kl}.$$
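To see the index operations act concretely, here is a small sketch on the Minkowski plane $\mathbf{R}^2_1$ with natural coordinates, assuming metric components $g_{11} = -1$, $g_{22} = 1$, $g_{12} = 0$ (so the inverse matrix $(g^{ij})$ is the same diagonal matrix). The $(1,1)$ tensor $A$ and its component matrix are hypothetical, chosen only for illustration; rows are indexed by the first index.

```latex
% Lowering the superscript of a (1,1) tensor A into the first covariant slot:
%   (\downarrow^1_1 A)_{ij} = \sum_m g_{im} A^m_j .
\[
(A^i_j) = \begin{pmatrix} A^1_1 & A^1_2 \\ A^2_1 & A^2_2 \end{pmatrix}
\;\longmapsto\;
\bigl((\downarrow^1_1 A)_{ij}\bigr)
= \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}
  \begin{pmatrix} A^1_1 & A^1_2 \\ A^2_1 & A^2_2 \end{pmatrix}
= \begin{pmatrix} -A^1_1 & -A^1_2 \\ A^2_1 & A^2_2 \end{pmatrix}.
\]
% Raising again multiplies by (g^{ij}), here the same diagonal matrix, so the signs flip back:
\[
(\uparrow^1_1\downarrow^1_1 A)^i_j = \sum_{p,m} g^{ip}g_{pm}A^m_j = \sum_m \delta^i_m A^m_j = A^i_j .
\]
```

In an indefinite metric, then, type-changing is not merely a relabeling of components: entries along timelike directions change sign.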

Type-changing of tensors is so natural that it is apt to occur in practice without even being noticed. An important case is that of a $(1, s)$ tensor $A$ given as an $\mathfrak{F}(M)$-multilinear function $A\colon \mathfrak{X}(M)^s \to \mathfrak{X}(M)$. It is then particularly simple to lower the unique contravariant slot to, say, the first covariant position:

$$(\downarrow^1_1 A)(V, X_1, \ldots, X_s) = \langle V, A(X_1, \ldots, X_s)\rangle.$$

In fact, the left-hand side is by definition $A(V^*, X_1, \ldots, X_s)$, which by the interpretation on page 37 is $V^*(A(X_1, \ldots, X_s)) = \langle V, A(X_1, \ldots, X_s)\rangle$.


All tensors obtained from a given tensor by the raising and lowering operations are said to be metrically equivalent. They all contain the same information, and hence can be viewed as different manifestations of a single object.

The classical coordinatized version of multidimensional differential geometry was developed long before the invariant version, and in harmonizing the two approaches one point requires care. When the curvature tensor $R\colon \mathfrak{X}(M)^3 \to \mathfrak{X}(M)$ is written in the ordinary way as a function of three vector fields, the classical index pattern in Lemma 38 demands $R(Z, X, Y) = R_{XY}Z$. The components of the $(0,4)$ tensor $\downarrow^1_1 R$ are then given by

$$R_{ijkl} = (\downarrow^1_1 R)(\partial_i, \partial_j, \partial_k, \partial_l) = \langle\partial_i, R_{\partial_k\partial_l}(\partial_j)\rangle = \sum_m g_{im}R^m_{jkl}.$$

This is the usual way to lower the contravariant index of $R$, an agreement expressed classically by writing $R^i{}_{jkl}$ for $R^i_{jkl}$.

On a smooth manifold, contraction operates on one contravariant and one covariant slot to reduce an $(r, s)$ tensor to an $(r - 1, s - 1)$ tensor. But on a semi-Riemannian manifold we can metrically contract two covariant indices by first raising either one of them and then contracting in the usual way. Thus for $1 \le a < b \le s$ and arbitrary $r$ we get a metric contraction $C_{ab}\colon \mathfrak{T}^r_s(M) \to \mathfrak{T}^r_{s-2}(M)$ acting on the $a$th and $b$th covariant positions. For example if $A$ is a $(1,3)$ tensor, then

$$(C_{12}A)^i_j = \sum_{p,q} g^{pq}A^i_{pqj}.$$

Similarly, in the contravariant case, for $1 \le a < b \le r$ and arbitrary $s$, we get $C^{ab}\colon \mathfrak{T}^r_s(M) \to \mathfrak{T}^{r-2}_s(M)$ with a coordinate formula reversing covariant and contravariant indices above (so $g_{ij}$ replaces $g^{ij}$). When specific indices are not important, all contractions will be denoted by $C$.

45. Lemma. Covariant derivatives $D_V$ and the covariant differential $D$ commute with both type-changing and contraction.

Proof. Since a raising operation is the inverse of a lowering, it suffices to consider the latter. By a simple permutation argument we need only consider $\downarrow^1_1$. The coordinate expression for $\downarrow^1_1 A$ shows that this tensor is the ordinary contraction $C^1_1$ applied to $g \otimes A$. As a tensor derivation, $D_V$ commutes with ordinary contraction. Since the metric tensor is parallel,

$$D_V(\downarrow^1_1 A) = D_V(C^1_1(g \otimes A)) = C^1_1(g \otimes D_V A) = \downarrow^1_1(D_V A).$$

Hence $D_V$ also commutes with metric contraction. Formal computations then give the corresponding results for $D$.

FRAME FIELDS

An orthonormal basis for a tangent space $T_p(M)$ is called a frame on $M$ at $p$. If $n = \dim M$ then a set $E_1, \ldots, E_n$ of $n$ mutually orthogonal unit vector fields is called a frame field, since it assigns a frame at each point. For example, on $\mathbf{R}^n_\nu$ the natural coordinate vector fields form a frame field. In general there may not be a frame field on all of $M$, but we shall see in a moment that they always exist locally. By orthonormal expansion (2.25) any vector field $V$ can be expressed in terms of a frame field as

$$V = \sum_i \varepsilon_i\langle V, E_i\rangle E_i, \qquad \text{where } \varepsilon_i = \langle E_i, E_i\rangle.$$

Thus

$$\langle V, W\rangle = \sum_i \varepsilon_i\langle V, E_i\rangle\langle W, E_i\rangle.$$
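The signs $\varepsilon_i$ are easy to overlook, so here is a small sketch in the Minkowski plane $\mathbf{R}^2_1$. It assumes the natural frame field $E_1 = \partial_1$ (timelike) and $E_2 = \partial_2$ (spacelike); the component functions $a, b$ are hypothetical names for the components of an arbitrary vector field.

```latex
% Frame: <E_1,E_1> = -1, <E_2,E_2> = +1, <E_1,E_2> = 0, so \varepsilon_1 = -1, \varepsilon_2 = +1.
\[
V = aE_1 + bE_2: \qquad \langle V, E_1\rangle = -a, \qquad \langle V, E_2\rangle = b .
\]
% Orthonormal expansion recovers V only because of the factor \varepsilon_i:
\[
\sum_i \varepsilon_i\langle V, E_i\rangle E_i = (-1)(-a)\,E_1 + (+1)(b)\,E_2 = aE_1 + bE_2 = V .
\]
% The scalar product formula with W = V:
\[
\langle V, V\rangle = \sum_i \varepsilon_i\langle V, E_i\rangle^2 = -a^2 + b^2 .
\]
```

Dropping $\varepsilon_1$ would return $-aE_1 + bE_2$ instead of $V$, which is the typical error when Riemannian formulas are copied into the indefinite setting.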

At the origin $o$ of a normal coordinate system the coordinate vectors are orthonormal. It follows that as long as only pointwise operations are involved, frame field formulas are (simpler) consequences of corresponding coordinate formulas. For example, consider the metric contraction $C_{ab}$ of $A \in \mathfrak{T}^0_s(M)$. Relative to a frame field,

$$(C_{ab}A)(X_1, \ldots, X_{s-2}) = \sum_m \varepsilon_m A(X_1, \ldots, E_m, \ldots, E_m, \ldots, X_{s-2}),$$

with $E_m$ in the $a$th and $b$th slots. To prove this tensor equation it suffices to work at a single point $o$, origin of normal coordinates such that $\partial_i|_o = E_i|_o$. By multilinearity it suffices to let the $X_i$'s be the coordinate vector fields $\partial_i$. But then the formula follows from the coordinate formula for $C_{ab}$, since at the point $o$ both $g_{ij}$ and $g^{ij}$ become $\delta_{ij}\varepsilon_j$.

Similarly, for a $(1, s)$ tensor field $A\colon \mathfrak{X}(M)^s \to \mathfrak{X}(M)$,

$$(C^1_b A)(X_1, \ldots, X_{s-1}) = \sum_m \varepsilon_m\langle E_m, A(X_1, \ldots, E_m, \ldots, X_{s-1})\rangle,$$

with $E_m$ in the $b$th slot.

These remarks apply only to tensor algebra: for tensor calculus the advantage of $\langle E_i, E_j\rangle = \delta_{ij}\varepsilon_j$ over $\langle\partial_i, \partial_j\rangle = g_{ij}$ must be balanced against the disadvantage that, unlike $[\partial_i, \partial_j]$, the brackets $[E_i, E_j]$ are generally not zero.

A frame field on a curve $\alpha\colon I \to M$ is a set of mutually orthogonal unit vector fields $E_1, \ldots, E_n$ on $\alpha$. Not only can such a frame field be defined on the entire curve, but we can choose the vector fields $E_i \in \mathfrak{X}(\alpha)$ to be parallel.

46. Corollary. If $\alpha\colon I \to M$ is a curve and $e_1, \ldots, e_n$ is a frame at $\alpha(0)$, then there is a unique parallel frame field $E_1, \ldots, E_n$ on $\alpha$ such that $E_i(0) = e_i$ for $1 \le i \le n$.

Proof. By Proposition 19 there is a unique parallel vector field $E_i$ on $\alpha$ such that $E_i(0) = e_i$. But since parallel translation to any $t \in I$ is a linear isometry, $E_1, \ldots, E_n$ is in fact a (parallel) frame field.

The triple advantages of orthonormality, parallelism, and global definition give the use of parallel frame fields on a curve a decisive superiority over coordinate methods. It follows that on $M$ frame fields exist locally: given any frame $e_1, \ldots, e_n$ in a tangent space $T_o(M)$, choose a normal neighborhood $\mathcal{U}$ of $o$ and extend the frame to a frame field $E_1, \ldots, E_n$ on $\mathcal{U}$ by parallel translation along radial geodesics. Differential equations theory guarantees that the vector fields $E_i$ are smooth.

SOME DIFFERENTIAL OPERATORS

On a semi-Riemannian manifold $M$ there are natural generalizations of the well-known differential operators of vector calculus on $\mathbf{R}^3$: gradient, divergence, and Laplacian.

47. Definition. The gradient $\operatorname{grad} f$ of a function $f \in \mathfrak{F}(M)$ is the vector field metrically equivalent to the differential $df \in \mathfrak{X}^*(M)$. Thus

$$\langle\operatorname{grad} f, X\rangle = df(X) = Xf \qquad \text{for all } X \in \mathfrak{X}(M).$$

In terms of a coordinate system $df = \sum_i (\partial f/\partial x^i)\,dx^i$, hence

$$\operatorname{grad} f = \sum_{i,j} g^{ij}\frac{\partial f}{\partial x^i}\,\partial_j.$$

In particular, for natural coordinates on semi-Euclidean space we have $\operatorname{grad} f = \sum_i \varepsilon_i(\partial f/\partial u^i)\,\partial_i$, which reduces to the usual formula on $\mathbf{R}^3$.
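In Minkowski space $\mathbf{R}^n_1$, for example, the formula changes the sign of the time component. The following sketch assumes natural coordinates $u^1, \ldots, u^n$ with $\varepsilon_1 = -1$ and $\varepsilon_i = +1$ for $i \ge 2$; the function $f$ and the symbol $P$ for the position vector field are choices made here for illustration.

```latex
% Gradient in natural coordinates on R^n_1 (index 1 is the timelike direction).
\[
\operatorname{grad} f = -\frac{\partial f}{\partial u^1}\,\partial_1 + \sum_{i=2}^{n}\frac{\partial f}{\partial u^i}\,\partial_i .
\]
% Example: f = (1/2) \sum_i \varepsilon_i (u^i)^2 = (1/2) <P, P>, where P = \sum_i u^i \partial_i.
\[
\frac{\partial f}{\partial u^i} = \varepsilon_i u^i
\quad\Longrightarrow\quad
\operatorname{grad} f = \sum_i \varepsilon_i(\varepsilon_i u^i)\,\partial_i = \sum_i u^i\,\partial_i = P .
\]
```

The two sign factors cancel, so the gradient of half the squared "length" of the position vector is the position vector itself, exactly as in Euclidean space.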


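For a tensor $A$ the contraction of the new covariant slot in its covariant differential $DA$ with one of its original slots is called a divergence $\operatorname{div} A$ of $A$. We mostly use two special cases where there is a unique divergence:

(1) If $V$ is a vector field, then $\operatorname{div} V = C(DV) \in \mathfrak{F}(M)$. Thus for a frame field

$$\operatorname{div} V = \sum_i \varepsilon_i\langle D_{E_i}V, E_i\rangle,$$

and for a coordinate system

$$\operatorname{div} V = \sum_i V^i{}_{;i} = \sum_i\Big\{\frac{\partial V^i}{\partial x^i} + \sum_j \Gamma^i_{ij}V^j\Big\}.$$

Hence for natural coordinates on $\mathbf{R}^n_\nu$, $\operatorname{div} V = \sum_i \partial V^i/\partial u^i$, which on $\mathbf{R}^3$ is the usual formula.

(2) If $A$ is a symmetric $(0,2)$ tensor, then $\operatorname{div} A = C_{13}(DA) = C_{23}(DA) \in \mathfrak{X}^*(M)$. For a frame field, $(\operatorname{div} A)(X) = \sum_i \varepsilon_i(D_{E_i}A)(X, E_i)$, while for coordinates

$$(\operatorname{div} A)_i = \sum_{r,s} g^{rs}A_{ri;s} = \sum_s A_i{}^s{}_{;s}.$$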

48. Definition. The Hessian of a function $f \in \mathfrak{F}(M)$ is its second covariant differential $H^f = D(Df)$.

49. Lemma. The Hessian $H^f$ of $f$ is the symmetric $(0,2)$ tensor field such that

$$H^f(X, Y) = XYf - (D_X Y)f = \langle D_X(\operatorname{grad} f), Y\rangle.$$

Proof. Since $Df = df$,

$$H^f(X, Y) = D(df)(X, Y) = D_Y(df)(X) = Y(df(X)) - df(D_Y X) = YXf - (D_Y X)f.$$

Because $XY - YX = [X, Y] = D_X Y - D_Y X$, we can reverse $X$ and $Y$ in the preceding formula, showing also that $H^f$ is symmetric. Finally,

$$\langle D_X(\operatorname{grad} f), Y\rangle = X\langle\operatorname{grad} f, Y\rangle - \langle\operatorname{grad} f, D_X Y\rangle = XYf - (D_X Y)f = H^f(X, Y).$$

50. Definition. The Laplacian $\Delta f$ of a function $f \in \mathfrak{F}(M)$ is the divergence of its gradient: $\Delta f = \operatorname{div}(\operatorname{grad} f) \in \mathfrak{F}(M)$.

Since the covariant differential commutes with type-changing, it follows that the Laplacian of $f$ is the contraction of its Hessian. In fact,

$$\Delta f = \operatorname{div}(\operatorname{grad} f) = CD(\operatorname{grad} f) = CD(\uparrow^1_1 df) = C\uparrow^1_1 D\,df = (C\uparrow^1_1)H^f = C_{12}(H^f).$$


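For a coordinate system the components of the Hessian can be read from the lemma above. Thus the Laplacian of $f$ has coordinate expression

$$\Delta f = \sum_{i,j} g^{ij}\Big\{\frac{\partial^2 f}{\partial x^i\,\partial x^j} - \sum_k \Gamma^k_{ij}\frac{\partial f}{\partial x^k}\Big\}.$$

For natural coordinates in $\mathbf{R}^n_\nu$ the components of $H^f$ are just the second partials $\partial^2 f/\partial u^i\,\partial u^j$, and $\Delta f = \sum_i \varepsilon_i\,\partial^2 f/\partial(u^i)^2$, which reduces to the usual formula on $\mathbf{R}^3$. See Exercise 7.5 for another formula for $\Delta f$. (In special contexts, $\Delta$ acquires other names, and it is sometimes defined with opposite sign.)

RICCI AND SCALAR CURVATURE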

Contraction of Riemannian curvature yields simpler invariants.

51. Definition. Let $R$ be the Riemannian curvature tensor of $M$. The Ricci curvature tensor $\operatorname{Ric}$ of $M$ is the contraction $C^1_3(R) \in \mathfrak{T}^0_2(M)$, whose components relative to a coordinate system are $R_{ij} = \sum_m R^m_{ijm}$.

Because of the symmetries of $R$ the only nonzero contractions of $R$ are $\pm\operatorname{Ric}$.

52. Lemma. The Ricci curvature tensor $\operatorname{Ric}$ is symmetric, and is given relative to a frame field by

$$\operatorname{Ric}(X, Y) = \sum_m \varepsilon_m\langle R_{XE_m}Y, E_m\rangle,$$

where as usual $\varepsilon_m = \langle E_m, E_m\rangle$.

Proof. As pointed out earlier, classical indexing demands the notation $R(X, Y, E_m) = R_{YE_m}X$. Thus

$$\operatorname{Ric}(X, Y) = (C^1_3 R)(X, Y) = \sum_m \varepsilon_m\langle E_m, R(X, Y, E_m)\rangle = \sum_m \varepsilon_m\langle R_{YE_m}X, E_m\rangle.$$

Symmetry by pairs then gives the required formula and shows that $\operatorname{Ric}$ is symmetric.

If its Ricci tensor is identically zero, $M$ is said to be Ricci flat. A flat manifold is certainly Ricci flat, but we shall see later that the converse does not hold. Note the trace formula $\operatorname{Ric}(X, Y) = \operatorname{trace}\{V \to R_{XV}Y\}$.

Since sectional curvature determines the curvature tensor $R$ it also determines $\operatorname{Ric}$, and in a rather simple way. By polarization and scalar multiplication, $\operatorname{Ric}$ can be reconstructed at each point $p$ from its values $\operatorname{Ric}(u, u)$ on the unit vectors at $p$. But if $e_1, \ldots, e_n$ is a frame at $p$ such that $u = e_1$, then by the preceding lemma,

$$\operatorname{Ric}(u, u) = \sum_m \varepsilon_m\langle R_{ue_m}u, e_m\rangle = \langle u, u\rangle\sum_{m \ge 2} K(u, e_m).$$

Thus $\operatorname{Ric}(u, u)$ is, but for the sign $\langle u, u\rangle = \pm 1$, the sum of the sectional curvatures of any $n - 1$ orthogonal nondegenerate planes through $u$.

53. Definition. The scalar curvature $S$ of $M$ is the contraction $C(\operatorname{Ric}) \in \mathfrak{F}(M)$ of its Ricci tensor.

In coordinates,

$$S = \sum_{i,j} g^{ij}R_{ij} = \sum_{i,j,k} g^{ij}R^k_{ijk}.$$

Contracting relative to a frame field yields

$$S = \sum_{i \ne j} K(E_i, E_j) = 2\sum_{i < j} K(E_i, E_j).$$
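As an illustration of these frame formulas, suppose $M^n$ has constant sectional curvature $C$ (the situation of Exercise 5). The sketch below uses only the expressions for $\operatorname{Ric}(u,u)$ and $S$ in terms of sectional curvatures, together with polarization.

```latex
% For a unit vector u = e_1 in a frame e_1, ..., e_n, every K(u, e_m) with m >= 2 equals C.
\[
\operatorname{Ric}(u,u) = \langle u,u\rangle\sum_{m=2}^{n} K(u,e_m) = (n-1)\,C\,\langle u,u\rangle .
\]
% Both sides are quadratic in u, so scalar multiplication and polarization extend the identity
% from unit vectors to all pairs of tangent vectors:
\[
\operatorname{Ric} = (n-1)\,C\,g .
\]
% There are n(n-1) ordered pairs (i,j) with i \ne j, each contributing K(E_i,E_j) = C:
\[
S = \sum_{i\neq j} K(E_i,E_j) = n(n-1)\,C .
\]
```

For a surface ($n = 2$) this reads $\operatorname{Ric} = Kg$ and $S = 2K$ pointwise, in agreement with Exercise 6(b).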

The following consequence of the second Bianchi identity (37) is crucial in the foundations of general relativity.

54. Corollary. $dS = 2\operatorname{div}\operatorname{Ric}$.

Proof. To express the second Bianchi identity $\mathfrak{S}(D_Z R)_{XY} = 0$ in terms of coordinates apply it to

$$(D_{\partial_r}R)_{\partial_k\partial_l}(\partial_j) = \sum_i R^i_{jkl;r}\,\partial_i$$

to obtain

$$R^i_{jkl;r} + R^i_{jlr;k} + R^i_{jrk;l} = 0.$$

Reverse $r$ and $k$ in the third term, with change of sign, and contract on $i$ and $r$. Thus

$$\sum_r R^r_{jkl;r} + \sum_r R^r_{jlr;k} - \sum_r R^r_{jkr;l} = 0,$$

which is just

$$\sum_r R^r_{jkl;r} + R_{jl;k} - R_{jk;l} = 0.$$

Now metrically contract on $j$ and $k$:

$$\sum_{j,k,r} g^{jk}R^r_{jkl;r} + \sum_k R^k{}_{l;k} - S_{;l} = 0.$$

In the first term write $R^r_{jkl}$ as $\sum_m g^{rm}R_{mjkl}$ and use the coordinate symmetries of $R$ (Exercise 4) to express this term as $\sum_m R^m{}_{l;m}$. Thus

$$2\sum_m R^m{}_{l;m} = S_{;l}.$$

The left-hand side is the coordinate expression for $2\operatorname{div}(\operatorname{Ric})$; the right-hand side is that of $DS = dS$.
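The role of this corollary in relativity can be sketched in one line. Assuming the identity $\operatorname{div}(fg) = df$ of Exercise 10(c), the symmetric $(0,2)$ tensor $\mathbf{G} = \operatorname{Ric} - \tfrac12 S g$ has zero divergence. The symbol $\mathbf{G}$ is introduced here only for illustration (it is not the surface coefficient $G$); this tensor is usually called the Einstein tensor.

```latex
% Uses Corollary 54 (div Ric = dS/2) and Exercise 10(c) (div(fg) = df with f = S).
\[
\operatorname{div}\mathbf{G}
= \operatorname{div}\operatorname{Ric} - \tfrac12\operatorname{div}(S g)
= \tfrac12\,dS - \tfrac12\,dS
= 0 .
\]
```

A divergence-free curvature tensor of this kind is what allows curvature to be equated with a conserved physical source.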

55. Remark. Curvature Sign. Our definition of the Riemannian curvature tensor is not the only reasonable possibility; one can also change its sign, defining curvature to be $\mathcal{R} = -R$. By contrast the notions of sectional, Ricci, and scalar curvature are inviolate. To convert the formulas of this book from $R$ to $\mathcal{R}$, simply change the sign of Riemannian curvature and its four-index components. Do not change the signs of $K$, $\operatorname{Ric}$, $R_{ij}$, or $S$. Replacing $R$ by $-\mathcal{R}$ will adjust definitions as required. For example,

$$Q(X, Y)K(X, Y) = \langle R_{XY}X, Y\rangle = -\langle\mathcal{R}_{XY}X, Y\rangle = \langle\mathcal{R}_{XY}Y, X\rangle.$$

SEMI-RIEMANNIAN PRODUCT MANIFOLDS

To see how the geometry of a semi-Riemannian product $M \times N$ depends on that of $M$ and $N$, an essential tool is the notion of lift as discussed in Chapter 1. If $X \in \mathfrak{X}(M)$ we will use the same notation for its horizontal lift $X \in \mathfrak{L}(M) \subset \mathfrak{X}(M \times N)$; similarly for the vertical lift $V \in \mathfrak{L}(N)$ of $V \in \mathfrak{X}(N)$.

56. Proposition. If $X, Y \in \mathfrak{L}(M)$ and $V, W \in \mathfrak{L}(N)$, then

(1) $D_X Y$ is the lift of ${}^M D_X Y \in \mathfrak{X}(M)$.
(2) $D_V W$ is the lift of ${}^N D_V W \in \mathfrak{X}(N)$.
(3) $D_V X = 0 = D_X V$.

These assertions can be easily verified using the Koszul formula. (Compare the generalization in Proposition 7.35.)

57. Corollary. (1) A curve $\gamma(s) = (\alpha(s), \beta(s))$ in $M \times N$ is a geodesic if and only if its projections $\alpha$ in $M$ and $\beta$ in $N$ are both geodesics.
(2) $M \times N$ is complete if and only if both $M$ and $N$ are complete.

The first assertion follows from a more general result in Chapter 7; evidently (1) implies (2). Applying Proposition 56 to the definition of curvature gives:

58. Corollary. On $M \times N$ if $X, Y, Z \in \mathfrak{L}(M)$ and $U, V, W \in \mathfrak{L}(N)$, then:

(1) $R_{XY}Z$ is the lift of ${}^M R_{XY}Z$ on $M$.
(2) $R_{VW}U$ is the lift of ${}^N R_{VW}U$ on $N$.
(3) $R$ is zero on any other choices from $X, \ldots, W$.


These tensor results are valid for individual tangent vectors. It follows that the sectional curvature of a nondegenerate horizontal plane is the same as that of its projection into $M$, and analogously for a vertical plane. A nondegenerate plane spanned by a vertical and a horizontal vector has $K = 0$ since $R_{xv} = 0$. Thus there is always some flatness in a semi-Riemannian product manifold.

LOCAL ISOMETRIES

The various features of semi-Riemannian geometry defined so far are isometric invariants: each is, in an appropriate sense, preserved by isometries. This can hardly be doubted since each is constructed, using the tools of manifold theory, from the metric tensor, and an isometry preserves both the tools and the tensor (Definition 6). For example, Levi-Civita connections are preserved in the following sense.

59. Proposition. If $\phi\colon M \to N$ is an isometry, then $d\phi(D_X Y) = D_{d\phi X}(d\phi Y)$ for all $X, Y \in \mathfrak{X}(M)$.

Proof. Recall that the value of the transferred vector field $d\phi(X)$ at a point $\phi(p)$ is $d\phi(X_p)$. Because $\phi$ is a diffeomorphism, for any $p \in M$ there are coordinate systems $\xi = (x^1, \ldots, x^n)$ at $p$ and $\eta = (y^1, \ldots, y^n)$ at $\phi(p)$ that are preserved by $\phi$, that is, $y^i(\phi q) = x^i(q)$ for $q$ near $p$. It follows at once that $\phi$ preserves coordinate vector fields and partial derivatives (as in Exercise 1.11). Then the components of $X$ and $\bar X = d\phi(X)$ are preserved since

$$\bar X^i(\phi q) = \bar X_{\phi q}(y^i) = (d\phi X_q)y^i = X_q(y^i \circ \phi) = X_q(x^i) = X^i(q).$$

Because $\phi$ is also an isometry, metric tensor components are preserved: $\bar g_{ij}(\phi q) = g_{ij}(q)$. Hence the formula in Proposition 13(2) shows that Christoffel symbols are similarly preserved. The result then follows from formula (1) of that proposition.

The local character of this proof suggests that such invariance results can be extended to a broader class of mappings.

60. Definition. A smooth map $\phi\colon M \to N$ of semi-Riemannian manifolds is a local isometry provided each differential map $d\phi_p\colon T_p(M) \to T_{\phi p}(N)$ is a linear isometry.

In view of the inverse function theorem an equivalent formulation, justifying the term local isometry, is this: Each point $p$ of $M$ has a neighborhood $\mathcal{U}$ such that $\phi|\mathcal{U}$ is an isometry of $\mathcal{U}$ onto a neighborhood of $\phi(p)$ in $N$.


61. Example. Let $S^1$ be the unit circle in $\mathbf{R}^2$. The exponential map $\exp\colon \mathbf{R}^1 \to S^1$ wraps the line evenly around the circle via $t \to (\cos t, \sin t)$. Consider $S^1$ as a Riemannian submanifold of $\mathbf{R}^2$; then $\exp$ is a local isometry. (It suffices to check that, as a curve in $\mathbf{R}^2$, $\exp$ has unit speed.)

By the above criterion for local isometries, every object of local (or pointwise) character that is preserved by isometries will automatically be preserved by local isometries $\phi\colon M \to N$. Here are some examples:

(1) The induced covariant derivative on a curve. If $Y$ is a vector field on a curve $\alpha$ in $M$, then $(d\phi Y)' = d\phi(Y')$, where $(d\phi Y)(s) = d\phi(Y(s))$ for all $s$. (The proof is a mild variant of that of Proposition 59.) Hence:

(2) Parallel translation. Let $P$ be parallel translation from $\alpha(a)$ to $\alpha(b)$ along a curve $\alpha$ in $M$. Let $\bar P$ be parallel translation from $\phi\alpha(a)$ to $\phi\alpha(b)$ along $\phi \circ \alpha$ in $N$. Then $d\phi_{\alpha(b)} \circ P = \bar P \circ d\phi_{\alpha(a)}$. Hence:

(3) Geodesics. If $\gamma$ is a geodesic in $M$, then $\phi \circ \gamma$ is a geodesic in $N$. Thus $\phi \circ \gamma_v = \gamma_{d\phi(v)}|I_v$, since both sides are geodesics with the same initial velocity. (The domain of $\gamma_{d\phi(v)}$ may be larger than $I_v$; for example, consider the inclusion map $\phi$ of an open disk into $\mathbf{R}^2$.) Hence:

(4) Exponential maps. $\phi \circ \exp_p = \exp_{\phi p} \circ\, d\phi_p$, wherever the left-hand side, and hence the right-hand side, is defined.

(5) Riemannian curvature tensors. $d\phi(R(x, y)z) = R(d\phi x, d\phi y)(d\phi z)$. (This is immediate from the definition of $R$, since a local isometry locally preserves both brackets and covariant derivatives.) Hence:

(6) Sectional curvature. $K_N(d\phi\,\Pi) = K_M(\Pi)$ for all nondegenerate tangent planes on $M$. Ricci curvature: $\phi^*(\operatorname{Ric}_N) = \operatorname{Ric}_M$. Scalar curvature: $S_N \circ \phi = S_M$. (The latter two use the fact that the pointwise operation of contraction is also preserved.)

A local isometry is uniquely determined by its differential map at a single point.

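62. Proposition. Let $\phi, \psi\colon M \to N$ be local isometries of a connected semi-Riemannian manifold $M$. If there is a point $p \in M$ such that $d\phi_p = d\psi_p$ (hence $\phi(p) = \psi(p)$), then $\phi = \psi$.

Proof. Let $A = \{q \in M : d\phi_q = d\psi_q\}$. By continuity, $A$ is closed in $M$. Since $A$ is nonempty it suffices to show that $A$ is open. We assert that if $q \in A$ then any normal neighborhood $\mathcal{U}$ of $q$ is contained in $A$. If $r \in \mathcal{U}$ there is a vector $v \in T_q(M)$ such that $\gamma_v(1) = \exp_q(v) = r$. Hence, by the preservation of geodesics,

$$\phi(r) = \phi(\gamma_v(1)) = \gamma_{d\phi(v)}(1) = \gamma_{d\psi(v)}(1) = \psi(\gamma_v(1)) = \psi(r).$$

Thus $\phi$ and $\psi$ agree on the open set $\mathcal{U}$, so $d\phi_r = d\psi_r$ for all $r \in \mathcal{U}$; that is, $\mathcal{U} \subset A$.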

A smooth mapping $\psi\colon M \to N$ of semi-Riemannian manifolds is conformal provided $\psi^*(g_N) = h\,g_M$ for some function $h \in \mathfrak{F}(M)$ such that $h > 0$ or $h < 0$. The following special case will prove quite useful.

63. Definition. A diffeomorphism $\psi\colon M \to N$ of semi-Riemannian manifolds such that $\psi^*(g_N) = c\,g_M$ for some constant $c \ne 0$ is called a homothety of coefficient $c$.

Thus $\langle d\psi(v), d\psi(w)\rangle = c\langle v, w\rangle$ for all $v, w \in T_p(M)$, $p \in M$: all scalar products are stretched by the same constant $c$. If $c > 0$ [$c < 0$] then $\psi$ is called a positive [negative] homothety of scale factor $|c|^{1/2}$. For example, in $\mathbf{R}^{n+1}$ scalar multiplication by $b/a$ is a positive homothety from $S^n(a)$ to $S^n(b)$ of scale factor $b/a$. An isometry is just a homothety with $c = 1$. If $c = -1$ we call $\psi$ an anti-isometry.

64. Lemma. Homotheties preserve Levi-Civita connections.

Proof. If $\psi\colon M \to N$ is a homothety with coefficient $c$, let $N'$ be the smooth manifold of $N$ furnished with the new metric tensor $(1/c)g_N$. Then $\psi\colon M \to N'$ is an isometry, hence preserves connections. Thus it remains to show that $(1/c)g_N$ and $g_N$ determine the same Levi-Civita connection. This is clear from the Koszul formula, since the metric tensor appears exactly once in each of its terms, hence the constant coefficient cancels.

65. Remark. Effect of a Homothety. (1) Since a homothety preserves Levi-Civita connections it also preserves all geometric notions that derive solely from $D$, notably the induced covariant derivative on a curve, parallel translation, geodesics, Riemannian curvature $R$, and Ricci curvature (the latter since $\operatorname{Ric}$ is a nonmetric contraction of $R$).

(2) By contrast, sectional and scalar curvature are not invariant under homotheties. In fact, if $\psi\colon M \to N$ has coefficient $c$, it is easy to check that

$$K_N(d\psi\,\Pi) = \frac{1}{c}K_M(\Pi) \qquad \text{and} \qquad S_N \circ \psi = \frac{1}{c}S_M.$$

(3) Clearly a homothety of coefficient $c > 0$ preserves causal character of tangent vectors (hence of curves). However if $c < 0$ then causal character is reversed: $v$ timelike $\Rightarrow$ $d\psi(v)$ spacelike; $v$ spacelike $\Rightarrow$ $d\psi(v)$ timelike; and $v$ null $\Rightarrow$ $d\psi(v)$ null.
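The factor $1/c$ in part (2) of the remark can be traced through the definition of sectional curvature. The following is a sketch, writing $\bar v = d\psi(v)$ and $\bar w = d\psi(w)$ (notation introduced here) for a basis $v, w$ of a nondegenerate plane $\Pi$, and using from part (1) that $d\psi$ carries the curvature tensor of $M$ to that of $N$.

```latex
% Numerator: R is preserved as a (1,3) tensor, and the scalar product is multiplied by c.
\begin{align*}
\langle R_{\bar v\bar w}\bar v, \bar w\rangle
  &= \langle d\psi(R_{vw}v),\, d\psi(w)\rangle = c\,\langle R_{vw}v, w\rangle,\\
% Denominator: Q is quadratic in the scalar product, so it is multiplied by c^2.
Q(\bar v,\bar w)
  &= \langle\bar v,\bar v\rangle\langle\bar w,\bar w\rangle - \langle\bar v,\bar w\rangle^2 = c^2\,Q(v,w),\\
K_N(d\psi\,\Pi)
  &= \frac{c\,\langle R_{vw}v, w\rangle}{c^2\,Q(v,w)} = \frac{1}{c}\,K_M(\Pi).
\end{align*}
```

For the homothety from $S^n(a)$ to $S^n(b)$ mentioned after Definition 63, $c = (b/a)^2$, which is consistent with the spheres having curvatures $1/a^2$ and $1/b^2$; that value of the curvature is a standard fact established in Chapter 4 rather than here.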

The operation of changing the semi-Riemannian manifold $M$ with metric tensor $g$ to the same smooth manifold with metric tensor $-g$ is called reversing the metric of $M$. The effect on geometry is clear from the remarks above. In particular we will largely neglect manifolds with negative-definite metrics in favor of Riemannian manifolds. Analogously, the case of index $\nu = n - 1 \ge 1$ could also be considered as Lorentz geometry.

LEVELS OF STRUCTURE

Basic to all mathematics is the notion (here used quite informally) of a set with structure. For every type of structure there is a notion of equivalence (or isomorphism): a one-to-one onto function that, in an appropriate sense, preserves the structure. A particular type of structure defines a branch of mathematics: the study of those concepts preserved by equivalence. For example, a group is a set furnished with the structure group operation. The notion of equivalence is the usual notion of isomorphism of groups. In the case of semi-Riemannian geometry there is a hierarchy of structures (see Table 1).

TABLE 1

| Branch of mathematics | Set with structure | Structure | Equivalence |
|---|---|---|---|
| Semi-Riemannian geometry | Semi-Riemannian manifold | Metric tensor, atlas, topology | Isometry |
| Manifold theory | Manifold | Atlas, topology | Diffeomorphism |
| Topology | Topological space | Topology | Homeomorphism |
| Set theory | Set | (None) | One-to-one onto function |

Exercises

1. For an arbitrary connection $D$ on a manifold $M$ show that $T(X, Y) = [X, Y] - D_X Y + D_Y X$ defines a $(1,2)$ tensor field on $M$. $T$ is called the torsion tensor of $D$. (Thus for a semi-Riemannian manifold, property (D4) asserts that the Levi-Civita connection has torsion zero.)

2. For a $(1, 2)$ tensor field $A$, derive the formula

(Hint: See Exercise 2.6.)

3. Let $h\colon J \to I$ reparametrize a curve $\alpha\colon I \to M$. If $Z \in \mathfrak{X}(\alpha)$ then $Z \circ h \in \mathfrak{X}(\alpha \circ h)$ and (a) $(Z \circ h)' = (dh/dt)\,Z' \circ h$; (b) $(Z \circ h)'' = (d^2h/dt^2)\,Z' \circ h + (dh/dt)^2\,Z'' \circ h$.


4. (a) Symmetries are preserved under covariant differentiation; in particular $D_X R$ has the same symmetries as $R$, and $D_X \operatorname{Ric}$ is symmetric. (b) With indices as on page 83, $R_{ijkl} = -R_{jikl} = -R_{ijlk} = R_{klij}$ and $R_{ijkl} + R_{iklj} + R_{iljk} = 0$.

5. If $M^n$ has constant sectional curvature $C$, then $\operatorname{Ric} = (n - 1)Cg$ and $S = n(n - 1)C$.

6. For a semi-Riemannian surface, (a) $R_{XY}Z = K[\langle Z, X\rangle Y - \langle Z, Y\rangle X]$. (b) $\operatorname{Ric} = Kg$ and $S = 2K$. (c) $K = -R_{1212}/Q$ (see Remark 55).

7. Let $P$ be an open submanifold of a semi-Riemannian manifold $M$. If $P$ is complete and $M$ is connected, then $P = M$. (Hint: Assume not, and consider a boundary point of $P$.)

8. (a) The Poincaré half-plane $P$ is the region $v > 0$ in $\mathbf{R}^2$, but with line element $ds^2 = (du^2 + dv^2)/v^2$. Show that $P$ has constant curvature $K = -1$. (b) Exercise 13 will imply that the unit sphere $S^2$ in $\mathbf{R}^3$ has $ds^2 = d\vartheta^2 + \sin^2\vartheta\,d\varphi^2$. Show that $S^2$ has constant curvature $K = +1$. (These examples will be extensively generalized in the next chapter.)

9. If $f, g \in \mathfrak{F}(M)$ and $V \in \mathfrak{X}(M)$, then
(a) $D(fA) = A \otimes df + f\,DA$ ($A \in \mathfrak{T}^r_s(M)$).
(b) $\operatorname{grad}(fg) = f\operatorname{grad} g + g\operatorname{grad} f$.
(c) $\operatorname{div}(fV) = Vf + f\operatorname{div} V$.
(d) $H^{fg} = fH^g + gH^f + df \otimes dg + dg \otimes df$.
(e) $\Delta(fg) = f\Delta g + g\Delta f + 2\langle\operatorname{grad} f, \operatorname{grad} g\rangle$.

10. For the metric tensor $g$ of $M$: (a) The contraction of $g$ is $\dim M$. (b) If $M$ is connected the hypothesis that $g$ has constant index is superfluous. (c) $\operatorname{div}(fg) = df$. Prove using coordinates; using frame fields.

11. Let $\phi\colon M \to N$ be a smooth map of semi-Riemannian manifolds. For coordinate systems $\xi$ at $p$ in $M$ and $\eta$ at $\phi p$ in $N$ prove that $d\phi_p$ preserves scalar products if and only if for all $i, j$

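12. Let $Z$ be a vector field on a curve $\alpha$ in $M$. (a) If $\alpha'(s_0) \ne 0$ there exists a smooth vector field $\tilde Z$ on a neighborhood of $\alpha(s_0)$ in $M$ such that $\tilde Z_{\alpha(s)} = Z(s)$ for $s$ near $s_0$. In particular, $\alpha$ is locally an integral curve. (Hint: For some interval $J$ around $s_0$, $\alpha(J)$ is a submanifold of $M$.) (b) $Z'(s) = D_{\alpha'(s)}\tilde Z$ for $s$ near $s_0$. (c) If $\alpha$ is an integral curve of $X$ then $Xf \circ \alpha = d(f \circ \alpha)/ds$ for $f \in \mathfrak{F}(M)$.

13. We use spherical coordinates $r, \vartheta, \varphi$ in $\mathbf{R}^3$ as suggested by Figure 3 so that $r, \varphi$ give polar coordinates in the plane $\vartheta = \pi/2$. (Some mathematicians prefer to reverse the notations $\vartheta$ and $\varphi$.) (a) Prove that $(r, \vartheta, \varphi)$ is a coordinate system on $\mathbf{R}^3 - H$, where $H$ is a suitable half-plane. (b) Express $\partial_r, \partial_\vartheta, \partial_\varphi$ in terms of $\partial_x, \partial_y, \partial_z$. (c) Deduce $ds^2 = dr^2 + r^2(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2)$. (d) Compute

$$D_{\partial_r}\partial_r = 0, \qquad D_{\partial_r}\partial_\vartheta = (1/r)\,\partial_\vartheta, \qquad D_{\partial_r}\partial_\varphi = (1/r)\,\partial_\varphi,$$
$$D_{\partial_\vartheta}\partial_\vartheta = -r\,\partial_r, \qquad D_{\partial_\vartheta}\partial_\varphi = \cot\vartheta\;\partial_\varphi, \qquad D_{\partial_\varphi}\partial_\varphi = -r\sin\vartheta\;U,$$

where $U = \cos\varphi\;\partial_x + \sin\varphi\;\partial_y = \sin\vartheta\;\partial_r + (\cos\vartheta)/r\;\partial_\vartheta$.

Figure 3

14. (a) Let $\Pi$ be a tangent plane. If $A\colon \Pi \to \Pi$ is a linear operator of determinant 1, then $R(Av, Aw) = R(v, w)$. (b) For a semi-Riemannian product $M \times N$,

$$\operatorname{Ric}(x + v, y + w) = \operatorname{Ric}_M(x, y) + \operatorname{Ric}_N(v, w),$$

where $x, y \in T_p(M)$ and $v, w \in T_q(N)$. Hence $S = S_M + S_N$.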

15. For an orthogonal coordinate system the geodesic differential equations become

$$\frac{d}{ds}\left(g_{kk}\frac{dx^k}{ds}\right) = \frac{1}{2}\sum_i \frac{\partial g_{ii}}{\partial x^k}\left(\frac{dx^i}{ds}\right)^2 \qquad (1 \le k \le n).$$

16. (a) If $\gamma$ is a geodesic and $f \in \mathfrak{F}(M)$, then $H^f(\gamma', \gamma') = d^2(f \circ \gamma)/ds^2$. (b) At a critical point of $f$, $H^f$ agrees with the Hessian in Exercise 1.14.

17. (a) If $Y$ is a vector field on a curve $\alpha$, then $Y'(s) = \lim_{t \to 0} (1/t)[P_t Y(s+t) - Y(s)]$, where $P_t$ is parallel translation along $\alpha$ from $\alpha(s+t)$ to $\alpha(s)$. (b) Parallel translation along geodesics uniquely determines the Levi-Civita connection.

18. The curl of $V \in \mathfrak{X}(M)$ is defined by $(\operatorname{curl} V)(X, Y) = \langle D_X V, Y\rangle - \langle D_Y V, X\rangle$. (a) $\operatorname{curl} V$ is a skew-symmetric (0, 2) tensor field (that is, two-form) with coordinate components $\partial V_j/\partial x^i - \partial V_i/\partial x^j$. (b) $\operatorname{curl}(\operatorname{grad} f) = 0$. (c) $\operatorname{curl} V = d\theta$, where $\theta$ is the one-form metrically equivalent to $V$. (d) On $R^3$, $(\operatorname{curl} V)(X, Y) = (X \times Y) \cdot (\nabla \times V)$.

19. To show that a regular curve $\alpha$ with $\alpha''$ and $\alpha'$ collinear is a pregeodesic, write $\alpha''(s) = f(s)\alpha'(s)$ and prove: (a) $\beta = \alpha \circ h$ is a geodesic if and only if $h'' + f(h)h'^2 = 0$. (b) If $\langle \alpha', \alpha'\rangle$ is never zero, then any constant speed reparametrization of $\alpha$ is a geodesic. (c) $\langle \alpha', \alpha'\rangle$ is always zero or never zero. (d) $\alpha$ is pregeodesic in the case $\langle \alpha', \alpha'\rangle = 0$.

20. The Riemannian curvature tensor on $M^n$ has $n^2(n^2 - 1)/12$ independent components. Formally, if $V$ is an $n$-dimensional scalar product space, the vector space of all curvaturelike (page 79) functions $V^4 \to R$ has dimension $n^2(n^2 - 1)/12$.

21. A semi-Riemannian manifold $M$ is an Einstein manifold provided $\mathrm{Ric} = cg$ for some constant $c$. (a) If $M$ is connected, $n = \dim M \ge 3$, and $\mathrm{Ric} = fg$, then $M$ is Einstein. (b) (Schur.) If $M$ is connected, $n \ge 3$, and for each $p \in M$, $K$ is constant on the nondegenerate planes in $T_p(M)$, then $K$ is constant. (Hint: For (a), contract and use div.)

22. If $\mathfrak{U}$ is an open set in $T_p(M)$ then the curvature tensor $R$ at $p$ is completely determined by $K(\Pi)$ for all nondegenerate planes $\Pi$ that meet $\mathfrak{U}$. (Hint: If $u \in \mathfrak{U}$ and $x \in T_p(M)$ then for $\delta$ small, $u + \delta x \in \mathfrak{U}$.)

4

SEMI-RIEMANNIAN SUBMANIFOLDS

If $M$ is a semi-Riemannian submanifold of $\bar M$ (Definition 3.4) then for vectors $v, w \in T_p(M) \subset T_p(\bar M)$ the notation $\langle v, w\rangle$ is unambiguous. Though they agree on this measurement it will soon be clear that inhabitants of the submanifold see their world $M$ differently than do the inhabitants of the outside world $\bar M$. Comparing the Levi-Civita connections of $M$ and $\bar M$ gives a tensor $\mathrm{II}$ that provides an infinitesimal description of the shape of $M$ in $\bar M$. Then $\mathrm{II}$ is used systematically to compare the geometries of $M$ and $\bar M$. In particular, the geometry of $M$ can be derived from $\mathrm{II}$ and the geometry of $\bar M$.

Overbars are used to distinguish corresponding geometrical objects on $M$ and $\bar M$. Thus for connection and curvature we have $D$, $R$ on $M$ and $\bar D$, $\bar R$ on $\bar M$. In the case of a vector field $Y$ on a curve, we use the notations $\dot Y = \bar D Y/ds$ and $Y' = DY/ds$. (But for a curve $\alpha$ in $M$, $\dot\alpha = \alpha' = d\alpha/ds$.) The abbreviation $M \subset \bar M$ means that $M$ is a semi-Riemannian submanifold of $\bar M$.

TANGENTS AND NORMALS

If $M$ is merely a smooth submanifold of $\bar M$, a vector field $X$ on the inclusion map $j: M \subset \bar M$ (Definition 1.47) is called an $\bar M$ vector field on $M$. Thus $X$ assigns to each point $p$ of $M$ a tangent vector $X_p$ to $\bar M$ at $p$, and $X$ is smooth if $f \in \mathfrak{F}(\bar M)$ implies $Xf \in \mathfrak{F}(M)$. The set $\bar{\mathfrak{X}}(M)$ of all such (smooth) vector fields is a module over $\mathfrak{F}(M)$. For any $Y \in \mathfrak{X}(\bar M)$ the restriction $Y|M$ is in $\bar{\mathfrak{X}}(M)$. Since we have agreed to ignore differential maps of $j$, $\mathfrak{X}(M)$ is a submodule of $\bar{\mathfrak{X}}(M)$.

Now let $M$ be a semi-Riemannian submanifold of $\bar M$. Each tangent space $T_p(M)$ is, by definition, a nondegenerate subspace of $T_p(\bar M)$. Hence Lemma 2.23 gives the direct sum decomposition

$$T_p(\bar M) = T_p(M) + T_p(M)^\perp,$$

and $T_p(M)^\perp$ is also nondegenerate. Its dimension is $k$, the codimension of $M^n$ in $\bar M^{n+k}$. Similarly the index of ($g$ restricted to) $T_p(M)^\perp$ is called the co-index of $M$ in $\bar M$, and by Lemma 2.26, $\operatorname{ind} \bar M = \operatorname{ind} M + \operatorname{coind} M$. Vectors in $T_p(M)^\perp$ are said to be normal to $M$, while those in $T_p(M)$ are of course tangent to $M$. That the sum above is direct means that every vector $x \in T_p(\bar M)$, $p \in M$, has a unique expression

$$x = \tan x + \operatorname{nor} x,$$

where $\tan x \in T_p(M)$ and $\operatorname{nor} x \in T_p(M)^\perp$. The resulting orthogonal projections

$$\tan: T_p(\bar M) \to T_p(M) \quad\text{and}\quad \operatorname{nor}: T_p(\bar M) \to T_p(M)^\perp$$

are obviously $R$-linear. A vector field $Z \in \bar{\mathfrak{X}}(M)$ is normal to $M$ provided each value $Z_p$ is normal to $M$. The set $\mathfrak{X}(M)^\perp$ of all such is a submodule of $\bar{\mathfrak{X}}(M)$. For $X \in \bar{\mathfrak{X}}(M)$, applying tan and nor at each point of $M$ gives vector fields $\tan X \in \mathfrak{X}(M)$ and $\operatorname{nor} X \in \mathfrak{X}(M)^\perp$. (A computation with adapted coordinate systems shows that these vector fields are smooth.) The resulting orthogonal projections

$$\tan: \bar{\mathfrak{X}}(M) \to \mathfrak{X}(M) \quad\text{and}\quad \operatorname{nor}: \bar{\mathfrak{X}}(M) \to \mathfrak{X}(M)^\perp$$

are $\mathfrak{F}(M)$-linear, and the identity $X = \tan X + \operatorname{nor} X$ is that of the direct sum $\bar{\mathfrak{X}}(M) = \mathfrak{X}(M) + \mathfrak{X}(M)^\perp$. In this chapter, $V$ and $W$ will always denote vector fields tangent to $M$, and $Z$ will be normal to $M$.

THE INDUCED CONNECTION

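If $M$ is a semi-Riemannian submanifold of $\bar M$ the Levi-Civita connection $\bar D$ of $\bar M$ gives rise in a natural way to a function $\mathfrak{X}(M) \times \bar{\mathfrak{X}}(M) \to \bar{\mathfrak{X}}(M)$ called the induced connection on $M \subset \bar M$. If $V \in \mathfrak{X}(M)$ and $X \in \bar{\mathfrak{X}}(M)$ then $\bar D_V X$ is not yet meaningful, since $V$ and $X$ are not in $\mathfrak{X}(\bar M)$. However for each $p \in M$ let $\bar V$ and $\bar X$ be smooth local extensions of $V$ and $X$ over a coordinate neighborhood $\mathcal{U}$ of $p$ in $\bar M$ (Exercise 1.18). Then define $\bar D_V X$ on each $\mathcal{U} \cap M$ to be the restriction of $\bar D_{\bar V}\bar X$ to $\mathcal{U} \cap M$.

1. Lemma. $\bar D_V X$ is a well-defined smooth $\bar M$ vector field on $M$.

Proof. As the restriction of a smooth vector field, $\bar D_{\bar V}\bar X|\,\mathcal{U} \cap M$ is smooth. Thus it suffices to show that it is independent of the choice of extensions. In terms of a coordinate system on $\mathcal{U}$, write $\bar X = \sum f^i \partial_i$. Then $\bar D_{\bar V}\bar X = \sum \bar V(f^i)\,\partial_i + \sum f^i \bar D_{\bar V}(\partial_i)$. But at $q \in \mathcal{U} \cap M$, $(\bar V f^i)(q) = V_q(f^i) = V_q(f^i|\,\mathcal{U} \cap M)$ and $\bar D_{\bar V}(\partial_i)|_q = \bar D_{V_q}(\partial_i)$. Thus the restriction of $\bar D_{\bar V}\bar X$ depends solely on $V$ and $X$.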

Since the induced connection $\bar D: \mathfrak{X}(M) \times \bar{\mathfrak{X}}(M) \to \bar{\mathfrak{X}}(M)$ is so closely related to the Levi-Civita connection of $\bar M$ we have used the same notation for both. In particular the induced connection has the five Levi-Civita properties, as follows.

2. Corollary. Let $\bar D$ be the induced connection of $M \subset \bar M$. If $V, W \in \mathfrak{X}(M)$ and $X, Y \in \bar{\mathfrak{X}}(M)$, then

(1) $\bar D_V X$ is $\mathfrak{F}(M)$-linear in $V$.
(2) $\bar D_V X$ is $R$-linear in $X$.
(3) $\bar D_V(fX) = Vf\,X + f\,\bar D_V X$ for $f \in \mathfrak{F}(M)$.
(4) $[V, W] = \bar D_V W - \bar D_W V$.
(5) $V\langle X, Y\rangle = \langle \bar D_V X, Y\rangle + \langle X, \bar D_V Y\rangle$.

Proof. For each point $p \in M$ extend all the vector fields and functions over a neighborhood of $p$ in $\bar M$. The corresponding five properties hold for the Levi-Civita connection of $\bar M$; then restriction to $M$ gives the results above, since (a) $(\bar D_{\bar V}\bar X)|M = \bar D_V X$; (b) $\bar V \bar f\,|M = Vf$; (c) $\langle \bar X, \bar Y\rangle|M = \langle X, Y\rangle$; and by Proposition 1.32, (d) $[\bar V, \bar W]|M = [V, W]$.

A basic fact here is that for $V, W$ both tangent to $M$, the covariant derivative $\bar D_V W$ need not be tangent to $M$. So it is natural to ask what $\tan \bar D_V W$ and $\operatorname{nor} \bar D_V W$ are.

3. Lemma. For $M \subset \bar M$, if $V, W \in \mathfrak{X}(M)$, then $D_V W = \tan \bar D_V W$, where $D$ is the Levi-Civita connection of $M$.

Proof. For an arbitrary vector field $X \in \mathfrak{X}(M)$, locally extend the vector fields $X, V, W$ and write out the Koszul equation $2\langle \bar D_{\bar V}\bar W, \bar X\rangle = F(\bar V, \bar W, \bar X)$. Upon restriction to $M$, $\langle \bar D_{\bar V}\bar W, \bar X\rangle$ becomes $\langle \bar D_V W, X\rangle$, and properties (a)-(d) in the previous proof show that $F(\bar V, \bar W, \bar X)|M = F(V, W, X)$. Thus $\langle \bar D_V W, X\rangle = \langle D_V W, X\rangle$. Since $X$ is tangent to $M$, we can replace $\bar D_V W$ by $\tan \bar D_V W$, and the result follows.

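But $\operatorname{nor} \bar D_V W$ is something new.

4. Lemma. The function $\mathrm{II}: \mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)^\perp$ such that

$$\mathrm{II}(V, W) = \operatorname{nor} \bar D_V W$$

is $\mathfrak{F}(M)$-bilinear and symmetric. $\mathrm{II}$ is called the shape tensor (or second fundamental form tensor) of $M \subset \bar M$.

Proof. Since $\bar D_V W$ is $\mathfrak{F}(M)$-linear in $V$ and $R$-linear in $W$, so is $\mathrm{II}$. For $f \in \mathfrak{F}(M)$,

$$\bar D_V(fW) = Vf\,W + f\,\bar D_V W.$$

But $W$ is tangent to $M$, and the projection nor is $\mathfrak{F}(M)$-linear, hence

$$\mathrm{II}(V, fW) = \operatorname{nor} \bar D_V(fW) = f \operatorname{nor} \bar D_V W = f\,\mathrm{II}(V, W).$$

Finally, $\mathrm{II}(V, W) - \mathrm{II}(W, V) = \operatorname{nor}(\bar D_V W - \bar D_W V) = \operatorname{nor}[V, W] = 0$.

$\mathrm{II}$ is a more general tensor field than those defined in Chapter 2, since its values lie in $\mathfrak{X}(M)^\perp$ rather than $\mathfrak{X}(M)$. But evidently its $\mathfrak{F}(M)$-bilinearity means that it has the pointwise character expressed by Proposition 2.2 (see page 199). Thus at each point $p \in M$, $\mathrm{II}$ determines an $R$-bilinear function

$$T_p(M) \times T_p(M) \to T_p(M)^\perp$$

sending $(v, w)$ to $\mathrm{II}(v, w)$. The two preceding lemmas are summarized by

$$\bar D_V W = D_V W + \mathrm{II}(V, W) \quad\text{for } V, W \in \mathfrak{X}(M),$$

where $D_V W$ is tangent to $M$ and $\mathrm{II}(V, W)$ is normal to $M$.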

This decomposition leads to a fundamental curvature result called the Gauss equation.

5. Theorem. Let $M$ be a semi-Riemannian submanifold of $\bar M$, with $R$ and $\bar R$ their Riemannian curvature tensors and $\mathrm{II}$ the shape tensor. Then for vector fields $V, W, X, Y$ all tangent to $M$,

$$\langle R_{VW}X, Y\rangle = \langle \bar R_{VW}X, Y\rangle + \langle \mathrm{II}(V, X), \mathrm{II}(W, Y)\rangle - \langle \mathrm{II}(V, Y), \mathrm{II}(W, X)\rangle.$$

Proof. As usual we can suppose $[V, W] = 0$. Thus $\langle \bar R_{VW}X, Y\rangle = -(VW) + (WV)$, where

$$(VW) = \langle \bar D_V \bar D_W X, Y\rangle = \langle \bar D_V D_W X, Y\rangle + \langle \bar D_V(\mathrm{II}(W, X)), Y\rangle.$$

Since $Y$ is tangent to $M$ the projection tan can be introduced in the left side of the first summand, giving $\langle D_V D_W X, Y\rangle$. The second summand can be written as $V\langle \mathrm{II}(W, X), Y\rangle - \langle \mathrm{II}(W, X), \bar D_V Y\rangle$. Since $\mathrm{II}(W, X)$ is normal to $M$ this expression reduces to

$$-\langle \mathrm{II}(W, X), \operatorname{nor} \bar D_V Y\rangle = -\langle \mathrm{II}(W, X), \mathrm{II}(V, Y)\rangle.$$

We conclude that

$$(VW) = \langle D_V D_W X, Y\rangle - \langle \mathrm{II}(W, X), \mathrm{II}(V, Y)\rangle.$$

The result then follows by evaluating $-(VW) + (WV)$.

Because it is a tensor equation the Gauss equation remains valid if its vector fields are replaced by individual tangent vectors. Thus substitution in the formula of Lemma 3.39 gives the following relation between the sectional curvatures $K$ of $M$ and $\bar K$ of $\bar M$ (also called the Gauss equation).

6. Corollary. If the vectors $v$ and $w$ form a basis for a nondegenerate tangent plane to $M$, then

$$K(v, w) = \bar K(v, w) + \frac{\langle \mathrm{II}(v, v), \mathrm{II}(w, w)\rangle - \langle \mathrm{II}(v, w), \mathrm{II}(v, w)\rangle}{\langle v, v\rangle\langle w, w\rangle - \langle v, w\rangle^2}.$$

As an application let us show that the sphere $S^n(r)$ has constant sectional curvature $K = 1/r^2$ if $n \ge 2$. (All one-dimensional manifolds are trivially flat.) The position vector field $P = \sum u^i \partial_i$ of $R^{n+1}$ is normal to the standard sphere $S^n(r) \subset R^{n+1}$ at each point. If $\bar D$ is the connection of $R^{n+1}$, note that $\bar D_X P = \sum X u^i\, \partial_i = X$ for every vector field $X$. We assert that the shape tensor of the sphere is given by

$$\mathrm{II}(V, W) = (-1/r)\langle V, W\rangle U,$$

where $U = P/r$ is the outward unit normal on $S^n(r)$. In fact,

$$\langle \mathrm{II}(V, W), U\rangle = \langle \operatorname{nor} \bar D_V W, U\rangle = \langle \bar D_V W, P\rangle/r = -\langle W, \bar D_V P\rangle/r = -\langle V, W\rangle/r.$$

Since $R^{n+1}$ is flat, substitution in Corollary 6 gives the result $K = 1/r^2$.

Since it is a (0, 2) tensor field with values in $\mathfrak{X}(M)^\perp$, $\mathrm{II}$ can be metrically contracted to give a normal vector field on $M$. Dividing by $n = \dim M$ gives the mean curvature vector field $H$ of $M \subset \bar M$. Explicitly, at $p \in M$,

$$H_p = \frac{1}{n}\sum_{i=1}^{n} \varepsilon_i\, \mathrm{II}(e_i, e_i),$$

where $e_1, \ldots, e_n$ is any frame on $M$ at $p$ and $\varepsilon_i = \langle e_i, e_i\rangle$.
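The sphere computation above can be checked numerically. The following is a minimal sketch, assuming the shape tensor $\mathrm{II}(V, W) = (-1/r)\langle V, W\rangle U$ asserted in the text and a flat ambient space; the function names, the sample point, and the sample vectors are illustrative choices, not part of the text. It evaluates the Gauss equation of Corollary 6 and the mean curvature vector for $S^3(2) \subset R^4$ (Riemannian case, so every $\varepsilon_i = 1$).

```python
def dot(a, b):
    """Euclidean dot product of two vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))

def shape_tensor(v, w, p, r):
    """II(v, w) = -(1/r) <v, w> U with U = p / r, for v, w tangent at p."""
    coeff = -dot(v, w) / r
    return [coeff * x / r for x in p]

def gauss_curvature(v, w, p, r):
    """Sectional curvature from Corollary 6 with flat ambient space."""
    num = (dot(shape_tensor(v, v, p, r), shape_tensor(w, w, p, r))
           - dot(shape_tensor(v, w, p, r), shape_tensor(v, w, p, r)))
    den = dot(v, v) * dot(w, w) - dot(v, w) ** 2
    return num / den

def mean_curvature(frame, p, r):
    """H = (1/n) * sum of II(e_i, e_i) over an orthonormal tangent frame."""
    n = len(frame)
    total = [0.0] * len(p)
    for e in frame:
        total = [t + x for t, x in zip(total, shape_tensor(e, e, p, r))]
    return [t / n for t in total]

# Illustrative data: the point p = (0, 0, 0, r) of S^3(r) with r = 2.
r = 2.0
p = [0.0, 0.0, 0.0, r]
frame = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
v = [1.0, 2.0, 0.0, 0.0]   # tangent at p (orthogonal to p)
w = [0.5, -1.0, 3.0, 0.0]  # tangent at p, independent of v

K = gauss_curvature(v, w, p, r)   # expected to equal 1 / r**2
H = mean_curvature(frame, p, r)   # expected to equal -U / r
print(K, H)
```

With these inputs the curvature comes out as $1/r^2$ independently of the chosen plane, and the mean curvature vector is $-U/r$, pointing inward.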

If $M$ is a semi-Riemannian submanifold of $\bar M$, the usual geometry of $M$ is sometimes called its intrinsic geometry to emphasize that it is independent of the fact that $M$ happens to be in $\bar M$. Roughly speaking, the extrinsic geometry of $M$ is that seen by observers in $\bar M$. Formally, a pair isometry from $M \subset \bar M$ to $N \subset \bar N$ is an isometry $\phi: \bar M \to \bar N$ such that $\phi|M$ is an isometry from $M$ to $N$. (When $\bar M = \bar N$, $\phi$ is also called a congruence from $M$ to $N$.) Features of $M$ that are preserved by all such pair isometries, and do not belong to its intrinsic geometry, constitute the extrinsic geometry of $M$. For example the shape tensor of $M \subset \bar M$ is an extrinsic invariant.

7. Lemma. A pair isometry $\phi$ from $M \subset \bar M$ to $N \subset \bar N$ preserves shape tensors, that is,

$$d\phi(\mathrm{II}(v, w)) = \mathrm{II}(d\phi v, d\phi w)$$

for all $v, w \in T_p(M)$, $p \in M$.

Proof. Let $V, W \in \mathfrak{X}(M)$. Since $\phi|M$ is in particular a diffeomorphism $M \to N$, $d\phi(V)$ and $d\phi(W)$ are in $\mathfrak{X}(N)$. Because $\phi: \bar M \to \bar N$ preserves connections it follows that $d\phi(\bar D_V W) = \bar D_{d\phi V}(d\phi W)$. For each $p \in M$, the linear isometry $d\phi: T_p(\bar M) \to T_{\phi p}(\bar N)$ carries $T_p(M)$ to $T_{\phi p}(N)$, hence $T_p(M)^\perp$ to $T_{\phi p}(N)^\perp$. Thus $d\phi$ preserves tangential and normal components; hence

$$d\phi(\mathrm{II}(V, W)) = d\phi(\operatorname{nor} \bar D_V W) = \operatorname{nor}(d\phi(\bar D_V W)) = \operatorname{nor}(\bar D_{d\phi V}(d\phi W)) = \mathrm{II}(d\phi V, d\phi W).$$

The shape tensor is not intrinsic to $M$. For example, if different shapes of a piece of paper in space represent isometric submanifolds of $R^3$ then $\mathrm{II} = 0$ when the paper is flat but not when it is curved.

GEODESICS IN SUBMANIFOLDS

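The decomposition $\bar D_V W = D_V W + \mathrm{II}(V, W)$ is adapted to vector fields on a curve as follows.

8. Proposition. Let $Y$ be a vector field (always tangent to $M$) on a curve $\alpha$ in $M \subset \bar M$. Then

$$\dot Y = Y' + \mathrm{II}(\alpha', Y),$$

where $\dot Y = \bar D Y/ds$ and $Y' = DY/ds$; here $Y'$ is tangent to $M$ and $\mathrm{II}(\alpha', Y)$ is normal to $M$.

Proof. As usual we can assume $\alpha$ lies in a single coordinate neighborhood of $M$ and write $Y = \sum Y^i \partial_i$. Relative to the geometry of $\bar M$,

$$\dot Y = \sum \frac{dY^i}{ds}\,\partial_i + \sum Y^i\,(\partial_i|_\alpha)^{\cdot}.$$

But

$$(\partial_i|_\alpha)^{\cdot} = \bar D_{\alpha'}(\partial_i) = D_{\alpha'}(\partial_i) + \mathrm{II}(\alpha', \partial_i).$$

Substituting in the previous equation then gives the required result.

9. Corollary. If $\alpha$ is a curve in $M \subset \bar M$, then

$$\ddot\alpha = \alpha'' + \mathrm{II}(\alpha', \alpha'),$$

where $\ddot\alpha$ is the acceleration of $\alpha$ in $\bar M$, and $\alpha''$ its acceleration in $M$.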

To see how the tensor $\mathrm{II}$ describes shape, fix a point $p \in M$ and for $v \in T_p(M)$ let $\gamma$ be the $M$ geodesic with initial velocity $v$. In $M$, $\gamma$ is "straight," thus its curving in $\bar M$ is that forced by the curving of $M$ itself in $\bar M$. By the corollary, $\ddot\gamma(0) = \mathrm{II}(v, v)$. Thus for all $v$, $\mathrm{II}$ describes the shape at $p$ of $M$ in $\bar M$ (see Figure 1). If $\bar M$ is a Euclidean space, then $\gamma$ has the quadratic approximation $\gamma(0) + s\dot\gamma(0) + \tfrac{1}{2}s^2\ddot\gamma(0)$, and in fact it is true in general that the hypersurface $\{v + \tfrac{1}{2}\mathrm{II}(v, v) : v \in T_p(M)\}$ in $T_p(\bar M)$ is the best quadratic approximation of $M$ in $\bar M$ near $p$.

Figure 1. The $M$ geodesics $\alpha$ and $\beta$ have initial velocities $v$ and $w$; hence $\mathrm{II}(v, v) = \ddot\alpha(0)$ and $\mathrm{II}(w, w) = \ddot\beta(0)$.

10. Corollary. A curve $\alpha$ in $M \subset \bar M$ is a geodesic of $M$ if and only if its $\bar M$ acceleration is everywhere normal to $M$.

This is obvious since, by the preceding corollary, $\ddot\alpha$ is normal to $M$ exactly when the $M$ acceleration $\alpha''$ of $\alpha$ is zero.

11. Corollary. The nonconstant geodesics of the sphere $S^n(r)$ are precisely all constant speed parametrizations of the great circles of $S^n(r) \subset R^{n+1}$.

Proof. A great circle of the sphere $S = S^n(r)$ is a circle $\Pi \cap S$ cut from $S$ by a plane $\Pi$ through the origin of $R^{n+1}$. If $\alpha$ is a constant speed parametrization of $\Pi \cap S$ then $\dot\alpha = d\alpha/dt$ and $\ddot\alpha = d^2\alpha/dt^2$ are orthogonal to each other and tangent to the plane $\Pi$. But on $\alpha$ the position vector field $P_\alpha$ is also tangent to $\Pi$ and orthogonal to $\dot\alpha \neq 0$. Thus $\ddot\alpha$ and $P_\alpha$ are collinear at each point, so that $\ddot\alpha$ is normal to $S$ and hence $\alpha$ is a geodesic of the sphere. To show that every nonconstant geodesic $\gamma$ can be so obtained, let $\Pi$ be the plane through the origin and $\gamma(0)$ to which $\gamma'(0)$ is tangent. Then a suitable constant speed parametrization $\alpha$ of $\Pi \cap S$ has $\alpha'(0) = \gamma'(0)$. Thus $\gamma = \alpha$ by the uniqueness of geodesics.

In particular, every unit speed geodesic of $S^n(r)$ is periodic, with period $2\pi r$. It is now easy to see what the exponential maps of the sphere are like. Fix a point $p$ of $S = S^n(r)$. For each unit vector $u \in T_p(S)$, $\exp_p$ wraps the radial line through $u$ around the great circle to which $u$ is tangent. Explicitly, regarding $p$ as its position vector (of length $r$),

$$\exp_p(tu) = \cos(t/r)\,p + r\sin(t/r)\,u.$$

Thus the $(n-1)$-spheres $t$ constant in $T_p(S)$ are typically carried to "latitudinal" $(n-1)$-spheres in $S$ cut by hyperplanes orthogonal to the diameter $p, -p$. Exceptionally, the spheres $t = k\pi r$ collapse to the north pole $p$ when $k$ is an even integer and to the south pole $-p$ when $k$ is odd. Note that $\exp_p$ is a diffeomorphism from the ball $0 \le t < \pi r$ to $S - \{-p\}$, hence the latter is a normal neighborhood of $p$.
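The behavior of the exponential map just described can be checked numerically. The sketch below assumes the normalization in which $p$ is the position vector of length $r$ and $u$ is a unit tangent vector, so that $\exp_p(tu) = \cos(t/r)\,p + r\sin(t/r)\,u$; the function names and sample values are illustrative.

```python
import math

def exp_sphere(p, u, t, r):
    """exp_p(t u) on S^n(r): p has length r, u is a unit vector tangent at p."""
    c, s = math.cos(t / r), math.sin(t / r)
    return [c * pi + r * s * ui for pi, ui in zip(p, u)]

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(x * x for x in v))

# Illustrative data: north pole of S^2(3) and a unit tangent vector there.
r = 3.0
p = [0.0, 0.0, r]
u = [1.0, 0.0, 0.0]

for t in (0.0, 1.0, math.pi * r / 2, math.pi * r, 2 * math.pi * r):
    point = exp_sphere(p, u, t, r)
    # Every image point should lie on the sphere of radius r.
    print(round(t, 4), [round(x, 6) for x in point], round(norm(point), 6))
```

The loop should show the curve starting at $p$, reaching $-p$ at $t = \pi r$, and returning to $p$ at $t = 2\pi r$, always at distance $r$ from the origin.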


TOTALLY GEODESIC SUBMANIFOLDS

These are the submanifolds with the simplest possible shape.

12. Definition. A semi-Riemannian submanifold $M$ of $\bar M$ is totally geodesic provided its shape tensor vanishes: $\mathrm{II} = 0$.

Thus a totally geodesic submanifold $M$ is extrinsically flat: observers in $\bar M$ see no curving. That is not to say that $M$ is intrinsically flat; indeed by the Gauss equation (6) it has the same intrinsic curvature as $\bar M$.

13. Proposition. For $M \subset \bar M$ the following are equivalent.
(1) $M$ is totally geodesic in $\bar M$.
(2) Every geodesic of $M$ is also a geodesic of $\bar M$.
(3) If $v \in T_p(\bar M)$ is tangent to $M$, then the $\bar M$ geodesic $\gamma_v$ lies initially in $M$.
(4) If $\alpha$ is a curve in $M$ and $v \in T_{\alpha(0)}(M)$, then parallel translation of $v$ along $\alpha$ is the same for $M$ and for $\bar M$.

Proof. (2) ⇒ (3). If $\alpha: I \to M$ is the geodesic of $M$ with initial velocity $v$, then since $\alpha$ is also a geodesic of $\bar M$ it follows by the uniqueness of geodesics that $\alpha = \gamma_v|I$. (However, $\gamma_v$ may later on leave $M$; for example, let $M$ be an open disk in the $xy$-plane of $\bar M = R^3$.)

(3) ⇒ (1). For every $v$ tangent to $M$, applying Corollary 9 to $\gamma_v$ shows that $\mathrm{II}(v, v) = 0$. Then the polarization $\mathrm{II}(v + w, v + w) = \mathrm{II}(v, v) + 2\,\mathrm{II}(v, w) + \mathrm{II}(w, w)$ shows that $\mathrm{II} = 0$.

(1) ⇒ (4). Let $V$ be the $M$ parallel vector field on $\alpha$ such that $V(0) = v$. By Proposition 8, $V$ is $\bar M$ parallel. Thus the two parallel translations agree.

(4) ⇒ (2). If $\gamma$ is a geodesic of $M$, then $\gamma'$ is $M$ parallel hence $\bar M$ parallel, so $\gamma$ is a geodesic of $\bar M$.

If $W$ is a [nondegenerate] $k$-dimensional subspace of $R^n_\nu$ then any translate $x + W$ is called a [nondegenerate] $k$-plane in $R^n_\nu$ $(0 \le k \le n)$. It is easy to see that nondegenerate $k$-planes are totally geodesic semi-Riemannian submanifolds of $R^n_\nu$. The following lemma will imply that they are the only complete connected ones.

14. Lemma. Let $M$ and $N$ be complete, connected, totally geodesic semi-Riemannian submanifolds of $\bar M$. If there is a point $p \in M \cap N$ at which $T_p(M) = T_p(N)$, then $M = N$.

Proof. It suffices to show that if $M$ is connected and $N$ is complete, then $M \subset N$. Let $\sigma$ be a geodesic segment in $M$ running from $p$ to $q$. Then $\sigma$ is a geodesic of $\bar M$, and by the hypothesis on tangent spaces, $\sigma'(0)$ is tangent to $N$. Thus $\sigma$ is a geodesic of $N$ as long as it remains in $N$. But since $N$ is complete, $\sigma$ is entirely contained in $N$. By the preceding proposition, $\bar M$ parallel translation of $T_p(M) = T_p(N)$ along $\sigma$ will give $T_q(M) = T_q(N)$. Thus the argument can be repeated to show that every broken geodesic of $M$ starting at $p$ lies also in $N$. Since $M$ is connected it follows that $M \subset N$. (In fact $M$ is an open submanifold of $N$, by Exercise 1.8.)

It follows that the complete, connected, $k$-dimensional totally geodesic Riemannian submanifolds of the sphere $S^n(r)$ are exactly its great $k$-spheres: the submanifolds $W \cap S^n(r)$ where $W$ is a $(k+1)$-plane through the origin. These results about totally geodesic submanifolds $M$ can be extended to the case where $M$ is not semi-Riemannian (Exercise 9).

15. Definition. A point $p$ of $M \subset \bar M$ is umbilic provided there is a normal vector $z \in T_p(M)^\perp$ such that

$$\mathrm{II}(v, w) = \langle v, w\rangle z \quad\text{for all } v, w \in T_p(M).$$

Then $z$ is called the normal curvature vector of $M$ at $p$.

In the Riemannian case, $\mathrm{II}(u, u) = z$ for all unit vectors $u$: $M$ bends the same way in all directions at an umbilic point. However for indefinite metrics, the formula $\mathrm{II}(u, u) = \langle u, u\rangle z$ shows that $M$ bends toward $z$ in spacelike directions, away from $z$ in timelike directions.

A semi-Riemannian submanifold $M$ of $\bar M$ is totally umbilic provided every point of $M$ is umbilic. Then there is a smooth normal vector field $Z$ on $M$, called the normal curvature vector field of $M$, such that

$$\mathrm{II}(V, W) = \langle V, W\rangle Z \quad\text{for all } V, W \in \mathfrak{X}(M).$$

Thus a totally geodesic submanifold is a totally umbilic submanifold for which $Z = 0$. Our earlier computation of the shape tensor of the sphere $S^n(r) \subset R^{n+1}$ shows that it is totally umbilic with $Z = -U/r$, where $U$ is the outward unit normal.

SEMI-RIEMANNIAN HYPERSURFACES

A semi-Riemannian hypersurface $M$ of $\bar M$ is just a semi-Riemannian submanifold of codimension 1. Thus the co-index of $M$, the common index of all (one-dimensional) normal spaces $T_p(M)^\perp$, must be 0 or 1. It is more efficient to separate these two cases by a sign as follows.

16. Definition. The sign $\varepsilon$ of a semi-Riemannian hypersurface $M$ of $\bar M$ is

$+1$ if the co-index of $M$ is 0, that is, $\langle z, z\rangle > 0$ for every normal vector $z \neq 0$;
$-1$ if the co-index of $M$ is 1, that is, $\langle z, z\rangle < 0$ for every normal vector $z \neq 0$.

Note that $\operatorname{ind} M = \operatorname{ind} \bar M$ if $\varepsilon = 1$, but $\operatorname{ind} M = \operatorname{ind} \bar M - 1$ if $\varepsilon = -1$. For a Riemannian manifold every hypersurface is Riemannian with sign $+1$, but in the indefinite case, sign $-1$ is as natural as $+1$.

17. Proposition. Let $c$ be a value of $f \in \mathfrak{F}(\bar M)$. Then $M = f^{-1}(c)$ is a semi-Riemannian hypersurface of $\bar M$ if and only if $\langle \operatorname{grad} f, \operatorname{grad} f\rangle$ is $> 0$ or $< 0$ on $M$. In this case the sign of $M$ is the (constant) sign of $\langle \operatorname{grad} f, \operatorname{grad} f\rangle$, and $U = \operatorname{grad} f/|\operatorname{grad} f|$ is a unit normal vector field on $M$.

Proof. Since $\operatorname{grad} f$ is metrically equivalent to the differential $df$, it follows that $M$ is a hypersurface, and the condition on $\langle \operatorname{grad} f, \operatorname{grad} f\rangle$ ensures that $M$ is semi-Riemannian. (If $M$ is connected, nonvanishing of the scalar product will suffice.) Finally $\operatorname{grad} f$ is normal to $M$, since for any $v \in T_p(M)$, $\langle \operatorname{grad} f, v\rangle = v(f) = v(f|M) = 0$, because $f$ is constant on $M$.

In $R^{n+1}$, for example, if $f = \sum (u^i)^2$ then $f^{-1}(r^2)$ is the standard sphere $S^n(r)$. Not every semi-Riemannian hypersurface $M \subset \bar M$ can be obtained from this proposition, since in general there does not exist a smooth unit normal on all of $M$: the Möbius band in $R^3$ is one example. However it is easy to show that there is always a unit normal on some neighborhood of any point of $M$. For hypersurfaces the shape tensor can be reduced to simpler tensors as follows.

18. Definition. Let $U$ be a unit normal vector field on a semi-Riemannian hypersurface $M \subset \bar M$. The (1, 1) tensor field $S$ on $M$ such that

$$\langle S(V), W\rangle = \langle \mathrm{II}(V, W), U\rangle \quad\text{for all } V, W \in \mathfrak{X}(M)$$

is called the shape operator of $M \subset \bar M$ derived from $U$.

As usual, $S$ determines a linear operator $S: T_p(M) \to T_p(M)$ at each point $p \in M$.

19. Lemma. If $S$ is the shape operator derived from $U$, then $S(v) = -\bar D_v U$, and at each point the linear operator $S$ on $T_p(M)$ is self-adjoint.

Proof. Since $\langle U, U\rangle$ is constant, $\langle \bar D_V U, U\rangle = 0$. Hence $\bar D_V U$ is tangent to $M$ for all $V \in \mathfrak{X}(M)$. But if $W \in \mathfrak{X}(M)$ then

$$\langle S(V), W\rangle = \langle \mathrm{II}(V, W), U\rangle = \langle \bar D_V W, U\rangle = -\langle \bar D_V U, W\rangle.$$

Hence $S(V) = -\bar D_V U$. The symmetry of $\mathrm{II}$ implies that $S$ is self-adjoint.

This characterization of $S$ shows how it describes the shape of $M$ in $\bar M$: $S$ measures the $\bar M$ rate of change of $U$ in all tangent directions, and since $U_p^\perp = T_p(M)$ it thereby records the turning of $T_p(M)$ in $\bar M$ as $p$ traverses $M$.

The symmetric (0, 2) tensor $B$ metrically equivalent to $S$ is traditionally called the second fundamental form of $M \subset \bar M$. (The first fundamental form is just the metric tensor of $M$.)

If a unit normal $U$, perhaps defined only locally, is replaced by $-U$ then $S$ changes sign. Thus even if $M$ does not admit a global unit normal, $S$ is globally defined up to sign. This sign ambiguity is something of a nuisance, but in intrinsic formulas it must cancel out. For hypersurfaces the Gauss equation takes the following form.

20. Corollary. Let $S$ be the shape operator of a semi-Riemannian hypersurface $M \subset \bar M$. If $v, w$ span a nondegenerate tangent plane on $M$, then

$$K(v, w) = \bar K(v, w) + \varepsilon\,\frac{\langle Sv, v\rangle\langle Sw, w\rangle - \langle Sv, w\rangle^2}{\langle v, v\rangle\langle w, w\rangle - \langle v, w\rangle^2},$$

where $\varepsilon$ is the sign of $M \subset \bar M$.

This follows immediately from Corollary 6 since $\mathrm{II}(v, w) = \varepsilon\langle Sv, w\rangle U$ and $\langle U, U\rangle = \varepsilon$.

21. Lemma. A semi-Riemannian hypersurface $M \subset \bar M$ is totally umbilic if and only if its shape operator is scalar.

Proof. Suppose first that $M$ is totally umbilic, with normal curvature vector field $Z$. Let $S$ be the shape operator derived from a unit normal $U$, perhaps only locally defined. Then

$$\langle SV, W\rangle = \langle \mathrm{II}(V, W), U\rangle = \langle V, W\rangle\langle Z, U\rangle$$

for all tangent vector fields $V, W$. Thus $SV = \langle U, Z\rangle V$ for all $V$, so $S$ is scalar.

Conversely, suppose that for every choice of $U$ the derived $S$ is scalar, that is, there is a function $k_U$ on the domain of $U$ such that $SV = k_U V$ for all $V$. Then

$$\mathrm{II}(V, W) = \varepsilon\langle SV, W\rangle U = \varepsilon k_U\langle V, W\rangle U.$$

Since $k_{-U} = -k_U$ the vector field $Z = \varepsilon k_U U$ is globally well defined and the above equation becomes $\mathrm{II}(V, W) = \langle V, W\rangle Z$.

The (sign ambiguous) function $k$ in this proof is called the normal curvature function of $M \subset \bar M$.

HYPERQUADRICS

The same methods used above to deal with spheres in $R^{n+1}$ can be applied in $R^{n+1}_\nu$ to an important family of semi-Riemannian hypersurfaces. Let $q \in \mathfrak{F}(R^{n+1}_\nu)$ as usual be the function $q(v) = \langle v, v\rangle$. Relative to natural coordinates,

$$q = \sum \varepsilon_i (u^i)^2 = -\sum_{i=1}^{\nu} (u^i)^2 + \sum_{j=\nu+1}^{n+1} (u^j)^2.$$

If $P$ is the position vector field of $R^{n+1}_\nu$, then $q = \langle P, P\rangle$. Consequently $\operatorname{grad} q = 2P$, since for all $V$, $\langle \operatorname{grad} q, V\rangle = Vq = V\langle P, P\rangle = 2\langle D_V P, P\rangle = 2\langle V, P\rangle$. Thus $\langle \operatorname{grad} q, \operatorname{grad} q\rangle = 4q$. By Proposition 17 we conclude that for $r > 0$ and $\varepsilon = \pm 1$, $Q = q^{-1}(\varepsilon r^2)$ is a semi-Riemannian hypersurface of $R^{n+1}_\nu$ with unit normal $U = P/r$ and sign $\varepsilon$. These hypersurfaces are called the (central) hyperquadrics of $R^{n+1}_\nu$. The two families $\varepsilon = 1$ and $\varepsilon = -1$ fill all of $R^{n+1}_\nu$ except the set $q^{-1}(0)$, which consists of the nullcone $\Lambda = q^{-1}(0) - \{0\}$ and the origin 0. (See Figure 2.)

22. Proposition. The nullcone $\Lambda$ of $R^{n+1}_\nu$ is a hypersurface invariant under scalar multiplication and diffeomorphic to $(R^\nu - 0) \times S^{n-\nu}$. The position vector field $P$ of $R^{n+1}_\nu$ is both tangent to $\Lambda$ and normal to $\Lambda$, hence $\Lambda$ is not semi-Riemannian.

Proof. Since $P$ is zero only at the origin 0 it follows as in the proof of Proposition 17 that $\Lambda$ is a hypersurface and $P$ is normal to $\Lambda$. $P$ is also tangent to $\Lambda$ since at $v \in \Lambda$ the vector $P_v = v_v$ is tangent to the radial null geodesic $t \to tv$ in $\Lambda$. The scalar multiplication assertion is obvious, so it remains to determine the structure of $\Lambda$ as a smooth manifold. Let $S^{n-\nu}$ be the unit sphere in $R^{n-\nu+1}$, and define a mapping $\phi$ of $(R^\nu - 0) \times S^{n-\nu}$ into $R^{n+1}_\nu$ by

$$\phi(x, p) = (x, |x|\,p).$$

Since

$$\langle \phi(x, p), \phi(x, p)\rangle = -\sum_{i=1}^{\nu} (x^i)^2 + |x|^2 = 0,$$

the image of $\phi$ lies in $\Lambda$. If $u \in \Lambda$, let $\psi(u) = (x, p) \in (R^\nu - 0) \times S^{n-\nu}$, where $x = (u^1, \ldots, u^\nu)$, $p = (u^{\nu+1}, \ldots, u^{n+1})/\lambda(u)$, and $\lambda(u) = \left[\sum_{i=1}^{\nu} (u^i)^2\right]^{1/2}$. Then $\phi$ and $\psi$ are inverse mappings, hence diffeomorphisms.
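The two mappings in this proof are easy to test numerically. The sketch below assumes the forms $\phi(x, p) = (x, |x|\,p)$ and $\psi(u) = (x, p)$ with $x$ the first $\nu$ coordinates of $u$ and $p$ the remaining coordinates divided by $|x|$; the function names and the sample data ($\nu = 2$, $n + 1 = 5$) are illustrative.

```python
import math

def q(v, nu):
    """q(v) = <v, v> in R^{n+1}_nu: the first nu coordinates count negatively."""
    return -sum(x * x for x in v[:nu]) + sum(x * x for x in v[nu:])

def phi(x, p):
    """Map (x, p) in (R^nu - 0) x S^{n-nu} to the point (x, |x| p)."""
    lam = math.sqrt(sum(c * c for c in x))
    return list(x) + [lam * c for c in p]

def psi(u, nu):
    """Candidate inverse: split u and rescale the last block to unit length."""
    x = list(u[:nu])
    lam = math.sqrt(sum(c * c for c in x))
    return x, [c / lam for c in u[nu:]]

# Illustrative data: x in R^2 - 0 and a unit vector p in R^3.
nu = 2
x = [3.0, 4.0]
p = [2.0 / 3.0, 1.0 / 3.0, 2.0 / 3.0]

u = phi(x, p)
print(u, q(u, nu))      # q(u) should vanish up to rounding
print(psi(u, nu))       # should recover (x, p) up to rounding
```

The first printed value of $q$ should be zero up to rounding, confirming that $\phi(x, p)$ lies on the nullcone, and $\psi$ should return the original pair.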

23. Definition. Let $n \ge 2$ and $0 \le \nu \le n$. Then

(1) The pseudosphere of radius $r > 0$ in $R^{n+1}_\nu$ is the hyperquadric

$$S^n_\nu(r) = q^{-1}(r^2) = \{p \in R^{n+1}_\nu : \langle p, p\rangle = r^2\},$$

with dimension $n$ and index $\nu$.

(2) The pseudohyperbolic space of radius $r > 0$ in $R^{n+1}_{\nu+1}$ is the hyperquadric

$$H^n_\nu(r) = q^{-1}(-r^2) = \{p \in R^{n+1}_{\nu+1} : \langle p, p\rangle = -r^2\},$$

with dimension $n$ and index $\nu$.

Since pseudospheres have sign $+1$, by a previous remark their index in $R^{n+1}_\nu$ is $\nu$. Pseudohyperbolic spaces have sign $-1$, thus in $R^{n+1}_\nu$ have index $\nu - 1$ (hence the shift to $R^{n+1}_{\nu+1}$ in the definition above). For $\nu = 0$, $S^n_0(r)$ is just the standard sphere $S^n(r)$ in Euclidean space $R^{n+1}_0 = R^{n+1}$.

The study of hyperquadrics is simplified by the fact that any hyperquadric is homothetic to a suitable unit pseudosphere $S^n_\nu = S^n_\nu(1)$. First, for any $r > 0$, scalar multiplication by $r$ is a (positive) homothety $R^{n+1}_\nu \to R^{n+1}_\nu$ of scale factor $r$; hence so is its restriction $S^n_\nu \to S^n_\nu(r)$. Then $S^n_\nu(r)$ is homothetic to a pseudohyperbolic space as follows.

24. Lemma. The mapping $\sigma: R^{n+1}_\nu \to R^{n+1}_{n-\nu+1}$ given by

$$\sigma(p_1, \ldots, p_{n+1}) = (p_{\nu+1}, \ldots, p_{n+1}, p_1, \ldots, p_\nu)$$

is an anti-isometry that carries each $S^n_\nu(r)$ anti-isometrically onto $H^n_{n-\nu}(r)$, and vice versa.

Proof. Since $\sigma$ is a linear isomorphism and

$$\langle \sigma(p), \sigma(p)\rangle = -\sum_{j=\nu+1}^{n+1} (p_j)^2 + \sum_{i=1}^{\nu} (p_i)^2 = -\langle p, p\rangle,$$

it follows as in the proof of Lemma 3.7 that $\sigma$ is an anti-isometry. The formula also shows that $\sigma$ carries $S^n_\nu(r)$ into $H^n_{n-\nu}(r)$, and vice versa. Thus $\sigma|S^n_\nu(r)$ is a diffeomorphism, hence an anti-isometry.
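The sign reversal in this proof can be illustrated with a small computation. The sketch below assumes the convention that the scalar product of $R^{n+1}_\mu$ counts the first $\mu$ coordinates negatively, and that $\sigma$ moves the first $\nu$ coordinates to the end; the function names and sample vectors are illustrative.

```python
def scalar_product(v, w, index):
    """Scalar product of R^{n+1}_index: first `index` coordinates are negative."""
    return sum((-a * b if i < index else a * b)
               for i, (a, b) in enumerate(zip(v, w)))

def sigma(p, nu):
    """Move the first nu coordinates of p to the end."""
    return list(p[nu:]) + list(p[:nu])

# Illustrative data in R^4_1, so the target space is R^4_3.
n_plus_1, nu = 4, 1
p = [1.0, 2.0, 3.0, 4.0]
w = [0.5, -1.0, 2.0, 0.0]

lhs = scalar_product(sigma(p, nu), sigma(w, nu), n_plus_1 - nu)
rhs = -scalar_product(p, w, nu)
print(lhs, rhs)   # the two numbers should agree

# A point with <p, p> = r^2 is sent to a point with <sigma p, sigma p> = -r^2.
print(scalar_product(p, p, nu), scalar_product(sigma(p, nu), sigma(p, nu), n_plus_1 - nu))
```

Because the check uses two different vectors, it illustrates that $\sigma$ reverses the full scalar product and not only the quadratic form $q$.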

In view of Remark 3.65 these homotheties reduce the intrinsic geometry of hyperquadrics to the case of the unit pseudosphere.

25. Lemma. The pseudosphere $S^n_\nu(r)$ is diffeomorphic to $R^\nu \times S^{n-\nu}$; the pseudohyperbolic space $H^n_\nu(r)$ is diffeomorphic to $S^\nu \times R^{n-\nu}$.

Proof. By the preceding results it suffices to deal with the unit pseudosphere $S^n_\nu$. If $x \in R^\nu$ and $p \in S^{n-\nu}$, let

$$\phi(x, p) = (x, (1 + |x|^2)^{1/2}\,p) \in R^\nu \times R^{n+1-\nu} = R^{n+1}_\nu.$$

Because

$$\langle \phi(x, p), \phi(x, p)\rangle = -|x|^2 + (1 + |x|^2) = 1,$$

$\phi$ carries $R^\nu \times S^{n-\nu}$ into $S^n_\nu$. Since $\phi$ has inverse mapping $(x, q) \to (x, (1 + |x|^2)^{-1/2}\,q)$, it is a diffeomorphism.

By convention $R^0$ is a single point; by definition $S^0$ consists of two points. Aside from spheres the only Riemannian manifolds among the hyperquadrics are the hypersurfaces $H^n_0(r)$ in Minkowski space $R^{n+1}_1$. By the lemma, $H^n_0(r)$ consists of two connected components, each diffeomorphic to $R^n$. In fact, the components are congruent under the isometry

$$(p_1, \ldots, p_{n+1}) \to (-p_1, p_2, \ldots, p_{n+1})$$

of $R^{n+1}_1$ (see Figure 2).

26. Definition. The component of $H^n_0(r)$ through $(r, 0, \ldots, 0)$ is called the upper imbedding, and the component through $(-r, 0, \ldots, 0)$ the lower imbedding, of hyperbolic $n$-space $H^n(r)$ in $R^{n+1}_1$.

Like the sphere, every hyperquadric is totally umbilic.

27. Lemma. The hyperquadric $Q = q^{-1}(\varepsilon r^2) \subset R^{n+1}_\nu$ of sign $\varepsilon$ is totally umbilic, with shape operator $S = -I/r$ derived from the outward unit normal $P/r$.

Proof. If $V \in \mathfrak{X}(Q)$, then $S(V) = -\bar D_V(P/r) = -V/r$.

It follows that the normal curvature vector field of $Q$ is $Z = -(\varepsilon/r)U$. Since $U$ is outward (away from the origin), $Z$ is inward on pseudospheres and outward on pseudohyperbolic spaces. This is reasonable enough on the Riemannian hyperquadrics $S^n(r)$ and $H^n(r)$ since the sphere bends inward at all points, while hyperbolic space bends outward (Figure 2). But in the indefinite case, where $\mathrm{II}(u, u) = \langle u, u\rangle Z$, pseudospheres bend inward in spacelike directions but outward in timelike directions (Figure 2) while on pseudohyperbolic spaces this bending pattern is reversed.

The following proof will show that, just as for the sphere, the geodesics of any hyperquadric $Q \subset R^{n+1}_\nu$ are the curves sliced from $Q$ by planes $\Pi$ through the origin of $R^{n+1}_\nu$. ($\Pi \cap Q$ may have two components instead of one as for the sphere.) First consider pseudospheres.

112 4

Semi- Riemannian Submanifolds

28. Proposition. Let y be a nonconstant geodesic of S:(r) c R;". (1)

If y is timelike it is a parametrization of one branch of a hyperbola in

Rt+l.

(2) If y is null, it is a straight line, that is, a geodesic of Rt". (3) If y is spacelike it is a periodic parametrization of an ellipse in R;' (see Figure 3).

In particular S:(r) is complete.

Timelike geodesic

Null geodesic Spacelike geodesic

Null geodesic

Figure 3

Proof: Let $p \in S = S^n_\nu(r)$ and let $\Pi$ be a plane in $\mathbf{R}^{n+1}_\nu$ through the origin $0$ and $p$. If $g$ is the scalar product of $\mathbf{R}^{n+1}_\nu$, then since $p$ is spacelike, there are three possibilities for the restriction of $g$ to $\Pi$.

Case 1. $g|\Pi$ is positive definite. Then $S \cap \Pi$ is a circle in $\Pi \approx \mathbf{R}^2$. In fact, if $e_1, e_2$ is an orthonormal basis for $\Pi$, then a point $ae_1 + be_2$ of $\Pi$ is in $S = \{v \in \mathbf{R}^{n+1}_\nu : \langle v, v\rangle = r^2\}$ if and only if $a^2 + b^2 = r^2$. Thus $\alpha(t) = r\cos t\, e_1 + r\sin t\, e_2$ is a constant speed parametrization of $\Pi \cap S$. In fact, $\langle \alpha', \alpha'\rangle = r^2$, so $\alpha$ is spacelike. Furthermore $\alpha'' = -P_\alpha$, so $\alpha''$ is normal to $S$; hence $\alpha$ is a geodesic of $S$.

Case 2. $g|\Pi$ is nondegenerate, with index 1. Let $e_0, e_1$ be an orthonormal basis for $\Pi$ such that $p = re_1$, so $e_0$ is timelike. A point $ae_0 + be_1$ of $\Pi$ is in $S$ if and only if $-a^2 + b^2 = r^2$. Thus $\Pi \cap S$ consists of the two branches of a hyperbola in $\Pi \approx \mathbf{R}^2_1$. The branch through $p$ has constant speed parametrization

$\alpha(t) = r\sinh t\, e_0 + r\cosh t\, e_1 \quad (t \in \mathbf{R}).$


Then $\langle \alpha', \alpha'\rangle = -r^2\cosh^2 t + r^2\sinh^2 t = -r^2$, hence $\alpha$ is timelike. Since $\alpha'' = P_\alpha$, $\alpha''$ is normal to $S$; hence $\alpha$ is a geodesic of $S$.

Case 3. $g|\Pi$ is degenerate with nullspace of dimension 1. Thus if $v \neq 0$ is in the nullspace, $v$ is in particular a null vector and $p, v$ is a basis for $\Pi$. A point $ap + bv$ of $\Pi$ is in $S$ if and only if $a^2 = 1$, that is, $a = \pm 1$. Thus $\Pi \cap S$ consists of two parallel straight lines. The one through $p$ is parametrized by $\alpha(t) = p + tv$. Since $\alpha$ is a geodesic of $\mathbf{R}^{n+1}_\nu$ lying in $S$, it is a geodesic of $S$. Furthermore $\alpha'(0) = v_p$, so $\alpha$ is a null geodesic.

A proof such as the one for the sphere shows that every geodesic of $S$ is a reparametrization of one of the three types above. ∎
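The three cases in the proof can be checked numerically. The following is a minimal sketch, not part of the text: it assumes the concrete pseudosphere $S^2_1(r) \subset \mathbf{R}^3_1$ with signature $(-, +, +)$ and radius $r = 2$, and the helper names (`g`, `report`, and the three curve functions) are hypothetical. For each curve it reports how far the curve is from $S$, the value of $\langle \alpha', \alpha'\rangle$, and the size of the part of $\alpha''$ tangent to $S$, which should vanish for a geodesic.

```python
import math

def g(u, w):
    """Scalar product of R^3_1, taken here with signature (-, +, +)."""
    return -u[0] * w[0] + u[1] * w[1] + u[2] * w[2]

R = 2.0  # radius r of the pseudosphere (an arbitrary choice for this sketch)

def spacelike(t):
    """Case 1: alpha(t) = r cos t e1 + r sin t e2. Returns (alpha, alpha', alpha'')."""
    a = (0.0, R * math.cos(t), R * math.sin(t))
    v = (0.0, -R * math.sin(t), R * math.cos(t))
    acc = (0.0, -a[1], -a[2])          # alpha'' = -alpha
    return a, v, acc

def timelike(t):
    """Case 2: alpha(t) = r sinh t e0 + r cosh t e1. Returns (alpha, alpha', alpha'')."""
    a = (R * math.sinh(t), R * math.cosh(t), 0.0)
    v = (R * math.cosh(t), R * math.sinh(t), 0.0)
    return a, v, a                      # alpha'' = alpha

def null(t):
    """Case 3: alpha(t) = p + t v with p = r e1 and v = e0 + e2 a null vector orthogonal to p."""
    p = (0.0, R, 0.0)
    n = (1.0, 0.0, 1.0)
    a = tuple(p[i] + t * n[i] for i in range(3))
    return a, n, (0.0, 0.0, 0.0)        # straight line: alpha'' = 0

def report(curve, t):
    """Return (<a,a> - r^2, <a',a'>, size of the part of a'' tangent to S)."""
    a, v, acc = curve(t)
    # The normal line to S at a is spanned by the position vector a, so the
    # tangential part of a'' is a'' - (<a'',a>/<a,a>) a.
    coeff = g(acc, a) / g(a, a)
    tangential = [acc[i] - coeff * a[i] for i in range(3)]
    return g(a, a) - R * R, g(v, v), max(abs(c) for c in tangential)

for name, curve in (("spacelike", spacelike), ("timelike", timelike), ("null", null)):
    for t in (-1.0, 0.0, 0.7):
        print(name, t, report(curve, t))
```

Up to rounding, each curve stays on $S$, the three values of $\langle \alpha', \alpha'\rangle$ are $r^2$, $-r^2$, and $0$, and the tangential acceleration is zero in every case.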

The geodesics of pseudohyperbolic spaces can now easily be derived. Since the mapping $\sigma$ in Lemma 24 is an anti-isometry, it follows that the preceding proposition holds also for $H^n_\nu(r)$ provided the words spacelike and timelike are reversed. Thus in particular, every geodesic $\gamma$ of hyperbolic space $H^n(r)$ is one-to-one, since $\gamma$ is spacelike in $\mathbf{R}^{n+1}_1$.

Substituting from Lemma 27 into Corollary 20 shows that hyperquadrics have constant curvature.

29. Proposition. Let $n \ge 2$ and $0 \le \nu \le n$.

(1) The pseudosphere $S^n_\nu(r)$ is a complete semi-Riemannian manifold of constant positive curvature $K = 1/r^2$.

(2) The pseudohyperbolic space $H^n_\nu(r)$ is a complete semi-Riemannian manifold of constant negative curvature $K = -1/r^2$.

In the Riemannian case the hyperbolic space $H^n(r)$ contrasts sharply with the sphere $S^n(r)$. Where the sphere is compact, with periodic geodesics and positive curvature, $H^n(r)$ is noncompact (indeed diffeomorphic to $\mathbf{R}^n$), with one-to-one geodesics and negative curvature.

Intuitively speaking, all points on the unit sphere $S^2$ in $\mathbf{R}^3$, and all directions, are geometrically the same. Figure 2 may seem to suggest that this uniformity fails for pseudospheres, but in fact it holds in the following strong form.

30. Proposition. Let $e_1, \ldots, e_n$ and $f_1, \ldots, f_n$ be (tangent) frames on $S^n_\nu(r)$ at points $p$ and $q$, respectively. Then there is a unique isometry $\phi: \mathbf{R}^{n+1}_\nu \to \mathbf{R}^{n+1}_\nu$ carrying $S^n_\nu(r)$ isometrically to itself, with $\phi(p) = q$ and $d\phi(e_i) = f_i$ for $1 \le i \le n$.

Proof: The position vector $P_p$ at $p \in S = S^n_\nu(r)$ is normal to $S$, hence orthogonal to each $e_i$. Thus if $\bar e_i$ denotes the element of $\mathbf{R}^{n+1}_\nu$ canonically corresponding to $e_i$, then $\bar e_1, \ldots, \bar e_n, p/r$ is an orthonormal basis for $\mathbf{R}^{n+1}_\nu$. The same holds for the other frame.


The proof of Lemma 2.27 shows that there is a unique linear isometry $\phi: \mathbf{R}^{n+1}_\nu \to \mathbf{R}^{n+1}_\nu$ such that $\phi(\bar e_i) = \bar f_i$ for all $i$, and $\phi(p/r) = q/r$, hence $\phi(p) = q$. By Lemma 3.7, $\phi$ is an isometry of $\mathbf{R}^{n+1}_\nu$. It is clear from the definition of $S = S^n_\nu(r)$ that $\phi(S) = S$. Since $S$ is a semi-Riemannian submanifold, $\phi$ restricts to an isometry $S \to S$. Since $\phi$ is linear on $\mathbf{R}^{n+1}_\nu$, it differs only by canonical isomorphism from its differential map. Thus $d\phi(e_i) = f_i$ for all $i$.

To show that $\phi$ is unique, let $\psi$ be another isometry of $\mathbf{R}^{n+1}_\nu$ with the required properties. Clearly $d\phi$ and $d\psi$ agree on $T_p(S)$. Since $\phi$ and $\psi$ are pair isometries, it follows from Lemma 7 that $d\phi$ and $d\psi$ preserve normal curvature vectors. Thus they agree on $T_p(\mathbf{R}^{n+1}_\nu)$. Proposition 3.62 then gives $\phi = \psi$. ∎

Evidently the same result holds for pseudohyperbolic spaces.

THE CODAZZI EQUATION

For $M \subset \bar M$, if the geometry of $M$ is considered to be that of vectors tangent to $M$, then there is an analogous geometry of vectors normal to $M$.

31. Definition. The normal connection of $M \subset \bar M$ is the function $D^\perp: \mathfrak{X}(M) \times \mathfrak{X}(M)^\perp \to \mathfrak{X}(M)^\perp$ given by

$D^\perp_V Z = \mathrm{nor}\, \bar D_V Z$ for $V \in \mathfrak{X}(M)$, $Z \in \mathfrak{X}(M)^\perp$.

$D^\perp_V Z$ is called the normal covariant derivative of $Z$ with respect to $V$. Evidently it registers the strictly normal rate of change of $Z$, when $p$ moves as prescribed by $V$. The following properties are immediate:

(1) $D^\perp_V Z$ is $\mathfrak{F}(M)$-linear in $V$ and $\mathbf{R}$-linear in $Z$.

(2) $D^\perp_V(fZ) = f D^\perp_V Z + (Vf)Z$, where $f \in \mathfrak{F}(M)$.

(3) $V\langle Y, Z\rangle = \langle D^\perp_V Y, Z\rangle + \langle Y, D^\perp_V Z\rangle$, where $Y, Z \in \mathfrak{X}(M)^\perp$.

Deeper results about semi-Riemannian submanifolds often require information about the rate of change of the shape tensor. To avoid tensor complications we use a standard definition modeled on the product rules in Chapter 2.

32. Definition. Let $\mathrm{II}$ be the shape tensor of $M \subset \bar M$. If $V, X, Y \in \mathfrak{X}(M)$, let

$(\nabla_V \mathrm{II})(X, Y) = D^\perp_V(\mathrm{II}(X, Y)) - \mathrm{II}(D_V X, Y) - \mathrm{II}(X, D_V Y).$

It is easily verified that, like $\mathrm{II}$ itself, $\nabla_V \mathrm{II}$ is a symmetric $\mathfrak{F}(M)$-bilinear function $\mathfrak{X}(M) \times \mathfrak{X}(M) \to \mathfrak{X}(M)^\perp$.


The Gauss equation describes $\tan \bar R_{VW}X$, where $V, W, X$ are all tangent to $M$, in terms of the shape tensor. The following analogue for $\mathrm{nor}\, \bar R_{VW}X$ is called the Codazzi equation.

33. Proposition. Let $M$ be a semi-Riemannian submanifold of $\bar M$. If $V, W, X \in \mathfrak{X}(M)$, then

$\mathrm{nor}\, \bar R_{VW}X = -(\nabla_V \mathrm{II})(W, X) + (\nabla_W \mathrm{II})(V, X).$

Proof: As usual we can assume $[V, W] = 0$. Then $\mathrm{nor}\, \bar R_{VW}X = -(VW) + (WV)$, where $(VW) = \mathrm{nor}\, \bar D_V \bar D_W X$. Now

$(VW) = \mathrm{nor}\, \bar D_V(D_W X) + \mathrm{nor}\, \bar D_V(\mathrm{II}(W, X)).$

The first term here is just $\mathrm{II}(V, D_W X)$. The second is, by definition, $D^\perp_V(\mathrm{II}(W, X))$; hence reading from the definition of $\nabla_V \mathrm{II}$ gives

$(VW) = \mathrm{II}(V, D_W X) + (\nabla_V \mathrm{II})(W, X) + \mathrm{II}(D_V W, X) + \mathrm{II}(W, D_V X).$

Since $D_V W - D_W V = [V, W] = 0$, cancellations in $-(VW) + (WV)$ give the required result. ∎

A vector field $Z \in \mathfrak{X}(M)^\perp$ is normal parallel provided $D^\perp_V Z = 0$ for all $V \in \mathfrak{X}(M)$.

34. Corollary. Let $\bar M$ have constant curvature.

(1) The Codazzi equation for $M \subset \bar M$ becomes

$(\nabla_V \mathrm{II})(W, X) = (\nabla_W \mathrm{II})(V, X)$ for all $V, W, X \in \mathfrak{X}(M)$.

(2) If $M$ is a hypersurface in $\bar M$ with shape operator $S$, then

$(D_V S)(W) = (D_W S)(V)$ for all $V, W \in \mathfrak{X}(M)$.

Proof. (1) By the formula in Corollary 3.43, $\bar R_{VW}X$ is tangent to $M$, hence $\mathrm{nor}\, \bar R_{VW}X = 0$.

(2) Suppose $S$ derives from the unit normal $U$. Note that $U$ is normal parallel, that is, $D^\perp_X U = 0$ for all $X \in \mathfrak{X}(M)$. In fact, $\langle U, U\rangle = \text{const} \Rightarrow \bar D_X U \perp U \Rightarrow \bar D_X U$ tangent to $M \Rightarrow D^\perp_X U = 0$. As usual in a tensor proof we can suppose that the vector fields $V, W, X \in \mathfrak{X}(M)$ have all $M$ covariant derivatives zero at a point $p$. There

$\langle (D_V S)(W), X\rangle = \langle D_V(SW), X\rangle = V\langle SW, X\rangle = V\langle \mathrm{II}(W, X), U\rangle = \langle D^\perp_V(\mathrm{II}(W, X)), U\rangle = \langle (\nabla_V \mathrm{II})(W, X), U\rangle.$

The result then follows from (1). ∎


TOTALLY UMBILIC HYPERSURFACES

As an application of the methods of this chapter we will find all connected totally umbilic hypersurfaces of semi-Euclidean space $\mathbf{R}^n_\nu$. We have already seen that every connected totally geodesic hypersurface of $\mathbf{R}^n_\nu$ is an open set in a nondegenerate hyperplane. Thus it remains to determine the totally umbilic hypersurfaces that are not totally geodesic. This class includes, of course, all hyperquadrics. The essential step is the following simple application of the Codazzi equation.

35. Lemma. Let $M \subset \bar M$ be a connected semi-Riemannian hypersurface of sign $\varepsilon$ and dimension $\ge 2$. If $M$ is totally umbilic and $\bar M$ has constant curvature $\bar C$, then the normal curvature $k$ is constant (mod sign) and $M$ has constant curvature $\bar C + \varepsilon k^2$.

Proof: The curvature assertion is clear from the Gauss equation once $k$ is known to be constant. Since $M$ is connected, it suffices to show that $v(k) = 0$ for every tangent vector $v$ to $M$. If $v \in T_p(M)$, choose $V, W \in \mathfrak{X}(M)$ so that $V_p = v$ and $W_p$ is independent of $V_p$. Since $S = kI$,

$(D_V S)(W) = D_V(SW) - S(D_V W) = D_V(kW) - kD_V W = (Vk)W.$

By the choice of $V$ and $W$ it follows from Corollary 34 that $v(k) = (Vk)(p) = 0$. ∎
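As a consistency check, Lemma 35 can be applied to the hyperquadrics themselves. The short worked computation below is a sketch assembled from Lemma 27 and Proposition 29, not an additional result: on a hyperquadric $Q$ of sign $\varepsilon$ the shape operator is $S = -I/r$, so $k = -1/r$, and the ambient semi-Euclidean space is flat.

```latex
% Lemma 27: S = -I/r on Q, so the normal curvature is k = -1/r (constant),
% and the ambient space R^{n+1}_nu is flat: \bar C = 0.
\[
  K \;=\; \bar C + \varepsilon k^{2}
    \;=\; 0 + \varepsilon\Bigl(-\frac{1}{r}\Bigr)^{2}
    \;=\; \frac{\varepsilon}{r^{2}} .
\]
% \varepsilon = +1 on the pseudosphere S^n_\nu(r):              K =  1/r^2
% \varepsilon = -1 on the pseudohyperbolic space H^n_\nu(r):    K = -1/r^2
```

Both values agree with Proposition 29.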

For example, a flat totally umbilic hypersurface in $\mathbf{R}^n_\nu$ is totally geodesic.

A central hyperquadric in $\mathbf{R}^n_\nu$, given by $\langle p, p\rangle = c \neq 0$, is carried by the translation $p \to p + x_0$ to a hyperquadric centered at $x_0$, given by $\langle p - x_0, p - x_0\rangle = c$. Evidently the translation is a pair isometry; hence, in particular, all hyperquadrics are totally umbilic.

A semi-Riemannian hypersurface of a two-dimensional manifold is trivially totally umbilic, so we deal with $\mathbf{R}^n_\nu$ only for $n \ge 3$.

36. Proposition. If $M$ is a connected semi-Riemannian hypersurface of $\mathbf{R}^n_\nu$, $n \ge 3$, that is totally umbilic but not totally geodesic, then $M$ is an open set in a hyperquadric. Hence if $M$ is also complete, $M$ is a component of a hyperquadric. (This reference to components is operative only in the two nonconnected cases $H^{n-1}_0$ and $S^{n-1}_{n-1}$.)

Proof. If $U$ is a locally defined unit normal on $M$, then $S = kI$, where by the preceding lemma, $k$ is constant. The normal vector field $U/k = (-U)/(-k)$ is well defined on all of $M$.


Identify the tangent spaces of $\mathbf{R}^n_\nu$ with $\mathbf{R}^n_\nu$ itself by canonical isomorphism, and define a mapping $\phi: M \to \mathbf{R}^n_\nu$ by

$\phi(p) = p + U_p/k.$

The goal is to show that $\phi$ is a constant map. Since $M$ is connected, it suffices to show that each differential map of $\phi$ is zero. With the identifications above we assert that for any tangent vector $v$ to $M$,

$d\phi(v) = v + (\bar D_v U)/k.$

It suffices to show that both sides have the same components relative to the natural coordinates $u^1, \ldots, u^n$ of $\mathbf{R}^n_\nu$. If $U = \sum f^i \partial_i$, then

$u^i(\phi(p)) = u^i(p) + f^i(p)/k$ for all $p$,

hence

$(d\phi(v))u^i = v(u^i \circ \phi) = vu^i + vf^i/k.$

Since $\bar D_v U = \sum vf^i\, \partial_i$, the assertion is true. Now $\bar D_v U = -S(v) = -kv$, hence $d\phi(v) = v + (-kv)/k = 0$. Thus $\phi$ is a constant map, say to the point $x_0$ of $\mathbf{R}^n_\nu$. But then for every point $p$ of $M$, $p - x_0 = -U_p/k$, hence $\langle p - x_0, p - x_0\rangle = \varepsilon/k^2$. Thus $M$ is contained in this hyperquadric $Q$. Since $M$ is connected, it is contained in a single component $C$ of $Q$. By Exercise 1.8, $M$ is an open submanifold of $C$. If $M$ is complete, then $M = C$, by Exercise 3.7. ∎
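The map $\phi(p) = p + U_p/k$ from the proof can be illustrated on the simplest example. The following sketch assumes an ordinary round sphere in Euclidean $\mathbf{R}^3$ with a hypothetical center and radius; by Lemma 27 the outward unit normal gives $S = -I/r$, so $k = -1/r$. The function names are illustrative only.

```python
import math

CENTER = (1.0, -2.0, 0.5)   # hypothetical center x0
RADIUS = 3.0                # hypothetical radius r

def sphere_point(theta, angle):
    """Point of the sphere of radius RADIUS about CENTER, in spherical coordinates."""
    return (CENTER[0] + RADIUS * math.sin(theta) * math.cos(angle),
            CENTER[1] + RADIUS * math.sin(theta) * math.sin(angle),
            CENTER[2] + RADIUS * math.cos(theta))

def unit_normal(p):
    """Outward unit normal U_p = (p - x0)/r."""
    return tuple((p[i] - CENTER[i]) / RADIUS for i in range(3))

def phi(p):
    """phi(p) = p + U_p / k with k = -1/r (shape operator S = kI for the outward normal)."""
    k = -1.0 / RADIUS
    u = unit_normal(p)
    return tuple(p[i] + u[i] / k for i in range(3))

# Largest deviation of phi(p) from the center over a grid of sample points.
deviation = 0.0
for i in range(1, 10):
    for j in range(12):
        p = sphere_point(i * math.pi / 10, j * math.pi / 6)
        image = phi(p)
        deviation = max(deviation, max(abs(image[m] - CENTER[m]) for m in range(3)))

print("max |phi(p) - x0| =", deviation)
```

Up to rounding, $\phi$ sends every sample point to the center $x_0$, and $\langle p - x_0, p - x_0\rangle = \varepsilon/k^2 = r^2$ with $\varepsilon = 1$, as in the proof.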

Thus the complete, connected, totally umbilic hypersurfaces of $\mathbf{R}^n_\nu$ ($n \ge 3$) are exactly the nondegenerate hyperplanes of $\mathbf{R}^n_\nu$ and the components of hyperquadrics. On the basis of this extrinsic information it turns out to be easy to characterize intrinsically the semi-Riemannian hypersurfaces of $\mathbf{R}^n_\nu$ that have constant nonzero curvature.

37. Proposition. If $M$ is a connected semi-Riemannian hypersurface of $\mathbf{R}^n_\nu$ ($n \ge 4$) with constant curvature $C \neq 0$, then $M$ is an open set in a hyperquadric. Hence if $M$ is complete, $M$ is a component of a hyperquadric.

This follows immediately from Proposition 36 and the following algebraic result.

38. Lemma. Let $M$ be a semi-Riemannian hypersurface of $\bar M$. If $M$ and $\bar M$ have constant curvatures $C \neq \bar C$ and $\dim M \ge 3$, then $M$ is totally umbilic.


Proof: Let $A = \varepsilon(C - \bar C) \neq 0$, where $\varepsilon$ is the sign of $M$. By the Gauss equation

(a) $\langle Sv, v\rangle\langle Sw, w\rangle - \langle Sv, w\rangle^2 = A[\langle v, v\rangle\langle w, w\rangle - \langle v, w\rangle^2],$

whenever $v$ and $w$ span a nondegenerate tangent plane to $M$. By Lemma 3.40 the relation is valid for arbitrary $v, w$.

At each point $p \in M$ the shape operator $S$ on $T_p(M)$ is invertible. In fact, if $v \neq 0$, then by an assertion in the proof of Lemma 3.40, there is a vector $w \in T_p(M)$ such that $\langle v, v\rangle\langle w, w\rangle - \langle v, w\rangle^2 \neq 0$. Hence by (a), $Sv \neq 0$.

We assert that for all $v, w, x, y$ in $T_p(M)$,

(b) $\langle Sv, x\rangle\langle Sw, y\rangle - \langle Sv, y\rangle\langle Sw, x\rangle = A[\langle v, x\rangle\langle w, y\rangle - \langle v, y\rangle\langle w, x\rangle].$

To prove this, note that each side defines a curvaturelike function. Thus Corollary 3.42 shows that (a) implies (b).

Since $\dim T_p(M) \ge 3$, there is a vector $y \neq 0$ orthogonal to both $v$ and $Sv$. Thus (b) becomes

$\langle Sv, x\rangle\langle Sw, y\rangle = A\langle v, x\rangle\langle w, y\rangle$ for all $v, w, x$.

Because $S$ is invertible, the image of $S$ is not contained in $y^\perp$. Hence there is a vector $w$ such that $\langle Sw, y\rangle \neq 0$. Thus for some constant $k$, we have $\langle Sv, x\rangle = k\langle v, x\rangle$ for all $v, x$. Hence $S$ is scalar. ∎

In Proposition 37, if $C = \bar C$ the shape of the submanifold $M$ cannot always be specified.

THE NORMAL CONNECTION

For semi-Riemannian submanifolds of codimension greater than one the normal connection $D^\perp$ becomes more important. Just as the shape tensor of $M \subset \bar M$ measures the difference between $D$ and $\bar D$, it also measures the difference between $D^\perp$ and $\bar D$, but in a different formulation, as follows.

39. Remarks. The tensor $\widetilde{\mathrm{II}}$ of $M \subset \bar M$.

(1) If $M \subset \bar M$, then for $V \in \mathfrak{X}(M)$ and $Z \in \mathfrak{X}(M)^\perp$ define $\widetilde{\mathrm{II}}(V, Z) = \tan \bar D_V Z$. Thus

$\bar D_V Z = \widetilde{\mathrm{II}}(V, Z) + D^\perp_V Z,$

where the first term is tangent to $M$ and the second is normal to $M$. It is easy to check that the function $\widetilde{\mathrm{II}}: \mathfrak{X}(M) \times \mathfrak{X}(M)^\perp \to \mathfrak{X}(M)$ is $\mathfrak{F}(M)$-bilinear. Hence at each point $p \in M$, $\widetilde{\mathrm{II}}$ gives an $\mathbf{R}$-bilinear map $T_p(M) \times T_p(M)^\perp \to T_p(M)$.

(2) The tensor $\widetilde{\mathrm{II}}$ contains no new information, since

$\langle \widetilde{\mathrm{II}}(V, Z), W\rangle = -\langle \mathrm{II}(V, W), Z\rangle$

for all $V, W \in \mathfrak{X}(M)$ and $Z \in \mathfrak{X}(M)^\perp$. This identity is derived by differentiating $\langle Z, W\rangle = 0$ to get $\langle \bar D_V Z, W\rangle = -\langle Z, \bar D_V W\rangle$.

(3) When a particular $Z \in \mathfrak{X}(M)^\perp$ is important, the notation $S_Z V = -\widetilde{\mathrm{II}}(V, Z)$ is sometimes convenient. Since

$\langle S_Z V, W\rangle = \langle \mathrm{II}(V, W), Z\rangle,$

$S_Z$ gives at each point $p \in M$ a self-adjoint linear operator on $T_p(M)$. (Compare [KN] where $\mathrm{II}$ is $\alpha$, and $S_Z$ is $A_\xi$.) In the case of a unit normal to a hypersurface, $S_Z$ is consistent with Definition 18, so we call it a shape operator.

The normal connection $D^\perp$ is adapted as follows to normal vector fields on curves in $M \subset \bar M$. If $Y$ is a vector field on $\alpha$ always normal to $M$, then its normal covariant derivative $Y' = D^\perp Y/ds$ is defined to be the normal component of its $\bar M$ covariant derivative $\dot Y = \bar D Y/ds$. The properties in Proposition 3.18 imply at once the analogous properties for $Y'$. Corresponding to Proposition 8 is

$\dot Y = \widetilde{\mathrm{II}}(\alpha', Y) + Y',$

where the first term is tangent to $M$ and the second is normal to $M$. If $Y' = 0$, $Y$ is said to be normal parallel.

40. Lemma. Let $\alpha$ be a curve in $M \subset \bar M$. If $y$ is a vector normal to $M$ at $\alpha(a)$, there is a unique normal parallel vector field $Y$ on $\alpha$ such that $Y(a) = y$.

Proof: Using adapted coordinate systems and the projection nor it is easy to find, on subintervals of $\alpha$, normal vector fields $E_1, \ldots, E_k$ that at each point give a basis for the normal space. With somewhat more work these can be patched together smoothly to give such fields defined on the entire curve $\alpha$. Write $Y = \sum f_i E_i$, $y = \sum c_i E_i|_{\alpha(a)}$, and $E_i' = \sum_j h_{ji}E_j$ ($1 \le i \le k$). Then $Y' = \sum_i (f_i' + \sum_j h_{ij}f_j)E_i$. Let $f_1, \ldots, f_k$ be the unique solutions of the linear differential equations $f_i' + \sum_j h_{ij}f_j = 0$ such that $f_i(a) = c_i$. The solutions are defined on the entire interval $I$, hence $Y = \sum f_i E_i$ is the required vector field. ∎
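The linear system $f_i' + \sum_j h_{ij}f_j = 0$ in the proof can be integrated numerically. The sketch below is illustrative only: it assumes a two-dimensional normal space with positive definite metric and an orthonormal normal frame $E_1, E_2$, in which case the coefficient matrix $(h_{ij})$ is skew-symmetric; the particular function $\omega(s) = 1 + s$ and the names `h` and `transport` are hypothetical. A classical fourth-order Runge-Kutta step stands in for the existence theorem quoted in the proof.

```python
import math

def h(s):
    """Hypothetical coefficient matrix h_ij(s); skew-symmetric because the
    frame E_1, E_2 is assumed orthonormal in a positive definite normal space."""
    omega = 1.0 + s
    return [[0.0, -omega],
            [omega, 0.0]]

def transport(h, c, a, b, steps=1000):
    """Solve f_i' + sum_j h_ij f_j = 0 with f(a) = c by classical RK4; return f(b)."""
    k = len(c)

    def rhs(s, f):
        H = h(s)
        return [-sum(H[i][j] * f[j] for j in range(k)) for i in range(k)]

    f = list(c)
    s = a
    dt = (b - a) / steps
    for _ in range(steps):
        k1 = rhs(s, f)
        k2 = rhs(s + dt / 2, [f[i] + dt / 2 * k1[i] for i in range(k)])
        k3 = rhs(s + dt / 2, [f[i] + dt / 2 * k2[i] for i in range(k)])
        k4 = rhs(s + dt, [f[i] + dt * k3[i] for i in range(k)])
        f = [f[i] + dt / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(k)]
        s += dt
    return f

# Components of the normal parallel field Y at s = 1, starting from y = E_1 at s = 0.
f_end = transport(h, [1.0, 0.0], 0.0, 1.0)
norm_end = math.hypot(f_end[0], f_end[1])
print(f_end, norm_end)
```

For this choice of $h$ the exact solution is $f = (\cos\theta, -\sin\theta)$ with $\theta(s) = s + s^2/2$, so the computed components keep unit length, in line with the statement that normal parallel translation is a linear isometry.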

In the notation of the lemma, $Y(b)$ is the normal parallel translate of $y$, and a proof like that for Lemma 3.20 shows that normal parallel translation $y \to Y(b)$ is a linear isometry $P: T_{\alpha(a)}(M)^\perp \to T_{\alpha(b)}(M)^\perp$. Hence normal parallel translation of a normal frame gives a normal parallel frame field on $\alpha$. A proof like that of Lemma 7 shows that the normal connection is preserved by pair isometries, hence $D^\perp$ and its subsidiary notions above belong to the extrinsic geometry of $M \subset \bar M$.


A CONGRUENCE THEOREM

As in elementary Euclidean geometry, two objects in a semi-Riemannian manifold $\bar M$ are congruent if there is an isometry of $\bar M$ carrying one to the other. In particular, submanifolds $M_1$ and $M_2$ of $\bar M$ are congruent provided there is an isometry $\psi$ of $\bar M$ such that $\psi|M_1$ is an isometry from $M_1$ to $M_2$. Then by definition $M_1$ and $M_2$ have the same intrinsic and extrinsic geometry. Evidently congruence is a useful notion only when $\bar M$ has many isometries. Taking $\bar M = \mathbf{R}^n_\nu$ we prove that isometric submanifolds are congruent if and only if they have "the same shape tensor."

41. Theorem. Let $\phi: M_1 \to M_2$ be an isometry of connected semi-Riemannian submanifolds of $\mathbf{R}^n_\nu$. There is an isometry $\psi$ of $\mathbf{R}^n_\nu$ such that $\psi|M_1 = \phi$ if and only if at a point $o \in M_1$ there is a linear isometry $F_o: T_o(M_1)^\perp \to T_{\phi o}(M_2)^\perp$ with this property: If $\alpha$ is any curve in $M_1$ starting at $o$, then for each $s$ the linear isometry

$F_{\alpha(s)} = P_{\phi\alpha} \circ F_o \circ P_\alpha^{-1} : T_{\alpha(s)}(M_1)^\perp \to T_{\phi\alpha(s)}(M_2)^\perp$

($P$ normal parallel translation) preserves shape tensors, that is,

$F_{\alpha(s)}(\mathrm{II}_1(v, w)) = \mathrm{II}_2(d\phi\, v, d\phi\, w)$ for all $v, w \in T_{\alpha(s)}(M_1)$.

Proof: The condition on shape tensors is necessary since we have seen that normal parallel translation and shape tensors belong to extrinsic geometry. To prove sufficiency, let $\alpha$ be a curve as above.

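(a) If $Z$ is a vector field on $\alpha$ normal to $M_1$, then $(F_\alpha Z)(s) = F_{\alpha(s)}Z(s)$ defines a vector field on $\phi \circ \alpha$ normal to $M_2$. The definition of $F_{\alpha(s)}$ shows that if $Z$ is normal parallel then so is $F_\alpha Z$. By expressing an arbitrary $Z$ in terms of a normal parallel frame field on $\alpha$ it follows that $(F_\alpha Z)' = F_\alpha(Z')$.

(b) Since $F_\alpha$ preserves $\mathrm{II}$, it also preserves $\widetilde{\mathrm{II}}$, that is,

$d\phi(\widetilde{\mathrm{II}}_1(V, Z)) = \widetilde{\mathrm{II}}_2(d\phi\, V, F_\alpha Z),$

where $V$ and $Z$ are vector fields on $\alpha$ tangent and normal, respectively, to $M_1$. To check this, take the scalar product with $d\phi(W)$ of both sides of the $\mathrm{II}$ equation in the theorem and use Remark 39(2).

(c) For each $s$, $d\phi + F_{\alpha(s)}$ is a well-defined linear isometry from $T_{\alpha(s)}(\mathbf{R}^n_\nu)$ to $T_{\phi\alpha(s)}(\mathbf{R}^n_\nu)$. If $X$ is an $\mathbf{R}^n_\nu$ vector field on $\alpha$, $(d\phi + F_\alpha)X$ is an $\mathbf{R}^n_\nu$ vector field on $\phi \circ \alpha$. We assert that if $X$ is $\mathbf{R}^n_\nu$ parallel, so is $(d\phi + F_\alpha)X$.

Let $\dot X$ denote the $\mathbf{R}^n_\nu$ covariant derivative along $\alpha$; then $0 = \dot X = (\tan X)^{\boldsymbol{\cdot}} + (\mathrm{nor}\, X)^{\boldsymbol{\cdot}}$ gives

$(\tan X)' + \widetilde{\mathrm{II}}_1(\alpha', \mathrm{nor}\, X) = 0, \qquad \mathrm{II}_1(\alpha', \tan X) + (\mathrm{nor}\, X)' = 0.$

Now

$((d\phi + F_\alpha)X)^{\boldsymbol{\cdot}} = (d\phi \tan X)^{\boldsymbol{\cdot}} + (F_\alpha\, \mathrm{nor}\, X)^{\boldsymbol{\cdot}} = (d\phi \tan X)' + \mathrm{II}_2(d\phi\, \alpha', d\phi \tan X) + \widetilde{\mathrm{II}}_2(d\phi\, \alpha', F_\alpha\, \mathrm{nor}\, X) + (F_\alpha\, \mathrm{nor}\, X)'.$

By (a), (b), and properties of isometries this becomes

$d\phi[(\tan X)' + \widetilde{\mathrm{II}}_1(\alpha', \mathrm{nor}\, X)] + F_\alpha[\mathrm{II}_1(\alpha', \tan X) + (\mathrm{nor}\, X)'].$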

The equations for $\dot X = 0$ show that this expression is zero.

(d) If curves $\alpha, \beta: I \to \mathbf{R}^n_\nu$ have $\alpha(0) = \beta(0)$, $\alpha'(0) = \beta'(0)$, and for all $s$, $\alpha''(s) \parallel \beta''(s)$ (distant parallelism), then $\alpha = \beta$. In fact, the parallelism means that the natural coordinates of $\alpha$ and $\beta$ satisfy $d^2(u^i \circ \alpha)/ds^2 = d^2(u^i \circ \beta)/ds^2$. Hence $u^i \circ \alpha - u^i \circ \beta = a_i s + b_i$ for all $i$. But the initial conditions imply $a_i = b_i = 0$.

(e) Since a pair map preserves the intrinsic and extrinsic geometry of submanifolds, we can assume without loss of generality that the point $o \in M_1$ in the theorem and its image $\phi(o) \in M_2$ are both at the origin of $\mathbf{R}^n_\nu$. In fact it suffices to translate each manifold, with corresponding obvious changes in $\phi$ and $F_o$.

(f) Let $\psi$ be the linear isometry of $\mathbf{R}^n_\nu$ canonically corresponding to $d\phi_o + F_o: T_o(\mathbf{R}^n_\nu) \to T_o(\mathbf{R}^n_\nu)$; thus $d\psi_o = d\phi_o + F_o$. We will show that $\psi|M_1 = \phi$.

If $p \in M_1$, let $\alpha$ be a curve in $M_1$ from $o$ to $p$. It suffices to prove $\phi \circ \alpha = \psi \circ \alpha$. By (e), $\phi(o) = o = \psi(o)$, and since $d\psi|T_o(M_1) = d\phi_o$ it follows that the two curves have the same initial position and velocity. As usual $\ddot\alpha$ denotes $\mathbf{R}^n_\nu$ acceleration. Then

$(\phi \circ \alpha)^{\boldsymbol{\cdot\cdot}} = (\phi \circ \alpha)'' + \mathrm{II}_2(d\phi\, \alpha', d\phi\, \alpha') = d\phi(\alpha'') + F_\alpha \mathrm{II}_1(\alpha', \alpha') = (d\phi + F_\alpha)(\ddot\alpha).$

Let $A_s$ be the vector in $T_o(\mathbf{R}^n_\nu)$ that is distantly parallel to $\ddot\alpha(s)$. By (c),

$(d\phi + F_{\alpha(s)})(\ddot\alpha(s)) \parallel (d\phi_o + F_o)(A_s).$

The latter vector is just $d\psi(A_s)$. Since $\psi$ is a linear isometry, $d\psi(A_s) \parallel d\psi(\ddot\alpha(s))$, where the second vector is $(\psi \circ \alpha)^{\boldsymbol{\cdot\cdot}}(s)$. Consequently $(\phi \circ \alpha)^{\boldsymbol{\cdot\cdot}}(s) \parallel (\psi \circ \alpha)^{\boldsymbol{\cdot\cdot}}(s)$ for all $s$. Hence (d) shows that $\phi \circ \alpha = \psi \circ \alpha$. ∎

ISOMETRIC IMMERSIONS

The range of applicability of the results of this chapter can be enlarged in the following way.

42. Definition. Let $M$ and $\bar M$ be semi-Riemannian manifolds with metric tensors $g$ and $\bar g$. An isometric immersion of $M$ into $\bar M$ is a smooth immersion $\phi: M \to \bar M$ such that $\phi^*(\bar g) = g$. An isometric imbedding is a one-to-one isometric immersion.

The latter notion includes that of semi-Riemannian submanifold $M \subset \bar M$ in the sense that the inclusion map $j$ of $M$ into $\bar M$ is an isometric imbedding. The machinery for the study of semi-Riemannian submanifolds is readily adapted to the more general case, mostly by everywhere replacing $j$ by $\phi$. In terms of the notion of vector field on a mapping, $\bar{\mathfrak{X}}(M)$ is just $\mathfrak{X}(j)$. Thus $\bar{\mathfrak{X}}(M)$ is replaced by $\mathfrak{X}(\phi)$, and similarly $\mathfrak{X}(M)^\perp$ is replaced by

$\mathfrak{X}(\phi)^\perp = \{Z \in \mathfrak{X}(\phi) : Z_p \perp d\phi(T_pM) \text{ for all } p \in M\}.$

We could continue as before to get the induced connection and shape tensor, and thereby generalize the basic results of this chapter. While this continuation has technical advantages, all local results and thereby many global results can be obtained automatically from the submanifold case by means of the observation that locally an isometric immersion is essentially a semi-Riemannian submanifold:

43. Lemma. Let $\phi: M \to \bar M$ be an isometric immersion. Then

(1) Each point $p \in M$ has a neighborhood $\mathcal{U}$ such that $\phi|\mathcal{U}$ is an imbedding, and

(2) If $\phi(\mathcal{U})$ is assigned the metric tensor such that the induced map $\mathcal{U} \to \phi(\mathcal{U})$ is an isometry, then $\phi(\mathcal{U})$ is a semi-Riemannian submanifold of $\bar M$.

Proof: (1) As noted in Chapter 1, this is an immediate consequence of Lemma 1.33.

(2) If $v, w \in T_p(M)$, $p \in \mathcal{U}$, then $\langle d\phi(v), d\phi(w)\rangle$ has the same value, $\langle v, w\rangle$, whether computed relative to $\phi(\mathcal{U})$ or $\bar M$. ∎

TWO-PARAMETER MAPS

Let $\mathcal{D}$ be an open subset of the plane $\mathbf{R}^2$ satisfying this interval condition: horizontal or vertical lines intersect $\mathcal{D}$ in intervals (if at all). A two-parameter map is a smooth map $x: \mathcal{D} \to M$. Thus $x$ is composed of two interwoven families of parameter curves:

The u-parameter curve $v = v_0$ of $x$ is $u \to x(u, v_0)$.

The v-parameter curve $u = u_0$ of $x$ is $v \to x(u_0, v)$.

The partial velocities

$x_u = dx(\partial_u), \qquad x_v = dx(\partial_v)$

are vector fields on $x$ (Definition 1.47). Evidently $x_u(u_0, v_0)$ is the velocity vector at $u_0$ of the u-parameter curve $v = v_0$, and symmetrically for $x_v(u_0, v_0)$.


If $x$ lies in the domain of a coordinate system $x^1, \ldots, x^n$, then its coordinate functions $x^i = x^i \circ x$ ($1 \le i \le n$) are real-valued functions on $\mathcal{D}$, and

$x_u = \sum \frac{\partial x^i}{\partial u}\, \partial_i, \qquad x_v = \sum \frac{\partial x^i}{\partial v}\, \partial_i.$

So far $M$ could be a smooth manifold; now suppose it is semi-Riemannian. If $Z$ is a smooth vector field on $x$, its partial covariant derivatives are $Z_u = DZ/\partial u$, the covariant derivative of $Z$ along u-parameter curves, and $Z_v = DZ/\partial v$, the covariant derivative of $Z$ along v-parameter curves. Explicitly, $Z_u(u_0, v_0)$ is the covariant derivative at $u_0$ of the vector field $u \to Z(u, v_0)$ on the curve $u \to x(u, v_0)$.

In terms of coordinates, $Z = \sum Z^i \partial_i$, where each $Z^i = Z(x^i)$ is a real-valued function on $\mathcal{D}$. Then by the formula following Proposition 3.18,

$Z_u = \sum_k \Bigl\{ \frac{\partial Z^k}{\partial u} + \sum_{i,j} \Gamma^k_{ij}\, Z^i\, \frac{\partial x^j}{\partial u} \Bigr\} \partial_k.$

In the special case $Z = x_u$ the derivative $Z_u = x_{uu}$ gives the accelerations of u-parameter curves, while $x_{vv}$ gives v-parameter accelerations.

44. Proposition. (1) If $x$ is a two-parameter map into a semi-Riemannian manifold $M$, then $x_{uv} = x_{vu}$.

(2) If $Z$ is a vector field on $x$, then $Z_{uv} - Z_{vu} = R(x_u, x_v)Z$, where $R$ is the Riemannian curvature tensor of $M$.

Proof: (1) With coordinate notation as above,

$x_{uv} = \sum_k \Bigl\{ \frac{\partial^2 x^k}{\partial v\, \partial u} + \sum_{i,j} \Gamma^k_{ij}\, \frac{\partial x^i}{\partial u}\, \frac{\partial x^j}{\partial v} \Bigr\} \partial_k.$

This formula is symmetric in $u$ and $v$, since $\Gamma^k_{ij}$ is symmetric in $i$ and $j$.

(2) A coordinate computation of $Z_{uv} - Z_{vu}$ produces curvature as in Lemma 3.38. ∎

Here (1) expresses for a two-parameter map the axiom (D4) (see Theorem 3.11) on the Levi-Civita connection, while in (2) curvature arises as usual from the failure of commutativity of covariant differentiation.

Exercises

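1. Let $x^1, \ldots, x^{n+k}$ be an adapted coordinate system for $M \subset \bar M$. Show (a) the mean curvature vector field $H$ is $(1/n)\sum g^{ij}\, \mathrm{II}(\partial_i, \partial_j)$ for $1 \le i, j \le n$. (b) If $\partial_{n+1}, \ldots, \partial_{n+k}$ are normal to $M$, then $\mathrm{II}(\partial_i, \partial_j) = \sum_{r > n} \bar\Gamma^r_{ij}\, \partial_r$.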


2. Let $\Pi$ be a nondegenerate tangent plane to $M$ at $p$. If $P$ is a small enough neighborhood of $0$ in $\Pi$, prove that $\exp_p(P)$ is a semi-Riemannian submanifold of $M$ whose Gaussian curvature at $p$ is $K(\Pi)$, where $K$ is the sectional curvature of $M$.

3. Let $S$ derive from the unit normal $U = \mathrm{grad}\, f/|\mathrm{grad}\, f|$ on a semi-Riemannian hypersurface $M = f^{-1}(c)$ of $\bar M$. (a) Show that $\langle Sv, w\rangle = -H^f(v, w)/|\mathrm{grad}\, f|$ for $v, w$ tangent to $M$, where $H^f$ is the Hessian of $f$. (b) Find $S$ at $p = (1, 0, 0)$ for the hyperboloid $x^2 + y^2 = 1 + z^2$ in $\mathbf{R}^3$, first by using (a), then by using $S(v) = -\bar D_v U$.

4. Let $M^n$ be a spacelike hypersurface of sign $\varepsilon$ in $\bar M$ (Riemannian or Lorentz). Let $S$ derive from the unit normal $U$, and let $k_1, \ldots, k_n$ be the eigenvalues of $S$. Prove: (a) $\langle H, U\rangle = (1/n)\sum k_i$. (b) $K(\Pi) = \varepsilon k_i k_j + \bar K(\Pi)$, where $\Pi$ is spanned by the eigenvectors of $k_i$ and $k_j$. (c) $p \in M$ is an umbilic $\Leftrightarrow$ $k_1 = \cdots = k_n$ at $p$.

5. (a) Show that the Ricci curvature of a hypersurface $M \subset \mathbf{R}^{n+1}_\nu$ is given by $\mathrm{Ric}(V, W) = \varepsilon[\langle SV, W\rangle\, \mathrm{trace}\, S - \langle SV, SW\rangle]$.

6. Generalize Lemma 35 as follows: For arbitrary codimension, if $M$ is connected, has $\dim M \ge 2$, and is totally umbilic in a manifold of constant curvature $\bar C$, then (a) the normal curvature vector $z$ is normal parallel, and (b) $M$ has constant curvature $\bar C + \langle z, z\rangle$.

7. Classical surface theory [O1]. Let the two-parameter map $x: \mathcal{D} \to \mathbf{R}^3$ be an immersion. Use the dot product on $\mathbf{R}^3$; $x$ is an isometric immersion relative to the pulled-back metric on $\mathcal{D}$. (a) Check that

$U = x_u \times x_v/|x_u \times x_v|$

is a unit normal vector field on $x$. Define

$E = x_u \cdot x_u, \quad F = x_u \cdot x_v, \quad G = x_v \cdot x_v,$
$L = x_{uu} \cdot U, \quad M = x_{uv} \cdot U, \quad N = x_{vv} \cdot U.$

Prove: (b) $L = Sx_u \cdot x_u$, $M = Sx_u \cdot x_v$, $N = Sx_v \cdot x_v$. (c) The Gaussian curvature of $\mathcal{D}$ is $K = (LN - M^2)/(EG - F^2) = \det S$. (d) If $H$ is the mean curvature vector field of $x$, then $H = hU$, where $h = \tfrac{1}{2}\, \mathrm{trace}\, S = (GL + EN - 2FM)/2(EG - F^2)$. Note: if $x$ parametrizes a surface $M \subset \mathbf{R}^3$, these results transfer to $M$ since $x$ is then a local isometry $\mathcal{D} \to M$. (See also Exercise 9.22.)

8. Total shape tensor. For $M \subset \bar M$ define $T: \mathfrak{X}(M) \times \bar{\mathfrak{X}}(M) \to \bar{\mathfrak{X}}(M)$ by $T_V X = \tan \bar D_V(\mathrm{nor}\, X) + \mathrm{nor}\, \bar D_V(\tan X)$. Prove: (a) $T$ is $\mathfrak{F}(M)$-bilinear; (b) $T_V$ is skew-adjoint; (c) $T_V$ reverses $\mathfrak{X}(M)$ and $\mathfrak{X}(M)^\perp$; (d) $T_V W = \mathrm{II}(V, W)$ and $T_V Z = \widetilde{\mathrm{II}}(V, Z)$, where $V, W \in \mathfrak{X}(M)$ and $Z \in \mathfrak{X}(M)^\perp$; (e) $T$ is completely determined by $\mathrm{II}$.

9. Let $M$ be merely a smooth submanifold of a semi-Riemannian manifold $\bar M$. If $v \in T_p(\bar M)$, let $\{v\}$ be the element of the quotient vector space $T_p(\bar M)/T_p(M)$ containing $v$. If $V, W \in \mathfrak{X}(M)$, define $\mathrm{II}(V, W) = \{\bar D_V W\}$. Prove: (a) $\mathrm{II}$ is symmetric and $\mathfrak{F}(M)$-bilinear. (b) The following are equivalent: (i) $M$ is totally geodesic (that is, $\mathrm{II} = 0$). (ii) If $\gamma$ is an $\bar M$ geodesic such that $\gamma'(0)$ is tangent to $M$, then $\gamma$ remains initially in $M$. (iii) If a tangent vector to $M$ is parallel translated along a curve in $M$, it remains tangent to $M$. (Hint: To prove (i) $\Leftrightarrow$ (iii), write the coordinate expression for parallelism in terms of an adapted coordinate system.) (c) This notion of totally geodesic agrees with the previous notion if $M$ is semi-Riemannian.

10. (a) If $\phi: M \to M$ is a homothety, then each component of its fixed point set $\{p \in M : \phi(p) = p\}$ is a totally geodesic submanifold of $M$ in the sense of the preceding exercise. (b) If $\phi: M \to N$ is a homothety, then its graph $\{(p, \phi p) : p \in M\}$ is a totally geodesic submanifold of $M \times N$, and is semi-Riemannian if $\phi$ is not an anti-isometry.

11. If $M \subset \bar M$, the function $R^\perp: \mathfrak{X}(M) \times \mathfrak{X}(M) \times \mathfrak{X}(M)^\perp \to \mathfrak{X}(M)^\perp$ given by

$R^\perp_{VW}X = D^\perp_{[V,W]}X - [D^\perp_V, D^\perp_W]X$

is called the normal curvature tensor of $M \subset \bar M$. Check that $R^\perp$ is $\mathfrak{F}(M)$-multilinear, and prove the Ricci equation:

$\langle R^\perp_{VW}X, Y\rangle = \langle \bar R_{VW}X, Y\rangle - \langle \widetilde{\mathrm{II}}(V, Y), \widetilde{\mathrm{II}}(W, X)\rangle + \langle \widetilde{\mathrm{II}}(V, X), \widetilde{\mathrm{II}}(W, Y)\rangle,$

where $X, Y \in \mathfrak{X}(M)^\perp$. (Since $\bar R_{VW}$ is skew-adjoint, the Gauss, Codazzi, and Ricci equations cover all four tan/nor choices in $\langle \bar R_{VW}\,\cdot\,, \cdot\,\rangle$.)

12. The sectional curvature $K$ of $M$ is constant at $p$ if and only if every unit vector in $T_p(M)$ is normal to a hypersurface totally geodesic at $p$. (Hint: Use Codazzi to get $R_{xy}x = \langle x, x\rangle K(x, y)y$ for nonnull $x \perp y$.)

13. Show that for $0 < \nu < n$ there are no compact semi-Riemannian hypersurfaces in $\mathbf{R}^n_\nu$. (Hint: Deduce a contradiction to constancy of index.)

5

RIEMANNIAN AND LORENTZ GEOMETRY

After some general preliminaries we consider special features of the two most important geometries determined by the index: Riemannian geometry, $\nu = 0$, and Lorentz geometry, $\nu = 1$.

The metric tensor of a Riemannian manifold $M$ makes each tangent space an inner product space, linearly isometric to Euclidean space $\mathbf{R}^n$. Then the notion of arc length leads to a notion of distance between points of $M$ that generalizes the usual Euclidean distance in $\mathbf{R}^n$. Riemannian distance makes every Riemannian manifold a metric space and simplifies the study of its geometry.

Each tangent space to a Lorentz manifold is linearly isometric to Minkowski space $\mathbf{R}^n_1$, and Lorentz geometry begins with the study of the causal character of vectors in such a space.

The emphasis in this chapter is on local geometry and on geodesics. In particular, a useful analogy appears between timelike geodesics in a Lorentz manifold and arbitrary geodesics in a Riemannian manifold.

THE GAUSS LEMMA

The key to the local geometry of a semi-Riemannian manifold near a point $o \in M$ is the comparison with the semi-Euclidean space $T_o(M) \approx \mathbf{R}^n_\nu$ afforded by the exponential map. We have seen that $\exp_o$ carries rays $t \to tx$ in $T_o(M)$ to radial geodesics $\gamma_x$. The following result, called the Gauss lemma, implies in particular that orthogonality to radial directions is also preserved.

1. Lemma. Let $o \in M$ and $0 \neq x \in T_o(M)$. If $v_x, w_x \in T_x(T_oM)$ with $v_x$ radial, then

$\langle d\exp_o(v_x), d\exp_o(w_x)\rangle = \langle v_x, w_x\rangle.$

Proof: That $v_x$ is radial means that $v$ is a scalar multiple of $x$. Hence we can suppose that $v = x$ (and we elect to replace $x$ by $v$). Consider the two-parameter map $\tilde x(t, s) = t(v + sw)$ in $T_o(M)$, and its exponential image in $M$: $x(t, s) = \exp_o(t(v + sw))$. Now $\tilde x_t(1, 0) = v_v$ and $\tilde x_s(1, 0) = w_v$, hence

$x_t(1, 0) = d\exp_o(v_v), \qquad x_s(1, 0) = d\exp_o(w_v).$

So we must show that $\langle x_t(1, 0), x_s(1, 0)\rangle = \langle v, w\rangle$.

The longitudinal curve $t \to x(t, s)$ is a geodesic with initial velocity $v + sw$. Hence $x_{tt} = 0$ and $\langle x_t, x_t\rangle = \langle v + sw, v + sw\rangle$. Proposition 4.44(1) says that $x_{ts} = x_{st}$. Thus

$\frac{\partial}{\partial t}\langle x_t, x_s\rangle = \langle x_t, x_{st}\rangle = \langle x_t, x_{ts}\rangle = \frac{1}{2}\frac{\partial}{\partial s}\langle x_t, x_t\rangle.$

The formula above for $\langle x_t, x_t\rangle$ then shows that

$\frac{\partial}{\partial t}\langle x_t, x_s\rangle(t, 0) = \langle v, w\rangle$ for all $t$.

Since $x(0, s) = \exp_o(0) = o$ for all $s$, $\langle x_t, x_s\rangle(0, 0) = 0$. Thus by elementary calculus, $\langle x_t, x_s\rangle(t, 0) = t\langle v, w\rangle$. Setting $t = 1$ gives the required result. ∎

In particular the length of radial vectors is preserved. Thus the Gauss lemma describes the exponential map $\exp_o$ as a kind of partial isometry whose principal distortions are in directions orthogonal to radial directions in $T_o(M)$.

Using the Gauss lemma we now set up a detailed comparison between a normal neighborhood $\mathcal{U}$ of a point $o$ in $M$ and the corresponding neighborhood $\tilde{\mathcal{U}}$ of $0$ in $T_o(M)$. A tilde (~) is used systematically to distinguish objects in $T_o(M)$ from the corresponding objects in $\mathcal{U} \subset M$. In this context $\exp_o^{-1}$ will always mean the diffeomorphism $(\exp_o|\tilde{\mathcal{U}})^{-1}: \mathcal{U} \to \tilde{\mathcal{U}}$.

(1) If $\tilde q$ is the function $v \to \langle v, v\rangle$ on $T_o(M)$, the corresponding function $\tilde q \circ \exp_o^{-1}$ on $\mathcal{U}$ is denoted by $q$.

(2) Hyperquadrics appear in $T_o(M) \approx \mathbf{R}^n_\nu$, just as in $\mathbf{R}^n_\nu$ itself, as level hypersurfaces $\tilde Q = \tilde q^{-1}(c)$, $c \neq 0$. The diffeomorphic image of $\tilde Q \cap \tilde{\mathcal{U}}$ under $\exp_o$ is the hypersurface $Q = q^{-1}(c)$ in $\mathcal{U} \subset M$, called a local hyperquadric at $o$. (See Figure 1.)

[Figure 1]

(3) For indefinite metrics, if $\tilde\Lambda$ is the nullcone $\tilde q^{-1}(0) - 0$ in $T_o(M)$, then the diffeomorphic image of $\tilde\Lambda \cap \tilde{\mathcal{U}}$ under $\exp_o$ is the local nullcone $\Lambda(o) = q^{-1}(0) - o$. Thus $\Lambda(o)$ consists of initial segments of all null geodesics starting at $o$, and in $\mathcal{U}$ as in $T_o(M)$ the two families of hyperquadrics ($c < 0$, $c > 0$) are separated by the nullcone. The simpler Riemannian situation is spelled out later.

(4) If $\tilde P$ is the position vector field $v \to v_v$ on $\tilde{\mathcal{U}} \subset T_o(M)$, then the transferred vector field $P = d\exp_o(\tilde P)$ is called the local position vector field at $o$. Like $\tilde P$ it is radial, that is, tangent to all radial geodesics emanating from $o$.

Applying the Gauss lemma to known properties of $\tilde P$ gives

2. Corollary. The local position vector field $P$ at $o$ is orthogonal to every local hyperquadric of $M$ at $o$. Furthermore $P$ is both orthogonal and tangent to the local nullcone $\Lambda(o)$.

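Again by the Gauss lemma, $\langle P, P\rangle \circ \exp_o = \langle \tilde P, \tilde P\rangle$. Thus by the corollary each local hyperquadric $Q$ is semi-Riemannian with the same sign (Definition 4.16) as the corresponding $\tilde Q$. (Though $Q$ and $\tilde Q$ are diffeomorphic, they are generally not isometric, since tangent directions to $\tilde Q$ are the distortion directions of $\exp_o$.)

3. Corollary. $\mathrm{grad}\, q = 2P$.

Proof: Recall from Chapter 4 that $\mathrm{grad}\, \tilde q = 2\tilde P$. If $v$ is tangent to $\mathcal{U}$, let $\tilde v$ be the tangent vector to $\tilde{\mathcal{U}}$ such that $d\exp_o(\tilde v) = v$. Then

$\langle \mathrm{grad}\, q, v\rangle = v(q) = \tilde v(\tilde q) = \langle \mathrm{grad}\, \tilde q, \tilde v\rangle = 2\langle \tilde P, \tilde v\rangle = 2\langle P, v\rangle,$

where the last step uses the Gauss lemma, since $\tilde P$ is radial. ∎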
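The Gauss lemma can be tested numerically in a case where the exponential map is known in closed form. The sketch below assumes the unit sphere $S^2 \subset \mathbf{R}^3$ with $o = (0, 0, 1)$ and $T_o(S^2)$ identified with the plane of pairs $(a, b)$, so that $\exp_o(x) = \cos\rho\; o + (\sin\rho/\rho)(a, b, 0)$ with $\rho = |x|$. The differential is approximated by central differences; the function names and the sample vectors are hypothetical.

```python
import math

def exp_o(x):
    """Exponential map of the unit sphere S^2 at o = (0, 0, 1); x = (a, b) in T_o."""
    rho = math.hypot(x[0], x[1])
    s = math.sin(rho) / rho if rho > 0.0 else 1.0
    return (s * x[0], s * x[1], math.cos(rho))

def d_exp(x, w, h=1e-5):
    """Central-difference approximation of d exp_o applied to the tangent vector w_x."""
    fwd = exp_o((x[0] + h * w[0], x[1] + h * w[1]))
    bwd = exp_o((x[0] - h * w[0], x[1] - h * w[1]))
    return tuple((fwd[i] - bwd[i]) / (2.0 * h) for i in range(3))

def dot(u, w):
    """Euclidean dot product (induces the metric of S^2 and of T_o)."""
    return sum(a * b for a, b in zip(u, w))

x = (0.7, 0.4)      # base point in T_o (arbitrary sample)
v = x               # radial vector at x
w = (-0.3, 1.1)     # arbitrary second vector at x

lhs = dot(d_exp(x, v), d_exp(x, w))   # <d exp(v_x), d exp(w_x)>
rhs = dot(v, w)                       # <v_x, w_x>

# For contrast: a direction orthogonal to the radial one is distorted.
n = (-x[1], x[0])
stretch = dot(d_exp(x, n), d_exp(x, n)) / dot(n, n)

print(lhs, rhs, stretch)
```

The two scalar products agree to within the finite-difference error, while the squared length of the orthogonal vector is scaled by $(\sin\rho/\rho)^2 < 1$, which illustrates that the distortions of $\exp_o$ occur in directions orthogonal to the radial direction.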


CONVEX OPEN SETS

Let $M$ be a semi-Riemannian manifold, assumed for the moment to be complete. The exponential maps $\exp_p: T_p(M) \to M$ for all $p \in M$ constitute a single mapping $\exp: TM \to M$ of the tangent bundle $TM$. If $\pi$ is the natural projection of $TM$ onto $M$, define $E: TM \to M \times M$ by $E(v) = (\pi(v), \exp(v))$. Explicitly, $E(v) = (p, \exp_p(v))$ if $v \in T_p(M) \subset TM$.

When $M$ is not complete, $\exp$ and $E$ have the same largest domain, namely the set $\mathcal{D}$ of all vectors $v \in TM$ such that the geodesic $\gamma_v$ is defined at least on the interval $[0, 1]$. Thus $\mathcal{D}_p = \mathcal{D} \cap T_p(M)$ is the largest domain of the exponential map $\exp_p$.

4. Corollary. The domain $\mathcal{D}$ of $\exp$ is open in $TM$. The domain $\mathcal{D}_p$ of $\exp_p$ is an open set of $T_p(M)$ starshaped about $0$.

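Proof. Let $\widetilde{\mathcal{D}}$ be the domain of the mapping $\psi(v, s) = \gamma_v'(s)$. Proposition 3.28 shows that $\psi$ is the flow of a vector field on $TM$, hence $\widetilde{\mathcal{D}}$ is an open set of $TM \times \mathbf{R}$. $\widetilde{\mathcal{D}}$ is also the domain of the map $\pi \circ \psi$ sending $(v, s)$ to $\gamma_v(s)$. Thus $\mathcal{D}$ is open in $TM$, since it corresponds to $\widetilde{\mathcal{D}} \cap (TM \times 1)$ under the diffeomorphism $v \leftrightarrow (v, 1)$. Then $\mathcal{D}_p$ is open in $T_p(M)$. If $v \in \mathcal{D}_p$, then $\gamma_v$ is defined on $[0, 1]$. But $\gamma_{tv}(1) = \gamma_v(t)$, so $tv \in \mathcal{D}_p$ for $t \in [0, 1]$. Hence $\mathcal{D}_p$ is starshaped.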

5. Definition. An open set $\mathcal{C}$ in a semi-Riemannian manifold is convex provided $\mathcal{C}$ is a normal neighborhood of each of its points.

In particular for any two points $p$, $q$ of $\mathcal{C}$ there is a unique geodesic segment $\sigma_{pq}: [0, 1] \to M$ from $p$ to $q$ that lies entirely in $\mathcal{C}$. Unlike the situation in $\mathbf{R}^n$ there may well be other geodesics from $p$ to $q$ that do not remain in $\mathcal{C}$.

6. Lemma. If $\exp_p: \mathcal{D}_p \to M$ is nonsingular at $x \in \mathcal{D}_p$, then $E: \mathcal{D} \to M \times M$ is nonsingular at $x$.

Proof. Suppose $dE(v) = 0$ for $v \in T_x(TM)$. We must show that $v = 0$. Let $\pi: T(M) \to M$ be the usual projection and let $\pi_1$ be the projection of $M \times M$ on its first factor. Then $\pi_1 \circ E = \pi$, so that $d\pi(v) = d\pi_1(dE(v)) = 0$. It follows that $v$ is vertical, that is, tangent to $T_p(M)$, where $p = \pi(x)$. But $E | T_p(M)$ differs from $\exp_p$ only by the obvious diffeomorphism from $p \times M$ to $M$. Thus $d\exp_p(v) = 0$, which by hypothesis implies $v = 0$.

Since $d\exp_p$ is always nonsingular at $0 \in T_p(M) \subset TM$ it follows by the inverse function theorem that $E$ maps some neighborhood of $0$ in $TM$ diffeomorphically onto a neighborhood of $(p, p)$ in $M \times M$.


7. Proposition. Each point $o$ of $M$ has a convex neighborhood.

Proof. Let $\xi = (x^1, \ldots, x^n)$ be a normal coordinate system on a neighborhood $\mathcal{V}$ of $o \in M$. If $N = \sum (x^i)^2$, then for $\delta > 0$ sufficiently small, $\mathcal{V}(\delta) = \{p \in \mathcal{V}: N(p) < \delta\}$ is a neighborhood of $o$ diffeomorphic under $\xi$ to an open ball in $\mathbf{R}^n$. By a remark above, if $\delta$ is sufficiently small then $E$ is a diffeomorphism of a neighborhood $\mathcal{W}$ of $0 \in T_o(M)$ in $TM$ onto $\mathcal{V}(\delta) \times \mathcal{V}(\delta)$.

Consider the symmetric (0, 2) tensor $B$ whose components are $\delta_{ij} - \sum_k \Gamma^k_{ij} x^k$. Obviously $B$ is positive definite at $o$. Thus a further reduction of $\delta$, if necessary, ensures that $B$ is positive definite on $\mathcal{V}(\delta)$. We assert that this neighborhood $\mathcal{U} = \mathcal{V}(\delta)$ is a normal neighborhood of each point $p \in \mathcal{U}$.

Let $\mathcal{W}_p = \mathcal{W} \cap T_p(M)$. By construction $E | \mathcal{W}_p$ is a diffeomorphism onto $p \times \mathcal{U}$, hence $\exp_p | \mathcal{W}_p$ is a diffeomorphism onto $\mathcal{U}$. Also we must show that $\mathcal{W}_p$ is starshaped about $0$. If $q \in \mathcal{U}$, $q \neq p$, let $v = E^{-1}(p, q)$. Then $v$ is in $\mathcal{W}_p$, and $\sigma = \gamma_v | [0, 1]$ is a geodesic from $p$ to $q$. If $\sigma$ lies in $\mathcal{U}$ the proof of Proposition 3.30 shows that $tv \in \mathcal{W}_p$ for $0 \le t \le 1$; hence, $\mathcal{W}_p$ is starshaped.

Assume that $\sigma$ leaves $\mathcal{U} = \mathcal{V}(\delta)$. We derive a contradiction as follows. Since $N(p), N(q) < \delta$, the function $N \circ \sigma$ has a maximum at some $t_0$ with $0 < t_0 < 1$. Abbreviating $x^i \circ \sigma$ to $x^i$, we compute

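$$\frac{d^2(N \circ \sigma)}{dt^2} = 2 \sum_i \left\{ \left(\frac{dx^i}{dt}\right)^2 + x^i \frac{d^2x^i}{dt^2} \right\}.$$

By the geodesic equations (3.21) this becomes

$$\frac{d^2(N \circ \sigma)}{dt^2} = 2 \sum_{i,j} \Big( \delta_{ij} - \sum_k \Gamma^k_{ij} x^k \Big) \frac{dx^i}{dt} \frac{dx^j}{dt}.$$

Thus

$$\frac{d^2(N \circ \sigma)}{dt^2}(t_0) = 2B(\sigma'(t_0), \sigma'(t_0)) > 0.$$

This contradicts the fact that $t_0$ is a maximum point.

A first consequence is the following partial analogue of Lemma 1.56.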

8. Lemma. A geodesic $\gamma: [0, b) \to M$ is extendible as a geodesic if and only if it is (continuously) extendible.

Proof. The "only if" assertion is obvious, so suppose that $\gamma$ has a continuous extension $\tilde{\gamma}: [0, b] \to M$. Let $\mathcal{C}$ be a convex neighborhood of $\tilde{\gamma}(b)$. There is a number $a$ with $0 \le a < b$ such that $\tilde{\gamma}[a, b] \subset \mathcal{C}$. Then $\mathcal{C}$ is a normal neighborhood of $o = \tilde{\gamma}(a)$ and $\gamma | [a, b)$ is a radial geodesic. Like any radial geodesic, $\gamma | [a, b)$ can be geodesically extended until it approaches $\operatorname{bd} \mathcal{C}$ or until its domain is $[a, \infty)$. But $\tilde{\gamma}(b)$ is not in $\operatorname{bd} \mathcal{C}$, hence $\gamma$ can be extended past $b$.

If $p$ and $q$ are points of a convex open set $\mathcal{C}$ and $\sigma_{pq}$ is the geodesic in $\mathcal{C}$ from $p = \sigma_{pq}(0)$ to $q = \sigma_{pq}(1)$, the displacement vector $\overrightarrow{pq}$ is $\sigma_{pq}'(0) \in T_p(M)$.

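9. Lemma. If $\mathcal{C}$ is a convex open set then the map $\Delta: \mathcal{C} \times \mathcal{C} \to TM$ sending $(p, q)$ to $\overrightarrow{pq}$ is smooth.

In fact, arguments as above show that $\Delta(\mathcal{C} \times \mathcal{C})$ is open and that $\Delta$ is the inverse map of the diffeomorphism $E: \Delta(\mathcal{C} \times \mathcal{C}) \to \mathcal{C} \times \mathcal{C}$.

In $\mathbf{R}^n$ an intersection of convex sets is convex, but in the circle $S^1$ such an intersection need not even be connected. In general, if convex open sets $\mathcal{C}$ and $\mathcal{C}'$ are contained in a convex open set $\mathcal{E}$, then $\mathcal{C} \cap \mathcal{C}'$ is convex (if nonempty). In fact, if $p, q \in \mathcal{C} \cap \mathcal{C}'$ then both $\mathcal{C}$ and $\mathcal{C}'$ provide the same geodesic $\sigma_{pq}$ as $\mathcal{E}$; hence $\overrightarrow{pq}$ is unambiguously defined. Thus the diffeomorphism $\exp_p: \tilde{\mathcal{C}} \to \mathcal{C}$ restricts to a diffeomorphism of $\{\overrightarrow{pq}: q \in \mathcal{C} \cap \mathcal{C}'\}$ onto $\mathcal{C} \cap \mathcal{C}'$.

A convex covering $\mathfrak{R}$ of a semi-Riemannian manifold $M$ is a covering of $M$ by convex open sets such that if elements $\mathcal{C}$, $\mathcal{C}'$ of $\mathfrak{R}$ meet then $\mathcal{C} \cap \mathcal{C}'$ is convex.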

10. Lemma. Given any open covering $\mathfrak{C}$ of $M$ there is a convex covering $\mathfrak{R}$ such that each element of $\mathfrak{R}$ is contained in some element of $\mathfrak{C}$.

Proof. Let $\mathfrak{C}^*$ be the open covering of $M$ consisting of all convex open sets contained in any element of $\mathfrak{C}$. Since $M$ is second countable, hence paracompact, there is an open covering $\mathfrak{B}$ such that if two of its elements meet then their union is contained in some element of $\mathfrak{C}^*$. Let $\mathfrak{R}$ consist of all convex open sets contained in elements of $\mathfrak{B}$. By a remark above, the covering $\mathfrak{R}$ has the required intersection property.

ARC LENGTH

The familiar notion of arc length of a curve segment in Euclidean space generalizes in a natural way.

11. Definition. Let $\alpha: [a, b] \to M$ be a piecewise smooth curve segment in a semi-Riemannian manifold $M$. The arc length of $\alpha$ is

$$L(\alpha) = \int_a^b |\alpha'(s)| \, ds.$$


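By definition $|\alpha'| = |\langle \alpha', \alpha'\rangle|^{1/2}$, hence in coordinates

$$|\alpha'| = \Big| \sum_{i,j} g_{ij}(\alpha) \frac{d(x^i \circ \alpha)}{ds} \frac{d(x^j \circ \alpha)}{ds} \Big|^{1/2}.$$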

On a Riemannian manifold arc length behaves much as in Euclidean space, but for indefinite metrics the term length can be misleading since, for example, a null curve has length zero.

We consider now the effect of a change of parametrization. A reparametrization function $h: [c, d] \to [a, b]$ is a piecewise smooth function such that either $h(c) = a$, $h(d) = b$ ($h$ orientation-preserving), or $h(c) = b$, $h(d) = a$ ($h$ orientation-reversing). If its derivative does not change sign, $h$ is monotone. The same proofs as in elementary calculus then show the following.

12. Lemma. (1) The length of a piecewise smooth curve segment is unchanged by monotone reparametrization. (2) If $\alpha$ is a curve segment with $|\alpha'| > 0$, there is a strictly increasing reparametrization function $h$ such that $\beta = \alpha(h)$ has $|\beta'| = 1$.

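In the latter case $\beta$ is said to have unit speed or arc length parametrization.

Let $\mathcal{U}$ be a normal neighborhood of a point $o$ of $M$. The function $r$ on $\mathcal{U}$ given by $r(p) = |\exp_o^{-1}(p)|$ is the radius function of $M$ at $o$. In terms of normal coordinates determined by an orthonormal basis $e_1, \ldots, e_n$ of $T_o(M)$, with $\varepsilon_i = \langle e_i, e_i\rangle = \pm 1$,

$$r = \Big| \sum_i \varepsilon_i (x^i)^2 \Big|^{1/2}.$$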

Thus $r$ is smooth except where it is zero, namely at the point $o$ and on the local nullcone at $o$.

13. Lemma. Let $r$ be the radius function on a normal neighborhood $\mathcal{U}$ of $o$. If $\sigma$ is the radial geodesic from $o$ to $p \in \mathcal{U}$, then $L(\sigma) = r(p)$.

Proof. If $v$ is the initial velocity $\sigma'(0)$ of $\sigma$, we know from Proposition 3.30 that $v = \exp_o^{-1}(p)$. Since $|\sigma'|$ is constant,

$$L(\sigma) = \int_0^1 |\sigma'| \, ds = \int_0^1 |v| \, ds = |v| = r(p).$$
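As a numerical illustration of Definition 11, the following sketch approximates arc length by the midpoint rule in a space whose metric is diagonal with entries $\pm 1$. The function name `arc_length` and the sample curves are hypothetical choices made for this example, not notation from the text. It shows that a null line in the Minkowski plane has length zero, while a timelike line has positive length.

```python
import math

def arc_length(curve_velocity, metric_signs, a, b, steps=10000):
    """Approximate L = integral over [a, b] of |<alpha', alpha'>|^(1/2).

    curve_velocity(s) returns the components of alpha'(s) with respect to
    an orthonormal frame; metric_signs lists the sign (+1 or -1) of each
    frame vector. The midpoint rule is used, which is an assumption of
    this sketch rather than anything prescribed by the text.
    """
    h = (b - a) / steps
    total = 0.0
    for k in range(steps):
        s = a + (k + 0.5) * h
        v = curve_velocity(s)
        q = sum(e * x * x for e, x in zip(metric_signs, v))  # <alpha', alpha'>
        total += math.sqrt(abs(q)) * h
    return total

# Minkowski plane with signs (-, +): first coordinate timelike.
signs = (-1, 1)
null_line = lambda s: (1.0, 1.0)      # alpha(s) = (s, s), a null curve
timelike_line = lambda s: (1.0, 0.5)  # alpha(s) = (s, s/2), a timelike curve

print(arc_length(null_line, signs, 0.0, 1.0))      # null curve: length zero
print(arc_length(timelike_line, signs, 0.0, 2.0))  # close to sqrt(3)
```

For the timelike line the integrand is the constant $|{-1} + 1/4|^{1/2} = \sqrt{3}/2$, so the length over $[0, 2]$ is $\sqrt{3}$, in agreement with Lemma 13 applied to a radial geodesic of Minkowski space.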

RIEMANNIAN DISTANCE

In this section $M$ will be a Riemannian manifold; thus its metric tensor makes each tangent space an inner product space. The local geometry of $M$ near a point $o$ is relatively simple. If $\tilde{r}$ is the norm function $\tilde{r}(v) = |v| = \langle v, v\rangle^{1/2}$ on $T_o(M)$, then on any normal neighborhood $\mathcal{U}$ of $o$ the radius function $r = \tilde{r} \circ \exp_o^{-1}$ is smooth except at $o$. There is only a single family of hyperquadrics at $o$, namely, the spheres $r$ constant, which are the images under $\exp_o$ of the standard spheres $\tilde{r}$ constant in $T_o(M) \approx \mathbf{R}^n$. If as usual $q = \tilde{q} \circ \exp_o^{-1}$ on $\mathcal{U}$, and $P$ is the local position vector field near $o$, then

$$|P| = \langle P, P\rangle^{1/2} = \sqrt{q} = r.$$

Hence $U = P/r$ is the outward radial unit vector field on $\mathcal{U} - o$, and $U$ is normal to all the hyperspheres at $o$. Since $r = \sqrt{q}$, it follows from Corollary 3 that $\operatorname{grad} r = P/r = U$ on $\mathcal{U} - \{o\}$.

The local geometry of Riemannian geodesics is dominated by the following result.

14. Lemma. Let $\mathcal{U}$ be a normal neighborhood of a point $o$ in a Riemannian manifold $M$. If $p \in \mathcal{U}$ then the radial geodesic segment $\sigma: [0, 1] \to \mathcal{U}$ from $o$ to $p$ is the unique shortest curve in $\mathcal{U}$ from $o$ to $p$.

Proof. In view of Lemma 12 the uniqueness of $\sigma$ must be interpreted as uniqueness up to monotone reparametrization. Thus if $\alpha: [0, b] \to \mathcal{U}$ is a curve from $o$ to $p$ we must show that

(a) $L(\alpha) \ge L(\sigma)$.
(b) If $L(\alpha) = L(\sigma)$, then $\alpha$ is a monotone reparametrization of $\sigma$.

For (a), restricting the radial unit vector field $U$ to $\alpha$ we can write

$$\alpha' = \langle \alpha', U\rangle U + N,$$

where $N$ is a vector field on $\alpha$ orthogonal to $U$. (At $t = 0$, let $U$ be $\alpha'(0)$ and $N$ zero.) Then

$$|\alpha'| = \langle \alpha', \alpha'\rangle^{1/2} = [\langle \alpha', U\rangle^2 + \langle N, N\rangle]^{1/2} \ge |\langle \alpha', U\rangle| \ge \langle \alpha', U\rangle.$$

But $U = \operatorname{grad} r$ implies $\langle \alpha', U\rangle = d(r \circ \alpha)/dt$. Hence by the fundamental theorem of calculus

$$L(\alpha) = \int_0^b |\alpha'| \, dt \ge r(\alpha(b)) - r(\alpha(0)) = r(p).$$

But by Lemma 13, $r(p) = L(\sigma)$.

For (b), if $L(\alpha) = L(\sigma)$, then the inequalities above all become equalities; hence, writing $r$ for $r \circ \alpha$,

$$N = 0 \quad \text{and} \quad dr/dt = \langle \alpha', U\rangle = |\langle \alpha', U\rangle| \ge 0.$$

Thus $\alpha' = (dr/dt)U$, showing that $\alpha$ travels monotonically along the radial geodesic from $o$ to $p$, namely, $\sigma$. In fact, since $r$ measures radial distance, $\alpha$ is the monotone reparametrization $\alpha(t) = \sigma(r(t)/r(b))$ of $\sigma$.


Refinements: (1) The proof remains valid if $\alpha$ is only required to be piecewise smooth. Then $dr/dt$ has only a finite number of double values and the reparametrization function $t \to r(t)/r(b)$ is piecewise smooth. (2) Dropping the normal neighborhood $\mathcal{U}$ from the hypotheses, assertions (a) and (b) still hold provided $\alpha = \exp_o \circ \tilde{\alpha}$, where $\tilde{\alpha}$ is a curve in $T_o(M)$ from $0$ to $\sigma'(0)$ such that $d\exp_o$ is nonsingular at each point of $\tilde{\alpha}$.

In Euclidean space the distance $d(p, q) = |p - q|$ between two points $p$ and $q$ could also be defined to be the length of the shortest curve segment joining them, namely the straight line segment from $p$ to $q$. But in $\mathbf{R}^2 - (0, 0)$, for example, there is no shortest curve from $p = (-1, 0)$ to $q = (1, 0)$. However the following modification works in general.

15. Definition. For any points $p$ and $q$ of a connected Riemannian manifold $M$, the Riemannian distance $d(p, q)$ from $p$ to $q$ is the greatest lower bound of $\{L(\alpha): \alpha \in \Omega(p, q)\}$, where $\Omega(p, q)$ is the set of all piecewise smooth curve segments in $M$ from $p$ to $q$.

If $o \in M$ and $\varepsilon > 0$ the set $\mathcal{N}_\varepsilon(o)$ of points $p \in M$ such that $d(o, p) < \varepsilon$ is called the $\varepsilon$-neighborhood of $o$ in $M$. Using this notion the preceding lemma can be strengthened as follows.

16. Proposition. Let $o$ be a point of a Riemannian manifold $M$.
(1) For $\varepsilon > 0$ sufficiently small, the $\varepsilon$-neighborhood $\mathcal{N}_\varepsilon(o)$ is normal;
(2) For a normal $\varepsilon$-neighborhood $\mathcal{N}_\varepsilon(o)$ the radial geodesic $\sigma$ from $o$ to $p \in \mathcal{N}_\varepsilon(o)$ is the unique shortest curve in $M$ from $o$ to $p$. In particular,

$$L(\sigma) = r(p) = d(o, p).$$

Proof. (1) Let $\mathcal{U}$ be a normal neighborhood of $o$ in $M$, with $\tilde{\mathcal{U}}$ the corresponding neighborhood of $0$ in $T_o(M)$. For $\varepsilon > 0$ sufficiently small, $\tilde{\mathcal{U}}$ contains the starshaped open set

$$\tilde{\mathcal{N}} = \tilde{\mathcal{N}}_\varepsilon(0) = \{v \in T_o(M): |v| < \varepsilon\}.$$

Thus $\mathcal{N} = \exp_o(\tilde{\mathcal{N}})$ is also a normal neighborhood of $o$. If $p \in \mathcal{N}$ then by Lemma 14 the radial geodesic $\sigma$ from $o$ to $p$ is the unique shortest curve in $\mathcal{N}$ from $o$ to $p$, and $L(\sigma) = r(p)$. Since $\exp_o^{-1}(p) = v$ is in $\tilde{\mathcal{N}}$, $r(p) = |v| < \varepsilon$. It now suffices to prove:

If $\alpha$ is a curve in $M$ starting at $o$ and leaving $\mathcal{N}$, then $L(\alpha) \ge \varepsilon$.

In fact, this will show that $\sigma$ is the unique shortest curve in all $M$ from $o$ to $p$, hence $L(\sigma) = r(p)$ is $d(o, p)$. Thus $d(o, p) < \varepsilon$. But if $q \notin \mathcal{N}$ then $d(o, q) \ge \varepsilon$, showing that $\mathcal{N}$ is the $\varepsilon$-neighborhood of $o$.


To prove the assertion above, note first that since $\alpha$ leaves $\mathcal{N}$ it meets every sphere $S(a)$, $r = a$, with $a < \varepsilon$. If $\alpha_1$ is the shortest initial segment of $\alpha$ from $o$ to $S(a)$ then $\alpha_1$ lies in $\mathcal{N}$; so Lemma 14 gives $L(\alpha) \ge L(\alpha_1) \ge a$ for all $a < \varepsilon$. Thus $L(\alpha) \ge \varepsilon$.

The following simple example illustrates many properties involving geodesics, in particular, the distinction between arbitrary normal neighborhoods and normal $\varepsilon$-neighborhoods.

17. Example. Geodesics on a Cylinder. The Riemannian product manifold $S^1 \times \mathbf{R}^1$ can be viewed as the cylinder $M: x^2 + y^2 = 1$ in $\mathbf{R}^3$. The mapping $\psi(u, v) = (\cos u, \sin u, v)$, which wraps the plane around the cylinder, is a local isometry. Thus the geodesics of $M$ are all curves of the form

$$\gamma(t) = (\cos(at + b), \sin(at + b), ct + d).$$

These are typically helixes, reducing to cross-sectional circles for $c = 0$ and to vertical lines $L$ for $a = 0$.

(a) For any such line $L$, the open set $M - L$ is convex; hence, if $L$ does not pass through the point $o \in M$, then $M - L$ is a normal neighborhood of $o$. Thus by Lemma 14, the radial geodesic from $o$ to $p \in M - L$ is the unique shortest curve in $M - L$ from $o$ to $p$. But evidently the geodesic $\tau$ in Figure 2 is a shorter curve from $o$ to $p$ that does not remain in $M - L$.

(b) For any point $o$, its largest normal $\varepsilon$-neighborhood is $\mathcal{N}_\pi(o)$. Thus by Proposition 16 a radial geodesic segment from $o$ to a point $q$ of this neighborhood is the unique shortest curve in all of $M$ from $o$ to $q$. For a point $r$ outside $\mathcal{N}_\pi(o)$, there is always a shortest curve from $o$ to $r$, but uniqueness is lost if $r$ lies on the vertical line opposite $o$, that is, through $-o$.


Figure 2
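The cylinder of Example 17 lends itself to a small computation. Because the wrapping map $\psi$ is a local isometry from the plane, the Riemannian distance between two points of the cylinder is the smallest planar distance between their preimages. The sketch below uses that fact; the function names and the representation of points as pairs `(u, v)` are assumptions made for this illustration.

```python
import math

def cylinder_distance(p, q):
    """Riemannian distance on the unit cylinder x^2 + y^2 = 1.

    Points are given as (u, v), standing for (cos u, sin u, v).
    """
    du = abs(p[0] - q[0]) % (2 * math.pi)
    du = min(du, 2 * math.pi - du)   # go the shorter way around the circle
    dv = p[1] - q[1]
    return math.hypot(du, dv)

def on_opposite_line(p, q, tol=1e-12):
    """True if q lies on the vertical line opposite p (angle difference pi),
    where two minimizing geodesics from p to q exist."""
    du = abs(p[0] - q[0]) % (2 * math.pi)
    return abs(du - math.pi) < tol

o = (0.0, 0.0)
print(cylinder_distance(o, (3 * math.pi / 2, 0.0)))  # pi/2, not 3*pi/2
print(on_opposite_line(o, (math.pi, 1.0)))           # True
```

The first call reflects remark (a): a geodesic that goes three quarters of the way around is not minimizing, since the other way around is shorter. The second reflects remark (b): on the line through $-o$ uniqueness of the shortest curve is lost.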


It is now easy to show that d has the properties expected of a distance function.

18. Proposition. For a connected Riemannian manifold $M$ the Riemannian distance function $d: M \times M \to \mathbf{R}$ is a metric on $M$, that is, for all $p, q, r \in M$:
(1) $d(p, q) \ge 0$, and $d(p, q) = 0$ if and only if $p = q$ (positive definiteness).
(2) $d(p, q) = d(q, p)$ (symmetry).
(3) $d(p, q) + d(q, r) \ge d(p, r)$ (triangle inequality).

Furthermore $d$ is compatible with the topology of $M$.

Proof. To prove the triangle inequality, for example, given any $\varepsilon > 0$ choose $\alpha \in \Omega(p, q)$ and $\beta \in \Omega(q, r)$ such that

$$L(\alpha) < d(p, q) + \varepsilon \quad \text{and} \quad L(\beta) < d(q, r) + \varepsilon.$$

Joining $\alpha$ and $\beta$ produces a curve $\gamma$ in $\Omega(p, r)$ for which

$$d(p, r) \le L(\gamma) = L(\alpha) + L(\beta) < d(p, q) + d(q, r) + 2\varepsilon.$$

Since $\varepsilon$ is arbitrary the result follows.

The other metric properties are trivial except for $d(p, q) = 0 \Rightarrow p = q$. But if $p \neq q$ then since $M$ is Hausdorff there is a normal neighborhood $\mathcal{U}$ of $p$ that does not contain $q$. The proof of the preceding proposition shows that $\mathcal{U}$ contains an $\varepsilon$-neighborhood of $p$, so $d(p, q) \ge \varepsilon > 0$.

Since every neighborhood of a point of $M$ contains an $\varepsilon$-neighborhood, and (also by the preceding proof) $\varepsilon$-neighborhoods are open sets of $M$, the metric $d$ is compatible with the topology of $M$, that is, a subset $\mathcal{V}$ of $M$ is open if and only if for each $p \in \mathcal{V}$ there is an $\varepsilon > 0$ such that $\mathcal{N}_\varepsilon(p) \subset \mathcal{V}$.

Compatibility means that $\varepsilon$-neighborhoods can be used in the same familiar way as in Euclidean space. Hence, for example, a sequence of points $\{p_i\}$ in $M$ converges to $p \in M$ if and only if the numerical sequence $\{d(p, p_i)\}$ converges to $0$ in $\mathbf{R}$.

By definition of Riemannian distance, a curve segment $\alpha$ from $p$ to $q$ in a Riemannian manifold $M$ is a shortest curve segment from $p$ to $q$ if and only if $L(\alpha) = d(p, q)$. (There may be many or none such segments.) In this case we also say that $\alpha$ minimizes arc length from $p$ to $q$, or merely that $\alpha$ is minimizing. Note that any subsegment of a minimizing segment is also minimizing.


Picturing a minimizing segment as a tightly stretched string correctly suggests that it is geodesic:

19. Corollary. In a Riemannian manifold a minimizing curve segment $\alpha$ from $p$ to $q$ is a monotone reparametrization of an (unbroken) geodesic segment from $p$ to $q$.

Proof. The domain $I$ of $\alpha$ can be decomposed into subintervals $I_i$ such that each $\alpha_i = \alpha | I_i$ lies in a convex open set. (Each $\alpha_i$ can be assumed to be nonconstant, for otherwise $I_i$ could be adjoined to an adjacent subinterval.) Since $\alpha_i$ is minimizing it follows from the uniqueness feature of Lemma 14 that it is a monotone reparametrization of a unit speed geodesic $\sigma_i$. Joining these gives a possibly broken geodesic $\sigma$ from $p$ to $q$. Similarly, the reparametrization functions can be patched together to exhibit $\alpha$ as a monotone reparametrization of $\sigma$. Since $L(\sigma) = L(\alpha) = d(p, q)$, the following useful fact will imply that $\sigma$ is unbroken:

If a geodesic segment $\gamma_1$ ending at $p$ and a geodesic segment $\gamma_2$ starting at $p$ combine to give a minimizing curve segment $\gamma$, then $\gamma$ is an (unbroken) geodesic. (Intuitively, if there were a corner at $p$ we could round it off to get a shorter curve.)

Assume $\gamma$ has constant speed and let $\mathcal{C}$ be a convex neighborhood of $p$. Then a final segment of $\gamma_1$ and an initial segment of $\gamma_2$ combine to give a minimizing curve segment $\bar{\gamma}$ in $\mathcal{C}$. Since $\mathcal{C}$ is a normal neighborhood of the initial point of $\bar{\gamma}$ it follows from Lemma 14 that $\bar{\gamma}$ is a constant speed reparametrization of a radial geodesic. Consequently, $\bar{\gamma}$ and hence $\gamma$ are unbroken geodesics.

20. Example. The Sphere $S^n(r)$. If $p$ and $q$ are distinct nonantipodal points ($q \neq \pm p$) there is a unique great circle through $p$ and $q$. Its shorter arc thus provides the unique minimizing geodesic $\sigma$ from $p$ to $q$. If $\vartheta$ is the angle, $0 < \vartheta < \pi$, between $p$ and $q$ (as vectors in $\mathbf{R}^{n+1}$), it follows that

$$d(p, q) = L(\sigma) = r\vartheta.$$

By continuity, $d(p, -p) = r\pi$. Thus the normal neighborhood $S^n(r) - \{-p\}$ is exactly $\mathcal{N}_{\pi r}(p)$. Each semicircle from $p$ to $-p$ gives a minimizing geodesic from $p$ to $-p$, but a slightly longer geodesic is not minimizing (the other arc of its great circle being shorter). Thus a geodesic of $S^n(r)$ is minimizing if and only if its length is at most $r\pi$. The hemisphere $\mathcal{N}_{\pi r/2}(p)$ is the largest $\varepsilon$-neighborhood of $p$ that is convex.
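The distance formula $d(p, q) = r\vartheta$ of Example 20 is easy to evaluate directly. The sketch below is only an illustration; the function name `sphere_distance` is hypothetical, and points are assumed to be given as coordinate tuples in $\mathbf{R}^{n+1}$ of Euclidean norm $r$.

```python
import math

def sphere_distance(p, q, r):
    """Riemannian distance r * theta on the sphere of radius r, where theta
    is the angle between p and q regarded as vectors in R^(n+1)."""
    dot = sum(a * b for a, b in zip(p, q))
    # Clamp to [-1, 1] to guard against rounding before taking acos.
    cos_theta = max(-1.0, min(1.0, dot / (r * r)))
    return r * math.acos(cos_theta)

r = 2.0
north = (0.0, 0.0, r)
print(sphere_distance(north, (r, 0.0, 0.0), r))   # quarter circle: r*pi/2
print(sphere_distance(north, (0.0, 0.0, -r), r))  # antipodal point: r*pi
```

The antipodal value $r\pi$ is the largest distance attained, matching the statement that a geodesic of $S^n(r)$ is minimizing if and only if its length is at most $r\pi$.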


RIEMANNIAN COMPLETENESS

The fundamental theorem of complete Riemannian manifolds is this:

21. Theorem (Hopf-Rinow). For a connected Riemannian manifold $M$ the following conditions are equivalent:

(MC) As a metric space under Riemannian distance $d$, $M$ is complete; that is, every Cauchy sequence converges.
($C_1$) There exists a point $p \in M$ from which $M$ is geodesically complete, that is, $\exp_p$ is defined on the entire tangent space $T_p(M)$.
(C) $M$ is geodesically complete.
(HB) Every closed bounded subset of $M$ is compact.

The following companion result will be proved simultaneously.

22. Proposition. If a connected Riemannian manifold is complete then any two of its points can be joined by a minimizing geodesic segment.

The open disk in $\mathbf{R}^2$ shows that the converse of the proposition is false. From the viewpoint of semi-Riemannian geometry the most striking feature of the proposition is that arbitrary points can be joined by any geodesic at all, much less a minimizing one. This property, called geodesic connectedness, is equivalent to all exponential maps of $M$ being onto. We shall soon see (after Proposition 38) that for indefinite metrics, connectedness and geodesic completeness do not imply geodesic connectedness.

In general the proposition permits the free use of geodesic constructions throughout $M$, while the Hopf-Rinow theorem itself links the geodesics of a complete Riemannian manifold firmly to its structure as a metric space; for example:

23. Corollary. A compact Riemannian manifold is complete.

Proof. The Heine-Borel condition (HB) holds trivially.

This result too fails for indefinite metrics; see Example 7.16. (Even if an indefinite $M$ is compact, the set of unit vectors in $TM$ is not.)

The following is the essential step in the proof of (21) and (22).

24. Lemma. If $\exp_p$ is defined on all of $T_p(M)$ (condition ($C_1$)), then for any $q \in M$, there is a minimizing geodesic segment from $p$ to $q$.

Proof. Let $\mathcal{U}$ be a normal $\varepsilon$-neighborhood of $p$. We suppose $q \notin \mathcal{U}$, for otherwise the result is trivial. If $r$ is the radius function at $p$, then for $\delta > 0$ sufficiently small, $r = \delta$ defines an entire $(n - 1)$-sphere $S$ in $\mathcal{U}$. The function $s \to d(s, q)$ is continuous on $S$ (compact), hence has a minimum point $m \in S$.

We assert that

$$(*) \qquad d(p, m) + d(m, q) = d(p, q).$$

Let $\alpha: [0, b] \to M$ be any curve from $p$ to $q$. Since $M$ is Hausdorff, $\alpha$ meets $S$ at some parameter value $a$, with $0 < a < b$. Let $\alpha_1$ and $\alpha_2$ be the restrictions of $\alpha$ to $[0, a]$ and $[a, b]$, respectively. Then by Proposition 16,

$$L(\alpha) = L(\alpha_1) + L(\alpha_2) \ge \delta + d(m, q).$$

Hence $d(p, q) \ge \delta + d(m, q) = d(p, m) + d(m, q)$. The reverse inequality is the triangle inequality, so (*) is proved.

Now let $\gamma: [0, \infty) \to M$ be the unit speed geodesic whose initial segment runs radially from $p$ through $m$ (thus $\gamma$ is aimed at $q$). Let $d = d(p, q)$ and let $T$ be the set of all $t \in [0, d]$ such that

$$t + d(\gamma(t), q) = d.$$

It suffices to show that $d \in T$, for then $d(\gamma(d), q) = 0$ hence $\gamma(d) = q$, and since $\gamma | [0, d]$ has length $d = d(p, q)$ it is minimizing. In fact $\gamma | [0, t]$ is minimizing for any $t \in T$ since its length is $t$, and

$$d \le d(p, \gamma(t)) + d(\gamma(t), q) = d(p, \gamma(t)) + d - t,$$

hence $d(p, \gamma(t)) = t$.

The set $T$ is closed by continuity and nonempty by (*); thus it contains a largest number $t_0 \le d$. Assuming $t_0 < d$, we deduce a contradiction as follows. In a normal neighborhood $\mathcal{U}'$ of $\gamma(t_0)$ the same procedure as above produces a unit speed radial geodesic segment $\sigma: [0, \delta'] \to \mathcal{U}'$ from $\gamma(t_0)$ to a point $m' \in \mathcal{U}'$ such that

$$\delta' + d(m', q) = d(\gamma(t_0), q).$$

(See Figure 3.) Because $t_0 \in T$,

$$t_0 + \delta' + d(m', q) = d.$$

Figure 3


Since $d \le d(p, m') + d(m', q)$, another application of the triangle inequality yields $t_0 + \delta' = d(p, m')$. Thus the sum of $\gamma | [0, t_0]$ and $\sigma$ minimizes, so by the proof of Corollary 19 there is no corner at their meeting point $\gamma(t_0) = \sigma(0)$. This means that $m' = \sigma(\delta')$ is $\gamma(t_0 + \delta')$. But then

$$t_0 + \delta' + d(\gamma(t_0 + \delta'), q) = d(p, m') + d(m', q) = d,$$

giving the contradiction

$$t_0 < t_0 + \delta' \in T.$$

Proof of (21) and (22). It suffices to prove the Hopf-Rinow theorem, for Proposition 22 then follows from the lemma.

(MC) $\Rightarrow$ (C). Let $\gamma: [0, b) \to M$ be a unit speed geodesic. If $\{t_i\} \to b$ in $[0, b)$ then $\{\gamma(t_i)\}$ is Cauchy, since $d(\gamma t_i, \gamma t_j) \le |t_i - t_j|$. Hence $\{\gamma(t_i)\}$ converges to some point $q \in M$. For another such sequence $\{s_i\}$ the sequence $\{\gamma(s_i)\}$ converges to the same point $q$, since $d(\gamma t_i, \gamma s_i) \le |t_i - s_i|$. Thus $\gamma$ is continuously extendible; hence by Lemma 8 it is geodesically extendible.

(C) $\Rightarrow$ ($C_1$). Trivial.

($C_1$) $\Rightarrow$ (HB). Let $A$ be a closed bounded subset of $M$. By the preceding lemma, if $q \in A$ there is a minimizing geodesic $\sigma_q: [0, 1] \to M$ from $p$ to $q$. But $|\sigma_q'(0)| = L(\sigma_q) = d(p, q)$, and by the triangle inequality the set of these numbers for all $q \in A$ is bounded, say by $r$. Thus each $\sigma_q'(0)$ is in the compact ball $B_r = \{v \in T_p(M): |v| \le r\}$. Since $\exp_p(B_r)$ is compact and contains the closed set $A$, the latter is compact.

(HB) $\Rightarrow$ (MC). The point set of a Cauchy sequence is bounded, hence its closure is compact. Thus the sequence contains a convergent subsequence and, being Cauchy, must itself converge.

We conclude with another distinctively Riemannian property.

25. Lemma. Every (second countable) smooth manifold admits a Riemannian metric tensor.

Proof. Let $\{f_\alpha\}$ be a partition of unity subordinate to the covering of $M$ by coordinate neighborhoods. For each $\alpha$, pick a coordinate system $x^1, \ldots, x^n$ whose domain $\mathcal{U}$ contains $\operatorname{supp} f_\alpha$, and let $g_\alpha$ be the metric tensor $\sum dx^i \otimes dx^i$ on $\mathcal{U}$. A linear combination of (positive definite) inner products, with positive coefficients, is again an inner product. Thus $\sum f_\alpha g_\alpha$ is a Riemannian metric on $M$.

LORENTZ CAUSAL CHARACTER

To study the tangent spaces of a Lorentz manifold in abstract terms, define a Lorentz vector space to be a scalar product space of index 1 and dimension $\ge 2$. The notion of causal character of vectors has in this context a natural generalization to vector subspaces.


Let $W$ be a subspace of a Lorentz vector space $V$, and let $g$ be the scalar product of $V$. There are three mutually exclusive possibilities for $W$:
(1) $g | W$ is positive definite; that is, $W$ is an inner product space. Then $W$ is said to be spacelike.
(2) $g | W$ is nondegenerate of index 1. Then $W$ is timelike.
(3) $g | W$ is degenerate. Then $W$ is lightlike.

The type into which $W$ falls is called its causal character. This definition is consistent with Definition 3.3 in the sense that the causal character of an individual vector $v$ is the same as the causal character of the subspace $\mathbf{R}v$ it generates. (The zero subspace, like the zero vector, is spacelike.) The following simple result is widely useful.

26. Lemma. If $z$ is a timelike vector in a Lorentz vector space, then the subspace $z^\perp$ is spacelike and $V$ is the direct sum $\mathbf{R}z + z^\perp$.

Proof. The subspace $\mathbf{R}z$ is nondegenerate with index 1. Hence by Lemma 2.23, $z^\perp$ is nondegenerate and $V = \mathbf{R}z + z^\perp$ is a direct sum. Thus $\operatorname{ind} V = \operatorname{ind} \mathbf{R}z + \operatorname{ind} z^\perp$, which implies $\operatorname{ind} z^\perp = 0$. Hence $z^\perp$ is spacelike.

This argument shows, more generally, that a subspace $W$ is timelike if and only if $W^\perp$ is spacelike. Since $(W^\perp)^\perp = W$ the words timelike and spacelike can be reversed in this assertion. It follows then that $W$ is lightlike if and only if $W^\perp$ is lightlike.

Spacelike subspaces $W$ are the easiest to deal with since, for example, every subspace of $W$ is also spacelike and the Schwarz inequality is available: $|\langle v, w\rangle| \le |v| |w|$, with equality if and only if $v$ and $w$ are dependent (collinear). Now we consider some criteria for a subspace to be timelike, omitting the trivial case $\dim W = 1$.

27. Lemma. Let $W$ be a subspace of dimension $\ge 2$ in a Lorentz vector space. Then the following are equivalent:
(1) $W$ is timelike, hence is itself a Lorentz vector space.
(2) $W$ contains two linearly independent null vectors.
(3) $W$ contains a timelike vector.

Proof. (1) $\Rightarrow$ (2). Let $e_1, \ldots, e_m$ be an orthonormal basis for $W$ with $e_1$ the timelike vector. Then $e_1 \pm e_2$ are independent null vectors.
(2) $\Rightarrow$ (3). By Exercise 2, if $u$, $v$ are independent null vectors, then $g(u, v) \neq 0$. Hence one of the vectors $u \pm v$ is timelike.
(3) $\Rightarrow$ (1). If $z$ is a timelike vector in $W$, then $W^\perp \subset z^\perp$ and the latter is spacelike. Hence $W^\perp$ is spacelike. But then $W = (W^\perp)^\perp$ is timelike.


28. Lemma. For a subspace $W$ of a Lorentz vector space the following are equivalent:
(1) $W$ is lightlike, that is, degenerate.
(2) $W$ contains a null vector but not a timelike vector.
(3) $W \cap \Lambda = L - 0$, where $L$ is a one-dimensional subspace and $\Lambda$ is the nullcone of $V$.

Proof. (1) $\Rightarrow$ (2). Since $W$ is degenerate it contains a null vector. By the previous lemma it cannot contain a timelike vector.
(2) $\Rightarrow$ (3). Since $W$ contains a null vector, $W \cap \Lambda$ is nonempty. By the previous lemma, two independent null vectors would imply that $W$ contains a timelike vector.
(3) $\Rightarrow$ (1). $W$ cannot be spacelike, and again by the preceding lemma cannot be timelike. Hence $W$ is lightlike. ($L$ is in fact the nullspace $W \cap W^\perp$ of $W$.)
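For a two-dimensional subspace the three causal characters can be told apart by a single determinant. If $u$, $v$ is a basis of the plane $W$, the Gram determinant $\langle u, u\rangle\langle v, v\rangle - \langle u, v\rangle^2$ is positive when $g | W$ is positive definite, negative when $g | W$ is nondegenerate of index 1, and zero when $g | W$ is degenerate (index 1 of the ambient space rules out a negative definite plane). The sketch below applies this in Minkowski space with the zeroth coordinate timelike; the function names are hypothetical, the input vectors are assumed linearly independent, and integer components are used so that the zero test is exact.

```python
def lorentz(v, w):
    """Minkowski scalar product with the zeroth coordinate timelike."""
    return -v[0] * w[0] + sum(a * b for a, b in zip(v[1:], w[1:]))

def plane_causal_character(u, v):
    """Causal character of span{u, v}, assuming u and v are independent."""
    gram_det = lorentz(u, u) * lorentz(v, v) - lorentz(u, v) ** 2
    if gram_det > 0:
        return "spacelike"   # g|W positive definite
    if gram_det < 0:
        return "timelike"    # g|W nondegenerate of index 1
    return "lightlike"       # g|W degenerate

print(plane_causal_character((0, 1, 0), (0, 0, 1)))  # spacelike
print(plane_causal_character((1, 0, 0), (0, 1, 0)))  # timelike
print(plane_causal_character((1, 1, 0), (0, 0, 1)))  # lightlike
```

In the last case the plane contains the null vector $(1, 1, 0)$ but no timelike vector, as in condition (2) of Lemma 28. With floating-point components the exact comparison with zero would need to be replaced by a tolerance.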

Causal characters for a subspace $W$ are illustrated in Figure 4.

Let $P$ be a submanifold of a Lorentz manifold. If for every $p \in P$ the subspace $T_p(P)$ has the same causal character in $T_p(M)$, then that causal character is attributed to $P$ itself. Thus semi-Riemannian submanifolds of $M$ are either spacelike or timelike. The nullcone in $\mathbf{R}^n_1$ is an example of a lightlike submanifold. Of course an arbitrary submanifold need not have a causal character.

The causal character of tangent vectors, curves, and submanifolds is preserved not just by local isometries but by conformal maps with $h > 0$


Figure 4. Causal character of subspace W .


(see the description preceding Definition 3.63). If $h < 0$ then spacelike and timelike are reversed.

Relativity theory merges three space dimensions and one time dimension into a four-dimensional Lorentz manifold. Instead of making time "the fourth dimension" it is more consistent in dealing with Lorentz manifolds of various different dimensions to make it the zero-th dimension [MTW]. Thus for the natural coordinates of $\mathbf{R}^n_1$ we frequently use the relativistic indexing $t = u^0, u^1, \ldots, u^{n-1}$. Similarly an orthonormal basis for a Lorentz vector space will be denoted by $e_0, e_1, \ldots, e_{n-1}$ with $e_0$ the timelike vector. (However in the diagrams of relativity theory, time axes are customarily drawn to be vertical.)

TIMECONES

Let $\mathcal{T}$ be the set of all timelike vectors in a Lorentz vector space $V$. For $u \in \mathcal{T}$,

$$C(u) = \{v \in \mathcal{T}: \langle u, v\rangle < 0\}$$

is the timecone of $V$ containing $u$. The opposite timecone is

$$C(-u) = -C(u) = \{v \in \mathcal{T}: \langle u, v\rangle > 0\}.$$

Since $u^\perp$ is spacelike, $\mathcal{T}$ is the disjoint union of these two timecones.

29. Lemma. Timelike vectors $v$ and $w$ in a Lorentz vector space are in the same timecone if and only if $\langle v, w\rangle < 0$.

Proof. We show that if $v \in C(u)$ and $w$ is timelike, then $w \in C(u)$ if and only if $\langle v, w\rangle < 0$. Since $C(u/|u|) = C(u)$, we can assume $u$ is a (timelike) unit vector. Write $v = au + \bar{v}$, $w = bu + \bar{w}$, where $\bar{v}, \bar{w} \in u^\perp$. Since these are timelike vectors, $|a| > |\bar{v}|$ and $|b| > |\bar{w}|$. Now $\langle v, w\rangle = -ab + \langle \bar{v}, \bar{w}\rangle$, where by the Schwarz inequality $|\langle \bar{v}, \bar{w}\rangle| \le |\bar{v}| |\bar{w}| < |ab|$. Since $v \in C(u)$, $a > 0$. Hence $\operatorname{sgn}\langle v, w\rangle = \operatorname{sgn}(-ab) = -\operatorname{sgn}(b)$, which gives the result.

It follows that for timelike vectors $u \in C(v) \Leftrightarrow v \in C(u) \Leftrightarrow C(u) = C(v)$.
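Lemma 29 reduces the question of whether two timelike vectors lie in the same timecone to the sign of one scalar product. The following sketch illustrates this in Minkowski space with the zeroth coordinate timelike; the function names are hypothetical and chosen only for this example.

```python
def lorentz(v, w):
    """Minkowski scalar product with the zeroth coordinate timelike."""
    return -v[0] * w[0] + sum(a * b for a, b in zip(v[1:], w[1:]))

def is_timelike(v):
    return lorentz(v, v) < 0

def same_timecone(v, w):
    """Lemma 29: timelike v, w are in the same timecone iff <v, w> < 0."""
    if not (is_timelike(v) and is_timelike(w)):
        raise ValueError("both vectors must be timelike")
    return lorentz(v, w) < 0

v = (2, 1, 0)
w = (3, 0, -1)
print(same_timecone(v, w))                       # True:  <v, w> = -6
print(same_timecone(v, tuple(-x for x in w)))    # False: -w is in the opposite cone
```

Negating one vector flips the sign of the scalar product, which is the computational counterpart of $C(-u) = -C(u)$.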

Furthermore, timecones are convex, for if $v, w \in C(u)$ and $a \ge 0$, $b \ge 0$ (not both zero), then it is easy to check that $av + bw$ is in $C(u)$.

Many features of inner product spaces have novel analogues in the Lorentz case. For example, in an inner product space the Schwarz inequality permits the definition of the angle $\vartheta$ between $v$ and $w$ as the unique number $0 \le \vartheta \le \pi$ such that $\cos \vartheta = \langle v, w\rangle / |v| |w|$. An analogous Lorentz result is as follows.


30. Proposition. Let $v$ and $w$ be timelike vectors in a Lorentz vector space. Then
(1) $|\langle v, w\rangle| \ge |v| |w|$, with equality if and only if $v$ and $w$ are collinear.
(2) If $v$ and $w$ are in the same timecone of $V$, there is a unique number $\varphi \ge 0$, called the hyperbolic angle between $v$ and $w$, such that

$$\langle v, w\rangle = -|v| |w| \cosh \varphi.$$

Proof. (1) Write $w = av + \bar{w}$, with $\bar{w} \in v^\perp$. Since $w$ is timelike,

$$\langle w, w\rangle = a^2\langle v, v\rangle + \langle \bar{w}, \bar{w}\rangle < 0.$$

Then

$$\langle v, w\rangle^2 = a^2\langle v, v\rangle^2 = (\langle w, w\rangle - \langle \bar{w}, \bar{w}\rangle)\langle v, v\rangle \ge \langle w, w\rangle\langle v, v\rangle = |v|^2 |w|^2,$$

since $\langle \bar{w}, \bar{w}\rangle \ge 0$ and $\langle v, v\rangle < 0$. Evidently equality holds if and only if $\langle \bar{w}, \bar{w}\rangle = 0$, which is equivalent to $\bar{w} = 0$, that is, to $w = av$.

(2) If $v$ and $w$ are in the same timecone, then $\langle v, w\rangle < 0$, hence

$$-\langle v, w\rangle / |v| |w| \ge 1,$$

and the result follows from properties of the hyperbolic cosine.
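Proposition 30 can be checked numerically. The sketch below computes the hyperbolic angle from $\cosh \varphi = -\langle v, w\rangle / |v| |w|$ for two timelike vectors in the same timecone of the Minkowski plane, and also confirms the backwards Schwarz inequality for a sample pair. The function names are hypothetical; clamping the argument of `acosh` at 1 is a safeguard against rounding added for this sketch.

```python
import math

def lorentz(v, w):
    """Minkowski scalar product with the zeroth coordinate timelike."""
    return -v[0] * w[0] + sum(a * b for a, b in zip(v[1:], w[1:]))

def norm(v):
    """|v| = |<v, v>|^(1/2)."""
    return math.sqrt(abs(lorentz(v, v)))

def hyperbolic_angle(v, w):
    """Hyperbolic angle between timelike v, w in the same timecone."""
    if lorentz(v, v) >= 0 or lorentz(w, w) >= 0 or lorentz(v, w) >= 0:
        raise ValueError("v and w must be timelike and in the same timecone")
    c = -lorentz(v, w) / (norm(v) * norm(w))
    return math.acosh(max(1.0, c))

phi = 0.5
v = (1.0, 0.0)
w = (math.cosh(phi), math.sinh(phi))   # unit timelike vector boosted by phi
print(hyperbolic_angle(v, w))          # recovers phi up to rounding

a, b = (2.0, 1.0), (3.0, -1.0)
print(abs(lorentz(a, b)) >= norm(a) * norm(b))   # backwards Schwarz inequality
```

Here $|\langle a, b\rangle| = 7$ while $|a| |b| = \sqrt{3}\sqrt{8} \approx 4.9$, so the inequality runs in the direction opposite to the one familiar from inner product spaces.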

Since the Schwarz inequality runs backwards in this context, so does the triangle inequality.

31. Corollary. If $v$ and $w$ are (timelike) vectors in the same timecone, then $|v| + |w| \le |v + w|$, with equality if and only if $v$ and $w$ are collinear.

Proof. Since $\langle v, w\rangle < 0$, the backwards Schwarz inequality gives $|v| |w| \le -\langle v, w\rangle$. Hence

$$(|v| + |w|)^2 = |v|^2 + 2|v| |w| + |w|^2 \le -\langle v + w, v + w\rangle = |v + w|^2.$$

This becomes an equality if and only if $|v| |w| = -\langle v, w\rangle$. But the latter term is $|\langle v, w\rangle|$, so the previous proposition gives the collinearity criterion.

To our Euclidean intuitions it can only seem distressing at first that a straight line segment is no longer the shortest route between two points, that cutting across a corner makes a trip longer rather than shorter. But the preceding result is fundamental in Lorentz geometry and its applications to relativity theory.

The existence of timecones raises a fundamental global question about an arbitrary Lorentz manifold $M$. In each (Lorentz) tangent space $T_p(M)$ there are two timecones, and there is no intrinsic way to distinguish one from the other. To choose one of them is to time-orient $T_p(M)$. The question is: Can every tangent space of $M$ be time-oriented in a suitably continuous way?


Let $\tau$ be a function on $M$ that assigns to each point $p$ a timecone $\tau_p$ in $T_p(M)$. $\tau$ is smooth if for each $p \in M$ there is a (smooth) vector field $V$ on some neighborhood $\mathcal U$ of $p$ such that $V_q \in \tau_q$ for each $q \in \mathcal U$. Such a smooth function is called a time-orientation of $M$. If $M$ admits a time-orientation, then $M$ is said to be time-orientable. Then to choose a specific time-orientation on $M$ is to time-orient $M$. For example, Minkowski space $\mathbf R^n_1$ is time-orientable; its usual time-orientation is the one containing the coordinate vector field $\partial_0$ of natural coordinates $u^0, \ldots, u^{n-1}$.

32. Lemma. A Lorentz manifold $M$ is time-orientable if and only if there exists a timelike vector field $X \in \mathfrak X(M)$.

Proof. If such an $X$ exists then (as above for $\mathbf R^n_1$) assigning to each $p \in M$ the timecone containing $X_p$ gives a time-orientation. Conversely, let $\tau$ be a time-orientation of $M$. Since $\tau$ is smooth, each point of $M$ has a neighborhood $\mathcal U$ on which is defined a timelike vector field $X_{\mathcal U}$ whose value at each $p \in \mathcal U$ is in $\tau_p$. Now let $\{f_\alpha \mid \alpha \in A\}$ be a smooth partition of unity subordinate to the covering of $M$ by all such neighborhoods. Thus each $\operatorname{supp} f_\alpha$ is contained in some member $\mathcal U(\alpha)$ of the covering. The functions $f_\alpha$ are nonnegative and timecones are convex. Thus the vector field $X = \sum f_\alpha X_{\mathcal U(\alpha)}$ is timelike. ∎

For example, all Lorentz spheres $S^n_1$ are time-orientable, because if $\partial_0$ is the (timelike) natural coordinate vector field on $\mathbf R^{n+1}_1$, then $X = \tan \partial_0$ is a timelike vector field on $S^n_1$. Also, all Lorentz hyperbolic spaces $H^n_1 \subset \mathbf R^{n+1}_2$ are time-orientable. In fact, since $P = \sum u^i \partial_i$ is normal to $H^n_1$, it is easy to check that $u^2 \partial_1 - u^1 \partial_2$ is tangent to $H^n_1$ and timelike.

For a Lorentz manifold there is no relation between orientability (1.41) and time-orientability. For example, it is easy to assign a time-orientable Lorentz metric to the orientable band $S^1 \times I$ and a not time-orientable metric to the (nonorientable) Möbius band. The reverse situation is suggested in Figure 5.

Figure 5. Two panels, labeled "Not time-orientable" and "Time-orientable".


In a Lorentz vector space a vector that is nonspacelike (hence either null or timelike) is also said to be causal. For a timelike vector $u$ the set $\bar C(u)$ of all causal vectors $w$ such that $\langle u, w\rangle < 0$ is the causal cone containing $u$. Causal cones have properties quite similar to timecones (see Exercise 3). In a Lorentz manifold a causal curve is one whose velocity vectors are all nonspacelike.

LOCAL LORENTZ GEOMETRY

Since Lorentz manifolds have index 1, it is natural to focus attention on their timelike curves. In the case of a piecewise smooth curve $\alpha$, timelike means not only that every $\alpha'(t)$ is timelike, but that at each break $t_i$ of $\alpha$
$$\langle \alpha'(t_i^-), \alpha'(t_i^+)\rangle < 0.$$
Here the first vector derives from $\alpha \mid [t_{i-1}, t_i]$ and the second from $\alpha \mid [t_i, t_{i+1}]$. Thus $\alpha'$ does not switch timecones at a break. Similarly, we require that a piecewise smooth causal curve does not switch causal cones at a break.

33. Lemma. Let $o$ be a point of a Lorentz manifold $M$. Suppose that $\beta\colon [0, b] \to T_o(M)$ is a piecewise smooth curve starting at $0$ such that $\alpha = \exp_o \circ \beta$ is timelike. Then $\beta$ remains in a single timecone of $T_o(M)$.

Proof. Suppose first that $\beta$ (hence $\alpha$) is smooth. In the following argument, initially will mean: for all $0 < t < \varepsilon$ with $\varepsilon > 0$ sufficiently small. Since $\beta'(0)$ is timelike and timecones are convex open sets, initially $\beta$ is in a single timecone $C$. Since the position vector field $P$ is outward radial and timelike on $C$, initially $\langle \beta', P\rangle$ is negative. Since $\operatorname{grad} \tilde q = 2P$,
$$\frac{d(\tilde q \circ \beta)}{dt} = 2\langle \beta', P\rangle,$$
and by the Gauss lemma
$$\langle \alpha', P\rangle = \langle \beta', P\rangle.$$
It follows that initially $\langle \alpha', P\rangle$ and $d(\tilde q \circ \beta)/dt$ are negative. So long as $\beta$ remains in $C$, the position vector $P$ along $\beta$ and (by the Gauss lemma) its image along $\alpha$ remain timelike. Thus $\langle \alpha', P\rangle$, hence $\langle \beta', P\rangle$, hence $d(\tilde q \circ \beta)/dt$ remain negative. But $\beta$ can leave $C$ only by reaching either $0$ or the nullcone, on both of which $\tilde q$ is $0$. Thus $\beta$ must remain in $C$.

Now suppose that $\beta$ (hence $\alpha$) is merely piecewise smooth. We know from above that on its first smooth segment $\beta$ stays in $C$; thus at the first break, $\langle \beta'(t_1^-), P\rangle < 0$. Hence by the Gauss lemma $\langle \alpha'(t_1^-), P_1\rangle < 0$, where $P_1 = d\exp_o(P_{\beta(t_1)})$. The additional condition on $\alpha$ at breaks keeps $\alpha'(t_1^+)$ in the same timecone, namely that of $P_1$. So, again by the Gauss lemma, $\langle \beta'(t_1^+), P\rangle < 0$. Thus it follows as above that $d(\tilde q \circ \beta)/dt$ cannot change signs at breaks, hence the argument for the smooth case remains valid. ∎

Minor changes in this proof show that the lemma remains true if the words timelike and timecone are replaced by causal and causal cone. We can now prove a Lorentz analogue of Lemma 14 for timelike curves.

34. Proposition. Let $\mathcal U$ be a normal neighborhood of $o$ in a Lorentz manifold $M$. If there exists a timelike curve in $\mathcal U$ from $o$ to $p$, then the radial geodesic segment $\sigma$ from $o$ to $p$ is the unique longest timelike curve in $\mathcal U$ from $o$ to $p$.

Proof. As before, uniqueness is up to monotone reparametrization. If $\alpha$ is any timelike curve in $\mathcal U$ from $o$ to $p$, then by the lemma, $\beta = \exp_o^{-1} \circ \alpha$ lies in a single timecone $C$. Hence (but for $\alpha(0) = o$), $\alpha$ lies in a region on which $U = P/r$ is a unit timelike vector field. In particular $\sigma$ is timelike. The argument is now a straightforward variant of that in the Riemannian case. Write
$$\alpha' = -\langle \alpha', U\rangle U + N,$$
where $N$ is a (spacelike) vector field on $\alpha$ orthogonal to $U$. Then
$$|\alpha'| = (-\langle \alpha', \alpha'\rangle)^{1/2} = [\langle \alpha', U\rangle^2 - \langle N, N\rangle]^{1/2} \le |\langle \alpha', U\rangle|.$$
Since $r = \sqrt{-q}$ along $\alpha$ we have
$$\operatorname{grad} r = -P/r = -U.$$
By the lemma, $\langle \beta', P\rangle$, hence $\langle \alpha', U\rangle$, is negative, so
$$|\langle \alpha', U\rangle| = -\langle \alpha', U\rangle = d(r \circ \alpha)/dt.$$
Consequently,
$$L(\alpha) = \int |\alpha'|\,dt \le r(p) = L(\sigma).$$
If $L(\alpha) = L(\sigma)$, then as in the Riemannian case, $N = 0$ and $\alpha$ is a monotone reparametrization of $\sigma$; in fact, $\alpha(t) = \sigma(r(\alpha(t))/r(p))$. ∎

This proof works because if the timelike curve $\alpha$ strays from the radial geodesic $\sigma$ its velocity acquires a spacelike component $N$ that serves to reduce $|\alpha'|$ relative to $|\sigma'|$, hence to reduce $L(\alpha)$ relative to $L(\sigma)$. As before the result can be refined by omitting the normal neighborhood and comparing $\sigma$ with timelike curves of the form $\alpha = \exp_o \circ \beta$, where $\beta$ is a curve in $T_o(M)$ from $0$ to $v$ such that $\exp_o(v) = p$ and $d\exp_o$ is nonsingular at each point of $\beta$.


In Minkowski space $\mathbf R^n_1$ the entire manifold is a normal neighborhood of each point. Thus the proposition implies that any timelike geodesic segment in $\mathbf R^n_1$ is the unique longest timelike curve joining its endpoints.

35. Example. Lorentz Cylinders. (1) The Lorentz surface $M = S^1_1 \times \mathbf R^1$ can be viewed as a cylinder in $\mathbf R^3_2$ (see Figure 6). It has essentially the same connection, hence same geodesics, as its Riemannian analogue in Example 17. Nullcones on $M$ are marked out by intersecting null geodesics, for example $(\pm\cos s, \sin s, s + c)$. Assertion (a) in Example 17 has the following Lorentz analogue: By Proposition 34 the timelike geodesic $\sigma$ shown in Figure 6 is the unique longest timelike curve in $M - L$ from $o$ to $p$. There are longer curves in $M - L$ that are not timelike, and there are longer timelike curves that do not remain in $M - L$. Assertion (b) has no Lorentz analogue. Explicitly, there exists no normal neighborhood of $o$ whose timelike radial geodesic segments are the longest in $M$ joining their endpoints. In fact, any two points of $M$ can be joined by arbitrarily long timelike geodesics (spiraling repeatedly around the cylinder).
(2) The Lorentz cylinder $\mathbf R^1_1 \times S^1$ can be gotten by reversing the metric tensor of $S^1_1 \times \mathbf R^1$, hence reversing the causal character of geodesics. It is not hard to see that the analogue of (b) mentioned above is valid now.

Figure 6. $S^1_1 \times \mathbf R^1$ (labels in the figure: $L$, timecones).

In favorable cases a Riemannian manifold can be turned into a Lorentz manifold as follows.

36. Lemma. Suppose $U$ is a unit vector field on a Riemannian manifold $M$ with metric tensor $g$. Then $\tilde g = g - 2U^* \otimes U^*$ is a Lorentz metric on $M$. Furthermore, $U$ becomes timelike so the resulting Lorentz manifold is time-orientable.

Proof. Recall that $U^*(X) = g(U, X)$ for vector fields $X$. Locally there exist vector fields $E_j$ such that $U, E_2, \ldots, E_n$ is a frame field relative to $g$. Then $\tilde g(E_i, E_j) = g(E_i, E_j) = \delta_{ij}$, and $\tilde g(U, E_j) = g(U, E_j) = 0$, but
$$\tilde g(U, U) = -1.$$
The final assertion follows from Lemma 32. ∎
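The construction in Lemma 36 can be checked pointwise in a single tangent space. The sketch below is an illustration under stated assumptions rather than anything from the original text: it takes $g$ to be the dot product on $\mathbf R^3$, picks a hypothetical unit vector `U`, extends it to a $g$-orthonormal frame by Gram-Schmidt, and evaluates $\tilde g = g - 2U^* \otimes U^*$ on that frame. The function names `lorentzify` and `orthonormal_frame` are invented for this example.

```python
import math

def dot(x, y):
    """The Riemannian scalar product g, here the ordinary dot product."""
    return sum(a * b for a, b in zip(x, y))

def lorentzify(U):
    """Return g~(X, Y) = g(X, Y) - 2 g(U, X) g(U, Y) for a g-unit vector U."""
    def g_tilde(X, Y):
        return dot(X, Y) - 2.0 * dot(U, X) * dot(U, Y)
    return g_tilde

def orthonormal_frame(U):
    """Extend the unit vector U to a g-orthonormal frame U, E_2, ..., E_n."""
    n = len(U)
    frame = [list(U)]
    for i in range(n):
        v = [1.0 if j == i else 0.0 for j in range(n)]
        for e in frame:
            c = dot(v, e)
            v = [a - c * b for a, b in zip(v, e)]
        length = math.sqrt(dot(v, v))
        if length > 1e-9:  # skip a basis vector already spanned by the frame
            frame.append([a / length for a in v])
        if len(frame) == n:
            break
    return frame

U = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]   # hypothetical unit vector
frame = orthonormal_frame(U)
g_tilde = lorentzify(U)

# Matrix of g~ in the frame U, E_2, E_3.
gram = [[g_tilde(X, Y) for Y in frame] for X in frame]
```

Up to rounding, `gram` is the diagonal matrix with entries $-1, 1, 1$, which is the index 1 signature asserted in the lemma.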

For example, both Lorentz cylinders in Example 35 can be gotten in this way from the Riemannian cylinder $S^1 \times \mathbf R^1$. By contrast with the Riemannian case, not every smooth manifold can be made a Lorentz manifold.

37. Proposition. For a smooth manifold $M$ the following are equivalent:
(1) There exists a Lorentz metric on $M$.
(2) There exists a time-orientable Lorentz metric on $M$.
(3) There is a nonvanishing vector field on $M$.
(4) Either $M$ is noncompact, or $M$ is compact and has Euler number $\chi(M) = 0$.

Proof. (3) $\Leftrightarrow$ (4). This is a standard topological result [V]. (2) $\Rightarrow$ (1) is trivial. (3) $\Rightarrow$ (2). By Lemma 25 there is a Riemannian metric tensor on $M$. For $X \in \mathfrak X(M)$ nonvanishing, apply the preceding lemma to $X/|X|$. (2) $\Rightarrow$ (3) is immediate from Lemma 32.

Thus it suffices to prove (1) $\Rightarrow$ (4). If $M$ is time-orientable the preceding results establish (4). If $M$ is not time-orientable we will see in Chapter 7 that it has a double-covering Lorentz manifold $\tilde M$ that is time-orientable. Thus $\tilde M$ is either noncompact or has $\chi(\tilde M) = 0$. Because the covering map $\tilde M \to M$ sends two points to one, $M$ is compact if and only if $\tilde M$ is, and in the compact case $\chi(M) = \chi(\tilde M)/2 = 0$. ∎

For example, the only compact surfaces that can be made Lorentz surfaces are the torus and Klein bottle. Also, a sphere $S^n$ admits a Lorentz metric if and only if $n$ is odd $\ge 3$.

GEODESICS IN HYPERQUADRICS

For any semi-Riemannian manifold a basic problem is to understand the global behavior of its geodesics. As a simple example we consider hyperquadrics. Proposition 4.28 tells what the geodesics of $S^n_\nu(r)$ are; a closer look at its proof shows where they go. (See Figure 4.3.)

38. Proposition. Let $p$ and $q$ be distinct nonantipodal points of $S^n_\nu(r)$.
(1) If $\langle p, q\rangle > r^2$, then $p$ and $q$ lie on a unique geodesic, which is timelike and one-to-one.
(2) If $\langle p, q\rangle = r^2$, then $p$ and $q$ lie on a unique geodesic, which is also a null geodesic of $\mathbf R^{n+1}_\nu$.
(3) If $-r^2 < \langle p, q\rangle < r^2$, then $p$ and $q$ lie on a unique geodesic, which is spacelike and periodic.
(4) If $\langle p, q\rangle \le -r^2$, there exists no geodesic joining $p$ and $q$.

Proof. The hypotheses on $p$ and $q$ imply that they lie in a unique plane $\Pi$ through the origin of $\mathbf R^{n+1}_\nu$, and by Proposition 4.28 the only geodesic that can possibly pass through both $p$ and $q$ is a parametrization of a component of the one-dimensional manifold $\Pi \cap S^n_\nu(r)$. Consider now the three cases in the proof of that proposition.

Case 1. $\Pi$ is positive definite. Then $\Pi \cap S^n_\nu(r)$ is a circle in $\Pi$ parametrized by a (periodic) spacelike geodesic and $-r^2 < \langle p, q\rangle < r^2$.

Case 2. $\Pi$ is nondegenerate indefinite. As in Example 2.21, $\Pi \cap S^n_\nu(r) = \{x \in \Pi : \langle x, x\rangle = r^2\}$ is a hyperbola of two branches. It is easy to see that $p$ and $q$ are on the same branch if and only if $\langle p, q\rangle > r^2$, and are on opposite branches if and only if $\langle p, q\rangle < -r^2$. In the former case $p$ and $q$ lie on a (timelike) geodesic; in the latter on no geodesic.

Case 3. $\Pi$ is degenerate. Proposition 4.28 shows that $\Pi \cap S^n_\nu(r)$ consists of two parallel null straight lines of $\mathbf R^{n+1}_\nu$, with $p$ and $q$ on the same line if and only if $\langle p, q\rangle = r^2$; on opposite lines if and only if $\langle p, q\rangle = -r^2$.

Since the various restrictions on $\langle p, q\rangle$ above are mutually exclusive, the result follows. ∎

The corresponding result for pseudohyperbolic spaces derives as usual from the anti-isometry of Lemma 4.24. These results let us answer reasonable questions about the geodesics of hyperquadrics. For example, it follows, as predicted by the Hopf-Rinow theorem, that spheres and hyperbolic spaces are geodesically connected. By contrast an indefinite hyperquadric $Q$ is never geodesically connected, since it always contains points $p$, $q$ for which $\langle p, q\rangle$ has arbitrarily large positive and negative values. In fact, taking $Q = S^n_\nu(r)$, the points connectable to $p$ by geodesics (that is, the image of $\exp_p$) consist of $-p$ and all points of $Q$ on the same side of the hyperplane $\{x : \langle p, x\rangle = -r^2\} \approx T_{-p}(Q)$ as $p$ itself.

GEODESICS IN SURFACES

A semi-Riemannian surface is either Lorentz or definite, and the latter can be supposed, as usual, to be Riemannian. Even in dimension 2 the geodesic differential equations remain quite complicated, so we consider a special case: surfaces that admit coordinate systems $x, y$ for which $E$, $F$, $G$ depend only on one coordinate, say $x$; hence, $E_y = F_y = G_y = 0$. On the domain of such a coordinate system the geometry is "constant in the $y$ direction"; that is, each coordinate $y$-translation $(x, y) \to (x, y + c)$ is an isometry.

39. Lemma. Let $x, y$ be surface coordinates with $E_y = F_y = G_y = 0$.
(1) If $\gamma$ is a geodesic with coordinate functions $x(s)$, $y(s)$ then
$$\langle \gamma', \partial_y\rangle = Fx' + Gy' = C$$
(the conservation equation), where $C$ is constant. Hence
$$(EG - F^2)x'^2 + C^2 = G\varepsilon,$$
where $\varepsilon = \langle \gamma', \gamma'\rangle = Ex'^2 + 2Fx'y' + Gy'^2$.
(2) The $y$-coordinate curve, $x = x_0$, is a geodesic if and only if $G_x(x_0) = 0$.
(3) If, furthermore, $F = 0$ then every $x$-coordinate curve is pregeodesic.

Proof. (0) In view of the identities $\langle \partial_x, D_{\partial_x}\partial_y\rangle = E_y/2$, $\langle \partial_y, D_{\partial_y}\partial_y\rangle = G_y/2$, and $\langle \partial_x, D_{\partial_y}\partial_y\rangle + \langle \partial_y, D_{\partial_x}\partial_y\rangle = F_y$, the hypotheses on $E$, $F$, $G$ are equivalent to $\langle V, D_V(\partial_y)\rangle = 0$ for all $V$. (A more conceptual reason for this identity appears in Chapter 9.)

(1) The derivative of $\langle \gamma', \partial_y\rangle$ is $\langle \gamma'', \partial_y\rangle + \langle \gamma', D_{\gamma'}\partial_y\rangle = 0$. Subtracting $C^2 = (Fx' + Gy')^2$ from the coordinate formula for $G\varepsilon = G\langle \gamma', \gamma'\rangle$ gives the differential equation.

(2) This curve $\beta$ has $\beta' = \partial_y$, hence
$$\langle \beta'', \partial_x\rangle = \langle D_{\partial_y}\partial_y, \partial_x\rangle = -G_x(x_0)/2; \qquad \langle \beta'', \partial_y\rangle = \langle D_{\partial_y}\partial_y, \partial_y\rangle = 0.$$

(3) Since $F = 0$, to show that the $x$-coordinate curve $\alpha$ is pregeodesic it suffices to observe that $\langle \alpha'', \partial_y\rangle = \langle D_{\partial_x}\partial_x, \partial_y\rangle = F_x - E_y/2 = 0$. ∎

Evidently the lemma remains valid under the reversal $x \leftrightarrow y$, hence $E \leftrightarrow G$. Since the conservation equation is a first-order differential equation it is generally much easier to deal with than the (second-order) geodesic differential equations. A practical way to search for geodesics, under the hypotheses of the lemma, is to try to find the constant-speed solutions of the conservation equation. This class of curves contains all geodesics, and little else (only linear parametrizations of $y$-parameter curves).

40. Example. The Poincaré Half-plane $P$. By Exercise 3.8, $P$ is the region $v > 0$ in $\mathbf R^2$ with line element $ds^2 = (du^2 + dv^2)/v^2$ and constant curvature $K = -1$. The preceding lemma applies with $u = y$ and $v = x$.

(1) Isometries. For any numbers $a$ and $r > 0$, the mappings $(u, v) \to (u + a, v)$ and $(u, v) \to (ru, rv)$ are isometries.

(2) Geodesics. The conservation equation is $\langle \gamma', \partial_u\rangle = u'/v^2 = C$. For $C = 0$ this yields the vertical lines $u$ constant, so suppose $C \ne 0$. Assuming $|\gamma'| = 1$ we get $(u'^2 + v'^2)/v^2 = 1$. Substituting $u' = Cv^2$ yields $v'^2 = v^2(1 - C^2v^2)$. Since $u'$ is never zero,
$$\frac{dv}{du} = \frac{v'}{u'} = \pm\frac{\sqrt{1 - C^2v^2}}{Cv}.$$
Let $r = 1/C$; then an integration gives
$$\pm\sqrt{r^2 - v^2} = u - u_0, \quad\text{that is,}\quad (u - u_0)^2 + v^2 = r^2.$$
Thus the geodesics of $P$ are the constant speed parametrizations of all vertical lines and all semicircles centered on the $u$ axis.

This surface gives a concrete model for the so-called non-Euclidean geometry of Bolyai and Lobachevski, which satisfies all the axioms of Euclidean plane geometry except the parallel postulate. Indeed, through each point of $P$ not on a geodesic $\gamma$ there pass infinitely many geodesics that do not meet $\gamma$. (It will turn out that $P$ is isometric to the hyperbolic plane $H^2(1)$.)

Lorentz surfaces have two notable features not shared by higher dimensional Lorentz manifolds. First, reversing the metric gives again a Lorentz surface. Thus Lorentz surfaces with $K > 0$ differ from those with $K < 0$ only in causal character. Second, every null curve is pregeodesic. In fact, $\langle \alpha', \alpha'\rangle = 0$ implies $\langle \alpha'', \alpha'\rangle = 0$, and $\alpha'^{\perp}$ is one-dimensional for a surface, so $\alpha''$ must be collinear with $\alpha' \ne 0$. Thus $\alpha$ is not turning, and hence (Exercise 3.19) is pregeodesic. In terms of arbitrary coordinates, the null curves are the solutions of
$$\langle \alpha', \alpha'\rangle = Eu'^2 + 2Fu'v' + Gv'^2 = 0.$$
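Returning to Example 40, the conservation equation and the unit-speed condition can be checked numerically on a semicircle. The sketch below is illustrative only and rests on an assumption not stated in the text: that $u = u_0 + r\tanh s$, $v = r\,\operatorname{sech} s$ is a unit-speed parametrization of the semicircle of radius $r$ centered at $(u_0, 0)$. The function names and the sample values of `u0` and `r` are hypothetical.

```python
import math

def semicircle(s, u0=0.5, r=2.0):
    """Assumed unit-speed parametrization of a semicircle centered on the u axis.

    Returns the point (u, v) and the velocity components (du, dv) at parameter s.
    """
    u = u0 + r * math.tanh(s)
    v = r / math.cosh(s)
    du = r / math.cosh(s) ** 2
    dv = -r * math.tanh(s) / math.cosh(s)
    return u, v, du, dv

def speed_squared(v, du, dv):
    """<gamma', gamma'> for the line element (du^2 + dv^2) / v^2."""
    return (du * du + dv * dv) / (v * v)

def conserved(v, du):
    """The conserved quantity <gamma', d/du> = u' / v^2."""
    return du / (v * v)

params = (-2.0, -0.5, 0.0, 1.0, 3.0)
samples = [semicircle(s) for s in params]
speeds = [speed_squared(v, du, dv) for (_, v, du, dv) in samples]
constants = [conserved(v, du) for (_, v, du, _) in samples]
radii_squared = [(u - 0.5) ** 2 + v ** 2 for (u, v, _, _) in samples]
```

With these sample values the speed stays equal to 1 and the conserved quantity stays equal to $C = 1/r = 0.5$ along the curve, consistent with the relation $r = 1/C$ used in the integration.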

41. Example. The Schwarzschild Half-plane $P_I$. For constant $M > 0$ let $h(r) = 1 - (2M/r)$. Then $P_I$ is the region $r > 2M$ in the $tr$-plane, furnished with Lorentz line element
$$ds^2 = -h\,dt^2 + h^{-1}\,dr^2.$$

(1) Curvature. In terms of Proposition 3.44(2) take $\varepsilon_1 = -1$, $\varepsilon_2 = 1$, $e = \sqrt{h}$, $g = 1/\sqrt{h}$. Then we compute $K = 2M/r^3 > 0$.

(2) Isometries. The mappings $(t, r) \to (t + b, r)$ are isometries.

(3) Geodesics. Lemma 39 applies with $x = r$ and $y = t$. The $r$-coordinate curves are spacelike pregeodesics. We now find the null geodesics, which obey
$$-h\,t'^2 + h^{-1}r'^2 = 0.$$
The conservation equation is $\langle \gamma', \partial_t\rangle = -h\,t' = \text{const}$. Because of (2) it will be enough to find a single null geodesic, so by a convenient choice of constants we reduce the above equation to just $r' = 1$. Hence $r = s + b$. To integrate
$$\Bigl(1 - \frac{2M}{s + b}\Bigr)\frac{dt}{ds} = 1,$$
choose $b = 2M$, which yields
$$t = s + 2M \ln s, \qquad r = s + 2M \qquad \text{for } s > 0.$$
Applying the isometries in (2) gives explicit parametrizations for all the null geodesics of $P_I$. Intersecting null geodesics mark out the null cones of $P_I$ (Figure 7).

Figure 7. Null geodesics in $P_I$.
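The explicit null geodesic of Example 41 can be checked against both the null condition and the conservation equation. The following sketch is an illustration, not part of the original text; the value `M = 1.0`, the function names, and the sample parameters are assumptions made for the example, and the derivatives are written out by hand from $t = s + 2M\ln s$, $r = s + 2M$.

```python
import math

M = 1.0  # hypothetical mass parameter

def h(r):
    """h(r) = 1 - 2M/r, positive on the region r > 2M."""
    return 1.0 - 2.0 * M / r

def null_geodesic(s):
    """The curve t = s + 2M ln s, r = s + 2M (s > 0) with its s-derivatives."""
    t = s + 2.0 * M * math.log(s)
    r = s + 2.0 * M
    dt = 1.0 + 2.0 * M / s
    dr = 1.0
    return t, r, dt, dr

def line_element(r, dt, dr):
    """<gamma', gamma'> = -h t'^2 + r'^2 / h."""
    return -h(r) * dt * dt + dr * dr / h(r)

def conserved(r, dt):
    """The conserved quantity <gamma', d/dt> = -h t'."""
    return -h(r) * dt

checks = []
for s in (0.01, 1.0, 50.0):
    t, r, dt, dr = null_geodesic(s)
    checks.append((line_element(r, dt, dr), conserved(r, dt)))

# As s decreases to 0, r tends to 2M while t tends to minus infinity.
t_near_horizon = null_geodesic(1e-6)[0]
```

Each entry of `checks` is approximately `(0.0, -1.0)`: the curve is null and the conserved quantity equals $-1$. The large negative value of `t_near_horizon` reflects that $t \to -\infty$ as $r \to 2M$, even though the parameter $s$ only runs down to $0$.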

The Schwarzschild half-plane is the essential building block in the simplest relativistic model of the region around a star (Chapter 13). Its null geodesics describe radial light rays sluggishly approaching (or departing) radius $r = 2M$.

42. Definition. A coordinate system $u, v$ in a Lorentz surface is null provided its coordinate curves are null. Thus the line element has the form $ds^2 = 2F\,du\,dv$, where as usual $F = \langle \partial_u, \partial_v\rangle$.

The local geometry of the surface can then be traced back to the function $F$ (see Exercise 8). As an example, rotating the natural coordinates $t, x$ of $\mathbf R^2_1$ by 45° gives null coordinates
$$u = (1/\sqrt 2)(-t + x), \qquad v = (1/\sqrt 2)(t + x)$$
for which $2\,du\,dv = -dt^2 + dx^2$.


COMPLETENESS AND EXTENDIBILITY

For a manifold with indefinite metric, completeness is a more subtle notion than in the Riemannian case, since the Hopf-Rinow theorem (21) has no satisfactory generalization. In discussing completeness it suffices as usual to consider only geodesics defined on $[0, b)$; left endpoints can be handled similarly. Lemma 8 gives a convenient topological criterion for the inextendibility of a geodesic $\gamma\colon [0, b) \to M$; namely, there is a parameter sequence $\{s_i\} \to b$ such that $\{\gamma(s_i)\}$ does not converge. (If two such sequences converge to different points, interlacing them gives a nonconvergent sequence.) Thus for example all the geodesics of the Poincaré half-plane are inextendible (at both ends) as are all the null geodesics of the Schwarzschild half-plane.

Sometimes pregeodesics are available but not their geodesic parametrizations. A spacelike or timelike pregeodesic $\alpha\colon [0, b) \to M$ is complete (to the right) if and only if it has infinite length. This is clear since the unit speed reparametrization of $\alpha$ is a geodesic defined on the interval $[0, L(\alpha))$. In this way it is easy to check that the Poincaré half-plane is complete; for example, a typical "semicircular" pregeodesic $\alpha(s) = (\sin s, \cos s)$, $0 \le s < \pi/2$, has $\langle \alpha', \alpha'\rangle = \sec^2 s$, hence
$$L(\alpha) = \int_0^{\pi/2} \sec s\,ds = \infty.$$
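The divergence of this length integral can be seen numerically. The sketch below is an illustration under stated assumptions: it uses the standard antiderivative $\int_0^a \sec s\,ds = \ln(\sec a + \tan a)$ and compares it with a midpoint-rule sum; the function names and the step count are choices made for this example only.

```python
import math

def length_closed_form(a):
    """Length of alpha on [0, a], a < pi/2, via ln(sec a + tan a)."""
    return math.log(1.0 / math.cos(a) + math.tan(a))

def length_numeric(a, n=100000):
    """Midpoint-rule approximation of the integral of sec s over [0, a]."""
    step = a / n
    return sum(step / math.cos((k + 0.5) * step) for k in range(n))

# Lengths of longer and longer initial segments of the semicircle.
cutoffs = [math.pi / 2 - 10.0 ** (-k) for k in (1, 2, 4, 6)]
lengths = [length_closed_form(a) for a in cutoffs]
```

The entries of `lengths` grow without bound as the cutoff approaches $\pi/2$ (roughly like $\ln(2/\delta)$ when the cutoff is $\pi/2 - \delta$), which is the sense in which the pregeodesic has infinite length.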

For null geodesics there is no such simple criterion, but null geodesics are often easier to compute. For example, those of the Schwarzschild half-plane are incomplete as $s$ decreases; since they are inextendible, $P_I$ is incomplete.

On a manifold with indefinite metric, completeness can be separated by causal character into spacelike completeness (inextendible spacelike geodesics complete), null completeness, and timelike completeness. A complete manifold of course satisfies all three, while $M$ minus a point satisfies none. But the conditions are independent [BE]; for instance, here is a Lorentz surface that is null and spacelike complete but timelike incomplete.

43. Example (Geroch). On $\mathbf R^2$ with coordinates $(t, x)$ it is easy to see that there is a smooth function $f > 0$ satisfying (1) $f = 1$ outside the open strip $|x| < 1$, (2) $f$ is symmetric about the $t$ axis, (3) $\int_0^\infty f(t, 0)\,dt$ is finite. Let $M$ be this plane with line element $f^2(-dt^2 + dx^2)$. Then $M$ is timelike incomplete but null and spacelike complete. In fact, by (2), the $t$-axis can be parametrized as a timelike geodesic, but it is incomplete, since by (3) it has finite length (from origin). Outside the strip the metric is Minkowskian, so inextendible null and spacelike geodesics that avoid the strip are certainly complete. But if such a geodesic $\beta$ meets the strip it must leave because $\beta$ is a curve with $|dt/dx| \le 1$ (since the timecones of $M$ are still Minkowskian), so staying in the strip would produce an endpoint. Having left the strip, $\beta$ evidently cannot return and hence is complete.

Riemannian geometers are accustomed to assuming their manifolds are complete, but relativity theory necessarily involves Lorentz manifolds that are incomplete. For incomplete manifolds the following notion becomes important.

44. Definition. A connected semi-Riemannian manifold $M$ is extendible provided $M$ is (isometric to) an open submanifold of a connected semi-Riemannian manifold $\tilde M \ne M$. In general $M$ is extendible if one of its connected components is extendible. Otherwise $M$ is inextendible (or maximal).

Compact manifolds are inextendible. By Exercise 3.7 a complete manifold is inextendible. The converse is false; for example, an ordinary cone (minus vertex) in $\mathbf R^3$ is inextendible but not complete.

45. Remark. Let $M$ be a connected semi-Riemannian manifold. Suppose that for every inextendible geodesic $\gamma\colon [0, b) \to M$ of given causal character there is a curvature invariant $I$ such that $I(\gamma(s))$ does not approach a finite limit as $s \to b$. (For example the invariant could be $\operatorname{Ric}(\gamma', \gamma')$ or $K(\gamma', V)$ for a parallel vector field $V$ on $\gamma$.) Then $M$ is inextendible, for otherwise some $\gamma$ has an extension $\tilde\gamma$ past $b$ in $\tilde M$, hence $I(\gamma(s))$ has limit $I(\tilde\gamma(b))$ as $s \to b$.

Exercises

1. For a Riemannian manifold $M$, (a) the distance function $d\colon M \times M \to \mathbf R$ is continuous; (b) if $M$ is complete and $C$ is a closed set of $M$, then for any $p \in M$ there is a point of $C$ closest to $p$.

2. In a Lorentz vector space, (a) orthogonal null vectors are collinear, (b) orthogonal nonspacelike vectors are null hence collinear, (c) there exists no two-dimensional subspace on which the scalar product is identically zero.

3. Let $V$ be a Lorentz vector space. (a) Nonspacelike vectors $u$, $w$ are in the same causal cone if and only if either $\langle u, w\rangle < 0$ or $u$ and $w$ are null with $w = au$, $a > 0$. (b) If $u$ is timelike, $\bar C(u) = C(u) \cup (\text{one component of } \Lambda) = (\text{closure of } C(u)) - 0$. (c) Causal cones are convex. (d) The components of the set of all nonspacelike vectors in $V$ are the two causal cones in $V$.

4. If $V$ is a scalar product space of index $\nu$ and dimension $\ge 2$, the set of all timelike vectors is connected (or empty) if $\nu \ne 1$.

5. In a semi-Riemannian surface, if $t, r$ is a coordinate system with $ds^2 = E(r)\,dt^2 + G(r)\,dr^2$, compute: (a) $H^r(\partial_t, \partial_t) = E_r/2G$, $H^r(\partial_t, \partial_r) = 0$, $H^r(\partial_r, \partial_r) = -G_r/2G$. (b) $\operatorname{grad} r = \partial_r/G$. (c) $\Delta r = (1/2G)[(E_r/E) - (G_r/G)]$.

6. If $\alpha$ is a timelike geodesic in $\mathbf R^n_1$ from $p$ to $q$, then arbitrarily close to $\alpha$ there are smooth timelike curves from $p$ to $q$ that are (a) longer than $\alpha$, (b) shorter than $\alpha$.

7. (a) If $X$ and $Y$ are linearly independent vector fields on a neighborhood of a point $p$ in a surface, there is a coordinate system $(u, v)$ at $p$ such that $\partial_u$ and $\partial_v$ are collinear with $X$ and $Y$, respectively. (Hint: Let $(u, y)$ be a coordinate system such that $Y = \partial/\partial y$.) (b) At each point of a Lorentz surface there is a null coordinate system.

8. For a null coordinate system in a Lorentz surface, (a) $D_{\partial_u}\partial_u = (F_u/F)\,\partial_u$, $D_{\partial_v}\partial_v = (F_v/F)\,\partial_v$, $D_{\partial_u}\partial_v = D_{\partial_v}\partial_u = 0$. (b) The Gaussian curvature is $K = (-1/F)(F_u/F)_v = (-1/F)(F_v/F)_u$. (c) For a function $f$,
$$\operatorname{grad} f = (f_v\,\partial_u + f_u\,\partial_v)/F \quad\text{and}\quad \Delta f = 2f_{uv}/F.$$
(d) The geodesic differential equations are
$$u'' + (F_u/F)u'^2 = 0, \qquad v'' + (F_v/F)v'^2 = 0.$$

9. In a Lorentz manifold, if $\alpha$ is a longest timelike curve joining $p$ and $q$, then $\alpha$ is a monotone reparametrization of an (unbroken) timelike geodesic from $p$ to $q$.

10. Directional derivatives. Let $u$ be a unit vector in $T_p(M)$ and let $f \in \mathfrak F(M)$ have $\operatorname{grad}_p f \ne 0$. Prove: (a) If $M$ is Riemannian, then $u(f) = |\operatorname{grad}_p f| \cos\vartheta$ where $\vartheta = \angle(u, \operatorname{grad}_p f)$. (b) If $M$ is Lorentz, and $\operatorname{grad}_p f$ and $u$ are timelike in the same timecone, then $u(f) = -|\operatorname{grad}_p f| \cosh\varphi$, where $\varphi$ is the Lorentz angle between $u$ and $\operatorname{grad}_p f$ (change sign if opposite timecone). (c) If $M$ is Lorentz and one of $u$, $\operatorname{grad}_p f$ is null, the other timelike, then $u(f) \ne 0$.

11. Hyperbolic space. If $p, q \in H^n(r)$ then (a) there is a unique geodesic $\sigma\colon \mathbf R \to H^n(r)$ such that $\sigma(0) = p$ and $\sigma(1) = q$. (b) $d(p, q) = L(\sigma \mid [0, 1]) = r\varphi$, where $\varphi$ is the hyperbolic angle between $p$ and $q$ in $\mathbf R^{n+1}_1$ (compare Example 20). (c) If also $m \in H^n(r)$, hyperbolic angles satisfy $\angle(p, q) \le \angle(p, m) + \angle(m, q)$, with equality if and only if $m \in \sigma[0, 1]$.

12. In a complete Riemannian manifold, (a) if each $\alpha_i\colon [0, 1] \to M$ is minimizing and $\{\alpha_i'(0)\}$ converges to $v \in TM$, then $\gamma_v \mid [0, 1]$ is minimizing. (b) If $M$ is not compact there exists a minimizing ray $\rho\colon [0, \infty) \to M$ starting at $p \in M$ (that is, each subsegment of $\rho$ is minimizing).

13. Call a semi-Riemannian manifold $M$ Misner-complete provided no geodesic races to infinity, that is, provided every geodesic $\gamma\colon [0, b) \to M$, $b < \infty$, lies in a compact set. Prove: (a) complete $\Rightarrow$ Misner-complete $\Rightarrow$ inextendible; (b) neither converse holds; (c) a Misner-complete Riemannian manifold is complete. (For (b), see Example 7.16.)

14. If $\alpha\colon [0, B) \to M$, $B \le \infty$, is an extendible piecewise smooth causal (= nonspacelike) curve in a Lorentz manifold, then $\alpha$ has finite length.

15. Let $M$ be a connected manifold with indefinite metric. Prove: (a) Each point of $M$ has a neighborhood any two points of which can be joined by an at-most-once-broken null geodesic segment. (b) Any two points of $M$ can be joined by a broken null geodesic (hence Riemannian distance on $M$ is identically zero). (c) If $M$ is null complete it is inextendible. (d) Same as (c) with null replaced by timelike [spacelike].

16. Let $g_1$ and $g_2$ be scalar products of index 1 on a vector space $V$. If they have the same nullcone then $g_1 = c\,g_2$.

6

SPECIAL RELATIVITY

By the end of the last century it had become clear that there were serious difficulties in classical Newtonian physics, centering around the properties of light. Progress in resolving these difficulties was made by Lorentz, Poincaré, and others, but the first comprehensive solution was given by Einstein in 1905 with the publication of his special theory of relativity. Its mathematical essence was a novel way to change space and time coordinates; in 1908 Minkowski showed that these occur naturally if space $\mathbf R^3$ and time $\mathbf R^1$ are merged in a single spacetime $\mathbf R^4_1$. "Henceforth space by itself, and time by itself," he wrote, "are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality."

Special relativity has a number of features, bizarre from the Newtonian viewpoint, that will follow rather easily from the geometry of the Minkowski spacetime. To mention a few: There is no way to determine whether two different events occur at the same time or in the same place. There is no notion of (absolute) speed for a material particle; indeed there is no way to determine whether a nonaccelerating particle is moving or not. On the other hand, the speed of light in a vacuum is a constant, independent of the motion of its source. Moving clocks run slow; moving bodies shorten in the direction of the motion.

NEWTONIAN SPACE AND TIME

For the sake of comparison with relativity we review briefly some basic features of Newtonian motion.

1. Definition. Newtonian space is a Euclidean 3-space E, that is, a Riemannian manifold isometric to R3 (with dot product).

Since there are no coordinate axes in nature this definition is better than simply declaring Newtonian space to be $\mathbf R^3$. For simplicity, Newtonian time will be modeled as the real line $\mathbf R^1$, but only time differences $s - t$ are significant, not particular times $s$, $t$. Informally, a particle is an object that is negligibly small compared to the typical distances in the problem at hand. Thus an electron is a particle compared to an atom, and a galaxy is a particle compared to the universe at large. Particles have mass, and are capable of motion, described by a function from time to space.

2. Definition. A Newtonian particle is a curve $\alpha\colon I \to E$ in Newtonian space, with $I$ an interval in Newtonian time.

This definition is consistent with the terminology velocity $\alpha' = d\alpha/dt$, speed $v = |\alpha'|$, acceleration $\alpha'' = d^2\alpha/dt^2$, and arclength $L(\alpha \mid [a, b])$. Fundamentally mass is constant (and positive) in Newtonian physics, but a complex particle, modeling say a rocket expending fuel, can have mass a function of time.

3. Definition. If $\alpha\colon I \to E$ is a Newtonian particle of mass $m$, then
(1) The momentum of $\alpha$ is the vector field $m\alpha'$ on $\alpha$; scalar momentum is the function $m|\alpha'|$ on $I$.
(2) The force on $\alpha$ is the vector field $d(m\alpha')/dt$ on $\alpha$.
(3) The kinetic energy of $\alpha$ is the function $mv^2/2$ on $I$, where $v = |\alpha'|$.

When $m$ is constant then (2), Newton's second law of motion, takes the familiar form: Force equals mass times acceleration. The definition of Newtonian space shows that it has preferred coordinate systems.

4. Definition. A Euclidean coordinate system for $E$ is an isometry $\xi\colon E \to \mathbf R^3$.

Since $\xi$ is in particular a diffeomorphism it is indeed a coordinate system for the manifold $E$. For any coordinate system, $d\xi(\partial/\partial x^i) = \partial/\partial u^i$, hence $\xi$ is Euclidean if and only if $g_{ij} = \delta_{ij}$ for $1 \le i, j \le 3$. Euclidean coordinates are highly efficient in dealing with straight line problems. For example, geodesics then have affine coordinates $x^i(\gamma(t)) = a^i t + b^i$, distantly parallel tangent vectors have the same components, and the distance from $p$ to $q$ is given by the usual Pythagorean formula
$$d(p, q) = \Bigl[\sum (x^i(q) - x^i(p))^2\Bigr]^{1/2}.$$
In effect, once Euclidean coordinates are introduced Newtonian space turns into $\mathbf R^3$.

NEWTONIAN SPACE-TIME

Suppose $\alpha$ is a Newtonian particle moving on a line $L \approx \mathbf R^1$. Then we are all accustomed to drawing its graph $\{(t, \alpha(t)) : t \in I\}$ and interpreting slope as velocity (that is, $\pm$ speed), rate of change of slope as acceleration, and so on. The particle, by definition a curve, has thus become a one-dimensional submanifold of the plane $\mathbf R^2 = (\text{time } \mathbf R^1) \times (\text{space } \mathbf R^1)$. This can be done more generally as follows.

5. Definition. Newtonian space-time is the Riemannian product manifold $\mathbf R^1 \times E$ of Newtonian time and Newtonian space.

The definition is superficial since the product metric lacks physical significance; however, it will serve to introduce some ideas that become significant when the correct (Minkowskian) metric tensor is used. A point $(t, x)$ of $\mathbf R^1 \times E$ is called an event: it is an instantaneous happening at a particular time $t \in \mathbf R^1$ and position $x \in E$. The natural projections of $\mathbf R^1 \times E$ on $\mathbf R^1$ and $E$ are denoted by $T$ and $S$, respectively. Thus $T$ is the universal Newtonian clock that lets us measure the time interval between any two events. As in the case above, a Newtonian particle can now be represented not by its equation of motion (Definition 2) but by its worldline, roughly speaking its life history in $\mathbf R^1 \times E$.

6. Definition. A worldline in Newtonian space-time is a one-dimensional submanifold $W$ such that $T \mid W$ is a diffeomorphism onto an interval $I \subset \mathbf R^1$ (see Figure 1).

The transition between particles and worldlines is easy. Given a particle $\alpha\colon I \to E$, its graph $\{(t, \alpha(t)) : t \in I\}$ is a worldline. Conversely, given a worldline $W$ the corresponding particle is reconstructed as $\alpha = S \circ (T \mid W)^{-1}$.

On a human scale the Newtonian account of motion is quite accurate, but when it is pushed to extremes, difficulties arise:


Figure 1

(1) For 300 years it has been known that in a vacuum light travels at a very high but nonetheless finite speed c, and no material object has been observed to travel faster. However, c plays no special role in Newtonian motion; for example, according to Newtonian theory a suitably designed rocket can attain arbitrarily high speeds (Exercise 4).

(2) Suppose that a rocketship is coming directly toward an observer at speed c/10. Its searchlight beam darts ahead at speed c relative to the ship and thus by Newtonian addition of velocities reaches the observer at speed c + c/10. If the rocketship were traveling directly away from him the light would arrive at speed c - c/10. Astronomical observation of double stars should readily reveal such fast and slow light, but in fact the speeds turn out to be the same.

(3) In Newtonian theory a particle is either at rest (the curve is constant) or not, a straightforward dichotomy. Suppose a rocketship $\rho$ is out in space far from any external influence. If $\rho$ is not accelerating how can it be determined whether $\rho$ is at rest or not? A natural approach would be to measure the position of $\rho$ relative to something that is at rest. But one by one the candidates failed: earth, sun, “fixed” stars, and the conjectured ether. (See, however, background microwave radiation, on page 357.) Nor can the crew of $\rho$ perform a distinguishing experiment strictly within their ship. Such experiments deal with measurements relative to $\rho$ and give no information about absolute motion.

In short, Newtonian physics treats light relatively (as in (2)) when it should be treated absolutely (1), and treats motion absolutely when it should be treated relatively (3).

There is a direct way to eliminate the Newtonian difficulty (1) above. The speed of a material particle $\alpha$ at the time t can be read from its worldline W in terms of the angle between the time axis $R^1$ and the tangent line to W at $(t, \alpha(t))$. The tangent directions of light, always at constant speed c, thus determine a cone in every tangent space to $R^1 \times E$. Requiring material particles to have their tangent lines inside the cone thus keeps their speed below c. Within semi-Riemannian geometry a more natural way to get such cones is to change the sign of the time-coordinate of the metric tensor of $R^1 \times E$, thus producing Minkowski spacetime with its nullcones. We shall see that a reasonable attempt to reconstruct Newtonian mechanics in this context eliminates the difficulties (2) and (3) as well, producing special relativity.

7. Remark. Geometric Units. A highly efficient system of physical units is obtained by taking the speed of light c and the gravitational constant G both to be the (dimensionless) number 1. All units then become powers of just one freely chosen unit, e.g., seconds, meters, . . . . The conversion factors between geometric units and any conventional system of units follow in an obvious way from the values $c_{conv}$ and $G_{conv}$ for that system. For example,

$$c_{cgs} = 3 \times 10^{10}\ \text{cm/sec}; \qquad G_{cgs} = 6.67 \times 10^{-8}\ \text{cm}^3/\text{g sec}^2.$$

In geometric units, distance and time are measured in the same units, as in the well-known case of distance in (light) years: the time required for light to travel that distance. Conversely time can be measured in units of distance: the distance light travels in the given time. It follows that speed v is dimensionless, with $v = v_{conv}/c_{conv}$ for any conventional units. For example, the earth-sun distance $x_{cgs} = 1.5 \times 10^{13}$ cm (about 93 million miles) can also be given as $x_{cgs}/c_{cgs} = 500$ sec. A rocket of speed v = 0.01 has $v_{cgs} = (0.01)\, c_{cgs} = 3 \times 10^{8}$ cm/sec, that is, 3000 km/sec.

In geometric units, mass is also measured in the same units as distance and time. $G_{cgs}/(c_{cgs})^2 = 7.42 \times 10^{-29}$ cm/g, hence, for example, the mass of the sun, $M_{cgs} \approx 2 \times 10^{33}$ g, becomes $14.8 \times 10^{4}$ cm, about 1.5 km. Using seconds as the geometric unit, the sun's mass is

$$\frac{14.8 \times 10^{4}\ \text{cm}}{3 \times 10^{10}\ \text{cm/sec}} = 4.9 \times 10^{-6}\ \text{sec}.$$

Based on the fundamental constants c and G, geometric units have a direct physical significance lacking in more haphazard conventional systems. Certainly speed as $v = v_{conv}/c_{conv}$ is more informative than some number of feet per second or kilometers per year. Using geometric units, it also becomes meaningful, for example, to say that the sun's mass is small compared to its radius ($M = 1.5$ km $\ll r = 7 \times 10^{5}$ km). As we see in Chapter 13 this fact is decisive for the qualitative character of the sun's gravitational field.
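The conversions in Remark 7 can be checked with a few lines of arithmetic. The following is only a sketch using the rounded cgs values quoted in the remark; the function names are illustrative and not part of the text.

```python
# Rounded cgs constants as quoted in Remark 7.
C_CGS = 3.0e10   # speed of light, cm/sec
G_CGS = 6.67e-8  # gravitational constant, cm^3/(g sec^2)


def length_cm_to_sec(x_cm):
    """Express a length given in cm as a time in sec: x / c."""
    return x_cm / C_CGS


def speed_to_geometric(v_cgs):
    """Dimensionless speed v = v_conv / c_conv."""
    return v_cgs / C_CGS


def mass_g_to_cm(m_g):
    """Express a mass given in grams as a length in cm: m * G / c^2."""
    return m_g * G_CGS / C_CGS**2


def mass_g_to_sec(m_g):
    """Express a mass given in grams as a time in sec: m * G / c^3."""
    return mass_g_to_cm(m_g) / C_CGS


earth_sun_sec = length_cm_to_sec(1.5e13)  # earth-sun distance, about 500 sec
rocket_speed = speed_to_geometric(3.0e8)  # 3000 km/sec, about 0.01
sun_mass_cm = mass_g_to_cm(2.0e33)        # about 1.48e5 cm, i.e. 1.5 km
sun_mass_sec = mass_g_to_sec(2.0e33)      # about 4.9e-6 sec
```

With these rounded inputs the results agree with the figures in the remark to the precision stated there.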


MINKOWSKI SPACETIME

A spacetime is a connected time-oriented four-dimensional Lorentz manifold. (Informally, time-oriented is often weakened to time-orientable.)

8. Definition. A Minkowski spacetime M is a spacetime that is isometric to Minkowski 4-space $R^4_1$.

In this chapter, M will always denote a Minkowski spacetime. As with any (time-oriented) spacetime, the time-orientation of M is called the future, and its negative is the past. A tangent vector in a future causal cone is said to be future-pointing (or future-directed). A causal curve is future-pointing if all its velocity vectors are future-pointing. Comparison with Newtonian space-time is helpful in organizing intuitive ideas about Minkowski spacetime; thus points of M are called events and particles will be (parametrized) worldlines. On the other hand, by contrast with both $R^1 \times E$ and $R^4_1$, there exists no canonical time function on M. Though there is no Time there are many times:

9. Definition. A material particle in M is a timelike future-pointing curve $\alpha : I \to M$ such that $|\alpha'(\tau)| = 1$ for all $\tau \in I$. The parameter $\tau$ is called the proper time of the particle.

As in the Newtonian case a particle models the life history of an object which in a given context is negligibly small. We imagine that each particle comes equipped with a clock (mechanical, atomic, biological, . . .) measuring its proper time. As with Newtonian time, only intervals of proper time are significant. For example if $\alpha$ is a material particle from $\alpha(a) = p$ to $\alpha(b) = q$ then the arclength b - a is its elapsed proper time between the events p and q. For any number d, $\tau \to \alpha(\tau + d)$ can be considered as the same particle with its clock reset, since the time interval between events is unchanged.

10. Definition. A lightlike particle is a future-pointing null geodesic $\gamma : I \to M$.

Contemporary physics identifies three types of lightlike particles: photons (light itself); neutrinos (elementary particles perhaps not perfectly lightlike); and (confidently conjectured) gravitons (outside the framework of special relativity). Any particle $\beta : I \to M$ is a regular curve, and its image $\beta(I)$ is a one-dimensional submanifold of M called the worldline of $\beta$. As in the Newtonian case, particles in M have mass, positive for material particles but, as we shall see, necessarily zero for lightlike particles.


That light moves geodesically is a fundamental hypothesis of relativity, and since $\langle \gamma', \gamma' \rangle = 0$ for a lightlike particle, parametrization by proper time is out of the question. Being massless it can’t carry a clock! A particle in M that is a geodesic is said to be freely falling. In general “freely falling” means moving under the influence of gravity alone. But Chapter 12 will show that the fact that Minkowski space is flat limits special relativity to situations where gravitation is negligible; for example: in elementary particle theory, where electromagnetism dominates; in empty space, far from significant sources of gravity; anywhere, provided times and distances involved are small enough to trivialize gravitational effects.

11. Definition. A Lorentz (or inertial) coordinate system in M is a time-orientation-preserving isometry $\xi : M \to R^4_1$.

As in the Newtonian case, it follows immediately that a coordinate system $\xi : M \to R^4$ is Lorentz if and only if $g_{ij} = \delta_{ij}\varepsilon_j$ (where $\varepsilon = (-1, 1, 1, 1)$) and $\partial_0$ is future-pointing.

12. Lemma. Given a frame $e_0, e_1, e_2, e_3$ in $T_p(M)$ such that $e_0$ is future-pointing, there is a unique Lorentz coordinate system $\xi$ such that $\partial_i|_p = e_i$ for $0 \le i \le 3$.

Proof. The normal coordinate system $\xi$ determined by the frame is one such coordinate system, with $\exp_p(\sum x^i(q) e_i) = q$ for all $q \in M$. If $\eta$ is another, then $\xi^{-1} \circ \eta$ is an isometry of M fixing the given frame. Thus by Proposition 3.62, $\xi^{-1} \circ \eta = \mathrm{id}$; that is, $\eta = \xi$. ∎

In prerelativistic physics the proper role of coordinates was not clear. Lacking a firm notion of manifold it was only reasonable to assume that the physics of Newtonian space was linked to the form its laws took in terms of Euclidean coordinates. Not the least of Einstein’s contributions was his insistence (principle of general covariance) that every physical law has an expression independent of the choice of coordinates. Tensor formalism allows the last phrase to be replaced by “not using coordinates.” Indeed, introducing a particular coordinate system in a given context raises the new problem of distinguishing intrinsic properties from coordinate properties. Of course this problem is simplified by using coordinates well adapted to the intrinsic data.

MINKOWSKI GEOMETRY

Since Minkowski spacetime is isometric to $R^4_1$ we know that (1) for any points $p, q \in M$ there is a unique geodesic $\sigma$ with $\sigma(0) = p$ and $\sigma(1) = q$, (2) there is a natural linear isometry $T_p(M) \approx T_q(M)$, called distant parallelism, and (3) each exponential map $\exp_p : T_p(M) \to M$ is an isometry. Thus M viewed from p is geometrically the same as $T_p(M)$ viewed from 0.

13. Remark. M is a normal neighborhood of each of its points. Thus for all $p, q \in M$ the displacement vector $\vec{pq} = \sigma'(0)$ of Chapter 5 is well defined, where $\sigma$ is the geodesic as in (1) above. Note that $\exp_p(\vec{pq}) = q$.

In terms of a Lorentz coordinate system $\xi$, distantly parallel tangent vectors have the same components, and

$$\vec{pq} = \sum_i \left( x^i(q) - x^i(p) \right) \partial_i.$$

The preceding remarks let us move causality from the tangent spaces of M to M itself. For an event $p \in M$ the future timecone of p is $\{ q \in M : \vec{pq} \text{ is timelike and future-pointing} \}$. This is a solid cone whose boundary, but for p, is the future lightcone of p, namely $\{ q \in M : \vec{pq} \text{ is null and future-pointing} \}$. The union of these two sets is the future causal cone of p. Past analogues are defined similarly. The lightcone $\Lambda(p) \subset M$ of p is the union of the past and future lightcones of p (only locally defined in Chapter 5). A point q in neither causal cone of p is spacelike relative to p; that is, $\vec{pq}$ is spacelike.

The appropriateness of the term causal now becomes clear. It is natural to say that an event p can influence an event q if and only if there is a particle from p to q. By the definition of particles (material and lightlike) it follows from Lemma 5.33 that

(1) The only events that can be influenced by an event p are those in its future causal cone.
(2) The only events that can influence an event p are those in its past causal cone.

Thus "most" events, those that are spacelike relative to p, can neither influence nor be influenced by p. See Figure 2, which obeys the pictorial conventions that the future is upward and (by Lemma 16) light rises at 45°.

Figure 2. Minkowski spacetime (regions: future, spacelike, past) and Newtonian space-time (regions: future, past, and the 3-plane of events simultaneous with p).


Relativistic causality contrasts sharply with Newtonian causality, where for an event $p = (x_0, t_0)$ the past and future fill the whole space-time except for the 3-plane $t = t_0$ of simultaneous events. Unrestricted by the speed of light, Newtonian rockets can go from $x_0$ to any distant place x in arbitrarily short time $t - t_0$.

14. Definition. For $p, q \in M$ the number $pq = |\vec{pq}| \ge 0$ is called the separation between p and q.

In terms of a Lorentz coordinate system,

$$pq = \left| -\left( x^0(q) - x^0(p) \right)^2 + \sum_{j=1}^{3} \left( x^j(q) - x^j(p) \right)^2 \right|^{1/2}.$$

Because space and time are merged in M , separation is richer in information than the comparable notion of distance in Euclidean space.

15. Remark. Physical Significance of Separation. Let $p, q \in M$.

(1) If $\vec{pq}$ is timelike future-pointing, then pq is the elapsed proper time $L(\sigma)$ of the unique freely falling material particle from p to q. (A freely falling spaceship records pq as the time from event p to event q.)
(2) $\vec{pq}$ is lightlike $\Leftrightarrow$ pq = 0 $\Leftrightarrow$ there is a lightlike particle through p and q.
(3) If $\vec{pq}$ is spacelike, then pq > 0 is the distance from p to q as measured by any freely falling observer $\omega \perp \vec{pq}$. (We anticipate some terminology from the next section.)
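As a small illustration of Definition 14 and Remark 15, the sketch below computes the separation and the causal character of the displacement between two events given by their Lorentz coordinates $(x^0, x^1, x^2, x^3)$. The function names are illustrative only, and the zero displacement is counted as spacelike, which is an assumed convention stated in the code.

```python
import math


def displacement(p, q):
    """Components of the displacement vector from p to q in Lorentz coordinates."""
    return tuple(qi - pi for pi, qi in zip(p, q))


def scalar(v, w):
    """Lorentz scalar product with signature (-, +, +, +)."""
    return -v[0] * w[0] + sum(a * b for a, b in zip(v[1:], w[1:]))


def separation(p, q):
    """Separation pq = |<pq, pq>|^(1/2), as in Definition 14."""
    d = displacement(p, q)
    return math.sqrt(abs(scalar(d, d)))


def causal_character(p, q):
    """Classify the displacement from p to q.

    Assumption: the zero vector (p == q) is counted as spacelike.
    """
    d = displacement(p, q)
    s = scalar(d, d)
    if s < 0:
        return "timelike"
    if s == 0 and any(c != 0 for c in d):
        return "lightlike"
    return "spacelike"


o = (0, 0, 0, 0)
print(causal_character(o, (5, 3, 0, 0)), separation(o, (5, 3, 0, 0)))
print(causal_character(o, (5, 3, 4, 0)), separation(o, (5, 3, 4, 0)))
print(causal_character(o, (0, 3, 4, 0)), separation(o, (0, 3, 4, 0)))
```

The exact comparison `s == 0` is reliable only for integer or otherwise exactly representable coordinates; with floating-point data a tolerance would be needed.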

The k-planes in M are the images under any isometry $R^4_1 \to M$ of the k-planes in $R^4_1$. By Chapter 4 these are totally geodesic and (if nondegenerate) isometric to either $R^k$ or $R^k_1$. If V is a subspace of $T_p(M)$ then

$$P = \exp_p(V) = \{ q \in M : \vec{pq} \in V \}$$

is the unique k-plane in M with $T_p(P) = V$. Finally we consider some trigonometry in M.

16. Lemma. If $\vec{op}$ is spacelike and $\vec{oq}$ is timelike, then any two of the following imply the third:

(1) $\vec{pq}$ is lightlike, (2) $\vec{op} \perp \vec{oq}$, (3) op = oq.

Proof. Moving the vector $\vec{pq}$ by parallelism to o gives $\vec{pq} = \vec{oq} - \vec{op}$. Taking scalar products then yields $\pm pq^2 = -oq^2 - 2\langle \vec{oq}, \vec{op} \rangle + op^2$, and the result follows. ∎


Figure 3. The curved line represents the set of future points at separation oq from o. (It is a hyperbolic 3-space.)

The notion of hyperbolic angle (5.30) transfers into M in the obvious way: if p and q are in the same timecone of o, then the hyperbolic angle $\varphi = \angle poq$ is the hyperbolic angle between $\vec{op}$ and $\vec{oq}$.

17. Proposition. Let p and q be events in the same timecone of o and such that $\vec{op} \perp \vec{pq}$ (Figure 3). Then

(1) $oq^2 = op^2 - pq^2$.
(2) $op = oq \cosh\varphi$, $pq = oq \sinh\varphi$, where $\varphi = \angle poq$.

Proof. (1) As in the previous lemma, moving the spacelike vector $\vec{pq}$ to o gives $\vec{oq} = \vec{op} + \vec{pq}$. Then scalar products yield $-oq^2 = -op^2 + pq^2$.

(2) Let u and v be the (timelike) unit vectors in the direction of $\vec{op}$ and $\vec{oq}$, respectively. Then $\langle \vec{op}, \vec{oq} \rangle = op \cdot oq\, \langle u, v \rangle = -op \cdot oq \cosh\varphi$. But also $\langle \vec{op}, \vec{oq} \rangle = \langle \vec{op}, \vec{op} + \vec{pq} \rangle = -op^2$. Thus $op = oq \cosh\varphi$, and hence by (1), $pq^2 = oq^2(\cosh^2\varphi - 1) = oq^2 \sinh^2\varphi$. But $\varphi \ge 0$, hence $\sinh\varphi \ge 0$, so $pq = oq \sinh\varphi$. ∎
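The relations of Proposition 17 are easy to check numerically. This is a minimal sketch, assuming the right triangle is specified by its timelike leg op and spacelike leg pq with op > pq; the function name is illustrative.

```python
import math


def right_triangle(op, pq):
    """Given the timelike leg op and the spacelike leg pq (op > pq >= 0),
    return the separation oq and the hyperbolic angle phi at o."""
    oq = math.sqrt(op**2 - pq**2)  # Proposition 17(1)
    phi = math.acosh(op / oq)      # from op = oq cosh(phi)
    return oq, phi


oq, phi = right_triangle(5.0, 3.0)
print(oq)                   # separation oq
print(oq * math.cosh(phi))  # recovers op
print(oq * math.sinh(phi))  # recovers pq
```

For op = 5 and pq = 3 this gives oq = 4, so the timelike leg op exceeds the side oq, in contrast with the Euclidean case.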

Thus the Pythagorean formula is replaced by (1), and orthogonal projections are given by hyperbolic rather than circular sines and cosines. Note that the timelike projection op is always $\ge oq$ and the same can also hold for the spacelike projection pq.

PARTICLES OBSERVED

An observer in M is just a material particle, the terminology suggesting a new role. Let $\xi$ be a Lorentz coordinate system. The $x^0$ axis of $\xi$ is the worldline of a freely falling observer $\omega$; the natural parametrization of $\omega$ has $x^0\omega(t) = t$, so t is proper time for $\omega$. We think of the numbers produced by $\xi$ as measurements taken by the observer $\omega$ (see Exercise 3). By Lemma 12, every freely falling observer has many such associated Lorentz coordinate systems.


Figure 4

The coordinate slice $x^0 = 0$ of $\xi$ is a Euclidean space $E_0$ that we identify with $R^3$ by the natural isometry $q \leftrightarrow (x^1(q), x^2(q), x^3(q))$. In M there is no natural way to define either time or space, but Lorentz coordinates effect an artificial decomposition as follows.

18. Definition. Let $\xi$ be a Lorentz coordinate system in M. For each event $p \in M$ the number $x^0(p)$ is called the $\xi$-time of p and the point $\vec{p} = (x^1(p), x^2(p), x^3(p)) \in R^3$ is called the $\xi$-position of p (see Figure 4).

Now let $\alpha : I \to M$ be a particle, either material or lightlike. For each parameter value $s \in I$ (proper time if $\alpha$ is material), the $\xi$-time of the event $\alpha(s)$ is $t = x^0(\alpha(s))$ and its $\xi$-position is $(x^1(\alpha(s)), x^2(\alpha(s)), x^3(\alpha(s))) \in R^3 \approx E_0$. Since $\alpha$ is causal (nonspacelike) and future-pointing,

$$\frac{d(x^0 \circ \alpha)}{ds} = -\langle \alpha', \partial_0 \rangle > 0.$$

Hence $x^0 \circ \alpha$ is a diffeomorphism of I onto some interval $J \subset R^1$. Let $u : J \to I$ be the inverse function. Then at $\xi$-time $t \in J$, the $\xi$-position of $\alpha$ is

$$\vec\alpha(t) = (x^1\alpha u(t),\ x^2\alpha u(t),\ x^3\alpha u(t)).$$

Thus measurements of the particle $\alpha$ in M produce a curve $\vec\alpha : J \to R^3 \approx E_0$ called the $\xi$-associated Newtonian particle of $\alpha$. This is what the observer $\omega$ observes of $\alpha$. The relationship between $\alpha$ and $\vec\alpha$ is a guide for the development of special relativity. Applying Newtonian concepts to $\vec\alpha$ suggests how to find their relativistic analogues for $\alpha$; reinterpreting the relativistic concepts in terms of $\vec\alpha$ shows how the new theory has modified the old.

By convention, the parameters t and s are related by $t = x^0\alpha(s)$ and inversely by $s = u(t)$, and only by these functions. Thus it is meaningful to write $dt/ds = d(x^0 \circ \alpha)/ds > 0$ and, by the chain rule,

$$\frac{d\vec\alpha}{dt} = \frac{d\vec\alpha/ds}{dt/ds}.$$

19. Lemma. Let $\gamma$ be a lightlike particle in M. For every Lorentz coordinate system $\xi$, the associated Newtonian particle $\vec\gamma$ of $\gamma$ is a straight line in $R^3$ with speed 1.

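Proof. Since $\gamma$ is a geodesic in M, $\xi \circ \gamma$ is a geodesic in $R^4_1$. Thus $\gamma$ has affine coordinates

$$x^i\gamma(s) = a^i s + b^i \qquad (0 \le i \le 3).$$

Hence the projection $s \to (x^1\gamma(s), x^2\gamma(s), x^3\gamma(s))$ into $R^3 \approx E_0$ is a straight line, and its reparametrization $\vec\gamma(t)$ follows this straight line. Since the vector

$$\frac{d\gamma}{ds} = \sum a^i \partial_i = \frac{dt}{ds}\,\partial_0 + \sum_{j=1}^{3} \frac{d(x^j \circ \gamma)}{ds}\,\partial_j$$

is null, and dt/ds is positive, it follows that

$$\frac{dt}{ds} = \left[ \sum_{j=1}^{3} \left( \frac{d(x^j \circ \gamma)}{ds} \right)^2 \right]^{1/2}.$$

Thus the associated particle $\vec\gamma$ has speed

$$\left| \frac{d\vec\gamma}{dt} \right| = \frac{1}{dt/ds} \left[ \sum_{j=1}^{3} \left( \frac{d(x^j \circ \gamma)}{ds} \right)^2 \right]^{1/2} = 1. \qquad ∎$$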

In particular, light has the same constant speed 1 relative to every freely falling observer. Now consider the case of a material particle, so the parameter s becomes proper time $\tau$.

20. Proposition. Let $\xi$ be a Lorentz coordinate system in M. If $\vec\alpha : J \to R^3$ is the associated Newtonian particle of a material particle $\alpha : I \to M$, then

(1) The speed $|d\vec\alpha/dt|$ of $\vec\alpha$ is $v = \tanh\varphi$ where $\varphi$ is the Lorentz angle between $\alpha' = d\alpha/d\tau$ and the coordinate vector $\partial_0$ of $\xi$. In particular, $0 \le v < 1$.
(2) The time $\tau$ of $\alpha$ and its $\xi$-time t are related by

$$\frac{dt}{d\tau} = \frac{d(x^0 \circ \alpha)}{d\tau} = \cosh\varphi = \frac{1}{\sqrt{1 - v^2}}.$$

Here v and $\varphi$ are, of course, functions of the parameter of $\alpha$.

Figure 5. The curved line represents the future-pointing timelike unit vectors. (Compare Figure 3.)

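Proof. Since both $\alpha'$ and $\partial_0$ are timelike and future-pointing there is a unique Lorentz angle $\varphi \ge 0$ between them determined by $-\langle \alpha', \partial_0 \rangle = \cosh\varphi \ge 1$. See Figure 5. Since $\alpha' = \sum (d(x^i \circ \alpha)/d\tau)\, \partial_i$, we have

$$\frac{dt}{d\tau} = \frac{d(x^0 \circ \alpha)}{d\tau} = -\langle \alpha', \partial_0 \rangle = \cosh\varphi,$$

and the coordinate expression for $\langle \alpha', \alpha' \rangle = -1$ becomes

$$-\cosh^2\varphi + \sum_{j=1}^{3} \left( \frac{d(x^j \circ \alpha)}{d\tau} \right)^2 = -1.$$

Since $\varphi \ge 0$ it follows that

$$\left[ \sum_{j=1}^{3} \left( \frac{d(x^j \circ \alpha)}{d\tau} \right)^2 \right]^{1/2} = \sinh\varphi.$$

Thus $\vec\alpha$ has speed

$$v = \left| \frac{d\vec\alpha}{dt} \right| = \frac{\sinh\varphi}{dt/d\tau} = \frac{\sinh\varphi}{\cosh\varphi} = \tanh\varphi.$$

Hyperbolic identities then give $\cosh\varphi = 1/\sqrt{1 - v^2}$. ∎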
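To make Proposition 20 concrete, the following sketch reads off the Lorentz angle, the relative speed, and the time-dilation factor from the components of a unit 4-velocity in a Lorentz coordinate system. The function name and the sample vector are illustrative assumptions, not taken from the text.

```python
import math


def observe(u):
    """u = (u0, u1, u2, u3): Lorentz-coordinate components of a future-pointing
    timelike unit vector alpha'. Returns (phi, v, dt_dtau)."""
    u0 = u[0]
    spatial2 = sum(c * c for c in u[1:])
    # alpha' must satisfy <alpha', alpha'> = -1 and point to the future.
    assert abs(-u0 * u0 + spatial2 + 1.0) < 1e-9 and u0 > 0
    dt_dtau = u0                   # dt/dtau = -<alpha', d_0> = cosh(phi)
    phi = math.acosh(dt_dtau)      # Lorentz angle between alpha' and d_0
    v = math.sqrt(spatial2) / u0   # speed of the associated Newtonian particle
    return phi, v, dt_dtau


phi, v, dt_dtau = observe((1.25, 0.75, 0.0, 0.0))
print(v, math.tanh(phi))                     # both give the relative speed
print(dt_dtau, 1.0 / math.sqrt(1 - v * v))   # both give the time-dilation factor
```

For this sample vector the speed is 0.6 and the observer's clock advances 1.25 units per unit of the particle's proper time.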

The following interpretations put more emphasis on the freely falling observer $\omega$, and less on the coordinate system $\xi$.

(1) Time. For any Lorentz coordinate system $\xi$ associated with $\omega$, the coordinate hyperplane $E_t$ given by $x^0 = t$ is readily seen to be the (unique) 3-plane through $\omega(t)$ perpendicular to $\omega$. Thus $x^0$ is the same for all choices of $\xi$. In effect, $x^0$ imposes $\omega$'s proper time t on all of M, with $E_t$ consisting of those events that $\omega$ considers to be simultaneous with $\omega(t)$.

(2) Space. For the observer himself, the $\xi$-associated Newtonian particle $\vec\omega$ is constant; thus $E_t$ is called the restspace of $\omega$. For any s, t, orthogonal projection $E_s \to E_t$ sends $p \in E_s$ to the unique point $q \in E_t$ such that $\vec{pq}$ is parallel to $\omega$. (In terms of any associated Lorentz coordinate system this map merely changes $x^0$ coordinates.) Consequently the $E_t$'s are canonically isometric Newtonian spaces, and any one will serve equally well as the restspace for $\omega$.

(3) Speed. In the preceding proposition, since $\omega$ is the $x^0$ coordinate curve, $\omega'$ is always distantly parallel to $\partial_0$. Thus $\varphi(\tau)$ is the hyperbolic angle between $\alpha'(\tau)$ and $\omega'$. The function $v = |d\vec\alpha/dt|$ gives the speed of $\alpha$ relative to $\omega$, and the function $\varphi = \tanh^{-1} v$, though it measures speed, is traditionally called the velocity parameter of $\alpha$ relative to $\omega$. (Sometimes $\alpha' = d\alpha/d\tau$ is called the 4-velocity of $\alpha$ to distinguish it from the relative notion $d\vec\alpha/dt$ in 3-space.) In short, the restspace of $\omega$ is the egocentric Newtonian space in which $\omega$ perceives all particles moving as Newtonian particles relative to his rest position.

(4) Time dilation. For a particle with proper time $\tau$ the equations in (2) of the preceding proposition show that the faster the particle is moving relative to the observer (that is, the larger v is) the slower the particle’s clock ($\tau$) runs relative to the observer’s clock (t). Thus the slogan: Moving clocks run slow.

(5) Distance. We can now account for Remark 15(3). That $\vec{pq}$ is orthogonal to a freely falling observer $\omega$ means that $x^0(p) = x^0(q)$. Hence p and q are in the same hyperplane $E_t$, and their separation is ordinary Euclidean distance:

$$pq = \left[ \sum_{j=1}^{3} \left( x^j(p) - x^j(q) \right)^2 \right]^{1/2}.$$

Thus distance between events is meaningful only for observers who consider the events to be simultaneous.

SOME RELATIVISTIC EFFECTS

The preceding section has shown that each freely falling observer $\omega$ in M has his own notion of time and space. Many characteristic features of relativity that seem paradoxical arise in comparing the conclusions of two different observers.


For example, if $\omega_1$ and $\omega_2$ are nonparallel they have different restspaces. Thus if $\vec{pq}$ is orthogonal to $\omega_1$ but not $\omega_2$, the events p and q are simultaneous for $\omega_1$ but not for $\omega_2$. Analogously if $\vec{pq}$ is parallel to $\omega_1$ then it cannot be parallel to $\omega_2$, and (projecting into restspaces) the events p and q occur at the same position for $\omega_1$ but at different positions for $\omega_2$.

Using distant parallelism the notion of relative speed can be generalized as follows. If $\alpha$ and $\beta$ are material particles, then the hyperbolic angle $\varphi = \angle(\alpha'(\sigma), \beta'(\tau))$ is their instantaneous velocity parameter and $v = \tanh\varphi$ is the instantaneous relative speed. For a freely falling particle $\beta$, freely falling observers parallel to $\beta$ regard it as being at rest, while for others it can have arbitrary constant speeds 0 < v < 1. In particular, two freely falling observers regard themselves as moving at constant relative speed $\tanh\angle(\alpha', \beta')$.

The difficulties with Newtonian motion mentioned earlier do not arise here: only relative motion is defined, light moves at speed 1 relative to every freely falling observer, and all material particles have relative speeds v < 1 (in the empty space modeled by M). The essential dichotomy now is not between rest and motion but between free fall and acceleration, for if $\beta''$ is not identically zero no freely falling observer considers $\beta$ to be at rest.

21. Example. Relativistic Addition of Velocities. A rocketship $\rho$ leaves a space station $\sigma$ (both freely falling) at relative speed $v_1 > 0$. A spaceman $\pi$ is ejected from $\rho$ in the plane of $\rho$ and $\sigma$ with constant (signed) speed $v_2$ relative to $\rho$. Here $v_2 > 0$ means forward, away from $\sigma$, and $v_2 < 0$ means backward, toward $\sigma$. What is the speed v of $\pi$ relative to $\sigma$? The Newtonian answer is, of course, $v = v_1 + v_2$, but Einstein's answer is different. Figure 6 illustrates the case $v_2 > 0$. Event p is the departure of $\rho$ from $\sigma$; event q is the departure of $\pi$ from $\rho$. Thus $v_1 = \tanh\varphi_1$ and $v_2 = \tanh\varphi_2$. By distant parallelism, $\rho'$ is between $\sigma'$ and $\pi'$, hence by the additivity of hyperbolic angles (Exercise 5.11) the angle $\varphi = \angle(\sigma', \pi')$ is $\varphi_1 + \varphi_2$. Thus addition of velocity parameters replaces Newtonian addition of speed. Indeed, since

$$v = \tanh\varphi = \tanh(\varphi_1 + \varphi_2) = \frac{\tanh\varphi_1 + \tanh\varphi_2}{1 + \tanh\varphi_1 \tanh\varphi_2},$$

we find

$$v = \frac{v_1 + v_2}{1 + v_1 v_2}.$$

Figure 6
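The addition formula above can be compared numerically with direct addition of velocity parameters. This is a minimal sketch in geometric units (speeds as fractions of the speed of light); the function names are illustrative.

```python
import math


def add_speeds(v1, v2):
    """Relativistic addition of collinear speeds: (v1 + v2) / (1 + v1 v2)."""
    return (v1 + v2) / (1.0 + v1 * v2)


def add_via_parameters(v1, v2):
    """Same result obtained by adding velocity parameters phi = atanh(v)."""
    return math.tanh(math.atanh(v1) + math.atanh(v2))


print(add_speeds(0.5, 0.5))          # less than the Newtonian sum 1.0
print(add_via_parameters(0.5, 0.5))  # agrees with the line above
print(add_speeds(0.9, 0.9))          # still below 1
print(add_speeds(0.5, -0.5))         # backward ejection at equal speed
```

The two functions agree up to rounding, and the combined speed stays below 1 whenever both inputs have absolute value below 1.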

The same formula holds if $v_2 < 0$.

When, as above, several particles lie in a single timelike 2-plane $P \approx R^2_1$, they are said to be moving on a line, though of course the particular Newtonian line and motion depends on the observer. Einstein dramatized the relativity of time in the following scenario (as formulated in [MTW]).

22. Example. The Twin Paradox. On their 21st birthday Peter leaves his twin Paul behind on their freely falling spaceship and departs at constant relative speed v = 24/25 for a free fall of seven years of his proper time. Then he turns and comes back symmetrically in another seven years. Upon his arrival he is thus 35 years old, but Paul is 71. To compute Paul’s age, drop a perpendicular from the turn p to the worldline of the spaceship (Figure 7). By Proposition 17,

$$ox = op \cosh\varphi = \frac{7}{[1 - (24/25)^2]^{1/2}} = 25.$$

Symmetrically, xq = 25. Thus Paul’s age at Peter’s return is 21 + 2(25) = 71 yr.

Figure 7
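The arithmetic of Example 22 can be reproduced directly. The sketch below assumes the idealized trip of the example (two free-fall legs at constant relative speed, instantaneous turnaround); the function name is illustrative.

```python
import math


def home_time(proper_time, v):
    """Time elapsed for the stay-at-home twin during one free-fall leg:
    proper_time * cosh(phi) = proper_time / sqrt(1 - v^2)."""
    return proper_time / math.sqrt(1.0 - v * v)


v = 24 / 25
leg = home_time(7.0, v)      # one leg as measured by Paul, in years
peter_age = 21 + 2 * 7       # Peter ages 7 years on each leg
paul_age = 21 + 2 * leg      # Paul ages `leg` years on each leg
print(leg, peter_age, paul_age)
```

Up to floating-point rounding, this gives 25 years per leg for Paul, so the ages at the reunion are 35 and 71.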


It is useless to object that this phenomenon (no paradox) somehow involves the difficulty of building accurate clocks. “Clock” is merely a name for a time measurer, whether based on the rhythms of atoms, mechanical devices, or biological processes. While they were apart, Paul’s heart has beaten 50/14 times more than Peter’s. A more telling objection is that if Peter is to survive the trip the corners at o, p, and q must be smoothed to keep accelerations low. But the phenomenon remains.

23. Corollary. Let $\alpha : [a, b] \to M$ be a material particle from p to q in M. Its elapsed proper time $\Delta\tau = b - a$ is at most pq, with equality if and only if $\alpha$ is freely falling.

Since M is a normal neighborhood of each of its points this follows immediately from Proposition 5.34. Thus free fall is the unique slowest way to go from one event to another. Roughly speaking any other way is more nearly lightlike hence quicker.

LORENTZ-FITZGERALD CONTRACTION

Let $\alpha$ and $\beta$ be parallel freely falling material particles in M. We can consider $\alpha$ and $\beta$ to be the endpoints of a freely falling rod $[\alpha, \beta]$ in M. Let $\omega$ be a freely falling observer with restspace $E_\omega$ through $\omega(0)$. Since $\alpha$ and $\beta$ are parallel their associated Newtonian particles $\vec\alpha$ and $\vec\beta$ move along parallel straight lines in $E_\omega$, both at constant speed v. Hence $\omega$ sees the rod as a line segment $[\vec\alpha(t), \vec\beta(t)]$ moving in translation, that is, parallel to itself, in $E_\omega$. The length of the rod as measured by $\omega$ is the constant distance $L_\omega$ from $\vec\alpha(t)$ to $\vec\beta(t)$ in $E_\omega$. For a rider on the rod, say $\alpha$, the associated Newtonian rod $[\vec\alpha, \vec\beta]$ is at rest in his restspace where he measures its restlength L. This is the length of any vector from $\alpha$ to $\beta$ orthogonal to both.

24. Proposition. As above let $[\alpha, \beta]$ be a freely falling rod with speed v relative to a freely falling observer $\omega$. In $E_\omega$,

(1) If $[\vec\alpha, \vec\beta]$ moves in a direction orthogonal to its axis, then $L_\omega = L$ (restlength).
(2) If $[\vec\alpha, \vec\beta]$ moves in the direction of its axis, then $L_\omega = L\sqrt{1 - v^2}$.

Proof. Evidently a parallel displacement of $\omega$ has no effect on these measurements so we can assume $\omega(0) = \alpha(0)$.

(1) Let $A = [\vec\alpha(0), \vec\beta(0)]$ be the initial position of the Newtonian rod in $E_\omega$. It suffices to show that A is also in the restspace $E_\alpha$ of $\alpha$ through $\alpha(0)$, for then $\omega$ and $\alpha$ measure the same length for A. Since $A \subset E_\omega$, $A \perp \omega$. By hypothesis $A \perp \vec\alpha$. But $\alpha$ is in the plane determined by $\omega$ and $\vec\alpha$, hence $A \perp \alpha$; that is, $A \subset E_\alpha$.

(2) This is the case of motion along a line: $\alpha$, $\beta$, and $\omega$ are in the same timelike plane, and Figure 8 shows the lines in which the restspaces $E_\alpha$ and $E_\omega$ meet that plane. In the figure, $v = \tanh\varphi$, op is the restlength L of the rod, and oq is $L_\omega$. By Proposition 17, in the right triangle $\triangle opr$, $L = ro \sinh\varphi$, and in the right triangle $\triangle roq$, $L_\omega = ro \tanh\varphi$. Thus $L = L_\omega \cosh\varphi$, so $L_\omega = L\sqrt{1 - v^2}$. ∎

Figure 8

This phenomenon of shortening in the direction of motion is the celebrated Lorentz-Fitzgerald contraction conjectured independently by Fitzgerald and Lorentz some years before Einstein's 1905 organization of special relativity. (For the original papers of Lorentz and Einstein, see [E].)
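Proposition 24 translates into a one-line computation. The sketch below is only an illustration; the function name, its `along_axis` flag, and the sample values (a rod of restlength 200 at speed $\sqrt{3}/2$) are assumptions chosen for the example.

```python
import math


def measured_length(rest_length, v, along_axis=True):
    """Length of a freely falling rod as measured by an observer relative to
    whom it moves at speed v (geometric units, 0 <= v < 1).

    along_axis=True:  motion in the direction of the rod's axis, L * sqrt(1 - v^2).
    along_axis=False: motion orthogonal to the axis, length unchanged.
    """
    if not 0 <= v < 1:
        raise ValueError("speed must satisfy 0 <= v < 1")
    if along_axis:
        return rest_length * math.sqrt(1.0 - v * v)
    return rest_length


v = math.sqrt(3) / 2
print(measured_length(200.0, v))                    # contracted to about half
print(measured_length(200.0, v, along_axis=False))  # unchanged
```

At $v = \sqrt{3}/2$ the contraction factor is 1/2, so a rod of restlength 200 is measured as about 100 when moving along its axis.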

25. Example. An Einstein Train. A train of restlength 200 m travels a straight stretch of track past a station of restlength 100 m. Thus, when standing at the station, the train is twice as long as the station. On a particular trip the train passes the station at constant relative speed $v = \sqrt{3}/2 \approx 0.87$. By Proposition 24 the stationmaster, at rest in the station, measures the length of the train as $200\sqrt{1 - v^2} = 100$ m. Hence, as it passes, the train exactly fits the station. For the conductor, the train is at rest, hence has length 200 m. For him the station is moving at speed v, hence has length $100\sqrt{1 - v^2} = 50$ m. Thus the train is four times as long as the station. These apparently conflicting measurements coexist harmoniously in the spacetime diagram of Figure 9, which shows the station as $[\alpha, \beta]$ and the train as $[\gamma, \delta]$. The significant times and distances are separations between various pairs of events. If we suppose the conductor is at the front ($\delta$) of the train and the stationmaster is at the right end ($\beta$) of the station, then the event q is their passing each other.

Figure 9. $E_\beta$ is the restspace of the stationmaster $\beta$; $E_\delta$ is the restspace of the conductor $\delta$ (hence $\delta \perp E_\delta$).

At the station time of q, the stationmaster sees the train as exactly fitting the station, both of them as the 100-m segment [p, q] in his restspace $E_\beta$. Orthogonal projection onto $E_\beta$ would show the train moving to the right through the station. At the train time of event q, the conductor $\delta$ sees the train at rest in $E_\delta$ as the 200-m segment [x, q] of which only the first 50-m segment [y, q] is inside the station. Orthogonal projection onto $E_\delta$ (along lines parallel to $\delta$) would show the station moving to the left past the resting train. Event r is the passing of the stationmaster $\beta$ and the back ($\gamma$) of the train. At speed $\sqrt{3}/2$ the 100-m train passes him in elapsed time $op = qr = 200/\sqrt{3}$ m. At the same relative speed the conductor is passed by the 50-m station in elapsed train time $oq = pr = 100/\sqrt{3}$ m.

ENERGY-MOMENTUM

26. Definition. If $\alpha : I \to M$ is a material particle of mass m, its energy-momentum vector field is the vector field $P = m\, d\alpha/d\tau$ on $\alpha$.

To understand this let us consider what a freely falling observer $\omega$ makes of it in terms of his notions of time and space. For an associated Lorentz coordinate system $\xi$ the components of P are

$$P^i = m \frac{d(x^i \circ \alpha)}{d\tau} \qquad (0 \le i \le 3),$$

where as usual $\tau$ is the proper time of $\alpha$. Introducing the proper time t of the observer gives

$$P^i = m \frac{d(x^i \circ \alpha)}{dt} \frac{dt}{d\tau}, \qquad \text{where} \quad \frac{dt}{d\tau} = \frac{1}{\sqrt{1 - v^2}}.$$

The space components $P^1, P^2, P^3$ thus describe a vector field

$$\vec{P} = \frac{m}{\sqrt{1 - v^2}} \frac{d\vec\alpha}{dt}$$

on the associated Newtonian particle $\vec\alpha$ in $E_0 \approx R^3$. This is a reasonable extension of the Newtonian momentum vector field of $\vec\alpha$ (Definition 3), since for slow speeds, the time-dilation factor $(1 - v^2)^{-1/2}$ is nearly 1. However the time component of P is something new.

By the binomial theorem (Newton, 1665)

$$P^0 = \frac{m}{\sqrt{1 - v^2}} = m + \tfrac{1}{2} m v^2 + O(v^4).$$

The second term here is the Newtonian kinetic energy of $\vec\alpha$, and Einstein identified $P^0$ as the total energy E of the particle as measured by $\omega$, concluding in particular that mass is merely one form of energy. Specifically m is the rest energy $E_{rest}$, since $P^0 = m$ when v = 0. Converting to conventional units gives the famous equation $E_{rest} = mc^2$. To summarize:

27. Definition. Let $\alpha$ be a material particle of mass m in M. If $\omega$ is a freely falling observer, then the energy of $\alpha$ relative to $\omega$ is the time component

$$E = \frac{m}{\sqrt{1 - v^2}}$$

of $P = m\, d\alpha/d\tau$, and the momentum of $\alpha$ relative to $\omega$ is the Euclidean vector field

$$\vec{P} = \frac{m}{\sqrt{1 - v^2}} \frac{d\vec\alpha}{dt}$$

on $\vec\alpha$. Finally the scalar momentum of $\alpha$ relative to $\omega$ is the function $p = |\vec{P}|$.


Just as space and time are merged in special relativity, so are energy and momentum. The unified concept, energy-momentum, can be split up only artificially, relative to a particular observer. Using distant parallelism, the definition above can be expressed concisely as $P = E\,\partial_0 + \vec P$, where

(1) $\partial_0$ is the timelike coordinate vector field of any Lorentz coordinate system associated with $\omega$; that is, $\partial_0$ is the (parallel) vector field on $\alpha$ distantly parallel to $\omega'$, and
(2) the spacelike vector field $\vec P$, orthogonal to $\partial_0$, has been moved from $\vec\alpha$ in the restspace to $\alpha$ itself by distant parallelism.

28. Corollary. Let $\alpha$ be a material particle of mass $m$ in $M$. Relative to a freely falling observer, the energy $E$, scalar momentum $p$, and speed $v = \tanh\varphi$ of $\alpha$ are related by

(1) $E^2 = m^2 + p^2$.
(2) $E = m/\sqrt{1 - v^2} = m\cosh\varphi = -\langle P, \partial_0\rangle$.
(3) $p = m\sinh\varphi = mv/\sqrt{1 - v^2}$.
(4) $p/E = \tanh\varphi = v$.

Proof. Since $P = m\alpha' = E\,\partial_0 + \vec P$, with $\vec P \perp \partial_0$, the assertions follow from Proposition 17, but with the Lorentz trigonometry in each tangent space rather than in $M$ itself. ∎
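The relations of Corollary 28 can be checked numerically. The following Python sketch assumes geometric units ($c = 1$); the function name `energy_momentum` and the sample values $m = 2$, $v = 0.6$ are illustrative choices, not part of the text.

```python
import math

def energy_momentum(m, v):
    """Return (E, p, phi) for a particle of mass m and relative speed v, |v| < 1.

    Hypothetical helper for illustration; units are geometric (c = 1).
    """
    gamma = 1.0 / math.sqrt(1.0 - v * v)   # time-dilation factor
    phi = math.atanh(v)                    # velocity parameter, v = tanh(phi)
    E = m * gamma                          # energy, equals m*cosh(phi)
    p = m * v * gamma                      # scalar momentum, equals m*sinh(phi)
    return E, p, phi

m, v = 2.0, 0.6
E, p, phi = energy_momentum(m, v)
print("E^2 - (m^2 + p^2) =", E * E - (m * m + p * p))   # relation (1)
print("E - m cosh(phi)   =", E - m * math.cosh(phi))    # relation (2)
print("p - m sinh(phi)   =", p - m * math.sinh(phi))    # relation (3)
print("p/E - v           =", p / E - v)                 # relation (4)
```

Each printed difference should vanish up to rounding error.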

Let $\{\alpha_i\}$ be a sequence of freely falling material particles whose worldlines approach that of a lightlike particle $\gamma$. For any freely falling observer the relative speeds $v_i$ approach 1, hence equations (1) and (4) of the corollary give $E^2 = m^2 + p^2$ and $E = p$, confirming that lightlike particles have mass zero. That light carries energy and has momentum is noncontroversial, but since it has no mass a new definition is needed for its energy-momentum.

29. Definition. The energy-momentum vector field of a lightlike particle is its 4-velocity: $P = \gamma' = d\gamma/ds$.

Then any freely falling observer $\omega$ can split $P$ into energy $E$ relative to $\omega$ and momentum $\vec P$ relative to $\omega$ by writing $P = E\,\partial_0 + \vec P$ with $\vec P \perp \partial_0$, just as for a material particle. But since $P = \gamma'$ is lightlike, the energy and scalar momentum are equal: $E = p = -\langle\gamma', \partial_0\rangle$. (Compare Lemma 16.) Furthermore, since $\gamma$ and the observer $\omega$ are both geodesics, $E = p$ is constant and $\vec P$ is parallel.

In geometrical units, energy $E$ and scalar momentum $p$ have the same common unit as distance, time, and mass. For a material particle the mass unit for $E$ and $p$ is clear from the formulas in Corollary 28. For a lightlike particle $\gamma$ the affine parameter $s$ is a pure number, hence the formula

$$p = E = d(x^0 \circ \gamma)/ds$$

shows the unit is the same as for time. Conversion to conventional units is accomplished as in Remark 7 using the conventional values of $c$ and $G$. For example, in cgs units one gram of energy, the rest energy of a mass of 1 g, is

$$E_{\text{cgs}} = (c_{\text{cgs}})^2\ \text{g} = 9 \times 10^{20}\ \text{g cm}^2/\text{sec}^2\ (= \text{erg}).$$

The wave character of light is measured as follows: A photon of energy $E$, relative to some observer, has frequency $\nu = E/h$, where $h$ is Planck's constant. As usual, frequency times wavelength $\lambda$ is speed $c$. In geometric units, $h$ is about $1.8 \times 10^{-86}\ \text{sec}^2$ and $\lambda\nu = 1$. Since frequency and wavelength derive from energy they too depend on who is doing the observing. Thus visible light for one observer is radio waves for another and x rays for a third.
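The two numerical values quoted above can be reproduced with a short Python sketch. It assumes the standard geometrized convention (energy is converted to mass by dividing by $c^2$, and mass to time by multiplying by $G/c^3$), which may differ in presentation from Remark 7; the rounded cgs constants and all function names are assumptions of this sketch.

```python
# Rounded conventional cgs values; these numbers are assumptions of this sketch.
C_CGS = 2.998e10    # speed of light, cm/sec
G_CGS = 6.674e-8    # gravitational constant, cm^3 / (g sec^2)
H_CGS = 6.626e-27   # Planck's constant, erg sec

def rest_energy_erg(mass_g):
    """Rest energy in erg of a mass given in grams: E = m c^2."""
    return mass_g * C_CGS ** 2

def energy_erg_to_sec(energy_erg):
    """Energy in erg -> mass in g (divide by c^2) -> time in sec (times G/c^3)."""
    return energy_erg * G_CGS / C_CGS ** 5

def planck_geometric_sec2():
    """h has units erg*sec; converting the erg factor to sec leaves sec^2."""
    return energy_erg_to_sec(H_CGS)

print("rest energy of 1 g (erg):", rest_energy_erg(1.0))
print("h in geometric units (sec^2):", planck_geometric_sec2())
```

With these constants the first value is close to $9 \times 10^{20}$ and the second close to $1.8 \times 10^{-86}$, in agreement with the text.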

COLLISIONS

Suppose that a number of particles in $M$ enter a very small spacetime region $\mathcal{O}$ and a (possibly different) number emerge. What takes place in $\mathcal{O}$ may be quite complicated, but if $\mathcal{O}$ is sufficiently small it can be modelled as a single event $o$.

30. Definition. A collision in $M$ is a collection of $r$ incoming material or lightlike particles

$$\alpha_i: [a_i, 0] \to M \qquad (1 \le i \le r)$$

and $s$ outgoing particles

$$\beta_j: [0, b_j] \to M \qquad (1 \le j \le s),$$

such that $\alpha_i(0) = \beta_j(0) = o \in M$ for all $i, j$. Then $o$ is called the collision event.

Such collisions are the stock-in-trade of particle physics, and hundreds of types have been studied experimentally. In all of these the following holds:

31. Law of Conservation of Energy-Momentum. For a collision as above, the total incoming energy-momentum vector equals the total outgoing energy-momentum vector; that is,

$$\sum_{i=1}^{r} P_i = \sum_{j=1}^{s} P_j,$$

where $P_i$ and $P_j$ are the energy-momentum vectors of $\alpha_i$ and $\beta_j$, respectively, at the collision event $o$.

32. Example. A Totally Inelastic Collision. Two blobs of putty $\alpha_1$ and $\alpha_2$ of mass $m_1$ and $m_2$ collide at relative speed $v > 0$ and stick together. Investigate the outgoing blob $\beta$. Conservation of energy-momentum asserts that at the collision event $o$,

$$m\beta' = m_1\alpha_1' + m_2\alpha_2',$$

where $m$ is the mass of $\beta$. Taking scalar products yields

$$-m^2 = -m_1^2 + 2m_1m_2\langle\alpha_1', \alpha_2'\rangle - m_2^2.$$

Here $\langle\alpha_1', \alpha_2'\rangle = -\cosh\varphi$, where $\tanh\varphi = v$. Thus the scalar product is $-(1 - v^2)^{-1/2}$, so that

$$m^2 = m_1^2 + \frac{2m_1m_2}{\sqrt{1 - v^2}} + m_2^2.$$

Thus $m > m_1 + m_2$, so mass is not conserved in the collision. The outgoing blob has drawn additional mass from the energy-momentum of the incoming blobs. Then $\beta$, if freely falling, is completely determined by its 4-velocity

$$\beta' = (m_1/m)\alpha_1' + (m_2/m)\alpha_2'$$

at the collision event $o$.

A timelike future-pointing unit vector $u \in T_p(M)$ is called an instantaneous observer at $p$. In this role, $u$ is regarded as having that information common to all observers passing through $p$ with 4-velocity $u$. Specifically, $u$ knows the tangent space $T_p(M)$, and if a particle through $p$ has energy-momentum vector $P \in T_p(M)$, then $u$ can split $P$ into energy $E$ and momentum $\vec P$, as usual, by $P = Eu + \vec P$, with $\vec P \perp u$.

33. Corollary. Relative to an instantaneous observer $u$ at a collision event $o$, both energy and momentum are preserved in the collision; that is,

$$\sum_{\text{in}} E_i = \sum_{\text{out}} E_j \qquad \text{and} \qquad \sum_{\text{in}} \vec P_i = \sum_{\text{out}} \vec P_j.$$

Proof. Since the incoming and outgoing total energy-momentum vectors are equal at $o$, they have the same numerical components in the $u$ direction and the same vector components orthogonal to $u$. ∎
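The inelastic collision of Example 32 and the splitting of Corollary 33 can be illustrated numerically. The Python sketch below works in a timelike 2-plane with scalar product of signature $(-, +)$; the helper names, the choice of masses $m_1 = 1$, $m_2 = 2$, the relative speed $v = 0.6$, and the instantaneous observer moving at speed $-0.3$ are all illustrative assumptions.

```python
import math

def dot(a, b):
    """Lorentz scalar product on a timelike 2-plane, signature (-, +)."""
    return -a[0] * b[0] + a[1] * b[1]

def four_velocity(v):
    """Future-pointing unit timelike vector with speed v relative to the frame."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return (g, g * v)

def split(P, u):
    """Split P = E*u + Pvec with Pvec orthogonal to the unit timelike vector u."""
    E = -dot(P, u)
    Pvec = (P[0] - E * u[0], P[1] - E * u[1])
    return E, Pvec

m1, m2, v = 1.0, 2.0, 0.6
P1 = tuple(m1 * c for c in four_velocity(0.0))   # first blob at rest in this frame
P2 = tuple(m2 * c for c in four_velocity(v))     # second blob at relative speed v
P_out = (P1[0] + P2[0], P1[1] + P2[1])           # conservation of energy-momentum

m_out = math.sqrt(-dot(P_out, P_out))            # mass of the outgoing blob
m_formula = math.sqrt(m1**2 + 2 * m1 * m2 / math.sqrt(1 - v * v) + m2**2)

u = four_velocity(-0.3)                          # an instantaneous observer
E1, p1 = split(P1, u)
E2, p2 = split(P2, u)
E_out, p_out = split(P_out, u)

print("outgoing mass:", m_out, "formula:", m_formula, "m1 + m2:", m1 + m2)
print("energy in:", E1 + E2, "energy out:", E_out)
```

The outgoing mass exceeds $m_1 + m_2$, while energy and momentum relative to $u$ each balance, because the splitting is linear in $P$.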


Figure 10

34. Example. Photon Rockets. A rocketship $\rho$ of initial mass $m_0$ leaves a freely falling space station $\sigma$ at initial relative speed 0. The exhaust of $\rho$, aimed toward $\sigma$, consists solely of photons. Find the velocity parameter $\varphi(\tau)$ of $\rho$ relative to $\sigma$ in terms of the mass $m(\tau)$ of $\rho(\tau)$ at its proper time $\tau$ since launch.

To set up the necessary calculus we shall construe the events during $(\tau, \tau + \Delta\tau)$ as a collision: the decay of $\rho$-before-$\tau$ into $\rho$-after-$\tau$ and the photons ejected during $\Delta\tau$. Setting $m = m(\tau)$ and $\tilde m = m(\tau + \Delta\tau)$, conservation of energy-momentum implies that

$$m\rho'(\tau) = \tilde m\,\rho'(\tau + \Delta\tau) + \nu,$$

where $\nu$ is the energy-momentum of the photons ejected during $\Delta\tau$ (Figure 10). Since $\nu$ is null, we find

$$-m^2 - \tilde m^2 - 2m\tilde m\,\langle\rho'(\tau), \rho'(\tau + \Delta\tau)\rangle = 0.$$

The scalar product here is $-\cosh\Delta\varphi$, so after writing $\tilde m = m + \Delta m$ some algebra gives

$$-1 + \cosh\Delta\varphi = (\Delta m)^2/2m\tilde m.$$

Now take the limit as $\Delta\tau \to 0$. Since $\cosh\Delta\varphi \sim 1 + (\Delta\varphi)^2/2$ and $\tilde m \to m$, we find $d\varphi = -dm/m$ (minus sign since $\Delta m < 0 < \Delta\varphi$). Integration from 0 to $\tau$ then yields $\varphi(\tau) = \ln(m_0/m(\tau))$.

AN ACCELERATING OBSERVER

Let $\xi$ be a Lorentz coordinate system associated with a freely falling observer $\omega$ in $M$, but consider only the timelike 2-plane $P$: $x^2 = x^3 = 0$ (relativistic motion on a line). Let $\alpha$ be the curve in $P$ with coordinates

$$x^0(\alpha(\tau)) = g^{-1}\sinh(g\tau), \qquad x^1(\alpha(\tau)) = g^{-1}\cosh(g\tau) \qquad (g > 0).$$

Figure 11. $Q$ is the quadrant between the two light rays.

Differentiation shows that $\alpha$ is an observer with proper time $\tau$ but that $\alpha$ is not freely falling; indeed, $\alpha$ has constant scalar acceleration $|\alpha''| = g$. The observer $\omega$ perceives $\alpha$ as a Newtonian particle in his restline $x^0 = 0$, with $\alpha$ moving in from infinity to distance $g^{-1}$, then retreating symmetrically to infinity. However, $\alpha$ cannot pick up $\omega$ at all on his radar screen. In fact, it is clear from Figure 11 that radiation (light) from $\alpha$ will always reach $\omega$ but only at $\omega$-proper times $t > 0$; then the reflected radiation can never return to $\alpha$.

The restline $E_\tau$ of $\alpha$ through $\alpha(\tau)$ is the spacelike line through $\alpha(\tau)$ orthogonal to $\alpha'(\tau)$. This line consists of the events that $\alpha$ considers to be simultaneous with $\alpha(\tau)$. It extends only to the origin $O$, and such restlines fill the open quadrant $Q$ that $\alpha$ can pick up on his radar. In fact, Exercise 7 shows that, using only his clock and radar, $\alpha$ can construct a (non-Lorentz) coordinate system $y^0, y^1$ on $Q$ such that

$$y^0(q) = \tau \quad \text{if } q \in E_\tau; \qquad y^1(q) = Oq - g^{-1} \quad \text{for all } q \in Q.$$

Here $y^0$ gives the $\alpha$-time of each event $q \in Q$, and $y^1$ gives the $\alpha$-position. Hence a particle $\beta$ in $Q$ is perceived by $\alpha$ as the Newtonian particle $\vec\beta$ in $\mathbf{R}^1$ given by parametrizing $y^1 \circ \beta$ by $\alpha$'s proper time $\tau$. In particular $\vec\alpha$ is at rest at 0.

Consider a race between $\alpha$ and the (outgoing) photon $\gamma$ emitted at $\alpha(0)$ with, say, $\gamma'(0) = \partial_0 + \partial_1$. If $\alpha$ were freely falling the photon would move away from him at speed 1, but since $\alpha$ is accelerating in the direction of $\gamma$ perhaps he can do better. In fact, he does worse: a straightforward computation shows that $\gamma$ meets the ray $E_\tau$ at $e^{g\tau}\alpha(\tau)$. But $|\alpha(\tau)| = g^{-1}$; hence relative to $\alpha$, $\vec\gamma(\tau) = g^{-1}(e^{g\tau} - 1)$. Thus the photon's speed relative to the observer $\alpha$ is $e^{g\tau}$, which, far from being 1, approaches infinity with increasing $\tau$.
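The claims about the accelerating observer can be verified numerically. The Python sketch below uses the closed-form derivatives of the curve $\alpha$, checks that $\alpha'$ is a unit timelike vector and that $|\alpha''| = g$, and confirms that the photon meets the restline $E_\tau$ at $\alpha$-position $g^{-1}(e^{g\tau} - 1)$. The value $g = 0.5$ and all function names are illustrative assumptions.

```python
import math

g = 0.5  # sample acceleration; any g > 0 would do

def dot(a, b):
    """Lorentz scalar product in the (x0, x1)-plane, signature (-, +)."""
    return -a[0] * b[0] + a[1] * b[1]

def alpha(tau):
    return (math.sinh(g * tau) / g, math.cosh(g * tau) / g)

def alpha_vel(tau):
    return (math.cosh(g * tau), math.sinh(g * tau))

def alpha_acc(tau):
    return (g * math.sinh(g * tau), g * math.cosh(g * tau))

def photon_meeting_point(tau):
    """Event e^{g tau} * alpha(tau), where the photon x1 - x0 = 1/g meets E_tau."""
    s = math.exp(g * tau)
    a = alpha(tau)
    return (s * a[0], s * a[1])

def y1(q):
    """alpha-position of an event q in the quadrant Q: distance from O minus 1/g."""
    return math.sqrt(dot(q, q)) - 1.0 / g

for tau in (0.0, 1.0, 3.0):
    q = photon_meeting_point(tau)
    print(tau, dot(alpha_vel(tau), alpha_vel(tau)),
          math.sqrt(dot(alpha_acc(tau), alpha_acc(tau))), y1(q))
```

Differentiating $y^1 = g^{-1}(e^{g\tau} - 1)$ with respect to $\tau$ gives the relative speed $e^{g\tau}$ stated above.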

Exercises

1. If $p \in M$ is not on the worldline of a freely falling observer $\omega$, then $\omega$ meets the lightcone $\Lambda(p)$ of $p$ in exactly two events: $\omega(\tau^-)$ and $\omega(\tau^+)$. Furthermore, if $q = \omega((\tau^+ + \tau^-)/2)$, then $\vec{qp} \perp \omega$ and $qp = q\omega(\tau^-) = q\omega(\tau^+)$.

2. Leaving her twin Jean at rest in their freely falling spaceship, Evelyn departs at constant relative speed $v = 0.4$ for one trip around a circle of radius 2 lightyears in the restspace of their ship. Find their age difference upon Evelyn's return.

3. Radar construction of coordinate systems. Let $\omega$ be a freely falling observer in $M$. (a) Light emitted by $\omega$ at $\tau = a$ is reflected off $q$ and received by $\omega$ at $\tau = b$. Show $q$ lies in the restspace of $\omega$ through $m = \omega((a + b)/2)$ and that $mq = (b - a)/2$. (b) Parallel orthonormal vector fields $E_1, E_2, E_3$ on $\omega$, $\perp \omega$, represent gyroscopes. Using the spatial direction of the emitted and received light in (a) deduce the coordinates of $q$ for an appropriate Lorentz coordinate system associated with $\omega$.

4. (a) If the exhaust of the rocketship in Example 34 consists of material particles, show that its velocity parameter is $\varphi = v_{\text{ex}} \ln(m_0/m)$, where $v_{\text{ex}} < 1$ is the (constant) exhaust speed. (b) If the rocket in (a) is Newtonian show that its speed is given by $v = v_{\text{ex}} \ln(m_0/m)$. (Thus $v < 1$ in (a) but $v \to \infty$ in (b).)

5. In Example 34, let $\sigma$ be our solar system and suppose the rocketship $\rho$ is heading directly toward $\alpha$ Centauri. Assume the latter and $\sigma$ are freely falling and parallel; in their common restspace their distance apart is 4.40 (light)years. If $\rho$ burns fuel at the rate $dm/d\tau = -m/100$ find the approximate time of the trip in terms of (a) $\sigma$'s proper time $t$, (b) $\rho$'s proper time $\tau$. (The difference is about 158 days.)

6. (See the section An Accelerating Observer.) (a) Relative to the freely falling observer $\omega$, find the associated Newtonian particles of accelerating observer $\alpha$ and the photon $\gamma$, and verify that the distance between them increases toward $1/g$ as $t \to \infty$. (b) If $\omega$ is constantly emitting light at frequency $\nu_0$, show that $\alpha$ measures the frequency of light he receives at proper time $\tau$ as $\nu(\tau) = \nu_0 e^{-g\tau}$. (c) If $\alpha$ is constantly emitting light at frequency $\nu_0$, show that $\omega$ measures the frequency of the light he receives at proper time $t > 0$ as $\nu(t) = \nu_0/(gt)$.

7. Consider the accelerating observer $\alpha$ of the section An Accelerating Observer. Light emitted by $\alpha$ at his proper time $\tau = a$ reflects off $q \in Q$ and is received by $\alpha$ at $\tau = b$. Prove: (a) $q$ lies on the ray from $O$ through $\alpha((a + b)/2)$; and $Oq = g^{-1}e^{x}$, where $x = \pm g(b - a)/2$, the sign depending on the direction of the emitted light. (b) The functions $y^0, y^1$ in the text form a coordinate system for $Q$, with line element

$$ds^2 = -(1 + gy^1)^2\,(dy^0)^2 + (dy^1)^2.$$

8. Change of coordinates. Let $\xi$ and $\eta$ be Lorentz coordinate systems in $M$ such that $x^2 = y^2$ and $x^3 = y^3$. (a) Show that there are unique numbers $\varphi, a, b$ such that

$$x^0 = y^0\cosh\varphi + y^1\sinh\varphi + a, \qquad x^1 = y^0\sinh\varphi + y^1\cosh\varphi + b.$$

(b) Explain the meanings of $\varphi, a, b$ in terms of the freely falling observers constituting the 0-axes of $\xi$ and $\eta$. (c) Express $\partial/\partial y^0, \partial/\partial y^1$ in terms of $\partial/\partial x^0, \partial/\partial x^1$, and vice versa. This is the Lorentz analogue of a rotation and translation of Euclidean coordinates in $\mathbf{R}^2$. (Compare Example 9.4.)

For coordinate techniques and a variety of interesting problems in special relativity, see [TW].

7

CONSTRUCTIONS

The theory of covering manifolds, whose fundamentals are outlined in Appendix A, is an indispensable tool for dealing with global problems in manifold theory. We establish further properties of covering manifolds that tie them closely to fundamental groups. Furnishing the manifolds with metric tensors brings these techniques to bear in semi-Riemannian geometry, where they provide useful means of constructing new semi-Riemannian manifolds that cover, or are covered by, a given one. The notion of vector bundle is discussed, with the tangent bundle as prototype and the normal bundle of a semi-Riemannian submanifold as a principal application. A generalization of semi-Riemannian product manifold $M \times N$ is obtained by homothetically distorting the geometry of each fiber $p \times N$ to get a new "warped" metric tensor on the manifold $M \times N$. In particular it turns out that fundamental examples of relativistic spacetimes are such warped products. A further generalization, briefly considered, is the notion of semi-Riemannian submersion.

DECK TRANSFORMATIONS

1. Definition. A deck transformation of a covering $k: \tilde M \to M$ is a diffeomorphism $\varphi: \tilde M \to \tilde M$ such that $k \circ \varphi = k$.

Thus $\varphi$ merely rearranges the points in each fiber $k^{-1}(p)$, $p \in M$. The set $\mathcal{D}$ of all deck transformations of a covering forms a group, with composition of functions as the group operation.

2. Example. The covering map $\exp: \mathbf{R}^1 \to S^1$ sending $t$ to $(\cos t, \sin t)$ has periodicity $\exp(t + 2\pi n) = \exp(t)$. Thus any deck transformation must carry each $t \in \mathbf{R}$ to a point $t + 2\pi n(t)$, and by continuity the integer $n(t)$ is independent of $t$. Hence the deck transformation group $\mathcal{D}$ of this covering consists of all translations $\varphi_n(t) = t + 2\pi n$. The function $n \to \varphi_n$ is an isomorphism from the additive group of integers $\mathbf{Z}$ to $\mathcal{D}$, so $\mathcal{D}$ is infinite cyclic.
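The deck transformations of Example 2 are easy to experiment with. The following Python sketch (function names are hypothetical, chosen only for this illustration) checks at sample points that each translation $t \mapsto t + 2\pi n$ preserves fibers of the covering and that composition of translations corresponds to addition of integers.

```python
import math

def exp_cover(t):
    """The covering map exp: R -> S^1, t -> (cos t, sin t)."""
    return (math.cos(t), math.sin(t))

def deck(n):
    """The translation phi_n(t) = t + 2*pi*n."""
    return lambda t: t + 2.0 * math.pi * n

def close(p, q, tol=1e-9):
    """Componentwise comparison of two points up to rounding error."""
    return all(abs(a - b) < tol for a, b in zip(p, q))

samples = [-1.3, 0.0, 0.5, 2.0]
for n in (-2, 0, 3):
    phi = deck(n)
    # exp o phi_n = exp: phi_n only permutes the points of each fiber.
    print(n, all(close(exp_cover(phi(t)), exp_cover(t)) for t in samples))

# phi_2 o phi_3 = phi_5: the map n -> phi_n respects the group operations.
print(all(abs(deck(2)(deck(3)(t)) - deck(5)(t)) < 1e-9 for t in samples))
```

A finite sample cannot prove the isomorphism, of course; the sketch only illustrates the two defining properties.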

In the language of Appendix A, a deck transformation $\varphi$ of $k: \tilde M \to M$ is in particular a lift of $k$ through $k$. Thus by Proposition A.11, a deck transformation of a connected covering is determined by its value at a single point. Explicitly, if $\varphi, \psi \in \mathcal{D}$ and $\varphi(p) = \psi(p)$ for some $p \in \tilde M$, then $\varphi = \psi$. In particular, since the identity map of $\tilde M$ is a deck transformation, the existence of a single fixed point, $\varphi(p) = p$, implies that $\varphi$ is the identity map.

The more symmetrical a covering is, the larger its deck transformation group. A covering $k: \tilde M \to M$ is normal (sometimes called regular) when its deck transformation group is as large as possible, that is, when $k(p) = k(q)$ implies there is a deck transformation $\varphi$ such that $\varphi(p) = q$.

3. Examples. (1) Every double covering is normal, with deck-transformation group consisting of the identity map and the map $\mu$ that reverses the two points in each fiber $k^{-1}(p)$. (2) Every simply connected covering is normal (by Proposition A.11).

4. Proposition. If $k: \tilde M \to M$ is a simply connected covering, then its deck transformation group $\mathcal{D}$ is isomorphic to the fundamental group $\pi_1(M)$ of $M$.

Proof. As in Appendix A, let $\pi_1(M)$ be based at $q \in M$. Fix a point $p \in k^{-1}(q)$. If $\alpha: [0, 1] \to M$ is a loop at $q$, that is, $\alpha(0) = \alpha(1) = q$, let $\tilde\alpha$ be the unique lift of $\alpha$ starting at $p$. Any two loops $\alpha_1, \alpha_2$ in the same homotopy class $a \in \pi_1(M)$ are by definition fixed-endpoint homotopic, hence Corollary A.10 implies $\tilde\alpha_1(1) = \tilde\alpha_2(1)$. Denote by $\varphi_a$ the unique deck transformation such that $\varphi_a(p)$ is the common value of $\tilde\alpha(1)$ for all $\alpha$ in $a$. We assert that the function $a \to \varphi_a$ is an isomorphism from $\pi_1(M)$ to $\mathcal{D}$.

First we show that it is a homomorphism, that is, $\varphi_a \circ \varphi_b = \varphi_{ab}$ for all $a, b \in \pi_1(M)$. For $\alpha$ in $a$ and $\beta$ in $b$, $\tilde\alpha$ runs from $p$ to $\varphi_a(p)$ and $\tilde\beta$ runs from $p$ to $\varphi_b(p)$. Thus $\varphi_a \circ \tilde\beta$ runs from $\varphi_a(p)$ to $\varphi_a(\varphi_b(p))$. In particular the path product $\tilde\gamma = \tilde\alpha * (\varphi_a \circ \tilde\beta)$ is well defined. Now $\tilde\gamma$ starts at $p$, and since $k \circ \varphi_a = k$,

$$k \circ \tilde\gamma = (k \circ \tilde\alpha) * (k \circ \varphi_a \circ \tilde\beta) = \alpha * \beta.$$

By definition $ab$ is the element of $\pi_1(M)$ containing $\alpha * \beta$, hence

$$\varphi_{ab}(p) = \tilde\gamma(1) = \varphi_a(\varphi_b(p)).$$

Since $\tilde M$ is connected, $\varphi_{ab} = \varphi_a\varphi_b$.

The homomorphism above is onto $\mathcal{D}$. In fact, if $\psi \in \mathcal{D}$ and $\tilde\alpha$ is a curve in $\tilde M$ from $p$ to $\psi(p)$, then $\psi = \varphi_a$, where $a$ is the homotopy class of the loop $k \circ \tilde\alpha$.

To show that the homomorphism is one-to-one, suppose $\varphi_a$ is the identity map. By construction this means that any loop $\alpha \in a$ lifts to a loop at $p$. Then Corollary A.10 implies that $\alpha$ is fixed-endpoint homotopic to a constant, that is, $a$ is the identity element of $\pi_1(M)$. ∎

This result is often the easiest way to compute fundamental groups. For instance, applied to Example 2, it shows that $\pi_1(S^1)$ is infinite cyclic.

Various situations offer a choice of two objects at each point of a manifold; in favorable cases a covering manifold results, as follows.

5. Remark. Let $k: C \to M$ be a two-to-one map of a set $C$ onto a manifold $M$. Let $\mathcal{S}$ be a collection of functions $\lambda: \mathcal{U} \to C$ ($\mathcal{U}$ an open set of $M$) such that

(1) $k \circ \lambda = \mathrm{id}$ for all $\lambda \in \mathcal{S}$.
(2) If $\lambda(p) = \mu(p)$ for $\lambda, \mu \in \mathcal{S}$, then $\lambda = \mu$ on some neighborhood of $p$ in $M$.
(3) Every point of $C$ is in the image of some $\lambda \in \mathcal{S}$.

Then there is a unique way to make $C$ a manifold so that $k: C \to M$ is a smooth double covering map and each $\lambda \in \mathcal{S}$ is a smooth local cross section of $k$. The proof is a straightforward exercise in the use of Proposition 1.42.

ORBIT MANIFOLDS

Let $\Gamma$ be a group of diffeomorphisms of a manifold $M$. For $p \in M$ the set $\{\varphi(p): \varphi \in \Gamma\}$ is called the orbit of $p$ under $\Gamma$. The collection of all such orbits is denoted by $M/\Gamma$. The natural map $k: M \to M/\Gamma$ sends each point to its orbit under $\Gamma$. We want conditions on $\Gamma$ so that $M/\Gamma$ becomes a manifold and $k$ a covering map. For example, on the sphere $S^n$ let $\Gamma$ be the group $\{\pm 1\}$ consisting of the identity map and the antipodal map $p \to -p$. Then the orbits are unordered pairs $\{p, -p\}$ and $S^n/{\pm 1}$ will turn out to be a projective space.

6. Definition. A group $\Gamma$ of diffeomorphisms of a manifold $M$ is properly discontinuous (and acts freely) provided

(PD1) Each point $p \in M$ has a neighborhood $\mathcal{U}$ such that if $\varphi(\mathcal{U})$ meets $\mathcal{U}$ for $\varphi \in \Gamma$ then $\varphi = \mathrm{id}$.
(PD2) Points $p, q \in M$ not in the same orbit have neighborhoods $\mathcal{U}$ and $\mathcal{V}$ such that for every $\varphi \in \Gamma$, $\varphi(\mathcal{U})$ and $\mathcal{V}$ are disjoint.

It is easy to verify that the deck transformation group of any covering is properly discontinuous; conversely,

7. Proposition. Let $\Gamma$ be a properly discontinuous group of diffeomorphisms of a manifold $M$. There is a unique way to make $M/\Gamma$ a manifold so that the natural map $k: M \to M/\Gamma$ is a covering map. If $M$ is connected the deck transformation group is $\Gamma$, hence the covering is normal.

Proof. Call a neighborhood $\mathcal{U}$ of $p$ in $M$ special if $\mathcal{U}$ has the property in (PD1) and is the domain of a coordinate system $\xi$. Now $k|\mathcal{U}$ is one-to-one hence so is the map $\xi \circ (k|\mathcal{U})^{-1}: k(\mathcal{U}) \to \mathbf{R}^n$. We apply Proposition 1.42 to show that $M/\Gamma$ is a manifold with these maps as coordinate systems. Evidently their domains cover $M/\Gamma$. For another such map $\eta \circ (k|\mathcal{V})^{-1}$, the sets $k(\mathcal{U})$ and $k(\mathcal{V})$ meet if and only if there is a $\varphi \in \Gamma$ such that $\varphi(\mathcal{U})$ meets $\mathcal{V}$. But then

$$\eta \circ (k|\mathcal{V})^{-1} \circ [\xi \circ (k|\mathcal{U})^{-1}]^{-1} = \eta \circ \varphi \circ \xi^{-1},$$

which is smooth on an open set of $\mathbf{R}^n$. The Hausdorff property for $M/\Gamma$ follows from (PD2), and $M$ second countable implies $M/\Gamma$ second countable. Thus $M/\Gamma$ is a manifold.

If $\mathcal{U}$ is special and connected, then $k(\mathcal{U})$ is evenly covered by $k$, since the components of $k^{-1}(k(\mathcal{U}))$ are the sets $\varphi(\mathcal{U})$ for all $\varphi \in \Gamma$. By construction, $k$ carries $\mathcal{U}$ diffeomorphically onto $k(\mathcal{U})$, and similarly for any $\varphi(\mathcal{U})$ since $k \circ \varphi = k$. Thus $k$ is a covering map, and each $\varphi$ is a deck transformation. If $M$ is connected, then Proposition A.11 applies to show that every deck transformation is in $\Gamma$. The construction above is the only way to make $k$ a local diffeomorphism, hence uniqueness holds. ∎

This result is particularly informative when $M$ is simply connected (hence connected), for then Proposition 4 shows $\pi_1(M/\Gamma) \approx \Gamma$. In the example above, the orbit manifold $P^n = S^n/{\pm 1}$ is real projective $n$-space. For $n \ge 2$, $S^n$ is simply connected, hence $\pi_1(P^n)$ is the two-element group $\mathbf{Z}_2$. ($P^1$ is diffeomorphic to $S^1$, hence has infinite cyclic fundamental group.)

ORIENTABILITY

Equivalent to the previous definition of orientability (1.41) is an alternative with greater range of development. Consider first some linear algebra. Two bases $e_1, \ldots, e_n$ and $e_1', \ldots, e_n'$ for a vector space $V$ have the same orientation provided

$$\det A > 0, \qquad \text{where } e_i' = \sum_j A_{ij}e_j \quad (1 \le i \le n).$$

They have opposite orientation if $\det A < 0$. It is easy to check that having the same orientation is an equivalence relation on the set of all bases for $V$, and that there are just two equivalence classes, called orientations of $V$. The orientation containing $e_1, \ldots, e_n$ will be denoted by $[e_1, \ldots, e_n]$. If $\xi$ is a coordinate system on $\mathcal{U} \subset M$, let

$$\lambda_\xi(p) = [\partial_1|_p, \ldots, \partial_n|_p].$$

An orientation $\lambda$ of a manifold $M$ assigns to each point $p \in M$ an orientation $\lambda(p)$ of $T_p(M)$ and $\lambda$ is smooth in the sense that for each $p \in M$ there is a coordinate system $\xi$ at $p$ such that $\lambda = \lambda_\xi$ on some neighborhood of $p$. A manifold is orientable if there exists an orientation of $M$; to orient $M$ is to choose a particular orientation. For example, $\mathbf{R}^n$ is orientable in view of its usual orientation $\lambda_\xi$, where $\xi$ is the natural coordinate system. Any object that determines a unique orientation of $M$ will also be said to orient $M$. An object in agreement with a selected orientation $\lambda$ is said to be positively oriented. For example, a basis $v_1, \ldots, v_n$ for $T_p(M)$ is positively oriented if $[v_1, \ldots, v_n] = \lambda(p)$, and a coordinate system $\xi$ on $\mathcal{U} \subset M$ is positively oriented if $\lambda_\xi = \lambda$ on $\mathcal{U}$.
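The linear-algebra criterion above (same orientation exactly when $\det A > 0$) translates directly into a computation. The Python sketch below is a minimal illustration using only the standard library; the function names `det` and `same_orientation` are choices made here, and floating-point elimination is assumed adequate for small, well-conditioned examples.

```python
def det(A):
    """Determinant by Gaussian elimination with partial pivoting (floats)."""
    n = len(A)
    M = [list(map(float, row)) for row in A]
    d = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            d = -d                      # a row swap flips the sign
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d

def same_orientation(A):
    """A[i][j] are the coefficients in e'_i = sum_j A[i][j] e_j.

    Returns True if the two bases have the same orientation (det A > 0).
    """
    d = det(A)
    if d == 0.0:
        raise ValueError("the vectors e'_i do not form a basis")
    return d > 0

print(same_orientation([[0, 1], [1, 0]]))                    # swapping two basis vectors
print(same_orientation([[0, 1, 0], [0, 0, 1], [1, 0, 0]]))   # cyclic permutation of three
```

Swapping two basis vectors reverses orientation, while a cyclic permutation of three vectors (an even permutation) preserves it.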

<

8. Lemma. A semi-Riemannian hypersurface $M$ of an orientable manifold $\bar M$ is orientable if and only if there exists a smooth unit normal vector field on $M$. (For a generalization see Exercise 9(b).)

In particular, any hypersurface $f^{-1}(c)$ as in Proposition 4.17 is orientable if the ambient manifold $\bar M$ is orientable. Thus the hyperquadrics $S^n_\nu$ and $H^n_\nu$ are orientable.

If $\xi$ and $\eta$ are overlapping coordinate systems in $M$, define

$$J(\xi, \eta) = \det(\partial y^j/\partial x^i) \quad \text{on } \mathcal{U}_\xi \cap \mathcal{U}_\eta.$$

Then

$$\lambda_\xi(p) = \lambda_\eta(p) \iff J(\xi, \eta)(p) > 0.$$

In view of the smoothness condition for orientations, if two orientations of $M$ agree at a point, then they agree on some neighborhood of that point.

If $\lambda$ is an orientation of $M$, then so is $-\lambda$, which assigns the opposite orientation at each point. If $M$ is connected, then $\pm\lambda$ are its only orientations, since the sets where an orientation $\mu$ agrees with $\lambda$ and with $-\lambda$ constitute a disjoint open covering of $M$, so one of them must be all of $M$. Thus in the connected case, orienting a single tangent space serves to orient all of $M$.

If $\varphi: M \to N$ is a local diffeomorphism, then it is easy to verify that for each point $p \in M$

$$\varphi[e_1, \ldots, e_n] = [d\varphi(e_1), \ldots, d\varphi(e_n)]$$

is a well-defined one-to-one function from the orientations of $T_p(M)$ to the orientations of $T_{\varphi p}(N)$. If $M$ and $N$ are oriented by $\lambda_M$ and $\lambda_N$, we say that $\varphi$ is

orientation-preserving if $\varphi(\lambda_M(p)) = \lambda_N(\varphi p)$ for all $p \in M$,
orientation-reversing if $\varphi(\lambda_M(p)) = -\lambda_N(\varphi p)$ for all $p \in M$.

If $M$ is connected, then these are the only possibilities; otherwise $\varphi$ might preserve orientation on one component, but reverse it on another.

If $\varphi: M \to N$ is a local diffeomorphism and $N$ is orientable, then $M$ is orientable. In fact, if $\lambda$ is an orientation of $N$, there is a unique orientation $\varphi^*(\lambda)$ of $M$ making $\varphi$ orientation-preserving. (Define $\varphi^*(\lambda)$ at $p \in M$ to be the orientation of $T_p(M)$ carried to $\lambda(\varphi p)$ by $\varphi$.)

Orientability has a useful expression in terms of covering manifolds. For a manifold $M$, let $\tilde M$ be the set of all orientations of tangent spaces of $M$. Let $k: \tilde M \to M$ send the two orientations of each $T_p(M)$ to $p$. Then Remark 5 gives

9. Corollary. For a manifold $M$ there is a unique way to make $\tilde M$, as above, a manifold so that (1) $k: \tilde M \to M$ is a double covering map, and (2) for each coordinate system $\xi$ on $\mathcal{U} \subset M$, the map $\lambda_\xi: \mathcal{U} \to \tilde M$ is a smooth local cross section.

This covering $k: \tilde M \to M$ is called the orientation covering of $M$. Evidently an orientation of $M$ as defined earlier is exactly a smooth global section $\lambda: M \to \tilde M$.

10. Lemma. For any manifold $M$, (1) its orientation covering manifold $\tilde M$ is orientable, and (2) $M$ is orientable if and only if $k: \tilde M \to M$ is trivial.

Proof. (1) For each coordinate system $\xi$ in $M$ consider the pulled-back orientation $k^*(\lambda_\xi)$ on $\lambda_\xi(\mathcal{U})$. For another coordinate system $\eta$ the intersection $\mathcal{W} = \lambda_\xi(\mathcal{U}) \cap \lambda_\eta(\mathcal{V})$ is nonempty if and only if $\lambda_\xi = \lambda_\eta$ on $k(\mathcal{W})$. Thus the local orientations $k^*(\lambda_\xi)$ for all $\xi$ combine to give a global orientation of $\tilde M$. The proof of (2) is analogous. ∎

Thus for a nonorientable manifold, orientability can be recovered by passing to the orientation covering manifold. Also, in view of Proposition A. 14, a simply connected manifold is orientable.

SEMI-RIEMANNIAN COVERINGS

The notion of covering map is made geometric in the following obvious way.

11. Definition. A semi-Riemannian covering map $k: \tilde M \to M$ is a covering map of semi-Riemannian manifolds that is a local isometry.

For example, the exponential map in Example 2 is a semi-Riemannian covering of the unit circle $S^1$ by $\mathbf{R}^1$. A product of semi-Riemannian coverings is again one; thus, $\exp \times \exp: \mathbf{R}^2 \to S^1 \times S^1$ is a semi-Riemannian covering of a flat torus by the plane.

If $k: M \to N$ is a covering map of a smooth manifold $M$ onto a semi-Riemannian manifold $N$ with metric tensor $g$, then assigning $M$ the pulled-back metric tensor $k^*(g)$ makes $k$ a semi-Riemannian covering map.

If $k: \tilde M \to M$ is a semi-Riemannian covering and $\varphi: P \to M$ is a local isometry, it is easy to see that any lift $\tilde\varphi: P \to \tilde M$ of $\varphi$ through $k$ is again a local isometry. In particular, any deck transformation $\psi$ of the covering is an isometry (since $\psi$ is a diffeomorphic lift of the identity map). Thus the deck transformation group is a properly discontinuous group of isometries of $\tilde M$. Conversely,

12. Corollary. If $\Gamma$ is a properly discontinuous group of isometries of a semi-Riemannian manifold $M$, then there is a unique way to make $M/\Gamma$ a semi-Riemannian manifold such that $k: M \to M/\Gamma$ is a semi-Riemannian covering. If $M$ is connected, the deck transformation group is $\Gamma$.

Proof. By Proposition 7 it suffices to show there is a unique metric tensor on $M/\Gamma$ such that $k$ is a local isometry. If $w \in T_q(M/\Gamma)$, then for each $p \in k^{-1}(q)$ there is a unique $w_p \in T_p(M)$ such that $dk(w_p) = w$. If $p, p' \in k^{-1}(q)$, an orbit under $\Gamma$, there is an isometry $\varphi \in \Gamma$ such that $\varphi(p) = p'$, and it follows that $d\varphi(w_p) = w_{p'}$. Thus for $z, w \in T_q(M/\Gamma)$ the scalar product $\langle z_p, w_p\rangle$ is independent of $p$ in $k^{-1}(q)$, so we have no choice but to define $g(z, w)$ to be this number. Then $g$ is a smooth metric tensor on $M/\Gamma$ making $k$ a local isometry, since, if $\lambda: \mathcal{U} \to M$ is a local cross section, $g|\mathcal{U}$ is the pullback $\lambda^*(g_M)$ of the metric tensor of $M$. ∎

On any hyperquadric the identity map and the antipodal map $p \to -p$ constitute a properly discontinuous group of isometries. Thus we obtain semi-Riemannian orbit manifolds $P^n_\nu(r) = S^n_\nu(r)/{\pm 1}$ and $H^n_\nu(r)/{\pm 1}$. The Riemannian manifold $P^n(r) = S^n(r)/{\pm 1}$ is projective $n$-space of radius $r$, and $H^n_0(r)/{\pm 1}$ is just the hyperbolic space $H^n(r)$.

Using covering techniques we can illustrate some distinctive features of indefinite metrics. A curve segment $\alpha: [a, b] \to M$ is smoothly closed provided $\alpha(a) = \alpha(b)$ and $\alpha'(b) = c\alpha'(a) \ne 0$ for some $c > 0$. A smoothly closed geodesic $\sigma$ (traditionally called merely a closed geodesic) thus has a geodesic extension $\tilde\sigma$ that, by the uniqueness of geodesics, perpetually traverses the route of $\sigma$. However it need not be periodic.

13. Proposition. Let $\tilde\sigma$ be the maximal geodesic extension of a smoothly closed geodesic $\sigma: [a, b] \to M$, so $\sigma'(b) = c\sigma'(a) \ne 0$ with $c > 0$. Then

(1) $\tilde\sigma$ is complete (domain $\tilde\sigma = \mathbf{R}$) $\iff$ $\tilde\sigma$ is periodic $\iff$ $c = 1$.
(2) If $\sigma$ is nonnull, then $c = 1$.

Proof. (1a) If $c \ne 1$, then $\tilde\sigma$ is not complete hence not periodic.

Suppose for definiteness that $c > 1$. In order to extend $\sigma$ past $b$ we must use the formula

$$\tilde\sigma(t) = \sigma(ct - cb + a),$$

since only then is the new segment a geodesic satisfying the extension condition

$$\tilde\sigma'(b+) = c\sigma'(a) = \sigma'(b) = \tilde\sigma'(b-).$$

But $ct - cb + a \in [a, b]$ if and only if $t \in [b, b + (b - a)/c]$. Thus this first extension reaches only to $b + (b - a)/c$. Similarly, a second such extension reaches $b + (b - a)/c + (b - a)/c^2$, and so on. There are no problems in extending $\sigma$ to infinity in the negative direction when $c > 1$. Thus the largest domain of $\tilde\sigma$ in this case is $\{t: t < b^*\}$, where $b^* = (bc - a)/(c - 1) > b$. The case $c < 1$ is analogous.

(1b) If $c = 1$, then $\tilde\sigma$ is periodic, hence complete.

In fact, the formula $\tilde\sigma(t + n(b - a)) = \sigma(t)$ for all $t \in [a, b]$ and all integers $n$ gives $\tilde\sigma: \mathbf{R} \to M$ as the unique geodesic extension of $\sigma$, and $\tilde\sigma$ is periodic, of period $b - a$ if $\sigma|[a, b)$ is one-to-one.

(2) Since $c > 0$, the closure condition implies $|\sigma'(b)| = c|\sigma'(a)|$. But $\sigma'$ is parallel, nonnull, and nonzero, hence $|\sigma'(b)| = |\sigma'(a)| \ne 0$. Thus $c = 1$. ∎
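The bound $b^*$ in part (1a) is the limit of a geometric series, which the following Python sketch makes concrete. The function names and the sample values $a = 0$, $b = 1$, $c = 2$ are assumptions made for illustration.

```python
def extension_endpoints(a, b, c, steps):
    """Right endpoints reached by successive extensions past b when c > 1.

    The k-th extension adds a parameter interval of length (b - a)/c**k.
    """
    ends = []
    end, length = b, b - a
    for _ in range(steps):
        length /= c
        end += length
        ends.append(end)
    return ends

def b_star(a, b, c):
    """Supremum of the domain of the maximal extension: (b*c - a)/(c - 1)."""
    return (b * c - a) / (c - 1)

a, b, c = 0.0, 1.0, 2.0
ends = extension_endpoints(a, b, c, 40)
print("first endpoints:", ends[:4])
print("after 40 extensions:", ends[-1], " b* =", b_star(a, b, c))
```

The endpoints increase toward $b^*$ without reaching it, so the maximal extension is defined only for $t < b^*$.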


For nonnull geodesics the terms closed and periodic are often used interchangeably.

14. Example. We describe a closed (null) geodesic that is not periodic. Let $M$ be the right half-plane $\{(u, v) \in \mathbf{R}^2: u > 0\}$ with $ds^2 = 2\,du\,dv$. The map $\varphi(u, v) = (u/2, 2v)$ is an isometry of $M$, and the group $\Gamma = \{\varphi^n: n \in \mathbf{Z}\}$ it generates is properly discontinuous. The orbit manifold $M/\Gamma$ is a flat Lorentz surface diffeomorphic to a cylinder $S^1 \times \mathbf{R}^1$. The parametrization $\alpha(t) = (t, 0)$ of the positive $u$ axis is a null geodesic in $M$. Since $k: M \to M/\Gamma$ is a local isometry, $\beta = k \circ \alpha$ is a null geodesic in $M/\Gamma$. For all $t > 0$, $\varphi(t, 0) = (t/2, 0)$; hence,

$$\beta(t/2) = k(\alpha(t/2)) = k\varphi\alpha(t) = k\alpha(t) = \beta(t).$$

Thus $\beta$ is a closed geodesic, repeatedly traversing a circle $C$ in $M/\Gamma$. But obviously $\beta$ is not periodic; indeed it makes one circuit of $C$ on each parameter interval $[2^{-n}, 2^{1-n}]$.

The domain $\mathbf{R}^+$ of the null geodesic above is maximal; thus the surface $M/\Gamma$ is not complete. In fact by the preceding proposition:

15. Corollary. If $M$ is either Riemannian or complete, then smoothly closed geodesics in $M$ have periodic extensions.

By the Hopf-Rinow theorem (5.21) a compact Riemannian manifold is complete. However for indefinite metrics compactness does not imply completeness:

16. Example. The Clifton-Pohl Torus [M]. Let $M$ be $\mathbf{R}^2 - 0$ with $ds^2 = 2\,du\,dv/(u^2 + v^2)$. Evidently scalar multiplication by any $c \ne 0$ is an isometry of $M$. Take say $\mu(u, v) = (2u, 2v)$. The group $\Gamma = \{\mu^n\}$ generated by $\mu$ is properly discontinuous; thus, $T = M/\Gamma$ is a Lorentz surface. Topologically $T$ is the closed annulus $1 \le r \le 2$ with boundary points identified under $\mu$. Thus $T$ is a torus; in particular, it is compact. But $T$ is not complete; for this it suffices to show that $M$ is not complete. By Exercise 5.8 the geodesic differential equations are

$$u'' = \frac{2u}{u^2 + v^2}\,(u')^2, \qquad v'' = \frac{2v}{u^2 + v^2}\,(v')^2.$$

Thus the curve $\alpha(t) = (1/(1 - t), 0)$ is a geodesic, defined for $-\infty < t < 1$. Since $\alpha$ fills the positive $u$ axis in $M$ it is inextendible. Thus $M$ is not complete. (For further properties of $T$, see Exercise 9.12.)
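That the curve $\alpha(t) = (1/(1 - t), 0)$ satisfies the geodesic equations of Example 16 can be checked directly. The Python sketch below substitutes the closed-form first and second derivatives into both equations and reports the residuals; the function names and sample parameter values are illustrative assumptions.

```python
def alpha(t):
    """The curve alpha(t) = (1/(1 - t), 0), defined for t < 1."""
    return (1.0 / (1.0 - t), 0.0)

def alpha_d1(t):
    """First derivative of alpha."""
    return (1.0 / (1.0 - t) ** 2, 0.0)

def alpha_d2(t):
    """Second derivative of alpha."""
    return (2.0 / (1.0 - t) ** 3, 0.0)

def geodesic_residual(t):
    """Left side minus right side of u'' = 2u/(u^2+v^2) (u')^2 and its v analogue."""
    u, v = alpha(t)
    du, dv = alpha_d1(t)
    ddu, ddv = alpha_d2(t)
    r2 = u * u + v * v
    return (ddu - 2.0 * u / r2 * du * du, ddv - 2.0 * v / r2 * dv * dv)

for t in (-3.0, 0.0, 0.5, 0.9):
    print(t, alpha(t), geodesic_residual(t))
```

The residuals vanish up to rounding, while the $u$ coordinate grows without bound as $t \to 1$, reflecting the incompleteness of $M$.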

LORENTZ TIME-ORIENTABILITY

The notion of time-orientability of a Lorentz manifold $M$ can be dealt with in much the same way as orientability for a smooth manifold. Let $M^T$ be the set of all timecones in tangent spaces of $M$, and let $k: M^T \to M$ be the natural two-to-one map. As in the discussion preceding Lemma 5.33, if $V$ is a timelike vector field on $\mathcal{U} \subset M$, then for each $p \in \mathcal{U}$ let $\tau_V(p)$ be the timecone containing $V_p$. If $\tau_W$ is another such local time-orientation, then by Lemma 5.30 $\tau_V(p) = \tau_W(p)$ if and only if $\langle V_p, W_p\rangle < 0$. Hence the set of all local time-orientations of $M$ satisfies the conditions in Remark 5, and $k: M^T \to M$ becomes a smooth double covering map. The pulled-back metric tensor on $M^T$ makes this a Lorentz covering, called the time-orientation covering of $M$. Evidently a time-orientation of $M$ as defined in Chapter 5 is just a global section of this covering. The following analogue of Lemma 10 is proved analogously.

17. Lemma. If $M$ is a Lorentz manifold, then (1) $M^T$ is time-orientable, and (2) $M$ is time-orientable if and only if $k: M^T \to M$ is trivial.

Thus a simply connected Lorentz manifold is time-orientable.

Let $\varphi: M \to N$ be a conformal mapping of Lorentz manifolds, so $\langle d\varphi\,v, d\varphi\,w\rangle = h(p)\langle v, w\rangle$ for all $v, w \in T_p(M)$, $p \in M$. Suppose that $h > 0$. Then $d\varphi$ preserves the causal character of tangent vectors and hence carries the two timecones of each $T_p(M)$ to the timecones of $T_{\varphi p}(N)$. If $\tau$ is a time-orientation of $N$ then, just as for ordinary orientation, there is a natural pulled-back time-orientation $\varphi^*(\tau)$ of $M$. Thus $N$ time-orientable implies $M$ time-orientable. If $M$ and $N$ are time-oriented by $\tau_M$ and $\tau_N$, then $\varphi$ preserves time-orientation provided $\varphi^*(\tau_N) = \tau_M$, and reverses time-orientation provided $\varphi^*(\tau_N) = -\tau_M$. As before, if $M$ is connected, these are the only possibilities.

VOLUME ELEMENTS

Intuitively a volume element on an $n$-dimensional scalar product space $V$ is a function $\omega$ that assigns to $n$ vectors $v_1, \dots, v_n \in V$ the volume of the parallelepiped with these vectors as sides. (Thus $\omega(v_1, \dots, v_n) = 0$ if the vectors are linearly dependent, that is, if the parallelepiped collapses.) Equivalently, but more rigorously, $\omega$ is multilinear and, if $e_1, \dots, e_n$ is an orthonormal basis for $V$, then $\omega(e_1, \dots, e_n) = \pm 1$. Now we apply this definition to tangent spaces.


18. Definition. A volume element on an $n$-dimensional semi-Riemannian manifold $M$ is a smooth $n$-form $\omega$ such that $\omega(e_1, \dots, e_n) = \pm 1$ for every frame on $M$.

Volume elements always exist at least locally.

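19. Lemma. On the domain $\mathcal{U}$ of a coordinate system $\xi$ there is a volume element $\omega_\xi$ such that $\omega_\xi(\partial_1, \dots, \partial_n) = |\det(g_{ij})|^{1/2}$.

Proof. For vector fields $V_1, \dots, V_n$ on $\mathcal{U}$ write $V_j = \sum_i V_j^i\,\partial_i$ and define

$\omega_\xi(V_1, \dots, V_n) = \det(V_j^i)\,|\det(g_{ij})|^{1/2}.$

Properties of determinants show that this uniquely defines $\omega_\xi$ as an $n$-form on $\mathcal{U}$. If $V_1, \dots, V_n$ is a frame field, then

$\delta_{ij}\varepsilon_j = \langle V_i, V_j\rangle = \left\langle \sum_r V_i^r\,\partial_r, \sum_s V_j^s\,\partial_s \right\rangle = \sum_{r,s} V_i^r\, g_{rs}\, V_j^s.$

Taking determinants gives $(-1)^\nu = (\det(V_j^i))^2 \det(g_{ij})$, where $\nu$ is the index of $M$; hence $\omega_\xi(V_1, \dots, V_n) = \det(V_j^i)\,|\det(g_{ij})|^{1/2} = \pm 1$. ∎

In the notation of differential forms, $\omega_\xi = |g|^{1/2}\, dx^1 \wedge \cdots \wedge dx^n$, where $|g| = |\det(g_{ij})|$.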
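As a small numerical illustration of the coordinate formula $|g|^{1/2} = |\det(g_{ij})|^{1/2}$, the following sketch evaluates the density for two familiar metrics. The function names (`det`, `volume_density`, `polar_metric`, `minkowski_metric`) are hypothetical helpers introduced only for this example; the metrics are the Euclidean plane in polar coordinates and Minkowski 4-space in natural coordinates.

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for small n)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        # Minor obtained by deleting row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def volume_density(g):
    """Return |det(g_ij)|^(1/2) for the component matrix g of a metric tensor."""
    return math.sqrt(abs(det(g)))

def polar_metric(r):
    # Euclidean plane in polar coordinates (r, theta): ds^2 = dr^2 + r^2 dtheta^2.
    return [[1.0, 0.0], [0.0, r * r]]

def minkowski_metric():
    # Minkowski 4-space in natural coordinates (index 1).
    return [[-1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

polar_density = volume_density(polar_metric(2.0))     # density r at r = 2
minkowski_density = volume_density(minkowski_metric())
```

The absolute value is what lets the same formula serve for indefinite metrics, where $\det(g_{ij})$ may be negative.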

20. Lemma. A semi-Riemannian manifold $M$ has a (global) volume element if and only if $M$ is orientable.

Proof. If $\omega$ is a volume element, then the bases for $T_p(M)$ such that $\omega(v_1, \dots, v_n) > 0$ constitute an orientation $\lambda(p)$ of $T_p(M)$, and the function $p \to \lambda(p)$ is smooth. If $M$ is oriented, then for all positively oriented coordinate systems $\xi$ the local volume elements $\omega_\xi$ agree on overlaps, hence give a global volume element. In fact, by the determinant formula, $\omega_\eta = J\omega_\xi$. Then $J > 0$ and $\omega_\eta^2 = \omega_\xi^2$ imply $\omega_\eta = \omega_\xi$ on their common domain.

In Chapter 9 we shall interpret the Lie derivative $L_X$ of Definition 2.16 as a derivative relative to the flow of the vector field $X$ (compare Proposition 1.58). Thus the following result shows that the divergence of $X$ is the logarithmic rate of change of volume under the flow of $X$.

21. Lemma. If $\omega$ is a (local) volume element on $M$, then $L_X(\omega) = (\operatorname{div} X)\,\omega$.

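Proof. Let $E_1, \dots, E_n$ be a frame field such that $\omega(E_1, \dots, E_n) = 1$. Since $L_X$ is a tensor derivation and $L_X(1) = X1 = 0$,

$(L_X\omega)(E_1, \dots, E_n) = -\sum_i \omega(E_1, \dots, L_X E_i, \dots, E_n).$

Write $L_X E_i = [X, E_i]$ as $\sum_j f_{ij}E_j$. Since $\omega$ is skew-symmetric, upon substitution the summands other than $f_{ii}E_i$ yield zero. Thus

$(L_X\omega)(E_1, \dots, E_n) = -\sum_i f_{ii}.$

On the other hand, from Chapter 3,

$\operatorname{div} X = \sum_i \varepsilon_i\langle D_{E_i}X, E_i\rangle = -\sum_i \varepsilon_i\langle [X, E_i], E_i\rangle + \sum_i \varepsilon_i\langle D_X E_i, E_i\rangle.$

The latter sum vanishes since $\langle E_i, E_i\rangle$ is constant, leaving $\operatorname{div} X = -\sum_i f_{ii}$. ∎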

The proof of Lemma 20 shows that if $M$ is orientable, it can be oriented by the choice of a volume element, and there are just two, $\pm\omega$, if $M$ is connected. Sometimes we denote a selected volume element on $M$ in classical style by $dM$.

22. Definition. Let $M$ and $N$ be semi-Riemannian manifolds of the same dimension $n$, oriented by volume elements $dM$ and $dN$. If $\phi\colon M \to N$ is a smooth mapping, the function $J \in \mathfrak{F}(M)$ such that $\phi^*(dN) = J\,dM$ is the Jacobian function of $\phi$.

Applying the formula to vectors $v_1, \dots, v_n$ in $T_p(M)$ gives

$dN(d\phi(v_1), \dots, d\phi(v_n)) = J(p)\, dM(v_1, \dots, v_n).$

Thus the function $J$ is nonvanishing if and only if $\phi$ is a local diffeomorphism, and then $\phi$ is orientation-preserving if $J > 0$, orientation-reversing if $J < 0$. In general, if $v_1, \dots, v_n$ is any basis for $T_p(M)$, the formula

$|J(p)| = \dfrac{|dN(d\phi(v_1), \dots, d\phi(v_n))|}{|dM(v_1, \dots, v_n)|}$

exhibits $|J(p)|$ as the rate of change of volume near $p$ under the mapping $\phi$.

The following scheme traces back to Gauss. Let $M$ be a hypersurface in $\mathbf{R}^{n+1}_\nu$ oriented by unit normal $U$. If $M$ has sign $\varepsilon$, that is, if $\langle U, U\rangle = \varepsilon$, let $Q$ be the unit hyperquadric in $\mathbf{R}^{n+1}_\nu$ with the same sign. Explicitly,

$Q = \{q \in \mathbf{R}^{n+1}_\nu : \langle q, q\rangle = \varepsilon\}.$

$Q$ will always be oriented by the position vector field $P$, which on $Q$ is a unit normal. For each $p \in M$ the point $\psi(p)$ of $\mathbf{R}^{n+1}_\nu$ canonically corresponding to the vector $U_p$ lies in $Q$. The resulting smooth mapping $\psi\colon M \to Q$ is called the Gauss map of $M \subset \mathbf{R}^{n+1}_\nu$.

23. Proposition. Let $M$ be a hypersurface in $\mathbf{R}^{n+1}_\nu$ oriented by $U$. Then the Jacobian function $J$ of the Gauss map of $M$ is $(-1)^n \det S$, where $S$ is the shape operator of $M$ derived from $U$.


Proof. By construction $U_p$ and $P_{\psi(p)}$ canonically correspond to $\psi(p)$, hence to each other. Thus if $v \in T_p(M)$, its canonical correspondent at $\psi(p)$ is in $T_{\psi(p)}(Q)$. We assert that the differential map of $\psi$ is essentially just $-S$. If $v \in T_p(M)$, let $\alpha$ be a curve in $M$ with initial velocity $v$. Writing canonical correspondence as equality gives

$d\psi(v) = (\psi \circ \alpha)'(0) = (U_\alpha)'(0) = D_v U = -S(v).$

That $M$ is oriented by $U$ means that a frame $e_1, \dots, e_n$ in $T_p(M)$ is positively oriented if and only if $e_1, \dots, e_n, U$ is positively oriented. Then the canonically corresponding frame in $T_{\psi(p)}(Q)$ is positively oriented (by means of $P \approx U$). For the corresponding volume elements $\omega$,

$J(p) = \omega(d\psi(e_1), \dots, d\psi(e_n)) = \omega(-Se_1, \dots, -Se_n) = (-1)^n(\det S)\,\omega(e_1, \dots, e_n) = (-1)^n \det S.$ ∎

For an orientable hypersurface $M \subset \mathbf{R}^{n+1}_\nu$ the function $\det S$ is called the Gauss-Kronecker curvature. It is independent of the choice of $\pm U$ in even dimensions $n$, and for $n = 2$ reduces to Gaussian curvature. The preceding result exhibits $|\det S|$ as the rate of change of volume under the Gauss map.

For the integration of $n$-forms over (regions in) smooth $n$-dimensional manifolds, see [W], [Sp], and [BG].

VECTOR BUNDLES

The tangent bundle $TM$ is an instance of the following general notion.

24. Definition. A $k$-vector bundle $(E, \pi)$ over a manifold $M$ consists of a manifold $E$ and a smooth map $\pi\colon E \to M$ such that
(1) each $\pi^{-1}(p)$, $p \in M$, is a $k$-dimensional vector space;
(2) for each $p \in M$ there is a neighborhood $\mathcal{U}$ of $p$ in $M$ and a diffeomorphism

$\phi\colon \mathcal{U} \times \mathbf{R}^k \to \pi^{-1}(\mathcal{U}) \subset E$

such that for each $q \in \mathcal{U}$, the map $v \to \phi(q, v)$ is a linear isomorphism from $\mathbf{R}^k$ onto $\pi^{-1}(q)$. (See Figure 1.)

Roughly speaking, over small enough regions $\mathcal{U}$ in $M$, the manifold $E$ is a product $\mathcal{U} \times \mathbf{R}^k$. Terminology: $M$ is the base manifold, $E$ the total manifold, $\pi$ the projection, $\pi^{-1}(p)$ the fiber over $p$, $\mathbf{R}^k$ the standard fiber, and $\phi$ a bundle chart of the bundle. When the projection $\pi$ is clear from context we say merely that $E$ is a $k$-vector bundle over $M$.


Recall that in the case of the tangent bundle $TM$ each coordinate system $\xi$ on $\mathcal{U} \subset M$ gives rise to a coordinate system on $\pi^{-1}(\mathcal{U}) \subset TM$, with $\phi(p, a_1, \dots, a_n) = \sum a_i\,\partial_i|_p$. Evidently these functions $\phi$ are bundle charts making $TM$ an $n$-vector bundle over $M^n$. A vector field $X$ on $M$ is precisely a section $X\colon M \to TM$ of the tangent bundle, and this fact is the key to exploiting the analogy between $TM$ and arbitrary vector bundles.

A local basis for a $k$-vector bundle $(E, \pi)$ is a set of $k$ linearly independent local sections $X_i\colon \mathcal{U} \to E$ $(1 \le i \le k)$ on an open set $\mathcal{U} \subset M$. There is a natural one-to-one correspondence between bundle charts $\phi$ and local bases $X_1, \dots, X_k$ expressed by the formula $\phi(p, a_1, \dots, a_k) = \sum a_i X_i|_p$.

A $k$-vector bundle $(E, \pi)$ over $M$ is trivial provided it has $k$ linearly independent global sections, or equivalently a global bundle chart $\phi\colon M \times \mathbf{R}^k \approx E$. For example, $T\mathbf{R}^n$ and $TS^1$ are trivial, but $TS^2$ is not, since $S^2$ does not admit even one nonvanishing tangent vector field.

A vector bundle $(E, \pi)$ is orientable provided there exists a function smoothly assigning to each $p \in M$ an orientation of $\pi^{-1}(p)$. Thus, for example, the tangent bundle $(TM, \pi)$ is orientable if and only if the manifold $M$ is orientable. (By contrast the manifold $TM$ is always orientable.)

If $M^n$ is a semi-Riemannian submanifold of $\bar{M}^{n+k}$, let $NM$ be the set $\bigcup\{T_p(M)^\perp : p \in M\}$ of all normal vectors to $M$. Let $\pi\colon NM \to M$ be the map carrying each $T_p(M)^\perp$ to $p \in M$. We shall show that $(NM, \pi)$ is in a natural way a $k$-vector bundle over $M$, called the normal bundle of $M$ in $\bar{M}$. For each $p \in M$ there is a normal frame field $E_1, \dots, E_k$ defined on some neighborhood $\mathcal{U}$ of $p$ in $M$. The formula

$\phi(q, a_1, \dots, a_k) = \sum a_i E_i|_q$

defines a one-to-one map $\phi$ of $\mathcal{U} \times \mathbf{R}^k$ onto $\pi^{-1}(\mathcal{U}) \subset NM$. As with the tangent bundle, requiring each $\mathcal{U}$ to be a coordinate neighborhood lets us apply Proposition 1.42 to make $NM$ a manifold with maps $(\xi \times \mathrm{id}) \circ \phi^{-1}$ as coordinate systems. Then the maps $\phi$ are bundle charts making $NM$ a vector bundle over $M$.

Geometrically the normal bundle $NM$ bears a strong analogy to the tangent bundle $TM$. Just as vector fields in $\mathfrak{X}(M)$ are the sections of $TM$, those in $\mathfrak{X}(M)^\perp$ are the sections of $NM$. Corresponding to the Levi-Civita connection $D$ for $TM$ is the normal connection $D^\perp$ for $NM$ (Definition 4.31). Indeed the theory of vector bundles can be used to unify much of differential geometry by generalizing the notions of tensor field, connection, and metric.

If $(E, \pi)$ is a vector bundle over $M$, then the set $\Gamma(E)$ of its sections is in a natural way a module over $\mathfrak{F}(M)$. For tensor fields

$A\colon \mathfrak{X}^*(M)^r \times \mathfrak{X}(M)^s \to \mathfrak{F}(M)$

as defined in Chapter 2, if $\mathfrak{F}(M)$ is replaced by $\Gamma(E)$, then many results, in particular Proposition 2.2, go through as before. For example, the shape tensor of $M \subset \bar{M}$ is an $\mathfrak{F}(M)$-bilinear function $\mathrm{II}\colon \mathfrak{X}(M) \times \mathfrak{X}(M) \to \Gamma(NM) = \mathfrak{X}(M)^\perp$.

Since the fiber of $NM$ at $p$ is $T_p(M)^\perp$ the generalized Proposition 2.2 gives as the value of $\mathrm{II}$ at $p$ an $\mathbf{R}$-bilinear function $T_p(M) \times T_p(M) \to T_p(M)^\perp$.

Change notation from $M \subset \bar{M}$ to $P \subset M$. In the trivial case where the submanifold $P$ reduces to a single point $p \in M$, then $NP$ is just the tangent space $T_p(M)$. The exponential map $\exp_p$ generalizes as follows: If $M$ is complete, the normal exponential map

$\exp^\perp\colon NP \to M$

sends $v \in NP$ to $\gamma_v(1)$, where $\gamma_v$ is the $M$ geodesic of initial velocity $v$. Thus $\exp^\perp$ carries radial lines in $T_p(P)^\perp$ to geodesics of $M$ normal to $P$ at $p$. A differential equations argument as for $\exp\colon TM \to M$ shows that $\exp^\perp$ is smooth. If $M$ is not complete, then $\exp^\perp$ is defined only on some open domain containing the set $Z$ of all zero vectors in $NP$. (Henceforth we often write $\exp^\perp$ merely as $\exp$.)

Continuing the analogy, define a neighborhood $\mathcal{U}$ of $P$ in $M$ to be normal provided $\mathcal{U}$ is the diffeomorphic image under $\exp^\perp$ of a neighborhood of $Z$ in $NP$. To prove that normal neighborhoods exist, a preliminary fact is needed.

25. Lemma. If $p \in P \subset M$, then $\exp^\perp$ carries some neighborhood of $0_p$ in $NP$ diffeomorphically onto a neighborhood of $p$ in $M$.

Proof. As for any vector bundle, the bundle charts show that the set $Z$ of zeros is a submanifold of $NP$. Thus $\exp$ restricts to a smooth map of $Z$ onto $P$. In fact this is a diffeomorphism, since the inverse map is the zero vector field. Thus $d\exp_0$ is one-to-one on $T_0(Z) \subset T_0(NP)$. Also $N_p = T_p(P)^\perp$ is a submanifold of $NP$, and as in the one-point case, $d\exp_0$ is the canonical isomorphism $T_0(N_p) \approx N_p$. Since $T_p(M)$ is the direct sum of $T_p(P)$ and $N_p$ it follows that $d\exp_0$ is a linear isomorphism of $T_0(NP)$ onto $T_p(M)$. Then the inverse function theorem gives the result. ∎

26. Proposition. Every semi-Riemannian submanifold $P \subset M$ has a normal neighborhood in $M$.

Proof. For each $p \in P$ let $\mathcal{N}_p$ be a neighborhood of $0_p$ in $NP$ on which $\exp$ is a diffeomorphism. By definition, $P$ has the induced topology, so by shrinking $\mathcal{N}_p$ we can arrange that, if $\exp(v) \in P$ for $v \in \mathcal{N}_p$, then $v = 0$. By standard point set topology we can further arrange that $\mathcal{N} = \bigcup \mathcal{N}_p$ has this property: given any compact set $K \subset P$, the set $\pi^{-1}(K) \cap \bar{\mathcal{N}}$ is compact, where $\pi\colon NP \to P$ is the natural projection.

It suffices to show that $\exp$ is one-to-one on some neighborhood of $Z$ contained in $\mathcal{N}$. Note that by construction, if $v \ne w$ in $\mathcal{N}$ and $\exp(v) = \exp(w)$, then both $v$ and $w$ are nonzero.

There exists a sequence $\mathcal{N} = \mathcal{N}_0 \supset \mathcal{N}_1 \supset \cdots$ of neighborhoods of $Z$ in $NP$ such that $\bigcap \mathcal{N}_i = Z$. We assert that given any compact set $K \subset P$ there is an index $i$ such that $\exp$ is one-to-one on $E_i = (\pi^{-1}K \cap \mathcal{N}_i) \cup Z$. Assume not. Then for each $j$ there are vectors $v_j \ne w_j$ in $E_j$ such that $\exp(v_j) = \exp(w_j)$. By an above property, neither $v_j$ nor $w_j$ is zero, hence both are in $\pi^{-1}K \cap \mathcal{N}_j \subset \pi^{-1}K \cap \bar{\mathcal{N}}$. Thus, passing to subsequences if necessary, $\{v_j\}$ and $\{w_j\}$ converge, necessarily to zero vectors $0_p$ and $0_q$, respectively. Since $\exp$ is continuous, $p = \exp(0_p) = \exp(0_q) = q$. But then there are pairs $v_j, w_j$ (as above) in $\mathcal{N}_p$, a contradiction.

If $P$ is compact, take $K$ to be $P$ and the proof is complete. If $P$ is not compact, then by second countability it contains an increasing sequence of compact sets $K_i \subset \operatorname{int} K_{i+1}$ such that $\bigcup K_i = P$. For $K_1$ choose $E_{i_1}$ as above. A mild variant of the preceding argument shows that there is an $i_2 \ge i_1$ such that $\exp$ is one-to-one on $E_{i_1} \cup E_{i_2}$. Continue by induction. Then $\bigcup E_{i_j}$ is the required neighborhood of $Z$. ∎

This result fails in general for immersed submanifolds. For example, a figure-8 submanifold of $\mathbf{R}^2$ as in Exercise 1.15 evidently has no normal neighborhood.

LOCAL ISOMETRIES

One of the most useful geometric applications of covering methods is this simple consequence of Proposition A.14.


27. Corollary. If $\phi\colon \tilde{M} \to M$ is a semi-Riemannian covering with $\tilde{M}$ connected and $M$ simply connected, then $\phi$ is an isometry.

The result fails if $\phi$ is merely a local isometry onto $M$; thus it is important to decide which local isometries are covering maps. The following theorem shows that a necessary lift condition (Lemma A.9) is also sufficient.

28. Theorem. Let $\phi\colon M \to N$ be a local isometry with $N$ connected. Suppose that, given any geodesic $\sigma\colon [0, 1] \to N$ and point $p \in M$ such that $\phi(p) = \sigma(0)$, there exists a lift $\tilde{\sigma}\colon [0, 1] \to M$ of $\sigma$ through $\phi$ starting at $p$. Then $\phi$ is a semi-Riemannian covering map.

Proof. In view of Lemma 3.32 the lift condition implies that $\phi$ is onto. If $\mathcal{U}$ is a normal neighborhood of $q$ in $N$, we shall show that $\mathcal{U}$ is evenly covered by $\phi$ (Definition A.7). The scheme is as follows: $\mathcal{U}$ is filled with radial geodesics; lifting these to start at each $p \in \phi^{-1}(q)$ gives disjoint diffeomorphic copies of $\mathcal{U}$ that fill $\phi^{-1}(\mathcal{U})$.

Let $\tilde{\mathcal{U}}$ be the neighborhood of $0$ in $T_q(N)$ mapped diffeomorphically onto $\mathcal{U}$ by $\exp_q$. If $p \in \phi^{-1}(q)$, then $d\phi_p\colon T_p(M) \to T_q(N)$ is a linear isometry, hence also a diffeomorphism. Thus $\tilde{\mathcal{U}}(p) = (d\phi_p)^{-1}(\tilde{\mathcal{U}})$ is a starshaped neighborhood of $0$ in $T_p(M)$.

(1) $\exp_p$ is defined on $\tilde{\mathcal{U}}(p)$.

If $v \in \tilde{\mathcal{U}}(p)$, let $\sigma\colon [0, 1] \to \mathcal{U}$ be the radial geodesic with initial velocity $d\phi(v) \in \tilde{\mathcal{U}}$. By hypothesis, $\sigma$ has a lift $\tilde{\sigma}\colon [0, 1] \to M$ starting at $p$. Since $\phi$ is locally an isometry, $\tilde{\sigma}$ is a geodesic. Now $d\phi(\tilde{\sigma}'(0)) = \sigma'(0) = d\phi(v)$, hence $v = \tilde{\sigma}'(0)$. But then $\exp_p(v) = \tilde{\sigma}(1)$.

(2) $\mathcal{U}(p) = \exp_p(\tilde{\mathcal{U}}(p))$ is a normal neighborhood of $p$ in $M$.

The proof of (1) shows in fact that $\phi(\exp_p(v)) = \exp_q(d\phi(v))$ for all $v \in \tilde{\mathcal{U}}(p)$. Thus $\phi$ carries $\mathcal{U}(p)$ onto $\mathcal{U}$. Let us agree that these maps are reduced to the sets shown in Figure 2. Then since $d\phi$ and $\exp_q$ are diffeomorphisms, so is $\phi \circ \exp_p$. Thus $\exp_p$ is one-to-one as well as onto, and $d\phi \circ d\exp_p$ is a linear isomorphism at each $v \in \tilde{\mathcal{U}}(p)$. But $d\phi$ is a linear isomorphism, hence so is $(d\exp_p)_v$. It follows that $\mathcal{U}(p)$ is an open submanifold of $M$ and $\exp_p\colon \tilde{\mathcal{U}}(p) \to \mathcal{U}(p)$ is a diffeomorphism, thus proving (2).

(3) $\phi$ carries $\mathcal{U}(p)$ diffeomorphically onto $\mathcal{U}$.

In fact, for reduced maps as above, $\phi = \exp_q \circ\, d\phi_p \circ \exp_p^{-1}$, a composition of diffeomorphisms.

(4) If $p_1 \ne p_2$ in $\phi^{-1}(q)$, then the neighborhoods $\mathcal{U}(p_1)$ and $\mathcal{U}(p_2)$ are disjoint.

Assuming that there is a point $m \in \mathcal{U}(p_1) \cap \mathcal{U}(p_2)$, we prove $p_1 = p_2$. For $i = 1, 2$, let $\sigma_i\colon [0, 1] \to \mathcal{U}(p_i)$ be the radial geodesic (with parametrization reversed) from $m$ to $p_i$. Then $\phi \circ \sigma_1$ and $\phi \circ \sigma_2$ are reversed radial geodesics in $\mathcal{U}$ from $\phi(m)$ to $q$, so they are equal. Thus $\sigma_1$ and $\sigma_2$ are lifts of the same geodesic starting at the same point, hence $\sigma_1 = \sigma_2$. In particular, $p_1 = \sigma_1(1) = \sigma_2(1) = p_2$.

Figure 2

(5) $\phi^{-1}(\mathcal{U}) = \bigcup\{\mathcal{U}(p) : p \in \phi^{-1}(q)\}$.

By (3) the union is contained in $\phi^{-1}(\mathcal{U})$, so we must prove the reverse inclusion. Let $m \in \phi^{-1}(\mathcal{U})$, and let $\sigma\colon [0, 1] \to \mathcal{U}$ be the reversed radial geodesic from $\phi(m)$ to $q$. If $\tilde{\sigma}$ is the lift of $\sigma$ starting at $m$, let $p = \tilde{\sigma}(1)$. Then $\phi(p) = \sigma(1) = q$, so $p \in \phi^{-1}(q)$. But $\tilde{\sigma}$ must lie in $\mathcal{U}(p)$, hence $m \in \mathcal{U}(p)$. ∎

29. Corollary. Let $\phi\colon M \to N$ be a local isometry, with $N$ connected. Then $M$ is complete if and only if $N$ is complete and $\phi$ is a semi-Riemannian covering map.

Proof. If $M$ is complete, we assert that the lift condition in the theorem holds. In fact, if $\sigma\colon [0, 1] \to N$ is a geodesic and $\phi(p) = \sigma(0)$, there is a unique vector $v \in T_p(M)$ such that $d\phi(v) = \sigma'(0)$. By completeness, $\gamma_v$ is defined on $\mathbf{R}$, and by the uniqueness of geodesics, $\phi \circ \gamma_v\,|\,[0, 1] = \sigma$. Thus the theorem shows that $\phi$ is a covering map. With the same notation, $\phi \circ \gamma_v$ is a geodesic extension of $\sigma$ over $\mathbf{R}$. Thus $N$ is complete.

Conversely, if $v \in T_p(M)$, then since $N$ is complete, the geodesic with initial velocity $d\phi(v)$ is defined on the whole real line. Since $\phi$ is a covering map this curve has a lift $\tilde{\gamma}\colon \mathbf{R} \to M$ starting at $p$ and hence having initial velocity $v$. Since $\phi$ is a local isometry, $\tilde{\gamma}$ is a geodesic. Thus $\tilde{\gamma}$ is in fact $\gamma_v$; hence $M$ is complete.

In this corollary and in the preceding theorem, if $N$ is not connected it suffices to suppose that the image of $\phi$ meets every component of $N$.

MATCHED COVERINGS

The goal of this section is to show how a suitable covering $\mathcal{U}^*$ of a manifold $M$ by open sets can be used to construct a covering manifold $M^*$ of $M$. Suppose $M$ is a surface; we can think of the open sets in $\mathcal{U}^*$ as disks of paper lying on $M$. If various pairs of overlapping disks are glued together (leaving each in position), then a new surface $M^*$ is constructed together with a mapping $k\colon M^* \to M$ sending each point of $M^*$ to the point of $M$ it lies over. In favorable cases $k$ will be a covering map.

30. Definition. A matched covering $(\mathcal{U}^*, \sim)$ of a smooth manifold $M$ is a covering $\mathcal{U}^* = \{\mathcal{U}_a : a \in A\}$ of $M$ by open sets together with a relation $\sim$ on the index set $A$ such that for all $a, b, c \in A$

(1) $a \sim a$.
(2) If $a \sim b$, then $b \sim a$.
(3) If $a \sim b$, $b \sim c$, and $\mathcal{U}_a \cap \mathcal{U}_b \cap \mathcal{U}_c$ is nonempty, then $a \sim c$.

Then $\sim$ is called the matching relation; it tells which sets of $\mathcal{U}^*$ are to be glued together. For such a matched covering let

$\Sigma = \{(p, a) \in M \times A : p \in \mathcal{U}_a\}.$

On $\Sigma$ define $(p, a) \sim (q, b)$ to mean $p = q$ and $a \sim b$. Though $\sim$ need not be an equivalence relation on $A$, it is easy to check that $\sim$ is an equivalence relation on $\Sigma$. Let $M^*$ be the resulting set $\Sigma/\!\sim$ of equivalence classes. If $a \in A$, let $k_a\colon \mathcal{U}_a \to M^*$ be the one-to-one function sending each $p \in \mathcal{U}_a$ to the equivalence class containing $(p, a)$. It follows from a variant of Proposition 1.42 that there is a unique way to make $M^*$ a manifold such that each such $k_a$ is a diffeomorphism onto an open submanifold of $M^*$.

A point $p^*$ of $M^*$ is an equivalence class all of whose elements $(p, a)$ have the same first coordinate $p$. Setting $k(p^*) = p$ defines the natural mapping $k\colon M^* \to M$. Then $k$ is a local diffeomorphism onto $M$, for it is onto by construction and, if $(p, a) \in p^* \in M^*$, then $k\,|\,k_a(\mathcal{U}_a)$ is the diffeomorphism inverse to $k_a\colon \mathcal{U}_a \to k_a(\mathcal{U}_a)$. A further property is required to make $k$ a covering map.
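The set-level part of this construction can be tried out on a finite toy model. The sketch below is only a combinatorial illustration, not a construction of manifolds: the twelve points of $\mathbf{Z}/12$ stand in for a circle, six overlapping arcs wind around it twice, and adjacent arcs are matched. All names (`satisfies_axioms`, `build_matched_cover`, `sets`, `matched`) and the data are hypothetical choices made for this example. The classes of $\Sigma$ then form a set with exactly two points over each point of the base, as one expects of a double covering.

```python
def satisfies_axioms(sets, matched):
    """Check conditions (1)-(3) of a matching relation on the index set."""
    idx = list(sets)
    for a in idx:
        if not matched(a, a):
            return False
        for b in idx:
            if matched(a, b) != matched(b, a):
                return False
            for c in idx:
                triple = sets[a] & sets[b] & sets[c]
                if matched(a, b) and matched(b, c) and triple and not matched(a, c):
                    return False
    return True

def build_matched_cover(points, sets, matched):
    """Return the classes of Sigma = {(p, a) : p in sets[a]} under
    (p, a) ~ (q, b)  iff  p == q and matched(a, b).

    Assumes conditions (1)-(3) hold, so that for each fixed p the relation
    on the indices a with p in sets[a] is an equivalence relation.
    """
    classes = []
    for p in points:
        remaining = [a for a in sets if p in sets[a]]
        while remaining:
            a = remaining[0]
            cls = [b for b in remaining if matched(a, b)]
            classes.append(frozenset((p, b) for b in cls))
            remaining = [b for b in remaining if b not in cls]
    return classes

# Toy data: arc number a covers the six points 4a, 4a+1, ..., 4a+5 (mod 12).
points = range(12)
sets = {a: {(4 * a + j) % 12 for j in range(6)} for a in range(6)}
matched = lambda a, b: (a - b) % 6 in (0, 1, 5)   # an arc matches itself and its neighbors

m_star = build_matched_cover(points, sets, matched)
# Number of points of M* lying over each base point p.
fiber_sizes = {p: sum(1 for c in m_star if next(iter(c))[0] == p) for p in points}
```

Arcs 0 and 3 cover the same points but are not matched, which is what keeps the two sheets apart.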


31. Definition. A matched covering $(\mathcal{U}^*, \sim)$ of $M$ is chainable over a curve $\sigma\colon [0, 1] \to M$ provided that, given any $a \in A$ such that $\sigma(0) \in \mathcal{U}_a$, there exist numbers $0 = t_0 < t_1 < \cdots < t_k = 1$ and indices $a = a_1 \sim a_2 \sim \cdots \sim a_k$ such that

$\sigma([t_{i-1}, t_i]) \subset \mathcal{U}_{a_i}$ for $1 \le i \le k$.

This is just what is required to guarantee that $\sigma$ has a smooth lift through $k\colon M^* \to M$ starting at any point over $\sigma(0)$. In fact, since $a \sim b$ implies $k_a = k_b$ on $\mathcal{U}_a \cap \mathcal{U}_b$, the segments $k_{a_i} \circ \sigma\,|\,[t_{i-1}, t_i]$ combine to form the required lift.

If $(\mathcal{U}^*, \sim)$ is chainable over every curve segment, then $k$ is a covering map; however, we consider only the case where $M$ is semi-Riemannian. As usual the metric $k^*(g)$ on $M^*$ makes $k\colon M^* \to M$ a local isometry, and Theorem 28 then gives

32. Proposition. If a matched covering $(\mathcal{U}^*, \sim)$ of a semi-Riemannian manifold $M$ is chainable over every geodesic segment $\sigma\colon [0, 1] \to M$, then $k\colon M^* \to M$ is a semi-Riemannian covering map.

WARPED PRODUCTS

On a semi-Riemannian product manifold $B \times F$ the metric tensor is $\pi^*(g_B) + \sigma^*(g_F)$, where $\pi$ and $\sigma$ are the projections of $B \times F$ onto $B$ and $F$, respectively. A rich class of metrics on $B \times F$ will now be obtained by homothetically warping the product metric on each fiber $p \times F$ (see [BO]). Standard notation is used for the geometry of $B$, but the metric tensor of $F$ is denoted by $g_F = (\ ,\ )$ and its Levi-Civita connection by $\nabla$.

33. Definition. Suppose $B$ and $F$ are semi-Riemannian manifolds, and let $f > 0$ be a smooth function on $B$. The warped product $M = B \times_f F$ is the product manifold $B \times F$ furnished with metric tensor

$g = \pi^*(g_B) + (f \circ \pi)^2\,\sigma^*(g_F).$

Explicitly, if $x$ is tangent to $B \times F$ at $(p, q)$, then

$\langle x, x\rangle = \langle d\pi(x), d\pi(x)\rangle + f^2(p)\,(d\sigma(x), d\sigma(x)).$

The argument proving Lemma 3.5 shows that $g$ is in fact a metric tensor. If $f = 1$, then $B \times_f F$ reduces to a semi-Riemannian product manifold. $B$ is called the base of $M = B \times_f F$, and $F$ the fiber. Our goal is to express the geometry of $M$ in terms of warping function $f$ and the geometries of $B$ and $F$.
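The defining formula can be evaluated directly once the two scalar products and the warping function are supplied. The following is a minimal sketch under simplifying assumptions: the name `warped_inner` and its calling convention are hypothetical, and a tangent vector is represented as a pair (base part, fiber part), with base and fiber each one-dimensional so that vectors are plain numbers.

```python
import math

def warped_inner(x, y, p, g_base, g_fiber, f):
    """Scalar product <x, y> at a point of B x_f F whose base component is p.

    x, y    : pairs (base_part, fiber_part), i.e. (d pi(x), d sigma(x))
    g_base  : scalar product on the tangent space of B
    g_fiber : scalar product on the tangent space of F
    f       : warping function on B
    """
    (xb, xf), (yb, yf) = x, y
    return g_base(xb, yb) + f(p) ** 2 * g_fiber(xf, yf)

positive = lambda u, v: u * v     # positive definite line
negative = lambda u, v: -u * v    # negative definite line, as in R^1_1

x = (1.0, 1.0)

# Base R^+ with warping function f(t) = t: the plane in polar coordinates, at radius 3.
polar_value = warped_inner(x, x, 3.0, positive, positive, lambda t: t)

# Constant warping function 1 recovers the product metric.
product_value = warped_inner(x, x, 3.0, positive, positive, lambda t: 1.0)

# Negative definite base with f = e^t, at t = 0: the same vector is null.
null_value = warped_inner(x, x, 0.0, negative, positive, math.exp)
```

Only the fiber part of a vector is rescaled, which is why the base keeps its own geometry while the fibers are homothetic copies of $F$.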

Figure 3

As in the case of a semi-Riemannian product it is easy to see that the fibers $p \times F = \pi^{-1}(p)$ and the leaves $B \times q = \sigma^{-1}(q)$ are semi-Riemannian submanifolds of $M$ (see Figure 3), and the warped metric is characterized by

(1) For each $q \in F$, the map $\pi\,|\,(B \times q)$ is an isometry onto $B$.
(2) For each $p \in B$, the map $\sigma\,|\,(p \times F)$ is a positive homothety onto $F$, with scale factor $1/f(p)$.
(3) For each $(p, q) \in M$, the leaf $B \times q$ and the fiber $p \times F$ are orthogonal at $(p, q)$.

Vectors tangent to leaves are horizontal; vectors tangent to fibers are vertical. We denote by $\mathcal{H}$ the orthogonal projection of $T_{(p,q)}(M)$ onto its horizontal subspace $T_{(p,q)}(B \times q)$, and by $\mathcal{V}$ the projection onto the vertical subspace $T_{(p,q)}(p \times F)$.

The horizontal/vertical terminology can be confusing in particular applications, and the following alternative notation for $\mathcal{H}$ and $\mathcal{V}$ is often convenient. It will soon be clear that only the shape tensor of the fibers is of interest. Thus, as in Chapter 4, we write tan for the projection $\mathcal{V}$ onto $T_{(p,q)}(p \times F)$, and nor for the projection $\mathcal{H}$ onto

$(T_{(p,q)}(p \times F))^\perp = T_{(p,q)}(B \times q).$

Hence for vertical vector fields $V, W$ on $M$, the formula $\mathrm{II}(V, W) = \operatorname{nor} D_V W$ gives the shape tensor of all fibers.

Recall from Chapter 1 the notion of lift of a vector field on $B$ or $F$ to $B \times F$, the set of all such lifts being denoted as usual by $\mathfrak{L}(B)$ and $\mathfrak{L}(F)$, respectively. Typically we use the same notation for a vector field and for its lift.

The relation of a warped product to the base $B$ is almost as simple as in the special case of a semi-Riemannian product; however, the relation to the fiber $F$ often involves the warping function $f$.


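34. Lemma. If $h \in \mathfrak{F}(B)$, then the gradient of the lift $h \circ \pi$ of $h$ to $M = B \times_f F$ is the lift to $M$ of the gradient of $h$ on $B$.

Proof. We must show that $\operatorname{grad}(h \circ \pi)$ is horizontal and $\pi$-related to $\operatorname{grad} h$ on $B$. If $v$ is a vertical tangent vector to $M$, then $\langle \operatorname{grad}(h \circ \pi), v\rangle = v(h \circ \pi) = d\pi(v)h = 0$, since $d\pi(v) = 0$. Thus $\operatorname{grad}(h \circ \pi)$ is horizontal. If $x$ is horizontal,

$\langle d\pi(\operatorname{grad}(h \circ \pi)), d\pi(x)\rangle = \langle \operatorname{grad}(h \circ \pi), x\rangle = x(h \circ \pi) = d\pi(x)h = \langle \operatorname{grad} h, d\pi(x)\rangle.$

Hence at each point, $d\pi(\operatorname{grad}(h \circ \pi)) = \operatorname{grad} h$.

Thus there should be no confusion if we simplify the notation by writing $h$ for $h \circ \pi$ and $\operatorname{grad} h$ for $\operatorname{grad}(h \circ \pi)$.

The Levi-Civita connection of $M$ can now be related to those of $B$ and $F$ as follows.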

35. Proposition. On $M = B \times_f F$, if $X, Y \in \mathfrak{L}(B)$ and $V, W \in \mathfrak{L}(F)$, then

(1) $D_X Y \in \mathfrak{L}(B)$ is the lift of $D_X Y$ on $B$.
(2) $D_X V = D_V X = (Xf/f)V$.
(3) $\operatorname{nor} D_V W = \mathrm{II}(V, W) = -(\langle V, W\rangle/f)\operatorname{grad} f$.
(4) $\tan D_V W \in \mathfrak{L}(F)$ is the lift of $\nabla_V W$ on $F$.

Proof. (1) The Koszul formula for $2\langle D_X Y, V\rangle$ reduces to

$-V\langle X, Y\rangle + \langle V, [X, Y]\rangle$

since by Corollary 1.44, $[X, V] = [Y, V] = 0$. Because $X$ and $Y$ are lifts from $B$, $\langle X, Y\rangle$ is constant on fibers. Since $V$ is vertical, $V\langle X, Y\rangle = 0$. But $[X, Y]$ is tangent to leaves, hence $\langle V, [X, Y]\rangle = 0$. Thus $\langle D_X Y, V\rangle = 0$ for all $V \in \mathfrak{L}(F)$, so $D_X Y$ is horizontal. Since each $\pi\,|\,(B \times q)$ is an isometry the result follows.

(2) $D_X V = D_V X$ since $[X, V] = 0$. These vector fields are vertical since by (1), $\langle D_X V, Y\rangle = -\langle V, D_X Y\rangle = 0$. All the terms in the Koszul formula for $2\langle D_X V, W\rangle$ vanish except $X\langle V, W\rangle$. By definition of the warped metric tensor, $\langle V, W\rangle(p, q) = f^2(p)(V_q, W_q)$. Writing $f$ for $f \circ \pi$, we have $\langle V, W\rangle = f^2((V, W) \circ \sigma)$. The parenthesized term is constant on leaves, to which $X$ is tangent; hence

$X\langle V, W\rangle = X[f^2((V, W) \circ \sigma)] = 2f\,Xf\,((V, W) \circ \sigma) = 2(Xf/f)\langle V, W\rangle.$

Thus $D_X V = (Xf/f)V$.


(3) By (2),

$\langle D_V W, X\rangle = -\langle W, D_V X\rangle = -\langle W, (Xf/f)V\rangle = -(Xf/f)\langle V, W\rangle.$

By Lemma 34, $Xf = \langle \operatorname{grad} f, X\rangle$ on $M$ as on $B$. Thus for all $X$,

$\langle D_V W, X\rangle = -\langle (\langle V, W\rangle/f)\operatorname{grad} f, X\rangle.$

(4) Since $V$ and $W$ are tangent to all fibers, Lemma 4.4 asserts that on a fiber, $\tan D_V W$ is the fiber covariant derivative applied to the restrictions of $V$ and $W$ to that fiber. Then $\sigma$-relatedness follows since homotheties preserve Levi-Civita connections. ∎

36. Corollary. The leaves $B \times q$ of a warped product are totally geodesic; the fibers $p \times F$ are totally umbilic.

Proof. By (1) in the preceding proposition the shape tensor of each leaf is zero. The fiber assertion is immediate from (3). ∎

37. Examples of Warped Products. (1) A surface of revolution is a warped product with leaves the different positions of the rotated curve and fibers the circles of revolution. Explicitly, if $M$ is gotten by revolving a plane curve $C$ about an axis in $\mathbf{R}^3$ and $f\colon C \to \mathbf{R}^+$ gives distance to the axis, then $M$ is $C \times_f S^1(1)$.

(2) $\mathbf{R}^3 - 0$ as a warped product: In spherical coordinates the line element of $\mathbf{R}^3 - 0$ is

$ds^2 = dr^2 + r^2(d\vartheta^2 + \sin^2\vartheta\, d\varphi^2).$

Setting $r = 1$ gives the line element of the unit sphere $S^2$. Evidently $\mathbf{R}^3 - 0$ is diffeomorphic to $\mathbf{R}^+ \times S^2$ under the natural map $(t, p) \mapsto tp$. Thus the formula for $ds^2$ shows that $\mathbf{R}^3 - 0$ can be identified with the warped product $\mathbf{R}^+ \times_r S^2$. In $\mathbf{R}^3 - 0$ the leaves are the rays from the origin and the fibers are the spheres $S^2(r)$, $r > 0$. In general, $\mathbf{R}^n - 0$ is naturally isometric to $\mathbf{R}^+ \times_r S^{n-1}$.

(3) The standard spacetime models of the universe are warped products (Chapter 12), as are the simplest models of neighborhoods of stars and black holes (Chapter 13).
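The spherical-coordinate line element in Example (2) can be sanity-checked numerically. The sketch below compares the squared Euclidean speed of a curve in $\mathbf{R}^3 - 0$ with the value given by $dr^2 + r^2(d\vartheta^2 + \sin^2\vartheta\, d\varphi^2)$, using central differences. The helper names and the test curve `sample_curve` are hypothetical, and the agreement is only up to finite-difference error.

```python
import math

def to_cartesian(r, th, ph):
    """Spherical coordinates (r, theta, phi) to Cartesian coordinates in R^3."""
    return (r * math.sin(th) * math.cos(ph),
            r * math.sin(th) * math.sin(ph),
            r * math.cos(th))

def euclidean_speed_sq(curve, s, h=1e-6):
    """Squared Euclidean speed of s -> to_cartesian(*curve(s)), by central differences."""
    a = to_cartesian(*curve(s - h))
    b = to_cartesian(*curve(s + h))
    return sum(((bi - ai) / (2 * h)) ** 2 for ai, bi in zip(a, b))

def warped_speed_sq(curve, s, h=1e-6):
    """Squared speed computed from dr^2 + r^2 (dtheta^2 + sin^2(theta) dphi^2)."""
    r, th, _ = curve(s)
    lo, hi = curve(s - h), curve(s + h)
    dr, dth, dph = ((b - a) / (2 * h) for a, b in zip(lo, hi))
    return dr ** 2 + r ** 2 * (dth ** 2 + math.sin(th) ** 2 * dph ** 2)

def sample_curve(s):
    # Arbitrary smooth test curve, given in spherical coordinates (r, theta, phi).
    return (1.0 + s * s, 0.5 + 0.3 * s, 2.0 * s)

difference = abs(euclidean_speed_sq(sample_curve, 0.7) - warped_speed_sq(sample_curve, 0.7))
```

The factor $r^2$ multiplying the sphere's line element is the square of the warping function $f(r) = r$ on the base $\mathbf{R}^+$.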

WARPED PRODUCT GEODESICS

In $B \times_f F$, as in any product manifold, a curve $\gamma$ can be written as $\gamma(s) = (\alpha(s), \beta(s))$ with $\alpha$ and $\beta$ the projections of $\gamma$ into $B$ and $F$, respectively.


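38. Proposition. A curve $\gamma = (\alpha, \beta)$ in $M = B \times_f F$ is a geodesic if and only if

(1) $\alpha'' = (\beta', \beta')\,(f \circ \alpha)\operatorname{grad} f$ in $B$,
(2) $\beta'' = \dfrac{-2}{f \circ \alpha}\,\dfrac{d(f \circ \alpha)}{ds}\,\beta'$ in $F$.

Proof. The result is local in character so it suffices to work in an arbitrarily small interval around say $s = 0$.

Case 1. $\gamma'(0)$ is neither horizontal nor vertical. Then $\alpha$ and $\beta$ are regular; hence by Exercise 3.12 we can suppose $\alpha$ is an integral curve of $X$ on $B$ and $\beta$ is an integral curve of $V$ on $F$. With $X$ and $V$ denoting also the lifts to $M$, $\gamma$ is an integral curve of $X + V$. Thus

$\gamma'' = D_{X+V}(X + V) = D_X X + D_X V + D_V X + D_V V.$

Evidently $\gamma'' = 0$ if and only if both $\tan \gamma'' = 0$ and $\operatorname{nor} \gamma'' = 0$. Using Proposition 35, the latter become

$\nabla_V V + 2(Xf/f)V = 0, \qquad D_X X - (\langle V, V\rangle/f)\operatorname{grad} f = 0$

(evaluated on $\gamma$). Since $\langle V, V\rangle = f^2(V, V)$ and $\alpha' = X$, $\beta' = V$, the result follows.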

Case 2. $\gamma'(0)$ is horizontal. If $\gamma$ is a geodesic, then since leaves are totally geodesic, $\gamma$ remains in $B \times \beta(0)$. Hence $\beta$ is constant and (1) and (2) are trivial. Conversely, if (2) holds, then since $\beta'(0) = 0$, it follows as in the remark below that $\beta$ is constant. Then (1) implies that $\alpha$ is a geodesic; hence so is $\gamma$.

Case 3. $\gamma'(0)$ is vertical and nonzero. We can suppose $\operatorname{grad} f \ne 0$ at $p = \alpha(0)$, for otherwise the fiber $p \times F$ is totally geodesic and the result follows as in Case 2. If $\gamma$ is geodesic, then on no interval around $0$ does $\gamma$ remain in the (totally umbilic) fiber $p \times F$. Hence there is a sequence $\{s_i\} \to 0$ such that for all $i$, $\gamma'(s_i)$ is neither horizontal nor vertical. Then (1) and (2) follow by continuity from Case 1. Conversely, (1) shows that $\alpha''(0) \ne 0$, hence there is a sequence $\{s_i\}$ as above and, again by Case 1, $\gamma$ is geodesic. ∎

39. Remark. Let $\gamma = (\alpha, \beta)$ be a geodesic in $M = B \times_f F$. By Exercise 3.19, equation (2) above implies that $\beta$ is a pregeodesic in $F$. Also the function $(f \circ \alpha)^4(\beta', \beta')$ is a constant $C$, since by (2) its derivative is zero. Thus (1) becomes

$\alpha'' = -\operatorname{grad} \phi,$

where $\phi = C/(2f^2)$. By reparametrization it can be assumed that $C/2$ is $-1$, $0$, or $+1$ depending on the causal character of $\beta$.

In the special case of a semi-Riemannian product the warping function is constant, so the geodesic equations in Proposition 38 reduce to $\alpha'' = 0$, $\beta'' = 0$. This proves Corollary 3.57(1). The second assertion in the corollary, dealing with completeness, has its strongest generalization in the Riemannian case, as follows.

40. Lemma. If $B$ and $F$ are complete Riemannian manifolds, then $M = B \times_f F$ is complete for every warping function $f$.

Proof. We use the metric completeness criterion from the Hopf-Rinow theorem (5.21). Note first that if $v$ is tangent to $M$ then, since $f > 0$ and $F$ is Riemannian, $\langle v, v\rangle \ge \langle d\pi v, d\pi v\rangle$. Hence $L(\alpha) \ge L(\pi \circ \alpha)$ for any curve segment; hence $d(m, m') \ge d(\pi m, \pi m')$ for all $m, m' \in M$. This property implies that, if $\{(p_i, q_i)\}$ is a Cauchy sequence in $M$, then $\{p_i\}$ is Cauchy in $B$. Since $B$ is complete, $\{p_i\}$ converges to some point $p \in B$. We can assume then that the sequence lies in some compact set $K$ in $B$; hence $f \ge c > 0$ on $K$. Then a variant of the argument above shows that $d(m, m') \ge c\,d(\sigma m, \sigma m')$ for $m, m' \in K \times F$. Now $\{q_i\}$ is Cauchy in $F$ and thus converges; so the original sequence converges and $M$ is complete. ∎

This result fails for indefinite metrics, even if, as in the following example, both $B$ and $F$ have definite metrics.

41. Example (Beem, Busemann). Let $M = \mathbf{R}^1_1 \times_{e^t} \mathbf{R}^1$. The geodesic equations in Proposition 38 reduce to the following numerical formulas:

$\alpha'' = -\beta'^2 e^{2\alpha}, \qquad \beta'' = -2\alpha'\beta'.$
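A candidate solution of these two equations can be tested numerically before it is trusted. The following sketch, with hypothetical helper names (`d1`, `d2`, `residuals`), checks the curve $s \mapsto (\ln s, 1/s)$ considered in this example by approximating derivatives with central differences; the residuals vanish only up to discretization and rounding error.

```python
import math

def alpha(s):
    # Base component of the candidate curve.
    return math.log(s)

def beta(s):
    # Fiber component of the candidate curve.
    return 1.0 / s

def d1(fn, s, h=1e-4):
    """First derivative by central differences."""
    return (fn(s + h) - fn(s - h)) / (2 * h)

def d2(fn, s, h=1e-4):
    """Second derivative by central differences."""
    return (fn(s + h) - 2 * fn(s) + fn(s - h)) / (h * h)

def residuals(s):
    """Left side minus right side of the two geodesic equations at parameter s."""
    a1, b1 = d1(alpha, s), d1(beta, s)
    r1 = d2(alpha, s) + b1 ** 2 * math.exp(2 * alpha(s))   # alpha'' + beta'^2 e^{2 alpha}
    r2 = d2(beta, s) + 2 * a1 * b1                         # beta''  + 2 alpha' beta'
    return r1, r2

worst = max(abs(r) for s in (0.5, 1.0, 2.0) for r in residuals(s))
```

The same check by hand: $\alpha'' = -1/s^2$ and $\beta'^2 e^{2\alpha} = s^{-4}\cdot s^2$, while $\beta'' = 2/s^3$ and $-2\alpha'\beta' = 2/s^3$.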

Then $\gamma(s) = (\ln s, 1/s)$ is a geodesic defined for $s > 0$. Evidently $\gamma$ is inextendible and incomplete, hence $M$ is incomplete. Reversing the metric of $M$ gives the same result for $\mathbf{R}^1 \times_{e^t} \mathbf{R}^1_1$.

CURVATURE OF WARPED PRODUCTS

The problem is to express the curvature of a warped product $M = B \times_f F$ in terms of its warping function $f$ and the curvatures of $B$ and $F$.


For a covariant tensor $A$ on $B$, its lift $\tilde{A}$ to $M$ is just its pullback $\pi^*(A)$ under the projection $\pi\colon M \to B$. In the case of a $(1, s)$ tensor $A\colon \mathfrak{X}(B) \times \cdots \times \mathfrak{X}(B) \to \mathfrak{X}(B)$, if $v_1, \dots, v_s \in T_{(p,q)}(M)$ define $\tilde{A}(v_1, \dots, v_s)$ to be the horizontal vector at $(p, q)$ that projects to $A(d\pi v_1, \dots, d\pi v_s)$ in $T_p(B)$. Thus in both cases $\tilde{A}$ is zero on vectors any one of which is vertical. These definitions involve no geometry, hence are correspondingly valid for lifts from $F$.

Let ${}^B\!R$ and ${}^F\!R$ be the lifts to $M$ of the Riemannian curvature tensors of $B$ and $F$. Since the projection $\pi$ is an isometry on each leaf, ${}^B\!R$ gives the Riemannian curvature of each leaf. The corresponding assertion holds for ${}^F\!R$, since the projection $\sigma$ is a homothety. Because leaves are totally geodesic, ${}^B\!R$ agrees with the curvature tensor $R$ of $M$ on horizontal vectors. This time the corresponding assertion fails for ${}^F\!R$ and $R$, since fibers are in general only umbilic.

If $h \in \mathfrak{F}(B)$, the lift to $M$ of the Hessian of $h$ is again denoted by $H^h$. (This agrees with the Hessian of the lift $h \circ \pi$ generally only on horizontal vectors.)

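42. Proposition. Let $M = B \times_f F$ be a warped product with Riemannian curvature tensor $R$. If $X, Y, Z \in \mathfrak{L}(B)$ and $U, V, W \in \mathfrak{L}(F)$, then

(1) $R_{XY}Z \in \mathfrak{L}(B)$ is the lift of ${}^B\!R_{XY}Z$ on $B$.
(2) $R_{VX}Y = (H^f(X, Y)/f)V$, where $H^f$ is the Hessian of $f$.
(3) $R_{XY}V = R_{VW}X = 0$.
(4) $R_{XV}W = (\langle V, W\rangle/f)\,D_X(\operatorname{grad} f)$.
(5) $R_{VW}U = {}^F\!R_{VW}U - (\langle \operatorname{grad} f, \operatorname{grad} f\rangle/f^2)\{\langle V, U\rangle W - \langle W, U\rangle V\}$.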

These are tensor equations, hence are valid as usual for individual tangent vectors.

Proof. (1) Proved above.
(2) Since $[V, X] = 0$, $R_{VX}Y = -D_V D_X Y + D_X D_V Y$. By Proposition 35,

$$D_X D_V Y = D_X((Yf/f)V) = X(Yf/f)\,V + (Yf/f)\,D_X V = [XYf/f + Yf\,X(1/f)]\,V + (Yf/f)(Xf/f)\,V.$$

But $X(1/f) = -Xf/f^2$, so this expression reduces to $(XYf/f)V$. Because $D_X Y \in \mathfrak{L}(B)$, $D_V(D_X Y) = ((D_X Y)f/f)V$. Thus

$$R_{VX}Y = [(XYf - (D_X Y)f)/f]\,V = (H^f(X, Y)/f)\,V.$$

(3) As usual we can assume that $[V, W] = 0$. Thus

$$R_{VW}X = -D_V D_W X + D_W D_V X.$$

Now

$$D_V D_W X = D_V((Xf/f)W) = V(Xf/f)\,W + (Xf/f)\,D_V W.$$

But $Xf/f$ is constant on fibers, so $V(Xf/f) = 0$. Thus

$$R_{VW}X = (Xf/f)(-D_V W + D_W V) = (Xf/f)[W, V] = 0.$$

Then by a symmetry of curvature, $\langle R_{XY}V, W\rangle = \langle R_{VW}X, Y\rangle = 0$. By (1), $\langle R_{XY}V, Z\rangle = -\langle R_{XY}Z, V\rangle = 0$. These equations hold for all $W \in \mathfrak{L}(F)$ and $Z \in \mathfrak{L}(B)$, hence $R_{XY}V = 0$.
(4) Note first that $R_{XV}W$ is horizontal, since $\langle R_{XV}W, U\rangle = \langle R_{WU}X, V\rangle$, which is zero by (3). Since $R_{VW}X = 0$, it follows from the symmetries of curvature that $R_{XV}W = R_{XW}V$. But using (2),

$$\langle R_{XV}W, Y\rangle = \langle R_{VX}Y, W\rangle = H^f(X, Y)\langle V, W\rangle/f = (\langle V, W\rangle/f)\langle D_X(\operatorname{grad} f), Y\rangle.$$

Since $R_{XV}W$ is horizontal and the equation holds for all Y, the result follows.
(5) $R_{VW}U$ is vertical since, by (3), $\langle R_{VW}U, X\rangle = -\langle R_{VW}X, U\rangle = 0$. Because the projection $\sigma$ is a homothety on fibers, ${}^F\!R_{VW}U \in \mathfrak{L}(F)$ is the application to V, W, U of the curvature tensor of each fiber. Thus ${}^F\!R_{VW}U$ and $R_{VW}U$ are related by the Gauss equation (4.5). Since the shape tensor of the fibers is given by $II(V, W) = -(\langle V, W\rangle/f)\operatorname{grad} f$, the result follows. ∎

Now consider the Ricci curvature Ric of a warped product, writing ${}^B\mathrm{Ric}$ for the lift (pullback by $\pi$) of the Ricci curvature of B, and similarly for ${}^F\mathrm{Ric}$.

43. Corollary. On a warped product $M = B \times_f F$ with $d = \dim F > 1$, let X, Y be horizontal and V, W vertical. Then

(1) $\mathrm{Ric}(X, Y) = {}^B\mathrm{Ric}(X, Y) - (d/f)H^f(X, Y)$.
(2) $\mathrm{Ric}(X, V) = 0$.
(3) $\mathrm{Ric}(V, W) = {}^F\mathrm{Ric}(V, W) - \langle V, W\rangle f^{\#}$, where

$$f^{\#} = \frac{\Delta f}{f} + (d - 1)\frac{\langle \operatorname{grad} f, \operatorname{grad} f\rangle}{f^2},$$

and $\Delta f = C(H^f)$ is the Laplacian on B.

The proof is an exercise in tensor computation: Apply the Ricci formula in Lemma 3.52 to a frame field on M whose vector fields are in $\mathfrak{L}(B)$ and $\mathfrak{L}(F)$.
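As a sanity check on Corollary 43, the following Python sketch evaluates its formulas in a special case. The assumptions are those of the sketch, not of the corollary: the base B is an interval with metric $dt^2$, so that ${}^B\mathrm{Ric} = 0$, $H^f(\partial_t, \partial_t) = \Delta f = f''$ and $\langle \operatorname{grad} f, \operatorname{grad} f\rangle = f'^2$; and the fiber F has constant curvature k, so that ${}^F\mathrm{Ric}(V, W) = (d-1)k\langle V, W\rangle/f^2$. The function name and its arguments are illustrative only.

```python
import math

def warped_ricci(f, df, d2f, d, k):
    """Ricci data of B x_f F from Corollary 43, for B an interval with metric dt^2
    and F a d-dimensional fiber of constant curvature k (assumptions of this sketch).

    f, df, d2f are the values f(t), f'(t), f''(t) at one point of B.
    Returns (Ric(dt, dt), c), where Ric(V, W) = c * <V, W> for vertical V, W.
    """
    # (1): Ric(X, Y) = BRic(X, Y) - (d/f) H^f(X, Y), with BRic = 0 and H^f(dt, dt) = f''.
    ric_tt = -d * d2f / f
    # f# = (Laplacian f)/f + (d - 1) <grad f, grad f>/f^2, with Laplacian f = f''.
    f_sharp = d2f / f + (d - 1) * df ** 2 / f ** 2
    # (3): Ric(V, W) = FRic(V, W) - <V, W> f#, with FRic(V, W) = (d - 1) k <V, W>/f^2.
    c_vertical = (d - 1) * k / f ** 2 - f_sharp
    return ric_tt, c_vertical

t, d = 0.7, 3

# Unit sphere S^(d+1) minus its poles, written as (0, pi) x_sin S^d: Ric = d * metric.
sphere = warped_ricci(math.sin(t), math.cos(t), -math.sin(t), d, 1.0)

# Hyperbolic space H^(d+1), written as R x_exp R^d with flat fiber: Ric = -d * metric.
hyperbolic = warped_ricci(math.exp(t), math.exp(t), math.exp(t), d, 0.0)
```

In both cases the horizontal and vertical values agree, as they must for a space of constant curvature, where Ric is a constant multiple of the metric.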


SEMI-RIEMANNIAN SUBMERSIONS

The notion of warped product can be generalized as follows.

44. Definition. A semi-Riemannian submersion $\pi: M \to B$ is a submersion (1.38) of semi-Riemannian manifolds such that:

(S1) The fibers $\pi^{-1}(b)$, $b \in B$, are semi-Riemannian submanifolds of M.
(S2) $d\pi$ preserves scalar products of vectors normal to fibers.

Since the fibers of a submersion are smooth submanifolds, (S1) is automatically true if M is Riemannian. Consistent with previous usage, vectors tangent to fibers are vertical, those normal to fibers are horizontal. At each point $p \in \pi^{-1}(b) \subset M$, $\mathcal{H}$ and $\mathcal{V}$ denote the orthogonal projections of $T_p(M)$ on its subspaces

$$\mathcal{H}_p = T_p(\pi^{-1}b)^{\perp} \quad\text{and}\quad \mathcal{V}_p = T_p(\pi^{-1}b),$$

respectively. Because $\pi$ is a submersion, $d\pi$ gives a linear isomorphism $\mathcal{H}_p \cong T_b(B)$; (S2) asserts that it is a linear isometry. Just as for a product manifold, each vector field X on B has a unique horizontal lift $\tilde X$ on M: $\tilde X$ is horizontal and $\pi$-related to X.

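45. Lemma. If $X, Y \in \mathfrak{X}(B)$, then:
(1) $\langle \tilde X, \tilde Y\rangle = \langle X, Y\rangle \circ \pi$;
(2) $\mathcal{H}[\tilde X, \tilde Y] = [X, Y]^{\sim}$;
(3) $\mathcal{H}D_{\tilde X}\tilde Y = (D_X Y)^{\sim}$, where D is the Levi-Civita connection of M.

Proof. (1) is immediate from (S2). For (2) note that by Lemma 1.22, $[\tilde X, \tilde Y]$ is $\pi$-related to $[X, Y]$, hence $\mathcal{H}[\tilde X, \tilde Y]$ is also. (3) holds if both sides have the same scalar product with every horizontal vector field, or merely with every horizontal lift $\tilde Z$. Thus by (1) it suffices to prove $\langle D_{\tilde X}\tilde Y, \tilde Z\rangle = \langle D_X Y, Z\rangle \circ \pi$. This follows by expanding both sides in the Koszul formula, since, using (1) and (2),

$$\tilde X\langle \tilde Y, \tilde Z\rangle = \tilde X\{\langle Y, Z\rangle \circ \pi\} = (d\pi \tilde X)\langle Y, Z\rangle = X\langle Y, Z\rangle \circ \pi,$$
$$\langle \tilde X, [\tilde Y, \tilde Z]\rangle = \langle \tilde X, [Y, Z]^{\sim}\rangle = \langle X, [Y, Z]\rangle \circ \pi.$$ ∎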

46. Corollary. Under a semi-Riemannian submersion $\pi: M \to B$, horizontal geodesics in M map to geodesics in B.

Proof. If $\gamma$ is a nonconstant horizontal geodesic, then $\pi \circ \gamma$ is regular, hence is (locally) an integral curve of a vector field X. But $\gamma$ is an integral curve of the lift $\tilde X$, and hence by the lemma,

$$(\pi \circ \gamma)'' = D_X X = d\pi(\mathcal{H}D_{\tilde X}\tilde X) = d\pi(D_{\tilde X}\tilde X) = d\pi(\gamma'') = 0.$$ ∎


The preceding results are trivial for warped products since leaves are totally geodesic and project isometrically. However, for a semi-Riemannian submersion $\pi: M \to B$ with $\dim B \ge 2$, leaves need not exist, even locally. In view of Frobenius' theorem [W], this failure can be measured by the function assigning to each pair of horizontal vector fields X, Y the vertical vector field $\mathcal{V}[X, Y]$. This function is $\mathfrak{F}(M)$-bilinear and in the case of a warped product is identically zero.

47. Theorem. Let $\pi: M \to B$ be a semi-Riemannian submersion. If horizontal vector fields X, Y on M span nondegenerate planes, then

$$K_B(d\pi X, d\pi Y) = K_M(X, Y) + \tfrac{3}{4}\langle \mathcal{V}[X, Y], \mathcal{V}[X, Y]\rangle / Q(X, Y).$$

The proof is a computation much like those of the preceding section. (See Theorem 3.20 of [CE] or for further details [O2].)

Exercises

1. A local diffeomorphism $\phi: M \to N$ is a covering map (a) if each $\phi^{-1}(q)$, $q \in N$, contains the same finite number of points, or (b) if M is compact and N is connected.

2. Let $k: \tilde M \to M$ be a double covering with M connected. (a) The map $\mu$ reversing the (two) points in each $k^{-1}(m)$ is a deck transformation, in fact the only nontrivial one. (b) The covering is trivial if and only if $\tilde M$ is not connected.

3. Let $k: \tilde M \to M$ be a normal covering. (a) Show that M is orientable if and only if $\tilde M$ has an orientation preserved by all deck transformations. (b) Prove the analogue for time-orientation and a Lorentz covering.

4. Let $k: \tilde M \to M$ be a smooth covering. (a) If $\phi: P \to \tilde M$ is continuous and $k \circ \phi$ is smooth, then $\phi$ is smooth. (b) If $\psi: M \to Q$ is continuous and $\psi \circ k$ is smooth, then $\psi$ is smooth. (Corresponding arguments prove the semi-Riemannian analogues, in which smooth is replaced by locally isometric.)

5. If V is a vector field, then in terms of coordinates,

$$\operatorname{div} V = \frac{1}{\sqrt{|g|}} \sum_i \frac{\partial}{\partial x^i}\left(\sqrt{|g|}\, V^i\right).$$

Hence for the corresponding formula for $\Delta f$, replace $V^i$ by $\sum_j g^{ij}\,\partial f/\partial x^j$. (Hint: For a local volume element, $\omega(\partial_1, \ldots, \partial_n) = \pm\sqrt{|g|}$.)

6. Let $\Gamma$ be a group of diffeomorphisms on a manifold M. (a) If (PD1) of Definition 6 holds, then each orbit of $\Gamma$ is a closed set of M and each $p \in M$ has a neighborhood $\mathcal{U}$ such that the sets $\phi(\mathcal{U})$, $\phi \in \Gamma$, are mutually disjoint. (b) If (PD1) holds and M is compact, then $\Gamma$ is finite. (c) If $\Gamma$ is finite and no element $\ne$ id has a fixed point, then $\Gamma$ is properly discontinuous.

7. (a) If $\Gamma$ is a group of isometries of a Riemannian manifold, then (PD1) of Definition 6 implies (PD2). (b) Let $M = \mathbf{R}^2 - 0$ with line element $2\,dx\,dy$. The group $\Gamma$ generated by the isometry $\phi(x, y) = (x/2, 2y)$ satisfies (PD1) but not (PD2).

8. Definitions of orientation. The following conditions on $M^n$ are equivalent: (i) M is orientable (Definition 1.41); (ii) M has an orientation A; (iii) there is a nonvanishing n-form on M; (iv) every loop in M is orientation-preserving, that is, lifts as a loop into the orientation covering A.

9. (a) A product manifold $M \times N$ is orientable if and only if both M and N are. (b) Let $(E, \pi)$ be a vector bundle over M. If any two of M, E, $(E, \pi)$ are orientable, then so is the third. (c) A submanifold P of an orientable manifold M is orientable if and only if its normal bundle is orientable. (d) If M is orientable and q is a regular value of $\phi: M \to N$, then $\phi^{-1}(q)$ is orientable.

10. (a) For any manifold M the manifold TM is orientable. (b) A manifold with trivial tangent bundle is orientable. (c) Lie groups are orientable.

11. A product map $\phi \times \psi$ of a warped product $B \times_f F$ is an isometry if and only if $\psi: F \to F$ is an isometry and $\phi: B \to B$ is an isometry such that $f \circ \phi = f$.

12. (a) $P^n_\nu(r) = S^n_\nu(r)/\pm 1$ has curvature $K = 1/r^2$ and is complete and geodesically connected; it is orientable if and only if n is odd. (b) The same properties hold for $H^n_\nu(r)/\pm 1$, except $K = -1/r^2$. (c) $P^n_1(r)$ is not time-orientable, but $H^n_1/\pm 1$ is.

13. (a) In Corollary 43, $f^{\#} = \Delta(f^d)/(d f^d)$. (b) The scalar curvature of $B \times_f F$ is

$$S = {}^B\!S + {}^F\!S/f^2 - 2d\,\Delta f/f - d(d - 1)\langle \operatorname{grad} f, \operatorname{grad} f\rangle/f^2,$$

where $d = \dim F$.

14. Express $S^n_\nu$ as a warped product with fiber $S^{n-\nu}$ and base $H^\nu$ with metric reversed.

8

SYMMETRY AND CONSTANT CURVATURE

A natural condition to impose on a semi-Riemannian manifold is that its curvature tensor R be parallel, that is, have vanishing covariant differential, DR = 0. Such a manifold is said to be locally symmetric. In particular, manifolds of constant curvature turn out to be locally symmetric. A fundamental property of curvature is its control over the relative behavior of nearby geodesics. Because a normal neighborhood $\mathcal{U}$ is filled with radial geodesics, curvature thereby gives a description of the geometry of $\mathcal{U}$. Considering only the locally symmetric case, we show that this description is so accurate that, if $\mathcal{U}$ and $\mathcal{U}'$ are normal neighborhoods with the same description (and same dimension and index), then $\mathcal{U}$ and $\mathcal{U}'$ are isometric. Using covering techniques this local result is given a global formulation that as a first consequence provides a list of all complete, simply connected manifolds of constant curvature.

JACOBI FIELDS

A curve can be compared with nearby curves using the following notion.

1. Definition. A variation of a curve segment $\alpha: [a, b] \to M$ is a two-parameter mapping

$$x: [a, b] \times (-\delta, \delta) \to M,$$

such that $\alpha(u) = x(u, 0)$ for all $a \le u \le b$.


The u-parameter curves of a variation are called longitudinal and the v-parameter curves transverse. The base curve of x is $\alpha$. Typically we are interested in the longitudinal curves and the number $\delta > 0$ is not important. The vector field V on $\alpha$ given by $V(u) = x_v(u, 0)$ is called the variation vector field of x. Each $V(u)$ is the initial velocity of the transverse curve $v \to x(u, v)$; thus, for $\delta > 0$ sufficiently small, the vector field V is an infinitesimal model of the variation x. If every longitudinal curve of x is geodesic, x is called a geodesic variation or one-parameter family of geodesics.

2. Definition. If $\gamma$ is a geodesic, a vector field Y on $\gamma$ that satisfies the Jacobi differential equation $Y'' = R_{Y\gamma'}(\gamma')$ is called a Jacobi vector field.

3. Lemma. The variation vector field of a geodesic variation is a Jacobi field.

Proof. Since each longitudinal curve is geodesic, $x_{uu} = 0$. Thus by Proposition 4.44,

$$x_{vuu} = x_{uvu} = x_{uuv} + R(x_v, x_u)x_u = R(x_v, x_u)x_u.$$

Hence $x_v$ satisfies the Jacobi equation on every longitudinal curve, in particular on the base curve, where $x_v$ is the variation vector field. ∎

Because of this result the Jacobi equation is also called the equation of geodesic deviation. There is a far-reaching heuristic interpretation. If we think of a geodesic variation x of $\gamma$ as a one-parameter family of freely falling particles, then the variation vector field V gives the position, relative to $\gamma$, of arbitrarily nearby particles. Thus the derivative $V'$ gives relative velocity, and $V''$ relative acceleration. Assigning these particles unit mass we can read the Jacobi equation $V'' = R_{V\gamma'}\gamma'$ as Newton's second law with the curvature vector $R_{V\gamma'}\gamma'$ in the role of force, the so-called tidal force. We shall see in Chapter 12 that this is the key to the interpretation of curvature as gravitation in general relativity. In the following example, attracting tidal forces pull radiating geodesics back together again.

4. Example. In the sphere $S^2(r)$ consider the variation

$$x(u, v) = r(\cos u \cos v, \cos u \sin v, \sin u)$$

for $-\pi/2 \le u \le \pi/2$ and $|v|$ small. The curves v constant are indeed longitudinal, since they parametrize semicircles joining the north and south poles $(0, 0, \pm r)$. Thus x is a geodesic variation of $\gamma(u) = r(\cos u, 0, \sin u)$. The variation vector field is $V(u) = x_v(u, 0) = r \cos u\, \partial_y$. Since $\partial_y$ is parallel in $\mathbf{R}^3$ and tangent to the sphere it is parallel in the geometry of the sphere. Hence $V'' = -r \cos u\, \partial_y = -V$. Since $S^2(r)$ has constant curvature $K = 1/r^2$ and $\gamma' = -r \sin u\, \partial_x + r \cos u\, \partial_z$, we compute

$$R_{V\gamma'}(\gamma') = (1/r^2)[\langle V, \gamma'\rangle\gamma' - \langle \gamma', \gamma'\rangle V] = -V.$$

Thus, as predicted by the lemma, V is a Jacobi field.

5. Lemma. Let $\gamma$ be a geodesic with $\gamma(0) = p$, and let $v, w \in T_p(M)$. Then there is a unique Jacobi field Y on $\gamma$ such that $Y(0) = v$ and $Y'(0) = w$.

Proof. Let $E_1, \ldots, E_n$ be a parallel frame field on $\gamma$, and write $Y = \sum y^i E_i$. Let $v^i$ and $w^i$ ($1 \le i \le n$) be the coordinates of v and w relative to $E_1(0), \ldots, E_n(0)$. Then

$$Y(0) = v \iff y^i(0) = v^i; \qquad Y'(0) = w \iff (dy^i/ds)(0) = w^i.$$

Because $\gamma'$ is parallel, $\gamma' = \sum a^i E_i$ with constant coefficients. Thus the Jacobi equation $Y'' = R_{Y\gamma'}(\gamma')$ is equivalent to the linear system

$$\frac{d^2 y^m}{ds^2} = \sum_{i,j,k} R^m_{ijk}\, a^i a^k\, y^j \qquad (1 \le m \le n),$$

where the smooth coefficient functions are uniquely determined by

$$R_{E_j E_k}(E_i) = \sum_m R^m_{ijk} E_m \qquad (1 \le i, j, k \le n).$$

Such linear systems have smooth solutions (on the entire domain of $\gamma$) that are uniquely determined by initial conditions as above. ∎

Because the Jacobi equation is linear, the set of all Jacobi fields on $\gamma$ forms a real vector space. The lemma shows that its dimension is 2n, since v and w can be chosen arbitrarily in the n-dimensional space $T_p(M)$. Exponential maps are defined in terms of radial geodesics; thus the Jacobi influence of curvature on geodesics leads to a Jacobi description of exponential maps.
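Lemma 5 reduces Jacobi fields to a linear second-order system with prescribed initial data, which is easy to integrate numerically. The following is a minimal sketch under assumptions not made in the lemma: the manifold is Riemannian with constant curvature C and the geodesic has unit speed. In that case the component y of a Jacobi field along a parallel unit field perpendicular to $\gamma'$ satisfies the scalar equation $y'' = -Cy$, since $R_{Y\gamma'}\gamma' = -CY$ for such Y. The integrator and all names below are illustrative.

```python
import math

def jacobi_component(C, y0, dy0, s_end, steps=2000):
    """Integrate y'' = -C*y on [0, s_end] with y(0) = y0, y'(0) = dy0 by classical RK4."""
    h = s_end / steps
    y, v = y0, dy0
    for _ in range(steps):
        # First-order form of the equation: y' = v, v' = -C*y.
        k1y, k1v = v, -C * y
        k2y, k2v = v + 0.5 * h * k1v, -C * (y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, -C * (y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, -C * (y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return y, v

# Initial data y(0) = 0, y'(0) = 1, as for geodesics radiating from a single point.
on_sphere, _ = jacobi_component(1.0, 0.0, 1.0, math.pi)    # C = 1: y(s) = sin s
on_plane, _ = jacobi_component(0.0, 0.0, 1.0, math.pi)     # C = 0: y(s) = s
on_hyperbolic, _ = jacobi_component(-1.0, 0.0, 1.0, 1.0)   # C = -1: y(s) = sinh s
```

The vanishing of y at $s = \pi$ for C = 1 reflects the refocusing of geodesics on the unit sphere at the antipodal point, as in Example 4.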

6. Proposition. Let o be a point of M and let $x \in T_o(M)$. For $v_x \in T_x(T_o M)$,

$$d\exp_o(v_x) = V(1),$$

where V is the unique Jacobi field on the geodesic $\gamma_x$ such that $V(0) = 0$ and $V'(0) = v \in T_o(M)$.


Proof. As in the proof of the Gauss lemma (5.1) consider the two-parameter map $\tilde x(t, s) = t(x + sv)$ in $T_o(M)$, with $0 \le t \le 1$ and $|s|$ small. Its exponential image in M,

$$x(t, s) = \exp_o(t(x + sv)) = \gamma_{x+sv}(t),$$

is a geodesic variation of $\gamma_x|[0, 1]$. Thus the variation vector field

$$V(t) = x_s(t, 0) = d\exp_o(\tilde x_s(t, 0))$$

is a Jacobi field on $\gamma_x$. Since $\tilde x(1, s) = x + sv$, $V(1) = d\exp_o(v_x)$. The curve $s \to x(0, s)$ is constant at o, hence $V(0) = x_s(0, 0) = 0$. By Proposition 4.44,

$$V'(0) = x_{st}(0, 0) = x_{ts}(0, 0).$$

Now $s \to x_t(0, s) = x + sv$ is a vector field on the constant curve at o. Hence, by Proposition 3.18, $x_{ts}(0, s) = v$ for all s. In particular, $V'(0) = v$. ∎

This result provides a strong link between M and its curvature tensor.

TIDAL FORCES

A vector field Y on a curve $\alpha: I \to M$ is tangent to $\alpha$ if $Y = f\alpha'$ for some $f \in \mathfrak{F}(I)$ and perpendicular to $\alpha$ if $\langle Y, \alpha'\rangle = 0$. We consider how these notions relate to Jacobi fields. If $|\alpha'| > 0$, then each tangent space $T_{\alpha(s)}(M)$ has a direct sum decomposition $\mathbf{R}\alpha' + \alpha'^{\perp}$. Hence each vector field Y on $\alpha$ has a unique expression $Y = Y^T + Y^{\perp}$, where $Y^T$ is tangent to $\alpha$ and $Y^{\perp}$ is perpendicular to $\alpha$. If $\gamma$ is a geodesic, then $Y \perp \gamma$ implies $Y' \perp \gamma$, since $d/ds\,\langle Y, \gamma'\rangle = \langle Y', \gamma'\rangle$; similarly for tangency. Then, if $\gamma$ is also nonnull, it follows that $(Y')^T = (Y^T)'$ and $(Y')^{\perp} = (Y^{\perp})'$.

7. Lemma. Let Y be a vector field on a geodesic $\gamma$.
(1) If Y is tangent to $\gamma$ then it is a Jacobi field $\iff$ $Y'' = 0$ $\iff$ $Y(s) = (as + b)\gamma'(s)$ for all s.
(2) If Y is a Jacobi field, then $Y \perp \gamma$ $\iff$ there exist $a \ne b$ such that $Y(a) \perp \gamma$, $Y(b) \perp \gamma$ $\iff$ there exists a such that $Y(a) \perp \gamma$, $Y'(a) \perp \gamma$.
(3) If $\gamma$ is nonnull, then Y is a Jacobi field $\iff$ both $Y^T$ and $Y^{\perp}$ are Jacobi fields.

Proof. (1) Since $R(v, v) = 0$, the Jacobi equation for $f\gamma'$ is equivalent to $d^2f/ds^2 = 0$.
(2) Again since $R(v, v) = 0$, the second derivative of $\langle Y, \gamma'\rangle$ is zero, hence $\langle Y(s), \gamma'(s)\rangle = A + sB$, and the result follows.
(3) $R(Y^T, \gamma') = 0$, hence $R(Y^{\perp}, \gamma') = R(Y, \gamma')$. Also, since $R(Y, \gamma')$ is skew-adjoint, $R(Y, \gamma')\gamma' \perp \gamma$. Thus the Jacobi equation $Y'' = R(Y, \gamma')\gamma'$ splits into the two equations $(Y^T)'' = 0$ and $(Y^{\perp})'' = R(Y^{\perp}, \gamma')\gamma'$. The result follows using (1). ∎

A Jacobi field tangent to a geodesic $\gamma$ is of scant importance since it is the infinitesimal model for a family of geodesics that merely reparametrize $\gamma$. Thus in considering the Jacobi equation $Y'' = R_{Y\gamma'}\gamma'$ as relative acceleration produced by tidal force we shall emphasize the case $Y \perp \gamma$.

8. Definition. For a vector $0 \ne v \in T_p(M)$ the tidal force operator $F_v: v^{\perp} \to v^{\perp}$ is given by $F_v(y) = R_{yv}v$.

9. Lemma. $F_v$ is a self-adjoint linear operator on $v^{\perp}$, and trace $F_v = -\mathrm{Ric}(v, v)$.

Proof. Clearly $F_v$ is linear (and carries $v^{\perp}$ to itself). The pair symmetry of curvature implies that $F_v$ is self-adjoint, but note that $v^{\perp}$ is a degenerate subspace if v is a null vector.

If v is nonnull, let $e_2, \ldots, e_n$ be an orthonormal basis for $v^{\perp}$. Then

$$\mathrm{Ric}(v, v) = \sum \varepsilon_i\langle R_{v e_i}v, e_i\rangle = -\sum \varepsilon_i\langle F_v(e_i), e_i\rangle = -\operatorname{trace} F_v.$$

If v is null, let w be a null vector such that $\langle v, w\rangle = -1$; these two vectors span a Lorentz plane $\Pi$. Then $e_1 = (v + w)/\sqrt{2}$ and $e_2 = (v - w)/\sqrt{2}$ form an orthonormal basis for $\Pi$ with $e_1$ timelike and $e_2$ spacelike. Let $e_3, \ldots, e_n$ be an orthonormal basis for $\Pi^{\perp} \subset v^{\perp}$. Then

$$\mathrm{Ric}(v, v) = -\langle R_{v e_1}v, e_1\rangle + \langle R_{v e_2}v, e_2\rangle + \sum_{j>2} \varepsilon_j\langle R_{v e_j}v, e_j\rangle.$$

The first two terms on the right-hand side cancel since $\langle R_{v e_1}v, e_1\rangle = \tfrac{1}{2}\langle R_{vw}v, w\rangle = \langle R_{v e_2}v, e_2\rangle$. Now $v, e_3, \ldots, e_n$ is a basis for $v^{\perp}$, and $F_v(v) = 0$. Hence

$$\operatorname{trace} F_v = \sum_{j>2} \varepsilon_j\langle F_v(e_j), e_j\rangle = -\sum_{j>2} \varepsilon_j\langle R_{v e_j}v, e_j\rangle.$$ ∎
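The identity of Lemma 9 can be checked numerically in a simple case. The sketch below assumes a tangent space of Minkowski signature with orthonormal basis $e_0, \ldots, e_3$ and a curvature tensor of constant curvature C, using the book's sign convention $R_{xy}z = C\{\langle z, x\rangle y - \langle z, y\rangle x\}$; the signature, the value of C, and all function names are choices made for this illustration.

```python
SIGNS = (-1.0, 1.0, 1.0, 1.0)  # <e_i, e_i> for an orthonormal basis (Minkowski signature)
C = 0.25                       # assumed constant curvature

def ip(x, y):
    """Scalar product of two vectors given by components in the orthonormal basis."""
    return sum(s * a * b for s, a, b in zip(SIGNS, x, y))

def R(x, y, z):
    """Constant-curvature tensor R_{xy} z = C(<z, x> y - <z, y> x)."""
    zx, zy = ip(z, x), ip(z, y)
    return tuple(C * (zx * b - zy * a) for a, b in zip(x, y))

def basis(i):
    return tuple(1.0 if j == i else 0.0 for j in range(len(SIGNS)))

def ricci(v):
    """Ric(v, v) = sum_i eps_i <R_{v e_i} v, e_i> over the orthonormal basis."""
    return sum(SIGNS[i] * ip(R(v, basis(i), v), basis(i)) for i in range(len(SIGNS)))

def tidal_trace(i0):
    """Trace of F_v(y) = R_{yv} v on v-perp for v = e_{i0}; the e_i with i != i0 span v-perp."""
    v = basis(i0)
    return sum(SIGNS[i] * ip(R(basis(i), v, v), basis(i))
               for i in range(len(SIGNS)) if i != i0)

timelike = (tidal_trace(0), ricci(basis(0)))   # v = e_0, <v, v> = -1
spacelike = (tidal_trace(1), ricci(basis(1)))  # v = e_1, <v, v> = +1
```

For the unit timelike vector the trace is 3C and for the unit spacelike vector it is -3C, in each case the negative of Ric(v, v).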

In the nonnull case tidal forces are often normalized by taking v to be a unit vector. For example, if M has constant curvature C, then Corollary 3.43 gives $F_v(y) = -\varepsilon C y$, where $\varepsilon = \langle v, v\rangle = \pm 1$.

LOCALLY SYMMETRIC MANIFOLDS

10. Proposition. The following conditions on a semi-Riemannian manifold M are equivalent:
(1) DR = 0, that is, M is locally symmetric.
(2) If X, Y, Z are parallel vector fields on a curve $\alpha$, then the vector field $R_{XY}Z$ on $\alpha$ is also parallel.
(3) Sectional curvature is invariant under parallel translation, that is, the sectional curvature of a nondegenerate tangent plane $\Pi$ remains constant as $\Pi$ is parallel translated along any curve $\alpha$.

Proof. For simplicity assume that the curve $\alpha$ is regular; this will be sufficient for later work.

(1) $\Rightarrow$ (2). Fix an arbitrary point on $\alpha$, say $\alpha(0)$. By Exercise 3.12 there exist vector fields $V, \bar X, \bar Y, \bar Z$ on a neighborhood of $\alpha(0)$ in M such that $V_{\alpha(t)} = \alpha'(t)$, $\bar X_{\alpha(t)} = X(t)$, and similarly for $\bar Y$ and $\bar Z$. Since DR = 0,

$$0 = (D_V R)_{\bar X \bar Y}\bar Z = D_V(R(\bar X, \bar Y)\bar Z) - R(D_V \bar X, \bar Y)\bar Z - R(\bar X, D_V \bar Y)\bar Z - R(\bar X, \bar Y)(D_V \bar Z).$$

Now evaluate at $\alpha(0)$. There, for example,

$$(D_V \bar Y)_{\alpha(0)} = D_{\alpha'(0)}\bar Y = Y'(0) = 0,$$

since Y is parallel. Thus the equation above reduces to

$$(R_{XY}Z)'(0) = D_{\alpha'(0)}(R_{\bar X \bar Y}\bar Z) = 0.$$

(2) $\Rightarrow$ (1). If $v, x, y, z \in T_p(M)$, let X, Y, Z be the parallel vector fields on $\gamma_v$ gotten by parallel translation of x, y, z, respectively. Then

$$(D_v R)_{xy}z = (R_{XY}Z)'(0) = 0.$$

(2) $\Rightarrow$ (3). Let $\Pi(0)$ be a nondegenerate tangent plane at $\alpha(0)$, and let X, Y be parallel vector fields on $\alpha$ such that $X(0), Y(0)$ is a basis for $\Pi(0)$. Thus $X(t), Y(t)$ is a basis for the parallel translate $\Pi(t)$ of $\Pi(0)$ along $\alpha$. Then both $\langle R_{XY}X, Y\rangle$ and $Q(X, Y)$ are constant along $\alpha$, hence the sectional curvature of $\Pi(t)$ is constant.

(3) $\Rightarrow$ (2). Suppose $\alpha: I \to M$ starts at p. By orthonormal expansion it suffices to show that, if X, Y, Z, W are parallel vector fields on $\alpha$, then $\langle R_{XY}Z, W\rangle$ is constant. Fix $t \in I$ and define a function $A: T_p(M)^4 \to \mathbf{R}$ by

$$A(x, y, z, w) = \langle R_{XY}Z, W\rangle(t),$$

where X, Y, Z, W are parallel vector fields on $\alpha$ extending x, y, z, w. Since $Q(X, Y)$ is constant, (3) gives

$$A(x, y, x, y) = \langle R_{xy}x, y\rangle$$

for all $x, y \in T_p(M)$ spanning a nondegenerate tangent plane. It is easy to check that the function A is curvaturelike (page 79); hence by Corollary 3.42

$$A(x, y, z, w) = \langle R_{xy}z, w\rangle.$$

Thus $\langle R_{XY}Z, W\rangle(t)$ is independent of t. ∎


11. Corollary. A semi-Riemannian manifold of constant sectional curvature is locally symmetric.

This is obvious from criterion (3) above. Semi-Riemannian products of locally symmetric manifolds are again locally symmetric but (if nonflat) do not have constant curvature.

ISOMETRIES OF NORMAL NEIGHBORHOODS

Let M and $\bar M$ be semi-Riemannian manifolds of the same dimension and index and let $o \in M$ and $\bar o \in \bar M$. Our goal is to find conditions under which a given linear isometry $T_o(M) \to T_{\bar o}(\bar M)$ is the differential map of an isometry defined on some normal neighborhood of o.

12. Definition. Let $L: T_o(M) \to T_{\bar o}(\bar M)$ be a linear isometry, and let $\mathcal{U}$ be a normal neighborhood of o in M such that $\exp_{\bar o}$ is defined on the set $L(\exp_o^{-1}(\mathcal{U}))$. Then the mapping

$$\phi_L = \exp_{\bar o} \circ L \circ \exp_o^{-1}: \mathcal{U} \to \bar M$$

is called the polar map of L on $\mathcal{U}$.

In short, $\phi_L$ sends $\exp_o(v)$ to $\exp_{\bar o}(Lv)$ for all $v \in \tilde{\mathcal{U}} = \exp_o^{-1}(\mathcal{U}) \subset T_o(M)$. Polar maps always exist for $\mathcal{U}$ sufficiently small, and the first two properties below show that if the isometry we seek exists it must be a polar map of L.

13. Lemma. With notation as above,
(1) $\phi_L$ carries radial geodesics to radial geodesics; explicitly, if $v \in T_o(M)$, then $\phi_L \circ \gamma_v = \gamma_{Lv}$, where both sides are defined.
(2) The differential map of $\phi_L$ at o is L.
(3) If $\mathcal{U}$ is sufficiently small, then $\phi_L$ is a diffeomorphism onto a normal neighborhood of $\bar o$ in $\bar M$.
(4) If $\bar M$ is complete, $\phi_L$ is defined on every normal neighborhood of o.

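Proof. (1) Since $\gamma_v(t) = \exp_o(tv)$,

$$\phi_L(\gamma_v(t)) = \exp_{\bar o}(L(tv)) = \exp_{\bar o}(tL(v)) = \gamma_{Lv}(t)$$

for all t such that $\gamma_v(t)$ remains in $\mathcal{U}$.
(2) If $v \in T_o(M)$, then by (1)

$$d\phi_L(v) = d\phi_L(\gamma_v'(0)) = (\phi_L \circ \gamma_v)'(0) = \gamma_{Lv}'(0) = Lv.$$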


(3) If $\mathcal{U}$ is sufficiently small, $\phi_L(\mathcal{U})$ is contained in a normal neighborhood of $\bar o$. Then $\exp_{\bar o}$ is a diffeomorphism of $\tilde{\mathcal{V}} = L(\exp_o^{-1}\mathcal{U})$ onto a (normal) neighborhood $\mathcal{V}$ of $\bar o$. Thus $\phi_L: \mathcal{U} \to \mathcal{V}$ is a composition of diffeomorphisms. Finally, (4) is clear since $\exp_{\bar o}$ is defined on all of $T_{\bar o}(\bar M)$. ∎

If L is to be the differential map at o of an isometry, then, as we saw in Chapter 3, it must preserve curvature at o; that is,

$$L(R_{xy}z) = \bar R_{Lx,Ly}(Lz) \qquad \text{for } x, y, z \in T_o(M).$$

By Corollary 3.42 it is equivalent that L preserve sectional curvature: $K(\Pi) = \bar K(L\Pi)$ for nondegenerate planes in $T_o(M)$. This necessary condition is sufficient.

14. Theorem. Let M and $\bar M$ be locally symmetric semi-Riemannian manifolds, and let $L: T_o(M) \to T_{\bar o}(\bar M)$ be a linear isometry that preserves curvature. Then

(1) If $\mathcal{U}$ is a sufficiently small normal neighborhood of o, there is a unique isometry $\phi$ of $\mathcal{U}$ onto a normal neighborhood $\mathcal{V}$ of $\bar o$ such that $d\phi_o = L$.
(2) If $\bar M$ is complete then for any normal neighborhood $\mathcal{U}$ of o, there is a unique local isometry $\phi: \mathcal{U} \to \bar M$ such that $d\phi_o = L$.

Proof. In both cases uniqueness follows from Proposition 3.62. By the preceding lemma, existence in both cases follows if every polar map $\phi = \phi_L: \mathcal{U} \to \bar M$ is a local isometry. Thus for $v \in T_p(M)$, $p \in \mathcal{U}$, we must prove that $\langle d\phi v, d\phi v\rangle = \langle v, v\rangle$. The idea is to use Proposition 6 and show that corresponding Jacobi fields grow at the same rate in both manifolds.

(1) Let $\tilde{\mathcal{U}}$ be the neighborhood in $T_o(M)$ corresponding to the normal neighborhood $\mathcal{U}$. There is a unique $x \in \tilde{\mathcal{U}}$ and $y_x \in T_x(T_o M)$ such that $d\exp_o(y_x) = v$. By Proposition 6, $\langle v, v\rangle = \langle Y(1), Y(1)\rangle$, where Y is the Jacobi field on $\gamma_x$ such that $Y(0) = 0$ and $Y'(0) = y \in T_o(M)$.

(2) Now look at the corresponding situation in $\bar M$. Since L is linear, $dL(y_x) = (Ly)_{Lx}$. Hence by the definition of the polar map $\phi$,

$$d\phi(v) = d\exp_{\bar o}((Ly)_{Lx}).$$

Thus, as above, $\langle d\phi v, d\phi v\rangle = \langle \bar Y(1), \bar Y(1)\rangle$, where $\bar Y$ is the unique Jacobi field on $\gamma_{Lx}$ such that $\bar Y(0) = 0$ and $\bar Y'(0) = Ly \in T_{\bar o}(\bar M)$.

(3) Let $E_1, \ldots, E_n$ and $\bar E_1, \ldots, \bar E_n$ be parallel frame fields on $\gamma_x$ and $\gamma_{Lx}$ respectively, such that $L(E_i(0)) = \bar E_i(0)$ for all i. Since L is a linear isometry, the coordinates of x and y relative to the $E_i(0)$s are the same as the coordinates of Lx and Ly relative to the $\bar E_i(0)$s. If we write $Y = \sum y^i E_i$, then the functions $y^1, \ldots, y^n$ satisfy the system of differential equations in the proof of Lemma 5 as well as the initial conditions corresponding to $Y(0) = 0$ and $Y'(0) = y$. Over in $\bar M$, writing $\bar Y = \sum \bar y^i \bar E_i$ gives the corresponding differential equations

$$\frac{d^2 \bar y^m}{ds^2} = \sum_{i,j,k} \bar R^m_{ijk}\, a^i a^k\, \bar y^j \qquad (1 \le m \le n).$$

Furthermore, by a remark above, the functions $y^1, \ldots, y^n$ and $\bar y^1, \ldots, \bar y^n$ satisfy exactly the same initial conditions.

(4) We assert that $R^m_{ijk} = \bar R^m_{ijk}$ on the common domain I of $\gamma_x$ and $\gamma_{Lx}$ (keeping the former in $\mathcal{U}$). In fact, since the linear isometry L preserves curvature, it follows from our choice of corresponding frame fields that $R^m_{ijk}(0) = \bar R^m_{ijk}(0)$. Since M is locally symmetric, Proposition 10 shows that the functions $\langle R_{E_j E_k}E_i, E_m\rangle$ and hence $R^m_{ijk}$ are constant on I. Similarly $\bar R^m_{ijk}$ is constant, giving the result.

(5) The functions $y^i$ and $\bar y^i$ ($1 \le i \le n$) satisfy the same system of differential equations and the same initial conditions. By the uniqueness of such solutions, $y^i = \bar y^i$ for all i; hence

$$\langle d\phi v, d\phi v\rangle = \langle \bar Y(1), \bar Y(1)\rangle = \sum \varepsilon_i (y^i(1))^2 = \langle Y(1), Y(1)\rangle = \langle v, v\rangle.$$ ∎
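As a concrete illustration of the polar map of Definition 12 and of Theorem 14, the following sketch works on the unit sphere $S^2 \subset \mathbf{R}^3$, which has constant curvature and so is locally symmetric. Taking $M = \bar M = S^2$ and $o = \bar o$ the north pole, with L a rotation of the tangent plane at o, the polar map should be an isometry; in this case it is the restriction of a rotation of the sphere about the axis through o. The closed formulas for the exponential map of the sphere and all names below are assumptions of this example.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def exp_map(p, v):
    """Exponential map of the unit sphere at p, applied to a tangent vector v at p."""
    n = math.sqrt(dot(v, v))
    if n == 0.0:
        return tuple(p)
    return tuple(math.cos(n) * pc + math.sin(n) * vc / n for pc, vc in zip(p, v))

def log_map(p, q):
    """Inverse of exp_map at p on the normal neighborhood S^2 minus the antipode -p."""
    c = max(-1.0, min(1.0, dot(p, q)))
    w = tuple(qc - c * pc for pc, qc in zip(p, q))  # part of q perpendicular to p
    wn = math.sqrt(dot(w, w))
    if wn == 0.0:
        return (0.0, 0.0, 0.0)
    theta = math.acos(c)
    return tuple(theta * wc / wn for wc in w)

O = (0.0, 0.0, 1.0)  # base point o (north pole); the target point is taken to be o as well

def L(v, a=0.9):
    """Linear isometry of T_o(S^2): rotation of the tangent plane z = 0 by the angle a."""
    return (math.cos(a) * v[0] - math.sin(a) * v[1],
            math.sin(a) * v[0] + math.cos(a) * v[1], 0.0)

def polar_map(q):
    """phi_L = exp_o . L . exp_o^{-1}, as in Definition 12."""
    return exp_map(O, L(log_map(O, q)))

def dist(p, q):
    """Geodesic distance on the unit sphere."""
    return math.acos(max(-1.0, min(1.0, dot(p, q))))

q1 = exp_map(O, (0.3, 0.4, 0.0))
q2 = exp_map(O, (-1.1, 0.2, 0.0))
before, after = dist(q1, q2), dist(polar_map(q1), polar_map(q2))
```

The distances `before` and `after` agree up to rounding, as expected of an isometry.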


A first consequence of the theorem is that constant curvature determines local geometry:

15. Corollary. Let M and $\bar M$ be semi-Riemannian manifolds with the same dimension and index and the same constant curvature C. Then any points $o \in M$ and $\bar o \in \bar M$ have isometric neighborhoods.

This is clear since a linear isometry from $T_o(M)$ to $T_{\bar o}(\bar M)$ exists (by 2.27) and must preserve curvature.

Another application will explain the term locally symmetric. For $p \in M$ let $\zeta_p$ be the polar map of the linear isometry $v \to -v$ of $T_p(M)$. If $\mathcal{U}$ is a suitably chosen normal neighborhood, then $\zeta_p: \mathcal{U} \to \mathcal{U}$ is a diffeomorphism. Evidently $\zeta_p$ reverses geodesics through p: if $\gamma(0) = p$, then $\zeta_p(\gamma(s)) = \gamma(-s)$. In fact this property uniquely determines $\zeta_p$ (once $\mathcal{U}$ is specified), hence $\zeta_p$ is called the local geodesic symmetry of M at p.

16. Corollary. The following conditions on a semi-Riemannian manifold are equivalent:

(1) M is locally symmetric.
(2) If $L: T_p(M) \to T_q(M)$ is a linear isometry that preserves curvature, then there is an isometry $\phi$ of normal neighborhoods of p and q such that $d\phi_p = L$.
(3) At each point p of M the local geodesic symmetry $\zeta_p$ is an isometry.


Proof. (1) $\Rightarrow$ (2). By the theorem.
(2) $\Rightarrow$ (3). The linear isometry $-\mathrm{id}$ on $T_p(M)$ carries each nondegenerate plane $\Pi \subset T_p(M)$ to itself; thus $-\mathrm{id}$ preserves curvature. If $\phi$ is the isometry given by (2), then $\phi$ reverses geodesics through p, hence is the local geodesic symmetry at p.
(3) $\Rightarrow$ (1). We show that the tensor field DR vanishes at an arbitrary point p. Since $\zeta_p$ is an isometry it preserves curvature and covariant derivatives. The differential map of $\zeta_p$ at p is $-\mathrm{id}$, hence for all $v, x, y, z \in T_p(M)$,

$$-(D_v R)_{xy}z = (D_{-v}R)_{-x,-y}(-z) = (D_v R)_{xy}z.$$

Hence DR = 0 at p. ∎

SYMMETRIC SPACES

The local result in Theorem 14 can be generalized as follows.

17. Theorem. Let M and $\bar M$ be complete, connected, locally symmetric semi-Riemannian manifolds, with M simply connected. If $L: T_o(M) \to T_{\bar o}(\bar M)$ is a linear isometry that preserves curvature, then there is a unique semi-Riemannian covering map $\phi: M \to \bar M$ such that $d\phi_o = L$.

We postpone the proof to look first at some consequences. Roughly speaking, the theorem asserts that a complete connected locally symmetric manifold $\bar M$ is determined, up to semi-Riemannian covering, by its curvature at one point. The result is due in principle to E. Cartan (1869-1951), who singlehandedly created the theory of symmetric spaces. (Also see [A] and Theorem 1.36 of [CE].)

18. Definition. A semi-Riemannian symmetric space is a connected semi-Riemannian manifold M such that for each $p \in M$ there is a (unique) isometry $\zeta_p: M \to M$ with differential map $-\mathrm{id}$ on $T_p(M)$.

The isometry $\zeta_p$ is called the global symmetry of M at p. Since it reverses the geodesics through p, $\zeta_p$ is the unique extension to all of M of the local geodesic symmetry at p. Thus the latter is an isometry; so by Corollary 16, symmetric implies locally symmetric.

19. Examples. (1) $\mathbf{R}^n$ is symmetric, since for each point p the map $p + x \to p - x$ is an isometry.
(2) The sphere $S^n$ is symmetric, since for each p it is symmetric in the usual Euclidean sense about the line in $\mathbf{R}^{n+1}$ through p and $-p$.
(3) In fact, by Proposition 4.30, every connected hyperquadric is symmetric (take the frame $f_1, \ldots, f_n$ to be $-e_1, \ldots, -e_n$).

Chapter 11 will exhibit a variety of symmetric spaces with nonconstant curvature. A locally symmetric manifold need not be complete, since any open submanifold of one is again one; however,

20. Lemma. A semi-Riemannian symmetric space is complete.

Proof. To show that a geodesic $\gamma: [0, b) \to M$ is extendible, choose c near b in the interval, and let $\zeta$ be the global symmetry at $\gamma(c)$. Since $\zeta$ reverses geodesics through $\gamma(c)$, a reparametrization of $\zeta \circ \gamma$ provides the required extension of $\gamma$. ∎

21. Corollary. A complete, simply connected, locally symmetric semi-Riemannian manifold is symmetric.

Proof. At any point $p \in M$ the linear isometry $-\mathrm{id}$ on $T_p(M)$ preserves curvature. Thus we can apply Theorem 17, with $\bar M = M$, to obtain a semi-Riemannian covering map $\phi: M \to M$ such that $d\phi_p = -\mathrm{id}$. Since M is simply connected, $\phi$ is an isometry, by Corollary 7.27. ∎

It follows that, if M is a complete, connected, locally symmetric manifold, then its simply connected semi-Riemannian covering manifold $\tilde M$ is symmetric. In fact, by Corollary 7.29, $\tilde M$ is complete and (since the covering map is in particular a local isometry) $\tilde M$ is locally symmetric. Thus the preceding corollary shows that $\tilde M$ is symmetric.

Now we return to the proof of Theorem 17. The scheme is as follows. By Theorem 14 there is an isometry on some neighborhood of $o \in M$ whose differential map is L. Any point $p \in M$ can be connected to o by a broken geodesic $\beta$, so we can parallel translate L along $\beta$ to p and there get another isometry on a neighborhood of p. Because M is simply connected, these locally defined isometries fit together to give the required map $\phi$.

Some notation is required to cope with broken geodesics. For any $k \ge 1$ let v be a k-tuple $(v_1, \ldots, v_k)$ of vectors in $T_o(M)$. Then the broken geodesic $\beta_v: [0, k] \to M$ is defined inductively as follows (using the fact that M is complete). Let $\beta_1 = \gamma_{v_1}|[0, 1]$. If $\beta_j$ has been defined on $[0, j]$, for $1 \le j < k$, let $w = P(v_{j+1})$ be the parallel translate of $v_{j+1}$ along $\beta_j$ to its endpoint $\beta_j(j)$. Then attaching

$$t \to \gamma_w(t - j) \qquad (j \le t \le j + 1)$$

to $\beta_j$ gives $\beta_{j+1}$.
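The inductive construction of the broken geodesic $\beta_v$ can be sketched in code. The example below assumes the unit sphere $S^2 \subset \mathbf{R}^3$, where geodesics and parallel translation along them have closed forms; those formulas and the function names are assumptions of this illustration, not part of the proof.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def exp_map(p, u):
    """Endpoint gamma_u(1) of the geodesic of the unit sphere with gamma(0) = p, gamma'(0) = u."""
    n = math.sqrt(dot(u, u))
    if n == 0.0:
        return tuple(p)
    return tuple(math.cos(n) * pc + math.sin(n) * uc / n for pc, uc in zip(p, u))

def transport(p, u, w):
    """Parallel translate the tangent vector w at p along gamma_u restricted to [0, 1]."""
    n = math.sqrt(dot(u, u))
    if n == 0.0:
        return tuple(w)
    e = tuple(uc / n for uc in u)
    a = dot(w, e)  # component of w along the geodesic; the perpendicular part is unchanged
    return tuple(wc + a * ((math.cos(n) - 1.0) * ec - math.sin(n) * pc)
                 for wc, ec, pc in zip(w, e, p))

def broken_geodesic_endpoint(o, vs):
    """Endpoint beta_v(k) for the k-tuple vs = (v_1, ..., v_k) of tangent vectors at o."""
    p, pending = tuple(o), [tuple(v) for v in vs]
    while pending:
        u, rest = pending[0], pending[1:]
        # Carry the remaining vectors along the current segment, then move to its endpoint.
        pending = [transport(p, u, w) for w in rest]
        p = exp_map(p, u)
    return p

h = math.pi / 2
end = broken_geodesic_endpoint((0.0, 0.0, 1.0),
                               [(h, 0.0, 0.0), (0.0, h, 0.0), (h, 0.0, 0.0)])
```

Starting at the north pole, the three quarter-circle segments run to (1, 0, 0), then to (0, 1, 0), and finally to the south pole, because the third vector has been parallel translated to point along $-\partial_z$ by the time it is used.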

(1) For v as above, let $P_v: T_o(M) \to T_{\beta_v(k)}(M)$ be parallel translation along $\beta_v$. Similarly in $\bar M$, for $Lv = (Lv_1, \ldots, Lv_k)$ let $P_{Lv}: T_{\bar o}(\bar M) \to T_{\beta_{Lv}(k)}(\bar M)$ be parallel translation along $\beta_{Lv}$. Then the parallel translate of L along $\beta_v$ is

$$L_v = P_{Lv} \circ L \circ P_v^{-1}: T_p(M) \to T_q(\bar M),$$


where $p = \beta_v(k)$ and $q = \beta_{Lv}(k)$. Since M and $\bar M$ are locally symmetric the linear isometries $P_v$ and $P_{Lv}$ preserve curvature, hence so does the linear isometry $L_v$. By Theorem 14, there is a neighborhood $\mathcal{U}_v$ of each $\beta_v(k)$ and a local isometry $\phi_v: \mathcal{U}_v \to \bar M$ whose differential map at $\beta_v(k)$ is $L_v$. In view of Lemma 5.10, we can suppose that each $\mathcal{U}_v$ is convex and that if $\mathcal{U}_v$ and $\mathcal{U}_w$ meet, then their intersection is convex, hence connected.

(2) We now construct a matched covering of M. Let A be the set of k-tuples in $T_o(M)$ for all $k \ge 1$. Let $\mathcal{C}$ be the collection of convex open sets $\mathcal{U}_v$ as above for all $v \in A$. Since M is connected, $\mathcal{C}$ is a covering of M. If $v, w \in A$, define $v \sim w$ to mean that $\phi_v = \phi_w$ on $\mathcal{U}_v \cap \mathcal{U}_w$. Evidently the relation $\sim$ is reflexive and symmetric, so it remains to prove the quasi-transitive condition (3) of Definition 7.30. Suppose $u \sim v$, $v \sim w$, and $p \in \mathcal{U}_u \cap \mathcal{U}_v \cap \mathcal{U}_w$. It follows from the two relations that the differential maps $d\phi_u$, $d\phi_v$, $d\phi_w$ all agree at the point p. Hence by Proposition 3.62 $\phi_u = \phi_w$ on $\mathcal{U}_u \cap \mathcal{U}_w$. Thus $u \sim w$, and $(\mathcal{C}, \sim)$ is a matched covering.

(3) The essential step of the proof is to show that this matched covering is chainable over any geodesic segment $\sigma: [0, 1] \to M$, since then Proposition 7.32 applies. Choose $v = (v_1, \ldots, v_k) \in A$ so that $\beta_v$ runs from o to $\sigma(0)$. Let $v_{k+1}$ be the parallel translate of $\sigma'(0)$ back to o along $\beta_v$. For each $s \in [0, 1]$ let $w(s) = (v_1, \ldots, v_k, s v_{k+1}) \in A$. Thus the last segment of $\beta_{w(s)}$ is a reparametrization of $\sigma|[0, s]$. Then there is a partition $0 = s_0 < s_1 < \cdots < s_m = 1$ such that

$$\sigma([s_{i-1}, s_i]) \subset \mathcal{U}_{w(s_i)} \qquad \text{for } 1 \le i \le m.$$

See Figure 1. It remains to check that w(siw(si). Abbreviate 4w(si) to 4i. Let P be parallel translation along CJ from a(sito o(si),and correspondingly let P be parallel translation along 4i0 a from c$~cJ(s~- to 4ia(si).It follows from the definition of L, that LW(S,. I)

=

p-

O LW(Si)

p.

As a local isometry, 4i preserves parallel translation ; hence, ( d 4 i ) o , s z- I )

=

P-

0

(d4i)ofsi) 0 P .

Figure 1

Simply Connected Space Forms

By construction, for each i, L , , , ( ~ ~is) the differential map of q5i at b,,,(& Thus the two preceding formulas yield

227

+ 1)

= a(s,).

-

= 4i on n a,+); that is, w(siPI) w(si). By Proposition 7.32, the matched covering (9, -) gives a semiRiemannian covering map &: M* + M . As with any matched covering, for each v E A there is local section A":%" + M * , and meets A,,,(%,,,) if and only if u w. It follows that the maps 4" A : &(%J + R for all u E A agree on overlaps, hence define a single map 4*: M* -+ R.This is a local isometry since A and every 4" are. Since M is simply connected, Corollary A.14 applies, showing that there is a (locally isometric) global section A: M + M * such that 111q0= Lo for (0) E A . Thus 4 = 4* 0 A: M -+ R is a local isometry with 4la0= $o; hence d$o = L. B

Hence,

$i-l

(4)

-

0

SIMPLY CONNECTED SPACE FORMS

22. Definition. A space form is a complete connected semi-Riemannian manifold of constant curvature.

Space forms must be regarded as the simplest important class of semiRiemannian manifolds. In the simply connected case, Theorem 17 is decisive:

23. Proposition. Simply connected space forms are isometric if and only if they have the same dimension, index, and curvature $C$.

Proof. Obviously the conditions are necessary. For the converse, note that, since constant curvature implies local symmetry, Corollary 21 shows that simply connected space forms are symmetric. Suppose $M$ and $N$ are simply connected space forms of the same dimension and index. The latter show that linear isometries $T_p(M) \to T_q(N)$ exist. Then Theorem 17 gives a semi-Riemannian covering map $M \to N$ which, since $N$ is simply connected, is an isometry. ∎

Thus up to isometry there is at most one simply connected space form $M(n, \nu, C)$ of dimension $n$, index $\nu$, and curvature $C$. Existence for all $(n, \nu, C)$ is settled for $C = 0$ by the semi-Euclidean spaces $\mathbf R^n_\nu$, and, with suitable modifications, for $C > 0$ by pseudospheres and for $C < 0$ by pseudohyperbolic spaces.

24. Corollary. For $n \ge 2$,

$$M(n, \nu, C) = \begin{cases} S^n_\nu(r) & \text{if } C = 1/r^2 \text{ and } 0 \le \nu \le n - 2, \\ \mathbf R^n_\nu & \text{if } C = 0, \\ H^n_\nu(r) & \text{if } C = -1/r^2 \text{ and } 2 \le \nu \le n. \end{cases}$$

In fact, since $S^n_\nu(r) \approx S^{n-\nu} \times \mathbf R^\nu \approx H^n_{n-\nu}(r)$, these space forms are all simply connected. The remaining cases come in anti-isometric pairs. One-dimensional semi-Riemannian manifolds are trivially flat; hence by Exercise 8, the only simply connected one-dimensional space forms are $\mathbf R^1$ and $\mathbf R^1_1$. For $n \ge 2$ we have

$M(n, n, 1/r^2)$ = one component $cS^n_n(r)$ of $S^n_n(r)$, and by metric reversal, $M(n, 0, -1/r^2)$ = hyperbolic $n$-space $H^n(r)$, one component of $H^n_0(r)$.

$M(n, n - 1, 1/r^2) = \tilde S^n_{n-1}(r)$, the simply connected semi-Riemannian covering manifold of $S^n_{n-1}(r)$, and $M(n, 1, -1/r^2) = \tilde H^n_1(r)$, the simply connected semi-Riemannian covering manifold of $H^n_1(r)$.

Since $\mathbf R^n$ is the simply connected covering manifold of $S^1 \times \mathbf R^{n-1}$, these four types are diffeomorphic to $\mathbf R^n$.

25. Corollary (Hopf). A complete, simply connected, $n$-dimensional Riemannian manifold of constant curvature $C$ is isometric to

the sphere $S^n(r)$ if $C = 1/r^2$,
Euclidean space $\mathbf R^n$ if $C = 0$,
hyperbolic space $H^n(r)$ if $C = -1/r^2$.

This result is particularly satisfying from a historical viewpoint, since all three of these constant curvature geometries were well known, at least in low dimensions, before Riemannian geometry was invented (1854) and long before a rigorous proof of this corollary (1926). Euclidean geometry is of course the oldest, and hyperbolic geometry, dating from the early 1800s, the newest.

Writing $\tilde M$ for the simply connected semi-Riemannian covering manifold of $M$, the Lorentz analogue of Hopf's result is

26. Corollary. A complete, simply connected, $n$-dimensional Lorentz manifold of constant curvature $C$ is isometric to

the Lorentz sphere $S^n_1(r)$ if $C = 1/r^2$ and $n \ge 3$,
$\tilde S^2_1(r)$ if $C = 1/r^2$ and $n = 2$,
Minkowski space $\mathbf R^n_1$ if $C = 0$,
$\tilde H^n_1(r)$ if $C = -1/r^2$.
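The classification in Corollaries 24 to 26, together with the remaining cases listed after Corollary 24, can be read as a lookup table. The following is only an illustrative sketch: the function name `simply_connected_space_form` and its string labels are hypothetical conveniences, not notation from the text, and only $n \ge 2$ is handled.

```python
import math


def simply_connected_space_form(n, nu, C):
    """Return (label, r) describing M(n, nu, C) for n >= 2.

    The labels are informal strings invented for this sketch; r is the
    pseudoradius with |C| = 1/r^2, or None in the flat case C = 0.
    """
    if n < 2 or not 0 <= nu <= n:
        raise ValueError("need n >= 2 and 0 <= nu <= n")
    if C == 0:
        return (f"R^{n}_{nu}", None)
    r = 1.0 / math.sqrt(abs(C))
    if C > 0:
        if nu <= n - 2:
            return (f"S^{n}_{nu}", r)                      # already simply connected
        if nu == n - 1:
            return (f"universal cover of S^{n}_{nu}", r)   # S^n_{n-1} is not
        return (f"one component of S^{n}_{nu}", r)         # S^n_n has two components
    if nu >= 2:
        return (f"H^{n}_{nu}", r)
    if nu == 1:
        return (f"universal cover of H^{n}_1", r)
    return (f"H^{n} (one component of H^{n}_0)", r)


print(simply_connected_space_form(4, 1, 1.0))    # Lorentz sphere S^4_1
print(simply_connected_space_form(4, 1, -1.0))   # covering of H^4_1
print(simply_connected_space_form(2, 1, 0.25))   # two-dimensional Lorentz case
```

For index $\nu = 1$ the branches reproduce the four cases of Corollary 26.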


In relativity theory $S^4_1(r)$ is called de Sitter spacetime, and $\tilde H^4_1(r)$ is universal anti-de Sitter spacetime.

27. Example. A model for $\tilde H^n_1$. As Figure 2 suggests, $H^n_1$ can be regarded as a hypersurface of revolution in $\mathbf R^{n+1}_2 = \mathbf R^2_2 \times \mathbf R^{n-1}$. Unwrapping the circles of revolution gives a covering map $k\colon \mathbf R^1 \times \mathbf R^{n-1} \to H^n_1$, where

$$k(t, x) = \left(\sqrt{1 + x \cdot x}\,\cos t,\ \sqrt{1 + x \cdot x}\,\sin t,\ x\right).$$

Furnished with the pulled-back metric, $\mathbf R^n$ becomes the simply connected semi-Riemannian covering manifold of $H^n_1$ and hence can be denoted by $\tilde H^n_1$.

Figure 2. $H^n_1$ in $\mathbf R^{n+1}_2$, with axes $\mathbf R^2_2$ (coordinates $u^0, u^1$) and $\mathbf R^{n-1}$ (coordinates $u^2, \dots, u^n$).

The geometry of the model can be derived from that of $H^n_1$ as modified by the map $k$. For example, if the coordinate slice $u^0 = 0$ in $\mathbf R^{n+1}_2$ is regarded as $\mathbf R^n_1$, then $\mathbf R^n_1 \cap H^n_1$ is just $H^{n-1}_0$. The latter's two components $H^{n-1}$ are totally geodesic in $H^n_1$. By rotational symmetry the same is true for all slices of $H^n_1$ by hyperplanes through $\mathbf R^{n-1}$. Thus in $\tilde H^n_1$ the slices $t$ constant are totally geodesic spacelike hypersurfaces, isometric to $H^{n-1}$ (or $\mathbf R^1$ if $n = 2$).
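A quick numerical check of Example 27 may be helpful. The sketch below assumes the unit pseudohyperbolic space ($r = 1$), so that $H^n_1$ is the set of points $p$ in $\mathbf R^{n+1}_2$ with $\langle p, p\rangle = -1$; the helper names `scalar_product`, `covering_map`, and `d_dt` are hypothetical and not from the text.

```python
import math


def scalar_product(p, q, nu):
    """Scalar product of R^m_nu: minus signs on the first nu coordinates."""
    return (-sum(a * b for a, b in zip(p[:nu], q[:nu]))
            + sum(a * b for a, b in zip(p[nu:], q[nu:])))


def covering_map(t, x):
    """k(t, x) = (S cos t, S sin t, x) with S = sqrt(1 + x.x)."""
    s = math.sqrt(1.0 + sum(c * c for c in x))
    return [s * math.cos(t), s * math.sin(t)] + list(x)


def d_dt(t, x):
    """Partial derivative of k with respect to t."""
    s = math.sqrt(1.0 + sum(c * c for c in x))
    return [-s * math.sin(t), s * math.cos(t)] + [0.0] * len(x)


t, x = 0.9, [0.3, -1.2, 2.0]          # a sample point of R^1 x R^3, so n = 4
p = covering_map(t, x)
v = d_dt(t, x)
S_squared = 1.0 + sum(c * c for c in x)

print(scalar_product(p, p, 2))         # close to -1: the image lies on H^4_1
print(scalar_product(v, v, 2))         # close to -S^2: the t-direction is timelike
print(covering_map(t + 2 * math.pi, x))  # same point again: k is not one-to-one
```

The value of $\langle \partial k/\partial t, \partial k/\partial t\rangle$ agrees with the $-S^2\,dt^2$ term of the line element stated in Exercise 5.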

In Riemannian geometry, curvature inequalities such as $K \ge c$ and $a \le K \le b$ have been intensively studied. For indefinite metrics we shall now see that such conditions imply constant curvature.

Sectional curvature $K$ is undefined on a degenerate plane $\Pi$ in $T_p(M)$, but a formula in the proof of Lemma 3.39 shows that $\operatorname{sgn}\langle R_{XY}X, Y\rangle$ is the same for all bases $X, Y$ for $\Pi$. This provides a well-defined function $\mathcal N$ from the set of degenerate planes in $T_p(M)$ to the set $\{+1, 0, -1\}$.

28. Proposition (Kulkarni et al.). Let $p$ be a point of a semi-Riemannian manifold of indefinite metric. The following conditions on $T_p(M)$ are equivalent: (1) $K$ is constant, (2) $\mathcal N = 0$,


(3) $a \le K$ or $K \le b$, where $a, b \in \mathbf R$, (4) $a \le K \le b$ on indefinite planes, (5) $a \le K \le b$ on definite planes.

Proof (S. Harris et al.). Evidently we can assume $\dim M \ge 3$. Obviously (1) implies (3, 4, 5), and by Corollary 3.43 it implies (2). The reverse implications will use this consequence of $\mathcal N = 0$:

(*) If $X, U, Z$ are orthonormal vectors and $X$ and $U$ have opposite causal character, then $K(X, Z) = K(U, Z)$.

To prove this note that $U \pm X$ are null vectors and the planes $\Pi(U \pm X, Z)$ are degenerate. Hence

$$\langle R_{X+U,Z}(X + U), Z\rangle = 0 = \langle R_{X-U,Z}(X - U), Z\rangle.$$

Expansion yields first $\langle R_{XZ}U, Z\rangle = 0$, then $\langle R_{XZ}X, Z\rangle + \langle R_{UZ}U, Z\rangle = 0$. Since $Q(X, Z) + Q(U, Z) = 0$, the result follows.

(2 ⇒ 1) Case I. $\nu = 1$ or $n - 1$. By metric reversal we can assume $\nu = 1$. If $U, X, Y$ is orthonormal with $U$ timelike, then (*) implies $K(U, X) = K(X, Y)$. It follows that every nondegenerate plane that either contains or is orthogonal to $U$ has the same sectional curvature $k(U)$. If $V$ is an independent timelike vector, the plane spanned by $U$ and $V$ is nondegenerate, hence $k(U) = K(U, V) = k(V)$.

Case II. $1 < \nu < n - 1$. Let $\Pi$ be a positive definite plane, $\Pi'$ negative definite, and let $X \in \Pi$ and $U \in \Pi'$ be unit vectors. Now $Q(X, U) < 0$, and using Lemma 2.19 we easily check that $\Pi + \mathbf R U$ and $\Pi' + \mathbf R X$ are nondegenerate. Hence by Case I, $K(\Pi) = K(U, X) = K(\Pi')$. This suffices because, if $X, U$ is an orthonormal basis for any indefinite plane, then extension to an orthonormal basis for $T_p(M)$ will provide planes $\Pi, \Pi'$ as above.

(3 ⇒ 2). Suppose $a \le K$ (reversing the metric gives $K \le b$). By Exercise 9 every degenerate plane $\Pi$ is a limit of definite planes and a limit of indefinite planes. Consider $\Pi(X_i, Y_i) \to \Pi = \Pi(X, Y)$, where $X_i \to X$ and $Y_i \to Y$. Then

$$\frac{\langle R(X_i, Y_i)X_i, Y_i\rangle}{Q(X_i, Y_i)} \ge a.$$

Here the numerator approaches $\langle R(X, Y)X, Y\rangle$ and the denominator approaches zero. Thus, if all $Q(X_i, Y_i) > 0$, then $\langle R(X, Y)X, Y\rangle \ge 0$, hence $\mathcal N(\Pi) \ge 0$. Similarly, if all $Q(X_i, Y_i) < 0$, then $\mathcal N(\Pi) \le 0$; hence $\mathcal N(\Pi) = 0$.

(4 ⇒ 2) and (5 ⇒ 2). With notation as above, $|\langle R(X_i, Y_i)X_i, Y_i\rangle| / |Q(X_i, Y_i)|$ is bounded, where $Q(X_i, Y_i)$ is negative for (4), positive for (5). Hence $\langle R(X_i, Y_i)X_i, Y_i\rangle$ approaches zero, so $\mathcal N(\Pi) = 0$. ∎


If at each point of $M$, connected and of dimension $\ge 3$, one of the conditions in the proposition holds, then, by Schur's theorem (Exercise 3.21), $M$ has constant curvature.

TRANSVECTIONS

29. Definition. An isometry $\phi\colon M \to M$ is a transvection along a geodesic $\gamma\colon \mathbf R \to M$ provided

(1) $\phi$ translates $\gamma$; that is, $\phi\gamma(s) = \gamma(s + c)$ for all $s \in \mathbf R$ and some $c$.
(2) $d\phi$ gives parallel translation along $\gamma$; that is, if $x \in T_{\gamma(s)}(M)$, then $d\phi(x) \in T_{\gamma(s+c)}(M)$ is the parallel translate of $x$ along $\gamma$.

For example, if $\phi$ is a rotation of the sphere $S^2$ about the $z$-axis in $\mathbf R^3$, then $\phi$ is a transvection along the (geodesic) equatorial circle $z = 0$. In a symmetric space there are transvections along every geodesic.

30. Lemma. If $\gamma$ is a geodesic in a symmetric space, let $\zeta_s$ be the global symmetry of $M$ at $\gamma(s)$. Then for any $c$, the isometry $\zeta_{c/2}\zeta_0$ is a transvection along $\gamma$ that translates it by $c$.

Proof. For all $s$, $\zeta_0(\gamma(s)) = \gamma(-s)$. But $c/2$ is the midpoint of $[-s, s + c]$, so $\zeta_{c/2}\zeta_0\gamma(s) = \gamma(s + c)$.

If $X$ is a parallel vector field on $\gamma$, then for any $s$, $d\zeta_s(X)$ is a parallel vector field on $\zeta_s \circ \gamma$, which is a reparametrization of $\gamma$. (Since parallel translation is independent of parametrization, we can ignore such reparametrizations.) If $x \in T_{\gamma(s)}M$, then $x$ is parallel along $\gamma$ to a tangent vector $y$ at $\gamma(0)$. Hence $d\zeta_0(x)$ is parallel to $d\zeta_0(y) = -y$, and also to a vector $z$ at $\gamma(c/2)$. Thus $d\zeta_{c/2}\,d\zeta_0(x)$ is parallel to $-z$, hence to $y$, hence to $x$. ∎

31. Corollary. Every nonconstant geodesic in a symmetric space is either one-to-one or simply periodic.

Proof. It suffices to show that, if $\gamma$ is a geodesic with $\gamma(b) = \gamma(0)$, then $\gamma(s + b) = \gamma(s)$ for all $s$. For each $s$ let $\phi_s$ be the transvection that translates $\gamma$ by $s$. Then

$$\gamma(s + b) = \phi_s(\gamma(b)) = \phi_s(\gamma(0)) = \gamma(s).$$ ∎
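Lemma 30 can be tested numerically on the unit sphere $S^2 \subset \mathbf R^3$. The sketch below assumes the standard formula $\zeta_p(x) = 2(x \cdot p)p - x$ for the global symmetry of the unit sphere at $p$ (this formula is an assumption brought in for the illustration, not quoted from the text) and takes $\gamma(s) = (\cos s, \sin s, 0)$, the equator. All function names are hypothetical.

```python
import math


def symmetry(p, x):
    """Global symmetry of the unit sphere at p: x -> 2 (x.p) p - x."""
    dot = sum(a * b for a, b in zip(x, p))
    return [2.0 * dot * a - b for a, b in zip(p, x)]


def gamma(s):
    """Unit-speed parametrization of the equatorial geodesic z = 0."""
    return [math.cos(s), math.sin(s), 0.0]


def transvection(c, x):
    """The composition zeta_{c/2} zeta_0 of Lemma 30, applied to x."""
    return symmetry(gamma(c / 2.0), symmetry(gamma(0.0), x))


def rotate_z(c, x):
    """Rotation of R^3 about the z-axis through angle c."""
    return [x[0] * math.cos(c) - x[1] * math.sin(c),
            x[0] * math.sin(c) + x[1] * math.cos(c),
            x[2]]


c, s = 0.8, 0.3
print(transvection(c, gamma(s)))   # should match gamma(s + c)
print(gamma(s + c))

point = [0.6, 0.0, 0.8]            # a point of S^2 off the equator
print(transvection(c, point))      # should match the rotation about the z-axis
print(rotate_z(c, point))
```

In this example the composition of the two symmetries is exactly the rotation about the $z$-axis mentioned after Definition 29.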

Exercises

1. An isometry $\zeta$ of $M$, connected, is a global symmetry at $p \in M$ if and only if $\zeta$ is involutive (that is, $\zeta^2 = \mathrm{id}$) and $p$ is an isolated fixed point of $\zeta$.

2. (a) Let $M$ be a semi-Riemannian submanifold of a symmetric space $\bar M$. If $M$ is connected, closed in $\bar M$, and totally geodesic, then $M$ is symmetric. (b) A semi-Riemannian product $M \times N$ is symmetric if and only if both $M$ and $N$ are symmetric. (c) Let $k\colon \tilde M \to M$ be a simply connected semi-Riemannian covering. If $M$ is symmetric, then $\tilde M$ is symmetric (but not conversely).

3. (a) If $M$ is a complete Riemannian manifold, then every element of $\pi_1(M, p)$ contains a geodesic loop. (Hint: Use the Hopf-Rinow theorem in the simply connected covering manifold.) (b) The fundamental group of a Riemannian symmetric space is abelian. (Hint: Show that the geodesic symmetry at $p$ induces the homomorphism $g \to g^{-1}$ on $\pi_1(M, p)$.)

4. A model of $\tilde S^2_1$, hence (reversing the metric) $\tilde H^2_1$. Consider the smooth map $x\colon \mathbf R^2 \to \mathbf R^3_1$ given by $x(t, \vartheta) = (\sinh t, \cosh t \cos\vartheta, \cosh t \sin\vartheta)$. (a) Prove that $x$ is a covering map of $\mathbf R^2$ onto the unit pseudosphere $S^2_1$ in $\mathbf R^3_1$. (b) Prove that $\mathbf R^2$ furnished with the pulled-back metric is $\tilde S^2_1$ and the line element is $-dt^2 + \cosh^2 t\, d\vartheta^2$. (c) Sketch some null geodesics of this model and determine which points can be reached by geodesics starting at the origin.

5. (a) The model of $\tilde H^n_1$ in Example 27 has line element $-S^2\,dt^2 + d\sigma^2$, where $S = (1 + x \cdot x)^{1/2}$ and $d\sigma^2 = dx \cdot dx - ((x \cdot dx)^2 / S^2)$ on $\mathbf R^{n-1}$. (Hint: $S\,dS = x \cdot dx$.) (b) $d\sigma^2$ on $\mathbf R^{n-1}$ gives $H^{n-1}$. (For other models of hyperbolic space see [Wo].)

6. Let $\gamma$ be a geodesic with $\langle\gamma', \gamma'\rangle = \varepsilon = \pm 1$ in a semi-Riemannian surface. If $Y$ is a Jacobi field on $\gamma$ perpendicular to $\gamma$, write $Y = yE$, where $E$ is a parallel unit vector field on $\gamma$. Show that the Jacobi equation for $Y$ reduces to $y'' + \varepsilon K y = 0$.

7. Let $\gamma$ be a geodesic in a manifold of constant curvature $C$, and let $Y$ be a Jacobi field on $\gamma$ that is $\perp \gamma$. (a) The Jacobi equation for $Y$ is $Y'' + C\langle\gamma', \gamma'\rangle Y = 0$. (b) Let $c = |C\langle\gamma', \gamma'\rangle|^{1/2}$. On $\gamma$ there exist parallel vector fields $A, B \perp \gamma$ such that
$Y(s) = \cos(cs)\,A(s) + \sin(cs)\,B(s)$ if $C\langle\gamma', \gamma'\rangle > 0$,
$Y(s) = A(s) + sB(s)$ if $C\langle\gamma', \gamma'\rangle = 0$,
$Y(s) = \cosh(cs)\,A(s) + \sinh(cs)\,B(s)$ if $C\langle\gamma', \gamma'\rangle < 0$.

8. (a) Let $M$ be a flat connected semi-Riemannian manifold complete at $o \in M$ (that is, $\exp_o$ is defined on all of $T_o(M)$). Prove that $\exp_o\colon T_o(M) \to M$ is a semi-Riemannian covering map. (Hint: Use Proposition 6.) (b) Give an example of a connected semi-Riemannian manifold that is complete at one point but not complete.

9. In a vector space with indefinite scalar product, every degenerate 2-plane $\Pi$ is a limit of (a) indefinite planes (b) definite planes. (Hint: See proof of Lemma 3.40, but for (b), if $g|\Pi = 0$, consider $\Pi(v + \delta x, w + \delta y)$.)

9

ISOMETRIES

For a semi-Riemannian manifold $M$, the set $I(M)$ of all isometries $M \to M$ forms a group under composition of mappings. Roughly speaking, the larger $I(M)$ is, the simpler $M$ is. $M$ may have no isometries except the identity map, but many manifolds have isometry groups large enough so that Lie theory, whose rudiments appear in Appendix B, can be applied. In nontrivial cases $I(M)$ is a geometric invariant of $M$ ranking in importance with its curvature and geodesics.

Since each tangent space of $M$ is isometric to $\mathbf R^n_\nu$, the isometry group $I(\mathbf R^n_\nu)$ is of fundamental significance in semi-Riemannian geometry. In particular, for manifolds with indefinite metrics it leads to twin notions of time- and space-orientability analogous to ordinary orientability.

By Chapter 7 any connected semi-Riemannian manifold $M$ can be expressed as an orbit manifold $\tilde M/\Gamma$, where $\tilde M$ is the simply connected covering manifold of $M$ and $\Gamma$ is a properly discontinuous subgroup of $I(\tilde M)$. Thus geometric properties of $M$ can be expressed in terms of algebraic properties of $\Gamma$, a scheme that is particularly effective when $M$ is a space form.

A Killing vector field on $M$ (Definition 22) is an "infinitesimal isometry" of $M$. Their tensor properties make Killing vector fields easier to find and study than isometries, and in favorable cases they provide a close link between the geometry of $M$ and the algebra of $I(M)$.

SEMIORTHOGONAL GROUPS

We use the column vector conventions, under which an $n \times n$ real matrix $g$ is identified with the linear operator $g\colon \mathbf R^n \to \mathbf R^n$ such that

$$(gx)_i = \sum_j g_{ij} x_j \qquad \text{for all } 1 \le i \le n.$$

Under this identification, composition of functions agrees with matrix multiplication $gh$. Standing $n$-tuples $x$ on end as $n \times 1$ matrices ("column vectors") also gives $gx$ by matrix multiplication.

For $0 \le \nu \le n$, the signature matrix $\varepsilon$ is the diagonal matrix $(\delta_{ij}\varepsilon_j)$ whose diagonal entries are $\varepsilon_1 = \dots = \varepsilon_\nu = -1$ and $\varepsilon_{\nu+1} = \dots = \varepsilon_n = +1$. Hence $\varepsilon^{-1} = \varepsilon = {}^t\varepsilon$, where ${}^tg$ denotes the transpose of $g$. By the identification above, the set of all linear isometries $\mathbf R^n_\nu \to \mathbf R^n_\nu$ is the same as the set $O(\nu, n - \nu)$ of all matrices $g \in GL(n, \mathbf R)$ that preserve the scalar product $\langle v, w\rangle = \varepsilon v \cdot w$ of $\mathbf R^n_\nu$. Evidently $O(\nu, n - \nu)$ is a closed subgroup of $GL(n, \mathbf R)$ and hence is itself a Lie group (see Appendix B). We call it a semiorthogonal group.

1. Remark. In this chapter the group $O(\nu, n - \nu)$ will usually be denoted by $O_\nu(n)$. Although the former notation is standard in Lie group theory, the latter has advantages in the present context (compare [Wo]).

2. Lemma. The following conditions on an $n \times n$ matrix $g$ are equivalent:
(1) $g \in O_\nu(n)$.
(2) ${}^tg = \varepsilon g^{-1}\varepsilon$.
(3) The columns [rows] of $g$ form an orthonormal basis for $\mathbf R^n_\nu$ (first $\nu$ vectors timelike).
(4) $g$ carries one (hence every) orthonormal basis for $\mathbf R^n_\nu$ to an orthonormal basis.

Proof. (1) ⇔ (2). The transpose of an $n \times n$ matrix is its adjoint relative to the dot product. Thus, (1) ⇔ $\langle gv, gw\rangle = \langle v, w\rangle$ for all $v, w$ ⇔ $\varepsilon gv \cdot gw = \varepsilon v \cdot w$ for all $v, w$ ⇔ ${}^tg\varepsilon gv \cdot w = \varepsilon v \cdot w$ for all $v, w$ ⇔ ${}^tg\varepsilon gv = \varepsilon v$ for all $v$ ⇔ ${}^tg\varepsilon g = \varepsilon$ ⇔ ${}^tg = \varepsilon g^{-1}\varepsilon$.

(1) ⇔ (4). See Lemma 2.27. The equivalence of (4) and (3, for columns) is clear since the columns of $g$ are just $gu_1, \dots, gu_n$, where $u_1, \dots, u_n$ is the natural basis for $\mathbf R^n$ (orthonormal relative to the scalar product of $\mathbf R^n_\nu$ for all $\nu$). The equivalence of (4) and (3, for rows) follows since manipulation of (2) shows that $g \in O_\nu(n)$ if and only if ${}^tg \in O_\nu(n)$. ∎

In this context, timelike vectors must appear first in orthonormal bases. For example, $(0, 1)$ and $(1, 0)$ are orthogonal unit vectors of $\mathbf R^2_1$ but $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ is not in $O_1(2)$ since (2) fails.
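The criterion of Lemma 2 is easy to test numerically in the equivalent form ${}^tg\,\varepsilon\,g = \varepsilon$. The following is a minimal sketch using plain Python lists as matrices; the helper names (`signature`, `is_semiorthogonal`, and so on) are hypothetical and chosen only for this illustration.

```python
import math


def signature(nu, n):
    """Diagonal of the signature matrix: nu entries -1, then n - nu entries +1."""
    return [-1.0] * nu + [1.0] * (n - nu)


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def is_semiorthogonal(g, nu, tol=1e-9):
    """Check t(g) eps g = eps, which is condition (2) of Lemma 2 rearranged."""
    n = len(g)
    eps = signature(nu, n)
    eps_g = [[eps[i] * g[i][j] for j in range(n)] for i in range(n)]
    lhs = matmul(transpose(g), eps_g)
    return all(abs(lhs[i][j] - (eps[i] if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))


phi = 0.7
boost = [[math.cosh(phi), math.sinh(phi)],
         [math.sinh(phi), math.cosh(phi)]]
swap = [[0.0, 1.0],
        [1.0, 0.0]]

print(is_semiorthogonal(boost, 1))  # a boost preserves the scalar product of R^2_1
print(is_semiorthogonal(swap, 1))   # orthonormal columns, but the timelike one is second
print(is_semiorthogonal(swap, 0))   # the same matrix is an ordinary orthogonal matrix
```

The second call illustrates the closing remark: the columns $(0, 1)$ and $(1, 0)$ are orthonormal in $\mathbf R^2_1$, yet the matrix fails the test because the timelike vector does not come first.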


When $\nu$ is either $0$ or $n$, the group $O(\nu, n - \nu)$ is the orthogonal group $O(n)$ of all linear isometries of Euclidean space $\mathbf R^n$. For $n \ge 2$, $O_1(n) = O(1, n - 1)$ is the Lorentz group of all linear isometries of Minkowski space $\mathbf R^n_1$. For arbitrary $\nu$, the Lie groups $O_\nu(n)$ and $O_{n-\nu}(n)$ are isomorphic. In fact, conjugation by the permutation matrix that moves the first $\nu$ coordinates of $\mathbf R^n$ behind the last $n - \nu$ is a Lie group automorphism of $GL(n, \mathbf R)$ that is readily shown to carry $O_\nu(n)$ to $O_{n-\nu}(n)$.

Using the methods of Appendix B, we now compute the matrix Lie algebra of $O_\nu(n)$. Recall that $\mathfrak{gl}(n, \mathbf R)$ is the Lie algebra of all real $n \times n$ matrices, with $[a, b] = ab - ba$, and that the Lie algebra $\mathfrak o(n)$ of $O(n)$ consists of all $X \in \mathfrak{gl}(n, \mathbf R)$ that are skew-symmetric: ${}^tX = -X$.

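3. Lemma. The Lie algebra $\mathfrak o_\nu(n) = \mathfrak o(\nu, n - \nu)$ of $O_\nu(n)$ is the subalgebra of $\mathfrak{gl}(n, \mathbf R)$ consisting of all $S$ for which ${}^tS = -\varepsilon S\varepsilon$. Such $S$ have the form

$$S = \begin{pmatrix} a & x \\ {}^tx & b \end{pmatrix},$$

where $a \in \mathfrak o(\nu)$, $b \in \mathfrak o(n - \nu)$, and $x$ is an arbitrary $\nu \times (n - \nu)$ matrix.

Proof. According to Lemma B.14, an $n \times n$ matrix $S$ is in $\mathfrak o_\nu(n)$ if and only if $e^{rS} \in O_\nu(n)$ for $|r|$ small. Since $\varepsilon^{-1} = \varepsilon$, Lemma 2 and Lemma B.16 show that $e^{rS} \in O_\nu(n)$ if and only if the matrix $\varepsilon \exp(r\,{}^tS)\varepsilon^{-1} = \exp(r\varepsilon\,{}^tS\varepsilon^{-1})$ equals $\exp(-rS)$. By Lemma B.12, the exponential map is one-to-one near $0$, hence $S \in \mathfrak o_\nu(n)$ if and only if these exponents are equal for $|r|$ small; that is, $\varepsilon\,{}^tS = -S\varepsilon$.

To get the matrix description, write $S$ in block form $\begin{pmatrix} a & x \\ y & b \end{pmatrix}$, with $a$ $\nu \times \nu$ and $b$ $(n - \nu) \times (n - \nu)$. Since

$$\varepsilon = \begin{pmatrix} -1_\nu & 0 \\ 0 & 1_{n-\nu} \end{pmatrix},$$

we find

$$-\varepsilon S\varepsilon = \begin{pmatrix} -a & x \\ y & -b \end{pmatrix}.$$

Comparing this with ${}^tS = \begin{pmatrix} {}^ta & {}^ty \\ {}^tx & {}^tb \end{pmatrix}$, the result follows. ∎

Because $\dim \mathfrak o(\nu) = \nu(\nu - 1)/2$, a count of vector space dimensions shows that $\dim O_\nu(n) = \dim \mathfrak o_\nu(n) = n(n - 1)/2$, independent of $\nu$.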
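Lemma 3 and the dimension count can be illustrated numerically. The sketch below builds a sample $S$ of the stated block form for $\nu = 1$, $n = 3$, checks ${}^tS = -\varepsilon S\varepsilon$, and confirms that a truncated Taylor series for $e^{S}$ lands in $O_1(3)$ up to rounding. The series-based `expm` is an assumption made to keep the example dependency-free (it is adequate only for matrices of small norm), and all helper names are hypothetical.

```python
def signature(nu, n):
    return [-1.0] * nu + [1.0] * (n - nu)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def expm(s, terms=40):
    """Truncated Taylor series sum_k S^k / k!; fine for small matrices of modest norm."""
    n = len(s)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, s)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result


def in_lie_algebra(s, nu, tol=1e-12):
    """Check t(S) = -eps S eps entrywise: S[j][i] = -eps_i eps_j S[i][j]."""
    n = len(s)
    eps = signature(nu, n)
    return all(abs(s[j][i] + eps[i] * eps[j] * s[i][j]) <= tol
               for i in range(n) for j in range(n))


def preserves_scalar_product(g, nu, tol=1e-9):
    """Check t(g) eps g = eps."""
    n = len(g)
    eps = signature(nu, n)
    return all(abs(sum(g[k][i] * eps[k] * g[k][j] for k in range(n))
                   - (eps[i] if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))


def lie_algebra_dim(nu, n):
    """dim o(nu) + dim o(n - nu) + number of entries of the block x."""
    m = n - nu
    return nu * (nu - 1) // 2 + m * (m - 1) // 2 + nu * m


# nu = 1, n = 3: a = (0), x = (0.3, -0.5), b is a 2 x 2 skew-symmetric block.
S = [[0.0, 0.3, -0.5],
     [0.3, 0.0, 0.8],
     [-0.5, -0.8, 0.0]]

print(in_lie_algebra(S, 1))
print(preserves_scalar_product(expm(S), 1))
print([lie_algebra_dim(nu, 4) for nu in range(5)])  # every entry equals 4 * 3 / 2
```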


The matrix condition ${}^tS = -\varepsilon S\varepsilon$ is equivalent to $\langle Sv, w\rangle = -\langle v, Sw\rangle$ for all $v, w \in \mathbf R^n_\nu$. Thus we can regard $\mathfrak o_\nu(n)$ as consisting of the skew-adjoint linear operators on $\mathbf R^n_\nu$.

4. Examples. (1) $O(2)$. For each number $\vartheta \in \mathbf R$, the orthogonal matrix

$$R_\vartheta = \begin{pmatrix} \cos\vartheta & -\sin\vartheta \\ \sin\vartheta & \cos\vartheta \end{pmatrix}$$

is a rotation of $\mathbf R^2$ through (oriented) angle $\vartheta$. The function $\vartheta \to R_\vartheta$ is a smooth homomorphism from $\mathbf R$, under addition, into $O(2)$. Its kernel is $2\pi\mathbf Z$ and its image is the rotation group $O^+(2)$, the component of the identity in $O(2)$. Thus $O^+(2)$ and its other coset $O^-(2)$ are diffeomorphic to circles.

(2) $O_1(2)$. For each $\varphi \in \mathbf R$, the semiorthogonal matrix

$$B_\varphi = \begin{pmatrix} \cosh\varphi & \sinh\varphi \\ \sinh\varphi & \cosh\varphi \end{pmatrix}$$

is called a boost of $\mathbf R^2_1$ through (oriented) Lorentz angle $\varphi$. As above, $\varphi \to B_\varphi$ is a homomorphism, but in this case it is one-to-one. Any $a \in O_1(2)$ must carry each hyperbola $\langle p, p\rangle = 1$ and $\langle p, p\rangle = -1$ into itself but may reverse the branches of each. These two choices split $O_1(2)$ into four disjoint open subsets. The one preserving all branches is exactly the set $B$ of all boosts. $B$ is a subgroup diffeomorphic to $\mathbf R^1$ and is thus the component of the identity in $O_1(2)$. The other three sets are cosets of $B$, hence $O_1(2)$ has four components each diffeomorphic to $\mathbf R^1$. (See Figure 1.)

Figure 1

This example correctly suggests two fundamental differences between the orthogonal group $O(n)$ and the semiorthogonal group $O_\nu(n)$, $0 < \nu < n$:

First, $O(n)$ is compact, since (as noted in Appendix B) it is closed and bounded in $\mathfrak{gl}(n, \mathbf R) \approx \mathbf R^{n^2}$. But $O_\nu(n)$ is not compact in the indefinite case, since for example, elements of the form

$$\begin{pmatrix} \cosh\varphi & 0 & \sinh\varphi \\ 0 & 1 & 0 \\ \sinh\varphi & 0 & \cosh\varphi \end{pmatrix}$$

constitute an unbounded subset of $\mathbf R^{n^2}$. Second, $O(n)$ has two components, while in the indefinite case $O_\nu(n)$ has four.
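The contrast drawn in Example 4 can be seen numerically. The following sketch (helper names are hypothetical) checks the addition law for boosts, shows that rotations return to the identity at $\vartheta = 2\pi$ while boosts do not, and prints the growth of the entries of $B_\varphi$ that makes $O_1(2)$ unbounded.

```python
import math


def rotation(theta):
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]


def boost(phi):
    return [[math.cosh(phi), math.sinh(phi)],
            [math.sinh(phi), math.cosh(phi)]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


identity = [[1.0, 0.0], [0.0, 1.0]]

# Homomorphism property: B_phi B_psi = B_(phi + psi).
print(max_abs_diff(matmul(boost(0.4), boost(-1.1)), boost(0.4 - 1.1)))

# Rotations have kernel 2 pi Z; the boost map has trivial kernel.
print(max_abs_diff(rotation(2 * math.pi), identity))   # essentially zero
print(max_abs_diff(boost(2 * math.pi), identity))      # far from zero

# Entries of boosts are unbounded, while rotation entries stay in [-1, 1].
print([round(boost(phi)[0][0], 1) for phi in (1, 5, 10)])
```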

To prove these connectedness assertions, first assume $0 < \nu < n$ so that $\mathbf R^n_\nu = \mathbf R^\nu_\nu \times \mathbf R^{n-\nu}$. Write $a \in O_\nu(n)$ in block form as

$$a = \begin{pmatrix} a_T & b \\ c & a_S \end{pmatrix},$$

with $a_T$ $\nu \times \nu$ and $a_S$ $(n - \nu) \times (n - \nu)$. Here the timelike part, $a_T\colon \mathbf R^\nu_\nu \to \mathbf R^\nu_\nu$, is $a | \mathbf R^\nu_\nu$ followed by orthogonal projection on $\mathbf R^\nu_\nu$, and similarly for the spacelike part, $a_S\colon \mathbf R^{n-\nu} \to \mathbf R^{n-\nu}$. Since $a$ is invertible and preserves causal character, it follows readily that both $a_T$ and $a_S$ are invertible.

5. Definition. For $0 < \nu < n$, an element $a \in O_\nu(n)$ preserves [reverses] time-orientation provided $\det a_T > 0$ [$< 0$] and preserves [reverses] space-orientation provided $\det a_S > 0$ [$< 0$].

Thus $O_\nu(n)$ is decomposed into four disjoint sets indexed by the signs of $\det a_T$ and $\det a_S$ (in that order):

$$O_\nu^{++}(n), \quad O_\nu^{+-}(n), \quad O_\nu^{-+}(n), \quad O_\nu^{--}(n).$$

The two determinants are continuous functions of $a$, so the four sets are open, hence also closed, sets of $O_\nu(n)$.

6. Lemma. $O_\nu^{++}(n)$ is connected for all $0 \le \nu \le n$.

Proof. Induction on $n$: For $n = 1$ the result is trivial; for $n = 2$ it is true by Example 4. Assume it is true for some $n \ge 2$. In view of Exercise 1c we can suppose $0 \le \nu \le (n + 1)/2$. (Consider $O_0^{++}(n)$ to be $O^+(n)$.) Given $g \in O_\nu^{++}(n + 1)$, we now construct a curve joining $g$ to the identity element $e$.

Note that there is a natural one-to-one correspondence between $O_\nu(n + 1)$ and the set $F(S^n_\nu)$ of all frames on $S^n_\nu$: A matrix $g$ corresponds to its columns $gu_1, \dots, gu_{n+1}$; the first $n$ of these vectors, by canonical isomorphism, become a frame tangent to $S^n_\nu$ at $gu_{n+1}$. Clearly every frame in $F(S^n_\nu)$ derives thus from a unique element of $O_\nu(n + 1)$.

Since $\nu < n$, $S^n_\nu$ is connected. Move the frame corresponding to $g$ to a frame at $u_{n+1}$, say by parallel translation in the geometry of $S^n_\nu$. Corresponding in $O_\nu(n + 1)$ is a curve $\alpha$ from $g$ to

$$\begin{pmatrix} b & 0 \\ 0 & 1 \end{pmatrix} = B.$$

This curve is continuous (in fact, smooth) and, since the determinant function is continuous, it follows that $\alpha$ remains in $O_\nu^{++}(n + 1)$. But then $b \in O_\nu^{++}(n)$. Hence by inductive hypothesis a further curve joins $b$, hence $B$, to the identity. ∎

In particular, $O^+(n)$ is connected. This group

$$O^+(n) = SO(n) = \{g \in O(n) : \det g = 1\}$$

is thus the identity component of $O(n)$, and the coset

$$O^-(n) = \{g \in O(n) : \det g = -1\}$$

is the other component.

7. Corollary. Let $0 < \nu < n$.
(1) $O^{++} = O_\nu^{++}(n)$ is the identity component of $O_\nu(n)$.
(2) The cosets $O^{++}$, $O^{+-}$, $O^{-+}$, $O^{--}$ of $O^{++}$ are the components of $O_\nu(n)$.
(3) $O^{++} \cup O^{--}$, $O^{++} \cup O^{+-}$, and $O^{++} \cup O^{-+}$ are subgroups of $O_\nu(n)$.

Proof. (1) Since $O^{++}$ is connected and is both open and closed in $O_\nu(n)$, it is the identity component.

(2) Consider the four representative matrices $d(\pm 1, \pm 1) = \operatorname{diag}(\pm 1, 1, \dots, 1, \pm 1)$ in $O_\nu(n)$. Left multiplication by say $d(-1, 1) \in O^{-+}$ is a diffeomorphism of $O_\nu(n)$ that on each matrix $a$ changes the sign of the first row of $a_T$ without changing $a_S$. It follows that $O^{++}$ is carried diffeomorphically onto $O^{-+}$, so the latter is a coset of $O^{++}$ and is connected. The other cases are similar.

(3) One of the representative matrices $d(\pm 1, \pm 1)$ is in each of the four (coset) components. The effect of multiplication and the map $a \to a^{-1}$ on components is thus determined by their effect on the four matrices, hence (3) follows. ∎

Every semiorthogonal matrix has determinant $\pm 1$. In fact, writing the criterion in Lemma 2 as ${}^ta\varepsilon a = \varepsilon$ and taking determinants yields $(\det a)^2 = 1$. Since the determinant function must maintain the same sign on each component of $O_\nu(n)$, the representative matrices show that the special semiorthogonal group $SO_\nu(n) = \{g \in O_\nu(n) : \det g = 1\}$ is $O^{++}(n) \cup O^{--}(n)$. Thus the three subgroups in (3) above consist of the linear isometries $\mathbf R^n_\nu \to \mathbf R^n_\nu$ that preserve orientation, time-orientation, and space-orientation, respectively. (Note that when $\nu = 1$, preservation of time-orientation agrees with the earlier definition requiring timecones to be preserved.) These three choices are not independent; checking signs will show that $a \in O_\nu(n)$ preserves one of the three orientation types if and only if it preserves both or reverses both of the other two types.
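Definition 5 and Corollary 7 suggest a simple way to label the component of a given semiorthogonal matrix: read off the signs of $\det a_T$ and $\det a_S$. The sketch below does this for small matrices, using a cofactor-expansion determinant that is only sensible for the tiny sizes shown; `det` and `component` are hypothetical helper names.

```python
import math


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    n = len(m)
    if n == 0:
        return 1.0
    if n == 1:
        return m[0][0]
    total = 0.0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def component(a, nu):
    """Signs of det a_T and det a_S, in that order, as a two-character string."""
    a_T = [row[:nu] for row in a[:nu]]     # upper-left nu x nu block
    a_S = [row[nu:] for row in a[nu:]]     # lower-right (n - nu) x (n - nu) block
    return ("+" if det(a_T) > 0 else "-") + ("+" if det(a_S) > 0 else "-")


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


phi = 0.6
boost3 = [[math.cosh(phi), 0.0, math.sinh(phi)],
          [0.0, 1.0, 0.0],
          [math.sinh(phi), 0.0, math.cosh(phi)]]      # in O_1(3)
d_minus_plus = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
d_minus_minus = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]

a = matmul(d_minus_minus, boost3)

print(component(boost3, 1))                              # identity component
print(component(matmul(d_minus_plus, boost3), 1))        # time-orientation reversed
print(component(a, 1), det(a))                           # both reversed, det = +1
```

The last line is consistent with $SO_\nu(n) = O^{++} \cup O^{--}$: reversing both time- and space-orientation leaves the determinant equal to $+1$.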


SOME ISOMETRY GROUPS

We compute the isometry groups of pseudospheres, pseudohyperbolic spaces, and semi-Euclidean spaces. The pseudoradius $r$ is irrelevant (Exercise 1), so we take $r = 1$, writing $S^n_\nu$ and $H^n_\nu$ as usual.

8. Proposition. $I(S^n_\nu) = O_\nu(n + 1)$ if $\nu < n$; $I(H^n_\nu) = O_{\nu+1}(n + 1)$ if $\nu > 0$.

Proof. (1) A linear isometry $a\colon \mathbf R^{n+1}_\nu \to \mathbf R^{n+1}_\nu$ carries $S^n_\nu$ to itself, and since $S^n_\nu$ is a semi-Riemannian submanifold, $a | S^n_\nu \in I(S^n_\nu)$. Evidently restriction $a \to a | S^n_\nu$ is a homomorphism, so once it is shown to be one-to-one and onto we write simply $I(S^n_\nu) = O_\nu(n + 1)$.

If $u_1, \dots, u_{n+1}$ is the natural basis for $\mathbf R^{n+1}$, its first $n$ vectors correspond to a frame on $S^n_\nu$ at the point $u_{n+1}$. If $\phi \in I(S^n_\nu)$, the proof of Proposition 4.30 shows there is a unique $a \in O_\nu(n + 1)$ such that $au_{n+1} = \phi u_{n+1}$ and $da(u_i) = d\phi(u_i)$ for $1 \le i \le n$. Since $S^n_\nu$ is connected for $\nu < n$, it follows that $\phi = a | S^n_\nu$.

(2) For $H^n_\nu$, the proof is the same except that $H^n_\nu \subset \mathbf R^{n+1}_{\nu+1}$ and $u_{n+1}$ is replaced by $u_1$. ∎

In particular the orthogonal group $O(n + 1)$ is the isometry group of the sphere $S^n$, and the Lorentz group $O_1(n + 1)$ is the isometry group of the Lorentz sphere $S^n_1$. The proof fails in the nonconnected cases $H^n_0$ and $S^n_n$, but what we want are the isometry groups of their components $H^n$ and $cS^n_n$.

9. Corollary. $I(H^n) = O_1^{++}(n + 1) \cup O_1^{+-}(n + 1)$; $I(cS^n_n) = O_n^{++}(n + 1) \cup O_n^{-+}(n + 1)$.

Proof. Since the hyperbolic space $H^n \subset \mathbf R^{n+1}_1$ is connected, the preceding proof shows that, if $\phi \in I(H^n)$, there is a unique $a \in O_1(n + 1)$ such that $a | H^n = \phi$. Because $\nu = 1$, the matrix $a_T$ is just a nonzero number. Since $a(H^n) \subset H^n$, $a$ carries the timecone of $\mathbf R^{n+1}_1$ containing $H^n$ to itself; thus $a$ preserves time-orientation. Conversely, if $a \in O_1^{++} \cup O_1^{+-}$, then $a | H^n \in I(H^n)$. The other proof is similar. ∎

The linear isometries of $\mathbf R^n_\nu$ form a subgroup $O_\nu(n)$ of its isometry group $I(\mathbf R^n_\nu)$. Also if $x \in \mathbf R^n_\nu$, the translation $T_x$ sending each $v$ to $v + x$ is an isometry. Then $T_x T_y = T_{x+y} = T_y T_x$, and, since $T_0$ is the identity, $(T_x)^{-1} = T_{-x}$. It follows that the set of all translations of $\mathbf R^n_\nu$ is an abelian subgroup of $I(\mathbf R^n_\nu)$ and that this subgroup is isomorphic to $\mathbf R^n$ (under vector addition) via $T_x \leftrightarrow x$.

10. Proposition. Each isometry of $\mathbf R^n_\nu$ has a unique expression as $T_x a$, with $x \in \mathbf R^n_\nu$ and $a \in O_\nu(n)$. Furthermore, $T_x a\,T_y b = T_{x+ay}\,ab$.

Proof. First we show that, if $\phi$ is an isometry of $\mathbf R^n_\nu$ such that $\phi(0) = 0$, then $\phi \in O_\nu(n)$. In fact, the differential map $d\phi_0$ of $\phi$ at $0$ is a linear isometry, hence it corresponds under the canonical linear isometry $T_0(\mathbf R^n_\nu) \approx \mathbf R^n_\nu$ to a linear isometry $a\colon \mathbf R^n_\nu \to \mathbf R^n_\nu$. But then $da_0$ is exactly $d\phi_0$, hence $\phi = a$ by Proposition 3.62.

Now if $\phi \in I(\mathbf R^n_\nu)$, let $x = \phi(0) \in \mathbf R^n_\nu$. Thus $(T_{-x}\phi)(0) = 0$, so by the preceding remark, $T_{-x}\phi$ equals some $a \in O_\nu(n)$. Hence $\phi = T_x a$. If $T_x a = T_y b$, then $x = (T_x a)(0) = (T_y b)(0) = y$, hence also $a = b$. Finally, for all $v$,

$$(aT_y)(v) = a(y + v) = ay + av = (T_{ay}a)(v).$$

Hence $aT_y = T_{ay}a$, and the multiplication rule follows. ∎

The multiplication rule shows that the translations form a normal subgroup of $I(\mathbf R^n_\nu)$. The function $(x, a) \to T_x a$ is one-to-one from $\mathbf R^n \times O_\nu(n)$ onto $I(\mathbf R^n_\nu)$. If $I(\mathbf R^n_\nu)$ is made a manifold so that this function is a diffeomorphism, it is easy to check that $I(\mathbf R^n_\nu)$ is a Lie group. Then

$$\dim I(\mathbf R^n_\nu) = \dim \mathbf R^n + \dim O_\nu(n) = n(n + 1)/2.$$

Like $O_\nu(n)$, $I(\mathbf R^n_\nu)$ has four components if $0 < \nu < n$, reducing to two for $\nu = 0, n$. $I(\mathbf R^n_\nu)$ is called a semi-Euclidean group, $I(\mathbf R^n)$ a Euclidean group, and, if $n \ge 2$, $I(\mathbf R^n_1)$ is the Poincaré group or inhomogeneous Lorentz group.
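The multiplication rule of Proposition 10 can be modeled directly by storing an isometry $T_x a$ as the pair $(x, a)$. The sketch below is a minimal illustration with hypothetical helper names (`apply_isometry`, `compose`); it checks on a sample vector that composing pairs by $(x, a)(y, b) = (x + ay, ab)$ agrees with composing the maps themselves.

```python
import math


def matvec(a, v):
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apply_isometry(iso, v):
    """Apply T_x a to v, that is, v -> x + a v."""
    x, a = iso
    return [xi + wi for xi, wi in zip(x, matvec(a, v))]


def compose(f, g):
    """Multiplication rule: (T_x a)(T_y b) = T_(x + a y) (a b)."""
    (x, a), (y, b) = f, g
    return ([xi + wi for xi, wi in zip(x, matvec(a, y))], matmul(a, b))


phi = 0.5
boost = [[math.cosh(phi), math.sinh(phi)],
         [math.sinh(phi), math.cosh(phi)]]
reflection = [[1.0, 0.0], [0.0, -1.0]]      # also an element of O_1(2)

f = ([1.0, -2.0], boost)                    # T_x a
g = ([0.5, 3.0], reflection)                # T_y b
v = [0.25, -0.75]

print(apply_isometry(compose(f, g), v))
print(apply_isometry(f, apply_isometry(g, v)))   # the same point

n = 4
print(n + n * (n - 1) // 2, n * (n + 1) // 2)    # dimension count for I(R^n_nu)
```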

TIME-ORIENTABILITY AND SPACE-ORIENTABILITY

For a semi-Riemannian manifold $M$ with indefinite metric we generalize the Lorentz notion of time-orientability and define a complementary notion of space-orientability. The pattern for both is the same as for ordinary orientability; the reason for this is as follows. The manifold description of orientability in Chapter 7 rests solely on the fact that in the full linear group $GL(n, \mathbf R)$ the matrices with positive determinant form a subgroup of index 2 (one other coset) that is open, hence closed, in $GL(n, \mathbf R)$ (or, in the special case of Riemannian manifolds, that $SO(n)$ is an open subgroup of index 2 in the orthogonal group $O(n)$). But for $0 < \nu < n$, all three of the subgroups in Corollary 7(3) are open and have index 2. Thus we can give a common treatment, denoting any one of these groups (for arbitrary $n$) by $G$. So henceforth G-orientation should be read as

orientation if $G = SO_\nu(n) = O_\nu^{++}(n) \cup O_\nu^{--}(n)$,
time-orientation if $G = O_\nu^{++}(n) \cup O_\nu^{+-}(n)$,
space-orientation if $G = O_\nu^{++}(n) \cup O_\nu^{-+}(n)$.

Formally the results are valid for all $\nu$, but if $\nu = 0$, time-orientation is undefined and space-orientation becomes orientation (the reverse for $\nu = n$).

Let $V$ be a scalar product space. If $e = (e_1, \dots, e_n)$ and $f = (f_1, \dots, f_n)$ are orthonormal bases for $V$, then $f_j = \sum_i a_{ij}e_i$ $(1 \le j \le n)$ defines a matrix $a = (a_{ij})$ in $O_\nu(n)$. Then $e$ and $f$ are G-equivalent provided $a \in G \subset O_\nu(n)$. Because $G$ is a subgroup of index 2, G-equivalence is an equivalence relation with exactly two equivalence classes, called the G-orientations of $V$. To pick one is to G-orient $V$. The G-orientation containing a given orthonormal basis is denoted by $G(e_1, \dots, e_n)$. If $T\colon V \to W$ is a linear isometry, then

$$G(e_1, \dots, e_n) \to G(Te_1, \dots, Te_n)$$

is a well-defined one-to-one function $T_G$ from the two orientations of $V$ to those of $W$.

A G-orientation of a semi-Riemannian manifold $M$ is a function $\lambda$ that assigns to each $p \in M$ a G-orientation of $T_p(M)$ and is smooth in this sense: For each $p \in M$, there is a coordinate system $\xi$ whose induced local G-orientation $G(\partial_1, \dots, \partial_n)$ agrees with $\lambda$ on some neighborhood of $p$. $M$ is G-orientable provided it admits a G-orientation. Then just as before,

(1) A connected G-orientable manifold $M$ has exactly two G-orientations $\pm\lambda$ (these assigning opposite G-orientations to each tangent space).
(2) A local isometry $\phi\colon M \to N$ is said to preserve [reverse] G-orientations $\lambda_M$ and $\lambda_N$ provided $d\phi_G(\lambda_M(p)) = \lambda_N(\phi p)$ [$= -\lambda_N(\phi p)$] for all $p \in M$.
(3) If $\phi\colon M \to N$ is a local isometry and $\lambda$ is a G-orientation of $N$, there is a unique G-orientation $\phi^*(\lambda)$ of $M$ making $\phi$ G-orientation preserving.
(4) A semi-Riemannian manifold $M$ has a double covering $k\colon M_G \to M$, called the G-orientation covering of $M$, such that every (local or global) G-orientation of $M$ is a smooth section.

$M_G$ is always G-orientable, and $M$ is G-orientable if and only if $k\colon M_G \to M$ is trivial. Since this is a double covering, if $M$ is connected, the latter condition is equivalent to $M_G$ not connected. If $M$ is connected and G-orientable, it is meaningful to say that an isometry $\phi\colon M \to M$ preserves (or reverses) G-orientation provided we agree to use the same G-orientation on $M$ as both domain and range of $\phi$.


11. Proposition. If $k\colon M \to M/\Gamma$ is a connected semi-Riemannian covering, then $M/\Gamma$ is G-orientable if and only if $M$ is G-orientable and each $\phi \in \Gamma$ preserves G-orientation.

Proof. If $\lambda$ is a G-orientation of $M/\Gamma$, then $k^*(\lambda)$ is a G-orientation of $M$. If $\phi \in \Gamma$, then $\phi^*k^*(\lambda) = (k\phi)^*(\lambda) = k^*(\lambda)$, so $\phi$ preserves G-orientation.

Conversely, let $\mu$ be a G-orientation of $M$. If $q \in M/\Gamma$, we assert that the G-orientation $dk_G(\mu(p))$ is independent of the choice of $p$ in $k^{-1}(q)$. In fact any two such points, $p$ and $p'$, are in the same $\Gamma$ orbit, hence there is a $\phi \in \Gamma$ such that $\phi(p) = p'$. Since $\phi$ by hypothesis preserves G-orientation,

$$dk_G(\mu(p')) = dk_G(d\phi_G(\mu(p))) = d(k\phi)_G(\mu(p)) = dk_G(\mu(p)).$$

Thus $\lambda(q) = dk_G(\mu(p))$ for all $p \in k^{-1}(q)$ is a valid definition, and using local sections, one can check that $\lambda$ is smooth, hence is a G-orientation of $M/\Gamma$. ∎

Let $k\colon M_G \to M$ be the G-orientation covering of $M$. Since $k^{-1}(p)$, $p \in M$, consists of the two G-orientations of $T_p(M)$, a lift $\tilde\alpha$ of $\alpha$ into $M_G$ amounts to moving a G-orientation of tangent spaces along $\alpha$. (See Exercises 16 and 24.)

The natural G-orientation of $\mathbf R^n_\nu$ as a scalar product space is the one containing its natural basis, and the natural G-orientation of $\mathbf R^n_\nu$ as a semi-Riemannian manifold is the one induced by its natural coordinate system. Thus each canonical isomorphism $T_p(\mathbf R^n_\nu) \approx \mathbf R^n_\nu$ is G-orientation preserving. It follows that $g \in O_\nu(n)$, considered as an isometry of $\mathbf R^n_\nu$, preserves G-orientation if and only if $g \in G$.

Pseudospheres and pseudohyperbolic spaces are G-orientable whenever this is meaningful. Furthermore for their isometry groups, as in Proposition 8 and Corollary 9, an isometry $g | Q \in I(Q)$ preserves orientation, time-orientation, space-orientation, respectively, if and only if $g$ itself does.

LINEAR ALGEBRA

We consider briefly some basic facts about linear operators $T$ on a scalar product space $V$:

(1) Invariant subspaces. A subspace $W$ of $V$ is invariant under $T$ provided $T(W) \subset W$. If $W$ and $W'$ are invariant, so are $W \cap W'$ and $W + W'$. Since the scalars are real, $T$ has an invariant subspace of dimension 1 or 2. Proof: The characteristic polynomial $p(x)$ of $T$ can be factored into a product of linear and quadratic factors. Substituting $T$ for $x$ gives $p(T) = 0$. Hence some factor is singular, and the result follows.


(2) Self-adjoint and semiorthogonal operators. Let $V$ be a scalar product space of dimension $n$ and index $\nu$, so $V \approx \mathbf{R}^n_\nu$. If $S$ is a self-adjoint operator on $V$ (that is, $\langle Sv, w\rangle = \langle v, Sw\rangle$ for all $v, w$), then the matrix $s$ of $S$ relative to an orthonormal basis satisfies ${}^t s = \varepsilon s \varepsilon$, where $\varepsilon$ is the $(\nu, n-\nu)$ signature matrix. Thus $s$ has the form

$$s = \begin{pmatrix} a & x \\ -{}^t x & b \end{pmatrix},$$

where $a$ and $b$ are symmetric matrices of size $\nu \times \nu$ and $(n-\nu) \times (n-\nu)$, respectively, while $x$ is an arbitrary $\nu \times (n-\nu)$ matrix. If $A$ is semiorthogonal, that is, preserves the scalar product, then relative to an orthonormal basis the matrix of $A$ is an element of $O_\nu(n)$. In fact this association is evidently an isomorphism from the group $O(V)$ of all such $A$ to $O_\nu(n)$.

(3) Perp operation. Let $T$ be either self-adjoint or semiorthogonal. If $W$ is invariant under $T$, then so is $W^\perp$. For example, if $T$ is semiorthogonal, $v \perp W$ implies $Tv \perp T(W)$. But since $T$ is invertible, $T(W) = W$.

(4) The definite case. Let $V$ be an inner product space. A self-adjoint operator $S$ on $V$ is often called symmetric. It is well known that $S$ then has an orthonormal basis of eigenvectors, hence the resulting matrix is diagonal. If $A$ is orthogonal, the facts above lead easily to another standard result: For a suitable orthonormal basis, $A$ has a matrix with $\pm 1$'s and elements of $O(2)$ along its main diagonal (zeros elsewhere). In particular, $A$ need not have an eigenvector if $n$ is even. (Every operator has an eigenvector if $n$ is odd.)

(5) The Lorentz case. Let $V$ be a Lorentz vector space. If $S$ is self-adjoint, then low dimensions are somewhat irregular, but it turns out that $S$ can be diagonalized outside a subspace of dimension $\le 3$. By contrast with the definite case, a semiorthogonal operator $A$ always has an eigenvector. See Exercises 18-21.

SPACE FORMS¹

If $M$ is a space form, then so is its simply connected semi-Riemannian covering manifold $\tilde M$, and we saw in Chapter 7 that $M$ can be expressed as $\tilde M/\Gamma$, where $\Gamma$ is a properly discontinuous group of isometries of $\tilde M$. What is important here is not the particular group $\Gamma$ but its conjugacy class in $I(\tilde M)$. Recall that for a group $G$, subgroups $H$ and $H'$ are conjugate provided there is a $g \in G$ such that $H' = gHg^{-1}$. Evidently conjugacy is an equivalence relation.

¹ This section derives from Chapter 11 of [Wo], which should be consulted for references and further developments.


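12. Proposition. If $M$ is a simply connected semi-Riemannian manifold, then orbit manifolds $M/\Gamma$ and $M/\bar\Gamma$ are isometric if and only if $\Gamma$ and $\bar\Gamma$ are conjugate in $I(M)$.

Proof. (1) Suppose $\bar\Gamma = \phi\Gamma\phi^{-1}$ for $\phi \in I(M)$. If $p \in M$, then $\phi\Gamma p = \phi\Gamma\phi^{-1}\phi p = \bar\Gamma\phi p$; that is, $\phi$ carries the $\Gamma$ orbit of $p$ to the $\bar\Gamma$ orbit of $\phi p$. Since these orbits are points of $M/\Gamma$ and $M/\bar\Gamma$, this amounts to a function $\bar\phi: M/\Gamma \to M/\bar\Gamma$ such that $\bar\phi \circ k = \bar k \circ \phi$, where $k$ and $\bar k$ are the projections. On any open set $\mathcal{U}$ of $M/\Gamma$ evenly covered by $k$, there is a local section $\lambda: \mathcal{U} \to M$, and $\lambda$ is an isometry onto $\lambda(\mathcal{U})$. But then $\bar\phi|\mathcal{U} = \bar k \circ \phi \circ \lambda$, so $\bar\phi$ is at least a local isometry. Repeating the above with $\phi$ replaced by $\phi^{-1}$ shows that $\bar\phi$ is an isometry.

(2) Conversely, let $\psi: M/\Gamma \to M/\bar\Gamma$ be an isometry. Since $M$ is simply connected, by Proposition A.11 there is a lift $\tilde\psi$ of $\psi \circ k$ through $\bar k$; thus $\bar k \circ \tilde\psi = \psi \circ k$. Lifting $\psi^{-1}$ shows that $\tilde\psi \in I(M)$. If $\phi \in \Gamma$, then $\tilde\psi\phi\tilde\psi^{-1}$ is a deck transformation, since

$$\bar k \circ \tilde\psi\phi\tilde\psi^{-1} = \psi \circ k \circ \phi \circ \tilde\psi^{-1} = \psi \circ k \circ \tilde\psi^{-1} = \bar k.$$

Then Corollary 7.12 gives $\tilde\psi\phi\tilde\psi^{-1} \in \bar\Gamma$; that is, $\tilde\psi\Gamma\tilde\psi^{-1} \subset \bar\Gamma$. Symmetrically, $\tilde\psi^{-1}\bar\Gamma\tilde\psi \subset \Gamma$, and it follows that $\Gamma$ and $\bar\Gamma$ are conjugate. ∎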

We shall consider only space forms whose fundamental group is finite. Recall that a group is torsion-free if the subgroup generated by each element except the identity is infinite. Thus a torsion-free group is finite only if it is trivial, that is, consists solely of the identity; so the following result excludes many space forms from the finite class.

13. Proposition. Let $M^n$ be a space form of curvature $C$ and index $\nu$. Then $\pi_1(M)$ is torsion-free if (a) $C = 0$, (b) $C > 0$ and $\nu = n-1, n$, or (c) $C < 0$ and $\nu = 0, 1$.

Proof. Write $M = \tilde M/\Gamma$ as above. Since $\tilde M$ is simply connected, Proposition 7.4 gives $\pi_1(M) \approx \Gamma$. Thus it suffices to show that every finite subgroup $F$ of $\Gamma$ is trivial. On page 228, we saw that the cases in this proposition are exactly those for which $\tilde M$ is diffeomorphic to $\mathbf{R}^n$. A theorem of P. Smith asserts any finite group of diffeomorphisms of $\mathbf{R}^n$ has a common fixed point. Since $F$ consists of deck transformations, it follows that $F$ is trivial. ∎

Geometric proofs for certain of these cases are indicated in Exercise 5. A space form with curvature $C$ is positive if $C > 0$, negative if $C < 0$. We shall consider the positive case; reversing the metric gives the corresponding result in the negative case, the index $\nu$ changing to $n - \nu$. As before it can be assumed that $C = 1$.


14. Proposition. If $M^n$, $n \ge 3$, is a positive space form with index $\nu \le n/2$ (or a negative space form with $\nu \ge n/2$), then the fundamental group of $M$ is finite.

Proof. Since $n \ge 3$ and $2\nu \le n$ imply $\nu \le n - 2$, it follows from Corollary 8.24 that the simply connected semi-Riemannian covering manifold $\tilde M$ of $M$ is (isometric to) a pseudosphere $S^n_\nu$. Thus we can take $M$ to be $S^n_\nu/\Gamma$, where $\Gamma$ is a properly discontinuous subgroup of $I(S^n_\nu) = O_\nu(n+1)$. For $\nu = 0$, $\pi_1(M) \approx \Gamma$ is finite, since $S^n$ is compact hence Exercise 7.6 applies. But if $\nu$ is positive, the benefits of compactness can be gotten indirectly as follows.

Let $W \approx \mathbf{R}^{n+1-\nu}$ be the spacelike coordinate hyperplane in $\mathbf{R}^{n+1}_\nu$. Then $W \cap S^n_\nu$ is a sphere $S^{n-\nu}$ in $W$. We assert that for every $\phi \in \Gamma$ there is a point $p \in S^{n-\nu}$ such that $\phi(p) \in S^{n-\nu}$. Extended to an element of $O_\nu(n+1)$, $\phi$ is linear, hence

$$\dim(W \cap \phi W) + \dim(W + \phi W) = \dim W + \dim \phi(W).$$

It follows that $\dim(W \cap \phi W) \ge n + 1 - 2\nu$, which is positive since $2\nu \le n$. But any unit vector in $W \cap \phi W$ is a point of $S^{n-\nu}$ that is the $\phi$-image of a point of $S^{n-\nu}$.

Assume $\Gamma$ is infinite. Then there is an infinite sequence $\{\phi_i\}$ of distinct elements of $\Gamma$ and a sequence $\{p_i\}$ in $S^{n-\nu}$ such that each $\phi_i(p_i)$ is also in $S^{n-\nu}$. Since $S^{n-\nu}$ is compact, by passing if necessary to subsequences we can suppose that $\{p_i\}$ and $\{\phi_i(p_i)\}$ converge to points $p$ and $q$ in $S^{n-\nu}$. This will contradict the proper discontinuity of $\Gamma$ (Definition 7.6):

Case 1. $q \notin \Gamma p$. The convergent sequences above contradict (PD2).

Case 2. $q \in \Gamma p$. Let $\psi \in \Gamma$ send $p$ to $q$. The sequence $\{\psi^{-1}\phi_i(p_i)\}$ converges to $\psi^{-1}(q) = p$. But then $\{p_i\} \to p$ contradicts (PD1). ∎

In the omitted dimension 2, the result holds for $\nu = 0$ since $S^2$ is compact and simply connected, but not for $\nu = 1$ since, for example, $S^2_1$ has infinite fundamental group.

15. Lemma. Every finite subgroup $\Gamma$ of $O_\nu(n)$ is conjugate to a subgroup of $O(\nu) \times O(n - \nu)$.

Proof. (a) For $v, w \in \mathbf{R}^n$ define

$$b(v, w) = \sum_{g \in \Gamma} gv \cdot gw.$$

Then $b$ is an inner product on $\mathbf{R}^n$ and is preserved by $\Gamma$, since, if $h \in \Gamma$, then

$$b(hv, hw) = \sum_{g \in \Gamma} ghv \cdot ghw = \sum_{g' \in \Gamma} g'v \cdot g'w = b(v, w).$$
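The averaging step in part (a) can be tried numerically. The sketch below is only an illustration under stated assumptions: the group is a hypothetical example (a rotation of order 3 in $O_1(3)$ conjugated by a boost, so that it is not already orthogonal), and all function names are invented for this sketch. It forms the matrix $\sum_g {}^t g\, g$ of $b$ and leaves the invariance check to the reader or to a test.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def close(a, b, tol=1e-9):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
EPS = [[-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # signature matrix, index 1

def boost(chi):
    """Boost in the first two coordinates; boost(-chi) is its inverse."""
    c, s = math.cosh(chi), math.sinh(chi)
    return [[c, s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rotation(theta):
    """Rotation of the spacelike coordinate plane."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

# Assumed example: g = L R L^{-1} has order 3, lies in O_1(3), but is not orthogonal.
g = matmul(matmul(boost(0.7), rotation(2 * math.pi / 3)), boost(-0.7))
group = [IDENTITY, g, matmul(g, g)]

def averaged_form(elements):
    """Matrix of b(v, w) = sum over the group of (gv . gw), i.e. the sum of g^T g."""
    n = len(elements[0])
    total = [[0.0] * n for _ in range(n)]
    for h in elements:
        hth = matmul(transpose(h), h)
        for i in range(n):
            for j in range(n):
                total[i][j] += hth[i][j]
    return total

b_matrix = averaged_form(group)
```

Each summand ${}^t g\, g$ is positive definite, so the sum is too; invariance under the group is exactly the reindexing $g' = gh$ used in the proof.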


(b) There is an inner product $\beta$ on $\mathbf{R}^n_\nu$ preserved by $\Gamma$, and a basis $e_1, \ldots, e_n$ for $\mathbf{R}^n_\nu$ that is orthonormal relative to both $\beta$ and the usual scalar product of $\mathbf{R}^n_\nu$.

There is a linear operator $S$ on $\mathbf{R}^n_\nu$ such that $\langle v, w\rangle = b(Sv, w)$ for all $v, w$. $S$ is symmetric relative to $b$, hence $\mathbf{R}^n_\nu$ is the direct sum of the eigenspaces $\mathcal{N}_r$ of the distinct eigenvalues $\lambda_r$ of $S$. These subspaces are mutually orthogonal relative to $b$, hence to $\langle\ ,\ \rangle$. Note that $\lambda_r \ne 0$ since $S$ is invertible.

We assert that $g(\mathcal{N}_r) \subset \mathcal{N}_r$ for all $g \in \Gamma$ and all $r$. If $v \in \mathcal{N}_r$, then $\langle v, w\rangle = \lambda_r b(v, w)$ for all $w$, hence $\langle gv, gw\rangle = \lambda_r b(gv, gw)$ for all $w$. Since $g$ is onto, we can take $gw$ to be any vector $x \in \mathcal{N}_s$, $s \ne r$. Then

$$\lambda_s b(gv, x) = \langle gv, x\rangle = \lambda_r b(gv, x).$$

Hence $b(gv, x) = 0$ for all such $x$, showing that $gv \in \mathcal{N}_r$.

Write $v_r$ for the component of $v$ in $\mathcal{N}_r$. Then $b(v, w) = \sum_r b(v_r, w_r)$. Define $\beta(v, w) = \sum_r |\lambda_r|\, b(v_r, w_r)$. It is easy to verify that $\beta$ is an inner product on $\mathbf{R}^n$ preserved by $\Gamma$. Furthermore, if $e_1, \ldots, e_n$ is a $\langle\ ,\ \rangle$-orthonormal basis for $\mathbf{R}^n$ adapted to the subspaces $\mathcal{N}_r$, then it is also $\beta$-orthonormal, since $\beta(e_i, e_j) = 0$ for $i \ne j$ and, for $e_i \in \mathcal{N}_r$,

$$\beta(e_i, e_i) = |\lambda_r|\, b(e_i, e_i) = (|\lambda_r|/\lambda_r)\langle e_i, e_i\rangle = +1.$$

(c) We can now prove the result. Let $a \in O_\nu(n)$ send the natural basis $u_1, \ldots, u_n$ to $e_1, \ldots, e_n$. If $g \in \Gamma \subset O_\nu(n)$, then since $O(n) \cap O_\nu(n) = O(\nu) \times O(n - \nu)$, it suffices to show that $a^{-1}ga \in O(n)$. Because $g$ preserves $\beta$, its matrix relative to $e_1, \ldots, e_n$ is orthogonal. But this is just the matrix of $a^{-1}ga$ relative to $u_1, \ldots, u_n$, hence $a^{-1}ga \in O(n)$. ∎

By Proposition 12 and Lemma 15, every positive space form with $\nu \le n - 2$ and finite fundamental group can be expressed as $S^n_\nu/\Gamma$ with $\Gamma \subset O(\nu) \times O(n - \nu + 1)$. Writing $\mathbf{R}^{n+1}_\nu = \mathbf{R}^\nu_\nu \times \mathbf{R}^{n-\nu+1}$ as usual gives

$$S^n_\nu = \{(p, x) : x \cdot x = 1 + p \cdot p\},$$

and $(a, b) \in \Gamma$ acts on $S^n_\nu$ by $(a, b)(p, x) = (ap, bx)$. In particular, if $(0, x) = x \in S^{n-\nu} = S^n_\nu \cap \mathbf{R}^{n-\nu+1}$, then $(a, b)(0, x) = (0, bx)$.

Let $B$ be the set of all $b \in O(n - \nu + 1)$ such that $(a, b) \in \Gamma$ for some $a \in O(\nu)$. For each $b$ this $a$ is unique, for if $(a, b)$ and $(a', b)$ are in $\Gamma$, then so is $(a, b)^{-1}(a', b) = (a^{-1}a', e)$. But this deck transformation fixes each point $(0, x)$ of $S^{n-\nu} \subset S^n_\nu$, hence is the identity; that is, $a = a'$. Consequently,

(1) $B$ is a subgroup of $O(n - \nu + 1)$ isomorphic to $\Gamma$ under $(a, b) \to b$, and
(2) the map sending each $b \in B$ to the element $a \in O(\nu)$ such that $(a, b) \in \Gamma$ is a homomorphism (whose graph is $\Gamma$).

Using these facts we can show that in about half the cases in Proposition 14 the fundamental group is very simple indeed.


16. Proposition. Let $M$ be a positive space form with finite fundamental group. If $n - \nu$ is even, then either $M$ is simply connected or $\pi_1(M) \approx \mathbf{Z}_2$. In the latter case, $M$ is not space-orientable.

Note. The hypotheses hold for all positive space forms with $2\nu \le n$ and $n - \nu$ even, since the latter eliminates the exceptional case $n = 2$, $\nu = 1$ following Proposition 14.

Proof. By Proposition 13, we can assume $\nu \le n - 2$; hence $M = S^n_\nu/\Gamma$ with $\Gamma \subset O(\nu) \times O(n - \nu + 1)$. The following argument is valid even in the Riemannian case $\nu = 0$, where $\mathbf{R}^\nu$ is trivial and $O(\nu) = \{1\}$.

Let $(a, b) \in \Gamma$. Since $n - \nu + 1$ is odd, the characteristic polynomial of $b$ has a real root, that is, $b$ has an eigenvector $x \in S^{n-\nu}$. Furthermore $bx = \pm x$, since $|bx| = |x|$. If $bx = x$, then $(0, x) \in S^n_\nu$ is a fixed point of the deck transformation $(a, b)$; hence $a$ and $b$ are identity matrices.

If $bx = -x$, then $b^2 x = x$; hence both $a^2$ and $b^2$ are identities. Then since $a$ and $b$ are orthogonal they are also symmetric, because $a = a^{-1} = {}^t a$. Thus the inner product spaces $\mathbf{R}^\nu$ and $\mathbf{R}^{n-\nu+1}$ have orthonormal bases composed of eigenvectors of $a$ and $b$, respectively. As above, all eigenvalues are $\pm 1$. If one of the eigenvalues of $b$ is $+1$, then we have seen that $(a, b)$ has a fixed point and hence is the identity. Thus if $(a, b)$ is not the identity, $b$ is $-e$. Since there is only one $a$ such that $(a, -e) \in \Gamma$, we conclude that, if $M$ is not simply connected, then $\Gamma$ is a two-element group whose nontrivial element is $(a, -e)$, where $a$ is a matrix with $\pm 1$'s on the diagonal, zeros elsewhere.

In this case, since $n - \nu + 1$ is odd, $\det(-e) = -1$. Thus $(a, -e)$ reverses space-orientation on $\mathbf{R}^{n+1}_\nu$ and hence also on $S^n_\nu$. Then by Proposition 11, $M$ is not space-orientable. ∎

The proof actually shows more. Conjugation of the $\nu \times \nu$ matrix $a = \mathrm{diag}(\pm 1, \ldots, \pm 1)$ by a suitable permutation matrix will rearrange it so that the $-1$'s (if any) come first. Thus by Proposition 12 there are just $\nu + 1$ cases for $M$ not simply connected, and these are orientable $\Leftrightarrow$ not time-orientable $\Leftrightarrow$ the number of $-1$'s is odd.

17. Corollary. (1) An even-dimensional Riemannian space form with curvature $C = 1$ is isometric to either $S^{2k}$ or $P^{2k} = S^{2k}/\{\pm 1\}$, the latter nonorientable.
(2) An odd-dimensional Lorentz space form with curvature $C = 1$ is isometric to one of the following:
$S^n_1$: orientable, time-orientable, space-orientable.
$S^n_1/\{\pm 1\}$: orientable, but not time- or space-orientable.
$S^n_1/\Gamma$, where the unique nontrivial element of $\Gamma$ is $\mathrm{diag}(1, -1, \ldots, -1)$: time-orientable but not orientable or space-orientable.
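The orientability bookkeeping behind Proposition 16 and this corollary can be tabulated mechanically. The following Python sketch assumes the usual determinant criteria for an element $(a, b)$ of $O(\nu) \times O(n - \nu + 1)$ (it preserves time-orientation when $\det a > 0$, space-orientation when $\det b > 0$, and orientation when $\det a \det b > 0$), which are taken for granted here rather than derived. The function name and dictionary keys are hypothetical.

```python
def non_simply_connected_cases(n, nu):
    """Enumerate the nu + 1 candidates (a, -e) from the proof of Proposition 16.

    a = diag(-1, ..., -1, 1, ..., 1) has j entries equal to -1, and -e acts on
    the spacelike factor of dimension m = n - nu + 1, which must be odd.
    """
    m = n - nu + 1
    if m % 2 == 0:
        raise ValueError("n - nu must be even")
    cases = []
    for j in range(nu + 1):
        det_a = (-1) ** j   # determinant of the timelike block a
        det_b = (-1) ** m   # determinant of -e; equals -1 because m is odd
        cases.append({
            "minus_ones": j,
            "orientable": det_a * det_b > 0,
            "time_orientable": det_a > 0,
            "space_orientable": det_b > 0,
        })
    return cases
```

For instance, `non_simply_connected_cases(3, 1)` lists the two non-simply-connected Lorentz quotients of part (2), and `non_simply_connected_cases(4, 0)` lists the single Riemannian quotient of part (1).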


By contrast with the situation in Proposition 16 there are many positive space forms with finite fundamental group but with $n - \nu$ odd.

18. Example. For any $k \ge 2$, let $R_k \in O(2)$ be a rotation through angle $2\pi/k$. For any $n$ and $\nu$ such that $n - \nu$ is odd and $\ge 3$, let $a$ be the element of $O(n - \nu + 1)$ with $(n - \nu + 1)/2$ copies of $R_k$ along its main diagonal. Then $(-e, a) \in O_\nu(n + 1)$ generates a cyclic group $\Gamma$ of order $k$ or $2k$. Then $S^n_\nu/\Gamma$ is a positive space form with fundamental group $\Gamma$.
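The phrase "order $k$ or $2k$" can be checked by direct computation. The sketch below builds the block-diagonal matrix with $\nu$ entries $-1$ followed by copies of $R_k$ and finds its order by repeated multiplication. It only illustrates the order of the generator and makes no claim about how the group acts on the pseudosphere; the function names and the numerical tolerance are assumptions of this sketch.

```python
import math

def identity(size):
    return [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]

def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-9):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def generator(n, nu, k):
    """Matrix of (-e, a): nu diagonal entries -1, then (n - nu + 1)/2 copies of R_k."""
    m = n - nu + 1
    if m % 2 != 0 or m < 4:
        raise ValueError("n - nu must be odd and at least 3")
    g = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(nu):
        g[i][i] = -1.0
    c, s = math.cos(2 * math.pi / k), math.sin(2 * math.pi / k)
    for block in range(m // 2):
        r = nu + 2 * block
        g[r][r], g[r][r + 1] = c, -s
        g[r + 1][r], g[r + 1][r + 1] = s, c
    return g

def order(g, max_order):
    """Smallest j <= max_order with g^j equal to the identity, or None."""
    power = g
    for j in range(1, max_order + 1):
        if close(power, identity(len(g))):
            return j
        power = matmul(power, g)
    return None
```

With $\nu \ge 1$ the order comes out as $k$ when $k$ is even and $2k$ when $k$ is odd, since the $-e$ block has order 2 and the rotation block has order $k$.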

The general problem of classifying all space forms is a difficult one. For Riemannian space forms the positive case has been solved by Wolf [Wo] and much is known about the flat case [Wo, KN].

Every complete Riemannian manifold is geodesically connected, but the space forms $S^n_\nu$ and $H^n_\nu$ ($0 < \nu < n$) are not. The following result will show that in the positive Lorentz case, geodesic connectedness is incompatible with time-orientability. Note that for a semi-Riemannian covering $k: M \to M/\Gamma$, the lift and projection properties of geodesics show that $M/\Gamma$ is geodesically connected if and only if for any $p, q \in M$ there is a geodesic joining $p$ to some point of the orbit $\Gamma q$.

19. Proposition. A positive Lorentz space form $M^n$, $n \ge 3$, is geodesically connected if and only if it is not time-orientable.

Proof. For $n \ge 3$, $S^n_1$ is simply connected, hence time-orientable, but not geodesically connected. Thus we can suppose that $M = S^n_1/\Gamma$, with $\Gamma$ finite (by Proposition 14) but not trivial.

Suppose first that $M$ is not time-orientable; we must show that given any points $p, q \in S^n_1$ there exists a geodesic from $p$ to $\Gamma q$. Let $q^* = \sum_{g \in \Gamma} g(q) \in \mathbf{R}^{n+1}_1$. Then $q^*$ is a common fixed point of every $g \in \Gamma$. If $q^*$ is timelike or null, no element of $\Gamma$ can reverse the causal cones of $\mathbf{R}^{n+1}_1$. This implies that $\Gamma$ preserves time-orientation: a contradiction. If $q^*$ is spacelike and nonzero, then dividing by its norm gives a point of $S^n_1$ fixed by the deck transformation group $\Gamma$; thus $\Gamma$ consists only of its identity element: a contradiction. Hence $q^*$ must be 0; so $0 = \langle p, q^*\rangle = \sum_{g \in \Gamma} \langle p, g(q)\rangle$. Thus for some $g \in \Gamma$ we have $\langle p, g(q)\rangle \ge 0$. By Proposition 5.38 there is a geodesic from $p$ to $g(q)$.

Now suppose $M$ is time-orientable. Since $\Gamma$ is finite, we may suppose $\Gamma \subset O(1) \times O(n)$; by time-orientability $\Gamma \subset 1 \times O(n)$. Thus the spacelike coordinate hyperplane $\mathbf{R}^n$ is invariant under $\Gamma$, and $gu_1 = u_1$ for all $g \in \Gamma$. There is a point $p \in S^{n-1} = S^n_1 \cap \mathbf{R}^n$ that is not in the (finite) orbit $\Gamma u_{n+1}$. Thus there is an $\varepsilon > 0$ such that $\langle p, g(u_{n+1})\rangle < 1 - \varepsilon$ for all $g \in \Gamma$. We lift this data homothetically to an arbitrary plane $u^1$ const, as follows. Writing $S = \sinh t$ and $C = \cosh t$, let

$$p_t = Su_1 + Cp, \qquad q_t = Su_1 + Cu_{n+1}.$$

For all $g \in \Gamma$, $gu_1 = u_1$ and $u_1 \perp p,\ g(u_{n+1})$. Hence

$$\langle p_t, g(q_t)\rangle = \langle Su_1 + Cp,\ Su_1 + Cg(u_{n+1})\rangle = -S^2 + C^2\langle p, g(u_{n+1})\rangle < -S^2 + C^2 - C^2\varepsilon = 1 - C^2\varepsilon.$$

Thus for $t$ sufficiently large, $\langle p_t, g(q_t)\rangle < -1$. By Proposition 5.38, there exists no geodesic from $p_t$ to $\Gamma q_t$, showing that $M$ is not geodesically connected. ∎

Let $M$ be a semi-Riemannian manifold with isometry group $I(M)$. If $M$ is simply connected, the isometry group of any $M/\Gamma$ can be characterized in a purely algebraic way. Recall that, for a subgroup $H$ of a group $G$, the normalizer $N(H)$ in $G$ is the set of all $g \in G$ such that $gHg^{-1} = H$. It is easy to check that $N(H)$ is a subgroup of $G$ containing $H$; in fact it is the largest subgroup in which $H$ is a normal subgroup, so in particular the quotient group $N(H)/H$ is well defined.

20. Proposition. Let $\Gamma$ be a properly discontinuous group of isometries of a simply connected semi-Riemannian manifold $M$. The isometry group $I(M/\Gamma)$ of $M/\Gamma$ is isomorphic to $N(\Gamma)/\Gamma$, where $N(\Gamma)$ is the normalizer of $\Gamma$ in $I(M)$.

Proof. That $\phi \in N(\Gamma)$ means $\phi\Gamma\phi^{-1} = \Gamma$; hence $\phi\Gamma p = \Gamma\phi p$ for all $p \in M$. Thus there is a unique function $\bar\phi: M/\Gamma \to M/\Gamma$ such that $\bar\phi \circ k = k \circ \phi$, where $k: M \to M/\Gamma$ is the natural projection. Local cross sections show that $\bar\phi$ is smooth, and $\bar\phi$ is a diffeomorphism since $\overline{(\phi^{-1})}$ is its inverse. Since $k$ is a local isometry it follows that $\bar\phi$ is an isometry.

If $\phi, \psi \in N(\Gamma)$, then $\bar\phi\bar\psi k = \bar\phi k\psi = k\phi\psi$. Thus $\overline{(\phi\psi)} = \bar\phi\bar\psi$; that is, the function $\phi \to \bar\phi$ is a homomorphism. If $\phi \in \Gamma$, then obviously $\bar\phi$ is the identity map of $M/\Gamma$. Conversely if $\bar\phi$ is the identity map of $M/\Gamma$, then $\phi$ is a deck transformation; that is, $\phi \in \Gamma$. Thus the kernel of the homomorphism is $\Gamma$.

Because $M$ is simply connected, the homomorphism is onto $I(M/\Gamma)$. In fact, if $\mu \in I(M/\Gamma)$, then by Proposition A.11 the map $\mu \circ k: M \to M/\Gamma$ has a lift $\tilde\mu: M \to M$ through $k$. Lifting $\mu^{-1} \circ k$ will show that $\tilde\mu$ is an isometry. Since $\mu \circ k = k \circ \tilde\mu$, we have $\overline{(\tilde\mu)} = \mu$. Thus the homomorphism theorem implies $N(\Gamma)/\Gamma \approx I(M/\Gamma)$. ∎
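The purely algebraic ingredient of Proposition 20, the quotient $N(H)/H$, can be computed by brute force in a small finite group. The sketch below is an illustrative analogy only: it uses the symmetric group on four letters and a cyclic subgroup of order 4, neither of which comes from the text, and the function names are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Composition p o q of permutations stored as tuples: i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_cyclic_subgroup(c):
    """All powers of the permutation c."""
    identity = tuple(range(len(c)))
    elements, current = {identity}, c
    while current != identity:
        elements.add(current)
        current = compose(current, c)
    return frozenset(elements)

def normalizer(group, subgroup):
    """Elements g of the group with g H g^{-1} = H."""
    return [g for g in group
            if frozenset(compose(compose(g, h), inverse(g)) for h in subgroup) == subgroup]

S4 = list(permutations(range(4)))              # assumed ambient group
H = generated_cyclic_subgroup((1, 2, 3, 0))    # assumed subgroup: a 4-cycle and its powers
N = normalizer(S4, H)
quotient_order = len(N) // len(H)              # order of N(H)/H
```

Here $N(H)$ is a dihedral group of order 8, so the quotient has two elements. In the proposition the roles of $G$ and $H$ are played by $I(M)$ and $\Gamma$.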

KILLING VECTOR FIELDS

By Definition 2.16, the Lie derivative $L_X$ applied to a vector field $Y$ is $[X, Y]$. Proposition 1.58 interprets this bracket as the rate of change of $Y$ under the flow of $X$. A similar interpretation holds for $L_X$ applied to any tensor field $A$; however, for simplicity we take $A$ to be covariant.


21. Proposition. If $X \in \mathfrak{X}(M)$ and $A \in \mathfrak{T}^0_s(M)$, then

$$L_X A = \lim_{t \to 0} \frac{1}{t}\,[\psi_t^*(A) - A],$$

where $\{\psi_t\}$ is the flow of $X$. (When the flow is local, the equation holds locally.)

Proof. For simplicity, let $s = 2$. Since $L_X$ is a tensor derivation,

$$(L_X A)(V, W) = XA(V, W) - A([X, V], W) - A(V, [X, W]).$$

Now we work on the right-hand side of the stated formula, abbreviating $\lim_{t \to 0}(1/t)$ to $\mathcal{L}$ and fixing a point $p$. Then

$$\mathcal{L}(\psi_t^* A - A)(V_p, W_p) = \mathcal{L}\{A(d\psi_t(V_p), d\psi_t(W_p)) - A(V_p, W_p)\}.$$

Adding and subtracting a suitable term turns this into

$$\mathcal{L}\{A(d\psi_t(V_p), d\psi_t(W_p)) - A(V_{\psi_t p}, W_{\psi_t p})\} + \mathcal{L}\{A(V_{\psi_t p}, W_{\psi_t p}) - A(V_p, W_p)\}.$$

Call these two limits I and II. If $\alpha$ is the integral curve of $X$ starting at $p$, then $\psi_t(p) = \alpha(t)$, and

$$\mathrm{II} = (d/dt)A(V_{\alpha(t)}, W_{\alpha(t)})|_0 = \alpha'(0)A(V, W) = X_p A(V, W).$$

For I, we use the telescoping identity $A(v', w') - A(v, w) = A(v' - v, w') + A(v, w' - w)$ to get

$$\mathrm{I} = \mathcal{L}\{A(d\psi_t(V_p) - V_{\psi_t p},\ d\psi_t(W_p))\} + \mathcal{L}\{A(V_{\psi_t p},\ d\psi_t(W_p) - W_{\psi_t p})\}.$$

Since $A$ is bilinear and $\psi_t\psi_{-t} = \psi_0 = \mathrm{id}$, the first term can be rewritten to use Proposition 1.58 as follows:

$$\mathcal{L}\{A(d\psi_t(V_p - d\psi_{-t}V_{\psi_t p}),\ d\psi_t(W_p))\} = -A\big(\lim_{t \to 0} d\psi_t\,\mathcal{L}\{d\psi_{-t}(V_{\psi_t p}) - V_p\},\ \lim_{t \to 0} d\psi_t(W_p)\big) = -A([X, V]_p, W_p).$$

Similarly, the second term above is $-A(V_p, [X, W]_p)$. Thus, $\mathrm{I} + \mathrm{II} = (L_X A)(V_p, W_p)$ as required. ∎

22. Definition. A Killing vector field on a semi-Riemannian manifold is a vector field $X$ for which the Lie derivative of the metric tensor vanishes: $L_X g = 0$.

Thus under the flow of $X$, the metric tensor does not change; this suggests the following view of a Killing vector field as an infinitesimal isometry.
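As a concrete illustration of this infinitesimal-isometry viewpoint, the following Python sketch checks numerically that the flow of one particular vector field preserves a metric. The choices are assumptions made for illustration and are not taken from the definition above: the manifold is the Minkowski plane with scalar product $-v_t w_t + v_x w_x$, the vector field is $X = x\,\partial_t + t\,\partial_x$, and all function names are hypothetical.

```python
import math

def boost_flow(s):
    """Matrix of the stage psi_s of the flow of X = x d/dt + t d/dx (assumed example)."""
    c, sh = math.cosh(s), math.sinh(s)
    return [[c, sh], [sh, c]]

def apply(m, v):
    """Apply a 2x2 matrix to a vector (t, x)."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])

def minkowski(v, w):
    """Scalar product of index 1 on the plane: <v, w> = -v_t w_t + v_x w_x."""
    return -v[0] * w[0] + v[1] * w[1]

def field(p):
    """Value of X at p = (t, x): the vector with components (x, t)."""
    t, x = p
    return (x, t)

def flow_velocity(p, s, h=1e-5):
    """Central-difference estimate of d/ds psi_s(p)."""
    a = apply(boost_flow(s + h), p)
    b = apply(boost_flow(s - h), p)
    return ((a[0] - b[0]) / (2 * h), (a[1] - b[1]) / (2 * h))

def preserves_metric(s, v, w, tol=1e-9):
    """psi_s is linear, so d(psi_s) = psi_s; compare scalar products before and after."""
    m = boost_flow(s)
    return abs(minkowski(apply(m, v), apply(m, w)) - minkowski(v, w)) < tol
```

`flow_velocity` confirms that the matrices really are the flow of $X$, and `preserves_metric` confirms that each stage is a linear isometry, which is the behavior the definition is meant to capture.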


23. Proposition. A vector field $X$ is Killing if and only if the stages $\psi_t$ of all its (local) flows are isometries.

Proof. If each $\psi_t$ is an isometry, then $\psi_t^*(g) = g$. Hence by Proposition 21, $L_X g = 0$. Conversely, if $L_X g = 0$, let $\{\psi_t\}$ be a local flow of $X$. If $v$ is a tangent vector at a point in the domain of the flow, then so is $w = d\psi_s(v)$ for $s$ small. By Proposition 21, $\lim_{t \to 0}(1/t)(g(d\psi_t w, d\psi_t w) - g(w, w)) = 0$. Since $\psi_s\psi_t = \psi_{s+t}$,

$$\lim_{t \to 0} \frac{1}{t}\,[g(d\psi_{s+t}(v), d\psi_{s+t}(v)) - g(d\psi_s(v), d\psi_s(v))] = 0.$$

This says that the real-valued function $s \to g(d\psi_s(v), d\psi_s(v))$ has derivative identically zero. Thus it is constant, so $g(d\psi_s(v), d\psi_s(v)) = g(v, v)$ for all $v$ and $s$. ∎

24. Example. Killing vector fields on the Schwarzschild half-plane $P_I$ (Example 5.41). Since isometries preserve the Gaussian curvature $K = 2M/r^3$, any isometry must have the form $\phi(t, r) = (f(t, r), r)$. Thus

$$d\phi(\partial_t) = (\partial f/\partial t)\,\partial_t; \qquad d\phi(\partial_r) = (\partial f/\partial r)\,\partial_t + \partial_r.$$

Since $ds^2 = -h\,dt^2 + h^{-1}\,dr^2$ with $h = 1 - (2M/r)$, computing scalar products shows that $\phi$ is an isometry if and only if $\partial f/\partial t = 1$ and $\partial f/\partial r = 0$. Thus $f = t + c$ for some number $c$; that is, the isometries are $t$-coordinate translations. Since the local flows of a Killing vector field consist of isometries, it follows that every Killing vector field on $P_I$ is a constant multiple of $\partial_t$.
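The phrase "computing scalar products" can be unpacked as a short worked computation. The following is a sketch assuming, as in the example, that the metric is $ds^2 = -h\,dt^2 + h^{-1}\,dr^2$ and that $\phi$ preserves $r$, so $h$ takes the same value at a point and at its image. The subscripts $f_t$, $f_r$ for the partial derivatives are notation introduced here for brevity.

```latex
\begin{align*}
\langle d\phi(\partial_t), d\phi(\partial_t)\rangle
  &= \langle f_t\,\partial_t,\ f_t\,\partial_t\rangle = -h f_t^2,
  &&\text{which equals } \langle\partial_t,\partial_t\rangle = -h \iff f_t^2 = 1,\\
\langle d\phi(\partial_t), d\phi(\partial_r)\rangle
  &= \langle f_t\,\partial_t,\ f_r\,\partial_t + \partial_r\rangle = -h f_t f_r,
  &&\text{which equals } \langle\partial_t,\partial_r\rangle = 0 \iff f_t f_r = 0,\\
\langle d\phi(\partial_r), d\phi(\partial_r)\rangle
  &= \langle f_r\,\partial_t + \partial_r,\ f_r\,\partial_t + \partial_r\rangle = -h f_r^2 + h^{-1},
  &&\text{which equals } \langle\partial_r,\partial_r\rangle = h^{-1} \iff f_r = 0.
\end{align*}
```

The first condition determines $f_t$ only up to sign. The stages of a local flow depend continuously on the flow parameter and start at the identity map, for which $f_t = 1$, so for them the sign is $+1$ and $f = t + c$, which is all that the conclusion about Killing vector fields requires.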

Recall that the covariant differential of a vector field is the (1, 1) tensor field $DX$ such that $(DX)(V) = D_V X$ for all $V \in \mathfrak{X}(M)$. Thus at each $p \in M$, $(DX)_p$ is the linear operator on $T_p(M)$ sending $v$ to $D_v X$.

25. Proposition. The following conditions on a vector field $X$ are equivalent:
(1) $X$ is Killing; that is, $L_X g = 0$.
(2) $X\langle V, W\rangle = \langle [X, V], W\rangle + \langle V, [X, W]\rangle$ for all $V, W \in \mathfrak{X}(M)$.
(3) $DX$ is skew-adjoint relative to $g$; that is, $\langle D_V X, W\rangle + \langle D_W X, V\rangle = 0$ for all $V, W \in \mathfrak{X}(M)$.

Proof. For all $V, W$ the following are equivalent:

$$\langle D_V X, W\rangle + \langle D_W X, V\rangle = 0;$$
$$-\langle [X, V], W\rangle + \langle D_X V, W\rangle - \langle [X, W], V\rangle + \langle D_X W, V\rangle = 0;$$
$$X\langle V, W\rangle = \langle [X, V], W\rangle + \langle V, [X, W]\rangle.$$

In view of the product rule the latter is equivalent to $(L_X g)(V, W) = 0$ for all $V, W$, that is, to $L_X g = 0$. ∎

Condition (3) shows that any parallel vector field is Killing. We call the following highly useful fact the conservation lemma.

26. Lemma. Let $X$ be a Killing vector field on $M$, and let $\gamma$ be a geodesic in $M$. Then the restriction $X_\gamma$ is a Jacobi field and $\langle \gamma', X\rangle$ is constant along $\gamma$.

Proof. It suffices to work locally, so near any point of $\gamma$ let $\{\psi_s\}$ be a local flow of $X$. Each $\psi_s$ is an isometry, hence the function $(t, s) \to \psi_s(\gamma(t))$ is a geodesic variation of (a segment of) $\gamma$. For fixed $t$, the curve $s \to \psi_s(\gamma(t))$ is an integral curve of $X$, hence its velocity at $s = 0$ is $X_{\gamma(t)}$. Thus $X_\gamma$ is the vector field of this geodesic variation, so by Proposition 8.3 it is a Jacobi field. Then, since $\gamma$ is a geodesic,

$$(d/dt)\langle X_\gamma, \gamma'\rangle = \langle X_\gamma', \gamma'\rangle = \langle D_{\gamma'}X, \gamma'\rangle.$$

But this last expression is zero by Proposition 25(3); hence $\langle X, \gamma'\rangle$ is constant. ∎

27. Lemma. Let $X$ be a Killing vector field on a connected semi-Riemannian manifold $M$. If $X_p = 0$ and $(DX)_p = 0$ for some one point $p$ of $M$, then $X = 0$.

Proof. Let $A$ be the set of points of $M$ at which both $X$ and $DX$ vanish. Evidently $M - A$ is open and $A$ is nonempty; thus it suffices to show that $A$ is open. If $o \in A$, let $\mathcal{U}$ be a normal neighborhood of $o$. If $\sigma$ is a radial geodesic from $o$, then by the preceding lemma $X_\sigma$ is a Jacobi field. But $X_\sigma(0) = X_o = 0$ and $(X_\sigma)'(0) = D_{\sigma'(0)}X = (DX)_o(\sigma'(0)) = 0$. By Proposition 8.7, $X_\sigma$ is identically zero. Hence $X$ is identically zero on $\mathcal{U}$, so $DX$ is also. ∎

It will follow that, if $X$ and $Y$ are Killing vector fields such that $X_p = Y_p$ and $DX_p = DY_p$ for some one point, then $X = Y$. Since Killing vector fields are infinitesimal isometries, the lemma is thus an infinitesimal analogue of the uniqueness of local isometries, Proposition 3.62.

THE LIE ALGEBRA $i(M)$

Let $i(M)$ be the set of all Killing vector fields on a semi-Riemannian manifold $M$. By Exercise 2.11,

(1) The Lie derivative $L_X$ is $\mathbf{R}$-linear in $X$. Hence any linear combination of Killing fields (coefficients constant) is again a Killing field.
(2) $[L_X, L_Y] = L_{[X, Y]}$. Hence a bracket of Killing fields is again Killing.


It follows immediately that $i(M)$ is a real Lie algebra (Definition B.5). In fact, $i(M)$ is a Lie subalgebra of $\mathfrak{X}(M)$, which becomes a Lie algebra under scalar multiplication by numbers rather than functions $f \in \mathfrak{F}(M)$. By contrast with $\mathfrak{X}(M)$, $i(M)$ is always finite-dimensional.

28. Lemma. The Lie algebra $i(M)$ of Killing vector fields on a connected semi-Riemannian manifold $M^n$ has dimension at most $n(n + 1)/2$.

Proof. Fix $p \in M$, and let $o(T_pM)$ be the Lie algebra of all skew-adjoint linear operators on $T_p(M)$. (An orthonormal basis converts each such operator into a matrix, thus giving a Lie algebra isomorphism onto $o_\nu(n)$.) Let $E$ send each Killing vector field $X$ to $(X_p, (DX)_p)$. Then $E$ is a linear transformation from $i(M)$ to $T_p(M) \times o(T_pM)$. Lemma 27 implies that $E$ is one-to-one. Hence

$$\dim i(M) \le \dim T_p(M) + \dim o_\nu(n) = n + n(n - 1)/2 = n(n + 1)/2. \qquad ∎$$

29. Example. Killing vector fields on $\mathbf{R}^n_\nu$.

(1) If $v \in \mathbf{R}^n_\nu$, let $v^*$ be the vector field $p \to v_p$. Since $v^*$ is parallel, it is Killing. The flow of $v^*$ consists of the translations $\psi_t(p) = p + tv$, so we call $v^*$ an infinitesimal translation.

(2) If $S$ is a skew-adjoint linear operator on $\mathbf{R}^n_\nu$, let $S^*$ be the vector field such that $S^*_p = (Sp)_p$ for all $p \in \mathbf{R}^n_\nu$. Relative to natural coordinates, $S^* = \sum S^i_j u^j \partial_i$. Thus for any vector field $V$, $D_V S^* = \sum S^i_j V^j \partial_i = SV$, where we consider $S$ by canonical isomorphism as a (1, 1) tensor field. But $S$ is skew-adjoint, hence $S^*$ is a Killing vector field, called an infinitesimal linear isometry.

(3) The Killing vector fields on $\mathbf{R}^n_\nu$ are the vector fields of the form $v^* + S^*$, where $v \in \mathbf{R}^n_\nu$ and $S \in o_\nu(n)$. To prove this, it suffices to check that the space of such Killing fields has the maximum dimension $n(n + 1)/2$.

(4) For example, let $t, x, y$ be the natural coordinates on $\mathbf{R}^3_1$. The infinitesimal translations on $\mathbf{R}^3_1$ have basis $\partial_t, \partial_x, \partial_y$. By Lemma 3 a basis for $o_1(3)$ is given by

$$A = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \qquad C = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}.$$

As above, these matrices give Killing vector fields

$A^* = x\,\partial_t + t\,\partial_x$, infinitesimal boost on the timelike planes, $y$ const;
$B^* = y\,\partial_t + t\,\partial_y$, infinitesimal boost on the timelike planes, $x$ const;
$C^* = -y\,\partial_x + x\,\partial_y$, infinitesimal rotation on the spacelike planes, $t$ const.

The six Killing fields constitute a basis for $i(\mathbf{R}^3_1)$.
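The claims in part (4) are easy to verify by exact integer arithmetic. The Python sketch below checks that the three matrices are skew-adjoint for the scalar product of $\mathbf{R}^3_1$, using the condition ${}^t S\,\varepsilon + \varepsilon S = 0$ with $\varepsilon = \mathrm{diag}(-1, 1, 1)$, and computes their matrix commutators. The helper names are hypothetical, and the commutators are those of the matrices themselves, not of the vector fields $A^*, B^*, C^*$.

```python
EPS = [[-1, 0, 0], [0, 1, 0], [0, 0, 1]]   # signature matrix for coordinates (t, x, y)
A = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
B = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
C = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]

def neg(a):
    return [[-a[i][j] for j in range(3)] for i in range(3)]

def is_skew_adjoint(s):
    """<Sv, w> + <v, Sw> = 0 for all v, w, i.e. S^T eps + eps S = 0."""
    return add(matmul(transpose(s), EPS), matmul(EPS, s)) == ZERO

def bracket(s, t):
    """Matrix commutator st - ts."""
    return add(matmul(s, t), neg(matmul(t, s)))

def max_killing_dimension(n):
    """Upper bound n(n + 1)/2 from Lemma 28."""
    return n * (n + 1) // 2
```

The commutators come out as $[A, B] = -C$, $[A, C] = -B$, and $[B, C] = A$, so the span of the three matrices is closed under the bracket, and $3 + 3 = 6$ matches `max_killing_dimension(3)`.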


Recall that a vector field V is complete provided its maximal integral curves are all defined on the whole real line.

30. Proposition. On a complete semi-Riemannian manifold $M$ every Killing vector field $V$ is complete.

Proof. We can assume $M$ is connected. Fix $o \in M$ and $\varepsilon > 0$ such that $V$ has a local flow defined on $\mathcal{U} \times (-\varepsilon, \varepsilon)$, where $\mathcal{U}$ is a neighborhood of $o$. Let $A$ be the set of all points $p \in M$ such that $V$ has a local flow defined on $\mathcal{V} \times (-\varepsilon, \varepsilon)$, where $\mathcal{V}$ is a neighborhood of $p$ (same $\varepsilon$). It suffices to show that $A = M$, for then every integral curve of $V$ is defined at least on $(-\varepsilon, \varepsilon)$, hence by Lemma 1.52, every maximal integral curve has domain $\mathbf{R}$.

We need only show that if a convex open set $\mathcal{C}$ meets $A$, then $\mathcal{C}$ is contained in $A$; so suppose that $p \in \mathcal{C} \cap A$ and $r \in \mathcal{C}$. Let $\sigma: [0, 1] \to M$ be a geodesic from $p$ to $r$. Since $p \in A$, the method used in the proof of Lemma 26 gives, for some $\delta > 0$, a geodesic variation $x: [0, \delta] \times (-\varepsilon, \varepsilon) \to M$ of $\sigma|[0, \delta]$ such that each transverse curve is an integral curve of $V$. Because $M$ is complete, each longitudinal geodesic of $x$ can be extended to $[0, 1]$, hence we can suppose that $x$ is a geodesic variation of $\sigma$.

The proof of Lemma 8.3 actually shows that on any longitudinal geodesic $\sigma_s(t) = x(t, s)$ the vector field $x_s$ is a Jacobi field. By Lemma 26, the restriction $V_{\sigma_s}$ is a Jacobi field. By the construction of $x$, the two are equal on $[0, \delta]$ for any $s$. Thus their initial conditions agree at $t = 0$, so by Lemma 8.5, they are equal on $[0, 1]$. In particular, $x_s(1, s)$ is the value of $V$ at $\rho(s) = x(1, s)$ for all $s \in (-\varepsilon, \varepsilon)$. But this says that $\rho: (-\varepsilon, \varepsilon) \to M$ is an integral curve of $V$ starting at $r$, so $r \in A$. Hence $\mathcal{C} \subset A$, as required. ∎

On an incomplete manifold, Killing vector fields may or may not be complete. For example, on the unit disk $|p| < 1$ in $\mathbf{R}^2$, infinitesimal translations are obviously not complete, but the infinitesimal rotation $-v\,\partial_u + u\,\partial_v$ is.

$I(M)$ AS LIE GROUP

The isometry groups of $S^n_\nu$, $H^n_\nu$, and $\mathbf{R}^n_\nu$ were, in a natural way, Lie groups. We shall now see that this is true for any semi-Riemannian manifold.

31. Definition. A (left) action of a Lie group $G$ on a manifold $M$ is a smooth map $G \times M \to M$, denoted by $(g, p) \to gp$, such that

(1) (gh)p = g(hp) for all g, h E G and p E M . (2) e p = p for all p E M, where e is the identity element of G.


Here $G$ is also called a Lie transformation group on $M$. The definition makes sense with $G$ an abstract group and $M$ a set, but unless the contrary is mentioned we assume the smooth case.

For a given action, if $g \in G$ is held fixed, then $p \to gp$ is a diffeomorphism with inverse $p \to (g^{-1})p$. An action $G \times M \to M$ is transitive provided that for each $p, q \in M$ there is a $g \in G$ such that $gp = q$. For example, under the column vector conventions, $(g, x) \to gx$ is an action $GL(n, \mathbf{R}) \times \mathbf{R}^n \to \mathbf{R}^n$. This action is not transitive, since $g0 = 0$ for all $g$, but it is transitive on $\mathbf{R}^n - 0$.

If $G \times M \to M$ is an action and $o$ is a point of $M$, then $H = \{g \in G : go = o\}$ is a closed subgroup of $G$ called the isotropy subgroup at $o$.

Now consider the isometry group $I(M)$ of a semi-Riemannian manifold $M$. Evaluation of $\phi \in I(M)$ on $p \in M$ gives a natural map $(\phi, p) \to \phi(p)$ from $I(M) \times M$ into $M$.

32. Theorem. If $M$ is a semi-Riemannian manifold, there is a unique way to make $I(M)$ a manifold such that:
(C1) $I(M)$ is a Lie group.
(C2) The natural action $I(M) \times M \to M$ is smooth.
(C3) A homomorphism $\beta: \mathbf{R} \to I(M)$ is smooth if the map $\mathbf{R} \times M \to M$ sending $(t, p)$ to $\beta(t)p$ is smooth.

This follows from general results of Palais in Chapter IV of [P].

It can be expected that the Lie algebra $\mathcal{I}(M)$ of the isometry group $I(M)$ is closely related to the Killing vector fields ("infinitesimal isometries") of $M$. If $X \in \mathcal{I}(M)$, let $t \to \psi_t$ be its one-parameter subgroup. By (C2), the map $\mathbf{R} \times M \to M$ sending $(t, p)$ to $\psi_t(p)$ is smooth. For each $p \in M$, let $X^+_p$ be the initial velocity of the curve $t \to \psi_t(p)$. Then $X^+$ is a smooth vector field on $M$. Using the identity $\psi_s(\psi_t p) = \psi_{s+t}(p)$, it is easy to show that $\{\psi_t\}$ is the flow of $X^+$. Since one-parameter groups are defined on the whole real line, $X^+$ is complete. Since each $\psi_t$ is an isometry, $X^+$ is Killing.

33. Proposition. Let $M$ be a semi-Riemannian manifold, and let $\mathcal{I}(M)$ be the Lie algebra of its isometry group $I(M)$. Then
(1) The set $ci(M)$ of all complete Killing vector fields on $M$ is a Lie subalgebra of $i(M)$.
(2) The function $X \to X^+$ is a Lie anti-isomorphism $\mathcal{I}(M) \to ci(M)$, that is, a linear isomorphism such that $[X^+, Y^+] = -[X, Y]^+$ for all $X, Y \in \mathcal{I}(M)$.

Proof. For each $p \in M$ it follows from (C2) that the map $\pi_p: I(M) \to M$ sending $\phi$ to $\phi(p)$ is smooth.


(a) If $X \in \mathcal{I}(M)$, then $d\pi_p(X_e) = X^+_p$ for each $p \in M$. Let $\alpha(t) = \psi_t$ be the one-parameter subgroup of $X$. By definition, $\alpha$ is the integral curve of $X$ starting at $e$ in $I(M)$, with $\alpha'(0) = X_e$. Thus $d\pi_p(X_e) = d\pi_p(\alpha'(0)) = (\pi_p \circ \alpha)'(0)$. But $\pi_p\alpha(t) = \alpha(t)(p) = \psi_t p$, so $(\pi_p \circ \alpha)'(0) = X^+_p$.

(b) The function $X \to X^+$ is a one-to-one linear transformation onto $ci(M)$. It follows immediately from (a) that the function is $\mathbf{R}$-linear. It is one-to-one, for, if $X^+ = 0$, then the integral curves $t \to \psi_t p$ of $X^+$ are constant for all $p$. Thus $\psi_t = \mathrm{id}$ for all $t$, which implies $X = 0$. We must show that each $Z \in ci(M)$ can be expressed as $X^+$ for some $X \in \mathcal{I}(M)$. The global flow $\{\psi_t\}$ of $Z$ consists of isometries, so $t \to \psi_t$ is a homomorphism of $\mathbf{R}$ into $I(M)$. By (C3) the smoothness of the flow implies $t \to \psi_t$ is smooth. Hence it is a one-parameter subgroup of $I(M)$. If $X$ is the corresponding element of $\mathcal{I}(M)$, then clearly $X^+ = Z$.

(c) $[X^+, Y^+] = -[X, Y]^+$. Let $t \to \psi_t$ be the one-parameter subgroup of $X$ and hence the flow of $X^+$. Let $q = \psi_t(p)$. Then by (a), $Y^+_q = d\pi_q(Y_e)$, and furthermore, for all $\phi \in I(M)$,

$$(\psi_{-t} \circ \pi_q)(\phi) = \psi_{-t}\phi\psi_t(p) = (\pi_p R_{\psi_t} L_{\psi_{-t}})(\phi).$$

Since $Y$ is left-invariant, $dL_{\psi_{-t}}(Y_e) = Y_{\psi_{-t}}$, and it follows that

$$d\psi_{-t}(Y^+_q) = d\pi_p\, dR_{\psi_t}(Y_{\psi_{-t}}).$$

Thus

$$[X^+, Y^+]_p = \lim_{t \to 0} \frac{1}{t}\,(d\psi_{-t}(Y^+_q) - Y^+_p) = \lim_{t \to 0} \frac{1}{t}\,(d\pi_p\, dR_{\psi_t}(Y_{\psi_{-t}}) - d\pi_p(Y_e)) = d\pi_p\Big\{\lim_{t \to 0} \frac{1}{t}\,(dR_{\psi_t}(Y_{\psi_{-t}}) - Y_e)\Big\}.$$

Since $t \to \psi_t$ is the one-parameter subgroup of $X$, $t \to \psi_{-t}$ is the one-parameter subgroup of $-X$. Thus by the lemma below, $\{R_{\psi_{-t}}\}$ is the flow of $-X$. In view of the signs in Proposition 1.58, the limit above is $[-X, Y]_e$. Hence by (a),

$$[X^+, Y^+]_p = -d\pi_p[X, Y]_e = -[X, Y]^+_p.$$

Then assertions (1) and (2) are direct consequences of (a), (b), and (c). ∎

34. Lemma. Let $\mathfrak{g}$ be the Lie algebra of a Lie group $G$. If $\alpha$ is the one-parameter subgroup of $X \in \mathfrak{g}$, then the flow of $X$ is $\{R_{\alpha(t)}\}$.


Proof. That $\alpha$ is the one-parameter subgroup of $X$ means that it is the integral curve of $X$ starting at the identity element $e$ of $G$. If $g \in G$, then applying the left translation $L_g$ shows that $L_g\alpha$ is the integral curve of $dL_g X = X$ starting at $g$. But $L_g\alpha(t) = g\alpha(t) = R_{\alpha(t)}(g)$ for all $t \in \mathbf{R}$, hence $\{R_{\alpha(t)}\}$ is the flow of $X$. ∎
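Lemma 34 can be checked numerically in a matrix group. The sketch below is only an illustration under stated assumptions: it takes $G = GL(2, \mathbf{R})$, where a left-invariant field has the form $X_h = hA$ and the one-parameter subgroup is $\alpha(t) = \exp(tA)$. The names `expm`, `left_invariant_field` and `flow` are hypothetical helpers introduced here, not notation from the text.

```python
import numpy as np

def expm(A, terms=30):
    """Matrix exponential by a truncated power series (adequate for small matrices)."""
    result = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for k in range(1, terms):
        term = term @ A / k
        result = result + term
    return result

A = np.array([[0.0, 1.0], [-1.0, 0.0]])   # X_e, an element of the Lie algebra gl(2, R)
g = np.array([[2.0, 1.0], [0.0, 1.0]])    # a point of G = GL(2, R)

def alpha(t):
    """One-parameter subgroup of X."""
    return expm(t * A)

def left_invariant_field(h):
    """X_h = dL_h(X_e) = h A for a matrix group."""
    return h @ A

def flow(t, h):
    """Candidate flow from Lemma 34: right translation by alpha(t)."""
    return h @ alpha(t)

# The curve t -> flow(t, g) should be an integral curve of X (central difference check).
t, dt = 0.7, 1e-6
velocity = (flow(t + dt, g) - flow(t - dt, g)) / (2 * dt)
velocity_error = np.max(np.abs(velocity - left_invariant_field(flow(t, g))))

# Flow property: flowing for 0.4 and then 0.3 equals flowing for 0.7.
group_law_error = np.max(np.abs(flow(0.3, flow(0.4, g)) - flow(0.7, g)))
```

Both error values should be at the level of rounding and finite-difference error, which is what the lemma predicts for this choice of group.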

HOMOGENEOUS SPACES

One natural way to specify that M has plenty of isometries is as follows.

35. Definition. A semi-Riemannian manifold $M$ is homogeneous provided that, given any points $p, q \in M$, there is an isometry $\phi$ of $M$ such that $\phi(p) = q$.

In short, $I(M)$ is transitive on $M$, which is also sometimes called a homogeneous space. If $M$ is homogeneous, then evidently any geometrical properties at one point of $M$ hold at every point. To show that a given semi-Riemannian manifold is homogeneous, it suffices to exhibit enough isometries to carry some one fixed point to every point (or vice versa).

36. Lemma. A symmetric semi-Riemannian manifold $M$ is homogeneous.

Proof. Let $\sigma\colon [0, 1] \to M$ be a geodesic. The global symmetry $\zeta$ at $\sigma(\tfrac12)$ is an isometry that reverses geodesics, hence carries $\sigma(0)$ to $\sigma(1)$. Since symmetric manifolds are by definition connected, any two points $p, q \in M$ can be joined by a broken geodesic. Thus a finite composition of isometries as above gives an isometry carrying $p$ to $q$. ∎

37. Remark. Completeness of Homogeneous Spaces.

(1) A Riemannian homogeneous space is complete. It suffices to show that a unit speed geodesic $\gamma\colon [0, b) \to M$ has a geodesic extension past $b$. The existence of normal neighborhoods shows, in the Riemannian case, that for any point $o \in M$ there is an $\varepsilon > 0$ such that every unit speed geodesic starting at $o$ is defined on $[0, \varepsilon)$. If $\phi$ is an isometry carrying $o$ to $\gamma(b - \varepsilon/2)$, there exists a unit vector $u \in T_o(M)$ such that $d\phi(u) = \gamma'(b - \varepsilon/2)$. Hence the geodesic $\phi \circ \gamma_u$ provides an extension of $\gamma$.

(2) An indefinite homogeneous space need not be complete. Consider $M$ as in Example 7.14. For each $(a, b) \in M$ the isometry $(u, v) \to (u/a, av)$ carries $(a, b)$ to $(1, ab)$; then the isometry $(u, v) \to (u, v - ab)$ carries this point to $(1, 0)$. Thus $M$ is homogeneous. But $M$ is not complete since clearly the null geodesic $\alpha(t) = (t, 0)$ has maximum domain $\mathbf{R}^+$.
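The two maps used in Remark 37(2) can be sanity-checked with a few lines of code. This is a sketch under an explicit assumption: Example 7.14 lies outside this section, so the metric is taken here to be $ds^2 = 2\,du\,dv$ on the half-plane $u > 0$ in null coordinates. The helper names `boost`, `shift` and `preserves_metric` are illustrative only.

```python
import numpy as np

# Assumed metric in null coordinates (u, v): ds^2 = 2 du dv on the region u > 0.
METRIC = np.array([[0.0, 1.0], [1.0, 0.0]])

def boost(a):
    """(u, v) -> (u/a, a v) for a > 0; returns the map and its constant Jacobian."""
    return (lambda u, v: (u / a, a * v)), np.diag([1.0 / a, a])

def shift(c):
    """(u, v) -> (u, v - c); returns the map and its Jacobian (the identity)."""
    return (lambda u, v: (u, v - c)), np.eye(2)

def preserves_metric(jacobian):
    """A linear map is an isometry of a constant metric when J^T g J = g."""
    return bool(np.allclose(jacobian.T @ METRIC @ jacobian, METRIC))

a, b = 2.5, -3.0                       # an arbitrary point (a, b) with a > 0
boost_map, boost_jac = boost(a)
shift_map, shift_jac = shift(a * b)

image = shift_map(*boost_map(a, b))    # boost first, then shift
both_isometries = preserves_metric(boost_jac) and preserves_metric(shift_jac)
```

Under the assumed metric both maps preserve $2\,du\,dv$ and the region $u > 0$, and their composition carries $(a, b)$ to $(1, 0)$. The curve $t \to (t, 0)$ leaves the region as $t \to 0^+$, which is the source of the incompleteness.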


Thus a homogeneous space need not be symmetric. Since a homogeneous space has a good supply of isometries, it also has a good supply of Killing vector fields.

38. Corollary. Each tangent vector to a homogeneous semi-Riemannian manifold $M$ extends to a Killing vector field on $M$.

Proof. For $p \in M$, the projection $g \to gp$ is a submersion $\pi\colon I(M) \to M$. (This is a general fact about transitive actions; see Proposition 11.13.) Hence, if $v \in T_p(M)$, there is a vector $\tilde v \in T_e(I(M))$ such that $d\pi(\tilde v) = v$. But $\tilde v$ extends to a left-invariant vector field $V$ on $I(M)$. Hence by Proposition 33, $V^+$ is a Killing vector field such that $V_p^+ = v$. ∎

We have now seen that, if a Riemannian manifold is either compact or homogeneous, it is complete, but that neither condition alone suffices for indefinite metrics. However, both together are sufficient.

39. Proposition (Marsden). A compact homogeneous semi-Riemannian manifold is complete.

Proof. To show that a geodesic $\gamma\colon [0, b) \to M$ is extendible, let $\{s_k\}$ be a sequence in $[0, b)$ that converges to $b$. Since $M$ is compact, by passing to a subsequence we can suppose that $\{\gamma(s_k)\}$ converges to some point $p \in M$. Let $v_1, \ldots, v_n$ be a basis for $T_p(M)$. By Corollary 38, each $v_i$ extends to a Killing vector field $V_i$. The conservation lemma (26) says that $\langle \gamma', V_i \rangle$ is a constant $c_i$ for each $i = 1, \ldots, n$.

It will follow that the sequence $\{\gamma'(s_k)\}$ in $TM$ converges to some vector $v \in T_p(M)$. We can suppose that the sequence $\{\gamma(s_k)\}$ lies in a neighborhood $\mathcal{U}$ of $p$ in $M$ such that $V_1, \ldots, V_n$ give a basis for each $T_q(M)$, $q \in \mathcal{U}$. As with coordinate vector fields, if $h_{ij} = \langle V_i, V_j \rangle$ then $\det h \ne 0$ and

$$\gamma'(s_k) = \sum_{i,j} (h^{-1})_{ij}\, c_i\, V_j \quad \text{at } \gamma(s_k).$$

Thus the sequence converges as claimed. By Proposition 3.28, $\gamma'\colon [0, b) \to TM$ is an integral curve of the geodesic vector field $G$. Hence by the single-sequence criterion in Lemma 1.56, $\gamma'$ has an extension past $b$ as an integral curve of $G$. But this extension projects to a geodesic extension of $\gamma$. ∎

Putting some dents in, say, the plane $\mathbf{R}^2$ will give a manifold whose only isometry is the identity map. At the opposite extreme, a semi-Riemannian manifold $M$ is frame-homogeneous provided any frame on $M$ can be carried to any other by the differential map of an isometry of $M$. Such a manifold $M$ has the largest possible isometry group: $M$ is homogeneous and the isotropy group of $I(M)$ at any point $p$ is the entire group $O(T_pM) \approx O_\nu(n)$ of linear isometries on $T_p(M)$. It follows, using Exercise 14(a), that $M$ has constant curvature. Proposition 4.30 asserts that hyperquadrics $Q$ are frame-homogeneous. In the Riemannian case every connected frame-homogeneous manifold ($n \ge 2$) is homothetic to $S^n$, $P^n$, $\mathbf{R}^n$, or $H^n$; for indefinite metrics, the list is longer (see [Wo]).

Exercises

1. (a) If $\psi\colon M \to N$ is a homothety, show that $\phi \to \psi\phi\psi^{-1}$ is an isomorphism from $I(M)$ to $I(N)$. (b) Prove that, if $-M$ is $M$ with metric reversed, then $I(-M) = I(M)$. (c) Find an explicit matrix formula for a Lie group isomorphism $O_\nu(n) \approx O_{n-\nu}(n)$.

2. Prove: (a) A coordinate vector field $\partial_k$ is Killing if and only if $\partial g_{ij}/\partial x^k = 0$ for all $i, j$ (see 5.39). (b) For a warped product $M = B \times_f F$, a vector field on $F$ is Killing if and only if its lift to $M$ is Killing.

3. A positive space form $M^n$ with $0 < \nu \le n/2$ and $n \ge 3$ is noncompact.

4. Let $M$ be a positive space form with finite fundamental group. If $n - \nu$ is odd, then $M$ is space-orientable. (Compare Proposition 16.)

5. (See Proposition 13.) Suppose that $\phi \in I(\tilde M)$ has $\phi^m = \mathrm{id}$ for some integer $m \ne 0$. Show that $\phi$ has a fixed point if $\tilde M$ is (a) $H^n$, (b) CS;, (c) $\mathbf{R}^n_\nu$. (Hint: For (a), verify that $p = (1/m)(x + \phi x + \cdots + \phi^{m-1}x)$ is a fixed point in $\mathbf{R}^{n+1}_1$.)

6. Let $S^{n-\nu}/\Gamma$ be a positive Riemannian space form and let $\theta\colon \Gamma \to O(\nu)$ be a homomorphism. Show that $\Gamma_\theta = \{(\theta b, b) : b \in \Gamma\}$ is a finite subgroup of $O_\nu(n+1)$ that is properly discontinuous on $S^n_\nu$. Hence $S^n_\nu/\Gamma_\theta$ is a positive space form with fundamental group $\Gamma_\theta \approx \Gamma$.

7. Let $P$ be a semi-Riemannian submanifold of $M$, and let $X$ be a Killing vector field on $M$. Prove: (a) If $X$ is tangent to $P$, then $X|P$ is a Killing vector field on $P$. (b) If $P$ is totally geodesic, then $\tan X$ is Killing on $P$.

8. If $V$ is an arbitrary vector field and $X$ is Killing, then $D_{[X,V]} = [L_X, D_V]$ and $D_V(DX) = R_{XV}$.

9. If $X$ is a Killing vector field on a semi-Riemannian manifold $M$, let $f = \tfrac12\langle X, X \rangle$. Prove: (a) $\operatorname{grad} f = -D_X X$. (b) $H^f(V, W) = \langle D_V X, D_W X \rangle - \langle R_{XV}X, W \rangle = -\langle D_V(D_X X), W \rangle$. (c) $\Delta f = -\operatorname{trace}(DX \circ DX) - \operatorname{Ric}(X, X)$.

10. (Continuation) (a) If an integral curve $\alpha$ of a Killing vector field $X$ starts at a critical point of $\langle X, X \rangle$, then $\alpha$ is a geodesic. (b) Theorem of Bochner: On a compact Riemannian manifold with $\operatorname{Ric} < 0$, every Killing vector field is identically zero. (Hint: At a maximum point of $f$, $H^f$ is negative semidefinite.)


11. $M$ is said to be isotropic at $p \in M$ provided that, if $v, w \in T_p(M)$ have $\langle v, v \rangle = \langle w, w \rangle$, there is a $\phi \in I(M)$ such that $d\phi(v) = w$. $M$ is isotropic if it is isotropic at every point. Prove: (a) $M$ isotropic $\Rightarrow$ $M$ complete. (b) $M$ connected and isotropic $\Rightarrow$ $M$ homogeneous. (c) $M$ frame-homogeneous $\Rightarrow$ $M$ isotropic and symmetric. (The converse is false, e.g., $CP^n$ from Chapter 11.) (d) $S^1 \times \mathbf{R}^1$ is symmetric but not isotropic.

12. The Clifton-Pohl torus $T = M/\Gamma$ (Example 7.16): (a) Find a group of eight isometries and anti-isometries of $M$. (b) Show that $s \to (\tan s, 1)$ is a geodesic, and deduce that every null geodesic of $M$ and $T$ is incomplete. (c) $P = u\,\partial_u + v\,\partial_v$ is a Killing vector field on $M$. (d) If $\gamma(s) = (u(s), v(s))$ is a geodesic, then both $u'v'/r^2$ and $(uv' + vu')/r^2$ are constant, where $r^2 = u^2 + v^2$. (e) $T$ is not geodesically connected. (f) $s \to (s, 1/s)$ is pregeodesic and on $[1, \infty)$ has finite length. (g) $T$ has timelike geodesics that are complete and ones that are not complete; likewise for spacelike.

13. In complex terms the Poincaré plane $P$ of Exercise 3.8 is the region $\operatorname{Im} z > 0$ of $\mathbf{C} \approx \mathbf{R}^2$ with $ds^2 = dz\,d\bar z/(\operatorname{Im} z)^2$. If $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbf{R})$, let $\phi_A$ send $z \in P$ to $z' = (az + b)/(cz + d)$. Prove: (a) $\phi_A(P) \subset P$. (b) $\phi_A$ is an isometry of $P$. (Hint: Compute $dz'\,d\bar z'/(\operatorname{Im} z')^2$.) (c) The map $A \to \phi_A$ is a homomorphism onto the identity component $I_0(P)$ of $I(P)$ with kernel $\pm I$. (d) $P$ is isometric to the hyperbolic plane, hence $I_0(P) \approx O_1^{++}(3) = SO_0(1, 2)$.

14. Prove: (a) If sectional curvature $K$ is constant on the nondegenerate planes of $T_p(M)$ of a given index (0, 1, or 2), then $K$ is constant on all nondegenerate planes in $T_p(M)$. (b) If $X$ is a Killing vector field with $X_p = 0$, then $(DX)_p = -\lim_{t \to 0}(1/t)(d\psi_t - \mathrm{id})$ on $T_p(M)$, where $\{\psi_t\}$ is a local flow of $X$. (c) If $M^n$ is connected and $\dim i(M) = n(n+1)/2$, then $M$ has constant curvature.

15. If $M$ is complete and connected, the following are equivalent: $\dim I(M) = n(n+1)/2$, $\dim i(M) = n(n+1)/2$, $M$ is frame-homogeneous.

16. G-orientability (page 241).
Prove: (a) A semi-Riemannian hypersurface $M$ of a G-orientable manifold is G-orientable, where meaningful, if and only if $M$ has a smooth unit normal vector field. (b) An arbitrary semi-Riemannian manifold is G-orientable if and only if every loop in $M$ preserves G-orientation (that is, lifts as a loop into $M_G$). (c) A connected semi-Riemannian manifold is G-orientable if its fundamental group has no subgroup of index 2.

17. A flat connected frame-homogeneous manifold $M$ is isometric to $\mathbf{R}^n_\nu$. To prove this show: (a) $M = \mathbf{R}^n_\nu/\Gamma$, where $N(\Gamma) = I(\mathbf{R}^n_\nu)$; that is, $\Gamma$ is a normal subgroup. (b) Each element of $\Gamma$ commutes with each element of the identity component $K$ of $I(\mathbf{R}^n_\nu)$. (See Exercise 11.14.) (c) $\Gamma = \{e\}$. (Hint: $K \supset \mathbf{R}^n \cup O_\nu^{++}(n)$.)

18. (This exercise requires some advanced linear algebra.) (a) A linear operator $S$ on $V \approx \mathbf{R}^n_\nu$ is self-adjoint if and only if $V$ can be expressed as a direct sum of subspaces $V_k$ that are mutually orthogonal (hence nondegenerate) and $S$-invariant and each $S|V_k$ has matrix of form either

$$\begin{pmatrix} \lambda & & & 0 \\ 1 & \lambda & & \\ & \ddots & \ddots & \\ 0 & & 1 & \lambda \end{pmatrix}$$

relative to a basis $v_1, \ldots, v_r$ ($r \ge 1$) with all scalar products zero except $\langle v_i, v_j \rangle = \varepsilon = \pm 1$ if $i + j = r + 1$, or

$$\begin{pmatrix} a & b & & & & & 0 \\ -b & a & & & & & \\ 1 & 0 & a & b & & & \\ 0 & 1 & -b & a & & & \\ & & \ddots & & \ddots & & \\ & & & 1 & 0 & a & b \\ 0 & & & 0 & 1 & -b & a \end{pmatrix}$$

relative to a basis $u_1, v_1, \ldots, u_m, v_m$ with all scalar products zero except $\langle u_i, u_j \rangle = 1 = -\langle v_i, v_j \rangle$ if $i + j = m + 1$. (Here $r$, $\varepsilon$, and $m$ depend on $k$.) (Hint: Complexify, then geometrize the derivation of the Jordan canonical form.) (b) For each type in (a), determine the existence and causal character of eigenvectors, and the index of $V_k$.

19. (Continuation) (a) A self-adjoint linear operator on a Lorentz vector space $V \approx \mathbf{R}^n_1$ has a matrix of exactly one of the following four types, where $D_k$ is $k \times k$ diagonal: Relative to an orthonormal basis, $D_n$ or

relative to a basis $u, v, e_1, \ldots, e_{n-2}$ with all scalar products zero except $\langle u, v \rangle = 1 = \langle e_i, e_i \rangle$ ($1 \le i \le n - 2$),

(b) Characterize these cases in terms of eigenvectors (including causal character and multiplicity). (c) For $n = 4$, change tensor type from (1, 1) to (0, 2) and show agreement with the cases in §4.3 of [HE].

20. Let $A \in O_1(n) = O(1, n - 1)$. (a) Prove: nonnull eigenvectors of $A$ have eigenvalues $\pm 1$; for two independent null eigenvectors the product of their eigenvalues is 1. (b) Analyse eigenvalues and eigenvectors in the case $n = 2$. (c) Prove that if $A$ has a null eigenvector with eigenvalue $\lambda \ne \pm 1$, then it has an (independent) eigenvector with eigenvalue $1/\lambda$.

21. (a) For $A \in O_1(n)$ exactly one of the following is true: (i) $A$ has a timelike eigenvector, (ii) $A$ has a null eigenvector with eigenvalue $\ne \pm 1$, (iii) $A$ has a unique null eigenvector. (b) The first two cases in (a) are equivalent to the existence of an orthonormal basis relative to which the matrix of $A$ is in (i') $\{\pm 1\} \times O(n - 1)$, (ii') $SO_1(2) \times O(n - 2)$. (c) There exists $A \in O_1(3)$ such that $A$ has a unique eigenvector and it is null; hence type (iii) above exists for $n \ge 3$.

22. Semiclassical surface theory. In Exercise 4.7 replace Euclidean space $\mathbf{R}^3$ by Minkowski space $\mathbf{R}^3_1$. Assume $x$ has sign $\varepsilon = \pm 1$, so $\varepsilon = -1$ if the induced metric is Riemannian, $\varepsilon = 1$ if it is Lorentz. Prove (a, b, c, d) from Exercise 4.7 with the following modifications: (a) Change a sign in the usual definition of cross product. (c) $K = \varepsilon \det S$, with $\det S$ as before. (d) $H = \varepsilon h U$, with $h$ as before.

23. (Continuation) For each of the following immersions in $\mathbf{R}^3_1$ compute $\varepsilon$, $K$, $h$, and $S$, and determine the eigenvectors (principal directions) of $S$: (a) $x(u, v) = (u, v, (u + v)^2/2)$. (b) $x(u, v) = (u, v\cos u, v\sin u)$, $|v| < 1$. (c) $x$ as in (b), but $|v| > 1$. (Hint: All have $h = 0$.)

24. Let $M^n_\nu$ have indefinite metric. (a) For a frame field on a loop $\alpha\colon [0, 1] \to M$, consider the matrix $g \in O_\nu(n)$ such that $E_i(1) = \sum_j g_{ij}E_j(0)$. Express the criterion in Exercise 16(b) in terms of such matrices. (b) Show that if $M$ is G-orientable for any two of the three types (orientable, time-orientable, space-orientable) it is G-orientable for the third. (c) For connected $M$, express G-orientabilities in terms of a natural homomorphism $\pi_1(M) \to O_\nu(n)/O_\nu^{++}(n) \approx \mathbf{Z}_2 \times \mathbf{Z}_2$. (d) Find a connected $M$ that is G-orientable for none of the three types.

10

CALCULUS OF VARIATIONS

A basic problem in semi-Riemannian geometry is to measure the change in arc length of a curve segment under small displacements. To formalize the idea let $x\colon [a, b] \times (-\delta, \delta) \to M$ be a variation of a curve segment $\alpha$ (Definition 8.1), and for each $v \in (-\delta, \delta)$, let $L_x(v)$ be the length of the longitudinal curve $u \to x(u, v)$. Then $L_x$ is a real-valued function with $L_x(0)$ the length of $\alpha$. Under mild conditions the function $L = L_x$ is smooth, and we shall find useful formulas for the first and second variations of arc length on $x$; that is, for

$$L'(0) = \left.\frac{dL}{dv}\right|_{v=0} \quad\text{and}\quad L''(0) = \left.\frac{d^2L}{dv^2}\right|_{v=0},$$

the latter when $\alpha$ is geodesic. The second variation formula involves curvature, and the resulting link between geodesics and curvature has a broad range of applications.

FIRST VARIATION

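The class of curves $\alpha$ with $|\alpha'| > 0$ consists of all spacelike regular curves and all timelike (hence regular) curves, the two cases distinguished by the sign $\varepsilon$ of $\alpha$; that is, $\varepsilon = \operatorname{sgn}\langle \alpha', \alpha' \rangle = \pm 1$.

1. Lemma. Let $x$ be a variation of a curve segment $\alpha\colon [a, b] \to M$ with $|\alpha'| > 0$. If $L$ is the length function of $x$, then

$$L'(0) = \varepsilon \int_a^b \Bigl\langle \frac{\alpha'}{|\alpha'|}, V' \Bigr\rangle\, du,$$

where $\varepsilon$ is the sign of $\alpha$ and $V$ is the variation vector field of $x$.

Proof. $L(v) = \int_a^b |x_u(u, v)|\, du$, and if the $v$-interval $(-\delta, \delta)$ is small enough, $|x_u|$ is positive, hence differentiable. Then

$$L'(v) = \int_a^b \frac{\partial}{\partial v}|x_u(u, v)|\, du,$$

and furthermore $\operatorname{sgn}\langle x_u, x_u \rangle = \varepsilon$, hence $|x_u| = (\varepsilon\langle x_u, x_u \rangle)^{1/2}$. By Proposition 4.44, $x_{uv} = x_{vu}$, and we compute

$$\frac{\partial}{\partial v}|x_u| = \tfrac12(\varepsilon\langle x_u, x_u \rangle)^{-1/2}\, 2\varepsilon\langle x_u, x_{uv} \rangle = \varepsilon\langle x_u, x_{vu} \rangle / |x_u|.$$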

Setting $v = 0$ gives $x_u(u, 0) = \alpha'(u)$ and, by definition, $x_v(u, 0) = V(u)$. Thus $x_{vu}(u, 0) = V'(u)$, and the result follows. ∎

Even simple constructions create piecewise smooth curves by hooking together smooth ones. The preceding formula remains valid in this case, since finitely many discontinuities have no effect on the integral. However, we want to consider an alternative formula in which piecewise smoothness requires care.

A variation $x$ of a piecewise smooth curve $\alpha\colon [a, b] \to M$ is itself piecewise smooth provided $x$ is continuous, and for breaks $a < u_1 < \cdots < u_k < b$ the restriction of $x$ to each set $[u_{i-1}, u_i] \times (-\delta, \delta)$ is smooth. There is no loss in generality in assuming that $\alpha$ and $x$ have the same breaks, since we can always add trivial breaks, that is, those at which $\alpha$ or $x$ is smooth. The variation vector field $V$ of $x$ is always piecewise smooth. By contrast, the velocity vector field $\alpha'$ will generally have a discontinuity at each break $u_i$ ($1 \le i \le k$). This discontinuity is measured by the tangent vector

$$\Delta\alpha'(u_i) = \alpha'(u_i^+) - \alpha'(u_i^-) \in T_{\alpha(u_i)}(M),$$

where the first term derives from $\alpha|[u_i, u_{i+1}]$ and the second from $\alpha|[u_{i-1}, u_i]$. (We consider $a = u_0$, $b = u_{k+1}$.)

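2. Proposition (First Variation Formula). Let $\alpha\colon [a, b] \to M$ be a piecewise smooth curve segment with constant speed $c > 0$ and sign $\varepsilon$. If $x$ is a variation of $\alpha$, then

$$L'(0) = -\frac{\varepsilon}{c}\int_a^b \langle \alpha'', V \rangle\, du - \frac{\varepsilon}{c}\sum_{i=1}^k \langle \Delta\alpha'(u_i), V(u_i) \rangle + \frac{\varepsilon}{c}\langle \alpha', V \rangle\Big|_a^b,$$

where $u_1 < \cdots < u_k$ are the breaks of $\alpha$ and $x$.

Proof. Since $|\alpha'| = c$, the integrand in the preceding lemma is $\langle \alpha', V' \rangle / c$. We use integration by parts, a standard device in the calculus of variations. Since

$$\langle \alpha', V' \rangle = \frac{d}{du}\langle \alpha', V \rangle - \langle \alpha'', V \rangle,$$

it follows by the fundamental theorem of calculus that for any subinterval $[u_i, u_{i+1}]$

$$\int_{u_i}^{u_{i+1}} \langle \alpha', V' \rangle\, du = \langle \alpha', V \rangle\Big|_{u_i}^{u_{i+1}} - \int_{u_i}^{u_{i+1}} \langle \alpha'', V \rangle\, du.$$

Taking the sum from $i = 0$ to $i = k$ then gives $c$ times the integral in Lemma 1. In particular, for each $i = 1, 2, \ldots, k$, the contribution of the break at $u_i$ is

$$\langle \alpha'(u_i^-), V(u_i) \rangle - \langle \alpha'(u_i^+), V(u_i) \rangle = -\langle \Delta\alpha'(u_i), V(u_i) \rangle. \qquad ∎$$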

Note that the formula depends not on the entire variation $x$ but only on its infinitesimal model, the variation vector field $V$. In the summation term, $\Delta\alpha'(u_i)$ is a "one-point acceleration" measuring the change in the velocity of $\alpha$ at the break. If the corner is rounded off, the vanishing of this term will be compensated for by a change in the integral term. For a fixed endpoint variation $x$, the first and last transverse curves are constant, so all longitudinal curves run from $p = \alpha(a)$ to $q = \alpha(b)$. In particular, the variation vector field $V$ vanishes at $a$ and $b$, and hence so does the last term in the first variation formula.
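The role of the break term can be seen in a small numerical experiment. The following is a sketch under simplifying assumptions, not part of the text: the manifold is the Euclidean plane ($\varepsilon = 1$), the curve is the unit-speed broken segment $(0,0) \to (1,0) \to (1,1)$ on $[0, 2]$ with one break at $u = 1$, and the variation is $x(u, v) = \alpha(u) + vV(u)$ with $V(u) = \sin(\pi u/2)\,(1, 0)$, which vanishes at both endpoints. Since $\alpha'' = 0$ on each piece, the first variation formula predicts $L'(0) = -\langle \Delta\alpha'(1), V(1) \rangle$. All function names are illustrative.

```python
import numpy as np

def curve(u):
    """Unit-speed broken segment (0,0) -> (1,0) -> (1,1), parametrized on [0, 2]."""
    u = np.asarray(u, dtype=float)
    x = np.where(u <= 1.0, u, 1.0)
    y = np.where(u <= 1.0, 0.0, u - 1.0)
    return np.stack([x, y], axis=-1)

def variation_field(u):
    """V(u) = sin(pi u / 2) * (1, 0); zero at u = 0 and u = 2."""
    u = np.asarray(u, dtype=float)
    return np.stack([np.sin(np.pi * u / 2), np.zeros_like(u)], axis=-1)

def length(v, samples=2001):
    """Polyline length of the longitudinal curve u -> curve(u) + v V(u)."""
    total = 0.0
    for lo, hi in [(0.0, 1.0), (1.0, 2.0)]:       # sample each smooth piece separately
        u = np.linspace(lo, hi, samples)
        pts = curve(u) + v * variation_field(u)
        total += np.sum(np.linalg.norm(np.diff(pts, axis=0), axis=1))
    return total

# Numerical first variation by a central difference in v.
dv = 1e-4
numeric = (length(dv) - length(-dv)) / (2 * dv)

# Prediction: alpha'' = 0 and V vanishes at the endpoints, so only the break contributes.
jump = np.array([0.0, 1.0]) - np.array([1.0, 0.0])   # alpha'(1+) - alpha'(1-)
predicted = -float(jump @ variation_field(1.0))
```

Here `numeric` and `predicted` should agree closely: pushing the corner outward along the first leg lengthens the curve at unit rate, and that is exactly what the one-point acceleration term records.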

3. Corollary. A piecewise smooth curve segment $\alpha$ of constant speed $c > 0$ is an (unbroken) geodesic if and only if the first variation of arc length is zero for every fixed endpoint variation of $\alpha$.

Proof. If $\alpha$ is a geodesic, then $\alpha'' = 0$ and the breaks are all trivial: $\Delta\alpha'(u_i) = 0$. For fixed endpoint variations, $V(a)$ and $V(b)$ are zero. Thus $L'(0) = 0$.

Conversely, suppose $L'(0) = 0$ for every fixed endpoint variation $x$. First we show that each segment $\alpha|[u_i, u_{i+1}]$ is geodesic. It suffices to show that $\alpha''(t) = 0$ for $u_i < t < u_{i+1}$. Let $y$ be any tangent vector to $M$ at $\alpha(t)$, and let $f$ be a bump function on $[a, b]$ with $\operatorname{supp} f \subset [t - \delta, t + \delta] \subset [u_i, u_{i+1}]$. Let $Y$ be the vector field on $\alpha$ obtained by parallel translation of $y$, and finally let $V = fY$. Since $V(a)$ and $V(b)$ are both zero, the exponential formula $x(u, v) = \exp_{\alpha(u)}(vV(u))$ produces a fixed endpoint variation of $\alpha$ whose vector field is $V$. Since $L'(0) = 0$, the formula in Proposition 2 reduces to

$$0 = L'(0) = -\frac{\varepsilon}{c}\int_{t-\delta}^{t+\delta} \langle \alpha'', fY \rangle\, du.$$

This holds for all $y$, $\delta > 0$, and $f$. Hence $\langle \alpha''(t), y \rangle = 0$ for all $y$; hence $\alpha''(t) = 0$.

It remains to show that the breaks are trivial, so that $\alpha$ is an unbroken geodesic. As before, let $y$ be an arbitrary tangent vector at $\alpha(u_i)$, and let $f$ be a bump function at $u_i$ with $\operatorname{supp} f \subset [u_{i-1}, u_{i+1}]$. For a fixed endpoint variation with vector field $fY$ the first variation formula now reduces to

$$0 = L'(0) = -\frac{\varepsilon}{c}\langle \Delta\alpha'(u_i), y \rangle \quad \text{for all } y.$$

Hence $\Delta\alpha'(u_i) = 0$. ∎

SECOND VARIATION

For a variation $x$ of a curve segment $\alpha$ our goal is to compare $L(v)$, $v$ small, with the length $L(0)$ of $\alpha$. Thus $L''(0)$ is needed only when $L'(0) = 0$. By Corollary 3, it suffices in practice to find a formula for $L''(0)$ in case $\alpha$ is a geodesic.

The vector field $V(u) = x_v(u, 0)$ gives the velocities of the transverse curves of $x$ as they cross the base curve $\sigma$. Similarly the vector field $A(u) = x_{vv}(u, 0)$ gives their accelerations; we call $A$ the transverse acceleration vector field of $x$.

Recall that if $|\sigma'| > 0$, then any vector field $Y$ on $\sigma$ splits into the sum $Y^T + Y^\perp$ of its components tangent to $\sigma$ and perpendicular to $\sigma$, respectively. If $\sigma$ is a geodesic, then $(Y^\perp)' = (Y')^\perp$ is denoted unambiguously by $Y^{\prime\perp}$.

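4. Theorem (Synge's Formula for Second Variation). Let $\sigma\colon [a, b] \to M$ be a geodesic segment of speed $c > 0$ and sign $\varepsilon$. If $x$ is a variation of $\sigma$, then

$$L''(0) = \frac{\varepsilon}{c}\int_a^b \bigl\{\langle V^{\prime\perp}, V^{\prime\perp} \rangle - \langle R_{V\sigma'}V, \sigma' \rangle\bigr\}\, du + \frac{\varepsilon}{c}\langle \sigma', A \rangle\Big|_a^b,$$

where $V$ is the variation vector field, $A$ the transverse acceleration vector field of $x$.

Proof. Let $h = h(u, v) = |x_u(u, v)|$, so $L(v) = \int_a^b h\, du$, hence $L''(v) = \int_a^b (\partial^2 h/\partial v^2)\, du$. In the proof of Lemma 1, we computed $\partial h/\partial v = (\varepsilon/h)\langle x_u, x_{uv} \rangle$. Thus

$$\frac{\partial^2 h}{\partial v^2} = \frac{\varepsilon}{h}\Bigl\{\langle x_{uv}, x_{uv} \rangle + \langle x_u, x_{uvv} \rangle - \frac{\varepsilon}{h^2}\langle x_u, x_{uv} \rangle^2\Bigr\}.$$

The object now is to move $u$ to the right in each term. By Proposition 4.44, $x_{uv} = x_{vu}$ and

$$x_{uvv} = x_{vuv} = R(x_u, x_v)x_v + x_{vvu}.$$

Hence

$$\frac{\partial^2 h}{\partial v^2} = \frac{\varepsilon}{h}\Bigl\{\langle x_{vu}, x_{vu} \rangle + \langle x_u, R(x_u, x_v)x_v \rangle + \langle x_u, x_{vvu} \rangle - \frac{\varepsilon}{h^2}\langle x_u, x_{vu} \rangle^2\Bigr\}.$$

Setting $v = 0$ in this equation produces the following changes: $h \to c$, $x_u \to \sigma'$, $x_v \to V$, $x_{vu} \to V'$, $x_{vv} \to A$, and $x_{vvu} \to A'$. Thus, rearranging the curvature term, we find

$$\left.\frac{\partial^2 h}{\partial v^2}\right|_{v=0} = \frac{\varepsilon}{c}\Bigl\{\langle V', V' \rangle - \langle R_{V\sigma'}V, \sigma' \rangle + \langle \sigma', A' \rangle - \frac{\varepsilon}{c^2}\langle \sigma', V' \rangle^2\Bigr\}.$$

Two simplifications are now possible. First, since $\sigma$ is a geodesic, $\langle \sigma', A' \rangle = (d/du)\langle \sigma', A \rangle$. Second, since $\sigma'/c$ is a unit vector field, the tangential component of $V'$ is $\varepsilon\langle V', \sigma'/c \rangle(\sigma'/c)$. Thus

$$V' = (\varepsilon/c^2)\langle V', \sigma' \rangle\sigma' + V^{\prime\perp};$$

hence

$$\langle V', V' \rangle = (\varepsilon/c^2)\langle V', \sigma' \rangle^2 + \langle V^{\prime\perp}, V^{\prime\perp} \rangle.$$

Substitution then gives

$$\left.\frac{\partial^2 h}{\partial v^2}\right|_{v=0} = \frac{\varepsilon}{c}\Bigl\{\langle V^{\prime\perp}, V^{\prime\perp} \rangle - \langle R_{V\sigma'}V, \sigma' \rangle + \frac{d}{du}\langle \sigma', A \rangle\Bigr\}.$$

Finally, integration from $a$ to $b$ yields the required formula. ∎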

Comments. (1) The only part of the second variation formula that uses $x$ rather than $V$ is the endpoint term

$$d = \langle \sigma', A \rangle\Big|_a^b = \langle \sigma'(b), A(b) \rangle - \langle \sigma'(a), A(a) \rangle.$$

Clearly $d$ measures the contribution to $L''(0)$ of the convexity/concavity of the first and last transverse curves of $x$. For a fixed endpoint variation, $d = 0$, so $L''(0)$ depends solely on the variation vector field $V$.

(2) Since $R_{\sigma'\sigma'} = 0$, the curvature integrand can be written as $\langle R_{V^\perp\sigma'}V^\perp, \sigma' \rangle$, so the integral term in the formula depends only on $V^\perp$. This is to be expected, since when $d = 0$, the tangential component $V^T$ of $V$ amounts essentially to a change of parametrization of $\sigma$, which has no effect on arc length.

(3) If $L'(0) = 0$ (as in the fixed endpoint case) and $L''(0) \ne 0$, then obviously the sign of $L''(0)$ tells whether nearby longitudinal curves of $x$ are longer or shorter than the base geodesic $\sigma$.

THE INDEX FORM

To examine in more detail the lengths of curves with the same endpoints, fix the following notation: $p$ and $q$ points of $M$, $[0, b]$ an interval, and $\Omega(p, q)$ the set of all piecewise smooth curve segments $\alpha\colon [0, b] \to M$ from $p$ to $q$. Treating $\Omega(p, q)$ as a manifold establishes a powerful analogy that guides subsequent developments.

If $x$ is a fixed endpoint variation of $\alpha \in \Omega(p, q)$, then each longitudinal curve $u \to x(u, v)$ is a "point" $\alpha_v$ of $\Omega(p, q)$. As a 1-parameter family, $v \to \alpha_v$, the variation is thus a "curve" in $\Omega(p, q)$ starting at $\alpha$. The initial velocity of a curve is its linear description at its starting point; thus the "initial velocity" of $x$ is its variation vector field $V$. Since $x$ is a fixed endpoint variation, $V$ is zero at $0$ and $b$. The tangent vectors to a manifold at a point $p$ are exactly the initial velocities of all curves starting at $p$. This suggests

5. Definition. If $\alpha \in \Omega = \Omega(p, q)$, the tangent space $T_\alpha(\Omega)$ to $\Omega$ at $\alpha$ consists of all piecewise smooth vector fields $V$ on $\alpha$ such that $V(0) = 0$ and $V(b) = 0$.

As noted earlier, every $V \in T_\alpha(\Omega)$ is the vector field of some fixed endpoint variation of $\alpha$. Clearly $T_\alpha(\Omega)$ is a module over the ring of (piecewise smooth) functions on $[0, b]$.

The length function $L$ is a real-valued function on $\Omega(p, q)$. For a function $f$ on a manifold $M$, if $x \in T_p(M)$, then $x(f) = (d/dt)(f \circ \beta)|_{t=0}$, where $\beta$ is any curve in $M$ with initial velocity $x$. By analogy, if $V \in T_\alpha(\Omega)$, we can assign a meaning to $V(L)$. Corresponding to $\beta$ is a variation $x$ of $\alpha$ with $V$ as its vector field. Consider $x$ as usual as a "curve" $v \to \alpha_v$. Applying $L$ to it gives the length function $L_x$ of $x$. Thus $V(L)$ must be the first variation $L_x'(0)$.

In manifold theory, $p \in M$ is a critical point of $f \in \mathfrak{F}(M)$ provided $x(f) = 0$ for all $x \in T_p(M)$. Thus Corollary 3 is the assertion that the nonnull geodesics in $\Omega(p, q)$ are exactly the nonnull critical points of the length function $L$ on $\Omega(p, q)$. (The null case will be dealt with later.)

At a critical point $p \in M$ of $f \in \mathfrak{F}(M)$, the coordinate first derivatives of $f$ vanish and to study $f$ near $p$ we turn to second derivatives, expressed invariantly by the Hessian of $f$ at $p$. Its properties, developed in Exercise 1.14, show how to define its analogue at a critical point of $L$.

6. Definition. The index form $I_\sigma$ of a nonnull geodesic $\sigma \in \Omega(p, q)$ is the unique symmetric bilinear form

$$I_\sigma\colon T_\sigma(\Omega) \times T_\sigma(\Omega) \to \mathbf{R}$$

such that if $V \in T_\sigma(\Omega)$, then

$$I_\sigma(V, V) = L_x''(0),$$

where $x$ is any fixed endpoint variation of $\sigma$ with variation vector field $V$.

Recall that in the fixed endpoint case the second variation $L_x''(0)$ depends only on $V$, not the particular $x$. The existence of $I_\sigma$ follows from

7. Corollary. If $\sigma \in \Omega(p, q)$ is a geodesic of speed $c > 0$ and sign $\varepsilon$, then

$$I_\sigma(V, W) = \frac{\varepsilon}{c}\int_0^b \bigl\{\langle V^{\prime\perp}, W^{\prime\perp} \rangle - \langle R_{V\sigma'}W, \sigma' \rangle\bigr\}\, du$$

for all $V, W \in T_\sigma(\Omega)$.

Proof. Evidently the formula is bilinear and (by a symmetry of $R$) symmetric in $V$ and $W$. When $V = W$, Theorem 4 shows that it gives the required second variation. ∎

In this formula, tangential vector fields are negligible. Explicitly, if $V \in T_\sigma(\Omega)$ is tangent to $\sigma$, that is, $V^\perp = 0$, then

$$I_\sigma(V, W) = 0 \quad \text{for all } W \in T_\sigma(\Omega).$$

It follows immediately that

$$I_\sigma(V, W) = I_\sigma(V^\perp, W^\perp) \quad \text{for all } V, W \in T_\sigma(\Omega).$$

Thus there is no loss of information in restricting the index form $I_\sigma$ to $T_\sigma^\perp(\Omega) = \{V \in T_\sigma(\Omega) : V \perp \sigma\}$. We write $I_\sigma^\perp$ for this restriction.

Integration by parts transformed the first variation formula of Lemma 1 to that of Proposition 2; similarly it produces a new version of the formula above.


8. Corollary. Let a E Q(p, q ) be a nonnull geodesic. If a and V E To@) have breaks u1 < I . . < uk. then I

1

&

- R( V , C T ' ) ~ 'W , )d ~ / -

k

1

1

1( A V ' , W ) ( U ~ ) .

Ci=1

Proof. In the preceding corollary, write I

I

I

1

1

1

( V ' , W ' ) = (d/du)(V',W ) - ( V " , W ) ,

except at breaks.

For the curvature term, substitute I

1

(RV,. W , O ' )= - ( R ~ , * ( T 'W , ) = - ( R ( V , o')G', 1

w).

1

The contributions of - ( V " , W ) and curvature to the previous integral then constitute the integral in the statement of this corollary. Since the 1

1

derivative of ( V ' , W ' ) is undefined at breaks, its contribution to the integral is the sum of the terms

Now W vanishes at 0 and h, and for each break u i (1 I i I k ) we obtain 1

i

1

1

I

1

( V ' (u~-),W ( U ~) )( V'(U:), W(ui)) = - (AV', W ) ( U ~ ) . w

This formula for I, shows that Jacobi fields are in the offing. CONJUGATE POINTS

9. Definition. Points $\sigma(a)$ and $\sigma(b)$, $a \ne b$, on a geodesic $\sigma$ are conjugate along $\sigma$ provided there is a nonzero Jacobi field $J$ on $\sigma$ such that $J(a) = 0$ and $J(b) = 0$.

We can say that points $p$ and $q$ are conjugate along $\sigma$ if there is no ambiguity as to the numbers $a$ and $b$ for which $p = \sigma(a)$ and $q = \sigma(b)$. Conjugacy of $p = \sigma(a)$ and $q = \sigma(b)$ along $\sigma$ is independent of the parametrization of $\sigma|[a, b]$. Hence we often set $a = 0$. Examples will show that points may be conjugate along one geodesic joining them but not along another.

With notation as above, let $\mathcal{J}_{ab}$ be the set of all Jacobi fields on $\sigma$ that vanish at $a$ and $b$. Evidently $\mathcal{J}_{ab}$ is a subspace of the $n$-dimensional space consisting of those vanishing at $a$ (see Proposition 8.5). The dimension of $\mathcal{J}_{ab}$ is called the order of conjugacy of $\sigma(a)$ and $\sigma(b)$ along $\sigma$. Either of the following implies that such orders are at most $n - 1$: (1) The tangential Jacobi field $u \to (u - a)\sigma'(u)$ is zero at $a$ but not at $b$. (2) Jacobi fields in $\mathcal{J}_{ab}$ vanish twice, hence by Proposition 8.7(2) are everywhere perpendicular to $\sigma$.

The following result interprets a conjugate point $\sigma(b)$ of $p = \sigma(0)$ along a geodesic $\sigma$ as an "almost-meeting point" of geodesics starting from $p$ with initial velocities near $\sigma'(0)$. These neighboring geodesics may, but need not, actually pass through the point $\sigma(b)$.

10. Proposition. Let $\sigma\colon [0, b] \to M$ be a geodesic starting at $p$. The following are equivalent:

(1) $\sigma(b)$ is a conjugate point of $p = \sigma(0)$ along $\sigma$.

(2) There is a nontrivial variation $x$ of $\sigma$ through geodesics starting at $p$ such that $x_v(b, 0) = 0$. (Nontrivial means variation vector field not identically zero.)

(3) The exponential map $\exp_p\colon T_p(M) \to M$ is singular at $b\sigma'(0)$; that is, there is a nonzero tangent vector $x$ to $T_p(M)$ at $b\sigma'(0)$ such that $d\exp_p(x) = 0$.

Proof. (2) $\Rightarrow$ (1). Since each longitudinal curve of $x$ is a geodesic starting at $p$, the variation vector field $J$ of $x$ is a nonzero Jacobi field vanishing at $0$ and, by hypothesis, also at $b$.

The remainder of the proof is really a corollary of Proposition 8.6. By a reparametrization we can suppose $b = 1$.

(1) $\Rightarrow$ (3). Let $J$ be a nonzero Jacobi field on $\sigma$ that vanishes at $0$ and $b = 1$. Let $x$ be the tangent vector to $T_p(M)$ at $\sigma'(0)$ that canonically corresponds to $J'(0) \in T_p(M)$. Then by Proposition 8.6, $d\exp_p(x) = J(1) = 0$. Since $J$ is nonzero but $J(0) = 0$, we have $J'(0) \ne 0$, hence $x \ne 0$.

(3) $\Rightarrow$ (2). For $x$ as given, let the corresponding vector in $T_p(M)$ also be denoted by $x$. Then the proof of Proposition 8.6 shows that $x(u, v) = \exp_p(u(\sigma'(0) + vx))$ defines a variation $x$ as required. ∎

There are no conjugate points on sufficiently short geodesics; in fact, by assertion (3) in the proposition, if $\mathcal{N}$ is a normal neighborhood of $p$, then no point of $\mathcal{N}$ is conjugate to $p$ along a radial geodesic in $\mathcal{N}$.

11. Examples of Conjugate Points. (1) In $\mathbf{R}^n_\nu$ there are no conjugate points, since the geodesics starting at any point are radial lines that obviously do not refocus.

(2) On the sphere $S^n(r)$ all geodesics starting at a point $p$ meet again at the antipodal point $-p$ after arc length $\pi r$, so $p$ and $-p$ are conjugate. Also, $p$ is conjugate to itself, since the geodesics meet again at $p$ after arc length $2\pi r$. Then $-p$ is conjugate to $p$ again along geodesics of length $3\pi r$, and so on.

(3) If $M$ has constant curvature $C$, then on a geodesic $\sigma$ the Jacobi equation for $Y \perp \sigma$ is $Y'' + C\langle \sigma', \sigma' \rangle Y = 0$. The explicit solutions in Exercise 8.7 readily show the following: if $\sigma$ is spacelike and $C > 0$, or if $\sigma$ is timelike and $C < 0$, then points on $\sigma$ are conjugate if and only if the arc length between them is a multiple of $\pi/\sqrt{|C|}$. (Evidently the conjugacy is of maximum order $\dim M - 1$.) Otherwise there are no conjugate points in $M$. In particular, there are none in flat manifolds or along null geodesics in any constant curvature manifold.

For a nonnull geodesic $\sigma$, conjugacy of its endpoints can be read from its index form $I_\sigma^\perp$. In fact, the following shows that their order of conjugacy along $\sigma$ is precisely the nullity of $I_\sigma^\perp$, that is, the (finite) dimension of its nullspace (see Exercise 2.12).

12. Corollary. For a nonnull geodesic $\sigma \in \Omega(p, q)$ the nullspace of $I_\sigma^\perp$ is the space $\mathcal{J}_{0b}$ of Jacobi fields on $\sigma$ that vanish at $p = \sigma(0)$ and $q = \sigma(b)$.

Proof. That $\mathcal{J}_{0b}$ is contained in the nullspace of $I_\sigma^\perp$ is clear from Corollary 8. The proof of the reverse inclusion is analogous to that of Corollary 3. If $V$ is in the nullspace of $I_\sigma^\perp$, we must assume a priori that it has breaks $u_1 < \cdots < u_k$. First we show that each restriction $V|[u_i, u_{i+1}]$ is Jacobi. For a fixed $t$ inside the interval, let $y$ be an arbitrary tangent vector to $M$ at $\sigma(t)$. Construct $W = fY$ as in the corresponding case for Corollary 3. Then, since $V \perp \sigma$,

$$0 = I_\sigma(V, W) = -\frac{\varepsilon}{c}\int_{t-\delta}^{t+\delta} \langle V'' - R(V, \sigma')\sigma', fY \rangle\, du.$$

It follows as before that $V'' - R(V, \sigma')\sigma'$ is zero at $t$, hence identically zero on $[u_i, u_{i+1}]$, and so $V$ is Jacobi there. The proof that $V$ is unbroken again follows the same pattern as for Corollary 3. As an unbroken piecewise solution of the Jacobi equation, $V$ is in fact a smooth solution, so $V \in \mathcal{J}_{0b}$. ∎
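The spacing of conjugate points in Example 11(3) can be recovered numerically from the scalar Jacobi equation. The sketch below is an illustration only: it assumes a unit-speed geodesic, so that $\langle \sigma', \sigma' \rangle = \varepsilon$, integrates $y'' + C\varepsilon y = 0$ with $y(0) = 0$, $y'(0) = 1$ by a classical Runge-Kutta step, and reports the first positive zero. The function name and its parameters are hypothetical.

```python
import math

def first_zero_of_jacobi(C, eps, step=1e-3, s_max=20.0):
    """First s > 0 with y(s) = 0 for y'' + C*eps*y = 0, y(0) = 0, y'(0) = 1.

    Returns None if no zero occurs before s_max.
    """
    k = C * eps

    def rhs(y, dy):
        return dy, -k * y

    s, y, dy = 0.0, 0.0, 1.0
    while s < s_max:
        k1 = rhs(y, dy)
        k2 = rhs(y + 0.5 * step * k1[0], dy + 0.5 * step * k1[1])
        k3 = rhs(y + 0.5 * step * k2[0], dy + 0.5 * step * k2[1])
        k4 = rhs(y + step * k3[0], dy + step * k3[1])
        y_new = y + step * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        dy_new = dy + step * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        if s > 0 and y > 0 and y_new <= 0:
            # Linear interpolation between the last two steps locates the zero.
            return s + step * y / (y - y_new)
        s, y, dy = s + step, y_new, dy_new
    return None

spacelike_positive = first_zero_of_jacobi(4.0, +1)    # C > 0, spacelike geodesic
timelike_negative = first_zero_of_jacobi(-1.0, -1)    # C < 0, timelike geodesic
flat = first_zero_of_jacobi(0.0, +1)                  # no conjugate points expected
spacelike_negative = first_zero_of_jacobi(-1.0, +1)   # no conjugate points expected
```

With $C = 4$ the first zero should fall near $\pi/2 = \pi/\sqrt{|C|}$, and with $C = -1$ on a timelike geodesic near $\pi$; in the other two cases the solution never returns to zero on the integration range.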

LOCAL MINIMA AND MAXIMA

For a nonnull geodesic $\sigma \in \Omega(p, q)$, suppose that $\sigma$ is shorter than its neighboring curves $\tau$ in $\Omega(p, q)$, or merely that $L(\sigma) \le L(\tau)$. If $V \in T_\sigma(\Omega)$, then for any fixed endpoint variation $x$ attached to $V$, the length function $L_x$ has a local minimum at 0, hence $I_\sigma(V, V) = L_x''(0) \ge 0$. Thus $I_\sigma$ is positive semidefinite.


In a Riemannian manifold, this situation can certainly occur; for example, if $\sigma$ is a radial geodesic in a normal $\varepsilon$-neighborhood of $\sigma(0)$. On the other hand, we cannot expect to find Riemannian geodesics with $I_\sigma$ negative semidefinite, since small ripples in $\sigma$ will produce nearby longer curves.

13. Lemma. Let $\sigma$ be a nonnull geodesic of sign $\varepsilon$ in a semi-Riemannian manifold $M^n$ of index $\nu$. (1) If $I_\sigma$ is positive semidefinite, then $\nu = 0$ or $n$. (2) If $I_\sigma$ is negative semidefinite, then either $\nu = 1$ and $\varepsilon = -1$ or $\nu = n - 1$ and $\varepsilon = 1$.

Proof. (1) Assume $0 < \nu < n$. Then there is a unit tangent vector $y \perp \sigma'(0)$ with causal character opposite that of $\sigma$. Thus if $\varepsilon$ is the sign of $\sigma$, $\varepsilon\langle y, y\rangle = -1$. Let $Y$ be gotten by parallel translation of $y$, and let $\delta > 0$ be such that $\sin(u/\delta)$ is zero at the endpoints of the domain $[a, b]$ of $\sigma$. Then $V = \delta \sin(u/\delta)Y \in T_\sigma(\Omega)$. In the second variation formula, take $|\sigma'| = c = 1$ for simplicity, and let $K = K(V, \sigma')$. Then

$$I_\sigma(V, V) = \varepsilon\int_a^b \{\langle V', V'\rangle - K\langle V, V\rangle\varepsilon\}\,du$$
$$= \varepsilon\int_a^b \{\langle y, y\rangle \cos^2(u/\delta) + K\delta^2 \sin^2(u/\delta)\}\,du$$
$$= \int_a^b \{-\cos^2(u/\delta) + \varepsilon K\delta^2 \sin^2(u/\delta)\}\,du.$$

But $K$ is bounded on $[a, b]$; hence for $\delta > 0$ sufficiently small, $I_\sigma(V, V) < 0$.

(2) If the conclusion is false, there is a vector $y \perp \sigma'(0)$ such that $\varepsilon\langle y, y\rangle = +1$. Then a proof as for (1) shows that $I_\sigma$ cannot be negative semidefinite.

Reversing the metric of $M$ has no effect on the lengths of curves; thus in considering definiteness (or semidefiniteness) of the index form we need only consider

(1) arbitrary geodesics in a Riemannian manifold (where, if $I_\sigma$ is definite, it can only be positive definite);
(2) timelike geodesics in a Lorentz manifold (where, if $I_\sigma$ is definite, it can only be negative definite).

These two cases can be unified as follows.

14. Definition. A geodesic $\sigma$ in $M$ is cospacelike provided the subspace $\sigma'(s)^\perp$ of $T_{\sigma(s)}M$ is spacelike for one (hence every) $s$.


Then $\sigma$ is necessarily nonnull, and $M$ is Riemannian or Lorentz depending on the sign of $\sigma$. Our goal now is to relate definiteness of the index form to the existence of conjugate points.

15. Lemma. If $V$ and $W$ are Jacobi fields on a geodesic $\sigma$, then $\langle V', W\rangle - \langle V, W'\rangle$ is constant.

Proof. $\langle V', W\rangle' = \langle V'', W\rangle + \langle V', W'\rangle = -\langle R_{V\sigma'}W, \sigma'\rangle + \langle V', W'\rangle$. By a symmetry of curvature this is the same as $\langle V, W'\rangle'$.

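16. Lemma. On a geodesic $\sigma$ let $Y_1, \ldots, Y_k$ be Jacobi fields such that $\langle Y_i', Y_j\rangle = \langle Y_i, Y_j'\rangle$ for all $i, j$. If $V = \sum f_i Y_i$, then

$$\langle V', V'\rangle - \langle R_{V\sigma'}V, \sigma'\rangle = \langle A, A\rangle + \langle V, B\rangle',$$

where $A = \sum f_i' Y_i$ and $B = \sum f_i Y_i'$.

Proof. Since $V' = A + B$,

$$\langle V, B\rangle' = \langle V', B\rangle + \langle V, B'\rangle = \langle A, B\rangle + \langle B, B\rangle + \Big\langle V, \sum f_i' Y_i'\Big\rangle + \Big\langle V, \sum f_i Y_i''\Big\rangle.$$

The Jacobi equation $Y_i'' = R_{Y_i\sigma'}\sigma'$ converts the last summand to $-\langle R_{V\sigma'}V, \sigma'\rangle$. Using the hypothesis on the $Y_i$'s gives

$$\Big\langle V, \sum f_i' Y_i'\Big\rangle = \sum f_j f_i'\langle Y_j, Y_i'\rangle = \sum f_j f_i'\langle Y_j', Y_i\rangle = \langle A, B\rangle.$$

Thus

$$\langle V, B\rangle' = 2\langle A, B\rangle + \langle B, B\rangle - \langle R_{V\sigma'}V, \sigma'\rangle.$$

Since $\langle V', V'\rangle = \langle A + B, A + B\rangle$, the result follows.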

17. Theorem. Let $\sigma \in \Omega(p, q)$ be a cospacelike geodesic of sign $\varepsilon$.

(1) If there are no conjugate points of $p = \sigma(0)$ along $\sigma$, then the index form $I_\sigma^\perp$ is definite (positive if $\varepsilon = 1$, negative if $\varepsilon = -1$).
(2) If $q = \sigma(b)$ is the only conjugate point of $p$ along $\sigma$, then $I_\sigma$ is semidefinite but not definite.
(3) If there is a conjugate point $\sigma(r)$ of $p$ along $\sigma$ with $0 < r < b$, then $I_\sigma$ is not semidefinite.

Proof. (1) We must show that $\varepsilon I_\sigma$ is positive definite on $T_\sigma^\perp(\Omega)$. Let $Y_1, \ldots, Y_{n-1}$ be Jacobi fields on $\sigma$ that vanish at $u = 0$ and have $Y_1'(0), \ldots, Y_{n-1}'(0)$ a basis for $\sigma'(0)^\perp$. These Jacobi fields are then perpendicular to $\sigma$ by Proposition 8.7. Furthermore, since there are no conjugate points of $\sigma(0)$ along $\sigma$, it follows that for each $0 < u \le b$ the vectors $Y_1(u), \ldots, Y_{n-1}(u)$ form a basis for $\sigma'(u)^\perp$. Thus if $V \in T_\sigma^\perp(\Omega)$, there are (unique) piecewise smooth functions $f_i$ such that $V = \sum f_i Y_i$ on $(0, b]$. Using Exercise 1.17, it is not hard to show that the functions $f_i$ have continuous extensions to $[0, b]$. Since the $Y_i$'s vanish at 0, Lemma 15 gives $\langle Y_i', Y_j\rangle = \langle Y_i, Y_j'\rangle$. Thus we can apply Lemma 16 to get

$$\langle V', V'\rangle - \langle R_{V\sigma'}V, \sigma'\rangle = \langle A, A\rangle + \langle V, B\rangle',$$

where $A = \sum f_i' Y_i$ and $B = \sum f_i Y_i'$. Hence

$$\varepsilon I_\sigma(V, V) = \frac{1}{c}\int_0^b \langle A, A\rangle\,du + \frac{1}{c}\langle V, B\rangle\Big|_0^b.$$

The second summand is zero since $V$ vanishes at 0 and $b$. Since $\sigma$ is cospacelike and $A \perp \sigma$, we have $\langle A, A\rangle \ge 0$. Hence $\varepsilon I_\sigma(V, V) \ge 0$. Furthermore,

$$I_\sigma(V, V) = 0 \;\Rightarrow\; \int_0^b \langle A, A\rangle\,du = 0 \;\Rightarrow\; \langle A, A\rangle = 0 \;\Rightarrow\; A = 0 \;\Rightarrow\; \text{each } f_i \text{ is constant} \;\Rightarrow\; V = 0.$$

(2) Corollary 12 says that $I_\sigma^\perp$ has a nullspace, hence is not definite. A continuity argument will show that $I_\sigma$ is semidefinite. Briefly, if $V \in T_\sigma^\perp(\Omega)$, write $V(u) = (b - u)Z(u)$, and for $b_i = b - (1/i)$, define $V_i(u)$ to be $(b_i - u)Z(u)$ on $[0, b_i]$ and zero thereafter. Then $\{I_\sigma(V_i, V_i)\}$ converges to $I_\sigma(V, V)$. But (1) applies to $\sigma\,|\,[0, b_i]$ to show $\varepsilon I_\sigma(V_i, V_i) \ge 0$ for all $i$.

(3) By hypothesis there is a nonzero Jacobi field $J$ on $\sigma\,|\,[0, r]$ that vanishes at 0 and $r$. Extend $J$ to a vector field $Y$ on $\sigma$ by defining $Y = 0$ on $[r, b]$. Then $Y'(r^-) = J'(r) \ne 0$ since $J$ is nonzero. But $Y'(r^+) = 0$, so $(\Delta Y')(r) \ne 0$. Choose any $W \in T_\sigma^\perp(\Omega)$ such that $W(r) = (\Delta Y')(r)$. If $\varepsilon = \pm 1$ is the sign of $\sigma$, it suffices to find a $\delta > 0$ such that $\varepsilon I_\sigma(Y + \delta W, Y + \delta W) < 0$. (By Lemma 13, there is always some $Z \in T_\sigma(\Omega)$ with $\varepsilon I_\sigma(Z, Z) > 0$.) Now

$$\varepsilon I_\sigma(Y + \delta W, Y + \delta W) = \varepsilon\{I_\sigma(Y, Y) + 2\delta I_\sigma(Y, W) + \delta^2 I_\sigma(W, W)\}.$$

By Corollary 8, $I_\sigma(Y, Y) = 0$ since $Y$ is piecewise Jacobi and zero at its only break $r$. But $\varepsilon I_\sigma(Y, W)$ reduces to $-(1/c)\langle\Delta Y', W\rangle(r) = -(1/c)|\Delta Y'(r)|^2 < 0$ since $\sigma$ is cospacelike. Thus the result follows for $\delta > 0$ sufficiently small.
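The three cases of Theorem 17 can be seen on the unit sphere $S^2$, where the first conjugate point along a unit speed geodesic occurs at arc length $\pi$. The following sketch is an illustration under stated assumptions, not the book's method: it evaluates $I_\sigma(V, V) = \int_0^b \{(f')^2 - f^2\}\,du$ for the single test field $V = fE$, with $E$ a parallel unit normal field and $f(u) = \sin(m\pi u/b)$. The names `simpson` and `index_form_on_sphere` are hypothetical helpers introduced here.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def index_form_on_sphere(b, m=1):
    """I_sigma(V, V) for V = sin(m*pi*u/b) E on a unit speed geodesic of
    length b in the unit sphere, with E a parallel unit normal field."""
    w = m * math.pi / b
    f = lambda u: math.sin(w * u)
    df = lambda u: w * math.cos(w * u)
    # The sectional curvature is 1, so the integrand is f'^2 - f^2.
    return simpson(lambda u: df(u) ** 2 - f(u) ** 2, 0.0, b)

if __name__ == "__main__":
    for b in (math.pi / 2, math.pi, 1.5 * math.pi):
        print(b, index_form_on_sphere(b))
```

The value is positive for $b < \pi$, vanishes (up to rounding) for $b = \pi$, and is negative for $b > \pi$. A single test field cannot establish definiteness, so this only illustrates the theorem; it does show directly that the index form fails to be positive definite once $b \ge \pi$.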

The proof of (3) is a rigorous version of the following heuristic argument. A conjugate point is an almost-meeting point; let us suppose that the meeting actually takes place, so there is another geodesic segment $\tau$ near $\sigma\,|\,[0, r]$ with the same endpoints and length. (This occurs on spheres, for example.) Then the competitive curve $\tau + \sigma\,|\,[r, b]$ has length $L(\sigma)$, but is broken at $\sigma(r)$. In the Riemannian case, rounding off this corner gives a strictly shorter curve $\tilde\tau$. Thus $\sigma$ is not a local minimum of arc length, and if $V \in T_\sigma(\Omega)$ points from $\sigma$ to $\tilde\tau$, we expect $I_\sigma(V, V) < 0$. (Inequalities reverse in the Lorentz case.)

The three implications in Theorem 17 involve sets of all-inclusive, mutually exclusive conditions, hence all three converses hold. Thus for example, if $I_\sigma^\perp$ is indefinite there is a conjugate point of $\sigma(0)$ on $\sigma$.

18. Remark. In terms of the compact-open topology on $\Omega(p, q)$, the length function $L$ on $\Omega(p, q)$ has a local minimum at $\sigma$ if there is a neighborhood $\mathcal{N}$ of $\sigma$ in $\Omega(p, q)$ such that $L(\tau) \ge L(\sigma)$ for all $\tau \in \mathcal{N}$. A local minimum is strict if also $L(\tau) = L(\sigma)$ implies that $\tau$ is a reparametrization of $\sigma$.

(1) The first assertion in Theorem 17 can be strengthened to: If there are no conjugate points of $\sigma(0)$ along $\sigma$, the length function $L$ on $\Omega(p, q)$ has

a strict local minimum if $\varepsilon = 1$ ($M$ Riemannian),
a strict local maximum if $\varepsilon = -1$ ($\sigma$ timelike in $M$ Lorentz).

The proof uses Proposition 10 and the refinement (2) following Lemma 5.14.

(2) If $\sigma(b)$ is the only conjugate point of $\sigma(0)$ along $\sigma$, then no conclusion can be drawn as to whether $\sigma$ is a local minimum or maximum. For a Riemannian example, let $\sigma$ be a semicircle of longitude joining the poles $p$ and $q$ of the ordinary sphere $S^2$, so the only conjugate point of $p$ along $\sigma$ is $q$. Being a global minimum for arc length from $p$ to $q$, $\sigma$ is certainly a local minimum (but not a strict one). Now change the metric on $S^2$ symmetrically around $\sigma$ so that tangent vectors pointing due north or south become slightly shorter, to the fourth order in distance from $\sigma$. Then $\sigma$ is still a geodesic in the new surface $\Sigma$ and the Gaussian curvature along $\sigma$ is unchanged, so $q$ is still the only conjugate point of $p$ along $\sigma$. But in $\Sigma$ nearby semicircles of longitude are shorter than $\sigma$, so $\sigma$ is not a local minimum.

(3) If there is a conjugate point $\sigma(r)$ of $p$ along $\sigma$ with $0$ ...

in stable equilibrium: any small displacement makes it longer, thus tending to restore it to its original position. But if $\sigma$ is long enough to contain a conjugate point, then the equilibrium is unstable: there are arbitrarily nearby positions where the string is shorter, and hence it tends to move farther away from its original position.

SOME GLOBAL CONSEQUENCES

The curvature of a manifold can be linked to its global structure (as a topological space or smooth manifold) by means of conjugate points. Curvature determines conjugate points via the Jacobi equation; conjugate points influence global structure, for example, through exponential maps as in Proposition 10. We consider two extreme cases: one with no conjugate points, the other with many.

Consider the Jacobi equation in terms of tidal forces (8.8) as $Y'' = F_{\sigma'}(Y)$. Then the geometric meaning of conjugate points makes it clear that tidal forces that attract tend to cause conjugate points, while those that repel tend to prevent them. If the geodesic $\sigma$ is cospacelike, then for any unit vector $y \perp \sigma$, the $y$-component of $F_{\sigma'}(y) = R_{y\sigma'}\sigma'$ is $-\langle R_{y\sigma'}y, \sigma'\rangle y$. Thus $\sigma$'s neighbors in the $y$ direction are attracted if $\langle R_{y\sigma'}y, \sigma'\rangle > 0$, repelled if $\langle R_{y\sigma'}y, \sigma'\rangle < 0$.

19. Lemma. If $\langle R_{v\sigma'}v, \sigma'\rangle \le 0$ for every vector $v$ perpendicular to the cospacelike geodesic $\sigma$, then there are no conjugate points along $\sigma$.

Proof. Let $J \ne 0$ be a Jacobi field along $\sigma$ with $J \perp \sigma$ and $J(0) = 0$. Let $h(s) = \langle J(s), J(s)\rangle$. Then $h' = 2\langle J', J\rangle$, and using the Jacobi equation,

$$\tfrac{1}{2}h'' = \langle J', J'\rangle + \langle J'', J\rangle = \langle J', J'\rangle - \langle R_{J\sigma'}J, \sigma'\rangle.$$

Thus the hypothesis implies $h'' \ge 0$. Evidently $h(0) = h'(0) = 0$. Since $J \ne 0$ we have $J'(0) \ne 0$; hence $h$ is positive near, but not at, 0. Thus $h(t) > 0$ for all $t \ne 0$.

20. Corollary. (1) A Riemannian manifold with sectional curvature $K \le 0$ has no conjugate points. (2) A Lorentz manifold with $K \ge 0$ on all timelike tangent planes has no conjugate points along any timelike geodesic.

This result can be tested on hyperbolic space $H^n(r)$ and the Lorentz sphere $S_1^n(r)$.

21. Lemma. If $M$ is a complete connected Riemannian manifold with $K \le 0$, then for each point $p \in M$ the exponential map $\exp_p: T_p(M) \to M$ is a covering map.


Proof. By Proposition 10, the absence of conjugate points means that $\exp_p$ is a local diffeomorphism. Assign $T_p(M)$ the induced metric tensor $\exp_p^*(g)$, making $\exp_p$ a local isometry. Since $\exp_p$ carries each ray $t \to tv$ to the geodesic $\gamma_v$, it follows that these rays are geodesics. The Riemannian manifold $T_p(M)$ is thus complete at 0; hence by the Hopf-Rinow theorem (5.21) it is complete. The result then follows from Corollary 7.29.

Thus Corollary 7.27 gives

22. Theorem (Hadamard). Let $H$ be a complete, simply connected Riemannian manifold with sectional curvature $K \le 0$. Then for each $p \in H$ the exponential map $\exp_p: T_p(H) \to H$ is a diffeomorphism. In particular,

(1) $H$ is diffeomorphic to $R^n$.
(2) For $p, q \in H$, there is a unique geodesic $\gamma: R \to H$ such that $\gamma(0) = p$ and $\gamma(1) = q$.

Such manifolds (hyperbolic space, for example) thus have the same manifold structure as Euclidean space and share the best known Euclidean geometric property: "two points determine a line."

To show that conjugate points exist we need only require that average tidal forces be sufficiently attractive. In view of Lemma 8.9 this average is given by Ricci curvature; for $v$ cospacelike,

$$\mathrm{Ric}(v, v) = \sum_{m=2}^{n} \langle R_{e_m v}\, e_m, v\rangle,$$

where $e_2, \ldots, e_n$ is an orthonormal basis for $v^\perp$.

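23. Lemma. Let $\sigma$ be a unit speed cospacelike geodesic in $M^n$ along which $\mathrm{Ric}(\sigma', \sigma') \ge (n - 1)C > 0$. If $\sigma$ has length $L(\sigma) \ge \pi/\sqrt{C}$, there are conjugate points along $\sigma$.

Proof. Suppose $\sigma: [0, b] \to M$ with $|\sigma'| = 1$ and $b = L(\sigma) = \pi/\sqrt{C}$. If $\varepsilon$ is the sign of $\sigma$, then by Theorem 17(1) it suffices to show that $\varepsilon I_\sigma^\perp$ is not positive definite. Thus we want a vector field $0 \ne V \in T_\sigma^\perp(\Omega)$ with $\varepsilon I_\sigma(V, V) \le 0$. Let $\sigma', E_2, \ldots, E_n$ be a parallel frame field on $\sigma$. If $f$ is a smooth function on $[0, b]$ vanishing at the endpoints, then $fE_j \in T_\sigma^\perp(\Omega)$ for $2 \le j \le n$. By Corollary 7,

$$\varepsilon I_\sigma(fE_j, fE_j) = \int_0^b \big[(f')^2 - f^2\langle R_{E_j\sigma'}E_j, \sigma'\rangle\big]\,du.$$

Thus

$$\sum_{j=2}^{n} \varepsilon I_\sigma(fE_j, fE_j) = \int_0^b \big[(n - 1)(f')^2 - f^2\,\mathrm{Ric}(\sigma', \sigma')\big]\,du \le (n - 1)\int_0^b \big[(f')^2 - Cf^2\big]\,du.$$

Since $b = \pi/\sqrt{C}$, we are led to choose $f(t) = \sin(\sqrt{C}\,t)$, for which the preceding integral becomes

$$\int_0^{\pi/\sqrt{C}} C\big(\cos^2\sqrt{C}\,u - \sin^2\sqrt{C}\,u\big)\,du = 0.$$

Hence $\varepsilon I_\sigma(fE_j, fE_j) \le 0$ for some $2 \le j \le n$.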
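The summed estimate in the proof of Lemma 23 can be checked numerically for a given Ricci profile along the geodesic. The sketch below is illustrative only: it evaluates $\int_0^b [(n-1)(f')^2 - f^2\,\mathrm{Ric}(\sigma', \sigma')]\,du$ with $f(t) = \sin(\sqrt{C}\,t)$ and $b = \pi/\sqrt{C}$. The function `summed_index_form` and the sample profiles passed to it are assumptions made for the example; no particular manifold is being modeled.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def summed_index_form(n, C, ric):
    """Sum over j = 2..n of eps * I_sigma(f E_j, f E_j) for
    f(t) = sin(sqrt(C) t) on [0, pi/sqrt(C)].

    ric(u) is a hypothetical profile giving Ric(sigma', sigma') at
    parameter u; the lemma assumes ric(u) >= (n - 1) * C.
    """
    root_c = math.sqrt(C)
    b = math.pi / root_c
    f = lambda u: math.sin(root_c * u)
    df = lambda u: root_c * math.cos(root_c * u)
    return simpson(lambda u: (n - 1) * df(u) ** 2 - f(u) ** 2 * ric(u), 0.0, b)

if __name__ == "__main__":
    n, C = 3, 2.0
    print(summed_index_form(n, C, lambda u: (n - 1) * C))
    print(summed_index_form(n, C, lambda u: (n - 1) * C * (1 + 0.5 * math.sin(u) ** 2)))
```

With the borderline profile $\mathrm{Ric} = (n-1)C$ the sum vanishes up to rounding, and any larger profile makes it negative, so at least one of the terms $\varepsilon I_\sigma(fE_j, fE_j)$ is nonpositive, as the proof requires.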

This lemma has a powerful consequence in the Riemannian case.

24. Theorem (Myers). If $M$ is a complete connected Riemannian manifold with $\mathrm{Ric} \ge (n - 1)C > 0$, then

(1) $M$ is compact and has diameter $\le \pi/\sqrt{C}$,
(2) the fundamental group $\pi_1(M)$ is finite.

Proof. (1) As for any metric space, the diameter of $M$ is $\sup\{d(p, q) : p, q \in M\}$, where $d$ is the Riemannian distance. If $p, q \in M$, then by Proposition 5.22 there is a shortest geodesic $\sigma$ from $p$ to $q$. In particular, $\sigma$ locally minimizes arc length, so by Theorem 17, $q$ is the only possible conjugate point of $p$ along $\sigma$. Thus, by the lemma, $L(\sigma) > \pi/\sqrt{C}$ is impossible, for it forces $\sigma$ to have an even earlier conjugate point of $p$. Consequently, $d(p, q) = L(\sigma) \le \pi/\sqrt{C}$. Hence by the Hopf-Rinow theorem, $M$ is compact.

(2) Let $k: \tilde{M} \to M$ be the simply connected Riemannian covering of $M$. Since $\tilde{M}$ shares the hypothesized properties of $M$, it is compact. For $m \in M$ the counterimage $k^{-1}(m)$ has no cluster points, since $k$ is locally one-to-one; hence $k^{-1}(m)$ is finite. But Corollary A.10 implies that there is a one-to-one correspondence between $k^{-1}(m)$ and $\pi_1(M)$.

The bound on the diameter of $M$ cannot be reduced, as is shown by $S^n(1/\sqrt{C})$, whose (intrinsic) diameter is exactly $\pi/\sqrt{C}$. The proof of Lemma 23 amounts to a comparison of $M$ with this sphere. Also it is not enough merely to assume $\mathrm{Ric} > 0$ or even $K > 0$, since, for example, a paraboloid of revolution has positive curvature but is not compact.

As an application of the theorems of Hadamard and Myers, consider the product manifold $M = S^2 \times S^1$. The Riemannian product metric


has curvature $K \ge 0$, but there can be no metric for which $K \le 0$, since the simply connected covering manifold of $M$ is $S^2 \times R^1 \not\approx R^3$. Furthermore, there can be no metric with $K > 0$. In fact, if $K > 0$, then (since $M$ is compact) $K \ge c > 0$, so Myers' theorem applies, showing that $\pi_1(M)$ is finite, a contradiction, since $\pi_1(M) \approx \pi_1(S^1) \approx Z$. It follows in particular that for every Riemannian metric on $S^2 \times S^1$ there is a tangent plane $\Pi$ with $K(\Pi) = 0$. (Whether $S^2 \times S^2$ has this property is not known.)

For Riemannian applications of geodesic variational methods, consult [CE], [Mi], [GKM]; and for some Lorentz analogues and extensive references, see [BE].

THE ENDMANIFOLD CASE

The study of the length of curves joining two points of $M$ can be generalized by replacing one or both of the points by submanifolds. We shall replace only the initial point by such an endmanifold. Fix the notation: $P$ a semi-Riemannian submanifold of $M$, $q$ any point of $M$, $[0, b]$ an interval. Then let $\Omega(P, q)$ be the set of all piecewise smooth curves $\alpha: [0, b] \to M$ that run from $P$ to $q$. Extending the previous analogy, a "curve in $\Omega(P, q)$ starting at $\sigma$" is a piecewise smooth variation $x$ of $\sigma$ whose longitudinal curves are all in $\Omega(P, q)$. Thus the first transverse curve of $x$ is in $P$, while the last is constant at $q$. We call $x$ a $(P, q)$-variation of $\sigma$. As before, the vector fields of such variations provide the natural notion of tangent vector to $\Omega(P, q)$.

25. Definition. The tangent space $T_\sigma(\Omega)$ to $\Omega(P, q)$ at $\sigma$ consists of all piecewise smooth vector fields $V$ on $\sigma$ such that $V(0) \in T_{\sigma(0)}P$ and $V(b) = 0$.

Lemma 49 will show, in particular, that every $V \in T_\sigma(\Omega)$ is the variation vector field of some $(P, q)$-variation of $\sigma$. Corollary 3 now generalizes to assert that the nonnull critical points of the length function $L$ on $\Omega(P, q)$ are the geodesics $\sigma \in \Omega(P, q)$ that are normal to $P$: $\sigma'(0) \perp P$.

26. Corollary. Let $\sigma \in \Omega(P, q)$ have $|\sigma'| > 0$. Then $L_x'(0) = 0$ for every $(P, q)$-variation $x$ of $\sigma$ if and only if $\sigma$ is a geodesic normal to $P$.

Proof. By Proposition 2, $L_x'(0)$ is zero for any $(P, q)$-variation of a normal geodesic. To prove the converse: An argument as for Corollary 3 shows that $\sigma$ is an (unbroken) geodesic. Then for any vector $y \in T_{\sigma(0)}P$, choose $V \in T_\sigma(\Omega)$ so that $V(0) = y$. Let $x$ be a $(P, q)$-variation of $\sigma$ whose vector field is $V$. Since $V(b) = 0$,

$$0 = L_x'(0) = \frac{\varepsilon}{c}\langle\sigma', V\rangle\Big|_0^b = -\frac{\varepsilon}{c}\langle\sigma'(0), y\rangle.$$

Hence $\sigma'(0) \perp P$.

For a normal geodesic $\sigma \in \Omega(P, q)$, the second variation of arc length will involve the shape tensor $II$ of the endmanifold $P$. In fact, for a $(P, q)$-variation $x$ of $\sigma$, the expression $\langle\sigma', A\rangle\big|_0^b$ in Theorem 4 (with $a = 0$) reduces to $-\langle\sigma'(0), A(0)\rangle$ since the last transverse curve of $x$ is constant. If $\alpha$ is the first transverse curve, then by definition, $A(0) = \alpha''(0)$. Since $\sigma'(0) \perp P$,

$$\langle\sigma'(0), A(0)\rangle = \langle\sigma'(0), \mathrm{nor}\,\alpha''(0)\rangle = \langle\sigma'(0), II(\alpha'(0), \alpha'(0))\rangle.$$

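But $\alpha'(0)$ is just $V(0)$, where $V$ is the variation vector field of $x$. Thus we conclude that

$$L_x''(0) = \frac{\varepsilon}{c}\int_0^b \{\langle V'^{\perp}, V'^{\perp}\rangle - \langle R_{V\sigma'}V, \sigma'\rangle\}\,du - \frac{\varepsilon}{c}\langle\sigma'(0), II(V(0), V(0))\rangle,$$

where $V'^{\perp}$ denotes the component of $V'$ perpendicular to $\sigma$.

As before, the index form $I_\sigma$ of a nonnull normal geodesic $\sigma \in \Omega(P, q)$ is defined to be the unique symmetric $R$-bilinear form on $T_\sigma(\Omega)$ such that $I_\sigma(V, V) = L_x''(0)$ for any $(P, q)$-variation $x$ with vector field $V \in T_\sigma(\Omega)$. Thus the symmetry of the shape tensor $II$ gives the following generalization of Corollary 7.

27. Corollary. If $\sigma \in \Omega(P, q)$ is a normal geodesic of speed $c > 0$ and sign $\varepsilon$, then, for $V, W \in T_\sigma(\Omega)$,

$$I_\sigma(V, W) = \frac{\varepsilon}{c}\int_0^b \{\langle V'^{\perp}, W'^{\perp}\rangle - \langle R_{V\sigma'}W, \sigma'\rangle\}\,du - \frac{\varepsilon}{c}\langle\sigma'(0), II(V(0), W(0))\rangle.$$

Again $I_\sigma(V, W) = I_\sigma(V^\perp, W^\perp)$, so we need consider $I_\sigma$ only on $T_\sigma^\perp(\Omega) = \{V \in T_\sigma(\Omega) : V \perp \sigma\}$.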

FOCAL POINTS

If $\sigma$ is a geodesic starting at $p$, then the infinitesimal model of a variation of $\sigma$ through geodesics starting at $p$ is a Jacobi field on $\sigma$ that is zero at $p$. Now we replace the point $p$ by a semi-Riemannian submanifold $P$.

28. Proposition. A Jacobi field $V$ on a geodesic $\sigma$ normal to $P$ is the variation vector field of a variation $x$ of $\sigma$ through normal geodesics if and only if

$$V(0) \text{ is tangent to } P \quad\text{and}\quad \tan V'(0) = \widetilde{II}(V(0), \sigma'(0))$$

for $\widetilde{II}$ as in Remarks 4.39. (A Jacobi field satisfying these conditions is called a P-Jacobi field on $\sigma$.)

Figure 1

Proof. If $V$ is the variation vector field of such an $x$, then the first transverse curve $\alpha$ of $x$ lies in $P$, and the vector field $Z(v) = x_u(0, v)$ on $\alpha$ is normal to $P$. (See Figure 1.) Thus $V(0) = \alpha'(0)$, which is tangent to $P$, and

$$V'(0) = x_{vu}(0, 0) = x_{uv}(0, 0) = Z'(0).$$

But $\tan Z' = \widetilde{II}(\alpha', Z)$ and, since $Z(0) = \sigma'(0)$, the second condition follows.

Conversely, suppose $V$ satisfies the conditions and let $\alpha$ be a curve in $P$ with $\alpha'(0) = V(0)$. First we show:

(*) There is a normal vector field $Z$ on $\alpha$ such that $Z(0) = \sigma'(0)$ and $Z'(0) = V'(0)$.

Let $A$ and $B$ be the (normal) vector fields on $\alpha$ gotten by normal parallel translation of $\sigma'(0)$ and $\mathrm{nor}\,V'(0)$ along $\alpha$. If $Z(v) = A(v) + vB(v)$ for all $v$, then $Z(0) = \sigma'(0)$. Furthermore, since $\alpha'(0) = V(0)$,

$$Z'(0) = A'(0) + B(0) = \widetilde{II}(V(0), \sigma'(0)) + \mathrm{nor}\,V'(0).$$

By the second hypothesis on $V$ this is just $V'(0)$.

For $Z$ as in (*), we now define the required variation $x$. Let $\exp$ be the exponential map of the normal bundle of $P$. Then $x(u, v) = \exp(uZ(v))$ defines a variation of $\sigma$. The longitudinal curves of $x$ are geodesics with initial velocity $Z(v)$, hence they are normal to $P$. If $Y$ is the variation vector field of $x$, then $Y(0) = \alpha'(0) = V(0)$. By construction, $x_u(0, v) = Z(v)$; hence

$$Y'(0) = x_{vu}(0, 0) = x_{uv}(0, 0) = Z'(0) = V'(0).$$

Thus Lemma 8.5 gives $Y = V$.


This generalizes the endpoint case, for when the endmanifold $P$ is a single point $p$, a $P$-normal geodesic is a geodesic starting at $p$, and a $P$-Jacobi field on $\sigma$ is a Jacobi field zero at $p$. Thus it is clear how to generalize the notion of conjugate point.

29. Definition. Let $\sigma$ be a geodesic of $M$ that is normal to $P \subset M$, that is, $\sigma(0) \in P$, $\sigma'(0) \perp P$. Then $\sigma(r)$, $r \ne 0$, is a focal point of $P$ along $\sigma$ provided there is a nonzero $P$-Jacobi field $J$ on $\sigma$ with $J(r) = 0$.

The focal order of $\sigma(r)$ is the dimension of the space of $P$-Jacobi fields on $\sigma$ that vanish at $r$. This order is at most $n - 1$, where $n = \dim M$. In fact, since $u \to u\sigma'(u)$ is a $P$-Jacobi field that vanishes only at $u = 0$, it suffices to show that the space $\mathcal{J}$ of all $P$-Jacobi fields on $\sigma$ has dimension $n$. If $x \in T_{\sigma(0)}(P)$ and $z \in T_{\sigma(0)}(P)^\perp$, let $V$ be the unique Jacobi field on $\sigma$ such that

$$V(0) = x, \qquad V'(0) = \widetilde{II}(x, \sigma'(0)) + z.$$

Then $V$ is a $P$-Jacobi field, and the map $x + z \to V$ is a linear isomorphism from $T_{\sigma(0)}M$ to $\mathcal{J}$.

By Lemma 8.7, a $P$-Jacobi field vanishing at $r \ne 0$ is everywhere perpendicular to $\sigma$ since it is perpendicular to $\sigma$ at both 0 and $r$.

The geometrical meaning of focal points is the obvious extension of that of conjugate points; namely, $\sigma(b)$ is a focal point of $P$ along a normal geodesic $\sigma$ if there is a family of normal geodesics with initial velocities near $\sigma'(0)$ that almost meet at $\sigma(b)$. Formally

30. Proposition. Let $\sigma: [0, b] \to M$ be a geodesic normal to $P$. Then the following are equivalent:

(1) $\sigma(b)$ is a focal point of $P$ along $\sigma$.
(2) There is a nontrivial variation $x$ of $\sigma$ through $P$-normal geodesics for which $x_v(b, 0) = 0$.
(3) The normal exponential map $\exp: NP \to M$ is singular at $b\sigma'(0)$.

Proof. The equivalence of (1) and (2) follows immediately from Proposition 28. By a reparametrization we can suppose $b = 1$.

(3) $\Rightarrow$ (2). Let $x$ be a nonzero tangent vector to $NP$ at $\sigma'(0)$ for which $d\exp(x) = 0$, and let $p = \sigma(0) \in P$. Suppose first that $x$ is tangent to the fiber $T_p(P)^\perp$ of $NP$. Since this fiber is a subspace of $T_p(M)$, Proposition 10 applies, showing that $\sigma(1)$ is a conjugate point of $p$ along $\sigma$, and hence a focal point of $P$ along $\sigma$. If $x$ is not tangent to the fiber, then $d\pi(x) \ne 0$, where $\pi$ is the projection of $NP$ onto $P$. Let $Z$ be any curve in $NP$ with initial velocity $x$. Then $x(u, v) = \exp(uZ(v))$ is the required variation, being nontrivial since $x_v(0, 0) = d\pi(x) \ne 0$.

(2) $\Rightarrow$ (3). Let $\tilde{x}(u, v) = u\,x_u(0, v)$. Then $\exp(\tilde{x}(u, v)) = x(u, v)$, hence $d\exp(\tilde{x}_v(1, 0)) = x_v(1, 0) = 0$. Now $\tilde{x}_v(1, 0)$ is the initial velocity of the curve $v \to x_u(0, v)$ in $NP$, which projects to $v \to x(0, v)$ in $P$. If $x_v(0, 0) \ne 0$, then $\tilde{x}_v(1, 0) \ne 0$, hence $\exp$ is singular at $\sigma'(0)$. But if $x_v(0, 0) = 0$, then $\sigma(1)$ is a conjugate point of $p = \sigma(0)$, and reversing the argument above, Proposition 10 implies again that $\exp$ is singular at $\sigma'(0)$.

31. Example. Let $P$ be the hyperboloid $x^2 + y^2 = 1 + z^2$ in $R^3$ (Figure 2). Consider the $x$ axis as a normal geodesic starting at $p = (1, 0, 0)$. The origin $0$ and $q = (2, 0, 0)$ are focal points of $P$ along this geodesic. In fact the normal geodesics along the circle $z = 0$ actually meet at $0$, while the normals near $p$ along the hyperbola $y = 0$ almost meet at $q$.

Figure 2

For a submanifold $P$ of semi-Euclidean space, the fact that normal lines near a given normal line $\sigma$ focus at a point of $\sigma$ depends only on the shape of $P$ at the foot $\sigma(0)$ of $\sigma$. In fact,

32. Proposition. Let $\sigma$ be a $P$-normal geodesic in a flat manifold $M$. The focal points of $P$ along $\sigma$ are $\sigma(1/k_i)$, where $k_1, \ldots, k_r$ are the distinct nonzero eigenvalues of the shape operator $S_{\sigma'(0)}$ of $P$.

Proof. Abbreviate $S_{\sigma'(0)}$ to $S$, and recall from Remarks 4.39 that $S(v) = -\widetilde{II}(v, \sigma'(0))$ for all $v \in T_{\sigma(0)}(P)$. First suppose that $S(e) = ke$, where both $k$ and $e$ are nonzero. Let $E$ be the vector field on $\sigma$ obtained by parallel translation of $e$. It will suffice to show that $V(u) = (1 - ku)E$ is a $P$-Jacobi field on $\sigma$, since $V(1/k) = 0$. Evidently $V$ satisfies the flat Jacobi equation $V'' = 0$. By construction, $V(0)$ is tangent to $P$. Finally, since $V' = -kE$, we have

$$\tan V'(0) = V'(0) = -ke = -S(e) = \widetilde{II}(e, \sigma'(0)) = \widetilde{II}(V(0), \sigma'(0)).$$


Conversely, let $J$ be a $P$-Jacobi field on $\sigma$ for which $J(b) = 0$, $b \ne 0$. Then

$$\tan J'(0) = \widetilde{II}(J(0), \sigma'(0)) = -S(J(0)).$$

Since $J'' = 0$, we can write $J(u) = A(u) + uB(u)$, where $A$ and $B$ are gotten by parallel translation of $J(0)$ and $J'(0)$, respectively. Since $J(b) = 0$, it follows that $J(0) + bJ'(0) = 0$. Because $J$ is a $P$-Jacobi field, $J(0)$ is tangent to $P$, and hence $J'(0) = -(1/b)J(0)$ is also. Furthermore,

$$S(J(0)) = -\tan J'(0) = -J'(0) = (1/b)J(0).$$
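Proposition 32 can be tested against Example 31 numerically. The sketch below is an illustration under stated assumptions, not the book's construction: it parametrizes the hyperboloid by $(a, t) \mapsto (\cosh a \cos t, \cosh a \sin t, \sinh a)$, approximates derivatives by central differences, forms the shape operator $S = -dN$ for the unit normal $N$ (valid because the ambient space is flat Euclidean $R^3$), and returns the points $p + N/k$ for the nonzero eigenvalues $k$. All function names and the step size `h` are hypothetical choices made for the example.

```python
import math

def point(a, t):
    """Parametrization of the hyperboloid x^2 + y^2 = 1 + z^2."""
    return (math.cosh(a) * math.cos(t), math.cosh(a) * math.sin(t), math.sinh(a))

def unit_normal(a, t):
    """Normalized gradient of x^2 + y^2 - z^2 at point(a, t)."""
    x, y, z = point(a, t)
    length = math.sqrt(x * x + y * y + z * z)
    return (x / length, y / length, -z / length)

def partial(F, a, t, da, dt, h=1e-5):
    """Central difference of F in the parameter direction (da, dt)."""
    plus, minus = F(a + da * h, t + dt * h), F(a - da * h, t - dt * h)
    return [(pc - mc) / (2 * h) for pc, mc in zip(plus, minus)]

def dot(v, w):
    return sum(vi * wi for vi, wi in zip(v, w))

def focal_points(a=0.0, t=0.0):
    """Focal points of the hyperboloid along the normal line at point(a, t)."""
    Xa, Xt = partial(point, a, t, 1, 0), partial(point, a, t, 0, 1)
    Na, Nt = partial(unit_normal, a, t, 1, 0), partial(unit_normal, a, t, 0, 1)
    # First fundamental form G and the matrix B_ij = <X_i, dN(X_j)>.
    g11, g12, g22 = dot(Xa, Xa), dot(Xa, Xt), dot(Xt, Xt)
    b11, b12, b21, b22 = dot(Xa, Na), dot(Xa, Nt), dot(Xt, Na), dot(Xt, Nt)
    det_g = g11 * g22 - g12 * g12
    # Shape operator S = -G^{-1} B in the basis (X_a, X_t).
    s11 = -(g22 * b11 - g12 * b21) / det_g
    s12 = -(g22 * b12 - g12 * b22) / det_g
    s21 = -(-g12 * b11 + g11 * b21) / det_g
    s22 = -(-g12 * b12 + g11 * b22) / det_g
    trace, det_s = s11 + s22, s11 * s22 - s12 * s21
    root = math.sqrt(max(trace * trace / 4 - det_s, 0.0))
    p, normal = point(a, t), unit_normal(a, t)
    eigenvalues = (trace / 2 - root, trace / 2 + root)
    return [tuple(pc + nc / k for pc, nc in zip(p, normal))
            for k in eigenvalues if abs(k) > 1e-9]

if __name__ == "__main__":
    for fp in focal_points():
        print(fp)
```

At $p = (1, 0, 0)$ the two eigenvalues come out near $-1$ and $1$, so the computed focal points lie near the origin and near $(2, 0, 0)$, matching Example 31.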

It is easy to check that the focal order of $\sigma(1/k_i)$ above is the multiplicity of the eigenvalue $k_i$. In the preceding example, if $P$ is kept the same but $R^3$ is changed to Minkowski space $R^3_1$, then the normals to $P$ change: all now go through the origin, which is the only focal point. (This is the idea of the proof of Proposition 4.36.)

A theorem of Sard asserts that, if $C \subset N$ is the set of singular points of a smooth map $\phi: N \to M$, then $\phi(C)$ has measure zero in $M$. Hence, by Proposition 30, the set of focal points of any $P \subset M$ has measure zero in $M$. By Exercise 8, along any $P$-normal geodesic the set of focal points is discrete.

33. Lemma. For a nonnull normal geodesic $\sigma \in \Omega(P, q)$ the nullspace of the index form $I_\sigma^\perp$ is the space $\mathcal{J}_b$ of $P$-Jacobi fields on $\sigma$ that vanish at $q = \sigma(b)$.

Proof. If $V, W \in T_\sigma^\perp(\Omega)$, then applying integration by parts to the formula in Corollary 27 gives the formula in Corollary 8 but with an additional term

$$-\frac{\varepsilon}{c}\langle V'(0) - \widetilde{II}(V(0), \sigma'(0)),\, W(0)\rangle.$$

It follows immediately that if $V \in \mathcal{J}_b$, then $I_\sigma^\perp(V, W) = 0$ for all $W$. Conversely, if $V$ is in the nullspace of $I_\sigma^\perp$, a standard argument shows $V'' - R_{V\sigma'}\sigma' = 0$, and $V'(0) - \widetilde{II}(V(0), \sigma'(0))$ is normal to $P$. Thus $V$ is a $P$-Jacobi field zero at $b$; that is, $V \in \mathcal{J}_b$.

For cospacelike geodesics the index form characterizes focal points in the same way as in the special case of conjugate points.

34. Theorem. Let $\sigma \in \Omega(P, q)$ be a cospacelike normal geodesic of sign $\varepsilon$. Then

(1) If there are no focal points of $P$ along $\sigma$, then $I_\sigma^\perp$ is definite (positive if $\varepsilon = 1$, negative if $\varepsilon = -1$).
(2) If $q = \sigma(b)$ is the only focal point of $P$ along $\sigma$, then $I_\sigma^\perp$ is semidefinite, but not definite.
(3) If there is a focal point $\sigma(s)$, $0 < s < b$, along $\sigma$, then $I_\sigma$ is not semidefinite.

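Proof. We proceed as for Theorem 17, with minor adjustments to deal with initial conditions. To prove (1), for example, requires

If $V$ and $W$ are $P$-Jacobi fields on $\sigma$, then $\langle V', W\rangle = \langle V, W'\rangle$.

This will follow from Lemma 15 if $\langle V'(0), W(0)\rangle = \langle V(0), W'(0)\rangle$. But since $W(0)$ is tangent to $P$,

$$\langle V'(0), W(0)\rangle = \langle\tan V'(0), W(0)\rangle = \langle\widetilde{II}(V(0), \sigma'(0)), W(0)\rangle = -\langle II(V(0), W(0)), \sigma'(0)\rangle,$$

which is symmetric in $V$ and $W$. As in the proof of Theorem 17, Lemma 16 then gives

$$\langle V', V'\rangle - \langle R_{V\sigma'}V, \sigma'\rangle = \langle A, A\rangle + \langle V, B\rangle'.$$

By Corollary 27 (assuming $c = 1$),

$$\varepsilon I_\sigma(V, V) = \int_0^b \langle A, A\rangle\,du + \langle V, B\rangle\Big|_0^b - \langle\sigma'(0), II(V(0), V(0))\rangle.$$

The proof is completed as before once we check that the last two terms above cancel. In fact, since $V(b) = 0$ and $Y_i$ is a $P$-Jacobi field,

$$\langle V, B\rangle\Big|_0^b = -\langle V(0), B(0)\rangle = -\Big\langle V(0), \sum f_i(0)Y_i'(0)\Big\rangle = -\sum f_i(0)\langle V(0), \tan Y_i'(0)\rangle$$
$$= -\sum f_i(0)\langle V(0), \widetilde{II}(Y_i(0), \sigma'(0))\rangle = -\langle V(0), \widetilde{II}(V(0), \sigma'(0))\rangle = \langle II(V(0), V(0)), \sigma'(0)\rangle.$$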

APPLICATIONS

A focal point occurs when a $P$-Jacobi field has a zero; thus to prove that focal points exist, we expect two kinds of assumptions:

(1) on the initial conditions for $P$-Jacobi fields, that is, on the shape of $P$ (e.g., $P$ is "initially focusing"); and
(2) on the Jacobi equation, that is, on the curvature of $M$ (e.g., tidal forces that cause geodesic convergence).

Here is a basic result of this type.


35. Proposition. Let $P$ be a spacelike submanifold of a Riemannian or Lorentz manifold, and let $\sigma$ be a $P$-normal nonnull geodesic. Suppose

(1) $\langle\sigma'(0), II(y, y)\rangle = k > 0$ for some unit vector $y \in T_{\sigma(0)}P$;
(2) $\langle R_{v\sigma'}v, \sigma'\rangle \ge 0$ for all tangent vectors $v \perp \sigma$.

Then there is a focal point $\sigma(r)$ of $P$ along $\sigma$ with $0 < r \le 1/k$ provided $\sigma$ is defined on this interval. (Note that if $k < 0$, then reversing the orientation of $\sigma$ would change this sign but not that in (2), hence $\sigma$ has a focal point with $1/k \le r < 0$.)

Proof. Assume $|\sigma'| = 1$, so $1/k$ is distance along $\sigma$. Since $\sigma$ is cospacelike, it suffices by Theorem 34 to show that the index form $I_\sigma^\perp$ of $\sigma$ on $[0, 1/k]$ is indefinite. In view of Lemma 13, if $\varepsilon$ is the sign of $\sigma$, we want a nonzero $V \in T_\sigma^\perp(\Omega)$ with $\varepsilon I_\sigma(V, V) \le 0$. Motivated by the proof of Proposition 32, define $V(u) = (1 - ku)Y(u)$, where $Y(u)$ is the parallel translate of $y$. Then $V \in T_\sigma^\perp(\Omega)$. Since $V' = -kY$ and $|Y| = 1$,

$$\varepsilon I_\sigma(V, V) = \int_0^{1/k} \{k^2 - \langle R_{V\sigma'}V, \sigma'\rangle\}\,du - \langle II(y, y), \sigma'(0)\rangle.$$

But $\int_0^{1/k} k^2\,du = k$ cancels $-\langle II(y, y), \sigma'(0)\rangle = -k$, leaving $-\int_0^{1/k}\langle R_{V\sigma'}V, \sigma'\rangle\,du$, which by (2) is $\le 0$.
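The cancellation in this proof is easy to verify in a model case. The sketch below is illustrative only and rests on assumptions not made in the text: the ambient manifold is taken to be Riemannian of constant curvature $C \ge 0$, so that $\langle R_{V\sigma'}V, \sigma'\rangle = C(1 - ku)^2$ for the test field $V = (1 - ku)Y$. The helper names `simpson` and `index_form_test_field` are hypothetical.

```python
def simpson(g, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def index_form_test_field(k, C):
    """eps * I_sigma(V, V) for V = (1 - k u) Y on [0, 1/k], assuming unit
    speed, <sigma'(0), II(y, y)> = k > 0, and constant curvature C."""
    integrand = lambda u: k ** 2 - C * (1 - k * u) ** 2
    # The boundary term -<II(y, y), sigma'(0)> contributes -k.
    return simpson(integrand, 0.0, 1.0 / k) - k

if __name__ == "__main__":
    print(index_form_test_field(2.0, 0.0))  # flat case: the terms cancel
    print(index_form_test_field(2.0, 3.0))  # positive curvature: negative value
```

In this model the value equals $-C/(3k)$, which is zero in the flat case and negative when $C > 0$, consistent with the conclusion that $\varepsilon I_\sigma(V, V) \le 0$.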

The number $k$ in the proposition can also be expressed as $\langle S_{\sigma'(0)}y, y\rangle$, where $S$ is the shape operator of $P \subset M$. This proposition, and also Proposition 32, suggest that $k$ is an initial rate of convergence: Move $\sigma'(0)$ infinitesimally in the $y$ direction to normal vectors $z$. If $k > 0$, the geodesics $\gamma_z$ are initially converging toward $\sigma$; if $k < 0$, they are initially diverging; if $k = 0$, they are initially parallel. This can be seen directly, as follows. Let $Y$ be any $P$-Jacobi field on $\sigma$ such that $Y(0) = y$. Proposition 28 shows that $Y$ is an infinitesimal model for a family of normal geodesics $\gamma_z$ as above. Thus $|Y|$ measures the distance of neighboring geodesics from $\sigma$. But $|Y|'(0) = -k$, since

$$|Y|'(0) = \langle Y(0), Y'(0)\rangle = \langle y, \tan Y'(0)\rangle = \langle y, \widetilde{II}(y, \sigma'(0))\rangle = -\langle\sigma'(0), II(y, y)\rangle = -k.$$

For a spacelike hypersurface there is an averaged version of the preceding result with sectional curvature replaced by Ricci curvature, as in Myers' theorem, and normal curvature $II(y, y)$ replaced by mean normal curvature $H$ (see page 101).

36. Definition. Let $P$ be a semi-Riemannian submanifold of $M$ with mean curvature vector field $H$. The convergence of $P$ is the real-valued function $k$ on the normal bundle $NP$ such that

$$k(z) = \langle z, H_p\rangle = (1/\dim P)\ \mathrm{trace}\ S_z \quad\text{for } z \in T_p(P)^\perp.$$

For a spacelike hypersurface in $M^n$,

$$k(z) = \frac{1}{n - 1}\sum_{i=1}^{n-1}\langle z, II(e_i, e_i)\rangle,$$

where $e_1, \ldots, e_{n-1}$ is any orthonormal basis for $T_p(P)$.

37. Proposition. Let $P$ be a spacelike hypersurface in a (necessarily) Riemannian or Lorentz manifold $M$, and let $\sigma$ be a geodesic normal to $P$ at $p = \sigma(0)$. Suppose

(1) $k(\sigma'(0)) = \langle\sigma'(0), H_p\rangle > 0$.
(2) $\mathrm{Ric}(\sigma', \sigma') \ge 0$.

Then there is a focal point $\sigma(r)$ of $P$ along $\sigma$ with $0 < r \le 1/k(\sigma'(0))$, provided $\sigma$ is defined on this interval.

Proof. As in the preceding proof suppose $|\sigma'| = 1$ and let $k = k(\sigma'(0))$. If $e_1, \ldots, e_{n-1}$ is an orthonormal basis for $T_p(P)$, parallel translate along $\sigma$ to obtain $E_1, \ldots, E_{n-1}$. Define $f(u) = 1 - ku$ on $[0, 1/k]$; then $fE_i \in T_\sigma^\perp(\Omega)$ and

$$\varepsilon I_\sigma(fE_i, fE_i) = k - \int_0^{1/k} f^2\langle R_{E_i\sigma'}E_i, \sigma'\rangle\,du - \langle\sigma'(0), II(e_i, e_i)\rangle.$$

Since the $E_i$'s are spacelike, adding the expressions above gives

$$(n - 1)k - \int_0^{1/k} f^2\,\mathrm{Ric}(\sigma', \sigma')\,du - \langle\sigma'(0), (n - 1)H_p\rangle.$$

Here the first and last terms cancel, and since $\mathrm{Ric}(\sigma', \sigma') \ge 0$ it follows that $\varepsilon I_\sigma(fE_i, fE_i) \le 0$ for at least one $i$.

VARIATION OF E

For a curve segment $\sigma: [0, b] \to M$ in a semi-Riemannian manifold the integral

$$E(\sigma) = \frac{1}{2}\int_0^b \langle\sigma', \sigma'\rangle\,du$$

(sometimes called energy or action) lacks the direct geometric significance of arc length, but its variational theory is computationally simpler and, in the following sense, more general. For a piecewise smooth variation $x$ of $\sigma$ let $E_x(v)$ be the value of $E$ on the longitudinal curve $u \to x(u, v)$, so

$$E_x(v) = \frac{1}{2}\int_0^b \langle x_u, x_u\rangle\,du.$$

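By contrast with $L_x$, the function $E_x$ is always smooth without restriction on $x$. Thus in particular $E$ can be used to study null geodesics. Formulas for the first and second variations of $E$ are simpler analogues of those for $L$.

38. Lemma. Let $x$ be a variation of a curve segment $\sigma$, with $V$ and $A$ the variation and transverse acceleration vector fields of $x$. If $f = f(u, v) = \langle x_u, x_u\rangle$, then

$$\frac{1}{2}\frac{\partial f}{\partial v}(u, 0) = \langle V', \sigma'\rangle = -\langle V, \sigma''\rangle + \frac{d}{du}\langle V, \sigma'\rangle,$$

$$\frac{1}{2}\frac{\partial^2 f}{\partial v^2}(u, 0) = \langle V', V'\rangle - \langle R_{V\sigma'}V, \sigma'\rangle + \langle A', \sigma'\rangle = \langle V', V'\rangle - \langle R_{V\sigma'}V, \sigma'\rangle - \langle A, \sigma''\rangle + \frac{d}{du}\langle A, \sigma'\rangle.$$

Proof. We readily compute

$$\frac{1}{2}\frac{\partial f}{\partial v} = \langle x_{uv}, x_u\rangle = \langle x_{vu}, x_u\rangle,$$

$$\frac{1}{2}\frac{\partial^2 f}{\partial v^2} = \langle x_{vu}, x_{vu}\rangle + \langle x_{vuv}, x_u\rangle = \langle x_{vu}, x_{vu}\rangle - \langle R(x_v, x_u)x_v, x_u\rangle + \langle x_{vvu}, x_u\rangle.$$

Upon evaluation at $v = 0$ the first formula gives $\langle V', \sigma'\rangle$ and the product rule for the derivative of $\langle V, \sigma'\rangle$ gives the alternate version. Computation of the second derivative formulas is similar.

Integration of the formulas in the lemma then gives first and second variation formulas: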

39. Proposition. Let $x$ be a variation of $\alpha\colon [0, b] \to M$, with $V$ and $A$ the variation and transverse acceleration vector fields. Then

$$E_x'(0) = -\int_0^b \langle V, \alpha''\rangle\, du - \sum_{i=1}^{k} \langle V, \Delta\alpha'\rangle(u_i) + \langle V, \alpha'\rangle \Big|_0^b,$$

where $u_1 < \dots < u_k$ are the breaks of $x$ and $\alpha$. Furthermore, if $\alpha$ is a geodesic, then

$$E_x''(0) = \int_0^b \{\langle V', V'\rangle - \langle R_{V\alpha'}V, \alpha'\rangle\}\, du + \langle A, \alpha'\rangle \Big|_0^b.$$

As before, if $P$ is a semi-Riemannian submanifold of $M$ and $q \in M$, then $E$ becomes a real-valued function on $\Omega(P, q)$. Using the first variation formula above it is easy to check that the critical points of $E$ are exactly the normal geodesics from $P$ to $q$. If $\sigma$ is such a geodesic, then strictly analogous to the index form $I_\sigma$ for $L$ is the Hessian $H_\sigma$ for $E$. Explicitly, $H_\sigma$ is the unique $\mathbf{R}$-bilinear form on $T_\sigma(\Omega)$ such that $H_\sigma(V, V) = E_x''(0)$, where $x$ is any variation of $\sigma$ whose longitudinal curves are in $\Omega(P, q)$ and whose variation vector field is $V$. By the second variation formula above it follows as in Corollary 27 that

$$H_\sigma(V, W) = \int_0^b \{\langle V', W'\rangle - \langle R_{V\sigma'}W, \sigma'\rangle\}\, du - \langle \sigma'(0), II(V(0), W(0))\rangle,$$

where $II$ is the shape tensor of $P$.

FOCAL POINTS ALONG NULL GEODESICS

We consider some variational properties of null geodesics in Lorentz manifolds.

40. Corollary. Let $\sigma$ be a null geodesic normal to a submanifold $P$ of a Lorentz manifold. A $P$-Jacobi field $V$ on $\sigma$ is the vector field of a variation of $\sigma$ through null geodesics normal to $P$ if and only if $V \perp \sigma$.

Proof. Suppose $x$ is such a variation. Then $f = \langle x_u, x_u\rangle$ is identically zero, hence Lemma 38 implies $\langle V', \sigma'\rangle = 0$. In particular, $V'(0) \perp \sigma$. But $V(0)$ is tangent to $P$ hence $\perp \sigma$. Thus $V \perp \sigma$ by Lemma 8.7.

For the converse, the proof of Proposition 28 will work provided we can arrange for the vector field $Z$ in assertion (*) to be null. Since $V \perp \sigma$, it follows that $\operatorname{nor} V'(0) \perp \sigma$; hence by Proposition 5.28, this vector corresponds canonically to a tangent vector to the nullcone $\Lambda \subset T_p(P)^\perp$ at the point $\sigma'(0)$ of $\Lambda$. Choose a curve $\lambda$ in $\Lambda$ with $\lambda(0) = \sigma'(0)$ and $\lambda'(0) \approx \operatorname{nor} V'(0)$. Then on the curve $\alpha$ (see previous proof), let $Z(v)$ be the normal-parallel translate of the vector $\lambda(v)$ along $\alpha$ to $\alpha(v)$. Thus $Z$ is a null vector field on $\alpha$ normal to $P$, and $Z(0) = \sigma'(0)$. Using a normal-parallel frame field on $\alpha$, one can check that $\operatorname{nor} Z'(0) = \operatorname{nor} V'(0)$, hence $Z'(0) = V'(0)$.


A focal point of a submanifold $P$ along a normal geodesic $\sigma$ is an almost-meeting point of nearby $P$-normal geodesics of the same causal character as $\sigma$. If $\sigma$ is nonnull, this is obvious by continuity (in view of Proposition 30); for $\sigma$ null, it is immediate from the preceding result, since a $P$-Jacobi field vanishing at $b \ne 0$ is perpendicular to $\sigma$.

If a semi-Riemannian submanifold $P$ admits a normal null geodesic, then $\dim P \le n - 2$. Also focal orders along a normal null geodesic are at most $n - 2$. The idea is that restricting the almost-meeting geodesics to be null reduces the usual maximum $n - 1$ by 1. Formally, the space $\mathcal{J}_b$ of $P$-Jacobi fields on $\sigma$ vanishing at $b \ne 0$ is a subspace of the $(n-1)$-dimensional space $\mathcal{J}^\perp$ of $P$-Jacobi fields perpendicular to $\sigma$. The tangential vector field $u \to u\sigma'(u)$ is never in $\mathcal{J}_b$, but since $\sigma$ is null it is now in $\mathcal{J}^\perp$. In particular, in a Lorentz surface there are no focal or conjugate points along null geodesics.

Our goal now is to prove an analogue of Theorem 34 for null geodesics. As before, geometric significance is gained by restricting $H_\sigma$ to the subspace $T_\sigma^\perp(\Omega) = \{V \in T_\sigma(\Omega) : V \perp \sigma\}$.

41. Proposition. Let $P$ be a spacelike submanifold of a Lorentz manifold. If there are no focal points of $P$ along a normal null geodesic $\sigma \in \Omega(P, q)$, then $H_\sigma^\perp$ is positive semidefinite. Furthermore, if $H_\sigma^\perp(V, V) = 0$, then $V$ is tangent to $\sigma$.

Proof. Let $Y_1, \dots, Y_{n-1}$ be a basis for the space of $P$-Jacobi fields on $\sigma$ perpendicular to $\sigma$. We can suppose $Y_1(u) = u\sigma'(u)$. Since there are no focal points on $\sigma$, if $V \in T_\sigma^\perp(\Omega)$, we can write $V = \sum f_i Y_i$. Lemma 16 gives

$$\langle V', V'\rangle - \langle R_{V\sigma'}V, \sigma'\rangle = \langle A, A\rangle + \langle V, B\rangle'.$$

As in the proof of Theorem 34, $H_\sigma^\perp(V, V)$ then reduces to $\int_0^b \langle A, A\rangle\, du$. Since $A = \sum f_i' Y_i$ is orthogonal to the null vector $\sigma'$, Lemma 5.28 asserts that $\langle A, A\rangle \ge 0$ and furthermore that $\langle A(u), A(u)\rangle = 0$ if and only if $A(u)$ and $\sigma'(u)$ are collinear. The inequality implies $H_\sigma^\perp(V, V) \ge 0$, and it follows that if $H_\sigma^\perp(V, V) = 0$, then $A = \sum f_i' Y_i$ is everywhere tangent to $\sigma$. Since $Y_1$ is the only basis vector field ever tangent to $\sigma$, we must have $f_i' = 0$ for $i > 1$. But $V(b) = 0$, hence $f_i(b) = 0$, hence $f_i = 0$ for $i > 1$. Thus $V = f_1 Y_1$, which is tangent to $\sigma$.

42. Example. Null Focal Points. In $M = \mathbf{R}^n_1$, $n \ge 3$, let $P$ be the sphere $S^{n-2}(a)$ in $\mathbf{R}^{n-1}$, considered as the hyperplane $u_0 = 0$. Through each point of $P$ run exactly two null normal lines (see Figure 3). In the future directions, the inward-pointing null normals all meet at $(a, 0, \dots, 0)$, which is evidently a focal point of order $n - 2$ along every such normal.


Figure 3

Symmetrically located, $(-a, 0, \dots, 0)$ is the only other null focal point.

Proposition 37 has the following analogue for null geodesics.

43. Proposition. Let $P$ be a spacelike $(n-2)$-dimensional submanifold of a Lorentz manifold $M$, with $H$ the mean normal curvature vector field of $P$. Let $\sigma$ be a null geodesic normal to $P$ at $p = \sigma(0)$ such that

(1) $k(\sigma'(0)) = \langle \sigma'(0), H_p\rangle > 0$;
(2) $\operatorname{Ric}(\sigma', \sigma') \ge 0$.

Then there is a focal point $\sigma(r)$ of $P$ along $\sigma$ with $0 < r \le 1/k$, where $k = k(\sigma'(0))$, provided $\sigma$ is defined on this interval.

Proof. By (1), $H_p \ne 0$, hence $n = \dim M \ge 3$. Let $e_3, \dots, e_n$ be an orthonormal basis for $T_p(P)$. Parallel translate these vectors to obtain $E_3, \dots, E_n$, and as usual let $f(u) = 1 - ku$ on $[0, 1/k]$. Then $fE_i \in T_\sigma^\perp(\Omega)$, and

$$H_\sigma^\perp(fE_i, fE_i) = \int_0^{1/k} \{f'^2 - f^2 \langle R_{E_i\sigma'}E_i, \sigma'\rangle\}\, du - \langle \sigma'(0), II(e_i, e_i)\rangle = k - \int_0^{1/k} f^2 \langle R_{E_i\sigma'}E_i, \sigma'\rangle\, du - \langle \sigma'(0), II(e_i, e_i)\rangle.$$

$E_3, \dots, E_n$ and $\sigma'$ cover only $n - 1$ dimensions of $M$, but since $\sigma$ is null, the missing dimension is not involved in the relevant Ricci curvature. In fact, the proof of Lemma 8.9 shows that

$$\operatorname{Ric}(\sigma', \sigma') = \sum_{i=3}^{n} \langle R_{E_i\sigma'}E_i, \sigma'\rangle.$$


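Thus adding the formulas above for $i = 3, \dots, n$ gives

$$\sum_{i=3}^{n} H_\sigma^\perp(fE_i, fE_i) = (n-2)k - \int_0^{1/k} f^2 \operatorname{Ric}(\sigma', \sigma')\, du - \langle \sigma'(0), (n-2)H_p\rangle = -\int_0^{1/k} f^2 \operatorname{Ric}(\sigma', \sigma')\, du \le 0.$$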

None of the vector fields $fE_i$ is tangent to $\sigma$, hence it follows from Proposition 41 that there is a focal point of $P$ on $\sigma$.

Here $k = k(\sigma'(0))$ represents the average initial rate of convergence of $P$-normal null geodesics near $\sigma$. If $k < 0$ in the proposition, then as before there is a focal point $\sigma(r)$ with $1/k \le r < 0$.

44. Example. In Example 42, the sphere $P$ is spacelike and $(n-2)$-dimensional. It is totally umbilic in $\mathbf{R}^{n-1}$ with mean normal curvature $H = -U/a$, where $U$ is the outward unit normal. Since $\mathbf{R}^{n-1}$ is totally geodesic in $M = \mathbf{R}^n_1$, $H$ is also the mean normal curvature for $P$ in $M$. The two families of normal null geodesics through $P$ can conveniently be parametrized to have future-pointing initial velocities $\pm U + \partial_0$ (see Figure 3). Thus for the inward family, $k = \langle -U + \partial_0, H\rangle = 1/a$, independent of $p \in P$. Then Proposition 43 predicts a focal point with $0 < r \le 1/k = a$. In fact they occur at $r = a$, that is, at

$$p + a(-p/a + (1, 0, \dots, 0)) = (a, 0, \dots, 0).$$
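The inward-family computation can be checked numerically. The following is a minimal sketch, not part of the original text: the function names, the choice $n = 4$, the radius, and the random sampling of a point of $P$ are all illustrative assumptions. It verifies that the vector $-U + \partial_0$ is null, that its convergence is $1/a$, and that the point reached at parameter $r = a$ is $(a, 0, \dots, 0)$.

```python
import math
import random

def minkowski(v, w):
    """Scalar product on R^n_1 with the first coordinate timelike."""
    return -v[0] * w[0] + sum(x * y for x, y in zip(v[1:], w[1:]))

def inward_null_normal(p, a):
    """Future-pointing inward null normal -U + d_0 at a point p of the sphere.

    p = (0, p_1, ..., p_{n-1}) lies on the sphere of radius a in the
    hyperplane u_0 = 0, and U = p / a is the outward unit normal there.
    """
    return [1.0] + [-c / a for c in p[1:]]

def focal_point(p, a):
    """Point reached at parameter r = a along the inward null normal from p."""
    z = inward_null_normal(p, a)
    return [pc + a * zc for pc, zc in zip(p, z)]

def random_sphere_point(n, a, rng):
    """Random point of S^{n-2}(a) inside the hyperplane u_0 = 0 of R^n_1."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n - 1)]
    norm = math.sqrt(sum(c * c for c in v))
    return [0.0] + [a * c / norm for c in v]

rng = random.Random(0)          # illustrative seed
a, n = 2.5, 4                   # illustrative radius and dimension
p = random_sphere_point(n, a, rng)
z = inward_null_normal(p, a)
H = [0.0] + [-c / a**2 for c in p[1:]]   # H = -U/a with U = p/a

print(minkowski(z, z))      # close to 0: the normal is null
print(minkowski(z, H))      # close to 1/a: the convergence k
print(focal_point(p, a))    # close to (a, 0, ..., 0)
```

Replacing $-U$ by $+U$ in `inward_null_normal` and $a$ by $-a$ in `focal_point` gives the analogous check for the other family.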

For the outward family, $k = \langle U + \partial_0, H\rangle = -1/a$, and the focal point predicted for $-a = 1/k \le r < 0$ arrives at $r = -a$, that is, at $(-a, 0, \dots, 0)$. Since $\mathbf{R}^n_1$ is flat, Proposition 32 would locate these focal points exactly.

A CAUSALITY THEOREM

A fundamental problem in a Lorentz manifold is to determine which pairs of points can be joined by a timelike curve. This section will establish that there are timelike curves from $p$ to $q$ arbitrarily near every causal curve $\alpha$ from $p$ to $q$, unless $\alpha$ is a null geodesic without conjugate points.

45. Lemma. Let $\alpha$ be a causal curve segment in a Lorentz manifold $M$, and let $x$ be a variation of $\alpha$ with vector field $V$. If $\langle V', \alpha'\rangle < 0$, then for all sufficiently small $v > 0$, the longitudinal curve $\alpha_v$ of $x$ is timelike.

Proof. Because $\alpha$ is causal,

$$\langle x_u, x_u\rangle(u, 0) = \langle \alpha'(u), \alpha'(u)\rangle \le 0 \quad \text{for all } u.$$


But $\alpha$ is defined on a closed interval $[a, b]$, and

$$\frac{\partial}{\partial v}\langle x_u, x_u\rangle \Big|_{v=0} = 2\langle V', \alpha'\rangle < 0.$$

Thus if $v > 0$ is sufficiently small, then $\langle x_u, x_u\rangle(u, v) < 0$ for all $u$; that is, $\alpha_v$ is timelike.

The lemma is valid in the piecewise smooth case, where as usual we understand the hypotheses to hold on unbroken segments. Furthermore it would suffice to assume $\langle V', \alpha'\rangle \le 0$, with strict inequality only at points where $\alpha'$ is a null vector.

Deforming a causal curve in the direction of its acceleration tends to make it timelike. For example, the acceleration of the null helix $\alpha(t) = (t, \cos t, \sin t)$ always points directly toward the $t$ axis; deforming $\alpha$ in that direction makes it "steeper," hence timelike. The same scheme can be adapted to the instantaneous acceleration $\gamma'(u^+) - \gamma'(u^-)$ at a break of a (piecewise smooth) causal curve. These and similar deformations suffice to prove:

46. Proposition. In a Lorentz manifold $M$, if $\alpha$ is a causal curve from $p$ to $q$ that is not a null pregeodesic, then there is a timelike curve from $p$ to $q$ arbitrarily close to $\alpha$.
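As a concrete instance of Lemma 45, the null helix mentioned above can be checked directly. The sketch below is illustrative only: the radial variation $x(t, v) = (t, (1-v)\cos t, (1-v)\sin t)$ and the function names are assumptions chosen here, and the variation is not a fixed endpoint one. Its variation vector field is $V = (0, -\cos t, -\sin t) = \alpha''$.

```python
import math

def lorentz(v, w):
    """Scalar product on R^3_1: -dt^2 + dx^2 + dy^2."""
    return -v[0] * w[0] + v[1] * w[1] + v[2] * w[2]

def helix_velocity(t, v):
    """Velocity of the longitudinal curve
    alpha_v(t) = (t, (1 - v) cos t, (1 - v) sin t).

    For v = 0 this is the null helix; v > 0 shrinks the radius, which moves
    the curve along its acceleration alpha'' = (0, -cos t, -sin t).
    """
    return (1.0, -(1.0 - v) * math.sin(t), (1.0 - v) * math.cos(t))

def variation_field_derivative(t):
    """V' for the variation field V = alpha'' = (0, -cos t, -sin t)."""
    return (0.0, math.sin(t), -math.cos(t))

ts = [0.1 * i for i in range(63)]

# Hypothesis of Lemma 45: <V', alpha'> < 0 along the helix.
print(max(lorentz(variation_field_derivative(t), helix_velocity(t, 0.0)) for t in ts))

# <alpha_v', alpha_v'> = -1 + (1 - v)^2: zero at v = 0, negative for small v > 0.
for v in (0.0, 0.01, 0.1):
    print(v, max(lorentz(helix_velocity(t, v), helix_velocity(t, v)) for t in ts))
```

Here $\langle V', \alpha'\rangle = -1$ at every $t$, so the hypothesis of the lemma holds with room to spare.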

Proof. We can suppose the domain of $\alpha$ is $[0, 1]$. Consider first two special cases.

Case 1. $\alpha'(0)$ or $\alpha'(1)$ is timelike. Assuming the latter, let $W$ be obtained by parallel translation of $\alpha'(1)$ along $\alpha$. Then $W$ and $\alpha'$ are always in the same causal cone, and since $W$ is timelike, $\langle W, \alpha'\rangle < 0$. By continuity there is a $\delta > 0$ such that $\langle \alpha', \alpha'\rangle < -\delta$ on $[1 - \delta, 1]$. Let $f$ be any smooth function on $[0, 1]$ vanishing at endpoints and with $f' > 0$ on $[0, 1 - \delta]$. Set $V = fW$. Then $\langle V', \alpha'\rangle = f'\langle W, \alpha'\rangle$ is negative on $[0, 1 - \delta]$. Let $x$ be a fixed endpoint variation with vector field $V$. By a remark above, for $v > 0$ sufficiently small, the longitudinal curve $\alpha_v$ has become timelike on $[0, 1 - \delta]$ and remained timelike on $[1 - \delta, 1]$.

Case 2. $\alpha$ is a smooth null curve. Differentiation of $\langle \alpha', \alpha'\rangle = 0$ shows that $\alpha'' \perp \alpha'$. Now $\alpha''$ cannot always be collinear with $\alpha'$ or, by Exercise 3.19, $\alpha$ could be reparametrized as a null geodesic. Thus the function $\langle \alpha'', \alpha''\rangle \ge 0$ is not identically zero, since orthogonal null vectors are collinear. Let $W$ be a parallel timelike vector field on $\alpha$ in the same causal cone as $\alpha'$ at each point, so $\langle W, \alpha'\rangle < 0$. Let $V = fW + g\alpha''$, where $f$ and $g$, both vanishing at endpoints, are to be determined so that $\langle V', \alpha'\rangle < 0$. Since $\langle \alpha'', \alpha'\rangle = 0$ implies $\langle \alpha''', \alpha'\rangle + \langle \alpha'', \alpha''\rangle = 0$, we compute

$$\langle V', \alpha'\rangle = f'\langle W, \alpha'\rangle - g\langle \alpha'', \alpha''\rangle.$$


Because $h = \langle \alpha'', \alpha''\rangle / \langle W, \alpha'\rangle$ is not identically zero, there exists a smooth $g$, vanishing at endpoints, such that

$$\int_0^1 gh\, du = -1.$$

Let $f(u) = \int_0^u (gh + 1)\, du$. Then $f$ vanishes at endpoints, and $f' = gh + 1 > gh = g\langle \alpha'', \alpha''\rangle / \langle W, \alpha'\rangle$. Consequently $\langle V', \alpha'\rangle < 0$.

To complete the proof, note that if $\alpha'$ is timelike at a nonendpoint $s$, then Case 1 applies on $[0, s]$ and $[s, 1]$ to give the required result. Thus we are left with the case of a piecewise smooth null curve $\alpha$. Unless every smooth segment of $\alpha$ can be reparametrized as a null geodesic, then by Case 2 some one can be varied slightly to become timelike; hence another small variation as above gives the result.

There remains only the case of a broken null geodesic $\alpha$. It suffices to assume there is a single break, $0 < s < 1$. Let $W$ on $\alpha$ be obtained by parallel translation of $\Delta\alpha'(s) = \alpha'(s^+) - \alpha'(s^-)$. Recall that these two velocities are by definition in the same causal cone, so $\langle W, \alpha'\rangle$ is negative on $[0, s^-]$ and positive on $[s^+, 1]$. Now choose a piecewise smooth function $f$ on $[0, 1]$ that vanishes at endpoints and has derivative $f'$ positive on $[0, s^-]$, negative on $[s^+, 1]$. Then for $V = fW$ we have $\langle V', \alpha'\rangle < 0$.

Now the question is: Can a null geodesic segment $\sigma$ be made timelike by a small fixed endpoint deformation? This should be possible if there is a conjugate point $\sigma(r)$ of $\sigma(0)$ with $0 < r < b$, because, by assuming that some null geodesic $\tau$ from $p = \sigma(0)$ that almost reaches $\sigma(r)$ actually reaches $\sigma(r)$, we can apply the preceding proposition to the broken null geodesic $\tau + \sigma|[r, b]$. However, a formal proof of this result requires some work.

47. Lemma. Let $x$ be a variation of a null geodesic $\sigma$ with variation vector field $V$ perpendicular to $\sigma$ at its endpoints. If there is a sequence $\{v_i\} \to 0$ such that each longitudinal curve $\alpha_{v_i}$ is timelike, then $V$ is everywhere perpendicular to $\sigma$.

Proof. Since no $v_i$ is zero, by passing to a subsequence we can suppose either all $v_i > 0$ or all $v_i < 0$. Since $\sigma$ is null and

$$\frac{\partial}{\partial v}\langle x_u, x_u\rangle \Big|_{v=0} = 2\langle V', \sigma'\rangle,$$

it follows that either $\langle V', \sigma'\rangle \le 0$ or $\langle V', \sigma'\rangle \ge 0$. Since $\sigma$ is a geodesic, $\langle V', \sigma'\rangle = \langle V, \sigma'\rangle'$, hence

$$\int_0^b \langle V', \sigma'\rangle\, du = \langle V, \sigma'\rangle \Big|_0^b = 0.$$

Thus $\langle V', \sigma'\rangle = 0$, so $\langle V, \sigma'\rangle$ is a constant, necessarily zero.


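The first derivative criterion in Lemma 45 fails here, since $V \perp \sigma$ implies

$$\frac{\partial}{\partial v}\langle x_u, x_u\rangle \Big|_{v=0} = 2\langle V', \sigma'\rangle = 0.$$

Thus we look for variations $x$ with

$$\frac{\partial^2}{\partial v^2}\langle x_u, x_u\rangle \Big|_{v=0} < 0.$$

It will be no harder to deal with the endmanifold case.

48. Proposition. Let $P$ be a spacelike submanifold of a Lorentz manifold, and let $\sigma \in \Omega(P, q)$ be a normal null geodesic. If there is a focal point of $P$ along $\sigma$ strictly before $q$, then there is a timelike curve from $P$ to $q$ arbitrarily near $\sigma$.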

Proof. Let $\sigma(r)$ be the first focal point of $P$ along $\sigma$. By Proposition 46, it suffices to show that for some small $\delta > 0$ there is a small fixed endpoint deformation of $\sigma|[0, r + \delta]$ to a timelike curve segment. Let $J$ be a nonzero $P$-Jacobi field on $\sigma$ that vanishes at $\sigma(r)$.

(1) There is a $\delta > 0$ such that $J = fU$ on $[0, r + \delta]$, where $U$ is a spacelike unit vector field on $\sigma$ and $f > 0$ on $(0, r)$.

$J$ is orthogonal to $\sigma$, and we assert that it is never tangent to $\sigma$ on $(0, r)$. In fact, if $J(a) = c\sigma'(a)$ for some $a \in (0, r)$, then $u \to J(u) - (cu/a)\sigma'(u)$ is a nonzero $P$-Jacobi field on $\sigma$ that vanishes at $a$, before the first focal point at $r$. Since $M$ is Lorentz, it follows that $J$ is spacelike on $(0, r)$. Now $J(r) = 0$, and whether or not $J(0) = 0$, it follows from Exercise 1.17 that there is a smooth vector field $Y$ on $\sigma$ such that $J(u) = u(r - u)Y(u)$. Now $J'(r) \ne 0$ (and $J'(0) \ne 0$ if $J(0) = 0$), hence for some $\delta > 0$, $Y$ is never zero on $[0, r + \delta]$. Then $U = Y/|Y|$ and $f(u) = u(r - u)|Y(u)|$ give $J = fU$ as required.

(2) For some $\delta > 0$, there is a vector field $V$ on $\sigma$ that vanishes at $0$ and $r + \delta$, is perpendicular to $\sigma$, and has $\langle V'' - R_{V\sigma'}\sigma', V\rangle > 0$ on $(0, r + \delta)$.

For $J = fU$ as in (1), let $V = (f + g)U = J + gU$, where $g$ is to be determined. We compute

$$V'' - R_{V\sigma'}\sigma' = g''U + 2g'U' + g[U'' - R_{U\sigma'}\sigma'].$$

Since $U$ is a unit vector field,

$$(*) \qquad \langle V'' - R_{V\sigma'}\sigma', V\rangle = (f + g)(g'' + gh), \quad \text{where } h = \langle U'' - R_{U\sigma'}\sigma', U\rangle.$$

Let $-a^2$, with $a > 0$, be a lower bound for $h$ on $[0, r + \delta]$, and define $g(u) = b(e^{au} - 1)$, where $b > 0$ is determined by $g(r + \delta) = -f(r + \delta)$. Since $g'' = a^2(g + b)$, we get

$$g'' + gh = g(a^2 + h) + a^2 b > 0 \quad \text{on } (0, r + \delta).$$

The function $f + g$ is positive on $(0, r]$ and zero at $r + \delta$. Reducing $\delta > 0$ if necessary, we can suppose that $r + \delta$ is the only zero of $f + g$ in $(0, r + \delta]$. Then $f + g$ and $g'' + gh$ are both positive on $(0, r + \delta)$, hence by (*), $V$ has the required properties.

(3) There is a fixed endpoint variation $x$ of $\sigma|[0, r + \delta]$ whose longitudinal curves $\alpha_v$, for $v$ sufficiently small, are timelike on $(0, r + \delta)$, hence causal on $[0, r + \delta]$.

Let $N$ be a parallel null vector field on $\sigma$ with $\langle N, \sigma'\rangle = -1$. By the lemma below there is a variation $x$ of $\sigma$ with vector field $V$ as in (2) and acceleration vector field $A = \langle V', V\rangle N$. Thus in the second formula for $\partial^2 f/\partial v^2$ in Lemma 38, two terms cancel since

$$\langle A', \sigma'\rangle = \langle V', V\rangle' \langle N, \sigma'\rangle = -\langle V', V\rangle'.$$

The remaining term is negative on $(0, r + \delta)$ by (2), thus (3) follows. To complete the proof it suffices to apply Proposition 46 to the piecewise smooth curves $\alpha_v + \sigma|[r + \delta, 1]$.
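The cancellation used in step (3) can be written out explicitly. The following worked computation is a sketch that assumes the second-derivative formula of Lemma 38 in the form $\tfrac12\,\partial^2 f/\partial v^2 = \langle V', V'\rangle - \langle R_{V\sigma'}V, \sigma'\rangle + \langle A', \sigma'\rangle$ at $v = 0$, together with the skew-adjointness of the curvature operator $R_{V\sigma'}$.

```latex
\begin{align*}
\tfrac12 \frac{\partial^2 f}{\partial v^2}\Big|_{v=0}
  &= \langle V', V'\rangle - \langle R_{V\sigma'}V, \sigma'\rangle + \langle A', \sigma'\rangle \\
  &= \langle V', V'\rangle - \langle R_{V\sigma'}V, \sigma'\rangle
     - \bigl(\langle V'', V\rangle + \langle V', V'\rangle\bigr)
     && \text{since } \langle A', \sigma'\rangle = -\langle V', V\rangle' \\
  &= -\langle V'', V\rangle + \langle R_{V\sigma'}\sigma', V\rangle
     && \text{since } \langle R_{V\sigma'}V, \sigma'\rangle = -\langle R_{V\sigma'}\sigma', V\rangle \\
  &= -\langle V'' - R_{V\sigma'}\sigma', V\rangle \;<\; 0
     && \text{on } (0, r+\delta) \text{ by (2)}.
\end{align*}
```

Since $f(u, 0) = 0$ and $\partial f/\partial v(u, 0) = 2\langle V', \sigma'\rangle = 0$, a negative second derivative makes $f(u, v) < 0$ for small $v \ne 0$ at each interior point, which is the timelike condition in (3).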

49. Lemma. If $\alpha \in \Omega(P, q)$, let $V \in T_\alpha(\Omega)$ and let $A$ be a vector field on $\alpha$ such that $\operatorname{nor} A(0) = II(V(0), V(0))$ and $A(b) = 0$. Then there is a $(P, q)$ variation $x$ of $\alpha$ whose variation and transverse acceleration vector fields are $V$ and $A$.

Proof. When $P$ is a single point, then $A(0) = V(0) = 0$ and it suffices to define

$$x(u, v) = \exp_{\alpha(u)}\bigl[vV(u) + \tfrac12 v^2 A(u)\bigr].$$

Recall that, at $0$, $\exp_p$ preserves velocities and accelerations (up to canonical isomorphism $\approx$). If $\dim P > 0$, it is easy to find a variation of $\alpha$ of the form $X(u, v) = \exp Z(u, v)$ such that (1) the first transverse curve lies in $P$ and has initial velocity $V(0)$ and initial $P$-acceleration $\tan A(0)$, hence initial $M$-acceleration $A(0)$, and (2) the last transverse curve is constant at $q$. We modify this variation as follows. Denote its variation vector field by $W(u) \approx Z_v(u, 0)$ and its acceleration vector field by $B(u) \approx Z_{vv}(u, 0)$. Both $V - W$ and $A - B$ are zero at endpoints. Hence

$$\exp\bigl(Z(u, v) + v(V(u) - W(u)) + \tfrac12 v^2 (A(u) - B(u))\bigr)$$

defines a variation with the required properties.


50. Lemma. If $\sigma \in \Omega(P, q)$ is a null geodesic not normal to $P$, there is a timelike curve arbitrarily near $\sigma$ in $\Omega(P, q)$.

Proof. By hypothesis there is a vector $y \in T_{\sigma(0)}P$ such that $\langle y, \sigma'(0)\rangle \ne 0$. Switching if necessary to $-y$, we can suppose $\langle y, \sigma'(0)\rangle > 0$. Define $V(u) = (1 - (u/b))Y(u)$, where $Y(u)$ is the parallel translate of $y$ along $\sigma$. Then $V \in T_\sigma(\Omega)$, so there is a $(P, q)$ variation of $\sigma$ with vector field $V$. But $V' = -Y/b$, hence $\langle V', \sigma'\rangle = -\langle Y, \sigma'\rangle/b = -\langle y, \sigma'(0)\rangle/b < 0$. Then Lemma 45 completes the proof.

Thus we can summarize as follows.

51. Theorem. Let $P$ be a spacelike submanifold of a Lorentz manifold $M$. If $\alpha \in \Omega(P, q)$ is a causal curve, there is a timelike curve arbitrarily near $\alpha$ in $\Omega(P, q)$ unless $\alpha$ is a $P$-normal null geodesic along which there are no focal points of $P$ before $q$.

The exceptional case cannot be eliminated. For example (with $P$ a single point), no timelike curve joins the endpoints of a null geodesic segment in Minkowski space. In Example 42, the $P$-normal null geodesics in $\Omega(P, q)$ that pass through a focal point $(\pm a, 0, \dots, 0)$ are exactly those with timelike neighbors in $\Omega(P, q)$.

Exercises

1. In each case, $x$ is a variation of a geodesic segment; compute $L_x''(0)$ by the second variation formula and also directly from $L_x$.
(a) In $S^2$, $x(u, v) = (\cos v \cos u, \cos v \sin u, \sin v)$, $0 \le u \le \pi$.
(b) In $\mathbf{R}^2$, $x(u, v) = (u \cosh v, v)$, $-1 \le u \le 1$.
(c) In $\mathbf{R}^2$, $x(u, v)$ is $(u, uv)$ if $u \in [0, 1]$ and $(u, v(2 - u))$ if $u \in [1, 2]$.

2. Let $\sigma\colon [0, b] \to M$ be a geodesic that is normal to $P \subset M$. If for $0 \le a < b$ the points $\sigma(a)$ and $\sigma(b)$ are conjugate along $\sigma$, then there is a focal point of $P$ along $\sigma$. (Hint: Use the index forms, or Hessians, of $\sigma|[a, b]$ and $\sigma \in \Omega(P, \sigma(b))$.)

3. Two endmanifolds. Let $P$ and $Q$ be semi-Riemannian submanifolds of $M$. (a) Establish first and second variation formulas for $E$ on $\Omega(P, Q)$. (b) Prove that the critical points of $E$ are geodesics normal to both $P$ and $Q$.

4. In the preceding exercise suppose $M$ is a complete connected Riemannian manifold. Prove: (a) If $P$ and $Q$ are compact, there is a shortest geodesic from $P$ to $Q$ and it is normal to both. (b) If $P$ and $Q$ are compact and totally geodesic, and $M$ has $K > 0$, then $P$ and $Q$ meet.


5. Let $\gamma$ be a cospacelike geodesic in a locally symmetric Riemannian or Lorentz manifold. Let $F_v$ be the tidal force operator of $v = \gamma'(0)$. Prove: (a) $v^\perp$ has an orthonormal basis $e_1, \dots, e_{n-1}$ consisting of eigenvectors: $F_v(e_i) = \lambda_i e_i$. (b) The conjugate points of $\gamma(0)$ along $\gamma$ are $\gamma(\pi m/\sqrt{-\lambda_j})$ for all negative eigenvalues $\lambda_j < 0$ and integers $m \ne 0$.

6. Let $\gamma$ be a null geodesic in a locally symmetric Lorentz manifold. (a) Prove an analogue of the preceding exercise by considering the quotient vector space $v^\perp/\mathbf{R}v$. (b) In the symmetric space $\mathbf{R}^1_1 \times S^2$ find a null geodesic with conjugate points.

7. Extremal submanifolds. Let $P$ be a semi-Riemannian submanifold of $M$. Let $Z$ be a normal vector field on $P$ with compact support (e.g., $P$ itself compact). Then $Z$ can be extended to a neighborhood of $P$ on which its flow $\psi_t$ is defined. (a) If $\operatorname{vol}(t)$ is the volume of $\psi_t(P)$, prove that $(\operatorname{vol})'(0) = -(\dim P)\int_P \langle H, Z\rangle\, dP$, where $H$ is the mean curvature vector field of $P$ in $M$. (Hint: $\operatorname{vol}(t)$ is the integral of the Jacobian function of $\psi_t$.) (b) Deduce that $H = 0$ if and only if $P$ is an extremal for volume, that is, $(\operatorname{vol})'(0) = 0$ for all such $Z$.

8. Let $\sigma(b)$ be an (order $m$) focal point of $P$ along a $P$-normal geodesic $\sigma$. If $c \ne b$ is sufficiently near $b$, then $\sigma(c)$ is not a focal point along $\sigma$. Prove this by showing: (a) There exists a basis $Y_1, \dots, Y_{n-1}$ for the $P$-Jacobi fields perpendicular to $\sigma$ such that $Y_1'(b), \dots, Y_m'(b), Y_{m+1}(b), \dots, Y_{n-1}(b)$ is a basis for $\sigma'(b)^\perp$. If $E_1, \dots, E_{n-1}$ are gotten by parallel translation of the above listed $n - 1$ vectors, let $f = \det\langle Y_i, E_j\rangle$. (b) For $s \ne 0$, $f(s) = 0$ if and only if $\sigma(s)$ is a focal point along $\sigma$. (c) Near $u = b$, $f(u) = (u - b)^m g(u)$, where $g$ is a continuous function with $g(b) \ne 0$.

9. Convergence. Prove: (a) If $z$ is a vector normal to $P \subset M$, then $k(z)$ is the average of the (possibly complex) eigenvalues of $S_z$. (b) In the notation of Exercises 4.7 and 9.22, $k(U) = h$. (c) If $P$ is totally umbilic in $M$, with normal curvature vector field $z$, then $k(z) = \langle z, z\rangle$.

11

HOMOGENEOUS AND SYMMETRIC SPACES

This chapter will show how symmetric spaces (semi-Riemannian symmetric manifolds) can be constructed, and their geometries described, in terms of Lie groups, and in fact, to a large extent, in terms of Lie algebras. Given such a Lie description, curvature and geodesics can often be found by surprisingly simple matrix calculations, and information about isometries and the underlying topology is forthcoming. Initially a broader class of homogeneous spaces, said to be naturally reductive, is considered. Also we consider some complex geometry, and, turning the basic scheme around, use symmetric space information to find properties of several geometrically important Lie groups.

MORE ABOUT LIE GROUPS

We add to Appendix B a few elementary facts needed in this chapter. If $G$ is a Lie group, $\mathfrak{g}$ denotes its Lie algebra and $e$ its identity element. An automorphism of a Lie group $G$ is a map $\phi\colon G \to G$ that is both a diffeomorphism and a group isomorphism. Since automorphisms preserve the two structures of a Lie group, they must preserve all features of Lie theory on $G$.

1. Lemma. Let $\phi\colon G \to G$ be an automorphism. If $X \in \mathfrak{g}$, then the transferred vector field $d\phi(X)$ is in $\mathfrak{g}$, and $d\phi\colon \mathfrak{g} \to \mathfrak{g}$ is a Lie algebra isomorphism called the differential of $\phi$.


Proof. (Recall that $(d\phi X)_g = d\phi(X_{\phi^{-1}(g)})$ for all $g \in G$.) If $a \in G$, let $b = \phi^{-1}(a)$. Then $\phi L_b(g) = \phi(bg) = \phi(b)\phi(g) = a\phi(g)$, hence $\phi L_b = L_a \phi$. Thus if $X \in \mathfrak{g}$,

$$dL_a\, d\phi(X) = d\phi\, dL_b(X) = d\phi(X).$$

Hence $d\phi(X) \in \mathfrak{g}$. The function $d\phi\colon \mathfrak{g} \to \mathfrak{g}$ is certainly linear, and preserves brackets by Proposition 1.22. That it is a linear isomorphism follows since $\phi^{-1}$ is also an automorphism of $G$.

The differential $d\phi$ contains the same information as the differential map $d\phi\colon TG \to TG$, and is, in fact, completely determined by the single map $d\phi_e\colon T_e(G) \to T_e(G)$.

If $a \in G$, let $C_a\colon G \to G$ be the function sending each $g$ to $aga^{-1}$. $C_a$ is an inner automorphism and, since $C_a = L_a \circ R_{a^{-1}}$, a diffeomorphism. Thus $C_a$ is an automorphism of $G$. The differential of $C_a$ is denoted by $\operatorname{Ad}_a$. If $a, b \in G$, then $C_{ab}(g) = abg(ab)^{-1} = a(bgb^{-1})a^{-1}$, hence $C_{ab} = C_a \circ C_b$. Taking differentials gives

$$\operatorname{Ad}_{ab} = \operatorname{Ad}_a \circ \operatorname{Ad}_b.$$

The resulting homomorphism $a \to \operatorname{Ad}_a$ is called the adjoint representation of $G$.

2. Corollary. If $X, Y \in \mathfrak{g}$, then

$$[X, Y] = \lim_{t \to 0} \frac{1}{t}\{\operatorname{Ad}_{\alpha(t)} Y - Y\},$$

where $\alpha(t) = \exp(tX)$ is the one-parameter subgroup of $X$.

Proof. By Lemma 9.34 the flow of $X$ is $\{R_{\alpha(t)}\}$. Hence by Proposition 1.58,

$$[X, Y] = \lim_{t \to 0} \frac{1}{t}\{dR_{\alpha(-t)} Y - Y\}.$$

Since $\operatorname{Ad}_a = dC_a = dR_{a^{-1}} \circ dL_a$, it follows that $\operatorname{Ad}_a$ and $dR_{a^{-1}}$ have the same effect on the (left-invariant) elements of $\mathfrak{g}$. Thus $dR_{\alpha(-t)}$ can be replaced by $\operatorname{Ad}_{\alpha(t)}$ in the preceding formula.

This corollary shows that the bracket operation measures failure of commutativity in $G$. For example, if $G$ is abelian, then $C_a = \mathrm{id}$, hence $\operatorname{Ad}_a = \mathrm{id}$, for all $a \in G$. Thus by the corollary, $[X, Y] = 0$ for all $X, Y \in \mathfrak{g}$; that is, $\mathfrak{g}$ is abelian. The converse holds if $G$ is connected (see [KN] or [W]).

Let $H$ be a subgroup of $G$ (perhaps $G$ itself). An object defined on the Lie algebra $\mathfrak{g}$ of $G$ is $\operatorname{Ad}(H)$-invariant if it is preserved by $\operatorname{Ad}_h\colon \mathfrak{g} \to \mathfrak{g}$ for all $h \in H$. Let $\mathfrak{h} \subset \mathfrak{g}$ be the Lie algebra of $H$.


3. Lemma. If a symmetric bilinear form $F$ on $\mathfrak{g}$ is $\operatorname{Ad}(H)$-invariant, then $F([X, W], Y) = F(X, [W, Y])$ for all $X, Y \in \mathfrak{g}$ and $W \in \mathfrak{h}$. The converse holds if $H$ is connected.

Proof. By polarization, the stated formula is equivalent to $F([W, X], X) = 0$ for all $X$, $W$. Using the preceding corollary this can be shown equivalent to the constancy of the function $f(s) = F(\operatorname{Ad}_{\alpha(s)}X, \operatorname{Ad}_{\alpha(s)}X)$ on each one-parameter subgroup $\alpha$ in $H$. (Compare the proof of Proposition 9.23.) The direct assertion follows. For the converse, Lemma B.12 shows that $F$ is preserved by $\operatorname{Ad}_h$ for $h$ in some neighborhood of the identity in $H$. We assume $H = H_0$, and $h \to \operatorname{Ad}_h$ is a homomorphism, so the result follows from Lemma B.13.

If $\mathfrak{g}$ is a Lie algebra and $X \in \mathfrak{g}$, let $\operatorname{ad}_X\colon \mathfrak{g} \to \mathfrak{g}$ be the map sending each $Y$ to $[X, Y]$. Evidently $\operatorname{ad}_X$ is a linear operator, and by the Jacobi identity it is a Lie derivation; that is,

$$\operatorname{ad}_Z[X, Y] = [\operatorname{ad}_Z X, Y] + [X, \operatorname{ad}_Z Y].$$

The Jacobi identity also shows that $\operatorname{ad}_{[X, Y]} = [\operatorname{ad}_X, \operatorname{ad}_Y]$. Note that the equation in Lemma 3 asserts that $\operatorname{ad}_W$ is skew-adjoint relative to $F$.

4. Definition. The Killing form of a Lie algebra $\mathfrak{g}$ is the function $B\colon \mathfrak{g} \times \mathfrak{g} \to \mathbf{R}$ given by $B(X, Y) = \operatorname{trace}(\operatorname{ad}_X \operatorname{ad}_Y)$.

5. Lemma. The Killing form $B$ of $\mathfrak{g}$ is a symmetric bilinear form that is invariant under all automorphisms of $\mathfrak{g}$ and satisfies $B([X, Y], Z) = B(X, [Y, Z])$ for $X, Y, Z \in \mathfrak{g}$.

Proof. $B$ is bilinear since the function $X \to \operatorname{ad}_X$ is linear. $B$ is symmetric since $\operatorname{trace} ST = \operatorname{trace} TS$. Let $\rho$ be an automorphism of $\mathfrak{g}$, that is, a linear isomorphism that preserves brackets. The latter implies $\operatorname{ad}_{\rho X} \circ \rho = \rho \circ \operatorname{ad}_X$, hence $\operatorname{ad}_{\rho X} = \rho \circ \operatorname{ad}_X \circ \rho^{-1}$. Since $\operatorname{trace} STS^{-1} = \operatorname{trace} T$, it follows immediately that $B(\rho X, \rho Y) = B(X, Y)$. Finally, since $\operatorname{ad}_{[X, Y]} = [\operatorname{ad}_X, \operatorname{ad}_Y]$, trace properties show that $B([X, Y], Z) = B(X, [Y, Z])$.

If $\mathfrak{g}$ is the Lie algebra of a Lie group $G$, then the Killing form of $\mathfrak{g}$ is also attributed to $G$ and is in particular $\operatorname{Ad}(G)$-invariant.

Our applications will deal with matrix Lie groups, that is, Lie subgroups of $GL(n, \mathbf{C})$, with their Lie algebras also in matrix form as subalgebras of $\mathfrak{gl}(n, \mathbf{C})$ (Appendix B). In computations the notation $X \cdot Y = \sum X_{ij} Y_{ij}$ is efficient. Note that ${}^tX \cdot {}^tY = X \cdot Y = Y \cdot X$ and $\operatorname{trace} XY = X \cdot {}^tY$. On $\mathfrak{gl}(n, \mathbf{R}) \approx \mathbf{R}^{n^2}$, $X \cdot Y$ is indeed the dot product, while on $\mathfrak{gl}(n, \mathbf{C})$, $X \cdot \bar{Y}$ is the natural Hermitian product.


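6. Lemma. Let $G$ be a Lie subgroup of $GL(n, \mathbf{C})$.

(1) $\operatorname{Ad}_a(X) = aXa^{-1}$ for all $a \in G$, $X \in \mathfrak{g} \subset \mathfrak{gl}(n, \mathbf{C})$.
(2) Assume that $X \in \mathfrak{g} \Rightarrow {}^t\bar{X} \in \mathfrak{g}$. Then $B(X, Y) = \operatorname{Re} \operatorname{trace} XY = \operatorname{Re} X \cdot {}^tY$ is an $\operatorname{Ad}(G)$-invariant scalar product on $\mathfrak{g}$ called the trace form.

Proof. (1) The formula $C_a(g) = aga^{-1}$ defining $C_a\colon G \to G$ extends this map to an $\mathbf{R}$-linear operator on $\mathfrak{gl}(n, \mathbf{C}) \approx \mathbf{R}^{2n^2}$. Then $dC_a$ differs only by canonical isomorphism from $C_a$ (Exercise 1.2); hence $\operatorname{Ad}_a = C_a | \mathfrak{g}$.

(2) Clearly $B$ is bilinear and symmetric. The hypothesis on $\mathfrak{g}$ lets us pick $Y$ to be ${}^t\bar{X}$. Then $0 = \operatorname{Re} X \cdot \bar{X} = X \cdot \bar{X} = \sum |X_{ij}|^2$ implies $X = 0$, so $B$ is a scalar product. Since $\operatorname{trace} STS^{-1} = \operatorname{trace} T$, it follows immediately from (1) that $B(\operatorname{Ad}_a X, \operatorname{Ad}_a Y) = B(X, Y)$ for all $a \in G$.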
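Lemma 6(1) makes the limit in Corollary 2 easy to test for matrices. The sketch below is illustrative only: it works in $GL(2, \mathbf{R})$ with arbitrarily chosen matrices `a`, `X`, `Y`, uses a hand-written power series for the exponential, and all helper names are assumptions made here. It checks that $\operatorname{trace} XY$ is unchanged by $\operatorname{Ad}_a$ and that the difference quotient of Corollary 2 approaches the matrix commutator $XY - YX$.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def inverse2(A):
    """Inverse of an invertible 2x2 matrix."""
    (p, q), (r, s) = A
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def expm(A, terms=30):
    """Matrix exponential by its power series (adequate for small matrices)."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[c / k for c in row] for row in matmul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def Ad(a, X):
    """Ad_a(X) = a X a^{-1}, as in Lemma 6(1)."""
    return matmul(matmul(a, X), inverse2(a))

def bracket(X, Y):
    """Matrix commutator XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[XY[i][j] - YX[i][j] for j in range(2)] for i in range(2)]

# Arbitrary illustrative choices of group element and algebra elements.
a = [[2.0, 1.0], [1.0, 1.0]]
X = [[0.0, 1.0], [-1.0, 0.0]]
Y = [[1.0, 2.0], [0.0, -1.0]]

# Ad-invariance of trace XY: the two printed numbers should agree.
print(trace(matmul(X, Y)), trace(matmul(Ad(a, X), Ad(a, Y))))

# Difference quotient from Corollary 2 against the commutator.
t = 1e-5
alpha_t = expm([[t * c for c in row] for row in X])
AdY = Ad(alpha_t, Y)
quotient = [[(AdY[i][j] - Y[i][j]) / t for j in range(2)] for i in range(2)]
print(quotient)
print(bracket(X, Y))
```

The agreement in the last two lines is only approximate, with an error of order $t$, since the quotient is a finite difference rather than a limit.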

7. Remark. Consider the Lie algebras $\mathfrak{g} = \mathfrak{o}(n), \mathfrak{u}(n), \mathfrak{sp}(n)$ discussed in Appendix B. Then:

(1) $X \in \mathfrak{g} \Rightarrow {}^t\bar{X} \in \mathfrak{g}$;
(2) $\operatorname{trace} XY$ is real for all $X, Y \in \mathfrak{g}$. Thus the trace form is nondegenerate in these cases and is given merely by $\operatorname{trace} XY$.
(3) The Killing form $B$ of $\mathfrak{g}$ is proportional to the trace form. In fact, $B(X, Y) = c \operatorname{trace} XY$, with $c \ne 0$ if $\dim \mathfrak{g} > 1$.

Only the last assertion is not obvious. It can be proved by elementary but tedious computations, but Lie theory produces simplifications [H]. These results are known to hold also for $\mathfrak{g} = \mathfrak{o}(p, q)$, $\mathfrak{u}(p, q)$, and $\mathfrak{sp}(p, q)$. Trace forms are computed for the nonsymplectic cases below and in Example 41.
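Assertion (3) can be checked by hand in the smallest interesting case. The following sketch, which is not from the text, computes the Killing form of $\mathfrak{o}(3)$ directly from Definition 4 and compares it with $\operatorname{trace} XY$; the basis `L`, the coordinate map `coords`, and the helper names are assumptions chosen for this illustration.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def bracket(X, Y):
    """Matrix commutator XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

# Basis of o(3) with [L[0], L[1]] = L[2] and cyclic permutations.
L = [
    [[0, 0, 0], [0, 0, -1], [0, 1, 0]],
    [[0, 0, 1], [0, 0, 0], [-1, 0, 0]],
    [[0, -1, 0], [1, 0, 0], [0, 0, 0]],
]

def coords(X):
    """Coordinates of a skew-symmetric 3x3 matrix in the basis L."""
    return [X[2][1], X[0][2], X[1][0]]

def ad_matrix(X):
    """Matrix of ad_X = [X, .] on o(3) in the basis L (columns are images)."""
    cols = [coords(bracket(X, Lj)) for Lj in L]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def killing(X, Y):
    """B(X, Y) = trace(ad_X ad_Y), as in Definition 4."""
    return trace(matmul(ad_matrix(X), ad_matrix(Y)))

# Each line prints B(L_i, L_j) next to trace(L_i L_j).
for i in range(3):
    for j in range(3):
        print(i, j, killing(L[i], L[j]), trace(matmul(L[i], L[j])))
```

In this basis the two columns coincide, so the proportionality constant for $\mathfrak{o}(3)$ comes out as $c = 1$; other algebras in the remark have other constants.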

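8. Examples. Trace Forms. (1) Lie algebra $\mathfrak{o}(n)$ of the orthogonal group $O(n)$. Since $\mathfrak{o}(n)$ consists of skew-symmetric matrices, $\operatorname{trace} XY = -X \cdot Y$.

(2) Lie algebra $\mathfrak{o}(p, q)$ of the semiorthogonal group $O(p, q)$. By Lemma 9.3, $\mathfrak{o}(p, q)$ consists of all matrices of the form

$$X = \begin{pmatrix} a & {}^tx \\ x & b \end{pmatrix},$$

where $a \in \mathfrak{o}(p)$, $b \in \mathfrak{o}(q)$, and $x$ is an arbitrary real $q \times p$ matrix. The space of all such $x$ can safely be denoted by $\mathbf{R}^{pq}$. (This parametrization of $X$ follows [KN, Volume II], differing trivially from [H].) If $Y = \begin{pmatrix} c & {}^ty \\ y & d \end{pmatrix} \in \mathfrak{o}(p, q)$, then for $X$ as above,

$$\operatorname{trace} XY = X \cdot {}^tY = \begin{pmatrix} a & {}^tx \\ x & b \end{pmatrix} \cdot \begin{pmatrix} -c & {}^ty \\ y & -d \end{pmatrix} = -a \cdot c - b \cdot d + 2x \cdot y.$$

Evidently the vector space $\mathfrak{o}(p, q)$ can be written as an orthogonal direct sum $\mathfrak{o}(p) + \mathfrak{o}(q) + \mathbf{R}^{pq}$. The trace form is thus negative definite on $\mathfrak{o}(p) + \mathfrak{o}(q)$ and positive definite on $\mathbf{R}^{pq}$.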


(3) Lie algebra $\mathfrak{u}(n)$ of the unitary group $U(n)$. By Appendix B, $\mathfrak{u}(n)$ consists of the $n \times n$ complex matrices $X$ that are skew-Hermitian: ${}^tX = -\bar{X}$. Thus $\operatorname{trace} XY = X \cdot {}^tY = -X \cdot \bar{Y}$. The trace form is negative definite, since $\operatorname{trace} XX = -X \cdot \bar{X} = -\sum |X_{ij}|^2$. Hence $-\operatorname{trace} XY$ is an inner product on $\mathfrak{u}(n)$. (Note that $\mathfrak{u}(n)$, though constructed of complex numbers, is a real vector space and not a complex one.)
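The formulas in Examples 8(2) and 8(3) can be spot-checked on small matrices. The sketch below is illustrative only: it uses $\mathfrak{o}(2, 1)$ and $\mathfrak{u}(2)$, the particular entries are arbitrary, and the helper `o21` is a hypothetical name introduced here for the block form with $b = 0 \in \mathfrak{o}(1)$.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def dot(X, Y):
    """X . Y = sum of X_ij * Y_ij (no conjugation)."""
    return sum(X[i][j] * Y[i][j]
               for i in range(len(X)) for j in range(len(X[0])))

def o21(s, x):
    """Element of o(2,1) in block form: a = [[0,-s],[s,0]] in o(2),
    b = 0 in o(1), and x = (x0, x1) a 1x2 real matrix."""
    return [[0, -s, x[0]],
            [s, 0, x[1]],
            [x[0], x[1], 0]]

# Example 8(2): trace XY = -a.c - b.d + 2 x.y  (here b = d = 0).
X = o21(2, (1, 3))
Y = o21(-1, (4, 5))
a = [[0, -2], [2, 0]]
c = [[0, 1], [-1, 0]]
lhs = trace(matmul(X, Y))
rhs = -dot(a, c) + 2 * (1 * 4 + 3 * 5)
print(lhs, rhs)

# Example 8(3): for skew-Hermitian Z, trace ZZ = -sum |Z_ij|^2.
Z = [[1j, 2 + 1j], [-2 + 1j, -3j]]
neg_norm = -sum(abs(Z[i][j]) ** 2 for i in range(2) for j in range(2))
print(trace(matmul(Z, Z)), neg_norm)
```

Taking $a = c = 0$ in the first check isolates the $\mathbf{R}^{pq}$ summand, on which the trace form is $2x \cdot y$ and hence positive definite.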

BI-INVARIANT METRICS

Since a Lie group $G$ is in particular a manifold, it can be made semi-Riemannian by furnishing it with a metric tensor. In order to link the resulting geometry of $G$ to the group structure of $G$, it is customary to use a left-invariant metric: one for which left multiplication $L_a\colon G \to G$ is an isometry for all $a \in G$.

A left-invariant metric on $G$ is virtually the same thing as a scalar product on the Lie algebra $\mathfrak{g}$ of $G$, or a scalar product on $T_e(G)$. In fact, the last two correspond under the canonical isomorphism $X \to X_e$. If $\langle\ ,\ \rangle$ is a left-invariant metric tensor on $G$, then $\langle X, Y\rangle$ is constant for $X, Y \in \mathfrak{g}$, thereby defining a scalar product on $\mathfrak{g}$. Conversely, if $\langle\ ,\ \rangle$ is a scalar product on $T_e(G)$, the definition

$$\langle x, y\rangle = \langle dL_{a^{-1}}(x), dL_{a^{-1}}(y)\rangle \quad \text{for } x, y \in T_a(G)$$

gives a left-invariant metric on $G$. A metric on $G$ that is both left- and right-invariant (each $R_a\colon G \to G$ also an isometry) is called bi-invariant.

9. Proposition. Let $G$ be a connected Lie group furnished with a left-invariant metric tensor $\langle\ ,\ \rangle$. Then the following are equivalent:

(1) $\langle\ ,\ \rangle$ is right-invariant, hence bi-invariant.
(2) $\langle\ ,\ \rangle$ is $\operatorname{Ad}(G)$-invariant.
(3) The inversion map $g \to g^{-1}$ is an isometry of $G$.
(4) $\langle X, [Y, Z]\rangle = \langle [X, Y], Z\rangle$ for all $X, Y, Z \in \mathfrak{g}$.
(5) $D_X Y = \tfrac12 [X, Y]$ for all $X, Y \in \mathfrak{g}$.
(6) The geodesics of $G$ starting at $e$ are the one-parameter subgroups of $G$.

Proof. In fact, conditions (1), (2), and (3) are always equivalent and imply the (equivalent) conditions (4), (5), and (6). The connectedness of $G$ is used only for (4), (5), (6) $\Rightarrow$ (1), (2), (3).

(1) $\Leftrightarrow$ (2) is immediate, since $C_a = L_a \circ R_{a^{-1}}$, and $dL_a | \mathfrak{g} = \mathrm{id}$.


(1) $\Leftrightarrow$ (3). Let $\zeta$ be the inversion map. Then for any one-parameter subgroup $\alpha$, $\zeta\alpha(s) = \alpha(s)^{-1} = \alpha(-s)$; so $d\zeta_e = -\mathrm{id}$. If $a \in G$, then $\zeta = R_{a^{-1}}\, \zeta\, L_{a^{-1}}$. Thus the differential map $d\zeta_a: T_a(G) \to T_{a^{-1}}(G)$ is $dR_{a^{-1}}\, d\zeta_e\, dL_{a^{-1}}$. Hence (1) $\Rightarrow$ (3). The converse is clear, since $R_a = \zeta L_{a^{-1}} \zeta$.

(2) $\Leftrightarrow$ (4). Immediate from Lemma 3.

(4) $\Leftrightarrow$ (5). For elements of $\mathfrak g$, scalar products are constant, hence the Koszul formula reduces to

$$2\langle D_X Y, Z\rangle = -\langle X, [Y, Z]\rangle + \langle Y, [Z, X]\rangle + \langle Z, [X, Y]\rangle.$$

By (4), the first two summands cancel, yielding (5). Conversely, if (5) holds,

$$\langle X, [Y, Z]\rangle = 2\langle X, D_Y Z\rangle = -2\langle D_Y X, Z\rangle = -\langle [Y, X], Z\rangle = \langle [X, Y], Z\rangle.$$

(5) $\Leftrightarrow$ (6). By polarization, (5) is equivalent to $D_X X = 0$ for all $X \in \mathfrak g$. Let $\alpha$ be the one-parameter subgroup of X; hence $\alpha'' = D_X X|_\alpha$. Thus (5) implies that every $\alpha$ is a geodesic; but this is equivalent to (6). Conversely, if $\alpha$ is a geodesic, then $D_X X|_\alpha = 0$. Left multiplications, being isometries, preserve D; hence $D_X X = 0$. ∎

For brevity a Lie group G furnished with a bi-invariant metric will be called a semi-Riemannian group. Then G is a symmetric space since, by (3) above, the inversion map $\zeta$ is the global symmetry at e, and thus $L_a \zeta L_{a^{-1}}$ is the symmetry at $a \in G$. In particular, G is complete.

10. Corollary. For a semi-Riemannian group G:
(1) $R_{XY}Z = \tfrac14 [[X, Y], Z]$ for $X, Y, Z \in \mathfrak g$.
(2) If X and Y span a nondegenerate plane in $\mathfrak g$, then

$$K(X, Y) = \frac14\, \frac{\langle [X, Y], [X, Y]\rangle}{\langle X, X\rangle \langle Y, Y\rangle - \langle X, Y\rangle^2}.$$

Proof. (1) Since $D_X Y = \tfrac12 [X, Y]$,

$$R_{XY}Z = \tfrac12 [[X, Y], Z] - \tfrac14 [X, [Y, Z]] + \tfrac14 [Y, [X, Z]].$$

But manipulation of the Jacobi identity $\mathfrak S [X, [Y, Z]] = 0$ replaces the last two summands by $-\tfrac14 [[X, Y], Z]$.

(2) By the shift formula (4) in Proposition 9,

$$\langle R_{XY}X, Y\rangle = \tfrac14 \langle [[X, Y], X], Y\rangle = \tfrac14 \langle [X, Y], [X, Y]\rangle.$$

Here K(X, Y) is a number that, for every $g \in G$, gives the sectional curvature of the plane spanned by $X_g$ and $Y_g$. Some consequences of the curvature formulas are:

(1) If G is abelian, then K = 0.
(2) If the metric is Riemannian, then $K \ge 0$.
(3) $\mathrm{Ric}|_{\mathfrak g} = -B/4$, where B is the Killing form of G.
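As a numerical sanity check of these formulas, the following sketch evaluates them on the Lie algebra of SO(3). The basis `E1, E2, E3`, the scalar product $\langle X, Y\rangle = -\tfrac12\,\mathrm{trace}(XY)$, and all function names are assumptions made for this illustration; they are not notation from the text.

```python
# Sketch: shift condition, sectional curvature, and Ricci curvature on so(3).
# Assumed setup (not from the text): basis E1, E2, E3 of so(3) and the
# Ad-invariant scalar product <X, Y> = -1/2 trace(XY), for which the basis
# is orthonormal.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def inner(a, b):
    ab = matmul(a, b)
    return -0.5 * sum(ab[i][i] for i in range(len(a)))

E1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
E2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
E3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]
BASIS = [E1, E2, E3]

def shift_holds():
    # Proposition 9(4): <X, [Y, Z]> = <[X, Y], Z> on all basis triples.
    return all(abs(inner(x, bracket(y, z)) - inner(bracket(x, y), z)) < 1e-12
               for x in BASIS for y in BASIS for z in BASIS)

def sectional(x, y):
    # Corollary 10(2): K = (1/4) <[X,Y],[X,Y]> / (<X,X><Y,Y> - <X,Y>^2).
    q = inner(x, x) * inner(y, y) - inner(x, y) ** 2
    xy = bracket(x, y)
    return 0.25 * inner(xy, xy) / q

def ricci(x, y):
    # Ric(X, Y) = trace of V -> R_XV Y, with R_XV Y = (1/4) [[X, V], Y].
    return sum(0.25 * inner(bracket(bracket(x, v), y), v) for v in BASIS)

def killing(x, y):
    # B(X, Y) = trace(ad_X ad_Y), computed in the orthonormal basis.
    return sum(inner(bracket(x, bracket(y, v)), v) for v in BASIS)
```

Under these assumptions `sectional(E1, E2)` evaluates to 0.25, and `ricci(E1, E1)` agrees with `-killing(E1, E1) / 4`, consistent with consequences (2) and (3).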


(Proof: $\mathrm{Ric}(X, Y) = \mathrm{trace}\{V \to R_{XV}Y\}$, and $R_{XV}Y = \tfrac14 [[X, V], Y] = -\tfrac14 (\mathrm{ad}_Y\, \mathrm{ad}_X)V$.)

A Lie group is semisimple if and only if its Killing form B is nondegenerate. By Lemma 5, the Killing form of G is Ad(G)-invariant. Thus for a semisimple Lie group G, its Killing form provides a bi-invariant metric, and, by (3) above, G is then an Einstein manifold.

To give some further connections between the algebra and the geometry of Lie groups we need: If G is a compact group of linear operators on a real vector space V, then there is a G-invariant inner product on V (every $g: V \to V$ is a linear isometry). For a proof, see [W].

11. Proposition. Let B be the Killing form of a connected Lie group G.
(1) If G is compact, then B is negative semidefinite;
(2) if B is negative definite, then G is compact and has finite fundamental group; and
(3) G is compact and semisimple if and only if B is negative definite.

Proof. (1) Since $g \to \mathrm{Ad}_g$ is a continuous homomorphism, $\mathrm{Ad}(G) = \{\mathrm{Ad}_g : g \in G\}$ is a compact group of linear operators on $\mathfrak g$. Thus, as mentioned above, there is an Ad(G)-invariant inner product $\langle\ ,\ \rangle$ on $\mathfrak g$. By Lemma 3, $\mathrm{ad}_X$ is skew-adjoint relative to $\langle\ ,\ \rangle$ for each $X \in \mathfrak g$. Because $\langle\ ,\ \rangle$ is positive definite, each $\mathrm{ad}_X$ has a skew-symmetric matrix $x_{ij} = -x_{ji}$. But then

$$B(X, X) = \mathrm{trace}(\mathrm{ad}_X)^2 = \sum x_{ij} x_{ji} = -\sum (x_{ij})^2.$$

(2) By the remarks above, - B provides a bi-invariant Riemannian metric on G, making it a complete Einstein manifold of strictly positive scalar curvature. Thus Myers’ theorem (10.24) gives the result. (3) By Exercise 1.12, nondegenerate and semidefinite imply definite.
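The computation in part (1) can be made concrete on the compact group SO(3). The sketch below assumes the standard identification of its Lie algebra with $\mathbf R^3$ under the cross product, so that $\mathrm{ad}_x(v) = x \times v$; the function names are hypothetical.

```python
# Sketch: the Killing form of so(3), realized as R^3 with the cross product.
# Assumption (not from the text): ad_x(v) = x × v in the standard basis,
# which is orthonormal for an Ad-invariant inner product.

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

STANDARD_BASIS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

def ad_matrix(x):
    # Column j holds the coordinates of ad_x(e_j) = x × e_j.
    cols = [cross(x, e) for e in STANDARD_BASIS]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

def is_skew(m):
    return all(m[i][j] == -m[j][i] for i in range(3) for j in range(3))

def killing(x, y):
    # B(x, y) = trace(ad_x ad_y) = sum over i, j of (ad_x)_ij (ad_y)_ji.
    ax, ay = ad_matrix(x), ad_matrix(y)
    return sum(ax[i][j] * ay[j][i] for i in range(3) for j in range(3))

def minus_sum_of_squares(x):
    # The right-hand side -sum (x_ij)^2 from the proof of (1).
    return -sum(entry ** 2 for row in ad_matrix(x) for entry in row)
```

For any `x`, `killing(x, x)` equals `minus_sum_of_squares(x)`, which is negative unless `x` is zero; this is the negative definiteness that part (3) predicts for a compact semisimple group.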

COSET MANIFOLDS

We describe a simple way to build smooth manifolds out of Lie groups. Geometry is not directly involved, so several proofs will be omitted. However, the construction will be an essential part of the Lie description of semi-Riemannian homogeneous and symmetric spaces.

If H is a closed subgroup (Appendix B) of a Lie group G, let G/H be the set of all left cosets gH of H in G. The origin o of G/H is the subgroup H considered as an element of G/H. The projection $\pi: G \to G/H$ sends each $g \in G$ to the coset gH containing it. For each $a \in G$ the translation $\tau_a: G/H \to G/H$ sends each gH to agH. For $a, b \in G$,

$$\pi \circ L_a = \tau_a \circ \pi \qquad \text{and} \qquad \tau_{ab} = \tau_a \circ \tau_b.$$


12. Proposition. If H is a closed subgroup of G, there is a unique way to make G/H a manifold so that the projection $\pi: G \to G/H$ is a submersion.

For proofs of this result and of Proposition 13, see [W]. The manifold G/H so constructed is called a coset manifold. Since $\pi: G \to G/H$ is a submersion, a map $\phi: G/H \to N$ is smooth if and only if $\phi \circ \pi: G \to N$ is smooth (Exercise 1.10). For example, the identity $\tau_a \circ \pi = \pi \circ L_a$ shows that $\tau_a$ is smooth. Hence $\tau_a$ is a diffeomorphism since it has inverse map $\tau_{a^{-1}}$.

Recall from Definition 9.31 the notion of action $G \times M \to M$ of a Lie group on a manifold M. For a coset manifold G/H the map $G \times G/H \to G/H$ sending (a, gH) to agH is called the natural action of G on G/H. Obviously this action is transitive; we shall now see that every transitive action can be so represented.

If $G \times M \to M$ is an action and $o \in M$, the isotropy subgroup $H = \{g \in G : go = o\}$ is a closed subgroup of G. There is a natural map j from the coset manifold G/H into M that sends each coset aH to the point ao. This map is well defined, since

$$aH = bH \ \Rightarrow\ b^{-1}aH = H \ \Rightarrow\ b^{-1}a \in H \ \Rightarrow\ b^{-1}ao = o \ \Rightarrow\ ao = bo.$$

13. Proposition. Let $G \times M \to M$ be a transitive action and let H be its isotropy group at a point $o \in M$. Then the natural map $j: G/H \to M$ (as above) is a diffeomorphism. Hence in particular, the projection $g \to go$ is a submersion $\pi: G \to M$.

14. Examples. Spheres as Coset Manifolds.
(1) $S^n = SO(n+1)/SO(n) = O(n+1)/O(n)$. SO(n + 1) acts on the unit sphere $S^n$ in $\mathbf R^{n+1}$ as a restriction of the usual action of $GL(n+1, \mathbf R)$ on $\mathbf R^{n+1}$. This action of SO(n + 1) is transitive on $S^n$; indeed it is already transitive on the set of positively oriented orthonormal bases for $\mathbf R^{n+1}$. The isotropy subgroup of $(1, 0, \ldots, 0) \in S^n$ consists of all elements of SO(n + 1) of the form $\begin{pmatrix} 1 & 0 \\ 0 & b \end{pmatrix}$, where $b \in SO(n)$. Writing this subgroup as SO(n) and ignoring the natural diffeomorphism j gives $S^n = SO(n+1)/SO(n)$. Neglecting orientation throughout produces the alternative expression $O(n+1)/O(n)$.
(2) $S^{2n+1} = SU(n+1)/SU(n) = U(n+1)/U(n)$. This is the complex analogue of (1), using the natural Hermitian product on $\mathbf C^{n+1} = \mathbf R^{2n+2}$ as in Appendix B.
(3) $S^{4n+3} = Sp(n+1)/Sp(n)$. This is the quaternionic analogue for $\mathbf H^{n+1} \approx \mathbf C^{2n+2} \approx \mathbf R^{4n+4}$. See [St].
(Note that $S^0 = O(1)$, $S^1 = U(1)$, and $S^3 = Sp(1)$. It is known that these are the only spheres that can be made Lie groups.)
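The isotropy computation in Example 14(1) and the well-definedness of the natural map j can be checked numerically for n = 2. In the sketch below the helper names and the sample angles are assumptions chosen for illustration only.

```python
import math

# Sketch: S^2 = SO(3)/SO(2). The isotropy subgroup of o = (1, 0, 0) consists of
# the block matrices (1 0; 0 b) with b in SO(2); here these are rot_x(t).
# Function names and sample angles are illustrative assumptions.

def rot_x(t):
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def act(a, v):
    # The usual action of a 3x3 matrix on a column vector.
    return tuple(sum(a[i][k] * v[k] for k in range(3)) for i in range(3))

def close(u, v, tol=1e-12):
    return all(abs(p - q) <= tol for p, q in zip(u, v))

o = (1.0, 0.0, 0.0)
a = matmul(rot_z(0.7), rot_x(0.3))   # an arbitrary element of SO(3)
h = rot_x(1.1)                       # an element of the isotropy subgroup H

def j(g):
    # The natural map j(gH) = g o, evaluated on a representative g.
    return act(g, o)
```

Here `j(a)` and `j(matmul(a, h))` agree up to rounding, which is the statement that j depends only on the coset aH, and `j(a)` is a unit vector, that is, a point of the sphere.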


Ignoring differentiability for the moment, suppose that a Lie group G acts transitively on a set $\Sigma$. If H is the isotropy group of an element o of $\Sigma$, then we still get the natural one-to-one function j from G/H onto $\Sigma$. If the subgroup H is closed, $\Sigma$ can always be made a manifold by requiring j to be a diffeomorphism. Then the action is smooth, and the projection $g \to go$ is a submersion. Using this construction we now generalize the preceding example in two slightly different ways. Note that the sphere $S^n$ can be considered as the set of all oriented lines through 0 in $\mathbf R^{n+1}$. (Such a line meets the unit sphere in two points; the orientation picks one of them.) The unoriented lines give the projective space $P^n = S^n/\pm 1$.

15. Examples. Real Grassmann Manifolds.
(1) Unoriented p-subspaces of n-space, where n = p + q.

$$G_{pq} = O(p+q)/O(p) \times O(q) = SO(p+q)/S(O(p) \times O(q)).$$

Let $G_{pq}$ be the set of all p-dimensional subspaces of $\mathbf R^n$, n = p + q. O(n) acts in a natural way on $G_{pq}$; we shall show the action is transitive. Let $V_0$ be the subspace of $\mathbf R^n$ spanned by the first p vectors in the canonical basis $e_1, \ldots, e_n$. If $V \in G_{pq}$, choose an orthonormal basis $\{e_i'\}$ for $\mathbf R^n$ whose first p vectors span V. Now let a be the linear operator (matrix) carrying each $e_i$ to $e_i'$. Then $a \in O(n)$ and $a(V_0) = V$. If $V_0$ is invariant under $g \in O(n)$, then so is $V_0^\perp$. Hence the isotropy group of $V_0$ is $O(p) \times O(q)$ considered as all matrices $\begin{pmatrix} g & 0 \\ 0 & h \end{pmatrix}$ with $g \in O(p)$, $h \in O(q)$. This gives the first coset expression for $G_{pq}$. SO(n) is also transitive on $G_{pq}$, for by changing one sign (if necessary), we can arrange that the $e_i'$ basis also be positively oriented. The second coset expression follows.

(2) Oriented p-subspaces of n-space.

$$\tilde G_{pq} = SO(p+q)/SO(p) \times SO(q).$$

Now let $\tilde G_{pq}$ be the set of all oriented p-dimensional subspaces of $\mathbf R^n$. Again SO(n) acts transitively, and for the isotropy group, note that, if $a|V_0$ is orientation-preserving, then so is $a|V_0^\perp$.

In particular, $G_{1n} = P^n$ and $\tilde G_{1n} = S^n$. Since SO(p + q) is compact and connected, so are both $G_{pq}$ and $\tilde G_{pq}$. Since dim G/H = dim G - dim H, both $G_{pq}$ and $\tilde G_{pq}$ have dimension pq. The map $\tilde G_{pq} \to G_{pq}$ that forgets orientation is a smooth two-to-one map, hence necessarily a covering map. It is known that $\pi_1(G_{pq}) \cong \mathbf Z_2$ (see, for example, [St, p. 134]). Hence $\tilde G_{pq}$ is simply connected. The map $W \to W^\perp$ is a diffeomorphism $G_{pq} \approx G_{qp}$; similarly for $\tilde G_{pq}$.
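The dimension count dim G/H = dim G - dim H used above is easy to verify mechanically. The sketch below relies on the standard fact that dim O(n) = n(n - 1)/2; the function names are assumptions for illustration.

```python
# Sketch: dimension of the Grassmann manifold G_pq = O(p+q) / O(p) x O(q),
# using dim O(n) = n(n-1)/2 and dim G/H = dim G - dim H.
# Function names are illustrative, not from the text.

def dim_orthogonal(n):
    # Dimension of O(n), equal to that of SO(n): the skew-symmetric n x n matrices.
    return n * (n - 1) // 2

def dim_grassmann(p, q):
    return dim_orthogonal(p + q) - dim_orthogonal(p) - dim_orthogonal(q)

def dim_sphere_as_coset(n):
    # S^n = SO(n+1)/SO(n), as in Example 14(1).
    return dim_orthogonal(n + 1) - dim_orthogonal(n)
```

For instance, `dim_grassmann(2, 3)` gives 6 = 2 · 3, and `dim_sphere_as_coset(n)` returns n, as expected.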


16. Lemma. Let M = G/H. If the projection $\pi: G \to M$ has a cross section $\lambda: M \to G$, then $\phi(m, h) = \lambda(m)h$ defines a diffeomorphism $\phi: M \times H \to G$.

Proof. (Such a section amounts to a smooth choice of representative for each coset.) If $g \in G$, let $\psi(g) = (\pi g, (\lambda\pi g)^{-1} g)$. Since $\pi \circ \lambda = \mathrm{id}$, the elements $\lambda\pi g$ and g have the same projection in M. Thus they are in the same coset mod H, so $(\lambda\pi g)^{-1} g \in H$. Hence $\psi(g) \in M \times H$. Both $\phi$ and $\psi$ are smooth, and it is easily checked that $\phi\psi$ and $\psi\phi$ are identity maps. ∎
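The maps $\phi$ and $\psi$ of this proof can be written out for a toy case. The example below is an assumption chosen for simplicity and does not appear in the text: G is the multiplicative group of nonzero complex numbers, H the unit circle, and M = G/H is identified with the positive reals through $\pi(g) = |g|$.

```python
# Sketch of Lemma 16 for an assumed toy example (not from the text):
# G = nonzero complex numbers under multiplication, H = unit circle,
# M = G/H identified with the positive reals via proj(g) = |g|.

def proj(g):
    # The projection pi: G -> M.
    return abs(g)

def section(m):
    # A cross section lambda: M -> G, so that proj(section(m)) == m.
    return complex(m, 0.0)

def phi(m, h):
    # phi(m, h) = lambda(m) h
    return section(m) * h

def psi(g):
    # psi(g) = (pi g, (lambda pi g)^{-1} g); the second entry lies in H.
    m = proj(g)
    return m, g / section(m)
```

Composing in either order returns the input up to rounding: `phi(*psi(g))` recovers `g`, and `psi(phi(m, h))` recovers `(m, h)` whenever `h` has modulus 1.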

The topological properties of H, G, and G/H are closely related. In particular, considerable information about connectedness and fundamental groups of these three manifolds is concentrated in the proposition below. Note first that the set $\pi_0(G)$ of connected components of a Lie group G has a natural group structure. In fact, the identity component $G_0$ of G is a normal subgroup (Example B.4(3)), and its cosets are the components of G; thus $\pi_0(G)$ can be defined to be the quotient group $G/G_0$. If $\phi: G \to G'$ is a continuous homomorphism, then $\phi(G_0) \subset G_0'$ and the resulting quotient homomorphism $\phi_0: \pi_0(G) \to \pi_0(G')$ tells how $\phi$ treats components.

17. Proposition. For a coset manifold M = G/H there is an exact sequence of groups and homomorphisms

$$0 \to \pi_2(M) \xrightarrow{\ \partial\ } \pi_1(H) \xrightarrow{\ i_\#\ } \pi_1(G) \xrightarrow{\ \pi_\#\ } \pi_1(M) \xrightarrow{\ \partial\ } \pi_0(H) \xrightarrow{\ i_0\ } \pi_0(G).$$

Furthermore, $i_0$ is onto if M is connected.

Here i is the inclusion map $H \subset G$, and exactness means that the image of each homomorphism equals the kernel of the next homomorphism. This sequence derives from the homotopy sequence of a principal bundle [St], in this case $\pi: G \to M$. Since G is a Lie group, the sequence can be extended to $\pi_2(G)$ ([St, p. 94]), and this final segment can be detached since $\pi_2(G) = 0$.
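To illustrate how the exact sequence is used, here is a worked instance for $S^2 = SO(3)/SO(2)$. It assumes the standard facts $\pi_2(S^2) \cong \mathbf Z$, $\pi_1(S^2) = 0$, $\pi_1 SO(2) \cong \mathbf Z$ (since SO(2) is a circle), and $\pi_1 SO(3) \cong \mathbf Z_2$; none of these is derived here.

```latex
0 \to \pi_2(S^2) \to \pi_1 SO(2) \to \pi_1 SO(3) \to \pi_1(S^2)
\qquad\text{becomes}\qquad
0 \to \mathbf Z \xrightarrow{\ \partial\ } \mathbf Z \to \mathbf Z_2 \to 0 .
```

Exactness forces $\partial$ to be injective with cokernel $\mathbf Z_2$, so under these identifications $\partial$ is multiplication by $\pm 2$.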

18. Lemma. If G/H and H are connected [compact], then G is connected [compact].

Proof. The connectedness assertion follows from the preceding proposition. For the compactness assertion, by applying Lemma 16 to local sections, G can be expressed as a finite union of compact sets $\pi^{-1}K \approx K \times H$. ∎

19. Corollary. (1) The classical groups O(n), SO(n), U(n), SU(n), Sp(n) are compact, and all are connected except O(n), which has components $SO(n) = O^+(n)$ and $O^-(n)$.


(2) SU(n) and Sp(n) are simply connected, while $\pi_1 U(n) \cong \mathbf Z$, and

$$\pi_1 SO(n) \cong \begin{cases} \mathbf Z & \text{if } n = 2, \\ \mathbf Z_2 & \text{if } n \ge 3. \end{cases}$$

Proof. The properties of U(n) will follow from the fact that it is diffeomorphic to $SU(n) \times S^1$ (Exercise 3). The proof is by induction on n in Example 14, using Lemma 18 for (1) and Proposition 17 for (2). In the nontrivial case SO(n) in (2), one needs $\pi_2(S^n) = 0$ for $n \ge 3$ [St], and SO(3) diffeomorphic to $P^3$, hence $\pi_1 SO(3) \cong \mathbf Z_2$. ∎

20. Example.

$$G^-_{pq} = O(p, q)/O(p) \times O(q) = O^{++}(p, q)/SO(p) \times SO(q).$$

For this variant of Example 15, let $G^-_{pq}$ be the set of all p-dimensional negative definite subspaces V of $\mathbf R^n_p$, where n = p + q and 0 < p < n. (Then $V^\perp$ is q-dimensional and positive definite.) The semiorthogonal group O(p, q) is transitive on $G^-_{pq}$. If $V_0$ is the subspace of $\mathbf R^n_p$ spanned by the first p vectors in the canonical basis, then $V_0 \in G^-_{pq}$ and the isotropy group of $V_0$ is $O(p) \times O(q)$ since both $V_0$ and $V_0^\perp$ are definite. By a direct argument as in Example 15 (or by Exercise 6), the identity component $O^{++}(p, q)$ is already transitive on $G^-_{pq}$. In this case the isotropy group of $V_0$ is evidently $SO(p) \times SO(q)$.

Example 39 will show that $G^-_{pq}$ is diffeomorphic to $\mathbf R^{pq}$. Thus it follows from Proposition 17 that $\pi_1 O^{++}(p, q) \cong \pi_1 SO(p) \times \pi_1 SO(q)$. Then the fundamental group of $O^{++}(p, q)$ can be read from Corollary 19(2). ($\pi_1 SO(1)$ is trivial, since SO(1) is a single point.) In particular, $\pi_1 O^{++}(p, q) \cong \mathbf Z_2 \times \mathbf Z_2$ for all $p \ge 3$, $q \ge 3$.

REDUCTIVE HOMOGENEOUS SPACES

If a Lie group G acts on a manifold M, a metric tensor on M is G-invariant provided that, for each $g \in G$, the diffeomorphism $p \to gp$ is an isometry. When the action is transitive, such a metric obviously makes M a semi-Riemannian homogeneous space. It is not hard to show that every semi-Riemannian homogeneous space can be expressed as a coset manifold M = G/H with a G-invariant metric. The geometry of M can then be described in Lie terms; to keep this description simple we specialize somewhat.

21. Definition. A coset manifold M = G/H is reductive if there is an Ad(H)-invariant subspace $\mathfrak m$ of $\mathfrak g$ that is complementary to $\mathfrak h$ in $\mathfrak g$. We call $\mathfrak m$ a Lie subspace for G/H.


Though $\mathfrak g$ is a direct sum $\mathfrak m + \mathfrak h$ of vector spaces, $\mathfrak m$ need not be closed under brackets, as $\mathfrak h$ is. By Corollary 2, the invariance of $\mathfrak m$ under Ad(H) implies $[\mathfrak h, \mathfrak m] \subset \mathfrak m$. For $X \in \mathfrak g$, denote by $X_{\mathfrak h}$ and $X_{\mathfrak m}$ the components of X in $\mathfrak h$ and $\mathfrak m$, respectively.

The differential map $d\pi$ of the projection $\pi: G \to M = G/H$ gives a linear isomorphism of $\mathfrak m$ onto $T_o(M)$. (See Figure 1.) In fact, identifying $\mathfrak h \subset \mathfrak g$ as usual with $T_e(H) \subset T_e(G)$, we have $d\pi(\mathfrak h) = 0$. Since $\pi$ is a submersion, $d\pi$ is onto, and counting dimensions shows $d\pi|\mathfrak m$ is an isomorphism. In effect, $\mathfrak m$ has become the tangent space to M at o.

Figure 1. The Lie algebra $\mathfrak g = \mathfrak h + \mathfrak m$ is identified with $T_e(G)$. Then $d\pi: \mathfrak m \approx T_o(M)$.

22. Proposition. Let M = G/H be reductive, with Lie subspace $\mathfrak m$.

(1) The linear isotropy group $\{d\tau_h|_o : h \in H\}$ acting on $T_o(M)$ corresponds under $d\pi$ to Ad(H) on $\mathfrak m$.
(2) Requiring $d\pi: \mathfrak m \approx T_o(M)$ to be a linear isometry establishes a one-to-one correspondence between Ad(H)-invariant scalar products on $\mathfrak m$ and G-invariant metrics on M.

Proof. (1) The assertion is that $d\tau_h \circ d\pi = d\pi \circ \mathrm{Ad}_h$ on $\mathfrak m$ for all $h \in H$. This is clear, since $\tau_h \circ \pi = \pi \circ C_h$ for $h \in H$, and $\mathrm{Ad}_h(\mathfrak m) \subset \mathfrak m$.

(2) Suppose $\langle\ ,\ \rangle$ is an Ad(H)-invariant scalar product on $\mathfrak m$. Since $d\pi: \mathfrak m \approx T_o(M)$ must be a linear isometry, the scalar product $\langle\ ,\ \rangle_o$ on $T_o(M)$ is determined. But then (1) implies that $d\tau_h: T_o(M) \to T_o(M)$ is a linear isometry for all $h \in H$. This fact will allow $\langle\ ,\ \rangle_o$ to be extended, in a G-invariant way, over all of M. We assert that if $p = \tau_a(o) = \tau_b(o)$, then the linear isomorphisms $d\tau_{a^{-1}}, d\tau_{b^{-1}}: T_p(M) \to T_o(M)$ pull back $\langle\ ,\ \rangle_o$ to the same scalar product $\langle\ ,\ \rangle_p$ on $T_p(M)$. In fact $\tau_a(o) = \tau_b(o)$ means aH = bH, hence $b^{-1}a = h \in H$. Then $\tau_{b^{-1}} = \tau_h \circ \tau_{a^{-1}}$; hence for all $x, y \in T_p(M)$,

$$\langle d\tau_{a^{-1}}(x), d\tau_{a^{-1}}(y)\rangle = \langle d\tau_h\, d\tau_{a^{-1}}(x), d\tau_h\, d\tau_{a^{-1}}(y)\rangle = \langle d\tau_{b^{-1}}(x), d\tau_{b^{-1}}(y)\rangle.$$

The resulting tensor $\langle\ ,\ \rangle$ on M is easily seen to be G-invariant, and its smoothness can be derived from the existence of local sections (Exercise 1.9).

Reciprocally, if $\langle\ ,\ \rangle$ is a G-invariant metric on M, the linear isotropy group $\{d\tau_h|_o : h \in H\}$ consists of linear isometries. Since $d\pi|\mathfrak m$ is required to be a linear isometry, (1) shows that $d\pi$ pulls $\langle\ ,\ \rangle_o$ back to an Ad(H)-invariant scalar product on $\mathfrak m$. ∎

Assertion (2) in the above proposition generalizes readily to give a one-to-one correspondence between Ad(H)-invariant (r, s) tensors on the vector space $\mathfrak m$ and G-invariant (r, s) tensor fields on M.

The scheme is to treat the geometry of coset manifolds G/H as a generalization of the geometry of Lie groups G (since G/H reduces to G when H = {e}). From this viewpoint, the isomorphism $\mathfrak m \approx T_o(G/H)$ generalizes the canonical isomorphism $\mathfrak g \approx T_e(G)$, and a G-invariant metric on G/H generalizes a left-invariant metric on G. The notion of bi-invariant metric on G generalizes as follows.

23. Definition. A naturally reductive homogeneous space is a reductive coset manifold M = G/H furnished with a G-invariant metric such that, for the corresponding scalar product on the Lie subspace $\mathfrak m$,

$$\langle [X, Y]_{\mathfrak m}, Z\rangle = \langle X, [Y, Z]_{\mathfrak m}\rangle \qquad \text{for } X, Y, Z \in \mathfrak m.$$

In fact, when H = {e}, hence $\mathfrak m = \mathfrak g$, this formula is just the shift condition in Proposition 9(4).

To determine the geodesics and curvature of a naturally reductive homogeneous space M = G/H we shall use submersion results from Chapter 7. The Lie subspace $\mathfrak m$ has a scalar product; extend this to $\mathfrak g = \mathfrak h + \mathfrak m$ by picking any scalar product on $\mathfrak h$ and defining $\mathfrak h \perp \mathfrak m$. This furnishes G with a left-invariant metric for which elements of $\mathfrak h$ are vertical (tangent to fibers) and elements of $\mathfrak m$ are horizontal (normal to fibers).

24. Lemma. With notation as above, $\pi: G \to M$ is a semi-Riemannian submersion.

Proof. In Definition 7.44, the condition (S1) holds because $T_e(H) \approx \mathfrak h$ is by construction a nondegenerate subspace of $T_e(G) \approx \mathfrak g$, and hence left-multiplication by $g \in G$ shows that $T_g(gH)$ is nondegenerate in $T_g(G)$.

To prove (S2), if $g \in G$, let $\mathcal H_g$ as usual be the space of horizontal vectors (normal to gH) in $T_g(G)$. Then $dL_{g^{-1}}$ carries $\mathcal H_g$ to $\mathcal H_e$, and the identity $\tau_g \circ \pi = \pi \circ L_g$ shows that $d\pi: \mathcal H_g \to T_{\pi g}(M)$ can be expressed as a composition of linear isometries $d\tau_g \circ d\pi \circ dL_{g^{-1}}$. ∎

25. Proposition. If M = G/H is naturally reductive, its geodesics starting at o are given by

$$\gamma_{d\pi X}(t) = \alpha(t)o = \pi\alpha(t) \qquad \text{for all } t \in \mathbf R,$$

where $\alpha$ is the one-parameter subgroup of $X \in \mathfrak m$.

Proof. By Corollary 7.46, a submersion carries horizontal geodesics to geodesics, and $\alpha$ is horizontal since it is an integral curve of the horizontal vector field $X \in \mathfrak m$. We shall show that if $X, Y \in \mathfrak m$, then $D_X Y = \tfrac12 [X, Y]$. Thus $\alpha$ is geodesic, since $\alpha'' = D_X X = 0$.

Both $\mathfrak m$ and the scalar product on it are Ad(H)-invariant. Hence by a variant of Lemma 3,

$$\langle [X, V], Y\rangle = \langle X, [V, Y]\rangle \qquad \text{if } X, Y \in \mathfrak m \text{ and } V \in \mathfrak h.$$

We call this identity an $\mathfrak h$-shift, and the one in Definition 23 an $\mathfrak m$-shift. If $X, Y \in \mathfrak m$, but $W \in \mathfrak g$, the Koszul formula gives

$$2\langle D_X Y, W\rangle = -\langle X, [Y, W]\rangle + \langle Y, [W, X]\rangle + \langle W, [X, Y]\rangle.$$

If $W \in \mathfrak m$, an $\mathfrak m$-shift in the second term cancels the first. If $W \in \mathfrak h$, then an $\mathfrak h$-shift produces the same result. Hence $D_X Y = [X, Y]/2$. ∎

Naturally reductive homogeneous spaces are complete. In fact, one-parameter subgroups are defined on the whole real line; by the preceding proposition the same is true for inextendible geodesics through o, and, by homogeneity, for all geodesics.

We now find a Lie algebra formula for curvature.

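26. Proposition. Let M = G/H be a naturally reductive homogeneous space. If X and Y span a nondegenerate plane in $\mathfrak m$, then

$$K(X, Y) = \frac{\tfrac14 \langle [X, Y]_{\mathfrak m}, [X, Y]_{\mathfrak m}\rangle + \langle [[X, Y]_{\mathfrak h}, X], Y\rangle}{Q(X, Y)}.$$

Proof. Continuing in the context of the preceding proof, $\pi: G \to M$ is a semi-Riemannian submersion, and for any $Z \in \mathfrak g$ the vertical component is $Z_{\mathfrak h}$. Hence by Theorem 7.47,

$$K(X, Y) = \tilde K(X, Y) + \frac34\, \frac{\langle [X, Y]_{\mathfrak h}, [X, Y]_{\mathfrak h}\rangle}{Q(X, Y)}, \qquad (1)$$

where $\tilde K$ is the sectional curvature of G. The previous proof showed that $D_X Y = [X, Y]/2$ for $X, Y \in \mathfrak m$, and it follows that

$$\langle R_{XY}X, Y\rangle = \langle D_{[X,Y]_{\mathfrak h}} X, Y\rangle + \langle D_{[X,Y]_{\mathfrak m}} X, Y\rangle - \langle D_X D_Y X, Y\rangle. \qquad (2)$$

Now

$$\langle D_X D_Y X, Y\rangle = -\langle D_Y X, D_X Y\rangle = \tfrac14 \langle [X, Y], [X, Y]\rangle = \tfrac14 \langle [X, Y]_{\mathfrak h}, [X, Y]_{\mathfrak h}\rangle + \tfrac14 \langle [X, Y]_{\mathfrak m}, [X, Y]_{\mathfrak m}\rangle, \qquad (3)$$

since $\mathfrak h \perp \mathfrak m$. Using an $\mathfrak m$-shift gives

$$\langle D_{[X,Y]_{\mathfrak m}} X, Y\rangle = \tfrac12 \langle [[X, Y]_{\mathfrak m}, X], Y\rangle = \tfrac12 \langle [X, Y]_{\mathfrak m}, [X, Y]\rangle. \qquad (4)$$

Abbreviate $[X, Y]_{\mathfrak h}$ to V; then by the Koszul formula,

$$2\langle D_V X, Y\rangle = -\langle V, [X, Y]\rangle + \langle X, [Y, V]\rangle + \langle Y, [V, X]\rangle.$$

By an $\mathfrak h$-shift the last two terms on the right are equal, hence

$$\langle D_{[X,Y]_{\mathfrak h}} X, Y\rangle = -\tfrac12 \langle [X, Y]_{\mathfrak h}, [X, Y]\rangle + \langle [[X, Y]_{\mathfrak h}, X], Y\rangle. \qquad (5)$$

To the contributions from (3), (4), and (5) add $\tfrac34 \langle [X, Y]_{\mathfrak h}, [X, Y]_{\mathfrak h}\rangle$ from (1). Terms of the latter type cancel, leaving the required result. ∎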
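The final bookkeeping step of this proof can be spelled out. The abbreviations a, b, c below are introduced only for this sketch, and the coefficients are those obtained from the submersion formula, the Koszul formula, and the two shift identities, with $\mathfrak h \perp \mathfrak m$ used to drop mixed terms.

```latex
% Abbreviations (assumed for this sketch only):
%   a = <[X,Y]_h, [X,Y]_h>,  b = <[X,Y]_m, [X,Y]_m>,  c = <[[X,Y]_h, X], Y>.
\begin{aligned}
\langle D_X D_Y X, Y\rangle &= \tfrac14 a + \tfrac14 b, \\
\langle D_{[X,Y]_{\mathfrak m}} X, Y\rangle &= \tfrac12 b, \\
\langle D_{[X,Y]_{\mathfrak h}} X, Y\rangle &= -\tfrac12 a + c, \\[4pt]
\langle \tilde R_{XY} X, Y\rangle
  &= \bigl(-\tfrac12 a + c\bigr) + \tfrac12 b - \bigl(\tfrac14 a + \tfrac14 b\bigr)
   = -\tfrac34 a + \tfrac14 b + c, \\[4pt]
K(X,Y)\, Q(X,Y) &= \bigl(-\tfrac34 a + \tfrac14 b + c\bigr) + \tfrac34 a
   = \tfrac14 b + c .
\end{aligned}
```

The $\tfrac34 a$ contributed by the submersion term exactly cancels the vertical contributions, which is why only the $\mathfrak m$-component of [X, Y] and the double bracket through $\mathfrak h$ survive.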

A point p of a semi-Riemannian manifold is a pole provided the exponential map $\exp_p$ is a diffeomorphism. The vertex of a paraboloid of revolution is a pole, and in Hadamard's theorem (10.22) every point of M is a pole. For a homogeneous space, if one point is a pole, every point is.

27. Lemma. If M = G/H is a naturally reductive homogeneous space for which o is a pole, then the map $(X, h) \to (\exp X)h$ is a diffeomorphism of $\mathfrak m \times H$ onto G.

Proof. By Proposition 25, if $\exp: \mathfrak g \to G$ is the Lie exponential map (Appendix B), then

$$\exp_o \circ\, d\pi = \pi \circ \exp: \mathfrak m \to M.$$

Call this map $E: \mathfrak m \to M$. By hypothesis, $\exp_o: T_o(M) \to M$ is a diffeomorphism, and since $d\pi: \mathfrak m \to T_o(M)$ is a linear isomorphism, it is also a diffeomorphism. Thus E is a diffeomorphism. Now $\lambda = \exp \circ E^{-1}: M \to G$ is smooth, and $\pi \circ \lambda = \pi \circ \exp \circ E^{-1} = E \circ E^{-1} = \mathrm{id}$, so $\lambda$ is a section of $\pi: G \to M$. By Lemma 16, the map $(p, h) \to \lambda(p)h$ is a diffeomorphism of $M \times H$ onto G. Following $E \times \mathrm{id}$ by this diffeomorphism gives the required diffeomorphism $\mathfrak m \times H \to G$, since $\lambda(EX) \cdot h = (\exp X)h$. ∎


SYMMETRIC SPACES

First we show how to express a given semi-Riemannian symmetric space M in terms of Lie groups. Since M is homogeneous, I(M) is transitive on M; hence by Exercise 6 the identity component $G = I_0(M)$ is transitive. Thus M can be identified with a coset manifold G/H, where H is the isotropy group of a point o of M. The global symmetry at o then gives further structure.

28. Lemma. With notation as above, if $\zeta$ is the global symmetry of M = G/H at o, the map $\sigma$ sending g to $\zeta g \zeta$ is an involutive automorphism of G. The set $F = \mathrm{Fix}(\sigma) = \{g \in G : \sigma(g) = g\}$ of fixed points of $\sigma$ is a closed subgroup of G such that $F_0 \subset H \subset F$.

Proof. Since $\zeta$ is involutive, $\zeta^{-1} = \zeta$. Thus $\sigma$ is conjugation by $\zeta$, so $\sigma$ is an involutive automorphism. Consequently, $\sigma$ carries $I_0(M)$ to itself, and F is a closed subgroup of G (Example B.4).

If $h \in H$, then the differential map of the isometry $\sigma(h)$ at o is $d\zeta_o\, dh_o\, d\zeta_o$, which is just $dh_o$, since $d\zeta_o = -\mathrm{id}$. Because M is connected, $\sigma(h) = h$ by Proposition 3.62. Thus $H \subset F$.

To show $F_0 \subset H$, recall that since $F_0$ is connected, by Appendix B it is generated by the points $\alpha(t)$ of the one-parameter subgroups of F. Thus it suffices to show that $\alpha(t) \in H$. But $\sigma(\alpha(t)) = \alpha(t)$, and hence $\zeta$ and $\alpha(t)$ commute. Thus

$$\zeta(\alpha(t)o) = \alpha(t)\zeta(o) = \alpha(t)o \qquad \text{for all } t.$$

Since o is an isolated fixed point of the symmetry $\zeta$, it follows that $\alpha(t)o = o$ for |t| small, hence for all t. Thus $\alpha(t)$ is in the isotropy group H of o. ∎

The preceding makes it clear how to construct symmetric spaces from Lie group data.

29. Theorem. Let H be a closed subgroup of a connected Lie group G. Let $\sigma$ be an involutive automorphism of G such that $F_0 \subset H \subset F = \mathrm{Fix}(\sigma)$. Then any G-invariant metric tensor on M = G/H makes M a semi-Riemannian symmetric space such that $\zeta \circ \pi = \pi \circ \sigma$, where $\zeta$ is the global symmetry of M at o and $\pi$ is the projection $G \to M$.

Proof. (a) There is a unique function $\zeta: M \to M$ such that $\zeta \circ \pi = \pi \circ \sigma$. If $g \in G$, then $\zeta(\pi g) = \pi(\sigma g)$ is a consistent definition, because $\pi g_1 = \pi g_2$ means $g_1 H = g_2 H$, and, since $\sigma$ fixes H, then $\sigma(g_1)H = \sigma(g_2)H$; that is, $\pi\sigma g_1 = \pi\sigma g_2$.

(b) $\zeta$ is a diffeomorphism. That $\zeta$ is smooth derives as usual from the existence of local sections of the submersion $\pi$. Because $\sigma$ is involutive, it follows that $\zeta$ is involutive, hence $\zeta^{-1} = \zeta$.


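(c) $d\zeta_o = -\mathrm{id}$. Clearly $\zeta(o) = o$. If $y \in T_o(M)$, we anticipate (2) of the next lemma, which implies that there is a $Y \in \mathfrak g$ such that $d\sigma(Y) = -Y$ and $d\pi(Y) = y$. Thus

$$d\zeta(y) = d\zeta(d\pi Y) = d\pi(d\sigma Y) = d\pi(-Y) = -y.$$

(d) $\tau_{\sigma g} = \zeta \tau_g \zeta$ for $g \in G$. In fact, for all $a \in G$,

$$\zeta\tau_g\pi a = \zeta\pi(ga) = \pi\sigma(ga) = \pi(\sigma g \cdot \sigma a) = \tau_{\sigma g}\pi(\sigma a) = \tau_{\sigma g}\zeta\pi a.$$

(e) Relative to any G-invariant metric tensor on M, $\zeta$ is an isometry. If $v \in T_{\pi g}(M)$, let $v_o = d\tau_{g^{-1}}(v) \in T_o(M)$. Then using (d) and (c),

$$\langle d\zeta v, d\zeta v\rangle = \langle d\zeta\, d\tau_g(v_o), d\zeta\, d\tau_g(v_o)\rangle = \langle d\tau_{\sigma g}\, d\zeta(v_o), d\tau_{\sigma g}\, d\zeta(v_o)\rangle = \langle d\zeta(v_o), d\zeta(v_o)\rangle = \langle -v_o, -v_o\rangle = \langle v, v\rangle.$$

The proof of the theorem is completed by observing that if a homogeneous space has a global symmetry $\zeta$ at a single point o, it has one at every point $p = \tau_g(o)$, namely $\tau_g \zeta \tau_{g^{-1}}$. ∎

The existence of the automorphism $\sigma$ produces striking effects on the Lie algebra $\mathfrak g$.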

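30. Lemma. Let $H \subset G$ and $\sigma$ be as in the preceding theorem, with $\mathfrak h \subset \mathfrak g$ the Lie algebras of $H \subset G$. Then
(1) $\mathfrak h = \{X \in \mathfrak g : d\sigma(X) = X\}$.
(2) $\mathfrak g$ is the direct sum of $\mathfrak h$ and the subspace $\mathfrak m = \{X \in \mathfrak g : d\sigma(X) = -X\}$.
(3) $\mathrm{Ad}_h(\mathfrak m) \subset \mathfrak m$ for all $h \in H$.
(4) $[\mathfrak h, \mathfrak h] \subset \mathfrak h$, $[\mathfrak h, \mathfrak m] \subset \mathfrak m$, $[\mathfrak m, \mathfrak m] \subset \mathfrak h$.

Proof. (1) Since $\sigma|H = \mathrm{id}$, if $X \in \mathfrak h$, then $d\sigma(X) = X$. Conversely, suppose $d\sigma(X) = X$. If $\alpha$ is the one-parameter subgroup of X, then $\alpha$ and $\sigma \circ \alpha$ have the same initial velocity. But $\sigma \circ \alpha$ is also a one-parameter subgroup, hence $\sigma \circ \alpha = \alpha$. This means that $\alpha$ lies in F, in fact in its identity component $F_0$. Since $F_0 \subset H$, we get $X \in \mathfrak h$.

(2) For $X \in \mathfrak g$ let $X_{\mathfrak h} = (X + d\sigma X)/2$ and $X_{\mathfrak m} = (X - d\sigma X)/2$. Then $X = X_{\mathfrak h} + X_{\mathfrak m}$. Because $\sigma$ is involutive, so is $d\sigma$; hence $d\sigma(X_{\mathfrak h}) = X_{\mathfrak h}$ and $d\sigma(X_{\mathfrak m}) = -X_{\mathfrak m}$. Thus $\mathfrak g = \mathfrak h + \mathfrak m$, and the sum is direct, since evidently $\mathfrak h \cap \mathfrak m = 0$.

(3) If $X \in \mathfrak m$ and $h \in H$, we must show that $d\sigma(\mathrm{Ad}_h X) = -\mathrm{Ad}_h X$. Since $\sigma(h) = h$, the automorphisms $\sigma$ and $C_h$ commute; in fact, $\sigma C_h(a) = \sigma(hah^{-1}) = h\sigma(a)h^{-1}$. Thus

$$d\sigma(\mathrm{Ad}_h X) = d(\sigma C_h)X = d(C_h \sigma)(X) = \mathrm{Ad}_h\, d\sigma X = \mathrm{Ad}_h(-X) = -\mathrm{Ad}_h X.$$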


(4) The first inclusion holds since H is a Lie subgroup, the second since $\mathfrak m$ is Ad(H)-invariant, but all three are easy consequences of the fact that $\mathfrak h$ and $\mathfrak m$ are the +1 and -1 eigenspaces of $d\sigma$, respectively. For example, if $X, Y \in \mathfrak m$, then

$$d\sigma[X, Y] = [d\sigma X, d\sigma Y] = [-X, -Y] = [X, Y].$$

Hence $[X, Y] \in \mathfrak h$. ∎
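The eigenspace decomposition and the bracket relations of this lemma can be tested on a small matrix example. The sketch below takes G = SO(3) with $\sigma$ given by conjugation by the diagonal matrix diag(1, -1, -1); this choice, the basis `E1, E2, E3`, and the helper names are assumptions for illustration.

```python
# Sketch: Lemma 30 for G = SO(3), with sigma = conjugation by ZETA = diag(1, -1, -1).
# Assumed names (not from the text): E1, E2, E3, d_sigma, split, in_h, in_m.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

ZETA = [[1, 0, 0], [0, -1, 0], [0, 0, -1]]
E1 = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
E2 = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
E3 = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]

def d_sigma(x):
    # sigma is conjugation by ZETA = ZETA^{-1}; on matrices so is d_sigma.
    return matmul(matmul(ZETA, x), ZETA)

def in_h(x):
    # +1 eigenspace of d_sigma.
    return d_sigma(x) == x

def in_m(x):
    # -1 eigenspace of d_sigma.
    return d_sigma(x) == [[-v for v in row] for row in x]

def split(x):
    # X_h = (X + d_sigma X)/2 and X_m = (X - d_sigma X)/2, as in part (2).
    sx = d_sigma(x)
    x_h = [[(x[i][j] + sx[i][j]) / 2 for j in range(3)] for i in range(3)]
    x_m = [[(x[i][j] - sx[i][j]) / 2 for j in range(3)] for i in range(3)]
    return x_h, x_m
```

In this example `E1` spans the +1 eigenspace and `E2`, `E3` span the -1 eigenspace; the bracket of `E1` with `E2` stays in the -1 eigenspace, while the bracket of `E2` with `E3` lands in the +1 eigenspace, matching part (4).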

We call $(G/H, \sigma, B)$ symmetric data provided that
(1) H is a closed subgroup of a connected Lie group G.
(2) $\sigma$ is an involutive automorphism of G such that $F_0 \subset H \subset F = \mathrm{Fix}(\sigma)$.
(3) B is an Ad(H)-invariant scalar product on $\mathfrak m = \{X \in \mathfrak g : d\sigma X = -X\}$.

Theorem 29 shows how such data make G/H a symmetric space under the G-invariant metric corresponding (as in Proposition 22) to B. This construction is presumed when we say that M = G/H is a symmetric space. G/H is then a naturally reductive homogeneous space with $\mathfrak m = \{X : d\sigma X = -X\}$ as Lie subspace. In fact, by Lemma 30, $\mathfrak m$ is an Ad(H)-invariant complement to $\mathfrak h$, and the shift condition in Definition 23 is trivial since $[\mathfrak m, \mathfrak m] \subset \mathfrak h$. In this context, $\mathfrak m$ will always denote the -1 eigenspace of $d\sigma$, as above.

31. Proposition. Let M = G/H be a semi-Riemannian symmetric space.

(1) The geodesics starting at o are given by

$$\gamma_{d\pi X}(t) = \alpha(t)o = \pi\alpha(t) \qquad \text{for all } t,$$

where $\alpha$ is the one-parameter subgroup of $X \in \mathfrak m$.

(2) The curvature tensor at o is given by $R_{xy}z = d\pi[[X, Y], Z]$, where $x, y, z \in T_o(M)$ correspond under $d\pi$ to $X, Y, Z \in \mathfrak m$. If x and y span a nondegenerate plane, then

$$K(x, y) = \langle [[X, Y], X], Y\rangle / Q(X, Y).$$

Proof. Since M = G/H is naturally reductive with Lie subspace $\mathfrak m$, Proposition 25 gives (1). Proposition 26 gives the sectional curvature formula in (2), since $[X, Y] \in \mathfrak h$ for $X, Y \in \mathfrak m$. The formula for the curvature operator will follow from Corollary 3.42 once we check that the multilinear function

$$(X, Y, Z, W) \to \langle [[X, Y], Z], W\rangle$$

is curvaturelike on $\mathfrak m$. Obviously it is skew-symmetric in X and Y. Cyclic symmetry in X, Y, Z is just the Jacobi identity. Finally, since $[X, Y] \in \mathfrak h$, skew-symmetry in Z and W follows from the fact that $\mathfrak m$ and the scalar product on it are both Ad(H)-invariant. ∎


To illustrate the theory we now take a well-known symmetric space, but use its geometry only to set up appropriate symmetric data, and from this deduce in particular its geodesics and curvature.

32. Example. $S^n = SO(n+1)/SO(n)$ as in Example 14. As the unit sphere in $\mathbf R^{n+1}$, $S^n$ is symmetric, with symmetry $\zeta$ at $o = (1, 0, \ldots, 0)$ given by $(t_0, t_1, \ldots, t_n) \to (t_0, -t_1, \ldots, -t_n)$.

(1) The automorphism $\sigma$ of SO(n + 1). By the column vector conventions, $\zeta$ can be regarded as the diagonal matrix with entries 1, -1, ..., -1. Thus by Lemma 28, if $a \in SO(n+1)$,

$$\sigma(a) = \zeta a \zeta.$$

So $\mathrm{Fix}(\sigma)$ is $S(O(1) \times O(n))$, and $F_0$ is the isotropy group $1 \times SO(n) \approx SO(n)$.

(2) The subspace $\mathfrak m = \{X \in \mathfrak o(n+1) : d\sigma X = -X\}$. Since $\zeta = \zeta^{-1}$, $\sigma$ is conjugation by $\zeta$. So by Lemma 6(1), $d\sigma$ is also conjugation by $\zeta$ on the Lie algebra $\mathfrak o(n+1)$. It follows that $\mathfrak m$ consists of all matrices

$$X = \begin{pmatrix} 0 & -{}^t x \\ x & 0 \end{pmatrix},$$

where the lower right 0 is the $n \times n$ zero matrix and x is an arbitrary column vector, regarded as usual as an element of $\mathbf R^n$. Write $X \leftrightarrow x$ for the resulting correspondence between $\mathbf R^n$ and $\mathfrak m$.

(3) The Ad(H)-invariant inner product on $\mathfrak m$. Under $X \leftrightarrow x$, the dot product $x \cdot y$ on $\mathbf R^n$ corresponds to $B(X, Y) = -\tfrac12\, \mathrm{trace}\, XY = \tfrac12\, X \cdot Y$ on $\mathfrak m$. Here $H = SO(n) \subset SO(n+1)$ and we know that the trace is Ad(SO(n + 1))-invariant. It will follow from (5) that the corresponding metric tensor on $S^n$ is its usual one.

(4) Geodesics. Let $\gamma$ be a geodesic of $S^n$ starting at o. By Proposition 31, $\gamma(t) = \exp(tX)o$ for some $X \in \mathfrak m$. Direct computation of $\gamma(t)$ using the power series for $\exp(tX) = e^{tX}$ shows that $\gamma$ is the great circle parametrization $(\cos |x|t)\,o + (\sin |x|t)\,x/|x|$, where $X \leftrightarrow x$.

(5) Identifications. In (3), $\mathbf R^n$ is tacitly identified with the last n coordinate space of $\mathbf R^{n+1}$. Hence canonical isomorphism identifies $T_o(S^n)$ with $\mathbf R^n$. Then in the notation of (3), $x = \gamma'(0)$. But X is the initial velocity of the one-parameter subgroup projecting to $\gamma$. Hence in these terms, the isomorphism $d\pi: \mathfrak m \approx T_o(S^n)$ is just $X \leftrightarrow x$.


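(6) Linear isotropy. Apply Proposition 22(1). If $h = \begin{pmatrix} 1 & 0 \\ 0 & b \end{pmatrix} \in H = SO(n)$ and $X \in \mathfrak m$, then

$$\mathrm{Ad}_h X = hXh^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & b \end{pmatrix} \begin{pmatrix} 0 & -{}^t x \\ x & 0 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & b^{-1} \end{pmatrix} = \begin{pmatrix} 0 & -{}^t(bx) \\ bx & 0 \end{pmatrix}$$

(skew-symmetric since ${}^t b = b^{-1}$). Thus the linear isotropy action of H on $T_o(S^n)$ is, via the identifications, just the usual action of SO(n) on $\mathbf R^n$. This implies, for example, that $S^n$ is frame-homogeneous.

(7) Curvature. In terms of the Lie subspace $\mathfrak m$, Proposition 31 asserts that $R_{XY}Z = [[X, Y], Z]$. If, as usual, x, y, z are the corresponding vectors in $\mathbf R^n \approx T_o(S^n)$, we readily compute first

$$[X, Y] = \begin{pmatrix} 0 & 0 \\ 0 & S \end{pmatrix}, \qquad \text{where } S = (y_i x_j - x_i y_j),$$

and then

$$R_{XY}Z = \begin{pmatrix} 0 & -{}^t(Sz) \\ Sz & 0 \end{pmatrix}.$$

Thus $R_{XY}Z$ corresponds under identification to Sz, which is just $(x \cdot z)y - (y \cdot z)x$. Hence $S^n$ has constant curvature 1.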
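Both the curvature formula in (7) and the geodesic formula in (4) can be checked numerically for n = 2. In the sketch below, the helper names (`to_m`, `from_m`, `expm`) and the truncated power series for the matrix exponential are assumptions of this illustration; the matrix convention is that x corresponds to the 3 × 3 matrix with first column (0, x) and first row (0, -x).

```python
import math

# Sketch for n = 2: compare [[X, Y], Z] with (x.z) y - (y.z) x, and
# exp(tX) o with a great circle. Names to_m, from_m, expm are assumptions.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(len(a))] for i in range(len(a))]

def to_m(x):
    # x in R^2  ->  X = (0 -x^T; x 0), an element of m inside o(3).
    return [[0.0, -x[0], -x[1]], [x[0], 0.0, 0.0], [x[1], 0.0, 0.0]]

def from_m(X):
    return (X[1][0], X[2][0])

def curvature(x, y, z):
    # R_XY Z = [[X, Y], Z], read back as a vector in R^2.
    return from_m(bracket(bracket(to_m(x), to_m(y)), to_m(z)))

def expected_curvature(x, y, z):
    xz = x[0] * z[0] + x[1] * z[1]
    yz = y[0] * z[0] + y[1] * z[1]
    return (xz * y[0] - yz * x[0], xz * y[1] - yz * x[1])

def expm(X, terms=40):
    # Truncated power series for the matrix exponential.
    n = len(X)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def geodesic(x, t):
    # gamma(t) = exp(tX) o with o = (1, 0, 0): the first column of exp(tX).
    E = expm([[t * v for v in row] for row in to_m(x)])
    return (E[0][0], E[1][0], E[2][0])

def great_circle(x, t):
    r = math.hypot(x[0], x[1])
    return (math.cos(r * t), math.sin(r * t) * x[0] / r, math.sin(r * t) * x[1] / r)
```

The two curvature functions agree on sample vectors, and `geodesic(x, t)` matches `great_circle(x, t)` to within rounding. Note that the circle is traversed at speed |x|; for a unit vector x it reduces to (cos t)o + (sin t)x.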

33. Remark. Normal Symmetric Spaces. Let G/H and $\sigma$ be as usual for symmetric data, but let B be a scalar product on $\mathfrak g$ that is invariant under both Ad G and $d\sigma$. (If G is semisimple, then by Lemma 5 its Killing form is such a scalar product.) Then $\mathfrak h \perp \mathfrak m$ relative to B. In fact, if $X \in \mathfrak m$ and $V \in \mathfrak h$, then $B(X, V) = B(d\sigma X, d\sigma V) = B(-X, V) = -B(X, V)$. It follows that $\mathfrak h$ and $\mathfrak m$ are nondegenerate relative to B. Hence $B|\mathfrak m$ is an Ad(H)-invariant scalar product on $\mathfrak m$, so the data $(G/H, \sigma, B|\mathfrak m)$ make M = G/H a symmetric space. Furthermore, because of the shift property in Lemma 3, the curvature formula in Proposition 31 simplifies to

$$K(X, Y) = B([X, Y], [X, Y]) / Q(X, Y),$$

where X, Y span a nondegenerate plane in $\mathfrak m \approx T_o(M)$.

RIEMANNIAN SYMMETRIC SPACES

In the Riemannian case the study of symmetric spaces is concentrated on the following extreme types:

34. Definition. A Riemannian symmetric space M = G/H is of compact type if the Killing form B of G is negative definite, and of noncompact type if B is negative definite on $\mathfrak h$ and positive definite on $\mathfrak m$.
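A small computation shows what the noncompact-type condition looks like in practice. The example below, G = SL(2, R) with H = SO(2) and $\sigma(a) = {}^t a^{-1}$, is an assumption chosen for illustration and is not taken from the text; elements of the Lie algebra are written as coordinate triples in the basis (H, E, F) with [H, E] = 2E, [H, F] = -2F, [E, F] = H.

```python
# Sketch: Killing form of sl(2, R) in the basis (H, E, F), with
# [H, E] = 2E, [H, F] = -2F, [E, F] = H. Elements are triples (h, e, f).
# The choice G = SL(2, R), H = SO(2) is an assumed example, not from the text.

def bracket(x, y):
    h1, e1, f1 = x
    h2, e2, f2 = y
    return (e1 * f2 - f1 * e2,
            2 * (h1 * e2 - e1 * h2),
            -2 * (h1 * f2 - f1 * h2))

BASIS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

def killing(x, y):
    # B(x, y) = trace(ad_x ad_y): sum the i-th coordinate of [x, [y, b_i]].
    return sum(bracket(x, bracket(y, b))[i] for i, b in enumerate(BASIS))

# d_sigma(X) = -X^T sends H -> -H, E -> -F, F -> -E, so:
K = (0, 1, -1)    # E - F spans the +1 eigenspace (the Lie algebra of SO(2))
P1 = (1, 0, 0)    # H
P2 = (0, 1, 1)    # E + F; P1 and P2 span the -1 eigenspace m
```

Here `killing(K, K)` is negative while `killing(P1, P1)` and `killing(P2, P2)` are positive, and the mixed values vanish, so the Killing form is negative definite on the +1 eigenspace and positive definite on the -1 eigenspace, as the definition of noncompact type requires.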


In fact, every simply connected Riemannian symmetric space can be expressed as a product whose factors are compact, noncompact, or Euclidean [H]. The topological and geometrical properties of these types are quite distinctive. (Below, we write, for example, Ric > 0 to mean that the associated quadratic form is positive definite.)

35. Theorem. Let M = G/H be a Riemannian symmetric space.

(1) If M is of compact type, then $K \ge 0$ and Ric > 0, hence M is compact and $\pi_1(M)$ is finite.
(2) If M is of noncompact type, then $K \le 0$ and Ric < 0, hence M is diffeomorphic to Euclidean space $\mathbf R^n$ (noncompact, simply connected). Furthermore, G is diffeomorphic to $H \times \mathbf R^n$.

Proof. Once the curvature assertions have been proved, the topological consequences readily follow:

For (1), Myers' theorem can be applied. In fact, $M$ is complete and, since Ric is positive definite, $\mathrm{Ric}(u, u) \ge a > 0$ holds on the unit sphere in some one tangent space $T_p(M)$; hence by the homogeneity of $M$ it holds everywhere.

For (2), Proposition 36 will show that $M$ is simply connected. Then by Hadamard's theorem (10.22) $M$ is diffeomorphic to $R^n$ and furthermore has poles, so Lemma 27 gives the assertion about $G$.

We shall compute curvature only in the commonly occurring special case in which the metric tensor on $M$ derives from an inner product $cB|_{\mathfrak{m}}$, where $B$ is the Killing form of $G$, and $c$ is a constant with $c > 0$ if $M$ has noncompact type, but $c < 0$ if $M$ has compact type. (The general case uses also some linear algebra like that in the proof of Lemma 9.15; see [H], for example.) Thus by Remark 33,

$$K(X, Y) = c\,B([X, Y], [X, Y]) / Q(X, Y) \quad \text{for } X, Y \in \mathfrak{m}.$$

Then $[X, Y] \in \mathfrak{h}$, so Definition 34 gives $K \ge 0$ in the compact case, $K \le 0$ in the noncompact case. Since $M$ is Riemannian, it follows that $\mathrm{Ric} \ge 0$ and $\mathrm{Ric} \le 0$, respectively. Thus it remains to show that $\mathrm{Ric}(X, X) = 0$ implies $X = 0$.

Let $X, E_2, \dots, E_n$ be an orthonormal basis for $\mathfrak{m}$. In neither case does $K$ change sign; hence,

$$\mathrm{Ric}(X, X) = 0 \;\Rightarrow\; K(X, E_i) = 0 \text{ for all } i \;\Rightarrow\; B([X, E_i], [X, E_i]) = 0 \text{ for all } i \;\Rightarrow\; B([X, Y], [X, Y]) = 0 \text{ for all } Y \in \mathfrak{m}.$$

Since each $[X, Y]$ lies in $\mathfrak{h}$, on which $B$ is definite (Definition 34), it follows that $[X, Y] = 0$ for all $Y$; that is, $\mathrm{ad}_X(\mathfrak{m}) = 0$. Since $\mathrm{ad}_X \mathrm{ad}_X(\mathfrak{h}) = \mathrm{ad}_X[X, \mathfrak{h}] \subset \mathrm{ad}_X(\mathfrak{m}) = 0$, we conclude that $\mathrm{ad}_X \mathrm{ad}_X = 0$. By the definition of Killing form, $B(X, X) = 0$. Since $cB|_{\mathfrak{m}}$ is an inner product, $X = 0$. ∎


36. Proposition (S. Kobayashi). A homogeneous Riemannian manifold with $K \le 0$ and $\mathrm{Ric} < 0$ is simply connected.

The result follows from these three facts about a homogeneous Riemannian manifold $M$:

(1) Every maximal geodesic of $M$ is either one-to-one or periodic.
(2) If $M$ is not simply connected, it contains a periodic geodesic.
(3) If $K \le 0$ and $\mathrm{Ric} < 0$, then $M$ contains no periodic geodesics.

Proof of (1). It suffices to show that every geodesic loop is smoothly closed. Suppose the geodesic loop $\gamma: [0, b] \to M$ has unit speed. By Corollary 9.38 there is a Killing vector field $X$ on $M$ such that $X_p = \gamma'(0)$ at $p = \gamma(0) = \gamma(b)$. Using the conservation lemma (9.26),

$$\langle \gamma'(0), \gamma'(b) \rangle = \langle X_p, \gamma'(b) \rangle = \langle X_p, \gamma'(0) \rangle = \langle \gamma'(0), \gamma'(0) \rangle = 1.$$

By the Schwarz inequality the unit vectors $\gamma'(0)$ and $\gamma'(b)$ are equal.

Proof of (2). The simply connected covering $k: \tilde M \to M$ is not trivial; that is, there are points $p \ne q$ in $\tilde M$ such that $k(p) = k(q)$. Being homogeneous Riemannian, $M$ is complete, and hence so is $\tilde M$. Thus by the Hopf-Rinow theorem, there is a geodesic $\beta$ from $p$ to $q$. Since $k \circ \beta$ is not one-to-one, the result follows from (1).

Proof of (3). Assume that $\gamma: R \to M$ is a periodic geodesic. Since Ricci curvature is negative definite (and $M$ is Riemannian) there is a tangent vector $x$ at $\gamma(0)$ such that $\langle R_{x\gamma'(0)}x, \gamma'(0) \rangle < 0$. Again there is a Killing vector field $X$ such that $X_{\gamma(0)} = x$. Let $h(t) = \langle X_{\gamma(t)}, X_{\gamma(t)} \rangle$. Then $h$ is a periodic function. Since $X_\gamma$ is a Jacobi field on $\gamma$, the computation in the proof of (10.19) shows that

$$h'' = 2\left(|X'|^2 - \langle R_{X\gamma'}X, \gamma' \rangle\right) \ge 0.$$

But since $h$ is periodic, $h'' = 0$. Thus $\langle R_{X\gamma'}X, \gamma' \rangle = 0$, contradicting the inequality above.
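The second-derivative formula quoted from the proof of (10.19) can be recovered in a few lines. The following is only a sketch, assuming the Jacobi equation in the form $Y'' = R_{Y\gamma'}(\gamma')$ used later in the book and the skew-adjointness of the curvature operator; primes denote covariant derivatives along $\gamma$.

```latex
\begin{align*}
h   &= \langle X, X \rangle, \qquad h' = 2\langle X', X \rangle, \\
h'' &= 2\langle X'', X \rangle + 2\langle X', X' \rangle \\
    &= 2\langle R_{X\gamma'}\gamma', X \rangle + 2|X'|^2
       && \text{(Jacobi equation)} \\
    &= 2\bigl(|X'|^2 - \langle R_{X\gamma'}X, \gamma' \rangle\bigr)
       && \text{(skew-adjointness of } R_{X\gamma'}\text{)}.
\end{align*}
```

Both terms in the last line are nonnegative when $K \le 0$ and the metric is Riemannian. A convex periodic function is constant, which is why $h'' = 0$ forces each term to vanish.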

DUALITY

For normal symmetric spaces (Remark 33) there is a remarkable duality that, for our purposes, can be described as follows.

37. Definition. Normal symmetric spaces $M = G/H$ and $M^* = G^*/H^*$ are dual provided there exist

(1) a Lie algebra isomorphism $\delta: \mathfrak{h} \to \mathfrak{h}^*$ such that $B^*(\delta V, \delta W) = -B(V, W)$ for all $V, W \in \mathfrak{h}$;
(2) a linear isometry $\delta: \mathfrak{m} \to \mathfrak{m}^*$ such that $[\delta X, \delta Y] = -\delta[X, Y]$ for all $X, Y \in \mathfrak{m}$.

Thus on $\mathfrak{h}$, brackets are preserved and $B$ has signs reversed, while on $\mathfrak{m}$, $B$ is preserved and brackets have signs reversed. Under the identifications $\mathfrak{m} \approx T_o(M)$ and $\mathfrak{m}^* \approx T_o(M^*)$, $\delta$ induces a linear isometry $\delta: T_o(M) \to T_o(M^*)$. Thus duals, $M$ and $M^*$, have the same dimension and same index.

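38. Lemma. Dual symmetric spaces $M = G/H$ and $M^* = G^*/H^*$ have opposite curvatures; that is, $K(\delta\Pi) = -K(\Pi)$ for every nondegenerate plane $\Pi$ in $T_o(M)$, where $\delta: T_o(M) \to T_o(M^*)$ is the induced linear isometry.

Proof. If $X$ and $Y$ span a nondegenerate plane in $\mathfrak{m} \approx T_o(M)$, then by Remark 33 and Definition 37,

$$K(\delta X, \delta Y) = \frac{B^*([\delta X, \delta Y], [\delta X, \delta Y])}{Q(\delta X, \delta Y)} = \frac{B^*(\delta[X, Y], \delta[X, Y])}{Q(X, Y)} = \frac{-B([X, Y], [X, Y])}{Q(X, Y)} = -K(X, Y).$$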

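The Grassmann manifolds $G_{pq}$ of Example 15 can be made symmetric spaces by generalizing Example 32 from $1, n$ to $p, q$. In a similar way $G^*_{pq}$ from Example 20 becomes a symmetric space dual to $\tilde G_{pq}$.

39. Example. Dual Grassmann Manifolds.

$$\tilde G_{pq} = SO(p + q)/SO(p) \times SO(q), \qquad G^*_{pq} = O^{++}(p, q)/SO(p) \times SO(q) \qquad (p \ge 1,\ q \ge 1).$$

(1) $\tilde G_{pq}$. Conjugation by the $(p, q)$ signature matrix $\varepsilon$ is an involutive automorphism $\sigma$ of $SO(n)$, $n = p + q$. Since $\varepsilon^{-1} = \varepsilon$, $\sigma(g) = \varepsilon g \varepsilon$. Hence

$$\sigma \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} a & -b \\ -c & d \end{pmatrix}, \quad \text{where } a \text{ is } p \times p,\ d \text{ is } q \times q.$$

Thus $\mathrm{Fix}(\sigma) = S(O(p) \times O(q))$, whose identity component is $SO(p) \times SO(q)$. By Lemma 6(1), $d\sigma$ on $\mathfrak{o}(n)$ is also conjugation by $\varepsilon$. Thus its $-1$ eigenspace $\mathfrak{m}$ consists of all matrices

$$X = \begin{pmatrix} 0 & -{}^t x \\ x & 0 \end{pmatrix},$$

where $x$ is an arbitrary $q \times p$ matrix. As in Example 8, we denote the space of all such $x$ by $R^{pq}$. Thus $X \leftrightarrow x$ is a linear isomorphism identifying $\mathfrak{m}$ with $R^{pq}$.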


On $\mathfrak{o}(n)$ let $B$ be the inner product $B(X, Y) = -\tfrac{1}{2}\,\mathrm{trace}\,XY = \tfrac{1}{2} X \cdot Y$. The factor $\tfrac{1}{2}$ is introduced so that, as in the case $S^n = \tilde G_{1n}$, the restriction $B|_{\mathfrak{m}}$ corresponds under $X \leftrightarrow x$ to the dot product $x \cdot y$ on $R^{pq}$. Since $B$ is a negative multiple of the Killing form of $\mathfrak{o}(n)$, Remark 33 applies, making $\tilde G_{pq}$ a symmetric space, in fact a Riemannian symmetric space of compact type. Hence $K \ge 0$ and $\mathrm{Ric} > 0$. As mentioned earlier, $\tilde G_{pq}$ is simply connected and has dimension $pq$. ($G_{pq}$ can be made symmetric, in virtually the same way, so that $\tilde G_{pq} \to G_{pq}$ is a Riemannian covering.)

(2) $G^*_{pq}$. Let $\sigma$ on $O^{++}(p, q)$ again be conjugation by the $(p, q)$ signature matrix. Then $\mathrm{Fix}(\sigma) = SO(p) \times SO(q)$, and the $-1$ eigenspace $\mathfrak{m}^*$ consists of all

$$X = \begin{pmatrix} 0 & {}^t x \\ x & 0 \end{pmatrix},$$

where $x$ is an arbitrary $q \times p$ matrix. This time we use $B(X, Y) = \tfrac{1}{2}\,\mathrm{trace}\,XY$, a positive multiple of the Killing form of $\mathfrak{o}(p, q)$. Example 8(2) shows that the restriction of $B$ to $\mathfrak{m}^*$ is an inner product corresponding to the dot product on $R^{pq}$. But $B$ is negative definite on $\mathfrak{h} = \mathfrak{o}(p) + \mathfrak{o}(q)$. Thus $G^*_{pq}$ becomes a Riemannian symmetric space of noncompact type. Hence $K \le 0$, $\mathrm{Ric} < 0$, and $G^*_{pq}$ is diffeomorphic to $R^{pq}$. Furthermore, $O^{++}(p, q)$ is diffeomorphic to $SO(p) \times SO(q) \times R^{pq}$. Analogous to the identification $S^n = \tilde G_{1n}$ is $H^n = G^*_{1n}$, representing hyperbolic space as the set of timelike lines through 0 in $R^{n+1}_1$.

(3) Duality. The subgroup $H = SO(p) \times SO(q)$ is the same for both, so let $\delta$ be the identity map on $\mathfrak{h} = \mathfrak{o}(p) \times \mathfrak{o}(q)$. The scalar products used in (1) and (2) are indeed negatives on $\mathfrak{h}$. Let $\delta: \mathfrak{m} \to \mathfrak{m}^*$ send

$$\begin{pmatrix} 0 & -{}^t x \\ x & 0 \end{pmatrix} \quad \text{to} \quad \begin{pmatrix} 0 & {}^t x \\ x & 0 \end{pmatrix}.$$

This map is clearly a linear isometry, and a simple computation shows that $[\delta X, \delta Y] = -[X, Y]$ as required for duality. Thus $\tilde G_{pq}$ and $G^*_{pq}$ are duals, with opposite curvature as specified in Lemma 38. The simplest case is the duality between $S^n = \tilde G_{1n}$ and $H^n = G^*_{1n}$.
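The "simple computation" behind the duality of the two Grassmann manifolds can be spot-checked numerically. The sketch below is illustrative only: the helper names (`block`, `duality_check`) and the choice $p = 2$, $q = 3$ are assumptions, and matrices are stored as plain Python lists. It builds $X, Y \in \mathfrak{m}$ from random $q \times p$ matrices, applies the sign change that defines $\delta$, and compares brackets and scalar products.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def block(x, p, q, sign):
    """Block matrix [[0, sign * x^T], [x, 0]] built from a q x p matrix x."""
    n = p + q
    X = [[0.0] * n for _ in range(n)]
    for i in range(q):
        for j in range(p):
            X[p + i][j] = x[i][j]
            X[j][p + i] = sign * x[i][j]
    return X

def duality_check(p=2, q=3, seed=0):
    rng = random.Random(seed)
    x = [[rng.uniform(-1, 1) for _ in range(p)] for _ in range(q)]
    y = [[rng.uniform(-1, 1) for _ in range(p)] for _ in range(q)]
    X, Y = block(x, p, q, -1), block(y, p, q, -1)      # elements of m
    dX, dY = block(x, p, q, +1), block(y, p, q, +1)    # their images in m*
    lhs, rhs = bracket(dX, dY), bracket(X, Y)
    n = p + q
    # Largest entry of [dX, dY] + [X, Y]; zero means the brackets are negatives.
    defect = max(abs(lhs[i][j] + rhs[i][j]) for i in range(n) for j in range(n))
    dot = sum(x[i][j] * y[i][j] for i in range(q) for j in range(p))
    B = -0.5 * trace(matmul(X, Y))         # scalar product used in (1)
    B_star = 0.5 * trace(matmul(dX, dY))   # scalar product used in (2)
    return defect, B, B_star, dot
```

Calling `duality_check()` returns the bracket defect together with $B(X, Y)$, $B^*(\delta X, \delta Y)$, and the dot product $x \cdot y$; the last three should agree up to rounding, and the defect should vanish.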


SOME COMPLEX GEOMETRY

We describe briefly some relations between real and complex geometry, with applications to symmetric spaces. Let $V$ be a vector space over the complex numbers $C$. If only real scalars are used, then $V$ becomes a real vector space and scalar multiplication by $\sqrt{-1}$ is an $R$-linear operator $J$ on $V$. Since $J^2 = -\mathrm{id}$, $J$ is nonsingular. If $V$ has complex dimension $n$, then its real dimension is $2n$. In fact, if $e_1, \dots, e_n$ is a complex basis, it is easy to check that $e_1, \dots, e_n, Je_1, \dots, Je_n$ is a real basis. A complex line $L$ through 0 in $V$ (that is, a complex one-dimensional subspace) is a real two-dimensional subspace, since if $0 \ne x \in L$, then $x, Jx$ is a real basis for $L$. Such holomorphic sections are exactly the two-dimensional real subspaces that are invariant under $J$.

Reciprocally, let $W$ be a real vector space. An $R$-linear operator $J$ on $W$ such that $J^2 = -\mathrm{id}$ is called a complex structure on $W$. Then the natural definition of complex scalar multiplication as $(a + \sqrt{-1}\,b)x = ax + bJx$ makes $W$ a complex vector space. Evidently an $R$-linear operator that preserves $J$ (that is, commutes with $J$) is $C$-linear.
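The passage from a complex structure $J$ to complex scalar multiplication can be made concrete in the smallest case $W = R^2$. This is a minimal sketch under the assumption that $J$ is rotation by a quarter turn; the function names are illustrative, not from the text.

```python
def J(v):
    """Complex structure on R^2: a quarter-turn rotation, so J(J(v)) = -v."""
    x, y = v
    return (-y, x)

def scalar_mult(c, v):
    """Complex scalar multiplication defined by (a + ib) v = a v + b J v."""
    a, b = c.real, c.imag
    Jv = J(v)
    return (a * v[0] + b * Jv[0], a * v[1] + b * Jv[1])

def as_complex(v):
    """Identify the real pair (x, y) with the complex number x + iy."""
    return complex(v[0], v[1])
```

Under the identification `as_complex`, `scalar_mult(c, v)` agrees with ordinary multiplication of complex numbers, which is the sense in which $J$ makes $R^2$ a one-dimensional complex vector space.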

40. Definition. A Hermitian scalar product on a complex vector space $V$ is a function $h: V \times V \to C$ such that

(1) $h(v, w)$ is $C$-linear in $v$;
(2) $h(w, v) = \overline{h(v, w)}$;
(3) $h$ is nondegenerate; that is, $h(v, w) = 0$ for all $w$ implies $v = 0$.

Then $h(v, w)$ is additive in $w$, but $h(v, cw) = \bar c\, h(v, w)$. Since $h(v, v)$ is always real, terms such as positive definite or semidefinite are meaningful as before. On $C^n$ the analogue of the dot product is the (positive definite) natural Hermitian product

$$\langle v, w \rangle = v \cdot \bar w = \sum v_i \bar w_i.$$

It is easy to verify that the real part, $\mathrm{Re}\, h$, of a Hermitian scalar product $h$ on $V$ is a (real) scalar product relative to which $J$ is both orthogonal and skew-adjoint.

41. Example. The Indefinite Unitary Group $U(p, q)$.

(1) Definition. $U(p, q)$ consists of those matrices in $GL(n, C)$, $n = p + q$, that (as $C$-linear operators on $C^n$) preserve the Hermitian scalar product

$$h(x, y) = -x_1 \bar y_1 - \cdots - x_p \bar y_p + x_{p+1} \bar y_{p+1} + \cdots + x_n \bar y_n.$$

If $\varepsilon$ is the $(p, q)$ signature matrix, then $h(x, y) = \varepsilon x \cdot \bar y$, and it follows that $U(p, q) = \{g \in GL(n, C) : {}^t\bar g = \varepsilon g^{-1} \varepsilon\}$. Thus $U(p, q)$ is a closed subgroup of $GL(n, C)$. Note that $U(n, 0) = U(0, n) = U(n)$.

(2) Lie algebra $\mathfrak{u}(p, q)$. Recall that $\mathfrak{u}(n)$ consists of skew-Hermitian matrices: ${}^t\bar X = -X$. By (1) it follows, as in the analogous real case $\mathfrak{o}(p, q)$, that $\mathfrak{u}(p, q)$ consists of all complex $n \times n$ matrices such that ${}^t\bar X = -\varepsilon X \varepsilon$. These have the form

$$X = \begin{pmatrix} a & {}^t\bar x \\ x & b \end{pmatrix}, \quad \text{where } a \in \mathfrak{u}(p),\ b \in \mathfrak{u}(q),\ x \in C^{pq}.$$

As in the real case, $C^{pq}$ denotes the space of $q \times p$ (or $p \times q$) complex matrices. Thus $\mathfrak{u}(p, q) = \mathfrak{u}(p) + \mathfrak{u}(q) + C^{pq}$ (direct sum).
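The block description of $\mathfrak{u}(p, q)$ can be tested against the defining condition ${}^t\bar X = -\varepsilon X \varepsilon$. The sketch below is an illustration with hypothetical helper names (`in_u`, `assemble`) and an arbitrarily chosen example with $p = 1$, $q = 2$; it also shows that flipping the sign of the upper-right block gives a skew-Hermitian matrix, that is, an element of $\mathfrak{u}(3) = \mathfrak{u}(3, 0)$ instead.

```python
def conj_transpose(X):
    """Conjugate transpose of a square complex matrix stored as a list of rows."""
    n = len(X)
    return [[X[j][i].conjugate() for j in range(n)] for i in range(n)]

def in_u(X, p, q, tol=1e-12):
    """Check conj_transpose(X) == -eps X eps for the (p, q) signature matrix eps."""
    n = p + q
    eps = [-1] * p + [1] * q          # diagonal of the signature matrix
    Xh = conj_transpose(X)
    return all(abs(Xh[i][j] + eps[i] * X[i][j] * eps[j]) <= tol
               for i in range(n) for j in range(n))

def assemble(a, b, x, sign):
    """Block matrix [[a, sign * x^*], [x, b]] with a p x p, b q x q, x q x p."""
    p, q = len(a), len(b)
    n = p + q
    X = [[0j] * n for _ in range(n)]
    for i in range(p):
        for j in range(p):
            X[i][j] = complex(a[i][j])
    for i in range(q):
        for j in range(q):
            X[p + i][p + j] = complex(b[i][j])
    for i in range(q):
        for j in range(p):
            X[p + i][j] = complex(x[i][j])
            X[j][p + i] = sign * complex(x[i][j]).conjugate()
    return X

a = [[2j]]                                   # skew-Hermitian 1 x 1
b = [[1j, 1 + 1j], [-1 + 1j, -3j]]           # skew-Hermitian 2 x 2
x = [[1 + 2j], [3 - 1j]]                     # arbitrary 2 x 1
X_indef = assemble(a, b, x, +1)              # candidate element of u(1, 2)
X_def = assemble(a, b, x, -1)                # candidate element of u(3)
```

Here `in_u(X_indef, 1, 2)` and `in_u(X_def, 3, 0)` hold, while the two crossed checks fail because $x \ne 0$.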

(3) Trace form. For $X$ as above and, similarly, $Y = \begin{pmatrix} c & {}^t\bar y \\ y & d \end{pmatrix}$,

$$\mathrm{trace}\, XY = X \cdot {}^t Y = -a \cdot \bar c - b \cdot \bar d + (x \cdot \bar y + \bar x \cdot y).$$

We saw in Example 8 that $a \cdot \bar c$ is a (real, positive definite) inner product on $\mathfrak{u}(p)$; similarly for $b \cdot \bar d$ on $\mathfrak{u}(q)$. The sum in parentheses is just $2\,\mathrm{Re}\, x \cdot \bar y$, an inner product on $C^{pq} \approx R^{2pq}$. Thus the trace form on $\mathfrak{u}(p, q)$ is a real scalar product that is negative definite on $\mathfrak{u}(p) + \mathfrak{u}(q)$ and positive definite on $C^{pq} \approx R^{2pq}$. (Like $\mathfrak{u}(n)$, $\mathfrak{u}(p, q)$ is a real but not a complex vector space.)

An almost complex structure on a smooth manifold $M$ is a (1, 1) tensor field $J$ such that $J^2 = -\mathrm{id}$. Thus $J$ smoothly assigns a complex structure $J_p$ to each tangent space $T_p(M)$. If a manifold $M$ admits an almost complex structure, it is not hard to show that $M$ is even-dimensional and orientable.

42. Definition. A semi-Riemannian manifold $M$ with almost complex structure $J$ is a Kähler manifold provided

(1) $J$ preserves the metric; that is, $\langle JX, JY \rangle = \langle X, Y \rangle$ for all $X, Y \in \mathfrak{X}(M)$;
(2) $J$ is parallel; that is, $D_X(JY) = J(D_X Y)$ for all $X, Y \in \mathfrak{X}(M)$.

The simplest example of a Kähler manifold is Euclidean space $R^{2n}$ with its usual metric and with $J$ derived from $R^{2n} \approx C^n$; explicitly, $J(e_{2k-1}) = e_{2k}$ and $J(e_{2k}) = -e_{2k-1}$.

The holomorphic curvature of a Kähler manifold is the restriction of the sectional curvature function $K$ to nondegenerate holomorphic sections. It can be seen as assigning to each nonnull tangent vector $x \ne 0$ the sectional curvature $K(x, Jx)$. In fact, the holomorphic plane spanned by $x, Jx$ is nondegenerate since (1) above implies $Jx \perp x$, hence $Q(x, Jx) = \langle x, x \rangle^2 \ne 0$.

43. Proposition. Let $M = G/H$ be a semi-Riemannian symmetric space, and let $J_o$ be an $\mathrm{Ad}(H)$-invariant complex structure on $\mathfrak{m}$ that preserves the scalar product on $\mathfrak{m}$. Then there is a unique $G$-invariant almost complex structure $J$ on $M$ such that $d\pi: \mathfrak{m} \approx T_o(M)$ is $J$-preserving, and $M$ is Kähler relative to $J$.

Proof. By the remark following Proposition 22 the (1, 1) tensor $J_o$ on $\mathfrak{m}$ gives rise to a $G$-invariant (1, 1) tensor field $J$ on $M$. The scalar product on $\mathfrak{m}$ mentioned in the proposition is, as usual, the one corresponding via $d\pi$ to the scalar product on $T_o(M)$. Since $J_o$ and $J|T_o(M)$ correspond via $d\pi$, the latter has $J^2 = -\mathrm{id}$ and preserves the scalar product on $T_o(M)$. These properties then hold at all points of $M$, since both the metric tensor and $J$ are $G$-invariant.

To show that $J$ is parallel on $M$ it suffices to show that if $Z$ is a parallel vector field on a geodesic $\gamma$, then $JZ$ is also parallel. (Compare proof of Proposition 8.10.) By homogeneity we can suppose that $\gamma$ starts at $o \in M$ and thus is the projection of a suitable one-parameter subgroup $\alpha$ of $G$. By Exercise 10, $Z(s) = d\tau_{\alpha(s)} Z(0)$ for all $s$. Since $J$ is $G$-invariant,

$$(JZ)(s) = J(Z(s)) = J(d\tau_{\alpha(s)} Z(0)) = d\tau_{\alpha(s)}(JZ(0)).$$

Hence by Exercise 10, $JZ$ is parallel. ∎

The last part of the proof generalizes automatically to the assertion: A $G$-invariant tensor field on a symmetric space $M = G/H$ is parallel.

The dual Grassmann manifolds of Example 39 have complex analogues constructed in a strictly analogous way, as follows.

44. Example. $CG_{pq} = U(p + q)/U(p) \times U(q) = SU(p + q)/S(U(p) \times U(q))$.

(1) Coset manifold. Let $CG_{pq}$ be the set of all complex $p$-dimensional subspaces of $C^n$, $n = p + q$. $U(n)$ acts naturally on $C^n$ and thereby on $CG_{pq}$. Let $V_o$ be the subspace spanned by the first $p$ elements of the natural basis for $C^n$. The isotropy group of $V_o$ is $U(p) \times U(q)$, since invariance of $V_o$ implies invariance of $V_o^\perp$.

(2) Symmetric space. Conjugation by the $(p, q)$ signature matrix $\varepsilon$ is an involutive automorphism $\sigma$ of $U(n)$ for which $\mathrm{Fix}(\sigma) = U(p) \times U(q)$. Since $d\sigma$ is also conjugation by $\varepsilon$, the $-1$ eigenspace $\mathfrak{m}$ consists of the elements of $\mathfrak{u}(n)$ of the form

$$X = \begin{pmatrix} 0 & -{}^t\bar x \\ x & 0 \end{pmatrix},$$

where $x$ is an arbitrary $q \times p$ complex matrix. On $\mathfrak{u}(n)$ we know by Example 8 that $B(X, Y) = -\tfrac{1}{2}\,\mathrm{trace}\, XY = \tfrac{1}{2} X \cdot \bar Y$ is an inner product that is a negative multiple of the Killing form. (As before, the factor $\tfrac{1}{2}$ means that $B|_{\mathfrak{m}}$ corresponds under $X \leftrightarrow x$ to the natural Hermitian product $x \cdot \bar y$ on $C^{pq}$.) Thus $CG_{pq}$ becomes a Riemannian symmetric space of compact type.

(3) Properties. A simple computation shows that the adjoint action of $H = U(p) \times U(q)$ on $\mathfrak{m} \approx C^{pq}$ corresponds to its action $((a, b), x) \to b x a^{-1}$ on $C^{pq}$. Scalar multiplication by $\sqrt{-1}$ on $C^{pq} \approx \mathfrak{m}$ gives a corresponding complex structure $J_o$ on $\mathfrak{m}$. Evidently $J_o$ is $\mathrm{Ad}(H)$-invariant. Hence by Proposition 43, $J_o$ determines an almost complex structure $J$ on $CG_{pq}$ making it a Kähler manifold. Applying Proposition 17 to the alternate description $CG_{pq} = SU(n)/S(U(p) \times U(q))$ shows that $CG_{pq}$ is simply connected, since $SU(n)$ is simply connected (Corollary 19) and $S(U(p) \times U(q))$ is connected (Exercise 4).

Thus $CG_{pq}$ is a compact simply connected $2pq$-dimensional (Riemannian) Kähler symmetric space with $K \ge 0$ and $\mathrm{Ric} > 0$.

45. Example. $CG^*_{pq} = U(p, q)/U(p) \times U(q) = SU(p, q)/S(U(p) \times U(q))$.

(1) Coset manifold. Let $CG^*_{pq}$ be the set of all those complex $p$-dimensional subspaces of $C^n$, $n = p + q$, on which the indefinite Hermitian product $\varepsilon x \cdot \bar y$ of Example 41 is negative definite. The indefinite unitary group $U(p, q)$ is transitive on $CG^*_{pq}$, and the isotropy group of the usual subspace $V_o$ is $U(p) \times U(q)$.

(2) Symmetric space. As before, conjugation by the $(p, q)$ signature matrix is an involutive automorphism of $U(p, q)$ whose fixed point set is $U(p) \times U(q)$. In view of the description of $\mathfrak{u}(p, q)$ in Example 41, the $-1$ eigenspace $\mathfrak{m}^*$ of $d\sigma$ consists of all elements of the form

$$X = \begin{pmatrix} 0 & {}^t\bar x \\ x & 0 \end{pmatrix},$$

where $x$ is a $q \times p$ complex matrix. Let $B^*(X, Y) = \tfrac{1}{2}\,\mathrm{trace}\, XY = \tfrac{1}{2} X \cdot {}^t Y$ on $\mathfrak{u}(p, q)$. Then $B^*$ is a positive multiple of the Killing form of $\mathfrak{u}(p, q)$. Thus it follows from the properties of $B^*$ given by Example 41(3) that it makes $CG^*_{pq}$ a Riemannian symmetric space of noncompact type.

(3) Properties. As in the preceding example, the isomorphism $\mathfrak{m}^* \approx C^{pq}$ induces a complex structure on $\mathfrak{m}^*$ that, transmitted to $CG^*_{pq}$, makes it Kähler. Thus $CG^*_{pq}$ is a (Riemannian) Kähler symmetric space with $K \le 0$ and $\mathrm{Ric} < 0$, and diffeomorphic to $R^{2pq}$.

(4) Duality. $CG_{pq}$ and $CG^*_{pq}$ are duals. In fact they share the same subalgebra $\mathfrak{h} = \mathfrak{u}(p) + \mathfrak{u}(q)$, on which $B$ and $B^* = -B$ certainly have opposite signs. Let $\delta: \mathfrak{m} \to \mathfrak{m}^*$ send

$$\begin{pmatrix} 0 & -{}^t\bar x \\ x & 0 \end{pmatrix} \quad \text{to} \quad \begin{pmatrix} 0 & {}^t\bar x \\ x & 0 \end{pmatrix}.$$

By construction, $B^*(\delta X, \delta Y) = x \cdot \bar y = B(X, Y)$, and, just as in Example 39(3), $[\delta X, \delta Y] = -[X, Y]$.

The simplest real Grassmann manifolds are spheres (or projective spaces) and hyperbolic spaces. The complex Grassmannians above provide complex analogues as follows: Complex projective space $CP^n$ is $CG_{1n} = U(n + 1)/U(1) \times U(n)$. Dual to it is complex hyperbolic space $CH^n = CG^*_{1n} = U(1, n)/U(1) \times U(n)$. The metrics on the symmetric spaces we have constructed are in effect defined only up to multiplication by a positive constant. Here we choose the constant so that $CP^n$ will have holomorphic curvature 1 (see below). Then duality determines a unique metric on $CH^n$.

46. Corollary. (1) Complex projective space $CP^n$ is a compact, simply connected, $2n$-dimensional (Riemannian) Kähler symmetric space with constant holomorphic curvature 1, and $\tfrac{1}{4} \le K \le 1$. Each of its geodesics is simply closed of length $2\pi$.

(2) Complex hyperbolic space $CH^n$ is a (Riemannian) Kähler symmetric space of constant holomorphic curvature $-1$, and $-1 \le K \le -\tfrac{1}{4}$. It is diffeomorphic to $R^{2n}$, and each of its geodesics is one-to-one.

Proof. (1) (a) The linear isotropy group is transitive on the set of holomorphic planes in $T_o(M)$ and on the unit sphere in $T_o(M)$. In fact, we saw in Example 44 that this action corresponds to the action of $U(1) \times U(n)$ on $C^n$ by $(e^{i\vartheta}, A)x = e^{-i\vartheta}Ax$. For $\vartheta = 0$, this is already transitive on complex lines, that is, holomorphic planes. But scalar multiplication by all $e^{-i\vartheta} \in U(1)$ is clearly transitive on the unit circle in each holomorphic plane.

(b) It is immediate from (a) that holomorphic curvature is constant on $T_o(M)$ and hence, by homogeneity, is constant everywhere. Multiply $B$ in Example 44 by 4, so $B(X, Y) = -2\,\mathrm{trace}\, XY = 2 X \cdot \bar Y$. Let $E_1, E_2 \in \mathfrak{m}$ correspond to elements of the natural basis for $C^n \approx \mathfrak{m}$. In view of (a), an arbitrary tangent plane on $CP^n$ has sectional curvature $K(E_1, Y)$, where $Y = \cos\vartheta\, JE_1 + \sin\vartheta\, E_2$. A simple computation gives $K(E_1, Y) = \tfrac{1}{4}(1 + 3\cos^2\vartheta)$. Hence $\tfrac{1}{4} \le K \le 1$. Taking $\vartheta = 0$ shows that $CP^n$ has constant holomorphic curvature 1.

(c) Geodesics. By (a) and the homogeneity of $CP^n$, all its geodesics are congruent. Thus it suffices to exhibit a single simply periodic geodesic of length $2\pi$. The one-parameter subgroup of $E_1 \in \mathfrak{m}$ is

$$\alpha(s) = \begin{pmatrix} \cos s & -\sin s & 0 \\ \sin s & \cos s & 0 \\ 0 & 0 & I_{n-1} \end{pmatrix}.$$

Evidently $\alpha$ is periodic of period $2\pi$, but the geodesic $\gamma = \pi \circ \alpha$ is periodic of period $\pi$. In fact, for $s > 0$, the first return of $\alpha$ to $H = U(1) \times U(n)$ is at the first zero of $\sin s$; namely, $s = \pi$. The proof of Proposition 36 shows that $\gamma|[0, \pi]$ is smoothly closed and that $\gamma$ is simply periodic. Thus by construction, the period of $\gamma$ is $\pi$. Since $\gamma$ has speed $B(E_1, E_1)^{1/2} = 2$, its length is $2\pi$.

The other stated properties of $CP^n$ are those of $CG_{pq}$ in general.

(2) The curvature properties of $CH^n$ derive from (1) since duality $\delta$ reverses curvature signs. Holomorphic curvature is also reversed, since $\delta$ commutes with $J$, hence preserves holomorphic planes. By the Hadamard theorem (10.22) the geodesics of every $CG^*_{pq}$, indeed of every Riemannian symmetric space of noncompact type, are one-to-one. ∎
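The curvature value $K(E_1, Y) = \tfrac{1}{4}(1 + 3\cos^2\vartheta)$ in part (b) can be checked numerically from the formula of Remark 33. The following is a sketch under stated assumptions: elements of $\mathfrak{m} \subset \mathfrak{u}(n + 1)$ are written as block matrices with $x \in C^n$ in the lower-left corner, the scalar product is the rescaled $B(X, Y) = -2\,\mathrm{trace}\, XY$, and the helper names are illustrative.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(A, B):
    """Matrix commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    n = len(A)
    return [[AB[i][j] - BA[i][j] for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def embed(x):
    """Element [[0, -x^*], [x, 0]] of m in u(n + 1) for x in C^n (p = 1, q = n)."""
    n = len(x)
    X = [[0j] * (n + 1) for _ in range(n + 1)]
    for i, xi in enumerate(x):
        X[i + 1][0] = complex(xi)
        X[0][i + 1] = -complex(xi).conjugate()
    return X

def B(X, Y):
    """Rescaled scalar product B(X, Y) = -2 trace XY (real on u(n + 1))."""
    return (-2 * trace(matmul(X, Y))).real

def K(X, Y):
    """Sectional curvature B([X, Y], [X, Y]) / Q(X, Y) from Remark 33."""
    Z = bracket(X, Y)
    Q = B(X, X) * B(Y, Y) - B(X, Y) ** 2
    return B(Z, Z) / Q

def sectional(theta, n=2):
    """K(E_1, cos(theta) J E_1 + sin(theta) E_2) on CP^n, n >= 2."""
    e1 = [1] + [0] * (n - 1)
    y = [1j * math.cos(theta), math.sin(theta)] + [0] * (n - 2)
    return K(embed(e1), embed(y))
```

Evaluating `sectional(theta)` for several angles should reproduce $\tfrac{1}{4}(1 + 3\cos^2\vartheta)$, with the extremes 1 at $\vartheta = 0$ (a holomorphic plane) and $\tfrac{1}{4}$ at $\vartheta = \pi/2$.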

If $n = 1$, then in both cases above, $K$ has constant value $\pm 1$. Since these two surfaces are complete and simply connected, space form classification shows that $CP^1$ is the unit 2-sphere, and $CH^1$ is the real hyperbolic plane $H^2(1)$. However, for $n > 1$ the preceding proof shows that $K$ is nonconstant, filling the prescribed interval. Furthermore $CP^n$ has Euler number $n + 1$, hence for $n > 1$ is not even homeomorphic to $S^{2n}$. (Compare the Sphere Theorem in [CE].) Proposition 8.28 shows that for sectional curvature to fill a finite interval is a distinctively Riemannian phenomenon.

As with the reals and complexes, there are also quaternionic projective spaces [H]. More generally, one can define the symmetric space of all vector subspaces of dimension $m$ and index $p$ in $F^n_\nu$ ($p \le \nu$ and $m - p \le n - \nu$), where $F = R$, $C$, or $H$. For this and further information about indefinite symmetric spaces, see [Wo].

Exercises

$G$ and $H$ denote Lie groups.

1. Prove: (a) Product structures on $G \times H$ make it a Lie group. (b) If $k: \tilde G \to G$ is the simply connected covering of $G$, then $\tilde G$ can be made a Lie group so that $k$ is a homomorphism. (Hint: If $\mu$ is the multiplication of $G$, lift $\mu \circ (k \times k)$ through $k$.) (c) If $K$ is a closed normal subgroup of $G$, the quotient group $G/K$, as a coset manifold, is a Lie group.

2. Let $\phi: G \to G'$ be a smooth homomorphism onto $G'$. Prove: (a) The kernel $K$ of $\phi$ is a closed normal subgroup such that the induced group isomorphism $\bar\phi: G/K \to G'$ is a diffeomorphism, hence a Lie group isomorphism. (Hint: $G$ acts on $G'$.) (b) If $\phi$ has a smooth cross section, then $G$ is diffeomorphic to $G' \times K$. (Hint: (a) is important because results for $\pi: G \to G/H$ now apply to $\phi: G \to G'$.)

3. Show that $U(n)$ is diffeomorphic to $SU(n) \times S^1$ by considering $\det: U(n) \to U(1) = S^1$.

4. Prove that $S(U(p) \times U(q))$ is diffeomorphic to $SU(p) \times SU(q) \times S^1$. (Hint: Find $a(h)$ so that $(g, h) \to (a(h)g, h)$ is a diffeomorphism of $SU(p) \times U(q)$ onto $S(U(p) \times U(q))$.)

5. If $p, q \ge 1$, prove (a) $U(p, q)$ is diffeomorphic to $SU(p, q) \times U(1)$; (b) $SU(p, q)$ is diffeomorphic to $S(U(p) \times U(q)) \times R^{2pq}$; (c) hence, using Exercise 4, $\pi_1 U(p, q) = Z \times Z$.

6. Show that if a Lie group $G$ acts transitively on a connected manifold $M$, then so does the identity component $G_0$ of $G$. (Hint: Since $\pi: G \to M$ is a submersion and $G_0$ is open, components of $G$ project to open sets of $M$.)

7. Let $H$ be a closed subgroup of $G$. Let $B$ be an $\mathrm{Ad}(G)$-invariant scalar product on $\mathfrak{g}$ such that $\mathfrak{h}$ is nondegenerate. Prove: (a) $\mathfrak{m} = \mathfrak{h}^\perp$ is a Lie subspace on which $B|_{\mathfrak{m}}$ makes $G/H$ a homogeneous space, said to be normal. (b) $G/H$ is naturally reductive. (c) $\langle R_{XY}X, Y \rangle = B([X, Y], [X, Y])$ for $X, Y \in \mathfrak{m} \approx T_o(G/H)$.

8. Prove: (a) If a connected Lie group $G$ admits a bi-invariant Riemannian metric, then every point of $G$ lies on a one-parameter subgroup, that is, $\exp(\mathfrak{g}) = G$. (b) $SL(2, R)$ admits a bi-invariant metric but not a bi-invariant Riemannian metric. (Hint: (-: -:,) has no square root in $SL(2, R)$.)

9. Let $G$ consist of all elements of $GL(3, R)$ of the form

(‘i p)

with a > 0.

Prove (a) $G$ is a closed subgroup of $GL(3, R)$; (b) $G$ is nonabelian (hence $\mathfrak{g}$ is nonabelian); (c) $G$ does not admit a bi-invariant semi-Riemannian metric.

10. Let $M = G/H$ be a symmetric space. Prove: (a) If $\alpha$ is the one-parameter subgroup of $G$ associated with $X \in \mathfrak{m}$, then $d\tau_{\alpha(s)}: T_o M \to T_{\gamma(s)}M$ is parallel translation along the geodesic $\pi \circ \alpha = \gamma$. (Hint: Use Lemma 8.30.) (b) The Levi-Civita connection of $M$ is independent of the choice of $G$-invariant metric on $M$.

11. (a) Show that if a curve $\gamma$ in a symmetric space is reversed by the global symmetry $\zeta$ at each of its points, then $\gamma$ is a geodesic. (b) Deduce Proposition 31(1).

12. Find an almost complex structure on GT, (Example 39) making it a Kähler manifold.

13. (a) $GL^+(n, R)$ is diffeomorphic to $SL(n, R) \times R^1$. (b) $SL(n, R)$ is diffeomorphic to $SO(n) \times R^d$, where $d = (n + 2)(n - 1)/2$.

14. Prove: (a) In Exercise 1(b), $\pi_1(G)$ is isomorphic to the kernel of $k$. (Hint: See proof of Proposition 7.4.) (b) If $N$ is a discrete normal subgroup of a connected Lie group $G$, then $gn = ng$ for all $n \in N$, $g \in G$. (c) The fundamental group of a Lie group is abelian.

15. Given symmetric data $(G/H, \sigma, B)$ and $(G'/H', \sigma', B')$, let $\phi: G \to G'$ be a Lie group isomorphism such that $\phi(H) = H'$, $\phi \circ \sigma = \sigma' \circ \phi$, and $d\phi: \mathfrak{m} \to \mathfrak{m}'$ (well defined) preserves the scalar products $B$, $B'$. Prove that there is a unique isometry $\mu: M \to M'$ of the resulting symmetric spaces such that $\mu \circ \pi = \pi' \circ \phi$.

16. Prove that if $(G/H, \sigma, B)$ is symmetric data, then so is $(G/H_0, \sigma, B)$, and the natural map $G/H_0 \to G/H$ of the resulting symmetric spaces is a semi-Riemannian covering.

17. For $G_{pq}$ as in Example 39, the map $W \to W^\perp$ is an isometry. (Hint: Derive one isometry from Exercise 15 and follow it with a suitable $\tau_g$.)

18. If $G$ is a connected Lie group acting transitively on a set C, analyze the effect of a change of origin $o$ on $G/H$, where the isotropy group $H$ of $o$ is assumed closed. Then show that $G_{pq}$ is unchanged, as a Riemannian manifold, if the origin $V_o$ is taken to be last $p$ coordinate space instead of first $p$ coordinate space.

19. Represent S: and H : as dual symmetric spaces.

20. (a) Establish $\Sigma(p, q) = GL^+(n, R)/SO(p, q)$ and $U(p, q)/SO(p, q)$ as dual symmetric spaces. (b) Identify $\Sigma(p, q)$ as the space of all scalar products of index $p$ on $R^n$, $n = p + q$. (c) In the Riemannian case identify $P = \Sigma(n, 0) = \Sigma(0, n)$ with the space of all positive definite symmetric $n \times n$ matrices, and prove $P$ diffeomorphic to $R^{n(n+1)/2}$. (d) Use Exercise 16 to show that in the indefinite case, $\Sigma$ is not simply connected.

12

GENERAL RELATIVITY; COSMOLOGY

While special relativity remains an entirely satisfactory theory within its range of applicability, there is no way for it to encompass gravity. Newton’s law of gravitation, involving space but not time, was lost with the relativistic union of these notions. In the years after 1905, Einstein became convinced that gravitation should be expressed in terms of curvature. By 1915 he had found out how to do it, and in the general theory of relativity flat Minkowski spacetime gives way to spacetimes of arbitrary curvature. This chapter is a brief account of the fundamentals of general relativity and of the remarkable information it gives about the origin and development of the universe. We leave for the next chapter its equally fruitful application to the neighborhood of a single star.

FOUNDATIONS

General relativity models aspects of the physical universe by spacetimes, that is, time-oriented connected four-dimensional Lorentz manifolds. In this section we describe informally some of its fundamentals, starting with the most obvious.

A. Special Relativity Is a Special Case of General Relativity.

In fact special relativity is the general relativity of Minkowski spacetime $M$, which by Corollary 8.24 is the unique complete, flat, simply connected, Lorentz manifold of dimension 4. Thus any feature of general relativity can be tested in Minkowski spacetime, and, reciprocally, it is natural to carry over to the general theory those features that do not depend on the distinctive properties of $M \approx R^4_1$. In particular, events, material and lightlike particles, proper time, energy-momentum, and observers are defined just as before. In sufficiently large regions of the universe, gravity can scarcely be ignored. Thus special relativity is at best a local theory; it is the flatness of $M$ that is physically significant, not its global properties of completeness and simple connectedness. Indeed, by basing itself on arbitrary Lorentz manifolds, the general theory opens the way to the study of global questions.

B. General Relativity Is Locally Approximated by Special Relativity.

If $p$ is an event in a spacetime $M$, then special relativity makes sense in the tangent space $T_p(M) \approx R^4_1$, and the exponential map $\exp_p$ provides a comparison. We have seen that on sufficiently small neighborhoods, and so long as curvature does not intrude, $T_p(M)$ is a good geometric approximation of $M$. Historically this approximation, often in the form of normal coordinates, has been essential in attaching physical significance to the geometry of $M$.

In particular, as in special relativity, we call a timelike future-pointing unit vector $u \in T_p(M)$ an instantaneous observer at $p$. The orthogonal decomposition $T_p(M) = Ru + u^\perp$ splits the tangent space into the observer's time axis $Ru$ and restspace $u^\perp$. As before, if $\alpha$ is a particle with $\alpha(t_0) = p$, then $\alpha'(t_0) = au + x$, $x \in u^\perp$, and correcting $x$ by the time dilation $a$ gives the instantaneous velocity $x/a$ of $\alpha$ as measured by $u$. (Thus the speed $|x|/a$ is 1 for light and less for material particles, as usual.) Similarly, if $P$ is the energy-momentum of $\alpha$ at $p$, the expression $P = Eu + \vec P$, $\vec P \in u^\perp$, gives the energy $E$ and momentum $\vec P$ of $\alpha$ at $p$ as measured by $u$.

As this terminology shows, Newtonian physics also bears comparison with general relativity. Roughly speaking, the scope of the three theories is as given in the accompanying table.

                      Gravitation   Speeds
General relativity    Arbitrary     Arbitrary
Special relativity    Negligible    Arbitrary
Newtonian physics     Weak          Low

When the data meet Newtonian limitations, general relativity gives approximately Newtonian results. (The next chapter will provide the classic example.)
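The observer's splitting of a particle's velocity vector can be illustrated in Minkowski coordinates. The sketch below assumes the signature $(-, +, +, +)$ and geometric units with the speed of light equal to 1; the function names and the sample vectors are illustrative choices, not taken from the text.

```python
import math

def lorentz(v, w):
    """Minkowski scalar product with signature (-, +, +, +)."""
    return -v[0] * w[0] + sum(v[i] * w[i] for i in range(1, 4))

def measure(u, alpha_prime):
    """Split alpha' = a u + x with x orthogonal to u; return (a, x, speed)."""
    a = -lorentz(alpha_prime, u)                 # since <u, u> = -1 and <x, u> = 0
    x = [alpha_prime[i] - a * u[i] for i in range(4)]
    speed = math.sqrt(max(lorentz(x, x), 0.0)) / a
    return a, x, speed

rest_observer = (1.0, 0.0, 0.0, 0.0)
moving_observer = (1.25, 0.75, 0.0, 0.0)         # unit timelike: -1.25^2 + 0.75^2 = -1
material_particle = (1.25, 0.75, 0.0, 0.0)       # 4-velocity with coordinate speed 0.6
light_particle = (1.0, 1.0, 0.0, 0.0)            # null vector
```

With these inputs, `measure(rest_observer, material_particle)` gives time dilation $a = 1.25$ and speed $0.6$, while both observers measure speed 1 for the lightlike particle even though their values of $a$ differ.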

C. Gravity Dominates in the Large.

Among nuclear binding forces, electromagnetism, and gravity, the latter is weakest. However, the range of nuclear forces is so small that they are ignored a priori by a theory that models the real world by smooth manifolds. At a somewhat larger scale, the electromagnetic repulsion between two electrons is enormously larger than the gravitational attraction. But charge appears with both plus and minus signs: electromagnetism can attract as well as repel, while gravity can only attract. Thus in aggregates electromagnetic effects can cancel, but gravitational effects accumulate. In fact, at larger scales, gravity is utterly dominant. Although electromagnetism fits into its framework, the essential function of general relativity is to give a spacetime explanation of gravity.

D. Free Fall Is Geodesic; Matter Curves Spacetime.

Free fall is motion solely under the influence of gravity. Newtonian physics distinguishes two cases: accelerating and nonaccelerating. For example, consider two identical spaceships: $S_1$ coasting in at an angle toward a giant star (idealized as having infinite radius); $S_2$ also in free fall but far from the nearest galaxy. According to Newton, $S_1$, accelerating under the gravitational attraction of the star, follows a curved orbit in space and hence has a curved worldline in Newtonian space-time (6.5). By contrast, $S_2$ moves in a straight line at constant speed, hence its worldline is straight, that is, geodesic. But if the ships are sealed, neither crew can experimentally determine which ship it is in (principle of equivalence). In both, undisturbed objects appear to be at rest and, if pushed, their relative motions do not differ from ship to ship. As in the previous Newtonian dichotomy, rest versus constant velocity, Einstein refused to accept a theoretical distinction between states not experimentally different, and boldly declared that all free fall is geodesic in spacetime. The gravitational effect of the star is not to bend the worldline of $S_1$ but to bend the spacetime in which it is geodesic. (Though the worldline of $S_1$ is "straight," its orbit in space as perceived by observers on the star is curved.)

E. Gravity as Curvature.

If the star in the preceding section is, more realistically, a uniform round ball, then there is a simple experiment enabling the sealed crews to distinguish $S_1$ from $S_2$. It suffices to release a few pebbles at rest in each ship. In $S_2$ they will remain at rest, but in $S_1$ they will move. Suppose for simplicity that $S_1$ is falling directly toward the star and that the pebbles are arranged as in Figure 1. Then $a$ and $c$ move inward toward $o$, while $b$ and $d$ move outward away from $o$. The Newtonian explanation is that $a$ and $c$ are falling directly toward the center of the star, while by the inverse-square gravitational law, $d$ accelerates toward the star more rapidly than $o$, and $b$ correspondingly lags.

Figure 1. Relative accelerations observed by o

In the relativistic explanation (Chapter 13) the freely falling particles are modeled by timelike geodesics in spacetime. As discussed in Chapter 8, the relative positions of neighbors of the pebble $\gamma_o$ are given by Jacobi fields $Y$ on $\gamma_o$. Changes in relative position result from relative acceleration $Y''$, which by the Jacobi equation is $R_{Y\gamma'}(\gamma')$. In this way curvature, in its role as tidal force, replaces the Newtonian notion of gravitation. In general, an instantaneous observer $u \in T_p(M)$ measures gravity by the tidal force operator $F_u: u^\perp \to u^\perp$ (Definition 8.8).
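The Newtonian side of the pebble experiment is easy to reproduce numerically. The sketch below makes several assumptions that are not in the text: a point-mass star at the origin with $GM = 1$, the ship's center $o$ at distance $R = 10$ on the $x$-axis, and pebbles displaced by $\varepsilon = 0.01$, with $d$ nearer the star, $b$ farther, and $a$, $c$ displaced sideways (a guess at the layout of Figure 1). It computes each pebble's acceleration relative to $o$, projected on the direction pointing away from $o$.

```python
import math

GM = 1.0  # assumed units in which the star's GM equals 1

def gravity(pos):
    """Inverse-square acceleration toward a point mass at the origin."""
    x, y = pos
    r3 = math.hypot(x, y) ** 3
    return (-GM * x / r3, -GM * y / r3)

def outward_relative_acceleration(center, pebble):
    """Acceleration of pebble relative to center, along the direction away from center."""
    g_c, g_p = gravity(center), gravity(pebble)
    rel = (g_p[0] - g_c[0], g_p[1] - g_c[1])
    dx, dy = pebble[0] - center[0], pebble[1] - center[1]
    dist = math.hypot(dx, dy)
    return (rel[0] * dx + rel[1] * dy) / dist

R, eps = 10.0, 0.01
o = (R, 0.0)
pebbles = {
    "a": (R, eps),          # sideways displacement
    "b": (R + eps, 0.0),    # farther from the star than o
    "c": (R, -eps),         # sideways displacement
    "d": (R - eps, 0.0),    # nearer the star than o
}
tidal = {name: outward_relative_acceleration(o, pos) for name, pos in pebbles.items()}
```

The signs in `tidal` match the description: positive (outward) for $b$ and $d$, negative (inward) for $a$ and $c$. To first order in $\varepsilon$ the magnitudes are $2GM\varepsilon/R^3$ along the line to the star and $GM\varepsilon/R^3$ across it, the familiar tidal stretching and squeezing.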

F. Sources of Gravity. “Matter” is an undefined term that we use intuitively to mean all the stuff of the universe. In Newtonian physics the unique source of gravitation is the mass of matter. Relativistically, gravitation springs from the energy-momentum of matter, to which mass is but one contributor. For a particular form of matter modeled in a spacetime $M$ the energy-momentum content is described infinitesimally by a stress-energy tensor field $T$ on $M$. Lacking a general definition of matter, there can be no general recipe for constructing $T$, but there are some empirical rules. Let $u$ be an instantaneous observer at $p \in M$. On $u^\perp$ the spatial part of $T$ typically generalizes the classical stress tensor as measured by $u$; like it, $T$ is a symmetric (0, 2) tensor. The energy density measured by $u$ is $T(u, u)$ and for known forms of matter is nonnegative. Finally, conservation of energy-momentum is expressed infinitesimally by $\operatorname{div} T = 0$. The terms infinitesimal and instantaneous in the discussions above signal the replacement of global action-at-a-distance Newtonian gravity by a direct-contact differential version.


Two spacetimes on which matter has been modeled are physically equivalent provided that there is an isometry of the spacetimes that preserves the matter models. Geometric units are used (Remark 6.7).

THE EINSTEIN EQUATION

Matter is gravitationally significant only as a carrier of energy-momentum, so for its effect as a source of gravitation (alias curvature) we must look to the stress-energy tensor $T$. But how is $T$ related to the curvature tensor? A brief sketch, in his own words, of Einstein’s struggles to answer this question appears in Section 17.7 of [MTW]. Always demanding simplicity, he proposed the formula $G = kT$, where $G$ is some variant of Ricci curvature and $k$ is a constant. Einstein tried several possibilities for $G$, testing them notably on the problem of the precession of the perihelion of Mercury. The hard work having been done, it is by now easy to pick an obvious candidate. If $\operatorname{div} T$ is to vanish, then $G = kT$ implies $\operatorname{div} G = 0$. But for Ricci curvature, Proposition 3.54 asserts $\operatorname{div} \operatorname{Ric} = \tfrac12\, dS$. Thus subtracting half the scalar curvature $S = C(\operatorname{Ric})$, times $g$, from $\operatorname{Ric}$ produces a good result.

1. Definition. The Einstein gravitational tensor of a spacetime $M$ is
$$G = \operatorname{Ric} - \tfrac12 S g.$$

2. Lemma. (1) $G$ is a symmetric (0, 2) tensor field with divergence zero. (2) $\operatorname{Ric} = G - \tfrac12 C(G) g$.

Proof. (1) Both $\operatorname{Ric}$ and $g$ are symmetric (0, 2) tensors, hence $G$ is. A direct computation (Exercise 3.10) shows that $\operatorname{div}(Sg) = dS$. Then
$$\operatorname{div} G = \operatorname{div}(\operatorname{Ric} - \tfrac12 S g) = \tfrac12 (dS - dS) = 0.$$

(2) Since $C(g) = 4$, $C(G) = C(\operatorname{Ric}) - \tfrac12 S\, C(g) = -S$. Thus the definition of $G$ gives
$$\operatorname{Ric} = G + \tfrac12 S g = G - \tfrac12 C(G) g.$$
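Part (2) of the lemma says that the passage from Ric to $G$ is undone by applying the same operation again. A minimal Python sketch of this pointwise algebra follows; it assumes an orthonormal frame at a single point, so that $g$ has components $\mathrm{diag}(-1, 1, 1, 1)$, and the sample symmetric matrix `ric` is arbitrary illustrative data, not the Ricci tensor of any particular spacetime. The helper names are invented here.

```python
# Components of g in an orthonormal frame at one point (assumed for illustration).
G_METRIC = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def contract(tensor, metric=G_METRIC):
    """Metric contraction C(A) = g^{ij} A_{ij}, valid for a diagonal metric with entries +-1."""
    return sum(tensor[i][i] / metric[i][i] for i in range(4))

def trace_reverse(tensor, metric=G_METRIC):
    """Return A - (1/2) C(A) g, the operation taking Ric to G."""
    c = contract(tensor, metric)
    return [[tensor[i][j] - 0.5 * c * metric[i][j] for j in range(4)] for i in range(4)]

# Arbitrary symmetric sample data standing in for Ric at a point.
ric = [[2.0, 0.5, 0.0, -1.0],
       [0.5, 1.0, 0.25, 0.0],
       [0.0, 0.25, -3.0, 0.75],
       [-1.0, 0.0, 0.75, 5.0]]

einstein = trace_reverse(ric)        # G = Ric - (1/2) S g
recovered = trace_reverse(einstein)  # G - (1/2) C(G) g

print(contract(einstein) == -contract(ric))  # C(G) = -S
print(recovered == ric)                      # Ric is recovered from G
```

The sample entries are multiples of 1/4, so the floating-point comparisons are exact here; for general data a tolerance would be needed.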

By (2), the Einstein tensor and the Ricci tensor contain exactly the same information. In fact, since S = C(Ric), each has the same formal expression in terms of the other. General relativity flows from the following law.


3. The Einstein Equation. If $M$ is a spacetime containing matter with stress-energy tensor $T$, then
$$G = 8\pi T,$$

where $G$ is the Einstein gravitational tensor.

Many arguments have been advanced to lend plausibility to the Einstein equation, based usually on comparison with Newtonian physics at low speeds and weak gravitation. (The universal constant $k = 8\pi$ is determined from such comparisons; see [HE, Section 3.4] or [MTW, p. 406].) But general relativity cannot be derived; like Newton’s gravitational and motion laws, it is Einstein’s assertion of how the macroscopic universe works. Tested experimentally in a variety of situations it has given accurate results, notably for cases in which Newtonian physics is inaccurate or inapplicable. The Einstein equation implies that the stress-energy tensor is a symmetric (0, 2) tensor with divergence zero. As we shall see, the equation $\operatorname{div} T = 0$ gives dynamical laws for the matter that produces $T$. Roughly speaking,

$G = 8\pi T$ tells how matter determines Ricci curvature;
$\operatorname{div} T = 0$ tells how Ricci curvature moves this matter.

But recall that, as in Section E, the full curvature tensor controls the relative motion of test particles, that is, those whose energy-momentum makes negligible contribution to the stress-energy tensor. If $T = 0$, that is, if $M$ is Ricci flat, then $M$ is said to be a vacuum (or empty).

PERFECT FLUIDS

The flow of a fluid could be described literally by a vast swarm of particles in a spacetime $M$. Instead of this discrete model it is easier to deal with a smooth model, where the 4-velocity of the flow is given by a timelike unit vector field $U$ on $M$. Intuitively, the integral curves of $U$ are the average worldlines of the “molecules” of the fluid. The classical stress tensor measures internal forces in a body in space by giving at each point $m$ the forces across all surface elements through $m$. To motivate Definition 4 we apply this scheme heuristically to the spacetime flow of a so-called perfect fluid. Fix $m \in M$. If $v \in T_m(M)$ is a unit vector, then in a hypersurface through $m$ perpendicular to $v$ let $B(v)$ be a small coordinate cube (“box”) centered at $m$.


Let $P_v^+$ be the total energy-momentum of molecules in $B(v)$ that are crossing from the $-v$ to the $v$ side of $B(v)$; correspondingly let $P_v^-$ derive from those crossing from $v$ to $-v$. Then for another unit vector $w \in T_m(M)$, let $T(v, w)$ be the limit as $\operatorname{vol} B(v) \to 0$ of the $w$ component of $P_v = P_v^+ - P_v^-$. Now let $u = U_m$, and consider the following choices of $v$, $w$.

(1) $T(u, u) = \rho(m)$, energy density at $m$.

The infinitesimal observer $u$, riding with the flow, can consider $B(u)$ as a local restspace since its tangent space at $m$ is $u^\perp$. Then $P_u^- = 0$, and the usual decomposition $P_u^+ = P_u = E_u u + \vec{P}_u$ gives the energy $E_u$ and momentum $\vec{P}_u$ of the box $B$ as measured by $u$. Then by the definition above,
$$T(u, u) = \lim_{\operatorname{vol} B \to 0} \frac{E_u}{\operatorname{vol} B}.$$

Clearly this is the energy density of the fluid as measured by u.

(2) If $x, y \in u^\perp$, then $T(x, y) = \mathfrak{p}(m)\langle x, y\rangle$, where $\mathfrak{p}(m)$ is the pressure at $m$.

Since $x \perp u$, the box $B(x)$ is a three-dimensional spacetime. Let $B(x) = C \times I$, where $C$ is a spacelike patch of surface through $m$ and $I$ is a time interval of length $\Delta t$ (see Figure 2). Then if $A$ is the area of $C$,
$$T(x, y) = \lim_{A \to 0} \frac{1}{A} \lim_{\Delta t \to 0} \frac{\langle P_x, y\rangle}{\Delta t}.$$
Here $P_x = P_x^+ - P_x^-$, and a molecule of fluid contributes to $P_x^+$ if it crosses $C$ from $-x$ to $+x$ during the time interval $I$. Force is the time rate of change of momentum, hence the time limit above is the force $F^+$ exerted by the $-x$ side of $C$ on the $+x$ side, minus the reverse $F^-$ (but $-F^- = F^+$).

Figure 2. Here dimensions are reduced, since actually $\dim C = 2$ and $\dim B(x) = \dim B(u) = 3$.


The second limit shows that $T(x, y)$ is the $y$ component of stress (force per unit area) across $C$ in the $x$ direction. Thus $T$ restricted to $u^\perp$ is the classical stress tensor measured by $u$ in his restspace $u^\perp \approx B(u)$. Since a perfect fluid cannot support shears, the stress above can be written as $\mathfrak{p}_x x$. The pressure $\mathfrak{p}_x$ is the same in all space directions, hence $T(x, y) = \mathfrak{p}(m)\langle x, y\rangle$ (valid for all $x, y \in u^\perp$ since $T$ is bilinear).

(3) If $x \in u^\perp$, then $T(x, u) = T(u, x) = 0$.

Reasoning as in case (2) shows that T(x, u ) gives, for u, the energy flow across a patch of spacelike surface perpendicular to x, and T(u, x) measures the density of x-momentum. For a perfect fluid, both are zero. This discussion can be summarized rigorously as follows.

4. Definition. A perfect fluid on a spacetime $M$ is a triple $(U, \rho, \mathfrak{p})$ where:
(1) $U$ is a timelike future-pointing unit vector field on $M$ called the flow vector field.
(2) $\rho \in \mathfrak{F}(M)$ is the energy density function; $\mathfrak{p} \in \mathfrak{F}(M)$ is the pressure function.
(3) The stress-energy tensor is
$$T = (\rho + \mathfrak{p})\, U^* \otimes U^* + \mathfrak{p}\, g,$$

where $U^*$ is the one-form metrically equivalent to $U$.

Evidently this formula for $T$ is equivalent to the three equations found above for $X, Y \perp U$, namely:
$$T(U, U) = \rho, \qquad T(X, U) = T(U, X) = 0, \qquad T(X, Y) = \mathfrak{p}\langle X, Y\rangle.$$
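The equivalence of the tensor formula with the three component equations can be spot-checked at a single point. The Python sketch below is an illustration only: it assumes an orthonormal frame in which the metric is $\mathrm{diag}(-1, 1, 1, 1)$, and the boosted flow vector, the values $\rho = 2$, $\mathfrak{p} = 0.5$, and all function names are hypothetical choices made for the check.

```python
import math

ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal of the metric in an orthonormal frame (assumed)

def inner(v, w):
    """Scalar product <v, w> for the diagonal metric ETA."""
    return sum(ETA[i] * v[i] * w[i] for i in range(4))

def stress_energy(v, w, flow, rho, pressure):
    """T(v, w) = (rho + p) U*(v) U*(w) + p <v, w>, where U*(v) = <U, v>."""
    return (rho + pressure) * inner(flow, v) * inner(flow, w) + pressure * inner(v, w)

# Hypothetical data: a flow boosted along the first space axis with speed 0.6.
speed = 0.6
gamma = 1.0 / math.sqrt(1.0 - speed ** 2)
U = [gamma, gamma * speed, 0.0, 0.0]   # unit timelike: <U, U> = -1
X = [gamma * speed, gamma, 0.0, 0.0]   # unit spacelike, orthogonal to U
Y = [0.0, 0.0, 1.0, 0.0]               # unit spacelike, orthogonal to U and X
rho, p = 2.0, 0.5

print(stress_energy(U, U, U, rho, p))  # should reproduce rho
print(stress_energy(X, U, U, rho, p))  # should vanish
print(stress_energy(X, X, U, rho, p))  # should reproduce p <X, X> = p
print(stress_energy(X, Y, U, rho, p))  # should vanish, since <X, Y> = 0
```

Because $U^*(X) = \langle U, X\rangle = 0$ for $X \perp U$, only the $\mathfrak{p}\,g$ term survives on $U^\perp$, which is what the last two lines exhibit.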

For a perfect fluid the vanishing of the divergence of the stress-energy tensor has the following consequence.

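5. Proposition. If $(U, \rho, \mathfrak{p})$ is a perfect fluid,
(1) $U\rho = -(\rho + \mathfrak{p}) \operatorname{div} U$ (energy equation),
(2) $(\rho + \mathfrak{p}) D_U U = -\operatorname{grad}_\perp \mathfrak{p}$ (force equation),
where the spatial pressure gradient $\operatorname{grad}_\perp \mathfrak{p}$ is the component of $\operatorname{grad} \mathfrak{p}$ orthogonal to $U$.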

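Proof. If $\tilde{T}$ is the (2, 0) tensor field metrically equivalent to $T$, it is easy to check that $\operatorname{div} T = 0$ is equivalent to $\operatorname{div} \tilde{T} = 0$. Writing $\tilde{T}$ in terms of coordinates, $T^{ij} = (\rho + \mathfrak{p}) U^i U^j + \mathfrak{p} g^{ij}$.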


The divergence is then
$$\sum_j T^{ij}{}_{;j} = \sum_j \left\{ (\rho + \mathfrak{p})_{;j} U^i U^j + (\rho + \mathfrak{p}) U^i{}_{;j} U^j + (\rho + \mathfrak{p}) U^i U^j{}_{;j} \right\} + \sum_j \mathfrak{p}_{;j}\, g^{ij}.$$
Expressed invariantly this is the vector field
$$\operatorname{div} \tilde{T} = U(\rho + \mathfrak{p})\, U + (\rho + \mathfrak{p}) D_U U + (\rho + \mathfrak{p})(\operatorname{div} U)\, U + \operatorname{grad} \mathfrak{p}.$$
But $\operatorname{div} \tilde{T} = 0$, and since $U$ is a unit vector field, $D_U U \perp U$. Hence equation (2) is obvious, and $\langle \operatorname{div} \tilde{T}, U\rangle = 0$ gives equation (1). ∎

Evidently the first equation is a formula for the time rate of change of energy density as measured by $U$. The second equation is an analogue of Newton's $F = ma$, with force replaced by spatial pressure gradient, and mass replaced by $\rho + \mathfrak{p}$, while $D_U U \perp U$ is indeed the spatial acceleration of the molecules of the flow as self-measured. (Compare with classical hydrodynamics.)

The definition of perfect fluid does not tell how to construct a spacetime model of one. Model-building in general relativity is more subtle than in Newtonian physics or special relativity. The latter two theories have fixed universal geometries, $\mathbf{R}^1 \times \mathbf{R}^3$ and $M \approx \mathbf{R}^4_1$, respectively; for a given problem, matter is modeled in that manifold and physical laws then govern its behavior. But in general relativity there can be no universal a priori geometry, since for any spacetime $M$ the Einstein equation already determines the stress-energy tensor; specifically, $T = (1/8\pi)(\operatorname{Ric} - \tfrac12 S g)$.

Thus a given spacetime can be used to model matter only in the unlikely case that $T$ happens to be a physically realistic stress-energy tensor. Schematically,
$$\text{spacetime geometry} \longrightarrow \text{Ricci curvature} \overset{\text{Einstein equation}}{\longleftrightarrow} \text{stress-energy tensor} \overset{\text{physics}}{\longleftarrow} \text{matter}.$$

Hence relativistic model-building is a kind of nontrivial matching process. Since the tidal forces $F_v(y) = R_{yv}v$ in a spacetime $M$ describe gravitation, a natural way to express the empirical fact that gravity attracts is $\langle F_v(y), y\rangle \le 0$; that is, $K(\Pi) \le 0$ for all timelike tangent planes. Considerably weaker is the timelike convergence condition,
$$\operatorname{Ric}(v, v) \ge 0 \quad \text{for all timelike tangent vectors to } M,$$
which says merely that, on average, gravity attracts. (By continuity the inequality holds also for null vectors.) We say the condition holds strictly if $\ge$ is replaced by $>$.


In terms of the stress-energy tensor of $M$ the timelike convergence condition becomes
$$T(v, v) \ge \tfrac12 C(T)\langle v, v\rangle$$
for all timelike (and null) tangent vectors to $M$. Here $T(v, v)$ is energy density, and in this form the condition is also called the strong energy condition on $M$. For example, see Exercise 10.
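For a perfect fluid the condition can be made explicit. The following worked computation is a sketch added for illustration, using only Definition 4; the decomposition $v = aU + bx$, with $x$ a unit vector orthogonal to $U$ and $b^2 \le a^2$, is notation introduced here.

```latex
\begin{aligned}
C(T) &= (\rho + \mathfrak{p})\langle U, U\rangle + 4\mathfrak{p} = 3\mathfrak{p} - \rho, \\
T(v, v) &= (\rho + \mathfrak{p})\,a^2 + \mathfrak{p}\,(b^2 - a^2) = a^2\rho + b^2\mathfrak{p}, \\
T(v, v) - \tfrac12 C(T)\langle v, v\rangle
  &= a^2\rho + b^2\mathfrak{p} - \tfrac12 (3\mathfrak{p} - \rho)(b^2 - a^2)
   = \tfrac12\left[ a^2(\rho + 3\mathfrak{p}) + b^2(\rho - \mathfrak{p}) \right].
\end{aligned}
```

The last expression is linear in $b^2/a^2 \in [0, 1]$, so it is nonnegative for all timelike and null $v$ exactly when it is nonnegative at the two endpoints, that is, when $\rho + 3\mathfrak{p} \ge 0$ and $\rho + \mathfrak{p} \ge 0$.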

ROBERTSON-WALKER SPACETIMES

In the seventeenth century Newton extended the range of physics from the earth to the solar system. Two and a half centuries later, relativity theory, supported by progress in astronomy, began studying the largest physical system: the universe. Astronomical evidence indicates that the universe can be modeled (in smoothed, averaged form) as a spacetime containing a perfect fluid whose “molecules” are the galaxies. At present, the dominant contribution to the energy density of the galactic fluid is the mass of the galaxies, with a much smaller pressure due mostly to radiation. The decisive fact is that no large asymmetry has been observed in the distribution of the galaxies. They come in clusters, and these clusters in clusters, but at the large scale appropriate to cosmology, the universe, viewed from our galaxy, looks the same in all directions. What evidence there is supports the hypothesis that the same isotropy holds for all galaxies, and it thus becomes possible to build quite simple cosmological models whose gross properties have a reasonable chance of being physically realistic. At the very least, such models provide a testing ground for further study.

We start with a smooth manifold $M = I \times S$, where $I$ is a (possibly infinite) open interval in $\mathbf{R}^1$ and $S$ is a connected three-dimensional manifold. Let $t$ and $\sigma$ be the projections onto $I$ and $S$, respectively. The lines $I \times p$ will be the worldlines of the galactic flow. Let $U = \partial_t$ be the lift to $I \times S$ of the standard vector field $d/dt$ on $I \subset \mathbf{R}^1$, and for each $p \in S$ parametrize $I \times p$ by $\gamma_p(t) = (t, p)$. Since $U$ gives the velocity of each such “galaxy” $\gamma_p$, they are its integral curves. Thus the function $t$ will give the common proper time of all galaxies. Holding $t$ constant gives the hypersurface
$$S(t) = t \times S = \{(t, p) : p \in S\}.$$
As usual, the lift $h(t, p) = h(t)$ of a function $h \in \mathfrak{F}(I)$ is again denoted by $h$, and we write $h'$ for $Uh = dh/dt$.


The geometry of the model will follow from physical assumptions about the galactic flow. Each $\gamma_p$ is (potentially, at least) a particle with proper time $t$, hence

(a) $\langle U, U\rangle = -1$.

As might be expected from isotropy, the relative motion of the actual galaxies is negligible on large-scale average. Thus we take each slice $S(t)$ to be a common restspace for their idealizations $\gamma_p$, requiring

(b) $U \perp S(t)$ for all $t \in I$.

Hence each such slice becomes a Riemannian (i.e., spacelike) hypersurface. We formalize the isotropy condition “all spatial directions the same” in local form as follows: Each $(t, p)$ has a neighborhood $\mathcal{N}$ such that, given unit tangent vectors $x$, $y$ to $S(t)$ at $(t, p)$, there is a galaxy-preserving isometry $\phi = \mathrm{id} \times \phi_S$ of $\mathcal{N}$ such that $d\phi(x) = y$.

6. Proposition. Under the conditions above, in $I \times S$

(1) each slice $S(t)$ has constant curvature $C(t)$;
(2) for any $s, t \in I$ the natural map $\mu(s, p) = (t, p)$ from $S(s)$ to $S(t)$ is a homothety.

Proof. (1) In each tangent space to $S(t)$ any plane $\Pi$ is $x^\perp$ for some unit vector $x$. Thus for any two such planes there is an isotropy isometry such that $d\phi(\Pi) = \Pi'$. The restriction of $\phi = \mathrm{id} \times \phi_S$ to $\mathcal{N} \cap S(t)$ is again an isometry, hence $\Pi$ and $\Pi'$ have the same sectional curvature in the geometry of $S(t)$. Thus Schur’s theorem (Exercise 3.21) implies that $S(t)$ has constant curvature.

(2) First we show that each map $\mu$ is conformal; that is, $|d\mu(x)|$ is the same for all unit vectors $x$ tangent to $S(s)$ at $(s, p)$. Any other such unit vector can be expressed as $d\phi(x)$, where $\phi$ is an isotropy isometry. Since $\mu$ and $\phi = \mathrm{id} \times \phi_S$ commute,
$$|d\mu(d\phi(x))| = |d\phi(d\mu(x))| = |d\mu(x)|.$$
Here we have assumed that $t$ is small enough so that $(t, p)$ is in the domain of $\phi$, but a finite iteration will give this result for any $t \in I$.

Let $h(s, p, t)$ be the scale factor of $\mu\colon S(s) \to S(t)$ at $(s, p)$, so $h$ is a smooth function on $I \times S \times I$. It suffices to show that $xh = 0$ for any unit vector $x$ tangent to a slice, for then, since $S$ is connected, $h$ depends only on $s$ and $t$. Let $\sigma$ be the geodesic in $S(s)$ with $\sigma(0) = (s, p)$ and $\sigma'(0) = x$. Let $\phi$ be an isotropy isometry such that $d\phi(x) = -x$. Then $d\phi(\sigma'(u)) = -\sigma'(-u)$, and


again since $\phi$ commutes with $\mu$,
$$h(\sigma(u), t) = |d\mu\, \sigma'(u)| = |d\phi\, d\mu\, \sigma'(u)| = |d\mu\, d\phi\, \sigma'(u)| = |d\mu\, \sigma'(-u)| = h(\sigma(-u), t).$$
Hence $h(\sigma(u), t)$ is an even function of $u$, so
$$xh = \frac{d}{du}\, h(\sigma(u), t)\Big|_{u = 0} = 0.$$
Since the homothety $\mu = \mu_{st}$ has scale factor $h(s, t)$, Remark 3.65 gives $h(s, t)^2 C(t) = C(s)$. Since $\mu_{st}$ is a diffeomorphism, $h$ is never zero. Hence the function $C$ maintains the same sign: $k = -1$, $0$, or $1$. ∎

Fix $a \in I$, and define $f(a) > 0$ by $C(a) f(a)^2 = k$. Assign $S$ the metric tensor such that the map $j_a(p) = (a, p)$ from $S$ to $S(a)$ is a homothety of scale factor $f(a)$, and define a function $f \in \mathfrak{F}(I)$ by $f(t) = h(a, t) f(a)$. It follows immediately that

(c) $S$ has constant curvature $k$, and every injection $j_t\colon S \to S(t)$ is a homothety of scale factor $f(t)$.

In particular, the constant curvature of $S(t)$ is $k/f(t)^2$, and for vectors $x$, $y$ tangent to $S(t)$,
$$\langle x, y\rangle = f^2(t)\, (d\sigma(x), d\sigma(y)).$$

Thus the conditions (a), (b), and (c) above express I x S geometrically as a warped product (Definition 7.33) with base I and fiber S.

7. Definition. Let $S$ be a connected three-dimensional Riemannian manifold of constant curvature $k = -1$, $0$, or $1$. Let $f > 0$ be a smooth function on an open interval $I$ in $\mathbf{R}^1_1$. Then the warped product
$$M(k, f) = I \times_f S$$
is called a Robertson-Walker spacetime. (See Figure 3.)

Explicitly, $M(k, f)$ is the manifold $I \times S$ with line element $-dt^2 + f^2(t)\, d\sigma^2$, where $d\sigma^2$ is the line element of $S$ (lifted to $I \times S$). It is time-oriented by requiring that $U = \partial_t$ be future-pointing. The notation $M(k, f)$ is not completely descriptive, but sign $k$ and scale function $f$ are the essential ingredients. We say that the interval $I$ is maximal provided $f$ cannot be extended to a smooth positive function on an interval strictly larger than $I$. The Riemannian manifold $S$ is called the space of $M(k, f)$; it is a scale model of each spacelike slice $S(t)$. Since $S$ is the fiber of $M(k, f)$, we denote its metric tensor as in Chapter 7 by $(\ ,\ )$ and its connection by $\nabla$. The standard choices for $S$ are the complete simply connected ones: $H^3$, $\mathbf{R}^3$, $S^3$, with curvatures $-1$, $0$, $+1$, respectively.


Figure 3

Since $S$ has constant curvature, results from Chapter 8 show that spatially, $M(k, f) = I \times_f S$ is quite homogeneous. If $t \in I$ and $p, q \in S$, there is an isometry $\phi$ from a neighborhood $\mathcal{U}$ of $p$ to a neighborhood $\mathcal{V}$ of $q$; in fact, $d\phi_p$ can be preassigned. Then $\mathrm{id} \times \phi$ is an isometry from $I \times \mathcal{U}$ to $I \times \mathcal{V}$, and if $S$ is one of the standard choices, $\mathrm{id} \times \phi$ can be defined globally. An analysis of the connection and curvature of Robertson-Walker spacetimes follows from general results on warped products. As usual let $\mathfrak{L}(S)$ be the set of all lifts to $M(k, f)$ of vector fields on $S$. Thus $X \in \mathfrak{L}(S)$ if and only if $X$ is everywhere tangent to spacelike slices and is $\sigma$-related to a vector field on $S$. The flow vector field $U = \partial_t$ is a (future-pointing) unit normal to each slice $S(t)$. Thus, as on page 205, nor is orthogonal projection onto the $U$ direction, $\operatorname{nor} W = -\langle W, U\rangle U$, and tan projects orthogonally onto the tangent spaces of the slices.

8. Corollary. If $X, Y \in \mathfrak{L}(S)$ on $M(k, f)$, then
(1) $D_U U = 0$,
(2) $D_U X = D_X U = (f'/f) X$,
(3) $\operatorname{nor} D_X Y = II(X, Y) = \langle X, Y\rangle (f'/f) U$,
(4) $\tan D_X Y$ is the lift of $\nabla_X Y$ on $S$.

Proof. Recall that the slices $S(t)$ are fibers not leaves; thus these equations follow from the correspondingly numbered ones in Proposition 7.35, changing notation in the latter by $X, Y \to U$ and $V, W \to X, Y$. (Note that $\operatorname{grad} f = -f' U$.) However, a direct proof is quite simple; the Koszul formula is needed only to compute $2\langle D_X U, Y\rangle = U\langle X, Y\rangle$. ∎


By (1), each $\gamma_p$ is a geodesic (a direct consequence of isotropy), and (3) gives the shape tensor of the totally umbilic fibers. Similarly, Proposition 7.44 gives this description of Robertson-Walker curvature.

9. Corollary. For vector fields $X$, $Y$, $Z$ on $M(k, f)$ tangent to all slices $S(t)$,
(1) $R_{XY}Z = [(f'/f)^2 + (k/f^2)]\,[\langle X, Z\rangle Y - \langle Y, Z\rangle X]$.
(2) $R_{XU}U = (f''/f) X$.
(3) $R_{XY}U = 0$.
(4) $R_{XU}Y = (f''/f) \langle X, Y\rangle U$.

It then follows that every plane containing a vector of $U$ has curvature $K_U = f''/f$ and every plane tangent to a spacelike slice has curvature $K_\sigma = (f'^2 + k)/f^2$ (not to be confused with its curvature $k/f^2$ in the geometry of $S(t)$). We call these the principal sectional curvatures of $M(k, f)$, and they turn up repeatedly in Robertson-Walker geometry and physics.

10. Corollary. For a Robertson-Walker spacetime $M(k, f)$ with flow vector field $U = \partial_t$:
(1) Ricci curvature is given by
$$\operatorname{Ric}(U, U) = -3f''/f, \qquad \operatorname{Ric}(U, X) = 0,$$
$$\operatorname{Ric}(X, Y) = \{2(f'/f)^2 + 2k/f^2 + f''/f\}\langle X, Y\rangle \quad \text{if } X, Y \perp U.$$
(2) The scalar curvature is
$$S = 6\{(f'/f)^2 + k/f^2 + f''/f\}.$$

These results follow immediately from Corollary 9 and general formulas for Ric and S .
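As a check on the scalar curvature formula, $S$ can be recovered by contracting the Ricci curvature of Corollary 10 over an orthonormal frame $E_0 = U, E_1, E_2, E_3$ with signs $\varepsilon_0 = -1$ and $\varepsilon_j = +1$. The frame notation is introduced here only for this computation.

```latex
\begin{aligned}
S &= \sum_{m=0}^{3} \varepsilon_m \operatorname{Ric}(E_m, E_m)
   = -\operatorname{Ric}(U, U) + \sum_{j=1}^{3} \operatorname{Ric}(E_j, E_j) \\
  &= \frac{3f''}{f} + 3\left\{ 2\left(\frac{f'}{f}\right)^{2} + \frac{2k}{f^{2}} + \frac{f''}{f} \right\}
   = 6\left\{ \left(\frac{f'}{f}\right)^{2} + \frac{k}{f^{2}} + \frac{f''}{f} \right\}.
\end{aligned}
```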

THE ROBERTSON-WALKER FLOW

We show that, for any Robertson-Walker spacetime, the flow given by the vector field $U = \partial_t$ is that of a perfect fluid. Since the stress-energy tensor $T$ of $M(k, f)$ is already determined by the Einstein equation, we must find functions $\rho$ and $\mathfrak{p}$ such that $T$ has the form required by Definition 4.

11. Theorem. If $U$ is the flow vector field on a Robertson-Walker spacetime $M(k, f)$, then $(U, \rho, \mathfrak{p})$ is a perfect fluid with energy density $\rho$ and pressure $\mathfrak{p}$ given by
$$\frac{8\pi\rho}{3} = \left(\frac{f'}{f}\right)^2 + \frac{k}{f^2}, \qquad -8\pi\mathfrak{p} = \frac{2f''}{f} + \left(\frac{f'}{f}\right)^2 + \frac{k}{f^2}.$$

Proof. By the Einstein equation, $T = (1/8\pi)[\operatorname{Ric} - \tfrac12 S g]$. The preceding lemma gives $T(U, X) = 0$, and
$$T(X, Y) = \frac{1}{8\pi}\left[\operatorname{Ric}(X, Y) - \tfrac12 S \langle X, Y\rangle\right] \quad \text{for } X, Y \perp U.$$

This simplifies to $\mathfrak{p}\langle X, Y\rangle$ for $\mathfrak{p}$ as above. If $\rho = T(U, U)$, then $T$ has the required form $(\rho + \mathfrak{p}) U^* \otimes U^* + \mathfrak{p} g$. Computing $T(U, U)$ from Corollary 10 gives the formula for $\rho$. ∎

In particular, $\rho$ and $\mathfrak{p}$ are functions of $t$ only (constant on each spacelike slice $S(t)$), as could be seen directly from local isotropy. In terms of the principal sectional curvatures the equations in the theorem are just
$$K_\sigma = 8\pi\rho/3, \qquad 2K_U + K_\sigma = -8\pi\mathfrak{p}.$$
Eliminating $K_\sigma$ gives a basic relation between the scale function $f$ and pressure and density.

12. Corollary. For a Robertson-Walker perfect fluid,
$$3f''/f = -4\pi(\rho + 3\mathfrak{p}).$$
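The formulas of Theorem 11 and the relation in Corollary 12 are easy to spot-check numerically. The following Python sketch is an illustration only: the function name `density_pressure` and the sample scale function $f(t) = t^{2/3}$ with $k = 0$, evaluated at $t = 2$, are assumptions chosen for the check.

```python
import math

def density_pressure(f, df, d2f, k):
    """Return (rho, p) from Theorem 11, given f, f', f'' at one instant and the sign k."""
    k_sigma = (df ** 2 + k) / f ** 2   # curvature of planes tangent to a slice
    k_u = d2f / f                      # curvature of planes containing U
    rho = 3.0 * k_sigma / (8.0 * math.pi)
    p = -(2.0 * k_u + k_sigma) / (8.0 * math.pi)
    return rho, p

# Sample data (assumed for illustration): k = 0 and f(t) = t**(2/3) at t = 2.
t = 2.0
f = t ** (2.0 / 3.0)
df = (2.0 / 3.0) * t ** (-1.0 / 3.0)
d2f = -(2.0 / 9.0) * t ** (-4.0 / 3.0)

rho, p = density_pressure(f, df, d2f, k=0)
lhs = 3.0 * d2f / f                      # left side of Corollary 12
rhs = -4.0 * math.pi * (rho + 3.0 * p)   # right side of Corollary 12
print(abs(lhs - rhs) < 1e-12, abs(p) < 1e-12)
```

For this particular scale function the pressure vanishes up to rounding, so the sample is pressure-free; any other positive smooth $f$ could be substituted by supplying its first two derivatives.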

Now we turn to the dynamics of the flow, that is, to $\operatorname{div} T = 0$ as expressed for a perfect fluid by Proposition 5. The force equation is trivial, since $D_U U = 0$ and ($\mathfrak{p}$ being constant on slices) $\operatorname{grad}_\perp \mathfrak{p} = 0$. The energy equation takes the following form.

13. Corollary. For a Robertson-Walker perfect fluid
$$\rho' = -3(\rho + \mathfrak{p}) f'/f.$$

Proof. Since $U\rho = \rho'$, it suffices to check that $\operatorname{div} U = 3f'/f$. In terms of a frame field with $E_0 = U$,
$$\operatorname{div} U = \sum_{m=0}^{3} \varepsilon_m \langle D_{E_m} U, E_m\rangle = \sum_{j=1}^{3} \langle D_{E_j} U, E_j\rangle$$
since $D_U U = 0$. By Corollary 8 each of the remaining summands is $f'/f$. ∎


This formula for the time rate of change of energy density can readily be checked intuitively by a Newtonian computation. For any region $B$ in $S$ the galaxies $\gamma_p$ with $p \in B$ give the history through galactic time of the comoving region $B(t) = \{\gamma_p(t) : p \in B\}$ in $S(t)$. Since $S(t)$ is just $S$ scaled by the factor $f(t)$, the volume of $B(t)$ is $f^3(t) \operatorname{vol} B$, hence its total energy is $\rho(t) f^3(t) \operatorname{vol} B$. Because the fluid is perfect, there is no energy flow across the boundary of $B(t)$. Thus the (algebraic) increase in energy $\Delta E$ during a time interval $\Delta t$ equals the work $\Delta W$ done on $B(t)$ by the pressure on its boundary. Taking $B$ for simplicity to be a unit cube, the work done on each pair of opposite faces is
$$\text{force} \times \text{distance} = \text{pressure} \times \text{area} \times \text{distance} = \mathfrak{p}(t) f^2(t)\,(-f'(t)\,\Delta t).$$
(There is a minus sign, since $E$ increases if $f'(t) < 0$.) Thus
$$(\rho f^3)'(t)\,\Delta t \approx \Delta E = \Delta W \approx -\mathfrak{p}(t)\,(f^3)'(t)\,\Delta t.$$
In the limit, $(\rho f^3)' = -\mathfrak{p}(f^3)'$, which is equivalent to the formula in the corollary.
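The energy equation can also be checked numerically on a concrete scale function. The Python sketch below is an illustration under assumptions: it takes $k = 0$ and the hypothetical choice $f(t) = \sqrt{t}$, for which the formulas of Theorem 11 reduce to $\rho = 3/(32\pi t^2)$ and $\mathfrak{p} = \rho/3$, and it approximates $\rho'$ by a central difference. The function names are invented here.

```python
import math

def rho(t):
    """Energy density for the illustrative choice k = 0, f(t) = sqrt(t) (from Theorem 11)."""
    return 3.0 / (32.0 * math.pi * t * t)

def pressure(t):
    """Pressure for the same choice; here it equals rho / 3."""
    return 1.0 / (32.0 * math.pi * t * t)

def energy_equation_residual(t, h=1e-5):
    """rho' + 3 (rho + p) f'/f, with rho' approximated by a central difference."""
    rho_prime = (rho(t + h) - rho(t - h)) / (2.0 * h)
    hubble = 1.0 / (2.0 * t)   # f'/f for f = sqrt(t)
    return rho_prime + 3.0 * (rho(t) + pressure(t)) * hubble

for t in (0.5, 1.0, 3.0):
    print(t, abs(energy_equation_residual(t)) < 1e-8)
```

The residual is not exactly zero because of the finite-difference step, which is why the comparison uses a tolerance rather than equality.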

ROBERTSON-WALKER COSMOLOGY

A Robertson-Walker spacetime gives a relativistic model of the flow of a perfect fluid; to fit such a model to our universe, data is needed about the physical parameters $\mathfrak{p}$, $\rho$, and $f$.

14. Remarks. Astronomical Data. (1) According to Hubble (in 1929), all distant galaxies are now moving away from us at a rate proportional to their distance. For galaxies $\gamma_p$ and $\gamma_q$ the distance between $\gamma_p(t)$ and $\gamma_q(t)$ in $S(t)$ is $f(t)\, d(p, q)$, where $d$ is Riemannian distance in the space $S$. Hubble’s discovery, by current estimates, is that
$$H_0 = \frac{f'(t_0)}{f(t_0)} \quad \text{is} \quad \frac{1}{(18 \pm 2) \times 10^9 \ \text{yr}}.$$
(2) For all known forms of matter, energy density dominates pressure: $\rho > |\mathfrak{p}|$. At the present time, $t_0$, the energy density $\rho_0$ of our universe is estimated to be between … and 5 × … g/cm³. The pressure $\mathfrak{p}_0$ is positive but very much smaller: $\rho_0 \gg \mathfrak{p}_0 > 0$.

In particular, (1) shows that currently $f$ has positive derivative: the spaces $S(t)$ are expanding. This qualitative fact was already enough to destroy the


static models of the universe prevailing in the 1920s, and general relativity deduces the following striking consequence.

15. Proposition. Let $M(k, f) = I \times_f S$. If $H_0 > 0$ for some $t_0$, and $\rho + 3\mathfrak{p} > 0$, then $I$ has an initial endpoint $t_*$ with $t_0 - H_0^{-1} < t_* < t_0$, and either (1) $f' > 0$, or (2) $f$ has a maximum point after $t_0$, and $I$ is a finite interval $(t_*, t^*)$.

Proof. By Corollary 12, $\rho + 3\mathfrak{p} > 0$ implies $f'' < 0$. Thus the graph of $f$ is, except at $t_0$, below that of its tangent line at $t_0$. This line is the graph of $F(t) = f(t_0) + H_0 f(t_0)(t - t_0)$. Thus the Hubble result $H_0 > 0$ shows that as $t$ decreases from $t_0$, the function $f > 0$ must have a singularity at some $t_*$ before reaching the zero $t_0 - H_0^{-1}$ of $F$. Since $f'' < 0$, either $f'$ is always positive on $I$ or $f$ has a maximum point after which $f' < 0$. In the latter case, an argument as before shows that another singularity ensues at $t^* > t_0$. ∎

Taking the estimate of Hubble time $H_0^{-1}$ above, this says that the universe had a definite beginning some ten to twenty billion years ago and, if it does not continue expanding, must, after contracting for a while, come to an end. The result does not say that the universe begins small ($f \to 0$ as $t \to t_*$) or that in the expanding case ($f' > 0$) it endures forever. If the energy density $\rho$ approaches infinity as $t \to t_*$ (or $t^*$), we say that $M(k, f)$ has a physical singularity there.

16. Definition. An initial singularity of $M(k, f)$ at $t_*$ is a big bang provided $f \to 0$ and $f' \to \infty$ as $t \to t_*$. Similarly, a final singularity is a big crunch if $f \to 0$ and $f' \to -\infty$ as $t \to t^*$.

Such singularities are physical, and the converse holds under conditions weaker than those of Remark 14.

17. Theorem. Assume that $M(k, f) = I \times_f S$ has only physical singularities and that $I$ is maximal. If $H_0 > 0$ for some $t_0$, if $\rho > 0$, and for constants $a$ and $A$, $-\tfrac13 < a \le \mathfrak{p}/\rho \le A$, then:
(1) The initial singularity is a big bang.
(2) If $k = 0$ or $-1$, then $I = (t_*, \infty)$ and as $t \to \infty$, $f \to \infty$ and $\rho \to 0$.
(3) If $k = 1$, then $f$ reaches a maximum followed by a big crunch, hence $I$ is a finite interval $(t_*, t^*)$.

Proof. In particular, $\rho + 3\mathfrak{p} \ge \varepsilon\rho > 0$ for some $\varepsilon > 0$. Hence the preceding proposition applies, so $f'' < 0$.


To prove (1), note that $f' > 0$ on the interval $(t_*, t_0)$. Since $\mathfrak{p} \le A\rho$, Corollary 13 gives
$$\rho' \ge -C\rho f'/f, \quad \text{where } C = 3(A + 1) > 2.$$
It follows that $(\rho f^C)' \ge 0$. Hence $\rho f^C \le \rho(t_0) f(t_0)^C$ on $(t_*, t_0)$. By hypothesis $\rho \to \infty$ as $t \to t_*$, hence $f \to 0$. The inequality $\rho - \varepsilon\rho \ge -3\mathfrak{p}$ then in a similar way gives
$$\rho' \le -(2 + \varepsilon)\rho f'/f$$
and hence $(\rho f^{2+\varepsilon})' \le 0$ on $(t_*, t_0)$. Thus $\rho f^{2+\varepsilon} \ge \rho(t_0) f(t_0)^{2+\varepsilon}$ on this interval. As $t \to t_*$ we have $f \to 0$, hence $\rho f^2 \to \infty$. Then by the formula for $\rho$ in Theorem 11, $f'^2 + k \to \infty$, and hence $f' \to \infty$. Thus (1) is proved.

Case I. $f$ has a maximum at say $t_m$. Since $f'(t_m) = 0$, we get $0 < \rho(t_m) = 3k/(8\pi f^2(t_m))$, hence $k = 1$. Since $f'' < 0$, $f'$ will be negative for $t > t_m$. Then arguments corresponding to those for $t_*$ show that the final singularity at $t^* > t_m$ is a big crunch.

Case II. $f$ has no maximum. Then $f' > 0$ on the entire interval $I$, so the results preceding Case I are valid on $I$. The inequalities $\rho > 0$ and $\rho + 3\mathfrak{p} > 0$ imply $\rho + \mathfrak{p} > 0$. Corollary 13 then gives $\rho' < 0$, so there are no physical singularities as $t$ increases. Thus $I = (t_*, \infty)$.

Subcase A. $f \to \infty$ as $t \to \infty$. Since $(\rho f^{2+\varepsilon})' \le 0$, $\rho f^{2+\varepsilon}$ is bounded for $t$ large. Hence $\rho f^2 \to 0$ as $t \to \infty$. Thus $f'^2 + k \to 0$, hence $k = 0$ or $-1$.

Subcase B. $f$ is bounded as $t \to \infty$. (It suffices to show that this is impossible.) As $t \to \infty$, $f \to b$ and, since $f'' < 0$, $f' \to 0$. Hence $\rho f^2 \to 3k/8\pi$. Thus $k = 0$ or $1$. As $t \to \infty$, $\rho f^C$ is nondecreasing, hence $\rho f^2 \not\to 0$, hence $k \ne 0$. Finally, if $k = 1$, then $\rho f^2 \to 3/8\pi$. Thus $\rho \ge \delta$ for some $\delta > 0$. Since $f' \to 0$, there is a sequence $\{t_i\} \to \infty$ such that $\{f''(t_i)\} \to 0$. Hence Corollary 12 gives $\{(\rho + 3\mathfrak{p})(t_i)\} \to 0$. But this contradicts $\rho + 3\mathfrak{p} \ge \varepsilon\rho \ge \varepsilon\delta > 0$. ∎

The theorem predicts that our universe begins in a colossal explosion. The limit time $t_*$ is of course not part of the model (both physics and semi-Riemannian geometry fail at $t_*$), but any fraction of a second later the materials for the construction of the universe are present in the initial fireball. The following examples show that for a Robertson-Walker model, the big bang origin requires the inequalities in the theorem.


18. Example. (1) $M(0, 1 + t^{2/3})$, $t > 0$. We compute
$$\rho = \frac{1}{6\pi\, t^{2/3} (1 + t^{2/3})^2},$$
so 0 is a physical singularity. Also $f' \to \infty$ as $t \to 0$, but obviously $f \to 1$. The upper bound in the theorem fails since $\mathfrak{p}/\rho = 1/(3t^{2/3})$.

(2) $M(0, \sinh^{-1} t)$, $t > 0$. Here
$$\rho = \frac{3}{8\pi (t^2 + 1)(\sinh^{-1} t)^2},$$
so again 0 is a physical singularity. Also $f \to 0$ as $t \to 0$, but $f' \to 1$. The lower bound in the theorem fails. In fact,
$$\frac{\mathfrak{p}}{\rho} = \frac{2t \sinh^{-1} t}{3\sqrt{t^2 + 1}} - \frac13,$$
so $\mathfrak{p}/\rho > -\tfrac13$ holds but not $\mathfrak{p}/\rho \ge a > -\tfrac13$.

Under the hypothesis of the theorem, the ultimate fate of the universe depends on the sign $k$ of spatial curvature; this in turn depends on the present energy density $\rho_0$ and Hubble number $H_0 = f_0'/f_0$.

19. Corollary. Let $\rho_c = 3(H_0)^2/8\pi$, called the critical energy density. If $\rho_0 \le \rho_c$, then $k = 0$, $-1$ (so the universe expands forever). If $\rho_0 > \rho_c$, then $k = +1$ (so the universe eventually collapses).

Proof. By Theorem 11, $\rho - (3H^2/8\pi) = 3k/8\pi f^2$, so $k = \operatorname{sgn}(\rho_0 - \rho_c)$. ∎

Taking the Hubble number $H_0$ to be $(18r \times 10^9\ \text{yr})^{-1}$ with $r \approx 1$, the critical density $\rho_c$ is $(3.68/r^2) \times 10^{-22}\ \text{yr}^{-2}$. For geometrical units $1 = G = 6.67 \times 10^{-8}$ cm³/g sec², and a year contains about $3.16 \times 10^7$ sec, giving
$$\rho_c = (5.5/r^2) \times 10^{-30}\ \text{g/cm}^3.$$

For $r$ near 1 this is within the rather large interval for $\rho_0$ in Remark 14. The estimation for $\rho_0$ is difficult. To obtain accuracy, large astronomical regions must be searched, and many elusive forms of matter may contribute to the extremely small average density. Thus even assuming our universe admits an excellent Robertson-Walker model, a firm prediction of its distant future is not yet possible.
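The unit conversion behind the critical density is simple arithmetic and can be reproduced directly. The Python sketch below is an illustration using only the constants quoted above ($G = 6.67 \times 10^{-8}$ cm³/g sec², $3.16 \times 10^7$ sec per year, Hubble time $18 \times 10^9$ yr with $r = 1$); the function and constant names are invented here.

```python
import math

G_CGS = 6.67e-8             # gravitational constant, cm^3 / (g sec^2), as quoted in the text
SECONDS_PER_YEAR = 3.16e7   # approximate value quoted in the text

def critical_density_per_year2(hubble_time_years):
    """rho_c = 3 H_0^2 / (8 pi) in geometric units, expressed in yr^-2."""
    return 3.0 / (8.0 * math.pi * hubble_time_years ** 2)

def critical_density_cgs(hubble_time_years):
    """rho_c in g/cm^3: convert H_0 to 1/sec, then divide by G to leave geometric units."""
    h0 = 1.0 / (hubble_time_years * SECONDS_PER_YEAR)   # Hubble number in 1/sec
    return 3.0 * h0 ** 2 / (8.0 * math.pi * G_CGS)

print(critical_density_per_year2(18e9))   # about 3.68e-22 yr^-2
print(critical_density_cgs(18e9))         # about 5.5e-30 g/cm^3
```

Replacing `18e9` by `18e9 * r` rescales both results by $1/r^2$, matching the $r$-dependence in the text.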

FRIEDMANN MODELS

Except in the earliest era of the universe and the final era, if there is one, energy density utterly dominates pressure. Thus Robertson-Walker models with $\mathfrak{p} = 0$ should be reasonably good for this range, and it is easy to find them all explicitly.


A dust is a perfect fluid with $\mathfrak{p} = 0$ and $\rho > 0$, the terminology suggesting the absence of any nongravitational influence between the “molecules” of the fluid.

20. Lemma. Let $M(k, f)$ be a Robertson-Walker spacetime with $f$ nonconstant. Then the following are equivalent:
(1) The perfect fluid $U$ is a dust.
(2) $\rho f^3 = M$, a positive constant.
(3) (Friedmann equation) $f'^2 + k = A/f$, where $A = 8\pi M/3 > 0$.

Proof. The equivalence of (2) and (3) is immediate from the formula for $\rho$ in Theorem 11. If (1) holds, so $\mathfrak{p} = 0$, then Corollary 13 becomes $\rho' f + 3\rho f' = 0$. Hence $\rho f^3$ is a constant, positive since $\rho$ and $f$ are. Conversely, if (2) holds, then it follows from Corollary 13 that $\mathfrak{p} f' = 0$. The nonconstancy of $f$ (implied by $H_0 > 0$) is needed to prove $\mathfrak{p} = 0$. Assume $\mathfrak{p}$ is not identically zero. Then there is a maximal interval $J \subset I$ on which $\mathfrak{p}$ is never zero. Hence $f' = 0$ on $J$, so $f$ is constant on $J$. Thus $J \ne I$. The formula for $\mathfrak{p}$ in Theorem 11 then shows that $\mathfrak{p}$ is a nonzero constant on $J$. Thus $\mathfrak{p}$ is nonzero on an interval strictly larger than $J$: a contradiction. ∎
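The equivalence of (2) and (3) is a one-line substitution into the density formula of Theorem 11. Written out as a worked equation, with $A = 8\pi M/3$ as in the lemma:

```latex
\frac{8\pi\rho}{3} = \frac{f'^2 + k}{f^2}
\quad\Longrightarrow\quad
f'^2 + k = \frac{8\pi}{3}\,\rho f^2
         = \frac{8\pi}{3}\,\frac{\rho f^3}{f},
\qquad\text{so}\qquad
\rho f^3 = M \iff f'^2 + k = \frac{8\pi M/3}{f} = \frac{A}{f}.
```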


The energy density of a dust derives from mass; thus the equation in (2) asserts the conservation of mass in any comoving region. A Friedmann cosmological model is a Robertson-Walker spacetime such that the galactic fluid is a dust and $H = f'/f$ is positive for some $t_0$.

21. Remarks. Friedmann Cosmological Models. We find the scale function $f$ in the three cases $k = 0$, $1$, $-1$, putting the big bang at $t_* = 0$.

(1) $k = 0$. The Friedmann equation $f f'^2 = A$ is readily solved for $f = Ct^{2/3}$, where $4C^3 = 9A$. Thus the initial expansion continues forever with $f \to \infty$ and $f' \to 0$.

(2) $k = 1$. Integration of the Friedmann equation gives the parametric solution
$$t = \tfrac12 A(\vartheta - \sin\vartheta), \qquad f = \tfrac12 A(1 - \cos\vartheta) \qquad (0 < \vartheta < 2\pi).$$
The graph of $f$ is the cycloid swept out by a point on the rim of a rolling wheel with diameter $A$ (Figure 4). The expansion reaches a maximum $f = A$ at $t = \pi A/2$. Then contraction begins, and $f$ decreases symmetrically toward a final collapse at $t^* = \pi A$.

(3) $k = -1$. The Friedmann equation is now $f'^2 = 1 + (A/f)$. Thus $f' > 1$ and the universe expands forever with $f \to \infty$ and $f' \to 1$. An integration as for (2) gives
$$t = \tfrac12 A(\sinh\eta - \eta), \qquad f = \tfrac12 A(\cosh\eta - 1) \qquad (\eta > 0).$$

Figure 4. Friedmann scale functions.
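The three scale functions of Remarks 21 can be checked numerically against the Friedmann equation $f'^2 + k = A/f$. The following is only a sketch: the function names and the sample value $A = 2$ are illustrative choices, not from the text, and $df/dt$ for the parametric solutions is obtained by the chain rule.

```python
import math

def flat_model(A, t):
    """k = 0: f = C t^(2/3) with 4 C^3 = 9 A. Returns (f, df/dt)."""
    C = (9.0 * A / 4.0) ** (1.0 / 3.0)
    return C * t ** (2.0 / 3.0), (2.0 / 3.0) * C * t ** (-1.0 / 3.0)

def closed_model(A, theta):
    """k = 1 cycloid, 0 < theta < 2*pi. Returns (t, f, df/dt)."""
    t = 0.5 * A * (theta - math.sin(theta))
    f = 0.5 * A * (1.0 - math.cos(theta))
    # Chain rule: df/dt = (df/dtheta) / (dt/dtheta).
    return t, f, math.sin(theta) / (1.0 - math.cos(theta))

def open_model(A, eta):
    """k = -1 solution, eta > 0. Returns (t, f, df/dt)."""
    t = 0.5 * A * (math.sinh(eta) - eta)
    f = 0.5 * A * (math.cosh(eta) - 1.0)
    return t, f, math.sinh(eta) / (math.cosh(eta) - 1.0)

def friedmann_residual(k, A, f, fprime):
    """Left side minus right side of f'^2 + k = A/f; zero for a solution."""
    return fprime ** 2 + k - A / f

if __name__ == "__main__":
    A = 2.0  # hypothetical sample value
    f, fp = flat_model(A, 1.7)
    print("k =  0 residual:", friedmann_residual(0, A, f, fp))
    _, f, fp = closed_model(A, 1.0)
    print("k =  1 residual:", friedmann_residual(1, A, f, fp))
    _, f, fp = open_model(A, 1.0)
    print("k = -1 residual:", friedmann_residual(-1, A, f, fp))
```

All three residuals should vanish up to rounding error, and evaluating `closed_model` at $\vartheta = \pi$ recovers the maximum $f = A$ at $t = \pi A/2$.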

These models were proposed by Friedmann several years before the discovery of the expansion of the universe. The Friedmann equation is the same as the differential equation governing radial motion in Newtonian gravitation. Thus a three-dimensional model of a $k = 1$ Friedmann universe could be made out in space by detonating a ball of dynamite encased in a shell of bricks. The fragments fly radially outward and, provided the charge is insufficient, reach a maximum radius and fall symmetrically back together. With larger charges, escape speed is reached and the fragments continue outward forever. (In the latter case the spacelike slices are topologically wrong, since spheres do not admit metrics of constant curvature $k = 0$ or $-1$.)

Assuming Hubble time $H_0^{-1} = 18 \times 10^9$ yr, we compute some sample numbers for Friedmann universes, using the standard choices for space $S$. These form a single family that we parametrize by present energy density $\rho_0$. Corollary 19 shows that the family splits in two at $\rho_0 = \rho_c$.

(1) The Einstein-de Sitter cosmological model, $M(0, t^{2/3}) = R^+ \times_{t^{2/3}} R^3$. The Hubble function $H = f'/f$ is $2/3t$, hence the age of the universe is

$t_0 = 2/3H_0 = 12 \times 10^9$ yr.

Since $k = 0$, $\rho = 3H^2/8\pi$. In particular the present energy density $\rho_0$ is at the critical value $\rho_c = 5.5 \times 10^{-30}$ g/cm$^3$.

(2) $M(1, f) = (0, \pi A) \times_f S^3$. The slice $S(t)$ is thus a three-dimensional sphere of radius $f(t)$. By the parametric equations in Remark 21(2), the maximum radius is $A$ and we compute

$H = f'/f = (2\sin\vartheta)/A(1 - \cos\vartheta)^2.$

Then by some trigonometry,

$\rho = (3/8\pi)(H^2 + (1/f^2)) = 3H^2/4\pi(1 + \cos\vartheta).$

Since $H_0$ has been fixed, the present energy density $\rho_0$ determines the present value $\vartheta_0$ of the angle parameter, and the formula for $H$ then gives $A$. In view of Corollary 19, we try $\rho_0$ at twice the critical density $\rho_c$. Then $\cos\vartheta_0 = 0$, hence $\vartheta_0 = \pi/2$ and the maximum radius is

$A = 2/H_0 = 36 \times 10^9$ lightyears.

Thus the age of the universe is

$t_0 = \tfrac{1}{2}A(\tfrac{1}{2}\pi - 1) = 10.3 \times 10^9$ yr,

less than a tenth of its total lifetime $\pi A$.

(3) $M(-1, f) = R^+ \times_f H^3$. Much as in the previous example the parametric formulas in Remark 21(3) give

$H = (2\sinh\eta)/A(\cosh\eta - 1)^2, \quad \rho = 3H^2/4\pi(\cosh\eta + 1).$

Taking $\rho_0$ to be half the present critical density gives $\cosh\eta_0 = 3$, so $\sinh\eta_0 = 2\sqrt{2}$ and $\eta_0$ is about 1.76. Thus $A = \sqrt{2}\,H_0^{-1}$ and the age of the universe is

$t_0 = (\sinh\eta_0 - \eta_0)/\sqrt{2}\,H_0 = 13.5 \times 10^9$ yr.
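The sample numbers above can be reproduced with a few lines of arithmetic. The sketch below assumes the Hubble time of the text, $H_0^{-1} = 18 \times 10^9$ yr, and uses the ratio $\rho_0/\rho_c = 2/(1 + \cos\vartheta_0)$ (respectively $2/(1 + \cosh\eta_0)$), which follows from the formulas for $\rho$ and $\rho_c = 3H^2/8\pi$. The function names are illustrative. The text works in geometric units, so the values of Newton's constant and the length of a year used for the critical density in g/cm$^3$ are assumptions added here.

```python
import math

HUBBLE_TIME = 18.0  # 1/H_0 in units of 10^9 yr, as assumed in the text

def einstein_de_sitter_age(hubble_time=HUBBLE_TIME):
    """Age t_0 = 2/(3 H_0) of the k = 0 model."""
    return 2.0 * hubble_time / 3.0

def closed_friedmann(density_ratio, hubble_time=HUBBLE_TIME):
    """k = 1, density_ratio = rho_0/rho_c > 1. Returns (theta_0, A, age)."""
    cos_theta = 2.0 / density_ratio - 1.0
    theta = math.acos(cos_theta)  # branch 0 < theta < pi: still expanding
    A = 2.0 * math.sin(theta) * hubble_time / (1.0 - cos_theta) ** 2
    return theta, A, 0.5 * A * (theta - math.sin(theta))

def open_friedmann(density_ratio, hubble_time=HUBBLE_TIME):
    """k = -1, 0 < density_ratio = rho_0/rho_c < 1. Returns (eta_0, A, age)."""
    cosh_eta = 2.0 / density_ratio - 1.0
    eta = math.acosh(cosh_eta)
    A = 2.0 * math.sinh(eta) * hubble_time / (cosh_eta - 1.0) ** 2
    return eta, A, 0.5 * A * (math.sinh(eta) - eta)

def critical_density_cgs(hubble_time_yr=18.0e9):
    """rho_c = 3 H^2 / (8 pi G) in g/cm^3 (G and the year length are assumed)."""
    newton_g = 6.674e-8          # cm^3 g^-1 s^-2
    seconds_per_year = 3.156e7
    hubble = 1.0 / (hubble_time_yr * seconds_per_year)
    return 3.0 * hubble ** 2 / (8.0 * math.pi * newton_g)

if __name__ == "__main__":
    print("Einstein-de Sitter age:", einstein_de_sitter_age())
    print("closed, rho_0 = 2 rho_c:", closed_friedmann(2.0))
    print("open, rho_0 = rho_c / 2:", open_friedmann(0.5))
    print("critical density (g/cm^3):", critical_density_cgs())
```

With these inputs the ages come out near 12, 10.3, and 13.6 (in units of $10^9$ yr), in agreement with the rounded figures quoted above.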

Unless more mass is discovered in our universe, "open" models such as this are probably more realistic than "closed" models as in (2). The earliest era of the universe and the final one, if it exists, are dominated by radiation. There Friedmann models give way to radiation models, for which mass is zero and $\mathfrak{p} = \rho/3$ (see Exercise 14).

GEODESICS AND REDSHIFT

Any curve $\alpha$ in $M(k, f) = I \times_f S$ can be written as $\alpha(s) = (t(s), \beta(s))$, where $t(s)$ is the galactic time of $\alpha(s)$ and $\beta$ is the projection of $\alpha$ into $S$. Derivatives with respect to the parameter $s$ are often denoted by a prime; thus in this section we write $f_t$ for $Uf = df/dt$.

22. Proposition. A curve $\alpha = (t, \beta)$ in $M(k, f)$ is a geodesic if and only if

(1) $d^2t/ds^2 + \langle \beta', \beta' \rangle f(t) f_t(t) = 0$.
(2) $\beta'' + 2(f_t(t)/f(t))(dt/ds)\beta' = 0$ (hence $\beta$ is pregeodesic).

Proof. These equations follow immediately from those in Proposition 7.38, since here $\mathrm{grad}\, f = -f_t U$ and $(d/ds)(f(t)) = f_t(t)\, dt/ds$.

23. Corollary. If $\alpha = (t, \beta)$ is a null geodesic in $M(k, f)$, then the function $f(t)\, dt/ds$ is constant.

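Proof. Since $\alpha' = (dt/ds)U + \beta'$,

$0 = \langle \alpha', \alpha' \rangle = -(dt/ds)^2 + \langle \beta', \beta' \rangle f^2.$

Thus (1) implies

$\frac{d^2t}{ds^2} + \frac{f_t}{f}\left(\frac{dt}{ds}\right)^2 = 0,$

that is, $(d/ds)(f(t)\, dt/ds) = 0$.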

This result leads to a relativistic explanation of the physical phenomenon called cosmological redshift. When light from a distant galaxy is analyzed in an earth-borne spectroscope, the characteristic pattern of spectral lines is obtained but all wavelengths $\lambda_0$ are longer than for earth-emitted light, in proportions depending only on the source galaxy. As each wavelength $\lambda_p$ at emission must be assumed to be the same as on earth, wavelengths have uniformly lengthened during transmission. The fractional increase

$z = (\lambda_0 - \lambda_p)/\lambda_p$

is called the redshift parameter of the source (red having the longest wavelength in the visible band).

Such redshifts occur naturally in any Robertson-Walker model. As in Figure 5, let $\alpha$ be a photon (future-pointing null geodesic) emitted at past galactic time $t_p$ from a galaxy $\gamma_p$ and received in our galaxy $\gamma_0$ at the present time $t_0$. Regard the galaxies as observers, with $U$ giving their 4-velocities. Then, as in special relativity, the decomposition

$\alpha' = (dt/ds)U + \beta'$

shows that for each $s$ the galactic observers measure $(dt/ds)(s)$ as the energy $E(s)$ of $\alpha$ and $\beta'(s)$ as its momentum. The relations $E = h\nu$ and $\lambda\nu = 1$ remain valid as before (page 179), where $h$ is Planck's constant.

Figure 5

24. Corollary. In $M(k, f)$ a photon emitted (as above) at $\gamma_p(t_p)$ and received at $\gamma_0(t_0)$ has redshift

$z = \frac{f(t_0)}{f(t_p)} - 1.$

Proof. Since $(dt/ds)(s) = E(s) = h/\lambda(s)$ for all $s$, the preceding corollary implies that $f(t)/\lambda$ is constant along $\alpha$. Substituting $\lambda = Cf(t)$ in the definition of $z$ gives the result.

Cosmological redshift is an analogue of the Doppler shifts produced by relative motion. Galaxies generally have negligible relative motion, but their distances apart change as a consequence of the overall expansion (or contraction) of the universe. Since $z > 0$ implies $f(t_0) > f(t_p)$, the fact that in our universe all distant sources have positive redshift is the primary evidence for the expansion of the universe.

In any universe that has always been expanding, the redshift $z$ determines the emission time $t_p$. In fact, $z$ and $f(t_0)$ give $f(t_p)$, and since $f' > 0$, $t_p$ is uniquely determined. For example, in the Einstein-de Sitter model, $f(t) = t^{2/3}$ and the estimated age $t_0$ of the universe is about 12 billion years. If a galaxy has redshift $z = \tfrac{1}{4}$, then some algebra based on the corollary shows that the light was emitted at $t_p \approx 3t_0/4$, say 3 billion years ago. For a quasar with $z = 2$ the emission time is about $t_0/5$, less than $2\tfrac{1}{2}$ billion years after the big bang.

In $M(k, f)$ with the standard choices of space $S$, the times $t_p$ and $t_0$ determine the distance between the galaxies $\gamma_p$ and $\gamma_0$ at one, hence every, galactic time.

25. Corollary. Let $M(k, f)$ have space $S^3$, $R^3$, or $H^3$. If a photon emitted at $\gamma_p(t_p)$ is received at $\gamma_0(t_0)$, then the present distance between the galaxies $\gamma_p$ and $\gamma_0$ is

$f(t_0) \int_{t_p}^{t_0} \frac{dt}{f(t)},$

provided in the case of $S^3$ that the integral is $\leq \pi$.

Proof. The projection $\beta$ of the photon into $S$ is a pregeodesic that, as the proof of Corollary 23 shows, has speed

$|\beta'|_S = \langle \beta', \beta' \rangle^{1/2} = (1/f(t))(dt/ds).$

Thus the length of $\beta$ from $p$ to $o$ is

$L(\beta) = \int |\beta'|_S\, ds = \int_{t_p}^{t_0} \frac{dt}{f(t)}.$

With the additional hypothesis for $S^3$ this length is the Riemannian distance $d(p, o)$ in $S$, and the result follows, since distances in the slice $S(t_0)$ are those of $S$ multiplied by $f(t_0)$. (The extra hypothesis can be avoided; without it, $d(p, o) = \min\{|L(\beta) - 2\pi m| : m \in Z\}$.)
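As a numerical illustration of Corollaries 24 and 25, the following sketch specializes to the Einstein-de Sitter scale function $f(t) = t^{2/3}$ with $t_0 = 12$ (in units of $10^9$ yr). It inverts the redshift formula for the emission time and evaluates the distance integral by the midpoint rule. The function names, the step count, and the choice of quadrature are illustrative assumptions; for this $f$ the integral also has the closed form $3t_0(1 - (t_p/t_0)^{1/3})$, used here as a cross-check.

```python
def eds_scale(t):
    """Einstein-de Sitter scale function f(t) = t^(2/3)."""
    return t ** (2.0 / 3.0)

def emission_time(z, t0):
    """Solve 1 + z = f(t0)/f(tp) for tp when f(t) = t^(2/3)."""
    return t0 * (1.0 + z) ** (-1.5)

def present_distance(f, tp, t0, steps=20000):
    """f(t0) times the integral of dt/f(t) from tp to t0 (midpoint rule)."""
    width = (t0 - tp) / steps
    integral = sum(width / f(tp + (i + 0.5) * width) for i in range(steps))
    return f(t0) * integral

def eds_distance_closed_form(tp, t0):
    """Closed form of the same integral for f(t) = t^(2/3)."""
    return 3.0 * t0 * (1.0 - (tp / t0) ** (1.0 / 3.0))

if __name__ == "__main__":
    t0 = 12.0
    tp = emission_time(2.0, t0)  # quasar with z = 2
    print("emission time:", tp)
    print("distance (numeric):", present_distance(eds_scale, tp, t0))
    print("distance (closed form):", eds_distance_closed_form(tp, t0))
```

For $z = 2$ the emission time is about $t_0/5.2$, and both evaluations give a present distance of roughly 15 billion lightyears.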

For example, for the Einstein-de Sitter redshifts mentioned above, the corollary gives a present distance of about 3 billion lightyears for the galaxy with $t_p \approx 3t_0/4$, and present distance of 15 billion lightyears for the quasar with $t_p \approx t_0/5$.

The preceding corollaries lead to an explicit description of the null geodesics in a Robertson-Walker spacetime.

26. Corollary. In $M(k, f) = I \times_f S$ let $\alpha$ be a curve with $t(\alpha(0)) = t_0$. Then $\alpha$ is a null geodesic if and only if

$\alpha(s) = (t(s), \beta(h(s))),$

where $t(s)$ is the inverse function of $s = C\int_{t_0}^{t} f(t)\, dt$, $\beta$ is a unit speed geodesic in $S$, and $h(s) = \int_{t_0}^{t(s)} dt/f(t)$.

Proof. If $\alpha$ is a null geodesic, the characterization of $t(s)$ is immediate from Corollary 23. The proof of Corollary 25 shows that the pregeodesic $\sigma \circ \alpha$ has arclength function $h$ as above. Hence $\alpha$ has the required form. Conversely, a null geodesic of this form can be found to realize any given (null) initial velocity.

27. Remarks. Consider null geodesics in $M(k, f) = I \times_f S$, with $I = (A, B)$, $-\infty \leq A < B \leq \infty$.

(1) If $\int_c^B f(t)\, dt < \infty$ for some $c \in I$, the preceding corollary shows that no future-pointing null geodesic can be defined on $[0, \infty)$. Hence every null geodesic is future incomplete. Similarly all null geodesics are past incomplete if $\int_A^c f(t)\, dt < \infty$.

(2) Conversely, if both integrals in (1) are infinite, then $t(s)$ and hence $h(s)$ are defined for all $s \in R$. If $S$ is complete, then $\beta$ is also defined on $R$, hence every inextendible null geodesic is complete; that is, $M(k, f)$ is null geodesically complete (hence inextendible).

(3) Again if $S$ is complete, for an inextendible null geodesic the monotone function $t(s)$ traverses the entire interval $I$. Thus Robertson-Walker photons, unless otherwise destroyed, endure from the initial singularity $t_*$ or $-\infty$ to the final singularity $t^*$ or $+\infty$.

(4) If $S$ is one of the standard choices, then all null geodesics are congruent, mod parametrization. In fact, any two null geodesics can be parametrized to have the same initial values of $t$ and $dt/ds$. Then there is an isometry $\phi$ of $S$ such that $\mathrm{id} \times \phi$ carries $\alpha_1'(0)$ to $\alpha_2'(0)$, hence $\alpha_1$ to $\alpha_2$ (spatial homogeneity).

28. Examples. Photons in Friedmann Models. We take $t_0$ in Corollary 26 as the limit value $t_* = 0$; thus the photons start from the big bang.

(1) Einstein-de Sitter, $R^+ \times_{t^{2/3}} R^3$. Since $s = (3C/5)t^{5/3}$, taking $C = 5/3$ gives $t(s) = s^{3/5}$. Then

$h(s) = \int_0^{t(s)} t^{-2/3}\, dt = 3(s^{3/5})^{1/3} = 3s^{1/5}.$

Hence a typical photon is

$\alpha(s) = (s^{3/5}, 3s^{1/5}, 0, 0)$ for $s > 0$.

(2) $M(1, f) = (0, \pi A) \times_f S^3$ ($f$ as in 21(2)). For $\alpha(s) = (t(s), \beta(h(s)))$, let $\vartheta(s) \in (0, 2\pi)$ be the rotation angle corresponding to $t(s) \in (0, \pi A)$. The parametric equations for $t$ and $f$ satisfy $dt = f(\vartheta)\, d\vartheta$, hence

$h(s) = \int_0^{t(s)} \frac{dt}{f(t)} = \int_0^{\vartheta(s)} d\vartheta = \vartheta(s).$

Thus as it travels from the big bang to the big crunch, each photon makes exactly one great circle trip around the space $S^3$.

Although general relativity has made possible the global study of the universe, it also imposes limits on what we can observe. Information from outside our immediate spacetime neighborhood consists of radiation (null geodesics) recorded in detail for only a few decades. Thus the events that have actually been seen by man lie mostly in a thin conical spacetime shell. However, there is good reason to believe that this shell extends to the extremely distant past. The large redshifts of quasars are one indication. More generally, the Robertson-Walker big bang picture of the universe has been reinforced by the 1965 discovery of microwave background radiation; this is pervasive, highly isotropic, and typical of the radiation emitted by a hot dense gas. Its properties argue both its origin in an early radiation-dominated era and the isotropy of the universe it has since traversed [HE, Section 10.1]. Thus knowledge of the past is extensive, and an increasingly detailed history of the universe is being constructed, starting from its earliest seconds.

The existence of the big bang, however, limits the spatial range of our observations, since obviously we can only receive radiation that has had time to get here. According to Corollary 25 our galaxy can have received information only from sources whose present distance is at most

$f(t_0) \int_{t_*}^{t_0} \frac{dt}{f(t)}.$

In the Einstein-de Sitter model, for example, this integral evaluates to $3t_0$ and marks out a ball of radius 36 billion lightyears in the infinite space $S(t_0) \approx R^3$. (See also Exercises 8 and 9.) Thus at vast spatial distances or in the future, the universe could be different from the region we can observe.
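The Einstein-de Sitter photon of Example 28(1) gives a concrete check on Corollaries 23 and 26 and on the observational limit just described. The sketch below, with illustrative function names, verifies that $\alpha(s) = (s^{3/5}, 3s^{1/5}, 0, 0)$ is null for the line element $-dt^2 + f(t)^2(dx^2 + dy^2 + dz^2)$, that $f(t)\, dt/ds$ is constant along it, and that a photon leaving at the big bang has covered present distance $3t_0$ by time $t_0$. The value $t_0 = 12$ (in units of $10^9$ yr) is the age assumed in the text.

```python
def photon(s):
    """Example 28(1): returns (t, x) = (s^(3/5), 3 s^(1/5)) for s > 0."""
    return s ** 0.6, 3.0 * s ** 0.2

def photon_velocity(s):
    """Derivatives (dt/ds, dx/ds) of the formulas in photon()."""
    return 0.6 * s ** (-0.4), 0.6 * s ** (-0.8)

def scale(t):
    """Einstein-de Sitter scale function f(t) = t^(2/3)."""
    return t ** (2.0 / 3.0)

def null_defect(s):
    """<alpha', alpha'> = -(dt/ds)^2 + f(t)^2 (dx/ds)^2; zero if null."""
    t, _ = photon(s)
    dt, dx = photon_velocity(s)
    return -dt ** 2 + (scale(t) * dx) ** 2

def conserved_quantity(s):
    """f(t) dt/ds, constant along a null geodesic by Corollary 23."""
    t, _ = photon(s)
    dt, _ = photon_velocity(s)
    return scale(t) * dt

def horizon_radius(t0):
    """Present distance covered by a photon emitted at the big bang."""
    s0 = t0 ** (5.0 / 3.0)        # parameter value at which t(s) = t0
    _, x = photon(s0)             # coordinate distance h(s0) in R^3
    return scale(t0) * x          # rescale to the slice S(t0)

if __name__ == "__main__":
    for s in (0.5, 1.0, 7.0):
        print(s, null_defect(s), conserved_quantity(s))
    print("horizon radius for t0 = 12:", horizon_radius(12.0))
```

The conserved quantity equals $3/5 = 1/C$ for every $s$, and the horizon radius for $t_0 = 12$ is $3t_0 = 36$.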

OBSERVER FIELDS

An observer field on an arbitrary spacetime $M$ is a timelike, future-pointing, unit vector field $U$. Each integral curve of $U$ is indeed an observer, parametrized by proper time. Thus $U$ describes a family of $U$-observers filling $M$. The general question is whether they can agree on common notions of space and of time. (Observer fields are called reference frames in [SW], from which this section derives.)

Suppose that $S$ is a (necessarily spacelike) hypersurface in $M$ to which an observer field $U$ is normal at every $p \in S$. Then the infinitesimal restspace $U_p^{\perp}$ (as on page 333) is just $T_p(S)$ for every $p \in S$, hence $S$ is called a restspace of $U$.

29. Example. (1) On a Robertson-Walker spacetime the flow vector field $U = \partial_t$ is an observer field whose $U$-observers are the galaxies $\gamma_p$. The spacelike slices $S(t)$ are restspaces of $U$.

(2) A single freely falling observer $\omega$ in Minkowski spacetime determines a unique observer field $U$ whose observers are all distantly parallel to $\omega$. Each restspace of $\omega$ in the special relativity sense is a restspace of $U$ in the above sense.

It is not always possible to integrate the infinitesimal restspaces of $U$ to obtain a restspace.

30. Proposition. For an observer field $U$ the following are equivalent.

(1) There is a restspace of $U$ through each $p \in M$.
(2) If vector fields $X$ and $Y$ are orthogonal to $U$, then so is $[X, Y]$.
(3) $U$ is irrotational; that is, $\mathrm{curl}\, U$ is zero on vector fields $X, Y \perp U$.

Proof. (For properties of curl, see Exercise 3.18.) (2) $\Leftrightarrow$ (3). If $X, Y \perp U$, then

$(\mathrm{curl}\, U)(X, Y) = \langle D_X U, Y \rangle - \langle D_Y U, X \rangle = -\langle U, D_X Y \rangle + \langle U, D_Y X \rangle = -\langle U, [X, Y] \rangle.$

(1) $\Rightarrow$ (2). Let $S$ be a restspace of $U$ through $p \in M$. If $X, Y \perp U$ then $X$ and $Y$ are tangent to $S$. By Proposition 1.32, $[X, Y]$ is also tangent to $S$, hence orthogonal to $U$.

That (2) implies (1) is a special case of Frobenius' theorem [W].

A vector field $V$ on $R^3$ that depends on a time parameter $t$ is irrotational provided $\mathrm{curl}(V) = 0$ for each $t$. Thus the terminology in (3) above gives the natural relativistic analogue.

An observer field $U$ is geodesic if each of its observers is geodesic, that is, if $D_U U = 0$.

31. Corollary. An observer field $U$ is geodesic and irrotational if and only if $\mathrm{curl}\, U = 0$.

Proof. Since $\mathrm{curl}\, U$ is skew-symmetric it suffices by the preceding proposition to show that $D_U U = 0$ is equivalent to $(\mathrm{curl}\, U)(U, Z) = 0$ for all $Z$. But $(\mathrm{curl}\, U)(U, Z) = \langle D_U U, Z \rangle$, since $\langle D_Z U, U \rangle = \tfrac{1}{2} Z\langle U, U \rangle = 0$.

In Example 29 both observer fields are geodesic and irrotational. Now we turn from space-agreement to time-agreement.

32. Definition. An observer field $U$ on $M$ is proper time synchronizable provided there exists a function $t \in \mathfrak{F}(M)$ such that $U = -\mathrm{grad}\, t$. Then $t$ is called a proper time function on $M$.

In fact, $t$ is a common proper time for all $U$-observers $\alpha$, since

$d(t \circ \alpha)/d\tau = Ut = \langle U, \mathrm{grad}\, t \rangle = -\langle U, U \rangle = 1.$

Thus $t \circ \alpha$ differs from the proper time $\tau$ of $\alpha$ only by an additive "clock-setting" constant. In Example 29 both observer fields are proper time synchronizable, with $t$ galactic time in (1), and the usual extension of $\omega$'s proper time in (2).

33. Corollary. If an observer field $U$ on $M$ is proper time synchronizable, then $U$ is geodesic and irrotational. The converse holds if $M$ is simply connected (hence always holds locally).

Proof. The first assertion is immediate from the corollary above since the curl of a gradient is zero. For the converse, if $\mathrm{curl}\, U = 0$ and $M$ is simply connected, then $U$ is a gradient [BG], hence $U$ is proper time synchronizable.

Suppose $U$ is proper time synchronizable and let $\alpha$ be a $U$-observer. Any other material particle (or observer) from $\alpha(\tau_1)$ to $\alpha(\tau_2)$ takes less time than $\alpha$'s $\tau_2 - \tau_1$. The proof is a mild variant of that of Proposition 5.34. Thus, for example, the twin phenomenon (6.22) holds for a traveler leaving and eventually returning to a Robertson-Walker galaxy.

Proper time synchronizability is a very strong condition and can be usefully weakened as follows.

34. Definition. An observer field $U$ on $M$ is synchronizable provided there are smooth functions $h > 0$ and $t$ on $M$ such that $U = -h\, \mathrm{grad}\, t$.

Here $t$ is a kind of average time agreed on by all $U$-observers. A synchronizable observer field is irrotational, since $U$ is normal to the level hypersurfaces of $t$, which are thus restspaces. If $\tau$ is the proper time of a $U$-observer $\alpha$, the time-dilation $d(t \circ \alpha)/d\tau$ is $1/(h \circ \alpha)$, so the elapsed proper time between two restspaces will generally be different for different $U$-observers.

STATIC SPACETIMES

35. Definition. A spacetime $M$ is static relative to an observer field $U$ provided $U$ is irrotational and there is a smooth function $g > 0$ on $M$ such that $gU$ is a Killing vector field.

Any local flow $\{\psi_t\}$ of $gU$ consists of isometries, and each $\psi_t$ preserves $U$-observers (though generally distorting their proper time parametrizations). Being irrotational, $U$ has restspaces $S$, and $\psi_t(S)$ is again a restspace, since $d\psi_t U$ and $U$ are collinear. Thus, locally at least, the spatial universe always looks the same to a $U$-observer. We now construct static spacetimes with a given restspace.

36. Definition. Let $S$ be a three-dimensional Riemannian manifold, $I$ an open interval, and $g > 0$ a smooth function on $S$. Let $t$ and $\sigma$ as usual be the projections of $I \times S$ onto $I$ and $S$. The standard static spacetime $I_g \times S$ is the manifold $I \times S$ with line element

$-g(\sigma)^2\, dt^2 + ds^2,$

where $ds^2$ is the lift of the line element of $S$.

This is just the warped product $S \times_g I$ with time coordinate written first. Thus by contrast with a Robertson-Walker spacetime, space remains the same and time is warped. The following result verifies that $I_g \times S$ is static relative to $\partial_t/g$.

37. Lemma. For $I_g \times S$,

(1) $\partial_t$ is a Killing vector field with global flow isometries given by $\psi_t(s, p) = (s + t, p)$.
(2) The observer field $U = \partial_t/g$ is synchronizable, with $U = -g\, \mathrm{grad}\, t$, hence $U$ is irrotational.
(3) The restspaces $t \times S$ of $U$ are isometric under the flow isometries $\psi_t$, and all are isometric under $\sigma$ to $S$.

Proof. (1) Evidently each $\psi_t$ is an isometry of $I_g \times S$. (2) Both $U$ and $\mathrm{grad}\, t$ are orthogonal to all slices $t \times S$, which are thus restspaces. Since $\langle \mathrm{grad}\, t, \partial_t \rangle = 1$, it follows that $\mathrm{grad}\, t = -\partial_t/g^2$, hence $U = -g\, \mathrm{grad}\, t$. (3) is obvious.

Every static spacetime is locally standard:

38. Proposition. A spacetime $M$ is static relative to an observer field $U$ if and only if for each $p \in M$ there is a $U$-preserving isometry of a standard static spacetime onto a neighborhood of $p$.

Proof. If such isometries exist, then the lemma shows that $M$ is static relative to $U$. Conversely, since $U$ is irrotational there is a restspace $S$ through $p$. Let $\{\psi_t : t \in I\}$ be a local flow at $p$ of the Killing vector field $X = gU$. Shrinking $S$ and $I$ if necessary, the map $\psi : I \times S \to M$ sending $(t, q)$ to $\psi_t(q)$ is a diffeomorphism onto a neighborhood of $p$ in $M$. Furthermore, $d\psi(\partial_t) = X$, and each $\psi_t$ is an isometry such that $d\psi_t(X) = X$. Thus

(1) $X \perp \psi_t(S)$ for all $t \in I$ (since this is true for $t = 0$).
(2) $g = |X| > 0$ is constant on the curves $t \to \psi_t(q)$, $q \in S$.
(3) $\psi_t | S$ is an isometry of the Riemannian hypersurface $S$ onto $\psi_t(S) = \psi(t \times S)$.

If $0 \times S$ is identified with $S$, then (2) implies $g \circ \psi = (g|S) \circ \sigma$, where $\sigma$ is projection on $S$. Thus the three conditions show that the pullback by $\psi^*$ of the metric tensor of $M$ has the form specified in Definition 36.

Thus locally, any static spacetime has the properties given in the lemma; in particular its observer field $U$ is locally synchronizable.

Exercises

1. (a) Let $U$ be a future-pointing timelike unit vector field on a spacetime $M$. Prove that $U$ is the flow vector field of a perfect fluid if and only if $\mathrm{Ric}(X, Y) = \mathrm{Ric}(X, U) = 0$ whenever $X$, $Y$, and $U$ are mutually orthogonal. (b) If $(U, \rho, \mathfrak{p})$ is a perfect fluid and $X, Y \perp U$, then $\mathrm{Ric}(U, U) = 4\pi(\rho + 3\mathfrak{p})$, $\mathrm{Ric}(U, X) = 0$, and $\mathrm{Ric}(X, Y) = 4\pi(\rho - \mathfrak{p})\langle X, Y \rangle$. Hence $S = 8\pi(\rho - 3\mathfrak{p})$ and $\sum R^{ij}R_{ij} = 64\pi^2(\rho^2 + 3\mathfrak{p}^2)$.

2. If a spacetime $M$ is an Einstein manifold, so $\mathrm{Ric} = cg$, show that any observer field $U$ on $M$ is the flow vector field of a perfect fluid with $\rho = -\mathfrak{p} = c/8\pi$.

3. Any Robertson-Walker spacetime $I \times_f S$ is conformally diffeomorphic to a semi-Riemannian product $J \times S$, $J \subset R^1_1$.

4. (a) Give direct proofs of Corollaries 8 and 9. (b) Deduce Corollary 13 directly from Theorem 11. (c) Given $k$, deduce the formulas in Theorem 11 from Corollaries 12 and 13.

5. Let $(U, \rho, \mathfrak{p})$ be a perfect fluid. An instantaneous observer measures the speed of the fluid as $\tanh\varphi$; find the observer's measurements of density and pressure.

6. Let $U = \partial_t$ on $M(k, f)$. Prove: (a) Corollary 13 is equivalent to $\mathrm{div}(\rho U) = -\mathfrak{p}\, \mathrm{div}\, U$ and also to $L_U(\rho\omega) = -\mathfrak{p} L_U(\omega)$, where $\omega$ can denote either the volume element of $M(k, f)$ or of its spacelike slices. (b) On any spacelike slice $\mathrm{div}\, U = -\mathrm{trace}\, S_U$, where $S_U$ is the shape operator of the slice. (c) If $u = U_{(t,p)}$, the tidal force operator $F_u$ is scalar multiplication by $f''/f$, and $\mathrm{trace}\, F_u = -\mathrm{Ric}(u, u) = -4\pi(\rho + 3\mathfrak{p})$.

7. Prove: (a) Let $U$ be a geodesic observer field on a spacetime $M$, and let $g \in \mathfrak{F}(M)$. Then $gU \neq 0$ is Killing if and only if $g$ is constant and $U$ Killing. (b) A Robertson-Walker spacetime $M(k, f)$ is static relative to $U = \partial_t$ if and only if $f$ is constant.

8. Consider the Friedmann model $(0, \pi A) \times_f S^3$ of Example 28(2), with $H_0^{-1} = 18 \times 10^9$ yr. (a) For $\rho_0 = 2\rho_c$ as on page 353, show that $\mathcal{F} = \tfrac{1}{2}$, where $\mathcal{F}$ is the fraction of all galaxies from which we can have received information. (b) Less realistically, suppose $\rho_0 = 10\rho_c$. Find the maximum radius $A$, the age of the universe, and $\mathcal{F}$.

9. Consider the Friedmann $k = -1$ model of Example 28(3), with $H_0^{-1} = 18 \times 10^9$ yr. (a) If $\rho_0 = \rho_c/2$ as on page 353, find $d_{\max}$, the largest present distance of galaxies from which we can have received information. (b) If $\rho_0 = \rho_c/10$, find the age of the universe and $d_{\max}$.

10. (a) A perfect fluid satisfies the strong energy condition if and only if $\rho + \mathfrak{p} \geq 0$ and $\rho + 3\mathfrak{p} \geq 0$. (b) A Robertson-Walker perfect fluid satisfies the strong energy condition if and only if $K_\sigma \geq K_U \leq 0$.

11. In $M(k, f) = I \times_f S$ prove: (a) If $I$ is not all of $R^1_1$, then every timelike geodesic is incomplete. (Hint: Consider $d\tau/dt$.) (b) If $S$ is complete and $f(t) = (1 - t^2)^{-1}$ on $I = (-1, 1)$, then every null geodesic is complete but no timelike one is.

12. Let $M$ be a spacetime and let $\mathcal{O}$ be an open set of instantaneous observers in $T_p(M)$. Prove: (a) A symmetric (0,2) tensor $T$ at $p$ is completely determined by its values $T(u, u)$ for $u \in \mathcal{O}$. (b) The curvature tensor $R$ at $p$ is completely determined by $g$ and $F_u(y) = R_{yu}u$ for all $y \perp u \in \mathcal{O}$. (Hint: See Exercise 3.22.)

13. Prove: (a) A Ricci flat Robertson-Walker spacetime is flat, and there are just two types, represented by $R^4_1 = R^1_1 \times R^3$ and $M(-1, \pm t)$. (b) $M(-1, t) = R^+ \times_t H^3$ is physically equivalent to a timecone in $R^4_1$ with the galaxies being the rays from 0.

14. Radiation models. (a) If $M(k, f)$ has $f$ nonconstant, show that the following are equivalent: (i) $3\mathfrak{p} = \rho > 0$; (ii) $\rho f^4 = C > 0$; (iii) $f'^2 + k = a^2/f^2$, where $a^2 = 8\pi C/3$. (b) Prove that the scale function $f$ can be written as $(2at - t^2)^{1/2}$ if $k = 1$, $(2at)^{1/2}$ if $k = 0$, $(2at + t^2)^{1/2}$ if $k = -1$. (c) Identify these as conic sections, and sketch graphs assuming $a = 1$.

15. Let $K_U$ and $K_\sigma$ be the principal sectional curvatures of $M(k, f)$ at $(t, p)$, and let $u = U_{(t,p)}$. Prove: (a) Every timelike plane $\Pi$ at $(t, p)$ has a basis $x$, $(\cosh\varphi)u + (\sinh\varphi)y$, where $\{x, y, u\}$ is orthonormal. Then $K(\Pi) = K_U \cosh^2\varphi - K_\sigma \sinh^2\varphi$. (b) A similar formula holds for spacelike planes. (c) At $(t, p)$ and hence on the entire slice $S(t)$, $K$ is constant $\Leftrightarrow$ $K_U(t) = K_\sigma(t)$.

16. If $M(k, f)$ does not have constant curvature, show that every self isometry $\phi$ is a product as in Exercise 7.11. (Hint: $K_U(t) \neq K_\sigma(t)$ for some $t$; use the preceding exercise to show that $U \perp \phi(S(t))$.)

17. Prove: (a) $S^4_1$ is (isometric to) a Robertson-Walker spacetime. (b) For a suitable $f$, $(0, \pi) \times_f H^3$ is isometric to half of $H^4_1(r)$. (c) $H^4_1(r)$ is not a Robertson-Walker spacetime.

18. A spacetime $M$ is static relative to an observer field $U$ if and only if at each $p \in M$ there is a coordinate system with $U = \partial_0/\sqrt{-g_{00}}$, $g_{0k} = 0$ for $1 \leq k \leq 3$, and $\partial g_{ij}/\partial x^0 = 0$ for all $i, j$.

13

SCHWARZSCHILD GEOMETRY

Schwarzschild spacetime is the simplest relativistic model of a universe containing a single star. The star is assumed to be static and spherically symmetric, and to be the only source of gravitation for the spacetime. The resulting model can thus be applied to regions around any astronomical object that approximately fulfills these conditions. For example, in the case of the sun it gives a model for the solar system even better than the highly accurate Newtonian model. Schwarzschild found the spacetime late in 1915, soon after the appearance of general relativity. Initially only half of it, the exterior, seemed to be physically significant. However the neglected half, suitably joined to the exterior, now provides the simplest model of a black hole.

BUILDING THE MODEL

Schwarzschild spacetime will emerge naturally from the physical conditions given above.

(1) Static. The spacetime is to be static relative to observers comparable to Newtonian observers at rest in Euclidean 3-space. Using Definition 12.36 as a guide, take the restspace to be $R^3$, but with line element $q$ yet to be determined, and let the spacetime be the manifold $R^1 \times R^3$ with line element of the form

$A(x)\, dt^2 + q \quad (x \in R^3),$

where $q$ is lifted from $R^3$. The projection $t : R^1 \times R^3 \to R^1$ is called Schwarzschild time.

By Lemma 12.37 the lift $\partial_t$ of $d/dt$ from $R^1$ is the Killing vector field required by the definition of static.

(2) Spherical symmetry. Since the star and hence the resulting spacetime are to be spherically symmetric, for each $\phi \in O(3)$ the map $(t, x) \to (t, \phi x)$ must be an isometry. Thus it is natural to give a spherical description of $R^3$ (minus the origin) as $R^+ \times S^2$, where $R^+ = \{\rho \in R : \rho > 0\}$ and $S^2$ is the unit 2-sphere. As in Example 7.37(2), spherical symmetry implies that the line element $q$ on $R^+ \times S^2 \approx R^3 - 0$ can be written as $B(\rho)\, d\rho^2 + C(\rho)\, d\sigma^2$, where $d\sigma^2$ is the line element standard on the unit sphere. For every $\phi \in O(3)$ the differential map of $\mathrm{id} \times \phi$ carries $\partial_t$ to $\partial_t$, hence the coefficient function $A(x)$ of $dt^2$ actually depends only on $\rho$. Thus the line element on $R^1 \times R^+ \times S^2 \approx R^1 \times (R^3 - 0)$ becomes

$A(\rho)\, dt^2 + B(\rho)\, d\rho^2 + C(\rho)\, d\sigma^2.$

(3) Normalization. A change of variable in $R^+$ replaces $C(\rho)$ by $r^2$, so the line element is now

$E(r)\, dt^2 + G(r)\, dr^2 + r^2\, d\sigma^2.$

The projection $r : R^1 \times R^+ \times S^2 \to R^+$ is called the Schwarzschild radius function.

By this normalization, in each restspace $t$ constant, the surface $r$ constant has line element $r^2\, d\sigma^2$ and is thus the standard 2-sphere $S^2(r)$, with Gaussian curvature $1/r^2$ and area $4\pi r^2$ (see Figure 1). The line element above exhibits the spacetime as the warped product $P \times_r S^2$ where $P = R^1 \times R^+$ is the half-plane $r > 0$ in the $tr$-plane, furnished with line element $E(r)\, dt^2 + G(r)\, dr^2$. The projections of $P \times_r S^2$ on $P$ and the unit sphere $S^2$ are denoted as usual by $\pi$ and $\sigma$.

Figure 1 (label: tangent plane to $S^2(r)$).

(4) Vacuum and Minkowski at infinity. The only source of gravitation in the Schwarzschild universe is the star itself, which we do not model. Thus the spacetime must be a vacuum, that is, Ricci flat. Sufficiently far away from a source of gravitation, its influence becomes arbitrarily small. Hence we require that, as $r$ approaches infinity, the Schwarzschild metric tensor approaches the Minkowski metric of empty spacetime, which in spherical terms is

$-dt^2 + dr^2 + r^2\, d\sigma^2.$

Thus $E(r) \to -1$ and $G(r) \to 1$ as $r \to \infty$.

The conditions above determine the functions $E$ and $G$ and thereby the Schwarzschild metric.

1. Lemma. $P \times_r S^2$ is Ricci flat and Minkowski at infinity if and only if $E = -\mathfrak{h}$ and $G = \mathfrak{h}^{-1}$, where $\mathfrak{h}(r) = 1 - (2M/r)$, with $M$ an arbitrary constant.

Proof. Applying Corollary 7.43, with $d = \dim S^2 = 2$, the Ricci flat condition is equivalent to

(1) ${}^P\mathrm{Ric}(X, Y) = 2H^r(X, Y)/r$ for $X, Y \in \mathfrak{L}(P)$,
(2) ${}^F\mathrm{Ric}(V, W) = \langle V, W \rangle r^{\#}$ for $V, W \in \mathfrak{L}(S^2)$,

where $r^{\#} = \Delta r/r + \langle \mathrm{grad}\, r, \mathrm{grad}\, r \rangle/r^2$.

We need consider equation (1) only on $P$ itself, since leaves are totally geodesic and project isometrically. Because $\dim P = 2$, ${}^P\mathrm{Ric}(X, Y) = K_P \langle X, Y \rangle$. Choose $X, Y$ from $\partial_t, \partial_r$; then by Exercise 5.5(a), equation (1) is equivalent to

$K_P E = E'/rG; \quad K_P G = -G'/rG.$

Thus $E'/E = -G'/G$, so $EG$ is constant, and the limit conditions on $E$ and $G$ imply $EG = -1$.

In equation (2), ${}^F\mathrm{Ric}$ is the lift through the projection $\sigma$ (a homothety on fibers) of the Ricci curvature $\mathrm{Ric} = g = \langle\ ,\ \rangle$ of the unit sphere. Thus

${}^F\mathrm{Ric}(V, W) = \langle d\sigma V, d\sigma W \rangle = \langle V, W \rangle/r^2,$

so (2) is equivalent to $r^{\#} = 1/r^2$. Using Exercise 5.5 again gives

$r^{\#} = \frac{1}{2rG}\left[\frac{E'}{E} - \frac{G'}{G}\right] + \frac{1}{r^2 G} = \frac{1}{r^2}.$

Replacing $E'/E$ by $-G'/G$ yields

$-\frac{G'}{rG^2} + \frac{1}{r^2 G} = \frac{1}{r^2}.$

Consequently,

$\left(\frac{r}{G}\right)' = \frac{G - G'r}{G^2} = 1.$

Thus $r/G$ is $r$ plus a constant, which we denote by $-2M$, so

$G = \frac{r}{r - 2M} = \left(1 - \frac{2M}{r}\right)^{-1} = \mathfrak{h}^{-1}, \quad E = -\frac{1}{G} = -\mathfrak{h}.$

Anticipating the identification of the constant as the mass of the star, we require $M > 0$. Thus the Schwarzschild function $\mathfrak{h}(r) = 1 - (2M/r)$ rises from limit $-\infty$ at $r = 0$ toward limit 1 at $r = \infty$, passing through 0 at $r = 2M$. Hence the warped product line element

$-\mathfrak{h}\, dt^2 + \mathfrak{h}^{-1}\, dr^2 + r^2\, d\sigma^2$

fails at $r = 2M$. We have found not one but two spacetimes.

2. Definition. For $M > 0$ let $P_I$ and $P_{II}$ be the regions $r > 2M$ and $0 < r < 2M$ in the $tr$-half-plane $R^1 \times R^+$, each furnished with line element $-\mathfrak{h}\, dt^2 + \mathfrak{h}^{-1}\, dr^2$, where $\mathfrak{h}(r) = 1 - (2M/r)$. If $S^2$ is the unit sphere, then the warped product $N = P_I \times_r S^2$ is called Schwarzschild exterior spacetime and $B = P_{II} \times_r S^2$ the Schwarzschild black hole, both of mass $M$.
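The two conditions used in the proof of Lemma 1, $EG = -1$ and $r^{\#} = 1/r^2$, can be spot-checked numerically for $E = -\mathfrak{h}$, $G = \mathfrak{h}^{-1}$. In the sketch below the expression for $r^{\#}$ in terms of $E$ and $G$ is obtained from the definition $r^{\#} = \Delta r/r + \langle \mathrm{grad}\, r, \mathrm{grad}\, r \rangle/r^2$ for the line element $E\, dt^2 + G\, dr^2$; derivatives are approximated by central differences, and the function names, step size, and sample values of $r$ and $M$ are illustrative assumptions.

```python
def schwarzschild_h(r, M):
    """Schwarzschild function h(r) = 1 - 2M/r."""
    return 1.0 - 2.0 * M / r

def E(r, M):
    return -schwarzschild_h(r, M)

def G(r, M):
    return 1.0 / schwarzschild_h(r, M)

def derivative(func, r, M, step=1e-6):
    """Central-difference approximation to d(func)/dr."""
    return (func(r + step, M) - func(r - step, M)) / (2.0 * step)

def r_sharp(r, M):
    """r# = (1/(2 r G)) (E'/E - G'/G) + 1/(r^2 G) for E dt^2 + G dr^2."""
    e, g = E(r, M), G(r, M)
    de, dg = derivative(E, r, M), derivative(G, r, M)
    return (de / e - dg / g) / (2.0 * r * g) + 1.0 / (r * r * g)

if __name__ == "__main__":
    M = 1.0
    for r in (0.5, 1.5, 3.0, 50.0):  # points on both sides of r = 2M
        print(r, E(r, M) * G(r, M), r_sharp(r, M) * r * r,
              schwarzschild_h(r, M))
```

At each sample point the product $EG$ is $-1$ and $r^2 r^{\#}$ is 1 up to discretization error, both for $r > 2M$ and for $0 < r < 2M$, while $\mathfrak{h}$ changes sign across $r = 2M$.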

A star with properties as above is characterized by two numbers, its mass $M$ and radius $r^*$. Since the star itself is not modeled, we are left with the region $r > r^*$ in $N \cup B$ as a model for the spacetime around it. In the usual case $r^* > 2M$, the surface of the star is outside the Schwarzschild radius so its exterior is given by the connected region $r > r^*$ in $N$. (For example, the sun has $r^* = 7 \times 10^5$ km $\gg M = 1.5$ km.) The situation is more interesting if $r^* < 2M$. As we shall see, $r^*$ can then only be 0: the star has disappeared, leaving a black hole $B$.

When the Schwarzschild radius $r$ is sufficiently large, the metric on $N$ is nearly Minkowskian and we can think of $t$ as time and $r$ as radial distance, but for smaller $r$ these interpretations are suspect. Indeed, as $r$ passes below $2M$, the function $\mathfrak{h}$ becomes negative so it is $\partial_r$ that is timelike and $\partial_t$ spacelike. Generally, if say $r < 10M$, the situation is already too relativistic for Newtonian analogies to be of much help.

Consider the warped product definitions above, indicated schematically in Figure 2.

(1) For each $(t, r) \in P_I$ the fiber $\pi^{-1}(t, r)$ is the sphere $S^2(r)$ in the restspace of Schwarzschild time $t$. As always in a warped product, this sphere is totally umbilic in $N$ and $\sigma$ maps it homothetically onto $S^2$.

Figure 2. Schwarzschild warped product structure (schematic).

(2) For each $q \in S^2$ the leaf $\sigma^{-1}(q) = P_I \times q$ is, again by warped product generalities, totally geodesic in $N$ and isometric under the projection $\pi$ to the Schwarzschild half-plane $P_I$. Such a radial plane $\sigma^{-1}(q)$ is the two-dimensional spacetime consisting of all events occurring directly over a single point of the star. Its Newtonian analogue is a radial line from the center of the star together with Newtonian time. A particle in $\sigma^{-1}(q)$ is moving radially.

The definition $N = P_I \times_r S^2$ thus presents the Schwarzschild exterior as the product of the standard radial plane $P_I$ with the unit sphere, the warping function turning the fibers into spheres of radius $r > 2M$. The geometric remarks above are valid also for the black hole $B$ but some physical interpretations will differ.
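To give a sense of scale for the condition $r > 2M$, the solar figure $M = 1.5$ km quoted earlier can be recovered by converting the sun's mass to geometric units, $M = Gm/c^2$. This is only a sketch: the physical constants below are standard rounded values supplied here as assumptions, not taken from the text, and the solar radius is the text's $7 \times 10^5$ km.

```python
G_NEWTON = 6.674e-11       # m^3 kg^-1 s^-2 (assumed rounded value)
SPEED_OF_LIGHT = 2.998e8   # m/s (assumed rounded value)
SOLAR_MASS_KG = 1.989e30   # kg (assumed rounded value)
SOLAR_RADIUS_KM = 7.0e5    # km, the value r* used in the text

def geometric_mass_km(mass_kg):
    """Mass in geometric units (kilometers): M = G m / c^2."""
    return G_NEWTON * mass_kg / SPEED_OF_LIGHT ** 2 / 1000.0

def schwarzschild_h(r, M):
    """Schwarzschild function h(r) = 1 - 2M/r, with r and M in the same units."""
    return 1.0 - 2.0 * M / r

if __name__ == "__main__":
    M = geometric_mass_km(SOLAR_MASS_KG)
    print("solar mass in km:", M)
    print("r* / 2M:", SOLAR_RADIUS_KM / (2.0 * M))
    print("1 - h at the solar surface:", 2.0 * M / SOLAR_RADIUS_KM)
```

The solar surface lies at more than $2 \times 10^5$ times the radius $2M$, and $\mathfrak{h}$ differs from 1 there by only a few parts in a million, which is why the Newtonian model of the solar system is so accurate.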

GEOMETRY OF N AND B

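Since the metric tensors of the Schwarzschild exterior $N$ and black hole $B$ are formally the same, the geometries can be computed simultaneously.

3. Lemma. On $P_I \cup P_{II}$, where $ds^2 = -\mathfrak{h}(r)\,dt^2 + \mathfrak{h}(r)^{-1}\,dr^2$,

(1) $D_{\partial_t}\partial_t = (M\mathfrak{h}/r^2)\,\partial_r$, $D_{\partial_t}\partial_r = D_{\partial_r}\partial_t = (M/r^2\mathfrak{h})\,\partial_t$, $D_{\partial_r}\partial_r = (-M/r^2\mathfrak{h})\,\partial_r$.
(2) $\operatorname{grad} t = -\partial_t/\mathfrak{h}$; $\operatorname{grad} r = \mathfrak{h}\,\partial_r$.
(3) $H^r = (M/r^2)\,g$.
(4) $K = 2M/r^3$.

Proof. (1) and (4) follow from Proposition 3.44, and the gradients and Hessian are computed as usual.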


Let $\partial_t$ and $\partial_r$ on $N \cup B$ be the lifts of the corresponding vector fields on $P_I \cup P_{II}$. $\mathfrak{L}(S^2)$ consists as usual of all lifts to $N \cup B$ of vector fields on $S^2$. Also, following warped product conventions, on each tangent space $T_p(N \cup B)$, tan is orthogonal projection onto the tangent plane of the fiber $\pi^{-1}(t, r) = S^2(r)$ through $p$; nor is orthogonal projection onto the tangent plane of the leaf $\sigma^{-1}(\sigma p)$ through $p$.

4. Proposition. On $N \cup B$, if $V, W \in \mathfrak{L}(S^2)$, then

(1) Same as (1) in the lemma above.
(2) $D_{\partial_t}V = D_V\partial_t = 0$; $D_{\partial_r}V = D_V\partial_r = (1/r)V$.
(3) $\operatorname{nor} D_V W = II(V, W) = -(\mathfrak{h}/r)\langle V, W\rangle\,\partial_r$, where $II$ is the shape tensor of each fiber.
(4) $\tan D_V W \in \mathfrak{L}(S^2)$ is the lift of $\nabla_V W$ from $S^2$.

This follows immediately from Proposition 7.35.

5. Proposition. Let $V, W$ be vector fields on $N \cup B$ that are vertical, that is, tangent to all spheres $S^2(r)$. Then

(1) $R_{\partial_t\partial_r}(\partial_t) = (-2M\mathfrak{h}/r^3)\,\partial_r$; $R_{\partial_r\partial_t}(\partial_r) = (2M/r^3\mathfrak{h})\,\partial_t$.
(2) $R_{\partial_t V}(\partial_t) = (M\mathfrak{h}/r^3)\,V$; $R_{\partial_r V}(\partial_r) = (-M/\mathfrak{h}r^3)\,V$; $R_{\partial_t V}(\partial_r) = R_{\partial_r V}(\partial_t) = 0$.
(3) $R_{\partial_t\partial_r}(V) = R_{VW}(\partial_t) = R_{VW}(\partial_r) = 0$.
(4) $R_{XV}W = R_{XW}V = (M/r^3)\langle V, W\rangle X$ for $X = \partial_t$ or $\partial_r$.
(5) $R_{VW}U = (2M/r^3)[\langle U, V\rangle W - \langle U, W\rangle V]$, where $U$ is also vertical.

Proof. These formulas follow from the correspondingly numbered ones in Proposition 7.42. In fact, letting $X$ and $Y$ denote any choices from $\partial_t$, $\partial_r$, and using Lemma 3:

(1) The curvature operator of $P_I \cup P_{II}$ is $R_{\partial_t\partial_r}X = (2M/r^3)[\langle X, \partial_t\rangle\,\partial_r - \langle X, \partial_r\rangle\,\partial_t]$.
(2) $H^r = (M/r^2)\,g$, hence $R_{XV}Y = -R_{VX}Y = -(M/r^3)\langle X, Y\rangle V$.
(3) This is obvious.
(4) From Lemma 3 compute $D_X(\operatorname{grad} r) = (M/r^2)X$.
(5) We saw in Chapter 7 that this is just the Gauss equation of each fiber $\pi^{-1}(t, r) = S^2(r)$, for which the lift ${}^F\!R$ gives the curvature tensor. Thus ${}^F\!R_{VW}U = r^{-2}[\langle U, V\rangle W - \langle U, W\rangle V]$. But $\langle\operatorname{grad} r, \operatorname{grad} r\rangle = \mathfrak{h} = 1 - (2M/r)$, since $\operatorname{grad} r = \mathfrak{h}\,\partial_r$.

Lemma 1 shows already that N and B are Ricci flat.


6. Definition. Let $\vartheta, \varphi$ be spherical coordinates on the unit sphere $S^2$ (Exercise 3.13). Let $t, r$ be the usual Schwarzschild time and radius coordinates on $P_I \cup P_{II}$. The product coordinate system $(t, r, \vartheta, \varphi)$ in $N \cup B$ is called a Schwarzschild-spherical coordinate system.

The domain of these coordinates omits say $\sigma^{-1}(C)$, where $C$ is a semicircle in $S^2$, but the coordinate vector fields $\partial_t$, $\partial_r$, $\partial_\varphi$ are well defined and smooth everywhere, and $\partial_\vartheta$ is singular only over the poles $\vartheta = 0, \pi$ of $S^2$. Note that $\partial_t$ and $\partial_r$ are tangent to radial planes (horizontal) while $\partial_\vartheta$ and $\partial_\varphi$ are tangent to spheres (vertical). For indexing purposes, let $x^0 = t$, $x^1 = r$, $x^2 = \vartheta$, $x^3 = \varphi$. Then the line element

$$-\mathfrak{h}\,dt^2 + \mathfrak{h}^{-1}\,dr^2 + r^2(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2)$$

shows that the coordinates are orthogonal with

$$\langle\partial_t, \partial_t\rangle = g_{00} = -\mathfrak{h}, \quad \langle\partial_r, \partial_r\rangle = g_{11} = \mathfrak{h}^{-1}, \quad \langle\partial_\vartheta, \partial_\vartheta\rangle = g_{22} = r^2, \quad \langle\partial_\varphi, \partial_\varphi\rangle = g_{33} = r^2\sin^2\vartheta.$$

Because of spherical symmetry, the coordinate system $t, r, \vartheta, \varphi$ can be rotated as follows. An isometry $A \in O(3)$ of $S^2$ determines an isometry $(t, r, q) \to (t, r, Aq)$ of $N \cup B$; thus $t$, $r$, $\bar\vartheta = \vartheta\circ A$, $\bar\varphi = \varphi\circ A$ is a coordinate system with line element given by the same formula as before.

By construction, $\partial_t$ is a Killing vector field on $N \cup B$, and, as with any warped product, the lift of a Killing vector field on $S^2$ is Killing on $N \cup B$. These are essentially all.

7. Proposition. Every Killing vector field on $N$ or $B$ has the form $c\,\partial_t + V$, where $V$ is a Killing vector field lifted from $S^2$.

Proof. Since radial planes are totally geodesic, the component of $Y$ tangent to $\sigma^{-1}(q)$ is Killing (Exercise 9.7). The projection $\pi$ is an isometry from $\sigma^{-1}(q)$ to $P_I$ or $P_{II}$ that preserves $t$ and $r$; hence by Example 9.24 this component is a constant, say $f(q)$, times $\partial_t$. Thus $Y = f\,\partial_t + V$, where $V$ is vertical and $f$ is the lift $\tilde f\circ\sigma$ of a function $\tilde f$ on $S^2$. We must show that $V \in \mathfrak{L}(S^2)$ and $f$ is constant.

Since $Y$ is Killing, $DY$ is skew-adjoint and hence in particular $\langle D_{\partial_t}Y, Z\rangle + \langle D_Z Y, \partial_t\rangle = 0$ for $Z \in \mathfrak{L}(S^2)$. Using Proposition 4, we compute $\langle D_{\partial_t}Y, Z\rangle = (\partial/\partial t)\langle V, Z\rangle$ and $\langle D_Z Y, \partial_t\rangle = -\mathfrak{h}\,Zf$. Hence,

(1) $(\partial/\partial t)\langle V, Z\rangle = \mathfrak{h}\,Zf$.

Similarly, replacing $\partial_t$ by $\partial_r$ leads to

(2) $(\partial/\partial r)\langle V, Z\rangle = (2/r)\langle V, Z\rangle$.

Fix $q \in S^2$ and consider $\langle V, Z\rangle$ as a function of $t$ and $r$ on $\sigma^{-1}(q)$. Equation (2) implies $\langle V, Z\rangle = g(t)r^2$. Substituting this into (1) gives $g'(t)r^2 = \mathfrak{h}(r)\,z\tilde f$, where $z \in T_q(S^2)$ lifts to $Z$ on $\sigma^{-1}(q)$. Since $\mathfrak{h}(r) = 1 - (2M/r)$, this equation is impossible unless both $g'(t)$ and $z\tilde f$ are zero. Then $g' = 0$ implies $\langle V, Z\rangle = kr^2$, where $k$ is constant relative to $t$ and $r$, but depends, of course, on $z$. Thus

$$k(z) = \langle V_{(t,r,q)}, Z_{(t,r,q)}\rangle/r^2 = \langle d\sigma(V_{(t,r,q)}), z\rangle.$$

Choosing $z$ from a frame at $q$ shows at once that $d\sigma(V_{(t,r,q)})$ is independent of $t$ and $r$. Thus $V \in \mathfrak{L}(S^2)$. Since $z\tilde f = 0$ for every tangent vector $z$ to $S^2$, the functions $\tilde f$ and hence $f$ are constant. Thus $Y = c\,\partial_t + V$ as required.

8. Corollary. (1) On $N$, every timelike Killing vector field has the form $c\,\partial_t$. (2) On $B$, every Killing vector field is spacelike.

Proof. For a Killing vector field on $N$ or $B$ write $Y = c\,\partial_t + V$ as above; since $V$ is vertical it is spacelike.

(1) If $Y$ is timelike on $N$, then

$$0 > \langle Y, Y\rangle = -c^2\mathfrak{h}(r) + \langle V, V\rangle.$$

Since $V$ is the lift of a vector field $V_0$ on $S^2$, $\langle V, V\rangle = \langle V_0, V_0\rangle r^2 \ge 0$. In any radial plane $\sigma^{-1}(q)$, if $r \to \infty$, then $\mathfrak{h}(r) \to 1$. Hence the inequality above can be maintained only if $\langle V_0, V_0\rangle(q) = 0$. Consequently $V_0$ and hence $V$ are zero.

(2) On $B$, $\partial_t$ also is spacelike since $\mathfrak{h} < 0$ on $B$.

SCHWARZSCHILD OBSERVERS

We consider some basic features of the Schwarzschild exterior $N$. The Killing vector field $\partial_t$ is timelike on $N$, and $N$ is time-oriented by requiring that $\partial_t$ be future-pointing. By construction, $N$ is static relative to the corresponding observer field

$$U = \partial_t/\sqrt{\mathfrak{h}}.$$

The integral curves $\alpha$ of $U$ are called Schwarzschild observers (represented by vertical lines in Figure 1). As predicted by Lemma 12.37, $U$ is synchronizable, and Schwarzschild time $t$ is the average time, since, by Lemma 3, $U = -\sqrt{\mathfrak{h}}\,\operatorname{grad} t$. If $\tau$ is the proper time of a Schwarzschild observer $\alpha$, then

$$\frac{d(t\circ\alpha)}{d\tau} = \langle\alpha', \operatorname{grad} t\rangle = \langle U, \operatorname{grad} t\rangle = \mathfrak{h}(\alpha)^{-1/2},$$

and this time dilation is constant along $\alpha$.
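As a numerical illustration of the time dilation $dt/d\tau = \mathfrak{h}^{-1/2}$ for a Schwarzschild observer, the following is a minimal sketch in geometric units. The function names `h` and `time_dilation` and the sample radii are illustrative choices, not notation from the text.

```python
import math


def h(r, M):
    """Schwarzschild function h(r) = 1 - 2M/r (geometric units)."""
    return 1.0 - 2.0 * M / r


def time_dilation(r, M):
    """dt/dtau = h(r)^(-1/2) for a Schwarzschild observer at radius r > 2M."""
    if r <= 2.0 * M:
        raise ValueError("Schwarzschild observers exist only for r > 2M")
    return 1.0 / math.sqrt(h(r, M))


if __name__ == "__main__":
    M = 1.0  # illustrative mass; radii below are in units of M
    for r in (2.01, 3.0, 10.0, 1000.0):
        print(f"r = {r:8.2f}   dt/dtau = {time_dilation(r, M):.6f}")
```

The printed factor stays close to 1 far from the star and grows without bound as the radius approaches $2M$.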

Since $0 < \mathfrak{h} < 1$ on $N$, Schwarzschild time is always faster than the proper time of a Schwarzschild observer. The two times are nearly the same if the observer is far from the star, where $\mathfrak{h}(r) \approx 1$. But since $\mathfrak{h}(r) \to 0$ as $r \to 2M$, Schwarzschild time speeds up unboundedly for observers with $r(\alpha)$ ever closer to $2M$.

By construction, the restspaces of $U$ are all naturally isometric to the standard Schwarzschild restspace $S$: the region $r > 2M$ in $\mathbf{R}^+ \times S^2$, with line element $\mathfrak{h}^{-1}\,dr^2 + r^2\,d\sigma^2$. Again for $r$ large, hence $\mathfrak{h}(r)$ near 1, $S$ is close to Euclidean space (described spherically). Thus Newtonian notions of both time and space are approximately valid far from the star.

If $\gamma$ is a particle in $N$, deleting its $t$ coordinate projects $\gamma$ to a curve $\vec\gamma$ in the restspace $S$. (Parametrizing $\vec\gamma$ by $t$ would give the analogue of an associated Newtonian particle in special relativity.) For a Schwarzschild observer $\alpha$, the curve $\vec\alpha$ is constant: these observers are at rest. Hovering over a fixed point of the star at constant height, they are definitely not freely falling:

9. Lemma. If $\alpha$ is a Schwarzschild observer, then $\alpha'' = D_U U = (M/r^2)\,\partial_r$.

Proof. Since $U = \mathfrak{h}^{-1/2}\,\partial_t$ and $\partial\mathfrak{h}/\partial t = 0$, Proposition 4 gives $D_U U = \mathfrak{h}^{-1}D_{\partial_t}\partial_t = (M/r^2)\,\partial_r$.

Thus each observer must aim his rockets at the star and supply the acceleration required to remain at rest. In the analogous Newtonian case (Appendix C), the acceleration has magnitude $M/r^2$. Here, for $r$ large, $|\partial_r| = 1/\sqrt{\mathfrak{h}(r)}$ is close to 1, hence the lemma gives $|D_U U| \approx M/r^2$, supporting the designation of $M$ as the mass of the star. (See also Exercises 1, 2, and 3.) By Corollary 8, the black hole $B$ is not static relative to any observer field; we consider its physical interpretation later on.
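The comparison with the Newtonian acceleration can be made concrete. Since $|\partial_r| = \mathfrak{h}^{-1/2}$, Lemma 9 gives the magnitude $|D_U U| = (M/r^2)\,\mathfrak{h}^{-1/2}$. The sketch below, in geometric units and with hypothetical function names, compares this with the Newtonian value $M/r^2$.

```python
import math


def proper_acceleration(r, M):
    """|D_U U| = (M/r^2) / sqrt(h(r)) for an observer at rest at radius r > 2M."""
    h = 1.0 - 2.0 * M / r
    if h <= 0.0:
        raise ValueError("an observer can remain at rest only for r > 2M")
    return (M / r**2) / math.sqrt(h)


def newtonian_acceleration(r, M):
    """Magnitude M/r^2 of the Newtonian gravitational acceleration."""
    return M / r**2


if __name__ == "__main__":
    M = 1.0  # illustrative mass; radii below are in units of M
    for r in (2.5, 4.0, 10.0, 1000.0):
        ratio = proper_acceleration(r, M) / newtonian_acceleration(r, M)
        print(f"r = {r:7.1f}   relativistic/Newtonian = {ratio:.6f}")
```

The ratio tends to 1 for large $r$ and diverges as $r \to 2M$, consistent with the remark that hovering becomes ever more costly near the Schwarzschild radius.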

SCHWARZSCHILD GEODESICS

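Schwarzschild-spherical coordinates $x^0 = t$, $x^1 = r$, $x^2 = \vartheta$, $x^3 = \varphi$ are orthogonal, hence by Exercise 3.15 the geodesic equations become

$$\frac{d}{ds}\left[g_{ii}\frac{dx^i}{ds}\right] = \frac{1}{2}\sum_{j=0}^{3}\frac{\partial g_{jj}}{\partial x^i}\left(\frac{dx^j}{ds}\right)^2 \qquad (0 \le i \le 3),$$

where $g_{00} = -\mathfrak{h}$, $g_{11} = \mathfrak{h}^{-1}$, $g_{22} = r^2$, $g_{33} = r^2\sin^2\vartheta$.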

10. Lemma. If $\gamma$ is a geodesic in $N \cup B$, then for constants $E$, $L$,

(1) $\mathfrak{h}\,dt/ds = E$,
(2) $r^2\sin^2\vartheta\,d\varphi/ds = L$,
(3) $(d/ds)(r^2\,d\vartheta/ds) = r^2\sin\vartheta\cos\vartheta\,(d\varphi/ds)^2$.

Here (1) and (2) follow from the geodesic equations for $i = 0, 3$, since no $g_{jj}$ involves either $t$ or $\varphi$. Equation (3) will be simplified below. For the geodesic equation for $r$, see Exercise 5.

A curve in $N \cup B$ is initially equatorial relative to spherical coordinates $\vartheta, \varphi$ on $S^2$ provided it has $\vartheta(0) = \pi/2$ and $(d\vartheta/ds)(0) = 0$. Evidently a suitable rotation of coordinates will make any curve initially equatorial (assuming $0 \in I$).

11. Proposition. Let $\gamma$ be a freely falling material particle in $N \cup B$ that is initially equatorial relative to Schwarzschild-spherical coordinates. Then

(G1) $\mathfrak{h}\,dt/d\tau = E$,
(G2) $r^2\,d\varphi/d\tau = L$,
(G3) $\vartheta = \pi/2$,

where $E$ and $L$ are constants. Furthermore the energy equation holds:

$$E^2 = (dr/d\tau)^2 + (1 + (L^2/r^2))\,\mathfrak{h}(r).$$

Proof. Equations (G1)-(G3) will follow from the corresponding formulas in Lemma 10, with the parameter $s$ now proper time $\tau$. (1) is unchanged, and evidently $\vartheta = \pi/2$ is the unique solution of (3) satisfying the equatorial initial conditions. Thus (2) implies (G2). Then $\gamma' = \sum (dx^i/d\tau)\,\partial_i$ becomes $\gamma' = (E/\mathfrak{h})\,\partial_t + (dr/d\tau)\,\partial_r + (L/r^2)\,\partial_\varphi$. Hence

$$-1 = \langle\gamma', \gamma'\rangle = (E^2/\mathfrak{h}^2)(-\mathfrak{h}) + (dr/d\tau)^2\,\mathfrak{h}^{-1} + (L^2/r^4)\,r^2.$$

The energy equation follows.

These equations embody Galileo's principle that the motion of a freely falling body is independent of its mass, a result obtained in Newtonian theory only by equating inertial and gravitational mass (Appendix C). By (G3), the particle remains always over a great circle of $S^2$ (or the surface of the star). On $N$ at least, equation (G2) is formally identical with Kepler's second law (Appendix C); hence we call $L$ the angular momentum per unit mass of the particle.

If $\gamma$ has mass $m$ and hence energy-momentum $m\gamma'$, then in $N$ the Schwarzschild observers measure its energy as

$$-\langle m\gamma', U\rangle = -\langle m\gamma', \mathfrak{h}^{-1/2}\,\partial_t\rangle = m\sqrt{\mathfrak{h}}\,(dt/d\tau).$$

By (G1), $\mathfrak{h}\,dt/d\tau$ is the constant $E$. But $\mathfrak{h} \to 1$ as $r \to \infty$, hence $E$ is called the energy per unit mass at infinity of the particle. Since $\gamma'$ is future-pointing, $dt/d\tau$ and hence $E$ are positive. (For brevity we sometimes call $E$ and $L$ merely energy and angular momentum.)

FREE FALL ORBITS

Let $\gamma$ be a freely falling material particle in the Schwarzschild exterior $N$. The projection $\vec\gamma$ into the Schwarzschild restspace $S$ lies in the orbital plane of $\gamma$, given in equatorial coordinates by $\vartheta = \pi/2$. There we can write $\vec\gamma(\tau) = (r(\tau), \varphi(\tau))$, inviting comparison with the polar coordinate description of Newtonian gravitation. The orbit of $\gamma$ is the route followed by $\vec\gamma$ in the orbital plane. The particle is ingoing when $dr/d\tau < 0$, outgoing when $dr/d\tau > 0$.

In dealing with initial conditions, we write simply 0 instead of $\tau_0$. Using Proposition 11 it is easy to check that if $r(0)$ and, of course, $M$ are given, then $E$, $L$, and $\operatorname{sgn}(dr/d\tau)(0)$ uniquely determine the coordinates of the 4-velocity $\gamma'(0)$.

In order to relate the character of the orbit to the physical parameters $E$ and $L$ of the particle $\gamma$, write the energy equation as

$$E^2 = \left(\frac{dr}{d\tau}\right)^2 + V(r),$$

where

$$V(r) = \left(1 + \frac{L^2}{r^2}\right)\mathfrak{h}(r) = 1 - \frac{2M}{r} + \frac{L^2}{r^2} - \frac{2ML^2}{r^3}.$$
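The remark that $E$, $L$, and the sign of $dr/d\tau$ determine the 4-velocity at a given radius can be sketched as follows, using (G1), (G2), (G3) and the energy equation of Proposition 11. The function names and the sample values of $E$, $L$, $r(0)$ are illustrative assumptions; units are geometric.

```python
import math


def h(r, M):
    """Schwarzschild function h(r) = 1 - 2M/r."""
    return 1.0 - 2.0 * M / r


def effective_potential(r, M, L):
    """V(r) = (1 + L^2/r^2) h(r) from the energy equation."""
    return (1.0 + L**2 / r**2) * h(r, M)


def four_velocity(r0, M, E, L, sign):
    """Equatorial components (dt, dr, dtheta, dphi)/dtau at radius r0.

    sign is +1 for an outgoing particle and -1 for an ingoing one.
    """
    radial_sq = E**2 - effective_potential(r0, M, L)
    if radial_sq < 0.0:
        raise ValueError("E^2 < V(r0): no particle with these parameters at r0")
    return (E / h(r0, M), sign * math.sqrt(radial_sq), 0.0, L / r0**2)


def norm_squared(velocity, r0, M):
    """<gamma', gamma'> in the Schwarzschild metric on the equator (sin theta = 1)."""
    dt, dr, dtheta, dphi = velocity
    hh = h(r0, M)
    return -hh * dt**2 + dr**2 / hh + r0**2 * (dtheta**2 + dphi**2)


if __name__ == "__main__":
    vel = four_velocity(r0=10.0, M=1.0, E=1.0, L=4.0, sign=-1)
    print("4-velocity components:", vel)
    print("<gamma', gamma'> =", norm_squared(vel, 10.0, 1.0))
```

The last line serves as a consistency check: a material particle parametrized by proper time must have $\langle\gamma', \gamma'\rangle = -1$.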

The energy equation can be regarded as expressing conservation of energy for a unit mass particle whose motion on the half-line $\mathbf{R}^+$ is given by $r = r\circ\gamma$. Ignoring a factor $\frac{1}{2}$, we call $V(r)$ the effective potential energy of $\gamma$. Energy diagrams can now be constructed as in Appendix C.

(1) Plot the graph of $V(r)$ and draw a horizontal line at height $E^2$.
(2) Since $(dr/d\tau)^2 \ge 0$, the energy equation restricts $r = r\circ\gamma$ to the component $I$ of $\{r : V(r) \le E^2\}$ containing its initial value $r(0)$.
(3) Since $2\,d^2r/d\tau^2 = -V'(r)$ (Exercise 5), a critical point of $V$ represents a circular orbit $r = r_0$, stable for a minimum, unstable for a maximum. Furthermore, a noncritical point $V(r) = E^2$ is a turning point: $r$ bounces off to retraverse $I$ in the opposite direction.

Figure 3. Ordinary relativistic orbits: (a) Crash orbit. (b) Crash/escape orbit. (c) Bound orbit. (d) Flyby orbit.

The effective potential $V(r)$ differs from that in Newtonian gravitation (Appendix C) by the relativistic correction term $-2ML^2/r^3$, constants being unimportant. Thus for $r$ large we expect approximately Newtonian results. But in general the relativistic situation is more complex since the shape of the graph of $V(r)$ depends on the ratio $L/M$. In all cases, $\lim_{r\to 0}V(r) = -\infty$ and $\lim_{r\to\infty}V(r) = 1$. One major change in the shape of the graph is signaled by the appearance of critical points $r_1 \le r_2$, computable from

$$V'(r) = (2/r^4)\,[Mr^2 - L^2 r + 3ML^2].$$
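Setting the bracket in $V'(r)$ to zero gives a quadratic in $r$ whose roots are the critical points. The sketch below solves it directly; to keep the borderline cases exact in floating point, the functions take $L^2$ as the argument `L2`. The function names are hypothetical and units are geometric.

```python
import math


def V(r, M, L2):
    """Effective potential V(r) = 1 - 2M/r + L^2/r^2 - 2M L^2/r^3, with L2 = L^2."""
    return 1.0 - 2.0 * M / r + L2 / r**2 - 2.0 * M * L2 / r**3


def critical_points(M, L2):
    """Roots r1 <= r2 of M r^2 - L^2 r + 3 M L^2 = 0, or () if there are none."""
    disc = L2**2 - 12.0 * M**2 * L2
    if disc < 0.0:
        return ()
    root = math.sqrt(disc)
    return ((L2 - root) / (2.0 * M), (L2 + root) / (2.0 * M))


if __name__ == "__main__":
    M = 1.0
    for L2 in (9.0, 12.0, 14.0, 16.0, 25.0):
        pts = critical_points(M, L2)
        heights = tuple(round(V(r, M, L2), 6) for r in pts)
        print(f"L^2 = {L2:5.1f}   critical points = {pts}   V = {heights}")
```

With $M = 1$ the discriminant vanishes at $L^2 = 12$, where the two critical points merge at $r = 6$, and at $L^2 = 16$ the inner critical value equals 1.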

When two critical points $r_1 < r_2$ exist, $r_1$ is a local maximum and $r_2$ is a local minimum. An orbit is exceptional if $E^2 = V(r_1)$; otherwise it is ordinary. As we shall see, exceptional orbits are geometrically interesting, but they lack physical significance, since they can be made ordinary by arbitrarily small changes in $E$ or $L$ (while ordinary orbits remain ordinary). Considering at first only ordinary orbits, we now find the four types suggested in Figure 3.

Case I. Low Angular Momentum: $L^2 < 12M^2$. $V(r)$ has no critical points, hence is strictly increasing (see Figure 4). Thus there are two orbit types, depending on $E$:

(a) $E^2 < 1$: crash orbit. Ingoing particles crash directly into the star. (For $r^* > 2M$ this means that $r\circ\gamma$ approaches $r^*$, hence $\gamma$ meets the star, thereby leaving our model. In the black hole case, $r\circ\gamma$ approaches 0, and the term "crash" is amply justified.) Initially outgoing particles move out to a turning point then back in to crash.

Figure 4. Low angular momentum.

(b) $E^2 \ge 1$: crash/escape orbit. Ingoing particles crash; outgoing particles escape to infinity.

When $L^2$ reaches $12M^2$, a critical point appears at $6M$ with $V(6M) = 8/9$.

Case II. Moderate Angular Momentum: $12M^2 < L^2 < 16M^2$. There are now two critical points, $r_1 < 6M < r_2$, forming the crest and trough of a potential well (Figure 5). Three ordinary orbit types occur in four cases, depending on energy $E$ and $r(0)$:

(a) $E^2 < V(r_1)$ and $r(0) < r_1$: crash orbit as in Case I(a).
(b) $V(r_2) \le E^2 < V(r_1)$ and $r(0) > r_1$: bound orbit. The particle is in the potential well, its Schwarzschild radius oscillating between maximum and minimum, with a stable circular orbit at $r = r_2$. By (G2), $d\varphi/d\tau$ is thus bounded away from zero, so the particle travels perpetually around the star.
(c) $V(r_1) < E^2 < 1$: crash orbit.
(d) $E^2 \ge 1$: crash/escape orbit as in Case I(b).

At $L^2 = 16M^2$ the crest $V(r_1)$ of the potential well, rising with $L/M$, reaches $V(\infty) = 1$.

Case III. Large Angular Momentum: $L^2 > 16M^2$. The situation is now dominated by the rising potential barrier with crest $V(r_1)$. (See Figure 6.)

(a) $E^2 < V(r_1)$ and $r(0) < r_1$: crash orbit.
(b) $V(r_2) \le E^2 < 1$ and $r(0) > r_1$: bound orbit as in Case II(b).

Figure 5. Moderate angular momentum.

Figure 6. Large angular momentum.

(c) $1 \le E^2 < V(r_1)$ and $r(0) > r_1$: flyby orbit. An outgoing particle has enough energy to escape. In the incoming case the potential barrier (derived from the large angular momentum) protects the particle from a crash and it turns back to escape to infinity.
(d) $E^2 > V(r_1)$: crash/escape orbit.

The simplest exceptional orbits are the unstable circular orbits $r = r_1$ at the local maxima of $V(r)$, or at $r_1 = r_2$ if $L^2 = 12M^2$. The other exceptional orbits are infinite spirals approaching ever closer to these circular orbits (see Example 35(1)). Nearby ordinary orbits are also very different from Keplerian ellipses and hyperbolas since, by continuity, they too must spiral to some extent while close to $r = r_1$.

A freely falling particle with $L = 0$ is moving radially, since equations (G2) and (G3) show that its $\varphi$ and $\vartheta$ coordinates are constant. Its energy equation is

$$E^2 - 1 = (dr/d\tau)^2 - 2M/r,$$

which, but for notation, agrees with the equation for radial motion in Newtonian theory (Appendix C) and also with the Friedmann equation (Lemma 12.20). As for the latter, we can get parametric solutions. For example, consider the low energy case, in which the particle eventually falls back to the star.

12. Lemma. In $N$, let $\gamma$ be a freely falling material particle with $L = 0$ and $E < 1$. The proper time $\tau$ and the Schwarzschild radius $r$ of $\gamma$ are cycloidally related by

$$\tau = \tfrac{1}{2}R\sqrt{R/2M}\,(\eta + \sin\eta), \qquad r = \tfrac{1}{2}R\,(1 + \cos\eta),$$

where $\tau = 0$ at the maximum radius $R$ of $\gamma$.

Proof. At maximum radius, $dr/d\tau$ is zero, hence $E^2 = \mathfrak{h}(R) = 1 - (2M/R)$. Thus the energy equation becomes

$$(dr/d\tau)^2 + 2M/R = 2M/r.$$

Direct substitution shows that the formulas provide the required solution.
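The substitution can also be checked numerically. The sketch below evaluates the cycloidal parametrization $\tau = \tfrac12 R\sqrt{R/2M}\,(\eta + \sin\eta)$, $r = \tfrac12 R\,(1 + \cos\eta)$ and estimates $dr/d\tau$ by central differences in $\eta$, then reports the residual of $(dr/d\tau)^2 + 2M/R = 2M/r$. The function names, the step size, and the sample value $R = 8M$ are assumptions made for illustration.

```python
import math


def cycloid(eta, R, M):
    """Return (tau, r) on the radial free-fall cycloid with maximum radius R."""
    tau = 0.5 * R * math.sqrt(R / (2.0 * M)) * (eta + math.sin(eta))
    r = 0.5 * R * (1.0 + math.cos(eta))
    return tau, r


def energy_residual(eta, R, M, d_eta=1e-6):
    """(dr/dtau)^2 + 2M/R - 2M/r, with dr/dtau from central differences in eta."""
    tau_a, r_a = cycloid(eta - d_eta, R, M)
    tau_b, r_b = cycloid(eta + d_eta, R, M)
    _, r = cycloid(eta, R, M)
    dr_dtau = (r_b - r_a) / (tau_b - tau_a)
    return dr_dtau**2 + 2.0 * M / R - 2.0 * M / r


if __name__ == "__main__":
    M, R = 1.0, 8.0  # illustrative values in geometric units
    for eta in (0.5, 1.0, 2.0, 3.0):
        tau, r = cycloid(eta, R, M)
        res = energy_residual(eta, R, M)
        print(f"eta = {eta:.1f}   tau = {tau:8.4f}   r = {r:.4f}   residual = {res:.2e}")
```

Residuals at the level of rounding error indicate that the parametrization satisfies the energy equation along the fall.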

PERIHELION ADVANCE

We want to obtain more precise information about the orbit of a freely falling material particle $\gamma$ in the Schwarzschild exterior $N$. In the nontrivial case $L \ne 0$, equation (G2) implies that $d\varphi/d\tau$ is nonvanishing, hence the orbit can be described by expressing $r$ as a function of $\varphi$.

13. Proposition (The Orbit Equation). For a freely falling material particle $\gamma$ in $N$ with $L \ne 0$,

$$d^2u/d\varphi^2 + u = (M/L^2) + 3Mu^2,$$

where $u = 1/r$, $r = r\circ\gamma$.

Proof. Using (G2),

$$\frac{dr}{d\tau} = \frac{dr}{d\varphi}\frac{d\varphi}{d\tau} = \frac{L}{r^2}\frac{dr}{d\varphi},$$

hence the energy equation becomes

$$E^2 = \frac{L^2}{r^4}\left(\frac{dr}{d\varphi}\right)^2 + \left(1 + \frac{L^2}{r^2}\right)\left(1 - \frac{2M}{r}\right).$$

Setting $r = 1/u$ gives

$$E^2 = L^2(du/d\varphi)^2 + (1 + L^2u^2)(1 - 2Mu).$$

Then taking another derivative gives the result.

This equation differs from its Newtonian analogue (Appendix C) by the relativistic correction term $3Mu^2$. Obviously $3Mu^2 \ll u$ for large $r$. Furthermore $3Mu^2 \ll M/L^2$ if the tangential component $r\,d\varphi/d\tau$ of orbital velocity is small compared to the speed of light, since by (G2)

$$\frac{3Mu^2}{M/L^2} = \frac{3L^2}{r^2} = 3\left(r\frac{d\varphi}{d\tau}\right)^2.$$

Thus the relativistic orbit will be nearly Keplerian when these conditions hold. (The comparison is valid, since, for $r \gg M$, the orbital plane is nearly Euclidean.) In the bound case each trip around the star does indeed have an approximately elliptical orbit, but the ellipses are not the same for successive trips. This precession can be measured by change of perihelion (point at which $r$ is a minimum).

14. Corollary. Let $\gamma$ be a freely falling particle in bound orbit around a Schwarzschild star of mass $M$. If $L \gg M$ and $r(d\varphi/d\tau) \ll 1$, then the orbit of $\gamma$ is approximately elliptical with angular perihelion advance

$$\delta \approx \frac{6\pi M^2}{L^2} = \frac{6\pi M}{a(1 - e^2)} \text{ radians per revolution,}$$

where a is the semimajor axis and e the eccentricity of the ellipse. To see this, recall from Appendix C that the Newtonian orbit equation u = M / L ~has ii = (M/L’)(~ e cos cp) as solution with perihelion at q~ = 0. The hypotheses imply r 9 L 9 M so this is close to the relativistic solution. To obtain a more refined approximation, we use ii in the relativistic correction term thus (d2u/dcp2)

+

+

(d2U/dcp2)

+ u = ( M / L ~+) 3MU“’.

A routine computation gives u

M = - (1

L2

+ e cos cp) +

cos 240

as the solution with perihelion at cp

du dcp

--

Since L 3

Me . -_ sin

M,

L2

qI

=

+ ecp sin cp

0. To find the next perihelion we need

e [? sin 2cp + sin cp + cp cos cp + 3M3e L

1

.

the dominant terms here are the first and the one involving 6,

cp cos cp. Thus the next perihelion will occur near cp = 271 at say 2 x

+

where du

-sin 6

Hence

-

1

+ 3M2 (271 + 6) cos 6 L2 ~

.

Again since L & M, we obtain 6 671~~,/L’. The alternative formula follows since a(1 - e 2 ) = L’/M for a Keplerian ellipse (Appendix C).

Since perihelion advance is always positive, we can think of the particle as traversing an elliptical orbit in a plane that is itself slowly rotating in the same direction.

This early result of Einstein led to the first experimental test of general relativity. The orbit of the planet Mercury has eccentricity 0.206, largest of the well-charted planets, so its perihelion can be located with precision. All known perturbations of perfect elliptical orbit (largely due to the gravitational effects of the other planets) give by Newtonian theory a predicted perihelion advance for Mercury about 43 sec of arc per century less than the observed value. The relativistic advance accounts for this discrepancy. In fact the hypotheses of the corollary are fulfilled, and using

$M$ = mass of the sun = $1.48 \times 10^5$ cm, $a$ = semimajor axis for Mercury = $5.79 \times 10^{12}$ cm,

we compute $\delta \approx 5.02 \times 10^{-7}$ radians per revolution. There are about $2.06 \times 10^5$ sec of arc per radian, and with its period of 87.96 days, Mercury makes 415.2 revolutions per century. The relativistic perihelion advance is thus about

$$(5.02)(2.06)(415.2) \times 10^{-2} \approx 43 \text{ sec per century.}$$
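The arithmetic for Mercury can be reproduced in a few lines from the formula $\delta \approx 6\pi M/a(1 - e^2)$ of Corollary 14. This is a sketch using the values quoted in the text; the constant names and the choice of 36525 days per century are assumptions, so the last digits may differ slightly from the rounded figures above.

```python
import math

M_SUN_CM = 1.48e5           # mass of the sun in geometric units (cm), as quoted
A_MERCURY_CM = 5.79e12      # semimajor axis of Mercury's orbit (cm), as quoted
ECCENTRICITY = 0.206        # eccentricity of Mercury's orbit, as quoted
PERIOD_DAYS = 87.96         # orbital period of Mercury, as quoted
DAYS_PER_CENTURY = 36525.0  # assumption: Julian century
ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def perihelion_advance(M, a, e):
    """Perihelion advance 6*pi*M / (a (1 - e^2)) in radians per revolution."""
    return 6.0 * math.pi * M / (a * (1.0 - e**2))


def advance_per_century_arcsec(M, a, e, period_days):
    """Accumulated perihelion advance over one century, in seconds of arc."""
    revolutions = DAYS_PER_CENTURY / period_days
    return perihelion_advance(M, a, e) * ARCSEC_PER_RADIAN * revolutions


if __name__ == "__main__":
    delta = perihelion_advance(M_SUN_CM, A_MERCURY_CM, ECCENTRICITY)
    total = advance_per_century_arcsec(M_SUN_CM, A_MERCURY_CM, ECCENTRICITY, PERIOD_DAYS)
    print(f"advance per revolution: {delta:.3e} rad")
    print(f"advance per century:    {total:.1f} arcsec")
```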

LIGHTLIKE ORBITS

Now consider the behavior of a lightlike particle $\gamma$ (photon, neutrino, graviton) in the Schwarzschild exterior. By definition, $\gamma$ is a geodesic, so equations (G1)-(G3) in Proposition 11 are the same as for a material particle (with affine parameter $s$ instead of proper time), but the energy equation is altered since $\langle\gamma', \gamma'\rangle$ is now 0 instead of $-1$.

15. Proposition. If $\gamma$ is a lightlike particle in $N \cup B$, then relative to equatorial coordinates,

(G1) $\mathfrak{h}\,dt/ds = E$,
(G2) $r^2\,d\varphi/ds = L$,
(G3) $\vartheta = \pi/2$,

and $\gamma$ satisfies the energy equation

$$E^2 = (dr/ds)^2 + (L^2/r^2)\,\mathfrak{h}(r).$$

As before, on $N$ we interpret the constants $E$ and $L$ as the energy at infinity and angular momentum of $\gamma$.

A lightlike particle with $L = 0$ has both $\varphi$ and $\vartheta$ constant, hence moves radially. Since each radial plane is naturally isometric to $P_I$, Example 5.41 gives explicit formulas for the coordinates $t$ and $r$ of the particle.

As with a material particle, the Schwarzschild view of a lightlike particle $\gamma$ is as its projection $\vec\gamma = (r, \varphi)$ into its orbital plane in the Schwarzschild restspace. In Newtonian theory, light rays are straight lines, but relativistically they are influenced by gravitation: $\gamma$ is geodesic, but $\vec\gamma$ is accelerating, producing the so-called "bending of light" by the star.

By contrast with the material case, the orbit of $\gamma$ does not depend on $E$ and $L$ separately, but only on the ratio $b = |L|/E$, called the impact parameter of $\gamma$. For $b \ne 0$, $r$ becomes a function of $\varphi$. After rewriting the energy equation as

$$\left(\frac{1}{L}\frac{dr}{ds}\right)^2 + \frac{\mathfrak{h}}{r^2} = \frac{1}{b^2},$$

we use (G2) to get

16. Corollary. For a lightlike particle in $N$ with impact parameter $b \ne 0$,

$$\left(\frac{1}{r^2}\frac{dr}{d\varphi}\right)^2 + \frac{\mathfrak{h}}{r^2} = \frac{1}{b^2}.$$

As usual we interpret $\mathfrak{h}/r^2$ as an effective potential. This function rises from limit $-\infty$ at $r = 0$ and, crossing the $r$ axis at $2M$, reaches a maximum of $1/27M^2$ at $3M$, then declines toward zero. The qualitative character of lightlike orbits can then be read from Figure 7.

Case I. Small Impact Parameter: $b < 3\sqrt{3}M$. Depending on initial conditions, $\gamma$ either crashes into the star or escapes to infinity.

For impact parameter $3\sqrt{3}M$ there is an unstable circular orbit $r = 3M$. A Schwarzschild observer at this radius can see the back of his own head. Two exceptional orbit types spiral toward this circular orbit.
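The location and height of the maximum of the lightlike effective potential, and the corresponding critical impact parameter, can be checked with a short sketch. The function names and the sampling grid are illustrative assumptions; units are geometric.

```python
import math


def photon_potential(r, M):
    """Effective potential h(r)/r^2 = (1 - 2M/r)/r^2 for lightlike particles."""
    return (1.0 - 2.0 * M / r) / r**2


def critical_impact_parameter(M):
    """Impact parameter b with 1/b^2 equal to the maximum 1/(27 M^2) of the potential."""
    return 3.0 * math.sqrt(3.0) * M


def sampled_maximum(M, r_min=2.0, r_max=10.0, samples=80001):
    """Locate the maximum of the potential on a uniform grid (brute force)."""
    best_r, best_v = None, -math.inf
    for i in range(samples):
        r = r_min + (r_max - r_min) * i / (samples - 1)
        v = photon_potential(r, M)
        if v > best_v:
            best_r, best_v = r, v
    return best_r, best_v


if __name__ == "__main__":
    M = 1.0
    r_peak, v_peak = sampled_maximum(M)
    print(f"maximum near r = {r_peak:.4f}, value {v_peak:.6f} (1/27 = {1/27:.6f})")
    print(f"critical impact parameter b = {critical_impact_parameter(M):.6f}")
```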

Figure 7. Graph of $\mathfrak{h}/r^2$.

Figure 8. Photons in an orbital plane.

Case II. Large Impact Parameter: $b > 3\sqrt{3}M$. (a) $r(0) < 3M$: crash orbit; (b) $r(0) > 3M$: flyby orbit.

The only bound orbit is the unstable one at $r = 3M$; practically speaking, any photon in $N$ either escapes to infinity or meets the star.

The impact parameter $b$ of $\gamma$ has a more geometric characterization in terms of the angle between orbital velocity $\vec\gamma\,'$ and the inward radial direction $-\partial_r$ (Figure 8).

17. Lemma. If $\gamma$ is a lightlike particle in $N$ with impact parameter $b$, then $r\sin\psi = b\sqrt{\mathfrak{h}(r)}$, where $\psi = \angle(-\partial_r, \vec\gamma\,')$.

Proof. In terms of equatorial coordinates, adding $\vec\gamma\,' = (dr/ds)\,\partial_r + (d\varphi/ds)\,\partial_\varphi$ and $(dt/ds)\,\partial_t$ gives the null vector $\gamma'$. Hence by (G1),

$$|\vec\gamma\,'| = |(dt/ds)\,\partial_t| = (E/\mathfrak{h})\,|\partial_t| = E/\sqrt{\mathfrak{h}}.$$

Then using (G2),

$$\sin\psi = \frac{|(d\varphi/ds)\,\partial_\varphi|}{|\vec\gamma\,'|} = \frac{|L|}{r}\cdot\frac{\sqrt{\mathfrak{h}}}{E} = \frac{b\sqrt{\mathfrak{h}}}{r}.$$

We can now find what a Schwarzschild observer sees of a star of radius $r^* > 3M$. Drawing the line $r = r^*$ in Figure 7 shows that a photon sent from $r > r^*$ will escape to infinity if and only if its impact parameter $b$ is greater than $r^*/\sqrt{\mathfrak{h}(r^*)}$. An ingoing photon with $b^* = r^*/\sqrt{\mathfrak{h}(r^*)}$ has a turning point $dr/ds = 0$ at radius $r^*$, so it just grazes the star (Figure 8). By the lemma, the initial angle $\psi^*$ of such a grazing photon satisfies $r\sin\psi^* = b^*\sqrt{\mathfrak{h}(r)}$, with $\psi^* < \pi/2$ since it is ingoing. But the orbits of particles are reversible (Exercise 10), so the star will appear as a disk of angular radius $\psi^*$. Eliminating $b^*$ from the two equations above thus gives the following result.

18. Corollary. For a Schwarzschild observer at radius $r$, a star of radius $r^* > 3M$ has angular radius $\psi^* \le \pi/2$ such that

$$\sin\psi^* = \frac{r^*}{r}\sqrt{\frac{\mathfrak{h}(r)}{\mathfrak{h}(r^*)}}.$$

The square root factor here is greater than 1, showing that as expected, the bending of light makes the star appear larger than the straight line Newtonian analogue $\sin\psi^* = r^*/r$.

As with material particles, substituting $r = 1/u$ in Corollary 16 and differentiating gives

19. Corollary (The Orbit Equation). For a lightlike particle $\gamma$ in $N$ with $L \ne 0$,

$$d^2u/d\varphi^2 + u = 3Mu^2,$$

where $u = 1/r$, $r = r\circ\gamma$.

As the Schwarzschild radius approaches infinity, the relativistic correction term $3Mu^2$ above becomes negligibly small and the orbital plane approaches flatness. Thus for a lightlike particle in flyby orbit, as its parameter $s$ approaches $\pm\infty$, its orbit approaches limiting straight lines. If the entire orbit is far from the star, we can measure its total bending by means of the deflection angle $\Delta$ between the limiting lines (Figure 9).

20. Corollary. If $\gamma$ is a lightlike particle in flyby orbit in $N$ with perihelion $r_0 \gg M$, then its deflection angle is approximately $4M/r_0$.

Figure 9

We use the same scheme as for Corollary 14. The (straight line) solution of $d^2u/d\varphi^2 + u = 0$ with perihelion at $\varphi = 0$ is $\tilde u = a\cos\varphi$, where $a = 1/r_0$. Replacing the right-hand side of the orbit equation by $3M\tilde u^2$, we compute

$$u = a(1 - Ma)\cos\varphi + Ma^2(2 - \cos^2\varphi)$$

as the solution for which $u(0) = a$. Since $a \gg Ma^2$ and as $s \to \infty$ both $u$ and $\cos\varphi$ approach 0, the dominant terms in the limit are

$$0 \approx a\cos\varphi_\infty + 2Ma^2.$$

Thus $\cos\varphi_\infty \approx -2Ma$. It is clear from Figure 9 that this limit angle is $\pi/2$ plus half the deflection angle $\Delta$. Hence

$$\Delta \approx 2\sin\tfrac{1}{2}\Delta = -2\cos(\tfrac{1}{2}\pi + \tfrac{1}{2}\Delta) \approx 4Ma = 4M/r_0.$$

During a solar eclipse, light from a distant star that just grazes the sun en route to earth is observed to be slightly displaced from its direction with the same earth-star position but the sun elsewhere. The earth-sun distance is sufficiently large compared to the radius $r_0$ of the sun that the deflection angle is essentially at its limiting value in the preceding corollary. For solar mass $M = 1.5$ km and $r_0 = 7 \times 10^5$ km, the deflection angle is

$$\Delta \approx 4M/r_0 \approx 8.6 \times 10^{-6} \text{ rad} \approx 1.7 \text{ sec,}$$

which is close to observed values.
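The deflection estimate $4M/r_0$ of Corollary 20 is easily evaluated for the sun. The following sketch uses the values quoted in the text; the function and constant names are illustrative.

```python
import math

ARCSEC_PER_RADIAN = 180.0 * 3600.0 / math.pi


def deflection_angle(M, r0):
    """First-order deflection 4M/r0 (radians) of light with perihelion r0 >> M."""
    return 4.0 * M / r0


if __name__ == "__main__":
    M_SUN_KM = 1.5         # solar mass in geometric units (km), as quoted
    R_SUN_KM = 7.0e5       # solar radius (km), as quoted
    delta = deflection_angle(M_SUN_KM, R_SUN_KM)
    print(f"deflection: {delta:.3e} rad = {delta * ARCSEC_PER_RADIAN:.2f} arcsec")
```

Because the formula is only a first-order approximation in $M/r_0$, it should not be applied to orbits passing close to $r = 3M$.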

STELLAR COLLAPSE

The first role of the Schwarzschild model was to confirm the essential validity of general relativity by accounting for minor flaws in the Newtonian model of the solar system. Since the solar system is not very relativistic, such refinements, though profound theoretically, are qualitatively not too significant. More recently, however, stellar objects have been discovered for which the Newtonian approximation is crude and in the extreme case fails completely.

A star is formed when a cloud of gas is drawn together by gravity; as its density increases it gets hotter and nuclear burning begins. Then for a long period (say a billion years) near equilibrium prevails: energy loss by radiation is balanced by energy gain from the nuclear reaction; gravitational tendency toward contraction is balanced by pressures from the hot dense core. (Our sun is now in this equilibrium state.) Eventually the supply of nuclear fuels runs low and gravitational contraction begins. This phase is short-lived and ends violently; though the process is not fully understood, several possible endstates have been singled out, including the following:

(1) For masses less than about 2 that of the sun, the collapsing star may stabilize as a white dwarf, at a size comparable with the earth. This produces enormous densities, but since $r^*$ is on the order of $1000M$, relativistic effects are not decisive. White dwarfs are fairly common throughout the universe.
(2) For somewhat larger masses the collapse may continue, stripping atomic nuclei of their electrons and packing them together as a neutron star. With radii of say 10-15 km, enormously high densities are attained: billions of tons per teaspoon. Since $r^*$ is on the order of $8M$, relativistic effects are crucial. Pulsars are believed to be rapidly rotating neutron stars.
(3) For masses more than half again that of the sun such relatively temperate endstates are ruled out, and if the star cannot somehow manage to eject sufficient mass, no known physical mechanism can prevent its radius from shrinking below $2M$. A catastrophic collapse ensues, and it is predicted that in a fraction of a second an endstate is reached that can only be said to have radius 0 and density $\infty$: a black hole has formed.

That black holes actually exist in our universe is widely accepted. The simplest mathematical model will join the Schwarzschild black hole $B$ with its exterior $N$. This raises the question: How bad is the singularity $r = 2M$ that separates them?

Let us drop a pebble into a black hole. If it starts from rest, say at $r = 8M$, then the pebble falls radially and by Lemma 12 its Schwarzschild radius $r$ and proper time $\tau$ are given by the cycloidal parametrizations

$$\tau = 8M(\eta + \sin\eta), \qquad r = 4M(1 + \cos\eta).$$

Thus $r \to 2M$ as $\eta \to 2\pi/3$, and the elapsed proper time is about $23.7M$. To see what happens to the pebble as $r \to 2M$, we consider the tidal forces on it.

21. Remarks. Radial Schwarzschild Tidal Forces. Let $u$ be a radial instantaneous observer, for example a 4-velocity vector of the pebble above. Thus $u$ is a timelike unit vector of the form $a\,\partial_t + b\,\partial_r$. In the restspace $u^\perp$, $u$ regards the vector $x = (b/\mathfrak{h})\,\partial_t + a\mathfrak{h}\,\partial_r$ as pointing directly toward (or away from) the star. Vectors $w \in u^\perp$ that are orthogonal to $x$ are simply vectors tangent to the fiber $S^2(r)$; $u$ regards them as transverse. The tidal forces measured by $u$ are readily computed from Proposition 5:

$$F_u(x) = (2M/r^3)\,x, \qquad F_u(w) = -(M/r^3)\,w \text{ for all } w.$$

For example,

$$F_u(w) = R_{wu}u = a^2R_{w\partial_t}\partial_t + b^2R_{w\partial_r}\partial_r = (M/r^3)(-a^2\mathfrak{h} + b^2/\mathfrak{h})\,w = -(M/r^3)\,w.$$

Thus in $u$'s radial space direction there is tension of twice the magnitude of the transverse compression. In terms of acceleration, this is in agreement with the situation in Figure 12.1, where $\pm x$ would point from $o$ to $b$ and $d$, while suitable $\pm y$ point to $a$ and $c$. (See also Exercise 3.)

Since the tidal forces on the falling pebble are of order $M/r^3$, there is no reason to expect any profound physical change at $r = 2M$. So let the pebble continue on into $B$. It reaches the central singularity $r = 0$ at $\eta = \pi$, taking proper time $8\pi M$ for the entire fall. This is not very long: if $M$ is $k$ solar masses, about $k$ ten-thousandths of a second.

Schwarzschild observers tell a different story. The time dilation $dt/d\tau = E/\mathfrak{h}$ approaches infinity as $r \to 2M$, and in fact for Schwarzschild observers the pebble will never reach $r = 2M$, though it eventually drifts in arbitrarily close (Exercise 4). (By Example 5.41, the same is true for a beam of light directed radially inward.)

Thus the pebble seems to experience no difficulty in falling through $r = 2M$, but Schwarzschild observers cannot describe the trip. This suggests that the trouble at $r = 2M$ is due to the Schwarzschild time function $t$ and not to the spacetimes $N$ and $B$.

THE KRUSKAL PLANE

In the remainder of this chapter we define and study Kruskal spacetime $K$, half of which joins $N$ and $B$ to give a (connected) spacetime. For some history of its discovery see Box 31.1 of [MTW]. Like $N$ and $B$, $K$ will be a warped product with fiber a 2-sphere; thus the problem is reduced to joining the Schwarzschild half-plane $P_I$ and strip $P_{II}$. Fix the notation

$f(r) = (r - 2M)\,e^{(r/2M) - 1}$ for $r \in \mathbf{R}^+$,

with $M$ a positive constant. Since $f' > 0$ on $\mathbf{R}^+$, $f$ is a diffeomorphism onto the half-line $(-2M/e, \infty)$. Now let $Q$ be the region in the $uv$-plane given by $uv > -2M/e$. (See Figure 10.) If $f^{-1}$ is the inverse diffeomorphism of $f$, then $r = f^{-1}(uv)$ defines a smooth positive function on $Q$ that is characterized implicitly by the equation $f(r) = uv$. Thus the level curves $r$ constant in $Q$ are the hyperbolas $uv = \text{const}$, except that $r = 2M$ gives the coordinate axes. The function $r$ has limit value 0 on the boundary hyperbola $uv = -2M/e$, which is not part of $Q$. Removing the coordinate axes from $Q$ leaves its four open quadrants, denoted as usual by $Q_I, \ldots, Q_{IV}$.
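Since $r$ is defined only implicitly by $f(r) = uv$, it may help to see one way of computing it. Writing $x = r/2M - 1$ turns the equation into $x e^x = uv/2M$, which has exactly one root $x > -1$ whenever $uv > -2M/e$. The sketch below locates that root by bisection; the function names are hypothetical, and very large values of $uv$ would overflow the exponential in this simple version.

```python
import math

def f(r, M=1.0):
    """f(r) = (r - 2M) e^{(r/2M) - 1}."""
    return (r - 2.0 * M) * math.exp(r / (2.0 * M) - 1.0)

def kruskal_radius(u, v, M=1.0):
    """Solve f(r) = u*v for r > 0 by bisection on x = r/2M - 1."""
    y = u * v / (2.0 * M)
    if y <= -1.0 / math.e:
        raise ValueError("(u, v) lies outside Q: need uv > -2M/e")
    # x * e^x is increasing on (-1, infinity), so the root is bracketed by:
    lo, hi = (-1.0, 0.0) if y <= 0.0 else (0.0, y)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) < y:
            lo = mid
        else:
            hi = mid
    return 2.0 * M * (1.0 + 0.5 * (lo + hi))
```

For instance, any point on a coordinate axis returns $r = 2M$, while points with $uv < 0$ return $r < 2M$.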

Figure 10. The Kruskal plane $Q$.

22. Definition. With notation as above, the region $Q$ in the $uv$-plane, furnished with line element

$ds^2 = 2F(r)\,du\,dv$, where $F(r) = (8M^2/r)\,e^{1 - (r/2M)}$,

is called the Kruskal plane of mass $M$.

Since the natural coordinates $u$, $v$ are a null coordinate system on $Q$ (Definition 5.42), the null geodesics of $Q$ are parametrizations of the coordinate lines $u$ constant and $v$ constant. The metric tensor of $Q$ is

$F(r)(du \otimes dv + dv \otimes du).$

Thus $\langle \partial_u, \partial_v \rangle = F(r) > 0$. The mapping $(u, v) \to (-u, -v)$ of $Q$ preserves $uv$, hence $r$, hence $F(r)$, hence the line element, and is thus an isometry of $Q$ that reverses the quadrants $Q_I$ and $Q_{III}$, and also reverses $Q_{II}$ and $Q_{IV}$. Using Exercise 5.8 we find that $Q$ has sectional curvature $2M/r^3$.

On the open quadrants $uv \ne 0$ of $Q$, define $t = 2M \ln|v/u|$. The level curves $t$ constant are thus rays from the origin in $Q$.

23. Lemma. On the Kruskal plane,
(1) $Ff = 8M^2 h$, $Ff' = 4M$, and $f/f' = 2Mh$, where as usual $h(r) = 1 - (2M/r)$.
(2) $dt = 2M((dv/v) - (du/u))$, $dr = 2Mh((du/u) + (dv/v))$ for $uv \ne 0$.
(3) $\operatorname{grad} r = (1/4M)(u\,\partial_u + v\,\partial_v)$.


Proof: (1) The first identity is immediate. The second follows from

$f'(r) = (r/2M)\,e^{(r/2M) - 1}.$

The third is a consequence of the first two.
(2) $dt = 2M(u/v)\,d(v/u) = 2M(dv/v - du/u)$. Differentiating $f(r) = uv$ gives $f'(r)\,dr = v\,du + u\,dv$. Since $2Mh = f(r)/f'(r) = uv/f'(r)$, the formula for $dr$ follows.
(3) The vector fields metrically equivalent to $du$ and $dv$ are $\partial_v/F$ and $\partial_u/F$, respectively. Hence by (2),

$\operatorname{grad} r = (2Mh/Ff)(u\,\partial_u + v\,\partial_v) = (1/4M)(u\,\partial_u + v\,\partial_v).$
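The three identities of Lemma 23(1) are easy to confirm numerically. The sketch below, with illustrative function names, evaluates the residuals of the identities at a given radius and mass; they should vanish up to rounding.

```python
import math

def h(r, M=1.0):
    """h(r) = 1 - 2M/r."""
    return 1.0 - 2.0 * M / r

def f(r, M=1.0):
    """f(r) = (r - 2M) e^{(r/2M) - 1}."""
    return (r - 2.0 * M) * math.exp(r / (2.0 * M) - 1.0)

def f_prime(r, M=1.0):
    """f'(r) = (r/2M) e^{(r/2M) - 1}."""
    return (r / (2.0 * M)) * math.exp(r / (2.0 * M) - 1.0)

def F(r, M=1.0):
    """F(r) = (8M^2/r) e^{1 - (r/2M)}."""
    return (8.0 * M * M / r) * math.exp(1.0 - r / (2.0 * M))

def lemma23_residuals(r, M=1.0):
    """Differences between the two sides of Ff = 8M^2 h, Ff' = 4M, f/f' = 2Mh."""
    return (
        F(r, M) * f(r, M) - 8.0 * M * M * h(r, M),
        F(r, M) * f_prime(r, M) - 4.0 * M,
        f(r, M) / f_prime(r, M) - 2.0 * M * h(r, M),
    )
```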

We can now show that $Q_I$ and hence $Q_{III}$ are isometric to the Schwarzschild half-plane $P_I$, while $Q_{II}$ and $Q_{IV}$ are isometric to the Schwarzschild strip $P_{II}$.

24. Proposition. The function $\psi\colon Q_I \cup Q_{II} \to P_I \cup P_{II}$ sending $(u, v)$ to $(t(u, v), r(u, v))$ is an isometry that preserves quadrants and the functions $t$ and $r$.

Proof: On $P_I \cup P_{II}$, $t$ and $r$ are just the natural coordinate functions, and we write them temporarily as $\bar t$ and $\bar r$. Then by the definition of $\psi$, $\bar t \circ \psi = t$ and $\bar r \circ \psi = r$. Hence $\psi^*(d\bar t) = d(\bar t \circ \psi) = dt$ and similarly $\psi^*(d\bar r) = dr$. The line element of $P_I \cup P_{II}$ is $-h\,d\bar t^2 + h^{-1}\,d\bar r^2$. Since $\psi$ preserves $r$ it also preserves $h$, so $\psi^*$ applied to this line element is $-h\,dt^2 + h^{-1}\,dr^2$. Substituting the formulas for $dt$ and $dr$ gives

$16M^2 h\,(du\,dv/uv) = (16M^2 h/f(r))\,du\,dv = 2F(r)\,du\,dv,$

which is the line element of $Q$. It remains only to check that $\psi$ is a diffeomorphism on each quadrant. But we can solve the coordinate formula for $\psi$ to show, for example, that $\psi\colon Q_I \to P_I$ has inverse function given by

$u = f(r)^{1/2}\,e^{-t/4M}$, $\quad v = f(r)^{1/2}\,e^{t/4M}.$

The essential problem in joining $N$ and $B$ is now solved, since $Q_I \approx P_I$ and $Q_{II} \approx P_{II}$ fit together naturally in $Q$ along the positive $v$ axis. By definition, the mapping $\psi$ preserves level curves of $t$ and of $r$, hence $\psi$ can be visualized in Figure 11 as (1) exploding the origin $(0, 0)$ of $Q$ into the whole vertical line $r = 2M$ and thus lifting the pairs of radial lines $t$ constant into horizontal lines $t$ constant in $P_I \cup P_{II}$, and (2) sending the $u$ axis to a point at $-\infty$ so that the hyperbolas $r$ constant are carried to the vertical lines $r$ constant in $P_I \cup P_{II}$.
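Proposition 24 can be illustrated on the exterior quadrant. The sketch below assumes the inverse of $\psi$ on $P_I$ obtained by solving $t = 2M\ln(v/u)$ and $uv = f(r)$ for $u, v > 0$, namely $u = f(r)^{1/2}e^{-t/4M}$, $v = f(r)^{1/2}e^{t/4M}$. It then evaluates the Schwarzschild and Kruskal line elements on the same tangent vector, using the formulas of Lemma 23(2). All function names are illustrative.

```python
import math

def h(r, M=1.0):
    return 1.0 - 2.0 * M / r

def f(r, M=1.0):
    return (r - 2.0 * M) * math.exp(r / (2.0 * M) - 1.0)

def F(r, M=1.0):
    return (8.0 * M * M / r) * math.exp(1.0 - r / (2.0 * M))

def schwarzschild_to_kruskal(t, r, M=1.0):
    """Point (u, v) of Q_I with uv = f(r) and 2M ln(v/u) = t; requires r > 2M."""
    root = math.sqrt(f(r, M))
    return root * math.exp(-t / (4.0 * M)), root * math.exp(t / (4.0 * M))

def line_elements(t, r, du, dv, M=1.0):
    """Both line elements evaluated on the tangent vector du d_u + dv d_v."""
    u, v = schwarzschild_to_kruskal(t, r, M)
    dt = 2.0 * M * (dv / v - du / u)
    dr = 2.0 * M * h(r, M) * (du / u + dv / v)
    schwarzschild = -h(r, M) * dt**2 + dr**2 / h(r, M)
    kruskal = 2.0 * F(r, M) * du * dv
    return schwarzschild, kruskal
```

The two returned values agree up to rounding for any $r > 2M$ and any components `du`, `dv`.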

Figure 11. Mapping Kruskal to Schwarzschild.

KRUSKAL SPACETIME

25. Definition. Let $Q$ be a Kruskal plane of mass $M$, and let $S^2$ be the unit 2-sphere. The Kruskal spacetime of mass $M$ is the warped product $K = Q \times_r S^2$, where $r$ is the function on $Q$ characterized by $f(r) = uv$.

Explicitly, $K$ is the smooth manifold $Q \times S^2$ furnished with line element $2F(r)\,du\,dv + r^2\,d\sigma^2$. The time-orientation of $K$ is specified below. Let $\pi$ and $\sigma$ be the projections of $K$ onto $Q$ and $S^2$. As always, each leaf $\sigma^{-1}(q)$ is totally geodesic and isometric to $Q$, and each fiber $\pi^{-1}(u, v)$ is a 2-sphere $S^2(r(u, v))$ of radius $r(u, v)$ that is totally umbilic in $K$. We continue to denote by $u$, $v$, $r$ the lifts (prefix by $\pi$) of the natural coordinate functions $u$, $v$ and the radius function $r$ on $Q$.

For each $\mathcal{N}$ = I, II, III, IV, let $K_{\mathcal{N}}$ be the open submanifold $\pi^{-1}(Q_{\mathcal{N}})$ over the quadrant $Q_{\mathcal{N}}$ of $Q$. Deleting these quadrants from $K$ leaves the horizon $H$, consisting of all points over the coordinate axes of $Q$. Removing the central sphere $\pi^{-1}(0, 0)$ from $H$ leaves four hypersurfaces, each diffeomorphic to $\mathbf{R}^+ \times S^2$.


The time function $t = 2M \ln|v/u|$ is defined only on the open quadrants of $Q$, hence its lift into $K$, given by the same formula, is defined only on $K - H$.

It is easy to show that the quadrants $K_I$ and $K_{III}$ are isometric to $N$ and the quadrants $K_{II}$ and $K_{IV}$ to $B$. For example, let $\psi\colon Q_I \to P_I$ be the isometry given by Proposition 24. Since $\psi$ preserves $r$, the mapping $\psi \times \mathrm{id}$ is an isometry from $K_I = Q_I \times_r S^2$ onto $N = P_I \times_r S^2$. Similarly, since the isometry $(u, v) \to (-u, -v)$ of $Q$ preserves $r$, it induces an isometry $\phi(u, v, p) = (-u, -v, p)$ of $K$ called its central symmetry. Evidently $\phi$ reverses $K_I$ and $K_{III}$, and also $K_{II}$ and $K_{IV}$. We record these results as

$K_{III} \approx K_I \approx N$, $\quad K_{IV} \approx K_{II} \approx B.$

Hence Kruskal spacetime also is Ricci flat but not flat. The isometries above obviously preserve the functions $t$ and $r$; thus on any quadrant of $K$ these functions are geometrically the same as the usual Schwarzschild time and radius functions on $N$ or $B$. We emphasize that, unlike $t$, the function $r$ is well defined on all of $K$.

26. Remark. Coordinate Systems on $K$. (1) The natural coordinate functions $u$, $v$ on the Kruskal plane and spherical coordinates on $S^2$ give a product coordinate system $u$, $v$, $\vartheta$, $\varphi$, that, as in the Schwarzschild case, is effective on all of $K$ except over the poles of $S^2$. The line element for these Kruskal-spherical coordinates is

$2F(r)\,du\,dv + r^2(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2).$

(2) As noted above Schwarzschild-spherical coordinates $t$, $r$, $\vartheta$, $\varphi$ are valid on $K - H$ (but for poles), with formally the same geometric properties as on $N \cup B$.

Covariant derivatives and curvature on $K$ can be expressed in Kruskal terms using Exercise 5.8 and warped product generalities as in the corresponding Schwarzschild case.

On $K_I \approx N$ and $K_{II} \approx B$ the coordinate vector field $\partial_t$ is by construction a Killing vector field. Though the function $t$ fails on the horizon, nevertheless $\partial_t$ can be uniquely extended as a Killing field over all of $K$.

27. Lemma. The vector field $X = (v\,\partial_v - u\,\partial_u)/4M$ on $K$ is a Killing vector field that equals $\partial_t$ on $K - H$.

Proof. $X$ is the lift of the vector field on $Q$ given by the same formula. Thus using Schwarzschild-spherical coordinates we have $X = Xt\,\partial_t + Xr\,\partial_r$ on $K - H$. Since $d(t \circ \pi) = \pi^*(dt)$, the formula for $dt$ remains valid when lifted to $K - H$. Thus

$Xt = dt(X) = \tfrac{1}{2}\,((dv/v) - (du/u))(v\,\partial_v - u\,\partial_u) = 1.$

Similarly $Xr = 0$. Thus $X = \partial_t$ on $K - H$.


$X$ is Killing on $K_I \cup K_{II} \approx N \cup B$ since $\partial_t$ is. The central symmetry $\phi(u, v, p) = (-u, -v, p)$ reverses the signs of $u$, $v$, $\partial_u$, and $\partial_v$; hence $d\phi\,X = X$. Since $\phi$ is an isometry, $X$ is Killing on $K_{III} \cup K_{IV}$ and hence by continuity on all of $K$.

The proof shows that $X$ is tangent to radial planes and on each is tangent to the hyperbolas $r$ constant.
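The behavior of $X$ can be illustrated through its flow on the Kruskal plane. Integrating $u' = -u/4M$, $v' = v/4M$ gives the maps $(u, v) \to (u e^{-s/4M}, v e^{s/4M})$ (compare Exercise 14(b)). These preserve $uv$, hence $r$ and $F(r)$, and they rescale $du$ and $dv$ by reciprocal factors, so $2F(r)\,du\,dv$ is preserved. The Python sketch below, with illustrative names, checks that the flow also advances $t$ by exactly $s$ off the axes, as expected of $\partial_t$.

```python
import math

def flow(s, u, v, M=1.0):
    """Flow of X = (v d_v - u d_u)/4M through parameter time s."""
    k = s / (4.0 * M)
    return u * math.exp(-k), v * math.exp(k)

def schwarzschild_time(u, v, M=1.0):
    """t = 2M ln|v/u|, defined only off the coordinate axes."""
    return 2.0 * M * math.log(abs(v / u))

# A sample point of Q_II (u < 0 < v) moved along the flow for s = 2.5.
u0, v0, s = -0.8, 1.3, 2.5
u1, v1 = flow(s, u0, v0)
product_change = u1 * v1 - u0 * v0      # r constant along the flow
time_advance = schwarzschild_time(u1, v1) - schwarzschild_time(u0, v0)
```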

28. Corollary. Every Killing vector field $Y$ on $K$ has the form $cX + V$, for $X$ as above and $V$ a Killing field lifted from $S^2$.

Proof: Since $X = \partial_t$ on $K_I \approx N$, it follows from Proposition 7 that $Y$ can be expressed in the given form $cX + V$ on $K_I$. The remark following Lemma 9.27 implies that $Y = cX + V$ everywhere on $K$.

Kruskal spacetime is time-orientable, since, for example, $\partial_v - \partial_u$ is a timelike vector field. In order that the natural isometry $\psi\colon K_I \to N$ preserve the physical meaning of these two spacetimes, it must preserve time-orientation. Thus we time-orient $K$ by requiring that on $K_I$ as on $N$ the vector field $\partial_t$ be future-pointing.

29. Lemma. On Kruskal spacetime the null vector fields $-\partial_u$ and $\partial_v$ are future-pointing. On $K_{II} \approx B$, $\operatorname{grad} r$ is timelike future-pointing.

Proof: On $K_I$, $\partial_t$ is timelike future-pointing, and by Lemma 27, $\partial_t = (v\,\partial_v - u\,\partial_u)/4M$. Both functions $-\langle \partial_u, \partial_t \rangle = -vF(r)/4M$ and $\langle \partial_v, \partial_t \rangle = -uF(r)/4M$ are negative on $K_I$; hence $-\partial_u$ and $\partial_v$ are future-pointing on $K_I$. Since these are null vector fields, they remain future-pointing on all of $K$ (see Figure 12).

Figure 12. Kruskal future-cones.

By Lemma 7.34 the formula for $\operatorname{grad} r$ in Lemma 23 is valid on $K$. It shows that on $K_{II}$, $\operatorname{grad} r$ is a linear combination of $-\partial_u$ and $\partial_v$ with positive coefficients $-u/4M$ and $v/4M$; hence $\operatorname{grad} r$ is timelike and future-pointing on $K_{II}$.

BLACK HOLES

The region $v > 0$ in Kruskal spacetime $K$ is called a truncated Kruskal spacetime. If $Q'$ is the region $v > 0$ in $Q$, then

$K' = \pi^{-1}(Q') = Q' \times_r S^2.$

Thus $K'$ joins the Schwarzschild exterior $K_I \approx N$ ($r > 2M$) and the black hole $K_{II} \approx B$ ($r < 2M$) along the horizon $H' = H \cap K'$ ($r = 2M$). It is the sought-for spacetime containing a single Schwarzschild black hole and exterior.

When a particle $\alpha$ from $K_I$ reaches the horizon $H$, it actually reaches $H'$. To see this, it suffices to show that the $v$ coordinate of $\alpha$ is nondecreasing. Write $\alpha'$ as $(du/ds)\,\partial_u + (dv/ds)\,\partial_v + \tan \alpha'$. Since $-\partial_u$ is future-pointing, $\langle \alpha', -\partial_u \rangle \le 0$; hence $dv/ds \ge 0$.

30. Proposition. No particle, whether material or lightlike, can escape from the black hole $K_{II} \approx B$. Furthermore any particle in $K_{II}$ moves inward, ending (on a finite parameter interval) at the central singularity $r = 0$, if not before.

Proof. Let $\alpha$ be a particle such that $\alpha(0) \in K_{II}$. Since $r = 2M$ on the boundary of $K_{II}$ in $K'$ (or in $K$), if $dr/ds \le 0$, then $\alpha$ remains in $K_{II}$. By Lemma 29, $\operatorname{grad} r$ is timelike and future-pointing on $K_{II}$. Since $\alpha'$ is nonspacelike and future-pointing,

$dr/ds = \langle \alpha', \operatorname{grad} r \rangle < 0.$

A material particle can always be ended by overacceleration, and hence need not reach $r = 0$. What we must show is that $\alpha$ cannot be extended to $[0, \infty)$. For this it suffices to prove that $dr/ds$, which is negative, is bounded away from zero.

If $\alpha$ is lightlike, then differentiating the energy equation in Proposition 15 shows that $d^2r/ds^2 < 0$ for $r < 3M$; hence $(dr/ds)(s) \le (dr/ds)(0) < 0$ for all $s \ge 0$.

If $\alpha$ is a material particle, then for initially equatorial coordinates,

$-1 = \langle \alpha', \alpha' \rangle = -h\,(dt/d\tau)^2 + ((dr/d\tau)^2/h) + r^2 (d\varphi/d\tau)^2.$

Since $h < 0$ on $K_{II}$, this implies $-1 \ge (dr/d\tau)^2/h$, and thus $-h \le (dr/d\tau)^2$. But $-h$ increases as $r$ decreases, so again $(dr/d\tau)(\tau) \le (dr/d\tau)(0) < 0$.
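The inequality $-h \le (dr/d\tau)^2$ also bounds how long a material particle can survive: since $r$ decreases, the proper time from radius $r_0 \le 2M$ to $r = 0$ is at most $\int_0^{r_0} (2M/r - 1)^{-1/2}\,dr$. The sketch below evaluates this integral with the substitution $r = 2M\sin^2\theta$, which turns the integrand into $4M\sin^2\theta$ and removes the endpoint singularity at $r = 2M$. The function name and step count are arbitrary choices. For $r_0 = 2M$ the bound is $\pi M$, in agreement with Exercise 7(c).

```python
import math

def max_proper_time(r0, M=1.0, steps=20000):
    """Upper bound on proper time from r0 <= 2M down to r = 0 (midpoint rule)."""
    if not 0.0 < r0 <= 2.0 * M:
        raise ValueError("need 0 < r0 <= 2M")
    theta0 = math.asin(math.sqrt(r0 / (2.0 * M)))
    d = theta0 / steps
    # Integral of 4M sin^2(theta) over [0, theta0].
    total = sum(math.sin((i + 0.5) * d) ** 2 for i in range(steps))
    return 4.0 * M * total * d
```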

Figure 13. Observers: $\beta$ at rest, $\beta_1$ outgoing, $\beta_2$ ingoing. Dashed lines indicate photons: $\alpha$ is at rest; $\beta_1$'s message is received by $\beta_2$, but there is no answer.

This result is easiest to visualize for particles in a radial plane $\sigma^{-1}(p)$, identified as usual with $Q'$ (Figure 13). Future timecones are marked off by the future-pointing null vectors $-\partial_u$ and $\partial_v$. In the exterior $Q_I$ a material particle $\beta$ can hover at rest ($r$ constant) by supplying the radial acceleration specified by Lemma 9. With larger acceleration, $\beta_1$ becomes outgoing and cuts through the $r$ constant curves to move away from the black hole. But if the acceleration drops, $\beta_2$ becomes ingoing and descends into the black hole quadrant $Q_{II}$. There, as the figure indicates, no acceleration can save it.

The extreme cases of radial motion are provided by lightlike particles, say photons. As null geodesics these parametrize the coordinate lines $u$ constant and $v$ constant, in the future-pointing directions $\partial_v$ and $-\partial_u$, respectively. Evidently the $-\partial_u$ photons are always ingoing. Outside the black hole the $\partial_v$ photons are outgoing, but inside (in $Q_{II}$) they too are ingoing. Thus observers in the black hole can receive messages or even material particles from outside, and in favorable cases can exchange both with each other, but can send nothing outside.

In the exceptional case of a $\partial_v$ photon parametrizing the positive $v$ axis, $r = 2M$; though it is racing "outward" at the speed of light the pull of the black hole holds it hovering at rest. Since the $v$ axis is the intersection of the radial plane with the horizon $r = 2M$ of $K'$, the horizon $H'$ is the union of the worldlines of all such rest photons.

31. Corollary. Except for rest photons, every particle in $K'$ that meets the horizon continues into the black hole.

Proof: Suppose that $\alpha(0)$ is in the horizon, given by $u = 0$ in the region $K'$. There by Lemma 23(3), $\operatorname{grad} r = v\,\partial_v/4M$, which is null and future-pointing. Since $\alpha'$ is nonspacelike and future-pointing, it follows from Exercise 5.3(a) that

$(dr/ds)(0) = \langle \alpha'(0), \operatorname{grad} r \rangle \le 0,$


with equality holding if and only if $\alpha'(0)$ and $\operatorname{grad} r$ are collinear null vectors. Thus $\alpha$ enters $K_{II}$ except in the latter case, in which $\alpha$ is lightlike. But then $\alpha$ is a geodesic, and since $\alpha'(0)$ and $\partial_v$ are collinear, $\alpha$ parametrizes the $v$ axis of $\sigma^{-1}(p) \approx Q'$ and is thus a rest photon.

The attitudes of the future-cones on $K'$ can now be visualized as follows. In the limit as $r \to \infty$ they are Minkowskian. As $r$ decreases, they tip inward making escape to infinity more difficult, demonstrating the gravitational attraction of the black hole. At $r = 2M$, the inward tilting future-cones are tangent to the horizon along the worldline of a rest photon. For $r < 2M$, the entire future is inward, toward the central singularity.

The gravitational attraction of a black hole is no stronger than that of a normal star of the same mass, but because a black hole has radius $r^* = 0$ it offers no protection from the spacetime region $r < 2M$ where gravity dominates causality. Every material object entering this region is destroyed. In fact the tidal force computations in Remark 21 are valid in $B$ just as in $N$; since these forces are of order $M/r^3$ they become infinite as $r \to 0$. A steel ball falling radially is stretched in the radial direction and compressed transversally until it shatters to the shape of a radial line. Acceleration only shortens life (see Exercise 7), while nonradial motion produces even larger tidal forces (Exercise 13).

Now we consider what a black hole looks like.

32. Corollary. For a Schwarzschild observer at radius $r$, the angular radius $\eta^*$ of the black hole $K_{II} \approx B$ is given by

$\sin \eta^* = 3\sqrt{3}\,M\,h(r)^{1/2}/r,$

where $\eta^*$ is in the first quadrant if $r \ge 3M$, second quadrant if $2M < r \le 3M$.

Proof: As in Corollary 18, consider which photons $\gamma$ starting at radius $r$ fail to escape to infinity. If $r > 3M$, then Figure 7 shows that $\gamma$ fails to escape if and only if it is ingoing with $b \le 3\sqrt{3}\,M$. If $b^* = 3\sqrt{3}\,M$, the […] approaches the horizon and by Lemma 17, $r \sin$ […] are ingoing, hence $\eta^* = \angle(-\partial_r, \gamma'(0)) \le \pi/2$. If $r \le 3M$, no ingoing photon escapes, and even outgoing ones with $b > 3\sqrt{3}\,M$ are drawn back to the black hole. Thus $\sin \eta^*$ is the same as before but $\eta^* \ge \pi/2$.

For example, at radius $r = 10M$ the black hole appears against the stars of the background sky as a black disk of angular radius 28°. For Schwarzschild observers closer to it, the black disk is larger, and at $r = 3M$ it fills half the sky. Closer still, the visible sky is concentrated in an ever smaller disk overhead that disappears at $r = 2M$.
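The numerical examples following Corollary 32 are easy to reproduce. The sketch below evaluates $\sin\eta^* = 3\sqrt{3}\,M\,h(r)^{1/2}/r$ and selects the quadrant as the corollary prescribes. The function name is illustrative, and the clamp to 1 only guards against rounding at $r = 3M$.

```python
import math

def angular_radius_deg(r, M=1.0):
    """Angular radius (degrees) of the black hole seen from Schwarzschild radius r."""
    if r <= 2.0 * M:
        raise ValueError("Schwarzschild observers exist only for r > 2M")
    s = 3.0 * math.sqrt(3.0) * M * math.sqrt(1.0 - 2.0 * M / r) / r
    eta = math.asin(min(1.0, s))
    if r < 3.0 * M:
        eta = math.pi - eta      # second quadrant inside r = 3M
    return math.degrees(eta)
```

For $M = 1$, `angular_radius_deg(10.0)` is close to 28 degrees and `angular_radius_deg(3.0)` is 90 degrees, so the disk fills half the sky there.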


Let $K''$ be the other half of Kruskal spacetime, namely the region over the lower half $v < 0$ of the Kruskal plane. The central symmetry $\phi$ provides an isometry from $K'$ to $K''$ that carries $\partial_v$ to $-\partial_v$, hence reverses time-orientation. Thus $K''$ is $K'$ with past and future reversed. In particular, $K_{IV} \approx B$ is not a black hole but a white hole: every particle in $K_{IV}$ must leave it; no particle in $K_{III}$ can enter it. The horizon $H'' = H \cap K''$ consists of the worldlines of photons racing "inward" but only managing to hover at rest at $r = 2M$.

KRUSKAL GEODESICS

The Schwarzschild geodesic equations can be extended to Kruskal spacetime as follows.

33. Proposition. Let $\gamma$ be a geodesic in $K$ with $\langle \gamma', \gamma' \rangle = \varepsilon = -1$, 0, or 1. Let $r$ and $t$ be the usual Schwarzschild functions and let $\vartheta$, $\varphi$ be initially equatorial spherical coordinates. Then
(G1) $h(r)\,t' = E$ on $K - H$,
(G2) $r^2 \varphi' = L$,
(G3) $\vartheta = \pi/2$.
Furthermore, $E^2 = r'^2 + V(r)$, where $V(r) = (L^2/r^2 - \varepsilon)\,h(r)$.

Proof: (G3) is obvious since, as in any warped product, the projection of $\gamma$ into $S^2$ is pregeodesic. By the conservation lemma (9.26), if $X$ is the Killing extension of $\partial_t$, then $-\langle \gamma', X \rangle$ is a constant, equal to $-\langle \gamma', \partial_t \rangle = h\,(dt/ds) = E$ on $K - H$. For (G2), $\partial_\varphi$ is Killing on $S^2$, hence so is its lift to $K$. Since $\vartheta = \pi/2$, the constant $\langle \gamma', \partial_\varphi \rangle$ is $r^2 \varphi'$.

The energy equation holds as usual on $N$ and $B$, and since the central symmetry preserves $r$ and $t$, it holds on the four open quadrants of $K$. The hypersurfaces $u = 0$ and $v = 0$ of $K$ are totally geodesic (Exercise 9). Thus if $\gamma$ meets $H$, then either (a) $\gamma$ remains in $H$, where the energy equation is trivial since $r' = 0 = h$, or (b) $\gamma$ cuts transversally through $H$; hence by continuity the energy equation holds (with the same constant on both sides of $H$).

Conversely, if these equations hold for a curve $\gamma$, and $r'$ is rarely zero, then $\gamma$ is a geodesic (Exercise 5). The energy equation can be used to find formulas for geodesics, and in particular to determine their maximal domains.

34. Remark. Integral Formulas for Geodesics. Let $\gamma$ be a geodesic in $K_I \approx N$ or $K_{II} \approx B$. For definiteness, take $\gamma$ to be a freely falling material


particle with $r(\gamma(0)) = r_0$ and $(dr/d\tau)(0) \ne 0$. Motivated by the energy equation, define

$\tau(r) = \pm \int_{r_0}^{r} (E^2 - V(r))^{-1/2}\,dr,$

where the sign depends on whether $\gamma$ is initially outgoing or ingoing. If $E^2 > V(r)$ on some $r$-interval $J$, then the function $\tau(r)$ is a diffeomorphism from $J$ onto an interval $I$ that may well be larger than the original domain of $\gamma$. On $I$ the inverse function $r(\tau)$ has $dr/d\tau = \pm (E^2 - V(r))^{1/2}$, hence $r(\tau)$ satisfies the energy equation. For initially equatorial spherical coordinates, $\vartheta = \pi/2$, and we further define

$\varphi(\tau) = \varphi(0) + \int_0^{\tau} (L/r^2)\,d\tau$, $\quad t(\tau) = t(0) + \int_0^{\tau} (E/h(r))\,d\tau.$

Then equations (G1)-(G3) also hold on $I$. By Exercise 5(b), these functions $t$, $r$, $\vartheta$, $\varphi$ are the Schwarzschild coordinates of a geodesic, one that agrees at first with $\gamma$. Hence they provide a geodesic extension of $\gamma$ over the interval $I$.

35. Examples. (1) Exceptional orbits in $N$. Let $\gamma$ be a freely falling material particle in $N$ moving outward from $r_0 < r_1$ with angular momentum $L^2 > 12M^2$. Suppose that $E^2 = V(r_1)$, so the orbit of $\gamma$ is exceptional. Using Remark 34, define

$\tau(r) = \int_{r_0}^{r} (E^2 - V(r))^{-1/2}\,dr$ on $J = [r_0, r_1)$.

Since $V'(r_1) = 0$ and $V''(r_1) < 0$, we have $E^2 - V(r) = V(r_1) - V(r) \sim C(r_1 - r)^2 > 0$ for $r$ near $r_1$. It follows that $\tau(r) \to \infty$ as $r \to r_1$, hence the inverse function $r(\tau)$ is defined on $I = [0, \infty)$. This shows that $\gamma$ can be geodesically extended to $[0, \infty)$, and that $r \to r_1$ and $d\varphi/d\tau \to L/r_1^2$ as $\tau \to \infty$. Thus $\gamma$ spirals in ever closer to the circular orbit at $r = r_1$.

(2) Free fall in $B$. If $\gamma$ is a freely falling material particle in $B$, then $dr/d\tau < 0$, hence $E^2 - V(r) > 0$. Thus we can define

$\tau(r) = -\int_{r_0}^{r} (E^2 - V(r))^{-1/2}\,dr$ for $0 < r \le r_0$.

As $r \to 0$, $E^2 - V(r) \to \infty$, hence $\tau(r)$ approaches a finite limit $b > 0$. It follows as before that $\gamma$ can be geodesically extended to $[0, b)$ and that $r(\tau) \to 0$ as $\tau \to b$.
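For a material particle ($\varepsilon = -1$) the radius $r_1$ of Example 35(1) can be located explicitly. Differentiating $V(r) = (1 + L^2/r^2)(1 - 2M/r)$ shows that $V'(r) = 0$ exactly when $Mr^2 - L^2 r + 3ML^2 = 0$, which has two positive roots when $L^2 > 12M^2$; the smaller one is the local maximum $r_1$. The sketch below, with illustrative names, computes both roots.

```python
import math

def V(r, L, M=1.0):
    """Effective potential of a material particle: (1 + L^2/r^2)(1 - 2M/r)."""
    return (1.0 + L * L / (r * r)) * (1.0 - 2.0 * M / r)

def critical_radii(L, M=1.0):
    """Roots r1 < r2 of M r^2 - L^2 r + 3 M L^2 = 0; requires L^2 > 12 M^2."""
    disc = L * L - 12.0 * M * M
    if disc <= 0.0:
        raise ValueError("two distinct critical points need L^2 > 12 M^2")
    root = abs(L) * math.sqrt(disc)
    return (L * L - root) / (2.0 * M), (L * L + root) / (2.0 * M)
```

With $M = 1$ and $L = 4$ this gives $r_1 = 4$, $r_2 = 12$ and $V(r_1) = 1$, so an exceptional orbit with $E^2 = V(r_1)$ has $E = 1$ in that case. For large $L$ the value of $r_1$ approaches $3M$, as in Exercise 6.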

36. Proposition. An inextendible timelike geodesic $\gamma\colon I \to K$ in Kruskal spacetime is incomplete if and only if $r \circ \gamma(\tau) \to 0$ as $\tau$ approaches a finite endpoint of $I$.

Proof: We can suppose $I$ has the form $[0, B)$, $B \le \infty$. First we consider the possibilities for $\gamma(0)$. The central symmetry of $K$ and the isometries $(u, v, p) \to (v, u, p)$ and $(u, v, p) \to (-v, -u, p)$ preserve geodesics, but only the latter preserves time orientation. Considering the effect of these on the quadrants and horizon of $K$ leaves just three cases.

Case 1: $\gamma(0) \in K_I$, and $\gamma$ is future-pointing. If $\gamma$ has an exceptional orbit, then it is complete (that is, $B = \infty$). In fact, if its orbit is circular, $\gamma$ is periodic, hence complete; otherwise, by Example 35(1) or a variant, $\gamma$ spirals and is complete.

If $\gamma$ escapes to infinity (more precisely, if $r \circ \gamma$ is outgoing with no points of graph $V(r)$ ahead on its $E^2$ line), then the method of Remark 34 shows easily that $\gamma$ is complete and that $r(\gamma(\tau)) \to \infty$ as $\tau \to \infty$.

The crash case is more interesting. Suppose that $r \circ \gamma$ is ingoing with no points of graph $V(r)$ ahead on its $E^2$ line. The integral

$\tau(r) = \int_{r}^{r_0} (E^2 - V(r))^{-1/2}\,dr$

is well defined for all $r \le r_0$, but Schwarzschild coordinates fail at $r = 2M$, so the method of Remark 34 extends $\gamma$ only over $[0, \tau(2M))$. To reach $\tau(2M)$ we use Kruskal coordinates. That $\gamma$ is future-pointing implies $u' < 0$ and $v' > 0$, so $u$ is decreasing and $v$ is increasing. As $\tau \to \tau(2M)$, $r \to 2M$, hence $u \to 0$, but $v \ge v(0) > 0$. The energy equation shows that $dr/d\tau \to -E < 0$, and then manipulation of the energy equation shows that $(E + r')/h$ approaches a positive limit as $\tau \to \tau(2M)$. Hence, by Exercise 16(b), $v'/v$ approaches a positive limit. Thus $v'/v \le A$ for some $A > 0$. But then $v(\tau) \le v(0)e^{A\tau}$, so $v$ approaches a positive limit as $\tau \to \tau(2M)$. As in Remark 34, it follows that $\gamma(\tau)$ approaches a limit as $\tau \to \tau(2M)$. Hence by Lemma 5.8, $\gamma$ can be geodesically extended past $\tau(2M)$. Since $dr/d\tau = -E < 0$ there, $\gamma$ passes through $H'$ into $K_{II}$, thus reducing the situation to Case 3.

Finally, we consider turning points. For definiteness, suppose that $r \circ \gamma$ is moving outward, approaching $\rho$ such that $V(\rho) = E^2$ and $V'(\rho) > 0$. Defining $\tau(r)$ as usual, we see that the limit value $\tau(\rho)$ is finite. Now extend $r(\tau)$ on $[0, \tau(\rho)]$ symmetrically to $[0, 2\tau(\rho)]$. There $r(\tau)$ has a continuous second derivative, so by Remark 34, $\gamma$ can also be extended over this interval. Thus $\gamma$ passes the turning point in finite time, and the situation is reduced to one of the subcases above. (Another turning point, of course, would give a bound orbit.)
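The "manipulation of the energy equation" used in the crash case can be made explicit, at least in one plausible form. From $E^2 - r'^2 = V(r) = (1 + L^2/r^2)\,h$ one gets $(E + r')/h = (1 + L^2/r^2)/(E - r')$, and since $r' \to -E$ as $r \to 2M$ this suggests the limit $(1 + L^2/4M^2)/(2E)$. That closed form is our own computation, not a formula quoted from the text. The sketch below compares it with a direct evaluation near the horizon.

```python
import math

def crash_ratio(r, E, L, M=1.0):
    """(E + r')/h for an ingoing material particle, with r' = -sqrt(E^2 - V(r))."""
    h = 1.0 - 2.0 * M / r
    V = (1.0 + L * L / (r * r)) * h
    r_prime = -math.sqrt(E * E - V)
    return (E + r_prime) / h

def crash_ratio_limit(E, L, M=1.0):
    """Candidate limit of (E + r')/h as r -> 2M (derived above, an assumption)."""
    return (1.0 + L * L / (4.0 * M * M)) / (2.0 * E)
```

For example, with $E = 1.2$, $L = 3$, $M = 1$ the value of `crash_ratio(2.000001, 1.2, 3.0)` is positive and close to `crash_ratio_limit(1.2, 3.0)`.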

Case 2: $\gamma(0) \in H'$, or $\gamma$ is future-pointing and $\gamma(0) \in \pi^{-1}(0, 0)$. It is clear from the warped product metric that $d\pi(\gamma'(0))$ is also timelike, hence not tangent to either coordinate axis in $Q$. It follows that for small $\tau > 0$, $\gamma(\tau)$ is in either $K_I$ or $K_{II}$. The latter gives Case 3 and the former is reducible to Case 1.

Case 3: $\gamma(0) \in K_{II}$. If $\gamma$ is future-pointing, Example 35(2) shows that it cannot be extended past some $b > 0$, and that $r(\gamma(\tau)) \to 0$ as $\tau \to b$. If $\gamma$ is past-pointing, a simpler variant of the crash subcase in Case 1 shows that $\gamma$ can be extended through the horizon $H$. Then the situation can be reduced (without circularity) to future-pointing geodesics in this case or Case 1.

37. Corollary. Kruskal spacetime $K$ is incomplete and inextendible.

Since tidal forces approach infinity as $r \circ \gamma \to 0$, inextendibility follows from Remark 5.45.

Exercises

1. Let $\gamma$ be a freely falling material particle in $N$ with $L = 0$. Prove: (a) For small $\tau$ and $r \gg M$: $r \approx r_0 + v_0\tau - g_0\tau^2/2$, where $v_0 = (dr/d\tau)(0)$ and $g_0 = M/r_0^2$. (b) The smallest value of $v_0$ for which $\gamma$ escapes to infinity is $(2M/r_0)^{1/2}$, in which case $r^{3/2} = r_0^{3/2} + 3(M/2)^{1/2}\,\tau$.

2. In $N$ consider a freely falling material particle in circular orbit of radius $r$. (a) If $\omega = d\varphi/dt$ is the (constant) angular velocity relative to Schwarzschild time, verify Kepler's formula $\omega^2 r^3 = M$. (b) For $\bar\omega = d\varphi/d\tau$, relative to proper time, prove $\bar\omega^2 r^2 (r - 3M) = M$.

3. Define the tidal force tensor of a Newtonian force field $Z \in \mathfrak{X}(\mathbf{R}^3)$ to be $DZ \in \mathfrak{T}^1_1(\mathbf{R}^3)$. For the gravitational force field due to a mass $M$ at $0 \in \mathbf{R}^3$ (as in Appendix C) compute the tidal force tensor and compare with Remark 21.

4. The Schwarzschild time for any freely falling particle in $N$ to reach $r = 2M$ is infinite. (Hint: Express this time as an integral.)

5. Show that: (a) In Proposition 11 the geodesic differential equation for $r$ can be written as $2r'' = -dV/dr$. (b) If the equations in Proposition 11 [Proposition 33] hold for some curve $\gamma$, and $r'$ is zero on no interval, then $\gamma$ is a geodesic.

6. Find expressions for the critical points $r_1 \le r_2$ of the potential function $V(r)$ of a material particle in $N$. Show that as $L \to \infty$: $r_1 \to 3M$ and $V(r_1) \to \infty$; $r_2 \to \infty$ and $V(r_2) \to 1$.

7. Let $U = -(-h)^{1/2}\,\partial_r$ on the black hole $B$. Prove: (a) $U$ is an observer field. (b) $U$ is geodesic and proper time synchronizable. (c) Every $U$-observer goes from (limit values) $r = 2M$ to $r = 0$ in proper time $\pi M$; all other material particles take less time.

8. Kruskal curvature. Prove: (a) On $N \cup B$, the curvature tensor is zero on any three choices from mutually orthogonal vectors $\partial_t$, $\partial_r$, $v$, $w$. (b) On $K$, the curvature invariant $R^{ijkl}R_{ijkl}$ is $48M^2/r^6$. (Hint: For orthogonal coordinates, $R^{ijkl}R_{ijkl} = \sum (R_{ijkl})^2/(g_{ii}\,g_{jj}\,g_{kk}\,g_{ll})$.)

9. (a) Prove the analogue of Proposition 4 with Kruskal coordinates $u$, $v$ replacing Schwarzschild coordinates $t$, $r$. Show that (b) in $K$, the hypersurfaces $u = 0$ and $v = 0$ are totally geodesic but not semi-Riemannian (Exercise 4.9), and (c) the central sphere $\pi^{-1}(0, 0)$ is Riemannian and totally geodesic.

10. If $\gamma$ is a freely falling particle (material or lightlike) in $N$, let $\gamma^-(s) = \psi(\gamma(-s))$, where $\psi(t, r, p) = (-t, r, p)$. Show that $\gamma^-$ is a freely falling particle of the same causal character, with $E^- = E$ and $L^- = -L$. Since $\gamma^-(s)$ and $\gamma(-s)$ have the same position in space, the particles traverse the same orbit in opposite directions.

11. Exceptional orbits. (a) Consider the freely falling material particles in $N$ with $E^2 = V(r_1)$. Interpreting spiral as in Example 35, show that there are three noncircular orbital types for such particles: crash/spiral, spiral/spiral, and spiral/escape. Find the conditions on $L/M$ and $r(0)$ that characterize each type. (b) Show that for lightlike particles in $N$ with impact parameter $b = 3\sqrt{3}\,M$ there are two noncircular orbital types: crash/spiral and spiral/escape.

12. Turn angles. If $\gamma$ is a freely falling particle in $N$ with $L \ne 0$, let $\Delta\varphi$ be the total (equatorial) angle of turn for $\tau \ge 0$. (By Exercise 10, corresponding results will hold for $\tau \le 0$.) Show that $\Delta\varphi$ is infinite if $\gamma$ is bound or spirals, finite if $\gamma$ escapes to infinity or crashes. (Hint: Express $d\varphi/dr$ in terms of $r$.)

13. An arbitrary instantaneous observer on $N$ can be written as $u = a\,\partial_t + b\,\partial_r + y$, where $y$ is tangent to fibers. (a) If $w \perp y$ is also tangent to fibers and hence is in $u^\perp$, compute the tidal force

$F_u(w) = -(M/r^3)(1 + 3\langle y, y \rangle)\,w.$

(b) Find the other eigenvectors and eigenvalues of $F_u\colon u^\perp \to u^\perp$. (Hint: trace $F_u = -\operatorname{Ric}(u, u)$.)

14. Isometries of $Q$. Prove: (a) The four mappings of $Q$ sending $(u, v)$ to $(u, v)$, $(-u, -v)$, $(v, u)$, and $(-v, -u)$, respectively, constitute a group $G$ of isometries. (b) The flow isometries $\mathcal{F} = \{\psi_s : s \in \mathbf{R}\}$ of the Killing field $X$ (Lemma 27) are given by $\psi_s(u, v) = (u e^{-s/4M}, v e^{s/4M})$. (c) $\phi(0, 0) = (0, 0)$ for all $\phi \in I(Q)$. (d) $I(Q) \approx O(1, 1) = O_1(2)$. (Hint: For (c), look at curvature; for (d), consider $d\phi$ at $(0, 0)$.)

15. $I(K) = I(Q) \times O(3)$. Prove: (a) If $\xi \in I(Q)$ and $\eta \in I(S^2) = O(3)$, then $\xi \times \eta \in I(K)$. (b) If $\psi \in I(K)$, there exist unique $\xi$, $\eta$ such that $\psi = \xi \times \eta$. (Hint: Find $\eta$ first by showing that $\psi$ preserves $r$.)

16. Let $\gamma$ be a geodesic in $K$. (a) If Kruskal coordinates are used in Proposition 33, then (G1) becomes $E = (F(r)/4M)(uv' - vu')$, valid on all of $K$, and the energy equation becomes $2F(r)\,u'v' + L^2/r^2 = \varepsilon$. (b) Deduce from Lemma 23 that $E + r' = 4Mh\,v'/v$ if $v \ne 0$, and $-E + r' = 4Mh\,u'/u$ if $u \ne 0$.

17. Null geodesics in $Q$. Prove: (a) If $\gamma(s) = (u(s), v(s))$ has both $v$ and $u'F(r)$ constant, then $\gamma$ is a geodesic. (See Exercise 5.8.) (b) The $u$ and $v$ axes can be parametrized as complete geodesics. (c) Given $(u_0, v_0) \in Q$, with $v_0 \ne 0$, there is a unique function $u(s)$ such that $r(u(s), v_0) = -s + r_0$. (d) Then the curve $\gamma(s) = (u(s), v_0)$, defined for $s < r_0$, is an inextendible geodesic.

18. Null geodesics in $K$. Prove Proposition 36 with timelike replaced by null. In particular, for an inextendible null geodesic $\gamma\colon [0, B) \to K$ show that as $s \to B$ there are five possibilities: $r \to 0$, $r = 2M$, $r = 3M$, $r \to 3M$, and $r \to \infty$. (Hint: Use the energy equation preceding Corollary 16 and the method of Remark 34.)

19. Spacelike geodesics in $K$. Prove Proposition 36 with timelike replaced by spacelike. (Hint: The case $L = 0$ is easy; for the many spacelike geodesics with $|L| = 2M = r$, use Proposition 7.38.)

14

CAUSALITY IN LORENTZ MANIFOLDS

By causality we refer to the general question of which points in a Lorentz manifold can be joined by causal curves: relativistically, which events can influence (be influenced by) a given event. In a particular manifold $M$, causality may be trivial, but under fairly mild conditions it is closely related to fundamental geometrical properties of $M$. For example the study of causality leads to sufficient conditions for points to be joinable by a (longest) causal geodesic and also for there to be a normal geodesic from a spacelike hypersurface to a point. In both cases a useful aid is a Lorentz analogue of Riemannian distance.

Essential parts of this chapter were developed by the physicists R. Penrose and S. W. Hawking in an effort to understand why the most fundamental relativistic models (of the universe or of individual stars) turn out to have "singularities." (We use this term in its broadest sense to mean timelike or null geodesic incompleteness: some freely falling observer or light ray is prematurely ended by a flaw in the manifold $M$.) In this context a singularity theorem is a general result of Lorentz geometry that asserts the existence of singularities. We prove two such theorems: one, due to Penrose, motivated by black hole singularities; the other, due to Hawking, motivated by cosmological (e.g., big bang) singularities. For more results of this type, see [HE], our basic reference.

Throughout the chapter, $M$ will denote a connected time-oriented Lorentz manifold of dimension $n$.


CAUSALITY RELATIONS

The causality relations on M are defined as follows. If p , q E M , then (1) p 6 4 means there is a future-pointing timelike curve in M from P to q ; (2) p < 4 means there is a future-pointing causal curve in M from p to q.

Evidently $p \ll q$ implies $p < q$. As usual, $p \le q$ means that either $p < q$ or $p = q$. For a subset $A$ of $M$, the subset

$$I^+(A) = \{q \in M : \text{there is a } p \in A \text{ with } p \ll q\}$$

is called the chronological future of $A$, and

$$J^+(A) = \{q \in M : \text{there is a } p \in A \text{ with } p \le q\}$$

is called the causal future of $A$. Thus $J^+(A) \supset A \cup I^+(A)$. For a single point, $I^+(p) = \{q : p \ll q\}$, and for a subset, $I^+(A) = \bigcup \{I^+(p) : p \in A\}$; similarly for $J^+$. Dual to the preceding definitions are corresponding past versions. Thus $I^-(A) = \{q \in M : \text{there is a } p \in A \text{ with } q \ll p\}$ is the chronological past of $A$. In general, past definitions and proofs follow from the future versions (and vice versa) merely by reversing time-orientation.

The standard for causality is Minkowski space $R^n_1$. There $I^+(p)$ is just the future timecone of $p$, that is, $\{q : \overrightarrow{pq} \text{ is timelike future-pointing}\}$, and $J^+(p)$ is $p$ together with $\{q : \overrightarrow{pq} \text{ is causal future-pointing}\}$. Thus $I^+(p)$ is an open set with closure $J^+(p)$, and the latter is the union of $I^+(p)$, the future nullcone at $p$, and $p$ itself. At the other extreme, the Lorentz cylinder $S^1_1 \times R^1$ has trivial causality: even for a single point, $I^+(p) = J^+(p)$ is the entire manifold.

The relations defined above are transitive; furthermore, if $x \ll z$ there are infinitely many $y$ such that $x \ll y \ll z$ (and similarly for $<$). Proposition 10.46 has this fundamental consequence:

1. Corollary. If $x \ll y$ and $y \le z$, or if $x \le y$ and $y \ll z$, then $x \ll z$.
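In Minkowski space the two relations reduce to sign tests on the displacement vector, so Corollary 1 can be checked on concrete points. The following is a minimal sketch under the assumption of coordinates $(t, x^1, \ldots)$ with signature $(-, +, \ldots, +)$; the function names are illustrative and do not come from the text.

```python
def lorentz_sq(v):
    """Scalar product <v, v> with the first coordinate as time (signs - + ... +)."""
    return -v[0] ** 2 + sum(c ** 2 for c in v[1:])


def displacement(p, q):
    """The vector from p to q in Minkowski space."""
    return tuple(b - a for a, b in zip(p, q))


def chron_precedes(p, q):
    """p << q: the displacement is timelike and future-pointing."""
    v = displacement(p, q)
    return v[0] > 0 and lorentz_sq(v) < 0


def causal_precedes(p, q):
    """p <= q: either p == q, or the displacement is causal and future-pointing."""
    v = displacement(p, q)
    return tuple(p) == tuple(q) or (v[0] > 0 and lorentz_sq(v) <= 0)


x = (0.0, 0.0)
y = (2.0, 1.0)  # x << y: timelike displacement
z = (3.0, 2.0)  # y <= z along a null direction, but not y << z

# Corollary 1 in this instance: x << y and y <= z give x << z.
print(chron_precedes(x, y), causal_precedes(y, z), chron_precedes(x, z))
```

The sketch applies only to Minkowski space, where the relations are determined by the displacement vector; in a general $M$ no such pointwise test is available.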

Such results are all summarized as

$$I^+(A) = I^+(I^+(A)) = I^+(J^+(A)) = J^+(I^+(A)) \subset J^+(J^+(A)) = J^+(A),$$

where $A$ is an arbitrary subset of $M$. An open subset $\mathcal{U}$ of $M$ is a time-oriented Lorentz manifold in its own right, and the intrinsic causality relations of $\mathcal{U}$ imply the corresponding ones in $M$. For example, let $I^+(A, \mathcal{U})$ denote the chronological future in the manifold $\mathcal{U}$ of the set $A \subset \mathcal{U}$. Then $I^+(A, \mathcal{U}) \subset I^+(A) \cap \mathcal{U}$. This remark is particularly useful in the case of a convex open set $\mathcal{C}$, since the intrinsic causality of $\mathcal{C}$ is as simple as that of Minkowski space.

2. Lemma. If $\mathcal{C}$ is a convex open set in $M$, then
(1) For $p \ne q$ in $\mathcal{C}$, $q \in J^+(p, \mathcal{C}) \iff \overrightarrow{pq}$ is future-pointing causal (analogously for $I^+$).
(2) $I^+(p, \mathcal{C})$ is open in $\mathcal{C}$ (hence in $M$).
(3) $J^+(p, \mathcal{C})$ is the closure in $\mathcal{C}$ of $I^+(p, \mathcal{C})$.
(4) The relation $\le$ is closed on $\mathcal{C}$; that is, if $\{p_n\} \to p$ and $\{q_n\} \to q$, with all points in $\mathcal{C}$, then $q_n \in J^+(p_n, \mathcal{C})$ for all $n$ implies $q \in J^+(p, \mathcal{C})$.
(5) A causal curve $\alpha$ contained in a compact subset $K$ of $\mathcal{C}$ is (continuously) extendible.

Proof. The first three properties follow from causality in $T_p(M) \approx R^n_1$ and properties of the exponential map, particularly Lemma 5.33. Because $\overrightarrow{pq}$ is a continuous function of $(p, q) \in \mathcal{C} \times \mathcal{C}$, (1) implies (4). In (5) we can suppose the domain of $\alpha$ is $[0, B)$, $B \le \infty$. Since $K$ is compact, there is a sequence $\{s_i\} \to B$ such that $\{\alpha(s_i)\}$ converges to a point $p \in K$. We must show that every such sequence converges to the same point $p$. Assume $\{t_i\} \to B$ but $\{\alpha(t_i)\} \to q \ne p$. Roughly speaking, $\alpha$ races back and forth between $p$ and $q$. By (4), $q \in J^+(p, \mathcal{C})$ and $p \in J^+(q, \mathcal{C})$. Hence (1) requires $\overrightarrow{pq}$ to be both future- and past-pointing. ∎

Of these properties it turns out that only (2) holds for arbitrary $M$; in fact, a stronger result is true.

3. Lemma. The relation $\ll$ is open; that is, if $p \ll q$ in $M$, there are neighborhoods $\mathcal{U}$ and $\mathcal{V}$ of $p$ and $q$, respectively, such that $p' \ll q'$ for all $p' \in \mathcal{U}$, $q' \in \mathcal{V}$.

Proof. Let $\sigma$ be a timelike curve from $p$ to $q$. If $\mathcal{C}$ is a convex neighborhood of $q$, let $q^-$ be a point of $\mathcal{C}$ on $\sigma$ before $q$. Dually, let $p^+$ be a point of $\sigma$ between $p$ and $q^-$ and contained in a convex neighborhood $\mathcal{C}'$ of $p$. By the lemma, $I^+(q^-, \mathcal{C})$ and dually $I^-(p^+, \mathcal{C}')$ are open in $M$. Hence they have the properties required of $\mathcal{V}$ and $\mathcal{U}$, respectively.

This lemma links causality firmly to the topology of $M$. It implies, in particular, that the chronological future $I^+(A)$ of any set $A$ is open. By contrast with the situation in Minkowski space, for arbitrary $M$ the sets $J^+(p)$ need not be closed.

4. Example. Let $M$ be $R^2_1$ with a point, say $(1, 1)$, deleted. Taking $p$ to be the origin, $I^+(p)$ is the usual Minkowskian cone. But no causal curve from $p$ can reach points indicated by the dashed line in Figure 1. Thus $J^+(p)$ consists of only $I^+(p)$ together with the null geodesic rays $\alpha$ and $\beta$ in the figure. In particular, the closure of $I^+(p)$ is strictly larger than $J^+(p)$.

Figure 1
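Example 4 can be made concrete with a small numerical sketch. It assumes the standard fact that in $R^2_1$ two null-related points are joined by no causal curve other than the null geodesic segment between them, so deleting a point of that segment removes the far endpoint from $J^+(p)$. The helper names below are illustrative only.

```python
def is_timelike_future(v):
    """True if v = (t, x) is timelike and future-pointing in R^2_1."""
    return v[0] > 0 and -v[0] ** 2 + v[1] ** 2 < 0


def on_segment(point, start, end, tol=1e-12):
    """True if `point` lies on the straight segment from `start` to `end`."""
    dx, dy = end[0] - start[0], end[1] - start[1]
    px, py = point[0] - start[0], point[1] - start[1]
    cross = dx * py - dy * px          # zero exactly when the three points are collinear
    dot = px * dx + py * dy            # position of `point` along the segment
    return abs(cross) <= tol and 0.0 <= dot <= dx * dx + dy * dy


p = (0.0, 0.0)
deleted = (1.0, 1.0)
q = (2.0, 2.0)  # null-related to p, beyond the deleted point

# The only candidate causal curve from p to q runs through the deleted point.
null_route_blocked = on_segment(deleted, p, q)

# Nearby points (2 + eps, 2) are reached by straight timelike segments that miss (1, 1).
approximants = [(2.0 + eps, 2.0) for eps in (1.0, 0.1, 0.001)]
all_in_chron_future = all(
    is_timelike_future(a) and not on_segment(deleted, p, a) for a in approximants
)
print(null_route_blocked, all_in_chron_future)
```

So $q = (2, 2)$ is a limit of points of $I^+(p)$ but, under the stated assumption, is not in $J^+(p)$, which is the failure of closedness described in the example.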

This example also illustrates the following results (taking the set $A$ to be a single point $p$). Theorem 10.51 gives

5. Corollary. If $\alpha$ is a future-pointing causal curve from $A$ to a point $q \in J^+(A) - I^+(A)$, then $\alpha$ is a null geodesic that has no conjugate points before $q$ and does not meet $I^+(A)$.

Thus the causal future $J^+(A)$ of $A$ is a union of $A$, $I^+(A)$, and (possibly) certain null geodesics from $A$.

6. Lemma. For a subset $A$, (1) $\operatorname{int} J^+(A) = I^+(A)$, and (2) $J^+(A) \subset \overline{I^+(A)}$, with equality if and only if $J^+(A)$ is a closed set.

Proof. (1) $I^+(A)$ is open and contained in $J^+(A)$, hence is contained in $\operatorname{int} J^+(A)$. If $q \in \operatorname{int} J^+(A)$, then for a convex neighborhood $\mathcal{C}$ of $q$, $I^-(q, \mathcal{C})$ contains a point of $J^+(A)$. Hence $q \in I^+(J^+(A)) \subset I^+(A)$.
(2) The equality assertion is clear, since $I^+(A) \subset J^+(A)$. It will suffice to prove that $J^+(p) \subset \overline{I^+(p)}$ for a single point. Evidently $p \in \overline{I^+(p)}$, so suppose $p < q$. Let $\alpha$ be a future-pointing causal curve from $p$ to $q$. If $\mathcal{C}$ is a convex neighborhood of $q$, let $q^-$ be a point of $\alpha$ in $J^-(q, \mathcal{C})$. By Lemma 2, $q \in J^+(q^-, \mathcal{C}) \subset \overline{I^+(q^-, \mathcal{C})}$. But $I^+(q^-, \mathcal{C}) \subset I^+(J^+(p)) \subset I^+(p)$, hence $q \in \overline{I^+(p)}$.

QUASI-LIMITS

The study of causality demands some notion of limit of a sequence of (piecewise smooth) causal curves. For any reasonable notion the limit curves need not be piecewise smooth; hence complications ensue. We shall ask merely for quasi-limits: broken geodesics that are only approximate limits, their accuracy measured by a convex covering of $M$. By this means, global causality questions can often be reduced to easy local ones.


7. Definition. Let $\{\alpha_n\}$ be an infinite sequence of future-pointing causal curves in $M$, and let $\mathcal{R}$ be a convex covering of $M$. A limit sequence for $\{\alpha_n\}$ relative to $\mathcal{R}$ is a (finite or infinite) sequence $p = p_0 < p_1 < \cdots$ in $M$ such that
(L1) For each $p_i$ there is a subsequence $\{\alpha_m\}$ and, for each $m$, numbers $s_{m0} < s_{m1} < \cdots < s_{mi}$ such that
(a) $\lim_{m \to \infty} \alpha_m(s_{mj}) = p_j$ for each $j \le i$.
(b) For each $j < i$, the points $p_j$, $p_{j+1}$ and the segments $\alpha_m \mid [s_{mj}, s_{m,j+1}]$ for all $m$ are contained in a single set $\mathcal{C}_j \in \mathcal{R}$.
(L2) If $\{p_i\}$ is infinite, it is nonconvergent. If $\{p_i\}$ is finite, it has more than one point and no strictly longer sequence satisfies (L1).

Figure 2

Here (L1a) is a natural limit requirement, (L2) is a technicality, and (L1b) accomplishes the reduction to individual convex sets (Figure 2). After proving existence of limit sequences we shall geodesically connect successive points $p_i < p_{i+1}$ to obtain quasi-limits. The existence will be assured by two mild initial conditions on $\{\alpha_n\}$: (1) The sequence $\{\alpha_n(0)\}$ converges to a point $p$. (2) There is a neighborhood of $p$ that contains only finitely many of the curves $\alpha_n$ (notation: $\{\alpha_n\} \not\to p$).

8. Proposition. Let $\{\alpha_n\}$ be a sequence of future-pointing causal curves such that $\{\alpha_n(0)\} \to p$ but $\{\alpha_n\} \not\to p$. Then relative to any convex covering $\mathcal{R}$, $\{\alpha_n\}$ has a limit sequence starting at $p$.

Proof. Since $M$ is paracompact, it has a locally finite covering $\mathcal{R}'$ by open sets $\mathcal{B}$ such that each $\overline{\mathcal{B}}$ is compact and contained in some member of $\mathcal{R}$. By the hypotheses on $\{\alpha_n\}$, we can arrange for $\mathcal{R}'$ to contain a $\mathcal{B}_0$ such that infinitely many $\alpha_n$ start in $\mathcal{B}_0$ and leave $\mathcal{B}_0$. Relabel these curves as $\{{}^1\alpha_m\}$, and for each $m$ let ${}^1\alpha_m(s_m)$ be the first point in $\operatorname{bd} \mathcal{B}_0$. Passing to a further subsequence we can suppose that $\{{}^1\alpha_m(s_m)\}$ converges to a point $p_1$ of $\operatorname{bd} \mathcal{B}_0$. Now choose $\mathcal{B}_1 \in \mathcal{R}'$ containing $p_1$. If infinitely many ${}^1\alpha_m$ leave $\mathcal{B}_1$, we obtain as before a subsequence $\{{}^2\alpha_m\}$ whose first departure points from $\mathcal{B}_1$ converge to a point $p_2$ in $\operatorname{bd} \mathcal{B}_1$. Repeat this step as many times as possible, with this proviso on subsequent choices of $\mathcal{B}_i$: if there is more than one candidate (element of $\mathcal{R}'$ containing $p_i$), pick one that has been used fewest times before.

Clearly condition (L1) of the preceding definition holds, with $\mathcal{C}_i$ any member of $\mathcal{R}$ that contains $\overline{\mathcal{B}}_i$. Since the relation $\le$ is closed on $\mathcal{C}_i$, it follows that $p_{i+1} \ge p_i$. By the construction, $p_{i+1} \ne p_i$, hence $p_{i+1} > p_i$.

If the sequence $\{p_i\}$ obtained above is infinite, we must show it is nonconvergent. Assume $\{p_i\} \to q$. Pick $\mathcal{B} \in \mathcal{R}'$ containing $q$; then $p_i \in \mathcal{B}$ for all but a finite number of the integers $i \ge 0$. Since $\overline{\mathcal{B}}$ is compact and $\mathcal{R}'$ is locally finite, only finitely many members of $\mathcal{R}'$ meet $\mathcal{B}$. Hence some one must have been chosen as $\mathcal{B}_i$ for infinitely many $i$. But this violates the choice proviso, above, for $\mathcal{B}$ itself was a candidate infinitely many times, but was chosen at most finitely many times (because, since it contains almost all $p_i$, only finitely many can be in $\operatorname{bd} \mathcal{B}$).

Finally suppose that the sequence $\{p_i\}$ obtained above is finite: $p = p_0 < p_1 < \cdots < p_k$. Since the construction cannot continue, it must be true that only a finite number of the curves ${}^k\alpha_m$ leave $\mathcal{B}_k$. Let $\{\alpha_m\}$ be those trapped in $\mathcal{B}_k$. By Lemma 2(5) they are extendible; by Exercise 5 we may as well assume that $\alpha_m$ is already defined on a closed interval $[0, b_m]$. Since $\overline{\mathcal{B}}_k$ is compact, for a further subsequence the endpoints $\alpha_m(b_m)$ converge to a point $q \in \overline{\mathcal{B}}_k$. If $q = p_k$, then $p_0 < \cdots < p_k$ obviously cannot be extended still satisfying (L1); hence it is a limit sequence. If $q \ne p_k$, then both (L1) and (L2) hold for $p_0 < \cdots < p_k < p_{k+1} = q$. ∎
If $\{p_i\}$ is a limit sequence for $\{\alpha_n\}$ as above, let $\lambda_i$ be the (future-pointing causal) geodesic from $p_i$ to $p_{i+1}$ in a convex set $\mathcal{C}_i$ as in (L1). Assembling these segments for all $i$ gives a broken geodesic $\lambda = \sum \lambda_i$ called a quasi-limit of $\{\alpha_n\}$ with vertices $p_i$. Thus $\lambda$ is a future-pointing causal broken geodesic that starts at $p$. If $\{p_i\}$ is infinite, then by (L2), $\lambda$ is future-inextendible. In the finite case $p_0 < \cdots < p_k$, the curve $\lambda$ runs from $p_0$ to $p_k$. A sequence satisfying the initial conditions always has infinitely many limit sequences, though sometimes as in Example 9 they all give the same quasi-limit. (In this context we can ignore change of parametrization, since it has no effect on the causality properties of curves.) A quasi-limit $\lambda$ of future-inextendible curves $\{\alpha_n\}$ is future-inextendible. In fact, it is clear from the preceding proof that every limit sequence will be infinite; hence $\lambda$ will be inextendible.

9. Example. In $R^2_1$ let $\alpha_n$ be the timelike geodesic segment from the origin $0$ to $(n + (1/n), n)$. In any limit sequence for $\{\alpha_n\}$ the vertices clearly must lie on the null geodesic ray $\lambda(s) = (s, s)$, $s \ge 0$. In fact, with $0$ as initial point, $\lambda$ is the unique quasi-limit. It is easy to test the construction from Proposition 8 (convex sets are the same as for $R^2$). Now, as in Example 4, let $M = R^2_1 - (1, 1)$. Then $\{\alpha_n\}$ has the unique quasi-limit $\mu = \lambda \mid [0, 1)$, which is future-inextendible and incomplete in $M$. Note that only decreasingly small initial segments of the curves $\alpha_n$ are actually used to get $\mu$.

CAUSALITY CONDITIONS

If $M$ contains no closed timelike curves, we say that the chronology condition holds on $M$. Physically this is a natural requirement since its absence leads to distressing paradoxes: an observer could take a trip from which he returns before his departure, and then decide not to go after all. More radically, by killing an ancestor he could prevent his own birth. Compact spacetimes are of minor importance for relativity, since the chronology condition cannot hold for them:

10. Lemma. If $M$ is compact, it contains a closed timelike curve.

Proof. By compactness the open covering $\{I^+(p) : p \in M\}$ has a finite subcover $I^+(p_1), \ldots, I^+(p_k)$. We can assume that $I^+(p_1)$ is not contained in any later $I^+(p_i)$; otherwise discard $I^+(p_1)$. But then $p_1 \in I^+(p_1)$, that is, there is a closed timelike curve through $p_1$; for if $p_1$ is in some later $I^+(p_i)$, then $I^+(p_1) \subset I^+(p_i)$. ∎

The manifold $M$ satisfies the causality condition provided there are no closed causal curves in $M$. Obviously this implies the chronology condition, but not conversely. For example, the translation $(t, x) \to (t + 1, x + 1)$ of $R^2_1$ generates a properly discontinuous group of isometries, and the resulting Lorentz orbit manifold $M$ satisfies the chronology but not the causality condition. In fact, the null geodesic $s \to (s, s)$ in $R^2_1$ projects to a smoothly closed null geodesic in $M$, but a timelike loop $\sigma$ in $M$ would lift to a timelike curve in $R^2_1$ from $(t, x)$ to $(t + n, x + n)$, which is impossible since these points are null-related.

The causality condition (and similarly for chronology) is said to hold at a point $p$ if there are no closed causal curves through $p$, and on a subset $A$ if it holds at each $p \in A$.

11. Definition. The strong causality condition holds at $p \in M$ provided that given any neighborhood $\mathcal{U}$ of $p$ there is a neighborhood $\mathcal{V} \subset \mathcal{U}$ of $p$ such that every causal curve segment with endpoints in $\mathcal{V}$ lies entirely in $\mathcal{U}$.

This says that causal curves that start arbitrarily close to $p$ and leave some fixed neighborhood cannot return arbitrarily close to $p$. In short, there are no causal curves almost closed at $p$. Thus strong causality implies causality; but the converse fails. We give an example in the traditional pictorial style in which $M$ is a region in $R^2_1$ (sometimes with identifications), null geodesics run at $\pm 45°$, and the future is upward. If no other metric is mentioned, use the flat $R^2_1$ metric.

12. Example. Following [HE], build $M$ from $S^1_1 \times R^1$ by deleting two spacelike half-lines whose endpoints were the endpoints of a short null geodesic (dashed line in Figure 3). The causality condition holds on $M$, but the strong causality condition fails at each point of the null geodesic.

Figure 3. Lines $L$ and $L'$ are identified. There are no closed causal curves through $p$, but $\alpha$ is almost closed.

13. Lemma. Suppose the strong causality condition holds on a compact subset $K$ of $M$. If $\alpha$ is a future-inextendible causal curve that starts in $K$, then $\alpha$ eventually leaves $K$ never to return; that is, there is an $s > 0$ such that $\alpha(t) \notin K$ for all $t \ge s$.

Proof. Assume that the conclusion is false. Then $\alpha$ either remains in $K$ or persistently returns. Thus, if the domain of $\alpha$ is $[0, B)$, $B \le \infty$, there is a sequence $\{s_j\}$ in $[0, B)$ such that $\{s_j\} \to B$ and $\{\alpha(s_j)\}$ is contained in $K$. Then, for a subsequence, $\{\alpha(s_j)\}$ converges to a point $p \in K$. Since $\alpha$ has no future endpoint, there must be another sequence $\{t_j\}$ converging to $B$ such that $\{\alpha(t_j)\}$ does not converge to $p$. By a further subsequence we can suppose that some neighborhood $\mathcal{U}$ of $p$ contains no $\alpha(t_j)$. Since $\{s_j\}$ and $\{t_j\}$ both converge to $B$, they have subsequences that alternate: $s_1 < t_1 < s_2 < t_2 < \cdots$. Thus the curves $\alpha \mid [s_k, s_{k+1}]$ are "almost closed" at $p$, contradicting the strong causality of $M$ at $p$. ∎

The following will be the main step in constructing geodesics joining points $p < q$.

14. Lemma. Suppose the strong causality condition holds on a compact subset $K$. Let $\{\alpha_n\}$ be a sequence of future-pointing causal curve segments in $K$ such that $\{\alpha_n(0)\} \to p$ and $\{\alpha_n(1)\} \to q \ne p$. Then there is a future-pointing causal broken geodesic $\lambda$ from $p$ to $q$ and a subsequence $\{\alpha_m\}$ of $\{\alpha_n\}$ such that $\lim_{m \to \infty} L(\alpha_m) \le L(\lambda)$.

Proof. Proposition 8 implies that $\{\alpha_n\}$ has a limit sequence $\{p_i\}$ starting at $p$. If $\{p_i\}$ is infinite, the corresponding quasi-limit $\lambda$ is a future-inextendible causal curve starting at $p$. Hence by the preceding lemma, $\lambda$ must leave $K$ and never return. In particular, some vertex $p_i$ is not in $K$. But this implies that the $\alpha_n$ leave $K$: a contradiction. Thus the limit sequence is finite, starts at $p$, and ends at $\lim \alpha_m(1) = q$. The quasi-limit $\lambda$ with these vertices is a causal broken geodesic from $p$ to $q$.

In Definition 7, (L1b) lets us deal with one convex set $\mathcal{C}_i$ at a time. The length of the $i$th segment of $\alpha_m$ is at most the separation of its endpoints in $\mathcal{C}_i$; that is,

$$L(\alpha_m \mid [s_{mi}, s_{m,i+1}]) \le |p_{mi}\, p_{m,i+1}|,$$

where $p_{mi} = \alpha_m(s_{mi})$. Hence

$$L(\alpha_m) \le L_m = \sum_i |p_{mi}\, p_{m,i+1}|,$$

the sum running over the finitely many segments. By Lemma 5.9, the vector $\overrightarrow{pq}$ and hence its norm $|pq|$ depend continuously on $(p, q)$. Thus $\{L_m\}$ converges to $\sum_i |p_i\, p_{i+1}| = L(\lambda)$. Taking a further subsequence gives the result.

TIME SEPARATION

There is a natural way to generalize the notion of the separation $|pq|$ of points $p \le q$ in $R^n_1$ to the arbitrary time-oriented Lorentz manifold $M$.

15. Definition. If $p, q \in M$, the time separation $\tau(p, q)$ from $p$ to $q$ is

$$\sup\{L(\alpha) : \alpha \text{ is a future-pointing causal curve segment from } p \text{ to } q\}.$$

Here $\tau(p, q) = \infty$ if the set of lengths is unbounded, and $\tau(p, q) = 0$ if it is empty; that is, if $q \notin J^+(p)$. When the supremum is taken on, we can think of $\tau(p, q)$ as the proper time of a slowest trip in $M$ from $p$ to $q$. In Minkowski spacetime, $\tau(p, q)$ is the separation $|pq|$ if $p \le q$ and otherwise is zero. Of course $\tau$ will behave badly if chronology conditions fail; for example, it is infinite for all $p, q$ in $S^1_1 \times R^1$. The comparison between time separation and Riemannian distance is more dual than direct: $\tau$ maximizes, $d$ minimizes. Because it involves time orientation, $\tau$ is symmetric only in trivial cases.
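For Minkowski space the description just given ($\tau(p, q) = |pq|$ when $p \le q$, zero otherwise) is explicit enough to compute. The following is a minimal sketch assuming coordinates $(t, x^1, \ldots)$ with signature $(-, +, \ldots, +)$; the name `tau` and the sample points are illustrative choices, not taken from the text.

```python
import math


def tau(p, q):
    """Time separation in Minkowski space: |pq| if p <= q, and 0 otherwise."""
    v = [b - a for a, b in zip(p, q)]
    s = -v[0] ** 2 + sum(c * c for c in v[1:])  # scalar product <pq, pq>
    if v[0] > 0 and s <= 0:                     # pq is causal and future-pointing
        return math.sqrt(abs(s))
    return 0.0                                  # includes p == q and q outside J+(p)


p = (0.0, 0.0)
q = (5.0, 3.0)
r = (10.0, 0.0)

print(tau(p, q))              # separation of a timelike pair
print(tau(q, p))              # reversed order: q is not in the causal past of p
print(tau(p, q) + tau(q, r))  # broken route p -> q -> r
print(tau(p, r))              # direct route p -> r
```

With these points the broken route through $q$ has total proper time 8 while the direct route has 10, so the detour is shorter, in contrast with Riemannian distance; and $\tau(p, q) \ne \tau(q, p)$ shows the lack of symmetry.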


16. Lemma. (1) $\tau(p, q) > 0$ if and only if $p \ll q$. (2) Reverse triangle inequality: If $p \le q \le r$, then $\tau(p, q) + \tau(q, r) \le \tau(p, r)$.

Proof. (1) If $\tau(p, q) > 0$, there is a future-pointing causal curve $\alpha$ from $p$ to $q$ with $L(\alpha) > 0$. Thus $\alpha$ is not a null geodesic, so there is a fixed endpoint deformation of $\alpha$ to a timelike curve. The converse is obvious by definition of $\tau$.
(2) If there are (future-pointing) causal curves $\alpha$ from $p$ to $q$ and $\beta$ from $q$ to $r$, then, given any number $\delta > 0$, we can choose $\alpha$ and $\beta$ to have lengths within $\delta/2$ of $\tau(p, q)$ and $\tau(q, r)$, respectively. Hence

$$\tau(p, r) \ge L(\alpha + \beta) \ge \tau(p, q) + \tau(q, r) - \delta,$$

and the result follows. If there is no (future-pointing) causal curve from say $p$ to $q$, then $\tau(p, q) = 0$. Since $p \le q$, we have $p = q$, so the result holds trivially. ∎

17. Lemma. The time-separation function $\tau : M \times M \to [0, \infty]$ is lower semicontinuous.

Proof. If $\tau(p, q) = 0$, there is nothing to prove. Suppose $q \in I^+(p)$ and $0 < \tau(p, q) < \infty$. Given $\delta > 0$, we must find neighborhoods $\mathcal{U}$ and $\mathcal{V}$ of $p$ and $q$, respectively, such that if $p' \in \mathcal{U}$ and $q' \in \mathcal{V}$ then $\tau(p', q') > \tau(p, q) - \delta$. Let $\alpha$ be a timelike curve from $p$ to $q$ with $L(\alpha) > \tau(p, q) - \delta/3$. Let $\mathcal{C}$ be a convex neighborhood of $q$ and let $q_1$ be a slightly earlier point of $\alpha$ in $\mathcal{C}$. Write $[r, r']$ for the geodesic segment in $\mathcal{C}$ from $r$ to $r'$. Since the length of such segments depends continuously on endpoints, there is a neighborhood $\mathcal{V}$ of $q$ such that if $q' \in \mathcal{V}$, then $[q_1, q']$ is causal and $L([q_1, q']) > L([q_1, q]) - \delta/3$. Since $[q_1, q]$ is geodesic, it is at least as long as the segment of $\alpha$ from $q_1$ to $q$. The corresponding construction at the endpoint $p$ then gives a neighborhood $\mathcal{U}$ of $p$ such that points $p' \in \mathcal{U}$ and $q' \in \mathcal{V}$ can be joined (in an obvious way) by a causal curve of length $L > L(\alpha) - 2\delta/3 > \tau(p, q) - \delta$. If $\tau(p, q) = \infty$, the same argument shows that for any $A > 0$ there are neighborhoods as above such that $\tau(p', q') > A$.

The notation $J(p, q) = J^+(p) \cap J^-(q)$ is a convenient one; $J(p, q)$ is the smallest set containing all future-pointing causal curves from $p$ to $q$ (hence it is empty unless $p \le q$).

18. Example. The time-separation function $\tau$ need not be continuous. Delete a spacelike segment from $R^2_1$. Then for points $p$ and $q$ arranged as in Figure 4, $\tau$ is not continuous at $(p, q)$. In fact, the shaded region is $J(p, q)$, so for $\varepsilon > 0$ small, every causal curve from $p$ to $q$ is nearly lightlike, hence short. However, for points $q'$ (as in Figure 4) arbitrarily near $q$, new causal routes appear whose length is large.

Figure 4

The time separation $\tau(A, B)$ of subsets $A$ and $B$ of $M$ is $\sup\{\tau(a, b) : a \in A, b \in B\}$. A variant of the preceding proof shows that the functions $x \to \tau(x, B)$ and $y \to \tau(A, y)$ are lower semicontinuous. For the Lorentz manifold $M$ we now establish reasonable sufficient conditions for the existence of a longest causal geodesic from $p$ to $q$. An obvious necessary condition is $p < q$.

19. Proposition. For $p < q$, if the set $J(p, q) = J^+(p) \cap J^-(q)$ is compact and the strong causality condition holds on it, then there is a causal geodesic from $p$ to $q$ of length $\tau(p, q)$.

Proof. Let $\{\alpha_n\}$ be future-pointing causal curve segments from $p$ to $q$ whose lengths converge to $\tau(p, q)$. (A priori the latter could be $\infty$, but the proof shows it is finite.) These curves are all in $J(p, q)$, hence by Lemma 14 there is a causal broken geodesic $\lambda$ from $p$ to $q$ with $L(\lambda) = \tau(p, q)$. If $\lambda$ actually has breaks, there is a longer causal curve from $p$ to $q$ (10.3 and 10.46), hence $\lambda$ is unbroken. ∎

This proposition motivates a fundamental definition: $M$ is globally hyperbolic provided the strong causality condition holds, and for each $p < q$ the set $J(p, q)$ is compact. Then any pair of points that can be joined by a causal curve can be joined by a (longest) causal geodesic. This is the best possible conclusion, and opens the way for a variety of geodesic constructions that make globally hyperbolic manifolds a particularly convenient type. Evidently Minkowski space $R^n_1$ is globally hyperbolic, and, as we shall see, so are Lorentz spheres $S^n_1$, Robertson-Walker spacetimes (with space $S$ complete), and Kruskal spacetime. On the other hand, removing a single point from a globally hyperbolic $M$ destroys the property, since there will be noncompact sets $J(p, q)$. Because global hyperbolicity of $M$ is such a stringent condition, it is useful to weaken it as follows.

20. Definition. A subset $\mathcal{H}$ of $M$ is globally hyperbolic provided (1) the strong causality condition holds on $\mathcal{H}$, and (2) if $p, q \in \mathcal{H}$ with $p < q$, then $J(p, q)$ is compact and contained in $\mathcal{H}$.

The definition relates the subset $\mathcal{H}$ to the causality of the manifold $M$; hence the property is definitely not intrinsic to $\mathcal{H}$. By Proposition 19, in a globally hyperbolic set $\mathcal{H}$ there is a causal geodesic of $M$ joining any points $p < q$.

21. Lemma. If $\mathcal{U}$ is a globally hyperbolic open set, the time-separation function $\tau$ of $M$ is continuous on $\mathcal{U} \times \mathcal{U}$.

Proof. From before, $\tau$ is finite and lower semicontinuous. Assume it is not upper semicontinuous at $(p, q) \in \mathcal{U} \times \mathcal{U}$. Thus there is a number $\delta > 0$ and sequences $\{p_n\} \to p$ and $\{q_n\} \to q$ such that $\tau(p_n, q_n) \ge \tau(p, q) + \delta$ for all $n$. Since $\tau(p_n, q_n) > 0$, there is a causal curve $\alpha_n$ from $p_n$ to $q_n$ such that $L(\alpha_n) > \tau(p_n, q_n) - 1/n$. Because $\mathcal{U}$ is open, it contains points $p^- \ll p$ and $q^+ \gg q$. We can suppose that the sequences $\{p_n\}$ and $\{q_n\}$ are contained in the open sets $I^+(p^-)$ and $I^-(q^+)$, respectively. It follows that the curves $\alpha_n$ are all in $J(p^-, q^+)$. Since $\mathcal{U}$ is globally hyperbolic, Lemma 14 applies (with $K = J(p^-, q^+)$) to show that there is a causal curve $\lambda$ from $p$ to $q$ with $L(\lambda) \ge \tau(p, q) + \delta$. This is impossible in view of the definition of $\tau$.

22. Lemma. If $\mathcal{U}$ is a globally hyperbolic open set in $M$, the causality relation $\le$ of $M$ is closed on $\mathcal{U}$.

Proof. As in Lemma 2, the assertion is that if $\{p_n\} \to p$ and $\{q_n\} \to q$ with all these points in $\mathcal{U}$, then $p_n \le q_n$ for all $n$ implies $p \le q$. The proof is trivial if $p_n = q_n$ for infinitely many $n$, hence we can suppose $p_n < q_n$ for all $n$. Let $\alpha_n$ be a causal curve from $p_n$ to $q_n$. Just as in the preceding proof, all $\alpha_n$ are in a suitable $J(p^-, q^+)$, and if $p \ne q$, the causal curve $\lambda$ given by Lemma 14 shows that $p < q$. ∎

In particular, if $M$ itself is globally hyperbolic, then all sets $J^+(p)$, $J^-(q)$, and $J(p, q)$ are closed.

ACHRONAL SETS

A subset $A$ of $M$ is achronal provided the relation $p \ll q$ never holds for $p, q \in A$; that is, provided no timelike curve meets $A$ more than once. For example, in $R^n_1$ a hyperplane $t$ constant is achronal. Obviously any subset of an achronal set is achronal. By Lemma 3 the closure of an achronal set is achronal.

23. Definition. The edge of an achronal set $A$ consists of all points $p \in \overline{A}$ such that every neighborhood $\mathcal{U}$ of $p$ contains a timelike curve from $I^-(p, \mathcal{U})$ to $I^+(p, \mathcal{U})$ that does not meet $A$.

For example, in $R^2_1$ the interval $A = \{(0, x) : 0 \le x < 1\}$ is achronal and has two edge points, $(0, 0)$ and $(0, 1)$. But if $A$ is considered as a subset of $R^3_1$, then $\operatorname{edge} A = \overline{A}$. In $R^n_1$ a hyperplane $t$ constant has empty edge. We want to show that every edgeless achronal set is a hypersurface, though not necessarily a smooth hypersurface, since, for example, any nullcone $\Lambda^+(p)$ in $R^n_1$ is achronal and edgeless.

An $n$-dimensional topological manifold $T$ is a Hausdorff space such that each point has a neighborhood homeomorphic to an open set in $R^n$. (Thus a smooth manifold is a topological manifold.) The classical Brouwer theorem on the invariance of domain [V] has the following useful consequence: If $\phi : T \to T'$ is a one-to-one continuous mapping of $n$-dimensional topological manifolds, then $\phi$ is a homeomorphism onto an open set $\phi(T)$ of $T'$.

24. Definition. A subspace $S$ of $T$ is a topological hypersurface provided that for each $p \in S$ there is a neighborhood $\mathcal{U}$ of $p$ in $T$ and a homeomorphism $\phi$ of $\mathcal{U}$ onto an open set in $R^n$ such that $\phi(\mathcal{U} \cap S) = \phi(\mathcal{U}) \cap \Pi$, where $\Pi$ is a hyperplane in $R^n$.

Then $S$ is an $(n - 1)$-dimensional topological manifold. Evidently this definition is just a topological form of the slice criterion (1.31) for smooth submanifolds. For example, the cone $\Lambda^+(0)$ in $R^n_1$ is a topological hypersurface, since the homeomorphism $(t, x) \to (t - |x|, x)$ carries it to the hyperplane $t = 0$.
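The flattening map for the cone can be written out directly. The sketch below assumes the cone is described as $\{(t, x) : t = |x|\}$ with $|x|$ the Euclidean norm of the space part; the function names are illustrative.

```python
import math


def euclid_norm(x):
    """Euclidean norm |x| of the space part."""
    return math.sqrt(sum(c * c for c in x))


def flatten_cone(t, x):
    """The map (t, x) -> (t - |x|, x); continuous, though not smooth at x = 0."""
    return t - euclid_norm(x), tuple(x)


def unflatten_cone(u, x):
    """The inverse map (u, x) -> (u + |x|, x), also continuous."""
    return u + euclid_norm(x), tuple(x)


# A point of the future nullcone in R^3_1: t = |x| with x = (3, 4).
cone_point = (5.0, (3.0, 4.0))
print(flatten_cone(*cone_point))        # lands in the hyperplane t = 0

# A point off the cone keeps a nonzero first coordinate, and the maps invert each other.
other = (2.0, (3.0, 4.0))
print(unflatten_cone(*flatten_cone(*other)))
```

Both maps are continuous on all of $R^n$ and mutually inverse, which is what makes the first a homeomorphism; the failure of smoothness at $x = 0$ reflects the vertex of the cone.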

25. Proposition. An achronal set A is a topological hypersurface if and only if A contains no edge points (that is, A and edge A are disjoint).

Proof. First let $A$ be a topological hypersurface. If $p \in A$, let $\mathcal{U}$ be an adapted topological coordinate neighborhood as in the preceding definition. We can suppose that $\mathcal{U}$ is connected and $\mathcal{U} - A$ has just two components. Since $A$ is achronal, the sets $I^-(p, \mathcal{U})$ and $I^+(p, \mathcal{U})$ are open connected sets that are disjoint and do not meet $A$. Any timelike curve through $p$ meets both sets, hence they are contained in different components of $\mathcal{U} - A$. Thus $p \notin \operatorname{edge} A$.

Now suppose $A$ and $\operatorname{edge} A$ are disjoint. If $p \in A$, then on a neighborhood $\mathcal{U}$ of $p$ let $\xi$ be a coordinate system with $\partial/\partial x^0$ timelike future-pointing. Using these coordinates we can get a smaller neighborhood $\mathcal{V}$ such that
(1) $\xi(\mathcal{V})$ has the form $(a - \delta, b + \delta) \times \mathcal{N} \subset R^1 \times R^{n-1} = R^n$.
(2) The slice $x^0 = a$ of $\mathcal{V}$ is in $I^-(p, \mathcal{U})$; the slice $x^0 = b$ of $\mathcal{V}$ is in $I^+(p, \mathcal{U})$ (see Figure 5).

Figure 5

For $\mathcal{U}$ sufficiently small, if $y \in \mathcal{N} \subset R^{n-1}$ the $x^0$ coordinate curve $s \to \xi^{-1}(s, y)$ $(a \le s \le b)$ must meet $A$, since $p$ is not an edge point. Since $A$ is achronal, the meeting point is unique; let $h(y)$ be its $x^0$ coordinate. It suffices to show that the function $h : \mathcal{N} \to (a, b)$ is continuous, for then $\phi = (x^0 - h(x^1, \ldots, x^{n-1}), x^1, \ldots, x^{n-1})$ is a homeomorphism that carries $A \cap \mathcal{V}$ to the slice $u^0 = 0$ of $\phi(\mathcal{V}) \subset R^n$. Thus $A$ is a topological hypersurface.

Let $\{y_n\}$ be a sequence that converges in $\mathcal{N}$ to $y$. Assume $\{h(y_n)\}$ does not converge to $h(y)$. Then some subsequence $\{h(y_m)\}$ converges to $r \ne h(y)$ (since the values of $h$ are bounded). But then $\xi^{-1}(r, y)$ is in the open set $I^-(q, \mathcal{V}) \cup I^+(q, \mathcal{V})$, where $q = \xi^{-1}(h(y), y) \in A$. Hence the same is true for some $\xi^{-1}(h(y_m), y_m) \in A$, contrary to the achronality of $A$. (In fact, the function $h$ satisfies a Lipschitz condition.) ∎

26. Corollary. An achronal set $A$ is a closed topological hypersurface if and only if $\operatorname{edge} A$ is empty.

Proof. If $A$ is a closed hypersurface, then by the proposition, $A$ and $\operatorname{edge} A$ are disjoint. But $\operatorname{edge} A \subset \overline{A} = A$, so $\operatorname{edge} A$ is empty.

Conversely, suppose $\operatorname{edge} A$ is empty. Then $A$ is a topological hypersurface. That $A$ is closed is shown by the achronal identity $\overline{A} - A \subset \operatorname{edge} A$. To prove this inclusion, note that since $\overline{A}$ is also achronal, if $q \in \overline{A} - A$, then no timelike curve through $q$ can ever meet $A$. It follows at once that $q \in \operatorname{edge} A$. ∎

A subset $F$ of $M$ is a future set provided $I^+(F) \subset F$. For example, if $B$ is any set, then $J^+(B)$ is a future set. Note that if $F$ is a future set, its complement $M - F$ is a past set (closed under $I^-$).

27. Corollary. The (nonempty) boundary of a future set is a closed achronal topological hypersurface.

Proof. Let $p \in \operatorname{bd} F$. If $q \in I^+(p)$, then $I^-(q)$ is a neighborhood of $p$ and hence contains a point of $F$. Thus $q \in I^+(F) \subset F$. This proves $I^+(p) \subset F$; dually, $I^-(p) \subset M - F$. A first consequence is that $I^+(\operatorname{bd} F)$ and $I^-(\operatorname{bd} F)$ are disjoint, and hence $\operatorname{bd} F$ is achronal. A second is that the closed set $\operatorname{bd} F$ has no edge points, since, in fact, $I^+(p) \subset \operatorname{int} F$ and $I^-(p) \subset \operatorname{ext} F$ for $p \in \operatorname{bd} F$. Thus the result follows from the preceding proposition. ∎

For example, in $R^n_1$ this shows again that a nullcone $\Lambda^+(p) = \operatorname{bd} J^+(p)$ is a closed achronal topological hypersurface. A subset $B$ of $M$ is acausal provided that the relation $p < q$ never holds for $p, q \in B$; that is, provided that no causal curve meets $B$ more than once. This is a stronger requirement than achronality; for example, a null geodesic line in Minkowski space is achronal but not acausal.

CAUCHY HYPERSURFACES

28. Definition. A Cauchy hypersurface in M is a subset S that is met exactly once by every inextendible timelike curve in M .

In particular, $S$ is achronal. In $R^n_1$, the hyperplanes $t$ constant are Cauchy hypersurfaces, but the achronal sets $H^{n-1}$ and $\Lambda^+(p)$ are not. Examples will show that a given $M$ need not contain a Cauchy hypersurface.

29. Lemma. A Cauchy hypersurface $S$ is a closed achronal topological hypersurface and is met by every inextendible causal curve.

Proof. It is immediate from the definition that $M$ is the disjoint union of the nonempty sets $I^-(S)$, $S$, $I^+(S)$. A timelike curve through any point of $S$ instantly meets both $I^-(S)$ and $I^+(S)$. Thus $S$ is the common boundary of the open sets $I^-(S)$ and $I^+(S)$. In view of Corollary 27 it remains to show that $S$ is met, not just by every inextendible timelike curve, but by every inextendible causal curve $\alpha$. Assume that $\alpha$ does not meet $S$. For definiteness, let $\alpha(0) \in I^+(S)$. By Lemma 30, below, there is a past-inextendible timelike curve $\beta$ starting in $I^+(S)$ that does not meet $S$. Any future-pointing timelike curve starting at $\beta(0)$ must remain in $I^+(S)$; thus adjoining it to $\beta$ gives an inextendible timelike curve that avoids $S$. ∎

We combine the "avoidance lemma" required above with a sharper version needed later.

30. Lemma. Let $\alpha$ be a past-inextendible causal curve starting at $p$ that does not meet a closed set $C$.

(1) If $p_0 \in I^+(p, M - C)$, there is a past-inextendible timelike curve starting at $p_0$ that does not meet $C$.
(2) If $\alpha$ is not a conjugate-free null geodesic, there is a past-inextendible timelike curve starting at $\alpha(0)$ that does not meet $C$.

Proof. Since $\alpha$ is past-inextendible, we can suppose that it has domain $[0, \infty)$ and that the sequence $\{\alpha(n)\}$ does not converge.

(1) The scheme is to push $\alpha$ slightly to the future, the displacement dropping as one proceeds pastward on $\alpha$. We work solely in the open submanifold $M - C$; all points are in $M - C$ and the relation $\ll$ is that of $M - C$ (implying that of $M$). Since $p_0 \gg \alpha(0)$, also $p_0 \gg \alpha(1)$. There is a point $p_1$ such that $\alpha(1) \ll p_1 \ll p_0$. Continuing by induction we get a sequence $\{p_n\}$ with $\alpha(n) \ll p_n \ll p_{n-1}$ for all $n \ge 1$. Joining each $p_{n-1}$ to $p_n$ by a timelike segment gives a past-pointing timelike curve $\beta$ in $M - C$ with $\beta(0) = p_0$. During the construction we can choose $p_n$ so close to $\alpha(n)$ that $\{p_n\}$ does not converge (e.g., $d(p_n, \alpha(n)) < 1/n$ for some topological metric $d$ on $M$). Thus $\beta$ is inextendible.

(2) We can assume $\alpha \mid [0, 1]$ is not a conjugate-free null geodesic. Since $\alpha[0, 1]$ is compact and disjoint from $C$, it follows from Theorem 10.51 that this curve segment can be deformed to a timelike segment with the same endpoints, still avoiding $C$. Let $\alpha_1$ be the causal curve gotten from $\alpha$ by replacing $\alpha \mid [0, 1]$ by this timelike segment. Now $\alpha \mid [1, \infty)$ may be a conjugate-free null geodesic, but for $\delta_1 > 0$ small, $\alpha_1 \mid [1 - \delta_1, 2]$ is not. As before, we get $\alpha_2$ by replacing this segment by a timelike segment avoiding $C$. Iterating the last step gives a timelike past-pointing curve $\beta$ from $\alpha(0)$ that avoids $C$. Choosing the sequence $\{\delta_n\}$ to converge to $0$ rapidly enough ensures, as in (1), that $\beta$ is past-inextendible. ∎

The exception in (2) is necessary; for example, let $C$ be the lower imbedding of $H^n$ in $R^{n+1}_1$. Then every past-inextendible timelike curve through the origin $0$ must meet $C$, but null geodesics through $0$ do not.
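For straight lines through the origin this example reduces to a one-line computation. The sketch below assumes that $C$ is the unit hyperboloid sheet $\{x : \langle x, x \rangle = -1,\ t < 0\}$ (the radius 1 is an assumption) and treats only geodesic rays $s \to s v$, not arbitrary timelike curves; the function names are illustrative.

```python
import math


def lorentz_sq(v):
    """Scalar product <v, v> with the first coordinate as time (signs - + ... +)."""
    return -v[0] ** 2 + sum(c * c for c in v[1:])


def hitting_parameter(v):
    """Return s > 0 with <s v, s v> = -1, or None if the ray s -> s v never satisfies it."""
    q = lorentz_sq(v)
    if q >= 0:          # null or spacelike direction: <s v, s v> = s^2 q is never -1
        return None
    return 1.0 / math.sqrt(-q)


timelike_past = (-5.0, 3.0, 0.0)   # past-pointing timelike direction in R^3_1
null_past = (-1.0, 1.0, 0.0)       # past-pointing null direction

s = hitting_parameter(timelike_past)
hit = tuple(s * c for c in timelike_past)
print(s, hit, lorentz_sq(hit))          # the hit point lies on the lower sheet
print(hitting_parameter(null_past))     # the null ray misses the hyperboloid
```

Every past-pointing timelike ray from $0$ reaches the lower sheet at a finite parameter, while along a null ray the scalar product stays identically zero, so no timelike curve can be substituted for the null geodesic without meeting $C$.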


31. Proposition. Let $S$ be a Cauchy hypersurface in $M$, and let $X$ be a timelike vector field on $M$. If $p \in M$, a maximal integral curve of $X$ through $p$ meets $S$ at a unique point $\rho(p)$. Then $\rho : M \to S$ is a continuous open map onto $S$ leaving $S$ pointwise fixed. In particular, $S$ is connected.

Proof. Lemma 1.56 and Exercise 1.16 show that maximal integral curves of $X$ are inextendible. Let $\psi : \mathcal{D} \to M$ be the flow of $X$. $\mathcal{D}$ is an open set in $M \times R$, and $S$ is a topological hypersurface in $M$, hence $\mathcal{D}(S) = (S \times R) \cap \mathcal{D}$ is a topological hypersurface in $\mathcal{D}$. The restriction $\psi : \mathcal{D}(S) \to M$ is continuous, and, since $S$ is a Cauchy hypersurface, $\psi$ is one-to-one onto. $\mathcal{D}(S)$ and $M$ are topological manifolds of the same dimension, hence by the invariance of domain, $\psi$ is a homeomorphism. The natural projection $\pi : S \times R \to S$ is open, continuous, and onto. But $\rho = \pi \circ \psi^{-1}$, since $\rho\psi(p, t) = \rho\alpha_p(t) = \alpha_p(0) = p$ for $p \in S$. Thus $\rho$ has the same properties as $\pi$. Clearly $\rho \mid S = \mathrm{id}$. Since $M$ is connected, so is $S$.

32. Corollary. Any two Cauchy hypersurfaces in $M$ are homeomorphic.

Proof. Let $S$ and $T$ be Cauchy hypersurfaces in $M$. For a fixed timelike vector field let $\rho_S$ and $\rho_T$ be the resulting retractions of $M$ onto $S$ and $T$. Then clearly $\rho_T|S$ and $\rho_S|T$ are inverse maps.

WARPED PRODUCTS

We consider some causality relations on Lorentz warped products $M = B \times_f F$ that, like Schwarzschild spacetime, have $B$ Lorentz and $F$ Riemannian. Recall that for each $q \in F$ the leaf $\sigma^{-1}(q)$ is totally geodesic in $M$ and isometric to $B$ under the projection $\pi: M \to B$.

(W1) $\langle d\pi(v), d\pi(v) \rangle \le \langle v, v \rangle$ for each $v \in TM$, with equality if and only if $v$ is horizontal.

This follows immediately from Definition 7.33 since $f > 0$ and $F$ is Riemannian. Hence

(W2) $d\pi$ carries causal vectors onto causal vectors.

(Onto because each causal vector in $T_{\pi p}(B)$ has a horizontal lift to $T_p(M)$.) In fact, timelike and nonhorizontal null vectors project to timelike vectors, while horizontal null vectors project to null vectors. Thus causal [timelike] curves in $M$ project to causal [timelike] curves in $B$. It follows also that $M$ is time-orientable if and only if $B$ is. As always in this chapter, $M$ is assumed to be connected and time-oriented; then we time-orient $B$ so that $d\pi$ carries future-pointing vectors to future-pointing vectors.

(W3) A subset $A$ of $B$ is achronal [acausal] if and only if $\pi^{-1}(A)$ is achronal [acausal] in $M$.

Proof. If $\alpha$ is timelike with endpoints in $\pi^{-1}(A)$, then $\pi \circ \alpha$ is timelike with endpoints in $A$. If $\gamma$ is timelike with endpoints in $A$, then any horizontal lift $\tilde\gamma$ is timelike with endpoints in $\pi^{-1}(A)$. Similarly,

(W4) $M$ satisfies the chronology, causality, or strong causality conditions, respectively, if and only if $B$ does.

(W5) Suppose $F$ is complete and the strong causality condition holds on $B$, hence on $M$. If a causal curve $\alpha$ in $M$ is inextendible, then $\pi \circ \alpha$ is inextendible in $B$ (and conversely).

Proof. Assume that $\pi \circ \alpha: [0, b) \to B$ is extendible to $\beta: [0, b] \to B$. By Exercise 5.14, $\pi \circ \alpha$ has finite length. Since $f > 0$ on the compact set $\beta[0, b]$, there is a $C > 0$ such that $f \circ \pi \circ \alpha \ge C$. The formula for the warped product metric then gives $|d\sigma(\alpha')| \le |d\pi(\alpha')| / C$. Consequently, $\sigma \circ \alpha$ also has finite length. Since $F$ is complete Riemannian, this means that $\sigma \circ \alpha$ stays in some compact set $K$. Thus $\alpha$ remains in the compact set $\beta[0, b] \times K$, contradicting Lemma 13.
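The length estimate in the proof of (W5) can be spelled out as follows. This is only a sketch: it assumes the usual form of the warped product metric (as recalled from Definition 7.33, which is not reproduced here), writes $\sigma: M \to F$ for the projection onto the fiber, and uses $|w| = |\langle w, w \rangle|^{1/2}$.

```latex
% Sketch under the assumptions stated above; C is the lower bound for f along \pi\circ\alpha.
\begin{align*}
\langle \alpha', \alpha' \rangle
  &= \langle d\pi(\alpha'), d\pi(\alpha') \rangle
   + (f \circ \pi \circ \alpha)^2 \, \langle d\sigma(\alpha'), d\sigma(\alpha') \rangle
   \;\le\; 0
   && \text{($\alpha$ causal)} \\
(f \circ \pi \circ \alpha)^2 \, |d\sigma(\alpha')|^2
  &\le -\langle d\pi(\alpha'), d\pi(\alpha') \rangle = |d\pi(\alpha')|^2
   && \text{($d\pi(\alpha')$ causal by (W2))} \\
|d\sigma(\alpha')|
  &\le \frac{|d\pi(\alpha')|}{f \circ \pi \circ \alpha}
   \;\le\; \frac{|d\pi(\alpha')|}{C} \\
L(\sigma \circ \alpha) = \int_0^b |d\sigma(\alpha')| \, dt
  &\le \frac{1}{C} \int_0^b |d\pi(\alpha')| \, dt = \frac{L(\pi \circ \alpha)}{C} < \infty .
\end{align*}
```

The last line is what confines $\sigma \circ \alpha$ to a bounded, hence (by completeness of $F$) relatively compact, subset of $F$.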


33. Lemma. With notation as above, if $F$ is complete, then $M = B \times_f F$ has a Cauchy hypersurface if and only if $B$ does.

Proof. If $S$ is a Cauchy hypersurface in $M$, then $S \cap \sigma^{-1}(q)$ is a Cauchy hypersurface in a leaf $\sigma^{-1}(q)$, hence $\pi(S \cap \sigma^{-1}(q))$ is a Cauchy hypersurface in $B$. Let $C$ be a Cauchy hypersurface in $B$. Anticipating Corollary 39 (or, in special cases, Lemma 34), we assert that the strong causality condition holds on $B$. Thus (W5) applies and it follows immediately that $\pi^{-1}(C)$ is a Cauchy hypersurface.

Some of the most important spacetimes in relativity theory are (spherically symmetric) warped products $M = B \times_f S^2$ with $B$ a Lorentz surface. Then the preceding lemma is particularly effective, since a hypersurface in $B$ is only a curve. Using null geodesic criteria such as Corollary 54, one can often tell by inspection whether $B$ (hence $M$) has a Cauchy hypersurface. For example, in the Kruskal plane (Figure 13.10), the diagonal line $A: u = v$ is a Cauchy hypersurface, hence the cylinder $\pi^{-1}(A) = A \times S^2$ is a Cauchy hypersurface in the Kruskal spacetime $Q \times S^2$. In applications, $B$ is often simply connected; then (W4) and the following lemma show that the strong causality condition holds on $B \times_f S^2$. (These remarks remain valid if $S^2$ is replaced by any complete Riemannian manifold.)

34. Lemma. The strong causality condition holds on any simply connected Lorentz surface $B$.

Proof. (See, for example, [V] for the topological results used.) Since $B$ is simply connected, it is time-orientable and there are smooth future-pointing null vector fields $U$, $V$ on $B$ that are linearly independent. Hence $V - U$ is a nonvanishing spacelike vector field.

(1) The causality condition holds. Assume $\alpha$ is a closed causal curve. By taking its first loop we can suppose $\alpha$ is simply closed. Since it is a simply connected surface, $B$ is homeomorphic to $R^2$ or $S^2$ (the latter is ruled out by Proposition 5.37). Thus the Jordan curve theorem applies: $\alpha$ parametrizes the boundary of a 2-cell $E$ in $B$. Since $V - U$ is never tangent to $\alpha$, it (or $U - V$) always points into $E$: integral curves starting on $\alpha$ are initially in $E$. For all sufficiently small $t$, the flow map $\psi_t$ is defined on (compact) $E$ and carries $E$ into itself. By the Brouwer fixed point theorem, each $\psi_t$ has a fixed point in $E$. But at a limit point of such fixed points, $V - U$ must be zero: a contradiction.

(2) The strong causality condition holds. If not, we can readily find a causal curve $\alpha$ whose endpoints $\alpha(0)$ and $\alpha(1)$ are joined by a segment $\sigma$ of an integral curve of $V - U$. Then the previous argument still applies to $\sigma + \alpha$.

The result fails in higher dimensions (see Exercise 8).

CAUCHY DEVELOPMENTS

35. Definition. If $A$ is an achronal subset of $M$, the future Cauchy development of $A$ is the set $D^+(A)$ of all points $p$ of $M$ such that every past-inextendible causal curve through $p$ meets $A$. (In particular, $A \subset D^+(A)$.)

Relativistically, $D^+(A)$ is that part of the causal future of $A$ that is predictable from $A$: no past-inextendible particle or light ray can reach an event $q$ in $D^+(A)$ without earlier having gone through $A$. With the past Cauchy development $D^-(A)$ defined dually, $D(A) = D^-(A) \cup D^+(A)$ is the Cauchy development of $A$.


36. Examples of Cauchy Developments.
(1) If $A$ is a spacelike hyperplane $t = c$ in $R^n_1$, then $D^+(A) = J^+(A) = \{(t, x): t \ge c\}$, and similarly for $D^-(A)$. Hence $D(A) = R^n_1$.
(2) For the lower imbedding of $H^n$ in Minkowski $(n+1)$-space, $D^+(H^n) = J^+(H^n) \cap I^-(0)$, the union of $H^n$ and the open region between $H^n$ and the past nullcone of the origin; this is only a small part of $J^+(H^n)$. But $D^-(H^n) = J^-(H^n)$.
(3) Let $M$ be $R^1_1 \times S^1$ with a point deleted. For a spacelike circle $S$ as in Figure 6, $D^+(S)$ is the union of $S$ and the open region between $S$ and the null geodesics $\alpha$ and $\beta$. Again $D^-(S) = J^-(S)$.

[Figure 6]
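The sets in Example 36(2) can be written out in coordinates. The following is a sketch under conventions that are assumptions here, not fixed by the text above: natural coordinates $(t, x)$ on $R^{n+1}_1$ with $x \in R^n$, line element $-dt^2 + |dx|^2$, and $H^n$ taken of radius 1.

```latex
% Assumed conventions: coordinates (t, x) on R^{n+1}_1, metric -dt^2 + |dx|^2,
% H^n of radius 1 in its lower imbedding.
\begin{align*}
H^n &= \{(t, x) : t = -\sqrt{1 + |x|^2}\}, \\
J^+(H^n) &= \{(t, x) : t \ge -\sqrt{1 + |x|^2}\}, \\
I^-(0) &= \{(t, x) : t < -|x|\}, \\
D^+(H^n) = J^+(H^n) \cap I^-(0) &= \{(t, x) : -\sqrt{1 + |x|^2} \le t < -|x|\}.
\end{align*}
```

Since $\sqrt{1 + |x|^2} - |x| \to 0$ as $|x| \to \infty$, the region is a sliver pinched between $H^n$ and the past nullcone. For a point $(t_0, x_0)$ with $t_0 \ge -|x_0|$, the past-pointing null ray $s \mapsto (t_0 - s, x_0 + s u)$, with $u$ the unit vector in the direction of $x_0$ (any unit vector if $x_0 = 0$), never meets $H^n$, so such a point is not in $D^+(H^n)$.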

In view of Lemma 29, an achronal set $A$ in $M$ is a Cauchy hypersurface if and only if $D(A) = M$. Thus in the general case, we can think of $D(A)$ as the largest subset for which $A$ plays the role of Cauchy hypersurface. The definition of Cauchy development makes sense for any set, and $D^+(A) \subset A \cup I^+(A) \subset J^+(A)$. Since we assume $A$ is achronal, $D^+(A)$ and $I^-(A)$ are disjoint, hence $D^+(A) \cap D^-(A) = A$ and $D^+(A) - A = D(A) \cap I^+(A)$.

For future reference we call the following fact regression: a past-pointing causal curve $\alpha$ starting in $D^+(A)$ cannot leave $D^+(A)$ without first meeting $A$. (Proof. If $\alpha(s) \notin D^+(A)$, there is a past-inextendible causal curve $\beta$ starting at $\alpha(s)$ that does not meet $A$; but $\alpha|[0, s] + \beta$ must meet $A$, hence $\alpha[0, s)$ meets $A$.)

37. Lemma. If $A$ is achronal and $p \in \operatorname{int} D(A)$, then every inextendible causal curve through $p$ meets both $I^-(A)$ and $I^+(A)$.

Proof. We can suppose also that $p \in A \cup I^+(A)$. Let $\alpha$ be a past-inextendible causal curve starting at $p$. The proof of Lemma 30(1), with $C$ empty, shows that there is a past-inextendible causal curve $\beta$ starting in $D(A) \cap I^+(A) \subset D^+(A)$, such that each $\beta(s)$ has a point of $\alpha$ in $I^-(\beta(s))$. Since $\beta$ meets $A$, it follows that $\alpha$ meets $I^-(A)$. Let $\gamma$ be a future-inextendible causal curve starting at $p$. If $p \in A$, then the dual of the past-inextendible case asserts that $\gamma$ meets $I^+(A)$, and there is nothing to prove if $p \in I^+(A)$.

38. Theorem. If $A$ is an achronal set, then $\operatorname{int} D(A)$ (if nonempty) is globally hyperbolic.

Proof. (1) The causality condition holds on $D(A)$. Assume there is a causal loop $\gamma$ at $p \in D(A)$. Traversing $\gamma$ repeatedly gives an inextendible causal curve $\tilde\gamma$ which must then meet $A$. But $\tilde\gamma$ meets $A$ repeatedly, contrary to achronality.

(2) Strong causality holds at $p \in \operatorname{int} D(A)$. Assume not. Then there exist future-pointing causal curve segments $\alpha_n$ defined on $[0, 1]$ such that $\{\alpha_n(0)\}$ and $\{\alpha_n(1)\}$ both converge to $p$, but every $\alpha_n$ leaves some fixed neighborhood of $p$. Thus $\{\alpha_n\}$ has a future-directed limit sequence $\{p_i\}$ starting at $p$. If $\{p_i\}$ is finite, it ends at $\lim \alpha_n(1) = p$. But then $p < p$ contrary to (1). Thus $\{p_i\}$ is infinite, hence the corresponding quasi-limit $\lambda$ is future-inextendible. By Lemma 37, $\lambda$ enters $I^+(A)$ and hence remains there, so some vertex $p_i$ is in $I^+(A)$. Thus there is a subsequence $\{\alpha_m\}$ and (by reparametrization) a number $s \in [0, 1]$ such that $\lim \alpha_m(s) = p_i$. Evidently we can suppose every $\alpha_m(s)$ is in $I^+(A)$. Since $p_i \ne p$, we can apply Proposition 8 dually to $\{\alpha_m|[s, 1]\}$ to obtain a past-directed limit sequence $\{\bar p_j\}$ starting at $p$. If $\{\bar p_j\}$ is finite, it must end at $\lim \alpha_m(s) = p_i$. But then $p_i < p$. Since $p_i > p$, this gives the contradiction $p < p$ again. Thus $\{\bar p_j\}$ is infinite. The resulting quasi-limit $\bar\lambda$ is a past-inextendible causal curve starting at $p$. By the lemma it meets $I^-(A)$. Hence as usual some $\alpha_m|[s, 1]$ must meet $I^-(A)$. Since $\alpha_m$ is future-pointing and has $\alpha_m(s) \in I^+(A)$, this is contrary to the achronality of $A$.

(3) If $p \le q$ in $\operatorname{int} D(A)$, then $J(p, q)$ is compact. If $p = q$, then by (1), $J(p, q) = \{p\}$. So suppose $p < q$. Let $\{x_n\}$ be a sequence in $J(p, q)$; we must show that some subsequence converges to a point of $J(p, q)$. Let $\alpha_n$ be a future-pointing causal curve segment from $p$ through $x_n$ to $q$. Let $\mathfrak{R}$ be a covering of $M$ by convex open sets $\mathcal{C}$ such that $\bar{\mathcal{C}}$ is compact and contained in a convex open set; all limit sequences will be relative to $\mathfrak{R}$. There is certainly such a sequence starting at $p$. Suppose it is finite, hence ends at $p_k = q$. Let $\{\alpha_m\}$ be a subsequence as in condition (L1) of Definition 7. By the pigeonhole argument, there is an $i < k$ such that, for infinitely many $m$, the point $x_m$ lies on the $i$th segment $\alpha_m|[s_{m,i}, s_{m,i+1}]$ of $\alpha_m$. Pass to this subsequence. By (L1) the segments, hence the points $x_m$, all lie in a single member $\mathcal{C}$ of $\mathfrak{R}$. The properties of $\mathfrak{R}$ imply that (after passing to a further subsequence) $\{x_m\}$ converges to a point $x$ and by Lemma 2 that $p_i \le x \le p_{i+1}$. Hence $p \le x \le q$, that is, $x \in J(p, q)$.

It remains to derive a contradiction to the assumption: every limit sequence for $\{\alpha_n\}$, relative to $\mathfrak{R}$ and starting at $p$, is infinite. Let $\lambda$ be a quasi-limit. Since $\lambda$ is a future-inextendible causal curve starting at $p$, we can find, as before, a subsequence $\{\alpha_m\}$ and (reparametrizing) a single $s$ such that $\{\alpha_m(s)\}$ converges to a vertex $p_i$ in $I^+(A)$. The proof is increasingly like that for (2). Since $p_i \ne q$, we can apply Proposition 8 dually to $\{\alpha_m|[s, 1]\}$, getting a past-directed limit sequence $\{q_i\}$ starting at $q$. If $\{q_i\}$ is finite, it must end at $\lim \alpha_m(s) = p_i$. This means that

$p < p_1 < \cdots < p_i < \cdots < q_1 < q$

is a finite limit sequence for $\{\alpha_n\}$ starting at $p$, a contradiction to our current assumption. Thus $\{q_i\}$ is infinite. The corresponding quasi-limit $\mu$ is a past-inextendible causal curve starting at $q$. As before, $\mu$ reaches $I^-(A)$, hence some $\alpha_m|[s, 1]$ does. Since $\alpha_m(s) \in I^+(A)$, this contradicts achronality.

(4) If $p \le q$ in $\operatorname{int} D(A)$, then $J(p, q) \subset \operatorname{int} D(A)$. As before, we can assume $p < q$. By duality, only two cases need to be considered.

Case 1. $p, q \in I^+(A)$. Pick $q^+ \in I^+(q) \cap D(A) \subset D^+(A)$. Then $N = I^+(A) \cap I^-(q^+)$ is an open set containing $J(p, q)$, so it suffices to prove $N \subset D^+(A)$. Let $\sigma$ be past-pointing timelike from $q^+$ to $y \in N$. Since $A$ is achronal and $y \in I^+(A)$, $\sigma$ does not meet $A$. Hence, by regression, $y \in D^+(A)$.

Case 2. $p \in J^-(A)$ and $q \in J^+(A)$. The argument is similar; since $p, q \in \operatorname{int} D(A)$, there exist points $p^- \in I^-(p) \cap D^-(A)$ and $q^+ \in I^+(q) \cap D^+(A)$. We assert that the neighborhood $N = I^+(p^-) \cap I^-(q^+)$ of $J(p, q)$ is in $D(A)$. If $x \in N$, let $\sigma$ and $\tau$ be past-pointing timelike curve segments from $q^+$ to $x$ and from $x$ to $p^-$, respectively. Since $A \subset D(A)$ we can suppose $x \notin A$. By achronality at least one of the curves does not meet $A$. If $\sigma$, then, by regression, $x \in D^+(A)$; if $\tau$, then $x \in D^-(A)$.

In the theorem, the restriction to the interior of $D(A)$ is unavoidable; for example, in the manifold of Figure 3, take $A$ to be a spacelike closed segment with one endpoint on the null geodesic (dashed line). In the next section, we find reasonable conditions for $D(A)$ to be open, hence globally hyperbolic. But the theorem already has an important consequence:

39. Corollary. If $M$ has a Cauchy hypersurface, then $M$ is globally hyperbolic.

Proof. If $S$ is a Cauchy hypersurface, then $D(S) = M$, hence $\operatorname{int} D(S) = M$.

For example, a Robertson-Walker spacetime with space $S$ complete is globally hyperbolic. In fact, any spacelike slice $t \times S$ is a closed acausal hypersurface and by Remark 12.27 every inextendible null geodesic meets $t \times S$. Corollary 54 will show that this makes $t \times S$ a Cauchy hypersurface. Since Kruskal spacetime has a Cauchy hypersurface, it too is globally hyperbolic. (For the converse of Corollary 39, due to R. P. Geroch, see [G] or [HE].)

40. Lemma. Let $A$ be an achronal set. If $p \in \operatorname{int} D(A) - I^-(A)$, then $J^-(p) \cap D^+(A)$ is compact.

Proof. If $p \in A$, then the set in question consists of $p$ alone, so suppose $p \in I^+(A) \cap \operatorname{int} D(A)$. Let $\{x_n\}$ be an infinite sequence in $J^-(p) \cap D^+(A)$ and let $\alpha_n$ be a past-pointing causal curve segment from $p$ to $x_n$. There is nothing to prove if any subsequence of $\{x_n\}$ converges to $p$. Otherwise there is a past-directed limit sequence $\{p_i\}$ for $\{\alpha_n\}$ starting at $p$. If $\{p_i\}$ is infinite, then, as usual, it follows that some $x_n$ is in $I^-(A)$: a contradiction. If $\{p_i\}$ is finite, some subsequence $\{x_m\}$ converges to a point $x \in J^-(p)$. Let $\sigma$ be a timelike curve from $p^+ \in D^+(A) \cap I^+(p)$ to $x$. If $\sigma$ meets $A$, then either $x \in A \subset D^+(A)$ or $x \in I^-(A)$, the latter implying some $x_m \in I^-(A)$; a contradiction. If $\sigma$ avoids $A$, then, by regression, $x \in D^+(A)$.

As Figure 7 indicates, this lemma and Lemma 37 both fail if the point $p$ is not in the interior of $D(A)$.

Figure 7. The point $p$ is in $D^+(A)$, but (1) the past-pointing null geodesic $\alpha$ starting at $p$ fails to reach $I^-(A)$, and (2) $J^-(p) \cap D^+(A)$ is not compact.

41. Example. Causality in $\tilde H^n_1$. (See definition, page 228.) Although $\tilde H^n_1$ is complete, diffeomorphic to $R^n$, and has constant curvature $K = -1$, its causality is not trivial. The following facts are readily found using the model of $\tilde H^n_1$ discussed in Example 8.27. There we saw that the slices $t$ constant (see Figure 8) are totally geodesic spacelike hypersurfaces, each isometric to hyperbolic $(n - 1)$-space (a line for $n = 2$). The $t$ axis is a timelike geodesic and $\partial_t$ is everywhere timelike future-pointing. The points $p$ and $q$, symmetrically placed relative to the slice $S: t = 0$, project to antipodal points in $H^n_1$. Maps $(t, x) \to (\pm t + c, x)$ are isometries. (Note that since $\tilde H^n_1$ is frame homogeneous, the $t$ axis represents an arbitrary timelike geodesic.)

Figure 8. Causality in $\tilde H^n_1$. Dashed lines are null geodesics; horizontal lines are spacelike hypersurfaces; all other lines are timelike geodesics. (For a three-dimensional representation, rotate the figure about the $t$ axis.)

(1) The future-pointing null geodesics starting at $p$ form a curving cone $\Lambda^+(p)$ that approaches but does not reach $S$. (Consider the corresponding null geodesics in $H^n_1$.)
(2) $J^+(p)$ is the entire closed half-space on the future side of $\Lambda^+(p) = \operatorname{bd} J^+(p)$. (Vertical lines in the figure are timelike, though only the $t$ axis is geodesic.)
(3) The strong causality condition holds. (Causal curves starting near $p$ can never return to near $p$.)
(4) $J(p, q)$ is the closed region bounded by $\Lambda^-(q) \cup \Lambda^+(p)$. Since this set is not compact, $\tilde H^n_1$ is not globally hyperbolic.
(5) Timelike geodesics in $\tilde H^n_1$ project to closed geodesics in $H^n_1$; thus the future-pointing timelike geodesics starting at $p$ all meet, after distance $\pi$, at the conjugate point $q$ (and again periodically along the $t$ axis). These geodesics are the normal geodesics to $S$, so $p$ and $q$ are its nearest focal points. Evidently, many points of $J^+(p)$ cannot be reached by geodesics from $p$.
(6) $D^+(S) = S \cup (\text{open region between } S \text{ and } \Lambda^-(q)) = J^+(S) \cap I^-(q)$. Thus $D(S) = I^+(p) \cap I^-(q)$.


SPACELIKE HYPERSURFACES

Causality can be expressed in topological terms, but in Lorentz geometry the initial data will often be smooth. Furthermore, the most obvious examples of achronal sets appearing earlier have been (smooth) spacelike hypersurfaces. For these we establish some small but significant advantages.

42. Lemma. An achronal spacelike hypersurface $S$ is acausal.

Proof. Assume there exists a future-pointing causal curve segment $\alpha$ with endpoints $\alpha(0)$ and $\alpha(1)$ in $S$. If $\alpha$ is not a null geodesic, it admits a fixed endpoint deformation to a timelike curve, contradicting the achronality of $S$. If $\alpha$ is a null geodesic, then since $\alpha'(0)$ cannot be normal to $S$, Lemma 10.50 applies, giving again a contradiction to achronality.

43. Lemma. If $S$ is an acausal topological hypersurface in $M$, then $D(S)$ is open (hence globally hyperbolic).

Proof. (1) First we show that $S \subset \operatorname{int} D(S)$. Assume $p \in S - \operatorname{int} D(S)$. Then there is a sequence $\{\alpha_n\}$ of inextendible causal curves such that $\{\alpha_n(0)\} \to p$ and each $\alpha_n$ avoids $S$. Since $S$ contains no edge points, we can show that $I(S) = I^-(S) \cup S \cup I^+(S)$ is open. Let $\mathcal{N}$ be a neighborhood of $p$ such that $\bar{\mathcal{N}}$ is compact, contained in a convex set, and contained in $I(S)$. Without loss of generality, suppose all $\alpha_n(0)$ are in $\mathcal{N} \cap I^+(S)$. As in the construction of a limit sequence, let $e_n$ be the first point after $\alpha_n(0)$ at which (past-pointing) $\alpha_n$ meets $\operatorname{bd} \mathcal{N}$. For a convergent sequence $\{e_m\} \to e$, we thus have $e \in J^-(p)$. Hence $e \notin I^+(S)$. $S$ is acausal and $e \ne p$, so $e \notin S$. Finally, $e$ cannot be in $I^-(S)$ since, while in $I(S)$, no $\alpha_n$ can reach $I^-(S)$ without meeting $S$. This contradicts $e \in \operatorname{bd} \mathcal{N} \subset I(S)$.

(2) Assume for the moment that $S$ is closed in $M$. By duality it suffices to show that the set

$D^+(S) - S = I^+(S) \cap D(S)$

is open in $M$. Assume that $p \in D^+(S) - S$ is not an interior point. Thus there is a sequence $\{\alpha_n\}$ of past-inextendible causal curves such that $\{\alpha_n(0)\} \to p$ and every $\alpha_n$ avoids $S$. By (1) and the fact that $S$ is closed, there is a convex covering $\mathfrak{R}$ whose members are either contained in $\operatorname{int} D(S)$ or disjoint from $S$. Let $\lambda$ be a quasi-limit of $\{\alpha_n\}$ relative to $\mathfrak{R}$. Since $\lambda$ starts at $p$ and is a past-inextendible causal curve, it meets $S$ at a (unique) point $\lambda(s)$. Let $p_i$ be the vertex of $\lambda$ such that $p_i > \lambda(s) \ge p_{i+1}$. The member of $\mathfrak{R}$ containing this segment of $\lambda$ meets $S$, hence by construction is contained in $\operatorname{int} D(S)$. Since $S$ is acausal, $p_i \notin S$, and by regression $p_i \in D^+(S)$. Thus $p_i \in I^+(S)$. Consequently $I^+(S) \cap \operatorname{int} D(S) \subset D^+(S)$ is a neighborhood of $p_i$. Some $\alpha_m$ must meet it, and hence meet $S$; a contradiction.

(3) For arbitrary $S$, we can assume $S$ connected, since the Cauchy developments of different components would be disjoint. Clearly $S$ is closed in $I(S)$, hence the preceding arguments apply with $M$ replaced by its connected open submanifold $I(S)$. The Cauchy development of $S$ is the same in $M$ as in $I(S)$, and being open in $I(S)$, it is open in $M$.

This lemma fails if the hypothesis acausal is weakened to achronal (in Figure 7, extend $A$ to a closed achronal hypersurface).

44. Theorem. Let $S$ be a closed achronal spacelike hypersurface in $M$. If $q \in D^+(S)$, there is a geodesic $\gamma$ from $S$ to $q$ of length $\tau(S, q)$. Hence $\gamma$ is normal to $S$ and has no focal points of $S$ before $q$. ($\gamma$ is timelike except in the trivial case $q \in S$.)

Proof. By the preceding lemma, $D(S)$ is a globally hyperbolic open set. By Lemma 40, $J^-(q) \cap D^+(S)$ is compact; hence its intersection with $S$, namely $J^-(q) \cap S$, is compact. By Lemma 21, the function $x \to \tau(x, q)$ is continuous on $J^-(q) \cap S$, hence takes on a maximum at say $p$. Evidently this maximum is $\tau(p, q) = \tau(S, q)$. By Proposition 19, there is a geodesic segment $\gamma$ from $p$ to $q$ of length $\tau(p, q) = \tau(S, q)$. Assuming $q \notin S$ gives $p \ll q$, hence $\tau(p, q) > 0$; so $\gamma$ is timelike. Then Corollary 10.26 implies that $\gamma$ is normal to $S$, and Theorem 10.37 forbids focal points.

The result extends by duality to the entire Cauchy development. Note that $D(S)$ is not necessarily a normal neighborhood of $S$, since the geodesic $\gamma$ in the theorem is not unique. (For example, put a ripple in the $x$ axis of $R^2_1$.)

The failure of a desirable property in $M$ is often not too serious if the property holds in some covering manifold of $M$. Our goal now is to show that any closed spacelike hypersurface of $M$ can be made achronal (hence acausal) by lifting it into a suitable covering manifold. The proof uses a result from intersection theory [GP]. A homotopy of closed curves (loops) in which the endpoint is allowed to move is called a free homotopy. A curve $\alpha$ is transversal to a submanifold $S$ at $\alpha(t) \in S$ provided $\alpha'(t)$ is not tangent to $S$. Thus $\alpha$ must cut cleanly through $S$ at $\alpha(t)$. For example, causal curves are always transversal to spacelike submanifolds. The result we need is this: a closed curve that meets a closed hypersurface $S$ exactly once, and there transversally, is not freely homotopic to a closed curve that does not meet $S$.

Since $M$ is time-oriented, a spacelike hypersurface $S$ in $M$ is two-sided: If $\mathcal{N}$ is a normal neighborhood of $S$, then $\mathcal{N} - S$ is the union of the disjoint open sets $\mathcal{N}^- = \mathcal{N} \cap I^-(S)$ and $\mathcal{N}^+ = \mathcal{N} \cap I^+(S)$.

45. Lemma. Let $S$ be a closed connected spacelike hypersurface in $M$.
(1) If the homomorphism $j_\#: \pi_1(S) \to \pi_1(M)$ induced by the inclusion map $j: S \subset M$ is onto, then $S$ separates $M$ (that is, $M - S$ is not connected).
(2) If $S$ separates $M$, then $S$ is achronal.

Proof. (1) The hypothesis simply means that, picking base point $p \in S$, each loop in $M$ at $p$ is fixed-endpoint homotopic to a loop in $S$. With notation as above, let $\sigma: [-1, 1] \to \mathcal{N}$ be a timelike curve from $\mathcal{N}^-$ to $\mathcal{N}^+$ that meets $S$ only at $\sigma(0) = p$. Now assume $S$ does not separate $M$. Since $M - S$ is connected, there is a curve $\alpha$ from $\sigma(1)$ to $\sigma(-1)$ that does not meet $S$. Then $\gamma = \sigma|[0, 1] + \alpha + \sigma|[-1, 0]$ is a closed curve meeting $S$ only at $\sigma(0)$ and there transversally. By hypothesis, $\gamma$ is fixed endpoint homotopic, hence freely homotopic, to a closed curve in $S$. Evidently a small deformation of this curve in the future direction moves it into $\mathcal{N}^+$. So $\gamma$ is freely homotopic to a closed curve disjoint from $S$. This contradicts the intersection theory fact mentioned above.

(2) Assume $S$ is not achronal. Thus there is a future-pointing timelike curve segment $\alpha$ with endpoints $\alpha(0)$, $\alpha(1)$ in $S$. It is clear that there are numbers $0 < a < b < 1$ such that $\alpha(a) \in \mathcal{N}^+$ and $\alpha(b) \in \mathcal{N}^-$. $\mathcal{N}^+$ and $\mathcal{N}^-$ are connected, since $S$ is; hence they are contained in the same component of $M - S$. Since $M$ is connected, each point of $M - S$ can be joined to $\mathcal{N} - S = \mathcal{N}^- \cup \mathcal{N}^+$ by a curve in $M - S$. Hence $M - S$ is connected, contrary to the hypothesis that $S$ separates $M$.

46. Corollary. If $M$ is simply connected, then every closed spacelike hypersurface in $M$ is achronal (hence acausal).

Given a spacelike hypersurface $S$ of an arbitrary $M$, it will not suffice merely to pass to the simply connected covering of $M$ since it may not be possible to find a copy of $S$ there.

47. Example. In the Lorentz torus $M = S^1_1 \times S^1$, the circle $S = p \times S^1$ is spacelike and compact but not achronal (Figure 9). The simply connected covering manifold of $M$ is $R^2_1$, which contains no spacelike circles. However, the Lorentz cylinder $\tilde M = R^1_1 \times S^1$ also covers $M$, with map $k = \exp \times \mathrm{id}$. Each component of $k^{-1}(S)$ is a closed spacelike hypersurface in $\tilde M$ that is achronal and isometric to $S$.

[Figure 9. The covering $\exp \times \mathrm{id}$ of the Lorentz torus $M = S^1_1 \times S^1$ by the Lorentz cylinder $\tilde M = R^1_1 \times S^1$, showing $S$ and a lift $\tilde S$.]

The result in this example can be achieved in general.

48. Proposition. Let $S$ be a closed, connected, spacelike hypersurface in $M$. Then there is a Lorentz covering $k: \tilde M \to M$ and a closed spacelike hypersurface $\tilde S$ in $\tilde M$ that is achronal and isometric under $k$ to $S$.

Proof. For an arbitrary Lorentz covering $k: \tilde M \to M$, $k^{-1}(S)$ is a closed spacelike hypersurface in $\tilde M$ and any connected component $\tilde S$ of $k^{-1}(S)$ shares these properties. Furthermore, $k|\tilde S: \tilde S \to S$ is a covering map and a local isometry, that is, $k|\tilde S$ is a Riemannian covering map. By Corollary A.13 there is a connected covering $k: \tilde M \to M$ such that $\operatorname{Image} k_\# = \operatorname{Image} j_\#$, where $j: S \subset M$. Assign $\tilde M$ the induced Lorentz metric and time-orientation.

(1) $k|\tilde S$ is an isometry. It remains only to show that $k|\tilde S$ is one-to-one. We can assume that $\tilde S$ contains the base point $\tilde p$ of the fundamental group of $\tilde M$. Then since $k|\tilde S$ is a covering map, it suffices to show that $\tilde p$ is the only point of $\tilde S$ sent to $p = k(\tilde p) \in S$. If $q \in \tilde S$ and $k(q) = p$, let $\alpha$ be a curve in $\tilde S$ from $\tilde p$ to $q$. Then $k \circ \alpha$ is a loop in $S$ at $p$. Thus its homotopy class $[k \circ \alpha] \in \pi_1(M, p)$ is in $\operatorname{Image} k_\#$, so there is a loop $\beta$ in $\tilde M$ at $\tilde p$ such that $k \circ \alpha \simeq k \circ \beta$. But by Corollary A.10, $\alpha$ and $\beta$ end at the same point; that is, $q = \tilde p$.

(2) $\tilde S$ is achronal. By the preceding lemma it suffices to show that the homomorphism $i_\#$ induced by $i: \tilde S \subset \tilde M$ is onto. Let $x \in \pi_1(\tilde M, \tilde p)$. By construction there is a $y \in \pi_1(S, p)$ such that $k_\#(x) = j_\#(y)$. By (1), $k|\tilde S$ is a homeomorphism, so there is a $z \in \pi_1(\tilde S, \tilde p)$ such that $(k|\tilde S)_\# z = y$. But $k \circ i = j \circ (k|\tilde S)$, hence $k_\# i_\#(z) = j_\#(y) = k_\#(x)$. Since $k_\#$ is always one-to-one (Appendix A), $i_\#(z) = x$.

CAUCHY HORIZONS

The future part of the boundary of the Cauchy development $D^+(A)$ is defined in causal terms as follows.

49. Definition. If $A$ is an achronal set, its future Cauchy horizon $H^+(A)$ is

$\overline{D^+(A)} - I^-(D^+(A)) = \{p \in \overline{D^+(A)} : I^+(p) \text{ does not meet } D^+(A)\}.$

Figure 10. $H^+(A)$ separates $D^+(A)$ from the rest of $J^+(A)$.

Relativistically $H^+(A)$ marks the limit of the spacetime region controlled by $A$; if $H^+(A)$ is nonempty, the entire future of $A$ cannot be predicted from $A$ (see Figure 10). With the past Cauchy horizon $H^-(A)$ defined dually, the Cauchy horizon of $A$ is $H(A) = H^-(A) \cup H^+(A)$.

50. Examples of Cauchy Horizons. (See Example 36.)
(1) For a restspace $t$ constant in $R^n_1$, both $H^+$ and $H^-$ are empty.
(2) For the lower imbedding of hyperbolic space in $R^{n+1}_1$, $H^+$ is the null cone $\Lambda^-(0)$ and $H^-$ is empty.
(3) Referring to Figure 6, the future Cauchy horizon $H^+(S)$ is given by the null geodesics $\alpha$ and $\beta$, while $H^-(S)$ is empty.
(4) In Figure 8, $H^+(S) = \Lambda^-(q)$ and $H^-(S) = \Lambda^+(p)$.

It is clear from the definition that a future Cauchy horizon $H^+(A)$ is closed. Also $H^+(A)$ is achronal, since the open set $I^+(H^+(A))$ is disjoint from $D^+(A)$, hence from $\overline{D^+(A)}$, hence from $H^+(A)$. If $A$ is not closed, $H^+(A)$ may not even be contained in $J^+(A)$ (consider an open interval in the $x$ axis of $R^2_1$). For $A$ closed, $H^+(A)$ can meet $A$ along null geodesics or at edge points (see Figure 7). However, we shall see that, as in the examples above, if $A$ is a closed achronal spacelike hypersurface, then $H^+(A)$ is contained in $I^+(A)$, hence does not meet $A$.

51. Lemma. For a closed achronal set $A$, $\overline{D^+(A)}$ is the set $T$ of points $p$ such that every past-inextendible timelike curve through $p$ meets $A$.

Proof. (1) $\overline{D^+(A)} \subset T$. Assume $p \in \overline{D^+(A)} - T$. Then there is a past-inextendible timelike curve $\alpha$ starting at $p$ that does not meet $A$. Thus $p \notin A$, so $p$ has a convex neighborhood $\mathcal{C}$ disjoint from $A$. Move from $p$ in the past direction on $\alpha$ to a point $r$ still in $\mathcal{C}$. Then $I^+(r, \mathcal{C})$ contains $p$ and hence a point $q \in D^+(A)$. The geodesic segment from $q$ to $r$ in $\mathcal{C}$ followed by the part of $\alpha$ past $r$ then constitutes a past-inextendible timelike curve that does not meet $A$, contradicting $q \in D^+(A)$.

(2) $\overline{D^+(A)} \supset T$. If $q \notin \overline{D^+(A)}$, pick $r \in I^-(q, M - \overline{D^+(A)})$. There is a past-inextendible causal curve $\alpha$ starting at $r$ that misses $A$. By Lemma 30, there is a past-inextendible timelike curve through $q$ that misses $A$. Thus $q \notin T$.

52. Lemma. If $A$ is a closed achronal set, then $\operatorname{bd} D^+(A) = A \cup H^+(A)$.

Proof. The definition of $H^+(A)$ and the fact that $A$ is achronal give $A \cup H^+(A) \subset \operatorname{bd} D^+(A)$. To prove the reverse inclusion, assume $p \in \operatorname{bd} D^+(A) - A - H^+(A)$. Then $p \in \overline{D^+(A)} - A$; hence by the preceding lemma, $p \in I^+(A)$. Also $p \in \overline{D^+(A)} - H^+(A)$, so there is a point $q \in I^+(p) \cap D^+(A)$. Then $I^+(A) \cap I^-(q)$ is a neighborhood of $p$ that, by regression, is contained in $D^+(A)$. This contradicts $p \in \operatorname{bd} D^+(A)$.

53. Proposition. Let $S$ be a closed acausal topological hypersurface. Then
(1) $H^+(S) = I^+(S) \cap \operatorname{bd} D^+(S) = \overline{D^+(S)} - D^+(S)$. In particular, $H^+(S)$ and $S$ are disjoint.
(2) $H^+(S)$, if nonempty, is a closed achronal topological hypersurface.
(3) Starting at each point of $H^+(S)$ there is a past-inextendible null geodesic without conjugate points that is entirely contained in $H^+(S)$. (Future-extended as far as possible in $H^+(S)$, such null geodesics are called generators of $H^+(S)$.)

Proof. (1) (a) Using Lemma 51 gives $H^+(S) \subset \overline{D^+(S)} \subset S \cup I^+(S)$. (b) $D^+(S)$ does not meet $H^+(S)$. In fact, by Lemma 43, $D(S)$ is open, so if $p \in D^+(S) \subset D(S)$, then $I^+(p)$ meets $D(S)$, but not $D^-(S)$. Thus $I^+(p)$ meets $D^+(S)$; so $p \notin H^+(S)$. (c) Since $S \subset D^+(S)$, (a) and (b) imply $H^+(S) \subset I^+(S)$. (d) Intersecting $I^+(S)$ with the sets equated in Lemma 52 then gives $I^+(S) \cap \operatorname{bd} D^+(S) = H^+(S)$. (e) Then by (b), $H^+(S) \subset \overline{D^+(S)} - D^+(S)$. Suppose $p \in \overline{D^+(S)} - D^+(S)$. Then if $q \in I^+(p)$, there is a past-pointing timelike curve from $q$ to $p$, and it avoids $S$ since $p \notin S \cup I^-(S)$. Also there is a past-inextendible causal curve starting at $p$ that avoids $S$. Hence $q \notin D^+(S)$, and consequently $p \in H^+(S)$.

(2) $P = D^+(S) \cup I^-(S)$ is a past set, by regression. Thus Corollary 27 asserts that $\operatorname{bd} P$ is a topological hypersurface. By (1), $H^+(S) = I^+(S) \cap \operatorname{bd} D^+(S)$. Since $I^-(S)$ is an open set disjoint from $I^+(S)$, it follows that $H^+(S) = I^+(S) \cap \operatorname{bd} P$. Clearly this implies that $H^+(S)$ is also a topological hypersurface, and we know it is closed and achronal.

(3) If $p \in H^+(S)$, then by (1) there is a past-inextendible causal curve $\gamma$ starting at $p$ that does not meet $S$. By Lemma 51, $\gamma$ cannot be timelike. Thus it follows from Lemma 30(2) that $\gamma$ is a conjugate-free null geodesic. It remains to show that $\gamma$ never leaves $H^+(S)$. Evidently $\gamma$ cannot meet $D^+(S)$ or it would meet $S$. If $\gamma(s) \notin \overline{D^+(S)}$ for some $s > 0$, there is a past-pointing past-inextendible timelike curve $\beta$ starting at $\gamma(s)$ that misses $S$. But then, applying Lemma 30(2) to $\gamma|[0, s] + \beta$ gives a contradiction to $\gamma(0) = p \in \overline{D^+(S)}$.

54. Corollary. Let S be a closed acausal topological hypersurface. If every inextendible null geodesic meets S, then S is a Cauchy hypersurface.

Proof. First we prove that $S$ is a Cauchy hypersurface if and only if $H(S)$ is empty. Since the boundary of a union is contained in the union of the boundaries, Lemma 52 gives $\operatorname{bd} D(S) \subset S \cup H(S)$. But $D(S)$ is open and contains $S$, so $\operatorname{bd} D(S) \subset H(S)$. The reverse inclusion is always true, hence $\operatorname{bd} D(S) = H(S)$. Since $M$ is connected, $H(S)$ is empty if and only if $D(S) = M$. Definition 28 and Lemma 29 show that the latter is equivalent to $S$ being a Cauchy hypersurface.

Now we show $H(S)$ is empty. Assume there is a point $p$ in say $H^+(S)$. By the proposition, the (past-inextendible, null geodesic) generator $\gamma$ of $H^+(S)$ through $p$ does not meet $S$. The inextendible extension of $\gamma$ cannot meet $S$ in the future since $S$ is achronal and $p \in I^+(S)$: a contradiction.

In view of Lemma 42 the two preceding results apply to a closed achronal spacelike hypersurface $S$.

HAWKING'S SINGULARITY THEOREM

The idea of the theorem is to extract the geometrical essence of the cosmological argument for the existence of a past singularity in a Robertson-Walker model of our universe: On the spacelike slice of present galactic time the galaxies are diverging (shape tensor); since gravity attracts (Ricci tensor), they have been diverging no less rapidly in the past. Thus trouble can be expected in the sufficiently distant past. As usual we state the result in future terms: convergence producing a future singularity. For a spacelike hypersurface $S$ in a time-oriented $n$-dimensional Lorentz manifold, the convergence $k$ (Definition 10.36) reduces to a real-valued function on $S$:

$$k = \langle U, H \rangle = \frac{1}{n-1}\,\operatorname{trace} S_U,$$

where $U$ is a future-pointing unit normal on $S$, and $H$ is its mean normal curvature vector field. We call $k$ here the future convergence of $S$. There are two versions of Hawking's theorem; the first assumes more and proves more.

55A. Theorem. Suppose $\operatorname{Ric}(v, v) \ge 0$ for every timelike tangent vector $v$ to $M$. Let $S$ be a spacelike future Cauchy hypersurface with future convergence $k \ge b > 0$. Then every future-pointing timelike curve starting in $S$ has length at most $1/b$.


Proof: If $q \in D^+(S) - S$, by Theorem 44 there is a (timelike) normal geodesic $\gamma$ from $S$ to $q$ with $L(\gamma) = \tau(S, q)$ and no focal points before $q$. But Proposition 10.37 asserts that there will be a focal point along $\gamma$ before $q$ if $L(\gamma) > 1/b$. Consequently, $D^+(S) \subset \{p \in M : \tau(S, p) \le 1/b\}$.

That $S$ is a future Cauchy hypersurface means $H^+(S)$ is empty (see Exercise 9). If a future-pointing timelike curve starting in $S$ leaves $D^+(S)$, it must meet $\operatorname{bd} D^+(S)$, but by Proposition 53 this would imply that $H^+(S)$ is nonempty. Hence $I^+(S) \subset D^+(S)$. In view of the inclusion above, the result follows from the definition of $\tau$.

55B. Theorem. Suppose $\operatorname{Ric}(v, v) \ge 0$ for every timelike tangent vector $v$ to $M$. Let $S$ be a compact spacelike hypersurface with future convergence $k > 0$. Then $M$ is future timelike incomplete.

Proof: We can suppose that $S$ is connected. Let $b > 0$ be the minimum of $k$ on $S$. Actually we show

(a) There is an inextendible future-pointing normal geodesic starting in $S$ that has length $\le 1/b$.

By Proposition 48 there is no loss of generality in assuming that $S$ is achronal (hence acausal). Thus, as in the preceding proof, $D^+(S) \subset \{p \in M : \tau(S, p) \le 1/b\}$. If $H^+(S)$ is empty, then Theorem 55A applies and proves the present theorem. Hence we can suppose that $H^+(S)$ is nonempty. Assuming (a) is false, we shall derive a contradiction. Two preliminary facts are needed.

(b) If $q \in H^+(S)$, there is a normal geodesic from $S$ to $q$ of length $\tau(S, q) \le 1/b$.

In the normal bundle of $S$, let $B$ consist of all zero vectors and all future-pointing vectors $v$ with $|v| \le 1/b$. Evidently $B$ is compact, since $S$ is. There is a sequence $\{q_n\}$ in $D^+(S)$ that converges to $q$. For each $q_n$ there is a geodesic with the asserted properties. Hence for each $q_n$ there is a vector $v_n \in B$ such that $\exp(v_n) = q_n$. Since $B$ is compact, we can assume (passing to a subsequence) that $\{v_n\}$ converges to some $v \in B$. By continuity, $\{q_n\} \to \exp(v)$, hence the latter is $q$. Now $|v_n|$ converges to $|v| \le 1/b$ and, by construction, $|v_n| = \tau(S, q_n)$. Since the function $p \to \tau(S, p)$ is lower semicontinuous, $|v| \ge \tau(S, q)$. Since (a) is assumed false, the geodesic $\gamma_v$ is defined on $[0, 1]$; thus it runs from $S$ to $q$ and has length $|v|$. Hence $\tau(S, q) = |v|$.

(c) The function $p \to \tau(S, p)$ is strictly decreasing on past-pointing generators of $H^+(S)$ (see Proposition 53).


Let $\alpha$ be such a generator; in its domain suppose $s < t$. By (b) there is a past-pointing timelike geodesic $\sigma$ from $\alpha(t)$ to $S$ of length $\tau(S, \alpha(t))$. Since $\alpha$ is null, the causal curve $\alpha|[s, t] + \sigma$ is broken and hence can be lengthened by a small fixed-endpoint deformation. Thus $\tau(S, \alpha(s)) > L(\alpha|[s, t] + \sigma) = L(\sigma) = \tau(S, \alpha(t))$.

The contradiction: Since (a) is false, the normal exponential map is defined on all of $B$. It follows that $H^+(S)$ is compact, since it is closed and by (b) is contained in the continuous image of the compact set $B$. But $p \to \tau(S, p)$ is lower semicontinuous, so its restriction to $H^+(S)$ takes on a finite minimum at some point. This contradicts (c) since there is a generator extending pastward from each point of $H^+(S)$.

If $S$ is neither Cauchy nor compact, the theorem fails utterly. For example, the lower imbedding of $H^n$ in $\mathbf{R}^{n+1}_1$ is a closed achronal spacelike hypersurface with $k = 1$, and $\mathbf{R}^{n+1}_1$ is flat, but complete. The “galaxies” from $H^n$ all collide at the origin, but without harm to the containing manifold.

When time-orientation is reversed in Hawking’s theorem, past convergence (future expansion) implies past singularities. Obviously the result can be refined by only requiring $\operatorname{Ric}(\gamma', \gamma') \ge 0$ on geodesics $\gamma$ normal to $S$. Then it provides a powerful augmentation of the Robertson-Walker result in Proposition 12.15. Let us verify that Theorem 55A generalizes this proposition when the space $S$ in the latter is complete. Let $M = I \times_f S$ and let the hypersurface in Theorem 55A be the spacelike slice $t_0 \times S$. The geodesics normal to the slice are the galaxies; hence Corollaries 12.10 and 12.12 show that $\operatorname{Ric}(\gamma', \gamma') \ge 0$ is equivalent to $\rho + 3\wp \ge 0$. By Corollary 12.8(3) the totally umbilic slice has mean curvature $(f'(t_0)/f(t_0))\,U$, hence constant convergence $k = -f'(t_0)/f(t_0)$. Thus Hubble expansion at galactic time $t_0$ is equivalent to $t_0 \times S$ having past convergence $k \le a < 0$. If the space $S$ is complete, the slice is a Cauchy hypersurface; hence Theorem 55A applies and gives the conclusion of Proposition 12.15.

Since our universe seems to be at least approximately Robertson-Walker, the hypotheses of Theorem 55A are not unreasonable, and this result strongly suggests that our universe is catastrophically singular in the past. This conclusion is thus freed from the specific Robertson-Walker model; in particular, the global hypothesis of exact spatial isotropy is no longer needed.

Though Theorem 55B proves less, its hypotheses are strikingly weaker. $M$ is required to satisfy only the timelike convergence condition, which is natural both mathematically and physically (see page 340). The replacement of global hyperbolicity of $M$ by compactness of $S$ means, for example, that when versions 55A and 55B both apply, deleting a closed set from $I^+(S)$ destroys 55A but not 55B.
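As a rough numerical check of the Robertson-Walker comparison, the sketch below assumes one particular scale function, $f(t) = t^{2/3}$ (a pressure-free, matter-dominated model; this choice is only an illustration and is not fixed by the text). With time-orientation reversed, Theorem 55A bounds the length of past-pointing timelike curves leaving the slice $t_0 \times S$ by $1/b$ with $b = f'(t_0)/f(t_0)$. All function names are hypothetical.

```python
def scale_factor(t):
    """Assumed scale function f(t) = t**(2/3); an illustrative choice only."""
    return t ** (2.0 / 3.0)


def scale_factor_rate(t):
    """Derivative f'(t) of the assumed scale function."""
    return (2.0 / 3.0) * t ** (-1.0 / 3.0)


def hubble_bound(t0):
    """Return (b, 1/b), where b = f'(t0)/f(t0) is the expansion rate at t0."""
    b = scale_factor_rate(t0) / scale_factor(t0)
    return b, 1.0 / b


t0 = 2.0
b, bound = hubble_bound(t0)
# In this model a galaxy (a t-parameter curve) run backward from t0
# reaches t = 0 after proper time t0, which should not exceed 1/b.
print(b, bound, t0 <= bound)
```

For this assumed $f$ the bound $1/b$ works out to $3t_0/2$, so the galaxies' past length $t_0$ sits inside it, as the theorem requires.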


56. Example. The conclusion of 55A need not hold under the hypothesis of 55B. Delete from $\mathbf{R}^2_1$ the points $(\tfrac12, x)$ with $|x| \ge 1$ and the points $(1, x)$ with $|x| \le 1.1$. Alter the Minkowski metric on the resulting open set $M$ as follows: Suppose $0 \le f(x) \le 1$, with $f(x) = 1$ if $|x| \le 1$ and $0$ if $|x| \ge 1.1$. Then let

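$$g(t, x) = \begin{cases} 1 - t & \text{if } t \le \tfrac12, \\ 1 - t f(x) & \text{if } \tfrac12 \le t \le 1, \\ 1 & \text{if } 1 \le t. \end{cases}$$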

With line element $-dt^2 + g^2\,dx^2$, $M$ is Minkowskian above the deleted sets (Figure 11) and “conical” below. Then $\partial_t$ is a future-pointing unit normal on the (spacelike) $x$ axis. There $S_{\partial_t}(\partial_x) = -D_{\partial_x}\partial_t = \partial_x$, so the $x$ axis has constant future convergence $k = 1$. The metric is independent of $x$ for $|x| \ge 1.1$. Identifying the lines $x = \pm 2$ turns $M$ into a topological cylinder and the $x$ axis into a circle $S$. Since $M$ is flat, the hypotheses of Theorem 55B are satisfied. The future-pointing geodesics normal to $S$ are $t$-parameter curves; none is longer than 1. But, for example, the timelike curve (t,x) = (2s,$ + s) has infinite length for s ≥ 0. Thus all galaxies end, but a well-directed spaceship can go on forever.

Figure 11

PENROSE’S SINGULARITY THEOREM

This result aims at recognizing in the geometry of a collapsing star general conditions that imply the existence of future “black hole” singularities. In a Schwarzschild black hole $B$, the most remarkable feature is that from a $t$-constant sphere $S^2(r)$ even the “outgoing” light rays are actually going inward: $dr/ds < 0$. These null geodesics have a better chance to escape than any particle starting inside $S^2(r)$, hence their failure to do so looks like a good warning of the ensuing singularity.


In the general case no such function $r$ is available, but Example 58 will show that the initial condition $dr/ds < 0$ on these null geodesics normal to $S^2(r)$ is equivalent to positive convergence. This condition makes sense for any spacelike submanifold $P$ of an arbitrary $M$, and under the appropriate Ricci curvature condition, the initial collapse will continue. The problem then (as analogously in the preceding section) is to show that this concatenation of null geodesics actually involves a singularity in $M$.

Let $P$ be a spacelike submanifold of $M$ with codimension $\ge 2$. If $H$ is the mean curvature vector field of $P$, then linear algebra in each (Lorentz) normal space $T_p(P)^\perp$ shows that the following are equivalent:

(1) $k(v) = \langle H, v \rangle > 0$ for all future-pointing null vectors $v$ normal to $P$.
(2) $k(w) = \langle H, w \rangle > 0$ for all future-pointing causal vectors $w$ normal to $P$.
(3) $H$ is past-pointing timelike.

57. Definition. A spacelike submanifold of $M$ is future-converging provided its mean curvature vector field $H$ is past-pointing timelike.

The definition makes sense even for hypersurfaces. For example, the hypersurface $S$ in Theorems 55A and 55B is future-converging. In relativistic contexts, a future-converging surface is called a trapped surface.

58. Example. Trapped Surfaces in a Schwarzschild Black Hole $B$. By Lemma 13.3 and Proposition 13.4(3) the sphere $S^2(r)$ in restspace $t$ constant has mean curvature vector field $H = -(1/r)\operatorname{grad} r$. Lemma 13.29 shows that $\operatorname{grad} r$ is timelike future-pointing on $B$, hence $H$ is timelike past-pointing. If $u$ is a future-pointing causal vector normal to $S^2(r)$, then $k(u) = \langle H, u \rangle = -u(r)/r$, so decrease of $r$ in the $u$ direction is equivalent to positive convergence.

The term “trapped” has another, purely causal, meaning, which we now discuss. For a subset $A$ of $M$ let $E^+(A) = J^+(A) - I^+(A)$. It is easy to see that $E^+(A)$ is achronal and that $A \subset E^+(A)$ if and only if $A$ is achronal. By Corollary 5, $E^+(A)$ is generated by conjugate-free null geodesics: if $q \in E^+(A)$, there is a null geodesic in $E^+(A)$ from $A$ to $q$ that has no conjugate points before $q$.

59. Definition. A closed achronal subset $A$ of $M$ is future-trapped provided $E^+(A)$ is compact. Dually, past-trapped means $E^-(A)$ is compact.

In either case $A$ itself must be compact. For example, in the cylinder $\mathbf{R}^1_1 \times S^1$, each individual point is both future- and past-trapped (see Figure 6).


A main step in Penrose's proof of Theorem 61 is to show that under reasonable hypotheses a trapped surface (differential geometric concept) is a trapped subset (causal concept).

60. Proposition. Suppose

(1) $\operatorname{Ric}(v, v) \ge 0$ for all null tangent vectors $v$ to $M$.
(2) $M$ is future null complete.

If $P$ is a compact achronal spacelike $(n - 2)$-submanifold of $M$ that is future-converging, then $P$ is future-trapped.

Proof. Globally, $P$ need not admit a (continuous) normal null vector field, but we can choose at each $p \in P$ a pair of future-pointing null vectors that locally fall into two independent (smooth) normal vector fields. The subspace $\tilde P \subset N(P)$ consisting of these null normals is a double covering of $P$, hence in particular is compact. By Proposition 10.43, for each $v \in \tilde P$ there is a focal point of $P$ along the geodesic $\gamma_v|[0, 1/k(v)]$, where $k(v) = \langle H, v \rangle > 0$. Since $k$ is continuous on compact $\tilde P$, there is a number $b > 0$ such that for every $v \in \tilde P$ there is a focal point of $P$ along $\gamma_v|[0, b]$. If $q \in E^+(P)$, then as noted above there is a null geodesic $\gamma$ from $P$ to $q$. Since $q \notin I^+(P)$, Theorem 10.51 shows that $\gamma$ is normal to $P$ and has no focal points of $P$ before $q$. Thus $\gamma$ is a reparametrization of some $\gamma_v$, $v \in \tilde P$, and we conclude that

$$E^+(P) \subset \exp(K), \quad \text{where } K = \{sv \in N(P) : v \in \tilde P \text{ and } 0 \le s \le b\}.$$

Since $\tilde P$ is compact, $K$ is compact, hence $\exp(K)$ is compact. To see that $E^+(P)$ is compact, let $\{q_n\}$ be a sequence in $E^+(P)$. Some subsequence $\{q_m\}$ converges to a point $q \in \exp(K) \subset J^+(P)$. But $q$ cannot be in $I^+(P)$, since $I^+(P)$ is open and no $q_n$ is in it. Thus $q \in E^+(P)$.

We state the main result in positive terms, followed by two versions as singularity theorems.

61. Theorem (Penrose). Suppose

(1) $\operatorname{Ric}(v, v) \ge 0$ for all null tangent vectors $v$ to $M$.
(2) $M$ has a Cauchy hypersurface $S$.
(3) $P$ is a compact achronal spacelike $(n - 2)$-submanifold in $M$ that is future-converging.
(4) $M$ is future null complete.

Then $E^+(P)$ is a Cauchy hypersurface in $M$.


The theorem is almost illustrated by taking $P$ to be a point of $\mathbf{R}^1_1 \times S^1$ (compare Figure 6). The only hypothesis to fail is future-convergence, but $P$ is nevertheless future-trapped and $E^+(P)$ is indeed a Cauchy hypersurface.

Proof. Since $M$ has a Cauchy hypersurface, it is globally hyperbolic. Then Lemma 22 implies that the sets $J^-(p)$ and $J^+(p)$ are closed for all $p \in M$. Since $P$ is compact, Exercise 4 shows that $J^+(P)$ is a closed set. By Lemma 6, $\operatorname{int} J^+(P) = I^+(P)$, hence $E^+(P) = \operatorname{bd} J^+(P)$. As the boundary of a future set, $E^+(P)$ is a topological manifold, and by the preceding lemma it is compact. For the Cauchy hypersurface $S$ let $\rho: E^+(P) \to S$ be the restriction to $E^+(P)$ of a retraction as in Proposition 31. Thus $\rho$ is continuous, and since $E^+(P)$ is achronal, the uniqueness of integral curves implies that $\rho$ is one-to-one. By the invariance of domain, $\rho$ is a homeomorphism of $E^+(P)$ onto an open subset of $S$. Since $E^+(P)$ is compact, $\rho E^+(P)$ is compact, hence closed in $S$. Since $S$ is connected, $\rho E^+(P) = S$; that is, $\rho: E^+(P) \to S$ is a homeomorphism. Hence

Corollary A. If (1), (2), and (3) hold, and the Cauchy hypersurface is noncompact, then $M$ is future null incomplete.

Continuing the proof of the theorem: Since $E^+(P)$ is achronal, it remains to show that every inextendible timelike curve $\beta$ meets $E^+(P)$. The strong causality condition holds, hence by Exercise 11, $\beta$ is an integral curve of some timelike vector field $X$. We saw above that the retraction produced by $X$ gives a homeomorphism $E^+(P) \approx S$. Thus since $\beta$ meets $S$, it also meets $E^+(P)$.

Corollary B. If (1), (2), and (3) hold, and there is an inextendible causal curve in $M$ that does not meet $E^+(P)$, then $M$ is future null incomplete.

In fact, by Lemma 29 the curve hypothesis means that $E^+(P)$ is not a Cauchy hypersurface.

Penrose’s result, the first singularity theorem (1965), improved theoretical prospects for the existence of black holes, since the “trapped surface” sufficient conditions for singularities are free of the symmetry of particular models such as Schwarzschild’s.

Exercises

$M$ denotes a connected time-oriented Lorentz manifold.

1. Let $\lambda$ be a quasi-limit of $\{\alpha_n\}$. If either $\lambda$ is null or every $\alpha_n$ is geodesic, show that every neighborhood of a point on $\lambda$ meets infinitely many $\alpha_n$'s.


2. Give an example of a sequence $\{\alpha_n\}$ of causal curves in $\mathbf{R}^2_1$ with infinite limit sequence $\{p_i\}$ such that there does not exist one single subsequence $\{\alpha_m\}$ such that every $p_i$ is a limit of points on the curves $\alpha_m$.

3. Prove that the set of points in $M$ at which the chronology [causality] condition fails is a (possibly empty) disjoint union of sets of the form $I^+(p) \cap I^-(p)$ [$J^+(p) \cap J^-(p)$].

4. Suppose that for every $p \in M$ the sets $J^-(p)$ and $J^+(p)$ are closed. Prove that if $K$ is compact, then $J^+(K)$ is closed.

5. Let $\alpha: [0, b) \to M$ be a future-pointing causal curve starting at $p$. If $\alpha$ is future-extendible, with endpoint $q$, show that: (a) The continuous extension $\tilde\alpha$ of $\alpha$ to $[0, b]$ need not be smooth. (b) If $\alpha$ is merely piecewise smooth, then $\tilde\alpha$ need not be piecewise smooth. (c) $p \le q$. (d) Given any neighborhood $\mathcal{U}$ of $\tilde\alpha[0, b]$ there exists in $\mathcal{U}$ a (smooth) causal curve segment from $p$ to $q$. (Hint: For (a) and (b), zigzag; for (c), use Lemma 2.)

6. The Schwarzschild exterior $N$ and black hole $B$ are globally hyperbolic.

7. If $V$ and $W$ are linearly independent vector fields on a smooth manifold $P$, show that there is a Lorentz metric on $P$ such that (a) $V$ is timelike and $W$ is null; (b) both $V$ and $W$ are null.

8. Give examples of (a) a Lorentz metric on $\mathbf{R}^3$ for which there are many closed timelike curves and many closed null curves (compare Lemma 34); (b) a Lorentz metric on $S^1 \times \mathbf{R}^1$ for which some $J^+(p)$ is an open set with compact closure. (Hint: Use the preceding exercise.)

9. Let $A$ be a closed achronal set. Prove: (a) $H^+(A)$ is empty if and only if $J^+(A) \subset D^+(A)$, and in this case $A$ is a closed topological hypersurface (called a future Cauchy hypersurface). (b) $A$ is a Cauchy hypersurface if and only if it is both a future and past Cauchy hypersurface.

10. Causality neighborhoods. Prove: (a) The strong causality condition holds at $p \in M$ if and only if every neighborhood of $p$ contains a neighborhood $\mathcal{N} = I^+(x) \cap I^-(y)$.
(b) If $\mathcal{N}$ is such a neighborhood, and its closure $\overline{\mathcal{N}}$ is compact and contained in a convex open set, then any future-inextendible causal curve starting in $\mathcal{N}$ leaves it and after its first departure never returns.

11. Let $\alpha: I \to M$ be an inextendible timelike curve. If the strong causality condition holds on $M$, prove: (a) $\alpha(I)$ is a closed submanifold of $M$. (b) $\alpha$ is an integral curve of a timelike vector field on $M$. (Hint: Use Exercise 3.12, Lemma 5.32, and a partition of unity.)

12. Give examples showing that neither converse in Lemma 45 is true.

13. Let $A$ be a subset of $M$. Prove: (a) $E^+(A)$ has the properties stated before Definition 59. (b) The event horizon $\mathcal{E} = \operatorname{bd} J^+(A)$ is a closed topological hypersurface. (c) $E^+(A) = \mathcal{E} \cap J^+(A)$. (d) Describe $\mathcal{E} = \operatorname{bd} J^+(\alpha)$ if $\alpha$ is (i) a timelike geodesic in $\mathbf{R}^n_1$, (ii) the accelerating observer given on page 181, (iii) a timelike geodesic in $S^n_1$.

14. Prove: (a) If $N$ is a compact simply connected four-dimensional manifold, its Euler number $\chi(N)$ is at least 2. (Hint: Since $N$ is orientable, Poincaré duality [V] can be used.) (b) The simply connected covering manifold of a four-dimensional Lorentz manifold is noncompact.

15. In view of its relation to Theorem 55A (see page 433), prove the analogue of Proposition 12.15 with $\rho + 3\wp > 0$ replaced by $\rho + 3\wp \ge 0$. (There can be a static future.)


Appendixes

A

FUNDAMENTAL GROUPS AND COVERING MANIFOLDS†

Both topics are topological; however, we describe them in terms of manifolds $M$. This is sufficient for our purposes and simplifies matters somewhat. The fundamental group of $M$ is a group whose elements are certain classes of continuous curves in $M$. Let $I$ be the closed unit interval $[0, 1]$ in $\mathbf{R}^1$. A path from $p$ to $q$ in $M$ is a continuous map $\alpha: I \to M$ such that $\alpha(0) = p$ and $\alpha(1) = q$. Let $P(p, q)$ be the set of all such paths.

1. Definition. If $\alpha, \beta \in P(p, q)$, a fixed-endpoint homotopy from $\alpha$ to $\beta$ is a continuous map $H: I \times I \to M$ such that for all $s, t \in I$

$$H(t, 0) = \alpha(t), \qquad H(t, 1) = \beta(t), \qquad H(0, s) = p, \qquad H(1, s) = q.$$

Defining $\alpha_s(t) = H(t, s)$ shows that in effect $H$ is a one-parameter family of paths $\alpha_s \in P(p, q)$, varying continuously from $\alpha_0 = \alpha$ to $\alpha_1 = \beta$. If such a homotopy exists, $\alpha$ and $\beta$ are fixed-endpoint homotopic, written $\alpha \simeq \beta$.
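The four conditions on a fixed-endpoint homotopy can be checked directly in the simplest setting. The sketch below assumes $M = \mathbf{R}^2$, where any two paths with common endpoints are joined by the straight-line homotopy $H(t, s) = (1 - s)\alpha(t) + s\beta(t)$; the two sample paths and all function names are illustrative choices.

```python
def straight_line_homotopy(alpha, beta):
    """Return H with H(t, s) = (1 - s) * alpha(t) + s * beta(t), valid in R^n."""
    def H(t, s):
        a, b = alpha(t), beta(t)
        return tuple((1 - s) * ai + s * bi for ai, bi in zip(a, b))
    return H


# Two sample paths in R^2 from p = (0, 0) to q = (1, 0).
def alpha(t):
    return (t, t * (1 - t))          # arc above the axis


def beta(t):
    return (t, -2 * t * (1 - t))     # arc below the axis


H = straight_line_homotopy(alpha, beta)

for u in (0.0, 0.25, 0.5, 0.75, 1.0):
    # H(t, 0) = alpha(t), H(t, 1) = beta(t), H(0, s) = p, H(1, s) = q
    print(H(u, 0) == alpha(u), H(u, 1) == beta(u),
          H(0, u) == (0, 0), H(1, u) == (1, 0))
```

The straight-line formula depends on convexity of the ambient space, so it does not carry over to a general manifold.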

2. Lemma. Fixed-endpoint homotopy $\simeq$ is an equivalence relation on $P(p, q)$.

The equivalence class containing $\alpha \in P(p, q)$ is denoted by $[\alpha]$ and called the fixed-endpoint homotopy class of $\alpha$. In favorable cases two paths combine to give a single path.

† See [Ma] and [ST].


3. Definition. If $\alpha \in P(p, q)$ and $\beta \in P(q, r)$, let

$$(\alpha * \beta)(t) = \begin{cases} \alpha(2t) & \text{if } 0 \le t \le \tfrac12, \\ \beta(2t - 1) & \text{if } \tfrac12 \le t \le 1. \end{cases}$$

Then $\alpha * \beta$ is a path from $p$ to $r$ called the path product of $\alpha$ and $\beta$. If $\alpha \in P(p, q)$, the reverse path $\bar\alpha \in P(q, p)$ is defined by $\bar\alpha(t) = \alpha(1 - t)$.

These two operations on paths respect fixed-endpoint homotopy, hence yield well-defined operations on homotopy classes: $[\alpha][\beta] = [\alpha * \beta]$ and $[\alpha]^{-1} = [\bar\alpha]$. The set of all such homotopy classes in $M$ fails to be a group solely because path products are defined only when the paths of one class end at the start of those of the other class. This difficulty is eliminated by considering only loops at some point $p \in M$. A loop is an element of $P(p, p)$, that is, a path starting and ending at $p$.

4. Proposition (Poincaré). If $p \in M$, let $\pi_1(M, p)$ be the set of all fixed-endpoint homotopy classes in $P(p, p)$. The multiplication $[\alpha][\beta] = [\alpha * \beta]$ makes $\pi_1(M, p)$ a group, called the fundamental group of $M$ at $p$.

Proof: Only associativity, $\alpha * (\beta * \gamma) \simeq (\alpha * \beta) * \gamma$, demands care in constructing the required homotopy. It is easy to see that the identity element of $\pi_1(M, p)$ is the homotopy class consisting of all loops at $p$ that are fixed-endpoint homotopic to the constant loop $e_p$ at $p$. Then the group inverse of $[\alpha]$ is $[\bar\alpha]$.
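The path product and the reverse path are easy to realize for paths given as functions on $[0, 1]$. The sketch below assumes the usual convention that $\alpha * \beta$ runs through $\alpha$ on $[0, \tfrac12]$ and through $\beta$ on $[\tfrac12, 1]$; the sample paths in $\mathbf{R}^2$ and the function names are illustrative.

```python
def path_product(alpha, beta):
    """Traverse alpha at double speed, then beta; requires alpha(1) == beta(0)."""
    def gamma(t):
        return alpha(2 * t) if t <= 0.5 else beta(2 * t - 1)
    return gamma


def reverse_path(alpha):
    """The same path run backward: t -> alpha(1 - t)."""
    return lambda t: alpha(1 - t)


# Sample paths in R^2: alpha from p = (0, 0) to q = (1, 0), beta from q to r = (1, 1).
def alpha(t):
    return (t, 0)


def beta(t):
    return (1, t)


gamma = path_product(alpha, beta)
print(gamma(0), gamma(0.5), gamma(0.75), gamma(1))
print(reverse_path(alpha)(0), reverse_path(alpha)(1))
```

The product is continuous at $t = \tfrac12$ only because $\alpha(1) = \beta(0)$, which is the endpoint-matching condition that loops at a fixed base point satisfy automatically.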

The general idea is that if there are "holes" in $M$, the loops surrounding a hole cannot be shrunk back to the base point $p$. Homotopic loops surround the same holes, and thus $\pi_1(M, p)$ gives an algebraic description of them. We list some easily verified properties of fundamental groups.

(a) If $M$ is connected, then the groups $\pi_1(M, p)$ for all $p$ are isomorphic. In fact, if $\gamma$ is a path from $p$ to $q$, then $[\beta] \to [\gamma * \beta * \bar\gamma]$ is a well-defined isomorphism $\pi_1(M, q) \approx \pi_1(M, p)$. Thus we can speak of the fundamental group of a connected $M$, writing merely $\pi_1(M)$ when the particular base point $p$ is unimportant.

5. Definition. $M$ is simply connected provided $M$ is connected and its fundamental group is trivial, that is, reduces to the identity element. (Thus every loop in $M$ is homotopic to a constant.)

For example, $\mathbf{R}^n$ is simply connected, since any loop $\alpha$ at 0 is fixed-endpoint homotopic to $e_0$ under the homotopy $H(t, s) = s\alpha(t)$. Also $S^n$ is simply connected if $n \ge 2$, but Chapter 7 shows that $\pi_1(S^1)$ is infinite cyclic.


(b) For a connected product manifold, $\pi_1(M \times N) \approx \pi_1(M) \times \pi_1(N)$. In fact, any loop in $M \times N$ can be written as $t \to (\alpha(t), \beta(t))$. Then $[(\alpha, \beta)] \to ([\alpha], [\beta])$ is the required isomorphism.

(c) A continuous map $\phi: M \to N$ induces a homomorphism $\phi_\#: \pi_1(M, p) \to \pi_1(N, \phi(p))$ given by $\phi_\#[\alpha] = [\phi \circ \alpha]$. Thus $\phi_\#$ provides an algebraic description of the map $\phi$.

In dealing with a manifold we prefer to use paths that are at least piecewise smooth so that manifold machinery can be applied to them.

6. Lemma. If $\alpha$ is a path in $M$ from $p$ to $q$, there is a piecewise smooth path $\beta$ that is fixed-endpoint homotopic to $\alpha$.

This can be proved by a direct argument in coordinate neighborhoods. If $M$ is a semi-Riemannian manifold, we can take the neighborhoods to be convex and construct $\beta$ as a broken geodesic. Now we turn to a notion that seems quite different but turns out to be closely related to fundamental groups.

7. Definition. A smooth map $k: \tilde M \to M$ onto $M$ is a covering map provided each point $p \in M$ has a connected neighborhood $\mathcal{U}$ that is evenly covered by $k$; that is, $k$ maps each component of $k^{-1}(\mathcal{U})$ diffeomorphically onto $\mathcal{U}$.

For example, the exponential map $t \to (\cos t, \sin t)$ from $\mathbf{R}^1$ to $S^1$ is a covering map, but its restriction to an interval $J \ne \mathbf{R}^1$ is not.

8. Lemma. If $k: \tilde M \to M$ is a covering and $M$ is connected, then the number of points in $k^{-1}(p)$ (an integer or $\infty$) is the same for all $p \in M$. (This number is called the multiplicity of the covering.)

Proof: For each $m \le \infty$, let $\mathcal{O}_m$ consist of all $p \in M$ such that $k^{-1}(p)$ has exactly $m$ points. The even covering condition shows each $\mathcal{O}_m$ is open. Since these sets are disjoint and $M$ is connected, $M = \mathcal{O}_m$ for some one $m$.

For example, considering $S^1$ as the unit circle in the complex plane, for each $n > 0$ the map $z \to z^n$ is a covering map $S^1 \to S^1$ of multiplicity $n$.

In general, given maps $\pi: E \to M$ and $\phi: P \to M$, a lift of $\phi$ through $\pi$ is a map $\tilde\phi: P \to E$ such that $\pi \circ \tilde\phi = \phi$.
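The multiplicity of the covering $z \to z^n$ of the unit circle can be seen numerically: every point has exactly $n$ preimages. The sketch below is only an illustration; the helper name `fiber` and the sample point are assumptions.

```python
import cmath
import math


def fiber(w, n):
    """All z on the unit circle with z**n == w, for w on the unit circle."""
    theta = cmath.phase(w)
    return [cmath.exp(1j * (theta + 2 * math.pi * j) / n) for j in range(n)]


n = 5
w = cmath.exp(0.7j)                  # a sample point of S^1
preimages = fiber(w, n)

print(len(preimages))
print(max(abs(z ** n - w) for z in preimages))   # residual, should be tiny
```

The $n$ preimages are equally spaced around the circle, so the count does not depend on the choice of $w$, in line with Lemma 8.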

9. Lemma. Let $k: \tilde M \to M$ be a covering. Let $\alpha: J \to M$ be a continuous [smooth] curve, and let $q$ be a point of $\tilde M$ such that $k(q) = \alpha(0)$. Then there is a unique continuous [smooth] lift $\tilde\alpha: J \to \tilde M$ of $\alpha$ through $k$ such that $\tilde\alpha(0) = q$.


Proof. Decompose $J$ into subintervals so that each subcurve $\alpha_i$ lies in a (connected) evenly covered open set $\mathcal{U}_i$ of $M$. There is no choice as to the lift of $\alpha_1$: If $\mathcal{V}$ is the component of $k^{-1}(\mathcal{U}_1)$ containing $q$, then $\tilde\alpha_1$ must be $(k|\mathcal{V})^{-1} \circ \alpha_1$. Continue by induction, replacing $q$ by the appropriate endpoint of $\tilde\alpha_i$.
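The inductive construction in this proof can be imitated numerically for the covering $t \to (\cos t, \sin t)$ of $S^1$ by $\mathbf{R}^1$. The sketch below is a discrete approximation under stated assumptions: the path is sampled finely enough that consecutive samples lie in a common evenly covered arc, and the names `covering` and `lift` are hypothetical.

```python
import math


def covering(t):
    """The covering map R^1 -> S^1, t -> (cos t, sin t)."""
    return (math.cos(t), math.sin(t))


def lift(path, q, steps=1000):
    """Approximate the lift of a path in S^1 starting at q, where covering(q) == path(0)."""
    lifted = [q]
    for i in range(1, steps + 1):
        x, y = path(i / steps)
        prev = lifted[-1]
        # Choose the preimage of (x, y) nearest to the previous lifted value:
        # reduce the angle difference to the interval [-pi, pi).
        delta = math.atan2(y, x) - prev
        delta = (delta + math.pi) % (2 * math.pi) - math.pi
        lifted.append(prev + delta)
    return lifted


def loop(t):
    """A loop at (1, 0) that winds twice around the circle."""
    return (math.cos(4 * math.pi * t), math.sin(4 * math.pi * t))


lifted = lift(loop, 0.0)
winding = round((lifted[-1] - lifted[0]) / (2 * math.pi))
print(lifted[-1], winding)
```

Starting the lift at a different preimage of $(1, 0)$, for example $q = 2\pi$, shifts the whole lifted curve by $2\pi$ but leaves the difference of its endpoints unchanged.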

In short, paths can be lifted uniquely to any level. A similar proof establishes the two-dimensional analogue: If $H$ is a homotopy in $M$ and $k(q) = H(0, 0)$, then $H$ has a unique lift $\tilde H$ through $k$ such that $\tilde H(0, 0) = q$. The following consequence links covering maps to fixed-endpoint homotopy:

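10. Corollary. Let $k: \tilde M \to M$ be a covering, and let $\alpha$ and $\beta$ be fixed-endpoint homotopic paths in $M$. If $\tilde\alpha$ and $\tilde\beta$ are lifts of $\alpha$ and $\beta$ through $k$ such that $\tilde\alpha(0) = \tilde\beta(0)$, then $\tilde\alpha$ and $\tilde\beta$ are fixed-endpoint homotopic. In particular, $\tilde\alpha(1) = \tilde\beta(1)$.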

It follows, for example, that the homomorphism $k_\#$ induced by $k$ is one-to-one.

11. Proposition. Let $k: \tilde M \to M$ be a covering map and $\phi: P \to M$ a smooth map. Let $p_0 \in P$ and $q_0 \in \tilde M$ be such that $\phi(p_0) = k(q_0)$. Then (1) if $P$ is connected, there is at most one lift $\tilde\phi$ of $\phi$ through $k$ such that $\tilde\phi(p_0) = q_0$; (2) if $P$ is simply connected, such a lift exists.

Assertion (1) follows from the uniqueness in Lemma 9. In (2), if $p \in P$, let $\alpha$ be a path from $p_0$ to $p$. Let $\beta$ be the lift of $\phi \circ \alpha$ starting at $q_0$. Then $\tilde\phi(p) = \beta(1)$ can be shown to provide the required map $\tilde\phi$.

Topological or manifold properties attributed to a covering $k: \tilde M \to M$ refer to the covering manifold $\tilde M$. Thus a simply connected covering is one for which $\tilde M$ is simply connected (hence $M$ is connected).

12. Theorem. Every connected manifold has a simply connected covering.

Using Proposition 11, it is easy to show that any two simply connected coverings $k_i: M_i \to M$ of the same manifold are equivalent; that is, there is a diffeomorphism $\psi: M_1 \to M_2$ such that $k_2 \circ \psi = k_1$. Thus we can speak of the simply connected covering of $M$ (also called the universal covering of $M$).

13. Corollary. If $M$ is connected, then, given any subgroup $H$ of $\pi_1(M, p)$, there is a connected covering $k: \tilde M \to M$ and a point $\tilde p \in \tilde M$ such that $k_\#(\pi_1(\tilde M, \tilde p)) = H$.


A covering $k: \tilde M \to M$ is trivial if each component of $M$ is evenly covered by $k$. Thus if $M$ is connected, $k$ is a diffeomorphism of each component $C$ of $\tilde M$ onto $M$, so $\lambda = (k|C)^{-1}$ is a global cross section of $k$.

14. Corollary. Every covering of a simply connected manifold is trivial.

The proof is a straightforward application of Proposition 11.

B

LIE GROUPS†

A Lie group $G$ is a smooth manifold that is also a group with smooth group operations; that is, the maps

$$\mu: G \times G \to G \ \text{ sending } (a, b) \text{ to } ab \qquad \text{and} \qquad \zeta: G \to G \ \text{ sending } a \text{ to } a^{-1}$$

are both smooth. We always assume $G$ is second countable. The identity element of $G$ is denoted by $e$.

1. Example. The Full Linear Group $GL(n, \mathbf{R})$. The set $\mathfrak{gl}(n, \mathbf{R})$ of all $n \times n$ real matrices is in a natural way a real vector space, hence a manifold. Stringing out the entries of each $x \in \mathfrak{gl}(n, \mathbf{R})$ in some fixed order would give a linear isomorphism (hence diffeomorphism, hence coordinate system) from $\mathfrak{gl}(n, \mathbf{R})$ to $\mathbf{R}^{n^2}$. The set $GL(n, \mathbf{R}) = \{g : \det g \ne 0\}$ of all invertible matrices in $\mathfrak{gl}(n, \mathbf{R})$ is evidently a group under matrix multiplication. The formula for the determinant of a matrix shows that the determinant function $\det: \mathfrak{gl}(n, \mathbf{R}) \to \mathbf{R}$ is smooth; thus $GL(n, \mathbf{R})$ is an open submanifold of $\mathfrak{gl}(n, \mathbf{R})$. The formulas for matrix multiplication and inverses then show that for $GL(n, \mathbf{R})$, the maps $\mu$ and $\zeta$ above are smooth. In this way $GL(n, \mathbf{R})$ becomes a Lie group.

A Lie group $H$ is a Lie subgroup of a Lie group $G$ provided $H$ is both an abstract subgroup and an immersed submanifold of $G$. This notion is subtle because $H$ need not have the induced topology. For our purposes something much simpler suffices.

† See [Ch], [H], [W].

2. Definition. A closed subgroup $H$ of a Lie group $G$ is an abstract subgroup that is a closed set of $G$.

3. Theorem. If $H$ is a closed subgroup of a Lie group $G$, then $H$ is a submanifold of $G$ and hence a Lie subgroup of $G$. In particular, a closed subgroup has the induced topology.

4. Examples of Closed Subgroups. (1) The kernel $K = \{a \in G : \phi(a) = e\}$ of a smooth homomorphism $\phi: G \to H$ is a closed subgroup of $G$. (2) The determinant function $\det: GL(n, \mathbf{R}) \to \mathbf{R} - 0$ is a smooth homomorphism. Its kernel $SL(n, \mathbf{R}) = \{a : \det a = 1\}$ is thus a closed subgroup of $GL(n, \mathbf{R})$, called the special linear group. (3) For a Lie group $G$, its component $G_0$ containing the identity element $e$ is a closed (and open) subgroup. In all three cases above, the subgroup is normal, that is, invariant under $a \to gag^{-1}$ for all $g \in G$.

LIE ALGEBRAS

5. Definition. A Lie algebra over $\mathbf{R}$ is a real vector space $\mathfrak{g}$ furnished with a bilinear function $[\ ,\ ]: \mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$, called its bracket operation, such that for all $X, Y, Z \in \mathfrak{g}$,

(1) $[X, Y] = -[Y, X]$ (skew-symmetry),
(2) $[[X, Y], Z] + [[Y, Z], X] + [[Z, X], Y] = 0$ (Jacobi identity).

For example, $\mathfrak{gl}(n, \mathbf{R})$ will always be made a Lie algebra by defining $[x, y] = xy - yx$, where $xy$ is matrix multiplication. Lie algebras are assumed to be finite-dimensional unless the contrary is mentioned.

We shall now see that there is a Lie algebra canonically associated with each Lie group. The essence of Lie theory is to study the groups in terms of their algebras. If $a$ is an element of a Lie group $G$, define $L_a(g) = ag$ and $R_a(g) = ga$ for all $g \in G$. Then $L_a: G \to G$ is a smooth map, in fact a diffeomorphism since $L_{a^{-1}}$ is its inverse (similarly for $R_a$).
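The two bracket axioms can be confirmed for the matrix bracket $[x, y] = xy - yx$ on sample matrices. The sketch below uses integer entries so that the arithmetic is exact; the helper names and the three test matrices are illustrative.

```python
def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(x, y):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def neg(x):
    """Entrywise negative of a matrix."""
    return [[-a for a in row] for row in x]


def bracket(x, y):
    """The matrix bracket [x, y] = xy - yx."""
    return add(matmul(x, y), neg(matmul(y, x)))


x = [[1, 2], [3, 4]]
y = [[0, 1], [1, 0]]
z = [[2, 0], [1, -1]]

skew = bracket(x, y) == neg(bracket(y, x))
jacobi = add(add(bracket(bracket(x, y), z), bracket(bracket(y, z), x)),
             bracket(bracket(z, x), y))
print(skew, jacobi)
```

A check on three matrices is of course not a proof; the identities follow in general from associativity of matrix multiplication.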


By convention, left-multiplication $L_a$ is the standard way to get around in a Lie group. In particular, any $a \in G$ can be moved to $e$ by $L_{a^{-1}}$.

6. Definition. A vector field $X$ on a Lie group $G$ is left-invariant provided $dL_a(X) = X$ for all $a \in G$.

Explicitly, $dL_a(X_g) = X_{ag}$ for all $a, g \in G$. Thus left-multiplication merely permutes the tangent vectors constituting $X$. It is not hard to show that a left-invariant vector field is smooth. Let $\mathfrak{g}$ be the set of all left-invariant vector fields on a Lie group $G$. The usual addition of vector fields and scalar multiplication by real numbers make $\mathfrak{g}$ a vector space. By Lemma 1.22, $\mathfrak{g}$ is closed under the bracket operation, hence the bracket properties in Lemma 1.18 hold on $\mathfrak{g}$. Thus $\mathfrak{g}$ is a Lie algebra, called the Lie algebra of $G$. Furthermore, $\mathfrak{g}$ has (finite) dimension $n = \dim G$, since

7. Lemma. The function $\mathfrak{g} \to T_e(G)$ sending each $X \in \mathfrak{g}$ to its value $X_e \in T_e(G)$ is a linear isomorphism.

Proof. The function is obviously linear, and it is one-to-one since, if $X_e = 0$, then $X_a = dL_a(X_e) = 0$ for all $a$. To prove onto: If $x \in T_e(G)$, define $X_a = dL_a(x)$ for all $a \in G$. Then $X$ is left-invariant and $X_e = x$.

This isomorphism is so natural that, where convenient, we can neglect it and think of the Lie algebra of $G$ as $T_e(G)$ with induced bracket operation. A Lie algebra homomorphism is a linear transformation of Lie algebras that preserves brackets. A subalgebra of a Lie algebra $\mathfrak{g}$ is a vector subspace $\mathfrak{h}$ that is closed under brackets (hence $\mathfrak{h}$ is also a Lie algebra).

We show now that the Lie algebra of $GL(n, \mathbf{R})$ is (canonically isomorphic to) the matrix Lie algebra $\mathfrak{gl}(n, \mathbf{R})$. Let $u_{ij}$ be the real-valued function on $\mathfrak{gl}(n, \mathbf{R})$ such that $u_{ij}(a)$ is the $ij$ entry $a_{ij}$ of each matrix $a$. Then $\{u_{ij} : 1 \le i, j \le n\}$ is a coordinate system on $\mathfrak{gl}(n, \mathbf{R})$, hence on its open submanifold $GL(n, \mathbf{R})$.

8. Lemma. The map $X \to (X_e(u_{ij}))$ is a Lie algebra isomorphism from the Lie algebra of $GL(n, \mathbf{R})$ to $\mathfrak{gl}(n, \mathbf{R})$.

Proof. The map is a composition of canonical isomorphisms:

$$\text{Lie algebra of } GL(n, \mathbf{R}) \xrightarrow{(1)} T_e(GL(n, \mathbf{R})) \xrightarrow{(2)} T_e(\mathfrak{gl}(n, \mathbf{R})) \xrightarrow{(3)} \mathfrak{gl}(n, \mathbf{R}),$$

where (1) is from Lemma 7, (2) is the usual open submanifold identification, and (3) is the canonical isomorphism for vector spaces as manifolds. That brackets are preserved is a straightforward coordinate computation.


The preceding results extend to complex matrices as follows. Let $\mathfrak{gl}(n, \mathbf{C})$ be the set of all $n \times n$ matrices with complex entries. Although $\mathfrak{gl}(n, \mathbf{C})$ is in a natural way an $n^2$-dimensional vector space over $\mathbf{C}$, we always restrict to real scalars. Thus $\mathfrak{gl}(n, \mathbf{C})$ is a real vector space of dimension $2n^2$. The matrix bracket $[x, y] = xy - yx$ makes it a real Lie algebra. For $a \in \mathfrak{gl}(n, \mathbf{C})$ let

$$u_{ij}(a) = \operatorname{Re} a_{ij}, \qquad v_{ij}(a) = \operatorname{Im} a_{ij}.$$

Then $\{u_{ij}, v_{ij} : 1 \le i, j \le n\}$ is a natural coordinate system on $\mathfrak{gl}(n, \mathbf{C})$. The set $GL(n, \mathbf{C}) = \{a : \det a \ne 0\}$ of invertible matrices in $\mathfrak{gl}(n, \mathbf{C})$ is, as before, a group under matrix multiplication and an open submanifold of $\mathfrak{gl}(n, \mathbf{C}) \approx \mathbf{R}^{2n^2}$. By the usual formulas, these structures make $GL(n, \mathbf{C})$ a Lie group, the complex full linear group. Computations as for Lemma 8 show that $\mathfrak{gl}(n, \mathbf{C})$ is (canonically isomorphic to) the Lie algebra of $GL(n, \mathbf{C})$.
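As a small numerical illustration, the following Python sketch checks on sample matrices that the matrix bracket $[x, y] = xy - yx$ is skew-symmetric and satisfies the Jacobi identity. The helper names (`mat_mul`, `bracket`, and so on) and the sample matrices are assumptions made for this example only; a check on three matrices is of course not a proof.

```python
# Illustrative sketch: the matrix bracket [x, y] = xy - yx on gl(2, C).
# Matrices are nested lists of Python complex numbers; all names are hypothetical.

def mat_mul(x, y):
    """Product of two square matrices of the same size."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(x, y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def mat_sub(x, y):
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def bracket(x, y):
    """Matrix bracket [x, y] = xy - yx."""
    return mat_sub(mat_mul(x, y), mat_mul(y, x))

def max_abs(x):
    """Largest absolute value among the entries of x."""
    return max(abs(entry) for row in x for entry in row)

# Sample elements of gl(2, C), chosen arbitrarily.
x = [[1 + 2j, 0], [3, -1j]]
y = [[0, 1], [1j, 2]]
z = [[2, -1j], [0, 1 + 1j]]

# Skew-symmetry: [x, y] + [y, x] = 0.
skew_defect = max_abs(mat_add(bracket(x, y), bracket(y, x)))

# Jacobi identity: [x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0.
jacobi_defect = max_abs(mat_add(
    mat_add(bracket(x, bracket(y, z)), bracket(y, bracket(z, x))),
    bracket(z, bracket(x, y))))

print(skew_defect, jacobi_defect)
```

Both defects should vanish up to rounding, in line with the bracket properties that make $\mathfrak{gl}(n, \mathbf{C})$ a Lie algebra.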

THE LIE EXPONENTIAL MAP

9. Definition. A one-parameter subgroup in a Lie group $G$ is a smooth homomorphism $\alpha$ from $\mathbf{R}$ (under addition) to $G$.

Thus $\alpha : \mathbf{R} \to G$ is a curve such that $\alpha(s + t) = \alpha(s)\alpha(t)$ for all $s, t$. Hence $\alpha(0) = e$, $\alpha(-t) = \alpha(t)^{-1}$, and $\alpha(s)\alpha(t) = \alpha(t)\alpha(s)$ for all $s, t$. For example,

$$\alpha(t) = \begin{pmatrix} \cos t & \sin t & 0 \\ -\sin t & \cos t & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

is a one-parameter subgroup of $GL(3, \mathbf{R})$.
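The homomorphism property can be checked numerically. The sketch below assumes that the example is the rotation about the third coordinate axis, with entries $\cos t$, $\sin t$, $-\sin t$, $\cos t$ in the upper left block; the function names and sample parameters are hypothetical.

```python
import math

def alpha(t):
    """Assumed example curve: rotation about the third axis in GL(3, R)."""
    c, s = math.cos(t), math.sin(t)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def mat_mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_diff(x, y):
    """Largest entrywise difference between two matrices."""
    return max(abs(a - b) for rx, ry in zip(x, y) for a, b in zip(rx, ry))

s, t = 0.7, -1.9  # arbitrary sample parameters

# alpha(s + t) = alpha(s) alpha(t)
homomorphism_defect = max_diff(alpha(s + t), mat_mul(alpha(s), alpha(t)))
# alpha(t) alpha(-t) = alpha(0) = e
inverse_defect = max_diff(mat_mul(alpha(t), alpha(-t)), alpha(0.0))
# alpha(s) alpha(t) = alpha(t) alpha(s)
commutator_defect = max_diff(mat_mul(alpha(s), alpha(t)),
                             mat_mul(alpha(t), alpha(s)))

print(homomorphism_defect, inverse_defect, commutator_defect)
```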

10. Proposition. The one-parameter subgroups of $G$ are exactly the maximal integral curves, starting at $e$, of the elements of its Lie algebra $\mathfrak{g}$.

The proof is not difficult.

11. Definition. Let $\mathfrak{g}$ be the Lie algebra of $G$. The Lie exponential map $\exp : \mathfrak{g} \to G$ sends $X$ to $\alpha_X(1)$, where $\alpha_X$ is the one-parameter subgroup of $X \in \mathfrak{g}$ (as above).

Like its geometric analogue, exp carries lines through the origin 0 of $\mathfrak{g}$ to one-parameter subgroups (geodesics), and its differential map at 0 is the canonical isomorphism $T_0(\mathfrak{g}) \approx \mathfrak{g} \approx T_e(G)$. Hence, by the inverse function theorem,


12. Lemma. Some neighborhood of 0 in $\mathfrak{g}$ is mapped diffeomorphically by exp onto a neighborhood of $e$ in $G$.

Its interpretation as $T_e(G)$ shows that the Lie algebra $\mathfrak{g}$ is determined by any arbitrarily small neighborhood of $e$ in $G$. Conversely, $\mathfrak{g}$ has direct influence on the set $\exp(\mathfrak{g})$, which necessarily lies in the identity component $G_0$ of $G$ (but need not fill it). The preceding lemma and the one that follows show that the influence of $\mathfrak{g}$ extends to all of $G_0$.

13. Lemma. Given any neighborhood $\mathcal{U}$ of $e$, every element of $G_0$ can be expressed as a finite product of elements of $\mathcal{U}$.

Let $H$ be a Lie subgroup of $G$, and let $\mathfrak{h}$ be the Lie algebra of $H$. Each $X \in \mathfrak{h}$ has a unique extension to $\tilde{X} \in \mathfrak{g}$, namely $\tilde{X}_a = dL_a(X_e)$ for all $a \in G$. The function $X \to \tilde{X}$ is linear, one-to-one, and bracket-preserving. It is customary to ignore this natural map and treat $\mathfrak{h}$ as a subalgebra of $\mathfrak{g}$.

14. Lemma. Let $H$ be a Lie subgroup of $G$. Then $\mathfrak{h}$ is the set of all $X \in \mathfrak{g}$ such that $\exp(tX) \in H$ for $|t|$ sufficiently small.

The reason for the term “exponential map” is clear in the case of the full linear group, as follows.

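15. Example. The Lie exponential map $\exp : \mathfrak{gl}(n, \mathbf{C}) \to GL(n, \mathbf{C})$ sends $x$ to $e^x$, where the latter is given by the convergent series

$$e^x = \mathrm{id} + x + \cdots + x^n/n! + \cdots.$$

The proof consists in showing that the one-parameter subgroup of $x \in \mathfrak{gl}(n, \mathbf{C})$ is $t \to e^{tx}$.

For instance, the one-parameter subgroup given after Definition 9 is $t \to e^{tx}$, where

$$x = \begin{pmatrix} 0 & 1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$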
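The series can be summed numerically to see this instance of the example at work. In the sketch below, the truncation at 40 terms, the helper names, and the sample value of $t$ are assumptions made for illustration; the generator $x$ is taken to be the derivative at $t = 0$ of the rotation about the third axis.

```python
import math

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(x, terms=40):
    """Partial sum id + x + ... + x^(terms-1)/(terms-1)! of the exponential series."""
    total = identity(len(x))
    term = identity(len(x))  # holds x^k / k!
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, x)]
        total = [[a + b for a, b in zip(rt, rs)] for rt, rs in zip(total, term)]
    return total

# Assumed generator: derivative at t = 0 of the rotation about the third axis.
x = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
t = 1.3
tx = [[t * entry for entry in row] for row in x]

series_value = mat_exp(tx)
rotation = [[math.cos(t), math.sin(t), 0.0],
            [-math.sin(t), math.cos(t), 0.0],
            [0.0, 0.0, 1.0]]

series_defect = max(abs(a - b)
                    for ra, rb in zip(series_value, rotation)
                    for a, b in zip(ra, rb))
print(series_defect)
```

The agreement reflects the identity $x^2 = -\mathrm{id}$ on the upper left block, which turns the exponential series into the cosine and sine series.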

THE CLASSICAL GROUPS

If $G$ is a Lie subgroup of $GL(n, \mathbf{C})$, then its Lie algebra is a subalgebra of $\mathfrak{gl}(n, \mathbf{C})$. We shall compute the Lie algebras for some important closed subgroups of $GL(n, \mathbf{C})$ that are defined using the following linear functions on $\mathfrak{gl}(n, \mathbf{C})$:

(1) The trace of $x \in \mathfrak{gl}(n, \mathbf{C})$ is $\sum_i x_{ii}$, hence $\operatorname{trace}(xy) = \operatorname{trace}(yx)$.
(2) The transpose ${}^t x$ of $x$ has $({}^t x)_{ij} = x_{ji}$, hence ${}^t(xy) = {}^t y\,{}^t x$.
(3) The complex conjugate $\bar{x}$ of $x$ has $(\bar{x})_{ij} = \overline{x_{ij}}$, hence $\overline{xy} = \bar{x}\,\bar{y}$.

Furthermore, if $a \in GL(n, \mathbf{C})$, then ${}^t(a^{-1}) = ({}^t a)^{-1}$, and similarly for the other two operations. Here are some related properties of the matrix exponential map.

16. Lemma. If $x, y \in \mathfrak{gl}(n, \mathbf{C})$ and $a \in GL(n, \mathbf{C})$, then

(1) $e^{x+y} = e^x e^y$ if $xy = yx$,
(2) $e^{-x} = (e^x)^{-1}$,
(3) $e^{{}^t x} = {}^t(e^x)$,
(4) $e^{\bar{x}} = \overline{e^x}$,
(5) $e^{axa^{-1}} = a e^x a^{-1}$,
(6) $\det e^x = e^{\operatorname{trace} x}$.
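Properties (1) and (6) lend themselves to a quick numerical check. The following sketch uses a truncated exponential series on $2 \times 2$ matrices; the helper names, the truncation length, and the sample matrices are assumptions made for illustration. It also shows that the hypothesis $xy = yx$ in (1) cannot simply be dropped.

```python
import cmath

def mat_mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(x, y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def mat_exp(x, terms=40):
    """Truncated series id + x + x^2/2! + ... (an approximation of e^x)."""
    n = len(x)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]  # holds x^k / k!
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, x)]
        total = mat_add(total, term)
    return total

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def max_diff(x, y):
    return max(abs(a - b) for rx, ry in zip(x, y) for a, b in zip(rx, ry))

# (6): det e^x = e^(trace x) for an arbitrary complex sample matrix.
x = [[0.3 + 0.5j, 1.0], [-0.2j, -0.1]]
det_defect = abs(det2(mat_exp(x)) - cmath.exp(x[0][0] + x[1][1]))

# (1): x commutes with y = x^2, so e^(x+y) = e^x e^y.
y = mat_mul(x, x)
commuting_defect = max_diff(mat_exp(mat_add(x, y)),
                            mat_mul(mat_exp(x), mat_exp(y)))

# Without commutativity the identity can fail.
p = [[0.0, 1.0], [0.0, 0.0]]
q = [[0.0, 0.0], [1.0, 0.0]]
noncommuting_gap = max_diff(mat_exp(mat_add(p, q)),
                            mat_mul(mat_exp(p), mat_exp(q)))

print(det_defect, commuting_defect, noncommuting_gap)
```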

Each of the great number systems $\mathbf{R}$ (reals), $\mathbf{C}$ (complexes), and $\mathbf{H}$ (quaternions) has for each $n$ a natural $n$-dimensional geometry whose group of linear isometries is a Lie group.

(1) The orthogonal group $O(n)$ is the group of linear isometries of $\mathbf{R}^n$ with its natural inner product $v \cdot w = \sum v_i w_i$. A simple computation shows that $gv \cdot gw = v \cdot w$ for all $v, w$ is equivalent to ${}^t g = g^{-1}$. Hence $O(n) = \{g \in GL(n, \mathbf{R}) : {}^t g = g^{-1}\}$.
(2) The unitary group $U(n)$ is the group of $\mathbf{C}$-linear isometries of $\mathbf{C}^n$ with its natural Hermitian product $(v, w) = \sum v_i \bar{w}_i$. It follows that $U(n) = \{g \in GL(n, \mathbf{C}) : {}^t \bar{g} = g^{-1}\}$.
(3) The symplectic group $Sp(n)$ is the group of $\mathbf{H}$-linear isometries of $\mathbf{H}^n$ with natural symplectic product [Ch]. Since quaternion multiplication is not commutative, the equivalent definition $Sp(n) = \{g \in U(2n) : {}^t g J = J g^{-1}\}$ may be preferred, where $J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}$.

Because the operations defining them are continuous, these are closed subgroups of GL(n, C) and hence are Lie groups.

17. Lemma. The Lie groups above are all compact, and all except $O(n)$ are connected.

The compactness can be shown directly by verifying that each $G$ is a closed bounded subset of $\mathfrak{gl}(n, \mathbf{C}) \approx \mathbf{R}^{2n^2}$ for suitable $n$. Connectedness is shown in Chapter 11. Using Lemmas 14 and 16, it is easy to compute the (matrix) Lie algebras of these groups.

(1) The Lie algebra $\mathfrak{o}(n)$ of the orthogonal group $O(n)$ consists of all $n \times n$ (real) skew-symmetric matrices:

$$\mathfrak{o}(n) = \{x \in \mathfrak{gl}(n, \mathbf{R}) : {}^t x = -x\}.$$


(2) The Lie algebra $\mathfrak{u}(n)$ of the unitary group $U(n)$ consists of all (complex) skew-Hermitian matrices:

$$\mathfrak{u}(n) = \{x \in \mathfrak{gl}(n, \mathbf{C}) : {}^t \bar{x} = -x\}.$$

(3) The Lie algebra $\mathfrak{sp}(n)$ of the symplectic group $Sp(n)$ consists of matrices in $\mathfrak{gl}(2n, \mathbf{C})$ of the form $\begin{pmatrix} x & y \\ -\bar{y} & \bar{x} \end{pmatrix}$, where $x \in \mathfrak{u}(n)$ and ${}^t y = y$.

The dimensions of these Lie algebras are readily counted, and

$$\dim O(n) = \dim \mathfrak{o}(n) = n(n-1)/2; \quad \dim U(n) = \dim \mathfrak{u}(n) = n^2; \quad \dim Sp(n) = \dim \mathfrak{sp}(n) = 2n^2 + n.$$

The three types respond differently when we take their special subgroups as in Example q2).

(1) The special orthogonal group $SO(n) = \{a \in O(n) : \det a = 1\}$. If $a \in O(n)$, then ${}^t a = a^{-1}$ implies $\det a = \pm 1$. Lemma 9.6 shows that $SO(n)$ is connected, hence it is the identity component of $O(n)$, and $\det a = -1$ gives the only other component. Since $O(n)$ and $SO(n)$ have common neighborhoods of $e$, remarks above show that they have the same Lie algebra: $\mathfrak{so}(n) = \mathfrak{o}(n)$.
(2) The special unitary group $SU(n) = \{a \in U(n) : \det a = 1\}$. The Lie algebra is $\mathfrak{su}(n) = \{x \in \mathfrak{u}(n) : \operatorname{trace} x = 0\}$. Thus $\dim SU(n) = \dim \mathfrak{su}(n) = n^2 - 1$.
(3) Every element of $Sp(n)$ has determinant 1.
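The link between these Lie algebras and their groups can be seen numerically in the orthogonal case. By Lemma 16, a skew-symmetric $x$ gives ${}^t(e^x) = e^{-x} = (e^x)^{-1}$ and $\det e^x = e^{\operatorname{trace} x} = 1$, so $e^x$ lies in $SO(n)$. The sketch below checks this for one sample matrix; the helper names, the series truncation, and the sample entries are assumptions made for illustration.

```python
def mat_mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_exp(x, terms=40):
    """Truncated exponential series (an approximation of e^x)."""
    total = identity(len(x))
    term = identity(len(x))  # holds x^k / k!
    for k in range(1, terms):
        term = [[entry / k for entry in row] for row in mat_mul(term, x)]
        total = [[a + b for a, b in zip(rt, rs)] for rt, rs in zip(total, term)]
    return total

def transpose(x):
    return [list(col) for col in zip(*x)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

# A sample element of o(3): real and skew-symmetric.
x = [[0.0, 0.4, -1.1],
     [-0.4, 0.0, 0.7],
     [1.1, -0.7, 0.0]]

g = mat_exp(x)
gram = mat_mul(transpose(g), g)  # should be the identity if g is orthogonal

orthogonality_defect = max(abs(a - b)
                           for ra, rb in zip(gram, identity(3))
                           for a, b in zip(ra, rb))
det_value = det3(g)
print(orthogonality_defect, det_value)
```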

Note that $U(1)$ is the unit circle in the complex plane, and $SO(2) \approx U(1)$.

The matrix descriptions above are convenient, but invariant descriptions are sometimes more natural in applications. For example, if $V$ is an $n$-dimensional real vector space, then the vector space $\mathfrak{gl}(V)$ of all linear operators on $V$ becomes a Lie algebra under $[A, B] = AB - BA$. The invertible elements of $\mathfrak{gl}(V)$ form a group $GL(V)$ under composition of functions. Assigning to each operator its matrix relative to some fixed basis for $V$ gives a Lie algebra isomorphism $\mathfrak{gl}(V) \approx \mathfrak{gl}(n, \mathbf{R})$ and a group isomorphism $GL(V) \approx GL(n, \mathbf{R})$, the latter making $GL(V)$ a Lie group, with $\mathfrak{gl}(V)$ its Lie algebra.
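The identification of $U(1)$ with $SO(2)$ can be made concrete. One possible choice (an assumption of this sketch, not the only convention) sends the unit complex number $a + ib$ to the rotation matrix with rows $(a, -b)$ and $(b, a)$; the code checks on sample points that this map respects products and lands in $SO(2)$.

```python
import cmath

def to_rotation(z):
    """Send a unit complex number a + ib to the matrix [[a, -b], [b, a]]."""
    return [[z.real, -z.imag], [z.imag, z.real]]

def mat_mul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_diff(x, y):
    return max(abs(a - b) for rx, ry in zip(x, y) for a, b in zip(rx, ry))

# Two sample points on the unit circle U(1).
z = cmath.exp(0.8j)
w = cmath.exp(-2.1j)

# Homomorphism property: the matrix of zw is the product of the matrices.
product_defect = max_diff(to_rotation(z * w),
                          mat_mul(to_rotation(z), to_rotation(w)))

# The image has determinant |z|^2 = 1.
g = to_rotation(z)
det_g = g[0][0] * g[1][1] - g[0][1] * g[1][0]
print(product_defect, det_g)
```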

NEWTONIAN GRAVITATION

For the sake of comparison with the relativistic version in Chapter 13, we outline briefly the Newtonian description of planetary motion. As in Chapter 13, geometric units are used (Remark 6.7).

A mass $M$ is located at the origin of Euclidean space $\mathbf{R}^3$, and $\alpha$ is a particle of mass $m \ll M$ in $\mathbf{R}^3$. By Newton's law of gravitation, the central mass exerts a force $F = -(Mm/r^2)U$ on $\alpha$, where $r = |\alpha|$ and $U$ is the outward radial unit vector. By canonical identification, $U = \alpha/r$. By Newton's second law of motion, $F = m\alpha''$, where primes indicate derivatives relative to Newtonian time. Actually, two different notions of the mass of $\alpha$ appear above; it is assumed that they have the same value $m$. Less subtly, since $m \ll M$, we ignore the motion of the mass $M$. Then

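(1) $\alpha'' = -(M/r^3)\alpha$.

The vector field $\vec{L} = \alpha \times \alpha'$ is the angular momentum vector of $\alpha$ per unit mass.

(2) $\vec{L}$ is parallel, hence can be identified with a point of $\mathbf{R}^3$. If $\vec{L} \ne 0$, then $\alpha$ lies in a plane through the origin and does not pass through the origin.

Proof. $\vec{L}' = (\alpha \times \alpha')' = \alpha' \times \alpha' + \alpha \times \alpha'' = 0$, since $\alpha$ and $\alpha''$ are collinear. Also $\alpha \cdot \vec{L} = \alpha \cdot (\alpha \times \alpha') = 0$, and obviously $\alpha(t) = 0$ for some $t$ implies $\vec{L} = 0$.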
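Statement (2) can be observed numerically by integrating equation (1). The sketch below uses a classical fourth-order Runge-Kutta step, which is a choice made for this illustration rather than anything prescribed by the text; the value of $M$, the initial data, and the step size are likewise hypothetical. It tracks how far $\alpha \times \alpha'$ drifts from its initial value and how far $\alpha$ leaves the plane orthogonal to it.

```python
import math

M = 1.0  # central mass in geometric units (sample value)

def accel(p):
    """Right-hand side of equation (1): alpha'' = -(M / r^3) alpha."""
    r = math.sqrt(sum(c * c for c in p))
    return [-M * c / r**3 for c in p]

def axpy(p, h, k):
    """Return p + h*k componentwise."""
    return [pi + h * ki for pi, ki in zip(p, k)]

def rk4_step(p, v, h):
    """One Runge-Kutta step for the first-order system (p, v)' = (v, accel(p))."""
    k1p, k1v = v, accel(p)
    k2p, k2v = axpy(v, h / 2, k1v), accel(axpy(p, h / 2, k1p))
    k3p, k3v = axpy(v, h / 2, k2v), accel(axpy(p, h / 2, k2p))
    k4p, k4v = axpy(v, h, k3v), accel(axpy(p, h, k3p))
    new_p = [p[i] + h / 6 * (k1p[i] + 2 * k2p[i] + 2 * k3p[i] + k4p[i])
             for i in range(3)]
    new_v = [v[i] + h / 6 * (k1v[i] + 2 * k2v[i] + 2 * k3v[i] + k4v[i])
             for i in range(3)]
    return new_p, new_v

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

p, v = [1.0, 0.0, 0.0], [0.0, 0.9, 0.3]  # sample initial position and velocity
L0 = cross(p, v)                          # initial angular momentum per unit mass

momentum_drift = 0.0
plane_defect = 0.0
for _ in range(2000):
    p, v = rk4_step(p, v, 0.005)
    L = cross(p, v)
    momentum_drift = max(momentum_drift, max(abs(a - b) for a, b in zip(L, L0)))
    plane_defect = max(plane_defect, abs(dot(p, L0)))

print(momentum_drift, plane_defect)
```

Both numbers should be small, limited only by the accuracy of the integrator.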

If $\vec{L} = 0$, then $\alpha$ lies in a line through the origin, hence we can always assume that $\alpha$ lies in the $xy$-plane of $\mathbf{R}^3$. Then the angular momentum of $\alpha$ per unit mass is the number $L$ such that $\vec{L} = L\,\partial_z$. Shifting to polar coordinates replaces (1) by

(3) $r'' - r\varphi'^2 = -M/r^2, \qquad r\varphi'' + 2r'\varphi' = 0.$


The second equation here shows that $r^2\varphi'$ is constant. In fact a polar computation of $L$ gives

(4) $r^2\varphi' = L$ (Kepler's second law).

Assume $L \ne 0$, so $r$ and $\varphi'$ are never 0. Then the substitution $u = 1/r$ transforms the first equation in (3) to

(5) $d^2u/d\varphi^2 + u = M/L^2$ (the orbit equation).

This has general solution $u = M/L^2 + A\cos(\varphi - \varphi_0)$. By rotation of coordinates, we can arrange that $\varphi_0 = 0$ and $A \ge 0$. Then resubstituting $u = 1/r$ gives

(6) $r = \dfrac{L^2/M}{1 + e\cos\varphi}$, where $e = AL^2/M \ge 0$.
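To see that (6) really solves the orbit equation (5), one can test it with a finite-difference second derivative. In the sketch below, the sample values of $M$, $L$, and $A$, the step size, and the function names are assumptions made for illustration; the classification by eccentricity follows the standard ranges for conic sections.

```python
import math

M, L, A = 1.0, 1.2, 0.3      # sample constants (hypothetical values)
e = A * L**2 / M             # eccentricity, as in (6)

def r_of_phi(phi):
    """Formula (6): r = (L^2 / M) / (1 + e cos(phi))."""
    return (L**2 / M) / (1.0 + e * math.cos(phi))

def u_of_phi(phi):
    return 1.0 / r_of_phi(phi)

def orbit_residual(phi, h=1e-4):
    """Finite-difference value of u'' + u - M / L^2, which (5) says is zero."""
    second = (u_of_phi(phi + h) - 2.0 * u_of_phi(phi) + u_of_phi(phi - h)) / h**2
    return second + u_of_phi(phi) - M / L**2

def conic_type(ecc):
    """Classify the orbit by its eccentricity."""
    if ecc < 1.0:
        return "ellipse"
    if ecc == 1.0:
        return "parabola"
    return "hyperbola"

worst_residual = max(abs(orbit_residual(0.5 * k)) for k in range(12))
print(e, conic_type(e), worst_residual)
```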

Consulting an analytic geometry book we conclude that

(7) If $L \ne 0$, the particle $\alpha$ parametrizes a conic section of eccentricity $e$ with a focus at the origin (Kepler's first law).

This conic section is the orbit of $\alpha$: an ellipse ($0 \le e < 1$), parabola ($e = 1$), or hyperbola ($e > 1$).

Recall that if $X \in \mathfrak{X}(\mathcal{U})$ is a force field on a connected open set $\mathcal{U}$ in $\mathbf{R}^3$, and $V \in \mathfrak{F}(\mathcal{U})$ is a function such that $X = -\operatorname{grad} V$, then $X$ is said to be conservative and $V$ is a potential function for $X$. In this case, if $\alpha$ lies in $\mathcal{U}$, its total mechanical energy $\mathcal{E}$ is the sum of its kinetic energy $m\,\alpha' \cdot \alpha'/2$ and its potential energy $mV(\alpha)$. A simple computation verifies that $\mathcal{E}$ is constant. Henceforth we deal with $E = \mathcal{E}/m$, the total energy per unit mass of $\alpha$.

(8) On $\mathbf{R}^3 - 0$ the gravitational force field $F = -(M/r^2)\,\partial_r$ is conservative, and its potential function such that $V(\infty) = 0$ is $V = -M/r$.
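Statement (8) can be checked at a sample point by differentiating $V = -M/r$ numerically. The sketch below compares $-\operatorname{grad} V$, computed by central differences, with the radial field $-(M/r^2)\,\partial_r$ written in Cartesian components as $-(M/r^3)p$; the sample point, step size, and function names are assumptions made for illustration.

```python
import math

M = 1.0  # central mass (sample value)

def V(p):
    """Potential V = -M / r."""
    return -M / math.sqrt(sum(c * c for c in p))

def grad(f, p, h=1e-6):
    """Central-difference approximation to the gradient of f at p."""
    g = []
    for i in range(len(p)):
        forward, backward = list(p), list(p)
        forward[i] += h
        backward[i] -= h
        g.append((f(forward) - f(backward)) / (2.0 * h))
    return g

p = [1.0, 2.0, -2.0]                       # sample point with r = 3
r = math.sqrt(sum(c * c for c in p))

force_from_potential = [-g for g in grad(V, p)]
radial_force = [-M * c / r**3 for c in p]  # -(M / r^2) times the unit radial vector

force_defect = max(abs(a - b) for a, b in zip(force_from_potential, radial_force))
print(force_defect)
```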

Then in polar terms:

(9) $2E = r'^2 + L^2/r^2 - 2M/r$ (the energy equation).

Proof. We can suppose $m = 1$. In polar coordinates the kinetic energy $\alpha' \cdot \alpha'/2$ is $(r'^2 + r^2\varphi'^2)/2$. By (4), this becomes $\tfrac{1}{2}(r'^2 + (L^2/r^2))$. Since $V(\alpha) = -M/r$, the result follows.
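The polar form of the energy can be compared with the Cartesian one at a single state of motion. In this sketch the position and velocity in the $xy$-plane are arbitrary sample values, and the variable names are hypothetical; $r'$ and $L = r^2\varphi'$ are computed from the Cartesian data by the usual formulas.

```python
import math

M = 1.0                 # central mass (sample value)
x, y = 1.5, -0.4        # sample position in the xy-plane
vx, vy = 0.2, 0.8       # sample velocity

r = math.hypot(x, y)
r_dot = (x * vx + y * vy) / r   # radial velocity r'
L = x * vy - y * vx             # angular momentum per unit mass, r^2 * phi'

# Twice the energy per unit mass, first in Cartesian terms ...
two_E_cartesian = vx * vx + vy * vy - 2.0 * M / r
# ... then by the energy equation (9).
two_E_polar = r_dot**2 + L**2 / r**2 - 2.0 * M / r

energy_defect = abs(two_E_cartesian - two_E_polar)
print(two_E_cartesian, two_E_polar, energy_defect)
```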

The energy equation can also be interpreted as follows. If $r = r(t)$ is regarded as the position of a unit mass particle moving on the half-line $\mathbf{R}^+$, then its kinetic energy is $r'^2/2$. Equation (9) will express conservation of its energy if we introduce the artificial potential energy $V(r)/2$, where now $V(r) = L^2/r^2 - 2M/r$. In fact, (9) gives $E = \tfrac{1}{2}r'^2 + \tfrac{1}{2}V(r)$, and since $r'^2 \ge 0$, $E \ge V(r)/2$. Plot the graph of $V(r)/2$; then drawing horizontal lines at various heights $E$, as in the figure, shows how $E$ determines the range of $r$. Evidently $E < 0$ for elliptical orbits since then $r$ oscillates between finite endpoints, and $E > 0$ for hyperbolic orbits where incoming $r$ bounces off the graph to return to infinity.

[Figure C1: graph of $V(r)/2$ with horizontal energy levels; labeled levels include "Hyperbolic orbit" and "Parabolic orbit".]

(10) If $a$ is the semimajor axis of an elliptical orbit, then $a(1 - e^2) = L^2/M$.

Proof. By (6), $r_{\min} = L^2/M(1 + e)$, but by the geometry of ellipses, $r_{\min} = a(1 - e)$.

(11) If $T$ is the time for one elliptical orbit, then $MT^2 = 4\pi^2 a^3$ (Kepler's third law).

Proof. By (4), the rate at which $\alpha$ sweeps out area is $L/2$. Hence the area enclosed is $TL/2$. By the geometry of ellipses, this area is $\pi a^2(1 - e^2)^{1/2}$. Thus, using (10),

$$T^2L^2/4 = \pi^2 a^4(1 - e^2) = \pi^2 a^3 L^2/M.$$
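These relations can be assembled into a small calculation. Setting $r' = 0$ in the energy equation (9) gives the quadratic $2Er^2 + 2Mr - L^2 = 0$ for the endpoints of the range of $r$ on a bound orbit. The sketch below solves it for sample values of $M$, $L$, and $E < 0$ (all hypothetical, as is the function name), then checks (10) and the area argument behind (11).

```python
import math

def ellipse_from_energy(M, L, E):
    """Turning points and orbital elements of a bound orbit (E < 0, L != 0).

    The turning points solve 2E r^2 + 2M r - L^2 = 0, i.e. (9) with r' = 0.
    """
    if E >= 0 or L == 0:
        raise ValueError("need E < 0 and L != 0 for an elliptical orbit")
    disc = M * M + 2.0 * E * L * L
    if disc < 0:
        raise ValueError("no orbit exists with this energy and angular momentum")
    root = math.sqrt(disc)
    r_min = (-M + root) / (2.0 * E)
    r_max = (-M - root) / (2.0 * E)
    a = 0.5 * (r_min + r_max)                  # semimajor axis
    e = (r_max - r_min) / (r_max + r_min)      # eccentricity
    T = 2.0 * math.pi * math.sqrt(a**3 / M)    # period from M T^2 = 4 pi^2 a^3
    return r_min, r_max, a, e, T

M, L, E = 1.0, 1.0, -0.3   # sample values
r_min, r_max, a, e, T = ellipse_from_energy(M, L, E)

# (10): a (1 - e^2) = L^2 / M.
semilatus_defect = abs(a * (1.0 - e**2) - L**2 / M)

# Area of the ellipse versus area swept at the constant rate L / 2 over one period.
area = math.pi * a**2 * math.sqrt(1.0 - e**2)
area_defect = abs(area - T * L / 2.0)

print(r_min, r_max, a, e, T, semilatus_defect, area_defect)
```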

REFERENCES

Following the works specifically referred to in the text, a few general references are listed.

Ambrose, W. Parallel translation of Riemannian curvature, Ann. of Math. 64 (1956), 337-363.
Beem, J. K., and P. E. Ehrlich. Global Lorentzian Geometry, Dekker, New York, 1981.
Bishop, R. L., and S. I. Goldberg. Tensor Analysis on Manifolds, Dover, New York, 1980.
Bishop, R. L., and B. O'Neill. Manifolds of negative curvature, Trans. Amer. Math. Soc. 145 (1969), 1-49.
Cheeger, J., and D. G. Ebin. Comparison Theorems in Riemannian Geometry, North-Holland Publ., Amsterdam, 1975.
Chevalley, C. Theory of Lie Groups, Princeton Univ. Press, Princeton, New Jersey, 1946.
Einstein, A., et al. The Principle of Relativity, Dover, New York, 1952.
Geroch, R. P. Domain of dependence, J. Math. Phys. 11 (1970), 437-449.
Gromoll, D., W. Klingenberg, and W. Meyer. Riemannsche Geometrie im Grossen, Springer-Verlag, Berlin and New York, 1968.
Guillemin, V., and A. Pollack. Differential Topology, Prentice-Hall, Englewood Cliffs, New Jersey, 1974.
Helgason, S. Differential Geometry, Lie Groups, and Symmetric Spaces, Academic Press, New York, 1978.
Hawking, S. W., and G. F. R. Ellis. The Large Scale Structure of Space-time, Cambridge Univ. Press, London and New York, 1973.
Kobayashi, S., and K. Nomizu. Foundations of Differential Geometry, Wiley (Interscience), New York, Vol. I, 1963; Vol. II, 1969.
Lang, S. Introduction to Differentiable Manifolds, Wiley (Interscience), New York, 1962.
Marcus, L. Cosmological Models in Differential Geometry (mimeographed notes), University of Minnesota, Minneapolis, 1963.


Massey, W. S. Algebraic Topology: An Introduction, Springer-Verlag, New York and Berlin, 1977.
Matsushima, Y. Differentiable Manifolds, Dekker, New York, 1972.
Milnor, J. Morse Theory, Princeton Univ. Press, Princeton, New Jersey, 1963.
Misner, C. W., K. S. Thorne, and J. A. Wheeler. Gravitation, Freeman, San Francisco, 1973.
O'Neill, B. Elementary Differential Geometry, Academic Press, New York, 1966.
O'Neill, B. The fundamental equations of a submersion, Michigan Math. J. 13 (1966), 459-469.
Palais, R. S. A global formulation of the Lie theory of transformation groups, Mem. Amer. Math. Soc. 22 (1957).
Spivak, M. A Comprehensive Introduction to Differential Geometry, Vols. I-V, Publish or Perish, Berkeley, California, 1970, 1975.
Singer, I. M., and J. A. Thorpe. Lecture Notes on Elementary Topology and Geometry, Springer-Verlag, New York and Berlin, 1976.
Steenrod, N. Topology of Fibre Bundles, Princeton Univ. Press, Princeton, New Jersey, 1951.
Sachs, R. K., and H. Wu. General Relativity for Mathematicians, Springer-Verlag, New York and Berlin, 1977.
Taylor, E. F., and J. A. Wheeler. Spacetime Physics, Freeman, San Francisco, 1966.
Vick, J. W. Homology Theory, Academic Press, New York, 1973.
Warner, F. W. Foundations of Differentiable Manifolds and Lie Groups, Scott, Foresman, Glenview, Illinois, 1971.
Wolf, J. A. Spaces of Constant Curvature, Publish or Perish, Berkeley, California, 1977.

Bishop, R. L., and R. J. Crittenden. Geometry of Manifolds, Academic Press, New York, 1964.
Boothby, W. An Introduction to Differentiable Manifolds and Riemannian Geometry, Academic Press, New York, 1975.
Chern, S. S. Pseudo-Riemannian geometry and Gauss-Bonnet formula, Ann. Acad. Brasil. Ciênc. 35 (1963), 17-26.
Frankel, T. Gravitational Curvature, Freeman, San Francisco, 1979.
Geroch, R. P. Spacetime structure from a global viewpoint. In General Relativity and Cosmology (R. K. Sachs, ed.), Academic Press, New York, 1971.
Penrose, R. Techniques of Differential Topology in Relativity, Regional Conference Series in Applied Mathematics, Vol. 7, SIAM Publications, Philadelphia, 1972.
Smith, J. W. Lorentz structures on the plane, Trans. Amer. Math. Soc. 95 (1960), 226-237.
Thorpe, J. A. Elementary Topics in Differential Geometry, Springer-Verlag, New York and Berlin, 1979.
Weinberg, S. Gravitation and Cosmology, Wiley, New York, 1972.
Wu, H. Holonomy groups of indefinite metrics, Pacific J. Math. 20 (1967), 351-392.


INDEX

A Acausal set, 415 Acceleration, 66, 159 Achronal set, 413 Action of group, 254 transitive, 255 Ad(H)-invariance, 301 Age of universe, 352-353, 362 Angular momentum Newtonian, 453 Schwarzschild, 373-374, 380 Anti-isometry, 92 Arc length, 131 first variation of, 263, 264 local maximum and minimum of, 272276 second variation of, 266 Associated Newtonian particle, 168 Atlas, 2 At rest in Newtonian space, 161 in Schwarzschild exterior, 372 Automorphism of Lie algebra, 302 of Lie group, 300

B Backwards Schwarz inequality, 144 Backwards triangle inequality, 144

Basis theorem, 8 Beem and Buseman’s example, 209 Bending of light, 381 Bianchi identity first, 75 second, 76 Big bang, 348 Big crunch, 348 Bi-invariant metric, 304-306, 330 Bilinear form, symmetric, 46, 53 Black hole, 367, 392-394 formation of, 385 no escape from, 392 Bochner’s theorem, 259 Boost, 236 Bracket operation, in Lie algebra, 447 on vector fields, 13, 31 Bump function, 6 C

Canonical isomorphisms, 26 Cartan, E., 224 Cauchy development, 419-423 Cauchy horizon, 428-431 Cauchy hypersurface, 415-417, 431, 436 future, 438 Causal character of curve. 69

459

460 Index of submanifold, 142 of vector, 56 of vector subspace, 141 Causal future [past], J' [J-I, 402 Causal cone, 146. 155 Causal (nonspacelike) curve, 146 Causality conditions, 407 Causality in Lorentz manifolds, 293-298, 401-437 Causality relations, 402 Causality in special relativity, 165-166 Chain rule, 10 Christoffel symbols, 62 Chronological future [past], I' [I-], 402 Chronology condition, 407 Clifton-Pohl torus. 193, 260 Closed geodesic, 192 Closed subgroup, 447 Codazzi equation, 115 Codimension, 20,98 Collision, 179-181 Complete atlas, 2, 33 Complete semi-Riemannian manifold, see Geodesic completeness Complete, metrically, 138 Complete vector field, 29 Complex Grassmann manifold, 326, 327 Complex hyperbolic space, 327 Complex projective space, 327-329 Complex structure J on a vector space, 324 Component, connected, 21 Components of tensor, 39.52 Conformal mapping, 92 Congruence, 102, 120 Conjugate point, 270-273 on cospacelike geodesics, 274-277, 299 on null geodesics, see Focal points Connectedness, 21, 72 geodesic, 138 Connection, 59 Levi-Civita, 61 natural, on RE, 59,62 normal, 118 Conservation of energy-momentum, 179-181 infinitesimal, 335 of Newtonian energy and momentum, 453-454 Conservation lemma, 252 Constant curvature, 79-80 manifolds of, 113,223, 227-231

Contraction of tensor, metric, 83 natural, 40-42 Contravariant tensor, 37 Convergence k, 287-288, 292,43 1-435. Convex open covering, 131 Convex open set, 129 Coordinate expression, 4, 5 Coordinate function, 2 Coordinate neighborhood, 3 Coordinate system, 1-3 Coordinates, adapted to subset, 16 associated Lorentz, 167 cylindrical, on R', 63 Euclidean, 159 Kruskal-spherical, 390 Lorentz (or inertial), 164 natural, on R",1 normal, 72-73 null, 153, 156 Schwarzschild, 152 Schwarzschild-spherical, 370 spherical, on R', 94-95 Coset manifold, 306-309 Cosmological model, 34 1 Cospacelike geodesic, 273 Cotangent space, 14 Covariant derivative, 59, 64 normal, 114, 119 of vector field on curve, 65 Covariant differential, 64 Covariant tensor, 37 Covering manifold, 444 map, 443 semi-Riemannian, 191, 201-202 Covering, by subsets, 21 open, 21 Critical point, 33 O ~ E 290 , of length function, 268 Cross section (or section), xiii local, 32 Curl, 95 Curvature Gaussian, 81 Gauss-Kronecker , 197 holomorphic, 325 mean, see Mean curvature normal, see Normal curvature vector Ricci, 87-89

Index Riemannian, see Curvature tensor scalar, 88 sectional, 77-79 Curvature and gravity, 334 Curvaturelike function, 79 Curvature operator, 74 Curvature tensor, Riemannian (or Riemann-Christoffel), 74, 96 components of, 76, 83 normal, 125 sign of, 89 symmetries of, 75 Curve, 10 causal (nonspacelike), 146 null, 69 parameter, 122 periodic, 29 piecewise smooth, 11 regular, 11 spacelike, 69 timelike, 69 Curve segment, 11 D Dajczer and Nomizu’s criteria, 53 Deck transformation, 185 Derivation, 12 de Sitter spacetime, 229 Diameter of Riemannian manifold, 279 Diffeomorphism, 55 Differential form,43 Differential of function, 15, 33 Differential map, 9 Dimension of manifold, 3 Direct product and direct sum, 34 Displacement vector, 131, 165 Distance Riemannian, 134, 136-138 in Robertson-Walker cosmology, 347 in special relativity, 166, 171 Distant parallelism, 67 Divergence, 195, 213 Dot product, 1, 47 Duality of symmetric spaces, 321-323 Dust, see Friedmann cosmological models

E E = mc2, 177 Edge of achronal set, 413 Einstein, A., 54, 172, 173, 177, 332, 334, 336

461

Einstein addition law, 172 Einstein-de Sitter model, 352, 356, 357 Einstein equation, 336 Einstein gravitational tensor, 336 Einstein manifold, 96 Endpoint of extendible curve, 30 Energy density of perfect fluid, 339 Newtonian, 159, 177 relativistic, 177-180, 333 Schwarzschild, 374 Energy-momentum conservation of, 179-180 of lightlike particle, 178, 333 of material particle, 176, 333 as source of gravity, 335 €-neighborhood, 134 Euclidean space, 1, 3, 55, 228 Evelyn and Jean, 183 Evenly covered, 443 Event, 160, 163, 333 Event horizon, 438 Expansion of universe, 347-348 Exponential map, 70-71 examples of, 73, 104 Lie, 449-450 normal, 199 Extendible curve, 30, 438 Extendible geodesic, 68, 130 Extendible manifold, 155, 157 Extrinsic geometry, 102

F Fiber of vector bundle, 197 of warped product, 204, 205 First variation of arc length, 263-265 of E, 289 Fixed endpoint homotopy, 441 Flat manifold, 79 Flow, 29 Focal point, 283-284 on cospacelike geodesics, 285-288 on null geodesics, 290-293, 296,298 Force, 159 Frame (orthonormal), 84 Frame field, 84-85 Frame-homogeneous manifold, 258-259, 260 Free fall, 164, 334

462 Index Friedmann cosmological models, 350-353, 356-357, 362 Fundamental group, 442 Future, 163 Future Cauchy hypersurface, 438 Future cone (causal, null, timelike), 163 Future-convergence, 435, 436 Future-pointing curve, 163 Future-pointing vector, 163

G Galaxies, 341 idealizations of, 341-342 Gauss, K. F., 54, 74 Gauss equation, 100, 101, 107 Gauss lemma, 126-127 Gauss map, 1% Gaussian curvature, 81, 124 General (or full) linear group GL(n,R),446 complex, 449 General relativity, foundations of, 332-337 Geodesic, 67 broken, 72 causal character of, 69 closed, 192-193 inextendible (maximal), 68 minimizing (shortest), maximizing (longest), 136-138, 156,409,411 locally, 272, 276 Geodesic completeness, 68 null, 154 from point, 138 spacelike, 154 timelike, 154 Geodesics in cylinders, 148 in hyperquadrics, 149-150 local properties of, 133-135, 147-148 in submanifolds, 102-103 in surfaces, 150-153 variational properties of, 263-299 Geodesic symmetry, 223 Geometric units, 162 Geroch, R. P., 423 Geroch’s example, 154 G-invariant metric, 3 10 Global hyperbolicity, 412 Global symmetry, 224, 231 G-orientation, 241 Gradient, 85

Grassmann manifold, real, 308, 310 complex, 326, 327

H Hadamard’s theorem, 278 Hawking, S. W.,401 Hawking’s singularity theorem, 43 1-434. Hermitian scalar product, 324 Hessian of function, 33, 86 and index form, 268-269, 290 Holomorphic curvature, 325 Homogeneous space, 257 naturally reductive, 312 normal, 330 reductive, 310 Homomorphism of fundamental groups, 443 of Lie algebras, 448 of Lie groups, 329 Homothety, 92 Homotopy, fixed endpoint, 441 Hopf-Rinow theorem, 138 Hopf‘s corollary, 228 Horizontal subspace, 205-212 vector (field), 205, 212 Hubble law, 347 Hubble time, 348 Hyperbolic space, I 1 1, 113, 156, 228, 278 upper and lower imbeddings of, 111 Hyperbolic angle, 144, 156 Hyperquadric, 108-1 10 curvature of, 113 frame-homogeneity of, 113 geodesics in, 112-113, 149-150 isometries of, 113, 239 product manifold structure of, 110 sign of, 108 Hypersurface, 20, 106-108, 124 totally umbilic, 116-1 18

I Identity component Go, 447 Imbedding smooth, 19 isometric, 122

Index Immersed submanifold, 19 Immersion smooth, 19 isometric, 121-122 Impact parameter, 381 Incomplete, see Complete entries Indefinite (or semi-) unitary group U ( p , q ) , 324 fundamental group, 329 Lie algebra, 324 Index form, 269 of scalar product, 47 of scalar product space, 51 of semi-Riemannian manifold, 55 Induced connection on submanifold, 98-99 Induced topology, 15-16 Inextendible, see Extendible entries Inner product, 47 Instantaneous observer, 180, 333 Integral curve, 27 maximal, 28 Interpretations of tensors, 36 Inverse function theorem, 10 Involutive map, 231 Isometric invariant, 58 Isometry, 58 local, 90-91 Isometry group, 233 as Lie group, 255 Isotropy local spatial, 342 observed, 341 Isotropy subgroup, 255

J

Jacobi equation, 216 Jacobi field, 216, 232 Jacobi identity, 13, 447 Jacobian function, 196 Jacobian matrix, 10

K Kahler manifold, 325 Kepler’s laws, 453-455 Killing form, 302 Killing vector field, 250-256 Kinetic energy, 159

463

Kobayashi’s proposition, 321 Koszul formula, 6 1 Kruskal plane, 386-388, 399, 400 Kruskal spacetime, 389-391 geodesics in, 395-398 isometries of, 399 truncated, 392 Kulkami’s theorem, 229

L Laplacian, 86, 213 Leaf of warped product, 205 Left- and right-multiplication, 447-448 Left-invariant vector field, 448 Levi-Civita connection, 61 Lie algebra, 447 abelian, 301 of Lie group, 448 semisimple, 306 Lie bracket, see Bracket operation Lie derivative, 46, 53, 195, 250 Lie exponential map, 449, 450 Lie groups basic theory and examples, 446-452 further properties, 300-304; see also individual Lie groups, e.g., Orthogonal group Lie subspace, 310 Lift of functions, vectors, and vector fields, 25, 205 of mapping, 443 of tensors, 210 Lightcone, see Nullcone Lightlike; see atso Null entries Light-like particle, 163, 380-384, 392-393; see also Null geodesic Lightlike submanifold, 142 Lightlike subspace, 141 Limit sequence, 405 Linear operator, 242-243 Linear isometry, 51 Linear isomorphism, xiii Linear isotropy group, 3 11 Line element, 56 Local diffeomorphism, 10 Local isometry, 90, 91 Local section, 32 Locally symmetric semi-Riemannian manifold, 215, 219-224

464 Index Loop (or closed curve), 186 Lorentz, H. A., 158 Lorentz coordinate system, 164, 167 Lorentz group, 235, 240 Lorentz manifold, 55, 126, 143-149; see also Causality Lorentz surface, 150-153 Lorentz vector space, 140 Lorentz-Fitzgerald contraction, 175 M

Manifold Lorentz, 55 Riemannian, 55 semi-Riemannian, 54 smooth, 3 construction of, 23 topological, 413 Map (mapping), smooth, 5 Marsden’s proposition, 258 Mass of particle, 159, 163, 177 of Schwarzschild and Kruskal spacetimes, 367, 389 Matched covering, 203 Material particle, 163 Matter, 335 Maximal geodesic, 68 Maximal integral curve, 30 Mean curvature, 101, 123, 124 Metric equivalence, 60 Metric tensor, 54 Microwave background radiation, 357 Minimizing geodesic, 136 Minkowski, H., 158 Minkowski space(time), 55. 163 Misner-completeness, 156 Momen turn Newtonian, 159 relativistic, 177-178, 180 Myers’ theorem, 279

N Natural coordinate functions on R”, 1 Newtonian gravitation, 453-455 Newtonian motion, 159-161 Newtonian space, 159 Newtonian time, 159 Nondegenerate bilinear form, 46

Nondegenerate subspace, 49 nor (normal projection), 98, 205, 344, 369 Normal bundle, 198 exponential map of, 199 Normal connection, 114, 118 covariant derivative, 114, 119 curvature tensor, 125 parallel translation, 119 Normal coordinates, 72-73 Normal curvature vector, 105, 106, 108 Normal neighborhood of point, 71 of submanifold, 199 Nullcone, 53, 56, 109, 128 Null geodesic and causality, 404,430-431, 435-437 closed nonperiodic, 193 focal points on, 290-298 in hyperquadrics, 149-150 Kruskal, 400 Robertson- Walker, 353-357 in surfaces, 152, 153; see also Lightlike particle Nullspace, 53 of index form, 272, 285 Null vector, 48, 56 0

Observer, 167 instantaneous, 180, 333 Schwarzschild, 371 Observer field, 358 geodesic, 358 irrotational, 358 proper time synchronizable, 359 synchronizable, 359 One-form, 15 One-parameter subgroup, 449 Open submanifold, 3-4 Orbit free fall, in Schwarzschild spacetime, 374-384 manifold, 187, 188, 191 of point, 187 Order of conjugate point, 271 of focal point, 283, 291 Orientation (orientability, oriented) covering manifold, 190 of hypersurface, 189, 197

index of manifold, 23, 189, 195, 214 natural, of R",189 space-orientation, 237, 240-242 time-orientation, see Time-orientation of vector bundle, 198 of vector space, 189 orientation-preserving map, 190 Orientation-reversing map, 190 Orthogonal coordinate system, 64 Orthogonal group O(n),451 components of, 238 Lie algebra of, 451 Orthogonal projection, 50 Orthogonal vectors, 48 Orthonormal basis, 50 Orthonormal expansion, 50

Poincark half-plane, 94, 151, 260 Polar map, 221 Position vector field, 26, 128 Pregeodesic, 69 Product manifold smooth; 4, 24 semi-Riemannian, 57, 89 Product rule, 44 Projective space complex, 327 real, 188, 192, 247, 259 Properly discontinuous group, 188 Proper time, 163 Pseudohyperbolic space, 110; see also Hyperquadric Pseudosphere, 110; see also Hyperquadric Pullback, 42

P Pair isometry, 102 Paracompactness, 22 Parallel tensor field, 65 Parallel translation, 66 Parallel vector field on curve, 66 Parameter curve, 122 Partial derivative on manifold, 7 Particle, Newtonian, 159 relativistic (material and lightlike), 163 Partition of unity, 22 Past (dual of future), 163, 402; for Past entries, see Future entries Penrose, R., 401 Penrose's singularity theorem, 434-437 Perfect fluid, 337-339, 361, 362 energy density of, 339 pressure of, 339 Robertson-Walker model of, 345-347, 362 Perihelion advance, 378-380 Perp operation, 49 +related vector fields, 14 Photon, 163 frequency and wavelength of, 179 Physical equivalence, 336 Physical singularity, 348 Piecewise smooth curve, I 1 Piecewise smooth curve segment, 11 Piecewise smooth variation, 264 Poincark, H., 158,442

465

Q Quadratic form, 47 Quasi-limit, 404 Q ( v , w ) , 77

R Radiation cosmological model, 353, 362363 Redshift, cosmological, 354 Reductive homogeneous space, 310 naturally, 312 Relative speed, 172 Reparametrization of a curve, 11, 132 Rest photon in Kruskal spacetime, 393, 395 Restspace in general relativity, 358 in special relativity, 171 Ricci curvature, 87 Ricci equation, 125 Ricci flat, 87 Riemann, G.F.B., 54 Riemannian manifold (metric), 55 completeness of, 138 existence of, 140 Riemannian symmetric space, 319-321 of compact type, 319 of noncompact type, 319 Robertson-Walker spacetime construction of, 341-343 cosmology of, 347-350

466 Index geodesics in, 353 perfect fluid in, 345-346 space of, 343 S

Scalar curvature, 88 Scalar product, 47 Hermitian, 324 natural Hermitian, 324 Scalar product space, 48-53 as semi-Riemannian manifold, 58-59 Schwarz inequality, 141 Schwarzschild black hole, 367 Schwarzschild exterior, 367 Schwarzschild free fall orbits, 374-384 Schwarzschild half-plane P I , 251 Schwarzschild observers, 371 Schwarzschild radius function, 365 Schwarzschild spacetime N U B construction of, 364-367 curvature of, 369 extension of, 386-390 geodesics in, 372-384 Schwarzschild strip PIi,367 Schwarzschild time function, 364 Second countability, 21 Second fundamental form, 100, 107 Sectional curvature, 77, 124 Self-adjoint linear operator (matrix), 243, 260-262 Semi-Euclidean space, 55 geodesics in, 69 isometries of, 239-240 Semiorthogonal group Ov(n)= O(p,q), 234 components, 236-238 fundamental group, 310 Lie algebra, 235 Semiorthogonal linear operator (matrix), 234-238, 243, 262 Semi-Riemannian covering, 191, 201 Semi-Riemannian hypersurface, see Hypersurface Semi-Riemannian manifold, 54 homogeneous, see Homogeneous space isotropic, 260 locally symmetric, 215, 219-224 symmetric, see Symmetric space Semi-Riemannian submanifold, 57, 97-125 Semi-Riemannian submersion, 212-21 3

Semi-Riemannian surface, 80-81, 94, 124, 150-153, 156, 262 Semi-Riemannian warped product, see Warped product Separation, 166 Shape operator, 107, 119 Shape tensor, 100, 118-121 Sign of curve, 263 of hypersurface, 106 of curvature tensor, 89 Signature, 50 matrix, 234 Simple connectedness, 442 Simply connected covering, 444 Singularity theorem, 401 Hawking’s, 431-432 Penrose’s, 436 Smooth Euclidean function, I Smooth mapping of manifolds, 4-5 Smooth overlap, 2 Space form, 227, 243-248 classification of simply connected, 227228 positive and negative, 244 Spacelike curve, 69 Spacelike submanifold (or Riemannian submanifold), 57, 142 Spacelike subspace, 141 Spacelike vector, 56 Space-orientation (Space-orientability, Space-oriented), 237, 240-242 Spacetime, 163; see also individual spacetimes Special orthogonal group SO(n) = O + ( n ) , 452 compactness and connectedness of, 237, 309-3 10 fundamental group of, 310 Lie algebra of, 45 Special relativity, 158-184 and general relativity, 332-333 Special unitary group SU(n), 452 Lie algebra of, 452 simple connectedness of, 310 Speed of light, 161-162 Sphere, geometry of, 57, 94, 101, 103-104, 105, 113, 137, 228, 239, 271, 318-319 as smooth manifold, 3, 20, 307 Star, 364, 384-385

Static spacetime, 360-361, 363 Stress-energy tensor, 335-337, 340 of perfect fluid, 337-339 Strong energy condition, 341 Submanifold, semi-Riemannian, 57, 97-125 extremal, 299 totally geodesic, 104-106, 125 totally umbilic, 106, 108 smooth, 15-18 open, 3-4 Submersion, semi-Riemannian, 212-213 smooth, 20-21, 32, 33 Support, 6 Surface, 3; see also Semi-Riemannian surface Surface theory, classical, 124 semiclassical, 262 Symmetric space, 224, 231 Lie construction of, 315-317 Riemannian, 319-321 Symplectic group Sp(n), 451, 452 topological properties of, 309-310 Synchronizable observer field, 359 Synge’s formula, 265

T tan (tangential projection), 98, 205, 344, 369 Tangent bundle, 26-27 Tangent space, 7 Tangent vector, 6-7 Tensor (field), 35 components of, 39 contravariant, 35, 37 covariant, 35, 37, 42-43 derivation, 43, 52 metric equivalence of, 83 multiplication, 36 at point, 37 type, 35 type-changing, 81-84; see also Contraction Test particle, 337 Tidal force (Ricci operator), 216, 219, 278, 299, 335, 362 Schwarzschild, 385-386, 399

Time Newtonian, 159-160 proper, 163 Timecone, 143 Time dilation, 171 Time function, 359 Timelike curve, 69 piecewise smooth, 146 Timelike convergence condition, 340 Timelike submanifold, 142 Timelike subspace, 141 Timelike vector, 56 Time-orientation (Time-orientability, Time-oriented), 144-145, 194, 237, 240-242 Time-separation, 409-411 Topological hypersurface, 413 Topological manifold, 413 Topological properties of manifolds, 21-23 Torsion tensor, 93 Totally geodesic submanifold, 104, 125 Totally umbilic submanifold, 106, 116-118 Trace form, on matrix Lie algebra, 303 Transferred vector field, 14 Transvection, 231 Trapped subset, 435 Trapped surface, 435 Twin paradox, 173 Two-parameter map, 122-123

U Umbilic point, 105 Unitary group U(n), 451 compactness and connectedness of, 309-310, 329 fundamental group of, 310 Lie algebra of, 452 Unit speed curve, 132 Units, geometric and conventional, 162

V Vacuum, 337 Variation of arc length, 263-288 of curve, 215 of E, 288-293 geodesic, 216 vector field (infinitesimal variation), 216 Vector bundle, 197

Vector field on curve, 65 perpendicular, 218 tangent, 218 on manifold, 12 on mapping, 27 φ-related, 14 on submanifold, 97 normal, 98 tangent, 98 Vector space as manifold, 25-26 Velocity, 10, 171 parameter, 171 Vertical subspace, 205, 212

Vertical vector (field), 205, 212 Volume element, 195

W

Warped product, 204 causality in, 417-418 curvature of, 210-211 fibers, 205 geodesics in, 207-209 leaves, 205 warping function of, 204 Worldline, 160, 163
