Theory of Relativity Based on Physical Reality


Author: L Janossy


THEORY OF RELATIVITY BASED ON PHYSICAL REALITY

by

L. JÁNOSSY

AKADÉMIAI KIADÓ, BUDAPEST 1971

© Akadémiai Kiadó, Budapest 1971
Printed in Hungary

CONTENTS

INTRODUCTION

Chapter I. A SURVEY OF SOME EXPERIMENTAL RESULTS RELATING TO THE PROPAGATION OF LIGHT AND CONNECTED PHENOMENA
A. The critical velocity
B. Measurement of the velocity of propagation of light signals
    1. A general remark
    2. The suggestion of Galilei
    3. Astronomical methods
        a. Observations of Römer
        b. The aberration of light
    4. Laboratory methods
        a. Methods of Fizeau and of Foucault
        b. The experiments of Fizeau and Foucault
    5. The propagation of light in refractive media
        a. The refractive index
        b. The determination of the velocity of light in a refracting medium
C. Interferometric methods for the measurement of the velocity of light
    1. The Michelson interferometer
    2. Measurement of the phase velocity of light
    3. The experiment of Fizeau
D. The Doppler effect
    1. Moving source
    2. Observation of the Doppler effect
E. The perpendicular Doppler effect
    1. Experimental observations
    2. Physical interpretation of the perpendicular Doppler effect
    3. Considerations in connection with the Doppler effect
        a. The question of the moving observer
        b. Acoustical and electromagnetic Doppler effects
F. Some relativistic effects
    1. Variation of decay time with velocity
    2. Change of mass with velocity
    3. Remark on the methods of measuring the mass velocity relation

Chapter II. INVESTIGATIONS CONCERNING THE CARRIER OF ELECTROMAGNETIC WAVES
A. The question of the ether
B. Experimental investigations
    1. Rotation of the Earth
    2. The Sagnac experiment
    3. The experiment of Michelson and Gale
C. Translational motion relative to the ether
    1. Propagation of a spherical signal
    2. Propagation of light relative to a moving system of reference
    3. The Michelson-Morley experiment
    4. The interpretation of the negative result of the Michelson-Morley experiment
    5. Considerations concerning the contraction hypothesis
D. Experiments of the Michelson-Morley type
    1. The Trouton-Noble experiment
    2. The experiment of Isaak and co-workers
E. General remarks concerning the series of negative results
    1. Generalization of negative experiences

Chapter III. THE PROBLEM OF MEASUREMENT
A. The problem of measures
    1. Representations
    2. An example: the measure of electric charge
    3. Distinguished representations
    4. Measures of lengths
B. Systems of space coordinates
    1. Determination of coordinate vectors
    2. Explicit determination of coordinate measures
    3. Question of consistency
    4. Various representations
C. Problems connected with coordinate representations
    1. Remark on "non-Euclidean" geometry
    2. Coordinate transformations and deformations
    3. Orthogonal transformations
        a. Definitions
        b. Group character of orthogonal matrices
    4. Rigid bodies

Chapter IV. THE LORENTZ TRANSFORMATION
A. The time scale
    1. General remarks
    2. Atomic time scale
    3. Systems of reference constructed with the help of light signals
B. The Lorentz transformation as coordinate transformation
    1. The explicit form of the Lorentz transformation
    2. The physical significance of the parameters of the Lorentz matrices
C. Homogeneous propagation of light
    1. The concept
    2. Test for homogeneous propagation of light
    3. Connection between various representations
        a. Transformations of the propagation tensor
D. The relation of systems of references obtained with light signals and with solids

Chapter V. THE LORENTZ PRINCIPLE
A. The Lorentz transformation as deformation
    1. Deformation operators
    2. Lorentz deformations
    3. Particular types of Lorentz deformations
B. Formulation of the Lorentz principle
    1. Interpretation of the negative results of ether drift experiments in terms of the Lorentz principle
    2. Non-orthogonal representations
    3. General remarks on the Lorentz principle
C. The dynamical principle
    1. The mechanism of the Lorentz deformation
        a. Relaxation processes
        b. Comparison between change of temperature and change of translational state
        c. Deformations of unconnected systems
        d. Length contraction of non-connected systems
    2. Significance of subgroups of the Lorentz group

Chapter VI. THE INNER CONSISTENCY OF THE LORENTZ PRINCIPLE
A. Kinematical considerations in connection with the Lorentz principle
    1. Addition of velocities
    2. Addition formula and Lorentz deformations
B. Considerations about contraction of solids and the slowing down of clocks
    1. The clock "paradox"
    2. The "paradox of the twins"

Chapter VII. RELATIVISTIC MECHANICS
A. Momentum and energy
    1. Newton's first law
    2. Elastic collisions
    3. Inelastic collisions
B. Equivalence of mass and energy
    1. Remark on the mechanism of increase of mass with energy
C. Distant collisions
    1. Experimental evidence
D. Mechanical laws in terms of four-vectors and tensors
    1. Newton's laws
    2. The energy-momentum tensor

Chapter VIII. THE ELECTROMAGNETIC FIELD
A. Maxwell's equations
    1. Another formulation
B. Solutions of Maxwell's equations
    1. Gauge transformation
    2. Retarded potentials
    3. Advanced potentials
    4. Wandering waves
C. Maxwell's equations in terms of four-tensors
    1. Retarded four-potential
    2. The motion of light signals in terms of Maxwell's equations
D. Maxwell's equations and the Lorentz principle
    1. The field of a point charge

Chapter IX. RELATIVISTIC EFFECTS OF THE ELECTROMAGNETIC FIELD
A. Effects of the first order
    1. Effective field strengths
    2. The field of dipoles
B. Transformation properties of four-currents
    1. The electric field of a moving current
C. Further effects of the first order
    1. Doppler effect and aberration
    2. Frequencies of the Doppler effect
    3. Effect of aberration
    4. Intensities in the Doppler effect
    5. Observation of the effect of aberration of star light
    6. Propagation of light in a refracting medium
        a. Dispersion
    7. The experiment of Fizeau
D. Effects of the second order
    1. Action of a charge upon itself
    2. Mass defect
E. Relativistic mechanics of a continuum
    1. Interpretation of the Trouton-Noble experiment
F. Transient phenomena

Chapter X. THEORY OF GRAVITATION
A. Observational facts
B. Statement of the problem of the theory of gravitation
    1. Mathematical formulation of the problem
    2. Experimental criteria for homogeneous regions
        a. An example
    3. Construction of straight systems of references
        a. Locally homogeneous regions
        b. Criteria for homogeneous regions
    4. Almost straight system of reference
    5. Similar regions
C. The generalized Lorentz principle
    1. The Lorentz principle formulated in terms of curved coordinates
    2. Generalization to inhomogeneous regions
        a. A physical example
    3. The Lorentz principle valid for small physical systems
        a. First approximation
        b. A second approximation
    4. The ambiguity in the formulation of the Lorentz principle

Chapter XI. APPLICATIONS OF THE GENERALIZED LORENTZ PRINCIPLE
A. Geodetic orbits
    1. Definition
    2. Lorentz invariance of geodetic orbits
        a. Geodetic orbits and the Lorentz principle
B. Equation of motion in a gravitational field
    1. Variational principles
        a. Deviations from geodetic orbits
    2. The physical contents of the variational principle
C. Connection between a gravitational field and the propagation of light
    1. The equations of motions in a gravitational field
    2. Integrals of the equations of motions
        a. Perihelion motion
        b. The deflection of light in the vicinity of the Sun
        c. The red shift of spectral lines
D. Connection between the sources of gravitation and the propagation tensor g
    1. Einstein's equations of gravitation
    2. Energy momentum considerations
    3. The Schwarzschild solution of the gravitational equations
    4. The relativistic effects in the field given by Schwarzschild
        a. The planetary motion
        b. Deflection of light
E. Electromagnetic field and gravitation
    1. An invariant formulation
    2. Question of electromagnetic polarization of the ether
    3. Remark on the consistency of the generalized theory of electromagnetic fields
F. Energy and momentum relations of the gravitational field
    1. The gravitational force
    2. Another aspect of the gravitational equations
    3. The mechanism of the gravitational force

Chapter XII. COSMOLOGICAL PROBLEMS
A. The physical significance of invariant formulation of physical laws
    1. Tensors and distinguished measures
    2. The physical significance of the tensor g
    3. A normal form of the propagation tensor
B. Physical contents of particular representations
    1. Stationary representations
    2. The energy momentum distribution
C. Cosmological problems
    1. The results of astronomical observations
    2. The solution of Friedmann
D. Analysis of Friedmann's solution
    1. The recession of the Galaxies
        a. The measures of intergalactical distances
        b. Doppler effect
E. Mach's principle
        a. The Thirring effect

Appendix I. TENSOR ANALYSIS IN HOMOGENEOUS REGIONS
A. Systems of reference
    1. The Lorentz system
    2. Straight systems of reference
    3. Propagation tensor g
    4. Lorentz transformation
    5. Standard form of linear coordinate transformations
B. Vectors and tensors
    1. Two-dimensional tensors
        a. Invariant products
        b. Pseudo scalar
C. Fields
    1. The 3Í operator
        a. The Grad operator
        b. Further operations

Appendix II. TENSORS IN INHOMOGENEOUS REGIONS
A. More-dimensional measures
    1. k-dimensional measures
    2. Multiplication of more-dimensional quantities
B. Permutation operators
    1. Cyclic permutations
    2. The transposition of a matrix
    3. The n, operators
C. The 91 operator in curved representations
    1. Tensor of several dimensions
    2. Symmetry properties
    3. The antisymmetric tensor $e^{(4)}$
D. Tensor fields
    1. The Christoffel bracket symbols
    2. The covariant differentiation
E. Criteria for homogeneous regions
    1. Almost straight representations
    2. Tensor character of the Riemann-Christoffel tensor
    3. Symmetries of the $R^{(4)}$ tensor
    4. The reduced form of the $R^{(4)}$ tensor

INTRODUCTION

1. In this book I attempt a re-evaluation of the facts which led to the establishment of the special and general theories of relativity. As the result of this analysis I obtain a mathematical formalism which is strictly equivalent to the known formalism of relativity. Nevertheless my approach to the phenomena differs from the usual ones; it sometimes resembles that found in the classical literature of relativity rather than the treatments given in modern textbooks. I have always treated Laue's book* on relativity with great respect, and many of the ideas elaborated in this volume have their origin in remarks I found there.

2. At first sight it appears dangerous to attempt to reformulate a well-established theory, in particular when, as in the case of relativity, the old theory gives a mathematically correct description of phenomena. I am, however, of the opinion that the task of a theory is not merely to give mathematical formulae which happen to describe certain physical phenomena correctly; a theory must also give physical insight into the particular laws of nature it deals with. The theory of relativity in its original formulation is certainly not a mere attempt to describe phenomena by suitable mathematical expressions; it is a far-reaching attempt to give a theory of space and time. Our criticism of the theory is connected precisely with this latter feature. We think that the theory correctly reflects certain general physical laws, but these laws, in our opinion, have nothing to do with the "structure of space and time". Our aim is therefore to give a physical interpretation of the relativistic formulae which differs from the old one.

3. We want to relate a number of personal experiences which led us to our attempts at reformulating the theory of relativity; possibly some of our colleagues have had similar experiences.
I became acquainted with the theory of relativity at a comparatively early age, when I read the famous popular book written by Einstein.** Reading it, I had difficulties with some of Einstein's concepts; however, being young and enthusiastic, I convinced myself in the end that I could understand those concepts, and to prove this I tried to explain the theory to everybody who was interested. In the course of such attempts I learned the "language of relativity" and gradually I "got used" to the theory. A certain uneasy feeling never ceased altogether. Many years later I lectured on physics for several years in succession at the University of Manchester. My course also contained the special theory of relativity. As the years went on I developed a technique of presenting the subject so that in the end I could convince my students that they really understood the theory. However, as my technique of presenting the theory improved, my own belief in the adequacy of the concepts vanished. In the end I became convinced that from the philosophical point of view the concepts had to be changed. Since about 1950 I have struggled with the problem of the reformulation of the theory, and the results of my deliberations are found in this volume.

* M. v. Laue: Die Relativitätstheorie. Fr. Vieweg und Sohn, Braunschweig, 1923.
** A. Einstein: Über die spezielle und die allgemeine Relativitätstheorie. Fr. Vieweg und Sohn, Braunschweig, 1921.

4. I can still remember the thrill I felt, reading Einstein's book on relativity, when I met the suggestion that in performing a coordinate transformation it is proper to transform not the space coordinates alone but also the time! Analysing this suggestion, which started the revolutionary ideas about space and time, I have however arrived at a more sober view. We may describe events $\mathfrak{E}$ by four coordinates, i.e. three space coordinates giving a position vector $\mathbf{r}$, and for the fourth coordinate we can take the measure of the time $t$ at which the particular event happens. We may write $x = \mathbf{r}, t$ for the four-coordinate of an event. Changing from one system of reference to another we can introduce transformed coordinates

$$x' = f(x), \tag{1}$$

where $f(x)$ is some reversible four-function of its variable $x$. If the coordinates $x$ are suitable for describing events, then the transformed coordinates are also suitable. In introducing particular measures $x$ or $x'$ for events we merely give names of a kind to the events, with the help of which we can recognize them; it is more or less a trivial matter whether we take $x$ or $x'$ as the measure of an event. The fact that a transformation of the type (1) mixes the measures of the time and space coordinates does not seem to be of particular importance, and it does not imply any properties of time and space. An objective physical process develops according to its own laws and it can be described in terms of arbitrary measures. In a correct description of a phenomenon the role of the representation must vanish, i.e. it must be immaterial whether we use $x$ or $x'$ for the measures of the events occurring in a process.

5. Sometimes it is stated that special relativity gives the laws of nature in a form invariant with respect to linear transformations, while

the general theory gives an invariant formulation with respect to arbitrary coordinate transformations. We do not agree with this statement. In our opinion, which is elaborated in detail in this book, the laws of the special theory, just like those of the general theory, can be formulated independently of the representation, i.e. in terms of arbitrary coordinate measures and also arbitrary time measures. The real difference between the theories is that special relativity refers to phenomena in an approximation in which gravitational effects can be neglected, while the general theory is an attempt to give the laws of physics valid in regions where gravitational effects cannot be neglected. This point of view is held by V. A. Fock among others; we think that some of our arguments resemble those of Fock. Linear transformations play an important role in regions without gravitation. This is so because in a region free of gravitation a closed physical system can be made to move with a constant velocity and (using a suitable representation) this process can be described with the help of linear transformations. In a region where there are gravitational effects, a closed system which is made to move changes its gravitational surroundings in the course of its motion and thus suffers deformations which cannot be expressed by linear transformations, whatever representation we choose.

6. In our approach to physics in general, and to the theory of relativity in particular, we think it very important always to remember that we are dealing with objective physical quantities and that we attempt to describe them in terms of measures. A physical quantity can be described by measures in a very large number of ways. To make clear the distinction between a physical quantity and its measures, we use Gothic letters for the quantities themselves and Latin letters for their measures. We write e.g.

$$R(\mathfrak{a}) = a, \qquad R'(\mathfrak{a}) = a', \ \ldots$$

where we take $\mathfrak{a}$ to be some physical quantity and $a, a', \ldots$ the measures we find for $\mathfrak{a}$ using particular representations $R, R', \ldots$ Looking in this way at the transformation of time mentioned above, we can say that a coordinate transformation involves changes of the measures of the components of the coordinate vector and also a change of the measure of time. An event $\mathfrak{E}$ is thus represented in terms of a coordinate measure and a time measure, so that

$$R(\mathfrak{E}) = \mathbf{r}, t, \qquad R'(\mathfrak{E}) = \mathbf{r}', t', \ \ldots$$

The measures $t$ and $t'$ are just parts of the representation of the event $\mathfrak{E}$, and thus $t$ and $t'$ do not give "the time" but two measures of the time of the event $\mathfrak{E}$.

7. We can define measures $a, a', \ldots$ for a quantity $\mathfrak{a}$; it is obvious that these measures are not all equally useful. We shall see how one can choose from the various possible measures of a quantity such measures that reflect

clearly certain physical properties of the quantity. Such distinguished measures play a great role in the building up of the theory. The Lorentz transformation produces, in Einstein's terminology, the "transformation of time"; in our conception it gives the connection between equally distinguished measures of time.

8. In our presentation the ether appears as a medium which is the carrier of electromagnetic waves and also of other processes; e.g. it appears as the carrier of matter waves. This point of view does not essentially contradict some of the views of Einstein. We quote in the text (p. 49) a passage from an article by Einstein where he expresses views very similar to our own. The question of physical reality appears in our view in a form quite different from the way it is treated in many textbooks. To make this point clearer we quote from Laue's book* a sentence with which we do not agree:

"Indem die allgemeine Relativitätstheorie von derartigen Normalkoordinaten vollständig absieht, entkleidet sie nach einem Ausspruch Einsteins: 'den Raum (wie auch die Zeit) des letzten Restes physikalischer Gegenständlichkeit'."

"As the general theory of relativity ceases to use any such normal coordinates, as Einstein pointed out, it takes away 'the last remainder of physical reality of space (and also of time)'." (My own translation.)

It must be pointed out, however, that Laue left out the above remark in the newer editions of his book, in which one finds ideas which resemble to some extent those given in this book. In our view, from the fact that the measures of space and time coordinates can be chosen in an arbitrary manner it does not follow that these coordinates do not express realities. In fact any physical quantity, e.g. temperature, can be expressed with the help of arbitrary scales and thus we can obtain very different measures for a temperature; nevertheless we do not doubt that temperature is an objectively existing quantity.

* Dr. M. von Laue: Die Relativitätstheorie. Band II, p. 25. Verlag Friedr. Vieweg und Sohn, Braunschweig, 1923.

9. What we have to be careful about is that between measures taken on arbitrary scales there exist connections which remain whatever measures we introduce, and these connections express physical realities. Some physical quantities are expressed by tensors, like the energy-momentum tensor $\mathfrak{T}^{(el)}$ of the electromagnetic field, the energy-momentum tensor $\mathfrak{T}^{(m)}$ of a mechanical system, and so on. Among these tensors a great role is played by the tensor $g$, which is normally supposed to give the "metric of the four-dimensional space-time continuum". We introduce the tensor as the propagation tensor of electromagnetic waves; it appears at the beginning of our deliberations as a tensor with the help of which the mode of propagation of light can be described. It turns out that the tensor $g$ appears also in the equations of motion of a particle and also in other physical laws. It seems natural to connect $g$ with the energy-momentum tensor of the carrier of the various phenomena, i.e. with the energy-momentum tensor of the ether. Laue in the newer editions of his book speaks about $g$ as giving a "Führungsfeld"; thus he is also of the opinion that $g$ describes a kind of field.

10. The tensors $\mathfrak{T}^{(el)}, \mathfrak{T}^{(m)}, g, \ldots$ have elements the numerical values of which depend on the representation we choose. Nevertheless, there are mathematical connections between these tensors which are independent of the representation, and these connections express physical laws connecting matter, field and gravitation. The role of $g$ in these relations is analogous to that of the other tensors and, therefore, we think that $g$ represents something physical just as the other tensors do. These considerations are given in more detail in the last chapter of the book.

We think the above philosophical considerations to be essential. Nevertheless we wish to point out that all the statements found in this book could also be translated into what may be called an orthodox language. Since our mathematical formulations, although strictly equivalent to the usual ones, are presented in a form noticeably different from the usual formulation, we think that the book may be of interest also to readers who do not accept our philosophical point of view.

I want to thank many colleagues for interesting and useful discussions on the subject of the book while it was in preparation, in particular Mr P. Király, who took an active part in clearing up some particular questions. I wish to thank Mr A. Werner, who gave very effective help in editing the manuscript. I am greatly indebted to my stepfather G. Lukács, with whom I had very many fruitful exchanges of points of view in connection with philosophical problems.

CHAPTER I

A SURVEY OF SOME EXPERIMENTAL RESULTS RELATING TO THE PROPAGATION OF LIGHT AND CONNECTED PHENOMENA

11. In any attempt to deal with the problems of the theory of relativity the question of the mode of propagation of light plays an important role. Maxwell's electromagnetic theory of light gives a concise description of the phenomena of light. We shall describe some experiments which support Maxwell's theory. In particular, according to Maxwell's theory the velocity $c$ of propagation of light can be determined from measurements of the interaction of charges and magnets, without using observations which are concerned directly with the propagation of light.

A. THE CRITICAL VELOCITY

12. Maxwell's equations, representing the laws of the electromagnetic field, contain a constant $c$ which has the dimension of a velocity. This constant, which can be obtained from the measurement of the forces acting between charges and currents, is usually called the critical velocity. Its numerical value was determined experimentally with great precision and was found to be equal to the value of the velocity of propagation of light in vacuum.

13. In order to see more clearly the significance of the critical velocity we note the following. Electric charges act upon each other according to the Coulomb law; in particular, in the case of two point charges $e_1$ and $e_2$ we have for the force $\mathbf{F}^{(e)}_{12}$ with which $e_1$ acts upon $e_2$

$$\mathbf{F}^{(e)}_{12} = \frac{e_1 e_2}{r^3}\,\mathbf{r}, \tag{1}$$

where $\mathbf{r}$ is the vector pointing from $e_1$ to $e_2$. Similarly, the force with which a magnetic pole $m_1$ acts upon another magnetic pole $m_2$ can be written

$$\mathbf{F}^{(m)}_{12} = \frac{m_1 m_2}{r^3}\,\mathbf{r}. \tag{2}$$

However, as single magnetic poles do not exist in nature, the above equation can be verified only indirectly, e.g. by using long magnetized rods, which behave as if they possessed opposite magnetic charges at their ends.

14. An electric charge does not exert a force upon a magnetic pole if the charge and the pole are both at rest. However, if, say, the pole is at rest and the charge is moving with a velocity $\mathbf{v}$, we find that the force acting upon the magnetic pole is given by

$$\mathbf{F} = -\alpha\, e m\, \frac{\mathbf{v} \times \mathbf{r}}{r^3}, \tag{3}$$

where $\mathbf{r}$ is the vector pointing from $m$ to $e$ and $\alpha$ is a constant. Comparing (1), (2) and (3) we see that the dimension of $\alpha$ is that of an inverse velocity. The latter velocity can be determined numerically by measuring the forces acting between moving charges and magnetic poles. The velocity $c' = 1/\alpha$ is called the critical velocity, or sometimes the Kohlrausch-Weber constant. From the most recent measurements* of the interaction between moving charges and magnets it was found that

$$c' = (299\,790 \pm 30)\ \text{km/sec}, \tag{4}$$

while the best value** for the velocity of light is given as

$$c = (299\,792.50 \pm 0.10)\ \text{km/sec}.$$

15. The fact that a moving charge acts upon a magnet was first shown by Rowland. Exact experiments for the determination of the critical velocity appearing in (4) were carried out by Kohlrausch and Weber. In the latter experiment the magnetic action of the current arising when a condenser is discharged is compared with the action of the same charge when distributed electrostatically on the plates of the condenser.

16. From the empirically established fact that $c = c'$ Maxwell concluded that light is an electromagnetic phenomenon. As will be seen further below (in Chapter VIII), Maxwell's results apply to two phenomena.

* F. Kohlrausch: Lehrbuch der praktischen Physik. Verlag und Druck von B. G. Teubner, Leipzig und Berlin, 1930, p. 684.
** E. R. Cohen and J. W. M. DuMond: Rep. to the Com. on Nuclidic Masses and Related Atomic Constants of IUPAP, June 1963.

Firstly, a wave packet, e.g. a short signal emitted by a source, moves as a whole with the velocity $c$. Secondly, in a monochromatic beam of light the surfaces of constant phase also move with this velocity. We now describe very briefly a few experimental methods with the help of which these results of Maxwell's theory were verified.

B. MEASUREMENT OF THE VELOCITY OF PROPAGATION OF LIGHT SIGNALS

1. A GENERAL REMARK

17. In principle the velocity of propagation of light might be measured in the following way. A short signal is emitted from a source near a point A. The arrival of the signal is detected at a point B at a distance $l$ from A. If the time of departure of the signal is $t_1$ and that of its arrival $t_2$, we have for the velocity of the signal

$$c = \frac{l}{t_2 - t_1}.$$

Such a measurement cannot, however, be carried out in a straightforward way, because in order to measure the times $t_1$ and $t_2$ two clocks, say $P_A$ and $P_B$, one near A and the other near B, have to run synchronously with an accuracy sufficient to render the difference $t_2 - t_1$ significant. The first problem, the solution of which is not obvious, is thus how to synchronize clocks.

18. The simplest way of doing this seems to be by direct comparison. We might e.g. observe the clock $P_A$ near A from the point B by means of a telescope. However, when comparing the image $P'_A$ of the clock $P_A$, as seen through a telescope situated near B, with the clock $P_B$, one has to remember that light takes a certain time to travel from A to B, and therefore the image $P'_A$ must be expected to be delayed in phase relative to $P_A$. This means that for the synchronization of $P_A$ and $P_B$ in this way it is already necessary to know the velocity $c$ of propagation of light. The above difficulty may be overcome by observing both $P_A$ and $P_B$ through a telescope near a third point C situated at equal distances from A and B. The clocks $P_A$ and $P_B$ have to be regulated until the images $P''_A$ and $P''_B$ of $P_A$ and $P_B$ as seen in C appear to be in phase. The synchronization thus described supposes that light is propagated with the same velocity along AC and BC. This method, although feasible, has not in fact been used in any real experiment.
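To give a feeling for the accuracy of synchronization that the measurement of paragraph 17 would require, the following Python sketch evaluates the velocity $l/(t_2 - t_1)$ that would be inferred if the clock near B were ahead of the clock near A by a fixed amount. The separation of 10 km, the chosen offsets and the function name `apparent_velocity` are illustrative assumptions, not taken from the text; only the value of $c$ is the one quoted above.

```python
# Sketch: effect of a synchronization error on the one-way measurement of paragraph 17.
# All numbers except C_KM_S are illustrative assumptions.

C_KM_S = 299_792.5  # velocity of light quoted in the text, km/sec


def apparent_velocity(distance_km, clock_offset_s, c_km_s=C_KM_S):
    """Return l / (t2 - t1) when clock P_B runs ahead of P_A by clock_offset_s."""
    true_travel_time_s = distance_km / c_km_s
    # t2 is read on P_B and t1 on P_A, so the offset adds to the measured interval.
    measured_interval_s = true_travel_time_s + clock_offset_s
    return distance_km / measured_interval_s


distance_km = 10.0  # hypothetical separation of A and B
for offset_s in (0.0, 1e-9, 1e-6, 1e-3):
    c_app = apparent_velocity(distance_km, offset_s)
    rel_err = abs(c_app - C_KM_S) / C_KM_S
    print(f"offset {offset_s:.0e} s: apparent c = {c_app:.1f} km/sec, "
          f"relative error {rel_err:.1e}")
```

With these assumed numbers the light takes about 33 microseconds to cover the distance, so an offset of a single microsecond already changes the inferred velocity by about three per cent.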

2. THE SUGGESTION OF GALILEI

19. Galilei* supposed that light is propagated with a finite velocity, and for its determination he suggested the following experiment. Two covered lanterns are placed at a certain distance from each other, and near each there stands an observer. The first lantern is suddenly uncovered by the first observer. The second observer, near the other lantern, uncovers his own lantern as soon as he notices the light from the first. The first observer must then be expected to see the lighting up of the distant lantern with a delay equal to the time the light takes to travel to and fro between the two lanterns. When the experiment was actually carried out, no delay could be observed in this way with the lanterns placed a few miles apart, and Galilei correctly concluded that the velocity of propagation of light is very large. He suggested that this velocity might be determined with the help of observations of the satellites of Jupiter.
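The size of the delay expected in the lantern experiment can be estimated with a short Python sketch. The separations, the conversion factor for the mile and the human reaction time used for comparison are assumptions made for illustration; the text specifies only "a few miles".

```python
# Sketch: expected delay in the lantern experiment of paragraph 19.
# The separations, the mile-to-kilometre factor and the reaction time are assumptions.

C_KM_S = 299_792.5       # velocity of light quoted in the text, km/sec
KM_PER_MILE = 1.609344   # statute mile, assumed; the text does not say which mile is meant
REACTION_TIME_S = 0.2    # assumed order of magnitude of a human reaction time


def round_trip_delay_s(separation_miles, c_km_s=C_KM_S):
    """Time for light to travel to and fro between the two lanterns."""
    return 2.0 * separation_miles * KM_PER_MILE / c_km_s


for miles in (1.0, 3.0, 5.0):
    delay = round_trip_delay_s(miles)
    print(f"{miles:.0f} miles: delay = {delay:.2e} s, "
          f"reaction time / delay = {REACTION_TIME_S / delay:.0f}")
```

For separations of a few miles the delay comes out at some tens of microseconds, thousands of times shorter than the assumed reaction time of the observers, which is consistent with no delay having been noticed.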

3. ASTRONOMICAL METHODS a. OBSERVATIONS OF RÖMER

20. The difficulty of synchronizing distant clocks might also be overcome in the following way. The clocks $P_A$ and $P_B$ are placed at first close to each other and are thus synchronized. The synchronized clocks are then transported to their positions near A and B respectively. If the transport takes place with sufficient care, we may hope that the clocks remain synchronized during their travel to their final positions, and the experiment described in 17 can now be carried out with these clocks. In his investigations which led to the first numerical determination of the velocity of light (in 1676) the astronomer Olaf Römer used the same principle. The satellites of Jupiter circling periodically around Jupiter provide a clock in space which can be observed from the Earth. While the Earth moves along its orbit the distance between Jupiter and Earth changes, and the Jupiter-clock appears to be slow when the Earth is moving away from Jupiter and to be fast when it is approaching Jupiter. The actual method is as follows. Suppose that one of the satellites of Jupiter actually completes its revolutions at times

$$t_\nu = t_0 + \nu T, \qquad \nu = 1, 2, \ldots, \tag{5}$$

where $T$ is the time taken for one revolution and $t_0$ is the time when the first observed revolution has started. We observe the completion of the revolutions at times

$$t'_\nu = t_\nu + l_\nu/c, \tag{6}$$

where $l_\nu$ is the distance Earth-Jupiter at the time $t'_\nu$.* From (5) and (6) we find for the times observed for the first k revolutions

$$t'_k - t'_0 = kT + (l_k - l_0)/c,$$

and for the subsequent k revolutions

$$t'_{2k} - t'_k = kT + (l_{2k} - l_k)/c.$$

The difference between the times observed for the first and second sequence of k revolutions is thus

$$t'_{2k} - 2t'_k + t'_0 = (l_{2k} - 2l_k + l_0)/c,$$

thus

$$c = \frac{l_{2k} - 2l_k + l_0}{t'_{2k} - 2t'_k + t'_0}.$$

Römer did not observe complete revolutions of the satellites, but observed the times of their eclipses; for the sake of simplicity we have described the method here somewhat simplified, in terms of full revolutions. Römer, observing a few revolutions of the satellites, found**

$$c = 220\,000\ \text{km/sec}. \tag{Römer}$$

* Discorsi e demostrazione mat., Elzevir, 1638. p. 43; see also R. J. Seeger: Galileo Galilei, his life and his works. Pergamon Press, Oxford, 1966. pp. 185-187.
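The second-difference formula for c can be tried out on synthetic data. The sketch below is illustrative only: the distances, the period and the "true" velocity are assumed numbers, not Römer's observations, and the function names are hypothetical. It builds observation times of the form $t'_\nu = t_0 + \nu T + l_\nu/c$ and then recovers c from three of them.

```python
def romer_speed(l0, lk, l2k, t0_obs, tk_obs, t2k_obs):
    """Second-difference estimate c = (l_2k - 2 l_k + l_0) / (t'_2k - 2 t'_k + t'_0)."""
    return (l2k - 2.0 * lk + l0) / (t2k_obs - 2.0 * tk_obs + t0_obs)


def observed_time(t_start, period, nu, distance, c):
    """Observation time t'_nu = t_0 + nu*T + l_nu/c, combining (5) and (6)."""
    return t_start + nu * period + distance / c


# Hypothetical numbers, chosen only to exercise the formula.
C_TRUE = 3.0e5                      # km/sec, assumed velocity used to build the data
PERIOD = 152_850.0                  # sec, assumed period of revolution
K = 50                              # number of revolutions in each sequence
L0, LK, L2K = 6.0e8, 6.5e8, 7.4e8   # km, assumed Earth-Jupiter distances

t0_obs = observed_time(0.0, PERIOD, 0, L0, C_TRUE)
tk_obs = observed_time(0.0, PERIOD, K, LK, C_TRUE)
t2k_obs = observed_time(0.0, PERIOD, 2 * K, L2K, C_TRUE)

c_estimate = romer_speed(L0, LK, L2K, t0_obs, tk_obs, t2k_obs)
print(f"recovered c = {c_estimate:.1f} km/sec")
```

The period T drops out of the second difference, so it need not be known precisely; only the change of the distance enters.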

The above value is rather inaccurate; nevertheless, the fact that the correct order of magnitude was obtained at all must be regarded as a remarkable achievement. The exact instant of an eclipse cannot be determined with an uncertainty less than some fraction of one minute. The procedure, even using modern methods of astronomical observation, can therefore not be improved sufficiently to give a precise determination of c.

b. THE ABERRATION OF LIGHT

21. The first reasonably accurate determination of the velocity of light was carried out by Bradley*** (1727), who predicted and also observed the effect of the aberration of light. We shall discuss this effect further below (chapt. IX). It will be seen that this procedure yields a determination of the critical velocity c' rather than that of the velocity of propagation c.

* It may be supposed that the velocity of the Earth relative to that of light is so small that it makes no noticeable difference whether we consider $l_\nu$ to be the distance Earth-Jupiter at $t_\nu$ or at $t'_\nu$.
** R. Römer: Mem. Acad. des Sciences, Paris, 1675; see also C. Ramsauer: Grundversuche der Physik in historischer Darstellung. I., Springer Verl., 1953. p. 63.
*** J. Bradley: Phil. Trans., London, 35, 637, 1728.

4. LABORATORY METHODS

a. METHODS OF FIZEAU AND OF FOUCAULT

22. The difficulty in the measurement of the velocity of light mentioned in 17, arising from the need to synchronize distant clocks, can be avoided by making use of return signals. This circumstance was already appreciated by Galilei. Sending light signals from a fixed point A to a point B at a distance $l$ from A, and reflecting the light signal by means of a suitable mirror back towards A, we may measure the time of departure $t_1$ of the signal and the time $t_3$ of its arrival back in A by one and the same clock $P_A$. We may thus write

$$c = \frac{2l}{t_3 - t_1}. \tag{7}$$

When using (7) we suppose that $t_2$, the time of arrival of the signal in B, is given by

$$t_2 = \tfrac{1}{2}(t_1 + t_3),$$

i.e. we suppose that the velocity of propagation $c_+$ of the signal from A to B is equal to its velocity of propagation $c_-$ from B to A. Instead of registering the return times the signals take to move from A to B and back, we may use a clock $P_A$ near A and view its mirror image obtained on a mirror situated near B. Viewing the mirror image $P'_A$ in a telescope near A we find the phase shift $\Delta t = 2l/c$ between the phase of the clock $P_A$ and that of its mirror image $P'_A$. The velocity of light is thus obtained as

$$c = 2l/\Delta t.$$

b. THE EXPERIMENTS OF FIZEAU AND FOUCAULT

23. Experiments using the principle explained in the previous paragraph were carried out (in 1849) by Fizeau* and led to a precise determination of the velocity of propagation of light. In the experiment of Fizeau the clock is replaced by a fast moving cog-wheel. The light of a small source is made to fall on the rim of the wheel and passes between the cogs to a mirror placed at a distance $l$. The beam reflected from there is projected by means of a suitable optical arrangement back on the cog-wheel. The optical arrangement, which is shown in a little more detail in Fig. 1, is so adjusted that when the cog-wheel is at rest the returning beam falls on the same part of the rim of the wheel through which it has passed originally, and in this way the light source can be seen through the gap between the cogs. If, however, the cog-wheel is set in motion and has turned by, say, half a cog during the to and fro passage of the light beam, this can no longer pass through the gap and the image of the light source is extinguished. When the wheel moves faster the returning beam may just fall on the gap following the one through which the original beam has passed, and the image of the source becomes visible again. In the experiment the relation between the angular velocity of the cog-wheel and the re-appearances of the image was registered. In the actual experiment of Fizeau the distance was taken to be $l = 8633$ m and a wheel with $n = 720$ cogs was chosen. The rates of revolution at which the image appeared were therefore the integral multiples of

$$N = \frac{c}{2nl} = 22.6\ \text{revolutions/sec}.$$

Fig. 1. Determination of the velocity of light (method of Fizeau)

* H. Fizeau: Compt. Rend. Hebd., 29, 90, 1849; Ann. d. Phys., 79, 167, 1850.
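The relation between the rate of revolution and the re-appearance of the image can be evaluated directly. The following is a minimal sketch with hypothetical function names; it uses the modern value of c, which is an assumption (the text does not say which value enters its figure). With that value the relation N = c/(2nl) gives about 24 revolutions/sec, of the same order as the figure quoted above.

```python
C_LIGHT = 299_792_458.0  # m/sec, modern value (assumption, not taken from the text)


def reappearance_rate(n_cogs, distance, c=C_LIGHT):
    """Lowest rate N = c / (2 n l) at which the image becomes visible again."""
    return c / (2.0 * n_cogs * distance)


def speed_from_rate(n_cogs, distance, rate):
    """Invert the same relation: c = 2 n l N."""
    return 2.0 * n_cogs * distance * rate


N = reappearance_rate(n_cogs=720, distance=8633.0)
print(f"N = {N:.2f} revolutions/sec")
# The wheel turns by half a cog spacing during the round trip at N/2,
# so the image is extinguished at N/2, 3N/2, ...
print(f"first extinction at {N / 2:.2f} revolutions/sec")
```

Measuring N for a known n and l then yields c through `speed_from_rate`.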

24. In a further experiment Fizeau and Foucault* used an improved method suggested by Arago.** In the latter arrangement use is made of a rotating mirror. A beam of light falling on a rotating mirror placed near A is reflected in such a way that it falls on a stationary mirror placed near B at a distance $l$ from A (see Fig. 2). The mirror in B reflects the beam back to the mirror in A, but during the time $\Delta t = 2l/c$ which the light takes to travel from A to B and back the mirror turns by an angle $\delta = \omega\,\Delta t$, where $\omega$ is the angular velocity of the mirror. Measuring the angle $\delta$, the value of $\Delta t$ can be obtained, as the value of $\omega$ is known, and thus c can be determined. The results of the most accurate determination of c were quoted in 14.

Fig. 2. Determination of the velocity of light
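The rotating-mirror relation can be sketched in the same way: the angle turned by the mirror during the round trip gives the time of travel, and this gives c. The numbers below (path length, rate of rotation) are hypothetical and serve only as a consistency check; the function name is not from the text.

```python
import math


def speed_from_mirror_turn(distance, omega, delta):
    """c = 2 l / dt, where dt = delta / omega is the round-trip time."""
    dt = delta / omega
    return 2.0 * distance / dt


# Hypothetical set-up: 20 m between the mirrors, 800 revolutions/sec.
omega = 2.0 * math.pi * 800.0           # angular velocity in rad/sec
delta = omega * (2.0 * 20.0 / 3.0e8)    # angle turned if c were 3.0e8 m/sec
c_est = speed_from_mirror_turn(20.0, omega, delta)
print(f"delta = {delta:.3e} rad, c = {c_est:.4e} m/sec")
```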

5. THE PROPAGATION OF LIGHT IN REFRACTIVE MEDIA

a. THE REFRACTIVE INDEX

25. It follows from Maxwell's theory that in a refractive medium light is propagated with a velocity different from c. We shall discuss this aspect of the theory further below (in chapt. IX). We note here the following: 1) The velocity of propagation of light in a medium with refractive index n is obtained from the theory as

$$V = c/n, \tag{8}$$

where in a first approximation $n = \sqrt{\varepsilon\mu}$ ($\varepsilon$ is the dielectric constant, $\mu$ the magnetic permeability of the substance).

* L. Foucault: Compt. Rend. Hebd., 30, 551, 1850; ibid. 55, 501, 1862.
** F. Arago: Compt. Rend. Hebd., 7, 954, 1838; 30, 489, 1850; 55, 792, 1862.

A more detailed analysis shows that (8) is valid for monochromatic light only, the planes of constant phase of a monochromatic beam being propagated with velocities

$$V = c/n(\nu), \tag{9}$$

where $n(\nu)$ is the refractive index for light of frequency $\nu$. 2) V gives not only the velocity of propagation of phase planes but also that of wave packets, the width of which much exceeds the wavelength. Relation (9) can be checked experimentally by measuring directly the velocity V of propagation of planes of constant phase of light of frequency $\nu$, or by measuring the velocity of wave packets in a transparent medium. The refractive index $n(\nu)$ may be determined from the observed angles of refraction of a beam of light of frequency $\nu$. Thus V and $n(\nu)$ may be determined by independent experiments and the relation (9) thus checked. The actual experimental findings described below support the validity of relation (9).

b. THE DETERMINATION OF THE VELOCITY OF LIGHT IN A REFRACTING MEDIUM

26. The experiment of Foucault using a rotating mirror can be carried out when the distance $l$ is comparatively short. The arrangement could therefore be used also to measure the velocity of light in a refracting medium. For this purpose Foucault used in his arrangement a container filled with water, which was placed between the rotating and the stationary mirror (Fig. 3). In this way he could ascertain that the velocity of propagation of light through water was indeed equal to $V = c/n$, where n is the refractive index of the water. So as to avoid the necessity of measuring the angular velocity of the rotating mirror, Fizeau improved this experiment by comparing directly the time light needs to traverse a layer of water of depth $l_1 = l/n$ with the time of travel taken for a distance $l$ in air. He found the times to be equal and thus verified the relation (9).

Fig. 3. Determination of the velocity of light in a refractive medium

C. INTERFEROMETRIC METHODS FOR THE MEASUREMENT OF THE VELOCITY OF LIGHT

27. The methods described so far were used to determine the velocity of signals of light. Indeed, in the experiment of Fizeau the rotating cog-wheel cuts the incoming beam into packets (which may be regarded as signals) and it is the velocity of these packets that is determined. Similarly, in the experiment of Foucault the rotating mirror produces short flashes and the speed of propagation of these flashes is measured. We shall presently discuss interferometric methods of the measurement of the velocity of light. These methods make use of beams of monochromatic light and serve for the observation of the velocities of planes of constant phase.

1. THE MICHELSON INTERFEROMETER

28. The most suitable arrangement for such interferometric measurements is the interferometer of Michelson.* We give a short description of the arrangement. The Michelson interferometer is shown schematically in Fig. 4 and can be described as follows. A beam of light starting from a source S falls with an incidence of 45° on a semi-transparent mirror SM. The beam is split by the mirror into two components. The components fall on mirrors $M_1$ and $M_2$ and are reflected back into SM. The returning beams are split again and we obtain one component of each returning beam which passes into the telescope T. In the telescope we obtain a system of interference fringes with the help of which it is possible to determine the difference of the lengths of the return paths

$$2l_1 = SM \to M_1 \to SM \quad\text{and}\quad 2l_2 = SM \to M_2 \to SM$$

of the components of the incident beam. More precisely, the system of fringes permits us to determine the difference $\Delta T = T_2 - T_1$ of the times of return travel of the two components between SM and $M_1$, respectively $M_2$.

* A. A. Michelson: Sill. Journ., 15, 394, 1878; 18, 390, 1879; Nature 21, 94 and 120, 1880; Naut. Alm. p. 235, 1885; Astrophys. J., 60, 256, 1924; 65, 1, 1927.

29. For the better understanding of what happens in the Michelson interferometer it is useful to give a somewhat fuller description. As shown in Fig. 4, light coming from the source S passes through a lens L and an approximately parallel beam is obtained; this beam appears to come from a source $S'$ far to the left from S.

Fig. 4. The Michelson interferometer

The parallel beam is now split by the semi-transparent mirror, the components are reflected back and produce components passing into the telescope. If we put an obstacle between SM and $M_2$ and thus prevent the latter component from falling into the telescope, then we see in the telescope an image $S'_1$ of S which appears to come from a source $S_1$ situated at some distance behind the mirror $M_2$. This image is produced by the reflections of the incident beam first on $M_1$ and then on SM. Similarly, when we stop the beam between SM and $M_1$ by some obstacle, we see in the telescope an image $S'_2$ of S, which also appears to come from a source $S_2$ situated somewhere behind $M_2$. The images $S'_1$ and $S'_2$ are coherent, and therefore if both paths $SM \to M_1$ and $SM \to M_2$ are free we see in the telescope the interference pattern which arises from the superposition of the two images; the pattern is such as if we observed through the telescope the images of two coherent sources $S_1$ and $S_2$. If the interferometer is built up symmetrically, i.e. if the mirror $M_2$ is placed exactly in the plane into which falls the mirror image of $M_1$ produced by SM, then the virtual sources $S_1$ and $S_2$ coincide, no interferences are produced, and we observe a bright picture of the original source S in the telescope. We denote this bright picture of the source also as the zero interference pattern. If we shift, say, the mirror $M_1$ parallel to itself by a small amount $\Delta l$, then the virtual source $S_1$ moves back by an amount $2\Delta l$, and thus we observe in the telescope circular fringes corresponding to two coherent sources $S_1$ and $S_2$ placed behind each other at the distance $2\Delta l$. Knowing the wavelength $\lambda$ of the light of the source, we can determine from the distribution of the interference rings the numerical value of $\Delta l$.

Fig. 5. Radius of interference fringes

30. Indeed, suppose the distance between the virtual sources $S_1$ and $S_2$ to be

$$a = n\lambda, \qquad n = n_0 + \varepsilon,$$

where $n_0$ is the integral and $\varepsilon$ the non-integral part of n. Bright interference rings will then appear in the field of view, the k-th ring having the radius

$$r_k = F\tan\vartheta_k \quad\text{with}\quad \cos\vartheta_k = \frac{n_0 - k}{n}, \qquad k = 0, 1, 2, \ldots,$$

F being the effective focal length of the telescope. If the $\vartheta_k$ are small one finds, comparing Fig. 5, in good approximation

$$\vartheta_k \approx \sqrt{\frac{2(k + \varepsilon)}{n_0 + \varepsilon}}, \qquad k = 0, 1, 2, \ldots$$

The concentric rings appearing in the field of view show decreasing spacing when going outwards (see Fig. 6). From the pattern both $n_0$ and $\varepsilon$ can be determined. Denoting the effective distances $SM \to M_1$ by $l_1$ and $SM \to M_2$ by $l_2$ (i.e. travelling time multiplied by c), we can determine with great precision the difference

$$\Delta l = l_2 - l_1 = n\lambda,$$

if $\lambda$ is known. In particular, we can shift one of the mirrors $M_1$ or $M_2$ parallel to itself until the interference fringes disappear and we obtain the zero interference pattern, i.e. we see a bright image of the light source in the telescope. We have then n = 0 and therefore

$$\Delta l = 0.$$

Fig. 6. Scheme of interference fringes
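The statement that the rings crowd together towards the outside can be checked numerically. The sketch below assumes the ring condition in the form $\cos\vartheta_k = (n_0 - k)/(n_0 + \varepsilon)$ and the small-angle form $\vartheta_k \approx \sqrt{2(k+\varepsilon)/(n_0+\varepsilon)}$; the focal length and the values of $n_0$ and $\varepsilon$ are hypothetical, as are the function names.

```python
import math


def ring_angles(n0, eps, count):
    """Exact angles from cos(theta_k) = (n0 - k) / (n0 + eps), k = 0 .. count-1."""
    n = n0 + eps
    return [math.acos((n0 - k) / n) for k in range(count)]


def ring_angles_small(n0, eps, count):
    """Small-angle form theta_k ~ sqrt(2 (k + eps) / (n0 + eps))."""
    return [math.sqrt(2.0 * (k + eps) / (n0 + eps)) for k in range(count)]


F = 0.5  # m, hypothetical effective focal length of the telescope
exact = ring_angles(n0=20000, eps=0.3, count=5)
approx = ring_angles_small(n0=20000, eps=0.3, count=5)
radii = [F * math.tan(theta) for theta in exact]
spacings = [outer - inner for inner, outer in zip(radii, radii[1:])]

print(radii)      # ring radii r_k = F tan(theta_k)
print(spacings)   # each spacing is smaller than the one before it
```

Conversely, fitting measured radii to this law is one way to obtain $n_0$ and $\varepsilon$ from a photograph of the rings.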

If we tilt, say, $M_1$ by a small angle instead of shifting it from its symmetrical position, then the virtual images $S'_1$ and $S'_2$ will be placed side by side and we obtain a system of parallel fringes instead of a system of rings. For practical reasons the interferometer is often adjusted so as to give parallel fringes instead of rings. The exact way of adjustment has no great significance; therefore we discuss further below the applications of the interferometer supposing the mirrors $M_1$ and $M_2$ to be adjusted so as to give a system of interference rings.

2. MEASUREMENT OF THE PHASE VELOCITY OF LIGHT

31. The Michelson interferometer described above allows the measurement not so much of the difference $\Delta l$ of the lengths of its arms, but rather of the difference $\Delta T$ in the times of travel of the two wave fronts produced by the mirror SM, from the moment of their separation until they unite again.

If the wave fronts travel along the arms $l_1$ and $l_2$ with different velocities, say, with velocities $c_1$ and $c_2$, then from the interference pattern we do not determine $\Delta l = l_2 - l_1$ but rather

$$\Delta T = \frac{2l_2}{c_2} - \frac{2l_1}{c_1}. \tag{10}$$

The fact that we are not measuring the difference of lengths of light paths but rather the difference of times of flight was made use of to measure the velocity of propagation of wave fronts in a refracting medium. Inserting a refracting medium in one of the arms of an interferometer and adjusting the interferometer so as to give $\Delta T = 0$, we find from (10)

$$c_1 : c_2 = l_1 : l_2.$$

Thus, if e.g. $c_1 = c$, $c_2 = c/n$, we have

$$n = l_1/l_2.$$
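The null method just described can be illustrated with a few lines of code. This is a sketch under stated assumptions: the time difference is taken as $\Delta T = 2l_2/c_2 - 2l_1/c_1$ (return travel in each arm), and the arm length and refractive index are hypothetical values chosen for the example.

```python
def time_difference(l1, c1, l2, c2):
    """Return-travel time difference, Delta T = 2 l2 / c2 - 2 l1 / c1."""
    return 2.0 * l2 / c2 - 2.0 * l1 / c1


def index_from_null(l1, l2):
    """Refractive index from the arm lengths at Delta T = 0: n = l1 / l2."""
    return l1 / l2


C = 3.0e8           # m/sec, rounded value used for illustration
n_true = 1.333      # hypothetical index of the inserted medium
l1 = 0.40           # m, arm traversed with velocity c
l2 = l1 / n_true    # arm length (filled with the medium) that balances the times

print(time_difference(l1, C, l2, C / n_true))   # vanishes up to rounding
print(index_from_null(l1, l2))                  # recovers n_true
```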

The above experiment extends the result of that described in 26. Considering both types of experiments we find that wave fronts, like wave packets, propagate with velocities $c/n(\nu)$.

3. THE EXPERIMENT OF FIZEAU

32. Similar methods led to the measurement of the velocity of phase planes in a moving medium. The arrangement for such a measurement is shown schematically in Fig. 7.

Fig. 7. Determination of the velocity of light in a moving medium (Fizeau)

The incident beam is split into two components and these are made to pass through columns of water flowing with a velocity v. The one front moves in the direction of flow, the other in the direction opposite to it. As will be seen further below (in chapt. IX), the phase velocities of light passing through the moving medium are expected theoretically to be equal to

$$V_{1,2} = \frac{c}{n} \pm v\left(1 - \frac{1}{n^2}\right). \tag{11}$$

Thus, re-uniting the beams, a shift of interference fringes depending on the velocity v of flow is to be expected. The measurements of Fizeau* confirmed at least qualitatively the relation (11). A good agreement between experiment and theory was obtained by Zeeman** when the values observed directly were corrected for dispersion. Evaluating the experimental results one has to put in place of (11) more precisely

$$V_{1,2} = \frac{c}{n(\nu_{1,2})} \pm v\left[1 - \frac{1}{n^2(\nu_{1,2})}\right],$$

where $n(\nu_1)$ and $n(\nu_2)$ are the values of the refractive index for the frequencies $\nu_1$ and $\nu_2$; here $\nu$ is the original frequency of the source, and $\nu_1$ and $\nu_2$ respectively are the frequencies shifted by the Doppler effect, i.e. the frequencies with which the light acts upon the atoms of the moving liquid. We shall come back to the theoretical interpretation of this effect in chapt. IX.
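The size of the effect can be estimated with a short calculation. The sketch below takes the phase velocities in the moving medium in the form $c/n \pm v(1 - 1/n^2)$ and, as an assumed cross-check that is not part of the passage above, compares this with the relativistic composition of the velocities c/n and v. The index and the flow speed are hypothetical, and dispersion is ignored.

```python
def dragged_velocity(c, n, v, with_flow=True):
    """Phase velocity c/n +/- v (1 - 1/n^2) in the moving medium."""
    sign = 1.0 if with_flow else -1.0
    return c / n + sign * v * (1.0 - 1.0 / n**2)


def composed_velocity(c, n, v):
    """Assumed comparison: relativistic composition of c/n with v."""
    u = c / n
    return (u + v) / (1.0 + u * v / c**2)


C = 299_792_458.0   # m/sec, modern value (assumption)
n_water = 1.333     # hypothetical, non-dispersive index
v_flow = 7.0        # m/sec, hypothetical flow speed

V_plus = dragged_velocity(C, n_water, v_flow, with_flow=True)
V_minus = dragged_velocity(C, n_water, v_flow, with_flow=False)
print(V_plus - V_minus)                                 # equals 2 v (1 - 1/n^2)
print(V_plus - composed_velocity(C, n_water, v_flow))   # second-order remainder
```

The two expressions agree to first order in v/c, which is all that the fringe shift is sensitive to at such flow speeds.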

D. THE DOPPLER EFFECT***

33. It follows from Maxwell's theory that the velocity of propagation of light is independent of the velocity of the source at the instant of emission. It sometimes causes confusion that the frequency of the light emitted by a moving atom is nevertheless affected by the state of motion of the atom. The latter effect, predicted by Doppler, is however quite compatible with the fact that the velocity of propagation of light is independent of the state of motion of the source. Indeed, the frequency of light we observe is determined by the rhythm in which consecutive wave fronts reach us. This rhythm may be affected by the motion of the source, but the wave fronts themselves will always proceed with the velocity c independent of the motion of the source.

* H. Fizeau: Compt. Rend., 33, 349, 1851.
** P. Zeeman: Amst. Akad. Vrl. p. 245, 1914; and p. 18, 1915; see also A. Sommerfeld: Vorlesungen über theor. Physik. Optik, p. 64.
*** Ch. Doppler: Abh. d. K. Böhmischen Ges. d. Wiss., 2, 465, 1892.

34. The emission of waves by a moving atom can be treated with the help of Maxwell's theory. The exact treatment shows that an atom emits spherical waves, the surfaces of equal phase expanding isotropically with a velocity c. That this treatment leads to correct results can be seen most clearly if we express the field of the emitting atom in terms of retarded potentials (see chapt. VIII). Consider thus an atom which is at rest in the point $\mathbf{r} = 0$ and which emits radiation of constant frequency $\nu$. Fronts of constant phase are then emitted at intervals

$$T = \frac{1}{\nu}, \tag{12}$$

and the fronts may be supposed to start at times

$$t_k = t_0 + kT, \qquad k = 0, 1, 2, \ldots$$

At any time t the fronts are distributed as a system of concentric spheres around the point $\mathbf{r} = 0$.

1. MOVING SOURCE

35. If the emitting atom moves with a constant velocity $\mathbf{v}$, the position of the atom at the time t is given by $\mathbf{r}(t) = \mathbf{v}t$ and the wave fronts will be eccentric spheres with centres

$$\mathbf{r}_k = \mathbf{v}t_k.$$

If the fronts are observed from a point A towards which the atom moves (see Fig. 8), they will appear to be crowded together and to arrive at intervals shorter than T, while in a point B from which the atom moves away the wave fronts will arrive at intervals longer than T. Of course the velocity of the individual fronts is equal to c both when passing through A and when passing through B. The points of the k-th front at the time $t > t_k$ obey the relation

$$(\mathbf{r} - \mathbf{v}t_k)^2 = c^2(t - t_k)^2 \quad\text{for}\quad t > t_k.$$

The k-th front arrives at the time $t'_k$ in the point A with coordinate vector $\mathbf{r} = \mathbf{R}$, so that

$$(\mathbf{R} - \mathbf{v}t_k)^2 = c^2(t'_k - t_k)^2, \quad\text{thus}\quad t'_k = t_k + |\mathbf{R} - \mathbf{v}t_k|/c.$$

Supposing the distance R to be much longer than $vt_k$ we may write, neglecting terms $(vt_k/R)^2$ as compared to unity,

$$|\mathbf{R} - \mathbf{v}t_k| = R\left(1 - \frac{vt_k\cos\vartheta}{R}\right),$$

where $\cos\vartheta = \mathbf{R}\mathbf{v}/Rv$ is the cosine of the angle between $\mathbf{R}$ and $\mathbf{v}$. Thus we have in this approximation

$$t'_k = R/c + \left(1 - \frac{v}{c}\cos\vartheta\right)t_k. \tag{13}$$

Writing

$$t'_{k+1} - t'_k = T' \quad\text{and}\quad \nu' = \frac{1}{T'}, \tag{14}$$

we find from (12), (13) and (14)

$$\nu' = \frac{\nu}{1 - \dfrac{v}{c}\cos\vartheta}. \tag{15}$$

Fig. 8. Wave fronts in the Doppler effect

We see thus that the observed frequency $\nu'$ varies with the angle $\vartheta$ between $\mathbf{v}$ and the direction of observation. We note that formula (15) is asymptotically correct for large distances R. However, in practice we observe the radiation of atoms always at such great distances that the deviation from expression (15) which arises from having neglected terms of the order of $vt_k/R$ can always be neglected. The deviation can be neglected even if $v \sim c$.
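The asymptotic character of (15) can be seen by comparing it with the exact arrival times $t'_k = t_k + |\mathbf{R} - \mathbf{v}t_k|/c$. The sketch below does this in two dimensions, in units with c = 1 and $\nu$ = 1; the velocity, the distance and the function names are illustrative assumptions.

```python
import math


def arrival_time(t_emit, R, theta, v, c):
    """Exact t'_k = t_k + |R - v t_k| / c, with v along x and R at angle theta."""
    dx = R * math.cos(theta) - v * t_emit
    dy = R * math.sin(theta)
    return t_emit + math.hypot(dx, dy) / c


def doppler_asymptotic(nu, v, c, theta):
    """Large-distance result nu' = nu / (1 - (v/c) cos(theta))."""
    return nu / (1.0 - (v / c) * math.cos(theta))


# Units with c = 1 and nu = 1; R is large compared with v*T.
c, nu, v, R = 1.0, 1.0, 0.3, 1.0e6
T = 1.0 / nu
for theta_deg in (0.0, 60.0, 90.0, 180.0):
    theta = math.radians(theta_deg)
    # Interval between the arrivals of two fronts emitted at t = 0 and t = T.
    T_obs = arrival_time(T, R, theta, v, c) - arrival_time(0.0, R, theta, v, c)
    print(theta_deg, 1.0 / T_obs, doppler_asymptotic(nu, v, c, theta))
```

Reducing R in this sketch makes the difference between the two columns grow, which is the deviation referred to in the text.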

2. OBSERVATION OF THE DOPPLER EFFECT

36. The Doppler effect was first observed by Galitzin and Belepolsky (1895);* they compared the frequency of the light emitted by a monochromatic light source with the frequency which the mirror image of the source obtained on a moving mirror seems to have. The above procedure was chosen as it seemed difficult at that time to move a real source of light with a sufficiently large velocity. Apart from the fact that it is easier to move a mirror than a whole light source, the procedure had also the advantage that the mirror image appears to move with a velocity 2v, twice the velocity v of the mirror. The frequency observed from the moving mirror image was found to be

$$\nu' = \frac{\nu}{1 - 2v\cos\vartheta/c},$$

in accordance with (15). That the observed frequency of the moving image changes just as the observed frequency of a moving source can be understood readily (see Fig. 9). If a mirror moves towards the source with a velocity v perpendicular to the wave fronts, then subsequent wave fronts are reflected from diminishing distances. The times of travel of the wave fronts, from the source to the mirror and also from the mirror to the observer, decrease for subsequent wave fronts, and therefore the fronts arrive near the observer crowded together in the same fashion as if they had started originally from the moving image $S_1$ of S. The Doppler effect was observed later by many others. Stark observed the Doppler effect of the radiation emitted by atoms moving in a beam. He also confirmed relation (15).* The Doppler effect can now be considered to be a well-established effect, and it may be made use of for the determination of the velocity of moving sources. The velocity of fast moving atoms can conveniently be determined by measuring the frequencies of the light emitted by them.

Fig. 9. Scheme of Doppler effect produced by a moving mirror

* Quoted from Grimsehl: Lehrbuch d. Phys., Band III. p. 290.

37. In astronomy the velocity of motion of the components of double stars can be obtained with the help of the Doppler effect. From the observation of double stars evidence was obtained for the fact that the velocity of light is independent of the motion of the source. If we were to suppose that the velocity of light depended on the motion of the source, then we should have to expect that a component of a double star emits light which is propagated towards us with a velocity changing with the state of motion of the star. Because of the large distance the light has to travel we would observe the motion of the star in a distorted fashion. By analysing the observational data of the double star β-Aurigae, it was found that these are incompatible with the assumption that the velocity of light depends on that of the source.

E. THE PERPENDICULAR DOPPLER EFFECT

1. EXPERIMENTAL OBSERVATIONS

38. More recent experiments by Ives and Stillwell** in 1938 and later by Otting*** in 1939 showed that the frequencies of radiations emitted by very fast atoms are given by

$$\nu'' = \frac{\nu\sqrt{1 - v^2/c^2}}{1 - \dfrac{v}{c}\cos\vartheta}, \tag{16}$$

the observed frequencies $\nu''$ being a little smaller than the frequencies predicted from simple geometrical considerations and given by (15). It follows in particular from (16) that for $\vartheta = \pi/2$ we find

$$\nu'' = \nu\sqrt{1 - v^2/c^2} = \nu^*. \tag{17}$$

Relation (17) describes the so-called perpendicular Doppler effect. The frequency of the radiation emitted at right angles to the direction of motion of an atom is found to be $\nu^*$, which is less than the frequency $\nu$ of the radiation emitted by the same atom when at rest. Equation (16) can be written in accordance with (17) as

$$\nu''(\vartheta) = \frac{\nu^*}{1 - \dfrac{v}{c}\cos\vartheta}. \tag{18}$$

Relation (18) gives the frequencies of the radiation we expect to receive from an atom at various directions when the atom oscillates with a frequency

$$\nu^* = \nu\sqrt{1 - v^2/c^2}.$$

* J. Stark: Phys. Zs., 6, 893, 1905; Ann. d. Phys., 21, 401, 1906.
** H. E. Ives and G. R. Stillwell: J. Opt. Soc., 28, 215, 1938.
*** G. Otting: Phys. Zs., 40, 681, 1939.

Therefore (18) can be interpreted phenomenologically by supposing that the frequency of a radiating atom is slowed down from $\nu$ to $\nu^*$ if the atom is made to move with a velocity v.

39. Since the ordinary Doppler effect described by (15) can be observed on moving mirror images, the question arises whether relation (15) has to be replaced by (16) in the case of observing a source through a moving mirror. Thus the question may be raised whether a moving mirror image shows a perpendicular Doppler effect. The latter question can be answered in the negative as the result of the following consideration. On viewing a distant star through a mirror and tilting the mirror slightly, the image of the distant star will move with an enormous velocity. If relations (16) and (17) were to apply to a moving image, then the colour of a distant star as viewed through such a moving mirror ought to change radically. (It can easily happen that such an image moves with a velocity considerably exceeding that of light, and then the relations (16) and (17) lose their meaning.) Such effects obviously do not occur, and therefore we see that formulae (16) and (17) cannot be applied to moving mirror images.

2. PHYSICAL INTERPRETATION OF THE PERPENDICULAR DOPPLER EFFECT

40. Regarding the physical interpretation of the perpendicular Doppler effect we have to conclude that when an atom is accelerated so as to move with a velocity v, its inner rhythm is reduced by a factor $\sqrt{1 - v^2/c^2}$. As there exist misunderstandings in connection with this question we discuss the above statement still further. We emphasize that (15), i.e. the relation

$$\nu' = \frac{\nu}{1 - \dfrac{v}{c}\cos\vartheta},$$

is derived as the result of very simple kinematic considerations, and that these considerations must be accepted under any circumstances. Observing thus the frequencies of radiation $\nu''(\vartheta)$ received from a moving source we expect from our geometrical considerations that

$$\nu''(\vartheta)\left(1 - \frac{v}{c}\cos\vartheta\right) = \text{independent of } \vartheta = \text{frequency of moving atom} = \nu(v). \tag{19}$$

Observing $\nu''(\vartheta)$, the numerical values of the left hand expression of relation (19) can be determined for various values of $\vartheta$ by experiment. If the latter expression is indeed found to be independent of $\vartheta$, then we have proved by experiment our theory of the Doppler effect to be correct. We have to identify the empirically found value $\nu(v)$ with the frequency $\nu^*$ emitted by the moving atom. Carrying out the experiment for various values of v we may establish empirically the relation

$$\nu(v) = \nu(0)\sqrt{1 - \frac{v^2}{c^2}} = \nu^*.$$
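The test proposed in (19) can be written out as a small computation. In the sketch below the "observed" frequencies are generated from relation (16), so the check is a consistency test of the formulae rather than an analysis of data; the velocity and the function names are assumptions made for the illustration.

```python
import math


def observed_frequency(nu_rest, v, c, theta):
    """Relation (16): nu'' = nu sqrt(1 - v^2/c^2) / (1 - (v/c) cos(theta))."""
    nu_star = nu_rest * math.sqrt(1.0 - (v / c) ** 2)
    return nu_star / (1.0 - (v / c) * math.cos(theta))


def moving_atom_frequency(nu_observed, v, c, theta):
    """Left-hand side of (19): nu''(theta) * (1 - (v/c) cos(theta))."""
    return nu_observed * (1.0 - (v / c) * math.cos(theta))


c, v, nu_rest = 1.0, 0.6, 1.0   # units with c = 1; v chosen for illustration
angles = [math.radians(d) for d in (0, 45, 90, 135, 180)]
reduced = [
    moving_atom_frequency(observed_frequency(nu_rest, v, c, th), v, c, th)
    for th in angles
]
print(reduced)                                          # same value at every angle
print(observed_frequency(nu_rest, v, c, math.pi / 2))   # perpendicular direction
```

With real measurements the list `reduced` would be formed from the observed frequencies, and its independence of the angle is what the experiment has to establish.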

41. That the above interpretation of the Doppler effect is indeed correct, and that the factor $\sqrt{1 - v^2/c^2}$ is not caused by some "geometrical" phenomenon which we have overlooked, is further supported by the fact that a moving mirror image does not show the perpendicular Doppler effect. Indeed, the difference between a mirror image and a real atom is that the former has no inner mechanism; therefore its oscillations simply reflect those of the light source. A real atom, having an inner structure, may be affected physically when set in motion. The perpendicular Doppler effect observed on moving atoms indicates therefore that the process of acceleration indeed slows down the inner frequencies of atoms. The perpendicular Doppler effect shown by a moving atom is thus caused by the slowing down of its inner rhythm. The lack of this effect when observing the radiation of a mirror image is caused by the fact that a mirror image is not a physical system and it does not reduce its inner motion when set in motion.

We shall see further below that, taking together a number of phenomena, there is good reason to believe that the inner forces inside an atom and also inside other closed physical systems are always such that a change of velocity of the system as a whole produces a reduction of the inner rhythms.

3. CONSIDERATIONS IN CONNECTION WITH THE DOPPLER EFFECT

a. THE QUESTION OF THE MOVING OBSERVER

42. Consider an atom A emitting a radiation of frequency $\nu$ and an observer B, both moving on one straight line such that the atom moves to the left with a velocity v and the observer to the right with a velocity w (see Fig. 10). The coordinates of the atom and observer at a time t can thus be written

$$x_A(t) = a - vt, \qquad x_B(t) = b + wt. \tag{20}$$

The atom emits phase surfaces at times

$$t_k = t_0 + kT, \qquad T = \frac{1}{\nu}, \qquad k = 0, 1, 2, \ldots \tag{21}$$

from points with coordinates $x_A(t_k)$; the latter arrive at the observer at times $t'_k$ so that

$$x_A(t_k) + c(t'_k - t_k) = x_B(t'_k). \tag{22}$$

It follows from (20) and (22)

$$(c + v)t_k = (c - w)t'_k - l,$$

where $l = b - a$, and therefore with the help of (21)

$$t'_{k+1} - t'_k = \frac{c + v}{c - w}\,T;$$

writing $t'_{k+1} - t'_k = T' = 1/\nu'$ we have also

$$\nu' = \frac{c - w}{c + v}\,\nu, \tag{23}$$

where $\nu'$ is the frequency with which the phase planes arrive at the observer in B.

Fig. 10. x-t diagram of comparison of rates of moving clocks

b. ACOUSTICAL AND ELECTROMAGNETIC DOPPLER EFFECTS

43. The formula (23) is remarkable because it is unsymmetric with respect to v and w; thus replacing $v \to w$ and $w \to v$, i.e. interchanging the roles of the source and the observer, we obtain a changed value of $\nu'$. In particular, if we put v = 0, w = V, i.e. if we consider the case where the source is at rest and the observer moves, then we find

$$\nu' = \frac{c - V}{c}\,\nu,$$

while in the case when the source moves and the observer is at rest, i.e. if we put v = V, w = 0, we have

$$\nu' = \frac{c}{c + V}\,\nu.$$

It is a well-known acoustical phenomenon that the Doppler shift appearing in the case of the source moving with a velocity v towards the observer differs from the shift which appears if the observer moves with the same velocity v towards the source. It is not always realized that the relation (23) expressing this asymmetry applies to electromagnetic waves also. Indeed the formalism making use of retarded potentials leads to (23) in a straightforward way, and the reason for the asymmetry thus obtained is analogous to that in the case of sound waves.

44. Considering, however, real observations, the asymmetry of relation (23) disappears for the following reason. A real experiment can be described in a somewhat simplified manner as follows. Consider two atoms A and B moving in opposite directions on a straight line. The velocities of A and B may be v and w relative to a fixed point of the line.

The frequencies of the atoms thus moving are expected to be

$$\nu_A = \nu_0\sqrt{1 - v^2/c^2}, \qquad \nu_B = \nu_0\sqrt{1 - w^2/c^2}, \tag{24}$$

where $\nu_0$ is the common frequency of A and B at rest. The atom A emits radiation with a frequency $\nu_A$; the phase planes thus emitted reach the moving observer B with a frequency $\nu'_A$, and according to (23) we expect

$$\nu'_A = \frac{c - w}{c + v}\,\nu_A. \tag{25}$$

Dividing both sides of (25) by $\nu_B$ we obtain a quantity which we denote by Q, and find thus with the help of (24) and (25)

$$Q = \frac{\nu'_A}{\nu_B} = \left[\frac{c - v}{c + v}\cdot\frac{c - w}{c + w}\right]^{1/2}. \tag{26}$$

We note that the quantity Q is the quantity we can determine by experiment: it is the ratio of the frequency $\nu'_A$ falling upon B and the frequency $\nu_B$ of B itself. The expression (26) is symmetric with respect to v and w. Thus in the case of electromagnetic waves, unlike the case of sound, a real experiment does not permit us to distinguish whether the source or the observer is in motion. Interchanging the roles of A and B we find with the help of (26)

$$\nu'_B/\nu_A = \nu'_A/\nu_B = Q. \tag{27}$$

As already pointed out, the ratio Q can be measured directly by forming the ratio of two frequencies. It follows from (27) that this ratio is expected to have the same numerical value no matter whether we compare the two frequencies at the position of A or of B. However, the ratios $\nu'_A/\nu_B$ and $\nu'_B/\nu_A$ are determined by two different experiments. Thus whether or not (27) holds can be established in principle by experiment. If (27) is found experimentally to be correct, then the experiment can be taken to support the assumption that the frequencies of the atoms do change with velocity according to (24). Real experiments which effectively support relation (27) were carried out by Isaak, Champeney and Khan.* We shall come back to this question further below.

* G. R. Isaak et al., Phys. Letters 7, 241, 1963.

45. It is remarkable that with the help of the perpendicular Doppler effect relations (24) can be checked without knowing the numerical values of v and w explicitly. The quantity which can be measured directly is

$$Q^2 = \frac{c - v}{c + v} \cdot \frac{c - w}{c + w}. \tag{28}$$

So as to see the significance of the latter quantity more clearly, we may introduce a quantity V with the dimension of a velocity such that

$$Q^2 = \frac{c - V}{c + V}. \tag{29}$$

Comparing (28) and (29) one finds as the result of a simple calculation

$$V = \frac{v + w}{1 + vw/c^2}.$$

The velocity V is, apart from small terms, equal to the sum of the velocities v and w. From the point of view of the Doppler effect the quantity V characterizes the relative motion of observer and source. Provided relations (24) hold, the Doppler effect is characterized by the quantity V, which may be called the "relativistic relative velocity". We shall come back to this question in greater detail.
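The symmetry of Q and the expression for V can be checked numerically. The following is a minimal sketch, assuming illustrative velocities of 0.3 c and 0.5 c; the function names are hypothetical and are not taken from the text. It builds the measured ratio step by step from (24) and (25), interchanges the roles of A and B, and compares the result with (26) and (29).

```python
import math

C = 299_792_458.0  # velocity of light in m/s


def emitted_frequency(nu0, u, c=C):
    """Relation (24): frequency of an atom moving with velocity u."""
    return nu0 * math.sqrt(1.0 - (u / c) ** 2)


def measured_ratio(v_source, v_observer, nu0=1.0, c=C):
    """Ratio of received to own frequency, built from (24) and (25)."""
    nu_source = emitted_frequency(nu0, v_source, c)
    nu_observer = emitted_frequency(nu0, v_observer, c)
    # Relation (25): frequency with which the phase planes reach the observer.
    nu_received = nu_source * (c - v_observer) / (c + v_source)
    return nu_received / nu_observer


def q_closed_form(v, w, c=C):
    """Relation (26)."""
    return math.sqrt((c - v) / (c + v) * (c - w) / (c + w))


def relativistic_relative_velocity(v, w, c=C):
    """The velocity V obtained by comparing (28) and (29)."""
    return (v + w) / (1.0 + v * w / c ** 2)


# Illustrative (assumed) velocities of the atoms A and B.
v, w = 0.3 * C, 0.5 * C
q_ab = measured_ratio(v, w)  # A emits, B observes
q_ba = measured_ratio(w, v)  # B emits, A observes
V = relativistic_relative_velocity(v, w)

print(f"Q with A as source: {q_ab:.12f}")
print(f"Q with B as source: {q_ba:.12f}")
print(f"Q from (26):        {q_closed_form(v, w):.12f}")
print(f"V/c = {V / C:.6f}, (c - V)/(c + V) = {(C - V) / (C + V):.12f}, Q^2 = {q_ab ** 2:.12f}")
```

For these values V is somewhat smaller than v + w = 0.8 c, in line with the remark that V equals the sum of the velocities only apart from small terms.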

F. SOME RELATIVISTIC EFFECTS

46. We discuss now two effects which are not connected directly with the propagation of light, the results of which will, however, be of importance further below.

1. VARIATION OF DECAY TIME WITH VELOCITY*

47. As has been seen in 38, we are led to conclude that the frequency of oscillation of an atom is reduced if the atom is made to move with a velocity v. A similar effect was observed with unstable elementary particles. Direct observation of μ-mesons in cosmic radiation showed that the mean life of a μ-meson when it is brought to rest in an absorber is equal to $T_0 = 2.1983 \pm 0.0008$ μsec.**

* See for details e.g. L. Jánossy: Cosmic Rays. Clarendon Press, Oxford, 1950. 2nd ed.
** From Review of Particle Properties UCRL-8030.

The cosmic ray μ-mesons are formed in the higher layers of the atmosphere by primary cosmic rays. Supposing that the mesons move with velocities approximately equal to that of light, their mean range would be of the order

$$l_0 \approx c T_0 \approx 650\ \text{m}.$$

The experiments show that μ-mesons travel distances of many kilometers, i.e. distances much exceeding the above value. The behaviour of the μ-mesons can be explained readily by the fact that the mean life of a μ-meson travelling with a velocity v is given by

$$T = \frac{T_0}{\sqrt{1 - v^2/c^2}}. \tag{30}$$

Experiments of Rossi* and others have shown this relation to be correct. It is interesting that the cosmic ray experiments give evidence concerning (30) for velocities $v \approx c$, where the change of the decay time is considerable and the ratio $T/T_0$ is of the order 10 to 30. It must be noted, however, that in these experiments v could not be measured directly. The observed results were analysed using the relation between velocity and momentum. We shall discuss this question further below. Relation (30) was first established for the μ-mesons of cosmic rays and has since been proved to be correct for a number of other unstable elementary particles. Some of the evidence was obtained from observations of fast particles accelerated artificially.
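The orders of magnitude quoted above can be reproduced with a short calculation. The sketch below is only illustrative: it uses the value of $T_0$ given in the text, a rounded value of c, and hypothetical function names, and it evaluates relation (30) for the ratios $T/T_0$ = 10 and 30 mentioned above.

```python
import math

C = 2.998e8     # velocity of light in m/s (rounded)
T0 = 2.1983e-6  # mean life of the mu-meson at rest in s (value quoted in the text)


def mean_life(beta, t0=T0):
    """Relation (30), with beta = v/c."""
    return t0 / math.sqrt(1.0 - beta ** 2)


def mean_range(beta, t0=T0, c=C):
    """Mean distance travelled before decay, i.e. v times the mean life."""
    return beta * c * mean_life(beta, t0)


def beta_for_ratio(ratio):
    """The value of v/c for which T/T0 equals the given ratio."""
    return math.sqrt(1.0 - 1.0 / ratio ** 2)


print(f"range without change of decay time, c*T0 = {C * T0:.0f} m")
for ratio in (10.0, 30.0):
    beta = beta_for_ratio(ratio)
    print(f"T/T0 = {ratio:.0f}: v/c = {beta:.5f}, mean range = {mean_range(beta) / 1000.0:.1f} km")
```

With these inputs the mean range without any change of the decay time comes out at roughly 660 m, while the ratios 10 and 30 give mean ranges of several kilometers up to about 20 km, in agreement with the statement that the mesons travel many kilometers.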

2. CHANGE OF MASS WITH VELOCITY

48. Another effect which will be found relevant to our analysis is the change of the mass of a particle with velocity. From theoretical arguments Abraham** concluded that the mass of an electron should increase when the electron is accelerated; the increase of mass was derived by considering the action of an accelerated charge upon itself. The theory of Abraham was modified by Lorentz***, who obtained an expression somewhat different from that of Abraham. The experiments of Kaufmann**** (1901) proved that the mass of the electron does change with velocity; his measurements were, however, not sufficiently accurate to decide between the theoretical formulae of Lorentz and those of Abraham.

* B. Rossi and D. B. Hall: Phys. Rev., 59, 223, 1941; B. Rossi and others: Phys. Rev., 61, 675, 1942.
** M. Abraham: Ann. d. Phys., 10, 105, 1903; see also A. Sommerfeld: Atombau u. Spektrallinien, 1950. p. 313.
*** H. A. Lorentz: The Theory of Electrons. Leipzig, 1909.
**** W. Kaufmann: Ann. d. Phys., 19, 487, 1906.

49. From theoretical considerations it was supposed that the formulae of Lorentz and not those of Abraham are correct, and a large number of experiments were carried out later with the object of proving Lorentz's formulae to be correct. These experiments claimed to support the Lorentz formulae; however, as a detailed analysis showed, on the whole they did not have sufficient accuracy to decide the question experimentally (see the analysis given by Faragó and Jánossy*). The first experiment which proved convincingly that Lorentz's formula and not that of Abraham describes the change of mass of the electron correctly is that carried out by Rogers, Reynolds and Rogers.** The experiment was later improved by Staub*** who attained still greater accuracy.

50. Experiments to determine the change of the mass of protons with velocity were carried out by Zreliov, Tapkin and Faragó.**** In this investigation the velocity of the protons, which was 83 per cent of that of light, was determined directly with the help of Cerenkov radiation. The Lorentz formula at that velocity was found to be correct within the margin of experimental error, which was estimated not to exceed 0.1 per cent. Taken together, all the evidence supports the validity of the Lorentz formula for electrons and for protons. It seems reasonable to suppose that the formula has general validity.

* P. Faragó and L. Jánossy: Nuovo Cim., 5, 1411, 1957.
** Rogers, Reynolds and Rogers: Phys. Rev., 57, 379, 1940.
*** Staub and others, Helv. Phys. Acta, 36, 981, 1963.
**** V. P. Zreliov, A. A. Tapkin, P. Faragó: Soviet Physics, JETP 7, 384, 1958.

3. REMARK ON THE METHODS OF MEASURING THE MASS VELOCITY RELATION

51. The experiments for determining the change of mass with velocity are based on the observation of the orbits of charged particles in electric and magnetic fields. The analysis of an orbit may give information about the velocity and acceleration of the particle; so as to obtain information about the mass of the particle, dynamical considerations have to be introduced. For the analysis of the experiments it is supposed that the force F acting upon a particle of charge e and velocity v is given by the Lorentz formula

$$\mathbf{F} = e \left[ \mathbf{E} + \frac{1}{c} (\mathbf{v} \times \mathbf{B}) \right],$$

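where E and B are the electric and magnetic field strengths. Secondly it is supposed that the force produces a change of momentum, i.e.

$$\mathbf{F} = \frac{d\mathbf{p}}{dt}. \tag{31}$$

Supposing that

$$\mathbf{p} = m(v)\,\mathbf{v}, \qquad \text{with} \qquad m(v) = \frac{m_0}{\sqrt{1 - v^2/c^2}},$$

we can write in place of (31)

$$\mathbf{F} = \frac{d}{dt} \left( \frac{m_0 \mathbf{v}}{\sqrt{1 - v^2/c^2}} \right) = \frac{m_0 \dot{\mathbf{v}}}{\sqrt{1 - v^2/c^2}} + \frac{m_0 (\mathbf{v}\dot{\mathbf{v}})\,\mathbf{v}/c^2}{(1 - v^2/c^2)^{3/2}}.$$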

We may also write

$$\mathbf{F} = m_1 \dot{\mathbf{v}}_1 + m_2 \dot{\mathbf{v}}_2,$$

where

$$m_1 = \frac{m_0}{(1 - v^2/c^2)^{3/2}}, \qquad m_2 = \frac{m_0}{(1 - v^2/c^2)^{1/2}},$$

and

$$\dot{\mathbf{v}}_1 = (\mathbf{v}\dot{\mathbf{v}})\,\mathbf{v}/v^2, \qquad \dot{\mathbf{v}}_2 = \dot{\mathbf{v}} - \dot{\mathbf{v}}_1$$

are the components of the acceleration which are parallel and perpendicular to the velocity, respectively. We see thus that if we define the mass as the ratio of force and acceleration, then the mass thus defined depends on the angle between velocity and acceleration. The extreme cases are $m_1$, the mass which appears when the particle is accelerated in the direction of the velocity, and $m_2$, the mass which appears when the particle is accelerated in a direction perpendicular to v.

52. The real experiments can be divided into two types. In the experiment of Rogers and co-workers and also of Staub and co-workers particles are made to move along a circular orbit. So as to maintain a particle of velocity v on a circular orbit with radius R a radial force

$$F = m v^2 / R$$

is required. In the radial electric field $F = eE$, while in a homogeneous magnetic field B perpendicular to the orbit we have

$$F = eBv/c.$$

We may adjust E and B until our beam of charged particles moves along the same orbit whether it is under the influence of the electric field E or alternatively under the influence of the magnetic field B. From the equations of motion we find for the fields thus adjusted

$$eE = eBv/c = m v^2 / R. \tag{32}$$

With the help of (32) we can determine both m and v. We find

$$v = cE/B, \qquad m = \frac{e R B^2}{c^2 E}.$$
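The evaluation of (32) can be illustrated by a round-trip calculation. The following sketch assumes Gaussian units, as implied by the factor 1/c in the magnetic force, and a hypothetical test case (an electron with v = 0.7 c on an orbit of radius 10 cm); neither the numbers nor the function names come from the experiments cited. It first computes the fields which hold the particle on the orbit and then recovers v and m from those fields.

```python
import math

C = 2.998e10          # velocity of light in cm/s (Gaussian units)
E_CHARGE = 4.803e-10  # elementary charge in esu
M0 = 9.109e-28        # rest mass of the electron in g


def balancing_fields(m, v, R, e=E_CHARGE, c=C):
    """Fields E and B each of which holds a particle (m, v) on a circle of radius R."""
    E = m * v ** 2 / (e * R)  # from eE = m v^2 / R
    B = m * v * c / (e * R)   # from eBv/c = m v^2 / R
    return E, B


def velocity_and_mass(E, B, R, e=E_CHARGE, c=C):
    """Solve relation (32) for v and m."""
    v = c * E / B
    m = e * R * B ** 2 / (c ** 2 * E)
    return v, m


# Hypothetical test case, chosen only for illustration.
beta, R = 0.7, 10.0
m_true = M0 / math.sqrt(1.0 - beta ** 2)  # Lorentz formula for the mass
E, B = balancing_fields(m_true, beta * C, R)
v, m = velocity_and_mass(E, B, R)

print(f"recovered v/c  = {v / C:.4f}")
print(f"recovered m/m0 = {m / M0:.4f}")
```

Repeating the calculation for several velocities gives the dependence of m on v which is compared with the formulae of Lorentz and of Abraham.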

Comparing the values of m obtained for particles with different velocities, the change of mass with velocity can be obtained. In the experiment of Zreliov et al. the velocity v of the proton beam was determined directly with the help of Cerenkov radiation and thus only the magnetic deflection had to be measured to determine m.

53. We note that the above experiments give information only about the ratio e/m. In interpreting the various experiments it is always supposed that the charge e is independent of the velocity. We come back to this question in 280. While the assumption that the measure e of a charge remains unchanged if the charge is set in motion is partly a definition, it also contains an element which can be checked experimentally. Without going into details we mention that in the Stern-Gerlach experiment neutral atoms possessing magnetic moments are deflected by a strongly inhomogeneous magnetic field; the field strength of the deflecting field is also considerable. The atoms thus deflected must be very exactly neutral, as otherwise they would suffer very noticeable deflections by the magnetic field. From the fact that in the Stern-Gerlach experiment an atomic beam passing through a strongly inhomogeneous magnetic field can be satisfactorily focused, one concludes that the charge of the nucleus compensates exactly that of the electrons. Furthermore, since the electric charges inside the atoms are in motion, it must be concluded that the total electric charge is not affected by the internal motion of the electrons.

CHAPTER II

INVESTIGATIONS CONCERNING THE CARRIER OF ELECTROMAGNETIC WAVES

A. THE QUESTION OF THE ETHER

54. From Maxwell's theory it follows that light in particular, and all electromagnetic action in general, is propagated with a velocity c = c', where c' is the critical velocity. We shall analyse Maxwell's equations in more detail in Chapter VIII and show there that the above conclusion is indeed an integral part of Maxwell's theory. The question cannot be avoided: relative to what are electromagnetic waves propagated with the velocity c? A simple answer to this question could be obtained by claiming that light is propagated with the velocity c relative to its source. This assumption, however, contradicts the well-established theory of Maxwell and seems also to be contradicted directly by experiment. The assumption, sometimes referred to as the "ballistic theory of light", must therefore be rejected. An electromagnetic perturbation, once it has left its source, is thus propagated with a velocity c independent of how the perturbation has come about. The only reasonable interpretation of this is to assume that the perturbation moves with a velocity c relative to its carrier. The carrier may be called, using Maxwell's terminology, the ether. In accord with the ideas of Maxwell we shall also assume that light is propagated with a velocity c relative to the ether.

55. So as to avoid misconceptions we wish to emphasize that we regard the ether merely as the carrier of electromagnetic waves and possibly of the waves associated with other fields and of elementary particles. In the last century a number of mechanical models were proposed so as to explain the properties of the ether by rather artificial mechanisms. Such models are meaningless, as there is no reason why the ether should have properties like those of, e.g., solids consisting of atoms and molecules. On the contrary one might suppose that the properties of atoms, molecules and solids are ultimately determined by the properties of the ether.

We think of the ether more or less in the manner discussed by Einstein in a little-known article.*

* A. Einstein: Über den Äther. Verh. d. Schweizer. Nat. Ges., 105, Teil II, 85-93, 1924.

56. Einstein's polemic against the ether concerned mainly the assumption that the ether is at "absolute rest". Thus Einstein denied the existence of a system of reference $K_0$ which is at "absolute rest". We think that the assumption that electromagnetic waves possess a carrier has nothing to do with the question of absolute rest. The concept of "absolute rest" is a metaphysical concept which must be rejected. However, the concept of the ether as the carrier of electromagnetic and other phenomena is quite a different one.

57. Maxwell thus supposed the carrier of electromagnetic waves to be the ether. He supposed electromagnetic waves to be perturbations of the ether propagated in a way somewhat similar to that in which sound waves are propagated in air. Using a terminology similar to that introduced by Einstein we may denote by $K_0$ a system of coordinates which we suppose to be at rest relative to the ether, i.e. at rest relative to the carrier of the electromagnetic waves. As explained above, we do not suppose the system $K_0$ to be that of "absolute rest". Whether or not the ether, i.e. the carrier of electromagnetic waves, is at rest or even at "absolute rest" is a question which does not arise here and certainly has no significance in relation to our problems.

"Der Äther der allgemeinen Relativitätstheorie unterscheidet sich also von demjenigen der klassischen Mechanik bezw. der speziellen Relativitätstheorie dadurch, dass er nicht 'absolut', sondern in seinen örtlich variablen Eigenschaften durch die ponderable Materie bestimmt ist ... Dass es in der allgemeinen Relativitätstheorie keine bevorzugten, mit der Metrik eindeutig verknüpften raumzeitlichen Koordinaten gibt, ist mehr für die mathematische Form dieser Theorie als für ihren physikalischen Gehalt charakteristisch."

"Aber selbst wenn diese Möglichkeiten zu wirklichen Theorien heranreifen, werden wir des Äthers, d. h. des mit physikalischen Eigenschaften ausgestatteten Kontinuums, in der theoretischen Physik nicht entbehren können; denn die allgemeine Relativitätstheorie, an deren grundsätzlichen Gesichtspunkten die Physiker wohl stets festhalten werden, schliesst eine unvermittelte Fernwirkung aus: jede Nahewirkungs-Theorie aber setzt kontinuierliche Felder voraus, also auch die Existenz eines 'Äthers'."

The translation of this remarkable statement is as follows: "The ether of the general theory of relativity differs from that of classical mechanics or from that of the special theory of relativity in so far as it is not 'absolute' but its spatial distribution is determined by that of matter. The fact that, in the framework of the general theory of relativity, there are no distinguished space-time representations connected in an unambiguous manner with the metric is rather a characteristic of the mathematical methods of the theory than a characteristic of its physical contents."

"However, even if these possibilities developed into a real theory we shall not be able to dispense, in the field of theoretical physics, with the ether, i.e. a medium which possesses physical properties; indeed, the general theory of relativity (to the principles of which physicists will probably always adhere) excludes any direct distant action. Every theory based on close action supposes the existence of continuous fields; thus it supposes also the existence of an 'ether'." (My own translation.)


Furthermore, for our considerations it is also immaterial whether or not various parts of the ether move relative to each other. It seems quite plausible that, considered on a cosmic scale, distant parts of the ether are streaming with various velocities and thus the system $K_0$ we consider has only local significance. $K_0$ is supposed to float together with the ether. We may assume that in a certain vicinity of the origin of $K_0$ the ether has negligible velocity relative to $K_0$.

B. EXPERIMENTAL INVESTIGATIONS

58. The question arises whether it is possible to determine experimentally the state of motion of the ether in some definite region. In particular the question arises whether it is possible to ascertain the state of motion of the Earth relative to the surrounding ether. To answer the question, we have to consider separately the effects of the rotation of the Earth and of its translational motion relative to the ether.

1. ROTATION OF THE EARTH

59. The fact that the Earth is rotating around its axis can be seen from the apparent motion of the stars in the sky. The rotation can also be observed by mechanical experiments carried out on the surface of the Earth, e.g. with the help of Foucault's pendulum, or by observing the motion of a fast rotating gyroscope. It is interesting that the rotation of the Earth can also be observed by optical experiments. The experiment in question is the extended form of the experiment of Sagnac (1913), i.e. the experiment of Michelson and Gale (1925). We give a description of both experiments. We note that if we speak of the rotation of the Earth, we imply rotation relative to the carrier of electromagnetic waves, i.e. rotation relative to the ether.

2. THE SAGNAC EXPERIMENT*

60. Before describing the real experiment we describe a schematic version of it which will elucidate its essential features. Consider a disc of radius R rotating with an angular velocity ω around its axis. Suppose a large number of mirrors arranged on its periphery in such a way that a light signal starting, say, from a point A of the periphery is guided along a path very nearly coinciding with the edge of the disc.

* G. Sagnac, Compt. Rend., 157, 708, 1913; J. de Phys., 5, 177, 1914.

If the disc is at rest, a signal starting at the time t = 0 from a point A on the periphery arrives back at A at the time $T = 2\pi R/c$. If, however, the disc is rotating with an angular velocity ω and the light signal is moving in the direction of rotation, it will reach at the time $T = 2\pi R/c$ a point $A_0$ located in the place which A had left at t = 0. The signal has to catch up with the point A, which is moving away, and A will be reached by the signal at a later time $T^+$ such that

$$c T^+ = 2\pi R + R\omega T^+,$$

therefore

$$T^+ = \frac{2\pi R}{c - R\omega} > T.$$

$T^+$ could also have been obtained by supposing that the light signal moves relative to the edge of the disc with a velocity

$$c^+ = c - R\omega. \tag{1}$$

We emphasize, however, that we have made no assumption about the velocity of the beam relative to the disc but have simply calculated the time the signal starting from A needs to catch up again with the point A moving away. If the light signal moves in the opposite direction it reaches A sooner than at t = T, as the point A then moves towards the signal. In this case we find for the time at which the signal reaches A

$$T^- = \frac{2\pi R}{c + R\omega} < T.$$

The latter result could be obtained directly if we were to assume that the signal moves with the velocity

$$c^- = c + R\omega \tag{2}$$

relative to the edge of the disc. Here again we note that (2), like (1), is not an assumption but follows from the consideration of the time of flight of a signal. The difference in the times needed to circle around the disc in opposite directions is thus

$$T^+ - T^- = 2\pi R \left( \frac{1}{c - R\omega} - \frac{1}{c + R\omega} \right) = \frac{4\pi R^2 \omega}{c^2 - R^2\omega^2} \approx 4S\omega/c^2,$$

where $S = \pi R^2$ is the area of the disc circled round by the beams.

61. Sagnac carried out his experiment with an arrangement reproduced schematically in Fig. 11. A light source, mirrors and a telescope are mounted

on a disc which can be made to rotate. The beam leaving the source S is split into two coherent components by the semitransparent mirror SM. The components are led around a square path with the help of the mirrors $M_1$, $M_2$ and $M_3$. After having circled round in opposite directions the beams meet on SM and split again. The components of the two returning beams which are moving back towards the source are led with the help of a second semitransparent mirror SM' into the telescope T, where an interference pattern is produced. Just as in the case of the Michelson interferometer (see 31), the interference pattern can be used to determine the difference of the times of flight of the beams circling round SM, $M_1$, $M_2$, $M_3$ in opposite directions.

Fig. 11. Scheme of the Sagnac experiment

In the original experiment carried out by Sagnac and that repeated later more carefully by Pogány* (1926) the phase difference between the two beams was observed when the arrangement was at rest. The arrangement including light source and telescope was then made to rotate with an angular velocity ω and a shift of fringes of magnitude

$$\Delta\lambda = 4S\omega/c$$

was found. S is the area which the beams are made to circle round. This shift is exactly what is expected from the calculations.

* B. Pogány: Ann. d. Phys., 80, 217, 1926; 85, 244, 1928; Naturwiss., 15, 177, 1927.
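The circulation times of the schematic disc experiment can be evaluated numerically. The following is a minimal sketch under assumed values (a disc of radius 0.5 m making two revolutions per second); these numbers and the function names are hypothetical and do not describe the apparatus of Sagnac or of Pogány. It compares the exact time difference with the approximation $4S\omega/c^2$ and gives the corresponding path difference $4S\omega/c$.

```python
import math

C = 299_792_458.0  # velocity of light in m/s


def circulation_times(R, omega, c=C):
    """Times T+ and T- for a signal going round with and against the rotation."""
    t_plus = 2.0 * math.pi * R / (c - R * omega)
    t_minus = 2.0 * math.pi * R / (c + R * omega)
    return t_plus, t_minus


def time_difference_exact(R, omega, c=C):
    """T+ - T- in closed form, 4 pi R^2 omega / (c^2 - R^2 omega^2)."""
    return 4.0 * math.pi * R ** 2 * omega / (c ** 2 - (R * omega) ** 2)


def time_difference_approx(S, omega, c=C):
    """Approximation 4 S omega / c^2, valid for R*omega much smaller than c."""
    return 4.0 * S * omega / c ** 2


def path_difference(S, omega, c=C):
    """Path difference 4 S omega / c belonging to the time difference."""
    return 4.0 * S * omega / c


# Assumed values, for illustration only.
R = 0.5                      # radius of the disc in m
omega = 2.0 * math.pi * 2.0  # two revolutions per second, in rad/s
S = math.pi * R ** 2         # area circled round by the beams

t_plus, t_minus = circulation_times(R, omega)
t_rest = 2.0 * math.pi * R / C

print(f"T  = {t_rest:.6e} s")
print(f"T+ - T- (exact)         = {time_difference_exact(R, omega):.6e} s")
print(f"T+ - T- (approximation) = {time_difference_approx(S, omega):.6e} s")
print(f"path difference         = {path_difference(S, omega):.6e} m")
```

The closed form is used for the exact difference because subtracting the two nearly equal times directly loses several significant figures in floating-point arithmetic.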

3. THE EXPERIMENT OF MICHELSON AND GALE*

62. The experiment of Sagnac was developed further by Michelson and Gale who succeeded in observing the effect of the rotation of the Earth with the help of an interferometer of the Sagnac type. The angular velocity of the Earth is about $0.8 \cdot 10^{-4}\ \text{sec}^{-1}$; the distances between the mirrors were of the order of 1 km, thus the effect to be expected was

$$\Delta\lambda \approx 10\,000\ \text{Å}$$

and in the experiment a shift of a fraction of a fringe was found. It must be noted that the experiment is carried out on the rotating Earth; it is impossible to "stop" the Earth while adjusting the arrangement. The required adjustment is, however, rendered possible by the fact that the effect of rotation increases with increasing area S. In particular, for a return beam the encircled area is zero and the effect of rotation is not felt in this case. One might in principle adjust the lengths of the sides of the square path with the help of return signals and measure the fringe shift obtained with beams circling round the square path, the sides of which have thus been adjusted. In the actual experiment Michelson and Gale compared interferences obtained with beams circling round a smaller rectangular and a larger square area. From the difference of the observed fringe shifts they could determine the angular velocity of the system as a whole. The Michelson-Gale experiment can be repeated by making use of lasers. The fringe shift thus obtained can be observed more easily because the line width of the laser beam is extremely small.
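The order of magnitude quoted for the expected effect follows from the expression $4S\omega/c$. The sketch below is only an estimate under the round figures of the text (angular velocity $0.8 \cdot 10^{-4}\ \text{sec}^{-1}$, mirrors about 1 km apart, so that the encircled area is taken as that of a square of side 1 km); the function name is hypothetical. It also shows that a return beam, which encircles no area, is not affected.

```python
C = 2.998e8            # velocity of light in m/s (rounded)
OMEGA_EARTH = 0.8e-4   # angular velocity of the Earth in 1/s, as quoted in the text
METRE_TO_ANGSTROM = 1.0e10


def path_difference(area, omega=OMEGA_EARTH, c=C):
    """Path difference 4 S omega / c for beams circling round the area S."""
    return 4.0 * area * omega / c


side = 1.0e3  # assumed side of the square path in m
cases = (
    ("return beam (area zero)", 0.0),
    ("square of side 1 km", side ** 2),
)
for label, area in cases:
    shift = path_difference(area) * METRE_TO_ANGSTROM
    print(f"{label}: path difference = {shift:.0f} angstrom")
```

With these round inputs the estimate comes out slightly above 10 000 Å, which agrees with the order of magnitude stated above.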

C. TRANSLATIONAL MOTION RELATIVE TO THE ETHER

63. Unlike the rotation of the Earth its translational motion relative to the ether cannot be observed by mechanical experiments. The fact that the translational motion of a system cannot be observed was already recognized by Galilei.** We quote an interesting passage:

"For a final indication of the nullity of the experiments brought forth, this seems to me the place to show you a way to test them all very easily. Shut yourself up with some friend in the main cabin below decks on some large ship, and have with you there some flies, butterflies, and other small flying animals. Have a large bowl of water with some fish in it; hang up a bottle that empties drop by drop into a wide vessel beneath it. With the ship standing still, observe carefully how the little animals fly with equal speed to all sides of the cabin. The fish swim indifferently in all directions; the drops fall into the vessel beneath; and, in throwing something to your friend, you need throw it no more strongly in one direction than another, the distances being equal; jumping with your feet together, you pass equal spaces in every direction. When you have observed all these things carefully (though there is no doubt that when the ship is standing still everything must happen in this way), have the ship proceed with any speed you like, so long as the motion is uniform and not fluctuating this way and that. You will discover not the least change in all the effects named, nor could you tell from any of them whether the ship was moving or standing still. In jumping, you will pass on the floor the same spaces as before, nor will you make larger jumps toward the stern than toward the prow, even though the ship is moving quite rapidly, despite the fact that during the time that you are in the air the floor under you will be going in a direction opposite to your jump. In throwing something to your companion, you will need no more force to get it to him whether he is in the direction of the bow or the stern, with yourself situated opposite. The droplets will fall as before into the vessel beneath without dropping toward the stern, although while the drops are in the air the ship runs many spans. The fish in their water will swim toward the front of their bowl with no more effort than toward the back, and will go with equal ease to bait placed anywhere around the edges of the bowl. Finally the butterflies and flies will continue their flights indifferently toward every side, nor will it ever happen that they are concentrated toward the stern, as if tired out from keeping up with the course of the ship, from which they will have been separated during long intervals by keeping themselves in the air. And if smoke is made by burning some incense, it will be seen going up in the form of a little cloud, remaining still and moving no more toward one side than the other. The cause of all these correspondences of effects is the fact that the ship's motion is common to all the things contained in it, and to the air also."

* A. A. Michelson and H. G. Gale: Astrophys. J., 61, 140, 1925.
** R. J. Seeger, Galileo Galilei, his life and his works. Pergamon Press, Oxford, 1966, pp. 236-37.

At first sight it may appear as if the translational motion of the Earth could be observed by optical experiments. Although in reality this cannot be done, it is necessary to analyse the question in greater detail. The experiments of Fizeau and Foucault determined the velocity of light relative to the solid Earth. The question arises whether or not the motion of the Earth relative to the ether affects the results of the experiments. The question is not a trivial one, as can be seen from the following argument.

64. Consider a signal of light to be emitted at the time t = 0 from a point $A_0$ which we suppose to be at rest relative to the ether; the signal will reach a point $B_0$ which is also at rest and is situated at a distance l from $A_0$ at a time

$$T = l/c.$$

Consider further two points A and B both moving with a velocity v in the direction A → B such that at the time t = 0 A coincides with $A_0$ and B with $B_0$. Then the time $T^+$ which is needed for the signal emitted from A to reach B differs from T. Indeed, supposing the points A and B to move along the x-axis of the system of reference $K_0$, their respective coordinates can be written as

$$x_A(t) = vt, \qquad x_B(t) = vt + l, \qquad v > 0. \tag{3}$$

The signal emitted at t = 0 from A will have at the time t a coordinate

$$x_s(t) = ct. \tag{4}$$

When writing down (4) we remember that the propagation of light is independent of the motion of the source; therefore (4) gives the motion of the signal no matter whether it was emitted from A or from $A_0$. (We may e.g. suppose that when A approaches $A_0$ a spark takes place between A and $A_0$ and the light of the spark gives the signal.) The signal reaches at the time t = T = l/c the point $B_0$ but it does not reach B, since B has moved away from $B_0$ during the time T. The signal reaches B at a time $t = T^+ > T$, where $T^+$ satisfies the condition

$$x_s(T^+) = x_B(T^+). \tag{5}$$

Thus introducing into (5) the expressions (3) and (4) we find

$$T^+ = l/(c - v). \tag{6}$$

If we consider a signal moving from B towards A, then we find with the help of a similar argument that the time of travel $T^-$ from B to A is equal to

$$T^- = l/(c + v). \tag{7}$$

We see thus that $T^+ > T > T^-$; therefore the time of travel between the points A and B is affected by the motion of these points relative to the ether.

65. Relations (6) and (7) can also be interpreted as follows. Consider a number of points C, D, ... which all, like A and B, move with the same constant velocity v relative to $K_0$. The points A, B, C, D, ... define a system of reference K' relative to which all the above points are at rest. We can define the velocity of light relative to K' in the directions A → B and B → A respectively as

$$c^+ = l/T^+, \qquad c^- = l/T^-. \tag{8}$$

Thus we find from (6), (7) and (8)

$$c^+ = c - v, \qquad c^- = c + v.$$

Relations (6) and (7) are based on a purely phenomenological consideration. The argument leading to (6) is the same as if we were asked the following question: a dog runs with the velocity c after a hare which runs with a velocity v < c; how long will it take the dog to reach the hare if the initial distance is l? When discussing this problem it is quite immaterial what runs after what: whether a dog is catching up with a hare or a light signal is catching up with the moving Earth. Furthermore it is evident that if the hare loses its nerve and starts to run towards the dog instead of running away, the dog will catch the hare more quickly than if it were running away from him, thus $T^+ > T^-$ in any case. The above argument is supported, if support is at all needed, by the Sagnac experiment, where the delay is directly observed with which the light going round a closed path reaches the semitransparent mirror moving away from it.
1. PROPAGATION OF A SPHERICAL SIGNAL

66. Formulating the problem a little more generally, we may state that a light flash emitted at a time t = 0 will expand and it will be found at a time t distributed on a sphere with points

$$\mathbf{r}_s(t)^2 = c^2 t^2. \tag{9}$$

Considering a point B moving with a velocity v, the coordinate vector of which is given as

$$\mathbf{r}_B(t) = \mathbf{l} + \mathbf{v}t, \tag{10}$$

we find that the flash emitted at t = 0 from a point A, the coordinate vector of which at t = 0 is given by

$$\mathbf{r}_A(0) = 0,$$

will reach B at the time $T^+$ such that

$$\mathbf{r}_s(T^+)^2 = \mathbf{r}_B(T^+)^2$$

or with the help of (9) and (10)

$$(\mathbf{l} + \mathbf{v}T^+)^2 - c^2 T^{+2} = 0. \tag{11}$$

From (11) it follows

$$T^+ = \frac{l}{c} \cdot \frac{\sqrt{1 - \dfrac{v^2}{c^2}\sin^2\vartheta} + \dfrac{v}{c}\cos\vartheta}{1 - \dfrac{v^2}{c^2}}, \tag{12}$$

where we have written $\mathbf{v}\mathbf{l} = vl\cos\vartheta$; thus ϑ is the angle between l and v. The time of travel $T^-$ of a signal from B to A is obtained by replacing ϑ by π − ϑ; thus we find

$$T^- = \frac{l}{c} \cdot \frac{\sqrt{1 - \dfrac{v^2}{c^2}\sin^2\vartheta} - \dfrac{v}{c}\cos\vartheta}{1 - \dfrac{v^2}{c^2}}. \tag{13}$$

In place of (12) we can also write

$$T^+ = \frac{l}{\sqrt{c^2 - v^2\sin^2\vartheta} - v\cos\vartheta}. \tag{14}$$
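The step from (11) to (12) and the equivalence of (12) and (14) can be checked numerically. The following sketch assumes units in which c = 1 and an illustrative velocity v = 0.3 c; the function names are hypothetical. It solves the quadratic equation (11) for its positive root and compares the result with the two closed expressions for several angles.

```python
import math


def travel_time_from_quadratic(l, v, theta, c):
    """Positive root of (11): (c^2 - v^2) T^2 - 2 l v cos(theta) T - l^2 = 0."""
    a = c ** 2 - v ** 2
    b = -2.0 * l * v * math.cos(theta)
    k = -l ** 2
    return (-b + math.sqrt(b ** 2 - 4.0 * a * k)) / (2.0 * a)


def travel_time_12(l, v, theta, c):
    """Relation (12)."""
    beta = v / c
    root = math.sqrt(1.0 - (beta * math.sin(theta)) ** 2)
    return (l / c) * (root + beta * math.cos(theta)) / (1.0 - beta ** 2)


def travel_time_14(l, v, theta, c):
    """Relation (14)."""
    return l / (math.sqrt(c ** 2 - (v * math.sin(theta)) ** 2) - v * math.cos(theta))


# Assumed values: units with c = 1, distance l = 1, velocity v = 0.3 c.
c, v, l = 1.0, 0.3, 1.0
for degrees in (0, 45, 90, 135, 180):
    theta = math.radians(degrees)
    print(
        f"angle {degrees:3d} deg: "
        f"(11) {travel_time_from_quadratic(l, v, theta, c):.6f}  "
        f"(12) {travel_time_12(l, v, theta, c):.6f}  "
        f"(14) {travel_time_14(l, v, theta, c):.6f}"
    )
```

For ϑ = 0 the three expressions reduce to l/(c − v) and for ϑ = π to l/(c + v), i.e. to relations (6) and (7) of the one-dimensional case.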

2. PROPAGATION OF LIGHT RELATIVE TO A MOVING SYSTEM OF REFERENCE

67. The time of flight from A to B depends on the velocity v of B relative to the carrier of light, but it does not depend on the state of motion of A. Supposing that A, like B, moves with a velocity v, we can take both points to be at rest relative to a system of coordinates K which is moving with a velocity v relative to $K_0$. We may define the velocity of light relative to K in a direction inclined by an angle ϑ to v as

$$c(\vartheta) = l/T^+.$$

Thus we obtain from (14)

$$c(\vartheta) = \sqrt{c^2 - v^2 \sin^2\vartheta} - v\cos\vartheta. \tag{15}$$

According to this definition light is propagated anisotropically relative to K. It must be emphasized that (15) is obtained merely as the result of a definition. The time of travel $T^+$ from A to B follows, however, from a purely phenomenological argument of the type of the "dog-hare" problem.

68. Returning to the discussion of the experiments of Foucault and of Fizeau, we note that in both experiments the time of to and fro travel between two points is observed. Supposing the arrangement to move with

a velocity v relative to the ether, we find thus for the time of to and fro travel with the help of (12) and (13) 21 + T~ = — • — c

T(&) = T

+

= 1

.

(16)

v

If $v \ll c$ then we can develop in powers of $v^2/c^2$ and find in a good approximation

$$T(\vartheta) = \frac{2l}{c}\left[1 + \frac{v^2}{c^2}\left(1 - \frac{1}{2}\sin^2\vartheta\right)\right]. \tag{17}$$

From the experiment $T(\vartheta)$ is determined; the velocity of light can be taken as the solution of (16) for $c$. Let us write

$$c_0 = 2l/T(\vartheta),$$

where $c_0$ is the value of the velocity of propagation which is obtained from the Fizeau experiment without attempting to correct the value for the motion of the arrangement. We find from (16)

$$c = c_0\left[\frac{1}{2} + \frac{v^2}{c_0^2} + \frac{1}{2}\sqrt{1 + 4\frac{v^2}{c_0^2}\cos^2\vartheta}\,\right]^{1/2},$$

or with the help of the approximate formula (17)

$$c = c_0\left[1 + \frac{v^2}{c^2}\left(1 - \frac{1}{2}\sin^2\vartheta\right)\right] + \text{terms of higher order}. \tag{18}$$
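The size of the correction (18) can be estimated numerically. The sketch below compares the exact relative difference between $c$ and $c_0 = 2l/T(\vartheta)$, taken from (16), with the second-order term of (18). The function names are hypothetical. The value 30 km/sec for $v$ is the orbital velocity quoted in the text, and $c = 299\,792.458$ km/sec is the standard value, which is not taken from the text.

```python
import math


def apparent_speed(c, v, theta):
    """c0 = 2l/T(theta), using the exact round-trip time (16)."""
    beta2 = (v / c)**2
    return c * (1.0 - beta2) / math.sqrt(1.0 - beta2 * math.sin(theta)**2)


def relative_correction_exact(c, v, theta):
    """Exact relative difference (c - c0)/c0."""
    return c / apparent_speed(c, v, theta) - 1.0


def relative_correction_eq18(c, v, theta):
    """Second-order term of (18): (v^2/c^2) (1 - sin^2(theta)/2)."""
    return (v / c)**2 * (1.0 - 0.5 * math.sin(theta)**2)


if __name__ == "__main__":
    c = 299792.458   # km/sec
    v = 30.0         # km/sec, orbital velocity of the Earth
    for deg in (0, 45, 90):
        theta = math.radians(deg)
        print(deg,
              relative_correction_exact(c, v, theta),
              relative_correction_eq18(c, v, theta))
```

For these values the correction comes out at the order of $10^{-8}$ for every orientation, in line with the estimate made in the text.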

69. The relative difference between $c$ and $c_0$ is of the order of $v^2/c^2$. If we were to take $v$ of the order of 30 km/sec, i.e. supposing that the Sun is about at rest relative to the ether, then the relative difference between $c$ and $c_0$ would be of the order of $10^{-8}$; this difference is thus smaller than the experimental error of the most precise measurements. Thus from the practical point of view the motion of the Earth can be neglected when evaluating the results of the Fizeau experiment.

70. Although the correction (18) is of no importance for the practical determination of $c$, it remains an interesting question of principle whether or not the translational motion of the Earth relative to the ether can be observed through an experiment. This question was already raised by Maxwell.* In his paper Maxwell considers a type of Michelson-Morley experiment but remarks that the expected effect in such an experiment is too small to be observable. He suggested an experiment with the help of the satellites of Jupiter. It is interesting that in fact the Michelson experiment became practicable, while the Jupiter experiment seems to be unsuitable from the practical point of view.

* J. C. Maxwell: Nature, 21, 314, 1879-1880.

3. THE MICHELSON-MORLEY EXPERIMENT

71. It follows from (16) or (17) that in an arrangement of the type used by Fizeau the time $T(\vartheta)$ of the to and fro journey of the light signal should vary with the angle $\vartheta$ subtended by the direction of the light path with the velocity vector $\mathbf{v}$ with which the arrangement moves relative to the ether. The difference between $c$ and $c_0$ is too small to be observable directly, but it can be observed using an interferometric method. Indeed, having an interferometer with arms $l_1$ and $l_2$, and adjusting $l_1$ parallel and $l_2$ perpendicular to $\mathbf{v}$, we calculate with the help of (16) a difference in the running times

$$\Delta T = \frac{2l_1}{c}\,\frac{1}{1 - v^2/c^2} - \frac{2l_2}{c}\,\frac{1}{\sqrt{1 - v^2/c^2}}. \tag{19}$$

Developing in powers of $v^2/c^2$ and neglecting terms of higher orders, we find also

$$\Delta T = \frac{2(l_1 - l_2)}{c}\left(1 + \frac{v^2}{c^2}\right) + \frac{l_2}{c}\,\frac{v^2}{c^2}. \tag{20}$$

If we choose e.g. the interferometer so that $l_1 = l_2 = l$, i.e. the arms are of equal length, then we obtain

$$\Delta T = \frac{l}{c}\,\frac{v^2}{c^2},$$

and determining $\Delta T$ from the pattern, we can apparently determine $v$. In reality, however, it is impossible to ascertain whether or not the arms of an interferometer are really of equal lengths. We shall discuss further below (in chapt. III) the questions of principle involved in measuring the lengths of the arms. Here we note only that the most accurate practical method of measuring lengths is the interferometric method. In fact the international standards of length are determined with the help of interferometric methods. Thus determining $\Delta T$ from the interferometric pattern, we obtain a relation in which both $l_1 - l_2$ and $v^2/c^2$ play roles; thus we cannot ascertain whether or not the arms are of equal lengths unless we make a supposition about the value of $v$.

72. Information as to the value of $v$ could be expected to become available if we turn round the interferometer. Indeed, adjusting the arms until we find $\Delta T = 0$, we expect in accord with (19) that

$$l_1 = l_2\sqrt{1 - v^2/c^2}$$

when the interferometer is adjusted so as to give $\Delta T = 0$. Turning the interferometer carefully through an angle of 90°, we expect that the shorter arm of length $l_1$ is turned perpendicular to $\mathbf{v}$ while the longer arm is turned to be parallel to $\mathbf{v}$; therefore in the turned round position the difference of the arms increases the difference of running times instead of compensating it. We thus expect in the turned round position a pattern corresponding to a difference of times of flight

$$\Delta T(90^\circ) = \frac{2l}{c}\,\frac{v^2}{c^2}.$$

We expect a difference of times of flight

$$\Delta T(\vartheta) = \frac{l}{c}\,\frac{v^2}{c^2}\,(1 - \cos 2\vartheta) \tag{21}$$

when the first arm is at an angle $\vartheta$, the second at an angle $\vartheta + \pi/2$, to $\mathbf{v}$. Thus turning the interferometer round carefully one expects the pattern to change continuously in accord with (21). In the actual experiments of Michelson and Morley* and others, it was found that the fringes do not change when the interferometer is turned, although the conditions were chosen so that a small fraction of the shifts predicted by (21) could have been observed.

73. The experiment of Michelson and Morley was repeated, among others, by Kennedy and Thorndike.** The latter authors used an interferometer such that the lengths $l_1$ and $l_2$ of the arms differed considerably. The arrangement thus modified gave the same negative result as the experiment in the original form; no fringe shift was observed while the interferometer was turned round.
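The fringe shift expected for a rigid interferometer can be illustrated numerically. The sketch below uses the exact round-trip time (16) for two rigid arms whose lengths are balanced so that the difference of running times vanishes at $\vartheta = 0$, and compares the result with the second-order expression (21). The function names are hypothetical, and the sign convention (time along the second arm minus time along the first) is an assumption made here so that the result is positive, as in (21).

```python
import math


def round_trip_time(l, v, c, theta):
    """To and fro time T(theta) along a rigid arm of length l, from (16)."""
    beta2 = (v / c)**2
    return (2.0 * l / c) * math.sqrt(1.0 - beta2 * math.sin(theta)**2) / (1.0 - beta2)


def delta_t_rigid(l2, v, c, theta):
    """T_2(theta + 90 deg) - T_1(theta) for rigid arms balanced at theta = 0."""
    l1 = l2 * math.sqrt(1.0 - (v / c)**2)   # balance condition following from (19)
    return (round_trip_time(l2, v, c, theta + math.pi / 2)
            - round_trip_time(l1, v, c, theta))


def delta_t_eq21(l, v, c, theta):
    """Second-order expression (21)."""
    return (l / c) * (v / c)**2 * (1.0 - math.cos(2.0 * theta))


if __name__ == "__main__":
    # Illustrative values only: c = 1, v = 0.01 c, arm length 1.
    for deg in (0, 30, 45, 60, 90):
        theta = math.radians(deg)
        print(deg, delta_t_rigid(1.0, 0.01, 1.0, theta), delta_t_eq21(1.0, 0.01, 1.0, theta))
```

The two columns agree up to terms of relative order $v^2/c^2$. It is this continuous variation with $\vartheta$ that the experiments failed to show.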

* A. A. Michelson and E. W. Morley: Amer. J. Sci., 31, 377, 1886.
** R. J. Kennedy and E. M. Thorndike: Phys. Rev., 42, 400, 1932.

4. THE INTERPRETATION OF THE NEGATIVE RESULT OF THE MICHELSON-MORLEY EXPERIMENT

74. The fact that turning round the interferometer produces no shift of fringes could be explained in a rather trivial way. Let us suppose that at the time of the experiment the Earth happened to be at rest relative to the ether. Thus one might suppose that the solar system happened to move relative to the ether with a velocity which coincided with that of the orbital velocity of the Earth, and that by chance the experiment was carried out during a period when the Earth was moving relative to the Sun in the same direction as the ether.

So as to guard against such a trivial interpretation of the negative outcome of the interferometer experiment, the latter was carried out several times in the course of a year. As the direction of the orbital velocity of the Earth changes in the course of a year, the velocity of the Earth relative to the ether, if it happened to be zero at one period, should increase to $2v$ after six months, where $v = 30$ km/sec is the orbital velocity. Thus the Michelson-Morley experiment proved that $T(\vartheta)$ is independent of $\vartheta$ during any time of the year, and thus we must conclude that the turning round of the interferometer produces no fringe shift whether or not the Earth moves relative to the carrier of light.

Another supposition which might explain the negative outcome of the Michelson-Morley experiment would be that the ether is sticking to the Earth, so that the Earth carries the surrounding ether with itself. The latter assumption is disproved to some extent by the Michelson-Gale experiment, which shows that at any rate the rotation of the Earth is not shared by the ether. The assumption of the Earth dragging the ether along with itself seems to be very unlikely, and such a hypothesis need not be considered seriously.

75. The only acceptable explanation, in our view, of the negative outcome of the Michelson-Morley experiment is to assume that the interferometer, however carefully it is turned round, nevertheless deforms, the deformation being such that it compensates exactly the phase shift which would appear without the deformation.

76. A hypothesis to this effect was put forward by Lorentz and FitzGerald.
In this hypothesis they suggest that when a solid is accelerated so as to move with a velocity $\mathbf{v}$ relative to the ether, its dimensions parallel to $\mathbf{v}$ contract by a factor $\sqrt{1 - v^2/c^2}$, while the dimensions perpendicular to $\mathbf{v}$ remain unaffected. The effect of contraction could be pictured in a rather primitive way by supposing that a solid moving in the ether is subject to "pressure" caused by the ether wind penetrating it, and that this pressure causes the contraction. From the Lorentz-FitzGerald hypothesis it follows that a solid moving with a velocity $\mathbf{v}$ is compressed in a direction parallel to $\mathbf{v}$. If we turn round the solid by 90°, so that the dimensions which were originally parallel to $\mathbf{v}$ come to stand perpendicular to $\mathbf{v}$, then the dimensions which were originally compressed are now released and thus expand by a factor $1/\sqrt{1 - v^2/c^2}$. Similarly the dimensions which are turned into the direction of $\mathbf{v}$ are compressed by a factor $\sqrt{1 - v^2/c^2}$. The deformations thus obtained are exactly suitable to compensate the change of interference patterns (21) which would arise if the interferometer behaved like a rigid body.

77. So as to see in a little more detail that the contraction hypothesis is suitable to account for the negative outcome of the Michelson-Morley experiment, and also for its version carried out by Kennedy and Thorndike, we calculate the change of length of a moving rod when it is turned round relative to the ether. This argument is rendered necessary because Kennedy and Thorndike expressed an opinion in their original paper to the effect that the Lorentz contraction alone is insufficient to explain the result of their experiment. This opinion is quoted in the literature (see e.g. Dingle*). We show in the present argument that, contrary to the general belief, the negative outcome of the Kennedy-Thorndike experiment, like that of the Michelson-Morley experiment, can be fully accounted for in terms of the Lorentz contraction of the arms of the interferometer.

78. Consider a rod with length $a_0$ when at rest. If the rod is made to move with a velocity $\mathbf{v}$ relative to the ether but is turned into a direction perpendicular to $\mathbf{v}$, then its length remains $a_0$. If the rod is now turned so as to stand at an angle $\vartheta$ relative to $\mathbf{v}$, then its length will be given by

$$a^2(\vartheta) = a_1^2 + a_2^2,$$

where

$$a_1 = a(\vartheta)\sin\vartheta, \qquad a_2 = a(\vartheta)\cos\vartheta; \tag{22}$$

thus $a_1$ and $a_2$ are the projections of the rod perpendicular and parallel, respectively, to the direction of $\mathbf{v}$ (see Fig. 12). The projection $a_1$ remains unaffected by the motion, while we may write

$$a_2 = a_2'\sqrt{1 - v^2/c^2}, \tag{23}$$

where $a_2'$ would be the length of the projection if the Lorentz contraction had not taken place. We may thus write

$$a_1^2 + a_2'^2 = a_0^2 \tag{24}$$

and thus with the help of (22), (23) and (24) we find

$$a^2(\vartheta)\left(\sin^2\vartheta + \frac{\cos^2\vartheta}{1 - v^2/c^2}\right) = a_0^2.$$

Inserting the length $a(\vartheta)$ obtained from this relation in place of $l$ into (16), the dependence on $\vartheta$ cancels and we are left with $T(\vartheta) = 2a_0/\bigl(c\sqrt{1 - v^2/c^2}\bigr)$.

* H. Dingle: The Special Theory of Relativity. London, Methuen and Co. Ltd., New York, J. Wiley and Sons, Inc., 1940.

We see therefore that the time $T(\vartheta)$ of the to and fro flight along a rod remains constant while the rod is turned round, if the rod suffers Lorentz contraction so that it adapts itself continuously to its orientation relative to $\mathbf{v}$. The system of interference fringes observed with the Michelson interferometer depends on the difference

$$\Delta T(\vartheta) = T_1(\vartheta) - T_2(\vartheta + 90^\circ),$$

where $\vartheta$ and $\vartheta + 90^\circ$ are the angles subtended between the arms of the interferometer and $\mathbf{v}$; $T_1(\vartheta)$ and $T_2(\vartheta + 90^\circ)$ are thus the times of return flights along the two arms. Since neither $T_1(\vartheta)$ nor $T_2(\vartheta)$ varies with $\vartheta$, we expect the difference also to remain constant, i.e.

$$\Delta T(\vartheta) = \text{independent of } \vartheta.$$

The latter argument is valid both if $l_1 = l_2$ and if $l_1 \neq l_2$.
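The compensation described above can be verified numerically. The sketch below takes the length of a rod from relations (22), (23) and (24), inserts it into the round-trip time (16), and shows that the result no longer depends on the orientation, for equal as well as for unequal arms. The function names and the numerical values are illustrative assumptions.

```python
import math


def contracted_length(a0, v, c, theta):
    """Length a(theta) of a moving rod of rest length a0, from (22), (23), (24).

    With a1 = a sin(theta), a2 = a cos(theta) = a2' sqrt(1 - v^2/c^2) and
    a1^2 + a2'^2 = a0^2 one obtains
    a^2 (sin^2(theta) + cos^2(theta) / (1 - v^2/c^2)) = a0^2.
    """
    beta2 = (v / c)**2
    return a0 / math.sqrt(math.sin(theta)**2 + math.cos(theta)**2 / (1.0 - beta2))


def round_trip_time(l, v, c, theta):
    """To and fro time along an arm of (actual) length l, from (16)."""
    beta2 = (v / c)**2
    return (2.0 * l / c) * math.sqrt(1.0 - beta2 * math.sin(theta)**2) / (1.0 - beta2)


def round_trip_contracted(a0, v, c, theta):
    """Round-trip time along a rod that contracts according to its orientation."""
    return round_trip_time(contracted_length(a0, v, c, theta), v, c, theta)


if __name__ == "__main__":
    c, v = 1.0, 0.5            # illustrative: a large v makes the effect visible
    l1, l2 = 1.0, 1.7          # unequal arms, as in the Kennedy-Thorndike version
    for deg in (0, 20, 45, 70, 90):
        theta = math.radians(deg)
        t1 = round_trip_contracted(l1, v, c, theta)
        t2 = round_trip_contracted(l2, v, c, theta + math.pi / 2)
        print(deg, t1, t2, t1 - t2)
```

Each of the two times, and therefore also their difference, stays the same for every angle, which is the content of the statement that $\Delta T(\vartheta)$ is independent of $\vartheta$.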

5. CONSIDERATIONS CONCERNING THE CONTRACTION HYPOTHESIS

79. As an argument against the contraction hypothesis of Lorentz and FitzGerald it was brought forward that the latter hypothesis is only an "ad hoc" hypothesis to explain the negative result of the Michelson-Morley experiment. We do not think that this argument should be taken seriously. Firstly, if we observe some phenomenon (e.g. the negative result of the Michelson experiments) then we have to find the processes which account correctly for the phenomenon. It seems to us that the negative outcome of the interferometric experiment cannot be understood except by supposing that the interferometer deforms when it is turned round. In fact we think that the Michelson-Morley experiment can be taken as an interferometric measurement of the Lorentz contraction. So as to support this statement we shall further below analyse more precisely the principles involved when measuring lengths and other physical quantities (chapt. V).

80. Furthermore it is important to point out that the phenomenon of the Lorentz contraction can be understood in terms of general dynamical considerations. A solid consists of atoms, and the shape of the solid arises as a dynamical equilibrium of these atoms. It must be supposed that the atoms act upon each other in a retarded fashion. It can be seen easily that the retarded interaction leads to different equilibrium configurations in the case of atoms at rest and in the case of atoms moving with a constant velocity $\mathbf{v}$. This question will be discussed in more detail further below. Lorentz, without being able to give a precise mathematical formulation of these effects, felt that the deformations, like the Lorentz contraction, are caused by the forces keeping the atoms together. The weakness of Lorentz's point of view was that he tried to explain the various relativistic effects one by one, making use of independent, more or less accidental circumstances.

The progress obtained by the concepts of Einstein was that Einstein realized that the various relativistic effects are not independent of each other, but that these effects can all be understood through a general principle with a large scope of validity. Einstein thought that the above principle describes the properties of space and time. In our opinion the relativistic effects can be accounted for successfully by supposing certain general common features for all laws of nature; these features appear as a kind of symmetry property of the laws of physics.

We shall formulate a principle further below which can be taken as a completion of Lorentz's ideas and which deviates from Einstein's ideas. Mathematically our formulation will lead to results identical with those derived from Einstein's concept.

D. EXPERIMENTS OF THE MICHELSON-MORLEY TYPE

81. After the negative result of the Michelson-Morley experiment a number of other experiments were carried out, which all attempted but failed to measure the value of $v$, the translational velocity of the Earth relative to the ether. The most important of these experiments was the experiment of Trouton and Noble.

1. THE TROUTON-NOBLE EXPERIMENT*

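82. Consider two opposite point charges $+e$ and $-e$; let the radius vector pointing from $-e$ to $+e$ be denoted by $\mathbf{r}$. If the charges are at rest, the force acting upon $+e$ can be written

$$\mathbf{F}_0 = e\mathbf{E} = -\frac{e^2\,\mathbf{r}}{r^3}.$$

As the force acts in the direction of $\mathbf{r}$, the moment of force produced by the pair of charges vanishes, i.e.

$$\mathbf{M}_0 = \mathbf{r} \times \mathbf{F}_0 = 0.$$

If the pair of charges is made to move with a constant velocity $\mathbf{v}$, then the positive charge will be under the action of the Coulomb attraction of $-e$ and also under the influence of the magnetic field

$$\mathbf{B} = -\frac{e}{c}\,\frac{\mathbf{v} \times \mathbf{r}}{r^3}.$$

Thus the total force acting upon $e$ is given by

$$\mathbf{F}(\mathbf{v}) = e\left(\mathbf{E} + \frac{1}{c}(\mathbf{v} \times \mathbf{B})\right) = \mathbf{F}_0 - \frac{e^2}{c^2 r^3}\,\mathbf{v} \times (\mathbf{v} \times \mathbf{r});$$

since

$$\mathbf{v} \times (\mathbf{v} \times \mathbf{r}) = \mathbf{v}(\mathbf{v}\mathbf{r}) - v^2\mathbf{r},$$

we find that the moment of force produced by the pair of charges is equal to

$$\mathbf{M}(\mathbf{v}) = \mathbf{r} \times \mathbf{F}(\mathbf{v}) = -\frac{e^2}{c^2}\,\frac{(\mathbf{v}\mathbf{r})(\mathbf{r} \times \mathbf{v})}{r^3}.$$

* Fr. T. Trouton and H. R. Noble: Proc. Roy. Soc., 72, 132, 1903.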

Denoting the angle between $\mathbf{v}$ and $\mathbf{r}$ by $\vartheta$, we find for the absolute value of the moment of force

$$M = \frac{e^2 v^2}{2c^2 r}\sin 2\vartheta. \tag{26}$$

In the above derivation we have neglected the effects of retardation. A more detailed calculation shows that the latter effects give only a negligible correction to (26).

83. In the actual experiment a charged condenser was suspended on an elastic string. The condenser was placed so that $\vartheta = 45^\circ$, i.e. so that the line perpendicular to the surface of the condenser plates subtended an angle of 45° with the supposed direction of the orbital velocity of the Earth, the direction of $\mathbf{v}$. If the moment (26) exists, then the elastic fibre upon which the condenser is suspended is twisted to such an extent that the elastic stress arising in the fibre compensates the moment $M$ exerted by the condenser. Turning the condenser together with its support by 90°, the moment $M$ changes its sign and so the equilibrium is expected to be disturbed. In the actual experimental arrangement the condenser was suspended and it was watched whether or not it would change its orientation while the Earth was turning round, so that the orientation of the condenser relative to the direction of motion of the Earth would change. The actual experiment showed no such changes in orientation. The negative outcome of the Trouton-Noble experiment can be interpreted by supposing that the motion of the system relative to the ether produces not only an electromagnetic moment of force but also elastic stresses which compensate exactly the electromagnetic moment of force.

84. Experiments were also carried out in which a condenser was charged periodically and oscillations were expected to be caused by the periodic change of $M$. Further experiments were carried out in which it was investigated whether or not the equilibrium adjustment of a Wheatstone bridge is changed while the orientation of the bridge is changed relative to the direction of the orbital velocity of the Earth.

All these experiments led to negative results, i.e. the expected effects proportional to $v^2/c^2$ were not found to occur.
2. THE EXPERIMENT OF ISAAK AND CO-WORKERS*

85. Recently an experiment was carried out by Isaak, Champeney and Khan. The experiment dealt with the effects produced by the relativistic slowing down of clocks. The main idea of the experiment can be described as follows.

* G. R. Isaak and others: Phys. Lett., 7, 241, 1963.

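Consider an emitter of electromagnetic waves in a point $A$ (which is at rest relative to the ether) and a point $B$ which is moving round $A$ along a circle with radius $R$ with constant velocity. A signal emitted at $t$ from $A$ reaches $B$ at $t' = t + R/c$; we have thus

$$\frac{dt'}{dt} = 1,$$

and we expect that the radiation of constant frequency $\nu$ emitted from $A$ will act upon $B$ with the same frequency. If $A$ moves with a constant velocity $\mathbf{v}$ and $B$ moves round $A$ as before, then the coordinate vectors of $A$ and $B$ at $t$ can be written

$$\mathbf{r}_A(t) = \mathbf{v}t, \qquad \mathbf{r}_B(t) = \mathbf{v}t + \mathbf{R}(t), \tag{27}$$

where $\mathbf{R}(t)$ is the vector pointing from $A$ to $B$ at the time $t$. A signal starting from $A$ at $t$ will reach the point $B$ at the time $t'$ so that

$$\bigl(\mathbf{r}_A(t) - \mathbf{r}_B(t')\bigr)^2 - c^2(t' - t)^2 = 0. \tag{28}$$

Differentiating (28) with respect to $t$ we find with the help of (27)

$$\bigl(\mathbf{v}(t' - t) + \mathbf{R}(t')\bigr)\left(\mathbf{v}\left(1 - \frac{dt'}{dt}\right) - \dot{\mathbf{R}}(t')\frac{dt'}{dt}\right) - c^2(t' - t)\left(1 - \frac{dt'}{dt}\right) = 0.$$

Since $\mathbf{R}(t')\dot{\mathbf{R}}(t') = 0$ we have

$$\left(c^2 - v^2 - \frac{\mathbf{R}(t')\mathbf{v}}{t' - t}\right)\left(1 - \frac{dt'}{dt}\right) + \mathbf{v}\dot{\mathbf{R}}(t')\frac{dt'}{dt} = 0.$$

We note that $R(t')/(t' - t) \approx c$; thus neglecting small terms of the order $(v/c)^2$ we find from the above relation

$$\frac{dt'}{dt} \approx 1 + \frac{\mathbf{v}\dot{\mathbf{R}}(t')}{c^2}.$$

We expect therefore the radiation emitted with the constant frequency $\nu$ from $A$ to arrive in $B$ with a varying frequency $\nu'(t')$ so that

$$\nu'(t') = \nu\left(1 - \frac{\mathbf{v}\dot{\mathbf{R}}(t')}{c^2}\right). \tag{29}$$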

If the emitter $A$ is moving with the Earth, the frequency of radiation received in $B$ should fluctuate periodically in accord with (29). In the experiment of Isaak and co-workers a $\gamma$-emitter was placed in the centre of a disc and an absorber $B$ was placed on the edge of the disc, and resonance absorption was observed with the help of the Mössbauer effect. If the frequency of the radiation received in $B$ did indeed fluctuate according to (29), then the resonance would take place only during part of the motion, and the measure of the absorption would be expected to vary with the position of $B$ relative to $A$.

86. In the actual experiment no effect on the absorption was found when the disc supporting the absorber was made to rotate. The experiment was sufficiently accurate that a small fraction of the effect expected according to (29) could have been observed.

Fig. 13. The orbit of the source used in the experiment of Isaak and co-workers

The negative result of the above experiment can be understood by supposing that the frequency of the absorber $B$ changes with the velocity of $B$ in the manner suggested by the perpendicular Doppler effect. Indeed, if the inner frequency of the absorber depends on its velocity

$$\mathbf{w}(t) = \mathbf{v} + \dot{\mathbf{R}}(t), \tag{30}$$

so that

$$\nu^*(t) = \nu_0\sqrt{1 - \frac{w^2(t)}{c^2}}, \tag{31}$$

then we find, inserting (30) into (31) and neglecting small terms,

$$\nu^*(t) = \nu_0\left(1 - \frac{\mathbf{v}\dot{\mathbf{R}}(t)}{c^2}\right),$$

and we find thus

$$\nu^*(t') = \nu'(t'),$$

i.e. the frequency $\nu^*(t')$ of $B$ at any time $t'$ will be equal to the frequency $\nu'(t')$ falling upon $B$. Summarizing the above considerations we can thus state the following. If the emitter $A$ moves with a constant velocity $\mathbf{v}$ relative to the ether, then the absorber $B$ circling round $A$ moves along a cycloid path (see Fig. 13).
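The compensation between the geometrical effect (29) and the change of the inner frequency (31) can be illustrated numerically. The sketch below solves the signal condition (28) for the arrival time $t'$ by iteration, obtains the received frequency ratio $\nu'/\nu = dt/dt'$ by a finite difference, and divides it by the factor $\sqrt{1 - w^2/c^2}$ of (31). All names and numerical values are assumptions chosen for illustration: the velocities are taken far larger than in the real experiment so that the effect stands out from rounding errors, and units with $c = 1$ are used.

```python
import math

C = 1.0        # velocity of light (units with c = 1)
V = 0.01       # velocity of the emitter A relative to the ether, along x
RADIUS = 1.0   # radius of the circle described by B around A
OMEGA = 0.01   # angular velocity, so that the rim speed is 0.01 c


def arm(t):
    """Vector R(t) pointing from A to B."""
    return (RADIUS * math.cos(OMEGA * t), RADIUS * math.sin(OMEGA * t))


def arrival_time(t):
    """Solve (28) for t' by fixed-point iteration on tau = t' - t."""
    tau = RADIUS / C
    for _ in range(60):
        rx, ry = arm(t + tau)
        tau = math.hypot(V * tau + rx, ry) / C
    return t + tau


def received_ratio(t, h=1e-2):
    """Return (t', nu'/nu), with nu'/nu = dt/dt' from a central difference."""
    dtp_dt = (arrival_time(t + h) - arrival_time(t - h)) / (2.0 * h)
    return arrival_time(t), 1.0 / dtp_dt


def absorber_ratio(tp):
    """nu*/nu_0 = sqrt(1 - w^2/c^2) with w = v + dR/dt, as in (30) and (31)."""
    wx = V - RADIUS * OMEGA * math.sin(OMEGA * tp)
    wy = RADIUS * OMEGA * math.cos(OMEGA * tp)
    return math.sqrt(1.0 - (wx * wx + wy * wy) / C**2)


def spreads(n=360):
    """Peak-to-peak variation over one revolution of nu'/nu and of nu'/nu*."""
    period = 2.0 * math.pi / OMEGA
    received, mismatch = [], []
    for k in range(n):
        tp, ratio = received_ratio(k * period / n)
        received.append(ratio)
        mismatch.append(ratio / absorber_ratio(tp))
    return max(received) - min(received), max(mismatch) - min(mismatch)


if __name__ == "__main__":
    print(spreads())
```

The first number returned by `spreads()` is the periodic variation of the received frequency expected from (29). The second is what remains once the inner frequency of the absorber follows (31); it is smaller by about two orders of magnitude, the remainder being of higher order in the velocities.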

The phase surfaces emitted by $A$ at equidistant times will reach $B$ in a sequence corresponding to a periodically varying frequency $\nu'(t')$. However, the velocity $\mathbf{w}(t) = \mathbf{v} + \dot{\mathbf{R}}(t)$ of the absorber along the cycloid also changes periodically, and if the inner frequency $\nu^*(t')$ of $B$ adjusts itself at any instant $t'$ to the velocity of motion $\mathbf{w}(t)$ according to (31), then the radiation of $A$ falling on $B$ with periodically changing frequency $\nu'(t')$ will be exactly in resonance with the changing inner frequency $\nu^*(t')$ of $B$. Thus we find, just as in the cases of the Michelson-Morley and Trouton-Noble experiments, that also in the experiment of Isaak and co-workers the effects which occur on purely geometrical grounds are compensated by other effects; the two types of effects caused by the translational motion of the system compensate each other and no observable effect remains.

E. GENERAL REMARKS CONCERNING THE SERIES OF NEGATIVE RESULTS

87. The series of failures to find effects suitable to determine the translational velocity $v$ of the Earth relative to the ether might appear at first sight as a series of "accidents". In this view the Michelson-Morley experiment might have been suitable to determine $v$, but for the "accidental" fact that the arms of the interferometer deform. Similarly the Trouton-Noble experiment is rendered inconclusive by the apparently accidental fact that mechanical torques appear to compensate the electric ones. The fact that the mechanical torque should change by exactly the same amount as the torque inside the suspended condenser must be regarded as a rather remarkable fact which, however, is proved directly by the experiment itself. In the Isaak experiment we are prevented from determining $v$ by the apparently accidental fact that the inner frequencies of the absorber vary with the velocity $\mathbf{w}(t)$ of the absorber relative to the ether. Similar apparent accidents prevent the determination of $v$ by a number of other methods.

In a scientific analysis a series of apparent accidents cannot be accepted as real accidents.

88. To recall precedents in the history of science, we note that all the attempts to build a perpetuum mobile seemed at the time to be prevented by a series of "accidents". Somehow the proposed mechanisms did not work because one or another disturbing circumstance was underestimated. The fact that this series of accidents were not real accidents, but reflected some general law of nature, was first clearly pronounced when the French Academy of Sciences declared that it would not be concerned in future with perpetuum mobiles. The above declaration is only a negative result. It was realized that the series of failures encountered cannot be accidental but that there must be a general law of nature which makes it impossible to construct such a machine. The second step was to find and to formulate the law underlying the negative experiences. This law was the law of conservation of energy. It must be emphasized that the law of conservation of energy is a law which (as far as we know) is valid for all phenomena. This law gives, however, only a framework for many particular laws of physics without determining these laws precisely.

89. It must be noted that should one find some new phenomenon where the conservation of energy is not satisfied, then this phenomenon would restrict the validity of the conservation law without making the law itself invalid. Indeed, the law of conservation of energy has proved its validity over a very large field of phenomena, and thus it is clear that this principle does indeed reflect correctly an aspect of the laws of nature. Its success would not be undone even if we were to find some particular phenomena outside its validity. The above remark can be supported if we remember the activities of the alchemists. Trying unsuccessfully to produce gold, the alchemists found a large number of laws of chemistry. Eventually it was realized that there exists a law of nature which had made the efforts of the alchemists futile. The law itself is that the chemical elements consist of indivisible atoms. Today we can produce gold from suitable elements by nuclear chemical methods. Nevertheless the laws discovered by the alchemists remain the foundation of chemistry. The fact that we now know the limitations of a law does not affect its importance inside its region of validity.

90. It is often pronounced as a principle of science that a hypothesis can be maintained only as long as no single fact is known which contradicts it, and it is claimed that as soon as there is one fact which contradicts a given hypothesis the hypothesis must be dropped. It is a fortunate thing for the development of science that this principle is only pronounced in textbooks but is not taken seriously by scientists. Indeed, if a hypothesis has proved its worth by explaining a series of facts, then this is an indication that the hypothesis reflects correctly at least part of reality, and therefore such a hypothesis has a content of reality. If new phenomena seem to contradict the hypothesis, a careful analysis is necessary to establish whether and how the new phenomena affect the validity of the hypothesis. It is very important for the stability of scientific progress that hypotheses which have proved their worth are not thrown away when the first difficulty arises.

1. GENERALIZATION OF NEGATIVE EXPERIENCES

91. We have made a number of general remarks upon methodological questions so as to make our procedure clearer. From the series of negative results like that obtained by Michelson and Morley it seems reasonable to suppose that it is not accidental that the various experiments trying to determine $v$, the velocity of the Earth relative to the ether, remained unsuccessful. We may thus accept the view that there exists a general law of nature which prevents us from determining $v$ by laboratory experiments. As a second step we have to formulate this law in a positive form. The usual formulation of this law is the principle of relativity of Einstein. We shall, however, give an alternative formulation of the general law, which we shall call the Lorentz principle. We shall give the formulation of this principle further below.

CHAPTER III

THE PROBLEM OF MEASUREMENT

92. In 75 we have claimed that the Michelson-Morley experiment amounts to a measurement of the Lorentz contraction. So as to support this statement more deeply, and also to formulate the Lorentz principle, it is necessary to analyse the problem of measurement in some detail.

A. THE PROBLEM OF MEASURES

1. REPRESENTATIONS

93. Presently we give a general analysis of measures and quantities. In these considerations we want to distinguish clearly between the real quantities,* which exist objectively independent of whether or not we attempt to measure them, and the measures of quantities, or shorter measures, which describe the quantities by means of numbers (or sets of numbers). It is quite clear that real quantities can be described in very many ways by numbers; we have merely to try to choose for the description of particular quantities numbers which reflect adequately certain physical properties.

* We use the term "real quantity" because we cannot find a better expression. We understand, as explained in the text, the quantity as it exists objectively as distinct from its measure.

94. To make the distinction between quantities and measures quite clear we shall use Gothic symbols for the real quantities and Latin letters for their representations. Thus we may write, e.g.

$$\mathfrak{E}, \mathfrak{P}, \mathfrak{Q}, \ldots \tag{1}$$

for certain real quantities and write

$$R(\mathfrak{E}) = E, \quad R(\mathfrak{P}) = P, \quad R(\mathfrak{Q}) = Q, \ldots \tag{2}$$

for certain representations. $E, P, Q, \ldots$ are thus numbers or sets of numbers describing the quantities (1) in a representation $R$. The quantities (1) can also be described in a different representation, say $R'$, and we have thus

$$R'(\mathfrak{E}) = E', \quad R'(\mathfrak{P}) = P', \quad R'(\mathfrak{Q}) = Q', \ldots$$

The $E', P', Q', \ldots$ are of course functions of the $E, P, Q, \ldots$; the latter functions give the transformation between the representations $R$ and $R'$.

95. In particular if $\mathfrak{E}_k$ stands for an event then it can be represented relative to a system of coordinates $K$ by

$$K(\mathfrak{E}_k) = \mathbf{x}_k, \qquad k = 1, 2, \ldots, n. \tag{3}$$

Taking another system of reference $K'$ we have instead of (3)

$$K'(\mathfrak{E}_k) = \mathbf{x}'_k \tag{4}$$

with

$$\mathbf{x}'_k = \mathbf{f}(\mathbf{x}_k), \tag{5}$$

where $\mathbf{f}$ stands for four functions $f_\nu$, $\nu = 1, 2, 3, 4$. In general we shall require of two representations (3) and (4) that there should be a one to one correspondence between them. Thus we shall require that the transformation (5) should possess an inverse, i.e. there should exist with (5) also a unique relation

$$\mathbf{x}_k = \mathbf{f}^{-1}(\mathbf{x}'_k).$$

We analyse presently the problems connected with the representation of quantities by numbers.

2. AN EXAMPLE: THE MEASURE OF ELECTRIC CHARGE

96. We illustrate the problem of how to represent quantities by measures by typical examples. For the sake of the first example let us consider Coulomb's law in the form

$$F = \frac{e_1 e_2}{r^2}, \tag{6}$$

where $e_1$ and $e_2$ are the measures of two charges $\mathfrak{e}_1$ and $\mathfrak{e}_2$, and similarly $r$ and $F$ are the measures of the distance $\mathfrak{r}$ between the charges and of the force $\mathfrak{F}$ acting between them, respectively. For the moment we take it for granted that $\mathfrak{F}$ and $\mathfrak{r}$ can be expressed by $F$ and $r$ in the usual way, and we discuss merely the problem of how to determine the measures of the charges. Measuring $F$ and $r$ and taking the validity of (6) for granted, we obtain one equation for the two unknown quantities $e_1$ and $e_2$. Considering the force between two equal charges, i.e. if by symmetry considerations we can ascertain in a particular case that $e_1 = e_2$, we can determine the common measure $e$ of both charges in terms of $F$ and $r$. In general, however, so as to be able to determine the measures of charges it is necessary to measure forces acting between at least three charges. We may thus write

$$F_{kl} = \frac{e_k e_l}{r^2}, \qquad k, l = 1, 2, 3, \tag{7}$$

where $F_{kl}$ is the force acting between $\mathfrak{e}_k$ and $\mathfrak{e}_l$ from a distance $r$ when the third charge (e.g. $\mathfrak{e}_m$) is moved away. From (7) we find for the measures

$$e_k = r\sqrt{\frac{F_{kl}F_{km}}{F_{lm}}}, \qquad k, l, m = \text{permutation of } 1, 2, 3.$$
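The determination of the measures of charges from pairwise forces can be sketched as follows. The code is only an illustration: the function names and the numerical values of the charges are hypothetical, all forces are taken at one fixed distance $r$, and, as in (7), a positive force stands for repulsion. The square root yields only the magnitude of $e_k$; the sign has to be fixed separately.

```python
import math


def pair_force(e_k, e_l, r):
    """Force between two charges at distance r according to (7)."""
    return e_k * e_l / r**2


def all_pair_forces(charges, r):
    """'Measured' forces F_kl for every pair of charges, keyed by the pair {k, l}."""
    n = len(charges)
    return {frozenset((k, l)): pair_force(charges[k], charges[l], r)
            for k in range(n) for l in range(k + 1, n)}


def measure_magnitude(forces, k, l, m, r):
    """|e_k| = r * sqrt(F_kl * F_km / F_lm), obtained from the triple (k, l, m)."""
    def f(a, b):
        return forces[frozenset((a, b))]
    return r * math.sqrt(f(k, l) * f(k, m) / f(l, m))


if __name__ == "__main__":
    charges = [2.0, -3.0, 0.5, 1.5]          # hypothetical measures
    forces = all_pair_forces(charges, r=2.0)
    # With four charges, the measure of charge 0 follows from three different triples.
    for l, m in ((1, 2), (1, 3), (2, 3)):
        print((0, l, m), measure_magnitude(forces, 0, l, m, 2.0))
```

With more than three charges the same measure is obtained from several different triples, and the agreement of these values is a check on the assumed form of the law rather than a consequence of a definition.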

The sign of the square root has to be taken properly so that the right hand side of (7) should be positive in the case of repulsion and negatíve in the case of attraction. 97. If we have n > 3 charges e e , . . ., e„ then we can determine the measure e of e in more than one manner. We find thus with the help of (7) 1;

k

2

k

'

1

'

lm

i

l' ' m

where $k, l, m$ and $k, l', m'$ refer to groups of three charges with the help of which we attempt to determine $e_k$. From (8) we see that $e_k$ can be determined in a number of different ways. We must expect that the various methods for determining $e_k$ lead to the same numerical value. In other words, relations (8) give a check of consistency of the Coulomb formulae (7). If we carry out measurements with one fixed distance $r$ only, then the checks support the assumption that the force acting between two charges $\mathfrak{e}_k$ and $\mathfrak{e}_l$ is indeed proportional to the measures $e_k$ and $e_l$ of the charges.

98. Sometimes it is claimed that relation (7) gives simply the definition of the measures of charges. This claim cannot, however, be maintained altogether. Indeed, we have seen that relation (8) provides checks of relation (7); if, e.g., having four charges we were to find from measurement that

$$\frac{F_{12}F_{13}}{F_{23}} \neq \frac{F_{12}F_{14}}{F_{24}}$$

then we are led to a result contradicting the law (7) in a particular instance. Thus (7) is not merely a definition; nevertheless it contains a definition also. Indeed, if a set of checks of the form (8) is satisfied, we cannot conclude that Coulomb's law necessarily has the form (7). If we were to assume instead of (7) a law of the form

$$F_{kl} = \frac{\varphi(e'_k)\,\varphi(e'_l)}{r^2}, \tag{9}$$

where $\varphi(e')$ is a monotonic function of its argument, then starting from (9) in place of (7) we find in place of (8)

$$\varphi(e'_k) = r\sqrt{\frac{F_{kl}F_{km}}{F_{lm}}} = r\sqrt{\frac{F_{kl'}F_{km'}}{F_{l'm'}}}. \tag{10}$$

Introducing (10) into (9) we find consistency for any postulated function $\varphi$.

99. In particular, (9) includes laws of the form

$$F_{kl} = C\,\frac{e_k e_l}{r^2}, \tag{11}$$

where $C$ is a constant. The difference between (7) and (11) is merely that a different unit is chosen for the charge. We saw, however, that one can generalize (7) to a larger extent as well; replacing (7) by (9) one does not merely change the unit but deforms the scale used for the measures of the charges.

100. Let us denote for clarity the measure of a charge $\mathfrak{e}_k$ by $e_k$ if the measure is taken to correspond to the ordinary Coulomb law (7), and let us denote by $e'_k$ the measure of $\mathfrak{e}_k$ we obtain from Coulomb's law in the form (9) (with a fixed function $\varphi$). We may write

$$e_k = R(\mathfrak{e}_k), \qquad e'_k = R'(\mathfrak{e}_k), \qquad k = 1, 2, 3, \ldots, n$$

where $R$ and $R'$ stand for two possible representations.

3. DISTINGUISHED REPRESENTATIONS

101. The Coulomb law can be expressed taking the measures of charges in the representation $R$ or in the representation $R'$. The representation $R$ is simpler than $R'$, as in terms of the former the Coulomb law contains simply the product of the measures of the charges, while in the representation $R'$ the force between two charges is obtained by a more complicated combination of the respective measures. The representation $R$ therefore has a certain advantage over other representations. Apart from this convenience there is a further reason to prefer the representation $R$. Joining two charges $\mathfrak{e}_1$ and $\mathfrak{e}_2$ together, they appear to act like one effective charge which we may denote $\mathfrak{e}_{12}$. Symbolically we may write

$$\mathfrak{e}_1 \oplus \mathfrak{e}_2 = \mathfrak{e}_{12}, \tag{12}$$

where we write the sign $\oplus$ for "joining the charges together". In the representation $R$ relation (12) can be written

$$e_1 + e_2 = e_{12} \tag{13}$$

while in the representation $R'$ we find in place of (13)

$$\varphi(e'_1) + \varphi(e'_2) = \varphi(e'_{12}). \tag{14}$$
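A small numerical sketch may make the combination rule in a distorted representation concrete. It assumes, purely for illustration, the distortion $\varphi(x) = x^3$ (any strictly monotonic function would serve); the names `phi`, `phi_inv` and `generalized_sum` are hypothetical.

```python
import math

def phi(x):
    """Illustrative monotonic distortion; an assumption, not taken from the text."""
    return x ** 3

def phi_inv(y):
    """Inverse of phi, valid for negative arguments as well."""
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

def generalized_sum(a, b):
    """Measure of two joined charges in the distorted representation R'."""
    return phi_inv(phi(a) + phi(b))

e12 = generalized_sum(1.0, 2.0)
# The distorted measures do not add (e12 is not 3.0),
# but the values phi(e') do: phi(1) + phi(2) = 9.
print(e12, phi(e12))
```

In the representation $R$ the same joining operation is ordinary addition, which is the simplification the text attributes to the distinguished scale.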

The operation (14) is usually called generalized addition.

102. We see that in the representation $R$ both the sum and the product of the measures of two charges appear in the fundamental laws describing the behaviour of the charges. These laws expressed in other representations contain more complicated combinations of the measures. The possibility of obtaining a consistent scale such that measures are additive expresses a particular property of the measured objects. The simplest case in which we obtain measures that can be added is that of the measures of sets of objects. Such sets are characterized by the numbers of objects they contain, and joining two sets we obtain new sets containing numbers equal to the sums of the numbers in the individual sets. Possibly the earliest occasion in human history for such an addition occurred when two herds of animals were joined and gave a bigger herd.

103. The representation $R$ of electric charges is thus doubly distinguished: it gives the force between two charges as proportional to the product of the measures, and at the same time it gives the sum of the measures of two charges as the measure of the combined charges. It seems to be a general rule that for the representation of physical quantities we prefer measures such that both the product and the sum of the measures express significant quantities. (In some cases only the sum of the measures has a physical significance.) We venture to suggest that the measures thus obtained reflect best the physical properties of the represented quantities.

104. In the case of charges the representations which differ from $R$ only by the choice of unit (as discussed in 99) can be used equally well. Thus in the case of charges there seems to exist no natural unit. For some purposes the elementary charge could be taken as a natural unit; however, for other purposes such a choice would be quite artificial. While in the case of measures of charges the unit does not seem to be a naturally given quantity, the zero point of the scale is strongly distinguished; any scale appears to be highly artificial unless it gives zero measure to no charge. Another example for the determination of measures is obtained when analysing the scale of probabilities. The analysis of this scale has been given elsewhere.* The latter analysis shows features very similar to those given here. It is interesting, however, that in the probability scale a natural unit can be determined uniquely.

4. MEASURES OF LENGTHS

105. When we join rods together, we find that (in terms of proper measures) two rods $\mathfrak{l}_1$ and $\mathfrak{l}_2$ of lengths $l_1$ and $l_2$ joined together are equal in length to a rod $\mathfrak{l}_{12}$ of length $l_{12}$ with

$$l_1 + l_2 = l_{12}.$$

The scale in which the measures are additive can be taken as the distinguished scale of lengths. If we were to use a distorted scale, then in terms of the latter the lengths of the rods $\mathfrak{l}_1$ and $\mathfrak{l}_2$ might appear as $l'_k = \varphi(l_k)$, $k = 1, 2$, and the length of the rod $\mathfrak{l}_{12}$ would appear as

$$l'_{12} = \varphi\bigl(\varphi^{-1}(l'_1) + \varphi^{-1}(l'_2)\bigr),$$

where the monotonic function $\varphi$ expresses the kind of distortion of the scale. We note that in terms of the distinguished scale the product of two lengths gives an area, namely $S = l_1 l_2$, and the areas so obtained are themselves additive. In practice there seems to be no point in introducing non-additive scales for quantities if there is a possibility of introducing additive representations as well. It must be emphasized, however, that it is not trivial that for certain quantities additive measures can be introduced. Whether or not such measures can be introduced in a particular case is a question which can be decided experimentally, as was shown e.g. in the case of the measures of electric charges in 101.

* L. Jánossy, Theory and Practice of the Evaluation of Measurements. Clarendon Press, Oxford, 1965.

106. In order to see the role of the distinguished representation more clearly we give the following example. Let us consider a straight rod. Under practical conditions we may consider a rod to be straight if its contour can be made to coincide with a free string under tension, or if a beam of light can be made to move along it. Neither of the two definitions claims to be more than a practical definition with the help of which it can be decided with limited accuracy whether or not a rod appears to be practically straight. We may mark on such a straight rod a number of consecutive points $\mathfrak{P}_0, \mathfrak{P}_1, \ldots$ Furthermore we may match a number of rods so that the rod $\mathfrak{r}_{kl}$, $l > k = 0, 1, 2, \ldots, n$, fits exactly between the points $\mathfrak{P}_k$ and $\mathfrak{P}_l$. We can now ascertain whether or not the rod $\mathfrak{r}_{kl}$ when turned round still fits between the points $\mathfrak{P}_k$ and $\mathfrak{P}_l$. Once this has been found to be the case, we may order the rods according to their length; thus we may establish in a qualitative manner e.g. that

$$\mathfrak{r}_{kl} < \mathfrak{r}_{k,l+1} < \cdots$$
We can now ascribe arbitrary measures to the lengths of the rods; we may put

$$R(\mathfrak{r}_{kl}) = r_{kl},$$

where the $r_{kl}$ should be positive numbers obeying

$$r_{kl} < r_{k,l+1} < \cdots \tag{15}$$

i.e. we choose the measures so that the longer the rod, the larger its measure.

107. A distinguished representation may be looked for in which the measures of the lengths are additive, i.e. the measures obey the relations

$$r_{kl} + r_{lm} = r_{km} \quad \text{for } k < l < m. \tag{16}$$

Such a representation is obtained if we put

$$r_{kl} = r_l - r_k, \qquad l > k \tag{17}$$

where $r_k = r_{0k}$ and $r_l = r_{0l}$ are the measures of the lengths of rods fitting in between the points $\mathfrak{P}_0$ and $\mathfrak{P}_k$, respectively $\mathfrak{P}_0$ and $\mathfrak{P}_l$. The measures obtained by (17) are additive; they give a consistent system of measures if the $r_k$, $k = 1, 2, \ldots$ can be chosen in such a manner that the $r_{kl}$ obtained by (17) obey (15).

108. Whether or not it is possible to obtain a distinguished representation with measures $r_{kl}$ obeying (15) and (16) depends on the physical properties of the rods. Everyday experience shows that, using solid rods, additive measures can indeed be obtained within the margin of error of measurement. We may also reverse the argument and state that we define as ideal solid rods the ones with the help of which an additive scale of lengths can be obtained. Constructing a scale of measures with real rods we may find small inconsistencies. Such inconsistencies can be made use of to determine the deformations which real rods suffer when moved about, and these deformations give the deviations in the behaviour of the real rods from that of ideal solid rods.

109. Further we note that if we obtain additive measures $r_{kl}$ for the lengths of rods, then these measures can also be replaced by

$$r'_{kl} = a\,r_{kl}, \qquad a > 0.$$

The $r'_{kl}$ are the lengths expressed in a new scale. We may also introduce a new scale using a factor $a < 0$. Doing so we reverse our convention and ascribe smaller measures to larger rods. Such a convention is unusual but is nevertheless internally consistent.
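The additivity test (16) and the effect of a change of unit can be illustrated numerically. The sketch below is an assumption-laden illustration, not the author's procedure: the helper names `rod_measures` and `is_additive` are hypothetical, the measures are built from (17), and squaring is used as an arbitrary example of a distorted scale.

```python
def rod_measures(r):
    """Build r_kl = r_l - r_k as in (17); r[k] is the measure from P_0 to P_k."""
    n = len(r)
    return {(k, l): r[l] - r[k] for k in range(n) for l in range(k + 1, n)}

def is_additive(m, tol=1e-9):
    """Check relation (16), r_kl + r_lm = r_km, for all k < l < m."""
    n = 1 + max(l for _, l in m)
    return all(abs(m[k, l] + m[l, j] - m[k, j]) <= tol
               for k in range(n)
               for l in range(k + 1, n)
               for j in range(l + 1, n))

m = rod_measures([0.0, 1.5, 4.0, 4.5])
rescaled = {key: 2.54 * value for key, value in m.items()}    # change of unit
distorted = {key: value ** 2 for key, value in m.items()}     # deformed scale

print(is_additive(m), is_additive(rescaled), is_additive(distorted))
```

A change of unit keeps the scale additive, whereas a deformation of the scale generally does not, which is the sense in which the additive scale is distinguished up to the factor $a$.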

B. SYSTEMS OF SPACE COORDINATES 1. DETERMINATION OF COORDINATE VECTORS

110. We may now analyse how to construct three-dimensional systems of coordinates. Consider for this purpose a set of fixed points $\mathfrak{P}_0, \mathfrak{P}_1, \ldots, \mathfrak{P}_N$. The distances between the pairs of points can be measured with the help of rods and thus we can obtain (additive) measures

$$K(\mathfrak{r}_{kl}) = r_{kl}, \qquad k, l = 0, 1, 2, \ldots, N$$

of the distances between the various pairs $\mathfrak{P}_k$ and $\mathfrak{P}_l$ of points. We expect the measures thus obtained to obey the relations

$$r_{kl} + r_{lm} \geq r_{km}, \tag{18}$$

where the equality sign stands if the points $\mathfrak{P}_k$, $\mathfrak{P}_l$ and $\mathfrak{P}_m$ lie along a straight line. If the relations (18) are found to hold for the measures of distances taken for any group of three points, then this can be taken as a qualitative check of consistency of our method. The above result supports in particular the assumption that our measuring rods behave like solids.

111. To obtain measures of the coordinates of the points $\mathfrak{P}_k$ we may introduce relative position vectors $\mathfrak{r}_{kl}$ so that certain vector quantities

$$K(\mathfrak{r}_{kl}) = \mathbf{r}_{kl}$$

define (in a representation $K$) the position of $\mathfrak{P}_l$ relative to $\mathfrak{P}_k$. From experience we can take it that the position of a point relative to another is given by a three-component quantity; therefore we suppose

$$\mathbf{r}_{kl} = r_{kl,1},\ r_{kl,2},\ r_{kl,3},$$

thus $r_{kl,i}$, $i = 1, 2, 3$, are the components of $\mathbf{r}_{kl}$. We may try to look for a representation $K$ in which the relative position vectors are additive, i.e. such that

$$\mathbf{r}_{kl} + \mathbf{r}_{lm} = \mathbf{r}_{km}. \tag{19}$$

The relation (19) stands for three relations, i.e. for one relation for each of the three components of the vectors. The relations (19) are automatically satisfied if we suppose

$$\mathbf{r}_{kl} = \mathbf{r}_l - \mathbf{r}_k, \qquad k, l = 0, 1, \ldots, n \tag{20}$$

where

$$\mathbf{r}_l = \mathbf{r}_{0l}, \qquad \mathbf{r}_k = \mathbf{r}_{0k}.$$

We can take the $\mathbf{r}_k$ as the coordinate vectors of the points $\mathfrak{P}_k$, $k = 0, 1, 2, \ldots$, the point $\mathfrak{P}_0$ with coordinate vector $\mathbf{r}_0 = 0$ being taken as the origin of $K$.

112. In order to obtain a statement which can be checked experimentally we make an assumption as to the connection between the position vectors $\mathbf{r}_{kl}$ and the measures $r_{kl}$. The distances $r_{kl}$ are supposed to be measured with solid rods; therefore we suppose the law of Pythagoras to be valid, thus

$$r_{kl}^2 = \mathbf{r}_{kl}^2 \quad \text{for } k, l = 0, 1, \ldots, n,$$

where we have written for short

$$\mathbf{r}_{kl}^2 = \sum_{i=1}^{3} r_{kl,i}^2. \tag{21}$$

Relation (21) is valid in an orthogonal system of reference. In terms of skew coordinates we can suppose in place of (21)

$$r_{kl}^2 = \tilde{\mathbf{r}}_{kl}\,\mathbf{G}\,\mathbf{r}_{kl} \tag{22}$$

where $\mathbf{G}$ is a symmetric and positive definite matrix. In particular, supposing $\mathbf{G} = \mathbf{1}$, (22) reduces to (21). The following considerations will be carried through in representations where we do not specify whether they are orthogonal or skew. In this way we can considerably simplify the actual calculations, and the results can be applied in orthogonal representations if this is desirable. For the moment we thus take (22) as a hypothetical connection between the coordinate measures $\mathbf{r}_{kl}$ and the measures $r_{kl}$ of distances. The precise meaning of "orthogonal" and "skew" representations will be elucidated further below.

113. In order to check (22) experimentally we can write in place of (22), making use of (20),

$$(\tilde{\mathbf{r}}_k - \tilde{\mathbf{r}}_l)\,\mathbf{G}\,(\mathbf{r}_k - \mathbf{r}_l) = r_{kl}^2. \tag{23}$$

In the case of $N + 1$ points $\mathfrak{P}_0, \mathfrak{P}_1, \ldots, \mathfrak{P}_N$ the relations (23) give $N(N + 1)/2$ equations for the $3N$ components of the vectors $\mathbf{r}_k$, $k = 1, 2, \ldots, N$. For sufficiently large values of $N$ the system (23) becomes overdetermined, and from the purely mathematical point of view (23) need not admit of solutions. If nevertheless in a particular case the overdetermined system (23) does admit of solutions, then this circumstance cannot be taken to be accidental. If in a given case relations (23) do not lead to contradiction, this circumstance reflects on the properties of the measuring rods used. We may conclude (generalizing the result of 108) that the measuring rods can be taken to behave like ideal solids if the measures of distances $r_{kl}$ obtained with them can be expressed in terms of quadratic expressions of the form (23).
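The counting argument above is simple arithmetic and can be tabulated directly. The short sketch below uses hypothetical function names and, like the text, counts only the $3N$ coordinate components as unknowns.

```python
def relation_counts(n):
    """For N + 1 points: number of relations (23) and of unknown components."""
    equations = n * (n + 1) // 2   # one relation per pair of points
    unknowns = 3 * n               # three components for each of r_1 ... r_N
    return equations, unknowns

def first_overdetermined():
    """Smallest N for which the relations outnumber the unknowns."""
    n = 1
    while relation_counts(n)[0] <= relation_counts(n)[1]:
        n += 1
    return n

for n in range(3, 9):
    print(n, relation_counts(n))
print("overdetermined from N =", first_overdetermined())
```

On this count the relations first outnumber the unknowns at $N = 6$, and the excess grows quadratically with $N$.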

2. EXPLICIT DETERMINATION OF COORDINATE MEASURES

114. An explicit solution of (23) can be obtained in the following manner. Consider four points

$$\mathfrak{P}_0, \mathfrak{P}_1, \mathfrak{P}_2, \mathfrak{P}_3 \tag{24}$$

which should not lie in one plane. (How this can be ascertained for four given points will be discussed further below.) We may consider a coordinate system such that $\mathfrak{P}_0$ fixes its origin and the axes of the system lie in the directions $\mathfrak{P}_0\mathfrak{P}_k$, $k = 1, 2, 3$. Furthermore, choosing the units along the axes suitably, we can take the coordinate vectors of these points to be given by

$$\mathbf{r}_0 = 0, 0, 0, \qquad \mathbf{r}_1 = 1, 0, 0, \qquad \mathbf{r}_2 = 0, 1, 0, \qquad \mathbf{r}_3 = 0, 0, 1. \tag{25}$$

Writing down (23) for $k = 0$, $l = 1, 2, 3$ we find

$$G_{ll} = r_l^2, \qquad l = 1, 2, 3 \tag{26}$$

where we have written $r_l$ in place of $r_{0l}$. Making use of (26) we obtain from (23) for $k, l = 1, 2, 3$

$$G_{kl} = \tfrac{1}{2}\left(r_k^2 + r_l^2 - r_{kl}^2\right). \tag{27}$$

Thus (26) and (27) give the elements of $\mathbf{G}$ if the coordinates of the four points (24) are to be given by (25). It will be necessary to suppose further below that

$$\det \mathbf{G} \neq 0. \tag{28}$$

Whether or not the above relation holds depends on the numerical values of the distances $r_{kl}$ in terms of which we express the elements of $\mathbf{G}$. We have to require therefore that the points (24) should have distances in terms of which (28) is fulfilled. The latter requirement is equivalent to the requirement that the points (24) should not lie in one plane. Thus we have to start our procedure with four points (24) with mutual distances satisfying (28).

115. Consider five points

$$\mathfrak{P}_0, \mathfrak{P}_1, \mathfrak{P}_2, \mathfrak{P}_3, \mathfrak{P}_m, \qquad m > 3. \tag{29}$$

Writing down (23) for the mutual distances we obtain the relation

$$r_k^2 - 2(\mathbf{G}\mathbf{r}_m)_k + \tilde{\mathbf{r}}_m\mathbf{G}\mathbf{r}_m = r_{km}^2. \tag{30}$$

Supposing for the moment that

$$\tilde{\mathbf{r}}_m\mathbf{G}\mathbf{r}_m = r_m^2, \tag{31}$$

we can write in place of (30)

$$\mathbf{G}\mathbf{r}_m = \mathbf{D}^{(m)},$$

where $\mathbf{D}^{(m)}$ is a vector with components

$$D_k^{(m)} = \tfrac{1}{2}\left(r_k^2 + r_m^2 - r_{km}^2\right), \qquad k = 1, 2, 3, \quad m \geq 4, \tag{32}$$

thus

$$\mathbf{r}_m = \mathbf{G}^{-1}\mathbf{D}^{(m)}. \tag{33}$$

The relation (33) gives explicitly the coordinate vector $\mathbf{r}_m$ of a point $\mathfrak{P}_m$ in terms of measured distances only.
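The construction of 114 and 115 can be followed numerically. The sketch below is an illustration under assumptions rather than the author's own procedure: the function names are hypothetical, the "measured" distances are generated from made-up Cartesian points with `math.dist` (Python 3.8+), and the linear system $\mathbf{G}\mathbf{r}_m = \mathbf{D}^{(m)}$ is solved by Cramer's rule to stay within the standard library.

```python
import math

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(m, v):
    """Solve m x = v for a 3x3 matrix by Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("det G = 0: the four standard points lie in one plane")
    return [det3([[v[i] if c == j else m[i][c] for c in range(3)]
                  for i in range(3)]) / d for j in range(3)]

def metric_from_distances(d):
    """Elements of G from (26) and (27); d[k][l] is the distance P_k to P_l."""
    return [[0.5 * (d[0][k] ** 2 + d[0][l] ** 2 - d[k][l] ** 2)
             for l in (1, 2, 3)] for k in (1, 2, 3)]

def coordinates(d_std, d_m):
    """Coordinate vector of P_m from (32), (33), and the residual of (31).

    d_std: 4x4 distances among P_0..P_3; d_m[k]: distance from P_m to P_k.
    """
    g = metric_from_distances(d_std)
    rm2 = d_m[0] ** 2
    big_d = [0.5 * (d_std[0][k] ** 2 + rm2 - d_m[k] ** 2) for k in (1, 2, 3)]
    r = solve3(g, big_d)
    quad = sum(r[i] * g[i][j] * r[j] for i in range(3) for j in range(3))
    return r, quad - rm2

# Made-up test configuration (an assumption of this example).
pts = [(0, 0, 0), (2, 0, 0), (1, 3, 0), (0, 1, 4)]
pm = (3, 5, -4)                      # equals 0.5*P1 + 2*P2 - 1*P3
d_std = [[math.dist(p, q) for q in pts] for p in pts]
d_m = [math.dist(pm, p) for p in pts]
print(coordinates(d_std, d_m))
```

The returned vector holds the skew coordinates of $\mathfrak{P}_m$ along the axes $\mathfrak{P}_0\mathfrak{P}_k$, and a residual close to zero indicates that the supposition (31) is compatible with the supplied distances.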

3. QUESTION OF CONSISTENCY

116. Introducing (33) into (23) we find that the relations (23) are indeed satisfied for the coordinate vectors of the five points (29) provided (31) is satisfied. Introducing (33) into (31) we find the relation

$$\tilde{\mathbf{D}}^{(m)}\mathbf{G}^{-1}\mathbf{D}^{(m)} = r_m^2. \tag{34}$$

The latter relation gives a quadratic equation for $r_m^2$. Solving the above equation, we can express $r_m$ in terms of the remaining nine distances between the five points (29). [Since (34) is a quadratic equation, we obtain in general two solutions for $r_m$.] If the measured value of $r_m$ coincides with one of the solutions of (34), then (25) and (33) give a consistent set of coordinate vectors for the points (29). Relation (34) thus gives a test of consistency; it shows whether or not the distances between the five points (29) can be expressed in the form (23).

117. Adding a sixth point $\mathfrak{P}_n$ to the five points (29) we can determine the coordinate vector $\mathbf{r}_n$ of $\mathfrak{P}_n$ as

$$\mathbf{r}_n = \mathbf{G}^{-1}\mathbf{D}^{(n)}$$

where the elements of $\mathbf{D}^{(n)}$ are obtained from an expression of the form (32). The coordinate vectors thus obtained can be taken to be consistent provided we have, apart from (34), also

$$\tilde{\mathbf{r}}_n\mathbf{G}\mathbf{r}_n = r_n^2$$

and

$$\tilde{\mathbf{r}}_n\mathbf{G}\mathbf{r}_m = \tfrac{1}{2}\left(r_n^2 + r_m^2 - r_{nm}^2\right).$$

Increasing the number of points we can construct expressions which give the values of the coordinate vectors, provided a consistent set of coordinate vectors exists at all. In the case of $N > 4$ points, the coordinate vectors obtained have to fulfil

$$\tfrac{1}{2}(N - 3)(N - 4)$$

conditions.

4. VARIOUS REPRESENTATIONS

118. It remains to investigate how far our procedure is affected by the arbitrary choice (25) of the coordinate vectors of the four standard points. We note first that equation (33) is only meaningful if (28) is fulfilled, i.e. if $\mathbf{G}^{-1}$ exists. We thus have to choose the points $\mathfrak{P}_k$ so that the determinant of the matrix $\mathbf{G}$ does not vanish. It can be taken that the latter condition is fulfilled if we choose the four standard points (24) so as not to lie in one plane. We may in fact take

$$\det \mathbf{G} = 0 \tag{35}$$

as the definition for points to lie in one plane. If $\det \mathbf{G} \neq 0$ we find $\mathbf{G}$ to be positive definite.

119. If we succeed in ascribing coordinate vectors $\mathbf{r}_k$, $k = 0, 1, 2, \ldots, N$, to $N + 1$ points such that these vectors satisfy all the relations (23), then we can take the linear transforms

$$\mathbf{r}'_k = \mathbf{S}\mathbf{r}_k + \mathbf{s}, \qquad k = 0, 1, 2, \ldots, N \tag{36}$$

also as consistent measures of the coordinate vectors. Indeed from (36) it follows that

$$\mathbf{r}_k - \mathbf{r}_l = \mathbf{S}^{-1}(\mathbf{r}'_k - \mathbf{r}'_l), \qquad \tilde{\mathbf{r}}_k - \tilde{\mathbf{r}}_l = (\tilde{\mathbf{r}}'_k - \tilde{\mathbf{r}}'_l)\,\tilde{\mathbf{S}}^{-1}. \tag{37}$$

Introducing (37) into (23) we find

$$(\tilde{\mathbf{r}}'_k - \tilde{\mathbf{r}}'_l)\,\mathbf{G}'\,(\mathbf{r}'_k - \mathbf{r}'_l) = r_{kl}^2 \tag{38}$$

with

$$\mathbf{G}' = \tilde{\mathbf{S}}^{-1}\mathbf{G}\mathbf{S}^{-1}. \tag{39}$$

Thus we see that, provided the coordinate vectors $\mathbf{r}_k$ satisfy the overdetermined system (23), the transformed coordinate vectors $\mathbf{r}'_k$ satisfy the relations (38).

120. When constructing coordinate measures for the coordinate vectors $\mathfrak{r}_k$ of the points $\mathfrak{P}_k$ we assumed in 114 the particular values (25) for the measures of the coordinate vectors $\mathbf{r}_k$ of the points $\mathfrak{P}_k$, $k = 0, 1, 2, 3$. If we were to assume in place of (25) that in some representation $K'$ the coordinate vectors are given by

$$\mathbf{r}'_k = r'_{k,1},\ r'_{k,2},\ r'_{k,3}, \qquad k = 0, 1, 2, 3 \tag{40}$$

where the $r'_{k,i}$ are chosen arbitrarily, then we can regard the $\mathbf{r}'_k$ as transforms of the original $\mathbf{r}_k$, the transformation having the form

$$\mathbf{r}'_k = \mathbf{S}\mathbf{r}_k + \mathbf{s}, \qquad k = 1, 2, 3$$

with

$$S_{ik} = r'_{k,i} - r'_{0,i}, \qquad s_i = r'_{0,i}, \qquad k, i = 1, 2, 3. \tag{41}$$

Applying the transformation (41) to any of the coordinate measures of $\mathfrak{P}_k$ in the representation $K$ we obtain its representation in $K'$. The representation $K'$ could also be obtained directly by assuming the values (40) in place of the values (25) for the coordinate measures of the points $\mathfrak{P}_k$, $k = 0, 1, 2, 3$. The only restriction to be imposed on (40) is that the coordinate measures have to be chosen so that

$$\det\left|r'_{k,i} - r'_{0,i}\right| \neq 0;$$

the latter condition is necessary to make $\det \mathbf{S} \neq 0$ and to make the transformation (41) a reversible one. From the above considerations it follows that, provided consistent coordinate vectors $\mathbf{r}_k$ can be introduced supposing distances to be given by (23), consistent coordinate vectors can also be found if $\mathbf{G}$ is replaced by $\mathbf{G}'$ as defined by (39).

121. For given matrices $\mathbf{G}$ and $\mathbf{G}'$ we can determine $\mathbf{S}$ so as to satisfy (39). We may write e.g.

$$\mathbf{S} = \mathbf{G}'^{-1/2}\mathbf{G}^{1/2}.$$

We see that if it is possible to construct consistent measures of coordinate vectors for one assumed value of the matrix $\mathbf{G}$, then it is possible to find a consistent representation for any other choice $\mathbf{G}'$ in place of $\mathbf{G}$. We conclude therefore that we can decide by measurement whether or not it is possible to represent a set of measured distances $r_{kl}$ by a positive definite quadratic form of the differences of coordinate vectors. No information can be obtained about the elements of the matrix $\mathbf{G}$ occurring in the quadratic form. In particular we can obtain orthogonal coordinates in which $\mathbf{G}' = \mathbf{1}$ if we put

$$\mathbf{r}' = \mathbf{G}^{1/2}\mathbf{r}.$$

Thus the coordinate vector of the point $\mathfrak{P}_m$ in the orthogonal representation is obtained as

$$\mathbf{r}'_m = \mathbf{G}^{-1/2}\mathbf{D}^{(m)}. \tag{42}$$

122. In order to formulate our results in a more general form we denote by $\mathfrak{G}$ the metric tensor. We say that $\mathfrak{G}$ is represented in a system of reference $K$ by $K(\mathfrak{G}) = \mathbf{G}$, where $\mathbf{G}$ is always a symmetric positive definite matrix. The representations

$$\mathbf{G} = K(\mathfrak{G}) \quad \text{and} \quad \mathbf{G}' = K'(\mathfrak{G})$$

of the metric tensor are connected by the relation

$$\mathbf{G}' = \tilde{\mathbf{S}}^{-1}\mathbf{G}\mathbf{S}^{-1}$$

provided the coordinate measures relative to $K$ and $K'$ are connected by $\mathbf{r}' = \mathbf{S}\mathbf{r} + \mathbf{s}$. In particular we can obtain an orthogonal representation as described in 121.

C. PROBLEMS CONNECTED WITH COORDINATE REPRESENTATIONS

1. REMARK ON "NON-EUCLIDEAN" GEOMETRY

123. The above statements can also be formulated in another way. If the measured distances $r_{kl}$ between the points of a set can be expressed by a quadratic form (23), then one might conclude that the space in which the points are situated is "Euclidean". Or, if no consistent coordinate measures can be obtained, one might conclude that the space involved is "non-Euclidean". We do not think, however, that such a conclusion has any meaning. The fact that the overdetermined system (23) possesses solutions $\mathbf{r}_k$, $k = 0, 1, 2, \ldots, n$, seems to us to reflect upon the method of measurement of the distances $r_{kl}$ and in particular upon the physical properties of the measuring rods used. Roughly speaking, one may conclude from the consistency of the measures that the measuring rods made use of behave like rigid bodies, i.e. if the measuring rods are turned or shifted they do not change their length. Of course the procedure described provides only necessary conditions for the rods to behave like rigid rods.

124. A further aspect of the question is as follows. As relation (23) is a (generalized) form of the law of Pythagoras, we come to conclude that the latter law can be tested experimentally. This statement appears at first sight paradoxical, as the law of Pythagoras is usually proved with the help of the axioms of geometry. In fact no paradox is involved. The axioms of geometry simply reflect the properties of ideal solids. The experimental test described above is a test to the effect that our measuring rods behave like ideal solids.

2. COORDINATE TRANSFORMATIONS AND DEFORMATIONS

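125. Let us consider a system of points $\mathfrak{P}_m$ with coordinate vectors $\mathfrak{r}_m$, $m = 1, 2, \ldots, n$. In a particular representation $K$ we have

$$K(\mathfrak{r}_m) = \mathbf{r}_m.$$

We obtain the coordinate vectors in another representation $K'$ with the help of a linear transformation; we may write

$$K'(\mathfrak{r}_m) = \mathbf{r}'_m = \mathbf{S}\mathbf{r}_m + \mathbf{s}. \tag{43}$$

A linear transformation of the coordinate vectors may also be interpreted in a different manner, by writing in place of (43)

$$\mathbf{r}^*_m = \mathbf{T}\mathbf{r}_m + \mathbf{t} \tag{44}$$

where $\det \mathbf{T} \neq 0$ and $\mathbf{t}$ is a constant vector.

Fig. 14. Scheme of a deformation

We can regard the $\mathbf{r}^*_m$ as the coordinate vectors of points $\mathfrak{P}^*_m$, $m = 1, 2, \ldots, n$, in the representation $K$. Thus we may suppose

$$K(\mathfrak{r}^*_m) = \mathbf{r}^*_m.$$

In the above sense the transformation (44) produces from a system of points $\mathfrak{P}_1, \mathfrak{P}_2, \ldots, \mathfrak{P}_n$ another system $\mathfrak{P}^*_1, \mathfrak{P}^*_2, \ldots, \mathfrak{P}^*_n$. The transformation (44) thus describes a deformation of the configuration of a set of points, the deformation being expressed in measures relative to $K$ (see Fig. 14).

126. Let us consider the deformation (44) in measures of coordinate vectors of two systems of reference $K$ and $K'$. Thus suppose

$$K(\mathfrak{r}) = \mathbf{r}, \qquad K'(\mathfrak{r}) = \mathbf{r}'$$

and that $\mathbf{r}$ and $\mathbf{r}'$ are connected by (43). Transforming both sides of (44) according to (43) we find

$$\mathbf{S}\mathbf{r}^*_m + \mathbf{s} = \mathbf{S}\mathbf{T}\mathbf{r}_m + \mathbf{S}\mathbf{t} + \mathbf{s}; \tag{45}$$

expressing $\mathbf{r}_m$ from (43) we have

$$\mathbf{r}_m = \mathbf{S}^{-1}\mathbf{r}'_m - \mathbf{S}^{-1}\mathbf{s}. \tag{46}$$

Inserting (46) into (45) we find

$$\mathbf{r}^{*\prime}_m = \mathbf{T}'\mathbf{r}'_m + \mathbf{t}', \tag{47}$$

$$\mathbf{T}' = \mathbf{S}\mathbf{T}\mathbf{S}^{-1}, \tag{48a}$$

$$\mathbf{t}' = (\mathbf{1} - \mathbf{T}')\mathbf{s} + \mathbf{S}\mathbf{t}. \tag{48b}$$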
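The transformation rules (48a) and (48b) can be checked on a numerical example. The sketch below is illustrative only: the helper names are hypothetical, the matrices and vectors are made-up values, and the inverse of $\mathbf{S}$ is supplied by hand rather than computed.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(a, v):
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def add(u, v):
    return [x + y for x, y in zip(u, v)]

def sub(u, v):
    return [x - y for x, y in zip(u, v)]

def transform_deformation(T, t, S, S_inv, s):
    """Representation (T', t') in K' of the deformation (T, t) given in K."""
    T_prime = matmul(matmul(S, T), S_inv)                      # (48a)
    t_prime = add(sub(s, matvec(T_prime, s)), matvec(S, t))    # (48b)
    return T_prime, t_prime

# Made-up example values (assumptions of this sketch).
S = [[1, 1, 0], [0, 1, 0], [0, 0, 2]]
S_inv = [[1, -1, 0], [0, 1, 0], [0, 0, 0.5]]
s = [4, 5, 6]
T = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]
t = [1, 2, 3]
r = [1, -2, 0.5]

r_star = add(matvec(T, r), t)                # deformation (44) in K
r_prime = add(matvec(S, r), s)               # (43) applied to r
r_star_prime = add(matvec(S, r_star), s)     # (43) applied to r*

T_prime, t_prime = transform_deformation(T, t, S, S_inv, s)
print(add(matvec(T_prime, r_prime), t_prime), r_star_prime)
```

Both printed vectors should agree, which is the content of (47): deforming in $K$ and then changing representation gives the same result as changing representation first and deforming with $\mathbf{T}'$, $\mathbf{t}'$.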

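Relation (47) gives the connection between

$$\mathbf{r}^{*\prime}_m = K'(\mathfrak{r}^*_m) \quad \text{and} \quad \mathbf{r}'_m = K'(\mathfrak{r}_m),$$

i.e. it gives the connection between the coordinate vectors of $\mathfrak{P}_m$ and $\mathfrak{P}^*_m$ in the representation $K'$.

127. In a more general notation we may also write

$$\mathfrak{T}(\mathfrak{r}_m) = \mathfrak{r}^*_m \quad \text{and} \quad \mathfrak{T}(\mathfrak{P}_m) = \mathfrak{P}^*_m, \tag{49}$$

where $\mathfrak{T}$ stands for the deformation which shifts the points $\mathfrak{P}_m$ into the points $\mathfrak{P}^*_m$. The representations of $\mathfrak{T}$ relative to $K$ and $K'$ can be written

$$K(\mathfrak{T}) = \mathbf{T}, \mathbf{t}, \qquad K'(\mathfrak{T}) = \mathbf{T}', \mathbf{t}'.$$

The relations between the representations $\mathbf{T}, \mathbf{t}$ and $\mathbf{T}', \mathbf{t}'$ of $\mathfrak{T}$ are given by (48a, b). The points $\mathfrak{P}_m$, $m = 1, 2, \ldots, n$, can also be taken to be the points constituting some physical system

$$\mathfrak{Q} = \mathfrak{P}_1, \mathfrak{P}_2, \ldots, \mathfrak{P}_n.$$

Applying the deformation operation $\mathfrak{T}$ to the points of $\mathfrak{Q}$ we obtain another system

$$\mathfrak{Q}^* = \mathfrak{P}^*_1, \mathfrak{P}^*_2, \ldots, \mathfrak{P}^*_n,$$

and thus we may also write

$$\mathfrak{Q}^* = \mathfrak{T}(\mathfrak{Q})$$

where $\mathfrak{Q}^*$ is a deformed and displaced version of $\mathfrak{Q}$, the operator $\mathfrak{T}$ giving the deformation.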

3. ORTHOGONAL TRANSFORMATIONS a. DEFINITIONS

128. In particular we may be interested in deformations, which we shall denote by $\mathfrak{O}$, which leave the distances $\mathfrak{r}_{kl}$ between pairs of points $\mathfrak{P}_k$, $\mathfrak{P}_l$ of $\mathfrak{Q}$ unchanged. Writing $\mathbf{O}$ in place of $\mathbf{T}$ we find from (44)

$$\mathbf{r}^*_k = \mathbf{O}\mathbf{r}_k + \mathbf{t}, \tag{50}$$

and therefore

$$\mathbf{r}^*_{kl} = \mathbf{O}\mathbf{r}_{kl},$$

thus

$$r^{*2}_{kl} = \tilde{\mathbf{r}}_{kl}\tilde{\mathbf{O}}\mathbf{G}\mathbf{O}\mathbf{r}_{kl}. \tag{51}$$

Provided $\mathbf{O}$ satisfies the relation

$$\tilde{\mathbf{O}}\mathbf{G}\mathbf{O} = \mathbf{G} \tag{52}$$

we find from (51)

$$r^*_{kl} = r_{kl}, \qquad k, l = 1, 2, \ldots$$

Thus the deformation matrix $\mathbf{O}$ produces deformations which leave the measures $r_{kl}$ unchanged. We call such matrices orthogonal matrices and the transformations (50) orthogonal transformations. In particular, in an orthogonal representation $K_0$ with $\mathbf{G} = \mathbf{1}$, (52) reduces to

$$\tilde{\mathbf{O}}\mathbf{O} = \mathbf{1}.$$

129. Equation (52) gives the definition of the representation of an orthogonal matrix in one particular system of reference. We may write $\mathbf{O} = K(\mathfrak{O})$. From (48a) and (39) we find

$$\mathbf{O}' = \mathbf{S}\mathbf{O}\mathbf{S}^{-1}, \qquad \mathbf{G}' = \tilde{\mathbf{S}}^{-1}\mathbf{G}\mathbf{S}^{-1}, \tag{53}$$

where

$$\mathbf{O}' = K'(\mathfrak{O}) \quad \text{and} \quad \mathbf{G}' = K'(\mathfrak{G}).$$

Thus, introducing (53) into (52), we find

$$\tilde{\mathbf{O}}'\mathbf{G}'\mathbf{O}' = \mathbf{G}'.$$

We see therefore that the definition (52) is independent of the system of reference in which we represent $\mathfrak{G}$ and $\mathfrak{O}$.

130. Relation (52) gives nine equations for the nine elements of $\mathbf{O}$. However, since $\mathbf{G}$ is a symmetric matrix, only six out of the nine equations are independent, and the matrices obeying (52) form a set depending on three independent parameters. We may denote this by writing $\mathbf{O}_p$ in place of $\mathbf{O}$, where $p$ stands for the parameters. We may also write $\mathfrak{O}_\mathfrak{p}$ for the matrix giving a particular deformation and

$$K(\mathfrak{O}_\mathfrak{p}) = \mathbf{O}_p \quad \text{or} \quad K(\mathfrak{p}) = p.$$

Thus we write $\mathfrak{p}$ for the parameters defining a particular orthogonal deformation matrix and write $p$ for its representation in a system $K$.

b. GROUP CHARACTER OF ORTHOGONAL MATRICES

131. The orthogonal matrices $\mathbf{O}_p, \mathbf{O}_q, \ldots$ form the so-called orthogonal group. Indeed, taking the determinant of both sides of (52) we find, since $\det \mathbf{G} > 0$,

$$\det \mathbf{O} = \pm 1. \tag{54}$$

We thus see that there exist orthogonal matrices with determinant $+1$ and others with determinant $-1$. From (54) it follows that any orthogonal matrix $\mathbf{O}_p$ possesses an inverse $\mathbf{O}_p^{-1}$; thus multiplying (52) from the left by $\tilde{\mathbf{O}}_p^{-1}$ and from the right by $\mathbf{O}_p^{-1}$ we find

$$\tilde{\mathbf{O}}_p^{-1}\mathbf{G}\mathbf{O}_p^{-1} = \mathbf{G},$$

thus $\mathbf{O}_p^{-1}$ is also an orthogonal matrix. Furthermore, if $\mathbf{O}_p$ and $\mathbf{O}_q$ are orthogonal matrices, then

$$\tilde{\mathbf{O}}_p\mathbf{G}\mathbf{O}_p = \tilde{\mathbf{O}}_q\mathbf{G}\mathbf{O}_q = \mathbf{G}, \tag{55}$$

thus the product

$$\mathbf{O}_r = \mathbf{O}_p\mathbf{O}_q$$

is also an orthogonal matrix. Indeed, multiplying (52), written for $\mathbf{O}_p$, from the left by $\tilde{\mathbf{O}}_q$ and from the right by $\mathbf{O}_q$, and remembering that $\tilde{\mathbf{O}}_r = \tilde{\mathbf{O}}_q\tilde{\mathbf{O}}_p$, we find

$$\tilde{\mathbf{O}}_r\mathbf{G}\mathbf{O}_r = \mathbf{G}.$$

The unit matrix also obeys (52) and matrix multiplication is always associative; thus we see that the matrices obeying (52) fulfil the postulates of a group and indeed form a group. The matrices with $\det \mathbf{O} = +1$ form a subgroup, which may be called the proper orthogonal group. Since the $\mathbf{O}_p$ form a group, their transforms to another system of reference, i.e. the matrices

$$\mathbf{O}'_{p'} = \mathbf{S}\mathbf{O}_p\mathbf{S}^{-1}, \tag{56}$$

also form a group. The relation (56) also defines the transformation of the representation of the parameters $\mathfrak{p}$ from one system of reference to another.

132. In particular we may consider coordinate transformations $\mathbf{S}$ taken with orthogonal matrices $\mathbf{S} = \mathbf{O}^{(q)}$. We write $\mathbf{O}^{(q)}$ in place of $\mathbf{O}_q$ to signify that $\mathbf{O}^{(q)}$ does not represent an operator but is the matrix of a coordinate transformation. In the case of an orthogonal coordinate transformation we may write in place of (56)

$$\mathbf{O}'_{p'} = \mathbf{O}^{(q)}\mathbf{O}_p\mathbf{O}^{(q)-1}. \tag{57}$$

Because of the group character of the transformations the product on the right-hand side of (57) defines an orthogonal matrix $\mathbf{O}_{p'}$ which is an element of the representation of $\mathfrak{O}$ in $K'$. Thus an orthogonal transformation of the orthogonal group changes only the parameters of the deformation operators. It is convenient to define the parameters of the coordinate transformations such that we have

$$\mathbf{O}^{(q)} = \mathbf{O}_q^{-1}; \tag{58}$$

using the above definition, we have

$$\mathbf{r}^{*\prime}_m = \mathbf{O}^{(q)}\mathbf{O}_q\mathbf{r}_m = \mathbf{r}_m,$$

i.e. the representation of $\mathfrak{r}_m$ in $K$ becomes equal to the representation of $\mathfrak{r}^*_m$ in $K'$ if the coordinate transformation $K \to K'$ is taken with the same parameter as the operator producing the deformation $\mathfrak{Q} \to \mathfrak{Q}^*$. It can be seen easily that, using the definition (58), the coordinate transformation $\mathbf{O}^{(q)}$ corresponds to a deformation according to $\mathbf{O}_q$ of the system of reference.

133. The question may be raised as to what are the common features of the various representations $\mathbf{O}_p, \mathbf{O}'_{p'}, \mathbf{O}''_{p''}, \ldots$ of a deformation matrix. From (56) we see that a change of representation of $\mathfrak{O}_\mathfrak{p}$ leaves the eigenvalues of the matrix $\mathbf{O}_p$ unchanged. The eigenvalues of an orthogonal matrix $\mathbf{O}_p$ with real elements can be shown to have the form

$$\varepsilon, \quad e^{i\varphi}, \quad e^{-i\varphi}, \qquad \varepsilon = \pm 1,$$

thus they depend on one real parameter $\varphi$ only. As can be seen easily, an orthogonal deformation

$$\mathbf{r}^*_m = \mathbf{O}_p\mathbf{r}_m + \mathbf{t}$$

can always be regarded as a shift and a turning round through an angle $\varphi$. In particular, for any orthogonal matrix $\mathfrak{O}_\mathfrak{p}$ a representation $K_0$ exists so that

$$K_0(\mathfrak{G}) = \mathbf{1}, \qquad K_0(\mathfrak{O}_\mathfrak{p}) = \begin{pmatrix} \cos\varphi & -\sin\varphi & 0 \\ \sin\varphi & \cos\varphi & 0 \\ 0 & 0 & \varepsilon \end{pmatrix},$$

where $\varepsilon = +1$ is found for the proper and $\varepsilon = -1$ for the improper orthogonal matrices. The measure $\varphi$ of the angle is independent of the representation; we thus have

$$K(\varphi) = K'(\varphi) = \varphi$$

in any system of reference. The improper matrices produce, apart from the turning round by an angle $\varphi$, also a reflexion on a plane perpendicular to the axis of rotation.

4. RIGID BODIES

134. A deformation of the form (44) describes the type of change which a solid suffers if it is moved and also submitted to an elastic stress. An orthogonal deformation of the type (50) takes place when we move a rigid body. We may in fact call a physical system a rigid body if, when it is moved about, the coordinates of its points change according to orthogonal transformations. The latter assertion may appear at first sight to contain a vicious circle. We determine coordinate vectors by methods as described in 115, using measuring rods which are supposed to behave like rigid bodies. Using a system of reference thus obtained we assert that a rigid body, when moved about, changes its coordinate vectors according to orthogonal transformations. In fact the above assertions have a good physical significance. Indeed, using measuring rods alleged to be rigid, we can determine coordinate vectors. Whether the measuring rods indeed behaved like rigid bodies can be found out by investigating, with the methods described in 116, whether or not the system of coordinates thus obtained forms a consistent set. Thus we may conclude: we can construct a system of reference $K$ using rigid measuring rods, and that the measuring rods used are indeed rigid is confirmed if the coordinates obtained with their help are found to be consistent. That solids behave in practice to a good approximation as rigid bodies is found from experience. We think of the experience of carpenters and architects, and also of engineering and geodetic experience.

It may be noted that these practical experiences show merely that solids behave approximately like rigid bodies; but these experiences can be extrapolated, and thus we obtain the notion of the ideal rigid body with the help of abstraction. Real solids deform under stress, and these deformations can be determined from the deviation of their behaviour from that of ideal rigid bodies.

135. It must be added to the above considerations that the procedures described above provide only necessary conditions for the behaviour of ideal solids. If we were to deal with bodies which deform in a suitable manner when moved about, these deformations might be such that they compensate each other, and using measuring rods with such peculiar properties we might still obtain consistent coordinates.

CHAPTER IV

THE LORENTZ TRANSFORMATION

A. THE TIME SCALE

1. GENERAL REMARKS

136. So as to describe moving physical systems we require, apart from coordinate measures, also measures of time. The simplest case is the description of the orbit of a particle. In a given representation K we can take the orbit of a particle to be given by a four-coordinate

$$x(p) = \mathbf{r}(p),\ t(p) \tag{1}$$

where p is an independent parameter, $\mathbf{r}(p)$ is the position vector of the particle at the time $t(p)$, and it is usually supposed that

$$\dot{t}(p) \neq 0.$$

The dot indicates differentiation with respect to p.

So as to be able to describe orbits in this fashion we need a system of reference. As a system of reference we can take a set of points $\mathfrak{P}_1, \mathfrak{P}_2, \ldots$ which are distributed roughly uniformly in some region $\mathfrak{R}$; further, clocks $\mathfrak{C}_1, \mathfrak{C}_2, \ldots, \mathfrak{C}_n$ near the points. In a given representation K we can ascribe arbitrary coordinate vectors

$$K(\mathfrak{P}_\nu) = \mathbf{r}_\nu, \qquad \nu = 1, 2, \ldots, n$$

to the points, and we may adjust the clocks $\mathfrak{C}_\nu$ in a more or less arbitrary fashion. We may require in a purely qualitative manner that the measures of coordinate vectors of points close to each other should not differ too much, and similarly that the readings of close clocks should not deviate appreciably from each other. Having defined a system of reference with the help of standard points and standard clocks, we can, interpolating between the standard coordinates and time measures, determine the four-coordinates x(p) of the orbit of a particle crossing the region $\mathfrak{R}$ by expressions of the form (1).

137. In the previous chapter we have shown that making use of solid measuring rods we can obtain a distinguished representation of coordinate vectors. These distinguished coordinates can be taken to reflect particularly clearly the properties of solids. The question arises whether it is possible

to find a distinguished representation for time measures also, one which reflects particularly clearly certain physical processes.

138. We shall see that it is indeed possible to find distinguished time measures starting from actual physical processes. Before describing these methods we want to emphasize that time as such has no particular rhythm and that time can adequately be expressed in terms of very different measures. All that we can say about time is that it flows in one direction only. In the literature there exists considerable confusion because the selection of a distinguished scale for the measure of time is confused with what is called a "definition of time". By analyzing actual phenomena we can compare the rhythms of processes with each other; we can find e.g. that one process may be accelerated or slowed down relative to another, but our observations are always confined to physical processes which proceed in time, and to the comparison of such processes.

139. An apparent method for the determination of a distinguished time scale is provided by Newton's first law. Indeed, if Newton's first law is to be valid, then there exist systems of reference K such that in terms of their measures the orbits of free particles are given by linear expressions, i.e.

$$\mathbf{r}(t) = \mathbf{r}_0 + \mathbf{v}t. \tag{2}$$

Relation (2) can be checked experimentally. We may consider for the purpose a straight system of reference K in which the coordinate vectors have been determined with the help of solid measuring rods. Standard clocks can be adjusted in the system of reference by observing a number of free particles crossing K and supposing these orbits to be given by (2). Once the clocks in K have thus been adjusted, we can check whether or not the orbits of further free particles appear in the form (2) when the time measures are taken from the clocks adjusted by the former procedure. Thus, observing the orbits of a sufficiently large number of free particles, we can synchronize the clocks in various parts of K; furthermore we can also check the consistency of the synchronization thus obtained.

The measures of time obtained by the clocks thus synchronized can be taken to be a distinguished representation of the measures of time. Furthermore we may denote as ideal clocks those clocks the readings of which give immediately, without correction, the above distinguished measures of time. Ideal clocks are an abstraction just like ideal solids. However, having a number of real clocks which behave to a good approximation like ideal clocks, we can determine their deviations from ideal clocks by observing the inconsistencies which arise when using the readings obtained from them.
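The consistency check of relation (2) may be illustrated by a minimal sketch. It assumes that each observation of a free particle is recorded as a pair (clock reading, coordinate vector); the function names and the sample numbers are hypothetical and serve only as illustration. A straight line is laid through the first and last observation, and the largest deviation of the remaining observations is reported; readings taken from a non-uniform clock show up as a large deviation.

```python
def fit_free_orbit(samples):
    """Return (r0, v) of the line r(t) = r0 + v*t through the first and last sample."""
    (t_a, r_a), (t_b, r_b) = samples[0], samples[-1]
    v = tuple((b - a) / (t_b - t_a) for a, b in zip(r_a, r_b))
    r0 = tuple(a - vi * t_a for a, vi in zip(r_a, v))
    return r0, v


def max_deviation(samples):
    """Largest coordinate deviation of the samples from the fitted form (2)."""
    r0, v = fit_free_orbit(samples)
    return max(
        abs(x - (x0 + vi * t))
        for t, r in samples
        for x, x0, vi in zip(r, r0, v)
    )


# Hypothetical observations of one free particle, timed by a uniform clock.
good = [(t, (1.0 + 2.0 * t, -0.5 * t, 3.0)) for t in (0.0, 1.0, 2.0, 3.0)]

# The same positions timed by a clock whose rate drifts: t_bad = t + 0.1 * t**2.
bad = [(t + 0.1 * t * t, r) for t, r in good]

print(max_deviation(good), max_deviation(bad))
```

With a uniform clock the deviation vanishes; with the drifting clock the orbit no longer appears in the form (2), which is the inconsistency by which deviations from an ideal clock can be detected.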

We may also denote an ideal clock as a clock having constant rhythm, or we shall say that its rate is uniform.

140. We have described how one could in principle obtain a distinguished time scale making use of Newton's first law. The above method is, however, only an ideal method which cannot be made use of in practice, since we cannot escape the gravitational field surrounding us and therefore do not have freely moving particles at our disposal. Nevertheless Newton's laws can be used to synchronize clocks by applying the laws to bodies moving under the influence of outside forces. It is expected that a planetary body falling freely under the action of the gravitational field will rotate uniformly around its axis. Thus it follows from mechanics that the apparent motion of the fixed stars can be used to establish a distinguished or uniform time scale; we obtain thus the sidereal scale. The consistency of the mechanical laws can also be checked by comparing the sidereal time scale obtained from astronomical observation with the time measures obtained from various mechanical clocks, which in their turn should also be expected to provide ideal clocks in the sense explained above.

141. Another time scale can be obtained with the help of the observation of the orbits of the planets in the solar system. According to Newton's laws the orbits have to be Kepler ellipses with certain perihelion motions caused by the mutual perturbations. The observation of planetary motion enables us to determine a system of reference in terms of the measures of which Newton's laws are obeyed to a good approximation. Such a system of reference is called an inertial system of reference. The data obtained from the motion of the planets in terms of the measures of an inertial system provide a distinguished time scale. The latter time scale, the so-called ephemeris scale, is the most consistent scale we have at our disposal at present.
The above procedure not only gives an inertial system of reference and a distinguished time scale, but also provides an internal check of the theory of planetary motion. That this is indeed so can be seen from the fact that in reality it was not possible to fit all the details of the planetary motion into this picture. Constructing the best possible approximation of an inertial system in the measures of which Newton's laws are very nearly correct, one finds that the orbit of the planet Mercury still shows an anomalous motion of its perihelion; the anomaly amounts to a shift of about 0.4" per year. The latter anomaly shows that there exists in nature a small but noticeable deviation from Newton's laws.

142. We find thus that we are in a position to construct distinguished time scales in a number of ways. We can construct a scale with the help of ordinary clocks, we can observe the rotation of the Earth, and we can also obtain a time scale from the observation of planetary motion. By comparing the various scales we obtain checks of consistency. These checks are partly checks whether our hypotheses regarding the laws of motion are correct, but they also provide information whether or not our clocks behave like the ideal clocks we suppose them to be. If e.g. a pendulum clock shows irregularities when compared with the sidereal time scale obtained from the rotation of the Earth, we shall suppose that the clock shows some imperfection, e.g. that the pendulum does not move like a rigid body, or some other feature of the clock will be supposed to cause the irregularity.

143. A more fundamental problem arises when comparing the sidereal time scale with the ephemeris time scale. Actual observations seem to indicate that there exist deviations between the two time measures. It seems, from astronomical observations, that there exist non-trivial discrepancies between the sidereal time measure $t_S$ and the ephemeris time measure $t_E$; thus it seems that

$$\frac{d^2 t_S}{dt_E^2} \neq 0.$$

The latter discrepancy between the two time scales can only be understood by supposing that the rotation of the Earth is not quite uniform in terms of the ephemeris scale. Since, however, the latter scale must be regarded as the more fundamental of the two, this means that there must be some physical process affecting the rotation of the Earth. Indeed, it is very likely that the discrepancies (if they are real) are caused by the fact that the Earth cannot be taken strictly to be a rigid body, and thus its moment of inertia is subject to small changes which in their turn affect its rotation. In this sense the Earth differs from the ideal clock which would be represented by an ideal solid rotating freely.

2. ATOMIC TIME SCALE

144. Recently the technique of constructing atomic clocks has been perfected to such an extent that it seems feasible that it will at some time become possible to obtain with the help of such clocks a well-defined atomic time scale. If a sufficiently accurate atomic time scale could thus be obtained, then one could investigate whether or not

$$\frac{d^2 t_A}{dt_E^2} = 0 \tag{3}$$

where $t_A$ is the measure of the atomic time and $t_E$ that of the ephemeris time. Should future observation show that (3) is not satisfied, this would be an interesting result from the theoretical point of view.

It must be emphasized that if there existed a deviation from (3) one could not put the question whether the ephemeris time scale or the atomic time scale gives the true scale of time. A deviation from relation (3) would simply show that some of the quantities we suppose to be constant do in fact vary in time. We remark that the introduction of sidereal time depended on the assumption that the Earth rotates uniformly, and one of the main conditions for this is the assumed fact that the moment of inertia M of the Earth is constant. The deviation between the sidereal and ephemeris scales seems to indicate that in fact M is subject to variations. Similarly we expect (3) to hold provided certain elementary constants, like the gravitational constant G on the one hand and the elementary charge e, Planck's constant h and others on the other hand, are constant. If some of these constants were to change in relation to the others, then such a variation could cause a deviation from (3). In particular, theories have been put forward predicting a gradual change of the gravitational constant; such a change, if it exists, might cause a deviation from (3).

3. SYSTEMS OF REFERENCE CONSTRUCTED WITH THE HELP OF LIGHT SIGNALS

145. Systems of reference including time scales can also be obtained by making use of light signals. We may start from the assumption that light is propagated isotropically with a constant velocity c. Supposing this to be the case relative to a system of reference K, we can determine coordinate measures and synchronize clocks inside this system making use of signals of light only.

146. Using light signals we may obtain the coordinates and time measures in two steps. Let us consider a number of clocks $\mathfrak{C}_0, \mathfrak{C}_1, \mathfrak{C}_2, \ldots$ placed in points $\mathfrak{P}_0, \mathfrak{P}_1, \mathfrak{P}_2, \ldots$ As the first step of constructing the system of reference K take the clock $\mathfrak{C}_0$ to be the standard. The rhythm of the clocks $\mathfrak{C}_1, \mathfrak{C}_2, \ldots$ can be adjusted to be equal to that of $\mathfrak{C}_0$. For this purpose we may view $\mathfrak{C}_0$ from $\mathfrak{P}_k$, $k = 1, 2, \ldots$, say, through telescopes. The clock $\mathfrak{C}_k$ can be adjusted so as to be synchronous in rate with $\mathfrak{C}_0^{(k)}$, the image of $\mathfrak{C}_0$ as seen in the telescope near $\mathfrak{P}_k$. Once the rates of the clocks $\mathfrak{C}_k$, $k = 1, 2, 3, \ldots$ are thus adjusted, we can check the consistency of the adjustment. Indeed, we expect that the clocks thus adjusted should be synchronous to each other. Thus viewing from $\mathfrak{P}_k$ not $\mathfrak{C}_0$ but, say, $\mathfrak{C}_l$, $l > 0$, the image $\mathfrak{C}_l^{(k)}$ of $\mathfrak{C}_l$ seen in the telescope near $\mathfrak{P}_k$ should be found to be synchronous with $\mathfrak{C}_k$. The latter test, when fulfilled, supports not only the original hypothesis about the mode of propagation of light, but it also supports the assumption

that the rate of the standard clock $\mathfrak{C}_0$ was indeed uniform and that the distances between the clocks remained constant.

147. Once the rates of the clocks $\mathfrak{C}_k$ are thus adjusted and found to be consistent, as the second step we can determine the distances $r_{kl}$ between pairs of points $\mathfrak{P}_k$ and $\mathfrak{P}_l$ using a radar method. Indeed, measuring the time of flight $t_{kl}$ a signal of light takes to proceed from $\mathfrak{P}_k$ to $\mathfrak{P}_l$ and back, we can suppose

$$r_{kl} = \tfrac{1}{2}\,c\,t_{kl}. \tag{4}$$

Exchanging light signals between the various pairs of points we can determine according to (4) the numerical values of the $r_{kl}$ experimentally. So as to obtain coordinate vectors $\mathbf{r}_k$ we have to try to satisfy the relations

$$(\mathbf{r}_k - \mathbf{r}_l)^2 = r_{kl}^2, \qquad k, l = 1, 2, \ldots, n. \tag{4a}$$

As was pointed out in 115, the above set of relations is mathematically overdetermined if n > 4. However, in the cases where (4a) admits of solutions, these can be obtained with the help of (33) in 115. In particular, using equ. (42) in 121 we may obtain systems of coordinates in an orthogonal representation.

Once the $r_{kl}$ are determined by return signals and are found to give a consistent set of measures of length, we can proceed to adjust the phases of the clocks $\mathfrak{C}_k$. Indeed, observing $\mathfrak{C}_0$ from $\mathfrak{P}_k$ we expect that the image $\mathfrak{C}_0^{(k)}$ of $\mathfrak{C}_0$ should show a delay of $\Delta t_k = r_{0k}/c$. Thus we have to adjust $\mathfrak{C}_k$ so as to show a phase shift $\Delta t_k$ relative to $\mathfrak{C}_0^{(k)}$. Adjusting thus the phases of the clocks $\mathfrak{C}_k$, $k = 1, 2, \ldots$ to be synchronous to $\mathfrak{C}_0$, we expect that any pair of clocks $\mathfrak{C}_k$ and $\mathfrak{C}_l$ so synchronized appear also to be synchronous relative to each other. This can be tested by viewing, say, $\mathfrak{C}_l$ from $\mathfrak{P}_k$. If the clocks are consistently synchronized, then the image $\mathfrak{C}_l^{(k)}$ of $\mathfrak{C}_l$ as seen in a telescope near $\mathfrak{P}_k$ should be late by an amount

$$\Delta t_{kl} = r_{kl}/c.$$
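The two steps of 147 can be followed in a small numerical sketch. It assumes three points at rest with hypothetical coordinates, isotropic propagation with the SI value of c, and clocks whose rates are already equal but whose phases are arbitrary; all names and numbers are illustrative assumptions. Distances are obtained by the radar relation (4), the phases of the clocks 1 and 2 are adjusted against the image of the standard clock 0, and the pair (1, 2) is then tested for the expected lag $r_{kl}/c$.

```python
import math

C = 299_792_458.0  # velocity of light in m/s (SI value, used here as an assumption)


def radar_distance(round_trip_time, c=C):
    """Relation (4): r_kl = c * t_kl / 2."""
    return 0.5 * c * round_trip_time


def phase_correction(own_reading, image_of_standard, r_0k, c=C):
    """Amount to add to clock k so that the image of clock 0 lags it by r_0k / c."""
    return image_of_standard + r_0k / c - own_reading


# Hypothetical layout of three points at rest (metres).
points = [(0.0, 0.0, 0.0), (3.0e5, 0.0, 0.0), (0.0, 4.0e5, 0.0)]


def true_distance(k, l):
    return math.dist(points[k], points[l])


# Step 1: simulated round-trip times give the radar distances r_kl.
r = {(k, l): radar_distance(2.0 * true_distance(k, l) / C)
     for k in range(3) for l in range(3)}

# Step 2: clocks start with arbitrary phase errors; clock 0 is the standard.
offsets = [0.0, 0.37, -1.2]
now = 10.0  # instant at which the observers look through their telescopes
for k in (1, 2):
    own = now + offsets[k]
    image0 = (now - true_distance(0, k) / C) + offsets[0]
    offsets[k] += phase_correction(own, image0, r[(0, k)])

# Consistency test: the image of clock 2 seen near point 1 should lag by r_12 / c.
lag_12 = (now + offsets[1]) - ((now - true_distance(1, 2) / C) + offsets[2])
print(lag_12, r[(1, 2)] / C)
```

The agreement of the two printed values is the kind of consistency test described in the text; it is a necessary condition only and does not by itself prove isotropic propagation.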

148. Summarizing the procedure, we see that using the exchange of light signals between the points $\mathfrak{P}_k$, $k = 0, 1, 2, \ldots$ we can synchronize the clocks $\mathfrak{C}_k$, $k = 1, 2, \ldots$ with respect to both rate and phase. Furthermore we can determine the measures $\mathbf{r}_k$ of the coordinate vectors of the points $\mathfrak{P}_k$. The system of coordinate vectors and also the modes of synchronization of clocks obtained by the above procedure can be subjected to a number of tests of consistency. If the tests lead to positive results, this can be taken to support the hypothesis regarding the mode of propagation of light relative to the system of reference K in which the points $\mathfrak{P}_k$ are at rest. However, it is important to point out that the existence of consistent coordinate measures in terms of which light appears to propagate isotropically is only a necessary condition which has to be fulfilled, and this condition may be fulfilled even if the mode of propagation of light is not isotropic; we shall return to this question further below.

B. THE LORENTZ TRANSFORMATION AS COORDINATE TRANSFORMATION

149. We may construct a system of reference with the help of signals of light by the method described in the preceding section. The consistency of the measures thus obtained can be ascertained by timing departures and arrivals of light signals. The following relation is expected to hold:

$$(\mathbf{r}_l - \mathbf{r}_k)^2 - c^2 (t_l - t_k)^2 = 0 \quad\text{with}\quad t_l > t_k \tag{5}$$

for a signal departing at $t_k$ from $\mathbf{r}_k$ and arriving at $t_l$ in $\mathbf{r}_l$. Relation (5) can also be written in a different manner. Let us for this purpose describe events by four-vectors; thus we write

$$K(\mathfrak{E}) = x = x_1, x_2, x_3, x_4$$

where $\mathfrak{E}$ is the event taking place at a time t in a point $\mathfrak{P}$ with the coordinate vector $\mathbf{r}$, and

$$\mathbf{r} = x_1, x_2, x_3, \qquad t = x_4.\,{}^*$$

Considering two events $\mathfrak{E}_k$ and $\mathfrak{E}_l$, where $\mathfrak{E}_k$ is the departure of a light signal from $\mathfrak{P}_k$ and $\mathfrak{E}_l$ the arrival of that signal in $\mathfrak{P}_l$, we may write in the place of (5) also

$$x_{kl}\,\Gamma\,x_{kl} = 0 \tag{6}$$

where

$$x_{kl} = x_l - x_k$$

is the vector of the "four-distance" between the events $\mathfrak{E}_k$ and $\mathfrak{E}_l$, and

$$\Gamma = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -c^2 \end{pmatrix}. \tag{6a}$$

* We deliberately use the fourth coordinate $x_4 = t$ and not $x_4 = ct$ or $x_4 = ict$. The symmetries which can be obtained in various formulae by the latter definitions of the fourth coordinate are misleading ones; by putting e.g. $x_4 = ict$ we obtain such apparent symmetries between space and time coordinates as do not correspond to anything in reality.
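Relation (6) with the fourth coordinate $x_4 = t$ can be tried out numerically. The following is a minimal sketch; the function names and the sample events are hypothetical, and the value of c is arbitrary. The quadratic form vanishes for the departure and arrival of a light signal and is negative for a signal slower than light.

```python
def gamma_form(x, c):
    """Quadratic form x Gamma x with Gamma = diag(1, 1, 1, -c**2), x = (x1, x2, x3, t)."""
    x1, x2, x3, t = x
    return x1 * x1 + x2 * x2 + x3 * x3 - c * c * t * t


def four_distance(x_k, x_l):
    """Four-distance x_kl = x_l - x_k between two events."""
    return tuple(b - a for a, b in zip(x_k, x_l))


c = 2.0  # velocity of light in arbitrary units (assumption for the example)

departure = (1.0, 0.0, 0.0, 0.5)
arrival = (1.0, 6.0, 8.0, 5.5)       # distance 10 covered in time 5, i.e. at velocity c
late_arrival = (1.0, 6.0, 8.0, 6.5)  # the same distance covered more slowly

light = gamma_form(four_distance(departure, arrival), c)
slow = gamma_form(four_distance(departure, late_arrival), c)
print(light, slow)
```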

150. If we succeed in constructing consistent coordinate vectors and succeed in synchronizing clocks in a consistent manner, then we may assume that light is propagated according to (5) or according to the equivalent relation (6). However, the tests which prove the consistency of our procedure give only a necessary condition for light to be propagated isotropically. Indeed, let us replace the coordinates x of events $\mathfrak{E}$ by

$$x' = A x + a, \tag{7}$$

where $\det A \neq 0$ and a is a constant four-vector. We take further

$$x'_{kl} = x'_l - x'_k = A\,x_{kl}. \tag{8}$$

From (8) and (6) we obtain

$$x'_{kl}\,(A^{-1})^{\mathsf T}\,\Gamma\,A^{-1}\,x'_{kl} = 0;$$

thus if

$$A^{\mathsf T}\,\Gamma\,A = \theta\,\Gamma, \qquad \theta \neq 0 \tag{9}$$

then the relation (6) is equivalent to

$$x'_{kl}\,\Gamma\,x'_{kl} = 0.$$

Thus, provided the propagation of light appears isotropic with respect to K, it appears also isotropic with respect to K', where the four-coordinates of events transform according to (7).

151. We shall denote matrices obeying (9) with $\theta = +1$ as Lorentz matrices; thus a Lorentz matrix $\Lambda$ is defined by the relation

$$\Lambda^{\mathsf T}\,\Gamma\,\Lambda = \Gamma. \tag{10a}$$

Multiplying (10a) from the left by $\Gamma^{-1}$ and from the right by $\Lambda^{-1}$ we obtain in place of (10a) a relation which is equivalent to (10a), namely

$$\Lambda^{-1} = \Gamma^{-1}\,\Lambda^{\mathsf T}\,\Gamma. \tag{10b}$$

Thus (10b) can also be taken as the definition of Lorentz matrices. A transformation

$$x' = \Lambda x + \lambda \tag{11}$$

we shall denote a Lorentz transformation. We see thus that if we can construct measures of four-coordinates x in a representation K so that in terms of its measures light appears to be propagated isotropically, then we can construct similar measures relative to a system of reference K' which is obtained from K by a Lorentz transformation.

1. THE EXPLICIT FORM OF THE LORENTZ TRANSFORMATION

152. The relation (10) defining Lorentz matrices corresponds to sixteen relations, one for each of the sixteen elements of the matrices involved. However, taking the transposed of (10) the relation does not change, and thus one sees that only ten of the sixteen relations (10) are independent of each other. Consequently the matrices $\Lambda$ satisfying (10) form a six-parameter manifold. We can also write

$$\Lambda^{(q)\mathsf T}\,\Gamma\,\Lambda^{(q)} = \Gamma \tag{12}$$

where q stands for the six parameters defining one particular solution of (12).

153. Repeating the considerations of 131 for the orthogonal matrices $O^{(\varphi)}$ we find that the $\Lambda^{(q)}$ form a group, the so-called Lorentz group. In particular, taking the determinant of (12) we find that

$$\det \Lambda^{(q)} = \pm 1.$$

154. So as to obtain an explicit representation of $\Lambda^{(q)}$ containing six independent parameters, we consider two particular types of Lorentz transformations. Let us denote by

$$\Lambda^{(\varphi)} = \begin{pmatrix} O & 0 \\ 0 & 1 \end{pmatrix} \tag{13a}$$

with

$$O^{\mathsf T} O = 1 \tag{13b}$$

where O is an orthogonal matrix of third order. We find that $\Lambda^{(\varphi)}$ satisfies (12). Indeed, the transformation (13) corresponds to a turning round of the system of reference leaving the adjustment of the clocks unchanged. A second type of matrices satisfying (12) can be written as*

$$\Lambda^{(v)} = \begin{pmatrix} B & 0 & 0 & -vB \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -vB/c^2 & 0 & 0 & B \end{pmatrix} \tag{14}$$

with

$$B = \left(1 - \frac{v^2}{c^2}\right)^{-1/2}.$$

* The transformation (11) with $\Lambda = \Lambda^{(v)}$ and $\lambda = 0$ is sometimes referred to as "the Lorentz transformation"; this is of course only a particular case of the Lorentz transformations.

The transformation (14) can be generalized by writing

$$\Lambda^{(\mathbf{v})} = \begin{pmatrix} 1 + \dfrac{\mathbf{v}\circ\mathbf{v}}{v^2}\,(B - 1) & -\mathbf{v}B \\ -\mathbf{v}B/c^2 & B \end{pmatrix}. \tag{15}$$

Inserting (14) or (15) into (12) one verifies that $\Lambda^{(\mathbf{v})}$ is indeed a Lorentz matrix.

155. Both (13) and (15) represent a three-parameter manifold. From the group character of the Lorentz matrices it follows that

$$\Lambda^{(\varphi)}\Lambda^{(\mathbf{v})} = \Lambda^{(\varphi,\mathbf{v})}$$

is also a Lorentz matrix. We find from (13) and (15)

$$\Lambda^{(\varphi,\mathbf{v})} = \begin{pmatrix} O + \dfrac{\mathbf{V}\circ\mathbf{v}}{v^2}\,(B - 1) & -\mathbf{V}B \\ -\mathbf{v}B/c^2 & B \end{pmatrix} \tag{16}$$

where

$$\mathbf{V} = O\mathbf{v}, \quad\text{and thus}\quad \mathbf{V}^2 = \mathbf{v}^2.$$

Writing

$$\Lambda^{(\varphi,\mathbf{v})} = \Lambda^{(q)}, \qquad q = \varphi, \mathbf{v}$$

we have a set of Lorentz matrices depending on six independent parameters, e.g. the three components of $\varphi$ which define the orthogonal matrix O and the three components of the velocity $\mathbf{v}$.

156. The matrices with

$$\det \Lambda^{(q)} = +1, \qquad \Lambda^{(q)}_{44} > 0 \tag{17}$$

form a subgroup of the Lorentz group. The latter subgroup is denoted the proper Lorentz group.

It can be shown that all the elements of the proper Lorentz group can be represented in the form (16) with the restriction (17). The whole of the Lorentz group is obtained if we consider, apart from the elements defined by (16) and (17), also elements where $\det \Lambda^{(q)} = -1$ and also where B is replaced by $-B$.
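The statement that (15) and the product (16) satisfy the defining relation (12) can be verified numerically. The sketch below is only an illustration: the helper names are hypothetical, the rotation is taken about the third axis as one particular choice of O, and the numerical values of c and v are arbitrary with v < c and v different from zero. The fourth coordinate is $x_4 = t$, so that $\Gamma = \mathrm{diag}(1, 1, 1, -c^2)$.

```python
import math


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def gamma(c):
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -c * c]]


def boost(v, c):
    """Matrix (15) for a velocity v = (v1, v2, v3) with 0 < |v| < c."""
    v2 = sum(x * x for x in v)
    B = 1.0 / math.sqrt(1.0 - v2 / (c * c))
    lam = [[(1.0 if i == j else 0.0) + v[i] * v[j] * (B - 1.0) / v2 for j in range(3)]
           + [-v[i] * B] for i in range(3)]
    lam.append([-x * B / (c * c) for x in v] + [B])
    return lam


def rotation_z(phi):
    """Matrix (13) with O a rotation by phi about the third axis."""
    co, si = math.cos(phi), math.sin(phi)
    return [[co, -si, 0, 0], [si, co, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def is_lorentz(lam, c, tol=1e-9):
    """Check relation (12): transpose(lam) * Gamma * lam == Gamma."""
    lhs = matmul(matmul(transpose(lam), gamma(c)), lam)
    g = gamma(c)
    return all(abs(lhs[i][j] - g[i][j]) <= tol for i in range(4) for j in range(4))


c = 3.0
lam_v = boost((1.0, -0.5, 2.0), c)
lam_qv = matmul(rotation_z(0.7), lam_v)  # composition as in (16)
print(is_lorentz(lam_v, c), is_lorentz(lam_qv, c), lam_qv[3][3] > 0)
```

A matrix that is not of this type, for instance a pure dilatation of the time coordinate, fails the same test, which shows that the check is not vacuous.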

2. THE PHYSICAL SIGNIFICANCE OF THE PARAMETERS OF THE LORENTZ MATRICES

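157. We may write in place of (16) also

$$\Lambda^{(q)} = \begin{pmatrix} L & \mathbf{u} \\ \mathbf{U} & B \end{pmatrix}$$

with

$$\mathbf{u} = -O\mathbf{v}B, \qquad \mathbf{U} = -\mathbf{v}B/c^2, \qquad L = O + \frac{\mathbf{u}\circ\mathbf{U}}{1 + B},$$

$$B = \left(1 - \frac{v^2}{c^2}\right)^{-1/2} = (1 + \mathbf{u}\,O\,\mathbf{U})^{1/2}.$$

With the help of (10b) we obtain also an explicit expression for the reciprocal of a Lorentz matrix, i.e.

$$\Lambda^{(q)\,-1} = \begin{pmatrix} L^{\mathsf T} & -c^2\,\mathbf{U} \\ -\mathbf{u}/c^2 & B \end{pmatrix}. \tag{18}$$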
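The reciprocal obtained from (10b) can be compared with the block form (18) in a short sketch. The special matrix (14) is used for simplicity, so that O = 1; the helper names and the values of c and v are assumptions made for the example.

```python
import math


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def lorentz_inverse(lam, c):
    """Reciprocal by relation (10b): Gamma^{-1} * transpose(lam) * Gamma."""
    g = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -c * c]]
    g_inv = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -1.0 / (c * c)]]
    return matmul(matmul(g_inv, transpose(lam)), g)


c, v = 2.0, 1.2
B = 1.0 / math.sqrt(1.0 - v * v / (c * c))

# Matrix (14): velocity v along the first axis, fourth coordinate x4 = t.
lam = [[B, 0, 0, -v * B], [0, 1, 0, 0], [0, 0, 1, 0], [-v * B / (c * c), 0, 0, B]]

inv = lorentz_inverse(lam, c)
product = matmul(inv, lam)  # should be the unit matrix

# Blocks of (18): upper right -c^2 U = +vB, lower left -u/c^2 = +vB/c^2.
print(inv[0][3], inv[3][0])
```

The reciprocal is thus the matrix (14) with v replaced by -v, as one expects for the reversed transformation.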

158. Considering the orbit of a particle in the parameter representation x(p) relative to K, we obtain for the orbit in a Lorentz-transformed system of reference K'

$$x'(p) = \Lambda^{(q)} x(p) + \lambda. \tag{19}$$

Separating the space and time measures, we can write also

$$\mathbf{r}'(p) = L\,\mathbf{r}(p) + \mathbf{u}\,t(p) + \mathbf{l}, \qquad t'(p) = \mathbf{U}\,\mathbf{r}(p) + B\,t(p) + t_0 \tag{20}$$

where we have written $\lambda = \mathbf{l}, t_0$. Furthermore, using the relation (18) we can also write

$$\mathbf{r}(p) = L^{\mathsf T}\mathbf{r}'(p) - c^2\,\mathbf{U}\,t'(p) + \mathbf{l}^+, \qquad t(p) = -\mathbf{u}\,\mathbf{r}'(p)/c^2 + B\,t'(p) + t_0^+ \tag{21}$$

where we have written $\lambda^+ = \mathbf{l}^+, t_0^+$ for the constant term of the reversed transformation. With the help of (19) and (20) we can clarify the physical significance of the parameters q occurring in the Lorentz matrices.

159. Supposing that x(p) and x'(p) represent the orbit of a point A relative to K and K' respectively, we can write for the velocity of that point relative to K and K'

$$\mathbf{v}_A = \frac{d\mathbf{r}}{dt} = \frac{\dot{\mathbf{r}}}{\dot{t}}, \qquad \mathbf{v}'_A = \frac{d\mathbf{r}'}{dt'} = \frac{\dot{\mathbf{r}}'}{\dot{t}'} \tag{22}$$

where the dot denotes differentiation with respect to p. Differentiating (21) and (20) respectively with respect to p we obtain thus with the help of (22)

$$\mathbf{v}_A = \frac{L^{\mathsf T}\mathbf{v}'_A - c^2\,\mathbf{U}}{B - \mathbf{u}\,\mathbf{v}'_A/c^2} \quad (a), \qquad \mathbf{v}'_A = \frac{L\,\mathbf{v}_A + \mathbf{u}}{B + \mathbf{U}\,\mathbf{v}_A} \quad (b). \tag{23}$$

Considering a point with $\mathbf{v}'_A = 0$, i.e. a point at rest relative to K', we find from (23a) that the measure of its velocity relative to K is equal to

$$\mathbf{v}_A = -c^2\,\mathbf{U}/B = \mathbf{v}. \tag{24a}$$

Similarly, considering a point A with $\mathbf{v}_A = 0$, i.e. a point at rest relative to K, we find

$$\mathbf{v}'_A = \mathbf{u}/B = -O\mathbf{v} = -\mathbf{V}. \tag{24b}$$

From (24a) and (24b) we conclude that the velocity of K' relative to K in measures of K is equal to $\mathbf{v}$; further, the velocity of K relative to K' in measures of K' is equal to $-\mathbf{V}$. We see therefore that systems of reference K and K' (connected by a Lorentz transformation) are in general in a translational motion relative to each other; the velocity $\mathbf{v}$ of the translation is closely connected with the k4 and 4k (k = 1, 2, 3) elements of the transformation matrix. The matrix

$$L = O + \text{terms of the order of } v^2/c^2;$$

thus for velocities $v \ll c$ the matrix L defines a nearly orthogonal transformation connecting the axes of K and K'. Thus L defines the orientation of the axes of K relative to K'. Taking the higher order terms into consideration, we find that the axes of K' are not exactly orthogonal relative to each other when considered in the measures of K.

160. With the help of Lorentz transformations we can thus construct a six-parameter manifold of systems of reference. The systems have translational velocities relative to each other and their coordinate axes are turned round relative to each other. In terms of the coordinate measures of any of these systems light signals appear to be propagated isotropically with a constant velocity c. We may denote a system of reference thus obtained a Lorentz system of reference. A Lorentz system is therefore an inertial system in which the coordinate measures are chosen in such a way that the propagation of light appears to be isotropic when expressed in terms of them.
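The conclusions (24a) and (24b) can be checked with a few lines of code. The sketch assumes O = 1, so that the blocks L, u, U, B are those of the matrix (15) and V = v; the function names and the sample velocity are hypothetical. A point at rest in K is found to move with -v relative to K', a point moving with v relative to K is at rest in K', and a light signal keeps the velocity c.

```python
import math


def boost_blocks(v, c):
    """Blocks L, u, U, B of matrix (15), i.e. of (16) with O = 1, for 0 < |v| < c."""
    v2 = sum(x * x for x in v)
    B = 1.0 / math.sqrt(1.0 - v2 / (c * c))
    L = [[(1.0 if i == j else 0.0) + v[i] * v[j] * (B - 1.0) / v2 for j in range(3)]
         for i in range(3)]
    u = [-x * B for x in v]
    U = [-x * B / (c * c) for x in v]
    return L, u, U, B


def velocity_in_k_prime(v_a, blocks):
    """Relation (23b): v'_A = (L v_A + u) / (B + U v_A)."""
    L, u, U, B = blocks
    denom = B + sum(U[j] * v_a[j] for j in range(3))
    return [(sum(L[i][j] * v_a[j] for j in range(3)) + u[i]) / denom for i in range(3)]


def speed(w):
    return math.sqrt(sum(x * x for x in w))


c = 1.0
v = (0.6, 0.0, 0.0)
blocks = boost_blocks(v, c)

at_rest_in_k = velocity_in_k_prime((0.0, 0.0, 0.0), blocks)  # relation (24b) with O = 1
comoving = velocity_in_k_prime(v, blocks)                    # a point at rest in K'
light = velocity_in_k_prime((0.0, 1.0, 0.0), blocks)         # a signal of light
print(at_rest_in_k, comoving, speed(light))
```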

C. HOMOGENEOUS PROPAGATION OF LIGHT

1. THE CONCEPT

161. In the preceding section we have started from a hypothesis to the effect that light is propagated isotropically relative to the ether. This hypothesis can be tested by timing arrivals and departures of signals of light. We may suppose light to be propagated isotropically in a certain region if we succeed in introducing coordinate measures such that

$$(x_l - x_k)\,\Gamma\,(x_l - x_k) = 0 \tag{25}$$

where

$$x_k = \mathbf{r}_k, t_k \quad\text{and}\quad x_l = \mathbf{r}_l, t_l$$

are the four-coordinates of the departure of a signal from a point $\mathfrak{P}_k$ and of the arrival of the signal in a point $\mathfrak{P}_l$. The fact that in a region $\mathfrak{R}$ systems of reference can be constructed relative to which the propagation of light obeys (25) proves only that the observations are consistent with the assumption that light is propagated isotropically in $\mathfrak{R}$. The observations do not necessarily prove that light is propagated isotropically in $\mathfrak{R}$. So as to see this in more detail we discuss alternative hypotheses as to the mode of propagation of light.

162. Instead of supposing light to be propagated isotropically, we may try to start from a hypothesis which supposes that light is propagated relative to the ether in a manner not unlike that in which light is propagated in an anisotropic medium. Thus in place of (25) we may suppose

$$(\mathbf{r}_l - \mathbf{r}_k)\,G\,(\mathbf{r}_l - \mathbf{r}_k) - c^2 (t_l - t_k)^2 = 0 \tag{26}$$

where G is a symmetric and positive definite matrix. We may also generalize (26) further, supposing that the ether flows with a velocity $\mathbf{v}$ relative to our system of reference. If this is the case, and if (26) describes the propagation of light relative to the system of coordinates floating together with the ether, we have to suppose that a signal starting at $t = t_k$ from the point $\mathbf{r}_k$ will arrive at $t_l$ in a point the coordinate of which was $\mathbf{r}_l$ at the time of the departure of the signal, which point, however, at the time $t_l$ has shifted to a position with radius vector

$$\mathbf{r}_l + \mathbf{v}(t_l - t_k).$$

We may thus write in place of (26)

$$(\mathbf{r}_l + \mathbf{v}(t_l - t_k) - \mathbf{r}_k)\,G\,(\mathbf{r}_l + \mathbf{v}(t_l - t_k) - \mathbf{r}_k) - c^2 (t_l - t_k)^2 = 0.$$

We may also write

$$(x_l - x_k)\,g\,(x_l - x_k) = 0; \tag{27}$$

$$g = \begin{pmatrix} G & \mathbf{V} \\ \mathbf{V} & -C^2 \end{pmatrix}, \qquad \mathbf{V} = G\mathbf{v}, \qquad C^2 = c^2 - \mathbf{v}\mathbf{V} \tag{28}$$

and also

$$\det g = -c^2 \det G.$$

The above consideration leading to the hypothesis (27), (28) is an auxiliary consideration. Independent of the details of the above consideration we shall investigate the law of propagation of light in accord with (27), supposing g to be a symmetric matrix such that $g_{44} < 0$ and such that g possesses three positive and one negative eigenvalue.
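The step from the flowing-ether form of (26) to the quadratic form (27) may be written out as a short worked expansion. The abbreviations $\mathbf{r} = \mathbf{r}_l - \mathbf{r}_k$ and $\tau = t_l - t_k$ are introduced here only for brevity and are not the author's notation; the symmetry of G is used in the second line, and the determinant follows from the usual rule for a bordered matrix.

```latex
\begin{aligned}
(\mathbf{r} + \mathbf{v}\tau)\,G\,(\mathbf{r} + \mathbf{v}\tau) - c^2\tau^2
  &= \mathbf{r}G\mathbf{r} + 2\,(G\mathbf{v})\,\mathbf{r}\,\tau
     - \left(c^2 - \mathbf{v}G\mathbf{v}\right)\tau^2 \\
  &= \mathbf{r}G\mathbf{r} + 2\,\mathbf{V}\mathbf{r}\,\tau - C^2\tau^2,
     \qquad \mathbf{V} = G\mathbf{v}, \quad C^2 = c^2 - \mathbf{v}\mathbf{V}, \\[4pt]
\det g
  &= \det G \,\left(-C^2 - \mathbf{V}G^{-1}\mathbf{V}\right)
   = \det G \,\left(-C^2 - \mathbf{v}\mathbf{V}\right)
   = -c^2 \det G .
\end{aligned}
```

The first two lines identify the blocks G, V and $-C^2$ of the matrix g, and the last line reproduces the value of the determinant quoted in the text.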

2. TEST FOR HOMOGENEOUS PROPAGATION OF LIGHT

163. The mode of propagation of light as described by (27) and (28) may be called homogeneous propagation of light. Making an assumption about the elements of g, the relation (27) can be tested experimentally in a way which is very similar to the test for isotropy described above. For this purpose we may consider a set of points $\mathfrak{P}_0, \mathfrak{P}_1, \mathfrak{P}_2, \ldots$ and clocks $\mathfrak{C}_0, \mathfrak{C}_1, \mathfrak{C}_2, \ldots$ near the points. Using signals of light we can synchronize the rates of the clocks just as described in 145. The fact that this synchronization can be carried out consistently is a test of the hypothesis regarding the mode of propagation of light. Once the clocks $\mathfrak{C}_k$, $k = 0, 1, 2, \ldots$ are synchronized we can observe the time

$$t_{kl} = t_{kl}^{(1)} + t_{kl}^{(2)} \tag{29}$$

a signal takes to move from $\mathfrak{P}_k$ to $\mathfrak{P}_l$ and back. From (27) and (28) it follows that

$$\mathbf{r}G\mathbf{r} + 2\,\mathbf{V}\mathbf{r}\,t_{kl}^{(1)} - C^2 \left(t_{kl}^{(1)}\right)^2 = 0 \tag{30}$$

where $\mathbf{r} = \mathbf{r}_l - \mathbf{r}_k$ and $t_{kl}^{(1)}$ is the time the signal takes to move from $\mathfrak{P}_k$ to $\mathfrak{P}_l$. Solving (30) we find (remembering that $t_{kl}^{(1)} > 0$)

$$t_{kl}^{(1)} = \frac{\mathbf{V}\mathbf{r} + \sqrt{C^2\,(\mathbf{r}G\mathbf{r}) + (\mathbf{V}\mathbf{r})^2}}{C^2}.$$

The time $t_{kl}^{(2)}$ of the flight back from $\mathfrak{P}_l$ to $\mathfrak{P}_k$ is obtained from the above expression by replacing $\mathbf{r}$ by $-\mathbf{r}$. We find thus

$$t_{kl}^{(1)} + t_{kl}^{(2)} = 2\,(\mathbf{r}\hat{G}\mathbf{r})^{1/2}/C, \tag{31a}$$

$$\hat{G} = G + \frac{\mathbf{V}\circ\mathbf{V}}{C^2}. \tag{31b}$$

From (29) and (31a) it follows that

$$(\mathbf{r}_k - \mathbf{r}_l)\,\hat{G}\,(\mathbf{r}_k - \mathbf{r}_l) - \tfrac{1}{4}\,C^2 t_{kl}^2 = 0. \tag{32}$$

We may take as the definition of the measures of distances $r_{kl}$ that it is given by

$$r_{kl} = \tfrac{1}{2}\,C\,t_{kl}. \tag{33}$$

Supposing (33) we find from (32)

$$(\mathbf{r}_l - \mathbf{r}_k)\,\hat{G}\,(\mathbf{r}_l - \mathbf{r}_k) = r_{kl}^2. \tag{34}$$

The above relation (34) is analogous to relation (23) of chapt. III, 113. The relation (34) gives a system of n(n-1)/2 equations for the 3n unknown coordinates. The right-hand expressions can be taken to be the distances measured with return signals. Thus the system is overdetermined. If the overdetermined system (34) can be solved, then it provides us with consistent coordinate vectors $\mathbf{r}_k$ of the points $\mathfrak{P}_k$, $k = 1, 2, \ldots$, and at the same time we obtain support for our assumptions concerning the mode of propagation of light.

164. Viewing the clock $\mathfrak{C}_k$ from a position near $\mathfrak{C}_l$ we observe an image $\mathfrak{C}_k^{(l)}$ of $\mathfrak{C}_k$ which appears as a clock retarded by an amount

$$t_{kl}^{(1)} = \mathbf{V}(\mathbf{r}_l - \mathbf{r}_k)/C^2 + \tfrac{1}{2}\,t_{kl}.$$

If the coordinate vectors $\mathbf{r}_k$, $\mathbf{r}_l$ have been determined we can determine also $t_{kl}^{(1)}$, and thus we can adjust the phase of $\mathfrak{C}_l$ so as to be synchronous to $\mathfrak{C}_k$. In this manner we can adjust the phases of all the clocks $\mathfrak{C}_k$, $k = 0, 1, 2, \ldots$ The synchronization of the phases provides us again with numerous checks upon our assumptions. Carrying out the whole of the procedure we may arrive at consistent values of the coordinate vectors $\mathbf{r}_k$, $k = 0, 1, 2, \ldots$ and at consistent rates and phases of the clocks $\mathfrak{C}_0, \mathfrak{C}_1, \ldots$ If this is the case we may suppose that the assumption (27) about the mode of propagation of light is consistent with the observed behaviour of light signals.
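The times of flight of 163 and 164 may be checked numerically. In the following sketch the matrix G, the flow velocity v, the value of c and the vector r are hypothetical sample values; the one-way time is taken as the positive root of (30), and the sum of the outward and return times is compared with (31a), (31b). The retardation quoted in 164 is checked as well.

```python
import math


def quad(m, a, b):
    """Bilinear form a m b for a 3x3 matrix m and 3-vectors a, b."""
    return sum(a[i] * m[i][j] * b[j] for i in range(3) for j in range(3))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def one_way_time(r, G, V, C2):
    """Positive root of (30): r G r + 2 (V r) t - C^2 t^2 = 0."""
    vr = dot(V, r)
    return (vr + math.sqrt(C2 * quad(G, r, r) + vr * vr)) / C2


# Hypothetical symmetric, positive definite G and flow velocity v (arbitrary units).
G = [[2.0, 0.3, 0.0], [0.3, 1.0, 0.1], [0.0, 0.1, 1.5]]
v = (0.2, -0.1, 0.3)
c = 1.0

V = [dot(row, v) for row in G]   # V = G v
C2 = c * c - dot(v, V)           # C^2 = c^2 - v V, positive for these values
G_hat = [[G[i][j] + V[i] * V[j] / C2 for j in range(3)] for i in range(3)]  # (31b)

r = (1.0, 2.0, -0.5)             # r = r_l - r_k
t_out = one_way_time(r, G, V, C2)
t_back = one_way_time([-x for x in r], G, V, C2)
t_return = 2.0 * math.sqrt(quad(G_hat, r, r)) / math.sqrt(C2)  # (31a)

print(t_out + t_back, t_return)
```

The outward and return times differ from each other, but their sum depends only on the symmetric form with the matrix of (31b); this is why return signals alone determine the distances (33) without revealing V.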

3. CONNECTION BETWEEN VARIOUS REPRESENTATIONS

165. Having found coordinate measures in terms of which the propagation can be described in a consistent way by (27), we can find other representations in measures of which the propagation of light also appears homogeneous, but which correspond to a matrix g' different from g. Indeed, we may introduce transformed measures

$$x' = S x + s, \tag{35}$$

where S is a fourth order matrix with $\det S \neq 0$. The propagation of light in terms of the new coordinates can be expressed as

$$(x'_l - x'_k)\,g'\,(x'_l - x'_k) = 0,$$

where

$$g' = (S^{-1})^{\mathsf T}\,g\,S^{-1}. \tag{36}$$

Relations (36) are mathematically equivalent with (27). Thus, provided measures x can be found which are consistent with (27), the measures x' are automatically consistent with (36). We may write $\mathfrak{g}$ for the propagation tensor and thus

$$K(\mathfrak{g}) = g, \qquad K'(\mathfrak{g}) = g'. \tag{37}$$

We see that if there exists a system of reference K where the propagation tensor is represented by g, then there exists also a system of reference K' where the propagation tensor is given by g'.

a. TRANSFORMATIONS OF THE PROPAGATION TENSOR

166. Presently we show that there exists always a transformation (35) which changes g from a given form to another given form g'. Let us write g in the form (28). We introduce a mátrix a such that &R<x = g;

(38)

we find that a will satisfy (38) provided we put g~ V\

j v2

1I2

G

" ( o

*

)'

P»>

We use c for the velocity of light in the Lorentz system of reference, thus we suppose r _ _ 2 0

1

Further we take

u —

R c

0 •

C 2 = C 2 - Vv.

The inverse of a can alsó be written explicitly; we find «-I = Í

G _ 1 / 2

(O

-G-^O/C] C0/C

Introducing (39) into (38) we obtain an identity.

We note that (38) can be satisfied if we replace a by a

(0)

=

A

(q)

a >

where A is the orthogonal representation of a Lorentz mátrix. 167. In analogy with (38) we can introduce a mátrix a' satisfying (<ú

öTo' = g,

(40)

/

we find thus

t 'ir-

*H

G'' V'l 1/2

G

C-k

)•

(4i

Comparing (38) and (40) we find

$$\tilde{\alpha}'\,\tilde{\alpha}^{-1}\,g\,\alpha^{-1}\alpha' = g'.$$

Thus putting $S = \alpha'^{-1}\alpha$ we find

$$\tilde{S}^{-1}\,g\,S^{-1} = g'. \tag{42}$$

In place of S as given by (42) we can also use a transformation matrix $S^{(q)}$ defined as

$$S^{(q)} = \alpha'^{-1}\Lambda^{(q)}\alpha, \tag{43}$$

where $\Lambda^{(q)}$ is a Lorentz matrix. Thus transformations (35) where S is given by (42) or (43) lead from a representation K to one K' satisfying (37).

168. We shall always suppose, if we speak of a propagation matrix (or of a propagation tensor), that it is represented by a symmetric matrix with three positive eigenvalues and one negative eigenvalue, so that det g < 0. We see therefore that observing the propagation of light in a region $\mathfrak{R}$ we can merely find out whether or not light is propagated homogeneously in that region, i.e. we can find out whether or not the propagation of signals of light can be described by a quadratic expression of the form (27). We cannot, however, determine the elements of the propagation matrix g itself from the observations of signals of light. Thus the question whether or not light is propagated isotropically in a given region has no particular significance; we can only find out whether or not light is propagated homogeneously in $\mathfrak{R}$.
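The statement of 167 that a matrix of the form (43) leads from g to g' can be checked directly from (38) and (40). The lines below are a sketch of that check; they assume only that $\Lambda^{(q)}$ satisfies $\tilde{\Lambda}^{(q)}\Gamma\Lambda^{(q)} = \Gamma$, from which $(\tilde{\Lambda}^{(q)})^{-1}\Gamma(\Lambda^{(q)})^{-1} = \Gamma$ follows on multiplying by the inverses from the left and from the right. With $(S^{(q)})^{-1} = \alpha^{-1}(\Lambda^{(q)})^{-1}\alpha'$ one finds

$$
\begin{aligned}
(\tilde{S}^{(q)})^{-1}\, g\, (S^{(q)})^{-1}
 &= \tilde{\alpha}'\,(\tilde{\Lambda}^{(q)})^{-1}\,\tilde{\alpha}^{-1}\,\bigl(\tilde{\alpha}\,\Gamma\,\alpha\bigr)\,\alpha^{-1}\,(\Lambda^{(q)})^{-1}\,\alpha' \\
 &= \tilde{\alpha}'\,\bigl[(\tilde{\Lambda}^{(q)})^{-1}\,\Gamma\,(\Lambda^{(q)})^{-1}\bigr]\,\alpha' \\
 &= \tilde{\alpha}'\,\Gamma\,\alpha' = g'.
\end{aligned}
$$

The Lorentz matrix drops out entirely, which is why the observations of light signals leave the choice of $\Lambda^{(q)}$ open.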

D. THE RELATION OF SYSTEMS OF REFERENCE OBTAINED WITH LIGHT SIGNALS AND WITH SOLIDS

169. We have shown in 110 how we can construct, using measuring rods, consistent sets of coordinate vectors. We may denote coordinate vectors obtained with the help of measuring rods by $\bar{\mathbf r}$. The coordinate vectors of a set of points are thus obtained by solving a system of equations

$$(\bar{\mathbf r}_k - \bar{\mathbf r}_l)\,\bar{G}\,(\bar{\mathbf r}_k - \bar{\mathbf r}_l) = \bar r_{kl}^{\,2}, \tag{44}$$

where the $\bar r_{kl}$ are the distances between points as measured with the help of rods. Similarly, using signals of light instead of measuring rods we can determine measures of distances $r_{kl}$ in accord with 163 and we obtain coordinate measures $\mathbf r_k$ as solutions of the equations

lk

k

VO V \

l

(r, - r )|G + - ^ - j (r, - r ) = r? . k

fc

(45)

k

The question arises whether there exist representations K in which the coordinate measures $\bar{\mathbf r}_k$ satisfying (44) coincide with the measures $\mathbf r_k$ satisfying (45). To answer this question we have first to investigate the relation between the measures $\bar r_{kl}$ and $r_{kl}$ of the distances between points $\mathfrak P_k$ and $\mathfrak P_l$. As the methods of determining $\bar r_{kl}$ and $r_{kl}$ are different there is a priori no reason why these measures should or should not be equal. It is, however, an empirical fact that

$$\bar r_{kl}/r_{kl} = \text{constant},$$

or, choosing suitable units,

$$\bar r_{kl} = r_{kl}. \tag{46}$$

170. The experiments proving (46) are the Michelson-Morley experiment and the Kennedy-Thorndike experiment. Indeed, these experiments prove the following. Marking two points A, B on a solid, the return time of light

$$T_{AB} = t_{AB} + t_{BA}$$

between the two points remains constant if we turn the solid round, or if we shift it about. Changing the position of a solid, the points A, B are shifted into positions A*, B*. From the deliberations of 134 and 135 we have to suppose that in the case of an ideal solid

$$\bar r_{AB} = \bar r_{A^*B^*},$$

thus the measures of the distances between the two points do not change if we shift the solid. From the interferometer experiments we conclude that

$$T_{AB} = T_{A^*B^*}$$

and therefore

$$r_{AB} = r_{A^*B^*}$$

if we take the measure determined interferometrically of a distance to be r = cT. Thus if two distances appear equal, i.e.

$$\bar r_{AB} = \bar r_{PQ},$$

when compared with rods, then they will also appear to be equal when measured with signals of light. If we add to this that the distances of points along straight lines are additive, we see that (46) will be correct for all pairs of points $\mathfrak P_k$ and $\mathfrak P_l$ if it is correct for one particular pair. Thus if we choose the units so that e.g.

$$\bar r_{12} = r_{12},$$

then we expect (46) to hold for any values of k and l.

171. If we can take it as a result of experiment that (46) is valid, then we can obtain coordinate measures by solving the equation

VOV)

(T - r,) JG + — ^ - \ (r - r,) = rf,; see 163 equs (31b), (32) and also K

(47a)

k

{i - í,)G(F - F,) = k

(47b)

fc

The two systems admit of solutions r* = í*

k = 0, 1, 2 , . . . , n

provided (46) is valid and G= G+

VOV

.

(48)

We see therefore that provided (1) we can determine in a region $\mathfrak R$ consistent coordinates with the help of measuring rods, (2) we can also determine consistent coordinates with the help of light signals and (3) we find empirically that the measures of distances $\bar r_{kl}$ are equal to those of $r_{kl}$ (choosing the units suitably), then we can introduce a representation K such that

$$\mathbf r_k = \bar{\mathbf r}_k; \tag{49}$$

for the purpose we have to take VOV

G = G+ - ^ - ,

(50)

as the connection between the elements of g and the elements of G.

172. The requirements (1)-(3) which lead to (49) contain three distinct physical statements. These statements reflect upon: (1) the physical properties of solid rods, (2) the physical properties of the propagation of light, (3) the connection between the physical properties of solids and those of the propagation of light. We note that the physical properties of solids mentioned under (1) are by no means trivial ones. These include the Lorentz contractions suffered by solids.
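The interplay of statements (1)-(3) can be illustrated numerically. The following is a minimal sketch, not the author's calculation: it assumes that light travels isotropically with velocity C relative to the ether, that a rod of given rest length moves with velocity v along the x-axis, and that the rod makes an angle theta with v. The function names and the numerical values are hypothetical. With the Lorentz contraction switched on, the to-and-fro flight time of light along the rod does not depend on theta; without it, the time does depend on theta.

```python
import math

C = 299_792_458.0  # assumed velocity of light relative to the ether, m/s


def round_trip_time(length, theta, v, contract=True):
    """To-and-fro flight time of light along a rod moving with speed v along x.

    The rod has rest length `length` and makes the angle `theta` with v.
    If `contract` is True its component parallel to v is shortened by
    sqrt(1 - v^2/C^2).
    """
    shrink = math.sqrt(1.0 - (v / C) ** 2) if contract else 1.0
    dx = length * math.cos(theta) * shrink   # component parallel to v
    dy = length * math.sin(theta)            # component perpendicular to v
    d_dot_v = dx * v
    d2 = dx * dx + dy * dy
    # Outward and return legs solve |d + v t| = C t and |-d + v t| = C t;
    # the sum of the two positive roots is:
    return 2.0 * math.sqrt(d_dot_v ** 2 + (C * C - v * v) * d2) / (C * C - v * v)


def relative_spread(v, contract, steps=36):
    """Relative variation of the round-trip time over all orientations."""
    times = [round_trip_time(1.0, k * math.pi / steps, v, contract)
             for k in range(steps)]
    return (max(times) - min(times)) / min(times)


if __name__ == "__main__":
    print("with contraction:   ", relative_spread(0.5 * C, True))
    print("without contraction:", relative_spread(0.5 * C, False))
```

The large velocity 0.5 C is chosen only to make the effect visible in floating-point arithmetic; the algebra is the same for the orbital velocity of the Earth.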

CHAPTER V

THE LORENTZ PRINCIPLE

A. THE LORENTZ TRANSFORMATION AS DEFORMATION

173. After the preliminary considerations given above we are now in a position to give the Lorentz principle, i.e. we can formulate the general law which explains why it is impossible to observe directly the translational motion of a physical system relative to the ether. We have seen that in the case of the Michelson-Morley experiment the geometrical effect from which we might hope to determine v is exactly compensated by a deformation of the interferometer. Similarly, from the other experiments described in chapter II we have to conclude that experimental arrangements, when moving relative to the ether, suffer deformations which compensate exactly those effects that might otherwise be suitable for determining the velocity of the system relative to the ether. Presently we analyse in more detail the deformations which are suitable to compensate exactly the effects caused by the translational motion of a system relative to the ether. We shall see that the compensating effects can be described in terms of what we shall call Lorentz deformations. We give the definition and properties of these deformations below.

1. DEFORMATION OPERATORS

174. Consider a physical system $\mathfrak Q$ consisting of particles in points $\mathfrak P_k$, k = 1, 2, ..., n. The representation Q of $\mathfrak Q$ relative to K, i.e.

$$Q = K(\mathfrak Q),$$

can be expressed with the help of the four-coordinates of the points $\mathfrak P_k$,

$$x_1, x_2, \ldots, x_n.$$

We do not suppose $\mathfrak Q$ to be at rest and we allow that the points $\mathfrak P_k$ of $\mathfrak Q$ may move relative to each other. The motion of the points of $\mathfrak Q$ can be expressed in a parameter representation, i.e. putting

$$x_k = x_k(p) = (\mathbf r_k(p),\ t(p)), \qquad k = 1, 2, \ldots, n,$$

where $\dot t(p) > 0$. Thus choosing various values of the parameter p we obtain the positions of $\mathfrak P_k$ at various times t(p). Applying a reversible linear transformation we obtain four-coordinates

$$x_k^*(p) = T x_k(p) + a. \tag{1}$$

The four-coordinate $x_k^*(p)$ defines the orbit of a point in terms of its coordinates relative to K. The operation determined by T and a thus relates the orbit of any point $\mathfrak P_k$ to that of another point $\mathfrak P_k^*$. The points $\mathfrak P_k^*$ thus obtained can be taken to form a physical system $\mathfrak Q^*$ and thus (1) can be taken to define (in the representation K) a physical system $\mathfrak Q^*$ to a given physical system $\mathfrak Q$. We can write symbolically

$$\mathfrak Q^* = \mathfrak T(\mathfrak Q).$$

Fig. 15. Scheme of the Lorentz deformation of an orbit

175. Relation (1) has the mathematical form of a coordinate transformation; it represents, however, not a coordinate transformation, but a deformation. Indeed, the coordinates $x_k^*(p)$, just as the coordinates $x_k(p)$, represent the orbits of points relative to one and the same system of reference K (see Fig. 15). One finds that the representations of $\mathfrak Q$ and $\mathfrak Q^*$ relative to a system K' can be written

$$x_k'(p) = S x_k(p) + s, \qquad x_k^{*\prime}(p) = S x_k^*(p) + s, \tag{2}$$

and comparing (2) with (1) we find

$$x_k^{*\prime}(p) = T' x_k'(p) + a', \quad\text{with}\quad T' = S T S^{-1}, \quad a' = (1 - T')s + S a. \tag{3}$$

We can thus write

$$K(\mathfrak T) = T,\ a \qquad\text{and}\qquad K'(\mathfrak T) = T',\ a';$$

T, a and T', a' represent the deformation $\mathfrak T$ in terms of K and K' respectively.

2. LORENTZ DEFORMATIONS

176. A particular type of deformation is obtained if we take T to be a Lorentz matrix; thus we call $\mathfrak Q^*$ a Lorentz deform of $\mathfrak Q$ if the deformation leading from the coordinates of $\mathfrak Q$ to those of $\mathfrak Q^*$ is given by a Lorentz matrix. Thus we write

$$\mathfrak L_q(\mathfrak Q) = \mathfrak Q^* \tag{4}$$

and in an orthogonal representation we have

$$x_k^*(p) = \Lambda_q x_k(p) + l, \tag{5}$$

where $\Lambda_q$ obeys

$$\tilde\Lambda_q\,\Gamma\,\Lambda_q = \Gamma. \tag{6}$$

177. The Lorentz deformations form a group, as follows from the properties of the Lorentz matrices. Further, there is no need to restrict ourselves to orthogonal representations. In place of (5) and (6) we can also write

$$x_k^*(p) = M_q x_k(p) + l \tag{7a}$$

with

$$\tilde M_q\,g\,M_q = g, \tag{7b}$$

where $g = K(\mathfrak g)$ is the representation of the propagation tensor in the measures of K, while

$$K(\mathfrak L_q) = M_q,\ l$$

is the representation of the Lorentz deformation with respect to K. Making use of (3) we find that

$$M_q' = S M_q S^{-1},$$

and thus

$$\tilde M_q'\,g'\,M_q' = g',$$

provided (7b) holds and

$$g' = \tilde S^{-1} g\,S^{-1}.$$

We see thus that a deformation which appears as a Lorentz deformation as given by (5) in the measures of one straight system of reference K appears as a Lorentz deformation in any straight system of reference. Relation (4) has thus a well-defined meaning independent of the representation we choose.

178. We have to clarify a little further the contents of (5) defining the Lorentz deformations. For this purpose we use an orthogonal representation. Separating in (5) time and space parts we may also write (see 158 (20))

$$\mathbf r_k^*(t^*) = L\,\mathbf r_k(t) + \mathbf u\,t + \mathbf l, \tag{8a}$$

$$t^* = \mathbf U\,\mathbf r_k(t) + B\,t + t_0, \tag{8b}$$

i.e. the point $\mathfrak P_k^*$ has at a time t* a coordinate vector $\mathbf r_k^*(t^*)$, and t can be regarded as an independent parameter with the help of which $\mathbf r_k^*(t^*)$ and t* can be evaluated. So as to avoid misconceptions we may also write in place of (8), using a slightly different notation:

$$\mathbf r_k^*(t) = L\,\mathbf r_k(t_k) + \mathbf u\,t_k + \mathbf l, \tag{9a}$$

$$t = \mathbf U\,\mathbf r_k(t_k) + B\,t_k + t_0. \tag{9b}$$

Relations (9a) give the coordinate vectors $\mathbf r_k^*(t)$, k = 1, 2, ..., n, of the points at the time t, while the $t_k$ are parameters which have to be determined for every value of k and t from equation (9b). If the points $\mathfrak P_k$ move in an arbitrary way, then the $\mathbf r_k(t_k)$ are in general non-linear functions of the $t_k$ and the relations (9b) are non-linear equations which have to be solved to determine the values of $t_k$ corresponding to a given value of t. In such cases it is more convenient to regard (9) as parameter representations of the orbits of the points $\mathfrak P_k^*$. Inserting various values for $t_k$ we obtain from (9a) the coordinate vectors $\mathbf r_k^*(t)$ of $\mathfrak P_k^*$ for such times t as obtained in terms of $t_k$ from (9b).
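The two ways of reading (9), solving (9b) for the parameter or using (9) as a parameter representation, can be sketched numerically. The following is only an illustration under stated assumptions: one space dimension, units in which the velocity of light is 1, the constants l and t_0 set to zero, and the coefficients L = B, u = Bv, U = Bv/c^2 of a deformation along the line of motion. The oscillating orbit and all function names are hypothetical. Equation (9b) is solved by bisection, which works because its right-hand side increases monotonically with the parameter as long as the point moves slower than light.

```python
import math

C = 1.0  # units in which the velocity of light is 1 (assumption)


def boost_factor(v):
    """B = 1 / sqrt(1 - v^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)


def deformed_time(tk, orbit, v):
    """Right-hand side of (9b) with t_0 = 0: the time t belonging to the parameter t_k."""
    b = boost_factor(v)
    return b * v * orbit(tk) / C ** 2 + b * tk


def solve_parameter(t, orbit, v, tol=1e-12):
    """Find t_k such that deformed_time(t_k) = t, by bracketing and bisection."""
    lo, hi = -1.0, 1.0
    while deformed_time(lo, orbit, v) > t:
        lo *= 2.0
    while deformed_time(hi, orbit, v) < t:
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if deformed_time(mid, orbit, v) < t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def deformed_position(t, orbit, v):
    """Coordinate x*(t) of the deformed point from (9a) with l = 0."""
    b = boost_factor(v)
    tk = solve_parameter(t, orbit, v)
    return b * orbit(tk) + b * v * tk


def oscillator(tk):
    """Hypothetical orbit in K: a point oscillating with maximal speed 0.2."""
    return 0.1 * math.sin(2.0 * tk)


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 1.5):
        print(t, deformed_position(t, oscillator, 0.6))
```

For a point at rest the parameter can be eliminated in closed form, which gives a convenient check of the numerical solution.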

3. PARTICULAR TYPES OF LORENTZ DEFORMATIONS

179. We consider the deformations obtained for a physical system $\mathfrak Q$ consisting of points $\mathfrak P_k$ which are at rest relative to K. We thus suppose

$$\mathbf r_k(t) = \mathbf r_k = \text{independent of } t. \tag{10}$$

Let us consider the Lorentz deformations which arise if we use as transformation matrices

$$\Lambda_O = \begin{pmatrix} O & 0 \\ 0 & 1 \end{pmatrix}, \tag{11}$$

with O an orthogonal matrix, and

$$\Lambda_v = \begin{pmatrix} 1 + (B-1)\,\dfrac{\mathbf v\circ\mathbf v}{v^2} & B\mathbf v \\[4pt] B\mathbf v/c^2 & B \end{pmatrix}, \tag{12}$$

which matrices are analogous to the transformation matrices given in 154 (13) and (15). The transformation (11) applied to (10) gives

$$\mathbf r_k^*(t) = O\,\mathbf r_k + \text{constant} = \text{independent of } t. \tag{13}$$

Thus the transformation gives a simple turning round of $\mathfrak Q$ without apparent internal deformation. However, one must be careful when interpreting the above transformation. No internal deformations occur in the sense that the measures of distances

$$r_{kl}^{*2} = (\mathbf r_k^* - \mathbf r_l^*)^2 = (\mathbf r_k - \mathbf r_l)^2 = r_{kl}^2 \tag{14}$$

remain unchanged. The deformation (13) can thus be described by stating that the system $\mathfrak Q$ submitted to the transformation $\Lambda_O$ deforms in such a way that the measures $r_{kl}$ of distances in the representation K do not change their numerical values. This more careful formulation is rendered necessary if we remark that, considering the same deformation in another representation K' relative to which the points $\mathfrak P_k$ and $\mathfrak P_k^*$ are in motion, we find

$$r_{kl}^{*\prime} \neq r_{kl}'.$$

Thus relation (14) is only valid for particular representations of $\mathfrak Q$ and $\mathfrak Q^*$.

180. A transformation with the matrix (12) gives the following:

$$\mathbf r_k^*(t) = \mathbf r_k + (B-1)\,\mathbf v(\mathbf r_k\mathbf v)/v^2 + B\mathbf v\,t_k + \mathbf l \tag{15a}$$

with

$$t = B\,\mathbf v\mathbf r_k/c^2 + B\,t_k + t_0. \tag{15b}$$

We may simplify the notation by writing

$$\mathbf r_k = \mathbf r_k^{(1)} + \mathbf r_k^{(2)}, \tag{16}$$

where

$$\mathbf r_k^{(1)} = \mathbf v(\mathbf r_k\mathbf v)/v^2 \tag{17}$$

is the component of $\mathbf r_k$ parallel to v, while $\mathbf r_k^{(2)} = \mathbf r_k - \mathbf r_k^{(1)}$ is the component perpendicular to v. Eliminating $t_k$ from (15a) with the help of (15b) we find, using the notation (16) and (17),

$$\mathbf r_k^*(t) = \mathbf r_k^{(1)}/B + \mathbf r_k^{(2)} + \mathbf v\,t + \text{constant}.$$

We see thus that $\mathfrak Q^*$ moves with a velocity v relative to $\mathfrak Q$; besides, the measures of $\mathfrak Q^*$ perpendicular to v are equal to the corresponding measures of $\mathfrak Q$ while those parallel to v are shorter by a factor

$$1/B = \sqrt{1 - v^2/c^2}$$

than those of $\mathfrak Q$. Thus $\mathfrak Q^*$ is obtained from $\mathfrak Q$ by a compression in the direction v by a factor $\sqrt{1 - v^2/c^2}$. The deformation $\mathfrak Q \to \mathfrak Q^*$ is just the type of deformation which Lorentz and FitzGerald supposed to happen to a solid if it is set to move relative to the ether.

181. A further type of the Lorentz deformation is obtained considering a clock $\mathfrak C$. Suppose the clock $\mathfrak C$ to rest relative to K in a point r = constant and to give signals at times

$$t^{(k)} = kT, \qquad k = 0, 1, 2, \ldots.$$

The Lorentz deformed clock has a coordinate vector

$$\mathbf r^*(t) = \mathbf v\,t + \text{constant};$$

it gives signals at times

$$t^{(k)*} = B\,t^{(k)} + \text{constant},$$

thus we have

$$t^{(k)*} = kT^* + \text{constant}, \quad\text{with}\quad T^* = T/\sqrt{1 - v^2/c^2}.$$

The clock $\mathfrak C^*$ gives thus its signals at intervals T* > T and thus its rate is slower by a factor $\sqrt{1 - v^2/c^2}$ than that of $\mathfrak C$. This slowing down of a clock is of the same kind as that we have found when discussing the perpendicular Doppler effect and also the experiment of Isaak et al.

182. A further type of the Lorentz deformation, which is important rather for the interpretation of ideal experiments than of real experiments, is the following. Consider as the system $\mathfrak Q$ a cylinder rotating around an axis with an angular velocity $\omega$. A transformation $\Lambda_v$, where v is parallel to the axis of $\mathfrak Q$, leads to a deformed state $\mathfrak Q^*$ which is a cylinder moving with a velocity v in the direction of its axis and rotating with an angular velocity

$$\omega^* = \omega\sqrt{1 - v^2/c^2}.$$

The cylinder $\mathfrak Q^*$ appears to be contracted by a factor $\sqrt{1 - v^2/c^2}$ parallel to its axis and it appears to be twisted around its axis. The angle of twist between two sections which are at a distance a from each other is found to be equal to

$$\frac{a^* v\,\omega^*}{c^2 - v^2},$$

where $a^* = a\sqrt{1 - v^2/c^2}$.
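The contraction of 180 and the clock retardation of 181 can be checked numerically from (15a) and (15b). The sketch below is an illustration under stated assumptions, not part of the original derivation: it uses units in which the velocity of light is 1, sets the constants l and t_0 to zero, and the function names are hypothetical. For a point at rest, (15b) is solved for the parameter t_k and the result is inserted into (15a).

```python
import math

C = 1.0  # units in which the velocity of light is 1 (assumption)


def boost_factor(v):
    """B = 1 / sqrt(1 - v^2/c^2) for a velocity vector v."""
    return 1.0 / math.sqrt(1.0 - sum(vi * vi for vi in v) / C ** 2)


def deform_rest_point(r, v, t):
    """Position at time t of the Lorentz-deformed image of a point resting at r."""
    v2 = sum(vi * vi for vi in v)
    b = boost_factor(v)
    rv = sum(ri * vi for ri, vi in zip(r, v))
    tk = t / b - rv / C ** 2                       # (15b) solved for t_k, t_0 = 0
    return tuple(ri + (b - 1.0) * vi * rv / v2 + b * vi * tk   # (15a), l = 0
                 for ri, vi in zip(r, v))


def deformed_signal_time(r, v, t):
    """Time at which the deformed clock gives the signal given at time t by the clock at r."""
    rv = sum(ri * vi for ri, vi in zip(r, v))
    return boost_factor(v) * (rv / C ** 2 + t)


def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


if __name__ == "__main__":
    v = (0.8, 0.0, 0.0)
    origin = deform_rest_point((0.0, 0.0, 0.0), v, 2.0)
    along = deform_rest_point((1.0, 0.0, 0.0), v, 2.0)
    across = deform_rest_point((0.0, 1.0, 0.0), v, 2.0)
    print("parallel measure:     ", distance(origin, along))    # shortened
    print("perpendicular measure:", distance(origin, across))   # unchanged
    print("signal interval:      ",
          deformed_signal_time((0.3, 0.0, 0.0), v, 1.0)
          - deformed_signal_time((0.3, 0.0, 0.0), v, 0.0))
```

The distances are evaluated at one and the same time t of the representation K, which is essential: the contraction refers to simultaneous positions of the deformed points.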

B. FORMULATION OF THE LORENTZ PRINCIPLE

183. Taking the above results together, the following principle suggests itself: the laws of nature are such that, provided $\mathfrak Q$ is a real physical system, the Lorentz deformed systems

$$\mathfrak Q^* = \mathfrak L(\mathfrak Q)$$

are possible systems obeying the same laws as the system $\mathfrak Q$. We shall call the above principle the Lorentz principle, as it reflects ideas put forward already by Lorentz. We show presently that the Lorentz principle in the above form accounts for the failure to observe the translational motion of physical systems relative to the ether.

1. INTERPRETATION OF THE NEGATIVE RESULTS OF ETHER DRIFT EXPERIMENTS IN TERMS OF THE LORENTZ PRINCIPLE

184. The Lorentz principle gives the general law which explains why it is impossible to determine experimentally the state of translational motion of physical systems relative to the ether. So as to illustrate by an example the results which can be derived from the Lorentz principle, consider a Michelson interferometer which rotates slowly around an axis perpendicular to the plane of its arms. The observations of Michelson and Morley showed that the interference pattern of such a rotating interferometer remains stationary. The above observations may be interpreted by supposing that the interferometer has no translational motion relative to the ether and thus it rotates like a rigid body. The observations can, however, also be interpreted by supposing that the interferometer is in a state of translational motion relative to the ether and in the course of the rotation its arms periodically contract and relax according to their instantaneous direction relative to the direction of v. From the Lorentz principle it follows that if there exists a system $\mathfrak Q$ which is an interferometer having no translational motion relative to the ether and which rotates like a rigid body around an axis, then there may exist also a system $\mathfrak Q^*$ which is a similar interferometer but which has some translational motion relative to the ether and the arms of which contract and relax in a suitable way to keep the interference pattern stationary in the course of rotation. The observer moving together with the interferometer has no means to decide whether the object of his observations is a system $\mathfrak Q$ or a system $\mathfrak Q^*$ and therefore he has no means to decide whether or not he is moving, together with the interferometer, relative to the ether.

185. So as to elaborate the above example in a more general way let us consider some physical system $\mathfrak Q$ with a representation

$$K(\mathfrak Q) = Q,$$

where K is a Lorentz system of reference. Let us suppose that the centre of $\mathfrak Q$ is at rest relative to K, thus the translational velocity of $\mathfrak Q$ is equal to that of K. Any experimental evidence as to the measures of the translational velocity v of K, and also of $\mathfrak Q$, relative to the ether could only be based on an analysis of the numerical values of the components of Q. However, if we suppose that the laws governing the internal motion and the structure of $\mathfrak Q$ are Lorentz invariant, then no information can be obtained about the measures v from the analysis of the measures Q, as we show presently.

186. Indeed, suppose two physical systems $\mathfrak Q$ and $\mathfrak Q^*$ which move relative to each other and suppose that $\mathfrak Q^*$ is a Lorentz deformed version of $\mathfrak Q$, i.e. that

$$\mathfrak Q^* = \mathfrak L_q(\mathfrak Q).$$

Observing the two systems we must necessarily express the results of our observations in terms of coordinate measures. The latter must be taken relative to a fixed system of reference. In a given system of reference K we thus obtain representations

$$K(\mathfrak Q) = Q, \qquad K(\mathfrak Q^*) = Q^*,$$

with

$$Q^* = L_q(Q), \qquad L_q = K(\mathfrak L_q).$$

We may choose K so that the centre of $\mathfrak Q$ is at rest relative to it; in the latter case we find that the centre of $\mathfrak Q^*$ moves with some velocity v relative to K. We can choose, however, also a representation $\bar K$ so that

$$\bar K(\mathfrak Q) = \bar Q, \qquad \bar K(\mathfrak Q^*) = \bar Q^*.$$

From the properties of the Lorentz deformation it follows that $\bar Q^*$ is again a Lorentz deform of $\bar Q$. The Lorentz transformation leading from K to $\bar K$ may be denoted $L_{(p)}$, thus we have

$$L_{(p)}(Q) = \bar Q \qquad\text{and}\qquad L_{(p)}(Q^*) = \bar Q^*.$$

If we choose $L_{(p)}$ to be the inverse of $L_q$, then we have also

$$\bar Q^* = Q;$$

thus the representation Q of $\mathfrak Q$ relative to K consists of the same measures as the representation $\bar Q^*$ of $\mathfrak Q^*$ relative to $\bar K$. Thus knowing merely the measures of the representation Q of a physical system, which we may denote by $\mathfrak Q_0$, we can equally well suppose that this represents a system $\mathfrak Q$ as we can suppose that it represents $\mathfrak Q^*$. If the centre of our system of reference is at rest relative to $\mathfrak Q_0$, then we cannot decide whether our own motion is that of $\mathfrak Q$ or that of $\mathfrak Q^*$. Thus if we find e.g. that

$$K_0(\mathfrak Q_0) = Q$$

we can suppose equally well that

$$K_0 = K \quad\text{and}\quad \mathfrak Q_0 = \mathfrak Q$$

or

$$K_0 = \bar K \quad\text{and}\quad \mathfrak Q_0 = \mathfrak Q^*;$$

thus we cannot decide whether our system of reference moves with $\mathfrak Q$ or with $\mathfrak Q^*$ (we could also suppose that it moves relative to both). Thus we see that from the observations of physical systems we cannot arrive at conclusions as to the translational velocity of these systems relative to the ether, provided the internal motion of the system obeys the Lorentz principle.
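The argument of 186 can be followed on a small numerical example. The sketch below is illustrative only and rests on assumptions not made in the text: one space dimension, units in which the velocity of light is 1, a system consisting of a handful of events (x, t) of two particles at rest, and hypothetical function names. The system is first Lorentz deformed within K; the deformed system is then represented in a second system of reference reached from K by the inverse Lorentz matrix. The resulting measures coincide with those of the undeformed system in K.

```python
import math

C = 1.0  # units in which the velocity of light is 1 (assumption)


def lorentz(v):
    """2x2 Lorentz matrix acting on (x, t) for the velocity v."""
    b = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return ((b, b * v), (b * v / C ** 2, b))


def apply(matrix, events):
    """Apply a 2x2 matrix to a list of events (x, t)."""
    (m11, m12), (m21, m22) = matrix
    return [(m11 * x + m12 * t, m21 * x + m22 * t) for x, t in events]


# Representation Q of the undeformed system in K: two particles at rest,
# each sampled at two times.
rest_system = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (1.0, 2.0)]

# Representation Q* of the Lorentz deformed system, still in K.
deformed = apply(lorentz(0.6), rest_system)

# Representation of the deformed system in the second system of reference,
# reached from K by the inverse Lorentz matrix.
deformed_seen_from_kbar = apply(lorentz(-0.6), deformed)

if __name__ == "__main__":
    for original, seen in zip(rest_system, deformed_seen_from_kbar):
        print(original, seen)
```

The first step changes the system and keeps the system of reference; the second keeps the system and changes the system of reference. That the two operations cancel in the measures is the formal content of the ambiguity described above.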

2. NON-ORTHOGONAL REPRESENTATIONS

187. Observing the measures of systems $\mathfrak Q$ we are not only left in the dark with respect to the translational velocity of $\mathfrak Q$ relative to the ether, but a further ambiguity is left. We may construct a system of reference K in which the propagation of light can be expressed by a relation

$$x\,g\,x = 0, \tag{18}$$

where g is constant but not necessarily equal to $\Gamma$. Such a system of reference may be taken to be an inertial system of reference. The system of reference K in which the propagation of light is expressed by (18) may be taken to be a skew system of reference if we suppose that light is really propagated isotropically. However, we can equally well suppose that the system K is an orthogonal system of reference and that light is in reality propagated anisotropically. Observing the representation of physical systems $\mathfrak Q$ in K and taking various measures of $\mathfrak Q$ with the help of signals of light, we can still not arrive at conclusions as to the real mode of propagation of light. To make this clear we may consider again the Michelson interferometer. Experimentally we can ascertain that when carefully turning round an arm of the interferometer the return flight time of light along the arm does not change from position to position. From such a result one might conclude both that light is propagated isotropically and that the arm behaves like a rigid body; but the possibility must also be admitted that light is propagated anisotropically and that the arm, while being turned round, deforms itself suitably so as to make the return flight time independent of the orientation of the arm.

188. One might argue that it is "simpler" to suppose the former, i.e. that the propagation is isotropic and that the arm does not deform. However, the laws of nature are not necessarily in accord with our requirements of convenience. Apart from that, from the analysis of the Michelson-Morley experiment carried out while the orbital velocity of the Earth changes its direction it follows that the arm must in fact deform on some occasions. Thus, having evidence of the fact that such deformations do occur at all, it is difficult to see what the extent of these deformations ultimately may be. Summarizing our arguments we conclude therefore the following: from measurements of the components of a physical system $\mathfrak Q$ and observations of the propagation of light we can obtain information about the following:

1) whether or not light is propagated homogeneously; we do not get information, however, whether light is propagated isotropically;

2) the internal physical properties of the system $\mathfrak Q$; we do not get information whether or not $\mathfrak Q$ moves relative to the ether;

3) the relations between the physical properties of $\mathfrak Q$ and the mode of propagation of light. We can thus check e.g. whether or not the time of to-and-fro flight of light between two points of $\mathfrak Q$ remains constant if $\mathfrak Q$ is turned round. The latter investigation decides whether $\mathfrak Q$ behaves like an ideal solid. We do not get information whether or not $\mathfrak Q$ when turned round retains its original configuration;

4) the velocity of light, which we can determine by an experiment, e.g. of the type of Fizeau. Such an experiment provides us with the value of

$$c = (-\det g)^{1/2},$$

as can be seen easily. Thus one of the ten components of g can be determined if we give nine of them arbitrary values.

189. From the above considerations we see that from measurement we cannot determine the measure of the velocity v with which our system of reference K moves relative to the ether. We see also that we cannot determine the parameters defining the mode of propagation of light. Thus we can just as little claim that an experiment proves light to be propagated isotropically relative to the ether as we can claim that our system of reference is at rest relative to the ether. Sometimes it is claimed that "the Michelson-Morley experiment proves that light is propagated isotropically relative to any system of reference". The latter claim is quite unfounded. The Michelson-Morley experiment proves merely that the time of to-and-fro flight of a light signal between two points of a solid remains unchanged if we turn the solid round without changing its translational velocity.

190. The state of the ether in a homogeneous region is described by ten parameters, nine of which cannot be determined by the experimental investigation of physical objects. These parameters cannot be determined because the laws of nature (as far as they are in accord with the Lorentz principle) show symmetries which lead to ambiguities in the interpretation of the measured data. A phenomenon, if it can be interpreted consistently using one particular choice of the nine parameters, can equally well be interpreted using another choice for the parameters.

3. GENERAL REMARKS ON THE LORENTZ PRINCIPLE

191. We have thus shown that the Lorentz principle as formulated in 183 accounts for the failure of the experiments which aimed to determine the translational velocity of a physical system relative to the ether. Effects from which we could draw conclusions about this velocity could only be produced by physical systems which do not obey the Lorentz principle.

Whether or not such effects exist, and whether the existing laws are exactly in accord with a Lorentz principle or only to a certain approximation, are open questions. Assertions to the effect that no phenomenon exists which does not obey the Lorentz principle, or to the effect that the known laws are Lorentz invariant with an "absolute" precision, are cheap assertions, based on a naive and dogmatic "belief" in these principles; such assertions cannot be taken seriously from the scientific point of view. It is, however, a remarkable fact that the Lorentz principle is certainly valid with high precision in a rather large field of phenomena, and therefore, without falling into naive exaggerations, it must be taken as one of the fundamental principles of physics.
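The remark of 188 that a Fizeau-type experiment fixes only the combination $c = (-\det g)^{1/2}$ can be illustrated numerically. The sketch below makes an assumption that is not taken from the text: the propagation matrix is written for a system of reference moving with velocity v through an ether in which light is isotropic, using unchanged (Galilean) space and time measures, so that the condition |r + vt| = ct gives the block form with the unit matrix, the vector v and the element -(c^2 - v^2). The function names are hypothetical. Whatever v is chosen, the determinant returns the same c.

```python
import math


def det(m):
    """Determinant of a square matrix (list of lists) by Laplace expansion."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def propagation_matrix(v, c):
    """Assumed form of g for a frame moving with velocity v through the ether.

    Derived from |r + v t|^2 = c^2 t^2, i.e. r.r + 2 (v.r) t - (c^2 - v^2) t^2 = 0.
    """
    vx, vy, vz = v
    v2 = vx * vx + vy * vy + vz * vz
    return [
        [1.0, 0.0, 0.0, vx],
        [0.0, 1.0, 0.0, vy],
        [0.0, 0.0, 1.0, vz],
        [vx, vy, vz, -(c * c - v2)],
    ]


def light_velocity(g):
    """c = (-det g)^(1/2)."""
    return math.sqrt(-det(g))


if __name__ == "__main__":
    for v in [(0.0, 0.0, 0.0), (0.3, 0.4, 0.0), (0.1, -0.2, 0.5)]:
        print(v, light_velocity(propagation_matrix(v, 2.0)))
```

The three components of v are among the parameters that remain undetermined: they change the individual elements of g but not its determinant.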

C. THE DYNAMICAL PRINCIPLE

192. The Lorentz principle as formulated in 183 accounts fully for the negative results of all attempts to determine the velocity v of the Earth relative to the ether. It shows also that it is impossible to ascertain whether or not light is propagated isotropically; thus it shows that it is impossible to determine by direct measurement nine of the ten elements of the propagation matrix g. However, the Lorentz principle in its form given above is insufficient to make positive statements as to the real outcome of experiments. This is so because the Lorentz principle only makes predictions as to the possible configurations $\mathfrak Q^*$, $\mathfrak Q^{**}$, ... which a system $\mathfrak Q$ can take up, but it gives no indication whether or not $\mathfrak Q$ will change into, say, $\mathfrak Q^*$ under certain circumstances. For the interpretation of real experiments it is therefore necessary to have a principle which helps to predict what happens to a given physical system in the case of outside interference. The position regarding real experiments can be illustrated by examples. In the case of the Michelson interferometer experiment we investigate what happens to the apparatus if we turn it round carefully. Similarly in the Trouton-Noble experiment the behaviour of a charged condenser is investigated when it is turned round relative to the direction of orbital motion of the Earth. In the experiments investigating the perpendicular Doppler effect and also in the experiment of Isaak et al. we investigate the change of inner frequencies of a physical system if its translational velocity is changed. In all the above experiments, carefully accelerating a system as a whole, we change its state of translational motion and observe the changes which occur as a result of the change of state of motion.

It is necessary to formulate a theory which predicts the changes to be expected as the result of adiabatic interferences which change the state of translational motion of the system.

193. Predictions as to the results of such interferences can be made with the help of the following principle, which is compatible with the originally formulated Lorentz principle and is an addition to it. We may thus postulate: If a connected physical system is carefully accelerated, then as the result of the acceleration it suffers a Lorentz deformation. The above principle can also be expressed by writing

$$\mathfrak Q(t) = \mathfrak L_{q(t)}(\mathfrak Q), \tag{19}$$

where $\mathfrak Q(t)$ is the configuration $\mathfrak Q$ takes up after an adiabatic interference of duration t has taken place, and q(t) represents the parameter of a Lorentz transformation such that, say, q(0) = 0, i.e. at t = 0, at the start of the interference, $\mathfrak L_{q(0)}$ is the identical transformation. In a particular representation (19) might be written

$$Q(t) = L_{q(t)}(Q).$$
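Read in one space dimension, (19) says that a carefully accelerated system passes through the Lorentz deformed configurations belonging to its instantaneous velocity. The following is a minimal sketch of what this implies for the length of a rod and the phase of a clock; it anticipates the examples of 194. The velocity profile, the units (velocity of light equal to 1) and the function names are hypothetical choices made for illustration, and the system is assumed to follow the equilibrium configuration without lag.

```python
import math

C = 1.0  # units in which the velocity of light is 1 (assumption)


def velocity(t, v_final=0.6, duration=100.0):
    """Hypothetical smooth and slow velocity profile rising from 0 to v_final."""
    if t <= 0.0:
        return 0.0
    if t >= duration:
        return v_final
    return v_final * 0.5 * (1.0 - math.cos(math.pi * t / duration))


def rate_factor(t):
    """sqrt(1 - v(t)^2/c^2), the common factor for lengths and clock rates."""
    return math.sqrt(1.0 - (velocity(t) / C) ** 2)


def rod_length(rest_length, t):
    """Length at time t of a rod accelerated in its longitudinal direction."""
    return rest_length * rate_factor(t)


def clock_phase(omega, t, steps=10_000):
    """Phase accumulated up to time t: omega times the integral of the rate factor.

    The integral is evaluated with the trapezoidal rule.
    """
    h = t / steps
    total = 0.5 * (rate_factor(0.0) + rate_factor(t))
    total += sum(rate_factor(k * h) for k in range(1, steps))
    return omega * h * total


if __name__ == "__main__":
    for t in (0.0, 50.0, 100.0, 200.0):
        print(t, rod_length(1.0, t), clock_phase(1.0, t))
```

After the acceleration has ended the rod keeps the length belonging to the final velocity, and the clock phase grows at the reduced constant rate; the phase lost during the acceleration depends on the whole history v(t).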

194. As particular examples we might mention the following:

1) A rod of length l accelerated carefully in the longitudinal direction deforms so that at a time t its length is given by

$$l(t) = l\sqrt{1 - v(t)^2/c^2},$$

where v(t) is the velocity at the time t.

2) A clock which at rest has a frequency $\omega = 2\pi/T$ will, when made to move with a velocity v(t), change its rate so that at the time t it has a rate approximately equal to

$$\omega(t) = \omega\sqrt{1 - v(t)^2/c^2}.$$

More precisely the phase $\varphi(t)$ of the clock at the time t will be

$$\varphi(t) = \omega\int_0^t \sqrt{1 - v(t')^2/c^2}\;dt'. \tag{20}$$

The relation (20) is supported directly by the experiment of Isaak et al. and also by the experiments on decay times of elementary particles and the results concerned with the perpendicular Doppler effect.

3) A rod at rest will not change the measures of its length when turned round, i.e. we find for the coordinates of the ends of a rod which is turning round with constant angular velocity,

$$\mathbf r_1(t) = 0, \qquad \mathbf r_2(t) = (l\cos\omega t,\ l\sin\omega t,\ 0).$$

If the rod, apart from rotating, also moves with a constant velocity v, then the length of the rod will change periodically. The latter deformation accounts for the result of the Michelson-Morley experiment.

The adiabatic principle formulated in 193 can be taken to hold only approximately, and it holds the better the smaller the acceleration to which the system $\mathfrak Q$ is subjected. To see this we have to discuss the mechanism of the Lorentz deformations in more detail.

1. THE MECHANISM OF THE LORENTZ DEFORMATION

a. RELAXATION PROCESSES

195. The question may be raised what the physical processes are which produce the Lorentz deformation, e.g. what causes a rod to contract when accelerated? To answer this particular question we remark that a solid rod consists of atoms which are in a state of dynamical equilibrium. Accelerating the rod by a small amount, we disturb the equilibrium of the atoms. After the outside forces producing the acceleration cease, the (moving) atoms have to establish again an equilibrium configuration. This process takes place quite independently of whether relativistic effects appear or not. We note that by accelerating a real rod we produce elastic waves inside it, and the state in which the accelerated rod moves as a whole (a configuration with a constant final velocity v) is attained only after the elastic waves produced by the acceleration have died down and the new state of equilibrium has been established. If we take the forces between the atoms to obey the Lorentz principle then we expect the equilibrium configuration of the moving atoms to be slightly different from that of the atoms at rest and therefore, taking the relativistic effects into consideration, we have to expect that, after the elastic disturbance produced in the rod by the outside forces dies down, the system settles down into the equilibrium configuration Q* of the moving system, which configuration differs from Q, the configuration of the system at rest.

196. If the system is accelerated slowly and step by step, then the system has time after each consecutive step to settle down into the new configuration. If on the other hand the system is accelerated in a continuous manner, then its configuration will lag behind the continuously changing configuration Q(t) which corresponds to the changing total velocity of the system. Only after the acceleration has come to an end will the configuration eventually catch up with the real equilibrium configuration corresponding to the final velocity of the system. We illustrate this process schematically in Fig. 16.

Fig. 16. Scheme of the dynamics of a Lorentz contraction

b. COMPARISON BETWEEN CHANGE OF TEMPERATURE AND CHANGE OF TRANSLATIONAL STATE

197. So as to illustrate further the Lorentz deformation we consider a process somewhat analogous to the Lorentz deformation, namely the change of configuration of a body while its temperature is changing. The atoms of a solid have equilibrium configurations which are functions of the temperature of the solid. Suppose we increase the temperature of a solid slowly; the solid will as the result of the heating expand and it will pass through the equilibrium configurations corresponding to the temperatures at the various steps of the heating process. If we succeed, however, in raising the temperature of a system suddenly, then the system is incapable of following the sudden change. In the first instance no change of configuration occurs while the temperature is raised and the system is thus thrown out of equilibrium. The system will after the sudden temperature change gradually take up the new equilibrium configuration. The above process occurs e.g. in the course of an explosion. The analogy to the Lorentz contraction is as follows. If we accelerate a rod suddenly by applying suitable forces simultaneously to all the points of the rod, then the rod attains some velocity but has no time to contract. After the acceleration has ceased the inner forces come into play and they produce in the course of some time a contracted configuration.

c. DEFORMATIONS OF UNCONNECTED SYSTEMS

198. We have pointed out that the adiabatic principle formulated in 193 (19) is only valid for connected systems. This statement can be elucidated by the following example.

Consider an axle with two wheels mounted on it at a distance $l$ from each other (see Fig. 17), and suppose both wheels rotate with an angular velocity $\omega$. If the system is accelerated in the direction of the axis so as to move finally with a velocity $v$ in this direction, then the wheels slow down and their angular velocity is finally reduced to the value

$$ \omega^* = \omega\sqrt{1 - v^2/c^2}. $$

Fig. 17. The Lorentz deformation in connected and non-connected systems

The Lorentz deformed configuration of the two wheels is a configuration in which they do indeed rotate with the angular velocity $\omega^*$, but in which the one wheel is turned round relative to the other by an angle

$$ \varphi = \frac{\omega^* v\, l^*}{c^2 - v^2} $$

($l^*$ denoting the contracted distance between the wheels), as can be seen easily by subjecting the coordinates of the points of the original configuration to a transformation A (see also 182). Considering, however, the mechanism of the acceleration, we have to conclude that, provided the wheels are rotating freely on the axle, no phase shift arises. At any time $t$ during the acceleration the two wheels have the angular velocities

$$ \omega_1(t) = \omega_2(t) = \omega\sqrt{1 - v(t)^2/c^2}, $$

and as the angular velocities are all the time equal no phase shift can develop.

199. The state of affairs is quite different if we suppose the wheels to be fixed on the common axle. As follows from the adiabatic principle, the rotating axle when accelerated shows a tendency to twist; the axle thus influences the motion of the rotating wheels and makes them shift by the amount $\varphi$.

To see the process more clearly, we note that the axle has to overcome the inertia of the rotating wheels and in the case of symmetry it will turn the one wheel by an angle —
200. Because of the importance of the question we give another example of the behaviour of an unconnected system under the influence of acceleration. Consider two equal masses $m$ situated at a distance $l$ from each other on the x-axis of the system of coordinates. At the time $t = 0$ they have coordinates $x^{(1)} = 0$, $x^{(2)} = l$. Subjecting them to equal outside forces acting along the positive x-axis, the masses will obtain at the time $t$ velocities

$$ v^{(1)}(t) = v^{(2)}(t) = v(t) $$

and the x coordinates will thus be given by

$$ x^{(1)}(t) = x^{(2)}(t) - l = \int_0^t v(t')\,dt'. $$

Thus because of the perfect symmetry of the treatment of the two masses they also react similarly and we expect

$$ x^{(2)}(t) - x^{(1)}(t) = l \quad \text{for any value of } t. $$
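The argument can be checked with a small numerical sketch. It assumes units with c = 1 and a hypothetical velocity history `velocity(t)` shared by both masses (neither is specified in the text); it merely integrates the two orbits and compares the unchanged separation with the contracted length a connected rod would eventually take up.

```python
import math

C = 1.0  # speed of light; units with c = 1 are an assumption of this sketch


def positions_after_equal_forces(l, velocity, t_end, steps=10_000):
    """Integrate x1(t) and x2(t) for two free masses sharing one v(t).

    `velocity` is a hypothetical callable t -> v(t); both masses get the
    same velocity history because the outside forces are taken to be equal.
    """
    dt = t_end / steps
    x1, x2 = 0.0, l
    for i in range(steps):
        v = velocity((i + 0.5) * dt)  # midpoint rule
        x1 += v * dt
        x2 += v * dt
    return x1, x2


def contracted_length(l, v):
    """Length a connected rod of rest length l would settle to at velocity v."""
    return l * math.sqrt(1.0 - (v / C) ** 2)


l = 1.0
v_final = 0.6 * C
x1, x2 = positions_after_equal_forces(l, lambda t: v_final * t / 10.0, 10.0)
separation = x2 - x1

print("separation of the unconnected masses:", separation)
print("contracted length of a connected rod:", contracted_length(l, v_final))
```

The separation stays at l (up to rounding), whereas the equilibrium length of a connected system moving with the same final velocity would be smaller.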

The system of the two masses thus accelerated will not contract. If, however, we connect the two masses by an elastic string, then the string (being a connected system) will try to contract when it is set moving in the longitudinal direction. Thus the string produces a tendency for the system to contract. If the string is strong enough and if there is sufficient time at their disposal, then the masses will eventually be pulled closer to each other and their distance will eventually be equal to $l\sqrt{1 - v^2/c^2}$, where $v$ is the final velocity of the system. However, just as in the example with the two wheels, the string, if it is not strong enough, might also break or deform. In the latter cases the Lorentz contraction does not come about, and the acceleration which produced the permanent deformation of the string must be taken to exceed the measure of the adiabatic acceleration. It becomes clear from the above consideration that the limit as to what acceleration acts adiabatically is determined by the physical properties of the accelerated systems.

2. SIGNIFICANCE OF SUBGROUPS OF THE LORENTZ GROUP

201. We have seen that with the help of the Lorentz principle the results of many experiments can be interpreted. The question arises whether it follows from the actual experimental evidence that the laws of nature are invariant with respect to the whole of the Lorentz group. The question is not trivial, and it can be stated straight away that the invariance of the laws of nature can only be ascertained with respect to the proper Lorentz group, i.e. with respect to transformations with matrices $A_p$ for which

$$ v < c, \tag{21a} $$

$$ B > 0, \tag{21b} $$

$$ \det A_p = +1. \tag{21c} $$

From the last relation it follows automatically

$$ \theta = 1 \tag{21d} $$

(see 156). The transformations with matrices obeying (21) form a subgroup of the whole of the Lorentz group; (21a) must be supposed so as to obtain transformations with real elements; (21b) contains a real physical assertion, i.e. a transformation with $B < 0$ would define a system Q* in which the inner motions are reversed in time relative to those of Q. There is evidence to the effect that not all types of physical motions occur also in the time reversed form; we come back to this question in 263. A transformation with $B > 0$ but $\det A = -1$ would produce a Q* which is a mirror image of Q. There exists also evidence that the mirror image of a possible physical system is not always a possible system. The relation (21d) refers to the trivial fact that, changing the measures of a system by a constant factor $\theta \neq 1$, in general we do not get a possible system.

202. We note that although we have to restrict the operators $A_p$ giving possible Lorentz deformations to the proper group of Lorentz transformations obeying (21), we may use for coordinate transformations matrices $A_q$ from the whole of the Lorentz group. Indeed, writing

$$ A_p^{(q)} = A_q A_p A_q^{-1}, $$

we see that $A_p^{(q)}$ will be an element of the proper Lorentz group if $A_p$ is a member of it, irrespective of whether or not the transformation matrix $A_q$ belongs to the proper Lorentz group.
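The last statement can be illustrated numerically. The following is only a sketch under simplifying assumptions: one space dimension, units with c = 1, and 2x2 matrices acting on (x, t); the helper names are ours and not part of the text. It conjugates a proper boost with improper transformations (space reflection, time reversal, and a reflection combined with a boost) and tests the conditions B > 0 and det = +1 on the result.

```python
import math

C = 1.0  # units with c = 1 (assumption of this sketch)


def boost(v):
    """2x2 Lorentz matrix on (x, t): x' = B (x + v t), t' = B (t + v x / c^2)."""
    B = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return [[B, B * v], [B * v / C**2, B]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def inverse(a):
    d = det(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]


def is_proper(a, tol=1e-12):
    """Time-time element B > 0 and determinant +1."""
    return a[1][1] > 0 and abs(det(a) - 1.0) < tol


SPACE_REFLECTION = [[-1.0, 0.0], [0.0, 1.0]]
TIME_REVERSAL = [[1.0, 0.0], [0.0, -1.0]]

A_p = boost(0.6)  # a proper deformation matrix
results = []
for A_q in (SPACE_REFLECTION, TIME_REVERSAL,
            matmul(SPACE_REFLECTION, boost(-0.3))):
    conjugated = matmul(matmul(A_q, A_p), inverse(A_q))
    results.append((is_proper(A_q), is_proper(conjugated)))

print(results)  # each pair: (A_q proper?, A_q A_p A_q^-1 proper?)
```

In each case the coordinate transformation itself is improper while the conjugated deformation matrix remains proper.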

CHAPTER VI

THE INNER CONSISTENCY OF THE LORENTZ PRINCIPLE

A. KINEMATICAL CONSIDERATIONS IN CONNECTION WITH THE LORENTZ PRINCIPLE

203. In the present chapter we shall give a number of kinematic considerations mainly to show the logical consistency of the Lorentz principle. Such considerations are rendered necessary because there can be found in the literature a number of alleged "paradoxes" in connection with the theory of relativity. We shall show that no real paradoxes exist if we treat the phenomena consistently.

1. ADDITION OF VELOCITIES

204. One of the sources of misconceptions is connected with the problem of addition of velocities; we give a more detailed analysis of the problem. Consider two small physical systems A and B. In a representation K their respective coordinate vectors should be

$$ \mathbf{r}_A(t) = \mathbf{v}_A t + \mathbf{a}, \qquad \mathbf{r}_B(t) = \mathbf{v}_B t + \mathbf{b}, $$

with constant vectors $\mathbf{a}$ and $\mathbf{b}$. Thus the velocities of A and B can be written

$$ \frac{d\mathbf{r}_A}{dt} = \mathbf{v}_A, \qquad \frac{d\mathbf{r}_B}{dt} = \mathbf{v}_B. $$

The distance vector pointing from A to B at the time $t$ is given by

$$ \mathbf{r}_{AB}(t) = \mathbf{r}_B(t) - \mathbf{r}_A(t) $$

and therefore the rate of increase of the distance vector can be written

$$ \frac{d\mathbf{r}_{AB}}{dt} = \mathbf{v}_B - \mathbf{v}_A = \mathbf{v}_{AB}. \tag{1} $$

The vector $\mathbf{v}_{AB}$ can be taken by definition as the relative velocity of B with respect to A (in a representation K). This is a mere definition, and therefore whether or not $\mathbf{v}_{AB}$ is "really" the relative velocity of B with respect to A cannot be proved or disproved by experiment. From (1) it follows also

$$ \mathbf{v}_B = \mathbf{v}_A + \mathbf{v}_{AB}, \tag{2} $$

i.e. the velocity $\mathbf{v}_B$ of B is equal to the velocity $\mathbf{v}_A$ of A with the relative velocity $\mathbf{v}_{AB}$ of B relative to A added to it. The above so-called "classical law of the addition of velocities" has been unnecessarily discredited. As was pointed out above, the validity of (2) is based on the definition (1) of the concept of relative velocity and therefore (2) can neither be proved nor disproved by experiment. It should be added that (1) is a very useful definition. The relative velocity $\mathbf{v}_{AB}$ introduced by (1) gives the rate of increase of the distance vector $\mathbf{r}_{AB}$ with time with the help of a factor; the change of the distance between A and B can thus be conveniently calculated.

205. Apart from the relation (1) another definition of the relative velocity can also be introduced, as we show presently. Consider a point P which moves with a velocity

$$ \mathbf{w} = \frac{d\mathbf{r}_P}{dt} $$

relative to a system of reference K. The velocity of P can be expressed relative to a system K' which moves with a velocity $\mathbf{v}$ relative to K. In the measures of K' we have

$$ \frac{d\mathbf{r}'_P}{dt'} = \mathbf{w}', $$

where $\mathbf{w}'$ expresses the velocity of P relative to K' in measures of K'. The transformation of coordinates between K and K' can be written

$$ \mathbf{r} = L\mathbf{r}' - c^2\mathbf{U}t', \qquad t = -\mathbf{U}\mathbf{r}' + Bt' \tag{3} $$

with

$$ L = 1 + \frac{\mathbf{v}\circ\mathbf{v}}{v^2}(B - 1), \qquad \mathbf{u} = -\mathbf{v}B, \qquad \mathbf{U} = -\mathbf{v}B/c^2. \tag{4} $$

Inserting the coordinate vectors of P into (3) and differentiating, we find

$$ \mathbf{w} = \frac{L\mathbf{w}' - c^2\mathbf{U}}{-\mathbf{U}\mathbf{w}' + B}. \tag{5} $$

Inserting (4) into (5) we have further

$$ \mathbf{w} = \frac{\mathbf{v} + \mathbf{w}'_1 + \mathbf{w}'_2/B}{1 + \mathbf{v}\mathbf{w}'/c^2}, \tag{6} $$

where

$$ \mathbf{w}' = \mathbf{w}'_1 + \mathbf{w}'_2, \qquad \mathbf{w}'_1 = \mathbf{v}(\mathbf{v}\mathbf{w}')/v^2; $$

thus $\mathbf{w}'_1$ and $\mathbf{w}'_2$ are the components of $\mathbf{w}'$ parallel and perpendicular to $\mathbf{v}$ respectively. In particular, if $\mathbf{v}$ and $\mathbf{w}'$ are parallel we have in place of (6)

$$ w = \frac{v + w'}{1 + vw'/c^2}. \tag{7} $$

We note that (7) can be written in the following symmetric form:

$$ \frac{c - w}{c + w} = \frac{c - v}{c + v}\cdot\frac{c - w'}{c + w'} \tag{8} $$

and alsó c

2

206. Relation (6) is Einstein's addition formula for velocities. Einstein's addition formula differs from the ordinary addition (2) because it gives the explicit expression for the composition of two velocities $\mathbf{v}$ and $\mathbf{w}'$ which are taken in different representations. Indeed $\mathbf{v}$ is taken in measures of K while $\mathbf{w}'$ is taken in those of K'. The velocity of P relative to O', the origin of K', can also be expressed in measures of K. Since

$$ \mathbf{r}_{O'} = \mathbf{v}t + \text{constant}, $$

one may write

$$ \mathbf{w}_{PO'} = \mathbf{w} - \mathbf{v}. $$

Thus in the case of parallel velocities we find from (7)

$$ w = \frac{w' + v}{1 + vw'/c^2}; \qquad w_{PO'} = w - v = \frac{w'(1 - v^2/c^2)}{1 + vw'/c^2}. $$
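For a numerical illustration of the addition formula we give a short sketch. It assumes units with c = 1 and treats velocities as plain tuples; the function names are ours. It evaluates the vector formula (6), the parallel case (7), and checks the symmetric form (8).

```python
import math

C = 1.0  # units with c = 1 (assumption of this sketch)


def add_velocities(v, w_prime):
    """Vector form (6): v measured in K, w_prime measured in K'."""
    v2 = sum(x * x for x in v)
    if v2 == 0.0:
        return tuple(w_prime)
    dot = sum(a * b for a, b in zip(v, w_prime))
    B = 1.0 / math.sqrt(1.0 - v2 / C**2)
    w_par = tuple(x * dot / v2 for x in v)                  # parallel to v
    w_perp = tuple(a - b for a, b in zip(w_prime, w_par))   # perpendicular to v
    denom = 1.0 + dot / C**2
    return tuple((vi + p + q / B) / denom
                 for vi, p, q in zip(v, w_par, w_perp))


def add_parallel(v, w_prime):
    """Parallel case (7)."""
    return (v + w_prime) / (1.0 + v * w_prime / C**2)


def ratio(u):
    """The combination (c - u) / (c + u) appearing in (8)."""
    return (C - u) / (C + u)


v, w_prime = 0.6, 0.7
w = add_parallel(v, w_prime)
print("parallel sum:", w)
print("symmetric form:", ratio(w), "=", ratio(v) * ratio(w_prime))
print("perpendicular case:", add_velocities((0.6, 0.0, 0.0), (0.0, 0.7, 0.0)))
```

The parallel sum stays below c, and the two sides of the symmetric form agree to rounding accuracy.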

2. ADDITION FORMULA AND LORENTZ DEFORMATIONS

207. The addition formula (6) has also another significance. Consider a physical system and its representation Q relative to K. Suppose Q contains a point N which moves relative to K with a velocity

$$ \frac{d\mathbf{r}_N}{dt} = \mathbf{w}. $$

Suppose e.g. Q to be a solid at rest relative to K and suppose N to be a point of a wave front S moving across Q (see Fig. 18).

Fig. 18. Lorentz deformation and law of addition of velocities

If the system Q is accelerated so as to move as a whole with a velocity $\mathbf{v}$ relative to K, then it changes into a Lorentz deformed configuration Q*. In particular the wave front S changes into a front S*, the point N going over into a point N* with coordinates

$$ \mathbf{r}^*_N(t^*) = L\mathbf{r}_N(t) + \mathbf{u}t, \qquad t^* = \mathbf{U}\mathbf{r}_N(t) + Bt. $$

Thus we find

$$ \mathbf{w}^* = \frac{d\mathbf{r}^*_N(t^*)}{dt^*} = \frac{L\mathbf{w} + \mathbf{u}}{\mathbf{U}\mathbf{w} + B}, $$

and thus in accord with 205

$$ \mathbf{w}^* = \frac{\mathbf{v} + \mathbf{w}_1 + \mathbf{w}_2/B}{1 + \mathbf{v}\mathbf{w}/c^2}, \tag{9} $$

where $\mathbf{w}_1$ and $\mathbf{w}_2$ are the components of $\mathbf{w}$ parallel and perpendicular to $\mathbf{v}$ respectively. We see that, because of the Lorentz deformation the system Q suffers when accelerated, the deformed front S* appearing in the deformed system Q* will move with a velocity $\mathbf{w}^* \neq \mathbf{v} + \mathbf{w}$. The velocity of the front S* relative to Q* can be taken as $\mathbf{w}^* - \mathbf{v}$, which differs from $\mathbf{w}$ by terms of the order of $v^2/c^2$.

If $\mathbf{v}$ and $\mathbf{w}$ are parallel we obtain instead of (9) the particular relation

$$ w^* = \frac{v + w}{1 + vw/c^2}. \tag{10} $$
208. An important application of (10) is obtained by considering the experiment of Fizeau. The system Q can be taken to be a liquid at rest relative to K. Electromagnetic wave fronts move inside Q with a velocity

$$ w = \frac{c}{n} $$

as measured in the representation K. Considering a liquid Q* which moves with a velocity $v$ relative to K, the wave fronts S which moved with the velocity $c/n$ in the direction of $v$ inside Q correspond to fronts S* in Q* which move relative to K with a velocity

$$ w^* = \frac{v + c/n}{1 + v/(cn)} = \frac{c}{n} + v\left(1 - \frac{1}{n^2}\right) + \text{terms of higher order}. \tag{11} $$

Equation (11) corresponds to the relation (11) of 32 where we described the results of Fizeau. We note that we did not obtain (11) by "adding" the velocities $v$ and $w$. We obtain (11) by supposing that the processes giving rise to the propagation of phase surfaces with a velocity $w = c/n$ in the system Q are Lorentz invariant. If so, then the processes which lead to a velocity $c/n$ in the system Q at rest lead to a velocity $w^*$ in the moving system Q*. The circumstance that $w^* \neq v + w$ is connected with the fact that the atoms of Q and of Q* have different velocities relative to the ether and therefore the waves in Q and Q* build up in a somewhat different manner. We shall show in 302 that the relation (11) can also be obtained by considering the behaviour of the atoms in Q and Q*, making use of purely classical concepts without reference to the addition formula (10) or to the Lorentz principle.
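A short numerical sketch may make the size of the effect concrete. The refractive index and flow speed below are illustrative values chosen by us, not data from the text. The code compares the velocity given by the parallel addition formula with the first-order expression c/n + v(1 - 1/n^2) and with the naive sum v + c/n.

```python
C = 299_792_458.0  # speed of light in m/s


def front_velocity_in_moving_liquid(n, v):
    """Parallel addition formula applied to w = c/n and the flow velocity v."""
    w = C / n
    return (v + w) / (1.0 + v * w / C**2)


def first_order_expression(n, v):
    """c/n + v (1 - 1/n^2), i.e. the formula without higher-order terms."""
    return C / n + v * (1.0 - 1.0 / n**2)


n = 1.333  # illustrative refractive index (roughly that of water)
v = 7.0    # illustrative flow speed in m/s

exact = front_velocity_in_moving_liquid(n, v)
approx = first_order_expression(n, v)
naive = v + C / n

print("drag from addition formula:", exact - C / n, "m/s")
print("drag to first order       :", approx - C / n, "m/s")
print("naive sum would give      :", naive - C / n, "m/s")
```

For laboratory speeds the first two numbers are indistinguishable, while the naive sum overestimates the change of the front velocity by the factor 1/(1 - 1/n^2).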

B. CONSIDERATIONS ABOUT CONTRACTION OF SOLIDS AND THE SLOWING DOWN OF CLOCKS

209. We are led to suppose from various considerations that a solid rod contracts when accelerated relative to the ether; similarly a clock, when it is made to move with some velocity relative to the ether, slows down its rate.

Superficially, the above effects might appear suitable for determining the velocity of an object relative to the ether. Indeed, suppose the Earth moves with a velocity $v$ relative to the ether. A rod lying in the direction of $v$, and at rest relative to the Earth, is contracted by a factor $\sqrt{1 - v^2/c^2}$ as compared to the configuration it would take up if it were at rest. If the rod is made to move with a velocity $w$ relative to the Earth and if $w$ is parallel to $v$, then the velocity of the rod relative to the ether is increased and its length should diminish further. If on the other hand the rod is accelerated so as to move with a velocity $w$ relative to the Earth in the direction opposite to $v$, then the velocity of the rod relative to the ether is diminished; consequently some of the original contraction is relaxed and the length of the rod increases.

Thus giving to a rod velocities in various directions and mapping the changes of length it suffers, one might suppose that both the absolute value of $v$ and its direction could be determined. Similarly, making a clock move in various directions relative to the Earth, one might expect to be able to determine $v$ from the changes the rate of the clock suffers. There is no doubt that the above effects exist, i.e. a rod does indeed change its length and a clock its rate when made to move relative to the Earth. Nevertheless the real observation of moving rods or of moving clocks does not permit us to determine the velocity $v$ of the Earth, as can be seen from the analyses we give presently.

210. Let us consider two clocks A, B. Suppose at first the clocks to be close to each other and also to be at rest relative to each other. In this preliminary state the clocks can be synchronized and afterwards accelerated adiabatically so as to move with specified velocities. Let us consider the motion of the clocks with respect to a system of reference $K_0$ which is at rest relative to the ether. (We shall see presently that it is unimportant for our considerations whether or not we can pick out the system $K_0$ among the various inertial systems which have translational motions relative to the ether.)

211. Suppose the clocks move along the x-axis of $K_0$; they thus have coordinates at the time $t$ given by

$$ x_A(t) = a - vt, \qquad x_B(t) = b + wt, \tag{12} $$

where $a$, $b$ are constants. If the clock A gives signals at intervals $T_A$, these signals will start at times

$$ t_k = t_0 + kT_A, \tag{13} $$

and the point of the k-th signal which moves along the x-axis has at a time $t > t_k$ an x-coordinate $x_k(t)$. We obtain with the help of (12)

$$ x_k(t) = x_A(t_k) + c(t - t_k) = a - (c + v)t_k + ct. \tag{14} $$

The signals reach B at times $t'_k$ so that

$$ x_k(t'_k) = x_B(t'_k), $$

thus we have, considering (12) and (14),

$$ a - (c + v)t_k + ct'_k = b + wt'_k $$

and

$$ t'_k = \frac{b - a + (c + v)t_k}{c - w}. \tag{15} $$

We may write

$$ t'_k = t'_0 + kT'_A, \tag{16} $$

where $T'_A$ is the interval between subsequent signals arriving from A at B; we may also call $T'_A$ the rate of A as seen from B. We find from (16), (15) and (13)

$$ T'_A = t'_{k+1} - t'_k = \frac{c + v}{c - w}\,T_A. $$

Comparing the rate of the clock B with that of A as seen from B we can determine the ratio

$$ Q = \frac{T_B}{T'_A} = \frac{c - w}{c + v}\cdot\frac{T_B}{T_A}. \tag{17} $$

From the hypothesis about the slowing down of clocks we take

$$ T_A = \frac{T_0}{\sqrt{1 - v^2/c^2}}, \qquad T_B = \frac{T_0}{\sqrt{1 - w^2/c^2}}, \tag{18} $$

where $T_0$ is the period of either of the clocks if at rest relative to the ether. Introducing (18) into (17) we find

$$ Q = \sqrt{\frac{c - v}{c + v}\cdot\frac{c - w}{c + w}}. \tag{19} $$

Making use of relation (8) of 205 we can also write

$$ Q = \sqrt{\frac{c - V}{c + V}} \qquad \text{with} \qquad V = \frac{v + w}{1 + vw/c^2}, $$

or

$$ \frac{V}{c} = \frac{1 - Q^2}{1 + Q^2}. \tag{20} $$
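The conclusion that only the Einstein sum of the two velocities enters the observed ratio can be checked numerically. The sketch below assumes units with c = 1 and illustrative values for the positions and velocities; the function names are ours. It follows the emission and arrival times of two consecutive signals and computes the ratio of the period of B to the observed period of A for two different pairs of velocities having the same Einstein sum.

```python
import math

C = 1.0  # units with c = 1 (assumption of this sketch)


def einstein_sum(v, w):
    return (v + w) / (1.0 + v * w / C**2)


def observed_ratio(v, w, T0=1.0, a=0.0, b=10.0, t0=0.0):
    """Ratio of the period of B to the period of A as seen from B.

    A moves as x = a - v t and B as x = b + w t; both periods are
    lengthened by the slowing-down factor relative to the rest period T0.
    """
    T_A = T0 / math.sqrt(1.0 - v**2 / C**2)
    T_B = T0 / math.sqrt(1.0 - w**2 / C**2)
    emitted = [t0 + k * T_A for k in range(2)]
    received = [(b - a + (C + v) * t) / (C - w) for t in emitted]
    return T_B / (received[1] - received[0])


def partner_velocity(V, v_new):
    """The w for which einstein_sum(v_new, w) equals V."""
    return (V - v_new) / (1.0 - V * v_new / C**2)


v, w = 0.3, 0.5
V = einstein_sum(v, w)
v_alt = 0.1
w_alt = partner_velocity(V, v_alt)

q = observed_ratio(v, w)
q_alt = observed_ratio(v_alt, w_alt)
q_from_V = math.sqrt((C - V) / (C + V))

print("Q for (v, w)        :", q)
print("Q for another split :", q_alt)
print("Q from V alone      :", q_from_V)
```

All three numbers coincide, so a measurement of this ratio cannot separate the two velocities.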

212. Observing the rate of A from the position of B we can only determine the ratio $Q$, defined by (17), of the rate of the clock in B and the rate which A appears to have when seen from B. According to (20), from the observed value of $Q$ we cannot determine the velocities $v$ and $w$ separately but only their combination $V$ formed according to Einstein's addition formula. The observed ratio $Q$ is compatible with the assumption that A moves with a velocity $v'$ while B moves with a velocity $w'$, where

$$ \frac{v' + w'}{1 + v'w'/c^2} = c\,\frac{1 - Q^2}{1 + Q^2}. \tag{21} $$

Any pair of solutions $v'$, $w'$ of (21) corresponds to an alternative consistent hypothesis as to the velocities of A and B relative to the ether.

213. We may also measure the rate of the clock B as seen from A; we may thus measure the ratio

$$ Q' = \frac{T_A}{T'_B}. $$

Since, however, relation (19) is symmetric with respect to $v$ and $w$ we expect that

$$ Q = Q'. \tag{22} $$

Since $Q$ and $Q'$ are obtained as the results of independent measurements, the relation (22) is a check upon our assumption. If (22) is satisfied by measurement, i.e. if we find by measurement that

$$ \frac{T_B}{T'_A} = \frac{T_A}{T'_B}, $$

then the latter experimental result supports the hypothesis (18) as to the change of rates of clocks; the check also supports the assumption that there exists a system $K_0$ relative to which light is propagated isotropically.

214. Relation (19) gives one equation for the two unknown quantities $v$ and $w$. If one could obtain another mathematically independent relation between $v$ and $w$, then one could determine $v$ and $w$ from the system of equations thus obtained. However, no such equation can be found. In an attempt to obtain a further relation containing $v$ and $w$ one can determine the velocity of A relative to B by using the radar method. To determine the relative velocity between A and B the observer near A may emit light signals at times $t = 0$ and $t = T$ towards B. Suppose the signals are reflected from B; they will arrive back at A at the times $t = t_1$ and $t = t_2$ respectively.

From simple geometry (see Fig. 19) we find

$$ t_1 = \frac{2(b - a)c}{(c - v)(c - w)}, \qquad t_2 = \frac{2(b - a)c + (c + v)(c + w)T}{(c - v)(c - w)} $$

and therefore we find

$$ \frac{t_2 - t_1}{T} = \frac{(c + v)(c + w)}{(c - v)(c - w)} = \frac{1}{Q^2}. \tag{23} $$
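The radar measurement can be followed step by step in a small sketch. It assumes units with c = 1 and illustrative values for the positions, velocities and emission interval; the function name is ours. Each signal is traced from A to B and back, and the ratio of the return interval to the emission interval is compared with the combination of v and w that follows from the two return times given above.

```python
C = 1.0  # units with c = 1 (assumption of this sketch)


def radar_return_time(t_emit, v, w, a=0.0, b=10.0):
    """Return time at A of a signal emitted by A at t_emit and reflected by B.

    A moves as x = a - v t and B as x = b + w t.
    """
    x_emit = a - v * t_emit
    # Outgoing signal x = x_emit + c (t - t_emit) meets B.
    t_reflect = (b - x_emit + C * t_emit) / (C - w)
    x_reflect = b + w * t_reflect
    # Returning signal x = x_reflect - c (t - t_reflect) meets A.
    return (x_reflect - a + C * t_reflect) / (C - v)


v, w, T = 0.3, 0.5, 2.0
t1 = radar_return_time(0.0, v, w)
t2 = radar_return_time(T, v, w)

interval_ratio = (t2 - t1) / T
expected = (C + v) * (C + w) / ((C - v) * (C - w))

print("t1, t2              :", t1, t2)
print("(t2 - t1) / T       :", interval_ratio)
print("(c+v)(c+w)/(c-v)(c-w):", expected)
```

The interval ratio again depends on v and w only through the symmetric combination that also fixes the observed ratio of the clock rates.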

Fig. 19. x — t diagram illustrating the determination of relative velocity between two physical systems

Comparing (17) and (23) we see that both measurements (that in which we compare the rates of the clocks A and B, and that in which we measure the relative velocity between A and B) lead to the determination of the same quantity $Q$, which is a function of Einstein's sum $V$ of the velocities $v$ and $w$. Thus the two types of measurement described in 211 and 214 respectively give merely a check of the consistency of our assumptions, in so far as we find out whether the two methods do indeed lead to the same numerical value for $Q$ and thus for $V$. The comparison of the results does not help, however, to determine the values of $v$ and $w$ separately.

215. Carrying out similar measurements by emitting signals from B towards A one finds a corresponding quantity. Because $v$ and $w$ appear in a symmetric form in relation (23), one is led to expect that it has the same value as the quantity obtained from the signals emitted by A. This equality provides a further check upon our hypothesis, but does not provide further information as to the values of $v$ and $w$. We thus see that by carrying out various observations with moving clocks we obtain checks of our assumptions. In particular we can check the law of the relativistic slowing down of clocks. We are, however, unable to determine from the measured results the velocities of the clocks relative to the ether.

216. A similar consideration can be carried out for the comparison of the lengths of two moving rods with the help of light signals sent to and fro. The result is similar to that found for the comparison of the clocks. Considering two rods moving with velocities $v$ and $-w$, we can attempt to measure the length of the one rod with the help of radar signals sent from an instrument moving with the other rod. Analysing the results of such measurements we are led to still another determination of $V$, provided we suppose that the rods contract when set moving relative to the ether by factors $\sqrt{1 - v^2/c^2}$ and $\sqrt{1 - w^2/c^2}$ respectively. The above experiment thus provides still another test of the consistency of our hypothesis; it does not provide, however, additional information as to the values of $v$ and $w$.

217. The above considerations partly repeat the arguments of 44 in connection with the Doppler effect. The present considerations go further than the former in showing that we may add to the observation of the Doppler frequency still other observations, and even this further information is not sufficient to determine the individual values of $v$ and $w$.
1. THE CLOCK "PARADOX"

218. We describe a group of problems which is sometimes referred to as the "clock paradox". We shall see that no paradox is really involved. Consider somé closed physical system, with periodic internál motion, which system can be regarded as a clock. We may thus consider, for example, an oscillating atom, an oscillating crystal, or a solid spinning with constant angular velocity. The rate of internál motion of such a clock can be characterized by somé angular velocity to. We can take as the reading of the clock a quantity t

J

o

to dt

=

T,

where t is the measure of time in a point near the clock relative to a system of reference K; T is the reading of the particular clock and we take a clock to have constant rate if dT 2

n

=

-aW

°-

219. If the clock as a whole is subjected to adiabatic acceleration, then according to the Lorentz principle its rate is reduced. When the clock is moving with a velocity v its rate becomes co V,/l

- v /c 2

2

If co is constant the reading of the clock at the time / will then be i

t

T* = j co* dt = Y J* y/l - v(t') /c 2

0

2

dt',

(24)

0

where v(t') is the velocity of the clock at the time t'. From (24) we see that T* < T. If the velocity v(t) is all the time much smairer than c, then we may develop in powers of v\i)lc and we find from (24) when neglecting terms of fourth and higher order 2

AT=T-T*

=

2c

2

where v = 2

j^v(t'fdt'

is the average of the square velocity. Thus the rate of loss of the moving clock is proportional to the average square velocity on its journey. We may denote AT the loss of phase of the clock. That a clock suffers indeed a loss of phase when moving with a high velocity can be seen experimentally when observing the áecay of fast ^-mesons (see in particular 47). 220. Consider two clocks A, B such that A is moving with a constant velocity v while B is moving with a varying velocity "Át) = v + w(f).

The losses of phase sufíered by A and B, due to their motions, can be expressed as cov t 2

J r

*

I?/ o

( 0 =

( v +

w)2
i

co

cot —x

C

= 47^(0 + - g - T J

W

í

/

—w . 2

Í

+

0

Fig. 20. Slowing down of clocks along different orbits

We may write i

f w dt = t (t)-T {t) B

=

A

T {t). AB

0

Thus we find also co cot —7. = -^ vr (t) + ^ w.

AT -AT B

2

A

AB

221. In particular, if A moves all the time with a constant velocity $\mathbf{v}$ while B moves non-uniformly so that A and B start at $t = 0$ from the same point P and meet again at, say, $t = t_1$ at a point Q (see Fig. 20), then we have

$$ \mathbf{r}_{AB}(t_1) = 0 $$

and

$$ (\Delta T_B - \Delta T_A)_{t = t_1} = \frac{\omega t_1}{2c^2}\,\overline{w^2}, \tag{25} $$

thus

$$ \Delta T_B > \Delta T_A. $$

Thus comparing the readings of the clocks A and B at two subsequent meetings, we find that the clock B, which moved non-uniformly, suffers a greater loss of phase than the clock A, which moves with a constant velocity $\mathbf{v}$.

222. Considering the two clocks starting at

$$ \mathbf{r}_A(0) = \mathbf{r}_B(0) = 0 $$

and proceeding until their distance is $\mathbf{r}_{AB}(t) = \mathbf{r}$, we have

$$ \Delta T_B - \Delta T_A = \frac{\omega\,\mathbf{v}\mathbf{r}}{c^2} + \frac{\omega t}{2c^2}\,\overline{w^2}. \tag{26} $$

Supposing that B leaves A moving along a straight line with practically constant velocity, we may put

$$ \overline{w^2} = \frac{r^2}{t^2}, $$

thus

$$ \frac{\omega t}{2c^2}\,\overline{w^2} = \frac{\omega r^2}{2c^2 t} \to 0 \quad \text{if } t \to \infty. $$

We thus see that, provided the velocity $w$ with which B moves away from A is sufficiently small, the second term on the right of (26) can be neglected and we have in a good approximation

$$ \Delta T_B - \Delta T_A \approx \frac{\omega\,\mathbf{v}\mathbf{r}}{c^2}. \tag{27} $$

In particular if $\mathbf{v} = 0$, i.e. if A is at rest while B is moving very slowly away, then we find

$$ \Delta T_A = 0, \qquad \Delta T_B = 0 \quad \text{apart from small terms}. $$

Thus a clock moved about sufficiently slowly in $K_0$ suffers no noticeable phase shift even if it is moved over great distances.

223. We note, however, that if $\mathbf{v}\mathbf{r} \neq 0$ then the clock B suffers a phase shift however slowly it is moved across $\mathbf{r}$. The clock A moving with a velocity $\mathbf{v}$ can be taken to rest in the origin of a Lorentz system K. The clock B is moved slowly across K; from (27) we see that (unless B is moved perpendicular to $\mathbf{v}$) it accumulates a phase shift relative to A which is proportional to the distance covered. It can be seen easily that the phase shift thus suffered by B is equal to the difference in phase of the clocks in K adjusted with the help of light signals. Thus a clock moving very slowly through K will take up phases which, at any point of its journey, are equal to the phase of the local clock it just encounters.

We see therefore that a clock moved about very slowly in $K_0$ retains its phase. A clock moved about slowly relative to a moving system of reference K suffers phase shifts which make it appear to have a constant phase when compared with the clocks adjusted by signals of light in K. Thus we cannot distinguish between the system $K_0$ and a system K with the help of a clock which is moved about slowly.

224. Returning to (25), we see that having two clocks A and B starting from a point at $t = 0$ and meeting later at a time $t = t_1$, we expect the clock B to lose phase relative to A. What is sometimes regarded as a paradox is that, if B is accelerated relative to A, then this process might also appear as if A was accelerated relative to B. It is argued that the state of affairs is quite symmetric, and that it is therefore a paradox that in spite of the symmetry one clock should definitely lose relative to the other. However, the above argument is quite incorrect; the clock A remains at rest all the time relative to a Lorentz system and therefore has a constant velocity relative to any Lorentz system. The clock B on the other hand, in the course of its journey, is changing its velocity relative to any Lorentz system, and therefore the behaviour of the clocks is different from the physical point of view. Equation (25) shows that the non-uniformly moving clock loses more than the uniformly moving one.
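The statement that the non-uniformly moving clock loses more can be verified by direct numerical integration of the slowed rate. The sketch below assumes units with c = 1, motion along one line, and a simple illustrative journey (B moves with v + w0 during the first half of the time and with v - w0 during the second half, so that it meets A again); these choices and the function name are ours. The extra loss of B is compared with the low-velocity expression omega * t1 * mean(w^2) / (2 c^2).

```python
import math

C = 1.0  # units with c = 1 (assumption of this sketch)


def clock_reading(omega, speed, t_end, steps=10_000):
    """Integrate omega * sqrt(1 - v(t')^2 / c^2) over 0..t_end (midpoint rule).

    `speed` is a hypothetical callable t -> v(t) giving the clock's velocity.
    """
    dt = t_end / steps
    total = 0.0
    for i in range(steps):
        v = speed((i + 0.5) * dt)
        total += math.sqrt(1.0 - (v / C) ** 2) * dt
    return omega * total


omega, t1 = 1.0, 100.0
v = 0.01    # constant velocity of clock A
w0 = 0.02   # relative speed of B: +w0 on the way out, -w0 on the way back

reading_A = clock_reading(omega, lambda t: v, t1)
reading_B = clock_reading(omega, lambda t: v + w0 if t < t1 / 2 else v - w0, t1)

extra_loss = reading_A - reading_B             # additional loss of phase of B
predicted = omega * t1 * w0**2 / (2 * C**2)    # low-velocity expression

print("extra loss of B (integrated):", extra_loss)
print("extra loss of B (predicted) :", predicted)
```

The two values agree up to terms of fourth order in the velocities, and the extra loss is positive whatever the value of v.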

2. THE "PARADOX OF THE TWINS"

225. The "clock paradox" is made more striking by applying it to human beings. It is argued that, considering a pair of twins, if one of them takes part in a fast journey through space and returns again to Earth, then he should be younger than his brother who stayed on the Earth all the time. When making the above statement one considers a human being as a "clock" and supposes that his internal motion is slowed down while he is on his journey through space. The rate of internal motion is of course supposed to become normal again once he has returned. When discussing this statement, we have to emphasize that the whole question is a purely speculative one, at least for the time being. The space rockets which have been dispatched so far had velocities of the order of 10 km/sec, thus the slowing down effects are of the order of $1 : 10^9$. In a day's journey the slowing down of the clocks could produce a loss of the order of

$$ \Delta T \sim 10^{-4}\ \text{sec}. $$

The latter loss is negligible when considering human beings.
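The orders of magnitude quoted above are easily reproduced. The following sketch uses the leading-order expression v^2 / (2 c^2) for the relative slowing down, with the rocket velocity of 10 km/sec from the text and a journey of one day; the function name is ours.

```python
C = 299_792.458  # speed of light in km/s


def fractional_slowing(v_km_s):
    """Leading-order relative loss of rate, v^2 / (2 c^2), valid for v << c."""
    return v_km_s**2 / (2.0 * C**2)


rate_loss = fractional_slowing(10.0)      # rocket at 10 km/s
loss_per_day = rate_loss * 86_400.0       # seconds lost in one day

print("relative slowing down:", rate_loss)
print("loss in one day (s)  :", loss_per_day)
```

The relative effect comes out at a few parts in ten thousand million, and the loss accumulated in a day at a few hundredths of a millisecond.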

226. The question arises whether a human being, or for that matter a biological process, can or cannot be regarded as a clock in the sense used in the previous paragraphs. Remembering that all the biological and physiological processes which take place in a human being are based on complex interactions between atoms, it is not absurd to suppose that the resulting processes are Lorentz invariant like the simple atomic phenomena. If we accept such a hypothesis then we can suppose that a human being placed into the rocket and subjected to outside acceleration will indeed slow down. His nerves, his muscles, the rate of beat of his heart may thus all slow down; the rate at which the system uses oxygen can be reduced as the various chemical processes are slowed down. Thus the organism behaves as if under the influence of a drug and slows down as a whole. In such a "drugged" state it is conceivable that the organism ages more slowly than it would when it develops its usual activity. The problem can be compared with that of certain bacteria which, if cooled down sufficiently, stop their inner activity almost completely and are revived as soon as their temperature is raised again. These bacteria do not age while their temperature is low. Thus the "twin paradox" has to be understood in terms of a slowing down of the whole of the organism by an outside influence. It must be emphasized, however, that the assertion that biological processes are exactly Lorentz invariant contains a far-reaching extrapolation. Whether this extrapolation leads to correct results could only be decided by future experiments.

CHAPTER VII

RELATIVISTIC MECHANICS

A. MOMENTUM AND ENERGY

227. The Lorentz deformation as formulated in 176 is suitable to describe the deformed configuration Q* of a physical system Q provided we describe Q as the conglomeration of a number of moving points $P_k$. Indeed, if we give the orbits of the points in a parameter representation in the form

$$ x_k(p) = \mathbf{r}_k(p),\ t_k(p), \qquad k = 1, 2, \ldots $$

then the orbits of the points ty* of the deformed system are given by xt(p) = A x (p) 9

k

+\

k =1,2,...

where $A_p$ is the matrix of a proper Lorentz transformation. Describing a physical system not merely in terms of the orbits of its points, but using quantities like energy, momentum, charge, mass, etc., we have to extend our formalism so as to be able to give also the transforms of quantities other than four-coordinates.

1. NEWTON'S FIRST LAW

228. We call a small closed physical system a particle; the orbit of a particle can thus be described by a four-coordinate, say

$$ x(p) = \mathbf{r}(p),\ t(p). $$

We may conclude from the Lorentz principle the following: a particle may be permanently at rest relative to a Lorentz system of reference. In this state it is described by

$$ x(p) = \mathbf{r},\ t(p), \qquad \mathbf{r} \text{ independent of } p. \tag{1} $$

The deformed version of this particle is described by a four-coordinate

$$ x^*(p) = \mathbf{r} + \mathbf{v}\,t(p),\ t(p), \tag{2} $$

where $\mathbf{v}$ and $\mathbf{r}$ are constants independent of $p$. From (1) and (2) we conclude that free particles will be found either at rest or moving along straight lines with constant velocities. The latter result gives Newton's first law.

229. The above formulation contains more than Newton's first law. Indeed (2) is taken to be valid in an inertial system of reference. The inertial systems were taken further above to be systems in the measures of which light is propagated homogeneously. Thus we conclude that, relative to a system of reference in which light is propagated along straight lines and with constant velocity, free particles also move along straight lines with constant velocities. Newton's law as formulated by (2) thus gives not only the law of motion of free particles but also a connection between the law of motion of free particles and the mode of propagation of light.

2. ELASTIC COLLISIONS

230. So as to obtain the description of particles in more detail we consider the energy and momentum of a particle. In classical physics we have

$$ \mathbf{p} = m\mathbf{w}, \qquad K = \tfrac{1}{2}mw^2, \tag{3} $$

where $K$ and $\mathbf{p}$ are the energy and momentum of a particle of mass $m$ moving with a velocity $\mathbf{w}$. Considering the elastic collision of two particles, the law of conservation of momentum can be expressed as

$$ \mathbf{p}_1^{(1)} + \mathbf{p}_2^{(1)} = \mathbf{p}_1^{(2)} + \mathbf{p}_2^{(2)}, \tag{4} $$

where

$$ \mathbf{p}_i^{(k)} = m_i^{(k)}\mathbf{w}_i^{(k)}, \qquad i, k = 1, 2; \tag{5} $$

the suffix $i$ taking the values 1 and 2 refers to the two particles, the upper index $k$ refers to the states before and after the collision. In the classical approximation we can suppose that the masses of the particles do not change in the collisions, thus we may suppose

$$ m_i^{(1)} = m_i^{(2)} = m_i, \qquad i = 1, 2. $$

The law of conservation of energy can be written similarly as

$$ K_1^{(1)} + K_2^{(1)} = K_1^{(2)} + K_2^{(2)} \tag{6} $$

with

$$ K_i^{(k)} = \tfrac{1}{2}m_i^{(k)}\bigl(w_i^{(k)}\bigr)^2, \qquad i, k = 1, 2. $$

231. Applying the Lorentz principle to the collision we expect that the Lorentz deform of the elastic collision described by (4) and (6) should describe another collision which also obeys the laws of conservation of energy and momentum. We investigate how the expressions (3) have to be modified so as to make (4) and (6) Lorentz invariant.

Considering a Lorentz deformation with a deformation matrix $A$ we can calculate the velocities the particles take up in the deformed version. Applying the results of 207 we find

$$\mathbf{w}_i^{(k)*} = \mathbf{w}_i^{(k)} \oplus \mathbf{v}, \qquad i, k = 1, 2, \tag{7}$$

where we write $\oplus$ for the addition according to Einstein's addition formula in 207 eq. (9). So as to simplify notation we shall use the upper index k = 1, 2 for the states of the particles before and after the collision and denote by upper indices k = 3, 4 the quantities referring to the state of the particles before and after the collision in the deformed configuration. Thus we use the notation

$$\mathbf{w}_i^{(k)*} = \mathbf{w}_i^{(k+2)}, \qquad m_i^{(k)*} = m_i^{(k+2)}, \qquad \mathbf{p}_i^{(k)*} = \mathbf{p}_i^{(k+2)}, \qquad i, k = 1, 2. \tag{8}$$

Considering in particular collisions in which the velocities $\mathbf{w}_i^{(k)}$ and $\mathbf{v}$ are all parallel, we can write in place of (7), using the notation (8),

$$w_i^{(k+2)} = \frac{w_i^{(k)} + v}{1 + v w_i^{(k)}/c^2}. \tag{9}$$

From (9) it follows also that

$$B_i^{(k+2)} = B_i^{(k)}\, \frac{1 + v w_i^{(k)}/c^2}{\sqrt{1 - v^2/c^2}}, \tag{10}$$

where

$$B_i^{(k)} = \frac{1}{\sqrt{1 - \big(w_i^{(k)}\big)^2/c^2}}, \qquad i = 1, 2, \quad k = 1, 2, 3, 4.$$
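Relation (10) can be checked numerically for parallel velocities. The following is a minimal sketch, not part of the original derivation: the function names and the choice of units with c = 1 are assumptions made for this illustration only.

```python
import math

C = 1.0  # speed of light; units with c = 1 are an assumption of this sketch


def add_velocities(w, v, c=C):
    """Einstein addition of two parallel velocities, as in eq. (9)."""
    return (w + v) / (1.0 + v * w / c**2)


def lorentz_factor(w, c=C):
    """B = 1 / sqrt(1 - w^2/c^2), the factor appearing in eq. (10)."""
    return 1.0 / math.sqrt(1.0 - (w / c) ** 2)


def check_identity(w, v, c=C):
    """Return the left- and right-hand sides of eq. (10)."""
    lhs = lorentz_factor(add_velocities(w, v, c), c)
    rhs = lorentz_factor(w, c) * (1.0 + v * w / c**2) / math.sqrt(1.0 - (v / c) ** 2)
    return lhs, rhs


if __name__ == "__main__":
    for w, v in [(0.3, 0.5), (-0.6, 0.8), (0.9, 0.9)]:
        lhs, rhs = check_identity(w, v)
        print(f"w={w:+.1f}  v={v:+.1f}  lhs={lhs:.10f}  rhs={rhs:.10f}")
```

The two printed columns should agree to within rounding for any pair of velocities smaller than c in magnitude.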

232. The law of conservation of momentum for the deformed system can be written

$$p_1^{(3)} + p_2^{(3)} = p_1^{(4)} + p_2^{(4)}. \tag{11}$$

Introducing (9) and (10) into (5), using the notation (8), we find

$$p_i^{(k+2)} = m_i^{(k+2)} w_i^{(k+2)} = m_i^{(k+2)}\, \frac{w_i^{(k)} + v}{1 + v w_i^{(k)}/c^2} = \frac{m_i^{(k+2)} B_i^{(k)}}{B_i^{(k+2)}}\, \frac{w_i^{(k)} + v}{\sqrt{1 - v^2/c^2}}. \tag{12}$$

Introducing (12) into (11) and writing $p_i^{(k)} = m_i^{(k)} w_i^{(k)}$, we find after multiplication by $\sqrt{1 - v^2/c^2}$ a relation of the following form:

$$\big(p_1^{(1)} + v m_1^{(1)}\big)\alpha_1^{(1)} + \big(p_2^{(1)} + v m_2^{(1)}\big)\alpha_2^{(1)} = \big(p_1^{(2)} + v m_1^{(2)}\big)\alpha_1^{(2)} + \big(p_2^{(2)} + v m_2^{(2)}\big)\alpha_2^{(2)} \tag{13}$$

with

$$\alpha_i^{(k)} = \frac{m_i^{(k+2)} B_i^{(k)}}{m_i^{(k)} B_i^{(k+2)}}. \tag{14}$$

The law of conservation of momentum, which is expressed for the original pair of particles by relation (4), appears in the form (13) for the transformed pair. The law (4) is Lorentz invariant provided (13) is necessarily fulfilled whenever (4) is fulfilled. We investigate the circumstances under which (4) and (13) follow from each other. The terms of (13) not containing $v$ reduce to (4) provided the $\alpha_i^{(k)}$ all have the same value. We can suppose

$$\alpha_i^{(k)} = 1, \qquad i = 1, 2, \quad k = 1, 2, \tag{15}$$

and find from (14) and (15)

$$m_i^{(k)} = m_i^{(0)} B_i^{(k)} = \frac{m_i^{(0)}}{\sqrt{1 - \big(w_i^{(k)}\big)^2/c^2}}, \qquad i = 1, 2, \quad k = 1, 2, 3, 4, \tag{16}$$

where $m_1^{(0)}$ and $m_2^{(0)}$ are constants. The terms of (13) containing $v$ cancel provided (16) holds and also the relation

$$m_1^{(1)} + m_2^{(1)} = m_1^{(2)} + m_2^{(2)} \tag{17}$$

is satisfied. As the result of a short calculation we find that it follows from (4), (16) and (17) that

$$m_1^{(3)} + m_2^{(3)} = m_1^{(4)} + m_2^{(4)}.$$
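The statement that the momentum and mass balances survive the deformation once the mass is taken from (16) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: units with c = 1, a hypothetical head-on collision constructed so that the total momentum vanishes and both velocities are simply reversed, and helper names that do not occur in the text. For comparison it also evaluates the classical momentum of (3), which is not conserved in the deformed configuration.

```python
import math

C = 1.0  # units with c = 1 (assumption of this sketch)


def gamma(w):
    """Factor B = 1 / sqrt(1 - w^2/c^2)."""
    return 1.0 / math.sqrt(1.0 - (w / C) ** 2)


def add(w, v):
    """Einstein addition of parallel velocities, eq. (9)."""
    return (w + v) / (1.0 + v * w / C**2)


def rel_mass(m0, w):
    """Velocity-dependent mass of eq. (16)."""
    return m0 * gamma(w)


def rel_momentum(m0, w):
    """Momentum m w with the mass taken from eq. (16)."""
    return rel_mass(m0, w) * w


def classical_momentum(m0, w):
    """Classical momentum of eq. (3), with a velocity-independent mass."""
    return m0 * w


def head_on_collision(m1, m2, u1):
    """Hypothetical linear elastic collision: zero total momentum, velocities reversed."""
    p1 = rel_momentum(m1, u1)
    u2 = -p1 / math.sqrt(m2**2 + (p1 / C) ** 2)  # velocity giving particle 2 the momentum -p1
    return [(u1, u2), (-u1, -u2)]                 # states k = 1 (before) and k = 2 (after)


def deform(states, v):
    """Velocities in the Lorentz deformed configuration, states k = 3, 4."""
    return [tuple(add(w, v) for w in pair) for pair in states]


def totals(states, rest_masses, quantity):
    """Sum of quantity(m0, w) over both particles, before and after the collision."""
    return [sum(quantity(m0, w) for m0, w in zip(rest_masses, pair)) for pair in states]


if __name__ == "__main__":
    rest_masses = (1.0, 3.0)
    deformed = deform(head_on_collision(rest_masses[0], rest_masses[1], 0.6), 0.5)
    for name, q in [("relativistic momentum", rel_momentum),
                    ("relativistic mass", rel_mass),
                    ("classical momentum", classical_momentum)]:
        before, after = totals(deformed, rest_masses, q)
        print(f"{name:22s} before={before:.6f}  after={after:.6f}")
```

In the output the first two rows should show equal totals before and after the collision, while the classical momentum differs.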

233. Summarizing our considerations we see that in the case of a linear elastic collision it follows from the relation

$$p_1^{(1)} + p_2^{(1)} = p_1^{(2)} + p_2^{(2)} \tag{18}$$

together with the relation

$$m_1^{(1)} + m_2^{(1)} = m_1^{(2)} + m_2^{(2)} \tag{19}$$

that the corresponding relations referring to the Lorentz deformed configurations are also valid; i.e. it follows from (18) and (19) that

$$p_1^{(3)} + p_2^{(3)} = p_1^{(4)} + p_2^{(4)}$$

and

$$m_1^{(3)} + m_2^{(3)} = m_1^{(4)} + m_2^{(4)},$$

where the transformation from the original system of particles with k = 1, 2 to the deformed system with k = 3, 4 has to be taken in accord with (16). In particular we conclude that the momentum of a particle with rest mass $m_0$ moving with a velocity $\mathbf{w}$ has to be taken as

$$\mathbf{p} = \frac{m_0 \mathbf{w}}{\sqrt{1 - w^2/c^2}}. \tag{20}$$

The latter formula can be interpreted by writing

$$\mathbf{p} = m(w)\,\mathbf{w}, \qquad m(w) = \frac{m_0}{\sqrt{1 - w^2/c^2}}. \tag{21}$$

234. We see that the law of conservation of momentum (18) can be formulated in a Lorentz invariant manner if, and only if, we add to it the relation (19). The relations (18) and (19) together appear Lorentz invariant if the transformation of mass and momentum is taken to be in accord with (21). The classical law of conservation of energy expressed by (6) is not Lorentz invariant; we see immediately from our formalism that in general

$$K_1^{(3)} + K_2^{(3)} \neq K_1^{(4)} + K_2^{(4)}$$

even if (6) is satisfied in the original configuration. The classical law of conservation of energy can, however, be replaced by the invariant relation (19). For small values of the velocities the latter relation reduces in a good approximation to the classical energy expression. Indeed, multiplying (19) by $c^2$ we can write

$$E_1^{(1)} + E_2^{(1)} = E_1^{(2)} + E_2^{(2)} \tag{22}$$

with

$$E_i^{(k)} = \frac{m_i^{(0)} c^2}{\sqrt{1 - \big(w_i^{(k)}\big)^2/c^2}} = m_i^{(0)} c^2 + \tfrac{1}{2} m_i^{(0)} \big(w_i^{(k)}\big)^2 + \text{terms of higher order}.$$

Thus we can take

$$E_i^{(k)} \approx E_i^{(0)} + K_i^{(k)},$$

where $E_i^{(0)} = m_i^{(0)} c^2$ is the rest energy of a particle and $E_i^{(k)} - E_i^{(0)}$ can be taken as the kinetic energy. We see thus that the relativistic energy of a particle can be defined by

$$E = \frac{m_0 c^2}{\sqrt{1 - w^2/c^2}}, \tag{23}$$

where $m_0$ is the rest mass and $w$ the velocity of the particle, and we find that the relativistic energy of particles is exactly conserved in elastic collisions.

235. We have obtained relations (20) and (23) as sufficient conditions for the invariant formulation of the conservation laws. Regarding, however, the independent parameters involved in the consideration, one can see easily that for given values $m_i^{(0)}$ these conditions are also necessary. Thus, supposing that the momentum vector has a direction parallel to $\mathbf{w}$, energy and momentum necessarily have the dependence upon $w$ given in (20) and (23). (In the energy expressions there remains of course always the possibility of adding arbitrary constant values to the expressions $E_i$.)

3. INELASTIC COLLISIONS

236. The conservation laws can be formulated in an invariant manner only if both the laws of conservation of energy and of momentum are taken together. That the energy and momentum relations are indeed connected can be seen from the following example. Consider an inelastic collision of the following type. Two particles of equal rest masses $m_1^{(0)} = m_2^{(0)} = m_0$ move towards each other with velocities $w_1^{(1)} = w$ and $w_2^{(1)} = -w$. Suppose the particles collide, stick together after having touched each other, and come to rest in the end. We suppose thus

$$w_1^{(1)} = -w_2^{(1)} = w, \qquad w_1^{(2)} = w_2^{(2)} = 0,$$

and we have

$$p_1^{(1)} = -p_2^{(1)} = \frac{m_0 w}{\sqrt{1 - w^2/c^2}}, \qquad p_1^{(2)} = p_2^{(2)} = 0,$$

and

$$p_1^{(1)} + p_2^{(1)} = p_1^{(2)} + p_2^{(2)}.$$

Transforming the collision with a velocity $v = w$ we find

$$w_1^{(3)} = \frac{2w}{1 + w^2/c^2}, \qquad w_2^{(3)} = 0; \qquad w_1^{(4)} = w_2^{(4)} = w;$$

the momenta of the transformed system are found to be

$$p_1^{(3)} = \frac{2 m_0 w}{1 - w^2/c^2}, \qquad p_2^{(3)} = 0, \qquad p_1^{(4)} = p_2^{(4)} = \frac{m_0 w}{\sqrt{1 - w^2/c^2}},$$

thus we find

$$p_1^{(3)} + p_2^{(3)} = \frac{p_1^{(4)} + p_2^{(4)}}{\sqrt{1 - w^2/c^2}}; \tag{24}$$

thus for $w > 0$ the momentum relation in the transformed system is not fulfilled. The reason for this discrepancy is that the energy relation is not fulfilled in the original version of the collision; indeed we have

$$E_1^{(1)} = E_2^{(1)} = \frac{m_0 c^2}{\sqrt{1 - w^2/c^2}}, \qquad E_1^{(2)} = E_2^{(2)} = m_0 c^2.$$
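The discrepancy in the example above can be reproduced with a few lines of arithmetic. The sketch below assumes units with c = 1 and uses helper names introduced only for this illustration; it keeps the rest mass of each particle equal to m0 after the collision, which is exactly the supposition that leads to the unbalanced momenta.

```python
import math

C = 1.0  # units with c = 1 (assumption of this sketch)


def gamma(w):
    return 1.0 / math.sqrt(1.0 - (w / C) ** 2)


def add(w, v):
    """Einstein addition of parallel velocities, eq. (9)."""
    return (w + v) / (1.0 + v * w / C**2)


def momentum(m0, w):
    """Relativistic momentum, eq. (20)."""
    return m0 * gamma(w) * w


def energy(m0, w):
    """Relativistic energy, eq. (23)."""
    return m0 * gamma(w) * C**2


def transformed_momenta(m0, w):
    """Total momentum before and after the sticking collision, deformed with v = w."""
    v = w
    before = (add(w, v), add(-w, v))      # velocities of the two particles before sticking
    after = (add(0.0, v), add(0.0, v))    # both particles move with v after sticking
    return (sum(momentum(m0, u) for u in before),
            sum(momentum(m0, u) for u in after))


if __name__ == "__main__":
    m0, w = 1.0, 0.6
    p_before, p_after = transformed_momenta(m0, w)
    print("momentum before:", p_before)
    print("momentum after :", p_after)
    print("ratio          :", p_before / p_after, " 1/sqrt(1 - w^2/c^2):", gamma(w))
    print("energy before  :", 2 * energy(m0, w), " energy after:", 2 * energy(m0, 0.0))
```

The ratio of the two momentum sums equals the factor 1/sqrt(1 - w²/c²), and the same factor separates the energies before and after the collision in the original configuration.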

B. EQUIVALENCE OF MASS AND ENERGY

237. The inelastic collision can, however, be treated in a Lorentz invariant manner if we suppose with Einstein the principle of the equivalence of mass and energy. According to this principle a system which has a total energy $E$ possesses a mass

$$m = E/c^2. \tag{25}$$

If the inelastically colliding particles stick together, then their kinetic energy is transformed into some other type of energy, e.g. the particles may heat up. We have thus

$$E_1^{(1)} + E_2^{(1)} = E_1^{(2)} + E_2^{(2)} + Q = \frac{2 m_0 c^2}{\sqrt{1 - w^2/c^2}}, \tag{26}$$

where $Q$ is the energy of the heat produced in the collision. With the help of (25) and (26) we find thus for the effective mass $2m_0'$ of the complex of the two particles which have stuck together

$$2 m_0' = \frac{E_1^{(1)} + E_2^{(1)}}{c^2} = \frac{2 m_0}{\sqrt{1 - w^2/c^2}},$$

and thus we have in the place of (24)

$$p_1^{(3)} + p_2^{(3)} = p_1^{(4)} + p_2^{(4)},$$

where

$$p_1^{(4)} = p_2^{(4)} = \frac{m_0' w}{\sqrt{1 - w^2/c^2}} \qquad \text{and} \qquad m_0' = \frac{m_0}{\sqrt{1 - w^2/c^2}}.$$

Thus the conservation of momentum is fulfilled in the transformed collision if we take into consideration the increase in mass caused by the heating up of the particles.

1. REMARK ON THE MECHANISM OF INCREASE OF MASS WITH ENERGY

238. The increase of mass with energy if a macroscopic body is supplied with heat can be understood immediately. The heat increases the average thermal velocity of the atoms of the body and thus the increase of mass of the body is caused by the changes of the masses of the molecules due to the increased thermal motion. Since the collisions between individual atoms can be taken to be elastic ones, an inelastic collision between two macroscopic bodies resulting in heating up can be taken alternatively as a system of elastic collisions between the atoms of the colliding bodies, and thus it becomes evident that the collision as a whole will be in accord with the relativistic laws valid for elastic collisions. The problem is somewhat more involved if the macroscopic collision produces not simply heat, but also elastic or inelastic deformations. The energy stored in these deformations corresponds also to a change of mass in accord with (25); the mechanism of this increase will be dealt with further below.

C. DISTANT COLLISIONS

239. Two particles acting upon each other at a distance influence mutually their motion. The problem of the conservation laws in this case is somewhat more complicated.

Schematically a collision at a distance can be pictured by supposing that the changes of momentum of the colliding particles occur suddenly. We may thus suppose that (as shown in Fig. 21) the particle 1 is deflected in the point A at a time $t_0$ and there changes its momentum from a value $\mathbf{p}_1^{(1)}$ to $\mathbf{p}_1^{(2)}$. At the same instant the particle 2 changes its momentum $\mathbf{p}_2^{(1)}$ into $\mathbf{p}_2^{(2)}$ at a point B at some distance from A. If we have

$$\mathbf{p}_1^{(1)} + \mathbf{p}_2^{(1)} = \mathbf{p}_1^{(2)} + \mathbf{p}_2^{(2)},$$

then the collision can be regarded as an elastic one.

Fig. 21. Scheme of a distant collision

Considering the Lorentz deformed version of the collision we have also

$$\mathbf{p}_1^{(3)} + \mathbf{p}_2^{(3)} = \mathbf{p}_1^{(4)} + \mathbf{p}_2^{(4)}.$$

However, transforming the motion of the particles we find that in the transformed system the changes of momenta do not take place simultaneously. The momentum of the one particle changes at a point A* at some time $t = t_1^*$, while the change of momentum of the second particle takes place at a point B* at a time $t = t_2^*$, and in general

$$t_1^* \neq t_2^*.$$

If, say, $t_1^* < t_2^*$, then in the interval

$$t_1^* < t < t_2^*$$

the momentum is not conserved, since the momentum of the first particle has already changed while that of the second particle has still remained unchanged. If we consider instead of a sudden collision a continuous interaction at a distance, then it is to be expected that, even if the momentum balance is restored after the interaction has ceased, the sum of momenta of the two particles does not remain constant in the course of the interaction. Or, even if it were constant in a particular configuration, it will not be constant in other Lorentz deformed configurations.

The energy and momentum of the whole system can, however, be taken to be conserved if we consider that the interaction between distant particles is always transmitted by a radiation field. As will be seen in more detail, the total energy and momentum, consisting of that of the particles and fields, remains strictly constant in the course of the collision.

1. EXPERIMENTAL EVIDENCE

240. The pre-relativistic form of the conservation laws of energy and momentum expressed by relations (4) and (6) was obtained and confirmed by observing particles with small velocities. For particles moving with velocities small compared with that of light, these laws cannot be distinguished from the relativistically invariant laws expressed by the relations (18), (19) and (22). In collisions between fast moving elementary particles the discrepancies between the two sets of laws become very pronounced, and such collisions do not obey the laws expressed by (4) and (6) but rather the relativistic laws expressed by (18), (19) and (22). While the relativistic laws of collisions have been confirmed at least qualitatively by actual experiments, it would be useful to analyse the observational facts more critically than has been done so far, so as to obtain a quantitative confirmation of the relativistic laws of collisions. We note that in some of the relativistic collisions the effects of radiation mentioned in 239 are strongly felt; an important example is the collision of fast electrons with atomic nuclei, in which an appreciable amount of bremsstrahlung is emitted in the course of the collision.

D. MECHANICAL LAWS IN TERMS OF FOUR-VECTORS AND TENSORS

241. The considerations about elastic collisions can be simplified mathematically by making use of four-vectors. The concept of four-vectors and tensors is well known; nevertheless we give in the appendix a short account of vector and tensor analysis, so as to point out certain particular features of the subject which are relevant to the conception developed here. Here we note that the energy and momentum of a particle can be expressed by a (covariant) four-vector

$$\Pi = \mathbf{p},\ -E. \tag{27}$$

The deformed form of the energy-momentum vector can be written

$$\Pi^* = A^{-1}\,\Pi, \tag{28}$$

where $A$ is a Lorentz matrix. (It is immaterial whether we take (27) and (28) in an orthogonal representation or not.) In particular, considering a particle at rest relative to a system of reference K, we can write for its energy-momentum vector in this representation

$$\Pi = 0,\ 0,\ 0,\ -m_0 c^2. \tag{29}$$

Applying a deformation $A$ we find for the deformed system

$$\Pi^* = m\mathbf{v},\ -mc^2 \tag{30}$$

with

$$m = m_0 B = \frac{m_0}{\sqrt{1 - v^2/c^2}}.$$

Thus we see that, supposing the energy and momentum of a particle to form a four-vector, we obtain the same expressions for energy and momentum which we derived from the Lorentz principle by direct calculation in 232.

242. Supposing energy and momentum to form a four-vector, we automatically ensure that the formalism thus obtained is in accord with the conservation laws. Indeed, if we consider the elastic collision between two particles, we may write instead of (18) and (19) of 233

$$\Pi_1^{(1)} + \Pi_2^{(1)} = \Pi_1^{(2)} + \Pi_2^{(2)} \tag{31}$$

with

$$\Pi_i^{(k)} = \mathbf{p}_i^{(k)},\ -E_i^{(k)}, \qquad i, k = 1, 2.$$

Relation (31) contains both the laws of conservation of energy and that of momentum. The relation (31) expresses automatically a Lorentz invariant law; indeed, applying the operator $A^{-1}$ to both sides of (31) we obtain

$$\Pi_1^{(3)} + \Pi_2^{(3)} = \Pi_1^{(4)} + \Pi_2^{(4)},$$

where

$$\Pi_i^{(k+2)} = \Pi_i^{(k)*} = A^{-1}\,\Pi_i^{(k)}, \qquad i, k = 1, 2.$$
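The agreement between the four-vector transformation and the direct calculation can be illustrated for motion along one line. The sketch below is written under explicit assumptions: units with c = 1, the energy-momentum vector stored as a plain pair (p, E) instead of the covariant form (p, -E) used in the text, and the deformation written out as the familiar one-dimensional Lorentz transformation of such a pair. All function names are hypothetical.

```python
import math

C = 1.0  # units with c = 1 (assumption of this sketch)


def gamma(w):
    return 1.0 / math.sqrt(1.0 - (w / C) ** 2)


def add(w, v):
    """Einstein addition of parallel velocities, eq. (9)."""
    return (w + v) / (1.0 + v * w / C**2)


def four_momentum(m0, w):
    """Pair (p, E) built from eqs. (20) and (23)."""
    g = gamma(w)
    return (m0 * g * w, m0 * g * C**2)


def deform(P, v):
    """One-dimensional Lorentz deformation of a pair (p, E) with velocity v."""
    p, E = P
    g = gamma(v)
    return (g * (p + v * E / C**2), g * (E + v * p))


def total(vectors):
    """Component-wise sum of several (p, E) pairs."""
    return tuple(sum(component) for component in zip(*vectors))


if __name__ == "__main__":
    # A particle at rest acquires momentum m v and energy m c^2 with m = m0 B, cf. (30).
    print(deform(four_momentum(1.0, 0.0), 0.5))
    # Deforming the vector agrees with recomputing it from the added velocity.
    print(deform(four_momentum(2.0, 0.3), 0.5), four_momentum(2.0, add(0.3, 0.5)))
```

Because the deformation is linear, a balance of the form (31) that holds for the sums before and after a collision continues to hold for the deformed vectors.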

243. The derivation of the energy and momentum expressions as it was obtained in 233 is more circumstantial than that given here with the help of four-vectors. However, the derivation of 233 differs in principle from that obtained with the help of four-vectors. Indeed, the derivation of 233 shows that the energy and momentum expressions

$$\mathbf{p} = \frac{m_0 \mathbf{w}}{\sqrt{1 - w^2/c^2}}, \qquad E = \frac{m_0 c^2}{\sqrt{1 - w^2/c^2}}$$

are the only possible relativistic generalizations of the non-relativistic laws. The considerations in terms of four-vectors give merely one possible generalization of the non-relativistic law, and it is not obvious from the four-vector formalism whether or not other generalizations are possible. This remark is important because, although the measures of most physical quantities can be expressed in terms of four-vectors or four-tensors, there is no reason why all physical quantities should necessarily be expressed by vectors and tensors. This is particularly important in connection with certain problems of gravitation.

1. NEWTON'S LAWS

244. Newton's law, which can be written

$$\mathbf{F} = \frac{d\mathbf{p}}{dt}, \tag{32}$$

is not an invariant one. Indeed, even if we take $\mathbf{p}$ to be the space part of a four-vector, the derivative of such a vector with respect to time is not invariant. Nevertheless (32) can be taken to be valid in any system of reference. Making use of the transformation properties of $\mathbf{p}$ we can obtain the measure of the force $\mathbf{F}$ in various representations. However, the measures thus obtained are not Lorentz invariant in the sense that they do not form parts of four-vectors.

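245. Introducing

$$\mathbf{p} = \frac{m_0 \mathbf{v}}{\sqrt{1 - v^2/c^2}} \tag{33}$$

into (32) we find

$$\mathbf{F} = \frac{d}{dt}\, \frac{m_0 \mathbf{v}}{\sqrt{1 - v^2/c^2}},$$

or we can write

$$\mathbf{F} = \mathbf{F}_1 + \mathbf{F}_2 \qquad \text{with} \qquad \mathbf{F}_1 = m_1 \dot{\mathbf{v}}_1, \quad \mathbf{F}_2 = m_2 \dot{\mathbf{v}}_2, \tag{34}$$

where $\mathbf{F}_1$ and $\mathbf{F}_2$, respectively $\dot{\mathbf{v}}_1$ and $\dot{\mathbf{v}}_2$, are the components of $\mathbf{F}$, respectively $\dot{\mathbf{v}}$, parallel and perpendicular to $\mathbf{v}$; thus

$$\mathbf{F}_1 = \mathbf{v}(\mathbf{v}\mathbf{F})/v^2, \qquad \mathbf{F}_2 = \mathbf{F} - \mathbf{F}_1, \tag{35}$$

and similarly

$$\dot{\mathbf{v}}_1 = \mathbf{v}(\mathbf{v}\dot{\mathbf{v}})/v^2, \qquad \dot{\mathbf{v}}_2 = \dot{\mathbf{v}} - \dot{\mathbf{v}}_1, \tag{36}$$

finally

$$m_1 = \frac{m_0}{(1 - v^2/c^2)^{3/2}}, \qquad m_2 = \frac{m_0}{(1 - v^2/c^2)^{1/2}}. \tag{37}$$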
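Relations (34) to (37) can be compared with a direct numerical differentiation of the momentum (33). The sketch below is an illustration under stated assumptions: motion in a plane, units with c = 1, a central-difference estimate of the time derivative, and function names chosen only for this example. It requires a non-vanishing velocity, since the decomposition into parallel and perpendicular parts is undefined for v = 0.

```python
import math

C = 1.0  # units with c = 1 (assumption of this sketch)


def momentum(m0, v):
    """Momentum of eq. (33) for a velocity v = (vx, vy)."""
    vx, vy = v
    g = 1.0 / math.sqrt(1.0 - (vx * vx + vy * vy) / C**2)
    return (m0 * g * vx, m0 * g * vy)


def force_by_differentiation(m0, v, a, h=1e-6):
    """Central-difference estimate of dp/dt for velocity v and acceleration a."""
    plus = momentum(m0, (v[0] + a[0] * h, v[1] + a[1] * h))
    minus = momentum(m0, (v[0] - a[0] * h, v[1] - a[1] * h))
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))


def force_from_masses(m0, v, a):
    """Force from eqs. (34)-(37): m1 for the parallel, m2 for the perpendicular part."""
    v2 = v[0] ** 2 + v[1] ** 2                    # must be non-zero
    dot = v[0] * a[0] + v[1] * a[1]
    a_par = (v[0] * dot / v2, v[1] * dot / v2)    # eq. (36), parallel component
    a_perp = (a[0] - a_par[0], a[1] - a_par[1])   # eq. (36), perpendicular component
    root = math.sqrt(1.0 - v2 / C**2)
    m1 = m0 / root**3                             # eq. (37)
    m2 = m0 / root
    return tuple(m1 * ap + m2 * at for ap, at in zip(a_par, a_perp))


if __name__ == "__main__":
    v, a = (0.6, 0.0), (0.1, 0.2)
    print("numerical :", force_by_differentiation(1.0, v, a))
    print("eqs. (34)-(37):", force_from_masses(1.0, v, a))
```

For equal components of acceleration along and across the velocity, the force required along the velocity is larger by the factor 1/(1 - v²/c²).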

We see thus that a mass is accelerated by a given force to a different extent according to whether the force acts parallel or perpendicular to the velocity of motion.

246. Relations which can be expressed in terms of tensors are obtained if we consider the mechanics of continua; a particle, e.g., can be regarded as a small cloud of mass. Supposing thus a mass distribution representing a particle, we can suppose that inside an element of volume $\delta V$ we find $\delta\mathbf{p}$ momentum and $\delta E$ energy. The force acting upon the matter inside $\delta V$ can be written as

$$\delta\mathbf{F} = \mathbf{f}\,\delta V, \tag{38}$$

where $\mathbf{f}$ is the density of force. Dividing the cloud of mass into elements of volume $\delta V$ we can take those elements to move together with the cloud, and thus because of the Lorentz contraction we can take

$$\delta V = \delta V_0 \sqrt{1 - v^2/c^2}, \tag{39}$$

where $\mathbf{v}$ is the velocity of flow of matter contained in $\delta V$. Rewriting (32) we find with the help of (38) and (39)

$$\mathbf{f} = \frac{1}{\delta V_0 \sqrt{1 - v^2/c^2}}\, \frac{d\,\delta\mathbf{p}}{dt}. \tag{40}$$

The latter relation can be taken as the space part of a four-vector relation. Thus the four-force density

$$f = \frac{B}{\delta V_0}\, \frac{d\,\delta\Pi}{dt}, \qquad B = \frac{1}{\sqrt{1 - v^2/c^2}}, \tag{41}$$

forms a four-vector.

2. THE ENERGY-MOMENTUM TENSOR

247. The relation (41) gives the density of force which is needed to maintain the state of motion of a material system with a momentum density $\mathbf{p}(x)$. If the material system possesses internal stress, then the expression (40) must be extended. One finds that the state of material moving and also possessing internal stresses can be expressed by an energy-momentum tensor of the form

$$T = \begin{pmatrix} \sigma & -c^2\mathbf{p} \\ -\mathbf{q} & c^2 u \end{pmatrix}, \tag{42}$$

where $\mathbf{p}$ is the density of momentum, $\mathbf{q}$ the density of the flow of energy, $u$ the density of energy and $\sigma$ the internal stress tensor. Writing

$$\operatorname{Div} T = f \tag{43}$$

we obtain an invariant relation between the tensor $T$ describing the state of the material and the vector $f$ describing the outer forces which are needed to maintain the state described by $T$. Indeed, if we have a state which (in a particular representation) is described by a tensor $T$ with $\sigma = 0$, then (43) reduces exactly to (41). In configurations where $\sigma \neq 0$ we can take $\operatorname{div}\sigma$ to represent the internal force density acting upon the volume element of the medium. In a closed system we have $\operatorname{Div} T = 0$; separating space and time components we can also write

$$\operatorname{div}\sigma + \frac{\partial\mathbf{p}}{\partial t} = 0, \qquad \operatorname{div}\mathbf{q} + \frac{\partial u}{\partial t} = 0;$$

the above relations can be taken as the continuity relations for the flow of momentum and of energy, and thus $\sigma$ can also be taken to represent the density of flow of momentum. In such cases where $T$ is symmetric, such that

$$\mathbf{p} = \mathbf{q}/c^2,$$

the density of flow of energy is (apart from the factor $1/c^2$) numerically equal to the density of momentum. The latter relation describes the inertia of energy.

CHAPTER VIII

THE ELECTROMAGNETIC FIELD

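248. For the understanding of a number of relativistic phenomena it is necessary to make use of Maxwell's theory of the electromagnetic field. We give here a short presentation of the well-known theory, partly to facilitate understanding, but also because we prefer a formulation which deviates a little from the usual formulations and is adapted to some extent to our own points of view.

A. MAXWELL'S EQUATIONS

249. Maxwell's equations are usually written as

$$\begin{aligned}
\operatorname{rot}\mathbf{E} &= -\frac{1}{c}\dot{\mathbf{B}} && \text{(a)}\\
\operatorname{rot}\mathbf{H} &= \frac{1}{c}\dot{\mathbf{D}} + 4\pi\mathbf{i} && \text{(b)}\\
\operatorname{div}\mathbf{D} &= 4\pi\varrho && \text{(c)}\\
\operatorname{div}\mathbf{B} &= 0. && \text{(d)}
\end{aligned} \tag{1}$$

The connection between $\mathbf{E}$, $\mathbf{D}$ and $\mathbf{H}$, $\mathbf{B}$ can be written as

$$\mathbf{D} = \mathbf{E} + 4\pi\mathbf{P}, \qquad \mathbf{B} = \mathbf{H} + 4\pi\mathbf{M},$$

where $\mathbf{P}$ and $\mathbf{M}$ are the electric and magnetic polarizations. The current density $\mathbf{i}$ and charge density $\varrho$ are connected by the continuity equation

$$\operatorname{div}\mathbf{i} + \frac{1}{c}\dot{\varrho} = 0 \tag{2}$$

and we may add the expression for the force density acting upon a current of density $\mathbf{i}$ accompanied by a charge density $\varrho$:

$$\mathbf{f} = \varrho\mathbf{E} + \mathbf{i}\times\mathbf{B}.$$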

1. ANOTHER FORMULATION

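250. We prefer in the following to write Maxwell's equations in a way which, though equivalent to (1a-d), differs a little in form.* We shall thus write

$$\begin{aligned}
\operatorname{rot}\mathbf{E} &= -\frac{1}{c}\dot{\mathbf{B}} && \text{(a)}\\
\operatorname{rot}\mathbf{B} &= \frac{1}{c}\dot{\mathbf{E}} + 4\pi\mathbf{i}_{\mathrm{eff}} && \text{(b)}\\
\operatorname{div}\mathbf{E} &= 4\pi\varrho_{\mathrm{eff}} && \text{(c)}\\
\operatorname{div}\mathbf{B} &= 0 && \text{(d)}
\end{aligned} \tag{3}$$

where we take

$$\mathbf{i}_{\mathrm{eff}} = \mathbf{i} + \operatorname{rot}\mathbf{M} + \frac{1}{c}\dot{\mathbf{P}}, \qquad \varrho_{\mathrm{eff}} = \varrho - \operatorname{div}\mathbf{P}. \tag{4}$$

It can be seen that (3a-d) together with (4) are mathematically equivalent to (1a-d), and it follows also that, provided (2) holds, we have also

$$\operatorname{div}\mathbf{i}_{\mathrm{eff}} + \frac{1}{c}\dot{\varrho}_{\mathrm{eff}} = 0, \tag{5}$$

and we may also write

$$\mathbf{f}_{\mathrm{eff}} = \varrho_{\mathrm{eff}}\mathbf{E} + \mathbf{i}_{\mathrm{eff}}\times\mathbf{B}.$$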
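The equivalence of the two formulations can be verified in a few lines. The following worked check is added as an illustration; it uses only the relations D = E + 4πP and B = H + 4πM of 249 together with the identity div rot M = 0.

```latex
\begin{align*}
\operatorname{rot}\mathbf{H} = \frac{1}{c}\dot{\mathbf{D}} + 4\pi\mathbf{i}
 \;&\Longrightarrow\;
 \operatorname{rot}(\mathbf{B} - 4\pi\mathbf{M})
   = \frac{1}{c}\bigl(\dot{\mathbf{E}} + 4\pi\dot{\mathbf{P}}\bigr) + 4\pi\mathbf{i} \\
 \;&\Longrightarrow\;
 \operatorname{rot}\mathbf{B}
   = \frac{1}{c}\dot{\mathbf{E}}
   + 4\pi\Bigl(\mathbf{i} + \operatorname{rot}\mathbf{M} + \frac{1}{c}\dot{\mathbf{P}}\Bigr)
   = \frac{1}{c}\dot{\mathbf{E}} + 4\pi\mathbf{i}_{\mathrm{eff}}, \\[4pt]
\operatorname{div}\mathbf{D} = 4\pi\varrho
 \;&\Longrightarrow\;
 \operatorname{div}\mathbf{E} = 4\pi\bigl(\varrho - \operatorname{div}\mathbf{P}\bigr)
   = 4\pi\varrho_{\mathrm{eff}}, \\[4pt]
\operatorname{div}\mathbf{i}_{\mathrm{eff}} + \frac{1}{c}\dot{\varrho}_{\mathrm{eff}}
 &= \operatorname{div}\mathbf{i} + \frac{1}{c}\operatorname{div}\dot{\mathbf{P}}
   + \frac{1}{c}\dot{\varrho} - \frac{1}{c}\operatorname{div}\dot{\mathbf{P}}
   = \operatorname{div}\mathbf{i} + \frac{1}{c}\dot{\varrho} = 0 .
\end{align*}
```

The first two lines reproduce (3b) and (3c) with the definitions (4); the last line shows that the continuity relation (5) is a consequence of (2).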

Relations (3a-d) give a set of differential equations for $\mathbf{E}$ and $\mathbf{B}$ only. The current and charge densities $\mathbf{i}_{\mathrm{eff}}$ and $\varrho_{\mathrm{eff}}$ can be taken as the currents and charges flowing through matter together with the currents and charges flowing inside the atoms; $\mathbf{f}_{\mathrm{eff}}$ is the force density including the forces acting upon polarized atoms. Writing down the system of Maxwell's equations in the form of (3a-d), we think that they express the purely electromagnetic properties of the field, and the role of matter is introduced through (4), which gives the currents $\mathbf{i}_{\mathrm{eff}}$ and charges $\varrho_{\mathrm{eff}}$ which depend on the state of matter.

* A more detailed analysis of Maxwell's equations I have given in Acta Phys. Hung., 20, 59, 1966, and Acta Phys. Hung., 20, 67, 1966.

251. $\mathbf{P}$ and $\mathbf{M}$ together with $\varrho$ and $\mathbf{i}$ characterize certain features of the state of matter; they determine the interactions between matter and the electromagnetic field. As outer fields cause polarization, move charges and change current densities, the changes of the quantities $\mathbf{P}$, $\mathbf{M}$, $\varrho$ and $\mathbf{i}$ are effected by the field acting upon matter. The exact mode of the change of these quantities depends not only upon the field, but also upon the particular properties and the state of the matter on which the field acts. In older treatises it was usual to consider the simple assumption to the effect that

$$\mathbf{P} = \kappa\mathbf{E}, \qquad \mathbf{M} = \kappa'\mathbf{B}, \qquad \mathbf{i} = \sigma\mathbf{E}, \tag{6}$$

thus

$$\varepsilon = 1 + 4\pi\kappa, \qquad \mu = 1 + 4\pi\kappa'\mu,$$

where $\varepsilon$ is the dielectric constant and $\mu$ the magnetic permeability. The assumption to the effect that matter can be characterized by the distributions of a dielectric constant, a magnetic permeability and an electric conductivity is a very primitive assumption on the physical nature of matter. We have to drop the simple assumption (6) and consider instead $\mathbf{P}$, $\mathbf{M}$, $\mathbf{i}$ and $\varrho$ as characteristics of the state of matter, the change of which depends partly upon the fields acting on matter but also on the physical state in which the material under consideration is found. We shall deal further below with a particular process for which it is important to replace (6) by more complicated relations.

252. From Maxwell's equations one can derive two relations which can be written as follows:

$$\operatorname{div}\mathfrak{S} + \frac{1}{c}\dot{u} = -\mathbf{E}\,\mathbf{i}_{\mathrm{eff}} \tag{7}$$

with

$$\mathfrak{S} = \frac{1}{4\pi}(\mathbf{E}\times\mathbf{B}), \qquad u = \frac{1}{8\pi}(\mathbf{E}^2 + \mathbf{B}^2).$$

We can take $c\,\mathfrak{S}$ to be the density of the flow of energy, and $u$ to be the density of electromagnetic energy.

Similarly we can form a tensor*

$$T = -\frac{1}{4\pi}(\mathbf{E}\circ\mathbf{E} + \mathbf{B}\circ\mathbf{B}) + \mathbf{1}u, \tag{8}$$

which obeys the relation

$$\operatorname{div}T + \frac{1}{c}\dot{\mathfrak{S}} = -\mathbf{f}_{\mathrm{eff}}. \tag{9}$$

Relations (7) and (9) thus give the distribution and flow of energy and momentum in an electromagnetic field. Indeed, the left-hand expression of (7) gives the continuity equation of the flow of energy; if the right-hand expression is equal to zero, then it describes a state of affairs where the energy flow deposits energy at a rate determined by $\operatorname{div}\mathfrak{S}$ and the deposited energy appears in the form of electromagnetic energy of density $u$. The right-hand side of (7) gives the rate at which energy is transformed from electromagnetic energy into other forms of energy. Similarly relation (9) gives the distribution of the flow of momentum, which is described by Maxwell's tensor $T$; $\mathfrak{S}/c$ can be taken as the density of electromagnetic momentum. The right-hand side of (9) gives the rate at which momentum is transformed into non-electromagnetic form. I have given a detailed discussion of the relations (7) and (9) elsewhere.**
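The energy balance (7) can be illustrated for a field without sources. The sketch below is an assumption-laden example rather than part of the original text: it takes a plane wave travelling along x with E along y and B along z, both equal to cos(kx - ωt) with ω = ck, which satisfies (3a-d) with vanishing effective current and charge, and it evaluates the left-hand side of (7) by central differences. The values chosen for c and k are arbitrary.

```python
import math

C = 3.0        # arbitrary value of c, chosen only to make the factor 1/c visible
K = 2.0        # wave number of the hypothetical plane wave
OMEGA = C * K  # dispersion relation of the wave in vacuum


def field(x, t):
    """Common value of E_y and B_z for the plane wave."""
    return math.cos(K * x - OMEGA * t)


def poynting_x(x, t):
    """x-component of (1/4 pi) E x B; the other components vanish."""
    f = field(x, t)
    return f * f / (4.0 * math.pi)


def energy_density(x, t):
    """u = (E^2 + B^2) / (8 pi)."""
    f = field(x, t)
    return (f * f + f * f) / (8.0 * math.pi)


def residual(x, t, h=1e-5):
    """Left-hand side of (7), div S + (1/c) du/dt, by central differences."""
    div_s = (poynting_x(x + h, t) - poynting_x(x - h, t)) / (2.0 * h)
    du_dt = (energy_density(x, t + h) - energy_density(x, t - h)) / (2.0 * h)
    return div_s + du_dt / C


if __name__ == "__main__":
    for x, t in [(0.3, 0.7), (1.1, 0.2), (-0.4, 1.5)]:
        print(f"x={x:+.1f}  t={t:.1f}  residual={residual(x, t):.2e}")
```

The residuals vanish up to discretization error, as (7) requires when the right-hand side is zero.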

* The expression (8) giving Maxwell's tensor differs from the form in which it is usually given. We prefer the form (8) for two reasons: 1) The tensor (8) contains only $\mathbf{E}$ and $\mathbf{B}$ and is symmetric; in the usual form the tensor contains also $\mathbf{D}$ and $\mathbf{H}$ and is unsymmetric. The tensor in the form (8) expresses the purely electromagnetic momentum densities, while if $\mathbf{D}$ and $\mathbf{H}$ are also made use of, then we obtain a tensor which contains also part of the elastic energy and momentum which is produced in the matter upon which the field acts. 2) The sign of $T$ as given in (8) is the opposite to the sign usually used. We prefer this choice of the sign because $T$ as defined by (8) obeys the continuity relation (9), i.e. the components of $T$ can be taken to represent the flow of momentum density. In regions where $\mathbf{f}_{\mathrm{eff}} = 0$, (9) expresses the conservation of electromagnetic momentum. For regions with $\mathbf{f}_{\mathrm{eff}} \neq 0$, (9) expresses the fact that the electromagnetic momentum increases or decreases but the increase or decrease is balanced by the change of mechanical momentum; the latter change appears in the form of a force acting upon matter.
** L. Jánossy: Acta Phys. Hung., 20, 67-79, 1966.

B. SOLUTIONS OF MAXWELL'S EQUATIONS

253. With the help of equations (3a) and (3b) the distributions $\mathbf{E}(\mathbf{r}, t)$, $\mathbf{B}(\mathbf{r}, t)$ can be calculated for any value of $t$, provided the current and charge densities $\mathbf{i}_{\mathrm{eff}}(\mathbf{r}, t)$ and $\varrho_{\mathrm{eff}}(\mathbf{r}, t)$ are known and provided we impose an initial condition of the form

$$\mathbf{E}(\mathbf{r}, 0) = \mathbf{E}_0(\mathbf{r}), \qquad \mathbf{B}(\mathbf{r}, 0) = \mathbf{B}_0(\mathbf{r}) \tag{10}$$

upon the field strength. Relations (3c) and (3d) play the role of supplementary conditions. It can be seen easily that the latter conditions, if fulfilled at $t = 0$, remain fulfilled at any other time, provided the currents and charges satisfy (5). Thus (3c) and (3d) impose merely a restriction on the initial conditions.

254. Maxwell's equations can also be expressed in terms of potentials. Writing

$$\mathbf{E} = -\operatorname{grad}\Phi - \frac{1}{c}\dot{\mathbf{A}}, \qquad \mathbf{B} = \operatorname{rot}\mathbf{A}, \tag{11}$$

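equations (3a-d) can be transformed into the so-called wave equations, namely

$$\begin{aligned}
\nabla^2\mathbf{A} - \frac{1}{c^2}\ddot{\mathbf{A}} &= -4\pi\mathbf{i}_{\mathrm{eff}} && \text{(a)}\\
\nabla^2\Phi - \frac{1}{c^2}\ddot{\Phi} &= -4\pi\varrho_{\mathrm{eff}} && \text{(b)}\\
\operatorname{div}\mathbf{A} + \frac{1}{c}\dot{\Phi} &= 0. && \text{(c)}
\end{aligned} \tag{12}$$

Equations (12a-c) are equivalent to equations (3a-d) in the following sense. We may prescribe an initial condition for the potentials in the following form:

$$\mathbf{A}(\mathbf{r}, 0) = \mathbf{A}_0(\mathbf{r}), \qquad \dot{\mathbf{A}}(\mathbf{r}, 0) = \dot{\mathbf{A}}_0(\mathbf{r}), \qquad \Phi(\mathbf{r}, 0) = \Phi_0(\mathbf{r}). \tag{13}$$

Because of (12c) we have

$$\dot{\Phi}(\mathbf{r}, 0) = -c\,\operatorname{div}\mathbf{A}_0(\mathbf{r}) \tag{14}$$

and therefore (13) must not include an arbitrary initial condition for $\dot{\Phi}$. The system (12a) and (12b) together with the initial conditions (13), (14) determines $\mathbf{A}$, $\Phi$ uniquely for any value of $t \neq 0$. Relation (12c) plays the role of a supplementary condition. It can be seen that the latter condition will be fulfilled automatically for any value of $t$ if it is fulfilled for one value of $t$, provided the current and charge densities satisfy the continuity relation (5). The solution of (12) thus obtained and introduced into (11) provides a set of field strengths $\mathbf{E}(\mathbf{r}, t)$, $\mathbf{B}(\mathbf{r}, t)$ which for themselves satisfy (3a-d). The latter field strengths satisfy initial conditions of the form (10), namely

$$\mathbf{E}(\mathbf{r}, 0) = -\operatorname{grad}\Phi_0(\mathbf{r}) - \frac{1}{c}\dot{\mathbf{A}}_0(\mathbf{r}) = \mathbf{E}_0(\mathbf{r}), \qquad \mathbf{B}(\mathbf{r}, 0) = \operatorname{rot}\mathbf{A}_0(\mathbf{r}) = \mathbf{B}_0(\mathbf{r}). \tag{15}$$

The initial conditions (15) fulfil automatically (3c) and (3d) for $t = 0$, as must be required for proper initial conditions. We see thus that solutions of (12a-c) provide with the help of (11) solutions of (3a-d).

1. GAUGE TRANSFORMATION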

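255. So as to see also the inverse of this statement we note that the equations (15) can be reversed. Indeed, relations (15) are satisfied if we put e.g.

$$\Phi_0(\mathbf{r}) = 0, \qquad \dot{\mathbf{A}}_0(\mathbf{r}) = -c\,\mathbf{E}_0(\mathbf{r}), \qquad \mathbf{A}_0(\mathbf{r}) = \frac{1}{4\pi}\int \frac{\operatorname{rot}\mathbf{B}_0(\mathbf{r}')}{|\mathbf{r} - \mathbf{r}'|}\, d^3r'. \tag{15a}$$

From (15a) it follows also

$$\operatorname{div}\mathbf{A}_0(\mathbf{r}) = 0$$

and therefore

$$\dot{\Phi}_0(\mathbf{r}) = 0.$$

Giving therefore the functions $\mathbf{E}_0(\mathbf{r})$ and $\mathbf{B}_0(\mathbf{r})$ [in accord with (3c-d)] we can always construct functions $\mathbf{A}_0(\mathbf{r})$, $\dot{\mathbf{A}}_0(\mathbf{r})$, $\Phi_0(\mathbf{r})$ satisfying (15). The potentials are, however, not determined uniquely by the field strengths: the fields remain unchanged if $\mathbf{A}$ and $\Phi$ are replaced by

$$\mathbf{A}' = \mathbf{A} + \operatorname{grad}\psi, \qquad \Phi' = \Phi - \frac{1}{c}\dot{\psi}, \qquad \nabla^2\psi - \frac{1}{c^2}\ddot{\psi} = 0 \tag{16}$$

for all values of $\mathbf{r}$ and $t$. Indeed, replacing in (11) $\mathbf{A}$ and $\Phi$ by $\mathbf{A}'$ and $\Phi'$, the values of $\mathbf{E}$ and $\mathbf{B}$ will not be affected. The transformation (16) is the so-called gauge transformation, and we see that the solutions of Maxwell's equations can be expressed by potentials differing in gauge.

256. Because of the ambiguity of gauge it is often suggested that the potentials $\mathbf{A}$ and $\Phi$ do not represent real physical quantities but are only convenient mathematical expressions with the help of which the solutions of Maxwell's equations can be obtained. We think, however, that this is not the case. So as to show why we think that $\mathbf{A}$ and $\Phi$ have a good physical meaning, we investigate in a little more detail the set of wave equations.

2. RETARDED POTENTIALS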

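257. A set of particular solutions of the wave equations (12a-c) in terms of the current and charge densities can be given as follows:

$$\begin{aligned}
\mathbf{A}^{(r)}(\mathbf{r}, t) &= \int \frac{\mathbf{i}_{\mathrm{eff}}(\mathbf{r}', t')}{|\mathbf{r} - \mathbf{r}'|}\, d^3r', && \text{(a)}\\
\Phi^{(r)}(\mathbf{r}, t) &= \int \frac{\varrho_{\mathrm{eff}}(\mathbf{r}', t')}{|\mathbf{r} - \mathbf{r}'|}\, d^3r' && \text{(b)}\\
\text{with}\quad t' &= t - |\mathbf{r} - \mathbf{r}'|/c. && \text{(c)}
\end{aligned} \tag{17}$$

The integrals (17) indeed give solutions of (12a-c), as can be shown by inserting them into (12a-c). The latter integrals give, however, only particular solutions of (12), as the solutions (17) already fix $\mathbf{A}$ and $\Phi$ and therefore cannot be adapted to arbitrary initial conditions.

258. Denote by $\mathbf{A}^{(0)} = \mathbf{A}^{(0)}(\mathbf{r}, t)$ and $\Phi^{(0)} = \Phi^{(0)}(\mathbf{r}, t)$ solutions of the equations

$$\nabla^2\mathbf{A}^{(0)} - \frac{1}{c^2}\ddot{\mathbf{A}}^{(0)} = 0, \qquad \nabla^2\Phi^{(0)} - \frac{1}{c^2}\ddot{\Phi}^{(0)} = 0, \qquad \operatorname{div}\mathbf{A}^{(0)} + \frac{1}{c}\dot{\Phi}^{(0)} = 0. \tag{18}$$

Equations (18) admit of non-trivial solutions.

It can be seen easily that we can always construct solutions of (12) of the form

$$\mathbf{A}(\mathbf{r}, t) = \mathbf{A}^{(r)}(\mathbf{r}, t) + \mathbf{A}^{(0)}(\mathbf{r}, t), \qquad \Phi(\mathbf{r}, t) = \Phi^{(r)}(\mathbf{r}, t) + \Phi^{(0)}(\mathbf{r}, t)$$

so that $\mathbf{A}(\mathbf{r}, t)$ and $\Phi(\mathbf{r}, t)$ provide the solutions of Maxwell's equations (3) with an arbitrary initial condition (10). We note that $\mathbf{A}^{(0)}$ and $\Phi^{(0)}$ [...]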

3. ADVANCED POTENTIALS

259. The wave equations (12a-c) admit also solutions of the form

$$\begin{aligned}
\mathbf{A}^{(a)}(\mathbf{r}, t) &= \int \frac{\mathbf{i}_{\mathrm{eff}}(\mathbf{r}', t'')}{|\mathbf{r} - \mathbf{r}'|}\, d^3r', && \text{(a)}\\
\Phi^{(a)}(\mathbf{r}, t) &= \int \frac{\varrho_{\mathrm{eff}}(\mathbf{r}', t'')}{|\mathbf{r} - \mathbf{r}'|}\, d^3r' && \text{(b)}\\
\text{with}\quad t'' &= t + |\mathbf{r} - \mathbf{r}'|/c. && \text{(c)}
\end{aligned} \tag{19}$$

The latter expressions are called advanced potentials. The kinematic interpretation of (19a-c) is that the charges and currents at times $t'' > t$ determine the potentials at the time $t$ in a point P with the coordinate vector $\mathbf{r}$. Such an "action from the future" has, of course, no physical meaning; some authors are inclined to reject the physical significance of both potentials, not only because of the uncertain gauge but also because of the possibility of expressing fields in terms of advanced potentials.
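The kinematic difference between the retarded and the advanced expressions can be made concrete with a discrete stand-in for the integrals (17b) and (19b). The sketch below is purely illustrative and rests on several assumptions: units with c = 1, point-like sources at fixed positions replacing the continuous charge density, and a hypothetical source strength that is switched on at t = 0. Such an isolated time-dependent source does not by itself satisfy the continuity relation, so the example shows only how the time argument enters.

```python
import math

C = 1.0  # units with c = 1 (assumption of this sketch)


def distance(r, r_src):
    """Euclidean distance |r - r'| between field point and source point."""
    return math.dist(r, r_src)


def retarded_time(t, r, r_src):
    """t' = t - |r - r'|/c, eq. (17c)."""
    return t - distance(r, r_src) / C


def advanced_time(t, r, r_src):
    """t'' = t + |r - r'|/c, eq. (19c)."""
    return t + distance(r, r_src) / C


def switched_on(t):
    """Hypothetical source strength: zero before t = 0, unity afterwards."""
    return 1.0 if t >= 0.0 else 0.0


def potential(t, r, sources, time_rule):
    """Discrete analogue of (17b) or (19b): sum of strength(t*) / |r - r'| over point sources."""
    return sum(strength(time_rule(t, r, r_src)) / distance(r, r_src)
               for r_src, strength in sources)


if __name__ == "__main__":
    sources = [((0.0, 0.0, 0.0), switched_on)]
    r = (3.0, 0.0, 0.0)
    for t in (-4.0, -2.0, 2.0, 4.0):
        print(f"t={t:+.1f}  retarded={potential(t, r, sources, retarded_time):.4f}"
              f"  advanced={potential(t, r, sources, advanced_time):.4f}")
```

At a distance of 3 from the source the retarded expression remains zero until t = 3, whereas the advanced expression is already different from zero at t = -3, that is, before the source has been switched on.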

260. In our opinion the retarded potentials have a good physical meaning while the advanced potentials have no such meaning.* The advanced potentials can be eliminated from the theory in a simple fashion. Let us write $\mathbf{A}^{(r)}\{\mathbf{i}\}$ resp. $\mathbf{A}^{(a)}\{\mathbf{i}\}$ for retarded and advanced potentials derived from a current density $\mathbf{i}$ according to (17) respectively (19). Writing $\mathbf{A}\{\mathbf{i}\}$ for an arbitrary solution of the wave equation corresponding to a current density $\mathbf{i}$, we find, considering the linearity of the equations,

$$\mathbf{A}\{\mathbf{i}\} = \mathbf{A}^{(r)}\{\mathbf{i}'\} + \mathbf{A}^{(a)}\{\mathbf{i}''\}, \qquad \mathbf{i}' + \mathbf{i}'' = \mathbf{i}, \tag{20a}$$

and similarly for the scalar potentials

$$\Phi\{\varrho\} = \Phi^{(r)}\{\varrho'\} + \Phi^{(a)}\{\varrho''\}, \qquad \varrho' + \varrho'' = \varrho. \tag{20b}$$

We can thus split the potentials which correspond to source densities $\mathbf{i}$, $\varrho$ into retarded and advanced parts corresponding to source densities $\mathbf{i}'$, $\varrho'$ respectively $\mathbf{i}''$, $\varrho''$. By splitting $\mathbf{i}$, $\varrho$ suitably we can achieve that (20a) and (20b) give solutions of (12) satisfying arbitrarily given initial conditions. Superficially it may thus appear as if it were necessary to make use of both retarded and advanced potentials so as to obtain solutions of Maxwell's equations with arbitrary initial conditions.

261. We note, however, that an arbitrary solution of the wave equations can also be written

$$\mathbf{A}\{\mathbf{i}\} = \mathbf{A}^{(r)}\{\mathbf{i}\} + \big(\mathbf{A}^{(a)}\{\mathbf{i}_1\} - \mathbf{A}^{(r)}\{\mathbf{i}_1\}\big). \tag{21a}$$

Indeed if we put

$$\mathbf{i}_1 = \mathbf{i}'' = \mathbf{i} - \mathbf{i}', \tag{21b}$$

then (21a, b) become equivalent to (20a). Furthermore we may write

$$\mathbf{A}^{(a)}\{\mathbf{i}_1\} - \mathbf{A}^{(r)}\{\mathbf{i}_1\} = \mathbf{A}^{(0)}\{\mathbf{i}_1\}.$$

$\mathbf{A}^{(0)}\{\mathbf{i}_1\}$ thus defined is a solution of the homogeneous wave equation (18). Thus writing in place of (21a)

$$\mathbf{A}\{\mathbf{i}\} = \mathbf{A}^{(r)}\{\mathbf{i}\} + \mathbf{A}^{(0)}\{\mathbf{i}_1\}, \tag{22a}$$

and similarly for the scalar potential

$$\Phi\{\varrho\} = \Phi^{(r)}\{\varrho\} + \Phi^{(0)}\{\varrho_1\}, \tag{22b}$$

we see that the field described by $\mathbf{A}$ and $\Phi$ can be split either according to (20a, b) into a retarded and an advanced contribution, or according to (22) into a retarded and a homogeneous part. In our opinion the splitting up according to (22) reflects the physical content of the potentials.

* L. Jánossy: Acta Phys. Hung., 20, 59-66, 1966.
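The step from the retarded-plus-advanced splitting to the retarded-plus-homogeneous splitting uses nothing beyond the linearity of the wave equation. The short derivation below is added as a check; it writes i_1 for the part of the current density assigned to the advanced potential.

```latex
\begin{align*}
\mathbf{A}^{(r)}\{\mathbf{i}'\} + \mathbf{A}^{(a)}\{\mathbf{i}''\}
 &= \mathbf{A}^{(r)}\{\mathbf{i} - \mathbf{i}_1\} + \mathbf{A}^{(a)}\{\mathbf{i}_1\}
 && (\mathbf{i}_1 = \mathbf{i}'' = \mathbf{i} - \mathbf{i}') \\
 &= \mathbf{A}^{(r)}\{\mathbf{i}\}
   + \bigl(\mathbf{A}^{(a)}\{\mathbf{i}_1\} - \mathbf{A}^{(r)}\{\mathbf{i}_1\}\bigr), \\[4pt]
\Bigl(\nabla^2 - \frac{1}{c^2}\frac{\partial^2}{\partial t^2}\Bigr)
 \bigl(\mathbf{A}^{(a)}\{\mathbf{i}_1\} - \mathbf{A}^{(r)}\{\mathbf{i}_1\}\bigr)
 &= -4\pi\mathbf{i}_1 + 4\pi\mathbf{i}_1 = 0 .
\end{align*}
```

The first two lines show that the two ways of writing the general solution coincide; the last line shows that the difference of the advanced and retarded potentials of one and the same source obeys the homogeneous wave equation.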

4. W A N D E R I N G WAVES

262. It can be concluded therefore that any soiution of Maxwell's equations can be represented as a superposition of a retarded solution, i.e. the solution which is obtained as the solution arising from the retarded actions of currents and charges, and of a homogeneous solution, i.e. an electromagnetic field without sources. The question arises whether in reaüty fields corresponding to homogeneous solutions exist? Such solutions, if they are realized in nature, have to be regarded as "wandering waves" which are traversing the universe but did not arise originally from moving charges. It is a question of experiment to decidé, whether or not such wandering waves occur in nature. 263. We remark that whenever we meet electromagnetic waves we immediately search for their sources, i.e. we suppose immediately that they are of the retarded kind. When astronomers observed electromagnetic waves coming from the universe, they immediately made theories to their origin. It seems not unlikely that wandering waves do not exist at all and that waves occurring in nature are all of the retarded type. The latter hypothesis involves that Maxwell's equations are giving only necessary conditions for the motion of electromagnetic fields — and that in nature only particular solutions of these equations, i.e. those which are represented by retarded potentials do really occur. As an example of an "advanced" solution of Maxwell's equation we mention the solution corresponding to a spherical wave contracting with the velocity c and finally disappearing in one point where a suitable charge (waiting for the wave) absorbs it. It is quite clear that such solutions of Maxwell's equations are only mathematical possibilities and that such solutions do not correspond to real phenomena. In generál a process described by advanced potentials only (like the example of the contracting spherical wave) can be regarded as the time reversals of processes which can be described by retarded potentials only. 
If we suppose that electromagnetic processes occurring in nature can always be expressed in terms of retarded potentials, then we have to conclude that electromagnetic processes are irreversible. Indeed, the time reversal of any real process leads to one which would have to be expressed in terms of advanced potentials, and therefore the reversed process is excluded by our hypothesis.

264. As an argument in favour of the occurrence of advanced solutions it was put forward that retarded potentials correspond to the emission of waves, while advanced potentials represent the process of absorption. We do not want to discuss the problems of quantum electrodynamics here. However, we note that as long as we remain inside the limits of validity of classical theory, there is certainly no need to suppose the existence of advanced fields in order to understand the process of absorption.

The total exchange of energy and momentum between the electromagnetic field and matter was given in 252. The latter equations show that an electromagnetic field may transfer energy and momentum to matter and thus lose corresponding amounts of field energy and momentum. This process follows from Maxwell's equations; to understand its mechanism we note that an atom, while it is interacting with radiation, is excited and emits secondary radiation. If energy is transferred to the atom in the process, then the secondary radiation is emitted with such a phase that it extinguishes by interference part of the incoming waves and thus reduces the energy content of the incoming radiation. Thus processes of absorption can be unambiguously accounted for in terms of the retarded fields only. We note, further, that it is quite irrelevant to the question of advanced potentials that for certain technical calculations (for reasons of mathematical simplicity) engineers sometimes make use of advanced solutions.
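The interference mechanism just described can be illustrated by a minimal numerical sketch. It assumes, purely for illustration, an incoming monochromatic wave of unit amplitude and a secondary wave of relative amplitude `b` and phase shift `phi` observed at a single point; these names and numbers are hypothetical and are not taken from the text.

```python
import math

def mean_square_field(b, phi, samples=360):
    """Time average of (incoming + secondary)^2 over one period.

    incoming  = cos(x)            (unit amplitude, assumed)
    secondary = b * cos(x + phi)  (hypothetical re-emitted wave)
    The mean square of the field is proportional to the energy density.
    """
    total = 0.0
    for n in range(samples):
        x = 2.0 * math.pi * n / samples
        total += (math.cos(x) + b * math.cos(x + phi)) ** 2
    return total / samples

incoming_only = mean_square_field(0.0, 0.0)
in_antiphase = mean_square_field(0.3, math.pi)   # secondary wave opposes the incoming one
in_phase = mean_square_field(0.3, 0.0)           # secondary wave reinforces it
print(incoming_only, in_antiphase, in_phase)
```

Only the secondary wave emitted in antiphase lowers the energy content of the total field, which is the case the text associates with absorption.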

C. MAXWELL'S EQUATIONS IN TERMS OF FOUR-TENSORS

265. In the sections below we use four-vector notations in a form which is explained more precisely in Appendixes I and II. Maxwell's equations can also be expressed in terms of four-tensors. Consider a Lorentz system of reference K in terms of which Maxwell's equations have the form given in 250-254. The relations given there can be rewritten. Denote by

$$\Psi = \mathbf{A},\ -c\varphi \qquad (23)$$

the four-potential and by

$$I_{\mathrm{eff}} = \mathbf{i}_{\mathrm{eff}},\ -c\varrho_{\mathrm{eff}} \qquad (24)$$

the four-current density. In accord with (4) we can also write

$$I_{\mathrm{eff}} = I + \mathrm{Div}\,\Pi, \qquad I = \mathbf{i},\ -c\varrho \qquad (25)$$

where $I$ is the conductive part of the four-current and $\Pi$ is an antisymmetric matrix representing the polarization, so that

$$\Pi_{ik} = M_l, \qquad \Pi_{i4} = -cP_i,$$

with i, k, l a cyclic permutation of 1, 2, 3 and i = 1, 2, 3.

266. Using the above notation the continuity relation (5) can be written

$$\mathrm{Div}\, I_{\mathrm{eff}} = 0. \qquad (26)$$

The wave equations (12) can be written D i v ¥ = 0,

K

}

(where we write $\Box$ for the Laplace operator in four dimensions, see Appendix I, 449). The field strength can be obtained from the four-potential as

$$F = \mathrm{Rot}\,\Psi. \qquad (28)$$

Comparing (28) with (11) we find that Fik = ' B t — Fa c

U k, / = cyclic permutation of 1, 2, 3 i = l, 2, 3.

= E,

(28a)

Maxwell's equations in the form (3a-d) can thus be written

$$\mathrm{Div}\, F = 4\pi I_{\mathrm{eff}}, \qquad \mathrm{Div}\, \tilde F = 0, \qquad (29)$$

with ~ 1 (*) F = y((e-F))

(30)

(see Appendix II 468). 267. The energy and momentum can be expressed by a four-tensor, i.e. (> 1 ~ ~ T = - — ( F - F + F-F). 2

(31)

Separating (31) into space and time components one fmds with the help of (7a, b) and (8) ik

T

= ik> T

T

i4

= r, 4

c§

= -

T

h

u

= c-u.

(31a)

Relations (7) and (8) can thus be written (2)

DivT + ^

e f f

= 0,

(32)

where ^ef =fef , f

f

-cEi

e f f

thus L^eff is the density of the ponderomotoric four-force.

(32a)

268. The relations (23)-(30) give Maxwell's equations, including the expressions for the conservation of energy and momentum, in terms of four-vectors and tensors. We note, however, that this formulation in itself does not give any new physical result. Indeed, if Maxwell's equations as given by three-vectors and tensors are valid in the measures of a system K, then the four-vector and tensor relations given in 265-267 are automatically valid, as they represent the same mathematical relations expressed in a different notation. Furthermore, if we submit the four-vector and tensor relations to an arbitrary reversible coordinate transformation x' = f(x), then according to the results of Appendix I the relations (23)-(32) retain their form, provided we define suitably the operations Div, Rot and the four-dimensional Laplace operator. The latter result is not surprising. It shows merely that Maxwell's equations express objective laws which can thus be expressed in terms of arbitrary measures of coordinates, vectors, etc. The simple form in which Maxwell's equations can be expressed using four-vectors and tensors shows that the four-potentials, four-currents, etc. can be taken as a kind of distinguished measures which are particularly suitable for describing an electromagnetic field.

1. RETARDED FOUR-POTENTIAL

269. A new result, which cannot be regarded merely as a change of notation, is obtained if we consider the retarded potential solutions of Maxwell's equations. Relations (17) can also be written as

$$\Psi(x) = \int \frac{I_{\mathrm{eff}}(x + X)}{R}\, d^3R, \qquad (33a)$$

$$T = -R/c, \qquad (33b)$$

$$X = \mathbf{R},\ T, \qquad R = |\mathbf{R}|. \qquad (33c)$$

The expressions (33) contain explicitly the space and time parts of coordinate vectors and therefore it is not obvious that the relations (33) are invariant. We show presently how the invariance of (33) can be proved with respect to reversible linear transformations. We consider a linear coordinate transformation

$$x' = Sx + s. \qquad (34)$$

According to Appendix I we have for any covariant vector field $A'(x') = \tilde S^{-1} A(x)$. Thus applying $\tilde S^{-1}$ to both sides of (33), and remembering that $\Psi$ and $I_{\mathrm{eff}}$ are covariant fields, we have

$$\tilde S^{-1} I_{\mathrm{eff}}(x + X) = I'_{\mathrm{eff}}(x' + SX)$$

and therefore

$$\Psi'(x') = \int \frac{I'_{\mathrm{eff}}(x' + SX)}{R}\, d^3R \qquad (35)$$

with

$$T = -R/c. \qquad (36)$$

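We note that the coordinate transformation (34) does not affect the variables of integration, and therefore the integral on the right of (35) is to be taken, just as in (33), over the components of $\mathbf{R}$. We introduce

$$X' = SX \qquad (37)$$

as new variables of integration. We note that (37) expresses the relations between the three components of $\mathbf{R}$ and those of $\mathbf{R}'$; the fourth components $T$ of $X$ and $T'$ of $X'$ have to be expressed in terms of $\mathbf{R}$ respectively $\mathbf{R}'$. We define $X'$ so that

$$X' g X' = 0 \qquad (38)$$

with

$$g = \tilde S^{-1}\, \Gamma\, S^{-1}. \qquad (39)$$

It is convenient to separate space and time components; we write thus for the sake of simplicity

$$S^{-1} = \begin{pmatrix} U & \mathbf{A} \\ \mathbf{B} & a \end{pmatrix}, \qquad g = \begin{pmatrix} G & \mathbf{V} \\ \mathbf{V} & -C^2 \end{pmatrix}. \qquad (40)$$

One finds also, using known algebraic relations,

$$\det S^{-1} = \det U\,\bigl(a - (U^{-1}\mathbf{A})\mathbf{B}\bigr), \qquad (40a)$$

$$\det g = -\det G\,\bigl(C^2 + \mathbf{V} G^{-1} \mathbf{V}\bigr). \qquad (40b)$$

We obtain from (39) and (40)

$$G = \tilde U U - c^2\, \mathbf{B}\circ\mathbf{B}, \qquad (41a)$$

$$\mathbf{V} = \mathbf{A}U - c^2 a \mathbf{B}, \qquad (41b)$$

$$C^2 = c^2 a^2 - \mathbf{A}^2, \qquad (41c)$$

and also

$$\det S = c/c' \quad\text{with}\quad c' = \sqrt{-\det g}. \qquad (41d)$$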

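270. Separating the space and time components of (37) we may write with the help of (40)

$$\mathbf{R} = U\mathbf{R}' + \mathbf{A}T', \qquad (42a)$$

$$T = -R/c = \mathbf{B}\mathbf{R}' + aT'; \qquad (42b)$$

eliminating $T'$ from the first relation with the help of the second we find

$$\mathbf{R} + \frac{\mathbf{A}R}{ac} = \left(U - \frac{\mathbf{A}\circ\mathbf{B}}{a}\right)\mathbf{R}'.$$

Differentiating the above relation with respect to $\mathbf{R}'$ and taking the determinant of both sides we find as the result of a simple calculation

$$\frac{1}{R}\left(R + \frac{\mathbf{A}\mathbf{R}}{ac}\right)\det\frac{\partial\mathbf{R}}{\partial\mathbf{R}'} = \frac{\det S^{-1}}{a}; \qquad (43)$$

expressing $\mathbf{R}$ in terms of $\mathbf{R}'$ and $T'$ with the help of (42) and (41) we find

$$R + \frac{\mathbf{A}\mathbf{R}}{ac} = \frac{1}{ac}\left(-C^2 T' + \mathbf{V}\mathbf{R}'\right). \qquad (44)$$

However, it follows from (38) that

$$C^2 T' = \mathbf{V}\mathbf{R}' - C\sqrt{\mathbf{R}'\hat G\,\mathbf{R}'} \quad\text{with}\quad \hat G = G + \frac{\mathbf{V}\circ\mathbf{V}}{C^2}, \qquad (45)$$

where we take $T'$ to be the solution of (38) for which $T' < 0$.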
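Relations (42), (44) and (45) can be checked numerically. The following is only a sketch: the blocks U, A, B, a of the inverse transformation and the vector R' are arbitrary illustrative numbers (not values from the text), c is set to 1, and the row vector "AU" of (41b) is evaluated as the transpose of U applied to A.

```python
import numpy as np

c = 1.0  # velocity of propagation in K (units assumed)

# Hypothetical blocks of S^{-1} = [[U, A], [B, a]], chosen close to the identity
U = np.array([[1.0, 0.1, 0.0],
              [0.0, 1.0, 0.2],
              [0.1, 0.0, 1.0]])
A = np.array([0.1, -0.2, 0.05])
B = np.array([0.05, 0.02, -0.03])
a = 1.1

# Blocks of g according to (41a)-(41c)
G = U.T @ U - c**2 * np.outer(B, B)
V = U.T @ A - c**2 * a * B
C2 = c**2 * a**2 - A @ A

def retarded_time(Rp):
    """Negative root T' of X' g X' = 0, i.e. relation (45)."""
    G_hat = G + np.outer(V, V) / C2
    return (V @ Rp - np.sqrt(C2) * np.sqrt(Rp @ G_hat @ Rp)) / C2

Rp = np.array([0.7, -0.4, 1.2])      # arbitrary R'
Tp = retarded_time(Rp)

# Back to the measures of K with (42a), (42b)
R_vec = U @ Rp + A * Tp
T = B @ Rp + a * Tp
R = np.linalg.norm(R_vec)

lhs_44 = R + (A @ R_vec) / (a * c)             # left-hand side of (44)
rhs_44 = (-C2 * Tp + V @ Rp) / (a * c)         # right-hand side of (44)
print(T, -R / c, lhs_44, rhs_44)
```

For this choice the retarded condition T = -R/c in K is reproduced by the negative root T' in K', and both sides of (44) agree.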

Thus inserting from (44) and (45) into (43) we find 1 ~R

-det

dR dR

1

(46)

271. Introducing (47) into (35) we obtain

$$\Psi'(x') = \int \frac{I'_{\mathrm{eff}}(x' + X')}{R'}\, d^3R'. \qquad (48)$$

We see thus that in the system K' the four-potential can be expressed by the same expression as in K. The interesting feature of this result is the manner in which the retardation is introduced in the system K'. Considering the action of sources from a point P' to a point P we note that the action which arrives in P at a time t' starts at the time T' + t', where |T'| is the time of flight from P' to P; that this is indeed the case follows from the fact that T' is obtained as the suitable solution of (38). Furthermore, according to 163 the expression (47) for R' gives just such a value that

$$2R'/c' = t_{PP'} + t_{P'P},$$

i.e. 2R'/c' gives the time of the return flight of a signal of light between P and P'. Thus the expression (48) has to be taken supposing that the measures of three-dimensional distances R' in K' are given by the times of return flights of signals covering the distance. The latter result shows the consistency of our considerations.

2. THE MOTION OF LIGHT SIGNALS IN TERMS OF MAXWELL'S EQUATIONS

272. Maxwell's equations written in terms of four-tensors contain implicitly the propagation tensor g, as the expressions for the vector operators contain g. We introduced g as a tensor describing the orbits of signals of light; we supposed in a purely phenomenological manner that the orbit of a signal of light obeys

$$\dot x\, g\, \dot x = 0. \qquad (49)$$

So as to show the consistency of the theory, we have to show that (49) follows from Maxwell's equations expressed in terms of an arbitrary system of reference.

273. So as to show this to be the case we consider as a first step the propagation of an electromagnetic plane wave. Maxwell's equations can be written

$$\Box\Psi = 0, \qquad \mathrm{Div}\,\Psi = 0. \qquad (50)$$

The system (50) permits plane wave solutions of the form

$$\Psi = f(u), \qquad u = \alpha x, \qquad (51)$$

(remembering that x is a contravariant four-vector) where $\alpha$ is a zero-vector obeying

$$\alpha\cdot\alpha = 0, \qquad (52)$$

and f(u) is a four-function satisfying

$$\alpha\cdot f'(u) = 0 \qquad (53)$$

for any value of u. Consider a four-point the orbit of which is given by x(p); this point will rest on a plane of constant phase provided $\alpha x(p)$ = constant, thus

$$\alpha\,\dot x(p) = 0. \qquad (54)$$

Relation (54) can be fulfilled e.g. by putting

$$\dot x(p) = g^{-1}\alpha = \alpha g^{-1}, \qquad (55)$$

as can be seen from (52). From (55) it follows, using again (52),

$$\dot x(p)\, g\, \dot x(p) = 0. \qquad (56)$$

We see thus that the planes of constant phase of the waves contain points moving according to (56).

274. That signals of light, which can be taken as some kind of wave packets, also move according to (56) is a statement going further than the result of the preceding paragraph. The latter statement can be seen to be correct considering the retarded solution of the wave equation. Indeed a signal is produced if some atoms in the vicinity of a point $\mathbf{r}_0$ around a time $t_0$ oscillate for a short period. The oscillation can be described by a source density

$$I_{\mathrm{eff}}(\mathbf{r}, t)$$

differing from zero if $\mathbf{r} \approx \mathbf{r}_0$, $t \approx t_0$. In an orthogonal representation the potentials in a point $\mathbf{r}$ at a time t are given by

$$\Psi(\mathbf{r}, t) = \int \frac{I_{\mathrm{eff}}(\mathbf{r}', t')}{|\mathbf{r} - \mathbf{r}'|}\, d^3r', \qquad t' = t - \frac{|\mathbf{r} - \mathbf{r}'|}{c}.$$

The integrand on the right-hand side vanishes unless

$$\mathbf{r}' \approx \mathbf{r}_0, \qquad t' \approx t_0,$$

thus we obtain potentials noticeably different from zero in four-points with four-coordinates $\mathbf{r}$, t obeying

$$t - \frac{|\mathbf{r} - \mathbf{r}_0|}{c} \approx t_0.$$

We see thus that the disturbance in $x_0$ gives rise to a spherical wave expanding isotropically with a velocity c. Such spherical waves can be used as signals of light, just as was needed for the determination of coordinate measures. Changing from the orthogonal representation to an arbitrary straight representation, we find that the expansion of the spherical wave in the new representation can be described by a quadratic expression of the form (56). Thus the retarded solutions of Maxwell's equations in an arbitrary representation lead to propagation of light in accord with (56).
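The change from the orthogonal to an arbitrary straight representation can be illustrated numerically. This is a sketch under stated assumptions: the propagation tensor of the orthogonal representation is taken as diag(1, 1, 1, -c²), the new tensor is formed from the inverse of S and its transpose, and the matrix S as well as the value of c are hypothetical illustrative numbers.

```python
import numpy as np

c = 2.0                                   # assumed value, arbitrary units
g = np.diag([1.0, 1.0, 1.0, -c**2])       # orthogonal representation (assumed form)

def null_step(direction):
    """Four-displacement (dr, dt) of a light signal with |dr| = c dt."""
    n = np.asarray(direction, dtype=float)
    n = n / np.linalg.norm(n)
    return np.append(c * n, 1.0)

def transformed_tensor(S):
    """Propagation tensor in the representation x' = S x (constant shift omitted)."""
    S_inv = np.linalg.inv(S)
    return S_inv.T @ g @ S_inv

# Hypothetical reversible linear transformation (not a Lorentz transformation)
S = np.array([[1.0, 0.2, 0.0, 0.3],
              [0.0, 1.1, 0.1, 0.0],
              [0.1, 0.0, 0.9, 0.2],
              [0.05, 0.0, 0.1, 1.2]])
g_new = transformed_tensor(S)

dx = null_step([1.0, -2.0, 0.5])
dx_new = S @ dx
form_old = dx @ g @ dx                    # quadratic form in the orthogonal representation
form_new = dx_new @ g_new @ dx_new        # same signal, new representation
print(form_old, form_new)

# Special case: a Lorentz matrix leaves g itself unchanged
v = 0.6 * c
gamma = 1.0 / np.sqrt(1.0 - v**2 / c**2)  # the factor called B in the text
M = np.eye(4)
M[0, 0] = gamma
M[0, 3] = -gamma * v
M[3, 0] = -gamma * v / c**2
M[3, 3] = gamma
lorentz_residual = np.abs(M.T @ g @ M - g).max()
print(lorentz_residual)
```

The quadratic form vanishes in both representations, while only the Lorentz matrix reproduces the same tensor g.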

D. MAXWELL'S EQUATIONS AND THE LORENTZ PRINCIPLE

275. In the preceding section we have shown that Maxwell's equations can be expressed in a consistent form which is independent of the system of reference. Physically new statements are obtained if we apply the Lorentz principle to Maxwell's equations. Let us consider an electromagnetic field $\mathfrak{F}$. In a straight representation we may write

$$K(\mathfrak{F}) = F(x),$$

where F may contain the potentials, the field strengths and the sources of the field in the representation K. According to the Lorentz principle we expect that, provided $\mathfrak{F}$ is a field moving in accord with Maxwell's equations, the Lorentz-deformed field $\mathfrak{F}^{*}$ should also be such a field.

276. The latter statement can be seen to be correct considering Maxwell's equations expressed by four-tensors. To show this consider e.g. the tensor

F(x) expressing the field strength of j$. We have F*(x*) = M ^ x j M

1

with

(57) X* = M X + (JL q

and M is a Lorentz mátrix obeying q

M„gM = g.

(58)

q

The transformation (57) has the form of a coordinate transformation: therefore if F(x) obeys Maxwell's equation in K, then the deformed field quantities F*{x*) obey Maxwell's equations in a system of reference K' the measures of which are obtained as ( 5 ,

x' = M " x + vl . with

M

( < , ) _

'= M . q

Since the transformation is a Lorentz transformation we have

m

= A"(A) = g,

therefore the operators Grad, Div, etc. are identical in K and K'; thus from the fact that the representation of the field in K' obeys Maxwell's equations, it follows that the representation of the deformed field obeys these equations in K.

277. In place of (57) one can also write

$$F^{*}(x) = \tilde M^{-1} F(\bar x)\, M^{-1} \qquad (59)$$

with

$$\bar x = M^{-1}(x - \mu). \qquad (60)$$

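Relation (59) is identical with (57), but the former expresses more clearly the fact that F*(x), just like the undeformed field F(x), is a representation relative to the same system of reference K. Explicitly written we have for the field strengths of a deformed field

$$\mathbf{E}^{*}(x) = \mathbf{E}_1(\bar x) + B\left[\mathbf{E}_2(\bar x) - \frac{1}{c}\bigl(\mathbf{v}\times\mathbf{B}(\bar x)\bigr)\right],$$

$$\mathbf{B}^{*}(x) = \mathbf{B}_1(\bar x) + B\left[\mathbf{B}_2(\bar x) + \frac{1}{c}\bigl(\mathbf{v}\times\mathbf{E}(\bar x)\bigr)\right], \qquad (61)$$

where we denote with suffix 1 respectively 2 the components of a vector parallel respectively perpendicular to $\mathbf{v}$. Thus e.g.

$$\mathbf{E}_1 = \mathbf{v}(\mathbf{v}\mathbf{E})/v^2, \qquad \mathbf{E}_2 = \mathbf{E} - \mathbf{E}_1.$$

Splitting into components parallel and perpendicular to $\mathbf{v}$ we can also write in place of (61)

$$\mathbf{E}^{*}_1(x) = \mathbf{E}_1(\bar x), \qquad \mathbf{B}^{*}_1(x) = \mathbf{B}_1(\bar x)$$

and

$$\mathbf{E}^{*}_2(x) = B\left[\mathbf{E}_2(\bar x) - \frac{1}{c}\bigl(\mathbf{v}\times\mathbf{B}_2(\bar x)\bigr)\right],$$

$$\mathbf{B}^{*}_2(x) = B\left[\mathbf{B}_2(\bar x) + \frac{1}{c}\bigl(\mathbf{v}\times\mathbf{E}_2(\bar x)\bigr)\right]. \qquad (62)$$

The inverse of transformation (60) split into space and time components can be written $\bar x = \bar{\mathbf{r}},\ \bar t$ and

$$\bar{\mathbf{r}} = B(\mathbf{r}_1 - \mathbf{v}t) + \mathbf{r}_2, \qquad \bar t = B(t - \mathbf{v}\mathbf{r}/c^2), \qquad (63)$$

where $\mathbf{r}_1$ and $\mathbf{r}_2$ are the components of $\mathbf{r}$ parallel respectively perpendicular to $\mathbf{v}$.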
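Relations (61) and (63) translate directly into a small routine. The sketch below is an illustration only: the function names are hypothetical, c is set to 1, and as a test case it uses the field of a unit point charge at rest (written in Gaussian form, an assumption), which the text treats next in 278.

```python
import numpy as np

c = 1.0

def deform_fields(E_rest, B_rest, v):
    """Return a function (r, t) -> (E*, B*) following (61) and (63).

    E_rest, B_rest: callables (r, t) -> 3-vector for the undeformed field.
    v: velocity vector with 0 < |v| < c.
    """
    v = np.asarray(v, dtype=float)
    speed = np.linalg.norm(v)
    n = v / speed
    Bf = 1.0 / np.sqrt(1.0 - speed**2 / c**2)    # the factor B of the text

    def split(vec):
        """Components parallel and perpendicular to v."""
        par = n * (n @ vec)
        return par, vec - par

    def fields(r, t):
        r = np.asarray(r, dtype=float)
        r1, r2 = split(r)
        r_bar = Bf * (r1 - v * t) + r2            # (63), space part
        t_bar = Bf * (t - (v @ r) / c**2)         # (63), time part
        E, Bm = E_rest(r_bar, t_bar), B_rest(r_bar, t_bar)
        E1, E2 = split(E)
        B1, B2 = split(Bm)
        E_star = E1 + Bf * (E2 - np.cross(v, Bm) / c)   # (61)
        B_star = B1 + Bf * (B2 + np.cross(v, E) / c)    # (61)
        return E_star, B_star

    return fields

# Usage: field of a unit charge at rest in the origin (assumed test case)
coulomb = lambda r, t: r / np.linalg.norm(r)**3
no_field = lambda r, t: np.zeros(3)
moving = deform_fields(coulomb, no_field, v=[0.6, 0.0, 0.0])
E_star, B_star = moving([1.0, 2.0, -0.5], 0.3)
print(E_star, B_star)
```

For this test case the deformed magnetic field equals the vector product of v with the deformed electric field divided by c, as expected for a uniformly moving charge.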

1. THE FIELD OF A POINT CHARGE

278. Consider the field of a point charge e at rest in the point $\mathbf{r} = 0$; we find for the field strengths

$$\mathbf{E}(\mathbf{r}) = \frac{e\,\mathbf{r}}{r^3}, \qquad \mathbf{B}(\mathbf{r}) = 0 \qquad \text{for any value of } t.$$

The transformed field is found with the help of (61) and (63):

$$\mathbf{E}^{*}(\mathbf{r}, t) = \frac{Be\,(\mathbf{r} - \mathbf{v}t)}{s^3}, \qquad \mathbf{B}^{*}(\mathbf{r}, t) = \frac{Be}{cs^3}\,(\mathbf{v}\times\mathbf{r}) \qquad (64)$$

with

$$s^2 = B^2(\mathbf{r}_1 - \mathbf{v}t)^2 + \mathbf{r}_2^2.$$

So as to see the significance of (64) we note that for velocities $v \ll c$ we have $B \approx 1$ and $s \approx |\mathbf{r} - \mathbf{v}t|$. In the latter approximation we find that the moving charge carries its electric field along as if the field together with the charge formed a solid. A magnetic field is induced which circles round the charge. The above result is verified experimentally; it amounts to the Biot-Savart law giving the magnetic field of convection currents. We see that, from the experimental point of view, the constant c appearing in the

latter relation has to be identified with the critical velocity c' rather than with the velocity c of propagation of electromagnetic waves.

279. So as to see the relativistic effects we consider for a fixed value of t the electric field in the plane

$$\mathbf{r}_1 = \mathbf{v}t,$$

i.e. in the plane perpendicular to $\mathbf{v}$ which contains the charge. We have

$$\mathbf{r}_1 - \mathbf{v}t = 0, \qquad s = |\mathbf{r}_2| = r_2$$

and therefore

$$\mathbf{E}^{*}(\mathbf{r}, t) = \frac{Be\,\mathbf{r}_2}{r_2^3},$$

i.e. the field strength is increased by a factor B relative to that of the charge at rest. The longitudinal field with

$$\mathbf{r}_2 = 0 \quad\text{and}\quad s = B\,|\mathbf{r}_1 - \mathbf{v}t|$$

gives an electric field

$$\mathbf{E}^{*}(\mathbf{r}, t) = \frac{e\,(\mathbf{r}_1 - \mathbf{v}t)}{B^2\,|\mathbf{r}_1 - \mathbf{v}t|^3}.$$

Thus the field strength is reduced by a factor $1/B^2$ in the longitudinal direction. In the extreme relativistic case when $B \gg 1$ the field carried by the charge is concentrated into a small region in the vicinity of the plane perpendicular to $\mathbf{v}$ and moving with the charge. The field thus moves with a velocity $v \approx c$, it is nearly transversal, and the electric and magnetic field strengths are nearly equal. Thus the field carried by the fast moving charge strongly resembles an electromagnetic plane wave accompanying the charge. The latter result can be compared with experiments; it is verified indirectly by observing collisions between fast moving charged particles.*

280. The measure of charge e* of the moving particle may be defined by Gauss' theorem, i.e.

$$e^{*} = \frac{1}{4\pi}\oint \mathbf{E}^{*}(\mathbf{r}, t)\, d\mathbf{S}, \qquad (65)$$

where the integration is to be taken at a fixed time t over a closed surface containing the moving particle. Inserting the field strengths as obtained from (64) into (65) we find e* = e. Thus maintaining the definition (65) we conclude that the measure of the charge does not change if the particle is set to move.

* See for details e.g. L. Jánossy: Cosmic Rays. Clarendon Press, Oxford 1950. 2nd ed.
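The statements of 279 and 280 can be checked numerically from (64). The following sketch assumes c = 1, evaluates the field at t = 0 on a sphere centred on the charge, and uses a simple midpoint rule; the function names and the chosen velocities are illustrative assumptions.

```python
import math

def radial_field(e, beta, cos_theta, R):
    """Magnitude of E* of (64) at t = 0 at distance R from the charge.

    theta is the angle between r and v; at t = 0 the field points along r.
    """
    Bf = 1.0 / math.sqrt(1.0 - beta**2)          # the factor B of the text
    s2 = R**2 * (Bf**2 * cos_theta**2 + (1.0 - cos_theta**2))
    return Bf * e * R / s2**1.5

def charge_from_gauss(e, beta, R=1.0, steps=20000):
    """Evaluate (65): (1 / 4 pi) times the flux of E* through a sphere of radius R."""
    total = 0.0
    du = 2.0 / steps                              # integration variable u = cos(theta)
    for k in range(steps):
        u = -1.0 + (k + 0.5) * du
        total += radial_field(e, beta, u, R) * 2.0 * math.pi * R**2 * du
    return total / (4.0 * math.pi)

beta = 0.9
Bf = 1.0 / math.sqrt(1.0 - beta**2)
transverse_gain = radial_field(1.0, beta, 0.0, 1.0)     # relative to e / R^2 = 1
longitudinal_gain = radial_field(1.0, beta, 1.0, 1.0)
print(charge_from_gauss(1.0, beta), transverse_gain, longitudinal_gain)
```

The flux integral returns the rest value of the charge, while the transverse field is enhanced by B and the longitudinal field reduced by 1/B².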

CHAPTER IX

RELATIVISTIC EFFECTS OF THE ELECTROMAGNETIC FIELD

A. EFFECTS OF THE FIRST ORDER

281. Developing into powers of v/c we can classify effects for velocities v < c as of the first order, of the second order or of higher orders. The essentially relativistic effects are of the second (or higher) orders. Some effects of the first order have, however, played a great role historically, and it was not always fully realized that the latter effects can all be accounted for without making use of relativistic concepts. Such misunderstandings appeared in connection with the Doppler effect (not including the perpendicular Doppler effect), the effect of aberration of light and in particular in connection with Fizeau's result giving the velocity of light in a moving medium. It was sometimes claimed incorrectly that "Fizeau's experiment gives an experimental proof that the relativistic law of addition of velocities is the correct law, the classical addition law being only approximately correct". The latter statement was already criticized in chapter VI, 208. Discussing the mechanism of first order effects we shall show in particular in this chapter that the results of Fizeau can be understood in terms of the classical theory of propagation of light in refracting media and that it is not necessary to make use of relativistic concepts in this treatment.

1. EFFECTIVE FIELD STRENGTHS

282. The claim that the first order effects can be explained without using relativistic concepts has to be made a little more precise. One finds that the first order effects can be correctly interpreted supposing (1) that a closed physical system when made to move with a constant velocity v does not deform noticeably. This assumption implies that for the purpose of the first order approximation we can neglect second order effects like the length contraction and the slowing down of the rates of clocks. We suppose e.g. when giving the theory of aberration that the telescope moving together with the Earth behaves like a rigid body and does not suffer deformations while the orbital velocity of the Earth changes. (2) We suppose that an electromagnetic field of strengths $\mathbf{E}$ and $\mathbf{B}$ acts upon a physical system Q moving with a velocity $\mathbf{v}$ as if Q was at rest and placed into a field with field strengths

$$\mathbf{E}_{\mathrm{eff}} = \mathbf{E} + \frac{1}{c}(\mathbf{v}\times\mathbf{B}), \qquad \mathbf{B}_{\mathrm{eff}} = \mathbf{B} - \frac{1}{c}(\mathbf{v}\times\mathbf{E}). \qquad (1)$$

The expressions (1) describe first order effects only; they can be obtained considering the Biot-Savart law and the law of induction. From the latter it follows that an electric charge e moving with a velocity $\mathbf{v}$ possesses a magnetic field

$$\mathbf{B}_e(\mathbf{r}) = \frac{e}{c}\,\frac{\mathbf{v}\times\mathbf{r}}{r^3},$$

while a moving magnetic pole m produces an electric field of strength

$$\mathbf{E}_m(\mathbf{r}) = -\frac{m}{c}\,\frac{\mathbf{v}\times\mathbf{r}}{r^3}.$$

Supposing the principle of action and reaction to hold (at least up to terms of the order of v/c), the relations (1) can easily be obtained from the above two relations. The interpretation of first order effects such as the aberration and Fizeau's result can be obtained in a much more elegant way using the Lorentz transformation than with the help of detailed classical considerations. We give the latter considerations only so as to show that these effects can indeed be understood without the use of relativistic concepts. The fact that these effects can also be interpreted making use of Lorentz transformations proves the consistency of the theory.

2. THE FIELD OF DIPOLES

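283. The field of an electric dipole situated in $\mathbf{r} = 0$ can be written

$$\mathbf{E}(\mathbf{r}) = -\operatorname{grad}\frac{\mathbf{p}\mathbf{r}}{r^3}, \qquad \mathbf{B} = 0, \qquad (2a)$$

where $\mathbf{p}$ is the dipole moment. Similarly we find for the field of a magnetic dipole

$$\mathbf{B}(\mathbf{r}) = -\operatorname{grad}\frac{\mathbf{m}\mathbf{r}}{r^3}, \qquad \mathbf{E} = 0, \qquad (2b)$$

where $\mathbf{m}$ is the magnetic dipole moment. The field of an electric or a magnetic dipole moving with a velocity $\mathbf{v}$ can be obtained as the Lorentz deform of the fields (2a) and (2b).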

The moving dipoles of both kinds possess both electric and magnetic fields. The electric field of a moving electric dipole is found to be, apart from terms of the order of $v^2/c^2$, that of the original dipole moving together with the dipole. The magnetic field has a complicated structure. The main (non-relativistic) features of the magnetic field of a moving electric dipole can be obtained if we take in place of a dipole a thin strip which is polarized perpendicular to its surface (see Fig. 22) and which is moving parallel to its surface. We see thus that the electrically polarized strip, when set to move, shows a magnetic field corresponding to a magnetic polarization given by (2b).

Fig. 22. Scheme of a moving strip polarized perpendicularly to the velocity

The strip behaves as if the electric dipoles when set to move would develop a magnetic dipole moment $d\mathbf{m} = \mathbf{v}\times d\mathbf{p}/c$. The above result has to be taken with care; the electrically polarized strip behaves as if it would contain magnetic dipoles of moment

$$d\mathbf{m} = \mathbf{M}\,dS.$$

However, the magnetic fields of the single electric dipoles when set to move differ from those of magnetic dipoles of moment $d\mathbf{m}$. The superposition of the fields of the moving electric dipoles is, however, equal to the superposition of the fields of the equivalent magnetic dipoles. The strip can be taken to consist of electric dipoles of cross-section dS with an electric dipole moment

$$d\mathbf{p} = \mathbf{P}\,dS.$$

The Poisson charge representing the polarization consists of surface charges of density $\pm P$ on the surfaces of the strip. The moving strip then carries convection currents of opposite signs on its surfaces; the surface densities of the currents are $\pm|\mathbf{v}\times\mathbf{P}|/c$ and they produce a magnetic field corresponding to a magnetic polarization

$$\mathbf{M} = \mathbf{v}\times\mathbf{P}/c. \qquad (2c)$$

Similarly, a magnetized strip, when set to move, produces an electric field and we find

$$\mathbf{P} = -\mathbf{v}\times\mathbf{M}/c \qquad (2d)$$

for the apparent electric polarization of the magnetized strip.

B. TRANSFORMATION PROPERTIES OF FOUR-CURRENTS

284. For the understanding of some electromagnetic phenomena it is necessary to consider higher order effects also. Consider a uniform charge distribution with constant density $\varrho_0$. The current charge density distribution can be written

$$I_0 = 0,\ 0,\ 0,\ -c\varrho_0.$$

A Lorentz deformed version of the distribution is obtained by applying the operator $\Lambda$ to $I_0$; we find

$$I^{*} = \mathbf{i},\ -c\varrho \quad\text{with}\quad \mathbf{i} = \frac{\mathbf{v}}{c}\,\varrho, \qquad \varrho = B\varrho_0.$$

285. The current in a conductor consists of moving charges of one sign and a system of stationary charges of the opposite sign. The two charge distributions compensate each other statically. We may thus write for the current density in a conductor

$$I = I_1 + I_2 \quad\text{with}\quad I_1 = \mathbf{i},\ -c\varrho, \qquad I_2 = 0,\ c\varrho,$$

thus

$$I = \mathbf{i},\ 0. \qquad (3)$$

The deformed current density corresponding to (3) can be written

$$I^{*} = \mathbf{i}^{*},\ (\mathbf{v}\mathbf{i})B \quad\text{with}\quad \mathbf{i}^{*} = B\mathbf{i}_1 + \mathbf{i}_2, \qquad (4)$$

$\mathbf{i}_1$, $\mathbf{i}_2$ being the components of $\mathbf{i}$ parallel respectively perpendicular to $\mathbf{v}$. It is interesting that the deform of the neutral four-current with density (3) gives a four-current (4) which possesses an excess charge with density

$$\varrho^{*} = -(\mathbf{v}\mathbf{i})B/c. \qquad (5)$$

The usual interpretation of (5) is that the density $\varrho^{*}$ occurs because "simultaneity depends on the system of reference". We do not think that the above statement makes sense, but we do think that (5) has a very simple physical content.

1. THE ELECTRIC FIELD OF A MOVING CURRENT

286. So as to see how the excess charge $\varrho^{*}$ comes about consider a rectangular current as shown in Fig. 23. Suppose charges of density $\varrho$ move around the conductor with a constant velocity w and suppose that the conductor contains charges with density $-\varrho$ which are at rest. The current density is thus of the form (3) and the system has a magnetic field only.

Fig. 23. Scheme of a closed current moving with a velocity v

Consider the deformed version of the rectangular current which moves as a whole with a velocity v in the direction AB. In the latter configuration the charges moving in the circuit will have velocities (see (7) in 205)

$$w_1 = \frac{v + w}{1 + vw/c^2} \quad\text{and}\quad w_2 = \frac{v - w}{1 - vw/c^2}$$

in the sections AB and CD, and therefore they have velocities

$$W_1 = w_1 - v = \frac{w\,(1 - v^2/c^2)}{1 + vw/c^2} \quad\text{and}\quad W_2 = w_2 - v = -\frac{w\,(1 - v^2/c^2)}{1 - vw/c^2}$$

relative to the conductor. Supposing vw > 0 we find

$$|W_1| < |W_2|,$$

thus the particles flow faster in the section C → D than in the section A → B. Since the particles are thus moving with a variable velocity through the conductor their density changes accordingly, the actual number of particles to be found per unit volume being inversely proportional to the velocity of progress. (This effect can be compared with the density of traffic along a road. In sections where the traffic is slowed down by some obstacle the density of vehicles increases.) The densities of the moving charges in the sections AB and CD are thus equal to

$$\varrho_{1,2} = \frac{w}{B\,|W_{1,2}|}\,\varrho = B\left(\varrho \pm \frac{vi}{c}\right),$$

where we have written $\varrho w/c = i$. Neglecting thus terms in $v^2/c^2$ we see that the arms AB resp. CD of the circuit show excess charge densities $\pm\Delta\varrho$ with

$$\Delta\varrho = \varrho^{*} = iv/c.$$

We conclude that the electromotive force which drives the charges evenly around the circuit when the circuit is at rest produces a non-uniform motion if the circuit is moving with a velocity v. The unevenly moving charges produce in the deformed system excess charges through effects which can be compared with a "traffic jam". These excess charges give rise to an electric field.

287. The current shown in Fig. 23 acts at large distances like a magnetic dipole $\mathbf{m}$ the direction of which is perpendicular to the plane of the current. If the circuit is set to move with a velocity v in the direction A → B, then the excess charges $\pm\Delta\varrho$ forming along AB and CD produce a field which at greater distances appears as that of an electric dipole of moment $\boldsymbol{\pi} = -\mathbf{v}\times\mathbf{m}/c$; the latter is in accord with relation (2d) of 283 relating to the transformation properties of electric and magnetic dipoles. The above derivation makes use of the addition formulae of velocities and thus the electric field of a moving closed current appears as a second order effect. The electric dipole field of a moving magnet can be taken to be a first order effect in accord with 283. This comparison shows that the distinction between first and second order effects is not a sharp one.
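The "traffic jam" argument of 286 can be followed numerically. The sketch below makes two assumptions that are labelled as such: the densities of the circulating charges are normalized as ρw/(B|W|), and the stationary charges of the contracted conductor are taken to have density -Bρ. The numerical values and c = 1 are purely illustrative.

```python
import math

def arm_excess_densities(rho, w, v, c=1.0):
    """Excess charge densities in the arms AB and CD of the moving circuit.

    rho: density of the circulating charges for the circuit at rest
    w:   their velocity around the circuit at rest (taken positive)
    v:   velocity of the circuit in the direction AB
    """
    Bf = 1.0 / math.sqrt(1.0 - v**2 / c**2)   # the factor B of the text
    w1 = (v + w) / (1.0 + v * w / c**2)       # arm AB, addition of velocities
    w2 = (v - w) / (1.0 - v * w / c**2)       # arm CD
    W1, W2 = w1 - v, w2 - v                   # velocities relative to the conductor
    rho1 = rho * w / (Bf * abs(W1))           # density ~ 1 / velocity of progress (assumed normalization)
    rho2 = rho * w / (Bf * abs(W2))
    fixed = -Bf * rho                         # stationary charges (assumption, see lead-in)
    return rho1 + fixed, rho2 + fixed

rho, w, v = 1.0, 0.01, 0.1
excess_AB, excess_CD = arm_excess_densities(rho, w, v)
i = rho * w / 1.0                             # i = rho w / c with c = 1
print(excess_AB, excess_CD, i * v)
```

Up to terms of order v²/c² the two arms carry the excess densities ±iv/c quoted in the text.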

C. FURTHER EFFECTS OF THE FIRST ORDER

1. DOPPLER EFFECT AND ABERRATION

288. The above effects can be interpreted up to first order terms with the help of relations (1). However, as the higher order effects are also of some interest, we give here the exact calculations, which include Lorentz deformations also. It can be seen easily that the relations obtained concerning the Doppler effect and aberration can be obtained in a first order approximation without making use of the Lorentz transformation, using only relations (1), which give the effective field strength acting upon a moving system Q. Let us consider a source of monochromatic light in a point A at rest and at a large distance from the origin of the system of coordinates K. The waves arriving from A produce in the vicinity of the origin of K a field which can be described in a good approximation as a plane wave. The waves satisfy Maxwell's equations, thus they can be expressed in terms of a four-potential

$$\Psi(x) = \Psi_0 \cos 2\pi(\alpha x + \phi) \qquad (6)$$

where $\Psi_0$, $\alpha$ are constant four-vectors and $\phi$ is a constant. Inserting (6) into Maxwell's equations we find that

$$\alpha\cdot\alpha = 0, \qquad \Psi_0\cdot\alpha = 0. \qquad (7)$$

Writing

$$\alpha = \boldsymbol{\kappa},\ -\omega, \qquad \omega > 0, \qquad (8)$$

it follows from (7) and (8) that

$$\omega = c\kappa, \qquad (9)$$

and we can take $\boldsymbol{\kappa}$ to be the wave vector. We shall also write

$$\mathbf{k} = \boldsymbol{\kappa}/\kappa,$$

thus $\mathbf{k}$ is the unit vector pointing into the direction of propagation of the waves. From (6) it follows that

$$F(x) = F_0 \sin 2\pi(\alpha x + \phi), \qquad F_0 = -2\pi\,\alpha\times\Psi_0.$$

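289. Taking the Lorentz deformed configuration of the field considered in the previous paragraph, we obtain the field of a source A* moving with a velocity $\mathbf{v}$ relative to K. Using the formulae of 277 we find

$$F^{*}(x) = F^{*}_0 \sin 2\pi(\alpha\Lambda_{-v}x + \phi) \qquad (10)$$

with

$$F^{*}_0 = \tilde\Lambda_{-v}\, F_0\, \Lambda_{-v}. \qquad (11)$$

We may write

$$\alpha\Lambda_{-v} = \tilde\Lambda_{-v}\,\alpha = \alpha^{*} \qquad (12)$$

and in place of (10)

$$F^{*}(x) = F^{*}_0 \sin 2\pi(\alpha^{*}x + \phi).$$

From (11) and (12) we see that $\alpha$ and $F_0$ change in the course of deformation like a four-vector and a four-tensor. So as to obtain the wavelength, frequency and direction of propagation of the deformed wave we introduce the quantities

$$\alpha^{*} = \boldsymbol{\kappa}^{*},\ -\omega^{*}, \qquad \omega^{*} = c\kappa^{*} \qquad (13)$$

and also

$$\mathbf{k}^{*} = \boldsymbol{\kappa}^{*}/\kappa^{*} \quad\text{with}\quad \mathbf{k}^{*2} = 1.$$

Inserting (13) into (12) we find

$$\boldsymbol{\kappa} = B\boldsymbol{\kappa}^{*}_1 + \boldsymbol{\kappa}^{*}_2 - B\mathbf{v}\omega^{*}/c^2, \qquad (14a)$$

$$\omega = B(\omega^{*} - \mathbf{v}\boldsymbol{\kappa}^{*}), \qquad (14b)$$

where $\boldsymbol{\kappa}^{*}_1$ and $\boldsymbol{\kappa}^{*}_2$ are the components of $\boldsymbol{\kappa}^{*}$ parallel resp. perpendicular to $\mathbf{v}$.

2. FREQUENCIES OF THE DOPPLER EFFECT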

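290. Taking the square of both sides of (14a) and remembering that

$$\boldsymbol{\kappa}^{*}_1\boldsymbol{\kappa}^{*}_2 = \boldsymbol{\kappa}^{*}_2\mathbf{v} = 0$$

we find

$$\kappa^2 = \kappa^{*2}\left[B^2\left(\cos\vartheta^{*} - \frac{v}{c}\right)^2 + \sin^2\vartheta^{*}\right] \qquad (15)$$

where we have written

$$\boldsymbol{\kappa}^{*}\mathbf{v}/(\kappa^{*}v) = \cos\vartheta^{*},$$

thus $\vartheta^{*}$ is the angle between $\boldsymbol{\kappa}^{*}$ and $\mathbf{v}$. Remembering the connection between B and v/c, (15) can also be written

$$\kappa = \kappa^{*}B\left(1 - \frac{v}{c}\cos\vartheta^{*}\right). \qquad (16)$$

From (9) and (13) we have further $\omega^{*} = \omega\kappa^{*}/\kappa$. Thus we may write in place of (16)

$$\omega^{*} = \frac{\omega}{B\left(1 - \dfrac{v}{c}\cos\vartheta^{*}\right)}. \qquad (17)$$

Starting from the inverse expression of (14) we obtain in place of (17) similarly

$$\omega = \frac{\omega^{*}}{B\left(1 + \dfrac{v}{c}\cos\vartheta\right)}. \qquad (18)$$

3. EFFECT OF ABERRATION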

291. Multiplying (17) and (18) we find further

$$\frac{1}{B^2} = \left(1 + \frac{v}{c}\cos\vartheta\right)\left(1 - \frac{v}{c}\cos\vartheta^{*}\right)$$

and thus

$$\cos\vartheta^{*} = \frac{\cos\vartheta + \dfrac{v}{c}}{1 + \dfrac{v}{c}\cos\vartheta}. \qquad (19)$$

The latter expression can also be written

$$\sin\vartheta^{*} = \frac{\sin\vartheta}{B\left(1 + \dfrac{v}{c}\cos\vartheta\right)}.$$

Neglecting terms of second order in v/c we can also write

$$\Theta = \vartheta - \vartheta^{*} \approx \frac{v}{c}\sin\vartheta, \qquad (20)$$

where $\Theta$ is the angle of aberration. We come back to the discussion of this effect in 293. Equation (17) gives the frequency distribution of the Doppler effect; the latter expression is identical with that obtained from the purely phenomenological considerations of 38.

4. INTENSITIES IN THE DOPPLER EFFECT

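292. So as to obtain the intensity distribution of the radiation emitted by a moving source we calculate the energy density of the deformed wave. The field strengths of the plane wave arising from the source at rest can be written

$$\mathbf{E} = \boldsymbol{\pi}A, \qquad \mathbf{B} = \mathbf{k}\times\mathbf{E}, \qquad (21)$$

where $\boldsymbol{\pi}$ is the unit vector in the direction of polarization, thus

$$\boldsymbol{\pi}\mathbf{k} = 0, \qquad \boldsymbol{\pi}^2 = \mathbf{k}^2 = 1.$$

A is the amplitude of the wave. The intensity of the wave can be characterized by the density of energy, thus

$$u = \frac{1}{8\pi}(\mathbf{E}^2 + \mathbf{B}^2) = \frac{A^2}{4\pi}. \qquad (22)$$

The energy density of the deformed wave is given by

$$u^{*} = \frac{1}{8\pi}(\mathbf{E}^{*2} + \mathbf{B}^{*2}) = \frac{A^{*2}}{4\pi}. \qquad (23)$$

Introducing the expressions for $\mathbf{E}^{*}$ and $\mathbf{B}^{*}$ from (61) in 277 we obtain with the help of (23) as the result of a short calculation

$$\frac{A^{*}}{A} = B\left(1 + \frac{v}{c}\cos\vartheta\right), \qquad \cos\vartheta = \mathbf{k}\mathbf{v}/v.$$

Starting instead of (21) from the relations

$$\mathbf{E}^{*} = \boldsymbol{\pi}^{*}A^{*}, \qquad \mathbf{B}^{*} = \mathbf{k}^{*}\times\mathbf{E}^{*}, \qquad \cos\vartheta^{*} = \mathbf{k}^{*}\mathbf{v}/v$$

and expressing $\mathbf{E}$ and $\mathbf{B}$ in (22) in terms of $\mathbf{E}^{*}$ and $\mathbf{B}^{*}$ we find also

$$\frac{A^{*}}{A} = \frac{1}{B\left(1 - \dfrac{v}{c}\cos\vartheta^{*}\right)}.$$

The relation gives the intensity of the deformed wave as a function of the direction of incidence.

In particular we find for the forward and backward directions

$$\frac{A^{*}}{A} = B\left(1 + \frac{v}{c}\right) \ \text{for}\ \vartheta = 0, \qquad \frac{A^{*}}{A} = \frac{1}{B\left(1 + \dfrac{v}{c}\right)} \ \text{for}\ \vartheta^{*} = \pi.$$

In the extreme relativistic case, if $v \approx c$, then we have

$$\frac{u^{*}}{u} \approx 4B^2 \ \text{for}\ \vartheta = 0, \qquad \frac{u^{*}}{u} \approx \frac{1}{4B^2} \ \text{for}\ \vartheta^{*} = \pi. \qquad (24)$$

From (24) we see that in the extreme relativistic case, i.e. for $v \approx c$, $B \gg 1$, the intensity of the deformed radiation will be very small in all directions except inside a narrow cone the axis of which points into the direction of $\mathbf{v}$.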
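The relations (17)-(20) of 290-291 and the amplitude ratios of 292 can be tabulated with a few lines of code. This is only a sketch: the function name is hypothetical, velocities are given as beta = v/c, and the Earth's orbital velocity is taken as roughly 1e-4 c for the aberration check (an assumed round figure).

```python
import math

def deformed_wave(theta, beta):
    """Return (theta*, omega*/omega, A*/A) for a source moving with v = beta * c.

    theta: angle between the wave vector and v for the source at rest.
    """
    Bf = 1.0 / math.sqrt(1.0 - beta**2)                      # the factor B of the text
    cos_star = (math.cos(theta) + beta) / (1.0 + beta * math.cos(theta))   # (19)
    cos_star = max(-1.0, min(1.0, cos_star))                 # guard against rounding
    theta_star = math.acos(cos_star)
    freq_ratio = Bf * (1.0 + beta * math.cos(theta))         # from (18)
    amp_ratio = Bf * (1.0 + beta * math.cos(theta))          # amplitude ratio of 292
    return theta_star, freq_ratio, amp_ratio

# Aberration for a small velocity: theta - theta* is close to (v/c) sin(theta), cf. (20)
theta, beta_small = 1.0, 1.0e-4
theta_star, _, _ = deformed_wave(theta, beta_small)
aberration = theta - theta_star
print(aberration, beta_small * math.sin(theta))

# Forward concentration for a fast source: A*/A approaches 2B, so u*/u approaches 4B^2
beta_fast = 0.99
_, _, forward_amp = deformed_wave(0.0, beta_fast)
_, _, backward_amp = deformed_wave(math.pi, beta_fast)
print(forward_amp**2, backward_amp**2)
```

The frequency and the amplitude change by the same factor, so the intensity in the forward direction grows with the square of the Doppler factor while it is strongly suppressed in the backward direction.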

5. OBSERVATION OF THE EFFECT OF ABERRATION OF STAR LIGHT

293. The considerations of the previous paragraph give the change of a radiation field from $\mathfrak{F}$ to $\mathfrak{F}^*$ which takes place if the source is adiabatically accelerated and is made to change from $\mathfrak{Q}$ to $\mathfrak{Q}^*$. The observations of fixed stars show that their apparent positions change while the orbital velocity of the Earth changes. In the latter observation it is not the source $\mathfrak{Q}$ which is subjected to deformation, but the instrument of observation $\mathfrak{O}$. The instrument $\mathfrak{O}$ suffers continuous deformations while the orbital velocity of the Earth changes, and we take it that at two times, say $t = 0$ and $t = t_1$, the instrument has the configurations $\mathfrak{O}_0$ and

$$\mathfrak{O}_0^* = \mathfrak{L}(\mathfrak{O}_0),$$

respectively, where $\mathfrak{L}$ is the Lorentz transformation describing the deformation caused by the change of the orbital velocity of the Earth. Observing the radiation $\mathfrak{F}$ we observe the effect of $\mathfrak{F}$ upon $\mathfrak{O}_0$ and compare it with the effect of $\mathfrak{F}$ upon $\mathfrak{O}_0^*$. However, we register the results of both observations relative to the systems of reference $K$ and $K'$, respectively, relative to which the instrument is at rest at the time of observation. We thus compare the effect of $F = K(\mathfrak{F})$ upon $O_0 = K(\mathfrak{O}_0)$ with the effect of $F' = K'(\mathfrak{F})$ upon $O_0'^* = K'(\mathfrak{O}_0^*)$. Since $K$ and $K'$ are the rest systems of $\mathfrak{O}_0$ and $\mathfrak{O}_0^*$, respectively, we have

$$O_0'^* = O_0.$$

Thus we observe $F$ and $F'$ with instruments which appear to have the same configuration; from the point of view of the observer it therefore appears as if not the instrument had changed from $\mathfrak{O}_0 \to \mathfrak{O}_0^*$ but as if the radiation field had changed from

$$\mathfrak{F} \to \mathfrak{F}^*.$$

We may therefore apply the formulae obtained in 288, which refer to the change of the radiation field if the source is made to move, also to the case where the sources have remained stationary but the instrument has been made to move. In particular eq. (20) of 291 can thus be taken to give the apparent change of position of a star while the velocity of the Earth changes.

294. It is important to note that there exists a physical difference between the process where the source is made to move and that where the instrument instead is made to move. In the former process there appears a particular type of radiation emitted while the source is accelerated. The radiation field $\mathfrak{F}^*$ reaches the observer only after the radiation emitted in the transient period has passed away. If the instrument $\mathfrak{O}$ is accelerated instead, then the deformations take place practically instantaneously and no transient phase is to be expected.

The above considerations can also be formulated as follows. If an instrument of observation is placed into the radiation field of, say, a star then it indicates the direction of the Poynting vector

$$\mathfrak{S} = \frac{c}{4\pi}(\mathbf{E}\times\mathbf{B}).$$

If the instrument is set to move with a velocity $\mathbf{v}$ then it will react upon the field as if its field strengths had changed to the values $\mathbf{E}_{\text{eff}}$ and $\mathbf{B}_{\text{eff}}$ as given by (1), and it will indicate the direction not of $\mathfrak{S}$ but of

$$\mathfrak{S}_{\text{eff}} = \frac{c}{4\pi}(\mathbf{E}_{\text{eff}}\times\mathbf{B}_{\text{eff}});$$

the angle between $\mathfrak{S}$ and $\mathfrak{S}_{\text{eff}}$ is (neglecting higher order terms) equal to the angle of aberration.

Let us consider e.g. a narrow metallic tube through which we view a star. The tube acts as a wave guide and the star light can pass through it only if it is pointing into the direction of $\mathfrak{S}$. If the direction of the tube is changed then the heat losses caused by currents, produced by the electric field strength in the walls of the tube, extinguish the beam. If the tube is moved with a velocity $\mathbf{v}$ then $\mathbf{E}_{\text{eff}}$ instead of $\mathbf{E}$ is responsible for the losses and we have to adjust the tube into the direction of $\mathfrak{S}_{\text{eff}}$ so as to allow the wave to pass.

Instead of the metal tube we can consider any other instrument, e.g. a telescope or, as was done by Bradley, a telescope filled with water. Any such instrument will indicate the direction of $\mathfrak{S}_{\text{eff}}$. The above consideration, though elementary, seems nevertheless to be worthwhile, as we find in the literature misconceptions regarding the effect of aberration.
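As a numerical illustration of eq. (20), the following Python sketch computes the aberration angle for a star seen at right angles to the motion of the Earth. The orbital speed of about 29.78 km/s is a value assumed here and not taken from the text, and the exact comparison uses the standard relation $\cos\vartheta^* = (\cos\vartheta + v/c)/(1 + (v/c)\cos\vartheta)$, likewise an assumption of this sketch.

```python
import math

C = 299_792_458.0          # velocity of light in m/s
V_EARTH = 29_780.0         # assumed mean orbital speed of the Earth in m/s


def aberrated_angle(theta, beta):
    """Exact direction theta* (radians) for a given theta and beta = v/c."""
    return math.acos((math.cos(theta) + beta) / (1.0 + beta * math.cos(theta)))


def aberration_first_order(theta, beta):
    """Angle of aberration delta = (v/c) sin(theta), as in eq. (20)."""
    return beta * math.sin(theta)


def to_arcsec(angle_rad):
    """Convert radians to seconds of arc."""
    return math.degrees(angle_rad) * 3600.0


if __name__ == "__main__":
    beta = V_EARTH / C
    theta = math.pi / 2.0
    exact = theta - aberrated_angle(theta, beta)
    approx = aberration_first_order(theta, beta)
    print(f"first order: {to_arcsec(approx):.3f} arcsec, exact: {to_arcsec(exact):.3f} arcsec")
```

At this velocity the first-order expression and the exact one agree to far better than any observational accuracy, which is why second-order terms could be dropped in (20).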

6. PROPAGATION OF LIGHT IN A REFRACTING MEDIUM

295. Let us consider a source of light $A$ and a receptor in a point $B$ at the distance $l$. Suppose $A$ starts to emit light at a time $t = 0$; the front of the emission will reach $B$ at a time $t_B = l/c$. This can be seen e.g. by taking the retarded potential $\Psi(x)$ produced by the source $A$ in the point $B$; we have

$$\Psi(\mathbf{r}_B, t) = 0 \quad\text{for}\quad t < \frac{l}{c},$$

where $\mathbf{r}_B$ is the coordinate vector of $B$ and $l = |\mathbf{r}_B - \mathbf{r}_A|$.

Fig. 24. Penetration of waves through a moving medium

296. If we place a refracting slab $S$ between $A$ and $B$ (see Fig. 24) then we know from experience that the front reaching $B$ will arrive at some time $t_1 > t_B$, the delay being caused by the transit of the electromagnetic waves through $S$. So as to understand the process more clearly we note that the retarded action of $A$ arrives in $B$ exactly at the time $t_B = l/c$, as can be seen from the explicit expression for $\Psi(x)$. The observed delay is caused by the secondary processes arising in $S$. Indeed, when the radiation emitted by $A$ falls on $S$, then the atoms of $S$ start to oscillate and to emit secondary radiation. Thus the primary wave emerging out of $S$ will be accompanied by the secondary radiation it has excited in $S$ and, as we show presently, the secondary radiation extinguishes by interference the primary radiation for a time $\Delta t = t_1 - t_B$. Thus the radiation emitted by $A$ is felt in $B$ only at a time $t_1 = t_B + \Delta t$, when the interference between primary and secondary waves has ceased to extinguish the field.

297. So as to see the above process in more detail suppose $A$ to be far to the left, so that we can take it that a plane wave falls upon $S$. The propagation of the plane wave in $S$ can be described by

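$$\nabla^2\mathbf{A} - \frac{1}{c^2}\ddot{\mathbf{A}} = -4\pi\mathbf{i}_{\text{eff}}, \qquad \nabla^2\varphi - \frac{1}{c^2}\ddot{\varphi} = -4\pi\varrho_{\text{eff}}, \qquad \operatorname{div}\mathbf{A} + \frac{1}{c}\dot{\varphi} = 0. \tag{25}$$

Supposing $S$ to be an uncharged dielectric we can take

$$\mathbf{i}_{\text{eff}} = \operatorname{rot}\mathbf{M} + \frac{1}{c}\dot{\mathbf{P}} \tag{26}$$

and, supposing the field strength not to be too large,

$$\mathbf{P} = \kappa\mathbf{E}, \qquad \mathbf{M} = \kappa'\mathbf{B}. \tag{27}$$

If the dielectric is homogeneous we can suppose $\operatorname{grad}\kappa' = \operatorname{grad}\kappa = 0$ (inside $S$) and thus

$$\operatorname{div}\mathbf{E} = \frac{1}{\kappa}\operatorname{div}\mathbf{P} = 0 \quad\text{and}\quad \varrho_{\text{eff}} = 0.$$

We may thus write in place of (25)

$$\nabla^2\mathbf{A} - \frac{1}{c^2}\ddot{\mathbf{A}} = -\frac{4\pi\kappa}{c}\dot{\mathbf{E}} - 4\pi\kappa'\operatorname{rot}\mathbf{B}, \tag{28a}$$

$$\operatorname{div}\mathbf{A} = 0. \tag{28b}$$

Expressing $\mathbf{E}$ and $\mathbf{B}$ in terms of the potentials (see 254 (11)) we find

$$\mathbf{E} = -\frac{1}{c}\dot{\mathbf{A}}, \qquad \operatorname{rot}\mathbf{B} = -\nabla^2\mathbf{A};$$

we have in place of (28a)

$$(1 - 4\pi\kappa')\nabla^2\mathbf{A} - \frac{1 + 4\pi\kappa}{c^2}\ddot{\mathbf{A}} = 0$$

or

$$\nabla^2\mathbf{A} - \frac{1}{V^2}\ddot{\mathbf{A}} = 0 \tag{29}$$

with

$$V = \frac{c}{\sqrt{\varepsilon\mu}}, \qquad \varepsilon = 1 + 4\pi\kappa, \qquad \frac{1}{\mu} = 1 - 4\pi\kappa'.$$

We find thus that the planes of constant phase inside $S$ are propagated with a velocity $V < c$. It must be emphasized that the radiation in $S$ consists of the superimposed effect of the incident plane wave and the radiation emitted by the atoms of $S$. The sources of the latter radiation can be taken as the inner atomic currents $\mathbf{i}_{\text{eff}}$ given explicitly by (26). The front of the compound wave penetrates with a velocity $V < c$ into $S$. There is thus a region into which the primary wave (travelling with the velocity $c$) has penetrated, but into which the compound wave (travelling with the velocity $V$) has not penetrated. Inside this region the secondary waves fully extinguish the primary wave, as can be seen from solving (29). The latter equation gives the behaviour of the total radiation, i.e. that of the primary wave superimposed with the secondary radiation. Constructing the solution of Maxwell's equations for the region to the right of $S$, we find a delayed plane wave. The front of the latter wave starts when the compound wave arrives at the right-hand surface of $S$.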
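The size of the delay caused by the slab can be estimated from (29). The Python sketch below computes $V = c/\sqrt{\varepsilon\mu}$ from the polarization constants and, under the assumption that the compound front crosses a slab of thickness $d$ with the constant velocity $V$, the delay $\Delta t = d/V - d/c$. The slab thickness, the sample value of $\kappa$ and all function names are hypothetical.

```python
import math

C = 2.99792458e10  # velocity of light in cm/s (Gaussian units, as in the text)


def phase_velocity(kappa, kappa_prime=0.0, c=C):
    """V = c / sqrt(eps * mu) with eps = 1 + 4 pi kappa and 1/mu = 1 - 4 pi kappa'."""
    eps = 1.0 + 4.0 * math.pi * kappa
    inv_mu = 1.0 - 4.0 * math.pi * kappa_prime
    return c * math.sqrt(inv_mu / eps)


def front_delay(thickness, kappa, kappa_prime=0.0, c=C):
    """Delay of the front behind the vacuum arrival time for a slab of given thickness (cm)."""
    return thickness / phase_velocity(kappa, kappa_prime, c) - thickness / c


if __name__ == "__main__":
    kappa = 1.25 / (4.0 * math.pi)   # hypothetical value giving eps = 2.25
    d = 1.0                          # hypothetical slab thickness in cm
    print(f"V/c = {phase_velocity(kappa) / C:.4f}")
    print(f"delay = {front_delay(d, kappa):.3e} s")
```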

a. DISPERSION

298. The above consideration is mathematically exact but it does not give a full account of the phenomena. The velocity of propagation of a front in a refracting medium is experimentally found to be $V = c/n(\omega)$, where $n(\omega)$ is the geometrical refractive index for radiation of frequency $\omega$. The considerations obtained above are only in agreement with experiment as long as

$$n(\omega) \approx \sqrt{\varepsilon\mu}.$$

The latter relation is fulfilled for very low frequencies $\omega$ only.

299. The physical reason for the discrepancy is the incorrect assumption about the connection (27) between field and polarization. Consider e.g. the electric polarization: if we switch on a field $\mathbf{E}$ it cannot polarize atoms instantaneously, but it takes some time until it overcomes the inertia of the atomic electrons and thus polarizes the individual atoms.

Considering in a first approximation the atomic electrons to be bound by harmonic forces, we have to replace the first relation (27) by

$$\ddot{\mathbf{P}}/\omega_0^2 + \mathbf{P} = \kappa\mathbf{E}, \tag{30}$$

where $\omega_0$ is the frequency corresponding to the harmonic force binding the electrons. Relation (30) was first given by Sommerfeld. A similar expression could be given for the connection of $\mathbf{M}$ and $\mathbf{B}$; for the sake of simplicity, as we want to give here only a qualitative description of the phenomena, we consider a non-magnetic medium with $\mu = 1$ or $\kappa' = 0$.

300. Using thus the relation (30) in place of (27) we can write (supposing $\mathbf{M} = 0$)

$$\mathbf{i}_{\text{eff}} = \frac{1}{c}\dot{\mathbf{P}}$$

and we have

$$\nabla^2\mathbf{A} - \frac{1}{c^2}\ddot{\mathbf{A}} = -\frac{4\pi}{c}\dot{\mathbf{P}}. \tag{31}$$

In place of (30) we can also write

$$-\frac{1}{c}\dot{\mathbf{A}} = \frac{1}{\kappa}\left(\ddot{\mathbf{P}}/\omega_0^2 + \mathbf{P}\right)$$

and thus, differentiating (31) with respect to $t$,

$$\nabla^2\left(\ddot{\mathbf{P}}/\omega_0^2 + \mathbf{P}\right) - \frac{1}{c^2}\left(\mathbf{P}^{(\mathrm{IV})}/\omega_0^2 + \ddot{\mathbf{P}}\right) = \frac{4\pi\kappa}{c^2}\ddot{\mathbf{P}},$$

where

$$\mathbf{P}^{(\mathrm{IV})} = \frac{d^4\mathbf{P}}{dt^4}.$$

The above relation is a differential equation of the fourth order for $\mathbf{P}$. Plane wave solutions can be obtained with

$$\mathbf{P} = \mathbf{P}_0\cos(\mathbf{K}\mathbf{r} - \omega t), \qquad K^2 = \frac{\omega^2}{c^2}\left(1 + \frac{4\pi\kappa\,\omega_0^2}{\omega_0^2 - \omega^2}\right),$$

and thus we find for the velocity of propagation of phase planes

$$V(\omega) = \frac{\omega}{K} = \frac{c}{\sqrt{1 + \dfrac{4\pi\kappa\,\omega_0^2}{\omega_0^2 - \omega^2}}}.$$

We see that the effect of dispersion, at least in a qualitative form, is obtained from Maxwell's equations if we consider also the inertia of the atomic electrons. As the result of a further simple calculation one finds that in the interval $t_B \to t_1$ between the arrival of the primary wave and that of the compound wave the interference does not completely extinguish the field. Thus we expect a phenomenon which was sometimes called "Vorläufer", i.e. a wave which appears before the main front arrives. Although the Vorläufer certainly exists, we have good reasons to believe that the latter phenomenon cannot be observed experimentally; we cannot go into details here.
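The frequency dependence obtained in 300 from relation (30) can be made concrete with a short Python sketch. It assumes the plane-wave condition $n^2(\omega) = 1 + 4\pi\kappa\,\omega_0^2/(\omega_0^2 - \omega^2)$, which follows from (30) and (31) for a single undamped resonance; the parameter values and function names are hypothetical.

```python
import math

C = 2.99792458e10  # velocity of light in cm/s


def refractive_index_squared(omega, kappa, omega0):
    """n^2(omega) for harmonically bound electrons with resonance frequency omega0."""
    if omega == omega0:
        raise ValueError("the undamped model diverges at omega == omega0")
    return 1.0 + 4.0 * math.pi * kappa * omega0**2 / (omega0**2 - omega**2)


def phase_velocity(omega, kappa, omega0, c=C):
    """V(omega) = c / n(omega); only defined where n^2 > 0."""
    n2 = refractive_index_squared(omega, kappa, omega0)
    if n2 <= 0.0:
        raise ValueError("no propagating plane wave: n^2 <= 0")
    return c / math.sqrt(n2)


if __name__ == "__main__":
    kappa = 1.25 / (4.0 * math.pi)  # hypothetical static value, eps = 2.25
    omega0 = 1.0                    # frequencies in units of the resonance
    for omega in (0.0, 0.25, 0.5, 0.75, 0.9):
        n = math.sqrt(refractive_index_squared(omega, kappa, omega0))
        print(f"omega/omega0 = {omega:4.2f}  n = {n:6.3f}  V/c = {1.0 / n:6.3f}")
```

For $\omega \to 0$ the sketch reproduces the static value $\sqrt{\varepsilon}$ of 297, in line with the remark in 298 that the static relation holds for very low frequencies only.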

7. THE EXPERIMENT OF FIZEAU

301. According to Maxwell's theory we can obtain the phase velocity $V(\omega) = c/n(\omega)$ of an electromagnetic wave passing through a medium. Consider thus a slab $S$ through which a wave of frequency $\omega$ is propagated with a velocity $V(\omega)$. According to the Lorentz principle we may conclude that, using a slab $S^*$ moving with a velocity $v$ relative to $S$, a wave with frequency $\omega^*$ will be propagated with a velocity

$$V^*(\omega^*) = \frac{v + V(\omega)}{1 + \dfrac{vV(\omega)}{c^2}};$$

neglecting terms of high order in $v/c$ we find

$$V^*(\omega^*) = V(\omega) + v\left(1 - \frac{1}{n^2(\omega)}\right) \quad\text{and}\quad \omega^* = \omega\left(1 + \frac{v\,n(\omega)}{c}\right).$$

The velocity of propagation of a wave of frequency $\omega$ in a medium moving with a velocity $v$ is therefore

$$V^*(\omega) = V(\bar{\omega}) + v\left(1 - \frac{1}{n^2(\bar{\omega})}\right), \qquad \bar{\omega} = \omega\left(1 - \frac{v\,n(\omega)}{c}\right). \tag{32}$$

Relation (32) follows from the assumption that both the electromagnetic fields and their interactions with the atoms of $S$ obey the Lorentz principle. As there exists some misunderstanding about the significance of (32) (see chapt. VI 208) we show that (32) can be obtained from Maxwell's theory also. Furthermore we point out that (32) describes an effect of the order of $v/c$ and therefore this effect is not an essentially relativistic effect.

302. The atoms of $S^*$ are moving with a velocity $\mathbf{v}$; therefore the atoms behave as if the field acting upon them had the components

$$\mathbf{E}_{\text{eff}} = \mathbf{E} + \frac{1}{c}(\mathbf{v}\times\mathbf{B}), \qquad \mathbf{B}_{\text{eff}} = \mathbf{B} - \frac{1}{c}(\mathbf{v}\times\mathbf{E}). \tag{33}$$

Taking the constants of polarization $\kappa$ and $\kappa'$ we have thus for the induced polarization

$$\mathbf{P}_0 = \kappa\mathbf{E}_{\text{eff}}, \qquad \mathbf{M}_0 = \kappa'\mathbf{B}_{\text{eff}}. \tag{34}$$

We denote with the suffix "0" that $\mathbf{P}_0$ and $\mathbf{M}_0$ refer to the polarization of the moving atoms. From the result of 283 we find that the atoms moving with a velocity $\mathbf{v}$ and showing polarization according to (34) are equivalent to matter at rest with polarization

$$\mathbf{P} = \mathbf{P}_0 + \frac{1}{c}(\mathbf{v}\times\mathbf{M}_0), \qquad \mathbf{M} = \mathbf{M}_0 - \frac{1}{c}(\mathbf{v}\times\mathbf{P}_0). \tag{35}$$

With the help of (33), (34) and (35) we find thus

$$\mathbf{P} = \kappa\mathbf{E} + \frac{\kappa + \kappa'}{c}(\mathbf{v}\times\mathbf{B}) - \frac{\kappa'}{c^2}\,\mathbf{v}\times(\mathbf{v}\times\mathbf{E}),$$

$$\mathbf{M} = \kappa'\mathbf{B} - \frac{\kappa + \kappa'}{c}(\mathbf{v}\times\mathbf{E}) - \frac{\kappa}{c^2}\,\mathbf{v}\times(\mathbf{v}\times\mathbf{B}).$$

We may express $\mathbf{E}$ and $\mathbf{B}$ in terms of the vector potential $\mathbf{A}$ and thus we can express

$$\mathbf{i}_{\text{eff}} = \operatorname{rot}\mathbf{M} + \frac{1}{c}\dot{\mathbf{P}}$$

also in terms of $\mathbf{A}$ and its derivatives. Using this expression the wave equation

$$\nabla^2\mathbf{A} - \frac{1}{c^2}\ddot{\mathbf{A}} = -4\pi\mathbf{i}_{\text{eff}} \tag{36}$$

appears as an equation containing $\mathbf{A}$ only. Supposing $\mathbf{A}$ to represent a plane wave, i.e.

$$\mathbf{A} = \mathbf{A}_0\cos(\mathbf{K}\mathbf{r} - \omega t), \tag{37}$$

we obtain from (36) and (37) an amplitude relation. Supposing $\mathbf{K}$ to be parallel to $\mathbf{v}$ we find in particular

$$V^2 - 2\left(1 - \frac{1}{n^2}\right)vV - V_0^2 = 0, \tag{38}$$

where $V = \omega/K$ is the velocity of propagation of the wave in the moving medium and

$$V_0 = \frac{c}{n}, \qquad n^2 = \varepsilon\mu;$$

neglecting terms of the order of $v^2/c^2$ we obtain from (38)

$$V = \frac{c}{n} + \left(1 - \frac{1}{n^2}\right)v \tag{39}$$

in agreement with the result obtained further above. We see thus that the velocity of propagation of planes of constant phase of an electromagnetic wave in a moving dielectric can be calculated from Maxwell's equations. Doing so one has to take into consideration that the incident wave acts upon moving atoms, and that the secondary radiation which interferes with the primary wave is emitted by moving atoms. The above considerations could easily be extended by taking into consideration the effects of the inertia of the electrons, as was done in 299. As the result of such a calculation the effect of dispersion enters into the result, and we are led in place of (37) to the equation (32) in 257. We see thus that relation (39), giving the velocity of propagation of electromagnetic waves in a moving medium, can be obtained from purely electromagnetic considerations without using explicitly the Lorentz principle.
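The agreement between the first-order result (39) and the relativistic composition of velocities used in 301 can be checked numerically. The Python sketch below does so for water flowing at a few metres per second; the refractive index, the flow speed and the function names are assumptions of this example, and the dispersion term of (32) is left out.

```python
C = 299_792_458.0  # velocity of light in m/s


def velocity_first_order(n, v, c=C):
    """Eq. (39): V = c/n + (1 - 1/n^2) v, neglecting terms of order v^2/c^2."""
    return c / n + (1.0 - 1.0 / n**2) * v


def velocity_exact(n, v, c=C):
    """Composition of V0 = c/n with the velocity v of the medium, as in 301."""
    v0 = c / n
    return (v0 + v) / (1.0 + v0 * v / c**2)


if __name__ == "__main__":
    n_water = 1.333   # assumed refractive index of water
    v_flow = 7.0      # assumed flow speed in m/s
    drag = velocity_first_order(n_water, v_flow) - C / n_water
    print(f"dragging term (1 - 1/n^2) v = {drag:.4f} m/s")
    diff = velocity_exact(n_water, v_flow) - velocity_first_order(n_water, v_flow)
    print(f"exact minus first order     = {diff:.2e} m/s")
```

The difference between the two expressions is of the order $v^2/c$, many orders of magnitude below the dragging term itself, which illustrates why the effect is described in the text as one of first order in $v/c$.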

D. EFFECTS OF THE SECOND ORDER

1. ACTION OF A CHARGE UPON ITSELF

303. The force with which a moving electric charge acts upon itself can be worked out with the help of Maxwell's equations. Let us consider a charge $e$ at rest at $t = 0$; the charge distribution may be given by $\varrho(\mathbf{r})$ so that

$$\int\varrho(\mathbf{r})\,d^3r = e \quad\text{and}\quad \varrho(\mathbf{r}) = 0 \quad\text{for}\quad r > a;$$

thus the charge is contained in a sphere with radius $a$. The electric field $\mathbf{E}(\mathbf{r})$ can be worked out and the force upon itself can be taken as

$$\mathbf{F} = \int\varrho(\mathbf{r})\,\mathbf{E}(\mathbf{r})\,d^3r. \tag{40}$$

If the charge is at rest the latter vanishes, as can be seen from symmetry.

304. We calculate the field of the particle if it is accelerated so that its velocity is zero at $t = 0$. We can thus write for the velocity and displacement of the particle

$$\mathbf{v}(t) = \dot{\mathbf{v}}t, \qquad \mathbf{r}(t) = \tfrac{1}{2}\dot{\mathbf{v}}t^2.$$

If we take the particle to move like a rigid body, then the charge and current distribution at the time $t$ can be taken as

$$\mathbf{i}(\mathbf{r},t) = \dot{\mathbf{v}}t\,\varrho(\mathbf{r},t), \qquad \varrho(\mathbf{r},t) = \varrho\!\left(\mathbf{r} - \tfrac{1}{2}\dot{\mathbf{v}}t^2\right).$$

The vector potential can thus be written according to eq. (17) in 257

$$\mathbf{A}(\mathbf{r},t) = \frac{\dot{\mathbf{v}}}{c}\int\frac{t'}{R}\,\varrho\!\left(\mathbf{r} + \mathbf{R} - \tfrac{1}{2}\dot{\mathbf{v}}t'^2\right)d^3R,$$

where we have put

$$t' = t - \frac{R}{c}.$$

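Differentiating with respect to $t$ and neglecting terms of the order of $1/c^3$ or smaller, we find

$$-\frac{1}{c}\dot{\mathbf{A}}(\mathbf{r},0) = -\frac{\dot{\mathbf{v}}}{c^2}\int\frac{\varrho(\mathbf{r}+\mathbf{R})}{R}\,d^3R;$$

similarly we have

$$\varphi(\mathbf{r},0) = \int\frac{\varrho\!\left(\mathbf{r} + \mathbf{R} - \tfrac{1}{2}\dot{\mathbf{v}}R^2/c^2\right)}{R}\,d^3R$$

and, neglecting again higher order terms,

$$-\operatorname{grad}\varphi(\mathbf{r},0) = -\int\frac{\operatorname{grad}\varrho\!\left(\mathbf{r} + \mathbf{R} - \tfrac{1}{2}\dot{\mathbf{v}}R^2/c^2\right)}{R}\,d^3R. \tag{41}$$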

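Developing the integrand in powers of $R$ we can take

$$\operatorname{grad}\varrho\!\left(\mathbf{r} + \mathbf{R} - \tfrac{1}{2}\dot{\mathbf{v}}R^2/c^2\right) = \operatorname{grad}\varrho(\mathbf{r}+\mathbf{R}) - \frac{1}{2}\frac{R^2}{c^2}\left(\dot{\mathbf{v}}\frac{\partial}{\partial\mathbf{R}}\right)\operatorname{grad}\varrho(\mathbf{r}+\mathbf{R}) + \text{smaller terms}. \tag{42}$$

Introducing (42) into (41) and integrating by parts twice we find, neglecting the small terms,

$$\operatorname{grad}\varphi(\mathbf{r},0) = \operatorname{grad}\varphi_0(\mathbf{r}) - \frac{1}{2c^2}\int\left[\frac{\dot{\mathbf{v}}}{R} - \frac{\mathbf{R}(\mathbf{R}\dot{\mathbf{v}})}{R^3}\right]\varrho(\mathbf{r}+\mathbf{R})\,d^3R.$$

The field of the accelerated particle can thus be written

$$\mathbf{E}(\mathbf{r}) = \mathbf{E}_0(\mathbf{r}) - \frac{1}{2c^2}\int\left[\frac{\dot{\mathbf{v}}}{R} + \frac{\mathbf{R}(\mathbf{R}\dot{\mathbf{v}})}{R^3}\right]\varrho(\mathbf{r}+\mathbf{R})\,d^3R,$$

where $\mathbf{E}_0(\mathbf{r}) = -\operatorname{grad}\varphi_0(\mathbf{r})$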

is the field of the particle at rest.

305. Integrating over the whole of the accelerated particle, considering (40), we find

$$\mathbf{F}^{(s)} = \int\varrho(\mathbf{r})\,\mathbf{E}(\mathbf{r})\,d^3r = -\frac{1}{c^2}\,\mathsf{U}\dot{\mathbf{v}}$$

with

$$\mathsf{U} = \frac{1}{2}\iint\left(\mathsf{1} + \frac{\mathbf{R}\circ\mathbf{R}}{R^2}\right)\frac{\varrho(\mathbf{r})\,\varrho(\mathbf{r}+\mathbf{R})}{R}\,d^3r\,d^3R. \tag{43}$$

$\mathsf{U}$ is a matrix the components of which have the dimensions of energy.

306. As can easily be seen, the orders of magnitude of the elements of $\mathsf{U}$ are comparable with

$$U_0 = \frac{1}{2}\iint\frac{\varrho(\mathbf{r})\,\varrho(\mathbf{r}+\mathbf{R})}{R}\,d^3r\,d^3R,$$

where $U_0$ is the electrostatic energy of the charge. The equation of motion of the charge can be written

$$m\dot{\mathbf{v}} = \mathbf{F}^{(\text{out})} + \mathbf{F}^{(s)}, \tag{44}$$

where $\mathbf{F}^{(\text{out})}$ is the outside force and $\mathbf{F}^{(s)}$ the self force with which the charge acts upon itself. Introducing (43) into (44) we find also

$$\left(m\mathsf{1} + \frac{\mathsf{U}}{c^2}\right)\dot{\mathbf{v}} = \mathbf{F}^{(\text{out})}. \tag{45}$$

Thus we see that the outside force accelerates the charged particle less than the corresponding neutral one. The particle behaves thus as if its mass had increased as the result of the charge. Since $\mathsf{U}$ is a tensor, the excess mass of the charged particle appears to depend on direction. If we take as an order of magnitude relation

$$\mathsf{U} \sim \mathsf{1}\,U_0$$

we find that the effective mass of the charged particle is about

$$m_{\text{eff}} \approx m + \frac{U_0}{c^2}.$$

Writing $m_{\text{eff}} = m + \Delta m$ and taking a uniformly charged sphere, one finds that the apparent increase in the mass caused by the electromagnetic reaction is 2/3 of that corresponding to the relativistic increase in mass expected from the Lorentz principle.

2. MASS DEFECT

307. Considering a pair of opposite point charges $+e$ and $-e$ at a distance $R$ from each other, one finds

$$\Delta m = -\frac{e^2}{Rc^2}\left(1 + \cos^2\vartheta\right).$$

We see thus that the attraction between the charges causes a decrease in the apparent mass. Indeed, accelerating a pair of particles with opposite charges, we find that the field of each of the particles acts upon the other, exerting a force in the direction of the outside force. Thus the inner forces increase the acceleration and the system as a whole can be accelerated more easily than the system of the unconnected particles. The latter effect is connected with the mass defect observed in a system of particles bound together.

308. From the considerations of 237 one is led to expect that a closed system, e.g. a particle, has a mass $m = E/c^2$, where $E$ is its total energy. If $E$ is changed in any way the mass is expected also to change; e.g. if we charge a neutral system and supply an electrostatic energy $U_0$ to the system, the mass is expected to change by

$$\Delta m = \frac{U_0}{c^2}. \tag{46}$$

The calculation of the force with which an electrically charged system acts upon itself gives an apparent mass change which has the same order of magnitude as that expected from (46) but differs from this value to some extent. It is believed that real particles behave in accord with the relativistic formulae. The reason for the discrepancy is the following. Charging a system electrically we produce a stress which has to balance the Coulomb forces with which the charge acts upon itself. The stress is caused by the interaction of the atoms; the latter interaction may be expected also to be of retarded nature and thus the interatomic forces give rise also to a self force. If the system as a whole obeys the Lorentz principle, then it is to be expected that the reacting forces caused by stress and the electromagnetic self force produce together a mass increase which corresponds to the relativistic mass increase.
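To give an idea of the magnitudes involved in (46), the following Python sketch evaluates the electrostatic energy of a uniformly charged sphere and the corresponding mass $U_0/c^2$. It uses SI units and the standard result $U_0 = \tfrac{3}{5}\,q^2/(4\pi\varepsilon_0 a)$ for a uniform volume charge, neither of which is taken from the text; the choice of the classical electron radius as sample radius is purely illustrative.

```python
import math

E_CHARGE = 1.602176634e-19     # elementary charge in C
EPS0 = 8.8541878128e-12        # vacuum permittivity in F/m
C = 299_792_458.0              # velocity of light in m/s
M_ELECTRON = 9.1093837015e-31  # electron mass in kg, used only for comparison


def electrostatic_energy_uniform_sphere(q, a):
    """U0 = (3/5) q^2 / (4 pi eps0 a) for a uniformly charged sphere of radius a."""
    return 0.6 * q**2 / (4.0 * math.pi * EPS0 * a)


def mass_equivalent(energy):
    """Mass change expected from eq. (46): delta m = U0 / c^2."""
    return energy / C**2


if __name__ == "__main__":
    a = 2.8179403e-15  # sample radius in m (classical electron radius, an assumption)
    u0 = electrostatic_energy_uniform_sphere(E_CHARGE, a)
    dm = mass_equivalent(u0)
    print(f"U0 = {u0:.3e} J, delta m = {dm:.3e} kg = {dm / M_ELECTRON:.3f} electron masses")
```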

E. RELATIVISTIC MECHANICS OF A CONTINUUM

309. In a closed physical system we expect momentum and also energy to be conserved and therefore we have

$$\operatorname{Div}T^{(m)} = 0; \tag{47}$$

the above relation gives four equations for the ten components of $T^{(m)}$; it gives therefore only a necessary condition for the motion of the medium to which $T^{(m)}$ refers. We have to add to (47) the equations of motion of the medium, so as to be able to determine its motion from initial conditions.

310. In a configuration where

$$\operatorname{Div}T^{(m)} = F \neq 0 \tag{48}$$

we have a system in which energy and momentum are not conserved. Such a case may arise if the system under consideration is under the influence of an outer force. The most important cases are those where a closed physical system is under the influence of an electromagnetic field. We can write $T^{(el)}$ for the energy momentum tensor of the latter and, as was shown in 252, $\operatorname{Div}T^{(el)}$ gives the rate of energy and momentum transferred by the electromagnetic field to matter. In a mechanical system which is under the influence of an electromagnetic field we can suppose that, on account of the interaction, the total energy and momentum are conserved. Thus we may suppose

$$\operatorname{Div}\left(T^{(m)} + T^{(el)}\right) = 0; \tag{49}$$

the latter equation describes the circumstance that the density of force with which the electromagnetic field acts upon the material system provides exactly the force density $-F$ which according to (48) is necessary to balance the force density $+F$ which is produced by the mechanical stresses inside the system.

1. INTERPRETATION OF THE TROUTON-NOBLE EXPERIMENT

311. The field of a charged condenser can be worked out and in terms of the field strengths the ponderomotive force equals

$$F^{(el)} = \operatorname{Div}T^{(el)};$$

the moment of force acting upon the condenser can be expressed with the help of the density

$$M^{(el)}_{kl} = x_kF^{(el)}_l - x_lF^{(el)}_k;$$

the total moment of force can be expressed by the antisymmetric tensor

$$m^{(el)}_{kl} = \int M^{(el)}_{kl}\,d^3x, \qquad k, l = 1, 2, 3.$$

Considering a charged condenser at rest, we find

$$m^{(el)}_{kl} = 0, \qquad k, l = 1, 2, 3.$$

Transforming the four-tensor $M^{(el)}$ we obtain the tensor $M^{(el)*}$ describing the state of the deformed system. In general the space components of the latter will not vanish, even if those of the former did vanish, i.e. we have

$$m^{(el)*}_{kl} \neq 0.$$

Thus the electromagnetic forces acting between the plates of the condenser produce a non-vanishing moment of force if the condenser is in a state of translational motion. The system consisting of two charged condenser plates is, however, not a closed system. To maintain the system, there is need for a mechanical support keeping the plates apart from each other and for internal forces which counterbalance the Coulomb repulsion of the charge upon the plates. The total system including electromagnetic forces and mechanical stresses can be described by an energy momentum tensor $T = T^{(el)} + T^{(m)}$, and since the latter system is in equilibrium we have

$$\operatorname{Div}T = 0 \quad\text{thus}\quad M = 0.$$

From the above relation it follows that

$$m_{kl} = m^{(el)}_{kl} + m^{(m)}_{kl} = 0,$$

and also

$$m^*_{kl} = m^{(el)*}_{kl} + m^{(m)*}_{kl} = 0.$$

Thus in a closed physical system, where the electric forces are balanced by mechanical forces, the moment of force vanishes in any representation and therefore it vanishes for the condenser at rest just as for the condenser moving with a constant velocity.

F. TRANSIENT PHENOMENA

312. We have seen that Maxwell's equations are consistent with the Lorentz principle in the following sense. If a field $\mathfrak{F}$ can be taken to result from a current distribution $\mathfrak{J}$, then the Lorentz deformed field

$$\mathfrak{F}^* = \mathfrak{L}(\mathfrak{F})$$

can be taken to result from the Lorentz deformed current distribution

$$\mathfrak{J}^* = \mathfrak{L}(\mathfrak{J}).$$

The dynamical part of the Lorentz principle asserts that as the result of an adiabatic interference a physical system $\mathfrak{O}$ changes into a deformed system with configuration

$$\mathfrak{O}^* = \mathfrak{L}(\mathfrak{O}).$$

Applying the latter result to an electromagnetic field, one expects adiabatic deformations of the type

$$\mathfrak{F},\ \mathfrak{J} \to \mathfrak{F}^*,\ \mathfrak{J}^*. \tag{50}$$

313. So as to analyse what is meant by a deformation of the type (50) we note that, having a field only, we cannot apply any interference to the field itself, but can only interfere with the sources of the field; i.e. having a source distribution $\mathfrak{J}$ we can interfere with the material, the carrier of charges and currents, and make it change into $\mathfrak{J}^*$. A simple example is e.g. a charged particle $P$ which is at rest relative to a system of reference $K$; we may apply outside forces upon $P$ and make it move with a velocity $\mathbf{v}$ relative to $K$. The field of the point charge when it is moving with the final velocity $\mathbf{v}$ is given by (64) in 278. Using the retarded potential solutions of Maxwell's equations we can directly calculate the field of the particle. Suppose the particle is at rest for $t < 0$, the acceleration starts at $t = 0$ and at a time $t = t_1$ the particle reaches its final velocity $\mathbf{v}$. We may thus suppose that the velocity of the particle is given by

$$\mathbf{v}(t) = \begin{cases}0 & t < 0,\\ \mathbf{v} & t > t_1.\end{cases}$$

The coordinate vector of the particle is thus

$$\mathbf{r}(t) = \int_0^t\mathbf{v}(t')\,dt' \quad\text{for}\quad 0 < t < t_1$$

and

$$\mathbf{r}(t) = \mathbf{r}_1 + \mathbf{v}t \quad\text{for}\quad t > t_1, \qquad \mathbf{r}_1 = \mathbf{r}(t_1) - \mathbf{v}t_1.$$

Calculating the field of the moving particle at some time $t > t_1$ we find three zones in the field (see Fig. 25).

1) There exists a region I surrounding the charge at a time $t > t_1$:

$$|\mathbf{r}(t_1) - \mathbf{r}| < c(t - t_1) \quad\text{for } \mathbf{r} \text{ inside region I}.$$

Inside region I we find, calculating the retarded potentials of the moving charge, that the retarded times are all $t' > t_1$; therefore the action of the charge is only felt from positions where the particle moved already with its final velocity $\mathbf{v}$. The field in I is thus the deformed field as given in eq. (64) of 278.

Fig. 25. Scheme of regions I, II and III

2) A region II surrounds region I at a time $t > t_1$; we have

$$|\mathbf{r}| < ct, \qquad c(t - t_1) < |\mathbf{r}(t_1) - \mathbf{r}| \quad\text{for } \mathbf{r} \text{ inside region II}.$$

Inside region II the field is that which was produced by the particle in the period $0 < t' < t_1$; thus II contains the radiation field travelling outward which was emitted by the particle while accelerated.

3) Outside region II we find region III, which for $t > t_1$ is defined by

$$ct < |\mathbf{r}| \quad\text{for } \mathbf{r} \text{ inside region III}.$$

The field in region III is that produced by the particle at times $t < 0$, i.e. before the onset of the acceleration. The field inside III is therefore that of the charge at rest. The inner boundary of III expands with a radial velocity $c$; thus it recedes with the velocity of light.

314. The receding of regions II and III can also be interpreted in another way. The electromagnetic field surrounding a particle which contains currents and charges can be considered as a kind of material continuation of the particle. If we now start to interfere with the sources of the field, then the latter will carry with themselves at first only the field in their immediate vicinity, i.e. inside region I. The distant parts of the field, i.e. those inside III, are at first not affected by the interference. In the region II placed in between, we find electromagnetic waves moving from the boundary of I towards that of III, and these waves deform the outside field from $\mathfrak{F}$ into $\mathfrak{F}^*$. If we compare the field surrounding the particle with an elastic medium, then we can compare the process of acceleration with the propagation of an elastic disturbance from the centre of the disturbance to distant parts. Indeed, if we were to accelerate an elastic body by means of outside forces acting only upon a region in the centre, then we would find that the outside forces set into motion at first only the centre part of the body and that its outside remained for a time unaffected. The elastic disturbance gradually extends to the whole of the body and sets into motion the outside parts. The electromagnetic field of a particle behaves exactly in this manner, the velocity of propagation of the disturbance being equal to $c$. The above considerations describe thus the mode of relaxation into the Lorentz deformed configuration, shown by a physical system which contains charges, currents and fields.
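The three zones of 313 can be expressed as a simple classification of field points. The Python sketch below assumes, as in the text, that the particle rests at the origin for $t < 0$ and has reached the position $\mathbf{r}(t_1)$ at the end of the acceleration; units with $c = 1$, the sample numbers and the function name are assumptions of this example.

```python
import math


def classify_region(r, t, t1, r_at_t1, c=1.0):
    """Return 'I', 'II' or 'III' for the field point r at a time t > t1.

    r and r_at_t1 are 3-component sequences; the particle is assumed to have
    been at rest at the origin before the acceleration started at t = 0.
    """
    if t <= t1:
        raise ValueError("the classification is only defined for t > t1")
    if math.dist(r, r_at_t1) < c * (t - t1):
        return "I"    # only signals emitted after t1 arrive: deformed field
    if math.hypot(*r) < c * t:
        return "II"   # radiation emitted during the acceleration period
    return "III"      # field of the charge at rest, not yet disturbed


if __name__ == "__main__":
    t1, t = 1.0, 3.0
    r_at_t1 = (0.25, 0.0, 0.0)   # hypothetical position at the end of the acceleration
    for x in (0.5, 2.5, 4.0):
        print(x, classify_region((x, 0.0, 0.0), t, t1, r_at_t1))
```

With these sample values the three points fall into regions I, II and III in turn, and increasing $t$ moves both boundaries outward with the velocity $c$.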

CHAPTER X

THEORY OF GRAVITATION

315. In this part of the book we shall deal with the problem of the general theory of relativity from the same point of view from which we have dealt with the problems of the special theory of relativity. It seems to us that the general theory of relativity can be taken to deal with the effects of gravitation upon physical processes. Our view that the general theory of relativity gives in fact a theory of gravitation resembles to some extent the views expressed by V. A. Fock on this subject.*

* V. A. Fock: Teoriya prostranstva, vremeni i tyagoteniya, Fizmatgiz, Moscow, 1961.

A. OBSERVATIONAL FACTS

316. The gravitational effects which are successfully interpreted in terms of the general theory of relativity can be summarized briefly as follows.

1) The deflexion of a beam of light when passing a gravitating mass. This effect can be described phenomenologically by supposing that the beam of light behaves as if it consisted of a stream of particles subject to the gravitational attraction. From considerations somewhat similar to this the possibility of the deflexion of light in a gravitational field was foreseen in about 1800. These considerations seem to have been forgotten later. The deflexion of light in the vicinity of a gravitating body could also be accounted for by supposing that the gravitational field surrounding a gravitating body acts like a refracting medium. Thus the gravitational field e.g. of the Sun can be taken to act as a lens surrounding the Sun. The image of the sky viewed through this lens appears magnified and therefore the stars which appear on the sky in a direction near to that of the Sun seem to shift away from the Sun whenever the Sun approaches them.

2) The gravitational red-shift of spectral lines. Atoms change their frequencies when they approach a gravitating centre. The effect was observed when measuring the frequency of atoms in the solar and in stellar atmospheres. This effect could also be shown under laboratory conditions with the help of the Mössbauer effect.* It was found that the frequency of certain γ-emitters increases when they are shifted upwards. The effect is caused by the change of the gravitational potential of the Earth with height.

3) The anomalous perihelion motion of the planet Mercury. The curved motions of masses in a gravitational field are themselves gravitational effects. However, the laws of Newton, which give the form of this motion in a first approximation, are usually taken as classical laws and only the deviations from Newton's laws are taken to be relativistic effects. The orbits of the planets around the Sun can be described with great precision in terms of Newton's laws. The orbits of the planets are in a first approximation Kepler ellipses, but because of the mutual perturbations these ellipses show slow precessions. The orbit of Mercury deviates slightly from that predicted by Newton's theory. The motion of the perihelion of its orbit differs by about 0.4" per year from the amount calculated from the perturbations. This latter anomaly could be accounted for in terms of the general theory of relativity.

4) An important cosmological result of the general theory of relativity was the prediction of the recession of the distant extragalactic systems by Friedman,** which was observed afterwards by Hubble.***

5) Apart from the observed effects described above, the general theory of relativity makes use of the fact of the equivalence of gravitational and inertial mass. This equivalence has been proved experimentally to a very high accuracy.****

* R. L. Mössbauer: ZS. f. Phys., 151, 124, 1958; R. V. Pound and G. A. Rebka, Jr.: Phys. Rev. Letters, 4, 337, 1960.
** A. Friedman: ZS. f. Phys., 10, 377, 1922.
*** E. Hubble: Astrophys. J., 74, 43, 1931.
**** R. v. Eötvös: Ann. d. Phys., 59, 354, 1896; R. v. Eötvös, D. Pekár and E. Fekete: Ann. d. Phys., 68, 11, 1922; R. H. Dicke: Experimental Relativity (Relativity, Groups and Topology), Ed. De Witt and De Witt, p. 173, Blackie and Son Ltd., 1964.

B. STATEMENT OF THE PROBLEM OF THE THEORY OF GRAVITATION

317. The curved motion of a body in a gravitational field is a very apparent gravitational effect. The deflexion of light in a gravitational field is another gravitational effect, which is, however, small and can be observed under favourable experimental conditions only.

The task of a theory of gravitation is firstly to find the laws of motion of bodies under gravitational action; the latter laws must give in a good approximation those formulated by Newton. Secondly, one has to find the forms of other natural laws valid in regions where gravitational action is not negligible. These laws can be found by extrapolating the formulation of the laws valid for regions where gravitational actions can be neglected. For example, the laws of the propagation of electromagnetic waves can be described by Maxwell's equations in regions where gravitational effects can be neglected. In regions where gravitational effects are noticeable, the propagation of electromagnetic waves can be supposed to be expressed by equations similar in form to those of Maxwell but such that the parameters characterizing the distribution of the gravitational field occur explicitly in the modified equations. Without giving the exact form of Maxwell's equations which are found to be valid in a gravitational field, we note that one might consider e.g. as a possible form of the wave equation the relation

* -^b) = ' 2A

A

0

(1)

where c(r, t) is the velocity of light at a point with coordinate vector r at the time t. In a region where gravitational effects can be neglected we have of course c(r, t) = c = constant, and relation (1) becomes identical with the wave equation derived from Maxwell's equations in their original form. Similarly, by introducing into the Schrödinger equation the parameters characterizing the gravitational field one can get the laws giving the effects of gravitational fields upon atoms. In particular the frequencies of oscillations of atoms are found to depend on the gravitational potential; therefore the Schrödinger equation adapted to regions containing gravitational fields must lead to frequencies depending on the parameters of the gravitational field.

318. From a purely mathematical point of view Maxwell's equations, the Schrödinger equation and other physical laws could be generalized for regions containing gravitational fields in a large number of ways. In the following sections we shall give those generalizations which follow from the general theory of relativity. Sometimes it is suggested that the generalizations of the laws of nature which lead to the forms of the laws in gravitational fields could be obtained from a priori considerations. According to such views the laws thus obtained are logically more or less the only possible ones, and in a paradoxical formulation one is led to the assumption that there exist no other possibilities and that "nature has to obey the laws deduced a priori".

Such considerations are at fault; we shall show in the following that the relativistic laws are based on well-defined physical hypotheses concerning the structure of matter and gravitation. It is a question of facts as to what extent these hypotheses give a correct description of real nature.

1. MATHEMATICAL FORMULATION OF THE PROBLEM

319. In the special theory of relativity only such regions are considered in which light is propagated homogeneously. The laws governing the motion of physical systems inside such regions obey symmetries which can be expressed by the Lorentz principle. In reality light can nowhere be assumed to be propagated strictly homogeneously, as we have reason to believe that the propagation of light is affected by gravitation and regions entirely free of gravitation do not exist. The Lorentz principle can therefore be taken to be valid only to such an approximation as gravitational effects can be neglected. The question arises how the Lorentz principle should be generalized so as to apply to regions containing non-negligible gravitational fields.

2. EXPERIMENTAL CRITERIA FOR HOMOGENEOUS REGIONS

320. To be able to generalize the Lorentz principle we have, as a first step, to investigate how it is possible to decide experimentally whether a region is or is not homogeneous. This question is by no means trivial, and it can be approached in the way we discuss presently. Consider in a region $\mathfrak{R}$ a number of points $\mathfrak{P}_0, \ldots, \mathfrak{P}_N$ with clocks $\mathfrak{C}_0, \ldots, \mathfrak{C}_N$ situated near the points. Suppose

(a) that $\mathfrak{R}$ is homogeneous,
(b) that the points $\mathfrak{P}_k$ do not move relative to each other,
(c) that the points $\mathfrak{P}_k$ are either at rest or have at most translational motions relative to the ether,
(d) that the standard clock $\mathfrak{C}_0$ has a uniform rhythm.

If (a)-(d) hold, then we can find a representation $K$ which assigns coordinate vectors $\mathbf{r}_\nu$ to the points and time measures $t_\nu$ to events $\mathfrak{E}_\nu$ occurring in the points, such that the propagation of light as represented relative to $K$ is defined by a propagation tensor

$$K(\mathfrak{g}) = \mathbf{g} = \text{independent of } \mathbf{x} = \mathbf{r}, t. \tag{2}$$

We note that, provided a straight representation $K$ satisfying (2) can be found, there exist also other representations $K'$ with $K'(\mathfrak{g}) = \mathbf{g}'$ = independent of $\mathbf{x}' = \mathbf{r}', t'$, with arbitrary values of the propagation matrix $\mathbf{g}'$.

321. In a straight representation the propagation of a signal of light obeys the relation

$$(\mathbf{x}_2 - \mathbf{x}_1)\,\mathbf{g}\,(\mathbf{x}_2 - \mathbf{x}_1) = 0 \tag{3}$$

where $\mathbf{x}_1$ and $\mathbf{x}_2$ are the four-coordinates of events $\mathfrak{E}_1$, $\mathfrak{E}_2$, i.e. of the departure of the signal from some point $\mathfrak{P}_1$ at a time $t_1$ and its arrival in another point $\mathfrak{P}_2$ at a time $t_2$. Supposing that signals are propagated along straight lines we can describe the orbit of a signal of light in parametric representation by

$$\mathbf{x}(p) = \boldsymbol{\kappa} p + \mathbf{a},$$

where the vector $\boldsymbol{\kappa}$ has constant components and obeys the relation $\boldsymbol{\kappa}\,\mathbf{g}\,\boldsymbol{\kappa} = 0$, and $\mathbf{a}$ is an arbitrary vector with constant components.

322. If the attempt to determine a straight representation $K$ of the region $\mathfrak{R}$ containing the points $\mathfrak{P}_k$ and clocks $\mathfrak{C}_k$ is successful, then we may suppose that the conditions (a)-(d) of 320 are indeed fulfilled. A system of reference $K$ can be constructed using the methods described in 145, chapt. IV. As was explained there, coordinate measures $\mathbf{r}_\nu$ and time measures $t_\nu$ can be obtained making use of the exchange of light signals and interpreting the observational data with the help of relations of the form (3). However, the measures $\mathbf{r}_\nu$ and $t_\nu$ are obtained as the solutions of a strongly overdetermined system of equations; we can therefore conclude that the region in which we observe the signals is indeed homogeneous provided the overdetermined system admits of solutions. We thus obtain an internal check of whether the propagation of light in $\mathfrak{R}$ is homogeneous. However, considering things more strictly, when using the method described above we check whether or not all the conditions (a)-(d) given in 320 are fulfilled. Physically it is mainly of interest whether or not (a) is fulfilled, that is, whether or not the propagation in $\mathfrak{R}$ is indeed homogeneous. The conditions (b)-(d) bear only upon the question of how the points $\mathfrak{P}_k$ move and how the clocks $\mathfrak{C}_k$ are adjusted. The latter questions are of practical importance but are not significant regarding the question of the real mode of propagation of light in $\mathfrak{R}$.

a. AN EXAMPLE

323. To give a practical example we note that light in the vicinity of the Earth is propagated homogeneously in a very good approximation. Nevertheless, if we take the points $\mathfrak{P}_k$ as points fixed relative to the solid Earth we cannot synchronize the clocks $\mathfrak{C}_k$ in a consistent manner supposing the propagation to be given by an expression of the form of (3). Indeed, because of the rotation of the Earth around its axis, the condition (c) is violated, and therefore, exchanging light signals between the points $\mathfrak{P}_k$ and using the methods of 145, we cannot obtain a consistent system of reference relative to which (3) holds. The experiment of Michelson and Gale (see 62) showed that the rotation of the Earth around its axis can be determined with the help of interferometric measurements. It makes use of the fact that relative to the rotating Earth light does not appear to be propagated homogeneously. The experiment of Michelson and Gale does not prove, however, that light in the vicinity of the Earth is propagated inhomogeneously. Indeed, introducing a system of reference $K$ which does not share the rotation of the Earth, we find that relative to it the propagation of light appears to be homogeneous. However, the coordinate vectors $\mathbf{r}_k$ of the points $\mathfrak{P}_k$ relative to $K$ change in time. Thus, taking into account the non-translational motion of the points $\mathfrak{P}_k$, we can introduce a straight representation in which the propagation of light is described by relation (3). Considering on the other hand the deflection of light in the vicinity of the Sun, we have to suppose that the propagation of light in the vicinity of the Sun is indeed inhomogeneous, and that there exists no system of coordinates relative to which $\mathbf{g}(\mathbf{x})$ = constant in the vicinity of the Sun.

3. CONSTRUCTION OF STRAIGHT SYSTEMS OF REFERENCES

324. If we want to investigate whether a region is homogeneous or not, i.e. whether condition (a) of 320 holds for a region $\mathfrak{R}$, and if we want to get rid of the conditions (b)-(d), then we have to use a new procedure. Let us suppose that the points $\mathfrak{P}_k$ are situated in a homogeneous region $\mathfrak{R}$ but we drop the conditions (b)-(d); thus we suppose that the points $\mathfrak{P}_k$ may move relative to each other and also relative to the ether in a more or less arbitrary fashion. The clocks $\mathfrak{C}_k$ moving with the points should be adjusted arbitrarily and we ascribe arbitrary coordinate vectors to the points $\mathfrak{P}_k$. We want to restrict ourselves only to such an extent that we choose "reasonable" coordinate and time measures, in the sense that we ascribe to points close to each other coordinate vectors which do not differ to a great extent, and we regulate the clocks so that clocks in the vicinity of each other give readings not differing very much from each other. These conditions could be formulated more precisely, but such a formulation does not appear to be important.

325. Using the more or less arbitrary system of reference thus obtained we can establish empirically the orbit of a light signal. Interpolating suitably between the coordinate vectors of the points $\mathfrak{P}_k$ and the readings on the clocks $\mathfrak{C}_k$ we find that the orbit of a particular light signal can be given in a parametric representation in the form

$$\mathbf{x}(p) = \mathbf{r}(p),\ t(p),$$

where we suppose that the various values of the parameter $p$ give the time measures $t(p)$ at which the signal passes points with coordinate vectors $\mathbf{r}(p)$. We may suppose the representation to be such that $\dot{t}(p) \neq 0$ for any value of $p$. Considering the orbits of a large number of signals of light we may be able to establish that the orbits of light signals in terms of our arbitrary coordinate measures obey a relation

$$\dot{\mathbf{x}}(p)\,\mathbf{g}(\mathbf{x}(p))\,\dot{\mathbf{x}}(p) = 0. \tag{4}$$

More precisely, we may find that in the vicinity of any given four-point $\mathbf{x}$ light is propagated homogeneously in a first approximation. The propagation tensor $\mathbf{g}$ in the representation $K$ may vary with $\mathbf{x}$.

a. LOCALLY HOMOGENEOUS REGIONS

326. It must be emphasized that relation (4) already expresses a particular feature of the mode of propagation of light. This feature can, at least in principle, be tested experimentally. Indeed, we can determine empirically the vectors $\dot{\mathbf{x}}_k(p)$, $k = 1, 2, \ldots, n$, for a number of beams of light passing through a fixed four-point $\mathbf{x}$. Introducing the values thus obtained into (4) we obtain $n$ relations for the elements of $\mathbf{g}(\mathbf{x})$. For sufficiently large values of $n$ the system thus obtained is mathematically overdetermined; if the overdetermined set of equations admits of solutions, we can suppose that the propagation of light indeed obeys a relation (4).* We note that one could imagine regions where light is propagated in quite a different manner. We might imagine a region where a signal is propagated so as to occupy surfaces strongly deviating from elliptical ones, as illustrated in Fig. 26. Such a supposed mode of propagation of light might be called locally inhomogeneous.

* Since (4) is homogeneous, $\mathbf{g}(\mathbf{x})$ can be determined only up to a factor $a(\mathbf{x})$. See for details L. Jánossy, Foundations of Phys. 1, No. 3, 1971.
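The overdetermined fit described in 326 can be sketched numerically. What follows is only an illustration under stated assumptions, not a procedure given in the text: the function name `fit_propagation_tensor`, the use of a singular value decomposition, and the test data (light directions in a region with $\mathbf{g}$ = diag(1, 1, 1, -1), i.e. units in which c = 1) are all introduced here. Each measured tangent vector contributes one linear equation for the ten independent elements of the symmetric tensor; a vanishing smallest singular value signals that the overdetermined system admits a solution, and the solution is fixed only up to a factor, as the footnote remarks.

```python
import numpy as np

# The 10 independent elements g_ij (i <= j) of a symmetric 4x4 tensor.
PAIRS = [(i, j) for i in range(4) for j in range(i, 4)]

def fit_propagation_tensor(tangents):
    """Fit a symmetric g, up to a factor, such that v g v = 0 for every row v.

    tangents: array of shape (n, 4) with n >= 10; each row is the four-vector
    dx/dp of one light signal through the same four-point (hypothetical data).
    Returns (g, residual). The 10 fitted coefficients form a unit vector;
    residual is the smallest singular value of the linear system and is
    close to zero only if the overdetermined system admits a solution.
    """
    v = np.asarray(tangents, dtype=float)
    # Coefficient of g_ij in v g v; off-diagonal elements appear twice.
    design = np.array([[(1.0 if i == j else 2.0) * row[i] * row[j]
                        for i, j in PAIRS] for row in v])
    _, sing, vt = np.linalg.svd(design)
    coeffs = vt[-1]                      # direction of the smallest singular value
    g = np.zeros((4, 4))
    for c, (i, j) in zip(coeffs, PAIRS):
        g[i, j] = g[j, i] = c
    return g, sing[-1]

# Illustrative data (an assumption): 30 random light directions for
# g = diag(1, 1, 1, -1); the spatial part is a unit vector and dt/dp = 1.
rng = np.random.default_rng(0)
n_hat = rng.normal(size=(30, 3))
n_hat /= np.linalg.norm(n_hat, axis=1, keepdims=True)
tangents = np.hstack([n_hat, np.ones((30, 1))])

g_fit, residual = fit_propagation_tensor(tangents)
g_scaled = g_fit / -g_fit[3, 3]          # fix the arbitrary factor: g_tt = -1
```

With real observations the residual would not vanish exactly; comparing it with the accuracy of the measurements is then one way of deciding whether a relation of the form (4) is acceptable.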

The relation (4) is analogous to the law of propagation of light in an inhomogeneous refracting medium. Thus relation (4) supposes that the propagation of light in the ether obeys a law not unlike the law of propagation of light in an inhomogeneous refracting medium. The assumption (4) can be taken as a hypothesis which is justified by the success of the theory based on it.

Fig. 26. Inhomogeneous mode of propagation of waves

We shall call a region locally homogeneous if we find that the propagation of light in an arbitrary representation $K$ obeys the relation

$$\dot{\mathbf{x}}\,\mathbf{g}(\mathbf{x})\,\dot{\mathbf{x}} = 0. \tag{5}$$

We note that the relation (5) gives only one necessary condition for the orbits of signals; therefore the orbits themselves in a locally homogeneous region cannot be obtained from (5) alone.

b. CRITERIA FOR HOMOGENEOUS REGIONS

327. The question arises whether an extended region $\mathfrak{R}$ inside which the law (5) holds is a homogeneous one or not. If the region is homogeneous, then there exist representations $K'$ differing from the original representation $K$ such that relation (5) expressed in terms of $K'$ can be written

$$\dot{\mathbf{x}}'\,\mathbf{g}'\,\dot{\mathbf{x}}' = 0 \quad\text{with}\quad \mathbf{g}' = \text{independent of } \mathbf{x}'. \tag{6}$$

The transformation between the measures of $K$ and $K'$ can be written as

$$\mathbf{x}' = \mathbf{f}(\mathbf{x}) \quad\text{or}\quad \mathbf{x} = \mathbf{f}^{-1}(\mathbf{x}') \tag{7}$$

where $\mathbf{f}$ is a reversible four-function and $\mathbf{f}^{-1}$ its inverse. From (7) we see that a point which is at rest relative to $K$, i.e. a point with a coordinate vector $\mathbf{r}$ = independent of $t$, is moving relative to $K'$; similarly, if $\mathbf{f}$ is a non-linear function we see that in the representation $K'$ the rates of the clocks $\mathfrak{C}_k$ differ from the rates which appear in $K$.

328. Transforming the coordinate measures $\mathbf{x}$ according to (7) we obtain the law of propagation of light in the representation $K'$ in the form (6) provided

$$\mathbf{S}^{\mathsf{T}}(\mathbf{x})\,\mathbf{g}'\,\mathbf{S}(\mathbf{x}) = \mathbf{g}(\mathbf{x}) \tag{8a}$$

with

$$\mathbf{S}(\mathbf{x}) = \frac{\partial \mathbf{f}(\mathbf{x})}{\partial \mathbf{x}} = \mathbf{f}(\mathbf{x}) \circ \partial. \tag{8b}$$

For given $\mathbf{g}'$, (8a, b) represent ten differential equations for the four components of $\mathbf{f}(\mathbf{x})$. This system is thus in general overdetermined. We may therefore conclude that, provided the overdetermined system (8) admits of solutions, this fact is not an accidental one but signifies that the region $\mathfrak{R}$ is indeed homogeneous. The representation $K'$ which is obtained from $K$ by the transformation (7) can be called a straight representation.

329. In the following we give the explicit condition which $\mathbf{g}(\mathbf{x})$ has to obey in a homogeneous region and give an expression for the transformation function $\mathbf{f}(\mathbf{x})$ which leads from a curved representation $K$ to the straight representation $K'$ of a homogeneous region. From the relation (8a) it follows that we can put

$$\mathbf{S}(\mathbf{x}) = \boldsymbol{\sigma}'\,\boldsymbol{\Lambda}^{(p)}\,\boldsymbol{\sigma}^{-1}(\mathbf{x}) \tag{9}$$

where $\boldsymbol{\sigma}'$ and $\boldsymbol{\sigma}$ can be determined from the elements of $\mathbf{g}'$ and of $\mathbf{g}(\mathbf{x})$ according to Appendix I; 443 eq. (13). $\boldsymbol{\Lambda}^{(p)}$ is a Lorentz matrix; we note that the six parameters $p = p(\mathbf{x})$ of the Lorentz matrix may vary with $\mathbf{x}$. The parameters $p(\mathbf{x})$ have to be chosen so that (9) should satisfy (8b). However, introducing (9) into (8b) we get a very complicated system of differential equations for $p(\mathbf{x})$, and therefore the solution (9) is of no practical use.

330. In Appendix II we have given in detail the solutions of (8a, b) and have also discussed the conditions $\mathbf{g}(\mathbf{x})$ has to fulfil for (8a, b) to admit of solutions. Because of the importance of these considerations, we give here a short account of the problem, the details of which are found in Appendix II. Differentiating (8a) with respect to $\mathbf{x}$ we find, remembering $\mathbf{g}' \circ \partial = 0$ (see eq. (29) in 471),

$$(1 + c_{12})\bigl(\mathbf{g}\,\mathbf{S}^{-1}(\mathbf{x})\,\overset{(3)}{\mathbf{S}}(\mathbf{x})\bigr) = \overset{(3)}{\mathbf{g}}(\mathbf{x}). \tag{10}$$

Applying the operator

$$\pi_3 = 1 - c_{13} + c_{23}$$

to both sides of (10) we find

$$\overset{(3)}{\mathbf{S}}(\mathbf{x}) = \mathbf{S}(\mathbf{x}) \cdot \overset{(3)}{\mathbf{C}}(\mathbf{x}), \tag{11}$$

with

$$\overset{(3)}{\mathbf{C}}(\mathbf{x}) = \tfrac{1}{2}\,\pi_3\,\overset{(3)}{\mathbf{g}}(\mathbf{x}).$$

Thus (11) gives a system of linear differential equations for the determination of $\mathbf{S}(\mathbf{x})$. Whether (11) admits of solutions can, however, not be seen directly.

331. To find the conditions on $\mathbf{g}(\mathbf{x})$ which have to be satisfied if (11) is to admit of solutions, we differentiate (11) with respect to $\mathbf{x}$ and find

$$\overset{(4)}{\mathbf{S}}(\mathbf{x}) = \mathbf{S}(\mathbf{x}) \cdot \bigl(\overset{(4)}{\mathbf{C}}(\mathbf{x}) - (24)\,\overset{(3)}{\mathbf{C}}(\mathbf{x}) \cdot \overset{(3)}{\mathbf{C}}(\mathbf{x})\bigr). \tag{12}$$

Relation (12) gives a differential equation for $\mathbf{S}(\mathbf{x})$; if the latter admits of solutions, then such solutions can be obtained e.g. by giving an initial condition and integrating step by step.

Since $\overset{(4)}{\mathbf{S}}(\mathbf{x})$ is to be a third derivative, it must be symmetric in the last three suffixes. The necessary and sufficient conditions for the solution of (12) to possess the required symmetries can be written (see for details 477)

$$(1 - (24))\,\overset{(4)}{\mathbf{S}}(\mathbf{x}) = 0.$$

Expressing $\overset{(4)}{\mathbf{S}}(\mathbf{x})$ by (12) we find

$$\mathbf{S}(\mathbf{x}) \cdot \overset{(4)}{\mathbf{R}}(\mathbf{x}) = 0$$

where

$$\overset{(4)}{\mathbf{R}}(\mathbf{x}) = -\tfrac{1}{2}\,\pi_4\bigl(\overset{(4)}{\mathbf{g}}(\mathbf{x}) + \overset{(3)}{\mathbf{C}}(\mathbf{x}) \cdot \overset{(3)}{\mathbf{C}}(\mathbf{x})\bigr)$$

is the Riemann-Christoffel tensor. Since $\det \mathbf{S}(\mathbf{x}) \neq 0$ we can also write

$$\overset{(4)}{\mathbf{R}}(\mathbf{x}) = 0. \tag{13}$$

Relation (13) is a necessary condition which $\mathbf{g}(\mathbf{x})$ and its derivatives have to fulfil in every point of a homogeneous region. We see therefore that $\mathbf{g}(\mathbf{x})$ represents a homogeneous region only if the Riemann-Christoffel tensor formed of $\mathbf{g}(\mathbf{x})$ and its first and second derivatives vanishes identically in the region.
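Condition (13) can be illustrated numerically. The sketch below is an illustration under stated assumptions and not the construction of Appendix II: it uses the standard textbook index formulas for the Christoffel symbols and the Riemann-Christoffel tensor (whose sign and normalization conventions may differ from those of (11)-(13)), evaluates derivatives by finite differences, and works with two-dimensional toy tensors instead of the four-dimensional propagation tensor. The names `christoffel`, `riemann`, `polar_plane` and `unit_sphere` are introduced here. A flat plane written in polar coordinates plays the part of a curved representation of a homogeneous region, while the surface of a sphere plays the part of an inhomogeneous region.

```python
import numpy as np

def christoffel(metric, x, h=1e-4):
    """Gamma[a, b, c] = Γ^a_{bc} at x, from central differences of the metric."""
    x = np.asarray(x, dtype=float)
    n = x.size
    dg = np.zeros((n, n, n))             # dg[d, a, b] = ∂_d g_{ab}
    for d in range(n):
        step = np.zeros(n)
        step[d] = h
        dg[d] = (metric(x + step) - metric(x - step)) / (2 * h)
    g_inv = np.linalg.inv(metric(x))
    # lowered[d, b, c] = ∂_b g_{dc} + ∂_c g_{db} - ∂_d g_{bc}
    lowered = np.einsum('bdc->dbc', dg) + np.einsum('cdb->dbc', dg) - dg
    return 0.5 * np.einsum('ad,dbc->abc', g_inv, lowered)

def riemann(metric, x, h=1e-4):
    """R[a, b, c, d] = R^a_{bcd} in the standard textbook convention."""
    x = np.asarray(x, dtype=float)
    n = x.size
    gam = christoffel(metric, x, h)
    dgam = np.zeros((n, n, n, n))        # dgam[c, a, b, d] = ∂_c Γ^a_{bd}
    for c in range(n):
        step = np.zeros(n)
        step[c] = h
        dgam[c] = (christoffel(metric, x + step, h)
                   - christoffel(metric, x - step, h)) / (2 * h)
    # R^a_{bcd} = ∂_c Γ^a_{bd} - ∂_d Γ^a_{bc} + Γ^a_{ce} Γ^e_{bd} - Γ^a_{de} Γ^e_{bc}
    return (np.einsum('cabd->abcd', dgam) - np.einsum('dabc->abcd', dgam)
            + np.einsum('ace,ebd->abcd', gam, gam)
            - np.einsum('ade,ebc->abcd', gam, gam))

def polar_plane(x):
    """Flat plane in polar coordinates (r, phi): g = diag(1, r^2)."""
    r, _ = x
    return np.diag([1.0, r ** 2])

def unit_sphere(x):
    """Unit sphere in coordinates (theta, phi): g = diag(1, sin^2 theta)."""
    theta, _ = x
    return np.diag([1.0, np.sin(theta) ** 2])

R_flat = riemann(polar_plane, [2.0, 0.3])      # all elements vanish (to rounding)
R_sphere = riemann(unit_sphere, [1.0, 0.5])    # does not vanish
```

In the first case the tensor vanishes although $\mathbf{g}$ depends on the coordinates, so a transformation to straight coordinates exists; in the second case no such transformation can exist.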

332. The condition (13) is, however, not only necessary but also sufficient for a region to be homogeneous; we can show this by constructing explicitly a transformation function $\mathbf{f}(\mathbf{x})$ the derivative $\mathbf{S}(\mathbf{x})$ of which obeys (8a) and (8b), and which therefore leads to a straight representation $K'$. To this end we differentiate (11) successively with respect to $\mathbf{x}$ and find expressions of the form

$$\overset{(3+l)}{\mathbf{S}}(\mathbf{x}) = \bigl(\mathbf{S}(\mathbf{x}) \cdot \overset{(3)}{\mathbf{C}}(\mathbf{x})\bigr) \circ \partial^{\,l}. \tag{14}$$

Writing down (14) explicitly for $l = 0, 1, 2$ we obtain a recursion formula with the help of which we can determine $\overset{(3+l)}{\mathbf{S}}(\mathbf{x})$ in terms of the $\overset{(3+l')}{\mathbf{S}}(\mathbf{x})$, $l' < l$, and the derivatives $\overset{(k)}{\mathbf{g}}(\mathbf{x})$, $k = 2, 3, \ldots, 3 + l$, of $\mathbf{g}(\mathbf{x})$.

Taking the $\overset{(3+l)}{\mathbf{S}}(\mathbf{x})$ thus obtained to be the derivatives of $\mathbf{f}(\mathbf{x})$ we can develop $\mathbf{f}(\mathbf{x})$ e.g. around $\mathbf{x} = 0$, writing

$$\mathbf{f}(\mathbf{x}) = \boldsymbol{\mu} + \mathbf{S}\mathbf{x} + \tfrac{1}{2}\,\overset{(3)}{\mathbf{S}}\mathbf{x}^2 + \ldots \tag{15}$$

with

$$\mathbf{S} = \mathbf{S}(0), \qquad \overset{(3)}{\mathbf{S}} = \overset{(3)}{\mathbf{S}}(0),$$

where $\boldsymbol{\mu}$ is an arbitrary four-component constant.

If (13) is fulfilled we find that the $\overset{(4)}{\mathbf{S}}(\mathbf{x})$ are symmetric in the 2nd, 3rd and 4th suffixes; since the suffixes from the 4th onwards are obtained by successive differentiation, the $\overset{(k)}{\mathbf{S}}(\mathbf{x})$ are in any case symmetric in the 4th, 5th, ..., $k$-th suffixes. Taken together, the $\overset{(k)}{\mathbf{S}}(\mathbf{x})$ and thus also the $\overset{(k)}{\mathbf{S}}$ are symmetric in all the suffixes except the first. Inserting (15) thus obtained into (8a, b) and developing both sides into powers of $\mathbf{x}$ we find that (15) satisfies the conditions (8a, b) in the vicinity of $\mathbf{x} = 0$. (The considerations can of course be carried out in the vicinity of any four-point $\mathbf{x} = \mathbf{x}_0$ instead of $\mathbf{x} = 0$.) We see thus that the function defined by (15) indeed gives a transformation to a straight system of reference $K'$. We note that the transformation (15) contains ten arbitrary parameters, i.e. the four components of $\boldsymbol{\mu}$ and the six parameters which can be chosen freely when determining $\mathbf{S}$.

333. Summarizing our considerations: observing the mode of propagation of light in a region $\mathfrak{R}$, we may succeed in expressing the law of propagation of signals in this region in the form $\dot{\mathbf{x}}\,\mathbf{g}(\mathbf{x})\,\dot{\mathbf{x}} = 0$ using arbitrary coordinate measures. If the region around $\mathbf{x} = 0$ is a homogeneous one, then we can transform our coordinate measures according to

$$\mathbf{x}' = \mathbf{f}(\mathbf{x}) = \boldsymbol{\mu} + \mathbf{S}\mathbf{x} + \tfrac{1}{2}\,\overset{(3)}{\mathbf{S}}\mathbf{x}^2 + \ldots \tag{16}$$

where the $\overset{(k)}{\mathbf{S}}$ are to be determined with the help of the recursion (14). In the representation $K'$ thus obtained we find $K'(\mathfrak{g})$ = constant. The coordinate measures $\mathbf{x}'$ give a straight representation $K'$ of the region, while the original representation $K$ can be taken to be a curved representation of the region $\mathfrak{R}$.

334. If we do not want to carry out the transformation (16) we can also check directly whether or not $\mathfrak{R}$ is homogeneous. In terms of $\mathbf{g}(\mathbf{x}) = K(\mathfrak{g})$ we may form the Riemann-Christoffel tensor $\overset{(4)}{\mathbf{R}}(\mathbf{x})$ in $\mathfrak{R}$, and if the region is indeed homogeneous, then we find $\overset{(4)}{\mathbf{R}}(\mathbf{x}) = 0$. We see thus that using signals of light only we are in a position to examine whether or not light is propagated homogeneously in the region we are investigating, and if the propagation of light proves to be homogeneous, we are in a position to construct a straight system of reference with the help of signals of light. Returning to the problem raised in 320, we can decide whether or not (2) holds in a region without regard to whether or not the clocks we are using to measure the arrivals and departures of signals obey the conditions (b), (c) and (d). It seems to us important that we need not start our considerations by defining a straight system of reference. Indeed, starting with an arbitrary system of reference we are in a position to find out whether light is or is not propagated homogeneously, and provided the propagation proves to be homogeneous (within the accuracy of measurement), we can afterwards construct a straight system of reference with the help of signals of light.

4. ALMOST STRAIGHT SYSTEM OF REFERENCE

335. The question arises whether it is possible to define in an inhomogeneous region a system of reference which resembles a straight system of reference as much as possible.

In a homogeneous region one can use straight coordinates, but one can also use e.g. polar coordinates, which must be taken as a type of curved coordinates. It is reasonable to suppose that in an inhomogeneous region one can introduce coordinate measures which are the analogue of straight coordinates, but that one can also introduce coordinate measures which resemble e.g. polar coordinates rather than straight coordinates. Therefore in an inhomogeneous region one expects that one can distinguish between almost straight coordinates and strongly curved coordinates. To obtain a transformation which makes the representations $\mathbf{g}'(\mathbf{x}')$ of $\mathfrak{g}$ almost independent of $\mathbf{x}'$ we can start from eq. (12). The function $\overset{(4)}{\mathbf{S}}(\mathbf{x})$ will not be symmetric in (24) or (34) if the region we consider is inhomogeneous. We can introduce, however, a symmetric quantity

$$\langle \overset{(4)}{\mathbf{S}}(\mathbf{x}) \rangle = \tfrac{1}{3}\bigl(1 + (24) + (34)\bigr)\,\overset{(4)}{\mathbf{S}}(\mathbf{x}).$$

As shown in Appendix II (479 eq. 50) the unsymmetric solutions $\overset{(4)}{\mathbf{S}}(\mathbf{x})$ of (12) differ from the symmetrized functions $\langle \overset{(4)}{\mathbf{S}}(\mathbf{x}) \rangle$; the difference $\delta\overset{(4)}{\mathbf{S}}(\mathbf{x})$ can be expressed in terms of the Riemann-Christoffel tensor as

$$\delta\overset{(4)}{\mathbf{S}}(\mathbf{x}) = -\tfrac{1}{3}\bigl(1 + (23)\bigr)\bigl(\mathbf{S} \cdot \overset{(4)}{\mathbf{R}}\bigr).$$

The transformation function obtained from $\langle \overset{(4)}{\mathbf{S}}(\mathbf{x}) \rangle$ does not lead to a representation $K'$ where $\mathbf{g}'$ is strictly constant, but one obtains an almost straight representation $K'$ so that

$$K'(\mathfrak{g}) = \mathbf{g}'(\mathbf{x}') = \mathbf{g}' + \delta\mathbf{g}'(\mathbf{x}')$$

with

$$\delta\mathbf{g}'(\mathbf{x}') = -\tfrac{1}{2}\bigl(1 + (12)\bigr)\,\overset{(4)}{\mathbf{R}}{}'\,\mathbf{x}'^2 + \text{terms of higher order}.$$

We see thus that in the almost straight system of reference $\mathbf{g}'(\mathbf{x}')$ is almost constant in the sense that the coefficients of the development are of the order of $\overset{(4)}{\mathbf{R}}{}'(\mathbf{x})$ and its derivatives.

336. The representation $K'$ can be taken to be almost straight in the sense explained above, but it can also be taken as a representation which is as straight as possible in the following sense. Considering any system of reference $K'$ such that in the origin

$$\overset{(3)}{\mathbf{g}}{}'(\mathbf{x}') = 0, \tag{17}$$

the elements of the Riemann-Christoffel tensor are built as linear combinations of the elements of $\overset{(4)}{\mathbf{g}}{}'$. The coefficients in the linear combination are of the order of unity.

The elements of $\overset{(4)}{\mathbf{R}}{}'$ may thus be much smaller than those of $\overset{(4)}{\mathbf{g}}{}'$, since the elements of $\overset{(4)}{\mathbf{g}}{}'$ with comparatively large numerical values may compensate each other when inserted into the expressions giving the elements of $\overset{(4)}{\mathbf{R}}{}'$. However, some of the elements of $\overset{(4)}{\mathbf{g}}{}'$ must be at least of the order of those of $\overset{(4)}{\mathbf{R}}{}'$, as it would be impossible to construct the elements of $\overset{(4)}{\mathbf{R}}{}'$ as linear combinations of the elements of $\overset{(4)}{\mathbf{g}}{}'$ if the latter elements were all too small. We see thus that there exists no representation $K'$ obeying (17) such that all the elements of $\overset{(4)}{\mathbf{g}}{}'$ are essentially smaller than the elements of $\overset{(4)}{\mathbf{R}}{}'$. Thus we conclude that there exists no representation $K'$ in terms of which $\overset{(3)}{\mathbf{g}}{}' = 0$ and the elements of $\overset{(4)}{\mathbf{g}}{}'$ are essentially smaller than the elements of $\overset{(4)}{\mathbf{R}}{}'$. Thus there exists no representation $K'$ in which the $\overset{(4)}{\mathbf{g}}{}'$ are essentially smaller than in the almost straight representation.

337. If we consider representations for which $\overset{(3)}{\mathbf{g}}{}' \neq 0$ we may formulate our result by stating that the set

$$\mathbf{g}'^{-1}\bigl(\overset{(3)}{\mathbf{g}}{}' \circ \overset{(3)}{\mathbf{g}}{}'\bigr), \qquad \overset{(4)}{\mathbf{g}}{}'$$

necessarily possesses elements which are of the same order as the elements of $\overset{(4)}{\mathbf{R}}{}'$. Thus there may exist representations where $\overset{(4)}{\mathbf{g}}{}' = 0$, but in such representations the values of the $\overset{(3)}{\mathbf{g}}{}'$ must be comparatively large.

338. Generalizing the above considerations one finds that the almost straight system of reference obtained in 335 has the property that suitable homogeneous expressions built of the elements of the derivatives of $\mathbf{g}$ have orders of magnitude not exceeding those of the elements of the corresponding invariants $\overset{(k+2)}{\mathbf{R}}$. Reversing the latter argument one can define a system of reference to be almost straight if the elements of

$$\mathbf{g}^{-n}\bigl(\overset{(k_1)}{\mathbf{g}} \circ \overset{(k_2)}{\mathbf{g}} \circ \cdots \circ \overset{(k_n)}{\mathbf{g}}\bigr)$$

have orders not exceeding those of the elements of

$$\overset{(k_1 + k_2 + \ldots + k_n)}{\mathbf{R}}.$$

The above consideration does not give a strict definition of the almost straight representation. Indeed, representations $K'$ and $K''$ obeying the above criteria may show small differences in the elements of some of the $\mathbf{g}$, and these systems of reference can all be considered to be nearly straight.

5. SIMILAR REGIONS

339. Let us consider a larger region $\mathfrak{R}_0$ with two sub-regions $\mathfrak{R}$ and $\mathfrak{R}^*$ (see Fig. 27). The propagation tensor in $\mathfrak{R}_0$ may be given in a particular representation $K$ as

$$K(\mathfrak{g}) = \mathbf{g}_0(\mathbf{x}).$$

Fig. 27. Similar regions

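We denote by $\mathbf{x}$ and $\mathbf{x}^*$ the coordinates of the centres of $\mathfrak{R}$ and $\mathfrak{R}^*$. Denoting further the four-coordinates relative to the centres of $\mathfrak{R}$ and $\mathfrak{R}^*$ by $\boldsymbol{\xi}$ and $\boldsymbol{\xi}^*$ we can also write

$$\mathbf{g}(\boldsymbol{\xi}) = \mathbf{g}_0(\mathbf{x} + \boldsymbol{\xi}), \qquad \mathbf{g}^*(\boldsymbol{\xi}^*) = \mathbf{g}_0(\mathbf{x}^* + \boldsymbol{\xi}^*).$$

Under exceptional circumstances we find that the regions $\mathfrak{R}$ and $\mathfrak{R}^*$ appear as transforms of each other. More precisely, there exist cases where a function $\mathbf{X}(\mathbf{x})$ can be found so that

$$\mathbf{M}^{\mathsf{T}}(\boldsymbol{\xi})\,\mathbf{g}^*(\boldsymbol{\xi}^*)\,\mathbf{M}(\boldsymbol{\xi}) = \mathbf{g}(\boldsymbol{\xi}) \tag{18}$$

with

$$\mathbf{M}(\boldsymbol{\xi}) = \mathbf{X}(\boldsymbol{\xi}) \circ \partial.$$

We note that the relation (18) gives in general an overdetermined system of equations for $\mathbf{X}(\mathbf{x})$. If the distributions of $\mathbf{g}(\mathbf{x})$ taken in the vicinities of $\mathbf{x}$ and of $\mathbf{x}^*$ obey certain relations, then (18) can be satisfied and the regions $\mathfrak{R}$ and $\mathfrak{R}^*$ can be taken to be similar. Denoting by $\overset{(3)}{\mathbf{M}}(\boldsymbol{\xi})$ and $\overset{(4)}{\mathbf{M}}(\boldsymbol{\xi})$ the derivatives of $\mathbf{M}(\boldsymbol{\xi})$ we find in analogy to (11)

$$\overset{(3)}{\mathbf{M}}(\boldsymbol{\xi}) = \mathbf{M}(\boldsymbol{\xi}) \cdot \bigl(\overset{(3)}{\mathbf{C}}(\boldsymbol{\xi}) - \overset{(3)}{\bar{\mathbf{C}}}(\boldsymbol{\xi})\bigr), \tag{19}$$

where we have written

$$\overset{(3)}{\bar{\mathbf{C}}}(\boldsymbol{\xi}) = \bigl(\overset{(III)}{\mathbf{M}}(\boldsymbol{\xi})\,\overset{(3)}{\mathbf{C}}{}^*(\boldsymbol{\xi}^*)\bigr) \tag{19a}$$

with

$$\overset{(3)}{\mathbf{C}}{}^*(\boldsymbol{\xi}^*) = \tfrac{1}{2}\,\pi_3\,\overset{(3)}{\mathbf{g}}{}^*(\boldsymbol{\xi}^*),$$

and $\overset{(III)}{\mathbf{M}}(\boldsymbol{\xi})$ is produced from the elements of $\mathbf{M}(\boldsymbol{\xi})$ like $\overset{(III)}{\mathbf{S}}(\mathbf{x})$ from those of $\mathbf{S}(\mathbf{x})$. Relation (19) admits of solutions if and only if

$$\overset{(4)}{\mathbf{R}}(\boldsymbol{\xi}) = \overset{(4)}{\bar{\mathbf{R}}}(\boldsymbol{\xi}), \tag{20}$$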

where we have written

$$\overset{(4)}{\bar{\mathbf{R}}}(\boldsymbol{\xi}) = \bigl(\overset{(IV)}{\mathbf{M}}(\boldsymbol{\xi})\,\overset{(4)}{\mathbf{R}}{}^*(\boldsymbol{\xi}^*)\bigr).$$

340. Relation (20) gives twenty differential equations for the four components of $\mathbf{X}(\boldsymbol{\xi})$. However, $\mathbf{M}(\boldsymbol{\xi})$ contains six arbitrary parameters; therefore, choosing $\mathbf{M}(\boldsymbol{\xi})$ suitably, we are left with fourteen conditions between the $\mathbf{g}(\boldsymbol{\xi})$ and the $\mathbf{g}^*(\boldsymbol{\xi}^*)$. Physically we see thus that similar regions are characterized by fourteen parameters (corresponding to the fourteen conditions imposed on the propagation tensor). Further, similar regions can be characterized by a four-dimensional orientation; the relative orientations of the regions $\mathfrak{R}$ and $\mathfrak{R}^*$ are expressed by the parameters of $\mathbf{M}(\boldsymbol{\xi})$.
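The numbers twenty and fourteen quoted in 340 can be checked by a short count. The sketch below rests on two standard results that are assumed here rather than taken from the text: a Riemann-Christoffel tensor in n dimensions has n²(n² − 1)/12 algebraically independent elements, and a Lorentz matrix in n dimensions has n(n − 1)/2 free parameters. The function names are illustrative only.

```python
def riemann_components(n: int) -> int:
    """Number of algebraically independent elements of the Riemann-Christoffel tensor."""
    return n * n * (n * n - 1) // 12

def lorentz_parameters(n: int) -> int:
    """Number of free parameters of an n-dimensional Lorentz (rotation-like) matrix."""
    return n * (n - 1) // 2

n = 4
conditions = riemann_components(n)       # equations contained in relation (20)
free = lorentz_parameters(n)             # parameters available in M
remaining = conditions - free            # conditions left on the propagation tensor
print(conditions, free, remaining)       # prints: 20 6 14
```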

C. THE GENERALIZED LORENTZ PRINCIPLE

341. According to the Lorentz principle, as it was formulated for homogeneous regions, laws of nature have such a form that, provided $\mathfrak{Q}$ is a real physical system, a Lorentz deformed version

$$\mathfrak{Q}^* = \mathfrak{L}_q(\mathfrak{Q})$$

is a possible system. Furthermore, to the above principle is added a dynamical part, i.e. if $\mathfrak{Q}$ is interfered with adiabatically, then as the result of the interference it changes into a Lorentz deformed configuration $\mathfrak{Q}^*$. The transformations $\mathfrak{L}_q$ can be defined uniquely for homogeneous regions. A generalization of the Lorentz principle may be attempted by extending the definition of the operators $\mathfrak{L}_q$ to inhomogeneous regions.

1. THE LORENTZ PRINCIPLE FORMULATED IN TERMS OF CURVED COORDINATES

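342. If the system $\mathfrak{Q}$ consists of a number of moving points $\mathfrak{P}_1, \mathfrak{P}_2, \ldots, \mathfrak{P}_n$, then the system $\mathfrak{Q}^*$ can be supposed to consist of corresponding points $\mathfrak{P}_1^*, \ldots, \mathfrak{P}_n^*$. The orbits of the points of $\mathfrak{Q}$ can be represented by four-coordinates

$$\mathbf{x}_k(p) = \mathbf{r}_k(p),\ t_k(p), \qquad \dot{t}_k(p) \neq 0, \qquad k = 1, 2, \ldots, n.$$

The Lorentz transformation may be defined by a reversible transformation

$$\mathbf{x}^* = \mathbf{X}(\mathbf{x}), \qquad \mathbf{x} = \mathbf{X}^{-1}(\mathbf{x}^*),$$

thus by supposing the four-coordinates of the points $\mathfrak{P}_k^*$ to be given by

$$\mathbf{x}_k^*(p) = \mathbf{X}(\mathbf{x}_k(p)).$$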

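343. In a homogeneous region, and using a straight representation, we have

$$\mathbf{X}(\mathbf{x}) = \mathbf{M}_q \mathbf{x} + \boldsymbol{\mu} \tag{21}$$

where

$$\mathbf{M}_q^{\mathsf{T}}\,\mathbf{g}\,\mathbf{M}_q = \mathbf{g}.$$

The Lorentz principle in a homogeneous region can also be written down in terms of curved coordinates. If in terms of the curved coordinates we have $K(\mathfrak{g}) = \mathbf{g}(\mathbf{x})$ = depending on $\mathbf{x}$, we have

$$\mathbf{x}^* = \mathbf{X}(\mathbf{x}), \tag{22}$$

and

$$\mathbf{M}^{\mathsf{T}}(\mathbf{x})\,\mathbf{g}^*(\mathbf{x}^*)\,\mathbf{M}(\mathbf{x}) = \mathbf{g}(\mathbf{x}), \tag{23a}$$

with

$$\mathbf{M}(\mathbf{x}) = \mathbf{X}(\mathbf{x}) \circ \partial, \tag{23b}$$

and

$$\mathbf{g}^*(\mathbf{x}^*) = \mathbf{g}(\mathbf{X}(\mathbf{x})). \tag{23c}$$

Relations (23a, b, c) define the function $\mathbf{X}(\mathbf{x})$; the system (23a-c) is mathematically overdetermined, but in a homogeneous region it admits of solutions. In a homogeneous region the Lorentz transformation defined by (22) and (23) is equivalent with that defined by (21) and (22).

2. GENERALIZATION TO INHOMOGENEOUS REGIONS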

344. Equations (22) and (23) cannot be taken to express the Lorentz transformation in inhomogeneous regions, because the system is overdetermined and in inhomogeneous regions it admits of solutions only in exceptional cases. Nevertheless it must be assumed that the Lorentz transformation is somehow connected with the systems (22) and (23). Indeed, consider a signal of light $\mathfrak{S}$ passing through some physical system $\mathfrak{Q}$. The orbit of the signal can be expressed by four-coordinates $\mathbf{x}_s(p)$ such that

$$\dot{\mathbf{x}}_s(p)\,\mathbf{g}(\mathbf{x}_s(p))\,\dot{\mathbf{x}}_s(p) = 0. \tag{24}$$

Submitting the system $\mathfrak{Q}$ to a deformation, we obtain a new system $\mathfrak{Q}^*$ which is traversed by a signal $\mathfrak{S}^*$. The orbit of $\mathfrak{S}^*$ is expected to be given by

$$\mathbf{x}_s^*(p) = \mathbf{X}(\mathbf{x}_s(p)).$$

Since $\mathfrak{S}^*$ is a signal of light we also expect the coordinates of the signal $\mathfrak{S}^*$ to obey the relation

$$\dot{\mathbf{x}}_s^*(p)\,\mathbf{g}(\mathbf{x}_s^*(p))\,\dot{\mathbf{x}}_s^*(p) = 0. \tag{25}$$

If both (24) and (25) are to be valid, then $\mathbf{X}(\mathbf{x})$ has to obey the relation (23). Thus from the hypothesis that in the case of a light signal $\mathfrak{S}$ passing through $\mathfrak{Q}$ the Lorentz deformation leads to a light signal $\mathfrak{S}^*$ passing through $\mathfrak{Q}^*$ we are led to the condition (23) for the transformation function $\mathbf{X}(\mathbf{x})$. Since (23) is overdetermined we see thus that the above hypothesis cannot be strictly correct.

a. A PHYSICAL EXAMPLE

345. To see in more detail the physical significance of the above hypothesis let us consider a system $\mathfrak{Q}$ which is something like a Michelson interferometer. Suppose $\mathfrak{Q}$ to be a solid carrying a source of light in a point A and mirrors in points B and C. The mirrors should be so adjusted that a signal starting from A is reflected back into A by them. The distances should be so adjusted that the return signals arrive in A simultaneously, i.e. the times of flight A-B-A and A-C-A should be equal (see Fig. 28).

A deformed system $\mathfrak{Q}^*$ can be taken to contain a source in a point A* and mirrors in points B* and C*; we expect that in the deformed system the times of flight A*-B*-A* and A*-C*-A* are also equal. The deformed system $\mathfrak{Q}^*$ can be taken to be the original interferometer shifted to another position and submitted to a (four-dimensional) rotation.

Fig. 28. Scheme of an interferometer

346. Shifting the interferometer from one point to another we change its gravitational surrounding; the hypothesis of 341 thus amounts in our particular case to the hypothesis that the interferometer which was adjusted in one position remains adjusted if we move it to another gravitational surrounding. This hypothesis is physically unreasonable. Indeed, the gravitational field produces stresses in a mechanical system and the system will deform so as to compensate these stresses. It is unfounded to suppose that the deformations which are produced by changing gravitational stresses upon the mechanical system should be exactly those suitable to compensate the effects of gravitation upon the propagation of light signals. To see what absurd conclusions we would be led to if we were to maintain such a hypothesis after all, let us imagine a Michelson interferometer mounted on a horizontal axis such that the arms move in a vertical plane. Turning the interferometer round its axis can be taken to be a Lorentz deformation. It is obvious that the interferometer will deform under its own weight when turned round. It cannot be expected that the position of the fringes remains stationary while the interferometer is turned round its horizontal axis. In particular we note that the deformations the interferometer suffers on account of its own weight depend strongly on the properties of the material of the interferometer. Thus two interferometers built of different materials will deform differently when turned round.

347. Taking, however, a small interferometer built of strong material, the internal stresses may become negligible and the interferometer will behave practically independently of the gravitational stresses. Thus the small and strongly connected interferometer will show no noticeable fringe shift even if turned round a horizontal axis. From the above qualitative consideration we may conclude that the Lorentz principle in an inhomogeneous region is valid for sufficiently small and sufficiently strongly connected systems. A mathematical formulation which takes the above restriction into consideration will be developed presently.

3. THE LORENTZ PRINCIPLE VALID FOR SMALL PHYSICAL SYSTEMS

348. A Lorentz deformation must be taken in a first approximation to consist of an adiabatic shift, which can be expressed by a four-vector μ, and a (four-dimensional) turning round, which corresponds to the deformation obtained by an orthogonal Lorentz matrix $\Lambda_q$. The deformation is thus characterized in a first approximation by the ten parameters q and μ. In an inhomogeneous region a system, if turned round, is usually deformed to some extent because the direction of the gravitational stress changes relative to the system. A shift μ is followed in general by a change of the gravitational surrounding, and this change also causes deformations when the system accommodates itself to the new surrounding.

a. FIRST APPROXIMATION

349. So as to describe in a first approximation the behaviour of physical systems when subjected to a shift q, μ we consider a small and strongly connected system Ω in the vicinity of a four-point $\mathfrak{P}_0$ with a four-coordinate $x_0$. The points of Ω have coordinates

$x_k(p) = x_0 + \xi_k(p)$.

The relative coordinates $\xi_k(p)$ are taken to be small. The shifted system Ω* has its centre round a four-point $\mathfrak{P}_0^*$ with a four-coordinate

$x_0^* = x_0 + \mu$

and the points of the deformed system have four-coordinates

$x_k^* = x_0^* + \xi_k^*(p)$.

The transformation function X(x) cannot be supposed to obey (23) exactly; we can, however, require that

$\tilde M(\xi)\,g^*(\xi^*)\,M(\xi) \approx g(\xi)$   (26)

for not too large values of ξ. We have written

$M(\xi) = \Lambda(\xi)\circ\nabla$,   $\Lambda(\xi) = X(x_0 + \xi) - X(x_0)$,

$g(\xi) = g(x_0 + \xi)$,   $g^*(\xi^*) = g(x_0 + \mu + \xi^*)$.

Thus the transformation X(x) connects the vicinity of $x_0$ with the vicinity of $x_0^* = x_0 + \mu$, and we expect the relations (23) to be approximately valid when we take these vicinities only.

350. The relation (26) can be taken to be exactly valid for ξ = 0; we can thus write

$\tilde M g^* M = g$,   (27)

where we write M in place of M(0), etc. The solutions of (27) are matrices

$M_q = \alpha^{*-1}\Lambda_q\,\alpha$,

where $\Lambda_q$ is an orthogonal Lorentz matrix and α and α* are given explicitly in Appendix I. In the above approximation we can write

$X(x) \approx X^{(1)}(x) = \mu + M_q(x - x_0)$

or

$\Lambda(\xi) \approx \Lambda^{(1)}(\xi) = M_q\,\xi$.

The transformation thus defined contains ten parameters: four parameters giving the four-dimensional shift μ, and six parameters q defining the four-dimensional rotation corresponding to $M_q$.

b. A SECOND APPROXIMATION

351. A better approximation of (23) can be obtained if we require that the relations corresponding to the first derivative of (23) should also be satisfied at the point $x_0$. Thus we may require

$[\tilde M(\xi)\,g^*(\xi^*)\,M(\xi) - g(\xi)]\circ\nabla = 0$   for ξ = 0.

From the latter relation it follows according to 339 eq. (19)

$\overset{(3)}{M} = M\cdot(\overset{(3)}{C} - \overset{(3)}{\bar C})$,

where $\overset{(3)}{\bar C}$ is defined by (19a). We can thus write in this approximation

$\Lambda(\xi) \approx \Lambda^{(2)}(\xi) = M_q\,\xi + \tfrac12 M_q\cdot(\overset{(3)}{C} - \overset{(3)}{\bar C})\,\xi^2$.   (28)

352. The approximation (28) cannot be essentially improved, since a requirement to the effect that derivatives of (23) higher than the first should give correct relations at $x_0$ cannot be satisfied. The above formulation of the Lorentz principle is not yet entirely satisfactory and it must be improved for the following reason. The function $\Lambda^{(2)}(\xi)$ as defined by (28) is not invariant. Indeed, changing from a representation K to K' by a non-linear transformation

$x' = f(x)$

we obtain in the new representation

$\Lambda^{(2)\prime} = f\,\Lambda^{(2)}\,f^{-1}$

and thus $\Lambda^{(2)\prime}$ contains higher order terms in x' even if $\Lambda^{(2)}$ does not contain such terms in x. If f(x) is a strongly non-linear function, then it can happen that the higher order terms in the expansion of $\Lambda^{(2)\prime}$ become quite essential. Thus the transformation $\Lambda^{(2)}$ defined by (28) may correspond to essentially different deformations if we apply (28) in different representations. So as to reduce the ambiguity of the definition (28) we note that the relations (23) are best fulfilled in an almost straight representation. Thus we may require (28) to hold in almost straight representations only; in curved representations (28) has to be extended by suitable further terms. That a restriction of this type is indeed necessary can best be seen by considering the homogeneous case. In a homogeneous region the Lorentz transformation is a linear transformation in terms of a straight representation. In terms of a curved representation, however, the transformation function becomes non-linear; the non-linear terms in the expansion of X'(x') compensate the curvature of the representation. In a nearly homogeneous region the transformation X'(x') must be expected to deviate only little from a linear transformation if we use an almost straight representation. However, in a strongly curved representation X'(x') will in any case be an essentially non-linear function, the non-linearity compensating in a first approximation the curvature of the representation.
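The dependence of linearity on the representation can be illustrated with a one-dimensional toy case. The sketch below is not taken from the text: the curved representation x' = sinh(kx)/k, the parameter k (which measures how strongly the representation is curved) and the linear map x → 2x are all assumptions chosen only because the composition f Λ f⁻¹ can then be evaluated in closed form.

```python
import math

def make_representation(k):
    """Assumed curved representation x' = f(x) = sinh(k x)/k and its inverse."""
    def f(x):
        return math.sinh(k * x) / k

    def f_inv(xp):
        return math.asinh(k * xp) / k

    return f, f_inv

def linear_map(x):
    # A map that is purely linear in K (no higher order terms).
    return 2.0 * x

def transformed(lam, k):
    """Representation of the map lam in K': x' -> f(lam(f_inv(x')))."""
    f, f_inv = make_representation(k)
    return lambda xp: f(lam(f_inv(xp)))

def nonlinearity(k, xp=0.1):
    """Deviation of the K' representation from the linear map 2 x' at the point x'."""
    lam_prime = transformed(linear_map, k)
    return lam_prime(xp) - 2.0 * xp

if __name__ == "__main__":
    for k in (0.01, 0.1, 1.0):
        print(k, nonlinearity(k))
```

In this toy case the transformed map is 2x'·sqrt(1 + k²x'²), so the higher order terms grow with k: for an almost straight representation (small k) they are negligible, while for a strongly curved one they are not.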

4. THE AMBIGUITY IN THE FORMULATION OF THE LORENTZ PRINCIPLE

353. Restricting thus (28) to almost straight representations there remains still an ambiguity, as transformations between almost straight representations lead to changes in higher order terms. We can thus write

$\Lambda(\xi) = M\xi + \tfrac12\overset{(3)}{M}\xi^2 + \text{higher order terms}$,

where in a straight representation the higher order terms should be "small", i.e. not exceeding the order of the elements of the invariants $\overset{(k)}{R}$, k = 4, 5, ....

354. We can write down a definition as to how to find transformations Λ(ξ) which reduce to a good approximation to (28) in a nearly straight representation. Indeed, remembering the considerations of 335 we find that an exact solution of (23) should obey the relation

$\overset{(4)}{M}(\xi) = \bigl(M(\xi)\cdot(\overset{(3)}{C}(\xi) - \overset{(3)}{\bar C}(\xi))\bigr)\circ\nabla$.

The above relation possesses solutions $\overset{(4)}{M}(\xi)$, but these solutions do not possess the symmetry properties required for the second derivatives of a function Λ(ξ).

Symmetrizing $\overset{(4)}{M}(\xi)$ by putting

$\langle\overset{(4)}{M}(\xi)\rangle = \tfrac13\,(1 + (24) + (34))\,\overset{(4)}{M}(\xi)$

we obtain a total differential and we can define e.g.

$\Lambda(\xi) = M\xi + \tfrac12\overset{(3)}{M}\xi^2 + \tfrac12\int_0^{\xi}\langle\overset{(4)}{M}(\xi')\rangle\,(\xi - \xi')^2\,d\xi'$   (29)

(the value of the integral does not depend on the path of integration, as $\langle\overset{(4)}{M}(\xi)\rangle$ is a total differential). The latter expression in an almost straight representation contains only small higher order terms; in curved coordinates (29) contains the higher order terms compensating the curvature and also terms of the order of the derivatives of $\overset{(4)}{R}(\xi)$.

355. One might be tempted to regard (29) as the "exact definition" of the Lorentz transformation in an inhomogeneous region. This is, however, not the case. Indeed, the series (29) in an inhomogeneous region is only

almost invariant in the sense that forming the expansion (29) in different representations, we obtain almost equal deformations. Let us write K(2) = A(x) and K'(2) = A'(x') thus let us take A(x) and A'(x') as the representations of an operator £ in K and K'. A detailed analysis shows that (in the terminology of Appendix II) A(x) as A(x) thus the representations of £ transform almost like those of a vector, but not exactly.

CHAPTER XI

APPLICATIONS OF THE GENERALIZED LORENTZ PRINCIPLE

A. GEODETIC ORBITS

1. DEFINITION

356. So as to find the form of various physical laws in inhomogeneous regions it is useful to see how the mathematical form of such laws, valid in homogeneous regions, can be generalized. It is a question of experiment to find out whether or not the generalizations which suggest themselves are in accord with experiment. It is a useful guide in formulating the hypothetical forms of physical laws to suppose that in almost straight representations the laws have forms very nearly like the forms of the corresponding laws valid in homogeneous regions formulated in the measures of a straight representation.

357. Let us consider Newton's first law. Consider for this purpose a small closed physical system in a homogeneous region; the latter can also be regarded as a particle. The position of the system can be described by a four-coordinate x'(p). Newton's law in a straight representation K' can be written

$\ddot x'(p) = 0$.   (1)*

We write down the relation (1) in a curved representation K which is connected with K' by the transformation

$x' = f(x)$.   (2)

Introducing x = x(p) into (2) and differentiating with respect to p we find

$\dot x'(p) = S(p)\,\dot x(p)$,   (3)

where we have written

$S(p) = (f(x)\circ\nabla)_{x = x(p)}$.

* Instead of relation (1) we could also use the relation

$\frac{d^2 r}{dt^2} = 0$,   $r = r(p)$,   $t = t(p)$.   (1a)

The above relation contains more solutions than the relation (1), as the latter relation implies a linear connection between t and p. Since, however, the restricted relation (1) already gives all possible orbits of free particles, there is no need to use the more general relation (1a).

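Differentiating (3) once more and remembering (1) we find

$0 = \overset{(3)}{S}(p)\,\dot x^2(p) + S(p)\,\ddot x(p)$.

Multiplying with $S^{-1}(p)$ from the left we find

$\ddot x(p) + \overset{(3)}{\mathcal C}(p)\,\dot x^2(p) = 0$,

where we have written (see 472)

$S^{-1}\overset{(3)}{S} = g^{-1}\overset{(3)}{C} = \overset{(3)}{\mathcal C}$.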
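The step from (1) to the equation containing the Christoffel symbols can be checked numerically in a familiar special case. The sketch below is only an illustration under stated assumptions: the straight representation is taken to be a Cartesian plane, the curved one plane polar coordinates (for which the non-vanishing Christoffel symbols are the standard ones, -r and 1/r), and the particular straight orbit and the finite-difference step are arbitrary choices.

```python
import math

def polar(p, x0=1.0, y0=0.0, ux=0.3, uy=1.0):
    """Uniform straight motion x' = x0 + ux p, y' = y0 + uy p in polar coordinates (r, phi)."""
    x = x0 + ux * p
    y = y0 + uy * p
    return math.hypot(x, y), math.atan2(y, x)

def derivatives(p, h=1e-3):
    """Central-difference first and second derivatives of (r, phi) with respect to p."""
    r_m, phi_m = polar(p - h)
    r_0, phi_0 = polar(p)
    r_p, phi_p = polar(p + h)
    first = ((r_p - r_m) / (2 * h), (phi_p - phi_m) / (2 * h))
    second = ((r_p - 2 * r_0 + r_m) / h**2, (phi_p - 2 * phi_0 + phi_m) / h**2)
    return (r_0, phi_0), first, second

def geodesic_residual(p):
    """Left-hand sides of the polar form of the equation of motion; both should vanish."""
    (r, _), (r_dot, phi_dot), (r_ddot, phi_ddot) = derivatives(p)
    res_r = r_ddot - r * phi_dot**2                    # Christoffel symbol -r
    res_phi = phi_ddot + 2.0 * r_dot * phi_dot / r     # Christoffel symbol 1/r (twice)
    return res_r, res_phi

if __name__ == "__main__":
    print(geodesic_residual(0.5))
```

The second derivatives of r and φ are not zero along the orbit, yet the combinations with the Christoffel terms vanish to within the accuracy of the finite differences, which is the content of the relation derived above.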

2. LORENTZ INVARIANCE OF GEODETIC ORBITS

358. Generalizing the above result we may suppose that the equation of motion of a free particle in an inhomogeneous region can be written

$\ddot x + \overset{(3)}{\mathcal C}\,\dot x^2 = 0$.   (4)

The latter relation is Lorentz invariant, as we show presently. Taking a number of particles in the vicinity of a point $x_0$, their orbits obey the law

$\ddot\xi_k + \overset{(3)}{\mathcal C}_k\,\dot\xi_k^2 = 0$,   k = 1, 2, ...   (5)

with

$\overset{(3)}{\mathcal C}_k = \overset{(3)}{\mathcal C}(\xi_k)$,

where we have put

$x_k(p) = x_0 + \xi_k$.

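Submitting the orbits in the vicinity of $x_0$ to a Lorentz transformation we obtain deformed orbits with coordinates

$x_k^*(p) = x_0^* + \xi_k^*$,

where according to the definition of the Lorentz transformation (see 351 eq. (28))

$\xi_k^* = M_q\{\xi_k + \tfrac12(\overset{(3)}{\mathcal C} - \overset{(3)}{\bar{\mathcal C}})\,\xi_k^2\} + \text{terms of the order of } \xi_k^3$,

where

$\overset{(3)}{\bar{\mathcal C}} = g^{-1}\overset{(3)}{\bar C}$   and   $\overset{(3)}{\bar C} = [M, \overset{(3)}{C}(x^*)]$.

Differentiating the above relation twice with respect to p we find

$\ddot\xi_k^* = M_q\{\ddot\xi_k + (\overset{(3)}{\mathcal C} - \overset{(3)}{\bar{\mathcal C}})\,\dot\xi_k^2\} + \text{terms of the order of } \xi_k$.   (6)

We have included into "terms of the order of $\xi_k$" also terms of the form $\overset{(3)}{\mathcal C}\,\xi_k\,\ddot\xi_k$; the latter terms can be taken to be small compared with the terms not containing $\xi_k$, as we suppose the Lorentz transformation to apply only to a small vicinity of $x_0$. The measures of the velocities $\dot\xi_k$ or of the accelerations $\ddot\xi_k$ need not be small. According to the definitions we find

$\overset{(3)}{\mathcal C}{}^*\,\dot\xi_k^{*2} = M_q\,\overset{(3)}{\bar{\mathcal C}}\,\dot\xi_k^2$

and therefore it follows from (6)

$\ddot\xi_k^* + \overset{(3)}{\mathcal C}{}^*\,\dot\xi_k^{*2} = M_q(\ddot\xi_k + \overset{(3)}{\mathcal C}\,\dot\xi_k^2)$.

Thus it follows from (5) that

$\ddot\xi_k^* + \overset{(3)}{\mathcal C}{}^*\,\dot\xi_k^{*2} = 0$.

Thus the relation (4) is Lorentz invariant in the sense of 351.

a. GEODETIC ORBITS AND THE LORENTZ PRINCIPLE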

359. The equation of motion (4) has still another aspect. It follows from the dynamical part of the Lorentz principle that a system Ω subject to an adiabatic interference changes into a Lorentz deformed system Ω*. If there is no interference at all, then a particle with some initial velocity v will shift during a time t by an amount vt. The latter shift can also be considered a "spontaneous" Lorentz deformation. The analogue of this process is the drift of a system in an inhomogeneous region. The motion in accord with the equation of motion (4) can be taken as a sequence of infinitesimal spontaneous Lorentz deformations; the orbit of the particle in a gravitational field can thus be regarded as a sequence of such Lorentz deformations.

B. EQUATION OF MOTION IN A GRAVITATIONAL FIELD

360. We investigate presently a few properties of the equations of motion. In these investigations we suppose g to be known. The connection between g and the gravitational field will be discussed further below.

An integral of the equation of motion (4) can be obtained as follows. Multiplying (5) from the left by $\dot\xi(p)\,g(p)$ we obtain an expression which is a total differential, as can be shown using the explicit expression for the Christoffel symbols. Indeed, one finds

$\dot\xi(p)\,g(p)\,\ddot\xi(p) + \dot\xi(p)\,\overset{(3)}{C}(p)\,\dot\xi^2(p) = \frac12\frac{d}{dp}\bigl(\dot\xi(p)\,g(p)\,\dot\xi(p)\bigr)$.   (7)

Comparing (5) and (7) we find as an integral of the equations of motion

$\dot\xi(p)\,g(p)\,\dot\xi(p) = \text{constant}$.   (8)
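The integral (8) lends itself to a numerical check. The following sketch is an illustration only and rests on assumptions not made at this point of the text: the propagation tensor is taken in the diagonal form diag(1, 1, 1, -c²(r)) that is introduced later as (18), with c²(r) = c₀² - 2α/r; the motion is restricted to a plane; the numerical values of c₀² and α, the starting conditions and the Runge-Kutta step are arbitrary.

```python
import math

C0_SQ = 100.0   # assumed value of c^2 far from the source (arbitrary units)
ALPHA = 1.0     # assumed value of alpha = M G (arbitrary units)

def c_sq(x, y):
    return C0_SQ - 2.0 * ALPHA / math.hypot(x, y)

def grad_c_sq(x, y):
    r3 = math.hypot(x, y) ** 3
    return 2.0 * ALPHA * x / r3, 2.0 * ALPHA * y / r3

def rhs(s):
    """Equation of motion (4) for g = diag(1, 1, -c^2(r)) with p as parameter."""
    x, y, t, xd, yd, td = s
    gx, gy = grad_c_sq(x, y)
    xdd = -0.5 * gx * td * td
    ydd = -0.5 * gy * td * td
    tdd = -(xd * gx + yd * gy) * td / c_sq(x, y)
    return (xd, yd, td, xdd, ydd, tdd)

def shifted(s, k, f):
    return tuple(si + f * ki for si, ki in zip(s, k))

def rk4_step(s, h):
    k1 = rhs(s)
    k2 = rhs(shifted(s, k1, h / 2))
    k3 = rhs(shifted(s, k2, h / 2))
    k4 = rhs(shifted(s, k3, h))
    return tuple(si + h / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def theta(s):
    """The quadratic form x_dot g x_dot appearing in (8)."""
    x, y, t, xd, yd, td = s
    return xd * xd + yd * yd - c_sq(x, y) * td * td

def integrate(s, h=1e-3, steps=5000):
    for _ in range(steps):
        s = rk4_step(s, h)
    return s

START = (1.0, 0.0, 0.0, 0.0, 1.0, 1.0)   # x, y, t and their p-derivatives

if __name__ == "__main__":
    end = integrate(START)
    print(theta(START), theta(end))
```

Along the computed orbit the particle moves a considerable distance, while the value of the quadratic form stays constant to within the integration error.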

In particular, if the constant is taken to be equal to zero, then we obtain the relation valid for the propagation of signals of light. Therefore the equation of motion (4) can be taken to be valid not only for particles, but also for the orbits of signals of light.

1. VARIATIONAL PRINCIPLES

361. The equation of motion (4) was obtained by considerations in the course of which terms of higher orders had to be neglected. Nevertheless, relation (4) can be taken to be an exact relation in the following sense. Describing an orbit by a four-coordinate x(p) we may consider the orbits obeying the following variational principle

$\delta\int\dot x(p)\,g(p)\,\dot x(p)\,dp = 0$.   (9)

The Eulerian equations which can be derived from the above variational principle can be written

$\frac{\partial\theta}{\partial x} - \frac{d}{dp}\,\frac{\partial\theta}{\partial\dot x} = 0$   (10)

with

$\theta(p) = \dot x(p)\,g(p)\,\dot x(p)$.   (11)

As the result of a short calculation one finds that (10) and (11) reduce to relation (4). Thus the equation of motion (4) can also be replaced by the variational principle (9). We note that the variational principle (9) defines orbits in an invariant way, i.e. independent of the representation. Indeed, the transformation relations for g(x) are so defined that, subjecting x and g(x) simultaneously to a coordinate transformation, the numerical values of θ(p) remain unchanged. Thus the solutions of (9), and therefore also those of (4), define definite orbits independent of representation. Solving (4) and (9) in different representations, we obtain the parametric representation of the same orbits relative to various systems of reference. In the above sense the relations (4) appear to be exact relations in spite of the circumstance that in the course of their derivation higher order terms had been neglected.

a. DEVIATIONS FROM GEODETIC ORBITS

362. The fact that higher order terms have been neglected in the equation of motion (4) leads nevertheless to ambiguities in the following way. The geodetic orbits which can be determined in an unambiguous manner must be supposed to be the orbits of sufficiently small and connected objects. If we consider e.g. a planet, then we find that the geodetic orbit gives an excellent first approximation of its real orbit. However, taking the problem more strictly, we have also to consider the deformations the body of the planet suffers while it is moving from one gravitational surrounding to another, i.e. we have to consider the tides of the body. The tides themselves produce perturbations of the orbit and therefore the real orbit of the planet will deviate to a small extent from the geodetic orbit. The amount of this deviation is determined by the mechanical properties of the planetary body, i.e. by the measure of the distortions which arise while the planetary body continuously adjusts itself to its gravitational surrounding. Therefore the deviations of the real orbit from the geodetic orbit depend on the internal mechanical structure of the planet. Considering the motion of the planet as a sequence of Lorentz deformations in the manner explained in 359, we can say that the higher order terms of the Lorentz transformation express the deformations of the moving planet. We see therefore that the theoretically expected deviations of the orbit of a real planet from the geodetic orbit are connected with the ambiguous higher order terms of the Lorentz transformation.

2. THE PHYSICAL CONTENTS OF THE VARIATIONAL PRINCIPLE

363. So as to make clear the physical significance of (9) we write it down in a slightly modified form. We introduce the vector

$\dot x/\dot x_4 = (w, 1)$,

where w is the velocity of the centre of Ω. We can thus write

$\theta = \bigl((w + v)G(w + v) - c^2\bigr)\,\dot x_4^2$,

where

$\dot x g\dot x = -A^2 = \text{constant}$,

therefore we find

$\dot x_4 = \frac{A}{\sqrt{c^2 - (w + v)G(w + v)}}$   (12)

(we note that for G = 1, v = 0 we have $\dot x_4 = A/\sqrt{c^2 - w^2}$).

The value of A depends on the scale we choose for p. Let us write $A = m_0$, where we take $m_0$ to be the rest mass of Ω in suitable units and the scale of p to be chosen accordingly. We find thus

$\theta\,dp = (K - U)\,dt$,   (13a)

$K = \tfrac12\,m\,(v + w)G(v + w)$,   (13b)

where

$m = \frac{m_0}{\sqrt{1 - (v + w)G(v + w)/c^2}}$.   (13c)*

Let us take K to represent the kinetic energy, U the potential energy and m the mass of Ω; we can write K - U = L, where L is the Lagrangian of Ω. In place of (9) we can also write

$\delta\int L\,dt = 0$.   (14)

We see thus that (9) corresponds to the Hamilton principle for the motion of the centre of Ω.

* We note as a curious feature of equations (13) that, although they have been derived from the general relativistic equations of motion, they have a remarkable similarity to the non-relativistic expressions. In particular v + w is the velocity of the particle relative to the ether, and thus (13b) gives an expression very similar to the classical expression for kinetic energy.

We note that the inner structure of the system does not appear in the derivation and therefore we conclude that the orbit of any small closed system is expected to obey the same law. This result reflects on the principle of the equivalence of gravitational and inertial mass.

364. The variational principle can also be written in a different manner. Making use of (13) and remembering $dp = dt/\dot x_4$ we find $\dot x g\dot x\,dp = -(A^2/\dot x_4)\,dt$ and have

$\delta\int\sqrt{c^2 - (w + v)G(v + w)}\;dt = 0$.

Further writing

$d\tau = \sqrt{c^2 - (v + w)G(v + w)}\;dt$

we have thus

$\delta\int d\tau = 0$.   (15)

τ can be taken to be the proper time of the system and therefore the latter relation has the form of Fermat's principle in optics. The path followed by the system Ω is that along which the proper time is stationary.

365. While (9) can be naturally interpreted as a Hamilton or as a Fermat principle, the usual interpretation given to this relation, i.e. that the system moves along a "geodetic line", seems to us very artificial. We may of course "define" the orbits obeying (9) as four-dimensional geodetic lines; however, it seems to us that it is a play with words if we suppose the geodetic line to be a "straight line in four dimensions". The orbits satisfying (9) have specified variational properties and they can be taken to be a particular type of orbit; however, whatever we suppose them to be, they are certainly not "straight". As we shall see later, the solutions of (9) include among others the Kepler ellipses along which planets move. If we call those orbits "straight", then we lose completely the meaning of what is usually called straight. The satisfactory procedure seems to be to suppose that the propagation of a light signal in vacuum is in a good approximation straight and that an orbit deviating strongly from that of a light signal is noticeably curved. With the help of signals of light we can also construct an almost straight system of reference, and orbits in such a system described by x'(p) = constant can be regarded as defining almost straight lines. In the presence of a gravitational field, when $\overset{(4)}{R} \neq 0$, the concept of the exactly straight line is meaningless. However, we are in no need of such a concept, as we can construct satisfactory systems of reference without having first introduced the concept of a "straight line" or of a "geodetic line", as was shown in detail in 334.

C. CONNECTION BETWEEN A GRAVITATIONAL FIELD AND THE PROPAGATION OF LIGHT

366. Relation (4) can be used to obtain the orbit of a small closed system, e.g. a planet, in a region where the propagation tensor g(x) is known. We supposed from the beginning of our arguments that the propagation of light becomes inhomogeneous in a region where there is a gravitational field. Thus relation (4) gives the equation of motion of a mass in a gravitational field, where the field is characterized by the propagation tensor g(x). Since the motions of the planets are well known, the relation (4) can be used to determine the propagation tensor g(x) in a known gravitational field. Indeed, in the vicinity of the Sun, using a nearly straight representation, we can take the gravitational potential to be given by

$\Phi(r) = -\frac{\alpha}{r}$,   $\alpha = MG$,   (16)

where M is the mass of the Sun and G the gravitational constant. The values of g(x) in the vicinity of the Sun must be such that (4) reduces, at least in a good approximation, to Newton's laws of motion.

1. THE EQUATIONS OF MOTIONS IN A GRAVITATIONAL FIELD

367. So as to carry out the comparison between (4) and the laws of planetary motion, it is convenient to eliminate the parameter p from the system (4) and to obtain a set of equations with $t = x_4(p)$ as independent variable. For this purpose we remember that

$\frac{d}{dp} = \dot x_4\,\frac{d}{dt}$.

With the help of the above relation we can eliminate p from the equations of motion, and we get an equation of the form

$\dot v = A + \overset{(2)}{A}\,v + \overset{(3)}{A}\,v^2$,   (17)

where the coefficients $A$, $\overset{(2)}{A}$, $\overset{(3)}{A}$ can be obtained from the elements of $\overset{(3)}{\mathcal C}$. Returning to the problem of determining g(x) we may suppose that

$g(x) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -c^2(r) \end{pmatrix}$.   (18)

Introducing (18) into the expressions 472 of Appendix II defining the Christoffel bracket symbols, we find

$c^2(r)\,\overset{(3)}{\mathcal C}_{44k} = \overset{(3)}{\mathcal C}_{k44} = \tfrac12\,\mathrm{grad}_k\,c^2(r)$,   k = 1, 2, 3.   (19)

Introducing (19) into (17) and using the notation dr/dt = v we obtain the following equation

$\dot v = -\tfrac12\,\mathrm{grad}\,c^2(r) + \frac{(v\,\mathrm{grad}\,c^2(r))\,v}{c^2(r)}$,   (20)

where we have denoted with a dot the derivative with respect to time. The second term in (20) is much smaller than the first if we consider velocities $v \ll c(r)$. Neglecting terms in $v^2/c^2$, (20) obtains the form of Newton's equation of motion if we suppose

$c^2(r) = \text{constant} + 2\Phi(r)$.   (21)

In particular we can write

$c^2(r) = c^2 - \frac{2\alpha}{r}$,   (22)

where c is the velocity of light at large distances from the Sun.

368. Thus choosing g(x) in the form (18) with c(r) given by (22), the equation of motion of a planet is obtained as

$\dot v = -\frac{\alpha\,r}{r^3} + \frac{2\alpha\,(v r)\,v}{c^2 r^3}$,   (23)

where we have written c for short in place of c(r). Neglecting terms in $v^2/c^2$ in (23) we obtain Newton's equations of motion, and their solutions are the Kepler ellipses.

Considering the terms in $v^2/c^2$ we obtain corrections, and these corrections can be taken as the relativistic perturbations of the motion of the planets we have already mentioned in 316.
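The size of the velocity-dependent term in (23) relative to the Newtonian term can be estimated with a few lines of code. The sketch below is an illustration under stated assumptions: the motion is taken in a plane, and the numerical values (the product MG for the Sun, the velocity of light, and a distance and velocity of roughly the size of those of Mercury) are inserted by way of example and are not taken from the text.

```python
import math

ALPHA = 1.32712440018e20   # assumed value of alpha = M G for the Sun, m^3/s^2
C = 299_792_458.0          # velocity of light, m/s
R = 5.79e10                # assumed distance from the Sun, m
V = 4.74e4                 # assumed orbital velocity, m/s

def newtonian(r_vec, alpha=ALPHA):
    """First term of (23): -alpha r / r^3."""
    rx, ry = r_vec
    r = math.hypot(rx, ry)
    return -alpha * rx / r**3, -alpha * ry / r**3

def acceleration(r_vec, v_vec, alpha=ALPHA, c=C):
    """Right-hand side of (23): -alpha r / r^3 + 2 alpha (v r) v / (c^2 r^3)."""
    rx, ry = r_vec
    vx, vy = v_vec
    r = math.hypot(rx, ry)
    factor = 2.0 * alpha * (vx * rx + vy * ry) / (c * c * r**3)
    nx, ny = newtonian(r_vec, alpha)
    return nx + factor * vx, ny + factor * vy

def relative_correction(r_vec, v_vec):
    """Magnitude of the second term of (23) divided by that of the first."""
    ax, ay = acceleration(r_vec, v_vec)
    nx, ny = newtonian(r_vec)
    return math.hypot(ax - nx, ay - ny) / math.hypot(nx, ny)

if __name__ == "__main__":
    print(relative_correction((R, 0.0), (V, 0.0)))   # radial motion
    print(relative_correction((R, 0.0), (0.0, V)))   # transverse motion
```

For purely radial motion the ratio equals 2v²/c², and for purely transverse motion the second term vanishes, so that for planetary velocities the correction is at most a few parts in a hundred million.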

2. INTEGRALS OF THE EQUATIONS OF MOTIONS

369. So as to integrate (23) we remember

$v r = r\dot r$   and   $v\dot v = \tfrac12\,\frac{d}{dt}v^2$.

Thus multiplying (23) by v we obtain

$\frac12\,\frac{d(v^2)}{dt}\,\frac{1}{1 - 2v^2/c^2} = -\frac{\alpha\,\dot r}{r^2}$.   (24)

Neglecting the dependence of c upon r we can integrate both sides of the above relation and we obtain the relativistic energy integral

$-\frac{c^2}{4}\ln\Bigl(1 - \frac{2v^2}{c^2}\Bigr) = \frac{\alpha}{r} + A$,

where A is the constant of integration. We may also write

$v^2 = \frac{c^2}{2}\Bigl\{1 - \exp\Bigl\{-\frac{4}{c^2}\Bigl(\frac{\alpha}{r} + A\Bigr)\Bigr\}\Bigr\}$.   (25)

Neglecting terms in $v^2/c^2$ the above relation reduces to the classical relation, i.e.

$v^2 = 2\Bigl(\frac{\alpha}{r} + A\Bigr)$.   (26)
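The energy integral leading to (25) can be verified by integrating (23) numerically. The sketch below is an illustration only: it treats c as a constant, as the text does at this step, restricts the motion to a plane, and uses arbitrary units and starting values together with a fourth-order Runge-Kutta scheme, none of which are taken from the text.

```python
import math

ALPHA = 1.0   # assumed alpha = M G in arbitrary units
C = 10.0      # assumed constant velocity of light in the same units

def rhs(s):
    """Equation of motion (23) written as a first-order system in the plane."""
    x, y, vx, vy = s
    r = math.hypot(x, y)
    k = 2.0 * ALPHA * (x * vx + y * vy) / (C * C * r**3)
    return (vx, vy, -ALPHA * x / r**3 + k * vx, -ALPHA * y / r**3 + k * vy)

def shifted(s, k, f):
    return tuple(si + f * ki for si, ki in zip(s, k))

def rk4_step(s, h):
    k1 = rhs(s)
    k2 = rhs(shifted(s, k1, h / 2))
    k3 = rhs(shifted(s, k2, h / 2))
    k4 = rhs(shifted(s, k3, h))
    return tuple(si + h / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

def energy_integral(s):
    """The constant A: -(c^2/4) ln(1 - 2 v^2/c^2) - alpha/r."""
    x, y, vx, vy = s
    v2 = vx * vx + vy * vy
    return -(C * C / 4.0) * math.log(1.0 - 2.0 * v2 / (C * C)) - ALPHA / math.hypot(x, y)

def integrate(s, h=1e-3, steps=5000):
    for _ in range(steps):
        s = rk4_step(s, h)
    return s

START = (1.0, 0.0, 0.3, 1.0)   # x, y, vx, vy

if __name__ == "__main__":
    end = integrate(START)
    print(energy_integral(START), energy_integral(end))
```

The value of A computed at the start and at the end of the integration agrees to within the numerical error, although both r and v change appreciably along the orbit.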

Multiplying (23) by r in a vectorial manner we obtain

$\frac{d}{dt}(r\times v) = \frac{2\alpha\,\dot r}{c^2 r^2}\,(r\times v)$;   (27)

from the above relation we see that the vector

$P = r\times v$   (28)

does not change its direction, therefore we can write in place of (27)

$\frac{\dot P}{P} = \frac{2\alpha\,\dot r}{c^2 r^2}$,

thus

$P = a\exp\Bigl\{-\frac{2\alpha}{c^2 r}\Bigr\}$,   (29)

where a is a constant of integration. From (28) and (29) it follows that the motion takes place in a plane perpendicular to a. Introducing polar coordinates r, φ in that plane we find

$v^2 = \dot r^2 + r^2\dot\varphi^2$,   $P = r^2\dot\varphi$,   (30)

and we can write making use of (30)

$\varphi = \int\frac{P}{r^2}\,dt = \int\frac{P\,dr}{r^2\,\dot r}$,   $\dot r^2 = v^2 - P^2/r^2$.   (31)

Introducing P from (29) and v from (25) we have, if we introduce further s = 1/r as a new variable of integration,

$\varphi = \int^{1/r}\frac{a\,ds}{\sqrt{\tfrac{c^2}{2}\bigl(\exp\{4\alpha s/c^2\} - \exp\{-4A/c^2\}\bigr) - a^2 s^2}}$.

Developing into powers of $1/c^2$ we obtain

$\varphi = \int^{1/r}\frac{a\,ds}{\sqrt{2A + 2\alpha s - a^2 s^2 + \tfrac{1}{c^2}\{4\alpha^2 s^2 - 4A^2\} + \dots}}$

+ terms of higher order.

a. PERIHELION MOTION

370. Neglecting the terms of higher order we obtain a motion with a period

$\frac{2\pi a}{\sqrt{a^2 - 4\alpha^2/c^2}} \approx 2\pi\Bigl(1 + \frac{2\alpha^2}{c^2 a^2}\Bigr)$,

thus the shift of the perihelion per revolution is given by

$\frac{4\pi\alpha^2}{c^2 a^2}$.

However, we can write

$\alpha = \bar v^2\,\bar r$,   $a = \bar v\,\bar r$,

where $\bar v$ is the average velocity and $\bar r$ the average radius vector; therefore we find

$\Delta\varphi = 4\pi\,\frac{\bar v^2}{c^2}$.   (32)
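To get a feeling for the magnitude of (32), the shift can be converted into seconds of arc per century. The sketch below is an illustration only; the mean orbital velocity and the period of revolution of Mercury used in it are assumed round values, not figures from the text, and the numerical factor in front of π is kept as a parameter so that other values of that factor can be tried.

```python
import math

C = 299_792_458.0            # velocity of light, m/s
V_MERCURY = 4.736e4          # assumed mean orbital velocity of Mercury, m/s
PERIOD_MERCURY_DAYS = 87.969 # assumed period of revolution of Mercury, days

def shift_per_revolution(v_mean, coefficient=4.0):
    """Perihelion shift per revolution in radians: coefficient * pi * v^2 / c^2."""
    return coefficient * math.pi * (v_mean / C) ** 2

def arcsec_per_century(v_mean, period_days, coefficient=4.0):
    """Accumulated shift over one Julian century (36525 days) in seconds of arc."""
    revolutions = 36525.0 / period_days
    radians = shift_per_revolution(v_mean, coefficient) * revolutions
    return math.degrees(radians) * 3600.0

if __name__ == "__main__":
    print(arcsec_per_century(V_MERCURY, PERIOD_MERCURY_DAYS))
```

With these assumed values the factor 4π of (32) gives a shift of roughly 27 seconds of arc per century; the result scales linearly with the numerical factor chosen.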

371. We note that in a treatment where we modify the classical one by considering the relativistic change of mass with velocity only, we obtain v

2

Aq>

=

2K •

c

-2 5 .

The shift of the perihelion according to the general theory of relativity as obtained by Einstein is

$\Delta\varphi = 6\pi\,\frac{\bar v^2}{c^2}$.   (33)

The latter value seems to be in accord with the observed motion of the planet Mercury. The result obtained from the assumed form (18) and (21) therefore gives the correct non-relativistic form of the planetary motion and also gives a relativistic correction which is at least in qualitative agreement with the observed deviation from the classical orbit. The value (33) for the perihelial motion of a planet could be obtained from our consideration if we were to add suitable higher order terms to the expression (22) giving the connection between the gravitational potential and the velocity of propagation of light. Such a procedure appears, however, completely arbitrary, and we conclude that from the theory in its form given so far only the order of magnitude of the perihelial motion of the planets can be obtained. To obtain precise results it is necessary to find the exact relation between the propagation tensor g and gravitational fields.

b. THE DEFLECTION OF LIGHT IN THE VICINITY OF THE SUN

372. Considering a signal of light passing the Sun, the deflection Δβ of its orbit from a straight line can be obtained using relation (30); we find

$\pi - \Delta\beta = \int_{-\infty}^{+\infty}\dot\varphi\,dt$.   (34)

Supposing the signal to move in a first approximation along a straight line which passes the Sun at a distance b with a constant velocity c, we can write

$r^2 = b^2 + c^2 t^2$   (35)

and considering (28)

Thus introducing (35) and (36) into (34) we find 2a

4J = -

2

cb

r

4a

dx

J (1 + — (30

2

3

T ) '

4

(37a)

Remember that the angular distance at which the ray appears to pass the Sun as seen from the Earth is ϑ = b/r, where r is the distance between Sun and Earth, and α = v²r, where v is the orbital velocity of the Earth. We find also

(37b)

Thus ϑ is the angular distance at which a star appears from the Sun and Δβ the apparent shift it suffers because of the deflexion. The value of the shift obtained from the general theory of relativity, which value seems to agree with the observed one, is twice the value thus obtained:

(37c)

We see that the simple assumption about g(x) leads to a qualitatively correct value of the deflexion of star light in the vicinity of the Sun.

c. THE RED SHIFT OF SPECTRAL LINES

373. We consider the adiabatic shift of an atom from a point with coordinate vector r to another point with coordinate vector r* = r + a. The atom in its original position should oscillate with some frequency ω = 1/T; this oscillation can be described schematically by supposing that the atom emits signals at times

$t_k = t + \tau_k$,   $\tau_k = kT$.

The emissions of signals are thus events with relative coordinates

$\xi_k = 0, 0, 0, kT$.

Shifting the atom adiabatically from r → r* we may carry out the shift in such a manner that the atom in its final position should be again at rest; thus we take the transformed coordinates to be given by

$\xi_k^* = 0, 0, 0, kT^*$.

Such a shift can be described by a Lorentz transformation of the form 351 eq. (28). We have thus

$\xi_k^* = M_q\{\xi_k + \tfrac12(\overset{(3)}{C}(r) - \overset{(3)}{\bar C}(r^*))\,\xi_k^2\}$.   (38)

However, it follows from the explicit expressions of 471 given for the Christoffel brackets that

$\overset{(3)}{C}_{444}(r) = \overset{(3)}{C}_{444}(r^*) = 0$.

Since the space components of $\xi_k$ are zero we find thus in place of (38)

$\xi_k^* = M_q\,\xi_k$.

Remembering further that the space components of $\xi_k^*$ should also be zero we can determine $M_q$ explicitly:

$M_q = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & c(r)/c(r^*) \end{pmatrix}$.   (39)

Thus taking the fourth component of (38) we find with the help of (39)

$T^* = \frac{c(r)}{c(r^*)}\,T$.

Considering the periods of oscillations we can also write ω : ω* = c(r) : c(r*). We see thus that if an atom is shifted about adiabatically, then its frequency will change in the course of the motion so that the ratio between the frequency of oscillation and the local velocity of light remains constant. In particular, if an atom is brought from the vicinity of the Earth to that of the Sun, its frequency is expected to decrease, since the velocity of propagation of light near the Sun must be supposed to be smaller than that near the Earth. The effect thus described is the second relativistic effect mentioned in 316, i.e. the gravitational red shift of spectral lines.
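The size of the effect can be estimated by combining the proportionality between frequency and local velocity of light with the expression (22) for c(r). The sketch below is an illustration only: the numerical values of MG for the Sun, of the solar radius and of the Sun-Earth distance are assumed values not given in the text, and the gravitational field of the Earth itself is left out of account.

```python
import math

ALPHA = 1.32712440018e20   # assumed value of alpha = M G for the Sun, m^3/s^2
C_INF = 299_792_458.0      # velocity of light far from the Sun, m/s
R_SUN = 6.957e8            # assumed radius of the Sun, m
R_EARTH_ORBIT = 1.495978707e11   # assumed Sun-Earth distance, m

def c_local(r):
    """Local velocity of light according to (22): c(r)^2 = c^2 - 2 alpha / r."""
    return math.sqrt(C_INF**2 - 2.0 * ALPHA / r)

def m44(r_from, r_to):
    """Time-time element of the matrix (39) for a shift from r_from to r_to."""
    return c_local(r_from) / c_local(r_to)

def frequency_ratio(r_from, r_to):
    """omega*/omega for an atom shifted adiabatically from r_from to r_to."""
    return c_local(r_to) / c_local(r_from)

def fractional_shift(r_from, r_to):
    return frequency_ratio(r_from, r_to) - 1.0

if __name__ == "__main__":
    print(fractional_shift(R_EARTH_ORBIT, R_SUN))
```

With these assumed values the frequency of an atom brought from the distance of the Earth to the surface of the Sun decreases by about two parts in a million. The element computed by `m44` also satisfies the 44-component of relation (27), since its square times c²(r*) equals c²(r).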

374. It must be emphasized that from the above result it follows not merely that the frequency of atoms decreases when brought from the Earth to the Sun. Indeed, atoms of a specified type have frequencies which are affected by the gravitational potential. If we have a set of atoms near a point r and another set of atoms of a similar type near another point r*, then the frequencies of the two sets will be ω and ω* respectively. If we transport an atom of the first set from r to r*, then it will change its frequency from ω to ω* in the course of its journey, and once it arrives at r* it will behave like those atoms which have been near r* all the time. So as to see this in more detail we remember that the frequencies of atoms can be calculated e.g. from the Schrödinger equation. The Schrödinger equation must be supposed to contain implicitly the gravitational potential in such a manner that the frequencies obtained from it depend to some extent on the gravitational potential, and thus its solutions give frequencies depending on the gravitational potential in accord with (38). When an atom is moved adiabatically, it adapts itself in the course of the motion to the changing gravitational potential, and this causes the change of frequency in the course of the displacement.

D. CONNECTION BETWEEN THE SOURCES OF GRAVITATION AND THE PROPAGATION TENSOR g

375. We have seen that the motion of the planets and the deviation of light in a gravitational field are obtained, at least to a first approximation, if we suppose

$g = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -c^2(r) \end{pmatrix}$   (40a)

and

$c^2(r) = c^2\bigl(1 + 2\Phi(r)/c^2\bigr)$,   (40b)

where Φ(r) is the gravitational potential. The latter relation is, however, valid in particular representations only. One can suppose, e.g., that (40a, b) hold in a nearly straight representation in which the sources of gravitation are at rest. So as to obtain a formulation valid for moving sources, it is necessary to express (40a, b) in a form valid independent of representation. The latter problem becomes particularly important if we consider problems of three or more bodies, where the sources of the field are in motion relative to each other.

1. EINSTEIN'S EQUATIONS OF GRAVITATION 376. On the basis of the Laplace-Poisson relation, which must be taken to be valid at least to a good approximation, it follows from (40a, b) (in an approximation neglecting higher order terms in l/c) 4nG S/8u=-y~Q,

(41)

where Q is the density of gravitating matter and G the gravitational constant. We may try to approximate the Laplace-Poisson relation (41), which is valid in particular representations only, by a relation independent of the representation. We show that it is possible to find an invariant relation between tensors only, which relation reproduces in a good approximation the relation (41). So as to obtain such a relation we note that the Christoffel symbols calculated from (40a) can be written

2 8x

dx

k

k

(42) £=1,2,3 all the other components are zero. The non-vanishing components of the Riemann-Christoffel tensor can thus be written (see 482 eq. (63)) (4)

(4)

(4)

1 dg 2 8x dx 2

1 4g

u

k

t

ti

C4)

Ög dx

u

k

Ög dx

(43)

u

t

Expressing g in terms of c(r) u

<> 4

d c(r) c(r)-r—^2

^ 4 4 « = -

k

(2)

fc,/=

cx ox

(4)

1,2,3.

(44)

t

The contraction R of R has thus components (2)

R

= c(r)V c(r) = - V g 2

U

2

c(r)

u

1 - —

2^44

ox ox, k

(grad

g i i

)

2

(45)

Taking thus only the highest order terms in c into consideration, the Laplace-Poisson equation (42) can be replaced by AnG

(2)

(47)

If we take (47) to be the 44-component of a covariant relation, we may thus write in free space

$$R^{(2)} = 0. \tag{48}$$

So as to obtain the covariant form of (47) in regions containing matter we remember that QC can be taken as the 44-component of the energy 2

(2)

momentum tensor T of matter; thus we might write (2)

(2)

R = - xT x

=

(49)

4nG

Relation (49) gives a possible relation between the sources of a gravitational (2)

field described by T and the propagation tensor g(x). 2. ENERGY MOMENTUM CONSIDERATIONS

377. From relation (49) it follows that

$$\operatorname{Div} T^{(2)} = -\frac{1}{\kappa} \operatorname{Div} R^{(2)}, \tag{49a}$$

thus in regions where $\operatorname{Div} R^{(2)} \neq 0$ the physical system possessing the energy-momentum tensor $T^{(2)}$ is not strictly conservative. Equation (49a) shows a certain amount of exchange of energy and momentum between the system and the gravitational field. The latter exchange, as will be shown further below, is in general a very weak one and it does not correspond to the action of gravitational forces of Newtonian type. Einstein was of the opinion that such an interaction does not exist and he suggested replacing (49) by

$$R^{(2)} - \tfrac{1}{2}\, g R = -\kappa T. \tag{50}$$

The difference between (49) and (50) is that the Div of the left-hand expression of (50) is identically zero (see Appendix II 484); therefore it follows from (50) that Div T = 0, a relation to be expected if T is the energy-momentum tensor of a closed system. Mathematically the relations (49) and (50) differ by very small amounts only, and the difference is too small to give effects that are observable at present. We note that for a region where T = 0 it follows from (50) that R = 0; therefore a difference between (49) and (50) is to be expected only in regions occupied by material systems. We come back to this question further below.

378. The question arises whether (50) is the only covariant generalization of (41) which is in accord with (49a). It was pointed out by Einstein that we can add to the left-hand side of (50) a term Λg without disturbing the relation (49). The relation thus obtained gives in a first approximation the same results as the relation without the Λ-term. We prefer to write the generalized relation by adding the new term to the right-hand expression. We thus find

$$R^{(2)} - \tfrac{1}{2}\, g R = -\kappa T - \Lambda g. \tag{51}$$

The latter relation can be interpreted by supposing

$$-\frac{\Lambda}{\kappa}\, g = T^{(0)} \tag{52}$$

to be the energy momentum tensor of the ether, which tensor signifies the state of stress of the ether.*

* The above energy momentum tensor is unusual in the following sense. Provided g has approximately the form (40a), it resembles the energy momentum tensor of a gas. The first three diagonal components represent the hydrostatic pressures, while the 44-component represents the energy density. In an ordinary gas one expects the sum of the first three diagonal elements of the energy momentum tensor, i.e. $T_{11} + T_{22} + T_{33}$, to be less than $-T_{44}/c^2$, while we have for the ether

1

5-u +

+ gs > 3

-gjc*.

Thus the hydrostatic pressure is larger than in an ordinary gas. The latter circumstance seems, however, of no importance, as there is no reason to expect the ether to have the same mechanical properties as material media consisting of atoms. We come back to this question in more detail further below.

The 44-component of (51) in empty regions can be written with the help of (45) V c(r) = M r ) .

(53)

2

A solution of the above relation which is regular at infinity can be written c(r) = c

0

^ - ^ .

(54)

Thus a Λ-term with Λ > 0 corresponds to a shielding-off effect of the ether.

379. Relation (51) contains the derivatives of g only up to the second order. Adding to (51) terms containing the $R^{(k+2)}$, k = 3, 4, ..., we can construct covariant relations containing higher derivatives of g. There is no valid reason to suppose that the exact relation between g and its sources should not contain such higher terms. Indeed, under circumstances where observational data are available, the invariants constructed with the help of derivatives of g higher than the second appear to be too small to give observable effects. It seems thus very likely that the real connection between g and its sources differs from (51), which gives merely the relation obtained if the exact relation is expanded in terms of the derivatives of g and terms containing third and higher derivatives are neglected. One might suppose that the effects of higher order terms could be noticeable if more information were available about the properties of stellar objects with extreme densities. Supposing, however, that (51) represents only an approximate relation, it is very dangerous to extrapolate the results obtained from (51) to too large distances or to too long intervals of time. The danger is of the same type as if we took the approximate formula $e^{-x} \approx 1 - x$, valid for small values of x, and tried to apply it for large values of x. Some of the paradoxes which arise when applying Einstein's equations to cosmical problems might be explained in the above way.
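The danger of extrapolating a truncated expansion can be made concrete with the formula quoted above. The short sketch below compares 1 − x with the exact exponential for small and large x; the function names and the sample values of x are chosen for this illustration only.

```python
import math

def linear_approx(x):
    """First-order expansion of exp(-x), good only for small x."""
    return 1.0 - x

def relative_error(x):
    """Relative deviation of the truncated formula from the exact value."""
    exact = math.exp(-x)
    return abs(linear_approx(x) - exact) / exact

for x in (0.01, 0.1, 1.0, 5.0):
    print(x, linear_approx(x), math.exp(-x), relative_error(x))
```

For small x the two expressions agree closely, while for large x the truncated formula even gives the wrong sign, which is the kind of failure the text warns against when an approximate field equation is applied over very large distances or times.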

3. THE SCHWARZSCHILD SOLUTION OF THE GRAVITATIONAL EQUATIONS

380. So as to discuss the planetary motion in terms of the general theory of relativity, it is necessary to determine the gravitational field of a central body from Einstein's gravitational equation. One can suppose from considerations of symmetry that, at least in an almost straight representation, the field of a small central body should be spherically symmetric, i.e. we may suppose

$$g(x) = g(r) \quad \text{if} \quad r \gg a,$$

where a is a length of the order of the dimension of the central body. Furthermore, supposing the central body to be at rest, we may suppose

$$g_{k4}(x) = 0.$$

The simplest supposition for g(x) would be that it is of the form given in eq. (40a) of 375. However, a g(x) having this simple form cannot satisfy Einstein's equations. Schwarzschild has shown that Einstein's equations can be exactly satisfied by a spherically symmetric g(x) which corresponds to a locally anisotropic mode of propagation of light.

381. So as to obtain Schwarzschild's solution we may suppose that the velocity of propagation in the radial direction differs from that in the tangential direction. Such a mode of propagation can be written down in polar coordinates r, ϑ, φ. We may thus write for the relation giving the propagation of light in terms of polar coordinates

$$A\,dr^2 + r^2\,d\vartheta^2 + r^2 \sin^2\vartheta\, d\varphi^2 + B\,dt^2 = 0, \tag{55}$$

supposing that A and B are functions of r only. Thus we suppose g(x) to be given by

$$g(x) = \begin{pmatrix} A & 0 & 0 & 0 \\ 0 & r^2 & 0 & 0 \\ 0 & 0 & r^2 \sin^2\vartheta & 0 \\ 0 & 0 & 0 & B \end{pmatrix}. \tag{56}$$

The components of $R^{(2)}$ can be obtained by inserting the above expressions into eq. (12) of 331 and eqs. (63)-(67) of 482-84 of Appendix II. We find that among the non-diagonal elements of $R^{(2)}$ only $R^{(2)}_{12}$ contains non-vanishing terms; the latter, however, cancel each other identically. The

(2)

diagonal elements of R are obtained as > 2 A' B" B' * n = - - — + — - 2&A ( ' (2

A B

(2) ^22=-r—

A

>

AB

2

- 2 + — + rB'jAB,

A

(2)

+ ')

A

22

«

(57b)

(57c)

2

R

(57a)

(2)

= sin &R , (2)

>

2

B» = ^

+

B'

B'

7 ^ - ^ B ^ B

+

A B ' ) .

(57d)

Writing according to (50) and (51) (2)

R = Ag

we find from (57a) and (57d) with the help of (56) (2)

BR

n

7 = —— (A'B + AB') = 0,

(2)

- AR

it

rAíS

thus we have A'B + AB' = 0 or

(58) A = y/B,

where y is a constant of integration. Introducing (58) into (57a) and (57d) we find ylA = B=y[\-^

+ ^kr''

The above expression is found to satisfy identically (57b) and (57a). It represents the so-called Schwarzschild solution of the gravitational equations. The constant of integration y is connected with the velocity of light, we may therefore also write y=-\lc\

(59)

Thus relation (55) written explicitly has the form dr

2

1 1

2 0 1

+ r\d&

2

+

sm -dd

2

1 2

1

kr

1

r

3 c |l2

— r

-~kr \dt 2

= Ü.

2

3

(60)

4. THE RELATIVISTIC EFFECTS IN THE FIELD GIVEN BY SCHWARZSCHILD

a. THE PLANETARY MOTION

382. The Christoffel brackets can be obtained from (56) and the equation of motion of a planet in the field of a gravitating mass at r = 0 can be written OL

{

CL

\

2CC

Integrating the above relation in a manner analogous to the procedure given in 369, we obtain eventually for the differential dtp ds

dtp =

Jc l2a 2

2

exp {4asjc } 2

- exp j - 4/c

2

2c

2

Integrating over a whole period, we obtain when neglecting terms of higher order in l/c 2

P c where v is the average orbital velocity of the planet and Atp is the angular shift of perihelion in the course of one revolution. The above result was obtained by Einstein and it is supposed to be in agreement with the astronomical data obtained for the motion of the planet Mercury. b. DEFLECTION OF LIGHT

383. Repeating the calculations of 372 we obtain for the deflection of a signal of light starting from the equation of motion (61) instead of (23) in 368 the value of the deflection 8a where b is the distance of the closest approach of the signal. Thus from the Schwarzschild solution we obtain twice the deflection which we obtained from the simple theory. The reason for the factor two thus obtained is that the Schwarzschild solution of the gravitational field describes a state of the ether where not only the density varies with r, but also the ether is in a state of radial stress giving rise to an anisotropy of propagation. The latter anisotropy increases the deflection of light in the vicinity of a gravitating mass by a factor two. 384. The red shift of spectral lines in the vicinity of a gravitating centre retains its value whether one uses for g the simple form (18) of 367 or eq. (56) of 381. Thus the red shift of spectral lines is not affected by the anisotropic mode of propagation of light in the vicinity of a gravitating centre. E. ELECTROMAGNETIC FIELD AND GRAVITATION 385. The electromagnetic phenomena are described in homogeneous regions by Maxwell's equations on account of the phenomena given in chapts VIII and IX.

The relations valid in inhomogeneous regions can be obtained by making use of the procedure discussed in the beginning of chapter X. It is important to remark that the generalization of Maxwell's equations for inhomogeneous regions cannot be uniquely determined from mathematical considerations only. We can postulate sets of equations which represent such generalizations, and it will always be a question of experiment to decide whether or not a particular mode of generalization leads to a correct description of the phenomena. We note that in formulating Maxwell's equations for inhomogeneous regions we really make assumptions about the role of g in the equations. Thus such a formulation in effect amounts to a hypothesis as to the effect of the gravitational field upon electromagnetic phenomena.

1. AN INVARIANT FORMULATION

386. A particular assumption as to the form of Maxwell's equations in an inhomogeneous region is obtained, if we start from Maxwell's equations as written down in chapt. VIII 266 in terms of four-tensors. Defining suitably the differential operators these relations are also valid in arbitrary curved representations. One may suggest that the relations thus obtained are also valid in inhomogeneous regions. We may thus write for Maxwell's equations in inhomogeneous regions V¥ = — 47cJ

eff

Div T = 0 DivJ,EFL

(62)

0

j = J + DÍV n eff

F = Rot ¥

Alternatively Div F = 47tJ

eff

(63) Div F = 0 DivJ

eff

= 0.

The operators Div, Rot, L have to be taken in accord with Appendix II. Indeed, the set of equations thus obtained contains Maxwell's equations in homogeneous regions. In inhomogeneous regions using an almost straight

representation the relations (62) or (63) are very similar to the relations valid in homogeneous regions, the gravitational effect upon the electromagnetic field being of the order of the elements of $R^{(4)}$.

2. QUESTION OF ELECTROMAGNETIC POLARIZATION OF THE ETHER

387. Relations (62) or (63) thus obtained are, however, by no means the only possible mathematical generalizations of Maxwell's equations; e.g. one could add to the four-current any four-vector which is free of divergence and which disappears in the homogeneous case. One might e.g. replace J by

$$J_{\mathrm{eff}} = J + \Lambda' F \cdot \left( R^{(2)} - \tfrac{1}{2}\, g R \right). \tag{64}$$

The second term in (64) could be taken as the four-current corresponding to the polarization of the ether caused by the electromagnetic field. Whether or not such a polarization term appears is a question which could only be decided by experiment. Such an experiment is rendered difficult, as it follows from the gravitational equations that in free space

$$R^{(2)} = 0, \qquad R = 0.$$

Thus the extra term of the form (64) gives effects only for electromagnetic waves propagated through matter. The ordinary interaction of light with matter is, however, so strong that a background caused by the Λ'-term in (64) could hardly be detected under normal experimental conditions.

388. A possible mode of detection of the Λ'-term could be found by observing the scattering of light on light. Indeed, in a region containing a radiation field, the energy momentum tensor is different from zero and thus $R^{(2)}$ and R also differ from zero. Thus the effects of the Λ'-term could be felt when observing e.g. radiation of high density. Since g is contained implicitly in Maxwell's equations, a scattering of light on light is to be expected independently of whether there exists a Λ'-term. From the measure of such scattering, information about the Λ'-term could be obtained, at least in principle.

3. REMARK ON THE CONSISTENCY OF THE GENERALIZED THEORY OF ELECTROMAGNETIC FIELDS

389. Summarizing our considerations concerning gravitational effects: we started by introducing methods of constructing systems of reference with the help of signals of light. We formulated Maxwell's equations by using the systems of reference thus obtained.

A necessary check of the consistency of the theory is to show that the propagation of signals of light as obtained from the generalized formulation of Maxwell's equations is consistent with the phenomenological assumptions regarding the mode of propagation of light, upon which all further considerations were based. Indeed, when introducing systems of coordinates we supposed that beams of light obeyed the relation

$$\dot{x}(p)\, g(x(p))\, \dot{x}(p) = 0;$$

that the latter assumption is indeed consistent with Maxwell's equations as formulated in 386 follows from considerations analogous to those valid for homogeneous regions; these considerations were given in chapter VIII 272. From the generalized Lorentz principle it follows further that a beam of light follows a geodetic zero orbit if it obeys the relation

$$\ddot{x}(p) + \Gamma^{(3)}(x(p))\, \dot{x}(p)^2 = 0.$$

That the orbits of beams of light obey the above relation could be proved most convincingly if one were to succeed in giving, in inhomogeneous regions, the solutions of Maxwell's equations in terms of retarded potentials in a manner resembling that given explicitly in homogeneous regions. The latter procedure meets with difficulties, and therefore one may instead use a method well known in classical optics. Making the transition from wave optics to geometrical optics, one can determine with the help of the eikonal the orbits of beams in anisotropic media. The generalized Maxwell equations have a form analogous to the wave equations in an inhomogeneous medium. Therefore, considering the ether as an optically anisotropic medium, the optical properties of which are determined by the propagation tensor g, we can determine the orbit of a beam of light with the help of the methods of classical optics. Such calculations were given by Laue and they prove that in a first approximation beams of light which can be derived from the wave equations indeed follow geodetic zero orbits. From these results we conclude that Maxwell's equations as formulated in 386 are indeed consistent with the generalized Lorentz principle. The question as to the existence or magnitude of the non-linear effects caused by the polarization of the ether cannot be decided, as at present there exists no experimental evidence which gives information concerning such effects. In particular the question whether or not Λ' = 0 cannot at present be decided experimentally.

F. ENERGY AND MOMENTUM RELATIONS OF THE GRAVITATIONAL FIELD

390. Einstein's equations of the gravitational field given in eq. (50) can also be regarded in a different manner. Supposing

$$-\frac{\Lambda}{\kappa}\, g = T^{(0)}, \tag{65}$$

where $T^{(0)}$ may be regarded as the energy momentum tensor of the ether, we can also suppose

$$\frac{1}{\kappa} \left( R^{(2)} - \tfrac{1}{2}\, g R \right) = T^{(g)} \tag{66}$$

or

$$R^{(2)}/\kappa = T^{(g)} \tag{67}$$

to be the energy momentum tensor of the gravitational field, and we can write in place of (50)

$$T^{(g)} + T^{(m)} + T^{(el)} = T^{(0)}, \tag{68}$$

where $T^{(m)}$ and $T^{(el)}$ are the energy momentum tensors of matter and of fields other than gravitational fields. From (65) it follows identically (see Appendix II)

$$\operatorname{Div} T^{(0)} = 0,$$

thus the total energy and momentum carried by the ether appear strictly conserved. The energy and momentum stored in the ether are equal to the sum of the energies and momenta of the various fields carried by the ether. The tensor $T^{(g)}$ differs from the others in that, according to (66) (or (67)), it can be expressed by g and its derivatives only. For this reason $T^{(g)}$ could also be regarded as a kind of potential energy and momentum, i.e. the part of the total energy and momentum density which is stored by the ether in the form of stresses and flows. The remaining part, i.e. $T^{(m)} + T^{(el)}$, may be taken as the part which appears as the energy and momentum of the fields carried by the ether.

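1. THE GRAVITATIONAL FORCE

391. If we take (67) as the definition of $T^{(g)}$, then we find

$$\operatorname{Div} T^{(g)} = \frac{1}{2\kappa} \operatorname{Div} gR$$

and according to Appendix II

$$\operatorname{Div} T^{(g)} = \frac{1}{2\kappa} \operatorname{Grad} R = f^{(g)}.$$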

The latter expression can be taken as a force density exerted by the gravitational field upon material systems, or it can be taken as the rate of energy and momentum transferred from the gravitational field to other fields. (We can take $T^{(m)}$ also to be the energy momentum tensor of the field of waves of matter.) Since R = 2k = constant, this interpretation can be maintained for regions where $T^{(m)} = T^{(el)} = 0$, i.e. for regions where fields other than the gravitational field do not exist.

392. The force density $f^{(g)}$ is one which is exerted by a gravitational field upon a material system. However, the latter force density is not the gravitational force appearing in Newton's theory; it is rather an internal force, which produces in a first approximation an internal stress only. This force is identically zero if $T^{(g)}$ is given by the expression (66) in accord with Einstein's hypothesis. The main part of the gravitational action appears in a different form and quite independently of whether the tensor $T^{(g)}$ is given by (66) or by (67).

2. ANOTHER ASPECT OF THE GRAVITATIONAL EQUATIONS

393. Expressing in the relation (65) the energy momentum tensors $T^{(0)}$ and $T^{(g)}$ in terms of g and its first and second derivatives, we obtain a system of equations which can be taken as equations of motion of the ether. Indeed, let us consider some representation K; we may give T(x) as a function of coordinates and time. If we give further

$$g_0(r) = g(r, 0),$$

i.e. the distribution of g(x) at t = 0, then the gravitational equations give a connection between the time derivatives of g(x) at t = 0 and the components $T_0(r) = T(r, 0)$ of the energy momentum tensor at t = 0. Since we have ten equations, we can determine in general ten time derivatives of g(x).

394. The Riemann-Christoffel tensor contains the $\partial^2 g_{kl}/\partial t^2$, k, l = 1, 2, 3, i.e. the second time derivatives of the $g_{kl}$. It does not contain $\partial^2 g_{44}/\partial t^2$ nor the $\partial^2 g_{k4}/\partial t^2$; thus the second time derivatives of the $g_{k4}$ or of $g_{44}$ do not appear explicitly in (65). The Riemann-Christoffel tensor does, however, contain in the non-linear terms $\partial g_{44}/\partial t$ and also $\partial g_{k4}/\partial t$ (k = 1, 2, 3), thus the first time derivatives of $g_{44}$ and the $g_{k4}$. We can thus determine from (65) the values of the second time derivatives of the $g_{kl}$ (k, l = 1, 2, 3) and the first time derivatives of the $g_{4\nu}$ (ν = 1, 2, 3, 4) at t = 0. Thus, integrating step by step, g(x) can be determined for times t > 0 from the initial condition $g_0(r)$.
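The counting used in this argument can be checked mechanically. The sketch below enumerates the independent components of a symmetric propagation tensor with indices 1 to 4 and separates the purely spatial components (whose second time derivatives are determined) from those carrying the index 4 (whose first time derivatives are determined). The variable names are illustrative only.

```python
from itertools import combinations_with_replacement

INDICES = (1, 2, 3, 4)

# A symmetric tensor g_ik is fixed by its components with i <= k.
components = list(combinations_with_replacement(INDICES, 2))

# g_kl with k, l = 1, 2, 3: second time derivatives are determined.
spatial = [(i, k) for (i, k) in components if 4 not in (i, k)]

# g_4v with v = 1, 2, 3, 4: first time derivatives are determined.
with_time_index = [(i, k) for (i, k) in components if 4 in (i, k)]

print(len(components), len(spatial), len(with_time_index))
```

The six spatial components and the four components with a time index together account for the ten time derivatives that the ten equations are said to determine.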

The procedure given above is not quite unique, as the equations from which we determine the first time derivatives of g(x) are quadratic equations. However, once one particular solution is chosen, no ambiguity remains in the course of the integration of the equations, on account of continuity considerations. Furthermore, taking $T_0(r)$ arbitrary, it is not certain whether the system admits of real solutions for the first derivatives. It seems thus that the gravitational equations (65) impose certain restrictions as to the T(x) for which the equations can be solved for g(x). The above restrictions may be real restrictions as to the possible states in which matter appears.* The restrictions may, however, also be interpreted not so much as real restrictions on the possible modes of distribution of matter; it may rather be supposed that these apparent restrictions show the equations (65) to be incomplete. If we were to include in the latter equations terms containing derivatives of g(x) of higher order than the second (say terms including $\operatorname{Grad} R^{(4)}$), then the relations thus generalized might become compatible with arbitrary distributions T(x).

395. The position described in the previous section is not unlike the one we encounter in the case of Maxwell's equations, where, starting from an initial condition, we can determine the distribution of the electromagnetic field provided the motion of the source densities is known. Just as in the case of Maxwell's equations, also in the case of the gravitational equations the field g affects the motion of T. Therefore, the motion of matter can be derived from (65) only if, apart from the initial condition (66), we also give initial conditions for the distribution of matter, e.g. in the form

$$T(r, 0) = T_0(r), \tag{69}$$

and besides we have to know the equations of motion for T(x); the latter may contain g(x) explicitly.

3. THE MECHANISM OF THE GRAVITATIONAL FORCE

396. The gravitational force upon a closed system can be regarded not so much as an outside force, but rather as an action of the system upon itself caused by internal forces. This can be seen from the following consideration. So as to see the essential features of how the force with which a system acts upon itself arises, we may consider as an example an electrically charged particle placed into a gravitational field. We consider merely the part of the gravitational force which acts upon the part of the mass corresponding to the energy of the electric charge. Thus we consider the mechanism of the gravitational force upon

$$\Delta m = \frac{E}{c^2},$$

where Δm is the mass equivalent of the energy E, the static energy of the electric charge. Consider the charged particle to fall freely. In a nearly straight system of reference we can represent the freely falling particle as a particle which is practically at rest relative to the origin of the system. Calculating the force with which the electrical charge acts upon itself, we can apply in a good approximation Maxwell's equations valid in a homogeneous region, since in the nearly straight representation, considering a small particle, the terms caused by inhomogeneity can be neglected. We thus find that the electric force with which the particle acts upon itself is zero. If, however, the particle is prevented from falling freely, then it shows acceleration relative to the origin of the nearly straight representation, and the force with which the accelerated charge acts upon itself can be calculated in accord with 305 of chapter IX. The force thus obtained tends to reduce the acceleration, i.e. to establish the state of free fall. This electric force is equal to the gravitational force which accelerates a free particle relative to a system of reference where gravitational forces appear.

397. A somewhat more general formulation of the problem can be given as follows. If we have an arbitrary closed physical system, it is kept together by internal forces which can be supposed to be propagated with the velocity of light. In the absence of gravitation the internal forces are propagated homogeneously and, as a rule, they produce no resulting force of the system upon itself. If, however, the system is brought into a gravitational field, then the mode of propagation of the internal forces is disturbed (just in the same way as the mode of propagation of light). The perturbation of the propagation of the inner forces disturbs the system in such a way that the forces acting between its elements no longer obey exactly the principle of action and reaction, and a resulting force arises. This resulting force tends to accelerate the system just as the Newtonian gravitational force is expected to do. We may thus suppose that the gravitational force observed phenomenologically is equal to the self force with which a closed system acts upon itself if the propagation of the internal forces is made inhomogeneous by the gravitational field. We note that in a freely falling particle the propagation of the inner forces is nearly homogeneous relative to the particle itself, and therefore in the state of free fall no resultant self force is present.

* This restriction may be of the same type as the suggested restriction of Maxwell's equations according to which only retarded solutions of the wave equations are realized in nature (see 263). Or it resembles the restriction according to which wave functions are always antisymmetric. See also L. Jánossy, Foundations of Phys. (in the press).


CHAPTER XII

COSMOLOGICAL PROBLEMS

A. THE PHYSICAL SIGNIFICANCE OF INVARIANT FORMULATION OF PHYSICAL LAWS

398. In the preceding sections we have given the formulations of various physical laws for inhomogeneous regions. The formulations thus obtained are generalizations of physical laws formulated for homogeneous regions. From the purely mathematical point of view, many different generalizations of a law can be obtained if we merely require that the generalized law should contain as a limiting case the law valid for homogeneous regions. Among the possible mathematical generalizations of the laws valid in homogeneous regions one usually considers those which have invariant form* and supposes them to give the real physical laws. When considering only the invariant generalizations of given laws, we find that there still exists a great manifold of such generalizations, and therefore, even if we restrict ourselves to invariant generalizations, there remains the question of picking out the correct form. This question was discussed e.g. in connection with Maxwell's equations.

399. In any particular case it thus remains to be decided by experiment which of the generalized forms of the physical law describes correctly the observed phenomenon. However, we have to go further: it is also a question to be decided by experiment whether or not the law describing a particular phenomenon correctly is an invariant one. It is often, incorrectly, argued that the requirement that laws of nature should be expressed by invariant expressions is a trivial one; this requirement is supposed to follow from the obvious requirement that "the laws of nature should be independent of the system of reference we use". While the latter requirement is trivial indeed, we show presently that the usual requirement of invariance contains far more than the trivial fact that laws of nature are independent of representation.

* Invariant form means that the relations can be expressed in terms of tensors and covariant operators.

1. TENSORS AND DISTINGUISHED MEASURES

400. Linear relations between tensors are automatically invariant because of the transformation properties of tensors. Furthermore, we can define certain types of products between tensors which are tensor quantities themselves. Therefore, expressing the measures of a physical system in terms of tensors, we achieve that linear combinations (e.g. sums) of the measures and also certain types of products, being also invariants, can be taken to correspond to significant physical quantities. Thus, expressing the measures of a physical system in terms of tensors, we select a distinguished scale for the measures such that in terms of this scale sums and products of the measures are also physically significant. This procedure is much the same as that of chapter III, where we discussed distinguished measures for quantities like length, electric charge, etc. Just as in the cases discussed in chapter III for simpler measures, we note that the fact that certain physical quantities can indeed be expressed in terms of tensors reflects physical properties of the quantities thus expressed. The assumption, e.g., that the energy and momentum of a particle can be expressed with the help of four-vectors can be tested experimentally.

401. Invariant products between tensors (apart from the direct product) can only be taken with the help of the tensor g, which must be given together with the system of reference. We supposed in the preceding considerations that the representations of g such as g(x) = K(g) can be obtained by observing the mode of propagation of light in terms of the four-coordinate measures x of K. We see thus that the mode of propagation of light is explicitly made use of for the invariant formulation of physical laws.

402. The mode of propagation of light has a twofold role in the invariant formulation of natural laws.

Firstly, as was already pointed out in chapter X in 325, the assumption that the propagation of light signals in the vicinity of a four-point x can be expressed by relations

$$\dot{x}\, g(x)\, \dot{x} = 0 \tag{1}$$

is an assumption which can be tested by experiment. The above law expresses a property of light signals, and one may take it that the above law reflects the physical properties of the carrier of light, i.e. of the ether. Relation (1) is in some analogy to the relations we obtain from the classical form of Maxwell's equations for the propagation of light in inhomogeneous media; the propagation tensor depends on the stresses and the state of motion of the medium. Thus (1) expresses the fact, which (in principle) can be verified experimentally, that light is propagated in the ether in a way resembling the way it is propagated in a medium in a state of flow and of stress. Thus the invariant formulation of laws using the tensor g makes sense only in regions where g exists, i.e. in regions where light is propagated in a locally homogeneous manner. The latter statement shows that the invariant formulation presupposes an important physical property of the propagation of light.

2. THE PHYSICAL SIGNIFICANCE OF THE TENSOR g

403. A physical system can be described by a number of measures $\mathfrak{A}, \mathfrak{B}, \mathfrak{C}, \ldots$ and the four-coordinates $x_1, x_2, \ldots$ of certain points of the system; the physical properties of a system can be described in an invariant manner by stating some connections between the measures. These connections also make use of g. Let us take e.g. the very simple case of a particle moving freely; we have then in a particular representation

$$\dot{x}\, g(x)\, \dot{x} = a, \tag{2}$$

where a is a scalar. Taking a more general case, a physical law describing properties of a physical system can be written in the form

$$\mathfrak{F}(\mathfrak{A}, \mathfrak{B}, \mathfrak{C}, \ldots, x_1, x_2, \ldots, g) = 0, \tag{3}$$

where $\mathfrak{F}$ is a function or functional. A representation of the tensor g can be determined by making use of (1) and observing signals of light; g could equally well be determined by making use of (2) and observing the orbits of free particles. In principle g could be determined by observing any of the phenomena which we suppose to obey a law of the form (3). From the above remark it follows that g represents some physical field which appears when observing very different phenomena; the propagation of light is only one of many such phenomena. The usually accepted interpretation of g is that it represents the "metric of the space-time continuum". We do not think the latter interpretation to be a fortunate one. We would rather suggest that g represents the state of the ether, which is the carrier of all physical fields. The latter assumption makes it clear why g appears explicitly in physical laws like (3): this circumstance shows that the behaviour of a physical system is determined not only by the specific quantities $\mathfrak{A}, \mathfrak{B}, \mathfrak{C}, \ldots, x_k, \ldots$ but also by the state of the ether in the region which is occupied by the system itself.

404. The assumption that we can formulate laws of nature in an invariant manner using relations of the form (3) implies also the assumption that the state of the ether is fully determined by the tensor field g. It might be conceivable to suppose that the state of the ether in a region $\mathfrak{R}$ is determined by g together with other fields f. The latter fields might be supposed not to affect the mode of propagation of light, and therefore f cannot be determined by observing merely the orbits of signals of light. Treder* has suggested a form of the theory where instead of the symmetric tensor field g a matrix field with 16 independent components appears. According to this theory the field equations of scalar, vector or tensor fields depend only on g, i.e. on 10 independent combinations of these 16 functions, while the equations of motion for spinor fields depend on all of the 16 functions.

3. A NORMAL FORM OF THE PROPAGATION TENSOR

405. The measures of the elements of a tensor (or tensor field) can have very different values when given in various representations. For this reason one meets with opinions according to which the one or the other tensor is "void of physical significance". Putting this question, one must be careful. The measures of a physical quantity differ from each other when taken in various representations; nevertheless these measures reflect an objective quantity. Let us e.g. consider the propagation tensor g. In a representation K we have K(g) = g(x); g(x) has for any value of x ten independent elements; applying a coordinate transformation x' = f(x) we obtain a new representation g'(x') of g. However, as the transformation contains four functions only, we can prescribe at most four of the ten independent elements of g'(x') and then find a transformation which transforms g(x) into the required g'(x'). If we have two arbitrary representations $g_1(x)$ and $g_2(x')$ of the propagation tensor, then it is the exceptional case where there exists a tensor g such that

$$K(g) = g_1(x) \quad \text{and} \quad K'(g) = g_2(x').$$

* H. Treder: Einstein-Symposium (Berlin, November 1965), Akademie Verlag, Berlin, 1966.

As a rule the two representations $g_1$ and $g_2$ cannot be regarded as representations of one and the same propagation tensor g. Therefore a representation g(x) of a tensor g contains specific features of that tensor, features which are common to all possible representations of g and which express some objective physical properties of the field g; these features are thus independent of representation.

406. To make this question clearer we give the following consideration. The transformation connecting two representations, say, K and K', can be written

$$x' = f(x). \tag{4}$$

As the transformation contains four arbitrary functions, we may obtain something like a normal representation if we impose four suitable conditions upon

$$g(x) = K(g) \tag{5}$$

which conditions reduce the choice among the possible representations. Writing in the usual way

$$g = \begin{pmatrix} G & V \\ V & -C^2 \end{pmatrix} \tag{6}$$

we can e.g. require for the normal representations that

$$V = 0, \qquad \operatorname{Grad} C = 0. \tag{7}$$

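We show presently that for any distribution g representations can be found satisfying (7).

407. Indeed, consider a representation K where g(x) is given by (6) so that (5) is not necessarily fulfilled. Changing to a new representation K' by a reversible transformation (4) we may write

$$g(x) = \tilde{S}(x)\, g'(x')\, S(x), \tag{8}$$

(the tilde denoting the transposed matrix) and we may put

$$S(x) = \begin{pmatrix} P & a \\ b & \alpha \end{pmatrix} \tag{9}$$

where

$$P_{ik} = \frac{\partial f_i(x)}{\partial x_k}, \qquad b_k = \frac{\partial f_4(x)}{\partial x_k} \tag{10a}$$

and

$$a_i = \frac{\partial f_i(x)}{\partial t}, \qquad \alpha = \frac{\partial f_4(x)}{\partial t}. \tag{10b}$$

Inserting (9) into (8), supposing

$$g' = \begin{pmatrix} G' & 0 \\ 0 & -c^2 \end{pmatrix}, \qquad c = \text{constant}, \tag{11}$$

we find

$$G = \tilde{P} G' P - (b \circ b)\, c^2 \tag{12a}$$

$$V = a G' P - \alpha c^2 b \tag{12b}$$

$$C^2 = \alpha^2 c^2 - a G' a. \tag{12c}$$

From (12a) it follows

$$G' = \tilde{P}^{-1} G P^{-1} + (bP^{-1} \circ bP^{-1})\, c^2. \tag{13}$$

From (12b) we have

$$a = (V + \alpha c^2 b)\, P^{-1} G'^{-1}. \tag{14}$$

Inserting (13) and (14) into (12c) we can express $\alpha$ in terms of the elements of g together with P, b and G'. We obtain expressions of the form

$$a = a(P, b, c, g), \qquad \alpha = \alpha(P, b, c, g). \tag{15}$$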
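The block relations (12a) to (12c) can be checked numerically in a reduced setting. The following is a minimal sketch, assuming a single space dimension so that P, a, b, α and G' are plain numbers and S is a 2×2 matrix; the helper names (`matmul`, `transpose`, `pull_back`, `blocks_from_formulas`) and the sample values are introduced here for illustration only and do not come from the text.

```python
# Sketch under stated assumptions: one space dimension, so every block is a scalar.
# S = [[P, a], [b, alpha]] maps (dx, dt) to (dx', dt').

def matmul(A, B):
    """Product of two small matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    """Transposed matrix (the tilde operation of the text)."""
    return [list(row) for row in zip(*A)]

def pull_back(g_prime, S):
    """Relation (8): g = S~ g' S."""
    return matmul(transpose(S), matmul(g_prime, S))

def blocks_from_formulas(P, a, b, alpha, G_prime, c):
    """Blocks G, V and C^2 as given by (12a), (12b), (12c) in one dimension."""
    G = P * G_prime * P - b * b * c ** 2
    V = a * G_prime * P - alpha * c ** 2 * b
    C2 = alpha ** 2 * c ** 2 - a * G_prime * a
    return G, V, C2

# Arbitrary illustrative values (hypothetical, chosen only for the check).
P, a, b, alpha = 1.2, -0.3, 0.05, 1.1
G_prime, c = 2.0, 3.0

S = [[P, a], [b, alpha]]
g_prime = [[G_prime, 0.0], [0.0, -c ** 2]]   # form (11)

g = pull_back(g_prime, S)
G, V, C2 = blocks_from_formulas(P, a, b, alpha, G_prime, c)

print("g from (8):      ", g)
print("G, V, -C^2 (12): ", G, V, -C2)
```

The two printed lines should agree entry by entry: the upper-left element of g with G, the off-diagonal elements with V, and the lower-right element with −C².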

If we prescribe the value of c and some arbitrary values for t = 0

$$P(r, 0) = P_0(r) \tag{16a}$$

$$b(r, 0) = b_0(r) \tag{16b}$$

then we can determine a and $\alpha$ for t = 0; thus we can determine the time derivatives of the components of f(x) for t = 0. Knowing the time derivatives of f(x) for all values of r at t = 0 we can determine step by step f(x) for any later time t > 0, or, going backwards in time, we can also determine f(x) for any time t < 0.

408. Thus prescribing an initial condition of the form (16) we can indeed construct a representation K' so that

$$K'(g) = g'(x') \quad \text{and} \quad V' = 0, \quad C' = c.$$

The system K' can thus be regarded as a representation relative to which the ether appears to be at rest. The representation K'(g) is, however, not determined uniquely. In the homogeneous case we know that in the systems of inertia which move with translational velocities relative to each other it is possible to introduce coordinates such that light appears to be propagated isotropically in terms of those measures. In the inhomogeneous case we find a similar ambiguity. Indeed, considering two representations K and K' so that

$$V = V' = 0, \qquad C = C' = c,$$

we find with the help of relations (10) for the elements of a transformation which connects two normal representations K and K'

$$G' = \tilde{P}^{-1} (G + b \circ b\, c^2)\, P^{-1}$$

$$\alpha = \left(1 - b\,(G + (b \circ b)\, c^2)^{-1}\, b\, c^2\right)^{-1/2} \tag{17}$$

$$a = c^2 \alpha\, b\, (G + b \circ b\, c^2)^{-1} \tilde{P}.$$

We can choose P and b for t = 0 arbitrarily and find their values for t ≠ 0 remembering the relations and integrating step by step. Since $-c^2 a$ gives the velocity of a point r' = constant relative to one with r = constant, we see that the normal systems of reference may move relative to each other just like Lorentz systems of reference do.

B. PHYSICAL CONTENTS OF PARTICULAR REPRESENTATIONS

1. STATIONARY REPRESENTATIONS

409. If we choose a suitable representation, a requirement to the effect that V = 0, Grad C = 0 can always be satisfied. Further restrictions of g(x) cannot in general be satisfied by a suitable choice of representation. For example, we may consider a distribution such that

$$\frac{\partial g(x)}{\partial t} = 0. \tag{18}$$

(18) imposes ten independent conditions upon the representation g(x) of g. Given an arbitrary representation g(x) which does not fulfil (18) we may try to construct another representation g'(x') such that

$$\frac{\partial g'(x')}{\partial t'} = 0. \tag{19}$$

Since

$$g'(x') = \tilde{S}^{-1}(x)\, g(x)\, S^{-1}(x) \quad \text{with} \quad S(x) = f(x) \circ \nabla \tag{20}$$

(19) and (20) together impose ten conditions upon the four components of the function f(x). The latter system is in general overdetermined and does not admit of solutions.

In the exceptional cases where g(x) is such that the systems (19) and (20) can be solved we must suppose that it is not a mere accident that the mathematically overdetermined system admits of solutions. In the latter case we have to suppose that the fact that g admits of representations obeying (19) is of physical significance. One may suppose that configurations which possess representations satisfying (19) may be taken to be stationary configurations. We may thus suppose that the ether is in a stationary state if g admits of representations g'(x') where all its elements are independent of t'. Similarly we may consider as physically significant other symmetries or particular properties which appear in some representation g(x) of g, provided these symmetries or restrictions cannot be produced by a suitable choice of representation of an arbitrary g(x).

2. THE ENERGY MOMENTUM DISTRIBUTION

410. Similar problems are encountered considering the representations $\overset{(2)}{T}(x) = K(\mathfrak{T})$ of the energy momentum tensor of matter. It is always possible to find representations $\overset{(2)}{T}$ such that

$$\overset{(2)}{T}_{4k}(x) = 0, \qquad k = 1, 2, 3; \tag{21}$$

the latter representations correspond to systems of reference in which matter does not move relative to points with constant coordinates. Comparing, however, the state of the ether with that of the distribution of matter, there may or may not exist a representation where

$$\overset{(2)}{T}_{4k}(x) = 0 \quad \text{and} \quad g_{4k}(x) = 0 \quad \text{for} \quad k = 1, 2, 3 \tag{22}$$

are fulfilled simultaneously. The condition (22) imposes already six conditions upon the representation and therefore, starting from an arbitrary representation, it will in general not be possible to find a transformation such that in terms of the transformed measures the whole of the condition (22) is fulfilled. In the exceptional cases where we can find a representation satisfying (22) we may conclude that matter moves together with the ether.

Considering the distributions of matter at large in the universe, we can suppose in a good approximation that the interaction between stellar systems is a purely gravitational one. Considering the stellar systems of the universe like the atoms of a gas, we can suppose that the matter in bulk behaves like a medium free of inner stress. Thus averaging over sufficiently large volumes we may suppose that

$$\overset{(2)}{T}_{\nu\mu}(x) = 0 \quad \text{except for} \quad \nu = \mu = 4, \qquad \overset{(2)}{T}_{44}(x) = \mu(x) > 0. \tag{23}$$

More precisely we may suppose that there exist representations of $\overset{(2)}{T} = K(\mathfrak{T})$ such that (23) is satisfied for them. Although (23) is not of an invariant form it nevertheless describes a significant feature of the distribution of matter. Indeed, if we start from an arbitrary representation $\overset{(2)}{T}(x)$ of $\mathfrak{T}$ then, in general, it is impossible to find a representation $\overset{(2)}{T}{}'(x')$ of $\mathfrak{T}$ obeying (23). If we nevertheless find that, applying a suitable coordinate transformation to the original representation, a representation $\overset{(2)}{T}{}'(x')$ of $\mathfrak{T}$ can be found obeying (23), then this proves that the distribution $\mathfrak{T}$ has particular properties. We may suppose that matter characterized by $\mathfrak{T}$ is free of inner stress. If we find a representation which apart from (23) satisfies also

$$g_{k4} = 0, \qquad k = 1, 2, 3$$

then we may conclude that the matter thus represented is free of stress and flows together with the ether.

C. COSMOLOGICAL PROBLEMS

411. In the present section we give a very brief account of some cosmological aspects of the theory of relativity. Cosmological theories try to give an account of the observed features of the distribution of matter in the universe in terms of distributions in accord with Einstein's gravitational equation or with some generalization of the latter equation.

1. THE RESULTS OF ASTRONOMICAL OBSERVATIONS

412. Astronomical observation shows that the universe is filled with galactic systems which, as far as can be ascertained, are distributed uniformly in space. Considering sufficiently large parts of the universe, we may consider the number of galaxies per unit volume to be about constant. Considering these galaxies like the atoms of a gas we can suppose matter in the universe to behave like a gas of constant average density.

The proper motions of the galaxies are such that collisions between galaxies must be supposed to be extremely rare events even on a cosmical time scale and therefore it can be supposed that the interactions of the galaxies are given in a good approximation by their gravitational interaction only. Taking the hydrodynamical analogy, we can take the universe to be filled with a gas of low temperature, which is moving only under the influence of its own gravitational field.

413. The observations of Hubble have shown that the spectra of known atoms emitted in distant nebulae show a considerable red shift. The observations can be summarized by a relation

$$\frac{\Delta\nu}{\nu} = -kr, \tag{24}$$

where $\Delta\nu$ is the difference between the frequency of a spectral line when observed in the laboratory and when observed in the spectrum of a galaxy at a distance r from us. Here $k \approx 10^{-27}\ \mathrm{cm}^{-1}$ (of the order of $10^{-8}\ \mathrm{parsec}^{-1}$) is the Hubble constant; its numerical value can be determined only with great uncertainty, as the measure r of the distances of nebulae can only be estimated by rather indirect methods. Supposing that the Hubble effect is caused by the Doppler effect, i.e. supposing that the distant nebulae are receding with great velocities, we can also write in place of (24)

$$v = ckr \tag{25}$$

where v is the velocity of recession of objects at distances r. There exists also some astronomical evidence upon electromagnetic radiation inside the universe; we do not discuss this complex of problems here.

2. THE SOLUTION OF FRIEDMANN

414. Many cosmological considerations deal with possible solutions of the gravitational equations

$$\overset{(2)}{R} - \tfrac{1}{2}\, g R = -\kappa \overset{(2)}{T} - \lambda g. \tag{26}$$

While attempts were made to find solutions of the gravitational equations which correspond to the observed state of the universe, a solution was looked for where matter in bulk is free of stress and can be represented in a form (23). So as to specify the problem, it was investigated by Friedmann whether or not Einstein's equations possess solutions such that in a suitable representation the tensor $\mathfrak{T}$ is given by (23) while the tensor g is given by

$$g(x) = \begin{pmatrix} A^2 B & 0 \\ 0 & -C^2 \end{pmatrix}. \tag{27}$$

It is supposed that

$$\operatorname{Grad} C = 0 \quad \text{and} \quad \frac{\partial B}{\partial t} = 0, \tag{28}$$

while A may depend in an arbitrary manner upon x. The expression (27) for g(x) is a more restricted one than the normal representation (7) of an arbitrary g. Indeed (27) supposes that

$$G(x) = A^2(x)\, B(r) \tag{29}$$

which supposition restricts G(x) and thus g(x).

415. Supposing g(x) to be given by (27) we can obtain the elements of $\overset{(2)}{R}$. We find as the result of a short calculation

$$\overset{(2)}{R}_{k4} = 2\, \frac{\partial^2 \ln A}{\partial x_k\, \partial t}, \tag{30a}$$

$$\overset{(2)}{R}_{44} = \frac{3}{A}\, \frac{\partial^2 A}{\partial t^2}. \tag{30b}$$

From the gravitational equations (26) it follows with the help of (23)

$$\overset{(2)}{R}_{k4} = 0, \qquad k = 1, 2, 3 \tag{31a}$$

$$\overset{(2)}{R}_{44} = -\tfrac{1}{2} \kappa \mu + \lambda. \tag{31b}$$

We find thus

$$\frac{\partial^2 \ln A}{\partial x_k\, \partial t} = 0$$

and therefore $A(x) = \alpha(t)\,\beta(r)$. Without restricting the form of g(x) further we can include the factor $\beta(r)$ into the matrix B(r) and thus we may suppose

$$A(x) = A(t). \tag{32}$$

So as to determine A(t) explicitly, we note that

$$\operatorname{Div} \overset{(2)}{T} = 0$$

and therefore according to (23) and eq. (37) 476 of Appendix II we find

[Equation (33), which involves $c$, $\det g$, $A$, $B$ and $C$, is illegible in the source.] (33)

Inserting (33) into (31b) and remembering that according to (30b) and (32) $\overset{(2)}{R}_{44}$ depends only upon t, we find

$$\frac{3}{A}\, \frac{d^2 A}{dt^2} = \frac{\beta}{2A^3} + \lambda \tag{34}$$

where $\beta$ is a constant. Multiplying (34) with $\frac{1}{6} A \frac{dA}{dt}$ we obtain a total differential. Integrating we find

$$\left(\frac{dA}{dt}\right)^2 = \frac{-\beta - KA + \lambda A^3}{3A} \tag{35}$$

where we have written K for the constant of integration. Integrating (35) we obtain explicit forms of A. In particular putting K = 0 and $\lambda = 0$ we find

$$A(t) = A_0\, (t + t_0)^{2/3}; \tag{36}$$

thus A(t) increases with time. We see that according to the initial conditions and the values of K and $\lambda$ we obtain solutions with increasing A(t) or decreasing A(t). Equation (35) admits of a singular solution

$$A = A_0 \quad \text{with} \quad -\beta - K A_0 + \lambda A_0^3 = 0.$$

The latter solution can be shown to represent an unstable configuration, which at the slightest disturbance changes into one where A(t) varies in time monotonically. The above considerations imply that no stable stationary solution of the gravitational equations exists, i.e. no solution in which both matter and ether are at rest.
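The power law (36) can be illustrated by a direct numerical integration. The following is a minimal sketch, assuming that for K = 0 and λ = 0 equation (35) reduces to (dA/dt)² = b/(3A) with a positive constant b; the symbol `b`, the function names and the chosen step count are assumptions introduced here for illustration and are not taken from the text.

```python
import math

def rhs(A, b):
    """Expanding branch of the assumed reduced equation: dA/dt = sqrt(b / (3A))."""
    return math.sqrt(b / (3.0 * A))

def integrate(A_start, t_start, t_end, b, steps=10000):
    """Classical fourth-order Runge-Kutta integration of dA/dt = rhs(A)."""
    h = (t_end - t_start) / steps
    A = A_start
    for _ in range(steps):
        k1 = rhs(A, b)
        k2 = rhs(A + 0.5 * h * k1, b)
        k3 = rhs(A + 0.5 * h * k2, b)
        k4 = rhs(A + h * k3, b)
        A += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return A

def closed_form(t, b, t0=0.0):
    """A(t) = A0 (t + t0)^(2/3), with A0^3 = 3b/4 fixed by the assumed equation."""
    A0 = (1.5 * math.sqrt(b / 3.0)) ** (2.0 / 3.0)
    return A0 * (t + t0) ** (2.0 / 3.0)

b = 1.0                      # hypothetical value of the positive constant
t_start, t_end = 1.0, 8.0

A_numeric = integrate(closed_form(t_start, b), t_start, t_end, b)
A_exact = closed_form(t_end, b)

print("numerical A(t_end):  ", A_numeric)
print("closed-form A(t_end):", A_exact)
print("growth factor:       ", A_exact / closed_form(t_start, b))
```

The numerical and closed-form values should agree closely, and the growth factor between t = 1 and t = 8 equals 8^(2/3) = 4, independently of the value chosen for b.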

D. ANALYSIS OF FRIEDMANN'S SOLUTION

416. We make a few remarks so as to clarify the types of motion which are involved in the configuration corresponding to the Friedmann solution of the gravitational equations.

1. THE RECESSION OF THE GALAXIES

417. Firstly we note that, on taking the Friedmann solution for g(x), the components of $\overset{(3)}{C}$ where two or three suffixes are equal to 4 vanish. Therefore we find from the equations of motion that a particle which is at rest relative to the system of reference at t = 0 remains at rest; indeed it follows from the equation of motion

$$\ddot{x}(p) = 0 \quad \text{if} \quad \dot{r}(p) = 0 \tag{37}$$

therefore

$$r(p) = r = \text{constant}$$

is a solution of the equations of motion. We conclude therefore that matter under the influence of its own gravitational action may float together with the ether.

a. THE MEASURES OF INTERGALACTICAL DISTANCES

418. The journey of a light signal sent from a point $P_1$ with fixed coordinate vector $r_1$ to another point $P_2$ with a fixed coordinate vector $r_2 = r_1 + \delta r$ takes a time $\delta t_1$. If the signal is reflected back from $P_2$ to $P_1$ the time of the to-and-return flight is equal to (see 381 eq. (55))

$$\delta t = \delta t_1 + \delta t_2 = 2A(t)\, (\delta r\, B\, \delta r)^{1/2} / C. \tag{38}$$

From the factor A(t) it is seen that the duration of the return journey $P_1 \to P_2 \to P_1$ of subsequent signals changes in time. In particular if we consider a solution of the type (36) we find that the time of flight increases in time. If we take the time of return flight as a measure of the distance between $P_1$ and $P_2$ then we find that the distance between the points increases. The points $P_1$ and $P_2$ may be regarded as two masses floating freely. Thus $P_1$ and $P_2$ may be e.g. two distant stellar bodies.

419. If we suppose for the moment that the physical structure of a solid is such that the time of flight of a return signal between two fixed points of the solid does not change in time, then we can interpret the results of the previous paragraph by supposing that the measure of the distance between points $P_1$ and $P_2$ which float freely increases in time, when the measure of the distance is obtained with the help of solid measuring rods.

The Friedmann solution can therefore be taken to imply that the ether as a whole expands and the bulk of matter floats with it. So as to see the physical significance of the above effect it is necessary to investigate the rate of a physical clock of usual construction in terms of the measures of the system of reference K where K(g) is given by (27), (28) and (37). Considering an oscillating atom, a rotating wheel, or some similar closed system as an ordinary clock, we can suppose that the motion of the latter is given in a good approximation by the classical formulae if we use an almost straight representation. So as to obtain such a representation we have to use a transformation described in 335 and Appendix II eq. (51a); we may put

$$x' = x + F(x) \tag{39}$$

with

$$F(x) = \tfrac{1}{2} \overset{(3)}{C}\, (x - x_0)(x - x_0) + \ldots$$

From (27) and (28) it follows, however, that the non-vanishing components of $\overset{(3)}{C}$ in a fixed point $x_0 = r_0, t_0$ are

$$\overset{(3)}{C}_{kk4} = -\overset{(3)}{C}_{4kk} = A'(t_0)/A(t_0), \qquad k = 1, 2, 3.$$

From (39) it follows

$$\frac{dt'}{dt} = 1 + \overset{(3)}{C}_{444}\, (t - t_0) + \text{terms of the order of } (t - t_0)^2 \text{ and } (r - r_0)^2.$$

Since $\overset{(3)}{C}_{444} = 0$ we find

$$\frac{dt'}{dt} = 1 + \text{small terms}. \tag{40}$$

Since the rate of a physical clock must be taken to be constant (apart from small terms) in an almost straight representation, we expect a clock to have a nearly constant rate in measures of t', and therefore according to (40) it must be taken to be nearly constant in terms of the original measures t also. Thus measuring the return times of light signals between the points $P_1$ and $P_2$ we find the point $P_2$ to recede if we measure the return time with the help of a physical clock of usual construction.

420. Differentiating (39) with respect to the components of r we find

$$\frac{\partial x'_k}{\partial x_k} = 1 + \frac{A'(t_0)}{A(t_0)}\, (t - t_0);$$

we see thus that the measure of the distance between two points which have fixed coordinates relative to K increases when taken in the measures of K'. However, since K' is an almost straight system of reference, we expect the measures of a solid to remain almost constant when taken relative to K'. We see thus that the distance between two points $P_1$ and $P_2$ at rest relative to K increases when the distance is measured with the help of a solid rod. We see thus that in the configuration represented by the Friedmann solution the bulk of matter is streaming radially with the ether. The closed physical systems, however, which can be taken as the "atoms" of the interstellar matter do not expand. Thus the expansion can be taken as an increase of the distances between intergalactic systems relative to the distances measured on solids. Measures of stellar bodies which are kept together by mechanical interatomic forces do not increase in time; or rather, the increase of intergalactic distance must be taken as an increase relative to the dimensions of measuring rods kept together by interatomic forces.

b. DOPPLER EFFECT

421. Consider an atom situated in a point $P_1$ emitting radiation of frequency $\nu$. We suppose thus that wave fronts are emitted at times

$$t_n = t_0 + n/\nu, \qquad n = 0, 1, 2, \ldots$$

where the times are taken in the measures corresponding to the point $P_1$ with coordinate vector $r_1$. The signals arrive in the point $P_2$ with coordinate vector $r_2 = r_1 + \delta r$ at times

$$t^*_n = t_0 + n/\nu + A(t_n)\, (\delta r\, B\, \delta r)^{1/2} / C + \text{terms of higher order};$$

thus we find

$$t^*_{n+1} - t^*_n = \frac{1}{\nu} \left(1 + A'(t_n)\, (\delta r\, B\, \delta r)^{1/2} / C\right) + \text{terms of higher order}$$

and, neglecting further terms of higher order,

$$\frac{\Delta\nu}{\nu} = -\frac{A'\, |\delta r|}{C}, \qquad \Delta\nu = \nu^* - \nu, \qquad |\delta r| = (\delta r\, B\, \delta r)^{1/2}.$$

The radiation of the atom situated in $P_1$ arrives in $P_2$ with a frequency $\nu^* < \nu$. The above result describes the red shift of nebulae. We see thus that the configuration obtained from Friedmann's solution of the equations of motion leads qualitatively to the Hubble effect.

422. The argument can be summarized by stating that the Friedmann solution gives such a distribution of g(x) that the coordinates describing the vicinity of a fixed four-point $x_0$ give almost straight representations. Thus, describing the (small) vicinity of $x_0$ by coordinates

$$x = x_0 + \xi,$$

the $\xi$ can be taken as measures of an almost straight system of reference. Describing a local process like the motion of a small solid, the rhythm of a clock, etc., we can apply in a good approximation, in the vicinity of any point, the laws valid for homogeneous regions. Observing with the help of clocks and measuring rods the behaviour of distant objects, we find that the distances of the latter increase in time; further we observe with a local clock the Doppler shift of the frequencies of distant objects just in accord with their apparent radial motion. It is important that the Friedmann solution thus automatically gives two suitable scales: one valid in the vicinity of fixed points and another which can be applied to large distances.

E. MACH'S PRINCIPLE

423. Before the formulation of the general theory of relativity E. Mach formulated a principle which has some connection with the theory of relativity. It was thus suggested that the forces of inertia (i.e. centrifugal forces) which appear in a rotating physical system $\mathfrak{A}$ can be taken to be caused by the rotational motion of $\mathfrak{A}$ relative to the systems of fixed stars surrounding $\mathfrak{A}$. It was ventured by Mach that if instead of rotating the system $\mathfrak{A}$ "we were to rotate the whole of the universe around $\mathfrak{A}$", the interaction would remain the same and the system $\mathfrak{A}$, surrounded by the large masses of stars rotating round it, could produce centrifugal forces of the same kind as are produced by the rotation of $\mathfrak{A}$ relative to the surrounding masses. An experiment was attempted by rotating a system $\mathfrak{A}$; a massive wheel was put around $\mathfrak{A}$ and it was investigated whether a reduction of the centrifugal forces can be observed when the wheel was made to rotate with $\mathfrak{A}$. No observable effect was found.

a. THE THIRRING EFFECT

424. It was shown by H. Thirring* that in terms of Einstein's gravitational equations inside a rotating spherical shell one expects forces which resemble to some extent forces of inertia in a rotating system. Furthermore, it was shown by Thirring and Lense** that in the vicinity of a rotating sphere the gravitational field is affected by the rotation, and it is suggested that anomalies of the orbits of the satellites of Jupiter (and possibly of Saturn) might be expected to arise as the result of this effect. The anomalies are, however, so small that they are on the limit of being observable and they have not been found so far by observation. The above effect is known in the literature as the Thirring effect.

425. While the gravitational field of a rotating body differs from that of a similar one which does not rotate, nevertheless, Mach's principle is not in agreement with the general theory of relativity, as was pointed out by Treder. The problem which arises with the interpretation of Mach's principle can be seen best if we remember that the physical state in a region $\mathfrak{R}$ is given by both the tensor g (representing the state of the ether) and $\mathfrak{T}$ (representing the distribution of matter and fields). Taking a representation K in which both

$$g_{k4} = 0 \quad \text{and} \quad T_{k4} = 0, \qquad k = 1, 2, 3 \tag{41}$$

we describe a state of affairs where matter is at rest relative to the ether. Making the bulk of matter rotate, we obtain a configuration for which we have (in the original representation)

$$g^*_{k4} = 0, \qquad T^*_{k4} \neq 0, \qquad k = 1, 2, 3. \tag{42}$$

(We indicate by the * the difference between the configurations.) The latter configuration differs physically from the former as was shown in detail in 410. Thus, roughly speaking, if we make matter rotate but leave the ether free of rotation, we obtain a configuration which can be physically distinguished from the configuration where matter is free of rotation relative to the ether.

426. From the above considerations the following qualitative conclusions can be further derived. Consider a stationary configuration of the form (41) where

$$g_{kl}(x) = g_{kl}(r), \qquad g_{k4}(x) = 0.$$

* H. Thirring: Phys. ZS., 19, 33, 1918. (See also somé minor corrections: Phys. ZS., 22, 29, 1921.) ** J. Lense and H. Thirring: Phys. ZS., 19, 156, 1918.

If we replace $T_{k4}$ by, say, $T^*_{k4}$ and suppose that at t = 0

$$g^*(r, 0) = g(r)$$

then we can obtain g*(x) for t > 0 from Einstein's gravitational equation just as discussed in 393. We note that at t = 0, though g = g*, because of the change of T we expect the second time derivatives and also the first time derivatives of certain elements of g* to differ from zero. Because of the latter we find that, even at t = 0, the $\overset{(3)}{C}{}^*$ differ from the $\overset{(3)}{C}$ and thus the equations of motion in the starred configurations differ from those in the original configurations. The latter difference is connected with the Thirring effect. The difference between our remark here and the quoted calculations of Thirring is that in the latter, stationary solutions of the gravitational equations were given, which can be found in the inside or the vicinity of a rotating sphere. These solutions were compared with those valid in the vicinity of the sphere free of rotation. We consider here, qualitatively, the dynamics of the process; i.e. we discuss the effect upon the surroundings if a massive body is being put into rotation. It would be of interest to investigate whether as the result of rotation of the massive body the distribution of g(x) changes indeed into g*(x). As far as we know this question has not been investigated.

APPENDIX I

TENSOR ANALYSIS IN HOMOGENEOUS REGIONS

427. In this book we have used notations and conventions which deviate from the usual ones. We did this in order to simplify mathematical expressions, but we also tried to find notations and conventions which are particularly adequate to express the physical concepts upon which we build up the theory.

A. SYSTEMS OF REFERENCE

428. We consider a system of reference a means by which we can express events with the help of four-coordinates x = r, t. The orbit of a particle relative to a system of reference K can be given in a parameter representation, so that

$$x(p) = r(p),\ t(p) \quad \text{with} \quad \dot{t}(p) \neq 0.$$

The various values of p give the coordinate vector r(p) of the particle at times t(p). Dot (·) denotes differentiation with respect to p.

1. THE LORENTZ SYSTEM

429. A Lorentz system of reference is a system $K_0$ in the measures of which the orbits of signals of light can be expressed by the relation

$$\dot{x}\, \Gamma\, \dot{x} = 0 \quad \text{with} \quad \Gamma = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -c_0^2 \end{pmatrix} \tag{1}$$

where we have written x in place of x(p) (the four-coordinate expressing the orbit of the signal).

2. STRAIGHT SYSTEMS OF REFERENCE

430. Submitting the coordinate measures of a Lorentz system $K_0$ to a reversible linear transformation of coordinates we obtain a new system of reference K' with coordinate measures

$$x' = Sx + s. \tag{2}$$

Considering an orbit given by x = x(p), respectively x' = x'(p), we find differentiating (2) with respect to p

$$\dot{x}' = S\dot{x}. \tag{3}$$

From (1) and (3) we obtain

$$\dot{x}'\, g'\, \dot{x}' = 0 \tag{4}$$

with

$$g' = \tilde{S}^{-1}\, \Gamma\, S^{-1}$$

(the tilde denoting the transposed matrix). We denote a system of reference K' to be straight if in terms of its measures the orbits of signals of light obey (4) where the matrix g' has constant elements. We can add to the definition that we require g' to be symmetric and to have three positive eigenvalues and one negative eigenvalue. The latter requirement is fulfilled by $\Gamma$.

3. PROPAGATION TENSOR

431. The matrix g with the help of which we can express orbits of light signals may be called the propagation matrix. Having systems of reference K, K', ... they can be characterized by the matrices g, g', ... etc. expressing the propagation of light in their respective measures. We may write $\mathfrak{g}$ for the propagation matrix and

$$K(\mathfrak{g}) = g, \quad K'(\mathfrak{g}) = g', \ldots \tag{5a}$$

for various straight representations. In particular for a Lorentz system of reference $K_0$ we have

$$K_0(\mathfrak{g}) = \Gamma. \tag{5b}$$

432. Consider two straight systems of reference K and K' with coordinate measures x and x' connected by the reversible transformation

$$x' = Sx + s. \tag{6}$$

The propagation of light is thus expressed in terms of K or K' by

$$\dot{x}\, g\, \dot{x} = 0 \quad \text{(a)} \qquad \text{and} \qquad \dot{x}'\, g'\, \dot{x}' = 0 \quad \text{(b)}. \tag{7}$$

From (6) it follows that $\dot{x}' = S\dot{x}$ and thus, if we require that the relations (a) and (b) given in (7) should follow mutually one from the other, then we have to suppose

$$g = \theta\, \tilde{S} g' S, \qquad \theta > 0.$$

We shall suppose in the following that $\theta = 1$ and take that g transforms according to

$$g = \tilde{S} g' S. \tag{8}$$

4. LORENTZ TRANSFORMATION

433. We denote a coordinate transformation a Lorentz transformation if it leads from a straight system of reference K with a propagation matrix g to another K' with

$$g' = g. \tag{9}$$

From (8) and (9) it follows that a transformation x' = Mx + u is a Lorentz transformation provided M obeys

$$\tilde{M} g M = g. \tag{10}$$

Matrices obeying (10) will be denoted Lorentz matrices. If K and K' are both Lorentz systems of reference then we have in place of (10)

$$\tilde{\Lambda}\, \Gamma\, \Lambda = \Gamma \tag{11}$$

and we can call $\Lambda$ an orthogonal representation of a Lorentz matrix.

434. The Lorentz transformation can also be taken to define a Lorentz deformation. The connection between a system $\Omega$ and its Lorentz deform $\Omega^*$ can be written symbolically

$$\mathfrak{L}\,\Omega = \Omega^*.$$

Suppose $\Omega$ to consist of points $\mathfrak{P}_k$, k = 1, 2, ..., n, the orbits of which are given in a representation K by four-coordinates $x_k(p)$. The orbits of the points $\mathfrak{P}^*_k$ of $\Omega^*$ are given by four-coordinates

$$x^*_k = M x_k + u$$

where the M are Lorentz matrices. We note that both x and x* are measures taken relative to the same system of reference K. We can also write

$$K(\mathfrak{L}) = M, u,$$

i.e. the deformation $\mathfrak{L}$ is represented by a Lorentz matrix M and a constant four-vector u.

5. STANDARD FORM OF LINEAR COORDINATE TRANSFORMATIONS

435. For many purposes it is convenient to split space and time components of g. We use the notations

$$g = \begin{pmatrix} G & V \\ V & -C^2 \end{pmatrix}$$

where G is a symmetric positive definite matrix, V a vector and C a scalar. The determinant of g can be written

$$\det g = -(C^2 + V G^{-1} V)\, \det G;$$

we shall also write

$$c = \sqrt{-\det g} = \sqrt{(C^2 + V G^{-1} V)\, \det G}$$

and consider c to be the measure of the velocity of light in K. We can split g into factors according to

$$g = \tilde{\sigma}\, \Gamma\, \sigma; \tag{12}$$

relation (12) does not define $\sigma$ uniquely; however, if we require

$$\sigma_{4k} = 0, \qquad k = 1, 2, 3$$

then we find

$$\sigma = \begin{pmatrix} G^{1/2} & G^{-1/2} V \\ 0 & (C^2 + V G^{-1} V)^{1/2} / c_0 \end{pmatrix} \tag{13}$$

where $-c_0^2 = \Gamma_{44}$. As the result of a short calculation one finds also

$$\sigma^{-1} = \begin{pmatrix} G^{-1/2} & -c_0\, G^{-1} V / (C^2 + V G^{-1} V)^{1/2} \\ 0 & c_0 / (C^2 + V G^{-1} V)^{1/2} \end{pmatrix}. \tag{14}$$

436. Supposing that representations of g in K and K' are given as in eq. (5), the question arises as to the linear transformation (6) which leads from K to K'. The homogeneous part of the linear transformation leading from K → K' is given by a matrix S obeying (8). From (8) and (12) we see that the matrix

$S = \sigma'^{-1}\sigma$

satisfies (8), where σ is built of the elements of g according to (13) and $\sigma'^{-1}$ is built of the elements of g' according to (14). One verifies easily that apart from S also matrices

$S_{\Lambda} = \sigma'^{-1}\,\Lambda\,\sigma$ (15)

satisfy (8), where Λ is an orthogonal representation of a Lorentz matrix. Thus there exists a six-parameter manifold of transformation matrices $S_{\Lambda}$ which give coordinate transformations such that g changes into g'.

437. In particular if g' = g then (15) gives the matrix of a Lorentz transformation. We can thus write in accord with (10)

$M = \sigma^{-1}\,\Lambda\,\sigma.$ (16)

Relation (16) gives the connection between the orthogonal and skew representations of the Lorentz matrices.
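The defining condition (10) can be checked numerically for a familiar case. The sketch below is an illustration only and is not taken from the text: it assumes a Lorentz system with coordinates ordered (x, y, z, t) and propagation matrix diag(1, 1, 1, −c²), and uses the standard boost along x as the candidate matrix M. All function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def lorentz_form(c):
    # Assumed form of g in a Lorentz system, coordinates ordered (x, y, z, t).
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, -c * c]]

def boost_x(v, c):
    # Standard boost along x: x' = gamma (x - v t), t' = gamma (t - v x / c^2).
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return [[gamma, 0, 0, -gamma * v],
            [0, 1, 0, 0],
            [0, 0, 1, 0],
            [-gamma * v / c ** 2, 0, 0, gamma]]

def is_lorentz_matrix(m, g, tol=1e-9):
    """Check condition (10): transpose(M) g M = g."""
    lhs = matmul(matmul(transpose(m), g), m)
    return all(abs(lhs[i][j] - g[i][j]) < tol
               for i in range(4) for j in range(4))

c = 1.0
print(is_lorentz_matrix(boost_x(0.6, c), lorentz_form(c)))
```

A matrix that merely rescales one axis fails the same check, which is one way to see that (10) is a genuine restriction on M.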

B. VECTORS AND TENSORS

438. It is useful to express measures in terms of quantities with particular transformation properties. Many, though not necessarily all, physical quantities can be expressed in terms of tensors (the expression tensor is here taken so as to include also vectors and scalars). The simplest such quantity is the scalar. We denote a quantity a scalar if its numerical value is the same in all representations. Thus if $\mathfrak{a}$ is a scalar, we have $K(\mathfrak{a}) = K'(\mathfrak{a}) = \ldots = a$.

439. We denote a four-component quantity a contravariant vector representation if its components transform like the measures of a four-dimensional distance. Thus we may write

$K(\mathfrak{a}) = \acute{a}, \quad K'(\mathfrak{a}) = \acute{a}', \ldots$ and $\acute{a}' = S\acute{a}$ (17)

where S is the homogeneous part of the coordinate transformation K → K'. The sign upon $\acute{a}$ and $\acute{a}'$ denotes contravariant representation. We shall use in general the covariant representation

$K(\mathfrak{a}) = a, \quad K'(\mathfrak{a}) = a', \ldots$

where by definition the covariant representation of a vector is transformed according to

$a' = \tilde{S}^{-1}a.$ (18)

In place of (18) we can also write

$a = \tilde{S}a' = a'S.$ (18a)

From relation (8) we find

$S\,g^{-1}\,\tilde{S} = g'^{-1};$ (19)

thus taking (18) and (19) together we find

$g'^{-1}a' = S\,g^{-1}a;$ (20)

comparing (17) and (20) we find further

$\acute{a} = g^{-1}a$ and $\acute{a}' = g'^{-1}a';$

thus the contravariant representation of a vector is obtained from the covariant by multiplication with $g^{-1}$. We shall avoid in general the use of contravariant representations with the exception of four-distances.

1. TWO-DIMENSIONAL TENSORS

440. We denote a quantity $\mathfrak{A}$ to be two-dimensional if its representations are given by matrices, i.e. $K(\mathfrak{A}) = A$ where A has elements $A_{\nu\mu}$, ν, μ = 1, 2, 3, 4. The quantity $\mathfrak{A}$ can be represented relative to systems K, K', ...:

$K(\mathfrak{A}) = A, \quad K'(\mathfrak{A}) = A', \ldots;$

the matrices A, A', ... may be given in any arbitrary fashion. We take $\mathfrak{A}$ to be a tensor if its representations transform like those of g, i.e. provided we have

$A = \tilde{S}\,A'\,S,$ (21)

where x' = Sx + s defines the coordinate transformation from K → K'. From the above definition it follows in particular that g is a two-dimensional tensor.

a. INVARIANT PRODUCTS

441. From the relations (18a) and (19) it follows that

$a\,g^{-1}\,b = a'\,g'^{-1}\,b'$ (22)

where a, b and a', b' are the representations of vectors $\mathfrak{a}$, $\mathfrak{b}$ relative to K and K'. We shall use the notation

$a \cdot b = a\,g^{-1}\,b;$ (22a)

the latter quantity is in accord with (22) a scalar and it will be denoted the scalar product of a and b.

442. Similarly we can form the invariant product between a vector and a tensor. We shall write

$a \cdot A = a\,g^{-1}A = \alpha$ (23a)

and also

$A \cdot a = A\,g^{-1}a = \beta.$ (23b)

We see immediately that the quantities α and β defined by (23a, b) are vectors. Finally we denote the invariant product between two tensors

$A \cdot B = A\,g^{-1}B = C.$ (24)

If A and B are tensors, then C defined by (24) is also one.

443. We can form also the direct product of two vectors, i.e. we write $a \circ b = A$. The components of A are

$A_{\nu\mu} = a_{\nu}b_{\mu}, \qquad \nu, \mu = 1, 2, 3, 4;$

furthermore the vectorial product of two four-vectors can be defined as $a \times b = a \circ b - b \circ a$; the vectorial product of two vectors gives an antisymmetric tensor A such that $A_{\nu\mu} = -A_{\mu\nu}$.

b. PSEUDO SCALAR

444. The determinant formed of the elements of a tensor is a one-component quantity; it is, however, not a scalar as the value of the determinant is different in different representations. Taking the determinant of the relation (21) we find

$\det A = \det A'\,(\det S)^{2}.$ (25)

Denoting thus $\det A = a^{2}$, $\det A' = a'^{2}$ we find

$a' = a/\det S.$ (26)

Quantities transforming in accord with (26) may be denoted pseudo scalars. Thus denoting $\mathfrak{a}$ a pseudo scalar quantity, we can write for its representations $K(\mathfrak{a}) = a$, $K'(\mathfrak{a}) = a'$, $K''(\mathfrak{a}) = a''$, ... The representations of $\mathfrak{a}$ transform in accord with (26). Relation (26) gives more than (25) as it determines the signs of the representations a, a', a'', ... if that of the representation a in one system of reference K is given.

445. We may write

$-\det g = c^{2}$ (27)

and take c as a measure of the velocity of light relative to K. In a Lorentz system K where g = Γ the relation (27) gives indeed the measure of the velocity of light in the ordinary sense; the relation (27) is a reasonable definition valid in any straight system of reference. The sign of c can be fixed assuming that in a system of reference K we have c > 0 provided the space part of K is a right-handed system and the time coordinate is chosen so as to make the measure of time increase with passing time. Using the definition (27) c is a pseudo scalar and any pseudo scalar can be written $a = a_0 c$, where $a_0$ is a scalar.
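Relations (25) to (27) can be illustrated with small numbers. The following sketch is not from the text: to keep it short it works with one space and one time coordinate (2×2 matrices instead of 4×4), assumes a Lorentz form diag(1, −c'²) in K', and picks an arbitrary reversible S with positive determinant.

```python
import math

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def congruence(s, g_prime):
    """Return transpose(S) g' S for 2x2 matrices, i.e. relation (8)."""
    st = [[s[0][0], s[1][0]], [s[0][1], s[1][1]]]

    def mul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
                for i in range(2)]

    return mul(mul(st, g_prime), s)

g_prime = [[1.0, 0.0], [0.0, -9.0]]   # assumed Lorentz form with c' = 3
s = [[2.0, 0.5], [0.1, 1.5]]          # arbitrary reversible S, det S > 0
g = congruence(s, g_prime)

c_prime = math.sqrt(-det2(g_prime))   # definition (27) in K'
c = math.sqrt(-det2(g))               # definition (27) in K

print(det2(g), det2(g_prime) * det2(s) ** 2)   # the two sides of (25)
print(c_prime, c / det2(s))                    # relation (26) applied to c
```

With det S < 0 the square roots would no longer match without a sign, which is the extra information that (26) carries beyond (25).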

C. FIELDS

446. We have considered so far tensor quantities defined in one four-point only. We can define a tensor quantity for all four-points, or at least for those of the points of a region $\mathfrak{R}$, and we thus obtain a field.

The simplest field is a scalar field. Having a scalar $\mathfrak{a}$ defined for all four-points $\mathfrak{x}$, we have e.g. in a representation K: $a(x) = K(\mathfrak{a})$. The transformation of the scalar field quantity is given by the relation

$a(x) = a'(x').$ (28)

Taking the transformation formula of the four-coordinates into consideration, we can also write $a(x) = a'(Sx + s)$ or $a'(x') = a(S^{-1}x' + s^{+})$, where $s^{+} = -S^{-1}s$ is the constant of the reverse transformation; thus a(x) depends on x in a way different from how a'(x') depends on x'. However, in a fixed four-point $\mathfrak{x}$ all representations of $\mathfrak{a}$ have the same numerical value.

447. Similarly a vector or a tensor field can be defined as follows. Let us write $K(\mathfrak{A}) = A(x)$ where A(x) is a four-vector defined for any value of x. We have thus $K'(\mathfrak{A}) = A'(x')$ with $A'(x') = \tilde{S}^{-1}A(x)$; we can also write $A(x) = \tilde{S}A'(x') = \tilde{S}A'(Sx + s)$. A two-dimensional tensor $\mathfrak{T}$ can be represented by $K(\mathfrak{T}) = T(x)$ where T is a two-dimensional matrix. The transformation between representations can be written as $T(x) = \tilde{S}\,T'(x')\,S$ and also $T(x) = \tilde{S}\,T'(Sx + s)\,S$.
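The statement that a(x) and a'(x') are different functions with equal values at the same four-point can be made concrete. The sketch below is illustrative only: it uses two coordinates instead of four, and the field `a`, the matrix `s` and the constant `shift` are hypothetical choices.

```python
def make_transformed_field(a, s, shift):
    """Given a(x) in K and x' = S x + shift (2x2 S), return a'(x') in K'."""
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    s_inv = [[s[1][1] / det, -s[0][1] / det],
             [-s[1][0] / det, s[0][0] / det]]

    def a_prime(x_prime):
        # Reverse transformation x = S^-1 (x' - shift), then evaluate a.
        d = [x_prime[0] - shift[0], x_prime[1] - shift[1]]
        x = [s_inv[0][0] * d[0] + s_inv[0][1] * d[1],
             s_inv[1][0] * d[0] + s_inv[1][1] * d[1]]
        return a(x)

    return a_prime

def to_primed(x, s, shift):
    return [s[0][0] * x[0] + s[0][1] * x[1] + shift[0],
            s[1][0] * x[0] + s[1][1] * x[1] + shift[1]]

a = lambda x: x[0] ** 2 + 3.0 * x[1]      # hypothetical scalar field in K
s = [[2.0, 1.0], [0.0, 1.0]]
shift = [1.0, -2.0]
a_prime = make_transformed_field(a, s, shift)

x = [0.5, 4.0]
print(a(x), a_prime(to_primed(x, s, shift)))   # same four-point: equal values
print(a(x), a_prime(x))                        # same numbers fed to both: differ in general
```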

1. THE □ OPERATOR

a. THE GRAD OPERATOR

448. Sometimes it is of interest to consider the derivatives of a field with respect to the components of x. We define an operator with representations

$\Box = \left(\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \frac{\partial}{\partial x_3}, \frac{\partial}{\partial x_4}\right)$ in K, $\quad \Box' = \left(\frac{\partial}{\partial x'_1}, \frac{\partial}{\partial x'_2}, \frac{\partial}{\partial x'_3}, \frac{\partial}{\partial x'_4}\right)$ in K'.

We shall also write □ and □' if the differentiation is to be applied to the expression at the left of the operator. Using the formal rules of differentiation, we can write

$\frac{\partial}{\partial x_{\nu}} = \sum_{\mu}\frac{\partial x'_{\mu}}{\partial x_{\nu}}\,\frac{\partial}{\partial x'_{\mu}}.$ (29)

Differentiating the transformation relation x' = Sx + s into the components of x we find

$\frac{\partial x'_{\mu}}{\partial x_{\nu}} = S_{\mu\nu}$

and thus in place of (29) we can also write

$\Box = \tilde{S}\,\Box' = \Box' S.$ (30)

Comparing (30) with (18a) we see that the □ operator transforms as a four-vector. Applying □ to both sides of (28) we find with the help of (30) $\Box a(x) = \tilde{S}\,\Box' a'(x')$. Thus putting $\Box a(x) = \operatorname{Grad} a(x)$, $\Box' a'(x') = \operatorname{Grad}' a'(x')$ we see that

$\operatorname{Grad} a(x) = \tilde{S}\,\operatorname{Grad}' a'(x').$ (31)

Thus Grad a(x) can be taken as the representation of a four-vector. The four-dimensional Grad operation produces a vector field from a scalar field. Similarly we find $\Box \circ A(x) = \operatorname{Grad} A(x) = \tilde{S}\,(\operatorname{Grad}' A'(x'))\,S$. We see thus that the Grad operator produces from a vector field a tensor field.

b. FURTHER OPERATIONS

449. We can define further

$\Box \times A(x) = \Box \circ A(x) - A(x) \circ \Box = \operatorname{Rot} A(x)$ (32)

and

$\Box \cdot A(x) = \Box\,g^{-1}A(x) = \operatorname{Div} A(x).$ (33)

We see easily that Rot A(x) defined by (32) gives an antisymmetric tensor field, while Div A(x) gives a scalar field, provided A(x) is a vector field. The Laplace operator can be taken as the combination of Grad and Div, thus we may write

$L A(x) = \operatorname{Div}\operatorname{Grad} A(x) = \Box \cdot (\Box \circ A(x)).$ (34)

Similarly we have

$\operatorname{Grad}\operatorname{Div} A(x) = \Box \circ (\Box \cdot A(x)).$ (35)

Taking the difference between (34) and (35) we find (Div Grad − Grad Div)A(x) = Div Rot A(x). We may also write L A(x) = Grad Div A(x) + Div Rot A(x) where A(x) is a four-vector. The above operations are valid only in straight representations as we have assumed S and also g to be independent of x.
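The vector character of Grad, relation (31), can be checked by finite differences. This is a rough sketch under stated assumptions, not part of the text: two coordinates instead of four, a hypothetical field `a_prime` given in K', and a central-difference gradient whose step `h` is an arbitrary choice.

```python
def grad(f, x, h=1e-6):
    """Central-difference gradient of a scalar function of a 2-component x."""
    out = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        out.append((f(xp) - f(xm)) / (2 * h))
    return out

s = [[2.0, 1.0], [0.5, 3.0]]     # homogeneous part of x' = S x + shift
shift = [1.0, -1.0]

def to_primed(x):
    return [s[0][0] * x[0] + s[0][1] * x[1] + shift[0],
            s[1][0] * x[0] + s[1][1] * x[1] + shift[1]]

a_prime = lambda xp: xp[0] ** 2 * xp[1] + xp[1]   # hypothetical field in K'
a = lambda x: a_prime(to_primed(x))               # relation (28)

x = [0.3, 0.7]
lhs = grad(a, x)                                  # Grad a(x)
g_primed = grad(a_prime, to_primed(x))            # Grad' a'(x')
# Right-hand side of (31): transpose(S) applied to Grad' a'(x').
rhs = [s[0][i] * g_primed[0] + s[1][i] * g_primed[1] for i in range(2)]
print(lhs, rhs)
```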

APPENDIX II

TENSORS IN INHOMOGENEOUS REGIONS

A. MORE-DIMENSIONAL MEASURES

450. So far we have restricted ourselves to a formalism of measures up to two dimensions. This we did so as to give a formalism which is sufficient for purposes of the special theory of relativity. Dealing with the general theory, i.e. with problems of gravitation, we have to introduce measures that have more than two dimensions. The formalism we give below for such quantities seems to be adequate to our particular physical approach to the problems. In the formalism we shall try to avoid, as far as is practical, the explicit use of suffixes and upper indices indicating components of more-dimensional measures. Furthermore we shall restrict ourselves with few exceptions to covariant representations of tensors. At the same time, we shall consider in detail the symmetry properties of tensors and make use of permutation operators.

1. k-DIMENSIONAL MEASURES

451. We denote a quantity $\mathfrak{A}$ to be k-dimensional, if it can be represented by a matrix of k dimensions, i.e. we have

$K(\mathfrak{A}) = \overset{(k)}{A}$

where $\overset{(k)}{A}$ has elements

$\overset{(k)}{A}_{\nu_1\nu_2\ldots\nu_k}, \qquad \nu_1, \nu_2, \ldots, \nu_k = 1, 2, 3, 4.$

The upper symbol (k) denotes the number of dimensions. k = 0 corresponds to a measure with one component only. We shall omit the upper symbol whenever this can be done without causing ambiguity. In particular we shall always omit it for k = 0, 1 and sometimes also for k = 2.

2. MULTIPLICATION OF MORE-DIMENSIONAL QUANTITIES

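452. We define the direct product of two matrices $\overset{(k)}{A}$ and $\overset{(l)}{B}$ as

$\overset{(k)}{A} \circ \overset{(l)}{B} = \overset{(k+l)}{C}$

so that

$\overset{(k+l)}{C}_{\nu_1\ldots\nu_k\,\mu_1\ldots\mu_l} = \overset{(k)}{A}_{\nu_1\ldots\nu_k}\,\overset{(l)}{B}_{\mu_1\ldots\mu_l}.$ (1)

Another type of product can be defined so that we carry out summations of a pair of adjacent suffixes, thus we shall write

$\overset{(k+1)}{A}\,\overset{(l+1)}{B} = \overset{(k+l)}{C}$

where

$\overset{(k+l)}{C}_{\nu_1\ldots\nu_k\,\mu_1\ldots\mu_l} = \sum_{\sigma}\overset{(k+1)}{A}_{\nu_1\ldots\nu_k\,\sigma}\,\overset{(l+1)}{B}_{\sigma\,\mu_1\ldots\mu_l}.$ (2)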
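The two products (1) and (2) can be sketched with arrays stored as dictionaries keyed by suffix tuples. This is an illustration only: the storage scheme, the function names and the restriction to two values per suffix (instead of four) are assumptions made to keep the example small.

```python
from itertools import product

N = 2  # values per suffix; the text uses 4, 2 keeps the demo small

def direct_product(a, k, b, l):
    """Eq. (1): C[nu + mu] = A[nu] * B[mu], a (k+l)-dimensional array."""
    return {nu + mu: a[nu] * b[mu]
            for nu in product(range(N), repeat=k)
            for mu in product(range(N), repeat=l)}

def contract_adjacent(a, k1, b, l1):
    """Eq. (2): sum over the last suffix of A and the first suffix of B.

    k1 and l1 are the numbers of suffixes of A and B (k+1 and l+1 in the text).
    """
    k, l = k1 - 1, l1 - 1
    return {nu + mu: sum(a[nu + (s,)] * b[(s,) + mu] for s in range(N))
            for nu in product(range(N), repeat=k)
            for mu in product(range(N), repeat=l)}

vec_a = {(0,): 1.0, (1,): 2.0}
vec_b = {(0,): 3.0, (1,): 5.0}

mat = direct_product(vec_a, 1, vec_b, 1)        # two suffixes: a_nu * b_mu
print(mat[(1, 0)])                              # 2.0 * 3.0

ident = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 0.0, (1, 1): 1.0}
print(contract_adjacent(mat, 2, ident, 2) == mat)   # ordinary matrix product
print(contract_adjacent(vec_a, 1, vec_b, 1)[()])    # k = l = 0: a single component
```

For k = l = 1 the adjacent-suffix product is the usual matrix product, and for k = l = 0 it leaves a one-component measure.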

We shall also use summation over more than one pair of suffixes. Thus we 11 A Bj]=C write where the double bracket indicates double summation and in terms of elements we have <* + /)

+

0 + 2)

Similarly we may use a product where summation over n pairs of suffixes is to be taken. Since it is inconvenient to write down severalfold brackets, we shall write short {A

B J

= C

where the sign (n) over the bracket indicates summation over « pairs of suffixes. Explicitly we have

^ V * i '.

(k+l)

'k+n) tíliL2...M

=

X

0 + ") •Á-r r,...Pi ir «,...a 1

c

l

H

ü f W i.i (

^a ...a,<' n

l

It is also useful to introduce invariant products in the following way

$\overset{(k+1)}{A} \bullet \overset{(l+1)}{B} = \overset{(k+1)}{A}\,g^{-1}\,\overset{(l+1)}{B};$

thus the fat dot • denotes that a factor $g^{-1}$ has to be taken between the factors, where $g^{-1}$ is the inverse of the propagation tensor. Similarly, we introduce

Hk + m)

| A • B the above expression stands short for (* + /) ^IV,...! 1 * F»,JI,...|I/~

(fc + m) 'i-»'k

X a,...om TI...Tm

(k + l)

J

= C ; (l + m)

"i..-am8om*m " " '£
A

r,...z

B

m

HI.-.FI;

where g+ are the elements of the mátrix g . Furthermore we shall use a process which produces from a k + 2-dimensional mátrix a fc-dimensional one. We shall write - 1

11 (í(.k

or explicitly

+

YV

2)

(fc)

R g" ]] = R

(3)

1

(k+2)

(AT)

The above process can be generalized in the following way (fc)

/(fc + 2m)

R= 1 R

\(2m)

g(-""j

where we have written g "" = g " o g " . • • ° g (

,)

1

1

_ 1

M FACTORS

B. PERMUTATION OPERATORS

453. We may introduce the following definition. Writing P for a permutation operator, we have

$P\overset{(k)}{A} = \overset{(k)}{B};$

thus the operator P produces from $\overset{(k)}{A}$ another matrix $\overset{(k)}{B}$. The above operation written down explicitly in terms of elements is

$\left(P\overset{(k)}{A}\right)_{\nu_1\nu_2\ldots\nu_k} = \overset{(k)}{A}_{\mu_1\mu_2\ldots\mu_k}$ and $P(\nu_1\nu_2\ldots\nu_k) = \mu_1\mu_2\ldots\mu_k$

where $\mu_1\mu_2\ldots\mu_k$ is the permutation of the elements $\nu_1\nu_2\ldots\nu_k$ indicated by the symbol P.

454. The product of two operators P and Q can be taken as a new operator, i.e.

$PQ = R.$ (4)

R is defined so that provided

$Q\overset{(k)}{A} = \overset{(k)}{B}, \qquad P\overset{(k)}{B} = \overset{(k)}{C},$ (5)

then by definition of R

$R\overset{(k)}{A} = \overset{(k)}{C}.$ (6)

We can also write

$(PQ)A = P(QA).$ (7)

Thus we introduce the product of the permutation operators so as to be associative in the sense of eq. (7).

455. We wish to draw attention to another aspect of the properties of multiplication of operators. The relation (5) can also be written down in terms of elements. We find

ik)\

í (*n

(*)

(fc)

<*)

w

(8)

If we choose P(Hi lh •••Hk) = V! v . . . 2

then we can also write in place of (8) (*)

(*)

(*)

further with the help of (9) (*) (*) CVid—M* = AaiPUitii,...^kn

•

v

k

(9)

From (4) and (6) it follows, however, (fc)

(fc)

Comparing thus (10) and (11) we see that (fc)

ifc)

W =

^Ö[/V,.

.

)]

W

<.PQXIÍ,...^)1

A

=

we see that in generál (*)

(fc)

(fc)

thus the associative law cannot be taken to be valid when applying permutation operators to suffixes of more-dimensional quantities, as seen from the above argument.

1. CYCLIC PERMUTATIONS

456. We shall make use of cyclic permutations, i.e.

$c_k = (1, 2, \ldots, k).$

The permutation $c_k$ applied to k suffixes makes each suffix shift by one step to the right, with the exception of the last suffix, which moves to the first place, thus

$c_k(\nu_1\nu_2\ldots\nu_k) = \nu_k\nu_1\ldots\nu_{k-1}.$

The powers of $c_k$ are operators $c_k^{l}$ which produce shifts of l steps to the right. In particular

$c_k^{k} = 1.$

We can also introduce negative powers, i.e. the operator $c_k^{-l}$ shifts the elements by l steps to the left. We have of course

$c_k^{-l} = c_k^{k-l}.$

An operator $c_k$ can also be applied to a set of n > k suffixes. The significance of this operation is that the cyclic permutation is to be carried out with the first k suffixes only; the latter are to be left in their places. We have thus e.g.

$c_k(\nu_1\nu_2\ldots\nu_k\nu_{k+1}\ldots\nu_n) = \nu_k\nu_1\nu_2\ldots\nu_{k-1}\nu_{k+1}\ldots\nu_n.$

In particular the transposition P = (lm) produces the interchange of the suffixes on the l-th and on the m-th place.
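The action of the cyclic permutations and transpositions on a row of suffixes, as described in words above, can be sketched as follows. The function names are hypothetical, and the suffixes are represented simply as a tuple.

```python
def cyclic(suffixes, k, power=1):
    """Apply c_k**power to the first k suffixes; the rest stay in place.

    One step moves every suffix one place to the right and the k-th
    suffix to the first place; negative powers shift to the left.
    """
    head, tail = list(suffixes[:k]), list(suffixes[k:])
    shift = power % k
    if shift:
        head = head[-shift:] + head[:-shift]
    return tuple(head + tail)

def transposition(suffixes, l, m):
    """Interchange the suffixes on the l-th and m-th place (1-based)."""
    out = list(suffixes)
    out[l - 1], out[m - 1] = out[m - 1], out[l - 1]
    return tuple(out)

nu = ("v1", "v2", "v3", "v4", "v5")
print(cyclic(nu, 5))              # last suffix moves to the first place
print(cyclic(nu, 3))              # only the first three suffixes are permuted
print(cyclic(nu, 5, 5) == nu)     # the k-th power is the identity
print(cyclic(nu, 5, -2) == cyclic(nu, 5, 3))   # c^(-l) equals c^(k-l)
print(transposition(nu, 2, 4))
```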

2. THE TRANSPOSITION OF A MATRIX

457. The transposition of a k-dimensional matrix can be defined generalizing the usual form of transposition. For many purposes it appears to be suitable to use the definition

$\overset{(k)}{\tilde{A}} = c_k\overset{(k)}{A};$

thus written in terms of the elements

$\overset{(k)}{\tilde{A}}_{\nu_1\nu_2\ldots\nu_k} = \left(c_k\overset{(k)}{A}\right)_{\nu_1\nu_2\ldots\nu_k} = \overset{(k)}{A}_{\nu_k\nu_1\ldots\nu_{k-1}}.$

Taking the product of two matrices we introduce summation on adjacent suffixes. Sometimes it is necessary to carry out the summation over some arbitrary pairs of suffixes, i.e. we have to carry out products

(fc + 1)

ik + t)

^>»lf-»t f*l(l/.»W

=

X A, ,.

(í+1)

..,...V

tV

k

a

B

j......>

( tfl f (t/

where a stands between, say, v and v in the first factor and between u, and / Í / + I in the second factor. So as to avoid somé complicated notation for to write down such products we note that (k+i) ( (fc+in / (fc+m Av ...v av,j ...r = \ s+l A )trr,...v l fc + l( j+l A ))p,...v , similarly s

J + 1

=

C

l

s

thus we have

rl

k

C

C

k

ka

(/+1)

• (*I...(t|ff(l/ D

k+ +tt

+

í

<,k + i)\(

y + m

We find a generalization of how to form the transposition of products. Using the rules given above one finds that (*+i)(/+i) A B

r«+i)(fc+in { B A J

(13)

f(/+l)(fc + l)\ [ B • A j

(14)

and (*+l)(í+l) A • B =c'

k+l

In the case of k = l = 1 we thus obtain the well-known rule for two-dimensional matrices.

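3. THE $\pi_n$ OPERATORS

458. An operation which often occurs in application is the following

$\pi_n = 1 - c_n^{-1} + c_n^{-2} - \ldots + (-1)^{n-1}c_n^{-(n-1)}.$

From the definition we obtain

$\tfrac{1}{2}\left(1 + c_n^{-1}\right)\pi_n = \begin{cases} 1 & \text{if } n \text{ is odd} \\ 0 & \text{if } n \text{ is even.} \end{cases}$ (15)

Applying e.g. the operator $\pi_3$ to a three-dimensional matrix $\overset{(3)}{g}$ we find

$\left(\pi_3\overset{(3)}{g}\right)_{\lambda\mu\nu} = \overset{(3)}{g}_{\lambda\mu\nu} - \overset{(3)}{g}_{\mu\nu\lambda} + \overset{(3)}{g}_{\nu\lambda\mu}.$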
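C. THE □ OPERATOR IN CURVED REPRESENTATIONS

459. We can define the □ operator represented by

$\Box = \left(\frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \frac{\partial}{\partial x_3}, \frac{\partial}{\partial x_4}\right).$

The □ operator produces from a k-dimensional quantity a (k + 1)-dimensional one. We may write in a particular representation

$\overset{(k)}{A}(x) \circ \Box = \overset{(k+1)}{A}(x)$

which is in terms of suffixes as follows

$\overset{(k+1)}{A}_{\nu_1\ldots\nu_k\nu_{k+1}} = \frac{\partial\overset{(k)}{A}_{\nu_1\ldots\nu_k}}{\partial x_{\nu_{k+1}}}.$

In the following we shall, for the sake of brevity, omit the variable x.

460. Sometimes we have to apply the □ operator to a product of matrices. It is useful to make use of the following formulae, which can be verified easily with the help of the definitions given above

$\left[\overset{(k+1)}{A}\,\overset{(l+1)}{B}\right] \circ \Box = \overset{(k+1)}{A}\,\overset{(l+2)}{B} + c_{k+l+1}\left[\left(c_{k+2}^{-1}\overset{(k+2)}{A}\right)\overset{(l+1)}{B}\right].$ (16)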

The above rule can also be extended to /n-fold summation; one finds (A

B )

o• = ( A

B

)

+ c {[c ^ k+l+l

k

m

+l

A

J B ) .

Further, if we apply the 9Í operator to an invariant product of matrices, then we have rtk) CM

(/c) (/ + 1)

ff

[ A - B J c . Q = A - B + c ,_ k+

1

(*) (3)\1 (0|

(ík + l)

iLct+i { A - A - g J J - B j

(17)

where (3)

g(x) = g(x) o

g.

(«

We shall apply the 9í operator to the quantities R introduced in 452 eq. (3) so we have ((lk + 2)

X\

(((

f(fc + 3)

(ft + 2)

(3)1)

~i\

1. TENSOR OF SEVERAL DIMENSIONS

461. The representations of a k-dimensional quantity $\mathfrak{A}$ may be arbitrary k-dimensional matrices, i.e.

$K(\mathfrak{A}) = \overset{(k)}{A}, \quad K'(\mathfrak{A}) = \overset{(k)}{A'}, \ldots$

We take $\mathfrak{A}$ to be a tensor, if the connection between its representations is of a particular form. Generalizing the definitions of 440 we require in the case of a tensor that its representations are connected according to the following relation

$\overset{(k)}{A}_{\nu_1\ldots\nu_k} = \sum_{\mu_1\ldots\mu_k} S_{\mu_1\nu_1}\cdots S_{\mu_k\nu_k}\,\overset{(k)}{A'}_{\mu_1\ldots\mu_k}$ (18)

where the $S_{\mu\nu}$ are the elements of the matrix S giving the coordinate transformation

x' = Sx + s,

leading from the system K to K'. Indeed, for k = 1, 2 (18) reduces to relations (18a) and (21) of Appendix I.

462. So as to write down the above relation free of suffixes, we introduce a 2k-dimensional matrix, which we shall denote by $\overset{(k)}{S}$, with elements formed by the products

$S_{\mu_1\nu_1}S_{\mu_2\nu_2}\cdots S_{\mu_k\nu_k}.$ (19)

The transformation (18) can thus be written

$\overset{(k)}{A} = \left[\overset{(k)}{S}\,\overset{(k)}{A'}\right]^{(k)}.$

The reverse transformation can be written as

$\overset{(k)}{A'} = \left[\overset{(k)}{S^{+}}\,\overset{(k)}{A}\right]^{(k)}$

with $\overset{(k)}{S^{+}}$ built in the same manner of the elements of $S^{-1}$.

463. Considering a quantity $\mathfrak{A}$ which can be represented by k-dimensional matrices

$K(\mathfrak{A}) = \overset{(k)}{A}, \quad K'(\mathfrak{A}) = \overset{(k)}{A'}, \ldots$

etc., where the $\overset{(k)}{A}$, $\overset{(k)}{A'}$ are defined somehow, then $\mathfrak{A}$ is in general not a tensor. We can form from the representation $\overset{(k)}{A'}$ a quantity

$\overset{(k)}{\bar{A}} = \left[\overset{(k)}{S}\,\overset{(k)}{A'}\right]^{(k)}.$

If $\overset{(k)}{\bar{A}} = \overset{(k)}{A}$ then $\mathfrak{A}$ is a tensor; the difference $\overset{(k)}{A} - \overset{(k)}{\bar{A}}$ gives an indication of the deviation between $\mathfrak{A}$ and a tensor quantity.

464. The direct product of two tensors $\mathfrak{A}$, $\mathfrak{B}$ built according to 452 eq. (1) is itself a tensor, as can be verified in terms of the definitions. The product

$\overset{(k+1)}{A}\,\overset{(l+1)}{B} = \overset{(k+l)}{C}$

is not an invariant one; even if $\overset{(k+1)}{A}$ and $\overset{(l+1)}{B}$ represent tensor quantities, $\overset{(k+l)}{C}$ defined by (2) is not a tensor. Furthermore one verifies easily that the invariant product of two tensors is a tensor and the process, as defined by (3), produces from a tensor $\overset{(k+2)}{R}$ another tensor $\overset{(k)}{R}$.

2. SYMMETRY PROPERTIES

465. The permutation operators are invariant operators in the sense that the relation

$P\overset{(k)}{A} = \overset{(k)}{B},$ (20)

if valid in one representation K, is also valid in any other representation K'. We can thus write in place of (20) also $P\mathfrak{A} = \mathfrak{B}$. If a tensor is such that for a given permutation P we have $P\mathfrak{A} = \mathfrak{A}$ then we can take that $\mathfrak{A}$ is symmetric with respect to P. We may also write $P \equiv 1 \bmod \mathfrak{A}$ where we denote by 1 the identical permutation operator.

466. There exist tensors so that in any representation we find

$(lm)\overset{(k)}{A} = -\overset{(k)}{A};$

we denote $\overset{(k)}{A}$ to be antisymmetric with respect to the l-th and m-th suffix. In particular a tensor which is antisymmetric in all pairs of suffixes may be denoted totally antisymmetric. An important example is given below.

3. THE ANTISYMMETRIC TENSOR $\overset{(4)}{e}$

467. Consider a four-dimensional matrix $\overset{(4)}{e}$ with elements

$\overset{(4)}{e}_{\kappa\lambda\mu\nu} = \begin{cases} c & \text{if } \kappa\lambda\mu\nu \text{ is an even permutation of } 1, 2, 3, 4 \\ -c & \text{if } \kappa\lambda\mu\nu \text{ is an odd permutation of } 1, 2, 3, 4 \\ 0 & \text{otherwise} \end{cases}$ (21)

where $c = \sqrt{-\det g}$. Transforming $\overset{(4)}{e}$ like a four-dimensional tensor, we obtain

$\overset{(4)}{e'} = \left[\overset{(4)}{S^{+}}\,\overset{(4)}{e}\right]^{(4)}.$ (22)

From (21) and (22) together with $g = \tilde{S}g'S$ we find

$\overset{(4)}{e'} = \overset{(4)}{e}/\det S = \overset{(4)}{e}\,c'/c.$

The components of $\overset{(4)}{e'}$ are obtained by the relation (21) if c is replaced by c'. We see thus that the definition (21) gives the representation of the elements of a tensor.
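The step from (21) and (22) to the factor det S rests on the fact that contracting the alternating array with four copies of a matrix yields its determinant. The sketch below illustrates this for the 1234 component; it is not taken from the text, the names are hypothetical, suffixes run over 0..3 rather than 1..4, and the sample matrix is an arbitrary triangular one whose determinant can be read off the diagonal.

```python
from itertools import permutations

def parity(perm):
    """+1 for an even permutation of 0..n-1, -1 for an odd one."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]   # each swap flips the sign
            sign = -sign
    return sign

def epsilon_tensor(c):
    """Elements (21): +c or -c on permutations; absent keys stand for 0."""
    return {p: c * parity(p) for p in permutations(range(4))}

def contracted_component(e, s):
    """Sum over a, b, g, d of S[a][0] S[b][1] S[g][2] S[d][3] e[a, b, g, d]."""
    return sum(val * s[a][0] * s[b][1] * s[g][2] * s[d][3]
               for (a, b, g, d), val in e.items())

s = [[2.0, 1.0, 0.0, 3.0],
     [0.0, 1.0, 4.0, 1.0],
     [0.0, 0.0, 3.0, 2.0],
     [0.0, 0.0, 0.0, 0.5]]      # upper triangular, det S = 2 * 1 * 3 * 0.5
c_prime = 2.0                   # hypothetical value of c' in K'

e_prime = epsilon_tensor(c_prime)
print(contracted_component(e_prime, s))   # c' times det S, i.e. c in K
```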

468.

The operation (4)

(2m (2) TjJ = T

(2) (2) (2) produces from a two-dimensional tensor T another T. T is always anti(2) (2) (2) symmetric. If T is antisymmetric, then T is obtained from T by a certain mode of interchange of the space and time components, as can be seen easily when writing down the relation (23) explicitly. Applying e.g. the above operation to the field strength tensor, we find / F =

0

-B cE{\ Bx cE 0 cE -Bi -cE —cE 0 J B 0

2

3

B \-cEr

2

2

3

2

3

/

0 E -E 0 F = E -Ex V cBx cB 3

-E -cBx\ Ex -cB • (24) 0 -cB cB 0 / 2

2

2

3

2

3

D. TENSOR FIELDS

469. A k-dimensional field is defined by a matrix $\overset{(k)}{A}(x)$. The field will be denoted a tensor field, if its elements transform suitably, when we change the representation. Consider thus a reversible coordinate transformation x' = f(x). We denote $\mathfrak{A}$ a tensor field, if we have

$K(\mathfrak{A}) = \overset{(k)}{A}(x), \qquad K'(\mathfrak{A}) = \overset{(k)}{A'}(x')$

with

$\overset{(k)}{A}(x) = \left[\overset{(k)}{S}(x)\,\overset{(k)}{A'}(x')\right]^{(k)}$

where $\overset{(k)}{S}(x)$ is built of the elements of the matrix

$S(x) = f(x) \circ \Box$

just in the manner as $\overset{(k)}{S}$ is built of S in 462 eq. (19).

470. We may also introduce a quantity

$\overset{(k)}{\bar{A}}(x) = \left[\overset{(k)}{S}(x)\,\overset{(k)}{A'}(x')\right]^{(k)}$

and we can say that $\mathfrak{A}$ is a tensor field if we find that $\overset{(k)}{\bar{A}}(x) = \overset{(k)}{A}(x)$.

We shall apply the □ operator to the quantities $\overset{(k)}{\bar{A}}$ so we have

$\overset{(k)}{\bar{A}} \circ \Box = \left[\overset{(k)}{S}\,\overset{(k)}{A'}\right]^{(k)} \circ \Box.$ (25)

So as to obtain (25) in a more convenient form, we note that the right-hand expression represents a sum, each of the terms of which is the derivative of a product containing k + 1 factors. The differentiation □ gives thus k + 1 terms. One of the terms contains the derivative of $\overset{(k)}{A'}$ into the variable x. Making use of the rule

$\Box = \Box' S$

we can write

$\overset{(k)}{A'} \circ \Box = \overset{(k+1)}{A'}\,S,$

therefore we find for the term obtained by differentiating $\overset{(k)}{A'}$

(.sU'oSJJ

=[stA'SJ|

= [ S

A'j

= A

.

The terms obtained from differentiating the factors of the elements of (i) S give contributions of the form

c * 4 U

A Js-'s].

Taking the k + 1 terms together we find (*) (fc+T) k ív (3)i A o Q = A +X^'LUAJS-1SJ.

(26)

1=1

In place of (26) we can also write (* + l) (*+!) A - A

A*)

W\

= l A - A J o n +

k

U

(k)\

(3)]

E^'IUAJS^S}.

(27)

We note that provided $\overset{(k)}{A}$ is a tensor, we have $\overset{(k)}{\bar{A}} = \overset{(k)}{A}$ and thus the right-hand expression becomes simpler; however

$\overset{(k+1)}{A} = \overset{(k)}{A} \circ \Box$

is in general not a tensor, since the sum on the right of (27) as a rule does not vanish.

1. THE CHRISTOFFEL BRACKET SYMBOLS

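471. We apply (27) to the tensor g(x). Remembering that (12)g = g we find

$\overset{(3)}{g} - \overset{(3)}{\bar{g}} = \left(1 + (12)\right)\left[g\,S^{-1}\,\overset{(3)}{S}\right].$ (28)

Since $\overset{(3)}{S} = (23)\overset{(3)}{S}$ we can write in place of (12) also $(12) \equiv (12)(23) = c_3^{-1}$, thus we have

$\overset{(3)}{g} - \overset{(3)}{\bar{g}} = \left(1 + c_3^{-1}\right)\left[g\,S^{-1}\,\overset{(3)}{S}\right].$ (29)

Applying the operator $\pi_3/2$ to both sides of (29) we find with the help of (15)

$g\,S^{-1}\,\overset{(3)}{S} = \overset{(3)}{C} - \overset{(3)}{\bar{C}}$ (30)

with

$\overset{(3)}{C} = \tfrac{1}{2}\pi_3\overset{(3)}{g}$ and $\overset{(3)}{\bar{C}} = \tfrac{1}{2}\pi_3\overset{(3)}{\bar{g}}.$ (31)

From (31) the reversed relation

$\overset{(3)}{g} = \left(1 + c_3^{-1}\right)\overset{(3)}{C}$ (32)

can be obtained. In place of (30) we can also write

$\overset{(3)}{S} = S \bullet \left(\overset{(3)}{C} - \overset{(3)}{\bar{C}}\right).$ (33)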

(3)

472. The three-dimensional mátrix C defined by (31) is the so-called Christoffel bracket symbol. The usual notation of the Christoffel symbol is:

In the literature one uses also expressions

where the latter expressions are elements of the mátrix (3)

(3)

e=

- c. 1

g

From the definition (31) it follows that the Christoffel symbols are symmetric with respect to the second and third suffixes. One finds

$(23)\overset{(3)}{C} = \overset{(3)}{C}.$

The Christoffel symbols represent three-dimensional field quantities, but these fields are not of tensor character, as can be seen.

2. THE COVARIANT DIFFERENTIATION

473. The □ operator applied to a scalar field produces a vector field. Indeed, applying to the relation

s(x) = s'(x')

the operation □ we find

$\Box s(x) = \Box s'(x') = \tilde{S}(x)\left(\Box' s'(x')\right).$

Writing thus $\Box s(x) = \operatorname{Grad} s(x)$ we see that Grad s(x) transforms like a vector, i.e. $\operatorname{Grad} s(x) = \tilde{S}(x)\operatorname{Grad}' s'(x')$. We can thus write symbolically $\mathfrak{Grad}\,\mathfrak{s} = \mathfrak{A}$.

Applying the □ operator to vector or tensor fields we do not obtain a tensor field, because the differentiation produces also derivatives of the transformation matrix S. The process of differentiation which leads to invariant fields is considered in more detail below.

474. Writing down (27) for a tensor field $\mathfrak{A}$, i.e. supposing that $\mathfrak{A}$ is represented by $\overset{(k)}{A}$ such that $\overset{(k)}{\bar{A}} = \overset{(k)}{A}$, we find

(A + i)

A - A

k

[(

'k)\

(3)1

= I^-'LUAJS^SJ.

(3) • (••••••• (3) (3) We can express S S in terms of the Christoffel symbols C and C. Writing the quantities without bar on the left and with bar on the right, we obtain thus the following relation (fc + 1) k [f (k)\ (3)1 (fc + 1) * [Y (fc)\ Öl] - 1

A - Ic -'LUAJ-Cj= A - X^'LUAJ-CJ. k

/=l

•••

-

7=1

Introducing thus a Grad operator as (fc)

(fc)

fc

17

(fc)~\ (3)1

IVLUAJ'-cI,.

GradA = A ö Q -

(34)

(fc)

we find that $\operatorname{Grad}\overset{(k)}{A}$ is a tensor of k + 1 dimensions. Applying (34) to a scalar the sum contains no terms and has to be omitted. In the case of a vector we have one term only and find thus

$\operatorname{Grad} A = A \circ \Box - A \bullet \overset{(3)}{C}.$ (35)

475. If we apply the Grad operator to an invariant product of tensors, then we have

(fc)

(0

Grad \ A • Bj = A • Grad B + c ^ k+l

1

u (*n wi \{c li Grad A j • Bj . k

(*)

In the case we apply the Grad operator to the quantities R introduced in 4 5 2 eq. (3) we have Grad

(C'R 'g- )) = c

(({<#, G r a / Í * }

1

k+1

g" )) . 1

(36)

476. We can define also the covariant form of the Div operator. We may write

$\operatorname{Div}\overset{(k)}{A} = \left[\left[\operatorname{Grad}\overset{(k)}{A}\;g^{-1}\right]\right].$ (37)

It is useful to make use of the following formulae, which can be verified easily with the help of (36) and the definitions given above:

r\

Div H R g- )J = l k 1

or

Div

í

tó

(* 12)| |

Grad R ( j g H

( ( T g - ) ) = (U+1 \c U Grad 1

k

(K

R '}]

y«)

,

(38)

g^'f'-

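The Laplace operator can be taken as

$L\overset{(k)}{A} = \operatorname{Div}\operatorname{Grad}\overset{(k)}{A}.$ (39)

Applying (34) to g we find

$\operatorname{Grad} g = \overset{(3)}{g} - \left(1 + (12)\right)g \bullet \overset{(3)}{C} = 0,$

thus we have identically

$\operatorname{Grad} g = 0$ and also $\operatorname{Div} g = 0.$ (40)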

E. CRITERIA FOR HOMOGENEOUS REGIONS

477. A region $\mathfrak{R}$ is taken to be homogeneous, if it admits of representations of $\mathfrak{g}$

$g' = K'(\mathfrak{g})$ = independent of x'. (41)

Suppose $g(x) = K(\mathfrak{g})$ to be the representation of $\mathfrak{g}$ in an arbitrary system of reference K. If the region $\mathfrak{R}$ is a homogeneous one, then there exists a coordinate transformation

x' = f(x), (42)

so that

$\tilde{S}(x)\,g'\,S(x) = g(x)$ (43)

with

$S(x) = f(x) \circ \Box.$ (44)

From (41) it follows that $\overset{(3)}{C'}(x') = 0$ and therefore also $\overset{(3)}{\bar{C}}(x) = 0$, and we may write using relation (33)

$\overset{(3)}{S}(x) = S(x) \bullet \overset{(3)}{C}(x).$ (45)

Relation (45) is a differential equation with the help of which S(x) and also f(x) can be determined if the region $\mathfrak{R}$ is homogeneous indeed. Whether or not (45) possesses solutions can be ascertained by differentiating (45) into x. We write

$\overset{(3)}{S} \circ \Box = \overset{(4)}{S}$

where we omit to write down the variable x explicitly. We find with the help of (17) (4) (4) r cm W\ (s)\ S = S-C-t-c (c - lS-S-gj-Cj, 1

4

s

(3)

(3)

Inserting S from (45) and g from (32) we find (4) (4) (4) r / (3)) (3)1 S = S-C-C4 cí l - 3~ CJ-Cj. 1

s

c

1

L

(3)

Using the symmetry of C we can simplify the second term and find as the result of a short calculation S = S-ÍC-(24)(C-CJJ.

(46)

(4) If S is to be the third derivative of the transformation function f(x), then it has to be symmetric in the last three suffixes; we expect thus e.g. (4)

( 1 - ( 2 4 ) ) S = 0.

(47)

(4) Since it follows from (45) that S is automatícally symmetric in (23) it follows also that provided (47) is fulfilled it is also symmetric in (34) since we have (34) = (23X24X23). (4) Thus provided (47) is fulfilled, then S is symmetric in the three last suffixes. Inserting (46) into (47) we find (W

(3) (3))

( 1 - ( 2 4 ) ) | C + C - C | = 0. We can also write (4) 1 (4) 1 (*) (1 - (24)) C = - (1 - (24))** = - 2 * 4 * . Defining thus a four-suffix quantity (4) 1 (4) /S) (3) ^ R(x) = y * g(x) + (1 - (24)) (C(x)• C(x)j, 4

(48)

we see that a necessary condition for a region $\mathfrak{R}$ to be homogeneous is that

$\overset{(4)}{R}(x) = 0$ inside $\mathfrak{R}$. (49)

Indeed, unless (49) is fulfilled the differential equation (45) does not admit of solutions.
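The content of criterion (49) can be illustrated numerically: a representation may depend on x and still belong to a homogeneous region. The sketch below is an illustration under loudly stated assumptions and does not use the formalism or sign conventions of this appendix. It takes the standard textbook Christoffel symbols and curvature tensor, two-dimensional positive definite metrics instead of the four-dimensional propagation tensor, and central differences with an arbitrary step `h`. It compares plane polar coordinates (flat, so straight coordinates exist) with the unit sphere (curved).

```python
import math

def inverse2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def metric_derivatives(g, x, h=1e-4):
    """dg[n][l][m] = d g_lm / d x_n by central differences."""
    out = []
    for n in range(2):
        xp, xm = list(x), list(x)
        xp[n] += h
        xm[n] -= h
        gp, gm = g(xp), g(xm)
        out.append([[(gp[l][m] - gm[l][m]) / (2 * h) for m in range(2)]
                    for l in range(2)])
    return out

def christoffel(g, x):
    """Gamma[k][m][n] = sum_l ginv[k][l] * C[l][m][n], with the bracket symbol
    C[l][m][n] = (d_n g_lm + d_m g_ln - d_l g_mn) / 2, symmetric in m, n."""
    dg = metric_derivatives(g, x)
    ginv = inverse2(g(x))
    c = [[[0.5 * (dg[n][l][m] + dg[m][l][n] - dg[l][m][n])
           for n in range(2)] for m in range(2)] for l in range(2)]
    return [[[sum(ginv[k][l] * c[l][m][n] for l in range(2))
              for n in range(2)] for m in range(2)] for k in range(2)]

def riemann(g, x, h=1e-4):
    """Textbook form: R[r][s][m][n] = d_m Gamma^r_ns - d_n Gamma^r_ms
    + Gamma^r_ml Gamma^l_ns - Gamma^r_nl Gamma^l_ms."""
    gam = christoffel(g, x)
    dgam = []   # dgam[m][r][a][b] = d Gamma^r_ab / d x_m
    for m in range(2):
        xp, xm = list(x), list(x)
        xp[m] += h
        xm[m] -= h
        gp, gm = christoffel(g, xp), christoffel(g, xm)
        dgam.append([[[(gp[r][a][b] - gm[r][a][b]) / (2 * h)
                       for b in range(2)] for a in range(2)] for r in range(2)])
    return [[[[dgam[m][r][n][s] - dgam[n][r][m][s]
               + sum(gam[r][m][l] * gam[l][n][s] - gam[r][n][l] * gam[l][m][s]
                     for l in range(2))
               for n in range(2)] for m in range(2)]
             for s in range(2)] for r in range(2)]

def max_abs(r):
    return max(abs(r[a][b][c][d]) for a in range(2) for b in range(2)
               for c in range(2) for d in range(2))

polar = lambda x: [[1.0, 0.0], [0.0, x[0] ** 2]]              # plane, (r, phi)
sphere = lambda x: [[1.0, 0.0], [0.0, math.sin(x[0]) ** 2]]   # unit sphere, (theta, phi)

print(christoffel(polar, [1.3, 0.4])[0][1][1])   # non-zero although the plane is flat
print(max_abs(riemann(polar, [1.3, 0.4])))       # vanishes up to rounding
print(max_abs(riemann(sphere, [1.0, 0.4])))      # clearly non-zero
```

In the polar case the bracket symbols do not vanish, yet the curvature does, so a transformation to straight coordinates exists; on the sphere no such transformation can be found.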

478. The argument can also be reversed. Differentiating (45) successively into x we obtain a recursion formula for the derivatives $\overset{(2+l)}{S}$ of S(x). Expanding S(x) or f(x) into powers of x using the derivatives thus obtained we obtain a solution of (43)-(44), provided (49) is fulfilled. We see therefore that (49) is a necessary and also sufficient condition for a region $\mathfrak{R}$ to be homogeneous.

1. ALMOST STRAIGHT REPRESENTATIONS

479. In an inhomogeneous region the system (43) and (44) admits of no solutions; in such a region one can try to introduce almost straight coordinates, i.e. one may try to find a representation K' such that g'(x') is almost constant. For any value of g(x) we are led to (45), giving a differential equation which, if it admits of solution, leads to the transformation (42) defining straight coordinates. Differentiating (45) one is led to (46); if $\overset{(4)}{S}$ thus obtained is symmetric in the last three suffixes, then (43) and (44) admit of solution.

The matrix $\overset{(4)}{S}$ as obtained from (46) is in any case symmetric with respect to (23); we can introduce a symmetric tensor

$\langle\overset{(4)}{S}\rangle = \tfrac{1}{3}\left(1 + (24) + (34)\right)\overset{(4)}{S}.$

The latter is symmetric in all the three suffixes. Since $\langle\overset{(4)}{S}\rangle$ is equal to $\overset{(4)}{S}$ in a homogeneous region, we can take $\langle\overset{(4)}{S}\rangle$ to define an approximate solution of (43) and (44). The difference between the two solutions can be written

(4)

(4)

(4)

1

ŐS = <S> S = j [(1 - (24)) + (1 (34))] S , however (1 - (34)) = (23)(1 - (24))(23) = (23)(1 - (24)), therefore we can also write (4)

1

(

(4)1

SS= - y ( l + ( 2 3 ) ) ( s R | .

(50)

The function <S(x)>, giving the approximate solution, can be obtained by integration; we find (3)

•*•(*)

<S(x)> = S(0) + S(0)x + J «S(x')> (x x') o dxj*

(51)

o

or 1 (3) í r (4) = JI. + S(0)x + S(0)x + - J «S(x)>(x x') W)< > 2

3

0

(the integration can be taken on any path from 0 to x); the above trans formation function satisfies thus approximately (43) and (44) in the vicinity of x = 0. 480. We may write g(x') = g' + őg(x') for the representation of g in the almost straight systems of reference. We have thus <S(x)>(g' + <5g'(x'))<S(x)> = g(x).

(52)

Differentiating (52) with respect to x we find for x = 0

$$\delta\overset{(3)}{g'}(0) = 0. \tag{53}$$

Differentiating twice with the help of (51) and (53) we find for x = 0

$$\bigl(1 + (12)\bigr)\left(g\,S^{-1}\,\delta\overset{(4)}{S}\right) = -\,\delta\overset{(4)}{g},$$

therefore with the help of (52), remembering that

$$\bigl(1 + (12)\bigr)(23) = 0 \mod \overset{(4)}{R},$$

we find

$$\delta\overset{(4)}{g} = \tfrac{1}{3}\,\bigl(1 + (12)\bigr)\,\overset{(4)}{R}$$

and transforming to K'

$$\delta\overset{(4)}{g'} = \tfrac{1}{3}\,\bigl(1 + (12)\bigr)\,\overset{(4)}{R'},$$

thus we have

$$g'(x') = g' + \tfrac{1}{6}\,\bigl(1 + (12)\bigr)\,\overset{(4)}{R'}\,x'^{2} + \dots$$

The above representation is as straight as possible, since the second derivative of g'(x') must be at least of the order of the elements of $\overset{(4)}{R'}$.

2. TENSOR CHARACTER OF THE RIEMANN-CHRISTOFFEL TENSOR

481. The quantity $\overset{(4)}{R}(x)$ defined by (48) is a tensor; apart from a change in convention it is equal to the Riemann-Christoffel tensor. One finds

$$\overset{(4)}{\mathfrak{R}}(x) = -(23)\,\overset{(4)}{R}(x),$$

where $\overset{(4)}{\mathfrak{R}}(x)$ is the usual form of the tensor. We prove the tensor character

of R using our formalism. We start from a relation which is obtained by applying the operator • to g S S = g; with the help of (16) one finds thus - 1

<í HgS" o • ) = [c,- (g - g S " ? ) ] S" . 1

1

1

(54)

1

Applying • to the relation (30) one finds with the help of (16) and (54) f<3)

ff

(4)

(3)\

(C - Cj c-n = g S ^ S + c ( k

(3ni

f(3) 1

k

(3)1

[g - g S ^ S j J S ^ S J .

With the help of (30) and (32) one can rewrite the second term on the right-hand side and find

c {---} = c (12)|lc + 1

4

(3V|

((3)

a>\\

CJ-lc-Cjj.

As the $\overset{(3)}{C}$ are symmetric in (23), the expression in the { } is symmetric in (34), and one can write for the operator before the bracket $c_4(12) \equiv (1234)(12)(34) = (24)$.

We have thus A3)

ÖA

(4)

f/O)

(3)1

/(3)

<3)\)

[C - Cj o Q = g S ^ S + (24) {(C + Cj • (C - Cj} .

(55)

(3)

Applying (27) to C one finds (4)

(4)

OA

/(3)

(3) f(3)

<3)\

C-C = [c-CjoQ + C-[C-Cj+ + ca"

1

3

f(S) ((3) L

\í

(3)11

(in my

(imi

C• |c - Cjj + c I k ' c J • IC - C j j . 3

(56)

With the help of (14) we find further (3) /(3)

r/(3)

OÁ

C-{C-CJ

(3A (3)1

= cl[{C-Cj-c\

(57)

and C (3ft f(3) (Sn fffl) (3)1 (3)1 (c - cJ-lC-Cj = c Llc-Cj-Cj. (58) The expression in the [ ]-bracket of (57) is symmetric with respect to (34); therefore we have 1

3

2

o n éöi

|7(3)

c - c [] = ( 2 4 ) [ i C - C j - c J . 1

2

3

(59)

Similarly, the expression in the []-bracket of (58) is symmetric with respect to (12) thus we find cA

=

(14).

We note further ~ f7(8)

m (3)"V (3)1

(1 - (24)(14)) Lic - CJ • Cj = 0 .

(60)

So as to obtain expressions of the form of $\overset{(4)}{R}$ we note that

$$\bigl(1 - (24)\bigr)\,\overset{(4)}{C} = \tfrac{1}{2}\,\pi_4\,\overset{(4)}{g}. \tag{61}$$

Thus applying (1 - (24)) to both sides of (56) we obtain with the help of (47), (55), (59), (60) and (61) the following relation

ao

*i [ g -

(7n f ((3) cm (<$> ö n 8 i = (1 - (24)) | - [C + Cj • (C - Cj + (3) H3)

(3)1

/(3)

(3)1 (3)1

+ c-(c-cj-lc-cj-c|.

The terms in the { }-bracket reduce to (3) (3)

(3) (3)

{} = - C - C + C - C and therefore we find 1 2

(«

(<S) OÁ 1 (4) (3) (3) + (! ~ (24))lC-cJ = y n a + (1 - (24))C-C,

or using the notation (48) (4) (4) R=R (4)

thus in accord with 463 $\overset{(4)}{R}$ is a tensor.

3. SYMMETRIES OF THE $\overset{(4)}{R}$ TENSOR

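482. We note that

$$(13)(24) = 1 \mod \overset{(3)}{C}\cdot\overset{(3)}{C},$$

thus

$$\bigl(1 - (24)\bigr) = \tfrac{1}{2}\bigl(1 - (24)\bigr)\bigl(1 + (13)(24)\bigr) = \tfrac{1}{2}\bigl(1 - (24)\bigr)\bigl(1 - (13)\bigr) = \tfrac{1}{2}\,\pi_4 \mod \overset{(3)}{C}\cdot\overset{(3)}{C}$$

and we find in place of (48) also

$$\overset{(4)}{R} = \tfrac{1}{2}\bigl(1 - (24)\bigr)\bigl(1 - (13)\bigr)\left(\overset{(4)}{g} + \overset{(3)}{C}\cdot\overset{(3)}{C}\right) \tag{62}$$

or

$$\overset{(4)}{R} = \tfrac{1}{2}\,\pi_4\left(\overset{(4)}{g} + \overset{(3)}{C}\cdot\overset{(3)}{C}\right). \tag{63}$$

(62) and (63) are, of course, identical with (48). From the above relations the symmetries of $\overset{(4)}{R}$ can be obtained in a simple manner. From (62) we see immediately that

$$(24)\overset{(4)}{R} = (13)\overset{(4)}{R} = -\overset{(4)}{R}, \tag{64a}$$

so

$$(13)(24)\overset{(4)}{R} = \overset{(4)}{R}. \tag{64b}$$

Further, remembering that according to the definitions $c_4\pi_4 = -\pi_4$, we have

$$(1234)\overset{(4)}{R} = -\overset{(4)}{R}. \tag{64c}$$

Multiplying (63) from the left by (12)(34)(1234) = (13), and remembering (64a) and (64c), we find

$$(12)(34)\overset{(4)}{R} = \overset{(4)}{R}, \tag{64d}$$

thus

$$(12) \equiv (34) \mod \overset{(4)}{R}.$$

Furthermore one finds as the result of a simple calculation that

$$\bigl(1 + c_3 + c_3^{2}\bigr)\,\pi_4 = 0 \pmod{\overset{(4)}{g}},$$

therefore

$$\bigl(1 + c_3 + c_3^{2}\bigr)\,\overset{(4)}{R} = 0. \tag{65}$$

Relations (64a) and (64d) give all the symmetries of $\overset{(4)}{R}$.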
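The symmetry relations (64a), (64c), (64d) and (65), and the count of independent elements derived from them in 483, can be illustrated with a short script. This is only a sketch under stated assumptions: the sample tensor `R` is built from an arbitrary symmetric matrix `h` (hypothetical numbers, not taken from the book), with suffix order chosen so that the antisymmetric pairs are the suffixes (1,3) and (2,4); the counting functions simply enumerate suffix configurations in the way the text of 483 describes.

```python
from itertools import combinations, product

N = 4  # each suffix runs over four values

# Arbitrary symmetric matrix (hypothetical numbers) used to build a sample tensor.
h = [[2, 1, 0, 3],
     [1, 5, 4, 1],
     [0, 4, 7, 2],
     [3, 1, 2, 9]]


def R(v, m, k, l):
    """Sample tensor R[v, m, k, l]; antisymmetric in suffixes (1,3) and (2,4)."""
    return h[v][m] * h[k][l] - h[v][l] * h[k][m]


quads = list(product(range(N), repeat=4))

# (64a): exchanging suffixes 2,4 or suffixes 1,3 changes the sign.
rel_64a = all(R(v, l, k, m) == -R(v, m, k, l) and R(k, m, v, l) == -R(v, m, k, l)
              for v, m, k, l in quads)
# (64c): a cyclic shift of all four suffixes changes the sign.
rel_64c = all(R(m, k, l, v) == -R(v, m, k, l) for v, m, k, l in quads)
# (64d): exchanging 1,2 and 3,4 together leaves the element unchanged.
rel_64d = all(R(m, v, l, k) == R(v, m, k, l) for v, m, k, l in quads)
# (65): the cyclic sum over the first three suffixes vanishes.
rel_65 = all(R(v, m, k, l) + R(k, v, m, l) + R(m, k, v, l) == 0
             for v, m, k, l in quads)


def count_groups(n):
    """Groups of not identically vanishing elements with equal absolute value."""
    groups = set()
    for v, m, k, l in product(range(n), repeat=4):
        if v == k or m == l:  # excluded by condition (66)
            continue
        pair_a = tuple(sorted((v, k)))  # suffixes 1 and 3, order irrelevant
        pair_b = tuple(sorted((m, l)))  # suffixes 2 and 4, order irrelevant
        groups.add(tuple(sorted((pair_a, pair_b))))  # the pairs may be exchanged
    return len(groups)


def count_cyclic_relations(n):
    """One non-trivial relation (65) per set of four different suffixes."""
    return sum(1 for _ in combinations(range(n), 4))


groups = count_groups(N)
independent = groups - count_cyclic_relations(N)
print(rel_64a, rel_64c, rel_64d, rel_65)  # prints: True True True True
print(groups, independent)                # prints: 21 20
```

For other numbers of dimensions the same enumeration gives 1, 6 and 50 independent elements for n = 2, 3 and 5.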

483. Taking the symmetry properties of $\overset{(4)}{R}$ into consideration, the number of linearly independent elements can be derived. Firstly we note that because of (64a) elements $R_{\nu\mu\kappa\lambda}$ can only be different from zero if

$$\nu \neq \kappa, \qquad \mu \neq \lambda. \tag{66}$$

There exist 21 configurations of suffixes obeying (66) and there exist 21 types of components which do not vanish identically. Indeed, there exist six pairs of suffixes ν ≠ κ, and there are fifteen quadruplets obeying (66) in which the pair ν, κ differs from the pair μ, λ and six in which the two pairs coincide, where we count those configurations only once which give the same absolute value for the element $R_{\nu\mu\kappa\lambda}$. We can thus arrange the elements of $\overset{(4)}{R}$ into twenty-one groups, each group containing (not identically vanishing) elements with equal absolute values.

Relation (65) gives a linear connection between three different elements of $\overset{(4)}{R}$. Because of (64a) this relation will be fulfilled identically for elements of $\overset{(4)}{R}$ excepting those with four different suffixes. For the latter elements (65) gives the relation

$$R_{1234} + R_{3124} + R_{2314} = 0,$$

which is independent of (64a). The elements occurring in this relation connect three out of the twenty-one groups of elements discussed above. We see thus that out of the 21 groups given above only 20 contain independent elements. Thus the Riemann-Christoffel tensor possesses altogether 20 nontrivial independent elements.

4. THE REDUCED FORM OF THE $\overset{(4)}{R}$ TENSOR

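484. The Riemann-Christoffel tensor $\overset{(4)}{R}$ can be contracted and thus we obtain the following tensor quantities

$$\overset{(2)}{R} = \bigl(\bigl(\overset{(4)}{R}\,g^{-1}\bigr)\bigr) = \bigl(\bigl(g^{-1}\,\overset{(4)}{R}\bigr)\bigr). \tag{67}$$

The contraction leads to the same tensor $\overset{(2)}{R}$ whether we multiply from the left or from the right by $g^{-1}$. Further we have

$$R = \bigl(\bigl(g^{-1}\,\overset{(2)}{R}\bigr)\bigr) = \bigl(g^{(-2)}\,\overset{(4)}{R}\bigr),$$

where we write $g^{(-2)} = g^{-1}\circ g^{-1}$. From (67) we find with the help of (38)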

(2)

- 1

([

_ 1

\&)

(4XU2)

Div R . [ l g - grad RJ 1

g" ] 1

(

(

(tí\\(i)

= (g<- )[(35) grad RJJ 2

.

(68)

We find also í

(4)1(4)

Div (gR) = Grad R = [g<" > grad Rj . (69) We can reduce (69) to a simple form. With the help of a simple consideration one finds that 2

i

(4)

(5)

Grad R = - (1 - (24) - (13) + (13)(24)) ,

(70)

Y

where

(«)

= ( l + ( 3 4 ) + (35)) It is verified easily that Y

f 1 (S) T

® (4)

(55 (3; (4) (3)

(3) (3) (3)1

g + C-g-(13)g-C-g-C-2C-C-Cj.

(12) = (34) = (35) = 1

(5)

mody.

(71)

Introducing (70) into (69) we can simplify the expression thus obtained remembering that $g^{(-2)}$ has the following symmetries

$$(12) = (34) = (13)(24) = 1 \mod g^{(-2)}. \tag{72}$$

Therefore we can apply any of the above operators to Grad $\overset{(4)}{R}$, or to any terms of Grad $\overset{(4)}{R}$, without changing the value of the sum. Introducing thus (70) into (69) we obtain four terms; we find that the first and fourth are equal, since

$$\bigl(g^{(-2)}\,\overset{(5)}{Y}\bigr) = \bigl(g^{(-2)}\,\bigl[(13)(24)\,\overset{(5)}{Y}\bigr]\bigr).$$

We find also that the second and third terms are equal. Indeed,

$$\bigl(g^{(-2)}\,\bigl[(24)\,\overset{(5)}{Y}\bigr]\bigr) = \bigl(g^{(-2)}\,\bigl[(13)\,\overset{(5)}{Y}\bigr]\bigr), \tag{73}$$

thus with the help of (69), (70) and (73) we find

$$\operatorname{Grad} R = 2\,\bigl(g^{(-2)}\,\bigl((1 - (24))\,\overset{(5)}{Y}\bigr)\bigr). \tag{74}$$

Similarly, introducing (70) into (68) we find, making use of (71), also

$$\bigl(g^{(-2)}\,\bigl[(35)\operatorname{Grad}\overset{(4)}{R}\bigr]\bigr) = \bigl(g^{(-2)}\,\bigl[(1 - (24))\,\overset{(5)}{Y}\bigr]\bigr) - \bigl(g^{(-2)}\,\bigl((35)(1 - (24))(13)\,\overset{(5)}{Y}\bigr)\bigr). \tag{75}$$

The second term on the right vanishes, as can be seen in the following manner. The operator inside the bracket can also be written

$$(35)(1 - (24))(13) = (35)(1 - (24))(13)(35) = (15) - (24)(15) \mod \overset{(5)}{Y}.$$

However, under the summation we can replace (24)(15) by (13)(15); further (13)(15) = (13)(15)(35) = (15). Thus the second term under the sum cancels the first, and the second term on the right of (75) can be omitted. We find therefore comparing (74) and (75)

$$\operatorname{Div}\overset{(2)}{R} = \tfrac{1}{2}\operatorname{Grad} R,$$

and also

$$\operatorname{Div}\left(\overset{(2)}{R} - \tfrac{1}{2}\,g\,R\right) = 0.$$
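For comparison with other texts, the two relations that close 484 can be written in the customary index notation. The following is a sketch that assumes the usual correspondence (contracted tensor $R_{\mu\nu}$ for the second-order R, scalar R, covariant derivative $\nabla_\mu$); these symbols are not used in the book, and the sign and suffix conventions of the two notations may differ.

```latex
\nabla_\mu R^{\mu}{}_{\nu} = \tfrac{1}{2}\,\partial_\nu R ,
\qquad
\nabla_\mu\!\left( R^{\mu\nu} - \tfrac{1}{2}\, g^{\mu\nu} R \right) = 0 .
```

In that notation the second relation is commonly called the contracted Bianchi identity; it states that the combination $R^{\mu\nu} - \tfrac{1}{2} g^{\mu\nu} R$ has vanishing divergence.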


This analysis shows, among other things, the experimental significance of the Michelson-Morley experiment and why it was necessary to use the well-known experimental arrangement in the form in which it was actually used. The author has written many articles and given many lectures both in Hungary and abroad elaborating his views, which deviate in certain respects from those usually put forward. The concepts and modes of treatment have given rise to discussions and aroused great interest. With the new concepts the author's aim is to induce constructive discussion, but at the same time his book may serve as a handbook for physicists, scientists and philosophers interested in the problems. It makes interesting reading and could also be used as a university textbook.

AKADÉMIAI KIADÓ
Publishing House of the Hungarian Academy of Sciences, Budapest
Distributors: KULTÚRA, Budapest 62, P.O.B. 149

L. Jánossy

THEORY OF RELATIVITY BASED ON PHYSICAL REALITY

The book deals with the subject of the special and general theory of relativity. It gives the usual mathematical formalism elaborated by Einstein and others. The treatment places great emphasis on the question of how the theory can be derived from the experimental facts, and this is the feature in which the treatment deviates to some extent from the usual ones. There is also a tendency to make as little use as possible of specialized systems of coordinates. Before writing this monograph the author was engaged for several years in fundamental research concerned with the double nature of light. These researches contained intricate experiments and raised questions which are also of interest in connection with some of the fundamental problems of the theory of relativity. The author analyses in great detail the methods by which the velocity of light can be experimentally determined. The controversial question of the carrier of light is also dealt with.
