This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
00, we get
13
. 28 = r 2 SIn .
The lineelement (1.2) takes the form
1.1.1.1
Galilean group
The choice of the reference frame is arbitrary. It is thus interesting to exhibit the frame transformations that preserve the form (1.1) of the line element dC 2 . It is equivalent to finding the coordinate transformations keeping the 3 axes of reference orthogonal. The set of these transformations is therefore given by the rigid rotations of these three axes and a translation of the origin (1.4) The rotation Aj can be time dependent, but it is rigid because it is independent of the position x j ; it is therefore the same for all points at a given time. Tj is a timedependent translation. The invariance under the action of these transformations (1.4) reflects the isotropy and the homogeneity of E uclidean space. Among all these rigid transformations , those of the Galilean group playa central role. The Galilean group is the subgroup of t imeindependent rotations and of translations linear in time, Ti = Tovit , where Td and Vi are constant vectors , hence preserving the equivalence between inertial frames. The laws of physics are assumed to be identical for all inertial reference frames , which means that they are invariant under the action of a transformation of the Galilean group (1.5) This group is composed of three rotations , three translations and three changes of the inertial reference frames. Taking also the possibility to change the origin of time, it is therefore ten parameters group (three angles , a vector with three translation coordinates, one velocity vector and a change in the origin of t ime). 1.1.1.2 Trajectory of a massive body The equations of motion of a body of mass m evolving in a gravitational potential ¢ can be obtained by determining the extremum of the Lagrangian constructed from the total energy of the particle (1.6) where we have considered that the inertial m ass, m[ , and the gravitational mass, m G , could a priori be different. The Euler Lagrange equations [c5L / c5xi d(bL/bxi)/dt] are rewritten in any coordinate system as l ISee the following chapter for a brief reminder of analytical mechanics and Ref. [5] for a complete and detailed study.
14
General relativity
(1.7
In Cartesian coordinates, gij = Oij and ffj = O. We therefore recover the equations o motion in their classical form x =  V ¢;, when m[ = mG' Symbols ffj represent the fictitious forces (centrifugal, Coriolis ... ) that have to be taken into account when the reference frame is not inertial. Also , note that (1. 7) is independent of the mass of the body if m[ = m G ; all bodie fall identically, whatever their mass or chemical composition. This is what is called the weak equivalence principle or universality of free fall. Actually, in the Newtonian case it is a priori possible to introduce two different masses and the equality between the gravitational mass and the inertial mass seems fortuitous. Using pendulums, Newton (1686) showed that ImG  m[ I
m G+ m [
<
10
3.
(1.8
The equality between the gravitational and the inertial mass was therefore only required up to one thousandth. However, Newton considered this equality to be the crux of his theory of gravitation and he dedicated the introduction of his Principia to it. As we will see, this property plays a central role in the construction of theories o gravitation. 1.1.2
Spacetime of special relativity
The Maxwell laws of electromagnetism have the property of not being invariant unde the Galilean group (1.5). For example, the electric force generated by a charge at res is not invariant since a magnetic force appears in a reference frame where the source is in inertial motion (see, for example, Ref. [6]). On the other hand, Maxwell deduced from his theory that light is an electromagnetic wave whose velocity of propagation, c depends on the permittivity and the permeability of the medium. One had to introduce a fictitious medium in which these waves could propagate; the aether. The Maxwel equations would only be valid in this particular reference frame, identified with the absolute reference system of Newton. In a different inertial reference frame the speed o light should therefore have been measured as c ± v, which opened up the possibility o determining the absolute reference frame , that of the aether , at the price of abandoning the principle of Galilean relativity. 1.1 .2.1
Lorentz transformations
The experiments by Michelson and Morley (1887) refuted the abovementioned hy pothesis; light was shown to have the same speed of propagation in any given inertia reference frame. In 1905, Einstein changed this perspective by reaffirming the principle of special relativity according to which all the laws of nature should have the same form , no matter what the inertial reference frame is (see for example Ref. [7] for the historical aspects). In particular, this should have been the case for the Maxwell equations, which implied that the speed of light should be the same in all inertial reference
Spacetime and gravity
15
frames. Consequently, one deduced that the rules for changing the inertial frame can not be those of Galileo (1.5). The new group of transformations is the Lorentz group. An inertial observer can label an event by considering a rigid Euclidean reference frame, which allows him to determine the Cartesian coordinat es x, y , z . At every point in that grid, he can place a synchronized clock in such a way that all events are labelled by a quadruplet (t, x, y , z). A second observer in an inertial reference frame moving at a velocity v along the Ox axis with respect to the first observer can also construct such a coordinate system . It permits him to identify the same events by another quadruplet (tl ,xl , yl ,ZI). The two sets of coordinat es are relat ed by the Lorentz transformation , called a boost , ctl =
Xl
yl
=
=
ct  (v/c) x
J1  v2 /c 2 ' X 
(v/c)ct
J1  v2 /c 2 ' y,
Zl = Z.
(1.9)
These transformations have the property of reducing to Galilean transformations when v « c and keep the speed of light equal to c in all inertial reference frames .
1.1 .2.2 Minkowski spacetime Special relativity postulates that all inertial observers are equivalent , the sets of coordinates determined by different observers have no intrinsic meaning. The invariant quantity upon changing the reference frame is obtained by considering a pseudoEuclidean version of P ythagoras ' theorem. The lineelement (1.1) is replaced by the lineelement between two events, in Minkowski coordinates, of spacetime (1.10) with x O == ct. The Greek indices vary between 0 and 3 and the Einst ein summation convention is extended to these values. We note that contrary to the Euclidean metric, ds2 can be negative or vanish for noncoincident events. In this fr amework, space and time are united into a spacetime, the Minkowski spacetime. This spacetime is absolute, just as in the Newtonian framework. It is endowed with four orthonormal axes constituting the absolute Minkowski reference frame. The set of points such that ds 2 = 0 is known as the light cone and represents , for all points M in the spacetime, the ensemble of points that can receive a light beam emitted in M (future light cone) or received in M (past light cone). This light cone determines the causal structure of Minkowski spacetime illustrat ed in Fig. 1.2. Indeed, since the velocity of all bodies is smaller than c, only pairs of points (M, P ) such that ds 2 < 0, in other words such that M is at the interior of the light cone in P, can be joined by traj ectories ema nating from P. The dist ance is then timelike and the proper t ime T is generally introduced by the relation (1.11)
j
.. UNIVERSIDAD . ',
j
\
1
AU'TONOMA DE
MADPJO ru!3tlOr~ GEI'~a.~
16
General relativity
If ds 2 > 0, t he interval is said t o be spacelike and the points of t he couple (M, P cannot be joined by any physical trajectory.
Fig. 1.2 T he causal structure of Minkowski spacet ime. The not ion of simultaneity is re placed by the notion of a light cone defined by d s 2 = O. T he worldline Xi>( A) emanating from P is found at t he interior of t he light cone (d s 2 < 0) . All points exterior to t he cone can neve be joined by a physical t rajectory leaving from P. They are separated from P by a sp acelik interval (d s 2 > 0) .
1.1 .2.3
Proper time
The trajectory of any massive part icles can be represented in the parametric form xl' (A ) , where A is an arbit rary paramet er (increasing towards infinity) . We can repa rameterize t his worldline as a function of the proper time
where U I' = dxl' I dA is t he tangent vector to t he worldline. In t his new parameteriza tion , t he t angent vector is given by u l' = dxl' I dT and it satisfies
Let us now consider an observer 0 ' that moves wit h a velocity v with respect to a inertial reference frame, 5 , where an inertial observer 0 is locat ed . As 0 is motionless its t rajectory is given by Xi = const and its proper t ime is dT~= dt2.
The moving observer 0 ' has a tra jectory such that dx i I dt = reference frame 5. Its proper time is thus
Vi
with respect to th
Spacetime and gravity
2 2 i ' 2 dTO' = dt  b;j dX dx J Ie =
(
17
V2 ) dt.
1  e2
2
In conclusion, the proper times measured by two observers 0 and 0' , obtained by eliminating t he time coordinate, are related by
dTO' =
V~ 1  ~ dTO·
(1.12)
Thus, contrary to the Newtonian case, t he duration of the journey measured by 0 is not the same as that measured by 0' , depending on their t raj ectories. If the clocks of o and 0' have been initially synchronized , they will not be synchronous at another meeting point. This is what is called the phenomenon of time dilation.
1.1 .2.4 Poincare group Just as in the Newtonian case, the metric (1.10) can be written in any coordinate system (ylt) that is related to Minkowski coordinates (XV) by a relation of the form XV (ylt) . A vector dx v in Minkowski space has the coordinates dylt = (aylt I ax 1.1 )dxl.l . We thus deduce that the lineelement (1.10) has the form
(1.13) For example, we can consider the Rindler coordinat es, relat ed to Minkowski coordinates by
t = RsinhT,
x = R cosh T ,
y
=
y' ,
z = z' .
The lineelement (1.13) then takes the form
The laws of physics should be, according to the principle of special relativity, independent of any preferred inertial reference frame. It is therefore important to exhibit transformations of the reference frame that preserve the form (1.10). These particular transformations constitute the Poincare group and take the form
(1.14) where the rotation matrices A ~ and the vector Tit do not depend on the coordinates. Note that this group of transformations is more constrained than the Galilean group in which the rotations could depend on time. This Poincare group consists of 3 rotations , three Lorentz boosts and four translations. It therefore has 10 degrees of freedom , which means that 10 values are needed to completely describe any transformation of the group.
18
General relativity
1.1.2.5
Poincare algebra
The Poincare group can be constructed using infinitesimal transformations. Let u first consider a simple transformation of vector TI' = 5xl' and write the transformatio (1.14) for a scalar function f(x) in the form f (yl') = (1  i5x cx Pcx ) f (xl'). This relatio defining the generator PI" we get PI' = i0l' by identification with a firstorder Taylo expansion of the function f(y). Equivalently, a Lorentz transformation can be writte yl' = xl' + w~xv, with WVI' = wI'V, and the generator JI'V is obtained using th definition f (yl') = (1  iw cx {3 J cx {3 ) f (xl') , which gives JI'V = i (x l'0V  xv0l'), Other representations of the Poincare group are possible, but all of them respec the following commutation relations , which form the Poincare algebra:
(1.15 for the translations, and
(1.16
which is the Lorentz algebra, itself independently closed. Let us recall that the com mutator between two operators is defined by [A,B] = AB  BA.
1.1.2.6
Trajectory of a free body
In an analogous way to the Newtonian case, the trajectory of a massive free body ca be obtained by ext remising the action L
=
1
g 2 I'V (XA):i;I':i;V ,
(1.17
where a dot above a variable represents a derivative with respect to A. Euler Lagrang equations then take the form
(1.18
where ul' = dxl' Id A and ill' = dul' IdA. In Minkowski coordinates, we find that a inertial observer must follow a trajectory of constant velocity
(1.19
Note the analogy with the massive body case in (1.6) , but also the difference: in the cas of (1.17), it is not possible to simply introduce a t erm equivalent to the gravitationa potential.
1.1.3.1 Equivalence principle Einstein based his analysis on the fact that in Newtonian gravity all test objects fall in exactly the sarie way in an externai gravitational field, whatever their mass or chemical couipositiorr.
In Galilean and Newtonian physics, this universality of free fall conles from the eqrrality betwcen the inertial rriass arrd the gravitational mass, the equivalence principle [cf. (t.8)]. This equality may seeni accidental, but it is verified experimentally with high accuracy. In the Ner'vtonian casc. a smail deviation fiom this equality would not be catastrophic. Is this equivalence principle a good approximation, a simple coincidence or does it signal something more profound? Einsteiri corisiderecl the study of accelerated motion. At first glance, it could seem that objects are following clifferent laws of physics, since one rreeds to add fictitious forces (inertia, Coriollis...) to describe their dynarrics. For Einstein. these forces are just as real as the force of gravity since the iatter can also be eliminated in a freefall reference fra,rne. Thus, an accelerated observer is subiect to the same laws of physics as those encor.rntered in the gravitational field, introduced in addition to the other forces. Everything happens as if one could free orreself from gravity by an astute choice of the reference frame. This is possible only if the equivalencc principle is strictly valid. Eirrsteirr thelefbre elevated it to the status ofa first principlc. This property teaches us that the 'confusion'
between gravity trnd acceleration is not accidental: it is a fundamental property of gravity. Note that tliis property only applies to gravity and to no other force of nature (in thc case of electrornagnetism, for instance. the acceleration in a given electric field rn'ould depend on thc ratio between charge anci rnass, which varies from particle to particle). The equivalence principle errsures that one can alu'ays find locally a reference frame in which the force of gravity is eliminated.
1.1.3.2 Tidal forces and spacetime curvature
It
seerns as though gravit"l' has disappearecl. In fact, as we ernphasized, such an op* eration is possible only locally. As an example, let us consider a body A orbiting around the trarth at a distance rr. Its Newtonian acceleration is ao : G*NI6rof r). A neiglibouring body B, orbitirrg according lo rt  rnl6r experiences a relative acceleration with respect to the body A
(d, _ 6ao, q41e r')\ri/ rt, _dtr,)
20
General relativity
at first order in Or . This relative acceleration cannot be cancelled out. If r Aor = body B approaches body A and if r A . Or = ±rAor, it moves away. Thus, a sm sphere in orbit would slowly deform into a type of ellipsoidal cigar , while conserv its volume. This distortion characterizes the tidal forces. We have seen that in the Newtonian and Minkowskian frameworks , the trajector a free particle corresponds to a path that minimizes t he length of a trajectory betw two points. Even though we can eliminate gravity locally, neighbouring trajector have a tendency to converge or diverge from each other. In a flat spacetime, o one body can free itself from gravity and the fictitious forces. However, in cur spacetime it is possible to consider that all the bodies are free and follow the line minimum length. Gravity and fictitious forces disappear for all bodies at the expe of postulating a new property of space. In this framework , all bodies follow the lines of shortest path and gravity has loc vanished for each one of them. However , since this is possible only locally, neighbour geodesics reveal the curvature of spacetime. Gravity has therefore not disappeared is simply hidden in the curvature of the spacetime, which explains the tidal forces a novel way. This heuristic example shows us that by introducing a curved spacetime we eliminate all the fictitious forces related to the choice of a noninertial reference fram at the expense of introducing a curvature for the space. The trajectory of all fr falling bodies, i. e. those subj ect to gravity alone, can be obtained as a geodesic, a of shortest path, in curved spacetime. Accordingly, all reference frames, inertial not , are placed on the same footing, which allows us to reconcile gravity and spe relativity. This construction is however only possible because acceleration does depend on the mass of the particles in free fall, which provides a glimpse into relationship between t he universality of free fall and the geometrization of gravity.
1.1. 3.3
Gravity as a m anifestation of geom etry
Einstein's equivalence principle is at the heart of all metric theories of gravitati which includes amongst others the theory of general relativity. It is based on th conditions:
• the weak equivalence principle (or the universality of free fall) according to wh the trajectory of a neutral test body is independent of its internal structure a its composition. This body must have a negligible binding gravitational ene and be sufficiently small such that the inhomogeneities of the gravitational fi can be ignored; • the local position invariance according to which the result of all nongravitatio experiments is independent of the point in spacetime where the experiment to place; • the local Lorentz invariance according to which the results of nongr avitatio experiments are independent of the motion of the laboratory as long as it is f falling.
One can argue (see, for example, Ref. [8]) that if the Einstein equivalence princi is valid then gravitation is the physical manifestation of a curved spacetime, i.e
Elements of differential geometry
21
metric theory. Such a theory has t he following three properties: • the geometry of spacetime is described by a metric, • free bodies follow the geodesics of that geometry, • in a local reference fr ame in free fall , the laws of physics t ake the same form as in special relativity. This implies that the contribution ofthe binding energies of the three nongravitational interactions to the mass is the same for a gravitational mass and an inertial mass. It could be asked whether this property can be generalized to the gravitational binding energy itself. This leads t o the formulation of the strong equivalence principle. If the gravitational binding energy contribut es equally to the inertial and gravitational mass then it is possible t o eliminate t he external gravitational field. This can be achieved in a locally inert ial reference frame in which all the laws of Nature , gravitation included, have the same form as in the absence of an ext ernal field. This principle seems poorly defined and coarse: it implies a gravitational law that has as yet not been defined. A theory of gravitation that satisfies this principle is necessarily nonlinear since the gravitational field 'weights' and generat es a secondary gravitational field , etc. General relativity and the Nordstr0m scalar theory are examples of theories that satisfy this principle (see Ref. [8]) . Since these principles are central t o the construction of general relativity, it is indeed importa nt to test them (see Section l.6).
1.2
Elements of differential geometry
The heuristic argument of t he preceding paragraph shows that we can describe the motion of a freefalling body (that is, a body on which no other force but gravity is acting) by a geodesic in a fourdimensional curved sp acetime. Spacetime is therefore described by a fourdimensional continuum so that one would need four values t o locate each event , just as in special relativity. In the Newtonian framework and in special relativity, this is assumed to be globally valid so that an ensemble of events can be mapped bijectively with JR.4. In general relativity we do not want t o make such a strong hypothesis about the structure of spacetime before having determined it. This situation is analogous to the description of a sphere in two dimensions: we need two numbers to characterize the position of all points. Locally, the sphere looks like a plane JR.2 but globally it does not . The spacetime of general relativity will t herefore be modelled as a fourdimensional manifold, that is, a space with four dimensions that locally looks like JR.4 but not necessarily globally (like in the case of a sphere in two dimensions). 1.2.1
Manifolds and tensors
Here, we define the notions of manifolds and tensors in a formal way. The two following sections can eventually be skipped and the reader can directly go t o the rules of tensorial calculus derived in the t hird section. 1.2.1.1 Manifolds To define a manifold , let us recall t hat an open set of JR.n is a set that can be defined as the finite union of open balls of JR. n . Such a ball of radius r centred at y = (Yl , . . . , Yn )
22
General relativity
is defined as the set of points x such that Ix  yl < T . A manifold is a space consisting of neighbourhoods that are locally like IR n an that can be continuously glued together. More precisely, a manifold M is a collectio {Od of sets that satisfy the following three properties:
1. for all points p of M, there exists at least one set Oi containing p , which mean
that {Oi} covers M, 2. for all i, there exists a homeomorphism 'lfJi from Oi into a subset Ui of IR n , tha is a continuous map that associates all points p of Oi to ntuplet of real number that we shall call coordinates of p in the map 'lfJi, 3. if two sets Oi and OJ have a nonempty intersection, then the map 'lfJj 0 'lfJi that maps the points of 'lfJi( Oi n OJ) C Ui into 'lfJj (Oi n OJ) C Uj is an infinitel differentiable function on IRn.
Examples of manifolds are the Euclidean plane 1R2, which needs only one chart (iden tity), or a sphere that requires two charts.
M
Ij/j
Fig. 1.3 Illustration of the relationship between different charts of the same manifold. Lo cally, one can relate M and IR n but globally this is only possible if we use a collection o charts (an atlas) that cover all the manifold.
1.2.1.2
Vectors
The Euclidean space in Newtonian physics is a vector space. However, this structur is lost in curved spaces, such as that of a sphere, but can be recovered in the limit o
Elements of differential geometry
23
infinitesimal displacements. In lR. n there is a bijective map between vectors and directional derivatives and each vector y = (yl, ... , yn) defines a derivative yJ.1.(%xJ.1.), in any arbitrary coordinate system (xJ.1.). In the manifold M , consider the set F of Coo functions of M in R We can then define the tangent vector y at a point p in M as a map of F into lR., which is linear and satisfies the Leibniz rule 1. y (a ] + bg) = ay(J) + by(g ) for all ], 9 in F and all a, bin lR. , 2. y(Jg ) = ](p)y(g) + g(p)y(J).
The set of tangent vectors at point p forms a vector space Vp [since (Yl + Y2)(J) = Yl(J) + Y2(J) and (ay)(J) = ay(J)] of the same dimension as t he manifold. We can construct a basis {X J.1.} of Vp by considering the map from F to lR. defined by
We usually call XJ.1. simply %xJ.1. or 0J.1. ' such t hat all vectors y can be expressed, using the Einstein summation convention, in the form (l.20) where yJ.1. are the components of y. The tangent vector to a curve] with coordinates xJ.1. (T) is then given by the application
so that t = tIJ.0IJ. with tIJ. = dxIJ.(T)/dT . In a coordinate transformation of the form (l.21) we have, for all ] E F , 01/ ] = (OI/XIIJ.)(O~]). We thus deduce that the elements of the basis are transformed according to (l.22) so that the components of all vectors y transform according to (l.23) which is the standard formula for the coordinate transformation that we have used previously.
1.2.1.3 Tensors The notion of tensor is a generalization of that of a vector , when one considers quantities that have mult ilinear dependencies under an infinitesimal transformation.
24
General relativity
Fig. 1.4 The tangent sp ace of a twodimensional sphere is a plane whose basis is given b ee = 8e and e'P = 8'P '
Given a vector space Vp, one can construct a dual space Vp* composed of the linea maps from Vp to R This new vector space has a basis ylJ.* that can be defined b specifying their action on the elements of a basis y IJ. of Vp
In particular, dx" represents the basis of Vp* associated with the basis OJ.' of Vp. dx should therefore satisfy dx" (01J.) = 6~
and it associates any vector y to its component y" defined by dx" (y) = y". In coordinate transformation , dxlJ. thus transforms as the components of a vector
(l.24
A tensor T of rank (r , q) is a multilinear map from (Vpy x (Vp*)q to lR such that
(l.25
where ® is the t ensor product. Such an object is said to be r times contravarian and q times covariant. In particular, a tensor of rank (0,0) is a scalar whose valu is independent of the system of coordinates, a tensor of rank (1 , 0) is a vector an a t ensor of rank (0,1) is a Iform. Using the transformation laws for vectors and 1 forms, we can deduce that in a coordinate transformation, the components of a tenso transform as
between two neighbouring points. The metric A should therefore be a symmetric and nondegenerate tensor of rank (0, 2). The lineelement between two events is then given
by 1ds2
: grrd.rpdr"
(t.27)
,
where g* is the metric tensor; it is syrnmetric and has hence 10 components. From the previous results, we can deduce that it transforms as
0ro \tP , gp": Artt"
(1.28)
6*n9a0'
This expression is analogous to the one obtained in the Euclidean or Minkowski cases. In particular) we can deduce that the determinant of the metric g : detg1,,, transforms
o'
[a"t
(#)]' '
(1.2e)
1.2.1.4 Tensor calculus Tensors are thus a generaiization of the notion of vectors. Tensors of the sarne rank can be added
nfl l;  sii i: + r1,"\,
,
and one can multiply tensors of different ranks to obtain a tensor of higher rank
Rll
y:
: sli !:r!:i,"1""
The indices can be contracted to form a tensor whose rank is lowered by
Ri:
i: sili] i{
2
s!,,,1,, ,1n,6i,,.
This operation reduces to sum over the repeated index. This allows us to define a scalar by contracting all the indices. In particular, TpV,  g*,VpT" represents the scalar product of two vectors. We conclude tha Tlr'..,\' is a tensor if and only if its contraction with any tensor of the forrn RLtr...}, is an invariant. The metric can be rrsed to raise or lower the indices Rltt...llrlt r,2...rc :
In Particular, Tt"  gtt'7,.
Rl,t
pr
rrrr...rngr,p
26
General relativity
A tensor is symmetric if it satisfies TI'v = TVI' and is antisymmetric if TI'V =  TVI' From any tensor, we can extract a symmetric and an antisymmetric tensor by
(1.30) This can be performed on one or several pairs of indices, or on all the indices. Fo instance , symmetrizing two indices of a vector of rank 4 gives 1
T(I'laf3l v) =
"2 (Tw ,f3v + T vaf31')
,
and a completely symmetric tensor is obtained by
In the particular case of a rank 2 tensor , it can always be decomposed as the sum o a traceless symmetric tensor , an antisymmetric t ensor and a trace
in four dimensions . We note that the completely antisymmetric tensor of rank n is given by
cI'1l'2 .. I'n
(1.31 with Cl..n = A and c 1 following property caf3Jyc
aAVI' _ 
.. n
,A ,v ,I' uf3 u J uy 
= I j A. ,v ,I' ,A uf3 uJuy 
For n
,I' ,A ,v uf3 uJuy
=
4, the anti symmetric tensor has the
+ uf3 ,A ,I' ,v + ,I' ,v ,A + ,v ,A ,I' uJuy uf3 uJuy uf3 uJu y .
(1.32
Contracting this identity, we get
1.2.2
Geodesic equations and Christoffel symbols
Using the notions introduced in the previous section, we can go back to the trajectory of a freefalling body moving in a space time with metric 91'v· Such a traj ectory is a line of shortest length, such that its equation, Xl'( A) , can be obtained by maximizing the action
s=
J
ds.
However, as already mentioned in the Minkowski case, ds can be negative or null for some traj ectories . It is therefore cleverer to maximize the action
s=
J
LdA ,
(1.34)
with xl' = dxl'jdA . This is equivalent to maximizing j'(ds 2j dA2)dA. We can show that extremizing these two actions comes to the same result if ds i 0, but using the
Elements of differential geometry
second one allows us to get the correct result also for null geodesics 2 (ds Section 1.4.2). The Euler Lagrange equations take the form
5L
d 5x'"  dA
27
= 0, see
(5L) 5i;'" .
Since
we have that
g/,,,,x,,/, _ 2"1(o",g/,vx' /"V X  O{3g/,,,,x.{3./, X  o/,g(3",X.(3./,) x where we have symmetrized the term o{3 gw,i;{3i;/'. Introducing the Christoffel symbols (1.35) and
'" r /'v
= 9 A"'r A/,V '
(1.36)
and contracting our equation with g"''', we get the geodesic equation
X",'
+ r/'VAX.v X. A 
0
(1.37)
identical to (1.18) obtained in special relativity. As we have stressed, spacetime curvature does not intervene in the equation of the trajectory of a single particle, but it does in the deviation of close trajectories. From the definition of the Christoffel symbols (1.35), we have, since the metric is symmetric, (1.38) as well as (1.39 ) Note that in four dimensions there are 4 x 10 = 40 Christoffel symbols, which is precisely the number of partial derivatives of the metric o",g/,v , We can hence inverse the relation (1.35) to obtain, after some algebra, (1.40) Using the geodesic equation, we can convince ourselves that the Christoffel symbols are not t ensors since they transform as
o x '''' ox Cf oxP ox{3 ox'/' ox,v "P
r''''       r(3 /' v 
02X'''' ox" ox P ox" ox P ox'll ox'v
(1.41)
under a coordinate transformation. 2Lightlike geodesics are also called isotropic geodesics; we will avoid this terminology, which can be confusing . We will call them either null geodesics or lightlike geodesics with no distinction.
28
General relativity
To finish , we consider the determinant of the metric
where 7r~e;:~ = + 1 if (af3ryb) is an even permutation of (p,l/PeJ) , 1 if it is an odd permutation and 0 otherwise. We thus obtain the derivative of the determinant
(1.42)
Another way to view this result is to write that dg = m/'l.Idg/,I.I' where m/'I.I are the minors of the matrix of the g/'I.I. The inverse of a matrix is g/'I.I = (g /,1.1)  1 = m/'I.I / g so we can deduce that oag = m/'l.Ioag/,1.I = gg/'l.Ioag/,I.I' We can then deduce the usefu property
(1.43) We can also point out another interesting property obtained using the fact tha = 4, and hence that oa (g/'I.I g/,I.I) = 0, that is
g/'I.I g/,I.I
(1.44) 1.2.3
Covariant derivative, parallel transport and Lie derivative
The laws of physics often take the form of partial differential equations. It is thus important to understand how a tensor field can be differentiated in a way that preserve its tensor properties.
1.2.3.1
Covariant derivative
The partial derivative of an arbitrary tensor field
is no longer a tensor, apart from the exceptional case of the derivative of a scalar tha is a Iform. To really understand this , let us consider the derivative of a vector T/and how it transforms under a coordinate transformation
(1.45)
The reason for this is that this operation makes the difference between the value o the t ensor at x a and x a + dx a and that at these points, the tensor has differen transformation laws. Consider a constant vector of the Euclidean plane, with coordinates TX = 1, TY = O. In polar coordinates, its coordinates depend on the position (TT = sin () , T e = cos ()) simply because the directions of the coordinate lines change from one point to the other. It follows that oeTT and oeTe do not vanish even if T is a constant. What we should define is a derivative such that the components of a constant vector do not change when it is transported from x a to x a + dxa.
1 Elements of differential geometry
29
The covariant derivative is the differentiation operator that parallel transports the tensor from x'" to x'" + dx'" b efore subtracting its value at x"'. Since this operation is linear, one can always define such a covariant derivative by
(l.46)
where the quantities C~'" should be determined. For that , we notice that (l.45) implies that these quantities should transform as (l.41) during a change of coordinate. Working locally in Minkowski space and noting that then \7 ",TJ.L = o",TIL, we conclude that
(l.47)
In the rest of this book, we will denote the covariant derivative either with the symbol 'V IL or by ; p, in subscript, with no distinction: \7 ILT" = T;~ . Similarly, we will use the notation 0ILT" = T,~ for the ordinary partial derivative. The relation (l.46) can be generalized to an arbitrary tensor
(l.48)
A consequence of this result is that \7 ", 9IL" = O",9IL"  r~IL90""  r~,,90"/L" Using the expressions (l.36) for the Christoffel symbols and (l.40) for the derivative ofthe metric, we get \7 ",9IL"
= O.
(l.49)
To each metric, we can therefore associate a covariant derivative. Note that we can define various covariant derivatives but only one is compatible with the metric in the sense that it satisfies the property (l.49). 1.2.3.2 Parallel transport
The covariant derivative allows us to define the notion of parallel transport. A vector is parallel transported if DTIL == dx" \7 "TIL = O. We can use this condition to define the notion of parallel transport of a vector along a curve of equation x lL (T) by DTIL j DT = 0, i.e. by
(l.50)
where u lL = dxIL jdT. The example of the sphere presented in Fig. l.5 shows that the result of this operation depends on which path is followed . A special case of transport along a curve is the autoparallel transport , i.e. a curve such that the tangent vector is transported parallel to itself; it should therefore satisfy
(l.51 )
This is simply the equation of the geodesic (l.37) so that we conclude that the geodesics (curves of shortest path) are therefore autoparallel.
30
General relativity Path 1
Path 2
Fig. 1.5 (left) : On a 2dimensional sphere, a vector can be transported parallel to itself a long a path 1 corresponding to a great circle until its antipodal point. We can reach the same point following the path 2 along the equator. The two vectors that we get do not coincide. (right): The Lie derivative evaluat es the change determined by an observer that goes from a point P to a point Q along the flow curves of the vector field ~!' transporting its local frame with it.
1.2.3.3
Differential operators
The covariant derivative allows us to define some mathematical operators such as the divergence and the curl. Using (l.35),we get that V f.LTf.L = of.LTf.L + r~pTP = of.LTf.L + TPo p In Fg, which can be used to express the divergence of any vector field by
V f.LTf.L
=
~0f.L (F9Tf.L)
y  g
.
(l.52)
Similarly, we can check that the divergence of any antisymmetric tensor is of the form
(l.53)
where we used the fact that the contraction of a symmetric tensor with an antisymmetric one vanishes. The d 'Alembertian operator is defined as 0 = V f.L Vf.L and can be applied to any kind of t ensor. In a Minkowski spacetime, it reduces t o V f.L Vf.L =  0; + ~ , where ~ = r5 ij OiOj is the usual threedimensional Laplacian. Note that the relation (l.52) can be used to express the d ' Alembertian of any scalar by Vf.LVf.L¢
=
~ 0f.L (F9gf.L
y  g
V
ov ¢) ,
(l.54)
where we used the fact that V f.L¢ = 0f.L¢. Finally, t he analogue of the curl of a Iform Af.L is given by (l.55) Av;f.L  Af.L;v = A v,,,  Af.L ,v, since
r~v
is symmetric in /LV.
Elements of differential geometry
31
1.2.3.4 Lie derivative Another way to construct a derivative is to consider the Lie derivative. For that we assume that there exists a vector field ~J.L and try to define how t he coordinates of a vector TJ.L will change for an observer that moves along the flow curves of the vector ~I' from a point P with coordinat es x'" to a point Q with coordinat es x'" + E~'" (where E is a small parameter) and that transports its system of coordinat es with itself (see Fig. l. 5). Using the coordinates defined in P at the point Q is equivalent t o performing a coordinate transformation in Q defined as x 'J.L = xl' ~ E~J.L , such that
and OX
V
rv
~ u x'J.L = u"~
to leading order in
E.
>:}V
+EU
~J.L '
The coordinat es of the vector TJ.L at Q will be given by
The Lie derivative is obtained by comparing this quantity with the vector TJ.L at P
£f, TJ.L
=
lim
~[T'J.L (Q) ~ TJ.L(P )],
€> O E
so that (l.56) This relation can be generalized t o any t ensor by using the definition (l.56) and the relevant transformation laws p
£ €1',J.L1"· J.L P _ <::,C"' o 0' 1',J.L1·"J.LP ""' 1',1'1 ""''' ·J.LpO0' <::'CJ.Li Vl"'V V l "'V ~ VI ' 'l/q q
q
i= l
q
+ ""'TJ.L1 "J.L p ~
VI ' .{3 "'V q
0V j <'"c(3 .
(l.57)
j= l
The Lie derivative plays an important role in the study of the symmetries of a Riemannian space. If £f,T,~ = 0 then T,~J.L ( X' ) = T;;(x') and the t ensor field is invariant under that change of coordinat es. In particular , for the metric t ensor , using (l.57) for a symmetric rank2 tensor , the relations (1.40) and (l.38) , as well as the definition of the covariant derivative for a vector, we get (l. 58) 1.2.4
Curvature
1.2.4. 1 Curvat ure tensor As seen from F ig. l.5 , the parallel transport of a vector along a closed curve does not bring t he vector back to its initial stat e if the space is curved , contrary to what
32
Genera l relativity
dx~
Fig. 1.6 The vector T'" is parallel transported along the path PIP3 passing by either P2 or P4 to give , respectively, the vectors Tin and Ti43' These two vectors only coincide if the Riemann tensor vanishes.
happens in a Euclidean space. Just like for the geodesic deflection, this characterizes the curvature of space. Let us consider the parallel transport of a vector TI" along the closed path shown in Fig. 1.6. From PI to P2 , the relation (1.50) implies that the variation of the vector TI" is given by OI2TI" = dx"' \7c;TI" = dTI" + r~aTadx"'. By iterating it, we get that the variation between PI and P 3 is given by 0123TI" = dx"'dx,6 \7,6 \7 c; TI" = d 2TI" +2r~[/dx"'dT[/ +(8,6 r~a+ r~[/r~a)dx"'dx,6 Ta. The same variation passing through P 4 is given by OI43TI" = dxc; dx,6 \7 '" \7 ,6TI" . On comparing these two expressions, only the terms proportional to dx"'dx,6 are different and we get that (1.59) where the Riemann tensor is given by (1.60) The definition (1.60) allows one to conclude that the Riemann t ensor is antisymmetric in the permutation of the first or the second pair of indices (1.61) and that it is symmetric in the permutation of the two pairs (1.62) We can also show that it satisfies the identity (1.63) The two first symmetry properties imply that the Riemann tensor has as many components as a 6 x 6 symmetric matrix, ie. 21 components. The last relation (1.63) is
Elements of differential geometry
33
independent of the first ones and thus adds one extra constraint t o the number of components . The Riemann t ensor therefore has 20 independent components. Finally, the derivatives of the Riemann tensor enjoy a cyclic symmetry property that is (1.64) 3RaJ.L [va; J3l = R a J.Lva ;f3 + R a J.Laf3;v + RaJ.Lf3 v;a = 0, called the Bianchi identity. 1.2.4. 2 Ricci and Einstein tensors The symmetry properties of the Riemann t ensor imply that we can construct only one tensor by contraction of two indices. This is the Ricci tensor , which is symmetric and defined by (1.65) The Ricci t ensor has 10 independent components. Its trace defines a scalar , the scalar curvature, also known as the Ricci scalar (1.66) Contracting the Bianchi identities (1.64) , respectively, by g af3 and g af3 gJ.La implies that We can combine these two equations t o construct a conserved tensor for the covariant derivative GJ.LV = RJ.Lv 
1
2RgJ.Lv.
(1.67)
This tensor G J.LV is called the Einst ein t ensor. 1.2. 4.3
Killing vector fields
A vector that satisfies the Killing equation
is called a Killing vector; it keeps the metric invariant and therefore corresponds to a spacetime symmetry. From the rela tion (1.59), we conclude that (1.58) implies that \1 a \1 f3~v = RJ.L af3 v~J.L" T his shows that from the value of ~v and \7 J.L~V at a given point one can determine the Killing vector field uniquely. One should then specify N values for ~v and (N  1)/ 2 values for \7 J.L ~'/' keeping in mind it is ant isymmetric , so that there are at most N(N + 1)/2 linearly independent Killing vector fields. For N = 4, this is 10 Killing vectors, which is precisely the dimension of t he Poincare group of Minkowski space. Indeed , t his maximum number of Killing vector fields is not always reached in an arbitrary spacet ime since the Killing equations are not necessarily integrable for any choice of initial conditions; most spacetimes have no symmetries and no Killing vector fields!
34
General relativity
A spacetime enjoying the maximum number of Killing vector fields is called a maximally symmetric spacetime. It can be shown from the Killing equations and (1.59) that the Riemann tensor must then satisfy (see , e.g., Ref. [2] for a demonstration)
(N  l)RI' ",j3v so that
N
R~ =
= R"'v6~
R N(N _ 1)
 R",j36~ ,
(R"'v6~  R",j36~) ,
the Ricci scalar, R, being a constant. Maximally symmetric spaces are thus spaces of constant curvature. For N = 4, there are three maximally symmetric spacetimes: Minkowski , de Sitter and antide Sitter. See also the discussion in Section 3.1.3 o Chapter 3. 1.2.5
Covariant approach
We consider a family of timelike geodesics representing the worldlines of typical observers of the Universe (the fundamental observers). In the case of a fluid , we can think of this flow of geodesics as the set of worldline fluid elements. Each observer has a tangent vector to its worldline, ul' = dxl' /dT . This vector is timelike and satisfies 3
A more detailed description of this covariant approach can be found in Ref. [9].
1.2.5.1
3+ 1 decomposition
With the vector ul' one can define two projection tensors
(1.68) along and perpendicular to the vector ul', respectively. We have Uj)U~
= Uf;,
ut = 1,
Uj)u v
,~=
,!:u
and ,!:,~
=
,~,
3,
v
= ul' , = o.
These two proj ectors allow one to define a natural notion of space and time for an observer 0 since the lineelement takes the form
(1.69)
Any vector vI' can be decomposed into 'time' and 'space' parts, relative to the vector ul' as vI' = vVUj) + vi, vi = vV,!: . In particular, a particle of mass m has momentum
3We now work in a unit system such that c = 1. This equation would otherwise take the form u!'u!' = _ c 2
Elements of differential geometry
35
The observer 0 will measure the energy of this particle to be (1. 70)
so that E2  P.l!,pi = m 2 . Note that if this particle does not follow a geodesic, then its equation of motion is (1.71)
where F!' represents any nongravitational force applied on the particle. Here , the last equality stems from the fact that v!'v!, =  1.
1.2.5.2 Spatial and time derivatives We can now define a derivative along a worldline , tangent to T cx (3 ... !,V...
u!'
by
cx (3 .. . = uAV AT!,v .. .
It corresponds to a derivative with respect to the proper time. We can also define a derivative in the hypersurface orthogonal to u!' by
flAT::!!
=
1';' (1'~, I'g, ... ) b~' I'~' ...) V
A,
T::,'t,'.
This derivative is a threedimensional covariant derivative associated to the spatial metric (we can indeed check that Vcxl'!'v = 0). However, note that for an arbitrary function p , V[!' Vv i P = U [!,;vJ P (recall u!' is geodesic). Thus, this term vanishes only if U[I';vl = O. The derivative V is well defined only under this condition. We may note that in particular VI'T!' = V!,TI' + uI'TI'.
1.2.5.3 Distortion of the geodesic flow We consider a vector flow u!' describing, for instance, a flow of matter. The covariant derivative of the vector field u!' can be decomposed as (1. 72)
The trace e = V!'u!' is called expansion and describes the isotropic distortion, a!' v is a traceless symmetric tensor called shear (a!,vu!' = 0, at = 0) and describes the distortion of the flow of matter, w!'v is an antisymmetric tensor called vorticity that describes the rotation of the matter flow (w!,vu!' = 0). The vorticity vector is defined by WU _ 
1
2u cxc
and satisfies by construction are, respectively, defined as
cxu!'v
w!'u!,
_ (3 U w!'v {:==} W!'V  U c(3u!'vw ,
=
Uv is the acceleration and vanishes if
w!'vw!'
u!'
=
O. The shear and vorticity amplitude
follows a geodesic.
36
General relativity
It will be useful to introduce a characteristic scale S (T) defined by
~S = ~8 3 '
which determines S a long each worldline up to an overall factor. The volume of eac element of the fluid varies as S3. 1.2.5.4
Raychaudhuri equation
e
We now calculate = uV\lv(\l J.LuJ.L) using the decomposition (1.72) as well as (1.59 to commute the covariant derivatives. This leads to t he Raychaudhuri equation
e  _~3 82 + il,!Li.t J.L + 2(w 2 
0"2) + VJ.L il.J.L  R J.LV uJ.LUV .
(1.73
This equation is at the heart of t he study of gravitational collapse. Two other equation describing the proj ection of the vorticity and of the shear can be obtained in a simila way (see Ref. [9]). This equation has an important consequence. Let us suppose that the vorticit vanishes (w = 0) and t hat uJ.L = 0 for all times. It then follows that if RJ.LvuJ.LuV > for all times, we have that + 8 2 /3 ::::; 0 so t hat
e
lIT
> 
8(T)  8 0
+ . 3
• If 8 0 < 0, i. e. if it is a converging flow , then at a given time T ::::;  3/8 0, 1/ 8 must g through zero and 8 should therefore diverge. The geodesic flow is hence pathologica This signals the presence of a caustic or of a singularity. • If 8 0 > 0 at a given time, we can express the Raychaudhuri equation as a functio of S to obtain that S / S ::::; O. There must therefore exist a previous time TO < 3/8 such that S  t 0 when T  t TO. This translates into the exist ence of a singularity i the past. This is a sp ecial case of the theorems on spacetime singularities that is discusse in detail in Ref. [10]. We give in the following table the definition of several energ conditions that will appear throughout this book.
Name null
Definition G J.LV kJ.LF >  0 \fk J.L, kJ.LkJ.L = 0
Meaning Light is focused by matter
causal
G J.LV uVGJ.LPu P <  0 \fuJ.L , uJ.LuJ.L =  1
Energy does not flow faster than the speed of light
weak
GJ.LVuJ.LUV 2: 0 \fuJ.L, uJ.LuJ.L =  1
The energy density is positive for any observer
dominant
causal and strong
strong
RJ.LvuJ.LuV 2: 0
S
Elements of differential geometry
1.2.5.5
37
Case of a vanishing vorticity
If w = 0 for all times , then the hypersurfaces that are locally perpendicular to the flow ulJ. can be globally extended. 'YIJ.V is then the spatial metric (i.e. threedimensional Riemannian metric) induced by glJ.v on 2:; and 'V' is the covariant derivative associated with 'YIJ.V, 'V' a'YlJ.v = 0 (see Fig. 1.7). The 'bending' of 2:; in the fourdimensional spacetime M can be characterized by the change of direction of the normal ulJ. when moved on 2:;. This is associated to the extrinsic curvature tensor
(1.74) Using (1.72) , it reduces to KlJ.v
A Riemann 4 tensor, equality 'V' a 'V' {3Vv we get that
=
1 38'YlJ.v
+ (JIJ.V·
'V' from (1.59). Starting from the = 'Y~''Yg''Y~''Va'h~''Y~,'VpVa) for a Iform Va on 2:; (i.e. vaua = 0) , (3) R a {3lJ. v
can be associated to
where we have used the fact that 'Y~'Y~ 'V a'Yf = K avulJ. . When taking the antisymmetric part (over ex and (3), the second term vanishes so that we finally obtain the relation (1.75) where we have used 'Y~ua'V pVa =  K3va to express the two last terms. This is the Gauss relation. Contracted on the two indices {3 and v it gives
and taking the trace, we get the scalar Gauss relation (1. 76) It relates the intrinsic curvature of 2:; to curvature. To finish, the relation (1.59) for Codazzi relation 'YpIJ. n a 'Yaa' 'Y{3{3'RP aa' {3' _
the Ricci scalar of M and the extrinsic the vector ulJ. projected on 2:; leads to the nv {3 KIJ. a  nv a KIJ. {3
,
which gives, after contraction, the contracted Codazzi relation (1. 77) 41n three dimensions , the Riemann tensor is fully determined from the Ricci tensor and the scalar curvature as
38
General relativity
Using t he expression of t he extrinsic curvature, the scalar Gauss relation (l.76 implies that the curvature of I; is given by
(l. 78
which is valid for all spacetimes such that w = O. Such a flow is called irrotational.
1.2.5.6 Decomposition of rank 2 tensors The existence of the vector field uIJ allows one to decompose any symmetric of rank tensor in a unique way, in the form
(l. 79 with
In particular, we have
1.3 1.3.1
Equations of motion Einstein's equations
Similarly to electromagnetism, we would like to formulate secondorder equations fo
9IJ v ' For that, we need a Lagrangian that depends on the metric and its first derivative
only. The Lagrangian proposed by Einstein and Hilbert reduces to the scalar curvatur Other choices are possible, such as any function of the scalar curvature or term involving other invariants such as the Gauss Bonnet t erm, or even more complicate theories of 'scalartensor' type; all these cases are discussed and studied in Chapter 10 The 'simplest' choice is in perfect agreement with all observations and defines th theory of general relativity. Any other choice will be considered as an extension of th standard case. Thus, we focus on the action
(l.80
where K, is a coefficient to be determined by requiring that the theory reduces t Newtonian gravity in the weakfield limit; this is the only free parameter of the theor .c mat is t he Lagrangian for the matter fields and A is a constant called the cosmologic constant. We will now present the variation of these two terms.
Equations of motion
39
Fig. 1.7 A vector flow up. defines an orthogonal surface at each point of the flow. We can then define a global notion of space and time for this flow if and only if its vorticity vanishes .
1.3.1.1
Einstein Hilbert action
We start by varying the Einstein Hilbert action
where we have just used the definition of the scalar curvature. Expanding this expression, we get
To evaluate the first term, we should vary the metric determinant. Using the result (l.42), we get (1.81)
which leads to the conclusion
The second term is more complicated to evaluate. For that , we work in a locally inertial coordinate system , i. e. where gp. v = fl /'v and o,gp.v = 0 from which we deduce that
40
General relativity
r~V = 0 (but note that the derivatives of the Christoffel symbols do not vanish). W hence obtain
To go from the partial derivative to the covariant derivative, we used the fact tha t he result is a t ensor , and hence it should be the same in any frame. We can henc conclude t hat
1v9l'VFgb RI'vd4x =
Iv Fg [(9I'Vbr~v);0"  (9I'Vbr~v);I'] d4x,
on any volume V. The integrand is therefore of the form Fg 1~ , which can be r expressed, using the property (l.52), as (Fg 11') ,/," This is therefore the integral of total derivative, which can be reduced to an integral on the boundary of the volum V so that it does not contribute to the equations of motion. We therefore obtain
b J RFgd 4x = J FgGl'v bgl'V d4x ,
(l.82
where the Einstein tensor GI'V is defined in (l.67). The variation of the term in A trivial and gives
J 2AbFgd 4x =  J AFggl'vbgl'V d4x. 1.3.1.2 Stressenergy tensor The variation of the matter action with respect to the metric is given by 4 bJL mat yC::d g X

4 J b(LmatFg) Fgbgl'V b9I'V yC::d g x,
which we may write as
b J Lmat Fg d 4 x =
~ J
FgTl'vbgl'V d4x,
(l. 83
where the energymomentum tensor TI'v, defined by
(l.84 is a rank 2 symmetric tensor.
1.3.1.3
Einstein 's eq uations
Using the results of the two previous sections , we obtain t he general form of E instein equations
(l.85
Note t hat there is another way to vary the Lagrangian , known as the Palatin method , in which the metric and the Christoffel symbols are considered as independen
Equations of motion
41
Extremizing the action with respect to the two fields then gives Einstein's equations and the condition that V fLg(){(3 = 0, which is equivalent to the definition (1.35). 1.3.2
Conservation equations
The Bianchi identities (1.64) impose and the fact that gfLv;a = 0, that
Cr: = 0, which implies, using Einstein 's equations (1.86)
This equation is satisfied by the total energymomentum tensor (i.e. of all matter components). To go further , one should specify the form of the matter Lagrangian or of its energymomentum tensor.
1.3.2.1
Covariant form
The most general form for the energymomentum tensor is the one given in (1.79) valid for any symmetric energymomentum rank 2 tensor ,
in which p = TfLvufLu V is t he energy density measured by an observer at rest with the fluid , P = TfLI/,fL V/3 is the pressure, qfL =  T(){(3u(){,(3fL is t he energy flux relative to ufL and 7r fLV is the anisotropic pressure tensor (7rfLV ufL = 7rt = 0) . The conservation equation V fLTfLV = 0 can be projected along ufL and ,t:, respectively, to obtain t he two conservation equations
Using t he decomposition (1.72) , we obtain  U v V fLTfLV = ufL V fLP + VfLqfL + (p + P)G +2qfLufL + (Jt:7r~ and , ;VfLTfL1/ = + V Ap +V(){7r aA +~GqA +((J~ +w~) qa +( p+ A P)u +7r(){AU(){. We get that
,;rjl/
(1.87)
These equations are valid for any energymomentum t ensor and any spacetime geometry.
1.3.2.2 Perfect fluid The perfect fluid is an especia lly interesting case for which 7rfLV = 0 and qfL = o. The two previous equations (1.87) and (1.88) reduce, respectively, to the matter continuity equation and to t he Euler eq uation
p + (p + P)G = 0,
(1.89)
42
General relativity
(p + P)UI' + '\7 I'P
=
O.
(1.90)
The term ul' = u (X V (Xul' represents an acceleration that vanishes for a pressureless fluid (P = 0). Particles of such a pressureless fluid will hence follow geodesics. For an irrotational flow , using (1. 85) to express the Einstein tensor, (1. 78) is of the form 2 (3) R = 2KP  382 + 20"2 + 2A., (1.91) that we will call the generalized Friedmann equation. Some conditions are often imposed on the pressure and the energy. The energy conditions introduced previously can be rewritten as • • • • •
null: p ~  P, causal: Ipi ~ IFI, null and causal: Ipi ~ IFI and p < 0 if and only if p =  P , weak: p + P ~ 0 and p ~ 0, strong: p + 3P ~ 0 and p + P ~ O.
1.3.2.3
Electromagnetism
Electromagnetism is described by the anti symmetric tensor FI'II ' called the Faraday tensor and satisfying (1.92) FI'I1 can be defined as the curl of the vector potentia l AI' as
(1.93) The energymomentum t ensor can be derived from the action (1.94) which gives (1.95) The conservation equations then reduce to Maxwell 's equations in the vacuum (1.96) This equation can be rewritten in t erms of t he potential vector AI' as (1.97) where we chose the Lorentz gauge , defined by VI'AI' = 0 and where the Ricci t ensor arises from the commutation of the covariant derivatives. A source term may be added by introducing a coupling term AI'jl' in t he Lagrangian to obtain (1.98)
Equations of motion
43
An observer with tangent vector uJL can define an electric and a magnetic field, respectively, by (1.99) These vectors satisfy EJLuJL tensor as
= BJLuJL = 0 and we can express the electromagnetic field (1.100)
Using this expression, we can decompose the energymomentum tensor as the one for a fluid TJLV
wit h E2
1
1
= 2(E 2 + B2)uJL u v + "6(E 2 + B2}yJLv + 2U(JLEv)paAupEaBA + 7rJLV,
= EJL EJL' B2 = BJLBJL and (recall 7rJLV is traceless)
We can then deduce that the energy density measured by a co moving observer is Pern = ~(E2 + B2) and that the equation of state for radiation is Pern = 1Pem. The trajectory of any particle of charge e and mass m in an electromagnetic field is of the form (1.101 ) Finally, we can reexpress Maxwell 's equations, V JLFVJL Projecting along uJL and along ry~ we get, respectively,
VJLEJL .yAj;;JL _Ea.Apa u a. VpB a IJL
= j V, in terms of EJL and BJL.
 2wJLBJL = c,
(1.102)
= _ .yAJ.v aAEv+ Ea.APau a. (u p B a +p WE) Iv _ ~eEA+ 3 v a , (1.103)
with c =  jJLu!," The same procedure applied to the equation F [JLv;a.] respectively, VJLBJL + 2wJLBJL = 0, .yABJL +Ea.Apa u a. V pE a IJL
= _~eBA+aABv _ Ea.APau a. (u p E a  p w B) 3 v a .
=
0 gives , (1. 104)
(1.105)
Written in this form , these four equations give a relativistic generalization of the four Maxwell's equations , valid for any spacetime geometry.
1.3.2.4
Scalar field
The action of a scalar field ¢ evolving in a potential V (¢) is given by (1.106) Varying this action with respect to the metric , we get the energymomentum tensor as defined in (1.84)
44
General relativity
(l.107 for which the conservation equation is simply the Klein Gordon equation
(l.108
We notice that the energymomentum tensor of a scalar field can be decompose similarly to that of a perfect fluid with velocity field
(and q!'
= 0 , 7r!'v = 0) and with energy density and pressure p¢
1
2
= 21/J + V(¢) ,
We can also note that 8
P¢
1 2 = 1/J 2
v.
=  (eJ; + V')N, w!'V = 0, a!, =  (o!'1/J + eJ;u!')N an
o"!'v = ,;,~C'V(>, \l8)¢)N
+ l!'v\l",[1/Jl\l"'¢]/3.
In particular, it follows that th
conservation equations reduce to
p+8(p+P) = 0
{==}
¢+8¢+V'=0,
for any Universe geometry.
1.3.3
ADM Hamiltonian formulation
To finish, we shall rewrite the gravitational action in the 1 + 3 form provided by th Hamiltonian analysis by Arnowitt, Deser, and Misner [4] . We consider spacetimes that can be foliated by a continuous family of three dimensional hypersurfaces , I: t . This means that there exists a smooth function £ o M whose gradient never vanishes so that each hypersurface is a surface of constant I: t
== {p
E
M,
£(p) = t},
\j t E
K Such spacetimes are called globally hyperbolic and they represent most space t imes of astrophysical and cosmological interest. We assume that each hypersurface o the foliation is spacelike and that it covers M. If we split the lineelement as
(l.109 then the normal vector to I: t has components n'" = (1,  Ni)/N , i.e. n", Calling t'" the vector defined by t"'\l ",t = 1, then
= N(  1,0,0,0
and t'" = Nn'" + N"'. This gives a geometrical interpretation to Nand N i : N is th lapse function , N i the shift vector and lij the spatial metric. It is thus clear tha
Equations of motion
45
the components of the metric are gOO =  N 2 + 'Yij N i N j , gOi = N i and gij = 'Yij and the components of the inverse metric are goo =  N  2 , gOi = N i / N 2 and gij = '"Y ij  NiNj /N 2. As t increases, we go from a hypersurface to another and we can view the fourdimensional spacetime as the time evolution of a threedimensional Riemannian space . One can then deduce a simple relation between the determinants g of gJ.!. v and 'Y of '"Yij. Using Cramer 's rule to compute the inverse metric, one gets that gOO = 'Y / g so that v=g = N v::Y. The Einst ein Hilbert action (l.80) can be rewritten as
This action has to be considered as a functional of the variables q their time derivatives. The extrinsic curvature is K ij =
2~
('Yik D j N
k
+ 'Yjk D i Nk
=
( 'Yij,
N , N i ) and
 i'ij ) ,
so that the gravity sector derives from the Lagrangian density (l.110) Since this Lagrangia n does not depend on H and Hi, we conclude that t he lapse function and the shift vector are not dynamical variables and will be associat ed to two constraint equat ions. The only dynamical variable is therefore 'Yij and its conjugate moment um is (l.I11) It follows that the Hamiltonian density is given by 7i =
7rij i'ij
£ and can be expressed
as 7i =
v::Y [N
(3) R 
K ij K ij
+ K2) + 2N i (DiK
 DjKf)]
+ 2v::YD j (K Nj  KI N i ). The last t erm, being a divergence, does not contribute t o the integral, so t hat the Hamiltonian reduces to (l.112) with (l.113) ij i It is a functional of q = hij , N , N ) and p = 5£ / 5q = (7r , 0, 0). The scalar curvat ure (3) R is a function of 'Yij and its spatial derivatives and t he extrinsic curvature is a function of 'Yij and 7r ij obtained from the inversion of (l.111)
46
General relativity
1 (1
= Vi 2 'Ykl'Tr kl'Yij  'Yik'Yjl1f kl) .
K ij
The Hamilton equations lead to two constraints , the 'energy constraint ' Co = 0 and the 'momentum constraint ' Ci = 0, and an evolution equation , which in the vacuum takes the form "'tij = 5H/ 51f ij =  2NKij + D iN j + D j N i . It is b eyond the scope o this book to derive the Hamilton equation with matter. When matter is included, the constraint equations take the form (3) R
+K2
D j K l  D iK
K ijKij = 161fGp,
= 81fG N Pi ,
(1.114 (1.115
with p == TJ.l. l/ n J.l. n l/ and P (3 =  (5~ + n (3 nJ.l.)TJ.l.C< n C< ' These relations can be obtained directly from the Einst ein equations and the Gauss relation (1.76) .
1.4 1.4.1
Geodesics in a curved spacetime Conserved quantities along a geodesic
To each symmetry one can associate a Killing vector , ~J.l. , under which the spacetime metric remains invariant . Such a vector satisfies the equation deduced from (1.58) \7 J.l.~1/ + \7 I/~J.l. = O. The exist ence of such vectors makes it possible to compute conserved quantities useful for the study of the solutions of Einst ein's equations. Consider a geodesic with tangent vector uJ.l.. One can check that the quantity uJ.l.~" is constant along the tra jectory. Indeed,
The first t erm vanishes from the geodesics equation and the second term is the con traction of a symmetric t ensor with an antisymmetric tensor. Similarly, if T J.l. 1/ is an energymomentum t ensor then TJ.l. I/ ~1/ is a conserved vector
The first t erm vanishes from t he conservation equation and the second is the contraction of a symmetric t ensor with an antisymmetric tensor. This relation has an important implication. The vector j J.l. = TJ.l.I/~I/ A satisfies by construction
Integrating over a volume V , we can hence conclude that
The quantity
QO , defined by
Geodesics in a curved spacetime
47
follows t he conservation equation
It follows that
1.4.2
aoQ o =
0 if the boundary term vanishes.
Geodesic deviation equation
We consider a family xI.L (A, s) of neighbouring geodesics, where A is the affine paramet er along the geodesics and s is a label. The vector axI.L nI.L =  
as
measures the separation between two neighbouring geodesics. It can be chosen in t he plane orthogonal to the geodesic flow, i. e. such that uI.LnI.L = 0 (otherwise one can replace it by rtn V). One can check that
The relative acceleration between two neighbouring geodesics is defined by (1.116) Using t he previous property we obtain (1.117) To evaluate the first term, note t hat nI.L'V I.L (uv 'V vu"') = 0 since u I.L satisfies the geodesics equation. This means that (nI.L'V I.LU V) 'V vu'" + nI.Luv'V I.L 'V vuO: = 0 that can be rewritten , switching nI.L and uI.L in the first term and switching the dummy indices J.L and /.I in the second one, as (uI.L 'V I.LnV) 'V vu'" + nvuI.L'Vv'VI.Lu'" = O. Using t his result to evaluat e the second term of (1.117), we obtain, after commuting the two partial derivatives (1.118) This equation confirms the heuristic argument on deviation of nearby geodesics. Even if we work gravity locally vanishes) the Riemann tensor does of the Christoffel symbols a priori do not cancel) or defo cusing of neighbouring geodesics. 1.4. 2.1
the relation between curvature and in a freefalling referential (where not vanish (because t he derivatives and one cannot avoid the fo cusing
Covariant approach
We decompose the vector nI.L as nI.L = f.eI.L with eI.Le I.L = 1 (thus, ePeI.L = 0) and eI.LuI.L = O. We can t hen express ilP = uV'V vnI.L either as ilP = f.e. v + Ce v or as ilP = nV'V vuI.L. By comparing these two expressions and proj ecting on eI.L, we get
48
General relativity
(1.119 that we call the generalized Hubble law. As for the orthogonal projection, it gives
(1.120
which describes the change in the direction of observation of a distant galaxy, fo instance. 1.4.2.2
Optics and redshift
We now consider an electromagnetic wave with a potential vector of the form
= CJ.Lei'P.
AJ.L
The Lorentz gauge condition imposes CJ.L0J.L'P = O.
If we assume that the amplitude CJ.L varies slowly compared to the phase 'P and tha the wavelength is small compared to the spacetime curvature length scale, we ca neglect the curvature term in (1.97) so that Maxwell's equations impose
\lwy J.L 'P = 0,
0J.L'PoJ.L'P = O.
For any function 'P, the vector kJ.L = oJ.L'P is orthogonal to the wave surface on whic 'P = const.; kJ.L is the wavevector. If we differentiate the second equation and use th fact that \l VOJ.L'P = \l J.LOV'P, we get that light is a transverse wave propagating alon null geodesics
= oJ.L'P, kJ.LkJ.L = 0, kJ.L\l J.Lk v = O. kJ.L
(1.121
In a Minkowski spacetime, kJ.L = ( E, k) , where E is the photon energy and kit momentum. An observer that follows a trajectory with tangent vector uJ.L will measur an energy (or equivalently, a frequency) given by E
= l'iw =  kJ.LuJ.L"
A photon emitted with a frequency Wem will be received with a frequency Fig. 1.8). These two frequencies will be related by
W rec
(se
(1.122
This relation will be very important in cosmology in order to define the notion o redshift, z. The wavevector kJ.L can always be decomposed as kJ.L = E(uJ.L + eJ.L) dxJ.L /dp , with uJ.L eJ.L = 0 and eJ.LeJ.L = 1 and where p is the affine parameter along t h photon traj ectory. It follows that dE/dp = kJ.L\l J.L(kvu V) and hence
. J.L J.L v) 1 d.\ _ 1 dE _ ( 1 e >':dp  Edp "3 +uJ.Le +CYJ.LV e e E.
(1.123
This expression points out the anisotropic contributions to the redshift coming from the shear. This expression is valid for any spacetime.
Weakfield regime
Reception
Emission
Fig. 1.8 A photon emitted by a comoving observer with 4velocity null geodesic and is received by a comoving observer with 4velocity
1.5 1.5.1
49
U::
m ,
propagates along a
u~ec.
Weakfield regime Newtonian limit
An undetermined parameter appears in the Einstein equations, the constant K. To fix it, we consider the weakfield limit of the Einstein equations in order to underst and how Newtonian gravity is recovered. We expand t he spacet ime metric around a Minkowski spacetime, in Minkowski coordinates, as
(1.124) where hJ.L1/ is a small perturbation. 1. 5. 1.1
Linearized Einstein equations
From t he previous decomposition, we can conclude that to first order in hJ.Ll/ ,
and that h J.L1/ = fJ o. J.LfJ(3J.L h o.(3 and we set h = h~. To linear order in hJ.L l/, the Christoffel symbols are given by 1 0.'\ [2 h '\(J.L,I/) r 0.J.LI/ = "ifJ
Accordingly, the Ricci tensor is of the form RJ.LI/ RJ.LI/
=
~
]
 h J.LI/,'\ .
=
80. r~1/

8J.Lr~I/'
[28,\8(J.Lhl/),\  DhJ.L1/  8J.Ll/h] ,
so that
(1.125)
where 0 = fJ o.(380.8(3 is the d' Alembertian in Minkowski space. Contracting t he previous expression with fJJ.Ll/ , we get the scalar curvature
50
General relativity
R = h::"  Dh. The Einstein tensor is given by GJ.LV read
=
RJ.Lv  ~R7]J.Lv so that t he Einstein equation
(l.126 with
In particular , we can check that h
1.5.1.2
=
 h and that hJ.Lv
=
hJ.Lv  ~h7]J.Lv.
Gauge freedom
In general relativity, there exist a gauge freedom related to the arbitrary choice o coordinate systems: two different metrics can describe the same spacetime. In the linearized limit, two perturbations can represent the same state if they are related by a change of coordinates. After a change of coordinates of t he form xJ.L > xJ.L + ~J.L, the metric perturbation transforms as
where the Lie derivative is given by £f,7]J.LV = 2o(J.L~v). Using these transformations, the linearized Einstein equation (l.126), takes the remarkably simple form
if one chooses ~J.L such that D~J.L = ovhJ.Lv. This means that after the change of variable the metric perturbation satisfies the gauge condition
(l.128)
This gauge is known as the de Donder gauge or harmonic gauge. It is analogous to the Lorentz gauge in electromagnetism.
1.5.1.3
Solutions of Einstein's equations
In the harmonic gauge , the solution of Einstein's equations can be found using the
retarded potential method, just like in electromagnetism ,
, ( )_ J ~
1J.Lv x , t  271"
TJ.Lv(x ' , t Ix  x'l/c)d3
IX

x, I
I
x .
The general solution is therefore of t he form h1 v ) = hJ.Lv + h1~), where h1~) is a solution of the homogeneous equations D h1~) = 0 with oVh1~) = O. g en
Weakfield regime
1.5.1.4
51
Geodesic equation
For a classical fluid, and to lowest order in the velocity, only the component Too gives a contribution (since P « pc 2 ). TOi and Tij should be taken into consideration only at the first postNewtonian (lPN) order , i.e. at order v2 /c 2 . The form of the general solution of the linearized Einstein equations (l.127) implies that the only nonvanishing terms are hoo and hOi. They are given by 2 
KC
hoo = 2;"
JIx _
p
3
I
xii d x ,
 Oih
KC 
27r
JIx 
PUi
xii
d 3 x.I
It follows that all diagonal components of h,.LV are equal to hoo / 2. We now consider the trajectory of a particle with velocity small compared to the speed of light (v / C « 1). As we have seen before, to lowest order in h the tangent vector to the worldline is given by
dx"
u'"
_Idx"
= dT ="
ill'
with Vi == dx i / dt and" = dT / dt = \/1  v 2 / c2, so t hat we have u" = ,, 1(C, vi). Using the relation d 2 x" / dt 2 = dbu")/dt =u"d" /dt +"du" / dt , the spatial component of the geodesic equation becomes d 2 x'
dt 2
dx'" dx,6
=
dx'" dx,6 dx i
r~i3 ill ill + r~,6 ill ill cit·

To linear order in h and v, this equation is of the form (l.129)
1.5.1.5
Determination of K
Let us work in a static gravitational field. The perturbation h"v only depends on the position Xi. Equation (l. 129) t hen simplifies to
The first term represents the coordinate acceleration. In the second term, we recognize the gradient of a scalar that we identify with the gravitational potential 2¢ = c2 h oo , so that we recover the Newtonian form 2
d X dt 2
i
= oijo.'!' ,,+,,
We can therefore go back to the Einstein equations and compare them in this limit with t he Poisson equation that determines the gravitational potential in Newtonian mechanics. This allows one to identify the value of K such that the weakfield limit of relativity is consistent with Newtonian gravity. From the previous equation, we deduce that hoo = 4¢/c 2 . The energymomentum tensor has for main component Too = pc2
52
General relativity
and the limit of the E instein equation is therefore b.¢ = Kpc 4 /2. Comparing thi equation wit h the Poisson equation b.¢ = 47rG N P, it follows that
(1.130
where G N is the Newton constant.
1.5. 1.6
Analogy with electromagnetism
In the previous expressions, the linearized equations of general relativity are ver similar to the equations of electromagnetism. One can take the analogy a step furthe by defining Ai = chOi . The geodesic equation can then be rewritten as
x= 
V¢
A + v !\ (V !\ A) + 3¢v.
¢ = 0, the previous expression reduces to a form similar to t he one in electromagnetism. The Einstein equations are hence similar t the Maxwell equations if we introduce
In a static spacet ime, for which
1
9 =  V ¢  Ot A , c
B = V !\ A ,
V·B = O, V !\ 9
(1.131
1
=   Ot B ,
c The following list summarizes some aspects of the analogy, which cannot unfortu nately be pushed much further.
Variables Gauge transfo rmations Gauge condition Field equations R estrictions on the other gauge transformations
electromagnetism
linearized relativity
All A ll + All + 0IlX 0IlAIl = 0 D AIl =  j ll
h ll v h llv + h llv  £ ~ T)llv + T)IlVOo<~O< , _ 0ll h llV = 0 D h llv =  2KTllv D ~1l
DX = 0
=0
1.5. 1.7 SVT decomposition To go further in the study of perturbations, let us decompose h llv as
hoo =  2A ,
hOi = Oi E + Hi,
hij = 2C6ij + 20 ij E + 2o(iEj ) + Eij
with
OiHi = 0,
OiEi = 0,
OiEij = 0,
EJ
= O.
The 10 components of the tensor h llv are decomposed into 4 scalar perturbation (A , E, C , E) , 2 vectors perturbations (E i , Hi) that have 6  2 = 4 independent com ponents and 1 tensor perturbation Eij that has 6  4 = 2 independent components
Weak field regime
53
The important property of this decomposition, which will be used in the cosmological context in Chapter 5, is that these three types of perturbations decouple (see Ref. [11]) . As previously, we should take into account the effect of an arbitrary change of coordinates. In order to do so , we decompose the displacement vector field ~p. as ~o
~i
=  T,
=  ciL  V.
Under such coordinate transformation , the metric transforms as 9p.v [see (1.58)]. It follows that 5
A
of
A
+ T,
B
of
B
+t 
T,
C
of
C,
E
of
E
of
9p.v  £E9p.v
+L
for the scalar modes, for the vector modes , and the tensor modes remain unchanged
The tensor modes are therefore gauge invariant since they do not depend on the choice of the coordinate syst em. This is not the case for the vector and scalar modes . However , we can define a combination of these modes that are gauge invariant. For the scalar modes, we define \Ii =  C,
and for the vector modes
i
=
Ei  Hi.
We therefore have defined 2 scalar quantities and 1 vector quantity (2 degrees of freedom) which are gauge invariant. All together , we therefore have 6 = 10  4 degrees of freedom, once the 4 arbitrary degrees of freedom related to the gauge choice have been absorbed. 1.5.2
Gravitational waves in an empty space time
1.5.2.1 Propagation eq uations For the scalar modes, the Einstein equations G p.v = 0, imply that Ll\li = 0 (00 component ), Oi \[t = 0 (Oi component) and
=
\Ii
= 0,
which means that no scalar mode can propagate. For the vector modes, the component Oi implies that Ll~i = 0, the only regular solution of which is i = O. Just as for scalar modes, no vector modes can propagate. 5To first order in ~ , h l'v . h l'v  £1;111'v and we just need to evaluate £<'100 = 2T , £(I)Oi Gi(T  L) 
£i a nd £~"I)ij =
 2[;,LJij  2G(i L j).
54
General relativity
The situation is different for tensor modes. The ZJ component of the Einstein equations implies that DEij = O. The only perturbations that can propagate in a Minkowski spacetime are the gravitational waves and they satisfy (1.132)
The three conditions = W =
Plane waves
Let us consider a plane wave propagating along the axis Oz in the TT gauge. Since the wave is transverse and traceless, it satisfies hE'T) = h~~T) = h~~T) = O. Therefore, it has only two degrees of freedom (two polarizations) that should satisfy h~~T) =  h~~T) and h~~T) = h g T ). We hence can decompose it as follows
h~JT) = E +(et  z)ED
+ E xCet  Z)Elj,
with t he two polarization tensors being defined by
ED
=
(~000 ~1 ~) ,
Elj =
(~000 ~ ~)
The solution of the propagation equation is E+ /x = A+/x cos[k( z  ct) +cp], where cp is a phase. 1.5.2.3
Motion of a test particle
Consider the geodesic equation of a test particle in the gravitational field of a gravitational wave in the TT gauge. Since to leading order in perturbation we have roo = 0, a particle initially at rest will remain at rest . Of course, this does not mean that nothing happens , but rather that the frame of reference is co moving with the test particle. To see if anything happens , we should look at the relative motion of two neighbouring particles, which can be done using the geodesic deviation equation (1.ll8). The relative acceleration is given by al1 = Rl1vpauvnPua . To leading order in vic, we have ul1 = (1,0) and nl1 = (0, ni) , R iojo =
a; h~JT) 12 so that
2
i
d n = ~02h(TT)inj. dt2 2 t J
Substituting the expression for h~JT) from the previous section, we can easily integrate this expression. Figure 1.9 illustrates the deformation of a ring of particles under
Tests of genera l relativity
55
the influence of a gravitational plane wave propagating along the axis z for each polarization case. t=T14
t =0
+
X
0 0
, /
,,
/

t=T12
,,
, \
I
I \
\
,,
I
,,
,
I /
,,
, \
I
0 0
t=JTI4
, /
,,
/
,
,,
,,
I
I
Fig. 1.9 Effects of a gravitational plane wave propagating along the axis z on a ring of particles located in t he plane xy, depending on the wave polar ization . The four snapshots ar e parameterized by the period T corresponding to a gravitational wave with wavelength k  1 .
1.6
Tests of general relativity
The cosmological models we will construct will strongly rely on the application of general relativity to the Universe as a whole. It is hence important to understand the limits of validity of our theory of gravitation. We summarize here the principles of the main tests of gravitation; all constraints are summarized in the t able at the end of this section. This question is presented in detail in Refs. [8, 12, 13]. 1.6.1 1.6.1.1
Test of the equivalence principle Uni versality of free fall
The mass of any body comes from different contributions (rest energy, electromagnetic or nuclear binding energy, etc.). If one of these energies contributed differently to the inertial mass than to the gravitational mass, we would observe a violation of the universality of free fall. To quantify this possible difference, we write (1.133) where for each interaction i, Ei is the associated binding energy, and ryi is a small dimensionless paramet er that quantifies the amplitude of the violation of the universality of free fall associated to this interaction. The acceleration of any test body in an ext ernal gravitational field is given by
56
General relativity
so that measuring the difference between the acceleration of two different bodies allow us to define the Eotvos parameter T) by
(l.134 to leading order in the parameter T):
T)i.
Two methods have mainly been used to constrain the value o
1. The first method uses the period T of a pendulum of length L ,
T = 2n
~I {f mG
, 9
A
/
g
Fig. 1.10 Principle of Eotvos balance. 9 is the external gravitational field and w is th angular velocity of the measurement device. The unit vectors e J and eb are, respectively parallel to the torsion fibre and to AB.
Tests of general relativity
57
so that
IT1  T21 . T1 +T2 2. The second method uses a torsion balance (see Fig. 1.10). Two objects of different compositions, A and B, are attached at the end of a horizontal beam of length 2L suspended on a torsion fibre. The balance has an angular velocity w, coming, for instance, from the Earth rotation; the force on each mass is therefore rJ=2
where c is the centrifugal acceleration. We can conclude from this expression that the wire tension is T = FA + F B and that the torque applied on the torsion fibre is M = Leb 1\ (F B  F A). There is therefore a torque M directed along the same direction as the tension. Performing the scalar product M . T , we get that L
M = rJ1 1(ebl\c).g. g+c This device was invented by the baron Roland von Eotvos (see Ref. [14]). In the original experiment , 9 corresponded to the Earth's gravity and the rotation was that of the Earth. The pendulum and torsion balance can test the weak equivalence principle because the gravitational binding energy of the test objects are negligible compared to the other energies. In order to test the strong equivalence principle, we should consider objects with an important gravitational binding energy such as celestial objects. This can be done by comparing the relative motion of the Earth and the Moon within the gravitational field of the Sun. There is therefore an extra acceleration with expression
oa _
GN M 0 2 rJ R
( E (grav) 2
me
I
_ E(grav) I
Earth
me
2
)
eT
,
Moon
where e T is a unit vector directed from the Sun to the Earth and R the radius of t he Earth orbit. This implies that the orbit of the Moon around the Earth is polarized in the direction of the Sun, with maximal amplitude
Or m ax
30a
=     ,
2 (WEarthWMoon)
Neglecting the gravitational energy of the Moon compared to that of the Earth, which is considered to be a homogeneous ball of radius R and mass m, it follows that E(grav) / me 2 =  3Gm/ 5Re 2. A more precise computation gives E(grav ) / me 2 =  4.6 x 10 10 for the Earth and  0.2 x 10 10 for the Moon , from which we conclude that These measurements have been made possible, mainly owing to the Lunar Laser Ranging (LLR) experiment t hat measured the position of the Moon relative to the Earth with a precision of 1 cm over about thirty years.
58
General relativity
1.6.1.2 Local Lorentz in variance
There exist few clean tests of the local Lorentz invariance. In particular, many of the highenergy tests cannot usually distinguish the specific effects of Lorentz invariance breaking from the complications due to weak and strong interactions. The best constraint is the one obtained from the Hughes Drever experiment [15]. The experiment observes the ground state of the lithium7, with quantum number J = 3/2. In the presence of a magnetic field , this state splits into four equally spaced levels. Any perturbation associated with the existence of a preferred direction , would modify the spacing between the levels. This sets a limit on the potential anisotropy of the inertial mass om~j of the lithium nucleus that can be parameterized as
(1.135)
where OA is a dimensionless parameter that characterizes the amplitude of the anisotropy induced from the interaction A with binding energy EA. In the experiment with ij 2 t he lithium7, we get c 1.7 x 10 16 eV.
lom
1.6.1.3
1:s
Local position in variance
The first t est of local position invariance is related to the Einst ein effect, i. e. to the fact that clocks run more slowly in the presence of a gravitational field. Consider an atomic transition at point A emitting a photon towards an observer located in B. According to the previous sections , the photon is received with a redshift (l +z) = (U /J. k/J.) B / (U/J.k/J. )A, implying that /:).¢ z = ~,
where ¢ is the gravitational potential. If there is a space dependence, this relation should not be valid any longer. We would therefore expect /:).¢
z =(1 + a)2 ' c
(1.136)
where a is a parameter that vanishes in general relativity. Several experiments have been performed to measure the redshift, either by sending atomic clocks in orbit or by observing solar spectra. Another method consists of testing the variation of the fundamental constants. Numerous tests have been performed mainly on the fine structure constant , on the electron to proton mass ratio, and on the gravitational constant. These experiments go from local scales to the early Universe. We will come back to these t ests in lat er chapters (see also the review Ref. [16]) .
1.6.1. 4
Fifth force
To finish , we point out tests on the potential presence of a fifth force. In these experiments , the gravita tional potential is supposed to be modified by a Yukawa potential of range .A
Tests of general relativity
59
(l.137) where (0'12,
0'12
characterizes the amplitude of the deviation. The constraints on the pair
A) are summarized in Fig. l.1l. In the laboratory, experiments using Cavendish
balances can test Newton's law up to a hundred micrometers [17]. The constraints go up to scales of the order of the Solar System size by studying the motion of celestial objects.
Lamoreaux
10'
Excluded region
10 4
104 10 6
10  5
104
103
102
1.(m)
10  2
104
10 6
a
Earth ± LAGEOS
10  8
10  2
LAGEOS ±
10  10 I
1.(m)
Fig. 1.11 Constraints on the existence of a fifth forc e deriving from a Yukawa potential. The parameter 0: represents the parameter 0:12 of (1.137), which we assume to be the same for all bodies . The bottom graph represents a zoom on the smallest t est ed scales. The diagram presents the predicted deviations expected from some models, such as t he presence of extra dimensions, axions or dilaton, etc .. . These different models will be presented in later chapters of this book (quote from Ref. [18]).
60
General relativity
Tests in the Solar System
1.6.2
We will now concentrate on the t ests of the geometrical structure of spacetime a of Einstein's equations. These tests rely mainly on the study of geodesics in the So System. We will focus on t he case of Schwarzschild spacetime that describes the met generated by a massive body. We may, however, point out that many modern tes (such as LLR, LAGEOS or GPB experiments) , can only be described with aNbo metric, i.e. beyond the Schwarzschild metric, which is beyond the scope of this bo (see Refs. [1 4] for further details).
1.6.2.1
Schwarzschild spacetime
The spacetime in the Solar System can be described by a static solution with spheric symmetry of the form
(1.13
with d0 2 = de 2 + sin 2 edcp2 . The solution of the Einstein equations with a mass sour M localized at the origin is the Schwarzschild solution whose explicit form is 2 ( ds = 
2G M) 1 ~ N
2 dt +
2
(1 _2G M) dr
N
2 2 + r dO .
(1.13
c2 r Most of the tests in the Solar System rely on the study of geodesics of this metric.
1.6.2.2 Extension
To be more general and to be able to test both the geometrical structure of spa and the field equations, we will work in the weakfield approximation (G N M / c 2 r « which is justified since GNM0 /C2 = 1.4 X 10 3 m for the Sun). We assume that t metric is of the form ds 2 =

[
1
G M 2_N_ c2r
+
2
(fJ PPN
2

G_ M2 I'PPN ) _N _ ] dt 2 c4 r2
+ (1 +
G M) 2 I'PPN_Nc2r
dr2 + r 2d0
(1.14 From the field equations of mot ion of general relativity and using (1.139) , we get th PPN I'PPN = fJ = 1, but there is a priori no reason for this to be the case within anoth metric theory of gravity. 1.6.2. 3
Light bending
The study of null geodesics can be performed directly in the general form (1.138) . B symmet ry, we can choose to concentrate on the solutions that lie in t he plane = 7r / The trajectory then reduces to {t(A) , r(A), cp( A)}. Since spacetime is static and h a spherical symmetry, there must be two constants of motion. The one associat with timetranslation invariance is identified with the energy E = Ut = gttdt/d
e
Tests of general relativity
61
while the second one, associated with the angular momentum rotation invariance, is L = u¢ = g
2d
For a null geodesic, we have also the extra condition
(1.141 )
.
kl'kl' =
0, which we can write as
Using the constants (1.141), we can deduce that the equation of the traj ectory is
= (E)2 e~l/(r)~I'(r) ( ~)2 d
_
r2
e~l/(r) r2
(1.142)
The functions v(r) and Ji(r) can be identified with the corresponding terms in (1.140) . After introducing y = G N M/c 2 r and b = L /E, we therefore have
d2 y _ + y d
2
G M2 = 3,PPN y 2 + (l_,PPN)_N_ . b2 c4
To second order in M / b, the solution of this equation is given by the relation y = (G NM/bc) sin
roc
which reduces to D.
~ iro
Actual source position
Observer
Fig. 1.12 Geometry of the trajectory of a photon d eflect ed by a massive body.
1.6.2.4
Shapiro Effect
The Shapiro effect , discovered in 1964, is an abnormal time delay when electromagnetic waves are emitted from Earth and reflected back by a satellite or a planet. Using the null geodesic equations derived in the previous section, we obtain
(r ~:
=
el'(r) ~l/(r)
[1 
el'(r)
~~] ,
62
General relativity
the coordinate time a photon takes to go from r to ro is t herefore
t(r, ro) =
f
ro
r
Integrating this expression from rEarth to of change of the travel proper time dl:l.T
  ::: (1
dT
rplan et,
back and forth, we get that the r
+ ,PPN) IlS . h  1
(1.1
assuming ro ::: 7 x 108 m. 1.6.2.5
Precession of the perihelion of Mercury
The planet trajectories are obtained using the timelike geodesics, such that
e~(r) E2 = 1 + ev(r) (~ ) (:~ ) + (~ ) . 2
2
Introducing y = l / r and neglecting terms of order (G N M / rc 2)2 , we can integrate t equation to obtain the approximate solution
y(cp) =
G~M
(1
+ e cos {[ 1 _
(2
+ 2,PPN _
jJPPN)
G~~2] cp} ) ,
where e is the eccentricity. At each orbit, the perihelion advances by
l:I.cp =
27r(2
+ 2,
G 2 M2 = 27rG NM2 (2 £2 a(l  e )
PPN _ jJPPN)_N_
+ 2,PPN
_ jJPPN).
(1.1
For Mercury, we get 42.98" per century if , PPN = jJPPN = 1, which is consistent w observations. 1.6.2.6 Summary
The tests performed in the laboratory and in the Solar System give a bound to violation of the equivalence principle and to the deviations from the theory of gene relativity. We have only presented here some of these parameters, but the param terized postNewtonian (PPN) formalism brings into play other parameters that beyond the purely Newtonian theory. This formalism has been developed in order take into account all kinds of possible deviations. The following table and Fig. 1 give a summary of the main constraints obtained in the Solar System (see Ref. [ for more details).
1.6.3
Gravitational radiation
Just as any charged object emits electromagnetic waves, a massive body can radi gravitational waves. In what follows, we review the properties of this radiation , with giving any proof. The existence and the properties of gravitational waves can be u to t est the theory of general relativity.
Tests of general relativity
63
Table 1.1 Solarsystem constraints on the postNewtonian parameters.
Parameters rJ (weak)
Constraints 10  :3 5 x 10 9 (1.9 ±2 .5) x 10 12 5.5 x 10(1 ± 15) x x l0 10
rJ (strong)
a
 ,j
 <:
2 x 10 4 1.7 x 10 · 1(i eV
15m') . I
10 23 10 22 5 x 10 18
6'
1.6.3.1
13
Details Pendulum Torsion balance Torsion balance (BeCu) Torsion balance (EarthMoon) LLR Photon emitted (Moss bauer effect) HMaser in orbit Lithium Strong interaction Electromagnetic interaction Weak interaction
Reference Newton (1686) [1 4] [19] [20] [21] [22] [23] [15]
Radiation of gravitational waves
The gravitational radiation emitted by an object is given by (TT)
hi)'
(x, t)
=
2
T) ,
2G N d Q kl ( 4Pijkld 2 t  C T t c
where Pijk1(n) = (6 i k  n ink )(6jl  nknl)  ~(6ij  ninj)(6kl  nknt) and where Qkl is the quadrupole of the mass distribution Qkl(t)
=
J
p(x,t) ( XkXl 
~X26kl) d 3x .
The radiation is therefore sourced by the quadrupole of the matter distribution, and there is no monopole contribution (due to the mass conservation) or dipole (from the equivalence principle). The total power of the radiation is given by the Einstein quadrupole formula dE G N d 3Qkl d 3Q kl dt 5c 5 dt3 dt3 ' 1.6.3.2
Binary pulsar PSR 1913+16
Discovered in 1974 by Taylor and Hulse, this pulsar made it possible to test the formula for the gravitational radiation and prove the existence of gravitational radiation. Using Kepler' s law, it can be shown that if the energy of the binary system decreases , its period must t hen increase as
jdP) \dt
=
_1927r (27rG N 5c2 P
)5/3 m1m2
4 2 1+73e / 24 + 37e /96 m1+m2 (l _e 2 )7/ 2 '
where m1 and m2 are the two masses and e is the eccentricity. For the pulsar PSR 1913+16, m1 = 1.44M8 and m1 = 1.38M8 , e = 0.617 and P = 7 h 40 min so that
64
General relativity (( Lunar
\l Laser
Ranging
Y PPN
Precession of the perihelion of Mercury
1.004
Mars Radar Ranging
1.002             
1
0.998
0.996
0.996
0.998
1
1.002
1.004
i3 PPN
Fig. 1.13 Summary of the constraints on the postNewtonia n parameters in the Solar tem . The current constraints are 12,PPN  j3PPN  11 < 3 x 10 3 form the precession o perihelion of Mercury, 4j3PPN  , PPN  3 = (0.7 ± 1) x 10 3 from the Lunar laser ran (LLR) experiment , b PPN  11 < 4 x 10 4 from the light bending (VLBI: Very Long Bas Interferometer) and b PPN  11 = (2.1 ±2 .3) x 10 5 from the measure of the time d elay o Cassini space probe signal.
(dP / dt ) = 2 .4 x 10 12 . We may note that there is, however, a real discrepanc 14 (J" between the observed decrease and the theoretical prediction. This is because Doppler effect coming from the acceleration of the pulsar towards the centre o Galaxy should be taken into account . The theoretical works in Ref. [24] have ma possible to compute the equations of motion up to order v 5 / c5 and have led to idea how to t est general relativity in the strongfield limit [25] (see Ref. [26] for a revi They also prove that gravity propagates at the speed of light. 1.6.3.3
Direct detection
Many experiments are now in progress to detect gr avitational waves, either from i ferometry (LIGO, VIRGO, LISA, . . . ) or using Weber bars (see Ref. [27] for a revi In order to understand how difficult it is to det ect them, let us give an estima the amplitude of the gravitational waves. Consider for that a bar of length L and m M rotating with angular velocity w. The characteristic amplitude of the quadru is Q = lVI L2, so that dnQ/dt n rv lVI L 2w n We therefore have a typical amplit ude dE
dt
rv
G N M2L4 w 6 c5
Tests of general relativity
65
For a 500ton mass and 20m long bar rotating at 5 rad/s, we get h ~ 10 38 at 50 m, and dE / dt ~ 10 32 W! The radiation of ordinary objects is therefore undetectable. For a binary system of typical mass 1Mo, with an orbit of 10 6 km and an orbital period of order 10 h, we get dE/dt ~ 10 22 W. This power can even be increased and reach a typical value of dE/dt ~ c 5 /G N ~ 3.6 X 10 52 W for an ultrarelativistic source. 1.6.4
Need for tests at astrophysical scales
Most of the tests presented here have a characteristic scale of the order of the size of the Solar System or at most, of the size of our Galaxy for binary pulsars. In cosmology, we will extrapolate the law of gravitation to the Universe itself. This extrapolation requires some checks that we will present in detail in the rest of this book. Let us, nevertheless, remind ourselves of some facts worth keeping in mind (see Ref. [28] for more details): 1. At galaxy scales, the rotation curves indicate that the gravitational potential goes as ¢ ex In r on large scales. This behaviour is usually explained by the presence of dark matter, which has not been detected yet. Beyond a given scale, everything goes effectively as if gravity had a bidimensional behaviour. The study of the rotation curves, however, shows that if such a transition existed, this scale should depend on the galaxy mass (see Chapter 7 for more details on these relations). 2. For galaxy clusters of around 2 Mpc, the effects of gravitational lensing, which measures the gravitational potential from light bending, and the Xray emission, which traces out the gas temperature, can be compared. This comparison has shown that the Poisson equation is valid up to a factor 2. Hence , it provides a test of general relativity (testing both the light bending and Einstein's equations in the weakfield limit). 3. At astrophysical and cosmological scales, there are no direct tests of the law of gravity. The growth of the largescale structure of the Universe provides a test, but it mixes the properties of gravity with those of matter. For the theory to be consistent with the observations, dark matter should be present. 4. The observed recent accelerated expansion of the Universe (see Chapter 4) cannot be explained within the framework of general relativity with ordinary matter (among other conditions, matter with positive pressure; this follows from the Raychaudhuri equation). This observation is explained either through exotic matter (some candidates such as the cosmological constant, the quintessence , ... will be described in Chapter 12) or by modifying the laws of gravity at large scales. 5. Two tests are currently proposed. On the one hand , the Einstein equivalence principle can be tested by checking how constant the 'constants of Nature' are on large scales. On the other hand, the Poisson law can be tested up to scales of a hundred megaparsecs , by comparing the matter distribution in the galaxy catalogues with the weak lensing by largescale structures (see Ref. [29]).
This shows the importance of testing gravity either in order to validate the standard approach that adds new exotic matter sectors (dark matter, dark energy), or in order to give new explanations for the Universe. We will come back to all these issues in Chapter 12.
References
[1] R HAKIM , An introduction to relativistic gravitation, Cambridge Univers Press, 1999. [2] H. STEPHANI, General relativity, Cambridge University Press, 1990. [3] S. WEINBERG, Gravitation and cosmology, John Willey and Sons, 1972. [4] R WALD , General relativity, Chicago University Press, 1984. [5] L. LANDAU and E. LIFSCHITZ, Course of theoretical physics, Volume 2, P ergam Press , 1976. [6] J.D . JACKSON , Classical electrody namics, Wiley, 1998. [7] J. EISENSTAEDT, Einstein et la relativite generale, CNRS Editions, 2003. [8] C .M. WILL , Theory and experiments in gravitational physics, Cambridge Univ sity Press, 1984. [9] G.F.R ELLIS, 'Relativistic cosmology' , in Proc. Inti. School of Physics 'Enri Fermi', Corso XLVII , R.K. Sachs (ed.), pp. 104 179, Academic Press, 1971. [10] S.W. HAWKING and G .F.R ELLIS , The large scale struct ure of spacetime, Ca bridge University Press, 1973. [11] J. BARDEEN, 'Gaugeinvariant cosmologica l perturbations' , Phys. Rev. D 2 1882, 1981. [12] C.M. WILL , 'The confrontation between general relativity and experiment ', Livi Rev. Relativity 4, 4, 200l. [13] 1. CIUFOLINI and J.A. WHEELER, Gravitation and inertia, Princeton Univers Press, 1995. [14] RV. EOTvos, V. PEKAR and E. FEKETE, 'Beitrage zum Gesetze der Prop o tionalitat von Tragheit und Gravitat ', Ann. Physik 68 , 11, 1922. [15] V.W. HUGHES et al. , 'Upper limit for the anisotropy of inertial mass from nucle resonance experiments', Phys. R ev. Lett. 4, 342, 1960; RW. DREVER, 'A sear for the anisotropy of inertial mass using a free precession technique' , Phil. Ma 6, 683, 1961. [16] J.P. UZAN , 'The fundamental constants and their variation: observational a theoretical status', Rev. Mod . Phys. 75,403, 2003. [17] E. FISCHBACH and C . TALMADGE, 'Ten years of the fifth force' , in Dark matter cosmology, quantum measurem ents, experimental gravitation , Moriond procee ings, 443 (1996) , arXiv:hep  ph/9606249; E. ADELBERGER, B. HECKEL , and NELSON, 'Tests of the gravitational inverse square law', Ann. Rev. Nuc. Phy 53 , 77, 2003. [18J C.D. HOYLE et al. , 'Submillimeter tests of the gravitational inverse square law Phys. Rev. D 70 , 042004, 2004 . [19] Y. Su et al. , 'New tests of the universality of free fall ', Phys. Rev. D 50, 361 1994.
References
67
[20] S. BAESSLER et al., 'Improved test of the equivalence principle for gravitational selfenergy', Phys. Rev. Lett. 83, 3585, 1999. [2 1] J .O. DICKEY et al., 'Lunar laser ranging: a continuous legacy of the Apollo program', Science 265 , 482, 1994. [22] R.V. POUND and G.A. REBKA , 'Apparent weight of photons', Phys. Rev. Lett. 4 , 337, 1960. [23] R. VESSOT and M. LEVINE , 'A test of the equivalence principle using a spaceborne clock', J. Gen. Rel. Grav. 10, 181, 1979. [24] T. DAMOUR and N. D ERUELLE, 'General relativistic celestial mechanics of binary systems' , Ann. Inst. H. Poincare 43, 107, 1985; 44, 263 , 1986. [25] T. DAMOUR and J .H. TAYLOR, 'Strong field tests of relativistic gravity and binary pulsars ', Phys. R ev. D 45 , 1840, 1992. [26] I.H. STAIRS, 'Testing relativity with pulsar timing' , Living Rev. R elativity 5 , 1, 2003. [2 7] I. BLAIR, 'Detection of gravitational waves', Rep. Prog. Phys. 63, 1317, 2000. [28] J.P. UZAN, 'Tests of gravity on large scales and variation of the constants', Int . J. Theor. Phys. 42, 1163, 2003. [29] J.P. UZAN and F. BERNARDEAU, 'Lensing at cosmological scales: a test of higher dimensional gravity', Phys. Rev. D 64, 083004, 2000.
2
Overview of particle physics and the Standard Model
Although gravity structures spacetime on large scales, numerous physical processes rely on the description of t he three other interactions, namely electromagnetism , the weak and strong nuclear forces. This chapter is dedicated to t he description of these interactions and of the particles on which they act . This standard model of particle physics describes ordinary matter , as observed in the laboratory. We shall see lat er that the cosmological observations concerning the existence of dark matter call for an extension of this framework. It is hence necessary to present it in detail. For this, we will work in this chapter in a flat spacetime. Section 2.1 recalls the basis of analytical mechanics and the steps from classica mechanics to quantum mechanics. Section 2.2 then focuses on the special case of a scalar field. Being the simplest prototypal model, the free or selfinteracting scala field illustrates in a complete way what can be learnt from quantum field theory. The following sections sket ch a general picture of the standard model of particle physics. Section 2.3 presents the spectrum of particles detected in accelerators, and Section 2.4 describes how , relying on the notion of symmetry, a classification can be sketched out. The Higgs mechanism will t hen be described in Section 2.5. Applying this mechanism to electroweak interactions will a llow us to obtain the standard mode of particle physics , presented in Section 2.6. F inally, Section 2.7 presents a study o discrete symmetries, followed by an overview on the current state of t his standard model. The ideas introduced in this chapter will be used repeatedly in the rest of the book. For instance, t he aspects of quantization and in particular t hose of scalar fields will be central to our discussion of inflation (Chapter 8) , and also when we touch on the cosmological constant problem (Chapter 12); the notion of symmetry breaking plays a crucial role in the formation of topological defects (Chapter 11 ) and in the grand unified theories. This chapter therefore lays the necessary basis for the study of cosmologically relevant topics such as inflation of course, but also of topologica defects, the origin of dark matter , baryogenesis, the cosmological constant problem etc.
2.1
From classical to quantum
It is now a little more t han a century since the principles of quantum mechanics started being used to describe the laws of Nature. This section summarizes t he steps from analytical classical mechanics to quantum mechanics.
From classical to quantum
2.1.1
69
Analytical classical mechanics
Formally, quantum mechanics can be seen as an extension of classical mechanics as formalized finally, e.g., by Lagrange in the second half of the eighteenth century and Hamilton at the beginning of the nineteenth century. The classical laws can be transformed into their quantum counterparts by applying the correspondence principle.
2.1.1.1
Analytical classical mechanics
The essence of classical mechanics (all the details of which can be found in numerous works , in Refs. [1,2] among others) can be summarized by the existence of a Lagrangian function L from which all the dynamics of a system can be derived. For a set of N interacting particles , this Lagrangian depends on the coordinates Xi (i = 1, ' " , N) of these particles, and on their velocities, Xi . We merge together the set of all these coordinates, which are not necessarily Cartesian, under the designation qi (generalized coordinates) , with the index i now varying from 1 to 3N, and the generalized velocities qi. The dynamics of this system can be obtained by minimizing the action
S=
ltfinal L [qi(t), qi(t), t] dt ,
tinitial , tfinal
= const. ,
(2.1)
tinitial
assuming that the value of the coordinates is fixed at the boundary of the integration domain [JS = 0 with Jqi(tinitial) = Jqi(tfinal) = 0]. The Euler Lagrange equations are then obtained as (2.2) The first term of this equation brings into play the quantities pi == 8L / 8qi, which are the generalized momenta. It is possible to invert these relations to determine the generalized velocities in terms of the coordinates and momenta. Thanks to a Legendre transformation , all the dynamics can then be described by changing variables and using the generalized momenta instead of the velocities. For that, let us define the Hamilton function , or Hamiltonian H , by 3N
H [qi(t) ,pi(t) , t]
== LpiiJi  L {qi(t), qi [r (t)] ,t} ,
(2.3)
i=l
which allows us to write the secondorder differential equations (2.2) as a firstorder system, .i 8H . 8H
p =  8qi '
qi = 8pi'
(2.4)
This system allows us to easily obtain the time derivative of any physical quantity by using the Poisson brackets defined by
(2.5)
70
Overview of particle physics and the Sta ndard Model
A and B being two quantities depending on qi and pi. Substituting for B the Hami nian H in the definition (2.5) , and using the equations of motion (2.4), the total t derivative of A can easily be seen to be dA
ill
=
8A
at + Li
(8A ' i fiiP
8A.)
+ ~qi q,
P
=
8A
at + {A , H}.
(
Finally, the Poisson brackets of the coordinates and momenta satisfy
(
This formalism introduces an antisymmetric operator between two quantities, such that {A,B} =  {B , A}. This property will allow us to make the transi formally but very easily to the quantum description. 2.1.1.2
Functional derivative
Classical field theory, which can be illustrat ed, for instance, by general relativity, be written using the same formalism as the analytical theory of point particles, i rely on field functionals rather than on functions. A functional Z [J], is defined as a map from a normalized linear space offunctio .c = {f (x); x E IR}, onto the set of complex (or real in some cases) numbers , Z : .c f+ C (or IR). The functional derivative is then defined in a similar way the partial derivative for a function of many variables. A functional can indeed understood as a function with an infinite number of variables. Its infinitesimal varia is
<5Z[J] =
<5Z[J] <5f(x) <5f(x)dx,
J
(
introducing the dummy variable x , this reduces to a definition similar to the on normal analysis , that is
<5Z[J(x)] 1 <5f() == lim  {Z[f(x) y
<+ 0 E
+ E<5(X 
y) ] Z[J(x)]},
(
where <5 is the Dirac distribution in one dimension. Note that for all computations, functions are supposed to be integrable and to vanish fast enough at large x in or for the surface terms to vanish in the integration by parts. The two definitions (2. 8) and (2.9) are equivalent and can both easily be shown satisfy the rules required for a derivative, i.e. the Leibniz rule
<5 <5f(x) (Z[J]W[J])
=
and the functional composition law ITechnically, this is called a Banach space.
<5Z[J] <5f(x) W[f]
<5W[J]
+ Z[J]<5f(x)
,
(2
From classical to quantum
5Z {W[j]} = 5f(x)
J
5Z[W] 5W[f] d 5W(y) 5f(x) y.
71
(2.11)
From the definition (2.9), it can easily be shown that
5f(x) 5f(y) = 5(x  y),
(2.12)
where the first function, f(x), is seen here as the trivial functional that maps any function into itself. We also have the useful property
_5 5f(y)
J
g
[df(X)] d = _~ I [df(y)] dx x dyg dy ,
where the prime indicates a simple derivative of the function with respect to its argument. Finally, it is interesting to show that , from the functional point of view, a function and its derivative are not independent. By choosing Z[j(x)] = f'(x), and using the definition (2 .8), we find that
5f I (y) =
J
5f'(y) 5f(x) 5f(x)dx =
J
f (y)] d [55f(x) dy 5f(x)dx
=
J
d [5(y  x)] 5f(x)dx, dy
and we can hence make the identification
5f'(y) = ~ [5(y  x)]. 5f(x) dy All of these relations can easily be generalized to an arbitrary number of dimensions. Most importantly, they allow us to write a classical field theory in a manner that bears close resemblance to the Hamiltonian formulation of a classical particle.
2.1.1.3
Lagrangian of a classical field
The equivalent of the action (2.1) for a scalar field is obtained from a Lagrangian density L[¢(x!")], as Sfield
=
J
L(t) dt =
JJ dt
d3 xL [¢(x, t), o!"¢(x , t) ].
(2.13)
Let us first consider L as a function of x and t. Using an ordinary derivative followed by an integration by parts in four dimensions gives rise to the Euler Lagrange equations, equivalent to (2.2) for a scalar field ,
oL o¢(x a
)
o oL ox!" o[o!"¢(xa) ] '
(2.14)
This relation can also be obtained using a functional differentiation, i.e. viewing L as a functional of the field and its time derivative , in the following way. The time and spatial dep endencies of the Lagrangian can be separated explicit ly before the variation
72
Overview of particle physics and the Standard Model
oL =
J
3
d x
({
8L 8L } 8L . ) 8¢>(x, t)  V' 8 [V' ¢>(x, t)] o¢>(x , t) + 8¢(x, t) o¢>(x , t) ,
where an integration by parts was p erformed to get the second term. The definitio (2.8) for the functional derivative then implies t hat
oL =
J
d3 x
[
. t) ] , O¢>(o.c /¢>(X, t) + . OL O¢>(X, x,t 8¢>(x,t)
and t herefore, by identification, we obtain
o.c O¢>(X, t)
8L _ V' 8L 8¢>(x, t) 8 [V' ¢>(x, t)]
o.c
and
o¢(x, t)
8L 8¢(x, t)
Integrating these relations by parts, assuming that t he variation at the bounda vanishes, t his reduces to (2. 14).
2.1.1.4
Hamiltonian and neld Poisson brackets
Starting from the Lagrangian (2. 13), and using an approach completely similar t hat used in pointparticle mechanics, we can first define the canonically conjuga field 7r(x, t) , which is equivalent to the generalized momentum [see (2.2)], by
_
oL(t) , o¢>(x, t)
7r(x , t) = .
(2. 1
so that the equation of motion is of t he same form as for point particles, name
ir(x , t) = oLjo¢>(x, t). In order to generalize (2 .3) to the scalar field case, t he Hamiltonian can be writte in the following form
H(t) =
J
7r(x, t)¢(x , t) d 3 x  L(t)
==
J
H (x , t) d 3 x,
(2.1
which hence defines t he Hamiltonian density, related to the Lagrangian density L , b the relation
H(x , t)
=
7r(x, t) ¢(x , t)  L(x , t).
(2.17
In t his for malism, the Euler Lagrange equations (2. 14) get replaced by t he set Hamilton equations
(2 .1
This system of firstorder equations is the equivalent of (2.4) for a scalar field.
From classical to quantum
73
The Poisson bracket of two functionals is constructed in the same way as for two classical quantities via the straightforward generalization of (2.5)
_J
{W[¢(x),7f(X)] ,Z[¢(x),7f(X)]} =
3
d x
[OW oZ oW OZ] O¢(x) 07f(X)  07f(X) O¢(x) .
(2.19)
The Hamilton equations can hence provide the time evolution of any functional W by dW = oW dt
at
The conjugate momentum
7f
+
{WH}.
,
and the field ¢ obey the relations
{¢(X ,t) ,¢(X',t)} {¢(x , t) , 7f(X' , t)}
= =
{7f(x,t) ,7f(X',t)} 0(3) (x  x') ,
=
0,
(2 .20)
which generalizes (2.7). 2.1.1.5
Nrether Theorem and conservation laws
In 1918, E. Nrether enunciated the theorem according to which a conserved quantity can be associated to any transformation that leaves the action of a theory invariant. More precisely, this theorem is enunciated in the following way:
If the action of a given theory is invariant under the infinitesimal transformations of a symmetry group , then there exist as many conserved quantities (constant in time) as infinitesimal transformations associated to this symmetry group. Furthermore, these conserved quantities can be obtained from the Lagrangian density. As an example of this theorem , as well as for its own interest, let us mention the conservation of energy that results from the time translation invariance of the laws of physics (one invariance, one conserved quantity) , as well as the conservation of momentum and angular momentum, which arise, respectively, from the translation and the rotation invariances of spacetime. In these two last cases, the invariance is vectorlike (three coordinates for the translation vector, or three angles for the rotation), and the conserved quantity remains of the same nature. All these invariances, which are usual in classical mechanics, follow from the conservation of a unique quantity, the energymomentum tensor. This construction can be generalized to the case of general relativity (Chapter 1, Section 1.4). 2.1.1.6
Case of classical point particles
The classical mechanics of point particles is completely described by specifying a Lagrangian L(qi, (ii , t). If the physics that corresponds to this Lagrangian is timetranslation invariant, i. e. such that the result of any experiment involving this theory does not depend on the time at which it is made , it is then expected that
74
Overview of particle physics and the Standard Model
L(qi, qi, t  to) = L(qi, qi, t), for any arbitrary initial time to. This implies that t Lagrangian cannot explicitly depend on time , i. e. L(qi' qi, t) = L(qi, qi); it depen on time only through the variables qi. It follows that aLlot = 0, so that its to derivative with respect to time is
(2.2 The equat ions of motion (2.2) then imply that the function E, defined as
. oL
E= " q ·  L , ~ to' i qi
(2.2
is conserved in time, i. e. dE I dt = O. One can easily be convinced that E is indeed energy if we take the example of a set of N point particles of mass m in a potential for which 3N
L ex =
1
2:= 2mqi2 
(2.2
V (qj).
i= 1
The relation (2.22) then implies that
which corresponds to the classical expression for the total energy (kinetic plus p tential) of a system of point particles . Note that (2.22) is simply the Hamiltoni (2 .3). The inva riance under infinitesimal coordinates transformations and generalized locities, is obtained using the transformation qi f+ Qi = qi + Eli, and qi f+ Qi qi + Eji' where E is a small parameter (E « 1) and Ii is a vector that will be specifi later. The invariance under this transformation can be written as
Expanding the lefthand side of this relation to leading order in
E,
we get
Once again, the equations of motion (2.2) imply that
(2.2
and hence to the existence of a new conserved quantity. Equation (2.24) can be illustrated on the example (2.23) if one chooses translati in space as the coordinat e transformation (i.e. Ii = £i constants in Cartesian coor nates). For each of the three possible independent quantities £i (i.e. along the thr
From classical to quantum
75
possible axes), the quantity p = mq is conserved: this is the momentum. One can clearly see here that for a vectorlike invariance (three independent quantities form a vector) corresponds to a conservation law that is itself vectorlike. This is the meaning of Nocther's theorem. 2.1.1.7 Field theory case Let us now come back to the case of field theory, an example of which is the action (2. 13) , and let us consider a general coordinate transformation (2.25) Then the scalar field ¢ transforms as
which is a local variation. One can also define a global variation d through
which only takes into consideration the modification of the field itself at the point xc> , thus defining a total variation after the local variation is brought back to its original point. To first order in E, these two variations are related by (2.26 ) Note that the variation
d commutes with the derivative, unlike the variation 6. Indeed,
while
still to first order in E. The theory will be said to be invariant under the transformation x f+ x' (to simplify, we write x == xC> when there is no ambiguity) if the action , i. e. the integration of t he Lagrangian density over a spacetime volume V4 , remains unchanged during the operation. In other words , (2.27) where the integration is performed over the same volume of spacetime, expressed in terms of the new coordinates (hence the notation VI in the second equality). The volume element is modified by the determinant of the J acobian matrix
76
Overview of particle physics and th e Standard Model
where the second equality comes from the expansion to first order in then b ecomes
E.
Equation (2.2
using the relation (2.26) between the operators 6 and d, this leads to
(2.2 Furthermore, we can expand d£ in this expression as
which is possible since d is an exact differential (d and expression and using the field equations (2.14), we get
1 V4
d 4 x d£ =
1 V4
d 4 x 0J.L
an commute). Integrating th
(0£J.L ¢>  ) ~ d ¢>
,
such that (2.28) b ecomes
Since V4 is arbitrary, this implies that the integrand vanishes, or in other word that the term in parenthesis is a conserved current JJ.L , i.e. that satisfies oJ.LJJ.L = T his current is explicitly given , using (2.26 ), by
(2 .2
for a transformation of the form (2.25). In this expression we have reintroduced th local variation 6¢>, which is often better controlled than the total variation d¢>. The previous discussion does not depend on the fact that ¢> is a scalar , i.e. on th way it transforms under a change of coordinates, nor does it depend on the fact th we have considered only one field. More generically, for a theory with N fields ¢ where a E [I , N ], the relation (2 .29) can be generalized to
(2.3
This relation is still valid if the indices of ¢>a form , for instance, a vector (¢>a or a tensor of higher rank.
>
AJ
From classical to quantum
77
The conservation of t he current J implies the existence of a quantity constant in t ime. To see t his , it is sufficient to evaluat e the integral of 0l' JI' on an arbitrary space volume V3 , namely
before using the theorem of Stokes Ostrogradski to eliminate the second term
It t hus follows that
r
JV
oo J od 3 x
i.
=
dt
3
r
JV
JOd 3 x
= 0,
3
and t he integral of the time comp onent of the current is indeed conserved (see the generalization of Section 1. 4 from Chapter 1) . 2.1 .1.8 Field en ergy and momentum
Just as t ime and space t r anslation invariance imply the conservation of energy and moment um in pointparticle mechanics , t he same invariance, expressed in t erms of the current (2.30) , generat es the conservation of the energymomentum tensor. Under an infinitesimal t ransformat ion X'I' = xl' + dJ' in Minkowskian coordinat es, where the fl' are fo ur constant quantit ies, we can see that the field remains unchanged , i.e. b¢ = O. Note that t his remains true for any vect orlike or tensor quantity of arbitrary rank , since aX' l' / ox v = b~ for this specific case. Since the quantities fa are supposed to be constant, it is possible to read them off from the definition (2.30) . The conserved quantity is then a t ensor of r ank 2, since
which allows us to define a canonical energymomentum t ensor
8j.J.a
(2 .31 ) This t ensor is conserved by construction , i.e. (2.32) For each value of a = 0, . . . , 3, corresponding to independent variations of fa, is associated a conserved quantity: they are the components of the energymomentum vector p o. = (E / c, P) , given by
p o. =
~ c
which is hence conserved in time.
r d x8o<> (xf3) , 3
JV
3
(2 .33)
78
Overview of particle physics and the Standard Model
2. 1.1 .9 Symmetrization of the energymomentum tensor
T he energymomentum tensor (2.31 ) is not necessarily symmetric and thus does not necessarily correspond to t he t ensor (1.83) that appears in the Einst ein equations (1.85) However , it is always possible t o make it symmetric in an appropriate way, such that it remains conserved and such that the physical quantities we can derive from it are not affect ed. This allows us t o use its symmetric version as the source for the E inst ein equations. Let us define TP,cx == GP,cx + opGPP,cx .
This t ensor is conserved by construction if GPP,cx is antisymmetric in its first two indices, i.e. GP,pcx = _ GPp,cx . Furthermore, TP,cx leads to t he same conserved quantity as long as the fields decrease fast enough at infinity. We indeed have
(2.34) The second term of the last equality vanishes identically since GPP,cx is antisymmetric The last term can be expressed as an integral over the surface that limits V3 , i.e. oV3 which we send to infinity, where the fields are supposed to vanish . It follows that
(2 .35)
which we can identify with the momentum vector. This can b e illustrat ed by the example of free electromagnetism. In that case, the field ¢ get s replaced by the vect or potential AI'" Its dynamics is determined by the Lagrangian density [ef. (1.94)]
which only contains a kinetic term , with Pcx(3 == ocx A (3  o(3 A cx . The part of the energymomentum tensor that is proportional to the metric is already symmetric. The second t erm of (2.31) is given by GP,cx  _ o£e. m. oC< A _ ~ p,cx F p cx(3 e .m . 00 A P 47) cx(3 p,
P
,
which can be expressed , after a simple rearrangement , as
Here again, only the last t erm prevents the tensor from being symmetric (note, furthermore, that it is not even gauge invariant).
From classical to quantum
79
Let us now look for a t ensor with t hree indices G PWY., ant isymmet ric in its first two indices, which will allow us t o remove the last t erm. In other words , it should satisfy opGPJ.LQ = FPJ.L o pAn Using Maxwell 's equations, t his can be easily solved to obtain GPJ.LQ = FPJ.L A The symmetrized energymoment um t ensor is hence Q .
TJ.LV e .J11 .
=
FVP F J.L _ P
~",J.LV F FQ{3 4 '/ Q{3 ,
(2. 37)
which turns out to be the same as t he one obtained by variation wit h respect to t he metric (1.83) . 2.1.2
2.1 .2. 1
Quantum physics
Uncertain ty principle and Hilbert spaces
The great concept ual revolut ion of quantum mechanics [3J arose from its st atement that one cannot determine simultaneously with arbitrary precision the positions and momenta of any obj ects. T his means, for inst ance, t hat if one measures t he posit ion and moment um of a particle with precisions 6.q and 6.p , then , at best (in one dimension), 6.q. 6.p >
1 n
 2 '
(2. 38)
where n is Planck 's constant. T he existence of such an intrinsic limitation in t he possibility of knowing the stat e of a physical system obliges us to give up the not ion of a classical t rajectory. (See, however , Ref [4], which discusses t he Bohm de Broglie ontological interpretation fo r which such notions are still meaningfuL ) T he quantum descript ion replaces the classical hypotheses, by assuming t hat any physical system is described by an element of a complex vector space, a space of states, i.e. configurations, the dimension of which depends on t he system. Let us imagine, for instance, that we would like to study a light beam polarized linearly, either vertically or horizont ally. If t his is t he only prop erty we describe on the syst em, we then have a system wit h two possible st ates, denoted as I fT) and 11), in the 'kets' not ation introduced by Dirac; t he reverse notation allows us t o define a scalar product , fo r example (11 represents a ' bra', so that the set of t hem forms a 'bracket '. The vector space, in which t his system evolves, can have these two st ates as a basis, thus it is a twodimensional space. Any element of this space can always be decomposed as 11lf ) = 01 fT) + .81 1) with 0, f3 E
80
Overview of particle physics and the Standard Model
unaccountably infinite dimension. In this case (infinite dimension and endowed with a norm), it is called a Hilbert space.2 2.1.2.2
Operators and the superposition principle
In the quantum framework, the physical variables are replaced by operators that act on the vector space of states. For instance, for the stat e !q), which describes a particle at the position q, there exists a position operator, denoted as ij such that
(2.39)
In other words, the position state becomes an eigenstate of the posit ion operator. To avoid any risk of confusion and error in what follows, we will systematically denote a quantum quantity corresponding to a classical one by the same letter with the symbol , ' ': q f+ ij, P f+ p, etc. Any operator A should satisfy some conditions in order to correspond to a measurable quantity. In particular, since the space of states is complex, the eigenvalues o an observable, which can potentially be measured, should be required to be real. This implies that A is Hermitian , i. e. At :==TA* = A. Furthermore, since the eigenvalues Ok of this operator correspond to all possible values of the concerned property, the eigenstates !ak) of A should form an orthonormal basis of the space of states. In other words , any arbitrary physical state !\[! ) can b e decomposed as
(2.40)
where the discrete sum should be replaced by an integral in the case of a continuous infinity of degrees of freedom i.e. in the case of a Hilbert space of states. An operator A that satisfies these conditions is called an observable. The relation (2.40) directly leads to the central property of quant um theory, namely t he superposition principle, according to which several physical configurations can be superposed in a linear combination that is still a physical state; this is the case in (2.40). Moreover, a measurement of the observable A performed on the state !\[!) will necessarily give a unique eigenvalue Ok of this operator. Based on this principle, we can see that the dynamical equation satisfied by the physical state is necessarily linear. 2. 1.2.3 Noncommutativity
Any given classical physical quantity corresponds to an operator in the space of states, and the equations that control the dynamics must reproduce those of classical physics in the limit where we know they are valid. The easiest way to obtain the quantum laws is thus to derive them from the corresponding classical laws. Now, unlike the functions that define classical physics, operators act on a vector space and do not always commute . It is this property that is used to realize the correspondence.
We define the commutator between two operators A and B as [A, B] :== AB  BA. This obj ect has the same symmetry properties as the Poisson brackets introduced in
2The phrase ' Hilbert space' is a lso very often used for finitedim ensiona l spaces, although this is mathemat ically incorrect.
From classical to quantum
81
(2.5). It turns out that we obtain an appropriate prescription if we make the substitution (2.41 ) {A , B} f+  ~ [A , 13] .
In particular, t he position and momentum variables qi and pi are now variables that no longer commute and (2 .7) becomes
I [tii,qj]
= 0=
[pi,pJ]
and
(2.42)
This relation can be generalized to numerous cases (see for example the case of a scalar field later on). One can explicitly check here that in the representation where the position operator is expressed by (2.39), t he momentum operator then becomes
pi =
 infJ/fJqi .
2. 1.2.4
Predictability and statistics
The predictions of quantum mechanics are essentially probabilistic. Formally, t his translates into saying t hat if a physical system is in a state I"'), then t he probability Pw(<1» for it to be measured in a state 1<1» is proportional to the square of the scalar product (<1>1 "' ). Since only these probabilit ies are effectively predicted by the t heory, one can choose to normalize t hese physical states, which will always be assumed in what follows, i. e. ("'I"') = 1. With this convention, we have (2.43) Note that the normalization of a state is always possible, also because the scalar product of a state with itself is by definition positive in a vector space with positivedefinite norm (which is a prerequisite here to define probabilities)
For an observable operator A, we have seen that the only values a measurement could give were those belonging to the spectrum (the set of possible eigenvalues) of t his operator. Suppose that we have a great number of particles prepared in the same physical state, for definiteness, we choose, for instance, the state I"') of the decomposition (2.40). The measurement of A on this set of particles will be given by the average of t he operator in t he state I"'), namely
with a variance around this average value given by
As we would expect, this variance vanishes when I"') is an eigenstate of A. The uncertainty relation for the observables A and 13 is obtained in a similar way [5]
by considering the operator
C = 613 (A  (A) ) + i6A (13  (B) ) , where
from now
82
Overview of particle physics and the Standard Model
on (unless specified otherwise) the average is taken in an arbitrary state 11l1), an the redundant index has been suppressed. It is sufficient to notice that (ot 0) = II011l1)11 2 2: 0 and to expand this inequality to obtain
llAll13 2: ~ 1\ i [A, 13] ) I'
(2.44
which, once applied to the position and momentum operators, reduces to the Heisen berg inequality (2.38) , with an extra factor of, arising from t he fact that the coordi nates and momentum in orthogonal directions commute.
2.1 .2.5 Heisenberg and Schrodinger representations
Any physical state can always be normalized to unity. This allows for a probabilisti interpretation, which makes sense only if the dynamical evolution conserves the nor malization of the states. Denoting by (;(t) the t ime evolut ion operator, the fact tha the evolution is linear implies that the state 11l1(t)) evolves as
1111 (t ))
= (; (t
 to) 1111 (to) ) ,
(2.45
where (; satisfies
N (t) 1111 (t)) = N (to) 11l1(to) ) ¢=? (;t (; =
l.
(2.46
Thus the evolution operator (; is a unitary operator. In order to be physical, an quantum theory must satisfy this unitarity principle; this criterion allows us to rule ou some theories, usually very speculative, without even needing to resort to experiments Unitary transformations actually allow us to go further and to define some rep
d)
resentations i. e. some pairs ({An}, {11l1 ) t hat contain all possible operators and states of the theory. These representations are related to each other by
I ~h
{ An
t
I~) k =, ~1 1l1 ) ~ ,
t
An
= U An U
(2.4 7
.
These rela tions guarantee the conservation of the norm of t he states, and t hus of th probabilities: all predictions made in a given representation will be identical to thos obtained in another representation. There exist many useful representations, in particular, the Heisenberg picture i which the operators are time dependent but where the basis vectors, t he states, ar time independent. In this representation, thanks to (2.41), the classical equation i directly transposed to
dA
[ ,, ]
illcit = A , H ,
(2.48
where fl is the operator that corresponds to the Hamiltonian function of the classi cal system. To obtain (2.48) , we have assumed that the operator A did not depend explicitly on time. Such an explicit dependence can be taken into account by simpl adding the required term with a partial derivative. The Schrodinger picture corresponds to t he other extreme in the possibility o choices. In t his representation t he operators do not depend on time, and only t h
From classical to quantum
83
states are time dependent. The difference between these two representations is similar to the choice one can make when describing rotations. Either one considers an active rotation of the system , leaving the axis unchanged , or one can describe the same rotations in a passive way, considering that it is the axes that change (in the opposite direction) while the system itself does not move.
2.1.2.6
Schrodinger equation
In the Schrodinger representation, the quantum dynamics of the state, now depending on time, simply follows from the requirement that predictions, and thus probabilities, should be timetranslation invariant. Indeed, (2.45) implies that the time propagator U(t  to) satisfies d d ' dt 1\IF(t)) = dt U(t  to) 1\IF(to )) . However, if the equation describing the dynamics of a state is linear and independent of the initial state and time to, then the righthand side term in the previous relation should be proportional to the vector 1\IF(t) ). We can hence conclude that d ' dt U(t  to)
,
<X
U(t  to ).
This differential equation can be solved very easily to give
U(t  to) = exp
[ ~iI x (t 
to)] ,
(2.49)
where iI is a Hermitian operator. From t he timetranslation invariance, this operator should be conserved , and should therefore be related to the energy. Actually it can only be the Hamiltonian. The form of (2.49) guarantees that the operator U is indeed unitary. Let us now consider again (2.45) with the solution (2.49), we obtain the Schrodinger equation
ifi~ I\IF (t)) dt
=
iI I\IF (t)) .
(2 .50)
Moreover, we can note t hat U is precisely the unitary operator one needs to shift from the Schrodinger representation to the Heisenberg one. The vector 1\IF(to)), which is effectively time independent, forms the basis of this representation. Equation (2.50) can also be understood as a formal correspondence rule between the Hamilton func tion , i.e. t he energy, and the operator ifidj dt.
2.1.2.7 Harmonic oscillator Before attacking the subj ects of relativistic quantum mechanics and field theory, let us first recall the results of a very simple system, the harmonic oscillator. This will allow us to put into practice the notions discussed previously. Furthermore, this syst em will also be useful in field theory since all t he fields , whose properties we shall use, can be expanded as sometimes infinite sets of such harmonic oscillators.
84
Overview of particle physics and the Standard Model
The Hamiltonian of a onedimensional harmonic oscillator of mass m and angu frequency w is
(2.5
The Schrodinger equation (2 .50) for this Hamiltonian is obtained from the correspo dences discussed previously, namely H + ilid/ dt and p +  ilia / aq, . d'if; 2lidt
li2 a 2 'if;  2m aq2
=
+
1 2
2
 mw 'if;
(2.5
'
where we have defined the wavefunction 'if;(q , t) == (ql'if; (t)). This function is usua complex. Given the interpretation (2.43) of quantum mechanics, the square of modulus, 1'if;1 2 , represents the density of probability to find the particle at position q t ime t. The normalization of the state I'if;(t)) simply implies that the total probabili integrated over the whole space, is unity, so that 1'if;1 2 dq = l. The spectrum of the Hamiltonian, i.e. the set of the eigenvalues associated wi the operator il , is here obtained by solving the eigenvalue equation il'if; = E 'if; , equivalently by looking for the oscillation modes of the wavefunction of the fo 'if; = f(q) exp (iEt/li). The equat ion we obtain is well known and its solutions a normalizable only if the energy is positive, which is physically acceptable. Furthermo t he derivative of t he solut ion at q = 0 is continuous only for a discrete set of values the energy, En. We define the operators a and at, called annihilation and creation operators, respe tively, (understood as being associated with each oscillation mode), by the relation
J
, Y2ii fmW(,q+2.ft) a== mw '
a, t 
f{#w ('q2. ft ) . 
2li
mw'
(2.5
t hey are conjugate to each other but not Hermitian: they are not observables. Inverti these relations and using the commutation rules (2.42) , we find that [a , at ] = l. T Hamiltonian can then be expressed as
In this form, we can understand why the energies are positivedefinite since
by the definition of the norm in a Hilbert space. Once we obtain a state In) of energy En, we can determine all t he others applying t he creation operator:
which shows that In + 1) = (n + 1)  1/2a t ln ) is an eigenstate of the Hamiltonian wi the eigenvalue E n+ 1 = En + fiw. Similarly, In  1) = n 1 / 2 aln) is the eigenstate of
From classical to quantum
85
with the energy E n  1 = En  nw. The state with lowest energy, the vacuum state 10), satisfies alO) = 0, and all the states can be obtained by
In) =
~ (att 10).
vn!
Thus, the vacuum state has a nonvanishing energy, Eo = ~nw , and the full spectrum is En = (n + ~) nw.
2.1.2.8
General potential
Let us now come back to the threedimensional space. A particle evolving in a general potential V (x) can be expanded in a similar way in the basis of the eigenstates of the Hamiltonian, with eigenfunctions, '0E(XI') satisfying (2.54) where the eigenvalue of the Hamiltonian E, is the energy of the state. For a harmonic oscillator , Vi1.o. ex x 2 , and (2.54) becomes invariant under the transformations x >  x and p > p. Therefore, the eigenstates then have vanishing average values of the position and momentum, i. e. (nlqln) = 0 = (nlpln). This reflects the fact that the potential has a minimum at the origin , increases with the distance (quantum analogue of a restoring force) and is symmetric. Thus, the wavefunctions are essentially localized at the origin and the requirement that they are normalizable demands that they vanish asymptotically. A general potential V(x) does not necessarily respect these conditions. In particular, it can be nonvanishing only in a finite region of space. This is the case, for instance, for the Coulomb potential of the electric interaction between an electron and a proton. This attractive potential is, in addition , negative. We therefore deduce that the eigenvalues are negative energies forming the different possible states of the hydrogen atom, spread on a discrete spectrum of normalizable wavefunctions. There is also a continuous component formed by all the positive energies, which corresponds to nonnormalizable wavefunctions. Actually, they are the wavefunctions of the momentum operator that can be expressed as
'0 p (xl') ex
e i( p."'  wt ) ,
i.e. plane waves . We usually normalize them by artificially putting the particle in a box of finite volume V3 , and dividing the wavefunction by V3. Once all computations are performed nothing should depend any longer on V3 and one can take the limit V3 > 00. If this is not the case, using the wavefunctions of the Hamiltonian is delicate and one should resort to other methods and different representations. An important point, valid both in classical and in quantum mechanics, is that the potential V (x) must be determined by experiments, or should be imposed in one way or another. Its functional form is in the theoretical context completely arbitrary. The description of some interactions in t he framework of field theory will allow us to compute the form of this potential for many cases .
86
Overview of particle physics and the Sta ndard Model
The eigenstates of the momentum can also be used as a basis for the representatio of free particles, i. e. without a potential. In this case, the creation operator at (p) act on the vacuum state to create a particle with moment um p , and the wavefunction i obtained using the Fourier transform of these states. It follows that
which allows us to define the field operator If, by
,'1,( ) 'I'
X
=
J
3
d p i p·x (27f)3/ 2e ap , A
(2 .56
such that 1f;(x, t) = (O!If,(x)!1f; (t)). The operator If,(x) annihilates a particle at th point x, while its Hermitian conjugate If,t produces a particle at the same point. The generalized commutation relations for the creation and annihilation operator for momentum eigenstates, namely
(2.57
translate into
(2.58
Note that it is also possible to define a conjugate momentum IT( x) and a Hamiltonia formalism. We will come back to this in the context of the relativistic theory. 2.1.3
Quantum and relativistic mechanics
The application of the correspondence principle allows the immediate formulation o a relativistic quantum theory. Instead of using the classical expressions relating th total energy, the Hamiltonian , to the kinetic and potential energies, it is sufficient t consider the relativistic momentum , pi' = mui' , and the invariant Pi'Pi' = _ m 2 c2 which can be expressed as a function of the energy, E , and of the momentum , p , a Pi'pi' = _ E2/C 2 + p2 =  m 2 c 2 , to get E2
=
p
2 2 C
+ m 2 c4 .
The Klein Gordon equation for the wave function ¢ can then be obtained using th correspondence E 4 inOt and p 4  inV
(2.59
It will be shown in what follows how it is possible to take into account the propertie of special relativity to describe quantum field theory. An important part of thes developments can be generalized to curved spacetimes, in particular by replacing th partial derivatives oe. by covariant ones \7 e.' This is how we can consider the influenc
From classical to quantum
87
of gravity by means of general relativity to a large degree: the quantum field theory in curved spacetime assumes that gravitation alters field theory only through the metric in which the (test) quantum fields evolve. From now on, natural units will be considered and we therefore fix 'It = c = l ' (see appendix A).
2.1.3. 1 Klein Gordon and Dirac equations
In natural units , the Klein Gordon equation , which describes the dynamics of a free scalar field , becomes (2.60) For a general selfinteracting potential, the Lagrangian from which this equation is derived is given by (l.106). In the case of a massive free scalar field, we will take (2.61 ) i.e. a simple mass term. Its quantization will be discussed in detail in Section 2.2. A problem immediately appears in (2.60), that of the existence of negativeenergy solut ions . Indeed, the classical relation E = p 2 + m 2 assumes that E > O. But the solutions of (2 .60) are simply the eigenstates of the momentum and the energy, which are ex: exp [i(p . x  Et)], but nothing indicates what the sign of the energy should be. In other words, both choices E = p 2 + m 2 are possible. But if the wavefunction rP corresponds to one of these states, then it contains less energy than the vacuum state itself. Thus, a theory containing such states is necessarily unstable. This is why we should go to the level of quantum field theory, for which the solution of (2.60) is not a wavefunction, as in the case of 'lj; for the Schrodinger equation , but an operator , similar to >it of (2.56). The difficulty we have just mentioned for the scalar field comes from the fact that its dynamical equation, unlike in the classical case, is second order in time, which is normal to respect the relativistic invariance. Another way to proceed is to postulate a firstorder equation in time, and hence also in space. In order to respect the relativistic invariance, one should be able to write it as
J
±J
(2.62) which we can obtain by varying the Dirac Lagrangian (2.63) Since this equation , known as the Dirac equation, describes the dynamics of a relativistic particle its square should also reproduce the correspondence observed in the Klein Gordon equation. We should therefore also have ('T)J.LI/ 8J.L81/  m 2 )'lj; = 0, which implies that
88
Overview of particle physics and the Standard Model
This is only compatible with t he Klein Gordon equation if the coefficients , " satisf
(2.64
T hus, the '" cannot be simple numbers. Actually, t hey should be matrices of rank a least equal to four . A useful representation for these ," is
,o=
(0 f) fO
a")
0 ( (j" 0
'
where each entry of these matrices is itself a 2 x 2 matrix, f = twodimensional identity matrix, i. e. f = is 1
a =
(01) 10 '
a
2
=
(0 i) 0 ' i
(~ ~).
aD
'
(2 .65
standing for th
T he a i are the Pauli matrices, tha
and
a
3
(1 0)
= 01 .
(2 .66
Finally, we have defined the vectors a" == (I , a i ) and (j" == (I , ai ). For reasons t ha will become clear later , the matrices of (2.65) are called the chiral representation o the Clifford algebra, defined by (2.64). It is interesting to note that (2 .62) can also be obtained from the action
(2.67 Assuming that 'I/J , defined by
(2.68
and 'I/J are considered as two independent quantities , then t he variations must b performed simultaneously. T his action introduces the operator f7
FoG == Fo(G)  o(F)G,
(2.69
for any two quantities F and G . In the case of (2.67) , F and G are spinors, but th definition applies also for any kind of field, for instance scalars.
2.1.3.2
Maxwell 's and Praca's theories
The scalar field ¢ is of spin 0 and the spinor field 'I/J of spin 1/2. T he fields know experimentally, among which we can find all the particles listed in Section 2.3, als include particles of spin 1 such as the photon, of zero mass, or the W ± and ZO vector of t he electroweak interaction. Such fields are described by a vector A" , the dynamic of which is governed by P roca's Lagrangian
(2.70
for a real field of mass M A . The Faraday tensor has the usual definition , F"v == 20I"Av
Canonical decompositions
89
Varying (2.70) using (2.14) for each component Ao< of the vector field, we find
and hence (2.71 ) This equation does not seem to be equivalent to the Klein Gordon equation for each component of the field due to the righthand side term. However, if we take the divergence 80< of this equation, the righthand side term becomes a d'Alembertian , which simplifies with the one on the left, so that
Note that things are more complicated in the relativistic case, where the covariant derivatives do not commute [ef. (1.97)]. As can be seen here, if the vector field is massive, i.e. MA of. 0, then t he Lorentz gauge condition 8J.L AJ.L = 0 is automatically satisfied [the fact that Proca's theory (2.70) is not gauge invariant precisely because of this term is no surprise], and (2. 71) leads to a Klein Gordon equation for each component, which is to be exp ected from the classicalquantum correspondence principle discussed around (2.59). The energymomentum tensor fo r the massive vector field, once it has been suitably symmetrized, is identical to the one for the gauge field of electromagnetism (2.37) up to the mass term that has to be added. This leads to (2.72)
2.2
Canonical decompositions
The particular case of a scalar field, let it be real or complex, is discussed in detail in most textbooks on field theory; see, for instance, Refs. [6 9]. We briefly review it in this section. For completeness, we also briefly describe the case of the vector and spinor fields. 2.2.1
Technical clarification
In what follows, we will systematically choose a symmetric definition of the Fourier transform and of its inverse that are defined by (B.58) and (B.59). 2.2.1.1
Lorentz invariant measure
One of the important properties of t he Dirac distribution is that
df
o [j(x)  a] = ~ ( dx a
where the
Xa
)
1
o(x  xa),
(2.73)
X = Xa
represent the zeros of the argument, i.e. the solutions of f(xa) = a.
90
Overview of particle physics and the Standard Model
This property allows us to replace the Fourier integrals in 4 dimensions by three dimensional integrals when the fields over which we perform the integration are 'o shell ', i. e. their Fourier components satisfy the relation E2 == k;5 = k 2  m 2 , the energ being posit ivedefinite . In that case, it is possible to perform the integration over th time component ko with the previous relation by setting f(k o) = k;5 . This gives
with the definit ion
(2.75
and where e is the Heaviside step function. T he relation (2.74) defines a Lorent invariant measure despite being threedimensional. Note that to obtain this measure it is necessary to impose the positive energy condition in order to keep only one term of (2 .73).
2.2.2
Real free fie ld
Equation (2.60) can be obtained by varying t he Lagrangian (l.106) in which we pos tulate an arbitrary potential V(¢) for t he scalar field ¢ . The conjugate field is then
7r(x) = so that the Hamiltonian density, H =
Of:
~ . 
J¢(x)
.
= ¢(x),
(2 .76
7r¢  L , is given by
(2.77
General potentials V(¢) are useful in cosmology, for instance, when addressing t h inflation scenarios (Chapter 8).
2.2.2.1
Mode expansion
The eigenstates of the momentum operator are t he basis of every expansion in quantum field theory. To deduce that, we should first know the energymomentum tensor , tha is 8 J.'
_ ~8
= 8J.'¢8
[~ (8¢)2 + V(¢) ] = TJ.'n
(2 .78
The last equality follows from the fact that the energymomentum tensor of a scala field is already symmetric. Moreover, it coincides with t he tensor (l.1 07) obtained by varying (l.84) with respect to the metric.
Canonica l decompositions
91
The tensor (2.78) allows us to construct the fourmomentum vector that is obtained by po. == TOO. , from which we indeed get, first the Hamiltonian density, which corresponds to the purely timelike component, i.e. pO == TOO = 1i, and then the associated momentum to the scalar field P = 7rV ¢. Because the presence of a general potential V (¢) can lead to some potential nonlinearities that can imply some unnecessary complications, here and in the following sections we restrict ourselves to the case of the free massive field for which V = ~m2¢2. Having found the momentum P, it is sufficient to perform the substitution of the correspondence principle and to replace P by P + iV. The eigenmodes Uk(X) with eigenvalue k of the momentum operator thus satisfy iVUk (x) = kUk (x). They are plane waves Uk(X) = Nkeik.x,
N k being a normalization factor to be determined later. It is now sufficient to expand the field on the basis of these eigenstates as
(2.79) Since ¢ satisfies the Klein Gordon equation (2.60), we find that the coefficients ak(t) evolve as adt) + (k 2 + m 2 )ak(t) = 0, where we used the fact that V 2Uk (x) of this equation is simply
=
2  k uk (x). Setting
wk = k 2 + m 2, the solution
with b;" = a  k since ¢ E lR. After some algebraic manipulations, and using the relations between the coefficients of the expansion, we find a more explicit fourdimensional expression, that is (2.80) where we recall that k . x == k . x  wkt. From (2.80) and from the invariant measure (2.74) we see that the field expansion, even though it was realized on the basis of threedimensional states, is invariant, i.e. that ¢ is effectively a Lorentz scalar, as long as we choose the normalization (2.81) Moreover, this relation allows us to recover the usual commutation relations after quantization. Note that the conjugate momentum of the field is simply (2 .82) which we obtain in a similar manner as before. The factor (  iWk) in the integrand of (2.82) implies that the latter is not explicitly a Lorentz scalar. This is not a problem since only the field should absolutely satisfy this property.
92
Overview of particle physics and the Standard Model
2.2.2.2
Quantization
The field and its conjugate momentum can now be 'promoted' to the status of oper tors , which can be done by replacing the coefficients ak and a"k by the annihilation a creation operators, respectively, ak and The field Poisson brackets then becom the equal time commutations relations. If we impose
al .
(2.8 and all others to vanish, i. e. the usual commutation relations , we indeed obtain t canonical relations for the field
[¢(X,t),7T(X',t) ] =i5(3) (xx') ,
(2 .8
a computation that is recommended to be performed explicitly. The relations (2.8 and (2.84) can only be respected if the normalization of the field modes is given (2.81) . The eigenmodes of the field operator are now defined by
u k (x , t)
=
N k ei( k .x w /c t) ,
(2.8
so that the field operator is expressed as
(2.8
This relation can be inverted to express the creation and annihilation operators as function of the field , namely
(2.8 f7
where the derivative 8 is defined by (2.69) . This analysis is then complet e as soon as we know the Hamiltonian, which is giv by
(2.8 We then get
(2.8
where we have used the commutation relation (2 .83) and defined Nk as the 'number particles in the state with momentum k ' operator, the significance of which will le us to the definition of the Fock space associated with this theory. A remark concerning (2.89) is necessary at this point. The second term, propo tional to 5(3) (0 ), is clearly divergent , even in the case where the momentum space finite, i. e. if there exists a cutoff k m ax above which one can no longer perform the com putation. In field theory, as discussed here, this infinite energy is not embarrassing
Canonical decompositions
93
it has no physical meaning since in practice only differences of energy are measurable. In practice, we define a normal ordering that automatically puts the creation operators on the left and the annihilation ones on the right, or by 'renormalizing ' the energy by simply suppressing the vacuum energy obtained this way. This possibility is unfortunately no longer an option as soon as the theory is coupled to gravity. The vacuum energy then assumes all its meaning and acts as a cosmological constant , which is in any case considerably more significant than the one we measure (Chapter 12). It is, in the current state of knowledge, the most serious problem that theoretical cosmology has to confront, and it comes from quantum fi eld theory!
2.2.2.3
Fock space
The complete quantum theory is obtained once we know the state of the particles. For that, we st art by defining the vacuum 10) as being the state annihilated by all the annihilation operators, i. e. aklO) = 0 'tIk E JR3 . We consider, furthermore, that this state is normalized, namely (010) = 1. A state with one particle is then constructed from this vacuum state by applying a creation operator to it. We thus have
(2.90)
Ik ) == 0,110) , normalized by
(klk' ) = (ola k al,10) =
5(3)
(k  k') ,
and we construct in this way a complete space of states, each state having n( k) particles in the state of momentum k . The definition (2.90) shows that a state with n(k) is an eigenstate of the operator Nk , since (2.91) It follows that any state describing a set of particles will be an element of
called a Fo ck space and having an unaccountable infinity of states. Finally, note that starting from the vacuum state 10) , it is possible to produce a particle at the point x simply by applying the field operator itself
lx,t ) =
q} (x , t) IO) =
J
d3k e ik . x (271")3/2 J2Wk Ik),
(2.92)
which appears essentially as the Fourier transform of the one particle st ate with given momentum, hence its interpretation. Note that this equation uses the operator q) to generate a particle, which plays no important role here since the field is real and hence cfyt = cfy, but in the more general case of a complex field , the distinction is important.
2.2.2.4
Spherical waves exp ansion
The previous expansion is only a special case of the expansion in the basis of the momentum eigenstates expressed in Cartesian coordinates (x , y, z) . It is also possible
94
Overview of particle physics and the Standard Model
to perform this expansion in spherical coordinates (r, 0, rp) using spherical harmon In that case, we split the Laplacian into radial and angular parts t. = t.r + r  2 t with t.r = 8; + (2 /r)8r and t.11 = 8~ + sin 2 08~. We express the eigenvalues
JP2
the Laplacian using the amplitude p == and two integer numbers £ E Na m E [£, +£j . The scalar field ¢(x, t) can then be decomposed into the eigenmod ¢pem(x), of the Laplacian (2. which are given by
¢pem(x , t)
=
upe(r)Yem (0, rp).
Since the spherical harmonics are solutions of (B.8), (2.93) leads to an ordin differential equation for the radial part upe(r) , namely
2
d [ dr2
2d
+ ~ dr 
£(£+ 1)
r2
2]
(2.
+ P upe = 0,
the solution of which is a spherical Bessel function upe(r) ex: je(pr), (Appendix B). T normalization relations of the spherical harmonics (B.15) together with those of spherical Bessel functions (B .51) show that choosing
the modes of the scalar field are normalized , i.e.
(2 . Also note t hat with these convent ions, we have the completeness relation
roo
in
o
+e
L L 00
dp
Upe(r)Upe(r')y£~(O , rp) Yfm(e', rp')
= 5(3)
(x  x').
(2.
e=o m=e
The scalar field operator is now expressed as
(2.
where the operators iipfm depend explicitly on time. Inserting the relation (2.97) the Klein Gordon equation (2.60) , we get
22 (:t
w;
+ w~)
iipem(t)
=
0,
(2.
where = p2 + m 2 for each independent mode. The relation (2.98) shows that creation and annihilation operators, which could a priori depend on the three numb p , £ and m , have a time dependence that is only a function of the magnitude p of
Canonical decompositions
95
momentum. The solutions of equation (2.98) are complex exponentials, so that the total field takes the form
1 z= 00
¢(x, t) =
dp
o
Npupe(r) [Y£m(B,
+ Yi::n(B,
(2 .99)
e,m
and the momentum conjugate
(2.100) The remaining thing to do is to use the commutation relations (2.84) at equal time and to impose the same normalization as previously, namely Np = (2wp) ~ 1 / 2, to get the commutation relations for the creation and annihilation operators (2.101 )
The link between the Cartesian and spherical representations is obtained by expanding the plane wave in the basis of spherical waves (B.21). By identifying the coefficients of the term e ~ iwpt in both expansions (2.86) and (2.99), we obtain (2.102) where (Bp,
The complex scalar field
The generalization to the case of a complex scalar field, ¢ E
¢ = 3le(¢)
+ i~m(¢) ==
1
yf2(¢1
+ i¢2),
and we assume t hat the real and imaginary parts are independent. In a perfectly equivalent way, and this is the point of view we will adopt, we can also consider ¢ and ¢* as two independent fields. The Lagrangian obtained in both cases should be real, so that for a free massive field (2.103) Note the absence of the factor ~ in the mass term, compared with the Lagrangian (2 .61 ). This is due to the fact that since the two degrees of freedom of the field are
96
Overview of particle physics and the Standard Model
independent, the variation with respect to ¢ or ¢* does not give the factor 2 of real scalar field. The two conjugate momentums are
7T =
~~
= ¢* and
7T* = : ; = ¢,
which are indeed complex conjugates of one another and are hence independent. theory describing a complex scalar field is simply that of two real scalar fields of same mass, related to each other by the fact that they are the real and imagi parts of a complex number. The energymomentum tensor is obtained by summing (2.31) over the indepen degrees of freedom
which gives
(2.
We find again the same result as the one obtained using the definition that comes f the variation with respect to the metric. The commutation relations at equal time between the field operators and t conjugate momentum are now
(2. The mode expansion is of t he form
(2. where we have introduced two different sets of operators, complex. Their commutation relations are now
ELk
and
bk , since the fie
(2.
all the other commutators vanish. With the normal ordering for the operators discussed in the case of a real sc field, the Hamiltonian takes a form equivalent to (2.89), that is
(2 .
which shows t hat a complex scalar field can be interpreted as a collection of kinds of different particles, but with the same mass, corresponding to the two of operators ELk and bk . The Fock space is then the tensor product of the two F spaces generated by these operators, with one mutual state, which is the vacuum annihilated by both ELk and bk .
Canonical decompositions
2.2.3.1
97
M axwell or Praca field
The Lagrangian of a massive complex vector field is obtained by generalizing Proca's Lagrangian (2. 71 ) as LP roca
=
~FI.wpi.w + M~A:A" ,
(2.109)
where A" now has four complex components. As for the scalar field , the mass term no longer has the factor ~ characteristic of the real field. These types of fields can describe charged vector particles, such as t he W± bosons mediating the weak inter action. The expansion into creation and annihilation operators is analogous to (2.106), up to the fact that due to the gauge invariance for the Maxwell case with vanishing mass , or for the gauge constraint for the massive case of Proca [see the discussion below (2.71)], t he number of degrees of freedom is not equal to the number of field components (i.e. four ), but three for the massive field and two otherwise. This point leads to subtleties in the quantization that are discussed in Refs. [7 9] to which we direct the reader. 2.2.3.2
Internal symmetries
The Lagrangian (2. 103) not only represents two independent scalar fields of t he same mass. Indeed, these two fields are the components of a complex number and the Lagrangian is thus invariant under the phase shift and
(2.110)
where ex E IR is an arbitrary constant. Expressed in terms of t he real and imaginary parts of ¢, the phase shift (2.110) corresponds to a global internal rotation. It is this invariance that imposes the stat es 'a' and ' b' to have the same mass. From Ncether's theorem , the existence of such an invariance should translate into the existence of a conserved quantity. To determine it let us work in the limit ex « 1, such that we can expand t he exponential in (2.110), leading to the infinitesimal transformation ¢' = (1 + iex)¢. This transformation being independent of the spatial coordinates, x, we have i" = 0 in (2 .29) , and the current
J
" _ . 8L . J.  2 88,,¢ 'f'

. 8L ..J..* 88,,¢* 'f' ,
2
(2.111)
is conserved since in that case, we have o¢ = iex¢ and o¢* = iex¢* (we have removed the irrelevant normalisation factor ex in this definition). The conserved 'charge ' is obtained by direct integration, i.e. (2. 112) After quantization , this charge leads to the operator (2. 11 3)
98
Overview of particle physics and the Standard Model
which is indeed time independent since its equation of motion , using the form (2.1 of the Hamiltonian , is
dQ dt
. [,Q, He , ]=
= ~
O.
Equation (2.113) can be easily interpreted by saying that particles associated w the operators a and b are particles of opposite charges; in reality they are antipartic of one another. The complex scalar field is only an example of internal symmetry, in this case of invariance under transformations of a U( l) group. The associated charge is, in gene called hypercharge, and corresponds to the electric charge only in the case where U(l) group is that of electromagnetism. The different existing particles of which will make an inventory in Section 2.3, indicate the existence of more symmetries, ba on larger symmetry groups.
2.2.3.3 Fermions and anticommutation relations
It is possible to expand the Dirac field (2.63) in a similar way as in the exp sion (2 .106) , with an important difference however: the field operators, and hen those associated to the annihilation and creation of particles, must respect antico mutation relations instead of commutation relations of the type (2.83) or (2.84). T consist ency problem related to this point is made explicit in Refs. [7,8], and is beyo the scope of this presentation. It will be sufficient to know that for fermions we ne to replace the Poisson brackets by anticommutation relations
and
(2. 1
where we used the fact that , from the Lagrangian (2.63), the conjugate momentum the spinor is given by
=
7r7j; 
O£Dirae _ .
o1/J

'"I,t
~'f/
.
(2.11
Note that these anticommutators, similarly to the commutators for bosonic fiel are taken at equal time. At this point, it is interesting to note that if we place oursel in t he context of field t heory at finite temperature (thermal field theory, see e Ref. [10]) , then these anticommutation relations are necessary so that the fermio satisfy the Pauli exclusion principle; moreover, they lead to the correct statisti distribution law, i. e. the Fermi Dirac law (just as the commutation relations for bosons lead to the Bose Einstein distribution) .
2.3
Classification and properties of the elementary particles
The general properties of elementary particles are gathered in Table 2.1 (see Ref. [11 When one looks at these properties, one finds structures, such as illustrated , for stance, in Fig. 2.1 showing the particular case of baryons of spin ~ and of posit parity (parity is defined later , here it is only an illustration).
Classification and properties of the elementary particles I
I
I
I
t,
t, 0
t,+
t, ++
or
0
0
0
o
2: 
2: 0
2:+
1
I
0
0
0
2
r
0
3

~
99

1236 MeV
 1385 MeV
so  1532 MeV
0
Q
0 I
1
 1672 MeV I
o
I
I
1
2
Fig. 2.1 An example of the structure observed between the elementary particles : the baryons of spin ~ and of positive parity. Table 2.1 Nomenclature of the elementary particles. The families based on quarks (q) have several hundreds of particles identified so far , whereas the leptons and the gauge fields are here presented in an exhaustive way in the context of the standard model. The gauge fields are written in the following way: I is the photon, the ZO and W ± bosons are the mediators of the weak interaction, and ga represent the 8 gluons of the strong interaction . Spin
Name
Examples
B aryons = qqq Ha drons
{
Mesons = qij
n
7r°'±, Ko'± ,J/'Ij;, Do , Bo , 1'/ , '
Leptons Gauge fields
This kind of diagram has made it possible to understand the existence of a very precise structure, the history of which is discussed in Ref. [12]. In what follows, we will only indicate the result of numerous years of experimental research that led to the socalled 'standard ' model of particle physics, its theory is the subject of Section 2.6.
2.3.0.4
Exchange bosons
The known particles come m three forms. First, we find the bosons of unit spin, mediating fundamental interactions: they are the gauge bosons, namely the photon " which mediates the electromagnetic interaction, the eight gluons ga, which mediate the
100
Overview of particle physics and the Standard Model
strong nuclear interaction , and the vectors ZO and W ± . The latter have the proper of being massive and are hence described by Proca's Lagrangian (2.109) , at least long as we ignore their interactions. The vectors ZO and W ± allow the weak nucle interaction to propagate, W ± are in addition electrically charged . According to the current theoretical ideas, to be complete, one should add to th table the graviton, hl'v, massless 'particle' (in the framework of a quantum theo of gravity equivalent to that of the other interactions) of spin 2 responsible for t gravitational interaction (Chapter 1). In the absence of any satisfactory quantu theory of gravity, we do not consider at the moment the graviton as an elementa particle belonging to the standard model.
2.3. 0.5
Leptons
Of spin ~ , t he leptons, among which t he electron is the best known , are of negati electrical charge (e  , p, and T ) or vanishing (v e , vI' and VT which are the associat neutrinos). The evolution of these particles is dict at ed by the Dirac equation, coupl or not to electromagnetism , depending on the case. They are also subject to the we interaction and can be grouped into three 'doublets' and
(::)
which have exactly the same properties with respect to this weak interaction. Actual the only characteristic that allows us to distinguish between these different double is the mass of the particles that constitute them. Moreover, among them, only t doublet of the electron is stable, the others, of higher mass, are subject to the weak a electromagnetic interactions, and can decay into, for instance, electrons and electro neutrinos. For a long time, the neutrinos have been considered to have a vanishing mass, b some recent experiments and measurements (see Ref. [2] of Chapter 9) have show that there must be a mixing between t he different species of neutrinos, which is on possible for massive particles.
2.3.0.6
Hadrons
Only one category of particles , albeit t he most numerous one, is subj ect to the stro interaction: it is the hadrons. They can be of halfinteger spin, such as the proton a the neutron , and then belong to the category of the baryons. The mesons, of integ or vanishing spin, count among them , for instance, the 71"0 and its charged partne 71"±, which are described by the Klein Gordon equation for the real scalar fields ( the 71"0) or the complex (for the 71"±). The surprising profusion of the hadrons, as well as the observed properties symmetry between the different elements of the families sharing some properties (sp and parity, for instance, as in the illustration of Fig. 2.1) have led to the understandi that these particles, unlike what we currently think of gauge bosons and lepton are not elementary particles, but are constit uted of 'quarks' a The latter are for t
3 According to the legend , this word was proposed by M urray C ell Mann (Nobel Prize 1969) w would h ave read it in the book ' Finnegan's Wake ' by J ames J oyce.
Internal symmetries
101
moment seen as effectively elementary, of spin ~, and of electric charge + ~ for the quarks u, c and t, and for the quarks d, sand b. We can also group them into doublets for the weak interaction, with a similar structure to the one for the leptons , more precisely
1
a nd This description, which is satisfying from the theoretical point of view, currently does not have any generally accepted explanation (even if many models exist) . The names of the quarks themselves come from this structure: at the time where only the first family was experimentally accessible, the chosen names were simply up (u) and down (d). Then , a third quark was needed to b e introduced, which was considered as strange (s). The second family was then completed by the charm (c). In the same philosophy, when the presence of the fifth quark started to appear, the name of beauty (b) was proposed, soon replaced by bottom, allowing the first family to be reproduced with the top (t). Just as for the leptons, the only difference between the quark families comes from their masses, as well as the fact that only the members of the first family are stable. The ones from the second and third families decay much faster than their leptonic counterparts since they do so through the strong interaction instead of the weak interaction, the latter having a much weaker amplitude , as its name suggests. All these particles have been produced in accelerators, but have never b een observed alone, which can be explained by the notion of confinement of strong interactions. This is how it is only possible to observe states containing three quarks , baryonic states of halfinteger spin, and states composed of one quark and one antiquark, of integer spin: a meson. This point is not yet completely understood theoretically. Table 2.1 summarizes the nomenclature used to describe the diversity of elementary particles. The origin of the mass of all these particles, together with the important disparities observed between them , still pose many questions that are not all yet resolved. As we can see, the known particles reflect, or seem to reflect, some elements of symmetry. These are described by group theory, to which we now turn.
2.4
Internal symmetries
Contrary to leptons , which all appear in doublets , hadrons are found in numerous forms , suggesting that they are not elem entary. In reality, using a much smaller number of elementary particles, in this case quarks , and placing them in a relevant way in group representations, it is actually possible to combine them so that the other representations are perceived as new elementary particles. The role of group theory is precisely to lead us to this conclusion.
2.4.1
2.4.1.1
Group theory
Definitions
A group g is defined as a collection of objects {g;} , where the index i is not necessarily discrete , satisfying the following rules:
102
Overview of particle physics and the Standard Model
 There exists a composition law, denoted by 0, which can combine any two eleme of the group , the resulting combination being an element of the group: if gl E and g2 E g, then g3 = gl 0 g2 E g. Furthermore, this law is associative, gl 0 (g2 0 g3) = (gl 0 g2) 0 g3 = gl 0 g2 0 g3·  There exists a neutral element, denoted by 1, such that Vg E g, l og = go 1 =  Each element 9 Eg is associated with an inverse element , denoted by g  l, sat fying go g l = g l 0 9 = l.
Note that the law 0 in not necessarily commutative: when it is, i.e. if Vgl , g2 E gl 0 g2 = g2 0 gl, the group is then said to be Abelian. In the opposite case, the gro is usually said to be nonAbelian. The simplest group one can think of, besides the trivial group that only conta one element, is the discrete group Z2 that contains two elements, denoted by 1 (wh is the identity) and 1 with the ordinary multiplication law. This is an Abelian gro for which each element is its own inverse. For discrete groups, one can construct t complete table of all possible operations. 2.4.1.2
Lie Groups
Another wellknown example of a group is that of translations that transforms a po in space x into x + a. In this case, any element of the group can be directly associat with a translation vector a: the composition law is vector addition, the inverse eleme of a is simply  a, and the identity is the translation by the null vector o. It is ag an Abelian group. Such a group contains an infinite number of elements, which can all b e identified the knowledge of three real numbers (the three coordinates of the translation vecto More generally, if it is possible to describe the elements of a group by a set of numb varying in a continuous interval (but not necessarily infinite), it is then a Lie grou A Lie group is also a manifold. The bestknown nonAbelian group is probably that of rotations in three dime sions. Each rotation is determined by, for instance, the three Euler angles , or equiv lently, a unit vector (rotation axis) and the angle of rotation around this rotation ax The composition of two rotations is equivalent to a third rotation, which makes i group (since a rotation can always be inverted by performing the same rotation in t other way, and so a rotation of angle zero is the identity), but the order in which t rotations are performed is crucial and this group is not Abelian. This group, denot by SO(3) , is identical to the group formed by the orthogonal 3 x 3 matrices with u determinant. We can generalize it to the group SO(n) , which contains the orthogo n x n matrices with unit determinant. Each group is associated with an invariance. For instance , in the case of a rotatio the norm of a vector on which the rotation has been performed remains unchang during the transformation. A rotation in a generalized Minkowski space, i.e. havi p spatial dimensions and q timelike dimensions, must keep the form up.up. inva ant; it contains not only rotations in the pdimensional space, but also pure Loren transformations (some ' boosts') . This group, generalizing SO(n) for arbitrary met signatures, describe p spacelike coordinates and q timelike ones, and is denoted SO(p , q). The socalled Lorentz group, is then simply SO(3, 1). The translation gro
Internal symmetries
103
can be associated with the Lorentz group to form the Poincare group on which special relativity is based (see Chapter 1) . 2.4.2
2.4.2.1
Generators
Infinitesimal transformations
An infinitesimal transformation, T€, of a Lie group is a transformation close to the identity and hence can always be written as T€ '::' 1 + iEaAa. This defines the generators Aa , which are Hermitian matrices and depend on the nature of the performed transformation. We suppose here that the Lie group is described by p parameters, i. e. a = 1" " , po A finite transformation can be obtained from an infinitesimal transformation by exponent iation. Such a fini te transformation can always be written as the composition of a large number, tending to infinity, of infinitesimal transformations. If a represent the p necessary numbers for the finite transformation, then we can always write E = a lN, where N » 1, and performing N transformations will indeed lead to the finite transformation. In other words, we will have (2.117)
which is the general form of a finite transformation of the group. 2.4.2.2
Lie Algebra
In order for TOt the property
0
Tf3 to be a group transformation, Ty for instance, one should have
(2.118)
For infinitesimal transformations this gives to second order in the expansion of the exponentials eiOt Aeif3
A
= 1 + i (aa + f3a) Aa  ~ (aaab + f3af3b + 2a af3b) Aa Ab,
the logarithm of which, a lso expanded to second order [using the relation In(l E  ~ E2 + ... ], should be proportional to Aa . We find explicitly
a+
i (a a + f3a) A
~aaf3b
a [A , Ab]
=
+ E)
=
hc Ac .
This equality is in general only satisfied if the commutator of the>. satisfies (2.119)
This relation defines the algebr a associated with the Lie group , i.e. the set of operators of the form aa Aa. The numbers rabe are the structure constants of the group , and the Aa form its algebra. Note that since the generators are Hermitian operators, the structure constants are real numbers .
104
Overview of particle physics and the Standard Model
2.4.2.3
Representations
In order to be able to classify the particles in terms of symmetry groups , we need sort them depending on the representations of the group of the considered invarian An ndimensional representation being a set {n a} of n X n matrices satisfying t commutation relations (2.119) of the algebra, the fields that describe the particles o given representation will be gathered in multiplets q. = {iP a}, a = 1,' . . ,n on whi the action of the group is
(2. 12
It is usual to say that iPa transforms under the representation n a , or more direct that iPa is 'in the representation n a'. If two representations are related to each oth by a similarity transformation , i. e.
na = unau t ,
where U is unitary (U t = U  1 ) , then we say that they are equivalent. In other wor for all practical purposes, it is the same representation , such that the similarity tran formation does not provide any complementary information. If there is an invaria subspace in the representation , i. e. if the action of any element of the group on t vectors of this subspace leads to an element of the same subspace, then the represe tation is said to be reducible. In the opposite case, it is called irreducible. Only t irreducible representations are used to classify the particles in the representations the invariance groups. Moreover , if {na} forms a representation, then so does { _ (na)*}, which is cal the 'conjugate representation ' , [one can easily be convinced that these matrices a respect the algebra closure relation (2.119) since the structure constants of the gro are real] . If a representation and its conjugate are equivalent then we are deali with a socalled real representation. Otherwise, we say the representation is comple In order for a group to be able to potentially reproduce the internal symmetries particle physics, it is often required that it has some complex representations in whi fermions can be placed in a simple way. The representations that are the most relevant for quantum physics are those th preserve the norm defined by liP I2 == iP tiP,
of the multiplet iP . In order for the norm to be conserved during an arbitrary tran formation n a , we see that we need to have
(iP)tiP = iP'tiP' = (iP)t(TaYTa iP ,
for all transformations Ta. We should therefore require that (Ta)tTa = 1, i. e. th it is a unitary transformation. A representation satisfying this condition is a unita representation. Finally, the adjoint representation of a Lie group is formed using its structu constants. It is a representation of the same dimension as the order of the group , which the matrix elements of n a are
(2 .12
Internal symmetries
2.4.2.4
105
Caftan classification
Besides the spacetime symmetries (Poincare group) , the invariance groups that are useful from the point of view of the classification of particles are the compact ones, i.e. for which the parameters vary on a finite interval, as if we had to deal with generalized rotations in an internal space (and in many cases, it is actually precisely the required invariance). Moreover, we wish to classify the particles along 'fundament al' invariances, i.e. with no substructure (we will see later that only one coupling constant comes into play in these groups, which makes them even more interesting). This is why we are interested in simple groups: if H is a subgroup of 9 and if Vg E 9 and Vh E H , the element go h 0 g  l is still in H, then H is an invariant subgroup of g. 9 is simple if there is no invariant subgroup in g, and semisimple if it does not contain any Abelian invariant subgroup. One can show that a compact and semisimple group is thus either forced to be simple or is the direct product of two or more simple groups. Using this philosophy, studying simple Lie groups is therefore sufficient to classify all semisimple compact groups . Note also that the groups with structure S x [U(l)]n with S a semisimple group and n an arbitrary integer, which are called reducible groups, have representations that can all be decomposed as a product of irreducible representations; t his property makes them especially interesting for the classification of elementary particles. 8ince the simple Lie groups are also subj ect to the constraint (2.119), they can be classified in terms of their structure constants . This classification was performed by E. Cartan at the beginning of the twentieth century. He proved that the simple Lie groups could be classified into 4 large categories, namely 80(2n) , 80(2n + 1), 8U(n + I) , and 8p(2n), to which one should add the socalled exceptional groups , which can not be classified into any of these categories, called G 2 , F 4, E 6 , E7 and Es· For the four infinite sequences, n is the rank of the group, i. e. the maximal number of generators that can be simultaneously diagonalised. For the exceptional groups, the rank is given by the index in the notation. For each group , the order, i.e. the number of generators, is indicated in Table 2.2.
Table 2.2 Order (number of generators) or dimension of the simple Lie groups of rank n according to the Cartan classification (the index of the exceptional groups is the rank). Group SU(n + 1) SO(2n + 1) Sp(2n) SO(2n)
G2
F4 E6 E7 Es
N generators
(n (n (n (n
~ ~
~ ~
1) 2) 3) 4)
n(n+2) n(2n + 1) n(2n + 1) n(2n 1) 14 52 78 133 248
106
Overview of particle physics and the Standard Model
The groups of the four infinite sequences can be understood using their fundamenta representation, i.e. the one of smallest dimension (from which, by direct product, a the other representations can be constructed) . The set of all complex unitary matrice of dimension n x n forms the group U(n) ('U' for 'unitary'), the subgroup restricted t the unit determinant matrices is 8U(n) ('8' for 'special'). It is the set of transformation that leave invariant the inner product of two vectors of a complex space of dimensio n. The transformations of this group leave invariant, among other things , the norm o the complex vectors and can be used in quantum mechanics to assure the conservatio of probabilities. The sequence 80(n) is constructed from the matrices that leave invariant the inne product of two real vectors. They are orthogonal matrices (hence the '0') with uni determinant (special) of dimension n x n. Finally, the group 8p(2n) regroups all th matrices M of dimension 2n x 2n that leave invariant the antisymmetric matrix
o
1
 10
s=
o
1
10
o
1
 10
These matrices therefore satisfy MT S M = S, where MT is the transpose matrix of M Thus , the antisymmetric and quadratic form y T Sx for two real vectors of dimension 2 is conserved. The matrices M defined in this way are called the symplectic matrices thus the name of the group. The only transformations that allow us to conserve probabilities in quantum me chanics are those that are unitary. Can we hence deduce that only the groups of th type 8U (n) can playa role in physics, and that hence all theories should be based o such a group? Unfortunately, the answer to this question is negative: some representa tions of the groups classified by Cartan are unitary, as well as respecting the rules o their classification. We can therefore exclude a group as a possible internal invarianc only if we can show that it has no unitary representation. This is , for instance, th case of 80(11) , which is hence excluded as a grand unification group candidate. Thi is why it is indispensable to study in detail the representations of the possible Li groups.
2.5
Symmetry breaking
Not all the symmetries existing in Nature are manifest , and some underlying ones ar not immediately visible at low energy. This is , for instance, the case for the doublet of the type (e  , ve). If these particles really belong to the same representation of th gauge group of the weak interaction, why do they not have the same characteristic as well (mass , electric charge, for instance)? The answer to this question relies on th mechanism that enables us to break symmetries, and in particular the gauged ones.
Symmetry breaking
2.5.1
107
Gauge group
We are generally confronted with 'gauge' invariances, i.e. local invariances. This leads to the existence of a gauge boson, through which some interactions are performed: the presence of a gauged (local) symmetry in particle physics, as in gravity, implies the existence of an interaction.
2.5.1.1
Local symmetries
The simplest illustration of a local symmetry is given by electromagnetism. Its action is invariant under the transformation of the group U(l), which leads to the prediction of the existence of the photon, i. e. the term (1.94) of Chapter 1 in the total action. Consider a complex scalar field , cp(xl') E C, invariant under the phase transformations cp  7 cp' = e ic 4>C
(0l'cp)*ol'cp
7
(ol'cp')*ol'cp'
= =
[(01'  iC¢0l'a) cp*] [(01' + ic¢ol'a) cp] (2.122) (0l'cp)*ol'cp + ic¢JI'0l'a + c~lcpI20I'aol'a ,
where the conserved current JI' = i(cp*ol' cp  cpol'cp*) is given by (2.111). We can note here that this term, proportional to the conserved current, leads, after integration by parts, to a surface term in the action and thus has no physical effects. Despite this fact, the action is still not invariant since the last term is remaining.
2.5.1.2
Gauge boson
The form (2 .122) suggests the replacement of the partial derivative by a gauge covariant derivative by introducing a new vector field GI'. If we define (2.123) with 9 an arbitrary coupling constant, we find that this term transforms as
We can therefore simply impose GI' to transform as
GI'
7
G'I'
=
GI' 
~ol'a 9 ,
(2.124)
in order for the action to be again invariant , whatever the value of c¢. Since the action (l.94) with the Minkowski metric T) c
108
Overview of particle physics and the Standard Model
where GI'V = ol'CV  ovCI" and we have generalized the potential of the scalar field for it to be an a priori arbitrary function of the field norm I¢I == j[¢f2. The action (2.125) is the basis of the scalar electrodynamics, an example of gauge theory. We should note that (2.125) does not have any mass term for the vector field CI'. This comes from the fact that such a term, described by the Lagrangian of Proca (2.71) , is not invariant under the transformation (2.124). The action (2.125) may be generalized to the case where the gauge group is not Abelian, this is what is done in the context of the standard model of Section 2.6 Note already here that the gauge bosons introduced in the framework of nonAbelian theories can neither be massive, for the same reason that a mass term a la Proca is not gauge invariant.
2.5.1.3
Fermions
In a similar way, the action for the fermions (2.67) is made invariant under the local transformation 'Ij; > 'Ij;' = eicrcx'lj; by replacing the partial derivative by the gauge covariant derivative (2.126)
VIle can also see that an extra coupling term with the scalar field can be introduced. Finally, the action for the fermions, can be written in a general form as
(2.127)
with f an arbitrary constant, which, in addition of (2.125), allows us to describe a theory containing a fermion and a complex scalar field coupled via an intermediate vector. We can also see that the fermion can have a purely massive term in addition to its coupling with the scalar field , while respecting the invariance. 2.5.2
Higgs mechanism
The Higgs mechanism allows us to build a theory invariant under the transformations of a given group, while this invariance is not manifest. In other words, it allows us to impose an invariance with some properties to theories that do not seem to have it, as is, for instance, the case for theories having massive gauge bosons.
2.5.2.1
Higgs potential
We consider a complex scalar field ¢ and a gauge field AI', the dynamics of which is described by the action (2. 125). For technical reasons that we will not discuss here, (of renormalization, see, for instance, Ref. [7]), the potential V(I ¢I ) cannot introduce powers of I¢I higher t han 4. We generally choose a quadratic term, i.e. we postulate a mass for the scalar field, and a quartic term, in AI¢1 4 . In the case of the Higgs mechanism, the chosen potential is given by
VHiggs (¢) = A ( I¢I 2  Ti 2)2 ,
(2.128)
depicted in Fig. 2.2. The parameter Ti, having the dimension of a mass, is called a 'vacuum expectation value' (VEV).
Symmetry breaking
109
!Jle (1/1 / 17)
'i5'rn
(1/1 /17)
Fig. 2.2 The Higgs potential as a fun ction of the components (real and imaginary parts) of the Higgs field . The circle 11>12 = TJ2 indicates the set of possible minima.
For the theory to make sense, one should first define the vacuum. It can be obtained , at the classical level, by demanding the energy, i.e. the Hamiltonian, to be a minimum. A simple generalization of (2.88) gives in that case H(x , t)
= ~ [1>2 + (V¢) 2 + 2V(¢)]
,
(2.129)
which is the sum of two positivedefinite terms and of the potential. As a consequence, the minimal energy is reached when the field is invariant with respect to the transformation of space and time, so that 1> = V ¢ = O. Furthermore, the potential should also be minimized , which leads to
(2.130) As illustrated in Fig. 2.2 , one of the solutions, here ¢ = 0, represents a local maximum of the potential, thus it is an unstable state of the theory. The second solution, i¢i = T), on the other hand, is the absolute minimum of the potential. This is the value the field will hence naturally take and that describes the vacuum st ate of the theory (thus the name of t he parameter T)) .
2.5.2.2 Goldstone boson The expansion of the field ¢ around its VEV is
¢ = [T) +
~h(X, t)]
ei8(X,t )/V2rJ,
(2.131)
110
Overview of particle physics and the Standard Model
e
where h and are two real fields. Noting be written as ,
mh
==
2,j),'rJ , the potential for h and
e
(2 .
and is independent of e. This can easily be understood by examining the poten the fluctuations h and e around the expectation value 'rJ of the field correspon fluctuations either along the radial direction (h) , thus changing the energy of configuration , or along the circle of the minima (e) , which by definition do not any energy. The field h is called the Higgs field , whereas e is the Goldstone boson
2.5.2. 3
Masses of the bosons and fermions
The degree of freedom corresponding to the Goldstone boson is not physical. Actu it is always possible to choose a gauge in which it is suppressed , since the only p where it comes in is in the kinematic term of the field cp, which can be expanded
(2.
Choosing the gauge in which CI' > CI' + (0l' e) / (V2'rJ9 Cf ) , the field e comple disappears. After expanding the kinematic part (2.133) ofthe scalar field cp, one can see app ing a term similar t o Proca's one (2.70) for the vector field C w This implies that field has gained a mass during this process. After identification, we find Me = V2g The appear ance of a massive vector field can be explained by the loss of the G stone boson. Its corresp onding degree of freedom cannot have completely disappea and it has actually been completely absorbed ('eat en ', as it is sometimes writ by the gauge field. A massless vector field has indeed only two degrees of freed corresponding in t he classical theory to t he two t ransverse polarization modes ther circular or linear. On the other hand , a massive vector field evolves such its angular moment um has a vanishing projection , and therefore has three poss modes. The coupling of the scalar cp to the fermion in (2.127) leads to a modification of fermion 's mass, which is increased (or decreased , depending on the sign of the coup const ant f), by 6.mf = i'rJ. Thus, if the fermion was by symmetry initially mass as is the case in t he electroweak and strong st andard model, it will t hus acqui mass, in the same way as the gauge field. This is why fermions of extremely diffe masses can be placed in the same multiplet . We will see later , in the context of st andard model of the electroweak and strong interactions , how this process is app in a precise case, and is checked experimentally. 2.5.3
NonAbelian case
T he Higgs mechanism can easily be generalized to an arbitrary group Q, which coul nonAbelian , but t hat we suppose is simple. We now consider a field in a representa
Symmetry breaking
111
of dimension n , i.e. a column vector ¢ composed of n real scalar fields ¢a, with a = 1, ... , n. The VEV is now a vector
~~
(Oi¢ iO)
~
C) ·
(2.134)
with 10) the vacuum st at e of the scalar fields. The individual VEVs , 7]a = (OI ¢a IO), satisfy L a7]; = 7]2 . The potential of ¢ is always of the form (2.128) , with I¢I = L a¢; , so that once again we have oV(¢) I =0
O¢a
=TJ
'
and we define the quantum fields ¢a == ¢a  7]a. The mass term then t akes the general form 1 02V ' , Lm ass
=  "2 O¢aO¢b¢a ¢b·
If we perform a global infinitesimal transformation with paramet er a c tential transforms as V(¢) > V (¢) + SV(¢) , with
«
1, the po
OV .oV C b SV (¢) = o¢a S¢a = Z o¢a (ac R) a ¢b, where we assume that ¢ transforms under the given representation with the matrices
RC For the pot ential to be invariant under the previous transformation, we need OV (¢ ) = 0, implying that
oV (Rc)b A. = 0 O¢a a 'f' b . Differentiating with respect to ¢d for ¢ = ry (for which t he first partial derivative of the potential cancels), leads to
I
0 2V o¢ o¢ a
d
=TJ
(RCry)a = O.
There are now two possible ways to make these terms vanish , according to whether or not (R Cry) a vanishes. The set of the generators satisfying (RCry )a = 0 forms a subgroup 'H of 9 under which the vacuum is still invariant. Thus, the scheme of the symmetry breaking is 9 > H. The expansion (2.131) around the VEV can be generalized t o 7]1
+ hI (x) (2.1 35)
o
112
Overview of particle physics and the Standard Model
assuming that only the p first values of
so that
(2 .
o
which defines the p physical Higgs fields of the theory. Under a general transformation with parameters aa, the gauge bosons Cal" tr form as the generalization of (2.124) to the nonAbelian case, more precisely
(2.
where 9 gives the coupling of the group and the r bca its structure constants. We sh stress t hat there is only one coupling constant , valid for all the fields submitted to required invariance. This property is only valid for a simple group or a simple g risen to any power with a transverse symmetry, which allows us to identify the diffe groups. In t he case of the Higgs mechanism, we use this relation for the transformat t hat only bring into play the n  (p + 1) Goldstone fields (h, i = p + 1, . . . ,n ins of the general paramet ers aa' To first order in Bi , the relation (2 .137) gives
Substituting t his expression, together with (2.136) in the kinetic term of
2.6
The standard model SU(3)c x SU(2) L x U(l)y
The standard model of nongravitational interactions accounts for all the interact mentioned in Section 2.3 classified dep ending on the representations of the gr
The standard model SU(3)c X SU(2) L X U(1)y
113
SU(3)c for t he strong interactions, and SU(2)L x U(l)y for the electroweak ones. This last symmetry is broken by t he Higgs mechanism described in Section 2.5.2 along the following symmetrybreaking process SU(2)L x U(l)y
+
U(l) elec,
leaving the photon m assless, as observed. 2.6.1
The strong interaction (QeD)
The description of the strong interaction is based on an invariance under the transformations of the group SU(3). Since t his group has 8 generators, its joint represent ation, in which t he gauge bosons G~ are placed , is of dimension 8: a = I , . . ·8. T he leptons are not subj ect to these interactions , and t hus appear as singlets of SU(3), which do not couple to the gauge boson. The quarks exist in three possible states, and are thus placed in a threedimensional representation, denoted by 3. The quarks are therefore represented in column vectors with these three elements , Q ==
(ur) (dr) (sr) (Cr) (tr) (br) ug
'ub
,
dg
,
db
Sg,
cg
sb
cb
,
tg
,
tb
bg
.
(2.138)
bb
The index is called colour ,4 which has nothing to do with any actual coloured appearance. Inst ead , this terminology comes from an analogy: due to the fundamental properties of the strong interaction , a quark can never be observed alone but only in pairs of quark antiquarks, or in sets of t hree quarks. In the latter combinations, the total colour cancels, in the same way as it is possible to combine the three fundamental colours to give white, which can be seen as t he absence of any colour. This is why t his SU(3)c group is called the colour group. The covariant derivative of each quark Q, with respect to the strong inter action is given by (2.139) where t he Aa are 8 m atrices forming the algebra of SU(3). We often take the GellMann matrices (see, for instance, Refs. [7, 13] for more details) which satisfy the appropriate commutation relations of SU(3) , i.e. (2 .119) with the structure constants of SU(3) , denoted as r~. The coupling constant g3 of the group measures the intensity of the interaction. Experimentally, it has been measured that this coupling constant has a higher numerical value than those for the other interactions, t hus the name of strong interaction. By analogy with the colour states, this interaction is also called quantum chromo dynamics 'QCD'. The Lagrangian of QCD takes t he form .cQCD
=
L
Q'"YJ1. D (3 )Q 
~GaJ1.vGaJ1.v,
(2.140)
quarks
4The index tha t t a kes the va lue r, g and b represents the ' fund amental ' colours red, green and blue.
114
Overview of particle physics and the Standard Model
2.6.2
Electroweak interaction
The electroweak interaction combines the properties of the weak interaction , com from the invariance under transformations of the group SU(2)L' and of electrom netism , based on U(l)elec, brought together as a unique interaction. All particles combined in singlets or doublets of SU(2)L' 2.6.2.1
Quarks
The quarks of each generation are supposed to be equivalent , so that the doublets
(2.1 where each fermion appears as an eigenstate of chirality with the definition 1
+
,5)'I/J,
(2.1
'l/JR== (  2where the matrix
,5is defined as  10) ( 01'
One can explicitly check that (,5) 2 = 1, so that the operators P R,L == ~ (1 ± are proj ectors orthogonal to each other , i. e. they satisfy P ; ,L = PR,L and PRPL = Another useful property is to do with the anticommutation relations: anticommu with all , l' , or (2.1 {,5"I'} = 0,
,5
relation that will be very useful in the following sections. Finally, the quarks with right chirality are defined as singlets of SU(2)L:
2.6.2.2
Chirality
The proj ectors PR,L on the state of definite chirality 5 come into playas soon as we interested in the properties of the spinor with respect to the Lorentz transformatio In terms of the socalled chiral representation (2.65), t he projectors are found to t the following explicit form
1_,5 (10)
P L = 2 =
00
'
and
P = R
1+2,5 = (001' 0)
(2.1
In this form, the proj ection properties of these operators become transparent. N by the way that due to the fact that and anticommute (2.143), the joint spin 7jjL,R are defined using opposite proj ectors
,5
,0
5The op erator 'Y 5 is called chirality b ecause it indicates how its eigenstates transform unde parity transformation [6 , 9].
The standard model SU(3)c X SU(2)L X U(l)y
~L = ~ (I ~15 ) = ~PR
and
115
(2.145)
Thus, they satisfy (2.146) and and
(2 .147)
which confirms that a spinor existing only in one chirality state, cannot have any mass term in the Lagrangian. From the projectors (2.1 44) , one can deduce that the eigenstat es of the chirality are quadrispinors and where ~ and X are two bispinors representing respectively the left and right components of the total spinor. The Dirac equation (2.62) can then be written as two coupled equations, the Weyl equations, for the bispinors
(at + ( j . V ) ~ = imx, (at  ( j . V )X = im~ ,
(2.148) (2.149)
where the mass m should now be understood as a coupling t erm between the two components. For a massless spinor , both equations decouple and each bispinor , ~ and X, called Weyl spinors, evolve independently. We will see lat er that these kinds of spinors have an especially important role to play in the formalism of supersymmetry.
2.6.2.3
Leptons
For a long time, neutrinos were thought to be represented by some Weyl tensors. Experimentally, t hey are only observed in the left state V L , leading to the conclusion that they should be massless. Since they only have two degrees of freedom , in order to put them in similar doublets as in (2.141), it is only possible to pair them with the left component of the corresponding lepton, leaving the right component in a singlet of SU(2)L' thus the index 'L' of the group . The structure is thus
(:eLL) , R, (~LL ) , e
~R '
(V;LL) ,
TR
,
(2.150)
which gathers in a similar scheme, the electron, the muon and the tau with their associated neutrinos. The covariant derivatives are defined as
D12) (:e L
)
== (01" 
ig2 B il"ai )
L
(:e L
)
•
(2. 151 )
L
(Ji (i = 1,2, 3) are the Pauli matrices (2.66), generators of the group SU(2) since the satisfy the algebra
[a i ,a j
]
= 2i Ei\ a k .
(2 .152 )
Similar to the strong interaction, the vector bosons B il" are in the adjoint representation of SU(2). The coupling constant of the group is here called g2 . The coefficients
116
Overview of particle physics and the Standard Model
ij E k
take numerical values +1 (resp.  1) if (i,j,k) is an even (resp. odd) permutat of (1,2 , 3), and 0 otherwise (i .e. if two indices are the same) . The structure constant the group SU(2) are given by the rank 3 LeviCivita tensor 6 , and this group is loc equivalent to that of rotations. Moreover, a charge with respect to the phase transformations is associated to e particle. This 'hypercharge' , to distinguish it from the electric charge Q, is deno by Y and we have YR = 2 for the right leptons, YL = 1 for the left lepto Y(qJ = ~ for the quark doublets (2.141) , of left chirality, and for the right chiral Y(UR , cR , t R) =! and Y(dR , SR , bR ) = ~. The doublet structure of (2.141) and (2.150) reminds us of that of spin ~ w its components 11) and 11)· This is why the doublet classification of SU(2)L is a called weak isospin, denoted by T to avoid any confusion with the spin S. The val T3 = ±~ are respectively given to the top and bottom component of the doub Using this analogy and bearing in mind that the electric charges of the quarks Qu ,c,t = + ~ and Qd,s ,b = ~, the ones for the massive leptons Qe,/"T =  1 and t neutrinos have no electric charge, we obtain the GellMann Nishijima relation
(2.1 Note that for this relation to hold for all particles, we need to set T = 0 for singlets. We can then add a covariant derivative t erm U(l)y for each field ,
where gl is the coupling constant associated with the group U(l)y. 2.6.3
A complete model
Putting together all the previous building blocks, a complete Lagrangian can be writ down to describe the three nongravitational interactions and all their invariances
2.6.3.1
Kinetic terms
The kinetic terms of the standard model for the electroweak and strong interacti are thus given by
.c kin
=
 i;!ry/'
(a/' 
 4"1 (o/,Gv 
 ~ (O/,G av 
ig 3 G a /,)...a
ov G/,)

ig2Bi/,ai  ig1YG/,) 'ljJ
2
OvGa/,
+ ifbcaGb/,Gcv)
2
(2.154
6The LeviCivita tensor components El , ... ,n = + 1
€
in n dimensions is the completely antisymmetric rank n tensor w
The standard model SU(3)c X SU(2)L X U(l)y
117
where t he action of the covariant derivative depends on which particle it acts. For inst ance, for the lepton doublet of the first family, using the hyper charges obtained earlier , t his gives,
since the leptons are not subj ect to the strong interaction. 2.6.3. 2 W± and ZO bosons and photon ; Weak angle Equation (2. 155) can explicit ly be expanded in left and right components as .c~ in
=  iv eL, I'81've L

ie,l' (81'
+ 2ig2 sin Bw A I') e
 V2g2 (veL,I'W:e R + eR,I'W;veJ  L ' I'Zl' e L   g2 B  (VeL' I'ZI' VeL  COS 2B WVe cos w

2 SIn . 2 BWV  e R' I'Zl'e ) , R
(2.156) where t he weak a ngle Bw is defined by t he relations
g1

g2
=
tanB w ·
(2. 157)
As for the vector fields, they are defined by
(2.158) and
Bw  sinBw) (B31') ~ (B31' ) == ( CO~Bw sinBw ) (AZ~,.. ) == (C~S sm Bw cos Bw CI' CI'  smBw cos Bw
(AZ~ )
,
,.. (2. 159)
AI' being t he electromagnetic field. The Lagrangian (2.1 56) can be understood in the following way. First , we notice that the kinet ic term of the electron leads to a gauge coupling of the U( l ) type with the photon , since it can be written in the form
with
q = g2 sin Bw the electromagnetic coupling const ant and Q the electric charge operator , with value Qe =  1 for the electron. This coupling is the same fo r the left and right degrees of
118
Overview of particle physics and the Standard Model
freedom of the electron. In addition to the coupling terms, an intermediate neut boson ZJ1. has appeared , coupling both right and left components of the electron a different way, and also adding a selfcoupling term for the (lefthanded) neutri Finally, charged intermediate bosons make interactions between neutrinos a electrons possible (the charges of t he W± come from the fact that each term of final Lagrangian should preserve the electric charge, which between a neutral neutr and an electron is only possible wit h the indicated signs for the charges). For a quark doublet , one can perform the same expansion, which gives us electric charges of the quarks, but leads to 36 different interaction terms for e generation of quarks once the triplet of colour is made explicit .
W:
2.6.3.3
Symmetry breaking
The Lagrangian (2.154) counts almost all possible terms that satisfy the invariance t he standard model and bringing into play t he particles known experimentally. D to its invariances, and in particular due to the weak isospin, it is not possible to wr mass terms for the leptons. As for the quarks, all elements of the same doublet SU(2)L should have the same mass, like for instance both quarks u and d, which is contradiction with the measurements. Thus t he model can only be made compati with experiments if the SU(2)L symmetry is broken. For this , we introduce a complex field doublet
the dynamics of which is governed by the potential (2.128)
The minimum of this potential is 14>112 + 14>2 12 = ry2. Assuming that the Higgs field not subject to the strong interaction, the kinetic term becomes
(2. 1 which can be expanded as
1&4>11 2 + 1&4>21 2 + v2g2 (.JJ1. W+J1. + .JJ1.+ W  J1.) + (g2 B 3J1. + gl Yq, C J1.).Ji  (g2 B 3J1.  gl Yq, C J1.).Ji: + (g2B3J1. + glYq,CJ1.)214>112 + (g2B3J1.  gl Yq,CJ1.)2 14>21 2 + v2g2 (4)~4>2 + 4>14>;) [wJ1. (g2B3/l + glYq,C/l)  W+J1. (g2 B 3/l + 2g2 (14)11 2 + 14>212) W:WJ1..
ID
glYq, CJ1.)]
(2.16
In (2 .160) , we have introduced the Uy(l) hypercharge Yq, of the Higgs doublet; determine its value below.
The standard model SU(3)c X SU(2)L X U(1)y
119
The currents are defined by
and
JI' == i (cP~0l'cP2  cP20l'cP~) , with J + = (J )t = i (cPZ0l'cPl  (h0l'cPz) . We now work around the minimum of the potential, and perform a gauge transformation to suppress any of the three redundant degrees of freedom . In the unitary gauge, for which cPl = 0, and cP2 = 7) + hlJ2, i.e.
+ O
with
(~)
and
(2.162)
we are left with only the Higgs field, h, which is real. We then get
c :r =
 ~Ol'hOl'h  (7)2+ ~h2 + V27)h)
[(92B31'  glY<jJ CI')2
+ 29~W+I'W;],
(2.163) where we can recognize the intermediate vector boson ZI' provided that we fix the hypercharge of the Higgs scalar doublet to Yq, = 1. It is the only value that will preserve the electromagnetic invariance after the symmetry breaking, ensuring the photon defined by the relation (2.159) remains massless.
2.6.3.4
Masses of the intermedia te bosons
Equation (2.163) contains interaction terms between the Higgs boson, h, and the intermediate bosons, ZO and W±. The mass term of the latter can be extracted. Taking h = 0 in (2. 163) , we get
.c masses 

2g27) 2 2W+ W  1' I'

7) 2 cosg22 Bw
Z I' ZI' =

M2W W+WI' I'

21 M Z2ZI' ZI' ,
(2.164) so that we can make the identification
Mz =
and
..)2g27)
= Mw .
cosBw
cos Bw
(2.165)
This last relation allows us to derive the value of the weak angle Bw from the masses of the intermediate bosons. This has been done following their discovery at CERN in 1982 . Their actual masses are [11]
Mw = 80.419 ±0.056 GeV
and
Mz = 91.1882 ±0.0022 GeV,
which gives the weak mixing angle, namely sin 2 Bw ,...., 0.22 , that is approximatively ()w ~ 280. Note that no term of the type AI'AI' appears in the Lagrangian (2.164). The photon remains massless, in agreement with Maxwell 's electromagnetism.
120
Overview of particle physics an d the Sta ndard Model
2.6.3.5
Masses of the fermions
T he particles of right and left chirality belong to different representations of SUL (2 thus it is not possible to construct mass terms in a simple way just as for t he gau fields, one should impose that all the fermions are massless before the symmetry brea ing. Moreover , since the right neutrino should be absent from this theory, one shou postulate that the neutrino is massless, even after symmetry breaking. The most general coupling terms between the fermions and the Higgs field th satisfy the required invariances, are given by 7
(fleptonsfL . cpe R
+ f qT ii L
. ;PU R
+ fqlii L
.
cpd R
+ h. c. )
,
generations
(2.16 with ;;,. 'f'
. 2 A.*
=W
'f'
(
¢2 )
= ¢r '
where the matrix (j2 is defined by (2 .66), with hypercharge YJ) = Y¢ = 1 , an where the sum is over the three different generations , so that the fields e R , u R an dR represent, respectively, (e, j.L , T)R ' (u , c, t) R and (d , s, b)R' In (2.166), the numbe iteptons, fqT and fq l, called the Yukawa coefficients, not only represent the coupling the quarks and leptons with the Higgs fields, but, once the symmetry is broken , als give the values of the m asses of the different fermions. One can see that the interactio of a given particle with the Higgs field is all the more important that it is massive. 2.6.3.6
Electromagnetic symmetry
Working in the unitary gauge defined by (2.162), we see that the vacuum configuratio CPo is identically annihilated by the electric charge operator Q given by the GellMann Nishijima relation (2.153). Indeed, with
T3 =
~2(j 3 = ~2
(1 0) 01
'
and Y¢ = 1, we have
The annihilation of the vacuum state by the generator of electromagnetism impli that this state satisfies (2.167
and hence it is invariant under a transformation U(I)elec. Thus, the symmetry breakin is not total and the electromagnetic invariance is recovered, as required for the theor to be compatible with experiments .
70 ne can explicitly check that with the hypercharges given previously for all particles, these term are the only possible ones t h at are invariants under the tran sformation of U(1)y.
Discrete invariances
121
Note that even if here the relation (2.167) has been explicitly obtained in a unitary gauge, from the invariance of the initial gauge of the theory we can conclude , independently of the gauge , that this result will apply for a ny choice of gauge, i.e. that for any choice of the configuration of the Higgs doublet, as long as it is at the minimum of the potential.
2.7
Discrete invariances
In addition to the continuous symmetries, usually gauged ones, there exist discrete symmetries such as parity, charge conjugation and timereversal invariance. The first one appears to be maximally violated by the weak inter actions, and the last one weakly, for instance , in the system J(o  Ko · In what follows, we will limit ourselves to the study of the scalar field, all results being generalizable to the fermionic and vector sectors [8]. For a classical field, a symmetry operation can be obtained in the following way. For a given transformation x f7 f( x , a), where a represents a set of parameters defining the transformation , the field ¢( x) transforms as ¢'(X')
= A(a)¢(x) ,
(2. 168)
where A is a matrix if the scalar field ¢ has several components. A state I'lj;) in the Fock space, corresponding to the field operator ¢, transforms linearly i'lj;') = U(a)I'lj;), where ut = u  1, enssuring that ('lj;I'lj;) = ('lj;' i'lj;'). We require that the matrix elements of the field operator satisfy the relation (2.168) ,
between two arbitrary states I'lj;l) and 1'lj;2) and the transformed states I'lj;D and which leads to ut(a)¢(x')U(a) = A(a)¢,
14)~) ,
i.e. to where the dummy variable x' has been replaced by x and expressed using the inverse transformation law. We can now see how these general transformations can be applied to different particular cases, useful in particle physics. 2.7.1
Parity
Let us assume that we perform a reflection transformation on the spatial coordinates
{
X
f7
x' = x,
t
f7
t'
=
t.
(2.169)
If the physics remains unchanged under t his transformation (as is the case, for instance, for the Klein Gordon equation t hat is of second order in the spatial derivatives) , t hen a classical field can only be modified by at most a phase factor
122
Overview of particle physics and the Standard Model
¢'(x') = sp¢(x), with Isp 12
=
1. For the field operator this translates into
p  l ¢(X , t)p =
sp¢(  x , t),
(2.1
where P is a unitary operator representing the parity. If ¢ E C, then the variation can be performed via a phase, i. e. sp = e iQ , a E However, for a real field (¢ E ~), the corresponding operator should be Hermitian, ¢t = ¢, and hence
pt ¢t (x , t)
(Pl) t =
p  l ¢ (X , t) p = s;¢(  x , t) ,
s;
from which we deduce that = sp. This, together with the relation Isp 12 = 1, imp t hat sp = ± 1. In this special case, the particle associated with the field will be s to be scalar if sp = + 1 and pseudoscalar for Sp = 1 , i.e. when it has a negat intrinsic parity. The action of the parity operator on the Hilbert space can be obtained by expand the field in the basis (2 .106) (we consider from now on a complex field) and rewrit t he equation. We find
J
d3k [Pa pt ei(kxwktl (21T)3 / 2V2wk k =
s p
J
3
d k
(21T)3 / 2V2Wk
+ Pb t pt e i(k'X  wktl] k
[a ei(k,x +Wktl k
+ bt ei(k'X wktl] k
,
(2.1
where the sign of x in the planewave exponentials of the righthand side have b inverted, in accordance with (2.170). Since t his last operation is equivalent, in exponentials , to the one that changes the sign of k , it gives the transformation la of the creation and annihilation operators
Pakpt = spa_ k and
Pblpt
= Spb~k'
(2.1
T hus, the parity operator reverses the sign of the momenta, which had to be expec since t hey are 3vectors. For the case of a free complex scalar field, i.e. massive but with no interact with any other field , the Hamiltonian, given by the relation (2.108), transforms un parity as
(2. 1
since the integral is p erformed on all values of the vector k and Isp 12 = 1 whicheve the parity of the states we consider . Similarly, the field momentum, equal to
(2.1 transforms as
pppl =
_P,
which corresponds to the transformation law of a vector. At this point, it is interest to note from (2.173) that the parity operator commutes with the Hamiltonian, and th
Discrete invariances
123
the parity of a physical state is a constant of motion for the free complex scalar field. Some interactions can modify this conclusion: this is the case of the weak interactions. A parity can also be attributed to the vacuum, and we choose PIO) = 10) (the vacuum must remain identical to itself if the coordinates are inverted). Note that we can explicitly construct an operator P satisfying (2.172) for which this relation is verified, but it is also possible to give a negative intrinsic parity to the vacuum: this only complicates things a little and simply inverts all the fields parities. In other words , this is only a matter of convention. 2.7.2
Charge conjugation
Another discrete transformation as simple as parity can also be performed. This is charge conjugation C, which consists in replacing each particle in a given physical state by its antiparticle. In terms of operators, this amounts to transforming (p into (Pt and vice versa. We can therefore write (2 .175)
and
the second relation of (2. 175) , being simply the Hermitian conjugate of the first one. Just as with parity, in (2.175) , t he eigenvalue of Sc has a unit modulus since transforming twice in a row all the particles and their antiparticles is in the end equivalent to no transformation at all. This is equivalent to saying that C2 = 1. Furthermore, note that C should be unitary, so that finally, we have = C 1 = C. In terms of the annihilation operators, we find
ct
(2.176) and the Hermitian conjugate relations for the creation operators . Assuming, as before, that the vacuum is an eigenstat e of the charge conjugation, with eigenvalue + 1, i.e. CIO) = 10), we can then determine the charge C of any physical state created with a determined number of particles. Here again, we find t hat for a free scalar field , the charge conjugation is a constant of motion, leaving the Hamiltonian (as well as the momentum) unchanged. However, the conserved charge Q of (2.11 3) changes sign, as one could expect,
CQC 1 =  Q, so that an eigenstate of the conserved charge Qcannot simultaneously be an eigenstate of the charge conjugation. The time conservation of the latter is thus only useful for physical states with a global vanishing charge. = 0, and the ones that violate parity conserMost of the interactions satisfy
[iI,C]
vation, such as t he weak interactions, more often than not, leave the combination CP invariant . In the section dedicated to baryogenesis in Chapter 9, we will discuss the cases where even this constraint is not valid. However, in all field theories analogous to the one presented here, a discrete symmetry is satisfied, it is the combination CPT, where the last symmetry, the t imereversal invariance, requires some technical details that are given in what follows .
124
2.7.3
Overview of particle physics and the Standard Model
Time reversal and CPT theorem
The last discrete invariance, valid for instance for a free scalar field, is that related t time reversal T: {
X
f+
x' = x ,
t
f+
t' =  to
(2 .177
Unlike the two previous symmetries, which are unitary operators, this symmetry ca be implemented in the Fock sp~ce of states only in the form of an antiunitary operator An operator T is antiunitary if it is antilinear, i.e. such that
(2.178
and if it satisfies TTt = TtT = l. Thanks to this last operator , one can prove [14] the following very general theorem If a local quantum field theory can be represented by a Hermitian Lagrangian .c(XD that is invariant under proper Lorentz transformations (i.e. without the space an time inversions) and if its field operators satisfy the spinstatistics theorem,9 then w have
(2.179
Due to this relation (2.179), the action integral S = J d 4 x..c is invariant under th transformation CPT. Thus, the equations of motion, which are obtained by varyin the action, and the canonical commutation relations, are also unchanged under th transformation. We can also deduce that the Hamiltonian of the given theory mu commute with CPT,
[CPT, iT] = 0,
and hence any realistic theory, i.e. satisfying the hypothesis of this theorem, has th invariance. This result is often written in the symbolic form 'CPT = 1' . A nontrivial consequence of this theorem is the following: for any theory satisfyin it , i.e. for any realistic theory, the particles and antiparticles should have exactly th same masses and the same lifetimes. This consequence is verified with a very hig precision since, for example, for the Ko/ Ko system, we find [ll]iMKo  MRoi/MKo 10 18 . For most of the other particles, we find typical values of order of ib.M i/M 10 5 , and constraints on the differences of lifetimes of order of ib.Ti/T < 10 3 .
8This happens b ecause of technical reasons related to the fact that the timereversal operator mu also reverse the roles of the initial and final configurations [8] . 9The spinstatistics theorem , itself also very general in field theory, specifies that the particl associated to integerspin fields (for instance bosons, scalars and vectors) have a statistics describe by the Bose Einstein distribution function , whereas the ones corresponding to halfinteger fiel (fermions, Dirac spinor, for instance) , are d escr ibed by Fermi Dirac statistics. This theorem on relies on microcausality. It is discussed in detail in Ref. [15] .
Discrete invariances
125
The standard model of particle physics presented in this chapter has so far never displayed any flaws from the point of view of experimental physics. However no one sincerely thinks that it describes all interactions between the elementary particles in a final way. There are many reasons for this, which are as much experimental as theoretical. From an experimenta l point of view, the recent discovery of the existence of a mass for the neutrinos and the discussion concerning the chirality (Section 2.6.2) indicate that they cannot uniquely exist in the left chirality as attributed by the standard model. There must therefore exist a right neutrino , which has different properties from the left one since it would have otherwise already been detected experimentally. They therefore need to have a very large mass, in order to explain why no accelerator has already produced it. At the experimental level, there is still an unknown . Since the discovery of t he 'top' quark , all particles of the standard model are identified except one , which is maybe the most important since the principle of symmetry breaking relies on it, namely the Higgs boson . As long as it has not been produced directly in an accelerator , it will remain hypothetical, and the model will not have been fully verified. Now from a theoretical point of view, t he situation is even worse. First, this model, although it unifies in a quantum framework the three nongravitational interactions (electromagnetism , weak and strong) , makes no mention of the last interaction, gravity, despite t he fact t hat one of its pillars, the Higgs mechanism, is supposed to be at the origin of the mass of elementary particles. Interestingly, these masses are also the gravitational charges of these particles . Moreover, t he standard model relies on three different symmetry groups, wit h t hree coupling constants having no links between them, which is not justified by any fundamental principle. Finally, and this may be the most serious, there are 19 free parameters in the theory, which can only be measured. T his proliferation of parameters is often considered as t he sign t hat it is not a fundamental theory but actually a lowenergy effective theory. We will see later that in many extensions, the number of free parameters is seriously reduced, and that it is in principle possible to calculate t he 19 arbitrary coefficients that appear in t he general Lagrangian in terms of a much sm aller set of parameters.
References
[1] L. D. LANDAU and E. LIFSCHITZ, Mechanics, Pergamon Press, 1976. [2] N. D ERUELLE and J. P. UZAN , Mecanique et gravitation newtoniennes, Vuibe 2006 . [3] C. COHENTANNOUDJI , B. DIU and F. LALOE, Quantum mechanics, John Wil 1992. [4] P. R. HOLLAND, The quantum theory of motion, Cambridge University Pre 1993. [5] J. SCHWINGER, Quantum mechanics, Springer, 200l. [6] F. HALZEN and A. D. MARTIN, Quarks & leptons, John Wiley and Sons, 198 [7] M. A . PESKIN and D . V. SCHROEDER, An introduction to quantum field theo AddisonWesley, 1995. [8] W . GREINER and J. REINHARDT , Field quantization, SpringerVerlag, 1996. [9] L. H. RYDER, Quantum field theory, Cambridge University Press, 1987. [10] J. 1. Kapusta, Finitetemperature field theory, Cambridge Monographs on Ma ematical Physics, 1994 [11] W. M. YAO et al., 'The review of particle physics', Journal of Physics, G 33 2006; see also the website http ://pdg .lbl. gov / that gives the same informati Updated almost every two years to take into account new experiments and th results. [12] E. SEGRE, From XRays to quarks: modern physicists and their discoveries, Fr man , 1980. [13] H. GEORGI, Lie Algebras in particle physics, Westview Press, 1999. [14] W. PAULI, in Niels Bohr and the development of physics, Pergamon Press, 19 G. LUDERS, Ann . Phys. (NY) 2 , 1, 1957. [15] B. DIU et ai. , Physique statistique, Hermann, 1996.
Part II The modern standard cosmological model
3 The homogeneous Universe The first cosmological solution of Einstein's equation was given by Einstein himself in 1917. At the expense of introducing a cosmological constant, he was able to construct a closed and static space. The general cosmological solutions were discovered independently by Alexandre Friedmann and Georges Lemaitre in 1922 and 1927. This chapter details t he derivation of these solutions from Einstein's equations and studies their kinematics and dynamics before defining the various observational quantities that will be used in this book. It ends on some considerations on lesssymmetric spacetimes. For complementary developments , see Refs . [17].
3.1 3.1.1
The cosmological solution of FriedmannLemaitre Constructing a Universe model
Within the framework of general relativity, one can try to answer the question of which solution of Einstein's equation describes the Universe we observe, or more modestly, which solution is an idealized (but good) model of our Universe. To carry out this program, we need enough input from astrophysical and cosmological observations. This program is indeed difficult to tackle and its solution will, in particular, depend on the amount of available data. But , as we noted in the introduction, even with ideal data, there are limitations to the answers one can give to this problem. These limitations arise mainly because we observe only a finite part of one Universe (it is a unique object and we cannot discuss its probable nature by comparing it to similar objects) from a given spacetime position (and we are unable to choose this point) . Astrophysical data are mainly localized on our past light cone and around our past worldline for geophysical data. They are thus localized on (just a portion of) a 3dimensional hyp ersurface. First, it may be that various spacetimes are compatible with the same data and second, the interpretation of these data is not independent of the spacetime structure. Thus , we must distinguish between the observable Universe (for which , by definition , we have data) and the Universe, which includes regions we cannot directly influence or experiment. The inference of the geometrical properties of the Universe from the observable Universe cannot b e achieved without hypothesis and philosophical prejudices. The standard cosmological model, in its simplest form, relies on four central hypotheses. HI Gravitation is well described by general relativity. Indeed , as discussed in Chapter 1, general relativity is well t ested on small scales (e.g. , Solar System scales),
130
The homogeneous Universe
but we cannot exclude that it is not t he case on large scales or at early times (se e.g., Chapter 12 for examples of theories that dynamically become more and mo similar to general relat ivity during t he evolut ion of the Universe). In particula t he E instein equivalence principle implies that the laws of physics can be extrap lated at all times (see Chapter 1). We will thus keep applying, as long as possibl the standard laws of physics, in particular for nongravitational interactions. H2 Hypothesis on the nature of matter. We do not examine the spacetime itself or i matter distribution. Rather , we observe particular objects (e.g. , galaxies, ... ) t h are supposed to be representative of t he total matter content . The influence of th variations of the properties of individual objects on the inference of the matte content of the Universe has to be quantified. On large scales, we will assume t h matter is described by a mixture of a pressureless fluid and radiation, allowin also for a cosmological constant. In particular , we assume that there is no matt outside the standard model of particle physics described in Chapter 2. The perfe fluid hypothesis will be shown to derive from the hypothesis H3. H3 Uniformity principles. The previous hypotheses are not sufficient to solve th E instein equations. We must assume some symmetries for t he spacetime. Th hypothesis and their consequences are discussed in Section 3. 1.2 H4 Global struct ure. Hypotheses H1 H3 enable us, as we shall see, to solve Einstein equation and to determine t he lo cal structure of the Universe model. Howeve there may exist various manifolds with identical local geometry and differe global topology, which correspond to the choice of various boundary condition For a globally hyperbolic spacetime, the choice of the topology reduces to th choice of the topology of the spatial sections.
As we shall see, t hese hypotheses allow us to construct very successful Univer models but there are reasons to challenge everyone of t hese four assumptions. Th will be discussed mainly in Chapter 12. 3.1.2
Cosmological and Copernican principles
Galaxies seem to be spread isotropically around us, just as the cosmic microwav background radiation. This indicates that the spacetime describing the observab Universe should have a spherical symmetry around us. T here are two possible expl nations: either our Galaxy lies in a special place in the Universe so t hat it is no homogeneous and must have a centre. Or the spacetime is homogeneous; every poin is then similar and the Universe is isotropic around each point (see F ig. 3.1 ). To construct cosmological models, we can distinguish two uniformity principle T he cosmological principle supposes t hat t he Universe is spatially isotropic and homo geneous. In particular, this implies that any observer sees an isotropic Universe aroun him. We can distinguish it from t he Copernican principle that merely states t hat w do not lie in a special place (the centre) of the Universe. This principle, together wit the isotropic hypothesis, actually implies t he cosmological principle. The cosmological principle makes definite predictions about all unobservable r gions beyond the observable Universe. It completely determines t he entire structu of the Universe, even for regions that cannot be observed. From this point of view this hypothesis, which cannot be tested, is very strong. T he Copernican principle ha
The cosmological solution of FriedmannLemaltre
{:] p
{:] Q
131
.......... .. .... ... ... .. .. .. ...... ........ .... ··· .. . .. •.. . .... ··.. . .. U . ....... . . . . . . .. . . .. .... ....... .. ........ {:] p
• • Q.
•
Fig. 3.1 A point distribution , statistically isotropic around every point (left) and around a unique point (P) (right). In the second version, P and Q are not equivalent. The cosmological principle excludes such kinds of solutions, which would assume that we lie in a sp ecial place in the Universe . From Ref. [l J of the introduction.
more modest consequences and leads to the same conclusions but only for the observable Universe where isotropy has been verified. It does not make any prediction on the structure of the Universe for unobserved regions (in particular, space could be homogeneous and nOn isotropic On scales larger than the observable Universe). Some cosmological spaces that do not satisfy this cosmological principle are described in Section 3.6. From a pragmatic point of view, these uniformity principles should be considered in a statistical sense, i. e. for a Universe smoothed On dist ances greater than its larger structures. Thus, they implicitly involve a smoothing scale and a smoothing procedure to which we shall return. They lead to the Friedmann Lemaitre solut ions that are at the basis of modern cosmology. Recently, a test of the Copernican principle was proposed [8J. 3.1.3
Cosmological principle and metric
Homogeneity and isotropy are two distinct notions that we should define more precisely. The homogeneity of space means t hat at every moment, each point of space is similar to any other One. More precisely, a spacetime is (spatially) homogeneous if there exists a oneparameter family of spacelike hypersurfaces, I: t , foliating the spacetime such that for any t and for any pair of points (P, Q) of I: t , t here exists an isometry of the spacet ime metric taking Pinto Q (see Fig. 3.2). Isotropy means that at every point the Universe can be seen as isotropic. More precisely, a spacetime is (spatially) isotropic at each point if there exists a congruence of timelike worldlines with tangent vector uJ.L (called isotropic observers) such that at any point P and for any pairs of unit spatial tangent vectors ei, e~ E Vp (see Chapter 1), there is an isometry of the metric that leaves P and uJ.L at P fixed and that rotates the vectors ei into e~. Thus, it is impossible to construct a preferred tangent vector perpendicular to uJ.L.
132
The homogeneous Universe
/ /
~
... Q
p. III
..
..
·x7 ./
p
Fig. 3.2 (left): A homogeneous spacetime can be foliated by a family of homogeneous spaces (right) : for an isotropic space, there is an isometry that rotates any unit spatial tangent vecto ei to any other e~, while the vector u lL remains invariant . From Ref. [9J.
We recagnize that for a hamageneaus and isatropic spacetime, the surfaces of ho mogeneity, ~t, must be orthogonal to the tangent vector uj.t of the worldline of the isotropic observers , that are thus called fundamental observers. (Proof If it were no the case; then, assuming that the homogeneous space and the cangruence of isotropi observers are unique, the failure of the tangent subspace perpendicular to uj.t to coin cide with ~t wauld enable us ta construct a geometrically privileged spatial direction This is in cantradiction with isotropy.) The metric gj.tv induces a spatial metric :rj.tv(t) on each hypersurface ~t · As w have just seen, there must exist an isametry of :rj.tv bringing any point of ~t into another point of ~t and it must be impossible to construct a preferred vector from :r,w. As one can show, this implies that the Riemann tensor canstructed from :rj.tv fo the threedimensianal spaces ~t must be ..of the farm I [2, 9, 10]
where K has , because ..of homogeneity, the same value at any paint ..of ~t, and can ..onl be a functian ..of t, K(t) . A space satisfying this property is called a maximally sym metric space (as we shall see it has the maximum number ..of Killing vectars allawed by its dimensian) and is a space of constant curvature. Two spaces ..of canst ant curvatur with the same dimensian, signature and the same value ..of K are isametric. Thus, it) sufficient ta enumerate the threedimensianal spaces ~orresJ::>0nding ta all values ..of K We classify these spaces according ta the sign of K . If K is pasitive, ~t is a three dimensional sphere ..of radius R(t) , with embedding equatian x 2 + y2 + z2 + w 2 = R2(t in a faurdimensianal Euclidean space. In spherical caordinates,
1 As shown in Ref. [9], this conclusion can be drawn from the isotropy a rgument. Consider (3) R ILv cx { as the components of a linear map, L, of the vector space of 2forms into itself. Since L is symmetric it can b e diagonalized. If its eigenvalues were not equal, then one could construct a preferred 2form and then a preferred direction. Thus , isotropy implies that L is proportional to the identity and thu (3) R cx{3 = 2K b cx b{3 by symmetry. ILl.'
[IL
vi
The cosmological solution of FriedmannLemai'tre
x
=
RcosX,
y
=
Rsin xcos B,
z
=
R sin x sinB cos
w
133
Rsin xsinBsin
=
t he E uclidean metric ds 2 = dx 2 + dy2 + dz 2 + dw 2 induces the threedimensional metric on I;t If K
= 0, I;t is a threedimensional Euclidean space with metric
If K is negative, I;t is a t hreedimensional hyperboloid , with embedding equation _w 2 + x 2 + y2 + Z2 = _ R 2(t). In spherical coordinates ,
x
=
R cosh X,
y
=
R sinh xcos B,
z
=
R sinh xsinB cos
w
=
R sinh xsinB sin
the Minkowski metric ds 2 =  dw 2 + dx 2 + dy2 + d z 2 induces the threedimensional metric on I; t dS(3) = R 2(t ) [dX2 + sinh2 X (dB 2 + sin 2 Bd
3.1. 3.1
Fourdimensional m etric
The worldlines of the fundamental observers are orthogonal to the hypersurfaces I;t. The coordinates syst em can be transported from one hypersurface to another by means of this congruence of worldlines. The general form of the spacetime metric is imposed by these symmetry considerations 91.1.1/ =  uJ.L UV + ~J.Lv(t), where, for each value of t , ~J.LV is the metric of a threedimensional sphere, Euclidean space, or hyperboloid. It follows that the lineelement of the cosmological space t akes the form ds 2 =  (U J.L dxJ.L)2 +~J.L v( t)dxJ.Ldxv . Setting uJ.LdxJ.L == dt , t he coordinat e t is t he proper time measured by the fundamental observers (i. e. co moving wit h uJ.L) and we get the general form of the spacet ime metric as (3.1 )
t being the cosmic time, a(t ) t he scale factor and
lij the metric of constant t ime hypersurfaces, I;t , in comoving coordinat es. 2 Latin indices i, j , . .. run from 1 t o 3. The conformal t ime 7) , defined by
2This implies t hat ;Yij (xk, t ) = a 2 'Yij (x k ), so that the sp atia l met r ic satisfies (3) R ijkl
and K(t )
= K/a 2 (t).
= 2K 'Yk(i'Yj]I ,
134
The homogeneous Universe
(3.
dt = a(t)d7] , can also be introduced to recast the metric (3.1) in the form
(3. T he spatial metric, in comoving spherical coordinates, takes the general for m
(3.
where X is a radial coordinate and dn 2 = dB 2 + sin 2 Bd'P2 is the infinitesimal so angle, 'P runs from 0 to 27r and B from  7r to 7r. T he quantity runs from 0 to when K > 0, and from 0 to +00 otherwise. The function f K depends on the curvatu K (which now is a pure number) as
vTKTx
K  l / 2sin
fK( X) =
(VRx)
K
>
0
X
K =O { ( K) 1/ 2 sinh(v' Kx) K < O.
(3.
It relates the surface S(X) of a comoving sphere to its radius X by S(X) = 47rf'k( that reduces to t he E uclidean expression for distances small compared to t he curvatu (i.e. for X « II Using the radial coordinate r = fK( X), (3 .1) can also take t for m
vTKT)·
(3.
This metric, or the equivalent forms (3 .1) or (3.3) , is called the Friedmann Lemait metric. T his construction shows how the cosmological principle, and thus the symmet assumptions that follow, has allowed us to reduce the ten arbitrary functions of t spacetime metric to a single function of one variable, a(t ), and a pure number , K.
3.1 .3.2
Geom etrical quantities
The expression for the Christoffel symbols can be obtained , using the definitions Chapter 1 and the metric (3.1). Their only nonvanishing components are f
o ij
= H a2'Yij,
fjo. = HOj,.
i
_
f jk 
(3)
rijk ,
(3 .
where the (3)fjk are the Christoffel symbols of the spatial metric 'Yij . The quanti H = ala is t he Hubble parameter , and a dot represents the derivative wit h respect the cosmic time t. The explicit form of (3)fjk is not needed to compute the component of the Ric t ensor because the spatial metric 'Yij satisfies (3) R ij kl = K hik'Yjl  'Yinjk) , so th
The cosmological solution of FriedmannLemai'tre
135
(3) Rij = 2K lij and (3) R = 6K. It follows that the only nonvanishing components of the Ricci tensor are 3
Roo =
a a
3~,
R ij
=
(
2H
2
K) a
+ ~ii + 2 a 2
2
(3.8)
lij,
from which we deduce the expression for the scalar curvature
R=6 ( H 2
K) ,
ii a
(3.9)
+~+2
a
and for the nonvanishing components of the Einstein tensor
Goo 3.1.3.3
=
3 ( H2
+
~) ,
G ij
=
(
H
2
+ 2~ii +
K) a
a2
2
lij.
(3.10)
Killing vectors
As explained in Chapter I, to each symmetry of a spacetime is associated a Killing vector field,~ , satisfying (1.58). We have seen that for a spacetime of dimension n , the maximal number of independent Killing vectors, and thus of symmetries, is n(n + 1)/2. For the Friedmann Lemaitre spacetimes, the spatial sections are maximally symmetric so that we expect to obtain 6 Killing vectors. They can be explicitly determined by considering the Weinberg coordinates
in terms of which the spatial metric, and its inverse , take the form
(3.11) with r2 == 6mn xmxn. The Killing equation, \7 I'~V + \7 V~I' = 0, gives for the OOcomponent, ~o = 0, so that ~o is a function of the spatial coordinates only, ~o = F(xk). Then, the Oi and ij components take the form ~i = gki Dk~O and likDj~k + IjkDie + 2H lij~O = 0, where Di is the covariant derivative associated to lij. They can be combined to get
Only iI a 2 may possibly depend on time in this equation. If this is the case, iI a 2 is not a pure number , then this implies that F = O. We thus have six solutions to the Killing equations: 3 As
fbpf?j
an example, consider the computation of R ij . It can b e decomposed as Rij = oor?j  OJ
~
r3jrf", + (3) R ij , where
r?o +
all terms involving no p, = 0 component have been gathered to
identify the Ricci tensor of the spatial metric. We then use that
(3) Ri j =
2K "Iij = 2(K / a 2 )gij.
136
The homogeneous Universe
• three Killing vectors associat ed to spatial translations
r = 1, . . . , 3,
(3.12)
• three Killing vectors associated to spatial rotations
r , S = 1 ... , 3.
(3.13)
When Ha 2 does not depend on time, Ha 2 = K, as we shall see from the Friedman equations. This occurs for the Minkowski, de Sitter and antide Sitter spaces. It follow that there are four additional solutions associat ed to time translation and Loren boost s. They are , respectively, given by
(3 .14 and
L /L[rJ ..
L o[rJ _
for K = 0, r . X ,
(3.15 for K = ±1 ,
in t erms of the conformal time TJ. Because of the expansion of the Universe , these four vectors are no longer Killin vectors of the Friedmann Lemaitre spacetime. They now satisfy the equation V /L ~lI V lI~/L = (V (X~(X)g/L lI /2 and are called conformal Killing vectors. The symmetries in volving t ime (i.e. time translation and Lorentz boosts) are lost due to the fact tha the cosmological spacetimes are not static but they are still conformal to a stat spacetime, a property related to the existence of these four conformal Killing vector 3.1.4
Kinematics
Without knowing the evolution law for the scale factor, a(t) , a few general consequence can be obtained simply from the form of the FriedmannLemait re metric. 3.1.4.1
Fundam en tal observers
Notice first of all that in the metric (3.1), the coordinates xi are comoving. Th worldline of a fund amental observer with traj ect ory orthogonal to the hypersurface I;t is given by x O = t and X i =constant (one can check that this is indeed a timelik geodesic). As a consequence , this observer has a fourvelocity
(3 .16 As seen earlier , this metric can be reexpressed in the form
(3 .17
The cosmological solution of Friedmann Lemaitre
137
where 9 /Lv = g/LV + u/LU V is the projector t ensor introduced in Section l.2.5. The expansion of the Universe thus allows a comoving observer to naturally decompose any vector p/L into a timelike and spacelike component as (3 .18) In particular, all tools introduced in sections l.2.5 and l. 3.2 of the first chapter can be used. 3.1 .4.2 Hubble law Let us now consider two comoving observers with traj ectories given by x = x = X2. In the physical space, their separation is given by r12
Xl
and
= a(t)(xl  X2)'
It follows that
(3 .19) The farther the observers are from each other , the greater is their relative velocity. T his is what is usually called the Hubble law. Notice t hat H depends a priori on time and can only be approximat ed as a constant for obj ect s whose distance is small in comparison to the Hubble radius. Note that this law is a special case of the generalized Hubble law (l. 119) with e = 3H and (J/LV = 0, as imposed by homogeneity and isotropy. 3. 1.4.3
Recession velocity and proper velocity
Since the physical distance r is related to t he co moving distance x wit h the relation r = a(t)x, we deduce that
r = Hr + a(t)x ==
V rec
+ vpro p er '
(3.20)
The recession velocity, V rec , is a t imedependent quantity and represents t he component of the velocity relat ed to the Hubble flow. As for v proper, it is the proper velocity of the object with respect to the cosmological reference frame privileged by the expansion. 3.1.4.4
Properties of ligh t
In the geometrical optics limit , it was shown that the traj ectory of a light ray, X"' (A) , is given by a null geodesic with wavevector k'" = dx'" / dA satisfying (3 .21 ) Using the general decomposition (3. 18), we obtain
where E =  k/Lu/L represents the energy of a photon measured by the comoving observer and p/L its momentum (p/Lu/L = 0). It follows that
138
Th e homogeneous Universe _E2
+ a2iij p i p j = 0,
which is simply the generalization of the wellknown relation between the energy of obj ect , its momentum and its mass. Moreover, the projection along ul' of the geode equation gives . Oij . 2 i' EE + f ijP P = EE + H a iij P rY = 0,
so that t/E =  H and hence E = ko la , ko being a constant. The energy, a hence t he frequency 1/ oc E , of any photon varies as the inverse of t he scale fact Its wavelength varies as the scale factor. In an expanding space, wavelengths are th redshifted achromatically. T he redshift defined by the general equation (l.122) is t herefore given by 1+ z
=
(ul'kl')emission
(ul' kl' )observation
a( t o b servation) a( temission) .
(3. 22
The redshift can be measured from the comparison of an observed spectrum to laboratory spectrum , from which one can deduce how much the Universe has expand since it was emitted. We stress at this point that most observations give an angu position on the celestial sphere and a redshift. The radial distance cannot be measur directly but is obtained from the redshift and requires the knowledge of the expansi law of the Universe, that is a(t). 3.1.4.5
Beh aviour of a m assive test particle
Consider now the trajectory of a massive t est particle with 4velocity vI' . It satisfie
(3.2
can be split as vI' = aul' + pI', where ul' is t he 4velocity of t he fundamen observers and pI' satisfies ul'pl' = O. It follows from ul'ul' = vl'vl' =  1 t hat pl'pl' p2 =  1 +a 2 . The geodesic equation fo r vI' then leads to aa = (8/3)11'''PI'P'' = H so that aa + H(  1 + ( 2 ) = 0 from which we obtain that  1 + a 2 oc a 2 . We conclude that p2 oc a  2 so that a + 1 and vI' + ul'. The proper velocity any massive t est part icle is damped as a I and its worldline tends to adjust to t Hubble flow , that is to the worldlines of the fundamental observers. vI'
3.1.5
Spacetime dynamics
Up to now we have described the implications of the Robertson Walker geometry. investigate the dyna mics of these spacetimes, we must obtain the equations of moti deriving from the E instein equation. When these equations are fulfilled , we refer these solutions as Friedmann Lemaitre spacet imes. 3.1. 5.1
Energymoment um tensor
The energymomentum tensor, TI'''' is a symmetric tensor of rank 2. The possi forms this tensor can take are reduced by t he spacetime symmetries. Indeed , t most general form compatible with symmetries is
The cosmological solution of FriedmannLemaitre
139
B represents the only spatial component (for the fundamental observers); it is therefore identified with the pressure that is thus seen to reduce to a unique number for a homogeneous and isotropic space. A = TjJ,vu jJ,u v == p is the energy density measured by the fundamental observers. In conclusion, the most general form of the energymomentum tensor is that of a perfect fluid (3.24) The energy density, p, and the pressure, P , only depend on time. This form is completely fixed by the cosmological principle and is not an independent choice. Thus, the only remaining freedom is the choice of the equation of state, that is the relation between the pressure and the energy density.
3.1.5.2 Friedm ann and conservation equations Using the expressions for the E instein t ensor and the form (3.24) of the energymomentum tensor, we can easily deduce that the two Einstein equations reduce to two independent equations
(3.25)
(3 .26 ) recalling [see (1.130)] that r;, == 87rG N • As for the matter conservation equation (\7 jJ,TjJ,v = 0) , it reduces to a single equation
I p + 3H (p + P) = o. I
(3.27)
We can convince ourselves that t he three equations (3.25) , (3.26) and (3 .27) are not independent [indeed, by differentiating (3.25) with respect to time and expressing (i / a using (3 .26) and (3 .25) to eliminate H2 , we find (3.27)]. This is a consequence of the Bianchi identities.
3.1.5.3
Covariant approach
In the covariant approach, a Robertson Walker space is defined by the conditions V/Lf = 0 for any scalar function f , which implies that ilP = 0, and CJ jJ,V = wjJ, = 0, so that \7 jJ,u v = 8 9jJ,v/3 . Space being isotropic , we can set S = a, and thus H = 8/3 and the Raychaudhuri equation (1.73) gives (i
1 r;, ujJ,u v =  (p 2 jJ,V 2
3 =  R a
+ 3P ) + A'
140
The homogeneous Universe
and the generalized Friedmann equation (1. 78) gives 4 ~ 6K 2 (3) R =  2 = 2KP + 2A  _ 8 2 = 2KP + 2A  6H2. a 3
We recover the Friedmann equations, but the regime of validity of the Raychaudh and generalized Friedmann equations is larger since they also apply to nonisotro Universes , such as Bianchi Universes (see below).
3.1 .5.4
Equations in con form al time
It will sometimes be convenient to work in t erms of the conformal time. Rememberi that dt = ad7] , we find that the derivative of any quantity X with respect to 7], X', related to that with respect to t by XI = aX. It will be convenient to remember th
X"  HX I = a 2 X.
XI=aX ,
al / a has b een introduced, which satisfies
The comoving Hubble constant H particular
a" aii =   H 2
H = aH,
a
'
a2if = HI  H2.
We deduce that the Friedmann and conservation equations [(3.25) to (3.27)] take t form
(3.2
I
H =
K
2
"6 a
(p
A
2
+ 3P) +"3a ,
I pI + 3H (p + P) = o. I
(3.2
(3.3
Notice also that if the scale factor is a power law in cosmic time , then
a(t) ex t n as long as n
3.1.5.5
i=
{==}
a(7]) ex 7] ''.'.n,
(3.3
1.
Eq uation of state
We hence have two independent equations for three unknowns (the scale factor , t energy density and the pressure). We therefore need some additional information solve this system. For that, the most convenient way is to give an equation of sta for the matter in the form P=wp. (3.3
For instance, pressureless matter will be described by w = 0 whereas for radiation , will have w = 1/ 3. For the cosmological constant, A, which corresponds to a consta
4Note t h at here (3) R refers to the Gaussian curvature of the hypersurfaces L;t with metric ~I This explains why (3) R = 6K Ia 2 .
The cosmological solution of FriedmannLemai'tre
141
energy density, (3 .27) implies P =  p and thus w = l. The resolution of (3.27) implies that for any constant w, p ex
a 3 (l+w).
(3.33)
Comparing this general result with equations (3.25) and (3.26), we notice that the curvature term can be assimilated with a fluid with equation of state w = 1/3. By differentiating w with respect to time and expressing pi with the conservation equation (3.30) we get Wi = 3H(1 + w)(c;  w), (3.34) where
Cs
is the speed of sound, defined by
pi
cS2 =pi For a barotropic fluid w
=
(3.35)
c;, so that w is a constant.
3.1.5.6 Reduced form It is useful to rewrite the Friedmann equations in a dimensionless form in terms of reduced quantities. For that, we introduce the energy density parameters
(3.36) respectively, for the matter , the cosmological constant and the curvature. The matter term can be decomposed as a sum of components with different equations of state as o = I:x Dx with
(3.37) and we will denote by D xo its value today. The first Friedmann equation (3.25) then takes the form of a constraint
(3.38)
Using the solutions of the conservation equation, it follows that
_
Dx  D xo for a constant equation of state 5 as 5 For
a
general
Oxo (Hal H)2 exp [ 3
timedependent
(~) 3(1+WXO) ao
Wx
= Wxo, so that
equ ation
J:o (1 + wx)d In a].
(Ho)2 H '
of state,
E(a) == it
can
H/H
o
be
can be expressed
checked
that
Ox
142
The homogeneous Universe
(3.3
°
Quantities with an index are evaluat ed today. Thus, to will be the dynamical age the Universe and ao = aCto) . Notice t hat E only depends on x = a/ao = 1/( 1 + Depending on the case, it will be denoted by E(a), E(x) or E( z). For the matter , t he following different components can be distinguished Tot al matter
0
Different equations of state
Om ~
Db
Different species
Dc
where the total matter component is decomposed into matter (w = 0) and radiati (w = 1/3) and then, respectively, into baryons (b), cold dark matter (c) , photons h and neutrinos (v). Of course , other species can be added, whether as a species with new equation of state or as a new component.
3.1 .5.7
Units and conventions
Two conventions are often used for the choice of dimensions. The decomposition any physical length as the product of t he scale factor and a co moving length , C(phys) a(t) C(com) can indeed lead to an ambiguity since we can arbitrarily choose either t scale factor or t he co moving length to have dimension of length. We will use bo conventions according to the situation at hand.
 The scale factor can be chosen to have dimension of length , and X to be dime sionless. This implies that K can be normalized to K = 0, ± l. The value of can t hen be determined by evaluating the Friedmann equations today.
I [aJ ~ [, and [xl
LX
=x,
K = 0, ± 1,
1
ao = Ho
1
J 11 Dol
.
For a Universe such t hat Do = 1 (i. e. K = 0) , ao is arbitrary, which reflects t invariance of the Euclidean space under dilatation.  The scale factor can be chosen to be dimensionless. X will then have dimensi of length , and thus K the dimension of inverse length squared. It is t hen oft convenient to set ao = 1.
[aJ = 1 and [xJ
=
L: K = (0, ±1)/ R~,
We recover that for Do = 1 (i.e. K charact eristic curvature scale.
1
Rc =  aoHo 0) , Rc
= 00 .
1
Jl1  Do l
.
A Euclidean space has
I 1, the conservation equation implies that p o( a the Friedmann equation (3.25) implies H2 x o3(1+ru). It follows that a By integrating this relation, we obtain, As seen earlier, if uL
so o<
2
(3.40)
a(l) x ,3{1+u) From this expression, the conformal time can be computed by integrating
giving 4
o li#b if w I 1/3. o(r1)
that
aQ+3u)/2.
dl :
dt I
a(t)
,
We cleduce that
21 x q' ts;
if
al
(3 41)
;.
and a(4) o( exp ? otherwise.
3.2.1.2
K:0andu:\
If the matter density is dominated by a cosmological constant, we have H2 x p x so that a x a. By integrating this relation, we get a(t) x eHt
a0,
(3.42)
This space has an accelerated expansion and is called de Sitter space. In terms of the conformal time, the scale factor is of the form
aQi)xh,
4<0.
This space can also be written in such a way that Chapter 8). ds2
3.2.1.3
:
dt2
. *+f!
it
(3.43)
has spherical spatiai sections (see
(dx, +sin21d02)
.
(3 44)
K:IIandu+Il3
It is more convenient to work in conformal time to obtain
{"0i,t(:d}.
Using p
:
Qr_ Jlr: , {)sl, (g) ll \osl
in units such that l1{l
n,(t) in the parametric form
po(alas)3(1+u) and (3.25), we get
: 1 This equation
t't''
_*.
can be integrated to give
144
The homogeneous Universe
(aoa) = (11 0 00
)1 / 20: {(Sinh o.'TJ)1 / 0: if 1 (sin o.'TJ) 1/ 0: if 0
K =  1, K = + 1,
(3.4
with 20. = 1 +3w. This solution is valid for all constant w of.  1/3. The case w =  1 corresponds to a Universe with no matter since matter can then be absorbed into curvature term. 3.2.1.4
K = 0, w = 0 and A of. 0
Let us mention the solution for a spatially Euclidean model dominated by a pressurel fluid and a cosmological constant [11]
a(t)
=
(1 ) OAO 
1
1/ 3
sinh
2 3 /
(3Tt) '
(3.4
where we a = HOVOAO. This solution is relevant for describing the late time evolut of a flat ACDM model. 3.2.1.5
Other solutions
We present in Fig. 3.3 the profile of the scalefactor evolution in a large number of si ations, including for different kinds of matter, curvature and value of the cosmologi constant. We may notice the presence of periods of slower evolution (a 'hesitating' phase) various lengths depending on the values of the cosmological constant (see, e.g., ca F, L or K). During these phases, the Universe gets closer to an Einstein static Unive (i.e. a = 0) for which
Thus, this is a spherical Universe for which the gravitational attraction is just co pensated by the repulsive effect of the cosmological constant. The behaviour of th solutions depends mainly on the relative values of the cosmological constant and A Another interesting solution is the one obtained for an empty Universe with hyp bolic spatial sections (K =  1) . This space, called the Milne space, has t he follow metric (3.4
It can be shown that this metric corresponds to a Minkowski metric by an appropri change of coordinates (T = t cosh X, R = t sinh X leads to the Minkowski metric spherical coordinates) , but this space only covers a quarter of the Minkowski spa time.
3.2.2
Dynamical evolution
In order to study the general dynamics of the solutions of t he Friedmann equatio it is interesting to rewrite the system (3.25) and (3.26) in the form of a dynami system for t he quantities 0 , OA and OK (see Refs . [12 14]).
Dynamics of FriedmannLemaitre spacetimes 3.0
0
<
*·····
c\\\~..
\,>o\}~.. ····
2.0
145
LK
~............
H
1.0
OJ
..................... ...G
0.0
B
A
D
oj
 1.0 0.0
1.0
2.0
3.0
QmO
3.0
3.0
a
a ao
3.0
2.0 a
ao 1.0
0.0 3 2 1
0
1
2
1.0
0.0  3 2 1
3
Hot
2.0
0
1
2
0.0 3 2 1
3
Hot
3.0
a
a
ao
A =0
2.0
2
3
a
\
......
1.0
1
1
2
3
1
2
3
3.0
ao 1.0
0
Hot
3.0 Hesi tating ~ (Spherical:
0
2.0
0.0 3 2 1
Hot
0
1
Hot
2
2.0
Hyperbolic
ao 1.0
...:
3
0.0  3 2 1
0
Hot
Fig. 3.3 Depending on the values of the cosmological parameters (upper panel) , the scale factor can have very different evolutions.
3.2.2.1 Using
Dynamical system
if =
ii/a  H2 and expressing ii/a with (3.26) and H2 with (3.25) leads to H
H2 = (1 + q) .
(3.48)
q is the deceleration parameter defined by
(3.49) with 'Y == w + 1. The system of Friedmann equations can thus be rewritten using the new time variable p == In( a/ ao). The derivative, XI, with respect to p of any quantity X is then given by XI = X/H.
146
The homogeneous Universe
The evolution equation for the Hubble parameter (3.48) then becomes
+ q)H.
H' = (1
(3.5
n
If we differentiate n, n A and K with respect to p and use (3.50) to express H noticing that a' = a and that (3.30) can be used to obtain pi =  3,p, we then obta the system
n ' = (2q + 2  3, )n , n~
(3.5
= 2(1 + q)nA '
n~ = 2qn K
(3.5
(3.5
.
It is useless to keep all of these equations since (i) H does not enter in (3.51) (3.5 can be deduced algebraically using (3.38). and (ii)
n
The dynamics of Friedmann Lemaitre spacetimes can thus be studied by considering the dynamical system
(3.54) where q is a function of
nA
and
nK
given by (3.49).
This system associates a unique tangent vector to each point so that only o integral curve passes through each point of the phase space where the tangent vect is defined. If two trajectories crossed each other, the tangent vector at this point wou not be unique. Points where the tangent vector is not defined are called fixed point 3.2.2.2 Determination of the fixed points The fixed points are defined by the conditions solutions of (1 + q)nA = 0,
n~
= 0 and
n~
o and
are thu
(3.5
They represent the equilibrium positions (stable or not). This equation has thr solutions (nK,n A) E {(O,O),(O, l) ,(l, O)} , (3.5 and each of these solutions represent a Universe with different characteristics.
 (n K , n A)
= (0, 0): the Einstein de Si tter space (EdS). This is a spacetime wi no curvature (Euclidean spatial sections) and no cosmological constant.  (nK' n A ) = (0, 1): the de Sitter space (dS). This is a space with no matter b with a cosmological constant and the spatial sections have a positive curvatur This Universe is in eternal exponential expansion.  (nK , n A) = (1 , 0): the Milne space (M). This is an empty space with no cosm logical constant but with hyperbolic spatial sections (K < 0).
Dynamics of FriedmannLemaitre spacetimes
147
3.2.2.3 Stability of the fixed points To determine if these fixed points are attractors (A), saddle points (S), or repellers (R), we should study the evolution of a small perturbation around each equilibrium position. For this we set
== 0,K +WK, == 0,1\ + W1\,
OK 01\
(3.57) (3.58)
with (0,1\, 0,K) the coordinates of a fixed point and (W1\' WK) a small deviation. Rewriting the system (3.54) in the form
(3.59) where FK and F1\ are two functions determined by the system (3 .54) that vanish at the fixed point , it can be expanded to linear order as
~j ' WK
_ p_ _
W

(!tA,!tK)
~WKj W
with
p(IT IT ) A,
K
==
(;~: ~~:) aF1\
(3.60)
aF1\
aO K a01\
(ITA ,IT K
)
The stability of a fixed point depends on the sign of the two eigenvalues (A1,2) of the matrix P(ITA,IT K ) ' If both eigenvalues are positive (resp., negative), the fixed point is unstable (resp ., stable). If the two eigenvalues have different sign, it is a saddle point. The eigenvector U.\ 1,2 then gives the stable and unstable directions .  EdS: we have PEdS
= (
3"1  2 0 ) 0 3"1 '
with eigenvalues 3"1  2 and 3"1 EdS is thus an attractor for "1 Ej 00, 0[, a saddle point for "1 Ej O, 2/3[ and a repeller for "1 Ej2/3, +00[. The associated eigendirections are U(3,.  2) = (1,0).  dS: we have PdS
=
(2 =~'Y ~'Y
) ,
with eigenvalues  2 and  3"1 If "1 Ej  00, O[ then dS is a saddle point and for "1 EjO, +00[, it is an attractor. The two eigendirections are given by U(2)
= (1 , 1).
 M: we have PM
_ (2  3"1  3"1 ) 
0
2
'
with eigenvalues 2 and 2  3"1 If "1 Ej  00,2/3[ then M is a repeller and if "1 Ej2/3, +00[, it is a saddle point. The two associated eigenvectors are U (2)
= (1 ,  1).
148
The homogeneous Universe
These results are summarised in Table 3.1 and in Figs. 3.4 to 3.6. Table 3.1 Stability of the fixed points as a function of the value of the equation of state, , = w+ 1, (A: attractor, R: repeller, and S: saddle point). ] 
EdS dS M
00,
o[
°
A
N.A.
S S
A R
]0,2/3[
2/3
]2/3, +oo[
S A R
NA A R
R
A S
0.5
E
C
0.5
 0.5
 1~
 0.5
__~~~__~__~__~~__ J 1.5 o
0 0.5
0
0.5
1.5
QA
Fig. 3.4 Dynamical evolution in the plane (r:lK , r:l A) (left) and (r:lm , r:lA) (right) for, = (w = 0). From Ref. [14].
3.2.3
Expansion and contraction
When the total energy density vanishes, the expansion rate of the Universe can chan sign. For a Universe with nonflat spatial sections and with matter, radiation and cosmological constant , there are four possibilities (see Fig. 3.3) . The expansion lasts eternally starting from the Big Bang. This is the case if K or  1 for a Universe without cosmological constant.
=
The expansion is followed by a contracting phase leading to a big crunch. This the case for a Universe with K = + 1 with no cosmological constant.
The expansion lasts eternally but was preceded by a contracting phase; the must therefore have been a bounce. This situation can occur when the Univer has spherical spatial sections (K = + 1).  The Universe undergoes a series of contraction and expansion phases; this is oscillating Universe .
149
Dynamics of FriedmannLemaitre spacetimes 1
1.4 Qm
0.8
1.2
1
0.6
" 0.8
cf°.4
cf
0.6 0.2 0.4 0  0. 2  0. 2
0.2 O ~
0
0.2
0 .4
0 .6
0.8
1
__
~~~~~L_ _~w~~
0.4 0.2
0
QA
0.2 0.4 QA
0. 6
0.8
Fig. 3.5 Dynamical evolution in the plane (O K , Ot>.) (left) and (Om , Ot>.) (right) for 'Y From Ref. [14].
1
=
1/3 .
E
c:
 0.2
0
0.2 0.4 QA
0.6
0.8
Fig. 3.6 Dynamical evolution in the plane (OK , Ot>. ) (left) and (Om , Ot>. ) (right) for 'Y =  l. From Ref. [14].
The separation between a Universe with infinite expansion and a Universe with a big crunch is obtained when
2
nAO = 1  nmo + :3 cos [arct an (S;;;~)  7r] )
(3.61 )
with SmO = JI2n mo  11. The separation between a Universe with infinite expansion and a bouncing Universe is obtained when
150
The homogeneous Universe
nmO = 0, or 1
_
o ::; nmo ::; 2'
nAO  1  nmO
3 [ + 2(n 1110 )1 /3 (1
 n mo
+ '::'~mO )1 /3
+ (1  nmo  :=:mo) 1/3] , nAO = 1  n mo
3.3
2
+ 3" cos [arctan (:=:;;;6) + 7r].
(3.6
Time and distances
The characteristic distance and time scales of any Friedmann Lemaitre Universe a fixed by the value of the Hubble constant. We define t he Hubble time and distance 1
c
tH = H'
DH = H'
(3.6
The sphere with radius equal to the Hubble radius is called the Hubble sphere. T order of magnit ude of these quantities is obtained by expressing the current value t he Hubble parameter in the following way
(3.6 with h typically of the order of 0.7. We will come back later to the measurement of t he parameters introduced in t his chapter. We obtain t hat
DHo = 9.26 h 1 x 10 25 m ~ 3000 h  1 Mpc ,
tHo = 9.78 h  1 x 10 9 years.
(3.6
All the quantities that will be introduced in this section (distance and time) will expressed in terms of E( z ) defined in (3.39). As can be seen in Fig. 3.7, this functi has some degeneracies that change with t he redshift. By combining observations different redshifts, one can lift these degeneracies and thus use them jointly to bett constrain the cosmological parameters.
3.3.1
Age of the Universe and lookback time
From the definition of the Hubble parameter, it follows that dt = da /a H . Thus, t expression (3 .39) for the Hubble constant gives
da dt = tHo aE(a)'
The age of the Universe is thus obtained by integrating this equality between a = and a = ao, or equivalently between z = 0 and z = 00,
r dx tHo J xE(x)
roo
1
to
=
o
=
tHo Jo
dz
(1
+ z)E(z)'
(3.6
This function is represented for various sets of cosmological parameters in Fig. 3.8.
Time and distances 0.9
0.')<=== == ==:.
0.8
o.
0.7
151
o
0
a<
c:< 0.6 0.5 0.4 0.3 0.1
0.2
0.3
0.4
0.5
QmO
Fig. 3.7 Isocont ours of the funct ion E( z ) in terms of the cosmological parameter (Dmo , DAO) for a J\CDM modeL T he degeneracies at z = 1 (left), z = 3 (middle) and z = 10 (right) are not the same. The code of grey shade is arbitrary.
The analogous expression in conformal time is 1
'T}o
=
tHo
10 x2~x) ·
(3.67)
In a similar way, the age of the Universe at t he t ime when a photon with redshift
z. was emitted is defined by
roo
t( z.) = tHo Jz. (1
dz
+ z)E(z) ·
(3.68)
The lookback time is the difference between the age of the Universe and its age when the photon was emitted by the source. The previous relations lead to
llt( z*) = to  t( z*) = tHo
3.3.2
Jro ' (1 + dz z)E(z )·
(3 .69)
Comoving radial distance
The co moving radial distance X of an object of redshift z t hat is observed by an observer located at X = 0, is obtained by integrating along a radial null geodesic. The form of the metric (3.4) gives dX = dt /a, so that dz
H( z) . It follows that
aoX( z*)
=
DHo
Jr' o
dz
E( z )·
(3.70)
The dependance on ao simply reflects the arbitrary choice of the system of units. This function is represented for various sets of cosmological parameters in Fig. 3.9.
152
The homogeneous Universe
9
11
10
8 9
1.0
1.5
2.0
2.5
QmO
Fig. 3.8 Isocont ours of the functi on to as a fun ction of the cosmological parameters. curve indicat es t he value of to in units of 10 9 years. The grey zone represents the zo paramet ers for which there is no Big Bang.
For a fiat Universe with no cosmological const ant, we find, for instance,
(
For a Universe with a nonvanishing curvature, we can give the expression for radial dist ance in units of the curvature radius
( 3.3.3
3. 3.3.1
Angular distances
Comoving angular diam eter
The co moving angular diameter relates the comoving transverse size of an obj ect the solid angle under which it is observed . It is defined by the relation
between the solid angle and the apparent surface of an obj ect. Remembering t h comoving sphere centred a t X = 0 and with comoving radius X has a comoving su s (co m ) = 47f jk (X ), we deduce that
Rang (z) = jf( [X( z)] .
(
Time and distances
153
................__... _.. ... .... .
..:.: :.: ..
:":;'
3
5
4
Redshift z
Redshift z
Fig. 3.9 (left): Age and lookback time in units of the Hubble time, t( z )j tHo and 6t( z )j tHo as a function of z for three cosmological models defined by (Dmo, DAO ) = (1 , 0) , (0.05 , 0) and (0.2 , O.S), respectively, as plain , d otted and d ashed lines. (right) : Comoving radial distance in uni ts of t he Hubble length , aox(z ) j DHo, as a fun ction of z for the same cosmological models.
Notice that t his function is not necessarily increasing wit h z . Indeed , if K = +1 , then two objects with the same comoving size, located at X and 7r  X will have the same comoving angular diameter.
3.3.3.2 Angular dis tance The angular dist ance is the not ion that generalizes the concept of parallax. It is defined , for a geodesic bundle converging at t he point where the observer is, as the ratio between the transverse physical size of an object and the solid angle under which it is observed . The physical area of the source is related to the co moving one by dS p hys = a 2 dS com . With this defini t ion phy s com dSso 2 urce 2 dssource D A= 2 = a 2 dD o b s dD obs This quantity will also be denoted by T o . Using the definition of the co moving angular diameter , int roduced in the previous section , we obtain
aoRang(z ) l+z
(3.74)
3.3.3.3 R eciprocity theorem In the same way, we can define the angular distance from the source. It is defined using a geodesic bundle diverging from the source as the ratio between the physical transverse size of t he observer and the solid angle under which it would be observed from the source, phys _ .2 d n 2 dS o bs

rs
~ [,s o urc e'
154
The homogeneous Universe
It can be shown that if the tra jectories of the photons are null geodesics and if t equation for the geodesic deviation is valid, then there exists a relation between T s a To
(3.7
called the r eciprocity theorem (see Ref. [15] for a demonstration) . This relation cann be directly checked since Ts cannot be observed directly. 3 .3 .4
Luminosity distance
3.3.4.1
Defini tion
The luminosity distance relates the luminosity of a source, locat ed at a comoving rad dist ance X from the observer , to the observed flux as A.
_
'Po bs 
Lsource (X) 47rD[
(3.7
To relat e the luminosity of the source and the observed luminosity, notice that t luminosity of t he source is given by the emitted energy per unit of emission tim L source = !:::"Eem/ !:::"t em . Due to the expansion of the Universe, !:::" Eobs = !:::" Eem a/ and !:::"to b s = !:::"temaO /a (because !:::,.'rJobs = !:::"'rJem for two null geodesics) so that L obs L so u rce (l + z )2 The flux received by the observer at X = 0 is given by
using
S (phys) = a6 s (com).
Identifying this with the definition (3.76), it follows t hat
I DL( z ) =
ao(1 + z )fJ( [X (z)].
I
(3.7
The luminosity of the source is indeed unknown. However , by comparing some sour to a reference source for which the redshift z. and the distance are known, we obta tha t
By extracting a family of astrophysical obj ects, called s tandard candles, wit h t he sa luminosity, the ratio of their fluxes can be used to measure the luminosity dist ance To finish , let us not e that deriving (3.77) allows us to extract the Hubble rat e a function of redshift as 1
H( z )
[
1 + DJ(oHg
(~) 2]1/2~ (~) . 1+ z
dz
1+ z
Time and distances
,
" ...   
~ ~
155
15
... :.. . .......... .. ......... .
" ..... .
,''' " f
...
I,:
I :'
"
I:' I:'
2 3 Redshift z
4
5 Redshiftz
Fig. 3.10 (left): The angular distance D A (z) / DHo as a function of the redshift z for three cosmological models, defined by (n mo , n AO ) = (1,0), (0.05, 0) and (0.2,0.8), respectively, as plain, dotted and dashed lines. (right): The luminosity distance Ddz) / DHo as a function of the redshift z for the three same cosmological models.
3.3.4.2 Distances duality relation If the number of photons is conserved, which is usually the case if the Maxwell equations are valid, and if the absorption from the medium in which they propagate is small, then (for a proof, see Ref. [15]) DL
= rs(1 + z),
so that (3 .75) implies a duality relation between the angular and luminosity distances (3.78) Unlike (3 .75) , this relation can be tested experimentally [16]. 3.3.4.3
Distance modulus
The luminosity distance is usually expressed in terms of the distance modulus, m  M, defined such that it vanishes if the source is located conventionally 6 at 10 pc, m  M
=  2.51og [¢(z)/¢(10pc)] .
(3.79)
Using (3.76) and noticing that at 10 pc, the luminosity distance is also 10 pc, we obtain m  M = 5 1og[DL(z)/1O pc]. Using the expression 10 pc = DHo/3 X 10 8 h , it can be deduced that m  M
= 25 + 5 log [3000 DL ]  5 log h. DHo
(3.80)
6Note that the 2.5 factor is a rbitrary a nd was chosen to match the magnitudes of star s initia lly defined by Hipparcos .
156
The homogeneous Universe
m is the apparent magnitude and the absolute magnitude M is defined by comparis to the Solar luminosity M =  2.5 log ( : 8 )
(3.8
+ 4.76
where L 8 ~ 3.8 x 10 26 W is the Sun luminosity and Mo = 4.76 its magnitude. Wi such a definition , the brighter an object is, the weaker is its magnitude.
3.3.4.4
Magnitudes
The luminosities considered previously were the luminosities integrated over the enti electromagnetic spectrum, defining the bolometric magnitude. It is difficult to measu these bolometric magnitudes and, in general, some filters are used, selecting a speci frequency band. The magnitude Mx in the band X is then defined by
Mx = 2.5 10g ( Lx ) L8X
+ Mo x.
(3 .8
An example of different frequency bands used in astronomy is given in the table belo Band
Central
Width (run)
MO
66
5.61
wavelength (nrn)
365
U (u ltraviolet)
B (blue)
445
94
5.48
V (visible)
551
88
4.64
R (red)
658
138
4.42
I (infrared)
806
149
4.08
J
1200
213
3.64
K
2190
390
3.28
00
4.76
Bolometric
The flux detected in the frequency band 6.vx was emitted in the frequency ban (1 + z )6.vx . The flux received per frequency unit is thus A.
()  (1+ )L v [v(l+z)] z 47rDl '
'l'X v
(3.8
where Lv is the source luminosity per frequency band. It follows that
¢x (v)
=
(1 +D~) 47r L
r
Lv[v(l
+ z)]dv.
(3.8
} tlvx
The correction induced by the deformation of the spectrum does not appear when t source is at 10 pc so that the distance modulus in the band X defined by mx  Mx  2.5 log[¢x/¢x(10 pc)], is given by
mx  Mx = 510g ( DL ) 10pc
+ K.
(3.8
Time and distances
157
The Kcorrection represents the distortion of the spectrum due to the cosmological expansion K
= 2.51og(1 + z)  2.51og {
Lvx Lv[v(l
J
+ Z)]dV}
[]
D.vx Lv v dv
.
(3.86)
Finally, note that other effects of astrophysical origin, such as the absorption by the interstellar or intergalactic medium, can affect the apparent magnitude. We shall discuss them in the next chapter. 3.3.5
Volume and number counts
The geometric properties of spacetime also playa role in the determination of the distribution of a family of objects (for instance, galaxies) in a given redshift band. Let us consider such a family with proper number density n(z) and proper radius r*. First, we can determine the probability for a line of sight to intersect such a galaxy with redshift lying between z and z+dz. This probability is proportional to the surface area of the galaxy, to the density, and to the distance light propagated in this redshift interval,
(3.87) The probability to intersect an object with redshift smaller than z , called the optical depth, is then given by
1 Z
T(Z)
=
2 7rr*DHo
o
n(z) (
dz) (). l+zEz
In particular, for galaxies diluted by the cosmic expansion, n(z) Universe with 0 = 1 and OA = 0 we find
(3.88)
= no(1 + z)3, in a
(3.89) If no rv 0.02h 3 Mpc~3 and r* rv 10h ~ 1 kpc, T rv 4% at z = 1 and 100% at z rv 20. The proper volume contained in a solid angle d0 2 and between the redshifts z and z + dz is the product of the area D~.d02 and the length D Ho dzj(l + z)E(z), so that
~  D d0 2 dz 
Ho
(1
iI[x(z)] + z)3 E(z)'
(3.90)
The number of galaxies in a solid angle d0 2 and in a redshift band dz is then dN d02dz
iI[x(z)]
= DHon(z) (1 + z)3 E(z)'
Galaxies number counts are among the oldest cosmological tests (see Ref. [1]).
(3.91 )
158
The homogeneous Un iverse
3.4 3.4.1
Behaviour at small redshifts D ecelerat ion parameter
We can expand the scale factor around its value today as
a(t) with 0 < to  t
«
=
1 2 ao [ 1 + Ho(t  to)  2qoHo(t  to) 2
+ ...] ,
(3.9
to. The factor qo is the deceleration parameter7 and is defined by qo
== 
a!2It=to '
(3 .9
so that the Universe accelerates if qo < 0 and decelerates otherwise. Using the Frie mann equations to express 0,0 / ao , we easily find t hat
for a ACDM model, since the contribution from radiation is negligible today. For a arbitrary matter composition, the deceleration parameter is given by
(3.9
Thus , in order to have qo < 0, there should be at least one cosmic fluid (dominatin today) with equation of state Wx <  1/3 . We stress that t he expansion (3 .92) on relies on the hypothesis that the Universe is described by a Friedmann Lemaitre spac time. In particular , it makes no assumptions on t he matter content of the Univers contrary to (3 .94). 3 .4.2
3.4.2.1
Expression of the time and distances
Small redshift expansions
At small redshifts, we can expand the function E(z) as
neglecting t he contribution of radiation. Thus, we conclude t hat the lookback t ime
6.t( z ) =
tHo
[1  ~(qO + 2)Z] z + 0( z3 ).
(3.9
The radial distance takes the form
7The choice of sign is historical. At the time qO was introduced it was thought that the expansi of the Universe was d ecelerat ing so t hat qO > 0 was exp ected. C hapter 4 details why we now thi t h at qO < O.
Behaviour at small redshifts
159
the comoving angular distance,
Rang(Z)
=
DHo ao
[1  ~(qO + I)Z] Z + O(z3), 2
(3.97)
the angular distance, (3.98) and the luminosity distance, (3.99) Locally (z « 1) all the distances reduce to DHoz and the deviations only depend on the value of the deceleration parameter. A distanceredshift diagram will thus be sensitive to Ho in its region z « 1 (slope of the diagram), to qo mainly in its intermediate region (z ~ 1) and to the full set of cosmological parameters for larger z. This illustrates the evolution of the degeneracy of the function E(z) represented in Fig. 3.7. 3.4.2.2
Time drift of the redshift
Suppose we observe the same galaxy at to and to redshift given by
+ Mo.
There will be a change in its
Now, null geodesics are straight lines in conformal coordinates so that br] = br]o Malao = btla from which we deduce that bzl(1 + z) = [Ho  H]br]o to first order in 61)0 and thus
bz = Ho [(1 uto
£
+ z)
 E(z)] ,
(3.100)
where E is defined by (3.39). This gives the time evolution of the redshift of any object at redshift z as a function of the cosmological parameters. At low redshift, it reduces to bzlbt = qoHoz + O(z2). Indeed, this drift is extremely small and of the order of az I at ~ 10 8 I century. However, it gives a direct information on E(z) and thus on the cosmological parameters (see Refs . [17,18]). Note that the redshifts are free of the evolutionary properties of the sources, and that a direct detection of their drift would represent a new (and yet never exploited) and direct approach in observational cosmology to constrain the evolution history of the Universe, alternative to those based on luminosity or diameter distances. It can be of importance in the study of the darkenergy problem (see Chapter 12).
160
3.5
The homogeneous Universe
Horizons
An horizon is a frontier t hat separates observable events from nonobservable on T wo very different concepts of horizon have t o be distinguished in cosmology ( Refs. [10, 19]) . The different quantities defined in t his section are represented in Figs. and 3.12. 15
~
Geodesic of
>.
'"0
:::
a particle _ entering into our particle horizon today _
10
~
0)
B
0. 8 0.6
(f'
:3
t)
"
. _ 0.4
.~
0)
0;
"
0
CIJ
U
 0.2 0 0
10
20
40
30
50
Comovin g di stance, aoX, ( 109 lightyears) 15
LO
~
>. 0
~
:::
0. 8
IO
~
" E
~
"
.~
0.6
'" ..'": .9
~
"
5
0)
0;
"
CIJ
0
U
20
30
40
50
Physica l di stance, ax, (10 9 lightyears)
Fig. 3.11 The particle horizon at a given time (3. 102) is defined at different moments the geod esic of the most distant observable comoving particles at this moment (plain verti lines). T his surface can be visualized as the intersection of the particle horizon (3.103) w a constanttime hypersurface . The equation of this surface is given by X = ¢(t) in comov coordinates (top) and by X = u(t) in physical coordinates . This diagram represents a U verse with t he following cosmological parameters (!lmo , !lAO) = (0. 3, 0.7) and h = 0.7. Fr Ref. [20].
3.5.1
Event horizon
For an observer 0 , the event horizon is the hypersurface that divides all events in two families: the ones that have been , are, or will be observable by and the on that are for ever outside the observational perimeter of 0. A necessary and sufficient condition fo r the existence of an event horizon is t h the integral OO dt
°
J
a(t)
Horizons
161
Physical distance (10 9 lightyears)
Comoving distance, GoX (10 9 lightyears) 60
~
50
;>,
,
lOpO
,,'
3.0
o~/
2.0
~~,
b~~~~:~r7~~*~~~____~__~~OT'~'______~1.0
40
0. 8
30
04
~
0
¥
''
oJ
.~
OJ
E = 0
<2
€
20
0.1
U
"5"
'"
10 LLL_''L'_"_~_'.L_
 40
_"'____
__"__
__"__
_"__
 20 20 o Como ving di stance, GoX ([0 9 lightyears)
___L_"____L_
40
_i:>....:J
0.01 0.00 I
60
Fig. 3.12 Universe diagrams for a Friedmann Lemaitre space with paramet ers OmO = 0.3, [lAO = 0.7 and h = 0.7 . The first two diagrams represent , respectively, the physical distance, a(t)x, and the comoving distance , X, in terms of the cosmic time, t (left scale) or in terms of the scale factor, normalized to 1 today (right scale). The last diagram represents the comoving distance in terms of the conformal time, 7). The dotted lines represent the worldlines of comoving observers and our world line is the central vertical line. Above each of these lines is indicated the redshift at which a galaxy on this world line becomes visible for the central observer. We represent the past light cone for the present central observer and the event horizon corresponding to a similar light cone originated from timelike infinity (plain bold line) . We also show t he Hubble sphere, defined by (3.63) (plain light line) and the particle horizon , defined by (3 .103) (clashed line) . From Ref. [20].
is convergent . If this integr a l converges , then at a ny time to , there exists a worldline
1
00
X = Xo =
to
dt
a(t) ,
such that the photon emitted at to in Xo towa rds the origin , reach es X
=
0 at t
= +00.
162
The homogeneous Universe
Any photon emitted at to in X > Xo never reaches the origin and any photon emitt at to in X < XO reaches the origin in a finite time. The de Sitter space, for which a = exp(Ht ), has such an event horizon, since
=
Xo
=
dt e Hto exp H t = ~ <
1 to
00.
As a second example, let us consider Universes wit h scalefactor evolution of the for a(t ) = tn. The integral
J=
c ndt
converges if and only if n > l. Recalling that for a flat Universe (K = 0) n is relat to the parameter w of t he equation of st at e by 3n = 2/ (1 + w), we find that an eve horizon exist s if 1 (3 .10 w <  '3 {==} p + 3P < 0, i.e. if the strong energy condition is violated. 3.5.2
Particle horizon
For an observer 0 at to, t he particle horizon is t he surface of the hypersurface t = dividing all particles of the Universe into two nonempty families : the ones that ha already been observed by 0 at the time to and the ones that have not. For ea time to, it is thus t he intersection between the geodesics of the most dist ant comovi par ticles that can be observed, with the hypersurface t = to. Hence, it is a tw dimensional spacelike surface, which reduces to a sphere of centre 0 for an isotrop and homogeneous Universe. A necessary and sufficient condition for the exist ence of a particle horizon is th the following integral be convergent:
· 1
dt
dt a(t) '
J_=
or
a a(t )
depending on whether or not a(t) continues for negative values of t . Indeed , at any time t o, any particle such that X > a  1dt has not yet be observed by an observer 0 at the origin. The twodimensional surface
J;o
X = ¢(to ) =
i
dt
to
a

a(t)
(3.10
defines t he particle horizon at a given time to and thus divides all particles into tw subfamilies: the ones t hat have been observed at to or before to [X :s: ¢(to )] and th ones t hat have not yet been observed [X > ¢(to) ]. This surface is called the partic horizon of 0 a t to. It can be seen as the section at t = to of t he spacetime surfa (J = ¢ (t ). We thus define the particle horizon as the hypersurface X
=
(J
= ¢(t ).
(3 .10
The hypersurface (3 .103) is the fu t ure light cone emitted from the position of t observer at t = O. Since a(t ) is a positive function , when ¢(t ) exist s, it is an increasi
Horizons
163
function of t so that as time elapses, more and more particles are visible from O. In conformal time, t he particle horizon is a cone and the part icle horizon at to is a sphere. If ¢(t ) is finite when t t ends t o infinity, then the Universe has an event horizon since only the particles with X ::; ¢ (oo) will be accessible to the observer O. This is represented in Fig. 3.11 where the vertical lines represent the particle horizons at various times, whereas the d ashed line represents the surface defined by X = ¢(t) . As an example, let us consider Universes wit h scale factor evolution of the form a( t ) = tn. The integral
lC
n
dt
converges if and only if n < 1. Since for a flat Universe (K = 0) , 3n = 2/ (1 + w) , it follows that there exists a particle horizon if 1
w > 3
~ p
+ 3P > O.
(3.104)
From the conditions (3.101) and (3 .104), we thus conclude that a Universe containing a single fluid with constant equation of st ate can have either an event horizon or particle horizon, but not both at the same time. The two types of horizons are thus mutually exclusive in this particular case. The equation (3.102) gives the co moving radius of the particle horizon. Its physical diamet er at a time t 2 for a n event that occurred at tl < t 2 is thus (3.105) It can be checked that if t l and t 2 are two events from an era dominated by a fluid with const ant equation of st at e w , then _ 6(1 + w)
D p h (t 1 , t 2) 
[ 1 +3w t2 1 
(h) t2
l
(1+3W / (3+3W l ]
.
(3.106)
If tl « t2 t hen t he horizon diameter is proportional to t he Hubble radius (3.63) at the time t 2
(3.107) The particle horizon is at t he origin of the 'horizon problem ' of the st andard BigBang model (see Section 4.5.2 of Chapter 4). 3.5.3
Global properties
To study the global structures, and in particular at infinity, of the cosmological spacetimes , we shall use the representation introduced by P enrose (see R ef. [10] for a general study). It is based on the idea of constructing, for any manifold M with metric g/J>v , another manifold ;\It with a boundary 3 and metric 9/J>v = W g/J>v such that M is conformal to the interior of ;\It , and so that the 'infinity' of M is represented by the 'finite' hypersurface 3. The last property implies that W vanishes on 3. All asymptotic properties of M can be investigated by studying 3 (see Ref. [21 ]) .
164
The homogeneous Universe
3.5.3.1
The Minkowski spacetime as an example
The Minkowski metric (1.10) in spherical coordinates can be written in terms of t advanced and retarded null coordinates, respectively, defined by v = t + r and u = t as 1 ds 2 =  dvdu + 4(v  u) 2d0 2,
where v ?: u and v and u range from 00 to +00 . No terms of t he form dv 2 or d appear because the surfaces of constant v (or constant u) are null surfaces. We can make these coordinates vary in a finite interval by introducing tan V = v ,
tanU = u ,
so that  71"/ 2 < U :S V < 71"/2. Then , introducing the coordinates T and R by
T = U + V,
R = V U,
with  71" < T + R < 71" ,
 71" < T  R < 71" ,
(3. 10
the Minkowski metric turns out to be conformal to the metric 9 given by ds 2 =  dT 2 + dR 2 + sin 2 R d0 2,
with conformal factor W = {2 sin[(R + T) / 2] sin[(T  R) / 2]}  2. The Minkowski spa time is thus conformal to the region (3.1 08) of the Einstein static spacetime (a cyl der S3 x IR. that represents a st atic spacet ime with spherical spatial sect ions). T boundary of this region therefore represents the conformal structure of infinity of t Minkowski spacetime. This boundary can be decomposed into  two 3dimensional null hypersurfaces , .:7+ and .:7 defined by
(3 .10
or equivalently by T = ±( 71"  R) with R E [0,71"]. T he image of a null geode originates on .:7 and t erminates at .:7+, which represent future and past nu infinity.  two points i+ and i  defined by U = V =
±=2'
(3 .11
or equivalently by R = 0 and T = ± 71". The image of a t imelike geodesic origina at i  and terminates at i + . They represent future and past timelike infinity, th is, respectively, t he start and endpoint of all timelike geodesics.  one point iO defined by U =  V= = (3.11 2' or equivalently by R = 71" and T = o. It is t he start and endpoint of all spacel geodesics so that it represents spatial infinity.
Horizons
165
The fact that i± and iO are single points follows from the fact that sin R = O. These are coordinate singularities of the same type as the one encountered at the origin of polar coordinates. The manifold Nt is regular at these points. J  is a future null cone with vertex i and it refocuses to a point iO that is spatially diametrically opposite to i  . The future null cone of iO is J +, which refocuses at i+. Note that the boundary is determined by the spacetime and is unique but that the conformal ext ension spacetime (here the Einstein static spacetime) is not fixed by the original metric and is not unique since another conformal transformation could have been chosen. For any spherically symmetric spacetime, the region (3.108) can be represented in the (T, R) plane by ignoring the angular coordinates so that each point represents a sphere S2 In this representation the Minkowski spacetime is a square lozenge (see Fig. 3.13) . Null geodesics are represented by straight lines at ±45 deg running from J  to J + .
i+
i+
J+
(1) =
+ on )
:
I
I I
I I I
I
I I
I
J 
(I) = 0)
J 
(I) = 
00.')
I
Fig. 3.13 Conformal diagrams of the Minkowski spacetime (left) and FriedmannLemaitre spacetimes with Euclidean spatial sections with A = 0 and P > 0 (middle) and de Sitter space (right) .
3.5.3.2
FriedmannLemaltre spacetimes with K = 0
When written in terms of the conformal time, Friedmann Lemaitre spacetimes with Euclidean spatial sections are conformal to Minkowski. It follows that they map on a part of region (3 .108) representing Minkowski spacetime in the Einstein static Universe. The actual region is determined by the range of variation of 7]. For A = 0 and P > 0, 0 < 7] < 00 so that it is conformal to the upper half of the Minkowski diamond defined by T > 0 (see Fig. 3.13) with a singularity boundary, T = O.
166
The homogeneous Universe
3.5.3.3 de Sit ter spacetime
The metric of t he de Sitter spacetime is given by (3 .44) with t varying in R It spherical spatial sections and is conformal to the whole Einstein static spacetime. study its infinity we define the coordinates
R =X,
T = 2 arctan
(eHt )
,
(3.1
with 0 < R < 7r and 0 < T < 7r. It is then conformal to a square region of the Eins static space with conformal factor W = H  2 cosh2(Ht) (see Fig. 3.13). We see that this conformal diagram has a spacelike infinity both for timelike null geodesics, in the future and in the past.
3. 5.3.4
Friedmann Lem aitre spacetimes with K = ± 1
Friedmann Lemaitre spacetimes wit h spherical spatial sections (K = + 1) are con mal to the Einstein static spacetime (with 17 7 T and X 7 R). They are thus map into the part of this spacetime determined by the allowed values for 17· There are three general possibilities. When A = 0, 17 varies from 0 to 7r if P = 0 from 0 to a < 7r when P > O. The FriedmannLemaitre spacetime is thus confor to a square region of the Einst ein st atic spacetime so that both J + and J spacelike and represent two singularities. When A i 0, 17 can vary from 0 t o 00 the hesitating Universes (such as examples Nand M in Fig. 3.3) or from  00 to for t he b ouncing Universes (such as examples G and H in Fig. 3.3). The Friedma Lem aitre space time is thus conformal to either half of or the entire Einstein st spacetime. As in the case of the de Sitter space, the conformal region will be a squ of the Einstein static space and J + and J  are also spacelike but do not necess represent a singularity. In the case of Fr iedmann Lemaitre spacetimes with hyperbolic spatial secti (K = 1 ), the metric can be shown to be conformal to part of t he region (3.108) means of the coordinat e transformation
T= R =
arctan [tanh ( 17 : arctan [tanh ( 17 :
X)] + arct an [t anh (17 ; X)] , X)] _arct an [t anh ( 17 ; X)] .
Again, the exact shape of this region depends on the matter content (equation of s and A). We see in these examples that some parts of the boundary correspond to the B Bang singularity a = O. When P > 0 and A = 0, the initial singularity is spacel which corresponds t o the existence of a particle horizon. When K = + 1 and A = the future boundary is spacelike, which signals the exist ence of event horizons for fundamental observers. For most physically interesting spacetimes, J + and J  are either spacelike null , which is closely related to the existence of one of the two types of horizon descri above.
Beyond the cosmological principle
167
If .] is spacelike, the worldlines of the fundamental observers do not all meet on .] at the same point . For any particular observer and event on its worldline close to J , the past light cone of this event will not intercept all the particles in the Universe before it reaches .] and there will be a particle horizon. If .] is null , it is expected that all the worldlines of the fundamental observers pass through the vertex i so that the past light cone of a point P will intercept all the worldlines and there is no particle horizon. If .]+ is spacelike, the worldline of any fundamental terminates on a point 0 of J+ and the past light cone of 0 divides the Universe into events that can be seen by the observer and events that he can never see; there is en event horizon. If .]+ is null , all the worldlines will pass through the vertex i+ , so that 0 = i+ and its past light cone is .]+ and there is no event horizon. In conclusion, .] spacelike <===> existence of a particle horizon, .]+ spacelike <===> existence of an event horizon.
,....
J + {}I
0
=;: I
k
1
II
II I ;><
J+
...
( 1/ = + ,~ ,c )
a
II ;><
1 1
1 ;>< 1 1 II
I
"1 1
..J.... I
J 
(II= 0 )
J+{ I/= +'X ·I I
I
....
1 1
..
(I I = 0 )
a
II
,,
, I
;><
1 1
;><
II "1
1 1
I I
J  (17 =
 cc· )
Fig. 3.14 Conformal diagrams of the Friedmann Lemaitre spacetimes with spherical spatial sections with P > 0 for A = 0 (left), and A =F 0: hesitating case (middle) and bouncing case (right). T he dotted line represents a singularity. The three diagrams differ only by the nature of J±, even though they are all spacelike surfaces .
3.6
Beyond the cosmological principle
The Friedmann Lemaitre spacetimes have spatial sections with constant curvature, i.e. threedimensional homogeneous and isotropic surfaces. These symmetries are characterised by six Killing vectors (Chapter 1) that completely fix the geometry of the spatial metric , up to a number, K. Reciprocally, we can construct some solutions by restraining the number of symmetries and by imposing the structure of their Lie algebra. Numerous solutions of Einst ein's equations are known in this case and have been classified [22 24].
168
The homogeneous Universe
We give a general description of the classification of cosmological spacetimes cording to t heir symmetries and t hen we present two cosmological solutions in or to illustrate the richness that this provides. 3.6.1
Classifying spacetimes
Even t hough the previous discussion focused on a particular class of cosmological so tions with maximally symmetric spatial sections , most solutions of E instein's equati have no symmetry at all. This simplification was motivated by the cosmological pr ciple but it is important to have an idea of exact solutions with less symmetry particular to construct counterexamples to some conclusions or t est their robustne
3.6.1.1
Symmetries and Killing vectors
Symmetries of a given space are t ransformations of this space into itself leaving metric unchanged as well as all its geometrical properties . As explained in Chapte a continuous symmetry can be characterized by a Killing vector field , ~ , satisfying
The group of isometries of a Riemannian manifold forms a Lie group (that is a gro that is also a smooth m anifold with a group operation xy  l that is a smooth m into t he manifold itself; see Chapter 2). The set of Killing vector fields generators t hese isometries forms the associated Lie alge bra: it is a vector space, since t he lin combination of any two Killing vectors is also a Killing vector, of finite dimensio since the maximum number of Killing vectors for a space of dimension n is n(n + 1) reached for maximally symmetric spaces. The product operation is defined by commutator of two Killing vectors
(3 .1 of coordinates [~a, ~b]'" = (8f3~b) ~g [~a, [~b , ~cll
+
[~b, [~c,~all
+
[~c , [~a , ~b ll

(8f3~~n~f and it satisfies the J acobi ident
= O.
It is easily checked that if ~a and ~b are two Killing vectors then [~a, ~bl is als Killing vector (since £[l;a, l;blgj.LV = 8!L[~a , ~blv + 8v[~b,~al!L = 0). Considering a ba {~a}a = l..r of this algebra , any Killing vector can be decomposed on this basis a linear combination with constant coefficients. It follows t hat
(3 .1
C~b are t he structure constan ts of the Lie algebra. They are antisymmetr Cba ' and must satisfy, from the J acobi ident ity C~[bC~d) = O. These relati are integrability conditions t hat ensure that t he Lie algebra is defined in a consist way.
where C~b =
3.6.1.2
Defini tions
T he action of a group of isometries defines orbits in the space where it acts and dimensionality of these orbits characterizes t he symmetries.
Beyond the cosmological principle
169
T he orbit of a point P is the set of all points into which P is transformed by the action of the isometries of the space. The orbits are thus invariant sets. An invariant variety is a set that is globally left invariant by the action of the group of isometries. It will be larger than all t he orbits it contains. T he orbits are t he smallest invariant subspaces (for all the isometries in the group) . An orbit that reduces to a single point is a fixed point. It is a point where all Killing vectors vanish. We distinguish general points where the dimension of Killing Lie algebra takes the value it has almost everywhere from special p oints, such as t he fixed points, where the dimension is lower. The group of isometries is transitive on a space S if it can move any point of S into any other point of S. Orbits are the largest spaces through each point on which the group is transitive and can be called surfaces of transitivity. Their dimension, s, has to be smaller than or equal to t he dimension of t he group of translation, s = dim(surfaceoftransitivity)
~
n.
The isotropy group, at a point P, is the subgroup of isometries letting that particular point be fixed. It is generat ed by the Killing vectors that vanish at P. Its dimension q is smaller t han the dimension of the group of rotations, n(n  1) / 2, q
= dim(isotropygroup)
~
1 "2n(n 1).
The dimension of the group of isometries , r , is the sum of the dimensions of the isotropy group and of t he surface of transitivity. It follows t hat r has to be smaller t han the maximum dimension of the Killing Lie algebra, that is n( n + 1) / 2, that is for a space of constant curvature. r = dim(groupofisometries) = s
+ q ~ ~n(n + 1).
The group of isometries of a space is thus characterized by two of the t hree numbers (S, q,T) and by the structure constants of t he group (C~b)' 3.6.1.3
Classification of cosmological models
For an = 4 dimensiona l spacet imes, r ~ 10 and s can run from 0 to 4. The dimension of the group of isometries being a global property of t he space, r must not change over space. Indeed, the real Universe has no symmetry (r = 0). For cosmological spaces, we assume that p + P > 0, which implies that q E {3, I , O} because the congruence of t he worldline of the fundamental observers is invariant, so that the isotropy group at each point must be a subgroup acting orthogonally to uf.' and it can be shown t hat there is no subgroup of 0(3) of dimension 2. Furthermore, q can vary over space but not over an orbit and it can be greater at special points where s is smaller. The space will be said to be isotropic for q = 3 (this is the case of the Friedmann Lemaitre spaces), locally invariant under rotation for q = 1 (there is a preferred spatial direction) and anisotropic for q = O. Table 3.2 summarizes all t he possibilities according to q and s. For more details, see Refs. [7,22 24].
The homogeneous Universe
170
Table 3.2 Classification of cosmological models according to their isotropy and homogeneity properties . We recall that r = q + s :::; 10.
q= 6
8= 3
8= 2
spatially homogeneous
inhomogeneous
Minkowski , de Sitter antide Sitter
none
none
Einstein static
Friedma nn Lemaitre
none
q= 1
Bianchi Kantowski Sachs
Lemaitre Tolman Bondi
q= O
Bianchi
q= 3
3.6.2
8= 4 spacetime homogeneous
Universe with nonhomogeneous spatial sections
3.6.2.1 Lemaitre TolmanBondi spacetime
Among all the nonhomogeneous spaces, the LemaitreTolman Bondi spacetim the simplest. It enjoys a spherical symmetry, so that s = 2, q = 1 and r = 3 (but n that the origin is a special point with q = 3 and s = 0). Its metric takes the gen form ds 2 = dt 2 + S2(t , r)dr2 + R2(t , r)dS1 2. (3. 1 The Einstein equations for a fluid of dust (P = 0) reduce to 2
S (r, t)
= 1_
(R')2 K(r)r2
R? = 2M(r) _ K(r)r2
R M'(r) 47rG N P(r, t) = R2 R' '
(3.1
(3 .1
(3.1
where a prime and a dot represent the derivatives with respect to rand t , respectiv M(r) and K(r) are two integration functions. The Friedmann Lemaitre spaceti correspond to the special case R = a(t)r and K(r) = K. These equations can integrated to give the solution
R(rJ , r)
=
M(r) d
M(r) t(rJ , r)  to(r) = (2E) 3/2
(3. 1
where 2E = (2E , 1,  2E) ,
Beyond the cosmological principle
3.6.2.2
171
The Swisscheese model
This family of inhomogeneous spacetimes is constructed by cutting out spherical regions from a FriedmannLemaitre Universe and filling the voids by a spherically symmetric solution of the Einstein equation, for example, a Schwarzschild solution. The two spacetimes are then glued along a 3dimensional timelike hypersurface. This implies constraints on the two spacetimes since they have to obey junction conditions for the total spacetime to be well behaved (see Section 8.4.5 of Chapter 8 for a description of these conditions). For instance, it can be shown that it implies that the central mass of the Schwarzschild spacetime, the density and scale factor of the Friedmann Lemaitre spacetime and the radius of t he spherical void must be related by MSchw = (471" /3)[ pa 3 R 3 ]FL . Thus, we can construct a spherical void with an overdensity at the centre. The junction conditions impose that the over and underdensities average to the mean density of the Friedmann Lemaitre spacetime. It follows that the mass in the central void cannot have any effect on matter outside the void, and in particular on the expansion rate: its gravitational effect is screened. 3.6.3
Universe with homogeneous spatial sections
The Universe models with s = 3 are central in theoretical cosmology because they mathematically realize the cosmological principle (they have homogeneous spatial sections). We have described at length t he case of isotropic spacetimes (q = 3; r = 6) in this chapter . Two classes of nonisotropic spacetimes can be constructed: (i) the family of solutions with a local rotation symmetry (q = 1; r = 4), which contains the family of Kantowski Sachs Universes and some Bianchi Universes, and (ii) the family of anisotropic solutions (q = 0; r = 3) , which contains the other Bianchi Universes. 3.6.3.1 Generalities on Bianchi Universes To classify all spaces with s = 3, we need to determine the structure constants, C~b' of the group t hat is simply transitive on spacelike sections. The C~b have 9 independent components and can thus be mapp ed onto t he 3 x 3 matrix Nab defined by N cd =
~2 Cab Eabd , C
where cabd is the completely antisymmetric tensor. Splitting Ncd into its symmetric and antisymmetric parts as Ncd = ncd +E cdb A b , we obtain that the structure constants are C~b = Edab ncd + 6gAa  6g A b· One can always choose the basis of the Lie algebra such that nab is diagonal [nab = diag(nl,n2 , n3)]. A rotation allows us to set Ab = (a,O,O) while keeping nab diagonal. The Jacobi identity implies that nab Ab = 0, so that we have one of the three possibilities: (i) a = and nl y'c 0, (ii) a y'c and nl = 0, or (iii) a = and nl = 0. By rescaling the Killing vectors, we can finally set the nonvanishing n a to ± 1 but in general it is impossible to rescale both a and na. In conclusion
°
°
°
(3.120)
172
The homogeneous Universe
Table 3.3 gives the eleven groups of isometries that are distributed among the ni Bianchi types I to I X. They are divided into two classes: class A if a = 0 and class otherwise. In some cases, these groups allow higher symmetry. For instance, Bianch and V I 10 can be isotropic (leading to K = 0 Friedmann Lemaitre spacetimes) as w as Bianchi IX (leading to K = +1 Friedmann Lemaitre spacetimes) and Bianchi and Bianchi VIla (leading to K =  1 FriedmannLemaitre spacetimes).
Table 3.3 The structure constants of the 11 groups of isometries and the definition of the 9 associated Bianchi types . Class
A
B
Bianchi type
I
II
VIla
VIa
IX
VIII
V
III
IV
VIla
VIa
a
0
0
0
0
0
0
1
1
1
a
a
nl
0
0
0
0
1
1
0
0
0
0
0
n2
0
0
1
1
1
1
0
0
1
1
1
n3
0
1
1
1
1
 1
0
1
 1
1
1
3.6.3.2 Example: Bianchi I Universe
The simplest Universe with anisotropic expansion is the Bianchi I Universe. It is homogeneous spacetime so that the projected gradient of any scalar function vanish 'V'vi = O. This implies, in the notations of Chapter 1, that aIL = 0 (because 'V' ILP = and the vorticity is also vanishing, w ILV = 0). It follows that it is completely describ by 8(t) , a(t) , p(t), and P(t). The Raychaudhuri equation (1. 73) gives
The spatial sections are assumed to be Euclidean so that t he generalized Friedmann equation (1.78) reduces to
(3) R =
0, which implies th
The shear evolution equation is derived from the GaussCodazzi relation, using t fact that (3) Rab = 0, and is given by
In terms of the cosmic time derivatives , this rewrites as
In co moving coordinates , the metric of this spacetime takes the form
(3. 12
Beyond the cosmological principle
173
The average of the directional expansion rates is S = (XY Z)1/3 and is related to by 30 = S/ S. Factorizing S, t he metric can be recast as
e
(3 .122) and the metric on constanttime hypersurfaces can be decomposed as 'Y ..

i2J 
where the
f3i
have to satisfy
L
f3i =
O.
e2 (3i (t) u>tJ. . '
"tij
(3.123)
is traceless and related to the shear by
The equation of evolution of the shear implies that (3.124) where ~j a constant tensor . This implies that 0 2 == (Jij (J i j = ~2 / S6, where ~2 = ~j~L and thus that the generalized Friedmann equation (l.78) takes t he form (3.125) For a fluid with equation of state P = wp , P IX S3(w+l) . T his implies that for any value of ~, the shear always dominates the primordial evolut ion of the Universe. It can also be checked t hat the Raychaudhuri equation takes the form (3.126) The evolution of the three scale factors can be shown to be obtained by setting (3.127) with W(t) =
dt
J
(3.128)
S3 .
The coefficients Bi must satisfy the constraint (3. 129) This relation is trivially verified with 27r .
a·=a+ 3 2' t where a is a constant. For instance, if w = 0, we obtain the solution
(3.130)
174
The homogeneous Universe
5 = (
~Mt2 + /f2:.t )
1/ 3
w =  /[~ In (3M + 2V62:./ t) ,
,
(3
once setting Kp = M / 5 3. At lat e times, the Universe becomes more and more isot and t ends towards an E inst einde Sitter spacetime, but close to the Big Bang shear dominat es, 5 > (V32:.t)1 /3 and
X(t)
>
X ot (1+2s in O:J)/3,
Y(t)
>
Yo t (1+ 2 sin O: 2 )/3 ,
Z(t)
> Zot(1 + 2 s in O: 3 )/ 3
(3 If 0: i 7r / 2, two of the powers are positive and the third one is negative. Si behaviours close to the singularity will be discussed in Section 13.4.1 of Chapte As another example, consider the case of a pure cosmological constant. Then Friedmann equation can take the form
It follows that
5(t)
=
5. [sinh(3H. t)]1/3 .
Again, at late time, it converges toward the de Sitter scale factor 5 at early time it behaves as t 1 / 3 .
rv
exp(H. t)
References [1] P.J.E. PEEBLES , Principle of physical cosmology, Princeton University Press, 1993. [2] S. WEINBERG , Gravitation and cosmology: principles and applications of the general theory of relativity, John Wiley and Sons, 1972. [3] J. RICH, Fundam entals of cosmology, SpringerVerlag , 200l. [4] E. HARISSON, Cosmology, Cambridge University Press, 1981. [5] B. RIDDEN, Introduction to cosmology, Addison Wesley, 2003. [6] R.c. TOLMAN, R elativity, thermodynamics and cosmology, Oxford University Press, 1934. [7] G.F.R ELLIS and H. VAN ELST, 'Cosmological models ', in Theoretical and observational cosmology, Kl uwer , Dordrecht, 1999. [8] J .P. UZAN , C. CLARKSON and G.F.R ELLIS, 'Time drift of the cosmological redshift as a test of the Copernican principle' , Phys. Rev. Lett. 100, 191303, 2008. [9] R. WALD, Gravitation, Chicago University Press, 1984. [10] S. HAWKING and G.F.R ELLIS , The large scale structure of spacetime, Cambridge University Press, 1973. [11] A.D. CHERNIN, D.l. NAGIRNER and S.V. STARIKOVA , 'Growth rate of cosmological perturbations in standard model: explicit analytical solution ' , Astron. Astrophys. 399 , 19, 2003. [12] J. EHLERS and W. RI NDLER, 'A phase space representation of Friedmann Lemaitre Universes containing both dust and radiation and the inevitability of a Big Bang', Month. Not. R. Astron. Soc. 238 , 503 , 1989. [13] G.F.R ELLIS and J . WAINWRIGHT (eds.) , The dynamical system approach to cosmology, Cambridge University Press, 1996. [14] J.P. UZAN and R LEHOUCQ, 'A dynamical study of the Friedmann equations', Eur. J. Phys. 22 , 371 , 200l. [15] G.F.R ELLIS , 'Relativistic cosmology' , in R elativity and cosmology, RK SACHS (ed.) , Academic Press, 1971. [16] J.P. UZAN, N. AGHANIM and Y. MELLIER, 'The distance duality relation from Xray and SZ observartions of clusters', Phys. Rev. D 70 , 083533, 2004. [17] M. DAVIES and L. MAY, 'New observations of the radio absorption line in 3C 286, with potential application to direct measurement of cosmological deceleration', Astrophys. J. 219 , 1, 1978. [18] A. LO EB, 'Direct measurement of cosmological parameters from the cosmic deceleration of extragalactic objects ', Astrophys. J. L ett. 499 , Ll11 , 1998. [19] W. RINDLER, 'Visual horizons in worldmodels', Month. Not. R. Astron. Soc. 116, 622, 1953.
176
References
[20] T.M. DAVIS and C.H. LINEWEAVER, Expanding confusion: common miscon tions on cosmological horizons and the superluminal expansion of th e Univ [astroph/0310S0S] .
[21] R. PENROSE, 'Conformal treatment of infinity' , in Relativity, groups and topo Les Houches 1963, pp. 561 C. DE WITT, and B. DE WITT (eds.) , Gordon Breach, 1964. [22] A. KRASINSKY , Physics in an inhomogeneo us Universe , Cambridge Unive Press, 1996. [23] D. KRAMER, H. STEPHANI, M.A.H. MCCALLUM and E. HERTL, Exact solu of Einstein's field equations, Cambridge University Press, 1980. [24] M.P. RYAN, Homogeneous relativistic cosmologies, Princeton University P 1975.
4 The standard BigBang model This chapter is devoted to the description of the historical pillars on which the standard BigBang model stands. Starting from the cosmological solutions described in Chapter 3, we present the observations proving the expansion of the Universe and the various consequences of this expansion. We start , in section 4.1, by describing the observational evidences of the expansion of the Universe from Hubble's observations to the most recent ones. This will lead us to discuss the value of the Hubble constant and its relation with the age of the Universe. Elements of thermodynamics in an expanding spacetime are sumarised in section 4.2. We will deduce that the Universe has a thermal history. In particular , we will show how the departures from thermodynamic equilibrium playa central role. A remarkable feature of the BigBang model comes from the proof by Alpher , Bethe and Gamow in 1948 [1] tha t light nuclei can be formed during primordial nuc1eosynthesis. T his is discussed in section 4.3. Gamow [2] also deduced the existence of a microwave background of photons, the temperature of which was calculated by Alpher and Herman [3] in 1948. Relying on an estimate of the helium abundance and of the baryon density, they concluded that this temperature should be around 5 K. Section 4.4 describes the properties of the isotropic component of this microwave background radiation. These physical ingredients transform the (mathematical) cosmological solutions of F'riedmann Lemaitre into the BigBang model. We will argue in section 4.5 that this model is an excellent standard model for cosmology, even if it suffers some problems, on which we will focus our attention in the following chapters of this book.
4.1
The Hubble diagram and the age of the Universe
If the Universe is expanding, galaxies must move away from one another (see Section 3.l.3; Chapter 3). The measurement of the systematic achromatic shift to longer wavelengths of the absorption or emission spectral lines gives access to the redshift z. For small redshifts , the recession velocity is given by vi c ~ z. As for distances , (3.99) shows that the luminosity and angular distances reduce to D A
~ D ~~z~ ~ L Ho Ho '
as long as z « l. In this regime, the slope of the curve of v as a function of D directly gives the Hubble constant Ho. For larger redshifts, this is no longer the case. First, the
178
The standard BigBang model
angular distance and the luminosity distance are no longer identical, and in addit other cosmological paramet ers also interfere in the determination of these distan (see Chapter 3). We describe the measurements for small redshifts in Section 4.1.1 and then th for supernOVffi that extend up to redshifts of order 1 in Section 4.1.2. We finish summarizing the methods allowing the age of the Universe to be determined in S tion 4.1.3 . 4 .1.1
The Hubble constant
The Hubble diagram is the most direct proof of the expansion of the Universe. T underlying idea is very simple and comes from the independent measurements of redshift and distance. These measurements must be performed at distances large enough for the pro velocity of the galaxies to be negligible. Typically, one should measure velocities lar than v rv 10 4 km· sI, which corresponds to a redshift of z rv 0.03. The original measurements of Hubble, lie respectively, up to velocities of order 1000 km · S l and then up to velocities of order v rv 20000 km · Sl (see Fig. 4.1). T analysis provided a value for Ho that was 10 times larger than the one adopted tod For a galaxy with recession velocity v rv 10000 km· sl, the effect of its proper velo v rv 300 km . S l leads to an uncertainty of approximatively 3%. This uncertainty be reduced by observing a large number of objects well distributed over the whole s Hubble (1929)
Hubble & Humason (1931)
~ ~
~
I
I
en
0
~ 1000 C 0
>
0
500
~
" en
0
0
0
0
10000
0 .;;;
0
en
<J) <J)
~
Jl,
"
0 .;;; 0
o
CIS 000 '0
'0 Q)
~ 20000
6
<J)
5000
0
'8
<J)
~
1 Distance (Mpc)
30
2 Distance (Mpc)
Fig. 4 .1 Observations by Hubble in 1929 (left) a nd by Hubble and Humason in 1931 (ri proving the expansion of the Universe. The first measurement of Ho included 18 gala for which both t he distance and the red shift were determined . Hubble concluded that H 500km · S  1 . Mpc 1 [4] .
4.1.1.1
Astronomical m ethods
The difficulty in constructing a reliable scale of distances [5] has limited progress in measurement of the Hubble constant. Thanks to the Hubble Space Telescope (HST is now possible to measure Ho with a precision of about 10%. In astronomy, distan cannot be measured directly. The parallax method , which is purely geometrical,
The Hubble diagram and the age of the Universe
179
only be used to determine the distances of the closest stars in the Milky Way. The general method is then to measure the intrinsic luminosity of a class of objects for which it is either constant , or correlated to another physical parameter , whose value is thought to be independent of the distance, as a way to calibrate the relative distances. Various classes of objects have been used to this purpose (see Ref. [6] for a review). 1. Cepheids: cepheids are variable stars the external atmosphere of which pulses
with a period between 2 and 100 days. These young and bright stars are very abundant in the close spiral and irregular galaxies. The pulsation mechanism is well understood theoretically and it has been established empirically that the pulsation period, which is independent of the distance , is correlated to the intrinsic luminosity of the star. The dispersion of the period luminosity relation is of the order of 20% in luminosity. Since the luminosity scales as the inverse of the square of the distance, this implies a precision of 10% on the determination of the distance. For a sample of 25 cepheids, the statistic uncertainty drops to 2%. The Hubble diagram obtained from cepheids by the HST Key Project [7] is presented in Fig. 4.2. The most precise calibration of the period luminosity relation is performed on the Large Magellanic Cloud (LMC) and its distance has been determined with a precision of 10%. At the moment, the det ection of cepheids is limited to about 20 Mpc. To extend the scale, obj ects brighter than simple stars should be found . Their luminosity can be calibrated by identifying such objects in the nearby galaxies that also contains cepheids. 2. Type Ia supernova;: type Ia supernOVffi (SN Ia) are the most promising distance indicators for modern observational cosmology. The origin of SN Ia is the explosion of a white dwarf. Their luminosity is comparable to that of an entire galaxy, making them observable at distances of several hundreds Mpc. As will be shown later in Section 4.l.2 , their luminosity curve shows a strong correlation between the characteristic evolution t ime and the maximal luminosity. The uncertainty of this relation is of the order of 12% on the luminosity, which translates into an uncertainty of around 6% in t he determination of their distance. 3. The Tully Fisher relation: the total luminosity of spiral galaxies is strongly correlated to the m aximal rotation velocity of the galaxy. This relation, detailed in Chapter 7, reflects the fact that a brighter galaxy, and thus a more massive one, must rotate faster to compensate the gravitational attraction. The rotation velocity can be measured by spectroscopic observations from the Doppler effect. The Tully F isher relation has been measured for around a hundred galaxies and it can be shown empirically that it has a dispersion of the order of 30% in luminosity, which implies a precision of the order of 15% in the determination of the distance. 4. Fundamental plane: the intrinsic luminosity of elliptical galaxies is correlated to the velocity dispersion of their stars. This correlation is analogous to the TullyFisher relation. Elliptical galaxies occupy a 'fundamental plane' where their luminosity L is correlated to t he surface brightness 10 and to the velocity dispersion 0 7 (J, by a relation of the form L ex 10 . (J3 . This relation can be understood by the fact that elliptical galaxies are roughly
The standard BigBang model
180
2000 ::;'
1 500
I
OJ
5 00L~~
o
• '" • • []
~ ~
I
en
E
__~__~__~__~~__~__~__~__~__~__L~__" 20 30 10 Distance (Mpc)
2· 104
6
~~
Tull yFicher (Bande I) Fundamental plane Surface brightness Supernovre Ia Supe rnovre II
'8 0
~
~
I
"0.
::E I'
'(l
E
6
10 4
0 100 80 60 40 0
100
200 Distance (Mpc)
300
400
Fig. 4.2 (top): Hubble diagram for cepheids obtained by the HST Key Project collab tion [7]. Data are spread up to distances of around 20 Mpc, just as the measurement of Hu and Humason (1931). (bottom): Different objects are used and thanks to type Ia supernO the diagram can spread up to around 300 Mpc.
selfgravitating syst ems with a masstoluminosity ratio more or less const The virial theorem (see Section 7.2.2) then imposes (J ex M I ro. Moreover, surface brightness is well described by I (r) = 10 exp [ (r I ro) 1/ 4 ], called the Vaucouleur law. This can be integrated as L ex Ior5. Assuming that the mass luminosity ratio is of the form MIL ex M X , we obtain that £l +x ex (J4 4x Ig The empirical law is recovered for x rv 0.25 . The precision of this relation is of the order of 10 20%, which implies a preci of the order of 5 10% on the det ermination of the distance.
5. Fluctuation of the surface brightness: the resolution of stars with a CCD cam depends on the distance. The luminosity of each pixel is due to a given num of stars. One can show t hat the Poisson fluctuations from one pixel to ano
The Hubble diagram and the age of the Universe
181
depend on the distance of the galaxy. This method has a precision of the order of 8% and is applied to redshifts up to z '" 0.02 . 4.1.1.2 S ummary of the m easurements These different methods have been put into application, mainly thanks to the HST, to construct a reliable astronomical distance scale. In particular , t his distance scale has been calibrated up to a level of 5% by cepheids. The results concerning the Hubble diagram are presented in Fig. 4.2. T he different values of the Hubble constant obtained by these various methods [7] are summarized in Table 4.1. Table 4.1 Value of the Hubble constant as measured by t he various methods d iscussed in t he text. Ho (km . s  1 IMpc)
Method
71 ±2 ±6
SN Ia
71 ±3 ± 7
'TUlly Fisher
70 ±5 ±6
surface brightness
72 ±9 ±7
SN II
82 ±2 ± 6
fundamental plane
72±8
combined methods
4.1.1.3 Physical methods In parallel with these astronomical methods , there exist at least three other methods to determine the Hubble constant, avoiding t he need to construct a scale of distances. 1. Expansion of the photosphere of SN II: the front of the SN II propagates at a velocity of order v / c '" 0.01 after t he explosion. This velocity can be measured
by the Doppler shift in t he supernova spectrum. The radius of the photosphere is then rphotosphere = v(t  texp losion) . If the angular diameter B under which the supernova is observed can be measured, then its distance can be deduced , D = v(ttexplosion )/B. Unfortunately, the angular diameter is difficult to measure directly for extragalactic supernOVffi. 2. Lensing effect: the arrival times of two gravitationally lensed images of the same pointlike source depend on the path followed by light and on t he gravitational potentials along t hese paths (Chapter 7). The time delay between two light signals together with the angular separation between the two images of a variable quasar allow the Hubble constant to be deduced. However, this method has some problems. The gravitational lens is usually a galaxy, the mass distribution of which is not known independently. Thus, there is a degeneracy between the distribution of the lens mass and the Hubble constant. Ideally, we should need a measurement of the velocity dispersion as a function of the position. Only a dozen known systems, having both a favourable geometry and at least one variable source, is currently known . 3. Sunyaev Zel 'dovich effect: the inverse Compton scattering of a microwave background photon (see Section 4.4) by t he ionized gas of a cluster , induces a distortion
182
The standard BigBang model
of the blackbody spectrum, known as the (thermal) Sunyaev Zel'dovich effec The scattering probability P can roughly be expressed as a function of the cl diameter Dctuster and its average electronic density ne by P rv n e O"TDcluster being the Thomson scattering crosssection. The temperature and density of the hot gas can be obtained from an X emi map. Using the fact that the X flux , unlike the spectral distortion of the micro background, depends on the distance to the cluster, the latter can thus be dedu The main uncertainties of this method come from the heterogeneities of the distribution (which tend to lower the measured value of H o), from the proj ec effects, and the hydrostatic equilibrium hypothesis.
To conclude, it is reassuring to notice that methods based on very different phy principles lead to consistent values of the Hubble constant (see table 4.2), thems in agreement with those obtained by astronomical methods. Table 4.2 Measurement of the Hubble constant by physical methods. Ho (km . S  1 / Mpc)
4.1.2
Method
73 ± 15
photosphere
72 ±7 ±15
lensing effects
60 ±20
lensi ng effects
60 ±4
Sunyaev Zel'dovich effect
72 ±5
WMAP
Reference Schmidt et al. (1994) Tonry and Franx (1998) Fassnacht et al. (2 000 ) Reese et al. (2002) Spergel et al. (2003 )
The contribution of supernovre
As illustrated in section 4.1.1, SN Ia represent to date the most competitive candid to extend the Hubble diagram to large redshifts , for which the luminosity dista redshift relation is no longer linear. This offers the opportunity to measure cosmolo parameters other than the Hubble constant.
4.1.2.1 Distant supernovre
There exist two families of supernovre. Those for which the optical spectrum incl hydrogen traces are called type II, whereas those lacking hydrogen are of type 1. M over, type I supernovre are subdivided into three very different subfamilies (a, SN Ib and Ic, just as SN II , are created by the explosion of massive stars (c WolfeRayet) and lead to a residual black hole or neutron star. The mass lost by progenitor is more significant for the Ic than for the Ib so that the latter still ha layer of helium. SN la, on the other hand, appear in binary systems where one o two stars is a white dwarf that accretes the mass of its companion. When it rea the critical Chandrasekhar mass, 1.4 M G , the white dwarf collapses and explodes supernova. It has been established , thanks to the study of close supernovre, that luminosity curve could be calibrated using an empirical relation between the peak the width of the luminosity curve. Thus, SN Ia seem to be good standard candles can be observed up to large redshifts.
The Hubble diagram and the age of the Universe
183
Two teams have worked in parallel on the search for dist ant supernovre: the Supernova Cosmology P roj ect (SCP ) [9] and the HighZ Supernova Search Team (HZT) [10]. In 1998, they independently published t he Hubble diagram of type Ia supernovre (see Fig. 4.3 ). The SCP group used a catalogue of 42 SN Ia composed of 18 SN Ia with z < 0.101 and 24 SN Ia with 0.1 80 < z < 0. 830. T he analysis of t he HZT group relies on 34 close SN Ia and 16 SN Ia with 0.16 < z < 0. 62. Figure 4.3 summarizes t he observations of both t eams. Since then , the number and the maximal redshift of the observed supernovre have increased greatly. 3
44 • IIighZ SN Sea rch team ~I o Supernova Cosmology Proj ect o 42
i
..2'" 40 ::l
2
"0 0
 IIighZ SN Search team   Supernova Cosmology .. Proj e~~.... ·····
E 38
t::
lS
is'"
= 0 .3 , QAO = 0. 7 = 0.3, QAO = 0.0 QmO = 1.3, Q AO = 0 .0

36
QmO
.. ... QmO  
34
\.\0 <:> ..••••.•
o
6
J" l. n
~ O. o
.
~E O O~~~~~~~~tt~~~~Ir.~~ 1 fI o. ~
f ~
0 .01
0 .10
1.00
 lLLww~~~~~~~~~~WWW
0.0
z
0.5
1.0
1.5
2.0
2 .5
Q mO
Fig. 4.3 The Hubble d iagram for the supernovre of b oth HZT and SCP t eams compared with the predictions of t hree ACDM models. The bot tom panel shows the differ ence between the data and a Universe with (Dmo, DAD ) = (0.3 , 0). (right): T he constraints on the cosmological parameters (Dmo , Dl\o) obtained by b oth exp eriments, compared with t hose obtained from the cosmological background (see Chapter 6) based on t he observat ions of MAXIMA and BOOMERanG balloons. Adapted from Ref. [11].
4.1.2.2 Implication for the cosm ological p aram eters As soon as t he redshift is not too small, t he relat ion between t he distance luminosity and the redshift becomes nonlinear and t he nextt oleading order term is proportional to ~( 1  qO )z2 [see (3.99) ]. The analysis of both t eams proves t hat the deceleration parameter satisfies (4.1 ) qo < 0
> O. T his conclusion follows from the parameterizat ion (3 .92) t hat does not make any hypothesis on the material composition
at 2.80 with no other hypot hesis than Dmo
184
The standard BigBang model
of the Universe and relies only on the hypothesis that the Universe can be describ by a Friedmann Lemaitre spacetime. We can thus consider the fact that the Unive is accelerating to be a robust conclusion of the analysis of SN Ia data. l In a ACDM Universe, this relation mainly gives information on the two cosmol ical parameters n mo and nAO, as indicated by (3.94), and the constraint (4.1) imp that nAO > ~nmo. Figure 4.3 summarizes the supernovre constraints. These data ply t hat the cosmological constant must be nonvanishing and positive. The ellipses constraints of Fig. 4.3 can be roughly summarized as
(4
For tighter constraints, one should combine the data of other observations such as th from the cosmological microwave background. The latter indicates that the Unive is almost fiat , n mo + nAO c:::: 1.02 ± 0.02 according to the analysis of WMAP ( Chapter 6). Thus, for a ACDM Universe,
nmO
rv
0.3.
(4
The order of magnitude of this analysis is confirmed by precise statistical analysis
4.1.2.3
Robustness of the conclusion
The greatest uncertainty lies in the hypothesis that distant supernovre are stand candles and can be calibrated in the same way as close supernovre. Three effect s sho be quantified with precision in order to know if this effect is real or if it is an artefa
 The behaviour of supernovre could vary with cosmic time, mainly due to change of metallicity, i. e. of their chemical composition beyond helium4. If dist supernovre have a luminosity peak systematically weaker than closer ones, conclusion nAO > 0 would be softened. If the explosions of SN Ia are strong then the conclusion would be reinforced. It is difficult to obtain information the intrinsic luminosity and one can more easily study other parameters of light curve. If two families (close and distant) are in all aspects identical, it however , highly probable that their luminosity is also the same.  The absorption by the interstellar medium could give the illusion that dist supernovre are systematically dimmer. The analysis of the light curve in seve colours can give an estimate of this dimming. The effects of gravitational lens could also soften the luminosity.  The possibility of an observational bias should also be considered.
The increasing number of observed supernovre provides a better understanding of th physics and current studies tend to prove that the acceleration of the Universe can be explained by anyone of these effects [ll]. Figure 4.4 compares the astrophysi and cosmological effects .
lWe emphasize that this conclusion can now b e reached by the combination of other da ta ( eMB a nd weak lensing) witho ut taking the SN Ia into account.
The Hubble diagram and the age of the Universe
185
0.5
~
5" ¥I
0.0 ... '.
r
0 .5
Q = O
_.  .....
Absorption by interstellar medium or evolution QmO = 0.35, Qi\O = 0.65 QmO = 0.35, Qi\O = 0.0 QmO = 1.0, Qi\O = 0.0
.... "
.
...... ..... .
z
Fig. 4.4 Hubble diagram for type Ia supernovce compared to the one for an empty Universe (OmO,OAO) = (0,0). The measurement of the SN Ia at z = 1.7 is incompatible with some astrophysical effects that could mimick an acceleration of the Universe at lower redshifts . Adapted from Ref. [11] .
4.1.2.4
Cosmological constant or dark energy
The previous results assume that only a cosmological constant and ordinary matter dominate. However , any kind of matter with equation of state W < 1/3 can lead to qo < 0 [see (3.94)]. It is legitimate to wonder whether it can be checked from the data that the matter responsible for the acceleration of the Universe is indeed a cosmological constant, i.e. a component with an equation of state w =  l. Figure 4.5 presents the constraints on the pair of parameters (w, D l11o ) for a flat Universe if w is assumed to be constant. For Dmo ~ 0. 3 we obtain w <  0.6. (4.4) The constraints on the equation of state depend on which data have been combined and on the parameterization of the equation of state. Figure 4.5 compares the constraints obtained by combining the observations from the microwave backgroud (WMAP) , galaxy catalogues (SDSS), the supernovre, and from the Lymana forest in various ways. It is currently conventional to call dark energy the component responsible for the acceleration of the Universe. This d ark energy could be a cosmological constant but other candidates also exist (see Chapter 12 for a more detailed discussion). The majority of these candidates have a dynamical equation of state. The constraints on the equation of state of the dark energy change very rapidly with the flow of new observations. Let us cite, for instance, a constraint [12] on a parameterization of t he form W= Wo + waz/(l + z); it leads to +0.193 Wo =  O.981 0. 193'
wa
=
0.83 O. 05 +0.6 5'
(4.5)
186
The standard BigBang model  0.2,       , _
D D  0.4
WMAP + gal + SN WMAP + gal + bias + Lyman  a \VMAP + gal + SN + bias + Lyman  a
 0.8
1.0
 1.2
 1.4
1.6 ~~~~~~~L~~
0.6
0.20
0.24
0.28
0.32
0.36
Fig. 4.5 (left): Constraints on a const ant equation of state of dark energy as a functio 12mo for a fiat Universe. Adapted from Ref. [9J. (right) : Same thing by combining various of observa tion (contours at 68% and 95%) , cosmological microwave background (WMA galaxies (SDSS) , supernOV ffi and Lymana forest , in different ways. Adapted from Ref.
at 1 a . The determination of the nature of this dark energy is one of the impor challenges of modern cosmology and will be discussed in the final chapters of book. 4.1.3
The age of the Universe
A constraint on the cosmological parameters can also be obtained from the meas ments of the age of t he Universe. A lower bound on this age can then be set that sh be compatible with t he dynamical age of the Universe computed from the Friedm equations. Dat a from supernovce imply that the dynamical age of t he Universe f flat ACDM model is I
t = 14.9+ 14 (  ' o  l.l 0.63
)1
X
10 9 years
h
'
to = 14.2! 6g ( 0.63
)1 X
109 years ( '
using data, respectively, from SCP [9] and HZT [10].
4.1.3.1
A ges in the Milky Way
The oldest obj ects in the Milky Way are low metallicity 2 stars and are locate the halo of the Milky Way. T here are three methods to determine the age of t stars [13].
1. Nuc1eochronology: conceptually, this is t he simplest method , analogous to radi t ive dating methods, and it represents the most direct way to date the Unive
2By m etallicity, we mean here t he content in chemica l elements beyond helium4 . The ra tio [F traces t his met a liicity.
The Hubble diagram and the age of the Universe
187
tu . The age of the stars is estimated from the abundance of longlived radioactive
nuclei for which a halflife is known (for details , see Ref. [14]). The radioisotope 23 2Th (thorium) with halflife of 14 million years has been used to date t he age of the stars of the Galaxy (in particular t he star CS 22892) but its abundance is only halved on a period equal to the age of the Universe. In principle, 238U is a better indicator since its halflife is 4.5 million years . The abundance of this isotope was derived by observing the star CS31082001 8 [15], 10g(U /H) =  13. 7 ± 0.14, which corresponds to an age of (12.5 ±3) x 10 9 years . 2. Cooling of white dwarfs: white dwarfs are t he terminal stage of the evolution of stars with mass smaller t han around 8M0 . As t hey get old, white dwarfs get cooler and dimmer. Using a model of their cooling, their cooling rate, and thus the age of t he Universe, can be calculated. Unfortunately, since they are not very bright , white dwarfs are very difficult to observe so that the studies are reduced to white dwarfs in the neighbourhood of the Solar System and enable us to estimate the age of the local disk to approximatively 9.8 ::: 6:~ x 109 years. 3. Main sequence: theoretical models of stellar evolution make it possible to understand the posit ion of a star in the temperat ure~lumino sity plane. During the main sequence, which corresponds to the longest part of their life, stars burn their hydrogen to produce helium. At the end of t his period, the luminosity of the star increases and its surface temperature decreases. It can be shown that t he luminosity of the stars at this transit ion of t he main sequence is the quant ity least sensitive to errors in the t heoretical models and can be used to determine the age of globular clusters up to precision of 7~ 10 % . One should also determine the distance of the cluster that limits the precision of the method (an error of 10% on the distance translates into an error of 20% on the age!). Thanks to the satellite Hipparcos a better calibration of the local distance scale was possible. The age of globular clusters is roughly is the r ange (11  14) x 109 years.
Table 4 .3 Summary of the main constraints on t he age of the Universe. Age (109years)
Method
Reference
15.2 ±3 .7
nucleochronology (star CS 22892)
Sneden et al. ( 1996)
12.5 ±3
nucleochronology (star CS 310820018)
Cayrel et al. (2001)
11. 5±1.3
5 methods (glob ular clusters)
1l.8 ± 1.2
main sequence (globula r clusters)
14 ± 1.2 13.7 ± 0.2
+ binary) + la rgescale structure
main sequence (globu lar clusters WMAP
C h ab oyer et al. ( 1998) Gratton et al. (1997) Pont et al. (1998) Spergel et al. (2003)
4.1 .3.2 Age of the Universe To conclude, the minimum age of our Universe can be estimated by specifying the ages of the oldest object s in our Galaxy. To obtain the age of t he Universe, the t ime taken to form these objects should also be estimated. Unfortunately there is no robust theory
188
The standard BigBang model
for the beginning of stellarformation processes. The beginning of star formati estimated at z rv 5  20, which corresponds to a period of (0.1  2) x 10 9 years. allows us to conclude (see Table 4.3) that 9.6 < 9 tu S; 15.4.  10 years
4.1.3.3
Compatibility between the Hubble constant and the age of the Univers
Having measured the Hubble constant, the dynamical age of the Universe ca obtained from (3.66) which only depends on the cosmological parameters Dmo DAO (the duration of the radiation era being negligible). To understand if the constraints on the Hubble constant and on the measurem of the age of the Universe are compatible, we first present a comparison between from supernovre and the theoretical computation of the dynamical age of the Uni (Fig. 4.6). We t hen compare Hotu with Hoto(Dmo, DAO ), assuming that Ho and t known up to 10%. These measurements are consistent if the cosmological constant not vanish , as independently indicated by the results from supernovre. An E inste Sitter Universe is marginally excluded (up to 1.50) by these measurements.
4.2
Thermodynamics in an expanding Universe
The previous sections have convinced us t hat the Universe is expanding. Since radi scales as a 4 while pressureless matter scales as a 3 , the Universe was dominate radiation in the past for redshifts larger than
obtained by equating the matter and radiation energy densities and where 8 TCMB/2.725K [see (4.137) below]. Since t he temperature scales as (l+ z ), the tem ature at which the matter and radiation densities were equal is Teq = TCMB(l + which is of order
Above this energy, the matter content of the Universe is in a very different form that of today. In particular, when the temperature T becomes larger t han twic rest mass m of a charged particle, the energy of a photon is large enough to pro particle antiparticle pairs. Thus, when T » m e, both electrons and positrons present in the Universe, so that the particle content of the Universe changes d its evolution, while it cools down. Here, we describe this phase and first fo cus on the particle distribution in librium and then on some outofequilibrium processes leading, for instance, t production of thermal relics. 4.2.1
Equilibrium thermodynamics
For a description of thermodynamics in the relativistic context and in cosmology can refer to Refs. [16 18] .
Thermodynamics in an expanding Universe
h
=
0.7 , tu
=
189
12.10 9 yrs
2.0
1.5 + 20  + 10
1.0
 10 .  20
0.5 ++~~~~rQ~h=~O+(h = 0.5 , tu = 12.10 9 yrs) 0.0
l..'1..._ _ '  _ _L..'_ _'"''_ _...L..'_ _....IL'
o
1
Fig. 4.6 Comparison bet ween t he constraints on the product of the Hubble const ant and the age of the Universe, H ot u , and the product of the Hubble const ant and the dynamical age of the Universe, H oto. The latt er depends on the cosmological paramet ers. We have fixed h = 0.7 and tu = 12 X 109 years. T he dark curve represents a flat Universe (D mo + DM = 1) in which case the abscissa is D hO . T he light curve is for a Universe wit h no cosmological const ant (OAO = 0), for which the abscissa is then D ruo. T he grey regions indicat e t he areas at 1 and 20" for Hotu and the plain horizontal line is the case where h = 0. 5. The circle corresponds to an Einst ein de Sitter Universe for which H otu = 2/3. From Ref. [6]. 4.2.1.1
Dis tribu tion fun ctions and thermody namical quantities
To describe physical processes during t he radiationdom inated era, the distribution function f i (P , T ) of the p a rticles present in the Universe should be d etermined.
Distribution functions P art icle interactions are mainly cha r acterized by a r eaction rate f . If t his rea ction rate is much lar ger tha n the Hubble expa nsion r a t e, then it can m a intain these pa rticles in thermodyn amic equilibrium a t a temper ature T. P articles can thus b e treated as perfect Fermi Dirac a nd B ose E instein gases with distribut ion 3
(4.10)
3The distri but ion function dep ends a priori on (x , t) and (p , E ) b ut the homogeneity hypothesis implies t hat it does not dep end on x a nd isotropy implies that it is a function of p2 = p 2. Thus , it follows from t he cosm ological principle tha t f (x , t, p , E) = f eE , t) = f [E, T (t )] .
190
The standard BigBang model
where 9i is the degeneracy factor, /Li is the chemical potential and E2 = p2 + m 2 . T normalization of Ii is such that Ii = 1 for the maximum phasespace density allow by the Pauli principle for a fermion. Ti is the temperature associated with the giv species and, by symmetry, it is a function of t alone, Ti (t). Interacting species have same temperature. Among these particles, the Universe contains an electrodynam radiation with blackbody spectrum [see Section 4.4] at a temperature of 2.725 K tod Any species interacting with photons will hence have the same temperature as th photons as long as r i » H. The photon t emperature Ty = T will thus be called temperature of the Universe.
If the crosssection behaves as (J rv En rv Tn (for instance, 11, = 2 for electrowe interactions) then the reaction rate behaves as r rv n(J rv T n +3 ; the Hubble const behaves as H rv T2 in the radiation period. Thus, if 11, + 1 > 0, there will always a temperature below which the interaction decouples while the Universe cools dow The interaction is then no longer efficient; it is said to be frozen , and can no lon keep the equilibrium of the given species with the other components. This prope is at the origin of the thermal history of the Universe and of the existence of rel This mechanism, during which an interaction can no longer maintain the equilibri between various particles because of the cosmic expansion , is called decoupling. particle will be said to be coupled if r ~ H and decoupled if r H. This crite of comparing the reaction rate and the rate H is simple; it often gives a corr order of magnitude , but a more detailed description of the decoupling should be ba on a microscopic study of the evolution of the distribution function. As a concr example, consider Compton scattering. Its reaction rate, rCompton = ne(JTC, is of or 3 rCompton rv 1.4 X 10 Ho today, which means that, statistically, only one pho in 700 interacts with an electron in a Hubble time today. However, at a redsh z rv 10 3 , the electron density is 10 3 times larger and the Hubble expansion rate is order H rv HoJOmo(l + z)3 rv 2 x 104Ho so that r Compton rv 80H. This means th statistically at a redshift z rv 103 a photon interacts with an electron about 80 tim in a Hubble time. This illustrates that backward in time densities and temperatu increase and inter actions become more and more important. Note that if the thermal equilibrium can be maintained by interactions withi cosmic plasma, by causality, these interactions cannot establish such an equilibri between two causally disconnected regions that would have different initial tempe tures. The hypothesis of the homogeneity of space implies that the Universe is assum to be initially in thermal equilibrium. The justification of this state will be discus in Chapter 8.
:s
Thermodynamical quantities
The distribution function (4.10) can be used to define macroscopic quantities su as the particle number density, 11" energy density, p and pressure, P, for each spec 'i' , as 4
4Since E2  m 2 41TV E2  m 2 EdE.
p2 implies that pdp
EdE a nd, b ecause of isotropy, d 3 p = 41Tp 2 dp
Thermodynamics in an expanding Universe
191
(4.11) (4.12) (4.13) These three quantities can be expressed in terms of the integrals I (m,n) ( i
.
.)
=
X t , Yt 
1
00
Xi
m( U 2
2)n/2 d U exp [U( )Yi] ± 1 '
U

Xi
(4.14)
with Yi
!Li ==, T
(4.15)
as (4.1 6) Table 4.4 summarizes the expression of these quantities within various limits for fermions (F) and bosons (B), where ((3) ~ 1.202 is the value of the Riemann zeta function at 3.
Table 4.4 Summary of the main limits of the thermodynamical quantities (4.11) (4. 13) for fermions (F) and bosons (B). particles
Limit
p
n
P
B
g«((3) /1r 2 )T3
(1r2/3 0 )gT4
p/3
F
(711"2/8
p/3
F
g(3(3) /41r 2 )T3 gJ.L3 / 61r 2
T»m,J.L <  T
B,F
(g /1r 2 )el"/TT 3
gJ.L3/81r 2 (3g/1r 2 )el"/TT4
T«m
B ,F
g(mT / 21r)3 /2 eCI"=) /T
(m +3T/2)n
T »m, J.L J.L » T »m
X
30)gT 4
p/3 p/3 nT
4.2.1.2 Number of relativistic degrees of freedom The radiation density at a given temperature T can be computed from these expressions (see Table 4.4) (4.17) 9. represents the effective number of relativistic degrees of freedom at this t emperature ,
g.(T)
=
L i=bosons
Ti ) 4 7 gi ( T +8
L i=fermions
gi
(Ti)4 T
(4.18)
192
The standard BigBang model
The factor 718 arises from the difference between the Fermi and Bose distributio In this regime, the matter content is dominated by radiation and the curvatur negligible, so that the Friedmann equation is of the form 2
(7r30 )
H2 = 87rG N 3
g.
T4.
(4.
Numerically, this amounts to H(T)
1/2
rv
T2
= 1.66g. lVI'
(4.
p
or equivalently,
t(T)
~ 0 .g. 3 1/2 Mp T2

obtained by integrating H = 4.2.1 .3
rv
2 42 .
 1/2
g.
(_T_) 1 MeV
2
s,
(4 .
 TIT.
Chemical potential and particle antiparticle aBymmetry
Inelastic scattering processes such as Brehmsstrahlung e + p f+ e + p + 'Y im that the number of photons is not conserved and thus t heir chemical potential sho vanish (4. /L, = O.
This implies that photons must have a Planck distribution. Any particle, A, kept in chemical equilibrium with its antiparticle by a reaction the form A + A f+ 'Y + 'Y must then satisfy the constraint /LA
+ /LA = O.
(4.
As long as T» mA , using (4.11) , this implies an asymmetry nA _ nA
~ g;~3 [7r 2(~) + (~) 3]
between particles and antiparticles. At lower temperatures (T is exponentially suppressed nA  nA
~ 2g A (m;:) 3/2 e
mA
/
T
sinh
«
(4.
mA) this asymme
C"J; ).
(4.
When T becomes smaller than mA , A and A annihilate and only this small exc survives. This is, for instance, the case for the electrons. Note t hat the electrical neutrality of the Universe implies that the number protons np is equal to n e ne . Using the constraint npln, rv 5 X 10 10 [see Section (4.142)]' we obtain n 9 7r 2 I/. ..E. rv '=~ rv 10 8 (4. n, g, 6((3) T ' so that 1.34/Le1T rv 6.5 can thus be neglected .
X
10
10
.
The chemical potential of the electrons and positr
Thermodynamics in an expanding Universe
4.2.1.4
193
Entropy
In order to follow t he evolut ion of the processes, it is convenient to have conserved quantities such as the entropy. Combining t he conservation equation (3.27) , rewritten in the form d(pa 3) =  Pda 3 , and the derivative of the pressure (4.13) with respect to the t emperature,
we get that t he quantity 5 p + P  np, (4.27) T satisfies 6 the equation d (sa 3 ) = (p, / T)d(na 3). Thus, the product sa 3 is constant (i) as long as matter is neit her destroyed nor created , since t hen na 3 is constant , or (ii) for nondegenerate relativistic matter , p,/ T « 1. We thus deduce that in the cases relevant for cosmology, (4.28)
s ==
'=''
To interpret this quantity, let us consider t he expression (4.27) which implies t hat Td( sa 3) c::: d (pa 3 ) + Pda 3 , in t he limit p,/ T « 1. We recognise S = sa 3 as the entropy and thus s is the entropy density. Hence, (4.28) implies that s ex: a 3 as long as p,fT « 1. T hen, using (4.27), the entropy can be expressed as (4.29) where q* is defined by q. (T) =
L
L
Ti ) 3 7 gi ( T + "8
i = bosons
gi
(Ti T
)3
(4.30)
i= ferm io ns
If all relativistic particles are at the same temperature, Ti = T, then q. = g•. Note also that s = q*7T 4 / 45((3)n, ~ 1.8q.n" so that the photon number density gives a measure of the entropy. SIn the case of a vanis hing chemical potential, th is expression can b e easily derived by starting from the general expression of t he entropy of a system as TdS = dE + Pdll. If we choose T and V as our variables to describ e the thermodynamical state of the system t hen, since they a re intensive quantit ies, P and p are functi ons of T alone; p(T) a nd P(T). It foll ows that dE = d(pll) = pdll + lI(dp/dT) dT. Thus,
+ ~ dp dT. T TdT We deduce that (8S/8V)T = (p + P) /T a nd (8S/8T)v = (1// T )(dp/ dT ). T he integra bility condit ion then implies that dp /dT = (p + P )/T. Plugging these expressions in dS we deduce that d S = P + P dll
dS = d
[ ~(p + P)]
,
from wh ich we get the expression of the entropy up to a n addit ive constant . 6S tarting from TdS = dE + Pdll  /LdN and us ing t h at 11 ex a 3 , so t hat E ex pa3 , and Sex sa 3 we deduce that Td( sa 3 ) = d (pa 3 ) + Pda 3  /Ld (na 3 ). Tak ing into account that the first two terms of the r.h.s. cancel, we deduce t he evolutio n equation of sa 3 .
194
The standard BigBang model
The number of particles in a given volume, N ex na 3 , is constant if no matt created or annihilated. The relation (4.28) thus implies that
y= ~ s
(4
represents the number of particles per co moving volume. For relativistic particles, quantity remains constant during the evolution. It can be easily computed in limiting cases
(4
(4
4.2.1.5
Decoupled species
If at a given time , the interaction rate is no longer high enough to maintain thermodynamic equilibrium of the species i, i. e. if f i ;S H , then this species evo independently from the others . If this species was in thermodynamic equilibrium be decoupling, then its distribution function at the time of decoupling, tD , is 1
f i(P, tD)
=
exp [(E  f.L i)/Ti(tD)]
(4
±1
After decoupling, particles propagate freely and the form of the distribution functio conserved. Only the momentum is redshifted by the expansion as p(t) = p(tD)aD/ It follows that the distribution function at any t > tD is given by
f i(P, t)
a(t) ] fi [ aD p, tD .
=
(4
Thus, if a species was in thermodynamic equilibrium at a given time in the histor the Universe , one can determine its distribution at any time. If the decoupling occurred when the particle was relativistic, (T » m , f.L), then distribution function (4.35) takes the simple form 1
(4
fi(P, t > tD) = exp [E/Ti(t) ] ± l '
where TD = T(tD). The temperature of this decoupled species always decrease a  1 and its entropy Si is conserved separately. If this particle becomes nonrelativ after t NR » tD , we have E(t) c::::: m for any t > t NR but the distribution function k the form (4.36). If the decoupling occurred when the particle was nonrelativistic (T « m) , E c::::: m + p2/2m and the distribution function is given by
f ,(p , t > tD) o
= e(m  I' )/T De 
p 2 /2mTi
(t) ,
T (t) i
=
T
D
[aD] 2 a(t)
(4
Thermodynamics in an expanding Universe
195
Thus, everything goes as if the particles had an effective chemical potential , J.l( t) m+ [J.lD  mJT(t) / To. We then obtain ni
4.2.1.6
~
gi
(
mTO )
3/ 2
271"
(aO)3e  m / To . a
(4.38)
Temperat ures and their evolution
Temperature of the Universe Since the total entropy is constant, q. (T)T3a 3 remains constant during the evolution of the Universe . If, moreover , a species i decouples from the cosmic plasma at temperature To then the entropy of this species is conserved on its own so that and
S,
=
271"2
3 3
q·(T)T a 45 ' "
(4 .39)
are both constant. The entropy, S  Si = (271"2/45 )q, (T)T 3 a 3 , of the particles in thermal equilibrium with the photons, is t hus also constant. We thus conclude that the temperature of the Universe varies as (4.40) where q, is given by the relation (4.30) but only summing over the relativistic particles that are in thermal equilibrium with the photons. Temperature of a decoupled species As long as the number of relativistic species does not vary, the temperature decreases as a  I . When a species becomes nonrelativistic, its entropy is transferred to the other relativistic pa rticles remaining in thermal equilibrium and therefore see their temperature suddenly increase (q. is a decreasing quantity) as
T(+) =
[q,(_)] q, (+)
1/3
T(  ),
(4.41 )
where + and  refer, respectively, to quantities evaluated after and before the decoupIing. At the time of decoupling T = Ti = To so that comparing the quantities (4.39) at temperatures To and T < To, we get that the temperature of the decoupled species is related to that of the Universe by (4.42) This expression generalizes (4.36) to the case where qi(T) is not constant. In particular, qi(T) can vary if the species i has a new interaction allowing it to annihilate to produce another particle that does not interact with the cosmic plasma.
196
The standard BigBang model
4.2.1.7 Neutrinos as an example
Neutrinos are in equilibrium with the cosmic plasma as long as the reactions v + D + e + e and v + e +> v + e can keep them coupled. Since neutrinos are not charg they do not interact directly with photons. The crosssection of weak interactions is given by u rv E2 ex: T2 as long the energy of the neutrinos is in the range m e « E « mw. The interaction rate thus of the order of r = n(uv) c:::: G;T5 . We obtain that
G;
r
c::::
G;
(_T_) H
(4.
3
1 MeV
'
where we have used (4.20). Thus, close to To rv 1 MeV, neutrinos decouple from cosmic plasma. For T < To , the neutrino temperature decreases as Tv ex: a 1 a remains equal to the photon temperature. Slightly after decoupling, the temperature becomes smaller than me . Between and T = m e there are 4 fermionic states (e , e+, each having ge = 2) and 2 boso states (photons with gy = 2) in thermal equilibrium with the photons. We thus h that 7 11 (4. qy (T > me) = 8 X 2 X 2 + 2 = 2· For T < m e only the photons contribute to qy and hence
qy (T < me) The conservation of entropy implies that after the neutrinos and the photons are related by
(4.
= 2.
e
e annihilation, the temperatures
_ (11) 4 1/3 Tv.
(4.
Ty 
Thus, the temperature of the Universe is increased by about 40% compared to neutrino temperature during the annihilation. Since nv = (3/11)ny, there must ex a cosmic background of neutrinos with a density of 112 neutrinos per cubic centime and per family, with a temperature of around 1.95 K. The energy density of th neutrinos, as long as they are relativistic, is Ov = (7/ 8) (4/11 )4/ 30y per family neutrino. Using the value (4.141) of Oyo, we conclude that today r. 2 HvO h =
1.68
X
10 5
(Nv) :3 84 
(4.
2.7'
if their mass is smaller than 10 4 eV.
4.2.1.8
Number of relativistic degrees of freedom
The e  e annihilation is the last annihilation to happen , so that knowing Tv / Ty gi the values of g. and q. today 7
g.o = 2 + 3 x  x 2 x
8
( 4 ) 4/ 3 
11
= 3.36,
if there are 3 families of massless neutrinos.
q.o = 2+3 x
~ x 2x
C
4 1) = 3.91 (4.
Thermodynamics in an expanding Universe
197
The functions q* (T) and g* (T) depend on the particle content of our Universe. As an example, Table 4.5 and Fig. 4.7 summarize their evolution in the context of the standard model of particle physics, discussed in Chapter 2.
Table 4.5 Particle content of the Universe as a function of the t emperature in the context of the standard model of particle physics. ~ 150400 MeV characterizes the quark hadron phase transition, which is assumed to be adiabatic. The thermodynamic history of this transition is not yet well understood. This table also assumes that the Higgs boson has a mass larger than the mass of the W ±, ZO bosons and of the top quark (t). The last three lines of this table are not certain since the Higgs sector and the electroweak transition are not well understood. At larger temperatures, 9* depends greatly on the theoretical model for the fundam ental interactions. For instance, for a minimal SU(5) grand unification model, 9* (T > TG UT ) = 647/4 and in the case of the minimal supersymmetric model, 9*(T < Tsusy) = 915/4 since the number of degrees of freedom is almost doubled.
T:r
Threshold (GeV)
T
< me
0 .511
X
10 
Relativistic particles
"( (+
3
g*
2
+ e±
11/2
TD(V)  ml'
0.106
v start to interact
43/4
mJL  mW
0.135
+ J.L ±
57/4
+ rr±, rro
69/4
m e  TD(V)
(2  4)
X
mrr  T:!h
T2h 
ms
0.194
10
3
,,(, 3v's, e±, J.L± U, d ,
U,
ms  me
1.27 ± 0.05
+ s,
d, 8
S
205/4
g 's
247/4
mCmT
1.78
+ c, c
289/4
m T  mb
4.25 ±0.1O
303/4
mbmW
80.3 ±0.3
mWmt
180 ± 12
+ T± + b, b + W±, ZO +
t, t
423/4
+ HO
427/4
mt  mHO mHO 
4.2.2
3 decoupled v)
T;:w
300 (7)
345/4 381/4
Outofequilibrium thermodynamics
The evolution of a decoupled species can thus easily be described . However, the description of the decoupling , or of a freezeout of an interaction, is a more complex problem that requires us to go beyond the equilibrium description that has been developed previously.
198
The standard BigBang model
q
1
105
104
10 3
10 2
101
1
T (GeV)
Fig. 4 .7 Evolution of the functions 9. and q. as a function of the temperature in the s dard model of p article physics SU(3)c x SU(2)L x U(l)y with two hypotheses concerning critical temperature of the quark~hadron transition, Tcqh .
4.2.2.1
Boltzmann equation
Evolution of the distribution function
The evolution of the distribution function is obtained from the Boltzmann equat
L[jl = C[f]'
( 4.
where C describes the collisions and L = d/ds is the Liouville operator, with s length along a worldline. The operator L is a function of eight variables taking explicit form
a  r'"{3yP{3Py ap"" a L[fl  P'" ax'" In a homogeneous and isotropic spacetime, time, f(E , t) , so that
f
L [f l = E af _ H
at
( 4.
is only a function of the energy 2
af . aE
(4.
p Using the definition (4.11) of the particle density, and integrating this equation w respect to the momentum p , we obtain 7 71n order to evaluate the second term of (4.5 1), we use that
Thermodynamics in an expanding Universe
199
( 4.52) The difficult part lies in the modelling and the evaluation of the collision term . Here, we restrain ourselves to the simple case of an interaction of the form i+j ~ k+l,
for which the collision term can be decomposed as Ci
(4.53)
= Ckl >ij
 Cij > kl
with
(4.54) with a + sign for bosons and a  sign for fermions. IMlij>kl are the matrix elements describing the interaction. The Dirac delta function imposes the conservation of momentum and of energy. This form also shows that the probability for i to disappear is proportional to fdj , i.e. roughly to the density of the interacting species. 8 The factors (1 ± fk) arise from quantum mechanics and are related to the Pauli exclusion principle for fermions a nd to stimulated emission for bosons. If CP invariance holds, as we assume here , then Ckl>ij and Cij>kl involve a unique matrix element, 1M 12 , determined by the physical process. Indeed , this invariance implies t hat the process we consider is reversible and thus that i + j + k + I a nd k +14 i + j have the same matrix elements. It follows that
(4.55) Equation (4.52) thus involves three sources of evolution for the number density ni, namely dilution (3H ni), creation (C i = Ckl >ij ) and destruction (C i = Cij>kl). Evolution of the particle density In cosmologically interesting situations, E  p, » T. Quantum effects can thus be neglected and 1 ± f c::: 1. Equation (4 .52) then takes the form
J
afd 3 p = 3 p2 E aE
J
fd 3 p
by integrating by parts.
8For this reason , we may decompose C ij > kl as Cij>kl = n i njCij>kl. Since Cij_kl has dimensions of a volume divided by a time , we can interpret it as the average crosssection multiplied by the relative speed of the interacting particles, that is C ij>k l = (aij_kIVij) . If we define the reaction rate by rij~kl = nj ( a ij _ kl Vij) then (4.52) takes the form
200
The standard BigBang model
(4
In this limit, the distribution functions are of the form f ex exp[(/1  E)IT ] so the particle density (4. 11) can be expressed as a function of that at /1 = 0 as
(4 Furthermore, t he conservation of energy implies t hat Ek term !kfl  fdj takes the form
+ El = Ei + E j
such t hat
The Boltzmann equation (4.56) can t hus be written as
nn ) n + 3Hn = (uv ) ( n·n· t J  ~nknl _ _ nknl 1.
1.
,
(4
where (uv) is defined as
(4
This equation is very general, but to go further we should know the form of the ma element IM I. We simply note t hat when nj (uv) » H , we recover t he fact that (4 can only be satisfied if the term in brackets cancels, i.e. if
(4
The system is then in chemical equilibrium. This equality is broken as soon as system is out of equilibrium.
4.2.2.2 Frozen interactions and relic particles
A massive particle X is in thermodynamic equilibrium with its antiparticle X temperatures larger than its mass. Assuming this particle is stable, then its den can only be modified by annihilation or inverse annihilation
If this particle had remained in t hermodynamic equilibrium until today, its density, n ex (mIT)3/2 exp(  miT) would be completely negligible. The relic den of t his massive particle, i. e. the residual density once the annihilation is no lo efficient , will actually be more important since, in an expanding space, annihila cannot keep particles in equilibrium during the whole history of the Universe. T particle is usually called a relic.
Thermodynamics in an expanding Universe
201
To evaluate the factor (Jxfx  fLfr) , note that land [ are in thermal equilibrium so that , after neglecting their chemical potentials, fL = exp(  EI I T) and fr = exp ( ErIT). Conservation of energy imposes Ex + Ex = El + Er, so that
fdr = exp ( 
 E x+ Ex) T = fxix·
The Boltzmann (4.58) now t akes the simplified (integrated) form
+ 3Hnx =
nx
 (O"v) (n~  n~) ,
(4.61 )
where (O"v ) is defined by (4.59). The conservation of entropy implies that nx + 3Hnx = sYx , where Yx = nx I sis the number of X particles per co moving volume, so that (4.61 ) can be written as (4.62 ) Y.x = (O"V)s (Y x2  Y2) x . Using the variable x [see (4.15) for its definition] and the fact that dx /dt equation can be rewritten as
x =  (O"V)s (2 Y  Yx2) . dx Hx x
dY
= Hx, this (4.63)
During the radiation era, H ex: x  2 , so tha t introducing ~ == Yx  Yx , and expressing H and s, respectively, by (4.19 ) and (4.29) , it t akes the final form
d~ dx

dYx
=    AX
dx
with A = (O"V)
 2
~
(
 ) ~+2Yx ,
rrr ~*/ 2mMp. V45 g*
To go further , we need the form of (O"v) and the value of mechanism in the next section. 4.2.3
4.2.3.1
(4.64)
(4.65)
Yx We illustrate this
Two limiting cases
Cold relics
For a cold relic, the decoupling occurs when the particle is non relativistic. Using our previous results to express nx [(4 .11)] and s [(4.29)]' we get that
Y x _ 45ji9X   x 3/ 2 e  x . 27r4 8 q*
(4.66)
The quantit ies (O"v ) and A depend a priori on the velocity. In general, (O"v) ex: v 2 n with n = 0 (annihilation of s waves) or n = 1 (annihilation of p waves). This quantity can be parameterized in a gener al way as
(O"V) = O"of( x) ,
(4.67)
where f (x) is a function of m i T only since (v 2 ) ex: T. Equation (4.64) can be integrated numerically, but the main properties of its solutions can be obtained analytically.
202
The standard BigBang model
Before decoupling, (1 < x « xd , Y x rv Yx and .6. and its derivative remain v small. Thus, (4.64) has the approximate solution .6.
2
rv _
Since d In Yx / dx

1_ _ x_ d In Yx 2>'0 f(x) dx '
_
=  1 + 3/ 2x
(4.6
»
1, we get that before decou pling
1
x2
1 for x
rv
.6.
rv
(4.6
__
 2>'0 f(x)·
After decoupling (x » xd, Yx decreases exponentially and becomes negligible comparison to Y rv .6.. Equation (4.64) then takes the form .6. 2
d.6.

=  >'02
dx
Integrating this equation between .6. 1
f(x)x
and x =
Xf
_
00
.6. f
>. 0
1 =
00,
1
f(x) dx 2·
>'0
Xf
f(x)
(4.7
X
[ 1 7 ]1 00
Yoo c:::
we get
00
Xf
It follows that
(4.7 ·
(4.7
dx
Xf should now be determined. The decoupling happens when Y x differs significan from Y x , i.e. when .6.(xd = cYx(xd, c being a number of order unity that can determined by comparison with a numerical integration . By estimating .6.(xd with value (4.69), Xf is the solution of
(4.7 A similar expression is obtained using the criteria f(xd
= H(xr), which gives
(4.7
From today's entropy density [see (4.140)], we can deduce the relic density of t particle X, nxo = sOYoo
(4.7
Thermodynamics in an ex panding Universe
203
To give an estimate of t his relic density, let us assume that f(x) = x n and that q*(xf) ~ g*( xt) . Since px o = mnxo and Dx = Px / Pcrit , we get
D
xo
h 2 ~ 0.31 g* [
(
1 / 2
(n
Xf )]
100
1
+ 1) xn+1 f
( q*o ) 8 3 (
3.91
2. 7
ao
10  38 cm2
)
(4.76)
Since X f depends only weakly on the mass , these expressions give a good estimate of the relic density as a function of the particle 's mass and crosssection. Figure 4.8 depicts the numerical integration of t he Boltzmann equation (4.64). We recover both asymptotic regimes discussed in this section. 1 10 2 ~
.~ I: '"
104
Q)
"Cl
....
Q)
10 6 increasing
.D
E ;:l to:
'OJ)
~
10 8
to:
.;;:
~
0
E 0
y"
  
U
~

101 2
10 14 10 1
1
103
10 2 Time
miT
+
Fig. 4.8 Numerical integra tion of (4.64). We recover b oth asymptotic regimes a nd the d ependence in (cyv ) of the relic density Yx.
4.2.3.2 Ho t relics
A hot relic decouples from the plasma while it is relativistic, so t hat
(4.77) where cFB = 1 (boson) or 3/4 (fermion). The interaction is no longer efficient when Yx is constant , so t hat the relic density is simply given by its value at equilibrium
Ycxo =

Yx(xr)
45((3)
gx
= 2 4 CFB(). 7f
q*
Xf
(4. 78)
204
The standard BigBang model
If the evolution of the Universe remains adiabatic after this transition then nXO
=
SOYoo
q.o ) CFBgx 8   () 3.9 1 q. Xf
= 776.8 (  
3 2.7
cm
3
.
(4
If this particle is still relativistic today, then it behaves as the massless neutrino cussed in Section 4.2.1.6. If its mass is greater than the temperature of the cur cosmic background, i.e. m > 1.7 x 10 4 eV, then the associated energy densi pxo = mxsoYoo, so that using (4.140) , its density is 2 ( q.o ) ( mx ) cFBgx nxoh = mxsoYoo = 0.739 3.91 10 eV q.(xd·
(4
The constraint n x h 2 < 1 implies that m < 13.5q.(xd/cFBgx eV. The precise v of this constraint depends on the value of q. at decoupling. For instance, for slightly massive neutrinos, or for particles with a small mass decouple at around 1 MeV, q. (xd = 10.75, and we get
n
xo
h2
_ mx 91.5 eV '
(4
which gives a constraint on the particle mass. The relic density is inversely proporti to q. (xd. A relic will thus be less abundant if it decouples early. For instance, if particle decouples at around 300 GeV then q.(xd cv 107, so that
n
xo
h2 =
mx 910eV'
(4
So, a particle decoupling while q.(xd » 1 will have a very small relic abundance t emperature compared to the photons of the cosmic background. These are usu called warm relics.
4.3
Primordial nucleosynthesis
For temperatures above or of the order of 100 MeV, the Universe is dominated relativistic particles in equilibrium: electrons, positrons, neutrinos and photons , the nuclear reactions can be activated. The contribution from neutrons and prot which are then nonrelativistic, is negligible in the mass budget. The weak interact between neutrons, protons and leptons, Ve
+n
f+
p
+ e,
(4
and the interactions between electrons and positrons,
(4
maintain all these particles , together with nonrelativistic baryons, in thermodyna cal equilibrium.
Primordial nucleosynthesis
4.3.1
205
Main stages o f the mechanism
The synopsis of primordial nucleosynthesis (usually referred to as BigBang nucleosynthesis, BBN) can be decomposed into three main stages:
(1) T» 1 MeV, the components of the Universe are in thermodynamic equilibrium. Radiation dominates the Universe and 9* = 10.75. Thus, the neutron to proton ratio is approximatively given by its value at equilibrium
Q=
mn  mp = 1.293 MeV. The fraction of each atomic nucleus can then be obtained from purely thermodynamical considerations.
(2) T rv 1  0.7 Me V , weak interactions can no longer maintain the equilibrium. Neutrinos decouple and the nip ratio deviates from its equilibrium value. At the freezeout temperature, Tf rv 0.8 MeV , it is of the order of 1 rv

5 Free neutrons then decay into protons while atomic nuclei remain in thermodynamic equilibrium. In terms of time, this stage starts roughly round t rv 1 s [see
(4.21)J. (3) T
rv 0.7 0.05 MeV, the nuclear thermodynamic equilibrium can no longer be maintained. Electrons and positrons have annihilated each other and reheated the photon bath. 9* is then 3.36 . The atomic nuclei are then formed by a series of twobody reactions. For that , deuterium needs to be synthesized (p+n 7 D +,,(), which is possible only when the radiation density is low enough so that the inverse reaction , the photodissociation of deuterium , is negligible. This determines the temperature TNuc obtained by no/n"{ rv 1, i.e.
T) 2
exp
(ED) 
rv
1.
TNuc
Around T Nuc , (n/p)Nuc = (n/p)f exp(  t Nuc/Tn) rv 1/ 7. All neutrons are then bound in helium4 nuclei, with abundance (in mass with respect to the total number of nucleons, nb = n + p) of the order of
2(~)NUC
1+ n) ( P
rv
0.25,
Nuc
because the number density of helium4 is n/2 so that it accounts for a mass 4 x n/2 = 2n in nuclear units. The increase of the Coulomb barrier for nuclei of larger atomic mass together with the drop in the temperature and density (and hence in the probability of two body reactions) associated with the absence of stable atomic
206
The standard BigBang model
nuclei of m ass A = 5 and A = 8 explain why primordial nucleosynthesis does p roduce any nuclei b eyond helium in a ny significant amount . BBN roughly st when T ~ 0.05 MeV , which corresponds to t ~ 530 s [see (4.21) ]. Key parameters
In this mechanism , departure from thermodyn amic equilibrium is crucial, since nucleons would otherwise b e in t he form of iron if the equilibrium was maintai during the entire nucleosynthesis. The a bundances predicted by primordial nucleos thesis dep end m ain ly on a few pa ra meters .
 9. : the number of relativistic degrees of freedom. 9. determines the radiation d sity a nd thus the value of the freezeout t emperature Tf. This paramet er depe on the number , N v , of neutrino fa milies. The tempera ture Tf is roughly obtai from the Friedma nn equation
O;Tl ~ VON9.Tl. 
T n: t he neutron lifetime det ermines the ratio (n / p) Nuc a t the beginning of nu osynthesis from its freezeout value. Tn dep ends on various fundamental const a
 1
Tn
636 2 2 ) 5 ) 1 = 1.27r . 3 0F (1 + 39A m e = (887 ± 2 s
 T/: t he number of b ar yons p er photon T/ == nb/ n, ~ (5  6) x 10 10 [see (4.14 This quantity is not directly measurable with a great accuracy as it is difficul determine which fr action of da rk matter is in baryonic form. BBN thus offe way to m easure this paramet er.  The fundamental constants: ON and OF are involved in the determination t he freezeout temperature. The fine structure const a nt 0: det ermines Q and reaction r at es , since it enters the value of the Coulomb ba rriers.
This chapter will detail these main lines. To obtain precise results , one sho integra t e numerically the network of nuclea r reactions in an expanding sp aceti We refer to the original papers [1 3, 19, 20] a nd reviews [21 25] for more deta expla nations and to t he numerical code [26] tha t is freely availa ble [27].
4.3.2 4. 3.2.1
Initial state Thermody namic nuclear equilibrium
x1
At t emperatures high compared to 1 MeV, all nuclei are kept in thermodyna equilibrium by nuclear interactions. The density of these nonrela tivistic nuclei is t given by nA = 9A (
mAT) 3/ 2
~
e
 (mA /LA )/T
.
(4.
As long as the reaction ra te is greater than the expa nsion rate, the chemical equilibr imposes the value of the chemical potential
(4.
Primordial nucleosynthesis
207
where I£p and I£n are the chemical potentials of protons and neutrons. Neglecting the mass difference Q = mn  mp c:::: 1.293 MeV in the prefactors, the neutron and proton densities are (4.87) where m N = ~(mp + mn). Using (4.86), the exponential factor in the expression (4.85) can be expressed as exp
(  mA T I£A)
=
[exp (1£'; )]Z [exp (1£;: )] A Zexp (mA)  T .
Expressing the first two factors as a function of the proton and neutron densities (4.87) , we get (4.88) where B A is the nucleus binding energy (4.89) The mass fraction of a nucleus with atomic number A with respect to the total nucleon mass is defined by (4.90) In terms of the parameter TJ defined by (4.91 ) we finally get that the abundances of the atomic nuclei in t hermodynamic equilibrium, using that n'Y = 2((3)T3 11f2, are given by
(4.92) (4.93) This illustrates the influence of the entropy on the a bundance of the different nuclei. If 1] ~ 1, X ~ is stable as soon as T '" B A since at this temperature the formation of the nuclei, controlled by the factor exp ( BAIT), dominat es over its destruction by photodissociation, controlled by t he factor TJA  l. Now, if TJ « I , the balance between formation and photodissociation is reached only when exp(BAIT) '" TJ A 1 , that is at lower temperatures.
208
The standard BigBang model
The temperature at which the a bundance of the nucleus A becomes of order un which we denoted by TA, can thus be estimated from (4.92) . Neglecting the numeri factor F(A) and with Xp '" Xn '" 1, we have
(4. For the lightest elements , we obtain, with Tf '" 5
X
10 10 ,
D
T
BA (MeV)
2.22
6.92
7.72
28.3
TA (MeV)
0.066
0.1
0.11
0.28
So, even at thermodynamic equilibrium, nucleosynthesis does not start before arou
0.3 MeV. This is due to the very small value of Tf. Even if nuclei are energetica
favoured , entropy acts in the opposite way and favours free neutrons and protons.
4.3.2.2
Neutron abundance
Since the equilibrium between protons and neutrons is maintained by weak interacti (4.83) implies that (4. f..in + f..iv = f..ip + f..ie· We conclude from (4.87) that
n nn P == np
= exp
(Q + T
f..ie  f..iv )
T
.
(4.
As shown by (4.26) , the electrical neutrality of the Universe implies that f..ieiT 10 10  10 8 . Neglecting f..iv , we get that
(4.
For temperatures larger than 0.1 MeV, the study of the previous section teaches that XA « 1 so that nb = nn+n p . Thus , the abundance offree neutrons in equilibri is (4. Xn ,eq = ( 1 + e Q/T )  l
At large temperatures, we therefore have Xn = Xp = ~ and XA c::: O. For instance, T = 1 MeV, deuterium has an abundance X 2 '" 6 X 10 12 . 4.3.3
Freezeout of the weak interaction and neutron to proton ratio
Weak interactions maintain neutrons and protons in equilibrium. These interacti freezeout at a temperature Tf determined by H = r w(Tf ). For temperatures T < and T > m e the reaction rate of the weak interactions is given by
rw =
77r(1 60
+ 3g A2 )C2 T5 F
'
(4.
Primordial nucleosynthesis
for the reaction p
+ e + V e + n.
209
We obtain from (4.19) with g* = 10.75 that
r
(
H"::'
T 0.8 MeV
)3 '
(4.100)
so that the freezeout temperature of the weak interactions is of the order of Tf
~
(4.101)
0.8 MeV.
Thus , the freezeout occurs around tf = 1.15 s. Applied to neutrons, the Boltzmann (4.56) can be reexpressed in terms of Xn as (4.102) The reaction rate n + p , Anp , is related to that of the reactions p + n, Apn, by Anp = Apn exp( Q /T) simply because the reactions (4.83) are reversible . In equilibrium, i.e. when Xn = 0, the solution of this equation is given by (4.98). We can first note that the evolution equation (4.102) has an integral solution
Xn(t)
=
it
J(t, t')Apn(t')dt'
+ J(t, ti)Xn(ti),
(4.103)
t,
where the function J is defined by J(t, t') = exp { 
Jt~ [Apn(U) + Anp(U)] du} . Since
at high temperatures the reaction rates are very large, J(t, t i ) will be negligible if the time ti is chosen to be far enough in the past, so that Xn (td plays no role and one can choose ti = 0 (see, e.g., Ref. [20]). It follows that
Xn(t) =
lot J(t, t')Apn(t')dt' "::' Xn ,eq(t)  lot J(t, t') d~' [Xn,eq(t')] dt'.
(4.104)
Since the reaction rates are large compared to their relative variation , it follows that (4.105) Thus, the neutron abundance follows its equilibrium value until T f . When the weak interactions freezeout , the neutron abundance is therefore
Xn(Tr) "::' Xn,eq(Tr)
=
[1
+ exp(1.293/0 .8) r 1 ~ 0.165 ~ ~,
(4.106)
that is (n/p)rf ~ 0.198 ~ 1/5. The numerical coincidence Q/Tf = 0(1) implies that Xn is neither equal to Xp nor very small. As a consequence, the quantity of helium synthesized will be of the same order of magnitude as the quantity of hydrogen. We stress here that this coincidence is unexpected since Q is determined by the strong and electromagnetic interactions , whereas Tf is fixed by the weak and gravitational interactions.
Th e standard BigBang model
210
To evaluate the remaining abundances after the freezeout, Xn(t use the explicit form of the reaction rates (see Ref. [22])
A(nVe
t
pe) = A
A(ne+
t
pve ) = A
A(n
pe ve ) = A
100 dqvq~qe(Ev +
100 1
»
tr) one sho
Q)(l  f e)fv,
dqeq;qv(Ee + Q)(l  fv)fe,
v Q2 m~
t
dqeq; qv(Ee + Q)(1  fv) (l  f e).
(4.1
One can show (see Ref. [20] for details of this computation) that the leading part these integrals comes from particles with energy large compared to the t emper at during BBN. The Fermi Dirac distributions can thus be approximated by Maxw distributions, f e = [1 + exp(Ee/Te)]  l '::::' exp(  E e/Te). Since the Boltzmann fact are small in a diluted gas , quantum effects are negligible and the Pauli factors can neglected, 1  f '::::' 1. The last approximation assumes m e = 0 in the computation the two first int egrals, which are dominated by Ee,v » m e, so t hat
This approximation is precise at a level of 15% as long as T > m e, when the react rates have become very weak. The last reaction r at e gives a relation between const ant A and the neutron lifetime
1= AQ5[V1  in (1  3_m 2 m 2 _4 ) + in4cosh1(1)] '::::' 0.0158AQ 5 in ' 2
T
n
with in
5643
4
== m e/ Q. Finally, the total reaction rate Anp '::::'
253
  5 (12 Tn X
Anp '::::' 2A(nve
t
pe) takes the form
2
+ 6x + x ),
(4 .1
with x == Q / T , if we neglect the decay of neutrons during the freezeout. Actually, t term only becomes important when T < 0.13 MeV , i. e. for x > 10. We recover th for x « 1, the reaction rate is given by Anp ~ 12 X 253 / Tn X 5 rv G;T5, which justif a posteriori our approximation in the determination of Tf. The expression (4 .108) allows us to compute the function J(t, t') as
J( x, x')
exp[K(x)  K( x') ]'
=
with
/00 1 + e u
K( x ) = b J
u4
x
= b
(12
+ 6u + u 2 )
du
[( 4+3+ 1) + (4 + 1) x3
x2
X
x3
x2
e
x] ,
Primordial nucleosynthesis
211
where the parameter b is given by b = 253J45/47f 3g.Mp /Tn Q2 = 0.252. From the expression (4.104) , we can derive the evolution of the neutron abundance
Xn( X)
=
Xn,eq(x)
+ fox[X n,eq(X,)]2 exp [K(x)
 K(x')] eX'dxl
(4.109)
Finally, we find that the neutron abundance freezes out at X n (=)
This value is reached at T 4.3.4
rv
0.25 MeV (x
= 0.150. rv
(4.110)
5), i.e. at t c::: 20 s.
Abundances of the light elements
The previous computation does not take into account the change in g* and the production of deuterium, two processes happening below 1 MeV. Light elements will be formed by a series of nuclear reactions, summarized in Fig. (4.9). No significant abundance can be produced before the deuterium is formed since the densities are too low to allow for more than twobody reactions, such as 2n + 2p > 4 He, to play any important role. Furthermore, the production of deuterium from two protons (p + p + D + e+ + v e ) involves the weak interaction, and is thus negligible. Light isotopes cannot be synthesized before the temperature drops below 0.1 MeV. This is due to the fact that deuterium has a very weak binding energy and is easily photodissociated. Its synthesis only starts when the radiation density is low enough. Moreover, the reaction chain cannot continue before the deuterium is synthesized, this is often called the 'deuterium bottleneck'. 4.3.4.1
Deuterium
The synthesis of deuterium is thus a key stage of primordial nucleosynthesis. It must be produced in sufficient quantity so that the heavy elements can then be synthesized. As already mentioned, the synthesis of light isotopes only starts at around T = Tnue = 0.066MeV. At this temperature, the electrons and positrons are no longer relativistic and g* = 3.36, so that (4.21) implies that t nue = 303s. Since n(Tnue) = n(Tr) exp( tnue/Tn) and p(Tnue) = p(Tr) + n(Tr)[I exp(  tnue/Tn)], which imply that (n /p)(Tnue ) rv 0.133, we have (4.111) The reaction rate per neutron of the deuteriumforming reaction (n
+ p > D + ,)
is (4.112) It can be shown that this reaction freezes out well after the end of nucleosynthesis so that the deuterium abundance (go = 3) is given by its equilibrium value (4.85)
(4.113)
212
The standard BigBang model
l.p H n 2. p (n, r)d 3.d(p, r)JHe
4. d (d, n)JHe 5. d (d, p)t 6. d (d, nlHe 7.( (ex, rYLi 8. JIIe (n, p)t 9. J IIe (d, pj4He 10. J][e (ex, rYBe 11. 7Li (p, a/He 12. 7Be (n, p YLi
Fig. 4.9 Simplified network of nuclear reactions involved in the synthesis of light nu during BBN.
4.3.4.2
Helium 4
Since the binding energy of helium4 is larger than that of deuterium, helium favoured by the Boltzmann factor exp(B /T) in comparison to deuterium. As c be seen from Fig. 4.10 , the growth in the deuterium abundance is followed by th of helium4. Heavier nuclei are not synthesized in significant quantities mainly two reasons: (1) the absence of stable nuclei with A = 5 or A = 8, so that production chains for the reactions n + 4He, p+4 He, 4He+ 4He are blocked (the trip a process, that produces carbon12 in stars, cannot be efficient in the primord Universe); (2) the important Coulomb barrier for reactions as 3H+ 4He7 7Li+')' a 3He+ 4He7 7Be+,),. At T nuc , all free neutrons are bound in helium4 nuclei. Since helium4 has t neutrons, its primordial abundance is therefore
(4.1
A more precise estimate [20] of Tnuc gives Tnuc = 0.086 MeV, which implies t nuc 178.5 s and hence Yp rv 0.238. This analytical estimate gives the correct orders of magnitude and predictions w a precision of t he order of 1% as compared with what can be obtained from numeri integrations. Let us recall that our estimate assumed f1v = 0 to determine the ratio (4.97) this is not the case, the constant parameter ~ == f1v/T can be introduced. When t parameter is small, its main effect is to modify the reaction rate n + Ve 7 P + e,
Primordial nucleosynthesis
213
p
n
10 5
10 15
Tn =
10 20
887 s
Nv= 3 Q bOh 2 =
102 5 LU~~LL~
1
0.01
__L~~LLLL~LL~__L__~ 0.1 0.01 T (MeV)
Fig. 4.10 Evolution of the abundances of light elements during primordial nucleosynthesis. Dashed lines represent abundances in thermodynamic equilibrium. that Anp = A pn exp (Q / T + ~ ) . This changes the freezeout neutron abundance (4.110) to Xn(OO ) = Xn (oo , ~ = O)exp ( O . The results from numerical simulations can be summarized , in the interval 3 x 10 10 < TJ < 10 9 , by
Yp
= 0. 245+ 0.01 4(Nv 
3)+ 0.0002 (Tn 887 s)+ 0.009 ln (
TJ
5 x 10
10)
 0.25~,
(4. 115)
which gives t he sensitivity t o t he most critical parameters (see Ref. [28] for a more detailed study).
4.3.4. 3
Oth er elem en ts
At the end of nucleosynt hesis , at around t "' 1000 s, traces of deut erium (XD "' 10 4) , helium3 (X 3He "' 10 4) and lit hium (X 7Li "' 10  9 ) are still present. T he abundance of helium3 includes some of the trit ium that disappears by (3 decay, and similarly, the one of lithium7 includes beryllium7. The production of elements beyond A = 7, such as boron, is negligible. Unlike helium4, t he abundances of t hese nuclei are very sensitive to t he value of 'f/ (see F ig. 4.11 ) . T he abundances of deut erium and helium3 decrease with TJ . The lithium7 can be produced by two channels: (1) 4He+ 3 H> 7Li +,), that competes with 7Li+p>4He+ 4He for TJ < 3 X 10  10 and (2) 4He+ 3 He> 7Be+)', followed by a (3 decay for'f/ > 3 X 10 10 . The t ransit ion between both channels explains t he shap e of the lithium 7 abundance curve (F ig. 4.11 ). We can show [23] that in the band 10 10 < TJ < 10  9 , the primordial abundances of the light nuclei are given by
214
Th e standard BigBang model
~) = ' 36 (H
X
1O 5 ±o .06 71 
(4.11 6)
He ) = 1 2 H'
X
1O 5 ±o. 06 71 
(4.117)
1.6 '/ 5 .5 ,
p
3 (
0 . 63 '/5.5 ,
P
L 1') H
= 1 ' 2 X 1O 11 ±o.2
_7
(
(71  2 .38 '/ 5 .5
+ 21 ' 7712 .38 ) '/5.5
,
(4.118)
P
wit h rJ5 5 = rJ / 5.5
X
10 10 10 2
Q boh 2
0.26 0.25
4He
~
ci" 0
0.24
'0 ()
'"
J::
'" ::;;'"'"
0.23 0.22
:t: 0
104
£
:t: OJ
'"
105
10 6
10 9 :t:

r~
Fig. 4.11 Helium4, deuterium , helium3 and lit hium7 abundances as a fun ct ion of t he baryon to photon rat io, 'rj , or equivalent ly of the baryon density, n bo h2. The hatched areas correspond t o the spectroscopic observations discussed in the text: Refs. [2931] for helium4, Ref. [32] for deuterium, and Ref. [33] for lit hium7. T he vert ical band represents t he constraint on 'rj obt a ined from t he W MAP satellit e [34] . From Ref. [35].
Primordial nucleosynthesis
4.3.5
215
Observational status
The comparison between the observed and predicted abundances is difficult , mainly because the primordial abundances can be modified by various nuclear processes during the evolution of the Universe. Moreover, these modifications are different for each nucleus. For instance, the abundance of helium4, which is very stable, increases due to the stellar production, whereas the deuterium is much less stable and can only be destroyed but never be synthesized in stars. Helium3 and lithium7 have more complex evolutions as they can be both synthesized and destroyed. Astronomers therefore try to measure the abundances in environments as primordial as possible.
4.3.5.1 Observations We briefly review the status of the lightnuclei abundance observations. For a more detailed review on these topics, see Ref. [24].
 Helium4: is detected through its emission lines in various astrophysical environments (planetary atmospheres, young stars, planetary nebulae), which could be galactic, extragalactic or intergalactic. In order to eliminate the production of stellar helium, its abundance is correlated to that of other elements such as nitrogen and oxygen. The best system to carry out these studies seems to be the HII regions of compact blue galaxies . Recent measurements from various groups give Yp = 0.2443±O .0015 [29], Yp = 0.2391±O.0020 [30] and Yp = 0.249±O.004 [36]. The recent analysis of 82 HII regions performed in 76 compact galaxies has given [31] Yp = 0.2421 ± 0.0021.  Deuterium : is easily destroyed in stars as soon as the t emperature increases above 6 x 105 K. Any measurement of D / H is therefore a lower bound, giving an upper bound on TJ. Its abundance could have been reduced by a factor estimated from 2 to 10. Its spectral lines have actua lly never been detected in any star, which implies that D / H < 10 6 in stellar atmospheres . They can be detected in giant planets, in local interstellar environments and in the absorption spectrum of lowmetallicity clouds on the line of sight of quasars . The finest measurements in absorption systems lead [32] to D / H= 2. 78:+:g : ~~ x 10 5 . Conservatively, local measurements within the interstellar regions give the lower bound [37] D / H= (1.5 ±0.1) x 10 5 , while the absence of any detection in systems with high redshift fixes the upper bound [32] D/H< 6.7 x 10 5 .  Lithium7: the system that seems to enable the best measurements of lithium7 is the stellar halo of our Galaxy. This halo is mainly composed of stars from populations I and II and thus has a very low metallicity. Extrapolating to zero metallicity, we get a primordial abundance Li/H=1.23 :+: g~~ x 10 10 [33].  Helium3: has only been observed in HII regions of the galactic disk (thus in a relatively high metallicity environment). In these regions, its abundance is 3He/H = (1  2) x 10 5 [38] although it is difficult to track back to its primordial value since this isotope has a complex stellar history and can be both produced and destroyed during the galactic evolution. Helium3 is thus unlikely to provide a precise cosmological test today. At the moment, it can be mainly
216
The standard BigBang model
used to constrain the astrophysical mechanisms in which it is produced or stroyed. However , by comparing the primordial abundance of 3He with the abundance, we can conclude that it has not evolved much in 14 billion years
4.3.5.2
Constraints on cosmological models
Primordial nucleosynthesis allows us first to measure t he ratio T/. If only spectrosc data are considered and assuming N " = 3, then T/
= (5.15 ± 1.75) x 10 1°,
(4.1
which corresponds to D bO h 2 = 0.01 8 ±0.006. Current da t a allow us to impose constraints on parameters such as N ", the num of neut rino families with mass lower than 1 MeV , and ~ , but also to test the valu some fundamental constants such as the gravitational const ant , or the fine struc constant (see Ref. [39]). Primordial nucleosynthesis is thus a test not only of the Bang model, but also of nuclear physics and of general astrophysics. The agreem between theory and observations makes it one of the pillars of the BigBang mod
4.3.5. 3 A recent uisis?
For a long time, primordial nucleosynthesis and the spectroscopic determinatio lightelement abundances were t he only reliable ways to quantify the baryon con of the Universe. Recently, D bO h 2 has been determined with an extreme precision t hanks to the analysis of t he cosmic microwave background anisotropies (see Chapte D bOh 2 = 0.0224 ± 0.0009 ,
(4.1
°.
which corresponds to T/ = (6.14±0.25) X 10 1 Using t his value for the baryon den we can then deduce the expected abundances [35]: Yp
D/ H
0. 2479 ± 00004 2.60:t: g~~ x 10 5
7Li/H 4. 15:t:g!; 1O 
3He/H 10
(104 ± 0.04) x 10
6Li/H 5
(1.2 ± 0.08) x 10
This has opened up a reanalysis of the nuclear reaction rates involved during cleosynthesis (Fig. 4.9 ). The situation is summarized in Figs. 4.11.
4.4
The cosmic microwave background radiation
As long as the temperature of t he Universe remains large compared to the hydro ionization energy, matter is ionized and photons are then strongly coupled to elect through Compton scattering. At lower t emperatures, the formation of neutral at om thermodynamically favoured for matter. Compton scattering is then no longer effic and radiation decouples from matter to give rise to a fossil radiation: the cos microwave background (CMB). The temperature of t his cosmic background was predicted by Alpher and H man [3] in 1948 , following the arguments developed in the same year by Gamow This prediction was confirmed in 1964 by Penzias and Wilson [40] who discov
The cosmic microwave background radiation
217
an unexplained noise signal, identical in every direction. This signal was immediat ely interpreted by Dicke et al. [41] as being the cosmic microwave background radiation at a temperature of 3.5 ± 1 K. In this section, we first describe the recombination and decoupling mechanisms before describing the properties of the isotropic component of the cosmic background as revealed by the COBE satellite in 1987; in particular , we describe the constraints on the possible distortions of the Planck spectrum. For more details on this subj ect, we recommend the reviews [42 44]. 4.4.1
Recombination
4.4.1 .1 Prediction of the existen ce of the cosmic microwave background To st art with, let us briefly review the analysis of Alpher and Herman who predicted the temperature of t he cosmic background in a surprising way, based uniquely on an estimate of the helium abundance and on t he baryon density. To explain a helium4 abundance of approximatively 25 %, primordial nucleosynthesis must have happened at around 109 K. At this time , the baryon density is of the order of n b rv 10 18 cm  3 . Today, it is of order nbO rv 10 7 cm  3 , from which we deduce that , at the time of nucleosynthesis, the redshift is 1+ Since T scales as (1
+ z) , the
=
ZBBN
(
n b )
1/3
n bO
rv
8 2 x 10 .
current t emperature of the photon bath is thus around
ToCM B
=
T BBN
1 + ZBBN
rv
5 K.
4.4.1. 2 R ecombination and decoupling As long as the photoionization reaction (4.121) is able to maintain the equilibrium , the relative abundances of the electrons, protons and hydrogen will be fixed by (4 .60). Since the Universe is electrically neutral, one has ne = np . For simplicity, we introduce the ionization fraction
x e _
ne n p + nH
,
(4.122 )
where the denominator represents the t ot al number of hydrogen nuclei, n b = np + nH [ne = np = X en b, nH = (1  Xe )n b]. In this particular case, (4.60) is known as the Saha equation. Once t he equilibrium quantities are computed , it implies that (4 .123) where Er = m e + mp  mH = 13.6 eV is the hydrogen ionization energy. The photon temperature is T = 2.725(1 + z) K = 2.348 x 10 4 (1 + z) eV and the baryon density nb = 7]n"o(1 + z)3 cm  3 so that the previous equation becomes
218
The standard BigBang model
log
X; ) (1 Xe
2 (1 + z)3/2]  25163 = 20.98  log [ DbOh . 1+Z
(4.12
Notice first that when T """' E r, the righthand side of (4.123) is of order 10 15 , that X e(T """' Er) """' 1. Recombination only happens for T « E r· To see that , w deduce from (4.124) that if recombination is defined by Xe = 0.5 , 0.1 , 0.01 then Zrec 1376,1258, 1138, respectively. Thus, X e varies abruptly between Z = 1400 and Z = 12 and the recombination can be estimated to occur at a temperature between 3100 an 3800 K. Note that (4.123) implies that
3
[2
X ; = In (meT ) Er  In Tj((3)T I n 1  Xe 2 271" T 71"2
3] ,
from which we deduce that
(me) 271"T
Er = 3 In T
2
[
2 )X; ] . In Tj  In ((3 71"2 1  Xe
This gives a rough estimate of the recombination temperature since the last term ca be neglected. It gives T """' 3500 K. The electron density varies rapidly at the time of recombination, t hus the reactio rate fT = neaT , with aT the Thompson scattering crosssection, drops off rapidly that the reaction (4.121) freezes out and the photons decouple soon after. An estima of the decoupling time , td ec, can be obtained by the requirement fT(td ec) = H(t dec The reaction rate fT takes the form
(4.12 Only matter and radiation contribute significantly to the Hubble expansion rate that
H 2 = Dmo H
5(1 +z )3 (1 +~). 1 + Ze q
(4.12
Thus, we conclude that the decoupling redshift is solution of (1
+
)3/2 = 280.01
Z
dec
X e(OO )
(n 2)1(n ~ 0.02
~2 )1 / 2 0.15
1 + 1 + Zdec 1 + Zeq
(4.12
Xe( 00) is the residual electron fraction, once recombination has ended. We can estima (see the next section) that Xe(oo) """' 7 X 10 3 . 4.4. 1. 3
Recombination dy namics
To describe the recombination, and to determine Xe( oo ), one should solve the Bolt mann equation. A complete treatment , taking the hydrogen and helium contribution into account , is described in Refs. [45,46]. Actua lly, the helium recombination ends lon before that of hydrogen begins. For simplicity, we will neglect the helium contributio
The cosmic microwave background radiation
219
Consider the Boltzmann equation (4.58) for the electron density. Using ne = nbXe and the fact that nba3 is constant , the term ne + 3Hne can be written as nbXe. Then , using that nH = nb(1 X e) and l1e l1 p /l1H = (m eT /27r)3 / 2 exp( Er/T) , it reduces to (4.128)
with
~ = (~~) 3/2 cyCZ) exp ( 
i ),
(4.129)
These parameters are given in terms of the temperature by ¢2(T ) ::::' 0.448 In
(i) .
(4.130)
The expression for ¢2 is a good approximation , better than 1% for T < 6000 K. The coefficient Cr should in principle be equal to unity. This description is actually too naive [45] . When an electron is captured by a proton, a hydrogen atom is formed in an excited state. This a tom returns to a lower excited state by emitting a series of resonant Lyman a photons. To reach the ground state, one of the photons must have at least the transition energy 2s ; Is. These photons have an important crosssection and thus a high probability to be absorbed by surrounding hydrogen atoms, exciting them to a st ate from which they can easily be ionized. Thus, this process does not lead to any net change in the hydrogen density. It has been shown by Peebles [45] that only transitions from the level n = 2 can lead to significant changes in the hydrogen abundance. The fine structure of this level therefore plays a central role during recombination. Two leading effects should be taken into account .  The transition 2s ; I s is forbidden to first order. From conservation of energy and angular momentum , two electrons should be emitted during this t ransition. This process is slow and its reaction rate is A 2s = 8.227 S I. This is the leading mechanism and the reaction r at e of recombination is very different from the one predicted by the Saha equation.  The transition 2p ; I s produces some Lymana photons. The frequencies of these photons are redshifted by the cosmic expansion. Thus, their energy gets lower and they are no longer resonant , this lowers the probability to reionize a hydrogen atom. These two effects are equivalent to fixing the value of the reduction coefficient , Cr, to
C _ r 
Aa.
Aa. + A 2s + A 2s + ~(2 )
(4.131)
with
~(2) = ~ exp (~) ,
8H
Aa.=~, A a. n
ls
Aa.
4 3E[
= 
=
1.216
X
10 5 cm 
l
For T < 10 5 K t he hydrogen abundance in the state I s can be replaced by nls (1 Xe)nb. The ~(2) coefficient describes the ionization , whereas An + A2s is the total
220
The standard BigBang model
reaction rate of both dominant processes . A more complete study, with details o the processes is presented in Ref. [46]. Table 4.6 compares the approximation from Saha equation with the integration of (4.128) for a fiat Universe with DbOh2 = 1.
Table 4.6 Comparison between the approximation from the Saha equation and the in gration of (4. 128) for a flat Universe with DbOh2 = 1. z
1667
1480
10 1
Sah a
9.43 x
(4.128)
9.2 x 10
Ref. [46]
9.14 x 10
1 1
1296
1111
3.8
X
10 
1
2.6
X
10 
2
4.0
X
10
1
7.2
X
10
2
4 .0
X
10 
1
7.15
X
10
2
8.8
X
9.8
X
9.68
X
926
10 
4
5.28
10 
3
9.2
10
3
A detailed analysis of (4.128) shows [46] that for 1500 > _dX_e
dz
=  61 72
.
D bO h
9.01
Z
741
X 10 6 X X
10 10
4
2.4 l. 23
4
l.2
X X X
10
10
10
> 800, it reduces t
2
J( z) [ 1 + 2 26 x 104 ze 14486/(z827) ]  1 X2 (Dmo h2 ) l / 2 · e'
(4
with J( z ) = (1 + 1.45z/zeq )1/2 , if we only keep the dominant contribution. A approximation of the solution of this equation is
(4. as long Dba and Dmo are both between 0.05 and 1.
4.4.1.4
Last scattering surface
The optica l depth can be computed using the ionization fr action as a function o redshift around decoupling, T
=
j .n X e
(7
e T
dX::::::'
 1
0.42e1 4 .486(1 827)
(
Z )14.25 1000 '

(4.
where the Dmo dependence disappears in an obvious way. More interestingly, also independent of Dba. The optical width varies rapidly around z '" 1000 so the visibility function g(z) = exp(  T)dT/dz, which determines the probability a photon to be scattered between z and z + dz, is a highly peaked function Fig. 4.12). Its maximum defines the decoupling time, Zdec ::::::' 1057.3. This red defines t he time at which the CMB photons last scatter; the Universe then rap becomes transparent and these photons can propagate freely in all directions. Universe is then embedded in this homogeneous and isotropic radiation. The ins when the photons last interact is a spacelike hypersurface , called t he last scatte surface. Some of these relic photons can be observed. They come from the intersec of the last scattering surface with our past light cone. It is thus a 2dimens sphere, centred around us , with comoving radius X(Zdec). This sphere is not infin thin. If we define its width as the zone where the visibility function is halved get .0. z dec = 185 .7. As can be seen in (4 .133) and (4.134), the value of Zdec is
The cosmic microwave background radiation
221
very sensitive to nboh2 and n mO h 2 [it is only independent of these parameters in the approximation of (4.132) ]. From the analysis [34] of the WMAP satellite, we can conclude that (4.135) Zdec = 1089 ± 1, .0.Zdec = 195 ±2, which is in good agreement with our analytical approximation. p
,, 5
tl~
_ QOh2 = 1
·QOh 2 =0 .1 ...... Q Oh 2 = 0.01
4
\'
Equality
a
Today
0
3
,...;
,£
.S
0 2 t:
2 0
~
:0 '<;j
\.
/
>=
.............. 0 0
500
1000 Redshift
Fig. 4.12 (left): During the expansion of the Universe, the density, p, decreases with time. Slightly after recombination, the freeelectron d ensity rapidly decreases and the Universe becomes transparent. (right): The v isibility function is highly p eaked , which allows us to define the last scattering surface.
4.4.2
Properties of the cosmic microwave background
In BigBang models, the energy inj ected into radiation between the electron positron annihilation at a redshift of order Z rv 109 and Z = 10  20 is very small. The fossil radiation emitted around Z = 10 3 must hence have a spectrum very close to a Planck distribution. We had to wait until the COBE satellite [42] to check this prediction with accuracy. In this section, we detail the observed properties of this radiation. 4.4.2.1
Observed temperature and spectrum
The temperature of the cosmic microwave background, defined as the average of the temperature of the whole sky
To == Tyo =
~ 47T
jT(e , 'P) sineded'P,
has been measured with precision by the FIRAS experiment on board the COBE satellite [47]
222
The standard BigBang m odel
= 2.725 ±0.001
To
(4 .
K
at 20". We define the dimensionless parameter
e 2 .7 = ~K 2.725 .
(4.1
T his temperature corresponds to an energy of Eyo = 2.345 X 10 4 eY. The observed spectrum is compatible with a blackbody spectrum
I(v) ex
v3
e"
IT
 1
(4.1
.
F igs. 4.13 and 4. 14 compare the measurements of the spectrum of the cosmic microw background to that of a blackbody at 2.725 K. The fact that this spectrum is so close to a blackbody proves that the fossil radia could have been thermalized , mainly thanks to interactions wit h electrons. Howe for redshifts lower than z ~ 10 6 , the fossil radiation do not have time to be thermali Any energy injection at lower redshift would induce distortions in the Planck spect and can thus be constrained from these observations (see Fig. 4.13). Wavelength (em)
10 17
~
l.0
10
Freq uency (GHz)
0.1
l.0
10 18
0.1
I
to
::r:: ~
...
I
1019
N
I
E 1020
t
~ ~
u
2.726 K
~
• • o
10 21
o
•
Cl
U.
Blackbody FIRAS DMR UBC LBLItaly Prin ce ton Cyanogen
"'1
"
0.01
~
0.001
0.0001
10 22 1
10
100
Frequency (GHz)
1000
10 5
10 7
(J + z)
Fig. 4.13 (left): Comparison between the measurements of the cosmic microwave backgro spectrum to a blackb ody at 2.725 K. (right ): Upp er bound on the inj ect ed energy compa wit h the constraints from FIRAS on t he distortions of t he cosmic background spectrum . F Ref. [48] .
CMB photons have a Planck distribution at a temperature To. From the express (4. 11 ) and (4.12), we can conclude (9"1 = 2) that
(4.
This implies that Pyo = 0.26032 eY . cm  3 = 4.96 x 10 13 ey4. We then deduce f (4.29) that the current entropy density is
The cosmic microwave background radiation
30
300 3.5
0.03
I.
400
I
~
g3.0
1 300
\
"
:>,
~
.3
; 200
~
~
\
0"'''""'o 5 10 15 20
/"
\
/"
/"
\
0
100
/"
\
2.5
/"
\
E ~ 2.0
'"
~
Wavelength (em) 3 0.3
223
/ Planck / ... Compton distorsion , y /   Che mical potential, J1 _ . Freefree, YU·
\ \ \
l.5 0.1
1
Frequency (cm1 )
100 10 Frequency (GHz)
1000
Fig. 4.14 (left): Spectrum of the fossil radiation measured by the C OBE satellite, the error bars are multiplied by 200. (right): Different kinds of distortions of the blackbody spectrum . None of these distortions has b een observed. From Ref. [48]. q* ) 83 So = 2889.2 ( 3.91  2.7 cm 
3
q* ) = 7.039 ( 3.91 n, o,
(4.140)
and that the photon density paramet er is (4.141)
Using n bO = DbOPcrit / m N "' 0.11(DbO / 0.02) m  3 we deduce that
7]
= 5.514
X
10 
10
h2) 8 
D (~ 0.02
3 . 2.7
(4.142)
4.4.2.2 Distortions of the Planck spectrum Three kinds of distortions can affect the sp ectrum of the cosmic background. 1. At redshifts lower than
Z de c , the cosmic plasma can be heated by the injection of energy with t emperatures higher than the one of the CMB photons. The Compton scattering of hot electrons then t ends to shift the spectrum towards higher energies , while m aintaining the number of photons constant, so that the low frequencies (Rayleigh J eans part) get deplet ed in favour of the higher frequencies (Wien part ). This distortion is called Compton distortion and is characterized by a unique paramet er
(4.143)
which represents the integrat ed Compton optical depth. 2. For redshifts between Z "' 2 X 10 6 and Z "' 10 5 , the cosmic microwave background can be thermalized through Compton scattering , but the thermalization of t he energy between the plasma and the photons happens too late for the spectrum to
224
The standard Big Bang model
have time to relax into a Planck spectrum. The spectrum is then deformed Bose Einstein spectrum with nonvanishing chemical potential 1I
3
J(lI) ex e (VIJ, )IT

(4.1
1.
This difference can be characterized by the dimensionless parameter p, T"{
(4.1
p, =  .
3. Finally, there can be some socalled freefree distortions , mainly coming from
emission of photons by electrons that scatter on charged particles. The effec the cosmic background is then proportional to the parameter Yff defined as
(4.1
2)3_,::n"~==
with ri,
where 9
rv
==
8ftg ( e 3v6 47fco
meT~JmeTe '
2 is called the Gaunt factor.
The current constraints on these three kinds of distortions [48] are mainly obtai from the analysis of the FIRAS spectrum, and give
iyi < l.9
x 10 5 ,
(4.1
These strongly constrain any processes capable of inj ecting energy into the Univ from z rv 106 onwards, and even from primordial nucleosynthesis (see Fig. 4.13). 4.4.2.3
Monopole and dipole
The observations by FIRAS [49] depend on the position and frequency. They have b expanded in spherical harmonics into monopole, dipole and residual anisotropies. Even though each point in the sky has a Planck spectrum, the spectrum temp ture is slightly higher in one half of the sky and lower in the other half. This dip distortion was measured by FIRAS
bT(e) = (3.346 ±0 .017) x 10 3 K cos e,
and is compatible with a local kinematic origin associated with the motion of the S System and of our Galaxy with respect to the CMB rest frame. The dipole would t arise from a Doppler effect , i. e. from a t emperature fluctuation of the form T =
To
e
[ 1  v 2 / c2 (1  V  cos e)] 
J
c
1
rv
v2
1 + Vcos e + 1  2 cos 2e c 2c
+0
(v3 ) , c3
where is the angle between the line of sight and the dipole, i.e. the directio motion. We thus conclude that the centre of gravity of the Solar System moves in
The cosmic microwave background radiation
225
direction (l, b) = (263. 85° ± 0.1O°, 48.25° ± 0.04°) in galactic coordinat es, with velocity v '" 368 ± 2 km . S  l. The corresponding velocity of our Galaxy and the local group is then v rv 627 ± 3 km· S l rv 0.0022c in the direction (l , b) = (276° ± 3°, 30° ±3°). From this dipole, we can determine the absolute motion of the satellite with respect to the CMB rest frame, i. e. the reference frame for which no dipole is present . Note that this dipole could have a cosmological origin (its effect would not be distinguishable from that of a kinematic dipole) . However , since the value of the quadrupole is around 1% of the dipole, the kinematic interpret ation is commonly accepted. 4.4.2.4
R esidual fluctu ations
Residual fluctuations
oT(B , cp) = T(B , cp)  To ,
(4.148)
with charact eristic amplitude of ((oT/ T) 2) 1/2 rv 1.1 X 10 5 , have also been detected by the COBE satellite. To reach these fluctuations, various foreground effects had to be eliminat ed first (dust , galactic emission , synchrotron , freefree, ... ). The spectrum of these emissions is very different from a Planck spectrum . Fig. 4.15 represents their relative amplitudes as a function of the frequency. These contributions can thus be substracted (more or less easily). After having subtracted these effects, some t emperature anisotropies remain , with relative amplitude rv 10 5 , which correspond to t emperature fluctuations of the order of 30 ~K. These anisotropies correspond to anisotropies in the cosmic microwave background at the time of recombination and redshifted by the cosmic expansion. Chapter 6 is dedicated to the study of these temperature anisotropies. Wavele ngth (cm) 3
30
0.3
2.7 K CMB
Synchrotron Freefree I I
CMB /',T
0.01 1
10 Frequency (GHz)
100
Fig. 4 .15 Different foreground effects spoiling the cosmic microwave background. None of these effects have a Planck spectrum . From Ref. [48] .
226
Th e standard BigBang model
To conclude , t he cosmic microwave background confirms that t he Universe eme from a state in t hermodynamic equilibrium and that its matter was ionized in the Its t emper ature allows us to determine the photons density. T his contribution re sents around 93 % of the electromagnetic radiation , integrated over all wavelengths can thus deduce the radiation density, the best measured cosmological parameter. cosmic microwave background can be used to strongly constrain any energy inj ec since z rv 106 . 4.4.3
Another proof of the expansion of the Universe
The evolution of the t emperature of the cosmic microwave background gives an a tional proof of the expansion of the Universe. Indeed , it must scale as the inver the scale factor, so t hat T( z ) = 2.725( 1 + z ) K. (4.
These photons can excite the fine structure levels of some atoms . The t empera of the CMB photons at a redshift of z = 2.33771 has recently been measured from t he observation of absorbing clouds. The measurement is based on the excita of the two first hyperfine levels of neutral carbon. T hese excitations are induce collisions and by the tail of the CMB photon distribution. After subtracting t he contribution, which can be evaluated from other transitions, one can determine distribution temperature of the relic photons. It leads to the constraint
6.0 K < T < 14 K ,
(4.
while the theoretical prediction is T = 9.1 K. F ig. 4.16 summarizes the variou tempts to check the relation (4 .149) and proves that it seems to be satisfied . provides an additional proof of t he expansion of the Universe.
4.5 4.5.1
Status of the BigBang model A good standard model.. .
The FriedmanLemaitre solutions derived in Chapter 3 describe homogeneous and tropic expanding Universes. This expansion is now established by various observa (see Section 4.1.1 ) and t he Hubble constant has been measured with a good preci Questioning the reality of this expansion would require the reinterpretation of var observations, and this has not yet been achieved by any alternative model (see example, Ref. [51] for an original and instructive attempt) . The expansion of Universe and the redshift represent t he first observational pillar of t he BigBang mo The cosmological principle, i. e. the homogeneity and isotropic hypothesis, imp t he symmetries of cosmological spacetimes and is at the origin of their mathema simplicity. This hypothesis is supported by the observations of the cosmic microw background that prove t he high level of isotropy of the Universe around us. The Bang model does not address the origin of this homogeneity and isotropy. It is an important issue to fur ther test the Copernican principle observationally. The study of nuclear and electromagnetic processes in cosmological spacet has led to two main conclusions: (1) the exist ence and abundances of light nucle
Status of the BigBang model
30
0 GE et al . (1997)
• o • ... 
g ~
227
Lu et al. (1997) Roth and Baue r (1999) Songaila et al. (1994) Srianand anel Petitjean (2000) T(z) = 2.72 6 X (1+z )
20
3
......'"
Q)
0.
E
~
co ::E u 10
Reelshift
Fig. 4.16 Measurements of the temperature of the cosmic microwave background at different redshifts. The dashed line represents the prediction from the BigBang model. From Ref. [50]. naturally explained to within ten orders of magnitude, Section 4.3; and (2) there must exist a cosmic microwave background of photons of cosmological origin, Section 4.4. Both these predictions have been confirmed by observations , providing two other observational pillars of the model. As already seen, outofequilibrium mechanisms playa central role in these predictions. These conclusions rely on sensible, even conservative, hypotheses, namely the validity of general relativity and nuclear and electromagnetic physics at temperatures lower than 100 MeV, providing a robust description of the history of the Universe from 10 4 s after the Big Bang. This BigBang model is therefore an excellent standard model for the description of the Universe. We will use it as a starting point in the rest of t his book and it is nowadays adopted unanimously. A number of conclusions are generic to this model:  the Universe was dominated by radiation in the past. Thus , a radiationdominated era and matterdominated era succeed each other. Then , there can possibly be an era dominated by a cosmological constant or by the curvature of the Universe.  the Universe has a thermal history. During its expansion , the Universe cools down. The interactions that are efficient at high temperatures tend to decouple as soon as their reaction rate becomes smaller than the expansion rate.  the Universe emerges from a state where matter at high temperature is ionized and is in thermodynamic equilibrium. The dynamics of the BigBang models depend on their matter content. As seen in
228
The standard BigBang model
Chapter 3, all background observables depend on the function E(z) only. A detai assessment of this content can be made [5 2]. As seen in this chapter, BigBang mo els are characterized by 5 independent parameters (h,OK,Om,Ob,Or,OA) with constraint that the sum over the Os should be 1. The values of these parameters, deduced from observations, are summarized in Table 4.7. From these parameters few quantities can be deduced that characterize the thermal history of the Univer They are summarized in Table 4.8.
Table 4.7 Cosmological pa ra meters of BigBang models and constraints discussed in t chapter. 0.72 ± 0.05
Section 4.1.1
curvature parameter
 0.0049 ± 0.0l3
Chapter 6
darkma tter density
0.113 ± 0.009
Chapter7
baryon density (BBN)
0.018 ± 0.06
(4.119)
Hubble const a nt
baryon density (CMB)
0.0224 ± 0.009
fl yoh2
photon density
2.4696 x 1O  5e~.7
(4.141 )
flvO h2
(m assless) neutrino density (3 famili es)
(4.47)
fl ro h 2
radiation density
1.68 x 1O 5et7 4. 148 x 1O  5e 7
cosmological const a nt
0 .72 ± 0 .03
Section 4.1. 2
dark energy equation of state
w <  0.6
(4 .4)
fl AO w
t
Table 4.8 Characteristic quantities of the thermal history of the Universe . 'r/
baryon/ photon r atio (CMB)
(6.14 ±0.25) x 10
10
(4.120)
baryon/photon ratio (BBN)
(5.15 ± 1.75) x 10
10
(4.119, 4.1 42)
To
present CMB temperature
Zeq
red shift at equa lity
T eq
temperature at equa lity
Zd ec
Ll. zdec Tdec
(4.137) (4.8)
5.65e2.~(flmo h2) eV
(4.8) Section 4.4
red shift at decoup ling
1089 ± 1
width
195 ± 2
Section 4.4
tempera ture a t decoupling
2970 K
Section 4.4
~ 10 10
Section 4.3
ZB BN
redshift at nucleosynthesis
TBBN
temperature at nucl eosynthesis
4.5.2
2.725 ± 0.001 K 3612(fl m oh 2 /0.1 5) e 2.~
~1
MeV
Section 4.3
... but an incomplete model
The BigBang model offers a convincing picture of the evolution of the Universe. higher temperatures the model is less reliable since the laws of physics have to extrapolated to regimes where they have not been tested otherwise. For example, still do not have any model explaining the origin of baryons. But it is sensible to ho that we will be able to extrapolate, at least qualitatively, up to the grand unificat scale, i.e. to around 10 16 GeV. Beyond this value, probably at energies close to t
S tatus of the BigBang m odel
229
Planck scale at 10 19 GeV , we expect t o have t o consider quantum gravity effect s. We also get into more and more speculative zones but that gives us the possibility t o test the phenomenology of theoretical models at these energies. Putting this aside, t he BigB ang model has a list of problems inherent to its formulation. 4.5.2. 1 Flatness problem The equation for the evolution of the curvature (3.53) is of the form dD K
dina =(3w+ 1)(1  DK) DK '
(4.151)
if we neglect the recent contribution from t he cosmological const ant. If w is const ant , this equation can be integr at ed easily DK
=
(1  DKO )( l
D KO
+ z )1+3w + DKO·
So that if t he present curvature is small, i.e. IDK ol = IDo  11 < 0.1 , as observations tend to prove, then we need
(4. 152) at equality and
(4.153) at the Planck t ime. The BigBang model does not give any explanation for such a small curvature at the beginning of the Universe. We could have ended with the same conclusion from t he dynamical analysis of Sect ion 3. 2.2 that concluded that , for any equation of st ate w :::: 0, the fixed point DK = 0 was unst able. We should therefore det ermine a mechanism leading to such a small curvature or . explain why we strictly have DK = O. One solution is already suggested in Fig. 3.6 that illustrates that DK = 0 is an at t r actor if the Universe is dominat ed by a mat t er component with equation of stat e w <  1/ 3. In the primordial era , as long as the Universe is domina ted by ordinary matter , it is always very fl at , so that the fl atness problem can actually be seen as an age problem: why does our Universe look so young? 4.5.2.2 Horizon problem The cosmological principle, which imposes space to b e homogeneous and isot ropic, is at the heart of the Friedmann and Lemaitre solutions. By const ruction, they cannot explain the origin of this homogeneity and isotropy. So it would be more satisfying t o find a justificat ion of t his principle. From an observational p oint of view, the CMB radiat ion supports this hypothesis. Its isotropy could be easily explained if the photons were emitted once the causal contact was est ablished b etween them. For t hat , the photons should be in causal contact since the t ime of decoupling. To compute t he size of the causally connected regions at decoupling, we work in conformal time (see Fig. 3.12) .
230
The standard BigBang model Infinite redshift / /
zdec
/
E
~
/
o.,
/
/ / /
/ /
,,
,,
,,
/
I
I
,,
I
,,
\
I
\
I
,,
/ /
I
,,
/ /
,,
/ /
I I I
,,
/
/,,"
,,
Decoupling
/ /
CJ
Las t sca ttering surface
o
,,
,,
,
",
\
A' "
\
,,
,,
",I" A" /
/ /
A
B"",
,,
\ \
, Space
B
,, ,
/
" ,, ,
/ / /
... _
_ ...
Fig. 4.17 (left) : The past light cone of an observer 0 intersects the surface of last scatt on a sphere of diameter A' B' that determines the size of the observable Universe. Th of the causally connected regions (A' A" , ...) is determined by the intersection with the f light cone from the BigBang hypersurface. (right): The horizon at decoupling and the su of last scattering.
As can be seen from Fig. 4.17, the regions in causal contact at the time of decou are smaller than the size of the observable Universe. Thus, the surface of last scatt includes around N
~
(TJO  TJdec ) TJd ec
3
~ 8(1 + Zdec)3 j 2 ~ 105 _
106
»
1
regions in causal contact. Such a causal region at decoupling is now observed u an angle ~ 2 TJdec ~ 10. (4 TJo  TJdec It is difficult to explain why the temperature of t he cosmic microwave backgr is the same up to 10  3% in the entire sky, while the latter is composed of a 10 6 causally independent regions. The standard cosmological model predicts that small regions of the surface of last scattering should be correlated, whereas l scales should necessarily be uncorrelated. Another way to formulate the horizon problem is to give an estimate of the nu of initial cells, with an initial characteristic Planck size length , present today i observable Universe. This number is typically of order
e
N
~
(1Cp+HoZPl)3~
1087.
Here again , the study of galaxies tends to show that their distribution is homogen on larger scales , so it is difficult to understand how init ial conditions fixed on causally independent regions can appear so identical (at a 10 5 level!).
Status of the BigBang model
231
The horizon problem is related to the stat e of thermodynamic equilibrium in which the Universe is, discussed in Section 4.2 . The cosmological principle imposes a noncausal initial condition on spatial sections of t he Universe and in particular that the temperature of the thermal bath is the same at every point. The horizon problem is thus closely related to the cosmological principle and is therefore deeply rooted in the Friedmann Lemaitre solutions. 4.5.2.3 Problem of the origin of struct ures Friedmann Lemaitre solutions describe strictly homogeneous and isotropic spacetimes. However, the Universe is filled with numerous structures, whose origin and properties cannot be addressed and explained within t his framework. As formulated , t he cosmological model is thus incomplete. Since gravity is purely attractive, any density perturbation will tend to collapse to create structure. We can thus think that small fluctuations, of thermal origin, for example, could explain t he largescale structure. But this mechanism is actually not very efficient in an expanding Universe (see Chapter 5). As long as t he Universe is radiation dominated , radiation pressure acts against gravitational collapse, and hence delays the beginning of largescale structure formation. T hus, density fluctuations can only grow significantly for Z > Zeq rv 10 4 . We should t herefore find a mechanism able to generate density fluctuations with nonnegligible amplitude. Such fluctuations do indeed exist and have been detected by the COBE satellite. Thus, a new horizon problem appears. In a radiation and matterdominated Universe, the causal radius, roughly the Hubble radius, increases more rapidly than t he physical wavelength of perturbations. Thus, the wavelengths of all cosmological perturbations observed today must have been larger than the causal radius at the beginning of the expansion of the Universe. In particular, the sat ellite COBE has detected temperature fluctuations with wavelengths larger t han the horizon at decoupling. Moreover, these fluctuations seem to have a scaleinvariant spectrum. Can these initial density perturbations be generated by a causal mechanism? How can we explain their amplitude and t heir spectrum? Another problem related to the existence of these perturbations is the transPlanckian problem. All physical wavelengths observed today were smaller than the Planck scale at some t ime in the past. Thus, they must have been described by some physics that we do not know. Until which point are cosmological observations robust enough against the hypothesis of this (unknown) physics? Can they be used to extract some information on this physics? Note that t he Hubble radius also becomes smaller than the P lanck length , it is t hus unclear whether the description of the Universe by a classical spacetime is still acceptable. This problem is related to the existence of an initial singularity. This version of the transPlanckian problem is slightly different from the one presented in Chapter 8 where only t he wavelength of perturbations becomes transPlanckian, whereas the Hubble radius remains larger t han the Planck length . 4.5.2.4
Relic problem
While the Universe cools down , some symmetries can be spontaneously broken (see Chapter 11) . During these phase transitions, topological defects can be formed.
232
The standard BigBang model
I I
I I I
I I
a
I I I I I I I I I I
I I I I I I I I
I I
Physical distance
Fig. 4.18 Horizon problem for a perturbation with physical wavelength A and the Planckian problem related to the existence of a singularity. All physical lengt hs, in part wavelengths, become smaller than the Planck scale. The Hubble radius also becomes s than the Planck scale and the classical d escription of spacetime can b e questioned i limit.
In particular, in the grand unified theory framework, nongravitational intera are assumed to have a symmetry described by a group (} . This group must be b to give rise to the one observed at lowenergy, namely SU(3) cx SU(2) LxU(1)y Chapter 2). If the group (} is a simple group, then during this symmetry bre monopoles are produced with a mass of order Mmonopole '" T CUT '" 10 16 GeV. the fields giving rise to these monopoles are not a priori correlated on distances than the causal horizon , we can assume that they are created with a unit densi volume during the phase transition, nmonopo le '" (2tCUT) 3 . We get that the mon density is Pmo nop o le '" Mmonopol e nm o n opole. At tCUT , t he Universe is dominated by radiation , so that tCUT '" T c0T (see C ter 11 for a description of phase transitions in the cosmological framework). It fo that 2 Dmonopole h
17 '"
10
TCUT ( 1016
Ge V )
3 (
Mmonopole ) 10 16 Ge V
.
(4
As soon as they are formed , such relics would therefore overdominate the m content of the Universe. The effects of this matter on the evolution of the Un would be catastrophic. This problem was initially referred to as t he monopole pro To conclude, the overproduction of massive relics in the primordial Unive expected in various theoretical frameworks, the grand unified theory in particula should explain why these relics do not affect the dynamics of the Universe.
Status of the BigBang model
233
4.5.2.5 Darksector problem According to the conclusions from cosmological observations, only around 5% of the cosmic matter, socalled baryonic, is in a form expected by the standard model of particle physics. We should postulate the existence of a dark sector representing 95% of the matter and having at least two components:  the dark matter , representing 23 ±3% of the matter. This nonrelativistic matter cannot be in baryonic form . The production of relics in the primordial Universe provides a mechanism that could lead to such matter. We should determine the interactions involved and the nature of this particle.  the dark energy, representing 72 ± 3% of matter , is required to explain the recent acceleration of the Universe. The cosmological constant has been a natural candidate for a long t ime. A satisfying cosmological model cannot fail to have a deep understanding of this dark sector. In particular , we should understand why dark matter and baryonic matter have comparable abundances. We should also remember that baryogenisis is still not a wellunderstood process. This mechanism should explain why 'rf = nb /ny rv 10  1 As for t he dark energy, two possibilities are to b e considered:
°.
 the cosmological constant is a manifestation of the vacuum. In this case, its energy density can be computed from the theory of particle physics. However, any vacuum energy coming from a theory compatible with the standard model of particle physics should be at least of order of the energy of the electroweak symmetry breaking scale, p;(4 rv 1 TeV. But the cosmological value is of the order of p;(4 rv 10 3 eV. The disagreement between the theoretical estimate and the cosmologically determined value is thus at least of the order of 1060 , and can go up to 10 120 if we extrapolate to the Planck scale. This problem is known as the cosmological constant problem.  The dark energy is a dynamical quantity. These models will be described in detail in Chapter 13. However, we could ask ourselves why this matter only starts to dominate the matter content of the Universe today. This is what is called the coincidence problem. The coincidence problem is actually more general and relies on the fact that the different matter components are in the following ratio today
(4.156) How can we understand that these components of matter , whose origin depends a priori on differing mechanisms , come with such a ratio? Why do we live at a time when the two components of the dark sector are in comparable quantities? Also, note that the existence of structure and the cosmological observables are very sensit ive to these ratios. To finish, note that any information obtained on this dark sector has been deduced from the observation of luminous matter. These conclusions rely on a central hypothesis: the validity of general relativity to describe gravity. This hypothesis is so central that it is mandatory to t est it on cosmological scales . This last point merges with
234
Th e standard BigBang model
the effective constancy of the fundamental constants of Nature on cosmological scales. Note once more that these conclusions have been obtained by imposing tha Universe is well described by a Friedmann Lemaitre space. Most observations, in particular the supernovre data, from which we deduce t he recent acceleratio the Universe, are localized in our past light cone. This implies that dependenci space and time are in general degenerate. This degeneracy is lifted by the symm hypothesis in the framework of Friedmann Lemaitre models . The conclusions c thus be different in a Universe without these symmetries. For instance, the supern data can be reproduced in a Universe with spherical symmetry, filled only with (LemaitreTolman Bondi space described in Chapter 3). The expansion of su Universe is not accelerated but the observation of supernovre would be interpreted a wrong symmetry hypothesis and would lead to the conclusion that the expan is accelerating, whereas it actually only reflects the spatial dependance of the me Once more , we should check the Copernican principle as much as possible and veri conclusions that can be deduced from the observations: they depend on the hypot made on the spacetime structure. 4.5.3
Conclusion
The BigBang model is a good standard model, giving a satisfactory description o Universe on large scales , where it can be described by a homogeneous and isot spacetime . It has, nonetheless, various problems that primordial cosmology aim resolve. At higher temperatures, physics is less well understood, and its cosmolo implications can offer solut ions to some of these problems, but can also constrain physics. It is thus important for primordial cosmology to have a standard model such a Big Bang . This starting point will allow us to formulat e some variations, to test t and to study their phenomenology. The rest of this book is dedicated to this prog
References [1] R.A. ALPHER, H. BETHE and G. GAMOW, 'The origin of chemical elements', Phys. Rev. Lett. 73 , 803 , 1948. [2] G. GAMOW, 'The evolution of the Universe', Nature 162, 680, 1948. [3] R.A. ALPHER and R. HERMAN, 'Evolution of the Universe', Nature 162, 774, 1948. [4] E. HUBBLE , 'A relation between distance and radial velocity among extragalactic nebulae', Proc. Natl. Acad. Sci. USA 15, 168, 1929. [5] S.K. W EBB, Measuring the Universe, the cosmological distance ladder, SpringerVerlag, 1999. [6] W. FREEDMANN , 'The Hubble constant and the expansion age of the Universe' , Phys. Rep. 333 , 13, 2000. [7] W. FREEDMANN et al., 'Final results from the Hubble telescope key project to measure the Hubble constant ' , Astrophys. J. 553 , 47, 2001. [8] M. BIRKINSHAW, 'The SunyaevZeldovich effect' , Phys. Rep. 310 , 97, 1999. [9] S. PERLMUTTER et al. , 'Measurement of n and A from 42 high z supernovre' , Astrophys. J. 517, 565 , 1999. [10] A.G . RIESS et al. , 'Observational evidence from supernovre for an accelerating Universe and a cosmological constant ', Astron. J . 116, 1009, 1998. [11] A. FILIPPENKO, 'The accelerating Universe and dark energy: evidence from type Ia supernovre' , Lect. Notes Phys. 646 , 191 , 2004. [12] U. SELJAK et al., 'Cosmological paramet er analysis including SDSS Lya forest and galaxy bias: constraints on the primordial spectrum of fluctuations, neutrino mass, and dark energy ', Phys. Rev. D 71 , 103515, 2005. [13] B. CHABOYER, 'The age of t he Universe', Phys. Rep. 307, 23, 1998. [14] J.J. COWAN, F.K. THIELEMAN N and J .W. TRURAN , 'R adioactive dating of the elements', Annu. Rev. Astron. Astrophys. 29 , 447, 1991. [15] R. CAYREL et al. , ' Measurements of stellar ages from uranium decay', Nature 409 , 691 , 2001. [16] E. KOLB and M. TURNER, The early Universe, Addison Wesley, 1990. [17] R. TOLMAN, R elativity, thermodynamics and cosmology, Cambridge University Press, 1934. [18] J . BERNSTEIN, Kinetic theory in the expanding Universe, Cambridge University Press , 1988. [19] P.J.E. PEEBLES, 'Primordial helium abundance and the primeval fireball ', Astrophys. J. 146, 542, 1966. [20] J . BERNSTEIN, L .S. BROWN and G . FEINBERG, 'Cosmological helium production simplified ', R ev. Mod. Phys. 61 , 25 , 1989. [21] P.J.E. PEEBLES , Physical cosmology, Princeton University Press, 1971.
236
References
[22] S. WEINBERG , Gravitation and cosmology: principles and applications of the g eral theory of relativity, John Wiley and Sons, 1972. [23] S. SARKAR, 'BigBang nucleosynthesis and physics beyond the standard mod R ep. Prog . Phys. 59 , 1493, 1996. [24] K.A . OLIVE, G. STEIGMAN and T .P. WALKER, 'Primordial nucleosynthesis: t ory and observations ', Phys. R ep. 333 , 389, 2000. [25] V. MUKHANOV , 'Nucleosynthesis without a computer' , Int. J. Theor. Phys. 4 669 , 2004. [26] RV. WAGONER, 'On the synthesis of elements at very high temperatures ', A trophys. J. 148, 3, 1967. [27] http://wwwthphys.physics.ox.ac.uk/users/SubirSarkar/bbn.html. [28] K.M . NOLLETH and S. B URLES, 'Estimating reaction rates and uncertainties fr primordial nucleosynthesis' , Phys. R ev. D 61 , 123505, 2000. [29] YI. I ZOTov et al. , 'Helium abundance in the most metal deficient blue comp galaxies' , Astrophys. J. 527, 757, 1999. [30] V. LURIDIANA et al. , 'The effect of collisional enhancement of Balmer lines the determination of the primordial helium abundance', A strophys. J. 592, 8 2003. [3 1] YI. IzoTov and T.X. THUAN, 'Systematic effects and a new determination the primordial abundance of 4He and DY / dZ from observations of blue comp galaxies ' , Astrophys. J. 602 , 200, 2004. [32] D. KIRK MAN et al., 'The cosmological baryon density from t he deuterium to h drogen t owards QSO absorption system: D/ H towards Q1243 +3047' , A stroph J. Supp. 149, 1, 2003 . [33] S.G. RYA N et al. , 'Primordial lithium and big Bang nucleosynthesis ' , Astroph J. 523 , 654, 1999. [34] D.N. SPERGEL et al. , 'First year Wilkinson microwave anisotropy probe (WMA observations: determination of cosmological parameters', Astrophys. J. Sup 148, 175, 2003. [35] A. Coe et al. , 'Updated big Bang nucleosynthesis confronted to WMAP obs vations and to the abundance of light elements ', Astrophys. J. 600, 544, 2004 [36] K. OLIVE and E. SKILLMAN , ' A realistic determination of the error of prim dial helium abundance: steps toward nonparametric nebular helium abundance A strophys. J. 617, 29, 2004. [37] J. LINSKY, ' Atomic deuterium/ Hydrogen in the Galaxy', Space Sci. Rev. 10 , 49 , 2003; H.W. Moos et al., 'Abundances of deuterium , oxygen, and nitrog in t he local interstellar medium: overview of first results from FUSE missio Astrophys. J. Suppl. 140, 3, 2002. [38] T .M. BANIA , RT. ROOD and D.S. B ALSER, 'The cosmological density of baryo fro m observations of 3He+ in the Milky Way', N ature 415 , 54, 2002. [39] R CYBURT et al. , 'New BBN limits on physics beyond the standard model fr He4' , A stropart. Phys. 23 , 313 , 2005 . [40] A. PENZIAS and R WILSON , 'A Measurement of excess antenna temperature 4080 Mc/s', A strophys. J. 142, L419 , 1965.
References
237
[41] R DICKE, P. PEEBLES , P. ROLL and D. WILKINSON, 'Cosmic blackbody radiation ', Astrophys. J. 142, 383, 1965. [42] G. SMOOT, The CMB spectrum, in The cosmic Microwave Background, C.H. Lineweaver et al.(eds.) , NATO ASI series 502 , Kluwer Academic Publishers , 1997. [43] F.R BOUCHET, J.L. PU GET and J.M. LAMARRE, The cosmic microwave background, in l 'Univers primordial, Springer , 1999, pp . 103. [44] RB. PARTRIDGE, 3K: the cosmic microwave background, Cambridge University Press, 1999. [45] P.J .E. PEEBLES , 'R ecombination of the primeval plasma', Astrophys. J. 153, 1, 1968. [46] B.J. JO NES and RF. WISE, 'Ionization of the primeval plasma' , Astron. Astrophys. 149, 144, 1985. [47] J.C. MATHER, 'Callibrator design of the COBE far infrared absolute spectrophotometer ', Astrophys. J. 512 , 511 , 1999 . [48] G.F. SMOOT and D. S COTT, 'Cosmic background radiation ', in The review of particle physics, S. E idelman et al. , Phys. Lett. B 592, 1, 2004. [49] D. FIXSEN et al., 'The cosmic microwave backgound spectrum from full COBE FIRAS data set' , Astrophys. J. 473 , 576 , 1996. [50] R SRIANAND, P. P ETITJEAN and C. LEDOUX , 'The microwave background temperature at t he redshift of 2.3371', Nature 408 , 931 , 2000. [5 1] G.F.R ELLIS, R MAARTENS and S.D. NEL , 'The expansion of the Universe', Month. Not . R. Astron. Soc. 184, 439, 1978. [52] M. FUK UGITA and P .J.E. PEEBLES, 'The cosmic energy inventory', Astrophys. J. , 616 , 643 , 2004.
5
The inhomogeneous Universe: perturbations and their evolution
Friedmann Lemaitre spacetimes are approximations t hat give a good description our Universe only on very large scales. In fact , our Universe is not strictly homogeneo and isotropic since it contains structures such as galaxies, galaxy clusters, etc. In t 1980s, it was understood that the largescale structure can result from the gravitatio amplification of the density fluctuations of a pressureless fluid, the cold dark matt see R efs. [1 3J. The aim of this chapter is to study Universes with a geometry 'close' to a Frie mann Lemaitre spacetime and to understand the evolution of the density perturb tions under the influence of gravity in an expanding spacetime. To do so, we will st by recalling the Newtonian limit in Section 5. 1. Then , we will focus on the gene relativistic case in Section 5.2 to present the theory of cosmological perturbation This part, which is quite technical, is one of t he cornerstones of modern cosmolo and is essential in order to address the properties of t he largescale structure of t Universe. In Section 5.3 , we will study some solutions of these perturbat ion equatio The charact eristic scales that should be t aken into account in the general case, and t form of the power spectrum are described in Section 5.4. We will finish in Section with a description of the observations of these largescale structures. R eferences [4 9J can complement this chapter.
5.1
Newtonian perturbations
Before developing t he general case of t he evolution of perturbations in the relativis framework, let us focus on the simplified case of the evolution of perturbations of initially homogeneous fluid , in the Newtonian regime. In fact , as we will see in Section 5.4, such a Newtonian approach is sufficient describe the late time evolution of the density perturbations during the matter era subHubble scales. A general description of gravitational dynamics in the Newtoni framework can be found in R ef. [4J .
5.1.1
Case of a static space
In a static Euclidean space, the hydrodynamic equations take t he usual form [10J o continuity equation and the Euler equation
Newtonian perturbations
OtP + V. (pv) = 0 ,
239
(5.1) 1
Otv+ (v· V)v =  Vp  VcI> ,
(5.2)
P
for a fluid with density p, pressure P and velocity field v , evolving in a gravitational field with potential cI> . 5.1 .1.1
Without gr avi ty
If we neglect gravity and decompose the density in terms of a homogeneous component and a perturbation as p = 15 + op and the pressure similarly as P = P + op, (5.1) and (5 .2 ) reduce to a unique wave equation
O;op  tJ.op = 0.
(5.3)
Let us introduce Xa == (oP/oP)a/P, called the isothermal or adiabatic compressibility coefficient , where a stands either for the t emperature (T) or the entropy (S). When the propagation time of heat is brief, the flow can be considered to be isothermal, whereas it will be adiabatic if there was insufficient t ime to reach equilibrium, which is the case for long wavelengths. In the latter case, we infer that t he wave equation (5.3) takes the form (5.4)
Thus , hydrodynamic perturbations propagate with constant amplitude and with velocity cs ' the speed of sound of the given fluid. 5.1.1.2
Effect of gr avity
Let us now turn on gravity. The gravitational potential is determined by the Poisson equation (5.5) so that the evolution equation for the density fluctuations now has an extra term and takes the form
(5.6) By decomposing the perturbations into plane waves as op ex exp[i(wt  k.x) ], we obtain the dispersion relation (5.7)
AJ is the Jeans length above which any perturbation is unst able and grows exponentially as
240
The inhomogeneous Universe
Qualitatively, the J eans length characterizes the transition from a regime for which t evolution of the perturbations is dominated , at long wavelengths , by gravity (A > AJ to a region (A < AJ) where gravity is negligible and in which we recover sound wav The value of the J eans length can actually be deduced by considering the tw dynamical t imes of the problem. The time associated with sound waves (i. e. the t im taken by this wave to propagate along a distance equal to its wavelength A) is given t sound cv AIcs ' The gravitational collapse t ime taken by a structure wit h density 15 collapse is t gr av cv (GNP)  1/ 2. Hence , at small wavelengths, one has tso und « t grav a the pressure has time to compensate t he gravitational collapse of the fluid , which th remains quasihomogeneous. When A becomes larger than the J eans length, t so und larger than t grav , the pressure can no longer balance gravity and large inhomogeneiti can develop.
5.1.1 .3 Energetic aspects
From an energetic point of view, each p oint of the fluid can be seen as a harmon oscillator whose displacement from its equilibrium position is denoted by 8 . The velocity of each element of the fluid is thus 8 and , using the conservation (5.1 t he density perturbation is op =  V· 8. The equation of propagation (5.7) can th be integrated to give, after averaging over a volume V , t he equation for conservati of energy (5.
where (EK ) = ~ (82 ) is the mean kinetic energy, (EE) = ~c; (V8 2 ), the mean elas energy, and (Ee) = 27rGN 15 (82 ) , the mean gravitational energy. The kinetic ener takes t he form 1
2
(8.2) _
1 W
2
2(8 2)
'
We deduce that the total energy per unit volume is
(EM)
= ~ (82) [w 2 + c; (k2  k3) ] .
The dispersion rela tion teaches us that there is equipartition of t he energy betwe the kinetic energy a nd the potential energy (which has two contributions). The me energy is t hus given by (EM) = (82) (k2  k3) , (5.
c;
and is, as expected , negative for wavelengths larger than the Jeans length. 5 .1. 2
Case of an expanding space
In an expanding space, one can introduce t he co moving coordinates x by
r(t)
=
a(t)x ,
(5. 1
where a(t) is the scale factor. The velocity field is then given by
v(t) = r = Hr where u =
ax is the proper velocity.
+ u,
(5.1
Newtonian perturbations
5.1.2.1
241
Eulerian formulati on
Equations (5 .1 ) and (5 .2) can be rewritten by noticing that , in the previous case of a static spacetime, the time and space derivative operators defined from t and r were considered to be independent. This is no longer the case here, and it is convenient to use space derivatives defined with respect to the comoving coordinates x. Consequently, the gradient V r is replaced by V x/a, and the time derivative Otp(r, t), that previously assumed constant r , becomes Otp(x , t)  Hx . V xp(x , t). The two equations are thus reexpressed in co moving coordinates as .
p(x , t)
+ 3Hp(x , t) + 1 V x [p(x , t)u(x , t)] a
=
(5 .1 2)
0,
and
u(x , t)
1
+ Hu(x , t) +  (u· a
where we have used that V x
V x) u(x , t)
1
1
a
ap
=   V x(x , t)   V xP(x, t),
(5.13)
= 3. If t he density contrast, 6, is defined by
. X
p(t)[l
p =
+ 6(x , t) ],
(5.14)
then the first equation can be rewritten as the continuity equation .
1
6 +  V [(1 + 6)u] = 0,
(5. 15)
a
where from now on, unless otherwise stated , we omit the reference to co moving coordinates that is implicit in what follows. After some manipulations, an evolution equation for 6 can be deduced from these equations
.. . 1 1 1 [ . "] 6+2H6 = 211P +2V. [(1+6) V ]+2oiOj (1 + 6)u"uJ pa
a
a
.
(5. 16)
Note that this equation does not m ake any assumptions on 6 and in particular it does not assume 6 to be small compared to 1. The gravitational potential is determined by the Poisson equation, which then takes the form
(5.17) Solving (5.17) , one gets that the gravitational potential is given by
(x , t )

=  G NPa
2
J
3 I 6(XI , t) d x Ix _ xi i'
(5. 18)
These equations provide a Eulerian description of the dynamics. 5.1. 2.2
Lagrangian description
A Lagrangian approach can also be developed by following the trajectories of particles or fluid elements instead of considering the dynamics of the density and velocity fields.
242
The inhomogeneous Universe
The central quantity is then the displacement field X(q, t) , which maps the initia position of a particle labelled q into its Eulerian position at time t according to
x(t) = q
+ X (q , t).
The equation of motion for particle trajectories x(t) is then
or equivalently in terms of the conformal time TJ
x" + Hx' =  V",
Taking the divergence of this equation, one obtains an equation for the displacemen field
(5.19
where we have made use of the Poisson equation and of the fact that 15d 3 q b(x , t)]d 3 x, which implies that 1 + b(x, t)
=
1
det(bij
= 15[1 +
1 + Xi,j)  J(q, t)'
 
where X i, j == 8Xd 8qj, and J(q , t) is the Jacobian of the transformation from Euleria to Lagrangian space. Note that the gradient in Eulerian coordinates can be transforme to a gradient in Lagrangian coordinates as
Such a description can be useful for some problems and more details can be foun in Refs. [11 , 12]. Note, however, that when there is shell crossing, that is when th fluid elements with different initial positions q end up at t he same Eulerian positio x, the Jacobian vanishes and the density field becomes singular. At these points, th description of the dynamics in terms of a mapping no longer holds . 5.1.2.3
Linearized systems
Let us now suppose that the fluid density is only weakly perturbed with respect t the homogeneous configuration. This translates into the two conditions b « 1,
(dut)2 «b,
where u is the characteristic fluid velocity and d the co~ere nce length of density per turbations. The second condition implies that the gradients must be small. Let u recall that in a flat Universe with no cosmological constant, the Friedmann equatio implies that 67fG N 15t 2 = 1 during the matterdominated era (since a ex: t 2 / 3 ) .
Newtonian perturbations
243
By linearizing (5.16), we get the propagation equation (5.20) We recover the J eans length defined by (5.7). Unlike the case of a static Universe, the Jeans length now depends on time via 15. A perturbation with given wavelength will thus be able to pass from a soundwave regime to a gravitational collapse regime , and this at a time depending on the value of the wavelength. 5.1.2.4
Growth of density p erturbations
For a fluid of matter with negligible pressure, (5.20) is a second order linear differential equation involving only time. One can therefore look for solutions of the form
where the functions dx) correspond to the initial density field. We will refer to D+ as the growing mode and to D _ as the decaying mode. The functions D are solutions of the equation (5.21 ) where we have expressed the mean matter density as 41rG NPm (t) = ~ H2 flm(t). For a flat Universe with no cosmological constant (flrn = fl mo = 1, a ex t 2 / 3 ) , one can check that (5.22) In the general case, this equation needs be integrated numerically. A convenient way to solve it is to use the scale factor , a, as the time coordinate, so that (5.21) takes the form d D (~dH +~ ) dD _ ~flrno D = 0 (5.23) da 2 H da a da 2 a5 Ho '
2+
(li) 2
where we have used the fact that 41rG Np = 41rGNPrnoa 3 and assumed that ao = 1. One can check that D _ = H is a solution. The second solution can then be obtained from the method of variation of parameters as
r
D (a)  ~ H(a) fl da' +  2 Ho rnO o [a' E(a,) ]3 '
(5.24)
J
where E(a) = H(a) / Ho is given in (3 .39). The normalization constant has been chosen so that in a fl at matterdominated Universe, we recover D+ = a. As an example, in a Universe with A = 0 and flrn < 1, it can be shown that the decaying and growing modes are given by D

=
J1+X x 3
'
3 D+ = 1 +  + 3D _(x) In x
[vT+x  Fx] ,
where x == fl;:;/  1. Note that when flrn > 0, that is when x D_ '" l/ x so that the perturbations cease to grow.
»
1, D+
(5.25) >
1 and
244
The inhomogeneous Universe
0.4 0.2 0.2
0.4
0.6 a/ao
0.8
1
Fig. 5.1 Evolution of t he growing mode of the density contrast, D + (a) , as a fun ctio of the scale factor , a , for n mo = 1 [solid line], (n mo , n AO ) = (0.3 , 0.7) [dashed line] an (n mo , nAO) = (0.5,0) [dotted line].
In more general cases, (5.24) for D + has to be integrated numerically, as illustrate in Fig. 5.1. It has been shown [13] that for a ACDM model it is well approximated b
(5.26
Actually, when K = 0, it can be checked [14], using the solution (3.46) of the Friedman equations, that
where a = HoVDi\o = 5. 1.2. 5
J A/3 and 2Fl
is a hypergeometric function.
The Hubble parameter from the growth fun ction
Equation (5.23) can also be seen as a function for E(a) once rewritten as 2
2
dE+ 2 [ 3 + d D (dD)l]  l =0. E 2  3Dmo D (dD) da a da 2 da a5 da
This firstorder equation for E2 is easily integrated to give
E 2(z)
=
dD)  3D mo (1 + z )2 ( d z
2
J OO  D dD d z, z
1 + z dz
(5 .27
in terms of the redshift 1 + z = ao / a. It follows that for the class of ACDM model there exists a rigidity between the background dynamics and the growth of largesca structure. Such a feature can be useful to test this model (see Chapter 12) .
Newtonian perturbations
245
5.1 .2.6 Density  velocity relation Since 6 ex D(t) , (5. 15), implies that 1
e(x, t) == aH V . u(x, t) =  f(Omo, OAO)6,
(5.28)
_ dlnD(a) f(O mO, OAO) = d In a .
(5.29)
with
There exist many analytical approximations for this function. For instance, Ref. [13] suggests
f+(OmO, OAO)
~ o~8 + ~~O
(1 +
~Omo) .
e
Note that represents the local fluctuations of the Hubble constant. This is an observable quantity, independent of the value of the Hubble constant. Using (5.13), it follows that the velocity field has the general form
u
=
~ f+(OmO , OAO) V