d + gluon via a t quark loop (see the lectures of Rosner). Thus, the decay amplitude is given as

$$A(B \to \pi^+\pi^-) = T e^{-i\gamma} + P e^{i\beta} e^{i\Delta}, \qquad (12)$$

since the penguin is proportional to $V_{tb} V_{td}^*$ and so has the phase $\beta$. Estimates based on the rate of $B \to K\pi$ [6] suggest $P/T$ as high as 0.4. Assuming the strong phase $\Delta$ is small, then
p+ TT~ and p~ n+. An alternative would be to consider decays that are dominated by the b —> d penguin. In this case 0) - iV(sin>cos0 < 0) * ~ N{sm4>cos(f) > 0) + 7V(sin!)cos0 < 0)' « / d2x0 ) = 1 and J{BS) = 0, the final state can have orbital angular momenta t = 0,1, and 2. Exercise: Show that I = 0,2 corresponds to even, and 1=1 to odd CP. 9-4 to be of comparable magnitude. 6.1.2
$$\frac{2TP\sin(\beta+\gamma)\sin\Delta}{T^2 + P^2 + 2TP\cos(\beta+\gamma)\cos\Delta}.$$

For $\beta + \gamma \approx 90°$ this gives (with $P/T = r$)

$$\frac{2r}{1+r^2}\sin\Delta.$$

Thus a very large CP-violating asymmetry is possible if $r$ is greater than 0.3, but it all depends on the strong phase $\Delta$. It is very difficult to make definite statements about the strong phases in B decays, in contrast to K decays. For $K \to \pi\pi$ the final state interaction can be thought of as elastic scattering with phase shifts $\delta_2$ and $\delta_0$ corresponding to $\pi\pi$ states with $I = 2$ and $I = 0$. However, $\pi\pi$ s-wave scattering at 5 GeV is highly inelastic, involving many channels. The phase $\Delta$ arises from the absorptive parts of diagrams corresponding to the strong scattering from other final states into the $\pi\pi$ state. For any weak interaction operator $O_i$ we can define the real decay amplitude in lowest order

$$A_{fi} = M^0_{fi} = \langle f|O_i|B\rangle.$$

If $f$ were an eigenstate one would then multiply this by $e^{i\delta_f}$. Going from the eigenstate basis to the states of interest,

$$M_{fi} = \sum_{f'} \langle f|S^{1/2}|f'\rangle \langle f'|O_i|B\rangle, \qquad (14)$$

where $S$ is the strong-interaction S matrix. The sum in Eq. (2.7) is over a large number (almost uncountable) of states. One can only make some general comments about it:

1. The strong phase depends on the operator $O_i$, which affects the relative importance of different states $f'$. The phase $\Delta$ in Eqs. (2.5) and (2.6) is the difference between the strong phase of the "tree" operator and that of the "penguin".

2. Since the strong scattering is expected to be very inelastic, the diagonal element $\langle f|S^{1/2}|f\rangle$ has as its major effect the reduction of $M_{fi}$; this is a kind of absorption effect. Thus we could write

$$M_{fi} = M^0_{fi}\, a_i + i R_i = |M_{fi}|\, e^{i\delta_i},$$

where $a_i < 1$ is the reduction due to absorption. For a "typical" state, by unitarity, the scattering "in" due to $R$ compensates for the scattering "out" so

$$R_i = \sqrt{1 - a_i^2} = \sin\delta_i. \qquad (15)$$
3. An estimate can be based on a crude statistical argument [7], in which case one can reduce the multichannel problem to an equivalent 2-channel problem

$$S = \begin{pmatrix} \cos 2\theta & i \sin 2\theta \\ i \sin 2\theta & \cos 2\theta \end{pmatrix}.$$

For $f = \pi\pi$ the diagonal element $\cos 2\theta$ can be estimated by extrapolating data from $\pi N$ scattering to $\pi\pi$, giving $\cos 2\theta = \eta \approx 0.7$. Then Eq. (2.7) becomes

$$M_{1i} = \cos\theta\, A_{1i} + i \sin\theta\, A_{2i}, \qquad \tan\delta_i = \tan\theta\, \frac{A_{2i}}{A_{1i}}, \qquad (16)$$

where

$$\tan\theta = \left(\frac{1-\eta}{1+\eta}\right)^{1/2} \approx 0.42.$$
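As a quick numerical check of the two-channel estimate, the short sketch below evaluates $\tan\theta = [(1-\eta)/(1+\eta)]^{1/2}$ and the resulting strong phase $\tan\delta = \tan\theta \cdot A_2/A_1$. The function names and the choice of amplitude ratios are illustrative assumptions, and the amplitudes are taken to be real.

```python
import math


def tan_theta(eta: float) -> float:
    """tan(theta) from the diagonal S-matrix element eta = cos(2*theta)."""
    return math.sqrt((1.0 - eta) / (1.0 + eta))


def strong_phase_deg(eta: float, amp_ratio: float = 1.0) -> float:
    """Strong phase in degrees from tan(delta) = tan(theta) * A_2/A_1.

    amp_ratio stands for A_2/A_1 and is assumed real (hypothetical input).
    """
    return math.degrees(math.atan(tan_theta(eta) * amp_ratio))


if __name__ == "__main__":
    eta = 0.7  # value extrapolated from pi-N scattering in the text
    print(f"tan(theta)             = {tan_theta(eta):.3f}")
    print(f"phase for A_2 = A_1    = {strong_phase_deg(eta):.1f} deg")
    print(f"phase for A_2 = A_1/3  = {strong_phase_deg(eta, 1.0 / 3.0):.1f} deg")
```

Lowering the ratio $A_2/A_1$ in this sketch shrinks the phase, which is the sense in which a "more probable" final state can have a small strong phase.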
The "typical" result corresponds to $A_{2i} = A_{1i}$ and gives a strong phase of 20° to 25°. The quantitative conclusion from Eq. (2.9) is that if the state of interest (labeled 1 here) is a "more probable" final state than the states into which that state scatters (lumped into state 2 here), then the strong phase may be small. In conclusion, after it is clearly shown that CP violation is not superweak, the next step is to find quantitative tests of the Standard Model by showing the consistency of a number of different experiments. This is a major program for the next decade.

3 Time Reversal Violation
By the CPT theorem, CP violation implies time-reversal violation (TRV). Strong evidence for CPT invariance comes from the phase of $\epsilon$ determined from Eq. (1.5). CPT violation would allow a real off-diagonal term $m''$ in the matrix in addition to $im'$ and thus would change the phase. Since the measured phase agrees with theory to about 1°, there is a strong limit on $m''$, which corresponds to a limit $[m(K^0) - m(\bar K^0)]/m_K < 10^{-18}$. Nevertheless, it is of great interest to look for direct evidence for TRV, both as another way to study CP violation and as a way to demonstrate T violation in a straightforward way [8]. We discuss here four types of direct evidence for TRV; by this I mean a single experiment that by itself is seen to violate T. These are
1. A non-zero value of a T-odd observable in a stationary state. The simplest example is the electric dipole moment of an elementary particle or an atom.

2. A violation of the reciprocity condition on the S matrix, $S_{fi} = S_{-i,-f}$, corresponding to comparing a reaction and its inverse.

3. A non-zero value of a T-odd observable in the final state of a weak decay. As discussed below, this depends upon the neglect of final-state interactions.

4. In an oscillation, a difference in the probability of $a \to b$ from $b \to a$ at a given time.

It is interesting to note that each example immediately implies a test of CP violation (conceptual if not practical) by going to the anti-particles. In contrast, the simplest tests of CP violation have no direct relation to TRV; for example, $\Gamma(B \to f) \neq \Gamma(\bar B \to \bar f)$ involves a rate which has nothing to do with a TRV observable.

Experimental limits on the dipole moments of the electron and neutron are $d_n < 10^{-25}$ e-cm and $d_e < 4 \times 10^{-27}$ e-cm. In the Standard Model $d_n$ is second-order weak and the calculation depends on long-distance effects, giving of order $10^{-32}$ e-cm; $d_e$ is third order and perhaps $10^{-38}$ e-cm. Thus the interest lies in the search for physics beyond the Standard Model (see the lecture by Thomas).

Many tests of the reciprocity relations exist for strong interactions, although they are of limited accuracy; it is very hard to study the reverse of weak interactions. An interesting proposal by Bowman involves the scattering of slow neutrons from polarized nuclei in the resonance region. One compares the observable $\langle \sigma_n \cdot I \times k \rangle$ for incident polarization $\sigma_n$ and final polarization $\sigma_n$, where $I$ is the nuclear polarization. T-violating effects in the nuclear wave functions could enhance the result.

An example of a "T-odd observable" in the final state of decay is the muon polarization

$$P = \langle \sigma_\mu \cdot k_\mu \times k_\nu \rangle$$

in the decay $K \to \pi + \mu + \nu$. This does not really violate T except in the Born approximation, when final state interactions (FSI) can be avoided. As a simple didactic example, consider the scattering from a potential

$$V_0 + V_1\, \sigma \cdot L,$$

which certainly is T-invariant. The resulting amplitude is

$$A = f_0 + i f_1\, \sigma \cdot \hat n,$$

where $\hat n$ is the normal to the scattering plane. In the Born approximation $f_0$ and $f_1$ are real and so $\langle \sigma \cdot \hat n \rangle = 0$, but beyond the Born approximation $f_0$ and $f_1$ are complex; for example, if s and p waves dominate, $f_0$ would have a phase $e^{i\delta_s}$ and $f_1$ the phase $e^{i\delta_p}$. For the case of $K^0 \to \pi^+ + \mu^- + \bar\nu_\mu$ there is a Coulomb FSI so that, without T violation, $P \sim 10^{-3}$; for the case of $K^+ \to \pi^0 + \mu^+ + \nu_\mu$ the FSI involves $2\gamma$ exchange so that $P \sim 10^{-6}$. In the Standard Model the real TRV is expected to vanish in semi-leptonic decays.

One would expect a real TRV in non-leptonic decays such as $\Lambda \to p\pi^-$, where there is a defined parameter $\beta \propto \langle \sigma_\Lambda \cdot \sigma_p \times k \rangle$. However, the FSI effect is much larger, being proportional to $\sin(\delta_p - \delta_s)$, where $\delta_p$, $\delta_s$ are the $\pi$-$p$ phase shifts. If the experiment is also done with $\bar\Lambda$, then $\beta + \bar\beta$ is a clear CP-violating effect and is associated with true TRV, but this is hardly "direct evidence" of TRV.

A large "T-odd observable" has been found in the decay $K_L \to \pi^+\pi^- e^+e^-$,

$$C = \langle (\hat n_e \times \hat n_\pi \cdot \hat z)(\hat n_e \cdot \hat n_\pi) \rangle,$$

where $\hat n_e$ ($\hat n_\pi$) are the normals to the $e^+e^-$ ($\pi^+\pi^-$) planes and $\hat z$ is the unit vector between the pairs. This was predicted as a result of $K$-$\bar K$ mixing as an interference between an M1 virtual $\gamma$ from $K_2 \to \pi\pi\gamma$ and an E1 virtual bremsstrahlung from $K_1 \to \pi\pi\gamma$. The theoretical result [9] is

$$C = 0.15 \sin(\phi_\epsilon + \Delta),$$

where $\Delta \approx 30°$ comes from $\pi\pi$ phase shifts. The experimental result [10] verifies this; the result is so large because, for the $e^+e^-$ energy considered, the E1 is much larger than the M1, which compensates for the small admixture $|\epsilon|$ of $K_1$. Since $\Delta$ is involved, this again is not obvious TRV. It is of didactic interest to consider the limit $\Delta \to 0$. In this case $C$ is proportional to $\sin\phi_\epsilon$. We know that

$$\frac{\Gamma(\bar K^0 \to K^0) - \Gamma(K^0 \to \bar K^0)}{\Gamma(\bar K^0 \to K^0) + \Gamma(K^0 \to \bar K^0)}$$

independent of time. This seems somewhat strange since we expected an odd function of time. One can also ask, from unitarity, if $\bar K^0$ goes to $K^0$ more than $K^0$ goes to $\bar K^0$, what compensates for this. The answer is that the $K^0$ decays to $\pi\pi$ more than the $\bar K^0$. Thus, decay plays an essential role, rendering this as a direct test of TRV somewhat questionable.

As we have noted, the phase of $\epsilon$ is completely consistent with CPT invariance. There is no reason to doubt CPT invariance, which appears very fundamental, and so we conclude that the observed CP violation is associated with TRV. Nevertheless, unambiguous direct tests of TRV may prove very difficult.
Acknowledgments

Many of the early papers on CP violation are reprinted in L. Wolfenstein, CP Violation (North Holland) (1989). This work was supported by the U.S. Department of Energy under Grant No. DE-FG02-91ER40682.

References

1. In the original paper on the V - A theory, Feynman emphasizes the elegance of using chiral fields: R.P. Feynman and M. Gell-Mann, Phys. Rev. 109 (1958), 193.
2. More exactly, there are also CP-violating parameters describing $K \to 3\pi$ decays which affect $K \to 2\pi$ via the off-diagonal terms in the $\Gamma$ matrix. These can affect the phase of $\epsilon$ in the Standard Model, probably by less than 1°. A more general treatment of the phase of $\epsilon$ is given by the Bell-Steinberger relation, Proc. Oxford Conference (1965), 195.
3. M. Kobayashi and T. Maskawa, Prog. Theor. Phys. 49 (1973), 652.
4. For a general discussion and motivation of effectively superweak theories, see S.M. Barr, Phys. Rev. D 34 (1986), 1567.
5. For a derivation see H. Quinn in Review of Particle Properties, Euro. Phys. Journal 3, pp 555-562 (1998).
6. J.P. Silva and L. Wolfenstein, Phys. Rev. D 49, R1151 (1994); R. Fleischer, hep-ph/0001253. More recent data from BABAR suggest this may be an overestimate.
7. M. Suzuki and L. Wolfenstein, Phys. Rev. D 60 (1999), 074019.
8. See L. Wolfenstein, Int. J. Mod. Phys. E8 (1999), 501.
9. L.M. Sehgal and M. Wanninger, Phys. Rev. D 46 (1992), 1035; Erratum, ibid. D 46 (1992), 5209.
10. A. Alavi-Harati et al., Phys. Rev. Lett. 84 (2000), 408.
11. A. Angelopoulos et al., Phys. Lett. B444 (1998), 43.
Young-Kee Kim
PRECISION ELECTROWEAK PHYSICS
Young-Kee Kim
University of California at Berkeley & Lawrence Berkeley National Laboratory
Berkeley, California 94720 USA
E-mail: [email protected]

These three lectures review the experimental state of electroweak interactions and the search for electroweak symmetry breaking.
1 Introduction

One of the central problems in particle physics is to understand the origin and values of the particle masses. In the Standard Model, particles acquire their masses by interacting with Higgs scalars. Thus uncovering the secrets of the Higgs sector is the focus of much present and future experimental research. The Higgs sector can be probed by precision electroweak measurements such as $M_Z$, $\sin^2\theta_W$, $M_W$, and $M_{\rm top}$ by means of quantum corrections. However, the nature of the symmetry breaking sector can only be established by its direct discovery and detailed study. The first lecture (Section 2) surveys the accelerators that produce heavy particles such as the W, Z, and top quark, and the detectors that identify them and measure their properties. The second and third lectures review the precision measurements that probe the Higgs sector (Section 3) and the current and future searches for the Higgs boson (Section 4).

2 Survey of Accelerators and Detectors

2.1 Colliders versus Fixed-target machines
Suppose that we wish to study the Z boson. This requires a centre-of-mass energy of ~90 GeV. In a symmetric $e^+e^-$ collider ($e^+$ energy = $e^-$ energy), this would require a beam energy of ~45 GeV. Assuming a quark in a proton carries about 1/6 of the proton energy on average (see Figure 1), a proton and anti-proton collider would require a beam energy of ~270 GeV [a]. In order to achieve the same goal, a proton fixed-target machine would require a proton beam energy of ~20 TeV.

[a] Note that the energy loss due to synchrotron radiation is much larger for electrons than for protons ($E_{\rm sync} \sim 1/m^4$ in a circular accelerator ring); thus it is easier to accelerate protons than electrons.
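The collider numbers quoted above follow from simple kinematics, sketched below. The function names are illustrative assumptions. For a symmetric collider the constituent-level energy is taken as $2xE_{\rm beam}$, with $x = 1$ for electrons and $x = 1/6$ for quarks as assumed in the text. The fixed-target function only returns the full proton-proton centre-of-mass energy; turning that into a parton-level energy needs an assumption about momentum fractions that the text does not spell out, so the ~20 TeV figure is not rederived here.

```python
import math

M_PROTON = 0.938272  # GeV, proton mass


def collider_beam_energy(sqrt_s_hat: float, x: float = 1.0) -> float:
    """Beam energy (GeV) of a symmetric collider in which each colliding
    constituent carries a fraction x of the beam energy:
    sqrt_s_hat = 2 * x * E_beam."""
    return sqrt_s_hat / (2.0 * x)


def fixed_target_sqrt_s(e_beam: float,
                        m_beam: float = M_PROTON,
                        m_target: float = M_PROTON) -> float:
    """Full centre-of-mass energy (GeV) for a beam particle of total energy
    e_beam hitting a target particle at rest."""
    return math.sqrt(m_beam**2 + m_target**2 + 2.0 * e_beam * m_target)


if __name__ == "__main__":
    print("e+e- beam energy for 90 GeV :", collider_beam_energy(90.0), "GeV")
    print("p pbar beam energy (x = 1/6):", collider_beam_energy(90.0, 1.0 / 6.0), "GeV")
    print("pp sqrt(s) for a 20 TeV fixed-target beam:",
          round(fixed_target_sqrt_s(20000.0), 1), "GeV")
```

The square-root growth of the fixed-target centre-of-mass energy with beam energy, against the linear growth for colliders, is the kinematic reason colliders are preferred at high energies.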
Figure 1: Quark and gluon energy distribution functions at $\mu^2 = 10$ GeV$^2$, where $x$ is the momentum fraction carried by a quark or gluon. Figure courtesy of "QCD and Collider Physics" by R.K. Ellis, W.J. Stirling and B.R. Webber.
With the great technological achievements of accelerator physics in colliding one high-energy beam with another, colliders satisfy our need to study elementary processes at high energies better than fixed-target machines.

2.2 $e^+e^-$ colliders versus Hadron colliders
The highest beam energy achieved in an $e^+e^-$ collider is 104 GeV by LEP-2 (LEP stands for Large Electron Positron collider) at CERN, and the highest energy achieved in a hadron collider is 900 GeV [b] by the Tevatron at Fermilab. The beam energy in the LHC (Large Hadron Collider), which will begin in about 2006, will be 7 TeV. Since electrons are elementary particles, all of their energy can be available to produce new particles; thus energy up to 208 GeV is available at LEP-2. The available energy is much larger at the Tevatron hadron collider, as demonstrated in Figure 2, although the cross-section decreases rapidly with the energy. With large luminosity, hadron colliders would be good machines to discover massive particles. Note that the 3 heaviest particles, W, Z, and top quark, were discovered at hadron colliders (see Table 1) [1,2]. On the other hand, $e^+e^-$ colliders have advantages such as their clean environment, which makes precision measurements much easier, and the fact that the final state particles are mostly detected, so that energy-momentum conservation can be used as a tool to improve the measurement resolution. $e^+e^-$ colliders and hadron colliders are thus complementary to each other.

[b] For the upcoming Tevatron run (Tevatron Run II), the proton and anti-proton beam energy will be 980 GeV.

Figure 2: Jet $E_T$ distribution from the CDF collaboration at the Tevatron. (Plot labels: CDF data on jet $E_T$ distribution, $0.1 < |\eta| < 0.7$, $R = 0.7$, statistical errors only; theory: MRS(D0) partons; horizontal axis: $E_T$ [GeV].)

2.3 Production of W, Z, and Top quark
Z production at LEP-1/SLC and W production at LEP-2

The experimental study of the process $e^+e^- \to Z \to f\bar f$ began in 1989 at LEP and at the SLC at SLAC. The LEP program completed data-taking at the Z resonance in 1995 (LEP-1), and the SLC program finished in 1998. The total LEP event sample consists of $15.5 \times 10^6$ $Z \to q\bar q$ and $1.7 \times 10^6$ $Z \to \ell^+\ell^-$ events collected at ~7 energies about the Z mass. The total SLC sample consists of 557,000 events collected with left-handed and right-handed electron beams. Typical Z candidates recorded are shown in Figure 3.

Table 1: Discovery of Fermions and Gauge Bosons

Type         Particle  Mass (GeV)     Year  Discovered by                     Method
Lepton       e         0.00051        1898  J.J. Thomson
             e+        0.00051        1931  Anderson                          cosmic ray
             μ         0.106          1937  Anderson et al.                   cosmic ray
             τ         1.777          1975  Martin Perl                       e+e- collider
             ν_e       ~0             1953  Cowan & Reines                    fission reactor
             ν_μ       ~0             1962  Lederman, Schwartz, Steinberger   p accelerator
             ν_τ       ~0             2000  DONUT                             p accelerator
Quark        d         0.003-0.009
             u         0.001-0.005
             s         0.075-0.170    1947  Rochester & Butler                cosmic ray
             c         1.15-1.35      1974  Mark I & BNL-E-0598               e+e- coll. & p accel.
             b         4.0-4.4        1977  FNAL-E-0288                       p accelerator
             t         174.3±5.1      1995  CDF & D0                          pp̄ collider
Gauge Boson  γ         0
             g         0              1979  Tasso / Mark J                    e+e- collider
             W         80.419±0.056   1982  UA1 & UA2                         pp̄ collider
             Z         91.188±0.002   1982  UA1 & UA2                         pp̄ collider

Starting in 1996, LEP increased the beam energy to produce W boson pairs (LEP-2). Three diagrams contribute to W pair production. The neutrino exchange diagram dominates near the W pair threshold, $\sqrt{s} \approx 2M_W \approx 161$ GeV. As $\sqrt{s}$ increases, the contributions of the other two diagrams get larger, but there is a negative interference between them. This is illustrated in Figure 4, which shows the expected cross-section with the full Standard Model structure; if one or both of the diagrams with triple vector boson couplings are omitted, unitarity is violated. Figure 5 shows a $W^+W^-$ candidate recorded at the LEP-2 experiments.

The cross sections in $e^+e^-$ annihilation with $\sqrt{s} < 700$ GeV are summarized in Figure 6. Dominant processes are $e^+e^- \to \gamma^* \to f\bar f$ at $\sqrt{s} < M_Z$, $e^+e^- \to Z \to f\bar f$ at $\sqrt{s} \approx M_Z$, and $e^+e^- \to W^+W^-$ at $\sqrt{s} > 200$ GeV.
Figure 3: An $e^+e^- \to Z \to \mu^+\mu^-$ candidate (Top) recorded at the OPAL experiment and a candidate (Bottom) recorded at the L3 experiment at LEP-1.

Figure 4: LEP-2 average W pair production cross-section, corrected to correspond to the three doubly-resonant W production diagrams. (Plot labels: LEP Preliminary, 02/03/2001; curves for RacoonWW / YFSWW 1.14, no ZWW vertex (Gentle 2.1), and only $\nu_e$ exchange (Gentle 2.1); horizontal axis: $E_{\rm cm}$ [GeV], 160 to 210.)

Figure 5: An $e^+e^- \to W^+W^-$ candidate recorded at the DELPHI experiment at LEP-2, where the $W^+$ and $W^-$ each decay into a pair of quarks, giving a four-jet topology.

Figure 6: Cross sections in $e^+e^-$ annihilation. (Horizontal axis: $\sqrt{s}$ [GeV].)
Z and W production at the Tevatron

The primary production of Z and W bosons at the Tevatron arises from the reactions $u\bar u \to Z$, $d\bar d \to Z$, $u\bar d \to W^+$, and $\bar u d \to W^-$, where the up and down quarks (and antiquarks) can be valence or sea quarks in the proton [c]. (Diagram: a quark carrying energy $x_1 E$ from one beam particle of energy $E$ annihilates with an antiquark carrying energy $x_2 E$ from the other.)

Although both Z and $W^\pm$ are produced through quark-antiquark annihilation, the dominant contribution is not from valence-valence collisions but from valence-sea collisions. The typical $q\bar q$ center-of-mass energy $\sqrt{\hat s}$ for W, Z production is the mass of the boson, $\sqrt{\hat s} \approx M_{Z,W}$. Since these boson masses are around 100 GeV, about 1/20 of the $p\bar p$ center-of-mass energy, both valence and sea quarks have a good probability of carrying a sufficient fraction of the proton's energy to produce a gauge boson (see Figure 1). The valence-sea production mechanism is about 4 times larger than the valence-valence and sea-sea production mechanisms. It is coincidental that the valence-valence and sea-sea mechanisms are about equal at this energy. At higher energies such as the LHC, the sea-sea mechanism dominates; at lower energies such as the $Sp\bar pS$, where the center-of-mass energy was 560 GeV, the valence-valence mechanism dominates.
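The shift from valence-valence to sea-sea production with collider energy can be made concrete by estimating the typical parton momentum fraction. The sketch below assumes both partons carry the same fraction, so that $x_1 x_2 s = M^2$ gives $x \approx M/\sqrt{s}$; the function name and the 14 TeV LHC centre-of-mass energy (twice the 7 TeV beam energy quoted earlier) are illustrative assumptions.

```python
def typical_x(mass_gev: float, sqrt_s_gev: float) -> float:
    """Typical parton momentum fraction needed to produce a state of the
    given mass when both partons carry the same fraction:
    x1 * x2 * s = mass**2  ->  x = mass / sqrt(s)."""
    return mass_gev / sqrt_s_gev


if __name__ == "__main__":
    m_z = 91.188  # GeV, Z mass from Table 1
    for name, sqrt_s in [("SppS", 560.0), ("Tevatron", 1800.0), ("LHC", 14000.0)]:
        print(f"{name:9s} sqrt(s) = {sqrt_s:7.0f} GeV  ->  x ~ {typical_x(m_z, sqrt_s):.4f}")
```

At the Tevatron this gives $x$ close to 1/20, as stated in the text; the smaller the required $x$, the more the production is fed by the sea-quark distributions of Figure 1.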
Top-quark production at the Tevatron

The Tevatron has been (and will be until the LHC turns on) the unique place to produce top quarks. The dominant top-quark production mechanism at the Tevatron is the annihilation of a valence quark and a valence antiquark into a gluon, which then decays into a $t\bar t$ pair. One of the $t\bar t$ candidates recorded at the Tevatron is shown in Figure 7. The dominant top-quark production mechanism at the LHC (a proton-proton collider) will be the annihilation of a gluon and a gluon into a gluon, which in turn decays into a $t\bar t$ pair.

[c] The proton consists of three valence quarks (uud), which carry its electric charge and baryon quantum numbers, and an infinite sea of light $q\bar q$ pairs.
Figure 7: A $t\bar t$ candidate recorded at the CDF experiment, where $t \to W^+ b \to e^+\nu_e b$ and $\bar t \to W^- \bar b \to q\bar q\,\bar b$. (Event display labels: CDF Top Event; M = 79 GeV/c$^2$; Jet 2, Jet 3.)
Summary

Table 2 summarizes the number of W, Z and $t\bar t$ events identified at LEP-1, LEP-2, SLC and the Tevatron.

2.4 Survey of W, Z, and Top quark Properties
The W, Z, and top particles (and the other elementary particles in the Standard Model) are structureless up to our current resolution, which is about $10^{-16}$ to $10^{-17}$ cm. Note that the top-quark mass is enormous, as heavy as a gold atom, which consists of 79 protons and 118 neutrons, and yet it is structureless. Coincidentally, the top-quark mass is about the sum of the Z boson mass and the W boson mass. The top quark and the W and Z bosons decay immediately after they are produced; their lifetimes, provided by their width measurements, are around $10^{-25}$ s. With current techniques, we cannot trace their decays in space.

Table 2: The number of W's, Z's and top quarks.

Heavy Particle  Accelerator  C.M. Energy     # of events
Z               LEP-1        ~91 GeV         17,221,000
                SLC          ~91 GeV         557,000
                Tevatron     1.8 TeV         9,000
W               LEP-2        132 ~ 208 GeV   10,000
                Tevatron     1.8 TeV         180,000
top             Tevatron     1.8 TeV         100

Z's decay into a fermion (a quark or a lepton) and its anti-fermion, and W's decay into a fermion and its weak interaction partner: $W^+ \to e^+\nu_e$, $\mu^+\nu_\mu$, $\tau^+\nu_\tau$, $u\bar d$, and $c\bar s$. Top quarks decay about 100% of the time into a W and a b quark, the weak interaction partner of the top quark ($t \to W^+ b$). B hadrons (hadrons containing b quarks) live long, $1.5 \times 10^{-12}$ s, on the elementary particle scale. For instance, b quarks from top decay, which have a momentum of ~50 GeV, travel about 4 mm before decaying, which is large enough to trace in space. This is demonstrated in Figure 7 by the displaced vertices (secondary vertices).

2.5 Detectors
In order to identify various types of particles such as electrons, muons, taus, neutrinos, and b-quarks, which are the products of Z, W, and top decays, a typical experimental detector consists of layers of devices as shown in Figure 8. Imagine you are riding on a particle that was just produced by the collision of a proton and an anti-proton or an electron and a positron. It encounters a thin beam pipe. It then zips through a silicon device, capable of resolving very tiny distances (~100 $\mu$m), thus identifying b-quarks in the event as shown in Figure 7. After the silicon device, it zips through a gas containing an immense number of very thin gold wires. The particle passes ~100 of these wires. If the particle is charged, each nearby wire records its passage, and the particle's path is determined (see Figures 3, 5, 7). A measurement of the curvature (this device and the silicon device are typically inside a magnetic field $B$ produced by a solenoid) gives us the momentum of the particle:

$$p = \frac{qBr}{c},$$

where $q$ is the charge of the particle, $r$ is the inverse curvature (the radius of curvature), and $c$ is the speed of light. If the particle is neutral, there will be no signal on the wires.

Next the particle passes through the coil of the solenoid magnet. It then passes into a calorimeter section, which measures the energy of particles other than muons and neutrinos. Different particles interact differently with matter, and thus with the calorimeter. If it is an electron, it fragments on a series of closely spaced thin lead plates with scintillator between the plates, giving up its entire energy in 3 or 4 inches. If it is a hadron, it penetrates 10 to 20 inches of calorimeter material before exhausting all of its energy through nuclear collisions. If it is a muon, it zips through the calorimeter sections and leaves hits in a gas containing thin gold wires, which is located outside the calorimeter. Neutrinos leave the detector entirely, leaving behind not even a hint of their fragrance. The system stores the data with about one million bits of information for each event.

Figure 8: The beam's view of a typical detector.
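Two of the numbers behind this detector design can be reproduced with a short sketch. In SI units the curvature relation reads $p = qBr$, which for a unit charge becomes $p\,[\mathrm{GeV}/c] \approx 0.3\, B\,[\mathrm{T}]\, r\,[\mathrm{m}]$, and the mean decay length of a particle of momentum $p$, mass $m$ and lifetime $\tau$ is $L = (p/m)\,c\tau$. The function names, the 1.5 T field, the 10 m radius and the B-hadron mass of 5.28 GeV are illustrative assumptions, not values taken from the text.

```python
C_LIGHT = 299_792_458.0  # m/s, speed of light


def momentum_gev(b_tesla: float, radius_m: float, charge: float = 1.0) -> float:
    """Momentum transverse to the field in GeV/c from the radius of curvature:
    p = q * B * r in SI units, i.e. p ~ 0.2998 * |q| * B[T] * r[m]."""
    return 1e-9 * C_LIGHT * abs(charge) * b_tesla * radius_m


def decay_length_mm(p_gev: float, mass_gev: float, lifetime_s: float) -> float:
    """Mean laboratory decay length L = (p/m) * c * tau, returned in mm."""
    return (p_gev / mass_gev) * C_LIGHT * lifetime_s * 1e3


if __name__ == "__main__":
    # Hypothetical track: 10 m radius of curvature in an assumed 1.5 T field.
    print(f"p = {momentum_gev(1.5, 10.0):.2f} GeV/c")
    # b hadron from top decay: p ~ 50 GeV and tau = 1.5e-12 s as in the text;
    # the 5.28 GeV mass is an assumed typical B-hadron mass.
    print(f"L = {decay_length_mm(50.0, 5.28, 1.5e-12):.1f} mm")
```

A decay length of a few millimetres is large compared with the ~100 $\mu$m resolution of the silicon device, which is why displaced vertices can be used to tag b quarks.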
3 Precision Electroweak Measurements

3.1 Tree-level formulation

QED can be described by one parameter, $e$ or $\alpha = e^2/(4\pi)$, where $1/\alpha = 137.03599959(38)(13)$ is most precisely derived from $g - 2$ measurements [3]. In the electroweak theory, the strengths are specified by three parameters. They are the two gauge coupling constants $g$ and $g'$, and $v$, the vacuum expectation value of the Higgs field (see Chris Quigg's lectures in these proceedings). These three parameters as inputs predict the charges, $M_W$, $M_Z$, and $G_F$ via the following relationships:

$$\sin^2\theta_W = 1 - \frac{M_W^2}{M_Z^2} \qquad (1)$$

$$= \frac{\pi\alpha}{\sqrt{2}\, G_F M_W^2} \qquad (2)$$

$$= \frac{e^2}{g^2} \qquad (3)$$
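To illustrate how such tree-level relations turn three inputs into predictions, the sketch below uses the standard lowest-order relations $M_W^2 \sin^2\theta_W = \pi\alpha/(\sqrt{2}G_F)$ and $M_W = M_Z\cos\theta_W$. It is a sketch under stated assumptions: the function names are illustrative, $\alpha$ and $G_F$ are the values quoted in these lectures, and $\sin^2\theta_W \approx 0.23$ is taken as a representative input. Higher-order corrections are deliberately ignored.

```python
import math

ALPHA = 1.0 / 137.03599959  # fine-structure constant at Q^2 = 0
G_F = 1.16637e-5            # GeV^-2, Fermi constant from the muon lifetime


def masses_from_sin2(sin2_theta_w: float) -> tuple[float, float]:
    """Tree-level (M_W, M_Z) in GeV from the inputs (alpha, G_F, sin^2 theta_W)."""
    a2 = math.pi * ALPHA / (math.sqrt(2.0) * G_F)  # equals M_W^2 * sin^2(theta_W)
    m_w = math.sqrt(a2 / sin2_theta_w)
    m_z = m_w / math.sqrt(1.0 - sin2_theta_w)
    return m_w, m_z


def m_w_from_m_z(m_z: float) -> float:
    """Tree-level M_W in GeV from the inputs (alpha, G_F, M_Z), obtained by
    solving M_W^2 * (1 - M_W^2 / M_Z^2) = pi * alpha / (sqrt(2) * G_F)."""
    a2 = math.pi * ALPHA / (math.sqrt(2.0) * G_F)
    disc = 1.0 - 4.0 * a2 / m_z**2
    return m_z * math.sqrt(0.5 * (1.0 + math.sqrt(disc)))


if __name__ == "__main__":
    m_w, m_z = masses_from_sin2(0.23)
    print(f"inputs (alpha, G_F, sin2 = 0.23): M_W = {m_w:.1f} GeV, M_Z = {m_z:.1f} GeV")
    print(f"inputs (alpha, G_F, M_Z = 91.188): M_W = {m_w_from_m_z(91.188):.2f} GeV")
```

The two functions use different input sets for the same relations, which is the sense in which any three well-measured quantities can serve as inputs. The gap between these lowest-order numbers and the measured masses is what the higher-order corrections account for.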
The theory is democratic, so that any three variables in the above equations may be chosen as inputs. It makes much more sense to choose the three parameters most accurately measured, so that the predictions for any other measurements become as precise as possible, allowing sensitive tests of the theory. In the early 1980s, those were $e$ (or $\alpha$), $G_F$ [d] and the electroweak mixing parameter $\sin^2\theta_W$. With the measured values of these parameters, the masses of the W and Z bosons were predicted to be around 80 GeV and 90 GeV, respectively. This prediction led to the discovery of the W and Z bosons at the $Sp\bar pS$ at CERN [1]. By the end of August 1989, the 4 LEP experiments had measured $M_Z$ to within 160 MeV. Since then, $\alpha$, $G_F$ and $M_Z$ have been adopted as the basic input parameters. The current $M_Z$ accuracy from LEP is 2 MeV (0.002%) [4]. Most likely, this measurement will not be improved in the near future.

3.2 Higher Order Corrections
Once higher order corrections are included, all these equalities are no longer exactly true. In gauge theories, higher order corrections such as the loop diagrams lead to infinities which require renormalization. A general consequence of this is the introduction of $Q^2$-dependent corrections to the parameters of the theory, and therefore corrections to the relationships among the parameters. For instance, the Born level relationship

$$G_F = \frac{\pi\alpha}{\sqrt{2}}\, \frac{M_Z^2}{M_W^2\,(M_Z^2 - M_W^2)}$$

will be modified to

$$G_F = \frac{\pi\alpha}{\sqrt{2}}\, \frac{1}{1 - \Delta r}\, \frac{M_Z^2}{M_W^2\,(M_Z^2 - M_W^2)} \qquad (6)$$

in higher orders. Here

$$\Delta r \approx \Delta r_0 - \rho_t/\tan^2\theta_W \qquad (7)$$

$$\Delta r_0 \approx 1 - \alpha(0)/\alpha(M_Z) \approx 0.06 \qquad (8)$$

where $\alpha(M_Z)$ is the electromagnetic coupling constant at the scale of the Z mass and the top quark contributes $\rho_t$. As shown in $\Delta r_0$, the bulk of the corrections can be observed in the "running" ($Q^2$-dependent) coupling. The uncertainty in $\alpha(M_Z)$ is dominated by the contribution of the light quarks ($u$ through $b$), $\Delta\alpha_{\rm had}(M_Z^2)$.

[d] $G_F = (1.16637 \pm 0.00001) \times 10^{-5}$ GeV$^{-2}$ is derived from the muon lifetime measurements using the radiative corrections of the V - A Fermi theory.
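This is evaluated using dispersion relations and $\sigma(e^+e^- \to \text{hadrons})$ measurements at low $\sqrt{s}$, and perturbative QCD calculations at large $\sqrt{s}$. The relationship can also be written as

$$\left(\frac{M_W}{M_Z \cos\theta_W}\right)^2 = 1 \qquad (10)$$

in the lowest order, and

$$\left(\frac{M_W}{M_Z \cos\theta_W}\right)^2 = 1 + \frac{3 G_F M_t^2}{8\sqrt{2}\pi^2} \approx 1 + 0.0096 \left(\frac{M_t}{175\ \text{GeV}}\right)^2 \qquad (11)$$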
with higher order corrections. The second term in this equation (or pt in Eq. 9) is contributed by the top quark in loop diagrams: t t W+-
W+
With Mtop ~ 175 GeV, the correction is about 1% level. Thus precision measurements of My/, Mz, and cos9y/ with much better than 1% accuracy can predict the top-quark mass. The predicted top mass from electroweak measurements is Mtop = 172+Jj GeV. 4 The top mass measured by the CDF and D 0 Tevatron experiments is Mtop — 174.3 ± 5.1 GeV, 5 which agrees with the prediction very well. This is a good example of the successful interplay between theory and experiments. Any inconsistency between the predicted and measured values would have hinted at new physics. Secondary contributions to electroweak observables (or to Eq. 11) are from the Higgs boson in loop diagrams H H W+-
W+
w+ which are proportional to HM2H/M2W) where the Higgs mass M # is unknown in the Standard Model. New particles beyond the Standard Model could also contribute to electroweak observables through loop diagrams. Difficulties arise due to these unknown masses, Mjj and M n e w particle- This could be bad news since the predictions are uncertain and it is hard to test the theory. Or this could be good news since the predictions depend on the unknown masses. Indeed precision measurements provide information about these unknown masses of particles which are too heavy to be produced directly, or whose production cross section is too small to be observed. For example, with ~30 MeV uncertainty in Mw and ~ 2 GeV uncertainty in Mtop, we can predict the Higgs boson mass within ~30%. When precision measurements are inconsistent each other (for example, different sets of precision measurements predict different Higgs masses), this may signal the presence of new physics beyond the Standard Model.
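The size of the top-quark term can be checked directly from the expression $3G_F M_t^2/(8\sqrt{2}\pi^2)$, and the same quantity can be fed into $\Delta r \approx \Delta r_0 - \rho_t/\tan^2\theta_W$. The sketch below does both; the function names are illustrative, and $\sin^2\theta_W = 0.23$ and $\Delta r_0 = 0.06$ are taken as representative inputs rather than fitted values.

```python
import math

G_F = 1.16637e-5  # GeV^-2, Fermi constant


def delta_rho_top(m_top_gev: float) -> float:
    """Top-quark loop term 3 * G_F * M_t^2 / (8 * sqrt(2) * pi^2)."""
    return 3.0 * G_F * m_top_gev**2 / (8.0 * math.sqrt(2.0) * math.pi**2)


def delta_r(sin2_theta_w: float, m_top_gev: float, delta_r0: float = 0.06) -> float:
    """Delta r ~ Delta r_0 - rho_t / tan^2(theta_W), with rho_t taken to be
    the top-quark loop term above (illustrative inputs)."""
    tan2 = sin2_theta_w / (1.0 - sin2_theta_w)
    return delta_r0 - delta_rho_top(m_top_gev) / tan2


if __name__ == "__main__":
    print(f"top term at 175 GeV : {delta_rho_top(175.0):.4f}")
    spread = delta_rho_top(174.3 + 5.1) - delta_rho_top(174.3 - 5.1)
    print(f"change over the +-5.1 GeV measured range: {spread:.4f}")
    print(f"Delta r (sin2 = 0.23, M_t = 175 GeV): {delta_r(0.23, 175.0):.3f}")
```

Because the term grows as $M_t^2$, percent-level electroweak measurements translate into a fairly tight constraint on the top mass, whereas the logarithmic Higgs dependence constrains $M_H$ much more weakly.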
3.3 The Experimental Inputs
The last decade has seen a remarkable improvement in our knowledge of various electroweak parameters. Much of the improvement is due to the study of the Z resonance at LEP-1 and the SLC, and the study of the W and the top quark at LEP-2 and the Tevatron.

Z boson parameters

The precise determination of the Z boson parameters from the measurements at the Z resonance by the four collaborations ALEPH, DELPHI, L3 and OPAL in $e^+e^-$ collisions at LEP and by the SLD collaboration in $e^+e^-$ collisions at the SLC is a landmark for precision tests of the electroweak theory.

Cross section at the Z peak

The coupling of the Z to a fermion ($f$) and an anti-fermion ($\bar f$) pair is described by the following Lagrangian density,

$$L = \left(\frac{G_F M_Z^2}{2\sqrt{2}}\right)^{1/2} \bar\Psi_f \gamma_\mu (v_f - a_f \gamma_5) \Psi_f Z^\mu \qquad (12)$$

$$= \left(\frac{G_F M_Z^2}{2\sqrt{2}}\right)^{1/2} \bar\Psi_f \gamma_\mu \left[g_L^f (1 - \gamma_5) + g_R^f (1 + \gamma_5)\right] \Psi_f Z^\mu \qquad (13)$$

where $v_f$ and $a_f$ are the vector and axial vector coupling constants, and $g_L^f = (v_f + a_f)/2$ and $g_R^f = (v_f - a_f)/2$ are the left-handed and right-handed combinations. The vector and axial vector couplings are related to the quantum numbers of the fermion as follows:

$$v_f = \sqrt{\rho_f}\,(2 I_3^f - 4 Q_f \sin^2\theta_f) \qquad (14)$$

$$a_f = \sqrt{\rho_f}\,(2 I_3^f) \qquad (15)$$

where $I_3^f$ is the third component of weak isospin, $Q_f$ is the electric charge, and the parameters $\rho_f \approx 1$ and $\sin^2\theta_f \approx 0.23$ incorporate electroweak radiative corrections. The cross section for the process $e^+e^-(P_e) \to Z \to f\bar f$ is described in the center-of-mass frame by the following expression,

$$\frac{d\sigma_f}{d\Omega} = \frac{9}{4}\, \frac{s\,\Gamma_{ee}\Gamma_{f\bar f}/M_Z^2}{(s - M_Z^2)^2 + s^2\Gamma_Z^2/M_Z^2} \left[(1 + \cos^2\theta)(1 - P_e A_e) + 2\cos\theta\, A_f (-P_e + A_e)\right] \qquad (16)$$
where $P_e$ is the polarization of the electron beam, $s$ is the square of the center-of-mass energy, $\Gamma_Z$ is the total width of the Z, $\theta$ is the angle between the incident electron and the outgoing fermion, $\Gamma_{f\bar f}$ is the partial width for $Z \to f\bar f$, and $A_f$ is the left-right coupling constant asymmetry. The partial widths and coupling constant asymmetries are related to the couplings defined in the Lagrangian,

$$\Gamma_{f\bar f} \propto (v_f^2 + a_f^2)\,\delta \qquad (17)$$

$$A_f = \frac{2 v_f a_f}{v_f^2 + a_f^2} \qquad (18)$$

$$= \frac{(g_L^f)^2 - (g_R^f)^2}{(g_L^f)^2 + (g_R^f)^2} \qquad (19)$$

where $\delta \approx 1 + 3\alpha Q_f^2/4\pi + n_f \alpha_s/\pi$ ($n_f$ is 1 for quarks and 0 for leptons) accounts for final state QED and QCD radiative effects. The small size of $v_\ell/a_\ell$ ($\approx 0.08$), where $\ell = e, \mu, \tau$, makes the leptonic coupling asymmetries $A_\ell$ particularly sensitive to electroweak vacuum polarization corrections. The leptonic asymmetries are usually parameterized in terms of $\sin^2\theta_W^{\rm eff} \equiv \sin^2\theta_\ell$ (assuming lepton universality). It follows that small changes in $\sin^2\theta_W^{\rm eff}$ produce large effects on $A_\ell$,

$$A_\ell = \frac{2(1 - 4\sin^2\theta_W^{\rm eff})}{1 + (1 - 4\sin^2\theta_W^{\rm eff})^2}, \qquad \delta A_\ell \approx -8\, \delta(\sin^2\theta_W^{\rm eff}). \qquad (20)$$
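The sensitivity of the leptonic asymmetry to the mixing angle can be checked numerically from $A_\ell = 2x/(1+x^2)$ with $x = 1 - 4\sin^2\theta^{\rm eff}_W$. The sketch below evaluates $A_\ell$ and its slope by a central finite difference; the function names and step size are illustrative assumptions.

```python
def a_lepton(sin2_eff: float) -> float:
    """Leptonic coupling asymmetry A_l = 2x / (1 + x^2), x = 1 - 4 sin^2(theta_eff)."""
    x = 1.0 - 4.0 * sin2_eff
    return 2.0 * x / (1.0 + x * x)


def sensitivity(sin2_eff: float, step: float = 1e-6) -> float:
    """Numerical derivative dA_l / d(sin^2 theta_eff) by central difference."""
    return (a_lepton(sin2_eff + step) - a_lepton(sin2_eff - step)) / (2.0 * step)


if __name__ == "__main__":
    s2 = 0.23
    print(f"A_l          = {a_lepton(s2):.4f}")
    print(f"dA_l/dsin^2  = {sensitivity(s2):.2f}")
    # Relative change of A_l for a 1% relative change in sin^2(theta_eff).
    rel = (a_lepton(1.01 * s2) - a_lepton(s2)) / a_lepton(s2)
    print(f"relative change in A_l for +1% in sin^2: {100.0 * rel:.1f}%")
```

With these inputs a 1% shift in $\sin^2\theta^{\rm eff}_W$ moves $A_\ell$ by roughly a tenth of its value, which is why asymmetry measurements constrain the mixing angle so tightly.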
Electroweak Observables at the Z pole

The cross section described in Eq. 16 is only the dominant term in the total s-channel $e^+e^-$ cross section, which can be expressed as

$$\frac{d\sigma_f^{\rm tot}(s)}{d\Omega} = \frac{d\sigma_f^{\gamma}(s)}{d\Omega} + \frac{d\sigma_f^{\gamma Z}(s)}{d\Omega} + \frac{d\sigma_f^{Z}(s)}{d\Omega}, \qquad (21)$$

where the first and second terms represent photon exchange and the interference between photon and Z diagrams, respectively. At $\sqrt{s} = 30$-$40$ GeV (the centre-of-mass range for the PEP and PETRA colliders), the dominant contribution comes from $\sigma_f^{\gamma}$ and the $\sigma_f^{\gamma Z}$ contribution to the total cross section is about 2-3%. At $\sqrt{s} \approx 60$ GeV (the centre-of-mass range for the TRISTAN collider), the $\sigma_f^{\gamma Z}$ contribution to the total cross section increases to about 25%. The contribution from $\sigma_f^{Z}$ is negligible in both cases. The cross section near $\sqrt{s} = M_Z$ does not differ dramatically from the resonance cross section given by $\sigma_f^{Z}$, because the Z-$\gamma$ interference term vanishes at the pole and the photon exchange cross section is approximately 1000 times smaller than the Z-exchange cross section. The cross-section measurements for fermion pair production from $e^+e^-$ annihilation at $\sqrt{s} < 120$ GeV are summarized in Figure 9; the agreement between the measurements and the Standard Model predictions is excellent.

Figure 9: Cross-section of $e^+e^- \to f\bar f$. (Horizontal axis: center-of-mass energy [GeV], 0 to 120.)

Using the information contained in hadronic and leptonic cross sections around the Z pole, it is possible to define a number of experimental observables:

1. The line shape parameters, which consist of $M_Z$ and $\Gamma_Z$, where the definition is based on the Breit-Wigner denominator $(s - M_Z^2 + i s\Gamma_Z/M_Z)$ with s-dependent width, and the peak hadronic cross section $\sigma_h^0$ (see Figure 10),

$$\sigma_h^0 = 12\pi\, \Gamma_{ee}\Gamma_{\rm had}/(M_Z^2 \Gamma_Z^2); \qquad (22)$$
2. The cross section ratios Re = Te/rhad,
Rb = Tbb/Thad, Rc = Tcc/Thad
(23)
where I — e, n, r; 3. The unpolarized forward-backward asymmetries A
FB = -J f=0.75AeAf, aF + aB
f = £,b,c
(24)
where aJF is the cross section for finding the scattered fermion in the hemisphere defined by the incident electron direction and oL is the cross section for finding it in the positron hemisphere; 4. The left-right asymmetry which is defined as PA P
^
-PA A
L
R
-
P
^
_
f
, , * 6
,„. {25)
where a?(Pe) is the total cross section for the production of / / pairs with an electron beam of helicity Pe; 5. The polarization of final state r-leptons which depends upon the direction of the T, Ae, and AT, PAcose) = -
AT{l + cos29) + 2Aecos6 i + cos2e + 2 A ^ c o s e -
(26)
6. The left-right forward-backward asymmetries {f = FB
4(~\Pe\) - 4HPe\) afF(-\Pe\) + 4(-\Pe\) 4(-\Pe\)+4(~\Pe\)-0.75PAf,
~ 4( + \Pe\) + 4( + \Pe\) + 4(' '+ -\Pe\) " • + 4(f ' +-\Pe\) "
f=i,s,c,b.
The presence of initial-state QED radiation smears the center-of-mass energy. With the initial-state QED corrections, the effective $\sqrt{s}$ is reduced and $\sigma$ changes. On the Z resonance, the cross-section is reduced by ~30%, as demonstrated in Figure 10.
Figure 10: Average over measurements of the hadronic cross-sections by the four experiments (ALEPH, DELPHI, L3, OPAL), as a function of centre-of-mass energy $E_{cm}$ [GeV]. The error bars on the measurements are increased by a factor of 10. The solid curve shows $\sigma$ from the fit; the dashed curve shows the QED-deconvolved cross-section, which defines the Z parameters.
The net correction to the $\tau$-polarization and the left-right asymmetry is less than 2%, and that to the leptonic forward-backward asymmetries is about 100%. A summary of the combined LEP-1 results is presented in Table 3 (Ref. 4). Note the remarkable precision of these measurements, especially the Z mass, which is measured to $2.2 \times 10^{-5}$. It is the third most precisely determined electroweak parameter and is used as an input to the theory.

Table 3: Summary of the Z parameter measurements at LEP-1 and SLC. The LEP-1 rows show the combined result of the LEP lineshape analyses; $R_\ell$ and $A_{FB}^{\ell}$ assume lepton universality. The last row presents the result from the SLC analysis.

Z parameter                        Average value
LEP-1
  $M_Z$                            91.1875 ± 0.0021 GeV
  $\Gamma_Z$                       2.4952 ± 0.0023 GeV
  $\sigma_h^0$                     41.540 ± 0.037 nb
  $R_\ell$                         20.767 ± 0.025
  $A_{FB}^{\ell}$                  0.0171 ± 0.0010
  $A_\tau$                         0.1439 ± 0.0042
  $A_e$                            0.1498 ± 0.0048
  $\sin^2\theta_W^{\rm eff}$       0.2321 ± 0.0010
SLC
  $\sin^2\theta_W^{\rm eff}$       0.23098 ± 0.00026

The measurement of the left-right cross section asymmetry ($A_{LR}$) by SLD at the SLC provides a systematically precise, statistics-dominated determination of the coupling $A_e$, and is presently the most precise single measurement, with the smallest systematic error, of this quantity. In principle the analysis is straightforward: one counts the numbers of Z bosons produced by left and right longitudinally polarized electrons ($N_Z(L)$ and $N_Z(R)$), forms an asymmetry, and then divides by the $e^-$ beam polarization magnitude (the $e^+$ is not polarised):
$$ A_{LR} = \frac{1}{P_e}\,\frac{N_Z(L) - N_Z(R)}{N_Z(L) + N_Z(R)}. \tag{28} $$
The average electron polarization at the interaction point has been in the range 73% to 77%. The measured value of $A_{LR}$ (Ref. 4) can be translated into the value of
$$ \sin^2\theta_W^{\rm eff} = 0.23098 \pm 0.00026. \tag{29} $$
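The counting procedure of Eq. 28 and the translation of an asymmetry into a mixing angle can be sketched in a few lines of Python. This is an illustration only: the event counts and the 75% polarization below are hypothetical, the function names are invented for this sketch, and the conversion assumes the tree-level relation $A_\ell = 2x/(1+x^2)$ with $x = 1 - 4\sin^2\theta_W^{\rm eff}$ rather than the full SLD analysis.

```python
import math


def alr_from_counts(n_left, n_right, polarization):
    """Return (A_LR, statistical error) from Z counts, following Eq. 28.

    The statistical error uses the binomial estimate for the raw asymmetry.
    """
    n_total = n_left + n_right
    raw = (n_left - n_right) / n_total
    raw_err = math.sqrt((1.0 - raw * raw) / n_total)
    return raw / polarization, raw_err / polarization


def asymmetry_from_sin2(sin2_eff):
    """A_l = 2x / (1 + x^2) with x = 1 - 4 sin^2(theta_eff)."""
    x = 1.0 - 4.0 * sin2_eff
    return 2.0 * x / (1.0 + x * x)


def sin2_from_asymmetry(a_l):
    """Invert A_l = 2x / (1 + x^2), taking the small-x root."""
    x = (1.0 - math.sqrt(1.0 - a_l * a_l)) / a_l
    return (1.0 - x) / 4.0


# Hypothetical counts and polarization, for illustration only.
a_lr, a_lr_err = alr_from_counts(5566, 4434, 0.75)
sin2 = sin2_from_asymmetry(a_lr)
print(f"A_LR = {a_lr:.4f} +/- {a_lr_err:.4f}, sin^2(theta_eff) = {sin2:.5f}")
```

With the value quoted in Eq. 29, `asymmetry_from_sin2(0.23098)` gives an asymmetry of about 0.151, which illustrates how small the leptonic asymmetry is and why the polarization must be known precisely.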
The decays of the Z to neutrinos are invisible in the detectors and give rise to the "invisible width", $\Gamma_{\rm inv} = N_\nu\Gamma_{\nu\bar\nu}$, where $N_\nu$ is the number of light neutrino species. The invisible width can be determined from the measurements of the decay widths to all visible final states and the total width, which is given by the sum over all partial widths,
$$ \Gamma_Z = \Gamma_{ee} + \Gamma_{\mu\mu} + \Gamma_{\tau\tau} + \Gamma_{\rm had} + \Gamma_{\rm inv}. \tag{30} $$
The ratio of the invisible and leptonic widths is found to be
$$ \Gamma_{\rm inv}/\Gamma_{\ell\ell} = 5.941 \pm 0.016. \tag{31} $$
Dividing this number by the Standard Model value for the ratio of neutrino and leptonic widths, $\Gamma_{\nu\bar\nu}/\Gamma_{\ell\ell} = 1.9912 \pm 0.0012$, yields the number of light neutrinos,
$$ N_\nu = 2.9835 \pm 0.0083 \tag{32} $$
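The chain from the lineshape observables to the number of neutrinos can be reproduced approximately from the Table 3 central values. The sketch below assumes lepton universality with equal leptonic widths (the small $\tau$ mass correction is ignored), neglects correlations between the inputs, and uses the standard conversion $(\hbar c)^2 = 0.3893793721$ GeV$^2$ mb; the function names are invented for this illustration.

```python
import math

HBARC2_GEV2_NB = 0.3893793721e6  # (hbar*c)^2 in GeV^2 * nb


def leptonic_width(m_z, gamma_z, sigma_had_nb, r_l):
    """Solve sigma_h^0 = 12*pi*G_ll*G_had / (M_Z^2 G_Z^2) with G_had = R_l * G_ll."""
    sigma_gev = sigma_had_nb / HBARC2_GEV2_NB  # cross section in GeV^-2
    return m_z * gamma_z * math.sqrt(sigma_gev / (12.0 * math.pi * r_l))


def invisible_to_leptonic_ratio(m_z, gamma_z, sigma_had_nb, r_l):
    """Gamma_inv / Gamma_ll from Gamma_Z = 3*G_ll + G_had + G_inv."""
    g_ll = leptonic_width(m_z, gamma_z, sigma_had_nb, r_l)
    g_inv = gamma_z - r_l * g_ll - 3.0 * g_ll
    return g_inv / g_ll


def number_of_neutrinos(ratio, ratio_err, sm_ratio, sm_ratio_err):
    """N_nu = ratio / sm_ratio, with relative errors added in quadrature."""
    n_nu = ratio / sm_ratio
    rel_err = math.hypot(ratio_err / ratio, sm_ratio_err / sm_ratio)
    return n_nu, n_nu * rel_err


# Central values from Table 3 (GeV, GeV, nb, dimensionless).
g_ll = leptonic_width(91.1875, 2.4952, 41.540, 20.767)
ratio = invisible_to_leptonic_ratio(91.1875, 2.4952, 41.540, 20.767)
n_nu, n_nu_err = number_of_neutrinos(5.941, 0.016, 1.9912, 0.0012)
print(f"Gamma_ll = {1000 * g_ll:.2f} MeV, Gamma_inv/Gamma_ll = {ratio:.3f}")
print(f"N_nu = {n_nu:.4f} +/- {n_nu_err:.4f}")
```

The leptonic width comes out near 84 MeV and the ratio close to the value in Eq. 31, so the quoted numbers are mutually consistent at the level this simplified treatment allows.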
Figure 11: The Z lineshape at LEP-1 as a function of $\sqrt{s} = E_{cm}$ (GeV). The three curves represent the predicted lineshapes with the assumption of two (dashed line), three (solid line), and four (dotted line) light neutrinos.
(see Figure 11). Note that $N_\nu$ is about 2
(33)
The accuracy of the measurements makes them sensitive to the mass of the top quark and to the mass of the Higgs boson through loop corrections, as stated in Section 3.2. An example of such sensitivity is presented in Figure 12. Predicted values from all the Z parameters are
$$ M_W = 80.374 \pm 0.034\ {\rm GeV}, \tag{34} $$
$$ M_{\rm top} = 169^{+\ldots}_{-\ldots}\ {\rm GeV}, \tag{35} $$
$$ M_H = 56^{+\ldots}_{-\ldots}\ {\rm GeV}. \tag{36} $$
Figure 12: Summary of the Z width measurements at LEP-1 and the sensitivity of the Z width $\Gamma_Z$ [MeV] to the mass of the top quark.
Table 4: Typical W pair event selection efficiencies and purities.

Decay mode                   Efficiency    Purity
$WW \to qqqq$                90%           80%
$WW \to qq\ell\nu$           82%           90%
$WW \to \ell\nu\ell\nu$      60-80%        90%
W boson parameters

The precise determination of the W boson and top-quark parameters was initiated by the CDF and DØ collaborations at the Tevatron. Since 1996, the LEP-2 program has made a significant contribution to the W parameters. These measurements provide tests of the Standard Model complementary to those of LEP-1 and SLC. Together with the Z parameters, efforts to constrain $M_H$ have become increasingly interesting.

W branching fractions, W mass and width from LEP-2 at CERN

By doubling the $e^+e^-$ collision energy, LEP-2 provided a sample of W pair events. The typical selection efficiencies and purities for W pair events at LEP-2 are given in Table 4. The branching fractions for W decays via the electron, muon, tau, and hadronic modes have been measured by all four experiments (Ref. 4). The LEP-2 average results are given in Table 5. The results for the individual leptonic channels are consistent with lepton universality, and the average leptonic branching fraction, (10.74 ± 0.10)%, is also consistent with the Standard Model expectation. The leptonic W branching fraction can be re-interpreted in terms of the CKM matrix element $V_{cs}$ without need for a CKM unitarity constraint, using the relatively well-known values of the other CKM matrix elements involving light quarks (Ref. 6). These indirect constraints currently lead to a value (Ref. 4) of $|V_{cs}| = 0.989 \pm 0.016$, much more precise than the value of $1.04 \pm 0.16$ derived from D decays (Ref. 6).

The W mass can be extracted from the measurement of the W pair cross section at the W pair threshold, $\sqrt{s} \approx 2M_W$; see the measurement point at $\sqrt{s} = 161$ GeV in Figure 4. The value of $\sqrt{s} = 161$ GeV was chosen on the basis of the 1995 Tevatron $M_W$ measurement. At center-of-mass energies above the W pair threshold, the technique for measuring the W mass lies in the reconstruction of the directions and energies of the four primary W decay products.
Table 5: LEP average W decay branching fraction measurements.

Decay mode             Branching fraction (%)
$W \to e\nu$           10.62 ± 0.20
$W \to \mu\nu$         10.60 ± 0.18
$W \to \tau\nu$        11.07 ± 0.25
$W \to q\bar q'$       67.78 ± 0.32

These may be either four quarks, approximated by four jet directions and energies, for the $qqqq$ channel; or two quarks (two jets) and a charged lepton for the $qq\ell\nu$ channel, deducing the neutrino direction and energy from the missing momentum in the event:
$$ \vec P(e^+) + \vec P(e^-) = 0 = \vec P(W^+) + \vec P(W^-), \tag{37} $$
$$ \vec P(\ell) + \vec P(\nu_\ell) + \vec P(q) + \vec P(\bar q') = 0, \qquad E_\ell + E_{\nu_\ell} + E_q + E_{\bar q'} = \sqrt{s} = 2E_{\rm beam}. \tag{38} $$
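The momentum balance above can be illustrated with a small Python sketch that infers the neutrino from the visible particles in a $qq\ell\nu$ event and forms the two W mass estimates. It is not the kinematic fit used by the experiments: the event is hypothetical, all particles are treated as massless, and the function names are invented for this example.

```python
import math


def neutrino_from_missing_momentum(visible):
    """Return (E, px, py, pz) of a massless neutrino balancing the visible momenta.

    `visible` is a list of (E, px, py, pz) tuples in GeV, given in the e+e-
    centre-of-mass frame, where the total momentum vanishes.
    """
    px = -sum(p[1] for p in visible)
    py = -sum(p[2] for p in visible)
    pz = -sum(p[3] for p in visible)
    energy = math.sqrt(px * px + py * py + pz * pz)
    return (energy, px, py, pz)


def invariant_mass(*particles):
    """Invariant mass of a set of (E, px, py, pz) four-vectors."""
    energy = sum(p[0] for p in particles)
    px = sum(p[1] for p in particles)
    py = sum(p[2] for p in particles)
    pz = sum(p[3] for p in particles)
    mass2 = energy * energy - px * px - py * py - pz * pz
    return math.sqrt(max(mass2, 0.0))


# Hypothetical event: two W bosons produced at rest (sqrt(s) = 160 GeV).
lepton = (40.0, 40.0, 0.0, 0.0)
quark = (40.0, 0.0, 40.0, 0.0)
antiquark = (40.0, 0.0, -40.0, 0.0)

neutrino = neutrino_from_missing_momentum([lepton, quark, antiquark])
m_lnu = invariant_mass(lepton, neutrino)
m_qq = invariant_mass(quark, antiquark)
total_energy = lepton[0] + neutrino[0] + quark[0] + antiquark[0]
print(m_lnu, m_qq, total_energy)
```

In a real analysis the energy sum would be compared with $2E_{\rm beam}$ and the jet and lepton measurements varied within their resolutions, which is what the kinematic fit does.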
Decays to $\ell\nu\ell\nu$ are of limited use because at least two neutrinos are undetected. The W decay products are paired up to give reconstructed W mass estimates. A substantial improvement is made in the mass resolution for both the $qqqq$ and $qq\ell\nu$ channels by applying a kinematic fit, constraining the total energy and momentum in the event to be that of the known colliding electron-positron system as shown in Eqs. 37 and 38, and making a small correction for possible unobserved initial-state radiation. A typical reconstructed mass distribution from the kinematic fit is shown in Figure 13 for the $qq\ell\nu$ channel. A clear W mass peak is observed with low background. The W mass is extracted from the measured W masses in each data event using a Monte Carlo technique, the details of which differ from one experiment to another. The Monte Carlo techniques have in common that they use full detector simulations to correct for the effects of finite detector acceptance and resolution, as well as initial-state radiation. An example of the results obtained from the fits is shown in Figure 13 (histogram). For all of these fits, the W width is taken to have its expected Standard Model dependence on the W mass. The combined W mass from direct reconstruction, taking into account all correlations including those between the $W^+W^- \to qq\ell\nu$ and $W^+W^- \to qqqq$ channels, is
$$ M_W = 80.428 \pm 0.030({\rm stat.}) \pm 0.036({\rm syst.})\ {\rm GeV}, \tag{39} $$
with a $\chi^2$/d.o.f. of 27.1/29, corresponding to a $\chi^2$ probability of 57%. Table 6 summarizes the combined result from each experiment at LEP-2. The results are consistent with the W mass extracted from the measurement of the W pair threshold cross-section at 161 GeV (see Figure 4):
$$ M_W = 80.40 \pm 0.20({\rm stat.}) \pm 0.03({\rm syst.})\ {\rm GeV}. \tag{40} $$
The overall LEP average W mass measurement obtained is
$$ M_W = 80.427 \pm 0.046\ {\rm GeV}. \tag{41} $$

Table 6: Summary of the W mass measurements from direct reconstruction ($\sqrt{s} = 172$-$202$ GeV) at LEP-2. Results are given for the $W^+W^- \to qq\ell\nu$ and $W^+W^- \to qqqq$ channels combined. The $\chi^2$ per d.o.f. is 27.1/29.

Experiment    $M_W$ (GeV)
ALEPH         80.449 ± 0.065
DELPHI        80.380 ± 0.071
L3            80.362 ± 0.078
OPAL          80.486 ± 0.066
Total         80.428 ± 0.047

The width of the W mass distributions shown in Figure 13 has components from the true W width and from detector resolution. In many events, the mass resolution is comparable to, or better than, the true width. It is consequently possible to measure directly both the W mass and width, and in practice the two results are little correlated. The combined result is
$$ \Gamma_W = 2.12 \pm 0.20\ {\rm GeV}. \tag{42} $$
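The way the direct-reconstruction and threshold W mass values combine into the LEP average can be approximated with an inverse-variance weighted mean. The sketch below is a simplification: it adds statistical and systematic errors in quadrature and treats the two inputs as uncorrelated, whereas the experiments account for correlations; the helper names are invented for this illustration.

```python
import math


def quadrature(*errors):
    """Add independent uncertainties in quadrature."""
    return math.sqrt(sum(e * e for e in errors))


def weighted_average(measurements):
    """Inverse-variance weighted mean of (value, error) pairs, assumed uncorrelated."""
    pairs = list(measurements)
    weights = [1.0 / (err * err) for _, err in pairs]
    total = sum(weights)
    mean = sum(w * value for w, (value, _) in zip(weights, pairs)) / total
    return mean, 1.0 / math.sqrt(total)


# Direct reconstruction (Eq. 39) and threshold measurement (Eq. 40), in GeV.
direct = (80.428, quadrature(0.030, 0.036))
threshold = (80.40, quadrature(0.20, 0.03))

mean, err = weighted_average([direct, threshold])
print(f"M_W = {mean:.3f} +/- {err:.3f} GeV")
```

Under these assumptions the result lands very close to the quoted LEP average, and it shows that the threshold measurement carries only a few percent of the total weight.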
W branching fraction, W mass and width from the Tevatron at Fermilab

At the Tevatron, the W bosons are only detected in their decays to $e\nu$ and $\mu\nu$ because the decay to $q\bar q'$ is swamped by the QCD dijet background, whose cross section is over an order of magnitude higher in the mass range of interest. Also, one does not know the event $\hat s$ and one cannot determine the longitudinal (parallel to the beam direction) neutrino momentum, because a significant fraction of the products from the $p\bar p$ interaction are emitted at small angles to the beam pipe, where there is no instrumentation. Consequently, one must determine the W mass from transverse (perpendicular to the beam direction) quantities (Ref. 7), i.e. the transverse mass $M_T$, the charged lepton transverse momentum $P_T^\ell$, or the neutrino transverse momentum $P_T^\nu$ (or the missing transverse energy $\not\!E_T$). $\not\!E_T$ is inferred from a measurement of $P_T^\ell$ and the remaining $P_T$ in the detector, denoted by $\vec u$, i.e.
$$ \vec P_T^{\,W} + \vec P_T^{\,\rm gluons} = 0, \qquad \vec P_T^{\,\ell} + \vec P_T^{\,\nu} + \vec u = 0 \quad\text{or}\quad \vec P_T^{\,\nu} = \vec{\not\!E}_T = -(\vec P_T^{\,\ell} + \vec u). \tag{43} $$

Figure 13: Reconstructed W mass distribution, $M_W$ (GeV/c$^2$), compared to the best fit for the $e^+e^- \to W^+W^- \to e\nu q\bar q'$ channel at the ALEPH experiment at LEP-2 (ALEPH preliminary, $\sqrt{s}$ = 191.6, 195.5, 199.5, 201.6 GeV, $e\nu qq$ selection). Shown are the data (luminosity = 237 pb$^{-1}$), the Monte Carlo ($m_W$ = 80.60 GeV/c$^2$), and the non-WW background.
The following cartoon demonstrates the kinematics of W boson production and decay, as viewed in the plane transverse to the antiproton-proton beams, where the recoil energy vector $\vec u$ is the sum of the transverse energy vectors $\vec E_T^{\,i}$ of the particles recoiling against the W:
$\vec u$ receives contributions from two sources: first, the so-called W recoil, i.e. the particles arising from initial-state QCD radiation from the $q\bar q$ legs producing the hard scatter; and second, contributions from the spectator quarks ($p\bar p$ remnants) and additional soft $p\bar p$ scatters which occur in the same crossing as the hard scatter. This second contribution is generally referred to as the underlying-event contribution. Experimentally, these two contributions cannot be distinguished. Owing to the contribution from the underlying event, the missing transverse energy resolution has a significant dependence on the instantaneous $p\bar p$ luminosity. The transverse mass of the W, $M_T^W$, is defined as
$$ M_T^W = \sqrt{2\,P_T^\ell\,P_T^\nu\,(1 - \cos\phi)} \tag{44} $$
where
on $P_T^W$. For this reason, and at the Tevatron luminosities where the effect of the $\not\!E_T$ resolution is not too severe, the transverse mass is the preferred quantity to determine the W mass. However, the W masses determined from the $P_T^\ell$ and $\not\!E_T$ distributions provide important cross-checks on the integrity of the $M_T$ result because the three measurements have different systematic uncertainties. Figures 14 and 15 show the transverse mass ($M_T^W$) and lepton transverse momentum ($P_T^\ell$) distributions of W events.

The W mass at the Tevatron is determined through a precise simulation of the transverse mass line-shape, which exhibits a Jacobian edge at $M_T \approx M_W$. The simulation of the line-shape relies on a detailed understanding of the detector response and resolution for both the charged lepton and the recoil particles. This in turn requires a precise simulation of the W production and decay. The similarity in the production mechanism and mass of the W and Z bosons is exploited in the analysis to constrain many of the systematic uncertainties in the W mass analysis. The lepton momentum and energy scales are determined by a comparison of the measured Z mass from $Z \to e^+e^-$ and $Z \to \mu^+\mu^-$ decays with the value measured at LEP. The simulation of the W $P_T$ and the detector response to it are determined by a measurement of the Z $P_T$, which is determined precisely from the decay leptons, and by a comparison of the leptonic (from the Z decay) and non-leptonic $E_T$ quantities ($\vec u$) in Z events. The reliance on the Z data means that many of the systematic uncertainties in the W mass analyses are determined by the statistics of the Z sample.

The W and Z events in these analyses are selected by demanding a single isolated high-$P_T$ charged lepton in conjunction with missing transverse energy (W events) or a second high-$P_T$ lepton (Z events). Depending on the analysis, the $\not\!E_T$ cuts are either 25 or 30 GeV and the lepton $P_T$ cuts are similarly 25 or 30 GeV. In total, ~84k W events are used in the W mass fits and ~9k Z events are used for calibration. The Tevatron average W mass measurement obtained is (Ref. 8)
$$ M_W = 80.448 \pm 0.062\ {\rm GeV}. \tag{45} $$
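The transverse quantities used at the Tevatron can be made concrete with a short Python sketch of Eqs. 43 and 44. The input vectors are hypothetical and the function names are invented for this example; the code only shows how the missing transverse energy and the transverse mass follow from the measured lepton and recoil, not how the experiments simulate the line-shape.

```python
import math


def missing_transverse_energy(lepton_pt, recoil_u):
    """Infer the neutrino transverse momentum as -(P_T(lepton) + u), Eq. 43.

    Both arguments are (x, y) vectors in GeV in the plane transverse to the beam.
    """
    return (-(lepton_pt[0] + recoil_u[0]), -(lepton_pt[1] + recoil_u[1]))


def transverse_mass(lepton_pt, neutrino_pt):
    """M_T = sqrt(2 * pT_l * pT_nu * (1 - cos(phi))), Eq. 44.

    Uses pT_l * pT_nu * cos(phi) = dot product of the two transverse vectors.
    """
    pt_l = math.hypot(lepton_pt[0], lepton_pt[1])
    pt_nu = math.hypot(neutrino_pt[0], neutrino_pt[1])
    dot = lepton_pt[0] * neutrino_pt[0] + lepton_pt[1] * neutrino_pt[1]
    return math.sqrt(max(2.0 * (pt_l * pt_nu - dot), 0.0))


# Hypothetical W event with no recoil: lepton and neutrino are back to back.
lepton = (40.0, 0.0)
met = missing_transverse_energy(lepton, (0.0, 0.0))
mt_no_recoil = transverse_mass(lepton, met)

# Same lepton with a hypothetical 5 GeV recoil along the lepton direction.
met_recoil = missing_transverse_energy(lepton, (5.0, 0.0))
mt_with_recoil = transverse_mass(lepton, met_recoil)
print(mt_no_recoil, mt_with_recoil)
```

For a W decaying at rest in the transverse plane with the leptons emitted perpendicular to the beam, the transverse mass reaches its maximum near $M_W$, which is the origin of the Jacobian edge mentioned above.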
The Tevatron experiments determine the width by a one-parameter likelihood fit to the high end of the transverse mass distribution (see Figure 16). Detector resolution effects fall off in a Gaussian manner such that at high transverse masses ($M_T > 120$ GeV), the distribution is dominated by the Breit-Wigner behavior of the cross section. In the fit region, CDF has 750 events in the electron and muon channels combined. The result (Ref. 9) is
$$ \Gamma_W = 2.055 \pm 0.125\ {\rm GeV}. \tag{46} $$
Figure 14: W transverse mass ($m_T$, GeV) distributions compared to the best fits for the W channel from the CDF experiment (top) and the DØ experiment (bottom).

Figure 15: The electron transverse momentum $p_T(e)$ (GeV) distribution compared to the best fit for the $W \to e\nu$ channel from the DØ experiment.

Figure 16: Transverse mass spectra $M_T(e,\nu)$ and $M_T(\mu,\nu)$ in GeV (filled circles) for $W \to e\nu$ (top) and $W \to \mu\nu$ (bottom) data from the CDF experiment, with best fits superimposed as solid curves. The lower curve in each graph shows the sum of estimated backgrounds. Each inset shows the 50-100 GeV region on a linear scale.
At LEP-2, the W branching fractions are determined by an explicit cross section measurement, while at the Tevatron they are determined from a measurement of a cross section ratio. Specifically, the W branching fraction can be written as
$$ Br(W \to e\nu) = R\,\frac{\sigma_Z}{\sigma_W}\,\frac{\Gamma(Z \to ee)}{\Gamma_Z}, \tag{47} $$
where
$$ R = \frac{\sigma_W \cdot Br(W \to e\nu)}{\sigma_Z \cdot Br(Z \to ee)} \tag{48} $$
is the measurement made at the Tevatron. This determination thus relies on the LEP-1 measurement of the Z branching fractions and the theoretical calculation of the ratio of the total Z and W cross sections. The Tevatron measurement $Br(W \to e\nu) = (10.43 \pm 0.25)\%$ is now becoming systematics limited. In particular, the uncertainty due to QED radiative corrections in the acceptance calculation and in $\sigma_W/\sigma_Z$ contributes 0.19% to the total systematic uncertainty of 0.23%. The corresponding measurement from LEP-2 is $Br(W \to e\nu) = (10.62 \pm 0.20)\%$. The large sample of W events at the Tevatron has also allowed a precise determination of $g_\tau/g_e$ through a measurement of the ratio of the $W \to \tau\nu$ to $W \to e\nu$ cross sections. The latest Tevatron measurement of this quantity is $g_\tau/g_e = 0.99 \pm 0.024$, in good agreement with the Standard Model prediction of unity and the LEP-2 measurement of $g_\tau/g_e = 1.01 \pm 0.022$.

W mass prediction from NuTeV at Fermilab

Neutrino scattering experiments have contributed to our understanding of electroweak physics for more than three decades. Early determinations of $\sin^2\theta_W$ served as the critical ingredient to the Standard Model's successful prediction of the W and Z boson masses. More precise investigations in the late 1980s set the first useful limits on the top quark mass. The recent NuTeV measurement of the electroweak mixing angle from neutrino-nucleon scattering represents the most precise determination to date. The result is a factor of two more precise than the previous most accurate $\nu N$ measurement (Ref. 10). In deep inelastic neutrino-nucleon scattering, the weak mixing angle can be extracted from the ratio of neutral current (NC) to charged current (CC) total cross sections:
$$ R^- \equiv \frac{\sigma(\nu_\mu N \to \nu_\mu X) - \sigma(\bar\nu_\mu N \to \bar\nu_\mu X)}{\sigma(\nu_\mu N \to \mu^- X) - \sigma(\bar\nu_\mu N \to \mu^+ X)} = \frac{R^\nu - r R^{\bar\nu}}{1 - r} = \frac{1}{2} - \sin^2\theta_W, \tag{49} $$
where $R^\nu = \sigma(\nu_\mu N \to \nu_\mu X)/\sigma(\nu_\mu N \to \mu^- X)$, $R^{\bar\nu}$ is the corresponding ratio for anti-neutrinos, and $r = \sigma(\bar\nu_\mu N \to \mu^+ X)/\sigma(\nu_\mu N \to \mu^- X)$. Because $R^-$ is formed from the difference of neutrino and anti-neutrino cross sections, almost all sensitivity to the effects of sea quark scattering cancels. This reduces the error associated with heavy quark production (principally due to the imprecise knowledge of the charm quark mass) by roughly a factor of eight relative to the previous analysis. The substantially reduced uncertainties, however, come at a price. The ratio $R^-$ is difficult to measure experimentally because neutral current neutrino and anti-neutrino events have identical observed final states. From the $\nu N$ interactions, 386k NC and 919k CC events are recorded, and from the $\bar\nu N$ interactions, 89k NC and 210k CC events. The extracted value is $\sin^2\theta_W$ (on-shell) $= 0.2254 \pm 0.0021$, which can be translated into an $M_W$ value of $80.25 \pm 0.1({\rm stat.}) \pm 0.05({\rm syst.})$ GeV, where the systematic error also receives a contribution from the unknown Higgs mass.

Summary of W mass, width, and branching fraction measurements
The LEP-2 mass values are compared with the Tevatron values in Figure 17. They are in excellent agreement despite being measured in very different ways with widely different sources of systematic error. The systematics at LEP-2 are dominated by the uncertainty in the beam energy (which is used as a constraint in the mass fits) and by the modeling of the hadronic final state, particularly for the events where both W bosons decay hadronically. At the Tevatron, the systematics are dominated by the determination of the charged lepton energy scale and the Monte Carlo modeling of the W production, in particular its $P_T$ and $P_z$ distributions. At the Tevatron, one cannot use a beam energy constraint to reduce the sensitivity of the W mass to the absolute energy ($E$) and momentum ($p$) calibration of the detector. Any uncertainty in the detector $E$, $p$ scales thus enters directly as an uncertainty in the Tevatron W mass. This means that the absolute energy and momentum calibration of the detectors must be known to better than 0.01%. By contrast, at LEP an absolute calibration of 0.5% is sufficient. The two approaches thus provide welcome complementary determinations of the W mass. These direct measurements are also in good agreement with the indirect measurement from NuTeV and the prediction based on fits to existing, non-W, electroweak data (the Z parameters). There is also good agreement between the LEP-2 and Tevatron measurements of the W width and branching fraction. Table 7 summarizes the W parameter measurements at LEP-2 and the Tevatron.
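The indirect NuTeV value entering this comparison follows from the on-shell definition $\sin^2\theta_W = 1 - M_W^2/M_Z^2$. The sketch below converts the quoted on-shell mixing angle into a W mass using the LEP-1 Z mass; it neglects the uncertainty on $M_Z$ and propagates only the total error on $\sin^2\theta_W$, and the function name is invented for this illustration.

```python
import math

M_Z = 91.1875  # GeV, LEP-1 average quoted in Table 3


def mw_from_onshell_sin2(sin2_w, sin2_w_err, m_z=M_Z):
    """Return (M_W, error) from sin^2(theta_W) = 1 - M_W^2 / M_Z^2.

    Linear error propagation: dM_W = M_Z * d(sin2) / (2 * sqrt(1 - sin2)).
    """
    cos_w = math.sqrt(1.0 - sin2_w)
    return m_z * cos_w, m_z * sin2_w_err / (2.0 * cos_w)


m_w, m_w_err = mw_from_onshell_sin2(0.2254, 0.0021)
print(f"M_W = {m_w:.2f} +/- {m_w_err:.2f} GeV")
```

The central value and the size of the uncertainty are consistent with the NuTeV translation quoted earlier, and can be compared directly with the LEP-2 and Tevatron entries in Table 7.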
Figure 17: $M_W$ measurements and the prediction from the electroweak measurements including $M_{\rm top}$. Values shown (in GeV): $p\bar p$ colliders, 80.452 ± 0.062; average, 80.436 ± 0.037 ($\chi^2$/DoF = 0.1/1); LEP1/SLD/$\nu N$/$m_t$, 80.386 ± 0.025.
Table 7: Summary of the W and top-quark parameter measurements at LEP-2 and the Tevatron.

Parameter              LEP-2                                   Tevatron
$M_W$                  80.427 ± 0.046 GeV                      80.448 ± 0.062 GeV
$\Gamma_W$             2.12 ± 0.20 GeV                         2.055 ± 0.125 GeV
$Br(W \to \ell\nu)$    (10.74 ± 0.10)% ($e + \mu + \tau$)      (10.43 ± 0.25)% ($e$)
$M_{\rm top}$          -                                       174.3 ± 5.1 GeV
Top-quark parameters

The top quark discovery (Ref. 2) at the Tevatron in 1995 was the culmination of a search lasting almost twenty years. The top quark is the only quark with a mass in the region of the electroweak gauge bosons, and thus a detailed analysis of its properties could possibly lead to information on the mechanism of electroweak symmetry breaking. In particular, its mass is strongly affected by radiative corrections involving the Higgs boson. The emphasis in top quark studies at the Tevatron has been to make the most precise measurement of the top quark mass. Substantial progress has been made in bringing the mass uncertainty down from > 10 GeV, at the point of discovery, to 5.1 GeV in 1999.

In the Standard Model, a top quark decays ~100% of the time to $Wb$. If both W's decay to $q\bar q'$, the final state from $t\bar t$ is $q\bar q' q\bar q' b\bar b$ and the event sample is referred to as "all-hadronic". Conversely, the "di-lepton" event sample is realized by selecting an $\ell^+\nu\,\ell^-\bar\nu\,b\bar b$ final state, where both W's have decayed leptonically (to $e$ or $\mu$). The "lepton+jet" event sample is one in which one W has decayed hadronically and the other leptonically. The precision with which the top quark mass can be measured with each sample depends on the branching fractions, the level of background, and how well constrained the system is. The all-hadronic sample has the largest cross section, but has a large background from QCD six-jet events ($N_{\rm signal}/N_{\rm background} \approx 0.3$), while the di-lepton sample has a small background ($N_{\rm signal}/N_{\rm background} \approx 4$), but suffers from a small cross section, and the events are "under-constrained" because they contain two neutrinos. The optimum channel in terms of event sample size, background level, and kinematic information content is the lepton+jet channel. Indeed, this channel has a weight of ~80% in the combined Tevatron average. Figure 18 shows the reconstructed mass distribution using the lepton+jet event sample from the CDF experiment.

The Tevatron combined value (Ref. 5) is determined using lepton+jet, di-lepton, and all-hadronic data (see Figure 19), accounting for all correlations between the measurements. The two experiments have assumed a 100% correlation on all systematic uncertainties related to the Monte Carlo models. The uncertainty in the Monte Carlo model of the QCD radiation is one of the largest systematic uncertainties. This error source will require a greater understanding if the top mass precision is to be significantly improved in the next Tevatron run (Run II). The other dominant systematic error is the determination of the jet energy scale, which relies on using in-situ control samples such as Z+jet and $\gamma$+jet events. In Run II, due to significant improvements in the trigger system, both experiments should be able to accumulate a reasonable sample of $Z \to b\bar b$ events, which will be of great assistance in reducing the uncertainty in the b-jet energy scale. The top-quark mass measurements from the CDF and DØ experiments in the various channels are summarized in Figure 19. The combined Tevatron mass value (Ref. 5) is
$$ M_{\rm top} = 174.3 \pm 3.2({\rm stat.}) \pm 4.0({\rm syst.})\ {\rm GeV} = 174.3 \pm 5.1\ {\rm GeV}. \tag{50} $$
This measurement is a supreme vindication of the Standard Model, which, based on other electroweak measurements, predicts a top mass of
$$ M_{\rm top} = 172^{+\ldots}_{-\ldots}\ {\rm GeV}. \tag{51} $$
Of all the quark masses, the top quark mass is now the best measured.
Figure 18: The reconstructed top quark mass distribution (reconstructed mass in GeV/c$^2$) compared to the best fit for $t\bar t \to (W^+ b) + (W^- \bar b) \to (\ell\nu b) + (q\bar q' \bar b)$. The plot also shows the level of background.

Figure 19: The summary of the top-quark mass measurements from the CDF and DØ experiments at the Tevatron ($M_{\rm top}$ in GeV/c$^2$). DØ: di-lepton 168.4 ± 12.8, lepton+jets 173.3 ± 7.8, combined 172.1 ± 7.1. CDF: di-lepton 167.4 ± 11.4, lepton+jets 176.1 ± 7.4, all-hadronic 186.0 ± 11.5, combined 176.1 ± 6.6. Tevatron combined: 174.3 ± 5.1.
Summary and Outlook

With the current uncertainties, the measurements of the W mass and top-quark mass start to provide an interesting further test of the Standard Model relative to precision electroweak measurements at the Z resonance. This is illustrated in Figure 20: the predicted W and top masses extracted from fits to lower-energy electroweak data are consistent with the direct measurements from LEP-2 and the Tevatron, and the precision of the measurements is similar to that of the prediction. From the overlaid curves showing the Standard Model expectation as a function of the Higgs boson mass, it is further evident that both the precise lower-energy measurements and the direct W and top mass measurements favor a Standard Model Higgs boson in the relatively low mass region (see Figure 21). The 95% confidence level upper limit on $M_H$ (taking the band into account) is 165 GeV. The lower limit on $M_H$ of approximately 113 GeV obtained from direct searches (Ref. 13) has not been used in this limit determination.

For LEP-1, LEP-2 and NuTeV, no further data are planned. In contrast, the Tevatron has undergone a major luminosity upgrade, augmented with substantial improvements in the collider's detectors. At least 2 fb$^{-1}$ is expected per experiment at an increased center-of-mass energy of 2 TeV in three years. It is expected that both the top quark mass and W mass measurements will become limited by systematic uncertainties. The statistical part of the Tevatron W mass error in the next run will be ~10 MeV, where this also includes the part of the systematic error which is statistical in nature, such as the determination of the charged lepton $E$ and $p$ scales from Z events. At present the errors that are non-statistical in nature contribute 25 MeV out of the total Tevatron W mass error of 60 MeV. The combined W mass error is expected to be 20-30 MeV. The W width is expected to be determined with an uncertainty of 20-40 MeV. The statistical uncertainty on the top quark mass will be ~1 GeV. The systematic errors arising from the uncertainties in the jet energy scale and the modeling of QCD radiation are expected to be the dominant errors, with a total error of 2-3 GeV per experiment expected.
Figure 20: The direct measurements of the W and top quark masses from the Tevatron experiments, the direct measurement of the W mass from the LEP-2 experiments, and the indirect W and top mass measurements from LEP, SLC, and Tevatron neutrino experiments. The curves are from a calculation of the dependence of $M_W$ on $M_{\rm top}$ in the Standard Model using several Higgs boson masses. The band on each curve is the uncertainty obtained by folding in quadrature the uncertainties on $\alpha(M_Z^2)$, $\alpha_s(M_Z^2)$, and $M_Z$ (Reference 11). The dominant contribution to the band width comes from the uncertainty on $\alpha(M_Z^2)$ through $\Delta\alpha_{\rm had}(M_Z^2)$, the contribution of light quarks to the photon vacuum polarisation.

Figure 21: $\Delta\chi^2 = \chi^2 - \chi^2_{\rm min}$ vs. $M_H$ curve (preliminary; $m_H$ in GeV). The line is the result of the fit using all the electroweak measurements; the band represents an estimate of the theoretical error due to missing higher order corrections. The vertical band shows the 95% CL exclusion limit on $M_H$ from the direct search. The dashed curve is the result obtained using the evaluation of $\Delta\alpha^{(5)}_{\rm had}(M_Z^2)$ from Reference 12. The legend lists $\Delta\alpha^{(5)}_{\rm had}$ = 0.02804 ± 0.00065 and 0.02755 ± 0.00046.
Figure 22: Tree-level diagrams for W+ + W~ —> W+ + W~ scattering. Figure courtesy of "QCD and Collider Physics" by R.K. Ellis, W.J. Stirling and B.R. Webber.
4 Search for Electroweak Symmetry Breaking

As stated in Section 3, the Higgs sector can be probed by the precision electroweak measurements by means of radiative corrections. However, the nature of the symmetry breaking sector can only be established by its direct discovery and detailed study. The LEP-2 experiments have searched for a light Higgs boson up to ~115 GeV. In March 2001, the CDF and DØ experiments at the Tevatron will begin a high-luminosity run with considerable sensitivity to new physics, offering promise for light-Higgs searches in the future. For more massive Higgs bosons, the focus of attention shifts to higher-energy colliders, in particular the LHC, where $pp$ collisions will be studied at $\sqrt{s} = 14$ TeV, beginning in about 2006. Possible future $e^+e^-$ linear colliders would offer complementary possibilities for the study of electroweak symmetry breaking.

4.1 Theoretical Constraints on the Higgs mass
The Higgs mass cannot be too large; as it increases, the amplitude for WW scattering via Higgs exchange (see Figure 22 for the diagrams) becomes large, and the contribution from the lowest-order diagrams exceeds the perturbative unitarity limit unless $M_H < 1$ TeV. It is also true that very massive Higgs bosons are wide resonant states (see Figure 23), and therefore their detection in invariant mass distributions becomes difficult.

Figure 23: The total Higgs decay width as a function of $M_H$ [GeV] for $M_{\rm top} = 175$ GeV. Figure courtesy of "QCD and Collider Physics" by R.K. Ellis, W.J. Stirling and B.R. Webber.

Another limit on $M_H$ comes from consideration of the renormalized $\lambda$ parameter which appears in the Higgs potential $V(\phi^\dagger\phi)$
(52)
where
if new physics comes in at O(1 TeV). In other words, the Higgs mass can be almost arbitrary. See the detailed discussion of the Higgs mass constraints in Chris Quigg's lectures in these proceedings.

4.2 Searches for the Standard Model Higgs
Searches at LEP-2
The LEP-2 experiments have searched for the Higgs boson via the process
$$e^+e^- \to Z^* \to ZH \to (f\bar f + b\bar b)\ \text{or}\ (f\bar f + \tau^+\tau^-).$$
The final-state Z boson can be produced either on mass-shell if $\sqrt{s} > M_H + M_Z$, or off mass-shell if $\sqrt{s} < M_H + M_Z$. Figure 24 shows the cross section as a function of $\sqrt{s}$ for three different values of $M_H$. The LEP-2 collider reached a center-of-mass energy of 208 GeV, higher than the design value. With this machine, the Higgs boson can be discovered up to a mass of ~110 GeV, where it decays mainly into $b\bar b$ or $\tau^+\tau^-$ (see the branching fractions in Figure 25). At their highest energy, the four LEP-2 experiments have seen a $2.9\sigma$ hint with $M_H = 115$ GeV (the excess is dominated by the ALEPH experiment) [14]. For instance, with a set of cuts that gives a signal-to-background ratio of about 2, the numbers of background events and of signal events for a 115 GeV Higgs are estimated to be 1.72 and 3.03 (four experiments combined), respectively. The number of candidates observed is 4. With the same data, one can set a bound on the Higgs mass: $M_H < 113$ GeV is excluded at the 95% confidence level. LEP-2 was shut down in October 2000, passing the baton to the Tevatron for the Higgs searches (see the cartoon, Figure 26, made by the ALEPH experiment).
Figure 24: Cross section for $e^+e^- \to ZH$ production at LEP-2 as a function of $\sqrt{s}$ [GeV]. Figure courtesy of "QCD and Collider Physics" by R.K. Ellis, W.J. Stirling and B.R. Webber.
Searches at the Tevatron
At the Tevatron, the CDF and DØ experiments will begin a high-luminosity run in March 2001 with a 10% increase in the center-of-mass energy, corresponding to a ~30% increase in the production cross section. Both experiments have undergone major upgrades to their detectors, triggers, and data handling systems. The upgraded accelerator and experiments offer discovery potential for the light Higgs boson before the LHC turns on. Figure 27 shows the cross sections as a function of the Higgs mass for various production mechanisms. For a Higgs lighter than ~120 GeV, the search strategy is via the process
$$q\bar q \to V^* \to VH \to f\bar f\, b\bar b,$$
where V is W or Z. Although many more Higgs bosons are produced via $q\bar q \to H \to b\bar b$, they will be very difficult to detect due to the large QCD background via $q\bar q \to g \to b\bar b$.
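The LEP-2 event counts quoted above (1.72 expected background events, 3.03 expected signal events, 4 observed candidates) can be put into perspective with a simple Poisson counting estimate. The sketch below is illustrative only: it treats the search as a single counting experiment, which is an assumption made here and not the combined likelihood analysis used by the LEP experiments, so it is not meant to reproduce the quoted $2.9\sigma$. The function names are hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Probability of observing exactly k events for a Poisson mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def poisson_tail(n_obs, mean):
    """Probability of observing n_obs or more events for a Poisson mean."""
    return 1.0 - sum(poisson_pmf(k, mean) for k in range(n_obs))


# Numbers quoted in the text (four experiments combined).
background = 1.72
signal = 3.03
observed = 4

# Chance that background alone fluctuates up to the observed count.
p_background_only = poisson_tail(observed, background)
# Chance of seeing at least this many events if a 115 GeV Higgs exists.
p_signal_plus_background = poisson_tail(observed, background + signal)

print(f"P(N >= {observed} | background only)     = {p_background_only:.3f}")
print(f"P(N >= {observed} | signal + background) = {p_signal_plus_background:.3f}")
```

In this simplified picture the observed count is comfortably compatible with the signal hypothesis, while a background-only fluctuation of this size is not rare, which is why such an excess is called a hint and not a discovery.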
Figure 25: Branching fractions of the Standard Model Higgs boson as a function of $M_H$ [GeV] for a light Higgs (top) and a heavy Higgs (bottom). The top quark mass is 175 GeV. Figure courtesy of "QCD and Collider Physics" by R.K. Ellis, W.J. Stirling and B.R. Webber.
Figure 26: A cartoon made by the ALEPH experiment after the LEP-2 shutdown.
For a Higgs heavier than ~130 GeV, $q\bar q \to H \to W^+W^{-*}$ or $ZZ^*$ will be the dominant processes for the searches.
The lines in Figure 28 show the total luminosity required per experiment for $2\sigma$, $3\sigma$, and $5\sigma$ excesses due to the Higgs as a function of the Higgs mass. The Tevatron is expected to deliver ~2 fb$^{-1}$ per experiment in the first 3 years (called "Run IIa"). With the Run IIa data, the Tevatron experiments will observe a $2\sigma$ excess if the Higgs mass is 115 GeV. With a further machine upgrade, CDF and DØ expect to have ~10-15 fb$^{-1}$ per experiment by the year 2006 (called "Run IIb"). With this luminosity, they will see up to a $3\sigma$ excess over most of the region with $M_H < 180$ GeV.
Searches at LHC
For more massive Higgs bosons, we must wait for the LHC. The LHC will produce light Higgs bosons copiously (see Figure 29), but they will suffer from severe backgrounds (more severe than at the Tevatron). One strategy is to look for $gg \to H \to \gamma\gamma$ events. The branching fraction of $H \to \gamma\gamma$ (see Figure 25) is extremely small (less than 1%), but backgrounds will be much smaller. For heavier Higgs bosons, the production cross section is smaller, but backgrounds are more manageable. The best process for this case could be $gg \to H \to ZZ^* \to \ell^+\ell^-\ell^+\ell^-$. The invariant mass distributions for these two cases are shown in Figures 30 and 31.
4.3 Higgs Beyond the Standard Model
The Higgs mechanism, while it has many attractive features, is not without difficulties. There is, for example, a severe fine-tuning ('naturalness') problem associated with preventing the Higgs mass being renormalized up to some very high 'new-physics' or grand unification scale. Supersymmetry (SUSY) is one way of circumventing this difficulty while retaining the essential features of the Standard Model Higgs sector. In the minimal supersymmetric model (MSSM), it is necessary to introduce two complex doublets of Higgs fields, and the masses and couplings of the
Figure 27: Higgs production cross sections $\sigma(p\bar p \to H + X)$ [pb] at the Tevatron as a function of $M_H$ [GeV], for $\sqrt{s} = 2$ TeV, $M_t = 175$ GeV and CTEQ4M parton distributions.
resulting five physical Higgs bosons are expressible (at leading order) in terms of just two parameters. The key feature of the minimal SUSY model is the existence of at least one neutral scalar Higgs boson with a mass less than $O(150\,\mathrm{GeV})$, the exact upper limit depending weakly on the parameters of the model and on $M_{top}$. Note that a large fraction of SUSY models predict $M_{Higgs} < 130$ GeV. In most respects, the lightest minimal SUSY Higgs boson behaves exactly like the Standard Model Higgs, and so the calculations and search strategies described in this section again apply. Via loop diagrams, new particles contribute to the electroweak parameters. Figure 32 shows the predicted $M_W$-$M_{top}$ region in the MSSM which is consistent with the current electroweak measurements.
5 Future Prospects
There are three possible scenarios for this decade.
• We will discover the Higgs boson with $M_{Higgs} < 130$ GeV. This will lead to another question: is it the Standard Model Higgs? If not, this will imply new physics.
• We will discover the Higgs boson with $M_{Higgs} > 130$ GeV. This will rule out a large fraction of SUSY models. It will also raise the question, "Does this agree with the indirect prediction from the electroweak precision measurements?"
• We will not discover the Higgs boson up to LHC energies. In this scenario, we expect detectable effects to appear in the production rate and properties of W boson pairs at ~1 TeV.
Whatever the outcome, it will be extremely interesting. At present, it is essentially an experimental question.
Figure 28: Higgs discovery potential at the Tevatron Run II (combined CDF/DØ thresholds): integrated luminosity per experiment required for a 95% CL limit, $3\sigma$ evidence and $5\sigma$ discovery, as a function of the Higgs mass (GeV/c$^2$).
Figure 29: Higgs production cross section in pp collisions at the LHC ($pp \to H + X$, $\sqrt{s} = 14$ TeV) as a function of $M_H$ [GeV]. Figure courtesy of "QCD and Collider Physics" by R.K. Ellis, W.J. Stirling and B.R. Webber.
Acknowledgments
I would like to thank Jon Rosner, who organized TASI-2000 and edited the Lecture Notes. I would also like to thank and acknowledge the collaborators of the CDF, DØ and NuTeV experiments, and the ALEPH, DELPHI, L3, OPAL, and SLD experiments. This work is supported by the U.S. Department of Energy.
References
1. G. Arnison et al. (UA1 Collaboration), PLB 122, 103 (1983); M. Banner et al. (UA2 Collaboration), PLB 122, 476 (1983).
2. F. Abe et al. (CDF Collaboration), PRL 74, 2626 (1995), PRD 50, 2966 (1994); S. Abachi et al. (DØ Collaboration), PRL 74, 2632 (1995).
3. A. Czarnecki and W.J. Marciano, Report BNL-HET-98/43, hep-ph/9810512.
4. The LEP Collaborations, the LEP Electroweak Working Group, the SLD Heavy Flavour and Electroweak Working Group, hep-ex/0103048.
5. F. Abe et al. (CDF Collaboration), PRD 63, 032003 (2001); F. Abe et al. (CDF Collaboration), PRL 82, 271 (1999); B. Abbott et al. (DØ Collaboration), PRD 58, 052001 (1998).
6. C. Caso et al., Eur. Phys. J. C3, 1 (1998).
7. V. Barger et al., Z. Phys. C 21, 99 (1983); B. J. Smith et al., Phys. Rev. Lett. 50, 1738 (1983).
8. J. Alitti et al. (UA2 Collaboration), Phys. Lett. B 276, 246 (1992); T. Affolder et al. (CDF Collaboration), hep-ex/0007044 (submitted to Phys. Rev. D); S. Abachi et al. (DØ Collaboration), PRL 84, 222 (2000).
9. T. Affolder et al. (CDF Collaboration), submitted to PRL (2000).
10. K. S. McFarland et al., Eur. Phys. Jour. C31, 509 (1998).
11. M. Swartz, hep-ph/9509248; S. Eidelman, F. Jegerlehner, hep-ph/9502298.
12. B. Pietrzyk, The global fit to electroweak data, talk presented at ICHEP2000, Osaka, July 27 - August 2, 2000. Preprint LAP-EXP-00.06.
13. K. Hoffman, Year 2000 update for OPAL and LEP HIGGSWG results, talk presented at ICHEP2000, Osaka, July 27 - August 2, 2000.
14. ALEPH Collaboration, CERN-EP/2000-138.
Figure 30: Expected $M_{\gamma\gamma}$ spectrum (GeV) as simulated in the ATLAS detector at the LHC for $M_H = 120$ GeV and one year of data. The dotted line is the background spectrum, and the solid line is the combined background and signal spectrum. Figure courtesy of "QCD and Collider Physics" by R.K. Ellis, W.J. Stirling and B.R. Webber.
Figure 31: Expected $M_{ZZ}$ spectrum (GeV) as simulated in the ATLAS detector at the LHC for $M_H = 800$ GeV and one year of data. The different histograms represent different selection cuts. Figure courtesy of "QCD and Collider Physics" by R.K. Ellis, W.J. Stirling and B.R. Webber.
Figure 32: The direct and indirect measurements of the W and top quark masses from LEP, the Tevatron, and SLC, overlaid with the allowed region from the MSSM. The $M_W$-$M_{top}$ contours are at 68% CL; the horizontal axis is $M_{top}$ (GeV/c$^2$).
KAON AND CHARM PHYSICS: THEORY
G. BUCHALLA
Theory Division, CERN, CH-1211 Geneva 23, Switzerland
E-mail: [email protected]
We introduce and discuss basic topics in the theory of kaons and charmed particles. In the first part, theoretical methods in weak decays such as operator product expansion, renormalization group and the construction of effective Hamiltonians are presented, along with an elementary account of chiral perturbation theory. The second part describes the phenomenology of the neutral kaon system, CP violation, $\varepsilon$ and $\varepsilon'/\varepsilon$, rare kaon decays ($K \to \pi\nu\bar\nu$, $K_L \to \pi^0 e^+e^-$, $K_L \to \mu^+\mu^-$), and some examples of flavour physics in the charm sector.
1 Preface
These lectures provide an introduction to the theory of weak decays of kaons and mesons with charm. Our main focus will be on kaon physics, which has led to many deep and far-reaching insights into the structure of matter, is a very active field of current research and still continues to hold exciting opportunities for future discoveries. Another, and in several ways complementary, source of information about flavour physics is the charm sector. Standard model effects for rare processes are in this case typically suppressed to almost negligible levels, and positive signals, if observed, could therefore yield spectacular evidence of new physics. Towards the end of the lectures we will describe a few selected examples in charm physics and contrast their characteristic features with those of the kaon sector. For both subjects we will concentrate on the flavour physics of the standard model. We discuss the phenomenology as well as the theoretical tools necessary to achieve a detailed and comprehensive test of the standard model picture that should eventually lead us to uncover signals of the physics beyond. The experimental aspects of these fields are explained in the lectures by Barker (kaon physics) and Cumalat (charm physics) at this School.
Before we start our tour of flavour physics with kaons and charm, we give a brief outline of the contents of these lectures. We begin, in the following section 2, by recalling some of the historical highlights of kaon physics and with an overview of the main topics of current interest in this field. In section 3 we introduce theoretical methods that are fundamental for the computation of weak decay processes and for relating the basic parameters of the underlying theory to actual observables. These important tools are the operator product expansion, the renormalization group, the effective low-energy weak Hamiltonians, where we discuss the general $\Delta S = 1$ Hamiltonian as an explicit example, and, finally, chiral perturbation theory. Our main emphasis will be on an elementary introduction of the relevant ideas and concepts, rather than on more specialized technical aspects. With this background in mind we will then address, in section 4, the phenomenology of the neutral-kaon system and CP violation. We discuss a common classification of CP violation, the kaon CP parameters $\varepsilon$ and $\varepsilon'/\varepsilon$, and the standard analysis of the CKM unitarity triangle. Section 5 is devoted to the physics of rare kaon decays, in particular the "golden" channels $K^+ \to \pi^+\nu\bar\nu$ and $K_L \to \pi^0\nu\bar\nu$, and the processes $K_L \to \pi^0 e^+e^-$ and $K_L \to \mu^+\mu^-$. In section 6 we discuss the prominent features of flavour physics with charm. We present some opportunities with rare decays of D mesons and describe the phenomenology of $D^0$-$\bar D^0$ mixing, which is of current interest in view of new experimental measurements by the CLEO and FOCUS collaborations. Finally, section 7 summarizes the main points and presents an outlook on future opportunities.
We conclude these preliminary remarks by mentioning several review articles, which the interested reader may consult for further details on the topics presented here, for a discussion of related additional processes and for a complete collection of references to the original literature. Very useful accounts of rare and radiative kaon decays and of kaon CP violation can be found in [1,2,3,4,5,6,7]. The first five articles also discuss the relevant experiments. Nice reviews on flavour physics with charm are [8,9,10,11]. Further details on theoretical methods in weak decays are provided in [12,13].
2 Kaons: Introduction and Overview
2.1 Historical Highlights
The history of kaon physics is remarkably rich in groundbreaking discoveries. It will be interesting to briefly recall some of the most exciting examples here. We do not attempt to give a historically accurate account of the development of kaon physics. This is a fascinating subject in itself. For a more complete historical picture the reader may consult the excellent book by Cahn and Goldhaber [14]. Here we will content ourselves with a brief sketch of several highlights related to kaon physics. They serve to illustrate how the observation of unexpected - and sometimes tiny - effects in this field is linked to basic concepts in our theoretical understanding of the fundamental interactions.
Strangeness
Already the discovery of kaons alone, half a century ago, has had an impressive impact on the development of high-energy physics. One of the characteristic features of the new particles was associated production, that is, they were always produced in pairs by strong interactions, for instance as
$$\begin{array}{cccccccccc} & \pi^+ & + & p & \to & K^+ & + & \bar K^0 & + & p \\ S: & 0 & & 0 & & +1 & & -1 & & 0 \end{array}$$
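The strangeness bookkeeping in the reaction above can be sketched in a few lines. This is a minimal illustration, assuming the standard quark assignments for the pion, proton and $\Lambda$ (these are not spelled out in the text) and using hypothetical helper names; it only checks that the total strangeness is the same before and after a strong process.

```python
# Strangeness carried by each quark flavour; all other quarks carry S = 0.
QUARK_STRANGENESS = {"s": -1, "sbar": +1}

# Quark content of the hadrons involved (standard assignments, assumed here).
HADRONS = {
    "pi+": ("u", "dbar"),
    "p": ("u", "u", "d"),
    "K+": ("sbar", "u"),
    "K0bar": ("s", "dbar"),
    "Lambda": ("u", "d", "s"),
}


def strangeness(hadron):
    """Strangeness of a single hadron from its quark content."""
    return sum(QUARK_STRANGENESS.get(q, 0) for q in HADRONS[hadron])


def total_strangeness(hadrons):
    """Total strangeness of a list of hadrons."""
    return sum(strangeness(h) for h in hadrons)


initial = ["pi+", "p"]
final = ["K+", "K0bar", "p"]

print("S(initial) =", total_strangeness(initial))
print("S(final)   =", total_strangeness(final))
# A K+ produced together with a Lambda also has zero net strangeness.
print("S(K+) + S(Lambda) =", total_strangeness(["K+", "Lambda"]))
```

Single kaon production by strong interactions would change the total from 0 to +1 or -1, which is why the new particles always appeared in pairs.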
(Alternatively, a $K^+$ could be produced along with a $\Lambda(uds)$ baryon.) This property, together with the long lifetime, suggested the existence of a new quantum number, called strangeness, carried by the kaons and conserved in strong interactions. The discovery of strangeness opened the way for the SU(3) classification of hadrons and the introduction of quarks (u, d, s) as the fundamental representation by Gell-Mann. The quark picture, in turn, formed the basis for the subsequent development of QCD. In modern notation the K mesons come in the following varieties
$$K^+(\bar s u) \qquad K^0(\bar s d) \qquad K^-(s\bar u) \qquad \bar K^0(s\bar d)$$
where the flavour content is indicated in brackets. The pairs $(K^+, K^0)$ and $(\bar K^0, K^-)$ are doublets of isospin.
Parity Violation
The new mesons proved to be strange particles indeed. One of the peculiarities is known as the $\theta$-$\tau$ puzzle. Two particles decaying as $\theta \to 2\pi$ (P even final state) and $\tau \to 3\pi$ (P odd), and hence apparently of different parity, were observed to have the same mass and lifetime. This situation prompted Lee and Yang to propose that parity might not be conserved in weak interactions. This was later confirmed in the famous $^{60}$Co experiment by C.S. Wu. Today the $\theta^+$ and $\tau^+$ are known to be identical to the $K^+$ meson, and parity violation is firmly encoded in the chiral SU(2)$_L$ gauge group of standard model weak interactions.
CP Violation
After the recognition of parity violation in weak processes the combination of parity with charge conjugation, CP, still appeared to be a good symmetry. The neutral kaons $K^0$ and $\bar K^0$ were known to mix through second-order weak interactions to form, if CP was conserved, the CP eigenstates $K_{L,S} = (K^0 \pm \bar K^0)/\sqrt{2}$ (here $CP\,K^0 = -\bar K^0$). Clearly, CP symmetry then forbids the decay of the CP-odd $K_L$ into the CP-even $\pi^+\pi^-$ final state. Instead, Christenson, Cronin, Fitch and Turlay showed in 1964 that the decay does in fact occur, establishing CP violation. Compared to the CP-allowed decay $K_S \to \pi^+\pi^-$ the amplitude is measured to be
$$\frac{A(K_L \to \pi^+\pi^-)}{A(K_S \to \pi^+\pi^-)} \approx 2.3\cdot 10^{-3} \qquad (1)$$
CP violation is thus a very small effect, in contrast to P violation, but the qualitative implications are nevertheless far-reaching. As we will discuss later, CP violation defines an absolute, and not only conventional, difference between matter and anti-matter. Also, as we now know, CP violation in a sense indirectly anticipated the need for three families of fermions within the standard model. Finally, CP violation is a necessary prerequisite for the generation of a net baryon number in our universe according to Sakharov's three conditions (the other two being baryon number violation and a departure from thermal equilibrium).
FCNC Suppression
Another striking property of weak interactions that manifested itself in kaon decays is the suppression of flavour-changing neutral currents (FCNC). While the standard, charged-current mediated process $K^+ \to \mu^+\nu$ has a branching fraction of order unity
$$B(K^+ \to \mu^+\nu) = 0.64 \qquad (2)$$
the similar-looking neutral-current decay $K_L \to \mu^+\mu^-$ is suppressed to a tiny level
$$B(K_L \to \mu^+\mu^-) \approx 7\cdot 10^{-9} \qquad (3)$$
Naively, a "three-quark standard model" would allow an $\bar s dZ$ coupling at tree level. This would lead to a $K_L \to \mu^+\mu^-$ amplitude of strength $G_F$, comparable to $K^+ \to \mu^+\nu$, in plain disagreement with (3). Even if the tree-level $\bar s dZ$ coupling were forbidden, the problem would reappear at one loop. This is illustrated in the second diagram of Fig. 1. The loop integral is divergent, and a natural cut-off could be expected at the weak scale $\sim M_W$. The amplitude should then be of the order $G_F^2 M_W^2$, which would still be far too large. Of course, the three-quark model is not renormalizable and therefore not a consistent theory at short distances. The introduction of the charm quark by Glashow, Iliopoulos and Maiani (GIM) solves all of these problems, which plagued the early theory of weak interactions. The complete two-generation standard model is perfectly consistent and the tree-level $\bar s dZ$ coupling is automatically eliminated by the orthogonality of the $2\times 2$ Cabibbo mixing matrix.
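The role of orthogonality in removing the tree-level flavour-changing coupling can be checked numerically. The sketch below is an illustration under stated assumptions: the Cabibbo matrix is taken as a real $2\times 2$ rotation with $\sin\theta_C = 0.2196$, the neutral current is assumed to couple diagonally to the rotated down-type fields, and the function names are hypothetical. With only the up quark (three-quark model) a single rotated combination $d\cos\theta_C + s\sin\theta_C$ enters; with up and charm both rows of the matrix contribute.

```python
import math


def cabibbo_matrix(sin_theta):
    """Real 2x2 rotation mixing (d, s); rows correspond to (u, c)."""
    cos_theta = math.sqrt(1.0 - sin_theta ** 2)
    return [[cos_theta, sin_theta],
            [-sin_theta, cos_theta]]


def neutral_current_structure(rows):
    """Flavour structure sum_i V[i][a] * V[i][b] in the (d, s) mass basis.

    An off-diagonal entry (a != b) signals a tree-level d-s neutral coupling.
    """
    n = len(rows[0])
    return [[sum(r[a] * r[b] for r in rows) for b in range(n)]
            for a in range(n)]


V = cabibbo_matrix(0.2196)

# Three-quark model: only the partner of the up quark is present.
three_quark = neutral_current_structure([V[0]])
# Four-quark (GIM) model: partners of both up and charm are present.
four_quark = neutral_current_structure(V)

print("d-s coupling, three quarks:", three_quark[0][1])
print("d-s coupling, four quarks: ", four_quark[0][1])
```

The off-diagonal entry equals $\sin\theta_C\cos\theta_C$ in the three-quark case and cancels between the two rows once charm is included, which is the tree-level statement of the GIM mechanism.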
Figure 1: GIM Mechanism.
The $\bar s dZ$ coupling can still be induced at one-loop order, but the disturbing $G_F^2 M_W^2$ term is now canceled between the up-quark and the charm-quark contributions. The remaining effect is, up to logarithms, merely of the order $G_F^2 m_c^2$, which is well compatible with (3), unless $m_c$ were too large. To turn this observation into a more quantitative constraint on the charm-quark mass $m_c$ is, however, not easy in this case because $K_L \to \mu^+\mu^-$ is actually dominated by long-distance contributions (we will discuss this further in section 5.3). Another FCNC process, $K^0$-$\bar K^0$ mixing, proved to be more useful in this respect.
$K$-$\bar K$ mixing, GIM and Charm
The following example represents one of the great triumphs of early standard model phenomenology. In the four-quark theory, $K$-$\bar K$ mixing occurs through $\Delta S = 2$ W-box diagrams with internal up and charm quarks. This $\Delta S = 2$ transition induces a tiny off-diagonal element $M_{12}$ in the mass matrix
$$\begin{pmatrix} M & M_{12} \\ M_{12} & M \end{pmatrix} \qquad (4)$$
of the $K$-$\bar K$ system. The corresponding eigenstates are $K_{L,S} = (K^0 \pm \bar K^0)/\sqrt{2}$ with eigenvalues $M_{L,S}$. The difference between the eigenvalues $\Delta M_K = M_L - M_S$ is related to $M_{12}$ and can be estimated from the box diagrams (see Fig. 2). Anticipating a more detailed discussion of the calculation, we simply quote the result:
$$\frac{\Delta M_K}{M_K} = \frac{G_F^2 f_K^2}{6\pi^2}\,|V_{cs}V_{cd}|^2\, m_c^2 = 7\cdot 10^{-15} \qquad (5)$$
where the number on the right is the experimental value. The theoretical expression is approximate since we have taken the required hadronic matrix element in the so-called factorization approximation, we have neglected the (fairly small) up-quark contribution and assumed that $m_c$
Figure 2: $K^0$-$\bar K^0$ mixing.
Figure 3: (Semi)leptonic K decays.
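The estimate of $\Delta M_K/M_K$ in Eq. (5) can be inverted to give a rough value for the charm-quark mass. The sketch below is illustrative and rests on several assumptions stated here: the relation is taken in the form $\Delta M_K/M_K = G_F^2 f_K^2 |V_{cs}V_{cd}|^2 m_c^2/(6\pi^2)$, the experimental ratio is taken as $7\cdot 10^{-15}$, $|V_{cs}V_{cd}|$ is approximated by $\sin\theta_C\cos\theta_C$ with $\sin\theta_C = 0.2196$, $f_K = 160$ MeV, and $G_F = 1.166\cdot 10^{-5}$ GeV$^{-2}$. The function name is hypothetical.

```python
import math

G_F = 1.166e-5          # Fermi constant in GeV^-2 (assumed input)
F_K = 0.160             # kaon decay constant in GeV
SIN_CABIBBO = 0.2196    # sine of the Cabibbo angle, |V_us|
DELTA_M_OVER_M = 7e-15  # Delta M_K / M_K, taken as the experimental value


def charm_mass_from_mixing(ratio, g_f=G_F, f_k=F_K, sin_c=SIN_CABIBBO):
    """Solve ratio = G_F^2 f_K^2 |V_cs V_cd|^2 m_c^2 / (6 pi^2) for m_c (GeV)."""
    cos_c = math.sqrt(1.0 - sin_c ** 2)
    ckm_factor = (sin_c * cos_c) ** 2  # two-generation approximation
    m_c_squared = ratio * 6.0 * math.pi ** 2 / (g_f ** 2 * f_k ** 2 * ckm_factor)
    return math.sqrt(m_c_squared)


m_c = charm_mass_from_mixing(DELTA_M_OVER_M)
print(f"Estimated charm-quark mass: {m_c:.2f} GeV")
```

With these inputs the estimate lands in the 1-2 GeV range. Given the approximations involved, only the order of magnitude should be taken seriously.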
those of the top quark entering CP violating amplitudes, are prominent examples that illustrate this point. In fact, we see that several of the most crucial pillars of the standard model rest on results derived from studies with kaons. Of course, direct experiments at high energies, which have led for instance to the production of on-shell W and Z bosons or quark jets, are indispensable for exploring the structure of matter. However, indirect, low-energy precision observables are equally necessary as a complementary approach. They can yield information that is hardly accessible in any other way, such as the elucidation of the GIM structure of flavour physics or the violation of CP symmetry. It is with this philosophy in mind that studies of rare kaon processes, but also rare decays of b hadrons or charmed particles, continue to be pursued with great interest. The most promising future opportunities with kaons will be the subject of later sections in these lectures. We conclude this introductory chapter with a brief general overview of physics with kaon decays.
2.2 Overview of K decays
We may classify the decays of K mesons into several broad categories, some of which are more determined by nonperturbative strong interaction dynamics, while others have a high sensitivity to short-distance physics both in the standard model and beyond.
Tree-Level (Semi-) Leptonic Decays
These are the simplest decays of kaons. They typically have large branching ratios and are well studied. Examples are the purely leptonic decay $K^+ \to \mu^+\nu$ and the semileptonic mode $K^+ \to \pi^0 e^+\nu$, which are illustrated in Fig. 3. $K^+ \to \pi^0 e^+\nu$ is very important for determining the CKM matrix element $V_{us}$ (the sine of the Cabibbo angle). This is possible because the hadronic matrix element of the vector current $\langle \pi^0|(\bar s u)_V|K^+\rangle$ is absolutely normalized in the limit of SU(3) flavour symmetry and protected from first-order corrections in the SU(3) breaking (Ademollo-Gatto theorem). One finds
$$|V_{us}| = 0.2196 \pm 0.0023 \qquad (6)$$
This is a basic input for the CKM matrix. Furthermore, knowing $|V_{us}|$, $K^+ \to \mu^+\nu$ may be used to determine the kaon decay constant $f_K = 160$ MeV.
Nonleptonic Decays
Nonleptonic decays, such as $K \to \pi\pi$, are strongly affected by nonperturbative QCD dynamics. Nevertheless they provide an important window on the violation of discrete symmetries, P and CP for example. CP violation is currently of special interest. It enters through $K$-$\bar K$ mixing via box graphs or through penguin diagrams in the decay amplitudes, with the virtual top quarks playing a decisive role. This is sketched in Fig. 4.
Figure 4: SM origin of CP violation in $K \to \pi\pi$ decays.
Long-Distance Dominated Rare and Radiative Decays
Examples of this class of processes are $K^+ \to \pi^+\ell^+\ell^-$, $K_L \to \pi^0\gamma\gamma$, $K_S \to \gamma\gamma$ or $K_L \to \mu^+\mu^-$. A typical contribution to $K^+ \to \pi^+e^+e^-$ is illustrated in Fig. 5. These processes are determined by nonperturbative low-energy strong interactions and can be analyzed in the framework of chiral perturbation theory. The treatment of nonperturbative dynamics within a first-principles approach as provided by chiral perturbation theory is of great interest in its own right. In addition, the control over long-distance contributions afforded by chiral perturbation theory can be helpful to extract information on the flavour physics from short distances.
Figure 5: Typical contribution to $K^+ \to \pi^+e^+e^-$.
Short-Distance Dominated Rare Decays
The prime examples in this category are the processes $K^+ \to \pi^+\nu\bar\nu$ and $K_L \to \pi^0\nu\bar\nu$. Here short-distance dynamics completely dominates the decay and the access to flavour physics is very clean and direct. For this reason the $K \to \pi\nu\bar\nu$ modes are special highlights among the future opportunities in kaon physics. To a somewhat lesser extent $K_L \to \pi^0e^+e^-$ also qualifies for this class. In this case the fact that the process is predominantly CP violating enhances the sensitivity to short-distance physics in comparison to $K^+ \to \pi^+e^+e^-$.
Decays Forbidden in the SM
Any positive signal in processes that are forbidden in the standard model would be a very dramatic indication of new physics. A good example is kaon decays with lepton-flavour violation. Stringent experimental limits exist for several modes of interest:
$$B(K_L \to \mu e) < 4.7\cdot 10^{-12} \qquad (7)$$
$$B(K^+ \to \pi^+\mu^+e^-) < 4.8\cdot 10^{-11} \qquad (8)$$
$$B(K_L \to \pi^0\mu e) < 3.2\cdot 10^{-9} \qquad (9)$$
In principle these processes could be induced through loops in the standard model with neutrino masses. However, the smallness of the neutrino masses compared to the weak scale results in unmeasurably small branching fractions, typically below $10^{-25}$. Larger values can be obtained within the minimal supersymmetric standard model (MSSM). However, there are strong constraints from direct limits on $\mu \to e$ conversion processes ($\mu \to e\gamma$ decay, or $\mu \to e$ conversion in the field of a nucleus). The disadvantage of $K_L \to \mu e$ is that flavour violation is needed simultaneously both in the lepton sector and in the quark sector. Interesting effects could however still occur in some regions of parameter space. Systematically larger branching ratios are allowed in scenarios with R-parity violation, where decays such as $K_L \to \mu e$ can proceed at tree level [15]. In very general terms, a scenario where the exchange of a heavy boson X mediates $\bar s d \to \mu e$ transitions at tree level receives strong constraints from the tight experimental bound (7). Assuming couplings of electroweak strength, the bound (7) implies a lower limit on the X mass of $M_X \gtrsim 100$ TeV. Such a sensitivity to high energy scales is very impressive; however, one has to remember that the tree-level scenario assumed above is quite simple-minded and in general very model dependent. Generically, one would expect some additional suppression mechanism to be at work in $K_L \to \mu e$. In this case the scale probed would be lower, but the high precision of (7) still guarantees an excellent sensitivity to subtle short-distance effects.
Figure 6: QCD effects in weak decays.
3 Theoretical Methods in Weak Decays
The task of computing weak decays of kaons represents a complicated problem in quantum field theory. Two typical cases, the first-order nonleptonic process $K^0 \to \pi^+\pi^-$ and the loop-induced, second-order weak transition $K^+ \to \pi^+\nu\bar\nu$, are illustrated in Fig. 6. The dynamics of the decays is determined by a nontrivial interplay of strong and electroweak forces, which is characterized by several energy scales of very different magnitude, the W mass, the various quark masses and the QCD scale: $m_t, M_W \gg m_c \gg \Lambda_{QCD} \gg m_u, m_d, (m_s)$. While it is usually sufficient to treat electroweak interactions to lowest nonvanishing order in perturbation theory, it is necessary to consider all orders in QCD. Asymptotic freedom still allows us to compute the effect of strong interactions at short distances perturbatively. However, since kaons are bound states of light quarks, confined inside the hadron by long-distance dynamics, it is clear that nonperturbative QCD interactions also enter the decay process in an essential way. To deal with this situation, we need a method to disentangle long- and short-distance contributions to the decay amplitude in a systematic fashion. The required tool is provided by the operator product expansion (OPE).
3.1 Operator Product Expansion
We will now discuss the basic concepts of the OPE for kaon decay amplitudes. These concepts are of crucial importance for the theory of weak decay processes, not only of kaons, but also of mesons with charm and beauty and other hadrons as well. Consider, for instance, the basic W-boson exchange process shown on the left-hand side of Fig. 7. This diagram mediates the decay of a strange quark and triggers the nonleptonic decay of a kaon such as $K^0 \to \pi^+\pi^-$. The quark-level transition shown is understood to be dressed with QCD interactions of all kinds, including the binding of the quarks into the mesons.
Figure 7: OPE for weak decays.
To simplify this problem, we may look for a suitable expansion parameter, as we are used to doing in theoretical physics. Here, the key feature is provided by the fact that the W mass $M_W$ is very much heavier than the other momentum scales p in the problem ($\Lambda_{QCD}$, $m_u$, $m_d$, $m_s$). We can therefore expand the full amplitude A, schematically, as follows
$$A = C\!\left(\frac{M_W}{\mu}, \alpha_s\right)\cdot\langle Q\rangle + O\!\left(\frac{p^2}{M_W^2}\right) \qquad (10)$$
which is sketched in Fig. 7. Up to negligible power corrections of $O(p^2/M_W^2)$, the full amplitude on the left-hand side is written as the matrix element of a local four-quark operator Q, multiplied by a Wilson coefficient C. This expansion in $1/M_W$ is called a (short-distance) operator product expansion because
the nonlocal product of two bilinear quark-current operators $(\bar s u)$ and $(\bar u d)$ that interact via W exchange is expanded into a series of local operators. Physically, the expansion in Fig. 7 means that the exchange of the very heavy W boson can be approximated by a point-like four-quark interaction. With this picture the formal terminology of the OPE can be expressed in a more intuitive language by interpreting the local four-quark operator as a four-quark interaction vertex and the Wilson coefficient as the corresponding coupling constant. Together they define an effective Hamiltonian $\mathcal{H}_{eff} = C\cdot Q$, describing weak interactions of light quarks at low energies. Ignoring QCD, the OPE reads explicitly (in momentum space)
$$A = i\,\frac{G_F}{\sqrt{2}}\,V_{us}^*V_{ud}\,\frac{M_W^2}{k^2 - M_W^2}\,(\bar s u)_{V-A}(\bar u d)_{V-A} = -i\,\frac{G_F}{\sqrt{2}}\,V_{us}^*V_{ud}\; C\cdot\langle Q\rangle + O\!\left(\frac{k^2}{M_W^2}\right) \qquad (11)$$
with $C = 1$, $Q = (\bar s u)_{V-A}(\bar u d)_{V-A}$ and
$$\mathcal{H}_{eff} = \frac{G_F}{\sqrt{2}}\,V_{us}^*V_{ud}\,(\bar s u)_{V-A}(\bar u d)_{V-A} \qquad (12)$$
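The quality of the point-like approximation can be illustrated numerically. The sketch below assumes that the W exchange enters the amplitude through the propagator factor $M_W^2/(k^2 - M_W^2)$, whose local limit is $-1$, and takes $M_W = 80.4$ GeV as an input; the function names and the sample momenta are hypothetical choices for illustration. It compares the full factor with its local limit for momentum transfers typical of kaon decays.

```python
M_W = 80.4  # W-boson mass in GeV (assumed input)


def propagator_factor(k_squared, m_w=M_W):
    """Full W-exchange factor M_W^2 / (k^2 - M_W^2)."""
    return m_w ** 2 / (k_squared - m_w ** 2)


def relative_correction(k_squared, m_w=M_W):
    """Relative deviation of the full factor from its local limit of -1."""
    local_limit = -1.0
    return abs(propagator_factor(k_squared, m_w) - local_limit)


# Momentum scales in GeV, chosen here as rough stand-ins for
# the pion mass, the kaon mass and the charm-quark mass.
for k in (0.14, 0.5, 1.5):
    correction = relative_correction(k ** 2)
    print(f"k = {k:4.2f} GeV: correction = {correction:.2e}, "
          f"k^2/M_W^2 = {k ** 2 / M_W ** 2:.2e}")
```

For all these scales the correction tracks $k^2/M_W^2$ and stays far below the percent level, which is why the power corrections in the expansion are described as negligible.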
As we will demonstrate in more detail below after including QCD effects, the most important property of the OPE in (10) is the factorization of long- and short-distance contributions: All effects of QCD interactions above some factorization scale $\mu$ (short distances) are contained in the Wilson coefficient $C$. All the low-energy contributions below $\mu$ (long distances) are collected into the matrix elements of local operators $\langle Q\rangle$. In this way the short-distance part of the amplitude can be systematically extracted and calculated in perturbation theory. The problem of evaluating the matrix elements of local operators between hadron states remains. This task requires in general nonperturbative techniques, as for example lattice QCD, but it is considerably simpler than the original problem of the full standard-model amplitude. In some cases symmetry considerations can also help to determine the nonperturbative input. For example, the only matrix element relevant for $K^+ \to \pi^+\nu\bar\nu$ is

$$\langle \pi^+|(\bar s d)_V|K^+\rangle = \sqrt{2}\,\langle \pi^0|(\bar s u)_V|K^+\rangle \qquad (13)$$

where the equality with the right-hand side uses isospin symmetry and allows us to obtain the matrix element from measuring the standard semileptonic mode $K^+ \to \pi^0 l^+\nu$. The short-distance OPE that we have described, the resulting effective Hamiltonian, and the factorization property are fundamental for the theory of
K decays. However, the concept of factorization of long- and short-distance contributions reaches far beyond these applications. In fact, the idea of factorization, in various forms and generalizations, is the key to essentially all applications of perturbative QCD, including the important areas of deep-inelastic scattering and jet or lepton pair production in hadron-hadron collisions. The reason is the same in all cases: Perturbative QCD is a theory of quarks and gluons, but those never appear in isolation and are always bound inside hadrons. Nonperturbative dynamics is therefore always relevant to some extent in hadronic reactions, even if these occur at very high energy or with a large intrinsic mass scale (see also the lectures by Soper in these proceedings). Thus, before perturbation theory can be applied, nonperturbative input has to be isolated in a systematic way, and this is achieved by establishing the property of factorization. It turns out that the weak effective Hamiltonian for K decays provides a nice example to demonstrate the general idea of factorization in simple and explicit terms. We will next discuss the OPE for K decays, now including the effects of QCD, and illustrate the calculation of the Wilson coefficients. A diagrammatic representation for the OPE is shown in Fig. 8. The key to calculating the coefficients $C_i$ is again the property of factorization. Since factorization implies the separation of all long-distance sensitive features of the amplitude into the matrix elements $\langle Q_i\rangle$, the short-distance quantities $C_i$ are, in particular, independent of the external states. This means that the $C_i$ are always the same, no matter whether we consider the actual physical amplitude where the quarks are bound inside mesons, or any other, unphysical amplitude with on-shell or even off-shell external quark lines.
Thus, even though we are ultimately interested in $K \to \pi\pi$ amplitudes, for the perturbative evaluation of $C_i$ we are free to choose any treatment of the external quarks according to our calculational convenience. A convenient choice that we will use below is to take all light quarks massless and with the same off-shell momentum $p$ ($p^2 \neq 0$). The computation of the $C_i$ in perturbation theory then proceeds in the following steps:
• Compute the amplitude $A$ in the full theory (with $W$ propagator) for arbitrary external states.
• Compute the matrix elements $\langle Q_i\rangle$ with the same treatment of external states.
• Extract the $C_i$ from $A = C_i \langle Q_i\rangle$.
We remark that with the off-shell momenta $p$ for the quark lines the amplitude is even gauge dependent and clearly unphysical. However, this dependence is identical for $A$ and $\langle Q_i\rangle$ and drops out in the coefficients. The actual calculation is most easily performed in Feynman gauge. To $O(\alpha_s)$ there are four relevant diagrams, the one shown in Fig. 8 together with the remaining three possibilities to connect the two quark lines with a gluon. Gluon corrections to either of these quark currents need not be considered; they are the same on both sides of the OPE and drop out in the $C_i$. The operators that appear on the right-hand side follow from the actual calculations. Without QCD corrections there is only one operator of dimension 6

$$Q_2 = (\bar s_i u_i)_{V-A}(\bar u_j d_j)_{V-A} \qquad (14)$$

where the colour indices have been made explicit. (The operator is termed $Q_2$ for historical reasons.) To $O(\alpha_s)$ QCD generates another operator

$$Q_1 = (\bar s_i u_j)_{V-A}(\bar u_j d_i)_{V-A} \qquad (15)$$

which has the same Dirac and flavour structure, but a different colour form. Its origin is illustrated in Fig. 9, where we recall the useful identity for SU(N) Gell-Mann matrices

$$(T^a)_{ij}(T^a)_{kl} = -\frac{1}{2N}\delta_{ij}\delta_{kl} + \frac{1}{2}\delta_{il}\delta_{kj} \qquad (16)$$

Figure 9: QCD correction with colour assignment.
It is convenient to employ a different operator basis, defining

$$Q_\pm = \frac{Q_2 \pm Q_1}{2} \qquad (17)$$

The corresponding coefficients are then given by

$$C_\pm = C_2 \pm C_1 \qquad (18)$$

If we denote by $S_\pm$ the spinor expressions that correspond to the operators $Q_\pm$ (in other words: the tree-level matrix elements of $Q_\pm$), the full amplitude can be written as

$$A = \left(1 + \gamma_+ \alpha_s \ln\frac{M_W^2}{-p^2}\right) S_+ + \left(1 + \gamma_- \alpha_s \ln\frac{M_W^2}{-p^2}\right) S_- \qquad (19)$$

Here we have focused on the logarithmic terms and dropped a constant contribution (of order $\alpha_s$, but nonlogarithmic). Further, $p^2$ is the virtuality of the quarks and $\gamma_\pm$ are numbers that we will specify later on. We next compute the matrix elements of the operators in the effective theory, using the same approximations, and find

$$\langle Q_\pm\rangle = \left(1 + \gamma_\pm \alpha_s \left(\frac{1}{\varepsilon} + \ln\frac{\mu^2}{-p^2}\right)\right) S_\pm \qquad (20)$$

The divergence that appears in this case has been regulated in dimensional regularization ($D = 4 - 2\varepsilon$ dimensions). Requiring

$$A = C_+\langle Q_+\rangle + C_-\langle Q_-\rangle \qquad (21)$$

we obtain

$$C_\pm = 1 + \gamma_\pm \alpha_s \ln\frac{M_W^2}{\mu^2} \qquad (22)$$

where the divergence has been subtracted in the minimal subtraction scheme. The effective Hamiltonian we have been looking for then reads

$$\mathcal{H}_{eff} = \frac{G_F}{\sqrt{2}} V_{us}^* V_{ud} \left(C_+(\mu) Q_+ + C_-(\mu) Q_-\right) \qquad (23)$$
with the coefficients $C_\pm$ determined in (22) to $O(\alpha_s \log)$ in perturbation theory. The following points are worth noting:
• The $1/\varepsilon$ (ultraviolet) divergence in the effective theory (20) reflects the $M_W \to \infty$ limit. This can be seen from the amplitude in the full theory (19), which is finite, but develops a logarithmic singularity in this limit. Consequently, the renormalization in the effective theory is directly linked to the $\ln M_W$ dependence of the decay amplitude.
• We observe that although $A$ and $\langle Q_\pm\rangle$ both depend on the long-distance properties of the external states (through $p^2$), this dependence has dropped out in $C_\pm$. Here we see explicitly how factorization is realized. Technically, to $O(\alpha_s \log)$, factorization is equivalent to splitting the logarithm of the full amplitude according to

$$\ln\frac{M_W^2}{-p^2} = \ln\frac{M_W^2}{\mu^2} + \ln\frac{\mu^2}{-p^2} \qquad (24)$$

Ultimately the logarithms stem from loop momentum integrations, and the range of large momenta, between $M_W$ and the factorization scale $\mu$, is indeed separated into the Wilson coefficients.
• To obtain a decay amplitude from $\mathcal{H}_{eff}$ in (23), the matrix elements $\langle f|Q_\pm|K\rangle(\mu)$ have to be taken, normalized at a scale $\mu$. An appropriate value for $\mu$ is close to the hadronic scale in order not to introduce an unnaturally large scale into the calculation of $\langle Q\rangle$. At the same time $\mu$ must also not be too small in order not to render the perturbative calculation of $C(\mu)$ invalid. A typical choice for K decays is $\mu \approx 1$ GeV
matrix elements. Of course, both quantities have to be evaluated in the same scheme to obtain a consistent result. The renormalization scheme is determined in particular by the subtraction constants (minimal or nonminimal subtraction of $1/\varepsilon$ poles), and also by the definition of $\gamma_5$ used in $D \neq 4$ dimensions in the context of dimensional regularization.
• Finally, the effective Hamiltonian (23) can be considered as a modern version of the old Fermi theory for weak interactions. It is a systematic low-energy approximation to the standard model for kaon decays and provides the basis for any further analysis.

3.2 Renormalization Group

Let us have a closer look at the Wilson coefficients, which read explicitly

$$C_\pm = 1 + \frac{\alpha_s(\mu)}{4\pi}\, \gamma_\pm^{(0)} \ln\frac{\mu}{M_W}, \qquad \gamma_+^{(0)} = 4, \quad \gamma_-^{(0)} = -8 \qquad (25)$$
where we have now specified the exact form of the $O(\alpha_s \log)$ correction. Numerically the factor $\alpha_s(\mu)\gamma_\pm^{(0)}/(8\pi)$ is about +7% (-14%), a reasonable size for a perturbative correction (we used $\alpha_s(\mu = 1\,\mathrm{GeV}) = 0.43$). However, this term comes with a large logarithmic factor of $\ln(\mu^2/M_W^2) = -8.8$, for an appropriate scale of $\mu = 1$ GeV. The total correction to $C_\pm - 1$ in (25) is then -60% (120%)! Obviously, the presence of the large logarithm spoils the validity of a straightforward perturbative expansion, despite the fact that the coupling constant itself is still reasonably small. This situation is quite common in renormalizable quantum field theories. Logarithms appear naturally and can become very large when the problem involves very different scales. The general situation is indicated in the following table, where we display the form of the correction terms in higher orders, denoting $\ell = \ln(\mu/M_W)$:

$$\begin{array}{lll} \mathrm{LL} & \mathrm{NLL} & \\ \alpha_s \ell & \alpha_s & \\ \alpha_s^2 \ell^2 & \alpha_s^2 \ell & \alpha_s^2 \\ O(1) & O(\alpha_s) & \end{array}$$

In ordinary perturbation theory the expansion is organized according to powers of $\alpha_s$ alone, corresponding to the rows in the above scheme. This approach is invalidated by the large logarithms since $\alpha_s \ell$, in contrast to $\alpha_s$, is no longer a small parameter, but a quantity of order 1. The problem can be resolved by resumming the terms $(\alpha_s \ell)^n$ to all orders $n$. The expansion is then reorganized in terms of columns of the above table. The first column is of $O(1)$ and yields the leading logarithmic approximation, the second column gives a correction of relative order $\alpha_s$, and so forth. Technically the reorganization is achieved by solving the renormalization group equation (RGE) for the Wilson coefficients. The RGE is a differential equation describing the change of $C_\pm(\mu)$ under a change of scale. To leading order this equation can be read off from (25):

$$\frac{d}{d\ln\mu}\, C_\pm(\mu) = \frac{\alpha_s}{4\pi}\, \gamma_\pm^{(0)}\, C_\pm(\mu) \qquad (27)$$

The quantities $(\alpha_s/4\pi)\gamma_\pm^{(0)}$ are called the anomalous dimensions of $C_\pm$. To understand the term "dimension", compare with the following relation for the quantity $\mu^n$, which has (energy) dimension $n$:

$$\frac{d}{d\ln\mu}\, \mu^n = n \cdot \mu^n \qquad (28)$$
The analogy is obvious. Of course, the $C_\pm(\mu)$ are dimensionless numbers in the usual sense; they can depend on the energy scale $\mu$ only because there is another scale, $M_W$, present under the logarithm in (25). Their "dimension" is therefore more precisely called a scaling dimension, measuring the rate of change of $C_\pm$ with a changing scale $\mu$. The nontrivial scaling dimension derives from $O(\alpha_s)$ loop corrections and is thus a genuine quantum effect. Classically the coefficients are scale invariant, $C_\pm = 1$. Whenever a symmetry that holds at the classical level is broken by quantum effects, we speak of an "anomaly". Hence, the $\gamma_\pm^{(0)}$ represent the anomalous (scaling) dimensions of the Wilson coefficients. We can solve (27), using

$$\frac{d\alpha_s}{d\ln\mu} = -2\beta_0 \frac{\alpha_s^2}{4\pi}, \qquad \beta_0 = \frac{33 - 2f}{3}, \qquad C_\pm(M_W) = 1 \qquad (29)$$

and find

$$C_\pm(\mu) = \left[\frac{\alpha_s(M_W)}{\alpha_s(\mu)}\right]^{\gamma_\pm^{(0)}/(2\beta_0)} = \left[\frac{1}{1 + \beta_0 \frac{\alpha_s(\mu)}{4\pi} \ln\frac{M_W^2}{\mu^2}}\right]^{\gamma_\pm^{(0)}/(2\beta_0)} \qquad (30)$$

This is the solution for the Wilson coefficients $C_\pm$ in leading logarithmic approximation, that is to leading order in RG improved perturbation theory. The all-orders resummation of $\alpha_s \log$ terms is apparent in the final expression in (30).
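As a rough numerical illustration of the difference between the fixed-order form (25) and the resummed form (30), the following Python sketch evaluates both at $\mu = 1$ GeV. The value $\alpha_s(1\,\mathrm{GeV}) = 0.43$ and the anomalous dimensions are taken from the text; $M_W = 80.4$ GeV and a fixed number $f = 4$ of active flavours over the whole range (no flavour thresholds) are simplifying assumptions made here, so the resulting numbers are only indicative.

```python
import math

ALPHA_S_MU = 0.43                 # alpha_s(mu = 1 GeV), value quoted in the text
MU = 1.0                          # GeV
M_W = 80.4                        # GeV (assumed input, not given in the text)
N_F = 4                           # assumed fixed number of active flavours
GAMMA0 = {"+": 4.0, "-": -8.0}    # leading-order anomalous dimensions from (25)


def beta0(n_f):
    """Leading coefficient of the QCD beta function, as in (29)."""
    return (33.0 - 2.0 * n_f) / 3.0


def c_fixed_order(sign):
    """C_pm from (25): 1 + alpha_s/(4 pi) * gamma0 * ln(mu / M_W)."""
    return 1.0 + ALPHA_S_MU / (4.0 * math.pi) * GAMMA0[sign] * math.log(MU / M_W)


def c_resummed(sign):
    """C_pm from the second form of (30), the leading-log resummation."""
    b0 = beta0(N_F)
    denom = 1.0 + b0 * ALPHA_S_MU / (4.0 * math.pi) * math.log(M_W**2 / MU**2)
    return denom ** (-GAMMA0[sign] / (2.0 * b0))


if __name__ == "__main__":
    for sign in ("+", "-"):
        print(f"C_{sign}: fixed order = {c_fixed_order(sign):.3f}, "
              f"resummed = {c_resummed(sign):.3f}")
    print(f"C_-/C_+ (resummed) = {c_resummed('-') / c_resummed('+'):.2f}")
```

The fixed-order values reproduce the corrections of about -60% and +120% quoted above, while the resummed coefficients stay of order one, with $C_+$ suppressed and $C_-$ enhanced.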
Figure 10: $K^0 \to \pi^+\pi^-$ and $K^+ \to \pi^+\pi^0$.

3.3 $\Delta I = 1/2$ Rule
At this point, and before continuing with the construction of the complete $\Delta S = 1$ Hamiltonian, it is interesting to discuss a first application of the results we have derived so far. Let us consider the weak decays into two pions of a neutral kaon, $K_S \to \pi^+\pi^-$, and a charged kaon, $K^+ \to \pi^+\pi^0$, which are sketched in Fig. 10. The two cases look very much the same, except that the spectator quark is a $u$ quark for the charged kaon and a $d$ quark for the neutral one. Naively one would therefore expect very similar decay rates. The experimental facts are, however, strikingly different:

$$\frac{\Gamma(K_S \to \pi^+\pi^-)}{\Gamma(K^+ \to \pi^+\pi^0)} \approx 450 \approx (21.2)^2 \qquad (31)$$
To get a hint as to where this huge difference in the decay rates may come from we have to analyze the isospin structure of the decays. A kaon state has isospin $I = 1/2$. Taking into account Bose symmetry, one finds that two pions from the decay of a K meson can only be in a state of isospin 0 or 2. More specifically, $|\pi^+\pi^-\rangle$ has both $I = 0$ and $I = 2$ components, while $|\pi^+\pi^0\rangle$ is a pure $I = 2$ state. The change in isospin is then as follows:

$$K^+ \to \pi^+\pi^0: \quad \Delta I = 3/2 \qquad (32)$$

$$K^0 \to \pi^+\pi^-: \quad \Delta I = 1/2,\ 3/2 \qquad (33)$$
In particular, $K^+ \to \pi^+\pi^0$ is a pure $\Delta I = 3/2$ transition. The large ratio in (31) means that $\Delta I = 1/2$ transitions are strongly enhanced. This empirical feature is referred to as the $\Delta I = 1/2$ rule. We next take a closer look at the isospin properties of the effective Hamiltonian. Using the Fierz identities of the Dirac matrices the operators $Q_\pm$ can be rewritten as

$$Q_\pm = (\bar s_i u_i)_{V-A}(\bar u_j d_j)_{V-A} \pm (\bar s_i u_j)_{V-A}(\bar u_j d_i)_{V-A} = (\bar s_i u_i)_{V-A}(\bar u_j d_j)_{V-A} \pm (\bar s_i d_i)_{V-A}(\bar u_j u_j)_{V-A} \qquad (34)$$

where now all quark bilinears appear uniformly as colour singlets. Retaining only the flavour structure, but dropping colour and Dirac labels for ease of notation, the Hamiltonian (23) has the form

$$\begin{aligned} \mathcal{H}_{eff} \sim{} & \left[\frac{\alpha_s(M_W)}{\alpha_s(\mu)}\right]^{6/25} \left((\bar s u)(\bar u d) + (\bar s d)(\bar u u)\right) \\ {}+{} & \left[\frac{\alpha_s(M_W)}{\alpha_s(\mu)}\right]^{-12/25} \left((\bar s u)(\bar u d) - (\bar s d)(\bar u u)\right) \end{aligned} \qquad (35)$$
We can now see that the operator $Q_-$, in the second line of (35), is a pure $\Delta I = 1/2$ operator: $u$ and $d$ appear in the combination

$$ud - du = |\!\uparrow\downarrow\rangle - |\!\downarrow\uparrow\rangle \qquad (36)$$

which has isospin 0. The strange quark is also an isospin singlet. The isospin of $Q_-$ is therefore determined by the factor $\bar u$, which has isospin 1/2. From the Wilson coefficients we have calculated we observe that the contribution from $Q_-$ receives a relative enhancement over $Q_+$ in (35) by a factor

$$\left[\frac{\alpha_s(\mu)}{\alpha_s(M_W)}\right]^{18/25} = \begin{cases} 2.6 & \mu = 1\ \mathrm{GeV} \\ 3.4 & \mu = 0.6\ \mathrm{GeV} \end{cases} \qquad (37)$$
Qualitatively, this is precisely what we need: $Q_-$, which is purely $\Delta I = 1/2$ and can thus only contribute to $K^0 \to \pi^+\pi^-$, but not to $K^+ \to \pi^+\pi^0$, is reinforced by the short-distance QCD dynamics. Quantitatively, however, the RG improved QCD effect still falls short of explaining the amplitude ratio 21.2 in (31) by a sizable factor. We might be tempted to decrease $\mu$, which enhances the effect, but we are not allowed to go much below $\mu = 1$ GeV, where perturbation theory would cease to be valid. The remaining enhancement has to come from nonperturbative contributions in the matrix elements. Nevertheless it is interesting to see how already the short-distance QCD corrections provide the first step towards a dynamical explanation of the $\Delta I = 1/2$ rule.

3.4 $\Delta S = 1$ Effective Hamiltonian

In this section we will complete the discussion of the $\Delta S = 1$ effective Hamiltonian. So far we have considered the operators

$$Q_1 = (\bar s_i u_j)_{V-A}(\bar u_j d_i)_{V-A} \qquad (38)$$

$$Q_2 = (\bar s_i u_i)_{V-A}(\bar u_j d_j)_{V-A} \qquad (39)$$
which come from the simple $W$-exchange graph and the corresponding QCD corrections (Fig. 11). In addition, there is a further type of diagram at $O(\alpha_s)$, which we have omitted until now: the QCD-penguin diagram shown in Fig. 12. It gives rise to four new operators

$$Q_3 = (\bar s_i d_i)_{V-A} \sum_q (\bar q_j q_j)_{V-A} \qquad (40)$$

$$Q_4 = (\bar s_i d_j)_{V-A} \sum_q (\bar q_j q_i)_{V-A} \qquad (41)$$

$$Q_5 = (\bar s_i d_i)_{V-A} \sum_q (\bar q_j q_j)_{V+A} \qquad (42)$$

$$Q_6 = (\bar s_i d_j)_{V-A} \sum_q (\bar q_j q_i)_{V+A} \qquad (43)$$

Figure 11: QCD correction to W exchange.

Figure 12: QCD-penguin diagram.

Two structures appear when the light-quark current $(\bar q q)_V$ from the bottom end of the diagram is split into $V - A$ and $V + A$ parts. In turn, each of those comes in two colour forms in a way similar to $Q_1$ and $Q_2$. The operators $Q_1, \ldots, Q_6$ mix under renormalization, that is the RGE for their Wilson coefficients is governed by a matrix of anomalous dimensions,
generalizing (27). In this way the RG evolution of $C_{1,2}$ affects the evolution of $C_3, \ldots, C_6$. On the other hand $C_{1,2}$ remain unchanged in the presence of the penguin operators $Q_3, \ldots, Q_6$, so that the results for $C_{1,2}$ derived above are still valid. For some applications (e.g. $\varepsilon'/\varepsilon$) higher order electroweak effects need to be taken into account. They arise from $\gamma$- or $Z$-penguin diagrams (Fig. 13) and also from $W$-box diagrams. Four additional operators arise from this source. They have a form similar to the QCD penguins, but a different isospin structure, and read ($e_q$ are the quark charges, $e_{u,c} = +2/3$, $e_{d,s,b} = -1/3$)

$$Q_7 = \frac{3}{2} (\bar s_i d_i)_{V-A} \sum_q e_q (\bar q_j q_j)_{V+A} \qquad (44)$$

$$Q_8 = \frac{3}{2} (\bar s_i d_j)_{V-A} \sum_q e_q (\bar q_j q_i)_{V+A} \qquad (45)$$

$$Q_9 = \frac{3}{2} (\bar s_i d_i)_{V-A} \sum_q e_q (\bar q_j q_j)_{V-A} \qquad (46)$$

$$Q_{10} = \frac{3}{2} (\bar s_i d_j)_{V-A} \sum_q e_q (\bar q_j q_i)_{V-A} \qquad (47)$$

Figure 13: Electroweak penguin.
The construction of the effective Hamiltonian follows the principles we have discussed in the previous sections. First the Wilson coefficients $C_i(\mu_W)$, $i = 1, \ldots, 10$, are determined at a large scale $\mu_W = O(M_W, m_t)$ to a given order in perturbation theory. In this step both the $W$ boson and the heavy top quark are integrated out. Since the renormalization scale is chosen to be $\mu_W = O(M_W, m_t)$, no large logarithms appear and straightforward perturbation theory can be used for the matching calculation. The anomalous dimensions are computed from the divergent parts of the operator matrix elements, which correspond to the UV-renormalization of the Wilson coefficients. Solving the RGE, the $C_i$ are evolved from $\mu_W$ to a scale $\mu_b = O(m_b)$ in a theory with $f = 5$ active flavours $q = u, d, s, c, b$. At this point the $b$ quark (which can appear in loops) is integrated out by calculating the matching conditions from a five-flavour to a four-flavour theory, where only $q = u, d, s, c$ are active. This procedure is repeated by integrating out charm at $\mu_c = O(m_c)$ and matching onto an $f = 3$ flavour theory. One finally obtains the coefficients $C_i(\mu)$ at a scale $\mu < \mu_c$, describing an effective theory where only $q = u, d, s$ (and gluons of course) are active degrees of freedom. The terms taken into account in the RG improved perturbative evaluation of $C_i(\mu)$ are, schematically:

$$\mathrm{LO:} \quad \left(\alpha_s \ln\frac{M_W}{\mu}\right)^n, \quad \alpha \ln\frac{M_W}{\mu} \left(\alpha_s \ln\frac{M_W}{\mu}\right)^n$$

$$\mathrm{NLO:} \quad \alpha_s \left(\alpha_s \ln\frac{M_W}{\mu}\right)^n, \quad \alpha \left(\alpha_s \ln\frac{M_W}{\mu}\right)^n$$

at leading and next-to-leading order, respectively. Here $\alpha$ is the QED coupling, referring to the electroweak corrections. The final result for the $\Delta S = 1$ effective Hamiltonian (with 3 active flavours) can be written as

$$\mathcal{H}_{eff}^{\Delta S=1} = \frac{G_F}{\sqrt{2}} \lambda_u \sum_{i=1}^{10} \left(z_i(\mu) - \frac{\lambda_t}{\lambda_u} y_i(\mu)\right) Q_i + \mathrm{h.c.} \qquad (48)$$
where $\lambda_p = V_{ps}^* V_{pd}$. In principle there are three different CKM factors, $\lambda_u$, $\lambda_c$ and $\lambda_t$, corresponding to the different flavours of up-type quarks that can participate in the charged-current weak interaction. Using CKM unitarity, one of them can be eliminated. If we eliminate $\lambda_c$, we arrive at the CKM structure of (48). The Hamiltonian in (48) is the basis for computing nonleptonic kaon decays within the standard model, in particular for the analysis of direct CP violation. When new physics is present at some higher energy scale, the effective Hamiltonian can be derived in an analogous way. The matching calculation at the high scale $\mu_W$ will give new contributions to the coefficients $C_i(\mu_W)$, the initial conditions for the RG evolution. In general, new operators may also be induced. The Wilson coefficients $z_i$ and $y_i$ are known in the standard model at NLO. A more detailed account of $\mathcal{H}_{eff}^{\Delta S=1}$ and information on the technical aspects of the necessary calculations can be found in refs. 12 and 13.

3.5 Chiral Perturbation Theory

An additional tool for kaon physics, complementary to the OPE-based effective Hamiltonian formalism, is chiral perturbation theory (χPT). The present section gives a brief and elementary introduction to this subject. For other, more detailed discussions we refer the reader to refs. 6, 16, 17, 18.
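Preliminaries

The QCD Lagrangian for three light flavours $q = (u, d, s)^T$ can be written in terms of left-handed and right-handed fields, $q_{L,R} = (1 \mp \gamma_5)q/2$, in the form

$$\mathcal{L}_{QCD} = \bar q_L\, i\!\not\!\!D\, q_L + \bar q_R\, i\!\not\!\!D\, q_R - \bar q_L M q_R - \bar q_R M^\dagger q_L \qquad (49)$$

where $M = \mathrm{diag}(m_u, m_d, m_s)$. If $M$ is put to zero, $\mathcal{L}_{QCD}$ is invariant under a global $SU(3)_L \otimes SU(3)_R$ transformation

$$q_L \to L\, q_L \qquad (50)$$

$$q_R \to R\, q_R \qquad (51)$$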
with $L$ and $R$ (independent) SU(3) transformations. The explicit breaking of this chiral symmetry through a nonzero $M$ is a small effect and can be treated as a perturbation. At the same time, chiral symmetry is not reflected in the hadronic spectrum, so it must also be spontaneously broken by the dynamics of QCD. For instance the octet of light pseudoscalar mesons

$$\Phi = T^a \pi^a = \begin{pmatrix} \frac{\pi^0}{\sqrt{2}} + \frac{\eta}{\sqrt{6}} & \pi^+ & K^+ \\ \pi^- & -\frac{\pi^0}{\sqrt{2}} + \frac{\eta}{\sqrt{6}} & K^0 \\ K^- & \bar K^0 & -\frac{2\eta}{\sqrt{6}} \end{pmatrix} \qquad (52)$$

is not accompanied in the spectrum of hadrons by a similar octet of mesons with opposite parity and comparable mass. On the other hand the octet $\Phi$, comprising the lightest existing hadrons, is the natural candidate for the octet of Goldstone bosons expected from the pattern of spontaneous chiral symmetry breaking

$$SU(3)_L \otimes SU(3)_R \to SU(3)_V \qquad (53)$$

down to the group of ordinary flavour SU(3), where $q_{L,R} \to V q_{L,R}$. The mesons in $\Phi$ are not strictly massless due to the explicit breaking of chiral symmetry caused by $M$, and are thus often referred to as pseudo-Goldstone bosons. Still they are the lightest hadrons and they are separated by a mass gap from the higher excitations of the light-hadron spectrum. (The masses of the latter remain of order $\Lambda_{QCD}$, while the masses of $\Phi$ vanish in the limit $M \to 0$.) The idea of χPT is to write a low-energy effective theory where the only dynamical degrees of freedom are the eight pseudo-Goldstone bosons. This is appropriate for low-energy interactions where the higher states are not kinematically accessible. Their virtual presence will however be contained in the coupling constants of χPT. The guiding principles for the construction of χPT
are the chiral symmetry of QCD and an expansion in powers of momenta and quark masses. By constructing the most general Lagrangian for $\Phi$ compatible with the symmetries of QCD, the framework is model independent. By restricting the accuracy to a given order in the momentum expansion, only a finite number of terms are possible and the framework also becomes predictive. A finite number of couplings needs to be fixed from experiment; once this is done, predictions can be made. χPT is a nonperturbative approach as it does not rely on any expansion in the QCD coupling $\alpha_s$. Both χPT and the quark-level effective Hamiltonian $\mathcal{H}_{eff}$ are low-energy effective theories applicable to kaon decays. What is the difference and how are these two approaches related? The essential, and obvious, difference is that χPT is formulated directly in terms of hadrons, $\mathcal{H}_{eff}$ in terms of quarks and gluons. The advantage of χPT is therefore the direct applicability to physical, hadronic amplitudes, without the need to deal with the complicated hadronic matrix elements of quark-level operators. The advantage of $\mathcal{H}_{eff}$, on the other hand, is the direct link to short-distance physics, which is encoded in the Wilson coefficients. This type of information is important in the context of CP violation or in the search for new physics. In χPT such information is hidden in the coupling constants, which are not readily calculable and need to be fixed experimentally. From these considerations it is clear that $\mathcal{H}_{eff}$ is more useful for applications where short-distance physics is essential (CP violation, $\varepsilon'/\varepsilon$), whereas χPT is especially suited to deal with long-distance dominated quantities, which are hard to come by otherwise. To relate the two descriptions directly is not an easy task and has so far not been accomplished.
A calculation of the couplings of χPT from the quark picture, establishing a link between $\mathcal{H}_{eff}$ and χPT, requires one to solve QCD nonperturbatively, which is not possible at present.
SU(3) Transformations

Before describing the explicit construction of χPT it will be useful to recall a few important properties of SU(3) transformations and to introduce some convenient notation. We define

$$(q^1, q^2, q^3) = (u, d, s), \qquad (q_1, q_2, q_3) = (\bar u, \bar d, \bar s) \qquad (54)$$

and denote by $U^i{}_j$ the components of a generic SU(3) matrix $U$. By definition, changing upper into lower indices, and vice versa, corresponds to complex conjugation, thus

$$U_i{}^j = (U^i{}_j)^* = (U^\dagger)^j{}_i \qquad (55)$$

The unitarity of $U$ implies

$$U^k{}_i\, U_k{}^j = (U^\dagger)^j{}_k\, U^k{}_i = \delta_i^j \qquad (56)$$

$$U^i{}_k\, U_j{}^k = U^i{}_k\, (U^\dagger)^k{}_j = \delta^i_j \qquad (57)$$

Also,

$$\det U = \varepsilon^{ijk}\, U^1{}_i\, U^2{}_j\, U^3{}_k = 1 \qquad (58)$$

The fundamental SU(3) triplet $q^i$ and anti-triplet $q_i$ transform, respectively, as

$$q^i \to U^i{}_j\, q^j \qquad (59)$$

$$q_i \to U_i{}^j\, q_j \qquad (60)$$

It follows from the above that the singlet $q^k q_k$ as well as the Kronecker symbol $\delta^i_j$ and the totally antisymmetric tensor $\varepsilon^{ijk}$ are invariant under SU(3) transformations. Higher dimensional representations can also be built. For example, the traceless tensor

$$S^i{}_j = q^i q_j - \frac{1}{3}\delta^i_j\, q^k q_k \qquad (61)$$

is an irreducible representation of SU(3). Its eight components constitute an SU(3) octet, which transforms as

$$S^i{}_j \to U^i{}_k\, U_j{}^l\, S^k{}_l \qquad (62)$$

We next define the objects

$$r^i = \varepsilon^{ijk} q_j q_k \qquad (63)$$

$$r_i = \varepsilon_{ijk} q^j q^k \qquad (64)$$
They transform in the same way as the fundamental triplet and anti-triplet in (59) and (60), respectively. We show this for (63). From (60) and using (55) and (57), we have

$$r^i = \varepsilon^{ijk} q_j q_k \to \varepsilon^{ijk}\, U_j{}^l U_k{}^m\, q_l q_m = U^i{}_s\, \varepsilon^{njk}\, U_n{}^s U_j{}^l U_k{}^m\, q_l q_m = U^i{}_s\, \varepsilon^{slm} (\det U^\dagger)\, q_l q_m = U^i{}_s\, r^s \qquad (65)$$

which proves our assertion. Let us consider two simple applications of this formalism.

a) The meson field $\Phi$ in (52) corresponds to the quark-level tensor $S^i{}_j$ and both transform as octets under SU(3). The connection can be seen by writing out the components of $S^i{}_j$ and comparing with (52). One recovers the quark flavour composition of the meson states. For example

$$\Phi^1{}_2 = \pi^+ = S^1{}_2 = q^1 q_2 = u\bar d \qquad (66)$$

$$\Phi^3{}_3 = -\frac{2}{\sqrt{6}}\eta = S^3{}_3 = q^3 q_3 - \frac{1}{3} q^k q_k = -\frac{1}{3}(u\bar u + d\bar d - 2 s\bar s) \qquad (67)$$

The transformation law for $\Phi^i{}_j$ is the same as for $S^i{}_j$ in (62) or, in matrix notation,

$$\Phi \to U \Phi U^\dagger \qquad (68)$$

b) In section 3.3 we have seen from the discussion of the $\Delta I = 1/2$ rule that the largely dominant part of the weak Hamiltonian is contributed by the pure $\Delta I = 1/2$ operator $Q_-$. We will now show the important property that $Q_-$ transforms as the component of an octet under $SU(3)_L$. To see this we first note that the operator

$$Q^i{}_j = r^i r_j - \frac{1}{3}\delta^i_j\, r^k r_k \qquad (69)$$

is an SU(3) octet. This follows because the $r^i$ in (63), (64) transform as the $q^i$, and $S^i{}_j$ in (61) is an octet. We next show that the (2,3) component of $Q^i{}_j$ indeed has the flavour structure of $Q_-$:

$$Q^2{}_3 = r^2 r_3 = (q_3 q_1 - q_1 q_3)(q^1 q^2 - q^2 q^1) = (\bar s u)(\bar u d) - (\bar s d)(\bar u u) - (\bar u u)(\bar s d) + (\bar u d)(\bar s u) = 2 Q_- \qquad (70)$$

Since the quark fields in $Q_-$ are all left-handed, we see that $Q_-$ transforms as the (2,3) component of an octet $Q^i{}_j$ under $SU(3)_L$. Trivially, it is also a singlet under $SU(3)_R$. Hence, $Q_-$ transforms as a component of an $(8_L, 1_R)$ under $SU(3)_L \otimes SU(3)_R$. The transformation law for $Q^i{}_j$ in matrix notation is, with $L \in SU(3)_L$,

$$Q \to L\, Q\, L^\dagger \qquad (71)$$

Including the hermitian conjugate we may write

$$2(Q_- + Q_-^\dagger) = Q^2{}_3 + Q^3{}_2 = \mathrm{tr}\, \lambda_6 Q \qquad (72)$$

where the trace with the Gell-Mann matrix $\lambda_6$ is used to project out the proper components ($\lambda_6$ is the matrix with entry 1 at positions (2,3) and (3,2), and 0 otherwise). It is not hard to see that the penguin operators $Q_3, \ldots, Q_6$ also transform as part of an $(8_L, 1_R)$, in the same way as $Q_-$.
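The transformation property derived in (65) can be checked numerically. The sketch below is only an illustration under stated assumptions: it builds one particular SU(3) matrix from three embedded SU(2) rotations and a diagonal phase matrix (the angles are arbitrary choices made here), represents the two anti-triplet factors in (63) by two independent complex vectors `a` and `b` (hypothetical stand-ins for the quark fields, which commute as plain numbers), and verifies that $\varepsilon^{ijk} a_j b_k$ transforms with $U$ itself when `a` and `b` transform with the complex conjugate matrix as in (55) and (60).

```python
import cmath
import math


def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def su2_block(i, j, theta, phi):
    """Unitary, unit-determinant rotation acting in the (i, j) plane."""
    u = [[(1.0 + 0j) if r == c else 0j for c in range(3)] for r in range(3)]
    u[i][i] = u[j][j] = complex(math.cos(theta))
    u[i][j] = cmath.exp(1j * phi) * math.sin(theta)
    u[j][i] = -cmath.exp(-1j * phi) * math.sin(theta)
    return u


def example_su3():
    """One SU(3) matrix; the angles are arbitrary illustrative choices."""
    u = matmul(su2_block(0, 1, 0.7, 0.3), su2_block(1, 2, 1.1, -0.8))
    u = matmul(u, su2_block(0, 2, 0.4, 2.0))
    phases = [[0j] * 3 for _ in range(3)]
    for n, angle in enumerate((0.5, -1.3, 0.8)):  # angles sum to zero, so det = 1
        phases[n][n] = cmath.exp(1j * angle)
    return matmul(u, phases)


def transform_triplet(u, q):
    """q^i -> U^i_j q^j, as in (59)."""
    return [sum(u[i][j] * q[j] for j in range(3)) for i in range(3)]


def transform_antitriplet(u, q):
    """q_i -> U_i^j q_j with U_i^j = (U^i_j)^*, as in (55) and (60)."""
    return [sum(u[i][j].conjugate() * q[j] for j in range(3)) for i in range(3)]


def r_upper(a, b):
    """r^i = eps^{ijk} a_j b_k, the structure of (63)."""
    return [sum(eps(i, j, k) * a[j] * b[k] for j in range(3) for k in range(3))
            for i in range(3)]


def max_deviation(u, a, b):
    """Largest component of r^i(transformed a, b) - U^i_s r^s(a, b)."""
    lhs = r_upper(transform_antitriplet(u, a), transform_antitriplet(u, b))
    rhs = transform_triplet(u, r_upper(a, b))
    return max(abs(x - y) for x, y in zip(lhs, rhs))


if __name__ == "__main__":
    a = [1 + 2j, -0.5j, 0.3]
    b = [0.2, 1 - 1j, -2 + 0.5j]
    print("max deviation:", max_deviation(example_su3(), a, b))
```

The deviation vanishes up to rounding errors only because $\det U = 1$; for a general unitary matrix an extra phase $\det U^\dagger$ would survive, exactly as in the intermediate step of (65).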
Chiral Lagrangian

We will now construct explicitly the leading terms of the chiral Lagrangian. Since we have to write down the most general form for this Lagrangian to any given order in the momentum expansion, the specific manner in which chiral symmetry is realized does not matter. The most convenient and standard choice is a nonlinear realization where one introduces the unitary matrix

$$\Sigma = \exp\left(\frac{2i}{f}\Phi\right) \qquad (73)$$

as the basic meson field. Here $f$ is the generic decay constant for the light pseudoscalars (we have used a normalization in which $f_\pi = 131$ MeV). The field $\Sigma$ is taken to transform under $SU(3)_L \otimes SU(3)_R$ as

$$\Sigma \to L\, \Sigma\, R^\dagger \qquad (74)$$

with $L \in SU(3)_L$ and $R \in SU(3)_R$. In general, (74) implies a complicated, nonlinear transformation law for the field $\Phi$. However, for the special case of an ordinary SU(3) transformation, where $L = R = U$, (74) becomes equivalent to (68). We thus recover the correct transformation for the octet $\Phi$ under ordinary SU(3). We can also see that the vacuum state, which corresponds to $\Phi \to 0$, hence $\Sigma \to 1$, is invariant under ordinary SU(3) ($1 \to U 1 U^\dagger = 1$) as it must be. On the other hand, the vacuum is not invariant under the chiral transformation in (74): $1 \to L R^\dagger \neq 1$. This corresponds to the spontaneous breaking of chiral symmetry. The field (73) with the transformation (74) therefore has the desired properties to describe the pseudo-Goldstone bosons. The chiral Lagrangian is constructed as a series in powers of momenta, or equivalently numbers of derivatives

$$\mathcal{L}^{QCD} = \mathcal{L}^{QCD}_2 + \mathcal{L}^{QCD}_4 + \ldots \qquad (75)$$

$$\mathcal{L}^{\Delta S=1} = \mathcal{L}^{\Delta S=1}_2 + \mathcal{L}^{\Delta S=1}_4 + \ldots \qquad (76)$$

where $\mathcal{L}^{QCD}$ describes strong interactions, $\mathcal{L}^{\Delta S=1}$ describes $\Delta S = 1$ weak interactions, and the subscripts on the right-hand side indicate the number of derivatives. The lowest order strong interaction Lagrangian has the form

$$\mathcal{L}^{QCD}_2 = \frac{f^2}{8}\, \mathrm{tr}\left[D_\mu\Sigma\, D^\mu\Sigma^\dagger + 2B_0\left(M\Sigma^\dagger + \Sigma M^\dagger\right)\right] \qquad (77)$$

We have written $D_\mu$ for the derivative, which we later will generalize to a covariant derivative to include electromagnetism. For the moment we may consider $D_\mu$ as an ordinary derivative. The terms in (77) have to be built to respect the symmetries of the QCD Lagrangian in (49). For $M = 0$, (49) is chirally invariant and we are thus looking for invariants constructed from $\Sigma$ in (74). Only trivial terms are possible with zero derivatives, for example $\mathrm{tr}(\Sigma\Sigma^\dagger) = \mathrm{const}$. The leading term comes with two derivatives, as anticipated. The only possible form is $\mathrm{tr}(D_\mu\Sigma\, D^\mu\Sigma^\dagger)$. Here $D_\mu\Sigma\, D^\mu\Sigma^\dagger$ transforms as $(8_L, 1_R)$

$$D_\mu\Sigma\, D^\mu\Sigma^\dagger \to L\, D_\mu\Sigma\, D^\mu\Sigma^\dagger\, L^\dagger \qquad (78)$$
and taking the trace gives an invariant. Another possibility would seem to be $\mathrm{tr}(D^2\Sigma\, \Sigma^\dagger)$, but this term differs from $\mathrm{tr}(D_\mu\Sigma\, D^\mu\Sigma^\dagger)$ only by a total derivative. The second term in (77) breaks chiral symmetry. Its form can be found by noting that the symmetry breaking term proportional to $M$ in (49) would be invariant if $M$ was interpreted as an auxiliary field transforming as $M \to L M R^\dagger$. To first order in $M$ and to lowest order in derivatives this leads to the mass term in (77). For $M \to L M R^\dagger$ it would be invariant. For $M$ fixed to the diagonal mass matrix it breaks chiral symmetry in the appropriate way. We will soon find that this term indeed counts as two powers of momentum. The second order Lagrangian $\mathcal{L}^{QCD}_2$ is then complete. The factor in front of the first term in (77) is fixed by the requirement that the kinetic term for the mesons be normalized in the canonical way. There are no additional parameters for this contribution. The second term in (77) comes with a coupling $B_0$. This new parameter is related to the meson masses. To see this more clearly, we can extract the kinetic terms from (77) by expanding to second order in the field $\Phi$. If we keep, for example, only the contributions with neutral kaons $K$, $\bar K$, we find (up to an irrelevant additive constant)

$$\mathcal{L}^{QCD}_{2,K^0\,kin} = \partial_\mu \bar K\, \partial^\mu K - B_0 (m_s + m_d)\, \bar K K \qquad (79)$$

From (79) and similar relations for the other mesons we obtain expressions for the pseudo-Goldstone boson masses in terms of the quark masses and the parameter $B_0 = O(\Lambda_{QCD})$:

$$m_{K^0}^2 = B_0 (m_s + m_d), \qquad m_{K^+}^2 = B_0 (m_s + m_u), \qquad m_{\pi^+}^2 = B_0 (m_u + m_d) \qquad (80)$$

The meson masses squared are proportional to linear combinations of quark masses. This also clarifies why one factor of $M$ is equivalent to two powers of momenta in the usual chiral counting (see (77)). Expanding (77) beyond second order in $\Phi$ we obtain terms describing strong interactions among the mesons, such as $\pi$-$\pi$ scattering.
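Since $B_0$ drops out of ratios, the relations (80) can be inverted to estimate quark mass ratios from measured meson masses. The following Python sketch does this under explicit assumptions: the meson masses are approximate physical values inserted here (they are not quoted in the text), and electromagnetic contributions to the mass differences are ignored, so the resulting $m_u/m_d$ in particular is only a crude lowest-order estimate.

```python
# Approximate physical meson masses in GeV (assumed inputs, not from the text).
M_PI_PLUS = 0.13957
M_K_PLUS = 0.49368
M_K_ZERO = 0.49761


def b0_times_quark_masses(m_pi, m_kp, m_k0):
    """Invert (80): return (B0*m_u, B0*m_d, B0*m_s) from the meson masses."""
    b0_mu = 0.5 * (m_pi**2 + m_kp**2 - m_k0**2)
    b0_md = 0.5 * (m_pi**2 - m_kp**2 + m_k0**2)
    b0_ms = 0.5 * (m_kp**2 + m_k0**2 - m_pi**2)
    return b0_mu, b0_md, b0_ms


def quark_mass_ratios(m_pi, m_kp, m_k0):
    """Return (m_u/m_d, m_s/m_d); the unknown B0 cancels in the ratios."""
    b0_mu, b0_md, b0_ms = b0_times_quark_masses(m_pi, m_kp, m_k0)
    return b0_mu / b0_md, b0_ms / b0_md


if __name__ == "__main__":
    mu_over_md, ms_over_md = quark_mass_ratios(M_PI_PLUS, M_K_PLUS, M_K_ZERO)
    print(f"m_u/m_d = {mu_over_md:.2f}, m_s/m_d = {ms_over_md:.1f}")
```

The strange quark comes out roughly twenty times heavier than the down quark, which illustrates why the kaons are so much heavier than the pions even though both are pseudo-Goldstone bosons.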
172
We next need to determine the form of Cfs=1. As we have seen in sec. 3.3, the dominant contribution to the weak Hamiltonian comes from the component of an (8L, 1.R) operator as shown in (72). Since we know empirically from the AI = 1/2 rule that the enhancement of this piece of the weak Hamiltonian is quite strong, we shall here make the additional approximation to keep only this contribution and drop the rest (related to the operator Q+). This is a reasonable approximation for many applications. With the results we have derived so far, it is then easy to write down the correct form for C^s=1. The structure with two derivatives and the correct transformation properties as an (8i, li?) is given in (78). According to (72) we simply need to take the trace with A6 to obtain the right components. Factoring out basic weak interaction parameters we can write AAS=1 = ^\V:sVud\g8^tr\6D^D^
(81)
This Lagrangian introduces one additional parameter, the octet coupling g$. Eq. (81) already contains the usual hermitian conjugate part (see (72)). We have neglected the small CP violating effects and factored out the common leading CKM term V*sVud = VusV*d = \V:sVud\. Ks ->
TT+TT-
from Cfs=1
In order to make predictions, we first need tofixthe constant g$. We can use the dominant nonleptonic decay Ks —>• ir+n~ for this purpose. Expanding the interaction term in (81) to third order in $, and keeping only K, K, ir+ and 7T~, we find tr AedSSEt = - ^ [Tr+dKdir- - dKn-dn+
+ (K - K)dir+dw-]
(82)
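Neglecting CP violation we have $K_1 = K_S$, $K_2 = K_L$, where $CP\,K_{1,2} = \pm K_{1,2}$, and (with $CP\,K = -\bar K$)

$$K_{1,2} = \frac{K \mp \bar K}{\sqrt 2} \qquad (83)$$

Expressing $K$, $\bar K$ in terms of $K_{1,2}$ we obtain for the square brackets in (82)

$$[\ldots] = \frac{1}{\sqrt2}\,\pi^+\partial\pi^-(\partial K_2-\partial K_1) - \frac{1}{\sqrt2}\,\pi^-\partial\pi^+(\partial K_2+\partial K_1) + \sqrt2\, K_1\,\partial\pi^+\partial\pi^- \qquad (84)$$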
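From (84) the Feynman amplitudes for $K_{1,2} \to \pi^+\pi^-$ can be read off. Denoting the momenta of the kaon, $\pi^+$, $\pi^-$ by $k$, $p_1$, $p_2$, respectively, we get for $K_1$

$$[\ldots]_{K_1} \to -\frac{1}{\sqrt2}\left(2\,p_1\cdot p_2 + k\cdot(p_1+p_2)\right) = -\sqrt2\,(m_K^2 - m_\pi^2) \qquad (85)$$

One may check that the corresponding amplitude for $K_2\to\pi^+\pi^-$ gives zero, as required by CP symmetry. From (85) and (81) we obtain the decay amplitude

$$A(K_1\to\pi^+\pi^-) = i\,G_F\, g_8 |\lambda_u| f\,(m_K^2 - m_\pi^2) \qquad (86)$$

This gives the branching ratio

$$B(K_S\to\pi^+\pi^-) = \tau_{K_S}\,\frac{\sqrt{m_K^2-4m_\pi^2}}{16\pi m_K^2}\; G_F^2\, g_8^2\, |\lambda_u|^2 f^2\,(m_K^2-m_\pi^2)^2 \qquad (87)$$

Using

$$\tau_{K_S} = 1.3573\cdot 10^{14}\ \mathrm{GeV}^{-1}, \qquad B(K_S\to\pi^+\pi^-) = 0.6861 \qquad (88)$$

we find

$$g_8 = 5.2 \qquad (89)$$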
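The extraction of $g_8$ can be retraced numerically. The following is a minimal sketch, assuming the two-body rate $\Gamma = |A|^2\sqrt{m_K^2-4m_\pi^2}/(16\pi m_K^2)$ with the amplitude (86). The lifetime and branching ratio are the values quoted above; the Fermi constant, the CKM factor, the decay constant normalization $f \approx 131$ MeV and the meson masses are inputs assumed for this illustration and are not stated in this section.

```python
import math

# Inputs quoted in the text.
TAU_KS = 1.3573e14       # K_S lifetime in GeV^-1
BR_KS_PIPI = 0.6861      # B(K_S -> pi+ pi-)

# Inputs NOT given in this section (assumed, illustrative values).
G_F = 1.16637e-5                               # Fermi constant in GeV^-2
LAMBDA = 0.22                                  # sine of the Cabibbo angle
LAMBDA_U = LAMBDA * (1.0 - LAMBDA**2 / 2.0)    # |V_us^* V_ud| in Wolfenstein form
F_PI = 0.131                                   # decay constant f in GeV
M_K = 0.4976                                   # neutral kaon mass in GeV
M_PI = 0.1396                                  # charged pion mass in GeV


def octet_coupling(br, tau, g_f, lam_u, f, m_k, m_pi):
    """Solve the branching-ratio formula for g_8.

    B = tau * phase_space * (g_8 * G_F * |lambda_u| * f * (m_K^2 - m_pi^2))^2
    """
    phase_space = math.sqrt(m_k**2 - 4.0 * m_pi**2) / (16.0 * math.pi * m_k**2)
    amp_sq_per_g8_sq = (g_f * lam_u * f * (m_k**2 - m_pi**2)) ** 2
    return math.sqrt(br / (tau * phase_space * amp_sq_per_g8_sq))


g8 = octet_coupling(BR_KS_PIPI, TAU_KS, G_F, LAMBDA_U, F_PI, M_K, M_PI)
print(f"g_8 = {g8:.2f}")
```

With these assumed inputs the result lands close to the value in (89); small shifts in $|\lambda_u|$ or $f$ move it at the percent level.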
We have thus determined $g_8$ to lowest order in $\chi$PT. Using this result, we can make predictions. For instance, expanding (81) to fourth order in $\Phi$, we can derive amplitudes for the decays $K \to 3\pi$. In this manner $\chi$PT relates processes with different numbers of soft pions. Such relations, also known as the soft-pion theorems of current algebra, are nicely summarized in the framework of $\chi$PT in terms of the lowest-order chiral Lagrangians. Other important applications are radiative decays such as $K_S \to \gamma\gamma$, to which we will come back in the following paragraph. So far we have worked at tree level, which is sufficient at $O(p^2)$. At the next order, $O(p^4)$, one has to consider both tree-level contributions of the $O(p^4)$ terms in the Lagrangians (75), (76), and one-loop diagrams with interactions from the $O(p^2)$ Lagrangians. The loop diagrams are in general divergent. The divergences are absorbed by renormalizing the couplings at $O(p^4)$. An example will be described below.

Radiative K Decays

Electromagnetism and the photon field $A_\mu$ can be included in $\chi$PT in the usual way. The U(1) gauge transformation for the meson fields is

$$\Sigma' = U\Sigma U^\dagger \;\Rightarrow\; \Phi' = U\Phi U^\dagger, \qquad U = \exp(-ieQ\theta) \qquad (90)$$
Figure 14: $K_S \to \gamma\gamma$ in $\chi$PT.
with $\theta = \theta(x)$ an arbitrary real function and the electric charge matrix $Q = \mathrm{diag}(2/3, -1/3, -1/3)$. Writing out (90) for the components of $\Phi$, one finds that each meson transforms with its proper electric charge as the generator. With $A'_\mu = A_\mu - \partial_\mu\theta$, the covariant derivative that ensures electromagnetic gauge invariance is

$$D_\mu\Sigma = \partial_\mu\Sigma - ieA_\mu[Q,\Sigma] \qquad (91)$$

Using this assignment, (77) and (81) include the electromagnetic interactions of the mesons. At higher orders in the chiral Lagrangian, terms with factors of the electromagnetic field strength $F_{\mu\nu}$ also have to be included. In the chiral counting $F_{\mu\nu}$ is equivalent to two powers of momentum. We finally give a further illustration of the workings of $\chi$PT with two examples of long-distance dominated, radiative kaon decays. We first mention an important theorem for these processes. It says that the amplitudes of nonleptonic radiative kaon decays with at most one pion in the final state start only at $O(p^4)$ in $\chi$PT. This means there are no tree-level contributions at $O(p^2)$. Such terms are forbidden by gauge invariance. There can only be tree-level amplitudes from $\mathcal{L}(p^4)$, and, at the same order, loop contributions generated from $O(p^2)$ interactions. Decays that fall under this category are $K \to \gamma\gamma$, $K \to \gamma l^+l^-$, $K \to \pi\gamma\gamma$ or $K \to \pi l^+l^-$. A particularly interesting example is $K_S \to \gamma\gamma$. In this case it turns out that there is no direct coupling even at $O(p^4)$ and hence no counterterm to absorb any divergence from the loop contribution. As a consequence, the one-loop calculation (Fig. 14) is in fact finite. The only parameter involved is $g_8$, which we have already determined. The finite loop calculation then gives a unique prediction [19]. It yields

$$B(K_S \to \gamma\gamma) = 2.1\cdot 10^{-6} \qquad (92)$$

This compares well with the experimental result [20] $(2.4\pm0.9)\cdot 10^{-6}$, which has recently been improved to [21] $(2.6\pm0.4)\cdot 10^{-6}$.
Of course this situation with a finite loop result is somewhat special. A more generic case is $K^+ \to \pi^+e^+e^-$, shown in Fig. 15. Here the loop calculation is divergent and renormalized by the counterterm at $O(p^4)$. There is now one additional free parameter, which can be determined from the rate of $K^+ \to \pi^+e^+e^-$. Other observables such as the $e^+e^-$ mass spectrum or the rate and spectrum of $K^+ \to \pi^+\mu^+\mu^-$ can then be predicted. In the same manner one can also analyze the amplitude for $K_S \to \pi^0e^+e^-$, which determines the indirect CP violating contribution in $K_L \to \pi^0e^+e^-$. $K_S \to \pi^0e^+e^-$ is very similar to $K^+ \to \pi^+e^+e^-$, but the required counterterm is different. For this reason the measurement of $K^+ \to \pi^+e^+e^-$ cannot be used to obtain a prediction for $K_S \to \pi^0e^+e^-$. A separate measurement of the latter decay will therefore be needed in the future.

Figure 15: $K^+ \to \pi^+e^+e^-$ in $\chi$PT.

4 The Neutral-K System and CP Violation

4.1 Basic Formalism
Neutral K mesons can mix with their antiparticles through second order weak interactions. They form a two-state system ($K^0$-$\bar K^0$) that is described by a Hamiltonian matrix $H$ of the form

$$H = \begin{pmatrix} M - \frac{i}{2}\Gamma & M_{12} - \frac{i}{2}\Gamma_{12} \\ M_{12}^* - \frac{i}{2}\Gamma_{12}^* & M - \frac{i}{2}\Gamma \end{pmatrix} \qquad (93)$$

where CPT invariance has been assumed. The absorptive part $\Gamma_{ij}$ of $H$ accounts for the weak decay of the neutral kaon. In Fig. 16 we show the diagrams that give rise to the off-diagonal elements of $H$. Diagonalizing the Hamiltonian $H$ yields the physical eigenstates $K_{L,S}$. They are linear combinations of the strong interaction eigenstates $K$ and $\bar K$ and can be written as

$$K_L = N_{\bar\varepsilon}\left[(1+\bar\varepsilon)K + (1-\bar\varepsilon)\bar K\right] \equiv pK + q\bar K \qquad (94)$$
Figure 16: Diagrams contributing to $M_{12}$ and $\Gamma_{12}$ in the neutral kaon system.
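$$K_S = N_{\bar\varepsilon}\left[(1+\bar\varepsilon)K - (1-\bar\varepsilon)\bar K\right] \equiv pK - q\bar K \qquad (95)$$

with the normalization factor $N_{\bar\varepsilon} = 1/\sqrt{2(1+|\bar\varepsilon|^2)}$. Here $\bar\varepsilon$ is determined by

$$\frac{1-\bar\varepsilon}{1+\bar\varepsilon} = \frac{q}{p} = \frac{M_{12}^* - \frac{i}{2}\Gamma_{12}^*}{(\Delta M + \frac{i}{2}\Delta\Gamma)/2} \qquad (96)$$

where $\Delta M$ and $\Delta\Gamma$ are the differences of the eigenvalues $M_{L,S} - i\Gamma_{L,S}/2$ corresponding to the eigenstates $K_{L,S}$

$$\Delta M = M_L - M_S > 0, \qquad \Delta\Gamma = \Gamma_S - \Gamma_L > 0 \qquad (97)$$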
The labels $L$ and $S$ denote, respectively, the long-lived and the short-lived eigenstate so that $\Delta\Gamma$ is positive by definition. We employ the CP phase convention $CP\cdot K = -\bar K$. Using the SM results for $M_{12}$, $\Gamma_{12}$ and standard phase conventions for the CKM matrix (see (121)), one finds in the limit of CP conservation ($\eta = 0$) that $\bar\varepsilon = 0$. With (94), (95) it follows that $K_L$ is CP odd and $K_S$ is CP even in this limit, which is close to realistic since CP violation is a small effect. As we shall see explicitly later on, the real part of $\bar\varepsilon$ is a physical observable, while the imaginary part is not. In particular $(1-\bar\varepsilon)/(1+\bar\varepsilon)$ is a phase convention dependent, unphysical quantity. A crucial feature of the kaon system is the very large difference in decay rates between the two eigenstates, the lighter eigenstate decaying much more rapidly than the heavier one ($\Gamma_S = 579\,\Gamma_L$). The basic reason is the small number of decay channels for the neutral kaons. Decay into the predominant CP even two-pion final states $\pi^+\pi^-$, $\pi^0\pi^0$ is only available for $K_S$, but not (to first approximation) for the (almost) CP odd state $K_L$. The latter can decay into three pions, which however is kinematically strongly suppressed, leading to a much longer $K_L$ lifetime.
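The relation between the off-diagonal elements of the mixing Hamiltonian and $\bar\varepsilon$ can be explored with a small numerical sketch. It assumes the CPT-symmetric form of $H$ with equal diagonal entries, so that the eigenvalue splitting is $\pm\sqrt{H_{12}H_{21}}$ and $q/p = H_{21}/\sqrt{H_{12}H_{21}}$, with the root chosen such that $\Delta M > 0$. The numbers for $M_{12}$ and $\Gamma_{12}$ are purely illustrative (arbitrary units, with $\Gamma_{12}\approx -2M_{12}$ mimicking the kaon system and a small relative phase put in by hand); they are not SM predictions.

```python
import cmath


def mixing_parameters(m12: complex, gamma12: complex):
    """Return (q/p, eps_bar, delta_m, delta_gamma) for a CPT-symmetric 2x2 mixing matrix.

    H12 = M12 - i/2 Gamma12,  H21 = M12^* - i/2 Gamma12^*.
    The principal square root has a non-negative real part, which selects
    the branch with delta_m = M_L - M_S >= 0.
    """
    h12 = m12 - 0.5j * gamma12
    h21 = m12.conjugate() - 0.5j * gamma12.conjugate()
    half_split = cmath.sqrt(h12 * h21)        # equals (delta_m + i/2 delta_gamma) / 2
    q_over_p = h21 / half_split
    eps_bar = (1.0 - q_over_p) / (1.0 + q_over_p)
    delta_m = 2.0 * half_split.real
    delta_gamma = 4.0 * half_split.imag
    return q_over_p, eps_bar, delta_m, delta_gamma


# Illustrative inputs in arbitrary units (assumed, not taken from the text).
PHASE = 6.6e-3
M12 = cmath.exp(1j * PHASE)
GAMMA12 = -2.0 + 0.0j

q_over_p, eps_bar, delta_m, delta_gamma = mixing_parameters(M12, GAMMA12)
print("q/p         =", q_over_p)
print("eps_bar     =", eps_bar)
print("2 Re eps_bar =", 2.0 * eps_bar.real)
print("Im(G12/M12)/4 =", 0.25 * (GAMMA12 / M12).imag)
```

In this toy setup $2\,\mathrm{Re}\,\bar\varepsilon$ and $\frac14\mathrm{Im}(\Gamma_{12}/M_{12})$ agree up to terms of second order in the small phase, and setting the phase to zero gives $\bar\varepsilon = 0$.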
4.2 Classification of CP Violation
The fundamental weak interaction Lagrangian violates CP invariance via the CKM mechanism, that is through an irreducible complex phase in the quark mixing matrix. This leads to a violation of CP symmetry at the phenomenological level, in particular in the decays of mesons. For instance, processes forbidden by CP symmetry may occur, or transitions related to each other by CP conjugation may have different rates. In order for CP violation to manifest itself in this manner, an interference of some sort between amplitudes is necessary. The interference can arise in a variety of ways. It is therefore useful to introduce a classification of the various possibilities. We shall discuss it in terms of kaons, which are our main concern here, but it is applicable also to D and B mesons in a similar way. According to this classification, which is very common in the literature on CP violation, we may distinguish between:

a) CP violation in the mixing matrix. This type of effect is based on CP violation in the two-state mixing Hamiltonian $H$ (93) itself and is measured by the observable quantity $\mathrm{Im}(\Gamma_{12}/M_{12})$. It is related to a change in flavour by two units, $\Delta S = 2$.

b) CP violation in the decay amplitude. This class of phenomena is characterized by CP violation originating directly in the amplitude for a given decay. It is entirely independent of particle-antiparticle mixing and can therefore occur for charged mesons as well. Here the transitions have $\Delta S = 1$.

c) CP violation in the interference of mixing and decay. In this case the interference takes place between the mixing amplitude and the decay amplitude in decays of neutral mesons. This very important class is sometimes also referred to as mixing-induced CP violation, a terminology not to be confused with a).

Complementary to this classification is the widely used notion of direct versus indirect CP violation. It is motivated historically by the hypothesis of a new superweak interaction, which was proposed as early as 1964 by Wolfenstein to account for the CP violation observed in $K_L \to \pi^+\pi^-$ decay (see the lectures by Wolfenstein in this volume). This new CP violating interaction would lead to a local four-quark vertex that changes the flavour quantum number (strangeness) by two units. Its only effect would be a CP violating contribution to $M_{12}$, so that all observed CP violation could be attributed to particle-antiparticle mixing alone. Today, after the advent of the three generation SM, the CKM mechanism of CP violation appears more natural. In principle the superweak scenario represents a logical possibility, leading to a different pattern of observable CP violation effects.

Now, any CP violating effect that can be entirely assigned to CP violation in $M_{12}$ (as for the superweak case) is termed indirect CP violation. Conversely, any effect that cannot be described in this way and explicitly requires CP violating phases in the decay amplitude itself is called direct CP violation. It follows that class a) represents indirect, class b) direct CP violation. Class c) contains aspects of both. In this latter case the magnitude of CP violation observed in any one decay mode (within the neutral kaon system, say) could by itself be ascribed to mixing, thus corresponding to an indirect effect. On the other hand, a difference in the degree of CP violation between two different modes would reveal a direct effect. We illustrate these classes by a few important examples. We will also use this opportunity to discuss several aspects of kaon CP violation in more detail.

a) Lepton Charge Asymmetry
The lepton charge asymmetry in semileptonic $K_L$ decay is an example of CP violation in the mixing matrix. It is probably the most obvious manifestation of CP nonconservation in kaon decays. The observable considered here reads ($l = e$ or $\mu$)

$$\Delta = \frac{\Gamma(K_L\to\pi^-l^+\nu) - \Gamma(K_L\to\pi^+l^-\bar\nu)}{\Gamma(K_L\to\pi^-l^+\nu) + \Gamma(K_L\to\pi^+l^-\bar\nu)} = \frac{|1+\bar\varepsilon|^2-|1-\bar\varepsilon|^2}{|1+\bar\varepsilon|^2+|1-\bar\varepsilon|^2} \approx 2\,\mathrm{Re}\,\bar\varepsilon \approx \frac14\,\mathrm{Im}\frac{\Gamma_{12}}{M_{12}} \qquad (98)$$

If CP were a good symmetry of nature, $K_L$ would be a CP eigenstate and the two processes compared in (98) would be related by a CP transformation. The rate difference $\Delta$ should then vanish. Experimentally one finds however [20]

$$\Delta_{exp} = (3.27\pm0.12)\cdot10^{-3} \qquad (99)$$

a clear signal of CP violation. The second equality in (98) follows from (94), noting that the positive lepton $l^+$ can only originate from $K \sim (\bar sd)$, $l^-$ only from $\bar K \sim (\bar ds)$. This is true to leading order in SM weak interactions and holds to sufficient accuracy for our purpose. The charge of the lepton essentially serves to tag the strangeness of the $K$, thus picking out either only the $K$ or only the $\bar K$ component. Any phase in the semileptonic amplitudes is irrelevant and the CP violation effect is purely in the mixing matrix itself. In fact, as indicated in (98), $\Delta$ is determined by $\mathrm{Im}(\Gamma_{12}/M_{12})$, the physical measure of CP violation in the mixing matrix. From (99) we see that $\Delta > 0$. This empirical fact can be used to define positive electric charge in an absolute sense. Positive charge is the charge of the lepton more copiously produced in semileptonic $K_L$ decay. This definition is unambiguous and would even hold in an antimatter world. Also, using some parity violation experiment, this result implies in addition an absolute definition of left and right. These are quite remarkable facts. They clearly provide part of the motivation to try to learn more about the origin of CP violation.

b) CP Violation in the Decay Amplitude

Observable CP violation may also occur through interference effects in the decay amplitudes themselves (pure direct CP violation). This case is conceptually perhaps the simplest mechanism for CP violation and the basic features are here particularly transparent. Consider a situation where two different components contribute to the amplitude of a $K$ meson decaying into a final state $f$

$$A \equiv A(K\to f) = A_1e^{i\delta_1}e^{i\phi_1} + A_2e^{i\delta_2}e^{i\phi_2} \qquad (100)$$

Here $A_i$ ($i = 1,2$) are real amplitudes and $\delta_i$ are complex phases from CP conserving interactions. The $\delta_i$ are usually strong interaction rescattering phases. Finally the $\phi_i$ are weak phases, that is phases coming from the CKM matrix in the SM. The corresponding amplitude for the CP conjugated process $\bar K\to\bar f$ then reads (the explicit minus signs are due to our convention $CP\cdot K = -\bar K$, $CP\cdot f = \bar f$)

$$\bar A \equiv A(\bar K\to\bar f) = -A_1e^{i\delta_1}e^{-i\phi_1} - A_2e^{i\delta_2}e^{-i\phi_2} \qquad (101)$$
Since now all quarks are replaced by antiquarks (and vice versa) compared to (100), the weak phases change sign. The CP invariant strong phases remain the same. From (100) and (101) one finds immediately

$$|A|^2 - |\bar A|^2 \sim A_1A_2\sin(\delta_1-\delta_2)\sin(\phi_1-\phi_2) \qquad (102)$$

The conditions for a nonvanishing difference between the decay rates of $K\to f$ and the CP conjugate $\bar K\to\bar f$, that is direct CP violation, can be read off from (102). There need to be two interfering amplitudes $A_1$, $A_2$ and these amplitudes must simultaneously have both different weak ($\phi_i$) and different strong phases ($\delta_i$). Although the strong interaction phases can of course not generate CP violation by themselves, they are still a necessary requirement for the weak phase differences to show up as observable CP asymmetries. It is obvious from (100) and (101) that in the absence of strong phases $A$ and $\bar A$ would have the same absolute value despite their different weak phases, since then $\bar A = -A^*$.
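The structure of (102) is easy to verify numerically. The sketch below builds the two amplitudes from (100) and (101) for arbitrary illustrative magnitudes and phases (all numbers are made up for the check) and compares the rate difference with the closed form $-4A_1A_2\sin(\delta_1-\delta_2)\sin(\phi_1-\phi_2)$, which follows from expanding the two moduli squared; the overall factor $-4$ is what the proportionality sign in (102) leaves implicit.

```python
import cmath
import math


def amplitudes(a1, a2, delta1, delta2, phi1, phi2):
    """Return (A, A_bar) for real magnitudes a1, a2, strong phases delta_i, weak phases phi_i."""
    amp = a1 * cmath.exp(1j * (delta1 + phi1)) + a2 * cmath.exp(1j * (delta2 + phi2))
    # CP conjugation flips the weak phases only; the overall minus sign
    # reflects the convention CP K = -K_bar and does not affect |A_bar|.
    amp_bar = -(a1 * cmath.exp(1j * (delta1 - phi1)) + a2 * cmath.exp(1j * (delta2 - phi2)))
    return amp, amp_bar


def rate_difference(a1, a2, delta1, delta2, phi1, phi2):
    """Return |A|^2 - |A_bar|^2 computed directly from the amplitudes."""
    amp, amp_bar = amplitudes(a1, a2, delta1, delta2, phi1, phi2)
    return abs(amp) ** 2 - abs(amp_bar) ** 2


def closed_form(a1, a2, delta1, delta2, phi1, phi2):
    """Closed-form rate difference obtained by expanding both moduli squared."""
    return -4.0 * a1 * a2 * math.sin(delta1 - delta2) * math.sin(phi1 - phi2)


# Arbitrary illustrative inputs.
ARGS = (1.0, 0.3, 0.7, 0.2, 0.4, -0.5)
print("direct:     ", rate_difference(*ARGS))
print("closed form:", closed_form(*ARGS))
print("equal strong phases:", rate_difference(1.0, 0.3, 0.7, 0.7, 0.4, -0.5))
```

Setting either the two strong phases or the two weak phases equal makes the difference vanish, as stated above.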
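A specific example is given by the decays $K(\bar K)\to\pi^+\pi^-$ (here $f = \pi^+\pi^- = \bar f$). The amplitudes can be written as

$$A_{+-} = \sqrt{\tfrac23}\,A_0e^{i\delta_0} + \sqrt{\tfrac13}\,A_2e^{i\delta_2}, \qquad \bar A_{+-} = -\sqrt{\tfrac23}\,A_0^*e^{i\delta_0} - \sqrt{\tfrac13}\,A_2^*e^{i\delta_2} \qquad (103)$$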
where $A_{0,2}$ are the transition amplitudes of $K$ to the isospin-0 and isospin-2 components of the $\pi^+\pi^-$ final state, defined by

$$\langle\pi\pi(I=0,2)|\mathcal{H}_W|K\rangle = A_{0,2}\,e^{i\delta_{0,2}} \qquad (104)$$

They still include the weak phases, but the strong phases have been factored out and written explicitly in (103), (104). Taking the modulus squared of the amplitudes we get

$$\frac{\Gamma(K\to\pi^+\pi^-) - \Gamma(\bar K\to\pi^+\pi^-)}{\Gamma(K\to\pi^+\pi^-) + \Gamma(\bar K\to\pi^+\pi^-)} = \sqrt2\,\sin(\delta_0-\delta_2)\,\frac{\mathrm{Re}A_2}{\mathrm{Re}A_0}\left(\frac{\mathrm{Im}A_2}{\mathrm{Re}A_2} - \frac{\mathrm{Im}A_0}{\mathrm{Re}A_0}\right) = 2\,\mathrm{Re}\,\varepsilon' \qquad (105)$$

The quantity so defined is just twice the real part of the famous parameter $\varepsilon'$, the measure of direct CP violation in $K\to\pi\pi$ decays. The real parts of $A_{0,2}$ can be extracted from experiment. The imaginary parts have to be calculated using the effective Hamiltonian formalism. We should stress that the quantity in (105) is not the observable actually used to determine $\varepsilon'$ experimentally. We have discussed it here because it is of conceptual interest as the simplest manifestation of $\varepsilon'$. The realistic analysis requires a more general consideration of $K_L, K_S\to\pi\pi$ decays, to which we turn in the following paragraph.

c) Mixing Induced CP Violation in $K\to\pi\pi$: $\varepsilon$, $\varepsilon'$

In this section we will illustrate the concept of mixing-induced CP violation with the example of $K\to\pi\pi$ decays. These are important processes, since CP violation was first seen in $K_L\to\pi^+\pi^-$ and as of today our most precise experimental knowledge about this phenomenon still comes from the study of $K\to\pi\pi$ transitions. There are two distinct final states, and in a strong interaction eigenbasis the transitions are $K^0, \bar K^0\to\pi\pi(I=0), \pi\pi(I=2)$, with definite isospin for $\pi\pi$. Alternatively, using the physical eigenbasis for both initial and final states, one has $K_L, K_S\to\pi^+\pi^-, \pi^0\pi^0$.
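Consider next the amplitude for $K_L$ going into the CP even state $\pi\pi(I=0)$, which can proceed via $K$ ($\sim(1+\bar\varepsilon)A_0$) or via $\bar K$ ($\sim(1-\bar\varepsilon)A_0^*$). Hence (to first order in small quantities)

$$A(K_L\to\pi\pi(I=0)) \sim (1+\bar\varepsilon)A_0e^{i\delta_0} - (1-\bar\varepsilon)A_0^*e^{i\delta_0} \sim \bar\varepsilon + i\,\frac{\mathrm{Im}A_0}{\mathrm{Re}A_0} \equiv \varepsilon \qquad (106)$$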
This defines the parameter $\varepsilon$, characterizing mixing-induced CP violation. Note that $\varepsilon$ involves a component from mixing ($\bar\varepsilon$) as well as from the decay amplitude ($\mathrm{Im}A_0/\mathrm{Re}A_0$). Neither of those is physical separately, but $\varepsilon$ is. Note also that the physical quantity $\mathrm{Re}\,\bar\varepsilon$ discussed above satisfies $\mathrm{Re}\,\bar\varepsilon = \mathrm{Re}\,\varepsilon$. More generally one can form the following two CP violating observables

$$\eta_{+-} = \frac{A(K_L\to\pi^+\pi^-)}{A(K_S\to\pi^+\pi^-)}, \qquad \eta_{00} = \frac{A(K_L\to\pi^0\pi^0)}{A(K_S\to\pi^0\pi^0)} \qquad (107)$$

These amplitude ratios involve the physical initial and final states and are directly measurable in experiment. They are related to $\varepsilon$ and $\varepsilon'$ through

$$\eta_{+-} = \varepsilon+\varepsilon', \qquad \eta_{00} = \varepsilon-2\varepsilon' \qquad (108)$$
The phase of $\varepsilon$ is given by $\varepsilon = |\varepsilon|\exp(i\pi/4)$. The relative phase between $\varepsilon'$ and $\varepsilon$ can be determined theoretically. It is close to zero so that to very good approximation $\varepsilon'/\varepsilon = \mathrm{Re}(\varepsilon'/\varepsilon)$. Both $\eta_{+-}$ and $\eta_{00}$ measure mixing-induced CP violation (interference between mixing and decay). Each of them considered separately could be attributed to CP violation in $K$-$\bar K$ mixing and would therefore represent indirect CP violation. On the other hand, a nonvanishing difference $\eta_{+-}-\eta_{00} = 3\varepsilon' \neq 0$ is a signal of direct CP violation. Experimentally one has [20]

$$|\varepsilon| = (2.282\pm0.019)\cdot10^{-3} \qquad (109)$$
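The measured $|\varepsilon|$ can be cross-checked against the semileptonic charge asymmetry in (98), (99). The sketch below takes $|\varepsilon|$ from (109) and the phase $\pi/4$ quoted above, uses $\mathrm{Re}\,\bar\varepsilon = \mathrm{Re}\,\varepsilon$, and evaluates the middle expression of (98). Treating the phase as exactly $\pi/4$ is a simplification made for this illustration.

```python
import cmath
import math

EPS_ABS = 2.282e-3          # |epsilon| from (109)
EPS_PHASE = math.pi / 4.0   # phase of epsilon, taken as exactly pi/4 (simplification)
DELTA_EXP = 3.27e-3         # measured charge asymmetry from (99)
DELTA_EXP_ERR = 0.12e-3     # its quoted uncertainty


def charge_asymmetry(eps: complex) -> float:
    """Evaluate (|1+eps|^2 - |1-eps|^2) / (|1+eps|^2 + |1-eps|^2) as in (98)."""
    plus = abs(1.0 + eps) ** 2
    minus = abs(1.0 - eps) ** 2
    return (plus - minus) / (plus + minus)


eps = cmath.rect(EPS_ABS, EPS_PHASE)
delta_pred = charge_asymmetry(eps)

print(f"2 Re(eps)          = {2.0 * eps.real:.3e}")
print(f"asymmetry from eps = {delta_pred:.3e}")
print(f"measured asymmetry = {DELTA_EXP:.3e} +/- {DELTA_EXP_ERR:.1e}")
```

The asymmetry obtained this way agrees with the measured value within its quoted uncertainty, which illustrates that both observables probe the same quantity $\mathrm{Re}\,\varepsilon$.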
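The quantity $\varepsilon'$ can be measured as the ratio $\mathrm{Re}(\varepsilon'/\varepsilon)\approx\varepsilon'/\varepsilon$ using the double ratio of rates

$$\left|\frac{\eta_{+-}}{\eta_{00}}\right|^2 = 1 + 6\,\mathrm{Re}\,\frac{\varepsilon'}{\varepsilon} \qquad (110)$$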
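The linearized relation (110) follows from (108) by expanding in the small ratio $\varepsilon'/\varepsilon$. A minimal sketch comparing the exact double ratio with the linear approximation, for an illustrative real value of $\varepsilon'/\varepsilon$ chosen here only for the check:

```python
def double_ratio_exact(ratio: complex) -> float:
    """Return |eta_+- / eta_00|^2 with eta_+- = eps(1 + r), eta_00 = eps(1 - 2r), r = eps'/eps."""
    return abs((1.0 + ratio) / (1.0 - 2.0 * ratio)) ** 2


def double_ratio_linear(ratio: complex) -> float:
    """Return the first-order approximation 1 + 6 Re(eps'/eps)."""
    return 1.0 + 6.0 * complex(ratio).real


def re_eps_ratio_from_double_ratio(double_ratio: float) -> float:
    """Invert the linear relation to extract Re(eps'/eps) from a measured double ratio."""
    return (double_ratio - 1.0) / 6.0


R_ILLUSTRATIVE = 2.0e-3   # illustrative value of eps'/eps (assumed)

exact = double_ratio_exact(R_ILLUSTRATIVE)
linear = double_ratio_linear(R_ILLUSTRATIVE)
print(f"exact  double ratio: {exact:.6f}")
print(f"linear double ratio: {linear:.6f}")
print(f"recovered Re(eps'/eps): {re_eps_ratio_from_double_ratio(exact):.3e}")
```

The difference between the two expressions is of second order in $\varepsilon'/\varepsilon$, so for values of order $10^{-3}$ the linear form is accurate at the level relevant for experiment.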
Ten years ago, and until recently, the experimental situation was characterized by the following, somewhat inconclusive results [22,23]:

$$\frac{\varepsilon'}{\varepsilon} = \begin{cases}(23\pm7)\cdot10^{-4} & \text{CERN NA31}\\ (7.4\pm5.9)\cdot10^{-4} & \text{FNAL E731}\end{cases} \qquad (111)$$

In particular the second measurement was well compatible with zero. A new round of experiments, conducted at both CERN and Fermilab, was therefore anticipated with great interest. The recent results have firmly established direct CP violation [24,25]:

$$\frac{\varepsilon'}{\varepsilon} = \begin{cases}(28.0\pm4.1)\cdot10^{-4} & \text{FNAL KTeV}\\ (14.0\pm4.3)\cdot10^{-4} & \text{CERN NA48}\end{cases} \qquad (112)$$

These results rule out the superweak hypothesis, at least in its most stringent form. The analyses of the experiments are currently still ongoing and should eventually settle the value of $\varepsilon'/\varepsilon$ to an accuracy of $(1$-$2)\cdot10^{-4}$.

4.3 Theory of $\varepsilon$ and the Unitarity Triangle
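Calculation of $\varepsilon$

In the theoretical expression for $\varepsilon$ in (106), the term $\mathrm{Im}A_0/\mathrm{Re}A_0$ is numerically negligible (in standard phase convention). The value for $\varepsilon$ is then approximately given by $\bar\varepsilon$ from (96), and can be written as

$$\varepsilon = \frac{e^{i\pi/4}}{\sqrt2\,\Delta M}\,\mathrm{Im}M_{12} \qquad (113)$$

$M_{12}$ is related to the first diagram shown in Fig. 16. It is given by

$$M_{12} = \frac{1}{2m_K}\,\langle K^0|\mathcal{H}^{\Delta S=2}_{eff}|\bar K^0\rangle \qquad (114)$$

Here $\mathcal{H}^{\Delta S=2}_{eff}$ is the effective Hamiltonian for $\Delta S = 2$ transitions, which is derived from the box diagrams for $M_{12}$ in Fig. 16 by performing an operator product expansion. In this case there is only a single operator

$$Q^{\Delta S=2} = (\bar ds)_{V-A}(\bar ds)_{V-A} \qquad (115)$$

in the effective Hamiltonian. One obtains explicitly

$$\varepsilon = e^{i\pi/4}\,\frac{G_F^2M_W^2f_K^2B_K\,m_K}{12\pi^2\sqrt2\,\Delta M_K}\;\mathrm{Im}\left[\lambda_c^{*2}S_0(x_c)\eta_1 + \lambda_t^{*2}S_0(x_t)\eta_2 + 2\lambda_c^*\lambda_t^*S_0(x_c,x_t)\eta_3\right] \qquad (116)$$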
Here $\lambda_i = V_{is}^*V_{id}$, $f_K = 160$ MeV is the kaon decay constant and, at NLO, the bag parameter $B_K$ is defined by

$$B_K = B_K(\mu)\left[\alpha_s^{(3)}(\mu)\right]^{-2/9}\left[1+\frac{\alpha_s^{(3)}(\mu)}{4\pi}J_3\right] \qquad (117)$$

$$\langle K^0|(\bar ds)_{V-A}(\bar ds)_{V-A}|\bar K^0\rangle \equiv \frac83\,B_K(\mu)\,f_K^2\,m_K^2 \qquad (118)$$

Table 1: NLO results for $\eta_i$ with $\Lambda^{(4)}_{\overline{MS}} = (325\pm110)$ MeV, $m_c(m_c) = (1.3\pm0.05)$ GeV, $m_t(m_t) = (170\pm15)$ GeV. The third column shows the uncertainty due to the errors in $\Lambda_{\overline{MS}}$ and quark masses. The fourth column indicates the residual renormalization scale uncertainty at NLO in the product of $\eta_i$ with the corresponding mass dependent function from eq. (116). These products are scale independent up to the order considered in perturbation theory. The central values of the QCD factors at LO are also given for comparison.

            NLO (central)   Λ_MS, m_c, m_t   scale dep.   NLO ref.   LO (central)
  η_1       1.38            ±35%             ±15%         [26]       1.12
  η_2       0.574           ±0.6%            ±0.4%        [27]       0.61
  η_3       0.47            ±3%              ±7%          [28]       0.35

The index (3) in eq. (117) refers to the number of flavours in the effective theory and $J_3 = 307/162$ (in the so-called NDR scheme [12]). The Wilson coefficient multiplying $B_K$ in (116) consists of a charm contribution, a top contribution and a mixed top-charm contribution. It depends on the quark masses, $x_i = m_i^2/M_W^2$, through the functions $S_0$. The $\eta_i$ are the corresponding short-distance QCD correction factors (which depend only slightly on quark masses). Detailed definitions can be found in [12]. Numerical values for $\eta_1$, $\eta_2$ and $\eta_3$ are summarized in Table 1. Concerning these results the following remarks should be made.

• $\varepsilon$ is dominated by the top contribution ($\sim 70\%$). It is therefore rather satisfying that the related short distance part $\eta_2S_0(x_t)$ is theoretically extremely well under control, as can be seen in Table 1. Note in particular the very small scale ambiguity at NLO, $\pm0.4\%$ (for 100 GeV $\leq \mu_t \leq$ 300 GeV). This intrinsic theoretical uncertainty is much reduced compared to the leading order result, where it would be as large as $\pm9\%$.

• The $\eta_i$ factors and the hadronic matrix element are not physical quantities by themselves. When quoting numbers it is therefore essential that mutually consistent definitions are employed. The factors $\eta_i$ described here are to be used in conjunction with the so-called scale- (and scheme-) invariant bag parameter $B_K$ introduced in (117). The last factor on the right-hand side of (117) enters only at NLO. As a numerical example, if the (scale and scheme dependent) parameter $B_K(\mu)$ is given in the NDR scheme at $\mu = 2$ GeV, then (117) becomes $B_K = B_K(\mathrm{NDR}, 2\,\mathrm{GeV})\cdot1.31\cdot1.05$.
• The quantity $B_K$ has to be calculated by non-perturbative methods. A representative range is

$$B_K = 0.80\pm0.15 \qquad (119)$$

The status of $B_K$ is reviewed in [29].

Determination of the Unitarity Triangle

The source of CP violation in the standard model (SM) is the Cabibbo-Kobayashi-Maskawa (CKM) matrix $V$ entering the charged-current weak interaction Lagrangian

$$\mathcal{L}_{CC} = \frac{g_W}{2\sqrt2}\,V_{ij}\,\bar u_i\gamma^\mu(1-\gamma_5)d_j\,W^+_\mu + \mathrm{h.c.} \qquad (120)$$

where $(u_1,u_2,u_3) = (u,c,t)$, $(d_1,d_2,d_3) = (d,s,b)$ are the mass eigenstates of the six quark flavours and a summation over $i,j = 1,2,3$ is understood. The unitary CKM matrix has the following explicit form

$$V = \begin{pmatrix}V_{ud}&V_{us}&V_{ub}\\V_{cd}&V_{cs}&V_{cb}\\V_{td}&V_{ts}&V_{tb}\end{pmatrix} \simeq \begin{pmatrix}1-\lambda^2/2&\lambda&A\lambda^3(\varrho-i\eta)\\-\lambda&1-\lambda^2/2&A\lambda^2\\A\lambda^3(1-\varrho-i\eta)&-A\lambda^2&1\end{pmatrix} \qquad (121)$$
where the second expression is a convenient parametrization in terms of $\lambda$, $A$, $\varrho$ and $\eta$ due to Wolfenstein. It is organized as a series expansion in powers of $\lambda = 0.22$ (the sine of the Cabibbo angle) to exhibit the hierarchy among the transitions between generations. The explicit parametrization shown in (121) is valid through order $O(\lambda^3)$, an approximation that is sufficient for most practical applications. Higher order terms can be taken into account if necessary [12]. The unitarity structure of the CKM matrix is conventionally displayed in the so-called unitarity triangle, Fig. 17 (left). This triangle is a graphical representation of the unitarity relation $V_{ud}V_{ub}^* + V_{cd}V_{cb}^* + V_{td}V_{tb}^* = 0$ (normalized by $-V_{cd}V_{cb}^*$) in the complex plane of Wolfenstein parameters $(\varrho, \eta)$. The angles $\alpha$, $\beta$ and $\gamma$ of the unitarity triangle are phase convention independent and can be determined in CP violation experiments. The area of the unitarity triangle, which is proportional to $\eta$, is a measure of CP nonconservation in the standard model. We briefly summarize the main ingredients of the standard analysis of the unitarity triangle, where $\varepsilon$ plays a central role. There are 4 independent parameters $\lambda$, $A$, $\varrho$ and $\eta$ in the CKM matrix, which are determined from 4 measurements as follows:
Figure 17: The normalized unitarity triangle in the $(\varrho, \eta)$ plane (left), and its standard determination (right).
• $\lambda = 0.22$, from $K^+\to\pi^0l^+\nu$, or from semileptonic hyperon decay ($\Lambda\to pe\bar\nu$, $\Sigma^-\to ne\bar\nu$, $\Xi^-\to\Lambda e\bar\nu$).

• $V_{cb} = A\lambda^2 = 0.040\pm0.002$, from $b\to cl\nu$ transitions.

• $|V_{ub}/V_{cb}| = \lambda\sqrt{\varrho^2+\eta^2} = 0.09\pm0.02$, from $b\to ul\nu$ transitions.

• $|\varepsilon| \sim \eta\left((1-\varrho)A^2S(m_t) + c\right)A^2B_K$ with $B_K = 0.80\pm0.15$, from indirect CP violation in $K\to\pi\pi$.

Under the final item we have indicated the dependence of $\varepsilon$ (from (116)) on the most important parameters, writing the CKM quantities explicitly in Wolfenstein form. Here $c$ denotes a constant that is independent of $A$, $\varrho$, $\eta$, $m_t$ and $B_K$. The standard determination of the unitarity triangle is illustrated in Fig. 17 (right). The relevant input parameters are $B_K$, $m_t$, $V_{cb}$ and $|V_{ub}/V_{cb}|$. For fixed $B_K$, $m_t$ and $V_{cb}$, the measured $|\varepsilon|$ determines a hyperbola in the $\varrho$-$\eta$ plane of Wolfenstein parameters. Intersecting the hyperbola with the circle defined by $|V_{ub}/V_{cb}|$ determines the unitarity triangle (up to a two-fold ambiguity). There is a simple regularity, which is quite useful and easy to remember: as any one of the four input parameters becomes too small (with the others held fixed), the SM picture becomes inconsistent (see Fig. 17). Using this fact, lower bounds on these parameters can be derived. The large value that has been established for the top-quark mass in fact helps to maintain the consistency of the SM. In principle, once the unitarity triangle is fixed in this manner, any further, independent measurement of a quantity in the $(\varrho, \eta)$ plane provides us with an additional standard model test. In practice, however, the accuracy of such a test is limited by hadronic uncertainties, which enter mainly through $B_K$ and $|V_{ub}/V_{cb}|$. Useful additional restrictions come from $B$-$\bar B$ mixing. Both $\Delta M_d$ and $\Delta M_d/\Delta M_s$, where $\Delta M_q$ is the mass difference in the $B_q$-$\bar B_q$ system, constrain $\sqrt{(1-\varrho)^2+\eta^2}$. The ratio $\Delta M_d/\Delta M_s$ is particularly important, because the hadronic uncertainties cancel in the limit of SU(3) symmetry. While $\Delta M_d$ is well measured, only a lower bound $\Delta M_s > 14.3\ \mathrm{ps}^{-1}$ exists at present. However, already this bound is interesting. Together with $\Delta M_d$, it implies a quite clean upper bound on $\sqrt{(1-\varrho)^2+\eta^2}$. This essentially excludes negative values of $\varrho$, severely restricting the allowed range of $\varrho$ and $\eta$.

Figure 18: The allowed region (shaded) in the $(\bar\varrho, \bar\eta)$ plane, combining information from $\varepsilon$, $|V_{ub}/V_{cb}|$ and including the constraint from $\Delta M_d$. The independent constraint from the lower limit on $\Delta M_s/\Delta M_d$ excludes the region to the left of the curves labeled with $\Delta M_s$ in the plot. $\xi \simeq 1.2$ measures SU(3) breaking in the hadronic matrix elements of $B_d$-$\bar B_d$ versus $B_s$-$\bar B_s$ mixing.
The results of a complete analysis of the unitarity triangle are shown in Fig. 18. Here the axes are labeled by $\bar\varrho = \varrho(1-\lambda^2/2)$ and $\bar\eta = \eta(1-\lambda^2/2)$, instead of $\varrho$ and $\eta$. In this way higher terms in the Wolfenstein expansion $\sim\lambda^2$, which can be neglected to first approximation, are consistently taken into account. The plot is taken from [30] where further details can be found.
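The statements about the Wolfenstein parametrization can be checked with a few lines of code. The sketch below builds the matrix (121), measures how far $VV^\dagger$ deviates from the unit matrix (expected to be of order $\lambda^4$, since (121) is truncated at $O(\lambda^3)$), and evaluates $|V_{ub}/V_{cb}|$ and the rescaled coordinates $\bar\varrho$, $\bar\eta$. The values of $\lambda$ and $V_{cb}$ are those quoted in the list above; the point $(\varrho, \eta) = (0.2, 0.35)$ is an illustrative assumption, not a fit result.

```python
import math


def wolfenstein_ckm(lam: float, a: float, rho: float, eta: float):
    """Return the CKM matrix in the Wolfenstein form (121), valid through O(lambda^3)."""
    return [
        [1.0 - lam**2 / 2.0, lam, a * lam**3 * complex(rho, -eta)],
        [-lam, 1.0 - lam**2 / 2.0, a * lam**2],
        [a * lam**3 * complex(1.0 - rho, -eta), -a * lam**2, 1.0],
    ]


def unitarity_deviation(v) -> float:
    """Return the largest |(V V^dagger)_ij - delta_ij| over all entries."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            entry = sum(v[i][k] * complex(v[j][k]).conjugate() for k in range(3))
            target = 1.0 if i == j else 0.0
            worst = max(worst, abs(entry - target))
    return worst


LAM = 0.22                  # from the text
A_PARAM = 0.040 / LAM**2    # from V_cb = A lambda^2 = 0.040
RHO, ETA = 0.2, 0.35        # illustrative point in the (rho, eta) plane (assumed)

V = wolfenstein_ckm(LAM, A_PARAM, RHO, ETA)
deviation = unitarity_deviation(V)
vub_over_vcb = abs(V[0][2]) / abs(V[1][2])
rho_bar = RHO * (1.0 - LAM**2 / 2.0)
eta_bar = ETA * (1.0 - LAM**2 / 2.0)

print(f"max deviation from unitarity: {deviation:.2e} (lambda^4 = {LAM**4:.2e})")
print(f"|V_ub / V_cb| = {vub_over_vcb:.3f}")
print(f"(rho_bar, eta_bar) = ({rho_bar:.3f}, {eta_bar:.3f})")
```

For this assumed point $|V_{ub}/V_{cb}|$ comes out close to the central value quoted above, and the rescaling to $(\bar\varrho, \bar\eta)$ changes the coordinates by only about 2.4%.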
4.4 Calculating $\varepsilon'/\varepsilon$
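The formula for $\varepsilon'/\varepsilon$, which can be derived from the definition in (107), (108), is given by

$$\frac{\varepsilon'}{\varepsilon} = \frac{\omega}{\sqrt2\,|\varepsilon|}\left(\frac{\mathrm{Im}A_2}{\mathrm{Re}A_2} - \frac{\mathrm{Im}A_0}{\mathrm{Re}A_0}\right) \qquad (122)$$

where $\omega = \mathrm{Re}A_2/\mathrm{Re}A_0$. This may be compared with (105) using

$$\arg(\varepsilon') = \frac\pi2 + \delta_2 - \delta_0 \approx \frac\pi4 \qquad (123)$$

The expression (122) for $\varepsilon'/\varepsilon$ may also be written as

$$\frac{\varepsilon'}{\varepsilon} = -\frac{\omega}{\sqrt2\,|\varepsilon|\,\mathrm{Re}A_0}\left(\mathrm{Im}A_0 - \frac1\omega\,\mathrm{Im}A_2\right) \qquad (124)$$

$\mathrm{Im}A_{0,2}$ are calculated from the general low energy effective Hamiltonian for $\Delta S = 1$ transitions (48), which we have described in sec. 3.4. One has

$$\mathrm{Im}A_{0,2} = -\mathrm{Im}\lambda_t\,\frac{G_F}{\sqrt2}\sum_{i=3}^{10}y_i(\mu)\,\langle Q_i\rangle_{0,2} \qquad (125)$$

Here $y_i$ are the Wilson coefficients and $\langle Q_i\rangle_{0,2}\,e^{i\delta_{0,2}} = \langle\pi\pi(I=0,2)|Q_i|K^0\rangle$, $\lambda_t = V_{ts}^*V_{td}$. For the purpose of illustration we keep only the numerically dominant contributions and write

$$\frac{\varepsilon'}{\varepsilon} = \frac{\omega\,G_F}{2\,|\varepsilon|\,\mathrm{Re}A_0}\,\mathrm{Im}\lambda_t\left(y_6\langle Q_6\rangle_0 - \frac1\omega\,y_8\langle Q_8\rangle_2 + \ldots\right) \qquad (126)$$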
$Q_6$ originates from gluonic penguin diagrams and $Q_8$ from electroweak contributions, as indicated schematically in Fig. 19. The matrix elements of $Q_6$ and $Q_8$ can be parametrized by bag parameters $B_6$ and $B_8$ as

$$\langle Q_6\rangle_0 = -4\sqrt{\frac32}\left[\frac{m_K}{m_s(\mu)+m_d(\mu)}\right]^2 m_K^2\,(f_K-f_\pi)\cdot B_6 \qquad (127)$$

$$\langle Q_8\rangle_2 \simeq \sqrt3\left[\frac{m_K}{m_s(\mu)+m_d(\mu)}\right]^2 m_K^2\,f_\pi\cdot B_8 \qquad (128)$$

$B_6 = B_8 = 1$ corresponds to the factorization assumption for the matrix elements, which holds in the large $N_C$ limit of QCD.
Figure 19: Gluonic and electroweak penguin contributions, which give rise to operators $Q_6$ and $Q_8$, respectively.
The numerical importance of the contributions from $Q_6$ and $Q_8$ can be understood as follows. $Q_{6,8}$ are particular because they are of the $(V-A)\otimes(V+A)$ form, which results in a $(S+P)\otimes(S-P)$ structure upon Fierz transformation. Factorizing the matrix element of such operators gives for example

$$\langle\pi^+\pi^-|(\bar su)_{S+P}(\bar ud)_{S-P}|K\rangle \to -\langle\pi^-|\bar su|K\rangle\,\langle\pi^+|\bar u\gamma_5d|0\rangle \qquad (129)$$

Taking the derivative ($\partial_\mu$) of

$$\langle\pi^+(p)|(\bar u\gamma^\mu\gamma_5d)(x)|0\rangle = i f_\pi\,p^\mu e^{ip\cdot x} \qquad (130)$$

and using the equations of motion, we find

$$\langle\pi^+|\bar u\gamma_5d|0\rangle = i f_\pi\,\frac{m_\pi^2}{m_u+m_d} = i f_\pi\,\frac{m_K^2}{m_s+m_d} \qquad (131)$$
Here the second equality follows from the $\chi$PT relations in (80). A second factor of $m_K^2/(m_s+m_d)$ comes in a similar way from the scalar current matrix element $\langle\pi^-|\bar su|K\rangle$. This explains the quark mass dependence in (127) and (128). Since

$$\frac{m_K^2}{m_s+m_d} = B_0 = O(\Lambda_{QCD}) \qquad (132)$$

we see that the matrix elements are primarily not proportional to $(m_s+m_d)^{-2}$, but to $B_0^2$, which remains finite in the chiral limit $m_s, m_d\to0$. However $m_K$ is precisely known and it is customary to trade the $\chi$PT parameter $B_0$ for the quark masses $m_s+m_d\approx m_s$. Because $B_0$ is numerically, and somewhat accidentally, quite large (equivalently, the quark masses quite small), the matrix elements of $Q_{6,8}$ are systematically enhanced over those of the ordinary $(V-A)\otimes(V-A)$ operators. $Q_5$ and $Q_7$ have a Dirac structure similar to $Q_6$ and $Q_8$, but a different colour structure, which leads to a $1/N_C$ suppression. In addition their Wilson coefficients are numerically smaller. This implies that $Q_6$ and $Q_8$ give the dominant contributions. $y_6\langle Q_6\rangle_0$ and $y_8\langle Q_8\rangle_2$ are positive numbers. The value for $\varepsilon'/\varepsilon$ in (126) is thus characterized by a potential cancellation of two competing contributions. Since the second contribution is an electroweak effect, suppressed by $\sim\alpha/\alpha_s$ compared to the leading gluonic penguin $\sim\langle Q_6\rangle_0$, it could appear at first sight that it should be altogether negligible for $\varepsilon'/\varepsilon$. However, a number of circumstances actually conspire to systematically enhance the electroweak effect so as to render it a sizable contribution:

• Unlike $Q_6$, which is a pure $\Delta I = 1/2$ operator, $Q_8$ can give rise to the $\pi\pi(I=2)$ final state and thus yield a non-vanishing $\mathrm{Im}A_2$ in the first place.

• The $O(\alpha/\alpha_s)$ suppression is largely compensated by the factor $1/\omega\approx22$ in (126), reflecting the $\Delta I = 1/2$ rule.

• $-y_8\langle Q_8\rangle_2$ gives a negative contribution to $\varepsilon'/\varepsilon$ that strongly grows with $m_t$. For the realistic top mass value it can be substantial.
In order to estimate $\varepsilon'/\varepsilon$ numerically (see (126)), the hadronic matrix elements have to be determined within a nonperturbative framework (e.g. lattice QCD, $1/N_C$ expansion, chiral quark model), while the coefficients $y_i$ are known from perturbation theory, and $\mathrm{Re}A_0$, $\omega$, $G_F$, $|\varepsilon|$ are fixed from experiment. Finally, the CKM quantity $\mathrm{Im}\lambda_t\sim\eta$ is obtained from the standard determination of the unitarity triangle described in the previous section. The Wilson coefficients $y_i$ have been calculated at NLO [31,32]. The short-distance part is therefore quite well under control. The remaining problem is then the computation of matrix elements, in particular $B_6$ and $B_8$. The cancellation between these contributions enhances the relative sensitivity of $\varepsilon'/\varepsilon$ to the anyhow uncertain hadronic parameters, which makes a precise calculation of $\varepsilon'/\varepsilon$ impossible at present. The order of magnitude of $\varepsilon'$ can however be understood from (122). The size of $\mathrm{Im}A_i/\mathrm{Re}A_i$ is essentially determined by the small CKM parameters that carry the complex phase and which are related to the top quark in the loop diagrams of Fig. 19. Roughly speaking $\mathrm{Im}A_i/\mathrm{Re}A_i\sim\mathrm{Im}V_{ts}^*V_{td}\sim10^{-4}$. Empirically we have, from the $\Delta I = 1/2$ rule, $\mathrm{Re}A_2/\mathrm{Re}A_0\sim10^{-2}$. This leads to a natural size of $\varepsilon'$ of $\sim10^{-6}$, thus $\varepsilon'/\varepsilon\sim10^{-3}$. A complete analysis gives the result [33,29]

$$1.4\cdot10^{-4} < \varepsilon'/\varepsilon < 32.7\cdot10^{-4} \qquad \text{(scanning)} \qquad (133)$$

$$5.2\cdot10^{-4} < \varepsilon'/\varepsilon < 16.0\cdot10^{-4} \qquad \text{(Gaussian)} \qquad (134)$$

Figure 20: The leading order electroweak diagrams contributing to $K\to\pi\nu\bar\nu$ in the standard model.
This is compatible with the experimental results (111), (112) within the rather large uncertainties. The two ranges refer to different treatments of uncertainties in the experimental input: the assumption of Gaussian errors, or flat distributions (scanning). Similar findings are reported by other groups 34-42. A detailed review of the theoretical status of $\varepsilon'/\varepsilon$, including recent developments (hadronic matrix elements, final state interactions, isospin breaking corrections etc.), along with further references can be found in 43. The recent experimental confirmation that indeed $\varepsilon'\neq 0$ constitutes a qualitatively new feature of CP violation and is as such of great importance. However, due to the large uncertainties in the theoretical calculation, a quantitative use of this result for the extraction of CKM parameters is unfortunately rather limited. For this purpose one has to turn to theoretically cleaner observables. As we will see in the next section, rare K decays in fact offer very promising opportunities in this direction.

5 Rare K Decays

5.1 $K^+\to\pi^+\nu\bar\nu$ and $K_L\to\pi^0\nu\bar\nu$
The decays $K\to\pi\nu\bar\nu$ proceed through flavour changing neutral currents. These arise in the standard model only at second (one-loop) order in the electroweak interaction (Z-penguin and W-box diagrams, Fig. 20) and are additionally GIM suppressed. The branching fractions are thus very small, at the level of $10^{-10}$, which makes a detection of these modes rather challenging. However, the loop process $K\to\pi\nu\bar\nu$, a genuine quantum effect of standard model flavour dynamics, probes important short distance physics, in particular properties of the top quark ($m_t$, $V_{td}$, $V_{ts}$). It is also very sensitive to potential new physics effects. At the same time, the $K\to\pi\nu\bar\nu$ modes are reliably calculable, in contrast to most other decay modes of interest. A measurement of $K^+\to\pi^+\nu\bar\nu$ and $K_L\to\pi^0\nu\bar\nu$ will therefore be an extremely useful test of flavour physics. Let us discuss the main properties of these decays, concentrating first on the charged mode. The GIM structure of the amplitude can be written as

$$\sum_{i=u,c,t}\lambda_i F(x_i) = \lambda_c\,(F(x_c)-F(x_u)) + \lambda_t\,(F(x_t)-F(x_u)) \qquad (135)$$
with $\lambda_i = V^*_{is}V_{id}$ and $x_i = m_i^2/M_W^2$. The first important point is the characteristic hard GIM cancellation pattern, which means that the function $F$ depends as a power on the internal mass scale

$$F(x_u)\sim\frac{\Lambda^2_{QCD}}{M_W^2}\sim 10^{-5} \;\ll\; F(x_c)\sim\frac{m_c^2}{M_W^2}\ln\frac{M_W}{m_c}\sim 10^{-3} \;\ll\; F(x_t)\sim 1 \qquad (136)$$

The up-quark contribution is a long-distance effect, determined by the scale $\Lambda_{QCD}$. As an immediate consequence, top and charm contribution with their hard scales $m_t$, $m_c$ dominate the amplitude, whereas the long-distance part $F(x_u)$ is negligible. Note that the charm contribution, $\lambda_c F(x_c)\sim 10^{-1}\cdot 10^{-3}$, and the top contribution, $\lambda_t F(x_t)\sim 10^{-4}\cdot 1$, have the same order of magnitude when the CKM factors are included. The origin of the hard GIM mechanism is the fact that the neutrinos only couple to the heavy gauge bosons W and Z. It is interesting to contrast the situation with $K^+\to\pi^+e^+e^-$, where photon exchange can contribute as shown in Fig. 21. For simplicity we consider the case of internal quarks that are light compared to $M_W$. The W propagator can then be contracted and the loop reduces essentially to a vacuum polarization diagram. Electromagnetic gauge invariance, which is unbroken, requires the $\bar s d$-photon vertex to have the form

$$\Gamma_\mu(q)\sim\bar s\gamma^\nu(1-\gamma_5)d\cdot(q^2 g_{\mu\nu}-q_\mu q_\nu)\,\ln\frac{M_W}{m_i} \qquad (137)$$

where $q$ is the photon momentum and we assumed $q^2\ll m_i^2\ll M_W^2$. The structure in (137) ensures current conservation, $q^\mu\Gamma_\mu(q)=0$. When the vertex is contracted with the electron current $\bar e\gamma^\mu e$, the $q_\mu q_\nu$ term vanishes by the equations of motion. The $q^2$ factor of the remaining term is canceled by the photon propagator, which yields a local $(\bar s d)_{V-A}(\bar e e)_V$ interaction and a loop function

$$F(x_i)\sim\ln\frac{M_W}{m_i} \qquad (138)$$
Figure 21: Soft GIM mechanism in the photon penguin.
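The difference between the power-like (hard) and logarithmic (soft) GIM behaviour can be made concrete with a few lines of arithmetic. The following is only an illustrative sketch: the mass values, the use of a scale of order $\Lambda_{QCD}$ as a stand-in for the up-quark contribution, and the function names are assumptions made here, and the light-quark scaling forms are not meant to apply to the top quark.

```python
import math

# Illustrative inputs (assumptions for this sketch, not values taken from the text)
M_W = 80.4         # GeV, W boson mass
M_CHARM = 1.3      # GeV, charm quark mass
LAMBDA_QCD = 0.3   # GeV, stand-in for the long-distance up-quark scale


def hard_gim(m):
    """Power-suppressed scaling (m^2/M_W^2) ln(M_W/m), meaningful for m << M_W."""
    return (m / M_W) ** 2 * math.log(M_W / m)


def soft_gim(m):
    """Logarithmic scaling ln(M_W/m) characteristic of the photon penguin."""
    return math.log(M_W / m)


if __name__ == "__main__":
    for name, m in [("up (Lambda_QCD)", LAMBDA_QCD), ("charm", M_CHARM)]:
        print(f"{name:16s} hard ~ {hard_gim(m):.1e}   soft ~ {soft_gim(m):.1f}")
```

In the hard case the light scale is suppressed by orders of magnitude relative to charm, whereas in the soft case the lightest scale gives the largest logarithm, which is the numerical face of the long-distance sensitivity.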
The logarithmic behaviour of the photon penguin is referred to as the soft GIM mechanism. It is in contrast to the hard GIM structure arising from the W and Z contributions where gauge symmetry is spontaneously broken, leading to the power behaviour $\sim(m_i^2/M_W^2)\ln(M_W/m_i)$. The unsuppressed sensitivity to the light quark mass $m_i=m_u$ in (138) is a signal of the long-distance dominance in the $K^+\to\pi^+e^+e^-$ amplitude. Of course, $\chi$PT is needed for a consistent analysis in this case. The quark-level result in (138), derived in perturbation theory, is strictly speaking not valid, but it is enough to indicate the long-distance sensitivity and the order of magnitude of the contribution. The short-distance dominance of the $s\to d\nu\bar\nu$ transition next implies that the process is effectively semileptonic, because a single, local operator $(\bar s d)_{V-A}(\bar\nu\nu)_{V-A}$ describes the interaction at low-energy scales. Hence the amplitude has the form

$$A(K^+\to\pi^+\nu\bar\nu)\sim G_F\,\alpha\,(\lambda_c F_c+\lambda_t F_t)\,\langle\pi^+|(\bar s d)_V|K^+\rangle\,(\bar\nu\nu)_{V-A} \qquad (139)$$
The coefficient function $\lambda_c F_c+\lambda_t F_t$ is calculable in perturbation theory. The hadronic matrix element can be extracted from $K^+\to\pi^0e^+\nu$ decay via the isospin relation (13). The $K^+\to\pi^+\nu\bar\nu$ amplitude is then completely determined, and with good accuracy. The neutral mode proceeds through CP violation in the standard model. This is due to the definite CP properties of $K^0$, $\pi^0$ and the hadronic transition current $(\bar s d)_{V-A}$. Using

$$CP|\pi^0\rangle = -|\pi^0\rangle \qquad CP|K^0\rangle = -|\bar K^0\rangle \qquad CP\,(\bar s d)_V\,(CP)^{-1} = -(\bar d s)_V \qquad (140)$$

we have

$$\langle\pi^0|(\bar s d)_V|K^0\rangle = -\langle\pi^0|(\bar d s)_V|\bar K^0\rangle \qquad (141)$$
(The axial vector currents $(\bar s d)_A$, $(\bar d s)_A$ do not contribute to the $K\to\pi$ transition because of parity.) With $K_L=(K^0+\bar K^0)/\sqrt2$ (the $\varepsilon$-contribution is negligible) we then obtain for the matrix element of the hadronic transition current

$$\langle\pi^0|\lambda_i(\bar s d)_V+\lambda_i^*(\bar d s)_V|K_L\rangle\sim\mathrm{Im}\lambda_i \qquad (142)$$

where $\lambda_i$ is the appropriate CKM factor. This demonstrates the CP violating character of the leading standard model amplitude for $K_L\to\pi^0\nu\bar\nu$. For simplicity we have given the argument here assuming standard phase conventions. A manifestly phase convention independent derivation of the same result is discussed in 44. The amplitude then has the form

$$A(K_L\to\pi^0\nu\bar\nu)\sim\mathrm{Im}\lambda_t\,F_t+\mathrm{Im}\lambda_c\,F_c \qquad (143)$$

where

$$\mathrm{Im}\lambda_t\,F_t\sim 10^{-4}\cdot 1 \;\gg\; \mathrm{Im}\lambda_c\,F_c\sim 10^{-4}\cdot 10^{-3} \qquad (144)$$
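For readers who want the intermediate step between (141) and (142), it can be sketched as follows. The shorthand $f\equiv\langle\pi^0|(\bar s d)_V|K^0\rangle$ is introduced here purely for illustration, and the terms $\langle\pi^0|(\bar s d)_V|\bar K^0\rangle$ and $\langle\pi^0|(\bar d s)_V|K^0\rangle$ are taken to vanish because they do not match in strangeness.

```latex
\begin{align*}
\langle\pi^0|\lambda_i(\bar s d)_V+\lambda_i^*(\bar d s)_V|K_L\rangle
 &= \frac{1}{\sqrt{2}}\Big[\lambda_i\,\langle\pi^0|(\bar s d)_V|K^0\rangle
      +\lambda_i^*\,\langle\pi^0|(\bar d s)_V|\bar K^0\rangle\Big] \\
 &= \frac{1}{\sqrt{2}}\,(\lambda_i-\lambda_i^*)\,f
  \;=\; \sqrt{2}\,i\,\mathrm{Im}\lambda_i\; f
\end{align*}
```

The second line uses (141) to replace $\langle\pi^0|(\bar d s)_V|\bar K^0\rangle$ by $-f$, so that only the imaginary part of the CKM factor survives.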
The violation of CP symmetry in $K_L\to\pi^0\nu\bar\nu$ arises through interference between $K^0$-$\bar K^0$ mixing and the decay amplitude. This mechanism is an example of mixing-induced CP violation. In the standard model, the mixing-induced CP violation in $K_L\to\pi^0\nu\bar\nu$ is larger by orders of magnitude than the one in $K_L\to\pi^+\pi^-$, for instance. This is because $A(K_L\to\pi^0\nu\bar\nu)$ [...] in contrast to the per-mille-size ratios in (107). Any difference in the magnitude of mixing induced CP violation between two $K_L$ decay modes is a signal of direct CP violation. For this reason, the standard model decay $K_L\to\pi^0\nu\bar\nu$ is a signal of almost pure direct CP violation, revealing an effect that cannot be explained by CP violation in the $K$-$\bar K$ mass matrix alone. The $K\to\pi\nu\bar\nu$ modes have been studied in great detail over the years to quantify the degree of theoretical precision. Important effects come from short-distance QCD corrections. These were computed at leading order in 45. The complete next-to-leading order calculations 46,47,48 reduce the theoretical uncertainty in these decays to $\sim 5\%$ for $K^+\to\pi^+\nu\bar\nu$ and $\sim 1\%$ for $K_L\to\pi^0\nu\bar\nu$. This picture is essentially unchanged when further small effects are considered, including isospin breaking in the relation of $K\to\pi\nu\bar\nu$ to $K^+\to\pi^0 l^+\nu$ 49, long-distance contributions 50,51, the CP-conserving effect in $K_L\to\pi^0\nu\bar\nu$ in the standard model 50,52, two-loop electroweak corrections for large $m_t$ 53 and subleading-power corrections in the OPE in the charm sector 54.

Table 2: Compilation of important properties and results for $K\to\pi\nu\bar\nu$.

                   $K^+\to\pi^+\nu\bar\nu$                        $K_L\to\pi^0\nu\bar\nu$
                   CP conserving                                   CP violating
CKM                $V_{td}$                                        $\mathrm{Im}V^*_{ts}V_{td}\sim J_{CP}\sim\eta$
contributions      top and charm                                   only top
scale dep. (BR)    ±20% (LO) → ±5% (NLO)                           ±10% (LO) → ±1% (NLO)
BR (SM)            $(0.8\pm0.3)\cdot 10^{-10}$                     $(2.8\pm1.1)\cdot 10^{-11}$
exp.               $(1.5^{+3.4}_{-1.2})\cdot 10^{-10}$ BNL 787 55   $<5.9\cdot 10^{-7}$ KTeV 56

While already $K^+\to\pi^+\nu\bar\nu$ can be reliably calculated, the situation is even better for $K_L\to\pi^0\nu\bar\nu$. Since only the imaginary part of the amplitude contributes, the charm sector, in $K^+\to\pi^+\nu\bar\nu$ the dominant source of uncertainty, is completely negligible for $K_L\to\pi^0\nu\bar\nu$ (0.1% effect on the branching ratio). Long distance contributions ($\lesssim 0.1\%$) and also the indirect CP violation effect ($\lesssim 1\%$) are likewise negligible. The total theoretical uncertainties, from perturbation theory in the top sector and in the isospin breaking corrections, are safely below 3% for $B(K_L\to\pi^0\nu\bar\nu)$. This makes this decay mode truly unique and very promising for phenomenological applications. In Table 2 we have summarized some of the main features of $K^+\to\pi^+\nu\bar\nu$ and $K_L\to\pi^0\nu\bar\nu$. Note that the ranges given as the standard model predictions in Table 2 arise from our, at present, limited knowledge of standard model parameters (CKM), and not from intrinsic uncertainties in calculating the branching ratios. With a measurement of $B(K^+\to\pi^+\nu\bar\nu)$ and $B(K_L\to\pi^0\nu\bar\nu)$ available very interesting phenomenological studies could be performed. For instance, $B(K^+\to\pi^+\nu\bar\nu)$ and $B(K_L\to\pi^0\nu\bar\nu)$ together determine the unitarity triangle (Wolfenstein parameters $\varrho$ and $\eta$) completely (Fig. 22). The expected accuracy with ±10% branching ratio measurements is comparable to the one that can be achieved by CP violation studies at B factories before the LHC era 44. The quantity $B(K_L\to\pi^0\nu\bar\nu)$ by itself offers probably the best precision in determining $\mathrm{Im}V^*_{ts}V_{td}$ or, equivalently, the Jarlskog parameter

$$J_{CP}=\mathrm{Im}(V^*_{ts}V_{td}V_{us}V^*_{ud})=\lambda\left(1-\frac{\lambda^2}{2}\right)\mathrm{Im}\lambda_t \qquad (146)$$
Figure 22: Unitarity triangle from $K^+\to\pi^+\nu\bar\nu$ and $K_L\to\pi^0\nu\bar\nu$.

The prospects here are even better than for B physics at the LHC. As an example, let us assume the following results will be available from B physics experiments

$$\sin 2\alpha = 0.40\pm0.04 \qquad \sin 2\beta = 0.70\pm0.02 \qquad V_{cb} = 0.040\pm0.002 \qquad (147)$$
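The way branching-ratio and top-mass errors feed into $\mathrm{Im}\lambda_t$ in the kaon-based estimates of this passage can be sketched with simple error propagation. The sketch assumes that $B(K_L\to\pi^0\nu\bar\nu)\propto(\mathrm{Im}\lambda_t)^2X(x_t)^2$, that the top-quark loop function scales approximately as $X\propto m_t^{1.15}$, and that the two errors are uncorrelated; the exponent, the function name and the scenario inputs are assumptions of this illustration, not a statement of how the numbers in the text were obtained.

```python
import math

IM_LAMBDA_T = 1.37e-4   # central value used in the text


def rel_err_im_lambda_t(rel_err_br, rel_err_mt, mt_power=1.15):
    """Relative error on Im(lambda_t) from B ~ (Im lambda_t)^2 * X(m_t)^2.

    Assumes X scales like m_t**mt_power (approximate, hypothetical default)
    and that the two input errors are uncorrelated.
    """
    return math.hypot(0.5 * rel_err_br, mt_power * rel_err_mt)


scenarios = {
    "10% BR, m_t = 170 +- 3 GeV": (0.10, 3.0 / 170.0),
    " 5% BR, m_t = 170 +- 1 GeV": (0.05, 1.0 / 170.0),
}

if __name__ == "__main__":
    for label, (d_br, d_mt) in scenarios.items():
        err = rel_err_im_lambda_t(d_br, d_mt) * IM_LAMBDA_T
        print(f"{label}: delta Im(lambda_t) ~ {err / 1e-4:.2f}e-4")
```

Because the branching ratio depends quadratically on $\mathrm{Im}\lambda_t$, a given relative error on the measurement translates into only half that relative error on the CKM quantity.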
The small errors quoted for $\sin 2\alpha$ and $\sin 2\beta$ from CP violation in B decays require precision measurements at the LHC. In the case of $\sin 2\alpha$ we have to assume in addition that the theoretical problem of 'penguin-contamination' can be resolved. These results would then imply $\mathrm{Im}\lambda_t=(1.37\pm0.14)\cdot 10^{-4}$. On the other hand, a ±10% measurement $B(K_L\to\pi^0\nu\bar\nu)=(3.0\pm0.3)\cdot 10^{-11}$ together with $m_t(m_t)=(170\pm3)$ GeV would give $\mathrm{Im}\lambda_t=(1.37\pm0.07)\cdot 10^{-4}$. If we are optimistic and take $B(K_L\to\pi^0\nu\bar\nu)=(3.0\pm0.15)\cdot 10^{-11}$, $m_t(m_t)=(170\pm1)$ GeV, we get $\mathrm{Im}\lambda_t=(1.37\pm0.04)\cdot 10^{-4}$, a remarkable accuracy. The prospects for precision tests of the standard model flavour sector will be correspondingly good. The charged mode $K^+\to\pi^+\nu\bar\nu$ is still being studied by Brookhaven experiment E787, which will be followed by a successor experiment, E949 57. Recently, a new experiment, CKM 58, has been proposed to measure $K^+\to\pi^+\nu\bar\nu$ at the Fermilab Main Injector, studying K decays in flight. Plans to investigate this process also exist at KEK for the Japan Hadron Facility (JHF) 59. The neutral mode, $K_L\to\pi^0\nu\bar\nu$, is currently pursued by KTeV. For $K_L\to\pi^0\nu\bar\nu$ a model independent upper bound can be inferred from the experimental result on $K^+\to\pi^+\nu\bar\nu$, which at present is stronger than the direct experimental limit 60. It is given by $B(K_L\to\pi^0\nu\bar\nu)<4.4\,B(K^+\to\pi^+\nu\bar\nu)<2\cdot 10^{-9}$. At least this sensitivity will have to be achieved before new physics is constrained with $B(K_L\to\pi^0\nu\bar\nu)$. Concerning the future of $K_L\to\pi^0\nu\bar\nu$ experiments, a proposal exists at Brookhaven (BNL E926) to measure this decay at the AGS with a sensitivity of $O(10^{-12})$ 61. There are furthermore plans to pursue this mode with comparable sensitivity at Fermilab 62 and KEK 63. Prospects for $K_L\to\pi^0\nu\bar\nu$ at a $\phi$-factory were discussed in 64.

5.2 $K_L\to\pi^0e^+e^-$
The electric charge of the leptons and the resulting interaction with photons make the decay $K_L\to\pi^0e^+e^-$ more complicated than $K_L\to\pi^0\nu\bar\nu$. The short-distance dominated part of the $K_L\to\pi^0e^+e^-$ amplitude can be analyzed using OPE and the renormalization group in analogy to the case of the $\Delta S=1$ effective Hamiltonian. This approach is reviewed in 12. Here we would like to give a qualitative discussion, which highlights the characteristic points and also summarizes the main differences between $K_L\to\pi^0e^+e^-$ and $K_L\to\pi^0\nu\bar\nu$. The basic diagrams for $K_L\to\pi^0e^+e^-$ are similar to those for $K_L\to\pi^0\nu\bar\nu$, except that also a photon can be exchanged instead of the Z boson in the penguin diagrams (see Fig. 20). Taking the GIM mechanism into account, the structure of the effective Hamiltonian reads

$$\mathcal{H}_{eff}\sim\lambda_t\,(F_t-F_c)+\lambda_u\,(F_u-F_c) \qquad (148)$$
Recalling that we can write $K_L\approx K_2+\varepsilon K_1$, where $K_2$ ($K_1$) is the CP odd (CP even) neutral kaon state, the decay amplitude takes the form

$$A(K_L\to\pi^0 f\bar f)\sim\mathrm{Im}\lambda_t\,(F_t-F_c)+\varepsilon\,[\mathrm{Re}\lambda_t\,(F_t-F_c)+\mathrm{Re}\lambda_u\,(F_u-F_c)]$$

with the order-of-magnitude estimates

$$10^{-4}\cdot 1 \;+\; 10^{-3}\left[\,10^{-4}\cdot 1 \;+\; 10^{-1}\cdot\begin{cases}10^{-3} & f=\nu\\ 1 & f=e\end{cases}\right]$$

Here we have kept the expression general to allow for the cases $f=e$ and $\nu$. We have assumed the structure of the short-distance Hamiltonian for this exercise, although this is not strictly correct for the parts that are sensitive to long-distance physics. However it is sufficient for the qualitative argument we would like to make. We recall a few points from the discussion in the section on $K\to\pi\nu\bar\nu$.
First, as we have seen, the $K_2$ component yields an amplitude proportional to the imaginary parts of the CKM elements. Correspondingly, the $K_1$ amplitude (opposite CP), which is multiplied by $\varepsilon$, gives the real parts. Second, we need to consider that the GIM mechanism is hard for $f=\nu$ and soft for $f=e$. This implies $F_t\sim 1$, $F_c\sim 10^{-3}$, $F_u\sim 10^{-5}$ for $f=\nu$, and $F_{u,c,t}\sim 1$ for $f=e$. Third, we determine the hierarchy of the CKM elements. Putting this information together, we find the order-of-magnitude estimates shown above for the various terms. For $f=\nu$ we recover what we already know from the previous section: The amplitude is purely from direct CP violation (the $\varepsilon$-component is negligible) and it is dominated by short-distance physics (only $F_t$, $F_c$ contribute). For $f=e$ we can read off: Both direct and indirect CP violation contribute at the same order ($\sim 10^{-4}$) and the latter component is determined by long-distance dynamics ($F_u$). In addition to what we have discussed so far, a long distance dominated, CP conserving amplitude with two-photon intermediate state can contribute. Although it is of higher order in the electromagnetic coupling, it can potentially compete with the other contributions because those are suppressed by CP violation.

Figure 23: CP conserving two-photon contribution to $K_L\to\pi^0e^+e^-$.

Treating $K_L\to\pi^0e^+e^-$ theoretically one is thus faced with the need to disentangle three different contributions of roughly the same order of magnitude.
• Direct CP violation: This amplitude is short-distance in character, theoretically clean and has been analyzed at next-to-leading order in QCD 65. Taken by itself this mechanism leads to a $K_L\to\pi^0e^+e^-$ branching ratio of $(4.5\pm2.6)\cdot 10^{-12}$ within the standard model.
• Indirect CP violation: This part is given by $\sim\varepsilon\cdot A(K_S\to\pi^0e^+e^-)$. The $K_S$ amplitude is dominated by long distance physics and has been investigated in chiral perturbation theory 66,67,68,69. Due to unknown counterterms that enter this analysis a reliable prediction is not possible at present. The situation would improve with a measurement of $B(K_S\to\pi^0e^+e^-)$, which could become possible at the CERN experiment NA48 or with KLOE at DAΦNE, the Frascati $\phi$-factory. Present day estimates for $B(K_L\to\pi^0e^+e^-)$ due to indirect CP violation alone allow values of $10^{-12}$ - $10^{-10}$.
• The CP conserving two-photon contribution is again long-distance dominated. It has been analyzed by various authors 68,70,71. The estimates are typically a few $10^{-12}$. Improvements in this sector might be possible by further studying the related decay $K_L\to\pi^0\gamma\gamma$, whose branching ratio has already been measured to be $(1.7\pm0.1)\cdot 10^{-6}$.

Originally it had been hoped that the direct CP violating contribution would be dominant. It is possible that the CP conserving part is not too important. However, this is unlikely to be true for the amplitude from indirect CP violation. Experimental input on $K_S\to\pi^0e^+e^-$ will be indispensable to solve this problem. We also mention that the CP violating contributions interfere with a relative phase, which is known up to a sign ambiguity. The CP conserving part simply adds incoherently to the decay rate. Besides the theoretical issues, $K_L\to\pi^0e^+e^-$ is also challenging from an experimental point of view. The expected branching ratio is even smaller than for $K_L\to\pi^0\nu\bar\nu$. Furthermore a serious irreducible physics background from the radiative mode $K_L\to e^+e^-\gamma\gamma$ has been identified, which poses additional difficulties 1. A background subtraction seems necessary, which should be possible with enough events. Additional information could in principle also be gained by studying the electron energy asymmetry 68,71 or the time evolution 68,72,73.

5.3 $K_L\to\mu^+\mu^-$

Figure 24: The dominant contribution to $K_L\to\mu^+\mu^-$.
$K_L\to\mu^+\mu^-$ receives a short distance contribution from Z-penguin and W-box graphs similar to $K\to\pi\nu\bar\nu$. This part of the amplitude is sensitive to the Wolfenstein parameter $\varrho$. In addition $K_L\to\mu^+\mu^-$ proceeds through a long distance contribution with the two-photon intermediate state, which actually dominates the decay completely (Fig. 24). The long distance amplitude consists of a dispersive ($A_{dis}$) and an absorptive contribution ($A_{abs}$). The branching fraction can thus be written

$$B(K_L\to\mu^+\mu^-)=|A_{SD}+A_{dis}|^2+|A_{abs}|^2 \qquad (149)$$

Using $B(K_L\to\gamma\gamma)$ it is possible to extract 1 $|A_{abs}|^2=(7.1\pm0.2)\cdot 10^{-9}$. $A_{dis}$ on the other hand cannot be calculated accurately at present 74,75,76,77. This is rather unfortunate, in particular since $B(K_L\to\mu^+\mu^-)$ has already been measured with very good precision

$$B(K_L\to\mu^+\mu^-)=(7.18\pm0.17)\cdot 10^{-9} \quad \text{BNL 871 78} \qquad (150)$$
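The room left for the short-distance and dispersive pieces follows from subtracting the absorptive part from the measured rate in (149). The short sketch below does this arithmetic; treating the two uncertainties as uncorrelated and adding them in quadrature is an assumption made here for illustration, as are the variable names.

```python
import math

# Values quoted in the text, in units of 1e-9: (central value, uncertainty)
B_MEASURED = (7.18, 0.17)   # B(K_L -> mu+ mu-), eq. (150)
A_ABS_SQ = (7.1, 0.2)       # |A_abs|^2 extracted from B(K_L -> gamma gamma)

# |A_SD + A_dis|^2 = B - |A_abs|^2, from eq. (149)
remainder = B_MEASURED[0] - A_ABS_SQ[0]
# Assumption: the two uncertainties are uncorrelated
remainder_err = math.hypot(B_MEASURED[1], A_ABS_SQ[1])

if __name__ == "__main__":
    print(f"|A_SD + A_dis|^2 ~ ({remainder:.2f} +- {remainder_err:.2f}) e-9")
```

The remainder comes out compatible with zero within its uncertainty, which is why little can be said about $A_{SD}$ without independent knowledge of $A_{dis}$.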
Interestingly, the absorptive contribution essentially saturates the total rate. For comparison we note that 30 $B(K_L\to\mu^+\mu^-)_{SD}=|A_{SD}|^2=(0.9\pm0.4)\cdot 10^{-9}$ is the expected branching ratio in the standard model based on the short-distance contribution alone. Because $A_{dis}$ is largely unknown, $K_L\to\mu^+\mu^-$ is at present not a very useful constraint on CKM parameters.

6 Flavour Physics with Charm

6.1 Rare D Decays
Weak decays of charmed particles, D mesons in particular, can also be used to probe the physics of flavour. Due to the characteristic pattern of standard model weak interactions, rare processes with D mesons are markedly different from their kaon counterparts. First of all, the charm quark mass $m_c\approx 1.4$ GeV is considerably bigger than the strange quark mass, and also the QCD scale $\Lambda_{QCD}$. Sometimes methods from the theory of heavy quarks can be employed for charmed particles, although they are less reliable than in the case of the much heavier B mesons (see the lectures by Falk in this volume for a general discussion in the context of B physics). This situation makes a theoretical treatment of D decays more difficult than for the heavy B mesons on one hand, and for the light kaons on the other. More important, however, is the fact that charm is a quark with weak isospin $T_3=+1/2$, in contrast to s and b. For this reason the virtual particles appearing in FCNC loop diagrams are the down-type quarks d, s, b, rather than u, c, t familiar from K and B physics. Examples of rare D decays are

$$D\to\rho\gamma,\quad D\to\pi l^+l^-,\quad D\to\mu^+\mu^-,\quad D\to\gamma\gamma,\quad D\to\pi\nu\bar\nu \qquad (151)$$

They proceed through $c\to u$ FCNC transitions and their $\Delta C=1$ amplitudes have the generic form

$$V^*_{cd}V_{ud}F_d+V^*_{cs}V_{us}F_s+V^*_{cb}V_{ub}F_b=V^*_{cs}V_{us}(F_s-F_d)+V^*_{cb}V_{ub}(F_b-F_d) \qquad (152)$$
where $F_i$ is the amplitude with internal quark $i=d,s,b$. A potentially large contribution could come from $F_b$, which can have a quadratic mass dependence $\sim m_b^2/M_W^2$. This exceeds the contribution from light flavours in the loop, but it is still much smaller than virtual top effects. In addition, the b-quark contribution is very strongly CKM suppressed since $V^*_{cb}V_{ub}\sim\lambda^5$. On the other hand, $V^*_{cs}V_{us}\sim\lambda$ is much larger, but this term is multiplied by $F_s-F_d$. The latter is non-zero only through the effects of SU(3) breaking and therefore receives a strong GIM suppression. Moreover, the light-quark loops $F_s$, $F_d$ are sensitive to nonperturbative QCD dynamics. Additional long-distance mechanisms, different from $F_s$, $F_d$, can also become important. An example is 8 $D^0\to\mu^+\mu^-$. Here the amplitude in (152) yields a tiny branching fraction of $\sim 10^{-19}$. Alternatively the decay can proceed, for instance, via the long-distance mechanism $D^0\to K^0\to\mu^+\mu^-$, with an off-shell kaon. Estimates of this and similar sources give together 8 $B(D^0\to\mu^+\mu^-)\sim 10^{-15}$. This still leaves a window for the discovery of potential new physics effects below the current experimental limit of 20 $B(D^0\to\mu^+\mu^-)_{exp}<4\cdot 10^{-6}$.

6.2 $D^0$-$\bar D^0$ Mixing
A further interesting probe of the flavour sector is $D^0$-$\bar D^0$ mixing, a process with $\Delta C=2$. Recent measurements from the CLEO and FOCUS collaborations have stimulated the interest in this observable. A detailed discussion of these results and a list of references can be found in 79. $D^0$-$\bar D^0$ mixing can be searched for in the time-dependent analysis of hadronic two-body modes. We may distinguish three types of decays,

Cabibbo favoured (CF): $D^0\to K^-\pi^+$ (153)
singly Cabibbo suppressed (SCS): $D^0\to K^+K^-$ (154)
doubly Cabibbo suppressed (DCS): $D^0\to K^+\pi^-$ (155)
In powers of the Wolfenstein parameter $\lambda$, the CF amplitude $c\to su\bar d$ is of order $\lambda^0$, the SCS amplitude $c\to su\bar s$ of order $\lambda$, and the DCS amplitude $c\to du\bar s$ of order $\lambda^2$, which establishes a clear hierarchy in the decay rates. The classification applies of course also to the charge-conjugated modes $\bar D^0\to K^+\pi^-$ (CF), $\bar D^0\to K^+K^-$ (SCS) and $\bar D^0\to K^-\pi^+$ (DCS). The framework for describing $D^0$-$\bar D^0$ mixing is analogous to the case of $K^0$-$\bar K^0$ mixing discussed in sec. 4.1. The mass eigenstates are

$$D_{1,2}=p\,D\pm q\,\bar D \qquad CP\,D=-\bar D \qquad (156)$$

We introduce the convenient definitions ($\Gamma$ is the average total decay rate)

$$x=\frac{M_2-M_1}{\Gamma} \qquad y=\frac{\Gamma_2-\Gamma_1}{2\Gamma} \qquad \xi=\frac{p\,\langle K^+\pi^-|D\rangle}{q\,\langle K^+\pi^-|\bar D\rangle} \qquad (157)$$
$x$, $y$, $\xi$ are small quantities of order $\lambda^2$. The precise calculation of $x$ and $y$ is difficult for the reasons mentioned above (GIM suppression, long-distance sensitivity). Analyses based on hadronic estimates (K, $\pi$ intermediate states) or the heavy-quark expansion (quark-level calculation) lead to a standard model expectation of

$$x,\,y<10^{-4} \qquad (158)$$

We then consider the time dependent decay $\Gamma(D^0(t)\to K^+\pi^-)$. The state $D^0(t)$ is obtained by solving the Schrödinger equation for the $D^0$-$\bar D^0$ two-state system with the initial condition $D^0(t=0)=D^0$. The solution reads

$$\Gamma(D^0(t)\to K^+\pi^-)=e^{-\Gamma t}\,\left|\frac{q}{p}\right|^2\Gamma(\bar D^0\to K^+\pi^-)\left[|\xi|^2+(y\,\mathrm{Re}\,\xi+x\,\mathrm{Im}\,\xi)\,\Gamma t+\frac14(x^2+y^2)(\Gamma t)^2\right] \qquad (159)$$

To obtain this result we have used the following approximations. We expand in the small quantities $\xi$, $x$, $y$, assumed to be of the same order (of order $x$, say), and drop terms of $O(x^3)$ and higher. We also require $\Gamma t$ to be at most of $O(1)$. This is the range that is relevant experimentally, since the D mesons will have decayed once $\Gamma t$ becomes too large. The form of (159) is not hard to interpret. If there is no mixing, $x=y=0$, only the first term inside the square brackets survives. It simply describes the exponentially decaying rate of the DCS process $D^0\to K^+\pi^-$. Even if $x,y\neq 0$, at time $t=0$ the DCS decay is the only effect. On the other hand, if the small DCS decay was absent, $\xi=0$, only the third term contributes. Then the $K^+\pi^-$ final state can arise only through mixing $D^0\to\bar D^0\to K^+\pi^-$. The rate vanishes at $t=0$ because $D^0(t=0)=D^0$ and there was no time yet for mixing. The $(x\Gamma t)^2$ behaviour represents the onset of an oscillating trigonometric function of which it is the remnant in our approximation. Finally, the linear term $\sim\Gamma t$ is from the interference of DCS decay and mixing. Without the DCS mode, the observation of $K^+\pi^-$ from an initial $D^0$ would be an unmistakable sign of mixing. The real situation is more complicated. To demonstrate the presence of the mixing term, the three contributions in (159) have to be disentangled, which is possible in principle due to their different time dependence. So far only the DCS component ($|\xi|^2$ term) has been unambiguously identified. At present the signal for mixing is still compatible with zero. It will be interesting to follow the future experimental results on this issue. The detection of a substantial value of $x$ would indeed be exciting evidence for new physics.
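The small-parameter expansion in (159) can be checked numerically against the exact two-state time evolution. The sketch below is illustrative only: the function names and the sample values of $x$, $y$ and $\xi$ are assumptions chosen for the check, the mass eigenstate conventions follow (156) and (157), and rates are expressed in units of the overall constant prefactor of (159), with $\tau=\Gamma t$.

```python
import cmath
import math


def exact_rate(x, y, xi, tau):
    """Exact rate e^{-tau} |xi*g_plus + g_minus|^2 for the two-state system.

    With mu_2 - mu_1 = Gamma*(x - i*y), the flavour-preserving and
    flavour-changing evolution factors are cos(h) and i*sin(h),
    where h = (x - i*y)*tau/2 (common phase and decay factor pulled out).
    """
    h = 0.5 * (x - 1j * y) * tau
    g_plus = cmath.cos(h)
    g_minus = 1j * cmath.sin(h)
    return math.exp(-tau) * abs(xi * g_plus + g_minus) ** 2


def expanded_rate(x, y, xi, tau):
    """Second-order expansion: the square bracket of (159) times e^{-tau}."""
    bracket = (abs(xi) ** 2
               + (y * xi.real + x * xi.imag) * tau
               + 0.25 * (x * x + y * y) * tau ** 2)
    return math.exp(-tau) * bracket


if __name__ == "__main__":
    # Hypothetical sample values, all of the same small order
    x, y, xi = 0.010, 0.008, complex(0.012, 0.006)
    for tau in (0.0, 0.5, 1.0, 2.0):
        print(f"tau={tau:.1f}  exact={exact_rate(x, y, xi, tau):.6e}"
              f"  expanded={expanded_rate(x, y, xi, tau):.6e}")
```

For parameters of order $10^{-2}$ and $\Gamma t$ of order one the two expressions agree to a relative accuracy far better than a percent, consistent with the neglected terms being of higher order in the small quantities.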
7 Summary and Outlook
Kaon decays have played a key role in the development of the standard model. Today kaon physics is a mature field. A whole array of modern field theoretical techniques is at our disposal to help us extract the underlying mechanisms. Among these tools are perturbative quantum field theory, including the perturbative treatment of QCD at short distances, the operator product expansion, the renormalization group and chiral perturbation theory. While an impressive number of crucial insights has been already obtained in the past, excellent opportunities continue to exist for present and future studies:
• Chiral perturbation theory constitutes a complementary handle on the elusive nonperturbative dynamics of QCD at long distances. This can be helpful to control long-distance contributions that contaminate the short-distance physics that is of primary interest. However, chiral perturbation theory, as a model-independent framework for low-energy QCD, is also of considerable interest in its own right. Typical processes that are studied in this context are $K^+\to\pi^+l^+l^-$, $K_L\to\pi^0\gamma\gamma$, or $K_S\to\gamma\gamma$.
• CP violation in $K\to\pi\pi$ is still an important area of current interest. Indirect CP violation measured by $\varepsilon$ is well determined experimentally and provides us with a valuable CKM constraint. Direct CP violation, now established, but still under further experimental investigation, gives an important qualitative test of the standard model.
• Standard model precision tests will be possible with the "golden" decay modes $K^+\to\pi^+\nu\bar\nu$ and $K_L\to\pi^0\nu\bar\nu$.
• Other opportunities of interest include decays such as $K_L\to\pi^0e^+e^-$, $\mu$-polarization in $K^+\to\pi^0\mu^+\nu$, among many other rare processes.
• Very direct and clean probes for new physics are decays that are forbidden in the standard model. Lepton flavour violating modes such as $K_L\to e\mu$, $K\to\pi\mu e$ are important examples.
In parallel to kaon physics many other observables, as provided from decays of hadrons with beauty and charm, will be necessary to get a reliable and complete picture of the physics of flavour and its possible origins. In these lectures, we have discussed selected examples from the phenomenology of mesons with strangeness and charm. We have also emphasized the theoretical framework for an analysis of these processes, which is crucial to interpret the experimental data and to extract the underlying physics. The coming years promise to be very fruitful for the study of flavour physics and important discoveries are possible in the near future.
Acknowledgments

I thank the organizers of TASI 2000 for inviting me to this very interesting and pleasant Summer School and for their hospitality at Boulder. Thanks are also due to the students for their active participation. I am grateful to Gino Isidori for comments on the manuscript.

References

1. L. Littenberg and G. Valencia, Ann. Rev. Nucl. Part. Sci. 43, 729 (1993).
2. J. L. Ritchie and S. G. Wojcicki, Rev. Mod. Phys. 65, 1149 (1993).
3. B. Winstein and L. Wolfenstein, Rev. Mod. Phys. 65, 1113 (1993).
4. P. Buchholz and B. Renk, Prog. Part. Nucl. Phys. 39, 253 (1997).
5. A. R. Barker and S. H. Kettell, hep-ex/0009024.
6. G. D'Ambrosio and G. Isidori, Int. J. Mod. Phys. A 13, 1 (1998).
7. A. Pich, hep-ph/9610243.
8. S. Pakvasa, hep-ph/9705397.
9. G. Burdman, hep-ph/9508349.
10. E. Golowich, Nucl. Phys. Proc. Suppl. 59, 305 (1997).
11. E. Golowich, hep-ph/9706548.
12. G. Buchalla, A. J. Buras and M. E. Lautenbacher, Rev. Mod. Phys. 68, 1125 (1996).
13. A. J. Buras, in Probing the Standard Model of Particle Interactions, eds. F. David and R. Gupta (Elsevier Science B.V., 1998), hep-ph/9806471.
14. R. N. Cahn and G. Goldhaber, The Experimental Foundations Of Particle Physics (Cambridge University Press, UK, 1989).
15. A. Belyaev et al., hep-ph/0008276.
16. H. Georgi, Weak Interactions And Modern Particle Theory (Addison-Wesley, Menlo Park, USA, 1984).
17. A. Pich, hep-ph/9806303.
18. G. Colangelo and G. Isidori, hep-ph/0101264.
19. G. D'Ambrosio and D. Espriu, Phys. Lett. B 175, 237 (1986); J. L. Goity, Z. Phys. C 34, 341 (1987).
20. D. E. Groom et al., Eur. Phys. J. C 15, 1 (2000).
21. A. Lai et al. [NA48 Collaboration], Phys. Lett. B 493, 29 (2000).
22. H. Burkhardt et al. [NA31 Collaboration], Phys. Lett. B 206, 169 (1988); G. D. Barr et al. [NA31 Collaboration], Phys. Lett. B 317, 233 (1993).
23. L. K. Gibbons et al. [E731 Collaboration], Phys. Rev. Lett. 70, 1203 (1993).
24. A. Alavi-Harati et al. [KTeV Collaboration], Phys. Rev. Lett. 83, 22 (1999).
25. V. Fanti et al. [NA48 Collaboration], Phys. Lett. B 465, 335 (1999); A. Ceccucci, CERN seminar (29.2.2000), http://na48.web.cern.ch/NA48.
26. S. Herrlich and U. Nierste, Nucl. Phys. B 419, 292 (1994).
27. A. J. Buras, M. Jamin and P. H. Weisz, Nucl. Phys. B 347, 491 (1990).
28. S. Herrlich and U. Nierste, Phys. Rev. D 52, 6505 (1995).
29. S. Bosch et al., Nucl. Phys. B 565, 3 (2000).
30. A. J. Buras, hep-ph/9905437.
31. A. J. Buras et al., Nucl. Phys. B 400, 37 (1993); A. J. Buras, M. Jamin and M. E. Lautenbacher, Nucl. Phys. B 400, 75 (1993).
32. M. Ciuchini et al., Nucl. Phys. B 415, 403 (1994).
33. A. J. Buras et al., Nucl. Phys. B 592, 55 (2000).
34. M. Ciuchini and G. Martinelli, hep-ph/0006056.
35. M. Ciuchini, E. Franco, L. Giusti, V. Lubicz and G. Martinelli, hep-ph/9910237.
36. S. Bertolini, M. Fabbrichesi and J. O. Eeg, Rev. Mod. Phys. 72, 65 (2000); hep-ph/0002234.
37. T. Hambye et al., Nucl. Phys. B 564, 391 (2000); hep-ph/0001088.
38. S. Narison, Nucl. Phys. B 593, 3 (2001).
39. J. Bijnens and J. Prades, JHEP 0006, 035 (2000); hep-ph/0010008.
40. A. A. Belkov et al., hep-ph/9907335; hep-ph/0010142.
41. E. Pallante and A. Pich, Phys. Rev. Lett. 84, 2568 (2000); Nucl. Phys. B 592, 294 (2000); E. Pallante, A. Pich and I. Scimemi, hep-ph/0010073.
42. Y. Wu, hep-ph/0012371.
43. A. J. Buras, hep-ph/0101336.
44. G. Buchalla and A. J. Buras, Phys. Rev. D 54, 6782 (1996).
45. V. A. Novikov et al., Phys. Rev. D 16, 223 (1977); J. Ellis and J. S. Hagelin, Nucl. Phys. B 217, 189 (1983); C. Dib, I. Dunietz and F. J. Gilman, Mod. Phys. Lett. A 6, 3573 (1991).
46. G. Buchalla and A. J. Buras, Nucl. Phys. B 398, 285 (1993); Nucl. Phys. B 400, 225 (1993); Nucl. Phys. B 412, 106 (1994).
47. M. Misiak and J. Urban, Phys. Lett. B 451, 161 (1999).
48. G. Buchalla and A. J. Buras, Nucl. Phys. B 548, 309 (1999).
49. W. J. Marciano and Z. Parsa, Phys. Rev. D 53, 1 (1996).
50. D. Rein and L. M. Sehgal, Phys. Rev. D 39, 3325 (1989).
51. J. S. Hagelin and L. S. Littenberg, Prog. Part. Nucl. Phys. 23, 1 (1989); M. Lu and M. B. Wise, Phys. Lett. B 324, 461 (1994).
52. G. Buchalla and G. Isidori, Phys. Lett. B 440, 170 (1998).
53. G. Buchalla and A. J. Buras, Phys. Rev. D 57, 216 (1998).
54. A. F. Falk, A. Lewandowski and A. A. Petrov, hep-ph/0012099.
55. S. Adler et al. [E787 Collaboration], Phys. Rev. Lett. 84, 3768 (2000).
56. A. Alavi-Harati et al. [The E799-II/KTeV Collaboration], Phys. Rev. D 61, 072006 (2000).
57. BNL E949 collaboration, http://www.phy.bnl.gov/e949/.
58. R. Coleman et al. (CKM), FERMILAB-P-0905 (1998).
59. T. Shinkawa, in: JHF98 Proceedings, KEK, Tsukuba.
60. Y. Grossman and Y. Nir, Phys. Lett. B 398, 163 (1997).
61. BNL E926 collaboration, http://sitka.triumf.ca/e926/.
62. E. Cheu et al. [KAMI Collaboration], hep-ex/9709026.
63. T. Inagaki, in: JHF98 Proceedings, KEK, Tsukuba.
64. F. Bossi, G. Colangelo and G. Isidori, Eur. Phys. J. C 6, 109 (1999).
65. A. J. Buras et al., Nucl. Phys. B 423, 349 (1994).
66. G. Ecker, A. Pich and E. de Rafael, Nucl. Phys. B 303, 665 (1988).
67. C. Bruno and J. Prades, Z. Phys. C 57, 585 (1993).
68. J. F. Donoghue and F. Gabbiani, Phys. Rev. D 51, 2187 (1995).
69. G. D'Ambrosio, G. Ecker, G. Isidori and J. Portoles, JHEP 9808, 004 (1998).
70. A. G. Cohen, G. Ecker and A. Pich, Phys. Lett. B 304, 347 (1993).
71. P. Heiliger and L. M. Sehgal, Phys. Rev. D 47, 4920 (1993).
72. L. S. Littenberg, in Proceedings of the Workshop on CP Violation at a Kaon Factory, ed. J. N. Ng, TRIUMF, Vancouver, Canada (1989), p. 19.
73. G. O. Kohler and E. A. Paschos, Phys. Rev. D 52, 175 (1995).
74. D. Gomez Dumm and A. Pich, Phys. Rev. Lett. 80, 4633 (1998).
75. M. Knecht et al., Phys. Rev. Lett. 83, 5230 (1999).
76. G. Valencia, Nucl. Phys. B 517, 339 (1998).
77. G. D'Ambrosio, G. Isidori and J. Portoles, Phys. Lett. B 423, 385 (1998).
78. D. Ambrose et al. [E871 Collaboration], Phys. Rev. Lett. 84, 1389 (2000).
79. S. Bergmann et al., Phys. Lett. B 486, 418 (2000).
KAON PHYSICS: EXPERIMENTS

A. R. Barker
Department of Physics, UCB390, University of Colorado, Boulder CO 80309, USA
Experiments in the area of kaon physics are discussed. Two main categories of experiments are considered: those aiming to measure the CP-violation parameter $\epsilon'/\epsilon$ and those devoted to measuring various rare kaon decays. Characteristic aspects of each type of experiment are presented, and the details of individual experiments are described.
1 Overview
These two lectures will be devoted to a discussion of experiments in the kaon sector. Kaon experiments were of course the first to reveal the phenomenon of CP violation [1] and the absence of Flavor-Changing Neutral Currents, both important clues to the structure of the Standard Model. Although experiments in the kaon sector have to some extent been overshadowed by B-factory projects recently, kaon experiments are still providing important information about flavor dynamics, and will continue to do so.

Recent kaon-physics experiments can be divided into two categories, according to the physics topics they are investigating. One class consists of experiments that are attempting to measure the CP-violation parameter $\epsilon'/\epsilon$. The significance of this number in the Standard Model has been discussed in the lectures by G. Buchalla; I will review it only briefly here. Superweak models, in which CP violation in $K \to \pi\pi$ arises entirely from CP violation in mixing between the $K^0$ and the $\bar{K}^0$, predict that $\epsilon' = 0$. By contrast, the Standard Model allows for so-called direct CP violation in the decay of the CP-odd eigenstate $K_2$ to two pions; consequently it predicts a non-zero value for $\epsilon'$. In the Standard Model, $\epsilon'$, like the mixing CP-violation parameter $\epsilon$, is proportional to $\eta_{CKM}$. Unfortunately, efforts to extract a precise theoretical prediction for $\epsilon'$ have so far failed due to the difficulty of calculating the required hadronic matrix elements.

There is a long history of attempts to measure $\epsilon'/\epsilon$; until fairly recently all the results have been consistent with zero. The two experiments that preceded the present generation were E731 at Fermilab and NA31 at CERN. While E731 found that $\epsilon'/\epsilon$ differed from zero by only about $1\sigma$, NA31, with a similar experimental uncertainty, got a result approximately $3\sigma$ from zero. The significance of the difference between the two experiments was approximately $1.7\sigma$. This unsettled situation led to proposals for two improved experiments, one (KTeV-E832) at Fermilab, and one (NA48) at CERN. Many of the E731 experimenters joined E832, and many of the NA31 experimenters joined NA48. More recently, a third effort to measure $\epsilon'$ has begun at the low-energy $e^+e^-$ collider DAΦNE in Frascati. DAΦNE operates as a $\phi$ factory, and thus provides a source of coherent $K_S K_L$ pairs. The KLOE experiment there hopes to study kaon decays using these pairs in order to measure a variety of CP-violation parameters, including $\epsilon'$.

Experiments searching for rare kaon decays also have a long and interesting history. Recently, Brookhaven National Laboratory on Long Island has been a center for this work. Other notable rare decay experiments in the last few years have operated at KEK, CERN, and Fermilab. These experiments have searched for a variety of different phenomena. One major thrust has for some time been the search for decays like $K_L \to \mu^\pm e^\mp$ that violate lepton-flavor conservation. Such decays are absolutely forbidden in the Standard Model (except for vanishingly small branching fractions, around $10^{-24}$, resulting from neutrino mixing); thus, any observation of them at experimental sensitivities on the order of $10^{-12}$ would be an unambiguous signal of physics beyond the Standard Model. A second goal of rare kaon decay experiments has been to look for decays that have interesting Standard Model contributions, for example from penguin operators (describing the Flavor-Changing Neutral Current $s \to d\gamma$, for example). The rate for such decays is in principle sensitive to interesting Standard Model parameters like $V_{td}$ that may be very difficult to measure directly. Finally, rare decay experiments can do a wide range of less exciting, but still important, measurements of what I call "bread-and-butter physics".
Future work in rare kaon decay experiments is coalescing around efforts to measure the difficult but rewarding modes $K^+ \to \pi^+\nu\bar{\nu}$ and $K_L \to \pi^0\nu\bar{\nu}$, also discussed by Buchalla.

2 Experiments Measuring $\epsilon'/\epsilon$

2.1 Definition of $\epsilon'/\epsilon$ and Measurement Technique
There are two neutral kaon states which are eigenstates of strangeness: $K^0 = \bar{s}d$ and $\bar{K}^0 = s\bar{d}$. The operator CP transforms these into each other, and so has two eigenstates:

$$|K_1\rangle = \frac{1}{\sqrt{2}}\left(|K^0\rangle + |\bar{K}^0\rangle\right)$$

with $CP|K_1\rangle = +|K_1\rangle$, and the corresponding antisymmetric combination $K_2$, with $CP|K_2\rangle = -|K_2\rangle$. If CP were an exact symmetry of the weak interactions, these would correspond to the mass eigenstates $K_L$ and $K_S$. Specifically, only the CP-even state $K_1$ would be allowed to decay to two pions, and it would therefore be identified with the $K_S$, which nearly always decays in that channel; while the CP-odd state $K_2$ would have to decay to $3\pi$ or semileptonic final states, and would therefore correspond to the longer-lived $K_L$.

Since 1964, we have known that this simple picture, in which the $K_1$ is identified with the $K_S$, and the $K_2$ with the $K_L$, is not correct. In that year, Christenson, Cronin, Fitch, and Turlay [1] found that the $K_L$ has a small, but non-zero probability of decaying into two-pion final states, which implies that CP symmetry is violated. Unlike the violation of parity symmetry alone, which is maximal, CP violation is a small effect. The violation of CP in neutral kaon decays could be explained by a small CP-violating mixing between the CP eigenstates, so that the mass eigenstates would be
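$$|K_S\rangle = \frac{1}{\sqrt{1+|\epsilon|^2}}\left(|K_1\rangle + \epsilon|K_2\rangle\right)$$

and

$$|K_L\rangle = \frac{1}{\sqrt{1+|\epsilon|^2}}\left(|K_2\rangle + \epsilon|K_1\rangle\right).$$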
This mechanism leads to what is called indirect CP violation, in which the $K_L$ decays to $\pi^+\pi^-$ or $\pi^0\pi^0$ final states through its small $K_1$ admixture. The observed rate for $K_L \to \pi\pi$ can be explained if $\epsilon$ is a complex number with magnitude $|\epsilon| = 2.269 \times 10^{-3}$.

Soon after the discovery of CP violation, it was proposed that the mixing between CP eigenstates could be a consequence of a so-called Superweak [2] interaction which would mediate CP-violating transitions with $\Delta S = 2$. Such interactions would permit $K^0 \to \bar{K}^0$ and $\bar{K}^0 \to K^0$ transitions which could lead to indirect CP violation. They could not, however, lead to CP violation in the $\Delta S = 1$ decays of the $K^0$ or $\bar{K}^0$. This second type of CP violation is called direct. Other theories that explain CP violation, including the Standard Model, generally have $\Delta S = 1$ effects as well as higher-order $\Delta S = 2$ effects (such as mixing).

Direct CP violation can be distinguished from indirect by considering the balance between the two possible $\pi\pi$ final states. The $K_S$ decays to $\pi^+\pi^-$ about twice as often as it does to $\pi^0\pi^0$. Since the $K_S$ is dominantly the $K_1$ eigenstate, which decays dominantly to $\pi\pi$, the ratio observed in $K_S$ decays is the ratio in $K_1$ decays. If CP violation is entirely indirect, then the $K_L$ can decay to $\pi\pi$ only through its $K_1$ component, and the ratio of charged to neutral $\pi\pi$ final states would be identical to that observed in $K_S$ decays. If, on the other hand, direct CP violation is also allowed, then the ratio of charged to neutral final states seen in $K_L$ decays might be altered slightly from that observed in $K_S$ decays, because some of the $\pi\pi$ decays of the $K_L$ would occur directly, through its dominant $K_2$ component. To make this quantitative, one defines the amplitude ratio

$$\eta_{+-} = \frac{A(K_L \to \pi^+\pi^-)}{A(K_S \to \pi^+\pi^-)}$$
and its neutral counterpart $\eta_{00}$. If CP violation were entirely indirect, then both of these amplitude ratios would simply be equal to $\epsilon$. If there is direct CP violation as well, then the two amplitude ratios can split apart. With the usual definition of $\epsilon'$, the splitting is given to a good approximation by

$$\eta_{00} = \epsilon - 2\epsilon'$$

and

$$\eta_{+-} = \epsilon + \epsilon'.$$
To measure the splitting, experiments consider the double ratio of partial widths

$$\frac{\Gamma(K_L \to \pi^0\pi^0)/\Gamma(K_S \to \pi^0\pi^0)}{\Gamma(K_L \to \pi^+\pi^-)/\Gamma(K_S \to \pi^+\pi^-)}.$$

In terms of the parameters defined above, this ratio is given by

$$R = \left|\frac{\eta_{00}}{\eta_{+-}}\right|^2 \approx 1 - 6\,\mathrm{Re}\left(\frac{\epsilon'}{\epsilon}\right).$$

Thus, roughly speaking, one measures all four decay modes and determines whether the double ratio $R$ differs from unity. The difference yields six times the quantity $\mathrm{Re}(\epsilon'/\epsilon)$. Predictions of $\mathrm{Re}(\epsilon'/\epsilon)$ in the Standard Model are quite difficult because of the hadronic matrix elements involved. More information on the theoretical calculation can be found in the TASI-2000 contribution from G. Buchalla.
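The relation between the double ratio and $\mathrm{Re}(\epsilon'/\epsilon)$ can be checked numerically. The following is a minimal sketch, assuming only the approximate relations $\eta_{00} = \epsilon - 2\epsilon'$ and $\eta_{+-} = \epsilon + \epsilon'$ quoted above; the function names and the trial value of $\epsilon'/\epsilon$ are illustrative choices, not taken from any experiment.

```python
def double_ratio(eps_prime_over_eps):
    """Exact |eta_00 / eta_+-|^2 for eta_00 = eps - 2 eps' and eta_+- = eps + eps'.

    Only the ratio x = eps'/eps enters, since eps cancels between numerator
    and denominator.
    """
    x = complex(eps_prime_over_eps)
    return abs((1 - 2 * x) / (1 + x)) ** 2


def re_eps_prime_over_eps(r):
    """Leading-order inversion of R = 1 - 6 Re(eps'/eps)."""
    return (1.0 - r) / 6.0


# Hypothetical trial value, chosen only to show the size of the effect.
trial = 20e-4
r_exact = double_ratio(trial)
r_linear = 1 - 6 * trial
print(r_exact, r_linear, re_eps_prime_over_eps(r_exact))
```

For values of this size the exact and linearised expressions differ only at second order in $\epsilon'/\epsilon$, which is why a per-mil measurement of $R$ translates almost directly into a measurement of $\mathrm{Re}(\epsilon'/\epsilon)$ at the $10^{-4}$ level.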
2.2 KTeV Experiment E832
KTeV (Kaons at the TeVatron) is a project at Fermilab encompassing two distinct experiments: E832, whose goal is a precision measurement of $\mathrm{Re}(\epsilon'/\epsilon)$; and E799, which aims to search for a wide variety of rare $K_L$ decays. The $\epsilon'$ experiment, E832, requires two parallel beams of neutral kaons. These are created by having a beam of 800 GeV protons from the TeVatron strike a BeO target. Any charged particles produced are magnetically swept from the beam. The resulting beam of long-lived neutral particles passes through collimators which produce two parallel beams travelling through an evacuated beampipe toward the spectrometer. Additional sweeping magnets are placed just upstream of the spectrometer to remove the decay products of particles like $\Lambda$ baryons or $K_S$ mesons, which may have decayed in the beampipe.

Figure 1: The KTeV apparatus at Fermilab. (Labelled elements include the analysis magnet, trigger hodoscopes, CsI calorimeter, Pb wall, hadron veto, muon shielding, muon trigger, muon veto, and the E832 collar anti.)

Since the goal of E832 is to compare $K_L$ and $K_S$ decays, the next step is to create $K_S$ particles in one of the two beams. This is done by placing a regenerator consisting of a series of plastic scintillator tiles in one of the two beams. As the kaons pass through the regenerator, the mixture of $K^0$ and $\bar{K}^0$, originally 50-50, changes gradually, so that the exiting beam contains a regenerated $K_S$ component in addition to an attenuated $K_L$ component. Although the regeneration amplitude is of only modest size, the $K_S$ that are produced decay quickly downstream of the regenerator and so dominate the decays in the beam in which the regenerator is placed. More precisely, the regenerator beam is a coherent superposition of $K_S$ and $K_L$ components. The distribution of decay proper times in the regenerator beam exhibits the usual interference phenomena seen in a mixed neutral kaon beam, and its detailed shape depends on parameters of the neutral kaon system such as the $K_L$-$K_S$ mass difference $\Delta m$ and the $K_S$ lifetime $\tau_S$, as well as the regeneration amplitude and the CP-violating amplitude ratios $\eta_{00}$ and $\eta_{+-}$. The decay distributions seen in the second (vacuum) beam are almost pure $K_L$. The vacuum beam is used to normalise the decay distributions seen in the regenerator beam and to understand the acceptance of the apparatus. Because the two beams (left and right) are not absolutely identical in energy, size, or intensity, the regenerator alternates from one beam to the other every minute in order to average out these differences. This also averages out any left-right asymmetry in the response of the spectrometer used to detect the kaon decay products. Since $K_L$ and $K_S$ decays are collected simultaneously, with identical event selection criteria, many possible systematic differences in the acceptance cancel out.

KTeV Apparatus

The KTeV apparatus is shown in figure 1. The regenerator can be seen at the left-hand (upstream) edge of the schematic. After the regenerator, there is a large vacuum decay volume. The vacuum tank extends somewhat more than 30 meters downstream from the regenerator, and increases gradually in diameter from about 1.5 meters at the regenerator to about 2.0 meters just before the window at its downstream end. Every few meters along this vacuum tank, there are so-called "ring vetoes", lead-scintillator stacks with a square hole in the center, labelled RC6 - RC10 in figure 1. Decay products that are exiting the tank at large angles, so that they would miss the spectrometer elements further downstream, hit the veto detectors and cause the event to be rejected. This helps to minimize backgrounds due to missing particles. At the downstream end of the vacuum tank, there is a large, thin window. Just beyond this window is the first of four drift chambers, which, together with the analysis magnet, are used to measure the momenta of charged particles. The drift chambers are each surrounded by additional veto detectors in order to reject events with particles escaping the fiducial volume of the detector before reaching the calorimeter. These veto detectors are labelled SA2 - SA4 in figure 1.
The calorimeter is used to measure the energy of photons as well as the energy deposited by charged particles, which can be electrons, pions, or muons. It consists of 3100 crystals of pure CsI. To prevent particles in the beam from interacting in the CsI, the calorimeter has two beam holes, each 15 cm square, where the beams pass through. A final veto detector surrounds the outer edge of the CsI array. In addition, the inner edge of the CsI acceptance is precisely defined by a tungsten-scintillator veto detector that surrounds the edges of the two CsI beam holes. The ratio $E/p$ for electrons, where $E$ is the measured energy in the CsI and $p$ is the momentum determined from the drift chambers, can be used to measure the resolution of the CsI calorimeter, which is found to be about 0.7%. This excellent resolution is extremely important for the neutral-mode ($\pi^0\pi^0$) analysis, since the location of the decay vertex is determined entirely from the calorimeter's photon energy measurements.

Analysis Procedure

Neutral-mode events must have exactly four energy clusters in the CsI. The four photon clusters can be separated into two distinct pairs in three ways. For each pair in a pairing, the hypothesis is made that the two photons came from the decay of a single $\pi^0$. The distance of the hypothetical $\pi^0$ decay from the CsI calorimeter ($\Delta z$) can then be found from the equation

$$\Delta z = \frac{\sqrt{E_1 E_2}}{M_\pi}\,\Delta r,$$

where $M_\pi$ is the $\pi^0$ mass; $E_1$ and $E_2$ are the measured photon energies; and $\Delta r$ is the transverse separation of the two photon clusters at the CsI. Two different $\Delta z$ values are calculated for the two pairs in a pairing, and uncertainties on the two $\Delta z$ values are estimated. The best pairing (the one in which the two $\Delta z$ values are most consistent) is chosen and the $z$ decay vertex is determined. After this, the total invariant mass of the event can be calculated; events with total mass between 490 and 505 MeV/$c^2$ form the final neutral-mode sample.
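The pairing step can be illustrated with a short sketch. It assumes the small-angle formula above and, as a simplification that is my own and not the experiment's, picks the pairing by the unweighted difference of the two $\Delta z$ values; the text indicates that the real analysis also uses the estimated uncertainties. The cluster format and all function names are hypothetical.

```python
import math

PI0_MASS_GEV = 0.1349768  # neutral pion mass in GeV/c^2 (PDG value, assumed here)

# The three ways of splitting four clusters (indices 0..3) into two pairs.
PAIRINGS = (((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2)))


def delta_z(e1, e2, dr, m_pi=PI0_MASS_GEV):
    """Distance from the calorimeter to the pi0 decay (small-angle approximation).

    Energies in GeV and dr in meters give delta_z in meters.
    """
    return math.sqrt(e1 * e2) * dr / m_pi


def best_pairing(clusters):
    """Choose the pairing whose two delta_z values agree most closely.

    clusters: four (energy_GeV, x_m, y_m) tuples.
    Returns (pairing, mean_delta_z_m).
    """
    best = None
    for pairing in PAIRINGS:
        dzs = []
        for i, j in pairing:
            e1, x1, y1 = clusters[i]
            e2, x2, y2 = clusters[j]
            dzs.append(delta_z(e1, e2, math.hypot(x1 - x2, y1 - y2)))
        mismatch = abs(dzs[0] - dzs[1])  # unweighted; a real fit would use errors
        if best is None or mismatch < best[0]:
            best = (mismatch, pairing, 0.5 * (dzs[0] + dzs[1]))
    return best[1], best[2]
```

Because $\Delta z$ scales linearly with the measured energies, any error in the overall calorimeter energy scale moves every reconstructed vertex by the same fraction, which is why the energy scale appears later as a systematic uncertainty.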
Figure 2: Distribution of invariant mass for the charged-mode and neutral-mode candidate events, in both E832 beams. (Four panels: $K \to \pi^+\pi^-$ and $K \to \pi^0\pi^0$, each for the vacuum beam and the regenerator beam; horizontal axis: invariant mass in GeV/$c^2$, from 0.47 to 0.52.)
To identify $\pi^+\pi^-$ events, events were required to have exactly two drift chamber tracks. The tracks had to be matched to energy clusters in the CsI, and, to be sure that they were produced by pions, the $E/p$ ratio had to be less than 0.85. The invariant mass of the two pions had to be between 488 and 508 MeV/$c^2$. To reject events with missing particles, a quantity called $P_T$ was calculated by finding the component of the total of the two measured pion momenta which was perpendicular to the kaon's line of flight from target to decay vertex. Only events with $P_T^2 < 250$ MeV$^2$/$c^2$, consistent with resolution effects, were retained in the final charged-mode data sample. The invariant mass distributions for these events are shown in figure 2.

Before $\mathrm{Re}(\epsilon'/\epsilon)$ can be extracted from these data, backgrounds must be identified and subtracted, and acceptance corrections must be determined, using a Monte Carlo simulation of the experiment. The backgrounds to the charged mode are very small, since the $P_T^2$ cut is very efficient at removing them. In the vacuum beam, there is a small charged-mode background from $K_{e3}$ and $K_{\mu 3}$ semileptonic decays, in which the lepton is misidentified as a pion. In the regenerator beam, the main background comes from regenerator scattering events. The shapes of the backgrounds match the data very well at large $P_T^2$, allowing reliable determination of the remaining background under the signal peak near zero. The level of scattering backgrounds required to match the data and Monte Carlo $P_T^2$ spectra can also be used to help in the determination of neutral-mode scattering backgrounds. Because the decay vertex cannot be as precisely located in the neutral mode, there is no precise $P_T^2$ quantity available to reduce the backgrounds. As a result, the backgrounds are substantially larger in the neutral mode, because the peak region is less tightly defined. Figure 3 shows the final background determinations for all four $K_{L,S} \to \pi\pi$ decay modes.
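The charged-mode selection quantities described above can be sketched as follows. The cut values (E/p below 0.85, mass between 488 and 508 MeV/$c^2$, $P_T^2$ below 250 MeV$^2$/$c^2$) are those quoted in the text; the function names, the three-vector representation, and the charged pion mass value are assumptions made for this illustration.

```python
import math

PION_MASS_MEV = 139.57  # charged pion mass in MeV/c^2 (assumed value)


def pipi_mass(p1, p2, m=PION_MASS_MEV):
    """Invariant mass (MeV/c^2) of two tracks, both assumed to be pions."""
    energy = sum(math.sqrt(sum(c * c for c in p) + m * m) for p in (p1, p2))
    total = [a + b for a, b in zip(p1, p2)]
    return math.sqrt(energy * energy - sum(c * c for c in total))


def pt_squared(p1, p2, target, vertex):
    """Squared momentum component transverse to the target-to-vertex line.

    Momenta in MeV/c, positions in any common length unit.
    """
    total = [a + b for a, b in zip(p1, p2)]
    flight = [v - t for v, t in zip(vertex, target)]
    flight_len2 = sum(c * c for c in flight)
    along = sum(a * b for a, b in zip(total, flight))
    return sum(a * a for a in total) - along * along / flight_len2


def passes_charged_selection(e_over_p, mass_mev, pt2):
    """Apply the three cuts quoted in the text to one two-track event."""
    return (
        all(ratio < 0.85 for ratio in e_over_p)
        and 488.0 < mass_mev < 508.0
        and pt2 < 250.0
    )
```

A decay with an undetected particle, such as a semileptonic decay with its neutrino, generally leaves the two-track momentum sum pointing away from the kaon line of flight, so it tends to fail the $P_T^2$ requirement.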
As figure 3 shows, the charged-mode backgrounds are less than 0.1%, while the neutral-mode backgrounds are somewhat less than 1% in the vacuum beam, and somewhat more than 1% in the regenerator beam.

Figure 3: Summary of total background levels for all four $K_{L,S} \to \pi\pi$ modes, shown as a function of $P_T^2$ (in MeV$^2$/$c^2$) for the charged mode, and of a similar quantity, "Ring Number", in the neutral mode.

Extraction of $\mathrm{Re}(\epsilon'/\epsilon)$

Once the backgrounds have been determined, they are subtracted bin-by-bin from the signals, resulting in the four decay-vertex distributions shown in figure 4. The data are further divided into energy bins. In each bin of kaon energy and decay vertex position, the Monte Carlo simulation is used to determine an acceptance correction. The acceptance-corrected numbers of events are combined in 10 GeV energy bins. The ratio of regenerator to vacuum beam events is then fit to the predicted distribution, which depends on parameters including not only $\mathrm{Re}(\epsilon'/\epsilon)$, but also $\Delta m$, $\tau_S$, the phases $\phi_{00}$ and $\phi_{+-}$ of $\eta_{00}$ and $\eta_{+-}$, and the regeneration amplitude. The fit can be done in a variety of ways, either allowing many parameters to float simultaneously, or fixing all parameters but one. For the final fit, all parameters except $\epsilon'$ were fixed to the Particle Data Group averages.

Figure 4: Background-subtracted decay-vertex distributions (vertex $z$ in meters, from 110 to 160) for the four modes. Notice the sharply falling regenerator beam distributions, dominated by $K_S$ decays, compared to the relatively flat vacuum beam distributions, dominated by $K_L$ decays.

There are a variety of systematic uncertainties in the extraction of $\mathrm{Re}(\epsilon'/\epsilon)$, which are summarized in figure 5. The most important charged-mode systematic is the overall acceptance uncertainty. Its origin lies in the fact that the observed $z$-vertex distribution in the data does not exactly match the predicted distribution in the Monte Carlo. In the neutral mode, there is no single, dominant source of systematic uncertainty. The background uncertainty is much larger in the neutral mode simply because the level of background is about an order of magnitude larger than in the charged mode. The energy scale uncertainty is due to the fact that the calculation of the vertex location depends entirely on the overall CsI energy scale; an error in this scale would cause the whole $z$-vertex distribution to slide, and since this distribution is very different for $K_L$ and $K_S$ decays, such a shift would impact the fit results for $\mathrm{Re}(\epsilon'/\epsilon)$. Combining all the systematic errors in quadrature, the total is $2.8 \times 10^{-4}$. With this dataset, the statistical uncertainty is $3.0 \times 10^{-4}$.

To avoid any bias in developing the analysis cuts or the Monte Carlo simulation based on the result for $\mathrm{Re}(\epsilon'/\epsilon)$, the fit results were "hidden" by adding an unknown offset to the reported value of $\mathrm{Re}(\epsilon'/\epsilon)$ until the decision was made to report the result in public. The final result from the fit was quite large [3]:

$$\mathrm{Re}(\epsilon'/\epsilon) = (28.0 \pm 3.0 \pm 2.8) \times 10^{-4}.$$

As figure 6 shows, this result is considerably larger than the previous E731 result, but quite consistent with the NA31 result, as well as with the NA48 result [4]. It is also larger than most recent theoretical predictions, but as was mentioned above, the predictions are not yet well constrained, given the ranges of parameters that must be considered. This result, together with the new results from NA48, does, however, rule out conclusively the hypothesis that the observed CP violation in neutral kaon decays might be due exclusively to a new, CP-violating Superweak interaction. Whether or not sources of CP violation beyond the Standard Model will be required to explain such a large value for $\mathrm{Re}(\epsilon'/\epsilon)$ awaits the development of more definite theoretical predictions for the CP violation induced by the CKM matrix.
Summary of Systematic Uncertainties (in units of $10^{-4}$)

Systematic                     $\pi^+\pi^-$ analysis    $\pi^0\pi^0$ analysis
Trigger (L1/L2/L3)             0.50                     0.29
Detector Resolution            0.35                     <0.10
Calibration/Alignment          0.25                     0.38
Energy Scale                   0.12                     0.70
DC simulation                  0.63
CsI non-linearity                                       0.60
Apertures (incl. Reg Edge)     0.26                     0.48
Analysis Cuts                  0.59                     0.78
Backgrounds                    0.20                     0.81
Overall Acceptance             1.59                     0.68
Monte Carlo Statistics         0.50                     0.90
Attenuation Slope                                       0.24
Movable Absorber                                        0.20
External Parameters                                     0.19

Figure 5: Systematic uncertainties in the KTeV-E832 measurement of $\mathrm{Re}(\epsilon'/\epsilon)$; the largest single uncertainty is due to a mismatch between the data and Monte Carlo $z$-vertex distributions for the charged mode.
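The quadrature sums quoted for E832 can be reproduced from the entries of figure 5. The sketch below is only an arithmetic check: the assignment of each number to a row and column is my reading of the table, and the upper limit "<0.10" is treated as 0.10, which is an assumption.

```python
import math

# Entries of figure 5 in units of 1e-4, as read from the table.
CHARGED = [0.50, 0.35, 0.25, 0.12, 0.63, 0.26, 0.59, 0.20, 1.59, 0.50]
NEUTRAL = [0.29, 0.10, 0.38, 0.70, 0.60, 0.48, 0.78, 0.81, 0.68, 0.90]
# Attenuation slope, movable absorber, external parameters.
OTHER = [0.24, 0.20, 0.19]


def quadrature(values):
    """Square root of the sum of squares of independent uncertainties."""
    return math.sqrt(sum(v * v for v in values))


total_syst = quadrature(CHARGED + NEUTRAL + OTHER)

# Combine the quoted statistical (3.0) and systematic (2.8) uncertainties.
total_error = quadrature([3.0, 2.8])

# Distance of the central value 28.0 from zero, in standard deviations.
significance = 28.0 / total_error

print(total_syst, total_error, significance)
```

The summed systematic comes out close to the quoted $2.8 \times 10^{-4}$, and the combined uncertainty close to the $4.1 \times 10^{-4}$ used when the result is compared with other experiments.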
2.3 CERN Experiment NA48
The NA48 Way

The current-generation $\epsilon'/\epsilon$ experiment at CERN is called NA48. This is the successor to the earlier NA31 experiment, which was the first to report a result for $\epsilon'/\epsilon$ significantly different from zero. The NA48 approach to measuring $\epsilon'$ is similar in many respects to that used by E832 at Fermilab, but there are also some important differences. Both experiments emphasize the importance of collecting data for all four $K_{L,S} \to \pi\pi$ decay modes simultaneously. This is important in order to ensure that backgrounds due to accidentals are as similar as possible in the two modes, and so that time-dependent changes in detector response (due to varying gains, malfunctioning channels, and the like) will not affect different decays differently. Both experiments also use identical cuts and identical fiducial regions for $K_L$ and $K_S$ decays. The experiments differ, though, in how the $K_S$ are produced. As was discussed above, E832 has a target far upstream of the spectrometer, followed by collimators that produce two parallel beams. Almost all the $K_S$ produced at the primary target decay far upstream, so that the two beams are both almost pure $K_L$ by the time they reach the detector. One of the E832 beams then hits an active regenerator, where the differing interactions of the $K^0$ and $\bar{K}^0$ components turn the $K_L$ beam into a coherent mixture $|K_L\rangle + \rho|K_S\rangle$, where $\rho$ is the regeneration amplitude. The $K_S$ in the regenerator beam then decay quickly just downstream of the regenerator.
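To see how the coherent mixture $|K_L\rangle + \rho|K_S\rangle$ shapes the two-pion decay distribution, one can evaluate the standard interference formula numerically. This is a minimal sketch, not the fit function of either experiment: the lifetimes and mass difference are approximate values that I am assuming here, and the regeneration amplitude and $\eta$ used in any call are hypothetical inputs.

```python
import cmath
import math

# Approximate neutral-kaon parameters (assumed values, not from the text).
TAU_S = 0.895e-10    # K_S lifetime in seconds
TAU_L = 5.12e-8      # K_L lifetime in seconds
DELTA_M = 0.529e10   # m_L - m_S in units of hbar per second


def two_pion_rate(t, rho, eta):
    """Relative pi-pi decay rate at proper time t (s) for |K_L> + rho |K_S>.

    rho: complex regeneration amplitude (hypothetical input).
    eta: complex amplitude ratio A(K_L -> pi pi) / A(K_S -> pi pi).
    Equals |rho|^2 e^{-t/tau_S} + |eta|^2 e^{-t/tau_L}
           + 2 |rho| |eta| e^{-t (1/tau_S + 1/tau_L) / 2}
             * cos(DELTA_M t + arg(rho) - arg(eta)).
    """
    amp_s = rho * math.exp(-t / (2 * TAU_S))
    amp_l = eta * math.exp(-t / (2 * TAU_L)) * cmath.exp(-1j * DELTA_M * t)
    return abs(amp_s + amp_l) ** 2
```

With $|\rho|$ much larger than $|\eta|$, the first term dominates at early times, which is why decays just downstream of the regenerator are mostly $K_S$; the interference term carries the sensitivity to $\Delta m$ and to the phases.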
Figure 6: Final result for $\mathrm{Re}(\epsilon'/\epsilon)$, and comparison to earlier experiments. (Values plotted, in units of $10^{-4}$, against publication year: E731 7.4 ± 5.9; NA31 23 ± 6.5; PDG 15 ± 8; NA48 18.5 ± 7.3 and 12.2 ± 4.9; KTeV 28.0 ± 4.1; world average 19.3 ± 2.4.)
A schematic of the NA48 beamline is shown in figure 7. In NA48, as in E832, there is a $K_L$ target far upstream of the detector. While both $K_L$ and $K_S$ are produced in this target, the $K_S$ decay away far upstream of the detector, as in E832. However, in the NA48 experiment, a small fraction of the primary proton beam is deflected by a bent crystal just downstream of the $K_L$ target. The deflected beam is then directed onto a $K_S$ target just upstream of the fiducial region. $K_S$ particles produced at this downstream target then decay in the same fiducial region used for the $K_L$ decays. The advantage of this technique compared to E832 is that NA48 does not suffer from the substantial backgrounds E832 sees due to diffractive and inelastic interactions in the regenerator material. Also, the momentum spectra of primary $K_S$ and $K_L$ are the same, whereas in E832 the $K_S$ momentum spectrum differs because of the momentum dependence of the regeneration amplitude.

Figure 7: The NA48 beamline at CERN. (Labelled elements include the $K_L$ target, bent crystal, $K_S$ tagging station, muon sweeping, $K_S$ target, last collimator, and decay region.)

However, there are also some disadvantages associated with the NA48 method for producing $K_S$. First, the $K_S$ beam is not exactly coincident with the $K_L$ beam, so different parts of the detector are illuminated, which can lead to systematic differences between the acceptance for $K_S$ and $K_L$. Second, one must identify whether a given decay seen in the spectrometer is due to a $K_S$ decay or a $K_L$ decay. In the case of $K \to \pi^+\pi^-$ decays, this is simple, because the decay vertex reconstructed from the two charged tracks clearly points back to either the $K_L$ target or the $K_S$ target. However, for neutral decays this technique cannot be used and there is an ambiguity as to whether the decaying particle was a $K_S$ or $K_L$. To resolve this, NA48 uses timing information to determine which neutral decays are which. The protons deflected by the bent crystal pass through a $K_S$ tagging counter. When a $K \to \pi^0\pi^0$ decay is detected, one looks to see whether there was a signal in the $K_S$ tagger at the right time to have been produced by a proton that could later have hit the $K_S$ target, producing the $K_S$ whose decay was observed. If there was no tagger hit in the appropriate timing window, then the decay is identified as a $K_L \to \pi^0\pi^0$ event; if there was a tagger hit, then the decay is called a $K_S \to \pi^0\pi^0$ event. This identification is usually correct, but mistakes are made in which the tagging counter fails to fire in the appropriate window ($K_S$ misidentified as $K_L$) or in which the time of a $K_L$ decay is accidentally coincident with a hit in the tagging counter ($K_L$ misidentified as $K_S$). The level of misidentification must be determined and corrected for, and the residual uncertainty in this correction is an important component of the final systematic error in the NA48 result for $\epsilon'/\epsilon$.

Another difference between NA48 and E832 is in the way in which the different $K_S$ and $K_L$ lifetimes are handled in the analysis. Although E832 uses the same fiducial region for $K_S$ and $K_L$ decays, the observed decay distributions are very different, with the $K_S$ decays being concentrated just downstream of the regenerator, whereas the $K_L$ decays are distributed nearly evenly along the length of the fiducial region. As a result of this, the E832 spectrometer elements are illuminated differently by $K_S$ and $K_L$ decays, and a correction is applied from a Monte Carlo simulation to account for the resulting difference in acceptance. The distribution of decay vertices in NA48 is also quite different, with the $K_S$ decays being concentrated near the $K_S$ target. However, rather than correct for the resulting acceptance differences using a Monte Carlo, NA48 chooses to reweight the $K_L$ decays so that the weighted $K_L$ decay vertex distribution looks the same as the $K_S$ decay vertex distribution. After the weighting is applied, the remaining acceptance differences are extremely small. Effectively, NA48 trades statistical power away (by introducing different weights for different events) in order to reduce the systematic error. KTeV makes the opposite choice, retaining maximum statistical power at the expense of a systematic error resulting from acceptance differences.

Figure 8: The NA48 apparatus at CERN.

The spectrometer elements in NA48 are similar in performance to those in E832. There is a system of drift chambers and an analysing magnet with a 250 MeV/$c$ transverse-momentum kick to measure the trajectories and momenta of charged particles. There is a high-precision liquid-krypton calorimeter, with energy resolution similar to that achieved in E832's CsI crystal calorimeter. As with E832, mass resolution in the charged mode is about 2.5 MeV. The liquid-krypton calorimeter has one important job that the CsI does not have in E832: the NA48 calorimeter is also used to measure the time of photon showers in order to make the correlation with the $K_S$ tagging counter described above. A timing resolution of about 250 ps has been achieved by NA48 for neutral events. A slightly better time resolution is obtained for charged events, using planes of scintillator upstream of the calorimeter.

NA48 collected data using this apparatus during 1997, 1998, and 1999, for a total of about 352 days of running, and about 3.6 million of the rarest of the four $\pi\pi$ decay modes, $K_L \to \pi^0\pi^0$. This is about half of the roughly 8 million of these decays collected during the two E832 runs in 1997 and 1999.

One of the more important systematic uncertainties in NA48, and one that is different from E832, is the uncertainty resulting from mistagging. To correct for mistagging, charged events (where the true source of the event is known by looking at the vertex location from the tracks) are subjected to the tagger-timing analysis used in the neutral mode. From these events, the tagger inefficiency for charged decays is measured to be $(1.97 \pm 0.05) \times 10^{-4}$, and the probability of accidental tagging for charged decays is measured to be $(11.05 \pm 0.01)\%$. Tagger inefficiency causes $K_S$ decays to be misidentified as $K_L$, while accidental tagging causes $K_L$ decays to be misidentified as $K_S$. The possibility that the neutral misidentification rates may be different is also studied using runs where the $K_L$ beam is blocked and by measuring the accidental tagging rate for purely $K_L$ decays such as $K_L \to \pi^0\pi^0\pi^0$.
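The effect of the two mistagging probabilities can be made concrete with a small unfolding sketch. It assumes a simple two-by-two model in which each $K_S$ decay is left untagged with probability equal to the tagger inefficiency and each $K_L$ decay is tagged with probability equal to the accidental rate. The default values are the charged-mode numbers quoted above; applying them unchanged to neutral events is an assumption that, as the text notes, has to be checked separately. Function names and event counts are hypothetical.

```python
def apply_tagging(n_s, n_l, alpha_sl=1.97e-4, alpha_ls=0.1105):
    """Forward model: true (K_S, K_L) counts -> observed (tagged, untagged).

    alpha_sl: probability that a K_S decay has no tagger hit (called K_L).
    alpha_ls: probability that a K_L decay has an accidental hit (called K_S).
    """
    tagged = (1 - alpha_sl) * n_s + alpha_ls * n_l
    untagged = alpha_sl * n_s + (1 - alpha_ls) * n_l
    return tagged, untagged


def unfold_tagging(n_tagged, n_untagged, alpha_sl=1.97e-4, alpha_ls=0.1105):
    """Invert the forward model to recover the true (K_S, K_L) counts."""
    total = n_tagged + n_untagged
    n_s = (n_tagged - alpha_ls * total) / (1 - alpha_sl - alpha_ls)
    return n_s, total - n_s


# Hypothetical sample sizes, used only to show the round trip.
observed = apply_tagging(1_000_000.0, 500_000.0)
recovered = unfold_tagging(*observed)
print(observed, recovered)
```

Because the accidental rate is about 11%, a sizeable fraction of true $K_L \to \pi^0\pi^0$ decays ends up in the tagged sample, so even a small uncertainty on that rate matters for the double ratio.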
The most important contribution to the final systematic error comes from the uncertainty in the neutral accidental tagging fraction, which contributes about $9 \times 10^{-4}$ to the systematic error on the double ratio $R$, or $1.5 \times 10^{-4}$ to the systematic error on $\mathrm{Re}(\epsilon'/\epsilon)$.

The backgrounds in NA48 are much smaller than in the Fermilab experiment E832, because of the absence of regenerator scattering. However, there is still background in the charged mode from semileptonic decays. This has to be subtracted, which leads to a correction of $9.9 \times 10^{-4}$ in the measured double ratio, and a contribution of $0.6 \times 10^{-6}$ to the final systematic error on $\mathrm{Re}(\epsilon'/\epsilon)$. Neutral-mode background from $K_L \to \pi^0\pi^0\pi^0$ similarly leads to a correction of $6.6 \times 10^{-4}$ in the double ratio, and contributes $0.3 \times 10^{-4}$ to the final systematic error.

The largest correction to the double ratio (which is still quite small) comes from the residual acceptance corrections that are needed even after the reweighting procedure has been used. The correction to $R$ is $(31 \pm 6 \pm 6) \times 10^{-4}$. The two uncertainties (statistical and systematic) both contribute to the systematic error on $\epsilon'/\epsilon$. It is notable that although the Fermilab experiment makes a much larger acceptance correction (a few percent compared to a few per mil), the uncertainty in the size of the correction quoted by E832 is similar to the uncertainty quoted by NA48, so that the final contribution of acceptance correction uncertainties to the systematic errors in the two experiments is not very different.

Another substantial systematic uncertainty in NA48 that is very similar to one from E832 is due to the overall energy scale. This uncertainty in $\mathrm{Re}(\epsilon'/\epsilon)$ is estimated at $1.7 \times 10^{-4}$ for NA48, compared to $0.8 \times 10^{-4}$ for E832. Another large systematic uncertainty ($2 \times 10^{-4}$) comes from biases that may result from accidental activity.

Combining the results from their 1997 and 1998 runs, NA48 quotes a preliminary result [4]:

$$\mathrm{Re}(\epsilon'/\epsilon) = (14.0 \pm 4.3) \times 10^{-4} \quad \text{(NA48 average)}$$

This result is only half as big as the KTeV result [3]

$$\mathrm{Re}(\epsilon'/\epsilon) = (28.0 \pm 4.1) \times 10^{-4} \quad \text{(KTeV)}$$

but has about the same quoted precision. The situation is unsettling, in that the two new results differ from each other by about $2.4\sigma$. The world average shown in figure 6 is $(19.3 \pm 2.4) \times 10^{-4}$. However, it should be noted that the Particle Data Group's averaging procedure, which inflates the uncertainty when there is poor agreement between results, will give a larger error bar.
The KLOE Experiment at DA&NE
The DAΦNE $\phi$ factory in Frascati is a 1.02-GeV $e^+e^-$ collider sitting at the $\phi$ resonance. The accelerator began commissioning in early 1999, with a design luminosity goal of $L = 5 \times 10^{32}$ cm$^{-2}$ s$^{-1}$. To date, the accelerator performance has unfortunately been rather far from this goal. The KLOE experiment [5] was designed to measure $\epsilon'/\epsilon$ and other CP and CPT parameters, although it will also search for a variety of rare $K_L$, $K_S$ and $K^+$ decays using tagged kaons from copious $\phi$ decays. For CP physics, KLOE has an advantage compared to KTeV and NA48 in that it produces a pure $K_L K_S$ quantum state, making tagging and interferometry possible, as in B-factory experiments.

The KLOE detector is a $4\pi$ apparatus working in the center-of-mass frame. It is a cylindrically symmetric general-purpose detector with a low-mass central drift chamber of very large volume (4-m diameter) using helium as the ionizing medium, surrounded by a Pb-scintillating fiber calorimeter. The detector is in a 0.6-T magnetic field, so the calorimeter is read out with high-field fine-mesh PMTs. The calorimeter is designed to have good timing resolution (about 130 ps) and to be highly efficient, in order to reduce backgrounds from $K_L \to \pi^0\pi^0\pi^0$ events. KLOE is performing well, having already demonstrated mass resolution of about 1 MeV for $K_S \to \pi^+\pi^-$ events. Data-taking began in the summer of 1999, but to date not many $K_L \to \pi\pi$ events have been collected.

Figure 9: The NA48 reweighting procedure. $K_L$ events are weighted according to the $K_S/K_L$ ratio of proper-time distributions (shown for 100 < kaon energy < 110 GeV; horizontal axis: K decay time from AKS).

3 Rare Kaon Decay Experiments

This lecture reviews the status of rare kaon decays, with emphasis on the general thrusts of rare kaon decay experiments and on progress made in the last few years. Several good review articles are available that focus on rare kaon decays [6,7,8], and on theoretical studies of rare kaon decays [9,10,11,12].

There are three primary themes of rare kaon decay experiments. The first is what I call "bread-and-butter" physics. This is done by studying a variety of rare and semi-rare processes including semileptonic, radiative, and electromagnetic decays. Detailed measurements of the branching fractions and kinematic distributions in these decays provide information on form factors, as well as tests of QED and chiral perturbation theory calculations. While some of these
results are not too fascinating for their own sake, they are often needed in order to understand backgrounds or normalization modes for more interesting decays.

The second theme is to use rare kaon decays to test the Standard Model and to determine the values of Standard Model parameters. This usually involves selecting kaon decays that involve suppressed flavor-changing neutral currents with important contributions from electromagnetic penguin operators. Examples include $K_L \to \mu^+\mu^-$, $K_L \to \pi^0 e^+e^-$, and particularly $K \to \pi\nu\bar\nu$. Measurements of these processes should be sensitive to both the real and imaginary parts of the difficult-to-determine CKM matrix element $V_{td}$. These efforts have been underway for some time, but as yet sensitivities have not reached the level required to make precision measurements.

Finally, the long kaon lifetime makes kaon decays an excellent place to search for processes that are forbidden in the Standard Model, which would signal the presence of new physics. Exotic decays of this type that have been the subject of rare decay experiments include $K_L \to \mu^\pm e^\mp$ and $K \to \pi\mu^\pm e^\mp$. To date, none of these searches has turned up any evidence for exotic decays, despite sensitivities to branching ratios in the $10^{-11}$ range.

3.1 Bread-and-Butter Rare Decay Physics
In addition to the rare kaon decays that directly probe standard-model parameters and those that are sensitive to exotic physics, there is a wide array of other decay modes on which substantial experimental progress has been made in recent years. Although these results receive less attention, they provide critical information in a variety of areas.

For example, there has historically been strong interest in the $K_L \to \mu^+\mu^-$ decay as a probe of weak interaction dynamics, specifically $\mathrm{Re}(V_{td})$, through its short-distance amplitude. But the short-distance amplitude is known to be quite small compared with the long-distance part, involving the $\gamma\gamma$ and $\gamma^*\gamma^*$ intermediate states. An accurate determination of the $K_L\gamma^*\gamma^*$ form factor is needed in order to evaluate the long-distance contribution, which is needed in turn to extract $\mathrm{Re}(V_{td})$ from $B(K_L \to \mu^+\mu^-)$. The form factor can be determined in part from measurements of electromagnetic kaon decays such as $K_L \to e^+e^-\gamma$, $K_L \to \mu^+\mu^-\gamma$, $K_L \to e^+e^-e^+e^-$, and $K_L \to e^+e^-\mu^+\mu^-$.

The measurement of such modes as $K_L \to \pi^0\gamma\gamma$ and $K_L \to e^+e^-\gamma\gamma$ is important for a different reason. The decay $K_L \to \ell^+\ell^-\gamma\gamma$ is an important background in the search for $K_L \to \pi^0\ell^+\ell^-$. Both the absolute number of events from this process and the kinematic distributions of those events are thus important to the effort to learn about standard-model parameters from $K_L \to \pi^0 e^+e^-$
and $K_L \to \pi^0\mu^+\mu^-$. In particular, large samples of these events must be studied to determine the effectiveness of kinematic cuts necessary to observe this extremely rare decay.

The decay $K_L \to \pi^0\gamma\gamma$, though not a background to $K_L \to \pi^0 e^+e^-$, can be used to determine the CP-conserving part of the amplitude. Additional contributions from the $\pi^0\gamma^*\gamma^*$ intermediate state with off-shell photons are also important. These can be determined from ChPT models, but again there are undetermined parameters that must be extracted by studying kinematic distributions in $K_L \to \pi^0\gamma\gamma$ and the related mode $K_L \to \pi^0 e^+e^-\gamma$.

Because of their low energy release and wide variety of final states, kaon decays provide an excellent testing ground for the predictions of ChPT. For example, the $K \to \gamma\gamma$ modes and the direct-emission component of radiative semileptonic decay modes have proven to be good testing grounds for comparing $O(p^4)$ to $O(p^6)$ calculations. ChPT calculations of $\pi\pi$ scattering can likewise be tested by measuring the form factors of $K \to \pi\pi\ell\nu$ ($K_{\ell 4}$) decays.

Sometimes the study of these less well-known modes can turn up new phenomena of considerable interest. For example, an interesting observation of a CP-violating and T-odd angular asymmetry in the $K_L \to \pi^+\pi^-e^+e^-$ decay has been made by the NA48 and KTeV experiments. This is the largest CP-violating effect yet seen, and the first CP-violating effect ever observed in an angular distribution.

Electromagnetic Decay Modes

Like the $\pi^0$, the neutral kaons couple to two photons. The interaction with two real photons can be extended to off-shell photons with nonzero $k^2$. In this case, the coupling $f_{P\gamma^*\gamma^*}$ can depend on the two $k^2$ values. The form factor $f_{K_L\gamma\gamma}$ is needed to accurately calculate the long-distance contribution to $K_L \to \mu^+\mu^-$ and $K_L \to e^+e^-$, which both have important contributions from the $\gamma^*\gamma^*$ intermediate state.
This long-distance contribution must be subtracted from the precise experimental measurement of $K_L \to \mu^+\mu^-$ in order to determine the interesting short-distance part of the amplitude for that decay, which can be related to the real part of the CKM matrix element $V_{td}$ or, equivalently, to the standard-model parameter $\rho$.

Rare kaon decays can be used to study the $K\gamma\gamma$ form factors in several regions. For example, the electron and muon Dalitz decays $K_L \to e^+e^-\gamma$ and $K_L \to \mu^+\mu^-\gamma$ are sensitive to the form factor with $k_1^2 = 0$ and $4m_\ell^2 < k_2^2 < M_K^2$, where $m_\ell$ is the lepton mass. From lepton universality the form factors obtained in the electron and muon modes should be the same. As Table 1 shows, substantial statistics are now available in both of these $\ell\ell\gamma$ modes. The rarer double-internal-conversion modes, where the final state consists
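Table 1: Summary of results of kaon decays to two photons and related modes

Decay Mode                        Branching Ratio                                       Events    Experiment [Ref.]
$K_S \to \gamma\gamma$            $(2.6 \pm 0.4 \pm 0.2) \times 10^{-6}$                148       NA48-00 [13]
$K_L \to \gamma\gamma$            $(5.92 \pm 0.15) \times 10^{-4}$                      110000    NA31-87 [14]
$K_L \to \mu^+\mu^-$              $(7.24 \pm 0.17) \times 10^{-9}$                      6200      E871-00 [15]
$K_L \to e^+e^-$                  $(8.7^{+5.7}_{-4.1}) \times 10^{-12}$                 4         E871-98 [16]
$K_L \to e^+e^-\gamma$            $(1.06 \pm 0.02 \pm 0.02 \pm 0.04) \times 10^{-5}$    6854      NA48-99 [17]
$K_L \to \mu^+\mu^-\gamma$        $(3.66 \pm 0.04 \pm 0.07) \times 10^{-7}$             9105      KTeV-00 [18]
$K_L \to e^+e^-e^+e^-$            $(3.77 \pm 0.18 \pm 0.13 \pm 0.21) \times 10^{-8}$    436       KTeV-00 [18]
$K_L \to \mu^+\mu^-e^+e^-$        $(2.50 \pm 0.41 \pm 0.15) \times 10^{-9}$             38        KTeV-00 [18]
$K_L \to \mu^+\mu^-\mu^+\mu^-$    no limit                                              -         -
$K_S \to \mu^+\mu^-$              $< 3.2 \times 10^{-7}$                                0         CERN-73 [19]
$K_S \to e^+e^-$                  $< 1.4 \times 10^{-7}$                                0         CPLEAR [20]
$K_L \to e^+e^-\gamma\gamma$      $(5.84 \pm 0.15 \pm 0.32) \times 10^{-7}$             1543      KTeV-00 [21]
$K_L \to \mu^+\mu^-\gamma\gamma$  $(1.42^{+1.0}_{-0.8} \pm 0.10) \times 10^{-9}$        4         KTeV-00 [22]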
of two lepton pairs, are sensitive to the form factor in the region where both photons are off the mass shell. The largest sample of $e^+e^-e^+e^-$ decays reported to date is from KTeV, with 436 events in the 1997 data sample. KTeV has also reported seeing 38 $e^+e^-\mu^+\mu^-$ events.

Two decay modes related to $K_L \to \gamma\gamma$, though not especially interesting in themselves, have significant implications for the attempt to observe direct CP violation in $K_L \to \pi^0 e^+e^-$ and $K_L \to \pi^0\mu^+\mu^-$. These are the radiative Dalitz decays $K_L \to e^+e^-\gamma\gamma$ and $K_L \to \mu^+\mu^-\gamma\gamma$. With a typical infrared cutoff of 5 MeV for the photon energies in the kaon center of mass, the electron mode, $K_L \to e^+e^-\gamma\gamma$, has a branching ratio of about $6 \times 10^{-7}$, five orders of magnitude higher than the expected rate for $K_L \to \pi^0 e^+e^-$. Moreover, the peak of the $\gamma\gamma$ invariant mass spectrum in observed events is near the $\pi^0$ mass. With a good calorimeter, experiments can limit the $\pi^0$ mass range to a few MeV, but the number of $e^+e^-\gamma\gamma$ events in this range still swamps the expected $\pi^0 e^+e^-$ signal. KTeV has identified a sample of over 1500 $e^+e^-\gamma\gamma$ events and has verified that their kinematic distributions are generally in agreement with those predicted.

The decay $K_L \to \mu^+\mu^-\gamma\gamma$ is likewise a serious background to the measurement of $K_L \to \pi^0\mu^+\mu^-$. The absolute rate for this decay is much less than the rate for the corresponding electron mode (see Table 1). Unfortunately, the part of the phase space where this decay can be a background to $K_L \to \pi^0\mu^+\mu^-$,
after the $M_{\gamma\gamma}$ and other kinematic cuts, is not particularly suppressed. Thus, the $K_L \to \pi^0\mu^+\mu^-$ mode does not allow experiments to eliminate the radiative Dalitz background.

$K \to \pi\pi\gamma$
The radiative $K_{\pi 2}$ decays, $K^+ \to \pi^+\pi^0\gamma$, $K_L \to \pi^+\pi^-\gamma$, and $K_S \to \pi^+\pi^-\gamma$, have two contributions. In the inner bremsstrahlung (IB) process, a photon is radiated from one of the charged particles. In the direct emission (DE) process, the photon is radiated from an intermediate state. In the neutral kaon decay, $K_L \to \pi^+\pi^-\gamma$, the DE part of the decay can be either CP-violating or CP-conserving, but experiments show that the DE decay is consistent with a CP-conserving M1 radiative transition. There is also a CP-odd interference term. These CP-odd and CP-even terms manifest themselves in a CP-violating asymmetry in the polarization of the photon, which is not observable in these experiments. However, a CP- and T-odd angular asymmetry is expected in the related decay $K_L \to \pi^+\pi^-e^+e^-$, in which the photon internally converts to an $e^+e^-$ pair, since the angular distribution of the leptons preserves information about the photon polarization. This effect, predicted in 1992 by Sehgal & Wanninger [23], is an asymmetry in the distribution of the angle $\phi$ [...]
It is important to note that the raw asymmetry may be significantly different from the acceptance-corrected asymmetry. This occurs not because of any asymmetry in the detectors but because the asymmetry varies across the phase space for the $\pi^+\pi^-e^+e^-$ final state, and in general, acceptance is better in regions of the phase space where the asymmetry is large. The raw asymmetries observed by the two experiments are therefore not directly comparable. Nevertheless, they seem to agree. NA48 finds $A_\phi(\mathrm{raw}) = (20 \pm 5)\%$ while KTeV measures $A_\phi(\mathrm{raw}) = (23.3 \pm 2.3)\%$.
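As a rough illustration of what "seem to agree" means here, the two raw asymmetries can be compared in units of their combined uncertainty. This is only a sketch: it assumes the quoted errors are independent and Gaussian, and it ignores the caveat above that the raw values depend on each detector's acceptance. The function name `pull` is introduced here for illustration.

```python
import math

def pull(a, sigma_a, b, sigma_b):
    """Difference of two independent measurements in units of its uncertainty."""
    return (a - b) / math.hypot(sigma_a, sigma_b)

# Raw asymmetries quoted in the text, in percent.
na48_value, na48_error = 20.0, 5.0
ktev_value, ktev_error = 23.3, 2.3

n_sigma = pull(ktev_value, ktev_error, na48_value, na48_error)
print(f"KTeV - NA48 raw asymmetry difference: {n_sigma:.2f} sigma")
```

Under these assumptions the two numbers differ by well under one standard deviation.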
So far, only KTeV has reported an acceptance-corrected asymmetry [28]. An important ingredient in making the acceptance correction is the form factor in the M1 DE amplitude, which is extracted by fitting the $M_{\pi\pi}$ and other kinematic distributions. Using the fitted form factor, the acceptance-corrected average asymmetry is found to be $A_\phi(\mathrm{corrected}) = (13.6 \pm 2.5 \pm 1.2)\%$, in excellent agreement with the theoretical prediction.

3.2 Using Rare Kaon Decays to Measure Standard Model Parameters
The unitarity triangle is most readily expressed for the kaon system as follows:

$$V_{us}^* V_{ud} + V_{cs}^* V_{cd} + V_{ts}^* V_{td} = 0 \qquad (2)$$

or

$$\lambda_u + \lambda_c + \lambda_t = 0,$$

with the three vectors $\lambda_i = V_{is}^* V_{id}$ forming a very elongated triangle in the complex plane, as shown in Figure 10. The first vector, $\lambda_u = V_{us}^* V_{ud}$, is well known. The height will be measured by $K_L \to \pi^0\nu\bar\nu$ and the third vector, $\lambda_t = V_{ts}^* V_{td}$, will be measured by the decay $K^+ \to \pi^+\nu\bar\nu$. The theoretical ambiguities in interpreting all of these measurements are very small. It may also be possible to extract additional constraints on the height of the triangle from $K_L \to \pi^0\ell^+\ell^-$ decays and on $\mathrm{Re}(\lambda_t)$ from $K_L \to \mu^+\mu^-$ decays.

Figure 10: Unitarity triangle for the K system (not to scale). Labels recoverable from the figure: $(0, \mathrm{Im}\,\lambda_t)$, $(\mathrm{Re}\,\lambda_t, 0)$, $(-\lambda_u + \mathrm{Re}\,\lambda_t, 0)$, $\lambda(1 - \lambda^2/2 - \lambda^4/8)$, $B(K^+ \to \pi^+\nu\bar\nu)$, and $K \to \pi e\nu$.
The decay $K_L \to \mu^+\mu^-$ is dominated by the process $K_L \to \gamma\gamma$ with the two real photons converting to a $\mu^+\mu^-$ pair. This contribution can be exactly calculated in QED [33] based on a measurement of the $K_L \to \gamma\gamma$ branching ratio.
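The size of this two-real-photon contribution can be sketched numerically. The expression used below is not given in the text; it is the standard QED result for the absorptive part, $\Gamma_{\mathrm{abs}}(K_L \to \ell^+\ell^-)/\Gamma(K_L \to \gamma\gamma) = \alpha^2 (m_\ell/M_K)^2 \, \frac{1}{2\beta} \left[\ln\frac{1+\beta}{1-\beta}\right]^2$ with $\beta = \sqrt{1 - 4m_\ell^2/M_K^2}$, supplied here as an assumption. The particle masses and $\alpha$ are rounded reference values, and $B(K_L \to \gamma\gamma)$ is the Table 1 entry.

```python
import math

ALPHA = 1 / 137.036   # fine-structure constant (rounded reference value)
M_MU = 105.658        # muon mass in MeV (rounded reference value)
M_K = 497.6           # neutral kaon mass in MeV (rounded reference value)

def absorptive_ratio(m_lepton, m_kaon):
    """Gamma_abs(K_L -> l+ l-) / Gamma(K_L -> gamma gamma) from two real photons."""
    r = (m_lepton / m_kaon) ** 2
    beta = math.sqrt(1.0 - 4.0 * r)
    log_term = math.log((1.0 + beta) / (1.0 - beta))
    return ALPHA**2 * r * log_term**2 / (2.0 * beta)

ratio = absorptive_ratio(M_MU, M_K)
b_gamma_gamma = 5.92e-4               # B(K_L -> gamma gamma) from Table 1
unitarity_bound = ratio * b_gamma_gamma

print(f"absorptive ratio      : {ratio:.3e}")
print(f"bound on B(KL->mu mu) : {unitarity_bound:.2e}")
```

With these inputs the bound comes out near $7.1 \times 10^{-9}$, which leaves little room between the two-photon contribution and a measured branching ratio in the low $7 \times 10^{-9}$ range.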
However, there is also a long-distance dispersive contribution, through off-shell photons. This contribution needs additional input from ChPT [34,35], which may be aided by new, improved measurements of the decays $K_L \to e^+e^-\gamma$, $K_L \to \mu^+\mu^-\gamma$, $K_L \to e^+e^-e^+e^-$ and $K_L \to \mu^+\mu^-e^+e^-$. Most interesting is the short-distance contribution which proceeds through internal quark loops, dominated by the top quark. This contribution is sensitive to the real part of the poorly known CKM matrix element $V_{td}$ or equivalently to $\rho$ [11,37]. This mode has now been measured very accurately [15] by the BNL-E871 collaboration, who have reported $B(K_L \to \mu^+\mu^-) = (7.18 \pm 0.17) \times 10^{-9}$. This measured ratio is only slightly above the unitarity bound from the on-shell two-photon contribution and consequently limits possible short-distance contributions.

Unlike $K_L \to \mu^+\mu^-$, which is predominantly mediated by two real photons, the decay $K_L \to e^+e^-$ proceeds primarily via two off-shell photons. The relative contribution from short-distance top loops is significantly smaller than in $K_L \to \mu^+\mu^-$. However, the recent observation by E871 [16] of four events, with a branching ratio of $B(K_L \to e^+e^-) = (8.7^{+5.7}_{-4.1}) \times 10^{-12}$, is consistent with ChPT predictions [35,36] and is the smallest branching ratio ever measured for any elementary particle decay.

$K_L \to \pi^0\ell^+\ell^-$

The decays $K_L \to \pi^0 e^+e^-$ and $K_L \to \pi^0\mu^+\mu^-$ can proceed via $s \to d\gamma$ and $s \to dZ$ processes described by electromagnetic penguin operators in the Standard Model. These processes are calculable with high precision since they are dominated by top-quark exchange. If these were the only contributions to this decay, the branching ratio for $K_L \to \pi^0 e^+e^-$ would be cleanly related to CKM matrix elements. With the current best-fit value of $1.38 \times 10^{-4}$ for $\mathrm{Im}(\lambda_t)$ [39], one would expect from short-distance effects a branching ratio of about $5 \times 10^{-12}$. Unfortunately, the decay $K_L \to \pi^0 e^+e^-$ can occur in two other ways.
First, there is an indirect-CP-violating contribution from the CP-even $K_1^0$ component of $K_L$. There is also a CP-conserving amplitude involving a $\pi^0\gamma^*\gamma^*$ intermediate state, from which the virtual photons materialize into an $e^+e^-$ pair. Given these three contributions, it will be difficult to extract CKM matrix parameters from even a precision measurement of $K_L \to \pi^0 e^+e^-$.

A still more formidable roadblock to progress on these modes was first pointed out by Greenlee [40]: the radiative Dalitz decay $K_L \to e^+e^-\gamma\gamma$ has a rather large branching ratio ($\sim 6 \times 10^{-7}$). The two photons may have an invariant mass near that of the $\pi^0$, so that the final state is indistinguishable from the $\pi^0 e^+e^-$ mode. Two strategies can be used to deal with this background. First, a high-precision calorimeter can be used to minimize the size of the region in $M_{\gamma\gamma}$ where confusion can occur; second, the difference in the kinematic distributions expected in the radiative Dalitz decay can be used to remove most of the background events, at a cost of some acceptance for the signal mode $K_L \to \pi^0 e^+e^-$. These techniques reduce, but cannot eliminate, this background, so that the present searches for $K_L \to \pi^0 e^+e^-$ are background-limited at the level of $10^{-10}$.

The most recent limit on $K_L \to \pi^0 e^+e^-$ comes from the KTeV experiment [41]. The analysis selects on the direction of the photons with respect to the electrons to minimize the background from radiative Dalitz decays while preserving as much sensitivity as possible. KTeV found two events that passed all cuts, compared with an expected background level of $1.1 \pm 0.4$ events. This finding leads to an upper limit $B(K_L \to \pi^0 e^+e^-) < 5.1 \times 10^{-10}$ (90% CL). A similar analysis of the related muon mode by KTeV resulted in a slightly smaller upper limit [45], $B(K_L \to \pi^0\mu^+\mu^-) < 3.8 \times 10^{-10}$ (90% CL).

$K \to \pi\nu\bar\nu$
The decay modes $K^+ \to \pi^+\nu\bar\nu$ and $K_L \to \pi^0\nu\bar\nu$ are the golden modes for determining the CKM parameters $\rho$ and $\eta$. The $K \to \pi\nu\bar\nu$ decays are sensitive to the magnitude and imaginary part of $V_{td}$. From these two modes, the unitarity triangle can be completely determined. The theoretical uncertainty in the branching ratio $B(K^+ \to \pi^+\nu\bar\nu)$ is roughly 7%, while that for $K_L \to \pi^0\nu\bar\nu$ is even smaller, about 2%. In terms of the Wolfenstein parameters, and based on our current understanding of standard-model parameters, the branching ratios are predicted to be

$$B(K_L \to \pi^0\nu\bar\nu) = 4.08 \times 10^{-10}\, A^4 \eta^2 = (3.1 \pm 1.3) \times 10^{-11} \qquad (3)$$

$$B(K^+ \to \pi^+\nu\bar\nu) = 8.88 \times 10^{-11}\, A^4 \left[(\rho_0 - \rho)^2 + (\sigma\eta)^2\right] = (8.2 \pm 3.2) \times 10^{-11}, \qquad (4)$$

where $\sigma = (1 - \lambda^2/2)^{-2}$ and $\rho_0 = 1.4$ [46]. The decay amplitude $K_L \to \pi^0\nu\bar\nu$ is direct-CP-violating, and offers the best opportunity for measuring the Jarlskog invariant $J_{CP}$.

$K^+ \to \pi^+\nu\bar\nu$

Although the decay $K^+ \to \pi^+\nu\bar\nu$ is attractive theoretically, it is quite challenging experimentally. Not only is the branching ratio expected to be less than $10^{-10}$, it is a three-body decay with two undetectable neutrinos. The key to
a convincing measurement of this decay is a thorough understanding of the background at a level of $10^{-11}$. The E787 experiment previously reported results of the analysis of the 1995 data sample [47]. This experiment employs two guiding principles for determining the background. First, the background is measured from the same data as the $K^+ \to \pi^+\nu\bar\nu$ signal. In this manner, hardware problems, changes in rates, and changes in detector performance are automatically taken into account. Second, two independent sets of selection criteria are devised, with large rejection (e.g. typical rejections are $R > 100$) for a given background source. This allows a measurement of background levels at a sensitivity $R$ times greater than the signal by reversing one set of selection criteria. The three major sources of background, $K^+ \to \mu^+\nu_\mu$, $K^+ \to \pi^+\pi^0$, and pions from the beam, are all measured, with a total background of $0.08 \pm 0.02$ events from the analysis of the data collected during 1995-1997. One clean $K^+ \to \pi^+\nu\bar\nu$ event was found (see Figure 11), and based on this one event [48], which was also seen in the earlier data, the branching ratio is $B(K^+ \to \pi^+\nu\bar\nu) = 1.5^{+3.4}_{-1.2} \times 10^{-10}$.

Figure 11: E787: Final data sample collected in 1995-1997 after all cuts. One clean $K^+ \to \pi^+\nu\bar\nu$ event is seen in the signal box. The remaining events are $K^+ \to \pi^+\pi^0$ background. (Horizontal axis: E in MeV.)

The E787 experiment, with all data recorded, should reach a factor of two higher sensitivity, to the level of the standard-model expectation for $K^+ \to \pi^+\nu\bar\nu$. A new experiment, E949, is under construction at BNL and will run from 2001 through 2003. E949 should observe 10 standard-model events in a two-year run. The background is well understood and, based on E787 measurements, is expected to be 10% of the standard-model signal. A proposal for an experiment which promises a further factor of 10 improvement has been prepared at FNAL. The CKM experiment (E905) is designed to collect 100 standard-model
events, with an estimated background of approximately 10% of the signal, in a two-year run starting in about 2005. This experiment will use a new technique, with $K^+$ decay-in-flight and momentum/velocity spectrometers.

$K_L \to \pi^0\nu\bar\nu$

The decay $K_L \to \pi^0\nu\bar\nu$ is even cleaner theoretically and is purely direct-CP-violating. Unfortunately, it is even more difficult experimentally, because all particles involved in the initial and final states are neutral. Presently, the best limit on $K_L \to \pi^0\nu\bar\nu$ is derived in a model-independent way [49] from the E787 measurement of $K^+ \to \pi^+\nu\bar\nu$:

$$B(K_L \to \pi^0\nu\bar\nu) < 4.4 \times B(K^+ \to \pi^+\nu\bar\nu) < 2.6 \times 10^{-9} \quad (90\%\ \mathrm{CL}). \qquad (5)$$
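The arithmetic behind Equation 5 can be unpacked in a few lines. This sketch only rearranges numbers already quoted in the text: the factor 4.4, the resulting limit of $2.6 \times 10^{-9}$, and the standard-model central value $3.1 \times 10^{-11}$ from Equation 3. The variable names are introduced here for illustration, and the implied charged-mode limit is a derived number, not one quoted by E787 in this text.

```python
RATIO_BOUND = 4.4        # model-independent factor relating the two modes (Eq. 5)
KL_LIMIT = 2.6e-9        # 90%-CL limit on B(K_L -> pi0 nu nubar) from Eq. 5
SM_KL_CENTRAL = 3.1e-11  # standard-model central value from Eq. 3

# Upper limit on B(K+ -> pi+ nu nubar) that reproduces the quoted K_L limit.
implied_kplus_limit = KL_LIMIT / RATIO_BOUND

# How far the indirect limit still is from the standard-model expectation.
distance_from_sm = KL_LIMIT / SM_KL_CENTRAL

print(f"implied 90%-CL limit on B(K+ -> pi+ nu nubar): {implied_kplus_limit:.1e}")
print(f"indirect K_L limit / SM expectation         : {distance_from_sm:.0f}")
```

The indirect limit is therefore still almost two orders of magnitude above the expected rate, which is why a direct observation is the stated goal.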
The goal is to observe this mode directly in order to extract a second of the CKM matrix parameters. The $K_L \to \pi^0\nu\bar\nu$ decay is identified by two photons from the common decay $\pi^0 \to \gamma\gamma$. $K_L$ decays such as $K_L \to \pi^0\pi^0\pi^0$ and $K_L \to \pi^0\pi^0$ can easily produce background if all but two of the final-state photons are unobserved. Background can also arise from $\pi^0$'s produced by $\Lambda$ or $\Xi^0$ hyperon decays to final states such as $n\pi^0$ with the neutron undetected, if the beam contains large numbers of hyperons. An excellent system of photon veto detectors can substantially reduce these backgrounds, but additional kinematic cuts will also be necessary.

In center-of-mass experiments, a simple invariant mass cut can be made to reject $K_L \to \pi^0\pi^0$ and $K_L \to \pi^0\pi^0\pi^0$ backgrounds. In experiments like KTeV where the kaon momentum is unknown, one can exploit the fact that the neutrinos recoiling against the $\pi^0$ in $K_L \to \pi^0\nu\bar\nu$ are massless, so that the transverse momentum of the $\pi^0$ extends to larger values than are possible in the background modes, as shown in Figure 12.

KTeV does not measure the kaon momentum; in order to determine the transverse momentum of the $\pi^0$, the decay vertex must be known. The longitudinal position of the vertex can be determined from the invariant mass constraint, but the transverse position can only be known within the size of the kaon beam. Thus a narrow "pencil" beam is needed, which limits the available intensity. KTeV tried this approach in a one-day test run and observed one background event, probably from a neutron interaction. From this special run, a 90%-CL limit [50] of $B(K_L \to \pi^0\nu\bar\nu) < 1.6 \times 10^{-6}$ was determined.

An alternative is to use the rarer $\pi^0 \to e^+e^-\gamma$ decay. This is a factor of 80 less sensitive but has several advantages. First, the location of the decay vertex can be determined from the charged tracks, so that a high-intensity, wide neutral beam can be used. This allowed the KTeV data for this mode to be taken in the
standard configuration with standard triggers. Second, this approach allows determination of the transverse momentum with better precision, reducing the background level. The $P_T$ distribution of $\pi^0$ events passing all other cuts can be seen in Figure 12. The backgrounds nearest the search region come from the decays $\Lambda \to n\pi^0$ and $\Xi^0 \to \Lambda\pi^0$. In this search using the full 1997 KTeV data set, with an expected background of about 0.12 events, no events were seen, and at the 90% confidence level [51], $B(K_L \to \pi^0\nu\bar\nu) < 5.9 \times 10^{-7}$, still more than four orders of magnitude from the standard-model prediction.

Figure 12: KTeV: Final $K_L \to \pi^0\nu\bar\nu$ data sample collected during 1996-1997 after all cuts. No $K_L \to \pi^0\nu\bar\nu$ events are seen above $P_T \sim 160$ MeV/c.

3.3 Lepton-Flavor Violation as a Probe of New Physics
All experimental evidence to date supports the exact conservation of an additive quantum number for each family of charged leptons. If neutrino masses are nonzero, some very tiny mixing effects could permit lepton-flavor-violating decays in the standard model, but they would occur at unobservably small levels, many orders of magnitude beyond the present experimental sensitivity. Any observation of a signal for the decays $K_L \to \mu e$, $K^+ \to \pi^+\mu^+e^-$, or $K_L \to \pi^0\mu e$ would thus be conclusive evidence for new physics beyond the standard model [52].

Although this lepton-flavor-number conservation law appears to be respected in the standard model, there is no fundamental reason or underlying symmetry to explain why this should be so. Indeed, many possible extensions to the Standard Model predict new interactions involving heavy intermediate gauge bosons that could mediate the otherwise forbidden lepton-flavor-violating decays. Some of the specific models that lead to such decays include compositeness of quarks and leptons, left-right symmetric models, technicolor, some supersymmetric models, unified theories with horizontal gauge bosons, leptoquarks, and string theories.

It is important to look for both $K_L \to \mu e$ and the modes with an extra pion, $K^+ \to \pi^+\mu^+e^-$ and $K_L \to \pi^0\mu e$, because the $K_L \to \mu e$ decay is sensitive to pseudoscalar and axial vector coupling, whereas the other modes are sensitive to scalar or vector couplings. In both cases, the excellent sensitivity of these experiments probes mass scales that are very large. The sensitivity of the experiments to new interactions depends on the coupling constants involved. If the new coupling for an intermediate vector boson of mass $M_X$ is $g_X$, then the lower bound on $M_X$ implied by an upper limit on $B(K_L \to \mu e)$ is given in terms of the electroweak coupling $g$ by

$$M_X \approx 200\ \mathrm{TeV}/c^2 \times \frac{g_X}{g} \times \left[\frac{10^{-12}}{B(K_L \to \mu e)}\right]^{1/4} \qquad (6)$$
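Equation 6 is easy to evaluate numerically, and doing so makes the weak fourth-root dependence on the branching-ratio limit explicit. The sketch below assumes Equation 6 reads $M_X \approx 200\ \mathrm{TeV}/c^2 \times (g_X/g) \times [10^{-12}/B]^{1/4}$, with the prefactor and reference branching ratio taken as printed. The function name and its keyword arguments are hypothetical, and the example limit of $4.7 \times 10^{-12}$ is the E871 value quoted in this section.

```python
def mass_bound_tev(branching_limit, coupling_ratio=1.0,
                   prefactor_tev=200.0, reference_b=1e-12):
    """Lower bound on M_X (TeV/c^2) from an upper limit on B(K_L -> mu e), per Eq. 6."""
    return prefactor_tev * coupling_ratio * (reference_b / branching_limit) ** 0.25

# Bound for electroweak-strength coupling (g_X = g) and the E871 limit.
bound_e871 = mass_bound_tev(4.7e-12)

# A 16-fold better limit only doubles the mass reach.
gain = mass_bound_tev(4.7e-12 / 16) / bound_e871

print(f"M_X bound for B < 4.7e-12: {bound_e871:.0f} TeV/c^2")
print(f"gain from a 16x better limit: {gain:.2f}x")
```

With the prefactor as printed, this gives roughly 136 TeV, the same scale as the figure of about 150 TeV quoted in the text for the E871 limit.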
$K_L \to \mu e$

The best result on $K_L \to \mu e$ comes from BNL E871 [53], and was published in 1998. This experiment featured two analysis magnets for redundant momentum measurements. Electrons and muons were each identified in two different ways to reduce background from particle misidentification. The analysis cuts were then chosen to minimize the remaining backgrounds (involving either accidental coincidences or scattered electrons) while maintaining as much sensitivity as possible to the signal. A single-event sensitivity of about $2 \times 10^{-12}$ was achieved, with an expected background of 0.1 event. Having found no events in their signal box, E871 set a 90%-CL upper limit of $B(K_L \to \mu e) < 4.7 \times 10^{-12}$, the smallest upper limit set to date on any kaon decay mode. For an exotic boson with electroweak coupling strength, Equation 6 then implies a lower bound on its mass of 150 TeV. There are no near-term plans to pursue this decay further, as the background from $K_L \to \pi^\pm e^\mp\nu_e$ with a muon decay and a scattered electron is difficult to reduce below a level of $10^{-13}$, which is just beyond the E871 sensitivity.

$K^+ \to \pi^+\mu^+e^-$
E865 at BNL was designed to search for the lepton-flavor-violating decay $K^+ \to \pi^+\mu^+e^-$. This decay, with an extra pion in the final state, is sensitive to exotic gauge bosons with different quantum numbers from those that E871 could detect. The experiment uses $K^+$ decays in flight, and the detector concept is similar to that of E871, with redundant particle identification by two Cerenkov detectors and an electromagnetic calorimeter, and a muon range stack. The limit on this mode from the 1996 run [55], with no events above a likelihood to be $K^+ \to \pi^+\mu^+e^-$ ($L_{\pi\mu e}$) of 20%, is $B(K^+ \to \pi^+\mu^+e^-) < 3.9 \times 10^{-11}$. From the combined results from E777 and the E865 runs in 1995 and 1996, a limit of $B(K^+ \to \pi^+\mu^+e^-) < 2.8 \times 10^{-11}$ is obtained. E865 is already close to being limited by background from accidentals; there are no plans to continue with this search. The E865 limit implies a lower bound of several tens of TeV on exotic bosons with electroweak coupling, depending on the exact model used.

Figure 13: E871: Final data sample with the reconstructed mass $M_{\mu e}$ (MeV/c$^2$) vs the square of the transverse momentum relative to the kaon direction. After all cuts there are no events in the signal region. The exclusion box (the larger box enclosing the signal box) was used to set cuts in an unbiased way on data far from the signal region. The shape of the signal box was optimized to maximize signal/background.
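The zero-event limits in this section follow the usual counting logic, which can be sketched for the E871 case shown in Figure 13. The sketch assumes negligible background and no systematic uncertainty, so that the 90%-CL upper limit on the Poisson mean for zero observed events is $-\ln(0.10)$; the function name is introduced here for illustration, and the single-event sensitivity is the rounded value of about $2 \times 10^{-12}$ quoted earlier for E871.

```python
import math

def poisson_upper_limit_zero_events(confidence_level=0.90):
    """Upper limit on a Poisson mean when zero events are observed (no background)."""
    return -math.log(1.0 - confidence_level)

n_up = poisson_upper_limit_zero_events()     # about 2.30 events at 90% CL
single_event_sensitivity = 2e-12             # rounded E871 value from the text
branching_limit = n_up * single_event_sensitivity

print(f"90%-CL limit on the mean for 0 events: {n_up:.2f}")
print(f"implied branching-ratio limit        : {branching_limit:.1e}")
```

The result, about $4.6 \times 10^{-12}$, is close to the published $4.7 \times 10^{-12}$; the small difference is consistent with the rounding of the sensitivity and the simplifications assumed here.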
$K_L \to \pi^0\mu e$

In addition to the search for $K^+ \to \pi^+\mu^+e^-$ performed by BNL E865, a search for the corresponding neutral mode $K_L \to \pi^0\mu e$ has recently been carried out by KTeV at FNAL. The main background concern was the common decay
$K_L \to \pi^\pm e^\mp\nu_e$, in which the pion is misidentified as a muon, with $0.6 \pm 0.6$ expected events. After all cuts, two background events were observed in the signal box, and the preliminary 90%-CL limit on $K_L \to \pi^0\mu e$ from KTeV [56] is $B(K_L \to \pi^0\mu e) < 4.4 \times 10^{-10}$.

4 Conclusions and Future Prospects
Tremendous progress has been made over the past decade in measuring $\mathrm{Re}(\epsilon'/\epsilon)$. Both KTeV and NA48 have made precise measurements, and increased precision can be expected as both experiments finish analysing data already collected. The two experiments unfortunately do not agree with each other very well, but the most serious difficulties lie on the theoretical side at the moment.

In the area of rare decay searches, the remarkable sensitivities of rare kaon decay experiments in setting limits on lepton flavor violation have constrained many extensions of the standard model. In addition, the observation of $K^+ \to \pi^+\nu\bar\nu$ may soon permit measurements of the unitarity triangle completely within the kaon system. An accurate determination of the CKM matrix element $V_{td}$ could well come from the generation of experiments that is starting now. Comparison with the B meson system will then overconstrain the unitarity triangle and test the standard-model explanation of CP violation.

The primary focus for the future of rare kaon decays is on the measurement of the so-called golden modes, $K_L \to \pi^0\nu\bar\nu$ and $K^+ \to \pi^+\nu\bar\nu$, at sensitivities sufficient for observation of 100 events. Major initiatives with this goal are underway at BNL, FNAL and KEK. At the same time the study of a number of medium-rare and radiative modes will continue to be pursued, both as by-products and in dedicated experiments.

References

1. J.H. Christenson, J.W. Cronin, V.L. Fitch, and R. Turlay, Phys. Rev. Lett. 13, 148 (1964).
2. L. Wolfenstein, Phys. Rev. Lett. 13, 562 (1964).
3. A. Alavi-Harati et al., Phys. Rev. Lett. 83, 22 (1999).
4. V. Fanti et al., Phys. Lett. B465, 335 (1999); G. Graziani, Proc. XXXV Rencontres de Moriond, Les Arcs, France, March 2000, J. Tran Thanh Van, ed., Paris: Ed. Frontieres (2000).
5. M. Antonelli et al., Nucl. Phys. B Proc. Suppl. 54, 14 (1997); S. Dell'Agnello et al., Nucl. Phys. B Proc. Suppl. 54, 57 (1997); V. Elia et al., Nucl. Phys. B Proc. Suppl. 54, 66 (1997); S. Spagnolo et al., Nucl. Phys. B Proc. Suppl. 70, (1997); F. Lacava et al., Nucl. Phys. B Proc. Suppl. 54, 327 (1997).
239
6. L.Littenberg and G.Valencia, Annu. Rev. Nucl. Part. Sci. 43, 729 (1993) 7. S.H.Kettell and A.R.Barker, Annu. Rev. Nucl. Part. Sci. 50 (2000). 8. J.Hagelin and L.Littenberg, Prog. Part. Nucl. Phys. 23, 1 (1989); J.Ritchie and S.Wojcicki, Rev. Mod. Phys. 65, 1149 (1993); P.Buchholz and B.Renk, Prog. Part. Nucl. Phys. 309, 253 (1997). 9. G.Buchalla, A.J.Buras, and M.E.Lautenbacher, Rev. Mod. Phys. 68, 1125 (1996). 10. A.J.Buras and R.Fleischer, hep-ph/9704376. 11. A.J.Buras, hep-ph/9806471. 12. G.D'Ambrosio and G.Isidori, Int. J. Mod. Phys. A 1 3 , 1 (1998). 13. V.Kekelidze et al., Proc. XXX Int. Conf. High Energy Phys., Osaka, Japan, July 2000 Singapore: World Sci. (2001). 14. H.Burkhardt et al., Phys. Lett. B199, 139 (1987). 15. D.Ambrose et al., Phys. Rev. Lett. 84, 1389 (2000). 16. D.Ambrose et al, Phys. Rev. Lett. 81, 4309 (1998). 17. V.Fanti et al, Phys. Lett. B458, 553 (1999). 18. B.Quinn, Proc. Meet. DPF, Columbus OH, August 2000 Singapore: World Sci. (2001). 19. S.Gjesdal et al, Phys. Lett. B 4 4 , 217 (1973). 20. A.Angelopoulos et al, Phys. Lett. B 4 1 3 , 232 (1997). 21. A.Alavi-Harati et al.,, hep-ex/0010059 (2000). 22. A.Alavi-Harati et al, Phys. Rev. D 62, 112001 (2000). 23. L.M.Sehgal and M.Wanninger, Phys. Rev. D46, 1035 (1992); (E) Phys. Rev.D46, 5209 (1992). 24. P.Heiliger and L.M.Sehgal, Phys. Rev. D48, 4146 (1993); J.K.Elwood, M.B.Wise, and M.J.Savage, Phys. Rev. D 5 2 , 5095 (1995); Phys. Rev. D 5 3 , 2855 (E) (1996); J.K.Elwood, M.B.Wise, M.J.Savage, and J.W.Walden, Phys. Rev. D 5 3 , 4078 (1996); L.M.Sehgal and J.van Leusen, Phys. Rev. Lett. 83, 4933 (1999). 25. P.Lubrano, Proc. Heavy Flavours 8, Southampton, UK, July 1999 P.Dauncey and C.Sachrajda, eds., J. High Energy Phys. (1999). 26. J.Adams et al., Phys. Rev. Lett. 80, 4123 (1998). 27. A.R.Barker, Proc. Heavy Flavours 8, Southampton, UK, July 1999 P.Dauncey and C.Sachrajda, eds., J. High Energy Phys. (1999). K.Senyo, Proc. Int. EuroPhys. Conf. 
High Energy Phys., Tampere, Finland, July 1999 K.Huitu et al.„ eds., Bristol, UK: IOP-Publishing (2000). 28. A.Alavi-Harati et al., Phys. Rev. Lett. 84, 408 (2000) 29. M.Contalbrigo, Proc. XXXV Rencontres de Moriond, Les Arcs, France,
240
March 2000 J.Tran Thanh Van, ed., Paris: Ed. Frontieres (2000). 30. A.Alavi-Harati et al.„ hep-ex/0008045 (2000). 31. S.Adler et al, Phys. Rev. Lett., hep-ex/0007021 (2000). 32. G.D.Barr et al, Phys. Lett. B328, 528 (1994); M.Zeller, Kaon Physics, Proc. Chicago Conf. Kaon Phys., J.L.Rosner and B.Winstein, eds., June 1999, Chicago: Univ. Chicago Press (2000). 33. L.M.Sehgal, Phys. Rev. 183, 1511 (1969). 34. G.D'Ambrosio, G.Isidori G, and J.Portoles, Phys. Lett. B423, 385 (1998). 35. D.Gomez-Dumm and A.Pich, Phys. Rev. Lett. 80, 4633 (1998). 36. G.Valencia, Nucl. Phys. B517, 339 (1998). 37. C.Q.Geng and J.N.Ng, Phys. Rev. D 41, 2351 (1990). 38. T.Inami and C.S.Lim, Prog. Theor. Phys. 65, 297 (1981). 39. S.Bosch et al, Nucl. Phys. B565, 3 (2000). 40. H.Greenlee, Phys. Rev. D42, 3724 (1990). 41. A.Alavi-Harati et al, hep-ex/0009030 (2000). 42. A.Barker et al, Phys. Rev. D 41, 3546 (1990). 43. K.E.Ohl et al, Phys. Rev. Lett. 64, 2755 (1990). 44. D.A.Harris, et al, Phys. Rev. Lett. 71, 3918 (1993). 45. A.Alavi-Harati et al, Phys. Rev. Lett. 84, 5279 (2000). 46. G.Buchalla and A.Buras, Nucl. Phys. B548, 309 (1999). 47. S.Adler et al, Phys. Rev. Lett. 79, 2204 (1997). 48. S.Adler et al, Phys. Rev. Lett. 84, 3768 (2000); S.H.Kettell, Proc. 3rd Int. Conf. B Phys. and CP Violation, Taipei, Taiwan, Dec. 1999 H.Y.Cheng and W.S.Hou, eds., World Sci. (2000). 49. Y.Grossman and Y.Nir, Phys. Lett. B398, 163 (1997). 50. J.Adams et al., Phys. Lett. B447, 240 (1999). 51. A.Alavi-Harati et al, Phys. Rev. D 61, 072006 (2000). 52. F.Wilczek and A.Zee, Phys. Rev. Lett. 42, 421 (1979); R.Cahn and H.Harari, Nucl. Phys. B176, 135 (1980). 53. D.Ambrose et al., Phys. Rev. Lett. 81, 5734 (1998). 54. D.R.Bergman, A search for the decay K+ -^ir+u.+e~~. PhD thesis. Yale Univ. (1998); S.Pislak, Experiment E865 at BNL: A Search for the Decay K+^n+n+e'. PhD thesis. Univ. Zurich (1998). 55. R.Appel et al, Phys. Rev. Lett. 85, 2450 (2000). 56. A.Bellavance, Proc. Meet. 
DPF, Columbus OH, August 2000 Singapore: World Sci. (2001).
THE STATUS OF MIXING IN THE CHARM SECTOR

J. P. CUMALAT
Department of Physics, University of Colorado, Boulder, CO 80309-0390, USA
E-mail: [email protected]

During the past year there has been significant progress in the study of charm mixing. The CLEO Collaboration has reported new results from direct (wrong sign) searches, and both CLEO and FOCUS have reported results from lifetime difference ($\Delta\Gamma$) searches. It seems that either an observation of mixing or limits on $r_{\rm mix}$ of $5 \times 10^{-5}$ (or better!) will soon be available.
The observation of $K^0$-$\bar K^0$ oscillations was crucial in the development of the standard model. Similarly, the observation of $B_d$-$\bar B_d$ oscillations has had an essential impact on the standard model. So why is mixing in the charm sector so small? The explanation lies in the so-called GIM (Glashow, Iliopoulos, and Maiani) suppression [1]. It is this GIM suppression that makes searching for charm mixing so interesting: an observation of mixing in the charm sector, with $r_{\rm mix}$ greater than $1 \times 10^{-6}$, might provide a signature for new physics.

1 GIM Suppression
One of the box diagrams that represent the lowest order short-distance contribution to $D^0$-$\bar D^0$ mixing is presented in figure 1. The mixing amplitude calculated from these diagrams is proportional to [2]:

$$\langle \bar D^0 | H_{\rm wk} | D^0 \rangle \propto \sum_{i,j=d,s,b} V_{ci}^{*} V_{ui} V_{cj}^{*} V_{uj}\, S(m_i^2, m_j^2) \tag{1}$$
where $V_{ij}$ are the Cabibbo-Kobayashi-Maskawa (CKM) matrix elements. If the quark masses $m_i$ were all equal, the loop functions $S(m_i^2, m_j^2)$ would all be equal and the amplitude would vanish by the unitarity of the CKM matrix ($\sum_i V_{ci}^{*} V_{ui} = 0$). If the mass differences are small, GIM very nearly works and mixing is small. For charm, the CKM factor $V_{cb}^{*} V_{ub}$ is insignificant ($\sim \lambda^5$ in the Wolfenstein parameterization [3]) relative to the factors $V_{cd}^{*} V_{ud}$ and $V_{cs}^{*} V_{us}$ (both $\sim \lambda$). Only the $i, j = d, s$ terms contribute in equation 1, and the mass difference between the $d$ and $s$ quarks is relatively small. The $D^0$-$\bar D^0$ mixing probability calculated from these box diagrams is $\sim 10^{-10} - 10^{-19}$ [4]. By contrast, neutral $b$ quark mesons exhibit large mixing because the top mass is so large. For example, for the $B^0$ the three CKM factors ($V_{ib}^{*} V_{id}$, $i = u, c, t$) are roughly equal ($\sim A\lambda^3$), so the contribution of the top quark dominates.

Figure 1: One of the two box diagrams for mixing. The second diagram is found by exchanging the internal W and quark lines with each other.

Because the short distance predictions for charm mixing are so small, long distance contributions may be important. They are more difficult to calculate, and there is significant disagreement over their size [5,6,7,8]. With long distance effects, $r_{\rm mix}$ may be as large as $10^{-3}$. If there is new physics such as a fourth generation of quarks, leptoquarks, etc., it can contribute to the box diagrams for mixing or to the penguin diagrams for FCNC decays [9]. Because the standard model predictions are so small, there is a large window in which to observe the additional contributions of this new physics, unobscured by standard model effects.
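The GIM cancellation discussed in this section can be checked numerically. The following is only a minimal sketch: the Wolfenstein parameter values (`lam`, `A`, `rho`, `eta`) are illustrative assumptions, not numbers from the text, the CKM elements are truncated at order $\lambda^3$, and a constant stands in for the loop function $S$ in the equal-mass limit.

```python
# Illustrative Wolfenstein parameters (assumed values, not from the text).
lam, A, rho, eta = 0.22, 0.8, 0.2, 0.35

# CKM elements of the first two rows, truncated at order lambda^3.
V = {
    "ud": 1 - lam**2 / 2, "us": lam, "ub": A * lam**3 * (rho - 1j * eta),
    "cd": -lam, "cs": 1 - lam**2 / 2, "cb": A * lam**2,
}

def ckm_factor(i):
    """CKM factor V_ci^* V_ui for an internal down-type quark i."""
    return V["c" + i].conjugate() * V["u" + i]

factors = {i: ckm_factor(i) for i in "dsb"}

def box_amplitude(loop, quarks="ds"):
    """Sum over i, j of the CKM factors times a loop function loop(i, j)."""
    return sum(factors[i] * factors[j] * loop(i, j)
               for i in quarks for j in quarks)

# Equal quark masses: every loop function takes the same value.
equal_mass = box_amplitude(lambda i, j: 1.0)

# Size of the b-quark CKM factor relative to the s-quark one.
b_suppression = abs(factors["b"]) / abs(factors["s"])

print(f"equal-mass amplitude (d, s only): {equal_mass:.3e}")
print(f"|V_cb* V_ub| / |V_cs* V_us|     : {b_suppression:.3e}")
```

With equal loop functions the $d$ and $s$ contributions cancel, while the $b$ term is suppressed by roughly $\lambda^4$ relative to them, which is the pattern described in the text.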
2 Charm Mixing

2.1 Basic Theory
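The neutral $D$ mesons evolve according to:

$$ i\frac{\partial}{\partial t}\begin{pmatrix} D^0 \\ \bar D^0 \end{pmatrix} = \begin{pmatrix} M - i\Gamma/2 & M_{12} - i\Gamma_{12}/2 \\ M_{12}^{*} - i\Gamma_{12}^{*}/2 & M - i\Gamma/2 \end{pmatrix} \begin{pmatrix} D^0 \\ \bar D^0 \end{pmatrix} $$

Diagonalizing gives weak eigenstates

$$ D_H = pD^0 + q\bar D^0; \qquad D_L = pD^0 - q\bar D^0; \qquad \frac{q}{p} = \left[\frac{M_{12}^{*} - i\Gamma_{12}^{*}/2}{M_{12} - i\Gamma_{12}/2}\right]^{1/2} $$

of definite mass and lifetime:

$$ M_{H,L} = M \pm \Re\left[(M_{12}^{*} - i\Gamma_{12}^{*}/2)(M_{12} - i\Gamma_{12}/2)\right]^{1/2} $$

$$ \Gamma_{H,L} = \Gamma \mp 2\,\Im\left[(M_{12}^{*} - i\Gamma_{12}^{*}/2)(M_{12} - i\Gamma_{12}/2)\right]^{1/2}. $$

These eigenstates evolve with time as:

$$ D_{H,L}(t) = e^{-iM_{H,L}t - \frac{1}{2}\Gamma_{H,L}t}\, D_{H,L}(0). $$

If $H_{\rm wk}$ conserves CP, then $D_H$ and $D_L$ are CP eigenstates and $p = q = 1$. If you start with a $D^0$, the probability that it is a $\bar D^0$ at time $t$ is:

$$ r_{\rm mix}(t) = r(D^0 \to \bar D^0) = \frac{1}{4}\left|\frac{q}{p}\right|^2 \left[ e^{-\Gamma_H t} + e^{-\Gamma_L t} - 2e^{-\Gamma t}\cos \Delta M t \right] $$

where $\Delta M = (M_H - M_L)$ and $\Delta\Gamma = (\Gamma_H - \Gamma_L)$. Experimental limits on $D^0$-$\bar D^0$ mixing [10,11,12,13,14] indicate that $\Delta M \ll \Gamma$ and $\Delta\Gamma \ll \Gamma$, so

$$ r_{\rm mix}(t) = \frac{1}{4}\left|\frac{q}{p}\right|^2 e^{-\Gamma t}\left(\Delta M^2 + \tfrac{1}{4}\Delta\Gamma^2\right)t^2. $$

If we define $x = \Delta M/\Gamma$ and $y = \Delta\Gamma/2\Gamma$, then

$$ r_{\rm mix}(t) = \frac{1}{4}\left|\frac{q}{p}\right|^2 e^{-\Gamma t}\,(x^2 + y^2)(\Gamma^2 t^2). \tag{2} $$

Integrated over all time:

$$ r_{\rm mix} = \frac{1}{2}\left|\frac{q}{p}\right|^2 (x^2 + y^2). \tag{3} $$

For the charge conjugate process ($\bar D^0 \to D^0$):

$$ \bar r_{\rm mix}(t) = \frac{1}{4}\left|\frac{p}{q}\right|^2 e^{-\Gamma t}\,(x^2 + y^2)(\Gamma^2 t^2) \tag{4} $$

so that $r_{\rm mix} = \bar r_{\rm mix}$ only if $|q/p| = 1$. Expectations from standard model short distance calculations are that $x$ and $y$ are approximately equal. However, if $r_{\rm mix}$ is large, the source is likely to be $x$. While long distance effects and/or new physics can increase $\Delta M$ substantially, they do not make significant contributions to $\Delta\Gamma$ [5,9,15]. In general $\Delta M_D$ receives contributions from virtual intermediate states, while $\Delta\Gamma_D$ is generated by on-shell transitions. A nice compilation [16] of theory predictions has been made by H. Nelson, and they are presented in figure 2.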
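The step from the exact mixing probability to equations 2 and 3 can be verified numerically. The sketch below is only an illustration under stated assumptions: the values of `x` and `y` are arbitrary small numbers chosen for the check, time is measured in units of $1/\Gamma$, $|q/p| = 1$ by default, and the time-integrated rate is normalized to the unmixed rate $\int e^{-\Gamma t}dt = 1/\Gamma$.

```python
import math

def r_mix_exact(t, x, y, gamma=1.0, q_over_p=1.0):
    """Exact D0 -> D0bar probability at proper time t."""
    delta_m = x * gamma            # Delta M = M_H - M_L
    delta_g = 2.0 * y * gamma      # Delta Gamma = Gamma_H - Gamma_L
    gamma_h = gamma + delta_g / 2
    gamma_l = gamma - delta_g / 2
    return 0.25 * q_over_p**2 * (
        math.exp(-gamma_h * t) + math.exp(-gamma_l * t)
        - 2.0 * math.exp(-gamma * t) * math.cos(delta_m * t))

def r_mix_small(t, x, y, gamma=1.0, q_over_p=1.0):
    """Small-x, small-y limit of the probability (equation 2)."""
    return (0.25 * q_over_p**2 * math.exp(-gamma * t)
            * (x**2 + y**2) * (gamma * t)**2)

def time_integrated(rate, x, y, gamma=1.0, t_max=50.0, steps=20000):
    """Trapezoidal integral of rate(t) over [0, t_max/gamma], times gamma."""
    dt = t_max / gamma / steps
    total = 0.5 * (rate(0.0, x, y, gamma) + rate(steps * dt, x, y, gamma))
    total += sum(rate(k * dt, x, y, gamma) for k in range(1, steps))
    return total * dt * gamma

x, y = 0.02, 0.01                      # arbitrary small test values
numeric = time_integrated(r_mix_exact, x, y)
formula = 0.5 * (x**2 + y**2)          # equation 3 with |q/p| = 1

print(f"integrated exact rate : {numeric:.6e}")
print(f"(x^2 + y^2) / 2       : {formula:.6e}")
```

For mixing parameters at the percent level the two numbers agree to better than a part in a thousand, so the quadratic approximation is more than adequate for the experimental sensitivities discussed here.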
Figure 2: $D^0$-$\bar D^0$ mixing predictions, where the triangles are SM predictions of $x$, the squares are SM predictions of $y$, and the circles are NSM predictions of $x$. The predictions span 15 orders of magnitude in $r_{\rm mix}$.

2.2 Search Strategies
There are two basic methods currently employed to search for charm mixing. In direct or "wrong sign" searches, one looks for $D^0 \to \bar D^0 \to f$, where $f$ is a Cabibbo favored (CF) mode of the $\bar D^0$ such as $K^+\pi^-$ or $K^+\mu^-\bar\nu_\mu$. The sign of the daughter $K$ or $\mu$ distinguishes $\bar D^0 \to f$ from $D^0 \to \bar f$. The produced $D$ is identified using a $D^*$ "tag": $D^{*+} \to D^0\pi^+$. The sign of the "bachelor" pion from the $D^*$ decay tags the produced neutral $D$ as a $D^0$ ($\pi^+$) or a $\bar D^0$ ($\pi^-$). The signal for mixing is that the bachelor pion and the $K$ daughter have the same charge:

$$ D^{*+} \to \pi^+ D^0; \qquad D^0 \to \bar D^0 \to K^+\pi^-. $$

The second method is to look for a lifetime difference between $D_H$ and $D_L$. If $D_H$ and $D_L$ are the CP+ and CP- eigenstates, then:

$$ y_{CP} = \frac{\Gamma_+ - \Gamma_-}{\Gamma_+ + \Gamma_-} = \frac{\Delta\Gamma}{2\Gamma} = y. \tag{5} $$

The lifetimes $\tau_+$ and $\tau_-$ can be measured using neutral $D$ decays to states of definite CP, such as $D^0 \to K^+K^-$ (CP+) or $D^0 \to K^0_S\phi$ (CP-). Even if one relaxes the "no CP violation" requirement, $y_{CP} \approx y$ because it is known [17] that CP violation is small in charm decay and therefore $D_H$ and $D_L$ are approximately CP eigenstates.

2.3 The DCS Interference
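For wrong sign hadronic searches ($f = K^+\pi^-$, $K^+\pi^-\pi^+\pi^-$, ...) an interesting interference occurs, which is displayed pictorially in figure 3. Because mixing is at best a small effect, doubly Cabibbo suppressed (DCS) decay cannot be ignored when looking for a wrong-sign signal [18,19]. Via DCS, $D^0 \to f$ can occur directly and not just through mixing, $D^0 \to \bar D^0 \to f$. In this case:

$$ A_{ws} = A_{DCS}(D^0 \to f) + A_{mix}(D^0 \to \bar D^0 \to f) $$

and the wrong sign decay rate becomes:

$$ r_{ws} = e^{-\Gamma t}\, |\langle f|H|\bar D^0\rangle_{CF}|^2 \left|\frac{q}{p}\right|^2 \times \left\{ |\lambda|^2 + \tfrac{1}{4}(x^2 + y^2)\Gamma^2 t^2 + \left(\Re(\lambda)\,y + \Im(\lambda)\,x\right)\Gamma t \right\} \tag{6} $$

where

$$ \lambda = \frac{p}{q}\,\frac{\langle f|H|D^0\rangle_{DCS}}{\langle f|H|\bar D^0\rangle_{CF}}, \qquad \bar\lambda = \frac{q}{p}\,\frac{\langle \bar f|H|\bar D^0\rangle_{DCS}}{\langle \bar f|H|D^0\rangle_{CF}}. $$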
By measuring the proper time of the decay of the neutral $D$ meson, one can disentangle the contributions from DCS and mixing: while all terms in equation 6 share a common exponential time dependence, the mixing term ($x^2 + y^2$) is proportional to $t^2$, the DCS term ($|\lambda|^2$) has no additional time dependence, and the interference term is proportional to $t$. The interference term is particularly interesting as it may be observable even if the mixing term is not (in which case mixing would be observable in hadronic modes but not in semileptonic ones!). Alternatively, if the interference term has the opposite sign to the mixing term (destructive interference) and is of roughly equal size, then the effects of mixing could be masked [18]. Finally, this interference term could allow experimenters to distinguish between $\Delta\Gamma$ and $\Delta M$ contributions to mixing. This is especially true if CP violation in mixing is very small or zero (see section 2.4).
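The different time dependence of the three contributions can be made concrete with a small sketch. It assumes CP invariance, so that the interference coefficient reduces to $y'\sqrt{R_{DCS}}$ with $R_{DCS} = |\lambda|^2$ and rotated parameters $x'$, $y'$, and it drops the common factor $e^{-\Gamma t}$ and the CF normalization. The input numbers are central values quoted later in the text for the CLEO fit; the function name and the chosen time points are hypothetical.

```python
import math

def wrong_sign_terms(gt, r_dcs, x_prime, y_prime):
    """DCS, interference and mixing terms at Gamma*t = gt (CP invariance assumed).

    The common factor exp(-Gamma*t) and the CF normalization are omitted.
    """
    dcs = r_dcs                                              # constant in t
    interference = math.sqrt(r_dcs) * y_prime * gt           # linear in t
    mixing = 0.25 * (x_prime**2 + y_prime**2) * gt**2        # quadratic in t
    return dcs, interference, mixing

# Central values quoted in the text for the CLEO time-dependent fit.
r_dcs, x_prime, y_prime = 0.0048, 0.0, -0.025

for gt in (0.5, 1.0, 2.0, 4.0):
    dcs, interference, mixing = wrong_sign_terms(gt, r_dcs, x_prime, y_prime)
    total = dcs + interference + mixing
    print(f"Gamma*t = {gt:3.1f}: DCS {dcs:.2e}, interference {interference:+.2e}, "
          f"mixing {mixing:.2e}, sum {total:.2e}")

# With x' = 0 the sum is a perfect square and vanishes at this Gamma*t.
gt_zero = -2.0 * math.sqrt(r_dcs) / y_prime
```

For a negative $y'$ the interference term reduces the wrong sign rate at a few lifetimes, which is the masking effect described above; with $x' = 0$ the bracket is a perfect square, so it never becomes negative.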
Figure 3: The interference between the mixed decay and the DCS decay, displayed pictorially. For the semileptonic decay there is no interference, but for the hadronic decay one must worry about the size and relative phase of the DCS decay.

Note that the CP conjugate rate for wrong sign decays is not the same as equation 6:

$$ \bar r_{ws} = e^{-\Gamma t}\, |\langle \bar f|H|D^0\rangle_{CF}|^2 \left|\frac{p}{q}\right|^2 \times \left\{ |\bar\lambda|^2 + \tfrac{1}{4}(x^2 + y^2)\Gamma^2 t^2 + \left(\Re(\bar\lambda)\,y + \Im(\bar\lambda)\,x\right)\Gamma t \right\} \tag{7} $$

2.4 Simplifying Assumptions

The wrong sign rates of equations 6 and 7 can be simplified by making assumptions about the nature of CP violation. For example, it is likely that there is no direct CP violation in CF or DCS decays [a] and that CP is not violated in charm mixing [15], i.e.

$$ |q/p| = 1. \tag{8} $$

If both of these assumptions are valid, then

$$ |\lambda| = |\bar\lambda|. \tag{9} $$

[a] For direct CP violation to appear in a decay mode, there must be two amplitudes that contribute significantly to that final state.
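Defining the strong (final state interaction) phase $\delta$ and the CP violating phase $\phi$ through

$$ \lambda = \sqrt{R_{DCS}}\; e^{-i(\delta + \phi)}, \qquad \bar\lambda = \sqrt{R_{DCS}}\; e^{-i(\delta - \phi)}, \tag{10} $$

the wrong sign rates become:

$$ r_{ws},\ \bar r_{ws} = e^{-\Gamma t}\, |\langle f|H|\bar D^0\rangle_{CF}|^2 \times \left\{ R_{DCS} + \tfrac{1}{2} r_{\rm mix}\Gamma^2 t^2 + \left[ y\sqrt{R_{DCS}}\cos(\delta \pm \phi) - x\sqrt{R_{DCS}}\sin(\delta \pm \phi) \right]\Gamma t \right\}. \tag{11} $$

If we assume CP invariance ($\phi = 0$) then $r_{ws} = \bar r_{ws}$ and:

$$ r_{ws} = e^{-\Gamma t}\, |\langle f|H|\bar D^0\rangle_{CF}|^2 \times \left\{ R_{DCS} + \tfrac{1}{2} r_{\rm mix}\Gamma^2 t^2 + y'\sqrt{R_{DCS}}\,\Gamma t \right\} \tag{12} $$

where

$$ y' = y\cos\delta - x\sin\delta; \qquad x' = x\cos\delta + y\sin\delta. \tag{13} $$

If $\delta$ is small, as has been argued [19,20], then $y' \to y$ and $x' \to x$. If one has sufficient proper time resolution, equation 12 shows that the contributions due to $x'$ and $y'$ can be resolved (assuming CP invariance).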
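The role of the strong phase in equation 13 is simply that of a rotation in the $(x, y)$ plane. The following sketch illustrates this under the assumption of CP invariance; the values of `x`, `y` and the phases are arbitrary illustrations, and the function name is hypothetical.

```python
import math

def rotate_by_strong_phase(x, y, delta):
    """Return (x', y') of equation 13 for a strong phase delta in radians."""
    x_prime = x * math.cos(delta) + y * math.sin(delta)
    y_prime = y * math.cos(delta) - x * math.sin(delta)
    return x_prime, y_prime

x, y = 0.01, 0.02   # arbitrary illustrative mixing parameters

for delta_deg in (0, 30, 90):
    xp, yp = rotate_by_strong_phase(x, y, math.radians(delta_deg))
    print(f"delta = {delta_deg:2d} deg: x' = {xp:+.4f}, y' = {yp:+.4f}, "
          f"x'^2 + y'^2 = {xp**2 + yp**2:.6f}")
```

The combination $x'^2 + y'^2 = x^2 + y^2$, and hence $r_{\rm mix}$, does not depend on $\delta$, but the split between $x'$ and $y'$ does. A large strong phase can therefore make a $\Delta M$ effect appear in $y'$, which is why a measurement of $y'$ cannot be compared directly with $y$ unless $\delta$ is known to be small.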
2.5 Comments on Assumptions
There may be excellent theoretical motivation for the assumptions made in subsection 2.4. But, as demonstrated by E791 in their hadronic wrong sign search [13], there are no technical reasons for experimenters to make them. E791 first quoted mixing limits based on equations 6 and 7 (minimal assumptions). They then quoted limits after making the simplifying assumptions of equations 8 and 9, and finally for the case of no mixing (DCS only). This approach avoids two difficulties. First, if experimenters quote results based on only one set of assumptions, it can be difficult and misleading to compare results from different experiments (if past history is any guide, different experiments are unlikely to make similar assumptions). Second, and more importantly, assumptions can mask interesting phenomena (assuming CP invariance, for instance, precludes searching for CP violation in mixing).
The situation in $\Delta\Gamma$ searches is slightly different. There, one measures $y_{CP}$, and $y = y_{CP}$ only if $D_H$ and $D_L$ are CP eigenstates. An observation of $y_{CP} \neq 0$ is evidence for mixing, but extracting $y$ from it requires additional information.

3 Wrong Sign Mixing Searches
E791 and ALEPH have previously published results from wrong sign searches, but only E791 published results for the most general case, and hence I will restrict my discussion to the E791 results. More recently, CLEO has published results with significantly higher precision and has performed fits under a variety of different assumptions.

3.1 E791 Wrong Sign Searches
E791 at Fermilab is a fixed-target hadroproduction experiment. Using a 500 GeV $\pi^-$ beam and thin target foils, they logged $2 \times 10^{10}$ hadronic interactions. Their spectrometer employed silicon microstrip detectors for vertexing, two threshold Cerenkov detectors for particle identification, a muon hodoscope, and a lead/liquid scintillator electromagnetic detector for electron identification. They have published results for wrong sign searches using hadronic and semileptonic decay modes. Their hadronic analysis [13] uses the CF decay modes $D^0 \to K^-\pi^+\pi^-\pi^+$ and $D^0 \to K^-\pi^+$. Figure 4 shows the right sign and wrong sign signals they obtain, where they have combined $D^0$ and $\bar D^0$ modes for the purpose of making the figure. In the plots, $Q = m(\pi K n\pi) - m(K n\pi) - m(\pi)$. There are 5643 and 3469 reconstructed signal events in the right sign $K\pi$ and $K3\pi$ samples. In their analysis, they keep the $D^0$ and $\bar D^0$ samples separate, and perform a simultaneous binned maximum likelihood fit to the eight resulting data sets. The $D^0$ mass and width and the $Q$ peak and width are constrained to be the same in each data set, but most parameters (such as backgrounds) are uncoupled, leading to a 41 parameter fit in the most general case. Making no assumptions about CP violation or mixing, E791 sets the following 90% CL limits:

$$ r_{\rm mix}(D^0 \to \bar D^0) < 1.45\%, \qquad \bar r_{\rm mix}(\bar D^0 \to D^0) < 0.74\%. \tag{14} $$

Assuming CP violation only in the interference term (see equations 8 and 9), they find a 90% CL limit of:

$$ r_{\rm mix} < 0.85\%. \tag{15} $$

Figure 4: E791 right sign (top row) and wrong sign (bottom row) hadronic data samples. Candidates for the decay modes $K\pi$ and $K3\pi$ are shown in the left and right columns, respectively. $D^0$ and $\bar D^0$ candidates are kept separate in the analysis, but are combined here. $Q = m(\pi K n\pi) - m(K n\pi) - m(\pi)$.
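To see where the signal is expected in the $Q$ distributions, one can evaluate the energy release for a genuine $D^{*+} \to D^0\pi^+$ decay. This is only a rough sketch: the masses below are rounded values assumed for illustration and are not taken from the text.

```python
# Approximate particle masses in MeV/c^2 (assumed, rounded values).
M_DSTAR_PLUS = 2010.3
M_D0 = 1864.8
M_PI_PLUS = 139.6

def q_value(m_tagged, m_d_candidate, m_pion=M_PI_PLUS):
    """Energy release Q = m(D pi) - m(D) - m(pi) for a tagged candidate."""
    return m_tagged - m_d_candidate - m_pion

# For a true D*+ -> D0 pi+ decay the invariant masses equal the particle masses.
q_signal = q_value(M_DSTAR_PLUS, M_D0)
print(f"expected Q for D*+ -> D0 pi+: {q_signal:.1f} MeV")
```

Because $Q$ is only a few MeV, the bachelor pion is slow and the $Q$ peak is narrow, which is what makes the $D^*$ tag such a clean way of identifying the flavor of the produced neutral $D$.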
If they assume no mixing, they find relative branching ratios (equation 10) for DCS modes:

$$ R_{DCS}(K\pi) = (0.68^{+\ldots}_{-\ldots} \pm 0.07)\%, \qquad R_{DCS}(K3\pi) = (0.25^{+\ldots}_{-\ldots} \pm 0.03)\%. \tag{16} $$
The E791 semileptonic wrong sign search [12] uses the $D^0$ decay modes $K^-\mu^+\nu_\mu$ and $K^-e^+\nu_e$. The missing neutrino gives a two-fold ambiguity in the $D^0$ momentum; based on Monte Carlo studies they pick the higher momentum solution. Fixing the $D^0$ mass, they then fit the $Q$ and proper time $t$ distributions. The right and wrong sign $Q$ plots are shown in figure 5. The dotted lines show the estimate of the background they get using an event mixing technique (they combine $D^0$ candidates from one event with a bachelor pion from another). In this analysis, E791 does not quote a "minimal assumption" result. Instead they make the assumption of equation 8 from the outset and fit to a time dependence given by equation 2. The number of right sign decays they get from their fit is $1237 \pm 45$ ($Ke\nu$) and $1267 \pm 44$ ($K\mu\nu$). The mixing limit they obtain is:

$$ r_{\rm mix} < 0.50\%. \tag{17} $$

Figure 5: E791 right sign (top row) and wrong sign (bottom row) semileptonic data samples. Candidates for the decay modes $Ke\nu$ and $K\mu\nu$ are shown in the left and right columns, respectively. The horizontal axis is $Q = M(Kl\nu\pi) - M(D^0) - M(\pi)$ in GeV/$c^2$.

3.2 CLEO Wrong Sign Search
The CLEO collaboration has reported [21] results from a wrong sign search for mixing using data from 9.0 fb$^{-1}$ of integrated luminosity taken with the CLEO II.V detector. This analysis takes advantage of the three-layer, double-sided silicon vertex detector (SVX) they installed in 1995. In addition to giving CLEO the ability to measure the proper time of charm decays, the SVX dramatically improves their resolution on $Q$, the energy release in the $D^{*\pm}$ decay used to tag the initial state of the neutral $D$. This improved resolution enhances their sensitivity to mixing by substantially increasing their signal to noise. After cuts designed to suppress backgrounds from other $D^0$ decays and "cross-talk" between their right-sign and wrong-sign samples, they obtain the $D$ mass and $Q$ plots for wrong sign candidates shown in figure 6. Superimposed on the data (solid lines) in these plots are colored regions which show the contributions from backgrounds determined by a two-dimensional fit to $Q$ and $M$. The background shapes were determined from Monte Carlo. From their fit they find $44.8^{+\ldots}_{-\ldots}$ $D^0 \to K^+\pi^-$ events in their wrong sign sample.

Figure 6: CLEO wrong sign $Q$ and $M$ distributions for $K\pi$ candidates from $D^*$ decay. Backgrounds from various sources are shown, and come from two-dimensional fits to $Q$ and $M$ (shapes determined from simulation).

A similar fit to their right sign sample yields $13527 \pm 116$ $D^0 \to K^-\pi^+$ candidates. Using these numbers and assuming no mixing, they find

$$ R_D = \frac{\Gamma(D^0 \to K^+\pi^-)}{\Gamma(D^0 \to K^-\pi^+)} = (0.332^{+\ldots}_{-\ldots} \pm 0.04)\%. \tag{18} $$

With this no mixing assumption, it is worthwhile to compare this result to the older $R_D$ measurements of E791 [13], $(0.68 \pm 0.34 \pm 0.07)\%$, and of CLEO [22], $(0.77 \pm 0.25 \pm 0.25)\%$. The new CLEO value is within errors of the naive expectation of $\tan^4\theta_C$ and more than a factor of two smaller than the older results. With this assumption of no mixing, they determine the branching ratio of $D^0 \to K^+\pi^-$ to be $(1.28^{+\ldots}_{-\ldots} \pm 0.15 \pm 0.03) \times 10^{-4}$. Using a time dependent fit to differentiate between mixing and DCS contributions, they measure $R_{DCS}(K\pi) = (0.48 \pm 0.12 \pm 0.04)\%$ and

$$ x' = (0 \pm 1.5 \pm 0.2)\%, \qquad |x'| < 2.9\% \tag{19} $$

$$ y' = (-2.5^{+\ldots}_{-\ldots} \pm 0.3)\%, \qquad -5.8\% < y' < 1.0\%. \tag{20} $$
CLEO has further split its wrong sign sample into $D^0$ and $\bar D^0$ candidates, and they find no evidence for a time-integrated CP asymmetry. In their most general fit they allow for CP violation via state mixing, direct decay, and the interference between the two processes. To leading order, both $x'$ and $y'$ are scaled by $(1 \pm A_M/2)$, $R_D \to R_D(1 \pm A_D)$, and $\delta \to \delta \pm \phi$:

$$ A_M = \ldots \tag{21} $$

$$ A_D = (-0.01^{+\ldots}_{-\ldots} \pm 0.01), \qquad -0.36 < A_D < 0.30 \tag{22} $$

$$ \sin\phi = 0.00 \pm 0.60 \pm 0.01, \qquad \text{no limit (95\% CL)}. \tag{23} $$

They further assume CP invariance and use the proper time distribution given by equation 12 in their fits. In this parameterization the interference term gives independent information on $y'$. From their fits, CLEO finds one-dimensional intervals at 95% CL of:

$$ x' = (0 \pm 1.5 \pm 0.2)\%, \qquad |x'| < 2.8\% \tag{24} $$

$$ y' = (-2.3^{+\ldots}_{-\ldots} \pm 0.3)\%, \qquad -5.2\% < y' < 0.2\%. \tag{25} $$

Systematic errors are included when finding the above intervals. It is not possible to determine the sign of $x'$ from the fit, which depends only on $(x')^2$. With this assumption of CP invariance, and if $\delta = 0$ so that $y' \to y$ and $x' \to x$, the above results give separate limits on the $\Delta\Gamma$ and $\Delta M$ contributions to mixing, an interesting and important result (particularly so given their excellent sensitivity). A fit to the time dependence of $D^0 \to K^+\pi^-$ for CLEO is presented in figure 7. The vertical hatched lines below zero in the figure indicate the fit contribution from the destructive interference with mixing. The CLEO group concludes that their data are consistent with no $D^0$-$\bar D^0$ mixing. They also conclude that they observe no evidence for CP violation. Nevertheless their results are a substantial advance in sensitivity to mixing beyond E791.

4 Lifetime Difference Searches
E791 has published results of a search for $\Delta\Gamma$, while CLEO and FOCUS searches are in progress.

4.1 E791 $\Delta\Gamma$ Search

E791 searched [23] for a lifetime difference between the CP+ and CP- eigenstates of the $D^0$. To do so, they compared the lifetimes of the decays $D^0 \to K^-K^+$ (CP+) and $D^0 \to K^-\pi^+$ ($\tfrac{1}{2}$CP+ $+\ \tfrac{1}{2}$CP-):

$$ \frac{\Gamma(K^-K^+) - \Gamma(K^-\pi^+)}{\Gamma(K^-\pi^+)} = \frac{\Gamma_+ - \tfrac{1}{2}(\Gamma_+ + \Gamma_-)}{\Gamma(K^-\pi^+)} = y_{CP}. \tag{26} $$

Their fit to the mass distribution includes Breit-Wigner shapes to account for reflections from misidentified $K^-\pi^+$ and $K^-\pi^+\pi^0$ decays. After cuts, they have $3213 \pm 77$ signal events in $K^-K^+$ and $35{,}427 \pm 206$ in $K^-\pi^+$. They observe no difference in lifetimes, and they quote:

$$ y_{CP} = 0.008 \pm 0.029 \pm 0.010, \qquad -0.04 < y_{CP} < 0.06 \ \ (90\%\ \text{CL}). \tag{27} $$

Figure 7: CLEO wrong sign distribution in proper decay time. The data are shown as the solid points with error bars. The figure displays the fit contributions from the direct $D^0 \to K^+\pi^-$ decay, from the destructive interference with mixing, and from backgrounds from charm and light quark production.

4.2 CLEO $\Delta\Gamma$ Search
CLEO has presented preliminary results [24] on $\Delta\Gamma$ based on data they have collected with the CLEO II.V detector. They plan to compare the lifetimes of $D^0 \to K^-K^+$ and $D^0 \to \pi^-\pi^+$ (both CP+) to that of $D^0 \to K^-\pi^+$ to extract a measurement of $y_{CP}$ via equation 26. Based on partial samples of roughly 1300 $K^-K^+$, 475 $\pi^-\pi^+$, and 19000 $K^-\pi^+$ events (figure 8) they find:

$$ y_{CP} = -0.032 \pm 0.034 \pm 0.008, \qquad -0.076 < y_{CP} < 0.012 \ \ (90\%\ \text{CL}). \tag{28} $$

Figure 8: CLEO $D^0 \to K^-\pi^+$ and $D^0 \to K^-K^+$ mass plots, for candidates from $D^{*+} \to D^0\pi^+$ used in their preliminary $\Delta\Gamma$ search.

The CLEO Collaboration estimates an eventual sensitivity for the measurement of $y$ of $\sigma_y \approx 0.028\,(\mathrm{stat}) \pm 0.010\,(\mathrm{syst})$ using the channels $D^0 \to K^-\pi^+$ and $D^0 \to K^-K^+$. They are also investigating CP-odd lifetimes via channels containing a $K^0$ (their beam spot has $\sigma_y \approx 10\,\mu$m, $\sigma_x \approx 250\,\mu$m). With much more data and other channels to analyze, CLEO believes their sensitivity will increase substantially. It is clear that the BABAR and BELLE Collaborations should become major players in the next year in this lifetime difference search area for mixing. If CLEO III can solve their early luminosity difficulties, then they may also remain major competitors in the quest to conclusively observe mixing.

4.3 FOCUS $\Delta\Gamma$ Search
FOCUS (E831) is a photoproduction experiment located at Fermilab that is the successor of the E687 experiment [25]. The experiment has an upgraded silicon vertex detector and better particle identification. The data were taken during the 1996-1997 fixed target run. FOCUS searched for $\Delta\Gamma$ by comparing the lifetime of the CP+ final state $K^+K^-$ to that of $K^-\pi^+$ (equation 26) [26]. Mass plots for their $K^+K^-$ and $K^-\pi^+$ candidates are shown in figure 9.

Figure 9: (a) Reconstructed mass distribution of $D^0 \to K^-\pi^+$ and its conjugate decay. There are a total of 119738 signal events. (b) Reconstructed mass distribution of $D^0 \to K^-K^+$ candidates. There are 10331 $D^0 \to K^-K^+$ signal events. The vertical dashed lines indicate the signal and sideband regions used for the lifetime and $y_{CP}$ fits.

Due to excellent proper time resolution ($\approx 7\%$ of the $D^0$ lifetime), the fractional error in the FOCUS lifetime is roughly equal to the fractional error in the $K^+K^-$ yield. In order to account for the small $D^0 \to K^-\pi^+$ reflection in the $D^0 \to K^+K^-$ mass distribution, FOCUS assumes that the time evolution of the reflection is described by the lifetime of $D^0 \to K^-\pi^+$, and the reduced proper time distributions of the $D^0 \to K^-\pi^+$ and $D^0 \to K^+K^-$ samples are fit at the same time. The fit parameters are: the $D \to K\pi$ lifetime, the lifetime difference $y_{CP}$, and the number of background events under the signal region in each of $D^0 \to K^-\pi^+$ and $D^0 \to K^-K^+$. The signal contributions for $D^0 \to K^-\pi^+$, $D^0 \to K^-K^+$, and the reflection are described by $f(t')\exp(-t'/\tau)$ in the fit likelihood, where $f(t')$ is a function which corrects for detector acceptance and the absorption of particles in matter. The background parameters are either floated or fixed to the number of events in the mass sidebands. The signal yields as a function of reduced proper time for $D^0 \to K^-\pi^+$ and for $D^0 \to K^-K^+$ are presented in figure 10.

Figure 10: Plots of the signal yield versus the reduced proper time $t'$ (fs) for $D^0 \to K^-\pi^+$ and for $D^0 \to K^-K^+$ events. The data are background subtracted.
The FOCUS results are:

$$ y_{CP} = (3.42 \pm 1.39 \pm 0.74)\%, \qquad \tau(D \to K\pi) = 409.2 \pm 1.3\ \text{fs}. $$

The systematic errors on $y_{CP}$ are estimated by changing the selection cuts and by trying different fitting methods. Tests of the Cerenkov identification hypothesis for kaons and of the detachment cut between the secondary and primary vertices were performed. FOCUS also varied the number of bins and used two alternate methods for handling the background. The result on the lifetime of $D \to K\pi$ is quoted with a statistical error only; it is the most accurate measurement of the lifetime thus far, but FOCUS expects to do better when the analysis of the $D \to K\pi\pi\pi$ channel is completed. With a $y_{CP}$ value of 3.42%, it is intriguing to ask what "the $D^0$ lifetime" means. There is a 27 fs difference between the $D^0$ lifetime measured with a CP odd channel and with the $K^+K^-$ channel. One must exercise care in the future when quoting the $D^0$ lifetime.
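The size of the lifetime difference implied by the FOCUS central values can be estimated from equation 26. This is a minimal sketch that assumes, as in that equation, that $K^-\pi^+$ is an equal mixture of CP+ and CP- and that $K^+K^-$ is pure CP+; errors are ignored, and the function name is hypothetical.

```python
TAU_KPI_FS = 409.2   # FOCUS D -> K pi lifetime from the text, in fs
Y_CP = 0.0342        # FOCUS central value of y_CP from the text

def cp_lifetimes(tau_kpi, y_cp):
    """Return (tau_plus, tau_minus), assuming K pi is an equal CP mixture."""
    gamma_kpi = 1.0 / tau_kpi
    gamma_plus = gamma_kpi * (1.0 + y_cp)        # equation 26 solved for Gamma_+
    gamma_minus = 2.0 * gamma_kpi - gamma_plus   # Gamma(K pi) is the average
    return 1.0 / gamma_plus, 1.0 / gamma_minus

tau_plus, tau_minus = cp_lifetimes(TAU_KPI_FS, Y_CP)
print(f"tau(CP+, e.g. K+K-) = {tau_plus:.1f} fs")
print(f"tau(CP-)            = {tau_minus:.1f} fs")
print(f"difference          = {tau_minus - tau_plus:.1f} fs")
```

The difference between the CP-odd and CP-even lifetimes obtained this way is close to the 27 fs quoted above, that is, about $2\,y_{CP}\,\tau(K\pi)$.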
5 Summary
A summary of the limits on $x$ and $y$ is presented in figure 11. The large, diagonally lined circular region is the limit from the E791 semileptonic results. The small circular and peanut shaped regions are derived from different fits to the wrong sign hadronic signals from CLEO. The horizontal bands are determined from measurements of the lifetime difference between $D^0 \to K^-\pi^+$ and $D^0 \to K^-K^+$ from E791 and from FOCUS. First, one notes that the CLEO measurement of $y'$ ($-5.8\% < y' < 1.0\%$) and the FOCUS measurement of $y_{CP} = (3.42 \pm 1.39 \pm 0.74)\%$ are consistent with the earlier result from E791 ($y_{CP} = (0.8 \pm 2.9 \pm 1.0)\%$). Second, one recognizes that both CLEO and FOCUS see deviations from zero of more than $1.5\sigma$, but in opposite directions. The two measurements just barely overlap. Although the data from FOCUS and CLEO have been presented on a single plot, one should exercise caution, as FOCUS makes a measurement of $y_{CP}$ while CLEO reports limits on $x'$ and $y'$, not on $x$ and $y$. In figure 11 all values are plotted as $x$ or $y$. Recently, a paper [27] by S. Bergmann et al. has attempted to provide a model which encompasses all results. The authors mention that a large value for the strong phase difference between the DCS and the CF channels would reduce the discrepancy between the two experiments. While the new results are tantalizing, we need to be patient. CLEO should soon have a measurement of the lifetime difference $\Delta\Gamma$. CLEO's sensitivity should be comparable to, if not better than, the uncertainty in the result from FOCUS. We also expect to hear soon about hadronic results from FOCUS. In addition, both the FOCUS and CLEO collaborations have large semileptonic samples in which they can search for mixing. The sensitivity from these samples is expected to be of order 2% or better. Finally, we should recognize that BABAR and BELLE will soon have the largest charm samples. We should expect that their future results will help to clarify whether the current indication of mixing is real. If not, then we would hope that these new experiments will push limits on $y_{CP}$ to below $10^{-3}$.

Figure 11: A plot of 95% CL limits for measurements of $x$ and $y$ mixing values.
Acknowledgments

I would like to thank the TASI Program Co-Directors, the TASI Scientific Advisory Board, and the TASI Local Organizing Committee for an informative and productive school. I would also like to thank members of the CLEO, E791, and FOCUS collaborations who provided me with information and plots. I would like to extend a special thanks to Professor Paul Sheldon of Vanderbilt, whose summary talk and proceedings [28] for the Heavy Flavours Conference at Southampton, UK, 1999 were invaluable references and from which I borrowed heavily, particularly in the introductory theory discussion of mixing in the charm sector.

References

1. S. L. Glashow, J. Iliopoulos, and L. Maiani, Phys. Rev. D 2, 1285 (1970).
2. A. Pich, Nucl. Phys. (Proc. Suppl.) 66, 456 (1998) [hep-ph/9709441].
3. L. Wolfenstein, Phys. Rev. Lett. 51, 1945 (1983).
4. A. Datta and D. Kumbhakar, Z. Phys. C 27, 515 (1985); H. Y. Cheng, Phys. Rev. D 26, 143 (1982).
5. L. Wolfenstein, Phys. Lett. B 164, 170 (1985); J. Donoghue, E. Golowich, B. R. Holstein, and J. Trampetic, Phys. Rev. D 33, 179 (1986).
6. H. Georgi, Phys. Lett. B 297, 353 (1992) [hep-ph/9209291]; T. Ohl, G. Ricciardi, and E. Simmons, Nucl. Phys. B 403, 605 (1993) [hep-ph/9301212].
7. S. Pakvasa, Flavor Changing Neutral Currents in Charm Sector [hep-ph/9705397].
8. A. J. Schwartz, Mod. Phys. Lett. A 8, 967 (1993); P. Singer and D. X. Zhang, Phys. Rev. D 55, 1127 (1997) [hep-ph/9612495].
9. See for example: J. L. Hewett, T. Takeuchi, and S. Thomas, Indirect Probes for New Physics [hep-ph/9603391].
10. W. C. Louis et al., E615 Collab., Phys. Rev. Lett. 56, 1027 (1986).
11. J. C. Anjos et al., E691 Collab., Phys. Rev. Lett. 60, 1239 (1988).
12. E. M. Aitala et al., E791 Collab., Phys. Rev. Lett. 77, 2384 (1996) [hep-ex/9606016].
13. E. M. Aitala et al., E791 Collab., Phys. Rev. D 57, 13 (1998) [hep-ex/9608018].
14. R. Barate et al., ALEPH Collab., Phys. Lett. B 436, 211 (1998) [hep-ex/9811021].
15. Y. Nir, Nuovo Cimento 109 A, 991 (1996) [hep-ph/9507290].
16. H. N. Nelson [hep-ex/9908021].
17. P. L. Frabetti et al., E687 Collab., Phys. Rev. D 50, 2953 (1994); J. Bartelt et al., CLEO Collab., Phys. Rev. D 52, 4860 (1995); E. M. Aitala et al., E791 Collab., Phys. Lett. B 403, 377 (1997) [hep-ex/9612005].
18. G. Blaylock, A. Seiden, and Y. Nir, Phys. Lett. B 355, 555 (1995) [hep-ph/9504306].
19. T. E. Browder and S. Pakvasa, Phys. Lett. B 383, 475 (1996) [hep-ph/9508362].
20. L. Wolfenstein, Phys. Rev. Lett. 75, 2460 (1995) [hep-ph/9505285].
21. M. Artuso et al., CLEO Collab., Search for $D^0$-$\bar D^0$ Mixing [hep-ex/9908040].
22. D. Cinabro et al., CLEO Collab., Phys. Rev. Lett. 72, 1406 (1994).
23. E. M. Aitala et al., E791 Collab., Phys. Rev. Lett. 83, 32 (1999) [hep-ex/9903012].
24. CLEO Collab., talk by Craig Prescott at the APS 2000 Meeting in Long Beach, CA, April 29, 2000.
25. P. L. Frabetti et al., E687 Collab., Nucl. Instrum. Methods A 320, 519 (1992).
26. J. M. Link et al., FOCUS Collab., Phys. Lett. B 485, 62 (2000).
27. S. Bergmann, Y. Grossman, Z. Ligeti, Y. Nir, and A. A. Petrov, FERMILAB-Pub-00/102 (May 2000) [hep-ph/0005181].
28. P. Sheldon, Proceedings of the Heavy Flavours 8 Conference held in Southampton, UK, 1999 [hep-ex/9912016].
BASICS OF QCD PERTURBATION THEORY
DAVISON E. SOPER
Institute of Theoretical Science, University of Oregon, Eugene, OR 97403
Email: [email protected]

This is an introduction to the use of QCD perturbation theory, emphasizing generic features of the theory that enable one to separate short-time and long-time effects. I also cover some important classes of applications: electron-positron annihilation to hadrons, deeply inelastic scattering, and hard processes in hadron-hadron collisions.
1 Introduction
A prediction for experiment based on perturbative QCD combines a particular calculation of Feynman diagrams with the use of general features of the theory. The particular calculation is easy at leading order, not so easy at next-to-leading order and extremely difficult beyond the next-to-leading order. This calculation of Feynman diagrams would be a purely academic exercise if we did not use certain general features of the theory that allow the Feynman diagrams to be related to experiment: the renormalization group and the running coupling; the existence of infrared safe observables; the factorization property that allows us to isolate hadron structure in parton distribution functions. In these lectures, I discuss these structural features of the theory that allow a comparison of theory and experiment. Along the way we will discover something about certain important processes: $e^+e^-$ annihilation; deeply inelastic scattering; hard processes in hadron-hadron collisions.

By discussing the particular along with the general, I hope to arm the reader with information that speakers at research conferences take to be collective knowledge, that is, knowledge that they assume the audience already knows.

Now here is the disclaimer. We will not learn how to do significant calculations in QCD perturbation theory. Three lectures is not enough for that. I hope that the reader may be inspired to pursue the subjects discussed here in more detail. A good source is the Handbook of Perturbative QCD [1] by the CTEQ collaboration. More recently, Ellis, Stirling and Webber have written an excellent book [2] that covers most of the subjects sketched in these lectures. For the reader wishing to gain a mastery of the theory, I can recommend the recent books on quantum field theory by Brown [3], Sterman [4], Peskin and Schroeder [5], and Weinberg [6]. Another good source, including both theory and phenomenology, is the lectures in the 1995 TASI proceedings, QCD and Beyond [7]. I have published a substantially similar set of lectures in the proceedings of the 1996 SLAC Summer School [8].

2 Electron-positron annihilation and jets
In this section, I explore the structure of the final state in QCD. I begin with the kinematics of $e^+e^- \to 3$ partons, then examine the behavior of the cross section for $e^+e^- \to 3$ partons when two of the parton momenta become collinear or one parton momentum becomes soft. In order to illustrate better what is going on, I introduce a theoretical tool, null-plane coordinates. Using this tool, I sketch a space-time picture of the singularities that we find in momentum space. The singularities of perturbation theory correspond to long-time physics. We see that the structure of the final state suggested by this picture conforms well with what is actually observed. I draw the distinction between short-time physics, for which perturbation theory is useful, and long-time physics, for which the perturbative expansion is out of control. Finally, I discuss how certain experimental measurements can probe the short-time physics while avoiding sensitivity to the long-time physics.

2.1 Kinematics of $e^+e^- \to 3$ partons

Figure 1. Feynman diagram for $e^+e^- \to q\bar{q}g$.
Consider the process $e^+e^- \to q\bar{q}g$, as illustrated in Fig. 1. Let $\sqrt{s}$ be the total energy in the c.m. frame and let $q^\mu$ be the virtual photon (or Z boson) momentum, so $q^\mu q_\mu = s$. Let $p_i^\mu$ be the momenta of the outgoing partons $(q, \bar{q}, g)$ and let $E_i = p_i^0$ be the energies of the outgoing partons. It is useful to define energy fractions $x_i$ by

$$x_i = \frac{E_i}{\sqrt{s}/2} = \frac{2\,p_i\cdot q}{s}. \tag{1}$$

Then

$$0 \le x_i. \tag{2}$$

Energy conservation gives

$$\sum_i x_i = \frac{2\left(\sum_i p_i\right)\cdot q}{s} = 2. \tag{3}$$
Thus only two of the $x_i$ are independent. Let $\theta_{ij}$ be the angle between the momenta of partons $i$ and $j$. We can relate these angles to the momentum fractions as follows:

$$2\,p_1\cdot p_2 = (p_1+p_2)^2 = (q-p_3)^2 = s - 2\,q\cdot p_3, \tag{4}$$

$$2E_1E_2\,(1-\cos\theta_{12}) = s\,(1-x_3). \tag{5}$$

Dividing this equation by $s/2$ and repeating the argument for the two other pairs of partons, we obtain three relations for the angles $\theta_{ij}$:

$$\begin{aligned}
x_1x_2\,(1-\cos\theta_{12}) &= 2\,(1-x_3),\\
x_2x_3\,(1-\cos\theta_{23}) &= 2\,(1-x_1),\\
x_3x_1\,(1-\cos\theta_{31}) &= 2\,(1-x_2).
\end{aligned} \tag{6}$$
We learn two things immediately. First,

$$x_i \le 1. \tag{7}$$

Second, the three possible collinear configurations of the partons are mapped into $x_i$ space very simply:

$$\begin{aligned}
\theta_{12}\to 0 \quad&\Leftrightarrow\quad x_3 \to 1,\\
\theta_{23}\to 0 \quad&\Leftrightarrow\quad x_1 \to 1,\\
\theta_{31}\to 0 \quad&\Leftrightarrow\quad x_2 \to 1.
\end{aligned} \tag{8}$$

The relations $0 \le x_i \le 1$, together with $x_3 = 2 - x_1 - x_2$, imply that the allowed region for $(x_1, x_2)$ is a triangle, as shown in Fig. 2. The edges $x_i = 1$ of the allowed region correspond to two partons being collinear, as also shown in Fig. 2. The corners $x_i = 0$ correspond to one parton momentum being soft ($p_i^\mu \to 0$).
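As a quick numerical illustration of Eqs. (3) and (6), the following sketch recovers the three opening angles from a chosen pair $(x_1, x_2)$. The function name `three_parton_angles` and the sample values are illustrative assumptions, not part of the lectures; the partons are taken to be massless, as in the text.

```python
import math

def three_parton_angles(x1, x2):
    """Return (x3, theta12, theta23, theta31), angles in radians.

    Uses x3 = 2 - x1 - x2 from Eq. (3) and the three relations of Eq. (6).
    """
    x3 = 2.0 - x1 - x2
    if not all(0.0 < x <= 1.0 for x in (x1, x2, x3)):
        raise ValueError("need 0 < x_i <= 1 for all three partons")

    def angle(xi, xj, xk):
        # Eq. (6): xi * xj * (1 - cos(theta_ij)) = 2 * (1 - xk)
        cos_ij = 1.0 - 2.0 * (1.0 - xk) / (xi * xj)
        # Clamp to [-1, 1] to protect acos against rounding errors.
        return math.acos(max(-1.0, min(1.0, cos_ij)))

    return x3, angle(x1, x2, x3), angle(x2, x3, x1), angle(x3, x1, x2)

# A generic point inside the triangle, and a point on the edge x1 = 1.
for x1, x2 in [(0.9, 0.8), (1.0, 0.6)]:
    x3, t12, t23, t31 = three_parton_angles(x1, x2)
    print(x1, x2, round(x3, 3), [round(math.degrees(t), 2) for t in (t12, t23, t31)])
```

For any point in the allowed triangle the three angles add up to $360^\circ$, as they must for three momenta that sum to zero, and on the edge $x_1 = 1$ the angle $\theta_{23}$ vanishes, in agreement with Eq. (8).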
Figure 2. Allowed region for $(x_1, x_2)$. Then $x_3$ is $2 - x_1 - x_2$. The labels and small pictures in the right hand diagram show the physical configuration of the three partons corresponding to subregions in the allowed triangle.
2.2 Structure of the cross section

One can easily calculate the cross section corresponding to Fig. 1 and the similar amplitude in which the gluon attaches to the antiquark line. The result is

$$\frac{1}{\sigma_0}\,\frac{d\sigma}{dx_1\,dx_2} = \frac{\alpha_s}{2\pi}\,C_F\,\frac{x_1^2+x_2^2}{(1-x_1)(1-x_2)}, \tag{9}$$

where $C_F = 4/3$ and $\sigma_0 = (4\pi\alpha^2/s)\sum_f Q_f^2$ is the total cross section for $e^+e^- \to$ hadrons at order $\alpha_s^0$. The cross section has collinear singularities:

$$\begin{aligned}
(1-x_1) &\to 0, \qquad (2\,\&\,3\ \text{collinear});\\
(1-x_2) &\to 0, \qquad (1\,\&\,3\ \text{collinear}).
\end{aligned} \tag{10}$$

There is also a singularity when the gluon is soft: $x_3 \to 0$. In terms of $x_1$ and $x_2$, this singularity occurs when

$$(1-x_1)\to 0, \qquad (1-x_2)\to 0, \qquad \frac{1-x_1}{1-x_2} \sim \text{const.} \tag{11}$$
Let us write the cross section in a way that displays the collinear singularity at $\theta_{31}\to 0$ and the soft singularity at $E_3 \to 0$:

$$\frac{1}{\sigma_0}\,\frac{d\sigma}{dE_3\,d\cos\theta_{31}} = \frac{\alpha_s}{2\pi}\,C_F\,\frac{f(E_3,\theta_{31})}{E_3\,(1-\cos\theta_{31})}. \tag{12}$$

Here $f(E_3,\theta_{31})$ is a rather complicated function. The only thing that we need to know about it is that it is finite for $E_3 \to 0$ and for $\theta_{31}\to 0$.
Now look at the collinear singularity, $\theta_{31}\to 0$. If we integrate over the singular region holding $E_3$ fixed we find that the integral is divergent:

$$\int_a^1 d\cos\theta_{31}\,\frac{d\sigma}{dE_3\,d\cos\theta_{31}} = \log(\infty). \tag{13}$$

Similarly, if we integrate over the region of the soft singularity, holding $\theta_{31}$ fixed, we find that the integral is divergent:

$$\int_0^a dE_3\,\frac{d\sigma}{dE_3\,d\cos\theta_{31}} = \log(\infty). \tag{14}$$

Evidently, perturbation theory is telling us that we should not take the perturbative cross section too literally. The total cross section for $e^+e^- \to$ hadrons is certainly finite, so this partial cross section cannot be infinite. What we are seeing is a breakdown of perturbation theory in the soft and collinear regions, and we should understand why.
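To make the divergence concrete, one can integrate the right hand side of Eq. (9), without the prefactor $\alpha_s C_F/2\pi$, over the allowed triangle while keeping a distance $\epsilon$ away from the singular edges $x_1 = 1$ and $x_2 = 1$. The sketch below does this with a simple midpoint rule. The cutoff $\epsilon$, the function name and the grid size are illustrative assumptions, not something taken from the lectures.

```python
import math

def three_jet_integral(eps, n=400):
    """Integrate (x1^2 + x2^2) / ((1 - x1)(1 - x2)) over the allowed
    triangle, restricted to x1 <= 1 - eps and x2 <= 1 - eps.

    With u = -ln(1 - x1) and v = -ln(1 - x2), the singular measure
    dx1 dx2 / ((1 - x1)(1 - x2)) becomes du dv, so a plain midpoint
    rule in (u, v) is adequate.
    """
    log_range = math.log(1.0 / eps)
    h = log_range / n
    xs = [1.0 - math.exp(-(k + 0.5) * h) for k in range(n)]
    total = 0.0
    for x1 in xs:
        for x2 in xs:
            if x1 + x2 >= 1.0:  # same as x3 = 2 - x1 - x2 <= 1
                total += x1 * x1 + x2 * x2
    return total * h * h

for eps in (1e-2, 1e-4, 1e-6):
    value = three_jet_integral(eps)
    print(eps, round(value, 2), round(value / math.log(1.0 / eps) ** 2, 3))
```

The integral keeps growing as $\epsilon$ is reduced, roughly like the square of $\ln(1/\epsilon)$: one logarithm from each singular edge.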
Figure 3. Cross section for $e^+e^- \to q\bar{q}g$, illustrating the singularity when the gluon is soft or collinear with the quark.
Where do the singularities come from? Look at Fig. 3 (in a physical gauge). The scattering matrix element $\mathcal{M}$ contains a factor $1/(p_1+p_3)^2$ where

$$(p_1+p_3)^2 = 2\,p_1\cdot p_3 = 2E_1E_3\,(1-\cos\theta_{31}). \tag{15}$$

Evidently, $1/(p_1+p_3)^2$ is singular when $\theta_{31}\to 0$ and when $E_3\to 0$. The collinear singularity is somewhat softened because the numerator of the Feynman diagram contains a factor proportional to $\theta_{31}$ in the collinear limit. (This is not exactly obvious, but is easily seen by calculating. If you like symmetry arguments, you can derive this factor from quark helicity conservation and overall angular momentum conservation.) We thus find that

$$|\mathcal{M}|^2 \propto \left[\frac{\theta_{31}}{E_3\,\theta_{31}^2}\right]^2 \tag{16}$$
for $E_3\to 0$ and $\theta_{31}\to 0$. Note the universal nature of these factors. Integration over the double singular region of the momentum space for the gluon has the form

$$\int \frac{E_3^2\,dE_3\,d\cos\theta_{31}\,d\phi}{E_3} \sim \int E_3\,dE_3\,d\theta_{31}^2\,d\phi. \tag{17}$$

Combining the integration with the matrix element squared gives

$$d\sigma \sim \int E_3\,dE_3\,d\theta_{31}^2\,d\phi\,\left[\frac{\theta_{31}}{E_3\,\theta_{31}^2}\right]^2 \sim \int \frac{dE_3}{E_3}\,\frac{d\theta_{31}^2}{\theta_{31}^2}\,d\phi. \tag{18}$$

Thus we have a double logarithmic divergence in perturbation theory for the soft and collinear region. With just a little enhancement of the argument, we see that there is a collinear divergence from integration over $\theta_{31}$ at finite $E_3$ and a separate soft divergence from integration over $E_3$ at finite $\theta_{31}$. Essentially the same argument applies to more complicated graphs. There are divergences when two final state partons become collinear and when a final state gluon becomes soft. Generalizing further [9], there are also divergences when several final state partons become collinear to one another or when several (with no net flavor quantum numbers) become soft.

We have seen that if we integrate over the singular region in momentum space with no cutoff, we get infinity. The integrals are logarithmically divergent, so if we integrate with an infrared cutoff $M_{\rm IR}$, we will get big logarithms of $M_{\rm IR}^2/s$. Thus the collinear and soft singularities represent perturbation theory out of control. Carrying on to higher orders of perturbation theory, one gets

$$1 + \alpha_s \times (\text{big}) + \alpha_s^2 \times (\text{big})^2 + \cdots. \tag{19}$$

If this expansion is in powers of $\alpha_s(M_Z)$, we have $\alpha_s$
2.3 Interlude: Null plane coordinates

In order to understand better the issue of singularities, it is helpful to introduce a concept that is generally quite useful in high energy quantum field theory, null plane coordinates. The idea is to describe the momentum of a particle using momentum components $p^\mu = (p^+, p^-, p^1, p^2)$ where

$$p^\pm = (p^0 \pm p^3)/\sqrt{2}. \tag{20}$$
Figure 4. Null plane axes in momentum space.
For a particle with large momentum in the $+z$ direction and limited transverse momentum, $p^+$ is large and $p^-$ is small. Often one chooses the plus axis so that a particle or group of particles of interest have large $p^+$ and small $p^-$ and $\mathbf{p}_T$. Using null plane components, the covariant square of $p^\mu$ is

$$p^2 = 2\,p^+p^- - \mathbf{p}_T^2. \tag{21}$$

Thus, for a particle on its mass shell, $p^-$ is

$$p^- = \frac{\mathbf{p}_T^2 + m^2}{2\,p^+}. \tag{22}$$

Note also that, for a particle on its mass shell,

$$p^+ > 0, \qquad p^- > 0. \tag{23}$$

Integration over the mass shell is

$$(2\pi)^{-3}\int \frac{d^3\vec{p}}{2\sqrt{\vec{p}^{\,2}+m^2}} = (2\pi)^{-3}\int d^2\mathbf{p}_T \int_0^\infty \frac{dp^+}{2\,p^+}. \tag{24}$$
We also use the plus/minus components to describe a space-time point $x^\mu$: $x^\pm = (x^0 \pm x^3)/\sqrt{2}$. In describing a system of particles moving with large momentum in the plus direction, we are invited to think of $x^+$ as "time." Classically, the particles in our system follow paths nearly parallel to the $x^+$ axis, with the system evolving slowly as it moves from one $x^+ = \text{const.}$ plane to another.

We relate momentum space to position space for a quantum system by Fourier transforming. In doing so, we have a factor $\exp(ip\cdot x)$, which has the form

$$p\cdot x = p^+x^- + p^-x^+ - \mathbf{p}_T\cdot\mathbf{x}_T. \tag{25}$$

Thus $x^-$ is conjugate to $p^+$ and $x^+$ is conjugate to $p^-$. That is a little confusing, but it is simple enough.

2.4 Space-time picture of the singularities

Figure 5. Correspondence between singularities in momentum space and the development of the system in space-time.
We now return to the singularity structure of $e^+e^- \to q\bar{q}g$. Define $p_1^\mu + p_3^\mu = k^\mu$. Choose null plane coordinates with $k^+$ large and $\mathbf{k}_T = 0$. Then $k^2 = 2k^+k^-$ becomes small when

$$k^- = \frac{\mathbf{p}_{3,T}^2}{2\,p_1^+} + \frac{\mathbf{p}_{3,T}^2}{2\,p_3^+} \tag{26}$$

becomes small. This happens when $\mathbf{p}_{3,T}$ becomes small with fixed $p_1^+$ and $p_3^+$, so that the gluon momentum is nearly collinear with the quark momentum. It also happens when $\mathbf{p}_{3,T}$ and $p_3^+$ both become small with $p_3^+ \propto |\mathbf{p}_{3,T}|$, so that the gluon momentum is soft. (It also happens when the quark becomes soft, but there is a numerator factor that cancels the soft quark singularity.) Thus the singularities for a soft or collinear gluon correspond to small $k^-$.

Now consider the Fourier transform to coordinate space. The quark propagator in Fig. 5 is

$$S_F(k) = \int dx^+\,dx^-\,d^2\mathbf{x}_T\ \exp\!\left(i\left[k^+x^- + k^-x^+ - \mathbf{k}_T\cdot\mathbf{x}_T\right]\right) S_F(x). \tag{27}$$

When $k^+$ is large and $k^-$ is small, the contributing values of $x$ have small $x^-$ and large $x^+$. Thus the propagation of the virtual quark can be pictured in space-time as in Fig. 5. The quark propagates a long distance in the $x^+$ direction before decaying into a quark-gluon pair. That is, the singularities that can lead to divergent perturbative cross sections arise from interactions that happen a long time after the creation of the initial quark-antiquark pair.
2.5 Nature of the long-time physics

Figure 6. Typical paths of partons in space contributing to $e^+e^- \to$ hadrons, as suggested by the singularities of perturbative diagrams. Short wavelength fields are represented by classical paths of particles. Long wavelength fields are represented by wavy lines.
Imagine dividing the contributions to a scattering cross section into long-time contributions and short-time contributions. In the long-time contributions, perturbation theory is out of control, as indicated in Eq. (19). Nevertheless the generic structure of the long-time contribution is of great interest. This structure is illustrated in Fig. 6. Perturbative diagrams have big contributions from space-time histories in which partons move in collinear groups and additional partons are soft and communicate over large distances, while carrying small momentum.

The picture of Fig. 6 is suggested by the singularity structure of diagrams at any fixed order of perturbation theory. Of course, there could be nonperturbative effects that would invalidate the picture. Since nonperturbative effects can be invisible in perturbation theory, one cannot claim that the structure of the final state indicated in Fig. 6 is known to be a consequence of QCD. One can point, however, to some cases in which one can go beyond fixed order perturbation theory and sum the most important effects of diagrams of all orders (for example, Ref. [10]). In such cases, the general picture suggested by Fig. 6 remains intact.

We thus find that perturbative QCD suggests a certain structure of the final state produced in $e^+e^- \to$ hadrons: the final state should consist of jets of nearly collinear particles plus soft particles moving in random directions. In fact, this qualitative prediction is a qualitative success.

Given some degree of qualitative success, we may be bolder and ask whether perturbative QCD permits quantitative predictions. If we want quantitative predictions, we will somehow have to find things to measure that are not sensitive to interactions that happen long after the basic hard interaction. This is the subject of the next section.

2.6 The long-time problem
We have seen that perturbation theory is not effective for long-time physics. But the detector is a long distance away from the interaction, so it would seem that long-time physics has to be present.

Fortunately, there are some measurements that are not sensitive to long-time physics. An example is the total cross section to produce hadrons in $e^+e^-$ annihilation. Here effects from times $\Delta t \gg 1/\sqrt{s}$ cancel because of unitarity. To see why, note that the quark state is created from the vacuum by a current operator $J$ at some time $t$; it then develops from time $t$ to time $\infty$ according to the interaction picture evolution operator $U(\infty,t)$, when it becomes the final state $|N\rangle$. The cross section is proportional to the sum over $N$ of this amplitude times a similar complex conjugate amplitude with $t$ replaced by a different time $t'$. We Fourier transform this with $\exp(-i\sqrt{s}\,(t-t'))$, so that we can take $\Delta t = t - t'$ to be of order $1/\sqrt{s}$. Now replacing $\sum_N |N\rangle\langle N|$ by the unit operator and using the unitarity of the evolution operators $U$, we obtain

$$\begin{aligned}
\sum_N \langle 0|J(t')\,U(t',\infty)|N\rangle\langle N|U(\infty,t)\,J(t)|0\rangle
&= \langle 0|J(t')\,U(t',\infty)\,U(\infty,t)\,J(t)|0\rangle\\
&= \langle 0|J(t')\,U(t',t)\,J(t)|0\rangle.
\end{aligned} \tag{28}$$

Because of unitarity, the long-time evolution has canceled out of the cross section, and we have only evolution from $t$ to $t'$.

There are three ways to view this result. First, we have the formal argument given above. Second, we have the intuitive understanding that after the initial quarks and gluons are created in a time $\Delta t$ of order $1/\sqrt{s}$, something will happen with probability 1. Exactly what happens is long-time physics, but we don't care about it since we sum over all the possibilities $|N\rangle$. Third, we can calculate at some finite order of perturbation theory. Then we see infrared infinities at various stages of the calculations, but we find that the infinities cancel between real gluon emission graphs and virtual gluon graphs. An example is shown in Fig. 7.

We see that the total cross section is free of sensitivity to long-time physics. If the total cross section were all you could look at, QCD physics would be a little boring. Fortunately, there are other quantities that are not sensitive to infrared effects. They are called infrared safe quantities.

To formulate the concept of infrared safety, consider a measured quantity
Figure 7. Cancellation between real and virtual gluon graphs. If we integrate the real gluon graph on the left times the complex conjugate of the similar graph with the gluon attached to the antiquark, we will get an infrared infinity. However the virtual gluon graph on the right times the complex conjugate of the Born graph is also divergent, as is the Born graph times the complex conjugate of the virtual gluon graph. Adding everything together, the infrared infinities cancel.
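that is constructed from the cross sections,

$$\frac{d\sigma[n]}{d\Omega_2\,dE_3\,d\Omega_3\cdots dE_n\,d\Omega_n}, \tag{29}$$

to make $n$ hadrons in $e^+e^-$ annihilation. Here $E_j$ is the energy of the $j$th hadron and $\Omega_j = (\theta_j, \phi_j)$ describes its direction. We treat the hadrons as effectively massless and do not distinguish the hadron flavors. Following the notation of Ref. [11], let us specify functions $\mathcal{S}_n$ that describe the measurement we want, so that the measured quantity is

$$\begin{aligned}
\mathcal{I} ={}& \frac{1}{2!}\int d\Omega_2\,\frac{d\sigma[2]}{d\Omega_2}\,\mathcal{S}_2(p_1^\mu,p_2^\mu)\\
&+ \frac{1}{3!}\int d\Omega_2\,dE_3\,d\Omega_3\,\frac{d\sigma[3]}{d\Omega_2\,dE_3\,d\Omega_3}\,\mathcal{S}_3(p_1^\mu,p_2^\mu,p_3^\mu)\\
&+ \frac{1}{4!}\int d\Omega_2\,dE_3\,d\Omega_3\,dE_4\,d\Omega_4\,\frac{d\sigma[4]}{d\Omega_2\,dE_3\,d\Omega_3\,dE_4\,d\Omega_4}\,\mathcal{S}_4(p_1^\mu,p_2^\mu,p_3^\mu,p_4^\mu)\\
&+ \cdots.
\end{aligned} \tag{30}$$

The functions $\mathcal{S}$ are symmetric functions of their arguments. In order for our measurement to be infrared safe, we need

$$\mathcal{S}_{n+1}\!\left(p_1^\mu,\dots,(1-\lambda)\,p_n^\mu,\ \lambda\,p_n^\mu\right) = \mathcal{S}_n\!\left(p_1^\mu,\dots,p_n^\mu\right) \tag{31}$$

for $0 \le \lambda \le 1$.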
Figure 8. Infrared safety. In an infrared safe measurement, the three jet event shown on the left should be (approximately) equivalent to an ideal three jet event shown on the right.
What does this mean? The physical meaning is that the functions $\mathcal{S}_n$ and $\mathcal{S}_{n-1}$ are related in such a way that the cross section is not sensitive to whether or not a mother particle divides into two collinear daughter particles that share its momentum. The cross section is also not sensitive to whether or not a mother particle decays to a daughter particle carrying all of its momentum and a soft daughter particle carrying no momentum. The cross section is also not sensitive to whether or not two collinear particles combine, or a soft particle is absorbed by a fast particle. All of these decay and recombination processes can happen with large probability in the final state long after the hard interaction. But, by construction, they don't matter as long as the sum of the probabilities for something to happen or not to happen is one.

Another version of the physical meaning is that for an IR-safe quantity a physical event with hadron jets should give approximately the same measurement as a parton event with each jet replaced by a parton, as illustrated in Fig. 8. To see this, we simply have to delete soft particles and combine collinear particles until three jets have become three particles.

In a calculation of the measured quantity $\mathcal{I}$, we simply calculate with partons instead of hadrons in the final state. The calculational meaning of the infrared safety condition is that the infrared infinities cancel. The argument is that the infinities arise from soft and collinear configurations of the partons, that these configurations involve long times, and that the time evolution operator is unitary.

I have started with an abstract formulation of infrared safety. It would be good to have a few examples. The easiest is the total cross section, for which

$$\mathcal{S}_n(p_1^\mu,\dots,p_n^\mu) = 1. \tag{32}$$

A less trivial example is the thrust distribution. One defines the thrust $\mathcal{T}_n$ of an $n$ particle event as

$$\mathcal{T}_n(p_1^\mu,\dots,p_n^\mu) = \max_{\vec{u}}\ \frac{\sum_{i=1}^n |\vec{p}_i\cdot\vec{u}|}{\sum_{i=1}^n |\vec{p}_i|}. \tag{33}$$

Here $\vec{u}$ is a unit vector, which we vary to maximize the sum of the absolute values of the projections of $\vec{p}_i$ on $\vec{u}$. Then the thrust distribution $(1/\sigma_{\rm tot})\,d\sigma/dT$ is defined by taking

$$\mathcal{S}_n(p_1^\mu,\dots,p_n^\mu) = (1/\sigma_{\rm tot})\ \delta\!\left(T - \mathcal{T}_n(p_1^\mu,\dots,p_n^\mu)\right). \tag{34}$$
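The infrared safety of the thrust can be tested directly on Eq. (33). The sketch below evaluates the thrust of a small event by brute force and compares the result before and after a collinear splitting and after adding a zero-momentum particle. It relies on the fact that the maximum over $\vec{u}$ of $\sum_i |\vec{p}_i\cdot\vec{u}|$ equals the largest $|\sum_i s_i\,\vec{p}_i|$ over all sign choices $s_i = \pm 1$, so enumerating signs is exact, though only practical for a handful of particles. The function name and the sample event are illustrative assumptions.

```python
import itertools
import math

def thrust(momenta):
    """Thrust of an event given as a list of 3-vectors (px, py, pz)."""
    norm = sum(math.sqrt(px * px + py * py + pz * pz) for px, py, pz in momenta)
    best = 0.0
    # The overall sign is irrelevant, so the first sign is fixed to +1.
    for rest in itertools.product((1.0, -1.0), repeat=len(momenta) - 1):
        signs = (1.0,) + rest
        sx = sum(s * p[0] for s, p in zip(signs, momenta))
        sy = sum(s * p[1] for s, p in zip(signs, momenta))
        sz = sum(s * p[2] for s, p in zip(signs, momenta))
        best = max(best, math.sqrt(sx * sx + sy * sy + sz * sz))
    return best / norm

# A three-parton event with zero total momentum.
event = [(1.0, 0.0, 0.0), (-0.5, 0.6, 0.0), (-0.5, -0.6, 0.0)]

# Split the third parton into two collinear pieces, as in Eq. (31).
lam = 0.25
p3 = event[2]
split = event[:2] + [tuple((1.0 - lam) * c for c in p3), tuple(lam * c for c in p3)]

# Add a parton with zero momentum.
soft = event + [(0.0, 0.0, 0.0)]

print(thrust(event), thrust(split), thrust(soft))
```

All three numbers agree to rounding accuracy, which is the content of Eq. (31) for this observable.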
It is a simple exercise to show that the thrust of an event is not affected by collinear parton splitting or by zero momentum partons. Therefore the thrust distribution is infrared safe.

Another example is the energy-energy correlation function $d\Sigma/d\cos\theta$:

$$\mathcal{S}_n(p_1^\mu,\dots,p_n^\mu) = \sum_{i,j} \frac{E_i E_j}{s}\ \delta\!\left(\cos\theta_{ij} - \cos\theta\right). \tag{35}$$

This measures the correlation between the energies measured by detectors separated by an angle $\theta$ as depicted in Fig. 9. Is this infrared safe? Note that the contribution from a particle with $E_i \to 0$ drops out. In addition, replacing one particle by two collinear particles doesn't change the result:

$$(1-\lambda)\,E_n E_j + \lambda\,E_n E_j = E_n E_j. \tag{36}$$

This works for the autocorrelation term too:

$$(1-\lambda)^2 E_n^2 + 2\lambda(1-\lambda)\,E_n^2 + \lambda^2 E_n^2 = E_n^2. \tag{37}$$

Figure 9. The energy-energy correlation function
A final example is the cross section to make $n$ jets, $\sigma_n$. Intuitively, a jet is supposed to be a spray of particles all going in approximately the same direction. To make this precise, we need a definite algorithm. There are several algorithms to choose from. Here is the simplest (but not the best) one.

Figure 10. Jet definition

Start with a list of momenta $p_1^\mu, p_2^\mu, \dots, p_n^\mu$. At the start, these represent the momenta of particles. (In a perturbative calculation, they are the momenta of partons.) Choose a parameter $y_{\rm cut}$. Now proceed through the following steps:
1. Find the pair $(i,j)$ such that $(p_i + p_j)^2$ is the smallest.
2. If $(p_i + p_j)^2 > y_{\rm cut}\,s$, exit. Else continue.
3. Replace the two momenta $p_i^\mu$ and $p_j^\mu$ in the list by their sum $p_J^\mu = p_i^\mu + p_j^\mu$.
4. Go to 1.
This produces a list of momenta $p_i^\mu$ of jets.
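The four steps above translate almost line by line into code. The following is a minimal sketch, not an optimized implementation: momenta are plain tuples $(E, p_x, p_y, p_z)$, and the helper `three_parton_event`, which builds a planar massless test event from energy fractions, is a hypothetical convenience added for the example.

```python
import math

def pair_mass_sq(p, q):
    """Invariant mass squared (p + q)^2 for four-momenta (E, px, py, pz)."""
    e, x, y, z = (a + b for a, b in zip(p, q))
    return e * e - x * x - y * y - z * z

def cluster_jets(momenta, y_cut, s):
    """Apply steps 1 to 4 and return the list of jet momenta."""
    jets = [tuple(p) for p in momenta]
    while len(jets) > 1:
        # Step 1: find the pair with the smallest (p_i + p_j)^2.
        m2, i, j = min(
            (pair_mass_sq(jets[a], jets[b]), a, b)
            for a in range(len(jets))
            for b in range(a + 1, len(jets))
        )
        # Step 2: exit once even the closest pair exceeds y_cut * s.
        if m2 > y_cut * s:
            break
        # Step 3: replace the pair by the sum of the two momenta.
        merged = tuple(u + v for u, v in zip(jets[i], jets[j]))
        jets = [p for k, p in enumerate(jets) if k not in (i, j)] + [merged]
        # Step 4: go back to step 1.
    return jets

def three_parton_event(x1, x2, sqrt_s):
    """Planar massless three-parton event with energy fractions x1, x2."""
    x3 = 2.0 - x1 - x2
    e1, e2, e3 = (0.5 * sqrt_s * x for x in (x1, x2, x3))
    cos12 = 1.0 - 2.0 * (1.0 - x3) / (x1 * x2)
    sin12 = math.sqrt(max(0.0, 1.0 - cos12 * cos12))
    p1 = (e1, 0.0, 0.0, e1)
    p2 = (e2, e2 * sin12, 0.0, e2 * cos12)
    p3 = (e3, -p2[1], 0.0, -(p1[3] + p2[3]))
    return [p1, p2, p3]

SQRT_S = 100.0
event = three_parton_event(0.9, 0.8, SQRT_S)
# Same event with the third parton split into two collinear pieces.
lam = 0.3
p3 = event[2]
split = event[:2] + [tuple(lam * c for c in p3), tuple((1.0 - lam) * c for c in p3)]

for y_cut in (0.05, 0.15):
    n_plain = len(cluster_jets(event, y_cut, SQRT_S**2))
    n_split = len(cluster_jets(split, y_cut, SQRT_S**2))
    print(y_cut, n_plain, n_split)
```

For this event the pair invariant masses are $s(1-x_k)$, so the smaller $y_{\rm cut}$ resolves three jets and the larger one merges partons 2 and 3. The collinear splitting does not change the jet count, which is the property required of an infrared safe jet definition.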
factorization involving parton distribution functions, which we will discuss later. (See Refs. [1], [2], [14] for more information.)

3 The smallest time scales
In this section, I explore the physics of time scales smaller than $1/\sqrt{s}$. One way of looking at this physics is to say that it is plagued by infinities and we can manage to hide the infinities. A better view is that the short-time physics contains wonderful truths that we would like to discover: truths about grand unified theories, quantum gravity and the like. However, quantum field theory is arranged so as to effectively hide the truth from our experimental apparatus, which can probe with a time resolution of only an inverse half TeV.

I first outline what renormalization does to hide the ugly infinities or the beautiful truth. Then I describe how renormalization leads to the running coupling. Because of renormalization, calculated quantities depend on a renormalization scale. I look at how this dependence works and how the scale can be chosen. Finally, I discuss how one can use experiment to look for the hidden physics beyond the Standard Model, taking high $E_T$ jet production in hadron collisions as an example.

3.1 What renormalization does
In any Feynman graph, one can insert perturbative corrections to the vertices and the propagation of particles, as illustrated in Fig. 11. The loop integrals in these graphs will get big contributions from momenta much larger than $\sqrt{s}$. That is, there are big contributions from interactions that happen on time scales much smaller than $1/\sqrt{s}$. I have tried to illustrate this in the figure. The virtual vector boson propagates for a time $1/\sqrt{s}$, while the virtual fluctuations that correct the electroweak vertex and the quark propagator occur over a time $\Delta t$ that can be much smaller than $1/\sqrt{s}$.

Let us pick an ultraviolet cutoff $M$ that is much larger than $\sqrt{s}$, so that we calculate the effect of fluctuations with $1/M < \Delta t$ exactly, up to some order of perturbation theory. What, then, is the effect of virtual fluctuations on smaller time scales, $\Delta t$ with $\Delta t < 1/M$ but, say, $\Delta t$ still larger than $t_{\rm Planck}$, where gravity takes over? Let us suppose that we are willing to neglect contributions to the cross section that are of order $\sqrt{s}/M$ or smaller compared to the cross section itself. Then there is a remarkable theorem [15]: the effects of the fluctuations are not particularly small, but they can be absorbed into changes in the couplings of the theory. (There are also changes in the masses of the theory and adjustments to the normalizations of the field operators, but we can concentrate on the effect on the couplings.)

Figure 11. Renormalization. The effect of the very small time interactions pictured are absorbed into the running coupling.
The program of absorbing very short-time physics into a few parameters goes under the name of renormalization. There are several schemes available for renormalizing. Each of them involves the introduction of some scale parameter that is not intrinsic to the theory but tells how we did the renormalization. Let us agree to use $\overline{\rm MS}$ renormalization (see Ref. [15] for details). Then we introduce an $\overline{\rm MS}$ renormalization scale $\mu$. A good (but approximate) way of thinking of $\mu$ is that the physics of time scales $\Delta t \ll 1/\mu$ is removed from the perturbative calculation. The effect of the small time physics is accounted for by adjusting the value of the strong coupling, so that its value depends on the scale that we used: $\alpha_s = \alpha_s(\mu)$. (The value of the electromagnetic coupling also depends on $\mu$.)

3.2 The running coupling

Figure 12. Short-time fluctuations in the propagation of the gluon field absorbed into the running strong coupling.
We account for time scales much smaller than $1/\mu$ by using the running coupling $\alpha_s(\mu)$. That is, a fluctuation such as that illustrated in Fig. 12 can be dropped from a calculation and absorbed into the running coupling that describes the probability for the quark in the figure to emit the gluon. The $\mu$ dependence of $\alpha_s(\mu)$ is given by a certain differential equation, called the renormalization group equation (see Ref. [15]):

$$\frac{d}{d\ln\mu^2}\,\frac{\alpha_s(\mu)}{\pi} = \beta(\alpha_s(\mu)) = -\beta_0\left(\frac{\alpha_s(\mu)}{\pi}\right)^2 - \beta_1\left(\frac{\alpha_s(\mu)}{\pi}\right)^3 + \cdots. \tag{39}$$
One calculates the beta function $\beta(\alpha_s)$ perturbatively in QCD. The first coefficient, with the conventions used here, is

$$\beta_0 = (33 - 2N_f)/12, \tag{40}$$

where $N_f$ is the number of quark flavors.

Of course, at time scales smaller than a very small cutoff $1/M$ (at the "GUT scale," say) there is completely different physics operating. Therefore, if we use just QCD to adjust the strong coupling, we can say that we are accounting for the physics between times $1/M$ and $1/\mu$. The value of $\alpha_s$ at $\mu_0 \approx M$ is then the boundary condition for the differential equation. See Fig. 13.

Figure 13. Time scales accounted for by explicit fixed order perturbative calculation and by use of the renormalization group.
The renormalization group equation sums the effects of short-time fluctuations of the fields. To see what one means by "sums" here, consider the result of solving the renormalization group equation with all of the $\beta_i$ beyond $\beta_0$ set to zero:

$$\begin{aligned}
\alpha_s(\mu) &\approx \alpha_s(M) - (\beta_0/\pi)\ln(\mu^2/M^2)\,\alpha_s^2(M) + (\beta_0/\pi)^2\ln^2(\mu^2/M^2)\,\alpha_s^3(M) + \cdots\\
&= \frac{\alpha_s(M)}{1 + (\beta_0/\pi)\,\alpha_s(M)\,\ln(\mu^2/M^2)}.
\end{aligned} \tag{41}$$
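The statement that the closed form in Eq. (41) sums a geometric series can be checked numerically. The sketch below compares partial sums of the series with the closed form. For convenience the reference scale is taken to be the Z mass with $\alpha_s = 0.117$ rather than a GUT-scale value; the numerical value 91.19 GeV for the Z mass, the fixed choice of five flavors, and the function names are assumptions of this illustration.

```python
import math

def beta0(n_f):
    """First beta function coefficient, Eq. (40)."""
    return (33.0 - 2.0 * n_f) / 12.0

def log_term(mu, ref_scale, alpha_ref, n_f):
    """The combination (beta0 / pi) * alpha_s(M) * ln(mu^2 / M^2)."""
    return beta0(n_f) / math.pi * alpha_ref * math.log(mu**2 / ref_scale**2)

def alpha_s_closed(mu, ref_scale, alpha_ref, n_f=5):
    """Closed form on the second line of Eq. (41)."""
    return alpha_ref / (1.0 + log_term(mu, ref_scale, alpha_ref, n_f))

def alpha_s_series(mu, ref_scale, alpha_ref, n_terms, n_f=5):
    """First n_terms terms of the series on the first line of Eq. (41)."""
    x = log_term(mu, ref_scale, alpha_ref, n_f)
    return alpha_ref * sum((-x) ** k for k in range(n_terms))

M_REF, ALPHA_REF = 91.19, 0.117  # assumed reference point (Z mass in GeV)
for mu in (10.0, 34.0, 200.0):
    print(
        mu,
        round(alpha_s_series(mu, M_REF, ALPHA_REF, 3), 5),
        round(alpha_s_closed(mu, M_REF, ALPHA_REF), 5),
    )
```

The three-term sum is already close to the closed form when the logarithm is modest, and the coupling is seen to fall as $\mu$ grows.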
A series in powers of $\alpha_s(M)$ (that is, the strong coupling at the GUT scale) is summed into a simple function of $\mu$. Here $\alpha_s(M)$ appears as a parameter in the solution.

Note a crucial and wonderful fact. The value of $\alpha_s(\mu)$ decreases as $\mu$ increases. This is called "asymptotic freedom." Asymptotic freedom implies that QCD acts like a weakly interacting theory on short time scales. It is true that quarks and gluons are strongly bound inside nucleons, but this strong binding is the result of weak forces acting collectively over a long time.

In Eq. (41), we are invited to think of the graph of $\alpha_s(\mu)$ versus $\mu$. The differential equation that determines this graph is characteristic of QCD. There could, however, be different versions of QCD with the same differential equation but different curves, corresponding to different boundary values $\alpha_s(M)$. Thus the parameter $\alpha_s(M)$ tells us which version of QCD we have. To determine this parameter, we consult experiment. Actually, Eq. (41) is not the most convenient way to write the solution for the running coupling. A better expression is

$$\alpha_s(\mu) \approx \frac{\pi}{\beta_0\,\ln(\mu^2/\Lambda^2)}. \tag{42}$$
Here we have replaced $\alpha_s(M)$ by a different (but completely equivalent) parameter $\Lambda$. A third form of the running coupling is

$$\alpha_s(\mu) \approx \frac{\alpha_s(M_Z)}{1 + (\beta_0/\pi)\,\alpha_s(M_Z)\,\ln(\mu^2/M_Z^2)}. \tag{43}$$

Here the value of $\alpha_s(\mu)$ at $\mu = M_Z$ labels the version of QCD that obtains in our world.

In any of the three forms of the running coupling, one should revise the equations to account for the second term in the beta function in order to be numerically precise.

3.3 The choice of scale
In this section, we consider the choice of the renormalization scale $\mu$ in a calculated cross section. Consider, as an example, the cross section for $e^+e^- \to$ hadrons via virtual photon decay. Let us write this cross section in the form

$$\sigma_{\rm tot} = \frac{4\pi\alpha^2}{s}\left(\sum_f Q_f^2\right)\left[1 + \Delta\right]. \tag{44}$$

Here $s$ is the square of the c.m. energy, $\alpha$ is $e^2/(4\pi)$, and $Q_f$ is the electric charge in units of $e$ carried by the quark of flavor $f$, with $f = u, d, s, c, b$. The nontrivial part of the calculated cross section is the quantity $\Delta$, which contains the effects of the strong interactions. Using $\overline{\rm MS}$ renormalization with scale $\mu$, one finds (after a lot of work) that $\Delta$ is given by [16]:

$$\begin{aligned}
\Delta ={}& \frac{\alpha_s(\mu)}{\pi} + \left[1.4092 + 1.9167\,\ln(\mu^2/s)\right]\left(\frac{\alpha_s(\mu)}{\pi}\right)^2\\
&+ \left[-12.805 + 7.8186\,\ln(\mu^2/s) + 3.674\,\ln^2(\mu^2/s)\right]\left(\frac{\alpha_s(\mu)}{\pi}\right)^3 + \cdots.
\end{aligned} \tag{45}$$
Here, of course, one should use for $\alpha_s(\mu)$ the solution of the renormalization group equation (39) with at least two terms included.

As discussed in the preceding subsection, when we renormalize with scale $\mu$, we are defining what we mean by the strong coupling. Thus $\alpha_s$ in Eq. (45) depends on $\mu$. The perturbative coefficients in Eq. (45) also depend on $\mu$. On the other hand, the physical cross section does not depend on $\mu$:

$$\frac{d}{d\ln\mu^2}\,\Delta = 0. \tag{46}$$

That is because $\mu$ is just an artifact of how we organize perturbation theory, not a parameter of the underlying theory.

Let us consider Eq. (46) in more detail. Write $\Delta$ in the form

$$\Delta \sim \sum_{n=1}^{\infty} c_n(\mu)\,\alpha_s(\mu)^n. \tag{47}$$

If we differentiate not the complete infinite sum but just the first $N$ terms, we get minus the derivative of the sum from $N+1$ to infinity. This remainder is of order $\alpha_s^{N+1}$ as $\alpha_s \to 0$. Thus

$$\frac{d}{d\ln\mu^2}\sum_{n=1}^{N} c_n(\mu)\,\alpha_s(\mu)^n \sim \mathcal{O}\!\left(\alpha_s(\mu)^{N+1}\right). \tag{48}$$

That is, the harder we work calculating more terms, the less the calculated cross section depends on $\mu$.

Since we have not worked infinitely hard, the calculated cross section depends on $\mu$. What choice shall we make for $\mu$? Clearly, $\ln(\mu^2/s)$ should not be big. Otherwise the coefficients $c_n(\mu)$ are large and the "convergence" of perturbation theory will be spoiled.

There are some who will argue that one scheme or the other for choosing $\mu$ is the "best." You are welcome to follow whichever advisor you want. I will show you below that for a well behaved quantity like $\Delta$ the precise choice makes little difference, as long as you obey the common sense prescription that $\ln(\mu^2/s)$ not be big.
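The reduction of the scale dependence described by Eq. (48) can be seen in a rough numerical sketch built from the first two terms of Eq. (45). As a simplification, the coupling is run with the one-loop formula (43) instead of the two-term solution recommended in the text, which is also why the third-order term is left out here: its $\mu$ dependence only cancels properly against two-loop running. The Z mass value of 91.19 GeV, the function names and the scan range are assumptions of the illustration; $\alpha_s(M_Z) = 0.117$ and $\sqrt{s} = 34$ GeV are the values used in the next subsection.

```python
import math

M_Z = 91.19                       # GeV, assumed value of the Z mass
ALPHA_S_MZ = 0.117                # alpha_s(M_Z) as used in the text
BETA0 = (33.0 - 2.0 * 5) / 12.0   # Eq. (40) with five flavors

def alpha_s(mu):
    """One-loop running coupling, Eq. (43) (a simplification)."""
    return ALPHA_S_MZ / (
        1.0 + BETA0 / math.pi * ALPHA_S_MZ * math.log(mu**2 / M_Z**2)
    )

def delta(mu, sqrt_s, order):
    """Sum of the first `order` terms (1 or 2) of Eq. (45)."""
    a = alpha_s(mu) / math.pi
    log = math.log(mu**2 / sqrt_s**2)
    terms = [a, (1.4092 + 1.9167 * log) * a**2]
    return sum(terms[:order])

def spread(sqrt_s, order, steps=41):
    """Max minus min of delta for sqrt_s / 2 <= mu <= 2 * sqrt_s."""
    values = [
        delta(2.0 ** (-1.0 + 2.0 * k / (steps - 1)) * sqrt_s, sqrt_s, order)
        for k in range(steps)
    ]
    return max(values) - min(values)

SQRT_S = 34.0
for order in (1, 2):
    print(order, round(delta(SQRT_S, SQRT_S, order), 4), round(spread(SQRT_S, order), 4))
```

With these inputs, the variation of the two-term sum over a factor of two in $\mu$ either way is several times smaller than that of the one-term sum.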
3.4 An example

Figure 14. Dependence of $\Delta(\mu)$ on the $\overline{\rm MS}$ renormalization scale $\mu$. The falling curve is $\Delta_1$. The flatter curve is $\Delta_2$. The horizontal lines indicate the amount of variation of $\Delta_2$ when $\mu$ varies by a factor 2.
Let us consider a quantitative example of how $\Delta(\mu)$ depends on $\mu$. This will also give us a chance to think about the theoretical error caused by replacing $\Delta$ by the sum $\Delta_n$ of the first $n$ terms in its perturbative expansion. Of course, we do not know what this error is. All we can do is provide an estimate. (Our discussion will be rather primitive. For a more detailed error estimate for the case of the hadronic width of the Z boson, see Ref. [17].) Let us think of the error estimate in the spirit of a "$1\sigma$" theoretical error: we would be surprised if $|\Delta_n - \Delta|$ were much less than the error estimate and we would also be surprised if this quantity were much more than the error estimate. Here, one should exercise a little caution. We have no reason to expect that theory errors are gaussian distributed. Thus a $4\sigma$ difference between $\Delta_n$ and $\Delta$ is not out of the question, while a $4\sigma$ fluctuation in a measured quantity with purely statistical, gaussian errors is out of the question.

Take $\alpha_s(M_Z) = 0.117$, $\sqrt{s} = 34$ GeV, 5 flavors. In Fig. 14, I plot $\Delta(\mu)$ versus $p$ defined by

$$\mu = 2^p\,\sqrt{s}. \tag{49}$$
The steeply falling curve is the order $\alpha_s^1$ approximation to $\Delta(\mu)$, $\Delta_1(\mu) = \alpha_s(\mu)/\pi$. Notice that if we change $\mu$ by a factor 2, $\Delta_1(\mu)$ changes by about 0.006. If we had no other information than this, we might pick $\Delta_1(\sqrt{s}) \approx 0.044$ as the "best" value and assign a $\pm 0.006$ error to this value. (There is no special magic to the use of a factor of 2 here. The reader can pick any factor that seems reasonable.) Another error estimate can be based on the simple expectation that the coefficients of $\alpha_s^n$ are of order 1 for the first few terms. (Eventually, they will grow like $n!$. Ref. [17] takes this into account, but we ignore it here.) Then the first omitted term should be of order $\pm 1 \times \alpha_s^2 \approx \pm 0.020$ using $\alpha_s(34\ {\rm GeV}) \approx 0.14$. Since this is bigger than the previous $\pm 0.006$ error estimate, we keep this larger estimate: $\Delta \approx 0.044 \pm 0.020$.

Returning now to Fig. 14, the second curve is the order $\alpha_s^2$ approximation, $\Delta_2(\mu)$. Note that $\Delta_2(\mu)$ is less dependent on $\mu$ than $\Delta_1(\mu)$. What value would we now take as our best estimate of $\Delta$? One idea is to choose the value of $\mu$ at which $\Delta_2(\mu)$ is least sensitive to $\mu$. This idea is called the principle of minimal sensitivity [18]:

$$\Delta_{\rm PMS} = \Delta(\mu_{\rm PMS}), \qquad \left.\frac{d\Delta(\mu)}{d\ln\mu}\right|_{\mu = \mu_{\rm PMS}} = 0. \tag{50}$$
This prescription gives $\Delta \approx 0.0470$. Note that this is about 0.003 away from our previous estimate, $\Delta \approx 0.0440$. Thus our previous error estimate of 0.020 was too big, and we should be surprised that the result changed so little. We can make a new error estimate by noting that $\Delta_2(\mu)$ varies by about 0.0012 when $\mu$ changes by a factor 2 from $\mu_{\rm PMS}$. Thus we might estimate that $\Delta \approx 0.0470$ with an error of $\pm 0.0012$. This estimate is represented by the two horizontal lines in Fig. 14. An alternative error estimate can be based on the next term being of order $\pm 1 \times \alpha_s^3(34\ {\rm GeV}) \approx 0.003$. Since this is bigger than the previous $\pm 0.0012$ error estimate, we keep this larger estimate: $\Delta \approx 0.0470 \pm 0.003$.

I should emphasize that there are other ways to pick the "best" value for $\Delta$. For instance, one can use the BLM method [19], which is based on choosing the $\mu$ that sets to zero the coefficient of the number of quark flavors in $\Delta_2(\mu)$. Since the graph of $\Delta_2(\mu)$ is quite flat, it makes very little difference which method one uses.

Now let us look at $\Delta(\mu)$ evaluated at order $\alpha_s^3$, $\Delta_3(\mu)$. Here we make use of the full formula in Eq. (45). In Fig. 15, I plot $\Delta_3(\mu)$ along with $\Delta_2(\mu)$ and $\Delta_1(\mu)$. The variation of $\Delta_3(\mu)$ with $\mu$ is smaller than that of $\Delta_2(\mu)$. The improvement is not overwhelming, but is apparent particularly at small $\mu$. It is a little difficult to see what is happening in the first graph of Fig. 15, so I show the same thing with an expanded scale. (Here the error band based on the $\mu$ dependence of $\Delta_2$ is also indicated. Recall that we decided that this error band was an underestimate.)

Figure 15. Dependence of $\Delta(\mu)$ on the $\overline{\rm MS}$ renormalization scale $\mu$, first with a normal scale and then with an expanded scale. The horizontal axis in each graph is $\log_2(\mu/\sqrt{s})$. The falling curve is $\Delta_1$. The flatter curve is $\Delta_2$. The still flatter curve is $\Delta_3$. The horizontal lines represent the variation of $\Delta_2$ when $\mu$ varies by a factor 2.

The curve for $\Delta_3(\mu)$ has zero derivative at two places. The corresponding values are $\Delta \approx 0.0436$ and $\Delta \approx 0.0456$. If I take the best value of $\Delta$ to be the average of these two values and the error to be half the difference, I get $\Delta \approx 0.0446 \pm 0.0010$. The alternative error estimate is $\pm 1 \times \alpha_s^4(34\ {\rm GeV}) \approx 0.0004$. We keep the larger error estimate of $\pm 0.0010$. Was the previous error estimate valid? We guessed $\Delta \approx 0.0470 \pm 0.003$. Our new best estimate is 0.0446. The difference is 0.0024, which is in line with our previous error estimate. Had we used the error estimate $\pm 0.0012$ based on the $\mu$ dependence, we would have underestimated the difference, although we would not have been too far off.
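The scale dependence discussed in this example can be reproduced approximately with a short script. This is only a sketch under stated assumptions: the two-term form of the renormalization group equation (39) for $a = \alpha_s/\pi$, the value $M_Z = 91.19$ GeV, and a simple fixed-step Runge-Kutta integration are choices made here, not details taken from the text. The function names are hypothetical.

```python
import math

# Inputs quoted in the text: alpha_s(M_Z) = 0.117, sqrt(s) = 34 GeV, 5 flavors.
# ASSUMPTIONS: M_Z = 91.19 GeV and the two-term beta function used below.
N_F = 5
BETA0 = (33 - 2 * N_F) / 12.0
BETA1 = (153 - 19 * N_F) / 24.0
M_Z = 91.19
ALPHA_S_MZ = 0.117
SQRT_S = 34.0


def beta(a):
    """d a / d ln(mu^2) for a = alpha_s/pi, truncated to two terms (assumed form)."""
    return -BETA0 * a**2 - BETA1 * a**3


def a_of_mu(mu, steps=200):
    """Evolve a = alpha_s/pi from M_Z to mu with fixed-step RK4 in ln(mu^2)."""
    a = ALPHA_S_MZ / math.pi
    h = 2.0 * math.log(mu / M_Z) / steps
    for _ in range(steps):
        k1 = beta(a)
        k2 = beta(a + 0.5 * h * k1)
        k3 = beta(a + 0.5 * h * k2)
        k4 = beta(a + h * k3)
        a += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return a


def delta(mu, order):
    """Sum of the first `order` terms (1, 2 or 3) of Eq. (45)."""
    a = a_of_mu(mu)
    log = math.log(mu**2 / SQRT_S**2)
    terms = [
        a,
        (1.4092 + 1.9167 * log) * a**2,
        (-12.805 + 7.8186 * log + 3.674 * log**2) * a**3,
    ]
    return sum(terms[:order])


def scan(order, p_min=-1.0, p_max=1.0, n=81):
    """Delta_order on a grid in p, with mu = 2**p * sqrt(s) as in Eq. (49)."""
    ps = [p_min + i * (p_max - p_min) / (n - 1) for i in range(n)]
    return [delta(2.0**p * SQRT_S, order) for p in ps]


if __name__ == "__main__":
    for order in (1, 2, 3):
        values = scan(order)
        print(order, round(min(values), 4), round(max(values), 4))
```

Printing the minimum and maximum over a factor of 2 in $\mu$ on either side of $\sqrt{s}$ gives a quick feel for how much flatter the higher order curves are than $\Delta_1$.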
3.5 Beyond the Standard Model
We have seen how the renormalization group enables us to account for QCD physics at time scales much smaller than $1/\sqrt{s}$, as indicated in Fig. 13. However, at some scale $\Delta t \sim 1/M$, we run into the unknown! How can we see the unknown in current experiments? First, the unknown physics affects $\alpha_s$, $\alpha_{\rm em}$, $\sin^2(\theta_W)$. Second, the unknown physics affects masses of $u, d, \ldots, e, \mu, \ldots$. That is, the unknown physics (presumably) determines the parameters of the Standard Model. These parameters have been well measured. Thus, a Nobel prize awaits the physicist who figures out how to use a model for the unknown physics to predict these parameters.

There is another way that as yet unknown physics can affect current experiments. Suppose that quarks can scatter by the exchange of some new particle with a heavy mass $M$, as illustrated in Fig. 16, and suppose that this mass is not too enormous, only a few TeV. Perhaps the new particle isn't a particle at all, but is a pair of constituents that live inside of quarks. As mentioned above, this physics affects the parameters of the Standard Model. However, unless we can predict the parameters of the Standard Model, this effect does not help us. There is, however, another possible clue. The physics at the TeV scale can introduce new terms into the lagrangian that we can investigate in current experiments.

Figure 16. New physics at a TeV scale. In the first diagram, quarks scatter by gluon exchange. In the second diagram, the quarks exchange a new object with a TeV mass, or perhaps exchange some of the constituents out of which quarks are made.

In the second diagram in Fig. 16, the two vertices are never at a separation in time greater than $1/M$, so that our low energy probes cannot resolve the details of the structure. As long as we stick to low energy probes, $\sqrt{s} \ll M$, the effect of the new physics can be summarized by adding new terms to the lagrangian of QCD. A typical term might be

$$\Delta\mathcal{L} = \frac{g^2}{M^2}\,\bar\psi\gamma^\mu\psi\ \bar\psi\gamma_\mu\psi. \tag{51}$$
There is a factor $g^2$ that represents how well the new physics couples to quarks. The most important factor is the factor $1/M^2$. This factor must be there: the product of field operators has dimension 6 and the lagrangian has dimension 4, so there must be a factor with dimension $-2$. Taking this argument one step further, the product of field operators in $\Delta\mathcal{L}$ must have a dimension greater than 4 because any product of field operators having dimension equal to or less than 4 that respects the symmetries of the Standard Model is already included in the lagrangian of the Standard Model.

3.6 Looking for new terms in the effective lagrangian
How can one detect the presence in the lagrangian of a term like that in Eq. (51)? These terms are small. Therefore we need either a high precision experiment, or an experiment that looks for some effect that is forbidden in the Standard Model, or an experiment that has moderate precision and operates at energies that are as high as possible. Let us consider an example of the last of these possibilities, $p + \bar p \to {\rm jet} + X$ as a function of the transverse energy ($\sim P_T$) of the jet. The new term in the lagrangian should add a little bit to the observed cross section that is not included in the standard QCD theory. When the transverse energy $E_T$ of the jet is small compared to $M$, we expect

$$\frac{{\rm Data} - {\rm Theory}}{{\rm Theory}} \propto g^2\,\frac{E_T^2}{M^2}. \tag{52}$$

Here the factor $g^2/M^2$ follows because $\Delta\mathcal{L}$ contains this factor. The factor $E_T^2$ follows because the left hand side is dimensionless and $E_T$ is the only factor with dimension of mass that is available.
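To get a feeling for the size of the effect in Eq. (52), one can plug in illustrative numbers. The values below are assumptions chosen only for the arithmetic: a new physics scale $M = 2$ TeV, a coupling factor $g^2$ of order 1, and a proportionality constant set to 1, none of which is fixed by the argument in the text.

```latex
% Illustrative numbers only: M = 2 TeV, g^2 = 1, unit proportionality constant (all assumed).
\left.\frac{E_T^2}{M^2}\right|_{E_T = 200\ {\rm GeV}}
  = \left(\frac{0.2\ {\rm TeV}}{2\ {\rm TeV}}\right)^2 = 0.01 ,
\qquad
\left.\frac{E_T^2}{M^2}\right|_{E_T = 400\ {\rm GeV}}
  = \left(\frac{0.4\ {\rm TeV}}{2\ {\rm TeV}}\right)^2 = 0.04 .
```

With these assumed inputs the fractional deviation grows from about 1% to about 4% when $E_T$ doubles, which is why the highest available jet energies are the most sensitive place to look.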
Figure 17. Jet cross sections from CDF and D0 compared to QCD theory. (Data - Theory)/Theory is plotted versus the transverse energy $E_T$ (in GeV) of the jet. The theory here is next-to-leading order QCD using the CTEQ3M parton distribution. Plot legend: CDF (Preliminary) $\times$ 1.03, D0 (Preliminary) $\times$ 1.01. Source: Ref. [20]
In Fig. 17, I show a plot comparing experimental jet cross sections from CDF [21] and D0 [22] to next-to-leading order QCD theory. The theory works fine for $E_T < 200$ GeV, but for $200\ {\rm GeV} < E_T$, there appears to be a systematic deviation of just the form anticipated in Eq. (52). This example illustrates the idea of how small distance physics beyond the Standard Model can leave a trace in the form of small additional terms in the effective lagrangian that controls physics at currently available energies. However, in this case, there is some indication that the observed effect might be explained by some combination of the experimental systematic error and the uncertainties inherent in the theoretical prediction [23]. In particular, the prediction is sensitive to the distributions of quarks and gluons contained in the colliding protons, and the gluon distribution in the kinematic range of interest here is rather poorly known. In the next section, we turn to the definition, use, and measurement of the distributions of quarks and gluons in hadrons.
4 Deeply inelastic scattering
Until now, I have concentrated on hard scattering processes with leptons in the initial state. For such processes, we have seen that the hard part of the process can be described using perturbation theory because $\alpha_s(\mu)$ gets small as $\mu$ gets large. Furthermore, we have seen how to isolate the hard part of the interaction by choosing an infrared safe observable. But what about hard processes in which there are hadrons in the initial state? Since the fundamental hard interactions involve quarks and gluons, the theoretical description necessarily involves a description of how the quarks and gluons are distributed in a hadron. Unfortunately, the distribution of quarks and gluons in a hadron is controlled by long-time physics. We cannot calculate the relevant distribution functions perturbatively (although a calculation in lattice QCD might give them, in principle). Thus we must find how to separate the short-time physics from the parton distribution functions and we must learn how the parton distribution functions can be determined from the experimental measurements.

In this section, I discuss parton distribution functions and their role in deeply inelastic lepton scattering (DIS). This includes $e + p \to e + X$ and $\nu + p \to e + X$ where the momentum transfer from the lepton is large. I first outline the kinematics of deeply inelastic scattering and define the structure functions $F_1$, $F_2$ and $F_3$ used to describe the process. By examining the space-time structure of DIS, we will see how the cross section can be written as a convolution of two factors, one of which is the parton distribution functions and the other of which is a cross section for the lepton to scatter from a quark or gluon. This factorization involves a scale $\mu_F$ that, roughly speaking, divides the soft from the hard regime; I discuss the dependence of the calculated cross section on $\mu_F$. With this groundwork laid, I give the $\overline{\rm MS}$ definition of parton distribution functions in terms of field operators and discuss the evolution equation for the parton distributions. I close the section with some comments on how the parton distributions are, in practice, determined from experiment.
4.1 Kinematics of deeply inelastic lepton scattering

Figure 18. Kinematics of deeply inelastic scattering.
In deeply inelastic scattering, a lepton with momentum $k^\mu$ scatters on a hadron with momentum $p^\mu$. In the final state, one observes the scattered lepton with momentum $k'^\mu$, as illustrated in Fig. 18. The momentum transfer

$$q^\mu = k^\mu - k'^\mu \tag{53}$$

is carried on a photon, or a $W$ or $Z$ boson. The interaction between the vector boson and the hadron depends on the variables $q^\mu$ and $p^\mu$. From these two vectors we can build two scalars (not counting $m^2 = p^2$). The first variable is

$$Q^2 = -q^2, \tag{54}$$

where the minus sign is included so that $Q^2$ is positive. The second scalar is the dimensionless Bjorken variable,

$$x_{\rm bj} = \frac{Q^2}{2\,p\cdot q}. \tag{55}$$

(In the case of scattering from a nucleus containing $A$ nucleons, one replaces $p^\mu$ by $p^\mu/A$ and defines $x_{\rm bj} = A\,Q^2/(2\,p\cdot q)$.) One calls the scattering deeply inelastic if $Q^2$ is large compared to 1 GeV$^2$. Traditionally, one speaks of the scaling limit, $Q^2 \to \infty$ with $x_{\rm bj}$ fixed. Actually, the asymptotic theory to be described below works pretty well if $Q^2$ is bigger than, say, 4 GeV$^2$ and $x_{\rm bj}$ is anywhere in the experimentally accessible range, roughly $10^{-4} < x_{\rm bj} < 0.5$.

The invariant mass squared of the hadronic final state is $W^2 = (p + q)^2$. In the scaling regime of large $Q^2$ one has

$$W^2 = m^2 + \frac{1 - x_{\rm bj}}{x_{\rm bj}}\,Q^2 \gg m^2. \tag{56}$$

This justifies saying that the scattering is not only inelastic but deeply inelastic. We have spoken of the scalar variables that one can form from $p^\mu$ and $q^\mu$. Using the lepton momentum $k^\mu$, one can also form the dimensionless variable

$$y = \frac{p\cdot q}{p\cdot k}. \tag{57}$$
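The invariants defined in Eqs. (53) to (57) are easy to evaluate from four-vectors. The following is a minimal sketch; the function names and the sample event (a 20 GeV massless electron scattering at 10 degrees from a proton at rest, with 8 GeV final energy) are hypothetical choices for illustration and do not come from the text.

```python
import math


def mdot(a, b):
    """Minkowski product with metric (+,-,-,-); vectors are (E, px, py, pz)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def dis_invariants(k, k_prime, p):
    """Return Q^2, x_bj, y, W^2 and m^2 following Eqs. (53)-(57)."""
    q = tuple(ki - kf for ki, kf in zip(k, k_prime))   # Eq. (53)
    p_plus_q = tuple(pi + qi for pi, qi in zip(p, q))
    q2_big = -mdot(q, q)                               # Eq. (54)
    p_dot_q = mdot(p, q)
    return {
        "Q2": q2_big,
        "x_bj": q2_big / (2.0 * p_dot_q),              # Eq. (55)
        "y": p_dot_q / mdot(p, k),                     # Eq. (57)
        "W2": mdot(p_plus_q, p_plus_q),
        "m2": mdot(p, p),
    }


# Hypothetical fixed-target event (energies in GeV, massless lepton).
E, E_PRIME, THETA, M_PROTON = 20.0, 8.0, math.radians(10.0), 0.938
k = (E, 0.0, 0.0, E)
k_prime = (E_PRIME, E_PRIME * math.sin(THETA), 0.0, E_PRIME * math.cos(THETA))
p = (M_PROTON, 0.0, 0.0, 0.0)
inv = dis_invariants(k, k_prime, p)

if __name__ == "__main__":
    for name, value in inv.items():
        print(name, round(value, 4))
```

For this sample event $W^2$ computed directly from $(p+q)^2$ agrees with $m^2 + (1 - x_{\rm bj})\,Q^2/x_{\rm bj}$, the combination that appears in Eq. (56).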
4.2 Structure functions for DIS
One can make quite a lot of progress in understanding the theory of deeply inelastic scattering without knowing anything about QCD except its symmetries. One expresses the cross section in terms of three structure functions, which are functions of $x_{\rm bj}$ and $Q^2$ only.

Suppose that the initial lepton is a neutrino, $\nu_\mu$, and the final lepton is a muon. Then in Fig. 18 the exchanged vector boson, call it $V$, is a $W$ boson, with mass $M_V = M_W$. Alternatively, suppose that both the initial and final leptons are electrons and let the exchanged vector boson be a photon, with mass $M_V = 0$. This was the situation in the original DIS experiments at SLAC in the late 1960's. In experiments with sufficiently large $Q^2$, $Z$ boson exchange should be considered along with photon exchange, and the formalism described below must be augmented.

Given only the electroweak theory to tell us how the vector boson couples to the lepton, one can write the cross section in the form

$$d\sigma = \frac{4\alpha^2}{s}\,\frac{d^3\vec k'}{2|\vec k'|}\,\frac{C_V}{(q^2 - M_V^2)^2}\,L^{\mu\nu}(k,q)\,W_{\mu\nu}(p,q), \tag{58}$$

where $C_V$ is 1 in the case that $V$ is a photon and $1/(64\,\sin^4\theta_W)$ in the case that $V$ is a $W$ boson. The tensor $L^{\mu\nu}$ describes the lepton coupling to the vector boson and has the form

$$L^{\mu\nu} = \tfrac{1}{2}\,{\rm Tr}\left(k\cdot\gamma\ \gamma^\mu\ k'\cdot\gamma\ \gamma^\nu\right) \tag{59}$$

in the case that $V$ is a photon. For a $W$ boson, one has

$$L^{\mu\nu} = {\rm Tr}\left(k\cdot\gamma\ \Gamma^\mu\ k'\cdot\gamma\ \Gamma^\nu\right), \tag{60}$$
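where $\Gamma^\mu$ is $\gamma^\mu(1 - \gamma_5)$ for a $W^+$ boson ($\nu \to W^+\ell$) or $\gamma^\mu(1 + \gamma_5)$ for a $W^-$ boson ($\bar\nu \to W^-\bar\ell$). See Ref. [*].

The tensor $W^{\mu\nu}$ describes the coupling of the vector boson to the hadronic system. It depends on $p^\mu$ and $q^\mu$. We know that it is a Lorentz tensor and that $W^{\nu\mu} = W^{\mu\nu\,*}$. We also know that the current to which the vector boson couples is conserved (or in the case of the axial current, conserved in the absence of quark masses, which we here neglect) so that $q_\mu W^{\mu\nu} = 0$. Using these properties, one finds three possible tensor structures for $W^{\mu\nu}$. Each of the three tensors multiplies a structure function, $F_1$, $F_2$ or $F_3$, which, since it is a Lorentz scalar, can depend only on the invariants $x_{\rm bj}$ and $Q^2$. Thus

$$W_{\mu\nu} = -\left(g_{\mu\nu} - \frac{q_\mu q_\nu}{q^2}\right)F_1(x_{\rm bj},Q^2) + \left(p_\mu - q_\mu\frac{p\cdot q}{q^2}\right)\left(p_\nu - q_\nu\frac{p\cdot q}{q^2}\right)\frac{1}{p\cdot q}\,F_2(x_{\rm bj},Q^2) - i\,\epsilon_{\mu\nu\lambda\sigma}\,p^\lambda q^\sigma\,\frac{1}{2\,p\cdot q}\,F_3(x_{\rm bj},Q^2). \tag{61}$$

If we combine Eqs. (58,59,60,61), we can write the cross section for deeply inelastic scattering in terms of the three structure functions. Neglecting the hadron mass compared to $Q^2$, the result is

$$\frac{d\sigma}{dx_{\rm bj}\,dy} = N(Q^2)\left[\,y\,F_1 + \frac{1-y}{x_{\rm bj}\,y}\,F_2 + \delta_V\left(1 - \frac{y}{2}\right)F_3\right]. \tag{62}$$

Here the normalization factor $N$ and the factor $\delta_V$ multiplying $F_3$ are

$$\begin{aligned}
N &= \frac{4\pi\alpha^2}{Q^2}, & \delta_V &= 0, & e^- + h &\to e^- + X,\\
N &= \frac{\pi\alpha^2 Q^2}{4\sin^4(\theta_W)\,(Q^2 + M_W^2)^2}, & \delta_V &= 1, & \nu + h &\to \mu^- + X,\\
N &= \frac{\pi\alpha^2 Q^2}{4\sin^4(\theta_W)\,(Q^2 + M_W^2)^2}, & \delta_V &= -1, & \bar\nu + h &\to \mu^+ + X.
\end{aligned} \tag{63}$$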
In principle, one can use the $y$ dependence to determine all three of $F_1$, $F_2$, $F_3$ in a deeply inelastic scattering experiment.

4.3 Space-time structure of DIS
So far, we have used the symmetries of QCD in order to write the cross section for deeply inelastic scattering in terms of three structure functions, but we have not used any other dynamical properties of the theory. Now we turn to the question of how the scattering develops in space and time. For this purpose, we define a convenient reference frame, which is illustrated in Fig. 19. Denoting components of vectors $v^\mu$ by $(v^+, v^-, \mathbf{v}_T)$, we choose the frame in which

$$(q^+, q^-, \mathbf{q}_T) = \frac{1}{\sqrt{2}}\,(-Q,\ Q,\ \mathbf{0}). \tag{64}$$

We also demand that the transverse components of the hadron momentum be zero in our frame. Then

$$(p^+, p^-, \mathbf{p}_T) = \frac{1}{\sqrt{2}}\left(\frac{Q}{x_{\rm bj}},\ \frac{x_{\rm bj}\,m_h^2}{Q},\ \mathbf{0}\right). \tag{65}$$

Notice that in the chosen reference frame the hadron momentum is big and the momentum transfer is big.

Figure 19. Reference frame for the analysis of deeply inelastic scattering.

Figure 20. Interactions within a fast moving hadron. The lines represent world lines of quarks and gluons. The interaction points are spread out in $x^+$ and pushed together in $x^-$.
Consider the interactions among the quarks and gluons inside a hadron, using $x^+$ in the role of "time" as in Section 2.3. For a hadron at rest, these interactions happen in a typical time scale $\Delta x^+ \sim 1/m$, where $m \sim 300$ MeV. A hadron that will participate in a deeply inelastic scattering event has a large momentum, $p^+ \sim Q$, in the reference frame that we are using. The Lorentz transformation from the rest frame spreads out interactions by a factor $Q/m$, so that

$$\Delta x^+ \sim \frac{1}{m}\times\frac{Q}{m} = \frac{Q}{m^2}. \tag{66}$$

This is illustrated in Fig. 20. I offer two caveats here. First, I am treating $x_{\rm bj}$ as being of order 1. To treat small $x_{\rm bj}$ physics, one needs to put back the factors of $x_{\rm bj}$, and the picture changes rather dramatically. Second, the interactions among the quarks and gluons in a hadron at rest can take place on time scales $\Delta x^+$ that are much smaller than $1/m$, as we discussed in Section 3. We will discuss this later on, but for now we start with the simplest picture.

Figure 21. The virtual photon meets the fast moving hadron. One of the partons is annihilated and recreated as a parton with a large minus component of momentum. This parton develops into a jet of particles.
What happens when the fast moving hadron meets the virtual photon? The interaction with the photon carrying momentum $q^- \sim Q$ is localized to within

$$\Delta x^+ \sim 1/Q. \tag{67}$$
During this short time interval, the quarks and gluons in the proton are effectively free, since their typical interaction times are comparatively much longer. We thus have the following picture. At the moment $x^+$ of the interaction, the hadron effectively consists of a collection of quarks and gluons (partons) that have momenta $(p_i^+, \mathbf{p}_i)$. We can treat the partons as being free. The $p_i^+$ are large, and it is convenient to describe them using momentum fractions $\xi_i$:

$$\xi_i = p_i^+/p^+, \qquad 0 < \xi_i < 1. \tag{68}$$

(This is convenient because the $\xi_i$ are invariant under boosts along the $z$ axis.) The transverse momenta of the partons, $\mathbf{p}_i$, are small compared to $Q$ and can be neglected in the kinematics of the $\gamma$-parton interaction. The "on-shell" or "kinetic" minus momenta of the partons, $p_i^- = \mathbf{p}_i^2/(2p_i^+)$, are also very small compared to $Q$ and can be neglected in the kinematics of the $\gamma$-parton interaction. We can think of the partonic state as being described by a wave function

$$\psi(p_1^+, \mathbf{p}_1;\ p_2^+, \mathbf{p}_2;\ \ldots), \tag{69}$$

where indices specifying spin and flavor quantum numbers have been suppressed.
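To put numbers on the separation of time scales in Eqs. (66) and (67), the estimate below takes $m \sim 300$ MeV from the text and assumes, purely for illustration, a momentum transfer $Q = 10$ GeV. The conversion to femtometers uses $\hbar c \approx 0.1973$ GeV fm.

```latex
% Illustrative numbers: m = 0.3 GeV (from the text), Q = 10 GeV (assumed).
\Delta x^+_{\rm hard} \sim \frac{1}{Q} = 0.1\ {\rm GeV}^{-1} \approx 0.02\ {\rm fm},
\qquad
\Delta x^+_{\rm hadron} \sim \frac{Q}{m^2} = \frac{10}{0.09}\ {\rm GeV}^{-1}
  \approx 111\ {\rm GeV}^{-1} \approx 22\ {\rm fm},
\qquad
\frac{\Delta x^+_{\rm hadron}}{\Delta x^+_{\rm hard}} \sim \frac{Q^2}{m^2} \approx 1.1\times 10^{3} .
```

With these inputs the hard interaction is about a thousand times shorter than the typical interaction time inside the fast moving hadron, which is the sense in which the partons can be treated as free.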
Figure 22. Feynman diagram for deeply inelastic scattering.

This approximate picture is represented in Feynman diagram language in Fig. 22. The larger filled circle represents the hadron wave function $\psi$. The smaller filled circle represents a sum of subdiagrams in which the particles have virtualities of order $Q^2$. All of these interactions are effectively instantaneous on the time scale of the intra-hadron interactions that form the wave function.
The approximate picture also leads to an intuitive formula that relates the observed cross section to the cross section for $\gamma$-parton scattering:

$$\frac{d\sigma}{dE'\,d\omega'} \sim \int_0^1 d\xi \sum_a f_{a/h}(\xi,\mu_F)\,\frac{d\hat\sigma_a(\mu_F)}{dE'\,d\omega'} + \mathcal{O}(m/Q). \tag{70}$$

In Eq. (70), the function $f$ is a parton distribution function: $f_{a/h}(\xi,\mu_F)\,d\xi$ gives the probability to find a parton with flavor $a = g, u, \bar u, d, \ldots$ in hadron $h$, carrying momentum fraction within $d\xi$ of $\xi = p_i^+/p^+$. If we knew the wave functions $\psi$, we would form $f$ by summing over the number $n$ of unobserved partons, integrating $|\psi_n|^2$ over the momenta of the unobserved partons, and also integrating over the transverse momentum of the observed parton.

The second factor in Eq. (70), $d\hat\sigma_a/dE'\,d\omega'$, is the cross section for scattering the lepton from the parton of flavor $a$ and momentum fraction $\xi$.

I have indicated a dependence on a factorization scale $\mu_F$ in both factors of Eq. (70). This dependence arises from the existence of virtual processes among the partons that take place on a time scale much shorter than the nominal $\Delta x^+ \sim Q/m^2$. I will discuss this dependence in some detail shortly.

4.4 The hard scattering cross section
The parton distribution functions in Eq. (70) are derived from experiment. The hard scattering cross sections $d\hat\sigma_a(\mu_F)/dE'\,d\omega'$ are calculated in perturbation theory, using diagrams like those shown in Fig. 23. The diagram on the left is the lowest order diagram. The diagram on the right is one of several that contribute to $d\hat\sigma$ at order $\alpha_s$; in this diagram the parton $a$ is a gluon.

Figure 23. Some Feynman diagrams for the hard scattering part of deeply inelastic scattering. Left: lowest order. Right: higher order.

Figure 24. Kinematics of lowest order diagram.
One can understand a lot about deeply inelastic scattering from Fig. 24, which illustrates the kinematics of the lowest order diagram. Recall that in the reference frame that we are using, the virtual vector boson has zero transverse momentum. The incoming parton has momentum along the plus axis. After the scattering, the parton momentum must be on the cone $k^\mu k_\mu = 0$, so the only possibility is that its minus momentum is non-zero and its plus momentum vanishes. That is

$$\xi p^+ + q^+ = 0. \tag{71}$$

Since $p^+ = Q/(x_{\rm bj}\sqrt{2})$ while $q^+ = -Q/\sqrt{2}$, this implies

$$\xi = x_{\rm bj}. \tag{72}$$

The consequence of this is that the lowest order contribution to $d\hat\sigma$ in Eq. (70) contains a delta function that sets $\xi$ to $x_{\rm bj}$. Thus deeply inelastic scattering at a given value of $x_{\rm bj}$ provides a determination of the parton distribution functions at momentum fraction $\xi$ equal to $x_{\rm bj}$, as long as one works only to leading order. In fact, because of this close relationship, there is some tendency to confuse the structure functions $F_n(x_{\rm bj}, Q^2)$ with the parton distribution functions $f_{a/h}(\xi, \mu_F)$. I will try to keep these concepts separate: the structure functions $F_n$ are something that one measures directly in deeply inelastic scattering; the parton distribution functions are determined rather indirectly from experiments like deeply inelastic scattering, using formulas that are correct only up to some finite order in $\alpha_s$.

4.5 Factorization for the structure functions
We will look at DIS in a little detail since it is so important. Our object is to derive a formula at lowest order in perturbation theory relating the measured structure functions for $e + h \to e + X$ via photon exchange and the parton distribution functions.

Start with Eq. (70), representing Fig. 22. We change variables in this equation from $(E', \omega')$ to $(x_{\rm bj}, y)$. We relate $x_{\rm bj}$ to the momentum fraction $\xi$ and a new variable $\hat x$ that is just $x_{\rm bj}$ with the proton momentum $p^\mu$ replaced by the parton momentum $\xi p^\mu$:

$$x_{\rm bj} = \frac{Q^2}{2\,p\cdot q} = \xi\,\frac{Q^2}{2\,\xi p\cdot q} = \xi\,\hat x. \tag{73}$$

That is, $\hat x$ is the parton level version of $x_{\rm bj}$. The variable $y$ is identical to the parton level version of $y$ because $p^\mu$ appears in both the numerator and denominator:

$$y = \frac{p\cdot q}{p\cdot k} = \frac{\xi p\cdot q}{\xi p\cdot k}. \tag{74}$$

Thus Eq. (70) becomes

$$\frac{d\sigma}{dx_{\rm bj}\,dy} \sim \int_0^1 d\xi \int_0^1 d\hat x\ \delta(x_{\rm bj} - \hat x\,\xi) \sum_a f_{a/h}(\xi,\mu_F)\,\frac{d\hat\sigma_a(\mu_F)}{d\hat x\,dy} + \mathcal{O}(m/Q). \tag{75}$$
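We can calculate $d\hat\sigma_a/(d\hat x\,dy)$ in perturbation theory. At lowest order this is particularly simple, and we obtain results proportional to delta functions of $x_{\rm bj}/\xi$ (that is, setting $\xi = x_{\rm bj}$ as in Eq. (72)). Using Eq. (62) to relate $d\sigma/(dx_{\rm bj}\,dy)$ to the structure functions $F_1$ and $F_2$ for $\gamma$ exchange, we obtain the simple lowest order results

$$F_1(x_{\rm bj}, Q^2) \sim \frac{1}{2}\sum_a Q_a^2\, f_{a/h}(x_{\rm bj}) + \mathcal{O}(\alpha_s) + \mathcal{O}(m/Q), \tag{76}$$

$$F_2(x_{\rm bj}, Q^2) \sim \sum_a Q_a^2\, x_{\rm bj}\, f_{a/h}(x_{\rm bj}) + \mathcal{O}(\alpha_s) + \mathcal{O}(m/Q). \tag{77}$$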
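The lowest order relations in Eqs. (76) and (77) can be sketched numerically. The parton distributions below are hypothetical toy shapes invented for this illustration (they are not a fit to data and are not taken from the text), and only $u$, $d$ quarks and their antiquarks are included.

```python
import math

# Quark electric charges in units of e; an antiquark has the opposite charge,
# so it enters with the same Q_a^2.
CHARGE = {"u": 2.0 / 3.0, "d": -1.0 / 3.0, "ubar": -2.0 / 3.0, "dbar": 1.0 / 3.0}


def toy_pdfs(x):
    """HYPOTHETICAL parton distributions f_a(x), purely illustrative shapes."""
    valence = x ** (-0.5) * (1.0 - x) ** 3
    sea = 0.2 * x ** (-1.0) * (1.0 - x) ** 7
    return {"u": 2.0 * valence + sea, "d": valence + sea, "ubar": sea, "dbar": sea}


def f1_lowest_order(x, pdfs=toy_pdfs):
    """Eq. (76) without the O(alpha_s) and O(m/Q) corrections."""
    return 0.5 * sum(CHARGE[a] ** 2 * f for a, f in pdfs(x).items())


def f2_lowest_order(x, pdfs=toy_pdfs):
    """Eq. (77) without the O(alpha_s) and O(m/Q) corrections."""
    return sum(CHARGE[a] ** 2 * x * f for a, f in pdfs(x).items())


if __name__ == "__main__":
    for x in (0.01, 0.1, 0.5):
        print(x, f1_lowest_order(x), f2_lowest_order(x))
```

Whatever shapes are chosen for the distributions, the two functions satisfy $F_2 = 2\,x_{\rm bj}\,F_1$ at this order, since both are built from the same charge-weighted sum.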
The factor 1/2 between $x_{\rm bj}F_1$ and $F_2$ follows from the Feynman diagrams for spin 1/2 quarks.

4.6 $\mu_F$ dependence
I have so far presented a rather simplified picture of deeply inelastic scattering in which the hard scattering takes place on a time scale $\Delta x^+ \sim 1/Q$, while the internal dynamics of the proton take place on a much longer time scale $\Delta x^+ \sim Q/m^2$. What happens when one actually computes Feynman diagrams and looks at what time scales contribute? Consider the graph shown in Fig. 25. One finds that the transverse momenta $\mathbf{k}$ range from order $m$ to order $Q$, corresponding to energy scales $k^- = \mathbf{k}^2/(2k^+)$ between $k^- \sim m^2/Q$ and $k^- \sim Q^2/Q \sim Q$, or time scales $1/Q \lesssim \Delta x^+ \lesssim Q/m^2$.

Figure 25. Deeply inelastic scattering with a gluon emission.

The property of factorization for the cross section of deeply inelastic scattering, embodied in Eq. (70), is established by showing that the perturbative expansion can be rearranged so that the contributions from long time scales appear in the parton distribution functions, while the contributions from short time scales appear in the hard scattering functions. (See Ref. [24] for more information.) Thus, in Fig. 25, a gluon emission with $\mathbf{k}^2 \sim m^2$ is part of $f(\xi)$, while a gluon emission with $\mathbf{k}^2 \sim Q^2$ is part of $d\hat\sigma$.

Breaking up the cross section into factors associated with short and long time scales requires the introduction of a factorization scale, $\mu_F$. When calculating the diagram in Fig. 25, one integrates over $\mathbf{k}$. Roughly speaking, one counts the contribution from $\mathbf{k}^2 < \mu_F^2$ as part of the higher order contribution to $f_{a/h}(\xi, \mu_F)$, convoluted with the lowest order hard scattering function $d\hat\sigma$ for deeply inelastic scattering from a quark. The contribution from $\mu_F^2 < \mathbf{k}^2$ then counts as part of the higher order contribution to $d\hat\sigma$ convoluted with an uncorrected parton distribution. This is illustrated in Fig. 26. (In real calculations, the split is accomplished with the aid of dimensional regularization, and is a little more subtle than a simple division of the integral into two parts.)

Figure 26. Time scales in factorization. The diagram is divided into a region labeled "parton distributions" and a region labeled "hard scattering".
A consequence of this is that both $d\hat\sigma_a(\mu_F)/dE'\,d\omega'$ and $f_{a/h}(\xi, \mu_F)$ depend on $\mu_F$. Thus we have two scales, the factorization scale $\mu_F$ in $f_{a/h}(\xi, \mu_F)$ and the renormalization scale $\mu$ in $\alpha_s(\mu)$. (When we expand $d\hat\sigma$ in powers of $\alpha_s(\mu)$ then the coefficients depend on $\mu$.) As with $\mu$, the cross section does not depend on $\mu_F$. Thus there is an equation $d(\text{cross section})/d\mu_F = 0$ that is satisfied to the accuracy of the perturbative calculation used. If you work harder and calculate to higher order, then the dependence on $\mu_F$ is less. Often one sets $\mu_F = \mu$ in applied calculations. In fact, it is rather common in applications to deeply inelastic scattering to set $\mu_F = \mu = Q$.

4.7 Contour graphs of scale dependence
As an example, look at the one jet inclusive cross section in proton-antiproton collisions. Specifically, consider the cross section $d\sigma/dE_T\,d\eta$ to make a collimated spray of particles, a jet, with transverse energy $E_T$ and rapidity $\eta$. (Here $E_T$ is essentially the transverse momentum carried by the particles in the jet and $\eta$ is related to the angle $\theta$ between the jet and the beam direction by $\eta = -\ln(\tan(\theta/2))$. We will investigate this process and discuss the definitions in the next section.) For now, all we need to know is that the theoretical formula for the cross section at next-to-leading order involves the strong coupling $\alpha_s(\mu)$ and two factors $f_{a/h}(x, \mu_F)$ representing the distribution of partons in the two incoming hadrons. There is a parton level hard scattering cross section that also depends on $\mu$ and $\mu_F$.

How does the cross section depend on $\mu$ in $\alpha_s(\mu)$ and $\mu_F$ in $f_{a/h}(x, \mu_F)$? In Fig. 27, I show contour plots of the jet cross section versus $\mu$ and $\mu_F$ at two different values of $E_T$. The center of the plots corresponds to a standard choice of scales, $\mu = \mu_F = E_T/2$. The axes are logarithmic, representing $\log_2(2\mu/E_T)$ and $\log_2(2\mu_F/E_T)$. Thus $\mu$ and $\mu_F$ vary from $E_T/8$ to $2E_T$ in the plots.

Notice that the dependence on the two scales is rather mild for the next-to-leading order cross section. The cross section calculated at leading order is quite sensitive to these scales, but most of the scale dependence found at order $\alpha_s^2$ has been canceled by the $\alpha_s^3$ contributions to the cross section. One reads from the figure that the cross section varies by roughly $\pm 15\%$ in the central region of the graphs, both for medium and large $E_T$. Following the argument of Sec. 3.4, this leads to a rough estimate of 15% for the theoretical error associated with truncating perturbation theory at next-to-leading order.
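The logarithmic axes of the contour plots and the relation between $\eta$ and the jet angle can be made concrete with two small helper functions. This is a sketch with hypothetical function names; it assumes the standard pseudorapidity convention $\eta = -\ln\tan(\theta/2)$ and axis values running from $-2$ to $2$, as implied by the quoted range of scales.

```python
import math


def scale_from_axis(n, e_t):
    """Invert N = log2(2 * scale / E_T): return the scale for axis value n."""
    return e_t * 2.0**n / 2.0


def pseudorapidity(theta):
    """eta = -ln(tan(theta/2)), theta in radians (standard convention, assumed)."""
    return -math.log(math.tan(theta / 2.0))


if __name__ == "__main__":
    e_t = 100.0  # GeV, the value used in the first contour plot
    for n in (-2, -1, 0, 1, 2):
        print(n, scale_from_axis(n, e_t))
    print(pseudorapidity(math.radians(90.0)))
```

The center of each plot, $N = 0$, corresponds to the standard choice $E_T/2$, and the corners of the plotted region reach $E_T/8$ and $2E_T$. A jet at 90 degrees to the beam has $\eta = 0$, the value used for the plots.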
Figure 27. Contour plots of the one jet inclusive cross section versus the renormalization scale $\mu$ and the factorization scale $\mu_F$. The cross section is $d\sigma/dE_T\,d\eta$ at $\eta = 0$ with $E_T = 100$ GeV in the first graph and $E_T = 500$ GeV in the second. The horizontal axis in each graph represents $N_{UV} = \log_2(2\mu/E_T)$ and the vertical axis represents $N_{CO} = \log_2(2\mu_F/E_T)$. The contour lines show 5% changes in the cross section relative to the cross section at the center of the figures. The c.m. energy is $\sqrt{s} = 1800$ GeV.
4.8 $\overline{\rm MS}$ definition of parton distribution functions
The factorization property, Eq. (70), of the deeply inelastic scattering cross section states that the cross section can be approximated as a convolution of a hard scattering cross section that can be calculated perturbatively and parton distribution functions $f_{a/h}(x, \mu_F)$. But what are the parton distribution functions? This question has some practical importance. The hard scattering cross section is essentially the physical cross section divided by the parton distribution function, so the precise definition of the parton distribution functions leads to the rules for calculating the hard scattering functions.

The definition of the parton distribution functions is to some extent a matter of convention. The most commonly used convention is the $\overline{\rm MS}$ definition, which arose from the theory of deeply inelastic scattering in the language of the "operator product expansion" [25]. Here I will follow the (equivalent) formulation of Ref. [14]. For a more detailed pedagogical review, the reader may consult Ref. [26].

Using the $\overline{\rm MS}$ definition, the distribution of quarks in a hadron is given as the hadron matrix element of certain quark field operators:

$$f_{i/h}(\xi, \mu_F) = \frac{1}{2}\int\frac{dy^-}{2\pi}\,e^{-i\xi p^+ y^-}\,\langle p|\,\bar\psi_i(0, y^-, \mathbf{0})\,\gamma^+ F\,\psi_i(0)\,|p\rangle. \tag{78}$$
Here $|p\rangle$ represents the state of a hadron with momentum $p^\mu$ aligned so that $\mathbf{p}_T = 0$. For simplicity, I take the hadron to have spin zero. The operator $\psi_i(0)$, evaluated at $x^\mu = 0$, annihilates a quark in the hadron. The operator $\bar\psi_i(0,y^-,\mathbf{0})$ recreates the quark at $x^+ = 0$, $\mathbf{x}_T = \mathbf{0}$ and $x^- = y^-$, where we take the appropriate Fourier transform in $y^-$ so that the quark that was annihilated and recreated has momentum $k^+ = \xi p^+$. The motivation for the definition is that this is the hadron matrix element of the appropriate number operator for finding a quark.

There is one subtle point. The number operator idea corresponds to a particular gauge choice, $A^+ = 0$. If we are using any other gauge, we insert the operator
$$F = \mathcal{P}\exp\left(-ig\int_0^{y^-} dz^-\, A_a^+(0,z^-,\mathbf{0})\, t_a\right). \qquad (79)$$
The $\mathcal{P}$ indicates a path ordering of the operators and color matrices along the path from $(0,0,\mathbf{0})$ to $(0,y^-,\mathbf{0})$. This operator is the identity operator in $A^+ = 0$ gauge and it makes the definition gauge invariant.
Figure 28. Deeply inelastic scattering (first picture, labeled "DIS") and the parton distribution functions (second picture, labeled "Parton distribution").
The physics of this definition is illustrated in Fig. 28. The first picture (from Fig. 21) illustrates the amplitude for deeply inelastic scattering. The fast proton moves in the plus direction. A virtual photon knocks out a quark, which emerges moving in the minus direction and develops into a jet of particles. The second picture illustrates the amplitude associated with the quark distribution function. We express $F$ as $F_2F_1$ where
$$F_2 = \overline{\mathcal{P}}\exp\left(+ig\int_{y^-}^{\infty} dz^-\, A_a^+(0,z^-,\mathbf{0})\, t_a\right), \qquad F_1 = \mathcal{P}\exp\left(-ig\int_0^{\infty} dz^-\, A_a^+(0,z^-,\mathbf{0})\, t_a\right), \qquad (80)$$
and write the quark distribution function including a sum over intermediate states $|N\rangle$:
$$f_{i/h}(\xi,\mu_F) = \frac{1}{2}\int\frac{dy^-}{2\pi}\, e^{-i\xi p^+ y^-} \sum_N \langle p|\,\bar\psi_i(0,y^-,\mathbf{0})\,\gamma^+ F_2\,|N\rangle\langle N|\,F_1\,\psi_i(0)\,|p\rangle. \qquad (81)$$
Then the amplitude depicted in the second picture in Fig. 28 is $\langle N|F_1\psi_i(0)|p\rangle$. The operator $\psi$ annihilates a quark in the proton. The operator $F_1$ stands in for the quark moving in the minus direction. The gluon field $A$ evaluated along a lightlike line in the minus direction absorbs longitudinally polarized gluons from the color field of the proton, just as the real quark in deeply inelastic scattering can do. Thus the physics of deeply inelastic scattering is built into the definition of the quark distribution function, albeit in an idealized way. The idealization is not a problem because the hard scattering function $d\hat\sigma$ systematically corrects for the difference between real deeply inelastic scattering and the idealization.

There is one small hitch. If you calculate any Feynman diagrams for $f_{i/h}(\xi,\mu_F)$, you are likely to wind up with an ultraviolet-divergent integral. The operator product that is part of the definition needs renormalization. This hitch is only a small one. We simply agree to do all of the renormalization using the $\overline{\rm MS}$ scheme for renormalization. It is this renormalization that introduces the scale $\mu_F$ into $f_{i/h}(\xi,\mu_F)$. This role of $\mu_F$ is in accord with Fig. 26: roughly speaking $\mu_F$ is the upper cutoff for what momenta belong with the parton distribution function; at the same time it is the lower cutoff for what momenta belong with the hard scattering function.

What about gluons? The definition of the gluon distribution function is similar to the definition for quarks. We simply replace the quark field $\psi$ by suitable combinations of the gluon field $A^\mu$, as described in Refs. [14] and [26].
4.9 Evolution of the parton distributions
Since we introduced a scale $\mu_F$ in the definition of the parton distributions in order to define their renormalization, there is a renormalization group equation that gives the $\mu_F$ dependence
$$\frac{d}{d\log\mu_F}\, f_{a/h}(x,\mu_F) = \sum_b \int_x^1 \frac{d\xi}{\xi}\, P_{ab}(x/\xi,\alpha_s(\mu_F))\, f_{b/h}(\xi,\mu_F). \qquad (82)$$
This is variously known as the evolution equation, the Altarelli-Parisi equation, and the DGLAP (Dokshitzer-Gribov-Lipatov-Altarelli-Parisi) equation. Note the sum over parton flavor indices. The evolution of, say, an up quark ($a = u$) can involve a gluon ($b = g$) through the element $P_{ug}$ of the kernel that describes gluon splitting into $u\bar u$.

The equation is illustrated in Fig. 29. When we change the renormalization scale $\mu_F$, the change in the probability to find a parton with momentum fraction $x$ and flavor $a$ is proportional to the probability to find such a parton with large transverse momentum. The way to get this parton with large transverse momentum is for a parton carrying momentum fraction $\xi$ and much smaller transverse momentum to split into partons carrying large transverse momenta, including the parton that we are looking for. This splitting probability, integrated over the appropriate transverse momentum ranges, is the kernel $P_{ab}$.
Figure 29. The renormalization group equation for the parton distribution functions. (The diagram is labeled with $d/d\log\mu_F$ on the left-hand side and with the momentum fractions $x$ and $\xi p$.)
The kernel $P$ in Eq. (82) has a perturbative expansion
$$P_{ab}(x/\xi,\alpha_s(\mu_F)) = P^{(1)}_{ab}(x/\xi)\,\frac{\alpha_s(\mu_F)}{\pi} + P^{(2)}_{ab}(x/\xi)\left(\frac{\alpha_s(\mu_F)}{\pi}\right)^2 + \cdots \qquad (83)$$
The first two terms are known and are typically used in numerical solutions of the equation. To learn more about the DGLAP equation, the reader may consult Refs. [x] and [26].
4.10 Determination and use of the parton distributions
The $\overline{\rm MS}$ definition giving the parton distribution in terms of operators is process independent: it does not refer to any particular physical process. These parton distributions then appear in the QCD formula for any process with one or two hadrons in the initial state. In principle, the parton distribution functions could be calculated by using the method of lattice QCD (see Ref. [26]). Currently, they are determined from experiment.

At present the most comprehensive analyses are being done by the CTEQ [20] and MRS [27] groups. These groups perform a "global fit" to data from experiments of several different types. To perform such a fit one chooses a parameterization for the parton distributions at some standard factorization scale $\mu_0$. Certain sum rules that follow from the definition of the parton distribution functions are built into the parameterization. An example is the momentum sum rule:
$$\sum_a \int_0^1 d\xi\; \xi\, f_{a/h}(\xi,\mu) = 1. \qquad (84)$$
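As a rough numerical illustration of how a sum rule like Eq. (84) can be built into a parameterization, the sketch below uses a toy form $\xi f(\xi) = N\,\xi^a(1-\xi)^b$ for two parton species. The functional form, the exponents, and the momentum shares are assumptions made up for this example; they are not the CTEQ or MRS parameterizations.

```python
import math

def beta_function(p, q):
    """Euler beta function B(p, q) = Gamma(p) Gamma(q) / Gamma(p + q)."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

# Hypothetical toy shapes: xi * f(xi) = norm * xi**a * (1 - xi)**b.
SHAPES = {"quarks": (0.5, 3.0), "gluon": (0.2, 5.0)}
# Assumed share of the hadron momentum carried by each species (sums to 1).
SHARES = {"quarks": 0.55, "gluon": 0.45}

def normalization(name):
    """Fix norm analytically so the species carries its assumed momentum share."""
    a, b = SHAPES[name]
    return SHARES[name] / beta_function(a + 1.0, b + 1.0)

def momentum_integral(name, n_steps=20000):
    """Midpoint-rule estimate of the integral of xi * f(xi) over 0 < xi < 1."""
    a, b = SHAPES[name]
    norm = normalization(name)
    h = 1.0 / n_steps
    total = 0.0
    for k in range(n_steps):
        xi = (k + 0.5) * h
        total += norm * xi**a * (1.0 - xi)**b
    return total * h

# Left-hand side of the momentum sum rule for the toy distributions.
momentum_sum = sum(momentum_integral(name) for name in SHAPES)
print(f"sum over partons of the momentum integral = {momentum_sum:.6f}")
```

The normalizations are fixed analytically and the sum rule is then checked by an independent numerical integration, which mirrors the idea that the constraint is imposed on the parameterization before any fit to data.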
Given some set of values for the parameters describing the $f_{a/h}(x,\mu_0)$, one can determine $f_{a/h}(x,\mu)$ for all higher values of $\mu$ by using the evolution equation. Then the QCD cross section formulas give predictions for all of the experiments that are being used. One systematically varies the parameters in $f_{a/h}(x,\mu_0)$ to obtain the best fit to all of the experiments. One source of information about these fits is the world wide web pages of Ref. [28].

If the freedom available for the parton distributions is used to fit all of the world's data, is there any physical content to QCD? The answer is yes: there are lots of experiments, so this program won't work unless QCD is right. In fact, there are roughly 1400 data in the CTEQ fit and only about 25 parameters available to fit these data.

5 QCD in hadron-hadron collisions
When there is a hadron in the initial state of a scattering process, there are inevitably long time scales associated with the binding of the hadron, even if part of the process is a short-time scattering. We have seen, in the case of deeply inelastic scattering of a lepton from a single hadron, that the dependence on these long time scales can be factored into a parton distribution function. But what happens when two high energy hadrons collide? The reader will not be surprised to learn that we then need two parton distribution functions.
I explore hadron-hadron collisions in this section. I begin with the definition of a convenient kinematical variable, rapidity. Then I discuss, in turn, production of vector bosons ($\gamma^*$, $W$, and $Z$) and jet production. The theory for the production of heavy quarks is similar and I omit it.

5.1 Kinematics: rapidity
In describing hadron-hadron collisions, it is useful to employ a kinematic variable $y$ that is called rapidity. Consider, for example, the production of a $Z$ boson plus anything, $p + p \to Z + X$. Choose the hadron-hadron c.m. frame with the $z$ axis along the beam direction. In Fig. 30, I show a drawing of the collision. The arrows represent the momenta of the two hadrons; in the c.m. frame these momenta have equal magnitudes. We will want to describe the process at the parton level, $a + b \to Z + X$. The two partons $a$ and $b$ each carry some share of the parent hadron's momentum, but generally these will not be equal shares. Thus the magnitudes of the momenta of the colliding partons will not be equal. We will have to boost along the $z$ axis in order to get to the parton-parton c.m. frame. For this reason, it is useful to use a variable that transforms simply under boosts. This is the motivation for using rapidity.
Figure 30. Collision of two hadrons containing partons producing a Z boson. The c.m. frame of the two hadrons is normally not the c.m. frame of the two partons that create the Z boson.
Let $q^\mu = (q^+, q^-, \mathbf{q})$ be the momentum of the $Z$ boson. Then the rapidity of the $Z$ is defined as
$$y = \frac{1}{2}\ln\left(\frac{q^+}{q^-}\right). \qquad (85)$$
The four components $(q^+, q^-, \mathbf{q})$ of the $Z$ boson momentum can be written in terms of four variables, the two components of the $Z$ boson's transverse momentum $\mathbf{q}$, its mass $M$, and its rapidity:
$$q^\mu = \left(e^{y}\sqrt{(\mathbf{q}^2 + M^2)/2},\; e^{-y}\sqrt{(\mathbf{q}^2 + M^2)/2},\; \mathbf{q}\right). \qquad (86)$$
The utility of using rapidity as one of the variables stems from the transformation property of rapidity under a boost along the $z$ axis:
$$q^+ \to e^{\omega} q^+, \qquad q^- \to e^{-\omega} q^-, \qquad \mathbf{q} \to \mathbf{q}. \qquad (87)$$
Under this transformation,
$$y \to y + \omega. \qquad (88)$$
This is as simple a transformation law as we could hope for. In fact, it is just the same as the transformation law for velocities in non-relativistic physics in one dimension.
Figure 31. Definition of the polar angle $\theta$ used in calculating the rapidity of a massless particle.
Consider now the rapidity of a massless particle. Let the massless particle emerge from the collision with polar angle $\theta$, as indicated in Fig. 31. A simple calculation relates the particle's rapidity $y$ to $\theta$:
$$y = -\ln(\tan(\theta/2)), \qquad (m = 0). \qquad (89)$$
Another way of writing this is
$$\tan\theta = 1/\sinh y, \qquad (m = 0). \qquad (90)$$
One also defines the pseudorapidity $\eta$ of a particle, massless or not, by
$$\eta = -\ln(\tan(\theta/2)) \qquad\text{or}\qquad \tan\theta = 1/\sinh\eta. \qquad (91)$$
The relation between rapidity and pseudorapidity is
$$\sinh\eta = \sqrt{1 + m^2/\mathbf{q}_T^2}\;\sinh y. \qquad (92)$$
Thus, if the particle isn't quite massless, $\eta$ may still be a good approximation to $y$.
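The relations above are easy to check numerically. The following sketch, with an arbitrarily chosen massive particle and boost parameter (both assumptions for illustration), verifies the additive boost law of Eq. (88) and the rapidity-pseudorapidity relation of Eq. (92).

```python
import math

def rapidity(energy, pz):
    """y = (1/2) ln(q+/q-), with q+- = (E +- pz)/sqrt(2); the sqrt(2) cancels."""
    return 0.5 * math.log((energy + pz) / (energy - pz))

def pseudorapidity(px, py, pz):
    """eta = -ln(tan(theta/2)), with theta the polar angle from the z axis."""
    theta = math.atan2(math.hypot(px, py), pz)
    return -math.log(math.tan(0.5 * theta))

def boost_along_z(energy, pz, omega):
    """Boost that maps q+ -> exp(omega) q+ and q- -> exp(-omega) q-."""
    return (energy * math.cosh(omega) + pz * math.sinh(omega),
            pz * math.cosh(omega) + energy * math.sinh(omega))

# Illustrative particle: mass and momentum components in arbitrary units.
mass, px, py, pz = 1.0, 3.0, 4.0, 12.0
pt = math.hypot(px, py)
energy = math.sqrt(mass**2 + pt**2 + pz**2)

y = rapidity(energy, pz)
eta = pseudorapidity(px, py, pz)

# Eq. (88): rapidity shifts additively under a boost along z.
omega = 0.7
energy_boosted, pz_boosted = boost_along_z(energy, pz, omega)
y_boosted = rapidity(energy_boosted, pz_boosted)

# Eq. (92): sinh(eta) = sqrt(1 + m^2/q_T^2) sinh(y).
lhs = math.sinh(eta)
rhs = math.sqrt(1.0 + mass**2 / pt**2) * math.sinh(y)

print(f"y = {y:.6f}, eta = {eta:.6f}")
print(f"y after boost - (y + omega) = {y_boosted - (y + omega):.2e}")
print(f"sinh(eta) - sqrt(1 + m^2/qT^2) sinh(y) = {lhs - rhs:.2e}")
```

For this particle the mass is small compared with the transverse momentum, so $\eta$ and $y$ differ only slightly, in line with the remark above.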
5.2 $\gamma^*$, $W$, $Z$ production in hadron-hadron collisions
Consider the process
$$A + B \to Z + X, \qquad (93)$$
where $A$ and $B$ are high energy hadrons. This process and the corresponding process in which a $W$ boson is produced are historically important because they are the processes by which the $W$ and $Z$ bosons were first observed. [29]

Two features of this reaction are important for our discussion. First, the mass of the $Z$ boson is large compared to 1 GeV, so that a process with a small time scale $\Delta t \sim 1/M_Z$ must be involved in the production of the $Z$. At lowest order in the strong interactions, the process is $q + \bar q \to Z$. Here the quark and antiquark are constituents of the high energy hadrons. The second significant feature is that the $Z$ boson does not participate in the strong interactions, so that our description of the observed final state can be very simple.

In process (93), we allow the $Z$ boson to have any transverse momentum $\mathbf{q}$. (Typically, then, $|\mathbf{q}|$ will be much smaller than $M_Z$.) Since we integrate over $\mathbf{q}$ and the mass of the $Z$ boson is fixed, there is only one variable needed to describe the momentum of the $Z$ boson. We choose to use its rapidity $y$, so that we are interested in the cross section $d\sigma/dy$.
Figure 32. A Feynman diagram for Z boson production in a hadron-hadron collision. Two partons, carrying momentum fractions $\xi_A$ and $\xi_B$, participate in the hard interaction. This particular Feynman diagram illustrates an order $\alpha_s$ contribution to the hard scattering cross section: a gluon is emitted in the process of making the Z boson. The diagram also shows the decay of the Z boson into an electron and a neutrino.
The cross section takes a factored form similar to that found for deeply inelastic scattering. Here, however, there are two parton distribution functions:
$$\frac{d\sigma}{dy} \approx \sum_{a,b}\int_{x_A}^1 d\xi_A \int_{x_B}^1 d\xi_B\; f_{a/A}(\xi_A,\mu_F)\, f_{b/B}(\xi_B,\mu_F)\, \frac{d\hat\sigma_{ab}(\mu_F)}{dy}. \qquad (94)$$
The meaning of this formula is intuitive: $f_{a/A}(\xi_A,\mu_F)\,d\xi_A$ gives the probability to find a parton in hadron $A$; $f_{b/B}(\xi_B,\mu_F)\,d\xi_B$ gives the probability to find a parton in hadron $B$; $d\hat\sigma_{ab}/dy$ gives the cross section for these partons to produce the observed $Z$ boson. The formula is illustrated in Fig. 32. The hard scattering cross section can be calculated perturbatively. Fig. 32 illustrates one particular order $\alpha_s$ contribution to $d\hat\sigma_{ab}/dy$. The integrations over parton momentum fractions have limits $x_A$ and $x_B$, which are given by
$$x_A = e^{y}\sqrt{M^2/s}, \qquad x_B = e^{-y}\sqrt{M^2/s}. \qquad (95)$$
Eq. (94) has corrections of order $m/M_Z$, where $m$ is a mass characteristic of hadronic systems, say 1 GeV. In addition, when $d\hat\sigma_{ab}/dy$ is calculated to order $\alpha_s^N$, then there are corrections of order $\alpha_s^{N+1}$.

We could equally well talk about $A + B \to \gamma^* + X$ where the virtual photon decays into a muon pair or an electron pair that is observed and where the mass $Q$ of the $\gamma^*$ is large compared to 1 GeV. For $A + B \to \mu^+ + \mu^- + X$ one has the formula
$$\frac{d\sigma}{dQ^2\,dy} \approx \sum_{a,b}\int_{x_A}^1 d\xi_A \int_{x_B}^1 d\xi_B\; f_{a/A}(\xi_A,\mu_F)\, f_{b/B}(\xi_B,\mu_F)\, \frac{d\hat\sigma_{ab}(\mu_F)}{dQ^2\,dy}. \qquad (96)$$
This process is historically important. Before QCD, one had partons and QED. Partons and QED did a good job of explaining deeply inelastic scattering. But there were other ways to explain deeply inelastic scattering. High mass dimuon production was investigated experimentally by Lederman et al. [30] Drell and Yan [31] proposed to explain the experimental results using the lowest order version of the formula above. It worked. The alternative methods that worked for deeply inelastic scattering did not work here. This helped to establish the parton picture.

5.3 Factorization is not so obvious
The factorization formula Eq. (94) is supposed to hold up to $m^2/Q^2$ corrections. This result is not so obvious, and in fact does not hold graph by graph. A graph for which it does not hold is shown in Fig. 33. Does factorization hold if one sums over graphs? The answer is yes, but to show this one needs to use unitarity, causality and gauge invariance. For more information, the reader is invited to consult Ref. [24].

5.4 Jet production
In our study of high energy electron-positron annihilation, we discovered three things. First, QCD makes the qualitative prediction that particles in the final state should tend to be grouped in collimated sprays of hadrons called jets. The jets carry the momenta of the first quarks and gluons produced in the hard process. Second, certain kinds of experimental measurements probe the short-time physics of the hard interaction, while being insensitive to the long-time physics of parton splitting, soft gluon exchange, and the binding of partons into hadrons. Such measurements are called infrared safe. Third, among the infrared safe observables are cross sections to make jets.
Figure 33. A graph for which factorization does not work. The spectator partons interact softly with the active partons, so that the soft part of the graph does not break up into two factors. (The incoming hadron momenta are labeled $P_A$ and $P_B$.)
Figure 34. Sketch of a two-jet event at a hadron collider. The cylinder represents the detector, with the beam pipe along its axis. Typical hadron-hadron collisions produce beam remnants, the debris from soft interactions among the partons. The particles in the beam remnants have small transverse momenta, as shown in the sketch. In rare events, there is a hard parton-parton collision, which produces jets with high transverse momenta. In the event shown, there are two high $P_T$ jets.
These ideas work for hadron-hadron collisions too. In such collisions, there is sometimes a hard parton-parton collision, which produces two or more jets, as depicted in Fig. 34. Consider the cross section to make one jet plus anything else,
$$A + B \to \mathrm{jet} + X. \qquad (97)$$
Let $E_T$ be the transverse energy of the jet, defined as the sum of the absolute values of the transverse momenta of the particles in the jet. Let $y$ be the rapidity of the jet. Given a definition of exactly what it means to have a jet with transverse energy $E_T$ and rapidity $y$, the jet production cross section takes the familiar factored form
$$\frac{d\sigma}{dE_T\,d\eta} \approx \sum_{a,b}\int_{x_A}^1 d\xi_A \int_{x_B}^1 d\xi_B\; f_{a/A}(\xi_A,\mu_F)\, f_{b/B}(\xi_B,\mu_F)\, \frac{d\hat\sigma_{ab}(\mu_F)}{dE_T\,d\eta}. \qquad (98)$$
One diagram that contributes to $d\hat\sigma$ at next-to-leading order is shown in Fig. 35.
Figure 35. A Feynman diagram for jet production in hadron-hadron collisions. The leading order diagrams for $A + B \to \mathrm{jet} + X$ occur at order $\alpha_s^2$. This particular diagram is for an interaction of order $\alpha_s^3$. When the emitted gluon is not soft or nearly collinear to one of the outgoing quarks, this diagram corresponds to a final state like that shown in the small sketch, with three jets emerging in addition to the beam remnants. Any of these jets can be the jet that is measured in the one jet inclusive cross section.
What shall we choose for the definition of a jet? At a crude level, high $E_T$ jets are quite obvious and the precise definition hardly matters. However, if we want to make a quantitative measurement of a jet cross section to compare to next-to-leading order theory, then the definition does matter. There are several possibilities for a definition that is infrared safe. The one most used in hadron-hadron collisions is based on cones. Here I will present a different algorithm that is similar to the algorithms used to define jets in electron-positron annihilation.

5.5 $k_T$ algorithm
The main idea of the $k_T$ algorithm [32] is to modify one of the algorithms used in $e^+e^-$ annihilation so that we use $E_T$, $\eta$ and $\phi$ [...] contamination by the many low $E_T$ particles in the event. We choose a merging parameter $R$. Then we start with a list of "protojets" with momenta $p_1^\mu, \ldots, p_N^\mu$, as illustrated in Fig. 36. We also start with an empty list of finished jets. The end result is a list of momenta $p_k^\mu$ of finished jets, ordered in $E_T$.
Figure 36. A two jet event in a proton antiproton collision. The two protojets on the lower left are the first to be combined.
The algorithm can be stated very simply. See Fig. 36.
1. For each pair of protojets define
$$d_{ij} = \min(E_{T,i}^2, E_{T,j}^2)\,[(\eta_i - \eta_j)^2 + (\phi_i - \phi_j)^2]/R^2. \qquad (99)$$
For each protojet define
$$d_i = E_{T,i}^2. \qquad (100)$$
2. Find the smallest of all the $d_{ij}$ and the $d_i$. Call it $d_{\min}$.
3. If $d_{\min}$ is a $d_{ij}$, merge protojets $i$ and $j$ into a new protojet $k$ with
$$E_{T,k} = E_{T,i} + E_{T,j}, \qquad \eta_k = [E_{T,i}\,\eta_i + E_{T,j}\,\eta_j]/E_{T,k}, \qquad \phi_k = [E_{T,i}\,\phi_i + E_{T,j}\,\phi_j]/E_{T,k}. \qquad (101)$$
4. If $d_{\min}$ is a $d_i$, then protojet $i$ is "not mergable." Remove it from the list of protojets and add it to the list of jets.
5. If protojets remain, go to 1.

Evidently, if two protojets are collinear, they will be merged right away. If one has vanishing momentum, it will either get merged with a protojet nearby in angle, or it will become a low $E_T$ jet in the final list. Many of the jets have small $E_T$ and are really minijets, or just part of low $E_T$ debris. For an inclusive cross section to make $n$ high $E_T$ jets plus anything else, the many low $E_T$ jets do not affect the result. For an exclusive $n$ jet cross section, one would use a cutoff $E_{T,\min}$. Thus in either case, low $E_T$ particles do not change the result. Thus the algorithm is infrared safe.
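The five steps above translate almost line by line into code. The sketch below is a minimal, unoptimized illustration, not a production jet finder. Several choices are assumptions of this example: protojets are represented as tuples `(ET, eta, phi)`, the measures $d_{ij}$ and $d_i$ use $E_T^2$ following the convention of Ref. [32], azimuthal differences are wrapped into $(-\pi, \pi]$ (a practical detail not spelled out in the text), and the event itself is invented.

```python
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval [-pi, pi)."""
    return (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi

def kt_cluster(protojets, R=1.0):
    """Cluster (ET, eta, phi) protojets; return finished jets ordered in ET."""
    protojets = [tuple(p) for p in protojets]
    jets = []
    while protojets:                      # step 5: repeat while protojets remain
        best = None                       # (d_min, i, j); j is None for a d_i
        for i, (et_i, eta_i, phi_i) in enumerate(protojets):
            d_i = et_i**2                 # Eq. (100)
            if best is None or d_i < best[0]:
                best = (d_i, i, None)
            for j in range(i + 1, len(protojets)):
                et_j, eta_j, phi_j = protojets[j]
                dr2 = (eta_i - eta_j)**2 + delta_phi(phi_i, phi_j)**2
                d_ij = min(et_i, et_j)**2 * dr2 / R**2   # Eq. (99)
                if d_ij < best[0]:
                    best = (d_ij, i, j)   # step 2: keep the smallest measure
        _, i, j = best
        if j is None:
            jets.append(protojets.pop(i))  # step 4: protojet i becomes a jet
        else:
            et_i, eta_i, phi_i = protojets[i]
            et_j, eta_j, phi_j = protojets[j]
            et_k = et_i + et_j             # step 3: merge with Eq. (101)
            eta_k = (et_i * eta_i + et_j * eta_j) / et_k
            # ET-weighted phi, computed relative to phi_i to respect wrap-around
            phi_k = phi_i + et_j * delta_phi(phi_j, phi_i) / et_k
            protojets.pop(j)               # j > i, so remove j first
            protojets.pop(i)
            protojets.append((et_k, eta_k, phi_k))
    return sorted(jets, key=lambda jet: jet[0], reverse=True)

# Invented event: two hard clusters, each split in two, plus soft debris.
event = [(100.0, 0.0, 0.0), (30.0, 0.1, 0.05),
         (120.0, -0.5, 3.1), (10.0, -0.4, -3.1),
         (1.0, 2.5, 1.0)]
jets = kt_cluster(event, R=1.0)
for et, eta, phi in jets:
    print(f"ET = {et:7.2f}  eta = {eta:6.3f}  phi = {phi:6.3f}")

# Infrared-safety check: adding a very soft particle barely changes the hard jets.
jets_with_soft = kt_cluster(event + [(1e-6, 0.05, 0.02)], R=1.0)
```

In this example the two nearby pairs are merged into two jets of $E_T = 130$ each and the soft particle ends up as a separate low $E_T$ jet, which is the behavior described in the paragraph above. The double loop recomputes every $d_{ij}$ on each pass, which is adequate for a handful of protojets but not for realistic events.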
6 Epilogue
QCD is a rich subject. The theory and the experimental evidence indicate that quarks and gluons interact weakly on short time and distance scales. But the net effect of these interactions extending over long time and distance scales is that the chromodynamic force is strong. Quarks are bound into hadrons. Outgoing partons emerge as jets of hadrons, with each jet composed of subjets. Thus QCD theory can be viewed as starting with simple perturbation theory, but it does not end there. The challenge for both theorists and experimentalists is to extend the range of phenomena that we can relate to the fundamental theory.

References
1. G. Sterman et al., Handbook of Perturbative QCD, Rev. Mod. Phys. 67, 157 (1995).
2. R. K. Ellis, W. J. Stirling, and B. R. Webber, QCD and Collider Physics, (Cambridge University Press, Cambridge, 1996).
3. L. Brown, Quantum Field Theory, (Cambridge University Press, Cambridge, 1992).
4. G. Sterman, An Introduction to Quantum Field Theory, (Cambridge University Press, Cambridge, 1993).
5. M. Peskin and D. V. Schroeder, An Introduction to Quantum Field Theory, (Addison-Wesley, Reading, 1995).
6. S. Weinberg, The Quantum Theory of Fields, (Cambridge University Press, Cambridge, 1995).
7. D. E. Soper, ed., QCD and Beyond, Proceedings of the 1995 Theoretical Advanced Studies Institute, Boulder, 1995, (World Scientific, Singapore, 1996).
8. D. E. Soper, Basics of QCD Perturbation Theory, e-Print Archive: hep-ph/9702203, in The Strong Interaction, from Hadrons to Partons, XXIV SLAC Summer Institute on Particle Physics, Stanford, August 1996, edited by L. DePorcel (SLAC, Stanford, 1997).
9. G. Sterman, Phys. Rev. D 17, 2773 (1978); 17, 2789 (1978).
10. J. C. Collins and D. E. Soper, Nucl. Phys. B 193, 381 (1981).
11. Z. Kunszt and D. E. Soper, Phys. Rev. D 46, 192 (1992).
12. C. L. Basham, L. S. Brown, S. D. Ellis, and S. T. Love, Phys. Rev. D 19, 2018 (1979).
13. S. Bethke, Z. Kunszt, D. E. Soper, and W. J. Stirling, Nucl. Phys. B 370, 310 (1992).
14. J. C. Collins and D. E. Soper, Nucl. Phys. B 194, 445 (1982).
15. G. Curci, W. Furmanski and R. Petronzio, Nucl. Phys. B 175, 27 (1980).
16. J. C. Collins, Renormalization, (Cambridge University Press, Cambridge, 1984).
17. L. Surguladze and M. Samuel, Rev. Mod. Phys. 68, 259 (1996).
18. D. E. Soper and L. Surguladze, Phys. Rev. D 54, 4566 (1996).
19. P. M. Stevenson, Phys. Rev. D 23, 2916 (1981).
20. S. J. Brodsky, G. P. Lepage and P. MacKenzie, Phys. Rev. D 28, 228 (1983).
21. H. Lai et al., e-Print Archive hep-ph/9606399, Phys. Rev. D 55, (to be published).
22. CDF Collaboration (F. Abe et al.), Phys. Rev. Lett. 77, 438 (1996); B. Flaugher, CDF Collaboration, Proceedings of the XI Topical Workshop on ppbar Collider Physics, Padova, Italy, May 1996. D0 Collaboration: N. Varelas, Proceedings of the International Conference on High Energy Physics, Warsaw, July 1996.
23. J. Huston et al., Phys. Rev. Lett. 77, 444 (1996).
24. J. C. Collins, D. E. Soper, and G. Sterman, in Perturbative QCD, edited by A. Mueller, (World Scientific, Singapore, 1989).
25. W. A. Bardeen, A. J. Buras, D. W. Duke, and T. Muta, Phys. Rev. D 18, 3998 (1978).
26. D. E. Soper, in M. Golterman et al. eds., Lattice '96 International Symposium on Lattice Field Theory, St. Louis, June 1996 (Elsevier Science, Amsterdam, to be published).
27. A. D. Martin, R. G. Roberts, and W. J. Stirling, Phys. Lett. B 387, 419 (1996).
28. P. Anandam and D. E. Soper, "A Potpourri of Partons", http://zebu.uoregon.edu/~parton/.
29. UA1 Collaboration (G. Arnison et al.), Phys. Lett. B 122, 103 (1983); UA2 Collaboration (G. Banner et al.), Phys. Lett. B 122, 476 (1983).
30. J. C. Christenson et al., Phys. Rev. Lett. 25, 1523 (1970).
31. S. D. Drell and T.-M. Yan, Ann. Phys. 66, 578 (1971).
32. S. D. Ellis and D. E. Soper, Phys. Rev. D 48, 3160 (1993).
LATTICE QCD AND THE CKM MATRIX
Thomas DeGrand
Department of Physics, University of Colorado, Boulder CO 80309-390
These lectures provide an introduction to lattice methods for nonperturbative studies of Quantum Chromodynamics. Lecture 1 (Ch. 2) is a very vanilla introduction to lattice QCD. Lecture 2 (Ch. 3) describes examples of recent lattice calculations relevant to fixing the parameters of the CKM matrix.
1 Introduction
The lattice [1] regularization of QCD has been a fruitful source of qualitative and quantitative information about QCD for many years, especially when combined with Monte Carlo simulation. Lattice methods are presently the only way we know how to compute masses and matrix elements in the strong interactions beginning with the Lagrangian of QCD. My goal in these lectures is to give enough of an overview of the subject that you will be able to make an intelligent appraisal of a lattice calculation.

The first lecture will describe how to put QCD on a lattice. This is a long story with a lot of parts. Lattice QCD is full of technicalities, but I will try to make the discussion physical. In Lecture Two I will discuss lattice calculations of matrix elements which are needed to convert experimental numbers to predictions for the CKM matrix. These calculations have a lot of ingredients: a typical one starts with a particular choice of discretization and simulation algorithm, and a choice of operators whose matrix elements are appropriate for one's measurement. After the lattice number is computed, it may have to be converted into a number in a continuum regularization scheme (like $\overline{\rm MS}$), which will involve some kind of perturbative or nonperturbative matching calculation. Finally, it might have to be extrapolated in quark mass, to some physical quark mass value or the chiral limit. None of these parts are simple or obvious (or more precisely, most of the time the simple and obvious idea doesn't work very well). Hopefully you will find the physics in the calculations more interesting than the tables of numbers which result.

2 Basics of Lattice QCD
2.1 Lattice Variables and Actions
All quantum field theories must be regulated in order to control their ultraviolet divergences while calculations are performed. The lattice is a space-time cutoff which eliminates all degrees of freedom from distances shorter than the lattice spacing $a$. As with any regulator, it must be removed after renormalization. Contact with experiment only exists in the continuum limit, when the lattice spacing is taken to zero.

The lattice is a unique regulator compared to the ones you might already know. Other regularization schemes are tied closely to perturbative expansions: one calculates a process to some order in a coupling constant; divergences are removed order by order in perturbation theory. The lattice, however, is a nonperturbative cutoff. Before a calculation begins, all wavelengths less than a lattice spacing are removed.

All regulators have a price. On the lattice we sacrifice all continuous space-time symmetries but preserve all internal symmetries, including local gauge invariance. This preservation is important for nonperturbative physics. For example, gauge invariance is a property of the continuum theory which is nonperturbative, so maintaining it as we pass to the lattice means that all of its consequences (including current conservation and renormalizability) will be preserved. The bill is paid when we take the lattice spacing to zero and try to recover what we have left out.

Let's begin by thinking about a lattice version of scalar field theory. One just replaces the space-time coordinate $x_\mu$ by a set of integers $n_\mu$ ($x_\mu = a n_\mu$, where $a$ is the lattice spacing). Field variables $\phi(x)$ are defined on sites, $\phi(x_n) \equiv \phi_n$. The integral of the Lagrangian over space-time is replaced by a sum over sites,
$$\int d^4x\, \mathcal{L} \to a^4 \sum_n \mathcal{L}(\phi_n), \qquad (1)$$
and the generating functional for Euclidean Green's functions is replaced by an ordinary integral over the lattice fields
$$Z = \int \Big(\prod_n d\phi_n\Big)\, e^{-\beta S}. \qquad (2)$$
Gauge fields are a little more complicated. They carry a space-time index $\mu$ in addition to an internal symmetry index $a$ ($A_\mu^a(x)$) and are associated with a path in space $x_\mu(s)$: a particle traversing a contour in space picks up a phase factor
$$\psi \to P\Big(\exp\, ig\int_s dx_\mu A_\mu\Big)\psi \equiv U(s)\,\psi(x). \qquad (3)$$
$P$ is a path-ordering factor analogous to the time-ordering operator in ordinary quantum mechanics. Under a gauge transformation $g$, $U(s)$ is rotated at each end:
$$U(s) \to g^{-1}(x_\mu(s))\, U(s)\, g(x_\mu(0)). \qquad (4)$$
These considerations led Wilson [2] to formulate gauge fields on a space-time lattice, in terms of a set of fundamental variables which are elements of the gauge group $G$ living on the links of a four-dimensional lattice, connecting neighboring sites $x$ and $x + a\hat\mu$: $U_\mu(x)$, with $U_{-\mu}(x+\hat\mu)^\dagger = U_\mu(x)$,
$$U_\mu(n) = \exp(i g a\, T^a A^a_\mu(n)) \qquad (5)$$
for SU(N). ($g$ is the coupling, $A_\mu$ the vector potential, and $T^a$ is a group generator.) Under a gauge transformation link variables transform as
$$U_\mu(x) \to V(x)\, U_\mu(x)\, V^\dagger(x+\hat\mu) \qquad (6)$$
and site variables as
$$\psi(x) \to V(x)\,\psi(x) \qquad (7)$$
so the only gauge invariant operators we can use as order parameters are matter fields connected by oriented "strings" of $U$'s
$$\bar\psi(x_1)\, U_\mu(x_1)\, U_\mu(x_1+\hat\mu) \cdots \psi(x_2) \qquad (8)$$
or closed oriented loops of $U$'s
$$\mathrm{Tr} \ldots U_\mu(x)\, U_\mu(x+\hat\mu) \ldots \to \mathrm{Tr} \ldots U_\mu(x)\, V^\dagger(x+\hat\mu)\, V(x+\hat\mu)\, U_\mu(x+\hat\mu) \ldots. \qquad (9)$$
An action is specified by recalling that the classical Yang-Mills action involves the curl of $A_\mu$, $F_{\mu\nu}$. Thus a lattice action ought to involve a product of $U_\mu$'s around some closed contour. Gauge invariance will automatically be satisfied for actions built of powers of traces of $U$'s around arbitrary closed loops, with arbitrary coupling constants. If we assume that the gauge fields are smooth, we can expand the link variables in a power series in $gaA_\mu$'s. For almost any closed loop, the leading term in the expansion will be proportional to $F_{\mu\nu}^2$. This is not a bug, it is a feature. All lattice actions are just bare actions characterized by many bare parameters (coefficients of loops). In the continuum (scaling) limit all these actions are in the same universality class, which is (presumably) the same universality class as QCD with any regularization scheme, and there will be cutoff-independent predictions from any lattice actions which are simply predictions of QCD.
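The statement that closed loops of links are gauge invariant, while individual links are not, can be checked directly in the simplest setting. The sketch below is a toy illustration under assumptions of its own: the gauge group is U(1), so each link is a phase $U_\mu(n) = e^{i\theta_\mu(n)}$, and the lattice is a small periodic two-dimensional grid rather than a four-dimensional one. It applies the transformation of Eq. (6) with random $V(n) = e^{i\alpha(n)}$ and compares the smallest closed loop before and after.

```python
import cmath
import math
import random

L = 4  # sites per direction on a periodic two-dimensional lattice (toy choice)

def shift(site, mu):
    """Neighboring site one step in direction mu, with periodic boundaries."""
    x, y = site
    return ((x + 1) % L, y) if mu == 0 else (x, (y + 1) % L)

def random_links(rng):
    """Random link angles theta_mu(n); the link is U_mu(n) = exp(i theta_mu(n))."""
    return {(mu, x, y): rng.uniform(-math.pi, math.pi)
            for mu in range(2) for x in range(L) for y in range(L)}

def link(theta, mu, site):
    return cmath.exp(1j * theta[(mu, site[0], site[1])])

def plaquette(theta, site):
    """Smallest closed loop: U_0(n) U_1(n+0) U_0(n+1)^dagger U_1(n)^dagger."""
    return (link(theta, 0, site)
            * link(theta, 1, shift(site, 0))
            * link(theta, 0, shift(site, 1)).conjugate()
            * link(theta, 1, site).conjugate())

def gauge_transform(theta, alpha):
    """Eq. (6) for U(1): U_mu(n) -> V(n) U_mu(n) V(n+mu)^dagger, V = exp(i alpha)."""
    new_theta = {}
    for (mu, x, y), angle in theta.items():
        neighbor = shift((x, y), mu)
        new_theta[(mu, x, y)] = angle + alpha[(x, y)] - alpha[neighbor]
    return new_theta

rng = random.Random(0)
theta = random_links(rng)
alpha = {(x, y): rng.uniform(-math.pi, math.pi)
         for x in range(L) for y in range(L)}
theta_prime = gauge_transform(theta, alpha)

sites = [(x, y) for x in range(L) for y in range(L)]
max_plaquette_change = max(abs(plaquette(theta, s) - plaquette(theta_prime, s))
                           for s in sites)
max_link_change = max(abs(link(theta, 0, s) - link(theta_prime, 0, s))
                      for s in sites)
print(f"largest change in a plaquette: {max_plaquette_change:.2e}")
print(f"largest change in a single link: {max_link_change:.2e}")
```

The $V$ factors cancel pairwise around the loop, exactly as in Eq. (9), so the plaquette is unchanged up to rounding while single links change by order one. For a non-Abelian group the same cancellation holds inside the trace.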
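Let's hold that thought while we do an example: The simplest contour has a perimeter of four links. In SU(N)
$$\beta S = \frac{2}{g^2}\sum_n \sum_{\mu>\nu} \mathrm{Re\,Tr}\left(1 - U_\mu(n)\, U_\nu(n+\hat\mu)\, U_\mu^\dagger(n+\hat\nu)\, U_\nu^\dagger(n)\right). \qquad (10)$$
This action is called the "plaquette action" or the "Wilson action" after its inventor. $g^2$ is the bare lattice coupling, whose associated cutoff is $a$. The lattice parameter $\beta = 2N/g^2$ is often written instead of $g^2 = 4\pi\alpha_s$.

Let us see how this action reduces to the standard continuum action. Specializing to the U(1) gauge group, and slightly redefining the coupling,
$$\beta S = \frac{1}{g^2}\sum_n \sum_{\mu>\nu} \mathrm{Re}\left(1 - \exp\big(iga[A_\mu(n) + A_\nu(n+\hat\mu) - A_\mu(n+\hat\nu) - A_\nu(n)]\big)\right). \qquad (11)$$
The naive continuum limit is taken by assuming that the lattice spacing $a$ is small, and Taylor expanding
$$A_\mu(n+\hat\nu) = A_\mu(n) + a\,\partial_\nu A_\mu(n) + \ldots \qquad (12)$$
so the action becomes
$$\beta S = \frac{1}{g^2}\sum_n \sum_{\mu>\nu}\left(1 - \mathrm{Re}\,\exp\big(iga[a(\partial_\mu A_\nu - \partial_\nu A_\mu) + O(a^2)]\big)\right) \qquad (13)$$
$$= \frac{1}{4g^2}\, a^4 \sum_n \sum_{\mu\nu} g^2 F_{\mu\nu}^2 + \ldots \qquad (14)$$
$$= \frac{1}{4}\int d^4x\, F_{\mu\nu}^2, \qquad (15)$$
transforming the sum on sites back to an integral.

2.2 Numerical Simulations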
In a lattice calculation, like any other calculation in quantum field theory, we compute an expectation value of any observable $\Gamma$ as an average over an ensemble of field configurations:
$$\langle\Gamma\rangle = \frac{1}{Z}\int [d\phi]\, \exp(-S)\, \Gamma(\phi). \qquad (16)$$
We do this by Monte Carlo simulation: we construct an ensemble of states (collection of field variables), where the probability of finding a particular configuration in the ensemble is given by Boltzmann weighting (i.e. proportional to $\exp(-S)$). Then the expectation value of any observable $\Gamma$ is given simply by an average over the ensemble:
$$\langle\Gamma\rangle \simeq \bar\Gamma \equiv \frac{1}{N}\sum_{i=1}^{N} \Gamma[\phi^{(i)}]. \qquad (17)$$
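A toy version of this procedure fits in a few lines. The sketch below is an illustration under strong simplifying assumptions, not a lattice QCD code: the "field" is a single real variable $\phi$ with action $S(\phi) = \phi^2/2$, the update is a plain Metropolis step, and the observable is $\Gamma = \phi^2$, whose exact expectation value with weight $\exp(-S)$ is 1. The step size, chain length, thermalization length, and seed are arbitrary choices.

```python
import math
import random

def action(phi):
    """Toy action for a single real variable: S = phi^2 / 2."""
    return 0.5 * phi * phi

def metropolis_chain(n_samples, step=2.0, n_thermalize=1000, seed=1):
    """Generate configurations distributed according to exp(-S) by Metropolis updates."""
    rng = random.Random(seed)
    phi = 5.0  # deliberately far from typical values, to show thermalization
    samples = []
    for sweep in range(n_thermalize + n_samples):
        trial = phi + rng.uniform(-step, step)       # propose a new configuration
        delta_s = action(trial) - action(phi)
        # Accept with probability min(1, exp(-delta_s)).
        if delta_s <= 0.0 or rng.random() < math.exp(-delta_s):
            phi = trial
        if sweep >= n_thermalize:                    # discard the approach to equilibrium
            samples.append(phi)
    return samples

samples = metropolis_chain(200000)
# Ensemble average of Gamma = phi^2, as in Eq. (17).
estimate = sum(phi * phi for phi in samples) / len(samples)
print(f"Monte Carlo estimate of <phi^2> = {estimate:.4f} (exact value 1)")
```

Successive configurations in such a chain are correlated, so the statistical error of the estimate falls more slowly than a naive $1/\sqrt{N}$ count of independent samples would suggest.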
As the number of measurements $N$ becomes large the quantity $\bar T$ will be Gaussian-distributed about a mean value, our desired expectation value. The idea of essentially all simulation algorithms [3] is to construct a new configuration of field variables from an old one. One begins with some initial field configuration and monitors observables while the algorithm steps along. After some number of steps, the value of observables will appear to become independent of the starting configuration. At that point the system is said to be "in equilibrium" and Eq. (17) can be used to make measurements.

Dynamical fermions are a complication for QCD [4]. The fermion path integral is not a number and a computer can't simulate fermions directly. However, one can formally integrate out the fermion fields. For $n_f$ degenerate fermion flavors
$$Z = \int[dU][d\psi][d\bar\psi]\,\exp\Big(-\beta S_G(U) - \sum_{i=1}^{n_f}\bar\psi_i M(U)\psi_i\Big) \tag{18}$$
$$= \int[dU]\,(\det M(U))^{n_f}\exp(-\beta S(U)). \tag{19}$$
The determinant introduces a nonlocal interaction among the $U$'s:
$$Z = \int[dU]\,\exp\big(-\beta S(U) + n_f\,\mathrm{Tr}\ln M(U)\big). \tag{20}$$
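The step from Eq. (19) to Eq. (20) rests on the identity $\det M = \exp(\mathrm{Tr}\ln M)$. The sketch below checks it numerically for a small symmetric positive-definite matrix, which stands in for the fermion matrix purely as an assumption of this example (a real $M(U)$ is a large sparse complex matrix).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
a = rng.normal(size=(n, n))
# Symmetric positive-definite stand-in for the fermion matrix M(U).
m = a @ a.T + n * np.eye(n)

n_f = 2
eigenvalues = np.linalg.eigvalsh(m)
trace_log = np.sum(np.log(eigenvalues))   # Tr ln M from the eigenvalues
sign, log_det = np.linalg.slogdet(m)      # sign of det M and ln|det M|

lhs = np.linalg.det(m) ** n_f             # (det M)^{n_f}
rhs = np.exp(n_f * trace_log)             # exp(n_f Tr ln M)
print(lhs, rhs)
```

Written this way, the determinant appears as an extra term in the exponent that couples every link variable to every other one, which is the nonlocality mentioned above.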
Generating configurations of the $U$'s involves computing how the action changes when the set of $U$'s is varied. Typically, this involves inverting the fermion matrix $M(U)$ ($d\log M/dM = M^{-1}$). This is the major computational problem dynamical fermion simulations face. $M$ has eigenvalues with a very large range, from $2\pi$ down to $m_q a$, and in the physically interesting limit of small $m_q$ the matrix becomes ill-conditioned. At present it is necessary to compute at unphysically heavy values of the quark mass and to extrapolate to $m_q = 0$. (The standard inversion technique today is one of the variants of the conjugate gradient algorithm [5].)

This tremendous expense is responsible for one of the "standard" lattice approximations, the "quenched" approximation. In this approximation the back-reaction of the fermions on the gauge fields is neglected, by setting $n_f = 0$ in Eq. (18). Valence quarks, or quarks which appear in observables, are kept, but no sea quarks. No one knows how good an approximation this is, in principle. In practice it works very well for spectroscopy. The only way we know how to test it is to compare simulations in the quenched approximation with those from full QCD.

2.3 Spectroscopy Calculations
Masses are computed in lattice simulations from the asymptotic behavior of Euclidean-time correlation functions. A typical (diagonal) correlator can be written as
$$C(t) = \langle 0|O(t)O(0)|0\rangle. \tag{21}$$
Making the replacement
$$O(t) = e^{Ht}\,O\,e^{-Ht} \tag{22}$$
and inserting a complete set of energy eigenstates, Eq. (21) becomes
$$C(t) = \sum_n |\langle 0|O|n\rangle|^2 e^{-E_n t}. \tag{23}$$
At large separation the correlation function is approximately
$$C(t) \simeq |\langle 0|O|1\rangle|^2 e^{-E_1 t} \tag{24}$$
where $E_1$ is the energy of the lightest state which the operator $O$ can create from the vacuum. Fig. 1 shows an example of this. If the operator does not couple to the vacuum, then in the limit of large $t$ one hopes to find the mass $E_1$ by measuring the leading exponential falloff of the correlation function. If the operator $O$ has poor overlap with the lightest state, a reliable value for the mass can be extracted only at a large time $t$. In some cases that state is the vacuum itself, in which case $E_1 = 0$. Then one looks for the next higher state, a signal which disappears into the constant background. This is hard to do.

Most of the observables we are interested in will involve valence fermions. Let's suppose we wanted to measure the mass of a meson. Then we might take
$$C(t) = \sum_x \langle J(x,t)J(0,0)\rangle \tag{25}$$
where
$$J(x,t) = \bar\psi(x,t)\,\Gamma\,\psi(x,t) \tag{26}$$
and $\Gamma$ is a Dirac matrix. The intermediate states $|n\rangle$ which saturate $C(t)$ are the hadrons which the current $J$ can create from the vacuum: the pion for a pseudoscalar current, the rho for a vector current, and so on.

Figure 1: An obviously very nice looking lattice correlator and its fit. Periodic boundary conditions convert the exponential decay into a hyperbolic cosine.

Now we write out the correlator in terms of fermion fields
$$C(t) = \sum_x \langle 0|\bar\psi_i^\alpha(x,t)\Gamma_{ij}\psi_j^\alpha(x,t)\,\bar\psi_k^\beta(0,0)\Gamma_{kl}\psi_l^\beta(0,0)|0\rangle \tag{27}$$
with a Roman index for spin and a Greek index for color. We contract creation and annihilation operators into quark propagators
$$\langle 0|T(\psi_j^\alpha(x,t)\bar\psi_k^\beta(0,0))|0\rangle = G_{jk}^{\alpha\beta}(x,t;0,0) \tag{28}$$
so
$$C(t) = \sum_x \mathrm{Tr}\,G(x,t;0,0)\,\Gamma\,G(0,0;x,t)\,\Gamma \tag{29}$$
where the trace runs over spin and color indices. Baryons are constructed similarly. A good way to think about these correlators is by using a sort of Feynman-diagram language which keeps track of the valence quark lines but ignores all the gluons and sea quarks.

2.4 The Continuum Limit
When we define a theory on a lattice the lattice spacing $a$ is an ultraviolet cutoff and all the coupling constants in the action are the bare couplings defined with respect to it. When we take $a$ to zero we must also specify how the couplings behave. The proper continuum limit comes when we take $a$ to zero holding physical quantities fixed, not when we take $a$ to zero fixing the couplings.

On the lattice, if all quark masses are set to zero, the only dimensionful parameter is the lattice spacing, so all masses scale like $1/a$. Said differently, a lattice calculation produces the dimensionless combination $am(a)$. One can determine the lattice spacing by fixing one mass from experiment. Then all other dimensionful quantities can be predicted.

Imagine computing some masses at several values of the lattice spacing. (Pick several values of the bare parameters and calculate masses for each set of couplings.) Our calculated mass ratios will depend on the lattice cutoff. If the lattice spacing is small enough, the typical behavior will look like
$$\frac{a m_1(a)}{a m_2(a)} = \frac{m_1(0)}{m_2(0)} + O(m_1 a) + O((m_1 a)^2) + \ldots \tag{30}$$
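The extrapolation implied by Eq. (30) can be sketched as a polynomial fit in $a$. All numbers below are invented for illustration (they are not lattice data), and the data are generated without noise so that the fit recovers the input exactly; a real analysis would carry statistical errors and would have to decide how many powers of $a$ to keep.

```python
import numpy as np

# Hypothetical lattice spacings (fm) and a made-up mass ratio with scale violations.
a = np.array([0.05, 0.08, 0.10, 0.12, 0.15])
true_ratio, c1, c2 = 1.22, 0.30, -0.50   # illustrative inputs only
ratio = true_ratio + c1 * a + c2 * a**2

# Fit ratio(a) = r0 + r1*a + r2*a^2. polyfit returns the highest power first.
r2, r1, r0 = np.polyfit(a, ratio, deg=2)
print(f"continuum-limit ratio = {r0:.4f}")
```

The intercept `r0` plays the role of $m_1(0)/m_2(0)$; the coefficients `r1` and `r2` parametrize the scale violations.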
The leading term does not depend on the value of the UV cutoff, while the other terms do. The goal of a lattice calculation is to discover the value of some physical observable as the UV cutoff is taken to be very large, so the physics is in the first term. Everything else is an artifact of the calculation. We say that a calculation "scales" if the $a$-dependent terms in Eq. (30) are zero or small enough that one can extrapolate to $a = 0$, and generically refer to all the $a$-dependent terms as "scale violations."

We can imagine expressing each dimensionless combination $am(a)$ as some function of the bare coupling(s) $\{g(a)\}$, $am = f(\{g(a)\})$. As $a \to 0$ we must tune the set of couplings $\{g(a)\}$ so
$$\lim_{a\to 0}\frac{1}{a}f(\{g(a)\}) \to \text{constant}. \tag{31}$$
From the point of view of the lattice theory, we must tune $\{g\}$ so that correlation lengths $1/(ma)$ diverge. This will occur only at the locations of second (or higher) order phase transitions. In QCD the fixed point is $g_c = 0$ so we must tune the coupling to vanish as $a$ goes to zero. One needs to set the scale by taking one experimental number as input. A complication that you may not have thought of is that the theory we simulate on the computer is different from the real world. For example, the quenched approximation, or for that matter QCD with two flavors of degenerate quarks, almost certainly does not have the same spectrum as QCD with six flavors of dynamical quarks with their appropriate masses. Using one mass to set the scale from one of these approximations to the real world might not give a prediction for another mass which agrees with experiment.
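As a small worked example of setting the scale: suppose a simulation at one bare coupling returns the dimensionless numbers $am_\rho$ and $am_N$. The values used below are hypothetical, and the choice of the rho mass (taken as roughly 0.770 GeV) as the experimental input is an assumption of this sketch, not a recommendation.

```python
HBARC_GEV_FM = 0.1973  # hbar*c in GeV fm, used to convert 1/GeV to fm

def set_scale(am_reference, m_reference_gev):
    """Return the lattice spacing in fm from one dimensionless lattice mass a*m."""
    a_inverse_gev = m_reference_gev / am_reference
    return HBARC_GEV_FM / a_inverse_gev

# Hypothetical simulation output at one value of the bare coupling.
am_rho, am_nucleon = 0.40, 0.52
m_rho_gev = 0.770   # experimental input that fixes the scale

a_fm = set_scale(am_rho, m_rho_gev)
# Every other dimensionful quantity is now a prediction.
m_nucleon_gev = am_nucleon * m_rho_gev / am_rho
print(f"a = {a_fm:.4f} fm, predicted m_N = {m_nucleon_gev:.3f} GeV")
```

Choosing a different input quantity generally gives a different lattice spacing in an approximate theory, which is exactly the complication described above.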
(The glass is always half empty... In the strong coupling limit, lattice-regularized QCD automatically confines [2] and chiral symmetry is spontaneously broken [6]. So unless there is some kind of phase transition as the bare couplings are tuned to take the cutoff away, which probably doesn't happen, we are working with a confining theory without doing anything special.)

Today's QCD simulations range from $16^3\times 32$ to $32^3\times 100$ points and run from hundreds (quenched) to thousands (full QCD) of hours on the fastest supercomputers in the world. The cost of an unquenched Monte Carlo simulation in a box of physical size $L$ with lattice spacing $a$ and quark mass $m_q$ scales roughly as
$$\left(\frac{L}{a}\right)^{4}\left(\frac{1}{a}\right)^{1-2}\left(\frac{1}{m_q}\right)^{2-3} \tag{32}$$
where the 4 is just the number of sites, the 1-2 is the cost of "critical slowing down" (the extent to which successive configurations are correlated), and the 2-3 is the cost of inverting the fermion propagator, plus critical slowing down from the nearly massless pions. Thus it is worthwhile to think about how to do the discretization, to maximize the value of the lattice spacing.

The thing to keep in mind is that the lattice action is just a bare action defined with a cutoff. No lattice discretization is any better or worse (in principle) than any other. Any bare action which is in the same universality class as QCD will produce universal numbers in the scaling limit. However, by clever engineering, it might be possible to devise actions whose scaling behavior is better, and which can be used at bigger lattice spacing. An example [7] of a test of scale violations is shown in Fig. 2. The x axis is the lattice spacing, in units of a quantity $r_1$, which is defined through the heavy quark potential: $r_1^2\,dV(r)/dr|_{r_1} = 1.0$, with $r_1$ about 0.4 fm. The plotting symbols are for different kinds of discretizations. The flatter the curve, the smaller the scale violations.

The simplest organizing principle for "improvement" is to use the canonical dimensionality of operators as a guide.
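As a numerical aside on the cost estimate of Eq. (32): the sketch below evaluates the scaling law, up to an unknown overall constant, and asks how much more expensive a simulation becomes when the lattice spacing is halved at fixed box size and quark mass. The function names and the sample values of $L$, $a$ and $m_q$ are hypothetical.

```python
def relative_cost(length, spacing, quark_mass, slowing_power, inversion_power):
    """Eq. (32) up to an unknown overall constant."""
    return ((length / spacing) ** 4
            * (1.0 / spacing) ** slowing_power
            * (1.0 / quark_mass) ** inversion_power)

def cost_ratio_when_halving_spacing(slowing_power):
    """Cost at spacing 0.05 divided by cost at spacing 0.10, all else fixed."""
    coarse = relative_cost(2.0, 0.10, 0.05, slowing_power, 2.5)
    fine = relative_cost(2.0, 0.05, 0.05, slowing_power, 2.5)
    return fine / coarse

low = cost_ratio_when_halving_spacing(1.0)   # exponent 1 for critical slowing down
high = cost_ratio_when_halving_spacing(2.0)  # exponent 2 for critical slowing down
print(f"halving a multiplies the cost by between {low:.0f} and {high:.0f}")
```

With these exponents, halving the lattice spacing costs a factor of $2^5$ to $2^6$, which is the quantitative reason for wanting actions that can be used at bigger lattice spacing.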
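Consider the gauge action as an example. If we perform a naive Taylor expansion of a lattice operator like the plaquette, we find that it can be written as
$$1 - \frac{1}{3}\mathrm{Re\,Tr}\,U_{plaq} = r_0\,\mathrm{Tr}F_{\mu\nu}^2 + a^2\Big[r_1\sum_{\mu\nu}\mathrm{Tr}\,D_\mu F_{\mu\nu}D_\mu F_{\mu\nu} + r_2\sum_{\mu\nu\sigma}\mathrm{Tr}\,D_\mu F_{\nu\sigma}D_\mu F_{\nu\sigma} + r_3\sum_{\mu\nu\sigma}\mathrm{Tr}\,D_\mu F_{\mu\sigma}D_\nu F_{\nu\sigma}\Big] + O(a^4). \tag{33}$$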
The expansion coefficients have a power series expansion in the coupling, $r_j = A_j + g^2 B_j + \ldots$, and the expectation value of any operator $T$ computed using the plaquette action will have an expansion
$$\langle T(a)\rangle = \langle T(0)\rangle + O(a) + O(g^2 a) + \ldots \tag{34}$$

Figure 2: Lattice calculations of the (a) rho and (b) nucleon mass, interpolated to the point $m_\pi r_1 = 0.778$, as a function of lattice spacing. (Horizontal axis: $(a/r_1)^2$.)
Other loops have a similar expansion, with different coefficients. Now the idea is to take the lattice action to be a minimal subset of loops and systematically remove the $a^n$ terms for physical observables order by order in $n$ by taking the right linear combination of loops in the action, $S = \sum_j c_j O_j$ with $c_j = c_j^0 + g^2 c_j^1 + \ldots$. This method was developed by Symanzik and co-workers [8, 9, 10] in the mid-80's. Ordinary perturbation theory (an expansion in the bare lattice coupling $g$) is not very convergent, but clever prescriptions for definitions of couplings [11] or nonperturbative tuning methods [12] have been quite successful in developing improved lattice actions.

2.5 Relativistic Fermions on the Lattice
Defining fermions on the lattice involves yet another problem: doubling. Let's illustrate this with free field theory. The continuum free action is
$$S = \int d^4x\,\big[\bar\psi(x)\gamma_\mu\partial_\mu\psi(x) + m\bar\psi(x)\psi(x)\big]. \tag{35}$$
One obtains the so-called naive lattice formulation by replacing the derivatives by symmetric differences:
$$S^{naive} = \sum_{n,\mu}\bar\psi_n\frac{\gamma_\mu}{2a}(\psi_{n+\mu} - \psi_{n-\mu}) + m\sum_n\bar\psi_n\psi_n. \tag{36}$$
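To see what the symmetric difference does in momentum space, one can apply it to a plane wave $e^{ipx}$: the result is the same wave multiplied by $i\sin(pa)/a$ rather than by $ip$. The sketch below checks this in one dimension (the helper names are hypothetical) and also shows that the lattice derivative vanishes for a wave with $p = \pi/a$.

```python
import cmath
import math

def symmetric_difference(f, x, a):
    """Lattice derivative (f(x+a) - f(x-a)) / (2a)."""
    return (f(x + a) - f(x - a)) / (2.0 * a)

def plane_wave(p):
    """Return the function x -> exp(i p x)."""
    return lambda x: cmath.exp(1j * p * x)

a = 0.1   # lattice spacing, arbitrary units
x = 0.7   # any position

# Generic momentum: the derivative multiplies the wave by i*sin(p a)/a.
p = 3.0
lattice_derivative = symmetric_difference(plane_wave(p), x, a)
expected = 1j * math.sin(p * a) / a * plane_wave(p)(x)

# Momentum at the edge of the Brillouin zone: the derivative vanishes.
edge_derivative = symmetric_difference(plane_wave(math.pi / a), x, a)

print(abs(lattice_derivative - expected), abs(edge_derivative))
```

A rapidly oscillating mode at the zone edge therefore looks, to this derivative, just like a mode of zero momentum.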
The propagator is:
$$G(p) = (i\gamma_\mu\sin p_\mu a + ma)^{-1} = \frac{-i\gamma_\mu\sin p_\mu a + ma}{\sum_\mu\sin^2 p_\mu a + m^2a^2}. \tag{37}$$
We identify the physical spectrum through the poles in the propagator, at $p_0 = iE$:
$$\sinh^2 Ea = \sum_j\sin^2 p_j a + m^2a^2. \tag{38}$$
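The denominator of Eq. (37) is easy to explore numerically. The sketch below, with hypothetical helper names, scans the corners of the four-dimensional Brillouin zone (each momentum component equal to $0$ or $\pi/a$) and counts those where $\sum_\mu\sin^2 p_\mu a$ vanishes, that is, where a massless fermion would have a pole.

```python
import itertools
import math

def sin2_sum(momentum, a=1.0):
    """Sum over directions of sin^2(p_mu * a), the momentum part of the denominator."""
    return sum(math.sin(p * a) ** 2 for p in momentum)

a = 1.0
# All momenta whose four components are each 0 or pi/a.
corners = list(itertools.product([0.0, math.pi / a], repeat=4))
massless_poles = [p for p in corners if sin2_sum(p, a) < 1e-12]
print(len(massless_poles))
```

Every corner qualifies, not just the origin, because $\sin(\pi) = 0$ as well as $\sin(0) = 0$.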
The lowest energy solutions are the expected ones at $p = (0,0,0)$, $E \simeq \pm m$, but there are other degenerate ones, at $p = (\pi,0,0), (0,\pi,0), \ldots, (\pi,\pi,\pi)$. As $a$ goes to zero, the lightest excitations of the spectrum, the ones whose energy is $O(1)$, not $O(1/a)$, are the relevant ones, and there are sixteen of these, in all the corners of the Brillouin zone. Thus our action is a model for sixteen light fermions, not one. This is the famous "doubling problem."

In fact, associated with the "doubling problem" is the Nielsen-Ninomiya [13] theorem, which says that no lattice action can be undoubled, chiral, and have couplings which extend over a finite number of lattice spacings (ultralocality). However, there are three ways to get two out of three. They are

(a) Wilson Fermions (undoubled, nonchiral, ultralocal)

We can alter the dispersion relation so that it has only one low energy solution. The other solutions are forced to $E \simeq 1/a$ and become very heavy as $a$ is taken to zero. The simplest version of this solution, called a Wilson fermion, adds an irrelevant operator, a second-derivative-like term
$$S^W = -\frac{r}{2a}\sum_{n,\mu}\bar\psi_n(\psi_{n+\mu} - 2\psi_n + \psi_{n-\mu}) \sim -ar\,\bar\psi D^2\psi \tag{39}$$
to $S^{naive}$. The parameter $r = 1$ is almost always used and is implied when one speaks of using "Wilson fermions." There are two dimension-five operators which can be added to a fermion action. The Wilson term is just one of them. The other dimension-five term is a magnetic moment term
$$S_{SW} = -\frac{iag}{4}\bar\psi(x)\sigma_{\mu\nu}F_{\mu\nu}\psi(x) \tag{40}$$
and if both terms are included, their coefficients can be tuned so that there are no $O(ag^2)$ lattice artifacts. This action is called the "Sheikholeslami-Wohlert" [14] or "clover" action because the lattice version of $F_{\mu\nu}$ is the sum of paths shown in Fig. 3.

Figure 3: The "clover term".

Wilson-type fermions contain an explicit chiral-symmetry breaking term. This causes a lot of bad things to happen. The most obvious is that the zero bare quark mass limit is not respected by interactions; the quark mass is additively renormalized. The value of bare quark mass $m_q$ at which the pion mass vanishes is not known a priori before beginning a simulation; it must be computed. This is done in a simulation involving Wilson fermions by varying $m_q$ and watching the pion mass extrapolate quadratically to zero as $m_\pi^2 \simeq m_q - m_q^c$. It actually turns out that this is a worse problem than you would think: the Dirac operator $D$ on a gauge configuration could develop a real eigenmode $\lambda$ at minus the bare quark mass you dialed into the program. Then $D + m$ would be non-invertible! Other nasty things happen (operator mixing, see the next section) and people argue about how serious they are in practice [15].
(b) Staggered or Kogut-Susskind Fermions (chiral, doubled, ultralocal)

In this formulation one reduces the number of fermion flavors by using one-component "staggered" fermion fields rather than four-component Dirac spinors. The Dirac spinors are constructed by combining staggered fields on different lattice sites. Staggered fermions preserve an explicit chiral symmetry as $m_q \to 0$ even for finite lattice spacing, as long as all four flavors are degenerate, although it is not the $SU(N_f)\times SU(N_f)$ of the continuum; it is a $U(1)$. Thus there is only one Goldstone pion at finite $a$, plus other non-degenerate pseudoscalar states whose mass goes to zero in the continuum limit. (See Fig. 4 for an example of this.) They are preferred over Wilson fermions in situations in which the chiral properties of the fermions dominate the dynamics. They are also cheaper to simulate than Wilson fermions, because there are fewer variables. However, flavor and translational symmetry are all mixed together [16].

Figure 4: An example of flavor symmetry breaking in an improved staggered action. The different $\gamma$'s are a code for the various pseudoscalar states. Data are from Ref. [17]. For an explanation of the splitting, see Ref. [18].

(c) Chiral, undoubled, but not ultralocal

These actions implement a modified version [19] of the chiral rotation
$$\delta\psi = \gamma_5\Big(1 - \frac{1}{2}aD\Big)\psi; \qquad \delta\bar\psi = \bar\psi\Big(1 - \frac{1}{2}aD\Big)\gamma_5, \tag{41}$$
which is sufficient to preserve all the interesting features of continuum chiral symmetry. An example of such an action is the "domain wall fermion" [20]. It is a variation on the idea that if you have a fermion coupled to a scalar field, and the scalar field interpolates between two minima (forms a soliton), the fermion will develop a zero-energy chiral mode bound to the center of the soliton. Now we go into brane world, extend QCD into five dimensions, and put ourselves and our four dimensional world on the kink. There is an anti-kink out there in the fifth dimension, and as long as it is far away the mode on the kink doesn't see the anti-kink and the 4-d theory on the kink is chiral. But if the anti-kink is too close (fifth dimension too small) the modes mix and chiral symmetry is broken. How close is "too close" is (yet) another engineering question. There are four dimensional analogs of this: think of integrating out the modes in the fifth dimension in favor of a tower of massive fermions, and get "overlap fermions" [21], or construct a low energy Wilsonian effective action from an underlying chiral theory and get "fixed point fermions" [22]. The bad feature is that these actions have couplings which reach out to many neighboring sites. Their strength drops exponentially with distance, so they are true local actions in the continuum limit, but they are very expensive to simulate. But stay tuned...

3 Hadronic Matrix Elements from the Lattice
One of the major goals of lattice calculations is to provide hadronic matrix elements which either test QCD or can be used as inputs to test the standard model.

3.1 Generic Matrix Element Calculations
Most of the matrix elements measured on the lattice are extracted from expectation values of local operators $J(x)$ composed of quark and gluon fields. For example, if one wanted $\langle 0|J(x)|h\rangle$ one could look at the two-point function
$$C_{JO}(t) = \sum_x\langle 0|J(x,t)O(0,0)|0\rangle. \tag{42}$$
Inserting a complete set of correctly normalized momentum eigenstates
$$1 = \frac{1}{L^3}\sum_{A,\vec p}\frac{|A,\vec p\rangle\langle A,\vec p|}{2E_A(p)} \tag{43}$$
and using translational invariance and going to large $t$ gives
$$C_{JO}(t) = e^{-m_A t}\,\frac{\langle 0|J|A\rangle\langle A|O|0\rangle}{2m_A}. \tag{44}$$
A second calculation of
$$C_{OO}(t) = \sum_x\langle 0|O(x,t)O(0,0)|0\rangle \to e^{-m_A t}\,\frac{|\langle 0|O|A\rangle|^2}{2m_A} \tag{45}$$
is needed to extract $\langle 0|J|A\rangle$ by fitting two correlators with three parameters. Similarly, a matrix element $\langle h|J|h'\rangle$ can be gotten from
$$C_{AB}(t,t') = \sum_x\langle 0|O_A(t)J(x,t')O_B(0)|0\rangle \tag{46}$$
by stretching the source and sink operators $O_A$ and $O_B$ far apart on the lattice, letting the lattice project out the lightest states, and then measuring and dividing out $\langle 0|O_A|h\rangle$ and $\langle 0|O_B|h'\rangle$.

These lattice matrix elements are not yet the continuum matrix elements. Typically, one is interested in some matrix element defined with a particular regularization scheme. It is a generic feature of quantum field theory that an operator defined in one scheme ($\overline{MS}$) will be a superposition of operators in another scheme (lattice). In principle, the superposition could be all possible operators. So generically an operator of dimension $D$ will mix like
$$\langle f|O_n^{cont}(\mu)|i\rangle_{\overline{MS}} = a^D\sum_m Z_{nm}\langle f|O_m^{latt}(a)|i\rangle. \tag{47}$$
The only restrictions are symmetries: in a theory where parity is conserved a vector operator and an axial vector operator can't mix. This is relevant for lattice calculations because the symmetries of the lattice action are in general different from continuum symmetries. For example, the space-time symmetry of the lattice is given by the group of discrete rotations. A more serious source of mixing for light quark operators is the way lattice fermions treat chiral symmetry. Wilson-type fermions break chiral symmetry (even massless ones do so off-shell) and so nothing prevents mixing into "wrong chirality" operators. In Eq. (47) the "diagonal" term will contain the anomalous dimension of the continuum operator
$$Z_{nn} = 1 + \frac{g^2}{16\pi^2}(\gamma_n\log a\mu + A) + \ldots \tag{48}$$
(which cancels the $\mu$-dependence of the coefficient function, so that $C(\mu)\langle f|O^{cont}|i\rangle_\mu$ is independent of the renormalization point). In principle the leading log could be summed, but in practice we don't know how much of the constant term $A$ should be absorbed into a change of scale of $g$, so they are just left there. The mixing terms to other dimension $D$ operators die out in the continuum so they don't have any logs. There are also terms for mixing with higher dimensional operators, which give contributions proportional to positive powers of $a$. (These are usually benign.) One can also have mixing with lower dimensional operators, with contributions involving negative powers of $a$. (Four-fermion operators for $B_K$ could mix with $\bar s d$.) These are deadly. They must drop out in the continuum but it is a delicate business, since they look like they are growing as an inverse power of $a$.

This is probably more than you wanted to know, but you need the $Z_{nm}$'s to produce numbers. People get them in a number of ways. Most straightforward is to compute them in perturbation theory, but lattice perturbation theory in terms of the bare coupling $g(a)$ is not very convergent, and it is a long tricky story to do better. The culprit is the "tadpole graph." The lattice fermion-gauge field interaction is generically $\bar\psi(x)U_\mu(x)\psi(x+\hat\mu)$ and $U_\mu \simeq 1 + igaA_\mu - g^2a^2A_\mu^2/2 + \ldots$. The $\bar\psi A^2\psi$ vertex, not present in any sensible continuum regularization, causes problems when the gluon forms a loop: the quadratic divergence from the loop integral combines with the $a^2$ to give a finite contribution; in fact, it is often the dominant contribution. In perturbation theory one must also choose the momentum scale in the running coupling constant. There are reasonable choices for how to do that [23, 24]. Often one can find $Z_{nm}$'s by forcing lattice observables to obey Ward identities [25]. One can also play this game with quark propagators and vertices, by computing analogs of quark vertices on the lattice and matching one's results to a continuum calculation [26].

Besides the $Z$'s, there are other things that can go wrong. Most lattice actions break down when the quark mass gets heavy. The dispersion relation for Wilson or clover actions is $E(p) = m_1 + p^2/(2m_2)$ and the quark magnetic moment is $\mu = 1/m_3$ with $m_1 \neq m_2 \neq m_3$. The residue of the quark propagator at its pole is not $1/(2E)$ as in the continuum. What to do then is not obvious (meaning that lattice people fight over what to do).

3.2 Heavy quark operators
There are many lattice calculations of $f_B$, $f_D$, $B_B$, and form factors for semileptonic decay. $B$-$\bar B$ mixing is parametrized by the ratio $x_d = (\Delta M)_{b\bar d}/\Gamma_{b\bar d}$:
$$x_d = \tau_{b\bar d}\,\frac{G_F^2}{6\pi^2}\,\eta_{QCD}\,F\Big(\frac{m_t^2}{m_W^2}\Big)\,|V_{tb}^*V_{td}|^2\,b(\mu)\Big\{\frac{3}{8}\langle\bar B|\bar b\gamma_\rho(1-\gamma_5)d\,\bar b\gamma_\rho(1-\gamma_5)d|B\rangle\Big\}. \tag{49}$$
Experiment is on the left; theory on the right. Moving into the long equation from the left, we see many known (more or less) parameters from phase space integrals or perturbative QCD calculations, then a combination of CKM matrix elements, followed by a four quark hadronic matrix element [27]. We would like to extract the CKM matrix element from the measurement of $x_d$ (and its strange partner $x_s$). To do so we need to know the value of the object in the curly brackets, defined as $(3/8)M_{bd}$ and parameterized as $m_B^2 f_{B_d}^2 B_{bd}$, where $B_{bd}$ is the so-called $B$-parameter, and $f_B$ is the $B$-meson decay constant, $\langle 0|\bar b\gamma_0\gamma_5 d|B\rangle = f_B m_B$. Vacuum saturation suggests that $B_B = 1$. From the lattice one can try to get a real value. In Eq. (49) $b(\mu)$, the coefficient which runs the effective interaction down from the $W$-boson scale to the QCD scale $\mu$, and the matrix element $M(\mu)$ both depend on the QCD scale. One often sees the renormalization group invariant quantities $\hat M_{bd} = b(\mu)M_{bd}(\mu)$ or $\hat B_{bd} = b(\mu)B_{bd}(\mu)$ quoted in the literature.

Decay constants probe very simple properties of the wave function: in the nonrelativistic quark model $f_M \propto \psi(0)/\sqrt{m_M}$, where $\psi(0)$ is the $\bar q q$ wave function at the origin. For a heavy quark ($Q$) light quark ($q$) system $\psi(0)$ should become independent of the heavy quark's mass as the $Q$ mass goes to infinity, and in that limit one can show in QCD that $\sqrt{m_M}f_M$ approaches a constant. The decay constant is computed by combining a heavy quark and a light antiquark propagator into Eq. (42).

You might think it would be difficult to calculate $f_B$ directly on present day lattices with relativistic lattice fermions because the lattice spacing is much greater than the $b$ quark's Compton wavelength (or the UV cutoff is below $m_b$). But it is better to think of the lattice theory as an effective field theory for the low-momentum excitations in the presence of additional high energy scales: the cutoff (inverse lattice spacing) and the heavy quark mass. As in any effective field theory, the effects of the short distance are lumped into coefficients of the effective theory [28]. As a practical matter, one can use the good old clover action to do the calculations; it contains all the necessary operators. The bare mass has nothing direct to do with the results; one tunes it, monitoring the kinetic energy $E(p) = m_1 + p^2/(2m_2) + \ldots$, and takes the hadron mass to be $m_2$. Nonrelativistic QCD has also been discretized and used to make very precise calculations of the properties of quarkonia [29].
This formalism can also be used for the heavy quark (again as long as its momentum is small). The "static" limit (infinite $b$-quark mass) is often used as an additional point on the curve. Then one can try to extrapolate all the way from light quarks to heavies and get all the decay constants at once.

I will show some pictures from the lattice decay constant calculation of Ref. [30]. These authors (my name is on it but I didn't do anything) did careful quenched simulations at many values of the lattice spacing, which allows one to extrapolate to the continuum limit by brute force. They have also done a set of simulations which include light dynamical quarks, which should give some idea of the accuracy of the quenched approximation. Examples of results of Ref. [30] are shown in Figs. 5 and 6. The simulations with dynamical fermions are not as good quality as the quenched simulations: the lattice spacings are generally larger, the simulations all have two degenerate flavors (what about the strange quark?), and the dynamical quark masses are still a bit large. We think that the Wilson results (crosses in Fig. 6(b)) overestimate the continuum result, and the clover action we are using underestimates it, but we also suspect that quenched $f_B$ is a bit too low.
Figure 5: Pseudoscalar meson decay constant vs. $1/M_P$ (in GeV$^{-1}$), from Ref. [30]. CL is the confidence level of the fit.
Soni [31] has presented a summary of data from various collaborations, as of last winter. Again, there is a hint that the $N_f = 2$ results may be about 30 MeV above the quenched results. Lattice calculations have been predicting quenched $f_{D_s} \simeq 200$ MeV for about twelve years. The central values have changed very little, while the uncertainties have decreased. Experimental determinations of $f_{D_s}$ all come in higher than the lattice results, though with large error bars. We need to do a good quality unquenched lattice calculation.
Figure 6: $f_B$ vs. $a$ (in GeV$^{-1}$) from Ref. [30], showing extrapolations to the continuum limit of quenched (a) and full (b) QCD data. The scale is set by $f_\pi = 132$ MeV throughout.
Table 1: Heavy-light decay constants and their ratios.

Quantity           Quenched ($N_f = 0$)    Partially Unquenched ($N_f = 2$)
$f_B$/MeV          170 ± 20                200 ± 30
$f_{B_s}$/MeV      190 ± 20                220 ± 30
$f_D$/MeV          205 ± 20                225 ± 30
$f_{D_s}$/MeV      225 ± 20                245 ± 30
$f_{B_s}/f_B$      1.14 ± .06              1.14 ± .06
$f_{D_s}/f_D$      1.10 ± .06              1.10 ± .06
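To put the "hint" of an unquenching effect in perspective, one can difference the two columns of Table 1. The sketch below does this for the four decay constants. It assumes, purely for illustration, that the quenched and $N_f = 2$ uncertainties are uncorrelated and can be added in quadrature; the source does not say how correlated they are.

```python
import math

# (central value, uncertainty) in MeV, copied from Table 1.
quenched = {"f_B": (170, 20), "f_Bs": (190, 20), "f_D": (205, 20), "f_Ds": (225, 20)}
two_flavor = {"f_B": (200, 30), "f_Bs": (220, 30), "f_D": (225, 30), "f_Ds": (245, 30)}

shifts = {}
for name in quenched:
    q_val, q_err = quenched[name]
    d_val, d_err = two_flavor[name]
    # Assumption: the two uncertainties are uncorrelated, so they add in quadrature.
    shifts[name] = (d_val - q_val, math.hypot(q_err, d_err))

for name, (shift, err) in shifts.items():
    print(f"{name}: N_f=2 minus quenched = {shift} +/- {err:.0f} MeV")
```

Under that assumption the shifts are of the same size as, or smaller than, their combined uncertainties, which is why the text speaks of a hint rather than a measurement.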
Now back to the $B$ parameter. On the lattice, one could measure the decay constants and $B$ parameter separately and combine them after extrapolation, or measure $M$ directly. In principle the numbers should be the same, but in practice the first technique has produced better numbers so far. That is because the $B$ parameter is measured as the ratio of a correlator with a four-fermion vertex to a product of two current-current correlators (see Fig. 7). A lot of systematics cancel in the ratio. Many groups have visited this problem. Reviews by Draper [33] and Soni [31] quote a world summary, which I copy into Table 2.

Semileptonic decays involve processes like Eq. (46). On the lattice, one just measures the matrix element of a current and fits it to the expected set of form factors; for $B \to \pi\ell\nu$, for example,
$$\langle\pi(p)|V_\mu|B(p')\rangle = f^+(q^2)\Big[p' + p - \frac{m_B^2 - m_\pi^2}{q^2}\,q\Big]_\mu + f^0(q^2)\,\frac{m_B^2 - m_\pi^2}{q^2}\,q_\mu. \tag{50}$$
Figure 7: $B_K$ (shown) and $B_B$ are computed by taking the ratio of a four-quark operator and two two-point functions. (Figure from Ref. [32].)
Table 2: Summary of heavy-light B-parameters.

Quantity                                                          Quenched        "Unquenched"
$B_{B_d}(m_b)$                                                    .86(4)(8)       .86(4)(8)
$B_{B_s}/B_{B_d}$                                                 1.00(1)(2)      1.00(1)(2)
$f_{B_d}(\hat B_{B_d})^{1/2}$                                     195 ± 25 MeV    230 ± 35 MeV
$f_{B_s}(\hat B_{B_s})^{1/2}/f_{B_d}(\hat B_{B_d})^{1/2}$         1.14 ± .06      1.14 ± .07
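The momentum transfer $q = p' - p$ in Eq. (50) is fixed by two-body kinematics. As a minimal sketch, assuming a $B$ meson at rest and using approximate masses ($m_B \approx 5.279$ GeV, $m_\pi \approx 0.140$ GeV, inserted here for illustration), $q^2 = m_B^2 + m_\pi^2 - 2m_B E_\pi$ can be evaluated for a few pion momenta:

```python
import math

M_B = 5.279    # GeV, approximate B meson mass
M_PI = 0.140   # GeV, approximate charged pion mass

def q_squared(pion_momentum):
    """q^2 = (p_B - p_pi)^2 for a B at rest and a pion of the given 3-momentum (GeV)."""
    pion_energy = math.sqrt(M_PI**2 + pion_momentum**2)
    return M_B**2 + M_PI**2 - 2.0 * M_B * pion_energy

q2_max = q_squared(0.0)   # pion at rest: q^2 = (m_B - m_pi)^2
for p in (0.0, 0.4, 0.8):
    print(f"|p_pi| = {p:.1f} GeV  ->  q^2 = {q_squared(p):.2f} GeV^2")
```

Small hadron momenta therefore correspond to $q^2$ close to its maximum value $(m_B - m_\pi)^2$, far from $q^2 = 0$.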
The best signals come when the momenta of the initial and final hadron are small. Then the large $B$ mass forces $q^2$ ($q$ = lepton 4-momentum = $p_B - p_\pi$) to be large. If the form factor is needed at $q^2 \simeq 0$, a large extrapolation is needed, and there will be additional errors and model dependence in the answer. (Lattice people have no advantage over anyone else at guessing at functional forms.) However, finding $V_{ub}$ from experimental data only requires knowing the form factor at one value of $q^2$. This should work so long as the experiment has enough data to measure the differential rate around that region of $q^2$. Two recent approaches try to do this: UKQCD focussed on the region near the end-point, or zero-recoil region, where the lattice data tends to be cleanest and heavy quark symmetry can be used. The FNAL group has measured $B \to D\ell\nu$ form factors at zero recoil [34]. They have a clever technique for removing much of the lattice-to-continuum $Z$-factors by computing ratios of matrix elements, such as
$$\frac{\langle D|\bar c\gamma_0 b|B\rangle\langle B|\bar b\gamma_0 c|D\rangle}{\langle D|\bar c\gamma_0 c|D\rangle\langle B|\bar b\gamma_0 b|B\rangle} = |h_+(v\cdot v' = 1)|^2. \tag{51}$$
The denominators are just diagonal matrix elements of the charge density, and they can easily be normalized. They [35] are also computing semi-leptonic form factors for $B \to \pi\ell\nu$ and $D \to \pi(K)\ell\nu$, by concentrating directly on the differential decay spectrum in an interval with $0.4 < p_\pi/\mathrm{GeV} < 0.8$, thus avoiding the need for large extrapolation in $q^2$.

3.3 Kaon Matrix Elements
Lattice calculations of kaon weak interaction matrix elements begin with the full Standard Model at high energies and use the operator product expansion, combined with the renormalization group, to construct a low-energy effective field theory valid at scales $\mu$ of a few GeV. The effective Hamiltonian basically reduces to a sum of four-fermion interactions.
People have expended the most effort on, and have the best results for, $B_K$; there are some results on the $\Delta I = 1/2$ rule; and last, there is $\epsilon'/\epsilon$, with unreliable results so far.

$B_K$
The JLQCD collaboration has the best results on $B_K$, from a calculation using staggered fermions [36]. They have data from many lattice spacings and several choices for the lattice discretization of the operator. (See Fig. 8.) They find quenched $B_K(\overline{MS}, \mu = 2\ \mathrm{GeV}) = 0.616(5)$. The main limitations of this result are quenching, plus the fact that the lattice calculations are actually done without $SU(3)$ flavor breaking (the lattice "kaon" is a pseudoscalar made of degenerate quarks). These effects are believed [31] to be 5-10 per cent corrections. JLQCD [37] has also done a calculation with Wilson fermions. This was done not so much to get a number itself but to check the staggered result. The systematics are very different and the operator mixing is fierce due to the loss of chiral symmetry inherent in Wilson fermions. For example, $\bar s\gamma_\mu(1-\gamma_5)d\,\bar s\gamma_\mu(1-\gamma_5)d$ mixes with $\bar s\gamma_5 d\,\bar s\gamma_5 d$ and that operator has a $K$-$\bar K$ matrix element ten times greater.

Figure 8: $B_K$ from staggered fermions, as a function of the lattice spacing, for two different choices of lattice operators. (Figure and results from Ref. [36]. Panel title: $B_K$(NDR, 2 GeV) vs. $m_\rho a$, $q^* = 1/a$, 3-loop coupling, 5 points; legend: non-invariant and invariant operators.)

$\Delta I = 1/2$ Rule

Can the lattice reproduce the experimentally observed factor of 22 between $K^0 \to (\pi\pi)_{I=0}$ and $K^0 \to (\pi\pi)_{I=2}$ amplitudes? The lattice calculations are difficult. In addition to graphs of Fig. 7, which are reasonably straightforward to compute, there are a host of other topologies, some of which involve computing propagators from many points on the lattice to many other points. But I think the reason there are so few lattice results is that all of the quantities of interest are scheme dependent and one must compute a lattice-to-continuum matching factor. In addition, people don't calculate $K \to \pi\pi$ directly on the lattice; it is difficult [38] to extract the phase shifts from the $\pi\pi$ final state interactions from lattice data (never mind trying to separate the two pions to asymptotically great distances). Instead, they use chiral perturbation theory [39] to relate $K \to \pi\pi$ amplitudes to $K \to \pi$. In the case of the $\Delta I = 3/2$ amplitude there is a factor of two change in the lattice result depending on whether tree level or one loop chiral perturbation theory is used. This is shown in Fig. 9.
Hi The only recent work I am aware of is by Pekurovsky and Kilcup 32 . The calculation is hard. The biggest operators (06 and <38 in the nomenclature) have opposite signs and nearly cancel. But the big problem is the scheme matching. In a perturbative calculation, we saw that {0)j^ ~ Z(0)iatt and Z = 1 + asC + But in this equation, what scheme is used to compute as, and what is the scale q* at which as is evaluated? Pekurovsky and Kilcup found that their numbers for one operator, 06> shifted by a factor of 4 when
[Plot omitted: the amplitude versus lattice meson mass, with the experimental value ("expt.") and twice the experimental value ("2 x expt.") marked.]
Figure 9: $\Delta I = 3/2$ $K \to \pi\pi$ amplitude with (fancy symbols) and without (plain symbols) one-loop corrections of quenched chiral perturbation theory. Data are crosses and fancy crosses, and diamonds and fancy diamonds (the reference numbers for the two data sets and for the source of the figure are missing here). Data are plotted as a function of lattice meson mass.
they were converted into $\overline{MS}$, and the factor of 4 could become a factor of 30 (or worse) as $q^*$ was varied from $\pi/a$ to $1/a$. They attempt to guesstimate numbers but since they say plainly that they should be used "with extreme caution" I won't quote them. A nonperturbative approach to matching is clearly needed but does not exist yet.
4
Conclusions
What about the future? Matrix elements are at the end of a long chain involving a large set of both simulation and physics issues. They are the most complicated corner of the lattice game. If all you want are the numbers, Moore's
law says that computer speed doubles every eighteen months, and statistical errors fall as $1/\sqrt{N}$, so error bars will fall by a factor of 2 every three years for everything we know how to do today, even if we learn nothing new about how to do better in the future. And there are many projects and proposals to build clusters or dedicated supercomputers at a cost which is "chicken feed" compared to the experimental program. This will enable us to begin to chip away at the biggest systematic in all the calculations I have shown here - the quenched approximation. But new hardware is not really where the action is. It is merely "enabling technology," so we can make mistakes faster, learn more about the physics, and test new ideas. The main bottleneck to progress on hadronic matrix elements is just that these calculations are complex and have many parts. Some of us (me, let's not be shy) think that better discretization algorithms will help. The problem with that approach is that many pieces of the puzzle have to be determined from scratch: learning how to optimize the new algorithm, testing spectroscopy, computing the $Z$'s. This takes a couple of years, if the inventor of the algorithm doesn't get tired first. Others of us prefer to live with poorer algorithms (which have already been well calibrated) and try to tweak the parts of them which work the worst. The simulations still take a couple of years. Believe it or not, even though lattice QCD is a mature field, there are still many questions about QCD which lattice people do not know how to answer, and an outsider might. Maybe you would enjoy thinking about them.

Acknowledgements

I would like to thank S. Gottlieb, Y. Kuramashi, A. Kronfeld, M. Ogilvie, S. Sharpe, A. Soni, and D. Toussaint for their help preparing these lectures. This work was supported by the U.S. Department of Energy.

1. Some standard reviews of lattice gauge theory are M. Creutz, "Quarks, Strings, and Lattices," Cambridge, 1983. M.
Creutz, ed., "Quantum Fields on the Computer," World, 1992, and I. Montvay and G. Münster, "Quantum fields on a Lattice," Cambridge, 1994. The lattice community has a large annual meeting and the proceedings of those meetings (Lattice 'XX, published so far by North Holland) are the best places to find the most recent results. Other reviews, that I have particularly enjoyed reading, include G. P. Lepage, Schladming 1996: Perturbative and nonperturbative aspects of quantum field theory, H. Latal and W. Schweiger, eds., Springer 1997, hep-lat/9607076; M. Lüscher, Les Houches Summer School lecture, 1997, hep-lat/9802029. My 1996 TASI lectures have more detail and a different focus than these: T. DeGrand, TASI-96, B. Green, ed.; hep-th/9610132. Sharpe's 1994 TASI lectures are still quite relevant: S. R. Sharpe, TASI-94, J. Donoghue, ed.; hep-ph/9412243.
2. The MILC code, one of the modern packages of lattice QCD codes, is available at http://cliodhna.cop.uop.edu/~hetrick/milc/. There is even documentation, but these codes are not designed to be run as black boxes.
3. K. Wilson, Phys. Rev. D10, 2445 (1974).
4. N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller, and E. Teller, J. Chem. Phys. 21, 1087 (1953); F. Brown and T. Woch, Phys. Rev. Lett. 58, 2394 (1987); M. Creutz, Phys. Rev. D36, 55 (1987).
5. H. C. Andersen, J. Chem. Phys. 72, 2384 (1980); S. Duane, Nucl. Phys. B257, 652 (1985); S. Duane and J. Kogut, Phys. Rev. Lett. 55, 2774 (1985); S. Gottlieb, W. Liu, D. Toussaint, R. Renken and R. Sugar, Phys. Rev. D35, 2531 (1987); S. Duane, A. Kennedy, B. Pendleton, and D. Roweth, Phys. Lett. 194B, 271 (1987).
6. For a recent review, see A. Frommer, Nucl. Phys. B (Proc. Suppl.) 53, 120 (1997); hep-lat/9608074.
7. Cf. H. Kluberg-Stern, A. Morel, and B. Petersson, Phys. Lett. 114B, 152 (1982).
8. C. Bernard, et al. (MILC Collaboration), Nucl. Phys. B (Proc. Suppl.) 73, 198 (1999).
9. K. Symanzik, in "Recent Developments in Gauge Theories," eds. G. 't Hooft, et al. (Plenum, New York, 1980) 313; in "Mathematical Problems in Theoretical Physics," eds. R. Schrader et al. (Springer, New York, 1982); Nucl. Phys. B226, 187, 205 (1983).
10. P. Weisz, Nucl. Phys. B212, 1 (1983). M. Lüscher and P. Weisz, Comm. Math. Phys. 97, 59 (1985).
11. M. Lüscher and P. Weisz, Phys. Lett. B158, 250 (1985).
12. M. Alford, W. Dimm, G. P. Lepage, G. Hockney, and P. Mackenzie, Phys. Lett. B361, 87 (1994); G. P. Lepage, Ref. 1.
13. K. Jansen, et al., Phys. Lett. B372, 275 (1996) [hep-lat/9512009]; M. Lüscher, S. Sint, R. Sommer and P. Weisz, Nucl. Phys. B478, 365 (1996) [hep-lat/9605038]. M. Lüscher, S. Sint, R. Sommer, P. Weisz, H. Wittig and U. Wolff, Nucl. Phys. Proc. Suppl. 53, 905 (1997) [hep-lat/9608049].
14. H. B. Nielsen and M. Ninomiya, Phys. Lett. B105, 219 (1981); Nucl. Phys. B193, 173 (1981); Nucl. Phys. B185, 20 (1981); B. Sheikholeslami and R. Wohlert, Nucl. Phys. B259, 572 (1985).
15. For technical reasons, lattice people using Wilson or clover fermions like to work with the "hopping parameter," universally labeled $\kappa$. It is related to the bare quark mass by $\kappa = 1/(2m_q + 8)$. The pion is massless at
16. See M. Golterman and J. Smit, Nucl. Phys. B255, 328 (1985).
17. K. Orginos, D. Toussaint and R. L. Sugar [MILC Collaboration], Phys. Rev. D60, 054503 (1999) [hep-lat/9903032].
18. W. Lee and S. R. Sharpe, Phys. Rev. D60, 114503 (1999) [hep-lat/9905023].
19. M. Lüscher, Phys. Lett. B428, 342 (1998) [hep-lat/9802011].
20. For a recent review see T. Blum, Lattice 98, Nucl. Phys. B (Proc. Suppl.) 73, 167 (1999), hep-lat/9810017.
21. For a review, see H. Neuberger, Lattice '99, Nucl. Phys. B (Proc. Suppl.) 83-84, 67 (2000).
22. P. Hasenfratz and F. Niedermayer, Nucl. Phys. B414, 785 (1994); T. DeGrand, Phys. Rev. D60, 094501 (1999) [hep-lat/9903006].
23. S. Brodsky, G. P. Lepage and P. Mackenzie, Phys. Rev. D28, 228 (1983).
24. G. P. Lepage and P. Mackenzie, Phys. Rev. D48, 2250 (1993). See also G. Parisi, in High Energy Physics-1980, XX Int. Conf., Madison (1980), ed. L. Durand and L. G. Pondrom (AIP, New York, 1981).
25. L. Maiani and G. Martinelli, Phys. Lett. B178, 285 (1986).
26. G. Martinelli, et al., Nucl. Phys. B445, 81 (1995).
27. See the lectures of J. Rosner at this summer school.
28. A. X. El-Khadra, A. S. Kronfeld and P. B. Mackenzie, Phys. Rev. D55, 3933 (1997); A. S. Kronfeld, Phys. Rev. D62, 014505 (2000) [hep-lat/0002008].
29. C. Davies et al., Phys. Rev. D52, 6519 (1995); Phys. Lett. B345, 42 (1995); Phys. Rev. D50, 6963 (1994); Phys. Rev. Lett. 73, 2654 (1994); and G. P. Lepage et al., Phys. Rev. D46, 4052 (1992).
30. C. Bernard et al., Phys. Rev. Lett. 81, 4812 (1998) [hep-ph/9806412].
31. A. Soni, to be published in the Third International Conference on B Physics and CP Violation, hep-ph/0003092.
32. D. Pekurovsky and G. Kilcup, hep-lat/9812019.
33. T. Draper, Nucl. Phys. Proc. Suppl. 73, 43 (1999) [hep-lat/9810065].
34. S. Hashimoto, et al., Phys. Rev. D61, 014502 (1999).
35. S. Ryan, A. El-Khadra, S. Hashimoto, A. Kronfeld, P. Mackenzie and J. Simone, Nucl. Phys. Proc. Suppl. 73, 390 (1999) [hep-lat/9810041]. S. M. Ryan, A. X. El-Khadra, A. S. Kronfeld, P. B. Mackenzie and J. N. Simone, Nucl. Phys. Proc. Suppl. 83, 328 (2000) [hep-lat/9910010].
36. S. Aoki, et al., Phys. Rev. Lett. 80, 5271 (1998).
37. S. Aoki et al. [JLQCD Collaboration], Phys. Rev. D60, 034511 (1999) [hep-lat/9901018].
38. L. Maiani and M. Testa, Phys. Lett. 235B, 585 (1990). For recent developments, see L. Lellouch and M. Lüscher, hep-lat/0003023.
39. C. Bernard, et al., Phys. Rev. D32, 2343 (1985).
40. C. Bernard, in the Proceedings of the 1989 TASI Summer School, eds. T. DeGrand and D. Toussaint, World Scientific, Singapore, 1989.
41. JLQCD Collab., Phys. Rev. D58, 054503 (1998).
42. Y. Kuramashi, Nucl. Phys. B (Proc. Suppl.) 83-84, 24 (2000); hep-lat/9910032.
THE STRONG CP PROBLEM
Michael Dine Stanford Linear Accelerator Center, Stanford CA 94309 Santa Cruz Institute for Particle Physics, Santa Cruz CA 95064 Physics Department, University of California, Santa Cruz CA 95064
These lectures discuss the $\theta$ parameter of QCD. After an introduction to anomalies in four and two dimensions, the parameter is introduced. That such topological parameters can have physical effects is illustrated with two dimensional models, and then explained in QCD using instantons and current algebra. Possible solutions including axions, a massless up quark, and spontaneous CP violation are discussed.
1
Introduction
Originally, one thought of QCD as being described by a gauge coupling at a particular scale and the quark masses. But it soon came to be recognized that the theory has another parameter, the $\theta$ parameter, associated with an additional term in the lagrangian, eqn. (1), built from $F^a_{\mu\nu}\tilde F^{\mu\nu a}$, where
$$\tilde F^a_{\mu\nu} = \frac{1}{2}\epsilon_{\mu\nu\rho\sigma}F^{\rho\sigma a}. \qquad (2)$$
This term, as we will discuss, is a total divergence, and one might imagine that it is irrelevant to physics, but this is not the case. Because the operator violates CP, it can contribute to the neutron electric dipole moment, $d_n$. The current experimental limit sets a strong limit on $\theta$.
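$$\mathcal{L} = -\frac{1}{4g^2}F_{\mu\nu}^2 + q\,\not{D}\,q^* + \bar q\,\not{D}\,\bar q^* + m\,\bar q q + m^*\bar q^* q^*. \qquad (3)$$
Here, I have written the lagrangian in terms of two-component fermions, and noted that a priori, the mass need not be real,
$$m = |m|e^{i\theta}. \qquad (4)$$
In terms of four-component fermions,
$$\mathcal{L} = {\rm Re}\,m\;\bar q q + {\rm Im}\,m\;\bar q\gamma_5 q. \qquad (5)$$
In order to bring the mass term to the conventional form, with no $\gamma_5$'s, one would naively let
$$q \to e^{-i\theta/2}q \qquad \bar q \to e^{-i\theta/2}\bar q. \qquad (6)$$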
However, a simple calculation shows that there is a difficulty associated with the anomaly. Suppose, first, that $M = |m|$ is very large. In that case we want to integrate out the quarks and obtain a low energy effective theory. To do this, we study the path integral:
$$Z = \int[dA_\mu]\int[dq][d\bar q]\,e^{iS} \qquad (7)$$
Again suppose $m = e^{i\theta}M$, where $\theta$ is small and $M$ is real. In order to make $m$ real, we can again make the transformations: $q \to qe^{-i\theta/2}$; $\bar q \to \bar qe^{-i\theta/2}$ (in four component language, this is $q \to e^{-i\theta\gamma_5/2}q$). The result of integrating out the quark, i.e. of performing the path integral over $q$ and $\bar q$, can be written in the form:
$$Z = \int[dA_\mu]\,e^{iS_{eff}} \qquad (8)$$
Here $S_{eff}$ is the effective action which describes the interactions of gluons at scales well below $M$.

Figure 1: The triangle diagram associated with the four dimensional anomaly.
Because the field redefinition which eliminates $\theta$ is just a change of variables in the path integral, one might expect that there can be no $\theta$-dependence in the effective action. But this is not the case. To see this, suppose that $\theta$ is small, and instead of making the transformation, treat the $\theta$ term as a small perturbation, and expand the exponential. Now consider a term in the effective action with two external gauge bosons. This is given by the Feynman diagram in fig. 1. The corresponding term in the action is given by
$$\delta\mathcal{L}_{eff} = -i\frac{\theta}{2}g^2M\,{\rm Tr}(T^aT^b)\int\frac{d^4k}{(2\pi)^4}\,{\rm Tr}\,\gamma_5\,\frac{1}{\not k + \not q_1 - M}\,\not\epsilon_1\,\frac{1}{\not k - M}\,\not\epsilon_2\,\frac{1}{\not k - \not q_2 - M} \qquad (9)$$
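Here, as in the figure, the $q_i$'s are the momenta of the two gluons, while the $\epsilon$'s are their polarizations and $a$ and $b$ are their color indices. To perform the integral, it is convenient to introduce Feynman parameters and shift the $k$ integral, giving:
$$\delta\mathcal{L}_{eff} = -i\theta g^2M\,{\rm Tr}(T^aT^b)\int d\alpha_1 d\alpha_2\int\frac{d^4k}{(2\pi)^4}\,{\rm Tr}\,\gamma_5\,\frac{(\not k - \alpha_1\not q_1 + \alpha_2\not q_2 + \not q_1 + M)\,\not\epsilon_1\,(\not k - \alpha_1\not q_1 + \alpha_2\not q_2 + M)\,\not\epsilon_2\,(\not k - \alpha_1\not q_1 + \alpha_2\not q_2 - \not q_2 + M)}{(k^2 - M^2 + O(q_i^2))^3} \qquad (10),\ (11)$$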
For small $q$, we can neglect the $q$-dependence of the denominator. The trace in the numerator is easy to evaluate, since we can drop terms linear in $k$. This gives, after performing the integrals over the $\alpha$'s,
$$\delta\mathcal{L}_{eff} = g^2\theta M^2\,{\rm Tr}(T^aT^b)\,\epsilon_{\mu\nu\rho\sigma}q_1^\mu q_2^\nu\epsilon_1^\rho\epsilon_2^\sigma\int\frac{d^4k}{(2\pi)^4}\frac{1}{(k^2 - M^2)^3}. \qquad (12)$$
This corresponds to a term in the effective action, after doing the integral over $k$ and including a combinatoric factor of two from the different ways to contract the gauge bosons:
$$\delta\mathcal{L}_{eff} = \frac{1}{32\pi^2}\theta\,{\rm Tr}(F\tilde F). \qquad (13)$$
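The step from eqn. 12 to eqn. 13 works because the momentum integral scales as $1/M^2$, which cancels the explicit factor of $M^2$. The following is a minimal numerical sketch of that statement, assuming the Wick-rotated (Euclidean) form of the integral, $\int d^4k_E/(2\pi)^4\,(k_E^2 + M^2)^{-3} = 1/(32\pi^2M^2)$. The function name and the midpoint-rule discretization are illustrative choices, not part of the lectures.

```python
import math

def euclidean_triangle_integral(M, n_steps=20000):
    """Midpoint-rule estimate of  int d^4k_E/(2 pi)^4  1/(k_E^2 + M^2)^3.

    The angular integral in four dimensions gives 2 pi^2, so the measure
    becomes k^3 dk / (8 pi^2).  The radial variable is mapped to (0, 1)
    with k = M t / (1 - t) so that the infinite range is covered.
    """
    total = 0.0
    h = 1.0 / n_steps
    for i in range(n_steps):
        t = (i + 0.5) * h
        k = M * t / (1.0 - t)
        dk_dt = M / (1.0 - t) ** 2
        total += k ** 3 / (k * k + M * M) ** 3 * dk_dt * h
    return total / (8.0 * math.pi ** 2)

if __name__ == "__main__":
    # M^2 times the integral should not depend on M.
    for M in (1.0, 10.0, 1000.0):
        print(M, M * M * euclidean_triangle_integral(M), 1.0 / (32.0 * math.pi ** 2))
```

The product $M^2 \times$ (integral) comes out the same for every mass, which is the numerical counterpart of the statement that the result does not depend on the regulator mass.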
Now why does this happen? At the level of the path integral, the transformation would seem to be a simple change of variables, and it is hard to see why this should have any effect. On the other hand, if one examines the diagram of fig. 1, one sees that it contains terms which are linearly divergent, and thus it should be regulated. A simple way to regulate the diagram is to introduce a Pauli-Villars regulator, which means that one subtracts off a corresponding amplitude with some very large mass $\Lambda$. However, we have just seen that the result is independent of $\Lambda$! This sort of behavior is characteristic of an anomaly. Consider now the case that $m \ll \Lambda_{QCD}$. In this case, we shouldn't integrate out the quarks, but we still need to take into account the regulator diagrams. For small $m$, the classical theory has an approximate symmetry under which
$$q \to e^{i\alpha}q \qquad \bar q \to e^{i\alpha}\bar q \qquad (14)$$
(in four component language, $q \to e^{i\alpha\gamma_5}q$). In particular, we can define a current:
$$j_5^\mu = \bar q\gamma_5\gamma^\mu q, \qquad (15)$$
and classically,
$$\partial_\mu j_5^\mu = m\,\bar q\gamma_5 q. \qquad (16)$$
Under a transformation by an infinitesimal angle $\alpha$ one would expect
$$\delta\mathcal{L} = \alpha\,\partial_\mu j_5^\mu = m\alpha\,\bar q\gamma_5 q. \qquad (17)$$
But what we have just discovered is that the divergence of the current contains another, $m$-independent, term:
$$\partial_\mu j_5^\mu = m\,\bar q\gamma_5 q + \frac{1}{32\pi^2}F\tilde F. \qquad (18)$$
This anomaly can be derived in a number of other ways. One can define, for example, the current by "point splitting,"
$$j_5^\mu = \bar q(x + i\epsilon)\,e^{i\int_x^{x+\epsilon}dx^\mu A_\mu}\,\gamma^\mu\gamma_5\,q(x) \qquad (19)$$
Because operators in quantum field theory are singular at short distances, the Wilson line makes a finite contribution. Expanding the exponential carefully, one recovers the same expression for the current. A beautiful derivation, closely related to that we have performed above, is due to Fujikawa, described in Ref. 1. Here one considers the anomaly as arising from a lack of invariance of the path integral measure. One carefully evaluates the Jacobian associated with the change of variables $q \to q(1 + i\gamma_5\alpha)$, and shows that it yields the same result (Ref. 1). We will do a calculation along these lines in a two dimensional model shortly. The anomaly has important consequences in physics which will not be the subject of the lecture today, but it is worth at least listing a few before we proceed:
• $\pi^0$ decay: the divergence of the axial isospin current,
$$(j_5^3)^\mu = \bar u\gamma_5\gamma^\mu u - \bar d\gamma_5\gamma^\mu d \qquad (20)$$
has an anomaly due to electromagnetism. This gives rise to a coupling of the $\pi^0$ to two photons, and the correct computation of the lifetime was one of the early triumphs of the theory of quarks with color.
• Anomalies in gauge currents signal an inconsistency in a theory. They mean that the gauge invariance, which is crucial to the whole structure of gauge theories (e.g. to the fact that they are simultaneously unitary and Lorentz invariant) is lost. The absence of gauge anomalies is one of the striking ingredients of the standard model, and it is also crucial in extensions such as string theory.

Our focus in these lectures will be on another aspect of this anomaly: the appearance of an additional parameter in the standard model, and the associated "Strong CP problem." What we have just learned is that, if in our simple model above, we require that the quark masses are real, we must allow for the possible appearance in the lagrangian of the standard model, of the $\theta$-terms of eqn. 1. A similar term for the weak interactions, however, can be removed by a $B + L$ transformation. What are the consequences of these terms? We will focus on the strong interactions, for which these terms are most important. At first sight, one might guess that these terms are in fact of no importance. Consider, first, the case of QED. Then
$$\int d^4x\,F\tilde F \qquad (21)$$
is the integral of a total divergence,
$$F\tilde F = \vec E\cdot\vec B = \frac{1}{2}\partial_\mu\epsilon^{\mu\nu\rho\sigma}A_\nu F_{\rho\sigma}. \qquad (22)$$
As a result, this term does not contribute to the classical equations of motion. One might expect that it does not contribute quantum mechanically either. If we think of the Euclidean path integral, configurations of finite action have field strengths, $F_{\mu\nu}$, which fall off faster than $1/r^2$ (where $r$ is the Euclidean distance), and $A$ which falls off faster than $1/r$, so one can neglect surface terms in $\mathcal{L}_\theta$. (A parenthetical remark: This is almost correct. However, if there are magnetic monopoles there is a subtlety, first pointed out by Witten (Ref. 2). Monopoles can carry electric charge. In the presence of the $\theta$ term, there is an extra source for the electric field at long distances, proportional to $\theta$ and the monopole charge. So the electric charges are shifted accordingly (eqn. 23), where $n_m$ is the monopole charge in units of the Dirac quantum.)

In the case of non-Abelian gauge theories, the situation is more subtle. It is again true that $F\tilde F$ can be written as a total divergence:
$$F\tilde F = \partial_\mu K^\mu \qquad K^\mu = \epsilon^{\mu\nu\rho\sigma}\left(A^a_\nu F^a_{\rho\sigma} - \frac{2}{3}f^{abc}A^a_\nu A^b_\rho A^c_\sigma\right). \qquad (24)$$
But now the statement that $F$ falls faster than $1/r^2$ does not permit an equally strong statement about $A$. We will see shortly that there are finite action configurations - finite action classical solutions - where $F \sim 1/r^4$, but $A \to 1/r$, so that the surface term cannot be neglected. These configurations are called instantons. It is because of this that $\theta$ can have real physical effects.

[Footnote: In principle we must allow a similar term for the weak interactions. However, $B + L$ is a classical symmetry of the renormalizable interactions of the standard model. This symmetry is anomalous, and can be used to remove the weak $\theta$ term. In the presence of higher dimension $B + L$-violating terms, this is no longer true, but any effects of $\theta$ will be extremely small, suppressed by $e^{-2\pi/\alpha_w}$ as well as by powers of some large mass scale.]
2
A Two Dimensional Detour
Before considering four dimensions with all of its complications, it is helpful to consider two dimensions. Two dimensions are often a poor analog for four, but for some of the issues we are facing here, the parallels are extremely close. In these two dimensional examples, the physics is more manageable, but still rich.
2.1
The Anomaly In Two Dimensions
Consider, first, electrodynamics of a massless fermion in two dimensions. Let's investigate the anomaly. The point-splitting method is particularly convenient here. Just as in four dimensions, we write:
$$j_5^\mu = \bar\psi(x + i\epsilon)\,e^{i\int_x^{x+\epsilon}A_\rho dx^\rho}\,\gamma^\mu\gamma^5\,\psi(x) \qquad (25)$$
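For very small $\epsilon$, we can pick up the leading singularity in the product $\bar\psi(x + \epsilon)\psi(x)$ by using the operator product expansion, and noting that (using naive dimensional analysis) the leading operator is the unit operator, with coefficient proportional to $1/\epsilon$. We can read off this term by taking the vacuum expectation value, i.e. by simply evaluating the propagator. String theorists are particularly familiar with this Green's function:
$$\langle\bar\psi(x + \epsilon)\psi(x)\rangle = \frac{1}{2\pi}\frac{\not\epsilon}{\epsilon^2} \qquad (26)$$
Expanding the factor in the exponential to order $\epsilon$ gives
$$\partial_\mu j_5^\mu = \text{naive piece} + \frac{i}{2\pi}\,\partial_\mu\epsilon_\rho A^\rho\,{\rm tr}\,\frac{\not\epsilon}{\epsilon^2}\gamma^\mu\gamma^5. \qquad (27)$$
Taking the trace gives $\epsilon^{\mu\nu}\epsilon_\nu$; averaging $\epsilon$ over angles ($\langle\epsilon_\mu\epsilon_\nu\rangle = \frac{1}{2}\eta_{\mu\nu}\epsilon^2$) yields
$$\partial_\mu j_5^\mu = \frac{1}{4\pi}\epsilon_{\mu\nu}F^{\mu\nu}. \qquad (28)$$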
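The only nontrivial ingredient in the last step is the angular average $\langle\epsilon_\mu\epsilon_\nu\rangle = \frac{1}{2}\eta_{\mu\nu}\epsilon^2$. A small numerical check is sketched below, assuming a Euclidean two dimensional vector $\epsilon$ of fixed length averaged uniformly over its direction; the function name is illustrative.

```python
import math

def angular_average(n_angles=3600):
    """Average eps_mu eps_nu / eps^2 over directions of a 2D vector eps.

    Returns the 2x2 matrix <eps_mu eps_nu> / eps^2 computed on a uniform
    grid of angles; the claim in the text is that it equals delta / 2.
    """
    avg = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(n_angles):
        phi = 2.0 * math.pi * i / n_angles
        unit = (math.cos(phi), math.sin(phi))
        for mu in range(2):
            for nu in range(2):
                avg[mu][nu] += unit[mu] * unit[nu] / n_angles
    return avg

if __name__ == "__main__":
    print(angular_average())
```

The diagonal entries come out equal and the off-diagonal entries vanish, so only the trace part of $\epsilon_\mu\epsilon_\nu$ survives the average.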
Exercise: Fill in the details of this computation, being careful about signs and factors of 2.

This is quite parallel to the situation in four dimensions. The divergence of the current is itself a total derivative:
$$\partial_\mu j_5^\mu = \frac{1}{2\pi}\epsilon_{\mu\nu}\partial^\mu A^\nu. \qquad (29)$$
So it appears possible to define a new current,
$$J^\mu = j_5^\mu - \frac{1}{2\pi}\epsilon^{\mu\nu}A_\nu \qquad (30)$$
However, just as in the four dimensional case, this current is not gauge invariant. There is a familiar field configuration for which $A$ does not fall off at infinity: the field of a point charge. Indeed, if one has charges $\pm\theta$ at infinity, they give rise to a constant electric field, $F_{01} = e\theta$. So $\theta$ has a very simple interpretation in this theory. It is easy to see that physics is periodic in $\theta$. For $\theta > q$, it is energetically favorable to produce a pair of charges from the vacuum which shield the charge at $\infty$.
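The periodicity can be made concrete with a rough energy estimate. The sketch below assumes that $\theta$ is measured in units of the dynamical charge $q$, that producing $n$ pairs leaves a background field proportional to $\theta - n$, and that the field energy density is half the field squared. These normalizations, and the function name, are assumptions made for illustration.

```python
import math

def vacuum_energy_density(theta, e=1.0):
    """Field energy density after optimal screening of the background charge.

    theta is the charge at infinity in units of the dynamical charge q.
    Producing n pairs leaves a field e * (theta - n); the vacuum picks the
    integer n that minimises (1/2) * field**2.
    """
    n_best = math.floor(theta + 0.5)          # nearest integer to theta
    field = e * (theta - n_best)
    return 0.5 * field * field

if __name__ == "__main__":
    for theta in (0.0, 0.3, 0.5, 1.3, 2.3):
        print(theta, vacuum_energy_density(theta))
```

In this simple estimate the energy density is the same at $\theta$ and $\theta + 1$ (in units of $q$), and screening already lowers the energy once $\theta$ exceeds half a unit of charge, so certainly for $\theta > q$.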
2.2
The $CP^N$ Model: An Asymptotically Free Theory
The model we have considered so far is not quite like QCD in at least two ways. First, there are no instantons; second, the coupling $e$ is dimensionful. We can obtain a theory closer to QCD by considering the $CP^N$ model (our treatment here will follow closely the treatment in Peskin and Schroeder's problem 13.3). This model starts with a set of fields, $z_i$, $i = 1, \dots, N + 1$. These fields live in the space $CP^N$. This space is defined by the constraint:
$$\sum_i|z_i|^2 = 1; \qquad (31)$$
in addition, the point $z_i$ is equivalent to $e^{i\alpha}z_i$. To implement the first of these constraints, we can add to the action a lagrange multiplier field, $\lambda(x)$. For the second, we observe that the identification of points in the "target space," $CP^N$, must hold at every point in ordinary space-time, so this is a $U(1)$ gauge symmetry. So introducing a gauge field, $A_\mu$, and the corresponding covariant derivative, we want to study the lagrangian:
$$\mathcal{L} = \frac{1}{g^2}\left[|D_\mu z_i|^2 - \lambda(x)(|z_i|^2 - 1)\right] \qquad (32)$$
Note that there is no kinetic term for $A_\mu$, so we can simply eliminate it from the action using its equations of motion. This yields
$$\mathcal{L} = \frac{1}{g^2}\left[|\partial_\mu z_j|^2 + |z_j^*\partial_\mu z_j|^2\right] \qquad (33)$$
It is easier to proceed, however, keeping $A_\mu$ in the action. In this case, the action is quadratic in $z$, and we can integrate out the $z$ fields:
$$Z = \int[dA][d\lambda][dz_j]\exp\left[-\int d^2x\,\mathcal{L}\right] = \int[dA][d\lambda]\,e^{-S_{eff}[A,\lambda]} \qquad (34)$$
$$= \int[dA][d\lambda]\exp\left[-N\,{\rm tr}\log(-D^2 - \lambda) - \frac{1}{g^2}\int d^2x\,\lambda\right] \qquad (35)$$
2.3
The Large N Limit
By itself, the result of eqn. 35 is still rather complicated. The fields $A_\mu$ and $\lambda$ have complicated, nonlocal interactions. Things become much simpler if one takes the "large $N$ limit", a limit where one takes $N \to \infty$ with $g^2N$ fixed. In this case, the interactions of $\lambda$ and $A_\mu$ are suppressed by powers of $N$. For large $N$, the path integral is dominated by a single field configuration, which solves
$$\frac{\delta S_{eff}}{\delta\lambda} = 0 \qquad (36)$$
or, setting the gauge field to zero,
$$N\int\frac{d^2k}{(2\pi)^2}\frac{1}{k^2 + \lambda} = \frac{1}{g^2}, \qquad (37)$$
giving
$$\lambda = m^2 = M^2\exp\left[-\frac{4\pi}{g^2N}\right]. \qquad (38)$$
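The gap equation (37) and its solution (38) can be checked numerically. The sketch below assumes a sharp momentum cutoff $|k| < M$, for which the integral has the closed form $(N/4\pi)\log[(M^2 + m^2)/m^2]$, and solves for $m$ by bisection. The function names and the choice of regulator are illustrative assumptions.

```python
import math

def gap_integral(m, M, N):
    """N * int_{|k|<M} d^2k/(2 pi)^2  1/(k^2 + m^2), done in closed form.

    In two dimensions d^2k = 2 pi k dk, so the integral is
    (N / 4 pi) * log((M^2 + m^2) / m^2).
    """
    return N / (4.0 * math.pi) * math.log((M * M + m * m) / (m * m))

def solve_mass(g2N, M, N=1000, iterations=200):
    """Bisect for the mass m that satisfies gap_integral = 1/g^2 at fixed g^2 N."""
    target = N / g2N                      # 1/g^2 with g^2 = g2N / N
    lo, hi = 1e-100 * M, M                # the integral decreases as m grows
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)          # bisect on a logarithmic scale
        if gap_integral(mid, M, N) > target:
            lo = mid                      # m too small: integral too large
        else:
            hi = mid
    return math.sqrt(lo * hi)

if __name__ == "__main__":
    for M in (1.0, 10.0, 100.0):
        m = solve_mass(g2N=1.0, M=M)
        print(M, m, M * math.exp(-2.0 * math.pi / 1.0))
```

For small $g^2N$ the numerical solution agrees with $m = M\exp[-2\pi/(g^2N)]$, scales linearly with the cutoff $M$, and depends on $g$ and $N$ only through the combination $g^2N$.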
In eqn. 38, $M$ is a cutoff required because the integral in eqn. 37 is divergent. This result is remarkable. One has exhibited dimensional transmutation: a theory which is classically scale invariant contains non-trivial masses, related in a renormalization-group invariant fashion to the cutoff. This is the phenomenon which in QCD explains the masses of the proton, neutron, and other dimensionful quantities. So the theory is quite analogous to QCD. We can read off the leading term in the $\beta$-function from the familiar formula:
$$m = Me^{-\int\frac{dg}{\beta(g)}} \qquad (39)$$
so, with
$$\beta(g) = -\frac{1}{2\pi}g^3b_0 \qquad (40)$$
we have $b_0 = 1$. But most important for our purposes, it is interesting to explore the question of $\theta$-dependence. Again, in this theory, we could have introduced a $\theta$ term:
$$\mathcal{L}_\theta = \frac{\theta}{2\pi}\int d^2x\,\epsilon_{\mu\nu}F^{\mu\nu}, \qquad (41)$$
where $F_{\mu\nu}$ can be expressed in terms of the fundamental fields $z_j$. As usual, this is the integral of a total divergence. But precisely as in the case of 1 + 1 dimensional electrodynamics we discussed above, this term is physically important. In perturbation theory in the model, this is not entirely obvious. But using our reorganization of the theory at large $N$, it is. The lowest order action for $A_\mu$ is trivial, but at one loop (order $1/N$), one generates a kinetic term for $A_\mu$ through the usual vacuum polarization loop:
$$\mathcal{L}_{kin} = \frac{N}{2\pi m^2}F_{\mu\nu}^2. \qquad (42)$$
At this order, the effective theory consists of the gauge field, then, with coupling $e^2 = \frac{2\pi m^2}{N}$, and some coupling to a dynamical, massive field $\lambda$. As we have already argued, $\theta$ corresponds to a non-zero background electric field due to charges at infinity, and the theory clearly can have non-trivial $\theta$-dependence. There is, in addition, the possibility of including other light fields, for example massless fermions. In this case, one can again have an anomalous $U(1)$ symmetry. There is then no $\theta$-dependence, since it is possible to shield any charge at infinity. But there is non-trivial breaking of the symmetry. At low energies, one has now a theory with a fermion coupled to a dynamical $U(1)$ gauge field. The breaking of the associated $U(1)$ in such a theory is a well-studied phenomenon (Ref. 3).

Exercise: Complete Peskin and Schroeder, Problem 13.3.
2.4
The Role of Instantons
There is another way to think about the breaking of the $U(1)$ symmetry and $\theta$-dependence in this theory. If one considers the Euclidean functional integral, it is natural to look for stationary points of the integration, i.e. for classical solutions of the Euclidean equations of motion. In order that they be potentially important, it is necessary that these solutions have finite action, which means that they must be localized in Euclidean space and time. For this reason, such solutions were dubbed "instantons" by 't Hooft. Such solutions are not difficult to find in the $CP^N$ model; we will describe them briefly below. These solutions carry non-zero values of the topological charge,
$$\frac{1}{2\pi}\int d^2x\,\epsilon_{\mu\nu}F_{\mu\nu} = n \qquad (43)$$
and have an action $2\pi n$ (in units of $1/g^2$). As a result, they contribute to the $\theta$-dependence; they give a contribution to the functional integral:
$$\int[d\delta z_j]\,e^{-\frac{2\pi n}{g^2} + \dots}\,e^{in\theta} \qquad (44)$$
It follows that:
• Instantons generate $\theta$-dependence.
• In the large $N$ limit, instanton effects are, formally, highly suppressed, much smaller than the effects we found in the large $N$ limit.
• Somewhat distressingly, the functional integral above can not be systematically evaluated. The problem is that the classical theory is scale invariant, as a result of which, instantons come in a variety of sizes. $\int[d\delta z]$ includes an integration over all instanton sizes, which diverges for large size (i.e. in the infrared). This prevents a systematic evaluation of the effects of instantons in this case. At high temperatures (Ref. 4), it is possible to do the evaluation, and instanton effects are, indeed, systematically small.

It is easy to construct the instanton solution in the case of $CP^1$. Rather than write the theory in terms of a gauge field, as we have done above, it is convenient to parameterize the theory in terms of a single complex field, $\phi$. One can, for example, define $\phi = z_1/z_2$. Then, with a bit of algebra, one can show that the action for $\phi$ takes the form:
$$\mathcal{L} = (\partial_\mu\phi\,\partial_\mu\phi^*)\frac{1}{(1 + \phi^*\phi)^2} \qquad (45)$$
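One can think of the field $\phi$ as living on the space with metric given by the term in parenthesis, $g_{\phi\bar\phi}$. One can show that this is the metric one obtains if one stereographically maps the sphere onto the complex plane. This mapping, which you may have seen in your math methods courses, is just:
$$z = \frac{x_1 + ix_2}{1 - x_3}. \qquad (46)$$
The inverse is
$$x_1 = \frac{z + z^*}{1 + |z|^2} \qquad x_2 = \frac{z - z^*}{i(1 + |z|^2)} \qquad x_3 = \frac{|z|^2 - 1}{|z|^2 + 1}. \qquad (47)$$
It is straightforward to write down the equations of motion
$$\partial_\mu\partial_\mu\phi - \frac{2\phi^*}{1 + \phi^*\phi}\,\partial_\mu\phi\,\partial_\mu\phi = 0 \qquad (48)$$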
Now calling the space time coordinates $z = x_1 + ix_2$, $z^* = x_1 - ix_2$, you can see that if $\phi$ is analytic, the equations of motion are satisfied! So a simple solution, which you can check has finite action, is
(49)
In addition to evaluating the action, you can evaluate the topological charge,
$$\frac{1}{2\pi}\int d^2x\,\epsilon_{\mu\nu}F^{\mu\nu} = 1 \qquad (50)$$
for this solution. More generally, the topological charge measures the number of times that $\phi$ maps the complex plane into the complex plane; for $\phi = z^n$, for example, one has charge $n$.

Exercise: Verify that the action of eqn. 45 is equal to
$$\mathcal{L} = g_{\phi\bar\phi}\,\partial_\mu\phi\,\partial_\mu\phi^* \qquad (51)$$
where $g$ is the metric of the sphere in complex coordinates, i.e. it is the line element $dx_1^2 + dx_2^2 + dx_3^2$ expressed as $g_{zz}dz\,dz + g_{zz^*}dz\,dz^* + g_{z^*z}dz^*dz + g_{z^*z^*}dz^*dz^*$. A model with an action of this form is called a "Non-linear Sigma Model;" the idea is that the fields live on some "target" space, with metric $g$. Verify eqns. 45, 47. More generally,
...
(52)
Similarly, the integration over the
(53)
Here the first factor follows on dimensional grounds. The second follows from renormalization-group considerations. It can be found by explicit evaluation of the functional determinant. Note that, because of asymptotic freedom, this means that typical Green's functions will be divergent in the infrared. There are many other features of this instanton one can consider. For example, one can consider adding massless fermions to the model, by simply coupling them in a gauge-invariant way to $A_\mu$. The resulting theory has a chiral $U(1)$ symmetry, which is anomalous. In the presence of an instanton, one can easily construct normalizable fermion zero modes (the Dirac equation just becomes the statement that $\psi$ is analytic). As a result, Green's functions computed in the instanton background do not respect the axial $U(1)$ symmetry. But rather than get too carried away with this model (I urge you to get a little carried away and play with it a bit), let's proceed to four dimensions, where we will see very similar phenomena.
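Before moving on, two of the $CP^1$ statements above are easy to check numerically: that the maps of eqns. 46 and 47 are inverses of one another, and that $\phi = z^n$ carries topological charge $n$. The sketch below assumes the charge is computed as the area of the unit sphere covered by $\phi$, divided by $4\pi$, using the pulled-back area element $4|\phi'|^2/(1 + |\phi|^2)^2\,d^2x$ for analytic $\phi$. The function names and the quadrature are illustrative.

```python
import math

def stereographic(x1, x2, x3):
    """Map a point of the unit sphere (x3 != 1) to the complex plane, eqn. 46."""
    return complex(x1, x2) / (1.0 - x3)

def inverse_stereographic(z):
    """Inverse map, eqn. 47: complex plane back to the unit sphere."""
    r2 = abs(z) ** 2
    return (2.0 * z.real / (1.0 + r2),
            2.0 * z.imag / (1.0 + r2),
            (r2 - 1.0) / (r2 + 1.0))

def topological_charge(n, n_steps=20000):
    """Winding number of phi = z**n, i.e. how often it covers the sphere.

    After the angular integral the covered area divided by 4 pi becomes
    int_0^infinity dr 2 n^2 r^(2n-1) / (1 + r^(2n))^2, evaluated here with
    the midpoint rule after the substitution r = t / (1 - t).
    """
    total = 0.0
    h = 1.0 / n_steps
    for i in range(n_steps):
        t = (i + 0.5) * h
        r = t / (1.0 - t)
        dr_dt = 1.0 / (1.0 - t) ** 2
        density = 2.0 * n * n * r ** (2 * n - 1) / (1.0 + r ** (2 * n)) ** 2
        total += density * dr_dt * h
    return total

if __name__ == "__main__":
    point = (0.48, 0.64, -0.6)            # a point on the unit sphere
    print(inverse_stereographic(stereographic(*point)))
    for n in (1, 2, 3):
        print(n, topological_charge(n))
```

The round trip returns the starting point on the sphere, and the charge integral reproduces the integer $n$ for $\phi = z^n$.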
3
Real QCD
The model of the previous section mimics many features of real QCD. Indeed, we will see that much of our discussion can be carried over, almost word for word, to the observed strong interactions. This analogy is helpful, given that in QCD we have no approximation which gives us control over the theory comparable to that which we found in the large $N$ limit of the $CP^N$ model. As in that theory:
• There is a $\theta$ parameter, which appears as an integral over the divergence of a non-gauge invariant current.
• There are instantons, which indicate that there should be real $\theta$-dependence. However, instanton effects cannot be considered in a controlled approximation, and there is no clear sense in which $\theta$-dependence can be understood as arising from instantons.
• There is another approach to the theory, which shows that the $\theta$-dependence is real, and allows computation of these effects. In QCD, this is related to the breaking of chiral symmetries.
3.1
The Theory and its Symmetries
While it is not in the spirit of much of this school, which is devoted to the physics of heavy quarks, it is sufficient, to understand the effects of $\theta$, to focus on only the light quark sector of QCD. For simplicity in writing some of the formulas, we will consider two light quarks; it is not difficult to generalize the resulting analysis to the case of three. It is believed that the masses of the $u$ and $d$ quarks are of order 5 MeV and 10 MeV, respectively, much lighter than the scale of QCD. So we first consider an idealization of the theory in which these masses are set to zero. In this limit, the theory has a symmetry $SU(2)_L \times SU(2)_R$. This symmetry is spontaneously broken to a vector $SU(2)$. The three resulting Goldstone bosons are the $\pi$ mesons. Calling
$$q = \begin{pmatrix}u\\ d\end{pmatrix} \qquad \bar q = \begin{pmatrix}\bar u\\ \bar d\end{pmatrix} \qquad (54)$$
the two $SU(2)$ symmetries act separately on $q$ and $\bar q$ (thought of as left handed fermions). The order parameter for the symmetry breaking is believed to be the condensate:
$$M_0 = \langle\bar qq\rangle. \qquad (55)$$
This indeed breaks the symmetry down to the vector sum. The associated Goldstone bosons are the $\pi$ mesons. One can think of the Goldstone bosons as being associated with a slow variation of the expectation value in space, so we can introduce a composite operator
$$M = \bar qq = M_0\,e^{i\pi_a(x)\tau_a/f_\pi}\begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix} \qquad (56)$$
The quark mass term in the lagrangian is then (for simplicity writing $m_u = m_d = m_q$; more generally one should introduce a matrix)
$$m_qM. \qquad (57)$$
Expanding $M$ in powers of $\pi/f_\pi$, it is clear that the minimum of the potential occurs for $\pi_a = 0$. Expanding to second order, one has
$$m_\pi^2f_\pi^2 = m_qM_0. \qquad (58)$$
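Eqn. 58 can be turned around to give a rough size for the condensate. The sketch below assumes round values $m_\pi \approx 135$ MeV and $f_\pi \approx 93$ MeV (neither number is quoted in the text) and takes $m_q$ to be the average of the 5 MeV and 10 MeV mentioned above; normalization conventions for $f_\pi$ and for the condensate differ between authors, so only the order of magnitude is meaningful.

```python
def condensate_scale_mev(m_pi=135.0, f_pi=93.0, m_q=7.5):
    """Cube root of M_0 = m_pi^2 f_pi^2 / m_q, all inputs in MeV.

    m_pi and f_pi are assumed round values; m_q is the average of the
    5 MeV and 10 MeV quoted above for the u and d quarks.
    """
    m0 = m_pi ** 2 * f_pi ** 2 / m_q      # condensate, in MeV^3
    return m0 ** (1.0 / 3.0)

if __name__ == "__main__":
    print(condensate_scale_mev())
```

With these inputs the condensate comes out at a few hundred MeV cubed, i.e. at the scale of QCD, as one would expect for a quantity generated by the strong dynamics rather than by the quark masses.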
But we have been a bit cavalier about the symmetries. The theory also has two $U(1)$'s;
$$q \to e^{i\alpha}q \qquad \bar q \to e^{-i\alpha}\bar q \qquad (59)$$
$$q \to e^{i\alpha}q \qquad \bar q \to e^{i\alpha}\bar q \qquad (60)$$
The first of these is baryon number and it is not chiral (and is not broken by the condensate). The second is the axial $U(1)_5$; it is also broken by the condensate. So, in addition to the pions, there should
be another approximate Goldstone boson. The best candidate is the $\eta$, but, as we will see below (and as you will see further in Thomas's lectures), the $\eta$ is too heavy to be interpreted in this way. The absence of this fourth (or in the case of three light quarks, ninth) Goldstone boson is called the $U(1)$ problem. The $U(1)_5$ symmetry suffers from an anomaly, however, and we might hope that this has something to do with the absence of a corresponding Goldstone boson. The anomaly is given by
$$\partial_\mu j_5^\mu = \frac{1}{16\pi^2}F\tilde F \qquad (61)$$
Again, we can write the right hand side as a total divergence,
$$F\tilde F = \partial_\mu K^\mu \qquad (62)$$
where
$$K^\mu = \epsilon^{\mu\nu\rho\sigma}\left(A_\nu^aF_{\rho\sigma}^a - \frac{2}{3}f^{abc}A_\nu^aA_\rho^bA_\sigma^c\right). \qquad (63)$$
So if it is true that this term accounts for the absence of the Goldstone boson, we need to show that there are important configurations in the functional integral for which the right hand side does not vanish rapidly at infinity.
3.2 Instantons
It is easiest to study the Euclidean version of the theory. This is useful if we are interested in very low energy processes, which can be described by an effective action expanded about zero momentum. In the functional integral,
$$Z = \int [dA][dq][d\bar q]\, e^{-S}, \qquad (64)$$
it is natural to look for stationary points of the effective action, i.e. finite action, classical solutions of the theory in imaginary time. The Yang-Mills equations are complicated, non-linear equations, but it turns out that, much as in the $CP^N$ model, the instanton solutions can be found rather easily [5]. The following tricks simplify the construction, and turn out to yield the general solution. First, note that the Yang-Mills action satisfies an inequality:
$$\int (F \pm \tilde F)^2 = \int (F^2 + \tilde F^2 \pm 2 F \tilde F) = \int (2F^2 \pm 2 F \tilde F) \ge 0. \qquad (65)$$
So the action is bounded by $\int F \tilde F$, with the bound being saturated when
$$F = \pm \tilde F, \qquad (66)$$
i.e. if the gauge field is (anti) self dual [6]. This equation is a first order equation, and it is easy to solve if one first restricts to an SU(2) subgroup of the full gauge group. One makes the ansatz that the solution should be invariant under a combination of ordinary rotations and global SU(2) gauge transformations:
$$A_\mu = f(r^2) + h(r^2)\, \vec x \cdot \vec \tau, \qquad (67)$$
where we are using a matrix notation for the gauge fields. (Footnote: This is not an accident, nor was the analyticity condition in the $CP^N$ case. In both cases, we can add fermions so that the model is supersymmetric. Then one can show that if some of the supersymmetry generators $Q_\alpha$ annihilate a field configuration, then the configuration is a solution. This is a first order condition; in the Yang-Mills case, it implies self-duality, and in the $CP^N$ case it requires analyticity.)
One can actually make a better guess: define the gauge transformation
$$g(x) = \frac{x_4 + i\, \vec x \cdot \vec \tau}{r} \qquad (68)$$
and take
$$A_\mu = f(r^2)\, g\, \partial_\mu g^{-1}. \qquad (69)$$
Then plugging in the Yang-Mills equations yields:
$$f = \frac{r^2}{r^2 + \rho^2}, \qquad (70)$$
where $\rho$ is an arbitrary quantity with dimensions of length. The choice of origin here is arbitrary; this can be remedied by simply replacing $x \to x - x_0$ everywhere in these expressions, where $x_0$ represents the location of the instanton.
Exercise: Check that eqns. (69), (70) solve (66).
From this solution, it is clear why $\int d^3\Sigma_\mu K^\mu$ does not vanish for the solution: while $A$ is a pure gauge at infinity, it falls only as $1/r$. Indeed, since $F = \tilde F$ for this solution,
$$\int F^2 = \int \tilde F^2 = 32\pi^2. \qquad (71)$$
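As a numerical cross-check of the value $32\pi^2$ in eq. (71), the sketch below integrates the action density of a single instanton over Euclidean space. It assumes the standard closed form $F^a_{\mu\nu}F^a_{\mu\nu} = 192\rho^4/(r^2+\rho^2)^4$ for the SU(2) instanton with the coupling absorbed into the fields; that expression is not derived in the text and is taken here as an input. The function name and the substitution $r = \rho\tan\phi$ are illustrative choices.

```python
import math


def instanton_f_squared_integral(rho: float = 1.0, n: int = 2000) -> float:
    """Midpoint-rule estimate of the integral of F^2 over Euclidean 4-space.

    Assumed input (not derived in the text): F^2 = 192 rho^4 / (r^2 + rho^2)^4.
    The angular integration in four dimensions gives a factor 2 pi^2 r^3 dr.
    """
    total = 0.0
    h = (math.pi / 2) / n
    for i in range(n):
        phi = (i + 0.5) * h
        r = rho * math.tan(phi)           # maps [0, pi/2) onto [0, infinity)
        dr_dphi = rho / math.cos(phi) ** 2
        density = 192.0 * rho ** 4 / (r * r + rho * rho) ** 4
        total += 2.0 * math.pi ** 2 * r ** 3 * density * dr_dphi * h
    return total


if __name__ == "__main__":
    value = instanton_f_squared_integral()
    print(value / (32 * math.pi ** 2))    # ratio to the value quoted in eq. (71)
```

The estimate does not depend on the value chosen for `rho`, which reflects the scale invariance of the classical action.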
This result can also be understood topologically. $g$ defines a mapping from the "sphere at infinity" into the gauge group. It is straightforward to show that
$$\frac{1}{32\pi^2} \int d^4x\, F \tilde F \qquad (72)$$
counts the number of times $g$ maps the sphere at infinity into the group (one for this specific example; $n$ more generally). We do not have time to explore all of this in detail; I urge you to look at Sidney Coleman's lecture, "The Uses of Instantons" [7]. To actually do calculations, 't Hooft developed some notations which are often more efficient than those described above [5].
So we have exhibited potentially important contributions to the path integral which violate the U(1) symmetry. How does this violation of the symmetry show up? Let's think about the path integral in a bit more detail. Having found a classical solution, we want to integrate over small fluctuations about it:
$$e^{i\theta} \int [d\delta A][dq][d\bar q]\, e^{-\delta^2 S}. \qquad (73)$$
Now $S$ contains an explicit factor of $1/g^2$. As a result, the fluctuations are formally suppressed by $g^2$ relative to the leading contribution. The one loop functional integral yields a product of determinants for the fermions, and of inverse square root determinants for the bosons. Consider, first, the integral over the fermions. It is straightforward, if challenging, to evaluate the determinants [5]. But if the quark masses are zero, the fermion functional integrals are zero, because there is a zero mode for each of the fermions, i.e. for both $q$ and $\bar q$ there is a normalizable solution of the equation:
$$\not{D} u = 0, \qquad \not{D} \bar u = 0, \qquad (74)$$
and similarly for $d$ and $\bar d$. It is straightforward to construct these solutions:
$$u_0 = \frac{\rho}{\left(\rho^2 + (x - x_0)^2\right)^{3/2}}\, \zeta, \qquad (75)$$
where $\zeta$ is a constant spinor, and similarly for $\bar u$, etc. This means that in order for the path integral to be non-vanishing, we need to include insertions of enough $q$'s and $\bar q$'s to soak up all of the zero modes. In other words, non-vanishing Green's functions have the form
$$\langle \bar u u\, \bar d d \rangle \qquad (76)$$
and violate the symmetry. Note that the symmetry violation is just as predicted from the anomaly equation:
$$\Delta Q_5 = 4 \cdot \frac{1}{32\pi^2} \int d^4x\, F \tilde F = 4. \qquad (77)$$
However, the calculation we have described here is not self consistent. The difficulty is that among the variations of the fields we need to integrate over are changes in the location of the instanton (translations), rotations of the instanton, and scale transformations. The translations are easy to deal with; one has simply to integrate over $x_0$ (one must also include a suitable Jacobian factor [7]). Similarly, one must integrate over $\rho$. There is a power of $\rho$ arising from the Jacobian, which can be determined on dimensional grounds. For our Green's function above, for example, which has dimension 6, we have (if all of the fields are evaluated at the same point),
$$\int d\rho\, \rho^{-7}. \qquad (78)$$
However, there is additional $\rho$-dependence because the quantum theory violates the scale symmetry. This can be understood by replacing $g^2 \to g^2(\rho)$ in the functional integral, and using
$$e^{-8\pi^2/g^2(\rho)} \sim (\rho M)^{b_0} \qquad (79)$$
for small $\rho$. For 3 flavor QCD, for example, $b_0 = 9$, and the $\rho$ integral diverges for large $\rho$. This is just the statement that the integral is dominated by the infrared, where the QCD coupling becomes strong. So we have provided some evidence that the U(1) problem is solved in QCD, but no reliable calculation.
What about $\theta$-dependence? Let us ask first about $\theta$-dependence of the vacuum energy. In order to get a non-zero result, we need to allow that the quarks are massive. Treating the mass as a perturbation, we obtain
$$E(\theta) = C \Lambda_{QCD}^9\, m_u m_d \cos(\theta) \int d\rho\, \rho^{-3} \rho^{9}. \qquad (80)$$
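The infrared divergence of these $\rho$ integrals is a matter of power counting, which the following sketch makes explicit. It assumes the standard one-loop coefficient $b_0 = 11N/3 - 2n_f/3$ for SU(N) with $n_f$ flavors (the text only quotes the value 9 for three flavors), and the helper names are hypothetical.

```python
from fractions import Fraction


def beta0(n_colors: int, n_flavors: int) -> Fraction:
    """One-loop coefficient b0 = 11 N / 3 - 2 n_f / 3 (assumed standard formula)."""
    return Fraction(11 * n_colors, 3) - Fraction(2 * n_flavors, 3)


def net_rho_power(jacobian_power: int, n_colors: int, n_flavors: int) -> Fraction:
    """Total power p in the integrand rho^p, combining the Jacobian power
    with the factor rho^b0 from the running coupling."""
    return jacobian_power + beta0(n_colors, n_flavors)


def diverges_at_large_rho(power) -> bool:
    """The integral of rho^p up to infinity diverges when p >= -1."""
    return power >= -1


if __name__ == "__main__":
    p_green = net_rho_power(-7, 3, 3)    # Green's function integrand, eq. (78)
    p_energy = net_rho_power(-3, 3, 3)   # vacuum energy integrand, eq. (80)
    print(beta0(3, 3), p_green, p_energy)
    print(diverges_at_large_rho(p_green), diverges_at_large_rho(p_energy))
```

Both integrands grow with $\rho$, so the integrals are dominated by large instantons, where the semiclassical approximation cannot be trusted.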
So again, we have evidence for $\theta$-dependence, but cannot do a reliable calculation. That we cannot do a calculation should not be a surprise. There is no small parameter in QCD to use as an expansion parameter. Fortunately, we can use other facts which we know about the strong interactions to get a better handle on both the U(1) problem and the question of $\theta$-dependence.
Before continuing, however, let us consider the weak interactions. Here there is a small parameter, and there are no infrared difficulties, so we might expect instanton effects to be small. The analog of the $U(1)_5$ symmetry in this case is baryon number. Baryon number has an anomaly in the standard model, since all of the quark doublets have the same sign of the baryon number. 't Hooft realized that one could actually use instantons, in this case, to compute the violation of baryon number. Technically, there are no finite action Euclidean solutions in this theory; this follows, as we will see in a moment, from a simple scaling argument. However, 't Hooft realized that one can construct important configurations of nonzero topological charge by starting with the instantons of the pure gauge theory and perturbing them. If one simply takes such an instanton, and plugs it into the action, one necessarily finds a correction to the action of the form
$$\delta S = \frac{1}{g^2} v^2 \rho^2. \qquad (81)$$
This damps the $\rho$ integral at large $\rho$, and leads to a convergent result. Affleck showed how to develop this into a systematic computation [9]. Note that from this, one can see that baryon number violation occurs in the standard model, and that the rate is incredibly small, proportional to $e^{-2\pi/\alpha_W}$.
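To get a feeling for how small a factor $e^{-2\pi/\alpha_W}$ is, the short sketch below evaluates its base-10 logarithm. The value $\alpha_W \approx 1/30$ is an assumed round number for the weak fine structure constant, not a figure taken from the text, and the function name is illustrative.

```python
import math


def log10_instanton_factor(alpha: float) -> float:
    """Base-10 logarithm of exp(-2 pi / alpha)."""
    return -2.0 * math.pi / alpha / math.log(10.0)


if __name__ == "__main__":
    alpha_w = 1.0 / 30.0  # assumed value of the weak coupling alpha_W
    print(log10_instanton_factor(alpha_w))
```

With this input the exponent is close to -82, which is why zero-temperature baryon number violation in the standard model is unobservable in practice.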
3.3 Real QCD and the U(1) Problem
In real QCD, it is difficult to do a reliable calculation which shows that there is not an extra Goldstone boson, but the instanton analysis we have described makes clear that there is no reason to expect one. Actually, while perturbative and semiclassical (instanton) techniques have no reason to give reliable results, there are two approximation methods which are available. The first is large $N$, where one now allows the $N$ of SU(N) to be large, with $g^2 N$ fixed. In contrast to the case of $CP^N$, this does not permit enough simplification to do explicit computations, but it does allow one to make qualitative statements about the theory. Witten has pointed out a way in which one can relate the mass of the $\eta$ (or $\eta'$ if one is thinking in terms of $SU(3) \times SU(3)$ current algebra) to quantities in a theory without quarks. The point is to note that the anomaly is an effect suppressed by a power of $N$, in the large $N$ limit. This is because the loop diagram contains a factor of $g^2$ but not of $N$. So, in large $N$, it can be treated as a perturbation, and the $\eta$ is massless. $\partial_\mu j_5^\mu$ is like a creation operator for the $\eta$ (just as $\partial_\mu j_5^{\mu 3}$ is a creation operator for the $\pi$ meson), so one can compute the mass if one knows the correlation function, at zero momentum, of
$$\langle \partial_\mu j_5^\mu(x)\, \partial_\nu j_5^\nu(y) \rangle \propto \frac{1}{N^2} \langle F(x) \tilde F(x)\, F(y) \tilde F(y) \rangle. \qquad (82)$$
To leading order in the 1/JV expansion, this correlation function can be computed in the theory without quarks. Witten argued that while this vanishes order by order in perturbation theory, there is no reason that this correlation function need vanish in the full theory. Attempts have been made to compute this quantity both in lattice gauge theory and using the AdS-CFT correspondence recently discovered in string theory. Both methods give promising results. So the U(l) problem should be viewed as solved, in the sense that absent any argument to the contrary, there is no reason to think that there should be an extra Goldstone boson in QCD. The second approximation scheme which gives some control of QCD is known as chiral perturbation theory. The masses of the u, d and s quarks are small compared to the QCD scale, and the mass terms for these quarks in the lagrangian can be treated as perturbations. This will figure in our discussion in the next section.
3.4 Other Uses of Instantons: A Survey
In the early days of QCD, it was hoped that instantons, being a reasonably well understood nonperturbative effect, might give insight into many aspects of the strong interactions. Because of the infrared divergences discussed earlier, this program proved to be a disappointment. There was simply no well-controlled approximation to QCD in which instantons were important. Indeed, Witten [10] stressed the successes of the large $N$ limit in understanding the strong interactions, and argued that in this limit, anomalies could be important but instantons would be suppressed exponentially. This reasoning (which I urge you to read) underlay much of our earlier discussion, which borrowed heavily from this work.
In the years since Coleman's "Uses of Instantons" was published, many uses of instantons in controlled approximations have been found. What follows is an incomplete list; I hope this will inspire some of you to read Coleman's lectures and develop a deeper understanding of the subject.
• $\theta$-dependence at finite temperature: Within QCD, instanton calculations are reliable at high temperatures. So, for example, one can calculate the $\theta$-dependence of the vacuum energy in the early universe, and other quantities to which instantons give the leading contribution [11].
• Baryon number violation in the standard model: We have remarked that this can be reliably calculated, though it is extremely small. However, as explained in [7], instanton effects are associated with tunneling, and in the standard model, they describe tunneling between states with different baryon number. It is reasonable to expect that baryon number violation is enhanced at high temperature, where one has plenty of energy to pass over the barrier without tunneling. This is indeed the case. This baryon number violation might be responsible for the matter-antimatter asymmetry which we observe [12].
• Instanton effects in supersymmetric theories: this has turned out to be a rich topic. Instantons, in many instances, are the leading effects which violate non-renormalization theorems in perturbation theory, and they can give rise to superpotentials, supersymmetry breaking, and other phenomena. More generally, they have provided insight into a whole range of field theory and string theory phenomena.
4 The Strong CP Problem
4.1 $\theta$-dependence of the Vacuum Energy
The fact that the anomaly resolves the U(1) problem in QCD, however, raises another issue. Given that $\int d^4x\, F \tilde F$ has physical effects, the theta term in the action has physical effects as well. Since this term is CP odd, this means that there is the potential for strong CP violating effects. These effects should vanish in the limit of zero quark masses, since in this case, by a field redefinition, we can remove $\theta$ from the lagrangian. In the presence of quark masses, the $\theta$-dependence of many quantities can be computed. Consider, for example, the vacuum energy. In QCD, the quark mass term in the lagrangian has the form:
$$\mathcal{L}_m = m_u \bar u u + m_d \bar d d + \mathrm{h.c.} \qquad (83)$$
Were it not for the anomaly, we could, by redefining the quark fields, take $m_u$ and $m_d$ to be real. Instead, we can define these fields so that there is no $\theta F \tilde F$ term in the action, but there is a phase in $m_u$ and $m_d$. Clearly, we have some freedom in making this choice. In the case that $m_u$ and $m_d$ are equal, it is natural to choose these phases to be the same. We will explain shortly how one proceeds when the masses are different (as they are in nature). We can, by convention, take $\theta$ to be the phase of the overall lagrangian:
$$\mathcal{L}_m = (m_u \bar u u + m_d \bar d d) \cos(\theta/2) + \mathrm{h.c.} \qquad (84)$$
Now we want to treat this term as a perturbation. At first order, it makes a contribution to the ground state energy proportional to its expectation value. We have already argued that the quark bilinears have non-zero vacuum expectation values, so
$$E(\theta) = (m_u + m_d)\, e^{i\theta} \langle \bar q q \rangle. \qquad (85)$$
While, without a difficult non-perturbative calculation, we can't calculate the separate quantities on the right hand side of this expression, we can, using current algebra, relate them to measured quantities. A simple way to do this is to use the effective lagrangian method (which will be described in more detail in Thomas's lectures). The basic idea is that at low energies, the only degrees of freedom which can readily be excited in QCD are the pions. So parameterize $\bar q q$ as
$$\bar q q = \Sigma = \langle \bar q q \rangle\, e^{i \pi_a \tau_a / f_\pi}. \qquad (86)$$
We can then write the quark mass term as
$$\mathcal{L}_m = e^{i\theta}\, \mathrm{Tr}\, M_q \Sigma. \qquad (87)$$
Ignoring the $\theta$ term at first, we can see, plugging in the explicit form for $\Sigma$, that
$$m_\pi^2 f_\pi^2 = (m_u + m_d) \langle \bar q q \rangle. \qquad (88)$$
So the vacuum energy, as a function of $\theta$, is:
$$E(\theta) = m_\pi^2 f_\pi^2 \cos(\theta). \qquad (89)$$
This expression can readily be generalized to the case of three light quarks by similar methods. In any case, we now see that there is real physics in $\theta$, even if we don't understand how to do an instanton calculation. In the next section, we will calculate a more physically interesting quantity: the neutron electric dipole moment as a function of $\theta$.
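The relation $m_\pi^2 f_\pi^2 = (m_u + m_d)\langle \bar q q \rangle$ can be turned around to estimate the size of the condensate, as in the sketch below. The inputs $m_\pi = 135$ MeV and $f_\pi = 93$ MeV are assumed values (the normalization of $f_\pi$ is convention dependent), the quark masses are the rough 5 MeV and 10 MeV quoted earlier, and only magnitudes are tracked, not signs.

```python
M_PI_GEV = 0.135   # assumed pion mass
F_PI_GEV = 0.093   # assumed pion decay constant (convention dependent)
M_U_GEV = 0.005    # rough u quark mass quoted in the text
M_D_GEV = 0.010    # rough d quark mass quoted in the text


def vacuum_energy_scale() -> float:
    """Magnitude of m_pi^2 f_pi^2 in GeV^4, the coefficient of cos(theta)."""
    return M_PI_GEV ** 2 * F_PI_GEV ** 2


def condensate_magnitude() -> float:
    """|<qbar q>| in GeV^3 from m_pi^2 f_pi^2 = (m_u + m_d) <qbar q>."""
    return vacuum_energy_scale() / (M_U_GEV + M_D_GEV)


if __name__ == "__main__":
    scale = condensate_magnitude() ** (1.0 / 3.0)
    print(vacuum_energy_scale(), condensate_magnitude(), scale)
```

With these inputs the condensate corresponds to a scale of a few hundred MeV, comparable to the QCD scale, as one would expect for a strong-interaction order parameter.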
4.2 The Neutron Electric Dipole Moment
As Scott Thomas will explain in much greater detail in his lectures, the most interesting physical quantities to study in connection with CP violation are electric dipole moments, particularly that of the neutron, $d_n$. It has been possible to set strong experimental limits on this quantity. Using current algebra, the leading contribution to the neutron electric dipole moment due to $\theta$ can be calculated, and one obtains a limit $\theta < 10^{-9}$ [8]. The original paper on the subject is quite readable. Here we outline the main steps in the calculation; I urge you to work out the details following the reference. We will simplify the analysis by working in an exact SU(2)-symmetric limit, i.e. by taking $m_u = m_d = m$. We again treat the lagrangian of (84) as a perturbation. We can also understand how this term depends on the $\pi$ fields by making an axial SU(2) transformation on the quark fields. In other words, a background $\pi$ field can be thought of as a small chiral transformation from the vacuum. Then, e.g., for the $\tau_3$ direction, $q \to (1 + i \pi_3 \tau_3) q$ (the $\pi$ field parameterizes the transformation), so the action becomes:
$$m\, \pi_3\, (\bar q \gamma_5 \tau_3 q + \theta\, \bar q \tau_3 q). \qquad (90)$$
In other words, we have calculated a CP violating coupling of the mesons to the pions. This coupling is difficult to measure directly, but it was observed in [8] that this coupling gives rise, in a calculable fashion, to a neutron electric dipole moment. Consider the graph of fig. 2. This graph generates a neutron electric dipole moment, if we take one coupling to be the standard pion-nucleon coupling, and the second the coupling we have computed above. The resulting Feynman graph is infrared divergent; we cut this off at $m_\pi$, while cutting off the integral in the ultraviolet at the QCD scale. Because of this infrared sensitivity, the low energy calculation is reliable. The exact result is:
$$d_n = g_{\pi NN}\, \frac{-\theta\, m_u m_d}{f_\pi (m_u + m_d)}\, \langle N_f | \bar q \tau^a q | N_i \rangle\, \ln(M_N / m_\pi)\, \frac{1}{4\pi^2 M_N}. \qquad (91)$$
The matrix element can be estimated using ordinary SU(3), yielding $d_n = 5.2 \times 10^{-16}\, \theta$ e cm. The experimental bound gives $\theta < 10^{-9}$ to $10^{-10}$. Understanding why CP violation is so small in the strong interactions is the "strong CP problem."
Figure 2: Diagram in which CP-violating coupling of the pion contributes to $d_n$.
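The step from the current algebra estimate $d_n = 5.2 \times 10^{-16}\,\theta$ e cm to a bound on $\theta$ is a one-line division, sketched below. The experimental input $|d_n| < 6.3 \times 10^{-26}$ e cm is an assumed value of the order of the published neutron EDM limits of that period; it is not quoted in the text, and the names are illustrative.

```python
DN_PER_THETA_E_CM = 5.2e-16  # coefficient quoted in the text, in e cm per unit theta


def theta_upper_bound(dn_limit_e_cm: float) -> float:
    """Upper bound on theta implied by an upper limit on |d_n| (in e cm)."""
    return dn_limit_e_cm / DN_PER_THETA_E_CM


if __name__ == "__main__":
    assumed_limit = 6.3e-26  # assumed experimental limit on |d_n| in e cm
    print(theta_upper_bound(assumed_limit))
```

A limit of this size gives a bound of roughly $10^{-10}$, consistent with the range stated above.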
5 Possible Solutions
What should our attitude towards this problem be? We might argue that, after all, some Yukawa couplings are as small as $10^{-5}$, so why is $10^{-9}$ so bad? On the other hand, we suspect that the smallness of the Yukawa couplings is related to approximate symmetries, and that these Yukawa couplings are telling us something. Perhaps there is some explanation of the smallness of $\theta$, and perhaps this is a clue to new physics. In this section we review some of the solutions which have been proposed to understand the smallness of $\theta$.
5.1 Massless u Quark
Suppose the bare mass of the u quark was zero (i.e. at some high scale, the u quark mass were zero). Then, by a redefinition of the u quark field, we could eliminate $\theta$ from the lagrangian. Moreover, as we integrated out physics from this high scale to a lower scale, instanton effects would generate a small u quark mass. In fact, a crude estimate suggests that this mass will be comparable to the estimates usually made from current algebra. Suppose that we construct a Wilsonian action at a scale, say, of order twice the QCD scale. Call this scale $\Lambda_0$. Then we would expect, on dimensional grounds, that the u quark mass would be of order:
$$m_u = \frac{m_d m_s}{\Lambda_0}. \qquad (92)$$
Now everything depends on what you take $\Lambda_0$ to be, and there is much learned discussion about this. The general belief seems to be that the coefficient of this expression needs to be of order three to explain the known facts of the hadron spectrum. There is contentious debate about how plausible this possibility is. Note, even if one does accept this possibility, one would still like to understand why the u quark mass at the high scale is exactly zero (or extremely small). It is interesting that in string theory, one knows of discrete symmetries which are anomalous, i.e. one has a fundamental theory where there are discrete symmetries which can be broken by very tiny effects. Perhaps this could be the resolution of the strong CP problem?
CP as a Spontaneously Broken Symmetry
A second possible solution comes from the observation that if the underlying theory is CP conserving, a "bare" $\theta$ parameter is forbidden. In such a theory, the observed CP violation must arise spontaneously, and the challenge is to understand why this spontaneous CP violation does not lead to a $\theta$ parameter. For example, if the low energy theory contains just the standard model fields, then some high energy breaking of CP must generate the standard model CP violating phase. This must not generate a phase in $\det m_q$, which would be a $\theta$ parameter. Various schemes have been devised to accomplish this [13]. Without supersymmetry, they are generally invoked in the context of grand unification. There, it is easy to arrange that the $\theta$ parameter vanishes at the tree level, including only renormalizable operators. It is then necessary to understand suppression of loop effects and of the contributions of higher dimension operators. In the context of supersymmetry, it turns out that understanding the smallness of $\theta$ in such a framework requires that the squark mass matrix have certain special properties (there must be a high degree of squark degeneracy, and the left-right terms in the squark mass matrices must be nearly proportional to the quark mass matrix) [14]. Again, it is interesting that string theory is a theory in which CP is a fundamental (gauge) symmetry; its breaking is necessarily spontaneous. Some simple string models possess some of the ingredients required to implement the ideas of [13].
The Axion
Perhaps the most popular explanation of the smallness of $\theta$ involves a hypothetical particle called the axion. We present here a slightly updated version of the original idea of Peccei and Quinn [15]. Consider the vacuum energy as a function of $\theta$, eqn. (85). This energy has a minimum at $\theta = 0$, i.e. at the CP conserving point. As Weinberg noted long ago, this is almost automatic: points of higher symmetry are necessarily stationary points. As it stands, this observation is not particularly useful, since $\theta$ is a parameter, not a dynamical variable. But suppose that one has a particle, $a$, with coupling to QCD:
$$\mathcal{L}_{axion} = (\partial_\mu a)^2 + \frac{(a/f_a + \theta)}{32\pi^2} F \tilde F. \qquad (93)$$
$f_a$ is known as the axion decay constant. Suppose that the rest of the theory possesses a symmetry, called the Peccei-Quinn symmetry,
$$a \to a + \alpha \qquad (94)$$
for constant $\alpha$. Then by a shift in $a$, one can eliminate $\theta$. $E(\theta)$ is now $V(a/f_a)$, the potential energy of the axion. It has a minimum at $\theta = 0$. The strong CP problem is solved.
One can estimate the axion mass by simply examining $E(\theta)$:
$$m_a^2 = \frac{m_\pi^2 f_\pi^2}{f_a^2}. \qquad (95)$$
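The axion mass estimate $m_a \approx m_\pi f_\pi / f_a$ can be evaluated for any decay constant with the few lines below. The inputs $m_\pi = 135$ MeV and $f_\pi = 93$ MeV are assumed values, the function name is illustrative, and factors of order one (for example from $m_u \neq m_d$) are ignored, as in the text.

```python
M_PI_GEV = 0.135  # assumed pion mass
F_PI_GEV = 0.093  # assumed pion decay constant


def axion_mass_ev(f_a_gev: float) -> float:
    """Axion mass in eV from m_a = m_pi f_pi / f_a, with f_a given in GeV."""
    return M_PI_GEV * F_PI_GEV / f_a_gev * 1.0e9  # convert GeV to eV


if __name__ == "__main__":
    for f_a in (1.0e3, 1.0e11, 1.0e16):  # 1 TeV, an intermediate scale, a GUT-like scale
        print(f_a, axion_mass_ev(f_a))
```

For $f_a$ of 1 TeV this gives a mass of roughly 10 keV, and for $f_a$ of $10^{16}$ GeV a mass of roughly $10^{-9}$ eV.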
If $f_a \sim$ TeV, this yields a mass of order keV. If $f_a \sim 10^{16}$ GeV, this gives a mass of order $10^{-9}$ eV. As for the $\theta$-dependence of the vacuum energy, it is not difficult to get the factors straight using current algebra methods. A collection of formulae, with great care about factors of 2, appears in [16].
Actually, there are several questions one can raise about this proposal:
• Should the axion already have been observed? In fact, as Scott Thomas will explain in greater detail in his lectures, the couplings of the axion to matter can be worked out in a straightforward way, using the methods of current algebra (in particular of non-linear lagrangians). All of the couplings of the axion are suppressed by powers of $f_a$. So if $f_a$ is large enough, the axion is difficult to see. The strongest limit turns out to come from red giant stars. The production of axions is "semiweak," i.e. it only is suppressed by one power of $f_a$, rather than two powers of $m_W$; as a result, axion emission is competitive with neutrino emission until $f_a > 10^{10}$ GeV or so.
• As we will describe in a bit more detail below, the axion can be copiously produced in the early universe. As a result, there is an upper bound on the axion decay constant. In this case, as we will explain below, the axion could constitute the dark matter.
• Can one search for the axion experimentally [17]? Typically, the axion couples not only to the $F \tilde F$ of QCD, but also to the same object in QED. This means that in a strong magnetic field, an axion can convert to a photon. Precisely this effect is being searched for by a group at Livermore (the collaboration contains members from MIT, University of Florida) and Kyoto. The basic idea is to suppose that the dark matter in the halo consists principally of axions. Using a (superconducting) resonant cavity with a high Q value in a large magnetic field, one searches for the conversion of these axions into excitations of the cavity. The experiments have already reached a level where they set interesting limits; the next generation of experiments will cut a significant swath in the presently allowed parameter space.
• The coupling of the axion to $F \tilde F$ violates the shift symmetry; this is why the axion can develop a potential. But this seems rather paradoxical: one postulates a symmetry, preserved to some high degree of approximation, but which is not a symmetry; it is at least broken by tiny QCD effects. Is this reasonable? To understand the nature of the problem, consider one of the ways an axion can arise. In some approximation, we can suppose we have a global symmetry under which a scalar field, $\phi$, transforms as
$$\phi \to e^{i\alpha} \phi. \qquad (96)$$
If this field couples to fermions, so that they gain mass from its expectation value, then at one loop we generate a coupling $a F \tilde F$ from integrating out the fermions. This calculation is identical to the corresponding calculation for pions we discussed earlier. But we usually assume that global symmetries in nature are accidents. For example, baryon number is conserved in the standard model simply because there are no gauge-invariant, renormalizable operators which violate the symmetry. We believe it is violated by higher dimension terms. The global symmetry we postulate here is presumably an accident of the same sort. But for the axion, the symmetry must be extremely good. For example, suppose one has a symmetry breaking operator
$$\frac{\phi^{n+4}}{M_p^n}. \qquad (97)$$
Such a term gives a linear contribution to the axion potential of order $f_a^{n+3}/M_p^n$. If $f_a \sim 10^{11}$ GeV, this swamps the would-be QCD contribution ($m_\pi^2 f_\pi^2 / f_a$) unless $n > 12$ [18]!
This last objection finds an answer in string theory. In this theory, there are axions, with just the right properties, i.e. there are symmetries in the theory which are exact in perturbation theory, but which are broken by exponentially small non-perturbative effects. The most natural value for $f_a$ would appear to be of order $M_{GUT}$ to $M_p$. Whether this can be made compatible with cosmology, or whether one can obtain a lower scale, is an open question.
6 The Axion in Cosmology
Despite the fact that it is so weakly coupled, the axion can be copiously produced in the early universe [19]. The point is the weak coupling itself. In the early universe, we know the temperature was once at least 1 MeV, and if the temperature was above a GeV, the potential of the axion was irrelevant. Indeed, if the universe is radiation dominated, the equation of motion for the axion is:
$$\ddot a + \frac{3}{2t} \dot a + V'(a) = 0. \qquad (98)$$
For $t^{-1} \gg m_a$, the system is overdamped, and the axion does not move. There is no obvious reason that the axion should sit at its minimum in this early era. So one can imagine that the axion sits away from its minimum until $t \sim m_a^{-1}$, and then begins to roll. For $f_a \sim M_p$, this occurs at the QCD temperature; for smaller $f_a$ it occurs earlier. After this, the axion starts to oscillate in its potential; it looks like a coherent state of zero momentum particles. At large times,
$$a(t) \propto t^{-3/4} \cos(m_a t), \qquad (99)$$
so the density is simply diluted by the expansion. The energy density in radiation dilutes like $T^4$, so eventually the axion comes to dominate the energy density. If $f_a \sim M_p$, the axion energy density is comparable when oscillations start. If $f_a$ is smaller, oscillation starts earlier and there is more damping. Detailed study (including the finite temperature behavior of the axion potential) gives a limit $f_a < 10^{11}$ GeV.
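The overdamped and oscillating regimes described above can be seen by integrating the axion equation of motion numerically. The sketch below assumes a radiation dominated universe with Hubble rate $H = 1/(2t)$, a quadratic potential $V = m_a^2 a^2/2$, units with $m_a = 1$, and a field that starts at rest away from its minimum; the integrator (fourth-order Runge-Kutta) and all names are illustrative choices rather than anything taken from the text.

```python
def evolve_axion(t_end, m_a=1.0, t0=0.01, dt=0.002, a0=1.0):
    """Integrate a'' + (3 / (2 t)) a' + m_a^2 a = 0 with RK4 from t0 to t_end.

    Returns the pair (a, da/dt) at t_end.
    """
    def rhs(t, a, v):
        # First-order form: da/dt = v, dv/dt = -(3 / (2 t)) v - m_a^2 a
        return v, -1.5 / t * v - m_a ** 2 * a

    n_steps = int(round((t_end - t0) / dt))
    t, a, v = t0, a0, 0.0  # field starts at rest, displaced from the minimum
    for _ in range(n_steps):
        k1a, k1v = rhs(t, a, v)
        k2a, k2v = rhs(t + dt / 2, a + dt / 2 * k1a, v + dt / 2 * k1v)
        k3a, k3v = rhs(t + dt / 2, a + dt / 2 * k2a, v + dt / 2 * k2v)
        k4a, k4v = rhs(t + dt, a + dt * k3a, v + dt * k3v)
        a += dt / 6 * (k1a + 2 * k2a + 2 * k3a + k4a)
        v += dt / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
        t += dt
    return a, v


def energy_density(a, v, m_a=1.0):
    """Kinetic plus potential energy density of the homogeneous field."""
    return 0.5 * v ** 2 + 0.5 * m_a ** 2 * a ** 2


if __name__ == "__main__":
    print(evolve_axion(0.1)[0])  # early times: the field has barely moved
    for t in (50.0, 200.0):
        a, v = evolve_axion(t)
        print(t, energy_density(a, v) * t ** 1.5)  # roughly constant at late times
```

At late times the product of the energy density and $t^{3/2}$ settles to a constant, which is the dilution of non-relativistic matter in a radiation dominated universe, where the scale factor grows as $t^{1/2}$.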
7 Conclusions: Outlook
The strong CP problem, on the one hand, seems very subtle, but on the other hand it is in many ways similar to the other problems of flavor which we confront when we examine the standard model. $\theta$ is one more parameter which is surprisingly small. The smallness of the Yukawa couplings may well be the result of approximate symmetries. Similarly, all of the suggestions we have discussed above to understand the smallness of $\theta$ involve approximate symmetries of one sort or another. In the case of other ideas about flavor, there is often no compelling argument for the scale of breaking of the symmetries. The scale could be so high as to be unobservable, and there is little hope for testing the hypothesis. What is perhaps most exciting about the axion is that if we accept the cosmological bound, the axion might well be observable. That said, one should recognize that there are reasons to think that the axion scale might be higher. In particular, as mentioned earlier, string theory provides one of the most compelling settings for axion physics, and one might well expect the Peccei-Quinn scale to be of order the GUT scale or Planck scale. There have been a number of suggestions in the literature as to how the cosmological bound might be evaded in this context.
At a theoretical level, there are other areas in which the axion is of interest. Such particles inevitably appear in string theory and in supersymmetric field theories. (Indeed, it is in this context that Peccei-Quinn symmetries of the required type for QCD most naturally appear.) These symmetries and the associated axions are a powerful tool for understanding these theories.
Acknowledgements: This work was supported in part by a grant from the U.S. Department of Energy.
1. The text, M.E. Peskin and D.V. Schroeder, An Introduction to Quantum Field Theory, Addison Wesley, Menlo Park (1995), has an excellent introduction to anomalies (chapter 19).
2. E. Witten, "Dyons of Charge eθ/2π," Phys. Lett. 86B (1979) 283. For a pedagogical introduction to monopoles, I strongly recommend J.A. Harvey, "Magnetic Monopoles, Duality and Supersymmetry," hep-th/9603086.
3. J. Kogut and L. Susskind, Phys. Rev. D11 (1976) 3594.
4. I. Affleck, Phys. Lett. B92 (1980) 149.
5. G. 't Hooft, Phys. Rev. D14 (1976) 3432.
6. A.A. Belavin, A.M. Polyakov, A.S. Schwartz and Y.S. Tyupkin, Phys. Lett. 59B (1975) 85.
7. S. Coleman, "The Uses of Instantons," in Aspects of Symmetry, Cambridge University Press, Cambridge, 1985.
8. R.J. Crewther, P. Di Vecchia, G. Veneziano and E. Witten, Phys. Lett. 88B (1979) 123.
9. I. Affleck, "On Constrained Instantons," Nucl. Phys. B191 (1981) 429.
10. E. Witten, Nucl. Phys. B149 (1979) 285.
11. D. Gross, R. Pisarski and L.G. Yaffe, "QCD and Instantons at Finite Temperature," Rev. Mod. Phys. 53 (1981) 43.
12. A.G. Cohen, D.B. Kaplan and A.E. Nelson, "Progress in Electroweak Baryogenesis," Ann. Rev. Nucl. Part. Sci. 43 (1993) 27, hep-ph/9302210.
13. A. Nelson, Phys. Lett. 136B (1984) 387; S.M. Barr, Phys. Rev. Lett. 53 (1984) 329.
14. M. Dine, R. Leigh and A. Kagan, Phys. Rev. D48 (1993) 4269, hep-ph/9304299.
15. R.D. Peccei and H.R. Quinn, Phys. Rev. Lett. 38 (1977) 1440; Phys. Rev. D16 (1977) 1791.
16. M. Srednicki, "Axion Couplings to Matter: CP Conserving Parts," Nucl. Phys. B260 (1985) 689.
17. P. Sikivie, "Axion Searches," Nucl. Phys. Proc. Suppl. 87 (2000) 41, hep-ph/0002154.
18. M. Kamionkowski and J. March-Russell, "Planck Scale Physics and the Peccei-Quinn Mechanism," Phys. Lett. B282 (1992) 137, hep-th/9202003.
19. M. Turner, "Windows on the Axion," Phys. Rept. 197 (1990) 67.
Carl E. Wieman
A BIBLIOGRAPHY OF ATOMIC PARITY VIOLATION AND ELECTRIC DIPOLE MOMENT EXPERIMENTS
CARL E. WIEMAN
JILA, University of Colorado and National Institute of Standards and Technology, 440 UCB, Boulder, CO 80309-0440, USA
E-mail: [email protected]
This bibliography contains references on searches for electric dipole moments of atoms and neutrons, and on atomic parity violation experiments and theory. [The author felt that a separate set of lecture notes would duplicate material already published in these references - Editor.]
Electric dipole moments of atoms and neutrons
1. E.D. Commins, CP Violation in Atomic and Nuclear Physics (Proceedings of the XXVII SLAC Summer Institute on Particle Physics, 1999). An excellent review of the experimental and theoretical aspects of the subject in the Proceedings of the XXVII SLAC Summer School. This also has an extensive list of relevant references. See http://www.slac.stanford.edu/gen/meeting/ssi/1999/index.html.
2. Y. Nir, CP Violation in and beyond the Standard Model (Proceedings of the XXVII SLAC Summer Institute on Particle Physics, 1999). A review of the general theory of CP violation and its connection with EDMs. See http://www.slac.stanford.edu/gen/meeting/ssi/1999/index.html.
3. L.I. Schiff, "Measurability of nuclear electric dipole moments," Phys. Rev. 132, 2194 (1963). Presents Schiff's Theorem about the EDM in an atom (or lack thereof) due to an EDM of the nucleus or the electron.
4. P.G.H. Sandars, "The electric dipole moment of an atom," Phys. Lett. 14, 194 (1965); P.G.H. Sandars, "Enhancement factor for the electric dipole moment of the valence electron in an alkali atom," Phys. Lett. 22, 290 (1966). Discusses how Schiff's Theorem does not apply to heavy atoms, so that the atomic EDM is actually enhanced over the EDM of the electron.
5. J.P. Jacobs, W.M. Klipstein, S.K. Lamoreaux, B.R. Heckel, and E.N. Fortson, "Limit on the electric-dipole moment of Hg-199 using synchronous optical-pumping," Phys. Rev. A 52, 3521 (1995). Provides best limits on T violating nuclear-nuclear interactions by searching for EDM of the mercury atom.
6. S.A. Murthy, D. Krause, Z.L. Li, and L.R. Hunter, "New limits on the electron electric-dipole moment from cesium," Phys. Rev. Lett. 63, 965 (1989). Provides the best limit on electron EDM that has been obtained in a vapor cell experiment.
7. E.D. Commins, S.B. Ross, D. DeMille, and B.C. Regan, "Improved experimental limit on the electric-dipole moment of the electron," Phys. Rev. A 50, 2960 (1994). Provides best current limit on the EDM of the electron, although improvements are expected soon.
8. P.G. Harris, C.A. Baker, K. Green, P. Iaydjiev, S. Ivanov, D.J.R. May, J.M. Pendlebury, D. Shiers, K.F. Smith, M. van der Grinten, P. Geltenbort, "New experimental limit on the electric dipole moment of the neutron," Phys. Rev. Lett. 82, 904 (1999). Provides the best limits on the EDM of the neutron.
9. E.A. Hinds, "Testing time reversal symmetry using molecules," Physica Scripta T70, 34 (1997). Discusses the possibilities for using molecules to measure electron and nuclear EDMs.
Atomic parity violation
10. S.A. Blundell, J. Sapirstein, and W.R. Johnson, "Relativistic all-order calculations of energies and matrix-elements in cesium," Phys. Rev. A 43, 3407 (1991). Discusses the atomic structure calculations used in connecting atomic PV experiments to the standard model.
11. S.A. Blundell, W.R. Johnson, and J. Sapirstein, "3rd-order many-body perturbation-theory calculations of the ground-state energies of cesium and thallium," Phys. Rev. A 42, 3751 (1990).
12. S.A. Blundell, W.R. Johnson, and J. Sapirstein, "High-accuracy calculation of the 6S1/2 to 7S1/2 parity-nonconserving transition in atomic cesium and implications for the standard model," Phys. Rev. Lett. 65, 1411 (1990).
13. S.A. Blundell, J. Sapirstein, and W.R. Johnson, "High-accuracy calculation of parity nonconservation in cesium and implications for particle physics," Phys. Rev. D 45, 1602 (1992).
14. V.A. Dzuba, V.V. Flambaum, and O.P. Sushkov, "Polarizabilities and parity nonconservation in the Cs atom and limits on the deviation from the standard electroweak model," Phys. Rev. A 56, R4357-R4360 (1997).
15. V.A. Dzuba, V.V. Flambaum, and O.P. Sushkov, "Summation of the high orders of perturbation theory in the correlation correction for the parity violating E1-amplitude of the 6s to 7s transition in the cesium atom," Phys. Lett. A 141, 147 (1989).
16. V.A. Dzuba, V.V. Flambaum, and O.P. Sushkov, "Summation of the perturbation theory high order contributions to the correlation correction for the energy levels of the cesium atom," Phys. Lett. A 140, 493 (1989).
17. V.A. Dzuba, V.V. Flambaum, A. Ya. Kraftmakher, and O.P. Sushkov, "Summation of the high orders of perturbation theory in the correlation correction to the hyperfine structure and to the amplitudes of E1-transitions in the cesium atom," Phys. Lett. A 142, 373 (1989).
18. C.S. Wood, S.C. Bennett, D. Cho, B.P. Masterson, J.L. Roberts, C. Tanner and C.E. Wieman, "Measurement of parity nonconservation and an anapole moment in cesium," Science 275, 1759 (1997). Provides the best experimental measurement of atomic parity violation.
19. S.C. Bennett and C.E. Wieman, "Measurement of the 6S to 7S transition polarizability in atomic cesium and an improved test of the standard model," Phys. Rev. Lett. 82, 2484 (1999). Provides the calibration needed to interpret the parity violation measurement in terms of fundamental coupling constants, and also reexamines the accuracy of the earlier atomic theory.
20. C.S. Wood, S.C. Bennett, J.L. Roberts, D. Cho and C.E. Wieman, "Precision measurement of parity nonconservation in cesium," Canad. J. Phys. 77, 7 (1999). Provides a very extensive discussion of the potential systematic errors of the cesium PNC experiment.
21. M.A. Bouchiat and C. Bouchiat, "Parity violation in atoms," Reports on Progress in Physics 60, 1351 (1997). Provides a general review of atomic PNC experiments.
22. J. Erler and P. Langacker, "Constraints on extended neutral gauge structures," Phys. Lett. B 456, 68 (1999); J. Erler and P. Langacker, "Indications for an extra neutral gauge boson in electroweak precision data," Phys. Rev. Lett. 84, 212 (2000). Provides an analysis of precision weak interaction measurements and the constraints they set on extensions of the standard model.
23. A. Derevianko, "Reconciliation of the measurement of parity nonconservation in Cs with the standard model," Phys. Rev. Lett. 85, 1618 (2000). Provides an update of the atomic theory calculations of 1 and shows how they are shifted by the Breit correction.
Adam F. Falk
THE CKM MATRIX AND THE HEAVY QUARK EXPANSION
ADAM F. FALK
Department of Physics and Astronomy, The Johns Hopkins University, 3400 North Charles Street, Baltimore, Maryland 21218
These lectures contain an elementary introduction to heavy quark symmetry and the heavy quark expansion. Applications such as the expansion of heavy meson decay constants and the treatment of inclusive and exclusive semileptonic $B$ decays are included. The use of heavy quark methods for the extraction of $|V_{cb}|$ and $|V_{ub}|$ is presented in some detail.
1 Heavy Quark Symmetry
In these lectures I will introduce the ideas of heavy quark symmetry and the heavy quark limit, which exploit the simplification of certain aspects of QCD for infinite quark mass, $m_Q \to \infty$. We will see that while these ideas are extraordinarily simple from a physical point of view, they are of enormous practical utility in the study of the phenomenology of bottom and charmed hadrons. One reason for this is the existence not just of an interesting new limit of QCD, but of a systematic expansion about this limit. The technology of this expansion is the Heavy Quark Effective Theory (HQET), which allows one to use heavy quark symmetry to make accurate predictions of the properties and behavior of heavy hadrons in which the theoretical errors are under control. While the emphasis in these lectures will be on the physical picture of heavy hadrons which emerges in the heavy quark limit, it will be important to introduce enough of the formalism of the HQET to reveal the structure of the heavy quark expansion as a simultaneous expansion in powers of $\Lambda_{\rm QCD}/m_Q$ and $\alpha_s(m_Q)$. However, what I hope to leave you with above all is an appreciation for the simplicity, elegance and coherence of the ideas which underlie the technical results which will be presented. The interested reader is also encouraged to consult a number of excellent reviews,$^{1}$ which typically cover in more detail the material in these lectures.
1.1 Introduction
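We begin by recalling the properties of charged current interactions in the Standard Model. They are mediated by the interactions with the $W^\pm$ bosons, which for the quarks take the form
$$ (\bar u\ \ \bar c\ \ \bar t)\,\gamma^\mu(1-\gamma^5)\,V_{\rm CKM}\begin{pmatrix} d \\ s \\ b \end{pmatrix} W_\mu + {\rm h.c.} \tag{1} $$
The $3\times 3$ unitary matrix $V_{\rm CKM}$ is
$$ V_{\rm CKM} = \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix}. \tag{2} $$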
The elements of $V_{\rm CKM}$ have a hierarchical structure, getting smaller away from the diagonal: $V_{ud}$, $V_{cs}$ and $V_{tb}$ are of order 1, $V_{us}$ and $V_{cd}$ are of order $10^{-1}$, $V_{cb}$ and $V_{ts}$ are of order $10^{-2}$, and $V_{ub}$ and $V_{td}$ are of order $10^{-3}$. By contrast, except for small effects associated with neutrino masses, the interaction of the $W^\pm$ with the leptons is flavor diagonal. The CKM matrix is of fundamental importance, because it is the low energy manifestation of the higher-energy physics which breaks the global flavor symmetries of the Standard Model. In the absence of the Yukawa couplings, the quark sector of the Standard Model may be characterized by its gauge symmetry $SU(3)\times SU(2)\times U(1)$, and its global symmetry $U(3)_Q\times U(3)_U\times U(3)_D$, which rotates the triplets of colored fields $Q_L^i$, $U_R^i$ and $D_R^i$ among each other. The global symmetry group has a total of $3\times 3^2 = 27$ generators. With the addition of fields $\phi$ and $\tilde\phi$, one may write Yukawa couplings
$$ \lambda_U^{ij}\,\bar Q_L^i U_R^j\,\phi + \lambda_D^{ij}\,\bar Q_L^i D_R^j\,\tilde\phi + {\rm h.c.}, \tag{3} $$
which break the flavor symmetries explicitly. The complex matrices $\lambda_U^{ij}$ and $\lambda_D^{ij}$ correspond to $2\times(2\times 3^2) = 36$ independent parameters. They break the global flavor symmetries completely, except for a remaining conserved baryon number $U(1)_B$. The $27 - 1 = 26$ broken generators may be used to rotate 26 of the Yukawa couplings to zero, leaving $36 - 26 = 10$ physical parameters. When $\phi$ and $\tilde\phi$ get vacuum expectation values, one may go to the mass eigenstate basis for the quark fields, in which case the ten parameters are six quark masses and four parameters characterizing $V_{\rm CKM}$. An independent examination of $V_{\rm CKM}$ confirms this counting. A $3\times 3$ unitary matrix has 9 parameters, of which $\binom{3}{2} = 3$ are angles and the remaining 6 are complex phases. However, one may adjust 5 relative phases of the mass eigenstate quark fields, leaving $6 - 5 = 1$ physical phase in $V_{\rm CKM}$. Thus $V_{\rm CKM}$ is indeed characterized by four parameters, one of which is a CP-violating phase.
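The parameter counting above is simple bookkeeping, so it can be checked mechanically. The following is a minimal sketch, assuming the same logic applies unchanged to `n_f` generations (two complex Yukawa matrices, a $U(n_f)^3$ flavor group, and one unbroken baryon number); the function name `flavor_parameter_count` is introduced here purely for illustration.

```python
def flavor_parameter_count(n_f):
    """Count physical flavor parameters for n_f quark generations.

    Follows the counting in the text: Yukawa parameters minus broken
    global-symmetry generators. The generalization from 3 to n_f
    generations is an assumption of this sketch.
    """
    yukawa_real = 2 * (2 * n_f ** 2)   # two complex n_f x n_f matrices
    generators = 3 * n_f ** 2          # U(n_f)_Q x U(n_f)_U x U(n_f)_D
    broken = generators - 1            # baryon number U(1)_B survives
    physical = yukawa_real - broken    # parameters that cannot be rotated away
    masses = 2 * n_f                   # up-type and down-type quark masses
    ckm = physical - masses            # what is left characterizes V_CKM
    angles = n_f * (n_f - 1) // 2      # rotation angles of an n_f x n_f matrix
    phases = ckm - angles              # remaining CP-violating phases
    return {
        "physical": physical,
        "masses": masses,
        "ckm": ckm,
        "angles": angles,
        "phases": phases,
    }


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, flavor_parameter_count(n))
```

For three generations this reproduces the ten physical parameters quoted above (six masses, three angles, one phase); for two generations the number of phases comes out as zero.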
It is an instructive exercise, left to the reader, to perform the analogous counting for the general case of $U(N_f)_Q\times U(N_f)_U\times U(N_f)_D$ global flavor symmetry. One finds that the Yukawa couplings contain $N_f^2 + 1$ physical parameters, of which $2N_f$ are quark masses. The remaining $(N_f - 1)^2$ parameters characterize $V_{\rm CKM}$, with $\frac{1}{2}N_f(N_f - 1)$ being angles and $\frac{1}{2}(N_f - 1)(N_f - 2)$ being complex phases. In particular, one notes that for $N_f = 2$ there is no CP violation in the weak interactions.
Why is an understanding of QCD crucial to the study of the properties of $V_{\rm CKM}$? As an example, consider semileptonic $b$ decay, $b \to c\,\ell\bar\nu$, from which one would like to extract $|V_{cb}|$. This process is mediated by a four-fermion operator,
$$ O_{bc} = \frac{G_F V_{cb}}{\sqrt{2}}\,\bar c\gamma^\mu(1-\gamma^5)b\;\bar\ell\gamma_\mu(1-\gamma^5)\nu. \tag{4} $$
The weak matrix element is easy to calculate at the quark level,
$$ \mathcal{A}_{\rm quark} = \langle c\,\ell\,\bar\nu|\,O_{bc}\,|b\rangle = \frac{G_F V_{cb}}{\sqrt{2}}\,\bar u_c(p_c)\gamma^\mu(1-\gamma^5)u_b(p_b)\;\bar u_\ell(p_\ell)\gamma_\mu(1-\gamma^5)u_\nu(p_\nu). \tag{5} $$
However, $\mathcal{A}_{\rm quark}$ is only relevant at very short distances; at longer distances, QCD confinement implies that free $b$ and $c$ quarks are not asymptotic states of the theory. Instead, nonperturbative QCD effects "dress" the quark level transition $b \to c\,\ell\bar\nu$ to a hadronic transition, such as
$$ B \to D\,\ell\bar\nu \quad\text{or}\quad B \to D^*\ell\bar\nu \quad\text{or}\quad \ldots \tag{6} $$
(In these lectures, we will use a convention in which a $B$ meson contains a $b$ quark, not a $\bar b$ antiquark.) The hadronic matrix element $\mathcal{A}_{\rm hadron}$ depends on nonperturbative QCD as well as on $G_F V_{cb}$, and is difficult to calculate from first principles. To disentangle the weak interaction part of this complicated process requires us to develop some understanding of the strong interaction effects. There are a variety of methods by which one can do this. Perhaps the most popular, historically, has been the use of various quark potential models.$^{2}$ While these models are typically very predictive, they are based on uncontrolled assumptions and approximations, and it is virtually impossible to estimate the theoretical errors associated with their use. This is a serious defect if one builds such a model into the experimental extraction of a weak coupling constant such as $V_{cb}$, because the uncontrolled theoretical errors then infect the experimental result. These are issues which are important for the extraction of all the elements of $V_{\rm CKM}$.
Let us pause now to review our current experimental knowledge of each of the magnitudes. The results are taken from the Particle Data Book.$^{3}$ (The phases of the matrix elements must be extracted from CP violating asymmetries, as discussed elsewhere at this school.) We start with the submatrix describing mixing among the first two generations. The parameter $|V_{ud}|$ is measured by studying the rates for neutron and nuclear $\beta$ decay. Here the isospin symmetry of the strong interactions may be used to control the nonperturbative dynamics, since the operator $\bar d\gamma^\mu(1-\gamma^5)u$ which mediates the decay is a partially conserved current associated with a generator of chiral $SU(2)_L\times SU(2)_R$. The current data yield
$$ |V_{ud}| = 0.9735 \pm 0.0008, \tag{7} $$
so $|V_{ud}|$ is known at the level of 0.1%. The parameter $|V_{us}|$ is measured similarly, via $K \to \pi\ell\nu_\ell$ and $\Lambda \to p\,\ell\bar\nu_\ell$. Here chiral $SU(3)_L\times SU(3)_R$ must be used in the hadronic matrix elements, since a strange quark is involved; because the $m_s$ corrections are larger, $|V_{us}|$ is only known to 1%:
$$ |V_{us}| = 0.2196 \pm 0.0023. \tag{8} $$
The $V_{\rm CKM}$ elements involving the charm quark are not so well measured. One way to extract $|V_{cs}|$ is to study the decay $D \to K\ell^+\nu_\ell$. Unfortunately, there is no symmetry by which one can control the matrix element $\langle K|\,\bar s\gamma^\mu(1-\gamma^5)c\,|D\rangle$, since flavor $SU(4)$ is badly broken. One is forced to resort to models for these matrix elements. The reported value is
$$ |V_{cs}| = 1.04 \pm 0.16, \tag{9} $$
but it must be said that this error estimate is not on very firm footing, and should probably be taken to be substantially larger. An alternative is to measure $V_{cs}$ from inclusive processes at higher energies. For example, one can study the branching fraction for $W^+ \to c\bar s$, which can be computed using perturbative QCD. The result of a preliminary analysis is
$$ |V_{cs}| = 1.00 \pm 0.13, \tag{10} $$
consistent with the model-dependent measurement. In this case, however, the error is largely experimental, and is unpolluted by hadronic physics. Similarly, one extracts $|V_{cd}|$ from deep inelastic neutrino scattering, using the process $\nu_\mu + d \to c + \mu^-$. This inclusive process may be computed perturbatively in QCD, leading to a result with accuracy at the level of 10%,
$$ |V_{cd}| = 0.224 \pm 0.016. \tag{11} $$
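To compare the quoted precisions at a glance, one can tabulate the relative uncertainty of each magnitude in Eqs. (7) to (11). This is a minimal sketch that uses only the central values and errors quoted above; the dictionary keys and the helper name `relative_uncertainty` are illustrative choices, not notation from the lectures.

```python
# Central values and errors as quoted in Eqs. (7)-(11) of the text.
MEASUREMENTS = {
    "Vud": (0.9735, 0.0008),
    "Vus": (0.2196, 0.0023),
    "Vcs (D -> K l nu)": (1.04, 0.16),
    "Vcs (W -> c s)": (1.00, 0.13),
    "Vcd": (0.224, 0.016),
}


def relative_uncertainty(value, error):
    """Return the fractional uncertainty error/value."""
    return error / value


def precision_table(measurements=MEASUREMENTS):
    """Map each label to its relative uncertainty in percent."""
    return {
        label: 100.0 * relative_uncertainty(value, error)
        for label, (value, error) in measurements.items()
    }


if __name__ == "__main__":
    for label, percent in precision_table().items():
        print(f"{label:20s} {percent:6.2f} %")
```

The light-quark elements come out at the per-mille to percent level, while the charm elements sit at roughly ten percent, which is the hierarchy of precision described in the text.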
The elements of $V_{\rm CKM}$ involving the third generation are, for the most part, harder to measure accurately. The branching ratio for $t \to b\,\ell^+\nu$ can be analyzed perturbatively, but the experimental data are not very good. The present bound on $|V_{tb}|$ is
$$ \frac{|V_{tb}|^2}{|V_{td}|^2 + |V_{ts}|^2 + |V_{tb}|^2} = 0.99 \pm 0.29. \tag{12} $$
If one imposed the unitarity constraint $|V_{td}|^2 + |V_{ts}|^2 + |V_{tb}|^2 = 1$, then this would amount to a 15% measurement of $|V_{tb}|$, but this unitarity constraint is one of the properties of $V_{\rm CKM}$ which one is trying to test. More generally, in fact, one should be wary of constraints on $V_{\rm CKM}$ which impose unitarity as part of the analysis; while one often obtains tighter constraints in this way, these constraints have a different meaning than do direct measurements. What the comparison of a constrained and unconstrained "measurement" really tells you is whether the direct determination may be used to test the unitarity of $V_{\rm CKM}$. Unfortunately, there are as yet no direct extractions of $|V_{td}|$ or $|V_{ts}|$. One often speaks of these elements being measured in $B_d - \bar B_d$ and $B_s - \bar B_s$ mixing, but again, the hypothesis that the Standard Model is responsible for these processes is something that one really wants to check. The correct way to view this part of the experimental program is to say that the Standard Model, including the unitarity of $V_{\rm CKM}$, constrains $V_{ts}$ and $V_{td}$ severely enough that testable predictions can be made for the mixing parameters $\Delta m_d$ and $\Delta m_s$.
This leaves us with the matrix elements $V_{ub}$ and $V_{cb}$, for which we need an understanding of $B$ meson decay. In these lectures we will discuss an approach to understanding the relevant hadronic physics which exploits the fact that the $b$ and $c$ quarks are heavy, by which we mean that $m_b, m_c \gg \Lambda_{\rm QCD}$. The scale $\Lambda_{\rm QCD}$ is the typical energy at which QCD becomes nonperturbative, and is of the order of hundreds of MeV. The physical quark masses are approximately $m_b \approx 4.8$ GeV and $m_c \approx 1.5$ GeV. The formalism which we will develop will not make as many predictions as do potential models. However, the compensation will be that we will develop a systematic expansion in powers of $\Lambda_{\rm QCD}/m_{b,c}$, within which we will be able to do concrete error analysis. In particular, we will be able to estimate the error associated with the fact that $m_c$ may not be very close to the asymptotic limit $m_c \gg \Lambda_{\rm QCD}$. Even where this error may be substantial, the fact that it is under control allows us to maintain predictive power in the theory.
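To get a feeling for how good the expansion is likely to be, one can simply evaluate the expansion parameter for the two heavy quarks. This is a rough sketch: the quark masses are those quoted above, while the range 0.3 to 0.5 GeV used for $\Lambda_{\rm QCD}$ is an assumption standing in for "hundreds of MeV", and the function names are illustrative.

```python
M_B_QUARK = 4.8   # GeV, value quoted in the text
M_C_QUARK = 1.5   # GeV, value quoted in the text
LAMBDA_QCD_RANGE = (0.3, 0.5)  # GeV, assumed stand-in for "hundreds of MeV"


def expansion_parameter(lambda_qcd, m_quark):
    """Leading heavy quark expansion parameter Lambda_QCD / m_Q."""
    return lambda_qcd / m_quark


def expansion_table(lambdas=LAMBDA_QCD_RANGE):
    """First- and second-order parameters for the b and c quarks."""
    table = {}
    for name, mass in (("b", M_B_QUARK), ("c", M_C_QUARK)):
        for lam in lambdas:
            eps = expansion_parameter(lam, mass)
            table[(name, lam)] = (eps, eps ** 2)
    return table


if __name__ == "__main__":
    for (quark, lam), (first, second) in expansion_table().items():
        print(f"{quark}: Lambda={lam} GeV  first order={first:.3f}  second order={second:.3f}")
```

With these inputs the leading correction is below about ten percent for the $b$ quark but can reach tens of percent for the $c$ quark, which is why the finite charm mass deserves special attention.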
1.2 The heavy quark limit
Consider a hadron $H_Q$ composed of a heavy quark $Q$ and "light degrees of freedom", consisting of light quarks, light antiquarks and gluons, in the limit $m_Q \to \infty$. The Compton wavelength of the heavy quark scales as the inverse of the heavy quark mass, $\lambda_Q \sim 1/m_Q$. The light degrees of freedom, by contrast, are characterized by momenta of order $\Lambda_{\rm QCD}$, corresponding to wavelengths $\lambda_\ell \sim 1/\Lambda_{\rm QCD}$. Since $\lambda_\ell \gg \lambda_Q$, the light degrees of freedom cannot resolve features of the heavy quark other than its conserved gauge quantum numbers. In particular, they cannot probe the actual value of $\lambda_Q$, that is, the value of $m_Q$.
We may draw the same conclusion in momentum space. The structure of the hadron $H_Q$ is determined by nonperturbative strong interactions. The asymptotic freedom of QCD implies that when quarks and gluons exchange momenta $p$ much larger than $\Lambda_{\rm QCD}$, the process is perturbative in the strong coupling constant $\alpha_s(p)$. On the other hand, the typical momenta exchanged by the light degrees of freedom with each other and with the heavy quark are of order $\Lambda_{\rm QCD}$, for which a perturbative expansion is of no use. For these exchanges, however, $p \ll m_Q$, and the heavy quark $Q$ does not recoil, remaining at rest in the rest frame of the hadron. In this limit, $Q$ acts as a static source of electric and chromoelectric gauge field. The chromoelectric field, which holds $H_Q$ together, is nonperturbative in nature, but it is independent of $m_Q$. The result is that the properties of the light degrees of freedom depend only on the presence of the static gauge field, independent of the flavor and mass of the heavy quark carrying the gauge charge.$^{a}$
There is an immediate implication for the spectroscopy of heavy hadrons. Since the interaction of the light degrees of freedom with the heavy quark is independent of $m_Q$, then so is the spectrum of their excitations. It is these excitations which determine the spectrum of heavy hadrons $H_Q$.
Hence the splittings $\Delta_i \sim \Lambda_{\rm QCD}$ between the various hadrons $H_Q$ are independent of $Q$ and, in the limit $m_Q \to \infty$, do not scale with $m_Q$. For example, the bottom and charmed meson spectra are shown schematically in Fig. 1, in the limit $m_b, m_c \gg \Lambda_{\rm QCD}$. The light degrees of freedom are in exactly the same state in the mesons $B_i$ and $D_i$, for a given $i$. The offset $B_i - D_i = m_b - m_c$ is just the difference between the heavy quark masses; in no way does the relationship between the spectra rely on an approximation $m_b \approx m_c$.
Figure 1: Schematic spectra of the bottom and charmed mesons in the limit $m_b, m_c \gg \Lambda_{\rm QCD}$. The offset of the two spectra is not to scale; in reality, $m_b - m_c \gg \Delta_i \sim \Lambda_{\rm QCD}$.
$^{a}$ Top quarks decay too quickly for a static chromoelectric field to be established around them, so the simplifications discussed here are not relevant to them.
We can enrich this picture by recalling that the heavy quarks and light degrees of freedom also carry spin. The heavy quark has spin quantum number $S_Q = \frac{1}{2}$, which leads to a chromomagnetic moment
$$ \mu_Q = \frac{g}{2m_Q}. \tag{13} $$
Note that $\mu_Q \to 0$ as $m_Q \to \infty$, and the interaction between the spin of the heavy quark and the light degrees of freedom is suppressed. Hence the light degrees of freedom are insensitive to $S_Q$; their state is independent of whether $S_Q = \frac{1}{2}$ or $S_Q = -\frac{1}{2}$. Thus each of the energy levels in Fig. 1 is actually doubled, one state for each possible value of $S_Q$. To summarize, what we see is that the light degrees of freedom are the same when combined with any of the following heavy quark states:
$$ Q_1(\uparrow),\ Q_1(\downarrow);\quad Q_2(\uparrow),\ Q_2(\downarrow);\quad \ldots\quad Q_{N_h}(\uparrow),\ Q_{N_h}(\downarrow), \tag{14} $$
where there are $N_h$ heavy quarks (in the real world, $N_h = 2$). The result is an $SU(2N_h)$ symmetry which applies to the light degrees of freedom.$^{4,5,6,7,8}$ A new symmetry means new nonperturbative relations between physical quantities. It is these relations which we wish to understand and exploit.
The light degrees of freedom have total angular momentum $J_\ell$, which is integral for baryons and half-integral for mesons. When combined with the heavy quark spin $S_Q = \frac{1}{2}$, we find physical hadron states with total angular momentum
$$ J = \left|J_\ell \pm \tfrac{1}{2}\right|. \tag{15} $$
If $S_\ell \neq 0$, then these two states are degenerate. For example, the lightest heavy mesons have $S_\ell = \frac{1}{2}$, leading to a doublet with $J = 0$ and $J = 1$. In the charm system we find that the states of lowest mass are the spin-0 $D$ and the spin-1 $D^*$; the corresponding bottom mesons are the $B$ and $B^*$. The heavy quark spin operator $S_Q$ exchanges these two states. Writing the spin wave function $|m_Q, m_\ell\rangle$, we have
$$ |M\rangle = \frac{1}{\sqrt{2}}\big(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\big), \qquad |M^*(J_3 = 0)\rangle = \frac{1}{\sqrt{2}}\big(|\uparrow\downarrow\rangle + |\downarrow\uparrow\rangle\big). \tag{16} $$
Then it is easy to show that the third component $S_Q^3$ acts as
$$ S_Q^3\,|M\rangle = \tfrac{1}{2}\,|M^*(J_3 = 0)\rangle, \qquad S_Q^3\,|M^*(J_3 = 0)\rangle = \tfrac{1}{2}\,|M\rangle. \tag{17} $$
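The action of the heavy quark spin on the two states can be verified with a few lines of code. This is a minimal sketch, assuming the states are stored as dictionaries from $(m_Q, m_\ell)$ labels to amplitudes and that the operator in question is the third component of the heavy quark spin; the helper names are introduced here for illustration only.

```python
from math import isclose, sqrt

UP, DOWN = 0.5, -0.5     # spin projections m = +1/2, -1/2
NORM = 1.0 / sqrt(2.0)

# States written as {(m_Q, m_l): amplitude}, matching |m_Q, m_l> in the text.
M_STATE = {(UP, DOWN): NORM, (DOWN, UP): -NORM}       # pseudoscalar |M>
M_STAR_STATE = {(UP, DOWN): NORM, (DOWN, UP): NORM}   # vector |M*(J_3 = 0)>


def apply_sq3(state):
    """Apply the third component of the heavy quark spin: multiply by m_Q."""
    return {basis: basis[0] * amp for basis, amp in state.items()}


def scale(state, factor):
    """Multiply every amplitude of a state by a number."""
    return {basis: factor * amp for basis, amp in state.items()}


def states_equal(a, b, tol=1e-12):
    """Compare two states amplitude by amplitude."""
    keys = set(a) | set(b)
    return all(isclose(a.get(k, 0.0), b.get(k, 0.0), abs_tol=tol) for k in keys)


if __name__ == "__main__":
    print(states_equal(apply_sq3(M_STATE), scale(M_STAR_STATE, 0.5)))
    print(states_equal(apply_sq3(M_STAR_STATE), scale(M_STATE, 0.5)))
```

Both comparisons hold, so the operator maps each member of the doublet into one half times the other, while leaving the light degrees of freedom untouched.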
When effects of order $1/m_Q$ are included, the chromomagnetic interactions split the states of given $S_\ell$ but different $J$. This "hyperfine" splitting is not calculable perturbatively, but it is proportional to the heavy quark magnetic moment $\mu_Q$. This gives its scaling with $m_Q$:
$$ m_{D^*} - m_D \sim 1/m_c, \qquad m_{B^*} - m_B \sim 1/m_b. \tag{18} $$
Since $m_{H^*}^2 - m_H^2 = (m_{H^*} - m_H)(m_{H^*} + m_H) \sim (1/m_Q)\times 2m_Q$ is independent of $m_Q$, from this fact we can construct a relation which is a nonperturbative prediction of heavy quark symmetry,
$$ m_{B^*}^2 - m_B^2 = m_{D^*}^2 - m_D^2. \tag{19} $$
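As a rough numerical illustration of relation (19), one can evaluate both sides with approximate meson masses. The mass values below are rounded inputs assumed for this sketch (they are not quoted in the lectures), and the function name is illustrative.

```python
# Approximate meson masses in GeV. These rounded values are assumptions of
# this sketch, chosen for illustration; they are not taken from the lectures.
MASSES = {"B": 5.279, "B*": 5.325, "D": 1.869, "D*": 2.010}


def mass_squared_splitting(vector, pseudoscalar, masses=MASSES):
    """Return m_V^2 - m_P^2 in GeV^2 for a vector/pseudoscalar pair."""
    return masses[vector] ** 2 - masses[pseudoscalar] ** 2


b_split = mass_squared_splitting("B*", "B")
d_split = mass_squared_splitting("D*", "D")

if __name__ == "__main__":
    print(f"m(B*)^2 - m(B)^2 = {b_split:.3f} GeV^2")
    print(f"m(D*)^2 - m(D)^2 = {d_split:.3f} GeV^2")
    print(f"ratio            = {b_split / d_split:.2f}")
```

The two splittings agree to roughly ten percent, even though the linear splittings $m_{B^*} - m_B$ and $m_{D^*} - m_D$ differ by about a factor of three.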
Experimentally, $m_{B^*}^2 - m_B^2 = 0.49\ {\rm GeV}^2$ and $m_{D^*}^2 - m_D^2 = 0.55\ {\rm GeV}^2$, so this prediction works quite well. Note that this relation involves not just the heavy quark symmetry, but the systematic inclusion of the leading symmetry violating effects. Generally, the mass of a heavy hadron $H_Q$ may be expanded in inverse powers of $m_Q$,
$$ m(H_Q) = m_Q + \bar\Lambda_H + O(1/m_Q), \tag{20} $$
where $\bar\Lambda_H$ is independent of $Q$ and is associated with the energy of the light degrees of freedom in the hadron $H_Q$. For the lowest lying $J_\ell = \frac{1}{2}$ doublet, this quantity is usually just referred to as $\bar\Lambda$. From dimensional considerations, one expects $\bar\Lambda$ to be of order of a few hundred MeV.
So far, we have formulated heavy quark symmetry for hadrons in their rest frame. Of course, we can easily boost to a frame in which the hadrons have arbitrary four-velocity $v^\mu = \gamma(1, \vec v)$. For heavy quarks $Q_1$ and $Q_2$, the symmetry will then relate hadrons $H_1(v)$ and $H_2(v)$ with the same velocity but with different momenta. This distinguishes heavy quark symmetry from ordinary symmetries of QCD, which relate states of the same momentum. To remind ourselves of this distinction, henceforth we will label heavy hadrons explicitly by their velocity: $D(v)$, $D^*(v)$, $B(v)$, $B^*(v)$, and so on.
1.3 Semileptonic decay of a heavy quark
Now let us return to the semileptonic weak decay $b \to c\,\ell\bar\nu$, but now consider it in the heavy quark limit for the $b$ and $c$ quarks. Suppose the decay occurs at time $t = 0$. For $t < 0$, the $b$ quark is embedded in a hadron $H_b$; for $t > 0$, the $c$ quark is dressed by light degrees of freedom to $H'_c$. Let us consider the lightest hadrons, $H_b = B(v)$ and $H'_c = D(v')$. Note that since the leptons carry away energy and momentum, in general $v \neq v'$.
What happens to the light degrees of freedom when the heavy quark decays? For $t < 0$, they see the chromoelectric field of a point source with velocity $v$. At $t = 0$, this point source recoils instantaneously$^{b}$ to velocity $v'$; the color neutral leptons do not interact with the light hadronic degrees of freedom as they fly off. The light quarks and gluons then must reassemble themselves about the recoiling color source. This nonperturbative process will generally involve the production of an excited state or of additional particles; the light degrees of freedom can exchange energy with the heavy quark, so there is no kinematic restriction on the excitations (of energy $\sim \Lambda_{\rm QCD}$) which can be formed. There is also some chance that the light degrees of freedom will reassemble themselves back into a ground state $D$ meson. The amplitude for this to happen is a function only of the inner product $w = v\cdot v'$ of the initial and final velocities of the color sources. This amplitude, $\xi(w)$, is known as the Isgur-Wise function.$^{4}$
Clearly, the kinematic point $v = v'$, or $w = 1$, is a special one. In this corner of phase space, where the leptons are emitted back to back, there is no recoil of the source of color field at $t = 0$. As far as the light degrees of freedom are concerned, nothing happens! Their state is unaffected by the decay of the heavy quark; they don't even notice it. Hence the amplitude for them to remain in the ground state is exactly unity. This is reflected in a nonperturbative normalization of the Isgur-Wise function at zero recoil,$^{4}$
$$ \xi(1) = 1. \tag{21} $$
$^{b}$ The weak decay occurs over a very short time $\delta t \sim 1/M_W \ll 1/\Lambda_{\rm QCD}$.
As we will see, this normalization condition is of enormous phenomenological use. It will be extremely important to understand the corrections to this result for finite heavy quark masses $m_b$ and, especially, $m_c$.
The weak decay $b \to c$ is mediated by a left-handed current $\bar c\gamma^\mu(1-\gamma^5)b$. Not only does this operator carry momentum, but it can change the orientation of the spin $S_Q$ of the heavy quark during the decay. For a fixed light angular momentum $J_\ell$, the relative orientation of $S_Q$ determines whether the physical hadron in the final state is a $D$ or a $D^*$. However, the light degrees of freedom are insensitive to $S_Q$, so the nonperturbative part of the transition is the same whether it is a $D$ or a $D^*$ which is produced. Hence heavy quark symmetry implies relations between the hadronic matrix elements which describe the semileptonic decays $B \to D\,\ell\bar\nu$ and $B \to D^*\ell\bar\nu$.
It is conventional to parameterize these matrix elements by a set of scalar form factors. These are defined separately for the vector and axial currents, as follows:
$$ \begin{aligned} \langle D(v')|\,\bar c\gamma^\mu b\,|B(v)\rangle &= h_+(w)\,(v+v')^\mu + h_-(w)\,(v-v')^\mu \\ \langle D^*(v',\epsilon)|\,\bar c\gamma^\mu b\,|B(v)\rangle &= h_V(w)\, i\epsilon^{\mu\nu\alpha\beta}\epsilon^*_\nu v'_\alpha v_\beta \\ \langle D(v')|\,\bar c\gamma^\mu\gamma^5 b\,|B(v)\rangle &= 0 \\ \langle D^*(v',\epsilon)|\,\bar c\gamma^\mu\gamma^5 b\,|B(v)\rangle &= h_{A_1}(w)\,(w+1)\,\epsilon^{*\mu} - \epsilon^*\!\cdot v\,\big[h_{A_2}(w)\,v^\mu + h_{A_3}(w)\,v'^\mu\big]. \end{aligned} \tag{22} $$
The set of form factors $h_i(w)$ is the one appropriate to the heavy quark limit. Other linear combinations are also found in the literature. In any case, the form factors are independent nonperturbative functions of the recoil or equivalently, for fixed $m_b$ and $m_c$, of the momentum transfer. However, in the heavy quark limit they correspond to a single transition of the light degrees of freedom, being distinguished from each other only by the relative orientation of the spin of the heavy quark. Hence they may all be written in terms of the single function $\xi(w)$ which describes this nonperturbative transition. As we will derive later, the result is a set of relations,$^{4}$
$$ \begin{aligned} h_+(w) = h_V(w) = h_{A_1}(w) = h_{A_3}(w) &= \xi(w) \\ h_-(w) = h_{A_2}(w) &= 0, \end{aligned} \tag{23} $$
which follow solely from the heavy quark symmetry. Of course, all of the form factors which do not vanish inherit the normalization condition (21) at zero recoil. This result is a powerful constraint on the structure of semileptonic decay in the heavy quark limit.
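The relations (23) and the kinematics behind them can be sketched in a few lines. The recoil variable follows from $q^2 = m_B^2 + m_D^2 - 2m_Bm_Dw$ with $p = m_Bv$ and $p' = m_Dv'$. Everything else here is an assumption made for illustration: the rounded meson masses, the linear model `xi_linear` with a hypothetical slope `rho2`, and all function names. The true Isgur-Wise function is nonperturbative and is not given by this model.

```python
M_B, M_D = 5.279, 1.869   # GeV, rounded meson masses assumed for illustration


def recoil_w(q2, m_initial, m_final):
    """w = v.v' from q^2 = m_i^2 + m_f^2 - 2 m_i m_f w."""
    return (m_initial ** 2 + m_final ** 2 - q2) / (2.0 * m_initial * m_final)


def xi_linear(w, rho2=1.0):
    """Hypothetical linear model of the Isgur-Wise function, with xi(1) = 1.

    The slope rho2 is a placeholder, not a value from the lectures.
    """
    return 1.0 - rho2 * (w - 1.0)


def heavy_quark_limit_form_factors(w, xi=xi_linear):
    """Form factors of Eq. (22) in the heavy quark limit, using Eq. (23)."""
    value = xi(w)
    return {
        "h_plus": value, "h_V": value, "h_A1": value, "h_A3": value,
        "h_minus": 0.0, "h_A2": 0.0,
    }


# Zero recoil corresponds to maximal q^2; maximal recoil to q^2 = 0.
w_min = recoil_w((M_B - M_D) ** 2, M_B, M_D)
w_max = recoil_w(0.0, M_B, M_D)

if __name__ == "__main__":
    print(f"w ranges from {w_min:.3f} to {w_max:.3f}")
    print(heavy_quark_limit_form_factors(1.0))
```

In this sketch the physical range of $w$ for $B \to D$ runs from 1 to about 1.6, so the zero recoil point where the normalization (21) applies sits at the edge of phase space.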
1.4 Heavy meson decay constant
As a final example of the utility of the heavy quark limit, consider the coupling of the heavy meson field to the axial vector current. This is conventionally parameterized by a decay constant; for example, for the $B^-$ meson we define $f_B$ via
$$ \langle 0|\,\bar u\gamma^\mu\gamma^5 b\,|B^-(p_B)\rangle = i f_B\, p_B^\mu. \tag{24} $$
What is the dependence of the nonperturbative quantity $f_B$ on $m_B$? To address this question, we rewrite Eq. (24) in a form appropriate to taking the heavy quark limit, $m_B \to \infty$ (which is equivalent to $m_b \to \infty$). This entails making explicit the dependence of all quantities on $m_B$. First, we trade the $B^-$ momentum for its velocity,
$$ p_B^\mu = m_B v^\mu. \tag{25} $$
Second, we replace the usual $B^-$ state, whose normalization depends on $m_B$,
$$ \langle B(p_1)|B(p_2)\rangle = 2E_B\,\delta^{(3)}(\vec p_1 - \vec p_2), \tag{26} $$
by a mass-independent state,
$$ |B(v)\rangle = \frac{1}{\sqrt{m_B}}\,|B(p_B)\rangle, \tag{27} $$
satisfying
$$ \langle B(v_1)|B(v_2)\rangle = 2\gamma\,\delta^{(3)}(\vec p_1 - \vec p_2). \tag{28} $$
Then Eq. (24) becomes
$$ \sqrt{m_B}\,\langle 0|\,\bar u\gamma^\mu\gamma^5 b\,|B^-(v)\rangle = i f_B\, m_B\, v^\mu. \tag{29} $$
The nonperturbative matrix element $\langle 0|\,\bar u\gamma^\mu\gamma^5 b\,|B^-(v)\rangle$ is independent of $m_B$ in the heavy quark limit. Hence, we see that in this limit $f_B$ takes the form
$$ f_B = m_B^{-1/2}\times(\text{independent of } m_B). \tag{30} $$
This makes explicit the scaling of $f_B$ with $m_B$. It is more interesting to write this as a prediction for the ratio of charmed and bottom meson decay constants. We find$^{4,5,7}$
$$ \frac{f_B}{f_D} = \sqrt{\frac{m_D}{m_B}} + O\!\left(\frac{\Lambda_{\rm QCD}}{m_D}, \frac{\Lambda_{\rm QCD}}{m_B}\right). \tag{31} $$
For the physical bottom and charm masses, of course, the correction terms proportional to $\Lambda_{\rm QCD}/m_Q$ could be important.
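To see what the leading-order scaling implies numerically, and how large the neglected terms might be, one can evaluate Eq. (31) with approximate inputs. The meson masses and the value of $\Lambda_{\rm QCD}$ below are rounded assumptions made for this sketch, and the function names are illustrative.

```python
from math import sqrt

M_B_MESON = 5.279   # GeV, rounded value assumed for illustration
M_D_MESON = 1.869   # GeV, rounded value assumed for illustration
LAMBDA_QCD = 0.4    # GeV, assumed representative of "hundreds of MeV"


def decay_constant_ratio(m_d, m_b):
    """Leading-order heavy quark prediction f_B / f_D = sqrt(m_D / m_B)."""
    return sqrt(m_d / m_b)


def correction_scale(lambda_qcd, m_heavy):
    """Rough size of the power correction, Lambda_QCD / m."""
    return lambda_qcd / m_heavy


ratio = decay_constant_ratio(M_D_MESON, M_B_MESON)
charm_correction = correction_scale(LAMBDA_QCD, M_D_MESON)
bottom_correction = correction_scale(LAMBDA_QCD, M_B_MESON)

if __name__ == "__main__":
    print(f"leading-order f_B/f_D   = {ratio:.3f}")
    print(f"Lambda_QCD/m_D estimate = {charm_correction:.2f}")
    print(f"Lambda_QCD/m_B estimate = {bottom_correction:.2f}")
```

With these inputs the leading-order ratio is about 0.6, while the charm correction term is of order twenty percent, so the leading-order number should be read as a scaling estimate rather than a precise prediction.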
Figure 2: The nonleptonic decay of a $b$ quark.
2 Effective Field Theories
2.1 General Considerations
A central observation which underlies much of the theoretical study of $B$ mesons is that physics at a wide variety of distance (or momentum) scales is typically relevant in a given process. At the same time, the physics at different scales must often be analyzed with different theoretical approaches. Hence it is crucial to have a tool which enables one to identify the physics at a given scale and to separate it out explicitly. Such a tool is the operator product expansion, used in conjunction with the renormalization group. Here a general discussion of its application is given.
Consider the Feynman diagram shown in Fig. 2, in which a $b$ quark decays nonleptonically. The virtual quarks and gauge bosons have virtualities $\mu$ which vary widely, from $\Lambda_{\rm QCD}$ to $M_W$ and higher. Roughly speaking, these virtualities can be classified into a variety of energy regimes: (i) $\mu \gg M_W$; (ii) $M_W \gg \mu \gg m_b$; (iii) $m_b \gg \mu \gg \Lambda_{\rm QCD}$; (iv) $\mu \approx \Lambda_{\rm QCD}$. Each of these momenta corresponds to a different distance scale; by the uncertainty principle, a particle of virtuality $\mu$ can propagate a distance $x \approx 1/\mu$ before being reabsorbed. At a given resolution $\Delta x$, only some of these virtual particles can be distinguished, namely those that propagate a distance $x > \Delta x$. For example, if $\Delta x > 1/M_W$, then the virtual $W$ cannot be seen, and the process whereby it is exchanged would appear as a point interaction. By the same token, as $\Delta x$ increases toward $1/\Lambda_{\rm QCD}$, fewer and fewer of the virtual gluons can be seen explicitly. Finally, for $\mu \approx \Lambda_{\rm QCD}$, it is inappropriate to speak of virtual gluons at all, because at such low momentum scales QCD becomes strongly interacting and an expansion in terms of individual gluons is inadequate.
It is useful to organize the computation of a diagram such as is shown in Fig. 2 in terms of the virtuality of the exchanged particles. This is important both conceptually and practically. First, it is often the case that a distinct set of approximations and approaches is useful at each distance scale, and one would like to be able to apply specific theoretical techniques at the scale at which they are appropriate. Second, Feynman diagrams in which two distinct scales $\mu_1 \gg \mu_2$ appear together can lead to logarithmic corrections of the form $\alpha_s^n\ln^n(\mu_1/\mu_2)$, which for $\ln(\mu_1/\mu_2) \sim 1/\alpha_s$ can spoil the perturbative expansion. A proper separation of scales will include a resummation of such terms.
2.2 Example I: Weak b Decays
As an example, consider the weak decay of a $b$ quark, $b \to c\bar u d$, which is mediated by the decay of a virtual $W$ boson. Viewed with resolution $\Delta x < 1/M_W$, the decay amplitude involves an explicit $W$ propagator and is proportional to
$$ (ig_2)^2\;\bar c\gamma^\mu(1-\gamma^5)b\;\bar d\gamma_\mu(1-\gamma^5)u \times \frac{1}{p^2 - M_W^2}, \tag{32} $$
where $p^\mu$ is the momentum of the virtual $W$. Since $m_b \ll M_W$, the kinematics constrains $p^2 \ll M_W^2$, so the virtuality of the $W$ is of order $M_W$, and it travels a distance of order $1/M_W$ before decaying. Viewed with a lower resolution, $\Delta x > 1/M_W$, the process $b \to c\bar u d$ appears to be a local interaction, with four fermions interacting via a potential which is a $\delta$ function where the four particles coincide. This can be seen by making a Taylor expansion of the amplitude in powers of $p^2/M_W^2$,
$$ \frac{g_2^2}{M_W^2}\;\bar c\gamma^\mu(1-\gamma^5)b\;\bar d\gamma_\mu(1-\gamma^5)u\left[1 + \frac{p^2}{M_W^2} + \frac{p^4}{M_W^4} + \ldots\right]. \tag{33} $$
The coefficient of the first term is just the usual Fermi decay constant, $G_F/\sqrt{2}$. The higher order terms correspond to local operators of higher mass dimension. In the sense of a Taylor expansion, the momentum-dependent matrix element (32), which involves the propagation of a $W$ boson between two spacetime points, is identical to the matrix element of the following infinite sum of local operators:
$$ \frac{g_2^2}{M_W^2}\;\bar c\gamma^\mu(1-\gamma^5)b\left[1 + \frac{(i\partial)^2}{M_W^2} + \frac{(i\partial)^4}{M_W^4} + \ldots\right]\bar d\gamma_\mu(1-\gamma^5)u, \tag{34} $$
where the derivatives act on the entire current on the right. This expansion of the nonlocal product of currents in terms of local operators, sometimes known as an operator product expansion, is valid so long as $p^2 \ll M_W^2$.
Radiative Corrections
At tree level, the effective theory is constructed simply by integrating out the $W$ boson, because this is the only particle in a tree level diagram which is off-shell by order $M_W$. When radiative corrections are included, gluons and light quarks can also be off-shell by this order. Consider the one-loop diagram shown in Fig. 3. The components of the loop momentum $k^\mu$ are allowed to take all values in the loop integral. However, the integrand is cut off both in the ultraviolet and in the infrared. For $k > M_W$, it scales as $d^4k/k^6$, which is convergent as $k \to \infty$. For $k < m_b$, it scales as $d^4k/(k^3 m_b M_W^2)$, which is convergent as $k \to 0$. In between, all momenta in the range $m_b < k < M_W$ contribute to the integral with roughly equivalent weight. As a consequence, there is potentially a radiative correction proportional to $\alpha_s\ln(M_W/m_b)$. Even if $\alpha_s(\mu)$ is evaluated at the high scale $\mu = M_W$, such a term is not small in the limit $M_W \to \infty$. At $n$ loops, there is potentially a term of order $\alpha_s^n\ln^n(M_W/m_b)$. For $\alpha_s\ln(M_W/m_b) \sim 1$, these terms need to be resummed for the perturbation series to be predictive. The technique for performing such a resummation is the renormalization group.

Figure 3: The nonleptonic decay of a $b$ quark at one loop.

The renormalization group exploits the fact that in the effective theory, operators such as
$$O_I = \bar c_i\gamma^\mu(1-\gamma^5)b^i\;\bar d_j\gamma_\mu(1-\gamma^5)u^j \tag{35}$$
receive radiative corrections and must be subtracted and renormalized. (Here the color indices $i$ and $j$ are explicit.) In dimensional regularization, this means that they acquire, in general, a dependence on the renormalization scale $\mu$. Because physical predictions are independent of $\mu$, in the renormalized effective theory the operators must be multiplied by coefficients with a compensating dependence on $\mu$. It is also possible for operators to mix under renormalization, so the set of operators induced at tree level may be enlarged once radiative corrections are included. In the present example, a second operator with different color structure,
$$O_{II} = \bar c_i\gamma^\mu(1-\gamma^5)b^j\;\bar d_j\gamma_\mu(1-\gamma^5)u^i, \tag{36}$$
is induced at one loop. The interaction Hamiltonian of the effective theory is then
$$\mathcal H_{\rm eff} = C_I(\mu)\,O_I(\mu) + C_{II}(\mu)\,O_{II}(\mu), \tag{37}$$
and it satisfies the differential equation
$$\mu\frac{d}{d\mu}\mathcal H_{\rm eff} = 0. \tag{38}$$
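By computing the dependence on $\mu$ of the operators $O_j(\mu)$, one can deduce the $\mu$-dependence of the Wilson coefficients $C_j(\mu)$. In this case, a simple calculation yields
$$C_{I,II}(\mu) = \frac12\left[\left(\frac{\alpha_s(M_W)}{\alpha_s(\mu)}\right)^{6/23} \pm \left(\frac{\alpha_s(M_W)}{\alpha_s(\mu)}\right)^{-12/23}\right]. \tag{39}$$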
For $\mu = m_b$, these expressions resum all large logarithms proportional to $\alpha_s^n\ln^n(M_W/m_b)$. The decays which are observed involve physical hadrons, not asymptotic quark states. For example, this nonleptonic $b$ decay can be realized in the channels $B \to D\pi$, $B \to D^*\pi\pi$, and so on. The computation of partial decay rates for such processes requires the analysis of hadronic matrix elements such as
$$\langle D\pi|\,\bar c\gamma^\mu(1-\gamma^5)b\;\bar d\gamma_\mu(1-\gamma^5)u\,|B\rangle. \tag{40}$$
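As a rough numerical illustration of the resummation in Eq. (39), the sketch below evaluates the two Wilson coefficients at $\mu = m_b$. It assumes the standard leading-logarithmic form $C_{I,II} = \frac12[r^{6/23} \pm r^{-12/23}]$ with $r = \alpha_s(M_W)/\alpha_s(\mu)$, one-loop running of $\alpha_s$ with five flavors, and illustrative inputs ($M_W = 80.4$ GeV, $m_b = 4.8$ GeV, $\alpha_s(M_W) = 0.12$) which are not taken from the text. The function names are hypothetical.

```python
import math


def alpha_s_one_loop(mu, mu_ref, alpha_ref, n_f):
    """One-loop running coupling, evolved from (mu_ref, alpha_ref) to mu.

    Solves mu d(alpha_s)/d(mu) = -(beta0 / 2 pi) alpha_s^2
    with beta0 = 11 - 2 n_f / 3.
    """
    beta0 = 11.0 - 2.0 * n_f / 3.0
    return alpha_ref / (1.0 + alpha_ref * beta0 / (2.0 * math.pi) * math.log(mu / mu_ref))


def wilson_coefficients(mu, m_w=80.4, alpha_mw=0.12):
    """Leading-log C_I and C_II at scale mu (assumed inputs, n_f = 5)."""
    r = alpha_mw / alpha_s_one_loop(mu, m_w, alpha_mw, n_f=5)
    c_plus = r ** (6.0 / 23.0)     # exponents are specific to five flavors
    c_minus = r ** (-12.0 / 23.0)
    return 0.5 * (c_plus + c_minus), 0.5 * (c_plus - c_minus)


if __name__ == "__main__":
    for scale in (80.4, 4.8):
        c_i, c_ii = wilson_coefficients(scale)
        print(f"mu = {scale:5.1f} GeV:  C_I = {c_i:.3f},  C_II = {c_ii:.3f}")
```

At $\mu = M_W$ the tree-level values $C_I = 1$, $C_{II} = 0$ are recovered, while at $\mu = m_b$ these inputs give $C_I$ slightly above one and a negative $C_{II}$ of order twenty percent, which shows that the logarithms are not a small effect.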
Such matrix elements involve nonperturbative QCD and are extremely difficult to compute from first principles. However, they have no intrinsic dependence on large mass scales such as $M_W$. Because of this, they should naturally be evaluated at a renormalization scale $\mu \ll M_W$, in which case large logarithms $\ln(M_W/m_b)$ will not arise in the matrix elements. By choosing such a low scale in the effective theory (37), all such terms are resummed into the coefficient functions $C_i(m_b)$. As promised, the physics at scales near $M_W$ has been separated from the physics at scales near $m_b$, with the renormalization group used to resum the large logarithms which connect them. In fact, as we will see in the next section, nonperturbative hadronic matrix elements are usually evaluated at an even lower scale $\mu \approx \Lambda_{\rm QCD} \ll m_b$, explicitly resumming all perturbative QCD corrections.

3 Heavy Quark Effective Theory
We have already extracted quite a bit of nontrivial information from the heavy quark limit. We have found the scaling of various quantities with $m_Q$, we have studied the implications for heavy hadron spectroscopy, and we have found nonperturbative relations among the hadronic form factors which describe semileptonic $b \to c$ decay. However, all of these results have been obtained in the strict limit $m_Q \to \infty$. If the heavy quark limit is to be of more than academic interest, and is to provide the basis for quantitative phenomenology, we have to understand how to include corrections systematically. There are actually two types of corrections which we would like to include. Power corrections are subleading terms in the expansion in $\Lambda_{\rm QCD}/m_Q$; those proportional to $\Lambda_{\rm QCD}/m_c$ are the most worrisome, because of the relatively small charm quark mass. Logarithmic corrections arise from the implicit dependence of quantities on $m_Q$ through the strong coupling constant $\alpha_s(m_Q) \sim 1/\ln(m_Q/\Lambda_{\rm QCD})$. For the physical values of $m_b$ and $m_c$, either of these could be important. What we need is a formalism which can accommodate them both. In short, we need to go from a set of heavy quark symmetry predictions in the $m_Q \to \infty$ limit, to a reformulation of QCD which provides a controlled expansion about this limit. The formalism which does the job is the Heavy Quark Effective Theory, or the HQET. The purpose of the HQET is to allow us to extract, explicitly and systematically, all dependence of physical quantities on $m_Q$, in the limit $m_Q \gg \Lambda_{\rm QCD}$. In these lectures, we will develop only enough of the technology to treat the dominant leading effects, providing indications along the way of how one would carry the expansion further. The HQET, as formulated here, was developed in a series of papers going back to the late 1980's,4,5,7,9,10,11,12,13,14,15,16 which the reader who is interested in tracking its historical development may consult.

3.1 The effective Lagrangian
Consider the kinematics of a heavy quark $Q$, bound in a hadron with light degrees of freedom to make a color singlet state. The small momenta which $Q$ typically exchanges with the rest of the hadron are of order $\Lambda_{\rm QCD} \ll m_Q$, and they never take $Q$ far from its mass shell, $p_Q^2 = m_Q^2$. Hence the momentum $p_Q^\mu$ can be decomposed into two parts,
$$p_Q^\mu = m_Q v^\mu + k^\mu, \tag{41}$$
where $m_Q v^\mu$ is the constant on-shell part of $p_Q^\mu$, and $k^\mu \sim \Lambda_{\rm QCD}$ is the small, fluctuating "residual momentum". The on-shell condition for the heavy quark then becomes
$$m_Q^2 = (m_Q v^\mu + k^\mu)^2 = m_Q^2 + 2m_Q\,v\cdot k + k^2. \tag{42}$$
In the heavy quark limit we may neglect the last term compared to the second, and we have the simple condition
$$v\cdot k = 0 \tag{43}$$
for an on-shell heavy quark. Here the velocity $v^\mu$ functions as a label; since soft interactions cannot change $v^\mu$, there is a velocity superselection rule in the heavy quark limit, and $v^\mu$ is a good quantum number of the QCD Hamiltonian.
We find the same result by taking the $m_Q \to \infty$ limit of the heavy quark propagator,
$$\frac{i}{\slashed p - m_Q + i\epsilon} \to \frac{1+\slashed v}{2}\,\frac{i}{v\cdot k + i\epsilon}. \tag{44}$$
In this limit the propagator is independent of $m_Q$. The projection operators
$$P_\pm = \frac{1\pm\slashed v}{2} \tag{45}$$
project onto the positive ($P_+$) and negative ($P_-$) frequency parts of the Dirac field $Q$. This is clear in the Dirac representation in the rest frame, in which $P_+$ and $P_-$ project, respectively, onto the upper two and lower two components of the heavy quark spinor. In the limit $m_Q \to \infty$, in which $Q$ remains almost on shell, only the "large" upper components of the field $Q$ propagate; mixing via Zitterbewegung with the "small" lower components is suppressed by $1/2m_Q$. Hence the action of the projectors on $Q$ is
$$P_+Q(x) = Q(x) + O(1/m_Q), \qquad P_-Q(x) = 0 + O(1/m_Q). \tag{46}$$
More precisely, these relations should be understood as pertaining to those modes of the field $Q(x)$ which annihilate heavy quarks and antiquarks in a heavy meson. The momentum dependence of the field $Q$ is given by its action on a heavy quark state,
$$Q(x)\,|Q(p)\rangle = e^{-ip\cdot x}\,|0\rangle. \tag{47}$$
If we now multiply both sides by a phase corresponding to the on-shell momentum,
$$e^{im_Q v\cdot x}\,Q(x)\,|Q(p)\rangle = e^{-ik\cdot x}\,|0\rangle, \tag{48}$$
the right side of this equation is independent of $m_Q$. Hence the left side must be, as well. Combining this observation with the argument of the previous paragraph, we are motivated to define an $m_Q$-independent effective heavy quark field $h_v(x)$,
$$h_v(x) = e^{im_Q v\cdot x}\,P_+Q(x). \tag{49}$$
Note that the effective field carries a velocity label $v$ and is a two-component object. The modifications to the ordinary field $Q(x)$ project out the positive frequency part and ensure that states annihilated by $h_v(x)$ have no dependence on $m_Q$. Hence, these are reasonable candidate fields to carry representations of the heavy quark symmetry. Of course, the small components cannot be neglected when effects of order $1/m_Q$ are included. In the HQET they are represented by a field
$$H_v(x) = e^{im_Q v\cdot x}\,P_-Q(x). \tag{50}$$
The field $H_v(x)$ vanishes in the $m_Q \to \infty$ limit. The ordinary QCD Lagrange density for a field $Q(x)$ is given by
$$\mathcal L_{\rm QCD} = \bar Q(x)\,(i\slashed D - m_Q)\,Q(x), \tag{51}$$
where $D^\mu = \partial^\mu - igA^{a\mu}T^a$ is the gauge covariant derivative. To find the Lagrangian of the HQET, we substitute
$$Q(x) = e^{-im_Q v\cdot x}\,h_v(x) + \dots \tag{52}$$
into $\mathcal L_{\rm QCD}$ and expand. With the aid of the projection identity $P_+\gamma^\mu P_+ = v^\mu P_+$, we find
$$\mathcal L_{\rm HQET} = \bar h_v(x)\,iv\cdot D\,h_v(x). \tag{53}$$
This simple Lagrangian leads to the propagator we derived earlier,
$$\frac{i}{v\cdot k + i\epsilon}, \tag{54}$$
and to an equally simple quark-gluon vertex,
$$ig\,T^a v^\mu A^a_\mu. \tag{55}$$
Note that both the propagator and the vertex are independent of $m_Q$, reflecting the heavy quark flavor symmetry. They also have no Dirac structure, reflecting the heavy quark spin symmetry. Our intuitive statements about the structure of heavy hadrons have been promoted to explicit symmetries of the QCD Lagrangian in the limit $m_Q \to \infty$. It is straightforward to include power corrections to $\mathcal L_{\rm HQET}$. Write $Q(x)$ in terms of the effective fields,
$$Q(x) = e^{-im_Q v\cdot x}\left[h_v(x) + H_v(x)\right], \tag{56}$$
and apply the classical equation of motion $(i\slashed D - m_Q)Q(x) = 0$:
$$i\slashed D\,h_v(x) + (i\slashed D - 2m_Q)\,H_v(x) = 0. \tag{57}$$
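Multiplying by $P_-$ and commuting $\slashed v$ to the right, we find
$$(iv\cdot D + 2m_Q)\,H_v(x) = i\slashed D_\perp h_v(x), \tag{58}$$
where $D_\perp^\mu = D^\mu - v^\mu\,v\cdot D$. We then substitute $Q(x)$ into $\mathcal L_{\rm QCD}$ as before, eliminate $H_v(x)$ and expand in $1/m_Q$ to obtain
$$\mathcal L_{\rm HQET} = \bar h_v\,iv\cdot D\,h_v + \bar h_v\,i\slashed D_\perp\,\frac{1}{iv\cdot D + 2m_Q}\,i\slashed D_\perp h_v = \bar h_v\,iv\cdot D\,h_v + \frac{1}{2m_Q}\left[\bar h_v(iD_\perp)^2 h_v + \frac{g}{2}\,\bar h_v\sigma^{\alpha\beta}G_{\alpha\beta}h_v\right] + \dots \tag{59}$$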
The leading corrections have a simple interpretation, which becomes clear in the rest frame, $v^\mu = (1,0,0,0)$. The spin-independent term is
$$\frac{1}{2m_Q}\,O_K \equiv \frac{1}{2m_Q}\,\bar h_v(iD_\perp)^2 h_v \to -\frac{1}{2m_Q}\,\bar h_v(i\vec D)^2 h_v, \tag{60}$$
which is the negative of the nonrelativistic kinetic energy of the heavy quark. Because of the explicit factor of $1/2m_Q$, this term violates the heavy flavor symmetry. The spin-dependent part is
$$\frac{1}{2m_Q}\,O_G \equiv \frac{1}{2m_Q}\,\frac{g}{2}\,\bar h_v\sigma^{\alpha\beta}G_{\alpha\beta}h_v \to \frac{1}{4m_Q}\,\bar h_v\sigma^{ij}T^a h_v\,gG^a_{ij} = g\,\vec\mu^{\,a}_Q\cdot\vec B^a, \tag{61}$$
which is the coupling of the spin of the heavy quark to the chromomagnetic field. Because it has a nontrivial Dirac structure, this term violates both the heavy flavor symmetry and the heavy spin symmetry. For example, $O_G$ is responsible for the $D$-$D^*$ and $B$-$B^*$ mass splittings. These correction terms will be treated as part of the interaction Lagrangian, even though $O_K$ has a piece which is a pure bilinear in the heavy quark field.

3.2 Effective currents and states
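The expansion of the weak interaction current $\bar c\gamma^\mu(1-\gamma^5)b$ is analogous. However, here we must introduce separate effective fields for the charm and bottom quarks, each with its own velocity:
$$b \to h^b_v, \qquad c \to h^c_{v'}. \tag{62}$$
Then a general flavor-changing current becomes, to leading order,
$$\bar c\,\Gamma\,b \to \bar h^c_{v'}\,\Gamma\,h^b_v, \tag{63}$$
where $\Gamma$ is a fixed Dirac structure. With the leading power corrections, this is
$$\bar c\,\Gamma\,b \to \bar h^c_{v'}\Gamma h^b_v + \frac{1}{2m_b}\,\bar h^c_{v'}\Gamma\,(i\slashed D_\perp)\,h^b_v + \frac{1}{2m_c}\,\bar h^c_{v'}\,(-i\overleftarrow{\slashed D}_\perp)\,\Gamma h^b_v + \dots \tag{64}$$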
Figure 4: Tree level plus the one loop renormalization of the current $\bar q\,\Gamma\,Q$ in QCD. The box represents the current insertion.
The effective currents, and other operators which appear in the HQET, may often be simplified by use of the classical equation of motion,
$$iv\cdot D\,h_v(x) = 0. \tag{65}$$
However, it is only safe to apply these equations naively at order $1/m_Q$; at higher order the application of the equations of motion involves additional subtleties.17,18,19 To complete the effective theory, we need $m_Q$-independent hadron states which are created and annihilated by currents containing the effective fields. For example, there is an effective pseudoscalar meson state $|M(v)\rangle$ which couples to the effective axial current
(67)
from which we immediately find the relationship (31) between $f_D$ and $f_B$.

3.3 Radiative corrections
We can use the effective Lagrangian $\mathcal L_{\rm HQET}$ to compute the radiative corrections to the matrix element (66). In particular, we would like to extract the dependence of $F_M$ on $\ln m_Q$. This dependence comes through the one-loop renormalization of the current $\bar q\gamma^\mu\gamma^5 Q$. At lowest order, of course, the renormalization is straightforward: we simply compute the set of graphs found in Fig. 4. The result is finite, because the current is (partially) conserved, and we extract a result of the form
$$\bar q\gamma^\mu\gamma^5 Q\left(1 + \gamma_0\,\frac{\alpha_s}{4\pi}\,\ln(m_Q/m_q) + \dots\right). \tag{68}$$
Figure 5: Tree level plus the one loop renormalization of the current $\bar q\gamma^\mu\gamma^5 h_v$ in HQET. The double line is the propagator of the effective field $h_v$.
Note that there is no explicit dependence on the renormalization scale $\mu$, since there is no divergence to be subtracted. The same result may be obtained in the effective theory. In this case we must match the currents in full QCD onto HQET currents of the form $\bar q\gamma^\mu\gamma^5 h_v$. This step will induce a matching coefficient containing the explicit dependence on $m_Q$, which is absent, by construction, from the operators and Lagrangian of the HQET.^c In addition, the effective current will not necessarily be conserved, since the ultraviolet properties of QCD and the HQET differ. Hence the form of the matching, once radiative corrections are included, is
$$\bar q\gamma^\mu\gamma^5 Q \to C(m_Q,\mu)\times\bar q\gamma^\mu\gamma^5 h_v(\mu). \tag{69}$$
We can deduce the form of $C(m_Q,\mu)$ by considering the renormalization of the effective current $\bar q\gamma^\mu\gamma^5 h_v$, shown in the last three terms of Fig. 5. These diagrams, computed in the effective theory, are independent of $m_Q$. However, in general they are divergent, so they depend on the renormalization scale $\mu$; the renormalization takes the form
$$\bar q\gamma^\mu\gamma^5 h_v(\mu) = \bar q\gamma^\mu\gamma^5 h_v(m_q)\times\left(1 + \hat\gamma_0\,\frac{\alpha_s}{4\pi}\,\ln(\mu/m_q) + \dots\right), \tag{70}$$
where here $m_q$ acts as an infrared cutoff. The $\mu$ dependence in the second term must be canceled by $C(m_Q,\mu)$. Since the logarithm depends on a dimensionless ratio, $C(m_Q,\mu)$ must be of the form
$$C(m_Q,\mu) = 1 + \hat\gamma_0\,\frac{\alpha_s}{4\pi}\,\ln(m_Q/\mu) + \dots \tag{71}$$
Comparing the dependence on $m_Q$ of $C(m_Q,\mu)$ and the expansion (68), we see that $\gamma_0 = \hat\gamma_0$.

^c In general, the matching procedure at order $\alpha_s$ can also induce new Dirac structures $\bar q\,\Gamma\,h_v$. They do not affect the leading logarithms discussed here.
However, the effective theory allows us to go beyond leading order, and to resum all corrections of the form $\alpha_s^n\ln^n m_Q$. We do this with the renormalization group equations, which express the independence of physical observables on the renormalization scale $\mu$. In this case, they require that the $\mu$ dependence of $C(m_Q,\mu)$ cancel that of the one loop diagrams in Fig. 5, under small changes in $\mu$:
The logarithms are resummed because the partial derivative is promoted to a total derivative with respect to $\mu$, including the implicit dependence on $\mu$ of the coupling constant $\alpha_s(\mu)$:
$$\beta_0 = 11 - \frac23\,n_f = \frac{25}{3}\ \ \text{for}\ n_f = 4, \tag{73}$$
where $n_f$ is the number of light flavors. We compute the anomalous dimension $\hat\gamma_0$ from the ultraviolet divergent parts of the one loop diagrams shown in Fig. 5. It is instructive to compute the radiative correction to the current in detail, since this is different from the diagrams one is used to in ordinary QCD. With the HQET Feynman rules, the diagram may be written in Feynman gauge as
$$C_f\,(ig)^2\mu^\epsilon\int\frac{d^{4-\epsilon}q}{(2\pi)^{4-\epsilon}}\;\bar v_q\gamma^\alpha\,\frac{i\slashed q}{q^2}\,\gamma^\mu\gamma^5\,\frac{i}{v\cdot q}\,v_\alpha u_h\times\frac{-i}{q^2}, \tag{74}$$
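where $C_f = \frac43$ is the color factor. This expression may be simplified to
$$-C_f\,ig^2\mu^\epsilon\;\bar v_q\slashed v\gamma^\alpha\gamma^\mu\gamma^5 u_h\int\frac{d^{4-\epsilon}q}{(2\pi)^{4-\epsilon}}\,\frac{q_\alpha}{q^4\,v\cdot q}, \tag{75}$$
which by Lorentz invariance is simply
$$-C_f\,ig^2\mu^\epsilon\;\bar v_q\gamma^\mu\gamma^5 u_h\int\frac{d^{4-\epsilon}q}{(2\pi)^{4-\epsilon}}\,\frac{1}{q^4}. \tag{76}$$
Rotating to Euclidean space, performing the integral and extracting the pole in $\epsilon$, we find
$$\bar v_q\gamma^\mu\gamma^5 u_h\times(2C_f)\,\frac{\alpha_s}{4\pi}\,\frac{\mu^\epsilon}{\epsilon} + \text{finite}. \tag{77}$$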
Since $\mu^\epsilon = 1 + \epsilon\ln\mu + \dots$, the one loop contribution to the matrix element then depends on $\ln\mu$ as
$$\bar v_q\gamma^\mu\gamma^5 u_h\times(2C_f)\,\frac{\alpha_s}{4\pi}\,\ln\mu. \tag{78}$$
The contribution to the anomalous dimension is then $2C_f = \frac83$. The calculation of the wavefunction renormalization of the heavy quark (the third diagram in Fig. 5) is similar, but requires a novel version of the Feynman trick,
$$\frac{1}{ab} = \int_0^\infty\frac{d\lambda}{(a+\lambda b)^2}. \tag{79}$$
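The identity in Eq. (79), $1/ab = \int_0^\infty d\lambda/(a+\lambda b)^2$, can be verified by direct integration. The short derivation below assumes $a, b > 0$ so that the integral converges at both ends:

```latex
\int_0^\infty \frac{d\lambda}{(a+\lambda b)^2}
  = \left[\, -\frac{1}{b\,(a+\lambda b)} \,\right]_{\lambda=0}^{\lambda=\infty}
  = 0 + \frac{1}{b\,a}
  = \frac{1}{ab}\,.
```

In typical HQET applications $a$ is a quadratic (gluon or light quark) denominator and $b$ is the linear heavy quark denominator, so that $\lambda$ carries dimensions of mass and runs over $[0,\infty)$ rather than over $[0,1]$ as in the usual Feynman parameterization.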
One also has to pick out the term which cancels the $1/v\cdot k$ pole in the heavy quark propagator. With the factor of $\frac12$ which accompanies the contributions from wavefunction renormalization, the result is
$$\frac12\times\bar v_q\gamma^\mu\gamma^5 u_h\times(4C_f)\,\frac{\alpha_s}{4\pi}\,\ln\mu. \tag{80}$$
Including the usual QCD renormalization of the light quark field, we find from the three terms in Fig. 5, respectively,5,7
$$\hat\gamma_0 = \frac83 + \frac12\left(-\frac83\right) + \frac12\left(\frac{16}{3}\right) = 4. \tag{81}$$
The solution of the renormalization group equation is
This then yields the leading logarithmic correction to the ratio $f_B/f_D$:
$$\frac{f_B}{f_D} = \sqrt{\frac{m_D}{m_B}}\left(\frac{\alpha_s(m_c)}{\alpha_s(m_b)}\right)^{6/25}. \tag{83}$$
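To get a feeling for the size of the logarithmic factor $[\alpha_s(m_c)/\alpha_s(m_b)]^{6/25}$ in the ratio $f_B/f_D$, the following sketch evaluates it with one-loop running between $m_b$ and $m_c$ for four light flavors. The quark masses and the value $\alpha_s(m_b) = 0.21$ are illustrative assumptions, not numbers from the text, and the function names are hypothetical.

```python
import math


def alpha_s_one_loop(mu, mu_ref, alpha_ref, n_f):
    """One-loop running coupling, evolved from (mu_ref, alpha_ref) to mu."""
    beta0 = 11.0 - 2.0 * n_f / 3.0
    return alpha_ref / (1.0 + alpha_ref * beta0 / (2.0 * math.pi) * math.log(mu / mu_ref))


def leading_log_factor(m_c=1.4, m_b=4.8, alpha_mb=0.21, n_f=4):
    """[alpha_s(m_c) / alpha_s(m_b)]^(2 / beta0); the exponent is 6/25 for n_f = 4.

    All default inputs are assumed values chosen for illustration only.
    """
    beta0 = 11.0 - 2.0 * n_f / 3.0
    alpha_mc = alpha_s_one_loop(m_c, m_b, alpha_mb, n_f)
    return (alpha_mc / alpha_mb) ** (2.0 / beta0)


if __name__ == "__main__":
    print(f"leading-log factor in f_B/f_D: {leading_log_factor():.3f}")
```

With these inputs the factor exceeds one by roughly ten percent; the precise value depends on the assumed masses and coupling.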
The radiative correction is approximately a ten percent effect. In fact, it has a simple physical interpretation. For virtual gluons of "intermediate" energy, $m_c < E_g < m_b$, the bottom quark is heavy but the charm quark is light. Such gluons contribute to the difference between $f_B$ and $f_D$ even in the heavy quark limit. In summary, then, the purpose of the HQET is to make explicit all dependence of observable quantities on $m_Q$. The logarithmic dependence, through $\alpha_s(m_Q)$, arises from intermediate virtual gluons with $m_c < E_g < m_b$. We obtain these corrections by computing perturbatively with the HQET Lagrangian, then using the renormalization group to resum the logarithms to all orders. The power dependence, $1/m_Q$, is extracted systematically in the heavy quark expansion. We have seen how to expand the Lagrangian and the states to subleading order; the application of the expansion to a physical decay rate will be presented in the next section. These lectures are meant to be pedagogical, so we will only treat the leading corrections to a few processes. However, the state of the art goes significantly beyond what will be presented here. For many quantities, not only the leading logarithms, $\alpha_s^n\ln^n m_Q$, but the subleading (two loop) logarithms, of order $\alpha_s^{n+1}\ln^n m_Q$, have been resummed. Similarly, many power corrections are known to relative order $1/m_Q^2$. It is particularly important phenomenologically to take into account the corrections of order $1/m_c^2$.

4 Exclusive B Decays
We now have the tools we need for an HQET treatment of the exclusive semileptonic transitions $B \to D\,\ell\nu$ and $B \to D^*\ell\nu$. Earlier, we argued on physical grounds that in the heavy quark limit all of the hadronic matrix elements which appear in these decays are related to a single nonperturbative function $\xi(w)$. Now we will sharpen this analysis to actually derive these relations, and to include radiative and power corrections. In fact, almost all of our effort will go into the power corrections, since the radiative corrections to the transition currents are computed just as in the previous section.

4.1 Matrix element relations at leading order
The transitions in question require the nonperturbative matrix elements
$$\langle D(v')|\,\bar c\gamma^\mu b\,|B(v)\rangle, \qquad \langle D^*(v',\epsilon)|\,\bar c\gamma^\mu b\,|B(v)\rangle, \qquad \langle D^*(v',\epsilon)|\,\bar c\gamma^\mu\gamma^5 b\,|B(v)\rangle, \tag{84}$$
parameterized in terms of form factors as in Eq. (22). Our first task is to derive the relations between these form factors, as promised earlier. These relations depend on the heavy quark symmetry, that is, on the fact that the spin quantum numbers of $Q$ and of the light degrees of freedom are separately conserved by the soft physics. Hence we need a representation of the heavy meson states in which they have well defined transformations separately under the angular momentum operators $S_Q$ and $J_\ell$. In particular, the representation must reflect the fact that a rotation by $S_Q$ can exchange the pseudoscalar meson $M(v)$ with the vector meson $M^*(v,\epsilon)$.
The solution is to introduce a "superfield" $\mathcal M(v)$, defined as the $4\times4$ Dirac matrix12,20
$$\mathcal M(v) = \frac{1+\slashed v}{2}\left[\gamma^\mu M^*_\mu(v,\epsilon) - \gamma^5 M(v)\right] = V(v,\epsilon) + P(v). \tag{85}$$
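Under heavy quark spin rotations $S_Q$, $\mathcal M(v)$ transforms as
$$\mathcal M(v) \to D(S_Q)\,\mathcal M(v), \tag{86}$$
and under Lorentz rotations $\Lambda$, as
$$\mathcal M(v) \to D(\Lambda)\,\mathcal M(\Lambda^{-1}v)\,D^{-1}(\Lambda). \tag{87}$$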
Here $D(\cdots)$ is the spinor representation of $SO(3,1)$. The superfield satisfies the matrix identity
$$P_+\,\mathcal M(v)\,P_- = \mathcal M(v), \tag{88}$$
so it transforms the same way as the product of spinors $h_v\bar q$, representing a heavy quark and a light antiquark moving together at velocity $v^\mu$. It is straightforward to verify the transformation properties of the superfield in the rest frame, in which $v^\mu = (1,0,0,0)$. In this frame, the spinor representation of the angular momentum operator $\vec J$ has components $S^i = \frac12\gamma^5\gamma^0\gamma^i$. It acts on the superfield by $J^i\mathcal M = [S^i,\mathcal M]$. Defining the polarization vectors $\epsilon^\mu_\pm = (0, 1/\sqrt2, \pm i/\sqrt2, 0)$ and $\epsilon^\mu_3 = (0,0,0,1)$, it is easy to check that
$$\vec J^{\,2}P = J^3P = 0, \qquad \vec J^{\,2}V(\epsilon) = 2V(\epsilon), \qquad J^3V(\epsilon_3) = 0, \qquad J^3V(\epsilon_\pm) = \pm V(\epsilon_\pm), \tag{89}$$
so $P$ has spin zero and $V(\epsilon)$ spin one. On the other hand the heavy quark spin operator $S_Q$ has the same component representation but acts only on the left, $(S_Q)^i\mathcal M = S^i\mathcal M$. One may then check that
$$(S_Q)^3P = \frac12\,V(\epsilon_3), \qquad (S_Q)^3V(\epsilon_3) = \frac12\,P. \tag{90}$$
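The rest-frame relations (89) and (90) can also be checked numerically with explicit Dirac matrices. The sketch below is one way to do this in plain Python; it assumes the Dirac representation with $\gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3$, takes $P = P_+(-\gamma^5)$ and $V(\epsilon) = P_+\gamma^\mu\epsilon_\mu$ with the meson field amplitudes stripped off, and uses helper names that are purely illustrative.

```python
# 2x2 building blocks
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]


def neg2(m):
    return [[-x for x in row] for row in m]


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]


def lin(ca, a, cb, b):
    """Return ca*a + cb*b for 4x4 matrices."""
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]


def comm(a, b):
    return lin(1, mul(a, b), -1, mul(b, a))


def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(4) for j in range(4))


ONE = block(I2, Z2, Z2, I2)
ZERO = block(Z2, Z2, Z2, Z2)
G0 = block(I2, Z2, Z2, neg2(I2))                             # gamma^0
G = [block(Z2, s, neg2(s), Z2) for s in SIGMA]               # gamma^1, gamma^2, gamma^3
G5 = lin(1j, mul(mul(G0, G[0]), mul(G[1], G[2])), 0, ZERO)   # i g0 g1 g2 g3

P_PLUS = lin(0.5, ONE, 0.5, G0)                              # (1 + v-slash)/2 at rest
S = [lin(0.5, mul(mul(G5, G0), g), 0, ZERO) for g in G]      # S^i = gamma5 gamma0 gamma^i / 2
P = lin(-1, mul(P_PLUS, G5), 0, ZERO)                        # pseudoscalar part


def V(eps):
    """Vector part P_+ eps-slash for a contravariant polarization vector eps."""
    slash = lin(eps[0], G0, 0, ZERO)
    for i in range(3):
        slash = lin(1, slash, -eps[i + 1], G[i])             # metric signs (+,-,-,-)
    return mul(P_PLUS, slash)


def j_squared(m):
    out = ZERO
    for s in S:
        out = lin(1, out, 1, comm(s, comm(s, m)))
    return out


R = 2 ** -0.5
EPS_PLUS, EPS_MINUS, EPS_3 = (0, R, 1j * R, 0), (0, R, -1j * R, 0), (0, 0, 0, 1)

checks = {
    "J^2 P = 0": close(j_squared(P), ZERO),
    "J^3 P = 0": close(comm(S[2], P), ZERO),
    "J^2 V = 2V": all(close(j_squared(V(e)), lin(2, V(e), 0, ZERO))
                      for e in (EPS_PLUS, EPS_MINUS, EPS_3)),
    "J^3 V(e3) = 0": close(comm(S[2], V(EPS_3)), ZERO),
    "J^3 V(e+) = +V(e+)": close(comm(S[2], V(EPS_PLUS)), V(EPS_PLUS)),
    "J^3 V(e-) = -V(e-)": close(comm(S[2], V(EPS_MINUS)), lin(-1, V(EPS_MINUS), 0, ZERO)),
    "S_Q^3 P = V(e3)/2": close(mul(S[2], P), lin(0.5, V(EPS_3), 0, ZERO)),
    "S_Q^3 V(e3) = P/2": close(mul(S[2], V(EPS_3)), lin(0.5, P, 0, ZERO)),
}

if __name__ == "__main__":
    for name, ok in checks.items():
        print(f"{name}: {ok}")
```

The total angular momentum acts through commutators, while the heavy quark spin acts by left multiplication only; this difference is what allows $S_Q$ to connect $P$ and $V(\epsilon_3)$.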
As promised, heavy quark spin transformations exchange the pseudoscalar and vector mesons. A current which mediates the decay of one heavy quark ($Q$) into another ($Q'$) is of the form $\bar h_{v'}\Gamma h_v$. Under a rotation by $S_Q$, the effective field $h_v$ transforms as
$$h_v \to D(S_Q)\,h_v, \tag{91}$$
while $h_{v'}$ is unchanged. The current would remain invariant if we took $\Gamma$ to transform as
$$\Gamma \to \Gamma\,D^{-1}(S_Q). \tag{92}$$
On the other hand, the matrix element of superfields
$$\langle M'(v')|\,\bar h_{v'}\Gamma h_v\,|M(v)\rangle \tag{93}$$
is invariant if we rotate both $h_v$ and $\mathcal M(v)$ by the same $S_Q$. With the transformation law (86) for $\mathcal M(v)$, it follows that the $S_Q$-invariant matrix element must be proportional to $\Gamma\mathcal M(v)$. When we also consider rotations under $S_{Q'}$, we find that the matrix element is restricted to the general form
$$\langle M'(v')|\,\bar h_{v'}\Gamma h_v\,|M(v)\rangle = -\sqrt{m_M m_{M'}}\;{\rm Tr}\left[\overline{\mathcal M}'(v')\,\Gamma\,\mathcal M(v)\,F(v,v')\right]. \tag{94}$$
The product of masses in front is a convention which restores the relativistic normalization of the states. Note that the heavy quark symmetry allows an arbitrary $4\times4$ Dirac matrix $F(v,v')$ to act on the "light quark" part of the superfields. Its presence reflects the fact that only Lorentz symmetry constrains the spin of the light degrees of freedom during the decay. A general expansion of $F(v,v')$ in terms of scalar functions $F_i(w = v\cdot v')$ takes the form
$$F(v,v') = F_1(w) + F_2(w)\,\slashed v + F_3(w)\,\slashed v' + F_4(w)\,\slashed v\slashed v'. \tag{95}$$
However, the identity (88) applied to the matrix element (94) yields
$$F(v,v') = P_-\,F(v,v')\,P'_- = \left[F_1(w) - F_2(w) - F_3(w) + F_4(w)\right]P_-P'_-. \tag{96}$$
In other words, $F(v,v')$ actually may be taken to be a scalar, which we identify with the Isgur-Wise function,
$$F(v,v') = \xi(w). \tag{97}$$
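As an exercise, let us apply this formalism to the matrix elements for $B \to (D, D^*)\,\ell\nu$. For a given matrix element, we pick out the part of the superfield $\mathcal M(v)$ which is relevant. Hence we find
$$\langle D(v')|\,\bar c\gamma^\mu b\,|B(v)\rangle = \langle M_D(v')|\,\bar h^c_{v'}\gamma^\mu h^b_v\,|M_B(v)\rangle = -\sqrt{m_D m_B}\;{\rm Tr}\left[\gamma^5 P'_+\gamma^\mu P_+(-\gamma^5)\right]\xi(w) = \sqrt{m_D m_B}\;\xi(w)\,(v+v')^\mu, \tag{98}$$
$$\langle D^*(v',\epsilon)|\,\bar c\gamma^\mu\gamma^5 b\,|B(v)\rangle = \langle M^*_D(v',\epsilon)|\,\bar h^c_{v'}\gamma^\mu\gamma^5 h^b_v\,|M_B(v)\rangle = -\sqrt{m_{D^*}m_B}\;{\rm Tr}\left[\slashed\epsilon^* P'_+\gamma^\mu\gamma^5 P_+(-\gamma^5)\right]\xi(w) = \sqrt{m_{D^*}m_B}\;\xi(w)\left[(w+1)\,\epsilon^{*\mu} - \epsilon^*\!\cdot v\;v'^\mu\right], \tag{99}$$
$$\langle D^*(v',\epsilon)|\,\bar c\gamma^\mu b\,|B(v)\rangle = \langle M^*_D(v',\epsilon)|\,\bar h^c_{v'}\gamma^\mu h^b_v\,|M_B(v)\rangle = -\sqrt{m_{D^*}m_B}\;{\rm Tr}\left[\slashed\epsilon^* P'_+\gamma^\mu P_+(-\gamma^5)\right]\xi(w) = \sqrt{m_{D^*}m_B}\;\xi(w)\;i\varepsilon^{\mu\nu\alpha\beta}\epsilon^*_\nu v'_\alpha v_\beta, \tag{100}$$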
reproducing explicitly the relations (23) between the independent form factors $h_i(w)$. We can also derive the normalization condition at $w = 1$. Consider the matrix element of the $b$ number current $\bar b\gamma^\mu b$ between $B$ meson states. In QCD, the matrix element of this current is exactly normalized,
$$\langle B(v)|\,\bar b\gamma^\mu b\,|B(v)\rangle = 2p^\mu_B = 2m_B v^\mu. \tag{101}$$
But in HQET, we have
$$\langle B(v)|\,\bar b\gamma^\mu b\,|B(v)\rangle = \langle M_B(v)|\,\bar h_v\gamma^\mu h_v\,|M_B(v)\rangle = m_B\,\xi(v\cdot v)\,(v+v)^\mu = 2m_B v^\mu\,\xi(1). \tag{102}$$
Hence the normalization condition at zero recoil,
$$\xi(1) = 1, \tag{103}$$
follows directly from the conservation of the heavy quark number current.

4.2 Power corrections to the matrix elements
The matrix elements we have derived are computed in the strict limit $m_{b,c} \to \infty$. How are they affected by corrections of order $1/m_b$ and $1/m_c$? There are two sources of $1/m_Q$ corrections in the effective theory: the corrections (64) to the heavy quark currents, and the corrections (59) to the Lagrangian. When radiative corrections are included, the expansion of the heavy quark current $\bar c\,\Gamma\,b$ in terms of HQET operators has a form which is somewhat more general than Eq. (64),
$$\bar c\,\Gamma\,b \to a_0(\alpha_s)\,\bar h^c_{v'}\Gamma h^b_v + \frac{a_i(\alpha_s)}{2m_b}\,\bar h^c_{v'}\Gamma^\alpha_i\,iD_\alpha h^b_v + \frac{b_i(\alpha_s)}{2m_c}\,\bar h^c_{v'}\,(-i\overleftarrow D_\alpha)\,\Gamma^\alpha_i h^b_v + \dots \tag{104}$$
The matrix elements of the power corrections are constrained by heavy quark symmetry in a manner completely analogous to the leading current. In terms of traces over the superfields, we have14
$$\langle M'(v')|\,\bar h_{v'}\Gamma^\alpha\,iD_\alpha h_v\,|M(v)\rangle = -\sqrt{m_M m_{M'}}\;{\rm Tr}\left[\overline{\mathcal M}'(v')\,\Gamma^\alpha\mathcal M(v)\,G_\alpha(v,v')\right], \tag{105}$$
where $G_\alpha(v,v')$ is another arbitrary $4\times4$ Dirac matrix. The matrix element
$$\langle M'(v')|\,\bar h_{v'}\,(-i\overleftarrow D_\alpha)\,\Gamma^\alpha h_v\,|M(v)\rangle \tag{106}$$
may also be written in terms of $G_\alpha(v,v')$, using charge conjugation. The $1/m_Q$ corrections $O_K$ and $O_G$ to the Lagrangian contribute somewhat differently. In order to apply heavy quark symmetry, the matrix elements of the local currents, both leading and subleading, must be written in terms of the effective states $|M(v)\rangle$. However, these states are not eigenstates of the Hamiltonian, once $O_K$ and $O_G$ are included in the Lagrangian. Hence, for example, we must allow for the possibility that if an effective state $|M(v)\rangle$ is created at time $t = -\infty$, then $O_K$ or $O_G$ could act on the state before its decay at $t = 0$. This possibility is accounted for by including time-ordered products in which $O_K$ or $O_G$ is inserted along the incoming or outgoing heavy quark line. If we are keeping terms of order $1/m_Q$, only one insertion of $O_K$ or $O_G$ needs to be included. The time-ordered products are of the form14
$$\langle D^{(*)}(v')|\,\bar c\,\Gamma\,b\,|B(v)\rangle = \dots + \frac{1}{2m_c}\,\langle M'(v')|\;i\!\int\!dy\;T\left\{\bar h^c_{v'}\Gamma h^b_v,\;O^c_K + O^c_G\right\}|M(v)\rangle + \frac{1}{2m_b}\,\langle M'(v')|\;i\!\int\!dy\;T\left\{\bar h^c_{v'}\Gamma h^b_v,\;O^b_K + O^b_G\right\}|M(v)\rangle, \tag{107}$$
where the ellipses include the current corrections computed earlier. The evaluation of the matrix elements of the time-ordered products will lead to still more nonperturbative functions like $F(v,v')$ and $G_\alpha(v,v')$.

4.3 Corrections at zero recoil
It is straightforward, but not very illuminating, to expand all of the new nonperturbative functions which arise at order $1/m_Q$ in terms of scalar form factors. In the end, the corrections may be parameterized in terms of four functions of the velocity transfer $w$, and a single nonperturbative parameter $\bar\Lambda$, all proportional to the mass scale $\Lambda_{\rm QCD}$.14 The new parameter has a simple interpretation as the "energy" of the light degrees of freedom, and is given by
$$\bar\Lambda = \lim_{m_b\to\infty}(m_B - m_b). \tag{108}$$
Instead of a general treatment, however, we will consider the $1/m_Q$ corrections at the zero recoil point $w = 1$. This is clearly the most important case, because it is at this point that the nonperturbative matrix elements are absolutely normalized in the heavy quark limit. What happens to this normalization condition when $1/m_Q$ corrections are included? Let us study the corrections to the current in detail. They are described by the nonperturbative function $G_\alpha(v,v')$. At $v = v'$, $G_\alpha(v,v)$ may be expanded as
$$G_\alpha(v,v) = G_1 v_\alpha + G_2\gamma_\alpha + G_3 v_\alpha\slashed v + G_4\gamma_\alpha\slashed v. \tag{109}$$
But $G_\alpha(v,v)$ is subject to the same constraint as $F(v,v')$,
$$G_\alpha(v,v) = P_-\,G_\alpha(v,v)\,P_- = (G_1 - G_2 - G_3 + G_4)\,v_\alpha P_- \equiv G\,v_\alpha P_-, \tag{110}$$
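and it, too, is equivalent to a Dirac scalar (the same is not true of the general function $G_\alpha(v,v')$). Now consider the matrix element where we take $\Gamma^\alpha = v^\alpha$. Then we have
$$\langle M'(v)|\,\bar h_v\,iv\cdot D\,h_v\,|M(v)\rangle = -\sqrt{m_M m_{M'}}\;{\rm Tr}\left[\overline{\mathcal M}'(v)\,v^\alpha\mathcal M(v)\,G\,v_\alpha\right] = -G\,\sqrt{m_M m_{M'}}\;{\rm Tr}\left[\overline{\mathcal M}'(v)\,\mathcal M(v)\right]. \tag{111}$$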
But this matrix element vanishes by the classical equation of motion in the effective theory,
$$v\cdot D\,h_v(x) = 0. \tag{112}$$
Hence $G = 0 = G_\alpha(v,v)$. There are no $1/m_Q$ corrections from the current to the normalization condition at zero recoil.14,21 The same is true of insertions of the corrections $O_K$ and $O_G$ to the Lagrangian: their contribution vanishes at $w = 1$. To show this requires the imposition of the conservation of the $b$ number current at order $1/m_b$, much as we derived the normalization of the Isgur-Wise function at leading order. This part of the argument is analogous to the classic nonrenormalization theorem of Ademollo and Gatto.22 In the end, we have the result known as Luke's Theorem.14 There are no $1/m_Q$ corrections at zero recoil to the hadronic matrix elements responsible for the semileptonic decays $B \to D\,\ell\nu$ and $B \to D^*\ell\nu$. The leading power corrections to the normalization of zero recoil matrix elements are only of order $1/m_Q^2$. Given that $\Lambda_{\rm QCD}/m_c \sim 30\%$ and $\Lambda^2_{\rm QCD}/m_c^2 \sim 10\%$, the implication is that the leading order predictions at $w = 1$ are considerably more accurate than one might have expected. In addition, away from zero recoil the $1/m_c$ corrections must be suppressed at least by $(w - 1)$. On closer inspection, this result is more interesting for $B \to D^*\ell\nu$ than for $B \to D\,\ell\nu$. This is because the leading order matrix element for $B \to D\,\ell\nu$ vanishes kinematically at zero recoil for a massless lepton in the final state. Hence, in this case the $1/m_c$ corrections are not suppressed as a fractional correction to the lowest order term.23
4.4 Extraction of $|V_{cb}|$ from $B \to D^*\ell\nu$
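An immediate application of these results is the extraction of $|V_{cb}|$ from the exclusive decay $B \to D^*\ell\nu$. This process is mediated by the weak operator $O_{bc}$ (4), whose matrix element factorizes as
$$\langle D^*\ell\nu|\,O_{bc}\,|B\rangle = \frac{G_F V_{cb}}{\sqrt2}\,\langle D^*|\,\bar c\gamma^\mu(1-\gamma^5)b\,|B\rangle\;\langle\ell\nu|\,\bar\ell\gamma_\mu(1-\gamma^5)\nu\,|0\rangle. \tag{113}$$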
The leptonic matrix element may be computed perturbatively, while we treat the hadronic matrix element in the heavy quark expansion. The result is a differential decay rate of the form24
$$\frac{d\Gamma}{dw} = \frac{G_F^2}{48\pi^3}\,|V_{cb}|^2\,(m_B - m_{D^*})^2\,m^3_{D^*}\,(w+1)^3\sqrt{\frac{w-1}{w+1}}\left[1 + \frac{4w}{w+1}\,\frac{m_B^2 - 2w\,m_B m_{D^*} + m^2_{D^*}}{(m_B - m_{D^*})^2}\right]F^2(w). \tag{114}$$
All of the HQET analysis goes into the factor $F(w)$, which has an expansion
$$F(w) = \xi(w) + (\text{radiative corrections}) + (\text{power corrections}). \tag{115}$$
We extract $|V_{cb}|$ by studying the differential decay rate near $w = 1$, where the hadronic matrix elements are known. Of course, this requires extrapolation of the experimental data, since the rate vanishes kinematically at $w = 1$. For massless leptons, only the matrix element $\langle D^*|\,\bar c\gamma^\mu\gamma^5 b\,|B\rangle$ of the axial current contributes at this point. The analysis of this quantity in the HQET yields an expansion of the form
$$F(1) = \eta_A\left[1 + \frac{0}{m_c} + \frac{0}{m_b} + \delta_{1/m^2} + \dots\right]. \tag{116}$$
The correction $\delta_{1/m^2}$, which contains terms proportional to $1/m_c^2$, $1/m_b^2$ and $1/m_c m_b$, is intrinsically nonperturbative. It has been estimated from a variety of models to be small and negative,18,25,26
$$\delta_{1/m^2} \approx -0.055 \pm 0.035. \tag{117}$$
Note that the model dependence in the result has been relegated to the estimation of the sub-subleading terms. The radiative correction $\eta_A$ has now been computed to two loops,27,28
$$\eta_A = 0.960 \pm 0.007. \tag{118}$$
The result is a value for $F(1)$ with errors at the level of 5%,
$$F(1) = 0.91 \pm 0.04. \tag{119}$$
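The arithmetic behind Eq. (119) is easy to reproduce from Eqs. (116) to (118). The sketch below multiplies the central values and, as an assumption of this illustration rather than the procedure of the original analysis, combines the two quoted uncertainties in quadrature as if they were independent.

```python
import math

# Inputs quoted in Eqs. (117) and (118)
ETA_A, ETA_A_ERR = 0.960, 0.007
DELTA, DELTA_ERR = -0.055, 0.035

# F(1) = eta_A * (1 + delta), dropping the higher order terms
central = ETA_A * (1.0 + DELTA)

# Naive propagation, treating the two errors as independent (an assumption)
error = math.hypot((1.0 + DELTA) * ETA_A_ERR, ETA_A * DELTA_ERR)

if __name__ == "__main__":
    print(f"F(1) = {central:.3f} +/- {error:.3f}")
```

The central value rounds to 0.91, and the naive uncertainty is dominated by the nonperturbative term $\delta_{1/m^2}$, in line with the remark that the error on $F(1)$ is driven by the nonperturbative corrections.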
This is the theory error which the experimental determination of $|V_{cb}|$ will inherit. It is dominated by the uncertainty in the nonperturbative corrections, and it is difficult to see how this can be improved much in the future. All that is left experimentally is to extrapolate the data to $w = 1$ and extract
$$\lim_{w\to1}\,\frac{1}{\sqrt{w-1}}\,\frac{d\Gamma}{dw}. \tag{120}$$
Once the kinematic factors in Eq. (114) have been included, this amounts to a direct measurement of the combination $|V_{cb}|\,F(1)$. Both CLEO and LEP have reported results for this quantity.29,30 They have taken the slightly different value $F(1) = 0.88 \pm 0.05$, and I have scaled up the theory error of the CLEO result to make it consistent with LEP. Then we have quite consistent results for $|V_{cb}|$:
CLEO: $(39.4 \pm 2.1 \pm 2.0 \pm 2.2)\times10^{-3}$
LEP average: $(38.4 \pm 1.1 \pm 2.2 \pm 2.2)\times10^{-3}$ (121)
This value of $|V_{cb}|$ has almost no dependence on hadronic models. In contrast to model-based "measurements", here the theoretical error is meaningful, in that it is based on a systematic expansion in small quantities.

5 Inclusive B Decays
An exclusive semileptonic B decay, such as B —>• D (.P, is one in which the final hadronic state is fully reconstructed. An inclusive decay, by contrast, is one in which only certain kinematic features, and perhaps the flavor, of the hadron are known. In this case, we need a theoretical analysis in which we sum over all possible hadronic final states allowed by the kinematics. Fortunately, this is possible within the structure of the HQET. As in the case of exclusive decays, the key theme is the separation of short distance physics, associated with the heavy quark, from long distance physics, associated with the light degrees of freedom. We will also rely on heavy quark spin and flavor symmetry. However, the new ingredient will be the idea of "parton-hadron duality", which, as we will see, also relies on the heavy quark limit nib » AQCD-
411
5.1
The inclusive decay B —> Xc Iv
Let us consider the inclusive decay

$$B(p_B) \to X_c(p_X)\,\ell(p_\ell)\,\bar\nu(p_\nu)\,, \tag{122}$$
where all that is known about the state $X_c$ is its energy and momentum, and the fact that it contains a charm quark. This decay is mediated by the weak operator $O_{bc}$. It is easy to generalize our discussion to inclusive decays of other heavy quarks, such as $b \to u\,\ell\bar\nu$ and $c \to s\,\bar\ell\nu$, by replacing $O_{bc}$ with the appropriate weak operator. The treatment of exclusive decays required both the $b$ and $c$ quarks to be heavy. For inclusive decays we can relax this condition on the $c$ quark, requiring only $m_b \gg \Lambda_{\rm QCD}$.

What does the weak decay of the $b$, at time $t = 0$, look like to the light degrees of freedom? For $t < 0$, there is a heavy hadron composed of a point-like color source and light quarks and gluons. At $t = 0$, the point source disappears, releasing both its color and a large amount of energy into the hadronic environment. Eventually, for $t > 0$, this new collection of strongly interacting particles will materialize as a set of physical hadrons. The probability of this hadronization is unity; there is no interference between the hadronization process and the heavy quark decay. There are subleading effects in powers of $\Lambda_{\rm QCD}/m_b$, but they do not alter the probability of hadronization. Rather, they reflect the fact that the $b$ quark is not exactly a static source of color: it has a small nonrelativistic kinetic energy and it carries a spin, both of which affect the kinematic properties of its decay.

As in the case of exclusive decays, we will compute the inclusive semileptonic width $\Gamma(B \to X_c\,\ell\bar\nu)$ as a double expansion in powers of $\alpha_s(m_b)$ and $\Lambda_{\rm QCD}/m_b$ [19,32,33,34]. The expansion in $\alpha_s(m_b)$ reflects the applicability of perturbative QCD to the short distance part of the process. The heavy quark expansion will be continued to relative order $1/m_b^2$, as there is an analogue of Luke's Theorem which eliminates power corrections to the rate of order $1/m_b$. These corrections will be written in terms of three nonperturbative parameters. The first, $\bar\Lambda$, is defined in Eq. (108). It is essentially the mass of the light degrees of freedom in the heavy hadron, but we will see that it is plagued by an ambiguity of order $\Lambda_{\rm QCD}$ in the definition of the $b$ quark mass. The other two parameters are the expectation values in the $B$ meson of the leading corrections $O_K$ and $O_G$ to $\mathcal{L}_{\rm HQET}$. They are defined as [18]

$$\lambda_1 = \frac{1}{2m_B}\,\langle B|\,\bar h_v (iD)^2 h_v\,|B\rangle\,, \qquad \lambda_2 = -\frac{1}{6m_B}\,\langle B|\,\bar h_v\,\tfrac{g}{2}\,\sigma^{\mu\nu}G_{\mu\nu}\,h_v\,|B\rangle\,, \tag{123}$$
where we take the usual relativistic normalization of the states. Hence, $\lambda_1$ may be thought of roughly as the negative of the $b$ quark kinetic energy, and $\lambda_2$ as the energy of its hyperfine interaction with the light degrees of freedom.

Now let us outline the computation. The inclusive decay involves a sum over all possible final states, which is actually a sum over exclusive modes (such as $D$, $D^*$, $D\pi$, ...), followed by a phase space integral for each mode. We write

$$\Gamma(B \to X_c\,\ell\bar\nu) = \sum_{X_c}\int d[{\rm P.S.}]\;\big|\langle X_c\,\ell\bar\nu|\,O_{bc}\,|B\rangle\big|^2\,. \tag{124}$$
There is an Optical Theorem for QCD, which follows from the analyticity of the scattering matrix as a function of the momenta of the asymptotic states. Its content is that a transition rate is proportional to the imaginary part of the forward scattering amplitude with two insertions of the transition operator,

$$\Gamma(B \to X_c\,\ell\bar\nu) = -2\,{\rm Im}\; i\!\int dx\, e^{ik\cdot x}\,\langle B|\,T\{O_{bc}^\dagger(x),\,O_{bc}(0)\}\,|B\rangle = 2\,{\rm Im}\,T\,. \tag{125}$$

In what follows, we will write the time-ordered product $T\{O_{bc}^\dagger, O_{bc}\}$ as a series of local operators, using the Operator Product Expansion. As we will see, the applicability of this expansion, and its computation in perturbation theory, will rest on the limit $m_b \gg \Lambda_{\rm QCD}$. We will then use this limit again to expand the matrix elements of these local operators in the HQET.

The first step is to factorize the integration over the lepton momenta, which can be performed explicitly. Written as a product of currents, $O_{bc}$ takes the form

$$O_{bc} = \frac{G_F V_{cb}}{\sqrt{2}}\,J^\mu_{bc}\,J_{\ell\mu}\,, \tag{126}$$
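where

$$J^\mu_{bc} = \bar c\,\gamma^\mu(1-\gamma^5)\,b\,, \qquad J^\mu_\ell = \bar\ell\,\gamma^\mu(1-\gamma^5)\,\nu\,. \tag{127}$$

Then $T$ can be decomposed as an integral over the total momentum $q^\mu = p_\ell^\mu + p_\nu^\mu$ transferred to the leptons,

$$T = \tfrac{1}{2}\,G_F^2\,|V_{cb}|^2\int dq\; T^{\mu\nu}(q)\,L_{\mu\nu}(q)\,. \tag{128}$$

Here the lepton tensor is

$$L^{\mu\nu}(q) = \int d[{\rm P.S.}]\;\langle 0|\,J_\ell^{\mu\dagger}\,|\ell\bar\nu\rangle\,\langle \ell\bar\nu|\,J_\ell^{\nu}\,|0\rangle = -\frac{1}{3\pi}\left(q^\mu q^\nu - q^2 g^{\mu\nu}\right), \tag{129}$$

and the hadron tensor is

$$T^{\mu\nu}(q) = -i\int dx\, e^{iq\cdot x}\,\langle B|\,T\{J_{bc}^{\mu\dagger}(x),\,J_{bc}^{\nu}(0)\}\,|B\rangle\,. \tag{130}$$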
We will need the imaginary part, ${\rm Im}\,T^{\mu\nu}$. Where is it nonvanishing? In quantum field theory, a scattering amplitude develops an imaginary part when there can be a real intermediate state, that is, when the intermediate particles can all go on their mass shell. Whether this is possible, of course, depends on the kinematics of the external states. In this case, there are two avenues for creating a physical intermediate state [32]. The first is to act on the external state $|B\rangle$ with the transition current $J^\mu_{bc}$. The state which is created has no net $b$ number and a single charm quark; the simplest possibility is the decay process $b \to c$. The momentum of the intermediate state is $p_X = p_B - q$; the condition that it could be on mass shell is simply

$$p_X^2 = (p_B - q)^2 \ge m_D^2\,. \tag{131}$$

If we define scaled variables

$$p_B^\mu = m_B v^\mu\,, \qquad \hat q^\mu = q^\mu/m_B\,, \qquad \hat m_D = m_D/m_B\,, \tag{132}$$

this condition becomes

$$v\cdot\hat q \le \tfrac{1}{2}\left(1 + \hat q^2 - \hat m_D^2\right). \tag{133}$$

Another possibility is to act on $|B\rangle$ with the conjugate operator $J^{\mu\dagger}_{bc}$. This operation would produce an intermediate state with two $b$ quarks and one $c$. For this state to be on shell, the momentum transfer has to satisfy

$$p_X^2 = (p_B + q)^2 \ge (2m_B + m_D)^2\,, \tag{134}$$

that is,

$$v\cdot\hat q \ge \tfrac{1}{2}\left(3 - \hat q^2 + 4\hat m_D + \hat m_D^2\right). \tag{135}$$

The physical intermediate states are shown as cuts in the $v\cdot\hat q$ plane in Fig. 6. Also shown is the contour corresponding to the phase space integration over the lepton momentum $q$. For physical (massless) leptons which are the product of a heavy quark decay, this integral runs over the top of the lower cut, for the range

$$\sqrt{\hat q^2} + i\epsilon \;\le\; v\cdot\hat q \;\le\; \tfrac{1}{2}\left(1 + \hat q^2 - \hat m_D^2\right) + i\epsilon\,. \tag{136}$$
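To get a feeling for where these cuts lie, the sketch below evaluates the endpoints of the physical cut of Eq. (136) and the start of the second cut from Eq. (135) for a few values of $\hat q^2$. The numerical value used for $\hat m_D$ is an assumed, approximate input, and the function names are hypothetical.

```python
import math


def physical_cut(q2_hat, mD_hat):
    """Endpoints of the decay cut in v.q_hat, as in Eq. (136)."""
    lower = math.sqrt(q2_hat)
    upper = 0.5 * (1.0 + q2_hat - mD_hat ** 2)
    return lower, upper


def second_cut_start(q2_hat, mD_hat):
    """Threshold in v.q_hat for the state with two b quarks, as in Eq. (135)."""
    return 0.5 * (3.0 - q2_hat + 4.0 * mD_hat + mD_hat ** 2)


mD_hat = 1.87 / 5.28              # assumed value of m_D / m_B
q2_max = (1.0 - mD_hat) ** 2      # largest q_hat^2 for which the cut is non-empty

for q2_hat in (0.0, 0.5 * q2_max, q2_max):
    lo, hi = physical_cut(q2_hat, mD_hat)
    start = second_cut_start(q2_hat, mD_hat)
    print(f"q2_hat = {q2_hat:.3f}: physical cut [{lo:.3f}, {hi:.3f}], "
          f"second cut starts at {start:.3f}")
```

The physical cut shrinks to a point at $\hat q^2 = (1 - \hat m_D)^2$, and the two cuts stay well separated over the whole range, which is what leaves room for the contour manipulations discussed next.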
Figure 6: Analytic structure of $T^{\mu\nu}$ in the complex $v\cdot\hat q$ plane, for fixed, real $\hat q^2$. The integration contour is over the "physical cut", corresponding to real decay into leptons. The unphysical cuts correspond to other processes.
As indicated by the dotted line, we can continue this contour around the end of the cut and back along the bottom, to $v\cdot\hat q = \sqrt{\hat q^2} - i\epsilon$. Since $T^{\mu\nu}(v\cdot\hat q^{\,*}) = -T^{\mu\nu}(v\cdot\hat q)$ for real $\hat q^2$, we compensate for extending the contour by dividing the new integral by two.

We now encounter our central problem. The integral over $v\cdot\hat q$ runs over physical intermediate hadron states, which are color neutral bound states of quarks and gluons. Hence the integrand depends intimately on the details of QCD at long distances, which is intrinsically nonperturbative. A perturbative calculation of $T^{\mu\nu}$, which is all we have at our disposal, would appear to be of no use. The solution is to deform the contour away from the cut, into the complex $v\cdot\hat q$ plane, as shown in Fig. 7. Since the scale of momenta is set by $m_b$, the contour is now a distance of order $m_b$ away from the resonances [32]. Since $m_b \gg \Lambda_{\rm QCD}$, it is reasonable to hope that a perturbative treatment in this region is valid. Essentially, we are saved because we do not need to know $T^{\mu\nu}(q)$ for every value of $q$, just suitable integrals of $T^{\mu\nu}$. That we can use such arguments to compute perturbatively the average value of a hadronic quantity, where at each point the quantity depends on nonperturbative physics, is known as (global) parton-hadron duality. Parton-hadron duality has the status of being somewhat more than an assumption, since it is known to hold in QCD in the limit $m_b \to \infty$, but somewhat less than an approximation, since it is not known how to compute systematically the leading corrections to it. In any case, the limit $m_b \gg \Lambda_{\rm QCD}$ plays a crucial role here. By deforming the integration contour a distance of order $m_b$ away from the resonance regime, we find the correspondence in QCD of our earlier intuitive statement: the probability of the decay products materializing as physical hadrons is unity, independent of the kinematics of the short distance process. The local redistribution of probability in phase space due to the presence of hadronic resonances is irrelevant to the total decay.

Figure 7: The deformation of the integration contour into the complex $v\cdot\hat q$ plane.

Finally, we should note that since we do not have control over the corrections to local duality, it might work better in some processes than in others, for reasons that need not be apparent from within the calculation. Hence one must be particularly wary of drawing dramatic conclusions from any surprising results of these inclusive calculations [35].

Let us perform the operator product expansion at tree level, and for decay kinematics. The Feynman diagram is given in Fig. 8, which yields the expression

$$T^{\mu\nu} = \bar b\,\gamma^\mu(1-\gamma^5)\,\frac{\not{p}_b - \not{q} + m_c}{(p_b - q)^2 - m_c^2 + i\epsilon}\,\gamma^\nu(1-\gamma^5)\,b\,. \tag{137}$$
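We now write

$$\begin{aligned} p_b^\mu &= m_b v^\mu + k^\mu = m_b\,(v^\mu + \hat k^\mu)\,, \\ \hat q^\mu &= q^\mu/m_b\,, \\ \hat m_c &= m_c/m_b\,, \\ b(x) &= e^{-i m_b v\cdot x}\,h_v(x) + O(1/m_b)\,, \end{aligned} \tag{138}$$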
and expand in powers of $1/m_b$. Since the operator product expansion is in terms of the effective field $h_v$, a factor of $k^\mu$ corresponds to an insertion of the covariant derivative $iD^\mu$. Operator ordering ambiguities are to be resolved by considering graphs with external gluon fields. As an example of the procedure, let us expand the propagator to order $1/m_b^2$. (There are also corrections to the currents at this order, which are included in a full calculation.) It is convenient to define the scaled hadronic invariant mass,

$$\hat s = (m_b v^\mu - q^\mu)^2/m_b^2 = 1 - 2\,v\cdot\hat q + \hat q^2\,. \tag{139}$$

Figure 8: The operator product expansion at tree level.

Then we find a contribution to $T^{\mu\nu}$ of the form

$$T^{\mu\nu} = \frac{1}{m_b}\,\bar h_v\,\gamma^\mu(\not{v} - \not{\hat q})\,\gamma^\nu(1-\gamma^5)\left[\frac{1}{\hat s - \hat m_c^2 + i\epsilon} + \frac{2\hat k\cdot\hat q - \hat k^2}{(\hat s - \hat m_c^2 + i\epsilon)^2} + \dots\right]h_v\,. \tag{140}$$

From this expression we can read off the operators which appear in the operator product expansion. Since

$${\rm Im}\,\frac{1}{(\hat s - \hat m_c^2 + i\epsilon)^n} = \frac{\pi\,(-1)^n}{(n-1)!}\;\delta^{(n-1)}(\hat s - \hat m_c^2)\,, \tag{141}$$

we see that the effect of taking the imaginary part in each term is to put the charm quark on its mass shell. The leading term is a quark bilinear,

$$\frac{1}{m_b}\,\bar h_v\,\gamma^\mu(\not{v} - \not{\hat q})\,\gamma^\nu(1-\gamma^5)\,h_v\,. \tag{142}$$
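It is straightforward to compute its matrix element in the HQET using the trace formalism,

$$\langle B|\,\bar h_v\,\gamma^\mu(\not{v} - \not{\hat q})\,\gamma^\nu(1-\gamma^5)\,h_v\,|B\rangle = 2m_B\left(2v^\mu v^\nu - g^{\mu\nu} - v^\mu\hat q^\nu - v^\nu\hat q^\mu + \dots\right). \tag{143}$$

Contracting with the lepton tensor and integrating over the phase space, we recover the free quark decay rate,

$$\Gamma = \frac{G_F^2\,|V_{cb}|^2\,m_b^5}{192\pi^3}\left(1 - 8\hat m_c^2 + 8\hat m_c^6 - \hat m_c^8 - 12\,\hat m_c^4\ln\hat m_c^2\right). \tag{144}$$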
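The mass dependence of the free quark decay rate is easy to explore numerically. The sketch below evaluates the phase space factor $1 - 8\hat m^2 + 8\hat m^6 - \hat m^8 - 24\,\hat m^4\ln\hat m$, which is the same function as in Eq. (144) written with $12\ln\hat m^2 = 24\ln\hat m$. The function name is hypothetical, and the choice of evaluation points is mine.

```python
import math


def phase_space_factor(m_hat):
    """Tree-level suppression of the rate for a final-state quark of scaled mass m_hat."""
    if m_hat == 0.0:
        return 1.0                       # massless limit; m^4 ln(m) -> 0
    m2 = m_hat * m_hat
    return (1.0 - 8.0 * m2 + 8.0 * m2 ** 3 - m2 ** 4
            - 24.0 * m2 ** 2 * math.log(m_hat))


for m_hat in (0.0, 0.3, 0.372, 1.0):
    print(f"m_hat = {m_hat:.3f}: factor = {phase_space_factor(m_hat):.3f}")
```

The factor falls from 1 in the massless limit to 0 when the final quark is as heavy as the decaying one. At a mass ratio of 0.372 it is about 0.368, close to the overall prefactor 0.369 that appears in Eq. (150) later in this section, where the same ratio is used for the spin averaged meson masses.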
Of course, if we only intended to reproduce the free quark decay result, we would never have introduced so much new formalism. The value of the HQET framework is that it allows us to go beyond leading order and compute the next terms in the series in $1/m_b$. For example, consider the operators induced by the expansion of the propagator (140). The correction of order $1/m_b$ comes from the operator

$$\frac{2}{m_b^2}\,\frac{1}{(\hat s - \hat m_c^2 + i\epsilon)^2}\;\bar h_v\,\gamma^\mu(\not{v} - \not{\hat q})\,\gamma^\nu(1-\gamma^5)\;\hat q\cdot iD\;h_v\,. \tag{145}$$

However, the matrix element of this operator is of the form

$$\langle B|\,\bar h_v\,\Gamma^\alpha(v,\hat q)\,iD_\alpha\,h_v\,|B\rangle\,, \tag{146}$$
which, as we have seen, vanishes by the classical equation of motion. (In writing Eq. (140), we have already dropped terms explicitly proportional to $v\cdot k$, for the same reason.) In fact, since all $1/m_b$ corrections, from any source, have a single covariant derivative, they all vanish in the same way. This is the analogue of Luke's Theorem for inclusive decays [32]. The correction of order $1/m_b^2$ in Eq. (140) is

$$-\frac{1}{m_b^3}\,\frac{1}{(\hat s - \hat m_c^2 + i\epsilon)^2}\;\bar h_v\,\gamma^\mu(\not{v} - \not{\hat q})\,\gamma^\nu(1-\gamma^5)\,(iD)^2\,h_v\,. \tag{147}$$
The matrix element of this operator is related by the heavy quark symmetry to $\lambda_1$, the expectation value of $O_K$. The full expansion of $T^{\mu\nu}$ also induces operators with explicit factors of the gluon field, whose matrix elements are related to $\lambda_2$.

We now present the result for the inclusive semileptonic decay rate, up to order $1/m_b^2$ in the heavy quark expansion, and with the complete radiative correction of order $\alpha_s$. We also include that part of the two loop correction which is proportional to $\beta_0\alpha_s^2$. Since $\beta_0 \approx 9$, perhaps this term dominates the two loop result. In any case, it is interesting for other reasons, as we will see below. Let us first consider the decay $B \to X_u\,\ell\bar\nu$, for which the decay rate simplifies since $m_u = 0$. We find [33,34,36,37,40]

$$\Gamma(B \to X_u\,\ell\bar\nu) = \frac{G_F^2\,|V_{ub}|^2}{192\pi^3}\,m_b^5\left[1 + \left(\frac{25}{6} - \frac{2\pi^2}{3}\right)\frac{\alpha_s(m_b)}{\pi} - (2.98\,\beta_0 + C_u)\left(\frac{\alpha_s(m_b)}{\pi}\right)^2 + \frac{\lambda_1 - 9\lambda_2}{2m_b^2} + \dots\right]. \tag{148}$$
When we include the charm mass, it is convenient to write the unknown quark masses in terms of the measured meson masses and the parameters of the HQET. In terms of the spin averaged mass $\bar m_B = (m_B + 3m_{B^*})/4$, we have

$$m_b = \bar m_B - \bar\Lambda + \frac{\lambda_1}{2\bar m_B} + \dots\,, \tag{149}$$
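and analogously for $m_c$. We then find [33,34,38,39]

$$\Gamma(B \to X_c\,\ell\bar\nu) = \frac{G_F^2\,|V_{cb}|^2}{192\pi^3}\,\bar m_B^5 \times 0.369\left[1 - 1.54\,\frac{\alpha_s(m_b)}{\pi} - (1.43\,\beta_0 + C_c)\left(\frac{\alpha_s(m_b)}{\pi}\right)^2 - 1.65\,\frac{\bar\Lambda}{\bar m_B}\left(1 - 0.87\,\frac{\alpha_s(m_b)}{\pi}\right) - 0.95\,\frac{\bar\Lambda^2}{\bar m_B^2} - 3.18\,\frac{\lambda_1}{\bar m_B^2} + 0.02\,\frac{\lambda_2}{\bar m_B^2}\right]. \tag{150}$$

All the coefficients which appear in this expression are known functions of $\bar m_D/\bar m_B$, and are evaluated at the physical point $\bar m_D/\bar m_B = 0.372$. In both $B \to X_u\,\ell\bar\nu$ and $B \to X_c\,\ell\bar\nu$, the power corrections proportional to $\lambda_1$ and $\lambda_2$ are numerically small, at the level of a few percent.

5.2 Renormalons and the pole mass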
The inclusive decay rate depends on the heavy quark mass $m_b$, either explicitly, as in Eq. (148), or implicitly through $\bar\Lambda$, as in Eq. (150). At tree level, $m_b$ is just the coefficient of the $\bar b b$ term in the QCD Lagrangian, but beyond that we are faced with the question of what exactly we mean by $m_b$. Should we take an $\overline{\rm MS}$ mass, such as $\bar m_b(m_b)$? Or should we take the pole mass $m_b^{\rm pole}$, or maybe some other quantity? The various prescriptions for $m_b$ can vary by hundreds of MeV, and, since the total rate is proportional to $m_b^5$, the question is of practical importance if we hope to make accurate phenomenological predictions.

At a fixed order in QCD perturbation theory, the answer is clear. The heavy quark masses which appear come from poles in quark propagators, so we should take $m_b^{\rm pole}$ (and $m_c^{\rm pole}$). This is also the prescription for the mass which cancels out the on-shell part of the heavy quark field in the construction of $\mathcal{L}_{\rm HQET}$. Hence the difference of heavy quark pole masses is known quite well,

$$m_b^{\rm pole} - m_c^{\rm pole} = \left(\bar m_B - \bar\Lambda + \frac{\lambda_1}{2\bar m_B} + \dots\right) - \left(\bar m_D - \bar\Lambda + \frac{\lambda_1}{2\bar m_D} + \dots\right) = 3.34\ {\rm GeV} + O(\Lambda_{\rm QCD}^2/m_c)\,. \tag{151}$$
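The number in Eq. (151) is essentially the difference of spin averaged meson masses, since $\bar\Lambda$ cancels. The sketch below reproduces it, together with the mass ratio used in Eq. (150). The meson masses are approximate values that I have inserted as assumed inputs (isospin averages rounded to the MeV level); they are not taken from the text.

```python
# Approximate meson masses in GeV (assumed inputs for this illustration).
m_B, m_Bstar = 5.279, 5.325
m_D, m_Dstar = 1.867, 2.008


def spin_average(m_pseudoscalar, m_vector):
    """Spin averaged mass (m_P + 3 m_V) / 4 of a heavy meson doublet."""
    return (m_pseudoscalar + 3.0 * m_vector) / 4.0


mbar_B = spin_average(m_B, m_Bstar)
mbar_D = spin_average(m_D, m_Dstar)

mass_difference = mbar_B - mbar_D      # leading estimate of m_b - m_c
mass_ratio = mbar_D / mbar_B           # argument of the coefficients in Eq. (150)

print(f"mbar_B = {mbar_B:.4f} GeV, mbar_D = {mbar_D:.4f} GeV")
print(f"difference = {mass_difference:.3f} GeV, ratio = {mass_ratio:.3f}")
```

With these inputs the difference is about 3.34 GeV and the ratio is about 0.371, consistent with the values quoted above at the level of the rounding of the input masses.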
Figure 9: The radiative corrections to $m_b^{\rm pole}$ of order $\alpha_s(m_b)^{n+1}\beta_0^n$.
Since $\Gamma(B \to X_c\,\ell\bar\nu)$ depends on the quark masses approximately as $m_b^2\,(m_b - m_c)^3$, the uncertainty due to quark mass dependence is reduced. The problem, of course, is that there is no sensible nonperturbative definition of $m_b^{\rm pole}$, since due to confinement there is no actual pole in the quark propagator. Hence a direct experimental determination of a value for $m_b^{\rm pole}$ to insert into the theoretical expressions (148) and (150) is not possible.

How, then, can we do phenomenology? One approach would be to define $m_b^{\rm pole}$ to be the pole mass as computed in perturbation theory, truncate at some order, and then estimate the theoretical error from the uncomputed higher order terms. However, it turns out that even within perturbation theory the concept of a quark pole mass is ambiguous. Consider a particular class of diagrams which contribute to $m_b^{\rm pole}$, shown in Fig. 9. The perturbation theory is developed as an expansion in the small parameter $\alpha_s(m_b)$, so we hope that it will be well behaved. Each of the bubbles represents an insertion of the gluon self-energy, which is proportional at lowest order to $\alpha_s(m_b)\,\beta_0$. Of course, the infinite sum of the graphs in Fig. 9 can be absorbed into the one loop graph, with a compensating change in the coupling from $\alpha_s(m_b)$ to $\alpha_s(q)$, where $q$ is the loop momentum. The result is an expansion for $m_b^{\rm pole}$ of the form

$$m_b^{\rm pole} = \bar m_b(m_b)\left[1 + a_1\alpha_s + (a_2\beta_0 + b_2)\,\alpha_s^2 + (a_3\beta_0^2 + b_3\beta_0 + c_3)\,\alpha_s^3 + \dots\right], \tag{152}$$
where $\alpha_s = \alpha_s(m_b)$. The graphs in Fig. 9 contribute the terms proportional to $\alpha_s^{n+1}\beta_0^n$. Since $\beta_0 \approx 9$ these terms are "intrinsically" larger than ones with fewer powers of $\beta_0$, and we might hope that their sum approximates the full series. However, it is important to realize that the only limit of QCD in which such terms actually dominate is that of a large number of light quark flavors, in which case the sign of $\beta_0$ is opposite to that of QCD. Although this is a physical limit of an abelian theory, we are certainly not close to that limit here. The ansatz of keeping only the terms proportional to $\alpha_s^{n+1}(m_b)\,\beta_0^n$ is known as "naive nonabelianization" (NNA) [43].

What is most interesting about the series of terms shown in Fig. 9, which takes the form $\sum_n a_n\,\alpha_s^n\,\beta_0^{n-1}$, is that it does not converge. Already in the graphs kept in the NNA ansatz, we are sensitive to the fact that QCD is an asymptotic, rather than a convergent expansion. For large $n$ the coefficients $a_n$ diverge as $n!$, much stronger than any convergence due to the powers $\alpha_s^n$. The series can only be made meaningful if this divergence is subtracted. As with many subtraction prescriptions, there is a residual finite ambiguity.* This ambiguity, known as an "infrared renormalon", leads to an ambiguity in the pole mass of order [43,44,45]

$$\delta m^{\rm pole} \sim 100\ {\rm MeV}\,. \tag{153}$$
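The generic behavior of a factorially divergent series can be seen in a toy model. The sketch below uses terms $n!\,x^n$, which is only a caricature of the NNA series and does not use the actual coefficients $a_n$; the expansion parameter $x = 0.3$ is an arbitrary illustrative choice. The terms first decrease, reach a minimum near $n \approx 1/x$, and then grow without bound.

```python
import math


def toy_terms(x, n_max):
    """Terms n! * x**n of a toy factorially divergent series, n = 0..n_max."""
    return [math.factorial(n) * x ** n for n in range(n_max + 1)]


def index_of_smallest_term(terms):
    """Order at which an asymptotic series would optimally be truncated."""
    return min(range(len(terms)), key=terms.__getitem__)


x = 0.3                      # toy expansion parameter (illustrative only)
terms = toy_terms(x, 12)
n_star = index_of_smallest_term(terms)

for n, t in enumerate(terms):
    marker = "  <- smallest" if n == n_star else ""
    print(f"n = {n:2d}: {t:.4f}{marker}")
```

In an asymptotic series of this kind, the size of the smallest term gives a rough measure of the accuracy that can be reached, however many terms are computed. The renormalon ambiguity in the pole mass is the analogous statement for the series in Eq. (152).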
By the definition (108), $\bar\Lambda$ also inherits this ambiguity. The expressions (148) and (150) are plagued by two problems. The first is the renormalon ambiguity in $m_b^{\rm pole}$ and $\bar\Lambda$. The second is that the perturbative expansion for the rate $\Gamma$ is itself divergent, and also has an infrared renormalon. In the expansion

$$\Gamma = \Gamma_0\left[\sum_n a'_n\,\alpha_s^n(m_b)\,\beta_0^{n-1} + (\text{power corrections})\right], \tag{154}$$
the coefficients $a'_n$ also diverge as $n!$. However, it turns out that these two problems actually cure each other, because the infrared renormalons in $m_b^{\rm pole}$ and in the perturbation series for $\Gamma$ cancel [44,46]. We can exploit this cancellation to improve the predictive power of the theoretical computation of the rate. Without this improvement, the infrared renormalons render the expressions (148) and (150) of dubious phenomenological utility.

* In a formal treatment, this ambiguity arises from a choice of contour in the Borel plane.

The most reliable approach, theoretically, is to eliminate $m_b^{\rm pole}$ or $\bar\Lambda$ explicitly from the rate by computing and measuring another quantity which also depends on it. For example, let us consider the charmless decay rate $\Gamma(B \to X_u\,\ell\bar\nu)$ and the average invariant mass $\langle s_H\rangle$ of the hadrons produced in the decay. Each of these expressions suffers from a poorly behaved perturbation series in the NNA approximation. Ignoring terms of relative order $1/m_b^2$ and writing the rate in terms of $\bar\Lambda$ instead of $m_b^{\rm pole}$, we find to five loop order [43]

$$\begin{aligned} \Gamma &= \frac{G_F^2\,|V_{ub}|^2}{192\pi^3}\,m_B^5\left[1 - 2.41\,\frac{\alpha_s}{\pi} - 2.98\left(\frac{\alpha_s}{\pi}\right)^2\beta_0 - 4.43\left(\frac{\alpha_s}{\pi}\right)^3\beta_0^2 + \dots - \frac{5\bar\Lambda}{m_B} + \dots\right] \\ &= \frac{G_F^2\,|V_{ub}|^2}{192\pi^3}\,m_B^5\left[1 - 0.161 - 0.120 - 0.107 - 0.111 - 0.136 + \dots - 5\bar\Lambda/m_B + \dots\right] \end{aligned} \tag{155}$$

for $\alpha_s(m_b) = 0.21$ and $\beta_0 = 9$. As we see, not only does the perturbation series fail to converge, it does not even have an apparent smallest term, where one should truncate to minimize the error of the asymptotic series. The series for $\langle s_H\rangle$ exhibits a similar behavior [40],

$$\begin{aligned} \langle s_H\rangle &= m_B^2\left[0.20\,\frac{\alpha_s}{\pi} + 0.35\left(\frac{\alpha_s}{\pi}\right)^2\beta_0 + 0.64\left(\frac{\alpha_s}{\pi}\right)^3\beta_0^2 + 1.29\left(\frac{\alpha_s}{\pi}\right)^4\beta_0^3 + 2.95\left(\frac{\alpha_s}{\pi}\right)^5\beta_0^4 + \dots + \frac{7\bar\Lambda}{10\,m_B} + \dots\right] \\ &= m_B^2\left[0.0135 + 0.0141 + 0.0156 + 0.0189 + 0.0261 + \dots + 7\bar\Lambda/10m_B + \dots\right]. \end{aligned} \tag{156}$$
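The numerical series in Eqs. (155) and (156) follow directly from the quoted coefficients once $\alpha_s(m_b) = 0.21$ and $\beta_0 = 9$ are inserted. The sketch below carries out this substitution for the coefficients that are written out explicitly above; the helper name is hypothetical.

```python
import math

alpha_s, beta0 = 0.21, 9.0
a = alpha_s / math.pi

# Coefficients written out explicitly in Eqs. (155) and (156).
rate_coeffs = [2.41, 2.98, 4.43]
sH_coeffs = [0.20, 0.35, 0.64, 1.29, 2.95]


def nna_terms(coeffs):
    """Term k (k = 1, 2, ...) is c_k * (alpha_s/pi)**k * beta0**(k - 1)."""
    return [c * a ** k * beta0 ** (k - 1) for k, c in enumerate(coeffs, start=1)]


rate_terms = nna_terms(rate_coeffs)
sH_terms = nna_terms(sH_coeffs)

print("rate:", [round(t, 3) for t in rate_terms])
print("<s_H>/m_B^2:", [round(t, 4) for t in sH_terms])
```

The first three terms of the rate evaluate to about 0.161, 0.120 and 0.107, and the five terms for $\langle s_H\rangle/m_B^2$ come out within a few times $10^{-4}$ of the values listed in Eq. (156). The growth of the $\langle s_H\rangle$ terms with order is already visible at this level.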
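However, the situation improves dramatically if we eliminate $\bar\Lambda$ and write $\Gamma$ directly in terms of $\langle s_H\rangle$,

$$\Gamma = \frac{G_F^2\,|V_{ub}|^2}{192\pi^3}\,m_B^5\left[1 - 7.14\,\frac{\langle s_H\rangle}{m_B^2} - 0.064 - 0.020 - 0.0002 - 0.022 - 0.047 + \dots\right]. \tag{157}$$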
By truncating this series at its smallest term, 0.0002, we obtain a new expression in which the theoretical errors are under control. The price is that we must now measure a second quantity, $\langle s_H\rangle$, in the same decay. In principle, the same procedure works for decays to charm [40]. In practice, it is best to combine a number of determinations of $\bar\Lambda$. This has been done by CLEO [41], which has performed a comparison of measurements of the moments $\langle E_\ell\rangle$, $\langle E_\ell^2\rangle$, $\langle s_H - \bar m_D^2\rangle$ and $\langle (s_H - \bar m_D^2)^2\rangle$. The results, reproduced in Fig. 10, are somewhat disappointing, in that one does not obtain a very consistent determination of $\bar\Lambda$ and $\lambda_1$. Perhaps this is just a fluctuation, or perhaps this is a sign that parton-hadron duality is failing in these decays.

Figure 10: The CLEO determination of $\bar\Lambda$ and $\lambda_1$ from $B \to X_c\,\ell\bar\nu$.

An alternative approach is to express the width $\Gamma$ in terms of the running mass $\bar m_b(m_b)$ instead of another inclusive observable [44,47]. Since the $\overline{\rm MS}$ mass is a short distance quantity, this also eliminates the infrared renormalon, which is associated with long distance physics. However, from a phenomenological point of view, it raises the question of how the running mass is to be determined from experiment. Possibilities include quarkonium spectroscopy, QCD sum rules, and lattice calculations, but in all of these cases it is important to determine reliably the accuracy of the method, and how to deal with renormalon ambiguities in a manner that is consistent with their treatment in the calculation of $\Gamma$. Nevertheless, such an approach, particularly one based on analyzing quarkonium and $b\bar b$ production near threshold, should eventually prove fruitful [48].

5.3 Phenomenology of $V_{cb}$ and $V_{ub}$
Despite this ambiguous situation, groups have presented extractions of $V_{cb}$ based on inclusive semileptonic $b$ decays by simply inserting "reasonable" values for $\bar\Lambda$ and $\lambda_1$. Such an approach clearly has its dangers! At any rate, the quoted result is [42]

$$|V_{cb}| = (40.0 \pm 0.4 \pm 2.4)\times 10^{-3}\,. \tag{158}$$
The lack of controversy about this procedure is no doubt due in part to the fact that this number is quite consistent with that determined from the analysis of exclusive decays.

There are additional, much more interesting, problems with extracting $|V_{ub}|$ from the inclusive decay $B \to X_u\,\ell\bar\nu$. They arise from the problem that the process $B \to X_c\,\ell\bar\nu$ presents an overwhelming background to $B \to X_u\,\ell\bar\nu$. The only way to avoid this background is to restrict oneself to a corner of phase space in which charmed final states are kinematically inaccessible. Existing experimental analyses isolate the charmless decays by imposing the requirement $E_\ell > (m_B^2 - m_D^2)/2m_B$ or $s_H < m_D^2$. Unfortunately, the OPE can be shown to break down in these restricted corners of the phase space [49,50,51]. I do not have space here to explore this issue in much detail, but it is easy to appreciate the essence of the problem. For massless final states, the OPE is an expansion in powers of the light quark propagator, $1/m_b(1 - 2\,v\cdot\hat q + \hat q^2)$. Over most of the final state phase space, the denominator is of order $m_b$ and the OPE is well behaved. But there exist configurations for which both the denominator vanishes and the operator matrix elements which appear in the OPE are nonzero. It turns out that the dangerous region is when $v\cdot\hat q \to \frac{1}{2}$ and $\hat q^2 \to 0$. The precise form of the divergence depends on the kinematic distribution being studied. The general form, however, is universal. In this "endpoint region", let $y$ be a scaled variable such that $y \to 1$ at the kinematic endpoint. For example, for $d\Gamma/dE_\ell$, we take $y = 2E_\ell/m_b$, and for $d\Gamma/ds_H$ we take $y = s_H/\bar\Lambda m_b$. Then near $y = 1$, the OPE takes the general form

$$\frac{d\Gamma}{dy} \sim \sum_{n=0}^{\infty} \frac{c_n\,A_n}{m_b^n\,(1-y)^n}\,, \tag{159}$$

where $c_n$ are coefficients of order one, and the $A_n$ are moments defined by

$$\langle B(v)|\,\bar h_v\,iD^{\mu_1}\cdots iD^{\mu_n}\,h_v\,|B(v)\rangle/2m_B = A_n\,v^{\mu_1}\cdots v^{\mu_n} + \dots\,. \tag{160}$$
The ellipses represent terms involving factors of the metric tensor $g^{\mu_i\mu_j}$, which are subleading. Since the $A_n$ are associated with totally symmetric combinations of the covariant derivatives, they may be interpreted roughly as the moments of the heavy quark momentum. Note that as defined, $A_0 = 1$, $A_1 = 0$ and $A_2 = -\frac{1}{3}\lambda_1$. The divergences in Eq. (159) can be controlled only if one integrates over a large enough region near the endpoint, $1 - \delta < y < 1$. For $\delta \sim \bar\Lambda/m_b$ one finds a series for $\Gamma$ which does not converge, since the individual terms are of order $A_n/\bar\Lambda^n \sim 1$. This situation reflects a dependence of the shape of $d\Gamma/dy$ on the entire $b$ quark momentum distribution in the $B$ meson. Since, for example, the window $2.3\ {\rm GeV} < E_\ell < 2.6\ {\rm GeV}$ corresponds to $\delta \sim 0.1$, this problem pollutes the extraction of $|V_{ub}|$ from $d\Gamma/dE_\ell$. One is forced to introduce a model for the $b$ wavefunction, with the attendant uncontrolled theoretical uncertainties. The same is true, it turns out, for $d\Gamma/ds_H$.

The current "best" measurement of $V_{ub}$ from LEP, based on an inclusive analysis, is [52]

$$|V_{ub}| = \left[4.05\,^{+\ldots}_{-\ldots}({\rm stat.})\,^{+\ldots}_{-\ldots}({\rm sys.}) \pm 0.02\,(\tau_b) \pm 0.16\,({\rm HQE})\right]\times 10^{-3}\,, \tag{161}$$

or approximately $|V_{ub}/V_{cb}| = 0.104\,^{+\ldots}_{-\ldots}$. While these analyses are experimentally very sophisticated, they rely intensively on a two-parameter model of the $b$ quark wavefunction. Essentially, in such a parameterization all moments of the $b$ momentum distribution are correlated with the first two nonzero ones, a constraint which is unphysical. Even if the two parameters are varied within "reasonable" ranges, it is doubtful that such a restrictive choice of model captures reliably the true uncertainty in $|V_{ub}|$ from our ignorance of the structure of the $B$ meson. While the central value which is obtained in these analyses is reasonable, the realistic theoretical error which should be assigned is not yet well understood.

A recent analysis by CLEO of the exclusive decay $B \to \rho\,\ell\bar\nu$ yields [53]

$$|V_{ub}| = \left[3.25 \pm 0.14\,({\rm stat.})\,^{+0.21}_{-0.29}\,({\rm syst.}) \pm 0.55\,({\rm theory})\right]\times 10^{-3}\,, \tag{162}$$

or approximately $|V_{ub}/V_{cb}| = 0.083\,^{+\ldots}_{-0.016}$, essentially consistent with the LEP result. In this case the reliance on models is quite explicit, since one needs the hadronic form factor $\langle\rho|\,\bar u\,\gamma^\mu(1-\gamma^5)\,b\,|B\rangle$ over the range of momentum transfer to the leptons. The CLEO measurement relies on models based on QCD sum rules, which have uncertainties which are hard to quantify. Hence, just as in the case of the LEP measurement, the quoted errors should not be taken terribly seriously. All of the current constraints are consistent with $|V_{ub}/V_{cb}| = 0.090 \pm 0.025$, where I strongly prefer this more conservative estimate of the theoretical errors. The problem lies not in the experimental analyses, but in our insufficient understanding of hadron dynamics.
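The kinematic numbers behind the lepton energy cut discussed above are simple to reproduce. The sketch below computes the energy above which charmed final states are excluded, the maximal lepton energy, and the width $\delta$ of the quoted window in the scaled variable $y = 2E_\ell/m_b$. The meson masses and the value $m_b = 4.8$ GeV are assumed, approximate inputs chosen for illustration.

```python
# Assumed approximate inputs in GeV (not taken from the text).
m_B, m_D = 5.279, 1.867
m_b = 4.8


def charm_endpoint(m_parent, m_charm):
    """Largest lepton energy compatible with a charmed final state."""
    return (m_parent ** 2 - m_charm ** 2) / (2.0 * m_parent)


E_charm_max = charm_endpoint(m_B, m_D)   # cut used to reject b -> c
E_max = m_B / 2.0                        # endpoint for a massless final state

# Width of the window 2.3 GeV < E_l < 2.6 GeV in the variable y = 2 E_l / m_b.
delta_window = 2.0 * (2.6 - 2.3) / m_b

print(f"charm endpoint: {E_charm_max:.2f} GeV, overall endpoint: {E_max:.2f} GeV")
print(f"delta for the quoted window: {delta_window:.3f}")
```

With these inputs the charm endpoint is about 2.31 GeV and the overall endpoint about 2.64 GeV, which matches the window quoted above, and $\delta$ is about 0.13. This is of the same order as $\bar\Lambda/m_b$ for $\bar\Lambda$ of a few hundred MeV, which is why the endpoint analysis is sensitive to the full $b$ quark momentum distribution.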
Recently it has been pointed out [54] that one may alternatively reject the charm background by studying the distribution $d\Gamma/dq^2$ and restricting oneself to $q^2 > (m_B - m_D)^2$. Not only does this cut eliminate charmed final states, but it also avoids the troublesome region near $q^2 = 0$. Hence such a determination would not be polluted by the divergence described above. Whether the neutrino reconstruction algorithms of the B Factories will be up to the task of this measurement remains to be seen.

6 Concluding Remarks
Unfortunately, we have had time in these lectures only to introduce a very few of the many applications of heavy quark symmetry and the HQET to the physics of heavy hadrons. Since its development less than ten years ago, it has become one of the basic tools of QCD phenomenology. Much of the popularity and utility of the HQET certainly come from its essential simplicity. The elementary observation that the physics of heavy hadrons can be divided into interactions characterized by short and long distances gives us immediately a clear and compelling intuition for the properties of heavy-light systems. The straightforward manipulations which lead to the HQET then allow this intuition to form the basis for a new systematic expansion of QCD. The deeper understanding of heavy hadrons which we thereby obtain is increasingly important as the B Factory Era begins.

Acknowledgements

It is a pleasure to thank the organizers of TASI-2000 for the opportunity to present these lectures, and for arranging a most interesting and pleasant summer school. This work was supported by the National Science Foundation under Grant No. PHY-9404057, by the Department of Energy under Outstanding Junior Investigator Award No. DE-FG02-94ER40869, and by the Research Corporation under the Cottrell Scholarship program.

References

1. There are many excellent reviews of heavy quark symmetry and its applications. In particular, see A.V. Manohar and M.B. Wise, Heavy Quark Physics (Cambridge University Press, Cambridge, 2000); M. Neubert, Phys. Rep. 245, 259 (1994); M. Shifman, preprint TPI-MINN-95-31-T, to appear in QCD and Beyond, Proceedings of TASI-95. Much of what appears in these lectures is similar to my 1996 SLAC Summer Institute lecture notes, A.F. Falk, hep-ph/9610363. Material has also been taken from A.F. Falk, hep-ph/9812217, an introductory chapter written for The BaBar Physics Book, SLAC-R-504.
2. For example, see M. Wirbel, B. Stech and M. Bauer, Z. Phys. C29, 637 (1985); J.G. Korner and G.A. Schuler, Z. Phys. C38, 511 (1988); N. Isgur et al., Phys. Rev. D39, 799 (1989).
3. C. Caso et al., Eur. Phys. J. C3, 1 (1998); update at http://pdg.lbl.gov.
4. N. Isgur and M.B. Wise, Phys. Lett. B232, 113 (1989); Phys. Lett. B237, 527 (1990).
5. M.B. Voloshin and M.A. Shifman, Yad. Fiz. 45, 463 (1987); Yad. Fiz. 47, 511 (1988).
6. S. Nussinov and W. Wetzel, Phys. Rev. D36, 130 (1987).
7. H.D. Politzer and M.B. Wise, Phys. Lett. B206, 681 (1988); Phys. Lett. B208, 504 (1988).
8. E.V. Shuryak, Phys. Lett. B93, 134 (1980).
9. E. Eichten and B. Hill, Phys. Lett. B234, 511 (1990); Phys. Lett. B243, 427 (1990).
10. H. Georgi, Phys. Lett. B240, 447 (1990).
11. B. Grinstein, Nucl. Phys. B339, 253 (1990).
12. A.F. Falk et al., Nucl. Phys. B343, 1 (1990).
13. A.F. Falk and B. Grinstein, Phys. Lett. B247, 406 (1990).
14. M. Luke, Phys. Lett. B252, 247 (1990).
15. A.F. Falk, B. Grinstein and M. Luke, Nucl. Phys. B357, 185 (1991).
16. T. Mannel, W. Roberts and Z. Ryzak, Nucl. Phys. B355, 38 (1991).
17. H.D. Politzer, Nucl. Phys. B172, 349 (1980).
18. A.F. Falk and M. Neubert, Phys. Rev. D47, 2965 (1993).
19. A.F. Falk, M. Luke and M.J. Savage, Phys. Rev. D49, 3367 (1994).
20. A.F. Falk, Nucl. Phys. B378, 79 (1992).
21. P. Cho and B. Grinstein, Phys. Lett. B285, 153 (1992).
22. M. Ademollo and R. Gatto, Phys. Rev. Lett. 13, 264 (1964).
23. M. Neubert and V. Rieckert, Nucl. Phys. B382, 97 (1992).
24. M. Neubert, Phys. Lett. B264, 455 (1991).
25. M.A. Shifman, N.G. Uraltsev and A.I. Vainshtein, Phys. Rev. D51, 2217 (1995); Erratum, D52, 3149 (1995).
26. M. Neubert, Phys. Lett. B338, 84 (1994).
27. A. Czarnecki, Phys. Rev. Lett. 76, 4124 (1996).
28. For earlier one-loop and one-loop improved calculations, see A.F. Falk and B. Grinstein, Phys. Lett. B247, 406 (1990); Phys. Lett. B249, 314 (1990); M. Neubert, Phys. Rev. D46, 2212 (1992); Phys. Rev. D51, 5924 (1995); Phys. Lett. B341, 367 (1995).
29. B. Barish et al. (CLEO Collaboration), Phys. Rev. D51, 1041 (1995).
30. The LEP results are combined by the LEP $V_{cb}$ Working Group. Details can be found at the web address http://lepvcb.web.cern.ch.
31. M. Neubert, Int. J. Mod. Phys. A11, 4173 (1996).
32. J. Chay, H. Georgi and B. Grinstein, Phys. Lett. B247, 399 (1990).
33. I.I. Bigi, N.G. Uraltsev and A.I. Vainshtein, Phys. Lett. B293, 430 (1992); I.I. Bigi et al., Phys. Rev. Lett. 71, 496 (1993).
34. A.V. Manohar and M.B. Wise, Phys. Rev. D49, 1310 (1994).
35. A. Falk, I. Dunietz and M.B. Wise, Phys. Rev. D51, 1183 (1995).
36. B. Guberina, R.D. Peccei and R. Ruckl, Nucl. Phys. B171, 333 (1980).
37. M. Luke, M.J. Savage and M.B. Wise, Phys. Lett. B343, 329 (1995).
38. Y. Nir, Phys. Lett. B221, 184 (1989).
39. M. Luke, M.J. Savage and M.B. Wise, Phys. Lett. B345, 301 (1995).
40. A.F. Falk, M. Luke and M.J. Savage, Phys. Rev. D53, 2491 (1996); Phys. Rev. D53, 6316 (1996).
41. J. Bartelt et al. (CLEO Collaboration), CLEO-CONF-98-21.
42. R. Poling, proceedings of Lepton-Photon 1999, hep-ex/0003025.
43. M. Beneke and V.M. Braun, Nucl. Phys. B426, 301 (1994); M. Beneke, V.M. Braun and V.I. Zakharov, Phys. Rev. Lett. 73, 3058 (1994).
44. I.I. Bigi et al., Phys. Rev. D50, 2234 (1994).
45. M. Neubert and C.T. Sachrajda, Nucl. Phys. B438, 235 (1995).
46. M. Luke, A.V. Manohar and M.J. Savage, Phys. Rev. D51, 4924 (1995).
47. P. Ball, M. Beneke and V.M. Braun, Phys. Rev. D52, 3929 (1995).
48. For the state of the art, see M. Beneke and A. Signer, Phys. Lett. B471, 233 (1999).
49. M. Neubert, Phys. Rev. D49, 3392 (1994); Phys. Rev. D49, 4623 (1994); I.I. Bigi et al., Int. J. Mod. Phys. A9, 2467 (1994).
50. A.F. Falk, Z. Ligeti and M.B. Wise, Phys. Lett. B406, 225 (1997); R.D. Dikeman and N.G. Uraltsev, Nucl. Phys. B509, 378 (1998).
51. A.F. Falk et al., Phys. Rev. D49, 4553 (1994).
52. D. Abbaneo et al. (LEP $V_{ub}$ Working Group), LEPVUB-99/01, 1999.
53. B.H. Behrens et al. (CLEO Collaboration), hep-ex/9905056.
54. C. Bauer, Z. Ligeti and M. Luke, Phys. Lett. B479, 395 (2000).
J. L. Rosner
CP VIOLATION IN B DECAYS

JONATHAN L. ROSNER
Enrico Fermi Institute and Department of Physics, University of Chicago
5640 South Ellis Avenue, Chicago, IL 60637
E-mail: [email protected]
The role of B decays in the study of CP violation is reviewed. We treat the interactions and spectroscopy of the b quark and then introduce CP violation in B meson decays, including time-dependences, decays to CP eigenstates and non-eigenstates, and flavor tagging. Additional topics include studies of strange B's, decays to pairs of light pseudoscalar mesons, and the roles of gluonic and electroweak penguin diagrams, and final-state interactions.
1 Introduction
Discrete symmetries such as time reversal (T), charge conjugation (C), and space inversion or parity (P) have provided both clues and puzzles in our understanding of the fundamental interactions. The realization that the charge-changing weak interactions violated P and C maximally was central to their formulation in the $V - A$ theory. The theory was constructed in 1957 to conserve the product CP, but within seven years the discovery of decay of the long-lived neutral kaon to two pions [1] showed that even CP was not conserved. Nearly ten years later, Kobayashi and Maskawa (KM) [2] proposed that CP violation in the neutral kaon system could be explained in a model with three families of quarks, at a time (1973) when no evidence for the third family and not even all evidence for the second had been found. The quarks of the third family, now denoted by $b$ for bottom and $t$ for top, were subsequently discovered in 1977 [3] and 1994 [4], respectively. Decays of hadrons containing $b$ quarks now appear to be particularly fruitful ground for testing the KM hypothesis and for displaying evidence for any new physics beyond this "standard model" of CP violation. A meson containing a $b$ quark will be known generically as a B meson, in the same way as a K meson contains an (anti-)strange quark $s$. The present lectures are devoted to some tests of CP violation utilizing B meson decays. (Baryons containing $b$ quarks also may display CP violation but we will not discuss them here.) We first deal with the spectroscopy and interactions of the $b$ quark. In Section 2 we describe the discovery of the charmed quark, the tau lepton, the $b$ quark, and B mesons. Section 3 is devoted to the spectroscopy of hadrons containing the $b$ quark, while Section 4 treats its weak interactions. Neutral mesons containing the $b$ quark can mix with their antiparticles (Section 5),
providing important information on the weak interactions of $b$ quarks. We then introduce CP violation in B meson decays. After general remarks and a discussion of decays to CP eigenstates (Section 6) we turn to decays to CP-noneigenstates (Section 7) and describe various methods of tagging the flavor of an initially-produced B meson (Section 8). Some specialized topics include strange B's (Section 9), decays to pairs of light mesons (Section 10), and the roles of penguin diagrams (Section 11) and final-state interactions (Section 12). Topics not covered in detail in the lectures but worthy of mention in this review are noted briefly in Section 13. The possibility that the Standard Model of CP violation might fail at some future time to describe all the observed phenomena is discussed in Section 14, while Section 15 concludes. Other contemporary reviews of the subject [5,6] may be consulted.

2 Discovery of the b quark
2.1 Prelude: The charmed quark
During the 1960's and 1970's, when the electromagnetic and weak interactions were being unified by Glashow, Weinberg, and Salam [7], it was realized [8] that a consistent theory of hadrons required a parallel [9] between the then-known two pairs of weak isodoublets of leptons, $(\nu_e, e^-)$, $(\nu_\mu, \mu^-)$, and a corresponding multiplet structure for quarks, $(u,d)$, $(c,s)$. The known quarks at that time consisted of one with charge 2/3, the up quark $u$, and two with charge $-1/3$, the down quark $d$ and the strange quark $s$. The charmed quark $c$ was a second quark with charge 2/3 and a proposed mass of about 1.5 to 2 GeV/$c^2$ [10,11]. The parallel between leptons and quarks was further motivated by the cancellation of anomalies [8,12] in the electroweak theory. These are associated with triangle graphs involving fermion loops and three electroweak currents. It is sufficient to consider the anomaly for the product $I_{3L}Q^2$, where $I_{3L}$ is the third component of left-handed isospin and $Q$ is the electric charge. The sum $\sum_i (I_{3L})_i Q_i^2$ over all fermions $i$ must vanish. If a family of quarks and leptons consists of one weak isodoublet of quarks and one of leptons, this cancellation can be implemented within a family, as illustrated in Table 1. The first hints of charm arose in nuclear emulsions [13] and were recognized as such by Kobayashi and Maskawa [2]. However, more definitive evidence appeared in November, 1974, in the form of the $^3S_1$ $c\bar c$ ground state discovered simultaneously on the East [14] and West [15] Coasts of the U.S. and named, respectively, $J$ and $\psi$. The East Coast experiment utilized the reaction $p + {\rm Be} \to e^+e^- + \ldots$ and observed the $J$ as a peak at 3.1 GeV/$c^2$ in the effective $e^+e^-$ mass. The West Coast experiment studied $e^+e^-$ collisions in the SPEAR storage ring and
Table 1: Anomaly cancellation in the electroweak theory.

Family              1          2            3             Contribution per family
Neutrino            $\nu_e$    $\nu_\mu$    $\nu_\tau$    $0$
Charged lepton      $e^-$      $\mu^-$      $\tau^-$      $(-1/2)(-1)^2 = -1/2$
$Q = 2/3$ quark     $u$        $c$          $t$           $3(1/2)(2/3)^2 = 2/3$
$Q = -1/3$ quark    $d$        $s$          $b$           $3(-1/2)(-1/3)^2 = -1/6$
saw a peak in the cross section for production of $e^+e^-$, $\mu^+\mu^-$, and hadrons at a center-of-mass energy of 3.1 GeV. Since the discovery of the $J/\psi$ the charmonium level structure has blossomed into a richer set of levels than has been observed for the original "onium" system, the $e^+e^-$ positronium bound states. The lowest charmonium levels are narrow because they are kinematically unable to decay to pairs of charmed mesons (each containing a single charmed quark). The threshold for this decay is at a mass of about 3.73 GeV/$c^2$. Above this mass, the charmonium levels gradually become broader. The charmed mesons, discovered in 1976 and subsequently, include $D^+ = c\bar d$ (mass 1.869 GeV/$c^2$), $D^0 = c\bar u$ (mass 1.865 GeV/$c^2$), and $D_s = c\bar s$ (mass 1.969 GeV/$c^2$). These mesons were initially hard to find because the large variety of their possible decays made any one mode elusive. For example, the two-body decay $D^0 \to K^-\pi^+$ has a branching ratio of only about 3.8% [16]; higher-multiplicity decays are somewhat favored.

2.2 Prelude: The $\tau$ lepton
About the same time as the discovery of charm, another signal was showing up in $e^+e^-$ collisions at SPEAR, corresponding to the production of a pair of new leptons [17]: $e^+e^- \to \gamma^* \to \tau^+\tau^-$. The $\tau$ signal had a number of features opposite to those of charm: lower- rather than higher-multiplicity decays and fewer rather than more kaons in its decay products, for example, so separating the two contributions took some time [18]. The mass of the $\tau$ is 1.777 GeV/$c^2$. Its favored decay products are a tau neutrino, $\nu_\tau$, and whatever the charged weak current can produce, including $e\bar\nu_e$, $\mu\bar\nu_\mu$, $\pi$, $\rho$, etc. It thus contributes somewhat less than one unit to
$$R \equiv \frac{\sigma(e^+e^- \to {\rm hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-)} = \sum_i Q_i^2 \qquad (1)$$
(the sum running over the colors and flavors of the quarks that can be produced), which would have risen from the value of 2 for $u, d, s$ quarks below charm threshold to 10/3 above charm threshold if charm alone were being produced, but was seen to rise considerably higher.
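The values of $R$ quoted above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming the lowest-order parton picture (three colors per flavor, no QCD corrections, no $\tau$ contribution); the names `QUARK_CHARGES` and `r_ratio` are illustrative and do not come from the lectures.

```python
from fractions import Fraction

# Quark electric charges in units of the proton charge.
QUARK_CHARGES = {
    "u": Fraction(2, 3),
    "d": Fraction(-1, 3),
    "s": Fraction(-1, 3),
    "c": Fraction(2, 3),
    "b": Fraction(-1, 3),
}
N_COLORS = 3  # each quark flavor comes in three colors


def r_ratio(flavors):
    """Lowest-order R for the given quark flavors, e.g. "uds"."""
    return N_COLORS * sum(QUARK_CHARGES[q] ** 2 for q in flavors)


if __name__ == "__main__":
    print("u, d, s     :", r_ratio("uds"))    # 2
    print("u, d, s, c  :", r_ratio("udsc"))   # 10/3
    print("step from b :", r_ratio("udscb") - r_ratio("udsc"))  # 1/3
```

The last line gives the step of 1/3 expected when a charge $-1/3$ quark threshold is crossed, the quantity used in Section 2.3 to identify the charge of the $b$ quark.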
One problem with accepting the $\tau$ as a companion of the charmed quark was that the neat anomaly cancellation provided by the charmed quark, mentioned above, was immediately upset. The anomaly contributed by the $\tau$ lepton would have to be cancelled by further particles, such as a pair of new quarks $(t, b)$ with charge 2/3 and $-1/3$. Such quarks had indeed already been utilized two years before the $\tau$ was established, in 1973 by Kobayashi and Maskawa [2] in their theory of CP violation. The names "top" and "bottom" were coined by Harari in 1975 [19], in analogy with "up" and "down."

2.3 Dilepton spectroscopy
One reason for the experiment which discovered the $J$ particle [14] was an earlier study, also at Brookhaven National Laboratory, by L. Lederman and his collaborators, of $\mu^+\mu^-$ pairs produced in proton-uranium collisions [20]. The $m(\mu^+\mu^-)$ spectrum in this experiment displayed a shoulder around 3.5 GeV/$c^2$. It was not recognized as a resonant peak and was displaced in mass from the true $J/\psi$ value because of the poor mass resolution of the experiment. After the discovery of the $J/\psi$, Lederman's group continued to pursue dilepton spectroscopy. In 1977 a search with greater sensitivity and better mass resolution turned up evidence for peaks at 9.4, 10.0, and possibly 10.35 GeV/$c^2$ [3]. These were candidates for the 1S, 2S, and 3S $^3S_1$ levels of a new $Q\bar Q$ system. Several pieces of evidence identified the heavy quark $Q$ as a $b$ quark.
(1) The $\Upsilon(1S)$ and $\Upsilon'(2S)$ were produced in 1978 by the electron-positron collider DORIS at DESY and their partial widths to $e^+e^-$ pairs were measured [21]. It was shown [22] that if the $Q\bar Q$ system was bound by the same quantum chromodynamic force as the $c\bar c$ (charmonium) system, one could use the $c\bar c$ states to gain some idea about the details of the $Q\bar Q$ binding. Since $\Gamma(Q\bar Q \to e^+e^-) \propto e_Q^2$, where $e_Q$ is the charge of the quark $Q$, it was possible to conclude from the data that $|e_Q| = 1/3$ was favored over $|e_Q| = 2/3$.
(2) The Cornell $e^+e^-$ ring CESR began operating in 1979 [23], reaching a fourth $\Upsilon(4S)$ peak and finding it broader than the first three. This indicated that the meson pair threshold lay below $M[\Upsilon(4S)] = 10.58$ GeV/$c^2$. Farther above this threshold, wiggles in the total cross section for hadron production averaged out to indicate a step in $R$ of 1/3, confirming that $|e_Q| = 1/3$.
(3) The possibility that $Q$ was an isosinglet quark of charge $-1/3$, and thus not the partner of some quark $t$ with charge 2/3, was ruled out by the absence of significant flavor-changing neutral current decays such as $b \to s\mu^+\mu^-$ [24,25].
The structure of the $\Upsilon$ levels is remarkably similar to that of the charmonium levels except for having more levels below flavor threshold. For example, the fact that the 3S level is below flavor threshold allows it to decay to the
2P levels via electric dipole transitions with appreciable branching ratios; the transitions between the S and P levels are well described in potential models which reproduce other aspects of the spectra. Several reviews treat the fascinating regularities of the spectroscopy of these levels [26].

2.4 Discovery of B mesons
The lightest meson containing a $b$ quark and each flavor of light antiquark is expected to decay weakly. The allowed decays of $b$ are ($c$ or $u$) + (virtual $W^-$), with the $c$ giving rise to lots of strange particles while the $u$ gives few strange particles. The virtual $W^-$ can decay to $\bar u d$, $\bar c s$, $e^-\bar\nu_e$, $\mu^-\bar\nu_\mu$, and $\tau^-\bar\nu_\tau$. In $e^+e^-$ collisions above $B\bar B$ threshold, several signals of B meson production were observed by the CLEO Collaboration starting around 1980 [27]:
• Prompt leptons (signals of semileptonic decay)
• An abundance of kaons (a signal that $b \to c + W^-_{\rm virt}$ is preferred over $b \to u + W^-_{\rm virt}$)
• "Daughter" (lower-momentum) leptons from $c$ semileptonic decays.
These indirect signals were followed by reconstruction of $B^+$ and $B^0$ decays [28], e.g.,
$$B^+ = \bar b u \to \bar c u \bar d u \to \bar D^0\pi^+ , \qquad B^0 = \bar b d \to \bar c u \bar d d \to D^-\pi^+ . \qquad (2)$$
Typical branching ratios for these final states [16] are $(5.3 \pm 0.5) \times 10^{-3}$ for $\bar D^0\pi^+$ and $(3.0 \pm 0.4) \times 10^{-3}$ for $D^-\pi^+$. $B^0 \to \bar D^0\pi^0$ is also allowed but not yet observed. These small branching ratios mean that reconstruction of exclusive final states is even harder for B mesons than for charmed particles.

3 The known B hadrons
3.1 B mesons
The nonstrange ground-state $B$ (pseudoscalar) and $B^*$ (vector) mesons are compared with the corresponding charmed mesons in Table 2. Evidence for the $B^*$ exists in the form of a photon signal for the decay $B^* \to B\gamma$ [29]. The photon energy, 46 MeV, is expected to be the same as that seen in $B^{*0} \to B^0\gamma$ [30]. Since the $B^*$ and $B$ states are separated by only 46 MeV, a $B^*$ should always decay to a $B$ of the same flavor and a photon. This is in contrast to the case of the $D^*$ and $D$ states, whose separation is just about a pion mass. The electromagnetic mass splittings are such that $D^{*+} \to D^0\pi^+$, $D^{*+} \to$
Table 2: Ground-state heavy-light ($Q\bar q$) pseudoscalar mesons and the corresponding vector mesons. Here the spectroscopic notation $^{2S+1}L_J$ is used to denote the spin, orbital, and total angular momenta of the $Q\bar q$ state.

Quark      Pseudoscalar ($^1S_0$) meson        Vector ($^3S_1$) meson
content    Name      Mass (MeV/$c^2$)          Name         Mass (MeV/$c^2$)
$c\bar u$  $D^0$     $1864.5 \pm 0.5$          $D^{*0}$     $2006.7 \pm 0.5$
$c\bar d$  $D^+$     $1869.3 \pm 0.5$          $D^{*+}$     $2010.0 \pm 0.5$
$c\bar s$  $D_s$     $1968.6 \pm 0.6$          $D_s^*$      $2112.4 \pm 0.7$
$\bar b u$ $B^+$     $5279.0 \pm 0.5$          $B^{*+}$     $5325.0 \pm 0.6$
$\bar b d$ $B^0$     $5279.4 \pm 0.5$          $B^{*0}$     $5325.0 \pm 0.6$
$\bar b s$ $B_s$     $5369.6 \pm 2.4$          $B_s^*$      $\sim 5416$
$D^+\pi^0$, and $D^{*0} \to D^0\pi^0$ are just barely allowed, while $D^{*0} \to D^+\pi^-$ is forbidden. The low-momentum $\pi^+$ in $D^{*+} \to D^0\pi^+$ acts as a "tag," useful both for signalling the production of a charmed meson [31] and, by its charge, distinguishing the $D^0$ from a $\bar D^0$. Since $B^*$ decays are not useful for this type of "flavor" tag, one must resort to the decays of heavier excited $\bar b q$ states (Section 8). The hyperfine splitting of B mesons is smaller than that in charmed mesons because the chromomagnetic moments of the heavy quarks scale as the inverse of their masses:
$$\frac{m_{B^*} - m_B}{m_{D^*} - m_D} \simeq \frac{1/m_b}{1/m_c} = \frac{m_c}{m_b} \simeq \frac{1}{3} . \qquad (3)$$

3.2 The $\Lambda_b$ baryon
The lightest baryon containing a $b$ quark is the $\Lambda_b = b[ud]_{I=0}$. Its mass is $5624 \pm 9$ MeV/$c^2$ [16]. The $ud$ system must be in a color $3^*$ (antisymmetric) state, since the $b$ is a color triplet and the $\Lambda_b$ is a color singlet. The spin-zero state of $ud$ is favored over the spin-one state by the chromodynamic hyperfine interaction. By Fermi statistics, the $ud$ pair must then be in an (antisymmetric) isospin-zero state. For similar reasons, the $I = 0$ state of a strange quark and two nonstrange quarks, the $\Lambda = s[ud]_{I=0}$ with mass 1116 MeV/$c^2$, is lighter than the $\Sigma = s(uu, ud, dd)_{I=1}$ states with average mass 1193 MeV/$c^2$. The charmed analogue of the $\Lambda_b$ is the $\Lambda_c = c[ud]_{I=0}$ with mass $2284.9 \pm 0.6$ MeV/$c^2$. The difference in mass of the two particles is $M(\Lambda_b) - M(\Lambda_c) = 3339 \pm 9$ MeV/$c^2$. This provides an estimate of $m_b - m_c$ since there are no hyperfine terms involving the heavy quark; the light-quark system has zero spin in both baryons. There will be a correction of order $m_b^{-1} - m_c^{-1}$ due to
possible differences in kinetic energies. One can perform a similar estimate for $Q\bar q$ mesons by eliminating the hyperfine energy, performing a suitable average over vector ($^3S_1$) and pseudoscalar ($^1S_0$) meson masses. The expectation value of the relevant interaction term is (
4 Interactions of the b quark
In this Section we shall discuss the way in which the interactions of the $b$ quark provide information on the pattern of charge-changing weak interactions of quarks parametrized by the Cabibbo-Kobayashi-Maskawa (CKM) matrix $V$ [2,32]. More details on determination of the CKM matrix are included in the lectures by Buchalla [33], DeGrand [34], Falk [35], Neubert [36], and Wolfenstein [37].

4.1 The b lifetime: indication of small $|V_{cb}|$
The long $b$ quark lifetime ($> 1$ ps) indicated that the CKM element $V_{cb}$ was considerably smaller than $|V_{us}| \simeq |V_{cd}| \simeq 0.22$. One can estimate $V_{cb}$ using a free-quark method. The subprocess $b \to cW^{*-} \to c\ell^-\bar\nu_\ell$ has a rate
$$\Gamma(b \to c\ell^-\bar\nu_\ell) = \frac{G_F^2}{192\pi^3}\, m_b^5\, |V_{cb}|^2 f(m_b, m_c, m_\ell) , \qquad (4)$$
where $G_F = 1.16637(2) \times 10^{-5}$ GeV$^{-2}$ is the Fermi coupling constant. In the limit in which the lepton mass can be neglected, $f(m_b, m_c, m_\ell) = f(m_c^2/m_b^2)$, with $f(x) = 1 - 8x + 8x^3 - x^4 - 12x^2\ln x$. The uncertainty in the prediction for $\Gamma(b \to c\ell^-\bar\nu_\ell)$ due to that in $m_b$ is mitigated by the constraint noted above on $m_b - m_c \simeq 3.34$ GeV/$c^2$. Taking a nominal range of quark masses around $m_b = 4.7$ GeV/$c^2$ (and hence a range around $m_c = 1.36$ GeV/$c^2$, $f(m_c^2/m_b^2) = 0.54$), $\tau_b = 1.6 \times 10^{-12}$ s, and the branching ratio $\mathcal{B}(b \to c\ell\nu_\ell) \simeq 10.2\%$, one finds
$$|V_{cb}| \simeq 0.0384 - 0.0008\left(\frac{m_b - 4.7~{\rm GeV}/c^2}{0.1~{\rm GeV}/c^2}\right) . \qquad (5)$$
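The free-quark estimate of $|V_{cb}|$ described above can be checked numerically. The sketch below assumes the inputs quoted in the text ($\tau_b = 1.6$ ps, a semileptonic branching ratio of 10.2%, and $m_b - m_c = 3.34$ GeV/$c^2$); the value of $\hbar$ used to convert the lifetime to an energy width is an assumption not taken from the lectures, and the function names are illustrative.

```python
import math

G_F = 1.16637e-5   # Fermi constant in GeV^-2, as quoted in the text
HBAR = 6.582e-25   # hbar in GeV s (assumed conversion factor, not from the text)


def phase_space(x):
    """f(x) = 1 - 8x + 8x^3 - x^4 - 12 x^2 ln x, with x = m_c^2 / m_b^2."""
    return 1 - 8 * x + 8 * x**3 - x**4 - 12 * x**2 * math.log(x)


def vcb_free_quark(m_b, tau_b=1.6e-12, br_sl=0.102, mass_diff=3.34):
    """|V_cb| from the free-quark semileptonic rate.

    m_b in GeV/c^2, tau_b in seconds, br_sl = B(b -> c l nu),
    mass_diff = m_b - m_c in GeV/c^2.
    """
    m_c = m_b - mass_diff
    f = phase_space((m_c / m_b) ** 2)
    gamma_sl = br_sl * HBAR / tau_b                         # measured rate in GeV
    gamma_unit = G_F**2 * m_b**5 * f / (192 * math.pi**3)   # rate for |V_cb| = 1
    return math.sqrt(gamma_sl / gamma_unit)


if __name__ == "__main__":
    for m_b in (4.4, 4.7, 5.0):
        print(f"m_b = {m_b:.1f} GeV/c^2  ->  |V_cb| ~ {vcb_free_quark(m_b):.4f}")
```

Scanning $m_b$ over $\pm 0.3$ GeV/$c^2$ in this way shows how strongly the result depends on the quark mass even with the mass difference held fixed.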
Thus if $m_b$ is uncertain by 0.3 GeV/$c^2$ (my guess), $|V_{cb}|$ is uncertain by $\pm 0.0024$. Recent averages [16,35] give rise to values of $|V_{cb}|$ somewhat above 0.040 with errors of $\pm 0.002$ to $\pm 0.003$. A new report by the CLEO Collaboration [38] finds $|V_{cb}| = 0.0462 \pm 0.0036$ based on the exclusive decay process $B^0 \to D^{*-}\ell^+\nu_\ell$. This new determination bears watching as it would affect many conclusions regarding predictions for CP-violating asymmetries in B decays. We shall take $|V_{cb}| = 0.041 \pm 0.003$ as representing a conservative range of present values.

4.2 Charmless b decays: indication of smaller $|V_{ub}|$
Although the $u$ quark is lighter than the $c$ quark, its production in $b$ decays is disfavored, with $\Gamma(b \to u\ell\nu)/\Gamma(b \to c\ell\nu)$ only about 2%. Since the phase-space factor $f(m_u^2/m_b^2)$ is very close to 1, while $f(m_c^2/m_b^2) \simeq 1/2$, this means that $|V_{ub}/V_{cb}|^2 \simeq 1\%$, or $|V_{ub}/V_{cb}| \simeq 0.1$. The error on this quantity is dominated by theoretical uncertainty [35]; detailed studies [39] indicate $|V_{ub}/V_{cb}| = 0.090 \pm 0.025$.

4.3 Pattern of charge-changing weak quark transitions
The relative strengths of charge-changing weak quark transitions are illustrated in Fig. 1. Why the pattern looks like this is a mystery, one of the questions (along with the values of the quark masses) to be answered at a deeper level. The interactions in Fig. 1 may be parametrized by a Cabibbo-Kobayashi-Maskawa (CKM) matrix of the form [40]
$$V_{CKM} = \begin{bmatrix} 1 - \frac{\lambda^2}{2} & \lambda & A\lambda^3(\rho - i\eta) \\ -\lambda & 1 - \frac{\lambda^2}{2} & A\lambda^2 \\ A\lambda^3(1 - \rho - i\eta) & -A\lambda^2 & 1 \end{bmatrix} . \qquad (6)$$
The columns refer to $d, s, b$ and the rows to $u, c, t$. The parameter $\lambda = 0.22$ represents $\sin\theta_c$, where $\theta_c$ is the Gell-Mann-Levy-Cabibbo [32,41] angle. The value $|V_{cb}| = 0.041 \pm 0.003$ indicates $A = 0.85 \pm 0.06$, while $|V_{ub}/V_{cb}| = 0.090 \pm 0.025$ implies $(\rho^2 + \eta^2)^{1/2} = 0.41 \pm 0.11$. Further information may be obtained by assuming that box diagrams involving internal quarks $u, c, t$ with charge 2/3 are responsible for both the CP-violating contribution to $K^0$-$\bar K^0$ mixing and to mixing between neutral B
Figure 1: Pattern of charge-changing weak transitions among quarks. Solid lines: relative strength 1; dashed lines: relative strength 0.22; dot-dashed lines: relative strength 0.04; dotted lines: relative strength < 0.01. Breadths of lines denote estimated errors.
mesons and their antiparticles. The parameter $|\epsilon_K| = (2.27 \pm 0.02) \times 10^{-3}$ (see Buchalla's lectures [33]) then implies a constraint [42]
$$\eta(1 - \rho + 0.39) = 0.35 \pm 0.12 , \qquad (7)$$
where the $1 - \rho$ term in parentheses arises from box diagrams with two internal top quarks, while the correction 0.39 is due to diagrams with one charmed and one top quark. The error on the right-hand side is due primarily to uncertainty in the Wolfenstein parameter $A = |V_{cb}|/\lambda^2$, which enters to the fourth power in the $tt$ contribution to $\epsilon_K$. A lesser source of error is uncertainty in the parameter $B_K$ describing the quark box diagram's matrix element between a $K^0$ and a $\bar K^0$. We have chosen [43] $B_K = 0.87 \pm 0.13$. Present information on $B^0$-$\bar B^0$ mixing, interpreted in terms of box diagrams with two quarks of charge 2/3, leads to a constraint on $|V_{td}|^2$ which implies [42] $|1 - \rho - i\eta| = 0.87 \pm 0.21$ for the parameter range $f_B\sqrt{B_B} = 230 \pm 40$ MeV describing the matrix element of the short-distance 4-quark operator taking $\bar b d$ into $\bar d b$ between $B^0$ and $\bar B^0$ states. The best lower limit on $B_s$-$\bar B_s$ mixing [44], $\Delta m_s > 15~{\rm ps}^{-1}$, when compared with the corresponding value for
Figure 2: Region of $(\rho, \eta)$ specified by $\pm 1$ … [horizontal axis: $\rho$; vertical axis: $\eta$]
$B^0$-$\bar B^0$ mixing, $\Delta m_d = 0.487 \pm 0.014~{\rm ps}^{-1}$, leads to the bound
$$\frac{f_{B_s}^2 B_{B_s}}{f_B^2 B_B}\left|\frac{V_{ts}}{V_{td}}\right|^2 > 29 . \qquad (8)$$
This may be combined with the estimate [45] $f_{B_s}\sqrt{B_{B_s}} < 1.25\, f_B\sqrt{B_B}$ based on quark models. (Lattice gauge theories [34] estimate this coefficient more precisely, generally giving values between 1.1 and 1.2.) One finds $|V_{ts}/V_{td}| > 4.4$ or $|1 - \rho - i\eta| < 1.01$. The constraints may be combined to yield the allowed range in $(\rho, \eta)$ space illustrated in Fig. 2. Smaller regions are quoted in other reviews [46] which view the theoretical sources of error differently.

4.4 Unitarity triangle
The unitarity of the CKM matrix implies that to the order we are considering, $V_{ub}^* + V_{td} = A\lambda^3$. If this equation is divided by $A\lambda^3$, one obtains a triangle in the $(\rho, \eta)$ complex plane whose vertices are at $(0,0)$ (internal angle $\gamma$), $(\rho, \eta)$ (internal angle $\alpha$), and $(1,0)$ (internal angle $\beta$).
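The geometry of the triangle can be made concrete with a short calculation. The sketch below is illustrative only: the function names are not from the lectures, the point $(\rho, \eta) = (0.2, 0.35)$ is a hypothetical choice, and $A = 0.85$, $\lambda = 0.22$ are the central values quoted in Section 4.3.

```python
import math


def triangle_angles(rho, eta):
    """Internal angles (alpha, beta, gamma), in radians, of the triangle
    with vertices (0, 0), (rho, eta) and (1, 0); assumes eta > 0."""
    gamma = math.atan2(eta, rho)         # angle at (0, 0)
    beta = math.atan2(eta, 1.0 - rho)    # angle at (1, 0)
    alpha = math.pi - beta - gamma       # angle at (rho, eta)
    return alpha, beta, gamma


def unitarity_sum(rho, eta, A=0.85, lam=0.22):
    """V_ub^* + V_td in the Wolfenstein form; should equal A * lam**3."""
    v_ub = A * lam**3 * complex(rho, -eta)
    v_td = A * lam**3 * complex(1.0 - rho, -eta)
    return v_ub.conjugate() + v_td


if __name__ == "__main__":
    rho, eta = 0.2, 0.35   # hypothetical point, for illustration only
    alpha, beta, gamma = triangle_angles(rho, eta)
    print("angles (deg):", [round(math.degrees(a), 1) for a in (alpha, beta, gamma)])
    print("sin 2 alpha =", round(math.sin(2 * alpha), 3))
    print("sin 2 beta  =", round(math.sin(2 * beta), 3))
    print("V_ub* + V_td =", unitarity_sum(rho, eta), " vs A lam^3 =", 0.85 * 0.22**3)
```

With this convention $V_{ub}$ carries the phase $-\gamma$ and $V_{td}$ the phase $-\beta$, which is how the angles enter the CP asymmetries discussed later.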
CP-violating asymmetries in certain B decays can measure such quantities as $\sin 2\alpha$ and $\sin 2\beta$. The former, measured in $B^0 \to \pi^+\pi^-$ with some corrections due to "penguin" diagrams, may occupy a wide range, as illustrated in Fig. 2. The latter, measured in the "golden mode" $B^0 \to J/\psi K_S$ with few uncertainties, is more constrained by other observables. The goal of measurements of CP violation and other quantities in B decays will be to test the consistency of this picture and to either restrict the parameter space further, thus providing a reliable target for future theories of these parameters, or to expose inconsistencies that will point to new physics. Hence part of the program will be to overconstrain the unitarity triangle, measuring both sides and angles in several different types of processes. While we discuss such measurements based on B mesons, Buchalla [33] describes how, for example, $K^+ \to \pi^+\nu\bar\nu$ constrains the combination $|1 - \rho - i\eta + 0.44|$, where the last term is a charmed quark correction to the dominant top quark contribution, and the purely CP-violating process $K_L \to \pi^0\nu\bar\nu$ constrains $\eta$.

5 Mixing of neutral B mesons
5.1 Mass matrix formalism
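We shall work in a two-component basis utilizing the states $(B^0, \bar B^0)$. [It is also sometimes useful to consider a basis [37] $(B_+, B_-)$, where $B_\pm = (B^0 \pm \bar B^0)/\sqrt{2}$.] The time-dependence of these states is described via a mass matrix $\mathcal{M} = M - i\Gamma/2$, where $M$ and $\Gamma$ are Hermitian by definition:
$$i\frac{\partial}{\partial t}\begin{bmatrix} B^0 \\ \bar B^0 \end{bmatrix} = \mathcal{M}\begin{bmatrix} B^0 \\ \bar B^0 \end{bmatrix} . \qquad (9)$$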
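The requirement of CPT invariance, which we shall assume henceforth, implies $\mathcal{M}_{11} = \mathcal{M}_{22}$, or equal transition amplitudes for $K^0 \to K^0$ and $\bar K^0 \to \bar K^0$.
Exercise: (a) Show this. Remember time reversal is an antiunitary operator. (b) Show that a similar argument applied to $\mathcal{M}_{12}$ or $\mathcal{M}_{21}$ leads to no constraint. (c) Relate the result in (a) to the result quoted by Wolfenstein [37] for the $(B^0 \pm \bar B^0)/\sqrt{2}$ basis.
[Answer to Part (a): Insert the unit operator $(CPT)^{-1}CPT$ before and after $\mathcal{M}$ in $\mathcal{M}_{11} = \langle B^0|\mathcal{M}|B^0\rangle$. Note that $CPT(M - i\Gamma/2)(CPT)^{-1} = M + i\Gamma/2$. Then
$$\mathcal{M}_{11} = \langle B^0|M - \tfrac{i\Gamma}{2}|B^0\rangle = \langle CPT\,B^0|M + \tfrac{i\Gamma}{2}|CPT\,B^0\rangle^* = \langle \bar B^0|M + \tfrac{i\Gamma}{2}|\bar B^0\rangle^* = \langle \bar B^0|M - \tfrac{i\Gamma}{2}|\bar B^0\rangle = \mathcal{M}_{22} , \qquad (10)$$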
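where the antiunitarity of $T$ has been used in the first step. For a discussion of antiunitary operators see, e.g., Sakurai's book on quantum mechanics [47].]
The eigenstates of $\mathcal{M}$ may be denoted by $B_H$ ("heavy") and $B_L$ ("light"):
$$|B_{H,L}\rangle = p_{H,L}|B^0\rangle + q_{H,L}|\bar B^0\rangle . \qquad (11)$$
The corresponding eigenvalues
$$\mu_{H,L} = m_{H,L} - \frac{i}{2}\Gamma_{H,L} \qquad (12)$$
satisfy
$$\mathcal{M}\begin{bmatrix} p_i \\ q_i \end{bmatrix} = \begin{bmatrix} \mathcal{M}_{11} & \mathcal{M}_{12} \\ \mathcal{M}_{21} & \mathcal{M}_{11} \end{bmatrix}\begin{bmatrix} p_i \\ q_i \end{bmatrix} = \mu_i \begin{bmatrix} p_i \\ q_i \end{bmatrix} \quad (i = H, L) , \qquad (13)$$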
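so that $\mathcal{M}_{12}(q_i/p_i) = \mathcal{M}_{21}(p_i/q_i)$, or $(p_i/q_i)^2 = \mathcal{M}_{12}/\mathcal{M}_{21}$. For $B^0$-$\bar B^0$ mixing, in contrast to the situation for neutral kaons, the scarcity of intermediate states accessible to both $B^0$ and $\bar B^0$ and the presence of a large top-quark contribution to $M_{12}$ means that $|\Gamma_{12}| \ll |M_{12}|$, so that
$$\frac{p_i}{q_i} \simeq \pm\sqrt{\frac{M_{12}}{M_{21}}} . \qquad (14)$$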
We may choose $p_L = p_H = p$, $q_L = -q_H = q$. Normalizing $|p|^2 + |q|^2 = 1$, we then write
$$|B_L\rangle = p|B^0\rangle + q|\bar B^0\rangle , \qquad |B_H\rangle = p|B^0\rangle - q|\bar B^0\rangle . \qquad (15)$$
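The sign ambiguity in (14) may be resolved as follows. Since
$$\mu_L = \mathcal{M}_{11} + \mathcal{M}_{12}\frac{q}{p} = \mathcal{M}_{11} + \mathcal{M}_{21}\frac{p}{q} , \qquad (16)$$
$$\mu_H = \mathcal{M}_{11} - \mathcal{M}_{12}\frac{q}{p} = \mathcal{M}_{11} - \mathcal{M}_{21}\frac{p}{q} , \qquad (17)$$
then $\mu_H - \mu_L = -2\mathcal{M}_{12}(q/p)$, which in the limit $|\Gamma_{12}| \ll |M_{12}|$ is $\mu_H - \mu_L = (-,+)\,2\sqrt{M_{12}M_{21}}$ for the choice of $(+,-)$ in (14). Since $M_{21} = M_{12}^*$ ($M$ is Hermitian), we must take the $-$ sign in (14) in order that the "heavy" mass $m_H$ be greater than the "light" mass $m_L$. Then
$$\frac{p_L}{q_L} = -\sqrt{\frac{M_{12}}{M_{21}}} . \qquad (18)$$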
Neglecting $\Gamma_{12}$ in comparison with $|M_{12}|$, we then find
$$\Delta m \equiv m_H - m_L \simeq 2|M_{12}| , \qquad \Delta\Gamma \equiv \Gamma_H - \Gamma_L \simeq 0 . \qquad (19)$$
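The sign choice and the result $\Delta m \simeq 2|M_{12}|$ can be verified by diagonalizing the $2 \times 2$ matrix directly. The sketch below is a minimal illustration in arbitrary units; the function name, the numerical inputs, and the parametrization $M_{12} = -|M_{12}|e^{2i\beta}$ used in the demonstration are assumptions made for the example.

```python
import cmath


def diagonalize_mixing(m11, m12, gamma11=0.0, gamma12=0.0):
    """Eigenvalues (mu_H, mu_L) and q/p for the matrix M - i*Gamma/2 with
    equal diagonal elements (CPT) and Hermitian M and Gamma.
    Assumes the off-diagonal element is nonzero."""
    a = m11 - 0.5j * gamma11                           # diagonal element
    b = m12 - 0.5j * gamma12                           # (1,2) element
    c = m12.conjugate() - 0.5j * gamma12.conjugate()   # (2,1) element
    q_over_p = cmath.sqrt(c / b)      # one of the two roots of (q/p)^2 = c/b
    mu_l = a + b * q_over_p
    mu_h = a - b * q_over_p
    if mu_h.real < mu_l.real:         # pick the root that makes "H" the heavier state
        q_over_p = -q_over_p
        mu_l, mu_h = mu_h, mu_l
    return mu_h, mu_l, q_over_p


if __name__ == "__main__":
    beta = 0.4                                   # illustrative weak phase (radians)
    m12 = -0.25 * cmath.exp(2j * beta)           # |M12| = 0.25 in arbitrary units
    mu_h, mu_l, q_over_p = diagonalize_mixing(5.0, m12)
    print("Delta m       =", (mu_h - mu_l).real)   # expect 2|M12| = 0.5
    print("q/p           =", q_over_p)
    print("exp(-2i beta) =", cmath.exp(-2j * beta))
```

Passing a small nonzero `gamma12` shows how $|q/p|$ departs from unity once $\Gamma_{12}$ is retained.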
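If one keeps $\Gamma_{12}$ to lowest order, one can show [48] that
$$\frac{q}{p} \simeq -\frac{M_{12}^*}{|M_{12}|}\left[1 - \frac{1}{2}\,{\rm Im}\left(\frac{\Gamma_{12}}{M_{12}}\right)\right] . \qquad (20)$$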
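In the limit that $\Gamma_{12}$ is negligible and $\Delta\Gamma = 0$, $q/p$ is a pure phase, determined by the phase of $M_{12}$. Now, $M_{12}$ takes $\bar B^0 = b\bar d$ into $B^0 = d\bar b$, so its phase is that of $(V_{tb}V_{td}^*)^2$, or $e^{2i\beta}$. Thus in this limit we find $q/p \simeq e^{-2i\beta}$. More specifically, in the phase convention in which $(CP)|B^0\rangle = +|\bar B^0\rangle$, we find
$$M_{12} = -\frac{G_F^2}{12\pi^2}(V_{tb}V_{td}^*)^2 M_W^2\, m_B f_B^2 B_B\, \eta_B\, S\!\left(\frac{m_t^2}{M_W^2}\right) . \qquad (21)$$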
Here $f_B$ is the B meson decay constant, $B_B$ is the vacuum saturation factor, $\eta_B = 0.55$ is a QCD correction factor [49], and [50]
$$S(x) = \frac{x}{4}\left[1 + \frac{3 - 9x}{(x-1)^2} + \frac{6x^2\ln x}{(x-1)^3}\right] . \qquad (22)$$
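To get a feeling for the size of the loop function $S(x)$, it can be evaluated at the top-quark argument. This is a rough sketch: the $W$ mass of 80.4 GeV/$c^2$ is an assumed input not quoted in the text, $m_t = 165$ GeV/$c^2$ is the running mass mentioned in this section, and the function name is illustrative.

```python
import math


def box_function_s(x):
    """S(x) = (x/4) [1 + (3 - 9x)/(x-1)^2 + 6 x^2 ln x / (x-1)^3].
    The expression has a removable singularity at x = 1, which is not handled here."""
    return 0.25 * x * (
        1.0
        + (3.0 - 9.0 * x) / (x - 1.0) ** 2
        + 6.0 * x**2 * math.log(x) / (x - 1.0) ** 3
    )


if __name__ == "__main__":
    M_W = 80.4    # assumed W mass in GeV/c^2 (not taken from the text)
    m_t = 165.0   # running top mass in GeV/c^2, as used in this section
    x_t = (m_t / M_W) ** 2
    print(f"x_t = {x_t:.3f}, S(x_t) = {box_function_s(x_t):.3f}")
    # For light internal quarks S(x) approaches x, so heavy quarks dominate the box.
    print(f"S(1e-4) / 1e-4 = {box_function_s(1e-4) / 1e-4:.4f}")
```

The small-$x$ behavior $S(x) \to x$ illustrates why the top quark, rather than the lighter charge-2/3 quarks, dominates $M_{12}$ for B mesons.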
The appropriate top quark mass for this calculation [51] is $m_t(m_t) \simeq 165$ GeV/$c^2$. The BaBar Physics Book [48] may be consulted for further conventions and details.

5.2 Time dependences
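We would like to know how states which are initially $B^0$ or $\bar B^0$ evolve in time. The mass eigenstates evolve as $B_i \to B_i e^{-i\mu_i t}$ $(i = L, H)$. The flavor eigenstates are expressed in terms of them as
$$t = 0: \quad |B^0\rangle = \frac{|B_L\rangle + |B_H\rangle}{2p} , \qquad |\bar B^0\rangle = \frac{|B_L\rangle - |B_H\rangle}{2q} , \qquad (23)$$
$$t > 0: \quad |B^0(t)\rangle = \left(|B_L\rangle e^{-i\mu_L t} + |B_H\rangle e^{-i\mu_H t}\right)/2p , \qquad (24)$$
$$|\bar B^0(t)\rangle = \left(|B_L\rangle e^{-i\mu_L t} - |B_H\rangle e^{-i\mu_H t}\right)/2q . \qquad (25)$$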
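Now substitute back for $B_{L,H}$:
$$|B^0(t)\rangle = |B^0\rangle f_+(t) + \frac{q}{p} f_-(t)|\bar B^0\rangle , \qquad (26)$$
$$|\bar B^0(t)\rangle = |\bar B^0\rangle f_+(t) + \frac{p}{q} f_-(t)|B^0\rangle , \qquad (27)$$
$$f_+(t) = e^{-imt}e^{-\Gamma t/2}\cos(\Delta\mu\, t/2) , \qquad (28)$$
$$f_-(t) = e^{-imt}e^{-\Gamma t/2}\, i\sin(\Delta\mu\, t/2) , \qquad (29)$$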
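where $\Delta\mu = \mu_H - \mu_L = \Delta m - i(\Delta\Gamma/2)$, $\Delta m = m_H - m_L$, $\Delta\Gamma = \Gamma_H - \Gamma_L$, $m = (m_H + m_L)/2$, $\Gamma = (\Gamma_H + \Gamma_L)/2$. Again, for simplicity, we shall neglect $\Delta\Gamma$ in comparison with $\Delta m$. A lowest-order quark model calculation (for which QCD corrections change the answer) gives [52]
$$\frac{\Gamma_{12}}{M_{12}} = -\frac{3\pi}{2}\,\frac{m_b^2}{M_W^2}\,\frac{1}{S(m_t^2/M_W^2)}\left(1 + \frac{8}{3}\frac{m_c^2}{m_b^2}\frac{V_{cb}V_{cd}^*}{V_{tb}V_{td}^*}\right) \simeq -\frac{1}{180} , \qquad (30)$$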
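where $S(x)$ was defined in Eq. (22). The intermediate states dominating the loop calculation of $\Gamma_{12}$ have typical mass scales $m_b$, whereas loop momenta of order $m_t$ give rise to the main contributions to $M_{12}$. Neglecting $\Delta\Gamma$ and performing time integrals, one finds
$$\Gamma\int_0^\infty dt\,|f_+(t)|^2 = \frac{2 + x_d^2}{2(1 + x_d^2)} , \qquad \Gamma\int_0^\infty dt\,|f_-(t)|^2 = \frac{x_d^2}{2(1 + x_d^2)} , \qquad (31)$$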
where $x_d \equiv \Delta m_{B_d}/\Gamma_{B_d}$, and $B_d$ is another name for $B^0 = \bar b d$ to distinguish it from $B_s = \bar b s$. The sum of the two terms is 1. The first term is 1 for $x_d = 0$, approaches 1/2 for $x_d \to \infty$, and is about 0.82 for the actual value $x_d = 0.754 \pm 0.027$. The second term is 0 for $x_d = 0$, approaches 1/2 for $x_d \to \infty$, and is about 0.18 in actuality. Thus a neutral non-strange B of a given flavor ($B^0$ or $\bar B^0$) has about 18% probability of decaying as the opposite flavor.

6 CP violation
6.1 Asymmetry: general remarks
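We wish to compare $\langle f|B^0_{t=0}(t)\rangle$ and $\langle \bar f|\bar B^0_{t=0}(t)\rangle$, where $f$ is a final state and $\bar f \equiv (CP)f$. Now define
$$x \equiv \frac{\langle f|\bar B^0\rangle}{\langle f|B^0\rangle} , \qquad \bar x \equiv \frac{\langle \bar f|B^0\rangle}{\langle \bar f|\bar B^0\rangle} , \qquad \lambda_0 \equiv \frac{q}{p}\,x , \qquad \bar\lambda_0 \equiv \frac{p}{q}\,\bar x . \qquad (32)$$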
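Using the time evolution derived earlier for $B^0_{t=0}$ and $\bar B^0_{t=0}$, one then finds
$$\langle f|B^0_{t=0}(t)\rangle = \langle f|B^0\rangle\left[f_+(t) + \lambda_0 f_-(t)\right] , \qquad (33)$$
$$\langle \bar f|\bar B^0_{t=0}(t)\rangle = \langle \bar f|\bar B^0\rangle\left[f_+(t) + \bar\lambda_0 f_-(t)\right] . \qquad (34)$$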
This result can be simplified under several circumstances. (a) If there is a single strong eigenchannel, final-state strong interaction phases in $x$ or $\bar x$ cancel, since the numerator and denominator refer to the same final state. Then $\bar x = x^*$, since weak phases flip sign under CP. (b) Recall that $|q/p|$ is nearly 1 for B mesons. (For non-strange B's, we found $q/p \simeq e^{-2i\beta}$.) Combining (a) and (b), we find $\bar\lambda_0 = \lambda_0^*$ for these cases.

6.2 Time-dependent asymmetry
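According to Eqs. (33) and (34), the rates for a $(B^0, \bar B^0)$ produced at $t = 0$ to evolve to the respective final states $(f, \bar f)$ at a time $t$ are
$$d\Gamma(B^0_{t=0} \to f)/dt \sim |f_+(t) + \lambda_0 f_-(t)|^2 , \qquad (35)$$
$$d\Gamma(\bar B^0_{t=0} \to \bar f)/dt \sim |f_+(t) + \bar\lambda_0 f_-(t)|^2 , \qquad (36)$$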
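with the coefficients of proportionality identical if there is a single strong eigenchannel. Now consider the case of a CP-eigenstate $f$ such that $\bar f = \pm f$. Then we have not only $\bar x = x^*$ (see above), but also $\bar x = x^{-1}$, so $|x| = 1$. In that case, when $|q/p| = 1$ as is the case for neutral B's, we have $|\lambda_0| = 1$ and $\bar\lambda_0 = \lambda_0^*$. Then
$$|f_+(t) + \lambda_0 f_-(t)|^2 = e^{-\Gamma t}\left|\cos\frac{\Delta m\,t}{2} + i\lambda_0\sin\frac{\Delta m\,t}{2}\right|^2 \qquad (37)$$
$$= e^{-\Gamma t}\left[1 - {\rm Im}\lambda_0\sin\Delta m\,t\right] , \qquad (38)$$
$$d\Gamma(B^0_{t=0} \to f)/dt \sim e^{-\Gamma t}\left[1 - {\rm Im}\lambda_0\sin\Delta m\,t\right] , \qquad d\Gamma(\bar B^0_{t=0} \to \bar f)/dt \sim e^{-\Gamma t}\left[1 + {\rm Im}\lambda_0\sin\Delta m\,t\right] . \qquad (39)$$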
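The step from the squared amplitude to the closed form $e^{-\Gamma t}[1 \mp {\rm Im}\lambda_0\sin\Delta m\,t]$ can be confirmed numerically. The sketch below assumes $\Delta\Gamma = 0$ and $|\lambda_0| = 1$; the function names and the illustrative phase $\beta = 0.4$ rad are assumptions, while $\Delta m_d$ and $x_d$ are the values quoted in the text.

```python
import cmath
import math


def decay_rate(t, lam0, dm, gamma):
    """|f_+(t) + lam0 f_-(t)|^2 with Delta Gamma neglected.
    The common phase exp(-i m t) drops out of the modulus."""
    envelope = math.exp(-gamma * t / 2)
    f_plus = envelope * math.cos(dm * t / 2)
    f_minus = 1j * envelope * math.sin(dm * t / 2)
    return abs(f_plus + lam0 * f_minus) ** 2


def asymmetry(t, lam0, dm, gamma):
    """Time-dependent asymmetry from the B0 and B0bar rates, using
    lam0bar = conj(lam0) as appropriate for a CP eigenstate with |lam0| = 1."""
    rate = decay_rate(t, lam0, dm, gamma)
    rate_bar = decay_rate(t, lam0.conjugate(), dm, gamma)
    return (rate - rate_bar) / (rate + rate_bar)


if __name__ == "__main__":
    dm = 0.487                 # Delta m_d in ps^-1, as quoted in the text
    gamma = dm / 0.754         # Gamma from x_d = Delta m / Gamma = 0.754
    beta = 0.4                 # illustrative phase only
    lam0 = -cmath.exp(-2j * beta)
    for t in (0.5, 2.0, 5.0):  # proper time in ps
        closed_form = -lam0.imag * math.sin(dm * t)
        print(f"t = {t:3.1f} ps: asymmetry = {asymmetry(t, lam0, dm, gamma):+.4f}, "
              f"-Im(lam0) sin(dm t) = {closed_form:+.4f}")
```

Choosing `lam0` with modulus different from 1 in `decay_rate` makes the $\cos^2$ and $\sin^2$ terms unequal, which is one way to see where a $\cos\Delta m\,t$ term in the rates would come from.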
The second term in each of these equations consists of an exponential decay modulated by a sinusoidal oscillation. The time-dependent asymmetry is then
$$A(t) \equiv \frac{d\Gamma(B^0_{t=0} \to f)/dt - d\Gamma(\bar B^0_{t=0} \to \bar f)/dt}{d\Gamma(B^0_{t=0} \to f)/dt + d\Gamma(\bar B^0_{t=0} \to \bar f)/dt} = -{\rm Im}\lambda_0\sin\Delta m\,t . \qquad (40)$$
When $\Delta m/\Gamma \gg 1$, the wiggles in Eqs. (39) average out, and not much time-integrated asymmetry is possible, while when $\Delta m/\Gamma \ll 1$ the decay occurs before there is time for oscillations. The maximal time-integrated asymmetry occurs when $\Delta m/\Gamma = 1$. When more than one eigenchannel is present, the condition $|\lambda_0| = 1$ need not be satisfied, so that the terms $\cos^2(\Delta m\,t/2)$ and $\sin^2(\Delta m\,t/2)$ in (37) need not have the same coefficients, and a $\cos\Delta m\,t$ term is generated in the rates. This is the signal of "direct" CP violation, as will be discussed below. Its presence for $B \to \pi\pi$ was pointed out by London and Peccei [53] and by Gronau [54].

6.3 Time-integrated asymmetry
If one integrates the rates for $B^0_{t=0} \to f$ and $\bar B^0_{t=0} \to \bar f$, one can form the time-integrated asymmetry [55]
$$C_f \equiv \frac{\Gamma(B^0_{t=0} \to f) - \Gamma(\bar B^0_{t=0} \to \bar f)}{\Gamma(B^0_{t=0} \to f) + \Gamma(\bar B^0_{t=0} \to \bar f)} . \qquad (41)$$
If we consider the cases (as above) in which $|\langle f|B^0\rangle| = |\langle \bar f|\bar B^0\rangle|$, we just need the integral
$$\int_0^\infty dt\,\sin(\Delta m\,t)\,e^{-\Gamma t} = \frac{1}{\Gamma}\,\frac{x_d}{1 + x_d^2} , \qquad (42)$$
and we then find
$$C_f = -\frac{x_d}{1 + x_d^2}\,{\rm Im}\lambda_0 \qquad (43)$$
when $|x| = 1$. This is indeed maximal when $x_d = 1$; the coefficient of $-{\rm Im}\lambda_0$ is 1/2. For the actual value of $x_d \simeq 0.75$, the coefficient is 0.48 instead, very close to its maximal value.

6.4 Specific examples in decays to CP eigenstates
When $f$ is a CP eigenstate, a CP-violating difference between the rates for $B^0 \to f$ and $\bar B^0 \to f$ arises as a result of interference between the direct decays and those proceeding via mixing (i.e., $B^0 \to \bar B^0 \to f$ and $\bar B^0 \to B^0 \to f$). The second term in Eqs. (39) is the result of this mixing. As mentioned, the rate asymmetry goes to zero when $x_d \to 0$ or $x_d \to \infty$. We now illustrate the calculation for two specific examples, $B^0 \to J/\psi K_S$ and $B^0 \to \pi\pi$.

The "golden mode": $J/\psi K_S$

The quark subprocess governing $B^0 \to J/\psi K_S$ is $\bar b \to \bar c c\bar s$, whose CKM factor is $V_{cb}^*V_{cs}$. The $K_S$ is produced through its $K^0$ component. The corresponding decay $\bar B^0 \to J/\psi K_S$ proceeds via $b \to c\bar c s$ and involves the $\bar K^0$ component of $K_S$.
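For a CP-eigenstate, we defined $x = \langle f|\bar B^0\rangle/\langle f|B^0\rangle$ and $\lambda_0 = (q/p)x$, but what we actually calculate is $\langle \bar f|\bar B^0\rangle/\langle f|B^0\rangle$ where $\bar f = \eta^f_{CP} f$ with $\eta^f_{CP} = \pm 1$. For $f = J/\psi K_S$, $\eta^f_{CP} = -1$. To show this, note that $CP|K_S\rangle = |K_S\rangle$ and $CP|J/\psi\rangle = |J/\psi\rangle$ (since $J/\psi$ has odd C and P). The decay of the spin-zero $B^0$ to the spin-one $J/\psi$ and the spin-zero $K_S$ produces the final particles in a state of orbital angular momentum $\ell = 1$ and hence odd parity, introducing an additional factor of $-1$. Then
$$x = -\frac{\langle K_S|\bar K^0\rangle\,\langle J/\psi\,\bar K^0|\bar B^0\rangle}{\langle K_S|K^0\rangle\,\langle J/\psi\,K^0|B^0\rangle} . \qquad (44)$$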
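(A good discussion of the sign is given by Bigi and Sanda [56].) Now $|K_S\rangle = p_K|K^0\rangle + q_K|\bar K^0\rangle$, so that $\langle K_S|\bar K^0\rangle = q_K^*$ and $\langle K_S|K^0\rangle = p_K^*$. These numbers are very close to $1/\sqrt{2}$. If the loop calculation of $M_{12}$ for $K^0$-$\bar K^0$ mixing is dominated by the charmed quark, then $(q_K/p_K) \simeq (V_{cd}V_{cs}^*)/(V_{cd}^*V_{cs})$, and
$$\lambda_0 = -\frac{V_{cd}^*V_{cs}\,V_{cb}V_{cs}^*\,V_{td}V_{tb}^*}{V_{cd}V_{cs}^*\,V_{cb}^*V_{cs}\,V_{td}^*V_{tb}} . \qquad (45)$$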
We assumed a specific quark to dominate the calculation of $M_{12}$ to illustrate the self-consistency of the expression for $\lambda_0$ with respect to redefinition of quark phases. Note first of all that the denominator is the complex conjugate of the numerator. Then note that each quark is represented by the same number of $V$'s and $V^*$'s in the numerator: 2 for the charmed quark and 1 each for $d$, $s$, $b$, and $t$. Thus any phase rotation of a quark field leaves the expression invariant. (Bjorken and Dunietz have introduced a nice representation of this invariance [57].) The same cancellation would have occurred if we were to say another quark dominated $K^0$-$\bar K^0$ mixing. For the final state $f = J/\psi K_S$ we thus find $\lambda_0 = -e^{-2i\beta}$ and ${\rm Im}\lambda_0 = \sin 2\beta$, leading to the time-integrated rate asymmetry $C_{J/\psi K_S} = -x_d\sin 2\beta/(1 + x_d^2)$. In practice the experiments often select events occurring for a proper time $t \geq t_0 > 0$ in order to enhance the signal/noise ratio, so that analyses are usually based on the time-dependent asymmetry mentioned earlier. Some recent results on $\sin 2\beta$ [58-62] are quoted in Table 3, and $\pm 1\sigma$ limits from the average are plotted in Fig. 2. While the central value is somewhat below that favored by other observables, there is no significant discrepancy.

The $\pi^+\pi^-$ mode and its complications

The main subprocess in $B^0 \to \pi^+\pi^-$ is the "tree" diagram in which $\bar b \to \pi^+\bar u$, with the spectator $d$ combining with the $\bar u$ to make a $\pi^-$. Let us temporarily assume this is the only important process and compute the
Table 3: Values of $\sin 2\beta$ implied by recent measurements of the CP-violating asymmetry in $B^0 \to J/\psi K_S$.

Experiment    Value
OPAL 58       3.2 (+…/−…) ± 0.5
CDF 59        0.79 (+…/−…)
ALEPH 60      0.84 (+…/−…) ± 0.16
Belle 61      0.58 (+…/−…) (+…/−…)
BaBar 62      0.34 ± 0.20 ± 0.05
Average       0.48 ± 0.16
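CP-violating rate asymmetry. We shall return in Section 11 to the important role of "penguin" diagrams. Since the (spin-zero) $\pi^+\pi^-$ system in $B^0$ decay has even CP, we find

$$\lambda_0 = \frac{q}{p}\,\frac{\langle \pi^+\pi^-|\bar B^0\rangle}{\langle \pi^+\pi^-|B^0\rangle} = \frac{V_{td}V_{tb}^*}{V_{td}^*V_{tb}}\,\frac{V_{ub}V_{ud}^*}{V_{ub}^*V_{ud}} \qquad (46)$$

$$= e^{-2i\beta}\,e^{-2i\gamma} . \qquad (47)$$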
Exercise: Check the invariance of this expression under redefinitions of quark phases.

Since $\beta + \gamma = \pi - \alpha$, we have $\lambda_0 = e^{2i\alpha}$, ${\rm Im}(\lambda_0) = \sin 2\alpha$, and $C_{\pi^+\pi^-} = -x_d \sin 2\alpha/(1 + x_d^2)$. [Remember that our asymmetries are defined in terms of $(B^0 - \bar B^0)/(B^0 + \bar B^0)$.] This result is limited in its usefulness for several reasons. (a) Our neglect of penguin diagrams will turn out to make a big difference. (b) The range of $\sin 2\alpha$ is large enough that early asymmetry measurements are unlikely to expose contradictions with the standard prediction. (c) An even larger range of negative $\sin 2\alpha$ turns out to be allowed if $V_{cb}$ is larger than assumed in Sec. 4.

An interesting exercise (whose result would, of course, be modified by penguin contributions) is to suppose that the asymmetries in $B^0 \to J/\psi K_S$ and $B^0 \to \pi^+\pi^-$ are due entirely to mixing (i.e., to a "superweak" interaction).63 In this case, since $J/\psi K_S$ and $\pi^+\pi^-$ have opposite CP eigenvalues, one has $C_{\pi^+\pi^-} = -C_{J/\psi K_S}$. What range of parameters in the standard CKM picture would imitate this relation? In other words, for what $\rho$ and $\eta$ would one have $\sin 2\alpha = -\sin 2\beta$? [The answer is $\eta = (1 - \rho)\sqrt{\rho/(2 - \rho)}$.]
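The bracketed answer to the superweak exercise can be checked numerically. The sketch below assumes the usual unitarity-triangle construction with vertices at (0,0), (1,0) and $(\rho,\eta)$, so that $\tan\gamma = \eta/\rho$, $\tan\beta = \eta/(1-\rho)$ and $\alpha = \pi - \beta - \gamma$; the function names are illustrative and are not taken from the text.

```python
import math


def triangle_angles(rho, eta):
    """Angles (alpha, beta, gamma) of the triangle with vertices (0,0), (1,0), (rho, eta)."""
    gamma = math.atan2(eta, rho)        # angle at the origin
    beta = math.atan2(eta, 1.0 - rho)   # angle at (1, 0)
    alpha = math.pi - beta - gamma      # angle at (rho, eta)
    return alpha, beta, gamma


def superweak_eta(rho):
    """eta(rho) quoted in the text as the solution of sin 2alpha = -sin 2beta."""
    return (1.0 - rho) * math.sqrt(rho / (2.0 - rho))


def sin2alpha_plus_sin2beta(rho):
    """Should vanish on the curve eta = superweak_eta(rho) for 0 < rho < 1."""
    alpha, beta, _ = triangle_angles(rho, superweak_eta(rho))
    return math.sin(2.0 * alpha) + math.sin(2.0 * beta)


if __name__ == "__main__":
    for rho in (0.1, 0.2, 0.4):
        print(rho, superweak_eta(rho), sin2alpha_plus_sin2beta(rho))
```

On this curve one has $\alpha = \pi/2 + \beta$, which is the non-degenerate solution of $\sin 2\alpha = -\sin 2\beta$.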
Figure 3: Contributions to $B^0 \to K^+\pi^-$. (a) Color-favored "tree" amplitude $\sim V_{ub}^*V_{us}$; (b) "penguin" amplitude $\sim V_{tb}^*V_{ts}$.

7 Decays to CP-noneigenstates
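If the final state $f$ is not a CP eigenstate, i.e. if $\bar f \neq \pm f$, as in the case $f = K^+\pi^-$, $\bar f = K^-\pi^+$, then a CP-violating rate asymmetry requires two interfering decay channels with different weak and strong phases:

$$A(B \to f) = A_1 e^{i\phi_1}e^{i\delta_1} + A_2 e^{i\phi_2}e^{i\delta_2} , \qquad (48)$$

$$A(\bar B \to \bar f) = A_1 e^{-i\phi_1}e^{i\delta_1} + A_2 e^{-i\phi_2}e^{i\delta_2} . \qquad (49)$$

Here the weak phases $\phi_i$ change sign under CP conjugation, while the strong phases $\delta_i$ do not. With $\Delta\phi \equiv \phi_1 - \phi_2$ and $\Delta\delta \equiv \delta_1 - \delta_2$, the rate asymmetry $\mathcal{A}(f)$ is then

$$\mathcal{A}(f) \equiv \frac{|A(B \to f)|^2 - |A(\bar B \to \bar f)|^2}{|A(B \to f)|^2 + |A(\bar B \to \bar f)|^2} \qquad (50)$$

$$= \frac{-2A_1A_2 \sin\Delta\phi\,\sin\Delta\delta}{A_1^2 + A_2^2 + 2A_1A_2\cos\Delta\phi\,\cos\Delta\delta} . \qquad (51)$$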
Examples of interesting channels
$B^0 \to K^+\pi^-$ vs. $\bar B^0 \to K^-\pi^+$

We illustrate two types of contribution to $B^0 \to K^+\pi^-$ in Fig. 3. The "tree" contribution, which in this case is color-favored since the color-singlet current can produce a quark pair of any color, has weak phase $\gamma = {\rm Arg}(V_{ub}^*V_{us})$ and strong phase $\delta_T$, while the "penguin" contribution has weak phase $\pi = {\rm Arg}(V_{tb}^*V_{ts})$ and strong phase $\delta_P$.
Figure 4: Color-suppressed tree diagram contributing to $B^+ \to K^+\pi^0$.
Even though $\delta_T - \delta_P$ is unknown, and may be small so that little CP-violating asymmetry is present in $B \to K^\pm\pi^\mp$, it will turn out that one can use rate information for several processes, with the help of flavor SU(3) (which can be tested), to learn weak phases such as $\gamma$.

$B^+ \to K^+\pi^0$ vs. $B^- \to K^-\pi^0$

Exercise: Identify the main amplitudes which contribute. What are the differences with respect to $B \to K^\pm\pi^\mp$? Answer: There are two "tree" amplitudes, one color-favored [as in Fig. 3(a)] and one color-suppressed (Fig. 4). Both have weak phases $\gamma = {\rm Arg}(V_{ub}^*V_{us})$. There is a penguin amplitude [as in Fig. 3(b)] with weak phase $\pi = {\rm Arg}(V_{tb}^*V_{ts})$. Since $\pi^0 = (d\bar d - u\bar u)/\sqrt{2}$ in a phase convention in which $\pi^+ = u\bar d$, the color-favored tree and penguin amplitudes are the same as those in $B^0 \to K^+\pi^-$, but divided by $\sqrt{2}$. Thus the overall rate for $B^\pm \to K^\pm\pi^0$ is expected to be 1/2 that for $B \to K^\pm\pi^\mp$ if the penguin amplitude dominates or if the color-suppressed amplitude is negligible. In that case one expects similar CP-violating asymmetries for $B^0 \to K^+\pi^-$ and $B^+ \to K^+\pi^0$.64,65

$B^+ \to K^0\pi^+$ vs. $B^- \to \bar K^0\pi^-$
Exercise: Show that there is no tree amplitude and hence no CP-violating asymmetry expected. This process is expected to be dominated by the penguin amplitude and thus provides a reference for comparison with other processes in which tree amplitudes participate.

Small contributions to $B^+ \to K^+\pi^0$ and $B^+ \to K^0\pi^+$ are possible from the process in which the $\bar b u$ pair annihilates into a weak current which then produces $\bar s u$. A $q\bar q$ pair is produced in hadronization, giving $K^+\pi^0$ if $q = u$ and $K^0\pi^+$ if $q = d$. These contributions are expected to be suppressed by a factor of $f_B/m_B$ if the graphs describing them can be taken literally. However, they can also be generated by rescattering from other contributions, e.g., $(B^+ \to K^+\pi^0)_{\rm tree} \to K^0\pi^+$. We shall mention tests for such effects in Sec. 12.

7.2 Pocket guide to direct CP asymmetries
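We now indicate a necessary (but not sufficient) condition for the observability of direct CP asymmetries based on the interference of two amplitudes, one weaker than the other. The result is that one must be able to detect processes at the level of the absolute square of the weaker amplitude.66 This guides the choice of processes in which one might hope to see direct CP-violating rate asymmetries. Suppose the weak phase difference $\Delta\phi$ and the strong phase difference $\Delta\delta$ are both near $\pm\pi/2$ (the most favorable case). Then, from Eq. (51), the asymmetry has magnitude

$$|\mathcal{A}| = O\!\left(\frac{2A_1A_2}{A_1^2 + A_2^2}\right) \simeq \frac{2A_2}{A_1} \quad {\rm for}~A_2 \ll A_1 . \qquad (52)$$

Imagine a rate based on the square of each amplitude: $N_i = {\rm const.}\,|A_i|^2$. Then $|\mathcal{A}| \simeq 2\sqrt{N_2/N_1}$. The statistical error in $\mathcal{A}$ is based on the total number of events. For $A_2 \ll A_1$ one has $\delta\mathcal{A} \simeq 1/\sqrt{N_1}$, so that the significance of the asymmetry in standard deviations is

$$\frac{|\mathcal{A}|}{\delta\mathcal{A}} \simeq 2\sqrt{N_2} . \qquad (53)$$

Thus (aside from the factor of 2) one must be able to see the square of the weaker amplitude at a significant level in order to see a significant asymmetry due to $A_1$-$A_2$ interference.

7.3 Interesting levels for charmless B decays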
Typical branching ratios for the dominant B decays to pairs of light pseudoscalar mesons are in the range of 1 to 2 parts in $10^5$. Some recent data are summarized in Table 4. Here the average between a process and its charge conjugate is quoted. These data are based on results by CLEO,67,68,69 [including a value for $\mathcal{B}(B^+ \to \pi^+\pi^0)$ extracted from an earlier CLEO report 70], Belle 71 and BaBar.72 The averages are my own. The relative $K\pi$ rates are compatible with dominance by the penguin amplitude, which predicts the rates involving a neutral pion to be half those
Table 4: Branching ratios, in units of $10^{-6}$, for $B^0$ or $B^+$ decays to pairs of light pseudoscalar mesons. Digits and asymmetric errors that are illegible in the source are marked "…".

Mode       CLEO 67,68,69,70             Belle 71                        BaBar 72                        Average
π+π−       4.3 (+…/−…) ± 0.5            5.6 (+…/−…) ± 0.4               4.1 ± 1.0 ± 0.7                 4.4 ± 0.9
π+π0       5.4 ± 2.6                    7.… (+3.8/−3.2)(+0.8/−1.2)      5.1 (+…/−…) ± 0.8               5.6 ± 1.5
K+π−       17.2 (+…/−…) ± 1.2           1… (+3.4/−3.2)(+1.5/−0.6)       16.7 ± 1.6 (+…/−…)              17.4 ± 1.5
K0π+       18.2 (+…/−…) ± 1.6           1….7 (+5.7/−4.8)(+1.9/−1.8)     1… (+3.3/−3.0)(+1.6/−2.0)       17.3 ± 2.4
K+π0       11.… (+3.0/−…)(+1.4/−…)      16.3 (+…/−…)(+…/−…)             10.8 (+…/−…)(+…/−…)             12.2 ± 1.7
K0π0       14.… (+5.9/−…)(+2.4/−…)      16.0 (+…/−…)(+…/−…)             … (+3.1/−2.7)(+1.1/−1.2)        10.4 ± 2.6
K+η′       80 (+…/−…) ± 7               -                               62 ± 18 ± 8                     75 ± 10
K0η′       89 (+…/−…) ± 9               -                               -                               78 ± 9 (a)
(a) Average for $K^+\eta'$ and $K^0\eta'$ modes.

with a charged pion. This conclusion is supported by an estimate of the tree contribution via the decay $B \to \pi\ell\nu$ and factorization. One then needs some idea of the form factor at $m(\ell\nu) = m_\pi$ or $m_K$. The result is that one estimates $\mathcal{B}_{\rm tree}(B^0 \to \pi^+\pi^-) \simeq 10^{-5}$, or

$$\mathcal{B}_{\rm tree}(B^0 \to K^+\pi^-) \simeq \left|\frac{f_K}{f_\pi}\,\frac{V_{us}}{V_{ud}}\right|^2 \times 10^{-5} . \qquad (54)$$
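As a numerical reading of Eq. (54), the short sketch below evaluates the coefficient of $10^{-5}$ using the inputs quoted in this section ($f_K = 161$ MeV, $f_\pi = 132$ MeV, $V_{us}/V_{ud} = 0.22/0.975$). The function and variable names are illustrative and not part of the original text.

```python
def tree_kpi_coefficient(f_k_mev, f_pi_mev, vus_over_vud):
    """Coefficient of 1e-5 in Eq. (54): |(f_K / f_pi) (V_us / V_ud)|^2."""
    return (f_k_mev / f_pi_mev * vus_over_vud) ** 2


# Inputs quoted in the text alongside Eq. (54).
F_K_MEV = 161.0
F_PI_MEV = 132.0
VUS_OVER_VUD = 0.22 / 0.975

coefficient = tree_kpi_coefficient(F_K_MEV, F_PI_MEV, VUS_OVER_VUD)
# How far below the 1e-5 level the estimated tree rate sits.
sensitivity_factor = 1.0 / coefficient

if __name__ == "__main__":
    print(f"coefficient of 1e-5: {coefficient:.3f}")
    print(f"required gain in sensitivity: {sensitivity_factor:.1f}")
```

This reproduces the coefficient of about 0.076 and the factor of about 13 in sensitivity discussed in the text.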
With $f_K = 161$ MeV, $f_\pi = 132$ MeV, $f_K/f_\pi = 1.22$, and $V_{us}/V_{ud} = \tan\theta_c = 0.22/0.975 = 0.226$, the coefficient of $10^{-5}$ on the right-hand side is 0.076. Thus in order to see a significant CP-violating rate asymmetry in $B \to K\pi$ one needs at least 13 times the sensitivity that was needed in order to see all the $B \to K\pi$ modes. This would correspond to about 100 fb$^{-1}$ at $e^+e^-$ colliders, or samples of about $10^8$ identified B's at hadron machines. In other words, one needs to be able to see branching ratios of a few parts in $10^7$ with good statistical significance. This is within the capabilities of experiments just now getting under way.

8 Flavor tagging

8.1 States of $B\bar B$ with definite charge-conjugation
The process $e^+e^- \to B\bar B$ is typically studied at the mass of the $\Upsilon(4S)$ resonance, $E_{\rm c.m.} = 10.58$ GeV, above the threshold of $2M_B = 10.56$ GeV for $B\bar B$ production but below the threshold for production of one or two B*'s: $M_B + M_{B^*} = 10.605$ GeV, $2M_{B^*} = 10.65$ GeV. Now, the $\Upsilon(4S)$ has $C = -1$. It is produced via a virtual photon $\gamma^*$ (which has odd C). It is a $^3S_1$ $b\bar b$ state, where the superscript $2S_{b\bar b} + 1$ denotes the total spin $S_{b\bar b} = 1$. The $b\bar b$ pair has orbital angular momentum $L = 0$ and total angular momentum $J = 1$. A $Q\bar Q$ state $^{2S+1}L_J$ in general has $C = (-1)^{L+S}$. The $B\bar B$ pair produced at the $\Upsilon(4S)$ thus has a definite eigenvalue of charge-conjugation, $C(B\bar B) = -1$, correlating the flavor of the neutral B whose decay (e.g., to $J/\psi K_S$) is being studied with the flavor of the other B used to "tag" the decay, e.g., via a semileptonic decay $b \to c\ell^-\bar\nu_\ell$ or $\bar b \to \bar c\ell^+\nu_\ell$.

Let $B^0\bar B^0$ be in an eigenstate of C with eigenvalue $\eta_C = \pm 1$. (To get a state with $\eta_C = +1$ it is sufficient to utilize the reaction $e^+e^- \to B^{0*}\bar B^0$ or $B^0\bar B^{0*} \to B^0\bar B^0\gamma$ just above threshold.) In the $B^0\bar B^0$ center-of-mass system, the wave function of the pair,

$$\Psi_C = \frac{1}{\sqrt{2}}\left[B^0(\mathbf p)\bar B^0(-\mathbf p) + \eta_C \bar B^0(\mathbf p)B^0(-\mathbf p)\right] , \qquad (55)$$

may be expressed in terms of the mass eigenstates $B_{L,H}$ in order to study its time evolution. Since

$$B^0 = \frac{1}{2p}\left[B_L + B_H\right] , \qquad \bar B^0 = \frac{1}{2q}\left[B_L - B_H\right] , \qquad (56)$$

we have

$$\Psi_C = \frac{1}{4\sqrt{2}\,pq}\left\{[B_L(\mathbf p) + B_H(\mathbf p)][B_L(-\mathbf p) - B_H(-\mathbf p)] + \eta_C[B_L(\mathbf p) - B_H(\mathbf p)][B_L(-\mathbf p) + B_H(-\mathbf p)]\right\} . \qquad (57)$$

For $\eta_C = -1$ the LL and HH terms cancel (this is also a consequence of Bose statistics) and one has

$$\Psi_C(\eta_C = -1) = \frac{1}{2\sqrt{2}\,pq}\left[B_H(\mathbf p)B_L(-\mathbf p) - B_L(\mathbf p)B_H(-\mathbf p)\right] . \qquad (58)$$

For $\eta_C = +1$ the HL and LH terms cancel and one has

$$\Psi_C(\eta_C = +1) = \frac{1}{2\sqrt{2}\,pq}\left[B_L(\mathbf p)B_L(-\mathbf p) - B_H(\mathbf p)B_H(-\mathbf p)\right] . \qquad (59)$$

Define $t$ and $\bar t$ to be the proper times with which the states $\mathbf p$ and $-\mathbf p$ evolve, respectively:

$$B_{L,H}(\mathbf p) \to B_{L,H}(\mathbf p)\,e^{-i\mu_{L,H}t} , \qquad B_{L,H}(-\mathbf p) \to B_{L,H}(-\mathbf p)\,e^{-i\mu_{L,H}\bar t} . \qquad (60)$$
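Project the state with $\mathbf p$ into the desired decay mode (e.g., $J/\psi K_S$) and the state with $-\mathbf p$ into the tagging mode (which signifies $B^0$ or $\bar B^0$ at time $\bar t$, e.g., $\ell^- \leftrightarrow \bar B^0$, $\ell^+ \leftrightarrow B^0$). Then, for a CP-eigenstate, it is left as an Exercise to show in the limit $\Delta\Gamma = 0$ that

$$\left.\frac{d^2\Gamma[f(t),\ell^-(\bar t)]}{dt\,d\bar t}\right|_{\eta_C = \mp 1} \sim e^{-\Gamma(t+\bar t)}\left[1 - \sin\Delta m(t \mp \bar t)\,{\rm Im}\,\lambda\right] , \qquad (61)$$

$$\left.\frac{d^2\Gamma[f(t),\ell^+(\bar t)]}{dt\,d\bar t}\right|_{\eta_C = \mp 1} \sim e^{-\Gamma(t+\bar t)}\left[1 + \sin\Delta m(t \mp \bar t)\,{\rm Im}\,\lambda\right] . \qquad (62)$$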
(Hints: Recall that $\lambda = (q/p)\langle f|\bar B^0\rangle/\langle f|B^0\rangle$, $\bar\lambda = (p/q)\langle f|B^0\rangle/\langle f|\bar B^0\rangle$, $|\lambda| = 1$, $\bar\lambda = \lambda^*$, $\Delta m = m_H - m_L$. For $\eta_C = -1$, write the decay amplitude as a function of $t$ and $\bar t$. It will have two terms, one $\sim e^{-i(m_H t + m_L \bar t)}$ and the other $\sim e^{-i(m_L t + m_H \bar t)}$, whose interference in the absolute square of the amplitude gives rise to the $\sin\Delta m(t - \bar t)$ terms.)

These results have some notable properties. (1) For either value of $\eta_C$, the sum of the $\ell^+$ and $\ell^-$ results is as if one didn't tag, and the oscillatory terms cancel one another. (2) For $\eta_C = -1$, note the antisymmetry with respect to $t - \bar t$. This is a consequence of the Bose statistics and the $C = -1$ nature of the initial state. If one integrates over all times, the CP-violating asymmetry vanishes. Thus in order for the tagging method to work in a C-odd state like $\Upsilon(4S)$ one must know whether $t$ or $\bar t$ was earlier. An asymmetric B-factory like PEP-II or KEK-B permits this by spreading out the decay using a Lorentz boost.

Exercise: Show for $\eta_C = -1$ that if one subdivides the $t$, $\bar t$ integrations according to whether $t > \bar t$ or $t < \bar t$, then

$$\frac{\int\!\!\int dt\,d\bar t\,(d^2\Gamma/dt\,d\bar t)\left[(\ell^-, t > \bar t) - (\ell^-, t < \bar t) - (\ell^+, t > \bar t) + (\ell^+, t < \bar t)\right]}{\int\!\!\int dt\,d\bar t\,(d^2\Gamma/dt\,d\bar t)\left[(\ell^-, t > \bar t) + (\ell^-, t < \bar t) + (\ell^+, t > \bar t) + (\ell^+, t < \bar t)\right]} = -\frac{x_d}{1 + x_d^2}\,{\rm Im}\,\lambda . \qquad (63)$$
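The time-ordered exercise can be explored numerically. The sketch below assumes the $\eta_C = -1$ distributions in the form $e^{-\Gamma(t+\bar t)}[1 \mp \sin\Delta m(t - \bar t)\,{\rm Im}\,\lambda]$ for $\ell^\mp$ tags, integrates them on a simple midpoint grid, and compares the time-ordered combination with $-x_d\,{\rm Im}\,\lambda/(1 + x_d^2)$, where $x_d = \Delta m/\Gamma$. The grid size, cutoff and function names are choices made here for illustration.

```python
import math


def tagged_rate(t, tbar, lepton_sign, gamma, dm, im_lambda):
    """Unnormalized d^2Gamma/dt dtbar for eta_C = -1.

    lepton_sign = -1 for an l- tag and +1 for an l+ tag.
    """
    return math.exp(-gamma * (t + tbar)) * (
        1.0 + lepton_sign * math.sin(dm * (t - tbar)) * im_lambda
    )


def time_ordered_asymmetry(gamma, dm, im_lambda, t_max=20.0, n=400):
    """Midpoint-rule estimate of the time-ordered asymmetry ratio."""
    h = t_max / n
    num = 0.0
    den = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        for j in range(n):
            tbar = (j + 0.5) * h
            minus = tagged_rate(t, tbar, -1, gamma, dm, im_lambda)
            plus = tagged_rate(t, tbar, +1, gamma, dm, im_lambda)
            # +1 where t > tbar, -1 where t < tbar; the diagonal contributes nothing.
            ordering = (t > tbar) - (t < tbar)
            num += ordering * (minus - plus)
            den += minus + plus
    return num / den


def expected_asymmetry(gamma, dm, im_lambda):
    x = dm / gamma
    return -x / (1.0 + x * x) * im_lambda


if __name__ == "__main__":
    print(time_ordered_asymmetry(1.0, 0.7, 0.7))
    print(expected_asymmetry(1.0, 0.7, 0.7))
```

Dropping the `ordering` factor makes the numerator cancel, which illustrates why the decay order must be known for a C-odd initial state.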
In practice the BaBar and Belle analyses will probably fit the time distributions rather than simply subdividing them, since background rejection and signal/noise ratio are functions of $t - \bar t$. (3) For $\eta_C = +1$ the oscillatory term behaves as $\sim \Delta m(t + \bar t)$, so it is not necessary to know whether $t$ or $\bar t$ was earlier, and the asymmetric collision geometry is not needed. However, as shown above, in order to produce a $B^0\bar B^0$ state with $\eta_C = +1$ in $e^+e^-$ collisions one must work at or above $B\bar B^*$ threshold, thereby losing the cross section advantage of the $\Upsilon(4S)$ resonance.
8.2 Uncorrelated $B\bar B$ pairs
Pairs of B's produced in a hadronic environment are likely to arise from independent fragmentation of $b$ and $\bar b$ quarks, so that it is unlikely that they are produced in a state of definite $\eta_C$. (The interesting case of partially-correlated $B$-$\bar B$ pairs can be attacked by density-matrix methods.73) Thus, one must resort to either the fact that a $b$ is always produced in association with a $\bar b$ by the strong interactions ("opposite-side tagging"), or the fact that the fragmentation of a b into a B favors one particular sign of charged pion close to the B in phase space ("same-side tagging").

"Opposite-side" methods

The strong interactions $q\bar q \to b\bar b$ or $gg \to b\bar b$ (g = gluon) conserve beauty, so that a $b$ can be identified if it is found to be produced in association with a $\bar b$. The opposite-side b can be identified in several ways.

(1) The jet-charge method makes use of the fact that a jet tends to carry the charge of its leading quark 74, since the average charge of the fragmentation products is zero in the flavor-SU(3) limit. (There is some delicacy if strange quark production is suppressed, since $Q(u) \neq -Q(d)$.75)

(2) The lepton-tag method uses the charge of the lepton in the semileptonic decays $b \to c\ell^-\bar\nu_\ell$ and $\bar b \to \bar c\ell^+\nu_\ell$ to signify the flavor of the decaying opposite-side b quark. The signal is diluted since semileptonic decays occur after the b quark has been incorporated into a meson. Sometimes this meson is a neutral B, in which case information on its flavor is nearly completely lost if it is a strange B and partially lost if it is a nonstrange B. An initial $B_s$ will decay half the time as a $\bar B_s$, while an initial $B^0$ will decay about 18% of the time as a $\bar B^0$. (See Section 5.2.)

(3) The kaon-tag method uses the fact that the signs of kaons produced in b decays are correlated with the flavor of the decaying b. A b gives rise to c, whose products have more $K^-$ than $K^+$. This method is subject to the same dilution as the lepton-tag method.
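The dilution from neutral-B mixing described for the lepton and kaon tags can be put in numbers. The sketch below uses the CDF production fractions (0.375:0.375:0.16:0.09 for $B^0$, $B^+$, $B_s$, $\Lambda_b$) and the mixing probabilities (18% and 1/2) quoted in this section; treating $B^+$ and $\Lambda_b$ as never giving a wrong-sign tag is an assumption of the sketch, and the names are illustrative.

```python
# Production fractions measured by CDF, as quoted in this section.
PRODUCTION_FRACTIONS = {"B0": 0.375, "B+": 0.375, "Bs": 0.16, "Lambda_b": 0.09}

# Probability that the tagging hadron decays with the "wrong" flavor because of mixing.
# Charged B mesons and b baryons do not mix (assumed to contribute no wrong tags here).
MIXING_PROBABILITY = {"B0": 0.18, "B+": 0.0, "Bs": 0.5, "Lambda_b": 0.0}


def wrong_tag_probability(fractions, mixing):
    """Sum over species of (production fraction) x (probability of a flipped flavor)."""
    return sum(fractions[species] * mixing[species] for species in fractions)


def tag_dilution(wrong):
    """(right - wrong) / (right + wrong) with right = 1 - wrong."""
    return 1.0 - 2.0 * wrong


wrong = wrong_tag_probability(PRODUCTION_FRACTIONS, MIXING_PROBABILITY)
dilution = tag_dilution(wrong)

if __name__ == "__main__":
    print(f"wrong-tag probability: {wrong:.4f}")
    print(f"dilution factor: {dilution:.3f}")
```

With these inputs the wrong-tag probability is about 0.15 and the dilution factor about 0.70.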
In order to utilize the above methods, one needs the relative probabilities of production of $B^0$, $B^+$, $B_s$, and $\Lambda_b$. The CDF Collaboration 76 has measured these in high-energy hadron collisions to be in the ratios 0.375:0.375:0.16:0.09, while LEP Collaborations find 0.40:0.40:0.097:0.104 for $Z^0 \to b\bar b$ decays.77 Taking account of $P(B^0 \to \bar B^0) \simeq 18\%$ and $P(B_s \to \bar B_s) \simeq 1/2$, the probability of a "wrong" tag is $(3/8)(0.18) + (0.16)(1/2) \simeq 0.15$, which dilutes the efficacy of the tag by a factor (right - wrong)/(right + wrong) $\simeq 0.70$. For an extensive study of the first two tagging methods, see recent papers by the CDF Collaboration.78 These methods were a key ingredient in obtaining the CP-violating asymmetry in $B^0 \to J/\psi K_S$ mentioned in Sec. 6.4.

Figure 5: Fragmentation of a $\bar b$ or $b$ quark into a $B^0$ or $\bar B^0$.

"Same-side" methods: Fragmentation and B** resonances

The fragmentation of a b quark into a neutral B meson is not charge-symmetric. This was noted quite some time ago in the context of strange B's.79 A $B_s$ contains a $\bar b$ and an $s$. This $s$ must have been produced in association with an $\bar s$. If that $\bar s$ is incorporated into a charged kaon, the kaon must be a $K^+ = u\bar s$. A similar argument applies to non-strange neutral B's and charged pions.80 A $B^0$ is then found to be associated more frequently with a $\pi^+$ nearby in phase space, while a $\bar B^0$ tends to be associated with a $\pi^-$. This correlation is the same as that found in resonance decays: $B^0$ resonates with $\pi^+$ but not $\pi^-$, while $\bar B^0$ resonates with $\pi^-$ but not $\pi^+$.

The fragmentation of a $\bar b$ or $b$ quark is illustrated in Fig. 5. If one cuts the diagrams to the left of the pion emission, one finds either a $\bar b u$ or a $b\bar u$ state. Thus, a positively charged resonance can decay to $B^0\pi^+$, while a negatively charged one can decay to $\bar B^0\pi^-$.

Recall the case of $D^{*+} \to \pi^+D^0$ mentioned in Sec. 3.1. The soft pion in that decay may be used to tag the flavor of the neutral D at the time of its production, which is useful if one wants to study $D^0$-$\bar D^0$ mixing or Cabibbo-disfavored decays. The difference between $M(D^{*+})$ and $M(\pi^+) + M(D^0)$ (the "Q-value") is only about 5 MeV, so the pion is nearly at rest in the $D^{*+}$ c.m.s., and hence kinematically very distinctive. In the case of B's, however, the lowest vector meson $B^{*+}$ cannot decay to $\pi^+B^0$ since the $B^*$-$B$ mass difference is only about 46 MeV. One must then utilize the decays of higher-mass B*'s, collectively known as B**'s. The lightest such states consist of a b quark and a light quark (u, d, s) in a P-wave.
Table 5: Quantum numbers of B** resonances (P-wave resonances between a b quark and a light quark q).

j_q    J    Decay prods.    Part. wave    Width
1/2    0    Bπ              S wave        Broad
1/2    1    B*π             S wave        Broad
3/2    1    B*π             D wave        Narrow
3/2    2    Bπ, B*π         D wave        Narrow
Many key features of the spectroscopy of a heavy quark and a light one were first pointed out in the case of charm,81,82 and codified using heavy-quark symmetry.83 The heavy quark spin degrees of freedom nearly decouple from the light-quark and gluon dynamics, so it makes sense to first couple the relative angular momentum $L = 1$ and the light-quark spin $s_q = 1/2$ to states of total light-quark angular momentum $j_q = 1/2$ or 3/2 and then to couple $j_q$ with the heavy quark spin $S_Q = 1/2$ to form total angular momentum $J$. For $j_q = 1/2$ one then gets states with $J = 0, 1$, while for $j_q = 3/2$ one gets states with $J = 1, 2$. The $j_q = 1/2$ states decay to $B\pi$ or $B^*\pi$ only via S-waves, while the $j_q = 3/2$ states decay to $B\pi$ or $B^*\pi$ only via D-waves. These properties are summarized in Table 5.

The $j_q = 3/2$ resonances, decaying via D-waves, have been seen, with typical widths of tens of MeV and masses somewhere between 5.7 and 5.8 GeV/$c^2$ in all analyses.84,85 The $j_q = 1/2$ resonances are expected to be considerably broader. There is no unanimity on their properties, but evidence exists for at least one of their charmed counterparts.86 More information on B**'s would enhance their usefulness in same-side tagging of neutral B mesons.

9 The strange B

9.1 $B_s$-$\bar B_s$ mixing
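The limit 44 $\Delta m_s > 15$ ps$^{-1}$ mentioned in Sec. 4.4 was a significant source of constraint on the $(\rho,\eta)$ plot. What range of mixing is actually expected? One may place an upper bound by noting that

$$\frac{\Delta m_s}{\Delta m_d} = \left|\frac{V_{ts}}{V_{td}}\right|^2 \left(\frac{f_{B_s}}{f_B}\right)^2 \frac{B_{B_s}}{B_B} . \qquad (64)$$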
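The bound implied by Eq. (64) can be evaluated with the limits used in this section, namely $f_{B_s}/f_B < 1.25$ and $|V_{ts}/V_{td}| < [\lambda(0.66)]^{-1}$ with $\lambda = 0.22$. The sketch below assumes equal bag parameters, and the value of $\Delta m_d$ is an assumed illustrative input (about 0.49 ps$^{-1}$) that is not quoted in this section; all names are hypothetical.

```python
CABIBBO_LAMBDA = 0.22            # lambda as used in this section
CKM_FACTOR = 0.66                # the factor 0.66 appearing in [lambda(0.66)]^-1
DECAY_CONSTANT_RATIO_MAX = 1.25  # upper limit on f_Bs / f_B
DELTA_M_D_PS_INV = 0.49          # ASSUMED illustrative Delta m_d in ps^-1


def vts_over_vtd_max(lam, factor):
    """Upper limit on |V_ts / V_td| written as 1 / (lambda * factor)."""
    return 1.0 / (lam * factor)


def mixing_ratio_max(decay_constant_ratio, ckm_ratio, bag_ratio=1.0):
    """Upper limit on Delta m_s / Delta m_d from Eq. (64)."""
    return (decay_constant_ratio * ckm_ratio) ** 2 * bag_ratio


ckm_ratio_max = vts_over_vtd_max(CABIBBO_LAMBDA, CKM_FACTOR)
ratio_max = mixing_ratio_max(DECAY_CONSTANT_RATIO_MAX, ckm_ratio_max)
delta_m_s_max = ratio_max * DELTA_M_D_PS_INV

if __name__ == "__main__":
    print(f"|Vts/Vtd| < {ckm_ratio_max:.1f}")
    print(f"Delta m_s / Delta m_d < {ratio_max:.0f}")
    print(f"Delta m_s < {delta_m_s_max:.0f} ps^-1 (for the assumed Delta m_d)")
```

The first two numbers follow from the quoted limits alone; only the last line depends on the assumed $\Delta m_d$.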
Let us review the estimate 45 $f_{B_s}/f_B < 1.25$. The upper limit (larger than the lattice range 34) comes from the nonrelativistic quark model, which implies 87 $|f_M|^2 = 12|\Psi(0)|^2/M_M$ for the decay constant $f_M$ of a meson M of mass $M_M$ composed of a quark-antiquark pair with relative wave function $\Psi(r)$. One estimates the ratios of $|\Psi(0)|^2$ in D and $D_s$ systems from strong hyperfine splittings. Since $M(D^{*+}) - M(D^+) \simeq M(D_s^{*+}) - M(D_s^+)$, one expects $|\Psi(0)|^2_{Q\bar d}/m_d \simeq |\Psi(0)|^2_{Q\bar s}/m_s$ for mesons containing a heavy quark Q. In constituent-quark models 88 $m_d/m_s \simeq 0.64$, so $f_{Q\bar d}/f_{Q\bar s} \simeq \sqrt{0.64} = 0.8$. An upper limit on $|V_{ts}/V_{td}|$ (see Sec. 4.3) is then $< [\lambda(0.66)]^{-1} = 6.9$, implying $\Delta m_s/\Delta m_d < 74$ or $\Delta m_s < 36$ ps$^{-1}$. A recent prediction 89 based on lattice estimates for decay constants is $\Delta m_s = 16.2 \pm 2.1 \pm 3.4$ ps$^{-1}$; there is a hint of a signal at $\sim 17$ ps$^{-1}$.44

Figure 6: Mixing of $B_s$ and $\bar B_s$ as a result of shared $s\bar c c\bar s$ intermediate states.

9.2 $B_s$ lifetime
The mass eigenstates of the strange B are expected to be nearly CP eigenstates. Their mass splitting is expected to be correlated with their mixing; large values of $\Delta m_s$ imply large values of $\Delta\Gamma_s$. (In the calculation of $\Delta\Gamma/\Delta m$, the values of $|f_M|^2$ cancel.) The value of $\Delta\Gamma_s$ is much greater than that of $\Delta\Gamma_d$ because of the shared intermediate states in the transition $B_s = \bar b s \to s\bar c c\bar s \to \bar s b = \bar B_s$ illustrated in Fig. 6. Specific calculations 90 imply that this quark subprocess may be dominated by CP = + intermediate states, implying a shorter lifetime for the even-CP eigenstate of the $B_s$-$\bar B_s$ system by O(10%). The imaginary part of the diagram in Fig. 6 is due to on-shell states, which do not include those involving the top quark. This is the reason for the factor of $(m_b^2/M_W^2)/S(m_t^2/M_W^2)$ in Eq. (30). The lowest-order ratio, $\Delta\Gamma_s/\Delta m_s \simeq -1/180$, implies that if $|\Delta m_s/\Gamma_s| = (20~{\rm ps}^{-1})(1.6~{\rm ps}) = 32$ (a reasonable value), then $|\Delta\Gamma_s/\Gamma_s| \simeq 1/6$. Recent calculations predict $\Delta\Gamma_s/\Gamma_s = (9.3^{+\ldots}_{-\ldots})\%$ 91 or $(4.7 \pm 1.5 \pm 1.6)\%$ 92. The latter group 93 also finds $f_{B_d}\sqrt{B_{B_d}} = 206(28)(7)$ MeV, $f_{B_s}\sqrt{B_{B_s}}/(f_{B_d}\sqrt{B_{B_d}}) = 1.16(7)$, $f_{B_s}\sqrt{B_{B_s}} = 237(18)(8)$ MeV.
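The arithmetic behind the estimate $|\Delta\Gamma_s/\Gamma_s| \simeq 1/6$ is a one-line product. A minimal sketch, using the values quoted above (20 ps$^{-1}$, 1.6 ps and the lowest-order ratio of magnitude 1/180) and an illustrative function name:

```python
def width_difference_fraction(delta_m_ps_inv, lifetime_ps, dgamma_over_dm=1.0 / 180.0):
    """|Delta Gamma_s / Gamma_s| = |Delta Gamma_s / Delta m_s| * |Delta m_s / Gamma_s|."""
    # |Delta m_s / Gamma_s|, taking Gamma_s = 1 / lifetime.
    x_s = delta_m_ps_inv * lifetime_ps
    return dgamma_over_dm * x_s


fraction = width_difference_fraction(20.0, 1.6)

if __name__ == "__main__":
    print(f"|Delta Gamma_s / Gamma_s| = {fraction:.3f}")
```

The result, 32/180, is a little under 0.18, which the text rounds to about 1/6.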
9.3 Measuring $\Delta\Gamma_s/\Gamma_s$
The average decay rate of the two mass eigenstates $B_H$ and $B_L$ can be measured by observing a flavor-specific decay, e.g.,

$$B_s(= \bar b s) \to D_s^-(= \bar c s)\,\ell^+\nu_\ell \quad {\rm or} \quad B_s \to D_s^-\pi^+ , \quad D_s^- \to \phi\pi^- . \qquad (65)$$

The flavor of $D_s$ labels the flavor of the $B_s$ and then we note that

$$|B_s\rangle \simeq \frac{1}{\sqrt{2}}\left(|B_L\rangle + |B_H\rangle\right) . \qquad (66)$$

Such a flavor-specific decay then gives a rate $\bar\Gamma = (\Gamma_L + \Gamma_H)/2$. One can also look for a decay in which the CP of the final state can be easily identified.94 $B_s \to J/\psi$

Helicity analyses of $B_s \to J/\psi\phi$ and $B \to J/\psi K^*$
It is convenient to re-express the three partial wave amplitudes for decay of a spin-zero meson into two massive spin-1 mesons in terms of transversity amplitudes.95 These are most easily visualized by analogy with the method originally used to determine the parity of the neutral pion through its decay to two photons. A spinless meson M can decay to two photons with two possible linear polarization states: parallel and perpendicular to each other. If they have parallel polarizations, the interaction Lagrangian is $\mathcal{L}_{\rm int} \sim M F_{\mu\nu}F^{\mu\nu} \sim M(\mathbf{E}^2 - \mathbf{B}^2)$, while if they have perpendicular polarizations, $\mathcal{L}_{\rm int} \sim M F_{\mu\nu}\tilde F^{\mu\nu} \sim M(\mathbf{E}\cdot\mathbf{B})$. Now, $\mathbf{E}^2 - \mathbf{B}^2$ is CP-even, while $\mathbf{E}\cdot\mathbf{B}$ is CP-odd. The observation that the two photons emitted by the $\pi^0$ had perpendicular polarizations then was used to infer that the pion had odd CP and hence (since its C was even as a result of its coupling to two photons) odd P.

One can then identify two of the decay amplitudes for a spinless meson decaying to two massive vector mesons as $A_\parallel$ (parallel linear polarizations, even CP) and $A_\perp$ (perpendicular linear polarizations, odd CP). A third decay amplitude is peculiar to the massive vector meson case: Both vector mesons can have longitudinal polarizations (impossible for photons). Since there must be two independent CP-even decay amplitudes by the partial-wave exercise given above, this amplitude, which we call $A_0$, must be CP-even.
There are two recent experimental studies of decays of strange B's to pairs of vector mesons. (1) The CDF Collaboration 96 finds the results quoted in Table 6. The decays $B_s \to J/\psi\phi$ and $B^0 \to J/\psi K^{*0}$ are related to one another by flavor SU(3) (the interchange $s \leftrightarrow d$ for the spectator quark) and thus should have similar amplitude structure. We have adopted a normalization in which $|A_0|^2 + |A_\parallel|^2 + |A_\perp|^2 = 1$. The CDF result says that $B_s \to J/\psi\phi$ is dominantly CP-even. No significant $\Delta\Gamma/\Gamma$ has been detected, but the sensitivity is not yet adequate to reach predicted levels. These conclusions are supported by results from CLEO 97 ($|A_0|^2 = 0.52 \pm 0.08$, $|A_\perp|^2 = 0.09 \pm 0.08$) and BaBar 98 ($|A_0|^2 = 0.60 \pm 0.06 \pm 0.04$, $|A_\perp|^2 = 0.13 \pm 0.06 \pm 0.02$).

Table 6: Amplitudes in the decays $B_s \to J/\psi\phi$ and $B^0 \to J/\psi K^{*0}$.

Amplitude            Bs → J/ψ φ              B0 → J/ψ K*0
|A_0|                0.78 ± 0.09 ± 0.01      0.77 ± 0.04 ± 0.01
|A_∥|                0.41 ± 0.23 ± 0.05      0.53 ± 0.11 ± 0.04
Arg(A_∥/A_0)         1.1 ± 1.3 ± 0.2         2.2 ± 0.5 ± 0.1
|A_⊥|                0.48 ± 0.20 ± 0.04      0.36 ± 0.16 ± 0.08
Arg(A_⊥/A_0)         -                       -0.6 ± 0.5 ± 0.1
Γ_L/Γ = |A_0|²       0.61 ± 0.14 ± 0.02      0.59 ± 0.06 ± 0.01
Γ_⊥/Γ = |A_⊥|²       0.23 ± 0.19 ± 0.04      0.13 (+…/−…) ± 0.06
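The central values in Table 6 can be checked against the stated normalization $|A_0|^2 + |A_\parallel|^2 + |A_\perp|^2 = 1$, and the CP-odd fraction read off as $|A_\perp|^2$. The sketch below uses only the central values of the magnitudes from the table; the dictionary layout and function names are illustrative.

```python
# Central values of the amplitude magnitudes from Table 6 (errors ignored).
AMPLITUDES = {
    "Bs -> J/psi phi": {"A0": 0.78, "A_par": 0.41, "A_perp": 0.48},
    "B0 -> J/psi K*0": {"A0": 0.77, "A_par": 0.53, "A_perp": 0.36},
}


def normalization(amps):
    """|A_0|^2 + |A_par|^2 + |A_perp|^2, expected to be close to 1."""
    return sum(value ** 2 for value in amps.values())


def cp_odd_fraction(amps):
    """Fraction of the rate in the CP-odd (perpendicular) amplitude."""
    return amps["A_perp"] ** 2 / normalization(amps)


if __name__ == "__main__":
    for mode, amps in AMPLITUDES.items():
        print(mode, round(normalization(amps), 3), round(cp_odd_fraction(amps), 3))
```

Both sums come out within about one percent of unity, and in both modes the CP-even amplitudes carry most of the rate, although the quoted errors are large.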
(2) A recent ALEPH analysis 99 of the decays $B_s \to D_s^{(*)+}D_s^{(*)-}$ finds that the decay to pairs of vector mesons occurs in mostly even partial waves, so that the lifetime in this mode probes that of the CP-even mass eigenstate, which turns out to be $B_L \simeq (B_s + \bar B_s)/\sqrt{2}$, giving $\tau_L = 1.27 \pm 0.33 \pm 0.07$ ps. A similar study of the flavor eigenstate finds $\tau(B_s) = 1.54 \pm 0.07$ ps. Comparing the two values, one finds $\Delta\Gamma/\Gamma = (25^{+\ldots}_{-\ldots})\%$. This is just one facet of a combined analysis of results from CDF, LEP, and SLD 100 that concludes $\Delta\Gamma/\Gamma = (16^{+\ldots}_{-\ldots})\%$, or $\Delta\Gamma/\Gamma < 31\%$ at 95% c.l.

10 B decays to pairs of light mesons
We have already noted in Table 4 some branching ratios for B decays to pairs of light pseudoscalar mesons. Here we discuss these and related processes involving one or two light vector mesons in more detail.

10.1 Dominant processes in $B \to \pi\pi$ and $B \to K\pi$
The decays $B \to \pi\pi$ and $B \to K\pi$ are rich in possibilities for determining fundamental CKM parameters. The process $B^0 \to \pi^+\pi^-$ could yield the angle $\alpha$ in the absence of penguin amplitudes, whose contribution must therefore be taken into account. The process $B^0 \to K^+\pi^-$ and related decays can provide information on the weak phase $\gamma$.

In order to discuss such decays in a unified way, we shall employ a flavor-SU(3) description using a graphical representation.101,102 This language is equivalent to tensorial methods.103,104 The graphs are shown in Fig. 7. They constitute an over-complete set; all processes of the form $B \to PP$, where P is a light pseudoscalar meson belonging to a flavor octet, are described by only 5 independent linear combinations of these.

The graphical technique allows one to check a result for $B \to \pi\pi$ which can be obtained using isospin invariance. The subprocess $\bar b \to \bar u u\bar d$ can change isospin by 1/2 or 3/2 units. The $J = 0$ $\pi\pi$ final state, by virtue of Bose statistics, must have even isospin: $I = 0, 2$. Thus there are only two invariant amplitudes in the problem, one with $\Delta I = 1/2$ leading to $I_{\pi\pi} = 0$ and one with $\Delta I = 3/2$ leading to $I_{\pi\pi} = 2$. Hence the amplitudes for the three decays $B^0 \to \pi^+\pi^-$, $B^+ \to \pi^+\pi^0$, and $B^0 \to \pi^0\pi^0$ obey one linear relation. In the graphical representation they are

$$A(B^0 \to \pi^+\pi^-) = -(T + P) , \qquad (67)$$

$$A(B^+ \to \pi^+\pi^0) = -(T + C)/\sqrt{2} , \qquad (68)$$

$$A(B^0 \to \pi^0\pi^0) = (P - C)/\sqrt{2} , \qquad (69)$$
leading to the relation $A(B^0 \to \pi^+\pi^-) = \sqrt{2}A(B^+ \to \pi^+\pi^0) - \sqrt{2}A(B^0 \to \pi^0\pi^0)$. Measurement of the rates for these processes and their charge-conjugates allows one to separate the penguin and tree contributions from one another and to obtain information on the CKM phase $\alpha$. The only potential drawback of this method is that the branching ratio for $B^0 \to \pi^0\pi^0$ is expected to be small: of order $10^{-6}$.

One can use $B \to K\pi$ and flavor SU(3) to evaluate the penguin contribution to $B \to \pi\pi$.105 The decay $B \to \pi\pi$ appears to be dominated by the tree amplitude while $B \to K\pi$ appears to be dominated by the penguin:

$$\left|\frac{({\rm Tree})_{K\pi}}{({\rm Tree})_{\pi\pi}}\right| \simeq \frac{f_K V_{us}}{f_\pi V_{ud}} \simeq \left|\frac{({\rm Pen})_{\pi\pi}}{({\rm Pen})_{K\pi}}\right| \simeq \left|\frac{V_{td}}{V_{ts}}\right| \simeq \frac{1}{4} . \qquad (70)$$
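The linear relation among the three $B \to \pi\pi$ amplitudes follows directly from Eqs. (67) to (69) and holds for arbitrary complex T, C and P. A minimal sketch of that check, with illustrative function names and arbitrarily chosen test amplitudes:

```python
import math


def pipi_amplitudes(tree, color_suppressed, penguin):
    """Amplitudes of Eqs. (67)-(69) for pi+ pi-, pi+ pi0 and pi0 pi0."""
    sqrt2 = math.sqrt(2.0)
    a_pm = -(tree + penguin)
    a_p0 = -(tree + color_suppressed) / sqrt2
    a_00 = (penguin - color_suppressed) / sqrt2
    return a_pm, a_p0, a_00


def isospin_residual(tree, color_suppressed, penguin):
    """A(pi+ pi-) - sqrt(2) A(pi+ pi0) + sqrt(2) A(pi0 pi0); zero if the relation holds."""
    a_pm, a_p0, a_00 = pipi_amplitudes(tree, color_suppressed, penguin)
    sqrt2 = math.sqrt(2.0)
    return a_pm - sqrt2 * a_p0 + sqrt2 * a_00


if __name__ == "__main__":
    # Arbitrary complex test values, not fitted to any data.
    print(abs(isospin_residual(1.0 + 0.2j, 0.3 - 0.1j, 0.25 + 0.4j)))
```

The penguin drops out of $A(B^+ \to \pi^+\pi^0)$, which is the pure $\Delta I = 3/2$ side of the relation.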
Many other applications of flavor SU(3) to $B \to K\pi$ decays have been made subsequently.106 We shall discuss the results of one relatively recent example.107

10.2 Measuring $\gamma$ with $B \to K\pi$ decays
The Fermilab Tevatron and the CERN Large Hadron Collider (LHC) will produce large numbers of $\pi^+\pi^-$, $\pi^\pm K^\mp$, and $K^+K^-$ pairs from neutral non-strange and strange B mesons. Each set of decays has its own distinguishing features.

Figure 7: Graphs describing flavor-SU(3) invariant amplitudes for the decays of B mesons to pairs of light flavor-octet pseudoscalar mesons. (a) "Tree" (T); (b) "Color-suppressed" (C); (c) "Penguin" (P); (d) "Exchange" (E); (e) "Annihilation" (A); (f) "Penguin annihilation" (PA).

The processes $B^0 \to K^+K^-$ and $B_s \to \pi^+\pi^-$ involve only the spectator-quark amplitudes E and PA, and thus should be suppressed. They are related to one another by a flavor SU(3) "U-spin" reflection $s \leftrightarrow d$ 108 and thus the ratio of their rates should be the ratio of the corresponding squares of CKM elements.

The decays $B^0 \to \pi^+\pi^-$ and $B_s \to K^+K^-$ also are related to each other by a U-spin reflection. Time-dependent studies of both processes allow one to separate strong interaction and weak interaction information from one another and to measure the angle $\gamma$.109 This appears to be a promising method for Run II at the Fermilab Tevatron.110

The decays of non-strange and strange neutral B mesons to $K^\pm\pi^\mp$ provide another source of information on $\gamma$ 107 when combined with information on $B^+ \to K^0\pi^+$. The rate for this process is predicted to be the same as that for $B^- \to \bar K^0\pi^-$, providing a consistency check. We consider the amplitudes T and P with relative weak phase $\gamma$ and relative strong phase $\delta$, neglecting the amplitudes E, A, and PA which are expected to be suppressed relative to T and P by factors of $f_B/m_B$. Then we find (letting T and P stand for magnitudes)

$$A(B^0 \to K^+\pi^-) = -[P + Te^{i(\gamma+\delta)}] , \qquad (71)$$

$$A(B^+ \to K^0\pi^+) = P , \qquad (72)$$

$$A(B_s \to \pi^+K^-) = \tilde\lambda P - \frac{1}{\tilde\lambda}Te^{i(\gamma+\delta)} , \qquad (73)$$
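with amplitudes for the charge-conjugate processes given by $\gamma \to -\gamma$. Here $\tilde\lambda \equiv |V_{us}/V_{ud}| = |V_{cd}/V_{cs}| = \tan\theta_c \simeq 0.226$. In the penguin amplitude the top quark has been integrated out and unitarity used to replace $V_{tb}^*V_{tq}$ by $-V_{cb}^*V_{cq} - V_{ub}^*V_{uq}$. The term $-V_{cb}^*V_{cq}$ [...] the charge-averaged ratios

$$R \equiv \frac{\Gamma(B^0 \to K^+\pi^-) + \Gamma(\bar B^0 \to K^-\pi^+)}{\Gamma(B^+ \to K^0\pi^+) + \Gamma(B^- \to \bar K^0\pi^-)} , \qquad R_s \equiv \frac{\Gamma(B_s \to K^-\pi^+) + \Gamma(\bar B_s \to K^+\pi^-)}{\Gamma(B^+ \to K^0\pi^+) + \Gamma(B^- \to \bar K^0\pi^-)}$$

and CP-violating rate (pseudo-)asymmetries:

$$A_0 \equiv \frac{\Gamma(B^0 \to K^+\pi^-) - \Gamma(\bar B^0 \to K^-\pi^+)}{\Gamma(B^+ \to K^0\pi^+) + \Gamma(B^- \to \bar K^0\pi^-)} , \qquad A_s \equiv \frac{\Gamma(B_s \to K^-\pi^+) - \Gamma(\bar B_s \to K^+\pi^-)}{\Gamma(B^+ \to K^0\pi^+) + \Gamma(B^- \to \bar K^0\pi^-)} , \qquad (78)$$

and let $r \equiv T/P$. We find

$$R = 1 + r^2 + 2r\cos\delta\cos\gamma , \qquad (79)$$

$$R_s = \tilde\lambda^2 + (r/\tilde\lambda)^2 - 2r\cos\delta\cos\gamma , \qquad (80)$$

$$A_0 = -A_s = -2r\sin\gamma\sin\delta . \qquad (81)$$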
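Equations (79) to (81) can be reproduced from the amplitudes (71) to (73) by squaring them for a process and its charge conjugate ($\gamma \to -\gamma$). The following sketch works in units where $P = 1$; the parameter name `lam` stands for the ratio $|V_{us}/V_{ud}| \simeq 0.226$ appearing in Eq. (73), and the function names and test values are illustrative.

```python
import cmath
import math


def kpi_observables(r, gamma, delta, lam):
    """R, R_s, A_0, A_s built from the amplitudes (71)-(73), in units where P = 1."""

    def rates(weak_phase):
        phase = cmath.exp(1j * (weak_phase + delta))
        rate_b0 = abs(-(1.0 + r * phase)) ** 2        # B0 -> K+ pi-
        rate_bplus = 1.0                              # B+ -> K0 pi+ (pure penguin)
        rate_bs = abs(lam - (r / lam) * phase) ** 2   # Bs -> pi+ K-
        return rate_b0, rate_bplus, rate_bs

    b0, bplus, bs = rates(gamma)
    b0_bar, bminus, bs_bar = rates(-gamma)            # charge-conjugate processes
    norm = bplus + bminus
    return {
        "R": (b0 + b0_bar) / norm,
        "R_s": (bs + bs_bar) / norm,
        "A_0": (b0 - b0_bar) / norm,
        "A_s": (bs - bs_bar) / norm,
    }


def kpi_closed_forms(r, gamma, delta, lam):
    """Closed forms of Eqs. (79)-(81)."""
    cc = math.cos(delta) * math.cos(gamma)
    ss = math.sin(gamma) * math.sin(delta)
    return {
        "R": 1.0 + r * r + 2.0 * r * cc,
        "R_s": lam ** 2 + (r / lam) ** 2 - 2.0 * r * cc,
        "A_0": -2.0 * r * ss,
        "A_s": 2.0 * r * ss,
    }


if __name__ == "__main__":
    print(kpi_observables(0.2, 1.2, 0.4, 0.226))
    print(kpi_closed_forms(0.2, 1.2, 0.4, 0.226))
```

The factor $1/\tilde\lambda$ in the $B_s$ tree term makes $R_s$ considerably more sensitive to $r$ than $R$ is.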
The relation $A_0 = -A_s$ may be used to test the assumption of flavor SU(3) symmetry, while the remaining three equations may be solved for the three unknowns $r$, $\gamma$, and $\delta$. An error of 10° on $\gamma$ seems feasible. (A small correction associated with the above approximation to the penguin graph also may be applied.111)
Figure 8: Singlet penguin diagram, important for $B \to PP$ processes in which one of the pseudoscalar mesons P is $\eta$ or $\eta'$.

10.3 Decays with $\eta$ and $\eta'$ in the final state
The physical $\eta$ and $\eta'$ are mixtures of the flavor octet state $\eta_8 \equiv (2s\bar s - u\bar u - d\bar d)/\sqrt{6}$ and the flavor singlet $\eta_1 \equiv (u\bar u + d\bar d + s\bar s)/\sqrt{3}$. This mixing is tested in many decays, such as $(\eta,\eta') \to \gamma\gamma$, $(\rho,\omega,\phi) \to \eta\gamma$, $\eta' \to (\rho,\omega)\gamma$, etc. The result 112 is that the $\eta$ is mostly an octet and the $\eta'$ mostly a singlet, with one frequently-employed approximation 101,113 corresponding to an octet-singlet mixing angle of 19°:

$$\eta \simeq \frac{1}{\sqrt{3}}(s\bar s - u\bar u - d\bar d) , \qquad \eta' \simeq \frac{1}{\sqrt{6}}(u\bar u + d\bar d + 2s\bar s) . \qquad (82)$$
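The quark content in Eq. (82) can be recovered by rotating the octet and singlet states through an angle with $\sin\theta = 1/3$ (about 19.5°). The sketch below assumes the rotation convention $\eta = \cos\theta\,\eta_8 - \sin\theta\,\eta_1$, $\eta' = \sin\theta\,\eta_8 + \cos\theta\,\eta_1$, which is a choice made here to match the signs of $\eta_8$ and $\eta_1$ as defined above; the names are illustrative.

```python
import math

# Coefficients of (u ubar, d dbar, s sbar) for the octet and singlet states defined in the text.
ETA_8 = (-1.0 / math.sqrt(6.0), -1.0 / math.sqrt(6.0), 2.0 / math.sqrt(6.0))
ETA_1 = (1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0), 1.0 / math.sqrt(3.0))


def mix(theta):
    """Return the (uu, dd, ss) coefficients of eta and eta' for mixing angle theta.

    Assumed convention: eta  = cos(theta) eta_8 - sin(theta) eta_1,
                        eta' = sin(theta) eta_8 + cos(theta) eta_1.
    """
    c, s = math.cos(theta), math.sin(theta)
    eta = tuple(c * o - s * g for o, g in zip(ETA_8, ETA_1))
    eta_prime = tuple(s * o + c * g for o, g in zip(ETA_8, ETA_1))
    return eta, eta_prime


THETA = math.asin(1.0 / 3.0)  # about 19.5 degrees

if __name__ == "__main__":
    eta, eta_prime = mix(THETA)
    print(math.degrees(THETA))
    print(eta)        # compare with (-1, -1, 1) / sqrt(3)
    print(eta_prime)  # compare with (1, 1, 2) / sqrt(6)
```

In this approximation the strange and nonstrange components enter the $\eta$ with opposite signs and the $\eta'$ with the same sign.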
(A single mixing angle may not adequately describe the $\eta$-$\eta'$ system.114) For a meson with a flavor singlet component, an amplitude in addition to those depicted in Fig. 7 corresponds to the "singlet" penguin diagram 113 shown in Fig. 8. The CLEO Collaboration's large branching ratios for $B \to \eta'K$:68

$$\mathcal{B}(B^+ \to \eta'K^+) = (80^{+10}_{-\ldots} \pm 7) \times 10^{-6} , \qquad \mathcal{B}(B^0 \to \eta'K^0) = (89^{+\ldots}_{-\ldots} \pm 9) \times 10^{-6} , \qquad (83)$$

with only upper limits for $K\eta$ production, indicate the presence of a substantial "singlet" penguin contribution, and constructive interference between nonstrange and strange quark contributions of $\eta'$ to the ordinary penguin amplitude P, as suggested by Lipkin.115 The corresponding decays to nonstrange final states, $B^+ \to \pi^+\eta$ and $B^+ \to \pi^+\eta'$, are expected to have large CP-violating asymmetries, since several weak amplitudes in these processes are of comparable magnitude.113 Moreover, CLEO sees

$$\mathcal{B}(B^+ \to \eta K^{*+}) = (26.4^{+\ldots}_{-\ldots} \pm 3.3) \times 10^{-6} , \qquad (84)$$

$$\mathcal{B}(B^0 \to \eta K^{*0}) = (13.8^{+\ldots}_{-\ldots} \pm 1.6) \times 10^{-6} , \qquad (85)$$
with only upper limits for $K^*\eta'$ production. These results favor the opposite signs for nonstrange and strange components of the $\eta$, again in accord with predictions.115 Much theoretical effort has been expended on attempts to understand the magnitude of the "singlet" penguin diagrams,116 but they appear to be more important than one would estimate using perturbative QCD.

10.4 One vector meson and one pseudoscalar
The decays $B \to VP$, where V is a vector meson and P a pseudoscalar, are characterized by twice as many invariant amplitudes of flavor SU(3) as the decays $B \to PP$, since either the vector meson or the pseudoscalar can contain the spectator quark. We can label the corresponding amplitudes by a subscript V or P to denote the type of meson containing the spectator. A recent analysis within the graphical framework uses data to specify amplitudes.70 Alternatively, one can incorporate models for form factors into calculations based on factorization.117,118 An interesting possibility suggested in both these approaches is that the large branching ratio $\mathcal{B}(B^0 \to K^{*+}\pi^-)$ may suggest constructive tree-penguin interference, implying $\gamma > 90°$. The tree amplitude in $B^0 \to K^{*+}\pi^-$ is proportional to $V_{ub}^*V_{us}$, with weak phase $\gamma$, while the penguin amplitude is proportional to $V_{tb}^*V_{ts}$, with weak phase $\pi$. The relative weak phase between these two amplitudes is then $\gamma - \pi$, which leads to constructive interference if the strong phase difference between the tree and penguin amplitudes is small and if $\gamma > \pi/2$. This could help explain why $\mathcal{B}(B^0 \to K^{*+}\pi^-)$ seems to exceed $2 \times 10^{-5}$ while the pure penguin process $B \to$
10.5 Two vector mesons
No modes with pairs of light vector mesons have been identified conclusively yet. The existence of three partial waves (S, P, D) for such processes as $B^0 \to \phi K^{*0}$ means that helicity analyses can in principle detect the presence of final-state interactions (as in the case of $B \to J/\psi K^*$). It is not clear, however, whether such final-state phases are relevant to the case of greatest interest, in which two different channels are "fed" by different weak processes such as $T$ and $P$ amplitudes. Some further information obtainable from angular distributions in $B \to VV$ decays has been noted.121

10.6 Testing flavor SU(3)
The asymmetry prediction $A_s = -A_d$ for $B_s \to K\pi$ vs. $B \to K\pi$, mentioned above, is just one of a number of U-spin relations108 testable via $B_s$ decays, which will first be studied in detail at hadron colliders. One expects the assumption of flavor SU(3), and in particular the equality of final-state phases for non-strange and strange $B$ final states, to be more valid for $B$ decays than for charm decays, where resonances still are expected to play a major role.

11 The role of penguins

11.1 Estimates of magnitudes
Perturbative calculations of penguin contributions to processes such as $B \to K\pi$, where they seem to be dominant, fall short of actual measurements.122 Phenomenological fits indicate no suppression by a factor of $\alpha_s/4\pi$ despite the presence of a loop and a gluon. One possible explanation is the presence of a $c\bar c$ loop with substantial enhancement from on-shell states, equivalent to strong rescattering from such states as $D_s \bar D$ to charmless meson pairs. If this is indeed the case, penguin amplitudes could have different final-state phases from tree amplitudes, enhancing the possibility of observing direct CP violation.

11.2 Electroweak penguins
When the gluon in a penguin diagram is replaced by a (real or virtual) photon or a virtual $Z$ which couples to a final $q\bar q$ pair, the process $b \to (d \text{ or } s)\, q\bar q$ is no longer independent of the flavor ($u$, $d$, $s$) of $q$. Instead, one has contributions in which the $u\bar u$ pair is treated differently from the $d\bar d$ or $s\bar s$ pair. A color-favored electroweak penguin amplitude $P_{EW}$ [Fig. 9(a)] involves the pair appearing in the same neutral meson (e.g., $\pi^0$), while a color-suppressed amplitude $P^c_{EW}$ [Fig. 9(b)] involves each member of the pair appearing in a different meson.
Figure 9: Electroweak penguin (EWP) diagrams. (a) Color-favored ($P_{EW}$); (b) color-suppressed ($P^c_{EW}$).
One may parametrize electroweak penguin (EWP) amplitudes by contributions proportional to the quark charge, sweeping other terms into the gluonic penguin contributions. One then finds that the EWP terms in a flavor-SU(3) description may be combined as follows with the terms $T$, $C$, $P$, and $S$ (the "singlet" penguin):123

$T \to t \equiv T + P^c_{EW}$, $\quad P \to p \equiv P - \frac{1}{3} P^c_{EW}$, (86)

$C \to c \equiv C + P_{EW}$, $\quad S \to s \equiv S - \frac{1}{3} P_{EW}$. (87)
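The substitutions of Eqs. (86) and (87), namely $t = T + P^c_{EW}$, $c = C + P_{EW}$, $p = P - \frac{1}{3}P^c_{EW}$, and $s = S - \frac{1}{3}P_{EW}$, can be written as a short bookkeeping sketch. The class `Amplitudes` and the function `include_ewp` are hypothetical names introduced for illustration, and the numerical values in the demonstration are arbitrary rather than fitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Amplitudes:
    """Flavor-SU(3) amplitudes: tree, color-suppressed, penguin, singlet penguin."""
    T: complex
    C: complex
    P: complex
    S: complex


def include_ewp(amps, p_ew, p_ew_c):
    """Apply T -> t, C -> c, P -> p, S -> s for given EWP amplitudes.

    p_ew   : color-favored electroweak penguin amplitude
    p_ew_c : color-suppressed electroweak penguin amplitude
    """
    return Amplitudes(
        T=amps.T + p_ew_c,
        C=amps.C + p_ew,
        P=amps.P - p_ew_c / 3,
        S=amps.S - p_ew / 3,
    )


if __name__ == "__main__":
    # Purely illustrative magnitudes and phases, not fitted values.
    bare = Amplitudes(T=1.0 + 0j, C=0.2 + 0j, P=0.3j, S=0.1j)
    print(include_ewp(bare, p_ew=0.05j, p_ew_c=0.01j))
```

Because the color-favored $P_{EW}$ enters through $c$, a decay whose amplitude contains $C$ picks up the larger of the two EWP terms.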
The flavor-SU(3) description holds as before, but weak phases now can differ from their previous values as a result of the EWP contributions. One early application of flavor SU(3) which turns out to be significantly affected by EWP contributions is the attempt to learn the weak phase $\gamma$ from information on the decays $B^+ \to K^+\pi^0$, $B^+ \to K^0\pi^+$, $B^+ \to \pi^+\pi^0$, and the corresponding charge-conjugate decays.124 The amplitude construction is illustrated in Fig. 10. The primes on the amplitudes refer to the fact that they describe strangeness-changing ($|\Delta S| = 1$) transitions. The corresponding $\Delta S = 0$ amplitudes are unprimed. The amplitudes in Fig. 10 form two triangles, whose sides labeled $C' + T'$ and $\bar C' + \bar T'$ form an angle $2\gamma$ with respect to one another. (There will be a discrete ambiguity corresponding to flipping one of the triangles about its base.)

Figure 10: Amplitude triangles for determining the weak phase $\gamma$. These are affected by electroweak penguin contributions, as described in the text.

One estimates the lengths of these two sides using flavor SU(3) from the amplitudes $A(B^+ \to \pi^+\pi^0) = (C + T)/\sqrt{2}$ and $A(B^- \to \pi^-\pi^0) = (\bar C + \bar T)/\sqrt{2}$. In the presence of electroweak penguin contributions this simple analysis must be modified, since there are important additional contributions when we replace $C' + T' \to c' + t'$ and $\bar C' + \bar T' \to \bar c' + \bar t'$.125 The culprit is the $C'$ amplitude, which is associated with a color-favored electroweak penguin. It was noted subsequently126 that since the $C' + T'$ amplitude corresponds to isospin $I(K\pi) = 3/2$ for the final state, the strong-interaction phase of its EWP contribution is the same as that of the rest of the $C' + T'$ amplitude, permitting the calculation of the EWP correction. The result is that

$A[I(K\pi) = 3/2] \sim \text{const} \times (e^{i\gamma} - \delta_{EW})$, (88)
where the phase in the first term is $\mathrm{Arg}(V_{ub}^* V_{us})$ and the second term is estimated to be $\delta_{EW} = 0.64 \pm 0.15$ when SU(3)-breaking effects are included. Any deviation of the ratio $2\Gamma(B^+ \to K^+\pi^0)/\Gamma(B^+ \to K^0\pi^+)$ from 1 can signify interference of the $C' + T'$ amplitude with the dominant $P'$ amplitude and hence can provide information on $\gamma$. The present value for this ratio, based on the branching ratios in Table 4, is $1.27 \pm 0.47$, compatible with 1. The triangle construction (with its generalization to include EWP contributions) avoids the need to evaluate strong phases. Other studies of calculable electroweak penguin effects have been made.127 Good use of these results will require 100 fb$^{-1}$ at an $e^+e^-$ collider or $10^8$ produced $B\bar B$ pairs.

12 Final-state interactions
It is crucial to understand final-state strong phases in order to anticipate direct CP-violating rate asymmetries and to check whether assumptions about the smallness of amplitudes involving the spectator quark are correct. The decay $B^+ \to K^0\pi^+$ in the naive diagrammatic approach is expected to be dominated by the penguin diagram with no tree contribution. The penguin weak phase would be $\mathrm{Arg}(V_{tb}^* V_{ts}) = \pi$. The phase of the annihilation amplitude $A$, which is expected to be suppressed by a factor of $\lambda^2 f_B/m_B$ and hence should be unimportant, should be $\mathrm{Arg}(V_{ub}^* V_{us}) = \gamma$. This implies a very small CP-violating rate asymmetry between $B^+ \to K^0\pi^+$ and $B^- \to \bar K^0\pi^-$, much smaller than in cases where $T$ and $P$ amplitudes can interfere such as $B^0 \to K^+\pi^-$.

Table 7: CP-violating rate asymmetries for several $B \to K\pi$ processes.

Mode             Signal events              $A_{CP}$
$K^+\pi^-$       $80^{+12}_{-11}$           $-0.04 \pm 0.16$
$K^+\pi^0$       $42.1^{+10.9}_{-9.9}$      $-0.29 \pm 0.23$
$K_S\pi^+$       $25.2^{+6.4}_{-5.6}$       $0.18 \pm 0.24$
$K^+\eta'$       $100^{+13}_{-12}$          $0.03 \pm 0.12$
$\omega\pi^+$    $28.5^{+8.2}_{-7.3}$       $-0.34 \pm 0.25$

The current data do not exhibit significant CP asymmetries in any modes.128 In Table 7 we summarize some recently reported CP asymmetries, defined as

$A_{CP} \equiv \dfrac{\Gamma(\bar B \to \bar f) - \Gamma(B \to f)}{\Gamma(\bar B \to \bar f) + \Gamma(B \to f)}$. (89)
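As a minimal sketch of how an asymmetry of the form (89) and its statistical uncertainty follow from event counts, the function below takes the yields for $\bar B \to \bar f$ and $B \to f$. It assumes background-free counts, equal detection efficiencies for the two charge states, and a simple binomial error; none of these assumptions comes from the measurements quoted in the text, and the name `cp_asymmetry` is hypothetical.

```python
import math


def cp_asymmetry(n_bbar, n_b):
    """Return (A_CP, sigma) from yields of Bbar -> fbar (n_bbar) and B -> f (n_b)."""
    total = n_bbar + n_b
    if total <= 0:
        raise ValueError("need a positive total yield")
    a_cp = (n_bbar - n_b) / total
    # Binomial counting error; ignores background and efficiency differences.
    sigma = math.sqrt((1.0 - a_cp**2) / total)
    return a_cp, sigma


if __name__ == "__main__":
    # Illustrative counts only, not experimental data.
    a_cp, sigma = cp_asymmetry(60, 40)
    print(f"A_CP = {a_cp:+.2f} +/- {sigma:.2f}")
```

With yields of order 100 events the binomial error alone is about 0.1, which gives a rough sense of why asymmetries of a few tenths are not yet significant at this level of statistics.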
The asymmetry in the mode $K_S\pi^\pm$ is no more or less significant than in other modes where $A_{CP} \neq 0$ could be expected. How could we tell whether the amplitude $A$ is suppressed by as much as we expect in the naive approach?

12.1 Rescattering
Rescattering from tree processes (such as those in Fig. 3 or Fig. 4 contributing to $B^+ \to K^+\pi^0$) could amplify the effective $A$ amplitude in $B^+ \to K^0\pi^+$, removing the suppression factor of $f_B/m_B$. The tree amplitude for $B^+ \to K^+\pi^0$ should be proportional to $V_{ub}^* V_{us}$ (as in the $A$ amplitude), but the magnitude of the rescattering amplitude for $K^+\pi^0 \to K^0\pi^+$ (in an S-wave) is unknown at the center-of-mass energy of $m_B c^2 = 5.28$ GeV.

12.2 A useful SU(3) relation
A sensitive test for the presence of an enhanced $A$ amplitude has been proposed,129 utilizing the U-spin symmetry $d \leftrightarrow s$ of flavor SU(3). Under this transformation, the $b \to s$ penguin diagram contributing to $B^+ \to K^0\pi^+$ is transformed into the $b \to d$ penguin contributing to $B^+ \to \bar K^0 K^+$, suppressed by a relative factor of $|V_{tb}^* V_{td}/V_{tb}^* V_{ts}| \simeq \lambda$, while the annihilation diagram contributing to $B^+ \to K^0\pi^+$ is transformed into that contributing to $B^+ \to \bar K^0 K^+$, enhanced by a relative factor of $|V_{ub}^* V_{ud}/V_{ub}^* V_{us}| \simeq 1/\lambda$. Thus the relative effects of the "annihilation" amplitude should be stronger by a factor of $1/\lambda^2$ in $B^+ \to \bar K^0 K^+$ than in $B^+ \to K^0\pi^+$. Even if these effects are not large enough to significantly influence the decay rate, they could well influence the predicted decay asymmetry.

12.3 The process $B^0 \to K^+K^-$
A process which should be dominated by interactions involving the spectator quark is $B^0 \to K^+K^-$.130 Only the exchange ($E$) and penguin annihilation ($PA$) graphs in Fig. 7 contribute to this decay. The exchange ($E$) amplitude should be proportional to $(f_B/m_B) V_{ub}^* V_{ud}$, and the penguin annihilation amplitude should be suppressed by further powers of $\alpha_s$, in a naive approach. The expected branching ratio if the $E$ amplitude dominates should be less than $10^{-7}$. However, if rescattering is important, the $K^+K^-$ final state could be "fed" by the process $B^0 \to \pi^+\pi^-$, whose amplitude is proportional to $T + P$. Present experimental limits place only the 90% c.l. upper bound $\mathcal{B}(B^0 \to K^+K^-) < 1.9 \times 10^{-6}$.67

12.4 Critical remarks
Some estimates of rescattering are based on Regge pole methods.131 These may not apply to low partial waves at energies of $m_B c^2 = 5.28$ GeV. Regge poles have proven phenomenologically successful primarily for "peripheral" partial waves $\ell \sim k_{\rm c.m.} \times (R \simeq 1~\text{fm})$.132

13 Topics not covered in detail

13.1 Measurement of $\gamma$ using $B^\pm \to DK$ decays
The self-tagging decays $B^\pm \to D^0 K^\pm$, $B^\pm \to \bar D^0 K^\pm$, and $B^\pm \to D_{CP} K^\pm$, where $D_{CP}$ is a CP eigenstate, permit one to perform a triangle construction very similar to that in Fig. 10 to extract the weak phase $\gamma$.133 However, the interference of the Cabibbo-favored decay $D^0 \to K^-\pi^+$ and the doubly-Cabibbo-suppressed decay $D^0 \to K^+\pi^-$ introduces an important subtlety in this method, which has been addressed.134
13.2 Dalitz plot analyses
The likely scarcity of the decay $B^0 \to \pi^0\pi^0$ (see Sec. 10) may be an important limitation in the method proposed135 to extract the weak phase $\alpha$ from $B \to \pi\pi$ decays using an isospin analysis. It has been suggested136 that one study instead the isospin structure of the decays $B \to \rho\pi$, since at least some of these processes occur with greater branching ratios than the corresponding $B \to \pi\pi$ decays. One must thus measure time-dependences and total rates for the processes $(B^0 \text{ or } \bar B^0) \to (\rho^\pm\pi^\mp, \rho^0\pi^0)$. A good deal of useful information, in fact, can be learned just from the time-integrated rates.137

13.3 CP violation in $B^0$-$\bar B^0$ mixing
The standard model of CP violation predicts that the number of same-sign dilepton pairs due to $B^0$-$\bar B^0$ mixing should be nearly the same for $\ell^+\ell^+$ and $\ell^-\ell^-$. By studying such pairs it is possible to test not only this prediction, but also the validity of CPT invariance. The OPAL Collaboration138 parametrizes neutral non-strange $B$ mass eigenstates as

$|B_1\rangle = \dfrac{(1 + \epsilon_B + \delta_B)|B^0\rangle + (1 - \epsilon_B - \delta_B)|\bar B^0\rangle}{\sqrt{2(1 + |\epsilon_B + \delta_B|^2)}}$,

$|B_2\rangle = \dfrac{(1 + \epsilon_B - \delta_B)|B^0\rangle - (1 - \epsilon_B + \delta_B)|\bar B^0\rangle}{\sqrt{2(1 + |\epsilon_B - \delta_B|^2)}}$,
and finds (allowing for CPT violation) $\mathrm{Im}(\delta_B) = -0.020 \pm 0.016 \pm 0.006$, $\mathrm{Re}(\epsilon_B) = -0.006 \pm 0.010 \pm 0.006$. Enforcing CPT invariance, they find $\mathrm{Re}(\epsilon_B) = 0.002 \pm 0.007 \pm 0.003$. A recent CLEO study139 finds $\mathrm{Re}(\epsilon_B)/(1 + |\epsilon_B|^2) = 0.0035 \pm 0.0103 \pm 0.0015$ under similar assumptions. The standard model predicts $|p/q| \simeq 1$ and hence $\mathrm{Re}(\epsilon_B) \simeq 0$.

14 What if the CKM picture doesn't work?

14.1 Likely accuracy of future measurements
It's useful to anticipate how our knowledge of the Cabibbo-Kobayashi-Maskawa matrix might evolve over the next few years.140,141 With $\sin(2\beta)$ measured in $B^0 \to J/\psi K_S$ decays to an accuracy of $\pm 0.06$ (the BaBar goal with 30 fb$^{-1}$; see Ref. 48), errors on $|V_{ub}/V_{cb}|$ reduced to 10%, strange-$B$ mixing bounded by $x_s = \Delta m_s/\Gamma_s > 20$ (the present bound is already better than this!), and $\mathcal{B}(B^+ \to \tau^+\nu_\tau)$ measured to $\pm 20\%$ (giving $f_B|V_{ub}|$, or $|V_{ub}/V_{td}|$ when combined with $B^0$-$\bar B^0$ mixing), one finds the result shown in Fig. 11.
Figure 11: Plot in the $(\rho, \eta)$ plane of anticipated $\pm 1\sigma$ constraints on CKM parameters in the year 2003. Solid curves: $|V_{ub}/V_{cb}|$; dashed lines: constraint on $|V_{ub}/V_{td}|$ by combining measurement of $\mathcal{B}(B^+ \to \tau^+\nu_\tau)$ with $B^0$-$\bar B^0$ mixing; dotted lines: constraint due to $\epsilon_K$ (CP-violating $K^0$-$\bar K^0$ mixing); dash-dotted line: limit due to $x_s$; solid rays: measurement of $\sin 2\beta$ to $\pm 0.06$.
The anticipated $(\rho, \eta)$ region is quite restricted, leading to the likelihood that if physics beyond the standard model is present, it will show up in such a plot as a contradiction among various measurements. What could be some sources of new physics?

14.2 Possible extensions

Some sources of effects beyond the standard model which could show up first in studies of $B$ mesons include:
• Supersymmetry (nearly everyone's guess) (see Murayama's lectures142);
• Flavor-changing effects from extended models of dynamical electroweak symmetry breaking (mentioned, e.g., by Chivukula143);
• Mixing of ordinary quarks with exotic ones, as in certain versions of grand unified theories.144

Typical effects show up most prominently in mixing (particularly $K^0$-$\bar K^0$ and $B^0$-$\bar B^0$),145 but also could appear in penguin processes such as $B \to \phi K$.146
15 Summary
We are entering an exciting era of precision B physics. With experimental and theoretical advances occurring on many fronts, we have good reason to hope for surprises in the next few years. If, however, the present picture survives such stringent tests, we should turn our attention to the more fundamental question of where the CKM matrix (as well as the quark masses themselves!) actually originates.

Acknowledgments

I would like to thank Prof. K. T. Mahanthappa for his directorship of the TASI-2000 Summer School and for his gracious hospitality in Boulder. These lectures were prepared in part during visits to DESY and Cornell; I thank colleagues there for hospitality and for the chance to discuss some of the subject material with them. These lectures grew out of long-term collaborations with Amol Dighe, Isard Dunietz, and Michael Gronau. I am grateful to them and to others, including O. F. Hernandez, H. J. Lipkin, D. London, and M. Neubert, for many pleasant interactions on these subjects. This work was supported in part by the United States Department of Energy through Grant No. DE FG02 90ER40560.

References

1. J. H. Christenson, J. W. Cronin, V. L. Fitch, and R. Turlay, Phys. Rev. Lett. 13, 138 (1964). 2. M. Kobayashi and T. Maskawa, Prog. Theor. Phys. 49, 652 (1973). 3. Fermilab E288 Collaboration, S. W. Herb et al., Phys. Rev. Lett. 39, 252 (1977); W. R. Innes et al., Phys. Rev. Lett. 39, 1240, 1640(E) (1977). 4. CDF Collaboration, F. Abe et al., Phys. Rev. D 50, 2966 (1994); 51, 4623 (1994); 52, 2605 (1995); Phys. Rev. Lett. 73, 225 (1994); 74, 2626 (1995); D0 Collaboration, S. Abachi et al., Phys. Rev. Lett. 72, 2138 (1994); 74, 2422 (1995); 74, 2632 (1995); Phys. Rev. D 52, 4877 (1995). 5. Y. Nir, Lectures given at 27th SLAC Summer Institute on Particle Physics: CP Violation in and Beyond the Standard Model (SSI 99), Stanford, California, 7-16 July 1999, Institute for Advanced Study report IASSNS-HEP-99-96, hep-ph/9911321. 6. R. Fleischer, DESY report DESY 00-170, hep-ph/0011323, invited lecture at NATO ASI Institute, Cascais, Portugal, 26 June - 7 July, 2000.
7. S. L. Glashow, Nucl. Phys. 22, 579 (1961); S. Weinberg, Phys. Rev. Lett. 19, 1264 (1967); A. Salam, in Weak and Electromagnetic Interactions (Proceedings of the 1968 Nobel Symposium), edited by N. Svartholm, Almqvist and Wiksells, Stockholm, 1968, p. 367. 8. C. Quigg, this volume. 9. J. D. Bjorken and S. L. Glashow, Phys. Lett. 11, 255 (1964); Y. Hara, Phys. Rev. 134, B701 (1964); Z. Maki and Y. Ohnuki, Prog. Theor. Phys. 32, 144 (1964). 10. S. L. Glashow, J. Iliopoulos, and L. Maiani, Phys. Rev. D 2, 1285 (1970). 11. M. K. Gaillard, B. W. Lee, and J. L. Rosner, Rev. Mod. Phys. 47, 277 (1975). 12. C. Bouchiat, J. Iliopoulos, and Ph. Meyer, Phys. Lett. 38B, 519 (1972). 13. K. Niu, E. Mikumo, and Y. Maeda, Prog. Theor. Phys. 46, 1644 (1971). 14. J. J. Aubert et al, Phys. Rev. Lett. 33, 1404 (1974). 15. J.-E. Augustin et al, Phys. Rev. Lett. 33, 1406 (1974). 16. Particle Data Group, D. E. Groom et al, Eur. Phys. J. C 15, 1 (2000). 17. M. L. Perl et al, Phys. Rev. Lett. 35, 1489 (1975); Phys. Lett. 63B, 466 (1976); Phys. Lett. 70B, 487 (1977). 18. H. Harari, in Proceedings of the 20th Annual SLAC Summer Institute on Particle Physics: The Third Family and the Physics of Flavor, July 13-24, 1992, edited by L. Vassilian, Stanford Linear Accelerator Center report SLAC-412, May, 1993, p. 647. 19. H. Harari, in Proc. 1975 Int. Symp. on Lepton and Photon Interactions (Stanford University, August 21-27, 1975), edited by W. T. Kirk (Stanford Linear Accelerator Center, Stanford, CA, 1976), p. 317. 20. J. H. Christenson et al, Phys. Rev. Lett. 25, 1523 (1970). 21. C. W. Darden et al, Phys. Lett. 76B, 246 (1978); 78B, 364 (1978). 22. J. L. Rosner, C. Quigg, and H. B. Thacker, Phys. Lett. 74B, 350 (1978). 23. CLEO Collaboration, D. Andrews et al, Phys. Rev. Lett. 45, 219 (1980). 24. G. L. Kane and M. E. Peskin, Nucl. Phys. B195, 29 (1982). 25. CLEO Collaboration, A. Chen et al, Phys. Lett. 122B, 317 (1983); P. Avery et al, Phys. Rev. Lett. 53, 1309 (1984). 26. C. Quigg and J. L. 
Rosner, Phys. Rep. 56, 167 (1979); H. Grosse and A. Martin, Phys. Rep. 60, 341 (1980); W. Kwong, C. Quigg, and J. L. Rosner, Ann. Rev. Nucl. Part. Sci. 37, 325 (1987); H. Grosse and A. Martin, Particle Physics and the Schrodinger Equation, Cambridge, 1998. 27. E. H. Thorndike, hep-ex/0003027, in Proceedings of the Conference on Probing Luminous and Dark Matter: Symposium in honor of Adrian Melissinos, Rochester, New York, 24-25 Sept. 1999, edited by A. Das
and T. Ferbel, World Scientific, Singapore, 2000, pp. 127-159. 28. CLEO Collaboration, S. Behrends et al., Phys. Rev. Lett. 50, 881 (1983). 29. CUSB Collaboration, J. Lee-Franzini et al., Phys. Rev. Lett. 65, 2947 (1990). 30. J. L. Rosner and M. B. Wise, Phys. Rev. D 47, 343 (1993). 31. S. Nussinov, Phys. Rev. Lett. 35, 1672 (1975) 32. N. Cabibbo, Phys. Rev. Lett. 10, 531 (1963). 33. G. Buchalla, CERN report CERN-TH-2001-041, hep-ph/0103166, published in this volume. 34. T. DeGrand, University of Colorado report COLO-HEP-447, hepph/0008234, published in this volume. 35. A. Falk, Johns Hopkins University report JHU-TIPAC-200005, hepph/0007339, published in this volume. 36. M. Neubert, Cornell University report CLNS-00-1712, hep-ph/0012204, published in this volume. 37. L. Wolfenstein, Carnegie-Mellon University report CMU-0006, hepph/0011400, published in this volume. 38. CLEO Collaboration, J. P. Alexander et al, CLEO-CONF 00-3, presented at XXX International Conference on High Energy Physics, Osaka, Japan, July 27 - August 2, 2000. 39. A. Falk, in Proceedings of the XlXth International Symposium on Lepton and Photon Interactions, Stanford, California, August 9-14, 1999, edited by J. Jaros and M. Peskin (World Scientific, Singapore, 2000), Electronic Conference Proceedings C990809, 174 (2000). 40. L. Wolfenstein, Phys. Rev. Lett. 51, 1945 (1983). 41. M. Gell-Mann and M. Levy, Nuovo Cim. 16, 705 (1960). 42. J. L. Rosner, Enrico Fermi Institute Report No. 2000-42, hepph/0011184. To be published in Proceedings of Beauty 2000, Kibbutz Maagan, Israel, September 13-18, 2000, edited by S. Erhan, Y. Rozen, and P. E. Schlein, Nucl. Inst. Meth. A, 2001. 43. V. Lubicz, Invited Talk at the XX Physics in Collision Conference, June 29 - July 1, 2000, Lisbon, Portugal, Univ. of Rome III report RM3TH/00-15, hep-ph/0010171. 44. D. Abbaneo, presented at Conference on Heavy Quarks at Fixed Target, Rio de Janeiro, Oct. 9-19, 2000, hep-ex/0012010. 45. J. L. Rosner, Phys. Rev. 
D 42, 3732 (1990). 46. F. Gilman, K. Kleinknecht, and Z. Renk, mini-review on pp. 110-114 of Particle Data Group, D. E. Groom et al., Eur. Phys. J. C 15, 1 (2000); A. Ali and D. London, DESY report DESY-00-026, hep-ph/0002167, in Proceedings of the 3rd Workshop on Physics and Detectors for DAPHNE, Frascati, Italy, Nov. 16-19, 1999, edited by S. Bianco et al. (INFN, 1999), pp. 3-23; S. Stone, Conference Summary, Beauty 2000, hep-ex/0012162, to be published in Proceedings of Beauty 2000 (Ref. 42); M. Ciuchini et al., Orsay preprint LAL 00-77, hep-ex/0012308, submitted to JHEP. 47. J. J. Sakurai, Modern Quantum Mechanics, Revised Edition, 1994. 48. The BaBar Physics Book: Physics at an Asymmetric B Factory, edited by P. F. Harrison and H. R. Quinn, SLAC Report SLAC-504, 1998. 49. A. J. Buras, M. Jamin, and P. H. Weisz, Nucl. Phys. B347, 491 (1990). 50. T. Inami and C. S. Lim, Prog. Theor. Phys. 65, 297, 1772(E) (1981). 51. A. J. Buras, A. Romanino, and L. Silvestrini, Nucl. Phys. B520, 3 (1998) and references therein; A. J. Buras and R. Fleischer, in Heavy Flavors II, edited by A. J. Buras, World Scientific, Singapore, 1997, p. 65; A. J. Buras, Technische Universität München report TUM-HEP-316/98, hep-ph/9806471, in Probing the Standard Model of Particle Interactions, Les Houches, France, July 28 - Sept. 5, 1997, edited by R. Gupta, A. Morel, E. DeRafael, and F. David (Elsevier, 1999). 52. I. Dunietz, Phys. Rev. D 52, 3048 (1995); T. E. Browder and S. Pakvasa, Phys. Rev. D 52, 3123 (1995), and references therein. 53. D. London and R. D. Peccei, Phys. Lett. B 223, 257 (1989). 54. M. Gronau, Phys. Rev. Lett. 63, 1451 (1989). 55. I. Dunietz and J. L. Rosner, Phys. Rev. D 34, 1404 (1986). 56. I. Bigi and A. Sanda, Nucl. Phys. B281, 41 (1987). 57. J. D. Bjorken and I. Dunietz, Phys. Rev. D 36, 2109 (1987). 58. OPAL Collaboration, K. Ackerstaff et al., Eur. Phys. J. C 5, 379 (1998). 59. CDF Collaboration, T. Affolder et al., Phys. Rev. D 61, 072005 (2000). 60. ALEPH Collaboration, R. Barate et al., Phys. Lett. B 492, 259 (2000). 61. Belle Collaboration, A. Abashian et al., Phys. Rev. Lett. 86, 2509 (2001). 62. BaBar Collaboration, B. Aubert et al., Phys. Rev. Lett. 86, 2515 (2001). 63. B. Winstein, Phys. Rev. Lett. 68, 1271 (1992). 64. M. Gronau and J. L. Rosner, Phys. Rev. D 59, 113002 (1999). 65. M. Neubert, JHEP 9902, 014 (1999). 66. G. Eilam, M. Gronau, and J. L. Rosner, Phys. Rev. D 39, 819 (1989). 67. CLEO Collaboration, D. Cronin-Hennessy et al., Phys. Rev. Lett. 85, 515 (2000). 68. CLEO Collaboration, S. J. Richichi et al., Phys. Rev. Lett. 85, 520 (2000). 69. CLEO Collaboration, D. Cinabro, Osaka Conf. (Ref. 38), hep-ex/0009045. 70. M. Gronau and J. L. Rosner, Phys. Rev. D 61, 073008 (2000). 71. Belle Collaboration, KEK report 2001-11, hep-ex/0104030, submitted to Phys. Rev. Lett.
72. BaBar Collaboration, T. Champion, Osaka Conf.,38 hep-ex/0011018; G. Cavoto, XXXVI Rencontres de Moriond, March 17-24, 2001 (unpublished). 73. M. Gronau and J. L. Rosner, Phys. Rev. Lett. 72, 195 (1994); Phys. Rev. D 49, 254 (1994); 63, 054006 (2001). 74. R. D. Field and R. P. Feynman, Nucl. Phys. B136, 1 (1978). 75. G. R. Farrar and J. L. Rosner, Phys. Rev. D 7, 2747 (1973); 10, 2226 (1974). 76. CDF Collaboration, T. Affolder et al, Phys. Rev. Lett. 84, 1663 (2000). 77. F. Palla, in Beauty 2000 Procs 4 2 78. CDF Collaboration, F. Abe et al, Phys. Rev. D 60, 072003 (1999); T. Affolder et al., Phys. Rev. D 60, 112004 (1999). 79. A. Ali and F. Barreiro, Zeit. Phys. C 30, 635 (1986). 80. M. Gronau, A. Nippe, and J. L. Rosner, Phys. Rev. D 47, 1988 (1993). 81. A. De Rujula, H. Georgi, and S. L. Glashow, Phys. Rev. Lett. 37, 785 (1976). 82. J. L. Rosner, Comments on Nucl. Part. Phys. 16, 109 (1986). 83. M. Lu, M. B. Wise, and N. Isgur, Phys. Rev. D 45, 1553 (1992). 84. CDF Collaboration, Fermilab-Pub-99/330-E, to be published in Physical Review D. 85. OPAL Collaboration, G. Alexander et al, Zeit. Phys. C 66, 19 (1995); G. Abbiendi et al, CERN report CERN-EP/2000-125, hep-ex/0010031; DELPHI Collaboration, P. Abreu et al, Phys. Lett. B 345, 598 (1995); ALEPH Collaboration, D. Buskulic et al, Zeit. Phys. C 69, 393 (1996); R. Barate et al, Phys. Lett. B 425, 215 (1998); L3 Collaboration, M. Acciarri et al, Phys. Lett. B 465, 323 (1999). 86. CLEO Collaboration, S. Anderson et al, Nucl. Phys. A663, 647 (2000); FOCUS Collaboration, C. Ricciardi, Nucl. Phys. A663, 651 (2000) and by F. L. Fabbri at Osaka Conf.,38 hep-ex/0011044. 87. R. Van Royen and V. F. Weisskopf, Nuovo Cim. A50, 617 (1967). 88. S. Gasiorowicz and J. L. Rosner, Am. J. Phys. 49, 954 (1981). 89. D. Becirevic, talk at 18th International Sumposium on Lattice Field Theory (Lattice 2000), Bangalore, India, 17-22 August 2000, Nucl. Phys. Proc. Suppl. 94, 337 (2001). 90. R. Aleksan et al, Phys. Lett. 
B 317, 173 (1993); Zeit. Phys. C 67, 251 (1995); Phys. Lett. B 356, 95 (1995); M. Beneke, G. Buchalla, and I. Dunietz, Phys. Lett. B 393, 132 (1997). 91. M. Beneke and A. Lenz, RWTH Aachen report PITHA 00/29, hepph/0012222, submitted to J. Phys. G. 92. D. Becirevic et al, Eur. Phys. J. C 18, 157 (2000).
93. D. Becirevec et ai, hep-lat/0002025. 94. A. S. Dighe, I. Dunietz, H. J. Lipkin, and J. L. Rosner, Phys. Lett. B 369, 144 (1996). 95. I. Dunietz, H. J. Lipkin, H. R. Quinn, and A. Snyder, Phys. Rev. D 43, 2193 (1991). 96. CDF Collaboration, T. Affolder et al, Phys. Rev. Lett. 85, 4668 (2000). 97. CLEO Collaboration, C. P. Jessop et al, Phys. Rev. Lett. 79, 4533 (1997). 98. BaBar Collaboration, SLAC report SLAC-PUB-8679, hep-ph/0010067, G. Raven, Osaka Conf. 38 99. ALEPH Collaboration, R. Barate et al, Phys. Lett. B 486, 286 (2000). 100. ALEPH, CDF, DELPHI, L3, OPAL, and SLD Collaborations, subgroup consisting of P. Coyle, D. Lucchesi, S. Mele, F. Parodi, and P. Spagnolo, SLAC-Pub-8492, CERN-EP-2000-096, hep-ex/0009052. 101. L. L. Chau et al, Phys. Rev. D 43, 2176 (1991). 102. M. Gronau, O. F. Hernandez, D. London, and J. L. Rosner, Phys. Rev. D 50, 4529 (1994). 103. D. Zeppenfeld, Zeit. Phys. C 8, 77 (1981). 104. M. Savage and M. Wise, Phys. Rev. D 39, 3346 (1989); 40, 3127(E) (1989). 105. J. Silva and L. Wolfenstein, Phys. Rev. D 49, R1151 (1994). 106. M. Gronau, J. Rosner and D. London, Phys. Rev. Lett. 73, 21 (1994); R. Fleischer, Phys. Lett. B 365, 399 (1996); Phys. Rev. D 58, 093001 (1998); M. Gronau and J. L. Rosner, Phys. Rev. Lett. 76, 1200 (1996); A. S. Dighe, M. Gronau, and J. L. Rosner, Phys. Rev. D 54, 3309 (1996); A. S. Dighe and J. L. Rosner, Phys. Rev. D 54, 4677 (1996); R. Fleischer and T. Mannel, Phys. Rev. D 57, 2752 (1998); M. Neubert and J. L. Rosner, Phys. Lett. B 441, 403 (1998); Phys. Rev. Lett. 81, 5076 (1998); A. J. Buras, R. Fleischer, and T. Mannel, Nucl. Phys. B533, 3 (1998); R. Fleischer and A. J. Buras, Eur. Phys. J. C 11, 93 (1999); M. Neubert, JHEP 9902, 014 (1999); M. Gronau and D. Pirjol, Phys. Rev. D 61, 013005 (2000); A. J. Buras and R. Flesicher, Eur. Phys. J. C 16, 97 (2000); M. Gronau, Technion report TECHNION-PH-2000-30, hep-ph/0011392, to be published in Proceedings of Beauty 2000; 42 M. Gronau and J. L. Rosner, Phys. 
Lett. B 500, 247 (2001). 107. M. Gronau and J. L. Rosner, Phys. Lett. B 482, 71 (2000). 108. M. Gronau, Phys. Lett. B 492, 297 (2000). 109. R. Fleischer, Phys. Lett. B 459, 306 (1999); Eur. Phys. J. C 16, 87 (2000). See also I. Dunietz, Proceedings of the Workshop on B Physics at Hadron Accelerators, Snowmass, CO, 1993, p. 83; D. Pirjol, Phys.
Rev. D 60, 054020 (1999). 110. F. Wiirthwein and R. Jesik, presented at Workshop on B Physics at the Tevatron - Run II and Beyond, Fermilab, February 2000 (unpublished). 111. C.-W. Chiang and L. Wolfenstein, Phys. Lett. B 493, 73 (2000). 112. J. L. Rosner, Phys. Rev. D 27, 1101 (1983); in Proceedings of the International Symposium on Lepton and Photon Interactions at High Energy, Kyoto, Aug. 19-24,1985, edited by M. Konuma and K. Takahashi (Kyoto Univ., Kyoto, 1985), p. 448; F. J. Gilman and R. Kauffman, Phys. Rev. D 36, 2761 (1987). 113. M. Gronau and J. L. Rosner, Phys. Rev. D 53, 2516 (1996); A. S. Dighe, M. Gronau, and J. L. Rosner, Phys. Lett. B 367, 357 (1996); 377, 325 (1996); Phys. Rev. Lett. 79, 4333 (1997); A. S. Dighe, Phys. Rev. D 54, 2067 (1996). 114. T. Feldmann, Int. J. Mod. Phys. A 15, 159 (2000). 115. H. J. Lipkin, Phys. Rev. Lett. 46, 1307 (1981); Phys. Lett. B 254, 247 (1991); Phys. Lett. B 415, 186 (1997); 433, 117 (1998). 116. I. Halperin and A. Zhitnitsky, Phys. Rev. D 56, 7247 (1997); Phys. Rev. Lett. 80, 438 (1998); F. Yuan and K.-T. Chao, Phys. Rev. D 56, R2495 (1997); A. Ali and C. Greub, Phys. Rev. D 57, 2996 (1998); A. Ali, J. Chay, C. Greub, and P. Ko, Phys. Lett. B 424, 161 (1998); D. Atwood and A. Soni, Phys. Lett. B 405, 150 (1997); Phys. Rev. Lett. 79, 5206 (1997); W.-S. Hou and B. Tseng, Phys. Rev. Lett. 80, 434 (1998); H.-Y. Cheng and B. Tseng, Phys. Lett. B 415, 263 (1997); A. Datta, X.-G. He, and S. Pakvasa, Phys. Lett. B 419, 369 (1998); A. L. Kagan and A. A. Petrov, UCHEP-27/UMHEP-443, hep-ph/9707354; H. Fritzsch, Phys. Lett. B 415, 83 (1997). 117. W.-S. Hou, J. G. Smith, and F. Wiirthwein, hep-ex/9910014. 118. X.-G. He, W.-S. Hou, and K. C. Yang, Phys. Rev. Lett. 83, 1100 (1999). 119. CLEO Collaboration, C. P. Jessop et al, Phys. Rev. Lett. 85, 2881 (2000). 120. M. Gronau and J. L. Rosner, Phys. Rev. D 57, 6843 (1998). 121. B. Tseng and C.-W. Chiang, hep-ph/9905338; C.-W. Chiang and L. Wolfenstein, Phys. Rev. 
D 61, 074031 (2000); C.-W. Chiang, Phys. Rev. D 62, 014017 (2000). 122. See, e.g., M. Ciuchini et al., Nucl. Phys. B501, 271 (1997); B512, 3 (1998); B531, 656(E) (1998); Nucl. Instr. Meth. A 408, 28 (1998); hep-ph/9909530, to be published in Kaon Physics, edited by J. L. Rosner and B. Winstein, University of Chicago Press, 2001; Y.-Y. Keum, H.-n. Li, and A. I. Sanda, Phys. Lett. B 504, 6 (2001); Phys. Rev. D 63, 054008 (2001).
123. M. Gronau, O. F. Hernandez, D. London, and J. L. Rosner, Phys. Rev. D 52, 6374 (1995). 124. M. Gronau et al, Phys. Rev. Lett. 73, 21 (1994). 125. R. Fleischer, Phys. Lett. B 365, 399 (1994); N. G. Deshpande and X.-G He, Phys. Rev. Lett. 74, 26,4099(E) (1995). 126. M. Neubert and J. L. Rosner, Phys. Lett. B 441, 403 (1998); Phys. Rev. Lett. 8 1 , 5076 (1998); M. Neubert, JHEP 9902, 014 (1999). 127. M. Gronau and D. Pirjol, Phys. Rev. D 60, 034021 (1999). 128. CLEO Collaboration, S. Chen et al, Phys. Rev. Lett. 85, 525 (2000). 129. A. Falk et al., Phys. Rev. D 57, 4290 (1998). 130. M. Gronau and J. L. Rosner, Phys. Rev. D 58, 113005 (1998). 131. See, e.g., J.-M. Gerard and J. Weyers, Eur. Phys. J. C 7, 1 (1999); A. F. Falk, A. L. Kagan, Y. Nir and A. A. Petrov, Phys. Rev. D 57, 4290 (1998). 132. H. Harari, Ann. Phys. (N.Y.) 63, 432 (1971); H. Harari and M. Davier, Phys. Lett. 35B, 239 (1971); H. Harari and A. Schwimmer, Phys. Rev. D 5, 2780 (1972). 133. M. Gronau and D. Wyler, Phys. Lett. B 265, 172 (1991). 134. See, e.g., D. Atwood, I. Dunietz, and A. Soni, Phys. Rev. Lett. 78, 3257 (1997); Phys. Rev. D 63, 036005 (2001); M. Gronau, Phys. Rev. D 58, 037301 (1998). 135. M. Gronau and D. London, Phys. Rev. Lett. 65, 3381 (1990). 136. A. Snyder and H. Quinn, Phys. Rev. D 48, 2139 (1993). 137. H. Quinn and J. Silva, Phys. Rev. D 62, 054002 (2000). 138. OPAL Collaboration, K. Ackerstaff et al, Zeit. Phys. C 76, 401, 417 (1997). See also OPAL Collaboration, G. Abbiendi et al, Eur. Phys. J. C 12, 609 (2000) for a more recent result based on comparison of b and b decays. 139. CLEO Collaboration, D. E. Jaffe et al., Cornell University report CLNS 01/1717, hep-ex/0101006 (unpublished). 140. J. L. Rosner, Nucl. Phys. Proc. Suppl. 73, 29 (1999). 141. P. Burchat et al, Report of the NSF Elementary Particle Physics Special Emphasis Panel on B Physics, July, 1998 (unpublished). 142. H. Murayama, this volume. 143. R. S. 
Chivukula, Boston University report BUHEP-00-24, hep-ph/0011264, published in this volume. 144. J. L. Rosner, Phys. Rev. D 61, 097303 (2000), and references therein. 145. See, e.g., M. Gronau and D. London, Phys. Rev. D 55, 2845 (1997). 146. Y. Grossman and M. P. Worah, Phys. Lett. B 395, 241 (1997); Y. Grossman, G. Isidori, and M. P. Worah, Phys. Rev. D 58, 057504 (1998).
Matthias Neubert
LECTURES ON THE THEORY OF NON-LEPTONIC B DECAYS

MATTHIAS NEUBERT
Newman Laboratory of Nuclear Studies, Cornell University
Ithaca, NY 14853, USA
E-mail: [email protected]
These notes provide a pedagogical introduction to the theory of non-leptonic heavy-meson decays recently proposed by Beneke, Buchalla, Sachrajda and myself. We provide a rigorous basis for factorization for a large class of non-leptonic two-body B-meson decays in the heavy-quark limit. The resulting factorization formula incorporates elements of the naive factorization approach and the hard-scattering approach, and allows us to compute systematically radiative ("non-factorizable") corrections to naive factorization for decays such as $B \to D\pi$ and $B \to \pi\pi$.
1 Introduction
Non-leptonic two-body decays of B mesons, although simple as far as the underlying weak decay of the b quark is concerned, are complicated on account of strong-interaction effects. If these effects could be computed, this would enhance tremendously our ability to uncover the origin of CP violation in weak interactions from data on a variety of such decays being collected at the B factories. In these lectures, I review recent progress towards a systematic analysis of weak heavy-meson decays into two energetic mesons based on the factorization properties of decay amplitudes in QCD [1,2]. My discussion will follow very closely the detailed account of this approach given in Ref. [2]. (We have worked so hard on this paper that any attempt to improve on it would be bound to fail and leave the author in despair.) Much of the credit for these notes belongs to my collaborators Martin Beneke, Gerhard Buchalla, and Chris Sachrajda.

As in the classic analysis of semi-leptonic $B \to D$ transitions [3,4], our arguments make extensive use of the fact that the b quark is heavy compared to the intrinsic scale of strong interactions. This allows us to deduce that non-leptonic decay amplitudes in the heavy-quark limit have a simple structure. The arguments to reach this conclusion, however, are quite different from those used for semi-leptonic decays, since for non-leptonic decays a large momentum is transferred to at least one of the final-state mesons. The results of our work justify naive factorization of four-fermion operators for many, but not all, non-leptonic decays and imply that corrections termed "non-factorizable", which up to now have been thought to be intractable, can be calculated rigorously if the mass of the decaying quark is large enough. This leads to a large number
of predictions for CP-violating B decays in the heavy-quark limit, for which measurements will soon become available.

Weak decays of heavy mesons involve three fundamental scales, the weak-interaction scale $M_W$, the b-quark mass $m_b$, and the QCD scale $\Lambda_{\rm QCD}$, which are strongly ordered: $M_W \gg m_b \gg \Lambda_{\rm QCD}$. The underlying weak decay being computable, all theoretical work concerns strong-interaction corrections. QCD effects involving virtualities above the scale $m_b$ are well understood. They renormalize the coefficients of local operators $O_i$ in the effective weak Hamiltonian [5], so that the amplitude for the decay $B \to M_1 M_2$ is given by

$$
\mathcal{A}(B \to M_1 M_2) = \frac{G_F}{\sqrt{2}} \sum_i \lambda_i\, C_i(\mu)\, \langle M_1 M_2 | O_i(\mu) | B \rangle , \tag{1}
$$
where each term in the sum is the product of a Cabibbo-Kobayashi-Maskawa (CKM) factor $\lambda_i$, a coefficient function $C_i(\mu)$, which incorporates strong-interaction effects above the scale $\mu \sim m_b$, and a matrix element of an operator $O_i$. The difficult theoretical problem is to compute these matrix elements or, at least, to reduce them to simpler non-perturbative objects.

A variety of treatments of this problem exist, which rely on assumptions of some sort. Here we identify two somewhat contrary lines of approach. The first one, which we shall call "naive factorization", replaces the matrix element of a four-fermion operator in a heavy-quark decay by the product of the matrix elements of two currents [6,7], e.g.

$$
\langle D^+ \pi^- | (\bar c b)_{V-A} (\bar d u)_{V-A} | \bar B_d \rangle \to \langle \pi^- | (\bar d u)_{V-A} | 0 \rangle \, \langle D^+ | (\bar c b)_{V-A} | \bar B_d \rangle . \tag{2}
$$
This assumes that the exchange of "non-factorizable" gluons between the $\pi^-$ and the $(\bar B_d D^+)$ system can be neglected if the virtuality of the gluons is below $\mu \sim m_b$. The non-leptonic decay amplitude then reduces to the product of a form factor and a decay constant. This assumption is in general not justified, except in the limit of a large number of colours in some cases. It deprives the amplitude of any physical mechanism that could account for rescattering in the final state. "Non-factorizable" radiative corrections must also exist, because the scale dependence of the two sides of (2) is different. Since such corrections at scales larger than $\mu$ are taken into account in deriving the effective weak Hamiltonian, it appears rather arbitrary to leave them out below the scale $\mu$. Various generalizations of the naive factorization approach have been proposed, which include new parameters that account for non-factorizable corrections. In their most general form, these generalizations have nothing to do with the original "factorization" ansatz, but amount to a general parameterization of the matrix elements. Such general parameterizations are exact, but at the price of
introducing many unknown parameters and eliminating any theoretical input on strong-interaction dynamics.

The second method used to study non-leptonic decays is the hard-scattering approach, which assumes the dominance of hard gluon exchange. The decay amplitude is then expressed as a convolution of a hard-scattering factor with light-cone wave functions of the participating mesons, in analogy with more familiar applications of this method to hard exclusive reactions involving only light hadrons [8,9]. In many cases, the hard-scattering contribution represents the leading term in an expansion in powers of $\Lambda_{\rm QCD}/Q$, where $Q$ denotes the hard scale. However, the short-distance dominance of hard exclusive processes is not enforced kinematically and relies crucially on the properties of hadronic wave functions. There is an important difference between light mesons and heavy mesons in this regard, because the light quark in a heavy meson at rest naturally has a small momentum of order $\Lambda_{\rm QCD}$, while for fast light mesons a configuration with a soft quark is suppressed by the endpoint behaviour of the meson wave function. As a consequence, the soft (or Feynman) mechanism is power suppressed for hard exclusive processes involving light mesons, but it is of leading power for heavy-meson decays.

It is clear from this discussion that a satisfactory treatment should take into account soft contributions, but also allow us to compute corrections to naive factorization in a systematic way. It is not at all obvious that such a treatment would result in a predictive framework. We will show that this does indeed happen for most non-leptonic two-body B decays. Our main conclusion is that "non-factorizable" corrections are dominated by hard gluon exchange, while the soft effects that survive in the heavy-quark limit are confined to the $(B M_1)$ system, where $M_1$ denotes the meson that picks up the spectator quark in the B meson.
This result is expressed as a factorization formula, which is valid up to corrections suppressed by powers of $\Lambda_{\rm QCD}/m_b$. At leading power, non-perturbative contributions are parameterized by the physical form factors for the $B \to M_1$ transition and leading-twist light-cone distribution amplitudes of the mesons. Hard perturbative corrections can be computed systematically in a way similar to the hard-scattering approach. On the other hand, because the $B \to M_1$ transition is parameterized by a form factor, we recover the result of naive factorization at lowest order in $\alpha_s$.

An important implication of the factorization formula is that strong rescattering phases are either perturbative or power suppressed in $\Lambda_{\rm QCD}/m_b$. It is worth emphasizing that the decoupling of $M_2$ occurs in the presence of soft interactions in the $(B M_1)$ system. In other words, while strong-interaction effects in the $B \to M_1$ transition are not confined to small transverse distances, the other meson $M_2$ is predominantly produced as a compact object with small
transverse extension. The decoupling of soft effects then follows from "colour transparency". The colour-transparency argument for exclusive B decays has already been noted in the literature [10,11], but it has never been developed into a factorization formula that could be used to obtain quantitative predictions.

The approach described in Refs. [1,2] is general and applies to decays into a heavy and a light meson (such as $B \to D\pi$) as well as to decays into two light mesons (such as $B \to \pi\pi$). Factorization does not hold, however, for decays such as $B \to \pi D$ and $B \to DD$, in which the meson that does not pick up the spectator quark in the B meson is heavy. For the main part of these lectures, we will focus on the case of $B \to D^{(*)} L$ decays (with $L$ a light meson), for which the factorization formula takes its simplest form, and power counting will be relatively straightforward. Occasionally, we will point out what changes when we consider more complicated decays such as $B \to \pi\pi$. A detailed treatment of these processes can be found in Ref. [12].

The outline of these notes is as follows: In Sect. 2 we state the factorization formula in its general form. In Sect. 3 we collect the physical arguments that lead to factorization and introduce our power-counting scheme. We show how light-cone distribution amplitudes enter, discuss the heavy-quark scaling of the $B \to D$ form factor, and explain the cancellation of soft and collinear contributions in "non-factorizable" vertex corrections to non-leptonic decay amplitudes. We also comment on the implications of our results for final-state interactions in hadronic B decays. The cancellation of long-distance singularities is demonstrated in more detail in Sect. 4, where we present the calculation of the hard-scattering functions at one-loop order for decays into a heavy and a light meson. Various sources of power-suppressed effects, which give corrections to the factorization formula, are discussed in Sect. 5.
They include hard-scattering contributions, weak annihilation, and contributions from multi-particle Fock states. We then point out some limitations of the factorization approach. In Sect. 7 we consider the phenomenology of $B \to D^{(*)} L$ decays on the basis of the factorization formula and discuss various tests of our theoretical framework. We also examine to what extent a charm meson should be considered as heavy or light. Section 8 contains the conclusion.
2 Statement of the factorization formula
In this section we summarize the factorization formula for non-leptonic B decays. We introduce relevant terminology and provide definitions of the hadronic quantities that enter the factorization formula as non-perturbative input parameters.
2.1 The idea of factorization
In the context of non-leptonic decays the term "factorization" is usually applied to the approximation of the matrix element of a four-fermion operator by the product of a form factor and a decay constant, as illustrated in (2). Corrections to this approximation are called "non-factorizable". We will refer to this approximation as "naive factorization" and use quotes on "non-factorizable" to avoid confusion with the (much less trivial) meaning of factorization in the context of hard processes in QCD. In the latter case, factorization refers to the separation of long-distance contributions to the process from a short-distance part that depends only on the large scale $m_b$. The short-distance part can be computed in an expansion in the strong coupling $\alpha_s(m_b)$. The long-distance contributions must be computed non-perturbatively or determined experimentally. The advantage is that these non-perturbative parameters are often simpler in structure than the original quantity, or they are process independent. For example, factorization applied to hard processes in inclusive hadron-hadron collisions requires only parton distributions as non-perturbative inputs. Parton distributions are much simpler objects than the original matrix element with two hadrons in the initial state. On the other hand, factorization applied to the $B \to D$ form factor leads to a non-perturbative object (the "Isgur-Wise function"), which is still a function of the momentum transfer. However, the benefit here is that symmetries relate this function to other form factors. In the case of non-leptonic B decays, the simplification is primarily of the first kind (simpler structure). We call those effects non-factorizable (without quotes) which depend on the long-distance properties of the B meson and both final-state mesons combined.

The factorization properties of non-leptonic decay amplitudes depend on the two-meson final state. We call a meson "light" if its mass $m$ remains finite in the heavy-quark limit.
A meson is called "heavy" if its mass scales with $m_b$ in the heavy-quark limit, such that $m/m_b$ stays fixed. In principle, we could still have $m \gg \Lambda_{\rm QCD}$ for a light meson. Charm mesons could be considered as light in this sense. However, unless otherwise mentioned, we assume that $m$ is of order $\Lambda_{\rm QCD}$ for a light meson, and we consider charm mesons as heavy. In evaluating the scaling behaviour of the decay amplitudes, we assume that the energies of both final-state mesons (in the B-meson rest frame) scale with $m_b$ in the heavy-quark limit.

2.2 The factorization formula
We consider a generic weak decay $B \to M_1 M_2$ in the heavy-quark limit and differentiate between decays into final states containing a heavy and a light
Figure 1: Graphical representation of the factorization formula. Only one of the two form-factor terms in (3) is shown for simplicity.
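meson or two light mesons. Our goal is to show that, up to power corrections of order $\Lambda_{\rm QCD}/m_b$, the transition matrix element of an operator $O_i$ in the effective weak Hamiltonian can be written as

$$
\begin{aligned}
\langle M_1 M_2 | O_i | B \rangle &= \sum_j F_j^{B \to M_1}(m_2^2)\, f_{M_2} \int_0^1 du\, T^{I}_{ij}(u)\, \Phi_{M_2}(u) \\
&\qquad \text{if } M_1 \text{ is heavy and } M_2 \text{ is light,} \\
\langle M_1 M_2 | O_i | B \rangle &= \sum_j F_j^{B \to M_1}(m_2^2)\, f_{M_2} \int_0^1 du\, T^{I}_{ij}(u)\, \Phi_{M_2}(u) + (M_1 \leftrightarrow M_2) \\
&\quad + f_B f_{M_1} f_{M_2} \int_0^1 d\xi\, du\, dv\, T^{II}_{i}(\xi,u,v)\, \Phi_B(\xi)\, \Phi_{M_1}(v)\, \Phi_{M_2}(u) \\
&\qquad \text{if } M_1 \text{ and } M_2 \text{ are both light.}
\end{aligned} \tag{3}
$$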
Here $F_j^{B \to M}(m^2)$ denotes a $B \to M$ form factor evaluated at $q^2 = m^2$, $m_{1,2}$ are the light-meson masses, and $\Phi_X(u)$ is the light-cone distribution amplitude for the quark-antiquark Fock state of the meson $X$. These non-perturbative quantities will be defined below. $T^{I}_{ij}(u)$ and $T^{II}_{i}(\xi,u,v)$ are hard-scattering functions, which are perturbatively calculable. The factorization formula in its general form is represented graphically in Fig. 1.

The second equation in (3) applies to decays into two light mesons, for which the spectator quark in the B meson (in the following simply referred to as the "spectator quark") can go to either of the final-state mesons. An example is the decay $B^- \to \pi^0 K^-$. If the spectator quark can go only to
one of the final-state mesons, as for example in $\bar B_d \to \pi^+ K^-$, we call this meson $M_1$, and the second form-factor term on the right-hand side of (3) is absent. The formula simplifies when the spectator quark goes to a heavy meson (first equation in (3)), such as in $\bar B_d \to D^+ \pi^-$. Then the second term in Fig. 1, which accounts for hard interactions with the spectator quark, can be dropped because it is power suppressed in the heavy-quark limit. In the opposite situation that the spectator quark goes to a light meson but the other meson is heavy, factorization does not hold, because the heavy meson is neither fast nor small and cannot be factorized from the $B \to M_1$ transition. Finally, notice that annihilation topologies do not appear in the factorization formula, since they do not contribute at leading order in the heavy-quark expansion.

Any hard interaction costs a power of $\alpha_s$. As a consequence, the hard-spectator term in the second formula in (3) is absent at order $\alpha_s^0$. Since at this order the functions $T^{I}_{ij}(u)$ are independent of $u$, the convolution integral results in the normalization of the meson distribution amplitude, and (3) reproduces naive factorization. The factorization formula allows us to compute radiative corrections to this result to all orders in $\alpha_s$. Further corrections are suppressed by powers of $\Lambda_{\rm QCD}/m_b$ in the heavy-quark limit.

The significance and usefulness of the factorization formula stem from the fact that the non-perturbative quantities appearing on the right-hand side of the two equations in (3) are much simpler than the original non-leptonic matrix elements on the left-hand side. This is because they either reflect universal properties of a single meson (light-cone distribution amplitudes) or refer only to a $B \to$ meson transition matrix element of a local current (form factors).
While it is extremely difficult, if not impossible [13], to compute the original matrix element $\langle M_1 M_2 | O_i | B \rangle$ in lattice QCD, form factors and light-cone distribution amplitudes are already being computed in this way, although with significant systematic errors at present. Alternatively, form factors can be obtained using data on semi-leptonic decays, and light-cone distribution amplitudes by comparison with other hard exclusive processes.

After having presented the most general form of the factorization formula, we will from now on restrict ourselves to the case of heavy-light final states. Then the (simpler) first formula in (3) applies, and only the first term shown in Fig. 1 is present at leading power.

2.3 Definition of non-perturbative parameters
The form factors $F_j^{B \to M}(q^2)$ in (3) arise in the decomposition of current matrix elements of the form $\langle M(p') | \bar q\, \Gamma\, b | \bar B(p) \rangle$, where $\Gamma$ can be any irreducible Dirac matrix that appears after contraction of the hard subgraph to a local vertex
with respect to the $B \to M$ transition. We will often refer to the matrix element of the vector current evaluated between a B meson and a pseudoscalar meson $P$, which is conventionally parameterized as

$$
\langle P(p') | \bar q \gamma^\mu b | \bar B(p) \rangle = F_+^{B \to P}(q^2)\, (p^\mu + p'^\mu) + \left[ F_0^{B \to P}(q^2) - F_+^{B \to P}(q^2) \right] \frac{m_B^2 - m_P^2}{q^2}\, q^\mu , \tag{4}
$$
where $q = p - p'$, and $F_+^{B \to P}(0) = F_0^{B \to P}(0)$ at zero momentum transfer. Note that we write (3) in terms of physical form factors. In principle, Fig. 1 could be looked upon in two different ways. We could suppose that the region represented by $F_j$ accounts only for the soft contributions to the $B \to M_1$ form factor. The hard contributions to the form factor would then be considered as part of $T^{I}_{ij}$ (or as part of the second diagram). Performing this split-up would require that one understands the factorization of hard and soft contributions to the form factor. If $M_1$ is heavy, this amounts to matching the form factor onto a form factor defined in heavy-quark effective theory [14]. However, for a light meson $M_1$ the factorization of hard and soft contributions to the form factor is not yet completely understood. We bypass this problem by interpreting $F_j$ as the physical form factor, including hard and soft contributions. This avoids the above problem, and in addition has the advantage that the physical form factors are directly related to measurable quantities.

Light-cone distribution amplitudes play the same role for hard exclusive processes that parton distributions play for inclusive processes. As in the latter case, the leading-twist distribution amplitudes, which are the ones we need at leading power in the $1/m_b$ expansion, are given by two-particle operators with a certain helicity structure. The helicity structure is determined by the angular momentum of the meson and the fact that the spinor of an energetic quark has only two large components. The leading-twist light-cone distribution amplitudes for pseudoscalar mesons ($P$) and longitudinally polarized vector mesons ($V_\parallel$) with flavour content $(\bar q q')$ are defined as
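$$
\begin{aligned}
\langle P(q) | \bar q(y)_\alpha\, q'(x)_\beta | 0 \rangle &= \frac{i f_P}{4}\, (\slashed{q} \gamma_5)_{\beta\alpha} \int_0^1 du\, e^{i(\bar u\, q \cdot x + u\, q \cdot y)}\, \Phi_P(u,\mu) , \\
\langle V_\parallel(q) | \bar q(y)_\alpha\, q'(x)_\beta | 0 \rangle &= -\frac{i f_V}{4}\, \slashed{q}_{\beta\alpha} \int_0^1 du\, e^{i(\bar u\, q \cdot x + u\, q \cdot y)}\, \Phi_\parallel(u,\mu) ,
\end{aligned} \tag{5}
$$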
where $(x - y)^2 = 0$. We have suppressed the path-ordered exponentials that connect the two quark fields at different positions and make the light-cone operators gauge invariant. The equality sign is to be understood as "equal
up to higher-twist terms". It is also understood that the operators on the left-hand side are colour singlets. When convenient, we use the "bar" notation $\bar u \equiv 1 - u$. The parameter $\mu$ is the renormalization scale of the light-cone operators on the left-hand side. The distribution amplitudes are normalized as $\int_0^1 du\, \Phi_X(u,\mu) = 1$ with $X = P, V_\parallel$. One defines the asymptotic distribution amplitude as the limit in which the renormalization scale is sent to infinity. In this case

$$
\Phi_X(u,\mu) \stackrel{\mu \to \infty}{=} 6u(1-u) . \tag{6}
$$
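As a quick numerical illustration of the normalization condition, the following sketch integrates the asymptotic form (6) and convolutes it with a constant kernel. The function names and the midpoint-rule integrator are illustrative choices made here, not part of the formalism; the constant kernel stands in for a hard-scattering function that does not depend on $u$.

```python
def phi_asymptotic(u):
    """Asymptotic leading-twist distribution amplitude 6u(1-u), Eq. (6)."""
    return 6.0 * u * (1.0 - u)


def integrate(f, a=0.0, b=1.0, n=10_000):
    """Midpoint-rule integral of f over [a, b] (illustrative helper)."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def convolve(kernel, phi):
    """Convolution integral int_0^1 du kernel(u) * phi(u)."""
    return integrate(lambda u: kernel(u) * phi(u))


if __name__ == "__main__":
    # Normalization of the distribution amplitude (should be close to 1).
    print(integrate(phi_asymptotic))
    # A u-independent kernel (hypothetical value 2.5) is returned unchanged,
    # because the integral collapses to the normalization condition.
    print(convolve(lambda u: 2.5, phi_asymptotic))
```

With a $u$-independent kernel the convolution simply returns the kernel value, which is how a constant hard-scattering function leads back to a product of a form factor and a decay constant.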
The use of light-cone distribution amplitudes in non-leptonic B decays requires justification, which we will provide in Sects. 3 and 4. The decay amplitude for a B decay into a heavy-light final state is then calculated by assigning momenta $uq$ and $\bar u q$ to the quark and antiquark in the outgoing light meson (with momentum $q$), writing down the on-shell amplitude in momentum space, and performing the replacement (7) for pseudoscalars and, with obvious modifications, for vector mesons. (Even when working with light-cone distribution amplitudes it is not always justified to perform the collinear approximation on the external quark and antiquark lines right away. One may have to keep the transverse components of the quark and antiquark momenta until after some operations on the amplitude have been carried out. However, these subtleties do not concern calculations at leading-twist order.)

3 Arguments for factorization

In this section we provide the basic power-counting arguments that lead to the factorized structure shown in (3). We do so by analyzing qualitatively the hard, soft and collinear contributions to the simplest Feynman diagrams.

3.1 Preliminaries and power counting
For concreteness, we label the charm meson which picks up the spectator quark by $M_1 = D^+$ and assign momentum $p'$ to it. The light meson is labeled $M_2 = \pi^-$ and assigned momentum $q = E n_+$, where $E$ is the pion energy in the B rest frame, and $n_\pm = (1,0,0,\pm 1)$ are four-vectors on the light-cone. At leading power, we neglect the mass of the light meson. The simplest diagrams that we can draw for a non-leptonic decay amplitude assign a quark and antiquark to each meson. We choose the quark and
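antiquark momenta in the pion as

$$
l_q = u q + l_\perp + \frac{\vec l_\perp^{\,2}}{4 u E}\, n_- , \qquad
l_{\bar q} = \bar u q - l_\perp + \frac{\vec l_\perp^{\,2}}{4 \bar u E}\, n_- . \tag{8}
$$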
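The kinematics of (8) can be checked with a few lines of code. The sketch below assumes the metric signature (+,-,-,-) and uses illustrative function names and input values; it verifies that each parton momentum is light-like and that the invariant mass squared of the pair equals $\vec l_\perp^{\,2}/(u \bar u)$, which is of order $\Lambda_{\rm QCD}^2$ for generic $u$.

```python
N_PLUS = (1.0, 0.0, 0.0, 1.0)    # n_+ light-cone vector
N_MINUS = (1.0, 0.0, 0.0, -1.0)  # n_- light-cone vector


def minkowski_dot(a, b):
    """Scalar product with metric (+,-,-,-) (assumed convention)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def pion_parton_momenta(u, E, lx, ly):
    """Quark and antiquark momenta of Eq. (8) for q = E * n_+."""
    ubar = 1.0 - u
    lperp = (0.0, lx, ly, 0.0)
    lperp_sq = lx * lx + ly * ly  # positive Euclidean square of l_perp
    lq = tuple(u * E * N_PLUS[i] + lperp[i]
               + lperp_sq / (4.0 * u * E) * N_MINUS[i] for i in range(4))
    lqbar = tuple(ubar * E * N_PLUS[i] - lperp[i]
                  + lperp_sq / (4.0 * ubar * E) * N_MINUS[i] for i in range(4))
    return lq, lqbar


def pair_invariant_mass_sq(u, E, lx, ly):
    """(l_q + l_qbar)^2, the off-shellness of the quark-antiquark pair."""
    lq, lqbar = pion_parton_momenta(u, E, lx, ly)
    total = tuple(a + b for a, b in zip(lq, lqbar))
    return minkowski_dot(total, total)


if __name__ == "__main__":
    # Illustrative inputs in GeV: E ~ m_b/2, transverse momentum ~ Lambda_QCD.
    u, E, lx, ly = 0.3, 2.3, 0.2, -0.1
    lq, lqbar = pion_parton_momenta(u, E, lx, ly)
    print(minkowski_dot(lq, lq), minkowski_dot(lqbar, lqbar))
    print(pair_invariant_mass_sq(u, E, lx, ly), (lx**2 + ly**2) / (u * (1 - u)))
```

The pair mass grows like $1/u$ or $1/\bar u$ near the endpoints, which is one reason the endpoint regions need the separate treatment given below.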
Note that $q \neq l_q + l_{\bar q}$, but the off-shellness $(l_q + l_{\bar q})^2$ is of the same order as the light meson mass, which we can neglect at leading power. A similar decomposition (with longitudinal momentum fraction $v$ and transverse momentum $l'_\perp$) is used for the charm meson.

To prove the factorization formula (3) for the case of heavy-light final states, one has to show that:

i) There is no leading (in powers of $\Lambda_{\rm QCD}/m_b$) contribution to the amplitude from the endpoint regions $u \sim \Lambda_{\rm QCD}/m_b$ and $\bar u \sim \Lambda_{\rm QCD}/m_b$.

ii) One can set $l_\perp = 0$ in the amplitude (more generally, expand the amplitude in powers of $l_\perp$) after collinear subtractions, which can be absorbed into the pion wave function. This, together with i), guarantees that the amplitude is legitimately expressed in terms of the light-cone distribution amplitudes of the pion.

iii) The leading contribution comes from $\bar v \sim \Lambda_{\rm QCD}/m_b$ (the region where the spectator quark enters the charm meson as a soft parton), which guarantees the absence of a hard spectator interaction term.

iv) After subtraction of infrared contributions corresponding to the light-cone distribution amplitude and the form factor, the leading contributions to the amplitude come only from internal lines with virtuality that scales with $m_b$.

v) Non-valence Fock states are non-leading.

The requirement that after subtractions virtualities should be large is clearly needed to guarantee the infrared finiteness of the hard-scattering functions $T^{I}_{ij}$.

Let us comment on setting transverse momenta in the wave functions to zero and on endpoint contributions. Neglecting transverse momenta requires that we count them as order $\Lambda_{\rm QCD}$ when comparing terms of different magnitude in the scattering amplitude. This conforms to our intuition and the assumption of the parton model, that intrinsic transverse momenta are limited to hadronic scales. However, in QCD transverse momenta are not limited, but logarithmically distributed up to the hard scale.
The important point is that contributions that violate the starting assumption of limited transverse momentum can be absorbed into the universal light-cone distribution amplitudes.
The statement that transverse momenta can be counted as of order $\Lambda_{\rm QCD}$ is to be understood after these subtractions have been performed.

The second comment concerns endpoint contributions in the convolution integrals over longitudinal momentum fractions. These contributions are dangerous, because we may be able to demonstrate the infrared safety of the hard-scattering amplitude under the assumption of generic $u$ and independent of the shape of the meson distribution amplitude, but for $u \to 0$ or $u \to 1$ a propagator that was assumed to be off-shell approaches the mass-shell. If such a contribution were of leading power, we would not expect the perturbative calculation of the hard-scattering functions to be reliable.

Estimating endpoint contributions requires knowledge of the endpoint behaviour of the light-cone distribution amplitude. Since it enters the factorization formula at a renormalization scale of order $m_b$, we can use the asymptotic form (6) to estimate the endpoint contribution. (More generally, we only have to assume that the distribution amplitude at a given scale has the same endpoint behaviour as the asymptotic amplitude. This is generally the case, unless there is a conspiracy of terms in the Gegenbauer expansion of the distribution amplitude. If such a conspiracy existed at some scale, it would be destroyed by evolving the distribution amplitude to a different scale.) We count a light-meson distribution amplitude as order $\Lambda_{\rm QCD}/m_b$ in the endpoint region (defined as the region where the quark or antiquark momentum is of order $\Lambda_{\rm QCD}$), and order 1 away from the endpoint, i.e. (for $X = P, V_\parallel$)

$$
\Phi_X(u) \sim
\begin{cases}
1 ; & \text{generic } u , \\
\Lambda_{\rm QCD}/m_b ; & u, \bar u \sim \Lambda_{\rm QCD}/m_b .
\end{cases} \tag{9}
$$
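The counting in (9) can be made concrete for the asymptotic form (6). In the sketch below, the values of $\Lambda_{\rm QCD}$ and $m_b$ are illustrative assumptions chosen only to set the size of the endpoint region, and the function name is hypothetical. The weight of the region $0 \le u \le \epsilon$ is $3\epsilon^2 - 2\epsilon^3$, so it scales like the square of the region's size.

```python
LAMBDA_QCD = 0.5  # GeV, illustrative assumption
M_B = 4.8         # GeV, illustrative assumption for the b-quark mass


def endpoint_weight(eps):
    """Exact integral of 6u(1-u) over the endpoint region 0 <= u <= eps."""
    return 3.0 * eps**2 - 2.0 * eps**3


if __name__ == "__main__":
    eps = LAMBDA_QCD / M_B  # size of the endpoint region
    weight = endpoint_weight(eps)
    # The ratio weight / eps^2 stays of order one as eps decreases, i.e. the
    # endpoint contribution is quadratically suppressed.
    print(eps, weight, weight / eps**2)
    print(endpoint_weight(eps / 2) / weight)  # close to 1/4
```

Halving the size of the endpoint region reduces its weight by roughly a factor of four: one power comes from the amplitude itself and one from the size of the integration region.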
Note that the endpoint region has a size of order $\Lambda_{\rm QCD}/m_b$, so that the endpoint suppression is $\sim (\Lambda_{\rm QCD}/m_b)^2$. This suppression has to be weighted against potential enhancements of the partonic amplitude when one of the propagators approaches the mass shell.

The counting for B mesons, or heavy mesons in general, is different. Naturally, the heavy quark carries almost all of the meson momentum, and hence we count

$$
\Phi_B(\xi) \sim
\begin{cases}
m_b/\Lambda_{\rm QCD} ; & \xi \sim \Lambda_{\rm QCD}/m_b , \\
0 ; & \xi \sim 1 .
\end{cases} \tag{10}
$$
The zero probability for a light spectator with momentum of order $m_b$ must be understood as a boundary condition for the wave function renormalized at a scale much below $m_b$. There is a small probability for hard fluctuations that transfer large momentum to the spectator. This "hard tail" is generated by
Figure 2: Leading contributions to the $B \to D$ form factor in the hard-scattering approach, panels (a) and (b). The dashed line represents the weak current. The two lines to the left belong to the B meson, the ones to the right to the recoiling charm meson.
evolution of the wave function from a hadronic scale to a scale of order $m_b$. If we assume that the initial distribution at the hadronic scale falls sufficiently rapidly for $\xi \gg \Lambda_{\rm QCD}/m_b$, this remains true after evolution. We shall assume a sufficiently fast fall-off, so that, for the purposes of power counting, the probability that the spectator-quark momentum is of order $m_b$ can be set to zero. The same counting applies to the D meson. (Despite the fact that the charm meson has momentum of order $m_b$, we do not need to distinguish the rest frames of B and D for the purpose of power counting, because the two frames are not connected by a parametrically large boost. In other words, the components of the spectator quark in the D meson are still of order $\Lambda_{\rm QCD}$.)

3.2 The $B \to D$ form factor
We now demonstrate that the $B \to D$ form factor receives a leading contribution from soft gluon exchange. This implies that a non-leptonic decay cannot be treated completely in the hard-scattering picture, and so the form factor should enter the factorization formula as a non-perturbative quantity.

Consider the diagrams shown in Fig. 2. When the exchanged gluon is hard the spectator quark in the final state has momentum of order $m_b$. But according to the counting rule (10) this configuration has no overlap with the D-meson wave function. On the other hand, there is no suppression for soft gluons in Fig. 2. It follows that the dominant behaviour of the $B \to D$ form factor in the heavy-quark limit is given by soft processes.

Because of this argument, we can exploit the heavy-quark symmetries to determine how the form factor scales in the heavy-quark limit. The well-known result is that the form factor scales like a constant (modulo logarithms), since it is equal to one at zero velocity transfer and independent of $m_b$ as long as the Lorentz boost that connects the B and D rest frames is of order 1. The same conclusion follows from the power-counting rules for light-cone wave
Figure 3: Leading-order contribution to the hard-scattering kernels $T^{I}_{ij}(u)$. The weak decay of the b quark through a four-fermion operator is represented by the black square.
functions. To see this, we represent the form factor by an overlap integral of wave functions (not integrated over transverse momentum),

$$
F_+^{B \to D}(0) \sim \int \frac{d\xi\, d^2 k_\perp}{16\pi^3}\, \Psi_B(\xi, k_\perp)\, \Psi_D(\xi'(\xi), k_\perp) , \tag{11}
$$
where $\xi'(\xi)$ is fixed by kinematics, and we have set $q^2 = 0$ for simplicity. The probability of finding the B meson in its valence Fock state is of order 1 in the heavy-quark limit, i.e.

$$
\int \frac{d\xi\, d^2 k_\perp}{16\pi^3}\, \left| \Psi_{B,D}(\xi, k_\perp) \right|^2 \sim 1 . \tag{12}
$$
Counting $k_\perp \sim \Lambda_{\rm QCD}$ and $d\xi \sim \Lambda_{\rm QCD}/m_b$, the integration measure in (12) is of order $\Lambda_{\rm QCD}^3/m_b$, and we deduce that $\Psi_B(\xi, k_\perp) \sim m_b^{1/2}/\Lambda_{\rm QCD}^{3/2}$. From (11), we then obtain the scaling law $F_+^{B \to D}(0) \sim 1$, in agreement with the prediction of heavy-quark symmetry.

The representation (11) of the form factor as an overlap of wave functions for the two-particle Fock state of the heavy meson is not rigorous, because there is no reason to assume that the contribution from higher Fock states with additional soft gluons is suppressed. The consistency with the estimate based on heavy-quark symmetry shows that these additional contributions are not larger than the two-particle contribution.

3.3 Non-leptonic decay amplitudes
We now turn to a qualitative discussion of the lowest-order and one-gluon exchange diagrams that could contribute to the hard-scattering kernels $T^{I}_{ij}(u)$ in (3). In the figures which follow, the two lines directed upwards represent $\pi^-$, the lines on the left represent $\bar B_d$, and the lines on the right represent $D^+$.

Lowest-order diagram

There is a single diagram with no hard gluon interactions shown in Fig. 3. According to (10) the spectator quark is soft, and since it does not undergo a hard
Figure 4: Diagrams at order $\alpha_s$ that need not be calculated.
interaction it is absorbed as a soft quark by the recoiling meson. This is evidently a contribution to the left-hand diagram of Fig. 1, involving the $B \to D$ form factor. The hard subprocess in Fig. 3 is just given by the insertion of a four-fermion operator, and hence it does not depend on the longitudinal momentum fraction $u$ of the two quarks that form the emitted $\pi^-$. Consequently, the lowest-order contribution to $T^{I}_{ij}(u)$ in (3) is independent of $u$, and the $u$-integral reduces to the normalization condition for the pion distribution amplitude. The result is, not surprisingly, that the factorization formula reproduces the result of naive factorization if we neglect gluon exchange. Note that the physical picture underlying this lowest-order process is that the spectator quark (which is part of the $B \to D$ form factor) is soft. If this is the case, the hard-scattering approach misses the leading contribution to the non-leptonic decay amplitude.

Putting together all factors relevant to power counting, we find that in the heavy-quark limit the decay amplitude for a decay into a heavy-light final state (in which the spectator quark is absorbed by the heavy meson) scales as

$$
\mathcal{A}(\bar B_d \to D^+ \pi^-) \sim G_F\, m_b^2\, F^{B \to D}(0)\, f_\pi \sim G_F\, m_b^2\, \Lambda_{\rm QCD} . \tag{13}
$$
Other contributions must be compared with this scaling rule.

Factorizable diagrams

In order to justify naive factorization as the leading term in an expansion in $\alpha_s$ and $\Lambda_{\rm QCD}/m_b$, we must show that radiative corrections are either suppressed in one of these two parameters, or already contained in the definition of the form factor and the pion decay constant. Consider the graphs shown in Fig. 4. The first three diagrams are part of the form factor and do not contribute to the hard-scattering kernels. Since the first and third diagrams contain leading contributions from the region in which the gluon is soft, they should not be considered as corrections to Fig. 3. However, this is of no consequence since these soft contributions are absorbed into the physical form factor. The fourth diagram in Fig. 4 is also factorizable. In general, this graph would split into a hard contribution and a contribution to the evolution of the
Figure 5: "Non-factorizable" vertex corrections.
pion distribution amplitude. However, as the leading-order diagram in Fig. 3 involves only the normalization integral of the pion distribution amplitude, the sum of the fourth diagram in Fig. 4 and the wave-function renormalization of the quarks in the emitted pion vanishes. In other words, these diagrams would renormalize the $(\bar u d)$ light-quark current, which however is conserved.
"Non-factorizable" vertex corrections
We now begin the analysis of "non-factorizable" diagrams, i.e. diagrams containing gluon exchanges that cannot be associated with the $B \to D$ form factor or the pion decay constant. At order $\alpha_s$, these diagrams can be divided into three groups: vertex corrections, hard spectator interactions, and annihilation diagrams. The vertex corrections shown in Fig. 5 violate the naive factorization ansatz (2). One of the key observations made in Refs. [1,2] is that these diagrams are calculable nonetheless. Let us summarize the argument here, postponing the explicit evaluation of these diagrams to Sect. 4. The statement is that the vertex-correction diagrams form an order-$\alpha_s$ contribution to the hard-scattering kernels $T^I_{ij}(u)$. To demonstrate this, we have to show that: i) The transverse momentum of the quarks that form the pion can be neglected at leading power, i.e. the two momenta in (8) can be approximated by $uq$ and $\bar u q$, respectively. This guarantees that only a convolution in the longitudinal momentum fraction $u$ appears in the factorization formula. ii) The contribution from the soft-gluon region and gluons collinear to the direction of the pion is power suppressed. In practice, this means that the sum of these diagrams cannot contain any infrared divergences at leading power in $\Lambda_{\rm QCD}/m_b$. Neither of the two conditions holds true for any of the four diagrams individually, as each of them separately contains collinear and infrared divergences.
As will be shown in detail later, the infrared divergences cancel when one sums over the gluon attachments to the two quarks comprising the emission pion ((a+b), (c+d) in Fig. 5). This cancellation is a technical manifestation of Bjorken's colour-transparency argument [10], stating that soft gluon interactions with the emitted colour-singlet $(\bar u d)$ pair are suppressed because they interact with the colour dipole moment of the compact light-quark pair. Collinear divergences cancel after summing over gluon attachments to the $b$ and $c$ quark lines ((a+c), (b+d) in Fig. 5). Thus the sum of the four diagrams (a)-(d) involves only hard gluon exchange at leading power. Because the hard gluons transfer large momentum to the quarks that form the emission pion, the hard-scattering factor now results in a non-trivial convolution with the pion distribution amplitude. "Non-factorizable" contributions are therefore non-universal, i.e. they depend on the quantum numbers of the final-state mesons. Note that the colour-transparency argument, and hence the cancellation of soft gluon effects, applies only if the $(\bar u d)$ pair is compact. This is not the case if the emitted pion is formed in a very asymmetric configuration, in which one of the quarks carries almost all of the pion momentum. Since the probability for forming a pion in such an endpoint configuration is of order $(\Lambda_{\rm QCD}/m_b)^2$, such configurations could become important only if the hard-scattering amplitude favoured the production of these asymmetric pairs, i.e. if $T^I_{ij}(u) \sim 1/u^2$ for $u \to 0$ (or $T^I_{ij}(u) \sim 1/\bar u^2$ for $u \to 1$). However, we will see that such strong endpoint singularities in the hard-scattering amplitude do not occur. To complete the argument, we have to show that all other types of contributions to the non-leptonic decay amplitudes are power suppressed in the heavy-quark limit. This includes interactions with the spectator quark, weak annihilation graphs, and contributions from higher Fock components of the meson wave functions. This will be done in Sect. 5. In summary, then, for hadronic B decays into a light emitted and a heavy recoiling meson the first factorization formula in (3) holds.
At order $\alpha_s$, the hard-scattering kernels $T^I_{ij}(u)$ are computed from the diagrams shown in Figs. 3 and 5. Naive factorization follows when one neglects all corrections of order $\Lambda_{\rm QCD}/m_b$ and $\alpha_s$. The factorization formula allows us to compute systematically corrections to higher order in $\alpha_s$, but still neglects power corrections.
3.4 Remarks on final-state interactions
Some of the loop diagrams entering the calculation of the hard-scattering kernels have imaginary parts, which contribute to the strong rescattering phases. It follows from our discussion that these imaginary parts are of order $\alpha_s$ or $\Lambda_{\rm QCD}/m_b$. This demonstrates that strong phases vanish in the heavy-quark limit (unless the real parts of the amplitudes are also suppressed). Since this statement goes against the folklore that prevails from the present understanding of this issue, and since the subject of final-state interactions (and of strong-interaction phases in particular) is of paramount importance for the interpretation of CP-violating observables, a few additional remarks are in order. Final-state interactions are usually discussed in terms of intermediate hadronic states. This is suggested by the unitarity relation (taking $B \to \pi\pi$ for definiteness)
$$\mathrm{Im}\, A_{B\to\pi\pi} \sim \sum_n A_{B\to n}\, A^*_{n\to\pi\pi}, \tag{14}$$
where $n$ runs over all hadronic intermediate states. We can also interpret the sum in (14) as extending over intermediate states of partons. The partonic interpretation is justified by the dominance of hard rescattering in the heavy-quark limit. In this limit, the number of physical intermediate states is arbitrarily large. We may then argue on the grounds of parton-hadron duality that their average is described well enough (up to $\Lambda_{\rm QCD}/m_b$ corrections, say) by a partonic calculation. This is the picture implied by (3). The hadronic language is in principle exact. However, the large number of intermediate states makes it intractable to observe systematic cancellations, which usually occur in an inclusive sum over hadronic intermediate states. A particular contribution to the right-hand side of (14) is elastic rescattering ($n = \pi\pi$). The energy dependence of the total elastic $\pi\pi$-scattering cross section is governed by soft pomeron behaviour. Hence the strong-interaction phase of the $B \to \pi\pi$ amplitude due to elastic rescattering alone increases slowly in the heavy-quark limit [15]. On general grounds, it is rather improbable that elastic rescattering gives an appropriate representation of the imaginary part of the decay amplitude in the heavy-quark limit. This expectation is also borne out in the framework of Regge behaviour, as discussed in Ref. [15], where the importance (in fact, dominance) of inelastic rescattering was emphasized. However, this discussion left open the possibility of soft rescattering phases that do not vanish in the heavy-quark limit, as well as the possibility of systematic cancellations, for which the Regge approach does not provide an appropriate theoretical framework. Eq. (3) implies that such systematic cancellations do occur in the sum over all intermediate states $n$. It is worth recalling that similar cancellations are not uncommon for hard processes.
Consider the example of $e^+e^- \to$ hadrons at large energy $q$. While the production of any hadronic final state occurs on a time scale of order $1/\Lambda_{\rm QCD}$ (and would lead to infrared divergences if we attempted to describe it using perturbation theory), the inclusive cross section given by the sum over all hadronic final states is described very well by a $(q\bar q)$ pair that lives over a short time scale of order $1/q$. In close analogy, while each particular hadronic intermediate state $n$ in (14) cannot be described partonically, the sum over all intermediate states is accurately represented by a $(q\bar q)$ fluctuation of small transverse size of order $1/m_b$. Because the $(q\bar q)$ pair is small, the physical picture of rescattering is very different from elastic $\pi\pi$ scattering. In perturbation theory, the pomeron is associated with two-gluon exchange. The analysis of two-loop contributions to the non-leptonic decay amplitude in Ref. [2] shows that the soft and collinear cancellations that guarantee the partonic interpretation of rescattering extend to two-gluon exchange. Hence, the soft final-state interactions are again subleading as required by the validity of (3). As far as the hard rescattering contributions are concerned, two-gluon exchange plus ladder graphs between a compact $(q\bar q)$ pair with energy of order $m_b$ and transverse size of order $1/m_b$ and the other pion does not lead to large logarithms, and hence there is no possibility to construct the (hard) pomeron. Note the difference with elastic vector-meson production through a virtual photon, which also involves a compact $(q\bar q)$ pair. However, in this case one considers $s \gg Q^2$, where $\sqrt{s}$ is the photon-proton center-of-mass energy and $Q$ the virtuality of the photon. This implies that the $(q\bar q)$ fluctuation is born long before it hits the proton. It is this difference of time scales, nonexistent in non-leptonic B decays, that permits pomeron exchange in elastic vector-meson production in $\gamma^* p$ collisions.
4 $B \to D\pi$: Factorization at one-loop order
We now present a more detailed treatment of the exclusive decays $\bar B_d \to D^{(*)+} L^-$, where $L$ is a light meson. We illustrate explicitly how factorization emerges at one-loop order and compute the hard-scattering kernels $T^I_{ij}(u)$ in the factorization formula (3). For each final state $f$, we express the decay amplitudes in terms of parameters $a_1(f)$ defined in analogy with similar parameters used in the literature on naive factorization.
4.1 Effective Hamiltonian and decay topologies
The effective Hamiltonian for $B \to D\pi$ is
$$\mathcal{H}_{\rm eff} = \frac{G_F}{\sqrt 2}\, V^*_{ud} V_{cb}\, \big( C_0 O_0 + C_8 O_8 \big). \tag{15}$$
We choose to write the two independent four-quark operators in the singlet-octet basis
$$O_0 = \bar c \gamma^\mu (1-\gamma_5) b\;\, \bar d \gamma_\mu (1-\gamma_5) u, \qquad O_8 = \bar c \gamma^\mu (1-\gamma_5) T^A b\;\, \bar d \gamma_\mu (1-\gamma_5) T^A u, \tag{16}$$
rather than in the more conventional basis of $O_1$ and $O_2$. The Wilson coefficients $C_0$ and $C_8$ describe the exchange of hard gluons with virtualities between the high-energy matching scale $M_W$ and a renormalization scale $\mu$ of order $m_b$. (These coefficients are related to the ones of the standard basis by $C_0 = C_1 + C_2/3$ and $C_8 = 2C_2$.) They are known at next-to-leading order in renormalization-group improved perturbation theory and are given by [5]
$$C_0 = \frac{N_c+1}{2N_c}\, C_+ + \frac{N_c-1}{2N_c}\, C_-, \qquad C_8 = C_+ - C_-, \tag{17}$$
where
$$C_\pm(\mu) = \left( 1 + \frac{\alpha_s(\mu)}{4\pi}\, B_\pm \right) \bar C_\pm(\mu), \qquad B_\pm = \pm \frac{N_c \mp 1}{2N_c}\, B, \tag{18}$$
and
$$\bar C_\pm(\mu) = \left[ \frac{\alpha_s(M_W)}{\alpha_s(\mu)} \right]^{d_\pm} \left[ 1 + \frac{\alpha_s(M_W) - \alpha_s(\mu)}{4\pi}\, S_\pm \right]. \tag{19}$$
For $N_c = 3$ and $f = 5$, we have $d_+ = 6/23$ and $d_- = -12/23$, as well as $S_+ = 6473/3174$ and $S_- = -9371/1587$. The scheme dependence of the Wilson coefficients at next-to-leading order is parameterized by the coefficient $B$ in (18). We note that $B_{\rm NDR} = 11$ in the naive dimensional regularization (NDR) scheme with anticommuting $\gamma_5$, and $B_{\rm HV} = 7$ in the 't Hooft-Veltman (HV) scheme. We will demonstrate below that the scale and scheme dependence of the Wilson coefficients is canceled by a corresponding scale and scheme dependence of the hadronic matrix elements of the operators $O_0$ and $O_8$. Before continuing with a discussion of these matrix elements, it is useful to consider the flavour structure for the various contributions to $B \to D\pi$ decays. The possible quark-level topologies are depicted in Fig. 6. In the terminology generally adopted for two-body non-leptonic decays, the decays $\bar B_d \to D^+\pi^-$, $\bar B_d \to D^0\pi^0$ and $B^- \to D^0\pi^-$ are referred to as class-I, class-II and class-III, respectively [16]. In $\bar B_d \to D^+\pi^-$ and $B^- \to D^0\pi^-$ decays the pion can be directly created from the weak current. We call this a class-I contribution, following the above terminology. In addition, in the case of $\bar B_d \to D^+\pi^-$ there is a contribution from weak annihilation, and a class-II amplitude contributes to $B^- \to D^0\pi^-$. The important point is that the spectator quark goes into the light meson in the case of the class-II amplitude. This amplitude is suppressed in the heavy-quark limit, as is the annihilation amplitude. It follows that the amplitude for $\bar B_d \to D^0\pi^0$, receiving only class-II and annihilation contributions, is subleading compared with $\bar B_d \to D^+\pi^-$ and $B^- \to D^0\pi^-$, which are dominated by the class-I topology.
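Returning to the Wilson coefficients of Eqs. (17)-(19), the following Python sketch evaluates $C_0$ and $C_8$ numerically and checks that the scheme-dependent part of $C_0$ is proportional to $\bar C_+ - \bar C_-$. The function name `wilson_coefficients` is hypothetical, the input values $\alpha_s(M_W) = 0.12$ and $\alpha_s(\mu) = 0.22$ are illustrative assumptions rather than values quoted in the text, and the form $B_\pm = \pm (N_c \mp 1) B/(2N_c)$ is our reading of Eq. (18).

```python
import math

def wilson_coefficients(as_mw, as_mu, B, Nc=3):
    """NLO coefficients in the singlet-octet basis, following Eqs. (17)-(19).

    as_mw, as_mu: alpha_s at the matching scale M_W and at the scale mu.
    B: scheme parameter (11 in NDR, 7 in HV).
    The constants d_pm and S_pm are hard-coded for Nc = 3 and f = 5.
    """
    d = {+1: 6.0 / 23.0, -1: -12.0 / 23.0}
    S = {+1: 6473.0 / 3174.0, -1: -9371.0 / 1587.0}
    cbar, c = {}, {}
    for sign in (+1, -1):
        # Eq. (19): scheme-independent part
        cbar[sign] = (as_mw / as_mu) ** d[sign] * (
            1.0 + (as_mw - as_mu) / (4.0 * math.pi) * S[sign])
        # Eq. (18): B_pm = +-(Nc -+ 1)/(2 Nc) * B (assumed form)
        b_pm = sign * (Nc - sign) / (2.0 * Nc) * B
        c[sign] = (1.0 + as_mu / (4.0 * math.pi) * b_pm) * cbar[sign]
    # Eq. (17): singlet and octet combinations
    c0 = (Nc + 1) / (2.0 * Nc) * c[+1] + (Nc - 1) / (2.0 * Nc) * c[-1]
    c8 = c[+1] - c[-1]
    return c0, c8, cbar[+1], cbar[-1]

# Illustrative inputs (assumptions): alpha_s(M_W) = 0.12, alpha_s(mu ~ m_b) = 0.22
for scheme, B in (("NDR", 11.0), ("HV", 7.0)):
    c0, c8, cbar_p, cbar_m = wilson_coefficients(0.12, 0.22, B)
    print(scheme, round(c0, 4), round(c8, 4))
```

In this sketch the scheme-dependent piece of `c0` equals $(\alpha_s/4\pi)(C_F/2N_c)\, B\, (\bar C_+ - \bar C_-)$ with $C_F = (N_c^2-1)/(2N_c)$, which is the structure needed for the cancellation against the matrix element of $O_8$.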
Figure 6: Basic quark-level topologies for $B \to D\pi$ decays ($q = u, d$): (a) class-I, (b) class-II, (c) weak annihilation. $\bar B_d \to D^+\pi^-$ receives contributions from (a) and (c), $\bar B_d \to D^0\pi^0$ from (b) and (c), and $B^- \to D^0\pi^-$ from (a) and (b). Only (a) contributes in the heavy-quark limit.
We shall use the one-loop analysis for $\bar B_d \to D^+\pi^-$ as a concrete example to illustrate explicitly the various steps involved in establishing the factorization formula. Most of the arguments given below are standard from the theory of hard exclusive processes involving light hadrons [8]. However, it is instructive to repeat these arguments in the context of B decays.
4.2 Soft and collinear cancellations at one-loop order
In order to demonstrate the property of factorization for the decay $\bar B_d \to D^+\pi^-$, we now analyze the "non-factorizable" one-gluon exchange contributions shown in Fig. 5 in some detail. We consider the leading, valence Fock state of the emitted pion. This is justified since higher Fock components only give power-suppressed contributions to the decay amplitude in the heavy-quark limit (as demonstrated later). For the purpose of our discussion, the valence Fock state of the pion can be written as
$$|\pi(q)\rangle = \int \frac{du}{\sqrt{u\bar u}}\, \frac{d^2 l_\perp}{16\pi^3}\, \frac{1}{\sqrt2} \Big( a^\dagger_\uparrow(l_q)\, b^\dagger_\downarrow(l_{\bar q}) - a^\dagger_\downarrow(l_q)\, b^\dagger_\uparrow(l_{\bar q}) \Big) |0\rangle\, \Psi(u, l_\perp), \tag{20}$$
where $a_s^\dagger$ ($b_s^\dagger$) denotes the creation operator for a quark (antiquark) in a state with spin $s = \uparrow$ or $s = \downarrow$, and we have suppressed colour indices. The wave function $\Psi(u, l_\perp)$ is defined as the amplitude for the pion to be composed of two on-shell quarks, characterized by longitudinal momentum fraction $u$ and transverse momentum $l_\perp$. The on-shell momenta of the quark and antiquark are chosen as in (8). For the purpose of power counting, $l_\perp \sim \Lambda_{\rm QCD} \ll E \sim m_b$. Note that the invariant mass of the valence state is $(l_q + l_{\bar q})^2 = \vec l_\perp^{\,2}/(u\bar u)$, which is of order $\Lambda^2_{\rm QCD}$ and hence negligible in the heavy-quark limit unless $u$ is in the vicinity of the endpoints $u = 0$ or 1. In this case, the invariant mass of the quark-antiquark pair becomes large, and the valence Fock state is no longer a valid representation of the pion. However, in the heavy-quark limit the dominant contributions to the decay amplitude come from configurations where both partons are hard ($u$ and $\bar u$ both of order 1), and so the two-particle Fock state yields a consistent description. We will provide an explicit consistency check of this important feature later on. As a next step, we write down the amplitude
$$\langle \pi(q) | \bar u(0)_\alpha\, d(y)_\beta | 0 \rangle = \int du\, \frac{d^2 l_\perp}{16\pi^3}\, \frac{1}{\sqrt{2N_c}}\, \Psi^*(u, l_\perp)\, (\gamma_5 \slashed{q})_{\alpha\beta}\, e^{i l_q \cdot y}, \tag{21}$$
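which appears as an ingredient of the $B \to D\pi$ matrix element. It is now straightforward to obtain the one-gluon exchange contribution to the $B \to D\pi$ matrix element of the operator $O_8$. For the sum of the four diagrams in Fig. 5, we find
$$\langle D^+\pi^- | O_8 | \bar B_d \rangle_{\text{1-gluon}} = i g_s^2\, \frac{C_F}{2} \int \frac{d^4k}{(2\pi)^4}\, \langle D^+ | \bar c A_1(k) b | \bar B_d \rangle\, \frac{1}{k^2} \int_0^1 du\, \frac{d^2 l_\perp}{16\pi^3}\, \frac{\Psi^*(u, l_\perp)}{\sqrt{2N_c}}\, \mathrm{tr}\big[ \gamma_5 \slashed{q}\, A_2(l_q, l_{\bar q}, k) \big], \tag{22}$$
where
$$A_1(k) = \frac{\gamma^\lambda (\slashed{p}_c - \slashed{k} + m_c) \Gamma}{2 p_c\cdot k - k^2} - \frac{\Gamma (\slashed{p}_b + \slashed{k} + m_b) \gamma^\lambda}{2 p_b \cdot k + k^2}, \qquad A_2(l_q, l_{\bar q}, k) = \frac{\Gamma (\slashed{l}_{\bar q} + \slashed{k}) \gamma_\lambda}{2 l_{\bar q}\cdot k + k^2} - \frac{\gamma_\lambda (\slashed{l}_q + \slashed{k}) \Gamma}{2 l_q \cdot k + k^2}. \tag{23}$$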
Here $\Gamma = \gamma^\mu(1-\gamma_5)$, and $p_b$, $p_c$ are the momenta of the $b$- and $c$-quark, respectively. There is no correction to the matrix element of $O_0$ at order $\alpha_s$, because in this case the $(\bar d u)$ pair is necessarily in a colour-octet configuration and cannot form a pion. In (22) the pion wave function $\Psi(u, l_\perp)$ appears separated from the $B \to D$ transition. This is merely a reflection of the fact that we have represented the pion state in the form shown in (20). It does not, by itself, imply factorization, since the right-hand side of (22) still involves non-trivial integrations over $l_\perp$ and the gluon momentum $k$, and long- and short-distance contributions are not yet disentangled. In order to prove factorization, we need to show that the integral over $k$ receives only subdominant contributions from the region of small $k^2$. This is equivalent to showing that the integral over $k$ does not contain infrared divergences at leading power in $\Lambda_{\rm QCD}/m_b$. To demonstrate infrared finiteness of the one-loop integral
$$J \equiv \int d^4k\, \frac{1}{k^2}\, A_1(k) \otimes A_2(l_q, l_{\bar q}, k) \tag{24}$$
at leading power, the heavy-quark limit and the corresponding large light-cone momentum of the pion are again essential. First note that when $k$ is of order $m_b$, $J \sim 1$ by dimensional analysis. Potential infrared divergences could arise when $k$ is soft or collinear to the pion momentum $q$. We need to show that the contributions from these regions are power suppressed. (Note that we do not need to show that $J$ is infrared finite. It is enough that logarithmic divergences have coefficients that are power suppressed.) We treat the soft region first. Here all components of $k$ become small simultaneously, which we describe by scaling $k \sim \lambda$. Counting powers of $\lambda$ ($d^4k \sim \lambda^4$, $1/k^2 \sim \lambda^{-2}$, $1/p\cdot k \sim \lambda^{-1}$) reveals that each of the four diagrams in Fig. 5, corresponding to the four terms in the product in (24), is logarithmically divergent. However, because $k$ is small the integrand can be simplified. For instance, the second term in $A_2$ can be approximated as
$$\frac{\gamma_\lambda (\slashed{l}_q + \slashed{k}) \Gamma}{2 l_q\cdot k + k^2} = \frac{\gamma_\lambda \big( u\slashed{q} + \slashed{l}_\perp + \frac{\vec l_\perp^{\,2}}{4uE}\, \slashed{n}_- + \slashed{k} \big) \Gamma}{2u\, q\cdot k + 2 l_\perp\cdot k + \frac{\vec l_\perp^{\,2}}{2uE}\, n_-\cdot k + k^2} \simeq \frac{q_\lambda}{q\cdot k}\, \Gamma, \tag{25}$$
where we used that $\slashed{q}$ to the extreme left or right of an expression gives zero due to the on-shell condition for the external quark lines. We get exactly the same expression but with an opposite sign from the other term in $A_2$, and hence the soft divergence cancels out. More precisely, we find that the integral is infrared finite in the soft region when $l_\perp$ is neglected. When $l_\perp$ is not neglected, there is a divergence from soft $k$ which is proportional to $\vec l_\perp^{\,2}/m_b^2 \sim \Lambda^2_{\rm QCD}/m_b^2$. In either case, the soft contribution to $J$ is of order $\Lambda_{\rm QCD}/m_b$ or smaller and hence suppressed relative to the hard contribution. This corresponds to the standard soft cancellation mechanism, which is a technical manifestation of colour transparency. Each of the four terms in (24) is also divergent when $k$ becomes collinear with the light-cone momentum $q$. This implies the scaling $k^+ \sim \lambda^0$, $k_\perp \sim \lambda$, and $k^- \sim \lambda^2$. Then $d^4k \sim dk^+ dk^- d^2k_\perp \sim \lambda^4$, and $q\cdot k = q^+ k^- \sim \lambda^2$, $k^2 = 2k^+k^- + k_\perp^2 \sim \lambda^2$. The divergence is again logarithmic, and it is thus sufficient to consider the leading behaviour in the collinear limit. Writing $k = \alpha q + \dots$ we can now simplify the second term of $A_2$ as
$$\frac{\gamma_\lambda (\slashed{l}_q + \slashed{k}) \Gamma}{2 l_q\cdot k + k^2} \simeq q_\lambda\, \frac{2(u+\alpha)\, \Gamma}{2 l_q\cdot k + k^2}. \tag{26}$$
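No simplification occurs in the denominator (in particular, $l_\perp$ cannot be neglected), but the important point is that the leading contribution is proportional to $q_\lambda$. Therefore, substituting $k = \alpha q$ into $A_1$ and using $q^2 = 0$, we obtain
$$q_\lambda A_1 \simeq \frac{\slashed{q} (\slashed{p}_c + m_c) \Gamma}{2\alpha\, p_c\cdot q} - \frac{\Gamma (\slashed{p}_b + m_b) \slashed{q}}{2\alpha\, p_b\cdot q} = 0, \tag{27}$$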
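employing the equations of motion for the heavy quarks. Hence the collinear divergence cancels by virtue of the standard Ward identity. This completes the proof of the absence of infrared divergences at leading power in the hard-scattering kernel for $\bar B_d \to D^+\pi^-$ to one-loop order. Similar cancellations are observed at higher orders. A complete proof of factorization at two-loop order can be found in Ref. [2]. Having established that the "non-factorizable" diagrams of Fig. 5 are dominated by hard gluon exchange (i.e. that the leading contribution to $J$ arises from $k$ of order $m_b$), we may now use the fact that $|l_\perp| \ll E$ to neglect $l_\perp$ in the hard amplitude, i.e. to set $l_q = uq$ and $l_{\bar q} = \bar u q$, so that $l_\perp$ appears only in the wave function. Defining
$$\int \frac{d^2 l_\perp}{16\pi^3}\, \frac{\Psi^*(u, l_\perp)}{\sqrt{2N_c}} \equiv \frac{i f_\pi}{4N_c}\, \Phi_\pi(u), \tag{28}$$
the matrix element of $O_8$ in (22) becomes
$$- g_s^2\, \frac{C_F}{8N_c} \int \frac{d^4k}{(2\pi)^4}\, \langle D^+|\bar c A_1(k) b|\bar B_d\rangle\, \frac{1}{k^2}\, f_\pi \int_0^1 du\, \Phi_\pi(u)\, \mathrm{tr}\big[\gamma_5 \slashed{q}\, A_2(uq, \bar u q, k)\big]. \tag{29}$$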
On the other hand, putting $y$ on the light-cone in (21) and comparing with (5), we see that the $l_\perp$-integrated wave function $\Phi_\pi(u)$ in (28) is precisely the light-cone distribution amplitude of the pion. This demonstrates the relevance of the light-cone wave function to the factorization formula. Note that the collinear approximation for the quark and antiquark momenta emerges automatically in the heavy-quark limit. After the $k$ integral is performed, the expression (29) can be cast into the form
$$\langle D^+\pi^-|O_8|\bar B_d\rangle_{\text{1-gluon}} \sim F^{B\to D}(0) \int_0^1 du\, T_8(u,z)\, \Phi_\pi(u), \tag{30}$$
where $z = m_c/m_b$, $T_8(u,z)$ is the hard-scattering kernel, and $F^{B\to D}(0)$ the form factor that parameterizes the $\langle D^+|\bar c[\ldots]b|\bar B_d\rangle$ matrix element. Because of the absence of soft and collinear infrared divergences in the gluon exchange between the $(\bar c b)$ and $(\bar d u)$ currents, the hard-scattering kernel $T_8$ is calculable in QCD perturbation theory.
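The kinematic statements used in this subsection can be checked numerically. The sketch below assumes the light-cone parametrization $l_q = uq + l_\perp + \frac{\vec l_\perp^{\,2}}{4uE}\, n_-$ and $l_{\bar q} = \bar u q - l_\perp + \frac{\vec l_\perp^{\,2}}{4\bar u E}\, n_-$, with $q = E\,(1,0,0,1)$ and $n_- = (1,0,0,-1)$, for the momenta referred to in (8); this form is an assumption, since (8) is not reproduced in this section. It verifies that both partons are on-shell and that the pair has invariant mass squared $\vec l_\perp^{\,2}/(u\bar u)$. The helper names and numerical inputs are hypothetical.

```python
def mdot(a, b):
    """Minkowski product with metric (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def parton_momenta(u, E, lx, ly):
    """On-shell quark and antiquark momenta for a pion with q = E*(1, 0, 0, 1)."""
    ubar = 1.0 - u
    l2 = lx * lx + ly * ly                      # transverse momentum squared
    kq, kqbar = l2 / (4.0 * u * E), l2 / (4.0 * ubar * E)
    lq = (u * E + kq, lx, ly, u * E - kq)       # u q + l_perp + kq * n_minus
    lqbar = (ubar * E + kqbar, -lx, -ly, ubar * E - kqbar)
    return lq, lqbar

# Illustrative values (assumptions): E ~ m_b/2 = 2.4 GeV, |l_perp| ~ 0.3 GeV
u, E = 0.3, 2.4
lq, lqbar = parton_momenta(u, E, 0.3, 0.1)
pair = tuple(a + b for a, b in zip(lq, lqbar))
pair_mass2 = mdot(pair, pair)
print(mdot(lq, lq), mdot(lqbar, lqbar), pair_mass2)
```

For these inputs the pair mass squared is far below $E^2$, which is the sense in which the valence state is "light" away from the endpoints $u \to 0, 1$.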
4.3 Matrix elements at next-to-leading order
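We now compute these hard-scattering kernels explicitly to order $\alpha_s$. The effective Hamiltonian (15) can be written as
$$\mathcal{H}_{\rm eff} = \frac{G_F}{\sqrt2}\, V^*_{ud}V_{cb} \left\{ \left[ \frac{N_c+1}{2N_c}\, \bar C_+(\mu) + \frac{N_c-1}{2N_c}\, \bar C_-(\mu) + \frac{\alpha_s(\mu)}{4\pi}\, \frac{C_F}{2N_c}\, B\, C_8(\mu) \right] O_0 + C_8(\mu)\, O_8 \right\}, \tag{31}$$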
where the scheme-dependent term in the coefficient of the operator $O_0$ has been written explicitly. Because the light-quark pair has to be in a colour singlet to produce the pion in the leading Fock state, only $O_0$ gives a contribution to zeroth order in $\alpha_s$. Similarly, to first order in $\alpha_s$ only $O_8$ can contribute. The result of evaluating the diagrams in Fig. 5 with an insertion of $O_8$ can be presented in a form that holds simultaneously for a heavy meson $H = D, D^*$ and a light meson $L = \pi, \rho$, using only that the $(\bar u d)$ pair is a colour singlet and that the external quarks can be taken on-shell. We obtain ($z = m_c/m_b$)
$$\langle H(p') L(q)|O_8|\bar B_d(p)\rangle = \frac{\alpha_s}{4\pi}\, \frac{C_F}{2N_c}\, i f_L \int_0^1 du\, \Phi_L(u) \left[ -\left( 6\ln\frac{\mu^2}{m_b^2} + B \right) \big( \langle J_V\rangle - \langle J_A\rangle \big) + F(u,z)\, \langle J_V\rangle - F(u,-z)\, \langle J_A\rangle \right], \tag{32}$$
where
$$\langle J_V\rangle = \langle H(p')|\bar c\, \slashed{q}\, b|\bar B_d(p)\rangle, \qquad \langle J_A\rangle = \langle H(p')|\bar c\, \slashed{q}\gamma_5\, b|\bar B_d(p)\rangle. \tag{33}$$
It is worth noting that even after computing the one-loop correction the $(\bar u d)$ pair retains its $V-A$ structure. This, together with (5), implies that the form of (32) is identical for pions and longitudinally polarized $\rho$ mesons. (The production of transversely polarized $\rho$ mesons is power suppressed in $\Lambda_{\rm QCD}/m_b$.) The function $F(u,z)$ appearing in (32) is given by
$$F(u,z) = \left( 3 + 2\ln\frac{u}{\bar u} \right) \ln z^2 - 7 + f(u,z) + f(\bar u, 1/z), \tag{34}$$
where
$$f(u,z) = -\frac{u(1-z^2)\,[3(1-u(1-z^2)) + z]}{[1-u(1-z^2)]^2}\, \ln[u(1-z^2)] - \frac{z}{1-u(1-z^2)} + 2\left[ \frac{\ln[u(1-z^2)]}{1-u(1-z^2)} - \ln^2[u(1-z^2)] - \mathrm{Li}_2[1-u(1-z^2)] - \{u \to \bar u\} \right], \tag{35}$$
and $\mathrm{Li}_2(x)$ is the dilogarithm. The contribution of $f(u,z)$ in (34) comes from the first two diagrams in Fig. 5 with the gluon coupling to the $b$ quark, whereas $f(\bar u, 1/z)$ arises from the last two diagrams with the gluon coupling to the charm quark. Note that the terms in the large square brackets in the definition of the function $f(u,z)$ vanish for a symmetric light-cone distribution amplitude. These terms can be dropped if the light final-state meson is a pion or a $\rho$ meson, but they are relevant, e.g., for the discussion of Cabibbo-suppressed decays such as $\bar B_d \to D^{(*)+}K^-$ and $\bar B_d \to D^{(*)+}K^{*-}$. The discontinuity of the amplitude, which is responsible for the occurrence of the strong rescattering phase, arises from $f(\bar u, 1/z)$ and can be obtained by recalling that $z^2$ is $z^2 - i\epsilon$ with $\epsilon > 0$ infinitesimal. We find
$$\frac{1}{\pi}\, \mathrm{Im}\, F(u,z) = -\frac{(1-u)(1-z^2)\,[3(1-u(1-z^2)) + z]}{[1-u(1-z^2)]^2} - 2\left[ \ln[1-u(1-z^2)] + 2\ln u + \frac{z^2}{1-u(1-z^2)} - \{u\to\bar u\} \right]. \tag{36}$$
As mentioned above, (32) is applicable to all decays of the type $\bar B_d \to D^{(*)+}L^-$, where $L$ is a light hadron such as a pion or a (longitudinally polarized) $\rho$ meson. Only the operator $J_V$ contributes to $\bar B_d \to D^+L^-$, and only $J_A$ contributes to $\bar B_d \to D^{*+}L^-$. Our result can therefore be written as
$$\langle D^+L^-|O_{0,8}|\bar B_d\rangle = \langle D^+|\bar c\gamma^\mu(1-\gamma_5)b|\bar B_d\rangle \cdot i f_L q_\mu \int_0^1 du\, T_{0,8}(u,z)\, \Phi_L(u), \tag{37}$$
where $L = \pi, \rho$, and the hard-scattering kernels are
$$T_0(u,z) = 1 + O(\alpha_s^2), \qquad T_8(u,z) = \frac{\alpha_s}{4\pi}\, \frac{C_F}{2N_c} \left[ -6\ln\frac{\mu^2}{m_b^2} - B + F(u,z) \right] + O(\alpha_s^2). \tag{38}$$
When the $D$ meson is replaced by a $D^*$ meson, the result is identical except that $F(u,z)$ must be replaced with $F(u,-z)$. Since no order-$\alpha_s$ corrections exist for $O_0$, the matrix element retains its leading-order factorized form
$$\langle D^+L^-|O_0|\bar B_d\rangle = i f_L q_\mu\, \langle D^+|\bar c\gamma^\mu(1-\gamma_5)b|\bar B_d\rangle \tag{39}$$
to this accuracy. From (35) it follows that $T_8(u,z)$ tends to a constant as $u$ approaches the endpoints ($u \to 0, 1$). (This is strictly true for the part of $T_8(u,z)$ that is symmetric in $u \leftrightarrow \bar u$; the asymmetric part diverges logarithmically as $u \to 0$, which however does not affect the power behaviour and the convergence properties in the endpoint region.) Therefore the contribution to (37) from the endpoint region is suppressed, both by phase space and by the endpoint suppression intrinsic to $\Phi_L(u)$. Consequently, the emitted light meson is indeed dominated by energetic constituents, as required for the self-consistency of the factorization formula. The final result for the class-I, non-leptonic $\bar B_d \to D^{(*)+}L^-$ decay amplitudes, in the heavy-quark limit and at next-to-leading order in $\alpha_s$, can be compactly expressed in terms of the matrix elements of a "transition operator"
$$\mathcal{T} = \frac{G_F}{\sqrt2}\, V^*_{ud}V_{cb}\, \big[ a_1(DL)\, Q_V - a_1(D^*L)\, Q_A \big], \tag{40}$$
where
$$Q_V = \bar c\gamma^\mu b \otimes \bar d\gamma_\mu(1-\gamma_5)u, \qquad Q_A = \bar c\gamma^\mu\gamma_5 b \otimes \bar d\gamma_\mu(1-\gamma_5)u, \tag{41}$$
and hadronic matrix elements of $Q_{V,A}$ are understood to be evaluated in factorized form, i.e.
$$\langle DL|j_1\otimes j_2|\bar B\rangle \equiv \langle D|j_1|\bar B\rangle\, \langle L|j_2|0\rangle. \tag{42}$$
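Eq. (40) defines the quantities $a_1(D^{(*)}L)$, which include the leading "non-factorizable" corrections, in a renormalization-scale and -scheme independent way. To leading power in $\Lambda_{\rm QCD}/m_b$ these quantities should not be interpreted as phenomenological parameters (as is usually done), because they are dominated by hard gluon exchange and thus calculable in QCD. At next-to-leading order we get
$$a_1(DL) = \frac{N_c+1}{2N_c}\, \bar C_+(\mu) + \frac{N_c-1}{2N_c}\, \bar C_-(\mu) + \frac{\alpha_s}{4\pi}\, \frac{C_F}{2N_c}\, C_8(\mu) \left[ -6\ln\frac{\mu^2}{m_b^2} + \int_0^1 du\, F(u,z)\, \Phi_L(u) \right],$$
$$a_1(D^*L) = \frac{N_c+1}{2N_c}\, \bar C_+(\mu) + \frac{N_c-1}{2N_c}\, \bar C_-(\mu) + \frac{\alpha_s}{4\pi}\, \frac{C_F}{2N_c}\, C_8(\mu) \left[ -6\ln\frac{\mu^2}{m_b^2} + \int_0^1 du\, F(u,-z)\, \Phi_L(u) \right]. \tag{43}$$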
We observe that the scheme-dependent terms parameterized by $B$ have canceled between the coefficient of $O_0$ in (31) and the matrix element of $O_8$ in (37). Likewise, the $\mu$ dependence of the terms in brackets in (43) cancels against the scale dependence of the coefficients $\bar C_\pm(\mu)$, ensuring a consistent result at next-to-leading order. The coefficients $a_1(DL)$ and $a_1(D^*L)$ are seen to be non-universal, i.e. they depend explicitly on the nature of the final-state mesons. This dependence enters via the light-cone distribution amplitude of the light emission meson and via the analytic form of the hard-scattering kernel ($F(u,z)$ vs. $F(u,-z)$). However, the non-universality enters only at next-to-leading order. Using the fact that violations of heavy-quark spin symmetry require hard gluon exchange, Politzer and Wise have computed the "non-factorizable" vertex corrections to the decay-rate ratio of the $D\pi$ and $D^*\pi$ final states many years ago [17]. In the context of our formalism, this calculation requires the symmetric part (with respect to $u \leftrightarrow \bar u$) of the difference $F(u,z) - F(u,-z)$. Explicitly,
$$\frac{\Gamma(\bar B_d \to D^+\pi^-)}{\Gamma(\bar B_d \to D^{*+}\pi^-)} = \left| \frac{\langle D^+|\bar c\, \slashed{q}(1-\gamma_5)\, b|\bar B_d\rangle}{\langle D^{*+}|\bar c\, \slashed{q}(1-\gamma_5)\, b|\bar B_d\rangle} \right|^2 \left| \frac{a_1(D\pi)}{a_1(D^*\pi)} \right|^2, \tag{44}$$
where for simplicity we neglect the light meson masses as well as the mass difference between $D$ and $D^*$ in the phase-space for the two decays. At next-to-leading order
$$\left| \frac{a_1(D\pi)}{a_1(D^*\pi)} \right|^2 = 1 + \frac{\alpha_s}{4\pi}\, \frac{C_F}{N_c}\, \frac{C_8}{C_0}\, \mathrm{Re} \int_0^1 du\, \big[ F(u,z) - F(u,-z) \big]\, \Phi_\pi(u). \tag{45}$$
Our result for the symmetric part of $F(u,z) - F(u,-z)$ coincides with that found in Ref. [17].
5 Power-suppressed contributions
Up to this point we have presented arguments in favour of factorization of non-leptonic B-decay amplitudes in the heavy-quark limit, and have explored in detail how the factorization formula works at one-loop order for the decays $\bar B_d \to D^{(*)+}L^-$. It is now time to show that other contributions not considered so far are indeed power suppressed. This is necessary to fully establish the factorization formula. Besides, it will also provide some numerical estimates of the corrections to the heavy-quark limit. We start by discussing interactions involving the spectator quark and weak annihilation contributions, before turning to the more delicate question of the importance of non-valence Fock states.
Figure 7: "Non-factorizable" spectator interactions.
5.1 Interactions with the spectator quark
Clearly, the diagrams shown in Fig. 7 cannot be associated with the form-factor term in the factorization formula (3). We will now show that for B decays into a heavy-light final state their contribution is power suppressed in the heavy-quark limit. (This suppression does not occur for decays into two light mesons, where hard spectator interactions contribute at leading power. In this case, they contribute to the kernels $T^{II}_i$ in the factorization formula (second term in Fig. 1).) In general, "non-factorizable" diagrams involving an interaction with the spectator quark would impede factorization if there existed a soft contribution at leading power. While such terms are present in each of the two diagrams separately, they cancel in the sum over the two gluon attachments to the $(\bar u d)$ pair by virtue of the same colour-transparency argument that was applied to the "non-factorizable" vertex corrections. Focusing again on decays into a heavy and a light meson, such as $\bar B_d \to D^+\pi^-$, we still need to show that the contribution remaining after the soft cancellation is power suppressed relative to the leading-order contribution (13). A straightforward calculation leads to the following (simplified) result for the sum of the two diagrams:
$$A(\bar B_d\to D^+\pi^-)_{\rm spec} \sim G_F\, f_\pi f_D f_B\, \alpha_s \int_0^1 \frac{d\xi}{\xi}\, \Phi_B(\xi) \int_0^1 \frac{d\eta}{\eta}\, \Phi_D(\eta) \int_0^1 \frac{du}{u}\, \Phi_\pi(u) \sim G_F\, \alpha_s\, m_b\, \Lambda^2_{\rm QCD}. \tag{46}$$
This is indeed power suppressed relative to (13). Note that the gluon virtuality is of order $\xi\eta\, m_b^2 \sim \Lambda^2_{\rm QCD}$ and so, strictly speaking, the calculation in terms of light-cone distribution amplitudes cannot be justified. Nevertheless, we use (46) to deduce the scaling behaviour of the soft contribution, as we did for the heavy-light form factor in Sect. 3.2.
Figure 8: Annihilation diagrams.
5.2 Annihilation topologies
Our next concern is the set of annihilation diagrams shown in Fig. 8, which also contribute to the decay $\bar B_d \to D^+\pi^-$. The hard part of these diagrams could, in principle, be absorbed into hard-scattering kernels of the type $T^{II}_i$. The soft part, if unsuppressed, would violate factorization. However, we will see that the hard part as well as the soft part are suppressed by at least one power of $\Lambda_{\rm QCD}/m_b$.
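The power counting behind this statement, which is spelled out in the argument around Eq. (47), can be tallied as exponents of $\Lambda_{\rm QCD}/m_b$. The sketch below is a bookkeeping aid only: the function name is hypothetical, the factor of $\alpha_s$ is ignored, and the input scalings (decay constants $\sim \Lambda^4_{\rm QCD}/m_b$ versus $m_b^2 \Lambda_{\rm QCD}$ in (13), kernel enhancement $(m_b/\Lambda_{\rm QCD})^4$ for a soft quark pair, endpoint measure $(\Lambda_{\rm QCD}/m_b)^2$) are the ones quoted in that argument.

```python
def relative_suppression(kernel_power, measure_power):
    """Power p such that annihilation / leading amplitude ~ (Lambda_QCD/m_b)**p.

    kernel_power: power of (Lambda_QCD/m_b) carried by T_ann (negative if enhanced).
    measure_power: power of (Lambda_QCD/m_b) from the integration region.
    """
    # f_pi f_D f_B ~ Lambda^4/m_b compared with m_b^2 Lambda in Eq. (13): three powers
    decay_constants = 3
    return decay_constants + kernel_power + measure_power

# Generic region: kernel of order one, no phase-space penalty
generic = relative_suppression(kernel_power=0, measure_power=0)
# Both quarks from the gluon soft: kernel ~ (m_b/Lambda)^4, du Phi_pi ~ (Lambda/m_b)^2
soft_pair = relative_suppression(kernel_power=-4, measure_power=2)
print(generic, soft_pair)  # prints "3 1"
```

Both exponents are positive, so in this counting the annihilation amplitude is suppressed by at least one power of $\Lambda_{\rm QCD}/m_b$ relative to (13).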
The argument goes as follows. We write the annihilation amplitude as
$$A(\bar B_d\to D^+\pi^-)_{\rm ann} \sim G_F\, f_\pi f_D f_B\, \alpha_s \int_0^1 d\xi\, d\eta\, du\, \Phi_B(\xi)\, \Phi_D(\eta)\, \Phi_\pi(u)\, T^{\rm ann}(\xi,\eta,u), \tag{47}$$
where the dimensionless function $T^{\rm ann}(\xi,\eta,u)$ is a product of propagators and vertices. The product of decay constants scales as $\Lambda^4_{\rm QCD}/m_b$. Since $d\xi\, \Phi_B(\xi)$ scales as 1 and so does $d\eta\, \Phi_D(\eta)$, while $du\, \Phi_\pi(u)$ is never larger than 1, the amplitude can only compete with the leading-order result (13) if $T^{\rm ann}(\xi,\eta,u)$ can be made of order $(m_b/\Lambda_{\rm QCD})^3$ or larger. Since $T^{\rm ann}(\xi,\eta,u)$ contains only two propagators, this can be achieved only if both quarks the gluon splits into are soft, in which case $T^{\rm ann}(\xi,\eta,u) \sim (m_b/\Lambda_{\rm QCD})^4$. But then $du\, \Phi_\pi(u) \sim (\Lambda_{\rm QCD}/m_b)^2$, so that this contribution is power suppressed.
5.3 Non-leading Fock states
Our discussion so far concentrated on contributions related to the quark-antiquark components of the meson wave functions. We now present qualitative arguments that justify this restriction to the valence-quark Fock components. Some of these arguments are standard [8,9]. We will argue that higher Fock states yield only subleading contributions in the heavy-quark limit.
Figure 9: Diagram that contributes to the hard-scattering kernel involving a quark-antiquark-gluon distribution amplitude of the B meson and the emitted light meson.
Additional hard partons
An example of a diagram that would contribute to a hard-scattering function involving quark-antiquark-gluon components of the emitted meson and the B meson is shown in Fig. 9. For light mesons, higher Fock components are related to higher-order terms in the collinear expansion, including the effects of intrinsic transverse momentum and off-shellness of the partons by gauge invariance. The assumption is that the additional partons are collinear and carry a finite fraction of the meson momentum in the heavy-quark limit. Under this assumption, it is easy to see that adding additional partons to the Fock state increases the number of off-shell propagators in a given diagram (compare Fig. 9 to Fig. 3). This implies power suppression in the heavy-quark expansion. Additional partons in the B-meson wave function are always soft, as is the spectator quark. Nevertheless, when these partons are connected to the hard-scattering amplitudes the virtuality of the additional propagators is still of order $m_b \Lambda_{\rm QCD}$, which is sufficient to guarantee power suppression.
Figure 10: The contribution of the $q\bar q g$ Fock state to the $\bar B_d \to D^+\pi^-$ decay amplitude.
Let us study in more detail how the power suppression arises for the simplest non-trivial example, where the pion is composed of a quark, an antiquark, and an additional gluon. The contribution of this three-particle Fock state to the $B \to D\pi$ decay amplitude is shown in Fig. 10. It is convenient to use the Fock-Schwinger gauge, which allows us to express the gluon field $A_\lambda$ in terms of the field-strength tensor $G_{\rho\lambda}$ via

$$A_\lambda(x) = \int_0^1 dv\, v\, x^\rho\, G_{\rho\lambda}(vx) . \qquad (48)$$
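Up to twist-4 level, there are three quark-antiquark-gluon matrix elements that could potentially contribute to the diagrams shown in Fig. 10. Due to the $V-A$ structure of the weak-interaction vertex, the only relevant three-particle light-cone wave function has twist 4 and is given by $^{18,19}$

$$\langle\pi(q)|\,\bar d(0)\gamma_\mu\gamma_5\, g_s G_{\alpha\beta}(vx)\, u(0)\,|0\rangle = f_\pi\,(q_\beta g_{\alpha\mu} - q_\alpha g_{\beta\mu}) \int\!\mathcal{D}u\; \phi_\perp(u_i)\, e^{i v u_3\, q\cdot x} + f_\pi\,\frac{q_\mu}{q\cdot x}\,(q_\alpha x_\beta - q_\beta x_\alpha) \int\!\mathcal{D}u\, \big(\phi_\perp(u_i) + \phi_\parallel(u_i)\big)\, e^{i v u_3\, q\cdot x} . \qquad (49)$$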
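Here $\int\!\mathcal{D}u \equiv \int_0^1 du_1\, du_2\, du_3\, \delta(1-u_1-u_2-u_3)$, with $u_1$, $u_2$ and $u_3$ the fractions of the pion momentum carried by the quark, antiquark and gluon, respectively. Evaluating the diagrams in Fig. 10, and neglecting the charm-quark mass for simplicity, we find

$$\langle D^+\pi^-|O_8|\bar B_d\rangle_{q\bar q g} = i f_\pi\, \langle D^+|\,\bar c\,\not\! q\,(1-\gamma_5)\, b\,|\bar B_d\rangle \int\!\mathcal{D}u\; \frac{2\phi_\parallel(u_i)}{u_3\, m_b^2} . \qquad (50)$$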
Since $\phi_\parallel \sim \Lambda_{\rm QCD}^2$, the suppression by two powers of $\Lambda_{\rm QCD}/m_b$ compared to the leading-order matrix element is obvious. Note that due to G-parity $\phi_\parallel$ is antisymmetric in $u_1 \leftrightarrow u_2$ for a pion, so that (50) vanishes in this case.

Additional soft partons. A more precarious situation may arise when the additional Fock components carry only a small fraction of the meson momentum, contrary to the assumption made above. It is usually argued $^{8,9}$ that these configurations are suppressed, because they occupy only a small fraction of the available phase space (since $\int du_i \sim \Lambda_{\rm QCD}/m_b$ when the parton that carries momentum fraction $u_i$ is soft). This argument does not apply when the process involves heavy mesons. Consider, for example, the diagram shown in Fig. 11 (a) for the decay $B \to D\pi$. Its contribution involves the overlap of the B-meson wave function involving additional soft gluons with the wave function of the D meson, also containing soft gluons. There is no reason to suppose that this overlap is suppressed relative to the soft overlap of the valence-quark wave functions. It represents (part
Figure 11: (a) Soft overlap contribution which is part of the $B \to D$ form factor. (b) Soft overlap with the pion which would violate factorization, if it were unsuppressed.
of) the overlap of the "soft cloud" around the b quark with (part of) the "soft cloud" around the c quark after the weak decay. The partonic decomposition of this cloud is unrestricted up to global quantum numbers. (In the case where the B meson decays into two light mesons, there is a form-factor suppression $\sim (\Lambda_{\rm QCD}/m_b)^{3/2}$ for the overlap of the valence-quark wave functions, but once this price is paid there is again no reason for further suppression of additional soft gluons in the overlap of the B-meson wave function and the wave function of the recoiling meson.)

The previous paragraph essentially repeated our earlier argument against the hard-scattering approach, and in favour of using the $B \to D$ form factor as an input to the factorization formula. However, given the presence of additional soft partons in the $B \to D$ transition, we must now argue that it is unlikely that the emitted pion drags with it one of these soft partons, for instance a soft gluon that goes into the pion wave function, as shown in Fig. 11 (b). Notice that if the $(q\bar q)$ pair is produced in a colour-octet state, at least one gluon (or a further $(q\bar q)$ pair) must be pulled into the emitted meson if the decay is to result in a two-body final state. What suppresses the process shown in Fig. 11 (b) relative to the one in Fig. 11 (a) even if the emitted $(q\bar q)$ pair is in a colour-octet state? It is once more colour transparency that saves us. The dominant configuration has both quarks carry a large fraction of the pion momentum, and only the gluon might be soft. In this situation we can apply a non-local "operator product expansion" to determine the coupling of the soft gluon to the small $(q\bar q)$ pair $^2$. The gluon endpoint behaviour of the $q\bar q g$ wave function is then determined by the sum of the two diagrams shown on the right-hand side in Fig. 12.
The leading term (for small gluon momentum) cancels in the sum of the two diagrams, because the meson (represented by the black bar) is a colour singlet. This cancellation, which is exactly the same cancellation needed to demonstrate that "non-factorizable" vertex corrections are dominated by hard gluons, provides one factor of $\Lambda_{\rm QCD}/m_b$ needed to show that Fig. 11 (b) is power suppressed relative to Fig. 11 (a).

Figure 12: Quark-antiquark-gluon distribution amplitude in the gluon endpoint region.

In summary, we have (qualitatively) covered all possibilities for non-valence contributions to the decay amplitude and find that they are all power suppressed in the heavy-quark limit.

6 Limitations of the factorization approach
The factorization formula (3) holds in the heavy-quark limit $m_b \to \infty$. Corrections to the asymptotic limit are power-suppressed in the ratio $\Lambda_{\rm QCD}/m_b$ and, generally speaking, do not assume a factorized form. Since $m_b$ is fixed to about 5 GeV in the real world, one may worry about the magnitude of power corrections to hadronic B-decay amplitudes. Naive dimensional analysis would suggest that these corrections should be of order 10% or so. We now discuss several reasons why some power corrections could turn out to be numerically larger than suggested by the parametric suppression factor $\Lambda_{\rm QCD}/m_b$. Most of these "dangerous" corrections occur in more complicated, rare hadronic B decays into two light mesons, but are absent in decays such as $B \to D\pi$.

6.1 Several small parameters
Large non-factorizable power corrections may arise if the leading-power, factorizable term is somehow suppressed. There are several possibilities for such a suppression, given a variety of small parameters that may enter into the non-leptonic decay amplitudes.

i) The hard, "non-factorizable" effects computed using the factorization formula occur at order $\alpha_s$. Some other interesting effects such as final-state interactions appear first at this order. For instance, strong-interaction phases due to hard interactions are of order $\alpha_s$, while soft rescattering phases are of order $\Lambda_{\rm QCD}/m_b$. Since for realistic B mesons $\alpha_s$ is not particularly large compared to $\Lambda_{\rm QCD}/m_b$, we should not expect that these phases can be calculated with great precision. In practice, however, it is probably more important to know that the strong-interaction phases are parametrically suppressed in the heavy-quark limit and thus should be small. (This does not apply if the real part of the decay amplitudes is suppressed for some reason; see below.)

ii) If the leading, lowest-order (in $\alpha_s$) contribution to the decay amplitude is colour suppressed, as occurs for the class-II decay $\bar B_d \to \pi^0\pi^0$, then perturbative and power corrections can be sizeable. In such a case even the hard strong-interaction phase of the amplitude can be large $^{1,2}$. But at the same time soft contributions could be potentially important, so that in some cases only an order-of-magnitude estimate of the amplitude may be possible.

iii) The effective Hamiltonian (1) contains many Wilson coefficients $C_i$ that are small relative to $C_1 \approx 1$. There are decays for which the entire leading-power contribution is suppressed by small Wilson coefficients, but some power-suppressed effects are not. An example of this type is $B^- \to K^-K^0$. This decay proceeds through a penguin operator $b \to d\bar s s$ at leading power. But the annihilation contribution, which is power suppressed, can occur through the current-current operator with large Wilson coefficient $C_1$. Our approach does not apply to such (presumably) annihilation-dominated decays, unless a systematic treatment of annihilation amplitudes can be found.

iv) Some amplitudes may be suppressed by a combination of small CKM matrix elements. For example, $B \to \pi K$ decays receive large penguin contributions despite their small Wilson coefficients, because the so-called tree amplitude is CKM suppressed. This is not a problem for factorization, since it applies to the penguin and the tree amplitudes. We are not aware of any case (for ordinary B mesons) in which a purely power-suppressed term is CKM enhanced and which would therefore dominate the decay amplitude. (But this situation could occur for $B^- \to D^0K^-$, where the QCD dynamics is similar if we consider the charm quark as a light quark.)

6.2 Power corrections enhanced by small quark masses
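There is another enhancement of power-suppressed effects for some decays into two light mesons, connected with the curious numerical fact that

$$2\mu_\pi \equiv \frac{2 m_\pi^2}{m_u + m_d} = -\frac{4\langle\bar q q\rangle}{f_\pi^2} \approx 3~{\rm GeV} \qquad (51)$$

is much larger than its naive scaling estimate $\Lambda_{\rm QCD}$. (Here $\langle\bar q q\rangle = \langle 0|\bar u u|0\rangle = \langle 0|\bar d d|0\rangle$ is the quark condensate.) Consider the contribution of the penguin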
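operator $O_6 = (\bar d_i b_j)_{V-A}(\bar u_j u_i)_{V+A}$ to the $\bar B_d \to \pi^+\pi^-$ decay amplitude. The leading-order graph of Fig. 3 results in the expression

$$\langle\pi^+\pi^-|(\bar d_i b_j)_{V-A}(\bar u_j u_i)_{V+A}|\bar B_d\rangle = i m_B^2\, F_+^{B\to\pi}(0)\, f_\pi \times \frac{2\mu_\pi}{m_b} , \qquad (52)$$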
which is formally a $\Lambda_{\rm QCD}/m_b$ power correction compared to the corresponding matrix element of a product of two left-handed currents, but numerically large due to (51). We would not have to worry about such terms if they could all be identified and the factorization formula (3) applied to them, since in this case higher-order perturbative corrections would not contain non-factorizing infrared logarithms. However, this is not the case. After including radiative corrections, the matrix element on the left-hand side of (52) is expressed as a non-trivial convolution with pion light-cone distribution amplitudes. The terms involving $\mu_\pi$ can be related to two-particle twist-3 (rather than leading twist-2) distribution amplitudes, conventionally called $\Phi_p(u)$ and $\Phi_\sigma(u)$. We find that the radiative corrections to the matrix element in (52) do indeed factorize. However, at the same order there appear twist-3 corrections to the hard spectator interaction shown in Fig. 7, and these contributions contain an endpoint divergence (related to the fact that the distribution amplitudes $\Phi_p(u)$ and $\Phi_\sigma(u)$ do not vanish at the endpoints). In other words, the twist-3 "corrections" to the hard spectator term in the second factorization formula in (3) relative to the "leading" twist-2 contributions are of the form $\alpha_s \times$ logarithmic divergence, which we interpret as being of order 1. The non-factorizing character of the "chirally-enhanced" power corrections can introduce a substantial uncertainty in some decay modes $^{12}$. As in the related situation for the pion form factor $^{20}$, one may argue that the endpoint divergence is suppressed by a Sudakov form factor. However, it is likely that when $m_b$ is not large enough to suppress these chirally-enhanced terms, then it is also not large enough to make Sudakov suppression effective.
We stress that the chirally-enhanced terms do not appear in decays into a heavy and a light meson such as $B \to D\pi$, because these decays have no penguin contribution and no contribution from the hard spectator interaction. Hence, the twist-3 light-cone distribution amplitudes responsible for chirally-enhanced power corrections do not enter in the evaluation of the decay amplitudes.

6.3 Non-leptonic decays when $M_2$ is not light

The analysis of non-leptonic decay amplitudes in Sect. 3.3 referred to decays where the emission particle $M_2$ is a light meson. We now briefly discuss the case where $M_2$ is heavy.
Suppose that $M_2$ is a D meson, whereas the meson that picks up the spectator quark can be heavy or light. Examples of this type are the decays $\bar B_d \to \pi^0 D^0$ and $\bar B_d \to D^+D^-$. It is intuitively clear that factorization must be problematic in these cases, because the heavy D meson has a large overlap with the $B\pi$ or $BD$ systems, which are dominated by soft processes. In more detail, we consider the coupling of a gluon to the two quarks that form the emitted D meson, i.e. the pairs of diagrams in Fig. 5 (a+b), (c+d) and Fig. 7. Denoting the gluon momentum by $k$, the quark momenta by $l_q$ and $l_{\bar q}$, and the D-meson momentum by $q$, we find that the gluon couples to the "current"

$$J_\lambda = \frac{\gamma_\lambda(\not l_q + \not k + m_c)\,\Gamma}{2 l_q\cdot k + k^2} - \frac{\Gamma\,(\not l_{\bar q} + \not k)\,\gamma_\lambda}{2 l_{\bar q}\cdot k + k^2} , \qquad (53)$$
where $\Gamma$ is part of the weak decay vertex. When $k$ is soft (all components of order $\Lambda_{\rm QCD}$), each of the two terms scales as $1/\Lambda_{\rm QCD}$. Taking into account the complete amplitude as done explicitly in Sect. 4.2, we can see that the decoupling of soft gluons requires that the two terms in (53) cancel, leaving a remainder of order $1/m_b$. This cancellation does indeed occur when $M_2$ is a light meson, since in this case $l_q$ and $l_{\bar q}$ are dominated by their longitudinal components. When $M_2$ is heavy, the momenta $l_q$ and $l_{\bar q}$ are asymmetric, with all components of the light antiquark momentum $l_{\bar q}$ of order $\Lambda_{\rm QCD}$ in the B- or D-meson rest frames, while the zero-component of $l_q$ is of order $m_b$. Hence the current can be approximated by

$$J_\lambda \approx \frac{\delta_{\lambda 0}\,\Gamma}{k_0} - \frac{\Gamma\,(\not l_{\bar q} + \not k)\,\gamma_\lambda}{2 l_{\bar q}\cdot k + k^2} \sim \frac{1}{\Lambda_{\rm QCD}} , \qquad (54)$$
and the soft cancellation does not occur. (The on-shell condition for the charm quark has been used to arrive at this equation.) It follows that the emitted D meson does not factorize from the rest of the process, and that a factorization formula analogous to (3) does not apply to decays such as $\bar B_d \to \pi^0 D^0$ and $\bar B_d \to D^+D^-$. An important implication of this statement is that one should also not expect naive factorization to work in these cases. In other words, we expect that "non-factorizable" corrections modify the factorized decay amplitudes by terms of order 1.

6.4 Difficulties with charm
There are decay modes, such as $B^- \to D^0\pi^-$, in which the spectator quark can go to either of the two final-state mesons. The factorization formula (3) applies to the contribution that arises when the spectator quark goes to the D meson, but not when the spectator quark goes to the pion. However, even in the latter case we may use naive factorization to estimate the power behaviour of the decay amplitude. Adapting (13) to the decay $B^- \to D^0\pi^-$, we find that the non-factorizing (class-II) amplitude is suppressed compared to the factorizing (class-I) amplitude by

$$\frac{A(B^- \to D^0\pi^-)_{\rm class\text{-}II}}{A(B^- \to D^0\pi^-)_{\rm class\text{-}I}} \sim \frac{F^{B\to\pi}(m_D^2)\, f_D}{F^{B\to D}(m_\pi^2)\, f_\pi} \sim \left(\frac{\Lambda_{\rm QCD}}{m_b}\right)^2 . \qquad (55)$$
Here we use that $F^{B\to\pi}(q^2) \sim (\Lambda_{\rm QCD}/m_b)^{3/2}$ even for $q^2 \sim m_b^2$, as long as $q^2_{\rm max} - q^2$ is also of order $m_b^2$. (It follows from our definition of heavy final-state mesons that these conditions are fulfilled.) As a consequence, strictly speaking factorization does hold for $B^- \to D^0\pi^-$ decays in the sense that the class-II contribution is power suppressed with respect to the class-I contribution. Unfortunately, the scaling behaviour for real B and D mesons is far from the estimate (55) valid in the heavy-quark limit. Based on the dominance of the class-I amplitude we would expect that

$$R = \frac{{\rm Br}(B^- \to D^0\pi^-)}{{\rm Br}(\bar B_d \to D^+\pi^-)} \approx 1 \qquad (56)$$
in the heavy-quark limit. This contradicts existing data, which yield $R = 1.89 \pm 0.35$, despite the additional colour suppression of the class-II amplitude. One reason for the failure of power counting lies in the departure of the decay constants and form factors from naive power counting. The following compares the power counting to the actual numbers (square brackets):

$$\frac{f_D}{f_\pi} \sim \left(\frac{\Lambda_{\rm QCD}}{m_c}\right)^{1/2} [\approx 1.5] , \qquad \frac{F_+^{B\to\pi}(m_D^2)}{F_+^{B\to D}(m_\pi^2)} \sim \left(\frac{\Lambda_{\rm QCD}}{m_b}\right)^{3/2} [\approx 0.5] . \qquad (57)$$
However, it is unclear whether the failure of power counting can be attributed to the form factors and decay constants alone. Note that for the purposes of power counting we treated the charm quark as heavy, taking the heavy-quark limit for fixed $m_c/m_b$. This simplified the discussion, since we did not have to introduce $m_c$ as a separate scale. However, in reality charm is somewhat intermediate between a heavy and a light quark, since $m_c$ is not particularly large compared to $\Lambda_{\rm QCD}$. In this context it is worth noting that the first hard-scattering kernel in (3) cannot have $\Lambda_{\rm QCD}/m_c$ corrections, since there is a smooth transition to the case of two light mesons. The situation is different with the hard spectator interaction term, which we argued to be power suppressed for decays into a D meson and a light meson. We shall come back to this in Sect. 7.5, where we estimate the magnitude of this term for the $D\pi$ final state, relaxing the assumption that the D meson is heavy.
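The gap between naive power counting and the bracketed numbers in (57) can be made concrete with a few lines of arithmetic. The sketch below is only illustrative: the values chosen for $\Lambda_{\rm QCD}$, $m_c$ and $m_b$ are assumptions (the text does not fix them), while the "actual" ratios 1.5 and 0.5 are the ones quoted in (57).

```python
# Illustrative inputs in GeV. These three values are assumptions made for
# this sketch; the text does not specify them.
LAMBDA_QCD = 0.5
M_C = 1.5
M_B = 5.0

# Naive scaling estimates of the two ratios compared in (57)
fd_over_fpi_scaling = (LAMBDA_QCD / M_C) ** 0.5
form_factor_ratio_scaling = (LAMBDA_QCD / M_B) ** 1.5
class2_over_class1_scaling = fd_over_fpi_scaling * form_factor_ratio_scaling

# Numbers quoted in square brackets in (57)
fd_over_fpi_actual = 1.5
form_factor_ratio_actual = 0.5
class2_over_class1_actual = fd_over_fpi_actual * form_factor_ratio_actual

print(f"scaling estimate of (55): {class2_over_class1_scaling:.3f}")
print(f"from bracketed numbers:   {class2_over_class1_actual:.2f}")
```

With these inputs the scaling estimate of the class-II to class-I ratio is at the percent level, whereas the product of the bracketed numbers is 0.75 (before colour factors), which illustrates how far real B and D mesons are from the asymptotic regime.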
7 Phenomenology of $B \to D^{(*)}L$ decays
The matrix elements we have computed in Sect. 4.3 provide the theoretical basis for a model-independent calculation of the class-I non-leptonic decay amplitudes for decays of the type $B \to D^{(*)}L$, where L is a light meson, to leading power in $\Lambda_{\rm QCD}/m_b$ and at next-to-leading order in renormalization-group improved perturbation theory. In this section we discuss phenomenological applications of this formalism and confront our numerical results with experiment. We also provide some numerical estimates of power-suppressed corrections to the factorization formula.

7.1 Non-leptonic decay amplitudes
The results for the class-I decay amplitudes for $B \to D^{(*)}L$ are obtained by evaluating the (factorized) hadronic matrix elements of the transition operator $T$ defined in (40). They are written in terms of products of CKM matrix elements, light-meson decay constants, $B \to D^{(*)}$ transition form factors, and the QCD parameters $a_1(D^{(*)}L)$. The decay constants can be determined experimentally using data on the weak leptonic decays $P^- \to l^-\bar\nu_l(\gamma)$, hadronic $\tau^- \to M^-\nu_\tau$ decays, and the electromagnetic decays $V^0 \to e^+e^-$. Following Ref. 16, we use $f_\pi = 131$ MeV, $f_K = 160$ MeV, $f_\rho = 210$ MeV, $f_{K^*} = 214$ MeV, and $f_{a_1} = 229$ MeV. (Here $a_1$ is the pseudovector meson with mass $m_{a_1} \simeq 1230$ MeV.) The non-leptonic $\bar B_d \to D^{(*)+}L^-$ decay amplitudes for $L = \pi, \rho$ can be expressed as

$$\begin{aligned}
A(\bar B_d \to D^+\pi^-) &= i\,\frac{G_F}{\sqrt2}\, V_{ud}^* V_{cb}\, a_1(D\pi)\, f_\pi\, F_0(m_\pi^2)\,(m_B^2 - m_D^2) , \\
A(\bar B_d \to D^{*+}\pi^-) &= -i\,\frac{G_F}{\sqrt2}\, V_{ud}^* V_{cb}\, a_1(D^*\pi)\, f_\pi\, A_0(m_\pi^2)\, 2m_{D^*}\, \epsilon^*\!\cdot p , \\
A(\bar B_d \to D^+\rho^-) &= -i\,\frac{G_F}{\sqrt2}\, V_{ud}^* V_{cb}\, a_1(D\rho)\, f_\rho\, F_+(m_\rho^2)\, 2m_\rho\, \eta^*\!\cdot p ,
\end{aligned} \qquad (58)$$
where $p$ ($p'$) is the momentum of the B (charm) meson, $\epsilon$ and $\eta$ are polarization vectors, and the form factors $F_0$, $F_+$ and $A_0$ are defined in the usual way $^{16}$. The decay mode $\bar B_d \to D^{*+}\rho^-$ has a richer structure than the decays with at least one pseudoscalar in the final state. The most general Lorentz-invariant decomposition of the corresponding decay amplitude can be written as

$$A(\bar B_d \to D^{*+}\rho^-) = i\,\frac{G_F}{\sqrt2}\, V_{ud}^* V_{cb}\, \epsilon^{*\mu}\eta^{*\nu} \left( S_1\, g_{\mu\nu} - S_2\, q_\mu p'_\nu + i S_3\, \epsilon_{\mu\nu\alpha\beta}\, p^\alpha p'^\beta \right) , \qquad (59)$$
where the quantities $S_i$ can be expressed in terms of semi-leptonic form factors. To leading power in $\Lambda_{\rm QCD}/m_b$, we obtain

$$S_1 = a_1(D^*\rho)\, m_\rho f_\rho\, (m_B + m_{D^*})\, A_1(m_\rho^2) , \qquad S_2 = a_1(D^*\rho)\, m_\rho f_\rho\, \frac{2A_2(m_\rho^2)}{m_B + m_{D^*}} . \qquad (60)$$
The contribution proportional to $S_3$ in (59) is associated with transversely polarized $\rho$ mesons and thus leads to power-suppressed effects, which we do not consider here. The various $B \to D^{(*)}$ form factors entering the expressions for the decay amplitudes can be determined by combining experimental data on semi-leptonic decays with theoretical relations derived using heavy-quark effective theory $^{3,16}$. Since we work to leading order in $\Lambda_{\rm QCD}/m_b$, it is consistent to set the light meson masses to zero and evaluate these form factors at $q^2 = 0$. In this case the kinematic relations

$$F_0(0) = F_+(0) , \qquad (m_B + m_{D^*})\, A_1(0) - (m_B - m_{D^*})\, A_2(0) = 2m_{D^*} A_0(0) \qquad (61)$$

allow us to express the two $\bar B_d \to D^+L^-$ rates in terms of $F_+(0)$, and the two $\bar B_d \to D^{*+}L^-$ rates in terms of $A_0(0)$. Heavy-quark symmetry implies that these two form factors are equal to within a few percent $^{14}$. Below we adopt the common value $F_+(0) = A_0(0) = 0.6$. All our predictions for decay rates will be proportional to the square of this number.
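As a rough numerical illustration of how the first amplitude in (58) turns into a branching ratio, the sketch below evaluates the two-body rate $\Gamma = |\vec q\,|\,|A|^2/(8\pi m_B^2)$ with a massless pion. The values of $|a_1| = 1.05$, $f_\pi$ and $F_0 \approx F_+(0) = 0.6$ are those used in the text; the Fermi constant, the CKM elements, the meson masses and the B lifetime are assumptions introduced here for illustration only, and the function name is hypothetical.

```python
import math

# Inputs used in the text
A1 = 1.05        # |a_1|
F_PI = 0.131     # pion decay constant in GeV
F0 = 0.6         # F_0(m_pi^2), set equal to F_+(0)

# Assumed inputs (not given in this section)
G_F = 1.16637e-5     # Fermi constant in GeV^-2
V_UD = 0.975
V_CB = 0.040
M_B = 5.279          # GeV
M_D = 1.869          # GeV
TAU_B = 1.56e-12     # B_d lifetime in seconds
HBAR = 6.582e-25     # GeV s

def branching_ratio_d_pi():
    """Br(B_d -> D+ pi-) from the first amplitude in (58), massless pion."""
    mass_diff_sq = M_B ** 2 - M_D ** 2
    amplitude = (G_F / math.sqrt(2)) * V_UD * V_CB * A1 * F_PI * F0 * mass_diff_sq
    momentum = mass_diff_sq / (2 * M_B)   # pion momentum in the B rest frame
    width = momentum * amplitude ** 2 / (8 * math.pi * M_B ** 2)
    return width * TAU_B / HBAR

br_d_pi = branching_ratio_d_pi()
print(f"Br(B_d -> D+ pi-) ~ {br_d_pi * 1e3:.2f} x 10^-3")
```

With these particular assumptions the result comes out close to $3.3 \times 10^{-3}$; since the rate is proportional to $|V_{cb}|^2 F_+(0)^2$, other choices for these inputs rescale it accordingly.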
7.2 Meson distribution amplitudes and predictions for $a_1$
Table 1: Numerical values for the integrals $\int_0^1 du\, F(u,z)\,\Phi_L(u)$ (upper portion) and $\int_0^1 du\, F(u,-z)\,\Phi_L(u)$ (lower portion) obtained including the first two Gegenbauer moments.

z      Leading term      Coefficient of $\alpha_1^L$    Coefficient of $\alpha_2^L$
0.25   -8.41 - 9.51i     5.92 - 12.19i                  -1.33 + 0.36i
0.30   -8.79 - 9.09i     5.78 - 12.71i                  -1.19 + 0.58i
0.35   -9.13 - 8.59i     5.60 - 13.21i                  -1.00 + 0.73i

0.25   -8.45 - 6.56i     6.72 - 10.73i                  -0.38 + 0.93i
0.30   -8.37 - 5.99i     6.83 - 11.49i                  -0.21 + 0.85i
0.35   -8.24 - 5.44i     6.81 - 12.29i                  -0.08 + 0.75i

Let us now discuss in more detail the ingredients required for the numerical analysis of the coefficients $a_1(D^{(*)}L)$. The Wilson coefficients $C_i$ in the effective weak Hamiltonian depend on the choice of the scale $\mu$, as well as on the value of the strong coupling $\alpha_s$, for which we take $\alpha_s(m_Z) = 0.118$ and two-loop evolution down to a scale $\mu \sim m_b$. To study the residual scale dependence of the results, which remains because the perturbation series are truncated at next-to-leading order, we vary $\mu$ between $m_b/2$ and $2m_b$. The hard-scattering kernels depend on the ratio of the heavy-quark masses, for which we take $z = m_c/m_b = 0.30 \pm 0.05$. Hadronic uncertainties enter the analysis also through the parameterizations used for the meson light-cone distribution amplitudes. It is convenient and conventional to expand the distribution amplitudes in Gegenbauer polynomials as

$$\Phi_L(u) = 6u(1-u) \left[ 1 + \sum_{n=1}^{\infty} \alpha_n^L(\mu)\, C_n^{(3/2)}(2u-1) \right] , \qquad (62)$$
where $C_1^{(3/2)}(x) = 3x$, $C_2^{(3/2)}(x) = \frac32(5x^2 - 1)$, etc. The Gegenbauer moments $\alpha_n^L(\mu)$ are multiplicatively renormalized. The scale dependence of these quantities would, however, enter the results for the coefficients only at order $\alpha_s^2$, which is beyond the accuracy of our calculation. We assume that the leading-twist distribution amplitudes are close to their asymptotic form and thus truncate the expansion at $n = 2$. However, it would be straightforward to account for higher-order terms if desired. For the asymptotic form of the distribution amplitude, $\Phi_L(u) = 6u(1-u)$, the integral in (43) yields

$$\int_0^1 du\, F(u,z)\,\Phi_L(u) = 3\ln z^2 - 7 + \left[ \frac{6z(1-2z)}{(1-z)^2(1+z)^3} \left( \frac{\pi^2}{6} - {\rm Li}_2(z^2) \right) - \frac{3(2 - 3z + 2z^2 + z^3)}{(1-z)(1+z)^2}\, \ln(1-z^2) + \frac{4 - 17z + 20z^2 + 5z^3}{2(1-z)(1+z)^2} + \{ z \to 1/z \} \right] \qquad (63)$$
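The closed form (63) can be checked against the "Leading term" column of Table 1. The following is a minimal sketch, not the original calculation: it assumes that $z^2$ carries a small negative imaginary part ($z^2 \to z^2 - i0$), so that $1/z^2$ is approached from above the real axis, which fixes the sign of the imaginary part. The dilogarithm is implemented by hand from its power series and the standard inversion formula; all function names are illustrative.

```python
import math

PI2_OVER_6 = math.pi ** 2 / 6

def li2_series(x):
    """Li_2(x) for 0 <= x < 1 from the power series sum_n x^n / n^2."""
    total, power = 0.0, 1.0
    for n in range(1, 2000):
        power *= x
        term = power / (n * n)
        total += term
        if term < 1e-17:
            break
    return total

def li2(x):
    """Li_2 at real x >= 0 (x != 1); for x > 1 evaluated at x + i0 (assumption)."""
    if x < 1:
        return complex(li2_series(x), 0.0)
    log_x = math.log(x)
    real = math.pi ** 2 / 3 - 0.5 * log_x ** 2 - li2_series(1 / x)
    return complex(real, math.pi * log_x)

def log_one_minus(x):
    """ln(1 - x) at real x != 1; for x > 1 evaluated at x + i0 (assumption)."""
    if x < 1:
        return complex(math.log(1 - x), 0.0)
    return complex(math.log(x - 1), -math.pi)

def bracket(z):
    """The explicit terms inside the square bracket of (63), without z -> 1/z."""
    x = z * z
    a = 6 * z * (1 - 2 * z) / ((1 - z) ** 2 * (1 + z) ** 3)
    b = 3 * (2 - 3 * z + 2 * z ** 2 + z ** 3) / ((1 - z) * (1 + z) ** 2)
    c = (4 - 17 * z + 20 * z ** 2 + 5 * z ** 3) / (2 * (1 - z) * (1 + z) ** 2)
    return a * (PI2_OVER_6 - li2(x)) - b * log_one_minus(x) + c

def leading_term(z):
    """Right-hand side of (63); valid in this sketch for 0 < |z| < 1."""
    return 3 * math.log(z * z) - 7 + bracket(z) + bracket(1 / z)

for z in (0.30, -0.30):
    print(z, leading_term(z))
```

Evaluating at $z = 0.30$ and $z = -0.30$ gives values that agree with the corresponding Table 1 entries, $-8.79 - 9.09i$ and $-8.37 - 5.99i$, to the quoted precision.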
The corresponding result with the function $F(u,-z)$ is obtained by replacing $z \to -z$. More generally, a numerical integration with a distribution amplitude expanded in Gegenbauer polynomials yields the results collected in Table 1. We observe that the first two Gegenbauer polynomials in the expansion of the light-cone distribution amplitudes give contributions of similar
Table 2: The QCD coefficients $a_1(D^{(*)}L)$ at next-to-leading order for three different values of the renormalization scale $\mu$. The leading-order values are shown for comparison.

                 $\mu = m_b/2$                        $\mu = m_b$                          $\mu = 2m_b$
$a_1(DL)$        1.074 + 0.037i                       1.055 + 0.020i                       1.038 + 0.011i
                 -(0.024 - 0.052i) $\alpha_1^L$       -(0.013 - 0.028i) $\alpha_1^L$       -(0.007 - 0.015i) $\alpha_1^L$
$a_1(D^*L)$      1.072 + 0.024i                       1.054 + 0.013i                       1.037 + 0.007i
                 -(0.028 - 0.047i) $\alpha_1^L$       -(0.015 - 0.025i) $\alpha_1^L$       -(0.008 - 0.014i) $\alpha_1^L$
$a_1^{\rm LO}$   1.049                                1.025                                1.011
magnitude, whereas the second moment gives rise to much smaller effects. This tendency persists in higher orders. For our numerical discussion it is a safe approximation to truncate the expansion after the first non-trivial moment. The dependence of the results on the value of the quark mass ratio $z = m_c/m_b$ is mild and can be neglected for all practical purposes. We also note that the difference of the convolutions with the kernels for a pseudoscalar D and vector D* meson are numerically very small. This observation is, however, specific to the case of $B \to D^{(*)}L$ decays and should not be generalized to other decays.

Next we evaluate the complete results for the parameters $a_1$ at next-to-leading order, and to leading power in $\Lambda_{\rm QCD}/m_b$. We set $z = m_c/m_b = 0.3$. Varying $z$ between 0.25 and 0.35 would change the results by less than 0.5%. The results are shown in Table 2. The contributions proportional to the second Gegenbauer moment $\alpha_2^L$ have coefficients of order 0.2% or less and can safely be neglected. The contributions associated with $\alpha_1^L$ are present only for the strange mesons $K$ and $K^*$, but not for $\pi$ and $\rho$. Moreover, the imaginary parts of the coefficients contribute to their modulus only at order $\alpha_s^2$, which is beyond the accuracy of our analysis. To summarize, we thus obtain

$$|a_1(DL)| = 1.055^{+0.019}_{-0.017} - (0.013^{+0.011}_{-0.006})\, \alpha_1^L , \qquad |a_1(D^*L)| = 1.054^{+0.018}_{-0.017} - (0.015^{+0.013}_{-0.007})\, \alpha_1^L , \qquad (64)$$
where the quoted errors reflect the perturbative uncertainty due to the scale ambiguity (and the negligible dependence on the value of the ratio of quark masses and higher Gegenbauer moments), but not the effects of power-suppressed corrections. These will be estimated later. It is evident that within theoretical uncertainties there is no significant difference between the two $a_1$ parameters, and there is only a very small sensitivity to the differences between strange and non-strange mesons (assuming that $|\alpha_1^L| < 1$). In our numerical
analysis below we thus take $|a_1| = 1.05$ for all decay modes.

7.3 Tests of factorization
The main lesson from the previous discussion is that corrections to naive factorization in the class-I decays $\bar B_d \to D^{(*)+}L^-$ are very small. The reason is that these effects are governed by a small Wilson coefficient and, moreover, are colour suppressed by a factor $1/N_c^2$. For these decays, the most important implications of the QCD factorization formula are to restore the renormalization-group invariance of the theoretical predictions, and to provide a theoretical justification for why naive factorization works so well. On the other hand, given the theoretical uncertainties arising, e.g., from unknown power-suppressed corrections, there is little hope to confront the extremely small predictions for non-universal (process-dependent) "non-factorizable" corrections with experimental data. Rather, what we may do is ask whether data supports the prediction of a quasi-universal parameter $|a_1| \simeq 1.05$ in these decays. If this is indeed the case, it would support the usefulness of the heavy-quark limit in analyzing non-leptonic decay amplitudes. If, on the other hand, we were to find large non-universal effects, this would point towards the existence of sizeable power corrections to our predictions. We will see that within present experimental errors the data are in good agreement with our prediction of a quasi-universal $a_1$ parameter. However, a reduction of the experimental uncertainties to the percent level would be very desirable for obtaining a more conclusive picture.

We start by considering ratios of non-leptonic decay rates that are related to each other by the replacement of a pseudoscalar meson by a vector meson. In the comparison of $B \to D\pi$ and $B \to D^*\pi$ decays one is sensitive to the difference of the values of the two $a_1$ parameters in (64) evaluated for $\alpha_1^L = 0$. This difference is at most a few times $10^{-3}$.
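The size of this difference, and the scale uncertainties quoted in (64), can be read off directly from the complex entries of Table 2. The short sketch below takes the moduli of the $\alpha_1^L = 0$ values at the three scales; the assignment of the table entries to $\mu = m_b/2$, $m_b$, $2m_b$ follows the reading of Table 2 adopted here and should be regarded as an assumption, as should the helper name.

```python
# Next-to-leading order values of a_1 at alpha_1^L = 0, as read from Table 2.
# The assignment of entries to the three scales is an assumption of this sketch.
A1_DL = {"mb/2": complex(1.074, 0.037), "mb": complex(1.055, 0.020), "2mb": complex(1.038, 0.011)}
A1_DSTAR_L = {"mb/2": complex(1.072, 0.024), "mb": complex(1.054, 0.013), "2mb": complex(1.037, 0.007)}

def modulus_with_scale_error(values):
    """Central modulus at mu = m_b and its shifts for mu = m_b/2 and mu = 2 m_b."""
    central = abs(values["mb"])
    upward = abs(values["mb/2"]) - central
    downward = central - abs(values["2mb"])
    return central, upward, downward

central_dl, up_dl, down_dl = modulus_with_scale_error(A1_DL)
central_ds, up_ds, down_ds = modulus_with_scale_error(A1_DSTAR_L)
differences = {scale: abs(A1_DL[scale]) - abs(A1_DSTAR_L[scale]) for scale in A1_DL}

print(f"|a1(DL)|  = {central_dl:.3f} +{up_dl:.3f} -{down_dl:.3f}")
print(f"|a1(D*L)| = {central_ds:.3f} +{up_ds:.3f} -{down_ds:.3f}")
print({scale: round(diff, 4) for scale, diff in differences.items()})
```

The moduli differ by one to two parts in a thousand at each scale, in line with the statement above, and the scale variation reproduces the leading numbers in (64).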
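Likewise, in the comparison of $B \to D\pi$ and $B \to D\rho$ decays one is sensitive to the difference in the light-cone distribution amplitudes of the pion and the $\rho$ meson, which start at the second Gegenbauer moment $\alpha_2^L$. These effects are suppressed even more strongly. From the explicit expressions for the decay amplitudes in (58) it follows that

$$\begin{aligned}
\frac{\Gamma(\bar B_d \to D^+\pi^-)}{\Gamma(\bar B_d \to D^{*+}\pi^-)} &= \frac{(m_B^2 - m_D^2)^2\, |\vec q\,|_{D\pi}}{4 m_B^2\, |\vec q\,|^3_{D^*\pi}} \left( \frac{F_0(m_\pi^2)}{A_0(m_\pi^2)} \right)^2 \left| \frac{a_1(D\pi)}{a_1(D^*\pi)} \right|^2 , \\
\frac{\Gamma(\bar B_d \to D^+\rho^-)}{\Gamma(\bar B_d \to D^+\pi^-)} &= \frac{4 m_B^2\, |\vec q\,|^3_{D\rho}}{(m_B^2 - m_D^2)^2\, |\vec q\,|_{D\pi}}\, \frac{f_\rho^2}{f_\pi^2} \left( \frac{F_+(m_\rho^2)}{F_0(m_\pi^2)} \right)^2 \left| \frac{a_1(D\rho)}{a_1(D\pi)} \right|^2 .
\end{aligned} \qquad (65)$$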
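Using the experimental values for the branching ratios reported by the CLEO Collaboration $^{21}$ we find (taking into account a correlation between some systematic errors in the second case)

$$\left| \frac{a_1(D\pi)}{a_1(D^*\pi)} \right| \frac{F_0(m_\pi^2)}{A_0(m_\pi^2)} = 1.00 \pm 0.11 , \qquad \left| \frac{a_1(D\rho)}{a_1(D\pi)} \right| \frac{F_+(m_\rho^2)}{F_0(m_\pi^2)} = 1.16 \pm 0.11 . \qquad (66)$$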
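The central values in (66) can be reproduced, approximately, by inverting (65) with measured branching ratios. The sketch below uses the CLEO central values as listed in Table 3 and the decay constants quoted in Sect. 7.1; the meson masses are assumed values that are not quoted in the text, and the function name is illustrative. Since both decays in each ratio have the same parent, ratios of branching fractions equal ratios of rates.

```python
import math

# Assumed meson masses in GeV (not quoted in the text)
M_B, M_D, M_DSTAR, M_PI, M_RHO = 5.279, 1.869, 2.010, 0.140, 0.770
# Decay constants from Sect. 7.1 (GeV)
F_PI, F_RHO = 0.131, 0.210
# CLEO central values for the branching ratios, in units of 1e-3 (Table 3)
BR_D_PI, BR_DSTAR_PI, BR_D_RHO = 2.50, 2.34, 7.89

def decay_momentum(m_parent, m1, m2):
    """Magnitude of the daughter three-momentum in the parent rest frame."""
    s = m_parent ** 2
    return math.sqrt((s - (m1 + m2) ** 2) * (s - (m1 - m2) ** 2)) / (2 * m_parent)

q_d_pi = decay_momentum(M_B, M_D, M_PI)
q_dstar_pi = decay_momentum(M_B, M_DSTAR, M_PI)
q_d_rho = decay_momentum(M_B, M_D, M_RHO)

mass_term = (M_B ** 2 - M_D ** 2) ** 2

# First line of (65), solved for |a1(D pi)/a1(D* pi)| F_0/A_0
kinematic_1 = mass_term * q_d_pi / (4 * M_B ** 2 * q_dstar_pi ** 3)
ratio_pi = math.sqrt((BR_D_PI / BR_DSTAR_PI) / kinematic_1)

# Second line of (65), solved for |a1(D rho)/a1(D pi)| F_+/F_0
kinematic_2 = 4 * M_B ** 2 * q_d_rho ** 3 / (mass_term * q_d_pi)
ratio_rho = math.sqrt((BR_D_RHO / BR_D_PI) / (kinematic_2 * F_RHO ** 2 / F_PI ** 2))

print(f"{ratio_pi:.2f}  {ratio_rho:.2f}")
```

The experimental errors and their correlations are not propagated here, so only the central values should be compared with (66).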
Within errors, there is no evidence for any deviations from naive factorization. Our next-to-leading order results for the quantities $a_1(D^{(*)}L)$ allow us to make theoretical predictions which are not restricted to ratios of hadronic decay rates. A particularly clean test of these predictions, which is essentially free of hadronic uncertainties, is obtained by relating the $\bar B_d \to D^{(*)+}L^-$ decay rates to the differential semi-leptonic $\bar B_d \to D^{(*)+} l^-\bar\nu$ decay rate evaluated at $q^2 = m_L^2$. In this way the parameters $|a_1|$ can be measured directly $^{10}$. One obtains

$$R_L^{(*)} = \frac{\Gamma(\bar B_d \to D^{(*)+}L^-)}{d\Gamma(\bar B_d \to D^{(*)+} l^-\bar\nu)/dq^2\, \big|_{q^2 = m_L^2}} = 6\pi^2\, |V_{ud}|^2 f_L^2\, |a_1(D^{(*)}L)|^2\, X_L^{(*)} , \qquad (67)$$

where $X_\rho = X_\rho^* = 1$ for a vector meson (because the production of the lepton pair via a $V-A$ current in semi-leptonic decays is kinematically equivalent to that of a vector meson with momentum $q$), whereas $X_\pi$ and $X_\pi^*$ deviate from 1 only by (calculable) terms of order $m_\pi^2/m_B^2$, which numerically are below the 1% level $^{16}$. We emphasize that with our results for $a_1$ given in (43) the above relation becomes a prediction based on first principles of QCD. This is to be contrasted with the usual interpretation of this formula, where $a_1$ plays the role of a phenomenological parameter that is fitted from data.

The most accurate tests of factorization employ the class-I processes $\bar B_d \to D^{*+}L^-$, because the differential semi-leptonic decay rate in $B \to D^*$ transitions has been measured as a function of $q^2$ with good accuracy. The results of such an analysis, performed using CLEO data, have been reported in Ref. 23. One finds

$$\begin{aligned}
R_\pi^* &= (1.13 \pm 0.15)~{\rm GeV}^2 &&\Rightarrow\quad |a_1(D^*\pi)| = 1.08 \pm 0.07 , \\
R_\rho^* &= (2.94 \pm 0.54)~{\rm GeV}^2 &&\Rightarrow\quad |a_1(D^*\rho)| = 1.09 \pm 0.10 , \\
R_{a_1}^* &= (3.45 \pm 0.69)~{\rm GeV}^2 &&\Rightarrow\quad |a_1(D^*a_1)| = 1.08 \pm 0.11 .
\end{aligned} \qquad (68)$$
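The step from the measured ratios to $|a_1|$ in (68) is a direct inversion of (67). A minimal sketch is given below, using the decay constants of Sect. 7.1 and setting $X_L^{(*)} = 1$; the value $|V_{ud}| = 0.975$ is an assumption made here, since the text does not quote it, and the function name is illustrative.

```python
import math

V_UD = 0.975   # assumed value of |V_ud| (not quoted in the text)

# Decay constants from Sect. 7.1 (GeV) and measured ratios from (68) (GeV^2)
DECAY_CONSTANTS = {"pi": 0.131, "rho": 0.210, "a1": 0.229}
R_STAR = {"pi": 1.13, "rho": 2.94, "a1": 3.45}

def a1_from_ratio(ratio, decay_constant, x_factor=1.0):
    """Solve (67) for |a_1| given R, f_L and the kinematic factor X (set to 1)."""
    return math.sqrt(ratio / (6 * math.pi ** 2 * V_UD ** 2 * decay_constant ** 2 * x_factor))

a1_values = {meson: a1_from_ratio(R_STAR[meson], DECAY_CONSTANTS[meson]) for meson in R_STAR}
for meson, value in a1_values.items():
    print(f"|a1(D* {meson})| ~ {value:.2f}")
```

The experimental uncertainties on $R_L^*$ translate into roughly half their relative size on $|a_1|$, since $|a_1| \propto \sqrt{R_L^*}$.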
This is consistent with our theoretical result in (43). In particular, the data show no evidence for large power corrections to our predictions obtained at
Table 3: Model-independent predictions for the branching ratios (in units of $10^{-3}$) of class-I non-leptonic $\bar B_d \to D^{(*)+}L^-$ decays in the heavy-quark limit. All predictions are in units of $(|a_1|/1.05)^2$. The last two columns show the experimental results reported by the CLEO Collaboration $^{21}$ and by the Particle Data Group $^{24}$.

Decay mode                        Theory (HQL)                 CLEO data        PDG98
$\bar B_d \to D^+\pi^-$           3.27                         2.50 ± 0.40      3.0 ± 0.4
$\bar B_d \to D^+K^-$             0.25
$\bar B_d \to D^+\rho^-$          7.64                         7.89 ± 1.39      7.9 ± 1.4
$\bar B_d \to D^+K^{*-}$          0.39
$\bar B_d \to D^+a_1^-$           7.76                         8.34 ± 1.66      6.0 ± 3.3
                                  $\times [F_+(0)/0.6]^2$
$\bar B_d \to D^{*+}\pi^-$        3.05                         2.34 ± 0.32      2.8 ± 0.2
$\bar B_d \to D^{*+}K^-$          0.22
$\bar B_d \to D^{*+}\rho^-$       7.59                         7.34 ± 1.00      6.7 ± 3.3
$\bar B_d \to D^{*+}K^{*-}$       0.40
$\bar B_d \to D^{*+}a_1^-$        8.53                         11.57 ± 2.02     13.0 ± 2.7
                                  $\times [A_0(0)/0.6]^2$
leading order in $\Lambda_{\rm QCD}/m_b$. However, a further improvement in the experimental accuracy would be desirable in order to become sensitive to process-dependent, non-factorizable effects.

7.4 Predictions for class-I decay amplitudes
We now consider a larger set of class-I decays of the form $\bar B_d \to D^{(*)+}L^-$, all of which are governed by the transition operator (40). In Table 3 we compare the QCD factorization predictions with experimental data. As previously we work in the heavy-quark limit, i.e. our predictions are model independent up to corrections suppressed by at least one power of $\Lambda_{\rm QCD}/m_b$. The results show good agreement with experiment within errors, which are still rather large. (Note that we have not attempted to adjust the semi-leptonic form factors $F_+(0)$ and $A_0(0)$ so as to obtain a best fit to the data.)

We take the observation that the experimental data on class-I decays into heavy-light final states show good agreement with our predictions obtained in the heavy-quark limit as evidence that in these decays there are no unexpectedly large power corrections. We will now address the important question of the size of power corrections theoretically. To this end we provide rough estimates of two sources of power-suppressed effects: weak annihilation and spectator interactions. We stress that, at present, a complete account of power corrections to the heavy-quark limit cannot be performed in a systematic way, since these effects are not dominated by hard gluon exchange. In other words, factorization breaks down beyond leading power, and there are other sources of power corrections, such as contributions from higher Fock states, which we will not address here. We believe that the estimates presented below are nevertheless instructive.

To obtain an estimate of power corrections we adopt the following heuristic procedure. We treat the charm quark as light compared to the large scale provided by the mass of the decaying b quark ($m_c \ll m_b$, and $m_c$ fixed as $m_b \to \infty$) and use a light-cone projection similar to that of the pion also for the D meson. In addition, we assume that $m_c$ is still large compared to $\Lambda_{\rm QCD}$. We implement this by using a highly asymmetric D-meson wave function, which is strongly peaked at a light-quark momentum fraction of order $\Lambda_{\rm QCD}/m_D$. This guarantees correct power counting for the heavy-light final states we are interested in.

As discussed in Sect. 5.2, there are four annihilation diagrams with a single gluon exchange (see Fig. 8 (a)-(d)). The first two diagrams are "factorizable" and their contributions vanish because of current conservation in the limit $m_c \to 0$. For non-zero $m_c$ they therefore carry an additional suppression factor $m_D^2/m_B^2 \approx 0.1$. Moreover, their contributions to the decay amplitude are suppressed by small Wilson coefficients. Diagrams (a) and (b) can therefore safely be neglected. From the non-factorizable diagrams (c) and (d) in Fig. 8, the one with the gluon attached to the b quark turns out to be strongly suppressed numerically, giving a contribution of less than 1% of the leading class-I amplitude. We are thus left with diagram (d), in which the gluon couples to the light quark in the B meson. This mechanism gives the dominant annihilation contribution.
(Note that by deforming the light spectator-quark line one can redraw this diagram in such a way that it can be interpreted as a final-state rescattering process.) Adopting a common notation, we parameterize the annihilation contribution to the $\bar B_d \to D^+\pi^-$ decay amplitude in terms of a (power-suppressed) amplitude $A$ such that $A(\bar B_d \to D^+\pi^-) = T + A$, where $T$ is the "tree topology", which contains the dominant factorizable contribution. A straightforward calculation using the approximations discussed above shows that the contribution of diagram (d) is (to leading order) independent of the momentum fraction $\xi$ of the light quark inside the B meson:

$$A \sim f_\pi f_D f_B \int du\, \frac{\Phi_\pi(u)}{u} \int dv\, \frac{\Phi_D(v)}{\bar v^2} \simeq 3 f_\pi f_D f_B \int dv\, \frac{\Phi_D(v)}{\bar v^2}, \qquad \bar v \equiv 1 - v. \qquad (69)$$

The B-meson wave function simply integrates to $f_B$, and the integral over the pion distribution amplitude can be performed using the asymptotic form of the wave function. We take $\Phi_D(v)$ in the form of (62) with the coefficients $a_1^D = 0.8$ and $a_2^D = 0.4$ ($a_i^D = 0$, $i > 2$). With this ansatz $\Phi_D(v)$ is strongly peaked at $\bar v \sim \Lambda_{\rm QCD}/m_D$. The integral over $\Phi_D(v)$ in (69) is divergent at $v = 1$, and we regulate it by introducing a cut-off such that $v \le 1 - \Lambda/m_B$ with $\Lambda \approx 0.3$ GeV. Then $\int dv\, \Phi_D(v)/\bar v^2 \approx 34$. Evidently, the proper value of $\Lambda$ is largely unknown, and our estimate will be correspondingly uncertain. Nevertheless, this exercise will give us an idea of the magnitude of the effect. For the ratio of the annihilation amplitude to the leading, factorizable contribution we obtain

$$\frac{A}{T} \sim \frac{2\pi\alpha_s}{3}\, \frac{C_+ + C_-}{2C_+ + C_-}\, \frac{f_D f_B}{F_0(0)\, m_B^2} \int dv\, \frac{\Phi_D(v)}{\bar v^2} \approx 0.04. \qquad (70)$$
We have evaluated the Wilson coefficients at $\mu = m_b$ and used $f_D = 0.2$ GeV, $f_B = 0.18$ GeV, $F_0(0) = 0.6$, and $\alpha_s = 0.4$. This value of the strong coupling constant reflects that the typical virtuality of the gluon propagator in the annihilation graph is of order $\Lambda_{\rm QCD}\, m_B$. We conclude that the annihilation contribution is a correction of a few percent, which is what one would expect for a generic power correction to the heavy-quark limit. Taking into account that $f_B \sim \Lambda_{\rm QCD} (\Lambda_{\rm QCD}/m_B)^{1/2}$, $F_0(0) \sim (\Lambda_{\rm QCD}/m_B)^{3/2}$ and $f_D \sim \Lambda_{\rm QCD}$, we observe that in the heavy-quark limit the ratio $A/T$ indeed scales as $\Lambda_{\rm QCD}/m_b$, exhibiting the expected linear power suppression. (Recall that we consider the D meson as a light meson for this heuristic analysis of power corrections.)

Using the same approach, we may also derive a numerical estimate for the non-factorizable spectator interaction in $\bar B_d \to D^+\pi^-$ decays, discussed in Sect. 5.1. We find

$$\frac{T_{\rm spec}}{T_{\rm lead}} \sim \frac{2\pi\alpha_s}{3}\, \frac{C_+ - C_-}{2C_+ + C_-}\, \frac{f_D f_B}{F_0(0)\, m_B^2}\, \frac{m_B}{\lambda_B} \int dv\, \frac{\Phi_D(v)}{\bar v} \approx -0.03, \qquad (71)$$

where the hadronic parameter $\lambda_B = O(\Lambda_{\rm QCD})$ is defined as $\int_0^1 (d\xi/\xi)\, \Phi_B(\xi) \equiv m_B/\lambda_B$. For the numerical estimate we have assumed that $\lambda_B \approx 0.3$ GeV. With the same model for $\Phi_D(v)$ as above we have $\int dv\, \Phi_D(v)/\bar v \approx 6.6$, where the integral is now convergent. The result (71) exhibits again the expected power suppression in the heavy-quark limit, and the numerical size of the effect is at the few percent level.

We conclude from this discussion that the typical size of power corrections to the heavy-quark limit in class-I decays of B mesons into heavy-light final states is at the level of 10% or less, and thus our prediction for the near universality of the parameters $a_1$ governing these decay modes appears robust.
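The two D-meson moments and the ratios (70) and (71) can be checked numerically. The sketch below is illustrative only. It assumes that (62) is the standard Gegenbauer expansion of a light-cone distribution amplitude, takes $m_B = 5.28$ GeV, and uses typical leading-order Wilson coefficients $C_+ \approx 0.85$, $C_- \approx 1.39$ at $\mu = m_b$; none of these three inputs is quoted in the text. All other numbers are the ones given above.

```python
import math

# Assumption: Eq. (62) is the usual Gegenbauer expansion
#   Phi_D(v) = 6 v (1 - v) [1 + a1 C1^{3/2}(2v - 1) + a2 C2^{3/2}(2v - 1)]
A1_D, A2_D = 0.8, 0.4            # coefficients quoted in the text

# Inputs quoted in the text (GeV where dimensionful)
F_D, F_B, F0, ALPHA_S = 0.2, 0.18, 0.6, 0.4
CUTOFF, LAMBDA_B = 0.3, 0.3

# Assumptions, not quoted in the text
M_B = 5.28                       # B-meson mass in GeV
C_PLUS, C_MINUS = 0.85, 1.39     # illustrative Wilson coefficients at mu = m_b


def phi_d_over_vbar(v):
    """Phi_D(v) / (1 - v), with the factor (1 - v) cancelled analytically."""
    t = 2.0 * v - 1.0
    c1 = 3.0 * t                      # Gegenbauer C_1^{3/2}(t)
    c2 = 1.5 * (5.0 * t * t - 1.0)    # Gegenbauer C_2^{3/2}(t)
    return 6.0 * v * (1.0 + A1_D * c1 + A2_D * c2)


def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


# Convergent moment entering (71)
moment_1 = simpson(phi_d_over_vbar, 0.0, 1.0)
# Regulated moment entering (69) and (70), with v < 1 - Lambda/m_B
moment_2 = simpson(lambda v: phi_d_over_vbar(v) / (1.0 - v),
                   0.0, 1.0 - CUTOFF / M_B)

prefactor = 2.0 * math.pi * ALPHA_S / 3.0 * F_D * F_B / (F0 * M_B**2)
ratio_annihilation = (prefactor * (C_PLUS + C_MINUS) / (2 * C_PLUS + C_MINUS)
                      * moment_2)
ratio_spectator = (prefactor * (C_PLUS - C_MINUS) / (2 * C_PLUS + C_MINUS)
                   * (M_B / LAMBDA_B) * moment_1)

print(f"int Phi_D/vbar   = {moment_1:.2f}")
print(f"int Phi_D/vbar^2 = {moment_2:.1f}")
print(f"A/T              = {ratio_annihilation:.3f}")
print(f"T_spec/T_lead    = {ratio_spectator:.3f}")
```

Under these assumptions the moments come out close to the quoted 6.6 and 34, and both ratios are at the few-percent level with the signs shown in (70) and (71). The regulated moment depends logarithmically on the cut-off, which is the uncertainty stressed in the text.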
7.5
Remarks on class-II and class-III decay amplitudes
In the class-I decays $\bar B_d \to D^{(*)+} L^-$, the flavour quantum numbers of the final-state mesons ensure that only the light meson $L$ can be produced by the $(\bar d u)$ current contained in the operators of the effective weak Hamiltonian in (15). The QCD factorization formula then predicts that the corresponding decay amplitudes are factorizable in the heavy-quark limit. The formula also predicts that other topologies, in which the heavy charm meson would be created by a $(\bar c u)$ current, are power suppressed. To study these topologies we now consider decays with a neutral charm meson in the final state. In the class-II decays $\bar B_d \to D^{(*)0} L^0$ the only possible topology is to have the charm meson as the emission particle, whereas for the class-III decays $B^- \to D^{(*)0} L^-$ both final-state mesons can be the emission particle. The factorization formula predicts that in the heavy-quark limit class-II decay amplitudes are power suppressed with respect to the corresponding class-I amplitudes, whereas class-III amplitudes should be equal to the corresponding class-I amplitudes up to power corrections.

It is convenient to introduce two common parameterizations of the decay amplitudes, one in terms of isospin amplitudes $A_{1/2}$ and $A_{3/2}$ referring to the isospin of the final-state particles, and one in terms of flavour topologies ($T$ for "tree topology", $C$ for "colour suppressed tree topology", and $A$ for "annihilation topology"). Taking the decays $B \to D\pi$ as an example, we have

$$\begin{aligned}
A(\bar B_d \to D^+\pi^-) &= \sqrt{\tfrac{1}{3}}\, A_{3/2} + \sqrt{\tfrac{2}{3}}\, A_{1/2} = T + A, \\
\sqrt{2}\, A(\bar B_d \to D^0\pi^0) &= \sqrt{\tfrac{4}{3}}\, A_{3/2} - \sqrt{\tfrac{2}{3}}\, A_{1/2} = C - A, \\
A(B^- \to D^0\pi^-) &= \sqrt{3}\, A_{3/2} = T + C.
\end{aligned} \qquad (72)$$
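A short numerical sketch of the decomposition (72) may help to see how the two parameterizations are related. The function names are ours, and the input value of $(C - A)/(T + A)$ is purely hypothetical; only the coefficients are taken from (72).

```python
import cmath
import math

SQRT_1_3 = math.sqrt(1.0 / 3.0)
SQRT_2_3 = math.sqrt(2.0 / 3.0)
SQRT_4_3 = math.sqrt(4.0 / 3.0)


def physical_amplitudes(a12, a32):
    """Eq. (72): isospin amplitudes -> (class-I, sqrt(2) x class-II, class-III)."""
    a_dplus_piminus = SQRT_1_3 * a32 + SQRT_2_3 * a12   # T + A
    sqrt2_a_d0_pi0 = SQRT_4_3 * a32 - SQRT_2_3 * a12    # C - A
    a_d0_piminus = math.sqrt(3.0) * a32                 # T + C
    return a_dplus_piminus, sqrt2_a_d0_pi0, a_d0_piminus


def isospin_amplitudes(t_plus_a, c_minus_a):
    """Inverse of Eq. (72): (T + A, C - A) -> (A_1/2, A_3/2)."""
    a32 = (t_plus_a + c_minus_a) / math.sqrt(3.0)
    a12 = (2.0 * t_plus_a - c_minus_a) / math.sqrt(6.0)
    return a12, a32


def strong_phase_deg(a12, a32):
    """Relative phase of A_1/2 and A_3/2 in degrees."""
    return math.degrees(cmath.phase(a12 / a32))


# Hypothetical input: T + A normalised to 1, (C - A)/(T + A) = 0.2 exp(-i 40 deg)
t_plus_a = 1.0 + 0.0j
c_minus_a = 0.2 * cmath.exp(-1j * math.radians(40.0))

a12, a32 = isospin_amplitudes(t_plus_a, c_minus_a)
class1, class2, class3 = physical_amplitudes(a12, a32)

print("triangle relation residual:", abs(class1 + class2 - class3))
print("A_1/2 / (sqrt(2) A_3/2)   :", a12 / (math.sqrt(2.0) * a32))
print("relative phase (deg)      :", strong_phase_deg(a12, a32))
```

The sum of the first two lines of (72) reproduces the third, so the class-III amplitude carries no independent information. When $C - A$ is set to zero the ratio $A_{1/2}/(\sqrt{2} A_{3/2})$ is exactly 1 and the relative isospin phase vanishes.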
A similar decomposition holds for the other $B \to D^{(*)} L$ decay modes. Isospin symmetry of the strong interactions implies that the class-III amplitude is a linear combination of the class-I and class-II amplitudes. In other words, there are only two independent amplitudes, which can be taken to be $A_{1/2}$ and $A_{3/2}$, or $(T + A)$ and $(C - A)$. These amplitudes are complex due to strong-interaction phases from final-state interactions. Only the relative phase of the two independent amplitudes is an observable. We define $\delta$ to be the relative phase of $A_{1/2}$ and $A_{3/2}$, and $\delta_{TC}$ the relative phase of $(T + A)$ and $(C - A)$. The QCD factorization formula implies that

$$\frac{A_{1/2}}{\sqrt{2}\, A_{3/2}} = 1 + O(\Lambda_{\rm QCD}/m_b), \qquad \frac{C - A}{T + A} = O(\Lambda_{\rm QCD}/m_b), \qquad \delta = O(\Lambda_{\rm QCD}/m_b), \qquad \delta_{TC} = O(1). \qquad (73)$$

Table 4: CLEO data [21,22] on the branching ratios for the decays $B \to D^{(*)} L$ in units of $10^{-3}$. Upper limits are at 90% confidence level. See text for the definition of the quantities $\delta$ and $\mathcal{R}$.

                              B → Dπ         B → D*π        B → Dρ         B → D*ρ
Class-I   (D^(*)+ L^-)        2.50 ± 0.40    2.34 ± 0.32    7.89 ± 1.39    7.34 ± 1.00
Class-II  (D^(*)0 L^0)        < 0.12         < 0.44         < 0.39         < 0.56
Class-III (D^(*)0 L^-)        4.73 ± 0.44    3.92 ± 0.63    9.20 ± 1.11    12.77 ± 1.94
δ                             < 22°          < 57°          < 30°          < 31°
R                             1.34 ± 0.13    1.26 ± 0.14    1.05 ± 0.12    1.28 ± 0.13
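The last row of Table 4 can be cross-checked against the two branching-ratio rows. The sketch below assumes that $\mathcal{R}$ is the ratio of class-III to class-I amplitude magnitudes, obtained from the branching ratios and the lifetime ratio as in definition (75), and it assumes $\tau(B^-)/\tau(B_d) \approx 1.07$, a value that is not quoted in the text.

```python
import math

# Assumed lifetime ratio tau(B-)/tau(Bd); not quoted in the text
TAU_RATIO_BMINUS_OVER_BD = 1.07

# mode: (class-I BR, class-III BR, quoted R); branching ratios in units of 1e-3
TABLE_4 = {
    "D pi":   (2.50, 4.73, 1.34),
    "D* pi":  (2.34, 3.92, 1.26),
    "D rho":  (7.89, 9.20, 1.05),
    "D* rho": (7.34, 12.77, 1.28),
}


def amplitude_ratio(br_class1, br_class3, tau_ratio=TAU_RATIO_BMINUS_OVER_BD):
    """|A(B- -> D0 L-) / A(Bd -> D+ L-)| from branching ratios and lifetimes."""
    return math.sqrt(br_class3 / (br_class1 * tau_ratio))


computed = {mode: amplitude_ratio(br1, br3)
            for mode, (br1, br3, _) in TABLE_4.items()}

for mode, (_, _, quoted) in TABLE_4.items():
    print(f"{mode:7s}  computed R = {computed[mode]:.2f}   quoted R = {quoted:.2f}")
```

With this lifetime ratio the computed central values agree with the quoted ones to within about 0.01, well inside the quoted errors. The uncertainties themselves are not propagated here.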
In the remainder of this section, we will explore to what extent these predictions are supported by data. In Table 4 we show the experimental results for the various $B \to D^{(*)} L$ branching ratios reported by the CLEO Collaboration [21,22]. We first note that no evidence has been seen for any of the class-II decays, in accordance with our prediction that these decays are suppressed with respect to the class-I modes. Below we will investigate in more detail how this suppression is realized. The fourth line in the table shows upper limits on the strong-interaction phase difference $\delta$ between the two isospin amplitudes, which follow from the relation [16]

$$\sin^2\delta < \frac{9}{2}\, \frac{\tau(B^-)}{\tau(B_d)}\, \frac{{\rm Br}(\bar B_d \to D^0\pi^0)}{{\rm Br}(B^- \to D^0\pi^-)}. \qquad (74)$$

The strongest bound arises in the decays $B \to D\pi$, where the strong-interaction phase is bounded to be less than 22°. This confirms our prediction that the phase $\delta$ is suppressed in the heavy-quark limit.

Let us now study the suppression of the class-II amplitudes in more detail. We have already mentioned in Sect. 6.4 that the observed smallness of these amplitudes is more a reflection of colour suppression than power suppression. This is already apparent in the naive factorization approximation, because the appropriate ratios of meson decay constants and semi-leptonic form factors exhibit large deviations from their expected scaling laws in the heavy-quark limit, as shown in (57). Indeed, it is obvious from Table 4 that there are significant differences between the class-I and class-III amplitudes, indicating that some power-suppressed contributions are not negligible. In the last row of the table we show the experimental values of the quantity

$$\mathcal{R} = \left| \frac{A(B^- \to D^{(*)0} L^-)}{A(\bar B_d \to D^{(*)+} L^-)} \right| = \sqrt{ \frac{\tau(B_d)}{\tau(B^-)}\, \frac{{\rm Br}(B^- \to D^{(*)0} L^-)}{{\rm Br}(\bar B_d \to D^{(*)+} L^-)} }, \qquad (75)$$
which parameterizes the magnitude of power-suppressed effects at the level of the decay amplitudes. If we ignore the decays $B \to D^*\rho$ with two vector mesons in the final state, which are more complicated because of the presence of different helicity amplitudes, then the ratio $\mathcal{R}$ is given by

$$\mathcal{R} = \left| 1 + \frac{C - A}{T + A} \right| = \left| 1 + x\, \frac{a_2}{a_1} \right|, \qquad (76)$$

where $a_1$ are the QCD parameters entering the transition operator in (40), and

$$a_2 = \frac{N_c + 1}{2N_c}\, C_+ - \frac{N_c - 1}{2N_c}\, C_- + \text{"non-factorizable corrections"} \qquad (77)$$
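are the corresponding parameters describing the deviations from naive factorization in the class-II decays [16]. All quantities in (76) depend on the nature of the final-state mesons. In particular, the parameters

$$\begin{aligned}
x(D\pi) &= \frac{(m_B^2 - m_\pi^2)\, f_D\, F_0^{B\to\pi}(m_D^2)}{(m_B^2 - m_D^2)\, f_\pi\, F_0^{B\to D}(m_\pi^2)} \approx 0.9, \\
x(D\rho) &= \frac{f_D\, A_0^{B\to\rho}(m_D^2)}{f_\rho\, F_+^{B\to D}(m_\rho^2)} \approx 0.5, \\
x(D^*\pi) &= \frac{f_{D^*}\, F_+^{B\to\pi}(m_{D^*}^2)}{f_\pi\, A_0^{B\to D^*}(m_\pi^2)} \approx 0.9
\end{aligned} \qquad (78)$$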
account for the ratios of decay constants and form factors entering in the naive factorization approximation. For the numerical estimates we have assumed that the ratios of heavy-to-light over heavy-to-heavy form factors are approximately equal to 0.5, and we have taken $f_D = 0.2$ GeV and $f_{D^*} = 0.23$ GeV for the charm meson decay constants. Note that in (76) it is the quantities $x$ that are formally power suppressed, $\sim (\Lambda_{\rm QCD}/m_b)^2$ in the heavy-quark limit, not the ratios $a_2/a_1$. For the final states containing a pion the power suppression is clearly not operative, mainly due to the fact that the pion decay constant $f_\pi$ is much smaller than the quantity $(f_D \sqrt{m_D})^{2/3} \approx 0.42$ GeV. To reproduce the experimental values of the ratios $\mathcal{R}$ shown in Table 4 requires values of $a_2/a_1$ of order 0.1-0.4 (with large uncertainties), which is consistent with the fact that these ratios are of order $1/N_c$ in the large-$N_c$ limit, i.e. they are colour suppressed.

The QCD factorization formula (3) allows us to compute the coefficients $a_1$ in the heavy-quark limit, but it does not allow us to compute the corresponding parameters $a_2$ in class-II decays. The reason is that in class-II decays the emission particle is a heavy charm meson, and hence the mechanism of colour transparency, which was essential for the proof of factorization, is not operative. For a rough estimate of $a_2$ in $B \to \pi D$ decays we consider as previously the limit in which the charm meson is treated as a light meson ($m_c \ll m_b$) [...]

where

$$\begin{aligned}
f_I &= \int_0^1 dv\, \Phi_D(v) \left[ \ln^2 \bar v + \ln \bar v + \frac{\pi^2}{3} - 6 + i\pi (2 \ln \bar v - 3) + O(\bar v) \right], \\
f_{II} &= \frac{12\pi^2}{N_c}\, \frac{f_\pi f_B}{F_+^{B\to\pi}(0)\, m_B^2}\, \frac{m_B}{\lambda_B} \int_0^1 dv\, \frac{\Phi_D(v)}{\bar v}.
\end{aligned} \qquad (80)$$
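The two integrals in (80) can be evaluated numerically as a consistency check. The sketch below is illustrative and rests on several assumptions that are not stated in the text: the kernel of $f_I$ is taken as $\ln^2 \bar v + \ln \bar v + \pi^2/3 - 6 + i\pi(2\ln \bar v - 3)$ and the prefactor of $f_{II}$ as $(12\pi^2/N_c)\, f_\pi f_B / (F_+^{B\to\pi}(0)\, m_B \lambda_B)$, both reconstructed here; $\Phi_D$ is the Gegenbauer model with $a_1^D = 0.8$, $a_2^D = 0.4$; and we set $f_\pi = 0.131$ GeV, $m_B = 5.28$ GeV and $F_+^{B\to\pi}(0) = 0.3$ (half of the heavy-to-heavy value 0.6).

```python
import math

A1_D, A2_D = 0.8, 0.4        # D-meson Gegenbauer coefficients used in the text
F_B, LAMBDA_B = 0.18, 0.3    # GeV, quoted in the text
# Assumptions, not quoted in the text
F_PI = 0.131                 # GeV
M_B = 5.28                   # GeV
F_PLUS_B_PI = 0.3            # 0.5 x heavy-to-heavy form factor 0.6
N_C = 3


def phi_d(v):
    """Assumed Gegenbauer model for the D-meson distribution amplitude."""
    t = 2.0 * v - 1.0
    c1 = 3.0 * t
    c2 = 1.5 * (5.0 * t * t - 1.0)
    return 6.0 * v * (1.0 - v) * (1.0 + A1_D * c1 + A2_D * c2)


def midpoint(f, n=200000):
    """Midpoint rule on [0, 1]; avoids evaluating the logarithm at v = 1."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))


def kernel_f_one(v):
    """Leading terms of the hard-scattering kernel for f_I (reconstructed form)."""
    log_vbar = math.log(1.0 - v)
    return (log_vbar**2 + log_vbar + math.pi**2 / 3.0 - 6.0
            + 1j * math.pi * (2.0 * log_vbar - 3.0))


f_one = midpoint(lambda v: phi_d(v) * kernel_f_one(v))
moment_1 = midpoint(lambda v: phi_d(v) / (1.0 - v))
f_two = (12.0 * math.pi**2 / N_C * F_PI * F_B
         / (F_PLUS_B_PI * M_B * LAMBDA_B) * moment_1)

print(f"f_I  = {f_one.real:.2f} {f_one.imag:+.2f}i")
print(f"f_II = {f_two:.1f}")
```

With these inputs the results are close to the values $f_I \approx -1 - 19i$ and $f_{II} \approx 13$ quoted in the text, which supports the assumed forms but does not prove them.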
The contribution from $f_{II}$ describes the hard, non-factorizable spectator interaction. Note that this term involves $\int dv\, \Phi_D(v)/\bar v$, which can be sizeable but remains constant in the heavy-quark limit implied here ($m_b \to \infty$ with $m_c$ constant). Using the same numerical inputs as previously, we find that $f_{II} \approx 13$ and $f_I \approx -1 - 19i$. In writing the hard-scattering kernel for $f_I$ we have only kept the leading terms in $\bar v$, which is justified because of the strongly asymmetric shape of $\Phi_D(v)$. Note the large imaginary part arising from the "non-factorizable" vertex corrections with a gluon exchange between the final-state quarks. Combining all contributions, and taking $\mu = m_b$ for the renormalization scale, we find

$$a_2 \approx 0.25\, e^{-i 41^\circ}, \qquad (81)$$

which is significantly larger in magnitude than the leading-order result $a_2^{\rm LO} \approx 0.12$ corresponding to naive factorization. We hasten to add that our estimate (81) should not be taken too seriously, since it is most likely not a good approximation to treat the charm meson as a light meson. Nevertheless, it is remarkable that in this idealized limit one obtains indeed a very significant correction to naive factorization, which gives the right order of magnitude for the modulus of $a_2$ and, at the same time, a large strong-interaction phase. For completeness, we note that the value for $a_2$ in (81) would imply a strong-interaction phase difference $\delta \approx 10^\circ$ between the two isospin amplitudes $A_{1/2}$ and $A_{3/2}$ in $B \to D\pi$ decays, and hence is not in conflict with the experimental upper bound on the phase $\delta$ given in Table 4. The phase $\delta_{TC}$, on the other hand, is to leading order simply given by the phase of $a_2$ and is indeed large, in accordance with (73).
8
Conclusion
With the recent commissioning of the B factories and the planned emphasis on heavy-flavour physics in future collider experiments, the role of B decays in providing fundamental tests of the Standard Model and potential signatures of new physics will continue to grow. In many cases the principal source of systematic uncertainty is a theoretical one, namely our inability to quantify the non-perturbative QCD effects present in these decays. This is true, in particular, for almost all measurements of CP violation at the B factories. In these lectures, I have reviewed a rigorous framework for the evaluation of strong-interaction effects for a large class of exclusive, two-body non-leptonic decays of B mesons. The main result is contained in the factorization formula (3), which expresses the amplitudes for these decays in terms of experimentally measurable semi-leptonic form factors, light-cone distribution amplitudes, and hard-scattering functions that are calculable in perturbative QCD. For the first time, therefore, we have a well founded field-theoretic basis for phenomenological studies of exclusive hadronic B decays, and a formal justification for the ideas of factorization. For simplicity, I have mainly focused on $B \to D\pi$ decays here. A detailed discussion of B decays into two light mesons will be presented in a forthcoming paper [12].

We hope that the factorization formula (3) will form the basis for future studies of non-leptonic two-body decays of B mesons. Before that, however, a fair amount of conceptual work remains to be completed. In particular, it will be important to investigate better the limitations on the numerical precision of the factorization formula, which is valid in the formal heavy-quark limit. We have discussed some preliminary estimates of power-suppressed effects in the present work, but a more complete analysis would be desirable. In particular, for rare B decays into two light mesons it will be important to understand the role of chirally-enhanced power corrections and weak annihilation contributions [12,25]. For these decays, there are also still large uncertainties associated with the description of the hard spectator interactions. Theoretical investigations along these lines should be pursued with vigor. We are confident that, ultimately, this research will result in a theory of non-leptonic B decays, which should be as useful for this area of heavy-flavour physics as the large-$m_b$ limit and heavy-quark effective theory were for the phenomenology of semi-leptonic decays.

Acknowledgements
I would like to thank the organizers of the TASI Institute for the invitation to present these lectures, for their hospitality, and for providing a stimulating atmosphere during the school. I am grateful to the students for attending the lectures and contributing with questions and discussions. Among many pleasant experiences during my stay in Boulder, I will remember a successful climb of Longs Peak, which helped me to recover from the course. Finally, I am indebted to my collaborators Martin Beneke, Gerhard Buchalla and Chris Sachrajda, who deserve much credit for these notes. This work was supported in part by the National Science Foundation.

References
1. M. Beneke, G. Buchalla, M. Neubert and C.T. Sachrajda, Phys. Rev. Lett. 83, 1914 (1999).
2. M. Beneke, G. Buchalla, M. Neubert and C.T. Sachrajda, Nucl. Phys. B 591, 313 (2000).
3. N. Isgur and M.B. Wise, Phys. Lett. B 232, 113 (1989); ibid. 237, 527 (1990).
4. M.A. Shifman and M.B. Voloshin, Sov. J. Nucl. Phys. 45, 292 (1987) [Yad. Fiz. 45, 463 (1987)]; ibid. 47, 511 (1988) [Yad. Fiz. 47, 801 (1988)].
5. For a review, see: G. Buchalla, A.J. Buras and M.E. Lautenbacher, Rev. Mod. Phys. 68, 1125 (1996).
6. D. Fakirov and B. Stech, Nucl. Phys. B 133, 315 (1978).
7. N. Cabibbo and L. Maiani, Phys. Lett. B 73, 418 (1978); ibid. 76, 663 (1978) (E).
8. G.P. Lepage and S.J. Brodsky, Phys. Rev. D 22, 2157 (1980).
9. A.V. Efremov and A.V. Radyushkin, Phys. Lett. B 94, 245 (1980).
10. J.D. Bjorken, Nucl. Phys. (Proc. Suppl.) B 11, 325 (1989).
11. M.J. Dugan and B. Grinstein, Phys. Lett. B 255, 583 (1991).
12. M. Beneke, G. Buchalla, M. Neubert and C.T. Sachrajda, QCD factorization for $B \to \pi K$ decays, Preprint hep-ph/0007256, to appear in the Proceedings of the 30th International Conference on High-Energy Physics (ICHEP 2000), Osaka, Japan, 27 July-2 August 2000, and paper in preparation.
13. L. Maiani and M. Testa, Phys. Lett. B 245, 585 (1990).
14. For a review, see: M. Neubert, Phys. Rep. 245, 259 (1994).
15. J.F. Donoghue, E. Golowich, A.A. Petrov and J.M. Soares, Phys. Rev. Lett. 77, 2178 (1996).
16. For a review, see: M. Neubert and B. Stech, in: Heavy Flavours II, ed. A.J. Buras and M. Lindner (World Scientific, Singapore, 1998) pp. 294 [hep-ph/9705292]; M. Neubert, Nucl. Phys. (Proc. Suppl.) B 64, 474 (1998).
17. H.D. Politzer and M.B. Wise, Phys. Lett. B 257, 399 (1991).
18. A. Khodjamirian and R. Rückl, in: Heavy Flavours II, ed. A.J. Buras and M. Lindner (World Scientific, Singapore, 1998) pp. 345 [hep-ph/9801443].
19. V.M. Braun and I.E. Filyanov, Z. Phys. C 48, 239 (1990).
20. B.V. Geshkenbein and M.V. Terentev, Phys. Lett. B 117, 243 (1982); Sov. J. Nucl. Phys. 39, 554 (1984) [Yad. Fiz. 39, 873 (1984)].
21. B. Barish et al., CLEO Collaboration, Conference report CLEO CONF 97-01 (EPS 97-339).
22. B. Nemati et al., CLEO Collaboration, Phys. Rev. D 57, 5363 (1998).
23. J.L. Rodriguez, in: Proceedings of the 2nd International Conference on B Physics and CP Violation, Honolulu, Hawaii, March 1997, ed. T.E. Browder et al. (World Scientific, Singapore, 1998) pp. 124 [hep-ex/9801028].
24. C. Caso et al., Particle Data Group, Eur. Phys. J. C 3, 1 (1998).
25. Y.Y. Keum, H.-N. Li and A.I. Sanda, Fat penguins and imaginary penguins in perturbative QCD, Preprint hep-ph/0004004.
ASYMMETRIC $e^+e^-$ COLLIDERS
AARON ROODMAN
Stanford Linear Accelerator Center, Stanford University, 2575 Sand Hill Rd., Menlo Park, California, USA
E-mail: [email protected]
In these lecture notes, I will focus on some of the interesting details of B-meson physics at an asymmetric collider which are relevant for the study of CP-violating asymmetries. As an instructive example, I will describe the basic experimental tools used in the measurement of $\sin 2\beta$. In addition, I will describe the use of a blind analysis technique in the measurement of CP asymmetries.
1
Scope of these notes
As in my lectures, these notes aim to be a simple introduction to asymmetric $e^+e^-$ colliders, and the B physics done there. I will not attempt to summarize the entire range of B physics, give a synopsis of the most recent experimental results, or describe the B factory detectors in detail. A good snapshot of recent results can be found in the proceedings of the most recent Rochester or Lepton-Photon symposia. Details about the BaBar detector and physics program can be found in the BaBar physics book [1].
2
Introduction to Asymmetric Colliders
There are three accelerators devoted to B meson physics: a symmetric collider at Cornell (CESR), and two new asymmetric colliders at SLAC (PEP-II) and KEK (KEK-B). Parameters of the three machines are shown in Table 1. The bottom line for any accelerator is integrated luminosity; the design levels of 30-100 fb$^{-1}$ per year for PEP-II and KEK-B are ambitious but necessary to achieve the physics goals set for the BaBar and BELLE experiments.

There are several important factors which allow the asymmetric colliders to achieve high luminosity. First, separate electron and positron rings allow high currents to be stored without disturbing the other beam. Happily, an asymmetric collider requires two rings anyway. Second, having separate rings allows for the storage of many bunches, which permits high currents without large beam-beam tune shifts. In the simplest terms, the beam-beam tune shift describes the amount that a beam is perturbed by another beam. This tune shift is due to Coulomb interactions and is proportional to the number of particles in the other beam. An accelerator cannot function stably with too large a value of this tune shift.

Table 1: Accelerator parameters for B Factories

                                              CESR     KEK-B        PEP-II
circumference (km)                            0.768    3.016        2.2
Energy e+ (GeV)                               5.3      3.5          3.1
Energy e- (GeV)                               5.3      8.0          9.0
Boost β of the Υ(4S)                          0.       0.39         0.49
Number of bunches                             45       5000         1658
crossing angle (mrad)                         ±2.3     ±11          0
vertical bunch size (μm)                      10       1.9          5.4
beam-beam tune shift                          0.04     0.04-0.05    0.03
design Luminosity (10^33 cm^-2 sec^-1)        1.5      3.0          10.

Third, the small spot sizes and high currents demand sophisticated monitors of the beam. For example, KEK-B uses a beam-spot interferometer to monitor the size of the beam near the interaction region. This device has shown that the KEK-B beamspot was about 5 μm in height, and became larger at higher currents, instead of the design value of 2 μm. The advantage of a real-time monitor of this vital accelerator parameter cannot be over-emphasized. Finally, the use of numerous feedbacks makes stable operation at high luminosity possible. For example, PEP-II has single bunch feedbacks on the longitudinal and transverse position of every bunch, as well as slow feedbacks, of roughly 1-2 Hz, on the beam orbit, interaction angle, and luminosity itself [3,4].

Of course, the interaction region in an asymmetric collider is much more complex than in a symmetric machine. PEP-II and KEK-B have chosen different schemes to collide their beams. PEP-II uses a series of quadrupole magnets which bring the beams into a head-on collision. While this produces a series of synchrotron radiation fans, these can be absorbed by careful placement of absorbers along the beam pipe. KEK-B has straight-through beams, at an 11 mrad crossing angle, but can achieve head-on collisions using so-called crab cavities to rotate (and then unrotate) each beam. For both accelerators, beam related backgrounds are an important concern, especially for the survival of the silicon vertex detectors.

As one might imagine, such machines can produce some interesting problems. One problem in the operation of PEP-II occurs in the positron ring, when the background rates suddenly increase, by a factor of 10 or more. The standard hypothesis is that this is due to the trapping of a charged dust particle in the positive beam. This phenomenon had previously been observed in the HERA and ISR machines, which also had positive beams.
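The boost entries of Table 1 follow directly from the beam energies. The sketch below is a minimal illustration that treats the beams as massless and exactly head-on, so the crossing angles in Table 1 are neglected; the function names are ours.

```python
import math

# (positron energy, electron energy) in GeV, from Table 1
MACHINES = {
    "CESR": (5.3, 5.3),
    "KEK-B": (3.5, 8.0),
    "PEP-II": (3.1, 9.0),
}


def cm_energy(e_pos, e_ele):
    """Centre-of-mass energy for massless, head-on beams."""
    return 2.0 * math.sqrt(e_pos * e_ele)


def boost_beta(e_pos, e_ele):
    """Velocity of the centre-of-mass system in the laboratory frame."""
    return abs(e_ele - e_pos) / (e_ele + e_pos)


def boost_beta_gamma(e_pos, e_ele):
    """beta * gamma of the centre-of-mass system in the laboratory frame."""
    return abs(e_ele - e_pos) / cm_energy(e_pos, e_ele)


for name, (e_pos, e_ele) in MACHINES.items():
    print(f"{name:7s} sqrt(s) = {cm_energy(e_pos, e_ele):6.2f} GeV"
          f"   beta = {boost_beta(e_pos, e_ele):.2f}"
          f"   beta*gamma = {boost_beta_gamma(e_pos, e_ele):.2f}")
```

All three machines sit at a centre-of-mass energy close to 10.6 GeV, and the computed velocities reproduce the boost row of Table 1. Note that the relevant quantity for the decay-vertex separation is $\beta\gamma$, which is somewhat larger than $\beta$.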
In PEP-II an effort was made to avoid the dust-trapping problem by placing vacuum pumps below the beamline, so that dust in the pumps could not fall into the beam. Remarkably, it is possible to induce such trapped dust events by thumping the beam pipe.

The two asymmetric colliders both commenced operation in 1999, with the goal of measuring a number of CP-violating asymmetries. Remarkably, both asymmetric colliders have quickly achieved high luminosities and have been able to provide a substantial amount of beam. For example, the daily peak and integrated luminosity for the year 2000 run of PEP-II is shown in Figure 1. Details about both PEP-II and KEK-B can be found in their respective design reports [6,5].

Figure 1: a) PEP-II integrated luminosity for the 1999-2000 run. b) BaBar recorded integrated luminosity per day for the 1999-2000 run.
3
Measurement of $\sin 2\beta$
The measurement of the CP-violating parameter $\sin 2\beta$ will be the first important result from the asymmetric colliders. As this measurement is described, we will highlight some of the typical experimental tools used in B physics. The phenomenology of CP violation is described in detail in other lectures [7], so we will only note that the expected asymmetry is given by

$$a_f(\Delta t) = \frac{\Gamma(B^0(\Delta t) \to f) - \Gamma(\bar B^0(\Delta t) \to f)}{\Gamma(B^0(\Delta t) \to f) + \Gamma(\bar B^0(\Delta t) \to f)} = -\sin 2\beta\, \sin(\Delta m_B\, \Delta t),$$

where the final state $f$ is a CP eigenstate, $\Delta t$ is the time difference between the decay into the CP eigenstate and the other B-meson decay, and the identification of $B^0$ or $\bar B^0$ is made using the decay of the other B-meson from the $\Upsilon(4S)$.

Figure 2: BaBar $J/\psi \to e^+e^-$ signal (electron pair invariant mass in GeV).
3.1
$B^0 \to J/\psi K_S$ Decay
The decay $B^0 \to J/\psi K_S$ has the advantage of being a two-body decay which can be isolated with high efficiency, of order 35%, and low background, roughly < 5%. The observation of a $J/\psi$ or $K_S$ is straightforward: pairs of charged tracks are combined at the point of closest approach, and the invariant mass distributions clearly show the signal for $J/\psi \to \ell^+\ell^-$ in Figure 2, and for $K_S \to \pi^+\pi^-$ in Figure 3. The neutral decay, $K_S \to \pi^0\pi^0$, is also used, as well as the $\psi(2S)$ state instead of the $J/\psi$ in CP violation measurements.

To identify the decay $B^0 \to J/\psi K_S$, the $J/\psi$ and the $K_S$ are combined at their point of closest approach, and their kinematics are required to be consistent with a B meson decay. There are two possible kinematic variables, but the raw invariant mass or reconstructed momentum are not the ideal variables to use. We would like to use a pair of kinematic variables which are independent and can account for variations in the beam energy. The latter condition is important because the beam energy may fluctuate by several MeV and can cause a noticeable smearing to the reconstructed B energy. A typical choice for the kinematic variables is $\Delta E = E_{B\,{\rm meson}} - E_{\rm Beam}$ and $m_B = \sqrt{E_{\rm Beam}^2 - p_{B\,{\rm meson}}^2}$. The use of $E_{\rm Beam}$ in the calculation of $m_B$ removes most correlations between the two variables. A plot of these variables for the BaBar $B^0 \to J/\psi K_S$ data [8] is shown in Figure 4. The small amount of background occurring at lower values of $m_B$ is due to random combinatorics, and falls off under the B meson signal.

Figure 3: BELLE $K_S \to \pi^+\pi^-$ signal ($M(\pi^+\pi^-)$ in GeV/c$^2$).
3.2
Flavor Tag and $\Delta t$
With a reconstructed $B^0 \to J/\psi K_S$ in hand, we next must determine the flavor of the other B meson at its time of decay, the probability that we correctly identify the B meson flavor, as well as the time difference between the decays of the two B mesons. The B-meson flavor is determined using the sign of any high momentum leptons, charged Kaons, or slow pions from $D^*$ decays which are present. High momentum leptons come primarily from B semi-leptonic decay, in which the sign of the lepton identifies the flavor of the B. There is a small background from leptons which come from cascade, or $B \to D \to \ell$ decays, and thus have the wrong sign charge. Likewise charged Kaons largely come from cascade decays of the sort $B \to D \to K$, which again allows the sign of the Kaon to identify the B flavor. However, in the case of Kaons, there is a larger background of wrong sign Kaons than in the lepton case.

Figure 4: BaBar $B^0 \to J/\psi K_S$ signal, for 9 fb$^{-1}$, in the $\Delta E$-$m_B$ plane.

Electrons are identified using the ratio of energy deposited in the Calorimeter to the momentum measured by the curvature of its track in the Drift Chamber, as shown in Figure 5, as well as the transverse profile of the energy deposited in the Calorimeter. Muons are identified by their transit through the instrumented flux return iron. Both BaBar and BELLE use resistive plate chambers (RPCs) for their muon and $K_L$ detection. Kaons are identified using a combination of energy loss (dE/dx) in the tracking detectors, and dedicated Cerenkov-light detectors. The energy loss in the BaBar Drift Chamber is shown in Figure 6, showing good $K/\pi$ separation below 0.6 GeV. BaBar and BELLE have innovative, but different, detectors to produce and detect Cerenkov light, for Kaon identification above 0.6 GeV. BaBar's DIRC uses 3 meter quartz bars to produce Cerenkov light, and transmit it to a large water filled tank. The Cerenkov rings are detected by an array of 11k photo-tubes covering the water tank. A schematic of the DIRC is shown in Figure 7. The Cerenkov angle for a sample of Kaons is shown in Figure 8, showing excellent separation between pions and Kaons for the region above 0.6 GeV. BELLE uses a combination of time-of-flight counters, and Aerogel detectors, for Kaon identification above 0.6 GeV.

Figure 5: BaBar E/P ratio for electrons from Bhabha events (measured energy / expected deposited energy).

Figure 6: BaBar Drift Chamber dE/dx (dE/dx vs track momentum in GeV/c).

The quality factor for B flavor tagging is given by $D = \sum_i \epsilon_i (1 - 2w_i)^2$, where $\epsilon_i$ is the efficiency for each type of tagging source, and $w_i$ is the mistagging probability for each type of tagging source. The true asymmetry is diluted by this factor, which was found to be between 20-28% for BELLE and BaBar. Given the size of the flavor tagging dilution, it is necessary to measure this effect using data. This can be done by using a sample of fully reconstructed B decays to final states which are not CP eigenstates, such as $B^0 \to D^*\pi$ or $B^0 \to D^*\ell\nu$. The energy substituted B mass, $m_B$, for a number of such final states, is shown from BELLE data in Figure 9. In these decays, the B flavor is identified by the exclusive decay, and so the other B meson can be used to measure the flavor mis-tagging probability.

Finally, the decay time difference between the B decaying to a CP eigenstate and the other B is measured. Since each B has only a momentum of 340 MeV in the $\Upsilon(4S)$ rest-frame, and the decay length is only 30 μm, this measurement is extremely difficult at a symmetric collider. However, at an asymmetric collider the $\Upsilon(4S)$ has a significant boost in the lab-frame, spreading out the decay vertices significantly along the beam direction, giving a typical vertex separation of 260 μm, with BaBar's $\beta\gamma = 0.56$. The position resolution at the vertex for the BaBar Silicon Vertex Detector is given by $\sigma_z = 50\,\mu{\rm m}/p_T \oplus 15\,\mu{\rm m}$, for a single track. The vertex resolution along the beam direction is roughly 100-120 μm, with a resolution function which is offset from zero slightly due to charmed meson decays, and which has a broad component. The error in the asymmetry measurement scales as $\sigma_a \sim f(\sigma_{\Delta t})/\sqrt{D N_{\rm events}}$, but when $\sigma_{\Delta t} \sim 2\beta\gamma c\tau$ the asymmetry error varies slowly with $\sigma_{\Delta t}$. It is worth understanding that the asymmetry measurement is possible as long as the vertex is precise enough to locate the event within the correct $\sin(\Delta m\, \Delta t)$ oscillation.

Figure 7: BaBar DIRC.

Figure 8: BaBar Cerenkov angle for Kaons (vs momentum in GeV/c).
3.3
Measure the Asymmetry
To get a better physical understanding of the CP violating asymmetry present in B decays, consider the At decay time distributions shown in Figure 10, for the case in which sm2/3 = 0.7. The first plot shows the theoretical At distribution for B° and B° flavor tags for perfect flavor tagging and vertexing. The second plot shows the decay time for realistic vertex resolution, but perfect flavor tagging. The third plot then shows the At for a realistic model of flavor tagging. Finally the last plot shows the same distribution, but for only 10fb^ of data. These plots show that even with a modest data sample, it is possible to make a CP asymmetry measurement. The asymmetry measurements are done with a maximum likelihood fit to the data, taking into account the flavor mistagging and vertex resolution estimated for each particular event. The results of such a fit are shown for a preliminary analysis of 9fb~ of BaBar data in
Table 2: Experimental status of $\sin 2\beta$.

Experiment       $N_{\rm tagged}$   $\sin 2\beta$
Opal             24                 $3.2^{+1.8}_{-2.0} \pm 0.5$
CDF              395                $0.79^{+0.41}_{-0.44}$
Aleph            23                 $0.84^{+0.82}_{-1.04} \pm 0.16$
BaBar            120                $0.12 \pm 0.37 \pm 0.09$
Belle            98                 $0.45^{+0.43}_{-0.44}{}^{+0.07}_{-0.09}$
World Average    -                  $0.49 \pm 0.23$
Figure 11. A summary of all current results for $\sin 2\beta$ is shown in Table 2.$^{8,10}$ A definite measurement of CP violation in the B meson will most likely have to wait for a bit more data from the B factories. By 2002 both BaBar and BELLE should have sufficiently large data samples, of order 100 fb$^{-1}$ each, to measure $\sin 2\beta$ to a precision of roughly 0.1. At this level it will be possible to conclude whether or not the CKM picture of CP violation is correct.

4 Blind Analysis
I will conclude these lecture notes with a discussion of the blind analysis technique which was used in the BaBar measurement of $\sin 2\beta$. The motivation for adopting a blind analysis is that it provides a powerful technique to eliminate Experimenter's Bias. Experimenter's Bias can arise from an unconscious bias toward the expected result or towards other measurements. It can also arise from an undue reliance on the value of a measured quantity, instead of on external systematic checks, to be convinced that the measurement is correct. There are compelling philosophical motivations for a blind analysis. One such argument concerns the point at which the decision is made to stop working and present one's result. It is possible for this decision to be influenced by the value of the result itself, and how it compares with prior results and predictions. However, this is precisely the way in which Experimenter's Bias creeps into measurements. The decision to stop working should be made very deliberately, and with attention paid solely to the external checks on the measurement and not to the result itself. This leads to the second philosophical argument, which is that there is no information in the value of the result about the internal correctness of that result. As a technique which ensures this separation, a blind analysis can eliminate Experimenter's Bias.
There are two ways in which the $\sin 2\beta$ asymmetry can be evaluated. First, the presence of a time asymmetry can be seen in plots of $\Delta t$ for a single $B^0$ flavor in its asymmetry around $\Delta t = 0$; or the asymmetry can be seen in a comparison of the $\Delta t$ for $B^0$ and $\bar{B}^0$, as seen in Figure 10. Second, a fitting program, incorporating knowledge of the $\Delta t$ resolution and the tagging dilution, will be used to extract a value of $\sin 2\beta$ and its uncertainty. A blind analysis requires that both the visual evaluation of the asymmetry and the output of a fitter be hidden. Accordingly there are two ways in which the analysis must be modified. The asymmetry can be visually hidden by looking only at the product of the sign of the $B^0$ flavor tag and $\Delta t$, and adding an unknown offset to that product. In particular the hidden $\Delta t$ is given by

$$\Delta t^{*} = s_{\rm tag}\,\Delta t + C$$
where $s_{\rm tag}$ is the sign of the B flavor tag, and $C$ is a fixed but hidden offset. The factor of $s_{\rm tag}$ serves to hide the visual asymmetry between $B^0$ and $\bar{B}^0$, since the two distributions are nearly mirror images. Then the hidden offset, $C$, hides the asymmetry in each individual $\Delta t$ distribution, as shown in Figure 12. This works because with the unknown offset, $C$, the $\Delta t = 0$ point is uncertain due to the statistical fluctuations in the $\Delta z$ distribution. With larger data samples this statistical uncertainty on the $\Delta t = 0$ point will decrease, but this only reduces the range of blind uncertainty and the result is still blinded by a sufficient amount. Thus we can visually study each time distribution, and compare the two distributions, while effectively hiding the actual asymmetry. Next the fitted value of $\sin 2\beta$ must be hidden. The fitting program can accomplish this easily by using a hidden asymmetry variable defined as

$$a^{*} = \begin{Bmatrix} +1 \\ -1 \end{Bmatrix} \times a + C$$
Here the result of the fit is hidden by both a sign flip and an offset; the sign flip hides whether changes in the analysis are making the result move up or down. This is the technique used by KTeV$^{11}$ in its recent blind measurement of $\epsilon'/\epsilon$. The above blind analysis technique was successfully used by BaBar$^{8}$ in its preliminary measurement of $\sin 2\beta$. In fact the method demonstrated the most utility in that it allowed for a number of decisions about the data set to be made without concern about possible bias from knowledge of the result. While the measurement of $\sin 2\beta$ is presently statistically limited, it is also the case that small changes in the data sample can produce a large change in the result.
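The two blinding transformations described above can be sketched in a few lines of Python. This is only a minimal illustration, not the BaBar or KTeV code: the function names, the idea of deriving the hidden sign and offset from a hashed secret phrase, and the offset range are all assumptions made for the example.

```python
import hashlib
import random


def hidden_parameters(secret, max_offset):
    """Derive a reproducible hidden sign and offset from a secret phrase.

    Hypothetical helper: hashing a phrase that nobody inspects lets every
    analyst apply the same blinding without knowing its value.
    """
    digest = hashlib.sha256(secret.encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    sign = rng.choice((+1, -1))
    offset = rng.uniform(-max_offset, max_offset)
    return sign, offset


def blind_delta_t(delta_t, s_tag, offset):
    """Blinded time difference: dt* = s_tag * dt + C."""
    return s_tag * delta_t + offset


def blind_asymmetry(a, sign, offset):
    """Blinded fit result: a* = (+1 or -1) * a + C."""
    return sign * a + offset


def unblind_asymmetry(a_blind, sign, offset):
    """Invert blind_asymmetry once the analysis is frozen (sign is +1 or -1)."""
    return sign * (a_blind - offset)
```

In such a scheme the fitter would report only the blinded value, so the size of a shift between two versions of the analysis remains visible while its direction and the central value stay hidden.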
Since the CP asymmetry is modulated by $\sin \Delta m \Delta t$, events with $\Delta m \Delta t = \pi/2$ have a large effect while those at $\Delta m \Delta t = 0$ or $\pi$ have little impact; all events are not created equally. BaBar is continuing to use this blind analysis technique for $\sin 2\beta$, and for other measurements as well.

5 Conclusion
Finally, let me thank the organizers and students of TASI-2000 for a pleasant and stimulating summer school.

References

1. P. F. Harrison and H. R. Quinn, eds., et al., "The BaBar physics book: Physics at an asymmetric B factory," SLAC-R-0504 (1998).
2. S. Y. Lee, Accelerator Physics (World Scientific, Singapore, 1999).
3. L. Hendrickson et al., "Slow feedback systems for PEP-II," presented at the 7th European Particle Accelerator Conference (EPAC 2000), Vienna, Austria, 26-30 Jun 2000, SLAC-PUB-8480.
4. T. Himel, Ann. Rev. Nucl. Part. Sci. 47, 157 (1997).
5. "KEKB B factory design report," KEK-REPORT-95-7 (1995).
6. "PEP-II: An Asymmetric B Factory. Conceptual Design Report. June 1993," SLAC-418.
7. J. Rosner, in these lectures.
8. BaBar Collaboration, D. G. Hitlin, "First CP violation results from BaBar," submitted to the XXXth International Conference on High Energy Physics, Osaka, Japan, SLAC-PUB-8698 (2000).
9. J. Schwiening et al., "DIRC, the particle identification system for BABAR," submitted to the XXXth International Conference on High Energy Physics, Osaka, Japan, SLAC-PUB-8590 (2000).
10. BELLE Collaboration, H. Aihara, "A measurement of CP violation in B0 meson decays with Belle," submitted to the XXXth International Conference on High Energy Physics, Osaka, Japan (2000).
11. KTeV Collaboration, A. Alavi-Harati et al., Phys. Rev. Lett. 83, 22 (1999).
Figure 9: BELLE Exclusive B reconstruction. [Panels: beam-constrained mass $M_b$ (GeV), from 5.2 to 5.3, for $B \to D^0\pi$ with $D^0 \to K\pi$, $K\pi\pi^0$ and $K3\pi$; $B^0 \to D^+\pi^-$ with $D^+ \to K\pi\pi$; $B \to D^{*0}\pi$ with $D^{*0} \to D^0\pi^0$; and $B^0 \to D^{*+}\pi^-$ with $D^{*+} \to D^0\pi^+$.]
Figure 10: Simulated $\Delta t$ distributions for a) perfect flavor-tagging and vertexing, b) realistic vertex resolution, c) realistic vertex and flavor-tagging, d) for 10 fb$^{-1}$ of data. [Each panel: positive and negative tags versus $\Delta t$ (psec), from -4 to 4.]

Figure 11: Asymmetry versus $\Delta t$ (ps) for $B^0 \to J/\psi K_S$ and $B^0 \to \psi(2S) K_S$ BaBar data.
Figure 12: a) The $\Delta t$ distributions for $B^0$ and $\bar{B}^0$, b) distributions of $s_{\rm tag}\Delta t$, and c) distributions of the blinded variable $s_{\rm tag}\Delta t + C$, for a 1000 event toy simulation.
Sheldon Stone
PATHOLOGICAL SCIENCE

SHELDON STONE
Physics Department, Syracuse University, Syracuse, NY 13244-1130, USA
E-mail: [email protected]
I discuss examples of what Dr. Irving Langmuir, a Nobel prize winner in Chemistry, called "the science of things that aren't so." Some of his examples are reviewed and others from High Energy Physics are added. It is hoped that discussing these incidents will help us develop an understanding of some potential pitfalls.
1 Introduction
Often, much more often than we would like, experimental results are reported that have impressive "statistical significance," but are subsequently proven to be wrong. These results have been labeled by Irving Langmuir as the "Science of things that aren't so." Langmuir described some of these incidents in a 1953 talk that was transcribed and may still be available.$^{1}$ Here I will repeat some of Langmuir's examples, and show some other examples from High Energy Physics. The examples shown here are not cases of fraud; the proponents believed in the work they presented. But they were wrong.

2 Davis-Barnes Effect
Circa 1930 Professors B. Davis and A. Barnes of Columbia University did an experiment where they produced $\alpha$ particles from the decay of Polonium and $e^-$ from a filament in an apparatus sketched in Fig. 1. The electrons are accelerated by a varying potential. At 590 V they move with the same velocity as the $\alpha$'s. Then they may combine with the $\alpha$'s to form a bound $\alpha$-$e^-$ "atomic state." They then continue down the tube and are counted visually by making scintillations in the screens at Y or Z that are viewed with a microscope. Without a magnetic field all the $\alpha$'s reach the screen at Y. With a magnetic field and no accelerating voltage for the electrons they all reach the screen at Z. However, if the electrons bind with the doubly positive charged $\alpha$'s they would deflect half as much and not reach Z. What they found was very extraordinary. Not only did the electrons combine with the $\alpha$'s at 590 V, but also at other energies "that were exactly the velocities that you calculate from Bohr theory." Furthermore all the capture probabilities were about 80%. Their data are shown in Fig. 2.
Figure 1: Diagram of first tube. S, radioactive source; W, thin glass window; F, filament; G, grid; R, lead to silvered surface; A, second anode; M, magnetic field; C, copper seals; Y and Z, zinc sulfide screens.
Figure 2: Electron capture as a function of accelerating voltage.
Now there is a problem here because in Bohr theory when an electron comes in from infinity it has to radiate half its energy to enter into orbit. There was no evidence for any such radiation and the electron would have needed to have twice the energy to start with. However, there were some theorists including Sommerfeld who had an explanation that the electron could be captured if it had a velocity equal to what it was going to have in orbit. There were other disturbing facts. The peaks were 0.01 V wide; the fields in the tube were not that accurate. In addition, scanning the entire voltage range in such small steps would take a long time. Well, Davis and Barnes explained, they didn't quite do it that way: they found by some preliminary work that they did check with the Bohr orbit velocities, so they knew where to look. Sometimes they weren't quite in the right place, so they explored around and found them. Their precision was so good they were sure they could get a
better value for the Rydberg constant (known then to 1 part in $10^8$). Then Langmuir visited Columbia. The way the experiment was done was that an assistant named Hull sat opposite to Barnes in front of a voltmeter that had a scale that went from 1 to a thousand volts, and on that scale he was reading 0.01 V. The room was dark, to see the scintillations, and there was a light on the voltmeter and on the dial of the clock that Barnes used to time his measurements. Langmuir says it best:

"He said he always counted for two minutes. Actually, I had a stop watch and I checked him up. They sometimes were as low as one minute and ten seconds and sometimes one minute and fifty-five seconds, but he counted them all as two minutes, and yet the results were of high accuracy!

"And then I played a dirty trick. I wrote out on a card of paper ten different sequences of V and zero. I meant to put on a certain voltage and then take it off again. Later I realized that that wasn't quite right because when Hull took off the voltage, he sat back in his chair; there was nothing to regulate at zero, so he didn't. Well, of course, Barnes saw him whenever he sat back in his chair. Although the light wasn't very bright, he could see whether he was sitting back in his chair or not, so he knew the voltage wasn't on and the result was that he got a corresponding result. So later I whispered, 'Don't let him know that you're not reading,' and I asked him to change the voltage from 325 down to 320 V so he'd have something to regulate and I said, 'regulate it just as carefully as if you were sitting on a peak.' So he played the part from that time on, and from that time on Barnes' readings had nothing whatever to do with the voltages that were applied. Whether the voltage was at one value or another didn't make the slightest difference. I said 'you're through. You're not measuring anything at all. You never have measured anything at all.' 'Well,' he said, 'the tube was gassy. The temperature has changed and therefore the nickel plates must have deformed themselves so that the electrodes are no longer lined up properly.'

"He immediately, without giving any thought to it, he immediately had an excuse. He had a reason for not paying any attention to any wrong results. It just was built into him. He just had worked that way all along and always would. There is no question but what he is honest; he believed these things, absolutely."

In fact they did publish their results even after being confronted by Langmuir.$^{2}$ Later, after no one else had been able to reproduce their results, they published a retraction,$^{3}$ that said in part: "These results reported depended on observations made by counting scintillations visually. The scintillations produced by $\alpha$ particles on a zinc sulfide screen are a threshold phenomenon. It is possible
that the number of counts may be influenced by external suggestion or autosuggestion to the observer. The possibility that the number of counts might be greatly influenced by suggestion had been realized, and a test of their reliability had been made by two methods: (a) The voltage applied to the electrons was altered without the knowledge of the observer (Barnes); (b) the direction of the electron stream with respect to the $\alpha$-particle path was altered by a small electro-magnet. Such changes in voltage and direction of electron stream were noted at once by the observer. These checks were thought at the time to be entirely adequate. In examining the data of observation made in our laboratory Dr. Irving Langmuir concluded that the checks applied had not been sufficient, and convinced us that the experiments should be repeated by wholly objective methods. Accordingly we have investigated the matter by means of the Geiger counter. Four additional experimental electron $\alpha$-ray tubes have been constructed for this purpose.

"Capture of the kind reported was often observed over a considerable period of time, but following prolonged observation the effect seemed to disappear. The results deduced from visual observations have not been confirmed. If such capture of electrons does take place, it must depend on unknown critical conditions which we were not able to reproduce at will in the new experimental tubes."

It is interesting to note that they still seemed to be holding out the possibility that somehow the earlier results were correct.

3 N-rays
In 1903 there was a lot of experimentation with x-rays. Blondlot, a respected member of the French Academy of Sciences, found that if you have a hot wire heated inside an iron tube with a window cut out of it, rays would emerge that would get through aluminum. He called these N-rays.$^{4}$ They had specific properties. For example, they could get through 2" or 3" of aluminum but not through iron. The way he detected these rays was by observing an object illuminated with a faint light. When the N-rays were present you could "see the object much better." N-rays could be stored. Brick wrapped in black paper put in sunlight would store and reemit them, but the effect was independent of the number of bricks. Many other things would give off N-rays, including people. They even split when traversing an aluminum prism. Blondlot measured the index of refraction of the different components. The American physicist R. W. Wood visited Blondlot's laboratory and was shown the experiments. While Blondlot demonstrated his measurement of the
refractive indices, Wood palmed the prism. It did not affect the measurements. Wood cruelly published that,$^{5}$ and that was the end of Blondlot. Now the question is how do we explain Blondlot's findings. Pringsheim tried to repeat Blondlot's experiments and focused on the detection method. The method was that you have a very faint source of light on a screen of paper, and to make sure that you are seeing the screen of paper you hold your hand up and move it back and forth. If you can see your hand move then you know it is illuminated. One of Blondlot's observations was that you can see much better if you had some N-rays falling on the piece of paper. Pringsheim repeated these and found that if you didn't know where the paper was, whether it was in front or behind your hand, it worked just as well. That is, you could see your hand just as well if you held it back of the paper as if you held it in front. Which is the natural thing, because this is a threshold phenomenon, and a threshold phenomenon means that you don't know, you really don't know, whether you are seeing it or not. But if you have your hand there, well of course, you see your hand because you know your hand's there, and that's just enough to win you over to where you know that you see it. But you know it just as well if the paper happens to be in front of your hand instead of in back of your hand, because you don't know where the paper is but you do know where your hand is.

4 Symptoms of Pathological Science
Langmuir lists six characteristics of pathological science:

1. The maximum effect that is observed is produced by a causative agent of barely detectable intensity, and the magnitude of the effect is substantially independent of the cause.
2. The effect is of a magnitude that remains close to the limit of detectability, or many measurements are necessary because of the low statistical significance of the results.
3. Claims of great accuracy.
4. Fantastic theories contrary to experience.
5. Criticisms are met by ad hoc excuses thought up on the spur of the moment.
6. Ratio of supporters to critics rises up to somewhere near 50% and then falls gradually to oblivion.
Let us go on to examples from high energy physics and see how these criteria apply. Unfortunately, there are many examples. I have just chosen a few.

5 The Split $A_2$ Meson Resonance
Two CERN experiments, the Missing Mass Spectrometer (MMS) experiment,$^{6}$ and the CERN Boson Spectrometer (CBS) experiment,$^{7}$ claimed that the structure around 1300 MeV, believed to be the $2^{+}$ $A_2$ meson produced in pion proton collisions, did not have a simple Breit-Wigner form, as expected for a short lived resonance, but was in fact divided (or split) into two peaks. Their data are shown in Fig. 3.
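To make the two competing shapes concrete, here is a small Python sketch comparing a simple Breit-Wigner intensity with a double-pole (dipole) intensity. The dipole form used below is a common textbook parametrization of a "split" resonance and is an assumption here; it is not necessarily the function fitted by the experiments, and the width in the usage note is invented for illustration.

```python
def breit_wigner(m, m0, gamma):
    """Simple non-relativistic Breit-Wigner intensity, equal to 1 at m = m0."""
    half = gamma / 2.0
    return half ** 2 / ((m - m0) ** 2 + half ** 2)


def dipole(m, m0, gamma):
    """Double-pole intensity: zero at m0, with equal peaks at m0 +/- gamma/2.

    Normalized so that each of the two peaks has height 1.
    """
    half = gamma / 2.0
    x2 = (m - m0) ** 2
    return 4.0 * half ** 2 * x2 / (x2 + half ** 2) ** 2
```

With an illustrative mass of 1300 MeV and a hypothetical width of 20 MeV, `breit_wigner` peaks at 1300 MeV while `dipole` vanishes there and peaks at 1290 and 1310 MeV, so distinguishing the two shapes depends entirely on a few bins near the center of the peak.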
Figure 3: (a-c) Evidence for $A_2$ splitting in $\pi^- p \to pX$ collisions in the two CERN experiments, (d) same as (c) in 5 MeV bins fit to two hypotheses.
This indeed was a startling result with no obvious explanation. Such a resonance shape could mean new physics. The MMS experiment just observed
the outgoing proton and thus computed the missing mass from the proton and the knowledge of the incident $\pi^-$ beam. The CBS experiment could also observe the decay products of the $A_2$. Bubble chamber experiments also saw evidence for the splitting. Bockman et al. had 5 GeV/c $\pi^+ p$ data.$^{8}$ They looked at the $\rho^0\pi^+$ final state and showed their data for a specific cut on the momentum transfer between the initial and final state proton ($t - t_{\rm min} < 0.1$). Their data show a split (see Fig. 4). Aguilar-Benitez et al.$^{9}$ showed their $K^0 K^{\pm}$ data which they claim fits best to a double pole (see Fig. 5). There were also inconclusive but split-suggestive data from Crennell et al.$^{10}$
Figure 4: Data from Bockman et al. The simple Breit-Wigner has a 20% probability while the double-pole fit gives 63%.
A. Barbaro-Galtieri reviewed the situation at the 1970 Meson Spectroscopy conference.$^{11}$ At the time there was only one publicly available result that directly contradicted the split. These data, from a $\pi^+ p$ bubble chamber experiment done by the Berkeley group (LRL), are compared with the MMS data in Fig. 6.$^{12}$ At this conference doubt was raised about the validity of the split. Others then came forward.$^{13}$ There were new experiments.$^{14}$ By the 1972 Meson Spectroscopy conference, there was no mention of the split. It had vanished into oblivion.$^{15}$ How did this happen? I have heard several possible explanations. In the MMS experiment, I was told that they adjusted the beam energy so the dip always lined up! Another possibility was revealed in a conversation I had with Schübelin, one of the CBS physicists. He said: "The dip was a clear feature. Whenever we didn't see the dip during a run we checked the apparatus and
Figure 5: Data of Aguilar-Benitez et al. (a) All p momenta included (note the suppressed zero), (b) Only the 0.7 GeV/c data. Using (a) only, the fit for double-pole (a) gives 65% likelihood.
always found something wrong." I then asked him if they checked the apparatus when they did see the dip, and he didn't answer. What about the other experiments that did see the dip? Well, there were several experiments that didn't see it. Most people who didn't see it had less statistics or poorer resolution than the CERN experiments, so they just kept quiet. Those that had a small fluctuation toward a dip worked on it until it was publishable; they looked at different decay modes or t intervals, etc. (This is my guess.)

6 The R, S, T and U Bosons
The 1970 Meson Spectroscopy conference contained many new results involving claims for states of relatively narrow width in the 3 GeV mass region. The CBS group also presented evidence of significant peaks, more than four standard deviations, for six new resonances above 2.5 GeV/c. Their data are shown in Fig. 7.$^{16}$ Other groups also saw bumps. At the 1970 conference Miller reported similar structures seen in a 13 GeV/c incident momentum $\pi^+ p$ bubble chamber experiment.$^{17}$ Kalbfleisch gave a review entitled "The T Region," in which he stated "In this review paper I will discuss evidence for mesons in the T region (mass 2.19 GeV). The $T^-$ is well known from the original missing mass spectrometer work at CERN."$^{18}$ Subsequently, there were no further confirmations of these signals. In fact, all of these bumps eventually went away.
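A rough way to see why isolated bumps at the four standard deviation level deserve caution: if a spectrum is scanned over many mass bins, the chance that at least one of them fluctuates upward by a given amount grows with the number of looks. The Python sketch below assumes Gaussian fluctuations, independent bins and a correctly known background; the numbers of trials are invented for illustration and are not taken from the CBS analysis. A hand-drawn background, as used for Fig. 7, would only make the real situation less favorable than this estimate.

```python
import math


def one_sided_p_value(n_sigma):
    """Probability that a Gaussian fluctuates upward by at least n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


def global_p_value(n_sigma, n_trials):
    """Chance of at least one such upward fluctuation in n_trials independent looks."""
    p_local = one_sided_p_value(n_sigma)
    # 1 - (1 - p)^n, written with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(n_trials * math.log1p(-p_local))


if __name__ == "__main__":
    for n_trials in (1, 100, 1000):
        print(n_trials, global_p_value(4.0, n_trials))
```

The number of effective trials also includes every cut, decay mode and binning that was tried before a plot was shown, which is usually not recorded.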
Figure 6: Comparison of MMS and LRL data at similar incident pion momenta, (a) MMS, resolution $\Gamma/2 = 8.0$ MeV, (b) LRL, resolution $\Gamma/2 = 6.4$ MeV.
7 The F or $D_s$ Meson
The spin-0 meson formed from $c\bar{s}$ constituent quarks was called at first the F, but was renamed by the Particle Data Group as the $D_s$. The name was not changed to protect the innocent.$^{19}$ The F decays mostly by having the c quark transform to an s quark and a virtual $W^+$ boson. In the simplest case, the $W^+$ manifests itself as a $\pi^+$ and the s quark combines with the original $\bar{s}$ quark to form a $\phi$ or $\eta$ meson. In 1977 the DASP group working at the DORIS $e^+e^-$ storage ring at DESY found a handful of events at a center-of-mass energy of 4.42 GeV that they classified as coming from the reaction $e^+e^- \to F^+F^-\gamma$, where the $\gamma$ and one of the F's formed an $F^*$, the spin-1 state. One $F^\pm$ candidate was required to decay into an $\eta\pi^\pm$, while the other F was not reconstructed.$^{20}$ The $\eta \to \gamma\gamma$ channel was used. The data were fit to the $FF^*$ hypothesis requiring that both the $F^+$ and the $F^-$ have the same mass. Their results are shown in Fig. 8. At a center of
Figure 7: Compiled spectrum obtained by adding the spectra in different mass regions with the hand-drawn background subtracted. The arrows indicate the position of known or suspected particles.
mass energy of 4.42 GeV they observe a cluster of events in $\eta\pi^\pm$ mass above 2 GeV, and no such cluster at other energies. They also observed an increase in the production of $\eta$ mesons in the 4.42 GeV region. (More on this later.) The F
Figure 8: (Top) Fitted $\eta\pi^\pm$ mass versus fitted recoil mass assuming $e^+e^- \to FF^*$, where $F^* \to F\gamma$, and $F \to \eta\pi^\pm$, at (left) $E_{\rm cm} = 4.42$ GeV and (right) at all other energies excluding 4.42 GeV. At the bottom are the fitted $M(\eta\pi)$ projections.
mass had an ambiguity because the low momentum photon from the $F^*$ decay could be associated either with the F that decayed into $\eta\pi$ or with the F that wasn't reconstructed. Thus they found two possible mass values for the F, 2040±10 MeV or 2000±40 MeV. The generally quoted value was 2020-2030 MeV. There was however a disturbing aspect of this result. The Crystal Ball group, operating at SPEAR, checked the level of $\eta$ production in the same center-of-mass energy region.$^{21}$ A comparison of their result with the DASP result is shown in Fig. 9. Crystal Ball, which had a far superior ability to detect photons compared to DASP, did not find the increase in $\eta$ production at 4.42 GeV that DASP claimed. Without an increase in $\eta$ production it's hard to see why $FF^*$ would be produced at 4.42 GeV and not at other energies. Unfortunately the Crystal Ball experiment did not have charged track momentum analysis and could not look for F's. In 1981 a group using the CERN Omega Spectrometer with a 20-70 GeV
Figure 9: $R_\eta = \sigma(e^+e^- \to \eta X)/$ [Panels: Crystal Ball and DASP results versus $E_{\rm cm}$ (GeV).]
photon beam found evidence for the F meson at 2020±10 MeV in several different decay modes. They used two sets of selection criteria to record data, denoted T1 and T2.$^{22}$ T1 required a minimum of four charged particles at a plane 1.5 m downstream of the target center. T2 required a photon with transverse momentum greater than 800 MeV and at least one charged track leaving the target. They show results for selected decay modes containing $\eta \to \gamma\gamma$ in Fig. 10. In Fig. 11 they select candidates for $\eta' \to \pi^+\pi^-\eta$ from the $\eta 5\pi$ and $\eta 3\pi$ samples. They see a signal in $\eta' 3\pi$ and nothing in $\eta'\pi$.
In a 1983 paper they presented results based on a different trigger where they required one photon with energy greater than 2 GeV and forward charged multiplicity between 2 and 5.$^{23}$ Their data are shown in Fig. 12. The F mass value was virtually unchanged. These newer signals are quite weak, as was a signal they found in
Figure 10: Mass spectra of (a) $\eta\pi^\pm\pi^+\pi^-$ from T1, (b) $\eta\pi^\pm$ from T1, (c) $\eta\pi^\pm\pi^+\pi^-\pi^+\pi^-$ from T1, and (d) $\eta\pi^\pm$ from T2. The curves are polynomial plus Breit-Wigner fits.
Now there are several significant problems here comparing $D^0$ production with $F^\pm$ production. First of all, the product of the branching ratio times production cross section ($\sigma \cdot B$) for $F^\pm \to \eta\pi^\pm$ is more than twice as large as for $D^0 \to K^-\pi^+$. It is expected that the F, being a charmed-strange meson, would be produced approximately 15% as often as the charmed-light quark combination. While the two-body branching ratios cannot be accurately predicted, the $\eta\pi$ would have to be about 14 times larger in the F than the $K^-\pi^+$ was in the $D^0$. Since it was known that $K^-\pi^+$ was about 3%, this would have required a ~40% branching ratio for $F \to \eta\pi$. Secondly, the production mechanism for the D had been shown to be mostly associated production, where $\gamma p \to D\Lambda_c X$, while in the F data the production was mostly $\gamma p \to F^+F^-X$. Why should the F production mechanism be so different? In any case the accepted F mass now was 2020±10 MeV, having now been "confirmed" by the CERN Omega data. In July of 1980 one $F^+$ decay was detected in emulsion reactions from a neutrino beam, in the heretofore unknown decay mode $\pi^+\pi^+\pi^-\pi^0$. The mass was determined to be 2017±25 MeV, and a lifetime of $1.4\times10^{-13}$ s measured.$^{26}$ Later in Sept. of that year another neutrino emulsion experiment$^{27}$ measured
Figure 11: Mass spectra of (a) $\eta'\pi^\pm\pi^+\pi^-$, a subset of (c) in Fig. 10, (b) $\eta'\pi^\pm$, a subset of (a) in Fig. 10. The curves are polynomial plus Breit-Wigner fits.
a lifetime with two events. (Footnote a: These experiments prompted Lou Hand later to state: "The F was the first particle whose lifetime was measured before it was discovered.") In 1983 the CLEO experiment presented results that showed strong evidence for the F at 1970±5±5 MeV. The evidence consisted of a mass peak containing 104±19 events in the $\phi\pi^\pm$ decay mode shown in Fig. 13, the helicity distribution of the $\phi$, that showed the expected $\cos^2\theta$ decay angular distribution, and a $\sigma \cdot B$ that was 1/3 of the $\sigma \cdot B$ for $D^0 \to K^-\pi^+$.$^{28}$ I can assure you that the CLEO collaboration was not easily persuaded by myself and Yuichi Kubota that our results were right because they contradicted the previous experiments shown above. We were forced to go through experiment by experiment and detail what might have gone wrong. (Thus the material for this section was created.) The CLEO result was quickly confirmed by the TASSO,$^{29}$ ACCMOR,$^{30}$ and ARGUS experiments.$^{31}$ The Particle Data Group subsequently chose to rename the F as the $D_s$, a logical choice. What lessons are we to learn from this story? DASP based their "discovery" on the observation of increased $\eta$ production at 4.42 GeV and a handful of events where one F was reconstructed in $\eta\pi^\pm$ and the other not reconstructed. Presumably they searched many energies and several final states including $F^+F^-$ and $F^{*+}F^{*-}$, reporting a signal in only one case. However,
Figure 12: Mass spectra of $\eta(n\pi^\pm)$ from the CERN Omega spectrometer. The curves are polynomial plus Gaussian fits.
the statistical significance is often viewed by not considering all the searches that yielded no signal. The CERN Omega spectrometer results all were of very marginal accuracy and didn't fit together very well. The F production mechanism was completely different than that for D's, the rates for $\eta\pi^\pm$ were more than 7 times larger than for $\phi\pi^\pm$, yet these discrepancies were never addressed. We note that none of their results for the relative branching ratios are consistent with current measurements.$^{32}$ For example, $\eta\pi^\pm$ is about half of $\phi\pi^\pm$ and $\phi\rho^\pm$ is twice $\phi\pi^\pm$. Apparently their data showed marginal signals at 2020 MeV and the DASP result was sufficient to push them over the edge in giving credibility to their fluctuations. The neutrino experiments now had the DASP and CERN Omega results to fall back on. Doubtless if they hadn't believed the 2020 MeV value for the mass they may not have shown their results; on the other hand, they could have been more conservative. The DASP, Omega and neutrino results satisfy Langmuir's first criterion: "The maximum effect that is observed is....of barely detectable intensity...." Even more so the second criterion is fully satisfied, especially by the Omega
Table 1: CERN Omega Spectrometer photo-production yields for F mesons into $\eta$, $\eta'$ or $\phi$ and D modes into kaons.
Mode
Trigger
Efficiency
Tl Tl T2 Tl Tl T3 T3 T3 T4 T4 T4
10 5 3 12 6
4(7
1.83 0.85 0.87
17±6 14±9 20±11
38±14 66±42 93±52
3(7
33±10
60±17 K'-TT+TT 63±19 66±19 K°Tr+irEfficiencies include r;' ' decay fractions. Upper limits are at 3(7.
13.5± 4 108±33 39±11
+
±
7J7V 7V~ 7V TJ'TT^
TV~ ir^
•qit^ •qtr^ TJ'TT^ TJTV^ ?77r 0 7r : ^ 7]7T~*~ 7T~ TT^
+
0
# of events or significance 3(7
5a
B X (7 (nb) 60±15 20± 8 27± 7 <45 <30
<4 <15 42 5 17
result: "The effect is of a magnitude that remains close to the limit of detectability, or many measurements are required because of the low statistical significance of the results." The neutrino experiments satisfied his third criterion, "Claims of great accuracy," in that they measured lifetimes! Criteria (4) and (5) did not come into play although (4) should have been invoked to explain production yields and relative branching ratios. Criterion (6), "Ratio of supporters to critics ..." was funny in that most people believed the DASP and Omega results until the CLEO result came out and then suddenly no one believed them.
8 Conclusions
Be suspicious of new results. Think them over and if they don't make sense then doubt them. It doesn't mean they are wrong, just not proven. Sometimes it's difficult to know when something is right or wrong. It is also difficult to find out exactly what went wrong unless you are directly involved with an experiment or had the opportunity to visit and question as Langmuir had. Even Langmuir found it difficult to figure out the process. From Langmuir: "I don't know what it is. That's the kind of thing that happens in all these. All the people who had anything to do with these things find that when you're through with them some things are inexplicable. You can't account for Bergen Davis saying that they didn't calculate those things from
Figure 13: (a) Mass spectra of $\phi\pi^\pm$ from CLEO. (b) The $\phi$ candidates are chosen to be above or below the $\phi$ signal region.
the Bohr theory, that they were found by empirical methods without any idea of the theory. Barnes made the experiments, brought them in to Davis, and Davis calculated them up and discovered all of a sudden that they fit the Bohr theory. He said Barnes didn't have anything to do with that. Well, take it or leave it. How did he do it? It's up to you to decide. I can't account for it. All I know is that there was nothing salvaged at the end, and therefore none of it was ever right, and Barnes never did see a peak. You can't have a thing halfway right." As a final note, Prof. Roodman at this school described how some current experiments, KTeV, BaBar and Belle, have ensured that the final answer to their most important measurements is actually hidden from the data analyzers until they are satisfied that all systematic checks have been performed. 33 In my view this is a useful technique and should be employed more often. Another method that has been employed is to have different groups within a collaboration obtain their results independently. Hopefully reviewing these painful lessons will help others avoid the same pitfalls.
9 Acknowledgements
I would like to thank Tom Ferbel for showing me Langmuir's paper long ago. Thanks to K. T. Mahanthappa, H. Murayama and J. Rosner for organizing a very interesting school and inviting me to participate. Ray Mountain and Jon Rosner helped greatly by carefully reading and editing this paper.
1. I. Langmuir, "Pathological Science," General Electric (Distribution Unit, Bldg. 5, Room 345, Research and Development Center, P. O. Box 8, Schenectady, NY 12301), 68-C-035 (1968).
2. A. H. Barnes, Phys. Rev. 35, 217 (1930).
3. B. Davis and A. H. Barnes, Phys. Rev. 37, 1368 (1931).
4. R. Blondlot, The N-rays, Longmans, Green and Co., London, England (1905).
5. R. W. Wood, Nature 72, 195 (1904); R. W. Wood, Physik. Z. 5, 789 (1904).
6. G. E. Chikovani et al., Phys. Lett. 25B, 44 (1967).
7. H. Benz et al., Phys. Lett. 28B, 233 (1968).
8. K. Bockman et al., Nucl. Phys. B 16, 221 (1970).
9. M. Aguilar-Benitez et al., Phys. Lett. B 29, 62 (1969).
10. D. J. Crennell et al., Phys. Rev. Lett. 20, 1318 (1968).
11. A. Barbaro-Galtieri, "The A2 and the 2+ Nonet," in Experimental Meson Spectroscopy, ed. C. Baltay and A. H. Rosenfeld (Columbia Univ. Press, New York, 1970).
12. The LBL data were published in M. Alston-Garnjost et al., Phys. Lett. B 33, 607 (1970).
13. K. J. Foley et al., Phys. Rev. Lett. 36, 413 (1971); G. Grayer et al., Phys. Lett. B 34, 333 (1971).
14. D. Bowen et al., Phys. Rev. Lett. 26, 1663 (1971).
15. Experimental Meson Spectroscopy - 1972, ed. A. H. Rosenfeld and K. W. Lai (American Institute of Physics, New York, 1972).
16. R. Baud et al., "Charged Non-strange Bosons with Masses Higher Than 2.5 GeV," in Experimental Meson Spectroscopy, ed. C. Baltay and A. H. Rosenfeld (Columbia Univ. Press, New York, 1970).
17. D. H. Miller, "Comparison of the CERN Boson Spectrometer Results with a π+p Experiment at 13 GeV/c in a Bubble Chamber," in Experimental Meson Spectroscopy, ed. C. Baltay and A. H. Rosenfeld (Columbia Univ. Press, New York, 1970).
18. G. Kalbfleisch, "The T Region," in Experimental Meson Spectroscopy, ed. C. Baltay and A. H. Rosenfeld (Columbia Univ. Press, New York, 1970).
19. A famous U. S. television show from the 1950's, "Dragnet," ended by saying "...the names were changed to protect the innocent."
20. R. Brandelik et al., Phys. Lett. B 70, 132 (1977); ibid., Phys. Lett. B 67, 243 (1977); ibid., Phys. Lett. B 34, 358 (1977); ibid., Phys. Lett. B 80, 412 (1979).
21. F. C. Porter, "Measurement of Inclusive η Production in e+e− Interactions Near Charm Threshold," in High Energy Physics-1980, XX Int. Conf., ed. L. Durand and L. Pondrom, American Institute of Physics, New York (1981) p. 380.
22. D. Aston et al., Phys. Lett. B 100, 91 (1981).
23. M. Atkinson et al., Z. Phys. C 17, 1 (1983).
24. D. Aston et al., Nucl. Phys. B 189, 205 (1981).
25. D. Aston et al., Phys. Lett. B 94, 113 (1980).
26. R. Ammar et al., Phys. Lett. B 100, 118 (1980).
27. N. Ushida et al., Phys. Rev. Lett. 45, 1053 (1980).
28. A. Chen et al., Phys. Rev. Lett. 51, 634 (1983).
29. M. Althoff et al., Phys. Lett. B 136, 130 (1984).
30. R. Bailey et al., Phys. Lett. B 139, 320 (1984).
31. H. Albrecht et al., Phys. Lett. B 146, 111 (1984); H. Albrecht et al., Phys. Lett. B 153, 343 (1984).
32. Particle Data Group, Eur. Phys. J. C 15, 1 (2000).
33. A. Roodman, "Results from Asymmetrical e+e− Collisions," in Theoretical Advanced Study Institute in Elementary Particle Physics, Boulder, CO, June, 2000.
Elizabeth H. Simmons
TOP PHYSICS
ELIZABETH H. SIMMONS
Department of Physics, Boston University, 590 Commonwealth Avenue, Boston, MA 02215, USA
and
Radcliffe Institute for Advanced Study, Harvard University, 34 Concord Avenue, Cambridge, MA 02138
e-mail: [email protected]
The Run I experiments at the Fermilab Tevatron Collider discovered the top quark and provided first measurements of many of its properties. Run II (and eventually the LHC and NLC experiments) promise to extend our knowledge of the top quark significantly. Understanding the top quark's large mass, and indeed the origin of all mass, appears to require physics beyond the Standard Model. Thus, the top quark may have unusual properties accessible to upcoming experiments.
1 Within the Standard Model
The three-generation Standard Model (SM) of particle physics came into existence with the discoveries of the tau lepton 1 and b quark. 2 Completing the model required a weak partner for b. Several important properties of this hypothetical "top" quark could be deduced from measurements of bottom quark characteristics. The electric charge of the b quark was related to the ratio
$$R = \frac{\sigma(e^+e^- \to \text{hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-)} = \sum_q \left(3 Q_q^2\right). \qquad (1)$$
The increment in the measured 3 value $\delta R_{expt} = 0.36 \pm 0.09 \pm 0.03$ at the b threshold agreed with the predicted $\delta R_{SM} = \frac{1}{3}$, confirming $Q_b = -\frac{1}{3}$. Likewise, data 4 on the front-back asymmetry for electroweak b-quark production
$$A_{FB} = \frac{\sigma(b, \theta > 90^\circ) - \sigma(b, \theta < 90^\circ)}{\sigma(b, \theta > 90^\circ) + \sigma(b, \theta < 90^\circ)}, \qquad (2)$$
where $\theta$ is the angle between the incoming electron and outgoing b quark, showed $A_{FB}^{expt} = -(22.8 \pm 6.0 \pm 2.5)\%$ while $A_{FB}^{SM} = -0.25$ was predicted. Since the $Zb\bar{b}$ coupling depends on the weak isospin of the b quark, the measurement confirmed that $T_{3b} = -\frac{1}{2}$. Therefore, the b quark's weak partner in the SM was required to be a color-triplet, spin-$\frac{1}{2}$ fermion with electric charge $Q = \frac{2}{3}$ and weak charge $T_3 = \frac{1}{2}$. Such a particle is readily pair-produced by QCD processes involving quark/anti-quark annihilation or gluon fusion, as illustrated in Figure 1. At
Figure 1. Feynman diagrams for QCD pair-production of top quarks.
the Tevatron's collision energy $\sqrt{s} = 1.8$ TeV, a 175 GeV top quark is produced 90% through $q\bar{q} \to t\bar{t}$ and 10% through $gg \to t\bar{t}$; at the LHC with $\sqrt{s} = 14$ TeV, the opposite will be true. This is because the incoming partons must carry a momentum fraction of order $m_t/E_{beam}$, a large fraction at the Tevatron and a small one at the LHC, and because the parton distribution function of gluons is softer than that of valence quarks. Note that had the size of $m_t$ been different, weak (single top) production would have rivaled QCD (pair) production: for $m_t \sim 60$ GeV, the process $q\bar{q}' \to W \to t\bar{b}$ is competitive while for $m_t \sim 200$ GeV, $Wg \to t\bar{b}$ dominates. 5 In the three-generation SM, the top quark decays primarily to $W + b$ because $|V_{tb}| \approx 1$. As the W can decay into leptons or hadrons, there are three main classes of final states from top pair production. In the "dilepton" events (5% of all $t\bar{t}$ events), both W's decay to $\ell\nu_\ell$ (where $\ell = e, \mu$) and the event includes two b-jets, two leptons and missing energy from two neutrinos. In the "lepton+jets" events (30%), there are two b-jets, two other jets from W decay, one energetic lepton, and missing energy. The "all jets" events (44%) have multiple jets (including 2 b-jets) and no hard leptons. The remaining 21% of events would include tau leptons, which are harder to identify in high-energy hadron collider experiments. In 1995, the CDF 6 and DØ 7 experiments at Fermilab discovered a new particle answering the above description and having a pair-production cross-section consistent with that predicted for a SM top quark with $m_t = 175$ GeV. During Tevatron Run I, each experiment gathered $\approx 125\ \text{pb}^{-1}$ of integrated luminosity, measured some top quark properties in detail and took a first look at others. In this section of the talk, we will review the measured characteristics of the top quark, considered primarily as a Standard Model particle.
We will discuss the Run I results on the top quark mass, width, pair and single production cross-sections, spin correlations, and decays. We will also describe the increases in measurement precision anticipated at Run II and future accelerators and discuss what we hope to learn. Another useful reference on this topic is ref.
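The final-state fractions quoted earlier in this section (5%, 30%, 44% and 21%) follow from simple counting of W decay modes. The sketch below is only an illustration: it assumes leading-order W branching fractions of 1/9 per lepton flavour and 6/9 to hadrons (an assumption of this example, not a measured input), and the function name is hypothetical.

```python
from fractions import Fraction

# Assumed leading-order W branching fractions: each of e, mu, tau gets 1/9,
# and the two open quark doublets (ud, cs), with three colours each, share 6/9.
BR_W_TO_ONE_LEPTON_FLAVOUR = Fraction(1, 9)
BR_W_TO_HADRONS = Fraction(6, 9)


def ttbar_channel_fractions():
    """Return the fraction of t tbar events in each final-state class."""
    br_e_or_mu = 2 * BR_W_TO_ONE_LEPTON_FLAVOUR  # W -> e nu or W -> mu nu
    dilepton = br_e_or_mu ** 2                   # both W's decay to e or mu
    lepton_plus_jets = 2 * br_e_or_mu * BR_W_TO_HADRONS  # either W is leptonic
    all_jets = BR_W_TO_HADRONS ** 2              # both W's decay hadronically
    with_tau = 1 - dilepton - lepton_plus_jets - all_jets  # at least one tau
    return {
        "dilepton": dilepton,
        "lepton+jets": lepton_plus_jets,
        "all jets": all_jets,
        "with tau": with_tau,
    }


if __name__ == "__main__":
    for name, frac in ttbar_channel_fractions().items():
        print(f"{name:12s} {float(frac):.1%}")
```

Exact fractions are used so that the four classes sum to one by construction; rounding them gives the percentages quoted in the text.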
Table 1. Measured 11 $m_t$ and $\sigma_{tt}$ from CDF and DØ.
experiment   channel         m_t (GeV)       σ_tt (pb)
CDF          dilepton        167.4 ± 11.4    8.4 +4.5 −3.5
             lepton + jets   175.9 ± 7.1     5.1 ± 1.5 (SVX b-tag), 9.2 ± 4.3 (soft lepton tag)
             all jets        186.0 ± 11.5    7.6 +3.5 −2.7
             combined        176.0 ± 6.5     6.5 +1.7 −1.4 (m_t = 175)
DØ           dilepton        168.4 ± 12.8
             lepton + jets   173.3 ± 7.8     4.1 ± 2.1 (topological), 8.3 ± 3.6 (soft lepton tag)
             all jets                        7.1 ± 3.2
             combined        172.1 ± 7.1     5.9 ± 1.7 (m_t = 172)
Tevatron     combined        174.3 ± 5.1
1.1 Mass
The top quark mass has been measured 9,10 by reconstructing the decay products of top pairs produced at the Tevatron. The most precise measurements use the lepton+jets decay channel, which affords both a large top branching fraction and full event reconstruction. The combined measurement from CDF and DØ is $m_t = 174.3 \pm 5.1$ GeV, as shown in Table 1. This implies that the top Yukawa coupling $\lambda_t = 2^{3/4} G_F^{1/2} m_t$ is approximately 1, so that the top is the only quark to have a Yukawa coupling of "natural" size. The top quark's mass is already known to ±3%, comparable to the precision with which $m_b$ is measured and better than that for the light quarks. 12 This is quite impressive given that the top quark was discovered nearly 20 years after the bottom! This precision is also quite useful in interpreting other measurements because many electroweak observables are subject to radiative corrections sensitive to $m_t$.
Figure 2. Examples of SM radiative corrections sensitive to $m_t$: (left) $\Delta\rho$ (right) $Zb\bar{b}$.
Figure 3. Predicted 13 $M_W(m_h, m_t)$ in the SM compared 14 to data on $M_W$, $m_t$.
As illustrated in Figure 2, for example, the W mass (which enters $\Delta\rho$) and the $Zb\bar{b}$ coupling (which enters $R_b$) are affected by virtual top quarks. Comparing the experimental constraints on $M_W$ and $m_t$ with the SM prediction 13 for $M_W(m_t, m_{Higgs})$ provides an opportunity to test the consistency of the SM and to constrain $m_{Higgs}$. As Figure 3 shows, the current data are suggestive, but not precise enough to provide a tightly-bounded value for $m_{Higgs}$. Run II measurements of the W and top masses are expected 11 to yield $\delta M_W \approx 40$ MeV (per experiment) and $\delta m_t \approx 3$ GeV (1 GeV in Run IIb or LHC). With this precision, it should be possible to obtain a much tighter bound 11 on the SM Higgs mass: $\delta M_H/M_H \lesssim 40\%$. A far more precise measurement, with $\delta m_t \approx 150$ MeV, could in principle be extracted from near-threshold NLC 15 data on $\sigma(e^+e^- \to t\bar{t})$. The calculated line shape shows a distinct rise at the remnant of what would have been the toponium 1S resonance if the top did not decay so quickly. The location of the rise depends on $m_t$; the shape and size, on the decay width $\Gamma_t$. This measurement has the potential for good precision because it is based on counting color-singlet $t\bar{t}$ events, making it relatively insensitive to QCD uncertainties. Taking advantage of this requires a careful choice of the definition of $m_t$ used to extract information from the data. Consider, for example, the mass appearing in the propagator $D(\not\!p) = i/(\not\!p - m_R - \Sigma(\not\!p))$. In principle, one can
Figure 4. (left) Top production and decay, (center) Same, with b-quark hadronization indicated, (right) Soft gluon resummation in the top propagator. After ref. 16.
Figure 5. Near-threshold cross-section for photon-induced top production at an NLC 22 calculated (left) in the pole mass scheme and (right) in the 1S mass scheme. Leading-order (dotted), NLO (dashed) and NNLO (solid) curves are shown with renormalization scales $\mu = 15$ (topmost), 30, and 60 GeV. The horizontal axis is $\sqrt{q^2}$ in GeV.
reconstruct this mass from the four-vectors of the top decay products, as is done in the current Tevatron measurements. But this pole mass is inherently uncertain to $O(\Lambda_{QCD})$. For example, the clean top production and decay process sketched in Figure 4 (left) is, in reality, complicated by QCD hadronization effects which connect the b-quark from top decay to other quarks involved in the original scattering, 16 as in Figure 4 (center). Attempting to sum the soft-gluon contributions to the top propagator sketched in Figure 4 (right) yields the same conclusion. Taking the Borel transform of the self-energy allows one to effect the summation, 17 but real-axis singularities (infrared renormalons) in the Borel-transformed self-energy impede efforts to invert the transform. 16 The ambiguity introduced in distorting the integration contour of the inverse Borel transform around the singularities is of order 18 $\Lambda_{QCD}$.
Using a short-distance mass definition avoids these difficulties. 19 For example, one could adopt the $\overline{MS}$ mass definition
$$\bar{m}(\bar{m}) = m_{pole}\left(1 + \frac{4\alpha_s}{3\pi} + 8.3\left(\frac{\alpha_s}{\pi}\right)^2 + \ldots\right)^{-1} \qquad (3)$$
although the numerical value lies about 10 GeV below $m_{pole}$, which is inconvenient for data analysis. Another is the 1S mass 22
$$m_{1S} = m_{pole} - \frac{2}{9}\alpha_s^2 m_{pole} + \ldots \qquad (4)$$
where $2m_{1S}$ is naturally near the peak of $\sigma(e^+e^- \to t\bar{t})$. Others include the potential-subtracted 20 or kinetic 21 masses. Figure 5 compares the photon-induced $t\bar{t}$ cross-section near threshold as calculated in the pole mass and 1S mass schemes (for $m_t = 175$ GeV and $\Gamma_t = 1.43$ GeV). In the pole mass scheme, the location and height of the peak vary with renormalization scale and order in perturbation theory; this choice introduces QCD uncertainties into what should be a color-singlet process. Using the short-distance mass renders the peak location stable and large higher-order corrections are avoided.
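To get a feeling for the size of the shifts between mass definitions, the following sketch evaluates the two conversions numerically. It assumes that Eq. (3) has the form $\bar{m} = m_{pole}/(1 + 4\alpha_s/3\pi + 8.3(\alpha_s/\pi)^2)$ and that Eq. (4) reads $m_{1S} = m_{pole}(1 - \frac{2}{9}\alpha_s^2)$, both truncated at the orders shown. The values of $\alpha_s$ are illustrative assumptions, not numbers taken from the text, and the function names are hypothetical.

```python
import math


def msbar_mass_from_pole(m_pole, alpha_s):
    """MS-bar mass from the pole mass, truncating the series after alpha_s^2."""
    a = alpha_s / math.pi
    return m_pole / (1.0 + (4.0 / 3.0) * a + 8.3 * a ** 2)


def one_s_mass_from_pole(m_pole, alpha_s):
    """1S mass from the pole mass, keeping only the leading alpha_s^2 term."""
    return m_pole - (2.0 / 9.0) * alpha_s ** 2 * m_pole


if __name__ == "__main__":
    m_pole = 175.0           # GeV, the reference value used in the text
    alpha_s_at_mt = 0.108    # assumed value of alpha_s near the top mass scale
    alpha_s_soft = 0.14      # assumed value at a lower scale relevant near threshold

    m_msbar = msbar_mass_from_pole(m_pole, alpha_s_at_mt)
    m_1s = one_s_mass_from_pole(m_pole, alpha_s_soft)
    print(f"pole - MSbar = {m_pole - m_msbar:.1f} GeV")
    print(f"pole - 1S    = {m_pole - m_1s:.2f} GeV")
```

With these assumed inputs the $\overline{MS}$ mass comes out several GeV below the pole mass, in line with the "about 10 GeV" quoted above, while the 1S mass stays within about a GeV of it.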
1.2 Top Decay Width
In the 3-generation SM, data on the lighter quarks combined with CKM matrix unitarity imply 12 $0.9991 < |V_{tb}| < 0.9994$. Thus the top decays almost exclusively through $t \to Wb$. At tree level, in the approximation where $M_W = m_b = 0$ and setting $|V_{tb}| = 1$, the decay width is
$$\Gamma_0(t \to Wb) = \frac{G_F m_t^3}{8\pi\sqrt{2}} = 1.76\ \text{GeV}. \qquad (5)$$
More precise calculations yield similar results. Including $M_W \neq 0$ gives
$$\Gamma_t/|V_{tb}|^2 = \Gamma_0\left(1 - 3\frac{M_W^4}{m_t^4} + 2\frac{M_W^6}{m_t^6}\right) = 1.56\ \text{GeV}, \qquad (6)$$
while including the b-quark mass and radiative corrections refines this to 23 $\Gamma_t/|V_{tb}|^2 = 1.42$ GeV. As a result, the top decays in $\tau_t \approx 0.4 \times 10^{-24}$ s. Since this is appreciably shorter than the characteristic QCD time scale $\tau_{QCD} \approx 3 \times 10^{-24}$ s, the top quark decays before it can hadronize. Therefore, unlike the b and c quarks, which offer rich spectra of bound states for experimental study, the top quark is not expected to provide any interesting spectroscopy. A precise measurement of the top quark width could, in principle, be made at an NLC running at $\sqrt{s} \approx 350$ GeV by exploiting the fact that $\Gamma_t$ controls the threshold peak height in $\sigma(e^+e^- \to t\bar{t})$. Until recently, the NNLO calculations were plagued by a 20% normalization ambiguity which made the realization of this aim uncertain; 19 preliminary new results 24 suggest the issue has been favorably resolved.
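The numbers in Eqs. (5) and (6) can be checked with a few lines of arithmetic. The sketch below assumes $G_F = 1.16637 \times 10^{-5}\ \text{GeV}^{-2}$, $M_W = 80.4$ GeV and $\hbar = 6.582 \times 10^{-25}$ GeV s as inputs (standard values supplied here for illustration, not quoted in the text); the function names are hypothetical.

```python
import math

G_F = 1.16637e-5      # Fermi constant in GeV^-2 (assumed input)
M_W = 80.4            # W mass in GeV (assumed input)
HBAR = 6.582e-25      # hbar in GeV * s (assumed input)


def width_massless_w(m_t):
    """Tree-level width with M_W = m_b = 0 and |V_tb| = 1, as in Eq. (5)."""
    return G_F * m_t ** 3 / (8.0 * math.pi * math.sqrt(2.0))


def width_with_w_mass(m_t, m_w=M_W):
    """Tree-level width including the W mass, as in Eq. (6)."""
    r = (m_w / m_t) ** 2  # r = M_W^2 / m_t^2
    return width_massless_w(m_t) * (1.0 - 3.0 * r ** 2 + 2.0 * r ** 3)


def lifetime_seconds(width_gev):
    """Mean lifetime tau = hbar / Gamma."""
    return HBAR / width_gev


if __name__ == "__main__":
    m_t = 175.0
    print(f"Gamma_0           = {width_massless_w(m_t):.2f} GeV")
    print(f"Gamma (M_W != 0)  = {width_with_w_mass(m_t):.2f} GeV")
    print(f"tau for 1.42 GeV  = {lifetime_seconds(1.42):.2e} s")
```

The lifetime obtained from the fully corrected width of 1.42 GeV is a few times $10^{-25}$ s, which is the comparison with $\tau_{QCD}$ that the argument above relies on.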
Figure 6. Invariant mass distribution 10 for top pairs: DØ data (histogram), simulated background (triangles), simulated S + B (dots). In (a) $m_t$ unconstrained; in (b) $m_t = 173$ GeV. The horizontal axis is $m_{t\bar{t}}$ in GeV/c².
Figure 7. Light dijet invariant mass distribution 28: prediction (solid) and DØ data (dots).
1.3 Pair Production
The top pair production cross-section has been measured in all available channels by CDF 25 and DØ. 28 As with $m_t$, the lepton+jets channel, with its combination of statistics and full reconstruction, gives the single most precise measurement (see Table 1). The combined average of $\sigma_{tt}(m_t = 172\ \text{GeV}) = 5.9 \pm 1.7$ pb is consistent with SM predictions including radiative corrections. 27 Initial measurements of the invariant mass ($M_{tt}$) and transverse momentum ($p_T$) distributions of the produced top quarks have been made, as shown in Figure 6. While a comparison with the measured $M_{jj}$ distribution for QCD dijets (Figure 7) illustrates how statistics-limited the Run I top sample is, some preliminary limits on new physics are being extracted. It has been noted, e.g., that a narrow 500 GeV Z' boson is inconsistent with the observed shape of the high-mass end of CDF's $M_{tt}$ distribution. 29 The $p_T$ distribution for the hadronically-decaying top in fully-reconstructed lepton + jets events
Figure 8. $p_T$ distribution for hadronically-decaying tops in lepton+jets events from CDF. 3 The plot shows the data, the Standard Model $t\bar{t}$ plus background expectation, and the estimated background.
(Figure 8) constrains non-SM physics that would increase the number of high-$p_T$ events. The measured fraction $R_4 = 0.000$ (with asymmetric statistical and systematic uncertainties) of events in the highest $p_T$ bin ($225 < p_T < 300$ GeV) implies 30 a 95% c.l. upper bound $R_4 < 0.16$ as compared with the SM prediction $R_4 = 0.025$. In Run II, the $\sigma_{tt}$ measurement will be dominated by systematic uncertainties; the collaborations will use the large data sample to reduce reliance on simulations. 31 Acceptance issues such as initial state radiation, the jet energy scale, and the b-tagging efficiency will be studied directly in the data. The background uncertainty for the lepton+jets mode will be addressed by measuring the heavy-flavor content of W+jets events. It is anticipated 31 that an integrated luminosity of 1 (10, 100) $\text{fb}^{-1}$ will enable $\sigma_{tt}$ to be measured to ±11 (6, 5)%. The $M_{tt}$ distribution will then constrain $\sigma\cdot B$ for new resonances decaying to $t\bar{t}$ as illustrated in Figure 9.
1.4 Spin Correlations
When a $t\bar{t}$ pair is produced, the spins of the two fermions are correlated. 32 This can be measured at lepton or hadron colliders, and provides another means of testing the predictions of the SM or looking for new physics. One starts from the fact that the top quark decays before its spin can flip. 33 The spin correlations between $t$ and $\bar{t}$ therefore yield angular correlations among their decay products. If $\chi$ is the angle between the top spin and the momentum of a given decay product, the differential top decay rate (in the
Figure 9. Anticipated 31 Run II limits on $\sigma\cdot B(X \to t\bar{t})$: the minimum $\sigma\cdot B(X \to t\bar{t})$ for a resonance to be observed at the $5\sigma$ level, as a function of the resonance mass (GeV/c²), for 1, 10 and 100 $\text{fb}^{-1}$, compared with a TopColor Z' with $\Gamma = 1.2\%$ and $\Gamma = 10\%$.
Figure 10. Definitions of the off-diagonal basis and decay lepton angles for studying top spin correlations. The helicity, beamline and off-diagonal bases are indicated.
top rest frame) is
$$\frac{1}{\Gamma}\frac{d\Gamma}{d\cos\chi} = \frac{1}{2}\left(1 + \alpha\cos\chi\right). \qquad (7)$$
The factor $\alpha$ is computed 34 to be 1.0 (0.41, -0.31, -0.41) if the decay product is $\ell$ or $d$ ($W$, $\nu$ or $u$, $b$). A final-state lepton is readily identifiable and has the largest value of $\alpha$; thus, dilepton events are best for studying $t\bar{t}$ spin correlations. Choosing a good basis along which to project the spin variables is key to extracting information from the data. For example, consider $e^+e^- \to t\bar{t}$ at the NLC. If the beams are polarized, using a helicity basis seems logical, but near the $t\bar{t}$ threshold helicity is not very useful. Fortunately, there is an optimal "off-diagonal" basis 35 which gives a clean prediction for spin correlations: in leading order the spins are purely anti-correlated ($t_\uparrow\bar{t}_\downarrow + t_\downarrow\bar{t}_\uparrow$). One projects the top spin along an axis identified by angle $\psi$,
$$\tan\psi = \frac{\beta^2\sin\theta^*\cos\theta^*}{1 - \beta^2\sin^2\theta^*}, \qquad (8)$$
where $\beta$ is the top quark's speed in the center-of-momentum scattering frame and $\theta^*$ is the top scattering angle in that frame. The basis angle $\psi$ and decay lepton angle $\chi$ are illustrated in Figure 10. The advantages of an appropriate basis are clear from Figure 11: for a given data sample, discerning the clean prediction of the off-diagonal basis should be far easier than untangling the several possible spin configurations in the helicity basis. Moreover, while the fraction of top quarks in the dominant spin configuration in $e^-_L e^+_R \to t\bar{t}$ approaches unity in the helicity basis at large $\beta$, it is always nearly one in the off-diagonal basis (Figure 12). This idea carries over to the Tevatron. In the helicity basis, 70% of $t\bar{t}$ pairs have opposite helicities 36: threshold production via $q\bar{q}$ annihilation puts the tops in a $^3S_1$ state 37 where their spins tend to be aligned. But the off-diagonal basis still does better 38: 92% of the top pairs have anti-aligned spins. The larger spin correlation translates into larger and more measurable correlations among the decay leptons. Writing the differential cross-section in terms of the angular positions $\chi_\pm$ of the decay leptons $\ell^\pm$
$$\frac{1}{\sigma}\frac{d^2\sigma}{d(\cos\chi_+)\,d(\cos\chi_-)} = \frac{1}{4}\left(1 + \kappa\cos\chi_+\cos\chi_-\right), \qquad (9)$$
one finds $\kappa \approx 0.9$ in the SM for $\sqrt{s} = 1.8$ TeV. As DØ recorded only six dilepton events in Run I, they set 39 merely the 68% c.l. limit $\kappa > -0.25$. Nonetheless, the possibility of making a top spin correlation measurement in a hadronic environment has been established and Run IIa promises ~150 dilepton events. 31 At the LHC, the top dilepton sample will be of order $4 \times 10^5$ events per year, 23 but no spin basis with nearly 100% correlation at all $\beta$ has been identified. Pair production proceeds mainly through $gg \to t\bar{t}$, putting the tops in a $^1S_0$ state 37 at threshold. Near threshold, angular momentum conservation favors like helicities; far above threshold, helicity conservation favors opposite helicities. 23 In the helicity basis, one conventionally studies a differential cross-section of the same form as Eq. (9), in which the coefficient $\kappa$ is renamed $-C$ and the angle $\chi_\pm$ refers to the angle between the $t$ ($\bar{t}$) momentum in the center-of-momentum frame and the $\ell^\pm$ direction in the $t$ ($\bar{t}$) rest frame. The
Figure 11. Differential cross-section for top production at a 400 GeV NLC with 35 (a) LH and (b) RH electron beams. At left spins are projected onto the helicity basis; at right, onto the off-diagonal basis. The horizontal axis is $\cos\theta^*$.
SM predicts $C \approx 0.33$ in leading order at the LHC. Physically, $C$ corresponds b to the ratio 36
$$C = \frac{N(t_L\bar{t}_L + t_R\bar{t}_R) - N(t_L\bar{t}_R + t_R\bar{t}_L)}{N(t_L\bar{t}_L + t_R\bar{t}_R) + N(t_L\bar{t}_R + t_R\bar{t}_L)}. \qquad (10)$$
The effects of radiative corrections and the likely measurement precision achievable remain to be evaluated. 23
b This expression also holds for $-\kappa$ at the Tevatron if L and R are taken to refer to the off-diagonal rather than the helicity basis.
1.5 Single Production
The three SM channels for single top production are $Wg$ fusion, $q\bar{q}$ annihilation through an off-shell W, and $gb \to tW$; the Feynman diagrams are shown in Figure 13. The $Wg$ fusion events are characterized by one hard and one soft b-jet, an additional jet and a W; the SM Run I cross-section is calculated 40 to
Figure 12. Fraction of top quarks in the dominant spin configuration 35 for $e^-_L e^+_R \to t\bar{t}$, in the off-diagonal (UD) and helicity (LR) bases.
Figure 13. Feynman diagrams for single top quark production.
be 1.70 ± 0.9 pb. The W* events, in contrast, include two hard b quarks and a W from top decay; the calculated 40 SM Run I cross-section is $\sigma = 0.73 \pm 0.04$ pb. The $gb \to tW$ process is highly suppressed at the Tevatron. Searches for single top production generally focus on leptonically-decaying W bosons. The principal backgrounds come from top pair production, W+jets events, QCD multijet events in which a jet fakes an electron, and WW events. While the DØ analysis of single top production is still in progress c, CDF has set two limits. 42 The first is based on reconstructing a top quark mass for the six events with $Wb\bar{b}$ identified in the final state, as illustrated in Figure 14. Using Run I data, CDF finds $\sigma_{tb} < 18.6$ pb (the SM prediction is 2.43 pb). The higher luminosity in Run II should provide $S/\sqrt{B} > 4$ in this channel. The second limit exploits the differences among the $H_T$ distributions in signal
c The DØ limit 41 became available after these lectures were given. It is not stronger than the CDF limits.
Figure 14. Reconstructed $m_t$ for the CDF 42 $Wb\bar{b}$ sample (CDF preliminary). "Signal" is single top production. The horizontal axis is the reconstructed mass in GeV/c².
and background W+jet events; $H_T$ is the scalar sum of the jet, lepton, and missing transverse energies. Each event is required to include 1-3 jets (one of which is b-tagged), a lepton from W decay, and a reconstructed top mass in the range 140 - 210 GeV. The cross-section limit set with Run I data shown in Figure 15 is $\sigma_{tb} < 13.5$ pb.
1.6 Decays
W helicity in top decay
The SM predicts the fraction ($F_0$) of top quark decays to longitudinal (zero-helicity) W bosons will be quite large, due to the top quark's big Yukawa coupling:
$$F_0 = \frac{m_t^2/2M_W^2}{1 + m_t^2/2M_W^2} = (70.1 \pm 1.\ldots)\%. \qquad (11)$$
One can measure $F_0$ in dilepton or lepton+jet events by exploiting the correlation of the W helicity with the momentum of the decay leptons. For
Figure 15. $H_T$ distribution in single top production 42: CDF data and simulated backgrounds, for events in the W + 1, 2, 3 jet bins (CDF Run 1 data).
$W^+ \to \ell^+\nu$, the spins of the decay leptons align with that of the W; for massless leptons, the $\ell^+$ ($\nu$) momentum points along (opposite) its spin. Then a positive-helicity W (boosted along its spin) yields harder charged leptons than a negative-helicity W. The longitudinal W gives intermediate results.
Table 2. Predicted 31 precision of Run II W helicity measurement for several $\int\mathcal{L}\,dt$.
          1 fb⁻¹    10 fb⁻¹    100 fb⁻¹
δF_0      6.5%      2.1%       0.7%
δF_+      2.6%      0.8%       0.3%
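The Standard Model value of the longitudinal fraction in Eq. (11) is easy to evaluate directly. The sketch below assumes $M_W = 80.4$ GeV and uses the combined $m_t = 174.3$ GeV from Table 1; both the W mass value and the function name are assumptions of this illustration.

```python
def longitudinal_w_fraction(m_t, m_w):
    """Fraction F_0 of top decays to longitudinal W bosons, F_0 = x / (1 + x)."""
    x = m_t ** 2 / (2.0 * m_w ** 2)
    return x / (1.0 + x)


if __name__ == "__main__":
    m_w = 80.4  # GeV, assumed input
    for m_t in (169.2, 174.3, 179.4):  # central value and +-5.1 GeV
        f0 = longitudinal_w_fraction(m_t, m_w)
        print(f"m_t = {m_t:6.1f} GeV  ->  F_0 = {f0:.3f}")
```

Scanning $m_t$ over its quoted uncertainty shows that $F_0$ moves by only about a percent, which is why the Run II precisions in Table 2 are needed before the measurement becomes a sensitive test.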
CDF has measured 43 the lepton $p_T$ spectra for dilepton and lepton + jet events and performed fits as shown in Figure 16. There is insufficient data to permit forming conclusions about all three helicity states simultaneously. By assuming no positive-helicity W's are present, CDF obtains the limit $F_0 = 0.91 \pm 0.37 \pm 0.13$. By setting $F_0$ to its SM value of 0.70, they obtain the 95% c.l. upper limit $F_+ < 0.28$. Note, however, that the first limit essentially states only that no more than 100% of the decay W's are longitudinal and the second, that no more than $1 - F_0$ have positive helicity. More informative constraints are expected from Run II (see Table 2).
Figure 16. Measured lepton $p_T$ spectra and fits to W helicity by CDF, for the lepton + jet and dilepton channels. Combined result: $F_0 = 0.91 \pm 0.37 \pm 0.13$ (background Gaussian-constrained, $h_W = +1$ component fixed to 0).
b quark decay fraction
The top quark's decay fraction to b quarks is measured by CDF 44 to be $B_b \equiv \Gamma(t \to bW)/\Gamma(t \to qW) = 0.99 \pm 0.29$. In the three-generation SM, $B_b$ is related to CKM matrix elements as
$$B_b = \frac{|V_{tb}|^2}{|V_{tb}|^2 + |V_{ts}|^2 + |V_{td}|^2}. \qquad (12)$$
Three-generation unitarity dictates that the denominator of (12) is 1.0, so that the measurement of $B_b$ implies 44 $|V_{tb}| > 0.76$ at 95% c.l. However, within the 3-generation SM, data on the light quarks combined with CKM unitarity have already provided 12 the much tighter constraints $0.9991 < |V_{tb}| < 0.9994$.
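It can help to see how weakly $B_b$ in Eq. (12) depends on the absolute size of $|V_{tb}|$. The sketch below evaluates the ratio for two sets of CKM magnitudes; all numerical inputs are hypothetical values chosen for illustration, not measurements quoted in the text, and the function name is an assumption.

```python
def b_decay_fraction(v_tb, v_ts, v_td):
    """B_b = |V_tb|^2 / (|V_tb|^2 + |V_ts|^2 + |V_td|^2), as in Eq. (12)."""
    numerator = abs(v_tb) ** 2
    return numerator / (numerator + abs(v_ts) ** 2 + abs(v_td) ** 2)


if __name__ == "__main__":
    # Hypothetical magnitudes resembling the three-generation SM pattern.
    print(f"SM-like:       B_b = {b_decay_fraction(0.9992, 0.04, 0.008):.4f}")
    # Hypothetical case with a much smaller |V_tb| but the same hierarchy.
    print(f"reduced V_tb:  B_b = {b_decay_fraction(0.5, 0.04, 0.01):.4f}")
```

Both cases give $B_b$ close to one, because the ratio only tests whether $|V_{tb}|$ is much larger than $|V_{ts}|$ and $|V_{td}|$. Turning the measured $B_b$ into a bound on $|V_{tb}|$ itself requires the additional unitarity assumption about the denominator.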
Table 3. Run II sensitivity 31 to FCNC top decays as a function of $\int\mathcal{L}\,dt$.
                1 fb⁻¹        10 fb⁻¹       100 fb⁻¹
BR(t → Zq)      0.015         3.8 × 10⁻³    6.3 × 10⁻⁴
BR(t → γq)      3.0 × 10⁻³    4.0 × 10⁻⁴    8.4 × 10⁻⁵
If we add a fourth generation of quarks, the analysis differs. A search by DØ has constrained 12 any 4th-generation b' quark to have a mass greater than $m_t - m_W$, so that the top quark could not readily decay to b'. This means that the original expression (12) for $B_b$ is still valid. However, once there are four generations, the denominator of the RHS of (12) need not equal 1.0. Then the CDF measurement of $B_b$ implies $|V_{tb}| \gg |V_{td}|, |V_{ts}|$. In contrast, light-quark data combined with 4-generation CKM unitarity allows $|V_{tb}|$ to lie in the range 12 $0.05 < |V_{tb}| < 0.9994$. While the measurement of $B_b$ gives only qualitative information about $|V_{tb}|$, that information is new and useful in the context of a 4-generation model. Direct measurement of $|V_{tb}|$ in single top-quark production (via $q\bar{q} \to W^* \to t\bar{b}$ and $gW \to t\bar{b}$) at the Tevatron should reach an accuracy 31 of 10% in Run IIa (5% in Run IIb).
FCNC decays
CDF has set limits 45 on the flavor-changing decays $t \to Zq, \gamma q$, which are GIM-suppressed in the SM. In seeking $t \to Zq$ they looked at $p\bar{p} \to t\bar{t} \to qZ\,bW,\ qZ\,\bar{q}Z \to \ell\ell + 4$ jets with high jet $E_T$. The SM background from WW, ZZ and WZ events is predicted to be 0.6 ± 0.2 events. The data contain a single candidate (in which the Z decayed to muons). On this basis, the 95% c.l. upper limit $B(t \to Zq) < 0.33$ was set. To study $t \to \gamma q$, CDF examined $p\bar{p} \to t\bar{t} \to Wb\,\gamma q$ events. If the W decayed leptonically, the signature was $\gamma + \ell + \not\!\!E_T + (\geq 2)$ jets; if hadronically, the signature was $\gamma + (\geq 4)$ jets with one jet b-tagged. The expected SM background is a single event. Finding a single candidate event (with a leptonic W decay), CDF set the 95% c.l. upper bound $B(t \to \gamma q) < 0.032$. Run II will provide much greater sensitivity to these decays, 31 as indicated in Table 3.
1.7 Summary
The Run I experiments at the Tevatron discovered the top quark and provided the first measurements of a variety of properties including $m_t$, …, $\kappa$, $\sigma_{tb}$, $F_0$, $F_+$, $B_b$, $\Gamma(t \to Zq)$, and $\Gamma(t \to \gamma q)$. As we have seen, most of the measurements were limited in precision by the small top sample size. This will be ameliorated at Run II and future colliders. As a starting point for further discussion, we note that each property measured has been seen to have multiple implications for theory. Moreover, the interpretation of the measurement can depend critically on the theoretical context. In some cases, measurements may even shed more light on the merits of proposed non-standard physics than on the Standard Model itself. This is the line of thought we shall take up in the second section of the talk.

Figure 17. (left) Naturalness problem: $M_H^2 \propto \Lambda^2$. (right) Triviality: $\beta(\lambda) = d\lambda/d\ln\mu > 0$.

2 Beyond the Standard Model
Two central concerns of particle theory are finding the cause of electroweak symmetry breaking and identifying the origin of flavor symmetry breaking by which the quarks and leptons obtain their diverse masses. The Standard Higgs Model of particle physics, based on the gauge group $SU(3)_C \times SU(2)_W \times U(1)_Y$, accommodates both symmetry breakings by including a fundamental weak doublet of scalar ("Higgs") bosons $\phi = \binom{\phi^+}{\phi^0}$ with potential function $V(\phi) = \lambda\left(\phi^\dagger\phi - \tfrac{1}{2}v^2\right)^2$. However, the SM does not explain the dynamics responsible for the generation of mass. Furthermore, the scalar sector suffers from two serious problems. The scalar mass is unnaturally sensitive to the presence of physics at any higher scale $\Lambda$ (e.g. the Planck scale), as shown in Figure 17. This is known as the gauge hierarchy problem. In addition, if the scalar must provide a good description of physics up to arbitrarily high scale (i.e., be fundamental), the scalar's self-coupling ($\lambda$) is driven to zero at finite energy scales as indicated in Figure 17. That is, the scalar field theory is free (or "trivial"). Then the scalar cannot fill its intended role: if $\lambda = 0$, the electroweak symmetry is not spontaneously broken. The scalars involved in electroweak symmetry breaking must therefore be a party to new physics at some finite energy scale; e.g., they may be composite or may be part of a larger theory with a UV fixed point. The SM is merely a low-energy effective field theory, and the dynamics responsible for generating mass must lie in physics outside the SM.
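The triviality argument can be made concrete with a minimal sketch. It assumes the one-loop running of the pure scalar sector only, $d\lambda/d\ln\mu = 3\lambda^2/(2\pi^2)$ for the potential $V = \lambda(\phi^\dagger\phi - v^2/2)^2$, and ignores gauge and Yukawa contributions; the function names and sample scales are hypothetical. Under that assumption, demanding a finite coupling at a cutoff $\Lambda$ bounds the low-energy coupling by $1/(B\ln(\Lambda/\mu_0))$, which shrinks to zero as the cutoff is removed.

```python
import math

# One-loop coefficient in d(lambda)/d(ln mu) = B * lambda**2.
# Assumption: pure scalar sector, V = lambda * (phi^dagger phi - v^2/2)^2.
B = 3.0 / (2.0 * math.pi ** 2)

def running_lambda(lam0, mu0, mu):
    """Self-coupling at scale mu, given lam0 at scale mu0 (same units)."""
    denom = 1.0 - B * lam0 * math.log(mu / mu0)
    if denom <= 0.0:
        raise ValueError("mu is at or beyond the one-loop Landau pole")
    return lam0 / denom

def landau_pole(lam0, mu0):
    """Scale at which the one-loop coupling diverges."""
    return mu0 * math.exp(1.0 / (B * lam0))

def max_low_energy_lambda(cutoff, mu0):
    """Upper bound on lambda(mu0) if the coupling stays finite up to the cutoff."""
    return 1.0 / (B * math.log(cutoff / mu0))

if __name__ == "__main__":
    v = 246.0  # GeV, illustrative reference scale
    for cutoff in (1e4, 1e10, 1e19):
        print(cutoff, max_low_energy_lambda(cutoff, v))
```

The printed bound decreases monotonically with the cutoff, which is the sense in which a truly fundamental scalar would be free.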
Figure 18. $\delta M_H^2 \sim \frac{g_f^2}{4\pi^2}\left(m_f^2 - m_s^2\right) + m_s^2\log\Lambda^2$
One interesting possibility is to introduce supersymmetry [46]. The gauge structure of the minimal supersymmetric SM (MSSM) is identical to that of the SM, but each ordinary fermion (boson) is paired with a new boson (fermion) called its "superpartner" and two Higgs doublets are needed to provide mass to all the ordinary fermions. As sketched in Figure 18, each loop of ordinary particles contributing to the Higgs boson's mass is countered by a loop of superpartners. If the masses of the ordinary particles and superpartners are close enough, the gauge hierarchy can be stabilized [47]. Supersymmetry relates the scalar self-coupling to gauge couplings, so that triviality is not a concern.

Another intriguing idea, dynamical electroweak symmetry breaking [48], is that the scalar states involved in electroweak symmetry breaking could be manifestly composite at scales not much above the electroweak scale $v \sim 250$ GeV. In these theories, a new strong gauge interaction with $\beta < 0$ (e.g., technicolor) breaks the chiral symmetries of massless fermions $f$ at a scale $\Lambda \sim 1$ TeV. If the fermions carry appropriate electroweak quantum numbers (e.g. LH weak doublets and RH weak singlets), the resulting condensate $\langle \bar f_L f_R\rangle \neq 0$ breaks the electroweak symmetry as desired. The Goldstone bosons (technipions) of the chiral symmetry breaking simply become the longitudinal modes of the W and Z. The logarithmic running of the strong gauge coupling renders the low value of the electroweak scale (i.e. the gauge hierarchy) natural. The absence of fundamental scalars obviates concerns about triviality.

Once we are willing to consider physics outside the SM, seeking experimental evidence is imperative. One logical place to look is in the properties of the most recently discovered state, the top quark. The fact that $m_t \sim v_{weak}$ suggests that the top quark may afford us insight about non-standard models of electroweak physics and could even play a special role in electroweak and flavor symmetry breaking.
Since the sample of top quarks accumulated in Tevatron Run I was small, many of the top quark's properties are still only loosely constrained. The top quark may yet prove to have properties that set it apart from the other quarks, such as light related states, low-scale compositeness, or unusual gauge couplings.
The Run II experiments will help us evaluate these ideas. One approach would be to classify measurable departures from SM predictions and identify the theories which could produce them [49]. For example, an unexpectedly large rate for $t\bar t$ production could signal the presence of a coloron resonance, a techni-eta decaying to $t\bar t$, or a gluino decaying to $t\tilde t$. The approach we adopt here is to consider general classes of theoretical models and identify signals characteristic of each. We will discuss two-Higgs and SUSY models, dynamical symmetry breaking, new gauge interactions for the top quark and the phenomenology of strong top dynamics.

2.1 Multiple-Scalar-Doublet Models
Many quite different kinds of models include relatively light charged scalar bosons, into which top may decay: $t \to H^+ b$. The general class of models that includes multiple Higgs bosons [50] features charged scalars that can be light. Dynamical symmetry breaking models with more than the minimal two flavors of new fermions (e.g. technicolor with more than one weak doublet of technifermions) typically possess pseudo-Goldstone boson states, some of which can couple to third generation fermions. SUSY models must include at least two Higgs doublets in order to provide mass to both the up and down quarks, and therefore have a charged scalar in the low-energy spectrum. Experimental limits on charged scalars are often phrased in the language of a two-Higgs-doublet model. In addition to the usual input parameters $\alpha_{em}$, $G_F$ and $M_Z$ required to specify the electroweak sector of the SM, two additional quantities are relevant for the process $t \to H^+ b$: $\tan\beta$ (the ratio of the vev's of the two scalar doublets) and $M_{H^\pm}$. If the mass of the charged scalar is less than $m_t - m_b$, then the decay $t \to H^+ b$ can compete with the standard top decay mode $t \to Wb$. Since the $tbH^\pm$ coupling depends on $\tan\beta$ as [50]

$$g_{tbH^+} \propto m_t\cot\beta\,(1 + \gamma_5) + m_b\tan\beta\,(1 - \gamma_5), \qquad (13)$$

the additional decay mode is significant for either large or small values of $\tan\beta$. The charged scalar, in turn, decays as $H^\pm \to cs$ or $H^\pm \to t^* b \to Wbb$ if $\tan\beta$ is small and as $H^\pm \to \tau\nu_\tau$ if $\tan\beta$ is large. In either case, the final state reached through an intermediate $H^\pm$ will cause the original $t\bar t$ event to fail the usual cuts for the lepton + jets channel. A reduced rate in this channel can therefore signal the presence of a light charged scalar. As shown in Figure 19, DØ has set a limit [51] on $M_{H^\pm}$ as a function of $\tan\beta$ and $\sigma_{tt}$. In Run II the limits should span a wider range of $\tan\beta$ and reach nearly to the kinematic limit.
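To see why the $t \to H^+ b$ mode matters at both ends of the $\tan\beta$ range, the sketch below evaluates the relative strength implied by Eq. (13), taking the sum of the squares of the two chiral terms and dropping overall constants, interference and phase-space factors. The bottom mass value of 4.5 GeV and the function names are assumptions made for illustration.

```python
import math

def relative_coupling_sq(tan_beta, m_t=175.0, m_b=4.5):
    """(m_t cot(beta))^2 + (m_b tan(beta))^2, in GeV^2, overall constants dropped."""
    cot_beta = 1.0 / tan_beta
    return (m_t * cot_beta) ** 2 + (m_b * tan_beta) ** 2

def tan_beta_at_minimum(m_t=175.0, m_b=4.5):
    """The sum of squares is smallest where tan(beta) = sqrt(m_t / m_b)."""
    return math.sqrt(m_t / m_b)

if __name__ == "__main__":
    tb_min = tan_beta_at_minimum()
    for tb in (0.5, 1.0, tb_min, 20.0, 40.0):
        print(f"tan(beta) = {tb:6.2f}  relative strength = {relative_coupling_sq(tb):10.1f}")
```

With these inputs the strength is weakest for $\tan\beta$ of order six and grows toward both small and large $\tan\beta$, in line with the statement that the extra decay mode is significant at either extreme.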
Figure 19. DØ charged scalar searches [51] in $t \to H^\pm b$. (left) Run I limits for $m_t = 175$ GeV. The region below the top (middle, bottom) contours is excluded if $\sigma(t\bar t) = 4.5, 5.0, 5.5$ pb. (right) Projected Run II reach [52] assuming $\sqrt{s} = 2$ TeV, $\int\mathcal{L}\,dt = 2$ fb$^{-1}$, and $\sigma(t\bar t) = 7$ pb.
2.2 SUSY Models

The heavy top quark plays a role in several interesting issues related to Higgs and sfermion masses in supersymmetric models.

Scalar mass-squared
SUSY models need to explain why the scalar Higgs boson acquires a negative mass-squared (breaking the electroweak symmetry) while the scalar fermions do not (preserving color and electromagnetism). In a number of SUSY models, such as the MSSM with GUT unification or models with dynamical SUSY breaking, the answer involves the heavy top quark [46]. In these theories, the masses of the Higgs bosons and sfermions are related at a high energy scale $M_X$:

$$M_{h,H}^2(M_X) = m_0^2 + \mu^2, \qquad M_{\tilde f}^2(M_X) = m_0^2 \qquad (14)$$
where the squared masses are all positive so that the vacuum preserves the color and electroweak symmetries. To find the masses of the scalar particles at lower energy scales, one studies the renormalization group running of the masses [53]. The large mass of the top quark makes significant corrections to the running masses. Comparing the evolution equations [54] for the Higgs, the scalar partner of $t_R$, and the scalar partner of $Q_L = (t, b)_L$ (Eq. 15), it is clear that the influence of the top quark Yukawa coupling is greatest for the Higgs. At scale $q$, the approximate solution for $M_h$ is

$$M_h^2(q) = M_h^2(M_X) - \frac{3}{8\pi^2}\lambda_t^2\left(M_{\tilde Q_L}^2 + M_{\tilde t_R}^2 + M_h^2 + A_{0,t}^2\right)\ln\left(\frac{M_X}{q}\right) \qquad (16)$$

and $\lambda_t$ is seen to reduce $M_h^2$. For $m_t \sim 175$ GeV, this effect drives the Higgs mass, and only the Higgs mass, negative, just as desired [55].

Light Higgs mass
The low-energy spectrum of the MSSM includes a pair of neutral scalars $h^0$ (by convention, the lighter one) and $H^0$. At tree level, $M_h < M_Z|\cos(2\beta)|$, where $\tan\beta$ is the ratio of the vev's of the two Higgs doublets [50]. Searches for light Higgs bosons then appear to put low values of $\tan\beta$ in jeopardy. In fact, experiment has now pushed the lower bound on $M_h$ well above the Z mass: $M_h \gtrsim 107.7$ GeV [14]. Enter the top quark. Radiative corrections to $M_h$ involving virtual top quarks introduce a dependence on the top mass. For large $m_t$, this can raise the upper bound on $M_h$ significantly [56]. When $\tan\beta > 1$,

$$M_h^2 < M_Z^2\cos^2(2\beta) + \frac{3G_F}{\sqrt{2}\pi^2}\,m_t^4\,\ln\left(\frac{\tilde m^2}{m_t^2}\right) \qquad (17)$$
and the $m_t^4$ term raises the upper bound well above $M_Z$. Including higher-order corrections, the most general limit [56] appears to be $M_h < 130$ GeV, well above the current bounds but in reach of upcoming experiment.

Light top squarks

Since SUSY models include a bosonic partner for each SM fermion, there is a pair of complex scalar top squarks affiliated with the top quark (one for $t_L$, one for $t_R$). A look at the mass-squared matrix for the stops [46] (in the $\tilde t_L, \tilde t_R$ basis)

$$\tilde m_t^2 = \begin{pmatrix} M_{\tilde Q}^2 + m_t^2 + M_Z^2\left(\frac{1}{2} - \frac{2}{3}\sin^2\theta_W\right)\cos 2\beta & m_t\,(A_t + \mu\cot\beta) \\ m_t\,(A_t + \mu\cot\beta) & M_{\tilde U}^2 + m_t^2 + \frac{2}{3}M_Z^2\sin^2\theta_W\cos 2\beta \end{pmatrix} \qquad (18)$$
Figure 20. Limits [57] on a light stop (left) via $\tilde t_1 \to c\tilde\chi_1^0$; (right) via $\tilde t_1 \to b\tilde\chi_1^+ \to b\ell\tilde\nu$ or direct $\tilde t_1 \to b\ell\tilde\nu$, assuming equal branching to all lepton flavors.
reveals that the off-diagonal entries are proportional to $m_t$. Hence, a large $m_t$ can drive one of the top squark mass eigenstates to be relatively light. Experiment still allows a light stop [57], as may be seen in Figure 20; Run II will be sensitive to higher stop masses in several decay channels [58] (Figure 21). Perhaps some of the Run I "top" sample included top squarks [59]. If the top squark is not much heavier than the top quark, it is possible that $\tilde t\tilde t$ production occurred in Run I, with the top squarks subsequently decaying to top plus neutralino or gluino (depending on the masses of the gauginos). If the top is a bit heavier than the stop, some top quarks produced in $t\bar t$ pairs in Run I may have decayed to top squarks via $t \to \tilde t\tilde N$, with the top squarks' subsequent decay being either semi-leptonic, $\tilde t \to b\ell\tilde\nu$, or flavor-changing, $\tilde t \to c\tilde N, c\tilde g$. With either ordering of mass, it is possible that gluino pair production occurred, followed by $\tilde g \to t\tilde t$. These ideas can be tested using the rate, decay channels, and kinematics of top quark events [49]. For example, stop or gluino production could increase the apparent $t\bar t$ production rate above that of the SM. Or final states including like-sign dileptons could result from gluino decays.

2.3 Dynamical Electroweak Symmetry Breaking
Extended technicolor (ETC) is an explicit realization of dynamical electroweak symmetry breaking and fermion mass generation [48]. One starts with a strong gauge group (technicolor) felt only by a set of new massless fermions (technifermions) and extends the technicolor gauge group to a larger (ETC) group under which ordinary fermions are also charged. At a scale $M$, ETC breaks to its technicolor subgroup and the gauge bosons coupling ordinary fermions to technifermions acquire a mass of order $M$. At a scale $\Lambda_{TC} < M$, a technifermion condensate breaks the electroweak symmetry as described earlier. The quarks and leptons acquire mass because massive ETC gauge bosons couple them to the condensate. The top quark's mass, e.g., arises when the condensing technifermions transform the scattering diagram in Figure 22 (left) into the top self-energy diagram shown at right. Its size is

$$m_t \approx \frac{g_{ETC}^2}{M^2}\langle\bar T T\rangle \approx \frac{g_{ETC}^2}{M^2}\,(4\pi v^3) \qquad (19)$$

Thus $M$ must satisfy $M/g_{ETC} \approx 1.4$ TeV in order to produce $m_t = 175$ GeV.

Figure 21. Anticipated Run II stop limits from various decay channels [58].

Figure 22. (left) Top-technifermion scattering mediated by a heavy ETC gauge boson. (right) Technifermion condensation creates the top quark mass.
While this mechanism works well in principle, it is difficult to construct a complete model that can accommodate a large value for $m_t$ while remaining consistent with precision electroweak data. Two key challenges have led model-building in new and promising directions. First, the dynamics responsible for the large value of $m_t$ must couple to $b_L$ because $t$ and $b$ are weak partners. How, then, can one obtain a predicted value of $R_b$ that agrees with experiment? Attempts to answer this question have led to models in which the weak interactions of the top quark [60,64] (and, perhaps, all third generation fermions) are non-standard. Second, despite the large mass splitting $m_t \gg m_b$, the value of the rho parameter is very near unity. How can dynamical models accommodate large weak isospin violation in the $t$-$b$ sector without producing a large shift in $M_W$? This issue has sparked theories in which the strong (color) interactions of the top quark [61] (and possibly other quarks [62]) are modified from the predictions of QCD. In the remainder of this talk, we explore the theoretical and experimental implications of having non-standard gauge interactions for the top quark.

2.4 New Top Weak Interactions
In classic ETC models, the large value of $m_t$ comes from ETC dynamics at a relatively low scale $M$ of order a few TeV. At that scale, the weak symmetry is still intact so that $t_L$ and $b_L$ function as weak partners. Moreover, experiment tells us that $|V_{tb}| \approx 1$. As a result, the ETC dynamics responsible for generating $m_t$ must couple with equal strength to $t_L$ and $b_L$. While many properties of the top quark are only loosely constrained by experiment, the b quark has been far more closely studied. In particular, the LEP measurements of the $Zb\bar b$ coupling are precise enough to be sensitive to the quantum corrections arising from physics beyond the SM. As we now discuss, radiative corrections to the $Zb\bar b$ vertex from low-scale ETC dynamics can be so large that new weak interactions for the top quark are required to make the models consistent with experiment [63,60]. To begin, consider the usual ETC models in which the extended technicolor and weak gauge groups commute, so that the ETC gauge bosons carry no weak charge. In these models, the ETC gauge boson whose exchange gives rise to $m_t$ couples to the fermion currents [63]

$$\xi\left(\bar\psi_L\gamma^\mu T_L\right) + \xi^{-1}\left(\bar t_R\gamma^\mu U_R\right) \qquad (20)$$

where $\xi$ is a Clebsch of order 1 (see Figure 23). Then the top quark mass arises from technifermion condensation and ETC boson exchange as in Figure 22, with the relevant technifermions being $U_L$ and $U_R$.
Figure 23. Fermion currents coupling to the weak-singlet ETC boson that generates $m_t$.

Figure 24. Direct correction to the $Zb\bar b$ vertex from the ETC gauge boson responsible for $m_t$ in a commuting model.
Exchange of the same [63] ETC boson causes a direct (vertex) correction to the $Z \to b\bar b$ decay as shown in Figure 24; note that it is $D_L$ technifermions with $I_3 = -\frac{1}{2}$ which enter the loop. This effect reduces the magnitude of the $Zb\bar b$ coupling by

$$\delta g_L = \frac{e}{4\sin\theta\cos\theta}\,\xi^2\left(\frac{g_{ETC}^2 v^2}{M^2}\right) \qquad (21)$$

Given the relationship between $M$ and $m_t$ from Eq. 19, we find

$$\frac{g_{ETC}^2 v^2}{M^2} \approx \frac{m_t}{4\pi v} \qquad (22)$$

so that the top quark mass sets the size of the coupling shift. How can one observe the shift in the couplings? The vertex correction will certainly produce a correction $\delta\Gamma_b$ to the Z decay width $\Gamma(Z \to b\bar b)$. But since $\Gamma_b$ also receives oblique radiative corrections, $\Gamma_b^{corr.} = (1 + \Delta\rho)(\Gamma_b + \delta\Gamma_b)$, a measurement of $\Gamma_b$ is not the best way to track $\delta g_L$. The ratio of $\Gamma_b$ to the hadronic decay width of the Z

$$R_b = \Gamma(Z \to b\bar b)/\Gamma(Z \to \mathrm{hadrons}) \qquad (23)$$

also shifts in proportion to $\delta g_L$ and has the additional advantage that oblique and QCD radiative corrections each cancel in the ratio (up to factors suppressed by small quark masses). One finds [63]

$$\frac{\delta R_b}{R_b} \approx -5.1\%\ \xi^2\left(\frac{m_t}{175\ \mathrm{GeV}}\right) \qquad (24)$$

Figure 25. Data [14] on $R_b$ and $R_c$ showing experimental best fit (dot) and SM prediction (arrow). A 5% negative shift in $R_b$ is clearly excluded.
Such a large shift in $R_b$ is excluded [14] by the data (see Figure 25). The ETC models whose dynamics produce this shift are therefore likewise excluded. This suggests one should consider an alternative class of ETC models [60] in which the weak group $SU(2)_W$ is embedded in $G_{ETC}$, so that the ETC gauge bosons carry weak charge. Embedding the weak interactions of all quarks in a low-scale ETC group would produce masses of order $m_t$ for all up-type quarks. Instead, one can extend $SU(2)$ to a direct product group $SU(2)_h \times SU(2)_\ell$ such that the third generation fermions transform under $SU(2)_h$ and the others under $SU(2)_\ell$. Only $SU(2)_h$ is embedded in the low-scale ETC group; the masses of the light fermions will come from physics at higher scales. Breaking the two weak groups to their diagonal subgroup ensures approximate Cabibbo universality at low energies. The electroweak and technicolor gauge structure of these non-commuting models is sketched below [60]:

$$\begin{array}{c} G_{ETC} \times SU(2)_{light} \times U(1)' \\ \downarrow\ f \\ G_{TC} \times SU(2)_{heavy} \times SU(2)_{light} \times U(1)_Y \\ \downarrow\ u \\ G_{TC} \times SU(2)_{weak} \times U(1)_Y \\ \downarrow\ v \\ G_{TC} \times U(1)_{EM} \end{array} \qquad (25)$$

Figure 26. Fermion currents coupling to the weak-doublet ETC boson that generates $m_t$ in non-commuting ETC models.
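Due to the extended gauge structure, there are now two non-standard contributions to $R_b$, one from the dynamics that generates $m_t$ and the other from the mixing of the two Z bosons from the two weak groups. The ETC boson responsible for $m_t$ now couples weak-doublet fermions to weak-singlet technifermions (and vice versa) as in Figure 26. The radiative correction to the $Zb\bar b$ vertex is as in Figure 24 except that the technifermions involved are now $U_L$ with $T_3 = +\frac{1}{2}$. As a result, the shifts in $\delta g_L$ and $R_b$ have the same size as the results in Eqs. 21 and 24, but the opposite sign [60]. Were these the only contributions to $R_b$, this class of models would be excluded. Consider, however, what happens when the $SU(2)_h \times SU(2)_\ell \times U(1)_Y$ bosons mix to form mass eigenstates [60]. The result is heavy states $W^H, Z^H$ that couple mainly to the third generation, light states $W^L, Z^L$ resembling the standard W and Z, and a massless photon $A^\mu = \sin\theta\,[\sin\phi\,W_{3\ell}^\mu + \cos\phi\,W_{3h}^\mu] + \cos\theta\,X^\mu$. Writing $s \equiv \sin\phi$ and $c \equiv \cos\phi$, in the rotated basis

$$\begin{aligned} W_1^\pm &= s\,W_\ell^\pm + c\,W_h^\pm, & W_2^\pm &= c\,W_\ell^\pm - s\,W_h^\pm, \\ Z_1 &= \cos\theta\,(s\,W_{3\ell} + c\,W_{3h}) - \sin\theta\,X, & Z_2 &= c\,W_{3\ell} - s\,W_{3h}, \end{aligned}$$

$$D^\mu = \partial^\mu + ig\left(T_\ell^\pm + T_h^\pm\right)W_1^{\pm\mu} + ig\left(\frac{c}{s}T_\ell^\pm - \frac{s}{c}T_h^\pm\right)W_2^{\pm\mu} + i\frac{g}{\cos\theta}\left(T_{3\ell} + T_{3h} - \sin^2\theta\,Q\right)Z_1^\mu + ig\left(\frac{c}{s}T_{3\ell} - \frac{s}{c}T_{3h}\right)Z_2^\mu \qquad (26)$$

where $W_1, Z_1$ have SM couplings and all non-standard couplings accrue to $W_2, Z_2$, the mass eigenstates are (with $x = u^2/v^2$) $W^L \approx W_1$, $Z^L \approx Z_1$, $W^H \approx W_2$ and $Z^H \approx Z_2$, each with a small admixture of the other state of relative size $c^3 s/x$ (for the W bosons) or $c^3 s/(x\cos\theta)$ (for the Z bosons) (27), and the heavy boson masses are degenerate: $M_{W^H} \approx M_{Z^H} \approx M_W\sqrt{x}/(sc)$. The $Z^L$ coupling to quarks thus differs from the SM value by $\delta g_L = (c^4/x)\,T_{3\ell} - (c^2 s^2/x)\,T_{3h}$, which reduces $R_b$ by [60]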
where the term in square brackets is $O(1)$. As the ETC and $ZZ'$ mixing contributions to $R_b$ are of the same magnitude but opposite sign, $R_b$ can be consistent with experiment in non-commuting ETC models. The key element that permits a large $m_t$ and a small shift in $R_b$ to co-exist is the presence of non-standard weak interactions for the top quark [60]. This is something experiment can test, and has since been incorporated into models such as topflavor [64] and top seesaw [65]. There are several ways to test whether the high-energy weak interactions have the form $SU(2)_h \times SU(2)_\ell$. One possibility is to search for the extra weak bosons. The bosons' predicted effects on precision electroweak data give rise to the exclusion curve [66] in Figure 27. Low-energy exchange of $Z^H$ and $W^H$ bosons would cause apparent four-fermion contact interactions; LEP limits on $eebb$ and $ee\tau\tau$ contact terms imply [67] $M_{Z^H} \gtrsim 400$ GeV. Direct production of $Z^H$ and $W^H$ at Fermilab is also feasible; a Run II search for $Z^H \to \tau\tau \to e\mu X$ will be sensitive [67] to $Z^H$ masses up to 750 GeV. Another possibility is to measure the top quark's weak interactions in single top production. Run II should measure the ratio of single top and single lepton cross-sections $R_\sigma = \sigma_{tb}/\sigma_{\ell\nu}$ to ±8% in the $W^*$ process [68]. A number of systematic uncertainties, such as those from parton distribution functions, cancel in the ratio. In the SM, $R_\sigma$ is proportional to the square of the $Wtb$ coupling. Non-commuting ETC models affect the ratio in two ways: mixing of the $W_h$ and $W_\ell$ alters the $W^L$ coupling to fermions, and both $W^L$ and $W^H$ exchange contribute to the cross-sections (footnote d). Computing the shift in $R_\sigma$ from these effects reveals (Figure 27) that Run II will be sensitive [69] to $W^H$ bosons up to masses of about 1.5 TeV.

(d) The ETC dynamics which generates $m_t$ has no effect on the $Wtb$ vertex because the relevant ETC boson does not couple to $b_R$.
Figure 27. FNAL Run II single top production can explore the shaded region of the $M_{W^H}$ vs. $\sin^2\phi$ plane [69]. The area below the solid curve is excluded by precision electroweak data [66]. In the shaded region $R_\sigma$ increases by > 16%; below the dashed curve, by > 24%.

Figure 28. Electroweak boson propagator used in calculation of $\Delta\rho$.

2.5 New Top Strong Interactions
At tree level in the SM, $\rho = M_W^2/(M_Z^2\cos^2\theta_W) = 1$ due to a "custodial" global SU(2) symmetry relating members of a weak isodoublet. Because the two fermions in each isodoublet have different masses and hypercharges, however, oblique [70] radiative corrections to the W and Z propagators alter the value of $\rho$. The one-loop correction from the $(t,b)$ doublet is particularly large because $m_t \gg m_b$. The shift in $\rho$ is computed from the propagators in Figure 28 as [70]

$$\Delta\rho(0) = \rho(0) - 1 = \frac{e^2}{\sin^2\theta_W\cos^2\theta_W M_Z^2}\left[\Pi_{11}(0) - \Pi_{33}(0)\right] \qquad (29)$$

Experiment [12] finds $|\Delta\rho| < 0.4\%$, a stringent constraint on isospin-violating new physics. For example, a heavy lepton doublet $(N,E)$ with standard weak couplings and mass $\gg M_Z$ would add [70]

$$\Delta\rho_{N,E} = \frac{\alpha_{EM}}{16\pi\sin^2\theta_W\cos^2\theta_W M_Z^2}\left[m_N^2 + m_E^2 - \frac{2m_N^2 m_E^2}{m_N^2 - m_E^2}\ln\left(\frac{m_N^2}{m_E^2}\right)\right] \qquad (30)$$
Figure 29. ETC contributions to $\Delta\rho$: (left) direct, from gauge boson mixing; (right) indirect, from technifermion mass splitting.
and a new quark doublet, three times as much; the data force the new fermions to be nearly degenerate. Dynamical theories of mass generation like ETC must break weak isospin in order to produce the large top-bottom mass splitting. However, the new dynamics may also cause large contributions to $\Delta\rho$. Direct mixing between an ETC gauge boson and the Z (Figure 29) induces the dangerous effect [71]
in models with $N_D$ technifermion doublets and technipion decay constant $F_{TC}$. To avoid this, one could make the ETC boson heavy; however, the required $M_{ETC}/g_{ETC} > 5.5\ \mathrm{TeV}\cdot\left(\sqrt{N_D}\,F_{TC}/250\ \mathrm{GeV}\right)$ is too large to produce $m_t = 175$ GeV. Instead, one must obtain $N_D F_{TC}^2$
Figure 30. (left) ETC and new top dynamics generate masses for technifermions, $t$ and $b$. (right) Second-order phase transition forms a top condensate for $\kappa > \kappa_c$.
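($\kappa_c = 3\pi/8$ in the NJL [76] approximation), a top condensate forms (Figure 30). For a second-order phase transition, $\langle\bar t t\rangle/M^3 \propto (\kappa - \kappa_c)/\kappa_c$, so the top quark mass generated by this dynamics can lie well below the symmetry breaking scale; so long as $M$ is not too large, the scale separation need not imply an unacceptable degree of fine tuning. A more complete model incorporating these ideas is topcolor-assisted technicolor [61] (TC2). The symmetry-breaking structure is:

$$\begin{array}{c} G_{TC} \times SU(2)_W \times SU(3)_h \times SU(3)_\ell \times U(1)_h \times U(1)_\ell \\ \downarrow\ M \gtrsim 1\ \mathrm{TeV} \\ G_{TC} \times SU(2)_W \times SU(3)_{QCD} \times U(1)_Y \\ \downarrow\ \Lambda_{TC} \sim 1\ \mathrm{TeV} \\ G_{TC} \times SU(3)_{QCD} \times U(1)_{EM} \end{array} \qquad (32)$$

Below the scale $M$, the heavy topgluons and $Z'$ mediate new effective interactions [61,77] for the $(t,b)$ doublet

$$-\frac{4\pi\kappa_3}{M^2}\left[\bar\psi\gamma_\mu\frac{\lambda^a}{2}\psi\right]^2 - \frac{4\pi\kappa_1}{M^2}\left[\frac{1}{3}\bar\psi_L\gamma_\mu\psi_L + \frac{4}{3}\bar t_R\gamma_\mu t_R - \frac{2}{3}\bar b_R\gamma_\mu b_R\right]^2 \qquad (33)$$

where the $\lambda^a$ are color matrices and $g_{3h} \gg g_{3\ell}$, $g_{1h} \gg g_{1\ell}$. The $\kappa_3$ terms are uniformly attractive; were they alone, they would generate large $m_t$ and $m_b$. The $\kappa_1$ terms, in contrast, include a repulsive component for $b$. As a result, the combined effective interactions [61,77]

$$\kappa_3 + \frac{1}{3}\kappa_1 > \kappa_c > \kappa_3 - \frac{1}{6}\kappa_1 \qquad (34)$$

can be super-critical for top, causing $\langle\bar t t\rangle \neq 0$ and a large $m_t$, and sub-critical for bottom, leaving $\langle\bar b b\rangle = 0$.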
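The criticality window can be checked numerically with a small sketch. It assumes that the top and bottom channels feel the combinations $\kappa_3 + \kappa_1/3$ and $\kappa_3 - \kappa_1/6$ respectively, and uses the NJL value $\kappa_c = 3\pi/8$ quoted in the text; the function names and the sample couplings are hypothetical.

```python
import math

KAPPA_C = 3.0 * math.pi / 8.0  # critical coupling in the NJL approximation

def effective_couplings(kappa3, kappa1):
    """Assumed combinations: top sees kappa3 + kappa1/3, bottom sees kappa3 - kappa1/6."""
    return kappa3 + kappa1 / 3.0, kappa3 - kappa1 / 6.0

def condensate_pattern(kappa3, kappa1):
    """Return (top_condenses, bottom_condenses) for the given couplings."""
    kappa_t, kappa_b = effective_couplings(kappa3, kappa1)
    return kappa_t > KAPPA_C, kappa_b > KAPPA_C

if __name__ == "__main__":
    # Illustrative points only: too weak, inside the window, too strong.
    for k3, k1 in ((1.0, 0.3), (1.1, 0.3), (1.3, 0.3)):
        print(k3, k1, condensate_pattern(k3, k1))
```

Only the middle point gives a top condensate without a bottom condensate, which is the tilting role played by the $U(1)$ couplings.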
The benefits of including new strong dynamics for the top quark are clear in TC2 models [77]. Because technicolor is responsible for most of electroweak symmetry breaking, $\Delta\rho \approx 0$. Direct contributions to $\Delta\rho$ are avoided because the top condensate provides only $f \sim 60$ GeV; indirect contributions are not an issue if the technifermion hypercharges preserve weak isospin. The top condensate yields a large top mass. ETC dynamics at $M_{ETC} \gg 1$ TeV generate the light $m_f$ without large FCNC and contribute only ~ 1 GeV to the heavy quark masses, so there is no large shift in $R_b$.

2.6 Phenomenology of Strong Top Dynamics
Models with new strong top dynamics fall into three general classes with distinctive spectra and phenomenology: topcolor [61,77], flavor-universal extended color [62], and top seesaw [75]. These theories include a variety of new states that can weigh less than a few TeV. A generic feature is colored gauge bosons with generation-specific (topgluon) or flavor-universal (coloron) couplings to quarks. The strongly-bound quarks may also form composite scalar states. Many models include color-singlet ($Z'$) bosons with generation-dependent couplings. Some theories generate masses with the help of exotic fermions (usually, but not always, weak singlets). In this section of the talk, we review experimental searches for these new states.

Topcolor Models

The gauge structure of topcolor [61,77] models, as outlined in section 2.5, generally includes extended color and hypercharge sectors (as in Eq. 32) and a standard weak gauge group. The third-generation fermions transform under the more strongly-coupled $SU(3)_h \times U(1)_h$ group, so that after the extended symmetry breaks to the SM gauge group the heavy topgluons and $Z'$ couple preferentially to the third generation. The light fermions transform under $SU(3)_\ell \times U(1)_\ell$. CDF's search [78] for topgluons decaying to $b\bar b$ has put constraints on the topgluon mass for three different assumed widths (Figure 31); the topgluon's strong coupling to quarks ensures that it will be a rather broad resonance. Run II and the LHC should be sensitive to topgluons in $b\bar b$ or $t\bar t$ final states. The $Z'$, being more weakly coupled, is narrow; CDF's limit on $\sigma\cdot B$ for narrow states [78] decaying to $b\bar b$ just misses being able to constrain this state (Figure 31). A more recent CDF search [29] for a leptophobic topcolor $Z'$ decaying to top pairs excludes bosons weighing less than 480 (780) GeV assuming $\Gamma/M = 0.012$ (0.04). Precision electroweak data constrain [79] topcolor $Z'$ bosons as shown in Figure 32; light masses are still allowed if the $Z'$ couples almost exclusively to the third generation. As mentioned earlier, FNAL Run II will be sensitive [67] to topcolor $Z'$ bosons as heavy as 750 GeV
Figure 31. Results of CDF searches [78] for topgluons and $Z'$ decaying to $b\bar b$.
in the process $Z' \to \tau\tau \to e\mu X$. Ultimately, an NLC would be capable of finding a 3-6 TeV $Z'$ decaying to taus [62]. The strong topcolor dynamics binds top and bottom quarks into a set of top-pions [61,77] $t\bar t$, $t\bar b$, $b\bar t$ and $b\bar b$. It has been observed [80] that top-pion exchange in loops would noticeably decrease $R_b$ (Figure 33), and this implies that the top-pions must be quite heavy unless other physics cancels this effect [81]. Several searches for top-pion and top-higgs ($\sigma$) states have been proposed. A singly-produced neutral top-higgs can be detected [82] through its flavor-changing decays to $tc$ at Run II. Charged top-pions, on the other hand, would be visible [83] in single top production, as in Figure 33, up to masses of 350 GeV at Run II and 1 TeV at LHC.

Figure 32. Lower bounds on the mass of topcolor $Z'$ from precision electroweak data [79].

Figure 33. (left) Fractional reduction in $R_b$ as a function of top-pion mass [80]. (right) Simulated signal and background for charged top-pions in the single top sample at the Tevatron [83].

Flavor-Universal Coloron Models

The gauge structure of these models [62] is identical with that of the topcolor [61] models; they differ only in fermion charge assignments. The fermion hypercharges are as in topcolor models; hence, the $Z'$ phenomenology is also the same. But as the model's name suggests, all quarks transform under the more strongly-coupled $SU(3)_h$ group; none transform under $SU(3)_\ell$. As a result, the heavy coloron bosons in the low-energy spectrum couple with equal strength to all quarks. Several experimental limits [84] have been placed on these color-octet states, as shown in Figure 34. CDF has excluded narrow colorons with masses below about 900 GeV by searching for resonances
Figure 34. Limits [84] on the mass and mixing angle of flavor-universal colorons.
decaying to dijets. The bounds on $\Delta\rho$ exclude light colorons which could be exchanged across quark loops in weak boson propagators. Heavier colorons tend to be broad ($\Gamma \propto \kappa_3 M_c$) and therefore produce a distortion of the dijet angular distribution or excess events at high invariant mass, rather than a bump in the dijet spectrum. A DØ study of the dijet angular distribution eliminated the light-shaded region of Figure 34 and a study [84] of the DØ invariant mass distribution eliminated the darker-shaded slice, giving the limit $M_c/\cot\theta > 837$ GeV (where $\theta$ is the mixing angle between the two SU(3) groups). This implies $M_c \gtrsim 3.4$ TeV in dynamical models of mass generation where the coloron coupling is strong. In a TC2-like model incorporating flavor-universal colorons [82], the gauge couplings $\kappa_3 = \alpha_s\cot^2\theta_3$ and $\kappa_1 = \alpha_Y\cot^2\theta_1$ must satisfy several constraints which are summarized in Figure 35. Requiring solutions to the gauged NJL gap equations for dynamical fermion masses (Figure 36) such that only the top quark condenses leads to the inequalities [62]

$$\begin{aligned} \kappa_1 &< 2\pi - 6\alpha_Y & \left(\langle\bar\tau\tau\rangle = 0\right) \\ \kappa_3 + \frac{2}{27}\kappa_1 &\geq \frac{2\pi}{3} - \frac{4}{3}\alpha_s - \frac{4}{9}\alpha_Y & \left(\langle\bar t t\rangle \neq 0\right) \\ \kappa_3 + \frac{2}{27}\frac{\alpha_Y^2}{\kappa_1} &< \frac{2\pi}{3} - \frac{4}{3}\alpha_s - \frac{4}{9}\alpha_Y & \left(\langle\bar c c\rangle = 0\right) \end{aligned} \qquad (35)$$

which form the outer triangle in Figure 35. Mixing between the $Z$ and $Z'$
Figure 35. Limits on the coupling strengths $\kappa_3$ and $\kappa_1$ in flavor-universal coloron models [62].
alters the $Z\tau\tau$ coupling by [79,62] a shift $\delta g_L^\tau = \frac{1}{2}\delta g_R^\tau$ that is suppressed by $M_Z^2/M_{Z'}^2$ (36), where the top-pion decay constant is $f_t^2 = \frac{3}{8\pi^2}m_t^2\ln\left(\frac{M^2}{m_t^2}\right)$. Keeping $Z \to \tau\tau$ consistent with experiment yields the upper bound labeled (5). Both $ZZ'$ mixing and coloron exchange contribute [79,62] to $\Delta\rho$, through terms $\Delta\rho^{(C)}$ and $\Delta\rho^{(Z')}$ (37), yielding upper bound (4). Finally, requiring that the Landau pole of the strongly-coupled $U(1)_h$ group lie sufficiently far above the symmetry-breaking scale $M$ yields the curves labeled (6a,b,c) according to whether the separation of scales is by a factor of 10, $10^2$, or $10^5$. The combined limits [62] indicate that the coloron coupling is not far below critical ($\kappa_3 \sim 1.9$) while $\kappa_1 \sim 1$. Similar constraints exist [77] for the original TC2 models.
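The step from the DØ limit $M_c/\cot\theta > 837$ GeV to a multi-TeV coloron mass in strongly coupled models can be sketched using the relation $\kappa_3 = \alpha_s\cot^2\theta_3$ from the coloron discussion. The value $\alpha_s = 0.12$ and the near-critical choice $\kappa_3 = 2$ are assumptions made here for illustration, as are the function names; the result is only meant to land in the same range as the 3.4 TeV figure quoted in the text.

```python
import math

def cot_theta(kappa3, alpha_s):
    """Invert kappa3 = alpha_s * cot^2(theta) for cot(theta)."""
    return math.sqrt(kappa3 / alpha_s)

def coloron_mass_bound(kappa3, alpha_s=0.12, ratio_bound_gev=837.0):
    """Lower bound on M_c (GeV) implied by M_c / cot(theta) > ratio_bound_gev."""
    return ratio_bound_gev * cot_theta(kappa3, alpha_s)

if __name__ == "__main__":
    for kappa3 in (1.0, 1.9, 2.0):
        print(f"kappa3 = {kappa3:.1f}  ->  M_c > {coloron_mass_bound(kappa3) / 1000.0:.2f} TeV")
```

Because the bound scales as $\sqrt{\kappa_3/\alpha_s}$, a weakly coupled coloron is constrained far less severely than one near criticality.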
615
Figure 36. NJL gap equation for dynamical generation of fermion mass.
Table 4. Third generation quark charge assignments in top seesaw models.

              (t,b)_L    t_R, b_R    chi_L    chi_R
  SU(3)_h        3          1          1        3
  SU(3)_l        1          3          3        1
  SU(2)          2          1          1        1
Top Seesaw Models
Top seesaw models [75] include an extended $SU(3)_h \times SU(3)_\ell$ color group which spontaneously breaks to $SU(3)_{QCD}$, while the electroweak gauge sector is standard. In addition to the ordinary quarks, there exist weak-singlet quarks $\chi$ which mix with the top quark; some variants [85, 65] include weak-singlet partners for the $b$, or for all quarks, or weak-doublet partners for some quarks. The color and weak quantum numbers of the third-generation quarks are shown in Table 4. When the $SU(3)_h$ coupling becomes strong, the dynamical mass of the top quark is created through a combination of $\bar{t}_L \chi_R$ condensation and seesaw mixing:

$$
\begin{pmatrix} \bar{t}_L & \bar{\chi}_L \end{pmatrix}
\begin{pmatrix} 0 & m_{t\chi} \\ \mu_{\chi t} & \mu_{\chi\chi} \end{pmatrix}
\begin{pmatrix} t_R \\ \chi_R \end{pmatrix} .
\tag{38}
$$
Composite scalars $\bar{t}_L \chi_R$ are also created by the strong dynamics. The phenomenology of the weak-singlet quarks has received some attention in the literature. Experimental limits on weak isospin violation ($\Delta\rho$) provide a key constraint on models in which top has a weak-singlet partner and bottom does not. Even including a weak-singlet partner for the $b$ quark cannot altogether alleviate this, as data on $R_b$ limits the mixing between $b$ and its partner. A combination of precision electroweak bounds and triviality considerations limits the $\chi$ quarks and the composite scalar to the mass range shown in Figure 37. The exotic quarks are required [85] to have masses in excess of about 5 TeV. Note that the upper bound on the scalar mass from electroweak constraints at lower values of $M_\chi$ is looser than in the SM because [86] the model contains extra contributions to $\Delta\rho$.
Figure 37. Electroweak [85] and triviality [86] bounds on the masses of the exotic quarks and composite scalars in a top seesaw model. The allowed region is within the banana-shaped region and to the left of the diagonal line.
Direct searches for weak-singlet quarks are limited to lower mass ranges; while they cannot probe the partner of the top, they are potentially sensitive to weak-singlet partners of the lighter quarks. For example, a heavy mostly-weak-singlet quark $q^H$ could contribute [87] to the FNAL top dilepton sample via

$$ p\bar{p} \to q^H \bar{q}^H \to q^L W\, \bar{q}^L W \to q^L \bar{q}^L\, \ell \nu_\ell\, \ell' \nu_{\ell'} . \tag{39} $$
Comparing the number of dilepton events to the SM prediction yields a lower bound on $M_{q^H}$. The limits will be weaker than that for a sequential 4th generation quark because the mostly-singlet $q^H$ do not always decay via the charged-current weak interactions. The $d^H$ branching fraction to $d^H \to W u^L$ is only about 60% due to competition from the flavor-conserving neutral current process $d^H \to Z d^L$. In the case of $b^H$, the cross-generation charged-current decay is also Cabibbo suppressed and the channel $b^H \to Z b^L$ dominates. As a result, Run 1 data places the limit [87] $M_{s^H, d^H} \gtrsim 140$ GeV, but cannot directly constrain $M_{b^H}$. In models where all three generations of quarks have weak-singlet partners, self-consistency requires [87] $M_{b^H} \gtrsim 160$ GeV.
2.7 Summary
The quest for understanding electroweak symmetry breaking and fermion masses points to physics beyond the SM. In many theories, the top quark is predicted to have unusual properties accessible to experiments at the Fermilab Tevatron's Run II, the LHC or an NLC. New physics associated with the top quark might include new gauge interactions or decay channels, exotic fermions mixing with top, a light supersymmetric partner, strongly-bound top-quark states, or something not yet even imagined. Studying the top quark clearly has tremendous potential to produce results that will be surprising and enlightening.

Acknowledgments

The author thanks R.S. Chivukula and S. Willenbrock for very useful comments on the manuscript. She also acknowledges the support of the NSF POWRE and RAIS Bunting Fellowship programs. This work was supported in part by the National Science Foundation under grant PHY-0074274 and by the Department of Energy under grant DE-FG02-91ER40676.

References

1. M.L. Perl et al., Phys. Rev. Lett. 35, 1489 (1975).
2. S.W. Herb et al., Phys. Rev. Lett. 39, 252 (1977).
3. E. Rice et al., Phys. Rev. Lett. 48, 906 (1982).
4. W. Bartel et al., Phys. Lett. B 146, 437 (1984).
5. C. Campagnari and M. Franklin, Rev. Mod. Phys. 69, 137 (1997).
6. F. Abe et al., Phys. Rev. Lett. 74, 2662 (1995).
7. S. Abachi et al., Phys. Rev. Lett. 74, 2632 (1995).
8. S. Willenbrock, Studying the top quark, hep-ph/0008189.
9. F. Abe et al., Phys. Rev. Lett. 80, 2767 (1998); Phys. Rev. Lett. 80, 2779 (1998); Phys. Rev. Lett. 82, 271 (1999).
10. S. Abachi et al., Phys. Rev. Lett. 79, 1197 (1997); B. Abbott et al., Phys. Rev. Lett. 80, 2063 (1998); Phys. Rev. D 58, 052001 (1998); Phys. Rev. D 60, 052001 (1999).
11. G. Watts, talk given at Heavy Flavours 8, Southampton, England, July 25-29, 1999. http://www-d0.fnal.gov/~gwatts/outside/hf8/
12. D.E. Groom et al., Eur. Phys. J. C 15, 1 (2000).
13. G. Degrassi et al., Phys. Lett. B 418, 209 (1998).
14. LEP Electroweak Working Group, http://lepewwg.web.cern.ch/LEPEWWG/
15. P. Comas et al., Recent Studies on Top Quark Physics at NLC, Proc. Iwate Linear Colliders, eds. A. Miyamoto, Y. Fujii, T. Matsui, and S. Iwata (World Scientific, 1996), p. 455.
16. M.C. Smith and S.S. Willenbrock, Phys. Rev. Lett. 79, 3825 (1997).
17. G. 't Hooft, in The Whys of Subnuclear Physics, Proc. Int'l. School of Subnuclear Physics, Erice, 1977, ed. A. Zichichi (Plenum, NY, 1979), p. 943; A. Mueller, in QCD - 20 Years Later, Proc. Workshop, Aachen, Germany, 1992, eds. P. Zerwas and H. Kastrup (World Scientific, Singapore, 1993), V. 1, p. 162.
18. M. Beneke and V. Braun, Nucl. Phys. B 426, 301 (1994); I. Bigi et al., Phys. Rev. D 50, 2234 (1994).
19. A.H. Hoang et al., Eur. Phys. J. C 3, 1 (2000).
20. M. Beneke, Phys. Lett. B 434, 115 (1998).
21. I. Bigi et al., Phys. Rev. D 56, 4017 (1997).
22. A.H. Hoang and T. Teubner, Phys. Rev. D 60, 114027 (1999).
23. M. Beneke et al., Top Quark Physics, hep-ph/0003033.
24. A.H. Hoang, Top Physics at the LC, presented at the Thinkshop 2 on Top Quark Physics for Run II & Beyond, Fermilab, Batavia, IL, November 10-12, 2000.
25. F. Abe et al., Phys. Rev. Lett. 80, 2773 (1998).
26. S. Abachi et al., Phys. Rev. Lett. 79, 1203 (1997).
27. E. Laenen, J. Smith, and W.L. van Neerven, Phys. Lett. B 321, 254 (1994); E. Berger and H. Contopanagos, Phys. Rev. D 57, 253 (1998); R. Bonciani et al., Nucl. Phys. B 529, 424 (1998).
28. B. Abbott et al., Phys. Rev. Lett. 82, 2457 (1999).
29. T. Affolder et al., Phys. Rev. Lett. 85, 2062 (2000).
30. T. Affolder et al., Measurement of the Top Quark p(t) Distribution, Fermilab-Pub-00/101, 2000.
31. D. Amidei and R. Brock, eds., Future ElectroWeak Physics at the Fermilab Tevatron: Report of the TeV2000 Study Group, Fermilab-Pub-96/046. http://www-theory.fnal.gov/TeV2000.html
32. V. Barger, J. Ohnemus, and R. Phillips, Int. J. Mod. Phys. A 4, 617 (1989).
33. I. Bigi et al., Phys. Lett. B 181, 157 (1986).
34. M. Jezabek and J.H. Kuhn, Phys. Lett. B 329, 317 (1994).
35. S. Parke and Y. Shadmi, Phys. Lett. B 387, 199 (1996).
36. T. Stelzer and S. Willenbrock, Phys. Lett. B 374, 169 (1996).
37. Y. Hara, Prog. Theor. Phys. 86, 779 (1991).
38. G. Mahlon and S. Parke, Phys. Lett. B 411, 173 (1997).
39. B. Abbott et al., Phys. Rev. Lett. 85, 256 (2000).
40. M.C. Smith and S. Willenbrock, Phys. Rev. D 54, 6696 (1996); T. Stelzer, Z. Sullivan, and S. Willenbrock, Phys. Rev. D 56, 5919 (1997).
41. B. Abbott et al., Search for electroweak production of single top quarks in pp collisions, hep-ex/0008024.
42. L. Dudko, Search for Single Top Quark Production at the Tevatron, Proc. 35th Rencontres de Moriond, Les Arcs, France, March 11-18, 2000.
43. T. Affolder et al., Phys. Rev. Lett. 84, 216 (2000).
44. K. Tollefson, CDF Collaboration, Fermilab-Conf-98/389-E, 1998.
45. F. Abe et al., Phys. Rev. Lett. 80, 2525 (1998).
46. For a review of supersymmetry, consult: H. Murayama in these TASI-2000 lectures; or S. Dawson, SUSY and Such, hep-ph/9712464.
47. G. Anderson, D. Castano, and A. Riotto, Phys. Rev. D 55, 2950 (1997); H. Murayama and M. Peskin, Ann. Rev. Nucl. Part. Sci. 46, 533 (1996).
48. For a review of dynamical electroweak symmetry breaking, consult: R.S. Chivukula in these TASI-2000 lectures or in Models of Electroweak Symmetry Breaking, hep-ph/9803219; or K. Lane, Technicolor 2000, hep-ph/0007304.
49. See, for example, the "experimental symptoms of new top physics" list located at http://b0ndl0.fnal.gov/~regina/thinkshop/ts.html
50. For a review, see J. Gunion et al., The Higgs Hunter's Guide (Addison-Wesley, 1990).
51. B. Abbott et al., Phys. Rev. Lett. 82, 4975 (1999).
52. S. Snyder, Proc. EPS-HEP 99, Tampere, Finland, 15-21 July 1999, hep-ex/9910029.
53. M. Machacek and M. Vaughn, Nucl. Phys. B 222, 83 (1983); C. Ford et al., Nucl. Phys. B 395, 17 (1993).
54. L. Ibanez, Nucl. Phys. B 218, 514 (1983); L. Ibanez and G. Ross, Phys. Lett. B 110, 215 (1982); J. Ellis, D. Nanopoulos, and K. Tamvakis, Phys. Lett. B 121, 123 (1983); L. Alvarez-Gaume, J. Polchinski, and M. Wise, Nucl. Phys. B 221, 495 (1983); B. Ananthanarayan, G. Lazarides, and Q. Shafi, Phys. Rev. D 44, 1613 (1991).
55. V. Barger, M. Berger, and P. Ohmann, Phys. Rev. D 49, 4908 (1994).
56. P. Chankowski, S. Pokorski, and J. Rosiek, Phys. Lett. B 274, 191 (1992); Phys. Lett. B 281, 100 (1992); Y. Okada, M. Yamaguchi, and T. Yanagida, Prog. Theor. Phys. 85 (1991); Phys. Lett. B 262, 54 (1991); J. Espinosa and M. Quiros, Phys. Lett. B 267, 27 (1991); Phys. Lett. B 266, 389 (1991); H. Haber and R. Hempfling, Phys. Rev. D 48, 4280 (1993); Phys. Rev. Lett. 66, 1815 (1991); J. Gunion and A. Turski, Phys. Rev. D 39, 2701 (1989); Phys. Rev. D 40, 2333 (1990); M. Berger, Phys. Rev. D 41, 225 (1990); K. Sasaki, M. Carena, and C. Wagner, Nucl. Phys. B 381, 66 (1992); R. Barbieri and M. Frigeni, Phys. Lett. B 258, 395 (1991); J. Ellis, G. Ridolfi, and F. Zwirner, Phys. Lett. B 257, 83 (1991); Phys. Lett. B 262, 477 (1991); R. Hempfling and A. Hoang, Phys. Lett. B 331, 99 (1994); R. Barbieri, F. Caravaglios, and M. Frigeni, Phys. Lett. B 258, 167 (1991); H. Haber, R. Hempfling, and A. Hoang, Z. Phys. C 75, 539 (1997); M. Carena, M. Quiros, and C. Wagner, Nucl. Phys. B 461, 407 (1996); M. Carena et al., Phys. Lett. B 355, 209 (1995).
57. T. Affolder et al., Phys. Rev. Lett. 84, 5704 (2000); Phys. Rev. Lett. 84, 5273 (2000).
58. R. Demina et al., Phys. Rev. D 62, 035011 (2000).
59. G.L. Kane and S. Mrenna, Phys. Rev. Lett. 77, 3502 (1996); G. Mahlon and G.L. Kane, Phys. Rev. D 55, 2779 (1997); M. Hosch et al., Phys. Rev. D 58, 034002 (1998).
60. R.S. Chivukula, E.H. Simmons, and J. Terning, Phys. Lett. B 331, 383 (1994).
61. C.T. Hill, Phys. Lett. B 266, 419 (1991); S.P. Martin, Phys. Rev. D 45, 4283 (1992); Phys. Rev. D 46, 2197 (1992); Nucl. Phys. B 398, 359 (1993); M. Lindner and D. Ross, Nucl. Phys. B 370, 30 (1992); R. Bonisch, Phys. Lett. B 268, 394 (1991); C.T. Hill et al., Phys. Rev. D 47, 2940 (1993); C.T. Hill, Phys. Lett. B 345, 483 (1995); R.S. Chivukula and H. Georgi, Phys. Rev. D 58, 115009 (1998); Phys. Rev. D 58, 075004 (1998).
62. R.S. Chivukula, A.G. Cohen, and E.H. Simmons, Phys. Lett. B 380, 92 (1996); M.B. Popovic and E.H. Simmons, Phys. Rev. D 58, 095007 (1998); K. Lane, Phys. Lett. B 433, 96 (1998).
63. R.S. Chivukula, S.B. Selipsky, and E.H. Simmons, Phys. Rev. Lett. 69, 575 (1992); R.S. Chivukula et al., Phys. Lett. B 311, 157 (1993).
64. D.J. Muller and S. Nandi, Phys. Lett. B 383, 345 (1996); E. Malkawi, T. Tait, and C.-P. Yuan, Phys. Lett. B 385, 304 (1996).
65. G. Burdman and N. Evans, Phys. Rev. D 59, 115005 (1999); H.-J. He, T. Tait, and C.P. Yuan, Phys. Rev. D 62, 011702 (2000).
66. R.S. Chivukula, E.H. Simmons, and J. Terning, Phys. Rev. D 53, 5258 (1996).
67. K.R. Lynch et al., Phys. Rev. D, to appear, hep-ph/0007286.
68. A.P. Heinson, Proc. 31st Rencontres de Moriond, Les Arcs, France, March 23-30, 1996; A.P. Heinson, A.S. Belyaev, and E.E. Boos, Phys. Rev. D 56, 3114 (1997); M.C. Smith and S. Willenbrock, Phys. Rev. D 54, 6696 (1996).
69. E.H. Simmons, Phys. Rev. D 55, 5494 (1997).
70. B.W. Lynn, Michael E. Peskin, and R.G. Stuart, Radiative Corrections in SU(2), Proc. of LEP Physics Workshop, CERN Report 1985; M.B. Einhorn, D.R.T. Jones, and M. Veltman, Nucl. Phys. B 191, 146 (1981); M. Peskin and T. Takeuchi, Phys. Rev. D 46, 381 (1992).
71. T. Appelquist et al., Phys. Rev. Lett. 53, 1523 (1984); Phys. Rev. D 31, 1676 (1985).
72. R.S. Chivukula, Phys. Rev. Lett. 61, 2657 (1988).
73. R.S. Chivukula, A.G. Cohen, and K.D. Lane, Nucl. Phys. B 343, 554 (1990).
74. V.A. Miransky, M. Tanabashi, and K. Yamawaki, Phys. Lett. B 221, 177 (1989); Mod. Phys. Lett. A 4, 1043 (1989); Y. Nambu, EFI-89-08 (1989), unpublished; W. Marciano, Phys. Rev. Lett. 62, 2793 (1989); W.A. Bardeen, C.T. Hill, and M. Lindner, Phys. Rev. D 41, 1647 (1990).
75. B.A. Dobrescu and C.T. Hill, Phys. Rev. Lett. 81, 2634 (1998); R.S. Chivukula et al., Phys. Rev. D 59, 075003 (1999).
76. Y. Nambu and G. Jona-Lasinio, Phys. Rev. 122, 345 (1961); Phys. Rev. 124, 246 (1961).
77. R.S. Chivukula, B.A. Dobrescu, and J. Terning, Phys. Lett. B 353, 289 (1995); K. Lane and E. Eichten, Phys. Lett. B 352, 382 (1995); K. Lane, Phys. Rev. D 54, 2204 (1996); G. Buchalla et al., Phys. Rev. D 53, 5185 (1996); K. Lane, Phys. Lett. B 433, 96 (1998).
78. F. Abe et al., Phys. Rev. Lett. 82, 2038 (1999).
79. R.S. Chivukula and J. Terning, Phys. Lett. B 385, 209 (1996).
80. G. Burdman, Phys. Lett. B 403, 101 (1997).
81. C.-X. Yue et al., Phys. Rev. D 62, 055005 (2000).
82. G. Burdman, Phys. Rev. Lett. 83, 2888 (1999).
83. H.-J. He and C.P. Yuan, Phys. Rev. Lett. 83, 28 (1999).
84. I.A. Bertram and E.H. Simmons, Phys. Lett. B 443, 347 (1998); E.H. Simmons, Phys. Rev. D 55, 1678 (1997).
85. H. Collins, A. Grant, and H. Georgi, Phys. Rev. D 61, 055002 (2000).
86. R.S. Chivukula and N. Evans, Phys. Lett. B 464, 244 (1999); R.S. Chivukula, N. Evans, and C. Hoelbling, Phys. Rev. Lett. 85, 511 (2000).
87. M. Popovic and E.H. Simmons, Phys. Rev. D 62, 035002 (2000).
NEUTRINO MASS, MIXING, AND OSCILLATION

B. KAYSER
National Science Foundation, 4201 Wilson Blvd., Arlington, VA 22230, USA
E-mail: [email protected]
Do neutrinos have nonzero masses? If they do, then these masses are very tiny, and can be sought only in very sensitive experiments. The most sensitive of these search for neutrino oscillation, a quantum interference effect which requires neutrino mass and leptonic mixing. In these lectures, we explain what leptonic mixing is, and then develop the physics of neutrino oscillation, including the general formalism and its application to special cases of practical interest. We also see how neutrino oscillation is affected by the passage of the oscillating neutrinos through matter.
1 Introduction
We humans, and the everyday objects around us, are made of nucleons and electrons, so these particles are the ones most familiar to us. However, for every nucleon or electron, the universe as a whole contains around a billion neutrinos. In addition, each person on earth is bombarded by $\sim 10^{14}$ neutrinos, coming from the sun, every second. Clearly, neutrinos are abundant. Thus, it would be very nice to know something about them.

One of the most basic questions we can ask about neutrinos is: Do they have nonzero masses? Until recently, there was no hard evidence that they do. It was known that, at the heaviest, they are very light compared to the quarks and charged leptons. But for a long time, the natural theoretical prejudice has been that the neutrinos are not massless. One reason for this prejudice is that in essentially any of the grand unified theories that unify the strong, electromagnetic, and weak interactions, a given neutrino $\nu$ belongs to a large multiplet $F$, together with at least one charged lepton $\ell$, one positively-charged quark $q^+$, and one negatively-charged quark $q^-$:

$$ F = \begin{pmatrix} \nu \\ \ell \\ q^+ \\ q^- \end{pmatrix} . \tag{1} $$

The neutrino $\nu$ is related by a symmetry operation to the other members of $F$, all of which have nonzero masses. Thus, it would be peculiar if $\nu$ did not have a nonzero mass as well.

If neutrinos do have nonzero masses, we must understand why they are nevertheless so light. Perhaps the most appealing explanation of their lightness is the "see-saw mechanism" [1]. To understand how this mechanism works, let us note that, unlike charged particles, neutrinos may be their own antiparticles. If a neutrino is identical to its antiparticle, then it consists of just two mass-degenerate states: one with spin up, and one with spin down. Such a neutrino is referred to as a Majorana neutrino. In contrast, if a neutrino is distinct from its antiparticle, then it plus its antiparticle form a complex consisting of four mass-degenerate states: the spin up and spin down neutrino, plus the spin up and spin down antineutrino. This collection of four states is called a Dirac neutrino.

In the see-saw mechanism, a four-state Dirac neutrino $\nu^D$ of mass $M_D$ gets split by "Majorana mass terms" into a pair of two-state Majorana neutrinos. One of these Majorana neutrinos, $\nu^M$, has a small mass $M_\nu$ and is identified as one of the observed light neutrinos. The other, $N^M$, has a large mass $M_N$ characteristic of some high mass scale where new physics beyond the range of current particle accelerators, and responsible for neutrino mass, resides. Thus, $N^M$ has not been observed. The character of the breakup of $\nu^D$ into $\nu^M$ and $N^M$ is such that $M_\nu M_N \simeq M_D^2$. It is reasonable to expect that $M_D$, the mass of the Dirac particle $\nu^D$, is of the order of $M_{\ell\,\mathrm{or}\,q}$, the mass of a typical charged lepton $\ell$ or quark $q$, since the latter are Dirac particles too. Then $M_\nu M_N \sim M^2_{\ell\,\mathrm{or}\,q}$. With $M_{\ell\,\mathrm{or}\,q}$ a typical charged lepton or quark mass, and $M_N$ very big, this "see-saw relation" explains why $M_\nu$ is very tiny. Note that the see-saw mechanism predicts that each light neutrino $\nu^M$ is a two-state Majorana neutrino, identical to its antineutrino.
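As a rough numerical illustration of the see-saw relation $M_\nu M_N \simeq M_D^2$, the following sketch computes the light mass for a chosen Dirac mass and heavy scale. The function name and the input values ($M_D = 1$ GeV, $M_N = 10^{10}$ GeV) are illustrative assumptions, not numbers taken from the lectures.

```python
GEV_TO_EV = 1.0e9  # 1 GeV expressed in eV


def seesaw_light_mass_gev(m_dirac_gev: float, m_heavy_gev: float) -> float:
    """Light Majorana mass M_nu (GeV) from the see-saw relation M_nu * M_N ~ M_D**2."""
    if m_heavy_gev <= 0:
        raise ValueError("heavy mass scale must be positive")
    return m_dirac_gev ** 2 / m_heavy_gev


# Illustrative inputs only (assumptions, not values from the text):
M_D_GEV = 1.0      # Dirac mass of the order of a charged lepton or quark mass
M_N_GEV = 1.0e10   # hypothetical heavy scale

m_nu_ev = seesaw_light_mass_gev(M_D_GEV, M_N_GEV) * GEV_TO_EV
print(f"M_nu is roughly {m_nu_ev:.3g} eV")
```

Raising the assumed heavy scale $M_N$ lowers $M_\nu$ in proportion, which is the sense in which a very large $M_N$ accounts for a very small neutrino mass.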
2 Neutrino Oscillation
To find out whether neutrinos really do have nonzero masses, we need an experimental approach which can detect these masses even if they are very small. The most sensitive approach is the search for neutrino oscillation. Neutrino oscillation is a quantum interference phenomenon in which small splittings between the masses of different neutrinos can lead to large, measurable phase differences between interfering quantum-mechanical amplitudes.

To explain the physics of neutrino oscillation, we must first discuss leptonic "flavor". Suppose a neutrino $\nu$ is born in the W-boson decay

$$ W^+ \to \ell_\alpha^+ + \nu . \tag{2} $$

Here, $\alpha = e, \mu$, or $\tau$, and $\ell_\alpha^+$ is one of the positively charged leptons: $\ell_e^+ \equiv e^+$, $\ell_\mu^+ \equiv \mu^+$, and $\ell_\tau^+ \equiv \tau^+$. Suppose that, without having time to change its character, the neutrino $\nu$ interacts in a detector immediately after its birth in the decay (2), and produces a new charged lepton $\ell_\beta^-$ via the reaction $\nu + \mathrm{target} \to \ell_\beta^- + \mathrm{recoils}$. It is found that the "flavor" $\beta$ of this new charged lepton is always the same as the flavor $\alpha$ of the charged lepton with which $\nu$ was born. It follows that the neutrinos produced by the W decays (2) to charged leptons of different flavors must be different objects. We take this fact into account by writing these decays more accurately as

$$ W^+ \to \ell_\alpha^+ + \nu_\alpha ; \qquad \alpha = e, \mu, \tau . \tag{3} $$
The neutrino $\nu_\alpha$, called the neutrino of flavor $\alpha$, is by definition the neutrino produced in leptonic W decay in association with the charged lepton of flavor $\alpha$. As we have said, when $\nu_\alpha$ interacts to create a charged lepton, the latter lepton is always $\ell_\alpha$.

In neutrino oscillation, a neutrino born in association with a charged lepton $\ell_\alpha$ of flavor $\alpha$ then travels for some time during which it can alter its character. Finally, it interacts to produce a second charged lepton $\ell_\beta$ with a flavor $\beta$ different from the flavor $\alpha$ of the charged lepton with which the neutrino was born. For example, suppose a neutrino is born with a muon in the pion decay $\pi^+ \to \mathrm{Virtual}\ W^+ \to \mu^+ + \nu_\mu$. Suppose further that after traveling down a neutrino beamline, this same neutrino interacts in a detector and produces, not another muon, but a $\tau^-$. At birth, the neutrino was a $\nu_\mu$. But by the time it interacted in the detector, it had turned into a $\nu_\tau$. One describes this metamorphosis by saying the neutrino oscillated from a $\nu_\mu$ into a $\nu_\tau$. As we will see, the probability for it to change its flavor does indeed oscillate with the distance it travels before interacting. As we will also see, the oscillation in vacuum of a neutrino between different flavors requires neutrino mass.

To see how neutrino mass can lead to neutrino oscillation, let us briefly recall the weak interactions of quarks. As we all know, there are three quarks, the $u$ (up), $c$ (charm), and $t$ (top) quarks, which carry a positive electric charge $Q = +2/3$. In addition, there are three quarks, the $d$ (down), $s$ (strange), and $b$ (bottom) quarks, which carry a negative electric charge $Q = -1/3$. Each of these six quarks is a particle of definite mass. As we know, the quarks are arranged into three families or generations, each of which contains one positive quark and one negative quark:

$$
\begin{array}{lccc}
\text{Family:} & 1 & 2 & 3 \\[4pt]
\text{Quarks:} & \begin{pmatrix} u \\ d \end{pmatrix} & \begin{pmatrix} c \\ s \end{pmatrix} & \begin{pmatrix} t \\ b \end{pmatrix}
\end{array}
$$
However, we know experimentally that under the weak interaction, any of the negative quarks, $d$, $s$, or $b$, can absorb a positively-charged W boson and turn into any of the positive quarks, $u$, $c$, or $t$. This is illustrated in Fig. (1). There, $d_i$ ($i = d, s, b$) is one of the down-type (negative) quarks. That is, $d_d \equiv d$ is the down quark, $d_s \equiv s$ is the strange quark, and so on. Similarly, $u_\alpha$ ($\alpha = u, c, t$) is one of the up-type (positive) quarks, with $u_c \equiv c$ being the charm quark, and so on.

Figure 1: Absorption of a W boson by a quark.

In the S(tandard) M(odel) of the electroweak interactions [2], the W-quark couplings depicted in Fig. (1) are described by the Lagrangian density

$$ \mathcal{L}_{udW} = -\frac{g}{\sqrt{2}} \sum_{\substack{\alpha = u,c,t \\ i = d,s,b}} \bar{u}_{L\alpha} \gamma^\lambda V_{\alpha i}\, d_{Li}\, W^+_\lambda + \mathrm{h.c.} \tag{4} $$
Here, the subscript $L$ denotes left-handed chiral projection. For instance, $d_{Ls} = \frac{1}{2}(1 - \gamma_5)\, d_s$ is the left-handed strange-quark field. The constant $g$ is the semiweak coupling constant, and $V$ is a $3 \times 3$ matrix known as the quark mixing matrix. In the SM, $V$ is unitary, because it is basically the matrix for the transformation from one basis of quantum states to another. The SM interaction (4) is very well confirmed experimentally.

With this established behavior of quarks in mind, let us now return to the leptons. Like the quarks, the charged leptons $e$, $\mu$, and $\tau$ are particles of definite mass. However, if leptons behave as quarks do, then the neutrinos $\nu_e$, $\nu_\mu$, and $\nu_\tau$ of definite flavor are not particles of definite mass. Let us call the neutrinos which do have definite masses $\nu_i$, $i = 1, 2, \ldots, N$. As far as we know, the number of $\nu_i$, $N$, may exceed the number of charged leptons, three. Now, as we recall, the neutrino $\nu_\alpha$ of definite flavor $\alpha$ is the neutrino state that accompanies the definite-mass charged lepton $\ell_\alpha$ in the decay $W \to \ell_\alpha + \nu_\alpha$. If leptons behave as quarks do, this neutrino state must be a superposition of the neutrinos $\nu_i$ of definite mass. To see this, we first note that just as any negative quark $d_i$ of definite mass can absorb a $W^+$ and turn into any positive quark $u_\alpha$ of definite mass, so it must be possible for any neutrino $\nu_i$ of definite mass to absorb a $W^-$ and turn into any charged lepton $\ell_\alpha^-$ of definite mass. This absorption is illustrated in Fig. (2).

Figure 2: Absorption of a W boson by a neutrino.

We expect that in analogy with Eq. (4) for the W-quark couplings, the SM interaction that describes the W-lepton couplings is

$$ \mathcal{L}_{\ell\nu W} = -\frac{g}{\sqrt{2}} \sum_{\substack{\alpha = e,\mu,\tau \\ i = 1,\ldots,N}} \left( \bar{\ell}_{L\alpha} \gamma^\lambda U_{\alpha i}\, \nu_{Li}\, W^-_\lambda + \bar{\nu}_{Li} \gamma^\lambda U^*_{\alpha i}\, \ell_{L\alpha}\, W^+_\lambda \right) . \tag{5} $$
Here, $U$ is an $N \times N$ unitary matrix which is the leptonic analogue of the quark mixing matrix $V$. The matrix $U$ is referred to as the leptonic mixing matrix [3]. If $N > 3$, then only the top 3 rows of $U$ enter in the W-lepton interaction, Eq. (5).

The leptonic decays of the $W^+$ are governed by the second term of $\mathcal{L}_{\ell\nu W}$, Eq. (5). From this term, we see that when $W^+ \to \ell_\alpha^+ + \text{"}\nu_\alpha\text{"}$, the neutrino state $|\nu_\alpha\rangle$ produced in association with the specific definite-mass charged lepton $\ell_\alpha^+$ is

$$ |\nu_\alpha\rangle = \sum_i U^*_{\alpha i}\, |\nu_i\rangle . \tag{6} $$

That is, the "flavor-$\alpha$" neutrino $|\nu_\alpha\rangle$ produced together with $\ell_\alpha^+$ is a coherent superposition of the mass-eigenstate neutrinos $|\nu_i\rangle$, with coefficients which are elements of the leptonic mixing matrix.

What if $N$ is bigger than three? Suppose, for example, that $N = 4$. Then, with the elements of the bottom row of $U$, $U_{\mathrm{last\,row},\,i}$, we can construct a neutrino state

$$ |\nu_s\rangle = \sum_i U^*_{\mathrm{last\,row},\,i}\, |\nu_i\rangle \tag{7} $$

which does not couple to any of the 3 charged leptons. This state is called a "sterile" neutrino, which just means that it does not participate in the SM weak interactions. It may, however, participate in other interactions beyond the SM whose effects at present-day energies are too feeble to have been observed.

Owing to the leptonic mixing described by Eq. (5), when the charged lepton of flavor $\alpha$ is created, the accompanying neutrino can be any of the $\nu_i$. Furthermore, if this $\nu_i$ later interacts with some target, it can produce a charged lepton $\ell_\beta$ of any flavor $\beta$. In such a sequence of events, the neutrino itself is an unseen intermediate state. Thus, as shown in Fig. (3), the amplitude for a neutrino to be born with charged lepton $\ell_\alpha$, and then to interact and produce charged lepton $\ell_\beta$, is a coherent sum over the contributions of all the unseen mass eigenstates $\nu_i$. The birth of a neutrino with charged lepton $\ell_\alpha^+$ and its subsequent interaction to produce charged lepton $\ell_\beta^-$ is usually described as the oscillation $\nu_\alpha \to \nu_\beta$ of a neutrino of flavor $\alpha$ into one of flavor $\beta$ (see earlier discussion).

Figure 3: Creation of a neutrino with a charged lepton of flavor $\alpha$, followed by the interaction of this neutrino to produce a charged lepton of flavor $\beta$. The "Source" is the particle whose decay creates the neutrino, $\ell_\alpha$, and other, unlabelled, particles. The "Target" is the particle struck by the neutrino to produce $\ell_\beta$ and other, unlabelled, particles. "A" denotes an amplitude.

Using "A" to denote an amplitude, we see from Fig. (3) that

$$ A(\nu_\alpha \to \nu_\beta) = \sum_i \left[ A(\text{neutrino born with } \ell_\alpha^+ \text{ is } \nu_i) \times A(\nu_i \text{ propagates}) \times A(\text{when } \nu_i \text{ interacts it makes } \ell_\beta^-) \right] . \tag{8} $$
From $\mathcal{L}_{\ell\nu W}$, Eq. (5), we find that apart from irrelevant factors,

$$ A(\text{neutrino born with } \ell_\alpha^+ \text{ is } \nu_i) = U^*_{\alpha i} . \tag{9} $$

Similarly,

$$ A(\text{when } \nu_i \text{ interacts it makes } \ell_\beta^-) = U_{\beta i} . \tag{10} $$

To find the amplitude $A(\nu_i \text{ propagates})$, we note that in the rest frame of $\nu_i$, where the proper time is $\tau_i$, Schrödinger's equation states that

$$ i \frac{\partial}{\partial \tau_i} |\nu_i(\tau_i)\rangle = M_i\, |\nu_i(\tau_i)\rangle . \tag{11} $$

Here, $M_i$ is the mass of $\nu_i$. From Eq. (11),

$$ |\nu_i(\tau_i)\rangle = e^{-i M_i \tau_i}\, |\nu_i(0)\rangle . \tag{12} $$

Now, for propagation over a proper time interval $\tau_i$, $A(\nu_i \text{ propagates})$ is just the amplitude for finding the original state $|\nu_i(0)\rangle$ in the time-evolved $|\nu_i(\tau_i)\rangle$. That is,

$$ A(\nu_i \text{ propagates}) = \langle \nu_i(0) | \nu_i(\tau_i) \rangle = e^{-i M_i \tau_i} . \tag{13} $$

In terms of the time $t$ and position $L$ in the laboratory frame, the Lorentz-invariant phase factor $\exp(-i M_i \tau_i)$ is

$$ e^{-i (E_i t - p_i L)} . \tag{14} $$

Here, $E_i$ and $p_i$ are, respectively, the energy and momentum of $\nu_i$ in the laboratory frame. In practice, our neutrino will be highly relativistic, so if it was born at $(t, L) = (0, 0)$, we will be interested in evaluating the phase factor (14) where $t \approx L$, where it becomes

$$ e^{-i (E_i - p_i) L} . \tag{15} $$

Suppose that the neutrino created with $\ell_\alpha$ is produced with a definite momentum $p$, regardless of which $\nu_i$ it happens to be. Then, if it is the particular mass eigenstate $\nu_i$, it has total energy

$$ E_i = \sqrt{p^2 + M_i^2} \approx p + \frac{M_i^2}{2p} , \tag{16} $$

assuming that all the masses $M_i$ are much smaller than $p$. From (15), we then find that

$$ A(\nu_i \text{ propagates}) \approx e^{-i \frac{M_i^2}{2p} L} . \tag{17} $$

Alternatively, suppose that our neutrino is produced with a definite energy $E$, regardless of which $\nu_i$ it happens to be [5]. Then, if it is the particular mass eigenstate $\nu_i$, it has momentum

$$ p_i = \sqrt{E^2 - M_i^2} \approx E - \frac{M_i^2}{2E} . \tag{18} $$

From (15), we then find that

$$ A(\nu_i \text{ propagates}) \approx e^{-i \frac{M_i^2}{2E} L} . \tag{19} $$
Since highly relativistic neutrinos have $E \approx p$, the propagation amplitudes given by Eqs. (17) and (19) are approximately equal. Thus, it doesn't matter whether our neutrino is created with definite momentum or definite energy.

Collecting the various factors that appear in Eq. (8), we conclude that the amplitude $A(\nu_\alpha \to \nu_\beta)$ for a neutrino of energy $E$ to oscillate from a $\nu_\alpha$ to a $\nu_\beta$ while traveling a distance $L$ is given by

$$ A(\nu_\alpha \to \nu_\beta) = \sum_i U^*_{\alpha i}\, e^{-i M_i^2 \frac{L}{2E}}\, U_{\beta i} . \tag{20} $$
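The relativistic expansions in Eqs. (16) and (18) can be checked numerically. The sketch below compares the exact values of $E_i - p$ (definite momentum) and $E - p_i$ (definite energy) with the approximation $M_i^2/2p$. The sample values (a 1 GeV neutrino with a 1 eV mass) are illustrative assumptions. High-precision decimals are used because the difference is about one part in $10^{18}$ of $p$, which ordinary double-precision floats cannot resolve.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60  # enough digits to resolve E - p for M << p


def exact_phase_rate_fixed_momentum(p_ev: str, m_ev: str) -> Decimal:
    """Exact E_i - p in eV for a neutrino of definite momentum p (Eq. 16)."""
    p, m = Decimal(p_ev), Decimal(m_ev)
    return (p * p + m * m).sqrt() - p


def exact_phase_rate_fixed_energy(e_ev: str, m_ev: str) -> Decimal:
    """Exact E - p_i in eV for a neutrino of definite energy E (Eq. 18)."""
    e, m = Decimal(e_ev), Decimal(m_ev)
    return e - (e * e - m * m).sqrt()


def approximate_phase_rate(scale_ev: str, m_ev: str) -> Decimal:
    """Leading-order M^2 / (2 p), or M^2 / (2 E), in eV."""
    return Decimal(m_ev) ** 2 / (2 * Decimal(scale_ev))


# Illustrative values (assumptions): p = E = 1 GeV, M = 1 eV.
SCALE_EV, MASS_EV = "1e9", "1"
approx_value = approximate_phase_rate(SCALE_EV, MASS_EV)
exact_p = exact_phase_rate_fixed_momentum(SCALE_EV, MASS_EV)
exact_e = exact_phase_rate_fixed_energy(SCALE_EV, MASS_EV)

rel_err_p = abs(exact_p - approx_value) / approx_value
rel_err_e = abs(exact_e - approx_value) / approx_value
print(f"relative error, definite momentum: {rel_err_p:.3e}")
print(f"relative error, definite energy:   {rel_err_e:.3e}")
```

Both relative errors are of order $M^2/p^2$, so for these sample values the two propagation phases are indistinguishable in practice.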
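The probability $P(\nu_\alpha \to \nu_\beta)$ for this oscillation is then given by

$$
\begin{aligned}
P(\nu_\alpha \to \nu_\beta) = |A(\nu_\alpha \to \nu_\beta)|^2 = \delta_{\alpha\beta}
&- 4 \sum_{i>j} \Re\!\left( U^*_{\alpha i} U_{\beta i} U_{\alpha j} U^*_{\beta j} \right) \sin^2\!\left( \delta M^2_{ij} \frac{L}{4E} \right) \\
&+ 2 \sum_{i>j} \Im\!\left( U^*_{\alpha i} U_{\beta i} U_{\alpha j} U^*_{\beta j} \right) \sin\!\left( \delta M^2_{ij} \frac{L}{2E} \right) .
\end{aligned}
\tag{21}
$$

Here, $\delta M^2_{ij} \equiv M_i^2 - M_j^2$, and in calculating $|A(\nu_\alpha \to \nu_\beta)|^2$, we have used the unitarity constraint

$$ \sum_i U^*_{\alpha i} U_{\beta i} = \delta_{\alpha\beta} . \tag{22} $$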
Some general comments are in order:

1. From Eq. (21) for $P(\nu_\alpha \to \nu_\beta)$, we see that if all neutrino masses vanish, then $P(\nu_\alpha \to \nu_\beta) = \delta_{\alpha\beta}$, and there is no oscillation from one flavor to another. Neutrino flavor oscillation requires neutrino mass.

2. The probability $P(\nu_\alpha \to \nu_\beta)$ oscillates as a function of $L/E$. This is why the phenomenon we are discussing is called "neutrino oscillation".

3. From Eq. (20) for $A(\nu_\alpha \to \nu_\beta)$, we see that the $L/E$ dependence of neutrino oscillation arises from interferences between the contributions of the different mass eigenstates $\nu_i$. We also see that the phase of the $\nu_i$ contribution is proportional to $M_i^2$. Thus, the interferences can give us information on neutrino masses. However, since these interferences can only reveal the relative phases of the interfering amplitudes, experiments on neutrino oscillation can only determine the splittings $\delta M^2_{ij} = M_i^2 - M_j^2$, and not the underlying individual neutrino masses. This fact is made perfectly clear by Eq. (21) for $P(\nu_\alpha \to \nu_\beta)$.

4. With the so-far omitted factors of $\hbar$ and $c$ inserted,

$$ \delta M^2_{ij} \frac{L}{4E} = 1.27\, \delta M^2_{ij}(\mathrm{eV}^2)\, \frac{L(\mathrm{km})}{E(\mathrm{GeV})} . \tag{23} $$

Thus, from Eq. (21) for the oscillation probability $P(\nu_\alpha \to \nu_\beta)$, we see that an oscillation experiment characterized by a given value of $L(\mathrm{km})/E(\mathrm{GeV})$ is sensitive to mass splittings obeying

$$ \delta M^2_{ij}(\mathrm{eV}^2) \gtrsim \left[ \frac{L(\mathrm{km})}{E(\mathrm{GeV})} \right]^{-1} . \tag{24} $$

To be sensitive to tiny $\delta M^2_{ij}$, an experiment must have large $L/E$. In Table 1, we indicate the $\delta M^2$ reach implied by Eq. (24) for experiments working with neutrinos produced in various ways.

5. There are basically two kinds of oscillation experiments: appearance experiments, and disappearance experiments. In an appearance experiment, one looks for the appearance in the neutrino beam of neutrinos bearing a flavor not present in the beam initially. For example, imagine that a beam of neutrinos is produced by the decays of charged pions. Such a beam consists almost entirely of muon neutrinos and contains no tau neutrinos. One can then look for the appearance of tau neutrinos, made by oscillation of the muon neutrinos, in this beam. In a disappearance experiment, one looks for the disappearance of some fraction of the neutrinos bearing a flavor which is present in the beam initially. For example, imagine again that a beam of neutrinos is produced by the decays of charged pions, so that almost all the neutrinos in the beam are muon neutrinos, $\nu_\mu$. If one knows the $\nu_\mu$ flux that is produced initially, one can look to see whether some of this initial $\nu_\mu$ flux disappears after the beam has traveled some distance, and the muon neutrinos have had a chance to oscillate into other flavors.
Table 1: The approximate reach in $\delta M^2$ of experiments studying various types of neutrinos. Often, an experiment covers a range in $L$ and a range in $E$. To construct the table, we have used typical values of these quantities.

  Neutrinos (Baseline)            L (km)    E (GeV)    L(km)/E(GeV)    delta M^2 (eV^2) Reach
  Accelerator (Short Baseline)    1         1          1               1
  Reactor (Medium Baseline)       1         10^-3      10^3            10^-3
  Accelerator (Long Baseline)     10^3      10         10^2            10^-2
  Atmospheric                     10^4      1          10^4            10^-4
  Solar                           10^8      10^-3      10^11           10^-11
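6. Even though neutrinos can change flavor through oscillation, the total flux of neutrinos in a beam will be conserved so long as $U$ is unitary. To see this, note that

$$\sum_\beta P(\nu_\alpha \to \nu_\beta) = \sum_\beta \Big(\sum_i U^*_{\alpha i} U_{\beta i}\, e^{-iM_i^2 \frac{L}{2E}}\Big)\Big(\sum_j U_{\alpha j} U^*_{\beta j}\, e^{+iM_j^2 \frac{L}{2E}}\Big) = \sum_i |U_{\alpha i}|^2 = 1. \tag{25}$$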
Here, we have used Eq. (20) for the amplitude $A(\nu_\alpha \to \nu_\beta)$ and the unitarity relations $\sum_\beta U_{\beta i} U^*_{\beta j} = \delta_{ij}$ and $\sum_i |U_{\alpha i}|^2 = 1$. The result $\sum_\beta P(\nu_\alpha \to \nu_\beta) = 1$ means that if one starts with a certain number $n$ of neutrinos of flavor $\alpha$, then after oscillation the number of neutrinos that have oscillated away into new flavors $\beta \neq \alpha$, plus the number that have retained the original flavor $\alpha$, is still $n$. Note, however, that some of the new flavors that get populated by the oscillation might be sterile. If they are indeed sterile, then the number of "active" neutrinos (i.e., neutrinos that participate in the SM weak interactions) remaining after oscillation will be less than $n$.

2.1 Special Cases
Let us now apply the general formalism for neutrino oscillation in vacuum to several special cases of practical interest. The simplest special case of all is two-neutrino oscillation. This occurs when the SM weak interaction, Eq. (5), couples two charged leptons (say, $e$ and $\mu$) to just two neutrinos of definite mass, $\nu_1$ and $\nu_2$, and only negligibly to any other neutrinos of definite mass. It is then easily shown that the $2 \times 2$ submatrix

$$\begin{pmatrix} U_{e1} & U_{e2} \\ U_{\mu 1} & U_{\mu 2} \end{pmatrix} \tag{26}$$

of the mixing matrix $U$ must be unitary all by itself. This means that the definite-flavor neutrinos $\nu_e$ and $\nu_\mu$ are composed exclusively of the mass eigenstates $\nu_1$ and $\nu_2$, and do not mix with neutrinos of any other flavor. From Eq. (20) for the oscillation amplitude, we have for this two-neutrino case

$$e^{+iM_1^2 \frac{L}{2E}} A(\nu_e \to \nu_\mu) = U^*_{e1} U_{\mu 1} + U^*_{e2} U_{\mu 2}\, e^{-i\delta M_{21}^2 \frac{L}{2E}}. \tag{27}$$

Since $U$ is unitary, $U^*_{e1} U_{\mu 1} + U^*_{e2} U_{\mu 2} = 0$, so Eq. (27) may be rewritten as

$$e^{+iM_1^2 \frac{L}{2E}} A(\nu_e \to \nu_\mu) = -U^*_{e2} U_{\mu 2}\big(1 - e^{-i\delta M_{21}^2 \frac{L}{2E}}\big) = -2i\, e^{-i\delta M_{21}^2 \frac{L}{4E}}\, U^*_{e2} U_{\mu 2} \sin\Big(\delta M_{21}^2 \frac{L}{4E}\Big). \tag{28}$$
Squaring this result, using Eq. (23) to take account of the requisite factors of $\hbar$ and $c$, we find that the probability $P(\nu_e \to \nu_\mu)$ for a $\nu_e$ to oscillate into a $\nu_\mu$ is given by

$$P(\nu_e \to \nu_\mu) = 4|U_{e2}|^2 |U_{\mu 2}|^2 \sin^2\Big(1.27\, \delta M^2(\mathrm{eV}^2)\, \frac{L(\mathrm{km})}{E(\mathrm{GeV})}\Big). \tag{29}$$

Here, we have introduced the abbreviation $\delta M_{21}^2 \equiv \delta M^2$. From the $e \leftrightarrow \mu$ symmetry of the right-hand side of Eq. (29), it is obvious that

$$P(\nu_\mu \to \nu_e) = P(\nu_e \to \nu_\mu). \tag{30}$$

The probability $P(\nu_\alpha \to \nu_\alpha)$ that a neutrino $\nu_\alpha$ of flavor $\alpha = e$ or $\mu$ retains its original flavor is given by

$$P(\nu_\alpha \to \nu_\alpha) = 1 - P(\nu_\alpha \to \nu_{\beta \neq \alpha}) = 1 - 4|U_{\alpha 2}|^2 \big(1 - |U_{\alpha 2}|^2\big) \sin^2\Big(1.27\, \delta M^2(\mathrm{eV}^2)\, \frac{L(\mathrm{km})}{E(\mathrm{GeV})}\Big). \tag{31}$$
Figure 4: A three-neutrino (Mass)$^2$ spectrum in which the $\nu_2 - \nu_1$ splitting $\delta M^2_{\mathrm{Small}}$ is much smaller than the splitting $\delta M^2_{\mathrm{Big}}$ between $\nu_3$ and the $\nu_2 - \nu_1$ pair. The latter pair may be at either the bottom or the top of the spectrum.
To obtain this expression, we have used the conservation of probability, Eq. (25), the "off-diagonal" oscillation probability, Eqs. (29) and (30), and the unitarity relation $|U_{e2}|^2 + |U_{\mu 2}|^2 = 1$. The unitarity of $U$, Eq. (26), implies that it can be written in the form

$$U = \begin{pmatrix} e^{i\varphi_1} \cos\theta & e^{i\varphi_2} \sin\theta \\ -e^{i(\varphi_1 + \varphi_3)} \sin\theta & e^{i(\varphi_2 + \varphi_3)} \cos\theta \end{pmatrix}. \tag{32}$$

Here, $\theta$ is an angle referred to as the leptonic mixing angle and $\varphi_{1,2,3}$ are phases. From Eq. (32), $4|U_{e2}|^2 |U_{\mu 2}|^2 = \sin^2 2\theta$, so that $P(\nu_e \to \nu_\mu)$, Eq. (29), takes the form

$$P(\nu_e \to \nu_\mu) = P(\nu_\mu \to \nu_e) = \sin^2 2\theta\, \sin^2\Big(1.27\, \delta M^2(\mathrm{eV}^2)\, \frac{L(\mathrm{km})}{E(\mathrm{GeV})}\Big). \tag{33}$$
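This is the most commonly quoted form of the two-neutrino oscillation probability.

A second special case which may prove to be very relevant to the real world is a three-neutrino scenario in which two of the neutrino mass eigenstates are nearly degenerate. That is, the neutrino (Mass)$^2$ spectrum is as in Fig. (4), where

$$|\delta M_{21}^2| \equiv \delta M^2_{\mathrm{Small}} \ll |\delta M_{31}^2| \cong |\delta M_{32}^2| \equiv \delta M^2_{\mathrm{Big}}. \tag{34}$$

All three of the charged leptons, $e$, $\mu$, and $\tau$, are coupled by the SM weak interaction, Eq. (5), to the neutrinos $\nu_{1,2,3}$. Suppose that an oscillation experiment has $L/E$ such that $\delta M^2_{\mathrm{Big}} L/E$ is of order unity, which implies that $\delta M^2_{\mathrm{Small}} L/E \ll 1$. Then, for $\beta \neq \alpha$, the oscillation amplitude of Eq. (20) is given approximately by

$$e^{iM_3^2 \frac{L}{2E}} A(\nu_\alpha \to \nu_{\beta \neq \alpha}) \cong \big(U^*_{\alpha 1} U_{\beta 1} + U^*_{\alpha 2} U_{\beta 2}\big)\, e^{i\delta M_{32}^2 \frac{L}{2E}} + U^*_{\alpha 3} U_{\beta 3}. \tag{35}$$

Using the unitarity constraint of Eq. (22), this becomes

$$e^{iM_3^2 \frac{L}{2E}} A(\nu_\alpha \to \nu_{\beta \neq \alpha}) \cong U^*_{\alpha 3} U_{\beta 3}\big(1 - e^{i\delta M_{32}^2 \frac{L}{2E}}\big) = -2i\, e^{i\delta M_{32}^2 \frac{L}{4E}}\, U^*_{\alpha 3} U_{\beta 3} \sin\Big(\delta M_{32}^2 \frac{L}{4E}\Big). \tag{36}$$

Taking the absolute square of this relation, using $|\delta M_{32}^2| = \delta M^2_{\mathrm{Big}}$, and inserting the omitted factors of $\hbar$ and $c$, we find that the $\nu_\alpha \to \nu_{\beta \neq \alpha}$ oscillation probability is given by

$$P(\nu_\alpha \to \nu_{\beta \neq \alpha}) = 4|U_{\alpha 3}|^2 |U_{\beta 3}|^2 \sin^2\Big(1.27\, \delta M^2_{\mathrm{Big}}(\mathrm{eV}^2)\, \frac{L(\mathrm{km})}{E(\mathrm{GeV})}\Big). \tag{37}$$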
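To find the corresponding probability $P(\nu_\alpha \to \nu_\alpha)$ that a neutrino of flavor $\alpha$ retains its original flavor, we simply use the conservation of probability, Eq. (25):

$$P(\nu_\alpha \to \nu_\alpha) = 1 - \sum_{\beta \neq \alpha} P(\nu_\alpha \to \nu_\beta). \tag{38}$$

From Eq. (37) and the unitarity relation $\sum_{\beta \neq \alpha} |U_{\beta 3}|^2 = 1 - |U_{\alpha 3}|^2$, we then find that

$$P(\nu_\alpha \to \nu_\alpha) = 1 - 4|U_{\alpha 3}|^2 \big(1 - |U_{\alpha 3}|^2\big) \sin^2\Big(1.27\, \delta M^2_{\mathrm{Big}}(\mathrm{eV}^2)\, \frac{L(\mathrm{km})}{E(\mathrm{GeV})}\Big). \tag{39}$$

Comparing Eqs. (37) and (29), and Eqs. (39) and (31), we see that in the three-neutrino scenario with $\delta M^2_{\mathrm{Small}} L/E \ll 1$, the oscillation probabilities have the same form as in the two-neutrino case, with $\delta M^2$ replaced by $\delta M^2_{\mathrm{Big}}$ and $U_{\alpha 2}$ replaced by $U_{\alpha 3}$.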
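As a numerical illustration of Eqs. (25), (37) and (39), the sketch below compares the exact three-neutrino probabilities computed from the amplitude of Eq. (20) with the approximate expressions valid when $\delta M^2_{\mathrm{Small}} L/E \ll 1$. The mixing angles, the masses and the value of $L/E$ are arbitrary illustrative choices, and the real (phase-free) parametrization of $U$ is an assumption made only to keep the example short.

```python
import cmath
import math


def mixing_matrix(t12, t13, t23):
    """Real, orthogonal 3x3 mixing matrix U[alpha][i] built from three angles."""
    c12, s12 = math.cos(t12), math.sin(t12)
    c13, s13 = math.cos(t13), math.sin(t13)
    c23, s23 = math.cos(t23), math.sin(t23)
    return [
        [c12 * c13, s12 * c13, s13],
        [-s12 * c23 - c12 * s23 * s13, c12 * c23 - s12 * s23 * s13, s23 * c13],
        [s12 * s23 - c12 * c23 * s13, -c12 * s23 - s12 * c23 * s13, c23 * c13],
    ]


def prob_exact(U, m2_ev2, a, b, l_over_e):
    """|A|^2 with A = sum_i U*_{a i} exp(-i M_i^2 L/2E) U_{b i}, as in Eq. (20).

    M_i^2 L/2E in natural units equals 2 * 1.27 * M_i^2(eV^2) * L(km)/E(GeV).
    """
    amp = sum(
        complex(U[a][i]).conjugate()
        * cmath.exp(-2j * 1.27 * m2_ev2[i] * l_over_e)
        * U[b][i]
        for i in range(3)
    )
    return abs(amp) ** 2


def prob_approx(U, dm2_big_ev2, a, b, l_over_e):
    """Eqs. (37) and (39): only the nu_3 column of U and dM^2_Big enter."""
    osc = math.sin(1.27 * dm2_big_ev2 * l_over_e) ** 2
    ua3_sq = abs(U[a][2]) ** 2
    if a == b:
        return 1.0 - 4.0 * ua3_sq * (1.0 - ua3_sq) * osc
    return 4.0 * ua3_sq * abs(U[b][2]) ** 2 * osc


# Illustrative inputs (assumptions, not fitted values).
U = mixing_matrix(0.6, 0.15, 0.785)
M2_EV2 = [0.0, 1.0e-6, 3.0e-3]   # nu_1 and nu_2 nearly degenerate
DM2_BIG_EV2 = 3.0e-3
L_OVER_E = 250.0                 # km/GeV, so dM^2_Big * L/E is of order unity

FLAVORS = ("e", "mu", "tau")
exact = [[prob_exact(U, M2_EV2, a, b, L_OVER_E) for b in range(3)] for a in range(3)]
approx = [[prob_approx(U, DM2_BIG_EV2, a, b, L_OVER_E) for b in range(3)] for a in range(3)]

for a in range(3):
    for b in range(3):
        print(f"P({FLAVORS[a]} -> {FLAVORS[b]}): exact {exact[a][b]:.5f}, approx {approx[a][b]:.5f}")
    print(f"  sum over final flavors (exact): {sum(exact[a]):.12f}")
```

With these inputs the row sums illustrate the conservation of probability, Eq. (25), and the two columns agree up to corrections of order $\delta M^2_{\mathrm{Small}} L/E$.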
3 Neutrino Oscillation in Matter
So far, we have been talking about neutrino oscillation in vacuum. However, some very important oscillation experiments are concerned with neutrinos that travel through a lot of matter before reaching the detector. These neutrinos include those made by nuclear reactions in the core of the sun, which traverse a lot of solar material on their way out of the sun towards solar neutrino detectors here on earth. They also include the neutrinos made in the earth's atmosphere by cosmic rays. These atmospheric neutrinos can be produced in the atmosphere on one side of the earth, and then travel through the whole earth before being detected in a detector on the other side.

To deal with the solar and atmospheric neutrinos, we need to understand how passage through matter affects neutrino oscillation. To be sure, the interaction between neutrinos and matter is extremely feeble. Nevertheless, the coherent forward scattering of neutrinos from many particles in a material medium can build up a big effect on the oscillation amplitude.

For both the solar and atmospheric neutrinos, it is a good approximation to take just two neutrinos into account. Furthermore, it is convenient to treat the propagation of neutrinos in matter in terms of an effective Hamiltonian. To set the stage for this treatment, let us first derive the Hamiltonian for travel through vacuum. For the sake of illustration, let us suppose that the two neutrino flavors that need to be considered are $\nu_e$ and $\nu_\mu$. The most general time-dependent neutrino state vector $|\nu(t)\rangle$ can then be written as

$$|\nu(t)\rangle = \sum_\alpha f_\alpha(t)\, |\nu_\alpha\rangle, \tag{40}$$

where $f_\alpha(t)$ is the time-dependent amplitude for the neutrino to have flavor $\alpha$. If $\mathcal{H}_V$ (short for $\mathcal{H}_{\mathrm{Vacuum}}$) is the Hamiltonian for this two-neutrino system in vacuum, then Schrödinger's equation for $|\nu(t)\rangle$ reads

$$i\frac{\partial}{\partial t}|\nu(t)\rangle = \sum_\alpha i\,\frac{df_\alpha(t)}{dt}\,|\nu_\alpha\rangle = \mathcal{H}_V |\nu(t)\rangle = \sum_\beta f_\beta(t)\, \mathcal{H}_V |\nu_\beta\rangle = \sum_\alpha \Big[\sum_\beta (\mathcal{H}_V)_{\alpha\beta}\, f_\beta(t)\Big] |\nu_\alpha\rangle. \tag{41}$$

Here, $(\mathcal{H}_V)_{\alpha\beta} = \langle \nu_\alpha | \mathcal{H}_V | \nu_\beta \rangle$. Comparing the coefficients of $|\nu_\alpha\rangle$ at the beginning and end of Eq. (41), we clearly have

$$i\frac{d}{dt}\begin{pmatrix} f_e(t) \\ f_\mu(t) \end{pmatrix} = \mathcal{H}_V \begin{pmatrix} f_e(t) \\ f_\mu(t) \end{pmatrix}, \tag{42}$$
where $\mathcal{H}_V$ is now the $2 \times 2$ matrix with elements $(\mathcal{H}_V)_{\alpha\beta}$. The Schrödinger equation (42) is completely analogous to the familiar one for a spin-1/2 particle. The roles of the two spin states are now being played by the two flavor states.

Let us call the two neutrino mass eigenstates out of which $\nu_e$ and $\nu_\mu$ are made $\nu_1$ and $\nu_2$. To find the matrix $\mathcal{H}_V$, let us assume that our neutrino has a definite momentum $p$, so that its mass-eigenstate component $\nu_i$ has definite energy $E_i$ given by Eq. (16). That is, $\mathcal{H}_V |\nu_i\rangle = E_i |\nu_i\rangle$, and the different mass eigenstates $|\nu_i\rangle$, like the eigenstates of any Hermitean Hamiltonian, are orthogonal to each other. Then, in view of Eq. (6), the elements $(\mathcal{H}_V)_{\alpha\beta}$ of the vacuum Hamiltonian are given by

$$(\mathcal{H}_V)_{\alpha\beta} = \langle \nu_\alpha | \mathcal{H}_V | \nu_\beta \rangle = \Big(\sum_i U_{\alpha i} \langle \nu_i |\Big)\, \mathcal{H}_V\, \Big(\sum_j U^*_{\beta j} |\nu_j\rangle\Big) = \sum_i U_{\alpha i} U^*_{\beta i} E_i. \tag{43}$$
In this expression, the two-neutrino mixing matrix $U$ may be taken from Eq. (32). However, we have seen that the complex phase factors in Eq. (32) have no effect on the two-neutrino oscillation probabilities, Eq. (33). Indeed, it is not hard to show that when there are only two neutrinos, complex phases in $U$ have no effect whatsoever on neutrino oscillation. Thus, since oscillation is our only concern here, we may remove the complex phase factors from the $U$ of Eq. (32). If, in addition, we relabel the mixing angle $\theta_V$ (short for $\theta_{\mathrm{Vacuum}}$), $U$ becomes

$$U = \begin{pmatrix} \cos\theta_V & \sin\theta_V \\ -\sin\theta_V & \cos\theta_V \end{pmatrix}. \tag{44}$$
Inserting in Eq. (43) the elements $U_{\alpha i}$ of this matrix and the energies $E_i$ given by Eq. (16), we can obtain all the $(\mathcal{H}_V)_{\alpha\beta}$.

The matrix $\mathcal{H}_V$ can be put into a more symmetric and convenient form if we add to it a suitably chosen multiple $(\Delta E) I$ of the identity matrix $I$. Such an addition will not change the predictions of $\mathcal{H}_V$ for neutrino oscillation. To see why, we note first that the identity matrix is invariant under the unitary transformation that diagonalizes $\mathcal{H}_V$. Thus, adding $(\Delta E) I$ to $\mathcal{H}_V$ in the flavor basis, where its elements are $(\mathcal{H}_V)_{\alpha\beta}$, is equivalent to adding $(\Delta E) I$ to $\mathcal{H}_V$ in the mass eigenstate basis, where it is diagonal. Hence, if the eigenvalues of $\mathcal{H}_V$ are $E_i$, $i = 1, 2$, those of $\mathcal{H}_V + (\Delta E) I$ are $E_i + \Delta E$, $i = 1, 2$. That is, both eigenvalues are displaced by the same amount, $\Delta E$.

To see that such a common shift of all eigenvalues does not affect neutrino oscillation, suppose a neutrino is born at time $t = 0$ with flavor $\alpha$. That is, $|\nu(0)\rangle = |\nu_\alpha\rangle$. After a time $t$, this neutrino will have evolved into the state $|\nu(t)\rangle$ given, according to Schrödinger's equation, by

$$|\nu(t)\rangle = e^{-i\mathcal{H}_V t} |\nu(0)\rangle = e^{-i\mathcal{H}_V t} \sum_i U^*_{\alpha i} |\nu_i\rangle = \sum_i U^*_{\alpha i}\, e^{-iE_i t} |\nu_i\rangle. \tag{45}$$

The amplitude $A(\nu_\alpha \to \nu_\beta)$ for this neutrino to have oscillated into a $\nu_\beta$ in the time $t$ is then given by

$$A(\nu_\alpha \to \nu_\beta) = \langle \nu_\beta | \nu(t) \rangle = \sum_i U^*_{\alpha i}\, e^{-iE_i t}\, U_{\beta i}. \tag{46}$$
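Clearly, if we add $(\Delta E) I$ to $\mathcal{H}_V$ so that its eigenvalues $E_i$ are replaced by $E_i + \Delta E$, then $A(\nu_\alpha \to \nu_\beta)$ is just multiplied by the overall phase factor $\exp[-i(\Delta E)t]$. Obviously, this phase factor has no effect on the oscillation probability $P(\nu_\alpha \to \nu_\beta) = |A(\nu_\alpha \to \nu_\beta)|^2$. Thus, the addition of $(\Delta E) I$ to $\mathcal{H}_V$ does not affect neutrino oscillation.

For our purposes, the most convenient choice of $\Delta E$ is $-[p + (M_1^2 + M_2^2)/4p]$. Then, from Eq. (43), the new effective Hamiltonian $\mathcal{H}'_V = \mathcal{H}_V + (\Delta E) I$ has the matrix elements

$$(\mathcal{H}'_V)_{\alpha\beta} = \sum_i U_{\alpha i} U^*_{\beta i} E_i - \Big[p + \frac{M_1^2 + M_2^2}{4p}\Big] \delta_{\alpha\beta}. \tag{47}$$

From Eqs. (44) and (16), this gives [7]

$$\mathcal{H}'_V = \frac{\delta M_{21}^2}{4E} \begin{pmatrix} -\cos 2\theta_V & \sin 2\theta_V \\ \sin 2\theta_V & \cos 2\theta_V \end{pmatrix}. \tag{48}$$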
Here, we have used the fact that $p \cong E$, the energy of the neutrino averaged over its two mass-eigenstate components. We leave to the reader the instructive exercise of verifying that, inserted into the Schrödinger Eq. (42), the $\mathcal{H}'_V$ of Eq. (48) does indeed lead to the usual two-neutrino oscillation probability, Eq. (33).

With the Hamiltonian that governs neutrino propagation through the vacuum in hand, let us now ask how neutrino propagation is modified by the presence of matter. Matter, of course, consists of electrons and nucleons. When passing through a sea of electrons and nucleons, a (non-sterile) neutrino can undergo the forward elastic scatterings depicted in Fig. (5). Coherent forward scatterings, via the pictured processes, from many particles in a material medium will give rise to an interaction potential energy of the neutrino in the medium. Since one of the reactions in Fig. (5) can occur only for electron neutrinos, this interaction potential energy will depend on whether the neutrino is a $\nu_e$ or not. The interaction potential energy for a neutrino of flavor $\alpha$ must be added to the matrix element $(\mathcal{H}'_V)_{\alpha\alpha}$ to obtain the Hamiltonian for propagation of a neutrino in matter. [8]

Figure 5: Forward elastic scattering of a neutrino from a particle of matter. (a) W-exchange-induced scattering from an electron, which is possible only for a $\nu_e$. (b) Z-exchange-induced scattering from an electron, proton, or neutron. This is possible for $\nu_\alpha = \nu_e$, $\nu_\mu$, or $\nu_\tau$. According to the Standard Model, the amplitude for this Z exchange is the same, for any given target particle, for all three active neutrino flavors.

An important application of this physics is to the motion of solar neutrinos through solar material. The solar neutrinos are produced in the center of the sun by nuclear reactions such as $p + p \to {}^2\mathrm{H} + e^+ + \nu_e$. The neutrinos produced by these reactions are all electron neutrinos. Let us suppose that the only neutrinos with which electron neutrinos mix appreciably are muon neutrinos, so that we have a two-neutrino system of the kind we have just been discussing. The solar neutrinos stream outward from the center of the sun in all directions, some of them eventually arriving at solar neutrino detectors here on earth. The passage of these solar neutrinos through solar material on their way out of the sun modifies their oscillation.

Any neutrino which is still a $\nu_e$, as it was at birth, can interact with solar electrons via the W exchange of Fig. (5a). This interaction leads to an interaction potential energy $V_W(\nu_e)$ of an electron neutrino in the sun. This $V_W(\nu_e)$ is obviously proportional to the Fermi coupling constant $G_F$, which governs the amplitude for the process in Fig. (5a). It is also proportional to the number of electrons per unit volume, $N_e$, at the location of the neutrino, since $N_e$ measures the number of electrons
which can contribute coherently to the forward $\nu_e$ scattering. One can show [9] that in the Standard Model,

$$V_W(\nu_e) = \sqrt{2}\, G_F N_e. \tag{49}$$

This energy must be added to $(\mathcal{H}'_V)_{ee}$ to obtain the Hamiltonian for propagation of a neutrino in the sun. In principle, the interaction energy produced by the Z exchanges of Fig. (5b) must also be added to $\mathcal{H}'_V$. However, since these Z exchanges are both flavor diagonal and flavor independent, their contribution to the Hamiltonian is a multiple of the identity matrix. As we have already seen, a contribution of this character does not affect neutrino oscillation. Thus, we may safely ignore it.

In incorporating $V_W(\nu_e)$ into the Hamiltonian, it is convenient, in the interest of symmetry, to add as well the multiple $-\frac{1}{2} V_W(\nu_e) I$ of the identity matrix. Thus, with $\mathcal{H}'_V$ the Hamiltonian for propagation in vacuum given by Eq. (48), the Hamiltonian $\mathcal{H}_\odot$ for propagation in the sun is given by [7]

$$\mathcal{H}_\odot = \mathcal{H}'_V + \frac{G_F N_e}{\sqrt{2}} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \frac{\delta M_{21}^2 D_\odot}{4E} \begin{pmatrix} -\cos 2\theta_\odot & \sin 2\theta_\odot \\ \sin 2\theta_\odot & \cos 2\theta_\odot \end{pmatrix}. \tag{50}$$

In this expression,

$$D_\odot = \sqrt{\sin^2 2\theta_V + (\cos 2\theta_V - x_\odot)^2} \tag{51}$$

and

$$\sin^2 2\theta_\odot = \frac{\sin^2 2\theta_V}{\sin^2 2\theta_V + (\cos 2\theta_V - x_\odot)^2}, \tag{52}$$

where

$$x_\odot = \frac{2\sqrt{2}\, G_F N_e E}{\delta M_{21}^2}. \tag{53}$$

The angle $\theta_\odot$ is the effective neutrino mixing angle in the sun when the electron density is $N_e$.

We note that $\mathcal{H}_\odot$, Eq. (50), has precisely the same form as $\mathcal{H}'_V$, Eq. (48). The only difference between these two Hamiltonians is that the parameters (the mixing angle and the effective neutrino (Mass)$^2$ splitting out in front of the matrix) have different values. Of course, the electron density $N_e$ is not a constant, but depends on the distance $r$ from the center of the sun. Thus, the parameters $\theta_\odot$ and $\delta M_{21}^2 D_\odot$ are not constant either, unlike their counterparts, $\theta_V$ and $\delta M_{21}^2$, in $\mathcal{H}'_V$. However, let us imagine for a moment that $N_e$ is a constant. Then $\mathcal{H}_\odot$, like $\mathcal{H}'_V$, is independent of position, and must lead to the same oscillation probability, Eq. (33), as $\mathcal{H}'_V$ does, except for the substitutions $\theta_V \to \theta_\odot$ and $\delta M_{21}^2 \to \delta M_{21}^2 D_\odot$. That is, in matter of constant electron density $N_e$, $\mathcal{H}_\odot$ leads to the oscillation probability

$$P(\nu_e \to \nu_\mu) = P(\nu_\mu \to \nu_e) = \sin^2 2\theta_\odot\, \sin^2\Big[1.27\, \delta M_{21}^2(\mathrm{eV}^2)\, D_\odot\, \frac{L(\mathrm{km})}{E(\mathrm{GeV})}\Big]. \tag{54}$$
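The step from the Hamiltonian of Eq. (50) to the probability of Eq. (54), and with $x_\odot = 0$ the exercise of getting from Eq. (48) to Eq. (33), can be checked numerically. The sketch below integrates the Schrödinger equation (42) with a fourth-order Runge-Kutta scheme and compares the result with the closed-form probability. The dimensionless variable `phase` stands for $\delta M_{21}^2 L/4E$ in natural units, which equals $1.27\,\delta M_{21}^2(\mathrm{eV}^2)L(\mathrm{km})/E(\mathrm{GeV})$; the function names, the integrator and the sample values of $\theta_V$ and $x_\odot$ are illustrative assumptions.

```python
import math


def matter_parameters(theta_v, x):
    """Return (sin^2 2theta_matter, D) from Eqs. (51) and (52)."""
    s, c = math.sin(2 * theta_v), math.cos(2 * theta_v)
    d = math.sqrt(s * s + (c - x) ** 2)
    return (s / d) ** 2, d


def prob_formula(theta_v, x, phase):
    """Eq. (54) with phase = dM^2_21 L / 4E (natural units)."""
    sin2_2theta_m, d = matter_parameters(theta_v, x)
    return sin2_2theta_m * math.sin(d * phase) ** 2


def prob_numeric(theta_v, x, phase, steps=2000):
    """Integrate i df/dphi = M f for a neutrino born as nu_e; return |f_mu|^2.

    M is the matrix of Eq. (50) in units of dM^2_21 / 4E:
    [[-(cos 2theta_V - x), sin 2theta_V], [sin 2theta_V, cos 2theta_V - x]].
    """
    s, c = math.sin(2 * theta_v), math.cos(2 * theta_v)
    m = ((-(c - x), s), (s, c - x))

    def rhs(f):
        return (
            -1j * (m[0][0] * f[0] + m[0][1] * f[1]),
            -1j * (m[1][0] * f[0] + m[1][1] * f[1]),
        )

    f = (1.0 + 0.0j, 0.0 + 0.0j)   # pure nu_e at the start
    h = phase / steps
    for _ in range(steps):
        k1 = rhs(f)
        k2 = rhs((f[0] + 0.5 * h * k1[0], f[1] + 0.5 * h * k1[1]))
        k3 = rhs((f[0] + 0.5 * h * k2[0], f[1] + 0.5 * h * k2[1]))
        k4 = rhs((f[0] + h * k3[0], f[1] + h * k3[1]))
        f = (
            f[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
            f[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0,
        )
    return abs(f[1]) ** 2


THETA_V = 0.1   # illustrative small vacuum mixing angle (radians)
PHASE = 5.0     # illustrative value of dM^2_21 L / 4E

for x in (0.0, math.cos(2 * THETA_V), 2.0):
    print(f"x = {x:.3f}: numeric {prob_numeric(THETA_V, x, PHASE):.8f}, "
          f"formula {prob_formula(THETA_V, x, PHASE):.8f}")
```

The case `x = 0.0` is the vacuum check of Eq. (33), and `x = cos(2 * THETA_V)` is the density at which Eq. (52) gives maximal effective mixing.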
Now let $N_e$ vary with $r$ as it does in the real world. However, suppose that it varies slowly enough that the constant-$N_e$ picture we have just painted applies at any given radius $r$, but with $N_e(r)$, hence $x_\odot(r)$, slowly decreasing as $r$ increases. Suppose also that $\delta M_{21}^2$ and $E$ are such that $x_\odot(r = 0) > \cos 2\theta_V$. Then, assuming that $\cos 2\theta_V > 0$, there must be a radius $r = r_c$ somewhere between $r = 0$ and the outer edge of the sun, where $N_e \to 0$, such that $x_\odot(r_c) = \cos 2\theta_V$. From Eq. (52), we see that at this special radius $r_c$, there is a kind of "resonance" with $\sin^2 2\theta_\odot = 1$, even if $\theta_V$ is tiny. That is, mixing can be maximal in the sun even if it is very small in vacuo. As a result, the oscillation probability, which is proportional to the mixing factor $\sin^2 2\theta_\odot$ as we see in Eq. (54), can be very large.

A nice picture of this enhanced probability for flavor transitions in matter can be gained by considering the neutrino energy eigenvalues and eigenvectors. If we neglect the inconsequential Z exchange contribution, and take $\mathcal{H}_\odot$ from Eq. (50), then the true Hamiltonian for propagation in the sun is

$$\mathcal{H}_{\mathrm{True}} = \mathcal{H}_\odot + \Big[p + \frac{M_1^2 + M_2^2}{4p} + \frac{1}{2} V_W(\nu_e)\Big] I, \tag{55}$$

since the second term in this expression was subtracted from the true Hamiltonian to get $\mathcal{H}_\odot$. Now, $p + (M_1^2 + M_2^2)/4p \cong E$, the energy our neutrino would have in vacuum, averaged over its two mass-eigenstate components. Thus, from Eqs. (55) and (50), (48), (49), and (53), we have in the $\nu_e - \nu_\mu$ basis

$$\mathcal{H}_{\mathrm{True}}(r) = \begin{pmatrix} E & 0 \\ 0 & E \end{pmatrix} + \frac{\delta M_{21}^2}{4E} \begin{pmatrix} -\cos 2\theta_V + 2x_\odot(r) & \sin 2\theta_V \\ \sin 2\theta_V & \cos 2\theta_V \end{pmatrix}. \tag{56}$$

If we continue to assume that $N_e(r)$, and consequently $x_\odot(r)$, varies slowly, we may diagonalize this Hamiltonian for one $r$ at a time to see how a solar neutrino will behave. We find from Eq. (56) that for a given $r$, the energy eigenvalues $E_\pm(r)$ are given by

$$E_\pm(r) = E + \frac{\delta M_{21}^2}{4E} \Big[x_\odot(r) \pm \sqrt{(x_\odot(r) - \cos 2\theta_V)^2 + \sin^2 2\theta_V}\,\Big]. \tag{57}$$
Figure 6: Propagation of a neutrino in the sun. The horizontal axis, running from the outer edge of the sun to the center of the sun, is linear in $N_e$ and $x_\odot$; the vertical axis is the energy. The solid line shows the upper eigenvalue, $E_+(r)$, for small $\theta_V$. The dashed lines show the two eigenvalues, $E_\pm(r)$, when there is no vacuum mixing ($\theta_V = 0$). A neutrino born as a $\nu_e$ follows the path indicated by the arrows and emerges as a $\nu_\mu$.
To explore the implications of these energy levels, let us suppose that $\delta M_{21}^2$ is such that at $r = 0$, where $N_e(r)$ and hence $x_\odot(r)$ has its maximum value, $x_\odot > 1$. From Eq. (53), the value of $G_F$, the value ($\sim 10^{26}$/cc) of $N_e(r = 0)$, and the typical energy $E$ ($\sim 1$ MeV) of a solar neutrino, we find that the required $\delta M_{21}^2$ is of order $10^{-5}\ \mathrm{eV}^2$. The dominant term in the energies $E_\pm(r)$ of Eq. (57) will be the first one, $E$ ($\sim 1$ MeV). However, very interesting physics will result from the second term, despite the fact that this term is only of order $\delta M_{21}^2/4E \sim (10^{-5}\ \mathrm{eV}^2)/1\ \mathrm{MeV} \sim 10^{-17}$ MeV!

The neutrino states which propagate in the sun without mixing significantly with each other are the eigenvectors of $\mathcal{H}_{\mathrm{True}}(r)$. To study these eigenvectors, let us assume for simplicity that the vacuum mixing angle $\theta_V$ is small. Then it quickly follows that, except in the vicinity of the special radius $r_c$ where $x_\odot(r_c) = \cos 2\theta_V$, one of the eigenvectors of $\mathcal{H}_{\mathrm{True}}(r)$, Eq. (56), is essentially pure $\nu_e$, while the other is essentially pure $\nu_\mu$. The evolution of a neutrino traveling outward through the sun is then as depicted in Fig. (6). The neutrino follows the trajectory indicated by the arrows. Produced by some nuclear process, the neutrino is born at small $r$ as a $\nu_e$. From Eq. (56), the eigenvector that is essentially $\nu_e$ at $r = 0$ is the one with the higher energy, $E_+(r)$. Thus, our neutrino begins its outward journey through the sun as the eigenvector belonging to the upper eigenvalue, $E_+$. Since the eigenvectors do not cross at any $r$ and do not mix appreciably, the neutrino will remain this eigenvector. However, the flavor content of this eigenvector changes dramatically as the neutrino passes through the region near the radius $r = r_c$ where $x_\odot(r) - \cos 2\theta_V = 0$. For $r \ll r_c$, the eigenvector belonging to $E_+(r)$ is essentially a $\nu_e$, as one may see from Eq. (56) if one neglects the small off-diagonal terms proportional to $\sin 2\theta_V$, and imposes the small-$r$ condition $x_\odot > \cos 2\theta_V$. But for $r \gg r_c$, the eigenvector belonging to the higher energy level $E_+(r)$ is essentially a $\nu_\mu$, as one may also see from Eq. (56) if one neglects the off-diagonal terms and imposes the large-$r$ condition $x_\odot < \cos 2\theta_V$. Thus, the eigenvector corresponding to the higher energy eigenvalue $E_+(r)$, along which the solar neutrino travels, starts out as a $\nu_e$ at the center of the sun, but ends up as a $\nu_\mu$ at the outer edge of the sun. The solar neutrino, born a $\nu_e$ in the solar core, emerges from the rim of the sun as a $\nu_\mu$. Furthermore, it does this with high probability even if the vacuum mixing angle $\theta_V$ is very small, so that oscillation in vacuum would not have much of an effect. This very efficient conversion of solar electron neutrinos into neutrinos of another flavor as a result of interaction with matter is known as the Mikheyev-Smirnov-Wolfenstein (MSW) effect. [8,10]

We turn now from solar neutrinos to atmospheric neutrinos, whose propagation through the earth entails a second important application of the physics of neutrinos traveling through matter.
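Two statements in the solar discussion lend themselves to a quick numerical check: the size of $x_\odot$ at the solar center for $\delta M_{21}^2 \sim 10^{-5}\ \mathrm{eV}^2$, and the change in flavor content of the upper eigenvector of Eq. (56) as $x_\odot$ falls from its central value to zero. The sketch below uses the quoted $N_e \sim 10^{26}$/cc and $E \sim 1$ MeV; the numerical values of $G_F$ and $\hbar c$ are standard constants inserted here, and $\theta_V = 0.05$ is an arbitrary illustrative choice.

```python
import math

G_F = 1.1664e-5        # Fermi constant in GeV^-2
HBAR_C = 1.97327e-14   # hbar*c in GeV*cm, used to convert cm^-3 to GeV^3


def x_sun(n_e_per_cc, e_gev, dm2_ev2):
    """x = 2*sqrt(2)*G_F*N_e*E / dM^2_21 of Eq. (53), as a pure number."""
    n_e_gev3 = n_e_per_cc * HBAR_C ** 3
    dm2_gev2 = dm2_ev2 * 1.0e-18
    return 2.0 * math.sqrt(2.0) * G_F * n_e_gev3 * e_gev / dm2_gev2


def nu_e_fraction_upper(x, theta_v):
    """nu_e probability content of the eigenvector belonging to E_+ in Eq. (56)."""
    s, c = math.sin(2 * theta_v), math.cos(2 * theta_v)
    root = math.sqrt((x - c) ** 2 + s ** 2)
    # Unnormalized upper eigenvector in the (nu_e, nu_mu) basis.
    v_e, v_mu = s, root - (x - c)
    return v_e ** 2 / (v_e ** 2 + v_mu ** 2)


THETA_V = 0.05   # illustrative small vacuum mixing angle (radians)
x_center = x_sun(n_e_per_cc=1.0e26, e_gev=1.0e-3, dm2_ev2=1.0e-5)
print(f"x at the solar center for dM^2 = 1e-5 eV^2: {x_center:.2f}")

for label, x in (
    ("center of the sun", x_center),
    ("resonance radius", math.cos(2 * THETA_V)),
    ("outer edge of the sun", 0.0),
):
    print(f"{label:22s} x = {x:.3f}, nu_e fraction of upper state = "
          f"{nu_e_fraction_upper(x, THETA_V):.4f}")
```

At $x_\odot = 0$ the upper eigenvector reduces to the vacuum mass eigenstate $\nu_2$, whose $\nu_e$ content is $\sin^2\theta_V$, which is why the emerging neutrino is almost entirely $\nu_\mu$ when $\theta_V$ is small.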
As already mentioned, an atmospheric neutrino can be produced in the atmosphere on one side of the earth, and then journey through the whole earth to be detected in a detector on the other side. While traveling through the earth, this neutrino will undergo interactions that can significantly modify its oscillation pattern.

There is very strong evidence [11] that the atmospheric neutrinos born as muon neutrinos oscillate into neutrinos $\nu_x$ of another flavor. It is known that $\nu_x$ is not a $\nu_e$. It could be a $\nu_\tau$, or a sterile neutrino $\nu_s$, or sometimes one of these and sometimes the other. One way to find out what is becoming of the oscillating muon neutrinos is to see whether their oscillation is affected by their passage through earth-matter. The oscillation $\nu_\mu \to \nu_s$ will be affected, but $\nu_\mu \to \nu_\tau$ will not be.

To see why this is so, let us first assume that the oscillation is $\nu_\mu \to \nu_\tau$. Then the Hamiltonian $\mathcal{H}_E$ (short for $\mathcal{H}_{\mathrm{Earth}}$) that describes neutrino propagation in the earth is a $2 \times 2$ matrix in $\nu_\mu - \nu_\tau$ space. Now, either a $\nu_\mu$ or a $\nu_\tau$ will interact with earth-matter via the Z exchange of Fig. (5b). This interaction will give rise to an interaction potential energy of the neutrino in the earth. However, according to the S(tandard) M(odel), the Z-exchange amplitude is the same for a $\nu_\tau$ as it is for a $\nu_\mu$. Thus, in the case of $\nu_\mu \to \nu_\tau$ oscillation, the contribution of neutrino-matter interaction to $\mathcal{H}_E$ is a multiple of the identity matrix. As we have already seen, such a contribution has no effect on oscillation.

Now, suppose the oscillation is not $\nu_\mu \to \nu_\tau$ but $\nu_\mu \to \nu_s$. Then $\mathcal{H}_E$ is a $2 \times 2$ matrix in $\nu_\mu - \nu_s$ space. A $\nu_\mu$ will interact with earth-matter, as we have discussed, but a $\nu_s$, of course, will not. Thus, the contribution of neutrino-matter interaction to $\mathcal{H}_E$ is now of the form

$$\begin{pmatrix} V_Z(\nu_\mu) & 0 \\ 0 & 0 \end{pmatrix}, \tag{58}$$

where $V_Z(\nu_\mu)$ is the interaction potential energy of muon neutrinos produced by the Z exchange of Fig. (5b). Since the matrix (58) is not a multiple of the identity, neutrino-matter interaction does affect $\nu_\mu \to \nu_s$ oscillation.

According to the SM, the forward Z-exchange amplitudes for a target $e$ and a target $p$ are equal and opposite. Thus, assuming that the earth is electrically neutral so that it contains an equal number of electrons and protons per unit volume, the $e$ and $p$ contributions to $V_Z(\nu_\mu)$ cancel. Then $V_Z(\nu_\mu)$ is proportional to the neutron number density, $N_n$. Taking the proportionality constant from the SM, we have

$$V_Z(\nu_\mu) = -\frac{G_F}{\sqrt{2}} N_n. \tag{59}$$
To obtain the Hamiltonian $\mathcal{H}_E$ for $\nu_\mu \to \nu_s$ oscillation in the earth, we add to the vacuum Hamiltonian $\mathcal{H}'_V$ of Eq. (48) the contribution (58) from matter interactions, using Eq. (59) for $V_Z(\nu_\mu)$. Of course, it must be understood that $\mathcal{H}'_V$ is now to be taken as a matrix in $\nu_\mu - \nu_s$ space, and that the vacuum (Mass)$^2$ splitting $\delta M_{21}^2$ and mixing angle $\theta_V$ in $\mathcal{H}'_V$ are now different parameters than they were when we obtained from $\mathcal{H}'_V$ the Hamiltonian $\mathcal{H}_\odot$ for neutrino propagation in the sun. The quantities $\delta M_{21}^2$ and $\theta_V$ are now new parameters appropriate to the vacuum oscillation of atmospheric, rather than solar, neutrinos. To obtain a more symmetrical and convenient $\mathcal{H}_E$ from $\mathcal{H}'_V$, we also add $-\frac{1}{2} V_Z(\nu_\mu) I$, a multiple of the identity which will not affect the implications of $\mathcal{H}_E$ for oscillation. The result is

$$\mathcal{H}_E = \frac{\delta M_{21}^2 D_E}{4E} \begin{pmatrix} -\cos 2\theta_E & \sin 2\theta_E \\ \sin 2\theta_E & \cos 2\theta_E \end{pmatrix}. \tag{60}$$

Here,

$$D_E = \sqrt{\sin^2 2\theta_V + (\cos 2\theta_V - x_E)^2}, \tag{61}$$

and

$$\sin^2 2\theta_E = \frac{\sin^2 2\theta_V}{\sin^2 2\theta_V + (\cos 2\theta_V - x_E)^2}, \tag{62}$$

where

$$x_E = \frac{\sqrt{2}\, G_F N_n E}{\delta M_{21}^2}. \tag{63}$$

As a rough approximation, we may take the neutron density $N_n$ to be constant throughout the earth. Then, $x_E$ is also a constant, and the Hamiltonian $\mathcal{H}_E$ of Eq. (60) is identical to the vacuum Hamiltonian $\mathcal{H}'_V$ of Eq. (48), except that the constant $\delta M_{21}^2$ is replaced by the constant $\delta M_{21}^2 D_E$, and the constant $\theta_V$ by the constant $\theta_E$. Thus, from the fact that $\mathcal{H}'_V$ leads to the vacuum oscillation probability of Eq. (33) (with $\nu_e \to \nu_\mu$ replaced by $\nu_\mu \to \nu_s$ for the present application), we immediately conclude that the $\mathcal{H}_E$ of Eq. (60) leads to the oscillation probability

$$P(\nu_\mu \to \nu_s) = \sin^2 2\theta_E\, \sin^2\Big[1.27\, \delta M_{21}^2(\mathrm{eV}^2)\, D_E\, \frac{L(\mathrm{km})}{E(\mathrm{GeV})}\Big]. \tag{64}$$
As we see from Eq. (63), $x_E$, which is a measure of the influence of matter effects on atmospheric neutrinos, grows with energy $E$. In a moment we will see that for $E \sim 1$ GeV, matter effects are negligible. Fits to data [12] on atmospheric neutrinos with roughly this energy have led to the conclusion that atmospheric neutrino oscillation involves a neutrino (Mass)$^2$ splitting $\delta M^2_{\mathrm{Atmos}}$ given by

$$\delta M^2_{\mathrm{Atmos}} \sim 3 \times 10^{-3}\ \mathrm{eV}^2, \tag{65}$$

and a neutrino mixing angle $\theta_{\mathrm{Atmos}}$ given by

$$\sin^2 2\theta_{\mathrm{Atmos}} \sim 1. \tag{66}$$

That is, the mixing when matter effects are negligible is very large, and perhaps maximal. The quantities $\delta M^2_{\mathrm{Atmos}}$ and $\theta_{\mathrm{Atmos}}$ are to be taken, respectively, for $\delta M_{21}^2$ and $\theta_V$ in Eqs. (60) - (64) to find the implications of those equations for $\nu_\mu \to \nu_s$ within the earth.

Since atmospheric neutrino oscillation involves maximal mixing when matter effects are negligible, the matter effects cannot possibly enhance the oscillation, but can only suppress it. From Eq. (62), we see that if, as observed, $\sin^2 2\theta_V \cong 1$, then matter effects will lead to a smaller effective mixing $\sin^2 2\theta_E$ in the earth, given by

$$\sin^2 2\theta_E = \frac{1}{1 + x_E^2}. \tag{67}$$

As is clear in Eq. (64), this will result in a smaller oscillation probability $P(\nu_\mu \to \nu_s)$ than one would have in vacuum, where $\sin^2 2\theta_E$ is replaced by $\sin^2 2\theta_V$ ($\cong 1$). Since $x_E$ grows with energy, the degree to which matter effects suppress $\nu_\mu \to \nu_s$ grows as well. From Eq. (63) for $x_E$, the known values of $G_F$ and $N_n$, and the value (65) required for $\delta M_{21}^2$ by the data, we find that $x_E \ll 1$ when $E \sim 1$ GeV. Thus, matter effects are indeed negligible at this energy, so it is legitimate to determine [12] the vacuum parameters $\delta M^2_{\mathrm{Atmos}}$ and $\sin^2 2\theta_{\mathrm{Atmos}}$ by analyzing the $\sim 1$ GeV data neglecting matter effects. However, at sufficiently large $E$, the matter-induced suppression of $\nu_\mu \to \nu_s$ will obviously be significant. From Eq. (63), we find that $\sin^2 2\theta_E$ is below 1/2 when $E \gtrsim 20$ GeV. The consequent suppression of oscillation at these energies has been looked for, and is not seen. [13] This absence of suppression is a powerful part of the evidence that the neutrinos into which the atmospheric muon neutrinos oscillate are not sterile neutrinos, or at least not solely sterile neutrinos. [13]
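The energy dependence of the matter suppression can be made concrete with Eqs. (63), (64) and (67). The sketch below evaluates $x_E$ and $\sin^2 2\theta_E$ for maximal vacuum mixing. The neutron density is not quoted in the text; the value used here, $N_n \approx 1.7 \times 10^{24}$/cc, is an assumption corresponding to a mean earth density of about 5.5 g/cc with half of the nucleons being neutrons, so the energy at which $\sin^2 2\theta_E$ drops to 1/2 should be read only as an order-of-magnitude estimate. With a lower density along the actual neutrino path the threshold moves up, toward the value of roughly 20 GeV quoted above.

```python
import math

G_F = 1.1664e-5          # Fermi constant in GeV^-2
HBAR_C = 1.97327e-14     # hbar*c in GeV*cm, used to convert cm^-3 to GeV^3
DM2_ATMOS_EV2 = 3.0e-3   # Eq. (65)
N_N_PER_CC = 1.7e24      # ASSUMPTION: ~5.5 g/cc mean density, half the nucleons neutrons


def x_earth(e_gev, n_n_per_cc=N_N_PER_CC, dm2_ev2=DM2_ATMOS_EV2):
    """x_E = sqrt(2) G_F N_n E / dM^2_21 of Eq. (63), as a pure number."""
    n_n_gev3 = n_n_per_cc * HBAR_C ** 3
    return math.sqrt(2.0) * G_F * n_n_gev3 * e_gev / (dm2_ev2 * 1.0e-18)


def sin2_2theta_earth(e_gev):
    """Eq. (67): effective mixing in the earth for maximal vacuum mixing."""
    return 1.0 / (1.0 + x_earth(e_gev) ** 2)


def prob_mu_to_sterile(l_km, e_gev):
    """Eq. (64) for sin^2 2theta_V = 1, where D_E = sqrt(1 + x_E^2)."""
    d_e = math.sqrt(1.0 + x_earth(e_gev) ** 2)
    return sin2_2theta_earth(e_gev) * math.sin(1.27 * DM2_ATMOS_EV2 * d_e * l_km / e_gev) ** 2


# x_E is linear in E, so x_E = 1 (sin^2 2theta_E = 1/2) is reached at this energy.
e_half_gev = 1.0 / x_earth(1.0)

for e_gev in (1.0, 5.0, e_half_gev, 50.0):
    print(f"E = {e_gev:6.1f} GeV: x_E = {x_earth(e_gev):.3f}, "
          f"sin^2 2theta_E = {sin2_2theta_earth(e_gev):.3f}")
```

For maximal vacuum mixing, $\cos 2\theta_V = 0$, so the result depends only on $x_E^2$ and is insensitive to the sign convention for the matter term.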
4 Conclusion

Evidence has been reported that the solar neutrinos, the atmospheric neutrinos, the accelerator-generated neutrinos studied by the Liquid Scintillator Neutrino Detector (LSND) experiment at Los Alamos, and the accelerator-generated neutrinos studied by the K2K experiment in Japan, actually do oscillate. Some of this evidence is very strong. The neutrino oscillation experiments, present and future, are discussed in this Volume by John Wilkerson. [14] In these lectures, we have tried to explain the basic physics that underlies neutrino oscillation, and that is invoked to understand the oscillation experiments.

As we have seen, neutrino oscillation implies neutrino mass and mixing. Thus, given the compelling evidence that at least some neutrinos do oscillate, we now know that neutrinos almost certainly have nonzero masses and mix. This knowledge raises a number of questions about the neutrinos:

• How many neutrino flavors, including both interacting and possible sterile flavors, are there? Equivalently, how many neutrino mass eigenstates are there?
• What are the masses, $M_i$, of the mass eigenstates $\nu_i$?
• Is the antiparticle $\bar\nu_i$ of a given mass eigenstate $\nu_i$ the same particle as $\nu_i$, or a different particle?
• What are the sizes and phases of the elements $U_{\alpha i}$ of the leptonic mixing matrix? Equivalently, what are the mixing angles and complex phase factors in terms of which $U$ may be described? Do complex phase factors in $U$ lead to CP violation in neutrino behavior?
• What are the electromagnetic properties of neutrinos? In particular, what are their dipole moments?
• What are the lifetimes of the neutrinos? Into what do they decay?
• What is the physics that gives rise to the masses, the mixings, and the other properties of the neutrinos?

Seeking the answers to these and other questions about the neutrinos will be an exciting adventure for years to come.

Acknowledgments

It is a pleasure to thank the organizers of TASI 2000 for an excellent summer school, and for giving me the opportunity to participate in it. I am grateful to Leo Stodolsky for a fruitful collaboration on the oscillations of both neutral mesons and neutrinos, and to Serguey Petcov and Lincoln Wolfenstein for a helpful discussion of neutrinos in matter. I am also grateful to my wife Susan for her accurate, patient, and gracious typing of the written version of these lectures.

References

1. M. Gell-Mann, P. Ramond, and R. Slansky, in Supergravity, eds. D. Freedman and P. van Nieuwenhuizen (North Holland, Amsterdam, 1979) 315; T. Yanagida, in Proceedings of the Workshop on Unified Theory and Baryon Number in the Universe, eds. O. Sawada and A. Sugamoto (KEK, Tsukuba, Japan, 1979); R. Mohapatra and G. Senjanovic, Phys. Rev. Lett. 44, 912 (1980).
2. C. Quigg, this Volume.
3. Increasingly, U is also being referred to as the "Maki-Nakagawa matrix" in recognition of insightful early work reported in Z. Maki, M. Nakagawa, and S. Sakata, Prog. Theor. Phys. 28, 870 (1962). For other pioneering work related to neutrino oscillation, see B. Pontecorvo, Zh. Eksp. Teor. Fiz. 53, 1717 (1967) [Sov. Phys. JETP 26, 984 (1968)]; V. Gribov and B. Pontecorvo, Phys. Lett. B 28, 493 (1969); S. Bilenky and B. Pontecorvo, Phys. Reports C 41, 225 (1978); A. Mann and H. Primakoff, Phys. Rev. D 15, 655 (1977).
4. Y. Srivastava, A. Widom, and E. Sassaroli, Z. Phys. C 66, 601 (1995).
5. Y. Grossman and H. Lipkin, Phys. Rev. D 55, 2760 (1997); H. Lipkin, Phys. Lett. B 348, 604 (1995).
6. B. Kayser and R. Mohapatra, to appear in Current Aspects of Neutrino Physics, ed. D. Caldwell (Springer-Verlag, Heidelberg, 2001).
7. J. Bahcall, Neutrino Astrophysics (Cambridge Univ. Press, Cambridge, 1989). This book contains a nice discussion of neutrino oscillation in matter, and references to key original papers in the literature.
8. The foundations of the physics and oscillations of neutrinos in matter are laid in L. Wolfenstein, Phys. Rev. D 17, 2369 (1978).
9. See, for example, F. Boehm and P. Vogel, Physics of Massive Neutrinos (Cambridge Univ. Press, Cambridge, 1987).
10. S. Mikheyev and A. Smirnov, Sov. J. Nucl. Phys. 42, 913 (1986), Sov. Phys. JETP 64, 4 (1986), Nuovo Cimento 9C, 17 (1986).
11. H. Sobel, Nucl. Phys. B (Proc. Suppl.) 91, 127 (2001). See also W. Mann, ibid., p. 134, and B. Barish, ibid., p. 141.
12. T. Kajita, Nucl. Phys. B (Proc. Suppl.) 77, 123 (1999).
13. S. Fukuda et al. (The Super-Kamiokande Collaboration), Phys. Rev. Lett. 85, 3999 (2000).
14. J. Wilkerson, this Volume.
Flavor in Supersymmetry

Hitoshi Murayama
Center for Theoretical Physics, Department of Physics, University of California, Berkeley, CA 94720
and
Theoretical Physics Group, Lawrence Berkeley National Laboratory, University of California, Berkeley, CA 94720

Abstract

This lecture was given at TASI 2000, Flavor Physics in the New Millennium, July 2000, Boulder, Colorado. It reviews supersymmetry and emphasizes its flavor physics aspects.
1 Motivation for Supersymmetry
1.1 Problems in the Standard Model
The Standard Model of particle physics, albeit extremely successful phenomenologically, has been regarded only as a low-energy effective theory of a yet-more-fundamental theory. One can list many reasons why we think this way, but a few are named below. First of all, the quantum number assignments of the fermions under the standard $SU(3)_C \times SU(2)_L \times U(1)_Y$ gauge group (Table 1) appear utterly bizarre. Probably the hypercharges are the weirdest of all. These assignments, however, are crucial to guarantee the cancellation of anomalies which could jeopardize the gauge invariance at the quantum level, rendering the theory inconsistent. Another related puzzle is why the hypercharges are quantized in units of 1/6. In principle, the hypercharges can be any numbers, even irrational. However, the quantized hypercharges are responsible
Table 1: The fermionic particle content of the Standard Model. Here we've put primes on the neutrinos in the same spirit of putting primes on the down-quarks in the quark doublets, indicating that the mass eigenstates are rotated by the MNS and CKM matrices, respectively. The subscripts g, r, b refer to colors; the superscripts are hypercharges.
Lepton doublets: $(\nu'_e, e)_L^{-1/2}$, $(\nu'_\mu, \mu)_L^{-1/2}$, $(\nu'_\tau, \tau)_L^{-1/2}$
Charged lepton singlets: $e_R^{-1}$, $\mu_R^{-1}$, $\tau_R^{-1}$
Quark doublets (one for each color $c = r, g, b$): $(u, d')_{L,c}^{1/6}$, $(c, s')_{L,c}^{1/6}$, $(t, b')_{L,c}^{1/6}$
Up-type quark singlets (one for each color $c = r, g, b$): $u_{R,c}^{2/3}$, $c_{R,c}^{2/3}$, $t_{R,c}^{2/3}$
Down-type quark singlets (one for each color $c = r, g, b$): $d_{R,c}^{-1/3}$, $s_{R,c}^{-1/3}$, $b_{R,c}^{-1/3}$
Table 2: The bosonic particle content of the Standard Model.
$W^1, W^2, H^+, H^- \to W^+, W^-$
$W^3, B, \mathrm{Im}(H^0) \to \gamma, Z$
$\mathrm{Re}\,H^0 \to H$
for neutrality of bulk matter, $Q(e) + 2Q(u) + Q(d) = Q(u) + 2Q(d) = 0$, at a precision of $10^{-21}$ [1]. The gauge group itself poses a question as well. Why are there three seemingly unrelated, independent gauge groups, which somehow conspire together to have anomaly-free particle content in a non-trivial way? Why is "the strong interaction" strong and "the weak interaction" weaker? The essential ingredient in the Standard Model which appears the ugliest to most people is the electroweak symmetry breaking. In the list of bosons in the Standard Model (Table 2), the gauge multiplets are necessary consequences of the gauge theories, and they appear natural. They of course all carry spin 1. However, there is only one spinless multiplet in the Standard Model: the Higgs doublet
$$H = \begin{pmatrix} H^+ \\ H^0 \end{pmatrix}, \tag{1.1}$$
which condenses in the vacuum due to the Mexican-hat potential (described in Section 1.4). It is introduced just for the purpose of breaking the electroweak symmetry $SU(2)_L \times U(1)_Y \to U(1)_{QED}$. The potential has to be arranged in a way to break the symmetry without any microscopic explanation. Why is there a seemingly unnecessary three-fold repetition of "generations"? Even the second generation led the Nobel Laureate I. I. Rabi to ask: "Who ordered the muon?" Now we face the even more puzzling question of having three generations. And why do the fermions have a mass spectrum which stretches over almost six orders of magnitude between the electron and the top quark? This question becomes even more serious once we consider the recent evidence for neutrino oscillations, which suggests a mass for the third-generation neutrino $\nu'_\tau$ of about 0.05 eV [2]. This makes the mass spectrum stretch over thirteen orders of magnitude. We have no concrete understanding of the mass spectrum nor the mixing patterns.
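The anomaly cancellation and charge neutrality mentioned above can be checked with exact arithmetic. The following is a minimal sketch, assuming the standard one-generation hypercharge assignments in the normalization $Q = T_3 + Y$ and treating right-handed fields as left-handed conjugates (so their hypercharges flip sign); the function and variable names are illustrative only.

```python
from fractions import Fraction as F

# One generation of left-handed Weyl fermions:
# (label, number of colors, SU(2) multiplicity, hypercharge Y).
# Right-handed fields are listed as left-handed conjugates, so Y flips sign.
GENERATION = [
    ("lepton doublet (nu, e)_L", 1, 2, F(-1, 2)),
    ("conjugate of e_R",         1, 1, F(1)),
    ("quark doublet (u, d)_L",   3, 2, F(1, 6)),
    ("conjugate of u_R",         3, 1, F(-2, 3)),
    ("conjugate of d_R",         3, 1, F(1, 3)),
]


def anomaly_coefficients(fields):
    """Return the four hypercharge anomaly sums for a list of fields."""
    return {
        "gravity^2 U(1)": sum(c * w * y for _, c, w, y in fields),
        "U(1)^3": sum(c * w * y**3 for _, c, w, y in fields),
        "SU(2)^2 U(1)": sum(c * y for _, c, w, y in fields if w == 2),
        "SU(3)^2 U(1)": sum(w * y for _, c, w, y in fields if c == 3),
    }


def electric_charge(t3, y):
    """Electric charge Q = T3 + Y."""
    return t3 + y


coefficients = anomaly_coefficients(GENERATION)
q_e = electric_charge(F(-1, 2), F(-1, 2))
q_u = electric_charge(F(1, 2), F(1, 6))
q_d = electric_charge(F(-1, 2), F(1, 6))
hydrogen = q_e + 2 * q_u + q_d   # electron plus proton (uud)
neutron = q_u + 2 * q_d          # udd

for name, value in coefficients.items():
    print(f"{name}: {value}")
print("Q(e) + 2Q(u) + Q(d) =", hydrogen)
print("Q(u) + 2Q(d) =", neutron)
```

Every sum vanishes only because the hypercharges take these particular multiples of 1/6; changing any single entry spoils at least one of them.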
1.2 Drive to go to Shorter Distances
All the puzzles raised in the previous section (and more) cry out for a more fundamental theory underlying the Standard Model. What history suggests is that the fundamental theory always lies at shorter distances than the distance scale of the problem. For instance, the equation of state of the ideal gas was found to be a simple consequence of the statistical mechanics of free molecules. The van der Waals equation, which describes the deviation from the ideal one, was the consequence of the finite size of molecules and their interactions. Mendeleev's periodic table of chemical elements was understood in terms of the bound electronic states, the Pauli exclusion principle and spin. The existence of varieties of nuclides was due to the composite nature of nuclei made of protons and neutrons. The list would go on and on. Indeed, seeking answers at more and more fundamental levels is the heart of physical science, namely the reductionist approach. The distance scale of the Standard Model is given by the size of the Higgs boson condensate $v = 250$ GeV. In natural units, it gives the distance scale of $d = \hbar c/v = 0.8 \times 10^{-16}$ cm. We therefore would like to study physics at distance scales shorter than this eventually, and try to answer puzzles whose partial list was given in the previous section. Then the idea must be that we imagine the Standard Model to be valid down to a distance scale shorter than $d$, and then new physics will appear which will take over from the Standard Model. But applying the Standard Model to a distance scale shorter than $d$ poses a serious theoretical problem. In order to make this point clear, we first describe a related problem in classical electromagnetism, and then discuss the case of the Standard Model later along the same line [3].
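The conversion from the energy scale $v$ to the distance scale $d = \hbar c/v$ quoted above is a one-line computation. A minimal sketch, assuming the standard value $\hbar c \approx 197.327$ MeV fm (the helper name is illustrative):

```python
# hbar * c = 197.3269804 MeV fm, expressed in GeV cm (1 fm = 1e-13 cm).
HBAR_C_GEV_CM = 1.973269804e-14


def length_scale_cm(energy_gev):
    """Distance scale d = hbar*c / E associated with an energy E in GeV."""
    return HBAR_C_GEV_CM / energy_gev


d_electroweak = length_scale_cm(250.0)
print(f"d = {d_electroweak:.2e} cm")
```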
1.3 Positron Analogue
In classical electromagnetism, the only dynamical degrees of freedom are electrons, electric fields, and magnetic fields. When an electron is present in the vacuum, there is a Coulomb electric field around it, which has the energy
$$\Delta E_{Coulomb} = \frac{1}{4\pi\varepsilon_0}\frac{e^2}{r_e}. \tag{1.2}$$
Here, $r_e$ is the "size" of the electron introduced to cut off the divergent Coulomb self-energy. Since this Coulomb self-energy is there for every electron, it has to be considered to be a part of the electron rest energy. Therefore, the mass of the electron receives an additional contribution due to the Coulomb self-energy:
$$(m_e c^2)_{obs} = (m_e c^2)_{bare} + \Delta E_{Coulomb}. \tag{1.3}$$
Experimentally, we know that the "size" of the electron is small, $r_e < 10^{-17}$ cm. This implies that the self-energy $\Delta E$ is greater than 10 GeV or so, and hence the "bare" electron mass must be negative to obtain the observed mass of the electron, with a fine cancellation like
$$0.511 = -9999.489 + 10000.000\ \mathrm{MeV}. \tag{1.4}$$
Figure 1: The Coulomb self-energy of the electron.
Figure 2: The bubble diagram which shows the fluctuation of the vacuum.
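The numbers in this argument follow from Eq. (1.2) written as $\Delta E_{Coulomb} = \alpha \hbar c / r_e$. The sketch below, assuming standard values for $\alpha$, $\hbar c$ and the electron rest energy (the function names are illustrative), evaluates the self-energy at $r_e = 10^{-17}$ cm, the bare rest energy this would require, and the radius at which the self-energy equals the electron rest energy.

```python
ALPHA = 1 / 137.035999          # fine-structure constant
HBAR_C_MEV_CM = 1.973269804e-11  # hbar*c in MeV cm
ELECTRON_MEV = 0.51099895        # observed electron rest energy in MeV


def coulomb_self_energy_mev(r_e_cm):
    """Eq. (1.2): Delta E = e^2 / (4 pi eps0 r_e) = alpha * hbar*c / r_e."""
    return ALPHA * HBAR_C_MEV_CM / r_e_cm


def bare_rest_energy_mev(r_e_cm):
    """Eq. (1.3) solved for the bare rest energy."""
    return ELECTRON_MEV - coulomb_self_energy_mev(r_e_cm)


# Radius at which the self-energy alone equals the electron rest energy.
classical_radius_cm = ALPHA * HBAR_C_MEV_CM / ELECTRON_MEV

self_energy = coulomb_self_energy_mev(1e-17)
bare = bare_rest_energy_mev(1e-17)
print(f"Delta E at 1e-17 cm: {self_energy / 1000:.1f} GeV")
print(f"required bare rest energy: {bare:.3f} MeV")
print(f"classical electron radius: {classical_radius_cm:.2e} cm")
```

The bare value comes out negative and has to cancel the self-energy to roughly one part in $10^4$, which is the fine cancellation illustrated in Eq. (1.4).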
Even setting a conceptual problem with a negative mass electron aside, such a fine cancellation between the "bare" mass of the electron and the Coulomb self-energy appears ridiculous. In order for such a cancellation to be absent, we conclude that classical electromagnetism cannot be applied to distance scales shorter than $e^2/(4\pi\varepsilon_0 m_e c^2) = 2.8 \times 10^{-13}$ cm. This is a long distance by the standards of present-day particle physics. The resolution to this problem came from the discovery of the anti-particle of the electron, the positron, or in other words by doubling the degrees of freedom in the theory. The Coulomb self-energy discussed above can be depicted by a diagram, Fig. 1, where the electron emits the Coulomb field (a virtual photon) which is absorbed later by the electron (the electron "feels" its own Coulomb field).^1 But now that we know that the positron exists (thanks to Anderson back in 1932), and we also know that the world is quantum mechanical, one should think about the fluctuation of the "vacuum" where the vacuum produces a pair of an electron and a positron out of nothing together with a photon, within the time allowed by the energy-time uncertainty principle $\Delta t \sim \hbar/\Delta E \sim \hbar/(2 m_e c^2)$ (Fig. 2). This is a new phenomenon which didn't exist in classical electrodynamics, and it modifies physics below the distance scale $d \sim c\Delta t \sim \hbar c/(2 m_e c^2) = 200 \times 10^{-13}$ cm. Therefore, classical electrodynamics actually did have a finite applicability only down to this distance scale, much earlier than the $2.8 \times 10^{-13}$ cm exhibited by the problem of the fine cancellation above. Given this vacuum fluctuation process, one should also consider a process where the electron sitting in the vacuum by chance annihilates with the positron and the photon in the vacuum fluctuation, and the electron which used to be a part of the fluctuation remains instead as a real electron (Fig. 3). V. Weisskopf [4] calculated this contribution to the electron self-energy for the first time, and found that it is negative and cancels the leading piece in the Coulomb self-energy exactly:
$$\Delta E_{pair} = -\frac{1}{4\pi\varepsilon_0}\frac{e^2}{r_e}. \tag{1.5}$$
Figure 3: Another contribution to the electron self-energy due to the fluctuation of the vacuum.
After the linearly divergent piece $1/r_e$ is canceled, the leading contribution in the $r_e \to 0$ limit is given by
$$\Delta E = \Delta E_{Coulomb} + \Delta E_{pair} = \frac{3\alpha}{4\pi}\, m_e c^2 \log\frac{\hbar}{m_e c\, r_e}. \tag{1.6}$$
There are two important things to be said about this formula. First, the correction $\Delta E$ is proportional to the electron mass and hence the total mass is proportional to the "bare" mass of the electron,
$$(m_e c^2)_{obs} = (m_e c^2)_{bare}\left[1 + \frac{3\alpha}{4\pi}\log\frac{\hbar}{m_e c\, r_e}\right]. \tag{1.7}$$
^1 The diagrams Figs. 1, 3 are not Feynman diagrams, but diagrams in the old-fashioned perturbation theory with different T-orderings shown as separate diagrams. The Feynman diagram for the self-energy is the same as Fig. 1, but represents the sum of Figs. 1, 3 and hence the linear divergence is already cancelled within it. That is why we normally do not hear/read about linearly divergent self-energy diagrams in the context of field theory.
Therefore, we are talking about the "percentage" of the correction, rather than a huge additive constant. Second, the correction depends only logarithmically on the "size" of the electron. As a result, the correction is only a 9% increase in the mass even for an electron as small as the Planck distance $r_e = 1/M_{Pl} = 1.6 \times 10^{-33}$ cm. The fact that the correction is proportional to the "bare" mass is a consequence of a new symmetry present in the theory with the antiparticle (the positron): the chiral symmetry. In the limit of the exact chiral symmetry, the electron is massless and the symmetry protects the electron from acquiring a mass from self-energy corrections. The finite mass of the electron breaks the chiral symmetry explicitly, and because the self-energy correction should vanish in the chiral symmetric limit (zero mass electron), the correction is proportional to the electron mass. Therefore, the doubling of the degrees of freedom and the cancellation of the power divergences lead to a sensible theory of the electron applicable to very short distance scales.
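The 9% figure and the pair-fluctuation distance scale can be reproduced directly from the logarithmic formula. This is a minimal sketch under the same assumed constants as before; `fractional_mass_shift` is an illustrative name, not something defined in the lecture.

```python
import math

ALPHA = 1 / 137.035999           # fine-structure constant
HBAR_C_MEV_CM = 1.973269804e-11  # hbar*c in MeV cm
ELECTRON_MEV = 0.51099895        # electron rest energy in MeV

# Reduced Compton wavelength hbar / (m_e c) in cm.
COMPTON_CM = HBAR_C_MEV_CM / ELECTRON_MEV


def fractional_mass_shift(r_e_cm):
    """Relative correction (3 alpha / 4 pi) * log(hbar / (m_e c r_e))."""
    return 3 * ALPHA / (4 * math.pi) * math.log(COMPTON_CM / r_e_cm)


planck_shift = fractional_mass_shift(1.6e-33)
pair_scale_cm = HBAR_C_MEV_CM / (2 * ELECTRON_MEV)

print(f"mass shift at the Planck distance: {100 * planck_shift:.1f}%")
print(f"pair-fluctuation scale: {pair_scale_cm:.2e} cm")
```

Shrinking $r_e$ by sixteen orders of magnitude, from $10^{-17}$ cm to the Planck distance, changes the correction by only a few percent of the mass, in contrast to the linear growth of the classical self-energy.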
1.4 Supersymmetry
In the Standard Model, the Higgs potential is given by
$$V = \mu^2 |H|^2 + \lambda |H|^4, \tag{1.8}$$
where $v^2 = \langle H \rangle^2 = -\mu^2/2\lambda = (176\ \mathrm{GeV})^2$. Because perturbative unitarity requires that $\lambda \lesssim 1$, $-\mu^2$ is of the order of $(100\ \mathrm{GeV})^2$. However, the mass-squared parameter $\mu^2$ of the Higgs doublet receives a quadratically divergent contribution from its self-energy corrections. For instance, the process where the Higgs doublet splits into a pair of top quarks, which then come back to the Higgs boson, gives the self-energy correction
$$\Delta\mu^2_{top} = -6\,\frac{h_t^2}{4\pi^2}\,\frac{1}{r_H^2}, \tag{1.9}$$
where $r_H$ is the "size" of the Higgs boson, and $h_t \approx 1$ is the top quark Yukawa coupling. Based on the same argument as in the previous section, this makes the Standard Model not applicable below the distance scale of $10^{-17}$ cm. The motivation for supersymmetry is to make the Standard Model applicable to much shorter distances so that we can hope that answers to many of the puzzles in the Standard Model can be given by physics at shorter distance scales [5]. In order to do so, supersymmetry repeats what history did with the positron: doubling the degrees of freedom with an explicitly broken new symmetry. Then the top quark would have a superpartner, the stop,^2 whose loop diagram gives another contribution to the Higgs boson self-energy
$$\Delta\mu^2_{stop} = +6\,\frac{h_t^2}{4\pi^2}\,\frac{1}{r_H^2}. \tag{1.10}$$
The leading pieces in $1/r_H$ cancel between the top and stop contributions, and one obtains the correction
$$\Delta\mu^2_{top} + \Delta\mu^2_{stop} = -6\,\frac{h_t^2}{4\pi^2}\,(m_{\tilde t}^2 - m_t^2)\log\frac{1}{r_H^2 m_{\tilde t}^2}. \tag{1.11}$$
One important difference from the positron case, however, is that the mass of the stop, $m_{\tilde t}$, is unknown. In order for $\Delta\mu^2$ to be of the same order of magnitude as the tree-level value $\mu^2 = -2\lambda v^2$, we need $m_{\tilde t}^2$ to be not too far above the electroweak scale. Similar arguments apply to masses of other superpartners that couple directly to the Higgs doublet. This is the so-called naturalness constraint on the superparticle masses (for more quantitative discussions, see papers [6]).
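To get a feeling for the naturalness constraint, one can compare the power-divergent top contribution with the logarithmic remainder left after the stop is included. The sketch below uses the leading-order expressions of this section in natural units (GeV), with $h_t = 1$; the function names and the sample values of $1/r_H$ and $m_{\tilde t}$ are assumptions chosen only for illustration.

```python
import math


def delta_mu2_top(h_t, inv_r_h):
    """Top-loop contribution, Eq. (1.9); inv_r_h = 1/r_H in GeV."""
    return -6 * h_t**2 / (4 * math.pi**2) * inv_r_h**2


def delta_mu2_stop(h_t, inv_r_h):
    """Stop-loop leading contribution, Eq. (1.10)."""
    return 6 * h_t**2 / (4 * math.pi**2) * inv_r_h**2


def delta_mu2_susy(h_t, m_stop, m_top, inv_r_h):
    """Remainder after the 1/r_H^2 pieces cancel, Eq. (1.11)."""
    prefactor = -6 * h_t**2 / (4 * math.pi**2)
    return prefactor * (m_stop**2 - m_top**2) * math.log(inv_r_h**2 / m_stop**2)


INV_R_H = 1.0e16   # illustrative short-distance scale in GeV
M_TOP = 175.0      # GeV

for m_stop in (200.0, 1000.0, 10000.0):
    correction = delta_mu2_susy(1.0, m_stop, M_TOP, INV_R_H)
    # Ratio to the tree-level size (100 GeV)^2 quoted in the text.
    print(f"m_stop = {m_stop:7.0f} GeV: |correction| / (100 GeV)^2 = "
          f"{abs(correction) / 100.0**2:.1f}")
```

The ratio grows roughly like $m_{\tilde t}^2$, which is why the stop cannot be much heavier than the electroweak scale without reintroducing a fine cancellation.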
1.5 Other Directions
Of course, supersymmetry is not the only solution discussed in the literature to avoid miraculously fine cancellations in the Higgs boson mass-squared term. Technicolor (see a review [7]) is a beautiful idea which replaces the Higgs doublet by a composite techni-quark condensate. Then $r_H \sim 1$ TeV is a truly physical size of the Higgs doublet and there is no need for fine cancellations. Despite the beauty of the idea, this direction has had problems with generating fermion masses, especially the top quark mass, in a way consistent with the constraints from the flavor-changing neutral currents. The difficulties in the model building, however, do not necessarily mean that the idea itself is wrong; indeed efforts are still being devoted to constructing realistic models. Another recent idea is to lower the Planck scale down to the TeV scale by employing large extra spatial dimensions [8]. This is a new direction which has just started, and there is an intensive activity to find constraints on the idea as well as on model building. Since the field is still new, there is no "standard" framework one can discuss at this point, but this is no surprise given the fact that supersymmetry is still evolving even after almost two decades of intense research. One important remark about all these ideas is that they inevitably predict interesting signals at TeV-scale collider experiments. While we only discuss supersymmetry in this lecture, it is likely that nature has a surprise ready for us; maybe none of the ideas discussed so far is right. Still we know that there is something out there to be uncovered at TeV scale energies. For instance, one can constrain the energy scale of "new physics" once $m_H$ is known, by requiring that the fine-tuning at the new physics scale $\Lambda$ is no worse than a certain percentage. This constraint can be combined with other traditional constraints based on triviality, vacuum stability and the electroweak precision measurements and is shown in Figure 4.
^2 This is a terrible name, which was originally meant to be "scalar top." If supersymmetry is discovered by the next generation of collider experiments, we should seriously look for better names for the superparticles.
2 Supersymmetric Lagrangian
We do not go into the full-fledged formalism of supersymmetric Lagrangians in this lecture but rather confine ourselves to a practical introduction of how to write down Lagrangians with explicitly broken supersymmetry which still fulfill the motivation for supersymmetry discussed in the previous section. One can find useful discussions as well as an extensive list of references in a nice review by Steve Martin [10].
2.1 Supermultiplets
Supersymmetry is a symmetry between bosons and fermions, and hence necessarily relates particles with different spins. All particles in supersymmetric theories fall into supermultiplets, which have both bosonic and fermionic components. There are two types of supermultiplets which appear in renormalizable field theories: chiral and vector supermultiplets. Chiral supermultiplets are often denoted by the symbol $\phi$, which can be (for the purpose of this lecture) regarded as a shorthand notation for the three fields: a complex scalar field $A$, a Weyl fermion $\frac{1-\gamma_5}{2}\psi = \psi$, and a non-dynamical (auxiliary) complex field $F$. Lagrangians for chiral supermultiplets consist of two parts, the Kähler potential and the superpotential.
Figure 4: The constraints on the $m_H$-$\Lambda$ plane, with $\Lambda$ in TeV, including triviality (dark region at top) and vacuum stability (dark region at bottom). The hatched regions marked "Electroweak" (if the operators are at the tree-level, with or without non-perturbative enhancements) and the region bounded by the dashed line (if the operators arise at one-loop level) are ruled out by precision electroweak analyses. The darkly hatched region marked "1%" represents tunings of greater than 1 part in 100; the "10%" region means greater than 1 part in 10. The empty region is consistent with all constraints and has less than 1 part in 10 fine-tuning. See [9] for details.
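The Kähler potential is nothing but the kinetic terms for the fields, usually written with a shorthand notation $\int d^4\theta\, \phi_i^* \phi_i$, which can be explicitly written down as
$$\mathcal{L} \supset \int d^4\theta\, \phi_i^* \phi_i = \partial_\mu A_i^* \partial^\mu A_i + \bar\psi_i i \gamma^\mu \partial_\mu \psi_i + F_i^* F_i. \tag{2.1}$$
Note that the field $F$ does not have derivatives in the Lagrangian and hence is not a propagating field. One can solve for $F_i$ explicitly and eliminate it from the Lagrangian completely. The superpotential is defined by a holomorphic function $W(\phi)$ of the chiral supermultiplets $\phi_i$. A shorthand notation $\int d^2\theta\, W(\phi)$ gives the following terms in the Lagrangian,
$$\mathcal{L} \supset \int d^2\theta\, W(\phi) = -\frac{1}{2} \left.\frac{\partial^2 W}{\partial\phi_i \partial\phi_j}\right|_{\phi_i = A_i} \psi_i \psi_j + \left.\frac{\partial W}{\partial\phi_i}\right|_{\phi_i = A_i} F_i. \tag{2.2}$$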
The first term describes Yukawa couplings between fermionic and bosonic components of the chiral supermultiplets. Using both Eqs. (2.1) and (2.2), we can solve for $F$ and find
$$F_i^* = -\left.\frac{\partial W}{\partial\phi_i}\right|_{\phi_i = A_i}. \tag{2.3}$$
Substituting it back into the Lagrangian, we eliminate $F$ and instead find a potential term
$$\mathcal{L} \supset -V_F = -\left|\frac{\partial W}{\partial\phi_i}\right|^2_{\phi_i = A_i}. \tag{2.4}$$
Vector supermultiplets $W_\alpha$ ($\alpha$ is a spinor index, but never mind), which are supersymmetric generalizations of the gauge fields, consist also of three components, a Weyl fermion (gaugino) $\lambda$, a vector (gauge) field $A_\mu$, and a non-dynamical (auxiliary) real scalar field $D$, all in the adjoint representation of the gauge group with the index $a$. A shorthand notation of their kinetic terms is
$$\mathcal{L} \supset \int d^2\theta\, W_\alpha^a W^{\alpha a} = -\frac{1}{4} F^a_{\mu\nu} F^{\mu\nu a} + \bar\lambda^a i \gamma^\mu D_\mu \lambda^a + \frac{1}{2} D^a D^a. \tag{2.5}$$
Note that the field $D$ does not have derivatives in the Lagrangian and hence is not a propagating field. One can solve for $D^a$ explicitly and eliminate it from the Lagrangian completely. Since the vector supermultiplets contain gauge fields, chiral supermultiplets which transform non-trivially under the gauge group should also couple to the vector multiplets to make the Lagrangian gauge invariant. This requires the modification of the Kähler potential $\int d^4\theta\, \phi_i^* \phi_i$ to
$$\mathcal{L} \supset \int d^4\theta\, \phi_i^\dagger e^{2gV} \phi_i = D_\mu A_i^\dagger D^\mu A_i + \bar\psi_i i \gamma^\mu D_\mu \psi_i + F_i^\dagger F_i - \sqrt{2}\, g\, (A_i^\dagger T^a \lambda^a \psi_i) - g A_i^\dagger T^a D^a A_i. \tag{2.6}$$
Using Eqs. (2.5,2.6), one can solve for $D^a$ and eliminate it from the Lagrangian, finding a potential term
$$\mathcal{L} \supset -V_D = -\frac{g^2}{2} (A^\dagger T^a A)^2. \tag{2.7}$$
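The elimination of the auxiliary field $D^a$ can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes a single $SU(2)$ doublet $A$ with generators $T^a = \sigma^a/2$, keeps just the $D$-dependent terms $\frac{1}{2} D^a D^a - g A^\dagger T^a D^a A$, solves the algebraic equation of motion $D^a = g A^\dagger T^a A$, and confirms that substituting back gives $-V_D$ of Eq. (2.7). The function names and sample numbers are arbitrary.

```python
# Pauli matrices; the SU(2) generators are T^a = sigma^a / 2.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def bilinear(A, a):
    """Real number A^dagger T^a A for a doublet A of complex numbers."""
    total = sum(A[i].conjugate() * SIGMA[a][i][j] * A[j]
                for i in range(2) for j in range(2))
    return 0.5 * total.real


def d_terms(D, A, g):
    """D-dependent part of the Lagrangian: (1/2) D^a D^a - g A^dag T^a D^a A."""
    return sum(0.5 * D[a]**2 - g * D[a] * bilinear(A, a) for a in range(3))


def solve_d(A, g):
    """Algebraic equation of motion: D^a = g A^dagger T^a A."""
    return [g * bilinear(A, a) for a in range(3)]


def v_d(A, g):
    """D-term potential of Eq. (2.7)."""
    return 0.5 * g**2 * sum(bilinear(A, a)**2 for a in range(3))


A = [0.3 + 0.4j, -1.2 + 0.5j]   # arbitrary sample field values
g = 0.65                        # arbitrary sample gauge coupling
D_solution = solve_d(A, g)
norm_squared = sum(abs(x)**2 for x in A)

print("Lagrangian at the solution:", d_terms(D_solution, A, g))
print("-V_D:                      ", -v_d(A, g))
print("-(g^2/8) |A|^4:            ", -g**2 / 8 * norm_squared**2)
```

For a single doublet the potential reduces to $(g^2/8)(A^\dagger A)^2$, which the last printed line confirms.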
General supersymmetric Lagrangians are given by Eqs. (2.4,2.6,2.7).^3 Even though we do not go into formal discussions of supersymmetric field theories, one important theorem must be quoted: the non-renormalization theorem of the superpotential. Under the renormalization of the theories, the superpotential does not receive renormalization at all orders in perturbation theory.^4 We will come back to the virtues of this theorem later on. Finally, let us study a very simple example of a superpotential to gain some intuition. Consider two chiral supermultiplets $\phi_1$ and $\phi_2$, with a superpotential
$$W = m \phi_1 \phi_2. \tag{2.8}$$
Following the above prescription, the fermionic components have the Lagrangian
$$\mathcal{L} \supset -\frac{1}{2} \frac{\partial^2 W}{\partial\phi_i \partial\phi_j} \psi_i \psi_j = -m \psi_1 \psi_2, \tag{2.9}$$
while the scalar potential term Eq. (2.4) gives
$$\mathcal{L} \supset -\left|\frac{\partial W}{\partial\phi_i}\right|^2_{\phi_i = A_i} = -m^2 |A_1|^2 - m^2 |A_2|^2. \tag{2.10}$$
^3 We dropped one possible term, called the Fayet-Iliopoulos D-term, possible for vector supermultiplets of Abelian gauge groups. Such terms can have important effects phenomenologically [11, 12].
^4 There are non-perturbative corrections to the superpotential, however. See, e.g., a review [13].
Obviously, the terms Eqs. (2.9,2.10) are mass terms for the fermionic (Dirac fermion) and scalar components (two complex scalars) of the chiral supermultiplets, with the same mass m. In general, fermionic and bosonic components in the same supermultiplets are degenerate in supersymmetric theories.
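The prescription of this section (differentiate $W$, then read off the fermion mass matrix and the $F$-term potential) is mechanical enough to automate. The following sketch is a hypothetical helper, not part of the lecture: it stores a polynomial superpotential as a dictionary from exponent tuples to coefficients and reproduces Eqs. (2.9) and (2.10) for $W = m\phi_1\phi_2$.

```python
def differentiate(poly, var):
    """Derivative of a polynomial {exponent tuple: coefficient} w.r.t. field var."""
    result = {}
    for exponents, coeff in poly.items():
        power = exponents[var]
        if power == 0:
            continue
        lowered = list(exponents)
        lowered[var] -= 1
        key = tuple(lowered)
        result[key] = result.get(key, 0) + coeff * power
    return result


def evaluate(poly, point):
    """Evaluate the polynomial at the scalar field values in point."""
    total = 0
    for exponents, coeff in poly.items():
        term = coeff
        for value, power in zip(point, exponents):
            term *= value**power
        total += term
    return total


def f_term_potential(W, A):
    """V_F = sum_i |dW/dphi_i|^2 at phi_i = A_i, as in Eq. (2.4)."""
    return sum(abs(evaluate(differentiate(W, i), A))**2 for i in range(len(A)))


def fermion_mass_matrix(W, A):
    """Matrix of second derivatives d^2 W / dphi_i dphi_j at phi_i = A_i."""
    n = len(A)
    return [[evaluate(differentiate(differentiate(W, i), j), A)
             for j in range(n)] for i in range(n)]


m = 2.0
W = {(1, 1): m}                  # W = m * phi_1 * phi_2, Eq. (2.8)
A = (0.5 + 1.0j, -0.3 + 0.2j)    # arbitrary sample scalar field values

V = f_term_potential(W, A)
M = fermion_mass_matrix(W, A)
print("V_F =", V)
print("m^2 (|A_1|^2 + |A_2|^2) =", m**2 * (abs(A[0])**2 + abs(A[1])**2))
print("fermion mass matrix =", M)
```

The off-diagonal mass matrix pairs $\psi_1$ with $\psi_2$ into a Dirac fermion of mass $m$, the same mass that appears for both scalars in $V_F$.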
3 Softly Broken Supersymmetry
We've discussed supersymmetric Lagrangians in the previous section, which always give degenerate bosons and fermions. In the real world, we do not see such degenerate particles with opposite statistics. Therefore supersymmetry must be broken. We will come back later to briefly discuss various mechanisms which break supersymmetry spontaneously in manifestly supersymmetric theories. In the low-energy effective theories, however, we can just add terms to supersymmetric Lagrangians which break supersymmetry explicitly. The important constraint is that such explicit breaking terms should not spoil the motivation discussed earlier, namely to keep the Higgs mass-squared only logarithmically divergent. Such explicit breaking terms of supersymmetry are called "soft" breakings. The possible soft breaking terms have been classified [14]. In a theory with a renormalizable superpotential
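$$W = \frac{1}{2}\mu_{ij}\phi_i\phi_j + \frac{1}{6}\lambda_{ijk}\phi_i\phi_j\phi_k, \tag{3.1}$$
the possible soft supersymmetry breaking terms have the following forms:
$$m_i^2 A_i^* A_i, \qquad M\lambda\lambda, \qquad \frac{1}{2} b_{ij}\mu_{ij} A_i A_j, \qquad \frac{1}{6} a_{ijk}\lambda_{ijk} A_i A_j A_k. \tag{3.2}$$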
The first one is the masses for scalar components in the chiral supermultiplets, which remove degeneracy between the scalar and spinor components. The next one is the masses for gauginos, which remove degeneracy between gauginos and gauge bosons. Finally the last two are usually called bilinear and trilinear soft breaking terms, with parameters $b_{ij}$ and $a_{ijk}$ of mass dimension one. In principle, any terms with couplings with positive mass dimensions are candidates for soft supersymmetry breaking terms [15]. Possibilities in theories without gauge singlets are
$$\psi_i\psi_j, \qquad A_i^* A_j A_k, \qquad \psi_i\lambda^a. \tag{3.3}$$
Obviously, the first term is possible only in theories with multiplets with vector-like gauge quantum numbers, and the last term only in theories with chiral supermultiplets in the adjoint representation. In the presence of gauge singlet chiral supermultiplets, however, such terms cause power divergences and instabilities, and hence are not soft in general. On the other hand, the Minimal Supersymmetric Standard Model, for instance, does not contain any gauge singlet chiral supermultiplets and hence does admit the first two possible terms in Eq. (3.3). There has been some revived interest in these general soft terms [16]. We will not consider these additional terms in the rest of the discussions. It is also useful to know that terms in Eq. (3.2) can also induce power divergences in the presence of light gauge singlets and heavy multiplets [17]. It is instructive to carry out some explicit calculations of the Higgs boson self-energy in supersymmetric theories with explicit soft supersymmetry breaking terms. Let us consider the coupling of the Higgs doublet chiral supermultiplet $H_u$ to left-handed $Q$ and right-handed $T$ chiral supermultiplets,^5 given by the superpotential term
$$W = h_t Q T H_u. \tag{3.4}$$
This superpotential term gives rise to terms in the Lagrangian^6
$$\mathcal{L} \supset -h_t Q T H_u - h_t^2 |\tilde Q|^2 |H_u|^2 - h_t^2 |\tilde T|^2 |H_u|^2 - m_{\tilde Q}^2 |\tilde Q|^2 - m_{\tilde T}^2 |\tilde T|^2 - h_t A_t \tilde Q \tilde T H_u, \tag{3.5}$$
where $m_{\tilde Q}^2$, $m_{\tilde T}^2$, and $A_t$ are soft parameters. Note that the fields $Q$, $T$ are spinor and $\tilde Q$, $\tilde T$, $H_u$ are scalar components of the chiral supermultiplets (an unfortunate but common notation in the literature). This explicit Lagrangian allows us to easily work out the one-loop self-energy diagrams for the Higgs doublet $H_u$, after shifting the field $H_u$ by its vacuum expectation value (this also generates mass terms for the top quark and the scalars which have to be consistently included). The diagram with the top quark loop from the first term in Eq. (3.5) is quadratically divergent (negative). The contractions of $\tilde Q$ or $\tilde T$ in the next two terms also generate (positive) contributions to the Higgs self-energy. In the absence of soft parameters, $m_{\tilde Q}^2 = m_{\tilde T}^2 = 0$, these two contributions precisely cancel each other, consistent with the non-renormalization theorem which states that no mass terms (superpotential terms) can be generated by renormalizations. However, the explicit breaking terms $m_{\tilde Q}^2$, $m_{\tilde T}^2$ make the cancellation inexact. With a simplifying assumption $m_{\tilde Q}^2 = m_{\tilde T}^2 = \tilde m^2$, we find
$$\delta m_{H_u}^2 = -\frac{6 h_t^2}{(4\pi)^2}\, \tilde m^2 \log\frac{\Lambda^2}{\tilde m^2}.$$
^5 As will be explained in the next section, the right-handed spinors all need to be charge-conjugated to the left-handed ones in order to be part of the chiral supermultiplets. Therefore the chiral supermultiplet $T$ actually contains the left-handed Weyl spinor $(t_R)^c$. The Higgs multiplet here will be denoted $H_u$ in later sections.
^6 We dropped terms which do not contribute to the Higgs boson self-energy at the one-loop level.
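The statement that the soft masses produce only a logarithmic cutoff dependence can be made concrete. This is a rough numerical sketch, assuming the one-loop form $\delta m_{H_u}^2 = -\frac{6 h_t^2}{(4\pi)^2}\tilde m^2 \log(\Lambda^2/\tilde m^2)$ with $h_t = 1$; the function name and the sample values of $\tilde m$ and $\Lambda$ are illustrative assumptions.

```python
import math


def delta_m_hu2(h_t, m_soft, cutoff):
    """One-loop shift -6 h_t^2/(4 pi)^2 * m_soft^2 * log(cutoff^2 / m_soft^2), in GeV^2."""
    return -6 * h_t**2 / (4 * math.pi)**2 * m_soft**2 * math.log(cutoff**2 / m_soft**2)


M_SOFT = 1000.0  # common soft mass in GeV (illustrative)

shift_low = delta_m_hu2(1.0, M_SOFT, 1.0e8)
shift_high = delta_m_hu2(1.0, M_SOFT, 1.0e16)

print(f"cutoff 1e8  GeV: delta m^2 = {shift_low:.3e} GeV^2")
print(f"cutoff 1e16 GeV: delta m^2 = {shift_high:.3e} GeV^2")
print(f"ratio = {shift_high / shift_low:.2f}")
```

Raising the cutoff by eight orders of magnitude changes the correction by a factor of order one, whereas a quadratically divergent term would grow by sixteen orders of magnitude.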
4 The Minimal Supersymmetric Standard Model
Encouraged by the discussion in the previous section that the supersymmetry can be explicitly broken while retaining the absence of power divergences, we now try to promote the Standard Model to a supersymmetric theory. The Minimal Supersymmetric Standard Model (MSSM) is a supersymmetric version of the Standard Model with the minimal particle content.
4.1 Particle Content
The first task is to promote all fields in the Standard Model to appropriate supermultiplets. This is obvious for the gauge bosons: they all become vector multiplets. For the quarks and leptons, we normally have left-handed and right-handed fields in the Standard Model. In order to promote them to chiral supermultiplets, however, we need to make all fields left-handed Weyl spinors. This can be done by charge-conjugating all right-handed fields. Therefore, when we refer to supermultiplets of the right-handed down quark, say, we are actually talking about chiral supermultiplets whose left-handed spinor component is the left-handed anti-down quark field. As for the Higgs boson, the field Eq. (1.1) in the Standard Model can be embedded into a chiral supermultiplet Hu. It can couple to the up-type quarks and generate their masses upon symmetry breaking. In order to generate down-type quark
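masses, however, we normally use
$$i\sigma_2 H^* = i\sigma_2 \begin{pmatrix} H^+ \\ H^0 \end{pmatrix}^* = \begin{pmatrix} H^{0*} \\ -H^- \end{pmatrix}. \tag{4.1}$$
Table 3: The chiral supermultiplets in the Minimal Supersymmetric Standard Model. The numbers in boldface refer to $SU(3)_C$, $SU(2)_L$ representations. The superscripts are hypercharges.
$L_1(\mathbf{1},\mathbf{2})^{-1/2}$, $L_2(\mathbf{1},\mathbf{2})^{-1/2}$, $L_3(\mathbf{1},\mathbf{2})^{-1/2}$
$E_1(\mathbf{1},\mathbf{1})^{+1}$, $E_2(\mathbf{1},\mathbf{1})^{+1}$, $E_3(\mathbf{1},\mathbf{1})^{+1}$
$Q_1(\mathbf{3},\mathbf{2})^{1/6}$, $Q_2(\mathbf{3},\mathbf{2})^{1/6}$, $Q_3(\mathbf{3},\mathbf{2})^{1/6}$
$U_1(\bar{\mathbf{3}},\mathbf{1})^{-2/3}$, $U_2(\bar{\mathbf{3}},\mathbf{1})^{-2/3}$, $U_3(\bar{\mathbf{3}},\mathbf{1})^{-2/3}$
$D_1(\bar{\mathbf{3}},\mathbf{1})^{+1/3}$, $D_2(\bar{\mathbf{3}},\mathbf{1})^{+1/3}$, $D_3(\bar{\mathbf{3}},\mathbf{1})^{+1/3}$
$H_u(\mathbf{1},\mathbf{2})^{+1/2}$
$H_d(\mathbf{1},\mathbf{2})^{-1/2}$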
Unfortunately, this trick does not work in a supersymmetric fashion because the superpotential $W$ must be a holomorphic function of the chiral supermultiplets and one is not allowed to take a complex conjugation of this sort. Therefore, we need to introduce another chiral supermultiplet $H_d$ which has the same gauge quantum numbers as $i\sigma_2 H^*$ above.^7 In all, the chiral supermultiplets in the Minimal Supersymmetric Standard Model are listed in Table 3. The particles in the MSSM are referred to as follows.^8 First of all, all quarks and leptons are called just in the same way as in the Standard Model, namely electron, electron-neutrino, muon, muon-neutrino, tau, tau-neutrino, up, down, strange, charm, bottom, top. Their superpartners, which have spin 0, are named with "s" at the beginning, which stands for "scalar." They are denoted by the same symbols as their fermionic counterparts with a tilde. Therefore, the superpartner of the electron is called "selectron," and is written as $\tilde e$. All these names are funny, but probably the worst one of all is the "sstrange" ($\tilde s$), which I cannot pronounce at all. Superpartners of quarks are "squarks," and those of leptons are "sleptons." Sometimes all of them are called together "sfermions," which does not make sense at all because they are bosons. The Higgs doublets are denoted by capital $H$, but as we will see later, their physical degrees of freedom are $h^0$, $H^0$, $A^0$ and $H^\pm$. Their superpartners are called "higgsinos," written as $\tilde H_u^0$, $\tilde H_u^+$, $\tilde H_d^-$, $\tilde H_d^0$. In general, fermionic superpartners of bosons in the Standard Model have "ino" at the end of the name. Spin 1/2 superpartners of the gauge bosons are "gauginos" as mentioned in the previous section, and they exist for each gauge group: gluino for gluon, wino for $W$, bino for the $U(1)_Y$ gauge boson $B$. As a result of the electroweak symmetry breaking, all neutral "inos", namely two neutral higgsinos, the neutral wino $\tilde W^3$ and the bino $\tilde B$, mix with each other to form four Majorana fermions. They are called "neutralinos" $\tilde\chi_i^0$ for $i = 1, 2, 3, 4$. Similarly, the charged higgsinos and winos $\tilde H_u^+$, $\tilde H_d^-$, $\tilde W^-$, $\tilde W^+$ mix and form two massive Dirac fermions, the "charginos" $\tilde\chi_i^\pm$ for $i = 1, 2$. All particles with a tilde do not exist in the non-supersymmetric Standard Model. Once we introduce R-parity in a later section, the particles with a tilde have odd R-parity.
^7 Another reason to need both $H_u$ and $H_d$ chiral supermultiplets is to cancel the gauge anomalies arising from their spinor components.
^8 When I first learned supersymmetry, I didn't believe it at all. Doubling the degrees of freedom looked too much to me, until I came up with my own argument at the beginning of the lecture. The funny names for the particles were yet another reason not to believe in it. It doesn't sound scientific. Once supersymmetry is discovered, we will definitely need better sounding names!
4.2 Superpotential
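The $SU(3)_C \times SU(2)_L \times U(1)_Y$ gauge invariance allows the following terms in the superpotential
$$\begin{aligned} W ={}& \lambda_u^{ij} Q_i U_j H_u + \lambda_d^{ij} Q_i D_j H_d + \lambda_e^{ij} L_i E_j H_d + \mu H_u H_d \\ &+ \lambda'^{ijk}_u U_i D_j D_k + \lambda'^{ijk}_d Q_i D_j L_k + \lambda'^{ijk}_e L_i E_j L_k + \mu'_i L_i H_u. \end{aligned} \tag{4.2}$$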
The first three terms correspond to the Yukawa couplings in the Standard Model (with exactly the same number of parameters). The subscripts $i, j, k$ are generation indices. The parameter $\mu$ has mass dimension one and gives a supersymmetric mass to both fermionic and bosonic components of the chiral supermultiplets $H_u$ and $H_d$. The terms in the second line of Eq. (4.2) are in general problematic as they break the baryon ($B$) or lepton ($L$) numbers. If the superpotential contains both $B$- and $L$-violating terms, such as $\lambda'^{112}_u U_1 D_1 D_2$ and $\lambda'^{121}_d Q_1 D_2 L_1$, one can exchange $\tilde D_2 = \tilde s$ to generate a four-fermion operator
$$\frac{\lambda'^{112}_u \lambda'^{121}_d}{m_{\tilde s}^2} (u_R d_R)(Q_1 L_1), \tag{4.3}$$
where the spinor indices are contracted within each pair of parentheses and the color indices by the epsilon tensor. Such an operator would contribute to the proton decay process $p \to e^+ \pi^0$ at a rate of $\Gamma \sim \lambda'^4 m_p^5 / m_{\tilde s}^4$, and hence the partial lifetime is of the order of
$$\tau_p \sim 6 \times 10^{-13}\ \mathrm{sec} \left(\frac{m_{\tilde s}}{1\ \mathrm{TeV}}\right)^4 \frac{1}{\lambda'^4}. \tag{4.4}$$
Recall that the experimental limit on the proton partial lifetime in this mode is $\tau_p > 1.6 \times 10^{33}$ years [18]. Unless the coupling constants are extremely small, this is clearly a disaster.
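To see how small the couplings would have to be, one can turn the dimensional estimate $\Gamma \sim \lambda'^4 m_p^5/m_{\tilde s}^4$ into a lifetime and compare it with the experimental bound. The sketch below is an order-of-magnitude illustration only: it sets all unknown $O(1)$ factors to one (so its prefactor need not agree exactly with the one quoted in the text), and the function names are hypothetical.

```python
HBAR_GEV_S = 6.582119569e-25   # hbar in GeV s
PROTON_GEV = 0.93827           # proton mass in GeV
SECONDS_PER_YEAR = 3.156e7
TAU_LIMIT_YEARS = 1.6e33       # experimental bound quoted in the text


def proton_lifetime_s(coupling, m_squark_gev):
    """Lifetime hbar / Gamma with Gamma ~ coupling^4 m_p^5 / m_squark^4."""
    gamma_gev = coupling**4 * PROTON_GEV**5 / m_squark_gev**4
    return HBAR_GEV_S / gamma_gev


def max_coupling(m_squark_gev):
    """Largest coupling for which the estimate respects the bound."""
    tau_limit_s = TAU_LIMIT_YEARS * SECONDS_PER_YEAR
    return (proton_lifetime_s(1.0, m_squark_gev) / tau_limit_s) ** 0.25


tau_unit_coupling = proton_lifetime_s(1.0, 1000.0)
coupling_bound = max_coupling(1000.0)

print(f"lifetime for coupling 1, 1 TeV squark: {tau_unit_coupling:.1e} s")
print(f"coupling bound for a 1 TeV squark: {coupling_bound:.1e}")
```

With a 1 TeV squark, the product of couplings has to sit more than twelve orders of magnitude below one, which motivates forbidding these terms by a symmetry rather than by tuning.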
4.3 R-parity
To avoid this problem of too-rapid proton decay, a common assumption is a discrete symmetry called .R-parity [19]. The Zi discrete charge is given by Rp = (_ 1 )2,+3B+L
(4
5)
where $s$ is the spin of the particle. (Alternatively, one can impose matter parity [20] $(-1)^{3B+L}$, which is equivalent to the R-parity upon a $2\pi$ spatial rotation.) Under $R_p$, all standard model particles, namely quarks, leptons, gauge bosons, and Higgs bosons, carry even parity, while their superpartners are odd due to the $(-1)^{2s}$ factor. Once this discrete symmetry is imposed, all terms in the second line of Eq. (4.2) will be forbidden, and we do not generate a dangerous operator such as that in Eq. (4.3). Indeed, B- and L-numbers are now accidental symmetries of the MSSM Lagrangian as a consequence of the supersymmetry, gauge invariance, renormalizability and R-parity conservation.
One immediate consequence of the conserved R-parity is that the lightest particle with odd R-parity, i.e., the Lightest Supersymmetric Particle (LSP), is stable. Another consequence is that one can produce (or annihilate) superparticles only pairwise. These two points have important implications for collider phenomenology and cosmology. Since the LSP is stable, its cosmological relic is a good (and arguably the best) candidate for the Cold Dark Matter particles (see, e.g., a review [21] on this subject). If so, we do not want it to be electrically charged and/or strongly interacting; otherwise we should have detected it already. Then the LSP should be a superpartner of $Z$, $\gamma$, or neutral Higgs bosons or their linear combination (called neutralino).$^{9}$ On the other hand, the superparticles can be produced only in pairs and they decay eventually into the LSP, which escapes detection. This is why the typical signature of supersymmetry at collider experiments is missing energy/momentum.
The phenomenology of R-parity breaking models has also been studied. If either B-violating or L-violating terms exist in Eq. (4.2), but not both, they would not induce proton decay [24]. However they can still produce $n$-$\bar{n}$ oscillation and a plethora of flavor-changing phenomena. We refer to a recent compilation of phenomenological constraints [25] for further details.
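The R-parity assignment can be checked mechanically. The following is a minimal sketch of the formula $R_p = (-1)^{2s+3B+L}$; the particle table and the function name `r_parity` are illustrative assumptions chosen for this example.

```python
from fractions import Fraction


def r_parity(spin, baryon_number, lepton_number):
    """Return R_p = (-1)^(2s + 3B + L) as +1 or -1."""
    exponent = (2 * Fraction(spin)
                + 3 * Fraction(baryon_number)
                + Fraction(lepton_number))
    if exponent.denominator != 1:
        raise ValueError("2s + 3B + L must be an integer")
    return 1 if exponent.numerator % 2 == 0 else -1


# (spin, B, L) for a few illustrative states; the list is not exhaustive.
PARTICLES = {
    "quark": (Fraction(1, 2), Fraction(1, 3), 0),
    "squark": (0, Fraction(1, 3), 0),
    "electron": (Fraction(1, 2), 0, 1),
    "selectron": (0, 0, 1),
    "photon": (1, 0, 0),
    "photino": (Fraction(1, 2), 0, 0),
    "Higgs": (0, 0, 0),
    "higgsino": (Fraction(1, 2), 0, 0),
}

if __name__ == "__main__":
    for name, quantum_numbers in PARTICLES.items():
        print(name, r_parity(*quantum_numbers))
```

Standard Model states come out even and their superpartners odd, purely because the superpartner spin differs by one half.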
4.4
Soft Supersymmetry Breaking Terms
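In addition to the interactions that arise from the superpotential Eq. (4.2), we should add soft supersymmetry breaking terms to the Lagrangian as we have not seen any of the superpartners of the Standard Model particles. Following the general classifications in Eq. (3.2), and assuming R-parity conservation, they are given by
$$\mathcal{L}_{\rm soft} = \mathcal{L}_1 + \mathcal{L}_2, \qquad (4.6)$$
$$\mathcal{L}_1 = -m_Q^{2ij}\tilde{Q}_i^*\tilde{Q}_j - m_U^{2ij}\tilde{U}_i^*\tilde{U}_j - m_D^{2ij}\tilde{D}_i^*\tilde{D}_j - m_L^{2ij}\tilde{L}_i^*\tilde{L}_j - m_E^{2ij}\tilde{E}_i^*\tilde{E}_j - m_{H_u}^2|H_u|^2 - m_{H_d}^2|H_d|^2, \qquad (4.7)$$
$$\mathcal{L}_2 = -A_u^{ij}\lambda_u^{ij}\tilde{Q}_i\tilde{U}_j H_u - A_d^{ij}\lambda_d^{ij}\tilde{Q}_i\tilde{D}_j H_d - A_e^{ij}\lambda_e^{ij}\tilde{L}_i\tilde{E}_j H_d + B\mu H_u H_d + \mathrm{c.c.} \qquad (4.8)$$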
The mass-squared parameters for scalar quarks (squarks) and scalar leptons (sleptons) are all three-by-three hermitian matrices, while the trilinear couplings $A^{ij}$ and the bilinear coupling $B$ of mass dimension one are general complex numbers.$^{10}$
$^{9}$ A sneutrino can in principle be the LSP [12], but it cannot be the CDM to avoid constraints from the direct detection experiment for the CDM particles [22]. It becomes a viable candidate again if there is a large lepton number violation [23].
$^{10}$ It is unfortunate that the notation $A$ is used both for the scalar components of chiral supermultiplets and the trilinear couplings. Hopefully one can tell them apart from the context.
4.5
Higgs Sector
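It is of considerable interest to look closely at the Higgs sector of the MSSM. Following the general form of the supersymmetric Lagrangians Eqs. (2.4,2.6,2.7) with the superpotential $W = \mu H_u H_d$ in Eq. (4.2) as well as the soft parameters in Eq. (4.7), the potential for the Higgs bosons is given as
$$V = \frac{g^2}{2}\left(H_d^\dagger\frac{\tau^a}{2}H_d + H_u^\dagger\frac{\tau^a}{2}H_u\right)^2 + \frac{g'^2}{2}\left(-\frac{1}{2}H_d^\dagger H_d + \frac{1}{2}H_u^\dagger H_u\right)^2 + \mu^2(|H_u|^2 + |H_d|^2) + m_{H_u}^2|H_u|^2 + m_{H_d}^2|H_d|^2 - (B\mu H_u H_d + \mathrm{c.c.}) \qquad (4.9)$$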
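It turns out that it is always possible to gauge-rotate the Higgs bosons such that
$$\langle H_u \rangle = \begin{pmatrix} 0 \\ v_u \end{pmatrix}, \qquad \langle H_d \rangle = \begin{pmatrix} v_d \\ 0 \end{pmatrix}, \qquad (4.10)$$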
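in the vacuum. Since only electrically neutral components have vacuum expectation values, the vacuum necessarily conserves $U(1)_{\rm QED}$.$^{11}$ Writing the potential (4.9) down using the expectation values (4.10), we find
$$V = \frac{g_Z^2}{8}(v_u^2 - v_d^2)^2 + (v_u\ v_d)\begin{pmatrix} \mu^2 + m_{H_u}^2 & -B\mu \\ -B\mu & \mu^2 + m_{H_d}^2 \end{pmatrix}\begin{pmatrix} v_u \\ v_d \end{pmatrix}, \qquad (4.11)$$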
where $g_Z^2 = g^2 + g'^2$. In order for the Higgs bosons to acquire the vacuum expectation values, the determinant of the mass matrix at the origin must be negative,
$$\det\begin{pmatrix} \mu^2 + m_{H_u}^2 & -B\mu \\ -B\mu & \mu^2 + m_{H_d}^2 \end{pmatrix} < 0. \qquad (4.12)$$
However, there is a danger that the direction $v_u = v_d$, which makes the quartic term in the potential identically vanish, may be unbounded from below. For this not to occur, we need
$$\mu^2 + m_{H_u}^2 + \mu^2 + m_{H_d}^2 > 2\mu B. \qquad (4.13)$$
In order to reproduce the mass of the Z-boson correctly, we need
$$v_u = \frac{v}{\sqrt{2}}\sin\beta, \qquad v_d = \frac{v}{\sqrt{2}}\cos\beta, \qquad v = 250\ \mathrm{GeV}. \qquad (4.14)$$
$^{11}$ This is not necessarily true in general two-doublet Higgs Models. Consult a review [26].
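The vacuum minimization conditions are given by $\partial V/\partial v_u = \partial V/\partial v_d = 0$ from the potential Eq. (4.11). Using Eq. (4.14), we obtain
$$\mu^2 = -\frac{m_Z^2}{2} + \frac{m_{H_d}^2 - m_{H_u}^2\tan^2\beta}{\tan^2\beta - 1} \qquad (4.15)$$
and
$$B\mu = (2\mu^2 + m_{H_u}^2 + m_{H_d}^2)\sin\beta\cos\beta. \qquad (4.16)$$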
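The two minimization conditions can be verified numerically. The sketch below is written under the assumption that $m_Z^2 = g_Z^2 v^2/4$ with the normalization of Eq. (4.14); it solves for $\mu^2$ and $B\mu$ at given soft masses and $\tan\beta$, and then checks that the gradient of the potential of Eq. (4.11) vanishes at $(v_u, v_d)$. The function names and sample inputs are illustrative assumptions.

```python
import math


def solve_mu2_and_bmu(m2_hu, m2_hd, tan_beta, m_z):
    """Return (mu^2, B*mu) from the minimization conditions (4.15), (4.16)."""
    t2 = tan_beta**2
    mu2 = -0.5 * m_z**2 + (m2_hd - m2_hu * t2) / (t2 - 1.0)
    beta = math.atan(tan_beta)
    b_mu = (2.0 * mu2 + m2_hu + m2_hd) * math.sin(beta) * math.cos(beta)
    return mu2, b_mu


def potential_gradient(vu, vd, mu2, m2_hu, m2_hd, b_mu, gz2):
    """Analytic gradient of V = gz2/8 (vu^2 - vd^2)^2 + quadratic form of (4.11)."""
    diff = vu**2 - vd**2
    d_vu = 0.5 * gz2 * diff * vu + 2.0 * (mu2 + m2_hu) * vu - 2.0 * b_mu * vd
    d_vd = -0.5 * gz2 * diff * vd + 2.0 * (mu2 + m2_hd) * vd - 2.0 * b_mu * vu
    return d_vu, d_vd


def check_minimum(m2_hu, m2_hd, tan_beta, m_z=91.19, v=246.0):
    """Return the gradient at the would-be vacuum; both entries should vanish."""
    mu2, b_mu = solve_mu2_and_bmu(m2_hu, m2_hd, tan_beta, m_z)
    beta = math.atan(tan_beta)
    vu = v * math.sin(beta) / math.sqrt(2.0)
    vd = v * math.cos(beta) / math.sqrt(2.0)
    gz2 = 4.0 * m_z**2 / v**2   # assumption: m_Z^2 = g_Z^2 v^2 / 4
    return potential_gradient(vu, vd, mu2, m2_hu, m2_hd, b_mu, gz2)


if __name__ == "__main__":
    # Sample soft masses in GeV^2 (hypothetical inputs).
    print(check_minimum(m2_hu=-1.0e4, m2_hd=2.5e5, tan_beta=10.0))
```

The residual gradient is zero up to floating-point rounding, which confirms that (4.15) and (4.16) are exactly the stationarity conditions of (4.11) under the stated normalization.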
Because there are two Higgs doublets, each with four real scalar fields, the number of degrees of freedom is eight before the symmetry breaking. However three of them are eaten by the $W^+$, $W^-$ and $Z$ bosons, and we are left with five physical scalar particles. There are two CP-even scalars $h^0$, $H^0$, one CP-odd scalar $A^0$, and two charged scalars $H^+$ and $H^-$. Their masses can be worked out from the potential (4.11):
$$m_A^2 = 2\mu^2 + m_{H_u}^2 + m_{H_d}^2, \qquad m_{H^\pm}^2 = m_W^2 + m_A^2, \qquad (4.17)$$
and
$$m_{h^0}^2, m_{H^0}^2 = \frac{1}{2}\left(m_A^2 + m_Z^2 \pm \sqrt{(m_A^2 + m_Z^2)^2 - 4m_Z^2 m_A^2\cos^2 2\beta}\right). \qquad (4.18)$$
A very interesting consequence of the formula Eq. (4.18) is that the lighter CP-even Higgs mass $m_{h^0}^2$ is maximized when $\cos^2 2\beta = 1$: $m_{h^0}^2 = (m_A^2 + m_Z^2 - |m_A^2 - m_Z^2|)/2$. When $m_A < m_Z$, we obtain $m_{h^0}^2 = m_A^2 < m_Z^2$, while when $m_A > m_Z$, $m_{h^0}^2 = m_Z^2$. Therefore in any case we find
$$m_{h^0} \le m_Z. \qquad (4.19)$$
This is an important prediction in the MSSM. The reason why the masses of the Higgs boson are related to the gauge boson masses is that the Higgs quartic couplings in Eq. (4.9) are all determined by the gauge couplings because they originate from the elimination of the auxiliary D-fields in Eq. (2.6). Unfortunately, the prediction Eq. (4.19) is modified at the one-loop level [27], approximately as
$$\Delta(m_{h^0}^2) = \frac{N_c}{4\pi^2} h_t^4 v^2 \sin^4\beta\, \log\left(\frac{m_{\tilde{t}_1} m_{\tilde{t}_2}}{m_t^2}\right). \qquad (4.20)$$
With the scalar top mass of up to 1 TeV, the lightest Higgs mass is pushed up to about 130 GeV. (See also the latest analysis including the resummed two-loop contribution [28].)
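The tree-level statements above are easy to check numerically. The following sketch evaluates the mass formulas (4.17) and (4.18) and scans a small grid to confirm the bound (4.19); the function name, the default values of $m_Z$ and $m_W$, and the grid are illustrative assumptions, and no loop correction is included.

```python
import math


def tree_level_higgs_masses(m_a, tan_beta, m_z=91.19, m_w=80.4):
    """Return (m_h, m_H, m_Hpm) in GeV from the tree-level formulas."""
    cos2_2beta = math.cos(2.0 * math.atan(tan_beta)) ** 2
    total = m_a**2 + m_z**2
    root = math.sqrt(max(total**2 - 4.0 * m_z**2 * m_a**2 * cos2_2beta, 0.0))
    m_light = math.sqrt(max(0.5 * (total - root), 0.0))
    m_heavy = math.sqrt(0.5 * (total + root))
    m_charged = math.sqrt(m_w**2 + m_a**2)
    return m_light, m_heavy, m_charged


if __name__ == "__main__":
    for m_a in (50.0, 91.19, 200.0, 1000.0):
        for tan_beta in (1.5, 5.0, 50.0):
            m_light, m_heavy, m_charged = tree_level_higgs_masses(m_a, tan_beta)
            print(m_a, tan_beta, round(m_light, 2), round(m_heavy, 2), round(m_charged, 2))
```

At large $m_A$ and large $\tan\beta$ the light state approaches $m_Z$ from below, so the tree-level bound is saturated only in that corner of the parameter space.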
The parameter space of the MSSM Higgs sector can be described by two parameters. This is because the potential Eq. (4.11) has three independent parameters, $\mu^2 + m_{H_u}^2$, $\mu^2 + m_{H_d}^2$, and $B\mu$, while one combination is fixed by the Z-mass Eq. (4.12). It is customary to pick either $(m_A, \tan\beta)$ or $(m_{h^0}, \tan\beta)$ to present experimental constraints. The current experimental constraint on this parameter space is shown in Fig. 5.$^{12}$ The range of the Higgs mass predicted in the MSSM is not necessarily an easy range for the LHC experiments, but three years' running at the high luminosity is supposed to cover the entire MSSM parameter space, by employing many different production/decay modes as seen in Fig. 6.
4.6
Neutralinos and Charginos
Once the electroweak symmetry is broken, and since supersymmetry is already explicitly broken in the MSSM, there is no quantum number which can distinguish the two neutral higgsino states $\tilde{H}_d^0$, $\tilde{H}_u^0$, and the two neutral gaugino states $\tilde{W}^3$ (neutral wino) and $\tilde{B}$ (bino). They have a four-by-four Majorana mass matrix
$$\mathcal{L} \supset -\frac{1}{2}(\tilde{B}\ \tilde{W}^3\ \tilde{H}_d^0\ \tilde{H}_u^0)\begin{pmatrix} M_1 & 0 & -m_Z s_W c_\beta & m_Z s_W s_\beta \\ 0 & M_2 & m_Z c_W c_\beta & -m_Z c_W s_\beta \\ -m_Z s_W c_\beta & m_Z c_W c_\beta & 0 & -\mu \\ m_Z s_W s_\beta & -m_Z c_W s_\beta & -\mu & 0 \end{pmatrix}\begin{pmatrix} \tilde{B} \\ \tilde{W}^3 \\ \tilde{H}_d^0 \\ \tilde{H}_u^0 \end{pmatrix}. \qquad (4.21)$$
Here, $s_W = \sin\theta_W$, $c_W = \cos\theta_W$, $s_\beta = \sin\beta$, and $c_\beta = \cos\beta$. Once $M_1$, $M_2$, $\mu$ exceed $m_Z$, which is preferred given the current experimental limits, one can regard components proportional to $m_Z$ as small perturbations. Then the neutralinos are close to their weak eigenstates, bino, wino, and higgsinos. But the higgsinos in this limit are mixed to form symmetric and anti-symmetric linear combinations $\tilde{H}_S^0 = (\tilde{H}_d^0 + \tilde{H}_u^0)/\sqrt{2}$ and $\tilde{H}_A^0 = (\tilde{H}_d^0 - \tilde{H}_u^0)/\sqrt{2}$.
$^{12}$ The large $\tan\beta$ region may appear completely excluded in the plot, but this is somewhat misleading; it is due to the parametrization $(m_{h^0}, \tan\beta)$ which squeezes the $m_{h^0}$ region close to the theoretical upper bound to a very thin one. In the $(m_A, \tan\beta)$ parametrization, one can see the allowed region much more clearly.
Figure 5: Regions in the $(m_{h^0}, \tan\beta)$ plane excluded by the MSSM Higgs boson searches at LEP-II [29]. (Plot title: MSSM exclusions in the max-$m_h$ scenario, with $M_{\rm SUSY} = 1$ TeV, $M_2 = 200$ GeV, $\mu = -200$ GeV, $m_{\tilde{g}} = 800$ GeV. Quoted mass limits in GeV/$c^2$: $m_h > 89.9$ observed (93.8 expected), $m_A > 90.5$ observed (94.1 expected); $\tan\beta$ excluded from 0.52 to 2.25 observed, 0.48 to 2.48 expected.)
Figure 6: Expected coverage of the MSSM Higgs sector parameter space by the ATLAS experiment at the LHC, after three years of high-luminosity running. (Plot: ATLAS, $\int L\,dt = 300$ fb$^{-1}$; horizontal axis $m_A$ (GeV).)
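Similarly two positively charged inos, $\tilde{H}_u^+$ and $\tilde{W}^+$, and two negatively charged inos, $\tilde{H}_d^-$ and $\tilde{W}^-$, mix. The mass matrix is given by
$$\mathcal{L} \supset -(\tilde{W}^-\ \tilde{H}_d^-)\begin{pmatrix} M_2 & \sqrt{2} m_W s_\beta \\ \sqrt{2} m_W c_\beta & \mu \end{pmatrix}\begin{pmatrix} \tilde{W}^+ \\ \tilde{H}_u^+ \end{pmatrix} + \mathrm{c.c.} \qquad (4.22)$$
Again once $M_2, \mu > m_W$, the chargino states are close to the weak eigenstates winos and higgsinos.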
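To see how quickly the charginos approach the weak eigenstates, one can compute the singular values of the two-by-two mass matrix directly. The sketch below assumes the commonly used form with rows $(M_2, \sqrt{2} m_W s_\beta)$ and $(\sqrt{2} m_W c_\beta, \mu)$; this form, the function name, and the sample inputs are assumptions made for illustration and should be checked against the mass matrix intended in the text.

```python
import math


def chargino_masses(m2, mu, tan_beta, m_w=80.4):
    """Singular values of X = [[M2, sqrt2 mW sin(b)], [sqrt2 mW cos(b), mu]].

    They follow from the trace and determinant of X X^T:
      trace = M2^2 + mu^2 + 2 mW^2,  det = (M2 mu - mW^2 sin 2b)^2.
    """
    beta = math.atan(tan_beta)
    trace = m2**2 + mu**2 + 2.0 * m_w**2
    det = (m2 * mu - m_w**2 * math.sin(2.0 * beta)) ** 2
    root = math.sqrt(max(trace**2 - 4.0 * det, 0.0))
    light = math.sqrt(max(0.5 * (trace - root), 0.0))
    heavy = math.sqrt(0.5 * (trace + root))
    return light, heavy


if __name__ == "__main__":
    # Hypothetical inputs in GeV: M2 and mu well above m_W.
    print(chargino_masses(1000.0, 2000.0, 10.0))
    # Inputs comparable to m_W, where the mixing is large.
    print(chargino_masses(100.0, 150.0, 10.0))
```

With $M_2$ and $\mu$ in the TeV range the two masses differ from $M_2$ and $\mu$ by only a few GeV, while for inputs near $m_W$ the shifts are of the same order as the masses themselves.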
4.7
Squarks, Sleptons
The mass terms of squarks and sleptons are also modified after the electroweak symmetry breaking. There are four different contributions. One is the supersymmetric piece coming from the $|\partial W/\partial\phi_i|^2$ terms in Eq. (2.4) with $\phi_i = Q, U, D, L, E$. These terms add $m_f^2$ where $m_f$ is the mass of the quarks and leptons from their Yukawa couplings to the Higgs boson. The next one comes from the $|\partial W/\partial\phi_i|^2$ terms in Eq. (2.4) with $\phi_i = H_u$ or $H_d$ in the superpotential Eq. (4.2). Because of the $\mu$ term,
$$\frac{\partial W}{\partial H_u^0} = -\mu H_d^0 + \lambda_u^{ij}\tilde{Q}_i\tilde{U}_j, \qquad (4.23)$$
$$\frac{\partial W}{\partial H_d^0} = -\mu H_u^0 + \lambda_d^{ij}\tilde{Q}_i\tilde{D}_j + \lambda_e^{ij}\tilde{L}_i\tilde{E}_j. \qquad (4.24)$$
Taking the absolute square of these two expressions and picking the cross terms together with $\langle H_d^0 \rangle = v\cos\beta/\sqrt{2}$, $\langle H_u^0 \rangle = v\sin\beta/\sqrt{2}$, we obtain mixing between $\tilde{Q}$ and $\tilde{U}$, $\tilde{Q}$ and $\tilde{D}$, and $\tilde{L}$ and $\tilde{E}$. Similarly, the vacuum expectation values of the Higgs bosons in the trilinear couplings Eq. (4.8) also generate similar mixing terms. Finally, the D-term potential after eliminating the auxiliary field $D$ in Eq. (2.7) also gives contributions to the scalar masses $m_Z^2(I_3 - Q\sin^2\theta_W)\cos 2\beta$. Therefore, the mass matrix of stop, for instance, is given by
$$\mathcal{L} \supset -(\tilde{t}_L^*\ \tilde{t}_R^*)\begin{pmatrix} m_{Q_3}^2 + m_t^2 + m_Z^2(\frac{1}{2} - \frac{2}{3}s_W^2)c_{2\beta} & m_t(A_t - \mu\cot\beta) \\ m_t(A_t - \mu\cot\beta) & m_{U_3}^2 + m_t^2 + m_Z^2(-\frac{2}{3}s_W^2)c_{2\beta} \end{pmatrix}\begin{pmatrix} \tilde{t}_L \\ \tilde{t}_R \end{pmatrix} \qquad (4.25)$$
with $c_{2\beta} = \cos 2\beta$. Here, $\tilde{t}_L$ is the up component of $\tilde{Q}_3$, and $\tilde{t}_R = \tilde{T}^*$. For first and second generation particles, the off-diagonal terms are negligible for most purposes. They may, however, be important when their loops in flavor-changing processes are considered.
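The effect of the left-right mixing term $m_t(A_t - \mu\cot\beta)$ can be illustrated by diagonalizing a real symmetric two-by-two matrix. In the sketch below the two diagonal entries are taken as inputs rather than computed from the D-terms, and all names and sample numbers are illustrative assumptions.

```python
import math


def stop_mass_eigenvalues(m2_ll, m2_rr, m_t, a_t, mu, tan_beta):
    """Diagonalize [[m2_ll, off], [off, m2_rr]] with off = m_t (A_t - mu cot(beta)).

    Returns (lighter m^2, heavier m^2, mixing angle in radians).
    """
    off_diagonal = m_t * (a_t - mu / tan_beta)
    mean = 0.5 * (m2_ll + m2_rr)
    half_difference = 0.5 * (m2_ll - m2_rr)
    root = math.hypot(half_difference, off_diagonal)
    mixing_angle = 0.5 * math.atan2(2.0 * off_diagonal, m2_ll - m2_rr)
    return mean - root, mean + root, mixing_angle


if __name__ == "__main__":
    # Hypothetical inputs: diagonal entries in GeV^2, masses in GeV.
    light, heavy, angle = stop_mass_eigenvalues(
        m2_ll=500.0**2, m2_rr=400.0**2, m_t=175.0, a_t=500.0, mu=200.0, tan_beta=10.0
    )
    print(math.sqrt(light), math.sqrt(heavy), angle)
```

Because the off-diagonal entry is proportional to the fermion mass, the same function applied to first- or second-generation squarks gives eigenvalues essentially equal to the diagonal entries, which is the sense in which the mixing is negligible there.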
4.8
What We Gained in the MSSM
It is useful to review here what we have gained in the MSSM over what we had in the Standard Model. The main advantage of the MSSM is of course what motivated the supersymmetry to begin with: the absence of the quadratic divergences as seen in Eq. (3.6). This fact allows us to apply the MSSM down to distance scales much shorter than the electroweak scale, and hence we can at least hope that many of the puzzles discussed at the beginning of the lecture can be solved by physics at the short distance scales. There are a few amusing and welcome by-products of supersymmetry beyond this very motivation. First of all, the Higgs doublet in the Standard Model appears so unnatural partly because it is the only scalar field, introduced just for the sake of the electroweak symmetry breaking. In the MSSM, however, there are many scalar fields: 15 complex scalar fields for each generation and two in each Higgs doublet. Therefore, the Higgs bosons are just "one of them." Then the question about the electroweak symmetry breaking is addressed in a completely different fashion: why is it only the Higgs bosons that condense? In fact, one can even partially answer this question in the renormalization group analysis in the next sections where "typically" (we will explain what we mean by this) it is only the Higgs bosons which acquire negative mass squared (4.12) while the masses-squared of all the other scalars "naturally" remain positive. Finally, the absolute upper bound on the lightest CP-even Higgs boson is falsifiable by experiments. However, life is not as good as we wish. We will see that there are very stringent low-energy constraints on the MSSM in Section 6.
5
Renormalization Group Analyses
Once supersymmetry protects the Higgs self-energy against corrections from the short distance scales, or equivalently, the high energy scales, it becomes important to connect physics at the electroweak scale where we can do measurements to the fundamental parameters defined at high energy scales. This can be done by studying the renormalization-group evolution of parameters. It also becomes a natural expectation that the supersymmetry breaking itself originates at some high energy scale. If this is the case, the soft supersymmetry breaking parameters should also be studied using the renormalization-group equations. We study the renormalization-group evolution of various parameters in the softly-broken supersymmetric Lagrangian at the one-loop level.$^{13}$ If supersymmetry indeed turns out to be the choice of nature, the renormalization-group analysis will be crucial in probing physics at high energy scales using the observables at the TeV-scale collider experiments [32].
5.1
Gauge Coupling Constants
The first parameters to be studied are naturally the coupling constants in the Standard Model. The running of the gauge coupling constants is described in terms of the beta functions, and their one-loop solutions in non-supersymmetric theories are given by
$$\frac{1}{g^2(\mu)} = \frac{1}{g^2(\mu')} + \frac{b_0}{8\pi^2}\log\frac{\mu}{\mu'} \qquad (5.1)$$
with
$$b_0 = \frac{11}{3}C_2(G) - \frac{2}{3}S_f - \frac{1}{3}S_b. \qquad (5.2)$$
This formula is for Weyl fermions $f$ and complex scalars $b$. The group theory factors are defined by
$$\delta^{ad}C_2(G) = f^{abc}f^{dbc}, \qquad (5.3)$$
$$\delta^{ab}S_{f,b} = \mathrm{Tr}\,T^a T^b, \qquad (5.4)$$
and $C_2(G) = N_c$ for SU($N_c$) groups and $S_{f,b} = 1/2$ for their fundamental representations. In supersymmetric theories, there is always the gaugino multiplet in the adjoint representation of the gauge group. It contributes to Eq. (5.2) with $S_f = C_2(G)$, and therefore the total contribution of the vector supermultiplet is $3C_2(G)$. On the other hand, the chiral supermultiplets have a Weyl spinor and a complex scalar, and the last two terms in Eq. (5.2) can always be combined since $S_f = S_b$. Therefore, the beta function coefficients simplify to
$$b_0 = 3C_2(G) - S_f. \qquad (5.5)$$
$^{13}$ Recently, there have been developments in obtaining and understanding all-order beta functions for gauge coupling constants [30] and soft parameters [31].
Figure 7: Running of gauge coupling constants in the Standard Model and in the MSSM.
Given the beta functions, it is easy to work out how the gauge coupling constants measured accurately at LEP/SLC evolve to higher energies. One interesting possibility is that the gauge groups in the Standard Model $SU(3)_C \times SU(2)_L \times U(1)_Y$ may be embedded into a simple group, such as SU(5) or SO(10), at some high energy scale, called "grand unification." The gauge coupling constants at $\mu \sim m_Z$ are approximately $\alpha^{-1} = 129$, $\sin^2\theta_W \simeq 0.232$, and $\alpha_s = 0.119$. In the SU(5) normalization, the U(1) coupling constant is given by $\alpha_1 = \frac{5}{3}\alpha' = \frac{5}{3}\alpha/\cos^2\theta_W$. It turns out that the gauge coupling constants become equal at $\mu \simeq 2 \times 10^{16}$ GeV given the MSSM particle content (Fig. 7). On the other hand, the three gauge coupling constants miss each other quite badly with the non-supersymmetric Standard Model particle content. This observation suggests the possibility of supersymmetric grand unification.
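The unification statement can be reproduced with a few lines of arithmetic. The sketch below runs $\alpha_i^{-1}$ at one loop from $m_Z$ using the convention of Eq. (5.1), rewritten as $\alpha^{-1}(\mu) = \alpha^{-1}(m_Z) + (b_0/2\pi)\log(\mu/m_Z)$. The coefficient lists `B0_SM` and `B0_MSSM` are the standard one-loop values in this sign convention, and running the MSSM coefficients all the way down to $m_Z$ (no superparticle thresholds) is a simplifying assumption, as are the function names.

```python
import math

M_Z = 91.19  # GeV (assumed value)

# One-loop coefficients b0 for (U(1) in SU(5) normalization, SU(2), SU(3)),
# in the convention 1/g^2(mu) = 1/g^2(mu') + b0/(8 pi^2) log(mu/mu').
B0_SM = [-41.0 / 10.0, 19.0 / 6.0, 7.0]
B0_MSSM = [-33.0 / 5.0, -1.0, 3.0]


def alpha_inverse_at_mz(alpha_em_inv=129.0, sin2_theta_w=0.232, alpha_s=0.119):
    """Inverse couplings (alpha_1^-1, alpha_2^-1, alpha_3^-1) at m_Z."""
    alpha1_inv = (3.0 / 5.0) * alpha_em_inv * (1.0 - sin2_theta_w)
    alpha2_inv = alpha_em_inv * sin2_theta_w
    alpha3_inv = 1.0 / alpha_s
    return [alpha1_inv, alpha2_inv, alpha3_inv]


def run(alpha_inv_mz, b0, mu):
    """One-loop inverse couplings at scale mu (GeV)."""
    t = math.log(mu / M_Z)
    return [a + b / (2.0 * math.pi) * t for a, b in zip(alpha_inv_mz, b0)]


def crossing_scale(alpha_inv_mz, b0, i=0, j=1):
    """Scale (GeV) at which couplings i and j become equal."""
    t = 2.0 * math.pi * (alpha_inv_mz[i] - alpha_inv_mz[j]) / (b0[j] - b0[i])
    return M_Z * math.exp(t)


if __name__ == "__main__":
    start = alpha_inverse_at_mz()
    for label, b0 in (("SM", B0_SM), ("MSSM", B0_MSSM)):
        mu_12 = crossing_scale(start, b0)
        print(label, "alpha_1 = alpha_2 at", mu_12, "GeV:", run(start, b0, mu_12))
```

With the MSSM coefficients the first two couplings meet near $2 \times 10^{16}$ GeV and the strong coupling lands within about half a unit of $\alpha^{-1}$ of them, while with the Standard Model coefficients the mismatch is several units.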
5.2
Yukawa Coupling Constants
Since first- and second-generation Yukawa couplings are so small, let us ignore them and concentrate on the third-generation ones. Their renormalization-group equations are given as
$$\mu\frac{dh_t}{d\mu} = \frac{h_t}{16\pi^2}\left[6h_t^2 + h_b^2 - \frac{16}{3}g_3^2 - 3g_2^2 - \frac{13}{15}g_1^2\right], \qquad (5.6)$$
$$\mu\frac{dh_b}{d\mu} = \frac{h_b}{16\pi^2}\left[6h_b^2 + h_t^2 + h_\tau^2 - \frac{16}{3}g_3^2 - 3g_2^2 - \frac{7}{15}g_1^2\right], \qquad (5.7)$$
$$\mu\frac{dh_\tau}{d\mu} = \frac{h_\tau}{16\pi^2}\left[4h_\tau^2 + 3h_b^2 - 3g_2^2 - \frac{9}{5}g_1^2\right]. \qquad (5.8)$$
Figure 8: The regions on the $(m_t, \tan\beta)$ plane where $h_b = h_\tau$ at the GUT-scale [34].
The important aspect of these equations is that the gauge coupling constants push down the Yukawa coupling constants at higher energies, while the Yukawa couplings push them up. This interplay, together with a large top Yukawa coupling, allows the possibility that the Yukawa couplings may also unify at the same energy scale where the gauge coupling constants appear to unify (Fig. 8). There are two regions of $\tan\beta$ which lead to Yukawa unification: $\tan\beta \sim 2$ and $\tan\beta \sim 60$. The first range is essentially excluded by the negative result in the Higgs boson search at LEP-II. It turned out that the actual situation is much more relaxed than what this plot suggests. This is because there is a significant correction to $m_b$ at $\tan\beta > 10$ when the superparticles are integrated out [33]. Therefore the $m_b$-$m_\tau$ Yukawa unification may work for a larger range of parameter space $\tan\beta > 10$.
5.3
Soft Parameters
Since we do not know any of the soft parameters at this point, we cannot use the renormalization-group equations to probe physics at high energy scales. On the other hand, we can use the renormalization-group equations from boundary conditions at high energy scales suggested by models to obtain useful information on the "typical" superparticle mass spectrum. First of all, the gaugino mass parameters have the very simple behavior that
$$\mu\frac{d}{d\mu}\frac{M_i}{g_i^2} = 0. \qquad (5.9)$$
Therefore, the ratios $M_i/g_i^2$ are constants at all energies. If the grand unification is true, both the gauge coupling constants and the gaugino mass parameters must unify at the GUT-scale and hence the ratios are all the same at the GUT-scale. Since the ratios do not run, they are the same at any energy scale, and hence the low-energy gaugino mass ratios are predicted to be
$$M_1 : M_2 : M_3 = g_1^2 : g_2^2 : g_3^2 \sim 1 : 2 : 7 \qquad (5.10)$$
at the TeV scale. We see the tendency that the colored particle (gluino in this case) is much heavier than the uncolored particles (wino and bino in this case). This turns out to be a relatively model-independent conclusion.
The running of scalar masses is given by simple equations when all Yukawa couplings other than that of the top quark are neglected. We find
$$16\pi^2\mu\frac{d}{d\mu}m_{H_u}^2 = 3X_t - 6g_2^2M_2^2 - \frac{6}{5}g_1^2M_1^2, \qquad (5.11)$$
$$16\pi^2\mu\frac{d}{d\mu}m_{H_d}^2 = -6g_2^2M_2^2 - \frac{6}{5}g_1^2M_1^2, \qquad (5.12)$$
$$16\pi^2\mu\frac{d}{d\mu}m_{Q_3}^2 = X_t - \frac{32}{3}g_3^2M_3^2 - 6g_2^2M_2^2 - \frac{2}{15}g_1^2M_1^2, \qquad (5.13)$$
$$16\pi^2\mu\frac{d}{d\mu}m_{U_3}^2 = 2X_t - \frac{32}{3}g_3^2M_3^2 - \frac{32}{15}g_1^2M_1^2. \qquad (5.14)$$
Here, $X_t = 2h_t^2(m_{H_u}^2 + m_{Q_3}^2 + m_{U_3}^2)$ and the trilinear couplings are also neglected. Even within the simplifying assumptions, one learns interesting lessons. First of all, the gauge interactions push the scalar masses up at lower energies due to the gaugino mass squared contributions. Colored particles are pushed up even more than uncolored ones, and the right-handed sleptons would be the least pushed up. On the other hand, Yukawa couplings push the scalar masses down at lower energies. The coefficients of $X_t$ in Eqs. (5.11, 5.13, 5.14) are simply the multiplicity factors which correspond to 3 of $SU(3)_C$, 2 of $SU(2)_L$ and 1 of $U(1)_Y$. It is extremely interesting that $m_{H_u}^2$ is pushed down the most because of the factor of three, and is also pushed up the least because of the absence of the gluino mass contribution. Therefore, the fact that the Higgs mass squared is negative at the electroweak scale may well be just a simple consequence of the renormalization-group equations! Since the Higgs boson is just "one of them" in the MSSM, the renormalization-group equations provide a very compelling reason why it is only the Higgs boson whose mass-squared goes negative and condenses. One can view this as an explanation for the electroweak symmetry breaking: "radiative breaking" of electroweak symmetry.
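The qualitative argument for radiative breaking can be made concrete by evaluating the right-hand sides of the scalar-mass equations at a single point. The sketch below uses the standard one-loop MSSM coefficients (top Yukawa only, trilinear couplings neglected), which should be checked against Eqs. (5.11)-(5.14); the function name and the universal sample inputs are illustrative assumptions. A positive right-hand side means the mass squared decreases as the scale $\mu$ is lowered.

```python
def scalar_mass_rhs(h_t, masses_sq, gaugino_masses, couplings_sq):
    """Right-hand sides 16 pi^2 mu d(m^2)/d(mu) for Hu, Hd, Q3, U3.

    masses_sq: dict with keys "Hu", "Q3", "U3" (scalar masses squared)
    gaugino_masses: (M1, M2, M3)
    couplings_sq: (g1^2, g2^2, g3^2)
    """
    g1_sq, g2_sq, g3_sq = couplings_sq
    m1, m2, m3 = gaugino_masses
    x_t = 2.0 * h_t**2 * (masses_sq["Hu"] + masses_sq["Q3"] + masses_sq["U3"])
    return {
        "Hu": 3.0 * x_t - 6.0 * g2_sq * m2**2 - (6.0 / 5.0) * g1_sq * m1**2,
        "Hd": -6.0 * g2_sq * m2**2 - (6.0 / 5.0) * g1_sq * m1**2,
        "Q3": (x_t - (32.0 / 3.0) * g3_sq * m3**2 - 6.0 * g2_sq * m2**2
               - (2.0 / 15.0) * g1_sq * m1**2),
        "U3": (2.0 * x_t - (32.0 / 3.0) * g3_sq * m3**2
               - (32.0 / 15.0) * g1_sq * m1**2),
    }


if __name__ == "__main__":
    # Hypothetical universal boundary condition in arbitrary mass units.
    rhs = scalar_mass_rhs(
        h_t=1.0,
        masses_sq={"Hu": 1.0, "Q3": 1.0, "U3": 1.0},
        gaugino_masses=(1.0, 1.0, 1.0),
        couplings_sq=(0.5, 0.5, 0.5),
    )
    for name, value in rhs.items():
        print(name, value)
```

For these universal inputs $m_{H_u}^2$ has by far the largest positive slope, so it is the first mass squared driven negative when running down from the high scale, while $m_{Q_3}^2$ and $m_{H_d}^2$ grow.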
6
Low-Energy Constraints
Despite the fact that we are interested in superparticles in the 100-1000 GeV range, which are just starting to be explored in collider searches, there are many amazingly stringent low-energy constraints on superparticles.
6.1
Mass Insertion Technique
To study the constraints from the rare processes on supersymmetry, the so-called mass insertion technique is very useful. To introduce the technique, let us pose a different question first, and then come back to supersymmetry. We have learned that the atmospheric neutrinos seem to oscillate. The mode is most likely $\nu_\mu \to \nu_\tau$. What it means is that they have finite masses, and their mass eigenstates are different from the interaction eigenstates. If so, in the basis where the charged lepton masses are diagonal, the mass matrix for the neutrinos is not diagonal. Keeping only $\nu_\tau$ and $\nu_\mu$, we assume for the sake of the discussion that the neutrinos are Dirac neutrinos, and take $\sin^2 2\theta = 1$, $m_{\nu_3}^2 = 3 \times 10^{-3}$ eV$^2$ $\gg m_{\nu_2}^2 \approx 0$. Then the mass term in the Lagrangian is approximately
$$\mathcal{L} = -\frac{1}{2}m_{\nu_3}(\bar{\nu}_\mu\ \bar{\nu}_\tau)\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} \nu_\mu \\ \nu_\tau \end{pmatrix}. \qquad (6.1)$$
Figure 9: The Feynman diagram for $\tau \to \mu\gamma$ from the neutrino mass insertion.
Given the observation that $\nu_\mu$ can oscillate to $\nu_\tau$, there is violation of muon number and tau number. A natural question to ask is if there is also a corresponding process in the charged leptons, such as $\tau \to \mu\gamma$. Let us estimate this rate without actually calculating the diagram. The effective operator responsible for such a decay must be
$$\bar{\tau}_R\sigma^{\mu\nu}\mu_L F_{\mu\nu} \qquad (6.2)$$
or with the opposite chirality combination. The diagram is shown in Figure 9. One unusual feature of this diagram is that the flow of chirality is shown explicitly. Another point is that the masses are treated as "insertions", i.e., as "interactions" represented by crosses. Because the chirality of $\tau$ is $\tau_R$, while it needs to convert to neutrinos to pick up flavor violation, it has to interact with the W-boson, which requires $\tau_L$. Therefore, there must be the insertion of $m_\tau$ to flip the chirality from $\tau_R$ to $\tau_L$. There is no need for an $m_\mu$ insertion because $\mu_L$ can interact with the W-boson. The off-diagonal element of Eq. (6.1) changes flavor, but also the chirality because it is a mass term. In order to keep the neutrino left-handed so that it can interact with the W-boson, we need to insert the mass term twice. This way, one can determine the minimum number of mass insertions very simply. The coefficient of the operator Eq. (6.2) then should approximately be
$$e\,\frac{g^2}{16\pi^2}\,\frac{m_{\nu_3}^2 m_\tau}{m_W^4}. \qquad (6.3)$$
Given this estimate, the branching fraction for $\tau \to \mu\gamma$ would be
$$\mathrm{BR}(\tau \to \mu\gamma) \sim 10^{-51}. \qquad (6.4)$$
This predicted width is certainly experimentally allowed and unlikely to be seen any time soon.
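The size of this branching fraction can be estimated in a few lines. The sketch below assumes an operator coefficient of the form $e\,(g^2/16\pi^2)\,m_\tau m_\nu^2/m_W^4$ read off from the mass-insertion counting, and a width $\Gamma \sim C^2 m_\tau^3/(4\pi)$; the phase-space factor, the numerical constants, and the function names are assumptions, so only the order of magnitude is meaningful.

```python
import math

ALPHA_EM = 1.0 / 137.0         # fine-structure constant (assumed value)
G_WEAK = 0.65                  # SU(2) gauge coupling (assumed value)
M_W = 80.4                     # GeV
M_TAU = 1.777                  # GeV
TAU_TOTAL_WIDTH = 2.27e-12     # GeV, total tau decay width (assumed value)


def operator_coefficient(m_nu_sq_ev2):
    """Coefficient e * g^2/(16 pi^2) * m_tau * m_nu^2 / m_W^4 in GeV^-1."""
    e = math.sqrt(4.0 * math.pi * ALPHA_EM)
    m_nu_sq_gev2 = m_nu_sq_ev2 * 1.0e-18      # eV^2 -> GeV^2
    loop_factor = e * G_WEAK**2 / (16.0 * math.pi**2)
    return loop_factor * M_TAU * m_nu_sq_gev2 / M_W**4


def branching_fraction(m_nu_sq_ev2=3.0e-3):
    """Rough branching fraction, assuming width = C^2 m_tau^3 / (4 pi)."""
    coefficient = operator_coefficient(m_nu_sq_ev2)
    width = coefficient**2 * M_TAU**3 / (4.0 * math.pi)
    return width / TAU_TOTAL_WIDTH


if __name__ == "__main__":
    print(branching_fraction())
```

The result is some fifty orders of magnitude below unity, driven entirely by the factor $(m_\nu^2/m_W^2)^2$.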
Figure 10: A Feynman diagram which gives rise to $\Delta m_K$ and $\epsilon_K$.
6.2
Neutral Kaon System
One of the most stringent constraints comes from the $K^0$-$\bar{K}^0$ mixing parameters $\Delta m_K$ and $\epsilon_K$. The main reason for the stringent constraints is that the scalar masses-squared in the MSSM Lagrangian Eq. (4.7) can violate flavor, i.e., the scalar masses-squared matrices are not necessarily diagonal in the basis where the corresponding quark mass matrices are diagonal. To simplify the discussion, let us concentrate only on the first and the second generations (ignore the third). We also go to the basis where the down-type Yukawa matrix $\lambda_d^{ij}$ is diagonal, such that
$$\lambda_d^{ij}v_d = \begin{pmatrix} m_d & 0 \\ 0 & m_s \end{pmatrix}. \qquad (6.5)$$
Therefore the states $K^0 = (d\bar{s})$, $\bar{K}^0 = (s\bar{d})$ are well-defined in this basis. In the same basis, however, the squark masses-squared can have off-diagonal elements in general,
$$m_Q^{2ij} = \begin{pmatrix} m_{\tilde{d}_L}^2 & m_{Q,12}^2 \\ m_{Q,12}^{2*} & m_{\tilde{s}_L}^2 \end{pmatrix}, \qquad m_D^{2ij} = \begin{pmatrix} m_{\tilde{d}_R}^2 & m_{D,12}^2 \\ m_{D,12}^{2*} & m_{\tilde{s}_R}^2 \end{pmatrix}. \qquad (6.6)$$
Since their off-diagonal elements will be required to be small (as we will see later), it is convenient to treat them as small perturbations. We insert the off-diagonal elements as two-point Feynman vertices which change the squark flavor $\tilde{d}_{L,R} \leftrightarrow \tilde{s}_{L,R}$ in the diagrams. To simplify the discussion further, we assume that all squarks and gluinos are comparable in their masses $\tilde{m}$. Then the relevant quantities are given in terms of the ratio $(\delta_{12}^d)_{LL} = m_{Q,12}^2/\tilde{m}^2$
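(and similarly $(\delta_{12}^d)_{RR} = m_{D,12}^2/\tilde{m}^2$), as depicted in Fig. 10. The operator from this Feynman diagram is estimated approximately as
$$0.005\,\alpha_s^2\,\frac{(\delta_{12}^d)_{LL}^2}{\tilde{m}^2}\,(\bar{d}_L\gamma^\mu s_L)(\bar{d}_L\gamma_\mu s_L). \qquad (6.7)$$
This operator is further sandwiched between $K^0$ and $\bar{K}^0$ states, and we find
$$\Delta m_K^2 \sim 0.005 f_K^2 m_K^2 \alpha_s^2 (\delta_{12}^d)_{LL}^2\frac{1}{\tilde{m}^2} = 1.2 \times 10^{-12}\ \mathrm{GeV}^2 \left(\frac{f_K}{160\ \mathrm{MeV}}\right)^2 \left(\frac{500\ \mathrm{GeV}}{\tilde{m}}\right)^2 \left(\frac{\alpha_s}{0.1}\right)^2 (\delta_{12}^d)_{LL}^2 < 3.5 \times 10^{-15}\ \mathrm{GeV}^2, \qquad (6.8)$$
where the last inequality is the phenomenological constraint in the absence of accidental cancellations. This requires
$$(\delta_{12}^d)_{LL} < 0.05 \left(\frac{160\ \mathrm{MeV}}{f_K}\right) \left(\frac{\tilde{m}}{500\ \mathrm{GeV}}\right) \left(\frac{0.1}{\alpha_s}\right), \qquad (6.9)$$
and hence the off-diagonal element $m_{Q,12}^2$ must be small. It turns out that the product $(\delta_{12}^d)_{LL}(\delta_{12}^d)_{RR}$ is more stringently constrained, especially its imaginary part from $\epsilon_K$. Much more careful and detailed analysis than the above order-of-magnitude estimate gives [35]
$$\mathrm{Re}\left[(\delta_{12}^d)_{LL}(\delta_{12}^d)_{RR}\right] < (1 \times 10^{-3})^2, \qquad \mathrm{Im}\left[(\delta_{12}^d)_{LL}(\delta_{12}^d)_{RR}\right] < (1 \times 10^{-4})^2. \qquad (6.10)$$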
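The order-of-magnitude estimate for the neutral kaon system can be turned into a bound on the flavor-violating ratio with a short calculation. The sketch below assumes the form $\Delta m_K^2 \sim 0.005 f_K^2 m_K^2 \alpha_s^2 \delta^2/\tilde{m}^2$ with reference values $f_K = 160$ MeV, $\alpha_s = 0.1$ and $\tilde{m} = 500$ GeV; these reference values, the kaon mass, and the function names are assumptions made for illustration.

```python
import math


def delta_m_k_squared(delta_ll, m_squark_gev=500.0, f_k_gev=0.160,
                      m_k_gev=0.498, alpha_s=0.1):
    """Estimate of Delta m_K^2 in GeV^2 from the gluino box with two insertions."""
    return (0.005 * f_k_gev**2 * m_k_gev**2 * alpha_s**2
            * delta_ll**2 / m_squark_gev**2)


def max_delta_ll(bound_gev2=3.5e-15, **kwargs):
    """Largest (delta_12)_LL compatible with an upper bound on Delta m_K^2."""
    return math.sqrt(bound_gev2 / delta_m_k_squared(1.0, **kwargs))


if __name__ == "__main__":
    print("Delta m_K^2 at delta = 1:", delta_m_k_squared(1.0), "GeV^2")
    print("bound at 500 GeV:", max_delta_ll())
    print("bound at 1 TeV:", max_delta_ll(m_squark_gev=1000.0))
```

The bound scales linearly with the common superparticle mass, so raising the squark and gluino masses relaxes it only slowly.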
This and other similar limits are summarized in Table 4.
6.3
$\mu \to e\gamma$
Another important example is $\mu \to e\gamma$. The diagram is given in Figure 11. Using the mass insertion, the effective operator is given approximately by
[...] (6.11)
where we took the superparticle masses in the loop to be comparable, of order $m_{\rm SUSY}$. A more detailed calculation gives the constraint [36]
[...] (6.12)
This is also a very stringent constraint on the flavor mixing in the scalar masses-squared matrix.
Table 4: Limits on $\mathrm{Re}(\delta_{ij})_{AB}(\delta_{ij})_{CD}$, with $A,B,C,D = (L,R)$, for an average squark mass $m_{\tilde{q}} = 500$ GeV and for different values of $x = m_{\tilde{g}}^2/m_{\tilde{q}}^2$. Taken from [35]. Columns: NO QCD, VIA; LO, VIA; LO, Lattice $B_i$; NLO, Lattice $B_i$.
x/l«)LI
X
0.3 1.0 4.0 X
0.3 1.0 4.0
1.6 x 10" 2 3.4 x 10- 2 8.0 x 10" 2
1.4 x 10" 2 3.0 x 10" 2 7.0 x 10" 2 3.1 x 10" 3 3.4 x 10" 3 4.9 x 10" 3
sIWi2)U
X
0.3 1.0 4.0
5.5 x 10"3 3.1 x lO- 3 3.7 x 10- 3
2.3 x 10"3 2.5 x 10"3 3.5 x 10-3 >/
H(tf 2 )Ll
2.2 x 10" 2 4.6 x l O ' 2 1.1 x 10" 1
2.2 x 10" 2 4.6 x 10~ 2 1.1 x 10" 1
(K^WI » 3m2)RL\)
2.6 x lO" 3 2.8 x lO" 3 3.9 x lO" 3
2.8 x lO" 3.1 x lO" 3 4.4 x lO- 3
((Sd12)LR = (5d12)RL)
3.3 x lO" 3 2.7 x lO" 3 2.8 x lO" 3
2.2 x lO" 3 5.5 x 10" 3 3.8 x lO" 3
1.0 x lO" 3 1.1 x lO" 3 1.6 x 10" 3
1.0 x lO" 3 1.2 x lO" 3 1.6 x lO" 3
X
1.7 x lO" 3 2.8 x 10- 2 3.5 x lO" 3
RR\
0.3 1.0 4.0
1.8 X 10"3 2.0 x 10-3 2.8 x 10-3
8.6 x 10" 4 9.6 x 10" 4 1.3 x lO" 3
Figure 11: The Feynman diagram for $\mu \to e\gamma$ from the slepton loop.
6.4
What do we do?
There are various ways to avoid such low-energy constraints on supersymmetry. The first one is called "universality" of soft parameters [37]. It is simply assumed that the scalar masses-squared matrices are proportional to identity matrices, i.e., $m_Q^2, m_U^2, m_D^2 \propto \mathbf{1}$. Then no matter what rotation is made in order to go to the basis where the quark masses are diagonal, the identity matrices stay the same, and hence the off-diagonal elements are never produced. There have been many proposals to generate universal scalar masses either by the mediation mechanism of the supersymmetry breaking such as gauge mediated (see reviews [38]), anomaly mediated [39], or gaugino mediated [40] supersymmetry breaking, or by non-Abelian flavor symmetries [41]. The second possibility is called "alignment," where certain flavor symmetries should be responsible for "aligning" the quark and squark mass matrices such that the squark masses are almost diagonal in the same basis where the down-quark masses are diagonal [42]. Because of the CKM matrix it is impossible to do this both for down-quark and up-quark masses. Since the phenomenological constraints in the up-quark sector are much weaker than in the down-quark sector, this choice would alleviate many of the low-energy constraints (except for flavor-diagonal CP-violation such as EDMs). Finally there is a possibility called "decoupling," which assumes first- and second-generation superpartners much heavier than a TeV while keeping the third-generation superpartners as well as gauginos in the 100 GeV range to keep the Higgs self-energy small enough [43]. Even though this idea suffers from a fine-tuning problem in general [44], many models have been constructed recently to achieve such a split mass spectrum [45]. In short, the low-energy constraints are indeed very stringent, but there are many ideas to avoid such constraints naturally within certain model frameworks.
Especially given the fact that we still do not know any of the superparticle masses experimentally, one cannot make the discussions more clear-cut at this stage. On the other hand, important low-energy effects of supersymmetry are still being discovered in the literature, such as muon g-2 [46, 47], and direct CP-violation [48, 49, 50, 51, 52]. There may be even more possible low-energy manifestations of supersymmetry which have been missed so far.
7
Models of Supersymmetry Breaking
One of the most important questions in supersymmetry phenomenology is how supersymmetry is broken and how the particles in the MSSM learn the effect of supersymmetry breaking. The first one is the issue of dynamical supersymmetry breaking, and the second one is the issue of the "mediation" mechanism. Especially in the discussions about flavor physics in supersymmetry, the issue of supersymmetry breaking is unavoidable. Depending on what mechanism you employ, you arrive at completely different results. This is actually good news. If supersymmetry is found, studying its flavor signatures would tell us a great deal about the origin of supersymmetry breaking as well as the origin of flavor.
7.1
Minimal Supergravity
One of the earliest ideas to break supersymmetry was due to [53]. To a supersymmetric Lagrangian, these authors added a universal (the same) mass to all scalars in chiral multiplets in the theory. They made this assumption to avoid the constraints from flavor-changing processes such as those discussed in the previous section. This assumption was later elevated to the so-called "minimal supergravity" scenario [54] where one assumes (see Eqs. (4.7,4.8))
$$(m_Q^2)_{ij} = (m_U^2)_{ij} = (m_D^2)_{ij} = (m_L^2)_{ij} = (m_E^2)_{ij} = m_0^2\delta_{ij}, \qquad (7.1)$$
$$m_{H_u}^2 = m_{H_d}^2 = m_0^2, \qquad (7.2)$$
$$(A_u)_{ij} = (A_d)_{ij} = (A_e)_{ij} = A_0, \qquad (7.3)$$
$$M_1 = M_2 = M_3 = M_{1/2}, \qquad (7.4)$$
all at the "GUT-scale" $\simeq 2 \times 10^{16}$ GeV.$^{14}$ The trilinear couplings are universal in the sense that $(A_u)_{ij}(\lambda_u)_{ij} = A_0(\lambda_u)_{ij}$ etc. There are only five additional parameters in this framework at the GUT-scale:
$$(m_0, A_0, M_{1/2}, \mu, B). \qquad (7.5)$$
By running these parameters down to the electroweak scale, and in particular calculating $m_{H_u}^2$ and $m_{H_d}^2$, we can calculate $m_Z$ and $\tan\beta$ using Eqs. (4.15,4.16). In other words, we can eliminate $\mu$ and $B$ in favor of $m_Z$ and $\tan\beta$, up to a sign ambiguity in $\mu$ in Eq. (4.15). Therefore the commonly accepted parameter set is
$$(m_0, M_{1/2}, A_0, \tan\beta, \mathrm{sign}(\mu)). \qquad (7.6)$$
$^{14}$ The papers [54] actually did not distinguish the "GUT-scale" from the reduced Planck scale $2 \times 10^{18}$ GeV. This distinction became an issue only after the LEP/SLC measurements of the gauge coupling constants in the 90's.
The relation among soft parameters above is definitely a strong assumption, but it makes the number of parameters small and tractable. Indeed, most of the papers on supersymmetry until the mid-90's, both theoretical and experimental, assumed this scenario. It is remarkable, however, that such a simple (and strong) assumption leads to viable phenomenology. The flavor-changing constraints are mostly avoided (except $b \to s\gamma$ as we will discuss later). The lightest supersymmetric particle is almost always a neutralino (mostly a bino) which turns out to have cosmologically interesting abundance. Radiative electroweak symmetry breaking works beautifully within this framework. Phenomenology has been worked out in great detail and clearly this framework is viable, even though the direct search limits from LEP and Tevatron, indirect limits from $b \to s\gamma$, and the limit on the MSSM Higgs boson from LEP already constrain the model. I would say that roughly "a little more than a half" of the parameter space has been excluded already. A natural question is if this scenario is "reasonable." The answer is yes and no. The reason why this is called the supergravity scenario is that it can certainly be realized in the N = 1 supergravity theory. In the so-called Polonyi-type models, one can easily break supersymmetry within supergravity by assuming an explicit mass scale in the superpotential together with a fine-tuning to keep the cosmological constant vanishing. We will discuss them in the Gravity Mediation section. The problem is that there is no principle to guarantee the universality of scalar masses, gaugino masses, and trilinear couplings. This is achieved basically by fine-tuning of parameters. We will make this point more explicit later. The amount of flavor signature one obtains therefore depends on how much one sticks to the universality. It is useful to ask how a small modification of the minimal supergravity, even a flavor-blind one, affects phenomenology.
For instance, I mentioned that the minimal supergravity leads to a bino-like LSP. Consider one additional parameter, namely the Fayet-Iliopoulos $D$-term for $U(1)_Y$, which changes all of the scalar masses according to their hypercharges:
$m_i^2 \to m_i^2 + Y_i D_Y$.
(7.7)
In this "Less Minimal" model, there are portions of the parameter space where a higgsino-like neutralino or sneutrino becomes the LSP, which changes the phenomenology drastically [12]. The fact that such a small modification (one additional parameter) can lead to a big change in phenomenology tells us that simplifying assumptions such as minimal supergravity must be used with caution.
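As a rough illustration of Eq. (7.7), the following sketch applies the shift $m_i^2 \to m_i^2 + Y_i D_Y$ to a universal scalar mass. The hypercharges are the standard MSSM assignments in the convention $Q = T_3 + Y$; the universal mass of 300 GeV and the value of $D_Y$ are made-up inputs chosen only to show the direction of the effect, not numbers from [12].

```python
# Hypothetical illustration of Eq. (7.7): m_i^2 -> m_i^2 + Y_i * D_Y.
# Hypercharges follow the convention Q = T3 + Y.
HYPERCHARGE = {
    "Q_L": 1 / 6,
    "u_R^c": -2 / 3,
    "d_R^c": 1 / 3,
    "L_L": -1 / 2,
    "e_R^c": 1.0,
    "H_u": 1 / 2,
    "H_d": -1 / 2,
}


def shifted_masses_squared(m0_squared, d_y):
    """Return the scalar masses-squared after the hypercharge D-term shift."""
    return {name: m0_squared + y * d_y for name, y in HYPERCHARGE.items()}


# Made-up inputs in GeV^2: universal m0 = 300 GeV and D_Y = (200 GeV)^2.
shifted = shifted_masses_squared(m0_squared=300.0**2, d_y=200.0**2)
for name, value in shifted.items():
    print(f"{name:6s} m^2 = {value:9.1f} GeV^2")
```

With a positive $D_Y$ the slepton doublet (which contains the sneutrino) is pushed down while the right-handed selectron is pushed up, which is the kind of reshuffling that can change the identity of the LSP.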
7.2
Dynamical Supersymmetry Breaking
Classic examples of supersymmetry breaking were based either on an O'Raifeartaigh-type superpotential or a Fayet-Iliopoulos D-term (see, e.g., [55]). Both of them had explicit mass scales built into the Lagrangian by hand, and do not explain the hierarchy, i.e., why the supersymmetry breaking scale is much lower than the Planck scale. The non-renormalization theorem in supersymmetric field theories makes it impossible to break supersymmetry at higher orders in perturbation theory if it is not broken already at the tree level. This point, however, makes it hopeful that supersymmetry is broken only non-perturbatively by dimensional transmutation, so that the scale of supersymmetry breaking is exponentially suppressed relative to the Planck scale (see, e.g., [56]). Work on the problem of supersymmetry breaking has made dramatic progress in the past few years thanks to works on the dynamics of supersymmetric gauge theories by Seiberg [13]. We will briefly review the progress below.
The original idea by Witten [5] was that dynamical supersymmetry breaking is ideal to explain the hierarchy. Because of the non-renormalization theorem, if supersymmetry is unbroken at the tree level, it remains unbroken at all orders in perturbation theory. However, there may be non-perturbative effects suppressed by $e^{-8\pi^2/g^2}$ that could break supersymmetry. Then the energy scale of the supersymmetry breaking can be naturally suppressed exponentially compared to the energy scale of the fundamental theory (string?). Even though this idea attracted a lot of interest (see footnote 15), the model building was hindered by the lack of understanding of the dynamics of supersymmetric gauge theories. Only relatively few models were convincingly shown to break supersymmetry dynamically, such as the SU(5) model with two pairs [57] of $5^* + 10$ and the 3-2 model [58]. After Seiberg's works, however, there has been an explosion in the number of models which break supersymmetry dynamically (see a review [59] and references therein). For instance, some of the models which were claimed to break supersymmetry dynamically, such as SU(5) with one pair [60] of $5^* + 10$ or SO(10) with one spinor 16 [61], are actually strongly coupled and could not be analyzed reliably (called "non-calculable"), but new techniques allowed us to analyze these strongly coupled models reliably [62]. Unexpected vector-like models were also found [63] which proved to be useful for model building.
In many of these models, direct renormalizable interactions between the sector that breaks supersymmetry dynamically and the supersymmetric standard model are not possible simply due to gauge invariance. For instance, the lowest dimension operators in the SO(10) model with one spinor 16 are the gauge kinetic term (dimension 4) and a superpotential $16^4$ (dimension 5). On the other hand, the lowest dimension operator in the supersymmetric standard model is the $\mu$-term $H_u H_d$ (dimension 3). Therefore, the lowest dimension operators that couple these two sectors are of dimension 7, and the couplings are necessarily suppressed by at least three powers of the energy scale, possibly the Planck scale. This simple observation makes the existence of a sector with only Planck-scale-suppressed couplings to us not so surprising. Whether it leads to a phenomenologically acceptable spectrum of superparticles is an issue of the "mediation."
15 I didn't live through this era, so this is just a guess.
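To get a feeling for how effective the exponential suppression $e^{-8\pi^2/g^2}$ is, here is a minimal numerical sketch. It assumes the generated scale is simply the reduced Planck scale (taken as $2.4 \times 10^{18}$ GeV) times this factor, ignoring any model-dependent coefficient in the exponent; the function names are hypothetical.

```python
import math

M_PLANCK_GEV = 2.4e18  # reduced Planck scale, used here as the fundamental scale


def suppressed_scale(g_squared, fundamental_scale=M_PLANCK_GEV):
    """Scale generated by an effect of relative size exp(-8 pi^2 / g^2)."""
    return fundamental_scale * math.exp(-8 * math.pi**2 / g_squared)


def coupling_for_scale(target_scale, fundamental_scale=M_PLANCK_GEV):
    """Invert the relation above: the g^2 needed to reach target_scale."""
    return 8 * math.pi**2 / math.log(fundamental_scale / target_scale)


# Coupling needed for a breaking scale of 1e10 GeV (illustrative target).
g2 = coupling_for_scale(1e10)
print(f"g^2 = {g2:.2f}, alpha = g^2/(4 pi) = {g2 / (4 * math.pi):.2f}")
```

Under these assumptions a moderately strong coupling at the fundamental scale is enough to generate a hierarchy of eight orders of magnitude, without any small parameter put in by hand.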
7.3
Mediation Mechanisms
There has also been an explosion in the number of mediation mechanisms proposed in the literature. The oldest mechanism is that in supergravity theories where interactions suppressed by the Planck scale are responsible for communicating the effects of supersymmetry breaking to the particles in the MSSM. For instance, see a review [55]. Even though gravity itself may not be the only effect for the mediation, and there could be many operators suppressed by the Planck-scale responsible for the mediation, nonetheless this mechanism was sometimes called "gravity-mediation." The good thing about this mechanism is that this is almost always there. However we basically do not have any control over the Planck-scale physics and the resulting scalar masses-squared are in general highly non-universal. In this situation, the best idea is probably to constrain the scalar masses-squared matrix to be proportional to the identity matrix by non-Abelian flavor symmetries [41]. Models of this type have been constructed where the breaking patterns of
the flavor symmetry naturally explain the hierarchical quark and lepton mass matrices, while protecting the squark masses-squared matrices from deviating too far from the identity matrices.
7.3.1
Gravity Mediation
A supergravity theory is characterized by two quantities. One is the Kähler density of mass dimension two and the other is the superpotential of mass dimension three, similar to the case of global supersymmetry. The Kähler density $\mathcal{K}$ is a real function of both the chiral superfields $\phi^i$ and their complex conjugates $\phi_i^*$, while the superpotential $W$ is a holomorphic function of the chiral superfields $\phi^i$. For the discussions below, we use the system of units where the reduced Planck scale is $M_{Pl}/\sqrt{8\pi} = 1$. For a given Kähler density, which is the fundamental input in the Lagrangian, one defines a derived quantity, the Kähler potential, $K = -3 \ln(1 - \mathcal{K})$. Then the scalar potential is given as
$V = e^K \left[ (K^i W^* + W^{*i}) (K^{-1})_i^j (K_j W + W_j) - 3 |W|^2 \right]$,
(7.8)
where $W_j = \partial W / \partial \phi^j$, $W^{*i} = (W_i)^*$, $K_j = \partial K / \partial \phi^j$, $K^i = (K_i)^*$, and $(K^{-1})_i^j$ is the inverse matrix of $K^i_j = \partial^2 K / \partial \phi_i^* \partial \phi^j$. On the other hand, the scalar field kinetic term is given by
$\mathcal{L}_K = K^i_j \, \partial_\mu \phi_i^* \, \partial^\mu \phi^j$.
(7.9)
The minimal supergravity is defined by the choice $K = \phi_i^* \phi^i$. This choice guarantees the canonical kinetic term for the scalar fields. However, this is a rather odd choice from the point of view of the original Kähler density because it corresponds to a specific form $\mathcal{K} = 1 - e^{-\phi_i^* \phi^i / 3}$. There is no theoretical reasoning behind this choice except for the convenience of getting canonical kinetic terms. In particular, $\mathcal{K}$ involves interactions among the chiral superfields suppressed by the Planck scale, because supergravity is an effective theory valid below the Planck scale and hence allows higher-dimension operators suppressed by the Planck scale. Therefore, one can always add more terms to the Kähler density suppressed by the Planck scale, such as $\phi_i^* \phi_j^* \phi^k \phi^l$ (suppressed by two powers of the Planck scale). The Polonyi chiral superfield, $z$, typically acquires a Planck-scale expectation value as well as a supersymmetry breaking F-component expectation value. The original Polonyi model is given by
$W = \mu^2 (z + 2 - \sqrt{3})$.
(7.10)
Figure 12: Structure of gravity mediation models. [Diagram labels: $\mu \sim 10^{18}$ GeV, $\mu \sim 10^{10}$ GeV, Supergravity, Hidden Sector, Supersymmetric Standard Model, $\mu \sim 10^2$-$10^3$ GeV.]
The minimum is at
$z = -1 + \sqrt{3} + \theta^2 \sqrt{3} \mu^2$.
(7.11)
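The location of the Polonyi minimum can be checked numerically. The sketch below is only an illustration under stated assumptions: it specializes the scalar potential (7.8) to a single field with the minimal Kähler potential $K = z^* z$, restricts $z$ to the real axis, works in reduced Planck units, and factors out $\mu^4$. The function name and the grid scan are my own choices, not part of the original model.

```python
import math

BETA = 2 - math.sqrt(3)  # constant in W = mu^2 (z + 2 - sqrt(3)), Eq. (7.10)


def polonyi_potential(z):
    """V / mu^4 for real z with K = z^* z, in reduced Planck units.

    One-field case of Eq. (7.8): V = e^K (|W_z + K_z W|^2 - 3 |W|^2).
    """
    w = z + BETA                # W / mu^2
    covariant = 1.0 + z * w     # (W_z + K_z W) / mu^2, with K_z = z for real z
    return math.exp(z * z) * (covariant**2 - 3.0 * w**2)


# Crude grid scan for the minimum on the real axis, step 1e-4 on [-1, 2].
GRID = [-1.0 + 1e-4 * n for n in range(30001)]
z_min = min(GRID, key=polonyi_potential)
v_min = polonyi_potential(z_min)
print(f"z_min = {z_min:.4f}, sqrt(3) - 1 = {math.sqrt(3) - 1:.4f}, V_min = {v_min:.2e}")
```

The scan lands on $z \simeq \sqrt{3} - 1$ with a vanishing potential, which is the fine-tuning of the cosmological constant mentioned earlier: the constant $2 - \sqrt{3}$ in the superpotential is chosen precisely so that $V = 0$ at the minimum.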
By expanding the scalar potential with the minimal supergravity Kähler potential, one indeed finds universal scalar masses and universal trilinear couplings. Universal gaugino masses can be obtained if one couples the Polonyi field to the gauge multiplets as
$\int d^2\theta \, c \, z \, W_\alpha W^\alpha$
(7.12)
with an $O(1)$ coefficient $c$ the same for all gauge multiplets. The soft parameters arise at the order of magnitude $\mu^2/M_{Pl}$ and hence we take $\mu \sim 10^{10}$ GeV to obtain electroweak-scale supersymmetry breaking. The problem with minimal supergravity, as we hope is clear from the above brief discussion, is that it is based on too many assumptions. For example, one can write terms such as
$\int d^4\theta \, z^* z \, \phi_i^* \phi^i$.
(7.13)
This term gives rise to additional contributions to the scalar masses due to the F-component of the Polonyi field $z$. But there is no reason why such a term should come with the same coefficient for all chiral multiplets. If the coefficients are different, the universal scalar mass hypothesis is completely destroyed. One can even have a flavor-off-diagonal scalar mass squared if $\phi^i$ and [...]
$\int d^2\theta \, z \ldots$
(7.14)
or in the Kähler potential
$\int d^4\theta \, z \ldots$
(7.15)
Finally, the gaugino masses become non-universal if the coefficients $c$ in Eq. (7.12) are not the same. Not only is universality not guaranteed in supergravity, the presence of an explicit energy scale $\mu$ also poses a problem. Supergravity does not explain the origin of the hierarchy $m_W \sim \mu^2/M_{Pl}$.
7.3.2
Gauge Mediation
A beautiful idea to guarantee the universal scalar masses is to use the MSSM gauge interactions for the mediation. Then the supersymmetry breaking effects are mediated to the particles in the MSSM in such a way that they do not distinguish particles in different generations ("flavor-blind") because they
only depend on the gauge quantum numbers of the particles. Such a model was regarded as difficult to construct in the past [58]. However, a breakthrough was made by Dine, Nelson and collaborators [65], who started constructing models where the MSSM gauge interactions could indeed mediate the supersymmetry breaking effects, inducing positive scalar masses-squared and large enough gaugino masses (which used to be one of the most difficult things to achieve) [64]. The original models had three independent sectors: one for supersymmetry breaking, one (the messenger sector) for mediation alone, and finally the MSSM. The messenger sector is essentially a vector-like pair of $5$ and $5^*$ under the SU(5) GUT gauge group, or in other words
$N \times \left[ D(3^*, 1, \tfrac{1}{3}) + \bar{D}(3, 1, -\tfrac{1}{3}), \; L(1, 2, -\tfrac{1}{2}) + \bar{L}(1, 2, \tfrac{1}{2}) \right]$.
(7.16)
$N$ is the number of messengers (see footnote 16). Due to a (weak) gauge interaction between the dynamical supersymmetry breaking sector and the messenger sector, a supersymmetric mass term $M$ and a supersymmetry breaking $B$-type mass term $F$ are induced for the messenger fields. The messenger fermions therefore have mass $M$, while the messenger scalars have a mass-squared matrix with diagonal entries $M^2$ and off-diagonal entries $F$.
The scalar mass spectrum is therefore $\sqrt{M^2 \pm F}$, and the mismatch between the fermion and scalar mass spectra breaks supersymmetry. Supersymmetry breaking effects in the supersymmetric standard model arise from the loops of the messenger particles via standard model gauge interactions. The superparticle spectrum can be predicted in these models in terms of the mass of the messengers $M$, the amount of supersymmetry breaking $F$ in the messenger sector, and the number of messengers. In particular, one finds the following gaugino and scalar soft masses,
$M_i = N \frac{g_i^2}{16\pi^2} \frac{F}{M}, \qquad m_k^2 = 2N \sum_i C_i^k \left( \frac{g_i^2}{16\pi^2} \right)^2 \left( \frac{F}{M} \right)^2$.
(7.18)
Here, $C_i^k$ is the second order Casimir $T^a T^a$ for the gauge group $i$ and the particle species $k$.
16 In order to preserve gauge coupling unification together with extra fields in the messenger sector, it is usually imposed that the messenger fields come in complete SU(5) multiplets. Other possibilities are $10 + 10^*$ etc.
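The formulas in Eq. (7.18) are simple enough to evaluate directly. The sketch below is an illustration only: the inputs $\alpha_3 = 0.1$ and $F/M = 10^5$ GeV are assumed round numbers rather than values from the text, only the SU(3) contribution to the squark mass is kept, and the Casimir $C = 4/3$ is the standard value for a color triplet.

```python
import math


def gaugino_mass(n_mess, g_squared, f_over_m):
    """M_i = N * g_i^2 / (16 pi^2) * F / M, as in Eq. (7.18)."""
    return n_mess * g_squared / (16 * math.pi**2) * f_over_m


def scalar_mass(n_mess, casimirs, g_squared, f_over_m):
    """m_k = sqrt(2 N sum_i C_i^k (g_i^2 / 16 pi^2)^2) * F / M."""
    total = sum(c * (g2 / (16 * math.pi**2)) ** 2
                for c, g2 in zip(casimirs, g_squared))
    return math.sqrt(2 * n_mess * total) * f_over_m


# Illustrative inputs (assumptions, not values from the text).
ALPHA_3 = 0.1                        # strong coupling, alpha = g^2 / (4 pi)
G3_SQUARED = 4 * math.pi * ALPHA_3
F_OVER_M = 1.0e5                     # GeV

gluino = gaugino_mass(1, G3_SQUARED, F_OVER_M)
# Squark mass keeping only the SU(3) term, with C = 4/3 for a color triplet.
squark = scalar_mass(1, [4 / 3], [G3_SQUARED], F_OVER_M)
print(f"gluino = {gluino:.0f} GeV, squark = {squark:.0f} GeV")
```

With a single gauge group and $N = 1$ the ratio of scalar to gaugino mass is fixed at $\sqrt{2C}$, independent of $F/M$ and of the coupling, which is one way to see why the spectrum in these models is so predictive.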
Figure 13: Structure of gauge mediation models [65]. [Diagram labels: $\mu \sim 10^7$ GeV, Dynamical Supersymmetry Breaking, messenger U(1), $\mu \sim 10^5$ GeV, Messenger Sector, SU(3)xSU(2)xU(1), Supersymmetric Standard Model, $\mu \sim 10^2$-$10^3$ GeV.]
Figure 14: Structure of direct gauge mediation models [66]. [Diagram labels: $\mu \sim 10^{15}$ GeV?, Dynamical Supersymmetry Breaking, SU(3)xSU(2)xU(1), $\mu \sim 10^2$-$10^3$ GeV, Supersymmetric Standard Model.]
Later models eliminated the messenger sector entirely, coupling the dynamical supersymmetry breaking sector directly to the supersymmetric standard model [66] (see also reviews [38]). The energy scale of the dynamical supersymmetry breaking is model-dependent. The main virtue of the gauge mediation models is that the scalar masses come out universal for all three generations simply because the gauge interactions, responsible for generating the scalar masses, do not distinguish different generations. Therefore the flavor effects in these models are virtually absent. It is important to note, however, that this virtue is based on a simple but strong assumption. The flavor physics that distinguishes different generations should occur at an energy scale higher than the mediation scale. Otherwise, the generated universal scalar masses would undergo flavor-dependent interactions at the flavor scale and the scalar masses would likely be highly non-universal at lower energies.
7.3.3
Anomaly Mediation
Models where the sector of dynamical supersymmetry breaking couples to the MSSM fields only by Planck-scale suppressed interactions still had difficulty in generating large enough gaugino masses [64]. One could get around this problem by a clever choice of the quantum numbers for a gauge singlet field [67]. On the other hand, it was pointed out only recently that gaugino masses are generated by the superconformal anomaly [39]. This observation was confirmed and further generalized by other groups [68]. Randall and Sundrum further realized that one could even have scalar masses entirely from the superconformal anomaly if the sector of dynamical supersymmetry breaking and the MSSM particles are physically separated in extra dimensions. The supersymmetry breaking parameters are then given by
$M_i = -\frac{\beta_i(g^2)}{2 g_i^2} \frac{F}{M_{Pl}}$,
$m_i^2 = -\frac{\dot{\gamma}_i}{4} \frac{|F|^2}{M_{Pl}^2}$,
$A_{ijk} = -\frac{1}{2} (\gamma_i + \gamma_j + \gamma_k) \frac{F}{M_{Pl}}$.
(7.19)
The consequence was striking: the soft parameters were determined solely by the low-energy theory and did not depend on the physics at high energy scales at all. This makes it attractive as a solution to the problem of flavor-changing neutral currents. The mediation scale is at the Planck scale, and the flavor physics scale is likely to be lower. However, unlike the gauge mediation or generic supergravity cases, the complicated flavor physics completely decouples from the supersymmetry breaking parameters below its energy scale. Anomaly mediation initially suffered from the problem that some of the scalars had negative mass-squared. Later, simple fixes were proposed [69]. All of these proposals, however, spoiled the virtue of anomaly mediation, namely ultraviolet insensitivity. Recently a way to preserve the ultraviolet insensitivity and to construct realistic models has been proposed [70], using D-terms for $U(1)_Y$ and $U(1)_{B-L}$. Therefore anomaly mediation is a successful mechanism to suppress flavor-changing effects. On the other hand, it also means that it eliminates possible interesting flavor signatures of supersymmetry.
Figure 15: Structure of anomaly mediation models [39].
7.3.4
Gaugino Mediation
Finally the idea called "gaugino mediation" came out [40]. This idea employs an extra dimension where the gauge fields propagate in the bulk. Supersymmetry is broken on a different brane and the MSSM fields learn about the supersymmetry breaking effects through the MSSM gauge interactions. This solves the flavor-changing problem in the same way as gauge mediation. The spectrum generated by this mechanism is such that the gaugino masses are proportional to the gauge couplings, $M_i \propto g_i^2$, while the scalar masses vanish at the mediation scale (the compactification scale in this context). To avoid a cosmological problem of charged dark matter, the slepton (especially the right-handed stau) masses need to be pushed by the renormalization group evolution above the lightest neutralino (the bino in this case). This sets a lower bound on the mediation scale in excess of $10^{16}$ GeV. On the other hand, flavor physics below this scale would induce non-universality again. Therefore the flavor physics needs to be put in a small window above the compactification scale and below the Planck scale. There is a way out [70], however, from this constraint, if you use the shining mechanism [71] to generate flavor breaking without $O(1)$ flavor violation on our brane.
Figure 16: Structure of gaugino mediation models [40].
8
Models of Flavor
The flavor signatures of supersymmetry depend not only on the supersymmetry breaking/mediation mechanisms but also on the possible sources of flavor violation. This in turn means that the result depends on the origin of flavor. I will make this statement more explicit in the discussions below. One basic question here is how we understand the structure of fermion masses and mixings. There are at least two popular approaches to this question, which can well be mutually compatible. One is grand unification, and the other is approximate flavor symmetry.
8.1
Grand Unification
Consider the simplest unified group, SU(5). It unifies quarks and leptons into two irreducible multiplets, $5^* \ni (d_R)^c, l_L$ and $10 \ni (u_R)^c, Q_L, (l_R)^c$. This immediately gives us hope to understand the relative magnitudes of quark and lepton masses. Indeed, the simplest SU(5) models with the standard model Higgs doublets embedded into $5 + 5^*$ of SU(5) lead to the prediction that $h_e = h_d$, $h_\mu = h_s$, and $h_\tau = h_b$ at the GUT-scale. As shown in Section 5.2, this relation works phenomenologically for somewhat large $\tan\beta$ due to the renormalization group evolution of the Yukawa couplings between the GUT-scale and on-shell. However, the relation is quite bad for the first and second generations. Georgi and Jarlskog [72] suggested a modified relation $h_e = h_d/N_c$, $h_\mu = h_s N_c$, where $N_c = 3$ is the number of colors. Phenomenologically, these relations are quite successful. The way they implemented these relations in the SU(5) GUT is somewhat technical, using the embedding of the down-type Higgs doublet into $45^*$ rather than $5^*$. Similarly, the minimal version of the SO(10) models predicts a stronger relation $h_t = h_b = h_\tau = h_{\nu_3}$ at the GUT-scale. This relation may work for large $\tan\beta$, while it fails badly for the first and second generations.
One important effect of grand unified theories on the scalar masses is their running above the GUT-scale [73]. Suppose we assume universal scalar masses at the Planck scale (minimal supergravity?) as a conservative assumption for possible flavor violations in supersymmetry. The point is that the RGE running above the GUT-scale introduces a sizable and interesting flavor violation in the soft parameters. For instance, consider an SO(10) GUT model (see footnote 17). All particles in a generation are unified in a 16 multiplet of SO(10), including the right-handed neutrinos. We, however, have to be careful about the basis. Because of the Kobayashi-Maskawa matrix, the up-type particles and down-type particles in a single GUT-multiplet cannot be simultaneously in the mass eigenstates. If we say one 16 is in the basis where the top Yukawa coupling is diagonal, the other components in the same multiplet, $b'$, $\tau'$, and $\nu'$, are not in their mass eigenstates ($\nu'$ can be, though, if the large mixing angle in the atmospheric neutrino oscillation arises from a rotation among the charged leptons). Let us focus on the $b'_L = V_{tb} b_L + V_{ts} s_L + V_{td} d_L$ component for the purpose of this discussion. The effect of the top Yukawa coupling above the GUT-scale suppresses the scalar mass of this multiplet relative to the first- and second-generation multiplets. The mass matrix for up-type squarks then reads as
$\begin{pmatrix} m^2 & & \\ & m^2 & \\ & & m^2 - \Delta \end{pmatrix}$,
(8.1)
where $\Delta$ is the effect of the top Yukawa coupling. This mass matrix is diagonal in the basis where the top Yukawa coupling is diagonal. Similarly, the mass matrix for down-type squarks would be the same except that the above matrix is defined in the basis where the top Yukawa coupling is diagonal. By performing the KM rotation to go to the basis where down-type Yukawa
couplings are diagonal, we find
$m^2_{\tilde{d}_L} = V_{KM} \, m^2_{\tilde{u}_L} \, V_{KM}^\dagger = m^2 \mathbf{1} - \Delta \begin{pmatrix} |V_{td}|^2 & V_{td} V_{ts}^* & V_{td} V_{tb}^* \\ V_{ts} V_{td}^* & |V_{ts}|^2 & V_{ts} V_{tb}^* \\ V_{tb} V_{td}^* & V_{tb} V_{ts}^* & |V_{tb}|^2 \end{pmatrix}$,
(8.2)
which is non-diagonal and hence violates flavor. One can regard the consequences of the off-diagonal elements derived in this fashion as a "conservative" estimate of the flavor-changing effects in supersymmetric unified theories. However, there is considerable model dependence in the size of the RGE effects, which depend on the various beta functions above the GUT-scale. More importantly, it is assumed that the supersymmetry breaking is induced by gravity mediation (Section 7.3.1). Therefore the size of the flavor violation estimated in this fashion is not guaranteed. However, grand unification is one of the major motivations for supersymmetry anyway and it is quite reasonable to discuss flavor violation within this context.
One recent addition to this type of effect is the right-handed neutrinos [74]. Given the recent strong evidence for oscillation in atmospheric neutrinos, it is quite likely that there are right-handed neutrinos around $10^{15}$ GeV generating small neutrino masses of order 0.05 eV with $O(1)$ Yukawa couplings. If so, the running of the slepton masses is affected between the GUT-scale and the right-handed neutrino masses. Especially given that the mixing angle between $\mu$ and $\tau$ is large, it can give rise to a large flavor violation. Similarly, the solar neutrino data, if explained in terms of oscillation of $\nu_e$, can be linked to flavor violation as well.
17 In order to have quark mixing, we need at least two Higgs multiplets 10.
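The structure of Eq. (8.2) can be made concrete with a small numerical sketch. Everything numerical here is an assumption for illustration: the third-row KM elements are taken real with rough magnitudes $|V_{td}| \sim 0.01$, $|V_{ts}| \sim 0.04$, $|V_{tb}| \sim 1$, the overall scale is set to $m^2 = 1$, and $\Delta = 0.3$ is an arbitrary choice, since its actual size depends on the running above the GUT-scale.

```python
# Rough third-row KM elements (V_td, V_ts, V_tb), taken real for simplicity.
V_T = [0.01, 0.04, 1.0]


def down_squark_mass_matrix(m2, delta, v_t=V_T):
    """Return m^2 * 1 - delta * (V_ti V_tj^*) as a 3x3 nested list, cf. Eq. (8.2)."""
    return [[(m2 if i == j else 0.0) - delta * v_t[i] * v_t[j]
             for j in range(3)] for i in range(3)]


# Arbitrary units: m^2 = 1 and an assumed suppression delta = 0.3.
matrix = down_squark_mass_matrix(m2=1.0, delta=0.3)
for row in matrix:
    print("  ".join(f"{entry:+.6f}" for entry in row))
```

The 1-2 entry is suppressed by $V_{td} V_{ts}^*$, while the 2-3 entry is only suppressed by one small mixing element, so in this picture the largest effects are expected in transitions involving the third generation.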
8.2
Approximate Flavor Symmetry
The idea of approximate flavor symmetry is in a sense a generalization of what people often did in the past: isospin and flavor SU(3) in hadrons. For instance, the isospin SU(2) is supposedly a symmetry between protons and neutrons. It is an explicitly broken symmetry due to the difference between $m_u$ and $m_d$ and also to the electromagnetic interaction. One can still exploit the isospin symmetry by regarding $m_d - m_u \neq 0$ and the electric charge operator $eQ$ as "spurions" which parametrize the size of the explicit breaking. Then one can write down the most general Lagrangian consistent with the isospin transformation properties of the spurions. Even though such an operator analysis does not have the power to predict the size of the coefficients in the Lagrangian, it can relate different quantities using the SU(2) symmetry and allows us to estimate the order of magnitude of the symmetry breaking effects.
We now generalize this idea to all three generations, assuming a certain flavor symmetry exists with a small explicit breaking. The philosophy behind this analysis is the belief that all coupling constants must be $O(1)$. The top Yukawa coupling is indeed $O(1)$ and is "natural," while all other Yukawa couplings (possibly except $h_b$, $h_\tau$ if $\tan\beta$ is large) are "unnaturally small." Therefore there must be a flavor symmetry which allows the top Yukawa coupling while forbidding the other Yukawa couplings. The explicit breaking of the flavor symmetry, however, makes the other Yukawa couplings possible, at suppressed orders of magnitude due to the smallness of the spurions. Let us employ one simple example of flavor symmetry, based on a single U(1) [75]. The charge assignment is SU(5)-like (see footnote 18):
$10_1(+2)$, $10_2(+1)$, $10_3(0)$,
$5^*_1(0)$, $5^*_2(0)$, $5^*_3(0)$,
$1_1(0)$, $1_2(0)$, $1_3(0)$,
(8.3)
where the subscripts are generation indices and the U(1) flavor charges are given in parentheses. The SU(5)-like multiplets contain $10 = (Q_L, (u_R)^c, (e_R)^c)$, $5^* = (L_L, (d_R)^c)$, and $1 = (\nu_R)^c$. If we require the conservation of this U(1) charge, the top, bottom, and tau Yukawa couplings are allowed, all neutrino Yukawa couplings are allowed, but all other quark and lepton Yukawa couplings are forbidden. Then let us also suppose that this U(1) symmetry is broken by a small spurion $\epsilon(-1) \sim 0.04$. This allows us to fill in the blanks in the Yukawa matrices, and we can make order of magnitude estimates of the matrix elements:
$Y_u \sim \begin{pmatrix} \epsilon^4 & \epsilon^3 & \epsilon^2 \\ \epsilon^3 & \epsilon^2 & \epsilon \\ \epsilon^2 & \epsilon & 1 \end{pmatrix}, \quad Y_d \sim \begin{pmatrix} \epsilon^2 & \epsilon^2 & \epsilon^2 \\ \epsilon & \epsilon & \epsilon \\ 1 & 1 & 1 \end{pmatrix}$,
(8.4)
$Y_l \sim \begin{pmatrix} \epsilon^2 & \epsilon & 1 \\ \epsilon^2 & \epsilon & 1 \\ \epsilon^2 & \epsilon & 1 \end{pmatrix}, \quad Y_\nu \sim \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}$,
(8.5)
18 This charge assignment would prefer a large $\tan\beta$. Another possibility is to assign charge +1 for all $5^*$'s, which would prefer a small $\tan\beta$. This is consistent with the charge assignments in [76], which make the superpartners of fields with non-zero U(1) charges heavy due to the anomalous U(1) in string-inspired models and the model safe from flavor-changing constraints.
where the left-handed (right-handed) fields couple to them from the left (right) of the matrices. There are "random" $O(1)$ coefficients in each of the matrix elements. The property that $Y_d \sim Y_l^T$ is true in many SU(5)-like models. Finally, the Majorana mass matrix of right-handed neutrinos is
$M_R \sim M_0 \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}$,
(8.6)
where $M_0 \sim 10^{15}$ GeV is the mass scale of lepton-number violation. This flavor charge assignment would predict the order of magnitude relations:
$m_u : m_c : m_t \sim \epsilon^4 : \epsilon^2 : 1$,
$m_d : m_s : m_b \sim \epsilon^2 : \epsilon : 1$,
$m_e : m_\mu : m_\tau \sim \epsilon^2 : \epsilon : 1$.
(8.7)
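The textures in Eqs. (8.4) and (8.5) and the hierarchies in Eq. (8.7) follow mechanically from the charges in Eq. (8.3). The sketch below spells out that bookkeeping; it assumes, as the allowed top Yukawa coupling suggests, that the Higgs fields carry zero flavor charge, so that each entry needs as many powers of the spurion $\epsilon(-1)$ as the sum of the two fermion charges. The function and variable names are hypothetical.

```python
# U(1) flavor charges from Eq. (8.3); index 0, 1, 2 = generation 1, 2, 3.
CHARGE_10 = [2, 1, 0]
CHARGE_5BAR = [0, 0, 0]
CHARGE_1 = [0, 0, 0]


def texture(left_charges, right_charges):
    """Exponent n_ij such that the (i, j) Yukawa entry is ~ epsilon**n_ij."""
    return [[ql + qr for qr in right_charges] for ql in left_charges]


# Left-handed fields label rows, right-handed fields label columns.
Y_U = texture(CHARGE_10, CHARGE_10)     # Q_L (10) with u_R (10)
Y_D = texture(CHARGE_10, CHARGE_5BAR)   # Q_L (10) with d_R (5*)
Y_L = texture(CHARGE_5BAR, CHARGE_10)   # L_L (5*) with e_R (10)
Y_NU = texture(CHARGE_5BAR, CHARGE_1)   # L_L (5*) with nu_R (1)


def hierarchy(exponents):
    """Diagonal exponents, a rough proxy for the mass hierarchy."""
    return [exponents[i][i] for i in range(3)]


print("Y_u exponents:", Y_U, "-> masses ~ eps^", hierarchy(Y_U))
print("Y_d exponents:", Y_D, "-> masses ~ eps^", hierarchy(Y_D))
```

Reading the masses off the diagonal is only an order of magnitude argument, but it reproduces the "double" hierarchy of the up sector ($\epsilon^4 : \epsilon^2 : 1$) compared with the down and charged lepton sectors ($\epsilon^2 : \epsilon : 1$).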
In other words, the following ratios must all be equal up to unknown $O(1)$ coefficients:
$(m_u/m_t)^{1/4} \simeq 0.059$, $(m_c/m_t)^{1/2} \simeq 0.077$,
$(m_d/m_b)^{1/2} \simeq 0.03$, $m_s/m_b \simeq 0.03$,
$(m_e/m_\tau)^{1/2} \simeq 0.017$, $m_\mu/m_\tau \simeq 0.059$.
(8.8)
With fluctuations of the $O(1)$ coefficients within a factor of two or so, this set of charge assignments appears successful. Moreover, random $O(1)$ coefficients among the neutrino Yukawa couplings naturally lead, via the seesaw mechanism, to near-maximal mixings in neutrino oscillations [77] (see footnote 19). The above mass matrices would naturally explain (1) the "double" hierarchy in up quarks relative to the hierarchy in down quarks and charged leptons, (2) $V_{cb} \sim O(\epsilon) \sim O(\lambda^2)$, (3) the similarity between the down quark and charged lepton masses. Some "concerns" with the above mass matrices would be that the following points may be difficult to understand: (a) $m_s \sim m_\mu/3$, (b) $m_e \sim m_d/3$, (c) $V_{us} \sim \epsilon^{1/2}$ rather than $\epsilon$. However, in view of the fact that the $O(1)$ coefficients would seem "anarchical" [77] from the low-energy point of view, a factor of 1/3 is quite likely to appear. And once $m_s$ has fluctuated downwards by a factor of $\sim 1/3$, $V_{us}$ would fluctuate upwards to $\sim 3\epsilon$, which is enough to understand the observed pattern of masses and mixings.
19 We need to assume that the CHOOZ limit on $|U_{e3}|$ is "accidentally" satisfied.
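The statement that the ratios in Eq. (8.8) agree "within a factor of two or so" can be checked with a few lines of arithmetic. The sketch below divides each quoted ratio by the spurion size $\epsilon \sim 0.04$ to expose the implied $O(1)$ coefficient. The numerical values are the ones quoted in Eq. (8.8); the pairing of the value 0.03 with $(m_d/m_b)^{1/2}$ is my reading of the equation layout and should be treated as an assumption.

```python
EPSILON = 0.04  # spurion size quoted in the text

# Ratios listed in Eq. (8.8); each is expected to be ~ epsilon.
RATIOS = {
    "(m_u/m_t)^(1/4)": 0.059,
    "(m_c/m_t)^(1/2)": 0.077,
    "(m_d/m_b)^(1/2)": 0.03,
    "m_s/m_b": 0.03,
    "(m_e/m_tau)^(1/2)": 0.017,
    "m_mu/m_tau": 0.059,
}


def implied_coefficients(ratios, epsilon=EPSILON):
    """O(1) factor by which each ratio differs from epsilon."""
    return {name: value / epsilon for name, value in ratios.items()}


coefficients = implied_coefficients(RATIOS)
largest = max(coefficients.values())
smallest = min(coefficients.values())
for name, factor in coefficients.items():
    print(f"{name:18s} = {factor:.2f} x epsilon")
```

All six implied coefficients lie between roughly 0.4 and 2, which is the sense in which the charge assignment is "successful" at the order of magnitude level.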
Therefore one can regard these simple U(1) charge assignments as a starting point for building models of flavor. In explicit models, often the so-called Froggatt-Nielsen mechanism [78] is employed, where the spurion $\epsilon$ arises as the suppressed ratio of a vacuum expectation value, $\langle\phi\rangle$, which spontaneously breaks the U(1) flavor symmetry, to the mass, $M$, of vector-like families whose exchange generates the forbidden Yukawa matrix elements by picking up the VEV: $\epsilon = \langle\phi\rangle/M$. For instance, one can imagine that the heavy particles are all at (or slightly below) the Planck scale while the flavor-breaking VEV is induced around the GUT-scale.
Now the question is what the approximate flavor symmetry does to the scalar mass matrices. Consider the $\tilde{Q}$ mass matrix. Because their charges in Eq. (8.3) differ among the three generations, off-diagonal elements are forbidden in the limit of flavor symmetry. The matrix therefore is given parametrically by
$(\tilde{Q}_1^*, \tilde{Q}_2^*, \tilde{Q}_3^*) \; m^2 \begin{pmatrix} 1 & \epsilon & \epsilon^2 \\ \epsilon & 1 & \epsilon \\ \epsilon^2 & \epsilon & 1 \end{pmatrix} \begin{pmatrix} \tilde{Q}_1 \\ \tilde{Q}_2 \\ \tilde{Q}_3 \end{pmatrix}$.
(8.9)
$m^2$ sets the overall scale of the supersymmetry breaking parameter, while the off-diagonal elements are suppressed by powers of $\epsilon$. Indeed $(m^2)_{12} \sim 0.04 m^2$ is already an adequate suppression, as discussed in Section 6.2. Note, however, that the charge assignments in Eq. (8.3) do not distinguish $\nu_R$, $d_R$, $l_L$ of different generations, and $O(1)$ off-diagonal elements are allowed. Therefore the simple U(1) charge assignment here is not enough to suppress all flavor violation in supersymmetry. Nevertheless it demonstrates the idea: once different generations are distinguished due to their different flavor symmetry properties, the off-diagonal elements are suppressed. In this manner, one may hope to link the fermion masses and scalar masses in a model framework. Many choices of flavor symmetry groups have been discussed in the literature. There are U(1)-based models (most notably [42]), while many of them are non-abelian [41] to ensure the degeneracy between the first two generations: SU(2), O(2), $\Delta(75)$, $(S_3)^3$, U(2).
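The suppression pattern in Eq. (8.9), and the lack of suppression for the fields in $5^*$, can be summarized by one rule: the $(i, j)$ element of a scalar mass-squared matrix carries $\epsilon$ raised to the difference of the two flavor charges. The following sketch applies that rule to the charges of Eq. (8.3); the rule itself is the order of magnitude estimate used in the text, while the function name and the normalization to $m^2 = 1$ are illustrative choices.

```python
EPSILON = 0.04  # spurion size quoted in the text


def scalar_mass_texture(charges, epsilon=EPSILON):
    """Entry (i, j) ~ epsilon**|q_i - q_j|, in units of m^2."""
    return [[epsilon ** abs(qi - qj) for qj in charges] for qi in charges]


TEN = scalar_mass_texture([2, 1, 0])        # fields in the 10: Q, u_R, e_R
FIVE_BAR = scalar_mass_texture([0, 0, 0])   # fields in the 5*: d_R, l_L

print("10 :", TEN)
print("5* :", FIVE_BAR)
```

For the fields in the 10 the 1-2 element is suppressed to 0.04 and the 1-3 element to 0.0016, while for the fields in the $5^*$ every off-diagonal element is unsuppressed, which is why this particular charge assignment alone cannot remove all flavor violation.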
8.3
Grand Theme
Given the considerations in Sections 8.1 and 8.2, the following theme of supersymmetric flavor physics emerges. First of all, we know the Yukawa matrices of quarks and charged leptons quite well except for the right-handed
rotation matrices. We even have learned quite a bit about neutrino masses and mixings. If supersymmetry is found, the combination of Yukawa matrices and would-be measurement of superparticle masses allow us to test various models of flavor. If successful, we will be able to learn the origin of flavor, e.g., what approximate flavor symmetry is responsible. In my mind, this is the strongest motivation to pursue rare processes of flavor violation.
9
Flavor Signatures
We finally come to the quantitative discussions of flavor signatures in supersymmetry. I do not go into quantitative details, but rather present pointers to the original papers and show some plots to give you an idea on how important these effects might be.
9.1
Leptons
We first discuss flavor signatures of supersymmetry in the lepton sector. The list is: $\mu \to e\gamma$, $\mu \to e$ conversion, $\tau \to \mu\gamma$, electric dipole moment of the electron. One exotic entry is the study of oscillation among sleptons.
9.1.1
$g_\mu - 2$
The anomalous magnetic moment of the muon $g_\mu - 2$ can be calculated in QED to great accuracy. Even though this quantity is not quite a flavor signature in the sense that it does not involve any flavor violation, it is so interesting that I'd like to discuss it. To achieve enough accuracy, the hadronic contributions in the photon vacuum polarization diagram and the electroweak loops also need to be included. It turns out that the supersymmetric contribution is as important as the electroweak contribution if the sleptons are not too far above $m_W$, and can be much more important if there is an enhancement due to a large $\tan\beta$. The prediction in the minimal supergravity framework was worked out in detail in [46], while the general case in supersymmetry was studied in [47]. The Brookhaven E821 experiment is currently taking data and is expected to measure $a_\mu = (g_\mu - 2)/2$ with an accuracy of $\Delta a_\mu = 0.4 \times 10^{-9}$. One can see from Fig. 17 that the supersymmetric contribution can be important for a wide range of parameter space. Just before finishing this writeup, E821 reported a measured value that deviates from the Standard Model at the $2.6\sigma$ level, possibly hinting at slepton masses in the 120-400 GeV range [79].
Figure 17: The SUSY contribution to the muon MDM, $\Delta a_\mu^{SUSY}$, in the $\mu$-$M_2$ plane. The smuon masses are taken to be 300 GeV and $\tan\beta = 30$. The numbers given in the figure represent the value of $\Delta a_\mu^{SUSY}$ in units of $10^{-9}$. Taken from [47]. [Plot: horizontal axis $\mu$ (GeV), from -1000 to 1000.]
9.1.2
$\mu \to e\gamma$
There is no contribution from the Standard Model to this process. Even with supersymmetry, there is no contribution if the soft masses are universal, i.e., if there is no flavor violation. Therefore the prediction depends sensitively on the source of flavor violation. One important source of flavor violation is the GUT-effect, due to the large top Yukawa coupling above the GUT-scale. The importance of this effect was pointed out in [80]. More detailed calculations were carried out in [81]. A missing diagram in these analyses which can partially cancel the GUT-effect contribution was pointed out in [82]. The MEGA collaboration has improved the experimental limit down to $BR(\mu \to e\gamma) < 1.2 \times 10^{-11}$ [83]. A new experiment at SIN should improve it to the $10^{-14}$ level. Another possible source of flavor violation here is the effect of the right-handed neutrinos. This has been studied in [74], and the result depends on the mass of the right-handed neutrino as well as on which solution to the solar neutrino problem is right. Models with approximate flavor symmetries also give rise to $\mu \to e\gamma$. See, for example, [84].
9.1.3
$\mu \to e$ Conversion
This process is closely related to $\mu \to e\gamma$, but is experimentally cleaner and is expected to be improved by the MECO experiment to the $0.5 \times 10^{-16}$ level [85].
9.1.4
Electric Dipole Moment of Electron
An electric dipole moment $d_e$, if it exists, would be direct evidence for T-violation. The Standard Model does not give rise to an electric dipole moment of the electron, and hence its detection would be a clear signal of physics beyond the Standard Model. In the case of supersymmetry, there can be additional sources of CP violation in the soft parameters (and $\mu$) and hence they can give rise to $d_e$. In the case of the GUT-effect, according to [81], there is an approximate scaling relation between the $\mu \to e\gamma$ rate and $d_e$ such as

$$|d_e| \sim 10^{-27}\, e\,{\rm cm} \times 1.3 \sin\phi\, \sqrt{\frac{{\rm BR}(\mu \to e\gamma)}{10^{-12}}}. \qquad (9.1)$$
The current limit is $d_e = (1.8 \pm 1.6) \times 10^{-27}\, e$ cm. Even without resorting to the GUT-effect, an additional CP violating parameter among selectrons, charginos, and winos can induce $d_e$. If the phase is $O(1)$, we need the selectron mass to be above a TeV!
Figure 18: Isoplots of B.R.($\mu \to e\gamma$) in SU(5) in the ($M_2$, $A_e/m_{\tilde e_R}$) plane for $\lambda_{tG} = 1.4$, $m_{\tilde e_R} = 100$ GeV and (a) $\tan\beta = 2$, $\mu < 0$, (b) $\tan\beta = 2$, $\mu > 0$, (c) $\tan\beta = 10$, $\mu < 0$, (d) $\tan\beta = 10$, $\mu > 0$. The dashed (dotted) lines delimit regions where $m^2_{\tilde\tau_R} < 0$ (^2 < 0). The shaded area also extends to $m_{\tilde\tau_R} < 45$ GeV. The darker area shows a region where the rate is small, and passes through zero, due to a cancellation of terms. The dot-dashed line corresponds to the present experimental limit. For the CKM matrix elements we take $|V_{cb}| = 0.04$ and $|V_{td}| = 0.01$. Taken from [81].
Figure 19: Isoplots of B.R.($\mu \to e\gamma$) in SO(10) for $m_{\tilde e_R} = 300$ GeV, $\lambda_{tG} = 1.25$ and all other parameters as in Fig. 18. Taken from [81].
Figure 20: Isoplots of C.R.($\mu \to e$ in Ti) in SU(5) for $m_{\tilde e_R} = 100$ or 300 GeV, $\lambda_{tG} = 1.4$ and $\tan\beta = 2$.
9.1.5 $\tau \to \mu\gamma$

Atmospheric neutrino oscillations, if explained by the seesaw mechanism with right-handed neutrinos around $10^{15}$ GeV, can yield interesting contributions to $\tau \to \mu\gamma$ or $e\gamma$. The effect is quite large if the sleptons are below 200 GeV or so and if the right-handed neutrino mass is close to $10^{15}$ GeV (as prejudiced by the SO(10)-type relations). Similar effects on $\mu \to e\gamma$ and $\mu \to e$ conversion are much more model dependent, partly because we do not know at this moment which solution to the solar neutrino problem is right, which leaves a huge possible range for $\Delta m^2$ and $\sin^2 2\theta_{e2}$.

9.1.6 Slepton Oscillation
A surprising but interesting possible consequence of lepton flavor violation in the slepton mass matrix is oscillation between different slepton flavors in the collider environment. This was proposed in [86]. The $\mu \to e\gamma$ constraint requires two mass eigenvalues to be close unless the mixing angle is very small. If the mass splitting is only of the order of the decay width $\Gamma \sim \alpha' m$, where $\alpha' = \alpha/\cos^2\theta_W$ for the right-handed sleptons, the mass eigenstates live long enough to mix with each other. The signature then is $e^+e^- \to \tilde e^+ \tilde e^-$, where $\tilde e^\pm$ oscillates into $\tilde\mu^\pm$ and decays into a muon. Therefore an $e\mu$ final state can be looked for. The signal is particularly clean in $e^-e^-$ collisions because of the absence of $W^+W^-$ background and the larger cross sections.
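To get a feeling for the size of the mass splitting involved, the estimate $\Gamma \sim \alpha' m$ quoted above can be evaluated numerically. The sketch below is only an order-of-magnitude illustration: the values of $\alpha$ and $\sin^2\theta_W$ and the 200 GeV slepton mass are assumptions chosen for the example, not inputs taken from [86].

```python
# Order-of-magnitude estimate of the right-handed slepton width,
# Gamma ~ alpha' * m with alpha' = alpha / cos^2(theta_W).
# All numerical inputs below are illustrative assumptions.

ALPHA_EM = 1.0 / 128.0   # assumed electromagnetic coupling near the weak scale
SIN2_THETA_W = 0.231     # assumed weak mixing angle


def slepton_width_estimate(mass_gev: float) -> float:
    """Return the rough width alpha' * m in GeV for a slepton of mass mass_gev."""
    alpha_prime = ALPHA_EM / (1.0 - SIN2_THETA_W)
    return alpha_prime * mass_gev


width = slepton_width_estimate(200.0)
print(f"Gamma ~ {width:.2f} GeV for a 200 GeV right-handed slepton")
```

With these inputs the width comes out at roughly 2 GeV, so on this estimate oscillation would need the two slepton masses to agree at about the percent level.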
9.2 Hadrons

In the hadronic sector, the possible flavor signatures include the neutron electric dipole moment $d_n$, $\epsilon$ and $\epsilon'$ in the neutral kaon system, CP violation in hyperon decay, $\Delta m_B$, $b \to s\gamma$, and the $B$ dilepton asymmetry.

9.2.1 $d_n$, $\epsilon$, $b \to s\gamma$

The neutron electric dipole moment $d_n$, $\epsilon$, and $b \to s\gamma$ can all be induced from the GUT effect. Both $\epsilon$ in the kaon system and $b \to s\gamma$ exist within the standard model, while $d_n$ would be a clear sign of new physics.
Figure 21: Dependence of the branching ratio of $\tau \to \mu\gamma$ on the third-generation right-handed neutrino Majorana mass $M_{\nu_3}$ in the MSSM with right-handed neutrinos (panel title: $\tau \to \mu\gamma$ in the MSSMRN, $M_2 = 130$ GeV, $\tilde m = 170$ GeV, $m_\nu = 0.07$ eV). The input parameters are the same as those of Fig. (2) except that in this figure we take $\tilde m = 170$ GeV and that we do not impose the condition $f_{u_3} = f_{\nu_3}$ but treat $M_{\nu_3}$ as an independent variable. The dotted line shown in the figure is the present experimental bound. Here also the larger $\tan\beta$ corresponds to the upper curve. Taken from [74].
Figure 22: Contours of constant $\sigma(e^+ e^-_R \to e^\pm \mu^\mp \tilde\chi^0 \tilde\chi^0)$ (solid) in fb for the NLC, with $\sqrt{s} = 500$ GeV, $m_{\tilde e_R}, m_{\tilde\mu_R} \approx 200$ GeV, and $M_1 = 100$ GeV. The thick gray contour represents the experimental reach in one year. Constant contours of $B(\mu \to e\gamma)$ are also plotted, but for left-handed sleptons degenerate at 350 GeV. Taken from [86].
Figure 23: Same as in Fig. 22, but for $\sigma(e^-_R e^-_R \to e^- \mu^- \tilde\chi^0 \tilde\chi^0)$. Taken from [86].
Figure 24: Contour plots in minimal SO(10) for $m_{\tilde e_R} = 300$ GeV, $\lambda_{tG} = 1.25$, $\mu < 0$, $\tan\beta = 2$, and maximal CP violating phases (see text) for (a) B.R.($\mu \to e\gamma$); (b) $d_n$; (c) $\epsilon_K$; (d) $\epsilon'/\epsilon$; (e) $\Delta m_B$; (f) B.R.($b \to s\gamma$). In the hadronic observables only the gluino exchange contribution is included.

9.2.2 $\epsilon$, $\epsilon'$, Hyperon CP Violation
The $\epsilon$ parameter of the neutral kaon system can also arise from models with approximate flavor symmetries. Saturating the constraints, it is even possible to obtain the entire $\epsilon$ from supersymmetry, without any CP violation in the Kobayashi-Maskawa matrix. If that is the case, the $K_L \to \pi^0 \nu\bar\nu$ experiment, which probes ${\rm Im}(V_{td}V_{ts}^*)$ directly, would see a vanishing result, as opposed to ${\rm BR}(K_L \to \pi^0 \nu\bar\nu) \sim 2$-$4 \times 10^{-11}$ as expected in the standard model.

The supersymmetric contribution to $\epsilon'$ was believed to be negligible for a long time. However, that belief was based on the minimal supergravity prejudice, and an approximate flavor symmetry leads to an acceptable and interesting contribution to $\epsilon'$ which can saturate the observed value naturally [48] with 1 TeV squarks. Other mechanisms that generate supersymmetric $\epsilon'$ have also been suggested [49, 50].

The same operator that gives rise to $\epsilon'$ in [48] may also contribute to hyperon CP violation [51]. Due to the interference between S-wave and P-wave amplitudes in the $\Lambda \to p\pi^-$ decay, there is a forward-backward asymmetry $\alpha_\Lambda$ in the decay angle distribution due to parity non-conservation. A search is under way for CP violation manifested as a difference between the asymmetry $\alpha_\Lambda$ and its CP conjugate $-\alpha_{\bar\Lambda}$. The Fermilab E871 (HyperCP) experiment hopes to measure $A(\Lambda) = (\alpha_\Lambda + \alpha_{\bar\Lambda})/(\alpha_\Lambda - \alpha_{\bar\Lambda})$ at the $2 \times 10^{-4}$ level. The same type of diagrams in models with approximate flavor symmetries would lead to rather large $\mu \to e\gamma$ and $d_e$, and would require sleptons above 500 GeV or so (see, e.g., [48]).

9.2.3 $\Delta m_{B_d}$
$B$-$\bar B$ mixing, like the neutral kaon system, is also sensitive to new physics effects. The supersymmetric contribution to $\Delta m_{B_d}$ can also be CP-violating, and can make the asymmetry in $B^0 \to J/\psi K_S$ differ from the true $\sin 2\beta$. A large effect is especially motivated in models with electroweak baryogenesis [87]. See also Fig. 24.

9.2.4 B Dilepton Asymmetry
In the standard model, the CP-violating pieces in $M_{12}$ and $\Gamma_{12}$ are essentially proportional to each other. In many models with approximate flavor symmetry, however, there is an additional, possibly CP-violating, contribution to $M_{12}$ but not to $\Gamma_{12}$. The mismatch between $M_{12}$ and $\Gamma_{12}$ can induce a different type of CP asymmetry. In the same-sign dilepton final states,

$$e^+e^- \to \Upsilon(4S) \to B^0 \bar B^0, \qquad B^0 \to \ell^+ X, \qquad \bar B^0 \to B^0 \to \ell^+ X, \qquad (9.2)$$

one can define the dilepton asymmetry

$$A = \frac{N(\ell^+\ell^+) - N(\ell^-\ell^-)}{N(\ell^+\ell^+) + N(\ell^-\ell^-)}.$$

In the standard model the asymmetry is at most of order $A_{\rm SM} \lesssim 10^{-3}$, while it can be as large as $10^{-2}$ in models with approximate flavor symmetry [88].

9.2.5 $b \to s\gamma$
The observed rate of the inclusive $b \to s\gamma$ decay is consistent with the NLO standard model calculation. In a general two-Higgs-doublet model, including the MSSM, the additional diagram with charged Higgs exchange in place of the $W$ boson always interferes constructively with the $W$-boson diagram, and is already highly constrained by this process. On the other hand, the supersymmetric contribution can take either sign, depending mostly on the sign of $\mu$. The constraint is quite significant.
10 Conclusion
Supersymmetry is a well-motivated candidate for physics beyond the Standard Model. It would allow us to extrapolate the (supersymmetric version of the) Standard Model down to much shorter distances, giving us hope to connect the observables at TeV-scale experiments to parameters of much more fundamental theories. Even though it has been extensively studied over two decades, many new aspects of supersymmetry have been uncovered in the last few years. We expect that research along this direction will continue to be fruitful. We, however, really need a clear-cut confirmation (or falsification) experimentally. The good news is that we expect it to be discovered, if nature did choose this direction, at the currently planned experiments. If so, we also hope to see a wealth of flavor data to help us unravel the origin of flavor.
Figure 25: Constraints on the parameter space in minimal SUGRA models with non-universal Higgs masses imposed by $b \to s\gamma$: domains in the ($\mu$, $M_2$) plane excluded for $\tan\beta = 3$ (a,b,c) and $\tan\beta = 10$ (d). In all plots the 'reference' excluded region for $m_A = 250$ GeV, $m_0 = 500$ GeV and the infrared quasi-fixed-point value $A_0 = 2m_{1/2}$ is shaded, assuming $m_t = 175$ GeV. The effect of varying $m_A$ is shown in panel (a), the effect of varying $m_0$ is shown in panel (b), the effect of changing the sign of $A$ is shown in panel (c), and panel (d) illustrates the effect of increasing $\tan\beta$. See [89] for more details.
Acknowledgements

I thank Jon Rosner for his patience in waiting for my manuscript as well as for useful comments. I also thank my student Aaron Pierce for his careful proofreading. This work was supported in part by the DOE Contract DE-AC03-76SF00098 and by NSF grant PHY-95-14797.
References

[1] D. E. Groom et al., Eur. Phys. J. C15, 1 (2000).
[2] Y. Fukuda et al. [Super-Kamiokande Collaboration], Phys. Rev. Lett. 81, 1562 (1998) [hep-ex/9807003].
[3] H. Murayama, Talk given at 22nd INS International Symposium on Physics with High Energy Colliders, Tokyo, Japan, 8-10 Mar 1994. Published in Proceedings, eds. S. Yamada and T. Ishii. World Scientific, 1995. 476p. [hep-ph/9410285].
[4] V. F. Weisskopf, Phys. Rev. 56, 72 (1939).
[5] M. Veltman, Acta Phys. Polon. B12, 437 (1981); S. Dimopoulos and S. Raby, Nucl. Phys. B192, 353 (1981); E. Witten, Nucl. Phys. B188, 513 (1981); M. Dine, W. Fischler and M. Srednicki, Nucl. Phys. B189, 575 (1981).
[6] R. Barbieri and G. F. Giudice, Nucl. Phys. B306, 63 (1988); G. W. Anderson and D. J. Castano, Phys. Lett. B347, 300 (1995) [hep-ph/9409419]; Phys. Rev. D 52, 1693 (1995) [hep-ph/9412322].
[7] E. Farhi and L. Susskind, Phys. Rept. 74, 277 (1981).
[8] N. Arkani-Hamed, S. Dimopoulos and G. Dvali, Phys. Lett. B429, 263 (1998) [hep-ph/9803315]; I. Antoniadis, N. Arkani-Hamed, S. Dimopoulos and G. Dvali, Phys. Lett. B436, 257 (1998) [hep-ph/9804398].
[9] C. Kolda and H. Murayama, JHEP 0007, 035 (2000) [hep-ph/0003170].
[10] S. P. Martin, hep-ph/9709356.
[11] K. R. Dienes, C. Kolda and J. March-Russell, Nucl. Phys. B492, 104 (1997) [hep-ph/9610479].
[12] A. de Gouvea, A. Friedland and H. Murayama, Phys. Rev. D 59, 095008 (1999) [hep-ph/9803481].
[13] K. Intriligator and N. Seiberg, Nucl. Phys. Proc. Suppl. 45BC, 1 (1996) [hep-th/9509066].
[14] L. Girardello and M. T. Grisaru, Nucl. Phys. B194, 65 (1982).
[15] K. Harada and N. Sakai, Prog. Theor. Phys. 67, 1877 (1982); D. R. Jones, L. Mezincescu and Y. P. Yao, Phys. Lett. B148, 317 (1984); L. J. Hall and L. Randall, Phys. Rev. Lett. 65, 2939 (1990).
[16] F. Borzumati, G. R. Farrar, N. Polonsky and S. Thomas, Nucl. Phys. B555, 53 (1999) [hep-ph/9902443]; I. Jack and D. R. Jones, Phys. Lett. B457, 101 (1999) [hep-ph/9903365].
[17] J. Polchinski and L. Susskind, Phys. Rev. D 26, 3661 (1982); H. P. Nilles, M. Srednicki and D. Wyler, Phys. Lett. B124, 337 (1983); J. Bagger and E. Poppitz, Phys. Rev. Lett. 71, 2380 (1993) [hep-ph/9307317]; J. Bagger, E. Poppitz and L. Randall, Nucl. Phys. B455, 59 (1995) [hep-ph/9505244].
[18] M. Shiozawa et al. [Super-Kamiokande Collaboration], Phys. Rev. Lett. 81, 3319 (1998) [hep-ex/9806014].
[19] G. R. Farrar and P. Fayet, Phys. Lett. B76, 575 (1978).
[20] S. Dimopoulos and H. Georgi, Nucl. Phys. B193, 150 (1981); S. Weinberg, Phys. Rev. D 26, 287 (1982); N. Sakai and T. Yanagida, Nucl. Phys. B197, 533 (1982); S. Dimopoulos, S. Raby and F. Wilczek, Phys. Lett. B112, 133 (1982).
[21] G. Jungman, M. Kamionkowski and K. Griest, Phys. Rept. 267, 195 (1996) [hep-ph/9506380].
[22] T. Falk, K. A. Olive and M. Srednicki, Phys. Lett. B339, 248 (1994) [hep-ph/9409270].
[23] L. J. Hall, T. Moroi and H. Murayama, Phys. Lett. B424, 305 (1998) [hep-ph/9712515].
[24] L. J. Hall and M. Suzuki, Nucl. Phys. B231, 419 (1984).
[25] H. Dreiner, in "Perspectives on Supersymmetry," edited by Gordon L. Kane, World Scientific, 1998 (Advanced Series on Directions in High Energy Physics, Vol. 18) [hep-ph/9707435].
[26] J. F. Gunion, H. E. Haber, G. Kane, and S. Dawson, "The Higgs Hunter's Guide," Addison-Wesley, 1990 (Frontiers in Physics, 80).
[27] Y. Okada, M. Yamaguchi and T. Yanagida, Prog. Theor. Phys. 85, 1 (1991); J. Ellis, G. Ridolfi and F. Zwirner, Phys. Lett. B257, 83 (1991); H. E. Haber and R. Hempfling, Phys. Rev. Lett. 66, 1815 (1991).
[28] M. Carena, H. E. Haber, S. Heinemeyer, W. Hollik, C. E. Wagner and G. Weiglein, Nucl. Phys. B580, 29 (2000) [hep-ph/0001002]; Nucl. Phys. B586, 3 (2000) [hep-ph/0003246].
[29] Tom Junk, LEP Fest, Oct 10, 2000, CERN, http://lephiggs.web.cern.ch/LEPHIGGS/talks/tom_lepfest.pdf.
[30] D. R. Jones, Phys. Lett. B123, 45 (1983); V. A. Novikov, M. A. Shifman, A. I. Vainshtein and V. I. Zakharov, Nucl. Phys. B229, 381 (1983); N. Arkani-Hamed and H. Murayama, JHEP 0006, 030 (2000) [hep-th/9707133].
[31] N. Arkani-Hamed, G. F. Giudice, M. A. Luty and R. Rattazzi, Phys. Rev. D 58, 115005 (1998) [hep-ph/9803290].
[32] H. Murayama, in "Perspectives on Supersymmetry," edited by Gordon L. Kane, World Scientific, 1998 (Advanced Series on Directions in High Energy Physics, Vol. 18) [hep-ph/9801331].
[33] L. J. Hall, R. Rattazzi and U. Sarid, Phys. Rev. D 50, 7048 (1994) [hep-ph/9306309].
[34] V. Barger, M. S. Berger, P. Ohmann, and R. J. N. Phillips, Presented at Workshop on Physics at Current Accelerators and the Supercollider, Argonne, IL, 2-5 Jun 1993. Published in Argonne Accel. Phys. 1993:255-270 [hep-ph/9308233].
[35] M. Ciuchini, V. Lubicz, L. Conti, A. Vladikas, A. Donini, E. Franco, G. Martinelli, I. Scimemi, V. Gimenez, L. Giusti, A. Masiero, L. Silvestrini, and M. Talevi, JHEP 9810, 008 (1998) [hep-ph/9808328].
[36] A. Masiero and L. Silvestrini, Lectures given at International School of Subnuclear Physics, "35th Course: Highlights: 50 Years Later," Erice, Italy, 26 Aug - 4 Sep 1997, and given at International School of Physics "Enrico Fermi," "Heavy Flavor Physics - A Probe of Nature's Grand Design," Varenna, Italy, 8-18 Jul 1997 [hep-ph/9711401].
[37] S. Dimopoulos and H. Georgi, Nucl. Phys. B193, 150 (1981).
[38] G. F. Giudice and R. Rattazzi, Phys. Rept. 322, 419 (1999) [hep-ph/9801271]; Y. Shadmi and Y. Shirman, Rev. Mod. Phys. 72, 25 (2000) [hep-th/9907225].
[39] L. Randall and R. Sundrum, Nucl. Phys. B557, 79 (1999) [hep-th/9810155]; G. F. Giudice, M. A. Luty, H. Murayama and R. Rattazzi, JHEP 9812, 027 (1998) [hep-ph/9810442]; R. Rattazzi, A. Strumia and J. D. Wells, Nucl. Phys. B576, 3 (2000) [hep-ph/9912390].
[40] D. E. Kaplan, G. D. Kribs and M. Schmaltz, Phys. Rev. D 62, 035010 (2000) [hep-ph/9911293]; Z. Chacko, M. A. Luty, A. E. Nelson and E. Ponton, JHEP 0001, 003 (2000) [hep-ph/9911323]; M. Schmaltz and W. Skiba, Phys. Rev. D 62, 095005 (2000) [hep-ph/0001172].
[41] M. Dine, R. Leigh and A. Kagan, Phys. Rev. D 48, 4269 (1993) [hep-ph/9304299]; P. Pouliot and N. Seiberg, Phys. Lett. B318, 169 (1993) [hep-ph/9308363]; D. B. Kaplan and M. Schmaltz, Phys. Rev. D 49, 3741 (1994) [hep-ph/9311281]; L. J. Hall and H. Murayama, Phys. Rev. Lett. 75, 3985 (1995) [hep-ph/9508296]; A. Pomarol and D. Tommasini, Nucl. Phys. B466, 3 (1996) [hep-ph/9507462]; R. Barbieri, G. Dvali and L. J. Hall, Phys. Lett. B377, 76 (1996) [hep-ph/9512388]; R. Barbieri, L. J. Hall, S. Raby and A. Romanino, Nucl. Phys. B493, 3 (1997) [hep-ph/9610449].
[42] Y. Nir and N. Seiberg, Phys. Lett. B309, 337 (1993) [hep-ph/9304307].
[43] M. Dine, A. Kagan and S. Samuel, Phys. Lett. B243, 250 (1990); A. G. Cohen, D. B. Kaplan and A. E. Nelson, Phys. Lett. B388, 588 (1996) [hep-ph/9607394].
[44] N. Arkani-Hamed and H. Murayama, Phys. Rev. D 56, 6733 (1997) [hep-ph/9703259].
[45] J. Hisano, K. Kurosawa and Y. Nomura, Phys. Lett. B445, 316 (1999) [hep-ph/9810411]; J. L. Feng, C. Kolda and N. Polonsky, Nucl. Phys. B546, 3 (1999) [hep-ph/9810500]; J. Bagger, J. L. Feng and N. Polonsky, Nucl. Phys. B563, 3 (1999) [hep-ph/9905292]; J. L. Feng, K. T. Matchev and T. Moroi, Phys. Rev. D 61, 075005 (2000) [hep-ph/9909334]; J. A. Bagger, J. L. Feng, N. Polonsky and R. Zhang, Phys. Lett. B473, 264 (2000) [hep-ph/9911255].
[46] U. Chattopadhyay and P. Nath, Phys. Rev. D 53, 1648 (1996) [hep-ph/9507386].
[47] T. Moroi, Phys. Rev. D 53, 6565 (1996) [hep-ph/9512396]; Erratum ibid. D 56, 4424 (1997).
[48] A. Masiero and H. Murayama, Phys. Rev. Lett. 83, 907 (1999) [hep-ph/9903363].
[49] G. Colangelo and G. Isidori, JHEP 9809, 009 (1998) [hep-ph/9808487].
[50] A. L. Kagan and M. Neubert, Phys. Rev. Lett. 83, 4929 (1999) [hep-ph/9908404].
[51] X. He, H. Murayama, S. Pakvasa and G. Valencia, Phys. Rev. D 61, 071701 (2000) [hep-ph/9909562].
[52] G. D'Ambrosio, G. Isidori and G. Martinelli, Phys. Lett. B480, 164 (2000) [hep-ph/9911522].
[53] S. Dimopoulos and H. Georgi, Nucl. Phys. B193, 150 (1981).
[54] A. H. Chamseddine, R. Arnowitt and P. Nath, Phys. Rev. Lett. 49, 970 (1982); L. Hall, J. Lykken and S. Weinberg, Phys. Rev. D 27, 2359 (1983).
[55] H. P. Nilles, Phys. Rept. 110, 1 (1984).
[56] H. Murayama, hep-ph/0010021.
[57] I. Affleck, M. Dine and N. Seiberg, Phys. Rev. Lett. 52, 1677 (1984).
[58] I. Affleck, M. Dine and N. Seiberg, Nucl. Phys. B256, 557 (1985).
[59] A. E. Nelson, Talk given at 5th International Conference on Supersymmetries in Physics (SUSY 97), Philadelphia, PA, 27-31 May 1997, Nucl. Phys. Proc. Suppl. 62, 261 (1998) [hep-ph/9707442].
[60] I. Affleck, M. Dine and N. Seiberg, Phys. Lett. B137, 187 (1984).
[61] I. Affleck, M. Dine and N. Seiberg, Phys. Lett. B140, 59 (1984).
[62] H. Murayama, Phys. Lett. B355, 187 (1995) [hep-th/9505082].
[63] K. Izawa and T. Yanagida, Prog. Theor. Phys. 95, 829 (1996) [hep-th/9602180]; K. Intriligator and S. Thomas, Nucl. Phys. B473, 121 (1996) [hep-th/9603158]; hep-th/9608046.
[64] M. Dine and D. MacIntire, Phys. Rev. D 46, 2594 (1992) [hep-ph/9205227]; T. Banks, D. B. Kaplan and A. E. Nelson, Phys. Rev. D 49, 779 (1994) [hep-ph/9308292].
[65] M. Dine and A. E. Nelson, Phys. Rev. D 48, 1277 (1993) [hep-ph/9303230]; M. Dine, A. E. Nelson and Y. Shirman, Phys. Rev. D 51, 1362 (1995) [hep-ph/9408384]; M. Dine, A. E. Nelson, Y. Nir and Y. Shirman, Phys. Rev. D 53, 2658 (1996) [hep-ph/9507378].
[66] H. Murayama, Phys. Rev. Lett. 79, 18 (1997) [hep-ph/9705271]; S. Dimopoulos, G. Dvali, R. Rattazzi and G. F. Giudice, Nucl. Phys. B510, 12 (1998) [hep-ph/9705307].
[67] A. E. Nelson, Phys. Lett. B369, 277 (1996) [hep-ph/9511350].
[68] M. K. Gaillard, B. Nelson and Y. Wu, Phys. Lett. B459, 549 (1999) [hep-th/9905122]; J. A. Bagger, T. Moroi and E. Poppitz, JHEP 0004, 009 (2000) [hep-th/9911029].
[69] A. Pomarol and R. Rattazzi, JHEP 9905, 013 (1999) [hep-ph/9903448]; E. Katz, Y. Shadmi and Y. Shirman, JHEP 9908, 015 (1999) [hep-ph/9906296]; Z. Chacko, M. A. Luty, I. Maksymyk and E. Ponton, JHEP 0004, 001 (2000) [hep-ph/9905390]; M. Carena, K. Huitu and T. Kobayashi, Nucl. Phys. B592, 164 (2000) [hep-ph/0003187]; D. E. Kaplan and G. D. Kribs, JHEP 0009, 048 (2000) [hep-ph/0009195]; I. Jack and D. R. Jones, Phys. Lett. B482, 167 (2000) [hep-ph/0003081].
[70] N. Arkani-Hamed, D. E. Kaplan, H. Murayama and Y. Nomura, hep-ph/0012103.
[71] N. Arkani-Hamed, L. Hall, D. Smith and N. Weiner, Phys. Rev. D 61, 116003 (2000) [hep-ph/9909326].
[72] H. Georgi and C. Jarlskog, Phys. Lett. B86, 297 (1979).
[73] L. J. Hall, V. A. Kostelecky and S. Raby, Nucl. Phys. B267, 415 (1986).
[74] J. Hisano and D. Nomura, Phys. Rev. D 59, 116005 (1999) [hep-ph/9810479].
[75] N. Haba and H. Murayama, hep-ph/0009174, to appear in Phys. Rev. D.
[76] J. Hisano, K. Kurosawa and Y. Nomura, Nucl. Phys. B584, 3 (2000) [hep-ph/0002286].
[77] L. Hall, H. Murayama and N. Weiner, Phys. Rev. Lett. 84, 2572 (2000) [hep-ph/9911341].
[78] C. D. Froggatt and H. B. Nielsen, Nucl. Phys. B147, 277 (1979).
[79] H. N. Brown et al. [Muon g-2 Collaboration], hep-ex/0102017.
[80] R. Barbieri and L. J. Hall, Phys. Lett. B338, 212 (1994) [hep-ph/9408406].
[81] R. Barbieri, L. Hall and A. Strumia, Nucl. Phys. B445, 219 (1995) [hep-ph/9501334].
[82] J. Hisano, T. Moroi, K. Tobe and M. Yamaguchi, Phys. Lett. B391, 341 (1997) [hep-ph/9605296].
[83] M. L. Brooks et al. [MEGA Collaboration], Phys. Rev. Lett. 83, 1521 (1999) [hep-ex/9905013].
[84] W. Buchmuller, D. Delepine and L. T. Handoko, Nucl. Phys. B576, 445 (2000) [hep-ph/9912317].
[85] J. L. Popp [MECO Collaboration], To be published in the proceedings of NuFACT'00: International Workshop on Muon Storage Ring for a Neutrino Factory, Monterey, California, 22-26 May 2000 [hep-ex/0101017].
[86] N. Arkani-Hamed, H. Cheng, J. L. Feng and L. J. Hall, Phys. Rev. Lett. 77, 1937 (1996) [hep-ph/9603431].
[87] M. P. Worah, Phys. Rev. D 56, 2010 (1997) [hep-ph/9702423].
[88] L. Randall and S. Su, Nucl. Phys. B540, 37 (1999) [hep-ph/9807377].
[89] J. Ellis, T. Falk, G. Ganis and K. A. Olive, Phys. Rev. D 62, 075010 (2000) [hep-ph/0004169].
TECHNICOLOR AND COMPOSITENESS

R. SEKHAR CHIVUKULA
Physics Department, Boston University
590 Commonwealth Ave., Boston, MA 02215, USA
E-mail: [email protected]
BUHEP-00-24

Lecture 1 provides an introduction to dynamical electroweak symmetry breaking. Lectures 2 and 3 give an introduction to compositeness, with emphasis on effective lagrangians, power-counting, and the 't Hooft anomaly-matching conditions.

Lecture 1: Technicolor^a

1 Dynamical Electroweak Symmetry Breaking
The simplest theory of dynamical electroweak symmetry breaking is technicolor.^{3,4} Consider an $SU(N_{TC})$ gauge theory with fermions in the fundamental representation of the gauge group

$$\Psi_L = \begin{pmatrix} U \\ D \end{pmatrix}_L, \qquad U_R,\ D_R. \qquad (1)$$

The fermion kinetic energy terms for this theory are

$$\mathcal{L} = \bar U_L i\slashed{D} U_L + \bar U_R i\slashed{D} U_R + \bar D_L i\slashed{D} D_L + \bar D_R i\slashed{D} D_R, \qquad (2)$$
and, like QCD in the $m_u, m_d \to 0$ limit, they have a chiral $SU(2)_L \times SU(2)_R$ symmetry. As in QCD, exchange of technigluons in the spin zero, isospin zero channel is attractive, causing the formation of a condensate

$$\langle \bar U_L U_R \rangle = \langle \bar D_L D_R \rangle \neq 0, \qquad (3)$$

which dynamically breaks $SU(2)_L \times SU(2)_R \to SU(2)_V$. These broken chiral symmetries imply the existence of three massless Goldstone bosons, the analogs of the pions in QCD. Now consider gauging $SU(2)_W \times U(1)_Y$ with the left-handed fermions transforming as weak doublets and the right-handed ones as weak singlets. To avoid gauge anomalies, in this one-doublet technicolor model we will take the left-handed technifermions to have hypercharge zero and the right-handed up- and down-technifermions to have hypercharge $\pm 1/2$. The spontaneous breaking of the chiral symmetry breaks the weak interactions down to electromagnetism. The would-be Goldstone bosons become the longitudinal components of the $W$ and $Z$

$$\pi^\pm,\ \pi^0 \to W_L^\pm,\ Z_L, \qquad (4)$$

which acquire a mass

$$M_W = \frac{g F_{TC}}{2}. \qquad (5)$$

Here $F_{TC}$ is the analog of $f_\pi$ in QCD. In order to obtain the experimentally observed masses, we must have $F_{TC} \approx 246$ GeV and hence this model is essentially QCD scaled up by a factor of

$$\frac{F_{TC}}{f_\pi} \approx 2500. \qquad (6)$$

^a What follows is largely an abbreviated version of the sections on technicolor in lectures^1 I presented at the Les Houches summer school in 1997. For lack of space, I have not included a description of the phenomenology of dynamical electroweak symmetry breaking; for a recent review see Chivukula and Womersley in the 2000 Review of Particle Properties.^2
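As a quick numerical check of eqs. (5) and (6), the following sketch evaluates the tree-level $W$ mass and the QCD-to-technicolor scale factor. The value $F_{TC} = 246$ GeV is taken from the text; the weak coupling $g \approx 0.65$ and the pion decay constant $f_\pi \approx 93$ MeV are assumed inputs supplied here for illustration.

```python
# Tree-level W mass in one-doublet technicolor, M_W = g * F_TC / 2,
# and the factor by which QCD must be scaled up, F_TC / f_pi.

F_TC_GEV = 246.0   # technipion decay constant quoted in the text
F_PI_GEV = 0.093   # assumed pion decay constant (about 93 MeV)
G_WEAK = 0.65      # assumed SU(2)_W gauge coupling


def w_mass_gev(g: float, f_tc_gev: float) -> float:
    """Return M_W = g * F_TC / 2 in GeV."""
    return 0.5 * g * f_tc_gev


m_w = w_mass_gev(G_WEAK, F_TC_GEV)
scale_factor = F_TC_GEV / F_PI_GEV

print(f"M_W   ~ {m_w:.1f} GeV")
print(f"scale ~ {scale_factor:.0f}")
```

With these assumed inputs the mass lands close to 80 GeV and the ratio comes out somewhat above 2600, the same order as the rounded factor quoted in eq. (6).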
While I have described only the simplest model above, it is straightforward to generalize to other cases. Any strongly interacting gauge theory with a chiral symmetry breaking pattern $G \to H$, in which $G$ contains $SU(2)_W \times U(1)_Y$ and the subgroup $H$ contains $U(1)_{em}$ (but not $SU(2)_W \times U(1)_Y$), will break the weak interactions down to electromagnetism. In order to be consistent with experimental results, however, we must also require that $H$ contain custodial^{5,6} $SU(2)_V$. This custodial symmetry ensures that the $F$-constants associated with the $W^\pm$ and $Z$ are equal and therefore that the relation

$$\rho = \frac{M_W^2}{M_Z^2 \cos^2\theta_W} = 1 \qquad (7)$$

is satisfied at tree-level. If the chiral symmetry is larger than $SU(2)_L \times SU(2)_R$, theories of this sort will contain additional (pseudo-)Goldstone bosons which are not "eaten" by the $W$ and $Z$.
2 Flavor Symmetry Breaking and ETC

2.1 Fermion Masses & ETC Interactions
In order to give rise to masses for the ordinary quarks and leptons, we must introduce interactions which connect the chiral symmetries of technifermions to those of the ordinary fermions. The most popular choice^{7,8} is to introduce new broken gauge interactions, called extended technicolor interactions (ETC), which couple technifermions to ordinary fermions. At energies low compared to the ETC gauge-boson mass, $M_{ETC}$, these effects can be treated as local four-fermion interactions

$$\frac{g_{ETC}^2}{M_{ETC}^2}\, (\bar\Psi_L U_R)(\bar q_R q_L). \qquad (8)$$

After technicolor chiral-symmetry breaking and the formation of a $\langle \bar U U \rangle$ condensate, such an interaction gives rise to a mass for an ordinary fermion

$$m_q \approx \frac{g_{ETC}^2}{M_{ETC}^2}\, \langle \bar U U \rangle_{ETC}, \qquad (9)$$

where $\langle \bar U U \rangle_{ETC}$ is the value of the technifermion condensate evaluated at the ETC scale (of order $M_{ETC}$). The condensate renormalized at the ETC scale in eq. (9) can be related to the condensate renormalized at the technicolor scale as follows

$$\langle \bar U U \rangle_{ETC} = \langle \bar U U \rangle_{TC}\, \exp\left( \int_{\Lambda_{TC}}^{M_{ETC}} \frac{d\mu}{\mu}\, \gamma_m(\mu) \right), \qquad (10)$$

where $\gamma_m(\mu)$ is the anomalous dimension of the fermion mass operator and $\Lambda_{TC}$ is the analog of $\Lambda_{QCD}$ for the technicolor interactions. For QCD-like technicolor (or any theory which is "precociously" asymptotically free), $\gamma_m$ is small in the range between $\Lambda_{TC}$ and $M_{ETC}$. Using dimensional analysis^{9,10,11,12} we find

$$\langle \bar U U \rangle_{ETC} \approx \langle \bar U U \rangle_{TC} \approx 4\pi F_{TC}^3. \qquad (11)$$
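In this case eq. (9) implies that

$$\frac{M_{ETC}}{g_{ETC}} \approx 40\ {\rm TeV} \left( \frac{F_{TC}}{250\ {\rm GeV}} \right)^{3/2} \left( \frac{100\ {\rm MeV}}{m_q} \right)^{1/2}. \qquad (12)$$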
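The estimate in eq. (12) follows from inserting eq. (11) into eq. (9) and solving for $M_{ETC}/g_{ETC}$. The short sketch below carries out that arithmetic; the function name and the sample quark masses are illustrative choices, and the only physics input is $m_q \approx (g_{ETC}^2/M_{ETC}^2)\, 4\pi F_{TC}^3$.

```python
import math


def etc_scale_tev(m_q_mev: float, f_tc_gev: float = 250.0) -> float:
    """Return M_ETC / g_ETC in TeV from m_q = (g^2 / M^2) * 4 * pi * F_TC^3."""
    m_q_gev = m_q_mev * 1.0e-3
    scale_gev = math.sqrt(4.0 * math.pi * f_tc_gev**3 / m_q_gev)
    return scale_gev * 1.0e-3


# Sample fermion masses in MeV (illustrative values only).
for label, mass_mev in [("100 MeV", 100.0), ("1.3 GeV", 1300.0), ("175 GeV", 175000.0)]:
    print(f"m_q = {label:>8}:  M_ETC/g_ETC ~ {etc_scale_tev(mass_mev):.1f} TeV")
```

For $m_q = 100$ MeV and $F_{TC} = 250$ GeV the result is a little above 40 TeV, in line with the rounded prefactor of eq. (12), and the scale falls as $m_q^{-1/2}$ for heavier fermions.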
In order to orient our thinking, it is instructive to consider a simple "toy" extended technicolor model. The model is based on an $SU(N_{ETC})$ gauge group, with technicolor as an extension of flavor. In this case $N_{ETC} = N_{TC} + N_F$, and the model contains the (anomaly-free) set of fermions

$$\begin{array}{ll}
Q_L = (N_{ETC}, 3, 2)_{1/6} & L_L = (N_{ETC}, 1, 2)_{-1/2} \\
U_R = (N_{ETC}, 3, 1)_{2/3} & E_R = (N_{ETC}, 1, 1)_{-1} \\
D_R = (N_{ETC}, 3, 1)_{-1/3} & N_R = (N_{ETC}, 1, 1)_{0},
\end{array} \qquad (13)$$

where we display their quantum numbers under $SU(N_{ETC}) \times SU(3)_C \times SU(2)_W \times U(1)_Y$. We break the ETC group down to technicolor in three stages

$$\begin{array}{ccl}
SU(N_{TC} + 3) & & \\
\Lambda_1 \downarrow & & m_1 \approx \frac{4\pi F^3}{\Lambda_1^2} \\
SU(N_{TC} + 2) & & \\
\Lambda_2 \downarrow & & m_2 \approx \frac{4\pi F^3}{\Lambda_2^2} \\
SU(N_{TC} + 1) & & \\
\Lambda_3 \downarrow & & m_3 \approx \frac{4\pi F^3}{\Lambda_3^2} \\
SU(N_{TC}) & &
\end{array}$$

resulting in three isospin-symmetric families of degenerate quarks and leptons, with $m_1 < m_2 < m_3$. Note that the heaviest family is related to the lightest ETC scale! Before continuing our general discussion, it is worth noting a couple of points. First, in this example the ETC gauge bosons do not carry color or weak charge

$$[G_{ETC}, SU(3)_C] = [G_{ETC}, SU(2)_W] = 0. \qquad (14)$$
Furthermore, in this model there is one technifermion for each type of ordinary fermion: that is, this is a "one-family" technicolor model.^{13} Since there are eight left- and right-handed technifermions, the chiral symmetry of the technicolor theory is (in the limit of zero QCD and weak couplings) $SU(8)_L \times SU(8)_R \to SU(8)_V$. Such a theory would yield $8^2 - 1 = 63$ (pseudo-)Goldstone bosons. Three of these Goldstone bosons are unphysical: the corresponding degrees of freedom become the longitudinal components of the $W^\pm$ and $Z$ by the Higgs mechanism. The remaining 60 must somehow obtain a mass. This will lead to the condition in eq. (14) being modified in a realistic model.^7 We will return to the issue of pseudo-Goldstone bosons below.

The most important feature of this or any ETC model is that a successful extended technicolor model will provide a dynamical theory of flavor! As in the toy model described above and as explicitly shown in eq. (8) above, the masses of the ordinary fermions are related to the masses and couplings of the ETC gauge-bosons. A successful and complete ETC theory would predict these quantities and, hence, the ordinary fermion masses. Needless to say, constructing such a theory is very difficult. No complete and successful theory has been proposed. Examining our toy model, we immediately see a number of shortcomings that will have to be addressed in a more realistic theory:

• What breaks ETC?
• Do we require a separate scale for each family?
• How do the $T_3 = \pm\frac{1}{2}$ fermions of a given generation receive different masses?
• How do we obtain quark mixing angles?
• What about right-handed technineutrinos and $m_\nu$?

2.2 Flavor-Changing Neutral-Currents
Perhaps the single biggest obstacle to constructing a realistic ETC model (or any dynamical theory of flavor) is the potential for flavor-changing neutral currents. 7 Quark mixing implies transitions between different generations: q —> \I> —> q', where q and q' are quarks of the same charge from different generations and $ is a technifermion. Consider the commutator of two ETC gauge currents: [q^,yiq'}
D qiq'.
(15)
Hence we expect there are ETC gauge bosons which couple to flavor-changing neutral currents. In fact, this argument is slightly too slick: the same applies to the charged-current weak interactions! However in that case the gauge interactions, SU(2)W, respect a global (SU(3) x £/(l)) 5 chiral symmetry 6 leading to the usual GIM mechanism. Unfortunately, the ETC interactions cannot respect the same global symmetry; they must distinguish between the various generations in order to give rise to the masses of the different generations. Therefore, flavor-changing neutral-current interactions are (at least at some level) unavoidable. The most severe constraints come from possible | A 5 | = 2 interactions which contribute to the K^-Ks mass difference. In particular, we would 6
One SU(3) flavor symmetry for the three families of each type of ordinary fermion. 14
736
expect that in order to produce Cabibbo-mixing the same interactions which give rise to the s-quark mass could cause the flavor-changing interaction
£|A5|=2 = % r ^ (sr"<*) ( s l » +h-c ,
(16)
ETC
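where $\theta_{sd}$ is of order the Cabibbo angle. Such an interaction contributes to the neutral kaon mass splitting
$$(\Delta M_K^2)_{ETC} = \frac{g_{ETC}^2\,\theta_{sd}^2}{M_{ETC}^2}\langle \bar{K}^0|\bar{s}\Gamma^\mu d\,\bar{s}\Gamma'_\mu d|K^0\rangle + \text{c.c.} \tag{17}$$
Using the vacuum insertion approximation we find
$$(\Delta M_K^2)_{ETC} \simeq \frac{g_{ETC}^2\,\mathrm{Re}(\theta_{sd}^2)}{2M_{ETC}^2} f_K^2 M_K^2. \tag{18}$$
Experimentally 2 we know that $\Delta M_K < 3.5 \times 10^{-12}$ MeV and, hence, that
$$\frac{M_{ETC}}{g_{ETC}\sqrt{\mathrm{Re}(\theta_{sd}^2)}} > 600\ \text{TeV}. \tag{19}$$
Using eq. (9) we find that
$$m_{q,\ell} \simeq \frac{g_{ETC}^2}{M_{ETC}^2}\langle\bar{T}T\rangle_{ETC} < \frac{0.5\ \text{MeV}}{N_D^{3/2}\,\theta_{sd}^2}, \tag{20}$$
showing that it will be difficult to produce the s-quark mass, let alone the c-quark!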
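The step from eq. (18) to eq. (19) can be checked numerically. The sketch below is only an illustration under stated assumptions: the values of $f_K$ and $M_K$ are not given in the text and are chosen here for illustration, and the small splitting is converted using $\Delta(M_K^2) \approx 2 M_K \Delta M_K$, which is also an assumption of this sketch.

```python
import math

# All masses in MeV. DELTA_M_K_MAX is the experimental bound quoted in the text;
# F_K and M_K are assumed illustrative inputs, not values taken from the text.
DELTA_M_K_MAX = 3.5e-12
F_K = 113.0
M_K = 497.6


def etc_scale_bound_tev(delta_m_k_max=DELTA_M_K_MAX, f_k=F_K, m_k=M_K):
    """Lower bound on M_ETC / (g_ETC * sqrt(Re theta_sd^2)) in TeV.

    Uses eq. (18): (Delta M_K^2)_ETC ~ g^2 Re(theta^2) f_K^2 M_K^2 / (2 M_ETC^2),
    together with the assumption Delta(M_K^2) ~ 2 M_K Delta M_K.
    """
    delta_m_k_sq_max = 2.0 * m_k * delta_m_k_max
    ratio_sq = f_k**2 * m_k**2 / (2.0 * delta_m_k_sq_max)  # in MeV^2
    return math.sqrt(ratio_sq) / 1.0e6  # MeV -> TeV


bound = etc_scale_bound_tev()
print(f"M_ETC / (g_ETC sqrt(Re theta_sd^2)) > {bound:.0f} TeV")
```

With these inputs the bound comes out at several hundred TeV, the same order as eq. (19); the precise number depends on the assumed value of $f_K$.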
2.3 Pseudo-Goldstone Bosons

A "realistic" ETC theory may require a technicolor sector with a chiral symmetry structure bigger than the $SU(2)_L \times SU(2)_R$ discussed initially. The prototypical theory has one family of technifermions, as incorporated in our toy model. As discussed there, the theory has an $SU(8)_L \times SU(8)_R \to SU(8)_V$ chiral symmetry breaking structure resulting in 63 Goldstone bosons, 3 of which are unphysical. The quantum numbers of the 60 remaining Goldstone bosons are shown in table 1. Clearly, these objects cannot be massless in a realistic theory! In fact, the ordinary gauge interactions break the full $SU(8)_L \times SU(8)_R$ chiral symmetry explicitly. The largest effects are due to QCD, and the color octet and triplet mesons get masses of order 200 - 300 GeV, in analogy to the electromagnetic mass splitting $m_{\pi^+} - m_{\pi^0}$ in QCD. Unfortunately, the others 7 are massless to $O(\alpha)$!
SU(3)_C    SU(2)_V    Particle
1          1          $P^{0\prime}$, $\omega_T$
1          3          $P^{0,\pm}$, $\rho_T^{0,\pm}$
3          1          $P_3^{0\prime}$, $\rho_{T3}^{0\prime}$
3          3          $P_3^{0,\pm}$, $\rho_{T3}^{0,\pm}$
8          1          $P_8^{0\prime}$, $\rho_{T8}^{0\prime}$
8          3          $P_8^{0,\pm}$, $\rho_{T8}^{0,\pm}$

Table 1. Quantum numbers of the 60 physical Goldstone bosons (and the corresponding vector mesons) in a one-family technicolor model. Note that the mesons that transform as 3's of QCD are complex fields.
Luckily, the ETC interactions (which we introduced in order to give masses to the ordinary fermions) are capable of explicitly breaking the unwanted chiral symmetries and producing masses for these mesons. This is because in addition to coupling technifermions to ordinary fermions, some ETC interactions also couple technifermions to one another. 7 Using Dashen's formula, 15 we can estimate that such an interaction can give rise to an effect of order
$$F_{TC}^2 M_{\pi_T}^2 \propto \frac{g_{ETC}^2}{M_{ETC}^2}\langle(\bar{T}T)^2\rangle_{ETC}. \tag{21}$$
In the vacuum insertion approximation for a theory with small $\gamma_m$, we may rewrite the above formula using eq. (9) and find that
$$M_{\pi_T} \simeq 55\ \text{GeV}\sqrt{\frac{m_f}{1\ \text{GeV}}}\sqrt{\frac{250\ \text{GeV}}{F_{TC}}}. \tag{22}$$
It is unclear whether this is large enough. In addition, there is a particularly troubling chiral symmetry in the one-family model. The $SU(8)$-current $\bar{Q}\gamma_\mu\gamma_5 Q - 3\bar{L}\gamma_\mu\gamma_5 L$ is spontaneously broken and has a color anomaly. Therefore, we have a potentially dangerous weak scale axion 16,17,18,19! An ETC-interaction of the form
$$\frac{g_{ETC}^2}{M_{ETC}^2}\left(\bar{Q}_L\gamma^\mu L_L\right)\left(\bar{L}_R\gamma_\mu Q_R\right) \tag{23}$$
is required to give the axion a mass, and we must 7 embed $SU(3)_C$ in ETC.

2.4 ETC etc.
There are other model-building constraints 20 on a realistic TC/ETC theory. A realistic ETC theory:
• must be asymptotically free,
• cannot have gauge anomalies,
• must produce small neutrino masses,
• cannot give rise to extra massless (or even light) gauge bosons,
• should generate weak CP-violation without producing unacceptably large amounts of strong CP-violation,
• must give rise to isospin-violation in fermion masses without large contributions 21,22 to $\Delta\rho$ and,
• must accommodate a large $m_t$ while giving rise to only small corrections 23,24 to $Z \to b\bar{b}$ and $b \to s\gamma$.
Clearly, building a fully realistic ETC model will be quite difficult! However, as I have emphasized before, this is because an ETC theory must provide a complete dynamical explanation of flavor. In the next section, I will concentrate on possible solutions to the flavor-changing neutral-current problem(s).

3 Walking Technicolor

3.1 The Gap Equation
Up to now we have assumed that technicolor is, like QCD, precociously asymptotically free with a small anomalous dimension $\gamma_m(\mu)$ for scales $\Lambda_{TC} < \mu < M_{ETC}$. However, as discussed above it is difficult to construct an ETC theory of this sort without producing dangerously large flavor-changing neutral currents. On the other hand, if the $\beta$-function $\beta_{TC}$ is small, $\alpha_{TC}$ can remain large above the scale $\Lambda_{TC}$, i.e. the technicolor coupling would "walk" instead of running. In this same range of momenta, $\gamma_m$ may be large and, since
$$\langle\bar{T}T\rangle_{ETC} = \langle\bar{T}T\rangle_{TC}\exp\left(\int_{\Lambda_{TC}}^{M_{ETC}}\frac{d\mu}{\mu}\,\gamma_m(\mu)\right), \tag{24}$$
this could enhance the size of the condensate renormalized at the ETC scale ($\langle\bar{T}T\rangle_{ETC}$) and produce larger fermion masses. 25,26,27,28,29,30 In order to proceed further, however, we need to understand how large $\gamma_m$ can be and how walking affects the technicolor chiral symmetry breaking dynamics. These questions cannot be addressed in perturbation theory. Instead, what is conventionally done is to use a nonperturbative approximation for $\gamma_m$ and chiral-symmetry breaking dynamics based on the "rainbow"
Figure 1. Schwinger-Dyson equation for the fermion self-energy function E(p) in the rainbow approximation. The dashed line represents the technigluon propagator and the solid line technifermion propagator.
approximation 31,32 to the Schwinger-Dyson equation shown in Figure 1. Here we write the full, nonperturbative, fermion propagator in momentum space as
$$iS^{-1}(p) = Z(p)\left(\not{p} - \Sigma(p)\right). \tag{25}$$
The linearized form of the gap equation in Landau gauge (in which $Z(p) = 1$ in the rainbow approximation) is
$$\Sigma(p) = 3C_2(R)\int\frac{d^4k}{(2\pi)^4}\,\frac{\alpha_{TC}((k-p)^2)}{(k-p)^2}\,\frac{\Sigma(k)}{k^2}. \tag{26}$$
Being separable, this integral equation can be converted to a differential equation which has the approximate (WKB) solutions 33,34
$$\Sigma(p) \propto p^{-\gamma_m(\mu)},\quad p^{\gamma_m(\mu)-2}. \tag{27}$$
Here $\alpha(\mu)$ is assumed to run slowly, as will be the case in walking technicolor, and the anomalous dimension of the fermion mass operator is
$$\gamma_m(\mu) = 1 - \sqrt{1 - \frac{\alpha_{TC}(\mu)}{\alpha_C}};\qquad \alpha_C \equiv \frac{\pi}{3C_2(R)}. \tag{28}$$
One can give a physical interpretation of these two solutions 35,36 in eq. 27. Using the operator product expansion, we find
$$\lim_{p\to\infty}\Sigma(p) \propto m(p) + \frac{\langle\bar{T}T\rangle_p}{p^2}. \tag{29}$$
Thus the first solution corresponds to a "hard mass" or explicit chiral symmetry breaking, while the second solution corresponds to a "soft mass" or spontaneous chiral symmetry breaking. If we let $m_0$ be the explicit mass of a fermion, dynamical symmetry breaking occurs only if
$$\lim_{m_0\to 0}\Sigma(p) \neq 0. \tag{30}$$
A careful analysis of the gap equation, or equivalently the appropriate effective potential, 37 implies that this happens only if $\alpha_{TC}$ reaches the critical value of chiral symmetry breaking, $\alpha_C$ defined in eq. (28). Furthermore, the chiral symmetry breaking scale $\Lambda_{TC}$ is defined by the scale at which
$$\alpha_{TC}(\Lambda_{TC}) = \alpha_C \tag{31}$$
and, hence, at least in the rainbow approximation, at which
$$\gamma_m(\Lambda_{TC}) = 1. \tag{32}$$
In the rainbow approximation, then, chiral symmetry breaking occurs when the "hard" and "soft" masses scale the same way. It is believed that even beyond the rainbow approximation one will find $\gamma_m = 1$ at the critical coupling. 38,39,40

3.2 Implications of Walking: Fermion and PGB Masses
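If $\beta(\alpha_{TC}) \simeq 0$ all the way from $\Lambda_{TC}$ to $M_{ETC}$, then $\gamma_m(\mu) \cong 1$ in this range. In this case, eq. (9) becomes
$$m_{q,\ell} = \frac{g_{ETC}^2}{M_{ETC}^2}\times\left(\langle\bar{T}T\rangle_{ETC} \cong \langle\bar{T}T\rangle_{TC}\,\frac{M_{ETC}}{\Lambda_{TC}}\right). \tag{33}$$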
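The enhancement factor in eq. (33) is just eq. (24) evaluated for constant $\gamma_m = 1$. The sketch below integrates eq. (24) numerically for an arbitrary $\gamma_m(\mu)$ and compares the QCD-like and extreme-walking limits. The scales $\Lambda_{TC} = 1$ TeV and $M_{ETC} = 1000$ TeV are illustrative assumptions, and the function name is invented for this example.

```python
import math


def condensate_enhancement(gamma_of_mu, lambda_tc, m_etc, steps=10_000):
    """Evaluate exp( int_{lambda_tc}^{m_etc} dmu/mu gamma_m(mu) ) from eq. (24).

    The integral is done with the midpoint rule in the variable t = ln(mu).
    """
    t0, t1 = math.log(lambda_tc), math.log(m_etc)
    dt = (t1 - t0) / steps
    integral = sum(
        gamma_of_mu(math.exp(t0 + (i + 0.5) * dt)) for i in range(steps)
    ) * dt
    return math.exp(integral)


LAMBDA_TC = 1.0    # TeV, assumed for illustration
M_ETC = 1000.0     # TeV, assumed for illustration

# QCD-like limit: negligible anomalous dimension, no enhancement.
running = condensate_enhancement(lambda mu: 0.0, LAMBDA_TC, M_ETC)
# Extreme walking: gamma_m = 1 over the whole range, enhancement M_ETC / Lambda_TC.
walking = condensate_enhancement(lambda mu: 1.0, LAMBDA_TC, M_ETC)

print(f"QCD-like enhancement: {running:.1f}")
print(f"extreme-walking enhancement: {walking:.1f}")
```

An intermediate case, with $\gamma_m(\mu)$ falling from 1 at $\Lambda_{TC}$ towards smaller values at $M_{ETC}$, can be explored by passing a different function for `gamma_of_mu`.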
We have previously estimated that flavor-changing neutral current requirements imply that the ETC scale associated with the second generation must be greater than of order 100 to 1000 TeV. In the case of walking technicolor the enhancement of the technifermion condensate implies that
$$m_{q,\ell} \simeq \frac{50 - 500\ \text{MeV}}{N_D^{3/2}\,\theta_{sd}^2}, \tag{34}$$
arguably enough to accommodate the strange and charm quarks. In addition to modifying our estimate of the relationship between the ETC scale and ordinary fermion masses, walking also influences the size of pseudo-Goldstone boson masses. In the case of walking, Dashen's formula for the size of pseudo-Goldstone boson masses in the presence of chiral symmetry breaking from ETC interactions, eq. (21), reads:
$$F_{TC}^2 M_{\pi_T}^2 \propto \frac{g_{ETC}^2}{M_{ETC}^2}\langle(\bar{T}T)^2\rangle_{ETC} \approx \frac{g_{ETC}^2}{M_{ETC}^2}\left(\langle\bar{T}T\rangle_{ETC}\right)^2 \simeq \frac{g_{ETC}^2}{M_{ETC}^2}\frac{M_{ETC}^2}{\Lambda_{TC}^2}\left(\langle\bar{T}T\rangle_{TC}\right)^2. \tag{35}$$
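Consistent with the rainbow approximation, we have used the vacuum-insertion to estimate the strong matrix element. Therefore we find
$$M_{\pi_T} \simeq g_{ETC}\left(\frac{4\pi F_{TC}^2}{\Lambda_{TC}}\right) \simeq g_{ETC}\left(\frac{750\ \text{GeV}}{N_D}\right)\left(\frac{1\ \text{TeV}}{\Lambda_{TC}}\right), \tag{36}$$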
i.e. walking also enhances the size of pseudo-Goldstone boson masses! As shown in the discussion of eq. 22, such an enhancement is welcome. While this is very encouraging, two caveats should be kept in mind. First, the estimates given are for the limit of "extreme walking", i.e. assuming that the technicolor coupling walks all the way from the technicolor scale $\Lambda_{TC}$ to the relevant ETC scale $M_{ETC}$. To produce a more complete analysis, ETC-exchange must be incorporated into the gap-equation technology in order to estimate ordinary fermion masses. Studies of this sort are encouraging; it appears possible to accommodate the first and second generation masses without necessarily having dangerously large flavor-changing neutral currents. 25,26,27,28,29,30 The second issue, however, is the third generation quarks, the top and bottom. Because of the large top-quark mass, further refinements 1 or modifications will be necessary to produce a viable theory of dynamical electroweak symmetry breaking. This issue remains the outstanding obstacle c in ETC or any theory of flavor. Various models, including top condensate, top seesaw, and top-color assisted technicolor have been proposed; many are discussed by Elizabeth Simmons in her lectures in this volume.
c As noted by Lane, 20 we cannot apply precision electroweak tests 41,42,43,44,45 to directly constrain theories of walking technicolor.

Lectures 2 & 3: Compositeness

4 What is Compositeness?
The relevant phenomenological question is: can any of the observed gauge bosons, the quarks and leptons, or the Higgs boson (if it exists) be composite particles? As we shall quantify in this lecture, the fact that the standard model works well implies that a successful theory must be one in which
• the short distance degrees of freedom are not the same as the long distance degrees of freedom, and
• the masses of the composite states are much less than the intrinsic scale of the dynamics $\Lambda$.
In order to obtain light bound states, the binding energy must be comparable to the intrinsic scale $\Lambda$ and, therefore, the bound states must be relativistic and the theory must be strongly-coupled. We will be discussing field theories in which these conditions are satisfied. As the masses of the composite states are less than the intrinsic scale ($\Lambda$) of the underlying dynamics, there must be a consistent effective field theory d valid for energies $E < \Lambda$ describing dynamics in that energy range. In general, we will not be able to completely solve the strongly-interacting underlying dynamics to give a complete description of the low-energy properties of the bound states. However, we may estimate the types and sizes of interactions 9,11,49,50 based on the following principles:
• That which is not forbidden is required: the effective lagrangian will include all interactions consistent with space-time, global, and gauge symmetries (and, in the case of supersymmetric theories, considerations of analyticity).
• No small dimensionless numbers: the interaction coefficients must be consistent with dimensional analysis.
When $\Lambda \to \infty$, i.e. if there is a large hierarchy of scales, the effective theory must reduce to a renormalizable theory up to corrections suppressed by powers of $\Lambda$.
From this point of view, the fact that current experimental results are consistent with a renormalizable theory (the standard one-doublet Higgs model) only implies that the scale $\Lambda$ must be larger (perhaps substantially larger) than energy scales we have experimentally probed.
d General reviews of effective field theory have been written by Howard Georgi 46, David Kaplan 47, and Antonio Pich. 48

4.1 Dimensional Analysis
Dimensional analysis is the key we will use to extract bounds on the scale of compositeness $\Lambda$ from the results of experiments. Using dimensional analysis, we will estimate sizes of interactions involving composite scalars ($\phi$), fermions ($\psi$), and vector bosons ($V_\mu$). Since the effective theory should have no small dimensionless numbers, the sizes of these interactions are determined by two parameters:
• $\Lambda$, the scale of the underlying strong dynamics, and
• $g$, the size of typical coupling constants.
As we now show, the natural size of $g$ is $4\pi$. Let us start with 51 the Wilsonian effective action at scale $\Lambda$:
$$S_\Lambda = \frac{\Lambda^4}{g^2}\int d^4x\ \mathcal{L}\left(\frac{g\phi}{\Lambda},\ \frac{g\psi}{\Lambda^{3/2}},\ \frac{gV_\mu}{\Lambda},\ \frac{\partial_\mu}{\Lambda}\right). \tag{37}$$
Here the parameters $\Lambda$ and $g$ are introduced to get the dimensions correct and to account for an extra coupling for each field in an interaction, while the $\Lambda^4/g^2$ is present to correctly normalize the kinetic energy, e.g.:
$$\frac{\Lambda^4}{g^2}\left(\frac{\partial_\mu}{\Lambda}\right)\left(\frac{g\phi}{\Lambda}\right)\left(\frac{\partial^\mu}{\Lambda}\right)\left(\frac{g\phi}{\Lambda}\right) = \partial_\mu\phi\,\partial^\mu\phi. \tag{38}$$
Consider 9 a process which receives contributions from one operator at tree-level, and another at $L$-loop order. Any powers of $\Lambda$ must be the same for both contributions. The $1/g^2$ pre-factor in $S_\Lambda$ "counts" the number of loops and therefore the ratio of the $L$-loop and tree-level contributions is of order
$$\left(\frac{g^2}{16\pi^2}\right)^L, \tag{39}$$
where we have included one $1/16\pi^2$ for each loop from 4-D phase space. Neither of the two extreme possibilities for $g$ is self-consistent:
• $g \ll 4\pi$: loop corrections would be negligible and the theory weakly coupled, contrary to the assumption of strongly-coupled bound-state dynamics.
• $g \gg 4\pi$: loop contributions would exceed the tree-level ones, so the assumed size of the coefficients would not be stable under radiative corrections.
Hence, we expect $g = O(4\pi)$ is the natural size for couplings in our effective theory. In the Wilsonian effective theory, one computes with a momentum-space cutoff of order $\Lambda$. All operators consistent with symmetry requirements then contribute at the same order in $\Lambda^2$ to each process, making it impractical for use in actual calculations. Instead, we will use a dimensionless regulator and organize computation in powers of $\Lambda^{-1}$. Matching our estimates from such a calculation with those from the Wilsonian approach implies that the rules of dimensional analysis give us the sizes of interaction coefficients defined using a dimensionless regulator renormalized at a scale of order $\Lambda$. In constructing the effective theory, we must impose all space-time, global, and gauge symmetries by hand. We may also incorporate any external, weakly-coupled fields (e.g. the photon $A_\mu$), by including an appropriate suppression factor (e.g. one factor of $e/g$ for every $A_\mu$).

4.2 Example: The QCD Chiral Lagrangian
As an example of the use of dimensional analysis in an effective lagrangian, consider the chiral lagrangian in QCD. 52,53,10 The approximate $SU(2)_L \times SU(2)_R$ chiral symmetry of the QCD lagrangian for light quarks is spontaneously broken to isospin, $SU(2)_V$, producing three (approximate) Goldstone bosons $\pi^a$ which we identify with the ordinary pions. In terms of the matrix $\tilde{\pi} = \pi^a\sigma^a/2$, where the $\sigma^a$ are the Pauli matrices, we define
$$\Sigma = \exp\left(\frac{2i\tilde{\pi}}{f_\pi}\right), \tag{40}$$
which transforms as
$$\Sigma \to L\Sigma R^\dagger,\qquad L,R \in SU(2)_{L,R}. \tag{41}$$
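If we write the lagrangian for $\Sigma$ in an expansion in powers of momentum, the lowest order term invariant under the symmetry of eq. 41 is
$$\frac{\Lambda^4}{g^2}\,\frac{1}{4}\,\mathrm{Tr}\left[\left(\frac{\partial_\mu}{\Lambda}\Sigma\right)\left(\frac{\partial^\mu}{\Lambda}\Sigma^\dagger\right)\right] \to \frac{f_\pi^2}{4}\,\mathrm{Tr}\left(\partial_\mu\Sigma\,\partial^\mu\Sigma^\dagger\right), \tag{42}$$
where we have identified $\Lambda/g = f_\pi \approx 93$ MeV (from the chiral current) and canonically normalized the kinetic energy of the pion fields. It is customary to denote the dimensional scale $gf_\pi$ by $\Lambda_{\chi SB} \simeq 1$ GeV. Higher order chirally invariant terms are possible and applying the dimensional rules we find, for example, a term with four powers of momentum
$$\frac{f_\pi^2}{\Lambda_{\chi SB}^2}\,\mathrm{Tr}\left(\partial^\mu\Sigma\,\partial_\mu\Sigma^\dagger\right)^2. \tag{43}$$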
From this we conclude that chiral perturbation theory ($\chi$PT) is an expansion in $p^2/\Lambda_{\chi SB}^2$. 9,11 In reality, chiral symmetry is not exact, as the bare quark mass terms violate the symmetry:
$$\mathcal{L}_{QCD} = \bar{\psi}i\not{D}\psi - \bar{\psi}_L M\psi_R - \bar{\psi}_R M^\dagger\psi_L. \tag{44}$$
We can incorporate $M$ as an "external field" in the chiral lagrangian. 12 Consider first the symmetry properties of the quark mass term: $\mathcal{L}_{QCD}$ is "invariant" under a chiral transformation combined with the redefinition
$$M \to LMR^\dagger,\qquad L,R \in SU(2)_{L,R}, \tag{45}$$
therefore terms incorporating $M$ must also have this property (this is, essentially, an implementation of a generalized Wigner-Eckart theorem). The power-counting for $M$ can be established by considering the "natural size" of the fermion mass
$$\frac{\Lambda^4}{g^2}\left(\frac{g\bar{\psi}}{\Lambda^{3/2}}\right)\left(\frac{g\psi}{\Lambda^{3/2}}\right) = \Lambda_{\chi SB}\,\bar{\psi}\psi, \tag{46}$$
in the absence of chiral symmetry. Therefore, the small parameter $M/\Lambda_{\chi SB}$ is a measure of explicit chiral symmetry breaking and the leading term in $M$ has the form
$$\frac{\Lambda^4}{g^2}\,\mathrm{Tr}\left(\frac{M\Sigma^\dagger}{\Lambda}\right) + \text{h.c.} \tag{47}$$
This leads to the usual result $m_\pi^2 \simeq \Lambda_{\chi SB}(m_u + m_d)$.

5 The Phenomenology of Compositeness

Using the rules of dimensional analysis, we can investigate the phenomenology of the compositeness of the observed fermions and gauge bosons. To the extent that they appear fundamental, we establish lower bounds on their scale of compositeness.

5.1 Fermions
We begin by considering the quarks and leptons. Compositeness can be expected to produce several phenomenological effects:
1. Form Factors: If ordinary fermions are composite particles, we expect their gauge interactions to have nontrivial form factors. These form factors can be thought of, in analogy with "vector meson dominance" for the pion form-factor in QCD, as arising from processes such as:
[Diagram: a gauge boson ($W, Z, g, \gamma$) coupling through a resonance with mass of order $\Lambda$] (48)
yielding changes in four-fermion cross sections of the form
$$\sigma(f\bar{f} \to f'\bar{f}') \approx \frac{\pi\alpha^2}{s}\left[1 + O\left(\frac{s}{\Lambda^2}\right)\right], \tag{49}$$
where $s$ is the partonic center-of-mass energy squared of the process.
2. Contact Interactions 54: If ordinary fermions are composite, they can also directly exchange heavy resonances arising from the interactions responsible for binding the fermions:
[Diagram: exchange of a heavy resonance between two fermion pairs] (50)
The effect of these interactions on four-fermion cross sections is expected to be much larger,
$$\sigma(f\bar{f} \to f'\bar{f}') \approx \frac{\pi\alpha^2}{s}\left[1 + O\left(\frac{g^2}{\alpha_{sm}}\frac{s}{\Lambda^2}\right)\right], \tag{51}$$
due to the factor of $g^2/\alpha_{sm}$.
By searching for deviations in four-fermion processes from the predictions of the standard model, we can place bounds on possible contact interactions. In principle, any interactions of the form
$$\mathcal{L}_{FF} = \frac{g^2}{2!\,\Lambda^2}\,\eta_{LL}\left(\bar{\psi}_L\gamma_\mu\psi_L\right)\left(\bar{\psi}_L\gamma^\mu\psi_L\right) + (RR,\ LR) \tag{52}$$
consistent with chiral symmetry, gauge symmetries, and flavor symmetries may be present. The convention followed in the typical analyses is $\alpha \equiv g^2/4\pi = 1$. Note that this is lower than would be expected from dimensional analysis and tends to understate the limits on $\Lambda$. In addition, the analyses generally set only one coefficient (e.g. $\eta_{LL}$ or $\eta_{LR}$) to be non-zero at a time, and give it a value of $\pm 1$, which effectively "normalizes" $\Lambda$. Current lower bounds on the scale $\Lambda$ are shown in table 2, and are typically several TeV. 2
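Scale                             Limit        Experiment
$\Lambda^+_{LL}(eeee)$            > 3.1 TeV    OPAL
$\Lambda^-_{LL}(eeee)$            > 3.8 TeV
$\Lambda^+_{LL}(ee\mu\mu)$        > 4.5 TeV    OPAL
$\Lambda^-_{LL}(ee\mu\mu)$        > 4.3 TeV
$\Lambda^+_{LL}(ee\tau\tau)$      > 3.8 TeV    OPAL
$\Lambda^-_{LL}(ee\tau\tau)$      > 4.0 TeV
$\Lambda^+_{LL}(\ell\ell\ell\ell)$  > 5.2 TeV    OPAL
$\Lambda^-_{LL}(\ell\ell\ell\ell)$  > 5.3 TeV
$\Lambda^+_{LL}(eeqq)$            > 4.4 TeV    OPAL
$\Lambda^-_{LL}(eeqq)$            > 2.8 TeV
$\Lambda^+_{LL}(\nu\nu qq)$       > 5.0 TeV    NUTEV
$\Lambda^-_{LL}(\nu\nu qq)$       > 5.4 TeV
$\Lambda^\pm_{LL}(qqqq)$          > 1.9 TeV    D0

Table 2. Current limits on the scale $\Lambda$ for compositeness derived from various four-fermion scattering experiments. 2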
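Because the experiments constrain the combination $g^2/\Lambda^2$ in eq. (52), a limit quoted with the convention $g^2/4\pi = 1$ can be translated to any other assumed coupling by a simple rescaling. The sketch below does this for the dimensional-analysis estimate $g = 4\pi$, using a few of the entries of table 2. The function name and dictionary keys are invented for this illustration.

```python
import math

# A few limits from table 2, in TeV, quoted with the convention g^2 / (4 pi) = 1.
LIMITS_TEV = {
    "Lambda+_LL(eeee)": 3.1,
    "Lambda+_LL(ee mu mu)": 4.5,
    "Lambda+_LL(llll)": 5.2,
    "Lambda_LL(qqqq)": 1.9,
}

G_CONVENTION = math.sqrt(4.0 * math.pi)  # coupling implied by g^2 / (4 pi) = 1


def rescale_limit(limit_tev, g_new, g_old=G_CONVENTION):
    """Rescale a bound on Lambda when the assumed coupling changes.

    The data bound the coefficient g^2 / Lambda^2, so Lambda scales linearly in g.
    """
    return limit_tev * g_new / g_old


for name, limit in LIMITS_TEV.items():
    nda = rescale_limit(limit, g_new=4.0 * math.pi)
    print(f"{name}: {limit:.1f} TeV (convention) -> {nda:.1f} TeV (g = 4 pi)")
```

The rescaling factor is $\sqrt{4\pi} \approx 3.5$, so limits of several TeV in the conventional normalization correspond to scales above roughly 10 TeV if the underlying coupling is as large as dimensional analysis suggests.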
5.2 Gauge Bosons

Next we consider the possibility that the ordinary gauge bosons are composite objects. At first sight this seems unreasonable, since these particles are associated with a local gauge symmetry. However, as shown by Weinberg, 55 for consistency any massless vector particle must couple to a conserved current, i.e. the existence of a "gauge symmetry" is automatic for any massless spin-1 particle. Dimensional analysis allows us to estimate the size of the resulting couplings. The natural size of the coupling constant is $O(4\pi)$, as can be seen by considering a generic three-point coupling:
$$\frac{\Lambda^4}{g^2}\left(\frac{\partial}{\Lambda}\right)\left(\frac{gV_\mu}{\Lambda}\right)^3 \ \Rightarrow\ g \sim 4\pi. \tag{53}$$
However, the standard model gauge interactions are asymptotically free and their couplings at high energies (~ 1 TeV) are small. For this reason, it is likely that the $\gamma$, $g$, $W_T$, $Z_T$ are fundamental. The situation is quite different for the longitudinally polarized weak gauge bosons, the $W_L$ and $Z_L$. These particles are "eaten" Goldstone bosons, have effective couplings to each other proportional to momentum and, as discussed in the previous lecture, they are not fundamental in theories of dynamical electroweak symmetry breaking. The effective lagrangian for a theory of massive
electroweak bosons with composite longitudinal modes includes the fundamental $W_T$ and $B_T$ gauge bosons of $SU(2)_W \times U(1)_Y$ symmetry. In addition, just as in the chiral lagrangian in QCD, we may describe the Goldstone bosons of electroweak symmetry breaking by a matrix $\Sigma$ which transforms to $L\Sigma R^\dagger$ under a global $SU(2)_L \times SU(2)_R$ which is broken to $SU(2)_V$. As discussed previously, the residual "custodial" $SU(2)_V$ symmetry ensures 5,6 that the weak interaction $\rho$ parameter (eq. 7) is equal to 1. The low-energy effects of dynamical electroweak symmetry breaking include anomalous weak gauge-boson couplings described by the effective lagrangian described above. These corrections can be thought of as due to the exchange of the lightest resonances likely to be present in such theories, "technirho" vector mesons ($\rho_{TC}$) analogous to the $\rho$ in QCD. Such corrections modify the 3-pt functions e:
[Diagram: gauge boson coupling to a $W_L$ pair through $\rho_{TC}$ exchange] (54)
yielding the couplings
$$-ig\,\frac{l_{9L}}{16\pi^2}\,\mathrm{Tr}\,W^{\mu\nu}D_\mu\Sigma D_\nu\Sigma^\dagger \tag{55}$$
and
$$-ig'\,\frac{l_{9R}}{16\pi^2}\,\mathrm{Tr}\,B^{\mu\nu}D_\mu\Sigma^\dagger D_\nu\Sigma. \tag{56}$$
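In these expressions we have taken $g \simeq 4\pi$, so the $l$'s are normalized to be $O(1)$. The conventional 57 description of anomalous weak gauge-boson couplings was given by Hagiwara, et al.:
$$\frac{i}{e\cot\theta}\,\mathcal{L}_{WWZ} = g_1^Z\left(W^\dagger_{\mu\nu}W^\mu Z^\nu - W^\dagger_\mu Z_\nu W^{\mu\nu}\right) + \kappa_Z W^\dagger_\mu W_\nu Z^{\mu\nu} + \frac{\lambda_Z}{M_W^2}\,W^\dagger_{\lambda\mu}W^\mu{}_\nu Z^{\nu\lambda} \tag{57}$$
and
$$\frac{i}{e}\,\mathcal{L}_{WW\gamma} = \left(W^\dagger_{\mu\nu}W^\mu A^\nu - W^\dagger_\mu A_\nu W^{\mu\nu}\right) + \kappa_\gamma W^\dagger_\mu W_\nu F^{\mu\nu} + \frac{\lambda_\gamma}{M_W^2}\,W^\dagger_{\lambda\mu}W^\mu{}_\nu F^{\nu\lambda}. \tag{58}$$
e There are also "vacuum polarization" corrections to the 2-pt functions, generally expressed in terms of contributions to the Peskin-Takeuchi 41,42,43,44,45 S and T parameters. 20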
[Figure: LEP preliminary 68% CL and 95% CL contours with the SM point marked; horizontal axis $\Delta g_1^Z$, both axes running from -0.3 to 0.3.]
Figure 2. Current limits 56 on the anomalous weak gauge-boson coupling parameters $\kappa_\gamma$ and $g_1^Z$.
Comparing with the interactions above (in unitary gauge, $\Sigma = I$), we find:
$$\left.\begin{array}{l} g_1^Z - 1 \\ \kappa_Z - 1 \\ \kappa_\gamma - 1 \end{array}\right\} \approx \frac{\alpha_* l_i}{4\pi\sin^2\theta} = O(10^{-2} - 10^{-3}). \tag{59}$$
The couplings $\lambda_{Z,\gamma}$ arise from higher order interactions, and are estimated to be $O(10^{-4} - 10^{-5})$. Current limits from LEP are shown in fig. 2. These data show agreement with the standard model, but do not set useful limits.

For your consideration...
1. If quarks and leptons are composite, one expects there are excited states with the same quantum numbers (typically denoted $\ell^*$ and $q^*$). The PDG 2 lists bounds on excited states of quarks and leptons of O(100 GeV).
• Based on dimensional analysis, f what bound does this place on the scale of compositeness $\Lambda$?
2. BNL experiment E-821 will measure the anomalous magnetic moment of the muon to 0.35 ppm.
• Show using dimensional analysis that this measurement should be sensitive to one-loop weak corrections.
• If the experiment agrees with the SM, what bound will this measurement place on the scale of muon compositeness?

6 Composite Higgs Bosons
Up to now, our discussion of compositeness has consisted of the construction of consistent effective low-energy theories and an analysis of current experimental lower bounds on the scale of compositeness of the observed particles. In order to proceed, we need to understand the characteristics of plausible fundamental high-energy theories which give rise to light (possibly massless) composite states. In this section we consider models which give rise to composite scalars, and in particular a Higgs boson. In subsequent sections, we will describe models giving rise to composite fermions and gauge bosons.

6.1 Top-Condensate Models

The fact that the top quark, with $m_t \sim 2M_W, 2M_Z$, is much heavier than other fermions implies that the top is more strongly coupled to the EWSB sector. This has led to the construction g of models in which all 59,60,61,62,63,64,65,66,67 or some 68 of electroweak symmetry breaking is due to top condensation, $\langle\bar{t}t\rangle \neq 0$. The simplest model involves a spontaneously broken but strong topcolor 64 gauge interaction
$$SU(3)_{tc} \times SU(3) \xrightarrow{\ M\ } SU(3)_{QCD} \tag{60}$$
which couples preferentially to the third generation of quarks. At energies small compared to the mass ($M$) of the topgluon, such an interaction gives
f See also Weinberg and Witten. 58
g The phenomenology of these models, especially as they relate to the top quark, is discussed in the lectures by Elizabeth Simmons in this volume.
Figure 3. Top-quark condensate produced in NJL approximation to topcolor dynamics as a function of the coupling K. Condensate forms only for K > KC and, at least in this approximation, the condensate turns on continuously - i.e. the quantum chiral phase transition is second order.
Figure 4. Spectrum of low-energy states in any model with a second order quantum chiral phase transition. The spectrum must change smoothly at the critical coupling; there must be massless fermions below the critical coupling and massless Goldstone bosons above the critical coupling.
rise to a local four-fermion operator
$$-\frac{4\pi\kappa}{M^2}\left[\bar{Q}\gamma_\mu\frac{\lambda^a}{2}Q\right]^2 \tag{61}$$
where $\kappa \propto g_{tc}^2(M)$, the $\lambda^a$ are the Gell-Mann matrices, and the fields $Q$ are the (left- and right-handed, for the model as described so far) third-generation weak doublet fields. This model may be solved 69 in the "NJL approximation" in the large-$N_c$ limit. The behavior of the chiral symmetry breaking condensate is shown in fig. 3, with: (62)
The condensate changes smoothly from zero as $\kappa$ exceeds the critical value $\kappa_c$; this behavior represents a second order chiral phase transition. Clearly, if $M \gg 1$ TeV and the dynamics of topcolor occurs at scales much higher than 1 TeV, the value of $\kappa$ must be tuned close to $\kappa_c$. Assuming the transition is second order, as motivated by the NJL calculation, it is easy to understand the form of the effective low-energy field theory when $\kappa$ is just slightly greater than $\kappa_c$. As shown in fig. 4, the light degrees of freedom include the top-quark, the Goldstone bosons eaten by the $W$ and $Z$, as well as a singlet scalar particle (the $\sigma$). In order for this theory to have a smooth limit as $\kappa \to \kappa_c^+$, the $\sigma$ and the Goldstone bosons must arrange themselves to form a light composite Higgs boson! In fact, for the theory as described, we have not distinguished the top from the bottom quark. The theory includes two light Higgs doublets in a $2 \times 2$ matrix field $\Phi$ as required in an $SU(2)_L \times SU(2)_R$ linear sigma model. Using our rules of dimensional analysis, the most general effective lagrangian describing the light fields is:
$$\mathcal{L}_{eff} = \mathrm{Tr}\left(\partial^\mu\Phi^\dagger\partial_\mu\Phi\right) + \bar{\psi}i\not{D}\psi + y\,\bar{\psi}\Phi\psi + O\left(\frac{1}{M^2}\right). \tag{63}$$
Note that the absence of a Higgs mass term ($m^2\,\mathrm{Tr}(\Phi^\dagger\Phi)$) is due entirely to the dynamical assumption that we are (very) close to the transition ($|m^2| \ll M^2$). Dimensional analysis implies that $y = O(4\pi)$ and a heavy top quark arises naturally. The Higgs (at large $N_c$ in the NJL approximation) can be found directly as a pole in the sum of bubble diagrams 63
[Diagrams: sum of chains of fermion bubbles] (64)
The eaten Goldstone bosons arise in the corresponding diagrams for the W/Z self-energies:
[Diagram: W/Z self-energy with a fermion bubble] (65)
and implies a Higgs vacuum expectation value
_ Nc fA k2dk2mj st' = : 2)2, • ~ -iz 47 J :{k2 :+ m
(66)
A number of phenomenological issues must be addressed prior to constructing a realistic model based on topcolor. First, additional "tilting" interactions must be introduced to ensure that the bottom quark is not heavy, i.e. $\langle\bar{b}b\rangle \approx 0$. Second, some account must be given of the observed mixing between the third generation and the first two. Finally, top quark condensation alone produces only a Higgs vacuum expectation value of $f_t \sim 60$ GeV, which is too small to account for electroweak symmetry breaking. 63 The simplest model of a single composite Higgs boson based on top-condensation is the top seesaw model. 66,67 In this model, electroweak symmetry breaking is due to the condensate of the left-handed top quark $t_L$ with a new right-handed weak singlet quark $\chi_R$. While $\langle\bar{t}_L\chi_R\rangle$ is responsible for all of electroweak symmetry breaking, mixing of the top with left- and right-handed singlet quarks yields a seesaw mass matrix
$$\left(\bar{t}_L\ \ \bar{\chi}_L\right)\begin{pmatrix} 0 & m_{t\chi} \\ \mu_{\chi t} & \mu_{\chi\chi} \end{pmatrix}\begin{pmatrix} t_R \\ \chi_R \end{pmatrix} \tag{67}$$
which gives rise to the observed mass-eigenstate top-quark.
[Diagram label: EWS breaking by condensation with massive new singlet fermion]

6.2 The Triviality of the Standard Higgs Model h
A composite Higgs is also motivated by the fact that the standard one-doublet Higgs model does not strictly exist as a continuum field theory. This result is most easily illustrated in terms of the Wilson renormalization group. 71,72,73 Any quantum field theory is defined using a regularization procedure which ameliorates the bad short-distance behavior of the theory. Following Wilson, we define the scalar sector of the standard model
(68)
in terms of a fixed UV-cutoff A. Here we have allowed for the possibility of terms of (engineering) dimension greater than four. While there are an infinite number of such terms, one representative term of this sort, (
(69)
h The work presented in this and the following two subsections has appeared previously.

Figure 5. Graphical representation of Wilson RG flow of $(m^2(\Lambda), \lambda(\Lambda), \eta(\Lambda))$. As we scale to low energies, $m^2 \to \infty$, $\lambda \to 0$, and $\eta \to 0$.

Figure 6. Results of a nonperturbative lattice monte carlo study 74 of the scalar sector of the standard model with bare coupling $\lambda = 10$. The approximate slope of $-1$ for the renormalized coupling, $\lambda_R$, shows agreement with the naive one-loop perturbative result. [Plot: horizontal axis $\ln(|\ln(|\tau|)|)$; fitted slope $= -1.03(8)$.]
$\lambda(\Lambda) \to \lambda(\Lambda')$, $\eta(\Lambda) \to \eta(\Lambda')$. Wilson's insight was to see that many properties of the theory can be summarized in terms of the evolution of these (generalized) couplings as we move to lower energies. Truncating the infinite-dimensional coupling constant space to the three couplings shown above, the behavior of the scalar sector of the standard model is illustrated in Figure 5. This figure illustrates a number of important features of scalar field theory. As we flow to the infrared, i.e. lower the effective cutoff, we find:
• $\eta \to 0$: this is the modern interpretation of renormalizability. If $m_H \ll \Lambda$, the theory is drawn to the two-dimensional $(m_H, \lambda)$ subspace. Any theory in which $m_H \ll \Lambda$ is close to a renormalizable theory.
• $m^2 \to \infty$: this is the naturalness or hierarchy problem. To maintain $m_H$ of order $v$ we must adjust the value of $m^2$ in the underlying theory to of order
$$\frac{\Delta m^2(\Lambda)}{m^2(\Lambda)} \propto \frac{v^2}{\Lambda^2}. \tag{70}$$
• $\lambda \to 0$: The coupling $\lambda$ has a positive $\beta$ function and, therefore, as we scale to low energies $\lambda$ tends to 0. If we try to take the "continuum" limit, $\Lambda \to +\infty$, the theory becomes free or trivial. 71,72,73
The triviality of the scalar sector of the standard one-doublet Higgs model implies that this theory is only an effective low-energy theory valid below some cut-off scale $\Lambda$. Given a value of $m_H^2 = 2\lambda(m_H)v^2$, there is an upper bound on $\Lambda$. An estimate of this bound can be obtained by integrating the one-loop $\beta$-function, which yields
$$\Lambda \lesssim m_H\exp\left(\frac{4\pi^2v^2}{3m_H^2}\right). \tag{71}$$
For a light Higgs, the bound above is at uninterestingly high scales and the effects of the underlying dynamics can be too small to be phenomenologically relevant. For a Higgs mass of order a few hundred GeV, however, effects from the underlying physics can become important. I will refer to these theories generically as "composite Higgs" models. Finally, while the estimate above is based on a perturbative analysis, nonperturbative investigations of $\lambda\phi^4$ theory on the lattice show the same behavior. This is illustrated in Figure 6.

6.3 T, S, and U in Composite Higgs Models
In an $SU(2)_W \times U(1)_Y$ invariant scalar theory of a single doublet, all interactions of dimension less than or equal to four also respect a larger "custodial" symmetry 5,6 which ensures the tree-level relation $\rho = M_W^2/M_Z^2\cos^2\theta_W = 1$ is satisfied. The leading custodial-symmetry violating operator is of dimension six 75,76 and involves four Higgs doublet fields $\phi$. In general, the underlying theory does not respect the larger custodial symmetry, and we expect the interaction
$$\frac{b\,\kappa^2}{2!\,\Lambda^2}\left(\phi^\dagger D^\mu\phi\right)\left(\phi^\dagger D_\mu\phi\right) \tag{72}$$
Figure 7. Upper bound on scale $\Lambda$ (vertical axis, logarithmic) as a function of $m_H$ (horizontal axis), as per eq. (71).

Figure 8. Lower bound on expected size of $|\Delta T|$ as a function of $m_H$, as per eq. (73), for $|b|\kappa^2 = 16\pi^2$, $4\pi$, and 3.
to appear in the low-energy effective theory. Here $b$ is an unknown coefficient of $O(1)$, and $\kappa$ measures the size of couplings of the composite Higgs field. In a strongly-interacting theory, $\kappa$ is expected11,49 to be of $O(4\pi)$. Deviations in the low-energy theory from the standard model can be summarized in terms of the "oblique" parameters41,42,43,44,45 $S$, $T$, and $U$. The operator in eq. 72 will give rise to a deviation ($\Delta\rho = \epsilon_1 = \alpha T$)
$$|\Delta T| = \frac{|b|\,\kappa^2 v^2}{\alpha(M_Z^2)\,\Lambda^2} \gtrsim \frac{|b|\,\kappa^2 v^2}{\alpha(M_Z^2)\,m_H^2}\,\exp\left(-\frac{8\pi^2 v^2}{3 m_H^2}\right), \qquad (73)$$
where $v \approx 246$ GeV and we have used eq. 71 to obtain the final inequality. The consequences of eqns. (71) and (73) are summarized in Figures 7 and 8. The larger $m_H$, the lower $\Lambda$ and the larger the expected value of $\Delta T$. Current limits imply $|T| \lesssim 0.5$, and hence77 $\Lambda \gtrsim 4\ {\rm TeV}\cdot\kappa$. (For $\kappa \approx 4\pi$, $m_H \lesssim 450$ GeV.) By contrast, the leading contribution to $S$ arises from
$$-\frac{a}{2!\,\Lambda^2}\,\left\{[D_\mu, D_\nu]\phi\right\}^\dagger [D^\mu, D^\nu]\phi. \qquad (74)$$
This gives rise to ($\epsilon_3 = \alpha S/4\sin^2\theta_W$)
$$\Delta S = \frac{4\pi a v^2}{\Lambda^2}. \qquad (75)$$
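It is important to note that the sizes of the contributions to $\Delta T$ and $\Delta S$ are very different:
$$\frac{\Delta S}{\Delta T} = -\frac{a}{b}\left(\frac{4\pi\alpha}{\kappa^2}\right) = O\left(\frac{10^{-1}}{\kappa^2}\right). \qquad (76)$$
Even for $\kappa \approx 1$, $|\Delta S| \ll |\Delta T|$. Finally, contributions to $U$ ($\epsilon_2 = -\alpha U/4\sin^2\theta_W$) arise from
$$\frac{c\,g^2\kappa^2}{\Lambda^4}\,(\phi^\dagger W^{\mu\nu}\phi)^2 \qquad (77)$$
and, being suppressed by $\Lambda^4$, are typically much smaller than $\Delta T$.
6.4 Limits on a Composite Higgs Boson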
From triviality, we see that the Higgs model can only be an effective theory valid below some high-energy scale $\Lambda$. As the Higgs becomes heavier, the scale $\Lambda$ decreases. Hence, the expected size of contributions to $T$ grows, and these are larger than the expected contributions to $S$ or $U$. The limits from precision electroweak data in the $(m_H, \Delta T)$ plane are shown in Figure 9. We see that, for positive $\Delta T$ at 95% CL, the allowed values of the Higgs mass extend to well beyond 800 GeV. On the other hand, not all values can be realized consistent with the bound given in eq. (71). As shown in Figure 9, values of the Higgs mass beyond approximately 500 GeV would likely require values of $\Delta T$ much larger than allowed by current measurements. I should emphasize that these estimates are based on dimensional arguments. I am not arguing that it is impossible to construct a composite Higgs model consistent with precision electroweak tests with $m_H$ greater than 500 GeV. Rather, barring accidental cancellations in a theory without a custodial symmetry, contributions to $\Delta T$ consistent with eq. 71 are generally to be expected. In particular composite Higgs boson models, the bounds given here have been shown to apply.78 These results may also be understood by considering limits in the $(S, T)$ plane for fixed $(m_H, m_t)$. In Figure 10, changes from the nominal standard model best fit ($m_H = 84$ GeV) value of the Higgs mass are displayed as contributions to $\Delta S(m_H)$ and $\Delta T(m_H)$. Also shown are the 68% and 95% CL bounds on $\Delta S$ and $\Delta T$ consistent with current data. We see that, for $m_H$ greater than $O(200\ {\rm GeV})$, a positive contribution to $T$ can bring the model within the allowed region. At Run II of the Fermilab Tevatron, it may be possible to reduce the uncertainties in the top-quark and W-boson masses to $\Delta m_t = 2$ GeV and
Figure 9. 68% and 95% CL regions allowed78 in the $(m_H, |\Delta T|)$ plane by precision electroweak data.79 Fit allows for $m_t$, $\alpha_s$, and $\alpha_{em}$ to vary consistent with current limits.78 Also shown by the dot-dash curve is the contour corresponding to $\Delta\chi^2 = 4$, whose intersection with the line $\Delta T = 0$ - at approximately 190 GeV - corresponds to the usual 95% CL upper bound quoted on the Higgs boson mass in the standard model. The triviality bound curves are for $|b|\kappa^2 = 4\pi$ and $4\pi^2$, corresponding to representative models.78
$\Delta M_W = 30$ MeV.80 Assuming that the measured values of $m_t$ and $M_W$ equal their current central values, such a reduction in uncertainties will result in the limits in the $(m_H, \Delta T)$ plane shown in Figure 11. Note that, despite reduced uncertainties, a Higgs mass of up to 500 GeV or so will still be allowed.
7 Composite Fermions
7.1 Chiral Symmetry and Anomalies
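Consider a chiral transformation of a 4-component Dirac field:
$$\psi \to e^{+i\alpha\gamma_5}\psi \quad\Leftrightarrow\quad \psi_{L,R} \to e^{\pm i\alpha}\psi_{L,R}, \qquad (78)$$
which may also be written
$$\psi_L \to e^{+i\alpha}\psi_L, \qquad C\bar\psi_R^T = \psi^c_L \to e^{+i\alpha}\psi^c_L \qquad (79)$$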
Figure 10. 68% and 95% CL regions allowed in the $(\Delta S, \Delta T)$ plane by precision electroweak data.79 Fit allows for $m_t$, $\alpha_s$, and $\alpha_{em}$ to vary consistent with current limits.78 Standard model prediction for varying Higgs boson mass shown as parametric curve, with $m_H$ varying from 84 to 1000 GeV.
Figure 11. 68% and 95% CL allowed region78 in the $(m_H, \Delta T)$ plane if Fermilab Tevatron Run II reduces the uncertainty in top-quark and W-boson masses to $\Delta m_t = 2$ GeV and $\Delta M_W = 30$ MeV about their current central values.
in terms of the charge-conjugate field $\psi^c_L$. Since fermion mass terms
$$m\bar\psi\psi = m\bar\psi_L\psi_R + {\rm h.c.} \qquad (80)$$
couple the left- and right-handed components of a fermion field, mass terms are not chirally invariant. Therefore, an unbroken global chiral symmetry is a sufficient condition for the existence of massless fermions at low energies. If these massless fermions are composite objects, the fundamental fermions of which they are composed must carry the same symmetries. 't Hooft realized81 that there were additional constraints relating the representations of the fundamental and composite fermions: the anomaly matching conditions. We begin our discussion with a review of anomalies in quantum field theory. Regularization of quantum field theory, necessary to extract finite answers, generally breaks various global symmetries of the model, and one must re-impose these symmetries upon renormalization. However, regularization always breaks chiral symmetries and, surprisingly, we generally cannot re-impose both82,83,84,85 vector and chiral symmetries in the renormalized theory! The result is that a symmetry of the classical theory is broken at the quantum level. Hence, the use of the term anomaly.
In perturbation theory, the anomaly manifests itself in the behavior of the triangle diagram
[Diagram: fermion triangle loop with a current at each of its three vertices.] (81)
Imposing vector current conservation, one finds that the divergence of the axial current is nonzero and proportional to
$${\rm Tr}\left(T^a\{T^b, T^c\}\right). \qquad (82)$$
For simplicity, we will write the theory, and symmetries, in terms of left-handed fermions and currents only. Diagrammatically, one can move the chiral projector $\left(\frac{1-\gamma_5}{2}\right)$ to a single vertex. We will regularize so that the resulting VVV term is not anomalous; the AVV term remains as above.
(Gauge)$^3$ Anomalies
Consider first a vectorial $SU(N_c)$ gauge theory with $N_f$ fermions in the fundamental $N_c$ representation. The fermions transform as:
$$\psi_L: (N_c, N_f, 1) \qquad \psi^c_L: (\bar N_c, 1, \bar N_f) \qquad (83)$$
under the $SU(N_c)$ gauge and $SU(N_f)_L \times SU(N_f)_R$ global symmetries. For consistency, the $SU(N_c)$ gauge current must be conserved and the corresponding symmetry cannot be anomalous. For a representation $R$, define:
$${\rm Tr}\left(T^a_R\{T^b_R, T^c_R\}\right) = \frac{1}{2}A(R)\,d^{abc}, \qquad (84)$$
and hence $A(N_c) = 1$. For the $\bar N_c$ representation:
$${\rm Tr}\left(T^a_R\{T^b_R, T^c_R\}\right) = {\rm Tr}\left((-T^a)^T\{(-T^b)^T, (-T^c)^T\}\right), \qquad (85)$$
and $A(\bar N_c) = -1$. Hence the total gauge anomaly
$$N_f\cdot A(N_c) + N_f\cdot A(\bar N_c) = 0. \qquad (86)$$
In general, for any real representation the generators $T^a_R$ are unitarily equivalent to $(-T^a_R)^T$ and therefore $A(R) = 0$. One can also construct a chiral gauge theory using a complex, but anomaly-free, representation. Consider the two-index antisymmetric tensor representation, $A_{ij}$, of $SU(N)$. Since $d^{abc}$ is a group-theoretic invariant, to calculate $A(R)$ one need only consider a single nonvanishing term on the left hand side of eq. 84. In particular, consider the generator proportional to
$${\rm diag}(-1, -1, \ldots, -1, N-1) \qquad (87)$$
in the fundamental representation. ${\rm Tr}(T^3)$ implies
$$A(N) \propto (N-1)\cdot(-1)^3 + 1\cdot(N-1)^3 \qquad (88)$$
for the fundamental representation $N$ and
$$A(A) \propto \frac{(N-1)(N-2)}{2}\cdot(-2)^3 + (N-1)\cdot(N-2)^3 \qquad (89)$$
for the antisymmetric tensor representation $A$. A little algebra (the right-hand sides are $N(N-1)(N-2)$ and $N(N-1)(N-2)(N-4)$ respectively) shows that $A(A) = (N-4)\,A(N)$, hence we can construct an anomaly-free chiral gauge theory by including fermions transforming as one antisymmetric tensor and $N-4$ antifundamentals. For your consideration . . .
Consider one family of quarks and leptons in the standard $SU(3)_C \times SU(2)_W \times U(1)_Y$ model. Show that all of the following gauge anomalies cancel:
• $(SU(3)_C)^3$
• $(SU(2)_W)^3$
• $(U(1)_Y)^3$
• $(SU(2)_W)^2\,SU(3)_C$
• $(SU(3)_C)^2\,U(1)_Y$
• $(SU(2)_W)^2\,U(1)_Y$
• $(U(1)_Y)^2\,SU(3)_C$
• $(U(1)_Y)^2\,SU(2)_W$
• $SU(3)_C\,SU(2)_W\,U(1)_Y$
• $\{SU(3)_C, SU(2)_W, U(1)_Y\}({\rm Grav})^2$
Which connect quark and lepton charges?
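Returning to the $SU(N)$ counting of eqs. (88) and (89), the relation $A(A) = (N-4)A(N)$ can be checked by brute force. The sketch below enumerates the eigenvalues of the generator of eq. (87) on the fundamental and on the antisymmetric tensor (where the eigenvalue of $A_{ij}$ is taken to be the sum of those of indices $i$ and $j$). The function names are illustrative and not part of the lectures.

```python
from itertools import combinations


def generator_charges(n):
    """Eigenvalues of the generator diag(-1, ..., -1, N-1) of eq. (87)."""
    return [-1] * (n - 1) + [n - 1]


def cubic_anomaly_fundamental(n):
    """Tr(T^3) in the fundamental representation, as in eq. (88)."""
    return sum(q ** 3 for q in generator_charges(n))


def cubic_anomaly_antisymmetric(n):
    """Tr(T^3) in the two-index antisymmetric tensor, as in eq. (89)."""
    charges = generator_charges(n)
    return sum((qi + qj) ** 3 for qi, qj in combinations(charges, 2))


def total_gauge_anomaly(n):
    """One antisymmetric tensor plus N-4 antifundamentals (each contributing -A(N))."""
    return cubic_anomaly_antisymmetric(n) - (n - 4) * cubic_anomaly_fundamental(n)


for n in range(5, 9):
    print(n, cubic_anomaly_fundamental(n), cubic_anomaly_antisymmetric(n),
          total_gauge_anomaly(n))
```

Only ratios matter here, since both traces share the same overall normalization of the generator.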
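Global/Gauge Anomalies
Global chiral symmetries can be violated by anomalies as well. The classic example is axial quark number, $U(1)_A$, in QCD. Including this (approximate) classical global symmetry the $N_f$ quarks transform as
$$\psi_L: (3, N_f, 1)_{+1} \qquad \psi^c_L: (\bar 3, 1, \bar N_f)_{+1} \qquad (90)$$
under $SU(3) \times SU(N_f)_L \times SU(N_f)_R \times U(1)_A$. The triangle graph
[Diagram: triangle graph with one $U(1)_A$ current and two $SU(3)$ gauge currents.] (91)
yields a result proportional to
$${\rm Tr}\left(T_{U(1)_A}\{T^b, T^c\}\right) = N_f\,{\rm Tr}\{T^b_3, T^c_3\} + N_f\,{\rm Tr}\{T^b_{\bar 3}, T^c_{\bar 3}\} \neq 0. \qquad (92)$$
Consequently,86 the $U(1)_A$ current is not conserved,
$$\partial^\mu j^5_\mu \propto N_f\,g^2\,\epsilon^{\mu\nu\lambda\sigma}F^a_{\mu\nu}F^a_{\lambda\sigma} \neq 0, \qquad (93)$$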
and there is no ninth Goldstone boson in QCD! Surprisingly, there is an anomaly-free global $U(1)$ symmetry in the chiral $SU(N)$ model described above. The $U(1)_{global}SU(N)^2$ anomaly is proportional to the index
$${\rm Tr}\left(T^a_R T^b_R\right) = k(R)\,\delta^{ab} \qquad (94)$$
of the $SU(N)$ representation of the fermion, which is a group theoretic invariant. Consider the generator proportional to
$${\rm diag}(1, -1, 0, \ldots, 0). \qquad (95)$$
The index of a representation is proportional to ${\rm Tr}(T^2)$, which gives
$$k(N) \propto 2\cdot(1)^2 + (N-2)\cdot(0)^2 \qquad (96)$$
for the fundamental representation, and
$$k(A) \propto 1\cdot(0)^2 + 2\cdot(N-2)\cdot(1)^2 \qquad (97)$$
for the antisymmetric tensor. Recall that the consistent chiral $SU(N)$ theory has one antisymmetric tensor $A$ and $N-4$ antifundamental representations $\bar N$. Comparing to eqs. 96 and 97, we see that there is an anomaly-free global $U(1)$ symmetry under which† the antisymmetric tensor has charge $-1$ and the $(N-4)$ antifundamentals have charge $\frac{N-2}{N-4}$. In the simplest nontrivial case, chiral $SU(5)$, the antisymmetric tensor is ten dimensional and the fermion fields transform as $10_{-1}$ and $\bar 5_3$ under $SU(5) \times U(1)$.
7.2 (Global)$^3$ Anomalies: the 't Hooft Conditions
The existence of massless composite fermions implies that there is a low-energy global chiral symmetry group $H$, and this group must be a subgroup of the high-energy global symmetries. 't Hooft argued81 that the (global)$^3$ anomaly factor ($A_H$) must be the same in the low- and high-energy theories. His argument runs as follows: consider a theory with massless composite fermions with a global chiral symmetry group $H$. Suppose you were to weakly gauge the chiral global symmetry group $H$. In order to avoid gauge anomalies, one must also add "spectator" fermions which are weakly, but not strongly, interacting, to cancel the anomalies $A_H$. By definition, these weak gauge interactions don't affect the dynamics, and the massless composite fermions must still form. In order for the weak gauge group to remain consistent, therefore, the low-energy massless composites must cancel the anomalies of the spectator fermions. Hence,
$$A_{fundamental} = A_{composite}. \qquad (98)$$
†Our choice of the $U(1)$ charges allows this symmetry to commute with the global $SU(N-4)$ symmetry on the $N-4$ antifundamental fields.
The condition in eq. 98 is a nontrivial relation between the $H$ representations of the fundamental and composite fermions. It provides a necessary condition which must be satisfied by any putative theory of composite fermions. We will illustrate the anomaly matching conditions with two plausible theories of composite massless fermions. Consider the chiral $SU(5)$ gauge theory described earlier with fundamental fermions transforming as $10_{-1}$ ($\chi^{ij}$) and $\bar 5_3$ ($\psi_k$). As discussed, the theory has a global chiral $U(1)$ free of strong anomalies with the charges shown. It is possible to construct an $SU(5)$ singlet fermion: $\chi^{ij}\psi_i\psi_j$. The $U(1)$ charge of this composite fermion is $-1+3+3=5$. Therefore
$$A_{composite} = (5)^3 = 125 = A_{fundamental} = 10\cdot(-1)^3 + 5\cdot(3)^3. \qquad (99)$$
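The arithmetic leading to eq. (99) can be retraced in a few lines. The sketch below first fixes the antifundamental charge from the index counting of eqs. (96) and (97), then compares the $U(1)^3$ anomaly of the fundamental fermions with that of the composite $\chi\psi\psi$. The helper names are illustrative assumptions, and the code relies on $k(\bar N) = k(N)$.

```python
from fractions import Fraction
from itertools import combinations


def index_fundamental(n):
    """Tr(T^2) in the fundamental for T = diag(1, -1, 0, ..., 0), as in eq. (96)."""
    charges = [1, -1] + [0] * (n - 2)
    return sum(q * q for q in charges)


def index_antisymmetric(n):
    """Tr(T^2) in the two-index antisymmetric tensor, as in eq. (97)."""
    charges = [1, -1] + [0] * (n - 2)
    return sum((qi + qj) ** 2 for qi, qj in combinations(charges, 2))


def antifundamental_charge(n, tensor_charge=-1):
    """U(1) charge of each antifundamental that cancels the U(1) SU(N)^2 anomaly."""
    # Solve tensor_charge * k(A) + (n - 4) * q * k(N) = 0 for q.
    return Fraction(-tensor_charge * index_antisymmetric(n),
                    (n - 4) * index_fundamental(n))


def cubic_u1_anomaly(multiplicities_and_charges):
    """Sum of multiplicity * charge^3 over a list of (multiplicity, charge) pairs."""
    return sum(mult * charge ** 3 for mult, charge in multiplicities_and_charges)


n = 5
q_chi = -1
q_psi = antifundamental_charge(n, q_chi)
fundamental = [(10, q_chi), (5, q_psi)]
composite = [(1, q_chi + 2 * q_psi)]
print(q_psi, cubic_u1_anomaly(fundamental), cubic_u1_anomaly(composite))
# prints: 3 125 125
```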
Assuming $SU(5)$ confines, it is possible that the chiral $U(1)$ symmetry is unbroken and that a single massless fermion is present in the low-energy spectrum. In the case of $SU(5)$ there is a complementary picture of the physics which, surprisingly, yields the same low-energy spectrum. Consider the possible bilinear (scalar) condensates of fermions:
$$\epsilon^{\alpha\beta}\psi_{\alpha i}\psi_{\beta j} = (\overline{15})_6, \qquad (100)$$
$$\epsilon^{\alpha\beta}\chi^{ij}_\alpha\psi_{\beta k} = (5 + 45)_2, \qquad (101)$$
and
$$\epsilon^{\alpha\beta}\chi^{ij}_\alpha\chi^{kl}_\beta = (\bar 5 + 50)_{-2}. \qquad (102)$$
Of these four channels we can guess that the most attractive*
[Diagram: gauge-boson exchange between two fermions in representations $A$ and $B$.] (103)
two-fermion channel forms first. The diagram above is proportional to
$$T_A\cdot T_B = \frac{1}{2}\left[(T_A + T_B)^2 - T_A^2 - T_B^2\right] \propto C(A+B) - C(A) - C(B), \qquad (104)$$
where $C(A)$ is the Casimir of representation $A$. The most attractive channels (corresponding to the most negative value of the expression in eq. 104) are the smallest representations: $5_2 + \bar 5_{-2}$. Assuming that these two $5_{\pm 2}$ condensates form, and that their vevs align, we find the symmetry breaking pattern $SU(5)_{gauge} \times U(1)_{global} \to SU(4)_{gauge} \times U'(1)_{global}$. The residual global $U(1)$ is a combination of the original global $U(1)$ charge with the diagonal $SU(5)$ generator, $Q' = (2Q + Q_5)/5$ with
$$Q_5 = {\rm diag}(1, 1, 1, 1, -4). \qquad (105)$$
*For a review of the most attractive channel hypothesis, and chiral symmetry breaking in general, see Michael Peskin's 1982 lectures at the Les Houches summer school.32
Figure 12. Complementarity: In a gauge theory (coupling $g$) with scalars (mass $m^2$) in the fundamental representation, Fradkin and Shenker have shown87 that the confining and Higgs phases can be smoothly connected. This implies that the massless spectrum is the same in both phases. (Phase diagram with regions labeled "Higgs Phase" and "Confining Phase"; horizontal axis $m^2$.)
Under the unbroken symmetry, the original fermion representations decompose as $\bar 5_3 \to \bar 4_1 + 1_2$ and $10_{-1} \to 6_0 + 4_{-1}$. The residual $SU(4)$ gauge symmetry is vectorial, and we expect dynamical masses corresponding to the condensates $\bar 4\cdot 4$ and $6\cdot 6$. Both of these condensates carry zero global charge, and $U(1)$ remains unbroken. Therefore, the gauge singlet fermion ($1_2$) remains massless! Both the confining picture and the gauge-symmetry-breaking / Higgs phase picture have the same low-energy spectrum. These two pictures are complementary. Indeed, the correspondence can be seen directly in terms of the fields: $\chi\psi\psi \leftrightarrow \langle\chi\psi\rangle\psi$, where we explicitly note the condensate in which the singlet fermion propagates. Note that the global symmetry ($U(1)$) is the same in the two pictures, while the gauge symmetry is not. This is consistent because a gauge symmetry is a redundancy in the Hilbert space, while a global symmetry relates different physical states. Note that, in the Higgs phase picture, the condensate $\langle\chi\psi\rangle$ transforms in the fundamental representation of the gauge group. Fradkin and Shenker have shown87 that in a gauge theory with scalars transforming in the fundamental representation, the confining and Higgs phases are smoothly connected (see fig. 12). This implies that the massless spectrum is the same in both phases. The behavior we have noted in the chiral $SU(5)$ gauge theory is a dynamical realization of complementarity.
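The decomposition quoted above follows from evaluating $Q' = (2Q + Q_5)/5$ component by component. The sketch below does this, assuming that the components of the $\bar 5$ carry $Q_5$ eigenvalues of opposite sign to eq. (105), that those of $\chi^{ij}$ carry the sum of the eigenvalues of $i$ and $j$, and that the aligned vevs point along the fifth direction. The function names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

Q5 = [1, 1, 1, 1, -4]  # diagonal SU(5) generator of eq. (105)


def residual_charge(q_global, q5):
    """Q' = (2 Q + Q_5) / 5 for a state with global charge Q and eigenvalue Q_5."""
    return Fraction(2 * q_global + q5, 5)


def decompose_antifundamental(q_global=3):
    """Q' -> multiplicity for the five components of the 5bar."""
    return dict(Counter(residual_charge(q_global, -q) for q in Q5))


def decompose_antisymmetric(q_global=-1):
    """Q' -> multiplicity for the ten components chi^{ij}."""
    return dict(Counter(residual_charge(q_global, qi + qj)
                        for qi, qj in combinations(Q5, 2)))


print(decompose_antifundamental())   # four states with Q' = 1, one with Q' = 2
print(decompose_antisymmetric())     # six states with Q' = 0, four with Q' = -1
# Fifth components of the 5_2 and 5bar_{-2} condensates: both neutral under Q'.
print(residual_charge(2, Q5[4]), residual_charge(-2, -Q5[4]))
```

The vanishing $Q'$ of both condensates is what leaves $U'(1)$ unbroken in the Higgs-phase picture.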
7.3 Mooses88,89
Georgi has proposed a class of composite models with QCD-like dynamics. The models are most easily described diagrammatically ("moose" diagrams), with the basic element
[Moose diagram: a solid circle labeled $N$ joined by a line to a dashed circle labeled $M$.] (106)
where the solid circle denotes an $SU(N)$ gauge group, the dashed circle an $SU(M)$ global group, and the line a left-handed $(N, M)$ fermion. In this notation, QCD with three light flavors is
[Moose diagram: three circles in a row for $SU(3)_L$, $SU(3)_C$ and $SU(3)_R$, joined by two lines.] (107)
where the global $SU(3)_L \times SU(3)_R$ are shown by the outer circles and $SU(3)_C$ by the middle circle. The fermions $(3,3,1)_{+1}$ are denoted by the left-hand line, and $(1,3,3)_{-1}$ by the right-hand one. Finally, the charges for the nonanomalous $U(1)_B$ are shown by the numbers above the line. The constraints of anomaly cancellation are easily seen: gauge anomalies are canceled whenever the number of lines leaving a solid circle equals the number of lines entering; nonanomalous global $U(1)$'s exist if the total charge of the fermions coupled to a given gauge group equals zero. After QCD chiral symmetry breaking, the fermions condense and the residual global symmetries "collapse" the moose diagram to
[Moose diagram not reproduced.] (108)
which denotes the residual vector $SU(3)$ symmetry. The simplest nontrivial model of this sort is the "odd linear moose"
[Moose diagram not reproduced.] (109)
which has an $SU(M) \times SU(N) \times U(1)$ global symmetry. Assuming both gauge groups confine, all global anomalies are saturated by a massless $\psi_\alpha\psi_\beta\psi_\gamma$ bound state with quantum numbers $(M, N)_{+1}$. The odd linear moose has two complementary Higgs phase pictures depending on the scales, $\Lambda_{SU(M)}$ and $\Lambda_{SU(N)}$, at which the two gauge groups become strong. If $\Lambda_{SU(N)} > \Lambda_{SU(M)}$, $SU(N)$ behaves like QCD and the symmetry breaking pattern $SU(M)_{global} \times SU(M)_{gauge} \to SU(M)_{global}$ is expected. In this case a $\psi_\alpha\psi_\beta$ dynamical mass forms and the remaining $\psi_\gamma$ is massless. Alternatively, if $\Lambda_{SU(M)} > \Lambda_{SU(N)}$, the symmetry breaking $SU(N)_{global} \times SU(N)_{gauge} \to SU(N)_{global}$ is expected, a $\psi_\beta\psi_\gamma$ dynamical mass forms, and $\psi_\alpha$ remains massless. In all cases a massless $(M, N)_{+1}$ fermion remains, summarized by the "reduced moose"
[Moose diagram: two circles labeled $M$ and $N$ joined by a line.] (110)
Out of these basic ingredients, many models88,89 with composite fermions (and scalars) can be formed, and the interested reader is encouraged to explore the literature.
8 Composite Gauge Bosons: Duality‡ in SUSY $SU(N_c)$
Consider a supersymmetric $SU(N_c)$ gauge theory with $N_f$ flavors. The left-handed fermions $\psi_L$ of an ordinary gauge theory become "chiral superfields" comprised of a complex scalar $Q$ and left-handed fermion $\psi_Q$. Similarly, the left-handed charge conjugate fermions of an ordinary gauge theory $\psi^c_L$ become the fields $\bar Q$ and $\psi_{\bar Q}$. The gluon $g$ of the ordinary theory becomes a "vector superfield" with the addition of the gluino, an adjoint Majorana-Weyl fermion $\lambda^a$ ($\lambda_L = \lambda^c_R$). The global symmetry of the theory is $SU(N_f)_L \times SU(N_f)_R \times U(1)_B \times U(1)_R$, where the fermions have charges

                    SU(N_c)      SU(N_f)_L   SU(N_f)_R   U(1)_B   U(1)_R
    \psi_Q          N_c          N_f         1           +1       -N_c/N_f
    \psi_{\bar Q}   \bar N_c     1           \bar N_f    -1       -N_c/N_f
    \lambda         N_c^2 - 1    1           1           0        +1
                                                                        (111)

Note that the addition of the massless gluinos has resulted in an additional nonanomalous symmetry $U(1)_R$. The anomaly factor for $U(1)_R\,SU(N_c)^2$ is
$$2\cdot N_f\cdot\left(-\frac{N_c}{N_f}\right)\cdot\frac{1}{2} + (+1)\cdot N_c = 0, \qquad (112)$$
where we note that the index of the fundamental is 1/2 while that of the adjoint is $N_c$. Seiberg91 has conjectured that, for $N_c + 1 < N_f < 3N_c$, this theory is "dual" to, i.e. has the same low-energy theory as, a supersymmetric $SU(N_f - N_c)$ gauge theory with fields

                    SU(N_f-N_c)            SU(N_f)_L   SU(N_f)_R   U(1)_B            U(1)_R
    \psi_q          N_f - N_c              \bar N_f    1           N_c/(N_f-N_c)     N_c/N_f - 1
    \psi_{\bar q}   \overline{N_f-N_c}     1           N_f         -N_c/(N_f-N_c)    N_c/N_f - 1
    \psi_M          1                      N_f         \bar N_f    0                 1 - 2N_c/N_f
    \tilde\lambda   (N_f-N_c)^2 - 1        1           1           0                 +1
                                                                                          (113)

‡For a more complete review of duality in supersymmetric theories, see Peskin's lectures in the 1996 TASI summer school.90
and a superpotential $W \propto M\psi_q\psi_{\bar q}$ coupling the global symmetries on the dual "quark" $q$ and "meson" $M$ fields. All global anomalies match, but only if both the mesons and dual gauginos are included! Unlike the composite theories we have discussed previously, the dual gauge bosons and quarks cannot be interpreted as simple bound states of fundamental particles. The proposed duality satisfies90 a number of other nontrivial checks as well, including: holomorphic decoupling, consistency with the non-abelian conformal phase $3N_c/2 < N_f < 3N_c$, generalization to $\mathcal{N} = 2$ supersymmetric theories and string calculations. While no proof has yet been given, the overwhelming preponderance of evidence indicates that Seiberg duality holds and the theory described provides a highly nontrivial example of compositeness. For your consideration ...
Verify that all of the following $SU(N_c)$ anomalies match in the $SU(N_f - N_c)$ dual theory:
• $(SU(N_f)_L)^3$
• $(SU(N_f)_L)^2\,U(1)_B$
• $(SU(N_f)_L)^2\,U(1)_R$
• $(U(1)_B)^2\,U(1)_R$
• ${\rm Tr}\,U(1)_R$ ("gravitational anomaly")
• $(U(1)_R)^3$
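The matching listed above can be checked numerically with exact fractions. The sketch below encodes the fermion content of the two theories under stated assumptions: the quark and gluino charges of table (111), and for the dual theory a dual quark in the $\bar N_f$ of $SU(N_f)_L$ with baryon number $N_c/(N_f - N_c)$, a meson fermion with R-charge $1 - 2N_c/N_f$, and the dual gaugino. The anomaly of the (anti)fundamental is taken as $\pm 1$ and its index as $1/2$. All function and field names are illustrative.

```python
from fractions import Fraction


def fermion(gauge_dim, left_rep, right_dim, baryon, r_charge):
    """left_rep: +1 for N_f, -1 for N_f-bar, 0 for a singlet of SU(N_f)_L."""
    return {"gauge_dim": gauge_dim, "left_rep": left_rep, "right_dim": right_dim,
            "B": Fraction(baryon), "R": Fraction(r_charge)}


def electric_fermions(nc, nf):
    r_quark = Fraction(-nc, nf)
    return [
        fermion(nc, +1, 1, 1, r_quark),         # psi_Q
        fermion(nc, 0, nf, -1, r_quark),        # psi_Qbar
        fermion(nc * nc - 1, 0, 1, 0, 1),       # gluino
    ]


def magnetic_fermions(nc, nf):
    n_dual = nf - nc
    b_dual = Fraction(nc, n_dual)
    r_dual = Fraction(nc, nf) - 1
    return [
        fermion(n_dual, -1, 1, b_dual, r_dual),            # psi_q
        fermion(n_dual, 0, nf, -b_dual, r_dual),           # psi_qbar
        fermion(1, +1, nf, 0, 1 - Fraction(2 * nc, nf)),   # psi_M
        fermion(n_dual * n_dual - 1, 0, 1, 0, 1),          # dual gaugino
    ]


def global_anomalies(fermions, nf):
    keys = ("SU(Nf)_L^3", "SU(Nf)_L^2 U(1)_B", "SU(Nf)_L^2 U(1)_R",
            "U(1)_B^2 U(1)_R", "Tr U(1)_R", "U(1)_R^3")
    totals = {key: Fraction(0) for key in keys}
    for f in fermions:
        copies = f["gauge_dim"] * f["right_dim"]   # copies of the SU(N_f)_L rep
        states = copies * (nf if f["left_rep"] else 1)
        if f["left_rep"]:
            totals["SU(Nf)_L^3"] += copies * f["left_rep"]
            totals["SU(Nf)_L^2 U(1)_B"] += copies * Fraction(1, 2) * f["B"]
            totals["SU(Nf)_L^2 U(1)_R"] += copies * Fraction(1, 2) * f["R"]
        totals["U(1)_B^2 U(1)_R"] += states * f["B"] ** 2 * f["R"]
        totals["Tr U(1)_R"] += states * f["R"]
        totals["U(1)_R^3"] += states * f["R"] ** 3
    return totals


electric = global_anomalies(electric_fermions(3, 5), 5)
magnetic = global_anomalies(magnetic_fermions(3, 5), 5)
for key in electric:
    print(f"{key:20s} electric = {electric[key]}, dual = {magnetic[key]}")
```

Dropping either the meson fermion or the dual gaugino from `magnetic_fermions` spoils the agreement, in line with the remark that both are needed.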
9 The ACS Conjecture92
Given the panoply of examples we have considered, it is worth asking if there is a limit on the complexity of the low-energy theory. Consider the free-energy per unit volume $\mathcal{F}$ of a theory at temperature $T$, and define
$$f_{IR} = -\lim_{T\to 0}\frac{\mathcal{F}}{T^4}\,\frac{90}{\pi^2} \qquad (114)$$
and
$$f_{UV} = -\lim_{T\to\infty}\frac{\mathcal{F}}{T^4}\,\frac{90}{\pi^2}. \qquad (115)$$
For a free theory, both equal $n_S + 7n_F/8$. In this sense, $f_{IR,UV}$ counts the number of low- and high-energy degrees of freedom. Appelquist, Cohen, and Schmaltz have conjectured92 that:
$$f_{IR} \le f_{UV}, \qquad (116)$$
corresponding to the intuitively reasonable result that the number of degrees of freedom should not increase at low energies. This conjecture has been confirmed for a number of models,92,93 but no general proof has been found.
10 Conclusions on Compositeness
• Experimental limits place a lower bound of order 4 TeV (using the ELP54 convention) on the scale of quark and lepton compositeness.
• Weak coupling and asymptotic freedom imply that the standard model gauge bosons are likely to be fundamental; the longitudinal W and Z may be composite with a scale of order 1 TeV or higher.
• Composite Scalars: Generically, light scalars occur near a 2nd order phase transition. "Tuning" is required to keep them light compared to the compositeness scale, leading to potential hierarchy/naturalness problems!
• Composite Fermions: Massless fermions are a natural consequence of confinement and unbroken chiral symmetry. 't Hooft's anomaly matching conditions must be satisfied.
• Composite Gauge Bosons: Seiberg duality shows 4-dimensional field theory at its most subtle - who says we need 10 or 11 dimensions!
Acknowledgements: I thank Jon Rosner and K. T. Mahanthappa for organizing a stimulating summer school, and Gustavo Burdman, Myckola Schwetz, and especially Elizabeth Simmons for comments on the manuscript. This work was supported in part by the Department of Energy under grant DE-FG02-91ER40676.
References
1. R. S. Chivukula, (1998), hep-ph/9803219.
2. D. E. Groom et al., Eur. Phys. J. C15, 1 (2000).
3. S. Weinberg, Phys. Rev. D19, 1277 (1979).
4. L. Susskind, Phys. Rev. D20, 2619 (1979).
5. M. Weinstein, Phys. Rev. D8, 2511 (1973).
6. P. Sikivie, L. Susskind, M. Voloshin, and V. Zakharov, Nucl. Phys. B173, 189 (1980).
7. E. Eichten and K. Lane, Phys. Lett. 90B, 125 (1980).
8. S. Dimopoulos and L. Susskind, Nucl. Phys. B155, 237 (1979).
9. S. Weinberg, Physica 96A, 327 (1979).
10. H. Georgi, Menlo Park, USA: Benjamin/Cummings (1984) 165p.
11. A. Manohar and H. Georgi, Nucl. Phys. B234, 189 (1984).
12. J. Gasser and H. Leutwyler, Nucl. Phys. B250, 465 (1985).
13. E. Farhi and L. Susskind, Phys. Rev. D20, 3404 (1979).
14. R. S. Chivukula and H. Georgi, Phys. Lett. 188B, 99 (1987).
15. R. Dashen, Phys. Rev. 183, 1245 (1969).
16. R. D. Peccei and H. R. Quinn, Phys. Rev. Lett. 38, 1440 (1977).
17. R. D. Peccei and H. R. Quinn, Phys. Rev. D16, 1791 (1977).
18. S. Weinberg, Phys. Rev. Lett. 40, 223 (1978).
19. F. Wilczek, Phys. Rev. Lett. 40, 279 (1978).
20. K. Lane, (1993), hep-ph/9401324.
21. T. Appelquist, M. J. Bowick, E. Cohler, and A. I. Hauser, Phys. Rev. Lett. 53, 1523 (1984).
22. T. Appelquist, M. J. Bowick, E. Cohler, and A. I. Hauser, Phys. Rev. D31, 1676 (1985).
23. R. S. Chivukula, S. B. Selipsky, and E. H. Simmons, Phys. Rev. Lett. 69, 575 (1992), hep-ph/9204214.
24. L. Randall and R. Sundrum, Phys. Lett. B312, 148 (1993), hep-ph/9305289.
25. B. Holdom, Phys. Rev. D24, 1441 (1981).
26. B. Holdom, Phys. Lett. 150B, 301 (1985).
27. K. Yamawaki, M. Bando, and K.-I. Matumoto, Phys. Rev. Lett. 56, 1335 (1986).
28. T. W. Appelquist, D. Karabali, and L. C. R. Wijewardhana, Phys. Rev. Lett. 57, 957 (1986).
29. T. Appelquist and L. C. R. Wijewardhana, Phys. Rev. D35, 774 (1987).
30. T. Appelquist and L. C. R. Wijewardhana, Phys. Rev. D36, 568 (1987).
31. H. Pagels, Phys. Rept. 16, 219 (1975).
32. M. E. Peskin, Lectures presented at the Summer School on Recent Developments in Quantum Field Theory and Statistical Mechanics, Les Houches, France, Aug 2 - Sep 10, 1982.
33. R. Fukuda and T. Kugo, Nucl. Phys. B117, 250 (1976).
34. K. Higashijima, Phys. Rev. D29, 1228 (1984).
35. K. Lane, Phys. Rev. D10, 2605 (1974).
36. H. D. Politzer, Nucl. Phys. B117, 397 (1976).
37. J. M. Cornwall, R. Jackiw, and E. Tomboulis, Phys. Rev. D10, 2428 (1974).
38. T. Appelquist, K. Lane, and U. Mahanta, Phys. Rev. Lett. 61, 1553 (1988).
39. A. Cohen and H. Georgi, Nucl. Phys. B314, 7 (1989).
40. U. Mahanta, Phys. Rev. Lett. 62, 2349 (1989).
41. M. E. Peskin and T. Takeuchi, Phys. Rev. Lett. 65, 964 (1990).
42. M. E. Peskin and T. Takeuchi, Phys. Rev. D46, 381 (1992).
43. M. Golden and L. Randall, Nucl. Phys. B361, 3 (1991).
44. B. Holdom and J. Terning, Phys. Lett. B247, 88 (1990).
45. A. Dobado, D. Espriu, and M. J. Herrero, Phys. Lett. B255, 405 (1991).
46. H. Georgi, Ann. Rev. Nucl. Part. Sci. 43, 209 (1993).
47. D. B. Kaplan, (1995), nucl-th/9506035.
48. A. Pich, (1998), hep-ph/9806303.
49. H. Georgi, Phys. Lett. B298, 187 (1993), hep-ph/9207278.
50. R. S. Chivukula, M. J. Dugan, and M. Golden, Phys. Lett. B292, 435 (1992), hep-ph/9207249.
51. A. G. Cohen, D. B. Kaplan, and A. E. Nelson, Phys. Lett. B412, 301 (1997), hep-ph/9706275.
52. S. Coleman, J. Wess, and B. Zumino, Phys. Rev. 177, 2239 (1969).
53. C. G. Callan, S. Coleman, J. Wess, and B. Zumino, Phys. Rev. 177, 2247 (1969).
54. E. Eichten, K. Lane, and M. E. Peskin, Phys. Rev. Lett. 50, 811 (1983).
55. S. Weinberg, Phys. Rev. 135, B1049 (1964).
56. LEP Electroweak Working Group, http://lepewwg.web.cern.ch/LEPEWWG/tgc
57. K. Hagiwara, R. D. Peccei, D. Zeppenfeld, and K. Hikasa, Nucl. Phys. B282, 253 (1987).
58. S. Weinberg and E. Witten, Phys. Lett. B96, 59 (1980).
59. V. A. Miranskii, M. Tanabashi, and K. Yamawaki, Mod. Phys. Lett. A4, 1043 (1989).
60. V. A. Miranskii, M. Tanabashi, and K. Yamawaki, Phys. Lett. B221, 177 (1989).
61. Y. Nambu, Enrico Fermi Institute - EFI-89-08.
62. W. J. Marciano, Phys. Rev. Lett. 62, 2793 (1989).
63. W. A. Bardeen, C. T. Hill, and M. Lindner, Phys. Rev. D41, 1647 (1990).
64. C. T. Hill, Phys. Lett. B266, 419 (1991).
65. G. Cvetic, (1997), hep-ph/9702381.
66. B. A. Dobrescu and C. T. Hill, Phys. Rev. Lett. 81, 2634 (1998), hep-ph/9712319.
67. R. S. Chivukula, B. A. Dobrescu, H. Georgi, and C. T. Hill, Phys. Rev. D59, 075003 (1999), hep-ph/9809470.
68. C. T. Hill, Phys. Lett. B345, 483 (1995), hep-ph/9411426.
69. Y. Nambu and G. Jona-Lasinio, Phys. Rev. 122, 345 (1961).
70. R. S. Chivukula, (2000), hep-ph/0005168.
71. K. G. Wilson, Phys. Rev. B4, 3174 (1971).
72. K. G. Wilson, Phys. Rev. B4, 3184 (1971).
73. K. G. Wilson and J. Kogut, Phys. Rept. 12, 75 (1974).
74. J. Kuti, L. Lin, and Y. Shen, Phys. Rev. Lett. 61, 678 (1988).
75. W. Buchmuller and D. Wyler, Nucl. Phys. B268, 621 (1986).
76. B. Grinstein and M. B. Wise, Phys. Lett. B265, 326 (1991).
77. R. S. Chivukula and E. H. Simmons, Phys. Lett. B388, 788 (1996), hep-ph/9608320.
78. R. S. Chivukula, C. Holbling, and N. Evans, Phys. Rev. Lett. 85, 511 (2000), hep-ph/0002022.
79. LEP Electroweak Working Group, http://lepewwg.web.cern.ch/LEPEWWG/plots/summer2000/.
80. D. Amidei and R. Brock, http://fnalpubs.fnal.gov/archive/1996/pub/Pub-96-082.ps.
81. G. 't Hooft and others (ed.), New York, USA: Plenum (1980) 438 p. (NATO Advanced Study Institutes Series: Series B, Physics, 59).
82. J. S. Bell and R. Jackiw, Nuovo Cim. A60, 47 (1969).
83. S. L. Adler, Phys. Rev. 177, 2426 (1969).
84. R. Jackiw and K. Johnson, Phys. Rev. 182, 1459 (1969).
85. J. Schwinger, Phys. Rev. 82, 664 (1951).
86. G. 't Hooft, Phys. Rev. Lett. 37, 8 (1976).
87. E. Fradkin and S. H. Shenker, Phys. Rev. D19, 3682 (1979).
88. H. Georgi, Nucl. Phys. B266, 274 (1986).
89. H. Georgi, Harvard Univ. Cambridge - HUTP-86-A040.
90. M. E. Peskin, (1997), hep-th/9702094.
91. N. Seiberg, Nucl. Phys. B435, 129 (1995), hep-th/9411149.
92. T. Appelquist, A. G. Cohen, and M. Schmaltz, Phys. Rev. D60, 045003 (1999), hep-th/9901109.
93. T. Appelquist, A. Cohen, M. Schmaltz, and R. Shrock, Phys. Lett. B459, 235 (1999), hep-th/9904172.
Graham G. Ross
Models of fermion masses
Graham G. Ross
Department of Physics, University of Oxford, OX1 3NP, UK
November 30, 2000
Abstract
The prospects for understanding the observed pattern of fermion masses and mixing angles are reviewed. We start with a discussion of the experimental determination of the quark mass matrices and the evidence for approximate "texture zeros". A simple (broken) Abelian symmetry capable of generating the texture zeros and ordering the hierarchical structure of the quark masses and mixing angles is described and ideas for the origin of the hierarchy are discussed. The extension of the family symmetry to include leptons is also discussed paying particular regard to the need to obtain large mixing angles in the neutrino sector. Finally we consider the possibility that the structure of fermion masses results from non-Abelian flavour and family symmetries.
1 Quark masses and mixing angles
As the subject of these lectures is a discussion of our attempts to construct a (field theory) description of the masses and mixing angles of quarks (and leptons), it is perhaps appropriate first to summarise what we know about them. In Table 1 I present the data for the quark masses and mixing angles as currently known [1]. Note that what we know are the eigenvalues of the mass matrix and the CKM matrix, which is a combination of the rotation matrices needed to diagonalize the left-handed components of the up and down quarks:
$$U_{CKM} = V_L^{u\dagger}V_L^d. \qquad (1)$$
However, the full mass matrix requires a knowledge of both the left- and right-handed rotation matrices:
$$M^{u,d} = V_L^{u,d}\,M_D^{u,d}\,V_R^{u,d\dagger}. \qquad (2)$$
Thus measurement of the CKM matrix and the diagonal mass entries is insufficient to construct the full mass matrix. This is an embarrassment when trying to develop a theory of masses because masses in a field theory arise through the matrix of couplings of quarks to the Higgs boson. Upon spontaneous breakdown this leads to a mass matrix as the fundamental starting point. What then do we know about this mass matrix?

    |V^{CKM}_{u_i d_j}| =
        0.973 - 0.975    0.217 - 0.222    0.0023 - 0.004
        0.21 - 0.24      0.9 - 1.2        0.038 - 0.041
        0.006 - 0.01     0.026 - 0.04     0.84 - 1.14

    M_d(M_Z) = (1.8 - 5.3) MeV      M_u(M_Z) = (0.9 - 2.9) MeV
    M_s(M_Z) = (35 - 100) MeV       M_c(M_Z) = (0.53 - 0.68) GeV
    M_b(M_Z) = (2.8 - 3.0) GeV      M_t(M_Z) = (168 - 180) GeV

    m_s/m_d = 22.7 ± 0.08    m_u/m_d = 0.553 ± 0.043    m_c/m_s = 8.23 ± 1.5

Table 1: Experimental data on quark masses and mixing angles.
1.1 Perturbative analysis of mass matrix
Bjorken [2] has introduced a systematic way to determine the mass matrix from the observed masses and mixing angles for the case of small off diagonal matrix elements. Although not inevitable, this is the preferred case for it leads to small mixing angles as are observed without requiring a cancellation between up and down rotations. In this case it is easy to show that the CKM matrix elements determine those elements of the mass matrix above the diagonal. Those below the diagonal are only weakly constrained. To see this we write the mass matrix in the form
$$M = M_D + \Delta M + m \qquad (3)$$
in which $M_D$ is a diagonal matrix with the mass eigenvalues along the diagonal, $m$ refers to the small off diagonal elements and $\Delta M$ refers to the correction to the diagonal elements on going to the non-diagonal form. The mass matrices $M$ and $M_D$ are related by
$$V_L^\dagger M M^\dagger V_L = M_D^2 \qquad (V = V^{u,d},\ M = M^{u,d}). \qquad (4)$$
Substituting the form of eq(3) and premultiplying by $V_L$ gives
$$(M_D + \Delta M + m)(M_D + \Delta M + m^\dagger)V_L = V_L M_D^2. \qquad (5)$$
Since the rotation matrices are ordered in powers of the small parameters $m_{ij}$ by
$$V_{ij} = O(m), \qquad V_{ii} = 1 + O(m^2), \qquad (6)$$
we may solve eq(5) perturbatively to obtain two equations in leading nonvanishing order,
$$2M_i\Delta M_i + \sum_k |m_{ik}|^2 + \sum_k (m_{ik}M_k + M_i m^*_{ki})V_{ki} = 0 \qquad (7)$$
and
$$M_i^2 V_{ij} + \sum_k (m_{ik}M_k + M_i m^*_{ki})V_{kj} + \sum_k m_{ik}m^*_{jk} = V_{ij}M_j^2. \qquad (8)$$
These may be solved for the off diagonal elements of the CKM matrix giving
$$V_{ij} = \frac{m_{ij}M_j + m^*_{ji}M_i}{M_j^2 - M_i^2} + \sum_k\frac{m_{ik}m^*_{jk}}{M_j^2 - M_i^2} + \sum_{k\neq j}\frac{(m_{ik}M_k + m^*_{ki}M_i)(m_{kj}M_j + m^*_{jk}M_k)}{(M_j^2 - M_i^2)(M_j^2 - M_k^2)} + O(m^3). \qquad (9)$$
VCKM
QQ\
This form immediately leads to a constraint between the matrix elements. To leading order we have \V12\^\V2i\
(11)
This is in excellent agreement with the measurements $|V_{us}| = 0.217 - 0.222$, $|V_{cd}| = 0.21 - 0.24$ and supports the initial assumption of small off-diagonal mass matrix elements. Note that to leading order the magnitude of the matrix elements is determined only by the mass matrix entries above the diagonal. The same procedure applied to $V_{23}$, $V_{32}$ gives
$$V^{CKM}_{23} = \frac{m^d_{23}}{M_b} - \frac{m^u_{23}}{M_t} + \ldots, \qquad V^{CKM}_{32} = -\frac{m^{d*}_{23}}{M_b} + \frac{m^{u*}_{23}}{M_t} + \ldots \qquad (12)$$
so
$$|V_{23}| \approx |V_{32}| \qquad (13)$$
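again in good agreement with experiment, $|V_{cb}| = 0.038 - 0.041$, $|V_{ts}| \approx 0.026 - 0.04$. Finally one finds
$$V_{13} = \frac{m^d_{13}}{M_b} - \frac{m^u_{13}}{M_t} + \frac{m^u_{12}}{M_c}\left(\frac{m^u_{23}}{M_t} - \frac{m^d_{23}}{M_b}\right) + \ldots$$
$$V_{31} = -\frac{m^{d*}_{13}}{M_b} + \frac{m^{u*}_{13}}{M_t} + \frac{m^{d*}_{12}}{M_s}\left(\frac{m^{d*}_{23}}{M_b} - \frac{m^{u*}_{23}}{M_t}\right) + \ldots \qquad (14)$$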
Again, if the leading term dominates we have approximate equality of $V_{13}$ and $V_{31}$ in magnitude, but this is in disagreement with the observation (cf Table 1). If, on the other hand, we assume that the leading terms are zero, i.e. $m^d_{13} = m^u_{13} = 0$, so-called "texture zeros" [3], then the next term in eq(14) dominates and leads to
$$\left|\frac{V_{ub}}{V_{cb}}\right| = \sqrt{\frac{M_u}{M_c}}, \qquad \left|\frac{V_{td}}{V_{ts}}\right| = \sqrt{\frac{M_d}{M_s}} \qquad (15)$$
Substituting the measured values given in Table 1 one finds these relations are in reasonably good agreement with experiment. Since presenting these lectures, new data has indicated [4] that there are corrections to eq(15) which require non-zero $m^d_{13}$, although smaller than the other mass matrix elements. As we shall discuss, family symmetries capable of explaining the approximate texture zeros also indicate that the texture zeros are not exactly zero, and this new information will be valuable in testing these schemes. I shall return to a discussion of this point later.
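The statement that the mixing is fixed mainly by the entry above the diagonal can be checked in a two-family toy case. The sketch below is an illustration only: the function names and the numerical inputs are assumptions, and the matrix is taken real for simplicity. It compares the exact left-handed rotation angle, obtained from $M M^T$, with the leading term of eq(9), and varies the entry below the diagonal.

```python
import math

def left_rotation_angle(M1, M2, m12, m21):
    """Exact left-handed rotation angle for the real matrix [[M1, m12], [m21, M2]].

    The angle diagonalises H = M M^T, so tan(2*theta) = 2*H12 / (H22 - H11).
    """
    h11 = M1 * M1 + m12 * m12
    h22 = M2 * M2 + m21 * m21
    h12 = M1 * m21 + m12 * M2
    return 0.5 * math.atan2(2.0 * h12, h22 - h11)

def leading_order_angle(M1, M2, m12, m21):
    """Leading term of eq(9) for real entries: (m12*M2 + m21*M1) / (M2^2 - M1^2)."""
    return (m12 * M2 + m21 * M1) / (M2 * M2 - M1 * M1)

# Illustrative inputs in units of the heavier mass (not fitted values).
M1, M2, m12 = 0.05, 1.0, 0.2
for m21 in (0.0, 0.1, 0.2):
    exact = left_rotation_angle(M1, M2, m12, m21)
    approx = leading_order_angle(M1, M2, m12, m21)
    print(f"m21={m21:.1f}  exact={exact:.4f}  leading order={approx:.4f}")
```

With these inputs the exact angle changes by less than one per cent when the lower entry is switched on, because that entry only enters multiplied by the light mass $M_1$.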
1.2
Texture Zeros
We have seen that the assumption of the vanishing (in leading order) of two elements of the mass matrix ("texture zeros") leads immediately to a prediction for CKM matrix elements in excellent agreement with experiment. Are there any other possible texture zeros? Suppose the diagonal element $(M + \Delta M)_{11} = 0$ and further assume that the (1,2), (2,1) elements are symmetric. The submatrix for either the up or the down quarks has the form
$$\begin{pmatrix} 0 & m_{12} \\ m_{12} & m_{22} \end{pmatrix}$$
This has now only two parameters (the phase can be absorbed in a redefinition of the fermion fields). Hence there is a relation between its two mass eigenvalues and the rotation diagonalising it. One may readily see that this implies [5]
$$|V_{us}| = \left|\sqrt{\frac{M_d}{M_s}} - e^{i\sigma}\sqrt{\frac{M_u}{M_c}}\right| \qquad (16)$$
c.f. $(0.218 - 0.224) = |(0.16 - 0.33) - (0.047 - 0.07)e^{i\sigma}|$. Here $\sigma$ is a phase that enters when combining the up and the down rotations. The experimental comparison is also shown and one may see this prediction is in good agreement with experiment (due to the smallness of the up quark contribution the $\sigma$ dependence is quite small). Thus we see that experiment is in good agreement with the hypothesis of four texture zeros (counting pairs of off-diagonal zeros as one zero since we have assumed a symmetric form). We shall return to a more systematic discussion of texture zeros shortly. Note that, as was the case for $V_{12}$, $V_{21}$, the other CKM matrix elements principally depend on the elements of $m$ above the diagonal, so we are largely ignorant of the elements below the diagonal. In fact the only hint for the magnitude of these off-diagonal lower elements comes from the success of eq(16), which applies only if $m_{21} = m_{12}$. In what follows we will often follow this hint and assume that the full mass matrices are symmetric. To summarise, we have found the data is consistent with four texture zeros plus the assumption that the off-diagonal matrix elements are small. This simple structure determines 5 of the 6 off-diagonal CKM matrix elements (i.e. three of the four independent elements of a unitary matrix). The mass matrix has the form
$$M^{d(u)} = \begin{pmatrix} 0 & m_{12} & 0 \\ m_{12} & m_{22} & m_{23} \\ 0 & m_{23} & m_{33} \end{pmatrix} \qquad (17)$$
The one remaining matrix element left to be determined is $V_{cb}$, which is related to the mass matrix elements as in eq(9)
$$V_{cb} \approx \frac{m^d_{23}}{M_b} - \frac{m^u_{23}}{M_t} \qquad (18)$$
We will return to a discussion of how such a structure might emerge from a field theory description of the fundamental interactions in a later section. First however we turn to a more systematic discussion of the possible texture zeros in the mass matrix.
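To make the content of eq(16) concrete, the following sketch works through the two-by-two texture with a vanishing (1,1) entry. It is an illustration under stated assumptions: the helper names are hypothetical, the matrix is taken real and symmetric, and the square roots of mass ratios fed to the last function are simply values picked inside the ranges quoted with eq(16), not a fit.

```python
import cmath
import math

def texture_from_masses(m1, m2):
    """Entries (a, b) of [[0, a], [a, b]] whose eigenvalue magnitudes are m1 < m2.

    The determinant gives a^2 = m1*m2 and the trace gives b = m2 - m1.
    """
    return math.sqrt(m1 * m2), m2 - m1

def rotation_angle(a, b):
    """Angle diagonalising the symmetric matrix [[0, a], [a, b]]."""
    return 0.5 * math.atan2(2.0 * a, b)

def vus_magnitude(sqrt_md_ms, sqrt_mu_mc, sigma):
    """Small-angle combination of down and up rotations as in eq(16)."""
    return abs(sqrt_md_ms - cmath.exp(1j * sigma) * sqrt_mu_mc)

# The texture zero ties the angle to the mass ratio: tan(theta) = sqrt(m1/m2).
a, b = texture_from_masses(0.05, 1.0)
theta = rotation_angle(a, b)
print(math.tan(theta), math.sqrt(0.05 / 1.0))

# Illustrative values inside the ranges quoted with eq(16).
for sigma in (0.0, math.pi / 2, math.pi):
    print(sigma, vus_magnitude(0.22, 0.06, sigma))
```

The first print shows that the zero removes one free parameter, so the rotation is no longer independent of the eigenvalues. The loop shows how the phase $\sigma$ moves the prediction between the difference and the sum of the two contributions.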
1.3
General texture zero structure.
The appearance of texture zeros in the mass matrix strongly suggests that there is some new symmetry which, when exact, enforces the zero of the mass matrix
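Solution 1: $Y_u = \begin{pmatrix} 0 & \sqrt{2}\lambda^6 & 0 \\ \sqrt{2}\lambda^6 & \lambda^4 & 0 \\ 0 & 0 & 1 \end{pmatrix}$, $Y_d = \begin{pmatrix} 0 & 2\lambda^4 & 0 \\ 2\lambda^4 & 2\lambda^3 & 4\lambda^3 \\ 0 & 4\lambda^3 & 1 \end{pmatrix}$
Solution 2: $Y_u = \begin{pmatrix} 0 & \lambda^6 & 0 \\ \lambda^6 & 0 & \lambda^2 \\ 0 & \lambda^2 & 1 \end{pmatrix}$, $Y_d = \begin{pmatrix} 0 & 2\lambda^4 & 0 \\ 2\lambda^4 & 2\lambda^3 & 2\lambda^3 \\ 0 & 2\lambda^3 & 1 \end{pmatrix}$
Solution 3: $Y_u = \begin{pmatrix} 0 & 0 & \sqrt{2}\lambda^4 \\ 0 & \lambda^4 & 0 \\ \sqrt{2}\lambda^4 & 0 & 1 \end{pmatrix}$, $Y_d = \begin{pmatrix} 0 & 2\lambda^4 & 0 \\ 2\lambda^4 & 2\lambda^3 & 4\lambda^3 \\ 0 & 4\lambda^3 & 1 \end{pmatrix}$
Solution 4: $Y_u = \begin{pmatrix} 0 & \sqrt{2}\lambda^6 & 0 \\ \sqrt{2}\lambda^6 & \sqrt{3}\lambda^4 & \lambda^2 \\ 0 & \lambda^2 & 1 \end{pmatrix}$, $Y_d = \begin{pmatrix} 0 & 2\lambda^4 & 0 \\ 2\lambda^4 & 2\lambda^3 & 0 \\ 0 & 0 & 1 \end{pmatrix}$
Solution 5: $Y_u = \begin{pmatrix} 0 & 0 & \lambda^4 \\ 0 & \sqrt{2}\lambda^4 & \lambda^2/\sqrt{2} \\ \lambda^4 & \lambda^2/\sqrt{2} & 1 \end{pmatrix}$, $Y_d = \begin{pmatrix} 0 & 2\lambda^4 & 0 \\ 2\lambda^4 & 2\lambda^3 & 0 \\ 0 & 0 & 1 \end{pmatrix}$
Table 2: Approximate forms for the five-zero symmetric textures.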
element. We wish to use this "bottom-up" approach, starting with low-energy observables, to explore the possible appearance of such new symmetries as an indication of a further stage of unification of the fundamental interactions. As we shall discuss, there is some indication of unification of the gauge couplings at a very high scale of $O(10^{16}\,\mathrm{GeV})$. In this case the analysis we have presented should be applied to the masses and mixing angles continued up to the unification scale before looking for texture zeros. The reason is that the mass matrix is scale dependent and zeros at one scale will be filled in by radiative corrections at another scale. To do this one uses the renormalisation group to continue the measured masses and mixings to the high scale and looks for simplicity appearing at the unification scale. In the case of gauge couplings, simplicity is equality between the couplings. In the case of the mass matrices simplicity is simple ratios of quark and lepton masses and the appearance of "texture" zeros. However the analysis of masses so far presented falls short of the ideal "bottom-up" approach, for it assumes a particular set of texture zeros rather than just starting with measured values and seeing the zeros emerge. The major difficulty in implementing the ideal program is that, as we have just discussed, laboratory measurements only determine the masses, i.e. the diagonal mass matrix, and the CKM matrix, not the full mass matrices. If we are to determine $V_{L,R}$ and hence the mass matrices separately, a simplifying assumption is needed. A plausible simplifying assumption that has been explored in detail is that the mass matrices are symmetric (as we have seen, the success of the (1,1) texture zero lends some support to this hypothesis).
We can analyse the most general symmetric mass matrix case leading to five or six texture zeroes, because there are just 6 possible forms of symmetric mass matrix with an hierarchy of the three (non-zero) eigenvalues and three texture zeroes (at least one of the up or down quark mass matrices must have three of the texture zeroes). Allowing for the redefinition of the quark fields to absorb phases, these matrices involve just three real parameters. This allows us to determine, up to the six-fold discrete ambiguity, the diagonalising matrix $V^u$ (or $V^d$) in terms of the masses [6]. Hence, using eq(2), we may compute $V^d$ (or $V^u$) in terms of the CKM matrix and hence find $M_d$ (or $M_u$). Further texture zeroes in $M_u$, $M_d$ will result in predictions for the mixing angles of the CKM matrix. The advantage of this technique is that it allows a determination of the down quark mass matrix using experimentally measured quantities without prejudicing the result by the choice of a specific texture. Thus the general problem of searching for (five or six texture zero) structure in mass matrices may be solved with the assumption of symmetric mass matrices $^1$ and the predictive ansatz
$^1$ In fact a similar analysis may be applied to hermitian matrices too, for a general 3 x 3 symmetric mass matrix may be transformed to an hermitian matrix through the freedom to redefine the nine phases of the three left-handed doublets and six right-handed singlets of quark fields. These 9 phases may be used to make both the up and down quark mass matrices hermitian (up to an overall irrelevant phase) since it is always possible to choose a basis in which either the top or the bottom mass matrix is diagonal. However the texture zeros are
that either the up or the down quark mass matrices have three texture zeros and thus contain just three real parameters, giving a mass matrix with non-zero determinant. Using this method one may find all structures with 6 or 5 texture zeroes. Encouragingly there are solutions, consistent with the hoped-for simplicity in the mass matrices, but the number of possible solutions is limited; no solutions with 6 texture zeroes were found but there are several 5 texture zero solutions [6], given in Table 2. Altogether five solutions are found for the Yukawa matrices at $Q = M_X$ which are consistent with the low energy fermion masses and CKM matrix elements. In Table 2, rather than list the numeric values, we have presented approximations to the matrix elements in powers of $\lambda\,(\approx 0.22)$, the small parameter in eqn (11). In these solutions, after using the freedom to choose the 9 independent quark phases, there is one phase left which we take to be the phase of the (complex) (1,2) matrix element of the down quark matrix (this is not displayed in Table 2). In addition, solution 2 has a phase associated with the (2,3) element.
1.4
Detailed tests of a quark mass texture
It is instructive to consider the implications of a particular quark mass texture as it illustrates just how constraining texture zeros are and how precision measurements will be able to test the structures in detail. We consider the general texture of the form of Solution 2 in Table 2, but without constraining the non-zero elements. The predictions following from these texture zeros are given in eq(15) and eq(16). It turns out that the former give the most stringent constraints. In terms of the Wolfenstein parameters we have
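$$\left|\frac{V_{ub}}{V_{cb}}\right| = \frac{\lambda}{c}\sqrt{\bar\rho^2 + \bar\eta^2} \qquad (19a)$$
$$\left|\frac{V_{td}}{V_{ts}}\right| = \frac{\lambda}{c}\sqrt{(1-\bar\rho)^2 + \bar\eta^2} \qquad (19b)$$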
where $\bar\rho = c\rho$, $\bar\eta = c\eta$ and $c = \sqrt{1-\lambda^2}$. The relations of eq(15) involve ratios of quark masses. It has been pointed out [7] that chiral perturbation theory determines the quantity $Q$ defined in Table 1 to a remarkable accuracy, whereas additional assumptions, plausible but not following from pure QCD, are required to determine $m_u/m_d$. For this reason it is useful to express the predictions of eq(15) in terms of $Q$. Using
$$\frac{m_u}{m_c} = \frac{m_u}{m_d}\,\frac{m_d}{m_s}\,\frac{m_s}{m_c} \approx \frac{m_u/m_d}{Q\sqrt{1-(m_u/m_d)^2}}\,\frac{m_s}{m_c}$$
not necessarily preserved in transforming to the hermitian form so a separate analysis of the texture zero possibilities in the hermitian case is, in general, necessary.
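$$\frac{m_d}{m_s} \approx \frac{1}{Q\sqrt{1-(m_u/m_d)^2}} \qquad (20)$$
gives
$$\bar\rho^2 + \bar\eta^2 = \frac{c^2}{\lambda^2 Q\sqrt{1-(m_u/m_d)^2}}\,\frac{m_u m_s}{m_d m_c} \qquad (21)$$
$$(1-\bar\rho)^2 + \bar\eta^2 = \frac{c^2}{\lambda^2 Q\sqrt{1-(m_u/m_d)^2}} \qquad (22)$$
Together these give
$$\left((1-\bar\rho)^2 + \bar\eta^2\right)^2 - \left(\frac{m_c}{m_s}\right)^2\left(\bar\rho^2 + \bar\eta^2\right)^2 = \frac{c^4}{\lambda^4 Q^2}$$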
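The way the two texture-zero constraints combine into a relation free of $m_u/m_d$ can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the inputs for $\lambda$, $Q$, $m_u/m_d$ and $m_s/m_c$ are assumed round numbers, not the values of Table 1. It takes the prefactor of both constraints to be $c^2/(\lambda^2 Q\sqrt{1-(m_u/m_d)^2})$ with $c^2 = 1-\lambda^2$.

```python
import math

def texture_constraints(lam, Q, mu_over_md, ms_over_mc):
    """Return (x, y) with x = (1-rho)^2 + eta^2 and y = rho^2 + eta^2."""
    c2 = 1.0 - lam ** 2
    common = c2 / (lam ** 2 * Q * math.sqrt(1.0 - mu_over_md ** 2))
    x = common
    y = common * mu_over_md * ms_over_mc
    return x, y

def rho_from_constraints(x, y):
    """Subtracting the two constraints gives x - y = 1 - 2*rho."""
    return 0.5 * (1.0 + y - x)

# Assumed illustrative inputs (not taken from Table 1).
lam, Q, ms_over_mc = 0.22, 22.7, 0.09
rhs = ((1.0 - lam ** 2) / (lam ** 2 * Q)) ** 2
for mu_over_md in (0.3, 0.45, 0.6):
    x, y = texture_constraints(lam, Q, mu_over_md, ms_over_mc)
    lhs = x ** 2 - (y / ms_over_mc) ** 2
    print(mu_over_md, rho_from_constraints(x, y), lhs, rhs)
```

The last two printed columns agree for every choice of $m_u/m_d$, which is the point of eliminating that ratio in favour of $Q$.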
A comparison of these predictions with a SM fit [7] is shown in Figure 1. One sees that the prediction is quite precise and consistent with present measurements. However it is clear that improvements in the data will provide a very sensitive test of the texture. This is graphically illustrated in Figure 2, which shows the probability distributions for the predictions of various measurables compared to their present measurements. The most immediate tests are a deviation from complete $B_s$ mixing, $\Delta m_{B_s} < 14.9\,\mathrm{ps}^{-1}$ at 90% confidence level, and $|\sin 2\alpha|$, $|\sin 2\beta|$ peaked around 0.5.
Figure 1: Fit 1 (smaller regions): Predictions for $\bar\rho$, $\bar\eta$ from a fit to data. Also shown are the two individual constraints eq(19a) and eq(19b) separately (see text). Fit 2 (larger regions): SM fit using $|V_{ub}/V_{cb}|$, $\Delta m_{B_d}$, $\Delta m_{B_s}$ but not $\epsilon_K$, whose constraint is shown independently with different theoretical errors (see text). For both fits the contours are at 68, 95 and 99% CL respectively.
1.4.1
Addendum
As discussed above, recent data has indicated that the (1,3) matrix element in the down quark mass matrix cannot be zero [4] because it lies outside the predicted region shown in Figure 2. The origin of this discrepancy is indicated in Figure 3: the improvement in the determination of $\bar\rho$ and $\bar\eta$ requires a non-zero $c$ corresponding to a non-zero (1,3) matrix element. As discussed below, the prediction of an Abelian family symmetry is that $c = 1$, corresponding to $(1,3) = O(\bar\epsilon^4)$. The correction to eq(19a) occurs at $O(c\bar\epsilon)$ while the correction to eq(19b) occurs at $O(c\bar\epsilon^2)$, where $\bar\epsilon \approx 0.2$. Thus we see that data suggests the texture zero in the (1,3) position is still a reasonably good approximation.
2
Family Symmetry
In my opinion the hierarchical structure for the fermion mass matrices strongly suggests it originates from a spontaneously broken family symmetry. In this approach, when the family symmetry is exact, only the third generation will be massive, corresponding to only the (3,3) entry of the mass matrix being non-zero. When the symmetry is spontaneously broken, the zero elements are filled in at a level determined by the symmetry. Suppose a field $\theta$ which transforms non-trivially under the family symmetry acquires a vacuum expectation value, thus spontaneously breaking the family symmetry. The zero elements in the mass matrix will now become non-zero at some order in $\langle\theta\rangle$. If only the 2-3 and 3-2 elements are allowed by the symmetry at order $\theta/M$, where $M$ is a mass scale to be determined, then a second fermion mass will be generated at $O((\theta/M)^2)$. In this way one may build up an hierarchy of masses.
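$$M \sim \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \rightarrow \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & \langle\theta\rangle/M \\ 0 & \langle\theta\rangle/M & 1 \end{pmatrix} \qquad (23)$$
2.1
Symmetry breaking in the mass matrix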
An important question is how do these elements at $O(\theta/M)$ arise? A wide variety of models have been constructed in which the mechanism for communicating the breaking to the mass matrix, and generating the hierarchy, takes on different forms:
Figure 2: Probability distributions (lighter area: predictions; darker area: from the SM fit, excluding $\epsilon_K$ but taking $\bar\eta > 0$) for different observables: a) $\sin 2\alpha$, b) $\sin 2\beta$, c) $\sin 2\gamma$, d) $\epsilon_K$, e) $\Delta m_{B_s}$, f) C, g) $\bar\rho$, h) $\bar\eta$.
Figure 3: Sensitivity of the (1,3) matrix element of the down quark mass matrix to the precision measurements of the rescaled Wolfenstein parameters $\bar\rho$ and $\bar\eta$. (The (1,3) element is given by $c\,\bar\epsilon^4$.)
2.1.1
Froggatt Nielsen mixing
A widely studied approach communicates symmetry breaking via an extension of the "see-saw" mechanism, mixing light to heavy states; in this context it is known as the Froggatt Nielsen mechanism [8]. To illustrate the mechanism, suppose there is a vector-like pair of quark states $X$ and $\bar X$ with mass $M$ and carrying the same Standard Model quantum numbers as the $c_R$ quark, but transforming differently under the family symmetry, so that the Yukawa coupling $h\,\bar c_L X H$ is allowed. Here $H$ is the Standard Model Higgs responsible for giving up quarks a mass. When $H$ acquires a vacuum expectation value (vev), there will be mixing between $c_L$ and $X$. If in addition there is a gauge singlet field $\theta$ transforming non-trivially under the family symmetry so that the coupling $h'\,\bar X c_R\,\theta$ is allowed, then the mixing with heavy states will generate the mass matrix.
Diagonalising this gives a see-saw mass formula
$$m \simeq \frac{h h' \langle H\rangle\langle\theta\rangle}{M} \qquad (24)$$
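The light-heavy mixing behind the see-saw mass formula can be illustrated with a two-by-two example. The sketch below is an assumption-laden illustration: the form of the mixing matrix in the $(c, X)$ basis, the function names and all numerical inputs are hypothetical choices made here, since the matrix itself is not legible in the text. It compares the exact light singular value with the estimate $h h'\langle H\rangle\langle\theta\rangle/M$.

```python
import math

def light_heavy_masses(h, h_prime, vev_H, vev_theta, M):
    """Singular values of the assumed light-heavy mixing matrix

        [[0,                   h * vev_H],
         [h_prime * vev_theta, M        ]]
    """
    a, b, c, d = 0.0, h * vev_H, h_prime * vev_theta, M
    s2 = a * a + b * b + c * c + d * d
    det = abs(a * d - b * c)
    disc = math.sqrt(max(s2 * s2 - 4.0 * det * det, 0.0))
    heavy = math.sqrt(0.5 * (s2 + disc))
    # The product of the singular values equals |det|; dividing avoids
    # the cancellation that a direct formula for the small root would suffer.
    light = det / heavy
    return light, heavy

def seesaw_estimate(h, h_prime, vev_H, vev_theta, M):
    """Leading-order light mass h*h'*<H>*<theta>/M."""
    return h * h_prime * vev_H * vev_theta / M

# Hypothetical inputs in GeV: O(1) couplings and <theta>/M = 0.05.
args = (1.0, 1.0, 174.0, 0.05e16, 1.0e16)
light, heavy = light_heavy_masses(*args)
print(light, seesaw_estimate(*args), heavy)
```

The exact and estimated light masses differ only at relative order $(\langle\theta\rangle/M)^2$, so the suppression factor $\langle\theta\rangle/M$ is what sets the entry of the light mass matrix.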
This mass arises through mixing of the light with heavy quarks. A similar mechanism can generate the mass through mixing of the light Higgs with heavy Higgs states. Suppose $H_X$, $\bar H_X$ are Higgs doublets with mass $M$. If $H_X$ has family quantum numbers allowing the coupling $H \bar H_X \theta$, there will be mixing between $H$ and $H_X$. If the family symmetry also allows the coupling $\bar c_L c_R H_X$, the light-heavy Higgs mixing induces a mass for the charm quark of the form given in eq.(24).
2.1.2
Radiative breaking
Another avenue that has been explored is to construct models in which the breaking is not communicated by tree level graphs but by radiative graphs of one loop order or higher. This has the merit that an hierarchy is naturally generated because of the loop factors $\propto (h^2/16\pi^2)^n$, where $h$ is the coupling responsible for the radiative mass generation. However we have seen that the expansion parameter needed to explain all masses and mixings is really quite large, making it difficult to construct realistic models in which all the suppressed entries are generated radiatively.
2.1.3
Large new dimensions
Recently there has been great interest in the possibility that there are large new space dimensions [9] and this has led to new ideas for the generation of hierarchies in mass matrices. These new dimensions must be compactified and the way this happens affects the phenomena of the low energy theory. The original Kaluza Klein formulation involved compact dimensions, for example toroidal compactification in which the new dimension can be mapped onto a circle with a (factorisable) space-time metric given by
$$ds^2 = -dx^0dx^0 + dx^idx^i + dx^adx^a, \qquad i = 1,2,3, \quad a = 4,\ldots,4+\delta \qquad (25)$$
Recently there has been renewed interest in the case the metric is not factorisable [10]. For the case of a single extra dimension the line element has the form
$$ds^2 = e^{-\rho(r)}\left(-dx^0dx^0 + dx^idx^i\right) + dx^4dx^4, \qquad i = 1,2,3 \qquad (26)$$
where $x^4 = r\theta$, $0 < \theta < 2\pi$. Motivation for such theories comes from string theory. Closed string states, which include the graviton, propagate in all the space time dimensions. However open string states may be confined to branes living in a lower dimension [11]. These states may include those of the Standard Model, as is the case in Type I and Type II string theories. In this case a new mechanism for communicating the symmetry breaking to the mass matrix arises [12], for it may happen that the family symmetry, exact on the Standard Model brane, is broken on a distant brane. This breaking may be communicated to the Standard Model brane via states propagating in the bulk, but the penalty for this is a suppression factor $\propto e^{-mr}$, where $m$ is the mass of the bulk state and $r$ is the distance from the Standard Model brane to the brane on which the symmetry is broken. Obviously one may build up a complicated hierarchy if the bulk states vary in mass or if the breaking occurs on several branes at different distances. Variations of this theme may be constructed. An attractive example is one in which the different generations live on different branes, explaining the hierarchy between generations by the different distances, $r_i$, appearing in the exponent.
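The exponential suppression is easy to quantify. The following sketch is a toy illustration with hypothetical function names and an assumed target expansion parameter of 0.22. It shows what separation, in units of the inverse bulk mass, is needed for one step of the hierarchy, and how equal steps in distance give successive powers of the same small parameter.

```python
import math

def suppression(m, r):
    """Suppression factor exp(-m*r) for a bulk state of mass m over a distance r."""
    return math.exp(-m * r)

def distance_for(eps, m):
    """Brane separation that reproduces a given suppression eps."""
    return -math.log(eps) / m

m_bulk = 1.0                      # bulk mass in arbitrary units (assumption)
r_step = distance_for(0.22, m_bulk)
print("m*r for one step:", m_bulk * r_step)
for n in (1, 2, 3, 4):
    print(n, suppression(m_bulk, n * r_step))
```

A separation of order one to two in units of $1/m$ per step is therefore enough, so a strong hierarchy does not require widely different distances.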
2.2
Identification of the Family symmetry
The nature of the spontaneously broken family symmetry is hard to identify because the available data on quark masses and mixings is insufficient to pin it down uniquely. The kinetic terms and gauge interactions of the Standard Model have a very large family symmetry group, namely $U(3)^5$, where the U(3) factors act on the left- and right-handed multiplets of quarks and leptons respectively. The group is extended to $U(3)^6$ if three right-handed neutrinos are added. Any family group should be contained in $U(3)^6$ but this leaves very many possibilities. In what follows I shall illustrate the possibilities by discussing two characteristic cases. In the first I consider the simplest case of an Abelian family symmetry. Such symmetries abound in compactified string theories so are quite natural extensions of the Standard Model. The second is a non-Abelian family symmetry, a subgroup of the $U(3)^6$. In both cases I shall insist that there
              $Q_i$        $u^c_i$      $d^c_i$      $L_i$    $e^c_i$   $\nu^c_i$   $H_2$          $H_1$
$U(1)_{FD}$   $\alpha_i$   $\alpha_i$   $\alpha_i$   $a_i$    $a_i$     $a_i$       $-2\alpha_1$   $-2\alpha_1$
Table 3: $U(1)_{FD}$ symmetries.
be texture zeros giving not just qualitative (order of magnitude) predictions but also quantitative predictions. These models require a symmetric mass matrix structure. As a result I will not discuss many, many models that have been developed which adequately describe the data although I will try to mention cases in which interesting alternative ideas play a role. I will also consider how a family symmetry may be combined with a Grand Unified symmetry to relate quark and lepton masses. The discussion below assumes the effective theory is supersymmetric as only in this case are the radiative corrections associated with the Grand Unified symmetry consistent. However it is not difficult to develop (non-Grand-Unified) non-supersymmetric examples.
2.3
An Abelian family symmetry for quark masses
How difficult is it to find a family symmetry capable of generating an acceptable fermion mass matrix? The surprising answer is "Not at all difficult", as I will illustrate by a very simple example utilising an Abelian family symmetry group [13, 14]. As we shall discuss, this automatically gives texture zeros simultaneously in the (1,1) and (1,3) positions, suggesting that we may be on the right track in concentrating on the texture zero structure. The symmetry giving rise to Froggatt Nielsen structure may be an additional gauge symmetry, a global symmetry or a discrete symmetry. In the case of a global symmetry there may be problems associated with the appearance of Goldstone bosons when the symmetry is broken and with possibly large gravitational corrections due to wormhole effects. The latter are known to be absent if the symmetry is a local gauge symmetry, either continuous or discrete. Here I first consider the continuous possibility, further restricted to the simplest case, namely that there is an additional Abelian gauge symmetry beyond the Standard Model gauge group and that it is broken at a very high scale so that the phenomenology of the Standard Model is essentially unchanged. The example applies to the Supersymmetric version of the Standard Model although the techniques apply also to non-supersymmetric schemes. It turns out to be remarkably easy to construct a model generating texture zeros through the introduction of this Abelian gauge symmetry, U(1) (such additional symmetries abound in string theories). The most general possible U(1) charge assignment of the Standard Model states is given in Table 3. This follows since the need to preserve $SU(2)_L$ invariance requires (left-handed) up and down quarks (leptons) to have the same charge. This plus the requirement
of symmetric matrices then requires that all quarks (leptons) of the same i-th generation transform with the same charge $\alpha_i$ ($a_i$). If the light Higgs, $H_2$, $H_1$, responsible for the up and down quark masses respectively, have U(1) charge so that only the (3,3) renormalisable Yukawa coupling to $H_2$, $H_1$ is allowed, only the (3,3) element of the associated mass matrix will be non-zero, as desired. The remaining entries are generated when the U(1) symmetry is broken. We assume this breaking is spontaneous via Standard Model singlet fields, $\theta$, $\bar\theta$, with $U(1)_{FD}$ charge $-1$, $+1$ respectively, which acquire vacuum expectation values (vevs), $\langle\theta\rangle$, $\langle\bar\theta\rangle$, along a "D-flat" direction (this is discussed in more detail shortly). After this breaking all entries in the mass matrix become non-zero. For example, the (3,2) entry in the up quark mass matrix appears at $O(\epsilon^{|\alpha_2-\alpha_1|})$ because U(1) charge conservation allows only a coupling $c^c t H_2(\theta/M_2)^{\alpha_2-\alpha_1}$, $\alpha_2 > \alpha_1$, or $c^c t H_2(\bar\theta/M_2)^{\alpha_1-\alpha_2}$, $\alpha_1 > \alpha_2$, and we have defined $\epsilon = \langle\theta\rangle/M_2$, where $M_2$ is the unification mass scale which governs the higher dimension operators. As discussed in reference [13] one may expect a different scale, $M_1$, for the down quark mass matrices (it corresponds to mixing in the $H_2$, $H_1$ sector with $M_2$, $M_1$ the masses of heavy $H_2$, $H_1$ fields). Thus we arrive at mass matrices of the form
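$$\frac{M_u}{m_t} \approx \begin{pmatrix} h_{11}\rho_{11}\epsilon^{|2+6a|} & h_{12}\rho_{12}\epsilon^{|3a|} & h_{13}\rho_{13}\epsilon^{|1+3a|} \\ h_{21}\rho_{21}\epsilon^{|3a|} & h_{22}\rho_{22}\epsilon^{2} & h_{23}\rho_{23}\epsilon \\ h_{31}\rho_{31}\epsilon^{|1+3a|} & h_{32}\rho_{32}\epsilon & h_{33} \end{pmatrix} \qquad (27)$$
$$\frac{M_d}{m_b} \approx \begin{pmatrix} k_{11}\sigma_{11}\bar\epsilon^{|2+6a|} & k_{12}\sigma_{12}\bar\epsilon^{|3a|} & k_{13}\sigma_{13}\bar\epsilon^{|1+3a|} \\ k_{21}\sigma_{21}\bar\epsilon^{|3a|} & k_{22}\sigma_{22}\bar\epsilon^{2} & k_{23}\sigma_{23}\bar\epsilon \\ k_{31}\sigma_{31}\bar\epsilon^{|1+3a|} & k_{32}\sigma_{32}\bar\epsilon & k_{33} \end{pmatrix} \qquad (28)$$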
where the powers $n_{ij}$ are those appearing in eq(28) and $\rho$, $\sigma$ are related to Yukawa couplings in the Higgs sector in a manner discussed below. These and the Yukawa couplings $h_{ij}$, $k_{ij}$ are all assumed to be of $O(1)$. As discussed above, for $a > 0$, there are two approximate texture zeros in the (1,1) and (1,3), (3,1) positions. These give rise to excellent predictions for two combinations of the CKM matrix. The magnitude of the remaining matrix elements is sensitive to the magnitude of $a$ and the values of the expansion parameters. Then choosing $a = 1$ the remaining non-zero entries have magnitude in approximate agreement with the measured values. From Table 2 we see that to a good approximation we have the relation [13]
$$\epsilon = \bar\epsilon^2 \qquad (29)$$
which also implies that $M_2 > M_1$. How reasonable are the charge assignments needed? A very simple choice consistent with $a = 1$ is $\alpha_1 = 1$, $\alpha_2 = 2$, $\alpha_3 = -3$. This assignment fits snugly in the diagonal SU(3) subgroup of the $SU(3)^4$ symmetry group of the quark kinetic terms in the Standard Model. It corresponds to the combination $2T_3 + Y$ where $T_3 = \mathrm{Diagonal}(0,1,-1)$ and $Y = \mathrm{Diagonal}(2,-1,-1)$. In this case the Higgs supermultiplets must have charge 2, i.e. they are components of a triplet of the diagonal SU(3). Thus we see the Abelian family symmetry readily emerges from the breaking of an underlying SU(3) family symmetry. The choice of family charges has been replaced by the choice of spontaneous breaking which leaves the combination $2T_3 + Y$ unbroken. However we should emphasise that it is not necessary that there is indeed a non-Abelian family symmetry: Abelian family symmetries are characteristic of compactified string theories. As we shall discuss, there are additional constraints on the quark charges following from the requirement of anomaly cancellation, but for this we need to discuss the extension of the family symmetry to leptons.
2.3.1
Addendum
One may see from eq(28) that the texture zero in the (1,3) position is not exactly zero but rather is expected at $O(\bar\epsilon^4)$. This is indeed what is needed by the recent precision measurements coming from the b-factories, c.f. Section 1.4.1.
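The power counting behind these statements can be tabulated in a few lines. The sketch below is a hedged illustration: the function names are hypothetical, and the identification of which generation carries which charge (third generation $\alpha_1$, second $\alpha_2$, first $\alpha_3$, with Higgs charge $-2\alpha_1$) is an assumption inferred from the (3,2) example and Table 3 rather than something stated explicitly. It builds the exponents $|2+6a|$, $|3a|$, $|1+3a|$, 2, 1, 0 and checks that the charges $(1, 2, -3)$ reproduce the $a = 1$ pattern.

```python
def exponent_matrix(a):
    """Powers of the expansion parameter for a symmetric Abelian texture."""
    n13 = abs(1 + 3 * a)
    return [[abs(2 + 6 * a), abs(3 * a), n13],
            [abs(3 * a), 2, 1],
            [n13, 1, 0]]

def exponents_from_charges(charges, higgs_charge):
    """|q_i + q_j + q_H| for generations ordered (1, 2, 3)."""
    return [[abs(qi + qj + higgs_charge) for qj in charges] for qi in charges]

def magnitudes(exponents, eps):
    """Order-of-magnitude entries, taking all O(1) coefficients equal to one."""
    return [[eps ** n for n in row] for row in exponents]

# Assumed mapping: generations (1, 2, 3) carry (alpha_3, alpha_2, alpha_1) = (-3, 2, 1).
family_charges = [-3, 2, 1]
higgs_charge = -2  # -2 * alpha_1

print(exponent_matrix(1))
print(exponents_from_charges(family_charges, higgs_charge))
for row in magnitudes(exponent_matrix(1), 0.2):
    print(["%.1e" % x for x in row])
```

For $a = 1$ the (1,1) and (1,3) entries are suppressed by eight and four powers respectively, which is why they act as approximate texture zeros while remaining non-vanishing.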
2.4
$V_{cb}$
The structure of the quark mass matrices in eq(28) leads to the successful texture zero predictions of eq(15). However it does not give a good description for $V_{cb}$ because, c.f. eq(28), the (2,2) and (2,3) matrix elements in the down quark sector are of different order. As a result, assuming the coefficients are of $O(1)$, $V_{cb} = O(\sqrt{m_s/m_b}, \sqrt{m_c/m_t})$, which is uncomfortably large. As may be seen from Table 2, data prefers the (2,2) and (2,3) matrix elements to be of the same order, giving $V_{cb} = O(m_s/m_b)$. This is an important piece of data in the attempt
to determine an underlying family symmetry and has led to several suggested resolutions:
2.4.1
Yukawa couplings
The value for $V_{cb}$ is sensitive to the unknown coefficients. For example the choice $k_{23}\sigma_{23} = k_{32}$
2.4.2
U(1) vacuum alignment
It is also possible to arrange for the (2,2) and (2,3) matrix elements to be of the same order, even in the case of the Abelian gauge symmetry just discussed. This happens if a field, $\phi$, carrying charge $-2$, acquires a vev equal to that of $\theta$ because
\e2lp +
\e2-92 + 2(
chains). With this the prediction is certainly acceptable up to the uncertainties associated with the Abelian symmetry. This example illustrates the importance of vacuum alignment in determining the implications of a family symmetry. As we shall see, vacuum alignment is central to all attempts to determine the pattern of fermion masses in cases involving additional family symmetry.
2.4.3
Non-symmetric charges
Another possible way to generate a small $V_{cb}$ is to allow for non-symmetric charges. As discussed in Section 1.2, we have only (indirect) evidence for equality of the (1,2) and (2,1) matrix elements and the symmetric structure need not apply everywhere. For example the charge assignment $(-3,2,0)$ for $Q_i$ and $(-5,0,0)$ for $u^c_i$, $d^c_i$ gives mass matrices of the form
*±J*//*\
^±J%%%\
(31)
Here $V_{cb} = O(m_s/m_b)$ as required. Further, the contribution to the masses and mixing angles of the (1,1) elements is small, keeping the prediction of eq(16). On the other hand the (1,3) element gives a contribution to $V_{13}$ of the same order as the other terms in eq(14), so the relations of eq(15) are only approximate. This is a characteristic of non-symmetric mass matrix solutions. Although this is not inconsistent, I think we should pursue the more predictive solutions first (provided of course they give acceptable predictions!) and for this reason I will not consider this solution further here.
3
Lepton Masses
As we have seen it is possible to get a good understanding of the pattern of quark masses and mixing angles from a (broken) family symmetry. Of course the immediate question is: does the family symmetry also give a viable structure for the charged lepton and neutrino masses and mixing angles? [15] The charged lepton masses have been very precisely measured:
$m_e = 0.51099907 \pm 0.00000015$ MeV
$m_\mu = 105.658389 \pm 0.000034$ MeV
$m_\tau = 1777.05 \pm 0.27$ MeV
The data on neutrino masses and mixing angles is still quite limited because of the difficulty in performing precision measurements involving neutrinos. However recent reports by the Super-Kamiokande collaboration [16] indicate that the number of $\nu_\mu$ in the atmosphere is decreasing, due to neutrino oscillations. These reports seem to be supported by the recent findings of other experiments [17], as well as by previous observations [18]. The data indicates that the number of $\nu_\mu$ is almost half of the expected number, while the number of $\nu_e$ is consistent with the expectations. $\nu_\mu - \nu_\tau$ oscillations, with
$$\delta m^2_{\nu_\mu\nu_\tau} \approx (10^{-2}\ \mathrm{to}\ 10^{-3})\ \mathrm{eV}^2 \qquad (32)$$
$$\sin^2 2\theta_{\mu\tau} > 0.8 \qquad (33)$$
match the data very well, while dominant $\nu_\mu \to \nu_e$ oscillations are disfavoured by Super-Kamiokande [16] and CHOOZ [19]. On the other hand, the solar neutrino puzzle can be resolved through matter enhanced oscillations [20] with either a small mixing angle:
$$\delta m^2_{\nu_e\nu_\alpha} \approx (3 - 10) \times 10^{-6}\ \mathrm{eV}^2 \qquad (34)$$
$$\sin^2 2\theta_{\alpha e} \approx (0.4 - 1.3) \times 10^{-2} \qquad (35)$$
or a large mixing angle:
$$\delta m^2_{\nu_e\nu_\alpha} \approx (1 - 20) \times 10^{-5}\ \mathrm{eV}^2 \qquad (36)$$
$$\sin^2 2\theta_{\alpha e} \approx (0.5 - 0.9) \qquad (37)$$
or vacuum oscillations:
$$\delta m^2_{\nu_e\nu_\alpha} \approx (0.5 - 1.1) \times 10^{-10}\ \mathrm{eV}^2 \qquad (38)$$
$$\sin^2 2\theta_{\alpha e} > 0.67 \qquad (39)$$
where $\alpha$ is $\mu$ or $\tau$. If neutrinos were to provide a hot dark matter component, then the heavier neutrino(s) should have mass in the range $\sim (1 - 6)$ eV, where the precise value depends on the number of neutrinos that have masses of this order of magnitude [22]. Of course, this requirement is not as acute, since there are many alternative ways to reproduce the observed scaling of the density fluctuations in the universe. Finally, we note that there is another indication of neutrino mass. The collaboration using the Liquid Scintillator Neutrino Detector at Los Alamos (LSND) has reported evidence for the appearance of $\bar\nu_\mu - \bar\nu_e$ [23] and $\nu_\mu - \nu_e$ oscillations [24]. Interpretation of the LSND data favours the choice
$$0.2\ \mathrm{eV}^2 < \delta m^2 < 10\ \mathrm{eV}^2, \qquad 0.002 < \sin^2 2\theta < 0.03 \qquad (40)$$
The experiment KARMEN 2 [25] (the second accelerator experiment at medium energies) is also sensitive to this region of parameter space and restricts the allowed values to a relatively small subset of the above region.
$^2$ Best fit regions for solutions to the solar neutrino deficit have been identified in [21].
3.1
Abelian family symmetry and Charged leptons
Clearly a theory of fermion masses should be able simultaneously to determine the quark and lepton masses. We first consider the constraints on the charged lepton mass matrix following from our Abelian family symmetry. Its structure is of the same form as in eq(28) but controlled by the lepton family charges. Taking into account the running of quark and lepton masses just discussed, it turns out that there is a relative renormalisation by approximately a factor of 3 of the down quarks relative to the charged leptons on going from the Grand Unified scale to the laboratory scale. If $m_b = m_\tau$ at the unification scale this renormalisation then gives excellent agreement with the measured values. For this reason in [13] we restrict the lepton charges by requiring $m_b = m_\tau$ at the unification scale, i.e. $\alpha_1 = a_1$, giving
— mT
*
hl^l^
V *3K73iel1+3Q+6l
122*22^+^
Z23
h2^2^+b\
Z33
(41)
J
where $b = (\alpha_2 - a_2)/(\alpha_2 - \alpha_1)$ and again the Yukawa couplings, $l_{ij}$, are assumed of $O(1)$. At this stage $b$ is not determined but for any value of $b > 0$ there are texture zeros in the (1,1) and (1,3) positions. Remarkably, there is evidence that such texture zeros are needed in the lepton matrix too because they give rise to the phenomenologically successful prediction $\mathrm{Det}(M_l) \approx \mathrm{Det}(M_d)$. To proceed further a value of $b$ must be chosen and two viable choices suggest themselves. For $b = 0$ the lepton charges are the same as in the down quark sector, and so the structure of the down quark and lepton mass matrices are identical. In order to explain the detailed difference between down quark and lepton masses it is necessary in this case to assume that the constants of proportionality determined by Yukawa couplings, which we have so far taken to be equal (and of $O(1)$), differ slightly for the lepton case. A factor 3 in the (2,2) entry is sufficient to give excellent charged lepton masses. As we shall discuss in Section 5.2, this factor arises in a class of Grand Unified theories so it may be quite natural. An alternative which does not rely on different Yukawa couplings is to choose $b$ half integral. For $a = 1$, $b = 1/2$ we found excellent agreement for the charged lepton masses (in this case there is a $Z_2$ symmetry forcing the (1,3), (3,1), (2,3), (3,2) matrix elements to vanish).
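The effect of a factor 3 in the (2,2) entry can be seen in the light two-by-two block. The sketch below is illustrative only: the function name is hypothetical and the entries are assumed powers of an expansion parameter 0.22, not fitted Yukawa couplings. With a vanishing (1,1) entry the determinant does not depend on the (2,2) element, so the factor 3 raises the heavier eigenvalue and lowers the lighter one while their product is unchanged.

```python
import math

def eigen_magnitudes(a, b):
    """Magnitudes (light, heavy) of the eigenvalues of [[0, a], [a, b]]."""
    root = math.sqrt(b * b + 4.0 * a * a)
    heavy = 0.5 * (abs(b) + root)
    light = a * a / heavy          # the determinant fixes light * heavy = a^2
    return light, heavy

eps = 0.22                         # assumed expansion parameter
a, b = eps ** 3, eps ** 2          # illustrative (1,2) and (2,2) entries

d_light, d_heavy = eigen_magnitudes(a, b)        # down-quark-like block
l_light, l_heavy = eigen_magnitudes(a, 3.0 * b)  # lepton block with the factor 3

print("heavy ratio:", l_heavy / d_heavy)
print("light ratio:", l_light / d_light)
print("determinants:", d_light * d_heavy, l_light * l_heavy)
```

The heavier lepton eigenvalue comes out close to three times the quark one and the lighter close to one third, which is the pattern the equality of determinants requires.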
3.2 Neutrinos
3.2.1 Small neutrino masses: the see-saw mechanism
The implications of the neutrino oscillation experiments are very exciting, for non-zero neutrino mass means a departure from the Standard Model, and neutrino oscillations indicate violation of lepton family number, again lying beyond the Standard Model. The first question that needs to be answered is why the neutrino masses are so small. In this lecture I will follow what we believe to be the most promising explanation, namely that neutrino masses are small due to the "see-saw" mechanism [26], in which the light neutrino masses are suppressed by a very large scale associated with the onset of new (unified?) physics. The "see-saw" mechanism follows naturally in the case that right-handed neutrinos exist. Suppose that there is no weak isospin 1 Higgs field and hence there are no mass terms of the $\nu_L \nu_L$ type. In this case there are two possible neutrino masses
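$$m_{Dirac}\,\bar\nu_L \nu_R + M_{Majorana}\,\nu_R \nu_R = (\bar\nu_L \;\; \nu_R)\begin{pmatrix} 0 & m_D \\ m_D & M_M \end{pmatrix}\begin{pmatrix} \nu_L \\ \nu_R \end{pmatrix} \qquad (42)$$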
The Dirac mass is similar to an up quark mass and one's naive expectation is that they should have similar magnitude. On the other hand, the Majorana mass term is invariant under the Standard Model gauge group and does not require a stage of electroweak breaking to generate it. For this reason, one expects the Majorana mass to be much larger than the electroweak breaking scale, perhaps as large as the scale of the new physics beyond the Standard Model; for example, the Grand Unified scale or even the Planck scale. Diagonalising the mass matrix gives the eigenvalues
$$M_{Heavy} \simeq M_M, \qquad m_{Light} \simeq \frac{m_D^2}{M_M} \qquad (43)$$
The see-saw mechanism generates an effective Majorana mass for the light neutrino (predominantly $\nu_L$) by mixing with the heavy state (predominantly $\nu_R$) of mass $M_M$. It is driven by an effective Higgs $\Phi^{I_W=1}$ made up of $H^{I_W=1/2} H^{I_W=1/2}/M_M$ (hence the two factors of $m_D$ in eq.(43)). Eq.(43) shows that a large scale for the Majorana mass gives a very light neutrino. For example, $M_M = 10^{16}$ GeV with $m_D$ taken to be the top quark mass gives $m_{Light} \simeq 3\cdot 10^{-3}$ eV.
This estimate shows that it is quite natural to have neutrinos in a mass range appropriate to give, for example, solar neutrino oscillations. However, in many cases, larger masses capable of explaining the other oscillation phenomena are possible because the Majorana mass for the right-handed neutrinos is often smaller than the Grand Unified mass. A Majorana mass for the right-handed neutrino requires a Higgs carrying right-handed isospin 1 (in analogy with the left-handed case, which needed left-handed isospin 1). If this field is not present (for example in level one string theory this is always the case) one may get a double see-saw because the Majorana mass for the right-handed neutrino is also generated by an effective Higgs, made up of $H^{I_{W,R}=1/2} H^{I_{W,R}=1/2}/M'$, where $M'$ denotes a scale of physics beyond the Grand Unification scale. Taking this to be the Planck scale (probably the largest reasonable possibility) and $\langle H^{I_{W,R}=1/2}\rangle$ to be the Grand Unified scale (it breaks any Grand Unified group) one finds
$$M_M \simeq \frac{(10^{16})^2}{10^{19}}\ \mathrm{GeV}$$
giving
$$m_{Light} \simeq 1\ \mathrm{eV}$$
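The two estimates above are simple enough to reproduce numerically. The following sketch evaluates $m_{Light} = m_D^2/M_M$ for the single and the double see-saw; the top mass value of 175 GeV and the Planck scale of $10^{19}$ GeV are assumed inputs, and the function name is illustrative.

```python
def seesaw_light_mass_eV(m_dirac_GeV, m_majorana_GeV):
    """Light eigenvalue m_D^2 / M_M of the see-saw matrix, converted to eV."""
    return (m_dirac_GeV ** 2 / m_majorana_GeV) * 1.0e9


M_TOP_GEV = 175.0      # assumed value for the Dirac mass
M_GUT_GEV = 1.0e16     # Grand Unified scale used in the text
M_PLANCK_GEV = 1.0e19  # assumed Planck scale

# Single see-saw: Majorana mass at the Grand Unified scale.
single = seesaw_light_mass_eV(M_TOP_GEV, M_GUT_GEV)

# Double see-saw: the Majorana mass is itself suppressed, M_M = M_GUT^2 / M'.
majorana_double = M_GUT_GEV ** 2 / M_PLANCK_GEV
double = seesaw_light_mass_eV(M_TOP_GEV, majorana_double)

if __name__ == "__main__":
    print(f"single see-saw: {single:.2e} eV")
    print(f"double see-saw: M_M = {majorana_double:.1e} GeV, m_light = {double:.2f} eV")
```

With these inputs the single see-saw gives a few times $10^{-3}$ eV and the double see-saw a mass of order an eV, the same orders of magnitude as quoted.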
Thus, one sees that the see-saw mechanism naturally gives neutrino masses in the range relevant to neutrino oscillation measurements. Moreover, as the neutrino mass is proportional to the Dirac mass squared, taking the Dirac mass of each family of neutrinos to be of the order of the equivalent up quark mass, one obtains a large hierarchy between different families of neutrinos. This is what is required if one is to explain several oscillation phenomena, for it allows the existence of several mass differences. In what follows, we will concentrate on the possibility that there is a minimal extension of the Standard Model involving just three new right-handed neutrino states and that the mass structure of the neutrinos is intimately related to that of the charged leptons and quarks. This implies that the three different indications for neutrino oscillations discussed above cannot be simultaneously explained, because three neutrino masses allow only two independent mass differences. To explain all three observations requires another (sterile) light neutrino state. However, introducing such a state breaks any simple connection between neutrino masses and those of the other Standard Model states, and here we wish to explore whether the apparently complex pattern of quark and lepton masses and mixing angles can be simply understood.

While the structure of the see-saw mechanism leads naturally to light neutrino masses in a physically interesting range, it does not by itself explain the pattern of neutrino masses and mixing angles. To go further requires a theory of the family structure of the couplings giving rise to the neutrino masses, and here we first discuss the implications of the Abelian family symmetry discussed above.
3.2.2 Small neutrino masses - radiative generation
Although the see-saw mechanism offers a very plausible explanation for small neutrino masses it is not the only possibility. Examples have been constructed in which neutrino masses are zero at tree level and are generated at some order in perturbation theory. A simple example is given by (non-MSSM) supersymmetric extensions of the Standard Model in which R-parity violating couplings generate the lepton-number violating couplings needed to give a Majorana mass. For example, the coupling $h\,\nu d d^c$ generates a neutrino mass at one loop order via a $d$ quark, $d^c$ squark intermediate state, given by [27]
$$m_\nu \propto \frac{h^2\, m_d^2\, A_{SUSY}}{m_{\tilde d}^2}$$
where $A_{SUSY}$ is a soft trilinear supersymmetry breaking parameter and $m_{\tilde d}$ is the down squark mass. For a reasonable choice of the supersymmetry breaking parameter this gives $m_\nu \approx 10^{-5} h^2 m_d$, which is small without the need for a GUT mass.
3.2.3 Small neutrino masses - large new dimensions
We have seen how Grand Unification provides an elegant explanation for the lightness of neutrinos. In the case of large new dimensions, however, the solution to the hierarchy problem is supposed to be due to the absence of any fundamental mass scale much larger than the electroweak breaking scale [9]! This is a possibility because, in such theories, the weakness of the gravitational interactions does not require a fundamental Planck scale of $O(10^{19}\,\mathrm{GeV})$. The reason is readily seen if we compute the Newtonian potential between two point masses in the underlying $(4+\delta)$ dimensional theory. It is given by
$$V_{gravity}(r)\Big|_{r \ll R} = \frac{m_1 m_2}{M_{Planck,4+\delta}^{\delta+2}}\,\frac{1}{r^{\delta+1}}$$
where $M_{Planck,4+\delta}$ is the fundamental mass scale in the $(4+\delta)$ dimensional theory (the string tension in a string theory). This form applies at distances much smaller than the size, $R$, of the $\delta$ additional compact dimensions. At distances larger than their size we have
$$V_{gravity}(r)\Big|_{r \gg R} = \frac{m_1 m_2}{M_{Planck,4+\delta}^{\delta+2}}\,\frac{1}{R^{\delta}}\,\frac{1}{r}$$
This shows that the Planck scale in our 4 dimensional theory is made larger by the spreading of flux in the new dimensions. It is given by
$$M_{Planck,4}^2 = M_{Planck,4+\delta}^{\delta+2}\,R^{\delta} \qquad (44)$$
For a large new dimension (measured in the fundamental unit $M_{Planck,4+\delta}$) we have $M_{Planck,4+\delta} R \gg 1$ and so we can see that the large size of the Planck mass (and the associated feebleness of the gravitational force) may be due to the existence of a new large dimension. Given that the fundamental scale is now $M_{Planck,4+\delta}$, the cutoff for the field theory radiative corrections of the Standard Model should be identified with this scale because, above it, the states of the theory propagate in more than four dimensions. Thus above this scale the radiative corrections must be re-computed in the higher dimensional theory, and the hierarchy problem is solved if $M_{Planck,4+\delta} < O(1\,\mathrm{TeV})$. For the factorisable metric case the implications are dramatic, for the graviton Kaluza Klein modes must be very light. We may solve eq(44) for $R$ for various numbers of additional dimensions assuming $M_{Planck,4+\delta} = 1$ TeV:
$$R = 0.4\ \mathrm{mm},\ \delta = 2; \qquad R = 10^{-5}\ \mu\mathrm{m},\ \delta = 4; \qquad R = 30\ \mathrm{fm},\ \delta = 6 \qquad (45)$$
The case $\delta = 1$ is ruled out because the Cavendish experiments have verified Newtonian gravity down to scales of $O(1\,\mathrm{mm})$. However, as one may see, the other cases are allowed. Using this bound and eq(44) one obtains the mass gap between Kaluza Klein modes of the graviton
$$\frac{1}{R} = 10^{-3}\ \mathrm{eV},\ 20\ \mathrm{keV},\ 7\ \mathrm{MeV},\ 100\ \mathrm{MeV}; \qquad \delta = 2, 4, 6, 8 \qquad (46)$$
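The radii and mass gaps follow from eq(44) by simple arithmetic. The sketch below solves $M_{Planck,4}^2 = M_{Planck,4+\delta}^{\delta+2} R^{\delta}$ for $R$ with a fundamental scale of 1 TeV; the use of the reduced Planck mass ($2.4\cdot 10^{18}$ GeV) and the neglect of factors of $2\pi$ are assumptions made here, so only the orders of magnitude should be compared with the text.

```python
HBARC_GEV_M = 1.973269804e-16  # conversion: 1 GeV^-1 expressed in metres
M_PLANCK_4_GEV = 2.4e18        # assumed: reduced four-dimensional Planck mass


def radius_inverse_GeV(delta, m_fund_GeV=1.0e3):
    """Solve M_Pl4^2 = M^(delta+2) R^delta for R, in units of GeV^-1."""
    return (M_PLANCK_4_GEV ** 2 / m_fund_GeV ** (delta + 2)) ** (1.0 / delta)


def radius_m(delta, m_fund_GeV=1.0e3):
    """Compactification radius in metres."""
    return radius_inverse_GeV(delta, m_fund_GeV) * HBARC_GEV_M


def kk_gap_eV(delta, m_fund_GeV=1.0e3):
    """Kaluza Klein mass gap 1/R in eV."""
    return 1.0e9 / radius_inverse_GeV(delta, m_fund_GeV)


if __name__ == "__main__":
    for delta in (2, 4, 6, 8):
        print(f"delta={delta}: R = {radius_m(delta):.2e} m, 1/R = {kk_gap_eV(delta):.2e} eV")
```

Because $R$ depends on the $1/\delta$ power of a very large ratio, the radius falls from a fraction of a millimetre at $\delta = 2$ to tens of fermi at $\delta = 6$.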
Although each mode couples with gravitational strength only, and thus is very difficult to observe, at high energies the inclusive cross section for producing the kinematically accessible tower of Kaluza Klein states is only suppressed by inverse powers of $M_{Planck,4+\delta}$ due to the large number of states which contribute. As we shall discuss, such effects should be visible in LHC experiments.

The origin of the hierarchy is somewhat different for the case of a non-factorisable metric. An explicit example was recently provided by Randall and Sundrum [10]. They considered the case of two parallel 3-branes sitting on the fixed points of an $S^1/Z_2$ orbifold. The 5D spacetime is essentially a slice of $AdS_5$ and the tensions of the two 3-branes are chosen so that the 4D spacetime appears flat. This last requirement forces one of the two branes to have negative tension. The solution to Einstein's equations has the form of equation(26) with p(r) = kR, where $k$ is the five dimensional curvature. The exponential "warp" factor, $w = e^{-kR}$, in the metric then generates an hierarchy of the mass scales between the two branes. All the fundamental mass scales are of order $M_{Planck,4}$. The graviton is localised to the positive tension brane and matter is on the negative tension brane. Masses on the negative tension brane are of order $w M_{Planck,4}$. Due to the exponential dependence of the warp factor on the size of the new dimension one may readily generate the desired mass hierarchy with $w = O(10^{-15})$ even though the size, $R$, of the orbifold is only some 30 times $k^{-1}$, which is of order the Planck length.

Theories with no fundamental scale far above the electroweak scale cannot use the see-saw mechanism, eq(43), involving the very large Grand Unified scale. Somewhat surprisingly, these theories provide an equally elegant reason for the lightness of neutrinos.
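The claim that a modest orbifold size suffices can be made concrete. Assuming the warp factor has the form $w = e^{-kR}$ used above, the required size in units of $k^{-1}$ is just $\ln(1/w)$; the function name is illustrative.

```python
import math


def orbifold_size_in_units_of_inverse_k(warp_factor):
    """Return kR such that exp(-kR) equals the requested warp factor."""
    return -math.log(warp_factor)


if __name__ == "__main__":
    # A hierarchy of fifteen orders of magnitude needs kR of only a few tens.
    print(orbifold_size_in_units_of_inverse_k(1.0e-15))
```

The logarithm turns fifteen orders of magnitude into a factor of roughly 35, consistent with the "some 30 times" quoted in the text.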
Factorisable Case. In these theories the smallness of Newton's coupling is due to flux spreading of the graviton in the additional dimension. If the right-handed neutrino is also able to propagate in the new large dimension its flux too will spread in the extra dimensions and this will lead to a suppression of the Yukawa coupling of the neutrino to the Higgs. As a result the neutrino mass will be naturally small, having a similar see-saw form to that in eq(43), suppressed by an inverse power of the four dimensional Planck mass [28]. To make this more explicit consider a five dimensional (5D) space with coordinates $(x_\mu, y)$. The interaction of the right-handed bulk neutrino, $\nu_R(x,y)$, with the Standard Model lepton doublet and Higgs states, $l(x)$, $h(x)$, confined to a 4D brane is given by
$$S = \lambda \int d^4x\; l(x)\,h(x)\,\nu_R(x, y=0)$$
If the fifth dimension is compact (here, compactified on a circle) we may expand the neutrino wave function in terms of its Kaluza Klein tower of states
$$\nu_R(x,y) = \frac{1}{\sqrt{2\pi R M_*}} \sum_{n=-\infty}^{\infty} \nu_R^{(n)}(x)\, e^{iny/R}$$
where the normalisation factor has been chosen so that the states, $\nu_R^{(n)}(x)$, have canonical normalisation for their kinetic terms - this factor accounts for the flux spreading into the fifth dimension. Here $R$ is the radius of compactification and $M_*$ is the fundamental mass scale. The neutrino mass is given by
$$m_\nu = \frac{\lambda\, v\, M_*}{M_{Planck,4}}$$
where we have used the fact that the 5D mass scale is related to the 4D Planck scale by $2\pi R M_* = M_{Planck,4}^2/M_*^2$. Thus we see the effect of flux spreading is to suppress the neutrino coupling in the same way as the gravitational coupling is suppressed, since both propagate in the extra dimension(s). As a result one naturally obtains a see-saw mechanism suppressed by the large Planck mass even though there are no fundamental mass scales of this magnitude in the theory. For $M_*$ of order 1 TeV, a scale that avoids the hierarchy problem, one readily obtains neutrino masses of the correct order.

Non-factorisable case. In this case the graviton wave function (the zero mode of the Kaluza Klein tower) is peaked on the Planck brane and masses on the Standard Model brane are suppressed by the warp factor corresponding to the fall-off of the graviton propagator, $M_i \sim M_{Planck}\, e^{-kr}$. If the right handed neutrino, $\nu_R$, propagates in the bulk one finds it, too, has its lightest mode peaked on the Planck brane. As a result its Yukawa coupling to the lepton doublet and Higgs doublet fields living on the Standard Model brane is suppressed by the factor $e^{-m/k}$ where $m$ is the $\nu_R$ mass in the bulk and $k$ is the 5D curvature. One may try to use these factors to obtain a pattern of mass hierarchies [29] but, as in the case of radiative corrections, one may wonder why the hierarchies needed are quite small when the natural expectation from an exponential factor of this type is a very large hierarchy.
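For the factorisable case, the flux-spreading suppression can be put into numbers. The sketch below assumes the relation $m_\nu = \lambda v M_*/M_{Planck,4}$, which follows from dividing the brane Yukawa coupling by $\sqrt{2\pi R M_*}$; the Higgs expectation value of 174 GeV, the Planck mass of $1.2\cdot 10^{19}$ GeV and the function name are illustrative assumptions.

```python
def bulk_neutrino_mass_eV(yukawa, higgs_vev_GeV, m_star_GeV, m_planck_GeV):
    """Dirac mass of a bulk right-handed neutrino, lambda * v * M_* / M_Planck, in eV."""
    return yukawa * higgs_vev_GeV * m_star_GeV / m_planck_GeV * 1.0e9


HIGGS_VEV_GEV = 174.0    # assumed electroweak expectation value
M_PLANCK_GEV = 1.2e19    # assumed four-dimensional Planck mass

if __name__ == "__main__":
    for m_star in (1.0e3, 1.0e4, 1.0e5):
        mass = bulk_neutrino_mass_eV(1.0, HIGGS_VEV_GEV, m_star, M_PLANCK_GEV)
        print(f"M_* = {m_star:.0e} GeV -> m_nu = {mass:.2e} eV")
```

The mass scales linearly with the fundamental scale $M_*$, so the result is sensitive to where that scale lies above 1 TeV.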
3.3 Abelian family symmetry and neutrinos
3.3.1 The two generation case
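Let us consider in more detail the 2 x 2 heavier sector of the theory, relevant to the atmospheric neutrino oscillations for the case that only one mass squared difference contributes. The charged lepton matrix constrained by the U(1) family symmetry has the form
$$\frac{M_l}{m_\tau} = \begin{pmatrix} m_{11}\left(\frac{\langle\theta\rangle}{M}\right)^{q_L+q_R} & m_{12}\left(\frac{\langle\theta\rangle}{M}\right)^{q_L} \\ m_{21}\left(\frac{\langle\theta\rangle}{M}\right)^{q_R} & 1 \end{pmatrix} \qquad (47)$$
where the origin of the intermediate mass scale, $M$, will be discussed shortly. The parameters $m_{ij}$ are constants of $O(1)$, reflecting the unknown Yukawa couplings, and $q_L = a_2 - a_3$, $q_R = b_2 - b_3$ where, for the purposes of the general discussion, we have allowed different charges for the right-handed neutrinos. Below we shall discuss how the symmetric left-right charge assignments can give rise to a realistic structure. It is instructive to write eq(47) in the form
$$\frac{M_l}{m_\tau} = E_L \cdot A \cdot E_R \qquad (48)$$
where
$$E_L = \begin{pmatrix} \left(\frac{\langle\theta\rangle}{M}\right)^{q_L} & 0 \\ 0 & 1 \end{pmatrix}, \qquad A = \begin{pmatrix} m_{11} & m_{12} \\ m_{21} & 1 \end{pmatrix}, \qquad E_R = \begin{pmatrix} \left(\frac{\langle\theta\rangle}{M}\right)^{q_R} & 0 \\ 0 & 1 \end{pmatrix} \qquad (49)$$
The matrix $A$ is determined by the Yukawa couplings only. If the only symmetry restricting the form of the mass matrices is the Abelian family symmetry there is no reason to expect correlations between the elements of $A$ and so we expect $\mathrm{Det}(A) = O(1)$. This is the situation we will explore in this paper. Given this we may see that $M_l$ has the form
$$M_l = V_{eL} \cdot M_{l,Diagonal} \cdot V_{eR}^{\dagger} \qquad (50)$$
where
$$\frac{M_{l,Diagonal}}{m_\tau} = \begin{pmatrix} r\left(\frac{\langle\theta\rangle}{M}\right)^{q_L}\left(\frac{\langle\theta\rangle}{M}\right)^{q_R} & 0 \\ 0 & 1 \end{pmatrix} \qquad (51)$$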
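and $V_{eL} = V\!\left(r'\left(\frac{\langle\theta\rangle}{M}\right)^{q_L}\right)$, $V_{eR} = V\!\left(r''\left(\frac{\langle\theta\rangle}{M}\right)^{q_R}\right)$ with $r, r', r'' = O(1)$ and
$$V(x) = \begin{pmatrix} 1 & x \\ -x & 1 \end{pmatrix} \qquad (52)$$
The lepton analogue [30] of the CKM mixing matrix for quarks is given by
$$V_{MNS} = V_{\nu L}^{\dagger} \cdot V_{eL} \qquad (53)$$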
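The separation between mixing and eigenvalues can be illustrated numerically for a 2 x 2 matrix of the factorised form $E_L \cdot A \cdot E_R$. This is only a sketch: the $O(1)$ coefficients and the expansion parameter 0.2 are arbitrary illustrative values, and the left-handed mixing angle is extracted from $M M^T$.

```python
import math


def lepton_matrix(x, q_left, q_right, m11=0.8, m12=1.2, m21=0.7):
    """Build E_L * A * E_R with E = diag(x^q, 1) and A = [[m11, m12], [m21, 1]]."""
    x_l, x_r = x ** q_left, x ** q_right
    return [[m11 * x_l * x_r, m12 * x_l], [m21 * x_r, 1.0]]


def left_mixing_angle(m):
    """Rotation angle that diagonalises the symmetric matrix M M^T."""
    h11 = m[0][0] ** 2 + m[0][1] ** 2
    h22 = m[1][0] ** 2 + m[1][1] ** 2
    h12 = m[0][0] * m[1][0] + m[0][1] * m[1][1]
    return 0.5 * math.atan2(2.0 * h12, h22 - h11)


def mass_ratio(m):
    """Ratio of the lighter to the heavier singular value of M."""
    trace = sum(v * v for row in m for v in row)
    det = abs(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    s_max_sq = 0.5 * (trace + math.sqrt(trace ** 2 - 4.0 * det ** 2))
    return det / s_max_sq


if __name__ == "__main__":
    for q_right in (1, 3):
        m = lepton_matrix(0.2, 1, q_right)
        print(q_right, left_mixing_angle(m), mass_ratio(m))
```

Changing the right-handed charge leaves the left-handed angle almost untouched while rescaling the light eigenvalue, which is the pattern described in the following paragraph of the text.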
The important point to note is that the left-handed lepton mixing matrix contribution is determined entirely by the left-handed lepton doublet family symmetry charges while the eigenvalues are determined by both the left-handed and right-handed charges. A similar analysis may be applied to the neutrino sector. We have
$$M_{Light} = M^D_\nu \cdot (M^M_\nu)^{-1} \cdot (M^D_\nu)^T = (V_{\nu L} \cdot M^D_{Diagonal} \cdot V_{\nu R}^{\dagger}) \cdot (M^M_\nu)^{-1} \cdot (V_{\nu L} \cdot M^D_{Diagonal} \cdot V_{\nu R}^{\dagger})^T = V_{\nu L} \cdot \tilde V_{\nu R} \cdot M^{Light}_{Diagonal} \cdot \tilde V_{\nu R}^{T} \cdot V_{\nu L}^{T} \qquad (54)$$
We see that there are two contributions to $V_{MNS}$ in eq.(53) coming from the neutrino sector. The first is
$$V_{\nu L} = V\!\left(s\left(\frac{\langle\theta\rangle}{M'}\right)^{q_L}\right) \qquad (55)$$
where $s = O(1)$ and we have allowed for a different intermediate scale $M'$ (see below). It is determined by the same left-handed lepton doublet family symmetry charges that determine $V_{eL}$. The second contribution, $\tilde V_{\nu R}$, is sensitive to the right-handed neutrino family charges. Let us discuss whether, in the case the light neutrinos have a hierarchical mass pattern (necessary if we are to explain both the atmospheric and solar oscillations), this contribution can be large. We note that if the elements of $\tilde V_{\nu R}$ are all of $O(1)$ and one neutrino mass, $m_1$, dominates then the elements of the matrix $\tilde V_{\nu R} \cdot M^{Light}_{Diagonal} \cdot \tilde V_{\nu R}^T$ are all of $O(m_1)$ but its determinant is $\ll O(m_1^2)$. This matrix is also given by $(M^D_{Diagonal} \cdot V_{\nu R}^{\dagger}) \cdot (M^M_\nu)^{-1} \cdot (M^D_{Diagonal} \cdot V_{\nu R}^{\dagger})^T$. As discussed above, the Abelian family symmetry cannot give correlations between the Yukawa couplings determining different matrix elements of $M^D_{Diagonal}$ and $M^M_\nu$. Thus, at first sight, it appears its determinant cannot be of a different order than the product of its diagonal elements, in contradiction with the conclusion that follows if the neutrinos are hierarchical in mass. We shall discuss the circumstance in which this conclusion is false shortly. If it applies we have $V_{MNS} \approx V_{\nu L}^{\dagger} \cdot V_{eL}$ giving
$$V_{MNS} \approx \begin{pmatrix} 1 & -s\left(\frac{\langle\theta\rangle}{M'}\right)^{q_L} + r'\left(\frac{\langle\theta\rangle}{M}\right)^{q_L} \\ s\left(\frac{\langle\theta\rangle}{M'}\right)^{q_L} - r'\left(\frac{\langle\theta\rangle}{M}\right)^{q_L} & 1 \end{pmatrix} \qquad (56)$$
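with the implication that $q_L = 0$ for near maximal mixing. This is the solution adopted by several authors [31]. It leads to $O(1)$ mixing, although there is no reason for the mixing to be really maximal, i.e. $\pi/4$ (for this, a non-Abelian symmetry is necessary [32]). Although the mixing matrix $V_{MNS}$ is determined by the left-handed charges only, the mass eigenvalues are sensitive to the right-handed charges. In particular the Majorana mass has a similar form to that in eq.(48)
$$M^M_\nu \approx E_{\nu R} \cdot B \cdot E_{\nu R}, \qquad E_{\nu R} = \begin{pmatrix} \left(\frac{\langle\theta\rangle}{M''}\right)^{q_{\nu R}} & 0 \\ 0 & 1 \end{pmatrix} \qquad (57)$$
where we have allowed for a different intermediate mass scale, $M''$, in the right-handed sector and $B$ is a matrix of Yukawa couplings of $O(1)$. This gives
$$\mathrm{Det}(M_l) \propto \left(\frac{\langle\theta\rangle}{M}\right)^{q_L+q_{eR}}, \qquad \mathrm{Det}(M_{Light}) = \frac{[\mathrm{Det}(M^D_\nu)]^2}{\mathrm{Det}(M^M_\nu)} \propto \frac{\left(\frac{\langle\theta\rangle}{M'}\right)^{2(q_L+q_{\nu R})}}{\left(\frac{\langle\theta\rangle}{M''}\right)^{2q_{\nu R}}} \qquad (58)$$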
Thus we see that the lepton mass may be adjusted by the choice of $q_{eR}$, while the neutrino masses may be adjusted by the choice of $q_{\nu R}$. As a result a U(1) family symmetry is readily compatible with an hierarchical neutrino mass matrix and a large mixing angle in the lepton sector, although it is unlikely to be maximal. However, in the context of our discussion concerning texture zeros the solution is somewhat unsatisfactory as it requires non-symmetric mass matrices. This in turn means that the full texture zero structure is lost - the relations eq(15) are valid only as order of magnitude equations rather than equalities.

However it is possible to evade this conclusion and build a model with large mixing generated by the neutrino sector even for the case $q_L \neq 0$ and a left-right symmetric charge assignment. Indeed we will show that this is quite natural and relates the large mixing angle to the smallness of $V_{cb}$! The problem is to explain why the elements of $\tilde V_{\nu R} \cdot M^{Light}_{Diagonal} \cdot \tilde V_{\nu R}^T$ are all of $O(m_1)$ but its determinant is $\ll O(m_1^2)$. In fact this happens without the need to tune Yukawa couplings if $M^M_\nu$ has a very small eigenvalue [33]. In this case this state may dominate in the see-saw product $(M^D_{Diagonal} \cdot V_{\nu R}^{\dagger}) \cdot (M^M_\nu)^{-1} \cdot (M^D_{Diagonal} \cdot V_{\nu R}^{\dagger})^T$, giving rise to an anomalously heavy light neutrino state. Further, if this light right-handed neutrino state couples equally to the second and third families of left-handed neutrinos, it will give equal elements for $\tilde V_{\nu R} \cdot M^{Light}_{Diagonal} \cdot \tilde V_{\nu R}^T$, all of $O(m_1)$ as required. In the case the neutrinos have the same Abelian charge assignments as the quarks this latter condition corresponds to the equality of the (2,2) and (2,3) matrix elements that was forced on us by the smallness of $V_{cb}$.
3.3.2 The three generation case
In fact almost the simplest lepton charge assignment gives rise to a large mixing angle in the $\nu_\mu - \nu_\tau$ sector. Following the discussion of Section 3.1 we take the leptons to have the same family charges as the quarks. This is consistent with an underlying (SU(5)) GUT symmetry and, as we shall discuss, this can explain the difference between the quark and lepton masses. Since we have a left-right symmetric assignment the right handed neutrino charges are now determined and so the structure of $M^M_\nu$ is largely determined. Such masses arise from terms of the form $\nu_R \nu_R \Sigma$, where $\Sigma$ is an $SU(3) \otimes SU(2) \otimes U(1)$ invariant Higgs scalar field with $I_W = 0$ and $\nu_R$ is a right-handed neutrino. Since we do not know the charge of $\Sigma$, we have to consider all possible choices. This allows us to "rotate" the larger coupling to any of the entries of the heavy Majorana mass matrix, generating a discrete spectrum of possible forms [34, 35]. For example, if the $\Sigma$ charge is the same as that of the $H_{1,2}$ doublet Higgs charges, the larger element of $M^M_\nu$ will be in the (3,3) entry. The rest of the terms will be generated as before through the $U(1)_{FD}$ breaking by $\langle\theta\rangle$ and $\langle\bar\theta\rangle$. In the preceding section we observed that a large mixing angle will result if there is an anomalously light state equally coupled to the second and third families. This coupling is determined by $M^D_\nu$ which has the structure of eq(31). We see that the equality of the (2,2) and (2,3) matrix elements means that $\nu_{R2}$ couples equally to the second and third generations so that if it is anomalously light it will automatically generate large mixing in the light neutrino sector. If the $\Sigma$ charge is $-(a_1 + a_3)$ it dominantly couples to the first and third families leaving $\nu_{R2}$ light as desired. Whether it is light enough to dominate the see-saw light neutrino mass matrix depends on how small the coupling of $\Sigma$ to the second family is. This in turn is determined by the expansion parameter $\langle\theta\rangle/M''$. At this point, it is important to discuss what the expansion parameters are in the various sectors, i.e. what $M$ and $M'$ are.
As discussed above, the most reasonable origin of the higher dimension terms $\propto (\langle\theta\rangle/M)$ is via the Froggatt-Nielsen mechanism [8], through the mixing of the lepton states or the Higgs states. In the case of the mixing responsible for $V_{MNS}$, the former is irrelevant, for in this case the mixing arises via heavy states which belong to SU(2) doublets and hence are closely degenerate ($M' = M$). In this case, the contributions to eq.(53) or (56) cancel. We conclude that the relevant mixing is generated through the Higgs states. Thus, $M$ should be interpreted as the mass of the heavy Higgs states mixing with $H_1$, generating the down quark and charged lepton masses, while $M'$ is the mass of the heavy Higgs state mixing with $H_2$, generating the up quark and Dirac neutrino masses. Consequently, the expectation is that $M' > M$ because the same expansion parameters govern the hierarchy of quark masses, and typically one needs a smaller expansion parameter in the up-quark sector to explain the larger hierarchy of masses in that sector. Finally $M''$ should be identified with the mass of the heavy Higgs states mixing with $\Sigma$. This is not determined by the Abelian symmetry and for $M'' \gg M'$ the expansion parameter $\langle\theta\rangle/M''$ will be small enough so that $\nu_{R2}$ indeed gives the dominant contribution to $(M^D_{Diagonal} \cdot V_{\nu R}^{\dagger}) \cdot (M^M_\nu)^{-1} \cdot (M^D_{Diagonal} \cdot V_{\nu R}^{\dagger})^T$.

Putting all this together we find the following structure for the neutrino masses and mixing
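$$M^D_\nu = [\,3\times 3 \text{ matrix in powers of } \epsilon\,]\; v_2, \qquad M^M_\nu = [\,3\times 3 \text{ matrix in powers of } \epsilon_\nu\,]\; \langle\Sigma\rangle \qquad (59)$$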
where $\epsilon = \langle\theta\rangle/M' \sim \bar\epsilon^2$, and $\epsilon_\nu = \langle\theta\rangle/M''$. For reasons that will become apparent I have chosen the vacuum alignment breaking of Section 2.4.2 when determining the matrix elements. The Majorana mass matrix has two eigenvalues of order 1 and one suppressed by a power of $\epsilon_\nu$, and so for sufficiently small $\epsilon_\nu$ the lightest state, which is in the $\nu_{2R}$ direction, will dominate the see-saw mechanism and generate the heaviest of the three light neutrino states. Up to Yukawa couplings of $O(1)$ its mass is given by $m_{Heavy} \simeq \epsilon^4 \langle H\rangle^2/M_{\nu_{2R}}$, with $\langle H\rangle = v_2$ and $M_{\nu_{2R}}$ given by $\langle\Sigma\rangle$ times the small eigenvalue, and its composition is given by $\nu_{Heavy} \simeq (\nu_\mu + \nu_\tau)/\sqrt{2} + O(\epsilon^2)\nu_e$. Thus we have near maximal mixing as required. The reason for this is that $\nu_{2R}$ couples to the combination $\epsilon^2(\nu_\mu + \nu_\tau) + \epsilon^4\nu_e$, as may be seen from the form of $M^D_\nu$. The structure of $M^D_\nu$ is the same as that of the quark mass matrices and so one sees that, due to the see-saw mechanism, maximal mixing results from the smallness of $V_{cb}$ (which required that the (2,3) element should be the same magnitude as the (2,2) element). This is quite contrary to the naive expectation that quark and lepton mixing angles should be the same order and is the reason why a very simple family symmetry can generate acceptable mass and mixing angle hierarchies for both quarks and leptons. However we see that it is necessary that $M^D_\nu$ preserves the same form as the quark mass matrix and it was for this reason I had to use the vacuum alignment of Section 2.4.2. While we showed that this could readily be obtained for an Abelian symmetry, as I shall discuss it is perhaps more natural in the context of a non-Abelian family symmetry. It is straightforward to determine the three light eigenstates resulting from eq(59). Up to Yukawa couplings of $O(1)$ they are given by $\nu_a \simeq \nu_e - \epsilon^2(\nu_\mu$
+ vT)/^/2,
Vb ~ {v^ - vT)/V2,
mVa K, eselv2/ 4
< £ >
(60)
2
mv,t K, e v / < £ >
2
vc ~ e ve + ( ^ + vT)/\f2,
mVv w e 4 v 2 /e£ < E >
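The composition of the heaviest light state can be checked with a short sketch. It assumes that a single right-handed neutrino dominates the see-saw, so that the light mass matrix is approximately of rank one, $M_{Light} \approx d\, d^T/M_R$, with the coupling vector $d \propto (\epsilon^4, \epsilon^2, \epsilon^2)$ in the $(\nu_e, \nu_\mu, \nu_\tau)$ basis quoted in the text; the value $\epsilon = 0.05$ is illustrative.

```python
import math


def dominant_direction(eps):
    """Unit eigenvector of the rank-one matrix d d^T with d = (eps^4, eps^2, eps^2)."""
    couplings = [eps ** 4, eps ** 2, eps ** 2]
    norm = math.sqrt(sum(c * c for c in couplings))
    return [c / norm for c in couplings]


if __name__ == "__main__":
    nu_e, nu_mu, nu_tau = dominant_direction(0.05)
    # The mu and tau components are equal and close to 1/sqrt(2);
    # the electron component is of order eps^2.
    print(nu_e, nu_mu, nu_tau)
```

Since a rank-one matrix has its only non-zero eigenvector along $d$, equal couplings to the second and third families translate directly into near maximal mixing.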
The contribution to mixing in the $e\mu$ sector comes mainly from the charged lepton sector, so finally one finds
/^E
sin^*y|,
+
0(e2),
6m%~ e\4 / < E > 2
6m2M « e V / ^ < £ > 2
(61)
            Q    u    d    L    e    H2   H1
U(1)_H      0    0    0    0    0    1    -1
U(1)_XX     0    0    1    1    0    0    0
U(1)_X      1    1    0    0    1    0    0

Table 4: Anomaly-free U(1)_FI symmetries.
One sees that the mixing following from eq(59) can explain the atmospheric neutrino oscillation with near maximal mixing angle and solar neutrino oscillation via the small angle MSW solution.
4 Anomaly Cancellation and the prediction of the weak mixing angle
To complete the discussion we turn to a consideration of the anomaly structure of the model based on an Abelian family symmetry. Although the choice of charges for the quarks and leptons given above has no $SU(3)^2 U(1)_{FD}$, $SU(2)^2 U(1)_{FD}$ or $U(1)_Y^2 U(1)_{FD}$ anomalies, it is clear that the Higgs charges are not anomaly free. However, in theories derived from a string theory there is a significant new possibility, for a non-vanishing anomaly associated with a new U(1) gauge factor can be cancelled by the Green-Schwarz (GS) anomaly cancellation mechanism [36, 37]. In the 4-D version of the GS mechanism one cancels the anomalies of a single U(1) by an appropriate shift of the axion present in the dilaton multiplet of four-dimensional strings. This happens because such an axion has a direct coupling to $F\tilde F$. For the GS mechanism to be possible, the coefficients $A_i$, $i = 3,2,1$, of the mixed anomalies of the U(1) with SU(3), SU(2) and $U(1)_Y$ have to be in the ratio $A_3 : A_2 : A_1 = k_3 : k_2 : k_1$. Here $k_i$ are the Kac-Moody levels of the corresponding gauge factors and they determine the boundary condition of the gauge couplings at the string scale by the well-known equation $g_3^2 k_3 = g_2^2 k_2 = g_1^2 k_1$. The usual (e.g. GUT) canonical values for these normalization factors (corresponding to the successful result $\sin^2\theta_W = 3/8$) yield $k_3 : k_2 : k_1 = 1 : 1 : 5/3$ and hence the GS mechanism can only work in this case if the mixed anomalies of the U(1) with the SM gauge factors are in the ratio $A_3 : A_2 : A_1 = 1 : 1 : 5/3$ [36]. One can easily convince oneself that there are only two U(1) symmetries with anomalies consistent with this ratio of gauge coupling constants, namely $U(1)_X$ and $U(1)_{XX}$ given in Table 4. Thus, in a supersymmetric SM coming from a string the most general family-independent anomaly-free U(1) consistent with canonical gauge coupling unification is given by:
$$U(1)_{FI} = z\,U(1)_H + x\,U(1)_X + y\,U(1)_{XX} \qquad (62)$$

          Q              u              d              L            e            H2               H1
U(1)      $\alpha_i + x$    $\alpha_i + x$    $\alpha_i + y$    $a_i + y$    $a_i + x$    $z - 2\alpha_1$    $-z + w\alpha_1$

Table 5: Anomaly-free U(1) symmetries.
As the $U(1)_{FI}$ is family blind one may use any combination of the $U(1)_{FD}$ and the $U(1)_{FI}$ currents while still maintaining the structure for the fermion masses discussed above. The full charges of this U(1) factor may now be determined using Tables 3 and 4 and give the charges shown in Table 5. One can easily check that for $z = -2x$ one gets the results of eq(27) for the u-quark mass matrix. If one further has $3x + y = -4\alpha_1$ (and $w = -2$) one gets the results of eq(28) for the d-quark masses. Note, however, that the choice of the flavour-independent component allows for further possibilities for the d-quark matrices. Thus we conclude that the generic problem raised by anomaly cancellation may naturally be solved in the context of string based models. It is important to recall that the U(1)s whose anomalies are cancelled through a GS mechanism are necessarily spontaneously broken not far below the string scale. The reason for this is that the piece in the Lagrangian cancelling the anomalies has a supersymmetric counterpart which is a sort of field-dependent Fayet-Iliopoulos term for the U(1). This term forces U(1) symmetry breaking in a natural way at a scale of order $M_{string}/(\sqrt{192}\,\pi)$ [36, 37]. Thus the present scheme also explains why the extra U(1) symmetry required to generate the fermion mass patterns does not survive down to low energies.

So far we have assumed that $\sin^2\theta_W = 3/8$ at the string unification scale. In fact an acceptable pattern of fermion masses actually requires this value! To see this let us compute the mixed anomalies $A_i$ for the $U(1)_{FD}$ symmetry of Table 3 with SU(3), SU(2) and $U(1)_Y$. One finds respectively:
$$A_3 = 2\sum_{i=1}^{3}\alpha_i, \qquad A_2 = \frac{3}{2}\sum_{i=1}^{3}\alpha_i + \frac{1}{2}\sum_{i=1}^{3}a_i + \frac{\alpha_1}{2}(w-2), \qquad A_1 = \frac{11}{6}\sum_{i=1}^{3}\alpha_i + \frac{3}{2}\sum_{i=1}^{3}a_i + \frac{\alpha_1}{2}(w-2) \qquad (63)$$
where, as we are working with the full U(1) charges, we no longer have $\alpha_3 = -(\alpha_1 + \alpha_2)$ and $a_3 = -(a_1 + a_2)$ as the $U(1)_{FI}$ piece adds a family independent term to $\alpha_i$ and also to $a_i$. However, to maintain the result $m_b = m_\tau$, this term is the same for $\alpha_i$ and for $a_i$ and so
$$\sum_{i=1}^{3}\alpha_i = \sum_{i=1}^{3}a_i \qquad (64)$$
Now, in order that $A_3 = A_2$ so that the SU(3) and SU(2) couplings are unified, eq(63) gives
$$w = 2 \quad \mathrm{or} \quad \alpha_1 = 0 \qquad (65)$$
and
$$A_3 : A_2 : A_1 = 1 : 1 : \frac{5}{3} \qquad (66)$$

U(1)_F            Q     u     d     L      e
3rd generation    0     0     0     0      0
2nd generation    1     1     1     -1/2   -1/2
1st generation    -4    -4    -4    -5/2   -5/2

H1 = 0,  H2 = 0,  $\theta = 1$,  $\bar\theta = -1$

Table 6: Anomaly-free U(1) gauge symmetry giving rise to the textures in eqs(A,B,C)
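The conditions (65) and (66) can be verified by direct evaluation. The sketch below assumes the standard hypercharge normalisation ($Y = 1/6$ for the quark doublet), Higgs charges $-2\alpha_1$ and $w\alpha_1$, and the resulting coefficients $A_3 = 2\sum\alpha_i$, $A_2 = \frac{3}{2}\sum\alpha_i + \frac{1}{2}\sum a_i + \frac{\alpha_1}{2}(w-2)$, $A_1 = \frac{11}{6}\sum\alpha_i + \frac{3}{2}\sum a_i + \frac{\alpha_1}{2}(w-2)$; the function name is illustrative and the first list element is taken to be $\alpha_1$.

```python
from fractions import Fraction as F


def mixed_anomalies(alpha, a, w):
    """Mixed anomaly coefficients (A3, A2, A1) for quark charges alpha and lepton charges a."""
    sum_q, sum_l = sum(alpha), sum(a)
    # Higgs doublets carry charges -2*alpha_1 and w*alpha_1 (assumed).
    higgs = alpha[0] * (w - 2)
    a3 = 2 * sum_q
    a2 = F(3, 2) * sum_q + F(1, 2) * sum_l + F(1, 2) * higgs
    a1 = F(11, 6) * sum_q + F(3, 2) * sum_l + F(1, 2) * higgs
    return a3, a2, a1


if __name__ == "__main__":
    # Charges with alpha_1 = 0 and equal quark and lepton sums, any w.
    quarks = [F(0), F(1), F(-4)]
    leptons = [F(0), F(-1, 2), F(-5, 2)]
    a3, a2, a1 = mixed_anomalies(quarks, leptons, w=F(-2))
    print(a3, a2, a1, a1 / a3)
```

With $\alpha_1 = 0$ and equal charge sums the coefficients come out in the ratio $1 : 1 : 5/3$ for any $w$, while a non-zero $\alpha_1$ with $w \neq 2$ spoils $A_3 = A_2$.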
The second equation, requiring $\sin^2\theta_W = 3/8$, may be seen as a consequence of an acceptable mass matrix structure. The first condition, eq(65), requires $\alpha_1 = 0$ since $w = 2$ is no good as it leads to very bad results for the d-quark masses. Thus we see that a simple $U(1)_{FD}$ extension of the standard model generates much of the quark and lepton mass matrix structure through the constraints of anomaly cancellation, but that the full U(1) extension including the Higgs doublets needed for the MSSM is only anomaly free in the context of string theory via a Green-Schwarz term. The charges of each individual particle with respect to this anomaly-free U(1) are shown in Table 6. The simplicity of the assignments is remarkable. It is also worth emphasising that this U(1) symmetry may be made anomaly free through the GS mechanism if and only if the normalization of the coupling constants is the canonical one, $g_3^2 = g_2^2 = \frac{5}{3} g_1^2$, yielding the successful prediction $\sin^2\theta_W = 3/8$. Thus the present scheme not only predicts a successful pattern of fermion masses and mixings but also predicts $\sin^2\theta_W = 3/8$ even without any grand unification group.

To summarise, we have found it very straightforward to construct an Abelian family symmetry capable of generating both the texture zeros and the required hierarchical structure for the remaining mass matrix elements. For the simple Abelian family symmetry discussed above the mass matrix is given by
$$\frac{M_{u,d}}{m_{t,b}} = \begin{pmatrix} 0 & c\,\epsilon_{u,d}^{3} & 0 \\ c'\,\epsilon_{u,d}^{3} & b\,\epsilon_{u,d}^{2} & a\,\epsilon_{u,d}^{2} \\ 0 & a'\,\epsilon_{u,d} & 1 \end{pmatrix} \qquad (67)$$
where the non-zero entries have been expanded in a power series in the expansion parameters $\epsilon_u \approx \epsilon_d^2 \approx 0.04$. The quantities $a$, $b$ and $c$ are constants of order 1. Essentially all attempts to explain the pattern of light quark and lepton masses and mixing angles rely on some spontaneously broken family symmetry of this type to organise the magnitude of the mass matrix elements in the manner given in eq(67), most involving an Abelian family symmetry [38, 13, 14]. In string theories such symmetries abound and have been used to generate fermion mass structure [39]. Such a symmetry can generate an hierarchy of masses and mixing angles in powers of $\langle\theta\rangle$ but does not determine the coefficients $a$, $b$ and $c$ of order one which are needed if one is to obtain quantitative predictions for masses.
5 Non-Abelian Symmetry
So far our discussion has been entirely in the context of Abelian family symmetries. As we have seen, while such symmetries provide a promising source for an hierarchy, they leave the Yukawa couplings undetermined. One possibility is that there are new non-Abelian symmetries which will give relations between Yukawa couplings. These symmetries may be intra-generational (GUTs) or inter-generational (family) symmetries.
5.1
Grand Unification
While an Abelian family symmetry provides a promising origin for a hierarchical pattern of fermion masses, in order to go further it is necessary to specify the charges of the quarks, charged leptons and neutrinos. As we discussed in the last section, it is straightforward to fit all the observed masses and mixing angles by the choice of the $U(1)$ charges not constrained by the Standard Model gauge symmetry. However, the structure of the Standard Model is suggestive of an underlying unification which may relate quark and lepton multiplets. The success of the unification of the gauge couplings also supports this picture. Thus it is of interest to consider whether the charges needed to obtain realistic quark mass structures are consistent with the constraints on an Abelian family symmetry that result from some underlying grand unified gauge symmetry.
5.1.1
SU(5)
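The prototype Grand Unified Theory is based on the group $SU(5)$, a rank 4 group with just enough neutral generators to accommodate those of the rank 4 Standard Model [40]. In $SU(5)$ the states of a single family are accommodated in just 2 representations, a $\bar{5}$ and a 10. Thus the representation content of the Standard Model is simplified. The assignments to the $\bar{5}$ and 10 are shown in eqns (68) and (69), with the block structure following $SU(5) \supset SU(3) \otimes SU(2)$:

$$\bar{5} = \begin{pmatrix} d^c_1 \\ d^c_2 \\ d^c_3 \\ e^- \\ -\nu_e \end{pmatrix}_L \qquad (68)$$

$$10 = \frac{1}{\sqrt{2}} \begin{pmatrix} 0 & u^c_3 & -u^c_2 & -u^1 & -d^1 \\ -u^c_3 & 0 & u^c_1 & -u^2 & -d^2 \\ u^c_2 & -u^c_1 & 0 & -u^3 & -d^3 \\ u^1 & u^2 & u^3 & 0 & -e^+ \\ d^1 & d^2 & d^3 & e^+ & 0 \end{pmatrix}_L \qquad (69)$$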
where $c$ denotes the charge conjugate and the numerical indices are $SU(3)$ indices. It may be seen that the quarks and leptons belong to the same multiplet. Of course $SU(5)$ must be broken and it is important to consider the pattern of the symmetry breaking leading to the Standard Model. In the case of $SU(5)$, this is shown in eqn(70).

$$SU(5) \xrightarrow{\langle\Sigma\rangle} SU(3) \otimes SU(2) \otimes U(1) \xrightarrow{\langle H_5\rangle} SU(3) \otimes U(1) \qquad (70)$$
where one can see that the single 24-dimensional adjoint representation, $\Sigma$, leads to the breaking of $SU(5)$ immediately to the Standard Model $SU(3) \otimes SU(2) \otimes U(1)$, and subsequently the electroweak breaking is triggered by a doublet contained in a 5-dimensional Higgs representation, $H_5$. The most interesting prediction of $SU(5)$ follows because there is just a single gauge coupling. Thus the weak mixing angle, which is given in terms of the ratio of the two coupling constants associated with $SU(2) \otimes U(1)$, is determined. The prediction is

$$\sin^2\theta_W = \frac{\mathrm{Tr}(T_{3L}^2)}{\mathrm{Tr}(Q^2)} = \frac{3}{8} \qquad (71)$$

but this applies at the unification scale, $M_X$. As we shall discuss, including radiative corrections to the couplings to run them down to low scales, one finds that the 3/8 initial value leads to quite acceptable values at laboratory energies for the weak mixing angle, but only in a supersymmetric GUT.
3 I leave it to the reader to derive this form!
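The value 3/8 in eq(71) can be checked by summing over the states of one family. The following is a minimal sketch, assuming the standard weak isospin and electric charge assignments of the left-handed states in the $\bar{5}$ and the 10; the names `FIVE_BAR`, `TEN` and `sin2_theta_w` are illustrative and do not come from the text.

```python
from fractions import Fraction

F = Fraction

# (T3, Q) for each left-handed state of the SU(5) 5-bar:
# three colours of d^c (T3 = 0, Q = +1/3), the neutrino and the electron.
FIVE_BAR = [(F(0), F(1, 3))] * 3 + [(F(1, 2), F(0)), (F(-1, 2), F(-1))]

# (T3, Q) for each left-handed state of the 10:
# three colours each of u^c, u and d, plus the positron e^c.
TEN = (
    [(F(0), F(-2, 3))] * 3
    + [(F(1, 2), F(2, 3))] * 3
    + [(F(-1, 2), F(-1, 3))] * 3
    + [(F(0), F(1))]
)


def sin2_theta_w(states):
    """Return Tr(T3^2) / Tr(Q^2) summed over the given (T3, Q) pairs."""
    trace_t3_sq = sum(t3 * t3 for t3, _ in states)
    trace_q_sq = sum(q * q for _, q in states)
    return trace_t3_sq / trace_q_sq


print(sin2_theta_w(FIVE_BAR), sin2_theta_w(TEN))  # 3/8 3/8
```

The ratio is the same for each complete $SU(5)$ multiplet, which is why it does not matter whether the trace is taken over the $\bar{5}$, the 10, or a whole family.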
The Yukawa couplings of the theory are also restricted by $SU(5)$ in a way that gives predictions for fermion masses. This may be seen from the equation

$$\mathcal{L}_Y = h\,\bar\psi^{5}_{R}\,\chi^{10}_{L}\,H_5 \;\Rightarrow\; m_e = m_d = h\langle H_5\rangle \qquad (72)$$
The mass of the electron is equal to the mass of the down quark at the unification scale. The same is true for the heavier generations. For the third generation this is a good prediction because QCD radiative corrections, which only affect the b quark mass, increase the b mass at low energies and bring the predicted ratio of b to $\tau$ masses into excellent agreement with experiment. For the light generations equality of down quark and lepton masses at the GUT scale is inconsistent with the measured values, so some modification is needed. It is possible to generate such a modification, for example through the coupling

$$\mathcal{L}'_Y = h'\,\bar\psi^{5}_{R}\,H_{45}\,\chi^{10}_{L} \;\Rightarrow\; m_q = \tfrac{1}{3}\, m_e \qquad (73)$$
where $H_{45}$ is a 45 dimensional representation of Higgs fields. We will discuss shortly how a combination of 5 and 45 Higgs representations gives an excellent description of the first two generation masses too.
5.1.2
SO(10)
Although $SU(5)$ is the simplest example of a Grand Unified group, many others have been studied. Of these, probably the most promising is $SO(10)$ because an entire family plus the right-handed neutrino fits neatly in a single 16 dimensional representation [41]. In addition $SO(10)$ includes many other candidate groups, having as subgroups

$$SO(10) \supset \begin{cases} SU(5) \times U(1) \\ SO(6) \times SO(4) \\ SU(4) \times SU(2)_L \times SU(2)_R \end{cases}$$
The pattern of symmetry breaking determines which subgroup is relevant. Following the first breaking path we may decompose the $SO(10)$ representations according to their $SU(5)$ content. This gives for the lowest lying representations

$$\begin{aligned} 10 &= 5 + \bar{5} \\ 16 &= 10 + \bar{5} + 1 \\ 45 &= 24 + 10 + \overline{10} + 1 \\ 120 &= 5 + \bar{5} + 10 + \overline{10} + 45 + \overline{45} \\ 126 &= 1 + \bar{5} + 10 + \overline{15} + 45 + \overline{50} \end{aligned}$$

As one may see, the 16 neatly contains the $10 + \bar{5}$ needed to accommodate a family of quarks and leptons plus a singlet state that is identified with a right-handed neutrino component. The quark and lepton masses are given by two
trilinear couplings of the two 16s either to the 10 dimensional Higgs or to the 120 dimensional Higgs. According to their $SU(5)$ content these have the structure

$$\begin{aligned} 16.16.10 &= (10 + \bar{5} + 1)(10 + \bar{5} + 1)(5 + \bar{5}) \supset 10.\bar{5}.\bar{5} + 10.10.5 \\ 16.16.120 &= (10 + \bar{5} + 1)(10 + \bar{5} + 1)(5 + \bar{5} + 10 + \overline{10} + 45 + \overline{45}) \supset \bar{5}.10.\overline{45} \end{aligned}$$

From this one sees that one has the same couplings as appeared in the $SU(5)$ case and thus the same possible mass ratios of the down quarks and leptons. However there are now implications for the up quark and Dirac neutrino masses. For the case that the Higgs is a 10 dimensional representation we have $m_u = m_\nu$, and for the case that only the 45 in the 120 dimensional representation acquires a vev one has only off diagonal mass terms with $m_{e_a e_b} = 3 m_{d_a d_b}$, $m_{u_a u_b} = 0$, $m_{\nu_a \nu_b} \neq 0$. For the case of the 126, diagonal couplings are allowed and along a particular vacuum direction one may obtain mass relations of the form me = 3m
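The $SU(5)$ content quoted in this subsection can be cross-checked by dimension counting. The sketch below is only a consistency check: it cannot distinguish a representation from its conjugate, and the dictionary name `SU5_CONTENT` and the helper `dimensions_consistent` are illustrative. The last line uses the standard result that the product of two 16s contains the 10, the 120 and the 126, which is why these are the Higgs representations that can couple to fermion bilinears.

```python
# Dimensions of the SU(5) multiplets contained in each SO(10) representation.
# Conjugate representations have the same dimension, so bars are dropped here.
SU5_CONTENT = {
    10: (5, 5),
    16: (10, 5, 1),
    45: (24, 10, 10, 1),
    120: (5, 5, 10, 10, 45, 45),
}


def dimensions_consistent(content):
    """Return True if every decomposition adds up to the parent dimension."""
    return all(sum(parts) == dim for dim, parts in content.items())


print(dimensions_consistent(SU5_CONTENT))  # True

# The product 16 x 16 decomposes into 10 + 120 + 126.
print(16 * 16 == 10 + 120 + 126)  # True
```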
5.2
SU(5) × U(1) family
We turn to the possibility that the family symmetry commutes with an $SU(5)$ GUT [42]. There are only three $U(1)$ family charges needed for each family. These are given by

$$Q_{(q,u^c,e^c)_i} = Q_i^{10}, \qquad Q_{(l,d^c)_i} = Q_i^{\bar{5}}, \qquad Q_{(\nu_R)_i} = Q_i^{\nu_R} \qquad (74)$$

From the above it immediately follows that: (i) the up-quark mass matrix is symmetric; (ii) the charged lepton mass matrix is the transpose of the down quark mass matrix. The expansion parameters in the various sectors can be different (depending on whether the non-renormalisable contributions are due to fermion or to Higgs mixing, or to a combination of the two). However, as discussed in Section 3.3, a single expansion parameter describing $H_1$ mixing determines the down quark and charged lepton mixing, and similarly a single expansion parameter describing $H_2$ mixing determines the up quark and Dirac neutrino mixing. The fact that the right-handed neutrino charges are unconstrained means the neutrino mass spectrum is not restricted but, again as discussed in Section 3.3, the mixing angle in the $\mu\tau$ sector is insensitive to these charges and is determined primarily by $Q^{\bar{5}}$. At first sight this charge structure seems to offer an immediate explanation for the difference between the large mixing angle observed in atmospheric neutrino
mixing and the small quark mixing angles. This is because the former is determined by $Q^{\bar{5}}$ while the corresponding quark mixing matrix element, $V_{cb}$, is determined by $Q^{10}$. However the main difficulty in using this freedom to describe both mixings arises from the associated correlations between the eigenvalues of the charged lepton and the down quark mass matrices due to structure (ii) above. Indeed, if the eigenvalues of the down mass matrix (with expansion parameter $\epsilon$) are given by a sequence $1, \epsilon, \epsilon^k$, the eigenvalues for the leptons (with expansion parameter $\epsilon$) are $1, \epsilon, \epsilon^k$. The down quark masses are well described by the choice $\epsilon \approx \bar\epsilon^2 \approx 0.04$ and $k = 2$, while for the leptons the hierarchies are well described by $\epsilon \approx \bar\epsilon \approx 0.2$ and $k = 5$. This is clearly inconsistent with the pattern coming from the family symmetry, which requires the same $k$ in the down quark and lepton sectors. One way to reconcile the two forms for the mass matrix, originally advocated by Georgi and Jarlskog [41], is to have different Yukawa couplings in the quark and lepton sectors. These couplings are determined by the underlying $SU(5)$ gauge group. If the mass comes from the coupling to a 5 of Higgs then $m_{d_i} = m_{l_i}$, while if the mass comes from the coupling to a 45 of Higgs then $m_{l_i} = 3 m_{d_i}$. The observed hierarchy for the lepton masses is well described by the eigenvalues $1, 3\bar\epsilon^2, \bar\epsilon^4/3$. Georgi and Jarlskog achieved this by restricting the mass matrices by family symmetries to have the form

$$M_d = \begin{pmatrix} 0 & a'\langle H_5\rangle & 0 \\ a\langle H_5\rangle & c\langle H_{45}\rangle & 0 \\ 0 & 0 & b\langle H_5\rangle \end{pmatrix}, \qquad M_e = \begin{pmatrix} 0 & a'\langle H_5\rangle & 0 \\ a\langle H_5\rangle & 3c\langle H_{45}\rangle & 0 \\ 0 & 0 & b\langle H_5\rangle \end{pmatrix} \qquad (75)$$
The equality of the (3,3) entries gives $m_b = m_\tau$ at the GUT scale. As discussed, inclusion of radiative corrections leads to an excellent prediction for the masses as measured at a low energy scale. For $a\langle H_5\rangle, a'\langle H_5\rangle \ll c\langle H_{45}\rangle \ll b\langle H_5\rangle$, the second generation gets its mass predominantly from the 45 and so $m_\mu = 3 m_s$ at the GUT scale. Including radiative corrections, the QCD effects make the strange quark heavier at low energy scales and give an excellent prediction for the ratio of these masses at laboratory scales. Finally, what about the first generation? With the form of eq(75), the texture zeros in the (1,1), (1,3) and (3,1) positions imply $\mathrm{Det}(M_d) = \mathrm{Det}(M_e)$. At the GUT scale this implies $m_e/m_d = m_s m_b/(m_\mu m_\tau) \approx 1/3$. Again radiative corrections increase the down quark mass relative to the lepton mass, giving an excellent prediction for the light mass ratio. This example has illustrated how a GUT can give an explanation for the relative charged lepton and down quark masses. We can now return to the question of how the $U(1)$ family symmetry may generate the inter-family structure. In particular such a symmetry should explain the ordering $a\langle H_5\rangle, a'\langle H_5\rangle \ll c\langle H_{45}\rangle \ll b\langle H_5\rangle$. Moreover it should address the question of the large mixing angle in the $\nu_\mu\nu_\tau$ sector. It is straightforward to modify Georgi and Jarlskog's scheme to give the promising Abelian family symmetry discussed above. The symmetric charge assignment follows if $Q_i^{10} = Q_i^{\bar{5}} = Q_i^{\nu_R}$. If we choose the charge assignments of the three families as in Section 2 the mass matrix structure discussed there follows$^4$. The combination of the GUT and the family symmetry has answered several questions left unanswered by the original Abelian family symmetry alone. In particular the GUT requires the same family charge for the down quark and leptons, and this, together with the GUT relations between down quark and charged lepton Yukawa couplings, has given excellent predictions for the charged lepton masses. It also required that the up mass matrix be symmetric up to Yukawa couplings because the up quark charges of a single family are related by $SU(5)$. However it still leaves some questions. For example $SU(5)$ does not fix all the charges of a single family and we had to assume the equality of $Q_i^{10}$, $Q_i^{\bar{5}}$ and $Q_i^{\nu_R}$ in building the mass matrix model. This suggests looking at larger gauge groups, the most promising being $SO(10)$.
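The Georgi-Jarlskog mass relations discussed in this section can be illustrated numerically. The sketch below assumes the texture of eq(75), with a factor 3 in the (2,2) entry of the lepton matrix, and uses hypothetical GUT-scale entries chosen only to respect the ordering $a\langle H_5\rangle, a'\langle H_5\rangle \ll c\langle H_{45}\rangle \ll b\langle H_5\rangle$. The function name `block_masses` and the numerical values are illustrative, and no radiative corrections are included.

```python
import math


def block_masses(a, a_prime, c22):
    """Return (light, heavy) singular values of the block [[0, a'], [a, c22]]."""
    # Sum of the squared singular values is the sum of squared entries.
    sum_sq = a * a + a_prime * a_prime + c22 * c22
    # Product of the singular values is |det| = |a * a'|.
    product = abs(a * a_prime)
    heavy_sq = 0.5 * (sum_sq + math.sqrt(sum_sq * sum_sq - 4.0 * product * product))
    heavy = math.sqrt(heavy_sq)
    return product / heavy, heavy


# Hypothetical entries in units of b<H5>.
A, A_PRIME, C, B = 0.002, 0.002, 0.03, 1.0

m_d, m_s = block_masses(A, A_PRIME, C)        # down quarks: (2,2) entry c<H45>
m_e, m_mu = block_masses(A, A_PRIME, 3 * C)   # leptons: (2,2) entry 3c<H45>
m_b = m_tau = B                               # equal (3,3) entries

print("m_tau/m_b =", m_tau / m_b)
print("m_mu/m_s  =", m_mu / m_s)
print("m_e/m_d   =", m_e / m_d)
```

With these inputs the second generation ratio comes out close to 3 and the first generation ratio close to 1/3, while the product $m_e m_\mu$ equals $m_d m_s$ because the two determinants coincide.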
5.3
SO(10) ⊗ U(1) family
Since a complete family belongs to a single representation of $SO(10)$ it is necessary for the Abelian charges of the members of a single family to be the same. As a result the mass matrices must be symmetric, as was assumed in the examples considered in detail above. Of course $SO(10)$ must be broken to $SU(3) \times SU(2) \times U(1)$. The example analysed above corresponds to the case that the $SO(10)$ breaking appears in the masses of the heavy states involved in the Froggatt-Nielsen chain. Thus, after $SO(10)$ breaking there is no reason why the mass of the heavy $H_1$ states responsible for up-quark and neutrino mass generation should be the same as that of the heavy $H_2$ states responsible for down quark and charged lepton masses. However the $SO(10)$ relations between Yukawa couplings will persist up to corrections involving higher dimension terms, which are likely to be small. Thus one may implement the promising Abelian family symmetry just as we discussed for the case of $SU(5)$ and obtain the good relations between down quark and charged lepton masses. In this case, however, the equality of quark and lepton Abelian charges is predicted, so only the three charges of the three families need to be specified.
4 This is provided $\langle H_{45}\rangle = \langle H_5\rangle$. The question of vacuum alignment in a GUT is quite involved and I do not have the time to discuss it here.
Figure 4: The gluino box diagrams for $\Delta S = 2$ transitions (panels a and b, including the crossed diagram).
6
Non-Abelian family symmetry
As discussed above, the kinetic terms and gauge interactions of the Standard Model have a very large family symmetry group, namely $U(3)^5$, where the $U(3)$ factors act on the left- and right-handed multiplets of quarks and leptons respectively. The group is extended to $U(3)^6$ if three right-handed neutrinos are added. Any family group should be contained in $U(3)^6$ but this leaves very many possibilities for non-Abelian family symmetries. If there is a GUT symmetry the possible family group will be smaller; for example if the GUT is $SO(10)$ the maximal group is $U(3)$. There is a strong motivation for a non-Abelian family symmetry in supersymmetric theories because it can explain why the squarks and sleptons are nearly degenerate, as is required to suppress flavour changing neutral currents. The constraints are strongest for the first two generations. For example graphs such as those shown in Fig 4 lead to the following constraints [43] < 6 . 1 ( r 3 (Tev ^ ) (' v sin 0
C
^
< I P - I T m ^ \2fsinfl P \2 ^ 1U \Tev) \sinOc)
(76)
where the tilde refers to the relative mixing angle between the Standard Model state and its SUSY partner. The simplest way to account for this near mass degeneracy is to consider the diagonal $SU(3)$ subgroup of $U(3)^6$. In the symmetry limit the symmetry will guarantee the exact degeneracy of the squarks and sleptons. However the Yukawa interactions (and the related fermion masses) do not respect this symmetry and so the symmetry must be broken. Indeed our discussion of the
Abelian family symmetry is consistent with the strong breaking of this symmetry to $U(1)_{\mathrm{Family}}$. The top Yukawa coupling gives the strongest breaking and so it is likely that at least some of the scalars of the third generation have masses very different from their first and second family partners. Luckily the flavour changing bounds allow such breaking provided the squark mixing matrices have elements involving the third generation no greater than those of the CKM matrix. These considerations have led several authors [44] to consider the $U(2)$ subgroup of the $U(3)^6$ symmetry as an approximate family group. This group acts on the first two generations and has a symmetry breaking pattern $U(2) \xrightarrow{\epsilon} U(1) \xrightarrow{\epsilon'} 0$, so that the family mass hierarchies $m_3/m_2$ and $m_2/m_1$ can be explained in terms of the two symmetry breaking parameters $\epsilon$ and $\epsilon'$. The symmetry breaking is generated by fields $\phi^a$, $S^{ab}$ and $A^{ab}$, where $S$ and $A$ are symmetric and antisymmetric tensors and the upper indices denote a $U(1)$ charge opposite to that of the fermions $\psi = \psi_a \oplus \psi_3$, which transform as $2 \oplus 1$ under $U(2)$. In leading order fermion masses arise from $\langle\phi^2\rangle/M = \epsilon_\phi$ and $\langle S^{22}\rangle/M = \epsilon_S$, giving mass matrices of the form

$$\begin{pmatrix} 0 & 0 & 0 \\ 0 & \epsilon_S & \epsilon_\phi \\ 0 & \epsilon_\phi & 1 \end{pmatrix}$$

with non-zero entries only in the heavy $2 \times 2$ space. It has been shown that it is possible to arrange $\epsilon_S \approx \epsilon_\phi$ by vacuum alignment, as is necessary to describe the magnitude of $V_{cb}$ (cf. Section 2.4). The texture zeros in the (1,3) and (3,1) positions follow because these are given by $\langle\phi^1\rangle/M$, and this vanishes automatically because the $\phi^a$ vev can always be rotated into the $a = 2$ direction.
While this gives the texture zero structure desired it does not give a fermion mass hierarchy which is larger in the U sector than in the D/E sectors. However it was shown [44] that if the symmetry is combined with an $SU(5)$ symmetry the up matrix elements actually vanish at order $\epsilon$ and $\epsilon'$. It is thus possible to construct a viable model of fermion masses using the $U(2)$ symmetry. An interesting alternative model realising the alternative texture 3 of Table 2 is presented in [45]. It is clear that an underlying non-Abelian family symmetry offers interesting new possibilities for generating a phenomenologically viable theory of fermion
masses including the precise predictions which follow from texture zeros. However the example discussed above does not use the non-Abelian structure to relate couplings involving different families. In particular the near equality of the (2,2) and (2,3) matrix elements that is required to generate $V_{cb}$ suggests there is also an underlying non-Abelian symmetry relating the second and third generations. As we observed above, the same structure in the neutrino Dirac matrix readily leads to maximal mixing, suggesting the symmetry should apply to leptons too. For these reasons I think it interesting to construct a simple non-Abelian symmetry acting on all three generations which achieves this.
6.1
A SU(3) × U(1) family model
The simplest starting point is to assign the three generations of quarks and leptons to be triplets under an $SU(3)$ family group:

$$Q_i,\; u^c_i,\; d^c_i,\; l_i,\; e^c_i,\; \nu^c_{Ri} \in 3, \qquad i = 1,2,3$$

We will take the Higgs $H_{1,2}$ to be $SU(3)$ singlets. In addition it is necessary to include Standard Model singlet fields $\theta_1^i$, $\theta_2^i$, $i = 1,2,3$, which transform as $\bar{3}$ under the family group, and $\bar\theta_{1i}$, $\bar\theta_{2i}$, which transform as 3 under the family group. These fields acquire vevs and spontaneously break the symmetry; the conjugate fields $\bar\theta$ are included to allow these vevs to develop along a D-flat direction. The additional $U(1)$ family symmetry will be necessary to restrict the allowed couplings in the theory. Finally we include a Standard Model singlet and $SU(3)$ family singlet field, $S$.
6.1.1
Vacuum alignment
The family group has a breaking pattern given by

$$SU(3) \xrightarrow{\langle\theta_1\rangle} SU(2) \xrightarrow{\langle\theta_2\rangle} \text{Nothing}$$

Vacuum expectation values (vevs) for $\theta_1$ and $\theta_2$ may be readily driven by negative soft masses squared or by other terms in the potential$^5$. Clearly one may always perform an $SU(3)$ rotation so that $\langle\theta_1^i\rangle = V_1\,(0, 0, 1)$. At the second stage of breaking one may always use the residual $SU(2)$ symmetry to rotate the vev of the second field to have the form $\langle\theta_2^i\rangle = V_2\,(0, \cos\phi, \sin\phi)$.
5 I assume that the vevs develop along D-flat directions $\langle\theta_i\rangle = \langle\bar\theta_i\rangle$ but that the family symmetry prevents the $\bar\theta$ from coupling to quarks and leptons. We shall show that it is easy to do this via an additional $U(1)$ family charge.
Since these fields both transform as $\bar{3}$ there is no cubic term in the superpotential we can construct which involves these fields and thus there is no term in the potential which aligns the vevs of these fields. For this reason one may expect cos
The mass matrices
Due to the family symmetry there are no trilinear mass terms for the quarks and leptons. Thus masses will only occur after breaking the $SU(3)$ family symmetry. For reasons that will soon be apparent we assume that the only fields charged under the additional $U(1)$ factor are $\theta_2^i$, $\bar\theta_{2i}$ and $S$, with charges 1, 1 and $-2$ respectively. Then D-flatness under $SU(3) \times U(1)$ requires $\langle\theta_2\rangle = \langle\bar\theta_2\rangle = \langle S\rangle$. With this charge assignment, the leading terms in the superpotential giving down quarks a mass are (suppressing coupling constants)
$$W_1 = Q_i\, d^c_j\, H_1\, \theta_1^i \theta_1^j + Q_i\, d^c_j\, H_1\, \theta_2^i \theta_2^j\, S$$
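The resulting mass matrix has the form

$$M_d \propto \begin{pmatrix} 0 & 0 & 0 \\ 0 & \epsilon^2 & \epsilon^2 \\ 0 & \epsilon^2 & 1 + \epsilon^2 \end{pmatrix}$$

where I have taken $\cos\phi \approx \sin\phi$. Similarly one obtains the same form for the up quark mass matrix but with expansion parameter $\epsilon^2 = V_2^2/M_2^2$. In this case we also have the relation

$$\frac{m_b}{m_t} = \frac{\langle H_1\rangle}{\langle H_2\rangle}\,\frac{M_2^2}{M_1^2}$$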
so for a large difference between $M_1$ and $M_2$ there will be a large difference between the bottom and top masses even for $\tan\beta = \langle H_1\rangle/\langle H_2\rangle$ of $O(1)$. Note that the near equality of the (2,2) and (2,3) matrix elements results from the underlying $SU(3)$ symmetry. One can go further and use this symmetry to predict the relative magnitude of these two terms by vacuum alignment constrained by the non-Abelian symmetry in the manner discussed in [32], but I will not develop this further here. Finally the remaining elements of the mass matrix are generated by the next terms of higher order allowed in the superpotential$^6$
6 It is necessary to forbid the lower dimension terms £,J'*Qj0i,2 j ^tj^f' and this requires at least an additional discrete $Z_N$ symmetry under which only the $\theta$, $\bar\theta$ fields transform. If 01 2 transforms nontrivially while 6182 81 is invariant the unwanted term is not present. One may readily check that all other unwanted terms can similarly be eliminated.
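giving

$$M \propto \begin{pmatrix} \cdots & \cdots & \cdots \\ \cdots & \cdots & \cdots \\ -\epsilon^2 a & \epsilon^2 & 1 + \epsilon^2 \end{pmatrix}$$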
where $a = V_1/M_1$. The up mass matrix has the same form but different expansion parameters. As discussed in Section 2.3, this is due to the fact that the heavy intermediate states involved in the Froggatt-Nielsen mechanism may be different in the up and the down quark sectors. Thus one readily obtains the form given in Section 1.4.1 needed to describe the observed up and down quark mass matrices. This model may readily be extended to the lepton sector in a similar way to the case of the Abelian symmetry discussed in Section 3. Since the charge assignments are symmetric there is no difficulty in combining the family symmetry with $SU(5)$ or $SO(10)$ and thus obtaining good relations between down quark and lepton masses in the manner discussed above. Similarly, by suitable choice of the vacuum alignment of the vevs responsible for the Majorana neutrino mass matrix one may also implement the scheme discussed above generating large neutrino mixing in the (2,3) sector. Although the model has quite a complicated symmetry breaking structure there are some advantages to the non-Abelian family symmetry. In contrast to fermion masses, degenerate scalar masses are allowed in the symmetry limit. As a result the symmetry breaking contribution is relatively small, as is required to solve the SUSY flavour changing neutral current problem. Furthermore, following from the non-Abelian symmetry the expectation is that the (2,2) and (2,3) matrix elements are of the same order and their relative magnitude can, in variants of the scheme, be predicted. Finally it is not necessary to choose very particular charges for the additional family symmetries, in contrast to the Abelian example presented above. However it is clear that the non-Abelian symmetry does not answer all questions. In particular the magnitude of the hierarchy between families or between the up and the down sectors is not determined by the symmetry alone, being determined by the vevs $\langle H_{1,2}\rangle$, $\langle\theta_{1,2}\rangle$ and heavy masses $M_i$.
Although we have considered in detail only one specific model these comments apply more generally to non-Abelian family symmetries; the interested reader is referred to the papers in reference [44] for further examples.
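The way the vacuum alignment of Section 6.1.1 feeds into the mass matrix can be illustrated with a small numerical sketch. It assumes, purely for illustration, that the leading mass matrix is a sum of outer products of the two aligned vevs, with the second contribution suppressed by $\epsilon^2$ and all order-one coefficients set to one; the function names `outer` and `leading_mass_matrix` and the input values are hypothetical.

```python
import math


def outer(u, v):
    """Outer product of two 3-component vectors as a nested list."""
    return [[ui * vj for vj in v] for ui in u]


def leading_mass_matrix(eps, phi):
    """Sketch of the leading texture built from the two aligned vevs.

    v1 points along (0, 0, 1) and v2 along (0, cos(phi), sin(phi));
    the v2 contribution is suppressed by eps**2 relative to the v1 one.
    """
    v1 = (0.0, 0.0, 1.0)
    v2 = (0.0, math.cos(phi), math.sin(phi))
    first = outer(v1, v1)
    second = outer(v2, v2)
    return [[first[i][j] + eps**2 * second[i][j] for j in range(3)]
            for i in range(3)]


# Hypothetical inputs: eps = 0.2 and cos(phi) = sin(phi).
M = leading_mass_matrix(eps=0.2, phi=math.pi / 4)
for row in M:
    print(["%.3f" % entry for entry in row])
```

In this sketch the first row and column vanish, because neither vev has a component in the first direction, and the (2,2) and (2,3) entries are equal when $\cos\phi = \sin\phi$. The remaining small entries would have to come from higher order terms, which are not modelled here.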
7
Summary and Conclusions
The pattern of fermion masses and mixings is not explained by the Standard Model and provides information about what lies beyond. Although the measured masses and mixing angles do not determine the full quark mass matrices, reasonable assumptions lead to a hierarchical form with some anomalously small elements (approximate texture zeros). It is interesting that such structure can readily be obtained if there is a broken Abelian family symmetry. Moreover
such family symmetries, when combined with relations following from Grand Unification, can readily describe the charged lepton sector too. At first sight the extension to neutrinos looks problematic because the mixing angle needed to describe atmospheric neutrino oscillation is near maximal, in contrast to the equivalent mixing angle in the quark sector, $V_{cb}$, which is very small. However there are several ways to explain the discrepancy. A very plausible mechanism follows naturally from the fact that neutrinos are special in that their right-handed components can have Majorana masses. The light neutrino spectrum is then determined by the see-saw mechanism. If the Majorana mass matrix has an anomalously small eigenvalue its contribution will be dominant and determine the properties of the heaviest of the light neutrino states, $\nu_3$. If the corresponding Majorana eigenstate has approximately equal coupling to the mu and tau neutrinos, $\nu_3$ will be approximately maximally mixed. This case is realised if the neutrino mass matrix has the same form as the quark mass matrix required to give a small $V_{cb}$. Thus, due to the see-saw mechanism, maximal neutrino mixing results because the equivalent quark mixing is small, quite counter to our naive expectation! The success of simple family symmetries has encouraged the search for more predictive family symmetries. Realistic models can readily be constructed using non-Abelian symmetries which preserve the success of the Abelian family symmetry and can more convincingly explain some of the fermion mass structures. In addition they may solve the flavour changing neutral current problem in supersymmetric theories. While promising, there are still many questions to be answered. There are many candidate models and they all require quite complex patterns of symmetry breaking and have many parameters.
One may hope that the more precise measurements of masses and mixing angles coming from B-factories will be able to distinguish between these models. One may also hope that advances in our understanding of the theory of everything will allow us to determine the couplings and symmetry breaking parameters and allow us to make a precise comparison between experiment and theory. Only then will the fermion masses really be able to shed light on the Physics Beyond the Standard Model.
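The see-saw argument summarised above can be made concrete with a small sketch. It assumes that a single right-handed neutrino of Majorana mass `m_right` dominates, so that the light mass matrix is proportional to the outer product of its Dirac couplings; the function names and the input numbers are hypothetical and chosen only for illustration.

```python
def seesaw_dominant(couplings, m_right):
    """Light neutrino mass matrix y_a * y_b / M from one dominant heavy state."""
    return [[ya * yb / m_right for yb in couplings] for ya in couplings]


def sin2_2theta(y_mu, y_tau):
    """sin^2(2 theta) for the mu-tau mixing of the state aligned with the couplings."""
    return 4.0 * y_mu**2 * y_tau**2 / (y_mu**2 + y_tau**2) ** 2


# Hypothetical couplings to (nu_e, nu_mu, nu_tau) and heavy mass, arbitrary units.
Y = (0.0, 1.0, 1.0)
M_RIGHT = 10.0

m_nu = seesaw_dominant(Y, M_RIGHT)

# The coupling vector is an eigenvector of m_nu with eigenvalue |Y|^2 / M_RIGHT.
eigenvalue = sum(y * y for y in Y) / M_RIGHT
image = [sum(m_nu[i][j] * Y[j] for j in range(3)) for i in range(3)]

print(eigenvalue, image)
print(sin2_2theta(Y[1], Y[2]))
```

The only massive light state is aligned with the coupling vector, so equal couplings to the mu and tau neutrinos give $\sin^2 2\theta = 1$, while a smaller Majorana mass gives a larger light mass.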
References
[1] C. Caso et al., Eur. Phys. J. C 3, 1 (1998); H. Leutwyler, Phys. Lett. B 378, 313 (1996) [hep-ph/9602366]; S. Narison, Nucl. Phys. Proc. Suppl. 86, 242 (2000) [hep-ph/9911454]; A. Ali and D. London, hep-ph/0002167.
[2] B.J. Bjorken, lectures presented at Oxford 1996, available on Oxford WWW home page; see also L. J. Hall and A. Rasin, Phys. Lett. B 315 (1993) 164.
[3] F. Wilczek and A. Zee, Phys. Lett. B 70, 418 (1977); H. Fritzsch, Phys. Lett. B 70, 436 (1977). For a recent review on quark masses and extensive references on textures see: H. Fritzsch and Z. Xing, Prog. Part. Nucl.
[4] R.G. Roberts, A. Romanino, G.G. Ross and L. Velasco-Sevilla, paper in preparation.
[5] R. Gatto, G. Sartori and M. Tonin, Phys. Lett. B 28, 128 (1968); N. Cabibbo and L. Maiani, Phys. Lett. B 28, 131 (1968); R. J. Oakes, Phys. Lett. B 29, 683 (1969).
[6] P. Ramond, R.G. Roberts and G.G. Ross, Nucl. Phys. B 406 (1993) 19.
[7] R. Barbieri, L. J. Hall and A. Romanino, Nucl. Phys. B 551, 93 (1999) [hep-ph/9812384].
[8] C. D. Froggatt and H. B. Nielsen, Nucl. Phys. B 147, 277 (1979).
[9] N. Arkani-Hamed, S. Dimopoulos and G. Dvali, Phys. Lett. B 429 (1998) 263; Phys. Rev. D 59 (1999) 086004; I. Antoniadis, N. Arkani-Hamed, S. Dimopoulos and G. Dvali, Phys. Lett. B 426 (1998) 257.
[10] L. Randall and R. Sundrum, Phys. Rev. Lett. 83 (1999) 3370; hep-th/9906064.
[11] J. Polchinski, "TASI lectures on D-branes," hep-th/9611050.
[12] N. Arkani-Hamed, L. Hall, D. Smith and N. Weiner, Phys. Rev. D 61 (2000) 116003 [hep-ph/9909326]; G. Dvali and M. Shifman, Phys. Lett. B 475 (2000) 295 [hep-ph/0001072]. For an alternative view not depending on family symmetries see: E. A. Mirabelli and M. Schmaltz, Phys. Rev. D 61 (2000) 113011 [hep-ph/9912265]; N. Arkani-Hamed and M. Schmaltz, Phys. Rev. D 61 (2000) 033005 [hep-ph/9903417].
[13] L. Ibanez and G. G. Ross, Phys. Lett. B 332, 100 (1994) [hep-ph/9403338].
[14] P. Binetruy and P. Ramond, Phys. Lett. B 350 (1995) 49; V. Jain and R. Schrock, Phys. Lett. B 352 (1995); E. Dudas, S. Pokorski and C.A. Savoy, Phys. Lett. B 356 (1995) 45; Y. Nir, Phys. Lett. B 345 (1995) 107.
[15] Y. Grossman and Y. Nir, Nucl. Phys. B 448, 30 (1995) [hep-ph/9502418]; P. Binetruy, S. Lavignac, S. Petcov and P. Ramond, Nucl. Phys. B 496, 3 (1997) [hep-ph/9610481]; H. Dreiner, G. K. Leontaris, S. Lola, G. G. Ross and C. Scheich, Nucl. Phys. B 436, 461 (1995) [hep-ph/9409369]; R. Barbieri, L. J. Hall and A. Strumia, Phys. Lett. B 445, 407 (1999) [hep-ph/9808333]; JHEP 9811, 025 (1998) [hep-ph/9810435]; Y. Grossman, Y. Nir and Y. Shadmi, JHEP 9810, 007 (1998) [hep-ph/9808355].
[16] Y. Fukuda et al., Super-Kamiokande collaboration, hep-ex/9803006; hep-ex/9805006; hep-ex/9807003.
[17] S. Hatakeyama et al., Kamiokande collaboration, hep-ex/9806038; M. Ambrosio et al., MACRO collaboration, hep-ex/9807005; M. Spurio, for the MACRO collaboration, hep-ex/9808001.
[18] K.S. Hirata et al., Kamiokande collaboration, Phys. Lett. B 205 (1988) 416; Phys. Lett. B 280 (1992) 146; E.W. Beier et al., Phys. Lett. B 283 (1992) 446; Y. Fukuda et al., Kamiokande collaboration, Phys. Lett. B 335 (1994) 237; K. Munakata et al., Kamiokande collaboration, Phys. Rev. D 56 (1997) 23; Y. Oyama et al., Kamiokande collaboration, hep-ex/9706008; D. Casper et al., IMB collaboration, Phys. Rev. Lett. 66 (1991) 2561; R. Becker-Szendy et al., IMB collaboration, Phys. Rev. D 46 (1992) 3720; W.W.M. Allison et al., Soudan-2 collaboration, Phys. Lett. B 391 (1997) 491.
[19] M. Apollonio et al., CHOOZ collaboration, Phys. Lett. B 420 (1998) 397.
[20] See for example, L. Wolfenstein, Phys. Rev. D 17 (1978) 20; S. P. Mikheyev and A. Yu Smirnov, Yad. Fiz. 42 (1985) 1441; S.P. Mikheyev and A.Y. Smirnov, Yad. Fiz. 42, 1441 (1986); S.P. Mikheyev and A.Y. Smirnov, Sov. J. Nucl. Phys. 42, 913 (1986); J. N. Bahcall and W.C. Haxton, Phys. Rev. D 40 (1989) 931; X. Shi, D. N. Schramm and J. N. Bahcall, Phys. Rev. Lett. 69 (1992) 717; P. I. Krastev and S. Petcov, Phys. Lett. B 299 (1993) 94; N. Hata and P. Langacker, Phys. Rev. D 50 (1994) 632 and references therein; N. Hata and P. Langacker, Phys. Rev. D 52 (1995) 420.
[21] J.N. Bahcall, P.I. Krastev and A.Y. Smirnov, hep-ph/9807216.
[22] E.L. Wright et al., Astrophys. J. 396 (1992) L13; M. Davis et al., Nature 359 (1992) 393; A.N. Taylor and M. Rowan-Robinson, ibid. 396; J. Primack, J. Holtzman, A. Klypin and D. O. Caldwell, Phys. Rev. Lett. 74 (1995) 2160; K.S. Babu, R.K. Schaefer and Q. Shafi, Phys. Rev. D 53 (1996) 606.
[23] C. Athanassopoulos et al., LSND Collaboration, Phys. Rev. C 54 (1996) 2685; Phys. Rev. Lett. 77 (1996) 3082.
[24] C. Athanassopoulos et al., LSND Collaboration, Phys. Rev. Lett. 81 (1998) 1774.
[25] K. Eitel et al., Nucl. Phys. Proc. Suppl. 70 (1999) 210; hep-ex/9809007, contributed to the 18th International Conference on Neutrino Physics and Astrophysics (NEUTRINO 98), Takayama, Japan, 4-9 Jun 1998.
[26] M. Gell-Mann, P. Ramond and R. Slansky, proceedings of the Supergravity Stony Brook Workshop, New York, 1979, ed. by P. Van Nieuwenhuizen and D. Freedman (North-Holland, Amsterdam).
[27] S. Davidson and S. F. King, Phys. Lett. B 445 (1998) 191 [hep-ph/9808296].
[28] K.R. Dienes, E. Dudas and T. Gherghetta, Nucl. Phys. B 557 (1999) 25 [hep-ph/9811428]; N. Arkani-Hamed, S. Dimopoulos, G. Dvali and J. March-Russell, hep-ph/9811448.
[29] Y. Grossman and M. Neubert, Phys. Lett. B 474 (2000) 361 [hep-ph/9912408].
[30] Z. Maki, M. Nakagawa and S. Sakata, Prog. Theo. Phys. 28 (1962) 247.
[31] J. K. Elwood, N. Irges and P. Ramond, Phys. Rev. Lett. 81, 5064 (1998) [hep-ph/9807228]; P. Ramond, hep-ph/9808489; P. Binetruy and P. Ramond, Phys. Lett. B 350, 49 (1995) [hep-ph/9412385]; Nucl. Phys. B 477, 353 (1996) [hep-ph/9601243]; G. Altarelli and F. Feruglio, JHEP 9811, 021 (1998) [hep-ph/9809596].
[32] R. Barbieri, L.J. Hall, G.L. Kane and G.G. Ross, hep-ph/9901228.
[33] S. F. King, Nucl. Phys. B 576 (2000) 85 [hep-ph/9912492].
[34] H. Dreiner, G.K. Leontaris, S. Lola, G.G. Ross and C. Scheich, Nucl. Phys. B 436 (1995) 461.
[35] G.K. Leontaris, S. Lola and G.G. Ross, Nucl. Phys. B 454 (1995) 25.
[36] L. Ibanez, Phys. Lett. B 303 (1993) 55.
[37] For a review of string theories, see M. Green, J. Schwarz and E. Witten, Superstring Theory, Cambridge University Press, 1987.
[38] M.E. Machacek and M.T. Vaughn, Phys. Lett. B 103 (1981) 427; C. Wetterich, Nucl. Phys. B 261 (1985) 461; Nucl. Phys. B 279 (1987) 711; J. Bijnens and C. Wetterich, Phys. Lett. B 176 (1986) 431; Nucl. Phys. B 283 (1987) 237; Phys. Lett. B 199 (1987) 525; P. Kaus and S. Meshkov, Mod. Phys. Lett. A 3 (1988) 1251; C.D. Froggatt and H.B. Nielsen, Origin of symmetries, World Scientific (1991); S. Dimopoulos, L. J. Hall and S. Raby, Phys. Rev. Lett. 68 (1992) 1984; Phys. Rev. D 45 (1992) 4195; H. Arason, D. J. Castaño, P. Ramond and E. J. Piard, Phys. Rev. D 47 (1993) 232; G. F. Giudice, Mod. Phys. Lett. A 7 (1992) 2429; K.S. Babu and R.N. Mohapatra, Univ. of Maryland preprint, UMD-PP-95-57; M. Bando, K.-I. Izawa and T. Takahashi, Kyoto Univ. preprint, KUNS 1252.
[39] J.L. Lopez and D.V. Nanopoulos, Phys. Lett. B 268 (1991) 359; A.E. Faraggi, Phys. Lett. B 278 (1992) 131; Princeton preprint, IASSN-HEP-94/31; A.E. Faraggi and E. Halyo, Nucl. Phys. B 416 (1994) 63.
[40] H. Georgi and S.L. Glashow, Phys. Rev. Lett. 32 (1974) 438; Nucl. Phys. B 193 (1981) 150.
[41] H. Georgi and C. Jarlskog, Phys. Lett. B 86 (1979) 297; H. Georgi and D.V. Nanopoulos, Nucl. Phys. B 159 (1979) 16; J. Harvey, P. Ramond and D.B. Reiss, Phys. Lett. B 92 (1980) 309; Nucl. Phys. B 199 (1982) 223.
[42] S. Lola and G. G. Ross, Nucl. Phys. B 553 (1999) 81 [hep-ph/9902283].
[43] F. Gabbiani and A. Masiero, Nucl. Phys. B 322 (1989) 235; J. Hagelin, S. Kelley and T. Tanaka, Nucl. Phys. B 415 (1994) 293; Mod. Phys. Lett. A 8 (1993) 2737; D. Choudhury, F. Eberlein, A. Konig, J. Louis and S. Pokorski, Phys. Lett. B 342 (1995) 80; F. Gabbiani, E. Gabrielli, A. Masiero and L. Silvestrini, Rome preprint ROM2F/96/21; hep-ph/9604387.
[44] R. Barbieri, P. Creminelli and A. Romanino, Nucl. Phys. B 559 (1999) 17 [hep-ph/9903460]; R. Barbieri, L. J. Hall and A. Romanino, Phys. Lett. B 401, 47 (1997) [hep-ph/9702315]; Nucl. Phys. B 550, 32 (1999) [hep-ph/9812239]; R. Barbieri, L. J. Hall, S. Raby and A. Romanino, Nucl. Phys. B 493 (1997) 3 [hep-ph/9610449]; T. Blazek, S. Raby and K. Tobe, Phys. Rev. D 62 (2000) 055001 [hep-ph/9912482]; Z. Berezhiani and A. Rossi, JHEP 9903, 002 (1999) [hep-ph/9811447].
[45] M. Chen and K. T. Mahanthappa, Phys. Rev. D 62 (2000) 113007 [hep-ph/0005292].
Physics of Extra Dimensions

Joseph D. Lykken

Theoretical Physics Department, Fermi National Accelerator Laboratory, P.O. Box 500, Batavia, IL 60510
Enrico Fermi Institute and Dept. of Physics, University of Chicago, 5640 Ellis Ave., Chicago, IL 60631
1 Why extra dimensions?
It is an underappreciated fact that human intelligence derives largely from the propensity of human infants to perform physics experiments. Every baby squirming in her crib makes thousands of detailed observations about the basic attributes of physical reality. These experiments are the process by which the brain wires itself into an intelligent organism, capable of understanding and manipulating the world around it. Babies quickly develop the ability to differentiate their sensory stream into "objects", and learn that objects have a number of nearly universal attributes, including solidity, persistence, and extension. Persistence and extension, of course, are the notions that objects occupy time and space. Our baby physicist also quickly appreciates the concept of motion, and thus that spatial dimensions represent degrees of freedom. Baby easily discerns that, at least for objects in the crib, there are precisely three independent spatial dimensions accessible either for extension or for motion. Furthermore, relative to either the extent or range of motion of objects in the crib, these spatial dimensions appear to be continuous and very large.

Although not an expert in child development, the great philosopher Immanuel Kant (1724-1804) also stressed the primary importance of space and time in the organization of our sensory input stream, and thus for comprehension and the acquisition of synthetic knowledge. In his Critique of Pure Reason, Kant singled out the truths of geometry as "synthetic a priori". By this he meant that while the validity of these formal propositions doesn't depend on any particular empirical data, geometry itself presupposes a general feature of all human perception, namely, the existence of three-dimensional Euclidean space. It has sometimes been said that Kant's arguments deny the possibility of non-Euclidean spaces or of higher dimensional spacetimes, but I would say rather that Kant correctly stressed the primacy of three-dimensional Euclidean space as a basic organizing principle of human comprehension.

In modern particle physics we have, for the most part, assigned an analogous primacy to 4-dimensional Minkowski spacetime as an underlying organizational principle for fundamental dynamics. Relativistic quantum fields permeate spacetime, interactions occur at points in spacetime, and quantum dynamics can be formulated, à la Feynman, as a weighted average over spacetime histories. We invoke the Lorentz and translational symmetries of 4-dimensional Minkowski space to define two basic quantum numbers: mass-squared (a continuous non-negative parameter) and spin (a discrete non-negative half-integral parameter). For the quotidian activities of modern physics, Kant, with a little assist from special relativity, is right on the money.

This cannot, however, be the end of the story. The great lesson from the past century of modern physics is that physics is an empirical science, both quantitatively and conceptually. Baby's intuitions about physics derive from experiments with a limited number of objects performed in a narrow range of energy scales, length scales, and time scales, i.e., those accessible to crib-dwellers. When baby bangs her head against a bumper, that is electrodynamics, but it is both quantitatively and conceptually different from electrodynamics at microscopic scales and high energies.
During the past century we have gradually learned to accept the fact that an electron is an electron, not a rubber ball. More generally, as we
used experiments to piece together a description of the microscopic world, we found that the basic objects and processes of this world do not have any direct correlates in our everyday experience. The microscopic world turns out to be far stranger than any fiction we could have invented.

In the century just beginning, we expect to make great advances in our quantitative and conceptual understanding of the nature of spacetime, and of its fundamental role in particle physics. Already we know that baby's eduction of Euclidean spatial dimensions is only an approximation, which breaks down in the infrared limit. Experiments which probe very long length scales and very low energy scales find that spacetime is curved, due to the presence of macroscopic sources of energy and momentum. As Einstein intuited, fluctuations of spacetime curvature are gravitational dynamics. This implies a basic paradox of modern physics: how can spacetime be the fundamental organizing principle of quantum dynamics if spacetime itself is dynamical? This paradox is usually described as the difficulty of reconciling general relativity with quantum theory, and its solution is one of the main goals of string theory.

Which brings us to the subject of extra dimensions. Our conclusion that there are three independent spatial dimensions is valid in a certain regime with certain assumptions. Let us now examine the loopholes:

• Our experiments were conducted with macroscopic objects at low energies. These experiments are not sensitive to the existence of additional spatial dimensions which are compact, i.e., which have finite extent.

• Our conclusion assumed that all particles probe the same number of spatial dimensions at all energy scales. This is not at all obvious; it is equally plausible that some particles probe three spatial dimensions in an energy regime where other kinds of particles probe more than three dimensions.

• Our conclusion assumed that there is a clean conceptual separation between spatial degrees of freedom and other kinds of degrees of freedom, at all energy scales. This is not at all obvious, particularly since there are many similarities in particle physics between spatial degrees of freedom and "gauge" degrees of freedom.

In modern physics we have become wary of asking seemingly straightforward questions which turn out to be ill-posed. "Is an electron a particle or a wave?" would be a famous example. The question "what happened before the Big Bang?" is not a meaningful question until you have some idea what "before" and "happened" mean in this context. Similarly we see that the question "are there extra spatial dimensions?" is not well-posed until we address (at least) three other questions: (i) "in what energy regime?", (ii) "for what particles?", and (iii) "what do you mean by spatial degrees of freedom in this context?".

These lectures are an introduction to modern explorations and speculations about these basic issues related to the possibility of extra spatial dimensions. This subject began with the pioneering work of Nordstrom, Kaluza, and Klein some 80 years ago. It began to receive serious attention in particle physics in the late 1970s, motivated by the development of extended supergravity theories. This interest was further piqued by the development of modern string theory in the 1980s, which seemed to require a large number of extra spatial dimensions. Motivated by further developments in string theory, we have witnessed in the past three years a complete rethinking of the basic issues raised above, and an explosion of phenomenological interest in the physics of extra dimensions. Most exciting, I will describe the possibility that the physics of extra dimensions may be probed directly by experiments planned for this decade.
2 Gravity in four-dimensional spacetime
General relativity is the classical theory of gravity and, expanded around flat spacetimes, also serves as a starting point for a perturbative treatment of quantum gravity. Since this is not a course in general relativity, I will introduce only the essential practical elements [1,2]. General relativity is a metric theory of gravity based upon a real symmetric tensor field $g_{\mu\nu}(x^\rho)$, the metric field of four-dimensional spacetime. The metric field contracted with the spacetime differentials $dx^\mu$ defines the proper time interval $d\tau$:
$$ d\tau^2 = g_{\mu\nu}\,dx^\mu dx^\nu . \tag{1} $$
Here and elsewhere in these lectures I use natural units $\hbar = c = 1$, and adopt the metric signature used in particle physics, i.e.
$$ p^\mu p_\mu = E^2 - \vec{p}^{\,2} = m^2 . \tag{2} $$
With this convention, the Minkowski metric of flat four-dimensional spacetime is written
$$ \eta_{\mu\nu} = \mathrm{diag}(1,-1,-1,-1) . \tag{3} $$
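As a small illustration of the signature convention in Eqs. (2) and (3), the sketch below contracts a four-momentum with itself using $\eta_{\mu\nu} = \mathrm{diag}(1,-1,-1,-1)$ and recovers $m^2$. The function names and the numerical values are hypothetical, chosen only for this example.

```python
import math

# Minkowski metric with the particle-physics signature (+, -, -, -), Eq. (3).
ETA = [1.0, -1.0, -1.0, -1.0]


def minkowski_dot(p, q):
    """Contract two four-vectors with eta_{mu nu} = diag(1, -1, -1, -1)."""
    return sum(eta * a * b for eta, a, b in zip(ETA, p, q))


def on_shell_momentum(mass, px, py, pz):
    """Return p^mu = (E, px, py, pz) with E fixed by E^2 - |p|^2 = m^2."""
    energy = math.sqrt(mass**2 + px**2 + py**2 + pz**2)
    return [energy, px, py, pz]


# Hypothetical example in natural units (hbar = c = 1):
# mass 0.5 and three-momentum (3, 4, 12).
p = on_shell_momentum(0.5, 3.0, 4.0, 12.0)
print(minkowski_dot(p, p))  # equals m^2 = 0.25 up to rounding error
```

With the opposite ("relativist") signature the same contraction would give $-m^2$, which is the practical reason the convention has to be fixed before comparing formulas.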
Note that relativists always use the opposite convention for the metric signature. This makes sense if you are interested in geometrical aspects of general relativity, but is not very convenient if you are attempting to interface with particle physics. To a large extent, one can derive the minimal lagrangian formulation of metric gravity coupled to matter by making an analogy to gauge theories. The analog of local gauge invariance for the case at hand is general coordinate invariance. The basic objects from which we will construct lagrangians are tensors and tensor densities. The partial derivative of a tensor is not (in general) a tensor, thus we need to introduce the idea of covariant derivatives. For
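example, the covariant derivative of a vector field $V^\lambda(x)$ can be written:
$$ V^\lambda{}_{;\mu} = D_\mu V^\lambda(x) = \partial_\mu V^\lambda(x) + \Gamma^\lambda_{\mu\nu}(x)\,V^\nu(x) , \tag{4} $$
where the semicolon notation is standard in relativity, while the "$D_\mu$" notation is familiar from gauge theories. The field $\Gamma^\lambda_{\mu\nu}(x)$ in the above expression is called the affine connection; it is defined in terms of the metric field and its derivatives:
$$ \Gamma^\lambda_{\mu\nu}(x) = \tfrac{1}{2}\, g^{\lambda\alpha}\left(\partial_\mu g_{\alpha\nu} + \partial_\nu g_{\alpha\mu} - \partial_\alpha g_{\mu\nu}\right) . \tag{5} $$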
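The definition (5) can be evaluated numerically for any metric. The sketch below is an illustration only, not part of the lectures: it uses the flat two-dimensional plane in polar coordinates (a hypothetical test case with a diagonal metric) and central finite differences, and compares with the known results $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = 1/r$. All function names are assumptions made for this example.

```python
def polar_metric(x):
    """Flat 2d metric in polar coordinates x = (r, theta): ds^2 = dr^2 + r^2 dtheta^2."""
    r, _theta = x
    return [[1.0, 0.0], [0.0, r * r]]


def inverse_diagonal(g):
    """Invert a diagonal metric (sufficient for this illustration)."""
    size = len(g)
    return [[(1.0 / g[i][i] if i == j else 0.0) for j in range(size)] for i in range(size)]


def metric_derivative(metric, x, direction, step=1e-5):
    """Central finite difference of every metric component along one coordinate."""
    forward, backward = list(x), list(x)
    forward[direction] += step
    backward[direction] -= step
    g_plus, g_minus = metric(forward), metric(backward)
    size = len(x)
    return [[(g_plus[i][j] - g_minus[i][j]) / (2 * step) for j in range(size)]
            for i in range(size)]


def christoffel(metric, x):
    """gamma[l][m][n] = 1/2 g^{l a} (d_m g_{a n} + d_n g_{a m} - d_a g_{m n}), Eq. (5)."""
    size = len(x)
    g_inv = inverse_diagonal(metric(x))
    dg = [metric_derivative(metric, x, k) for k in range(size)]  # dg[k][i][j] = d_k g_{ij}
    gamma = [[[0.0] * size for _ in range(size)] for _ in range(size)]
    for l in range(size):
        for m in range(size):
            for n in range(size):
                gamma[l][m][n] = 0.5 * sum(
                    g_inv[l][a] * (dg[m][a][n] + dg[n][a][m] - dg[a][m][n])
                    for a in range(size)
                )
    return gamma


gamma = christoffel(polar_metric, [2.0, 0.3])
print(gamma[0][1][1])  # Gamma^r_{theta theta}, analytically -r
print(gamma[1][0][1])  # Gamma^theta_{r theta}, analytically 1/r
```

The connection is nonzero even though the space is flat, which is one way to see that $\Gamma^\lambda_{\mu\nu}$ is not a tensor.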
In general relativity, as in gauge theories, the commutator of two covariant derivatives gives a new tensor. For gauge theories this defines the field strength tensor:
$$ [D_\mu, D_\nu] = F^a_{\mu\nu}\,T^a = F_{\mu\nu} , \tag{6} $$
while for gravity this defines the Riemann curvature tensor:
$$ [D_\mu, D_\nu]\,V_\rho = \mathcal{R}^\lambda{}_{\rho\mu\nu}\,V_\lambda . \tag{7} $$
From the Riemann tensor we can construct a second rank symmetric tensor and a scalar, simply by contracting indices with the metric field. We thus obtain the Ricci tensor:
$$ \mathcal{R}_{\mu\nu} = \mathcal{R}^\lambda{}_{\mu\lambda\nu} , \tag{8} $$
and the Ricci scalar:
$$ \mathcal{R} = \mathcal{R}^\mu{}_\mu . \tag{9} $$
The Ricci scalar field $\mathcal{R}(x)$ is a good candidate for a gravity lagrangian, since it contains two derivatives of the metric field, but is invariant under general coordinate transformations. To construct an action we also need to know the invariant volume element; this follows from observing that the naive volume element transforms by a Jacobian factor under a coordinate change $x \to x'$:
$$ d^4x' = \left|\frac{\partial x'}{\partial x}\right| d^4x , \tag{10} $$
which can be compensated by the transformation of the determinant of the metric. The invariant volume element is thus
$$ d^4x\,\sqrt{-g} , \tag{11} $$
where $g = \det(g_{\mu\nu})$. It is not too difficult to show that there is a unique action which is general coordinate invariant, Lorentz invariant, and is constructed purely from the metric field and its first and second derivatives. This action, known as the Einstein-Hilbert action, is
$$ S_{\rm grav} = \int d^4x\,\sqrt{-g}\left[\Lambda + \frac{1}{\kappa^2}\,\mathcal{R}\right] , \tag{12} $$
where $\Lambda$ is the cosmological constant, which has dimensions of energy density, i.e. (mass)$^4$. The gravitational coupling constant $\kappa$, which has dimensions of (mass)$^{-1}$, can be expressed in terms of Newton's gravitational constant:
$$ \kappa^2 = 16\pi G . \tag{13} $$
Newton's constant is often re-expressed in terms of the Planck mass:
$$ G = \frac{1}{M_{\rm Pl}^2} ; \qquad M_{\rm Pl} = 1.221 \times 10^{19}\ {\rm GeV} . \tag{14} $$
Now we can derive the equations of motion for the metric field by varying the action. Take
$$ \frac{\delta}{\delta g_{\mu\nu}} \int d^4x\,\sqrt{-g}\left[\Lambda + \frac{1}{\kappa^2}\,\mathcal{R} + \mathcal{L}_{\rm matter}\right] , \tag{15} $$
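where $\mathcal{L}_{\rm matter}$ is some generally covariant matter lagrangian. To evaluate this we need two simple identities:
$$ \delta\sqrt{-g} = \tfrac{1}{2}\sqrt{-g}\; g^{\mu\nu}\,\delta g_{\mu\nu} ; \qquad \delta g^{\rho\lambda} = -g^{\rho\mu} g^{\lambda\nu}\,\delta g_{\mu\nu} , \tag{16} $$
as well as the Palatini identity for the variation of the Ricci tensor:
$$ \delta\mathcal{R}_{\mu\nu} = \left(\delta\Gamma^\lambda_{\mu\lambda}\right)_{;\nu} - \left(\delta\Gamma^\lambda_{\mu\nu}\right)_{;\lambda} . \tag{17} $$
Note that $\delta\Gamma^\lambda_{\mu\nu}$ is a tensor, even though $\Gamma^\lambda_{\mu\nu}$ is not. From the Palatini identity we observe that $g^{\mu\nu}\delta\mathcal{R}_{\mu\nu}$ is a covariant divergence; this means that $\sqrt{-g}\,g^{\mu\nu}\delta\mathcal{R}_{\mu\nu}$ is a total derivative, since e.g. for a vector field $V^\mu$
$$ \sqrt{-g}\;V^\mu{}_{;\mu} = \sqrt{-g}\;\partial_\mu V^\mu + \sqrt{-g}\;\Gamma^\mu_{\mu\nu}V^\nu = \partial_\mu\!\left(\sqrt{-g}\;V^\mu\right) , \tag{18} $$
where we used the relation:
$$ \Gamma^\mu_{\mu\nu} = \frac{1}{\sqrt{-g}}\,\partial_\nu\sqrt{-g} . \tag{19} $$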
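Thus, dropping the total derivative, the variation of the gravity action becomes
$$ \frac{\delta S_{\rm grav}}{\delta g_{\mu\nu}} = \sqrt{-g}\left\{ -\frac{1}{\kappa^2}\,\mathcal{R}^{\mu\nu} + \tfrac{1}{2}\,g^{\mu\nu}\left(\frac{1}{\kappa^2}\,\mathcal{R} + \Lambda\right)\right\} . \tag{20} $$
We also define:
$$ \frac{\delta}{\delta g_{\mu\nu}} \int d^4x\,\sqrt{-g}\;\mathcal{L}_{\rm matter} = -\tfrac{1}{2}\sqrt{-g}\;T^{\mu\nu} , \tag{21} $$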
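where $T^{\mu\nu}$ is called the stress-energy tensor. The full equation of motion is thus:
$$ \mathcal{R}_{\mu\nu} - \tfrac{1}{2}\,g_{\mu\nu}\mathcal{R} = \frac{\kappa^2}{2}\left[\Lambda g_{\mu\nu} - T_{\mu\nu}\right] . \tag{22} $$
This is the Einstein field equation.

3 Kaluza-Klein theory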
Some 80 years ago Nordstrom, Kaluza, and Klein noticed an elegant relation between Einstein gravity in 4 + 1 spacetime dimensions and abelian gauge theory [3]. This relation allows a unified description of gravity and electromagnetism, if we postulate the existence of a compactified 5th dimension. Let us use Latin letters M, N, P, to denote spacetime indices in 5 dimensions. We write the metric as:
$$ g_{MN}(x^P) , \qquad M,N,P = 0,1,2,3,4 , \tag{23} $$
where $\ldots$ (24)
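The Einstein-Hilbert action generalizes trivially to the case of 5 dimensions:
$$ S_{\rm grav} = \int d^5x\,\sqrt{g}\left[\Lambda_5 + \frac{1}{\kappa_5^2}\,\mathcal{R}\right] , \tag{25} $$
where we have introduced a 5d cosmological constant $\Lambda_5$ and a 5d gravitational coupling:
$$ \kappa_5^2 = 16\pi G_5 ; \qquad G_5 = \frac{1}{M_5^3} . \tag{26} $$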
Here M5 denotes the 5d Planck mass which, as in 4d gravity, represents the energy scale at which gravitational effects become strong.
In Kaluza-Klein theory we postulate that the extra spatial dimension is compact. If we want a compact manifold then this must simply be a circle, with some radius $R_5$. We can obtain this by imposing a periodicity condition on the coordinate $x^4$:
$$ S^1 : \quad x^4 \equiv x^4 + 2\pi R_5 . \tag{27} $$
Note that a circle is flat (it has no curvature), so $M^4 \times S^1$ is a solution of the vacuum (i.e. sourceless) 5d Einstein field equation. In order to be a single-valued field, $g_{MN}(x^P)$ must be expandable in Fourier modes as
$$ g_{MN}(x^\mu, x^4) = \sum_{n=-\infty}^{\infty} g^{(n)}_{MN}(x^\mu)\,\exp\!\left(\frac{i n x^4}{R_5}\right) . \tag{28} $$
Since the metric is real we must also have $g^{(-n)}_{MN} = \left(g^{(n)}_{MN}\right)^*$. The 4d fields $g^{(n)}_{MN}(x^\mu)$ are called Kaluza-Klein modes. From the 4d point of view, each Kaluza-Klein (KK) mode decomposes according to its 4d index structure:
$$ g_{MN}(x^\rho) = \begin{pmatrix} g_{\mu\nu}(x^\rho) & g_{\mu 4}(x^\rho) \\ g_{4\nu}(x^\rho) & g_{44}(x^\rho) \end{pmatrix} . \tag{29} $$
The 5d metric field is a symmetric tensor, so $g_{\mu 4} = g_{4\mu}$. Thus in terms of 4d tensors the original 5d metric field becomes three distinct towers of KK modes:

• symmetric tensors $g^{(n)}_{\mu\nu}(x^\rho)$,

• vector fields $g^{(n)}_{\mu 4}(x^\rho)$, and

• scalar fields $g^{(n)}_{44}(x^\rho)$.

The three zero mode fields $g^{(0)}_{\mu\nu}(x^\rho)$, $g^{(0)}_{\mu 4}(x^\rho)$, $g^{(0)}_{44}(x^\rho)$ are a special case because they are massless. To see this, observe that the nth KK mode carries n quantized units of KK momentum:
$$ p_4 = \frac{n}{R_5} . \tag{30} $$
Since
$$ 0 = p^M p_M = p^\mu p_\mu - (p_4)^2 , \tag{31} $$
we see that, from the 4d point of view, the nth KK mode has a mass given by
$$ m_n^2 = \frac{n^2}{R_5^2} . \tag{32} $$
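To get a feeling for Eq. (32), the following sketch converts the KK mass $m_n = |n|/R_5$ from natural units into GeV for a radius given in metres. The radius used in the example is a hypothetical value chosen purely for illustration; only the conversion constant $\hbar c \simeq 1.973 \times 10^{-16}$ GeV m is a physical input, and the function names are assumptions.

```python
HBAR_C_GEV_M = 1.973269804e-16  # hbar * c in GeV * metres


def kk_mass_gev(n, radius_m):
    """Mass of the n-th KK mode, m_n = |n| / R_5 (Eq. 32), converted to GeV."""
    return abs(n) * HBAR_C_GEV_M / radius_m


def kk_tower(radius_m, n_max):
    """Masses of the modes n = 0, 1, ..., n_max for a given radius."""
    return [kk_mass_gev(n, radius_m) for n in range(n_max + 1)]


# Hypothetical radius, chosen only for illustration: R_5 = 1e-18 m.
for n, mass in enumerate(kk_tower(1e-18, 3)):
    print(n, mass)
```

The tower is evenly spaced, and halving the radius doubles every nonzero mass, while the zero mode stays massless.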
Now we see very concretely how a compact dimension becomes invisible at low energies; as $R_5 \to 0$ all but the KK zero modes become very heavy and become kinematically inaccessible in low energy dynamics.

4 Linearized gravity in 5 dimensions
Before computing the low energy effective action for the KK zero mode fields, let us first do a more general calculation of the action for linearized gravity in 5 dimensions. Suppose we write
$$ g_{MN}(x^P) = g^c_{MN} + \delta g_{MN} , \tag{33} $$
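where $g^c_{MN}$ is any solution of the 5d Einstein equations, and $\delta g_{MN}$ is a small fluctuation. Then the second order variation of the Einstein-Hilbert action is given by:
$$ \sqrt{g}\,\mathcal{R} \;\to\; \sqrt{g}\,\mathcal{R} + \tfrac{1}{2}\,\delta g_{MN}\sqrt{g}\left(-\delta\mathcal{R}^{MN} + \tfrac{1}{2}\,g^{MN}g^{PQ}\,\delta\mathcal{R}_{PQ}\right) + \tfrac{1}{8}\,\delta g_{MN}\,\delta g_{PQ}\,\sqrt{g}\,\mathcal{R}\left(g^{MN}g^{PQ} - g^{MP}g^{NQ} - g^{MQ}g^{NP}\right) . \tag{34} $$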
It is understood in this expression that indices are raised and lowered with respect to the background metric. We now specialize to the case of flat backgrounds, for which the last set of terms in the above expression vanish. We then write
$$ g_{MN}(x^P) = \eta_{MN} + \kappa_5\,h_{MN} , \tag{35} $$
where the fluctuation field $h_{MN}$ has been rescaled so that, at the end of the calculation, it will have a canonical kinetic term. $h_{MN}$ is the 5d graviton field. Plugging the above into the Palatini identity, we obtain:
$$ \delta\mathcal{R}_{MN} = \frac{\kappa_5}{2}\left(\partial^P\partial_P h_{MN} - \partial_M h_N - \partial_N h_M + \partial_M\partial_N h\right) . \tag{36} $$
Here we have introduced the compact notation
$$ h = h^M{}_M ; \qquad h_M = \partial^N h_{MN} . \tag{37} $$
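The intermediate step between the Palatini identity and Eq. (36) is not spelled out; a sketch of it is given below. It assumes the Palatini identity in the form $\delta\mathcal{R}_{MN} = (\delta\Gamma^L_{ML})_{;N} - (\delta\Gamma^L_{MN})_{;L}$ and works to first order in $h_{MN}$ on the flat background, where covariant derivatives reduce to partial derivatives and indices are raised with $\eta^{MN}$.

```latex
% Linearized connection from g_{MN} = \eta_{MN} + \kappa_5 h_{MN}, to first order in h:
\delta\Gamma^{L}_{MN}
  = \frac{\kappa_5}{2}\left(\partial_M h^{L}{}_{N} + \partial_N h^{L}{}_{M} - \partial^{L} h_{MN}\right),
\qquad
\delta\Gamma^{L}_{ML} = \frac{\kappa_5}{2}\,\partial_M h .

% Insert into the Palatini identity (partial derivatives on a flat background):
\delta\mathcal{R}_{MN}
  = \partial_N\,\delta\Gamma^{L}_{ML} - \partial_L\,\delta\Gamma^{L}_{MN}
  = \frac{\kappa_5}{2}\left(\partial_M\partial_N h
      - \partial_L\partial_M h^{L}{}_{N}
      - \partial_L\partial_N h^{L}{}_{M}
      + \partial^{L}\partial_L h_{MN}\right)

% Using h_M = \partial^N h_{MN}, so that \partial_L h^{L}{}_{N} = h_N:
  = \frac{\kappa_5}{2}\left(\partial^{P}\partial_P h_{MN}
      - \partial_M h_N - \partial_N h_M + \partial_M\partial_N h\right).
```

In the contracted connection the second and third terms cancel by symmetry of $h_{MN}$, which is why only the trace $h$ survives there.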
With a little integration by parts we find
$$ S_{\rm FP} = \int d^5x\;\tfrac{1}{4}\left(\partial^P h^{MN}\partial_P h_{MN} - \partial^M h\,\partial_M h - 2h^M h_M + 2h^M\partial_M h\right) . \tag{38} $$
This is the 5d form of the Fierz-Pauli action for linearized gravity. Now we would like to understand what is the effective 4d theory when we compactify the fifth dimension on a circle of radius $R_5$. For simplicity I will derive this for the zero modes, and simply quote the result for the massive modes. The first step is to decompose the zero mode $h^{(0)}_{MN}(x^\rho)$ into 4d tensors:
}
~ y^Hh
V
Av
2
{69)
Note that with our metric signature A^ = V -» A» = -h^
.
(40)
The scalar field $\phi$ dependence was chosen a posteriori such that, in the decomposed linearized action, $h_{\mu\nu}$ and [...] and [...] (42)
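After integrating over the circle, the zero mode part of the F-P action becomes:
$$ \int d^4x \left\{ \tfrac{1}{4}\left[\partial^\rho h^{\mu\nu}\partial_\rho h_{\mu\nu} - \partial^\mu h\,\partial_\mu h - 2h^\mu h_\mu + 2h^\mu\partial_\mu h\right] - \tfrac{1}{4}F_{\mu\nu}F^{\mu\nu} + \tfrac{3}{2}\,\partial^\mu\phi\,\partial_\mu\phi \right\} , \tag{43} $$
with $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$.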
From this expression we see that the effective low energy 4d theory consists of 4d Einstein gravity plus electromagnetism plus a massless scalar. This is an amazing and elegant result. The off-diagonal zero mode components of the 5d metric have become gauge field components, translations around the circle now correspond to abelian gauge transformations, and KK momentum has become electric charge. We can also understand why a massless scalar has appeared in the low energy 4d effective theory. This scalar is the Goldstone boson of a spontaneously broken classical global scale invariance. The spontaneous breaking arises from our compactification on a circle of definite radius $R_5$. To see this, consider
$$ \int d^5x\,\sqrt{g}\;\frac{1}{\kappa_5^2}\,\mathcal{R} . \tag{44} $$
This 5d Einstein-Hilbert action is not invariant under a global Weyl rescaling:
$$ g_{MN} \to \lambda^{-1} g_{MN} , \qquad \mathcal{R} \to \lambda\,\mathcal{R} , \qquad \sqrt{g} \to \lambda^{-5/2}\sqrt{g} . \tag{45} $$
However, the theory which we are considering is not this, but rather the truncation of this to the zero mode sector. In the truncated theory we can consider independent rescalings of the different 4d fields. Thus we attempt to find a global scale invariance under rescalings of the form:
$$ g_{\mu\nu} \to \lambda^{-1} g_{\mu\nu} , \qquad g_{\mu 4} \to \lambda^{a} g_{\mu 4} , \qquad g_{44} \to \lambda^{2b} g_{44} , \tag{46} $$
where a and b are to be determined. The key observation is that, since all $\partial_4$ derivatives vanish acting on the zero modes, $\mathcal{R}$ in the truncated theory scales like $g^{\mu\nu}$. We find
$$ \mathcal{R} \to \lambda\,\mathcal{R} , \tag{47} $$
while $\sqrt{g} \to \lambda^{-2+b}\sqrt{g}$. Thus we conclude that the truncated 5d action $\sqrt{g}\,\mathcal{R}$ is scale invariant provided $b = 1$. An additional consistency condition comes from observing that, since
$$ g_{MN}\,g^{NP} = \delta_M^P , \tag{48} $$
it should also hold in the truncated theory that $g_{\mu 4}g^{4\nu}$ scales the same way as $g_{\mu\rho}g^{\rho\nu}$. This fixes $a = 1/2$. Thus the truncated zero mode action is invariant under the following global scale transformation:
$$ g_{\mu\nu} \to \lambda^{-1} g_{\mu\nu} , \qquad g_{\mu 4} \to \lambda^{1/2} g_{\mu 4} , \qquad g_{44} \to \lambda^{2} g_{44} . \tag{49} $$
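The two conditions used in this argument (invariance of $\sqrt{g}\,\mathcal{R}$, and the requirement that $g_{\mu 4}g^{4\nu}$ scale like $g_{\mu\rho}g^{\rho\nu}$) can be solved mechanically. The sketch below is an illustration under stated assumptions rather than a transcription of the text: the 4d block is taken to rescale as $g_{\mu\nu} \to \lambda^{-1}g_{\mu\nu}$, $\mathcal{R}$ is taken to scale like $g^{\mu\nu}$, and the function names are hypothetical. It returns the total scaling weights of $g_{\mu 4}$ and $g_{44}$.

```python
from fractions import Fraction


def weight_sqrt_g(w_4d, w_44):
    """Scaling weight of sqrt(g) for the block metric: four 4d directions plus g_44."""
    return (4 * w_4d + w_44) / 2


def solve_weights(w_4d=Fraction(-1)):
    """Return the weights (w_mu4, w_44) of g_{mu 4} and g_44.

    Assumptions (labelled, not taken verbatim from the text):
      g_{mu nu} -> lambda^w_4d g_{mu nu}, and R scales like g^{mu nu},
      i.e. R carries weight -w_4d.
    """
    # Invariance of sqrt(g) R: (4 w_4d + w_44)/2 - w_4d = 0.
    w_44 = -2 * w_4d
    # g_{mu 4} g^{4 nu} scales like g_{mu 4} g_{nu 4} / (g_{mu nu} g_44);
    # requiring weight zero gives 2 w_mu4 - w_4d - w_44 = 0.
    w_mu4 = (w_4d + w_44) / 2
    return w_mu4, w_44


w_mu4, w_44 = solve_weights()
print(w_mu4, w_44)
```

The result depends only on the weight chosen for the 4d block, so replacing $\lambda$ by $1/\lambda$ flips the sign of every exponent together.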
This symmetry is spontaneously broken by the compactification onto $M^4 \times S^1$, i.e. by the background metric:
$$ g_{\mu\nu} = \eta_{\mu\nu} , \qquad g_{\mu 4} = 0 , \qquad g_{44} = -1 . \tag{50} $$
Another way of saying this is that the vacuum expectation value (vev) of the scalar field $\phi$ fixes the compactification radius $R_5$ and breaks the scale invariance. Thus $\phi$ is the radius modulus field associated with the size of the 5th dimension. In more sophisticated treatments of Kaluza-Klein theory, 5d matter fields are added to the theory which generate, at 1-loop order, a potential for the radius modulus. This fixes the radius $R_5$ dynamically and gives the scalar $\phi$ a mass.

5 Kaluza-Klein theory beyond five dimensions
Kaluza-Klein theory can be extended to more than one extra spatial dimension in a straightforward way. However, certain aspects of the theory become more complicated, and 5d KK theory has certain special features which do not hold for $d > 5$. Let's warm up with the simple exercise of counting massless modes in d dimensions. The graviton in d-dimensional spacetime has
$$ \frac{(d-2)(d-1)}{2} - 1 \tag{51} $$
on-shell degrees of freedom, thus: 2 for 4d, 5 for 5d, 44 for 11d. In the case we just considered, the 5 physical degrees of freedom of our 5d graviton have broken up into the 2 d.o.f. of the 4d graviton, the 2 d.o.f. of the 4d gauge boson, plus the massless scalar.

The story is different for the massive KK gravity modes. The massless zero mode graviton $h_{\mu\nu}(x^\rho)$ has a whole tower of massive KK partners, $h^{(n)}_{\mu\nu}(x^\rho)$. However these modes are massive spin 2 fields, with 5 physical polarization states (in 4d). Where did the extra three d.o.f. come from? The answer is that there is a Kaluza-Klein Higgs mechanism whereby, at each KK mass level n, the field $h^{(n)}_{\mu\nu}(x^\rho)$ "eats" the massive KK mode partners of $A_\mu$ and $\phi$, thus producing a single tower of massive spin 2 KK modes. The complete linearized action for the massive modes can be written entirely in terms of a single set of modes, defined by
$$ \ldots \tag{52} $$
The complete massive Fierz-Pauli lagrangian can be written
$$ \tfrac{1}{4}\left(\partial^\mu h^{\nu\rho,n}\,\partial_\mu h^{-n}_{\nu\rho} - \partial^\mu h^{n}\,\partial_\mu h^{-n} - 2h^{\mu,n}h^{-n}_{\mu} + h^{\mu,n}\partial_\mu h^{-n} + h^{\mu,-n}\partial_\mu h^{n} - m_n^2\,h^{\mu\nu,n}h^{-n}_{\mu\nu} + m_n^2\,h^{n}h^{-n}\right) . \tag{53} $$
Now we consider the generalization of Kaluza-Klein theory to n extra compact spatial dimensions, instead of just one. The 4+n dimensional metric decomposes into block form:
$$ g_{MN} = \begin{pmatrix} g_{\mu\nu} & g_{\mu b} \\ g_{a\nu} & g_{ab} \end{pmatrix} ; \qquad a,b = 1,2,\ldots n , \tag{54} $$
where $g_{ab}$ is the metric of some n-dimensional compact manifold. The simplest generalization is KK compactification on an n-dimensional hypertorus, i.e., an n-torus $(S^1)^n$. The zero mode action in this case will consist of the 4d graviton plus n abelian gauge fields plus $n(n+1)/2$ massless scalars. This is the correct counting since a graviton in 4+n dimensions has
$$ \frac{(n+2)(n+3)}{2} - 1 = 2 + 2n + \frac{n(n+1)}{2} \tag{55} $$
on-shell degrees of freedom. The massive modes will assemble themselves into towers of massive spin 2, massive spin 1, and massive spin 0 fields. Both the massive spin 2 and the massive spin 1 fields "eat" degrees of freedom to acquire extra polarization states. The final counting is one tower of massive spin 2, $n - 1$ towers of massive spin 1, and $n(n-1)/2$ towers of massive scalars.

We can also compactify on manifolds other than a hypertorus. If we compactify on a maximally symmetric space, e.g. the n-sphere $S^n$, we will get more zero mode gauge fields than in the case of the torus. Not surprisingly, this results in a nonabelian gauge theory. For example, consider 6-dimensional gravity compactified on $M^4 \times S^2$. The 2-sphere has 3 isometries, corresponding to the 3 Euler angles of an SO(3) rotation. The KK zero mode theory is thus an SO(3) gauge theory. Roughly speaking, the nonabelian gauge bosons correspond to the 3 Killing vectors which can be constructed out of the 6d metric components.

The real story is not so simple, however, because the 2-sphere (unlike the torus) has nonzero curvature. Thus compactification on a sphere is not a solution to the vacuum Einstein field equations. A consistent treatment requires that we add some appropriate extra matter sources into the higher dimensional theory. This immediately becomes rather complicated and the results rather model-dependent.

This problem surfaced in full force in the 1970s, with the development of 11-dimensional supergravity [5]. This theory has maximal supersymmetry (32 supercharges) and is unique. Nowadays we recognize it as the low energy limit of the M-theory limit of string theory. The supergravity theory has 128 on-shell bosonic degrees of freedom: 44 from the graviton $h_{MN}$ and 84 from a 3rd rank antisymmetric tensor gauge field $A_{MNP}$. This theory can be compactified on a 7-sphere $S^7$. In that case the 4d zero mode theory is 4d N = 8 supergravity.
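The degree-of-freedom counting used in this section is easy to check by brute force. The sketch below is a minimal illustration with hypothetical function names: it evaluates the graviton count $(d-2)(d-1)/2 - 1$, compares it with the torus zero-mode and massive-level decompositions quoted above, and uses the standard transverse-component count $\binom{d-2}{p}$ for a massless rank-p antisymmetric gauge field (an input assumed here, not derived in the text).

```python
from math import comb


def graviton_dof(d):
    """On-shell degrees of freedom of a massless graviton in d spacetime dimensions."""
    return (d - 2) * (d - 1) // 2 - 1


def torus_zero_modes(n):
    """4d zero-mode content on an n-torus: one graviton (2 d.o.f.),
    n abelian gauge fields (2 each), n(n+1)/2 massless scalars."""
    return 2 + 2 * n + n * (n + 1) // 2


def torus_massive_level(n):
    """One massive KK level: one spin-2 field (5 d.o.f.),
    n - 1 spin-1 fields (3 each), n(n-1)/2 scalars."""
    return 5 + 3 * (n - 1) + n * (n - 1) // 2


def antisymmetric_tensor_dof(d, rank):
    """On-shell d.o.f. of a massless antisymmetric gauge field of the given rank."""
    return comb(d - 2, rank)


# Both decompositions reproduce the (4+n)-dimensional graviton count.
for n in range(1, 8):
    assert graviton_dof(4 + n) == torus_zero_modes(n) == torus_massive_level(n)

print(graviton_dof(4), graviton_dof(5), graviton_dof(11))   # 2 5 44
print(graviton_dof(11) + antisymmetric_tensor_dof(11, 3))    # 128
```

The massive-level count agreeing with the massless one is the bookkeeping behind the Kaluza-Klein Higgs mechanism: the eaten modes are redistributed, not lost.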
This 4d theory also has 128 on-shell bosonic degrees of freedom: 2 from the 4d graviton, 56 from the 28 gauge bosons of an SO(8) nonabelian gauge theory, and 70 massless scalars. Note that there are more KK gauge boson degrees of freedom in the compactified theory than there were graviton degrees of freedom in the 11d theory.

6 String theory and extra dimensions
Perturbative first-quantized bosonic string theory [6] consists of the worldsheet dynamics of the string coordinates:
$$ X^M(\tau,\sigma) ; \qquad M = 0,1,\ldots d-1 , \tag{56} $$
where $\tau$, $\sigma$ parametrize the string worldsheet. Classically, the worldsheet theory respects both 2d general coordinate invariance and Weyl rescaling invariance. However the quantum theory is anomalous unless d = 26. For superstrings there is a similar story: there are now additional worldsheet superpartner degrees of freedom $\psi^M(\tau,\sigma)$, and the theory turns out to be anomalous unless d = 10. This leads to the unambiguous conclusion that superstrings live in a 10-dimensional spacetime. This turns out to be one of the few robust phenomenological predictions of string theory that we have yet managed to produce.

String phenomenology is plagued by what appears to be a fundamental vacuum degeneracy problem, i.e., string theory has an infinite number of stable degenerate vacuum states, parametrized by (among other things) the vevs of a large number of scalar moduli fields. This moduli space of string vacua has not yet been understood or even mapped out: we do not even know its dimensionality, topology, or connectivity. Only a few special limits have been explored. As we move around in this moduli space, i.e., as we shift around different possible ground states, the effective field theory description of string theory varies enormously. For example, various numbers of the 9 spatial dimensions can get compactified on various
manifolds or orbifolds; the "radii" that parametrize these spaces are just some subset of the moduli vevs. String theory also contains a variety of p-dimensional dynamical membranes, p-branes, which can appear as part of the specification of the string ground state. From the point of view of perturbative string theory these branes are solitons, and we can classify vacua according to their brane content, just as we classify gauge theory vacua according to their monopole or instanton content. In the bigger picture branes are no less fundamental than the strings themselves, and indeed are in some respects more fundamental!

Of particular interest are Dirichlet branes, usually called D-branes [7,8]. These are BPS solitons [9] of the Type IIA and Type IIB limits of string theory. Strings can begin and end on D-branes; these are called open strings to distinguish them from the closed strings, loops of string which propagate in the full bulk space. There are particle-like D-branes called D0 branes, string-like D-branes called D1 branes, and D-branes with spatial dimensionalities ranging from 2 to 9, denoted D2, D3, ... D9. D-branes have "tension", i.e., energy per unit volume, and thus at low energies Dp-branes can be regarded as p-dimensional domain walls, spanning a subspace of the full 9-dimensional space. If the space is compactified, D-branes may be "wrapped" around closed cycles of the manifold. D-branes carry a charge with respect to a gauge field that permeates the bulk of 10-dimensional spacetime (a Ramond-Ramond gauge field). Parallel D-branes of like-sign charge have the property (following from the fact that they are BPS solitons) that their electrostatic repulsion is exactly cancelled by their gravitational and dilatonic attraction. This implies, among other things, that the D0 brane of Type IIA strings has a tower of evenly spaced threshold bound states.
This tower of particle-like nonperturbative degrees of freedom is actually the KK modes of a compactified 11th dimension! There is a well-defined limit in string moduli space, the
M-theory limit, where the radius $R_{11} \to \infty$, and the strings disappear. This defines the mysterious 11-dimensional M-theory, whose low energy truncation is 11-dimensional supergravity. M-theory has its own branes: electric 2-branes which serve as sources for the $A_{MNP}$ gauge field, and 5-branes which serve as dual magnetic sources. The 2-branes, wrapped around the circle of the compactified 11th dimension, become the strings of string theory in 10 dimensions.

Strings, like branes, can also wrap around closed cycles of compactified spaces. This implies new degrees of freedom called winding modes, since now the closed string periodicity condition (the requirement that the string is a loop) only has to be satisfied modulo the periodicity of the compact space. Since strings have a tension, T, wrapping a string once around a circle of radius R costs energy $2\pi R T$. The string mass spectrum for a closed string compactified on a circle now has contributions from both KK modes and winding modes:
$$ m^2 = \frac{n_{KK}^2}{R^2} + n_w^2\,(2\pi R)^2\,T^2 + \text{(stringy normal modes)} , \tag{57} $$
where $n_{KK}$ is the KK mode number and $n_w$ is the winding number. This mass spectrum is obviously invariant under the T-duality transformation
$$ R \to \frac{1}{M_s^2 R} , \qquad n_{KK} \leftrightarrow n_w , \tag{58} $$
where $M_s = \sqrt{2\pi T}$ is the string scale. This transformation, which interchanges KK modes with winding modes, turns out to be a symmetry of the full string dynamics, not just the perturbative mass spectrum. It implies, in some sense, that there is a minimum length scale $\sim 1/M_s$ for string theory. This is only one of the many remarkable string dualities.

For strings compactified to 4 dimensions, we can attempt to determine stringy parameters like $M_s$, the string self-coupling $g_s$, and
the compactification radii, in terms of low energy phenomenological parameters like $M_{\rm Pl}$ and gauge couplings. However the results of this exercise depend crucially on where you think you are in string moduli space. As a result neither the string scale nor the compactification scales are predicted in a robust way; they could be as high as $M_{\rm Pl}$, or as low as a TeV [10,11,12].

Dp-branes in string theory are phenomenologically interesting because their low energy effective field theory descriptions are p+1 dimensional gauge theories. To be more precise, they are gauge theories derivatively coupled to 9-p Goldstone bosons which arise from the fact that the D-brane breaks the translational symmetry of the bulk space. At momenta of order the brane tension, the gauge theory fields will start to excite brane fluctuations. These gauge theories can be chiral if we have broken enough supersymmetry. This has led to the intriguing braneworld hypothesis, the idea that perhaps the entire Standard Model gauge theory arises as the low energy limit of a configuration of branes in an appropriate bulk spacetime background [13,14,15,16,17]. This has not yet been conclusively demonstrated in string theory, but some attempts come encouragingly close.

7 Large extra dimensions
The large extra dimensions scenarios pioneered by Arkani-Hamed, Dimopoulos, and Dvali (ADD) [18] take a phenomenological effective field theory approach to the braneworld idea inspired by string theory. They assume that the Standard Model (SM) gauge theory is confined to propagate in a 3-dimensional domain wall, embedded in a higher dimensional compactified bulk space. Only gravity propagates in the full bulk space. Some number $n$ of the compactified extra dimensions are assumed to be "large", and these are assumed, for simplicity, to be circles (an $n$-torus) of common circumference $R$. One could consider other compact spaces, but this would increase the complexity of the analysis, just as in conventional KK theory.

We assume that we can use a low energy effective field theory description of this gravity + SM system, valid up to energies of order $M_*$, the effective Planck mass of the $4+n$ dimensional gravity theory. Note that this is a strong assumption, since $4+n$ dimensional gravity is nonrenormalizable, and in more complete models one may find that high energy gravity processes are "softened" by string or brane modes at energy scales well below $M_*$. With this set of assumptions we immediately obtain a simple relation between $M_{Pl}$, $M_*$, and the radii of the large extra dimensions [18]:

$$M_{Pl}^2 = M_*^{2+n}\, R^n . \tag{59}$$

Since $M_*$ is approximately the cutoff scale of our effective gravity + SM theory, it is tempting to imagine that its value may be as low as a few TeV. This would solve the Higgs naturalness problem of the Standard Model, by introducing a cutoff not too far above the electroweak scale, and introducing new physics (strong gravity, string and brane dynamics) at energies above this cutoff.

This scenario also recasts the hierarchy problem of the Standard Model: it replaces the question of why $M_{Pl}$ is so large compared to $M_Z$ with the question of why $R$ is so large compared to $1/M_Z$. In other words, an ultraviolet hierarchy problem is replaced with an infrared hierarchy problem.

How large are the extra dimensions of the ADD scenario? If we take the most optimistic case of $M_* = 1$ TeV and plug in the measured value of $M_{Pl}$, we find:

n    R
1    10^{…} m
2    1 mm
6    10 fermi
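The size estimates above follow from solving Eq. (59) for $R$. The sketch below does this numerically; the value used for $M_{Pl}$, the unit conversion constant, and the function name `add_size_m` are assumptions made for illustration, and factors of $2\pi$ or the choice between the Planck mass and the reduced Planck mass shift the results by O(1) to O(10), so only the orders of magnitude are meaningful.

```python
import math

HBARC_GEV_M = 1.973269804e-16   # hbar*c in GeV*m, converts 1/GeV to meters
M_PL_GEV = 1.22e19              # assumed value for the 4d Planck mass in GeV

def add_size_m(n, m_star_gev=1.0e3, m_pl_gev=M_PL_GEV):
    """Solve M_Pl^2 = M_*^(2+n) R^n for R and return R in meters."""
    r_in_inverse_gev = (m_pl_gev ** 2 / m_star_gev ** (2 + n)) ** (1.0 / n)
    return r_in_inverse_gev * HBARC_GEV_M

for n in (1, 2, 6):
    print(f"n = {n}: R ~ {add_size_m(n):.1e} m")
```

With these inputs the $n = 2$ case lands near the millimeter scale and the $n = 6$ case at tens of fermi, while $n = 1$ gives a size comparable to solar system distances.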
These are indeed large extra dimensions. Experiments at high energy particle colliders have probed the interactions and degrees of freedom of the Standard Model in enormous detail down to length scales below $10^{-16}$ cm (i.e., 1/1000 of a fermi). These experiments see no evidence whatsoever for the existence of Kaluza-Klein modes of any Standard Model fields. Thus we see the critical importance in ADD of the braneworld hypothesis, which implies that only gravity, not SM fields, probes the large extra dimensions at energy scales below the cutoff.

8 Gravity in large extra dimensions
Let's begin by reviewing the situation in 4d spacetime. Because gravity is a weak force on macroscopic length scales, it is usually sufficient to discuss its behavior in terms of the static Newtonian potential:

$$V(r) = -\frac{1}{M_{Pl}^2}\,\frac{m_1 m_2}{r} , \tag{60}$$

where $r$ is the 3d spatial separation between the static mass sources $m_1$ and $m_2$. The Newtonian potential is the Fourier transform, in the static nonrelativistic limit, of the matrix element for single graviton exchange between two point sources with stress-energy tensors $T_i^{\mu\nu}$ and $T_f^{\mu\nu}$ (see Fig. 1):

$$i\mathcal{M} = \left[-i T_i^{\mu\nu}(k_1, k_2)\right] \frac{i P_{\mu\nu\rho\lambda}}{p^2} \left[-i T_f^{\rho\lambda}(q_1, q_2)\right] . \tag{61}$$
The constant tensor $P_{\mu\nu\rho\lambda}$ in this expression can be written explicitly once we choose a gauge, i.e., once we fix a coordinate system. The most useful gauge choice is harmonic gauge, defined by imposing the following four conditions on the graviton field $h_{\mu\nu}$:

$$\partial_\nu h^{\nu}{}_{\mu} = \tfrac{1}{2}\,\partial_\mu h^{\nu}{}_{\nu} . \tag{62}$$
Figure 1: Graviton exchange between two point particles.
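In harmonic gauge we then have

$$P_{\mu\nu\rho\lambda} = \tfrac{1}{2}\left(\eta_{\mu\rho}\eta_{\nu\lambda} + \eta_{\mu\lambda}\eta_{\nu\rho} - \eta_{\mu\nu}\eta_{\rho\lambda}\right) . \tag{63}$$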
This tensor can be re-expressed in terms of transverse polarization tensors for the two source particles, as expected since it represents the propagation of a spin 2 graviton with two on-shell degrees of freedom.

Now we can repeat the above analysis for gravity with $4+n$ infinite flat dimensions. The Newtonian potential is given by:

$$V(r) = -\frac{1}{M_*^{2+n}}\,\frac{m_1 m_2}{r^{1+n}} , \tag{64}$$
where $r$ is the $(3+n)$-dimensional spatial separation between the static mass sources. This potential is the static nonrelativistic limit of the matrix element for single graviton exchange in the bulk, i.e., the full $4+n$ dimensional spacetime.

An important extra ingredient is that we assume that our point matter sources are confined to the brane, i.e., to a flat 3d slice of the full space. This has an important effect on our calculation of the matrix element, if we are at low enough energies that we can ignore fluctuations of the brane. The effect is that, if we turn off brane fluctuations, the couplings between brane matter and bulk gravitons do not conserve the components of momentum in the $n$ extra dimensions. To see this, let $p_i = (k_1 - k_2)$, $p_f = ($…

$$i\mathcal{M} = \int d^4x_i\, d^4x_f \left[-i T_i^{\mu\nu}\, e^{i p_i \cdot x_i}\right] \int d^{4+n}p\; \frac{i P_{\mu\nu\rho\lambda}}{p^2}\, e^{-i p \cdot (x_i - x_f)} \left[-i T_f^{\rho\lambda}\, e^{-i p_f \cdot x_f}\right] , \tag{65}$$

where $P_{\mu\nu\rho\lambda}$ are the 4d components of the $4+n$ dimensional constant tensor which generalizes Eq. (63):

$$P_{MNPQ} = \tfrac{1}{2}\left(\eta_{MP}\eta_{NQ} + \eta_{MQ}\eta_{NP}\right) - \frac{1}{2+n}\,\eta_{MN}\eta_{PQ} . \tag{66}$$
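As a quick consistency check on the index structure, the following sketch builds the $4+n$ dimensional tensor component by component and confirms that it reduces to the harmonic gauge tensor of Eq. (63) when $n = 0$. The mostly-minus metric signature and all function names are assumptions made for illustration.

```python
def flat_metric(dim):
    """Diagonal metric (+, -, ..., -) as a nested list (assumed signature)."""
    return [[(1.0 if a == 0 else -1.0) if a == b else 0.0 for b in range(dim)]
            for a in range(dim)]

def graviton_tensor(n):
    """Return (P, g, dim), where P(M, N, A, B) is the tensor of Eq. (66)."""
    dim = 4 + n
    g = flat_metric(dim)

    def P(M, N, A, B):
        return (0.5 * (g[M][A] * g[N][B] + g[M][B] * g[N][A])
                - g[M][N] * g[A][B] / (2.0 + n))

    return P, g, dim

def harmonic_gauge_tensor(g, M, N, A, B):
    """The 4d tensor of Eq. (63)."""
    return 0.5 * (g[M][A] * g[N][B] + g[M][B] * g[N][A] - g[M][N] * g[A][B])

def trace_last_pair(n, M, N):
    """Contract the last index pair with the inverse metric.

    For a diagonal metric with entries +1 and -1 the inverse equals the metric.
    """
    P, g, dim = graviton_tensor(n)
    return sum(P(M, N, A, B) * g[A][B] for A in range(dim) for B in range(dim))

# n = 0 should reproduce Eq. (63) for every choice of indices.
P0, g0, d0 = graviton_tensor(0)
matches_4d = all(
    abs(P0(M, N, A, B) - harmonic_gauge_tensor(g0, M, N, A, B)) < 1e-12
    for M in range(d0) for N in range(d0) for A in range(d0) for B in range(d0)
)
print("n = 0 reduces to Eq. (63):", matches_4d)
print("trace for n = 2, (M, N) = (0, 0):", trace_last_pair(2, 0, 0))
```

Analytically the contraction over the last index pair equals $-2\,\eta_{MN}/(2+n)$, which the trace helper can be used to confirm for any $n$.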
We observe that the integral $\int d^4x_i\, d^4x_f$ over the positions of the vertices only enforces 4d momentum conservation. Thus we get:

$$i\mathcal{M} = \left[-i T_i^{\mu\nu}\right] \int d^n p_t\; \frac{i P_{\mu\nu\rho\lambda}}{p_i^2 - |p_t|^2} \left[-i T_f^{\rho\lambda}\right] \delta^4(p_i - p_f) , \tag{67}$$

where $p_t$ are the $n$ components of the graviton momentum transverse to the brane. Physically what has happened here is that the presence of the brane spontaneously breaks the $n$-dimensional translation symmetry in the directions transverse to the brane. The result is qualitatively the same as bouncing a ball off of a wall: if we are not sensitive to the coherent dynamics of the wall itself, we see an apparent violation of momentum conservation.

To obtain the ADD scenario, we compactify the $n$ infinite flat extra dimensions onto an $n$-torus with common circumference $R$. This replaces the integral $\int d^n p_t$ in Eq. (67) by a sum over discrete KK momenta:

$$p_t = \frac{\vec{n}}{R} , \tag{68}$$
where $\vec{n}$ is a vector of integer KK mode numbers. Now we can regard the $4+n$ dimensional Newtonian potential Eq. (64) as arising from the exchange of the entire tower of KK graviton modes. Exchange of the zero mode graviton reproduces, of course, the 4d Newtonian potential Eq. (60) in the static low energy limit. Exchange of any given massive KK graviton mode will contribute a Yukawa-type potential, with a range determined by $|\vec{n}|/R$. Since the KK tower is in fact truncated for $|\vec{n}|/R > M_*$, a more accurate expression for the full Newtonian potential is given by the low energy expansion:

$$V(r) = -\frac{1}{M_{Pl}^2}\,\frac{m_1 m_2}{r}\left(1 + n\, e^{-r/R} + \frac{n(n+1)}{2}\, e^{-2r/R} + \cdots\right) . \tag{69}$$

Thus, to an observer on the brane probing the Newtonian potential, gravity looks 4d up to exponentially small corrections on distance scales larger than $R$. For distance scales of order $R$, one starts to detect the additional gravitational strength Yukawa interactions produced by exchange of the first few massive KK graviton modes. On short distance scales
scales. The Eöt-Wash group has already reported [20] a null result from their new experiment; they constrain the coefficients of the following modified Newtonian potential:

$$V(r) = -\frac{1}{M_{Pl}^2}\,\frac{m_1 m_2}{r}\left(1 + \alpha\, e^{-r/R}\right) . \tag{70}$$
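To get a feeling for how quickly a Yukawa-type correction of strength $\alpha$ and range $R$ switches off, the sketch below evaluates the ratio of the modified potential to the pure Newtonian one at a few separations. The function name and the sample numbers ($\alpha = 2$, a range of 200 microns) are illustrative assumptions, not a reproduction of the experimental analysis.

```python
import math

def yukawa_ratio(r, alpha, length):
    """Ratio V_modified / V_Newton = 1 + alpha * exp(-r / length)."""
    return 1.0 + alpha * math.exp(-r / length)

ALPHA = 2.0          # illustrative strength
LENGTH_M = 200e-6    # illustrative range: 200 microns, in meters

for multiple in (0.5, 1.0, 2.0, 5.0, 10.0):
    r = multiple * LENGTH_M
    deviation = yukawa_ratio(r, ALPHA, LENGTH_M) - 1.0
    print(f"r = {multiple:4.1f} R: fractional deviation = {deviation:.2e}")
```

The deviation falls from order one at $r \sim R$ to the percent level by $r \sim 5R$, which is why such tests need to probe separations comparable to the range itself.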
For the ADD scenario with $n = 2$ extra dimensions we would have $\alpha = 2$. For this value the experiment sets the following upper bound on the size $R$ of the extra dimensions:

$$R < 200 \ \text{microns} , \tag{71}$$

which is somewhat less than the value predicted if we had $M_* = 1$ TeV.

9 Collider tests of large extra dimensions
In the much broader family of ADD and ADD-like large extra dimensions scenarios, the size of the extra dimensions is not usually large enough to be accessible in table-top gravity experiments. On the other hand even $R \sim 10$ fermi is quite a large extra dimension by the standards of high energy physics. The question then arises of whether experiments at high energy particle colliders are sensitive to the effects of large extra dimensions, via the coupling of Standard Model particles to massive KK gravitons. At first glance the answer would seem to be a resounding "No!", since gravity corrections to SM processes are normally completely negligible. For example, initial and final state radiation of gravitons presumably occurs in SM processes, but the resulting corrections are suppressed by factors of $p^2/M_{Pl}^2$, where $p^\mu$ is the 4-momentum of the subprocess. In the ADD scenario, however, we now have the possibility of radiating any of a large number of massive KK graviton modes.
Figure 2: Tree-level diagrams contributing to real emission of a KK graviton (double curved line) in association with a gauge boson (single curved line).
There are on the order of $(ER)^n$ such modes which are kinematically accessible in a collider subprocess with energy $E$. Each of these KK modes couples to the stress-energy tensor of SM matter just like the massless graviton; the coupling is proportional to $1/M_{Pl}$. Thus the cross section for on-shell production of some massive KK graviton mode goes like:

$$\sigma_{KK} \sim \frac{1}{M_{Pl}^2}\,(ER)^n \sim \frac{E^n}{M_*^{n+2}} , \tag{72}$$
where in the second expression we have substituted Eq. (59). The exact expressions for these cross sections can be found in Refs. [21, 22, 4]. This is an exciting result, since it implies that quantum gravity effects at colliders may be suppressed only by powers of $M_*$, a scale which could be on the order of a TeV. Indeed we can search for these effects right now in existing colliders. The experimental signatures can be divided roughly into three categories:

• Missing Energy Signatures: Real emission of massive KK gravitons can occur at hadron colliders via processes that produce a KK graviton in association with a high $p_T$ gluon or quark jet. The KK graviton, once produced, leaves no trace in the detector, since it is only coupled to matter with gravitational strength. Thus the signature is a high energy monojet (or dijet, from initial and final state radiation), with large missing transverse energy. KK graviton production could also occur in association with a hard photon, with significant rates, at hadron or lepton colliders. These channels have been examined in existing data from LEP and the Tevatron [23, 24].

• Interference with SM Processes: Virtual exchanges of massive KK gravitons give new diagrams contributing to a variety of Standard Model processes [25, 26, 27]. The dominant effect will be the interference between these new diagrams and the SM ones. The typical signature will be an enhancement of high energy or large invariant mass events, accompanied by a distortion of angular distributions. The most promising processes to examine for these KK effects are those which are clean and have high rate: examples are Bhabha scattering at $e^+e^-$ colliders and Drell-Yan production at hadron colliders. Lower bounds on $M_*$ slightly exceeding 1 TeV have already been obtained by analyses of data from LEP, HERA, and the Tevatron [28, 29].

• Exotica: As collider subprocess energies reach and exceed the energy scale set by $M_*$, we approach the true realm of quantum gravity. At some point we expect the production of microscopic black holes, and we expect to see the effects of strings, brane fluctuations, or whatever is the new physics that softens (unitarizes) the behavior of strong gravity [30].

Figure 3: Mass distribution of KK graviton states, as they would be produced in $q\bar{q}$ annihilation at the Tevatron ($q + \bar{q} \to g + KK$, $E_{CM} = 2.0$ TeV; horizontal axis $M_{KK}$ in GeV) for $M_* = 1$ TeV in the ADD scenario with $n$ = 2, 4, or 6 extra dimensions. Plot courtesy of K. Matchev.

There are a number of theoretical and experimental challenges involved in the exploration of the possible signatures of quantum gravity and extra dimensions at colliders. Theoretically, one would like to flesh out the ADD scenario into full-fledged models. This is particularly important for understanding the virtual effects of KK gravitons, where one encounters divergences if one naively sums over the entire tower of massive KK states. Phenomenologically this is handled [25] by introducing a cutoff $\Lambda \sim M_*$, but a proper treatment would require specification of how the effective KK theory matches onto a more fundamental ultraviolet description like string theory. Experimental challenges include the difficulty of distinguishing a (weak) signal of KK gravitons from Standard Model backgrounds, and the issue of how to reconstruct the topography of the extra dimensions after an initial discovery. For example monojet production [31] at the Tevatron or LHC is a promising discovery channel for KK graviton production, but has large backgrounds, including irreducible physics backgrounds from processes like Z + jet with
the Z boson decaying invisibly to two neutrinos. The background issues are similar to those encountered in missing energy searches for squarks and gluinos [32].

For $M_* = 1$ TeV in the ADD scenario, the Tevatron will produce significant numbers of heavy KK gravitons, with masses averaging in the hundreds of GeV. This can be seen in Fig. 3, which convolves the density of states of KK gravitons (which increases with mass) and the parton distribution functions for the initial state quarks (which decrease rapidly with energy). The mass distribution peaks at a higher value for $n = 6$ extra dimensions, because the density of KK states is a more rapidly increasing function than for $n = 2$. Unfortunately, as seen in Fig. 4, this difference does not show up in the missing transverse energy (MET) distribution, which is what you actually observe in the experiment. This is due to two competing effects: on the one hand the heavier KK gravitons of the $n = 6$ case would tend to have larger transverse energy, but on the other hand, because the parton distribution functions are decreasing rapidly, the heavier KK gravitons are more likely to be produced near threshold. The two effects tend to cancel, leaving nearly identical MET distributions for the different cases.

As this example shows, much of the collider phenomenology of KK gravitons depends only on the KK graviton density of states. Thus we can discuss simultaneously a much larger class of extra dimensions scenarios than ADD, in terms of the effective density of states $\rho(m)$:

$$\frac{d^2\sigma}{dt\, dm} = \rho(m)\, \frac{d\sigma_m}{dt} . \tag{73}$$

For the simple ADD case $\rho(m)$ only depends on 2 parameters:

$$\rho(m) = R^n\, \Omega_{n-1}\, m^{n-1} , \tag{74}$$
where $\Omega_{n-1}$ is the solid angle in $n$ dimensions. Many generalizations and modifications of ADD are possible, leading to modifications in the density of states. These include:

• Other compactifications [33].

• Fat branes or other form factor effects [34].

• Wave function factors due to warped geometries, as in Randall-Sundrum theory [35].

There are important astrophysical constraints on ADD scenarios [36]. These constraints depend on the relative density of states at the lower end of the KK graviton spectrum, since only the lighter KK states are kinematically accessible in astrophysical phenomena. This again shows the importance to experimental searches of developing theoretically robust classes of models.

Figure 4: Distribution of missing transverse energy (in GeV) due to undetected KK graviton states, as they would be produced in $q\bar{q}$ annihilation at the Tevatron ($q + \bar{q} \to g + KK$, $E_{CM} = 2.0$ TeV) for $M_* = 1$ TeV in the ADD scenario with $n$ = 2, 4, or 6 extra dimensions. Plot courtesy of K. Matchev.

10 Randall-Sundrum Theory
Randall and Sundrum [35] studied an effective field theory picture in which a five-dimensional spacetime contains strongly gravitating three-branes which then produce a warped or nonfactorizable geometry. With a particular tuning of the bulk (negative) cosmological constant and the brane tension, the induced geometry on the three-branes is just flat four-dimensional Minkowski space. One can also induce cosmological geometries on the three-branes by varying these parameters [37, 38, 39, 40]. There is a massless four-dimensional graviton, but its wave function is localized in the warped geometry of the extra dimension. In this construction, the size of the extra dimension is unconstrained: it could be either very small or arbitrarily large. This scenario can also be extended to cases with more than one extra dimension [41].

Consider a five-dimensional nonfactorizable background geometry whose metric takes the following form in Poincaré coordinates:

$$ds^2 = e^{-2A(y)}\, \eta_{\mu\nu}\, dx^\mu dx^\nu + dy^2 , \tag{75}$$

where we have now reverted to the relativists' metric signature diag(-1,1,1,1,1), $y$ is the coordinate of the 5th dimension, and $A(y)$ is the warp factor. Note that this geometry respects 4d Poincaré invariance, but 4d invariants like masses will be rescaled by the warp factor as we move in the 5th dimension. We will restrict our attention to geometries which are reflection symmetric around $y = 0$. Thus we are essentially considering a $Z_2$ orbifold (which may be either compact or noncompact). We will also be interested in the case where the geometry is asymptotically anti-de Sitter (AdS), i.e. for large $|y|$,

$$A(y) \to k|y| , \tag{76}$$
where $k$ is the inverse of the AdS radius of curvature. The simplest example of a brane setup which produces a background geometry of this type is to have a number of three-branes with positive tension superimposed at $y = 0$. We refer to these branes collectively as the "Planck brane," and we will designate this simple setup as the RS2 model. We consider the five-dimensional gravity action

$$S = S_{bulk} + S_{brane} , \tag{77}$$

with

$$S_{bulk} = \int d^4x\, dy\, \sqrt{-g}\, \left(2 M_5^3 R - \Lambda\right) , \qquad S_{brane} = -V_P \int d^4x\, \sqrt{-g^{(P)}} ,$$

which should be considered the leading terms in a low-energy effective action. Here, $g_{MN}$ ($M, N = 0, \ldots, 4$) is the five-dimensional metric, while $g^{(P)}_{\mu\nu}$ ($\mu, \nu = 0, \ldots, 3$) is the induced metric on the Planck brane. Also, $M_5$, $\Lambda$ and $V_P$ denote the five-dimensional Planck scale, the (negative) bulk cosmological constant and the (total) brane tension, respectively. Finding a solution which is Poincaré invariant in four dimensions requires that the tension $V_P$ is tuned relative to the cosmological constant $\Lambda$. That is, we set $V_P = \sqrt{24 M_5^3 |\Lambda|}$, as in [35]. The solution of the five-dimensional Einstein equations in this RS setup is then given by the metric (75), with $A(y) = k|y|$, where

$$k^2 = -\frac{\Lambda}{24 M_5^3} . \tag{78}$$
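The tuning condition and Eq. (78) together imply $V_P = 24 M_5^3 k$, which is easy to confirm numerically. The sketch below is a minimal check under the conventions quoted above; the function names and the sample input values are illustrative assumptions.

```python
import math

def rs_curvature(m5, bulk_cc):
    """Return k from k^2 = -Lambda / (24 M_5^3); requires Lambda < 0."""
    if bulk_cc >= 0:
        raise ValueError("this solution needs a negative bulk cosmological constant")
    return math.sqrt(-bulk_cc / (24.0 * m5 ** 3))

def tuned_tension(m5, bulk_cc):
    """Return the brane tension V_P = sqrt(24 M_5^3 |Lambda|) of the flat-brane tuning."""
    return math.sqrt(24.0 * m5 ** 3 * abs(bulk_cc))

# Sample values in arbitrary units (hypothetical inputs).
m5, bulk_cc = 2.0, -3.0
k = rs_curvature(m5, bulk_cc)
print("k =", k)
print("V_P =", tuned_tension(m5, bulk_cc))
print("24 M_5^3 k =", 24.0 * m5 ** 3 * k)
```

The last two printed numbers agree, showing that the tension can be traded for the curvature scale $k$ once the tuning is imposed.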
This background geometry is simply two AdS regions glued together along the surface $y = 0$, with the Planck brane supporting the appropriate discontinuity in the extrinsic curvature across the gluing surface.

11 Graviton Modes
When linearized metric fluctuations are included, the geometry takes the form

$$ds^2 = \left(e^{-2A(y)}\, \eta_{\mu\nu} + h_{\mu\nu}\right) dx^\mu dx^\nu + dy^2 . \tag{79}$$

We choose a gauge where $\partial^\mu h_{\mu\nu} = h^\mu{}_\mu = 0$. We will not consider the metric fluctuations $h_{55}$ and $h_{5\mu}$ (which are pure gauge for the case where $y$ has an infinite range). It is useful to define a conformal coordinate $z$ by

$$z = \mathrm{sgn}(y)\, \frac{e^{k|y|} - 1}{k} . \tag{80}$$
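A short numerical sketch of this change of variables may help: it maps the proper distance $y$ to the conformal coordinate $z$ and back, and checks that the warp factor $e^{-k|y|}$ becomes $1/(k|z| + 1)$ in terms of $z$. It assumes the RS2 warp function $A(y) = k|y|$; the function names are hypothetical.

```python
import math

def z_of_y(y, k):
    """Conformal coordinate z = sgn(y) (exp(k|y|) - 1) / k."""
    return math.copysign(math.expm1(k * abs(y)) / k, y)

def y_of_z(z, k):
    """Inverse map y = sgn(z) log(k|z| + 1) / k."""
    return math.copysign(math.log1p(k * abs(z)) / k, z)

k = 0.7
for y in (-3.0, -0.5, 0.0, 0.5, 3.0):
    z = z_of_y(y, k)
    warp_from_y = math.exp(-k * abs(y))
    warp_from_z = 1.0 / (k * abs(z) + 1.0)
    print(f"y = {y:5.2f}  z = {z:8.4f}  warp: {warp_from_y:.6f} vs {warp_from_z:.6f}")
```

Because $z$ grows exponentially with $y$, a modest proper distance from the brane corresponds to a very large conformal distance.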
Now solve the linearized Einstein equations for $h_{\mu\nu}$ with separation of variables, using an ansatz of the form:

$$h_{\mu\nu} = e^{ip\cdot x}\, e^{-A/2}\, \psi_m(z)\, \epsilon_{\mu\nu} .$$

Here $\epsilon_{\mu\nu}$ is a constant polarization tensor. The four-dimensional profile of these solutions is a plane wave with an effective four-dimensional mass: $m^2 = -p^2$. Solving the linearized equations is then reduced to a one-dimensional Schrödinger problem:

$$\left[-\tfrac{1}{2}\,\partial_z^2 + V(z)\right] \psi_m(z) = \tfrac{1}{2}\, m^2\, \psi_m(z) , \tag{81}$$

where the potential $V(z)$ is given by

$$V(z) = \frac{15 k^2}{8 (k|z| + 1)^2} - \frac{3k}{2}\, \delta(z) . \tag{82}$$
Figure 5: The volcano potential of the analog 1d Schrödinger problem. Plotted also are (a) the profile of the graviton zero mode wavefunction, and (b) a typical massive KK graviton mode.
This is the "volcano potential" shown in Fig. 5. With these definitions, the natural norm for the profile in the fifth dimension is simply $\int dz\, |\psi_m(z)|^2 = 1$. The zero mode graviton solution is given by

$$\psi_0(z) = \frac{k^{1/2}}{\left[k|z| + 1\right]^{3/2}} . \tag{83}$$
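Two properties of the zero mode can be checked numerically: that it has unit norm, and that away from the brane it solves the Schrödinger problem of Eq. (81) with $m = 0$. The sketch below does both with a trapezoid rule and a central finite difference; the delta function at $z = 0$ is deliberately left out, and the function names, grid sizes, and cutoff are illustrative assumptions.

```python
import math

def psi0(z, k):
    """Zero mode profile sqrt(k) / (k|z| + 1)^(3/2)."""
    return math.sqrt(k) / (k * abs(z) + 1.0) ** 1.5

def volcano_bulk(z, k):
    """Smooth part of the volcano potential (delta function at z = 0 omitted)."""
    return 15.0 * k * k / (8.0 * (k * abs(z) + 1.0) ** 2)

def norm_psi0(k, z_max=200.0, steps=100000):
    """Trapezoid estimate of the integral of |psi0|^2 over [-z_max, z_max]."""
    h = z_max / steps
    total = 0.5 * (psi0(0.0, k) ** 2 + psi0(z_max, k) ** 2)
    for i in range(1, steps):
        total += psi0(i * h, k) ** 2
    return 2.0 * h * total   # factor 2 from the reflection symmetry z -> -z

def schrodinger_residual(z, k, h=1e-3):
    """Evaluate -psi''/2 + V psi at z != 0; should vanish for the zero mode."""
    second = (psi0(z + h, k) - 2.0 * psi0(z, k) + psi0(z - h, k)) / h ** 2
    return -0.5 * second + volcano_bulk(z, k) * psi0(z, k)

print("norm:", norm_psi0(1.0))
print("residual at z = 1:", schrodinger_residual(1.0, 1.0))
```

The norm differs from one only by the small tail beyond the cutoff, and the residual is zero up to discretization error.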
This zero mode is the usual single bound state mode of a delta function potential; it is a threshold bound state (i.e. a zero mode) because of our tuning of the 4d cosmological constant. In terms of the proper distance $y$, the zero mode graviton wavefunction is exponentially strongly peaked around the location of the brane. Thus the zero mode graviton is normalizable and localized at the brane. This zero mode will mediate 4d gravity for observers on the brane. The Planck mass is determined in terms of the fundamental parameters:

$$M_{Pl}^2 = \frac{M_5^3}{k} . \tag{84}$$
In addition to the zero mode, there is a whole continuum of massive KK graviton modes. Their wavefunctions can be expressed in terms of Bessel functions:

$$\psi_m(z) \propto \sqrt{x}\left[Y_2(x) + \frac{4k^2}{\pi m^2}\, J_2(x)\right] , \tag{85}$$

where

$$x = m\left(|z| + \frac{1}{k}\right) . \tag{86}$$

For large $x$, the Bessel functions oscillate like plane waves, and the continuum KK modes are mocking up 5d gravity in an AdS background. However we are interested in gravity as observed on the brane, for which the relevant limit is $x \ll 1$:

$$J_2(x) \sim \frac{x^2}{8} , \tag{87}$$

$$\psi_m(z) \sim \sqrt{\frac{m}{k}} . \tag{88}$$

This behavior implies that the correction to the Newtonian potential as measured on the brane is

$$\Delta V(r) \sim \int dm\, \frac{e^{-mr}}{r}\, \frac{|\psi_m|^2}{|\psi_0|^2} \sim \frac{1}{r}\left(\frac{1}{k^2 r^2}\right) . \tag{89}$$
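The scaling of this correction can be reproduced with a one-line integral. Taking $|\psi_m|^2 \sim m/k$ at the brane and $|\psi_0|^2 = k$ there, the ratio of wavefunctions is $m/k^2$, and the integral over $m$ gives $1/(k^2 r^3)$. The sketch below checks this numerically; it drops all O(1) factors, and the function names, cutoff, and step count are illustrative assumptions.

```python
import math

def kk_correction(r, k, cutoff_factor=50.0, steps=100000):
    """Trapezoid estimate of the integral over m of exp(-m r)/r * m/k^2.

    The upper limit cutoff_factor / r stands in for infinity; the integrand
    is exponentially small there.
    """
    m_max = cutoff_factor / r
    h = m_max / steps
    total = 0.5 * m_max * math.exp(-m_max * r)   # endpoint at m_max (m = 0 gives 0)
    for i in range(1, steps):
        m = i * h
        total += m * math.exp(-m * r)
    return h * total / (k * k * r)

def kk_correction_exact(r, k):
    """Closed form 1 / (k^2 r^3) of the same integral."""
    return 1.0 / (k * k * r ** 3)

for r in (1.0, 2.0, 10.0):
    print(r, kk_correction(r, 1.0), kk_correction_exact(r, 1.0))
```

Relative to the leading $1/r$ potential the correction is of order $1/(kr)^2$, which is tiny at macroscopic distances when $k$ is near the Planck scale.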
This correction is highly suppressed if, e.g., $k \sim M_{Pl}$. This is easily understood from Fig. 5: the wavefunctions of the massive KK gravitons are unsuppressed for large $x$, but they are exponentially suppressed at the brane, due to the necessity of tunneling through the sides of the volcano potential. The amazing conclusion of this analysis is that an observer on the brane sees 4d gravity even though there is an infinite extra dimension.

What is this good for? Well, Randall-Sundrum theory has many simple extensions which have interesting phenomenological properties. For example, we can add a second brane, located at a distance $z \sim \mathrm{TeV}^{-1}$ from the original brane. In this setup we refer to the new brane as the "TeV brane", and the original brane as the "Planck brane". Because of the warp factor, if our effective field theory is cut off at the Planck scale as seen on the Planck brane, then the cutoff will be at the TeV scale on the TeV brane. This scenario can be analyzed rather easily in the approximation where the TeV brane is a probe brane, i.e. where we ignore the backreaction of the TeV brane on the metric [42]. A complete analysis of the 5d graviton propagator [43] shows that, as seen on the TeV brane, the leading correction to the Newtonian potential is:

$$M_{Pl}^2\, V(r) = -\frac{1}{r}\left(1 + \frac{k^2 z^8}{r^6} + \ldots\right) . \tag{90}$$
These effects (due to the continuum of massive KK modes) are only TeV suppressed, even for $k \sim M_{Pl}$. In fact this scenario has collider signatures which closely resemble those of the ADD scenario with $n = 6$. The cross section for massive graviton production goes like

$$\sigma \sim \frac{k^2}{M_{Pl}^2}\, E^6 z^8 , \tag{91}$$

where $E$ is the subprocess energy scale. This interesting behavior has a simple explanation in terms of the holographic duality of the AdS/CFT correspondence [44, 45, 46].

Randall-Sundrum models with TeV branes [35, 42] offer a qualitatively new solution of the hierarchy problem of the Standard Model, by reducing the ultraviolet cutoff of the SM to ~ a TeV. In these scenarios the weakness of gravity as measured on our brane is due to a wavefunction suppression, not a large dimensionful coupling constant.

12 Gravity in a box
There are many possible extensions of the basic ADD and Randall-Sundrum scenarios, so many that I will not attempt a coherent summary here. Instead I will briefly describe one brane setup [47] which interpolates between these two classes of scenarios, with a single extra dimension. This five-dimensional construction contains a nonfactorizable geometry which is asymptotically AdS, as in the Randall-Sundrum scenario. However, as in the framework of ADD, there is a completely flat region of finite width, "the box", bounded by three-branes. By varying the relative scales of the flat box and the AdS regions, this setup interpolates between a limit in which it reproduces Randall-Sundrum and another where it yields the ADD scenario.

A convenient starting point is the Randall-Sundrum scenario discussed above, with the Planck brane and an infinite (but $Z_2$ symmetric) fifth dimension. Now imagine splitting the Planck brane into two branes and pulling them away symmetrically from $y = 0$ to $y = \pm y_0$. We require that the region of the bulk space between the branes (i.e. with $|y| < y_0$) is in a new vacuum where the bulk cosmological constant vanishes. Then the three-branes at $y = \pm y_0$ remain flat with the same tuning of the brane tensions as in Randall-Sundrum. The solution of the 5d Einstein equations becomes the metric Eq. (75) with

$$A(y) = \tfrac{1}{2}\, k|y - y_0| + \tfrac{1}{2}\, k|y + y_0| - k y_0 . \tag{92}$$
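The warp function of this setup is simple enough to tabulate directly. The sketch below evaluates it and shows that it vanishes inside the box, grows linearly outside, and reduces to the Randall-Sundrum form $k|y|$ when the box shrinks to zero width. The function name and sample values are illustrative assumptions.

```python
def warp_box(y, k, y0):
    """Warp function A(y) = k|y - y0|/2 + k|y + y0|/2 - k*y0 of the box setup."""
    return 0.5 * k * abs(y - y0) + 0.5 * k * abs(y + y0) - k * y0

k, y0 = 2.0, 1.0
for y in (-3.0, -1.0, -0.3, 0.0, 0.3, 1.0, 3.0):
    print(f"y = {y:5.2f}  A(y) = {warp_box(y, k, y0):.3f}")
```

For $|y| > y_0$ the output equals $k(|y| - y_0)$, i.e. an AdS region that starts at the edge of the box.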
The resulting picture is then a flat "box" glued between two AdS regions. The linearized field equation for the graviton reduces to a one-dimensional Schrödinger problem, Eq. (81), with a potential given by

$$V(z) = \frac{15 k^2}{8\,[k(|z| - z_0) + 1]^2}\, \theta(|z| - z_0) - \frac{3k}{4}\left[\delta(z - z_0) + \delta(z + z_0)\right] , \tag{93}$$

where $z$ is the conformal coordinate defined by Eq. (80). As in Randall-Sundrum, there is a normalizable graviton zero mode which produces 4d gravity for observers on either the Planck or TeV branes. The zero mode is just a constant inside the box, and falls off exponentially fast in $y$ outside of the box:

$$\psi_0(z) = \begin{cases} B_0 & \text{for } |z| \le z_0 , \\ B_0\, (k\tilde{z})^{-3/2} & \text{for } |z| > z_0 , \end{cases} \tag{94}$$

where $\tilde{z} = |z| - z_0 + 1/k$, and $B_0$ is a normalization constant:

$$B_0 = \left(\frac{k}{2kz_0 + 1}\right)^{1/2} . \tag{95}$$

Because the zero mode is spread out in the box, the resulting effective 4d gravity is weaker compared to Randall-Sundrum, i.e., the 4d Planck scale is enhanced:

$$M_{Pl}^2 = \frac{M_5^3}{k}\, (2kz_0 + 1) . \tag{96}$$
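The normalization constant and the Planck scale enhancement are tied together: the factor $(2kz_0 + 1)$ is exactly what is needed for the zero mode to have unit norm. The sketch below checks this with a trapezoid rule; the function names, cutoff, and grid are illustrative assumptions.

```python
import math

def box_zero_mode(z, k, z0):
    """Zero mode: constant B0 inside the box, B0 (k z_tilde)^(-3/2) outside."""
    b0 = math.sqrt(k / (2.0 * k * z0 + 1.0))
    if abs(z) <= z0:
        return b0
    z_tilde = abs(z) - z0 + 1.0 / k
    return b0 * (k * z_tilde) ** -1.5

def box_norm(k, z0, z_max=200.0, steps=100000):
    """Trapezoid estimate of the integral of |psi0|^2 over [-z_max, z_max]."""
    h = z_max / steps
    total = 0.5 * (box_zero_mode(0.0, k, z0) ** 2 + box_zero_mode(z_max, k, z0) ** 2)
    for i in range(1, steps):
        total += box_zero_mode(i * h, k, z0) ** 2
    return 2.0 * h * total

def planck_mass_squared(m5, k, z0):
    """M_Pl^2 = (M_5^3 / k) (2 k z0 + 1)."""
    return m5 ** 3 / k * (2.0 * k * z0 + 1.0)

print("norm with box:", box_norm(1.0, 2.0))
print("enhancement over z0 = 0:",
      planck_mass_squared(1.0, 1.0, 2.0) / planck_mass_squared(1.0, 1.0, 0.0))
```

Setting $z_0 = 0$ recovers the Randall-Sundrum zero mode and the Planck mass relation $M_{Pl}^2 = M_5^3/k$.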
There is again a continuum of massive graviton modes. Some fraction of these modes have enhanced support inside the box; these resonant modes mock up the discrete mode spectrum that we expect for a box (and for ADD). If we introduce a probe TeV brane a distance $z \sim \mathrm{TeV}^{-1}$ from the Planck brane, then observers on the TeV brane will see 4d gravity plus interesting collider effects from the massive gravity modes. Indeed, because of the relative zero mode suppression discussed above, the cross section for production of massive graviton modes is enhanced by a factor proportional to the size of the box:

$$\sigma \sim (2kz_0 + 1)\, \frac{k^2}{M_{\rm Planck}^2}\, E^6 z^8 . \tag{97}$$

13 Outlook
I have only touched the surface of what has become a major new branch of "beyond-the-Standard-Model" physics. The physics of extra dimensions may, if we are lucky, be directly accessible at high energy colliders, in which case we have many glorious discoveries awaiting us. The physics of extra dimensions may also be accessible indirectly, through its influences on supersymmetry breaking, flavor structure of the SM, the cosmological constant, black holes, gravitational radiation, etc. By developing the phenomenology of extra dimensions, we are also gaining a toehold on the stickier problem of establishing a phenomenology of string theory. In this way we are moving towards a deep understanding of the structure of space and time, and of our own place in the larger universe of extra dimensions.

Acknowledgements

These lectures were inspired by many interactions with experts in the field, including Nima Arkani-Hamed, Savas Dimopoulos, Gia Dvali, Tao Han, JoAnne Hewett, Greg Landsberg, John March-Russell, Rob Myers, Michael Peskin, Lisa Randall, Tom Rizzo, Martin Schmaltz, Maria Spiropulu, Raman Sundrum, Jing Wang, and Ren-jie Zhang. I would like to thank the organizers of the TASI 2000 school for their hospitality, and thank the students of the school for stimulating questions and comments. Research supported by the U.S. Department of Energy Grant DE-AC02-76CHO3000.
References

1. S. Weinberg, "Gravitation and Cosmology", Wiley, 1972.
2. C. W. Misner, K. S. Thorne and J. A. Wheeler, "Gravitation", Freeman, 1970.
3. The original papers are reprinted in: T. Appelquist, A. Chodos, and P. G. O. Freund, "Modern Kaluza-Klein Theories", Addison-Wesley, 1987.
4. T. Han, J. D. Lykken and R. Zhang, Phys. Rev. D 59, 105006 (1999).
5. P. Van Nieuwenhuizen, Phys. Rept. 68, 189 (1981).
6. M. B. Green, J. H. Schwarz, and E. Witten, "Superstring Theory", Cambridge Univ. Press, 1987 (Cambridge Monographs on Mathematical Physics); J. Polchinski, "String Theory", Cambridge Univ. Press, 1998 (Cambridge Monographs on Mathematical Physics).
7. J. Polchinski, Phys. Rev. Lett. 75, 4724 (1995).
8. J. Polchinski, "TASI lectures on D-branes," hep-th/9611050.
9. E. B. Bogomolny, Sov. J. Nucl. Phys. 24, 449 (1976) [Yad. Fiz. 24, 861 (1976)]; M. K. Prasad and C. M. Sommerfield, Phys. Rev. Lett. 35, 760 (1975).
10. J. D. Lykken, Phys. Rev. D 54, 3693 (1996); I. Antoniadis, Phys. Lett. B 246, 377 (1990).
11. I. Antoniadis, N. Arkani-Hamed, S. Dimopoulos and G. Dvali, Phys. Lett. B 436, 257 (1998).
12. I. Antoniadis and K. Benakli, Int. J. Mod. Phys. A 15, 4237 (2000) [hep-ph/0007226].
13. A. Lukas, B. A. Ovrut, K. S. Stelle and D. Waldram, Phys. Rev. D 59, 086001 (1999).
14. L. E. Ibanez, "New perspectives in string phenomenology from dualities", Talk at Trieste Conference on Phenomenological Aspects of Superstring Theories (PAST97), Trieste, Italy, 2-4 Oct 1997, hep-ph/9804236.
15. G. Shiu and S. H. Tye, Phys. Rev. D 58, 106007 (1998).
16. Z. Kakushadze and S. H. Tye, Nucl. Phys. B 548, 180 (1999).
17. J. Lykken, S. Trivedi and E. Poppitz, "Chiral gauge theories and D-branes", Talk at Trieste Conference on Phenomenological Aspects of Superstring Theories (PAST97), Trieste, Italy, 2-4 Oct 1997; Nucl. Phys. B 543, 105 (1999).
18. N. Arkani-Hamed, S. Dimopoulos and G. Dvali, Phys. Lett. B 429, 263 (1998); Phys. Rev. D 59, 086004 (1999).
19. J. C. Long, H. W. Chan and J. C. Price, Nucl. Phys. B 539, 23 (1999).
20. C. D. Hoyle, U. Schmidt, B. R. Heckel, E. G. Adelberger, J. H. Gundlach, D. J. Kapner and H. E. Swanson, Phys. Rev. Lett. 86, 1418 (2001) [hep-ph/0011014].
21. G. F. Giudice, R. Rattazzi and J. D. Wells, Nucl. Phys. B 544, 3 (1999).
22. E. A. Mirabelli, M. Perelstein and M. E. Peskin, Phys. Rev. Lett. 82, 2236 (1999).
23. M. Acciarri et al. [L3 Collaboration], Phys. Lett. B 470, 268 (1999).
24. P. Onyisi, "Limits on extra dimensions and new particle production in the γ + X signature at CDF", to appear.
25. J. L. Hewett, Phys. Rev. Lett. 82, 4765 (1999).
26. T. G. Rizzo, Phys. Rev. D 59, 115010 (1999).
27. K. Cheung and G. Landsberg, Phys. Rev. D 62, 076003 (2000).
28. G. Abbiendi et al. [OPAL Collaboration], Eur. Phys. J. C 13, 553 (2000); M. Acciarri et al. [L3 Collaboration], Phys. Lett. B 470, 281 (1999); P. Abreu et al. [DELPHI Collaboration], Phys. Lett. B 485, 45 (2000); R. Barate et al. [ALEPH Collaboration], Eur. Phys. J. C 12, 183 (2000); C. Adloff et al. [H1 Collaboration], Phys. Lett. B 479, 358 (2000).
29. B. Abbott et al. [D0 Collaboration], Phys. Rev. Lett. 86, 1156 (2001).
30. S. Nussinov and R. Shrock, Phys. Rev. D 59, 105002 (1999); E. Accomando, I. Antoniadis and K. Benakli, Nucl. Phys.
B 579, 3 (2000); S. Cullen, M. Perelstein and M. E. Peskin, Phys. Rev. D 62, 055012 (2000); P. Creminelli and A. Strumia, Nucl. Phys. B 596, 125 (2001).
31. The comments which follow are distilled from discussions with Maria Spiropulu.
32. M. Spiropulu, "A blind search for supersymmetry in $p\bar{p}$ collisions at $\sqrt{s} = 1.8$ TeV using the missing energy plus multijet channel", Ph.D. Thesis, Harvard University (2000).
33. N. Kaloper, J. March-Russell, G. D. Starkman and M. Trodden, Phys. Rev. Lett. 85, 928 (2000).
34. A. De Rujula, A. Donini, M. B. Gavela and S. Rigolin, Phys. Lett. B 482, 195 (2000).
35. L. Randall and R. Sundrum, Phys. Rev. Lett. 83, 3370 (1999); Phys. Rev. Lett. 83, 4690 (1999).
36. S. Cullen and M. Perelstein, Phys. Rev. Lett. 83, 268 (1999); L. J. Hall and D. Smith, Phys. Rev. D 60, 085008 (1999); V. Barger, T. Han, C. Kao and R. J. Zhang, Phys. Lett. B 461, 34 (1999).
37. N. Kaloper, Phys. Rev. D 60, 123506 (1999) [hep-th/9905210]; T. Nihei, Phys. Lett. B 465, 81 (1999); J. M. Cline, C. Grojean and G. Servant, Phys. Rev. Lett. 83, 4245 (1999); H. B. Kim and H. D. Kim, Phys. Rev. D 61, 064003 (2000); P. Kanti, I. I. Kogan, K. A. Olive and M. Pospelov, Phys. Lett. B 468, 31 (1999).
38. P. Binetruy, C. Deffayet, U. Ellwanger and D. Langlois, Phys. Lett. B 477, 285 (2000).
39. C. Csaki, M. Graesser, L. Randall and J. Terning, Phys. Rev. D 62, 045015 (2000); C. Csaki, M. Graesser, C. Kolda and J. Terning, Phys. Lett. B 462, 34 (1999).
40. A. Karch and L. Randall, Int. J. Mod. Phys. A 16, 780 (2001).
41. N. Arkani-Hamed, S. Dimopoulos, G. Dvali and N. Kaloper, Phys. Rev. Lett. 84, 586 (2000).
42. J. Lykken and L. Randall, JHEP 0006, 014 (2000).
43. S. B. Giddings, E. Katz and L. Randall, JHEP 0003, 023 (2000).
44. S. S. Gubser, Phys. Rev. D 63, 084017 (2001).
45. S. B. Giddings and E. Katz, "Effective theories and black hole production in warped compactifications," hep-th/0009176.
46. N. Arkani-Hamed, M. Porrati and L. Randall, "Holography and phenomenology," hep-th/0012148.
47. J. Lykken, R. C. Myers and J. Wang, JHEP 0009, 009 (2000).
ISBN 981-02-4562-9
www.worldscientific.com