Physics Reports 301 (1998) 9—43
Three basic issues concerning interface dynamics in nonequilibrium pattern formation Wim van Saarloos Instituut-Lorentz, Leiden University, P.O. Box 9506, 2300 RA Leiden, Netherlands
Abstract
In these lecture notes, we discuss at an elementary level three themes concerning interface dynamics that play a role in pattern forming systems: (i) We briefly review three examples of systems in which the normal growth velocity is proportional to the gradient of a bulk field which itself obeys a Laplace or diffusion type of equation (solidification, viscous fingers and streamers), and then discuss why the Mullins-Sekerka instability is common to all such gradient systems. (ii) Secondly, we discuss how an effective interface or moving boundary description of systems with smooth fronts or transition zones rests on the assumption that the relaxation time of the appropriate order parameter field(s) in the front region is much smaller than the time scale of the evolution of interfacial patterns. Using standard arguments we illustrate that this is generally so for fronts that separate two (meta)stable phases: in such cases, the relaxation is typically exponential, and the relaxation time in the usual models goes to zero in the limit in which the front width vanishes. (iii) We finally summarize recent results that show that so-called “pulled” or “linear marginal stability” fronts which propagate into unstable states have a very slow universal power-law relaxation. This slow relaxation makes the usual “moving boundary” or “effective interface” approximation for problems with thin fronts, like streamers, impossible. © 1998 Elsevier Science B.V. All rights reserved.
PACS: 47.54.+r; 68.35.Rh; 47.20.Ky; 02.30.Jr
Keywords: Pattern formation; Interface dynamics; Relaxation
0. Introduction
In this course, and in the two related lectures by Ebert and Brener on their work in [1,2], some basic features of the dynamics of growing interfaces in systems which spontaneously form nonequilibrium patterns will be discussed. The analysis of such growth patterns has been an active field of research in the last decade. Moreover, the field is quite diverse, with examples coming from various (sub)disciplines within physics, materials science, and even biology: combustion, convection, crystal growth, chemical waves in excitable media and the formation of Turing patterns, dielectric breakdown, fracture, morphogenesis, etc. We therefore cannot hope to review the whole field, but instead will content ourselves with addressing three rather basic topics which we consider to be of broad interest, in that they appear (in disguise) in many areas of physics and in some of the related fields. These three themes are explained below.
0370-1573/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII: S0370-1573(98)00004-0
Our first theme concerns the generality of interfacial growth problems in which the normal growth velocity $v_n$ is proportional to the gradient $\nabla U$ of a bulk field $U$ ($v_n \sim \nabla U$), and the associated long wavelength instability of such interfaces. As we shall see, one important class of interfacial growth problems with these properties is diffusion-limited growth: either the interface grows through the accretion of material via diffusion through one of the adjacent bulk phases, or the growth of the interface is limited by the speed with which, e.g., heat can be transported away from the interface through diffusion. Since the diffusion current near the interface is proportional to the gradient of the appropriate field in the bulk (a concentration field, the temperature, etc.), $v_n \sim \nabla U$ in such growth problems. But these are not the only ways in which one can have an interface velocity proportional to $\nabla U$. As we will discuss, in viscous fingering an air bubble displaces fluid between two closely spaced plates; as the fluid velocity between the plates is proportional to the gradient of a pressure field $p$, the fact that the air displaces all the fluid means that again at the interface $v_n \sim \nabla p$. Likewise, for an ionization front the interface velocity is determined mostly by the drift velocity of electrons in the electric field $\mathbf{E} = -\nabla U$, where now $U$ is the electrical potential, and so again $v_n \sim |\nabla U|$. As we shall see, all such interfacial growth problems where the bulk field itself obeys a Laplace or diffusion equation exhibit a long wavelength instability, the so-called Mullins-Sekerka instability [3]. This instability lies at the origin of the formation of many nontrivial growth patterns, and Brener and Ebert discussed examples of these at the school.
Although this theme is not at all new, it is nevertheless useful to discuss it as an introduction and to stress the generality: the same Mullins-Sekerka instability plays a role in fractal growth processes like Diffusion Limited Aggregation [4]. The above issues are the subject of the first lecture and Section 1.
In physics, it is quite common (and often done intuitively without even stating this explicitly) to switch back and forth between a formulation in terms of a mathematically sharp interface (an infinitely thin surface or line at which the physical fields or their derivatives can show discontinuities) and a formulation in terms of a continuous order parameter field which exhibits a smooth but relatively thin transition zone or domain wall. E.g., we think of the interface between a solid and its melt as a microscopically thin interface, whose width is of the order of a few atomic dimensions. Accordingly, the equations that govern the formation of growth patterns of a solid which grows into an undercooled melt on much larger scales $\ell_{\rm pattern}$ have traditionally been formulated in terms of a sharp interface or boundary. The equations, which will be discussed below, are then the diffusion equation for the temperature in the bulk of the liquid and the solid, together with boundary conditions at the interface. These interfacial boundary conditions are a kinematic equation for the growth velocity of the interface in terms of the local interface temperature, and a conservation equation for the heat. The latter expresses that the latent heat released at the interface upon growth of the solid has to be transported away through diffusion into the liquid and the solid.
In other words, in an interfacial formulation or moving boundary approximation, the appropriate equations for the bulk fields are introduced, but the way in which the order parameter changes from one state to the other in the interfacial region is not taken into account explicitly:^1
^1 Note that if we consider a solid-liquid interface of a simple material so that the interface width is of atomic dimensions, there can be microscopic aspects of the interface physics that have to be put in by hand in the interfacial boundary conditions anyway, as they cannot really be treated properly in a continuum formulation. E.g., if the solid-liquid interface is rough, a linear kinetic law in which the interface grows in proportion to the local undercooling is appropriate. If the interface is faceted, however, a different boundary condition will have to be used.
the physics at the interface is lumped into appropriate boundary conditions. Such an interface formulation of the equations often expresses the physics quite well and most efficiently, and is often the most convenient one for the analytical calculations which we will present later. For numerical calculations, however, the existence of a sharp boundary or interface is a nuisance, as it forces one to introduce highly non-trivial interface tracking methods. Partly to avoid this complication, several workers have introduced in the last few years different models, often referred to as phase-field models [5,6], in which the transition from one phase or state to another one is described by introducing a continuum equation for the appropriate order parameter. Instead of a sharp interface, one then has a smooth but thin transition zone of width $W$, where the order parameter changes from one (meta)stable state to another one. Numerically, such smooth interface models are much easier to handle, since one can in principle apply standard numerical integration routines.^2
While we mentioned above an example where one has (sometimes in a somewhat artificial way) introduced continuum field equations to analyze a sharp interface problem, insight into the dynamics of problems with a moving smooth but thin transition zone is often more easily gained by going in the opposite direction, i.e., by viewing this zone as a mathematically sharp interface or shock front. An example of this is found in combustion [7,8]. In premixed flames, the reaction zone is usually quite thin, and one speaks of flame sheets. Already long ago, Landau (and independently Darrieus) considered the stability of planar flame fronts by viewing them as a sharp interface [7,8]. In the last 20 years, much progress has been made in the field of combustion by building on this idea of using an effective interface description of thin flame sheets.
Likewise, much of the progress on understanding chemical waves, spirals and other patterns in reaction diffusion systems rests on the possibility of exploiting similar ideas [9,10]. Other examples from condensed matter physics: if the magnetic anisotropy is not too large, domain walls in solids can have an appreciable width, but for many studies of magnetic domains of size much larger than this width, we normally prefer to think of the walls as being infinitely thin [11]. Similar considerations hold for domains in liquid crystals [11]. In studies of coarsening (the gradual increase in the typical length scale after a quench of a binary fluid or alloy into the so-called spinodal regime where demixing occurs), both smooth and sharp interface formulations are being used [12–15].
At a summer school on statistical physics, it seems appropriate to note in passing that some of the model equations which include the order parameter are very similar to those studied in particular in the field of dynamic critical phenomena, such as model A, B, ..., etc. in the classification of Hohenberg and Halperin [16]. Here, however, we are not interested in the universal scaling properties of an essentially homogeneous system near the critical point of a second-order phase transition, but in many cases in the nonlinear nonequilibrium dynamics of interfaces between
^2 While the numerical code may be conceptually much more straightforward, the bottleneck with these methods is that one now needs to have a small gridsize, so as to properly resolve the variations of the order parameter field on the scale $W$. At the same time, one usually wants to study pattern formation on a scale $\ell_{\rm pattern} \gg W$, so that many gridpoints are needed. Hence, computer power becomes the limiting factor. Nevertheless, numerical simulations of dendrites using such phase-field models nowadays appear to present the best way to test analytical predictions and to compare with experimental data [6].
a metastable and a stable state. This corresponds to the situation near a first-order transition.^3 When we consider in Section 3 fronts which propagate into an unstable state, this can be viewed as the interfacial analogue of the behavior when we quench a system through a second-order phase transition, especially within a mean-field picture,^4 where fluctuations are not important.^5 Finally, we also stress that while thermal fluctuations are essential to second-order phase transitions, they can often be neglected in pattern forming systems, since the typical length and energy scales of interest in pattern forming systems are normally very large (see Section VI.D of [17] for further discussion of this point).
A word about nomenclature. For many physicists, front, domain wall and reaction zone are words that have the connotation of describing smooth transition zones of finite thickness,^6 while the word interface is being used for a surface whose thickness is so small that it can be treated as a mathematically sharp boundary of zero thickness. The approximation in which the interface thickness is taken to zero is sometimes referred to as a moving boundary approximation. Since neither this concept nor the meaning of the word “interface” is universally accepted, we will sometimes use effective interface description or effective interface approximation as an alternative to “moving boundary approximation” to denote a description of a front or transition zone by a sharp interface with appropriate boundary conditions.
Of course, switching back and forth between a sharp interface formulation and one with a smooth continuous order parameter field is only possible if the latter reduces to the former in the limit in which the interface thickness $W \to 0$ (the interface width is illustrated in Fig. 8 below).
Indeed, it is possible to derive an effective interface description systematically by performing an expansion of the equations in powers of $W$ (technically, this is done using singular perturbation theory or matched asymptotic expansions [5,6,21–23]). In such an analysis, the wall or front is treated as a sharp interface when viewed on the “outer” pattern forming length scale $\ell_{\rm pattern} \gg W$, and the
^3 This will be illustrated, e.g., by the Landau free energy $f$ in Fig. 4b below. Here the states $\phi = 0$ and $\phi = \phi_s$ both correspond to minima of the free energy density $f$. In Section 2.1 fronts between these two (meta)stable states are discussed.
^4 In the mean-field picture, the Landau free energy has below $T_c$ one (unstable) maximum at $\phi = 0$ and one minimum at some $\phi = \phi_s \neq 0$. Fronts propagating into an unstable state precisely correspond to fronts between these two states. Compare Fig. 9b, where $V = -f$.
^5 It is actually rather exceptional to have propagating interfaces when we quench a system through the transition temperature of a second-order phase transition, because the fluctuations normally make it impossible to keep the system in the phase which has become unstable long enough that propagating interfaces can develop. Nevertheless, there is one example of a thermodynamic system in which the properties of propagating interfaces were used to probe the order of the phase transition: for the nematic-smectic-A transition, which was predicted to be always weakly first order, the dynamics of moving interfaces was used to probe experimentally [18] the order of the transition close to the point which earlier had been associated with a tricritical point (the point where a second-order transition becomes a first-order transition). These dynamical interface measurements confirmed that the transition was always weakly first order [18]. Note finally that in pattern forming systems, the fluctuations are often small enough (see the remarks about this in the next paragraph of the main text) that fronts propagating into unstable states can be prepared more easily. For an example of such fronts in the Rayleigh-Bénard instability, see [19].
^6 And even this is not true: in adsorbed monolayers walls usually have only a microscopic thickness; e.g., light and heavy walls are concepts that have been introduced to distinguish walls which differ in the atomic packing in one row; see, e.g., [20].
dynamics of the front or wall on the “inner” scale $W$ emerges in the form of one or more boundary conditions for this interface. E.g., one boundary condition can be a simple expression for the normal velocity of the interface in terms of the local values of the slowly varying outer fields, like the temperature.
This brings us to the second theme and lecture of this course: for the adiabatic decoupling of slow and fast variables underlying an effective interface description, an exponential relaxation of the front structure and velocity is necessary. The point is that an interfacial description (mapping a smooth continuum model onto one with a sharp interface for the analysis of patterns on a length scale $\ell_{\rm pattern} \gg W$) is only possible if we can make an adiabatic decoupling. In intuitive terms, this means that if we “freeze” the slowly varying outer fields (temperature, pressure, etc.) at their instantaneous values and perturb the front profile on the scale $W$ by some amount, then the front profile and speed should relax as $\exp(-t/\tau_{\rm front})$ to some asymptotic shape and value which are given in terms of the “frozen” outer field. If, moreover, the inner front relaxation time $\tau_{\rm front}$ vanishes as $W \to 0$ (e.g., in the model we will discuss $\tau_{\rm front} \sim W$), then indeed in the limit $W \to 0$ the relaxation of the order parameter dynamics within the front region decouples completely from the slow time and length scale variation of the outer fields, as in the limit $W \to 0$ both the length and the time scales become more and more separated. The adiabatic decoupling then implies that for $W \ll \ell_{\rm pattern}$ the front follows essentially instantaneously the slow variation of the outer fields in the region near the front. Accordingly, in the interfacial limit $W \to 0$ the front dynamics on the inner scale $W$ then translates into boundary conditions that are local in time and space at the interface.
As we shall illustrate, for fronts between two (meta)stable states, the separation of both length and time scales as $W \to 0$ is normally the case, and this justifies an interfacial description. Of course, stated this way, the above point may strike you as trivial, as it is a common feature of problems in which fast variables can be eliminated [24]. However, it is an observation that we have hardly ever seen stressed or even discussed at all in the literature, and its importance is illustrated by our third theme: fronts propagating into an unstable state may show a separation of spatial scales in the limit $W \to 0$, but need not show a separation of time scales in this limit. Our reason for the last statement is that, as we will discuss, a wide class of fronts which propagate into an unstable state (the interfacial analogue of the situation near a second-order phase transition) exhibits slow power-law relaxation ($\sim 1/t$). This certainly calls into question the possibility of an effective interface formulation with boundary conditions which are local in space and time, but the consequences of this power-law relaxation still remain to be fully explored.
The connection between the issue of the front relaxation and the issue of the separation of time scales necessary for an effective interface description is still a subject of ongoing research of Ebert and myself. We will in these lecture notes only give an introduction to the background of this issue and to the ideas underlying the usual approaches, leaving the real analysis and a full discussion of this problem to our future publications [25]. That our third theme is not a formal esoteric issue is illustrated by the fact that it grew out of our attempt to develop an interfacial description for streamers. As has been discussed by Ebert in her seminar, streamers are examples of a nonequilibrium pattern forming phenomenon. They consist of very sharp fronts ($W \approx 10\,\mu$m) which show patterns with a size $\ell_{\rm pattern}$ of order 1 mm [32].
However, a streamer front turns out to be an example of a front propagating into an unstable state [1], and we have found through bitter experience that the standard methods to arrive at an interfacial approximation break down, and that the slow power-law relaxation lies at the heart of
this. Apart from this, the power-law relaxation is of interest in its own right, especially at a summer school on Fundamental Problems in Statistical Mechanics, as its universality is reminiscent of the universal behavior near a second order critical point in the theory of phase transitions — the common origin of both is in fact the universality of the flow near the asymptotic fixed point.
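To make the contrast between exponential and power-law relaxation concrete, the following minimal sketch compares how long a perturbation needs to decay below a given tolerance in the two cases. The function names, the regularized algebraic form $1/(1 + t/\tau_{\rm front})$ used as a stand-in for a $\sim 1/t$ tail, and the numerical values are illustrative assumptions; they are not taken from the analysis in these notes.

```python
import math


def exp_relaxation_time(tol, tau_front):
    """Time at which exp(-t / tau_front) has dropped to the tolerance tol."""
    return -tau_front * math.log(tol)


def power_law_relaxation_time(tol, tau_front):
    """Time at which 1 / (1 + t / tau_front) has dropped to the tolerance tol.

    The form 1 / (1 + t / tau_front) is an assumed stand-in for a ~1/t tail,
    regularized so that it equals 1 at t = 0.
    """
    return tau_front * (1.0 / tol - 1.0)


if __name__ == "__main__":
    tau_front = 1.0  # hypothetical inner relaxation time, arbitrary units
    for tol in (1e-1, 1e-2, 1e-3, 1e-4):
        t_exp = exp_relaxation_time(tol, tau_front)
        t_pow = power_law_relaxation_time(tol, tau_front)
        print(f"tol={tol:.0e}  exponential: t={t_exp:8.2f}  power law: t={t_pow:10.1f}")
```

In the exponential case the waiting time grows only logarithmically with the inverse tolerance, so a single time $\tau_{\rm front}$ characterizes the relaxation. In the algebraic case the waiting time grows linearly with the inverse tolerance, so there is no single short time scale that could be separated from the slow evolution of the pattern.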
1. Gradient-driven growth problems and the Mullins-Sekerka instability
1.1. The dendritic growth equations
When we undercool a pure liquid below the melting temperature, the liquid will not solidify immediately. This is because below the melting temperature the liquid is only metastable. Moreover, the solid-liquid transition is usually strongly first order, so that the rate at which the solid phase forms through nucleation at small to moderate undercoolings is low. If, however, we bring a solid nucleus into the melt, the solid will start to grow immediately at its interfaces. Initially the shape of the solid germ remains rather smooth (we assume the interface to be rough, not faceted [29]), but once it has grown sufficiently large, it does not stay rounded (like an ice cube melting in a soft drink), but instead branch-type structures grow out. An example of such a so-called dendrite is shown in Fig. 1a. The basic instability underlying the formation of these dendrites is the Mullins-Sekerka instability discussed below, which in this case is associated with the build-up near the interface of a diffusion boundary layer in the temperature. This in turn is due to the fact that while the solid grows, latent heat is released at the interface. In fact, the amount of heat released is normally so large that most of the heat has to diffuse away, in order to prevent the local temperature from rising above the melting temperature $T_M$. E.g., for water the latent heat released when a certain volume solidifies is enough to heat up that same volume by about 80°C. So, since the undercooling is normally just a few degrees, most of the latent heat has to diffuse away in order for the temperature not to exceed $T_M$.
The basic equations that model this physics are the diffusion equation for the temperature in the liquid and the solid,
$$\frac{\partial T}{\partial t} = D \nabla^2 T , \qquad (1)$$
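together with the boundary conditions at the interface
$$\frac{L}{c}\, v_n = -D \left[ (\nabla T^{l})_{n,{\rm int}} - (\nabla T^{s})_{n,{\rm int}} \right] , \qquad (2)$$
$$v_n = \frac{1}{\beta} \left[ T_M \left( 1 - \frac{\sigma}{L} \kappa \right) - T^{\rm int} \right] . \qquad (3)$$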
Eq. (1) is just the normal heat diffusion equation for the temperature; it holds both in the liquid ($T = T^{l}$) and in the solid ($T = T^{s}$). At the interface, the temperature is continuous, so there $T^{l} = T^{s} = T^{\rm int}$. In Eq. (1), we have for simplicity taken the diffusion coefficient $D$ in both phases equal. The first boundary condition (2) expresses the heat conservation at the interface: $v_n$ is the normal growth velocity of the interface, so $L v_n$ is the amount of heat released at the surface per unit area and per unit time ($L$ is the latent heat per unit volume). If we consider an infinitesimal “pillbox” at the interface,
Fig. 1. Examples of three growth patterns: a dendrite in (a), a viscous finger in (b), and a streamer in (c). Usually, dendritic growth is studied in liquids with a melting temperature near room temperature. The one shown in (a) was observed in $^3$He [30] at 100 mK, and the fact that it is similar in shape and form to those usually observed illustrates the generality of dendritic growth (courtesy of E. Rolley and S. Balibar). In (b), a top view of a viscous finger is shown. The air inside the finger-like pattern displaces the oil outside (from [31], with permission from the author). The streamer pattern is from a numerical simulation [32].
the heat produced has to be equal to the net amount of heat which is being transported out of the flat sides through heat diffusion. The heat current is in general $-cD\nabla T$, with $c$ the specific heat per unit volume (the combination $cD$ is the so-called heat conductivity), and we denote the components normal to the interface by $(\nabla T^{l})_{n,{\rm int}}$ and $(\nabla T^{s})_{n,{\rm int}}$. After dividing by $c$, we therefore recognize in the right-hand side of Eq. (2) the net heat flow away from the interface. Note that this equation is completely fixed by a conservation law at the interface, in this case conservation of heat, and that it can be written down by inspection. The only input is the assumption that the interface is very thin. Moreover, it shows the structure we mentioned in the beginning, namely that the normal growth velocity is proportional to the gradient of a bulk quantity: the larger the difference between the gradients in the solid and the liquid is, the faster the growth can be.
Finally, the second boundary condition at the liquid-solid interface (3) is essentially the local kinetic equation which expresses the microscopic physics at the interface: we assume the interface to be rough, so that the interface can be smoothly curved [29]. $T_M [1 - (\sigma/L)\kappa]$ is then the melting temperature of such a curved interface, where $\sigma$ is the surface tension. Here the curvature $\kappa$ is taken positive when the solid bulges out into the liquid; the suppression of $T_M$ in such a case can intuitively be thought of as being due to the fact that there are more broken “crystalline” bonds if the solid is curved out into the liquid, but the relation follows quite generally from thermodynamic considerations [26]. The ratio $(\sigma/L)$ is a small microscopic length, say of the order of tens of Ångströms.
The necessity of introducing the suppression of the local interface melting temperature will emerge later, when we will see that if we do not do so, there would be a strong short wavelength (“ultraviolet”) instability at the interface.^7
Now that we understand the meaning of $T_M (1 - (\sigma/L)\kappa)$ as the local melting temperature of a curved interface, we see that Eq. (3) just expresses the linear growth law for rough interfaces [29]; $1/\beta$ in this expression has the meaning of a mobility. If we take the limit of infinite mobility ($1/\beta \to \infty$), the interface grows so easily that we can approximate (3) by
$$(\beta \to 0) \qquad T^{\rm int} = T_M \left[ 1 - \frac{\sigma}{L} \kappa \right] , \qquad (4)$$
which is sometimes referred to as the local equilibrium approximation.
We stress that the boundary condition (3) is local in space and time, i.e., the growth velocity $v_n$ responds instantaneously to the local temperature and curvature. There are of course sound physical reasons why this is a good approximation: the typical solid-liquid interfaces we are interested in are just a few atomic dimensions wide, and respond on the time scale of a few atomic collision times (of order picoseconds) to changes in temperature [35], while Eqs. (1)–(3) are used to analyze pattern formation on length scales of order microns or more and with growth velocities of the order of a $\mu$m/s, say. Hence, an interface grows over a distance comparable to its width in a time of the order of $10^{-3}$ s, and the time scale for the evolution of the patterns is typically even slower.
^7 Due to the crystalline anisotropy, the capillary parameter actually depends on the angles the interface makes with the underlying crystalline lattice. It has been discovered theoretically that this crystalline anisotropy actually has a crucial influence on dendritic growth: without this anisotropy, needle-like tip solutions of a dendrite do not exist, and the growth velocity of such needles is found to scale with a 7/4 power of the anisotropy amplitude. We refer to [26,28,34] for a detailed discussion of this point.
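The orders of magnitude invoked above for the separation of scales behind boundary condition (3), and for the latent heat of water in Section 1.1, can be checked with a few lines of arithmetic. The sketch below uses round illustrative numbers: an interface width of 1 nm, a growth velocity of 1 μm/s, a microscopic response time of 1 ps, and approximate textbook values per unit mass for the latent heat and specific heat of water. These specific values and the function names are assumptions chosen for illustration.

```python
def crossing_time(width_m, velocity_m_per_s):
    """Time for an interface to advance over a distance equal to its own width."""
    return width_m / velocity_m_per_s


def latent_heat_temperature_rise(latent_heat, specific_heat):
    """Temperature rise L/c if all latent heat stayed in the solidified material."""
    return latent_heat / specific_heat


if __name__ == "__main__":
    width = 1e-9      # assumed interface width: about 1 nm
    velocity = 1e-6   # assumed growth velocity: about 1 micron per second
    t_micro = 1e-12   # assumed microscopic response time: about 1 ps

    t_cross = crossing_time(width, velocity)
    print(f"crossing time          ~ {t_cross:.0e} s")
    print(f"crossing / microscopic ~ {t_cross / t_micro:.0e}")

    # Approximate values for water: L in J/kg, c in J/(kg K)
    delta_T = latent_heat_temperature_rise(334e3, 4.18e3)
    print(f"L/c for water          ~ {delta_T:.0f} K")
```

With these inputs the crossing time exceeds the assumed microscopic response time by many orders of magnitude, which is the sense in which the interface can be taken to respond instantaneously.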
This wide separation of length and time scales justifies the assumptions underlying the interfacial boundary conditions.
Eqs. (1)–(3), together with appropriate boundary conditions for the temperature far away from the interface, constitute the basic equations that describe the growth of a dendrite into a pure melt. They may look innocuous, as they appear to be linear equations, but they are not! The reason is that they involve the unknown position and shape of the interface through the boundary conditions, and that the dynamics of the interface depends in turn on the diffusion fields: the location of the interface has to be found self-consistently in the course of solving these equations! This is why such a so-called moving boundary problem is so highly nonlinear and complicated. These equations (with crystalline anisotropy included in the capillary term $(\sigma/L)$) were actually the starting point of Brener’s talk at the school, and the work he discussed [2] showed how challenging the nonlinear analysis necessary to obtain a phase diagram of growth patterns can be. Such work builds on many advances made in the last decade on understanding the growth velocity and shape of dendrites, and for a discussion of these we refer to the literature [34,36].
1.2. Viscous fingering
In viscous fingering, or Saffman-Taylor fingering, one considers a fluid (typically an oil) confined between two long, parallel, closely spaced plates. In Fig. 1b one looks from the top at such a cell; the plates, separated by a small distance $b = 0.8$ mm, are thus in the plane of the paper. The two black sides in this photo constitute the lateral side walls; the oil between the plates cannot penetrate into these. The distance between these side walls is 10 cm, hence the lateral width of the cell is much larger than the spacing $b$ of the plates. The thin line is the air-fluid interface and the region inside the finger-like shape is air, while the oil is outside.
The air is blown into the area between the plates from the right side of the figure. If the air-fluid interface initially stretches all the way across the cell from one sidewall to the other, one quickly finds that when the air is blown in, this interface is unstable, and that after a while a single finger-shaped pattern like the one shown in Fig. 1b penetrates to the left into the fluid. Understanding the shape and width of this finger has been a major theme in interfacial pattern formation [34]. In simple fluids it is so well understood that the analysis of the finger shapes when surfactants or polymers are added to the displaced fluid has become a way to learn something about the resulting properties of the fluid and the air-fluid interface [37]. We will content ourselves here with giving an introduction to the basic equations, aimed at bringing out the same gradient-driven structure of the interface equations.
In viscous fingering experiments, the spacing $b$ between the plates is much smaller than the width of the cell (the distance between the two dark sides in Fig. 1b). As a result, the average fluid flow field varies in the plane of the cell only slowly, over distances of the order of the lateral dimensions of the cell. Locally, therefore, the flow profile across the thin direction normal to the plates is almost that of homogeneous planar Poiseuille flow, for which we know that the average fluid velocity is $-b^2/(12\eta)$ times the gradient of the pressure $p$. Here $\eta$ is the (dynamic) viscosity of the fluid. Hence, if we now introduce $\mathbf{v}(x, y)$ as the height-averaged flow field between the plates of the cell, which we take to lie in the $xy$-plane, we have $\mathbf{v}(x, y) = -b^2/(12\eta)\, \nabla p$. Taking the fluid to be incompressible, $\nabla \cdot \mathbf{v} = 0$, implies that in the bulk of the fluid the pressure simply obeys the Laplace equation,
$$\nabla^2 p = 0 , \qquad (5)$$
while at the interface the fact that the air displaces all the fluid is expressed by
$$v_n = -\frac{b^2}{12\eta} (\nabla p)_{n,{\rm int}} . \qquad (6)$$
Furthermore, if we ignore the viscosity of the air and wetting effects, the pressure at the interface is nothing but the equilibrium pressure of a smoothly curved air-fluid interface, i.e.,
$$p|_{\rm int} = p_0 - \sigma\kappa , \qquad (7)$$
where as before $\sigma$ is the surface tension and $p_0$ the background pressure in the gas. The curvature term $-\sigma\kappa$ is the direct analogue of the one in Eq. (3); in the context of air-fluid interfaces it is known as the Laplace pressure term, and it corresponds to the well-known effect that the pressure inside a soap bubble is larger than the one outside.
As in the case of crystal growth discussed above, the form of the boundary conditions can essentially be guessed on physical grounds and by appealing to the fact that the microscopic relaxation time and length scales at the interface are typically orders of magnitude smaller than the time and length scales of the pattern.
1.3. Streamer dynamics – a moving charge sheet?
We finally introduce a problem which is not at all understood in detail but whose similarity with dendrites and viscous fingers motivated some of the issues discussed here [1]. The basic phenomenon is that when an electric field is large enough, an electron avalanche can build up in a gas, due to the fact that free electrons get accelerated sufficiently that they ionize neutral molecules, thus generating more free electrons, etc. Streamers are the type of dielectric breakdown fronts that can occur in gases as a combined result of this avalanche type of phenomenon and the screening of the field due to the build-up of a charge layer. The basic equations that are being used to model this behavior are the following continuum balance equations for the electron density $n_e$ and ion density $n_+$, and the electric field $\mathbf{E}$ [32]:
$$\partial_t n_e + \nabla \cdot \mathbf{j}_e = |n_e \mu_e \mathbf{E}|\, \alpha_0\, e^{-E_0/|\mathbf{E}|} , \qquad (8)$$
$$\partial_t n_+ + \nabla \cdot \mathbf{j}_+ = |n_e \mu_e \mathbf{E}|\, \alpha_0\, e^{-E_0/|\mathbf{E}|} , \qquad (9)$$
and the Poisson equation
$$ \nabla\cdot\mathbf{E} = \frac{e}{\varepsilon_0}\,(n_+ - n_e) . \tag{10} $$

The electron and ion current densities $\mathbf{j}_e$ and $\mathbf{j}_+$ are

$$ \mathbf{j}_e = -n_e\mu_e\mathbf{E} - D_e\nabla n_e , \qquad \mathbf{j}_+ = 0 , \tag{11} $$

so that $\mathbf{j}_e$ is the sum of a drift and a diffusion term, while the ion current $\mathbf{j}_+$ is neglected, since the ions are much less mobile than the electrons. The right-hand side of Eqs. (8) and (9) is a source term due to the ionization reaction: in high fields free electrons can generate additional free electrons and ions by impact on neutral molecules. The source term is given by the magnitude of the electron drift current times the target density times the effective ionization cross section; the constant $E_0$ in the ionization probability depends on the mean free path of the electrons and the ionization energy of the neutral
gas molecules. The rate constant $\alpha_0$ has the dimension of an inverse length. The exponential function expresses that only in high fields do electrons have a non-negligible probability of collecting the ionization energy between collisions.

In Fig. 1c, we show a plot of the simulations of the above equations [32]. In this figure, a gap between two planar electrodes across which there is a large voltage difference is studied for parameters in the model that correspond to N$_2$ gas. Initially, the electron density is essentially zero everywhere except in a very small region near the upper electrode. The simulation of Fig. 1c shows the situation 5.5 ns later; the lines in this plot are lines of constant electron density. The density differs by a factor 10 between successive lines. Since these lines are closely spaced (the electron density rises by a factor $10^{10}$ in a few $\mu$m), the simulations illustrate that a streamer consists of an ionized region (inside the contour lines) propagating into a nonionized zone. Inside the thin transition zone between the two, almost all of the ionization takes place, and the total charge density is nonzero. It is this nonzero charge density that also screens the electric field from the interior of the streamer. We can therefore also think of the streamer as a moving charge sheet, whose shape is somewhat like a viscous finger. In fact, if the upper electrode is spherical rather than planar, one finds branched streamers which are reminiscent of dendrites [33]. Note that in this interfacial picture, the charge sheet is also the reaction zone where most of the ionization takes place, and that the build-up of the charge in this zone is at the same time responsible for the screening of the field in the interior of the streamer.
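To get a feeling for how strongly the ionization source term on the right-hand side of Eqs. (8) and (9) depends on the field, the following minimal sketch evaluates it for a few field strengths. The function name and the default parameter values (all set to 1 in arbitrary units) are assumptions made for illustration only; they are not taken from the simulations of [32].

```python
import math

def ionization_source(n_e, E_mag, mu_e=1.0, alpha_0=1.0, E_0=1.0):
    """Source term |n_e mu_e E| alpha_0 exp(-E_0/|E|) of Eqs. (8)-(9).

    n_e   : electron density
    E_mag : magnitude |E| of the electric field
    mu_e, alpha_0, E_0 : hypothetical values in arbitrary units
    """
    if E_mag <= 0.0:
        return 0.0  # no drift current, hence no impact ionization
    drift_current = abs(n_e * mu_e * E_mag)  # magnitude of the drift current
    return drift_current * alpha_0 * math.exp(-E_0 / E_mag)

# The source is strongly suppressed for |E| << E_0 and grows roughly
# linearly with |E| once |E| >> E_0.
for E_mag in (0.1, 0.5, 1.0, 5.0):
    rate = ionization_source(1.0, E_mag)
    print(f"|E|/E_0 = {E_mag:4.1f}   source per unit electron density = {rate:.3e}")
```

Because the source is proportional to $n_e$, any small electron density in a high-field region grows exponentially in time, which is the avalanche mechanism described in the text.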
An immediate question that comes to mind is whether we can analyze such a streamer as a moving interface problem by mapping the continuum equations onto a sharp interface of zero thickness, by taking the limit in which the charge sheet becomes infinitely thin [1]. If such an analysis can be done, very much along the lines of the analysis for combustion or for the so-called phase-field models mentioned in the introduction, one would intuitively expect that the dynamics in the transition zone would, in the $W \to 0$ limit, translate into boundary conditions at the interface. In particular, one expects one equation expressing charge conservation, and a kinematic relation for the normal velocity of the interface. Based on one's experience with the other problems described earlier, this kinematic expression might be guessed to express the local interface velocity as a function of the instantaneous values of the "outer" fields at the interface. As we shall see in Section 3, this is not necessarily possible for front problems like streamers, which correspond to front propagation into unstable states. The physical reason that the nonionized region into which the streamer fronts propagate is unstable is that as soon as there are free electrons, there is further ionization due to the source term on the right-hand side of the streamer equations (8) and (9). This leads to an avalanche type of phenomenon, with exponential growth of the electron density, characteristic of a linearly unstable state.

Let us nonetheless not get discouraged, but follow our nose and assert that if the electron diffusion is small, one would expect that the normal velocity of a streamer front like that of Fig. 1c is approximately equal to the drift velocity of the electrons on the outer side of the charge sheet,

$$ v_n \approx |\mu_e\mathbf{E}^+| = |\mu_e\nabla\Phi^+| . \tag{12} $$

Here $\Phi$ is the electrical potential, $\mathbf{E} = -\nabla\Phi$, and the superscript $+$ indicates the value of the field at the charge sheet, extrapolated from the nonionized region. A linear relation like Eq. (12) is indeed found to a good approximation for negatively charged streamers [1]. Now, in the nonionized region outside the streamer, the charge density is essentially zero [$n_+ \approx 0$, $n_e \approx 0$ in
the Poisson equation (10)], and hence here

$$ \nabla^2\Phi \approx 0 \quad \text{in the nonionized region} . \tag{13} $$
Thus, we see that if we do think of a streamer as a moving charge sheet and assume that the potential inside the streamer is roughly constant due to the high mobility of the electrons, it falls within the same class of gradient-driven problems as dendrites and viscous fingers: the normal velocity of the charge sheet is proportional to the gradient of a field $\Phi$, which itself obeys a Laplace or diffusion equation.

1.4. The Mullins-Sekerka instability

We now discuss the Mullins-Sekerka instability of a planar interface. We will first follow the standard analysis for the simplest case of a planar solidification interface [3,26–28], and then indicate why in the long wavelength limit the same instability happens for all gradient-driven fronts whose outer field obeys a Laplace or diffusion equation. The analysis of the planar interface appears at first sight to be somewhat of an academic problem, as we will find such an interface to be unstable. However, the analysis does identify the basic physics that is responsible for the formation of natural growth shapes and it helps us to identify the proper length scale for the growth shapes that result (the form of the dispersion relation also plays a role in analytical approaches to the dendrite and viscous finger problem [34]).

We want to consider the stability of a planar interface which grows with velocity $v$. To do so, we write the diffusion equation in a frame $\xi_v = z - vt$ moving with velocity $v$ in the $z$-direction,

$$ \frac{\partial T}{\partial t} - v\,\frac{\partial T}{\partial \xi_v} = D\nabla^2 T . \tag{14} $$
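Note that now $\nabla^2$ denotes the Laplacian in the moving $x, y, \xi_v$ frame. Furthermore, we consider for simplicity the limit $\beta \to 0$ so that the other two basic equations are Eqs. (2) and (4),

$$ \frac{L}{c}\,v_n = -D\left[(\nabla T^l)_{n,\mathrm{int}} - (\nabla T^s)_{n,\mathrm{int}}\right] , \tag{15} $$

$$ T^{\mathrm{int}} = T_M\left[1 - (\sigma/L)\kappa\right] . \tag{16} $$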
Let us first look for a steady-state solution, i.e., the solution for a plane growing with a constant velocity $v$ in the $z$-direction into an undercooled liquid. Since according to boundary condition (16) $T^{\mathrm{int}} = T_M$ for a plane, the solution in the solid is $T_0^s = T_M$, while solving the diffusion equation (14) for solutions $T_0^l$ that are stationary in the $\xi_v$ frame yields

$$ T_0^l(\xi_v) = (T_M - T_\infty)\,e^{-\xi_v/l_D} + T_\infty , \qquad T_0^s = T_M . \tag{17} $$
Here we have taken the position of the plane at $\xi_v = 0$, $T_\infty$ is the temperature far in front of the plane, and $l_D = D/v$ is the thermal diffusion length. The temperature profile given by Eq. (17) is sketched in Fig. 2. Substitution of this result into the boundary condition for heat conservation (15) yields

$$ L/c = (T_M - T_\infty) . \tag{18} $$
Fig. 2. Qualitative sketch of the temperature profile at a planar interface in solidification.
This equation shows that the temperature $T_\infty$ has to be precisely an amount $L/c$ below the melting temperature (this criterion is often referred to as unit undercooling), and it is an immediate consequence of heat conservation. Indeed, for the plane to be able to move with a constant speed, the amount of heat in the diffusion boundary layer must be constant in time in the co-moving frame. Hence the net effect of the moving interface is that it replaces a liquid volume element at a temperature $T_\infty$ by a solid volume element at a temperature $T_M$, while the heat per unit volume which is generated is $L$. Equating this to the heat $c(T_M - T_\infty)$ necessary to give the required increase in temperature gives Eq. (18).

In passing, we note that if the undercooling far away is less [$(T_M - T_\infty) < L/c$], a planar solidification front will gradually slow down ($v \sim 1/\sqrt{t}$) due to the slow increase of the thickness of the boundary layer. This gradual decrease of the speed is slow enough that we can extend most of the analysis below by making a quasistationary approximation for the velocity, but for simplicity we will assume that condition (18) is satisfied.

We now turn to a linear stability analysis of this planar interface. To do so, we assume that the interface is slightly perturbed, i.e., that the position of the interface deviates slightly from the planar position $\xi_v = 0$. The strategy then is to write the interface position as $\xi_v = \zeta(x, y, t)$ with $\zeta(x, y, t)$ small, and to solve the diffusion equation and boundary conditions to first order in $\zeta(x, y, t)$. Since the unperturbed planar solution is translation invariant in the $xy$ plane, the eigenmodes of the linearized equations are simple Fourier modes, and it suffices to analyze each Fourier mode separately. Moreover, for simplicity, we can take this mode to vary in the $x$ direction only. We thus write the perturbed interface and the temperature field as single Fourier modes of the form

$$ \zeta(x, y, t) = \zeta_k\,e^{\Omega t + ikx} , \qquad \delta T^{l,s} = \delta T_k^{l,s}(\xi_v)\,e^{\Omega t + ikx} . \tag{19} $$
Our goal is to determine the dispersion relation, i.e., $\Omega$ as a function of $k$. If $\Omega$ is positive, the corresponding mode $k$ grows, and the planar solution is unstable to that particular mode. Consider first the temperature diffusion equation. Since it is already linear, substituting the Fourier modes (19) into Eq. (14) shows that the functions $\delta T_k^{l,s}(\xi_v)$ satisfy the simple differential equations

$$ \frac{d^2\delta T_k^{l,s}}{d\xi_v^2} + \frac{1}{l_D}\,\frac{d\delta T_k^{l,s}}{d\xi_v} = \left(\frac{\Omega}{D} + k^2\right)\delta T_k^{l,s} . \tag{20} $$
The solutions of these equations are simple exponentials; when we impose that the perturbed temperature fields $\delta T^{l,s}$ have to decay to zero far away from the interface, we get

$$ \delta T_k^l(\xi_v) = \delta T_k^l\,e^{-q\xi_v} , \qquad q = \frac{1}{2l_D}\left(1 + \sqrt{1 + 4l_D^2\Omega/D + 4k^2l_D^2}\right) , \tag{21} $$

$$ \delta T_k^s(\xi_v) = \delta T_k^s\,e^{q'\xi_v} , \qquad q' = \frac{1}{2l_D}\left(-1 + \sqrt{1 + 4l_D^2\Omega/D + 4k^2l_D^2}\right) . \tag{22} $$

Furthermore, continuity of the temperature at the interface implies $T_0^l(\xi_v = \zeta) + \delta T^l(\xi_v = \zeta) = T_M + \delta T^s(\xi_v = \zeta)$. To linear order, we can take $\delta T^l$ and $\delta T^s$ at $\xi_v = 0$, since they are already linear in the perturbations. Expanding $T_0^l(\xi_v = \zeta)$ to linear order gives $T_0^l(\xi_v = \zeta) = T_M - (T_M - T_\infty)\zeta/l_D + \cdots$ and so we simply get

$$ \delta T_k^l - (T_M - T_\infty)\,l_D^{-1}\zeta_k = \delta T_k^s . \tag{23} $$

Turning now to the boundary conditions (15) and (16), we note that the curvature $\kappa$ of the surface $\xi_v = \zeta$ becomes $\kappa = -(\partial^2\zeta/\partial x^2)/[1 + (\partial\zeta/\partial x)^2]^{3/2} = -\partial^2\zeta/\partial x^2 + O(\zeta^2)$. With this result and Eq. (23), the local equilibrium interface boundary condition (16) therefore becomes

$$ \delta T_k^s = -(\sigma/L)\,T_M\,k^2\zeta_k , \tag{24} $$

$$ \delta T_k^l = (T_M - T_\infty)\,l_D^{-1}\zeta_k - (\sigma/L)\,T_M\,k^2\zeta_k . \tag{25} $$

Finally, we need to linearize the conservation boundary condition (15). The relation between the velocity of the interface along the $z$-direction, $v + \dot\zeta$, and the normal velocity $v_n$ is $v_n = (v + \dot\zeta)\cos\theta$, where $\theta$ is the angle between the interface normal and the $z$ (or $\xi_v$) direction. Since $\cos\theta = 1/\sqrt{1 + (\partial\zeta/\partial x)^2}$, this gives to linear order $v_n = v + \dot\zeta$. Furthermore, the perturbed gradient at the liquid side of the interface has two contributions, one from $T_0^l$ evaluated at the perturbed position of the interface, and one from $\delta T^l$. One gets, using also Eq. (18),

$$ \Omega = \frac{v}{l_D}\left(-1 + q\,l_D\right) - D\,(q + q')\,d_0 k^2 , \tag{26} $$

where
$$ d_0 = \sigma T_M c/L^2 \tag{27} $$

is the capillary parameter, which has a dimension of length. Just like the ratio $\sigma/L$, $d_0$ is typically a small microscopic length, of the order of tens of Ångström, say. Eq. (26) is the dispersion relation for the growth rate $\Omega$ we were after. In this general form, it is not so easy to analyze for general $k$ (see footnote 8), since $q$ and $q'$ depend on $k$ and $\Omega$ through Eqs. (21) and (22).
8 It is easy to verify from Eqs. (21), (22) and (26) that $\Omega = 0$ for $k = 0$. This is a consequence of the fact that the system is translation invariant, so that a perturbation that corresponds to a simple shift of the planar interface neither grows nor decays.
Fig. 3. (a) Sketch of the dispersion relation (28) for the stability of the planar solidification interface, in the quasistationary approximation. The linear behavior of $\Omega$ with $|k|$ is generic for gradient-driven growth problems, while the stabilization for larger $k$ values depends on the problem under consideration. (b) Sketch of the compression of the isotherms in front of a bulge of the interface. If such a bulge appears on a long enough length scale that the capillary suppression of the local melting temperature is not too large, then the enhanced heat diffusion near the bulge associated with the compression of the isotherms makes the interface unstable. This is the origin of the Mullins-Sekerka instability, which is generic to gradient-driven growth problems.
The expression for $\Omega$ becomes much more transparent if the diffusion coefficient $D$ is large enough and the perturbations of short enough wavelength that both $\Omega \ll Dk^2$ and $k l_D \gg 1$ (see footnote 9). This is actually the relevant limit for small wavelengths, as then $l_D$ is very large and timescales are slow. In this case, Eqs. (21) and (22) show that $q' \approx q \approx |k|$, and then the dispersion relation (26) reduces to

$$ \Omega \approx v|k|\left[1 - 2d_0 l_D k^2\right] . \tag{28} $$
This is the form in which the dispersion relation is best known. As Fig. 3 illustrates, $\Omega$ grows linearly for small $k$ (long wavelength), and all modes with wave number $k < k_n = 1/\sqrt{2d_0 l_D}$ have positive growth rates and hence are unstable ($k_n$ is the neutral wave number for which $\Omega = 0$; a mode with this wave number neither grows nor decays). The maximum growth rate is for $k_{\max} = k_n/\sqrt{3}$, i.e., for a wavelength $\lambda_{\max} = 2\pi/k_{\max} = 2\pi\sqrt{6d_0 l_D}$. We thus see that the planar interface is unstable to modes within a whole range of wavenumbers. Hence, even if we could prepare initially a (nearly) flat interface, we would soon see that small protrusions, especially those with a spatial scale of order $\lambda_{\max}$, would start to grow out. Quite soon, the interface evolution is then not described anymore by the linearized equations, and one has to resort to some nonlinear analysis to understand the morphology of the patterns that subsequently arise. Typically, $\lambda_{\max}$ still is an important length scale even for these growth shapes, but it definitely is not the only parameter that determines the scale and morphology of the patterns [26–28,34]. An example of this was discussed at the school by Brener [2].
9 One cannot choose X independently to satisfy these conditions; nevertheless, one can show that Eq. (28) is a good approximation to Eq. (26) if the diffusion coefficient is large enough that the conditions in the text are satisfied [27]. Physically, we can think of this limit as the one where the diffusion is so large that the temperature diffusion equation can be approximated by the Laplace equation.
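As a rough numerical illustration of the quasistationary limit, the sketch below evaluates the approximate dispersion relation $\Omega \approx v|k|(1 - 2d_0 l_D k^2)$, its neutral and fastest-growing wave numbers, and compares it with a closed-form solution of the full relation. The full solution rests on assumptions that are spelled out here rather than quoted from the text: inserting the Fourier modes (19) into the diffusion equation (14) gives $q + q' = s/l_D$ and $q - q' = 1/l_D$ with $s = \sqrt{1 + 4l_D^2\Omega/D + 4k^2l_D^2}$, and the capillary term enters as $-D(q+q')d_0k^2$. The function names and the dimensionless parameter values are purely illustrative.

```python
import math

def growth_rate_quasistationary(k, v, l_D, d_0):
    """Quasistationary growth rate: Omega ~ v|k| (1 - 2 d_0 l_D k^2)."""
    return v * abs(k) * (1.0 - 2.0 * d_0 * l_D * k * k)

def growth_rate_full(k, v, l_D, d_0):
    """Growth rate from the full relation (assumed form, see lead-in).

    With D = v*l_D the relation reads Omega = a*s - v/(2 l_D), where
    a = v/(2 l_D) - v d_0 k^2.  Eliminating Omega from the definition of s
    leaves a quadratic, s^2 - c s - (4 k^2 l_D^2 - 1) = 0, with
    c = 2 - 4 l_D d_0 k^2.  The larger root (s >= 1) is the one for which
    the perturbations decay on both sides of the interface.
    """
    c = 2.0 - 4.0 * l_D * d_0 * k * k
    disc = max(c * c + 4.0 * (4.0 * k * k * l_D * l_D - 1.0), 0.0)
    s = 0.5 * (c + math.sqrt(disc))
    a = v / (2.0 * l_D) - v * d_0 * k * k
    return a * s - v / (2.0 * l_D)

def neutral_wavenumber(l_D, d_0):
    """k_n = 1/sqrt(2 d_0 l_D), where the quasistationary Omega vanishes."""
    return 1.0 / math.sqrt(2.0 * d_0 * l_D)

def fastest_growing_wavelength(l_D, d_0):
    """lambda_max = 2 pi sqrt(6 d_0 l_D), i.e. k_max = k_n / sqrt(3)."""
    return 2.0 * math.pi * math.sqrt(6.0 * d_0 * l_D)

# Hypothetical dimensionless parameters: lengths in units of l_D,
# velocities in units of v, and a capillary length d_0 << l_D.
v, l_D, d_0 = 1.0, 1.0, 1.0e-4
k_n = neutral_wavenumber(l_D, d_0)
print("k_n        =", k_n)
print("k_max      =", k_n / math.sqrt(3.0))
print("lambda_max =", fastest_growing_wavelength(l_D, d_0))
for k in (5.0, 20.0, k_n / math.sqrt(3.0)):
    print(k, growth_rate_quasistationary(k, v, l_D, d_0),
          growth_rate_full(k, v, l_D, d_0))
```

For wave numbers with $k l_D \gg 1$ the two growth rates agree closely, while for $k l_D$ of order one the full relation deviates visibly, in line with the conditions stated for the approximation.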
Qualitatively, the origin of the Mullins-Sekerka instability is easy to understand with the help of Fig. 3b. If the interface has some protrusion into the liquid, then the isotherms are compressed in the neighborhood of this protrusion, provided the length scale of the protrusion is large enough that the suppression of the local melting temperature due to the capillary correction is small. This means that the heat diffusion away from the interface is enhanced, i.e., that the latent heat produced in this region of the interface diffuses away more easily. Hence, the interface can grow faster there, and the protrusion grows larger in time.

It is important to realize that the instability that we identified above only occurs upon growth, and not upon melting if the heat necessary to melt the crystal is supplied through the melt. This is why ice cubes keep a smooth rounded shape during melting. You are encouraged to repeat the qualitative arguments of Fig. 3b to convince yourself of this (see footnote 10). See [26] for help and further discussion of this point.

10 But nature always comes up with surprising exceptions: if a spin-polarized $^3$He crystal melts, a magnetic boundary layer builds up in the crystal, i.e., one now has a diffusion layer building up in front of a melting interface, while the temperature field is approximately homogeneous since the latent heat is small and the temperature diffusion fast. The resulting Mullins-Sekerka instability upon melting, which arises because both the interface velocity and the position of the diffusion boundary layer are reversed, was predicted in 1986 [38]. It has just this summer been observed in the low temperature group in Leiden by Marchenkov et al. We emphasize that in general, the presence or absence of the Mullins-Sekerka instability depends both on the ratios of transport coefficients, and on the direction of the gradients. As stressed to me by John Bechhoefer, it is the fact that impurity diffusion is usually negligible in the solid phase that makes the instability so asymmetric at most solid-liquid interfaces. In Bechhoefer's thesis work on nematic-isotropic interface dynamics, the transport coefficients are roughly the same in both phases, and in this case both melting and freezing instabilities were observed.

Clearly, the physical mechanism underlying the Mullins-Sekerka instability is not limited to crystal growth: it arises wherever the growth or dynamics of a free interface is proportional to the gradient of a field which itself obeys a Laplace equation or diffusion equation. In fact, the approximation $q \approx q' \approx |k|$ that allowed us to reduce the dispersion relation to Eq. (28) amounts to replacing the diffusion equation by the Laplace equation in the quasi-stationary limit! Now that we have done the analysis once in detail, it is easy to see that the linear dispersion $\Omega \sim |k|$ which we found for solidification for small $k$ (see Fig. 3a) is a general feature of diffusion-limited or gradient-driven interface dynamics. To be specific, consider an interface whose normal velocity $v_n$ is proportional to the gradient of some field $\Phi$, which obeys the Laplace equation,

$$ v_n = \nabla\Phi|_{\mathrm{int}} , \qquad \nabla^2\Phi = 0 \quad \text{in the bulk} . \tag{29} $$

For a planar solution with velocity $v$, we have then the solution $\Phi_0(z) = \Phi_0' z = vz$. Again, we consider perturbations $\zeta = \zeta_k e^{\Omega t + ikx}$ of the interface. In order that $\delta\Phi$ then obeys the Laplace equation, it must be of the form

$$ \delta\Phi = \delta\Phi_k\,e^{ikx - |k|z} , \qquad z > 0 . \tag{30} $$

Since to linear order the interface velocity $v_n$ in the presence of the perturbation is $v + \dot\zeta$, we now have

$$ \Omega\zeta_k = -|k|\,\delta\Phi_k . \tag{31} $$

Finally, the boundary conditions on the planar interface are such that they can be written in terms of derivatives of the fields or the interface shape, and if the basic equations are translation invariant
(i.e., there is no external field that tends to pin the position of the interface), then we must have in linear order

$$ \Phi_0(\xi_v = \zeta_k) + \delta\Phi_k = \text{terms of higher order in } k , \tag{32} $$

hence, since $\Phi_0' = v$,

$$ \delta\Phi_k = -v\zeta_k \quad \text{as } k \to 0 . \tag{33} $$

Using this in Eq. (31), we find

$$ \Omega \approx v|k| \quad \text{as } k \to 0 . \tag{34} $$
This clearly shows the generality of the presence of unstable long wavelength modes with linear dispersion in gradient-driven interface dynamics. Not only solidification, but also viscous fingering, streamer formation and flames [7,8] are subject to this same type of instability, as our discussion earlier in this section demonstrates. The differences between the various problems mainly occur in the stabilizing behavior at short distance scales. These depend on the details of the physics, and are usually different for different problems. They have to be included, however, since otherwise the interface would be completely unstable in the short wavelength limit $k \to \infty$.

1.5. The connection between viscous fingering and DLA

An interesting illustration of the above observation is given by diffusion-limited aggregation (DLA), in which clusters grow due to accretion of Brownian particles. Hence the driving force for growth is essentially the same physics as above, a long-range diffusion field governed by the Laplace equation, but in this case there are no stabilizing smoothing terms at shorter wavelength. Only the particle size or the lattice serves as a short distance cutoff, and in this case the growth is fractal [4].

The connection between the viscous fingering problem and DLA is actually quite deep. In the viscous fingering case, the growth is deterministic, and controlled by solving the Laplace equation in the bulk. In DLA, the probability distribution of the random walkers is also governed by the Laplace equation and the flux at the boundary of the growing cluster is proportional to the gradient of the probability distribution of walkers; as we saw, this is the basic ingredient of the Mullins-Sekerka instability. More importantly, however, the DLA growth process is intrinsically noisy as one particle is added at a time, and as there is no relaxation at the boundary of the cluster. As pointed out by Kadanoff et al.
[39,40], the noise can be suppressed by having a cluster grow only at a site once that site has been visited a number of times by a random walker, and by allowing particles at the perimeter to detach and re-attach to the cluster with a probability that depends on the number of neighbors at each site. With increasing noise reduction, DLA in a channel crosses over to viscous fingering. Another surprising connection is that the mean occupation profile of the average of many realizations of DLA clusters in a channel approaches the shape of a viscous finger. See [41,42] for details.
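The growth rule described above can be sketched in a few lines. The following is a minimal on-lattice DLA toy model, not the algorithm of [39–42]: the launch and kill radii, the function name `grow_dla` and the parameter `hits_needed` are assumptions chosen for illustration. Setting `hits_needed` larger than 1 mimics only the first ingredient of noise reduction (a site is occupied only after it has been hit several times); detachment and re-attachment of perimeter particles are not included.

```python
import math
import random

def grow_dla(n_particles, hits_needed=1, seed=0):
    """Grow a square-lattice DLA cluster of n_particles sites.

    Walkers are released one at a time on a circle just outside the
    cluster and perform a lattice random walk.  A walker that arrives
    next to the cluster registers a "hit" on its site; the site joins
    the cluster once it has collected hits_needed hits.
    """
    rng = random.Random(seed)
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    cluster = {(0, 0)}          # seed particle at the origin
    hits = {}                   # hit counters for perimeter sites
    r_max = 0.0                 # current cluster radius
    while len(cluster) < n_particles:
        r_launch = r_max + 5.0  # release radius (hypothetical choice)
        r_kill = r_max + 20.0   # walkers beyond this radius are discarded
        angle = rng.uniform(0.0, 2.0 * math.pi)
        x = int(round(r_launch * math.cos(angle)))
        y = int(round(r_launch * math.sin(angle)))
        while True:
            dx, dy = rng.choice(steps)
            x, y = x + dx, y + dy
            if x * x + y * y > r_kill * r_kill:
                break           # walker escaped; release a new one
            if any((x + ex, y + ey) in cluster for ex, ey in steps):
                hits[(x, y)] = hits.get((x, y), 0) + 1
                if hits[(x, y)] >= hits_needed:
                    cluster.add((x, y))
                    r_max = max(r_max, math.hypot(x, y))
                break           # walker is absorbed in either case
    return cluster

# Compare the radius of a plain and a noise-reduced cluster of equal mass.
for m in (1, 3):
    sites = grow_dla(100, hits_needed=m, seed=1)
    radius = max(math.hypot(x, y) for x, y in sites)
    print(f"hits_needed = {m}: {len(sites)} sites, radius = {radius:.1f}")
```

Since the walkers perform unbiased random walks, their stationary probability distribution satisfies the lattice Laplace equation, and the growth probability of a perimeter site is set by the walker flux onto it, which is the gradient-driven growth rule discussed above.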
2. Smooth fronts as effective interfaces 2.1. Fronts between a stable and a metastable state in one dimension — existence, stability and relaxation We now turn our attention to a different but related issue, namely the question when we can map a model with a smooth front, domain wall or transition zone, onto a sharp interface model, with boundary conditions which are local in space and time. The answer to this question, namely that this typically can be done for problems in which the interface separates two (meta)stable states or phases may not be that surprising. Nevertheless, thinking about these issues helped us clarify some of the points which we feel have not been paid due attention to in the literature, and which come to the foreground in our work with Ebert and Caroli on streamers [1]. There you really run into trouble if you blindly apply the formalism as it is usually presented in the literature. This will be discussed in detail in our future publications with Ebert [25], and I will keep you in suspense till Section 3 for a brief sketch of our present results and implications. Further motivation for the analysis of this section was given in the introduction. I am convinced most — if not all — elements of the discussion below must appear at many places in the literature. For example, the first part of the analysis appears in one form or another in [12—15], but since it will arise in almost any Ginzburg—Landau type of analysis — the working horse of condensed matter physics — I presume most ingredients can be found at many more places (similar questions arise in the analysis of instantons in field theory). Nonetheless, we have not come across any discussion from the perspective that we will emphasize in [25], the relation between relaxation, interface limits, and solvability. 
The present section is intended to provide a summary of the background material that can be found at scattered places in the literature and to serve as an introduction to our papers [25].

To be concrete, we will present our discussion in terms of a dynamical equation in one dimension of the form

$$ \frac{\partial\phi}{\partial t} = \frac{\partial^2\phi}{\partial x^2} + g(\phi) . \tag{35} $$
Here $\phi$ is a real order parameter. This equation is about the simplest model equation for the analysis of relaxation dynamics, but it captures the essentials of the issues that also arise in more complicated variants and extensions. Later, in our discussion of the coupling to other fields, it is useful to introduce appropriate parameters to tune the time and spatial scales of the variation of $\phi$, but for the present discussion of Eq. (35) we will not need these. We have therefore used the freedom to choose appropriate time and spatial scales to set the prefactors of the derivative terms to unity.

It will turn out to be useful to express $g(\phi)$ in terms of the derivative of two other functions, which both play the role of a potential in different circumstances:

$$ g(\phi) \equiv -\frac{df(\phi)}{d\phi} \equiv \frac{dV(\phi)}{d\phi} , \tag{36} $$

so that equivalent forms of Eq. (35) are

$$ \frac{\partial\phi}{\partial t} = \frac{\partial^2\phi}{\partial x^2} - \frac{df(\phi)}{d\phi} \quad\Longleftrightarrow\quad \frac{\partial\phi}{\partial t} = \frac{\partial^2\phi}{\partial x^2} + \frac{dV(\phi)}{d\phi} . \tag{37} $$
As we shall see later on, $f$ has the interpretation of a free energy density in a Ginzburg-Landau picture, while $V$ will play the role of a particle potential in a standard argument in which there is a one-to-one correspondence between front solutions and trajectories of a particle moving in the potential $V$.

We are interested in cases in which $g(\phi)$ has two zeroes $\phi_s$ with $g'(\phi_s) < 0$; these correspond to (meta)stable homogeneous solutions $\phi = \phi_s$ of Eq. (35). Indeed, if we linearize this equation about $\phi_s$, and substitute $\Delta\phi \sim e^{\Omega t + ikx}$ (very much like we have done before), then we find $\Omega = g'(\phi_s) - k^2 < 0$, which confirms that the state is linearly stable. Without loss of generality, we can always take $g(0) = 0$ [$V'(0) = 0$], so that $\phi = 0$ is one of the stable states. We will label the other linearly stable state simply $\phi_s$. Although this is not necessary, we will for simplicity also take $g$ antisymmetric [$g(-\phi) = -g(\phi)$], so that the potentials $f$ and $V$ are symmetric. A typical example of a function $g(\phi)$ and its corresponding potentials is sketched in Fig. 4. Note that there is also a third root $\phi_u$ of $g(\phi)$ in between 0 and $\phi_s$, and that here $g'(\phi_u) > 0$. A homogeneous state $\phi = \phi_u$ is therefore unstable ($\Omega = g' - k^2 > 0$ for small $k$).

Let us now focus right away on front or domain wall type solutions of the type sketched in Fig. 5a: they connect a domain where $\phi \approx \phi_s$ on the left to a domain where $\phi \approx 0$ on the right.
Fig. 4. The function $g(\phi)$ and the associated potentials $f(\phi)$ and $V(\phi)$ used in our discussion of front solutions of Eq. (35).
Fig. 5. (a) Example of the type of moving front solution we are looking for. (b) The potential $V$. A moving front solution like the one sketched in (a) corresponds in the particle-on-the-hill analogy to the solution of the dynamical problem in which the particle starts at the top at $\phi_s$, moves down the hill and up the one in the center, and comes to rest at the center top as the quasi-time $\xi_v \to \infty$.
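The linear stability statement for the homogeneous states, $\Omega = g'(\phi) - k^2$, is easy to check for a concrete choice of $g$. The sketch below uses a hypothetical antisymmetric quintic with zeros at $0$, $\pm\phi_u$ and $\pm\phi_s$; this particular $g$, the numerical values $\phi_u = 0.5$, $\phi_s = 1$ and all function names are assumptions made for illustration and are not the function sketched in Fig. 4.

```python
def g(phi, phi_u=0.5, phi_s=1.0):
    """Hypothetical antisymmetric g with zeros at 0, +-phi_u and +-phi_s."""
    return -phi * (phi**2 - phi_u**2) * (phi**2 - phi_s**2)

def g_prime(phi, phi_u=0.5, phi_s=1.0, h=1.0e-6):
    """Central-difference estimate of dg/dphi."""
    return (g(phi + h, phi_u, phi_s) - g(phi - h, phi_u, phi_s)) / (2.0 * h)

def growth_rate(phi_0, k, phi_u=0.5, phi_s=1.0):
    """Omega = g'(phi_0) - k^2 for a perturbation ~ exp(Omega t + i k x)."""
    return g_prime(phi_0, phi_u, phi_s) - k * k

def V(phi, phi_u=0.5, phi_s=1.0):
    """Particle potential with dV/dphi = g and V(0) = 0, so that f = -V."""
    a, b = phi_u**2, phi_s**2
    return -phi**6 / 6.0 + (a + b) * phi**4 / 4.0 - a * b * phi**2 / 2.0

# phi = 0 and phi = phi_s are linearly stable for every k, while
# phi = phi_u is unstable for small enough k.
for name, phi_0 in (("0", 0.0), ("phi_u", 0.5), ("phi_s", 1.0)):
    print(f"state {name:5s}: g' = {g_prime(phi_0):+.3f}, "
          f"Omega(k=0.1) = {growth_rate(phi_0, 0.1):+.3f}, "
          f"V = {V(phi_0):+.4f}")
```

For this choice the stable states sit at maxima of $V$ (minima of $f$) and the unstable state at a minimum of $V$ in between, which is the hill-valley-hill landscape of Fig. 5b.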
Obvious questions are: what does the solution look like? In which direction will the front move? And how does it relax to its moving state? The answers to these questions can be obtained in a very appealing and intuitive way for this simple model equation by reformulating the questions into a form that almost every physicist is familiar with. However, the two main points, the existence of a unique solution and exponential relaxation, have more general validity.

We can look for the existence of moving front solutions by making the Ansatz $\phi_v(x - vt) = \phi_v(\xi_v)$, with $\xi_v = x - vt$. Such solutions are uniformly translating in the $x$ frame, and hence stationary in the co-moving frame $\xi_v$. Substitution of this Ansatz into Eq. (37) gives, after a rearrangement,

$$ \frac{d^2\phi_v}{d\xi_v^2} = -v\,\frac{d\phi_v}{d\xi_v} - \frac{dV(\phi_v)}{d\phi_v} . \tag{38} $$

This equation is familiar to you: it is formally equivalent to the equation for a "particle" with mass 1 moving in a potential $V$, in the presence of "friction". In this analogy, which is summarized below, $\xi_v$ plays the role of time, and $v$ the role of a friction coefficient:

$$ \phi_v \leftrightarrow \text{position of the particle} , \quad \xi_v \leftrightarrow \text{time} , \quad v \leftrightarrow \text{friction coefficient} , \quad V \leftrightarrow \text{potential} . \tag{39} $$
Clearly, the question whether there is a traveling wave solution of the type sketched in Fig. 5a translates into the question, in the particle-on-the-hill analogy: is there a solution in which the particle starts at the top of the potential $V$ at $\phi_s$ at "time" $\xi_v = -\infty$, rolls down the hill, and comes to rest at the top of the hill at $\phi = 0$? In the language of the analogy the answer is immediately obvious: if the value $V(\phi_s)$ of the potential at $\phi_s$ is larger than at $\phi = 0$, i.e., if

$$ \Delta V \equiv V(\phi_s) - V(0) \tag{40} $$

is positive, then there must be a solution with a nonzero positive value of the velocity (the "friction coefficient"). Such a solution corresponds to a front which moves to the right so that the $\phi \approx \phi_s$ domain expands. In the opposite case, when the potential at $\phi_s$ is lower than at 0 so that $\Delta V < 0$, then such a solution only exists for "negative friction" so that enough energy is pumped into the system that the particle can climb the center hill. Negative friction corresponds to a left-moving front with $v < 0$, so that the $\phi \approx 0$ domain expands.

Let us make this a bit more precise by first asking what happens when the "friction" $v$ is very large. Then, there is no solution where the particle moves from the top at $\phi_s$ to the one at $\phi = 0$: for large friction the particle creeps down the hill and comes to rest in the bottom of the potential. When the friction is reduced, the particle loses less energy, and is able to climb further up on the left side of the well. Hence, if we keep on reducing the "friction" $v$, at some value $v = v_0$ the particle has just enough energy left to climb up all the way to the top at $\phi = 0$, and come to rest there. In other words, at $v = v_0$, there is a unique solution of the type sketched in Fig. 5a. If $v$ is reduced slightly below $v_0$, the particle overshoots a little bit, and it finally ends up in the left well. So for $v$ just below $v_0$, there are no solutions with $\phi_v \to 0$ for $\xi_v \to \infty$. However, if we keep on reducing $v$, there comes
a point $v = v_1$ where the particle first overshoots the middle top, then moves back and forth once in the left well, and finally makes it to the center top; the profile $\phi_{v_1}(\xi_v)$ then has one node where $\phi_{v_1}(\xi_v) = 0$. Clearly, we can continue to reduce $v$ and find values $v_2, v_3, v_4, \ldots$ where the profile $\phi_{v_i}$ has $2, 3, 4, \ldots$ nodes. As is illustrated in Fig. 6, we thus have a discrete set of moving front solutions. Which one is stable and dynamically relevant? Intuitively, we may expect that the one with the largest velocity, $v_0$, is both the stable and the dynamically relevant one, since the multiple oscillations of the other profiles look rather unphysical. This indeed turns out to be the case: if you start with an initial condition close to the profile $\phi_{v_1}$ with velocity $v_1$, you will find that the node either "peels off" from the front region and then stays behind, or moves quickly ahead to disappear from the scene on the front end. In both cases, a front with velocity $v_0$ emerges after a while. The stability analysis of the front solutions which we will present later confirms that only the fastest front solution, with velocity $v_0$, is linearly stable.

Before turning to the linear stability analysis, we make a brief digression about the connection with a more thermodynamic point of view that is especially popular in studies of coarsening [12–15]. It is well known that Eqs. (35) and (37) can also be written as
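$$ \frac{\partial\phi}{\partial t} = -\frac{\delta F}{\delta\phi} , \qquad F = \int dx \left[\frac{1}{2}\left(\frac{\partial\phi}{\partial x}\right)^2 + f(\phi)\right] . \tag{41} $$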
In a Ginzburg—Landau-like point of view, F plays the role of a free energy functional, whose derivative dF/d/ drives the dynamics, and f (/) is the coarse grained free energy density. This formulation brings out clearly that the dynamics is relaxational and corresponds to that of a non-conserved order parameter (the conserved case corresponds to //t"#+ 2dF/d/, so that :dx / remains constant under the dynamics). Note also that in statistical physics one often starts with postulating an expression for a coarse grained free energy functional like F, and then obtains the dynamics for / from the first equation in Eq. (41). One should be aware that in pattern formation, one usually has to start from the dynamical equations, and that these usually do not follow from some simple free energy functional [43]. An immediate consequence of Eq. (41) is that
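$$\frac{dF}{dt} = \int dx\, \frac{\delta F}{\delta\phi}\,\frac{\partial\phi}{\partial t} = -\int dx \left(\frac{\delta F}{\delta\phi}\right)^2 \le 0, \tag{42}$$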
Fig. 6. Graphical representation of the fact that a discrete set of moving front solutions is found at velocity values $v_0, v_1, v_2, \ldots$ The number of nodes of the corresponding profiles $\phi_{v_i}$ is $i$.
so that under the dynamics of $\phi$, $F$ is a non-increasing function of time: it either decreases or stays constant (in technical terms: $F$ is a Lyapunov functional). Since the homogeneous steady states $\phi = 0$ and $\phi = \phi_s$ correspond to minima of $f(\phi)$ and hence of $F$, this immediately shows that a front moves in the direction such that the domain whose state has the lowest free energy density $f$ expands. Since $f = -V$, this is equivalent to the conclusion reached above, that the domain corresponding to the maximum value of the potential $V$ expands.

Consider now the case in which the states $\phi = 0$ and $\phi_s$ have the same free energy density: $\Delta f \equiv f(\phi_s) - f(0) = -\Delta V = 0$. They are then “in equilibrium”, and a wall or interface between these two states does not move. Then the excess free energy per unit area associated with the presence of this wall, which is nothing but the surface tension $\sigma$, is
$$\sigma = \int dx \left[ \frac{1}{2}\left(\frac{\partial\phi_0}{\partial x}\right)^2 + f(\phi_0) - f(0) \right]. \tag{43}$$
We can rewrite this by using the fact that energy conservation in the particle picture implies that the sum of the kinetic and “potential” energy ($-f$) is constant, so that $\tfrac{1}{2}(\partial\phi_0/\partial x)^2 - f(\phi_0) = -f(0)$, since far away to the right $\partial\phi_0/\partial x \to 0$ and $\phi_0 \to 0$. Using this in Eq. (43), we get
$$\sigma = \int dx \left(\frac{\partial\phi_0}{\partial x}\right)^2, \tag{44}$$
which is an expression which is very often used in square-gradient theories of interfaces. In our case, we can use it to obtain a physically transparent expression for the velocity $v$ of the moving front: If we multiply Eq. (38) by $d\phi_v/d\xi_v$ and integrate over $\xi_v$, the term on the left-hand side becomes $\int d\xi_v\,(d\phi_v/d\xi_v)(d^2\phi_v/d\xi_v^2) = \tfrac{1}{2}\int d\xi_v\,(d/d\xi_v)(d\phi_v/d\xi_v)^2 = 0$ since $d\phi_v/d\xi_v$ vanishes for $\xi_v \to \pm\infty$. As a result, we are left with
$$v = \frac{\int d\xi_v \left[\dfrac{d\phi_v}{d\xi_v}\,\dfrac{df}{d\phi}\right]}{\int d\xi_v\,(d\phi_v/d\xi_v)^2} = \frac{-\Delta f}{\int d\xi_v\,(d\phi_v/d\xi_v)^2}. \tag{45}$$
This expression confirms again that the domain whose state has the lowest free energy $f$ expands. But it shows more: for small differences $\Delta f$, the velocity is small, so we can approximate $\phi_v$ in the denominator by $\phi_0$, the profile of the interface in equilibrium. But in this approximation, the denominator is nothing but the surface tension of Eq. (44), so
$$v \approx -\Delta f/\sigma, \qquad v \ \text{small}. \tag{46}$$
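As a quick numerical illustration of Eqs. (44) and (46), the sketch below assumes a specific cubic force, $g(\phi) = \phi(1-\phi)(\phi-a)$, which is not taken from these notes but is a standard example for which the front speed between $\phi = 1$ and $\phi = 0$ is known exactly, $v = (1-2a)/\sqrt{2}$. The function names are hypothetical. For this particular choice the linear-response estimate happens to coincide with the exact speed, which makes it a convenient consistency check rather than a general statement.

```python
import math

def g(phi, a):
    """Assumed cubic force g(phi) = phi (1 - phi) (phi - a), with 0 < a < 1."""
    return phi * (1.0 - phi) * (phi - a)

def f(phi, a):
    """Free energy density with g = -df/dphi, normalised so that f(0) = 0."""
    return a * phi**2 / 2.0 - (1.0 + a) * phi**3 / 3.0 + phi**4 / 4.0

def surface_tension(half_width=40.0, n=80001):
    """Eq. (44) for the equilibrium wall at a = 1/2.

    At a = 1/2 the wall profile is phi_0(x) = 1 / (1 + exp(x / sqrt(2))),
    whose derivative is -phi_0 (1 - phi_0) / sqrt(2).
    """
    dx = 2.0 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        x = -half_width + i * dx
        phi0 = 1.0 / (1.0 + math.exp(x / math.sqrt(2.0)))
        dphi0 = -phi0 * (1.0 - phi0) / math.sqrt(2.0)
        total += dphi0**2 * dx  # simple Riemann sum; the integrand decays fast
    return total

def linear_response_velocity(a):
    """Eq. (46): v ~ -Delta f / sigma, with Delta f = f(phi_s) - f(0), phi_s = 1."""
    delta_f = f(1.0, a) - f(0.0, a)
    return -delta_f / surface_tension()

def exact_velocity(a):
    """Known exact front speed for the assumed cubic g."""
    return (1.0 - 2.0 * a) / math.sqrt(2.0)

if __name__ == "__main__":
    for a in (0.5, 0.45, 0.4):
        print(a, linear_response_velocity(a), exact_velocity(a))
```

For $a < 1/2$ the state $\phi = 1$ has the lower free energy and the computed velocity is positive, i.e., that domain expands, in line with the discussion above.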
Thus, the response of the interface is linear in the driving force $\Delta f$ and the surface tension $\sigma$ plays the role of an inverse mobility coefficient. The above expressions are often used in the work on coarsening, and can be extended to include perturbatively the effect of curvature or slowly varying additional fields on the interface velocity. We will come to this later.

We now return to the question of stability of the front solutions with velocity $v_0, v_1, \ldots$, using an analysis that is inspired by a few simple arguments in [44]. Keep in mind that we will study the stability of front solutions in one dimension themselves, not the stability of a planar interface or front to small changes in its shape, as we did in Section 1. To study the linear stability of a front
solution $\phi_v(\xi_v)$, we write
$$\phi(\xi_v, t) = \phi_{v_i}(\xi_v) + \eta(\xi_v, t), \tag{47}$$
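and linearize the dynamical equation Eq. (35) in $\eta$ in the moving frame $\xi_v$ to get
$$\frac{\partial\eta}{\partial t} = v_i\,\frac{\partial\eta}{\partial\xi_v} + \frac{\partial^2\eta}{\partial\xi_v^2} + g'(\phi_{v_i})\,\eta + O(\eta^2). \tag{48}$$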
Since the equation is linear, we can answer the question of stability by studying the spectrum of temporal eigenvalues. To do so, we write
$$\eta(\xi_v, t) = e^{-Et}\, e^{-v\xi_v/2}\, \psi_E(\xi_v), \tag{49}$$
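so that all modes with eigenvalues $E > 0$ are stable. Upon substitution of this in Eq. (48), we get
$$\left[-\frac{d^2}{d\xi_v^2} + U(\xi_v)\right]\psi_E = E\,\psi_E, \qquad U(\xi_v) = \frac{v^2}{4} - g'(\phi_v(\xi_v)), \tag{50}$$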
which is nothing but the Schrödinger equation (with $\hbar^2/2m = 1$) and which explains why we used $E$ for the temporal eigenvalue in the ansatz (Eq. (49)). In the analogy with quantum mechanics which we will now exploit, $U(\xi_v)$ plays the role of a potential, and we are interested in the energy eigenvalues $E$ of the quantum mechanical particle in this potential. If we find a negative eigenvalue, the profile $\phi_{v_i}$ is unstable, i.e., there is then at least one eigenmode of the linear evolution operator whose amplitude will grow in time under the dynamics. In other words, if we take as an initial condition for the dynamics the uniformly translating profile $\phi_v$ we considered plus a small perturbation about this which has a decomposition along this unstable eigenmode, the perturbation proportional to this eigenmode will grow in time.

Consider first the form of the potential $U(\xi_v)$ for $v = v_0$. In this case the front has a smooth monotonically decreasing profile of the form sketched in Fig. 5a. Both for $\xi_v \to -\infty$ and for $\xi_v \to \infty$, $g'(\phi_{v_0})$ is negative, so $U(\xi_v)$ is positive for $\xi_v \to \pm\infty$. In between, around $\phi_{v_0} = \phi_u$, $g'(\phi)$ is positive as Fig. 4a shows, and so $U(\xi_v)$ is smaller than its asymptotic values in this range. The resulting shape of $U(\xi_v)$ is sketched in Fig. 7 for a case in which $U(-\infty) > U(\infty)$. Armed with a physicist’s standard knowledge of quantum mechanics, we can now immediately draw the following conclusions:

(1) The continuous spectrum corresponds to solutions $\psi_E$ that approach plane wave states as $\xi_v \to \infty$ in the case drawn in Fig. 7, and so they have an energy $E \ge U(\infty) = (v_0^2/4 + |g'(0)|) > 0$. In other words, the bottom of the continuous spectrum lies at a positive energy, and all the corresponding eigenmodes relax exponentially fast.

(2) Next, consider the discrete spectrum. Since the original equation is translation invariant, if $\phi_{v_0}(\xi_v)$ is a solution, so is $\phi_{v_0}(\xi_v + a) = \phi_{v_0}(\xi_v) + a\, d\phi_{v_0}(\xi_v)/d\xi_v + \cdots$.
In other words, as the perturbation is nothing but a small shift of the profile, the perturbation should neither grow nor decay. This implies that $d\phi_{v_0}(\xi_v)/d\xi_v$ must be a “zero mode” of the linear equation, i.e., be a solution
Fig. 7. Sketch of the potential $U(\xi_v)$ which enters in the stability analysis of the front $\phi_{v_0}$ between a stable and a metastable state. The asymptotes $U(\infty)$ and $U(-\infty)$ are both positive. The resulting eigenvalue spectrum is sketched on the right. Note that there always is an eigenvalue $E = 0$ due to the translation mode.
of the Schrödinger equation Eq. (50) with eigenvalue $E = 0$ (footnote 11):
$$E = 0:\qquad \psi_0(\xi_v) = e^{v_0\xi_v/2}\,\frac{d\phi_{v_0}(\xi_v)}{d\xi_v}, \tag{51}$$
where the positive factor $e^{v_0\xi_v/2}$ comes from the transformation Eq. (49) and affects neither the sign nor the number of nodes.

(3) Clearly, the “translation mode” $\psi_0$ with eigenvalue zero is a “bound state solution” [as it should be, in view of point (1)] since it decays exponentially to zero for $\xi_v \to \pm\infty$. Moreover, since $\phi_{v_0}$ decays monotonically, $d\phi_{v_0}(\xi_v)/d\xi_v < 0$, so the translation mode does not have a zero, i.e., is nodeless. Now it is a well-known result of quantum mechanics [45] that the bound state wave functions can be ordered according to the number of nodes they have: the ground state with energy $E_0$ has no nodes, the first excited bound state (if it exists) has one node, and so on. If we combine this with our observation that $\psi_0$ is nodeless and has an eigenvalue $E_0 = 0$, we are led immediately to the conclusion that if there are other bound states, they must have eigenvalues $E > 0$.

Taken together, these results show that apart from the trivial translation mode all eigenfunctions (footnote 12) have positive eigenvalues $E$ and so are stable: they decay as $t \to \infty$. Moreover, there is a gap: if the form of the function $g(\phi)$ is such that there are bound state solutions, then the mode that relaxes slowest is the first “excited” bound state solution $\psi_1$ with eigenvalue $E_1 > 0$. Otherwise, the slowest relaxation mode is determined by the bottom of the continuous spectrum. In either case, all nontrivial perturbations around the profile $\phi_{v_0}$ relax exponentially fast.

It is now easy to extend the analysis to the other front profile solutions $\phi_{v_1}, \phi_{v_2}$, etc. Consider, e.g., $\phi_{v_1}$. The analysis of the continuous spectrum proceeds as before so the continuous spectrum again
Footnote 11: You can easily convince yourself that this is true by substituting $\phi_{v_0}(\xi_v + a)$ in the original ordinary differential equation for the profile, Eq. (38), expanding to linear order in $a$, and transforming to the function $\psi$. You then get Eq. (50) with $E = 0$ and $\psi_0 = e^{v_0\xi_v/2}\, d\phi_{v_0}(\xi_v)/d\xi_v$.

Footnote 12: There is actually a slightly subtle issue here that we have swept under the rug. In quantum mechanics, wave functions $\psi$ which diverge as $\xi_v \to \pm\infty$ are excluded, as these cannot be normalized; due to the transformation Eq. (49) from $\eta$ to $\psi$, there can be perfectly honorable eigenfunctions $\eta$ of fronts that do not translate into normalizable wave functions $\psi$. In the present case, these eigenfunctions turn out to have large positive eigenvalues, and so they do not affect our conclusions concerning the relaxation, but for fronts propagating into unstable states one has to be much more careful. See Section 3 and [25] for further details.
has a gap. Again, the translation mode $d\phi_{v_1}(\xi_v)/d\xi_v$ neither grows nor decays, so has eigenvalue zero, but now the fact that $\phi_{v_1}$ goes through zero once and then decays to zero implies that $d\phi_{v_1}(\xi_v)/d\xi_v$ has exactly one node. According to the connection between the number of nodes of bound state solutions and the ordering of the energy eigenvalues, there must then be precisely one eigenfunction with a smaller eigenvalue than the translation mode which has $E = 0$. In other words, there is precisely one unstable mode. Likewise, all other profiles $\phi_{v_i}$ with $i > 0$ are unstable to $i$ modes.

In summary, our analysis shows that in the dynamical equation Eq. (35) for the order parameter $\phi$, there is a discrete set of moving front solutions. Only the fastest one is stable, and its motion is in accord with simple thermodynamic intuition. Moreover, the relaxation towards this unique solution is exponentially fast, as $e^{-\Delta E\, t}$, where $\Delta E$ is the gap to the lowest bound state eigenvalue, if one exists, or else to the bottom of the continuum band.

2.2. Relaxation and the effective interface or moving boundary approximation

As explained in the introduction, in many cases one wants to map a problem with a smooth but thin front or interfacial zone onto one with a mathematically sharp interface with appropriate boundary conditions. We have termed this the effective interface approximation. Reasons for using this mapping can be either to replace a sharp interface problem by a computationally simpler one with a smooth front (e.g., a so-called phase-field model for a solidification front [5,6,46]) or to translate a problem with a thin transition zone (e.g., streamers [1], chemical waves [9,10], combustion [7,8]) onto a moving boundary problem, so as to be able to exploit our understanding of this class of problems. We refer to this literature and to [23] for detailed discussion of the mathematical basis of such approaches.
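Before turning to the mapping itself, the gap in the stability spectrum discussed in Section 2.1 can be made concrete with a small numerical sketch. It assumes a cubic force $g(\phi) = \phi(1-\phi)(\phi-a)$, which is an illustrative choice and not one made in these notes; for it the stable front is $\phi_{v_0}(\xi) = 1/(1+e^{\xi/\sqrt{2}})$ with $v_0 = (1-2a)/\sqrt{2}$. The operator $-d^2/d\xi^2 + U(\xi)$ with $U = v^2/4 - g'(\phi_v)$ is discretized by second-order finite differences, and its lowest eigenvalues are located by bisection on a Sturm sequence count. All function names are hypothetical.

```python
import math

def front_potential(a, half_width=30.0, n=1201):
    """Return (U, h): U(xi) = v^2/4 - g'(phi_v(xi)) on a uniform grid of spacing h."""
    v = (1.0 - 2.0 * a) / math.sqrt(2.0)      # exact speed for the assumed g
    h = 2.0 * half_width / (n - 1)
    U = []
    for i in range(n):
        xi = -half_width + i * h
        phi = 1.0 / (1.0 + math.exp(xi / math.sqrt(2.0)))
        g_prime = -a + 2.0 * (1.0 + a) * phi - 3.0 * phi**2
        U.append(v * v / 4.0 - g_prime)
    return U, h

def count_below(E, U, h):
    """Number of eigenvalues of the tridiagonal matrix for -d^2/dxi^2 + U below E."""
    off2 = 1.0 / h**4                         # square of the off-diagonal -1/h^2
    count = 0
    q = 1.0
    for i, u in enumerate(U):
        d = 2.0 / h**2 + u - E
        q = d if i == 0 else d - off2 / q     # Sturm sequence recursion
        if q == 0.0:
            q = -1e-300                       # avoid division by zero
        if q < 0.0:
            count += 1
    return count

def eigenvalue(k, U, h, tol=1e-10):
    """k-th lowest eigenvalue (k = 0, 1, ...) by bisection, Dirichlet boundaries."""
    lo = min(U) - 1.0
    hi = max(U) + 4.0 / h**2 + 1.0            # Gershgorin bound on the spectrum
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if count_below(mid, U, h) > k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    U, h = front_potential(0.5)
    print("E0 =", eigenvalue(0, U, h), " E1 =", eigenvalue(1, U, h))
```

For the symmetric case $a = 1/2$ the potential is of Pöschl-Teller form, $U = 1/2 - (3/4)\,\mathrm{sech}^2(\xi/2\sqrt{2})$, so one expects the translation mode at $E_0 = 0$ and a single excited bound state at $E_1 = 3/8$ below the continuum at $1/2$; the finite-difference values should agree with these up to discretization error.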
Here, we just want to emphasize how the exponential relaxation of front profiles that we discussed above is a conditio sine qua non for being able to apply this mapping. For concreteness, let us consider the following phase-field model which is a simple example of the type of models which have been introduced for studying solidification within this context [6,46]
$$\frac{\partial u}{\partial t} = \nabla^2 u + \frac{\partial\phi}{\partial t}, \tag{52}$$
$$\varepsilon\,\frac{\partial\phi}{\partial t} = \varepsilon^2\nabla^2\phi + g(\phi, u), \tag{53}$$
$$g(\phi, u) = -\frac{\partial f}{\partial\phi}, \qquad f(\phi, u) = \phi^2(\phi - 1)^2 + \lambda u\phi. \tag{54}$$
In this formulation, $\phi$ is the order parameter field, and $u$ plays the role of a temperature. For fixed $u$, we recognize in Eq. (53) the order parameter equation that we have studied before: the potential $f$ has a double-well structure for $\lambda u$ small. At $u = 0$ the states $\phi = 0$ and $\phi = 1$ have the same free energy $f$, and the “liquid” state $\phi = 0$ and “solid” state $\phi = 1$ are then in equilibrium. As we have seen, an interface between these two states then neither melts nor grows. For $\lambda > 0$, a positive temperature $u$ makes the liquid-like state at the minimum near $\phi = 0$ the lowest free energy state, and below the melting temperature $u = 0$ the solid-like minimum near $\phi = 1$ has the lowest free energy. The order parameter equation is coupled to the diffusion equation Eq. (52) for the temperature through the term $\partial\phi/\partial t$. This term plays the role of a latent heat term when
solidification occurs: it is a source term in the interfacial zone, where $\phi$ rapidly increases from about zero to one. Moreover, if the interface is locally moving with speed $v_n$, then $\partial\phi/\partial t \approx -v_n\,\partial\phi/\partial\xi_v$, so if we integrate through the thin interfacial zone we see that this term contributes a factor $v_n$, in agreement with the fact that the latent heat released at the solid-melt interface is proportional to $v_n$.

In writing Eqs. (52)-(54), the space and time scales have been written in units of the “outer” scale on which the temperature field $u$ varies. In these units, the interface width in the order parameter field $\phi$ should be small, and this is why the parameter $\varepsilon \ll 1$ has been introduced in Eq. (53): it ensures that the interface width $W$ scales as $\varepsilon$ and that the time scale $\tau$ for the order parameter relaxation is also of order $\varepsilon$. It thus allows us to derive the effective interface equations mathematically using the methods of matched asymptotic expansions or singular perturbation theory [5,21,22] by taking the limit $\varepsilon \to 0$. Since both $W$ and $\tau$ scale as $\varepsilon$, the response of the interface velocity $v_n$ stays finite as $\varepsilon$ goes to zero.

Although the mathematical analysis by which effective interface equations can be obtained is certainly more sophisticated and systematic than what will transpire from the brief discussion in this section, what seems to be the essential step in all the approaches is the following. In the term $\lambda u$ in $g$ or $f$, which is often treated for convenience as a small perturbation, it is recognized that in the interfacial zone (of width of order $\varepsilon$) $u$ does not change much and hence can effectively be treated as a constant in lowest order. Moreover, since the shape of the interface is curved on the “outer” scale, the curvature $\kappa$ of the interfacial zone, when viewed on the inner scale of the front width, is treated as a small parameter which enters, as we shall see below, the equations in order $\varepsilon$. This is because when $W \to 0$, the front becomes locally almost planar.
As is illustrated in Fig. 8, one now introduces a curved local coordinate system $(\rho(\mathbf{r}, t), s(\mathbf{r}, t))$ where $\rho$ is oriented normal to the front and points in the direction of the $\phi \approx 0$ phase, which in a Ginzburg-Landau description is normally associated with the disordered phase (we thought of it as the “liquid” phase before). By choosing, e.g., the line $\rho = 0$ to coincide with the contour line $\phi = 1/2$, we ensure that this line follows the interface zone. In the limit $\rho \to 0$ we then have
$$\lim_{\rho\to 0} \left.\frac{\partial\rho}{\partial t}\right|_{\mathbf{r}} = -v_n(s, t), \qquad \lim_{\rho\to 0} \nabla^2\rho = \kappa(s, t). \tag{55}$$
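The derivation of an effective interface approximation now proceeds by introducing the stretched (curvilinear) coordinate $\xi_v = \rho/\varepsilon$ for the analysis of the inner structure of the front profile, and assuming that the fields $\phi$ and $u$ can be expanded in a power series of $\varepsilon$ as
$$\text{“inner region”:}\qquad \phi = \phi_0^{\rm in}(\xi_v, s, t) + \varepsilon\,\phi_1^{\rm in}(\xi_v, s, t) + \cdots, \qquad u = u_0^{\rm in}(\xi_v, s, t) + \varepsilon\, u_1^{\rm in}(\xi_v, s, t) + \cdots, \tag{56}$$
$$\text{“outer region”:}\qquad \phi = \phi_0^{\rm out}(\mathbf{r}, t) + \varepsilon\,\phi_1^{\rm out}(\mathbf{r}, t) + \cdots, \qquad u = u_0^{\rm out}(\mathbf{r}, t) + \varepsilon\, u_1^{\rm out}(\mathbf{r}, t) + \cdots. \tag{57}$$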
These “inner” and “outer” expansions then have to obey matching conditions [23,46] (according to the theory of matched asymptotic expansions [21,22], the outer expansion of the inner solution has to be equal to the inner expansion of the outer solution). We will not discuss these here, but instead
Fig. 8. Qualitative sketch of a curved front of width $W$, and the local curvilinear coordinate system $(\rho, s)$ used in the derivation of an effective interface model.
limit ourselves to an analysis of the inner problem (footnote 13). On the inner scale, we have (footnote 14)
$$\varepsilon^2\nabla^2 = \frac{\partial^2}{\partial\xi_v^2} + \varepsilon\kappa\,\frac{\partial}{\partial\xi_v} + O(\varepsilon^2). \tag{58}$$
Furthermore, we shall treat the term $u$ in $g$ formally as a term of order $\varepsilon$ and write $v = v_1\varepsilon + \cdots$ and $u = u_1\varepsilon + \cdots$; this is not so elegant and not necessary either, but it gets us to the proper answer efficiently. As the velocity is then also of order $\varepsilon$, this implies that $\phi_0$ is then the stationary front profile ($\partial\phi_0/\partial t = 0$ at fixed $\xi_v$) between two phases in equilibrium, so that from Eq. (54) the lowest order equation becomes
$$\frac{\partial^2\phi_0^{\rm in}(\xi_v)}{\partial\xi_v^2} + g(\phi_0^{\rm in}(\xi_v), 0) = 0. \tag{59}$$
The solution of this equation is just the equilibrium profile $\phi_0$ that we introduced in our discussion of the surface tension. Of course, it is not at all surprising that Eq. (59) emerges in lowest order, since at $u = 0$ the two phases are in equilibrium. Now, in the next order, we get
$$\left(\frac{\partial^2}{\partial\xi_v^2} + g'(\phi_0^{\rm in})\right)\phi_1^{\rm in}(\xi_v) = -(v_1 + \kappa)\,\frac{\partial\phi_0^{\rm in}(\xi_v)}{\partial\xi_v} - \left.\frac{\partial g(\phi_0^{\rm in}, u)}{\partial u}\right|_{u=0} u_1. \tag{60}$$
This equation allows us to solve for $\phi_1^{\rm in}$ in principle. But even without doing so explicitly, we can get the most important information out of it. The operator between parentheses on the left is nothing but the linear operator we already encountered before: the Schrödinger operator in our discussion of stability. We then saw that this operator has a mode with eigenvalue zero, the translation mode $d\phi_0/d\xi_v$. Moreover, since the operator is Hermitian, it is also a left eigenmode with eigenvalue zero
Footnote 13: You may easily verify yourself that by substituting Eq. (57) into Eqs. (52)-(54) the equation for $\phi_0^{\rm out}$ reduces to $g(\phi_0^{\rm out}, u_0^{\rm out}) = 0$, which shows that $\phi_0^{\rm out}$ is just “slaved” to $u_0^{\rm out}$: to lowest order, the order parameter in the bulk (outer) region is the value of $\phi_0^{\rm out}$ which minimizes the free energy density $f$ at the local temperature $u_0^{\rm out}$.

Footnote 14: You can easily convince yourself of the correctness of this result by taking the interface as locally spherical with radius of curvature $R$. In spherical coordinates, the radial terms of $\nabla^2$ are $\partial^2/\partial r^2 + (2/r)\,\partial/\partial r$, which gives $\varepsilon^2\nabla^2 \approx \varepsilon^2\left(\partial^2/\partial r^2 + (2/R)\,\partial/\partial r\right) = \partial^2/\partial\xi_v^2 + \varepsilon\kappa\,\partial/\partial\xi_v + \cdots$.
of this operator. This implies that for the equation to be solvable, the right hand side has to be orthogonal to the left zero mode $d\phi_0/d\xi_v$. This condition leads to a so-called solvability condition. Upon multiplying Eq. (60) by $d\phi_0/d\xi_v$ and integrating, we can write this condition as an expression for the normal interface velocity $v_n$ to lowest order in $\varepsilon$,
$$v_n = -\kappa - \frac{\int d\xi_v\, \dfrac{d\phi_0}{d\xi_v}\,\dfrac{\partial g(\phi_0, u)}{\partial u}\, u_1}{\int d\xi_v\,(d\phi_0/d\xi_v)^2}. \tag{61}$$
Here, we used the fact that $\phi_0^{\rm in} = \phi_0$. Moreover, in the integration on the right hand side, we can take $u$ constant, since the temperature does not vary to lowest order in the interfacial region (its derivatives do; see [46] for more details).

The above expression is our central result. The fact that the prefactor of the curvature term on the right is unity comes from the fact that the curvature enters according to the expansion Eq. (58) of the diffusion term $\nabla^2$ in precisely the same way as the velocity term that arises from the transformation to the co-moving curvilinear frame $\xi_v$. When $u_1 = 0$, i.e., when we consider an interface between two equilibrium phases, it expresses the tendency of the interfaces to straighten out. This effect drives coarsening [12-15], and the motion is sometimes referred to as motion by mean curvature. The second term gives the driving term when the interface temperature $u$ is not equal to the equilibrium temperature. The structure of this term is also quite transparent. In the denominator, we recognize the surface tension Eq. (44), and as we already discussed, the inverse of the surface tension plays the role of an interface mobility in the context of the type of models we consider. In the numerator we can write $g$ in terms of $-\partial f(\phi_0, u)/\partial\phi_0$ and then do the integral in the same way as before in deriving Eq. (45); we then simply get
$$v_n = -\kappa - \frac{1}{\sigma}\left.\frac{d\,\Delta f}{du}\right|_{u=0} u, \tag{62}$$
where now $\Delta f$ is the difference in free energy densities at opposite sides of the interface. Clearly, the second term is exactly what we could have guessed on the basis of what we already knew before, and together with the curvature term it has exactly the same type of structure as the boundary condition Eq. (3) that we introduced in our first discussion of solidification. The complications that are necessary to model anisotropic kinetics and surface tension with a phase-field model are significant [6], but conceptually the analysis is essentially the same.

By taking big steps, we have not done justice to the systematics of the analysis, and there is much more to say about it. If you want to know more, you will find entries to the literature in [5,6,23,46]. However, the point we want to bring to the foreground, following [25], is that in all such approaches, a hidden assumption is made in writing the inner expansion as $\phi^{\rm in} = \phi_0^{\rm in}(\rho/\varepsilon, s, t) + \cdots$ in Eq. (56). In doing so, we basically already assume that on the slow time scale $t$, the profile responds instantaneously to variations in the outer field $u$. This is why on the inner scale, the changes in the profile (like $\phi_1^{\rm in}$) are given by ordinary differential equations with coefficients which may vary on the outer slow time scale. As it happens, this is actually justified for this type of problem. For, we have seen that the relaxation of a profile goes exponentially fast, as the spectrum of temporal eigenvalues $E$ has a finite gap $\Delta E$. In the present case, where the time scale $\tau$ in the order parameter equation scales as $\varepsilon$, this means that the relaxation of the front profile goes as
$e^{-\Delta E\, t/\varepsilon}$. This shows that as $\varepsilon \to 0$, the adiabatic assumption implicit in the above analysis is right, as the relaxation on the inner scale completely decouples from the slow scale variation of the outer fields. In other words, we have left out exponentially small terms as $\varepsilon \to 0$, but that is something that almost always happens when we perform an asymptotic expansion! As we shall see now, the adiabatic approximation cannot be made for fronts moving into an unstable state, such as streamers (footnote 15).
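To make the solvability condition of Section 2.2 concrete, the sketch below evaluates the two integrals of Eq. (61) numerically for the model free energy of Eq. (54). It relies on the equilibrium wall at $u = 0$, $\phi_0(\xi_v) = 1/(1 + e^{\sqrt{2}\,\xi_v})$, which follows from $\tfrac12(d\phi_0/d\xi_v)^2 = \phi_0^2(1-\phi_0)^2$; the function names and grid parameters are hypothetical choices made here for illustration.

```python
import math

def solvability_integrals(lam, half_width=30.0, n=60001):
    """Return (numerator per unit u, sigma) of Eq. (61) for the model of Eq. (54).

    numerator = integral of (dphi_0/dxi) * (dg/du),  with dg/du = -lam
    sigma     = integral of (dphi_0/dxi)^2           (surface tension, Eq. (44))
    """
    d_xi = 2.0 * half_width / (n - 1)
    numerator = 0.0
    sigma = 0.0
    for i in range(n):
        xi = -half_width + i * d_xi
        # equilibrium wall: phi_0 -> 1 (solid) on the left, 0 (liquid) on the right
        phi0 = 1.0 / (1.0 + math.exp(math.sqrt(2.0) * xi))
        dphi0 = -math.sqrt(2.0) * phi0 * (1.0 - phi0)
        dg_du = -lam  # from g = -df/dphi with f = phi^2 (phi - 1)^2 + lam * u * phi
        numerator += dphi0 * dg_du * d_xi
        sigma += dphi0**2 * d_xi
    return numerator, sigma

def normal_velocity(kappa, u, lam):
    """Interface velocity from the solvability condition, Eqs. (61)-(62)."""
    numerator, sigma = solvability_integrals(lam)
    return -kappa - numerator * u / sigma

if __name__ == "__main__":
    num, sigma = solvability_integrals(1.0)
    print("numerator / u =", num, " sigma =", sigma)
    print("v_n(kappa=0.1, u=-0.05) =", normal_velocity(0.1, -0.05, 1.0))
```

For this model the numerator equals $\lambda u$ and $\sigma = \sqrt{2}/6$, so the sketch should reproduce $v_n = -\kappa - 3\sqrt{2}\,\lambda u$: undercooling ($u < 0$) makes the solid grow, while curvature slows a convex solid down.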
3. Some elements of front propagation into unstable states: relaxation and the effective interface approximation

We now briefly touch on a few elements of fronts propagating into unstable states. In view of the length restrictions on the contribution to the proceedings of the school, we only highlight some recent results obtained in collaboration with Ebert [25], which show that a large class of fronts propagating into an unstable state show universal power-law relaxation and that this makes the mapping of such fronts onto an effective interface model questionable. Our own motivation comes from our attempt to understand the streamer problem, but examples of fronts propagating into an unstable state arise in various fields of physics: they are important in many convective instabilities in fluid dynamics such as the onset of von Kármán vortex generation [52], in Taylor [53] and Rayleigh-Bénard [19] convection, they play a role in spinodal decomposition near a wall [54], the pearling instability of laser-tweezed membranes [55], the formation of kinetic, transient microstructures in structural phase transitions [56], the propagation of a superconducting front into an unstable normal metal [57], or in error propagation in extended chaotic systems [58]. The experimental relevance of the understanding of the relaxation of such fronts is illustrated by propagating Taylor vortex fronts. Here the measured velocities were about 40% lower than predicted theoretically [53], and only later did numerical simulations [59] show that this was due to slow transients.

When one of the states is unstable, even a small perturbation around this state will grow out and spread; therefore, the properties of fronts that propagate into an unstable state depend on the initial conditions. If the initial profile is steep enough, arising, e.g., through local initial perturbations, it is known that the propagating front in practice always relaxes to a unique profile and velocity [44,47-49,51].
Depending on the nonlinearities, one generally can distinguish two regimes: as a rule, fronts whose propagation is driven (“pushed”) by the nonlinearities resemble very much the fronts which propagate into a metastable state and which we have discussed extensively in Section 2 (e.g., their relaxation is also exponential in time). We will therefore not consider this regime, which is often referred to as “pushed” [50,51] or “nonlinear marginal stability” [49] any
Footnote 15: At the summer school, Roger Folch Manzanares nicely illustrated to me how one can go wrong with an effective interface approximation if one does not think about the stability of the equations on the inner scale: in a first naive attempt to formulate phase field equations for the viscous finger problem, he had explored equations which did reduce to the standard viscous finger equations if one blindly followed the standard recipe for analyzing the $\varepsilon \to 0$ limit. However, the coupling of the phase field with the outer pressure-like field was such that the equations were completely unstable on the inner scale for small $\varepsilon$. So do watch out!
Fig. 9. The functions $g$ and $V$ in the case of front propagation into unstable states. Compare Fig. 4, where the functions are drawn for the case of a front between a stable and a metastable state.
further. If, on the other hand, nonlinearities mainly cause saturation, fronts propagate with a velocity determined by linearization about the unstable state, as if they are “pulled” by the linear stability (“pulled” [50,51] or “linear marginal stability” [48,49] regime).

Almost all important differences between “pulled” or “linear marginal stability” fronts propagating into an unstable state and those propagating into a metastable state trace back to the fact that in the latter case there typically is a discrete set of front solutions, only one of which is stable as was illustrated in Fig. 6, while in the former case there generally is a family of moving front solutions [44,47-49]. To illustrate this, we again turn to Eq. (35), but now take $g(\phi)$ of the form sketched in Fig. 9a. In this case, $g'(0) > 0$, so the state $\phi = 0$ is unstable. If we again consider fronts propagating into this state, the potential $V$ corresponding to this function $g$ is the one shown in Fig. 9b. Now, the question of the existence of a uniformly translating profile $\phi_v(x - vt) = \phi_v(\xi_v)$ translates into the question “is there a solution in the particle on the hill analogy in which the particle starts at time $\xi_v = -\infty$ at the top, and comes to the bottom as $\xi_v \to \infty$?”. Obviously, there is such a solution for any positive value of the “friction coefficient” $v$: there is a continuous family of uniformly translating front solutions.

It is useful to consider the relation between the velocity $v$ which labels the front solutions, and the asymptotic decay rate $\Lambda$: if we linearize Eq. (35) around the state $\phi = 0$ and write $\phi_v \sim e^{-\Lambda\xi_v}$, then we get $v\Lambda = \Lambda^2 + g'(0)$, so
$$\Lambda_\pm = \tfrac{1}{2}v \pm \sqrt{\tfrac{1}{4}v^2 - g'(0)}, \qquad g'(0) > 0. \tag{63}$$
For $v > 2\sqrt{g'(0)}$, the roots are real, and $\Lambda_- < \Lambda_+$. For $v < 2\sqrt{g'(0)}$, the roots are complex, meaning that the front solutions decay to zero as $\phi_v \sim \cos({\rm Im}\,\Lambda_\pm\,\xi_v)\, e^{-{\rm Re}\,\Lambda_\pm\,\xi_v}$. Clearly, the velocity $v^* = 2\sqrt{g'(0)}$ is a special value, as the two roots coincide there: $\Lambda_- = \Lambda_+ = \Lambda^*$.
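The three regimes of Eq. (63) are easy to tabulate. The following sketch simply evaluates the two roots of $v\Lambda = \Lambda^2 + g'(0)$ and the velocity at which the square root in Eq. (63) vanishes, $v^2/4 = g'(0)$; the function names are hypothetical and the roots are returned as complex numbers so that all cases are handled uniformly.

```python
import cmath
import math

def decay_rates(v, g_prime_0):
    """Roots (Lambda_-, Lambda_+) of v*Lambda = Lambda^2 + g'(0), Eq. (63)."""
    root = cmath.sqrt(v * v / 4.0 - g_prime_0)
    return v / 2.0 - root, v / 2.0 + root

def pulled_velocity(g_prime_0):
    """Velocity v* at which the two roots coincide: (v*)^2 / 4 = g'(0)."""
    return 2.0 * math.sqrt(g_prime_0)

if __name__ == "__main__":
    g1 = 1.0                                   # g'(0) = 1, as used later in the text
    for v in (1.0, pulled_velocity(g1), 2.5):
        lam_minus, lam_plus = decay_rates(v, g1)
        kind = "oscillatory" if abs(lam_minus.imag) > 0.0 else "monotonic"
        print(v, lam_minus, lam_plus, kind)
```

With $g'(0) = 1$ this gives $v^* = 2$ and $\Lambda^* = 1$; profiles with $v < v^*$ have complex decay rates and therefore oscillate around $\phi = 0$ in the leading edge.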
It is a well-known result that in such a degenerate case, the front profile does not decay as a single exponential, but that instead in this case
$$\phi_{v^*}(\xi_v) \sim (\xi_v + {\rm const.})\, e^{-\Lambda^*\xi_v}, \tag{64}$$
so that the dominant behavior for large $\xi_v$ is the $\xi_v e^{-\Lambda^*\xi_v}$ term.

The special status of the value $v^*$ also becomes clear when we look at the stability analysis of the fronts $\phi_v(\xi_v)$. If we retrace the stability analysis of Section 2, then in this case the potential $U(\xi_v)$ in the Schrödinger-type equation for the spectrum has an asymptotic value $(v^*)^2/4 - g'(0) = 0$. Hence,
according to our arguments the continuous spectrum associated with quantum mechanically allowable eigenfunctions $\psi$ (footnote 16) comes all the way down to zero, i.e., there is no gap. This already gives a hint that there will be nonexponential relaxation.

In the derivation of effective interface equations, we encountered solvability conditions which involved integrals of the form $\int dx\,(d\phi_0/dx)^2$; see Eq. (61). In the present case, the front velocity is always nonzero, and as a result the stability operator is non-Hermitian [25]. If one tries to derive effective interface equations for such fronts using the same type of approach as discussed at the end of Section 2, one needs the zero mode of the adjoint operator of the problem with $v \neq 0$ in the corresponding solvability condition. Because of the non-Hermitian nature of this operator for $v \neq 0$, this zero mode turns out to be $e^{v\xi_v}(d\phi_v/d\xi_v)$, and one encounters integrals of the type $\int d\xi_v\, e^{v\xi_v}(d\phi_v/d\xi_v)^2$ (note that for $v = 0$, the zero mode of the adjoint operator reduces to the one we used before, $\partial\phi_0/\partial\xi_v$). As $\xi_v \to \infty$, the integrand behaves as (footnote 17) $e^{(v - 2\Lambda_-)\xi_v} = e^{2\sqrt{v^2/4 - g'(0)}\,\xi_v}$. As a result, the integrals that arise if one naively applies the standard analysis do not converge. Although there have been some suggestions [61] that one might regularize such integrals by introducing a cutoff which is taken to infinity at the end of the calculations, such fixes do not appear to work here and obscure the connection of this problem with the slow relaxation discussed below.

We have not yet discussed the origin of the result that “pulled” fronts which emerge from sufficiently localized initial conditions move with a speed $v^*$ determined by the linear behavior of the dynamical equation [in our case, the fact that $v^*$ is determined solely by $g'(0)$]. The origin lies in the fact that any perturbation about the unstable state grows out and spreads by itself.
This leads to a natural spreading speed of linear perturbations, and $v^*$ is nothing but this speed itself [49,60]. If nonlinearities mainly suppress further growth, then indeed the dynamically relevant front is “pulled” [50] by the leading edge whose dynamics is governed by the linearized equation.

Ebert and I have recently found that one can build on this idea to analyze the relaxation of front profiles towards $\phi_{v^*}$ [25]. The main idea can be illustrated within the context of the dynamical equation (Eq. (35)) as follows. Let us use the freedom of choosing appropriate space and time scales to take $g'(0) = 1$. As the discussion following Eq. (63) shows, $v^* = 2$ and $\Lambda^* = 1$ in this case, and the linearized dynamical equation reads
$$\frac{\partial\phi(x, t)}{\partial t} = \frac{\partial^2\phi(x, t)}{\partial x^2} + \phi(x, t). \tag{65}$$
We now write the equation in the moving frame $\xi_v = x - v^* t$ moving with velocity $v^* = 2$, and make the transformation $\phi(\xi_v, t) = e^{-\xi_v}\psi(\xi_v, t)$. This is essentially the same type of transformation that we did before in Eq. (49) when we performed the stability analysis of moving front solutions. With these transformations, $\psi$ simply obeys the diffusion equation
$$\frac{\partial\psi(\xi_v, t)}{\partial t} = \frac{\partial^2\psi(\xi_v, t)}{\partial\xi_v^2}. \tag{66}$$
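For readers who want to see the cancellation explicitly, the step from the linearized equation to the diffusion equation can be written out as follows. This is only a worked version of the substitution described above, using the shorthand $\partial_{\xi_v} \equiv \partial/\partial\xi_v$ introduced here for brevity.

```latex
% Moving frame \xi_v = x - v^* t with v^* = 2: the time derivative at fixed x
% picks up a drift term, so Eq. (65) becomes
\begin{align*}
\partial_t \phi &= 2\,\partial_{\xi_v}\phi + \partial_{\xi_v}^2\phi + \phi .
\intertext{Substituting $\phi = e^{-\xi_v}\psi$ gives}
\partial_{\xi_v}\phi &= e^{-\xi_v}\left(\partial_{\xi_v}\psi - \psi\right), \\
\partial_{\xi_v}^2\phi &= e^{-\xi_v}\left(\partial_{\xi_v}^2\psi - 2\,\partial_{\xi_v}\psi + \psi\right),
\intertext{so that all terms proportional to $\partial_{\xi_v}\psi$ and to $\psi$ cancel:}
e^{-\xi_v}\,\partial_t\psi
 &= e^{-\xi_v}\left[\,2\left(\partial_{\xi_v}\psi - \psi\right)
   + \left(\partial_{\xi_v}^2\psi - 2\,\partial_{\xi_v}\psi + \psi\right) + \psi\,\right]
  = e^{-\xi_v}\,\partial_{\xi_v}^2\psi .
\end{align*}
```

The cancellation of the first-derivative terms requires the exponent to equal $v^*/2$, and the cancellation of the terms proportional to $\psi$ requires $(v^*)^2/4 = g'(0)$; both hold only at the degenerate point $v = v^*$, $\Lambda = \Lambda^*$.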
Footnote 16: At this point, the warning of footnote 12 becomes important: for fronts propagating into an unstable state, there are important eigenfunctions of the stability operator which are not in the class of eigenfunctions that are allowed in quantum mechanics, as they diverge as $\xi_v \to \pm\infty$. These are especially important when studying the stability of front solutions with velocity $v > v^*$, as these are the type of solutions whose eigenvalue continues all the way down to zero. As a result, the stability spectrum is always gapless. See [25] for further details.

Footnote 17: The factor $\sqrt{v^2/4 - g'(0)}$ in the exponential is zero at $v^*$. At $v^*$, the integrals still diverge, but only as a power law [25].
As is well known, in many diffusion type problems the long time asymptotics is governed by the fundamental similarity solution or one of its derivatives, like
$$\psi_1^{\rm sim} = \frac{1}{t^{1/2}}\, e^{-\xi_v^2/4t} \qquad\text{or}\qquad \psi_2^{\rm sim} = -\frac{\partial\psi_1^{\rm sim}}{\partial\xi_v} = \frac{\xi_v}{2t^{3/2}}\, e^{-\xi_v^2/4t}, \tag{67}$$
so it is not unreasonable to expect that one of these similarity solutions also governs the long time asymptotics in the leading edge here. If so, the corresponding function $\phi(\xi_v, t)$ should approach the dominant $\xi_v e^{-\xi_v}$ term of Eq. (64) for large times. As $\psi = e^{\Lambda^*\xi_v}\phi$, this means that the spatial dependence of the similarity solution $\psi^{\rm sim}$ that we are looking for should go as $\xi_v$ for $\xi_v^2 \ll t$. Clearly, the appropriate one is $\psi_2^{\rm sim}$. Hence, this simple argument suggests that in the frame moving with velocity $v^* = 2$, the dominant long time dynamics in the leading edge is
$$\phi \sim \frac{\xi_v}{t^{3/2}}\, e^{-\xi_v - \xi_v^2/4t} = e^{-\xi_v - \frac{3}{2}\ln t + \ln\xi_v - \xi_v^2/4t}. \tag{68}$$
If we now track the position $\xi_h(t)$ of the point where $\phi(\xi_v, t) = h$, we get to dominant order from the requirement that the exponent in the above expression remains constant
$$\xi_h = -\tfrac{3}{2}\ln t + \cdots \quad\Longrightarrow\quad \dot\xi_h = -\frac{3}{2t} + \cdots \tag{69}$$
As $\dot\xi_h$ is the velocity of the point where $\phi = h$, we see that in the leading edge of the profile the velocity relaxes towards $v^*$ as $-3/(2t)$. This is precisely what was found by Bramson [62] from a rigorous analysis.

Although the above argument is rather handwaving, we have recently found [25] that it can be made into a systematic asymptotic analysis which applies not just to the second-order dynamical equation (Eq. (35)), but also to higher-order partial differential equations which admit uniformly translating front solutions. The surprising finding is that not just the leading order $\sim 1/t$ relaxation term in the velocity is universal, but also the first subdominant $\sim 1/t^{3/2}$ term, which cannot be obtained from the above argument: independent of the “height” $h$ whose position we track, we find that the velocity $v_h(t) = v^* + \dot\xi_h$ relaxes to $v^*$ as
$$v_h = v^* - \frac{3}{2\Lambda^* t}\left(1 - \frac{\sqrt{\pi}}{\Lambda^*\sqrt{Dt}}\right) + O\!\left(\frac{1}{t^2}\right), \tag{70}$$
where, for the order parameter equation (35) with $g'(0) = 1$, one has $v^* = 2$, $\Lambda^* = 1$ and $D = 1$. In the more general case, $D$ is a coefficient which plays the role of a diffusion coefficient, and which can be determined explicitly from the dispersion relation of the linearized equation. Moreover, the shape of the profile also relaxes with the same slow power laws, in a universal way which is related to the existence of a family of front solutions. We refer to [25] for details.
The above $1/t$ power law relaxation is clearly too slow to make an effective interface approximation with boundary conditions which are local in space and time possible for "pulled" fronts whose propagation into an unstable state originates in diffusive spreading and growth. To see this, consider, e.g., a spherically symmetric front in Eq. (35) in three dimensions which grows out from some localized region around the origin. For long times the front region is thin in comparison with the distance $r_f$ from the origin and the curvature of the front is small and of order $2/r_f \approx 2/(2v^* t)$. Thus, the curvature is of the same order as the dominant relaxation term of the front, and one cannot simply express the instantaneous front velocity in terms of $v^*$ plus some kind of curvature correction, as we saw one can do for fronts between a stable and a metastable state. Some preliminary numerical investigations have confirmed this. Whether some other interfacial description with memory type of terms can be developed, or whether there are other unexpected consequences of this slow relaxation, is at present an open question.
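To get a feeling for how slow the relaxation in Eq. (70) is, the following minimal Python sketch evaluates the formula with the $O(1/t^2)$ remainder dropped. The function name and its default arguments (the values $v^* = 2$, $\Lambda^* = 1$, $D = 1$ quoted for Eq. (35)) are illustrative choices, not part of the original analysis.

```python
import math

def pulled_front_velocity(t, v_star=2.0, lam_star=1.0, D=1.0):
    """Velocity v_h(t) from Eq. (70), neglecting the O(1/t^2) remainder.

    Defaults are the values quoted in the text for Eq. (35); the function
    itself is only an illustration, valid for large t.
    """
    if t <= 0:
        raise ValueError("t must be positive")
    leading = 3.0 / (2.0 * lam_star * t)                     # universal 1/t term
    subdominant = math.sqrt(math.pi) / (lam_star * math.sqrt(D * t))
    return v_star - leading * (1.0 - subdominant)

if __name__ == "__main__":
    for t in (10.0, 100.0, 1000.0, 10000.0):
        deficit = 2.0 - pulled_front_velocity(t)
        # t * deficit approaches 3/2 only slowly, because the relative
        # correction decays as t**(-1/2).
        print(f"t = {t:8.0f}   v* - v_h = {deficit:.6f}   t*(v* - v_h) = {t * deficit:.4f}")
```

The product $t\,(v^* - v_h)$ tends to $3/2$ only as $t^{-1/2}$, so the velocity deficit stays of order $1/t$ at all times, which is the same order as the curvature of the spherical front discussed above.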
Acknowledgements Much of my thinking on the issues discussed in these lecture notes has been shaped by interactions and collaborations with Christiane Caroli and Ute Ebert. I wish to thank both of them. In addition, I want to thank Ute Ebert and Ramses van Zon for extensive comments on an earlier version of the manuscript, and Lucas du Croo de Jongh for teaching me how to make computer drawings.
References [1] [2] [3] [4] [5] [6]
[7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22]
U. Ebert, W. van Saarloos, C. Caroli, Phys. Rev. Lett. 77 (1996) 4178; Phys. Rev. E 55 (1997) 1530.
E. Brener, H. Müller-Krumbhaar, D. Temkin, Phys. Rev. E 54 (1996) 2714.
W.W. Mullins, R.F. Sekerka, J. Appl. Phys. 34 (1963) 323; J. Appl. Phys. 35 (1964) 444.
P. Meakin, in: C. Domb, J.L. Lebowitz (Eds.), Phase Transitions and Critical Phenomena, vol. 12, Academic, New York, 1988.
The literature on phase-field models is vast by now. An entry into the more mathematically oriented literature can be got from P.W. Bates, P.C. Fife, R.A. Gardner, C.K.R.T. Jones, Physica D 104 (1997) 1. An example where a phase-field model has been exploited numerically to study the dendrite problem is A. Karma, W.-J. Rappel, Phys. Rev. E 53 (1996) R3017, preprint, July 1997. These papers also give an entry to the more physics oriented literature. An early use of the word "phase-field model" is found in J.S. Langer, in: G. Grinstein, G. Mazenko (Eds.), Directions in Condensed Matter Physics, World Scientific, Singapore, 1986.
J.D. Buckmaster, G.S.S. Lundford, Theory of Laminar Flames, Cambridge University Press, Cambridge, 1982.
A very nice elementary account of several of the essential ingredients of flame fronts and their instabilities can be found in J.D. Buckmaster, Physica 12D (1984) 173.
See, e.g., E. Meron, Phys. Rep. 218 (1992) 1.
R.E. Goldstein, D.J. Muraki, D.M. Petrich, Phys. Rev. E 53 (1996) 3933.
M. Kléman, Points, Lines and Walls in Liquid Crystals, Magnetic Systems and Various Ordered Media, Wiley, New York, 1983.
J.D. Gunton, M. San Miguel, P.S. Sahni, in: C. Domb, J.L. Lebowitz (Eds.), Phase Transitions and Critical Phenomena, vol. 8, Academic, New York, 1983.
A.J. Bray, Adv. Phys. 43 (1994) 357.
J.S. Langer, in: C. Godrèche (Ed.), Solids far from Equilibrium, Cambridge University Press, Cambridge, 1992.
Here, smooth interface models go back to the classic work of S.M. Allen, J.W. Cahn, Acta Metall. 27 (1979) 1085.
P.C. Hohenberg, B.I. Halperin, Rev. Mod. Phys. 49 (1977) 435.
M.C. Cross, P.C. Hohenberg, Rev. Mod. Phys. 65 (1993) 851.
See M.A. Anisimov, P.E. Cladis, E.E. Gorodetskii, D.A. Huse, V.E. Podneks, V.G. Taratuta, W. van Saarloos, V.P. Voronov, Phys. Rev. A 41 (1990) 6749, and references therein.
J. Fineberg, V. Steinberg, Phys. Rev. Lett. 58 (1987) 1332.
M.P.M. den Nijs, in: C. Domb, J.L. Lebowitz (Eds.), Phase Transitions and Critical Phenomena, vol. 12, Academic, New York, 1988.
C.M. Bender, S.A. Orszag, Advanced Mathematical Methods for Scientists and Engineers, McGraw-Hill, New York, 1978.
M. Van Dyke, Perturbation Methods in Fluid Mechanics, Parabolic Press, Stanford, 1975.
[23] P.C. Fife, Dynamics of Internal Layers and Diffusive Interfaces, SIAM, Philadelphia, 1988.
[24] For an overview of adiabatic decoupling in ordinary differential equations as a result of the separation of time scales, see e.g., N.G. van Kampen, Phys. Rep. 124 (1985) 69.
[25] U. Ebert, W. van Saarloos, Phys. Rev. Lett. 80 (1998) 1650.
[26] J.S. Langer, Rev. Mod. Phys. 52 (1980) 1; in: J. Souletie (Ed.), Chance and Matter, North-Holland, Amsterdam, 1987.
[27] B. Caroli, C. Caroli, B. Roulet, in: G. Godrèche (Ed.), Solids far from Equilibrium, Cambridge University Press, Cambridge, 1992.
[28] K. Kassner, Pattern Formation in Diffusion-Limited Crystal Growth, World Scientific, Singapore, 1996.
[29] For a review of the issues related to the microscopic nature of the crystal-melt interface, see, e.g., J.D. Weeks, in: T. Riste (Ed.), Ordering in Strongly Fluctuating Condensed Matter Systems, Plenum, New York, 1980, or P. Nozières, in: G. Godrèche (Ed.), Solids far from Equilibrium, Cambridge University Press, Cambridge, 1992.
[30] E. Rolley, S. Balibar, F. Graner, Phys. Rev. E 49 (1994) 1500.
[31] P. Tabeling, G. Zocchi, A. Libchaber, J. Fluid Mech. 177 (1987) 67.
[32] P.A. Vitello, B.M. Penetrante, J.N. Bardsley, Phys. Rev. E 49 (1994) 5574.
[33] P.A. Vitello, B.M. Penetrante, J.N. Bardsley, in: B.M. Penetrante, S.E. Schultheis (Eds.), Non-Thermal Plasma Techniques for Pollution Control, Springer, Berlin, 1993.
[34] D.A. Kessler, J. Koplik, H. Levine, Adv. Phys. 37 (1988) 255; P. Pelcé, Dynamics of Curved Fronts, Academic, Boston, 1988; Y. Pomeau, M. Ben Amar, in: C. Godrèche (Ed.), Dendritic growth and related topics, in: Solids far from Equilibrium, Cambridge University Press, Cambridge, 1992; E.A. Brener, V.I. Mel'nikov, Adv. Phys. 40 (1991) 53.
[35] For this reason, kinetic relations like Eq. (3) can easily be studied with computer simulations. For an example of this, see, e.g., J.Q. Broughton, G.H. Gilmer, K.A. Jackson, Phys. Rev. Lett. 49 (1982) 1496. The interface mobility coefficient $1/\beta$ in Eq. (3) is nothing but the slope of Fig. 3 in this paper.
[36] E. Brener, Phys. Rev. Lett. 71 (1993) 3653.
[37] D. Bonn, H. Kellay, M. BenAmar, J. Meunier, Phys. Rev. Lett. 75 (1995) 2132.
[38] L. Puech, G. Bonfait, B. Castaing, J. Physique 47 (1986) 723.
[39] L.P. Kadanoff, J. Stat. Phys. 39 (1985) 267.
[40] D. Bensimon, L.P. Kadanoff, S. Liang, B.I. Shraiman, C. Tang, Rev. Mod. Phys. 58 (1986) 977.
[41] A. Arneodo, Y. Couder, G. Grasseau, V. Hakim, M. Rabaud, in: F.H. Busse, L. Kramer (Eds.), Nonlinear evolution of spatio-temporal structures in dissipative continuous systems, Plenum, New York, 1990.
[42] E. Brener, H. Levine, Y. Tu, Phys. Rev. Lett. 66 (1991) 1978.
[43] A very enjoyable article which discusses fronts in various non-gradient systems is M. San Miguel, R. Montagne, A. Amengual, E. Hernández-García, Multiple front propagation in a potential non-gradient system, in: E. Tirapegui, W. Zeller (Eds.), Instabilities in Nonequilibrium Structures V, Kluwer, Dordrecht, 1996.
[44] E. Ben-Jacob, H.R. Brand, G. Dee, L. Kramer, J.S. Langer, Physica D 14 (1985) 348.
[45] A. Messiah, Quantum Mechanics, North-Holland, Amsterdam, 1974.
[46] R. Kupferman, O. Shochet, E. Ben-Jacob, A. Schuss, Phys. Rev. B 46 (1992) 16045.
[47] D.G. Aronson, H.F. Weinberger, Adv. Math. 30 (1978) 33.
[48] W. van Saarloos, Phys. Rev. 37 (1988) 211.
[49] W. van Saarloos, Phys. Rev. A 39 (1989) 6367.
[50] The terms pulled and pushed were introduced by A.N. Stokes, Math. Biosci. 31 (1976) 307.
[51] G.C. Paquette, Y. Oono, Phys. Rev. E 49 (1994) 2368; G.C. Paquette, L.-Y. Chen, N. Goldenfeld, Y. Oono, Phys. Rev. Lett. 72 (1994) 76.
[52] C. Mathis, M. Provansal, L. Boyer, J. Phys. Lett. 45 (1984) L483; G.S. Triantafyllou, K. Kupfer, A. Bers, Phys. Rev. Lett. 59 (1987) 1914.
[53] G. Ahlers, D.S. Cannell, Phys. Rev. Lett. 50 (1983) 1583.
[54] R.C. Ball, R.L.H. Essery, J. Phys.: Condens. Matter 2 (1990) 10303; R.A.L. Jones, L.J. Norton, E.J. Kramer, F.S. Bates, P. Wiltzius, Phys. Rev. Lett. 66 (1991) 1326.
[55] T.R. Powers, R.E. Goldstein, Phys. Rev. Lett. 78 (1997) 2555; see also L. Limat, P. Jenffer, B. Dagens, E. Touron, M. Fermigier, J.E. Wesfreid, Physica D 61 (1992) 166.
[56] E.K.H. Salje, J. Phys.: Condens. Matter 5 (1993) 4775.
[57] S.J. Di Bartolo, A.T. Dorsey, Phys. Rev. Lett. 77 (1996) 4442, and references therein.
[58] A. Torcini, P. Grassberger, A. Politi, J. Phys. A 27 (1995) 4533.
[59] M. Niklas, M. Lücke, H. Müller-Krumbhaar, Phys. Rev. A 40 (1989) 493.
[60] L.D. Landau, E.M. Lifshitz, Course of Theoretical Physics, vol. 10, Pergamon, New York, 1981.
[61] L.-Y. Chen, N. Goldenfeld, Y. Oono, G. Paquette, Physica A 204 (1994) 111.
[62] M. Bramson, Mem. Am. Math. Soc. 44 (1983) 285.
Physics Reports 301 (1998) 45—64
Driven diffusive systems. An introduction and recent developments B. Schmittmann*, R.K.P. Zia Center for Stochastic Processes in Science and Engineering, Physics Department, Virginia Polytechnic Institute and State University, Blacksburg, VA, 24061-0435 USA
Abstract Nonequilibrium steady states in driven diffusive systems exhibit many features which are surprising or counterintuitive, given our experience with equilibrium systems. We introduce the prototype model and review its unusual behavior in different temperature regimes, from both a simulational and analytic view point. We then present some recent work, focusing on the phase diagrams of driven bi-layer systems and two-species lattice gases. Several unresolved puzzles are posed. © 1998 Elsevier Science B.V. All rights reserved. PACS: 64.60.Cn; 66.30.Hs; 82.20.Mj Keywords: Non-equilibrium steady states; Driven diffusive systems
1. Introduction
In nature, there are no true equilibrium phenomena, since all of these require infinite times and infinite thermal reservoirs or perfect insulations. Nevertheless, for a large class of systems, it is possible to set up conditions under which predictions from equilibrium statistical mechanics provide excellent approximations, as many of the inventions of the industrial revolution can attest to. By contrast, non-equilibrium phenomena are not only ubiquitous, but often elude the powers of the Boltzmann–Gibbs framework. Unfortunately, to date, the theoretical development of nonequilibrium statistical mechanics is at a stage comparable to that of its equilibrium counterpart in the days before Maxwell and Boltzmann. Using the intuition developed by studying equilibrium statistical mechanics, we are often "surprised" by the behavior displayed by systems far from equilibrium, even if they appear to be in time-independent states. For example, the well-honed arguments, based on the competition between energy and entropy, frequently fail dramatically. In these lectures, we will present some explorations into the intriguing realm of non-equilibrium steady states, focusing only on a small class, namely, driven diffusive systems.
* Corresponding author. Tel.: 1 540 231-6518; Fax: 1 540 231 7511.
0370-1573/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S0370-1573(98)00005-2
Motivated by both the theoretical interest in non-equilibrium steady states and the physics of fast ionic conductors [1], Katz et al. [2] introduced a deceptively minor modification to a well-known system in equilibrium statistical mechanics: the Ising [3] model with nearest-neighbor interactions. In lattice gas language [4], the time evolution of this model can be specified by particles hopping to nearest vacant sites, with rates which simulate coupling to a thermal bath as well as an external field, such as gravity. If "brick wall" boundary conditions are imposed in the direction associated with gravity (particles reflected at the boundary, comparable to a floor or a ceiling), and if the rates obey detailed balance, then this system will eventually settle into an equilibrium state, similar to that of gas molecules in a typical room on earth. Although there is a local bias in the hopping rates (due to gravity), thermal equilibrium is established by an inhomogeneous particle density, at all $T < \infty$. However, if periodic boundary conditions are imposed, then translational invariance is completely restored so that, in the final steady state, the particle density is homogeneous, for all $T$ above some finite critical $T_c$, while a current will be present. Clearly, for gravity, such boundary conditions can be imposed only in art, as by M.C. Escher.$^1$ In physics, it is possible to establish such a situation with an electric field, e.g., by placing the $d = 2$ lattice on the surface of a cylinder and applying a magnetic field down the cylinder axis which increases linearly in time. If the particles are charged, they will experience a local bias everywhere on the cylinder and a current will persist in the steady state.
Echoing the physics of fast ionic conductors, we will therefore use the term "electric" field to describe the external drive and imagine our particles to be "charged", in their response to this drive. This is the prototype of a "driven diffusive system". Now, such a system constantly gains (loses) energy from (to) the external field (thermal bath), so that the time-independent state is by no means an equilibrium state, à la Boltzmann–Gibbs. Instead, it is a non-equilibrium steady state, with an unknown distribution in general. As discovered in the last decade, modifying the Ising model to include a simple local bias leads to a large variety of far-from-simple behavior.
The scope of these lectures necessarily limits us to only a bird's eye view. Since a more extensive review has been published recently [5], we will restrict these notes to a brief "introduction" to this subject. Instead, we choose to include some developments since that review. For completeness, we summarize, in Section 2, the lattice model introduced by Katz et al. [2] (KLS) and the Langevin equation believed to capture its essence in the long-time large-distance regime. Section 3 is devoted to some of the surprising behavior displayed by this system, at temperatures far above, near, and well below $T_c$. A collection of recent developments is presented in Section 4. In Section 5, we conclude with a brief summary.
2. The microscopic model and a continuum field approach
On a square lattice, with $L_x \times L_y$ sites and toroidal boundary conditions, a particle or a hole may occupy each site. A configuration is specified by the occupation numbers $\{n_i\}$, where $i$ is a site label and $n_i$ is either 1 or 0. Occasionally, we also use spin language, defining $s \equiv 2n - 1 = \pm 1$. To access the critical point, half-filled lattices are generally used in Monte-Carlo simulations: $\sum_i n_i = L_x L_y/2$. The particles are endowed with nearest-neighbor attraction (ferromagnetic, in spin language), modeled in the usual way through the Hamiltonian: $H = -4J \sum_{\langle i,j\rangle} n_i n_j$, with $J > 0$. The external drive, with strength $E$ and pointing in the $-y$ direction, will bias in favor of particles hopping "downwards". To simulate coupling to a thermal bath at temperature $T$, the Metropolis algorithm is typically used, i.e., the contents of a randomly chosen, nearest-neighbor, particle–hole pair are exchanged, with a probability $\min[1, e^{-(\Delta H - \epsilon E)/k_B T}]$. Here, $\Delta H$ is the change in $H$ after the exchange and $\epsilon = (-1, 0, 1)$, for a particle attempting to hop (against, orthogonal to, along) the drive. Note that these dynamic rules conserve particle number.
1 In particular, see the lithograph Ascending and Descending, reproduced on the cover of Ref. [5].
For $E = 0$, this system will eventually settle into an equilibrium state which is precisely the static Ising model [3,4]. In the thermodynamic limit, it is known to undergo a second-order phase transition at the Onsager critical temperature $T_c(0) = (2.2692..)J/k_B$. When driven ($E \neq 0$), this system displays the same qualitative properties, i.e., there is a disordered phase for large $T$, followed by a second-order transition into a phase-segregated state for low $T$. However, with more scrutiny, this superficial similarity gives way to puzzling surprises. In particular, $T_c(E)$ appears to be monotonically increasing with $E$, saturating at about $1.41\,T_c(0)$ [6] for $E \gg J$. Why should $T_c(\infty) > T_c(0)$ be surprising? Consider the following "argument". For very large $E$, hopping along $y$ becomes like a random walk, in that $\Delta H$ is irrelevant for the rates. Therefore, hops along $y$ might as well be coupled to a thermal bath at infinite $T$ (apart from the bias).
Indeed, recall that our system gains energy from $E$ and loses it to the bath, so that any drive may be thought of as a coupling to a second reservoir with higher temperature. Then, it seems reasonable that $T_c$ must be lowered to achieve ordering, since this extra reservoir pumps in a higher level of noise, helping to disorder the system! To date, there is neither a convincing argument nor a computation which predicts the correct sign of $T_c(E) - T_c(0)$, let alone the magnitude. In the next section, we will briefly review other puzzling discoveries, only a few of which are understood.
To understand collective behavior in the long-time and large-scale limit, we often rely on continuum descriptions, which are hopefully universal to some extent. Successful examples include hydrodynamics and Landau–Ginzburg theories. Certainly, the $\varphi^4$ theory, enhanced by renormalization group techniques, offers excellent predictions for both the statics and the dynamics near equilibrium of the Ising model. Following these lines, we formulate a continuum theory for the KLS model, in arbitrary dimension $d$. In principle, such a description can be obtained by coarse-graining the microscopic dynamics [7] but we will pursue a more phenomenological approach here. Seeking a theory in the long-time, large-wavelength limit, we first identify the slow variables of the theory. These are typically
• ordering fields which experience critical slowing down near $T_c$, and
• any conserved densities.
The KLS model is particularly simple since it involves a single ordering field, namely the local "magnetization" or excess particle density, $\varphi(x, t)$, which is also the only conserved quantity. Here, $x$ stands for $(x_1, x_2, \dots, x_{d-1}, x_d = y)$. The last entry denotes the one-dimensional "parallel" subspace selected by $E$. Thus, we begin with a continuity equation, $\partial_t \varphi + \nabla \cdot j = 0$, and postulate an appropriate form for the current $j$. In the absence of $E$, it simply takes its Model B [8] form,
$$j(x, t) = -\lambda \nabla \frac{\delta \mathcal{H}}{\delta \varphi} + \eta(x, t),$$
with the Landau–Ginzburg Hamiltonian
$$\mathcal{H} = \int d^d x \left\{ \frac{1}{2} (\nabla \varphi)^2 + \frac{\tau}{2} \varphi^2 + \frac{u}{4!} \varphi^4 \right\}.$$
As usual, $\tau \propto T - T_c$ and $u > 0$. While the first contribution to $j$ is a deterministic term, reflecting local chemical potential gradients, the second term, $\eta$, models the thermal noise. The noise is Gaussian distributed, with zero mean and positive second moment proportional to the unit matrix, i.e., $\langle \eta_i \eta_j \rangle = 2\sigma \delta_{ij}\, \delta(x - x')\, \delta(t - t')$, $i, j = 1, \dots, d$. In the presence of $E$, we should expect at least two modifications to $j$, namely first, an additional contribution $j_E$ modeling the non-vanishing mass transport through the system and second, the generation of anisotropies, since $E$ singles out a specific direction. By virtue of the excluded volume constraint, the "Ohmic" current $j_E$ vanishes at densities 1 and 0, corresponding to $\varphi = \pm 1$. Writing $\mathcal{E}$ for the coarse-grained counterpart of $E$, pointing along unit vector $\hat{y}$, the simplest form is therefore $j_E = \mathcal{E}(1 - \varphi^2)[1 + O(\varphi)]\, \hat{y}$, where the $O(\varphi)$ corrections will turn out to be irrelevant for universal properties.
Next, we consider possible anisotropies. Clearly, all $\nabla^2$ operators should be split into components parallel ($\partial^2$) and transverse ($\nabla_\perp^2$) to $E$, accompanied by different coefficients. For example, the anisotropic version of the Model B term $\tau \nabla^2 \varphi$ will read $(\tau_\parallel \partial^2 + \tau_\perp \nabla_\perp^2)\varphi$, with two different "diffusion" coefficients for the parallel and transverse subspaces. Should we expect both of these to vanish as $T$ approaches $T_c(E)$? Or just one of them, but which one? Recalling that the lowering of $\tau$, in the equilibrium system, is a consequence of the presence of interparticle interactions, we argue that $E$, stirring parallel jumps much more effectively than transverse ones, should counteract this effect in the parallel direction, having less of an impact in the transverse subspace. Thus, we anticipate that generically $\tau_\parallel > \tau_\perp$, so that criticality, in particular, is marked by $\tau_\perp$ vanishing at positive $\tau_\parallel$.
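This is borne out by the structure of typical ordered configurations, namely, single strips aligned with $E$, indicating that "antidiffusion", i.e., $\tau_\perp < 0$, dominates in the transverse directions below $T_c$. Finally, the drive also induces anisotropies in the noise terms so that the second moment of their distribution should be taken as $\langle \eta_i \eta_j \rangle = 2\sigma_i \delta_{ij}\, \delta(x - x')\, \delta(t - t')$. Since the transverse subspace is still fully isotropic, we define $\sigma_1 = \dots = \sigma_{d-1} \equiv \sigma_\perp$. Generically, however, we should expect $\sigma_\parallel \equiv \sigma_d \neq \sigma_\perp$. Summarizing, we write down the full Langevin equation:
$$\partial_t \varphi(x, t) = \lambda \left\{ (\tau_\perp - \nabla_\perp^2) \nabla_\perp^2 \varphi + (\tau_\parallel - \partial^2)\, \partial^2 \varphi - 2\alpha_\times \partial^2 \nabla_\perp^2 \varphi + \frac{u}{3!} (\nabla_\perp^2 + \kappa\, \partial^2)\, \varphi^3 + \mathcal{E}\, \partial \varphi^2 \right\} - \nabla \cdot \eta(x, t), \tag{1}$$
with noise correlations
$$\langle \eta_i(x, t)\, \eta_j(x', t') \rangle = 2\sigma_i \delta_{ij}\, \delta(x - x')\, \delta(t - t').$$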
This equation forms the basis for the analytic study of the KLS model. Fortunately, we need not be concerned about the detailed dependence of its coefficients on the microscopic $T$, $J$, and $E$. While they are in principle calculable within an explicit coarse-graining scheme, the properties that we already outlined above suffice for our purposes. Moreover, it turns out that we can simplify the most general form, Eq. (1), depending on the temperature regime considered. We will return to this discussion in Sections 3.1 and 3.3 below.
To conclude this section, let us take a glance at the structure of Eq. (1). Its basic form is $\partial_t \varphi = F(\varphi, \nabla\varphi, \dots) + \zeta$, with noise correlations $\langle \zeta(x, t)\, \zeta(x', t') \rangle = 2\sigma N \delta(x - x')\, \delta(t - t')$ and $\sigma > 0$ just a positive constant. For simplicity, we restrict ourselves to a scalar order parameter $\varphi$ and noise "matrices" $N$ which do not depend on $\varphi$ (more general cases are discussed in, e.g., Refs. [9,10]). $F$ is a functional of $\varphi$ and its derivatives. Given this basic form, a steady-state solution $P^*[\varphi]$ for the configurational probability is easily found$^2$ provided $F$ is "Hamiltonian", i.e., if it can be written as $N$ acting on a total functional derivative: $F = -N(\delta \mathcal{H}/\delta \varphi)$. In this case, $P^*$ is simply proportional to $\exp(-\mathcal{H}/\sigma)$, irrespective of the choice of $N$. We will refer to such a dynamics, in an operational sense [5], as "satisfying the fluctuation-dissipation theorem" (FDT) [11]. Model B falls into this category, with $N = -\nabla^2$. In contrast, for our driven system, $F$ is just the expression in the $\{\dots\}$ brackets of Eq. (1) and $\zeta \equiv -\nabla \cdot \eta$, so that $N = -(\sigma_\parallel \partial^2 + \sigma_\perp \nabla_\perp^2)$. Thus, this $F$ is clearly not Hamiltonian! In this fashion, our continuum theory reflects the fact that we are dealing with a generic non-equilibrium steady state. For completeness, we should point out that certain microscopically non-Hamiltonian dynamics may become Hamiltonian if viewed on sufficiently large length scales. The discussion of these subtleties [12–14], while intriguing, is beyond the scope of these lectures.
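The microscopic Metropolis rule stated at the beginning of this section can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the code used in the simulations cited here: the function name `kls_sweep`, the list-of-lists lattice layout `n[x][y]`, and the definition of a "sweep" as $L_x L_y$ attempted exchanges are all choices made for this example, and both lattice dimensions are assumed to be at least 3.

```python
import math
import random

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def kls_sweep(n, J, E, kT, rng):
    """Attempt Lx*Ly nearest-neighbour particle-hole exchanges (illustrative).

    n[x][y] is 0 (hole) or 1 (particle) on a periodic lattice; the drive of
    strength E points in the -y direction; kT is k_B * T.
    """
    Lx, Ly = len(n), len(n[0])

    def occupied_neighbours(site, skip):
        # Number of occupied neighbours of `site`, not counting site `skip`.
        total = 0
        for dx, dy in NEIGHBOURS:
            other = ((site[0] + dx) % Lx, (site[1] + dy) % Ly)
            if other != skip:
                total += n[other[0]][other[1]]
        return total

    for _ in range(Lx * Ly):
        x, y = rng.randrange(Lx), rng.randrange(Ly)
        dx, dy = rng.choice(NEIGHBOURS)
        x2, y2 = (x + dx) % Lx, (y + dy) % Ly
        if n[x][y] == n[x2][y2]:
            continue  # not a particle-hole pair, nothing to exchange
        # Work out where the particle sits and in which y-direction it hops.
        if n[x][y] == 1:
            src, dst, hop_dy = (x, y), (x2, y2), dy
        else:
            src, dst, hop_dy = (x2, y2), (x, y), -dy
        # H = -4J * sum over bonds of n_i n_j: change when the particle moves.
        dH = -4.0 * J * (occupied_neighbours(dst, skip=src)
                         - occupied_neighbours(src, skip=dst))
        eps = -hop_dy  # +1 along the drive (-y), -1 against it, 0 transverse
        log_rate = -(dH - eps * E) / kT
        # Accept with probability min[1, exp(log_rate)], avoiding overflow.
        if log_rate >= 0.0 or rng.random() < math.exp(log_rate):
            n[src[0]][src[1]], n[dst[0]][dst[1]] = 0, 1
```

A typical use would start from a random half-filled configuration and call `kls_sweep(lattice, J=1.0, E=2.0, kT=3.0, rng=random.Random(0))` repeatedly; since only particle-hole exchanges are attempted, the particle number is conserved by construction.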
3. Surprising singular behavior, near and far from criticality
For the $d = 2$ Ising model in equilibrium, thermodynamic quantities are typically analytic, except at a single point, i.e., $T_c$. In particular, being a model with only short-range interactions, correlation functions are short-ranged far above $T_c$, decaying exponentially with distance and controlled by a finite correlation length. Far below criticality, a half-filled system in a square geometry will display two strips of equal width, as a result of the co-existence of a particle-rich (dense) phase with a hole-rich one. Correlations of the fluctuations within each phase are also short-ranged. More interestingly, the interfaces between the phases do represent soft degrees of freedom, being the Goldstone mode of a broken translational invariance [15]. One consequence is a divergent structure factor (Fourier transform of the two-point correlation, of deviations from being straight). Although this behavior is singular, it is well understood and can be related to that of a simple random walk [16]. Only at criticality does the system possess non-trivial singular properties, the nature of which became transparent only in the 1970's [17,26] despite Onsager's tour-de-force in 1944 [18].
2 The easiest way to see the connection between $F$, $N$ and $P[\varphi(x, t)]$ is to recast the Langevin equation in terms of its equivalent Fokker–Planck equation. See Ref. [9].
When driven into non-equilibrium steady states, this picture changes dramatically. In particular, non-trivial singular behavior appears at all $T$. While the transition itself remains second order, its properties fall into a non-Ising class. Here, we will present some of these surprising features briefly, referring the reader to Ref. [5] for more details.
3.1. T far above $T_c$
Focusing on the behavior far above criticality, let us study the two-point correlation $G(r)$ ($r = (x, y)$) and its transform, the structure factor, $S(k)$. We first point out that the drive clearly induces an asymmetry between $x$ and $y$, i.e., anisotropy beyond that due to the lattice. Thus, we should not be surprised if the familiar Ornstein–Zernike form of $S$ ($\propto (1 + \xi^2 k^2)^{-1}$) were to become elliptical rather than circular, e.g., similar to Fig. 1a. However, simulations [2] showed that there is a discontinuity singularity in $S$ at the origin, as in Fig. 1b.
Fig. 1. Schematic plot of structure factors. (a) Ellipses for a typical anisotropic Ising model in equilibrium. (b) "Butterfly" or "owl" pattern for a driven system.
To be more precise, we may define
$$R \equiv \frac{\lim_{k_y \to 0} S(0, k_y)}{\lim_{k_x \to 0} S(k_x, 0)} \tag{2}$$
and measure the discontinuity by $R - 1$. Though $R$ does approach unity for $T \to \infty$, it is about 4, for large $E$, even at twice the critical temperature. Further, it diverges as $T \to T_c$! We should remind the reader that, for the equilibrium case, $S$ may diverge at criticality, but $R$ is unity always. Strange as it may seem, this behavior can be understood within the context of the continuum approach, Eq. (1). For $T \gg T_c$, both $\tau_\parallel$ and $\tau_\perp$ are positive. Thus, the local magnetization $\varphi(x, t)$ fluctuates around a stable minimum at zero, and neither the fourth-order derivatives nor the non-linearities are necessary for stability. As a consequence, the behavior of the system in the disordered phase can be described by a much simpler, linear Langevin equation, namely,
$$\partial_t \varphi(x, t) = \lambda (\tau_\perp \nabla_\perp^2 + \tau_\parallel \partial^2)\, \varphi - \nabla \cdot \eta. \tag{3}$$
With the help of a Fourier transform, we can easily find $\varphi(k, t)$ for any realization of $\eta$, and then compute averages over the Gaussian distributed $\eta$, with its second moment given by Eq. (1). This forms the starting point for the discussion of the disordered phase, quantified by, e.g., equal-time structure factors or three-point functions [19]. Here, we focus on the former, $S(k) \equiv \langle \varphi(k, t)\, \varphi(-k, t) \rangle$, which can be easily computed:
$$S(k) = \frac{\sigma_\perp k_x^2 + \sigma_\parallel k_y^2}{\tau_\perp k_x^2 + \tau_\parallel k_y^2 + O(k_x^4, k_x^2 k_y^2, k_y^4)}, \tag{4}$$
so that $R = (\sigma_\parallel \tau_\perp)/(\sigma_\perp \tau_\parallel)$. In order for this $S$ to reduce to the equilibrium Ornstein–Zernike form, the FDT has to be invoked, which constrains $R$ to unity. For the driven case, $R$ is no longer constrained, so that a discontinuity develops. One consequence of such a singularity in $S$ is that $G$ becomes long-ranged, decaying as $r^{-2}$ at large $|r|$. The amplitude, in addition to being $\propto (R - 1)$, has a dipolar angular dependence, so that an appropriate angular average of $G$ is again short-ranged [5].
To end this discussion, let us follow Grinstein's argument [20] that such long-ranged power-law decays should be expected. Starting with the dynamic two-point correlation $G(r, t)$, we know that the conservation law leads to the autocorrelation: $G(0, t) \to t^{-d/2}$. In addition, being a diffusive system, we should expect $r \sim \sqrt{t}$, at least far from criticality. Using naive scaling, we would write $G(r, 0) \to |r|^{-d}$! In other words, the generic behavior of $G$ in a diffusive system is $|r|^{-d}$ decay, not exponential. In this case, the equilibrium system is the non-generic one, in which the amplitude of this generic term is forced to vanish, by FDT. Our familiarity with equilibrium systems is so strong that, on first sight, power law decays far above $T_c$ appear quite surprising. Finally, note that a scaling argument of this kind cannot produce the angular dependence, a crucial feature of the generic singularities of driven diffusive systems.
3.2. T far below $T_c$
Next, we turn to systems far below $T_c$. Due to the conservation law, they typically display phase co-existence, so that the dominant fluctuations and slow modes are those associated with the interface between the phases. Again, for the $d = 2$ Ising model in equilibrium, there is a wealth of information on the properties of the interface [21]. In particular, being a one-dimensional object, the interface exhibits behavior identical to a simple random walk [16] at the large scales, such as widths diverging as $\sqrt{L}$.
In our case, if we study an interface aligned along the $y$ axis, then, to a good approximation, we may specify the configuration by its position along $x$ by the "height" function $h(y)$. Known as capillary waves [22], these fluctuations have a venerable history. Since the interface is a manifestation of broken translational invariance, $h(y)$ are the soft Goldstone modes [15], so that the associated structure factor $S_h(q) \equiv \langle h(q)\, h(-q) \rangle$ diverges as $q^{-2}$. Of course, this property is just the Fourier version of the $\sqrt{L}$ divergence: $\int_{1/L} q^{-2}\, dq = O(L)$. Interfaces with divergent widths are also known as "rough"; only for $d > 2$ may interfaces in crystalline Ising-like models display transitions from rough to smooth phases.
When driven, however, the interface width appears to approach a finite width at large distances. In particular, for $E = 2J$, using lattices up to $L_y = 60$ and plotting the widths vs. $L_y^p$ with various values of $p$, we found that the curves did not straighten out with $p$ as low as 0.05 [23]. As a phenomenon, roughness suppression is not novel, gravity being the most common example. However, the drive here is parallel to the interface, reminiscent of wind driving across a water surface, which has a destabilizing effect by contrast. Subsequently, a more detailed simulation study [24] of the interface with $L_y \leq 600$ showed that $S(q) \sim q^{-2}$ for large $q$, crossing over to $q^{-0.67}$ for small $q$ (Fig. 2). Since $\int_{1/L} q^{-0.67}\, dq = O(1)$, the small $q$ behavior is consistent with $p = 0$. Though 0.67 appears temptingly close to a simple rational, $2/3$, there is no viable theory so far (despite two valiant attempts [25]). Finally, let us note that the crossover from rough to smooth occurs at the scale of $q \sim E$. Although no precise Monte-Carlo analysis of this crossover has been performed, it is consistent with $E$ having the units of 1/length. In this respect, such a length is comparable to the capillary length which controls the crossover in the gravitationally stabilized interface. Of course, in that case, the small $q$ behavior is simply $q^0$!
Fig. 2. Log–log plot of interface structure factor vs. wavevector.
3.3. Critical properties
Finally, we turn to the critical region, described by $\tau_\perp \gtrsim 0$ in Eq. (1). In contrast to the situation for $T \gg T_c$, we now need fourth-order terms in $\nabla_\perp$ to stabilize the system against large-wavelength fluctuations. Similarly, the non-linear terms are required to ensure a stable ordered phase below $T_c$. However, we still have $\tau_\parallel > 0$, so that fourth-order parallel gradients may safely be neglected. Thus, near criticality, the leading terms on the right-hand side of Eq.
(1) are (+ 2)2u and 2u, implying M that parallel and transverse wave vectors, k and k , scale naively as Dk D & Dk D2, i.e., parallel M , , M gradients are less relevant than transverse ones. More systematically, a naive dimensional
B. Schmittmann, R.K.P. Zia / Physics Reports 301 (1998) 45—64
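analysis [26] reveals that the non-linearity associated with $E$ is the most relevant one, having an upper critical dimension $d_c = 5$, distinct from the usual value of 4 for the Ising model. Dropping irrelevant terms, Eq. (1) simplifies to

$$\partial_t \phi(\mathbf{x}, t) = \lambda \left\{ (\tau_\perp - \nabla_\perp^2)\nabla_\perp^2 \phi + \partial^2 \phi + \frac{u}{3!}\,\nabla_\perp^2 \phi^3 + E\,\partial \phi^2 \right\} - \nabla_\perp \cdot \boldsymbol{\eta}_\perp(\mathbf{x}, t) \qquad (5)$$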
with noise term ($i, j = 1, \ldots, d-1$)

$$\langle \eta_{\perp i}(\mathbf{x}, t)\,\eta_{\perp j}(\mathbf{x}', t') \rangle = 2\sigma_\perp\,\delta_{ij}\,\delta(\mathbf{x} - \mathbf{x}')\,\delta(t - t') .$$

Here, we have rescaled $\tau_\parallel$ to 1 and have kept the (naively irrelevant) non-linearity associated with $u$, to ensure a stable theory below $T_c$. This is the starting point for the analysis of universal critical behavior. In this regime, large fluctuations on all length scales dominate the behavior of the system, so that renormalization group techniques are indispensable. To summarize very briefly, the Langevin equation (Eq. (5)) is recast as a dynamic functional [27], followed by a renormalized perturbation expansion in $\epsilon \equiv d_c - d$ [26]. The quartic coupling $u$ must be treated as a dangerously irrelevant operator. Gratifyingly, the series for the critical exponents can be summed, so that we obtain quantitatively reliable values even in two dimensions. The details are quite technical [28], so that we just review the key results here. The discussion leading to Eq. (5) already suggests that the critical behavior of the driven system is distinct from its equilibrium counterpart: the upper critical dimension is shifted to $d_c = 5$, and parallel and transverse wave vectors scale with different exponents. Anticipating renormalization, we reformulate their scaling as $|k_\parallel| \sim |k_\perp|^{1+\Delta}$, introducing the strong anisotropy exponent $\Delta$. To illustrate its importance, let us consider the wave vector scaling for the equilibrium Ising model. For isotropic exchange interactions, coarse-graining results in the usual Landau-Ginzburg Hamiltonian with gradient term $(\nabla\phi)^2$. Clearly, this cannot lead to anything but $|k_\parallel| \sim |k_\perp|$. The only effect of anisotropies in the microscopic couplings is to give rise to different amplitudes, so that $\Delta$ remains zero. We refer to this situation as weak anisotropy, in contrast to strong anisotropy where $\Delta \neq 0$. Examples of the latter in equilibrium models include, e.g., Lifshitz points or structural phase transitions [29].
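The naive dimensional analysis that yields $d_c = 5$ can be written out explicitly. The following is only a sketch of standard power counting, not a reproduction of Ref. [26]: it assumes the usual response-field form of the dynamic functional built from Eq. (5), with a response field $\tilde\phi$ that is not introduced in these notes, treats $\lambda$ and $\sigma_\perp$ as dimensionless, and uses an inverse transverse length $\mu$ as the unit.

```latex
% Coordinates: |k_parallel| ~ |k_perp|^2 and \partial_t ~ \nabla_perp^4
x_\perp \sim \mu^{-1}, \qquad x_\parallel \sim \mu^{-2}, \qquad t \sim \mu^{-4},
\qquad d^{d-1}x_\perp \, dx_\parallel \, dt \sim \mu^{-(d+5)} .

% Field dimensions, fixed by the Gaussian terms
% \tilde\phi \, \partial_t \phi  and  \tilde\phi \, \nabla_\perp^2 \tilde\phi  (assumed form)
[\tilde\phi] = \mu^{(d+3)/2}, \qquad [\phi] = \mu^{(d-1)/2} .

% Drive term  \tilde\phi \, E \, \partial \phi^2
[E] \, \mu^{(d+3)/2} \, \mu^{2} \, \mu^{d-1} \, \mu^{-(d+5)} = 1
\;\Longrightarrow\; [E] = \mu^{(5-d)/2} .

% Quartic term  \tilde\phi \, u \, \nabla_\perp^2 \phi^3
[u] \, \mu^{(d+3)/2} \, \mu^{2} \, \mu^{3(d-1)/2} \, \mu^{-(d+5)} = 1
\;\Longrightarrow\; [u] = \mu^{3-d} .
```

Under these assumptions $E$ is relevant for $d < 5$ and marginal at $d = 5$, while $u$ has negative dimension near $d = 5$, in line with its description as naively irrelevant.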
For any system with strong anisotropy, irrespective of its universality class, the renormalization group predicts the general scaling form of, e.g., the dynamic structure factor near criticality:

$$S(\mathbf{k}, t; \tau_\perp) = \mu^{-2+\eta}\, S(k_\parallel/\mu^{1+\Delta},\, k_\perp/\mu,\, t\mu^{z};\, \tau_\perp/\mu^{1/\nu}) . \qquad (6)$$

Here, $\mu$ is just a scaling factor. Eq. (6) can be viewed as a definition of the critical exponents $\nu$, $z$, $\eta$ and $\Delta$. The appearance of the latter is of course consistent with our earlier discussion. Different universality classes are distinguished by the characteristic values of these exponents, expressed, e.g., through their $\epsilon$-expansions. For our model, all exponents, except $\Delta$, take their mean-field values: $\nu = 1/2$, $z = 4$, $\eta = 0$ while $\Delta = 1 + \epsilon/3$. A separate analysis yields the order parameter exponent $\beta = 1/2$. Note that all of these equalities are exact, i.e., all higher-order terms in the $\epsilon$-expansion vanish! Care must be taken on two fronts, both associated with the presence of strong anisotropy, when comparing these exponents to Monte Carlo data. First, in the equilibrium Ising model, the same exponent $\eta$ characterizes the divergence of the critical structure factor near the origin in $\mathbf{k}$-space,
$S(\mathbf{k}, t=0; \tau_\perp=0) \sim |\mathbf{k}|^{-(2-\eta)}$ for $\mathbf{k} \to 0$, and the power law decay of its Fourier transform, the two-point correlations, $G(\mathbf{r}, t=0; \tau_\perp=0) \sim |\mathbf{r}|^{-(d-2+\eta)}$ as $|\mathbf{r}| \to \infty$. In contrast, four $\eta$-like exponents are needed in the driven case! Fortunately, scaling laws relate them to $\eta$ and $\Delta$. For example, we can define $\eta_\perp$ via $S(k_\parallel=0, k_\perp, t=0; \tau_\perp=0) \sim |k_\perp|^{-(2-\eta_\perp)}$, for $|k_\perp| \to 0$. Using Eq. (6), we read off the simple result $\eta_\perp = \eta$. Similarly, we introduce $\eta'_\parallel$ through the relation $G(r_\parallel, r_\perp=0, t=0; \tau_\perp=0) \sim r_\parallel^{-(d-2+\eta'_\parallel)}$, for $r_\parallel \to \infty$. Keeping track of $\Delta$ in the Fourier transform, we obtain the less trivial scaling law $\eta'_\parallel = (\eta - \Delta(d-3))/(1+\Delta)$. Since most simulations have focused on two-dimensional systems, we set $\epsilon = 3$, predicting $\Delta = 2$ and $\eta'_\parallel = 2/3$. Monte Carlo data for the pair correlation along the drive beautifully display the expected $r_\parallel^{-2/3}$ decay [30]. Numerical evidence for the strong anisotropy exponent is somewhat more indirect, extracted from an anisotropic finite size scaling analysis [6]. Convincing data collapse is obtained, using anisotropic systems of size $L_\parallel \times L_\perp$ with fixed “aspect” ratio $L_\parallel/L_\perp^{1+\Delta}$, for the theoretically predicted values of the exponents. It is intriguing that the signals of continuous phase transitions in or near equilibrium, namely, a diverging length scale, scale invariance and universal behavior, also mark such transitions in far-from-equilibrium scenarios.
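The exponent values quoted above follow from simple arithmetic, which the short sketch below reproduces with exact fractions. The function names `driven_exponents` and `matched_parallel_size` are illustrative only and do not come from the text; the formulas are those given above ($\Delta = 1 + \epsilon/3$ with $\epsilon = 5 - d$, and the scaling law for $\eta'_\parallel$).

```python
from fractions import Fraction

D_C = 5  # upper critical dimension of the driven model, as quoted in the text


def driven_exponents(d):
    """Evaluate the exponents quoted in the text for spatial dimension d."""
    eps = Fraction(D_C - d)
    eta = Fraction(0)                  # mean-field value
    delta = 1 + eps / 3                # strong anisotropy exponent
    eta_perp = eta                     # read off from the scaling form, Eq. (6)
    eta_par_prime = (eta - delta * (d - 3)) / (1 + delta)
    return {
        "epsilon": eps,
        "nu": Fraction(1, 2),
        "z": 4,
        "eta": eta,
        "beta": Fraction(1, 2),
        "Delta": delta,
        "eta_perp": eta_perp,
        "eta_par_prime": eta_par_prime,
    }


def matched_parallel_size(l_perp, delta, ratio=1.0):
    """Hypothetical helper: L_par keeping L_par / L_perp**(1 + Delta) equal to `ratio`."""
    return ratio * float(l_perp) ** float(1 + delta)


if __name__ == "__main__":
    ex = driven_exponents(2)
    print("Delta =", ex["Delta"], " eta'_par =", ex["eta_par_prime"])
    print("L_par for L_perp = 4:", matched_parallel_size(4, ex["Delta"]))
```

The second helper illustrates why anisotropic finite-size scaling is demanding in two dimensions: with $\Delta = 2$, keeping the aspect ratio fixed requires the parallel size to grow as the cube of the transverse size.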
4. Some recent developments

Beyond the topics discussed in the previous section, we may arbitrarily name three other “levels” of non-equilibrium steady-state systems. The first are associated with the KLS model itself, including fascinating results on higher correlations [19], failure of the Cahn-Hilliard approach to coarsening dynamics [31], models with shifted periodic and/or open boundary conditions [32], and systems with AC or random drives [33,14]. Topics in the next “level” involve various generalizations of KLS, such as other interactions [34,35], multi-layer systems, multi-species models, and systems with quenched impurities [36]. Further “afield” is a wide range of driven systems, e.g. surface growth, electrophoresis and sedimentation, granular and traffic flow, biological and geological systems, etc. In these brief lecture notes, we present two topics at the “intermediate level”: a driven bi-layer system and a model with two species.

4.1. Phase transitions in a bi-layer lattice gas

The physical motivation for considering multi-layered systems may be traced to intercalated compounds [37]. The process of intercalation, where foreign atoms diffuse into a layered host material, is well suited for modeling by a driven lattice gas of several layers. In both physical systems and Monte Carlo simulations of a model with realistic parameters [38], finger formation has been observed. Both are transient phenomena, so that we might ask: are there any novel phenomena in the driven steady states? On the theoretical front, there are two motivations. To study critical behavior of the single layer KLS model, it is necessary to set the overall density at 1/2, with the consequence that two interfaces develop below criticality. These interfaces seriously complicate the analysis, since their critical behavior is quite distinct from the bulk. One way to avoid these difficulties is to consider a bi-layer system [39], with no inter-layer interactions.
Through particle exchange between the layers, the system may order into a dense and a hole-rich layer, neither of which has interfaces. Certainly, this is expected (and observed) in the equilibrium
case, where the steady state is just two Ising models, decoupled except for the overall conservation law. The other motivation is to use multilayer models as “interpolations” from two- to three-dimensional systems [40]. Motivations aside, what the simulations reveal is entirely unexpected. In the simplest generalization, a bi-layer system is considered as a fully periodic $L \times L \times 2$ model. The first simulations were carried out on the case with zero inter-layer interactions [40]. Therefore, within each layer, all aspects are identical to the KLS model. Particle exchanges across the layers, unaffected by the drive, are updated according to the local energetics alone, so that these jump rates are identical to those of a system in equilibrium. The overall particle density is fixed at 1/2. In the absence of $E$, there would be a single second order transition at the Onsager temperature. Since the inter-layer coupling is zero, phase segregation at low temperatures will occur across the layers, i.e., the ordered phase is characterized by homogeneous layers of different densities (opposite magnetizations, in spin language). Having no interfaces, the free energy of the system is clearly lowest in such a state. At $T = 0$, one layer will be full and the other empty, so that this state will be labeled by FE. Intriguingly, when $E$ is turned on, two transitions appear [40]! As $T$ is lowered, the homogeneous, disordered (D) phase first gives way to a state with strips in both layers, reminiscent of two entirely unrelated, yet aligned, single-layer driven systems. We will refer to this state as the strip phase (S). As $T$ is lowered further, a first order transition takes the S phase into the FE state. Why the S state should interpose between the D and FE states was not understood. This puzzle, together with the presence of interlayer couplings in intercalated compounds, motivated our study of the interacting bi-layer driven system [41].
Within this wider context, the presence of two transitions is no longer a total mystery. On the other hand, this study reveals several unexpected features, leading to interesting new questions. Our system consists of two $L^2$ Ising models with attractive inter-particle interactions of strength unity. Arranged in a bi-layer structure, the inter-layer interaction is specified by $J$. Thus, the “internal” Hamiltonian is just $H \equiv -4\sum n n' - 4J \sum n n''$, where the first sum is over nearest neighbors within a given layer while $n$ and $n''$ differ only by the layer index. The external drive, $E$, is imposed through the jump rates, as in Section 2. For simulations, we kept the overall density at 1/2 and chose $J \in [-10, 10]$. Note that negative $J$'s are especially appropriate for intercalated materials [37,38]. The eventual goal of such a study is to map out the phase diagram in the $T$-$J$-$E$ space. So far, we have data for only three (positive) values of $E$. Not expecting further surprises, we believe that we have uncovered the main features of the phase diagram. To begin, let us point out various features in the equilibrium case. Here, the $J > 0$ and the $J < 0$ systems can be mapped into each other by a gauge transformation. Thus, in the thermodynamic limit, the phase diagram in the $T$-$J$ plane is exactly symmetric. However, for the lattice gas constrained by a conservation law, the low-temperature states of these two systems are not the same, being S or FE, respectively. While the D-S or D-FE transitions are second order (shown as a line in Fig. 3), the S-FE line will be first order in nature ($T$ axis from 0 to 1 in Fig. 3). Due to the presence of interfaces, the FE domain includes the $J \equiv 0$ line. For the same reason, it “intrudes” into the $J > 0$ half plane by $O(1/L)$, for finite systems. Of course, on the $T$ axis lies a pair of decoupled Ising systems, so that $T_c(J=0)$ assumes the Onsager value, as $L \to \infty$. In general, $T_c(J)$ is not known exactly, though $T_c(J = \pm\infty)$ is precisely $2T_c(J=0)$, since every configuration of the two layers can be mapped into one in the usual Ising model. The lesson from equilibrium is now clear: the S state is as generic as the FE state in this wider context. As we will see below, the presence of two transitions does not represent a qualitative change from the equilibrium phase structure.
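The internal Hamiltonian $H = -4\sum n n' - 4J\sum n n''$ and the $O(1/L)$ intrusion of the FE state into the $J > 0$ half plane can be made concrete with a small energy calculation. The sketch below is illustrative only: the function names and the nested-list layout `layer[x][y]` are assumptions, the drive $E$ plays no role here, and only zero-temperature energies at half filling are compared.

```python
def bilayer_energy(layers, J):
    """H = -4 * sum_<nn in layer> n n' - 4 J * sum_sites n n'' on a periodic L x L x 2 lattice.

    `layers` is a pair of L x L nested lists with entries 0 (hole) or 1 (particle).
    Each in-layer bond is counted once (requires L >= 3 with periodic wrapping).
    """
    top, bottom = layers
    L = len(top)
    in_layer = 0
    for layer in (top, bottom):
        for x in range(L):
            for y in range(L):
                # bonds to the +x and +y neighbours, periodic boundaries
                in_layer += layer[x][y] * (layer[(x + 1) % L][y] + layer[x][(y + 1) % L])
    cross = sum(top[x][y] * bottom[x][y] for x in range(L) for y in range(L))
    return -4 * in_layer - 4 * J * cross


def full_empty_state(L):
    """FE state: one layer full, the other empty (overall density 1/2)."""
    return [[1] * L for _ in range(L)], [[0] * L for _ in range(L)]


def strip_state(L):
    """S state: identical half-filled strips stacked on top of each other."""
    def layer():
        return [[1 if x < L // 2 else 0 for _ in range(L)] for x in range(L)]
    return layer(), layer()


if __name__ == "__main__":
    L = 4
    for J in (0, 1, 2):
        print(J, bilayer_energy(full_empty_state(L), J), bilayer_energy(strip_state(L), J))
```

For these two configurations the energies are $-8L^2$ (FE) and $-8L^2 + 8L - 2JL^2$ (S), so the strip state only wins for $J > 4/L$: the interfacial cost of the strips is what lets the FE state extend slightly into positive $J$ in a finite system.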
Fig. 3. Phase diagrams for the bi-layer Ising lattice gas. Solid lines are second-order transitions in the equilibrium case. Open/solid circles are continuous/discontinuous transitions in the driven case.
Turning to driven systems, we can no longer expect a symmetric phase diagram, since the drive breaks the Ising symmetry. The data for $E = 25$ are shown in Fig. 3, in which the first/second-order transitions are marked as solid/open circles. We see that the only effect of the drive is to shift the phase boundaries. No new phases appear while the nature of the transitions remains unchanged. However, there are remarkable features. One is the lowering of the critical temperature for large $|J|$. Given that $T_c(E) > T_c(0)$ in the KLS model, it is quite unexpected that $T_c(|J| \gg 1, E \gg 1)$ is less than its equilibrium counterpart. Perhaps more notable is the presence of a small region in the $J < 0$ half plane, in which an S-phase is stable. Since particle-rich strips lie on top of each other, such a phase could not exist if either energy or entropy were to dominate the steady state. Concerned that its presence might be a finite size effect, we performed simulations using $L$'s up to 100, with $J = -0.1$ and $T \in [1.00, 1.20]$. In all cases, the S-phase prevailed, leading us to conjecture that this region exists even in the thermodynamic limit. On the other hand, for lower $T$, the FE-phase penetrates into the $J > 0$ half plane, as in equilibrium cases. Energetics seem to take the upper hand here. Though we believe that, as $L \to \infty$, this part of the phase boundary will collapse onto the $J = 0$ line, we have not checked the finite size effects explicitly. Of course, these “intrusions” into the positive/negative $J$ region by the FE/S-phase are responsible for the appearance of two transitions in the studies of the $J \equiv 0$ model [40]. It would clearly be useful to develop some intuitive arguments which can “predict” these qualitative modifications of the equilibrium phase structure. Since the usual energy-entropy considerations appear to fail, our attempt [41] is based on the competition between suppression of short-range correlations [42] and enhancement of long-range ones [30], as a result of driving. Let
us focus on the two-point correlations in the disordered phase with $T$ being lowered from $\infty$ and see how they are affected by the drive. On the one hand, the nearest-neighbor correlations are found to be suppressed by $E$, consistent with the picture that the drive acts as an extra noise in breaking bonds. Taken alone, this suppression would lead to the lowering of the critical temperature, as pointed out in Section 2. On the other hand, as we showed above, the drive changes significantly the large distance behavior of $G$, from an exponential to a power law. Further, the amplitude is positive (negative) for correlations parallel (transverse) to $E$. Both the positive longitudinal correlations and the negative transverse ones should help the process of ordering into strips parallel to the field. In other words, the enhanced long-range parts favor the S-phase. Taken alone, we expect this effect to increase $T_c(E)$. Evidently, for the single-layer case, the latter effect “wins”. For a bi-layer system, we need to take into account cross-layer correlations, which are necessarily short-ranged. Focusing first on the $J = 0$ case, where short-range effects due to cross-layer interactions are absent, we are led to $T_c(0, E)/T_c(0, 0) > 1$. Indeed, this ratio is comparable to that in the single-layer case. Next, we consider systems with positive $J$. Without the drive, $T_c(J, 0)$ is of course enhanced over $T_c(0, 0)$. For non-vanishing drive, it is not possible to track the competition between the short- and long-range properties of the transverse correlations. Evidently, for small $J$, the long-range part still dominates, so that $T_c(J, E)/T_c(J, 0) > 1$. However, for $J \gg J_0$, the presence of $E$ effectively lowers $J$, since the latter is associated with only short-range correlations. Since a lower effective $J$ naturally gives rise to a lower $T_c$, we would “predict” that $T_c(J, E)/T_c(J, 0)$ could decrease considerably as $J$ increases. From Fig. 3, we see that $T_c(J, E) < T_c(J, 0)$ for $J \ge 5$! The interplay of the competing effects is so subtle that either can dominate, in different regions of the phase diagram. Finally, we turn to the $J < 0$ case. Here, the two effects tend to cooperate rather than compete, since the long-range parts favor the S-phase over the FE-phase. As a result, the domain of FE is smaller everywhere. In particular, note that the $J < 0$ branch of $T_c(J, E)$ is significantly lower than the $J > 0$ branch. The small region of the S-phase can be similarly understood, at least at the qualitative level. In this picture, we may argue that the bicritical point and its trailing first order line should be “driven” to the $J < 0$ half-plane. To end this subsection, we point out a few other interesting features. The order parameters of the S and FE phases are conserved and non-conserved, respectively. Based on symmetry arguments, we may expect that the critical properties of the two second order branches will fall into different universality classes. In contrast, the gauge symmetry forces static equilibrium properties to be identical along these two branches (only dynamic properties, e.g. the dynamical critical exponents, differ [43]). Since the drive breaks this symmetry, we can only speculate that the D-S transitions belong to the same class as the single layer, KLS model [28] while the D-FE ones should remain in the equilibrium Ising class [12]. Since these two classes are distinct, we can also expect a rich crossover structure near the bi-critical point. Assuming these conjectures are confirmed, we can truly marvel at the novelties a driving field can bring.

4.2. Biased diffusion of two species

If the particles in the KLS model are considered “charged”, it is natural to explore the effects of having both positive and negative charges. Defined on a fully periodic $L_x \times L_y$ square lattice, a configuration of such a “two species” model can be characterized by two local occupation
variables, $n^{+}_{\mathbf{r}}$ and $n^{-}_{\mathbf{r}}$, which equal 1 (0), if a positive or negative particle is present (absent) at site $\mathbf{r} \equiv (x, y)$. In spin language, this corresponds to a spin-1 model, and novel behavior is to be expected, since we have altered the internal symmetries of the local order parameter. For simplicity, we will restrict ourselves to zero total charge, $Q \equiv \sum_{\mathbf{r}} (n^{+}_{\mathbf{r}} - n^{-}_{\mathbf{r}}) = 0$. Pursuing the analogy to electric charges, positive (negative) charges move preferentially along (against) the drive $E$ which is directed along the $-y$ axis. As a first step, we assume that there are no inter-particle interactions apart from an excluded volume constraint. Thus, our model can be viewed as the high-temperature, large $E$ limit of a more complicated interacting system. To model biased diffusion, a particle with charge $q = \pm 1$ jumps onto a nearest-neighbor empty site according to $\min[1, e^{-qE\,\delta y}]$, where $\delta y = 0, \pm 1$ is the change in the $y$-coordinate of the particle. To allow charge exchange between neighboring particles, two nearest-neighbor sites carrying opposite charge may exchange their content with probability $\gamma \min[1, e^{-qE\,\delta y}]$. Now, $\delta y$ is the change in the $y$-coordinate of the positive particle. The parameter $\gamma$ sets the ratio of the characteristic time scales controlling, respectively, charge exchange and diffusion. Physical motivations for this model come from various directions: fast ionic conductors with several species of mobile charges [1], electric breakdown of water-in-oil microemulsions [44], and gel electrophoresis of charged polymers [45]. It can also be interpreted as a simple model for some traffic or granular flows [46,47]. Summarizing our simulation data, we first map out the phase diagram of the model, in the space spanned by $E$, the total mass density $\bar{m} \equiv \sum_{\mathbf{r}} (n^{+}_{\mathbf{r}} + n^{-}_{\mathbf{r}})/(L_x L_y)$, and $\gamma$ [48].³ Focusing on small $\gamma$, the diffusive dynamics is limited by the excluded volume constraint.
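The jump rules just described translate almost directly into code. The following is a minimal sketch rather than the simulation code behind Ref. [48]: the random choice of a site and a neighbor, the `lattice[y][x]` layout with entries $+1$, $0$, $-1$, and all function names are assumptions made for illustration, and the exchange rule takes $q = +1$ because $\delta y$ refers to the positive particle.

```python
import math
import random

DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # (dx, dy) nearest-neighbour steps


def jump_probability(q, E, dy):
    """Particle-hole exchange: min[1, exp(-q E dy)] for a particle of charge q = +1 or -1."""
    return min(1.0, math.exp(-q * E * dy))


def exchange_probability(gamma, E, dy_positive):
    """Charge exchange: gamma * min[1, exp(-E dy)], dy being the step of the positive particle."""
    return gamma * min(1.0, math.exp(-E * dy_positive))


def attempt_update(lattice, E, gamma, rng):
    """Attempt one nearest-neighbour exchange on a periodic lattice; return True if it happened.

    The update scheme (pick a random site, then a random neighbour) is an assumption.
    For gamma > 1 the exchange "probability" exceeds 1 and the move is always accepted.
    """
    ly, lx = len(lattice), len(lattice[0])
    x, y = rng.randrange(lx), rng.randrange(ly)
    dx, dy = rng.choice(DIRECTIONS)
    x2, y2 = (x + dx) % lx, (y + dy) % ly
    a, b = lattice[y][x], lattice[y2][x2]
    if a == b:
        return False                          # two holes or two like charges
    if b == 0:
        p = jump_probability(a, E, dy)        # particle a would move by +dy
    elif a == 0:
        p = jump_probability(b, E, -dy)       # particle b would move by -dy
    else:
        dy_positive = dy if a == 1 else -dy   # displacement of the positive charge
        p = exchange_probability(gamma, E, dy_positive)
    if rng.random() < p:
        lattice[y][x], lattice[y2][x2] = b, a
        return True
    return False


if __name__ == "__main__":
    rng = random.Random(0)
    lattice = [[1, 0, -1, 0], [0, -1, 0, 1], [1, 0, 0, -1], [0, 0, 0, 0]]
    moves = sum(attempt_update(lattice, 1.0, 0.02, rng) for _ in range(1000))
    print("accepted moves:", moves)
```

Because every accepted move is a swap of two site contents, the numbers of positive and negative charges are conserved separately, which is the property the mean-field continuity equations later rely on.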
For sufficiently small $E$ and $\bar{m}$, typical configurations are disordered, characterized by homogeneous charge and mass densities and fairly large currents. Nevertheless, this phase is highly nontrivial, supporting anomalous two-point correlations reminiscent of the KLS model: In generic directions, the familiar $r^{-d}$ decay prevails, with the remarkable exception of the field direction, where a novel $r_\parallel^{-(d+1)/2}$ dominates [49]. With increasing $E$ or $\bar{m}$, the tendency of the particles to impede one another becomes more pronounced, until a phase transition into a spatially inhomogeneous “ordered” phase occurs. Beyond the transition line $E_c(\bar{m}, \gamma)$, typical configurations are charge-segregated, consisting of a strip of mostly positive charges “floating” on a similar strip of negative charges, surrounded by a background of holes. To explore these transitions more quantitatively, we need to define a suitable order parameter. Taking the Fourier transforms of the local hole and charge densities,

$$\tilde{\phi}(k_\perp, k_\parallel) \equiv \frac{1}{L_x L_y} \sum_{\mathbf{r}} \left[1 - (n^{+}_{\mathbf{r}} + n^{-}_{\mathbf{r}})\right] \exp\{i k_\perp x + i k_\parallel y\}$$

and

$$\tilde{\psi}(k_\perp, k_\parallel) \equiv \frac{1}{L_x L_y} \sum_{\mathbf{r}} \left[n^{+}_{\mathbf{r}} - n^{-}_{\mathbf{r}}\right] \exp\{i k_\perp x + i k_\parallel y\},$$
3 The second paper in Ref. [48] discusses the adiabatic elimination procedure. For earlier work ($\gamma = 0$) see Refs. [51,56,57].
Fig. 4. Phase diagram for the driven two-species model. The system size is $30 \times 30$. The filled circles mark the lines of continuous transitions, while the open circles denote the spinodal lines associated with first order transitions. The inset shows a typical ordered configuration at $\bar{m} = 0.40$, $E = 3.00$ and $\gamma = 0.02$ with the open (filled) circles representing positive (negative) charges.
we select the components $\Phi \equiv |\tilde{\phi}(0, 2\pi/L_y)|$ and $\Psi \equiv |\tilde{\psi}(0, 2\pi/L_y)|$ since these are most sensitive to ordering into a single transverse strip. Naturally, we choose their averages $\langle\Phi\rangle$ and $\langle\Psi\rangle$ as our order parameters. To distinguish first from second-order transitions, we also measure their fluctuations and histograms. The resulting phase diagram is shown in Fig. 4. Typically, hysteresis loops in $\langle\Phi\rangle$, $\langle\Psi\rangle$ and the average current are observed only for $\bar{m} < \bar{m}_0(\gamma)$, indicating that the transitions are first order in this regime. On the other hand, fluctuations develop sharp peaks for larger mass densities, signalling continuous transitions. Larger values of $\gamma$ favor the disordered phase, so that the whole transition line shifts to higher $E$, the region between the spinodals narrows, and $\bar{m}_0(\gamma)$ moves to higher mass densities. Thus, we observe a surface of first-order transitions, separated from a surface of continuous ones by a line of multicritical points: $E_c(\bar{m}_0(\gamma), \gamma)$. While it is not surprising that the disordered phase is stable for large $E$ provided the mass density is sufficiently small, it may appear rather counterintuitive that it should also dominate the $\bar{m} \lesssim 1$ region. However, setting $\bar{m} \equiv 1$ eliminates every hole in the system so that the dynamics is carried entirely by the charge exchange mechanism. Relabelling positive charges as “particles” and negative ones as “holes”, the model becomes equivalent to a non-interacting (i.e., $J = 0$) KLS model. Also termed the asymmetric simple exclusion process (ASEP) in the literature, its steady-state solution is exactly known [50] to be homogeneous. Thus, for all $\gamma > 0$, our two-species model remains disordered along the entire $\bar{m} \equiv 1$ line. For sufficiently large $\gamma$, the data indicate quite clearly that a finite region of disorder persists, for any $E$, just below complete filling.
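As an illustration of the order parameters $\Phi$ and $\Psi$, the sketch below evaluates the two Fourier components at $(k_\perp, k_\parallel) = (0, 2\pi/L_y)$ for a single configuration. The function name `order_parameters` and the `lattice[y][x]` layout with entries $+1$, $0$, $-1$ are assumptions for illustration; in a simulation these values would be averaged over many configurations to obtain $\langle\Phi\rangle$ and $\langle\Psi\rangle$.

```python
import cmath
import math


def order_parameters(lattice):
    """Return (Phi, Psi) = (|phi~(0, 2 pi / L_y)|, |psi~(0, 2 pi / L_y)|) for one configuration.

    lattice[y][x] is +1, -1 or 0 for a positive charge, a negative charge or a hole.
    """
    ly, lx = len(lattice), len(lattice[0])
    k_par = 2.0 * math.pi / ly
    phi = 0j  # hole-density component
    psi = 0j  # charge-density component
    for y, row in enumerate(lattice):
        phase = cmath.exp(1j * k_par * y)   # k_perp = 0, so x does not enter the phase
        for site in row:
            phi += (1 - abs(site)) * phase
            psi += site * phase
    norm = lx * ly
    return abs(phi) / norm, abs(psi) / norm


if __name__ == "__main__":
    # one row of positive charges next to one row of negative charges, rest holes
    strip = [[1, 1, 1, 1], [-1, -1, -1, -1], [0, 0, 0, 0], [0, 0, 0, 0]]
    uniform = [[1, -1, 1, -1] for _ in range(4)]
    print(order_parameters(strip))
    print(order_parameters(uniform))
```

A configuration that is uniform along $y$ gives values at the level of rounding error, while a single charge-segregated strip gives values of order one, which is why these components are sensitive to the ordering described above.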
For smaller $\gamma$, however, it is less obvious whether such a region exists in the thermodynamic limit, since even a single hole can suffice to induce spatial inhomogeneities in a finite system: Acting as a catalyst for the charge segregation process, the hole creates a strip of predominantly positive charge, separated by a rather sharp interface from a similar, negatively charged domain, located “downstream”. The charge exchange mechanism tends to remix the charges but has little effect for small $\gamma$. Eventually,
the hole ends up trapped in the interfacial region! Finally, we turn to larger values of $\gamma$, specifically, $\gamma \gtrsim 0.62$: here, the charge exchange mechanism dominates over the excluded volume constraint and suppresses the ordered phase completely. So far, our discussion has mostly drawn upon Monte Carlo results. It is gratifying, however, that the same qualitative picture emerges from a mean-field theory, based upon a set of equations of motion for the local densities. We briefly summarize the analytic route. Since the numbers of both positive and negative charges are conserved, we begin with a continuity equation for the coarse-grained densities $\rho^\pm(\mathbf{r}, t)$: $\partial_t \rho^\pm + \nabla \cdot \mathbf{j}^\pm = 0$. For simplicity, we focus on the case $\gamma = 0$, i.e., particle-hole exchanges, first. As for the KLS model, $\mathbf{j}^\pm$ can be written as the sum of a diffusive piece and an Ohmic term,
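
$$\mathbf{j}^\pm = \lambda^\pm \left\{ -\nabla \left.\frac{\delta H}{\delta \rho^\pm}\right|_{\rho^\mp} \pm E\,\hat{\mathbf{y}} \right\} .$$

The density-dependent mobility $\lambda^\pm$ must vanish with both $\rho^\pm$ and the local hole density, $\phi \equiv 1 - (\rho^+ + \rho^-)$, i.e., $\lambda^\pm = \rho^\pm \phi$. The “Hamiltonian” $H = \{\rho^+ \ln \rho^+ + \rho^- \ln \rho^- + \phi \ln \phi\}$ is just the mixing entropy associated with distributing $L_x L_y \rho^+$ positive and $L_x L_y \rho^-$ negative charges over a lattice of $L_x L_y$ sites. Note that the functional derivative $\delta H/\delta \rho^\pm$ is taken at fixed $\rho^\mp$ since we are focusing on particle-hole exchanges here. To model the charge exchange mechanism, we simply add a similar term to $\mathbf{j}^\pm$, namely,

$$\mathbf{j}^\pm = \lambda^\pm \left\{ -\nabla \left.\frac{\delta H}{\delta \rho^\pm}\right|_{\rho^\mp} \pm E\,\hat{\mathbf{y}} \right\} + \gamma \lambda' \left\{ -\nabla \left.\frac{\delta H}{\delta \rho^\pm}\right|_{\phi} \pm E\,\hat{\mathbf{y}} \right\}$$

where $\lambda' = \rho^+ \rho^-$ vanishes with both $\rho^+$ and $\rho^-$, and the functional derivative is taken at fixed hole density. Expressing the resulting equations in the more convenient variables $\phi$ and charge density $\psi \equiv \rho^+ - \rho^-$, we obtain

$$\partial_t \phi = \nabla \cdot \left\{ \nabla \phi + E \phi \psi\,\hat{\mathbf{y}} \right\}$$

$$\partial_t \psi = \nabla \cdot \left\{ \gamma \nabla \psi + (1 - \gamma)\left[\phi \nabla \psi - \psi \nabla \phi\right] - E \phi (1 - \phi)\,\hat{\mathbf{y}} - \frac{\gamma}{2} E \left[(1 - \phi)^2 - \psi^2\right] \hat{\mathbf{y}} \right\} . \qquad (7)$$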
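The step from the currents to Eq. (7) can be checked by direct substitution. The following worked derivation is a sketch that uses only the definitions given in the text (mobilities $\rho^\pm\phi$ and $\rho^+\rho^-$, the mixing entropy $H$, $\phi = 1 - \rho^+ - \rho^-$ and $\psi = \rho^+ - \rho^-$); the intermediate expressions are not part of the original notes.

```latex
% Functional derivatives of H = \rho^+ \ln\rho^+ + \rho^- \ln\rho^- + \phi \ln\phi
\left.\frac{\delta H}{\delta\rho^\pm}\right|_{\rho^\mp} = \ln\frac{\rho^\pm}{\phi},
\qquad
\left.\frac{\delta H}{\delta\rho^\pm}\right|_{\phi} = \ln\frac{\rho^\pm}{\rho^\mp} .

% Currents with \lambda^\pm = \rho^\pm \phi and \lambda' = \rho^+ \rho^-
\mathbf{j}^\pm = -\phi\nabla\rho^\pm + \rho^\pm\nabla\phi \pm E\rho^\pm\phi\,\hat{\mathbf{y}}
  + \gamma\left[ -\rho^\mp\nabla\rho^\pm + \rho^\pm\nabla\rho^\mp \pm E\rho^+\rho^-\,\hat{\mathbf{y}} \right] .

% Sum and difference, using \rho^\pm = [(1-\phi) \pm \psi]/2
\mathbf{j}^+ + \mathbf{j}^- = \nabla\phi + E\phi\psi\,\hat{\mathbf{y}},
\qquad
\mathbf{j}^+ - \mathbf{j}^- = -\gamma\nabla\psi - (1-\gamma)\left[\phi\nabla\psi - \psi\nabla\phi\right]
  + E\phi(1-\phi)\,\hat{\mathbf{y}} + \frac{\gamma}{2}E\left[(1-\phi)^2 - \psi^2\right]\hat{\mathbf{y}} .

% Continuity: \partial_t\rho^\pm = -\nabla\cdot\mathbf{j}^\pm
\partial_t\phi = \nabla\cdot(\mathbf{j}^+ + \mathbf{j}^-),
\qquad
\partial_t\psi = -\nabla\cdot(\mathbf{j}^+ - \mathbf{j}^-) .
```

The charge-exchange pieces cancel in the sum of the two currents, which is why $\gamma$ does not appear in the equation for the hole density.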
These equations have to be supplemented by periodic boundary conditions and constraints on total mass and charge, $(1 - \bar{m}) L_x L_y = \int d^d r\, \phi(\mathbf{r}, t)$ and $0 = \int d^d r\, \psi(\mathbf{r}, t)$. Also, to be precise, all Laplacians $\nabla^2$ should be given the appropriate anisotropic interpretation. Using Eq. (7) as our starting point, we seek to recapture the major features of the phase diagram. First, the presence of two phases, uniform versus spatially structured, is reflected in the existence of both homogeneous and inhomogeneous steady-state solutions of Eq. (7). The existence of the former is evident, due to the conservation law. Anticipating homogeneity in the transverse directions, we seek the latter in the form $\phi(y)$, $\psi(y)$. Integrating Eq. (7) once, the stationary mass and charge currents appear as natural integration constants. Since the numbers of positive and negative charges are equal, the former must be zero, leaving us with the latter: $J$. The first equation allows us to eliminate $\psi$ in favor of $\phi$, and, with $u \equiv \sqrt{1 + \gamma/[(1 - \gamma)\phi]}$ as the new variable, we can reduce the second equation to potential form, $u'' = -dV(u)/du$. The potential $V(u)$, given explicitly in Ref. [48], exhibits a minimum for a range of $J$'s, so that inhomogeneous, periodic solutions of Eq. (7) exist. We note in passing that these may even be found explicitly, provided $\gamma = 0$ [51]. Otherwise,
numerical integration is of course always possible, yielding rather impressive agreement with simulation profiles. Interestingly, our equations of motion predict that $E$ enters only through the scaling variable $E L_y$. If plotted accordingly, our simulation data confirm this scaling by collapsing rather convincingly, at least within their error bars. However, the mere existence of two types of steady-state solutions is not sufficient to provide evidence for a phase transition. Therefore, we seek instabilities of these solutions, as the control parameters $E$, $\bar{m}$ and $\gamma$ are varied. Performing a linear stability analysis, we find that a homogeneous solution with mass $\bar{m}$ becomes unstable if the drive $E$ exceeds a threshold value

$$E_H(\bar{m}, \gamma) = \frac{2\pi}{L_y} \sqrt{\frac{1 - \bar{m} + \gamma \bar{m}}{(1 - \bar{m})\left[(2 - \gamma)\bar{m} - 1\right]}} .$$

The most relevant perturbation is associated with the smallest wave vector in the parallel direction, $(0, 2\pi/L_y)$. A simple analysis of $E_H(\bar{m}, \gamma)$ shows that this instability can only occur within the interval $1/(2 - \gamma) < \bar{m} < 1$, so that, in particular, the homogeneous phase is always stable for $\gamma \ge 1$. Clearly, we cannot identify this mean-field stability boundary with the true transition line: we have not considered the locus of instabilities associated with inhomogeneous solutions, and fluctuations have been neglected throughout. However, it is quite remarkable how well it mirrors the qualitative shape of the phase diagram. Finally, let us consider the order of the transitions. In principle, two routes can be pursued here. One of these, namely, the computation of the stability boundary of the inhomogeneous phase, $E_I(\bar{m}, \gamma)$, is rather subtle, involving three Goldstone modes [52]. Once $E_I(\bar{m}, \gamma)$ is known, values $(\bar{m}, \gamma)$ for which $E_I$ and $E_H$ coincide can be identified as loci of continuous transitions; otherwise, $E_H$ and $E_I$ mark the “spinodals”, near a first-order transition, where the homogeneous/inhomogeneous phases become linearly unstable. Alternatively, the adiabatic elimination of the fast modes results in an effective equation of motion for the slow ones which can then be analyzed. Since the technical details of this approach can be found in Ref. [48], we need only focus on the result which combines the expected with the surprising. Letting $M(q_\perp, t)$ denote the complex amplitude of the unique slow mode, associated with the band of wave vectors $(q_\perp, 2\pi/L_y)$, its equation of motion takes the Ginzburg-Landau form:

$$\partial_t M = -\left\{ (\tau + q_\perp^2) M + g M |M|^2 + O(M^5) \right\} .$$

Here, $\tau$ is the soft eigenvalue, which vanishes on the stability boundary and $g = g(E, \bar{m}, \gamma, L_\parallel)$ is a rather complicated function.
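The mean-field threshold $E_H(\bar{m}, \gamma)$ and its window of existence, $1/(2-\gamma) < \bar{m} < 1$, are easy to tabulate. The short sketch below simply evaluates the formula quoted in the text; the function name `homogeneous_threshold` and the convention of returning `None` where no instability exists are illustrative choices, not part of the original analysis.

```python
import math


def homogeneous_threshold(m_bar, gamma, l_y):
    """Mean-field threshold E_H for the instability of the homogeneous state.

    E_H = (2 pi / L_y) * sqrt((1 - m + gamma m) / ((1 - m) [(2 - gamma) m - 1])).
    Returns None when the homogeneous state is linearly stable for every E,
    i.e. outside the window 1 / (2 - gamma) < m < 1.
    """
    if not 0.0 < m_bar < 1.0:
        return None
    denominator = (1.0 - m_bar) * ((2.0 - gamma) * m_bar - 1.0)
    if denominator <= 0.0:
        return None
    numerator = 1.0 - m_bar + gamma * m_bar
    return (2.0 * math.pi / l_y) * math.sqrt(numerator / denominator)


if __name__ == "__main__":
    for gamma in (0.0, 0.02, 0.5, 1.0):
        row = [homogeneous_threshold(m, gamma, 30) for m in (0.4, 0.6, 0.75, 0.9)]
        print(gamma, row)
```

Since $L_y$ enters only through the prefactor $2\pi/L_y$, the product $E_H L_y$ depends on $\bar{m}$ and $\gamma$ alone, in line with the scaling variable $E L_y$ mentioned above.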
As in standard Landau theory, the sign of $g$ determines the order of the transition: if positive, the transition is continuous while negative $g$ signals a first-order one. Since $\tau \simeq 0$, we can set $E = E_H$ and discuss $g$ on the stability boundary itself: being positive for $\bar{m} \lesssim 1$, it has a unique zero at a critical $\bar{m}_0(\gamma)$, below which it becomes negative. Since $\bar{m}_0(\gamma)$ increases with $\gamma$, we recover the qualitative behavior of the multicritical line observed in the simulations. The surprising aspect of this equation of motion, however, resides in the fact that $M$ is O(2)-symmetric and $q_\perp$ spans just a single spatial dimension. Given these symmetries, the Mermin-Wagner theorem [53] should forbid the existence of long-range order! One might hope that a careful analysis of finite-size effects in this system would contribute to the disentanglement of this puzzle. We conclude with two comments. First, we noted earlier that the case $\bar{m} \equiv 1$ is exactly soluble. There are, in fact, two other surfaces in the phase diagram, corresponding to $\gamma = 1$ and $\gamma = 2$, for
which certain distributions are exactly known. Setting $\gamma = 1$ implies that the rates for particle-hole and particle-particle exchanges become equal, so that any given particle, e.g. a positive charge, cannot distinguish between negative charges and holes. Thus, it experiences biased diffusion, equivalent to the non-interacting KLS model. Accordingly, the marginal steady-state distribution of the occupation numbers of one species is uniform, i.e., $P[\{n^{\pm}_{\mathbf{r}}\}] = \sum_{\{n^{\mp}_{\mathbf{r}}\}} P[\{n^{+}_{\mathbf{r}}, n^{-}_{\mathbf{r}}\}] \propto 1$, and observables pertaining to a single species are trivial. For example, the two-point correlation functions for equal charges, $\langle n^{+}_{\mathbf{r}} n^{+}_{\mathbf{0}} \rangle$ and $\langle n^{-}_{\mathbf{r}} n^{-}_{\mathbf{0}} \rangle$, vanish for $\mathbf{r} \neq \mathbf{0}$. The full distribution, $P[\{n^{+}_{\mathbf{r}}, n^{-}_{\mathbf{r}}\}]$, however, is nontrivial, so that, e.g., the two-point function for opposite charges, $\langle n^{+}_{\mathbf{r}} n^{-}_{\mathbf{0}} \rangle$, remains long-ranged [49]. A completely random system, marked by a uniform $P[\{n^{+}_{\mathbf{r}}, n^{-}_{\mathbf{r}}\}]$, results only if $\gamma = 2$. Finally, we note that the one-dimensional version of our model is exactly soluble by matrix methods [54]. Second, it is natural to wonder about the consequences of having non-zero charge. This problem has only been investigated for $\gamma = 0$ [55], but the findings are quite remarkable. The system still orders into a charge-segregated strip, while supporting a nonvanishing mass current, reflected in an overall drift. If, e.g., positive charges outnumber negative ones, one might expect that the whole strip would drift in the field direction, in analogy with American football, where the team with fewer players tends to lose ground. In contrast, the strip wanders against the field, following the preferred direction of the minority charge! We should add that this model possesses several other intriguing properties, e.g., stable configurations with nontrivial winding number (“barber poles”) [56] or multiple-valued currents in the mean-field description [51]. To summarize, this deceptively simple system is remarkably rich.
5. Summary and outlook
We have presented, within the limited scope of these lecture notes, a brief introduction to the statistical mechanics of driven diffusive systems and some recent developments in this ever expanding field. Focusing only on the prototype model and two of its simplest generalizations, we pointed out a multitude of surprises relative to expectations based on experience with equilibrium systems. While some of these, e.g. the generic presence of singular correlations, are rather well understood, others, such as the nature of ordering in two-species models, remain unresolved. Of course, the holy grail of this whole field, namely, the fundamental understanding and theoretical classification of non-equilibrium steady states, still beckons at the distant horizon.
Acknowledgements
It is a pleasure to especially acknowledge our collaborators on the recent work reported here: C.C. Hill, G. Korniss and K.-t. Leung. Others are too numerous to mention but no less deserving. We thank the IXth International Summer School on Fundamental Problems in Statistical Mechanics, and particularly H.K. Janssen and L. Schäfer, for their hospitality. This research was supported in part by grants from NATO, the SFB 237 of the Deutsche Forschungsgemeinschaft and the US National Science Foundation through the Division of Materials Research.
References
[1] S. Chandra, Superionic Solids. Principles and Applications, North-Holland, Amsterdam, 1981.
[2] S. Katz, J.L. Lebowitz, H. Spohn, Phys. Rev. B 28 (1983) 1655; J. Stat. Phys. 34 (1984) 497.
[3] E. Ising, Z. Physik 31 (1925) 253. A more recent treatment is, e.g., B.M. McCoy, T.T. Wu, The Two-dimensional Ising Model, Harvard Univ. Press, Cambridge, MA, 1973.
[4] C.N. Yang, T.D. Lee, Phys. Rev. 87 (1952) 404; T.D. Lee, C.N. Yang, Phys. Rev. 87 (1952) 410.
[5] B. Schmittmann, R.K.P. Zia, in: C. Domb, J.L. Lebowitz (Eds.), Phase Transitions and Critical Phenomena, vol. 17, Academic, London, 1995.
[6] K.-t. Leung, Phys. Rev. Lett. 66 (1991) 453 and Int. J. Mod. Phys. C3 (1992) 367; J.S. Wang, J. Stat. Phys. 82 (1996) 1409.
[7] G.L. Eyink, J.L. Lebowitz, H. Spohn, J. Stat. Phys. 83 (1996) 385-472.
[8] P.C. Hohenberg, B.I. Halperin, Rev. Mod. Phys. 49 (1977) 435.
[9] H. Risken, The Fokker-Planck Equation, Springer, Heidelberg, 1989.
[10] H.K. Janssen, in: C.P. Enz (Ed.), Dynamical Critical Phenomena and Related Topics, Lecture Notes in Physics, vol. 104, Springer, Heidelberg, 1979.
[11] R. Kubo, Rep. Progr. Phys. 29 (1966) 255. See R. Graham, Z. Phys. 26 (1977) 397 and B 40 (1980) 149, for a much deeper discussion of this fundamental issue.
[12] G. Grinstein, C. Jayaprakash, Y. He, Phys. Rev. Lett. 55 (1985) 2527.
[13] H.K. Janssen, B. Schmittmann, Z. Phys. B 63 (1986) 517; R.K.P. Zia, B. Schmittmann, Z. Phys. B 97 (1995) 327.
[14] B. Schmittmann, Europhys. Lett. 24 (1993) 109.
[15] J. Goldstone, Nuovo Cim. 19 (1961) 154; H. Wagner, Z. Physik 195 (1966) 273.
[16] H.N.V. Temperley, Proc. Cam. Phil. Soc. 48 (1952) 683; K.Y. Lin, F.Y. Wu, Z. Physik B 33 (1970) 181.
[17] K.G. Wilson, Rev. Mod. Phys. C 47 (1975) 773.
[18] L. Onsager, Phys. Rev. 65 (1944) 117 and Nuovo Cim. 6 (Suppl.) 261 (1949).
[19] K. Hwang, B. Schmittmann, R.K.P. Zia, Phys. Rev. Lett. 67 (1991) 326 and Phys. Rev. E 48 (1993) 800.
[20] G. Grinstein, J. Appl. Phys. 69 (1991) 5441.
[21] R.K.P. Zia, in: K.C. Bowler, A.J. McKane (Eds.), Statistical and Particle Physics: Common Problems and Techniques, SUSSP Publications, Edinburgh, 1984. See references therein.
[22] F.P. Buff, R.A. Lovett, F.H. Stillinger, Phys. Rev. Lett. 15 (1965) 621; B. Widom, J.S. Rowlinson, Molecular Theory of Capillarity, Oxford, 1982.
[23] K.-t. Leung, K.K. Mon, J.L. Vallés, R.K.P. Zia, Phys. Rev. Lett. 61 (1988) 1744; Phys. Rev. B 39 (1989) 9312.
[24] K.-t. Leung, R.K.P. Zia, J. Phys. A 26 (1993) L737.
[25] K.-t. Leung, J. Stat. Phys. 50 (1988) 405 and 61 (1990) 341; C. Yeung, J.L. Mozos, A. Hernandez-Machado, D. Jasnow, J. Stat. Phys. 70 (1992) 1149.
[26] D.J. Amit, Field Theory, the Renormalization Group and Critical Phenomena, 2nd revised edn., World Scientific, Singapore, 1984; J. Zinn-Justin, Quantum Field Theory and Critical Phenomena, Oxford University Press, Oxford, 1989.
[27] H.K. Janssen, Z. Phys. B 23 (1976) 377; C. de Dominicis, J. Phys. (Paris) Colloq. 37 (1976) C247.
[28] H.K. Janssen, B. Schmittmann, Z. Phys. B 64 (1986) 503; K.-t. Leung, J.L. Cardy, J. Stat. Phys. 44 (1986) 567 and 45 (1986) 1087 (erratum).
[29] R.M. Hornreich, M. Luban, S. Shtrikman, Phys. Rev. Lett. 35 (1975) 1678; R.A. Cowley, Adv. Phys. 29 (1980) 1; A.D. Bruce, Adv. Phys. 29 (1980) 111.
[30] M.Q. Zhang, J.-S. Wang, J.L. Lebowitz, J.L. Vallés, J. Stat. Phys. 52 (1988) 1461; P.L. Garrido, J.L. Lebowitz, C. Maes, H. Spohn, Phys. Rev. A 42 (1990) 1954.
[31] C. Yeung, T. Rogers, A. Hernandez-Machado, D. Jasnow, J. Stat. Phys. 66 (1992) 1071; F.J. Alexander, C.A. Laberge, J.L. Lebowitz, R.K.P. Zia, J. Stat. Phys. 82 (1996) 1133.
[32] J.L. Vallés, K.-t. Leung, R.K.P. Zia, J. Stat. Phys. 56 (1989) 43; D.H. Boal, B. Schmittmann, R.K.P. Zia, Phys. Rev. A 43 (1991) 5214.
[33] B. Schmittmann, R.K.P. Zia, Phys. Rev. Lett. 66 (1991) 357; B. Schmittmann, Europhys. Lett. 24 (1993) 109; E. Praestgaard, H. Larsen, R.K.P. Zia, Europhys. Lett. 25 (1994) 447 and references therein.
[34] K.-t. Leung, B. Schmittmann, R.K.P. Zia, Phys. Rev. Lett. 62 (1989) 1772; G. Szabó, A. Szolnoki, Phys. Rev. B 47 (1993) 8260 and references therein; G. Szabó, A. Szolnoki, T. Antal, Phys. Rev. E 49 (1994) 299.
[35] K. Burns, L.B. Shaw, B. Schmittmann, R.K.P. Zia, to be published.
[36] V. Becker, H.K. Janssen, Europhys. Lett. 19 (1992) 13 and private communication; K.B. Lauritsen, H.C. Fogedby, Phys. Rev. E 47 (1992) 1563; B. Schmittmann, K.E. Bassler, Phys. Rev. Lett. 77 (1996) 3581; B. Schmittmann, C.A. Laberge, Europhys. Lett. 37 (1997) 559.
[37] M.S. Dresselhaus, G. Dresselhaus, Adv. Phys. 30 (1981) 139.
[38] G.R. Carlow, P. Joensen, R.F. Frindt, Synth. Met. 34 (1989) 623 and Phys. Rev. B 42 (1990) 1124; G.R. Carlow, R.F. Frindt, Phys. Rev. B 50 (1992) 11 107.
[39] K.K. Mon, unpublished.
[40] A. Achahbar, J. Marro, J. Stat. Phys. 78 (1995) 1493.
[41] C.C. Hill, R.K.P. Zia, B. Schmittmann, Phys. Rev. Lett. 77 (1996) 514-517 and C.C. Hill, Senior Thesis, Virginia Polytechnic Institute and State University, 1996.
[42] J.L. Vallés, J. Marro, J. Stat. Phys. 49 (1987) 89.
[43] B. Schmittmann, C.C. Hill, R.K.P. Zia, Physica A 239 (1997) 382.
[44] M. Aertsens, J. Naudts, J. Stat. Phys. 62 (1990) 609.
[45] M. Rubinstein, Phys. Rev. Lett. 59 (1987) 1946; T.A.J. Duke, Phys. Rev. Lett. 62 (1989) 2877; Y. Schnidman, in: A. Friedman (Ed.), Mathematics in Industrial Problems IV, Springer, Berlin, 1991; B. Widom, J.L. Viovy, A.D. Desfontaines, J. Phys. I (France) 1 (1991) 1759.
[46] O. Biham, A.A. Middleton, D. Levine, Phys. Rev. A 46 (1992) R6128; K.-t. Leung, Phys. Rev. Lett. 73 (1994) 2386.
[47] H.M. Jaeger, S.R. Nagel, R.P. Behringer, Rev. Mod. Phys. 68 (1997) 1259.
[48] G. Korniss, B. Schmittmann, R.K.P. Zia, Europhys. Lett. 32 (1995) 49 and J. Stat. Phys. 86 (1997) 721.
[49] G. Korniss, B. Schmittmann, R.K.P. Zia, Physica A 239 (1997) 111; G. Korniss, B. Schmittmann, Phys. Rev. E 56 (1997) 4072.
[50] F. Spitzer, Adv. Math. 5 (1970) 246.
[51] I. Vilfan, R.K.P. Zia, B. Schmittmann, Phys. Rev. Lett. 73 (1994) 2071.
[52] R.K.P. Zia, B. Schmittmann, unpublished.
[53] N.D. Mermin, H. Wagner, Phys. Rev. Lett. 17 (1966) 1133.
[54] C. Godrèche, S. Sandow, private communication, to be submitted.
[55] K.-t. Leung, R.K.P. Zia, Phys. Rev. E 56 (1997) 308.
[56] K.E. Bassler, B. Schmittmann, R.K.P. Zia, Europhys. Lett. 24 (1993) 115.
[57] B. Schmittmann, K. Hwang, R.K.P. Zia, Europhys. Lett. 19 (1992) 19.
Physics Reports 301 (1998) 65—83
An exactly soluble non-equilibrium system: The asymmetric simple exclusion process B. Derrida* Laboratoire de Physique Statistique, Ecole Normale Supérieure, 24 rue Lhomond, 75005 Paris, France
Abstract A number of exact results have been obtained recently for the one-dimensional asymmetric simple exclusion process, a model of particles which hop to their right at random times, on a one-dimensional lattice, provided that the target site is empty. Using either a matrix form for the steady-state weights or the Bethe ansatz, several steady-state properties can be calculated exactly: the current, the density profile for open boundary conditions, the diffusion constant of a tagged particle. The matrix form of the steady state can be extended to calculate exactly the steady state of systems of two species of particles and shock profiles. © 1998 Elsevier Science B.V. All rights reserved. PACS: 05.40.+j; 05.60.+w
* E-mail: [email protected]. 0370-1573/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S0370-1573(98)00006-4
1. Introduction
In equilibrium statistical mechanics, one usually associates an energy function $E(C)$ with every possible configuration $C$ of a system. At equilibrium, each configuration $C$ has a weight proportional to $\exp[-E(C)/T]$,
$$P_{\rm eq}(C) = Z^{-1} e^{-E(C)/T} , \tag{1}$$
where $T$ is the temperature and $Z$ the partition function. The problem is then to calculate equilibrium properties by averaging over all configurations weighted by Eq. (1). The dynamics towards such an equilibrium is often described by a master equation of the form
$$\frac{dP(C)}{dt} = \sum_{C'} M(C, C')\,P(C') , \tag{2}$$
where $M(C, C')\,dt$ is the probability of a transition from configuration $C'$ to configuration $C$ during the infinitesimal time interval $dt$. To conserve probability, the diagonal term has to satisfy
$$M(C, C) = -\sum_{C' \neq C} M(C', C)$$
($-M(C, C)\,P(C)\,dt$ represents the probability of escaping from configuration $C$ during the time interval $dt$). To ensure that a distribution $P(C)$ evolving according to Eq. (2) has a long-time limit independent of the initial condition, the rule of evolution $M(C, C')$ should not leave any subset of configurations disconnected from the rest (i.e. there should be a path of non-zero matrix elements connecting any pair of configurations). Moreover, the simplest way for the limit to be the $P_{\rm eq}(C)$ given by Eq. (1) is to choose a matrix $M(C, C')$ which satisfies detailed balance, meaning that for any pair of configurations $C_1$, $C_2$
$$M(C_1, C_2)\,P_{\rm eq}(C_2) = M(C_2, C_1)\,P_{\rm eq}(C_1) . \tag{3}$$
In out-of-equilibrium systems, there is usually no energy function $E(C)$ and the system is only defined by its dynamical rules. Typically, for a system with stochastic dynamics, only a matrix $M$ is given, and the evolution of the weights $P(C)$ is governed by the master equation (2). In the long-time limit, the system usually (if all the configurations are connected) reaches a steady state $P_{\text{steady state}}(C)$ which in general does not satisfy the detailed balance condition (3) but only the weaker condition of stationarity:
$$\sum_{C'} M(C', C)\,P_{\text{steady state}}(C) = \sum_{C'} M(C, C')\,P_{\text{steady state}}(C') . \tag{4}$$
Eq. (4) expresses the fact that the probabilities of entering and of leaving the configuration $C$ during the time interval $dt$ are equal. (The stationarity conditions (4) are weaker than detailed balance (3) since for a system of $\Omega$ configurations, stationarity imposes $\Omega$ equations (4) whereas detailed balance (3) usually requires many more conditions.)
A trivial example of a system reaching a steady state without satisfying detailed balance (3) is the problem of a biased random walker on a ring of $N$ sites. During each infinitesimal time interval $dt$, the walker hops with probability $p\,dt$ to its right and $q\,dt$ to its left. There are $N$ possible positions $i$ for the walker and the solution of Eq. (4) for the steady state probability is just $P_{\text{steady state}}(i) = 1/N$. Obviously, $M(i+1, i)\,P_{\text{steady state}}(i) = p/N \neq q/N = M(i, i+1)\,P_{\text{steady state}}(i+1)$ and detailed balance (3) is not satisfied as soon as there is a bias (i.e. $p \neq q$).
In general, the steady state weights $P_{\text{steady state}}(C)$ have to be determined by solving Eq. (4), and the expressions obtained are usually complicated. Once the steady state weights are known, the calculation of steady state properties becomes very similar to that of equilibrium properties: one has to average physical quantities with the steady state weights.
One of the simplest non-equilibrium systems one can consider is the fully asymmetric simple exclusion process (ASEP) in one dimension [1-4]. The model describes a driven lattice gas [5] in one dimension with hard core repulsion. At large scale [6], the ASEP is expected [7-10] to be described by a noisy Burgers equation or equivalently by the Kardar-Parisi-Zhang equation. Therefore, the long time and large scale properties of the ASEP can be reinterpreted as asymptotic properties of directed polymers in a random medium or of growing interfaces.
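The biased-walker example can be checked directly. The following is a minimal sketch, assuming illustrative values $N = 5$, $p = 2$, $q = 1$ (these numbers and the function name `ring_walker_generator` are not from the text): it builds the matrix $M$ of Eq. (2), verifies that the uniform distribution satisfies the stationarity condition (4), and shows that the detailed balance condition (3) fails.

```python
from fractions import Fraction

def ring_walker_generator(n_sites, p, q):
    """Matrix M[c][c'] of Eq. (2) for a walker hopping right (rate p) and left (rate q)."""
    M = [[Fraction(0)] * n_sites for _ in range(n_sites)]
    for i in range(n_sites):
        M[(i + 1) % n_sites][i] += p   # transition i -> i+1
        M[(i - 1) % n_sites][i] += q   # transition i -> i-1
        M[i][i] -= p + q               # diagonal term: minus the total escape rate
    return M

N, p, q = 5, Fraction(2), Fraction(1)  # illustrative values (assumption)
M = ring_walker_generator(N, p, q)
P = [Fraction(1, N)] * N               # candidate steady state P(i) = 1/N

# Stationarity: the right-hand side of the master equation vanishes for every site.
dPdt = [sum(M[c][cp] * P[cp] for cp in range(N)) for c in range(N)]
stationary = all(x == 0 for x in dPdt)

# Detailed balance would require equal probability flows in both directions on a bond.
flux_right = M[1][0] * P[0]            # p / N
flux_left = M[0][1] * P[1]             # q / N
detailed_balance = flux_right == flux_left

print(stationary, detailed_balance, flux_right - flux_left)
```

The nonzero difference between the two flows is the probability current circulating around the ring.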
In one dimension, the ASEP can be defined as follows. Each site of a one-dimensional lattice is either occupied by one particle or empty. A configuration of the system can therefore be characterized by binary variables $\{\tau_i\}$, where $\tau_i = 1$ if site $i$ is occupied by a particle and $\tau_i = 0$ if site $i$ is empty. During every infinitesimal time interval $dt$, each particle hops with probability $dt$ to the site on its right if that site is empty (and does not move otherwise).
For any initial configuration $\{\tau_i(0)\}$, one can write the evolution of arbitrary correlation functions by considering the possible events which occur during the time interval $dt$. For example, the evolution of the occupation $\tau_i$ of site $i$ during the time interval $dt$ is given by
$$\tau_i(t+dt) = \begin{cases} \tau_i(t) & \text{with probability } 1 - 2\,dt , \\ \tau_i(t) + \tau_{i-1}(t) - \tau_i(t)\,\tau_{i-1}(t) & \text{with probability } dt , \\ \tau_i(t)\,\tau_{i+1}(t) & \text{with probability } dt \end{cases} \tag{5}$$
(the first line corresponds to updating neither the bond $(i-1, i)$ nor the bond $(i, i+1)$, the second line corresponds to updating the bond $(i-1, i)$ and the third line corresponds to updating the bond $(i, i+1)$), and the average over the history between times 0 and $t$ leads to
$$\frac{d\langle\tau_i\rangle}{dt} = \langle\tau_{i-1}\rangle - \langle\tau_i\rangle - \langle\tau_{i-1}\tau_i\rangle + \langle\tau_i\tau_{i+1}\rangle . \tag{6}$$
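For a pair of neighboring sites, one can write
$$\tau_i(t+dt)\,\tau_{i+1}(t+dt) = \begin{cases} \tau_i(t)\,\tau_{i+1}(t) & \text{with probability } 1 - 3\,dt , \\ \left(\tau_i(t) + \tau_{i-1}(t) - \tau_i(t)\,\tau_{i-1}(t)\right)\tau_{i+1}(t) & \text{with probability } dt , \\ \tau_i(t)\,\tau_{i+1}(t) & \text{with probability } dt , \\ \tau_i(t)\,\tau_{i+1}(t)\,\tau_{i+2}(t) & \text{with probability } dt , \end{cases}$$
which gives for the evolution equation of $\langle\tau_i\tau_{i+1}\rangle$
$$\frac{d\langle\tau_i\tau_{i+1}\rangle}{dt} = \langle\tau_{i-1}\tau_{i+1}\rangle - \langle\tau_i\tau_{i+1}\rangle - \langle\tau_{i-1}\tau_i\tau_{i+1}\rangle + \langle\tau_i\tau_{i+1}\tau_{i+2}\rangle \tag{7}$$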
and a similar reasoning allows one to derive the evolution equations of higher correlations. What Eqs. (6) and (7) show is that all the correlation functions are coupled in a hierarchy. To determine the one-point functions $\langle\tau_i\rangle$ in Eq. (6), one needs to know the two-point functions $\langle\tau_i\tau_{i+1}\rangle$. Trying to calculate these nearest-neighbor correlations from Eq. (7) requires the knowledge of further correlations $\langle\tau_{i-1}\tau_{i+1}\rangle$ or of higher-order correlations $\langle\tau_{i-1}\tau_i\tau_{i+1}\rangle$, and so on. This shows that the problem is an $N$-body problem, and so far it has not been possible to derive an expression for the correlations at time $t$, given an arbitrary initial condition $\{\tau_i(0)\}$.
2. The steady state for periodic conditions
For periodic boundary conditions, it turns out that the steady state of the ASEP is particularly simple. For a system of $P$ particles on a ring of $N$ sites, one can show that in the steady state all
possible configurations have equal weight [11]:
$$P_{\text{steady state}}(C) = \frac{P!\,(N-P)!}{N!} . \tag{8}$$
This is because, if $n(C)$ is the number of clusters in the configuration $C$, the probability of leaving configuration $C$ during the time interval $dt$ is $n(C)\,P(C)\,dt$ (by the move of the first particle of each cluster), whereas the probability of entering the configuration $C$ is $\sum_{C'} P(C')\,dt$, where the sum runs over the configurations $C'$ obtained from $C$ by moving the last particle of one of the clusters one step backwards. As the number of configurations $C'$ is equal to the number of clusters $n(C)$, the stationarity condition (4) is fulfilled when all configurations have equal weight.
This simplicity of the steady state makes the calculation of equal time correlation functions very easy. For example, for distinct sites $i$, $j$, $k$,
$$\langle\tau_i\rangle = \frac{P}{N} ; \qquad \langle\tau_i\tau_j\rangle = \frac{P(P-1)}{N(N-1)} ; \qquad \langle\tau_i\tau_j\tau_k\rangle = \frac{P(P-1)(P-2)}{N(N-1)(N-2)} .$$
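The cluster-counting argument can be verified by brute force on a small ring. The sketch below is only an illustration under stated assumptions (the function names and the choice $N = 6$, $P = 3$ are hypothetical): it enumerates all configurations, gives each the same weight, and checks that the probability flowing into every configuration equals the probability flowing out, as required by Eq. (4). It also evaluates a two-point function from the uniform measure.

```python
from fractions import Fraction
from itertools import combinations

def ring_configs(n_sites, n_particles):
    """All occupation tuples (tau_1, ..., tau_N) with a fixed number of particles."""
    for occupied in combinations(range(n_sites), n_particles):
        yield tuple(1 if i in occupied else 0 for i in range(n_sites))

def moves(config):
    """Configurations reached by one hop to the right (each hop has rate 1)."""
    n = len(config)
    for i in range(n):
        j = (i + 1) % n                      # periodic boundary conditions
        if config[i] == 1 and config[j] == 0:
            new = list(config)
            new[i], new[j] = 0, 1
            yield tuple(new)

def uniform_is_stationary(n_sites, n_particles):
    """Check Eq. (4) when every configuration has the same weight, Eq. (8)."""
    configs = list(ring_configs(n_sites, n_particles))
    weight = Fraction(1, len(configs))
    gain = {c: Fraction(0) for c in configs}
    loss = {c: Fraction(0) for c in configs}
    for c in configs:
        for c2 in moves(c):
            loss[c] += weight                # probability flow leaving c
            gain[c2] += weight               # the same flow entering c2
    return all(gain[c] == loss[c] for c in configs)

def two_point(n_sites, n_particles, i, j):
    """<tau_i tau_j> in the uniform steady state (i != j)."""
    configs = list(ring_configs(n_sites, n_particles))
    return Fraction(sum(c[i] * c[j] for c in configs), len(configs))

print(uniform_is_stationary(6, 3), two_point(6, 3, 0, 2))
```

For $N = 6$, $P = 3$ the two-point function agrees with $P(P-1)/[N(N-1)] = 1/5$.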
Although the weights (8) of the configurations in the steady state are simple, this does not make all the steady state properties easy to calculate. For example, unequal time correlation functions like $\langle\tau_i(0)\,\tau_j(t)\rangle$ are not known, even in the steady state. One of the simplest quantities which contains some information about unequal time correlations is the diffusion constant $D$ of a tagged particle [12-14] on the ring. If we tag one of the particles (without in any way changing its dynamics) and call $Y_t$ the number of hops the particle has made up to time $t$, one expects that the following two limits exist:
$$\lim_{t\to\infty} \frac{\langle Y_t\rangle}{t} = v ; \qquad \lim_{t\to\infty} \frac{\langle Y_t^2\rangle - \langle Y_t\rangle^2}{t} = D , \tag{9}$$
and these two limits define the velocity $v$ and the diffusion constant $D$. The exact expressions of $v$ and of $D$ are given by [15]
$$v = \frac{N-P}{N-1} , \tag{10}$$
$$D = \frac{(2N-3)!}{(2P-1)!\,(2N-2P-1)!} \left[\frac{(P-1)!\,(N-P)!}{(N-1)!}\right]^2 . \tag{11}$$
The expression for $v$ is a simple consequence of the fact (8) that all configurations with $P$ particles have equal probability [11] in the steady state. The expression for $D$, on the other hand, is harder to obtain and was first derived in [15] using a generalization of the matrix approach discussed in Section 3. We will see in Section 4 that it can also be recovered using the Bethe ansatz.
The expression of $D$ in terms of correlation functions is usually complicated. However, for the ASEP on a ring with periodic boundary conditions, using the fact (8) that all configurations have equal weight, one can show that
$$D = \frac{N(N-P)}{P(N-1)} + \frac{2N^2}{P^2} \int_0^\infty dt \left\{ \langle\tau_i(t)\,[1-\tau_{i+1}(t)]\,[1-\tau_i(0)]\,\tau_{i+1}(0)\rangle - \frac{P^2(N-P)^2}{N^2(N-1)^2} \right\} . \tag{12}$$
For large $N$, keeping the density $\rho = P/N$ of particles fixed, Eq. (11) becomes
$$D \simeq \frac{\sqrt{\pi}}{2}\,\frac{(1-\rho)^{3/2}}{\rho^{1/2}}\,\frac{1}{N^{1/2}} . \tag{13}$$
The fact that $D$ vanishes for $N \to \infty$ indicates that in the infinite system, for fixed initial conditions, the fluctuations in the distance travelled by a tagged particle are subdiffusive [12-14].
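To get a feeling for how quickly the exact expression (11) approaches its large-$N$ form (13), one can evaluate both. The following is a small sketch (the function names are illustrative, not from the text); it uses exact rational arithmetic for Eq. (11) and floating point for Eq. (13).

```python
from fractions import Fraction
from math import factorial, pi, sqrt

def velocity(N, P):
    """Eq. (10): velocity of a tagged particle on a ring of N sites with P particles."""
    return Fraction(N - P, N - 1)

def diffusion_exact(N, P):
    """Eq. (11), evaluated exactly (requires 1 <= P < N)."""
    prefactor = Fraction(factorial(2 * N - 3),
                         factorial(2 * P - 1) * factorial(2 * N - 2 * P - 1))
    ratio = Fraction(factorial(P - 1) * factorial(N - P), factorial(N - 1))
    return prefactor * ratio ** 2

def diffusion_asymptotic(N, rho):
    """Eq. (13): large-N form at fixed density rho = P / N."""
    return sqrt(pi) / 2 * (1 - rho) ** 1.5 / sqrt(rho) / sqrt(N)

for N in (4, 40, 400):
    P = N // 2
    exact = float(diffusion_exact(N, P))
    print(N, exact, diffusion_asymptotic(N, P / N))
```

At half filling the two expressions differ substantially for $N = 4$ and agree to better than one percent for $N = 400$, consistent with corrections of relative order $1/N$.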
3. The steady state for open boundary conditions
Another case for which the weights of the configurations in the steady state can be calculated exactly [16,17] is the ASEP with open boundary conditions [4], which is defined as follows: each site $i$ of a one-dimensional lattice of $N$ sites is either occupied ($\tau_i = 1$) by a particle or empty ($\tau_i = 0$); during a time interval $dt$ (with $dt \ll 1$), each particle has a probability $dt$ of hopping to its right, provided the target site is empty; moreover, during the time interval $dt$, a particle may enter the lattice at site 1 with probability $\alpha\,dt$ (if this site is empty) and a particle at site $N$ may leave the lattice with probability $\beta\,dt$ (if this site is occupied).
The evolution equations (Eqs. (6) and (7)) remain valid in the bulk (i.e., as long as all the sites involved in the equations belong to the system). At the boundary, they are modified to include the boundary effects. For example, Eq. (6) becomes
$$\frac{d\langle\tau_1\rangle}{dt} = \alpha\,\langle(1-\tau_1)\rangle - \langle\tau_1(1-\tau_2)\rangle , \tag{14}$$
$$\frac{d\langle\tau_N\rangle}{dt} = \langle\tau_{N-1}(1-\tau_N)\rangle - \beta\,\langle\tau_N\rangle . \tag{15}$$
In the steady state, all the equal time correlation functions become time independent. In particular, the left-hand sides of Eqs. (6), (7), (14) and (15) vanish.
The steady state properties of this asymmetric exclusion model with open boundary conditions can be calculated exactly [16-18] for any $N$, $\alpha$ and $\beta$. The solution described below consists in writing the steady state weights of the configurations as the matrix elements of products of $N$ matrices [17]. This type of approach, initially introduced in connection with the Bethe ansatz [19,20], has been used to solve several problems in statistical mechanics, in particular the problem of directed lattice animals [21].
The main idea is to write the weight $P_{\text{steady state}}(\tau_1 \ldots \tau_N)$ of each configuration in the steady state as
$$P_{\text{steady state}}(\tau_1 \ldots \tau_N) = Z^{-1}\,\langle W| \prod_{i=1}^{N} [\tau_i D + (1-\tau_i) E] |V\rangle , \tag{16}$$
where $D$, $E$ are matrices, $\langle W|$, $|V\rangle$ are vectors, the $\tau_i$ are the occupation variables and the normalization factor $Z$ is given by
$$Z = \sum_{\tau_1 = 1,0} \cdots \sum_{\tau_N = 1,0} \langle W| \prod_{i=1}^{N} [\tau_i D + (1-\tau_i) E] |V\rangle = \langle W|(D+E)^N|V\rangle . \tag{17}$$
In other words, in the product (16) a matrix $D$ appears whenever a site is occupied ($\tau_i = 1$) and $E$ whenever a site is empty ($\tau_i = 0$). Since $D$ and $E$ do not commute in general, the weights $P_{\text{steady state}}(\tau_1 \ldots \tau_N)$ are complicated functions of the configuration $\{\tau_1 \ldots \tau_N\}$.
In [17] it was shown that Eqs. (16) and (17) do give the weights in the steady state of the asymmetric exclusion model when the matrices $D$ and $E$ and the vectors $\langle W|$ and $|V\rangle$ satisfy the following algebraic rules:
$$DE = D + E , \tag{18}$$
$$\langle W| E = \frac{1}{\alpha}\,\langle W| , \tag{19}$$
$$D |V\rangle = \frac{1}{\beta}\,|V\rangle . \tag{20}$$
Proofs that this is so are rather easy and have already been published [17,22]. Let us just illustrate it with a simple example. Consider a configuration in which the first $p$ sites are empty and the remaining $N-p$ sites are occupied,
$$\underbrace{0\,0 \cdots 0}_{p}\;\underbrace{1\,1 \cdots 1}_{N-p} . \tag{21}$$
During each time interval $dt$, the probability of entering this configuration is
$$\frac{\langle W| E^{p-1} D E D^{N-p-1} |V\rangle}{\langle W|(D+E)^N|V\rangle}\,dt \tag{22}$$
by coming from the configuration
$$\underbrace{0 \cdots 0}_{p-1}\;1\,0\;\underbrace{1 \cdots 1}_{N-p-1} . \tag{23}$$
The probability of leaving this configuration is given by
$$(\alpha + \beta)\,\frac{\langle W| E^{p} D^{N-p} |V\rangle}{\langle W|(D+E)^N|V\rangle}\,dt \tag{24}$$
since one particle may enter at site 1 or the particle at site $N$ may leave. In the steady state these two expressions should be equal, by Eq. (4), to ensure that the weight of configuration (21) remains unchanged:
$$\langle W| E^{p-1} D E D^{N-p-1} |V\rangle = (\alpha + \beta)\,\langle W| E^{p} D^{N-p} |V\rangle , \tag{25}$$
and this equality is easy to check using Eqs. (18)-(20). A generalization of this reasoning to arbitrary configurations is given in [22] and provides a proof that Eq. (16) gives the steady state weight of all the configurations.
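The stationarity of the matrix-product weights can also be checked numerically for small systems without choosing any representation of $D$ and $E$. The sketch below is a brute-force illustration under stated assumptions: the helper names (`make_weight`, `transitions`, `is_stationary`) and the values of $\alpha$, $\beta$ are hypothetical. Each word in $D$ and $E$ is reduced with rule (18) until it has the form $E^p D^q$, whose matrix element follows from rules (19) and (20); the weights are then inserted into the master equation.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product

def make_weight(alpha, beta):
    """Return a function computing <W| word |V> / <W|V>; word is a tuple of 1 (D) and 0 (E)."""
    @lru_cache(maxsize=None)
    def weight(word):
        for k in range(len(word) - 1):
            if word[k] == 1 and word[k + 1] == 0:          # rule (18): D E = D + E
                head, tail = word[:k], word[k + 2:]
                return weight(head + (1,) + tail) + weight(head + (0,) + tail)
        n_empty = word.count(0)                            # the word is now E^p D^q
        n_full = len(word) - n_empty
        return 1 / (alpha ** n_empty * beta ** n_full)     # rules (19) and (20)
    return weight

def transitions(config, alpha, beta):
    """(rate, new configuration) pairs of the ASEP with open boundaries."""
    if config[0] == 0:                                     # injection at site 1
        yield alpha, (1,) + config[1:]
    if config[-1] == 1:                                    # extraction at site N
        yield beta, config[:-1] + (0,)
    for i in range(len(config) - 1):                       # bulk hop to the right
        if config[i] == 1 and config[i + 1] == 0:
            yield Fraction(1), config[:i] + (0, 1) + config[i + 2:]

def is_stationary(n_sites, alpha, beta):
    """Check that gain and loss terms cancel for every configuration, Eq. (4)."""
    weight = make_weight(alpha, beta)
    net = {c: Fraction(0) for c in product((0, 1), repeat=n_sites)}
    for c in net:
        for rate, c2 in transitions(c, alpha, beta):
            flow = rate * weight(c)
            net[c] -= flow
            net[c2] += flow
    return all(v == 0 for v in net.values())

alpha, beta = Fraction(1, 3), Fraction(3, 4)               # illustrative rates (assumption)
w = make_weight(alpha, beta)
print(is_stationary(4, alpha, beta))
print(w((0, 1, 0, 1, 1)) == (alpha + beta) * w((0, 0, 1, 1, 1)))   # Eq. (25), N = 5, p = 2
```

The common factor $\langle W|V\rangle$ and the normalization $Z$ drop out of the stationarity check, so unnormalized weights suffice.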
Once the weights are known from Eqs. (16) and (17), one can calculate all the equal time steady state properties. For example, the average occupation $\langle\tau_i\rangle$ of site $i$ is given by
$$\langle\tau_i\rangle = \sum_{\tau_1 = 0,1} \cdots \sum_{\tau_N = 0,1} \tau_i\,P_{\text{steady state}}(\tau_1 \ldots \tau_N) , \tag{26}$$
which becomes, in terms of the matrices $D$ and $E$ and the vectors $\langle W|$ and $|V\rangle$,
$$\langle\tau_i\rangle = \frac{\langle W|(D+E)^{i-1}\,D\,(D+E)^{N-i}|V\rangle}{\langle W|(D+E)^N|V\rangle} . \tag{27}$$
We see that, to evaluate Eq. (27), one has to calculate the matrix elements of the form
$$\langle W| D^{m_1} E^{n_1} D^{m_2} E^{n_2} \cdots |V\rangle . \tag{28}$$
One way to do it [17,22] is to construct matrices and vectors which satisfy Eqs. (18)-(20). One can show, however, that the calculation of all these matrix elements can be done directly from the algebraic rules (18)-(20) without using any specific representation of the algebra. First, from Eqs. (19) and (20), it is obvious that all the matrix elements of the form $\langle W| E^p D^q |V\rangle$ are easy to calculate and are given by
$$\langle W| E^p D^q |V\rangle = \frac{1}{\alpha^p \beta^q}\,\langle W|V\rangle . \tag{29}$$
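It is also easy to see, using $DE = D + E$, that any more complicated matrix element (28) can be reduced to simple matrix elements of the form (29). For example, the matrix element $\langle E^3 D E^2 D\rangle$ becomes, using Eq. (18),
$$\langle E^3 D E^2 D\rangle = \langle E^5 D\rangle + \langle E^4 D\rangle + \langle E^3 D^2\rangle .$$
In particular, one can obtain in that way the expression of the denominator $\langle W|(D+E)^N|V\rangle$ for all $N$ as
$$\frac{\langle W|(D+E)^N|V\rangle}{\langle W|V\rangle} = \sum_{p=1}^{N} \frac{p\,(2N-1-p)!}{N!\,(N-p)!}\;\frac{\beta^{-p-1} - \alpha^{-p-1}}{\beta^{-1} - \alpha^{-1}} . \tag{30}$$
In the large $N$ limit, one can calculate from Eq. (27) the average occupation $\rho = \langle\tau_i\rangle$ for a site $i$ far from the boundaries, and one finds the phase diagram shown in Fig. 1 with the three following phases:
Phase I: For $\alpha > 1/2$ and $\beta > 1/2$,
$$\rho = \tfrac{1}{2} . \tag{31}$$
Phase II: For $\alpha < 1/2$ and $\alpha < \beta$,
$$\rho = \alpha . \tag{32}$$
Phase III: For $\beta < 1/2$ and $\beta < \alpha$,
$$\rho = 1 - \beta . \tag{33}$$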
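As a consistency check of the closed form (30), one can compare it with a brute-force sum of the matrix elements over all $2^N$ configurations, each reduced with the rules (18)-(20). The sketch below does this for small $N$; the function names are illustrative, and `bulk_density` simply encodes Eqs. (31)-(33) (it does not treat the transition lines specially).

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product
from math import factorial

def closed_form_norm(N, alpha, beta):
    """Right-hand side of Eq. (30); as written it requires alpha != beta."""
    a, b = 1 / Fraction(alpha), 1 / Fraction(beta)
    total = Fraction(0)
    for p in range(1, N + 1):
        coeff = Fraction(p * factorial(2 * N - 1 - p), factorial(N) * factorial(N - p))
        total += coeff * (b ** (p + 1) - a ** (p + 1)) / (b - a)
    return total

def brute_force_norm(N, alpha, beta):
    """Sum of <W| word |V> / <W|V> over all words of length N in D (1) and E (0)."""
    alpha, beta = Fraction(alpha), Fraction(beta)

    @lru_cache(maxsize=None)
    def weight(word):
        for k in range(len(word) - 1):
            if word[k] == 1 and word[k + 1] == 0:          # rule (18): D E = D + E
                head, tail = word[:k], word[k + 2:]
                return weight(head + (1,) + tail) + weight(head + (0,) + tail)
        n_empty = word.count(0)                            # reduced form E^p D^q, Eq. (29)
        return 1 / (alpha ** n_empty * beta ** (len(word) - n_empty))

    return sum(weight(word) for word in product((0, 1), repeat=N))

def bulk_density(alpha, beta):
    """Large-N bulk density according to Eqs. (31)-(33)."""
    half = Fraction(1, 2)
    if alpha > half and beta > half:
        return half                                        # phase I
    if alpha < beta:
        return alpha                                       # phase II
    return 1 - beta                                        # phase III (also returned on alpha = beta)

alpha, beta = Fraction(1, 3), Fraction(3, 4)               # illustrative rates (assumption)
print(closed_form_norm(4, alpha, beta) == brute_force_norm(4, alpha, beta))
print(bulk_density(alpha, beta))
```

For $N = 1$ both expressions reduce to $1/\alpha + 1/\beta$, as expected from $\langle W|(D+E)|V\rangle$.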
Fig. 1. The phase diagram.
Fig. 2. The average occupation $\rho = \langle\tau_{(N+1)/2}\rangle$ of the central site versus $\alpha$ for $N = 61$ and $N = 121$ when $\beta = 0.2$.
Phase I is the maximal current phase, phase II is a low density phase and phase III is a high density phase. The lines $\alpha = 1/2$, $\beta > 1/2$ and $\beta = 1/2$, $\alpha > 1/2$ are second-order transition lines ($\rho$ is continuous), whereas the line $\alpha = \beta < 1/2$ is a first-order transition line ($\rho$ is discontinuous). The first-order transition can be seen in Fig. 2, where the average occupation of the central site $\rho = \langle\tau_{(N+1)/2}\rangle$ is plotted versus $\alpha$ (at fixed $\beta = 0.2$) for two system sizes $N = 61$ and $N = 121$. As $N$ increases, one observes that $\rho$ becomes discontinuous along the first-order transition line $\alpha = \beta$.
There is a line in the phase diagram, the line
$$\alpha + \beta = 1 , \tag{34}$$
along which there exists a solution of Eqs. (18)-(20) such that the matrices $D$ and $E$ commute. If one chooses $D$ and $E$ to be numbers, one must have from Eqs. (19) and (20) $D = 1/\beta$ and $E = 1/\alpha$, and condition (18) imposes that $1/\alpha + 1/\beta = 1/(\alpha\beta)$, which leads to Eq. (34). Along this line, there is no correlation between the different sites and
$$P_{\text{steady state}}(\tau_1 \ldots \tau_N) = \prod_{i=1}^{N} [\tau_i \alpha + (1-\tau_i)(1-\alpha)] . \tag{35}$$
This line plays the role of a disordered line in usual statistical mechanics models [23].
One can try to calculate unequal time correlations for the ASEP with open boundary conditions. Because the steady state is more complicated than in the case of periodic boundary conditions, unequal time correlations are also much harder to obtain. So far only the diffusion constant of the integrated current, which generalises expression (11) to the case of open boundary conditions, has been calculated exactly [24]. However, contrary to what was said in [24], the relation to unequal time correlation functions is usually much more complicated than (12).
4. The Bethe ansatz method
For the ASEP, the Bethe ansatz was first used [25-27] to calculate the gap which gives the largest relaxation time in the system. In this section we will see that it also gives the diffusion constant (11) in the case of periodic boundary conditions [28].
Because on a ring the particles cannot overtake each other, they all cover the same distance $Y_t$, up to fluctuations which remain bounded when $t$ becomes large. Therefore, if one calls $X_t$ the total distance covered by the $P$ particles during the time $t$, one has $X_t \simeq P\,Y_t$ or, more precisely, if one considers the generating functions of $X_t$ and of $Y_t$, the two following limits coincide,
$$\lim_{t\to\infty} \frac{\log\langle e^{\gamma X_t}\rangle}{t} = \lim_{t\to\infty} \frac{\log\langle e^{P\gamma Y_t}\rangle}{t} = \lambda(\gamma) , \tag{36}$$
and their common value $\lambda(\gamma)$ can be calculated using the Bethe ansatz.
To see how the Bethe ansatz works, it is easier to consider a system of three particles ($P = 3$) on a ring of $N$ sites. One can show [28] that $\lambda(\gamma)$ is the largest eigenvalue of an $N(N-1)(N-2)/6 \times N(N-1)(N-2)/6$ matrix. A configuration $C$ of the system can be labelled by the positions $1 \le x_1 < x_2 < x_3 \le N$ of the three particles and, if $\psi(x_1, x_2, x_3)$ is the eigenvector corresponding to the eigenvalue $\lambda$, the equations satisfied by $\psi$ and $\lambda$ are
$$\lambda\,\psi(x_1, x_2, x_3) = e^{\gamma}\,[\psi(x_1-1, x_2, x_3) + \psi(x_1, x_2-1, x_3) + \psi(x_1, x_2, x_3-1)] - 3\,\psi(x_1, x_2, x_3) . \tag{37}$$
These equations are valid only when no two of the three particles are neighbours (and $x_1 \neq 1$). If $x_2 = x_1 + 1$, the hard core condition modifies Eq. (37) and it becomes
$$\lambda\,\psi(x_1, x_1+1, x_3) = e^{\gamma}\,[\psi(x_1-1, x_1+1, x_3) + \psi(x_1, x_1+1, x_3-1)] - 2\,\psi(x_1, x_1+1, x_3) . \tag{38}$$
Similarly, when $x_3 = x_2 + 1$ one obtains
$$\lambda\,\psi(x_1, x_2, x_2+1) = e^{\gamma}\,[\psi(x_1-1, x_2, x_2+1) + \psi(x_1, x_2-1, x_2+1)] - 2\,\psi(x_1, x_2, x_2+1) . \tag{39}$$
Lastly, periodic boundary conditions lead to
$$\lambda\,\psi(1, x_2, x_3) = e^{\gamma}\,[\psi(x_2, x_3, N) + \psi(1, x_2-1, x_3) + \psi(1, x_2, x_3-1)] - 3\,\psi(1, x_2, x_3) . \tag{40}$$
Usually, solving Eqs. (37)-(40) for arbitrary $N$ is very hard because the problem is a three-body problem. The idea of the Bethe ansatz consists in trying to see whether a certain simple form for the eigenvector $\psi(x_1, x_2, x_3)$ can solve the eigenvalue problem. For $P = 3$ particles, one looks for an eigenvector which is a sum of 6 terms
$$\psi(x_1, x_2, x_3) = A_{123}\,z_1^{x_1} z_2^{x_2} z_3^{x_3} + A_{213}\,z_2^{x_1} z_1^{x_2} z_3^{x_3} + \cdots + A_{321}\,z_3^{x_1} z_2^{x_2} z_1^{x_3} , \tag{41}$$
where the three parameters $z_1, z_2, z_3$ and the 6 parameters $A_{123}, \ldots, A_{321}$ are unknown. Then one inserts this form (41) into Eqs. (37)-(40) to find the conditions that these parameters have to satisfy
for Eq. (41) to be an eigenvector. If one inserts Eq. (41) into Eq. (37), one finds
$$\lambda(\gamma) = e^{\gamma} \left[\frac{1}{z_1} + \frac{1}{z_2} + \frac{1}{z_3}\right] - 3 , \tag{42}$$
and when one tries to satisfy Eqs. (38) and (39) also, one finds that amplitudes whose indices differ by the exchange of two neighbouring entries are related by
$$A_{213} = -\frac{e^{\gamma} - z_2}{e^{\gamma} - z_1}\,A_{123} , \qquad A_{132} = -\frac{e^{\gamma} - z_3}{e^{\gamma} - z_2}\,A_{123} ,$$
$$A_{312} = -\frac{e^{\gamma} - z_3}{e^{\gamma} - z_1}\,A_{132} , \qquad A_{231} = -\frac{e^{\gamma} - z_3}{e^{\gamma} - z_1}\,A_{213} ,$$
$$A_{321} = -\frac{e^{\gamma} - z_3}{e^{\gamma} - z_2}\,A_{231} , \qquad A_{321} = -\frac{e^{\gamma} - z_2}{e^{\gamma} - z_1}\,A_{312} ,$$
whereas Eq. (40) gives
$$A_{123} = z_1^{N} A_{231} = z_3^{-N} A_{312} ,$$
$$A_{231} = z_2^{N} A_{312} = z_1^{-N} A_{123} ,$$
$$A_{312} = z_3^{N} A_{123} = z_2^{-N} A_{231} .$$
One can check that these linear equations for the $A_{ijk}$ are compatible when the $z_i$ are solutions of the following three equations:
$$z_i^{-N}\,(e^{\gamma} - z_i)^3 = \prod_{j=1}^{3} (e^{\gamma} - z_j) . \tag{43}$$
Therefore, for each choice of the $z_i$ which satisfy Eq. (43), expression (41) is an eigenvector with eigenvalue (42). As we want the largest eigenvalue, one has to choose the solution such that $\lambda(\gamma) \to 0$ as $\gamma \to 0$. This solution is obtained by choosing the solution $\{z_1, z_2, z_3\}$ which converges to $\{1, 1, 1\}$ as $\gamma \to 0$ (because the matrix is finite, there is no crossing of eigenvalues, so that the eigenvalue for any $\gamma$ can be obtained by continuity from the solution $\lambda(0)$ for $\gamma = 0$).
It is straightforward to extend the Bethe ansatz from 3 particles to an arbitrary number $P$ of particles. The eigenvector (41) becomes a sum of $P!$ terms. The eigenvalue $\lambda(\gamma)$ given by Eqs. (42) and (43) becomes for general $P$ and $N$
$$\lambda(\gamma) = -P + e^{\gamma} \sum_{i=1}^{P} \frac{1}{z_i} , \tag{44}$$
where the $z_i$ satisfy
$$z_i^{-N}\,(e^{\gamma} - z_i)^P = (-1)^{P-1} \prod_{j=1}^{P} (e^{\gamma} - z_j) . \tag{45}$$
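As a worked step (a sketch of the standard argument, not spelled out in the text), the relation between the two amplitudes that differ by the exchange of $z_1$ and $z_2$, and its connection with Eq. (43), can be obtained as follows. The sketch assumes that the ansatz (41) is continued to the unphysical point $x_2 = x_1$, so that Eq. (37) can be compared with Eq. (38) at $x_2 = x_1 + 1$.

```latex
% Eq. (37) evaluated at x_2 = x_1 + 1, minus Eq. (38), leaves
\begin{align*}
  e^{\gamma}\,\psi(x_1, x_1, x_3) &= \psi(x_1, x_1 + 1, x_3) .
\intertext{Keeping in (41) the two terms in which $z_1, z_2$ occupy the first two positions:}
  e^{\gamma}\,(A_{123} + A_{213})\,(z_1 z_2)^{x_1} z_3^{x_3}
    &= (A_{123}\,z_2 + A_{213}\,z_1)\,(z_1 z_2)^{x_1} z_3^{x_3} \\
  \Longrightarrow \quad
  A_{123}\,(e^{\gamma} - z_2) + A_{213}\,(e^{\gamma} - z_1) &= 0 .
\intertext{The same step on positions 2 and 3, from Eq. (39), gives
$A_{213}\,(e^{\gamma} - z_3) + A_{231}\,(e^{\gamma} - z_1) = 0$, hence}
  A_{231} &= \frac{(e^{\gamma} - z_2)(e^{\gamma} - z_3)}{(e^{\gamma} - z_1)^{2}}\,A_{123} .
\intertext{Combining this with the periodicity condition $A_{123} = z_1^{N} A_{231}$ from Eq. (40):}
  z_1^{-N}\,(e^{\gamma} - z_1)^{2} &= (e^{\gamma} - z_2)(e^{\gamma} - z_3)
  \quad \Longleftrightarrow \quad
  z_1^{-N}\,(e^{\gamma} - z_1)^{3} = \prod_{j=1}^{3} (e^{\gamma} - z_j) .
\end{align*}
```

The last line is Eq. (43) for $i = 1$; the cases $i = 2, 3$ follow by the same steps starting from $A_{231}$ and $A_{312}$.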
Finding the solutions $z_i$ of Eq. (45) is not easy. Since the right-hand side of Eq. (45) is independent of $i$, Eq. (45) can be rewritten as
$$z_i^{-N}\,(z_i - e^{\gamma})^P = B , \tag{46}$$
where
$$B = (-1)^{P-1} \prod_{j=1}^{P} (z_j - e^{\gamma}) . \tag{47}$$
Now, we need to select the $P$ roots $z_i$ of Eq. (46) which tend to 1 as $\gamma \to 0$ and $B \to 0$. To do that we can use the Cauchy formula
$$\sum_{1 \le j \le P} h(z_j) = \oint \frac{dz}{2\pi i\,z}\;\frac{(z - e^{\gamma})^{P-1}\,[Pz - N(z - e^{\gamma})]}{(z - e^{\gamma})^P - B z^N}\;h(z) , \tag{48}$$
where the integration contour is a small circle (for small $B$) around $e^{\gamma}$ which surrounds $P$ roots of the polynomial $(z - e^{\gamma})^P - B z^N$. Using $h(z) = (e^{\gamma} - z)/z$ in Eq. (48), one gets
$$\lambda(\gamma) = -P \sum_{q=1}^{\infty} \frac{(Nq-2)!}{(Pq)!\,(Nq-Pq-1)!}\,\left(B\,e^{(N-P)\gamma}\right)^q . \tag{49}$$
The $z_k$ are solutions of Eq. (46) and depend both on $B$ and $\gamma$, which so far are not related. In principle, one could insert these $z_k$ into Eq. (47) in order to find a self-consistent equation relating $B$ and $\gamma$. A way of relating $B$ and $\gamma$, simpler than checking that Eq. (47) is satisfied, consists in writing that $z_1 z_2 \cdots z_P = 1$ or equivalently
$$\sum_{k=1}^{P} \log z_k = 0 , \tag{50}$$
which expresses the fact that the eigenvector associated with the largest eigenvalue is translation invariant ($\psi(x_1, x_2, x_3) = \psi(x_1+1, x_2+1, x_3+1)$). Using Eq. (48), one can rewrite (50) as
$$\oint \frac{dz}{2\pi i\,z}\;\frac{(z - e^{\gamma})^{P-1}\,[Pz - N(z - e^{\gamma})]}{(z - e^{\gamma})^P - B z^N}\;\log z = 0 \tag{51}$$
and by expanding this last expression in powers of $B$, one obtains
$$\gamma = -\sum_{q=1}^{\infty} \frac{(Nq-1)!}{(Pq)!\,(Nq-Pq)!}\,\left(B\,e^{(N-P)\gamma}\right)^q . \tag{52}$$
The two series (49) and (52) give $\lambda$ as a function of $\gamma$ by eliminating $B$ (this elimination can of course be done for small $B$, but the coefficients of $\lambda$ in powers of $\gamma$ are much more complicated than those appearing in Eqs. (49) and (52); moreover, as the largest eigenvalue $\lambda(\gamma)$ remains isolated as $\gamma$ varies, the function $\lambda(\gamma)$ given by the two series (49) and (52) for small $\gamma$ can be analytically continued to the whole real axis).
With the two expressions (49) and (52), it is easy to calculate the successive derivatives of $\lambda$ with respect to $\gamma$. From Eq. (36), one has
$$v = \lim_{t\to\infty} \frac{\langle Y_t\rangle}{t} = \frac{1}{P} \left.\frac{d\lambda}{d\gamma}\right|_{\gamma=0} \qquad \text{and} \qquad D = \lim_{t\to\infty} \frac{\langle Y_t^2\rangle - \langle Y_t\rangle^2}{t} = \frac{1}{P^2} \left.\frac{d^2\lambda}{d\gamma^2}\right|_{\gamma=0} , \tag{53}$$
and this leads to the expressions (10) and (11) for the limits defined in Eq. (9). In fact, $\lambda(\gamma)$ also contains all the higher cumulants of $Y_t$ and it can be used to calculate the large deviation function [28].
In the case of open boundary conditions [24] (Section 3), as well as for a ring with a partial asymmetry [29], the diffusion constant has been calculated by an extension of the matrix method of Section 3. The calculation by the Bethe ansatz has not been extended yet to these cases.
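The elimination of $B$ between the two series can be carried out explicitly to second order, which is enough to recover $v$ and $D$. The sketch below is one way of doing this (the function names are illustrative): writing $C = B\,e^{(N-P)\gamma}$, $\gamma = -(a_1 C + a_2 C^2 + \cdots)$ and $\lambda = -P\,(b_1 C + b_2 C^2 + \cdots)$, the chain rule gives $d\lambda/d\gamma = P\,b_1/a_1$ and $d^2\lambda/d\gamma^2 = 2P\,(b_1 a_2 - b_2 a_1)/a_1^3$ at $\gamma = 0$. It takes $v = \lambda'(0)/P$ and $D = \lambda''(0)/P^2$, as follows from the cumulant expansion of Eq. (36) with the definitions (9).

```python
from fractions import Fraction
from math import factorial

def gamma_coeff(N, P, q):
    """a_q: coefficient of C^q in -gamma, Eq. (52), with C = B exp((N - P) gamma)."""
    return Fraction(factorial(N * q - 1), factorial(P * q) * factorial(N * q - P * q))

def lambda_coeff(N, P, q):
    """b_q: coefficient of C^q in -lambda / P, Eq. (49). Requires P < N."""
    return Fraction(factorial(N * q - 2), factorial(P * q) * factorial(N * q - P * q - 1))

def velocity_and_diffusion(N, P):
    """v and D from the first two derivatives of lambda(gamma) at gamma = 0."""
    a1, a2 = gamma_coeff(N, P, 1), gamma_coeff(N, P, 2)
    b1, b2 = lambda_coeff(N, P, 1), lambda_coeff(N, P, 2)
    dlam = P * b1 / a1                                   # d lambda / d gamma
    d2lam = 2 * P * (b1 * a2 - b2 * a1) / a1 ** 3        # d^2 lambda / d gamma^2
    return dlam / P, d2lam / P ** 2

def closed_forms(N, P):
    """Eqs. (10) and (11), for comparison."""
    v = Fraction(N - P, N - 1)
    D = (Fraction(factorial(2 * N - 3), factorial(2 * P - 1) * factorial(2 * N - 2 * P - 1))
         * Fraction(factorial(P - 1) * factorial(N - P), factorial(N - 1)) ** 2)
    return v, D

for N, P in [(3, 1), (4, 2), (7, 3)]:
    print(N, P, velocity_and_diffusion(N, P) == closed_forms(N, P))
```

For a single particle ($P = 1$) the hops form a Poisson process, so both $v$ and $D$ equal 1, which provides a quick sanity check of the normalization used here.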
5. Two species of particle on a ring

The matrix method of Section 3 can be generalized to obtain the exact steady state of more complicated systems, in particular, systems with two species of particles [30]. Consider a ring of $N$ sites with two species of particles represented by 1 and 2, and holes represented by 0, in which the hopping rates are
$$1\,0 \to 0\,1 \ \text{ with rate } 1 , \qquad 2\,0 \to 0\,2 \ \text{ with rate } \alpha , \qquad 1\,2 \to 2\,1 \ \text{ with rate } \beta . \tag{54}$$
One can show [30] that the steady state weights may be written as
$$\mathrm{trace}(X_1 X_2 \cdots X_N) , \tag{55}$$
where $X_i=D$ if site $i$ is occupied by a 1 particle, $X_i=A$ if it is occupied by a 2 particle and $X_i=E$ if it is empty, provided that the matrices $D$, $A$ and $E$ satisfy the following algebra:
$$DE=D+E; \qquad \beta DA=A; \qquad \alpha AE=A . \tag{56}$$
The last two of these equations are satisfied when $A$ is given by
$$A=|V\rangle\langle W| , \tag{57}$$
and
$$D|V\rangle=\frac{1}{\beta}|V\rangle; \qquad \langle W|E=\frac{1}{\alpha}\langle W| . \tag{58}$$
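The rules (56) and (58) are enough to evaluate any matrix element $\langle W|\cdots|V\rangle$ of a product of $D$'s and $E$'s without an explicit representation. The following is a minimal sketch of such a reduction, assuming the normalization $\langle W|V\rangle=1$; the helper name `make_bracket` and the encoding of a product as a string of letters are illustrative choices, not part of the original method.

```python
from fractions import Fraction
from functools import lru_cache


def make_bracket(alpha, beta):
    """Return a function computing <W| word |V> / <W|V> for words over 'D', 'E'."""
    alpha, beta = Fraction(alpha), Fraction(beta)

    @lru_cache(maxsize=None)
    def bracket(word):
        if not word:
            return Fraction(1)
        if word[0] == "E":              # <W|E = (1/alpha) <W|, Eq. (58)
            return bracket(word[1:]) / alpha
        if word[-1] == "D":             # D|V> = (1/beta) |V>, Eq. (58)
            return bracket(word[:-1]) / beta
        # The word now starts with D and ends with E, so it contains "DE";
        # replace it using DE = D + E from Eq. (56).
        i = word.index("DE")
        return (bracket(word[:i] + "D" + word[i + 2:])
                + bracket(word[:i] + "E" + word[i + 2:]))

    return bracket
```

Passing the rates as `Fraction` objects (or strings such as `"3/20"`) keeps the arithmetic exact. With $\alpha=3/20$ and $\beta=1/4$, `make_bracket(...)("DE")` evaluates to $1/\alpha+1/\beta$.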
So, one can use the same matrices D, E as those of Section 2 for the case of open boundary conditions, and construct matrix A from the vectors $\langle W|$ and $|V\rangle$. The proof that Eqs. (55) and (56) give the steady state of the two species problem is very similar to the proof for the case of open boundary conditions: one shows that the loss term and the gain
term for the probability of each configuration are equal. For example, for a configuration of the type (59) on a ring, made of $p$ holes, $q$ particles 1, $r$ particles 2 and $s$ holes in succession, the loss term during $dt$ is
$$(\alpha+\beta)\,\mathrm{trace}(E^{p}D^{q}A^{r}E^{s})\,dt \tag{60}$$
as there are only two possibilities of leaving this configuration and the gain term is
$$\mathrm{trace}(E^{p-1}DED^{q-1}A^{r}E^{s})\,dt \tag{61}$$
as there is a single way of entering this configuration. For such configurations, it is almost evident from Eq. (56) that the loss term and the gain term coincide.

To illustrate a situation with two species, let us consider on a ring of $N$ sites a single particle 2 and $P$ particles 1 [31,32]. The parameters $\alpha$ and $\beta$ which appear in Eq. (54) are chosen such that $\beta<1-\alpha$. This implies in particular that $\alpha<1$ and $\beta<1$, so that the particle 2 is slower than particles 1 and, as $\beta<1$, it plays the role of a moving obstacle. In other words, particles 1 can be thought of as cars and the particle 2 as a single slow truck. Using the algebraic rules (56), it is possible to calculate $Y(N,P)$ defined by
$$Y(N,P)=\frac{1}{\langle W|V\rangle}\sum_{\tau_1=0,1}\cdots\sum_{\tau_{N-1}=0,1}\delta_{\tau_1+\cdots+\tau_{N-1},\,P}\;\Bigl\langle W\Bigr|\prod_{i=1}^{N-1}\bigl[\tau_i D+(1-\tau_i)E\bigr]\Bigl|V\Bigr\rangle \tag{62}$$
and show that for $1\le P\le N-2$
$$Y(N,P)=\sum_{p=0}^{N-P-1}\sum_{q=0}^{P}\frac{1}{P!\,(N-1-P)!}\;\frac{(N-2-p)!\,(N-2-q)!}{(P-q)!\,(N-1-P-p)!}\;\frac{p(P-q)+q(N-1-P)}{\alpha^{p}\beta^{q}} , \tag{63}$$
whereas $Y(N,0)=\alpha^{-(N-1)}$ and $Y(N,N-1)=\beta^{-(N-1)}$. Then the average velocity $v_2$ of particle 2 (the truck) is given by
$$v_2=\frac{Y(N-1,P)-Y(N-1,P-1)}{Y(N,P)} , \tag{64}$$
whereas the average velocity $v_1$ of particles 1 (cars) is given by
$$v_1=v_2+\frac{N}{P}\,\frac{Y(N-1,P-1)}{Y(N,P)} . \tag{65}$$
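The closed form (63) and the velocities (64) and (65) can be evaluated directly. The sketch below is one possible implementation using exact rational arithmetic; the function names `Y` and `velocities` are chosen here for illustration, and the boundary cases $P=0$ and $P=N-1$ use the two values quoted after Eq. (63).

```python
from fractions import Fraction
from math import factorial


def Y(N, P, alpha, beta):
    """Y(N, P) from Eq. (63), with the boundary cases P = 0 and P = N - 1."""
    alpha, beta = Fraction(alpha), Fraction(beta)
    if P == 0:
        return alpha ** -(N - 1)
    if P == N - 1:
        return beta ** -(N - 1)
    total = Fraction(0)
    for p in range(N - P):              # p = 0 .. N-P-1
        for q in range(P + 1):          # q = 0 .. P
            num = (factorial(N - 2 - p) * factorial(N - 2 - q)
                   * (p * (P - q) + q * (N - 1 - P)))
            den = factorial(P - q) * factorial(N - 1 - P - p)
            total += Fraction(num, den) / (alpha ** p * beta ** q)
    return total / (factorial(P) * factorial(N - 1 - P))


def velocities(N, P, alpha, beta):
    """Return (v1, v2) from Eqs. (65) and (64); requires 1 <= P <= N - 2."""
    y = Y(N, P, alpha, beta)
    y_minus = Y(N - 1, P - 1, alpha, beta)
    v2 = (Y(N - 1, P, alpha, beta) - y_minus) / y
    v1 = v2 + Fraction(N, P) * y_minus / y
    return v1, v2
```

For the smallest non-trivial ring ($N=3$, $P=1$) the result can be checked by hand against the two-configuration master equation, which gives $v_2=(\alpha-\beta)/(1+\alpha+\beta)$ and $v_1=(\alpha+2\beta)/(1+\alpha+\beta)$. Calling `velocities(200, 100, Fraction(3, 20), Fraction(1, 4))` is a way to compare a finite ring with the large-$N$ behavior discussed next.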
These velocities are shown in Fig. 3 as a function of the density $\rho=P/N$ of particles 1 for two system sizes ($N=100$ and $N=200$). For large $N$, we see that as $\rho$ varies, $v_2$ presents a plateau for $\beta<\rho<1-\alpha$ where $v_2=\alpha-\beta$. Along this plateau, the ring consists of two macroscopic regions: a region of high density $\rho_+=1-\alpha$ following the truck and a region of low density $\rho_-=\beta$
Fig. 3. The velocities $v_1$ and $v_2$ of the particles 1 and 2 versus the density $\rho$ of first class particles for two rings of 100 and 200 sites when $\alpha=0.15$ and $\beta=0.25$. Finite size effects are small enough to make the curves for 100 and 200 hard to distinguish.
ahead of the truck. These two regions are of length $L_+=N(\rho-\beta)/(1-\alpha-\beta)$ and $L_-=N(1-\alpha-\rho)/(1-\alpha-\beta)$ so that the total density is
$$\rho=\frac{L_+\rho_+ + L_-\rho_-}{N} .$$
The expression of the average velocity $v_1$ of the particles 1 can also be obtained in this coexistence region, by
$$v_1=\frac{L_+\rho_+v_+ + L_-\rho_-v_-}{P}$$
with $v_\pm=1-\rho_\pm$, i.e.
$$v_1=\alpha-\beta+\frac{(1-\alpha)\beta}{\rho} .$$
In the thermodynamic limit, three phases can be observed as the density $\rho=P/N$ of particles 1 varies. The steady state velocities $v_1$ of the particles 1 and $v_2$ of the particle 2 have the following expressions (which can be obtained from the asymptotic behavior of Eq. (63)):

For $\rho<\beta$: $v_1=1-\rho$; $v_2=\alpha-\rho$.
For $\beta<\rho<1-\alpha$: $v_1=\alpha-\beta+(1-\alpha)\beta/\rho$; $v_2=\alpha-\beta$.
For $1-\alpha<\rho$: $v_1=1-\rho$; $v_2=1-\beta-\rho$.

It is interesting to notice that in the range $\beta<\rho<1-\alpha$, the truck (particle 2) has a constant velocity, whereas in that range, cars (particles 1) have an average velocity slower than $1-\rho$ (the velocity in absence of the truck). In this whole range there is a coexistence of two macroscopic
regions, one of low density $\beta$ and one of high density $1-\alpha$, very similar to the coexistence between the liquid and the gas for a fluid.

A case of the two species problem of particular interest and which, as we will see in Section 6, can be used to study shocks is that of first and second class particles [33–36]. This corresponds to $\alpha=\beta=1$ in Eq. (54) so that all hopping rates are 1. Both first and second class particles hop forward when they have a hole to their right, but when a first class particle has a second class particle to its right, the two particles interchange positions. Thus, a second class particle tends to move backwards in an environment of a high density of first class particles and tends to move forwards in an environment of a low density of first class particles so that it gets trapped by a shock. The matrix method was used in [30] to calculate exactly the profile of a shock as seen from a second class particle by starting from the steady state weights (55) of the two species problem on a ring. We are going to see in Section 6 that the shock profile can also be calculated directly on an infinite line.
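The three regimes of the cars-and-truck example in the thermodynamic limit, and the coexistence argument behind the plateau, can be summarized in a few lines. This is only a sketch of the formulas quoted above; the function names are hypothetical, and the boundary densities $\rho=\beta$ and $\rho=1-\alpha$ are assigned to the coexistence branch by convention.

```python
from fractions import Fraction


def thermodynamic_velocities(rho, alpha, beta):
    """Return (v1, v2) for car density rho in the large-N limit."""
    rho, alpha, beta = Fraction(rho), Fraction(alpha), Fraction(beta)
    if not beta < 1 - alpha:
        raise ValueError("the text assumes beta < 1 - alpha")
    if rho < beta:                      # low-density phase
        return 1 - rho, alpha - rho
    if rho <= 1 - alpha:                # coexistence (plateau of v2)
        return alpha - beta + (1 - alpha) * beta / rho, alpha - beta
    return 1 - rho, 1 - beta - rho      # high-density phase


def coexistence_fractions(rho, alpha, beta):
    """Return (L_plus / N, L_minus / N) in the coexistence region."""
    rho, alpha, beta = Fraction(rho), Fraction(alpha), Fraction(beta)
    width = 1 - alpha - beta
    return (rho - beta) / width, (1 - alpha - rho) / width
```

Both velocities are continuous at the two phase boundaries, and averaging $\rho_\pm v_\pm$ over the two regions with the weights returned by `coexistence_fractions` reproduces the plateau expression for $v_1$.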
6. Shock profile

It is possible to generalize the calculation of Sections 3 and 5 to describe shock profiles in an infinite system. Consider an infinite system with only one kind of particles (as in Sections 2–4). If the initial condition is a Bernoulli measure with density $\rho_+$ at the right of the origin (i.e. the $\tau_i$ are independent and $\langle\tau_i\rangle=\rho_+$ for $i\ge 0$) and with density $\rho_-$ at the left of the origin ($\langle\tau_i\rangle=\rho_-$ for $i<0$), the evolution of the ASEP develops a shock. The shock has a velocity
$$v=1-\rho_- -\rho_+ \tag{66}$$
and this can be easily understood from the conservation of mass. The current far to the right of the shock is $\rho_+(1-\rho_+)$ and far to the left is $\rho_-(1-\rho_-)$. Therefore, in a large region containing the shock, the number of particles increases by $[\rho_-(1-\rho_-)-\rho_+(1-\rho_+)]\,t$ during the time $t$. This increase in the number of particles has the effect of a translation of the shock of $vt$ so that
$$(\rho_- -\rho_+)\,vt=[\rho_-(1-\rho_-)-\rho_+(1-\rho_+)]\,t ,$$
which gives (66).

A more delicate question is that of the shape of a shock. Because the dynamics is stochastic, one first needs to locate the shock. There exist several ways of locating the shock. For example, one can choose [38] an irrational value $\rho_*$ which satisfies $\rho_-<\rho_*<\rho_+$ and define the position of the shock as the value $m_*$ which minimizes $S_m$, where
$$S_m=\begin{cases}\displaystyle\sum_{k=1}^{m}(\tau_k-\rho_*) , & \text{for } m>0 ,\\[2mm] 0 , & \text{for } m=0 ,\\[1mm] \displaystyle -\sum_{k=m+1}^{0}(\tau_k-\rho_*) , & \text{for } m<0 ,\end{cases} \tag{67}$$
(as the density is $\rho_+$ to the right of the shock, the sum increases like $(\rho_\pm-\rho_*)\,m$ as $m\to\pm\infty$; thus, for almost all configurations $\{\tau_n\}$, there is a unique minimum $m_*$). With this definition of the
position of the shock (which depends on $\rho_*$), one can study the measure seen from this position. For example, the profile $\langle\tau_n\rangle$ is just the average occupation at position $m_*+n$ (given that the minimum of Eq. (67) is located at $m_*$). Obviously, as the location of the minimum depends on the value of $\rho_*$, the profile $\langle\tau_n\rangle$ or the measure seen from the minimum $m_*$ will depend on $\rho_*$. This means that two observers looking at exactly the same physical situation, but using two different values of $\rho_*$ to locate the shock, will observe different profiles. The profile $\langle\tau_n\rangle$ is, therefore, not an intrinsic property of the shock. One can, however, define some properties of shocks [38] like sums of the type
$$U_k=\sum_{n=-\infty}^{\infty}\bigl\langle(\tau_n-\rho_-)(\tau_{n+k}-\rho_+)\bigr\rangle , \tag{68}$$
which are independent of the definition chosen to locate the shock and can therefore be thought of as intrinsic properties of the shock. For example [38], one expects $U_1$ to be always given by
$$U_1=\sum_{n=-\infty}^{\infty}\bigl\langle(\tau_n-\rho_-)(\tau_{n+1}-\rho_+)\bigr\rangle=\rho_-(1-\rho_+)$$
irrespective of the method chosen to locate the shock, in particular irrespective of the value of $\rho_*$.

A convenient way of locating the shock consists in replacing one hole in the system by a second class particle (i.e. with dynamics given by Eq. (54) with $\alpha=\beta=1$). This replacement does not affect the dynamics of the particles 1 and so does not disturb the configurations of the shock. The second class particle has a larger velocity in the region of density $\rho_-$ than in the region of density $\rho_+$ and one can show that it gets trapped by the shock [30,33–35,37]. Therefore, the position of the second class particle can be used to locate the shock. One can try to describe the measure seen from this second class particle. It is possible to show [22,37,38] that the steady state weights of all possible environments of the second class particle in presence of a shock characterized by the two asymptotic densities $\rho_+$ and $\rho_-$ can be written in a matrix form similar to Eq. (16).
If the occupation numbers $\tau_i$ are specified for the $n$ sites at the left of the second class particle and the $n'$ sites at its right, the steady state probability of the configuration is given by
$$\mathrm{Prob}(\tau_{-n}\cdots\tau_{-1};\,\tau_1\cdots\tau_{n'})=\Bigl\langle W\Bigr|\Bigl[\prod_{i=-n}^{-1}\bigl(D\tau_i+E(1-\tau_i)\bigr)\Bigr]A\Bigl[\prod_{i=1}^{n'}\bigl(D\tau_i+E(1-\tau_i)\bigr)\Bigr]\Bigl|V\Bigr\rangle \tag{69}$$
when the matrices $A$, $D$, $E$ and the vectors $|V\rangle$ and $\langle W|$ satisfy
$$\begin{aligned} DE&=(1-\rho_-)(1-\rho_+)\,D+\rho_-\rho_+\,E ,\\ DA&=\rho_-\rho_+\,A; \qquad AE=(1-\rho_-)(1-\rho_+)\,A ,\\ \langle W|(D+E)&=\langle W|; \qquad (D+E)|V\rangle=|V\rangle ,\\ \langle W|A|V\rangle&=1 . \end{aligned} \tag{70}$$
For example, $\mathrm{Prob}(1\,1\,0\,2\,1\,1\,0\,1)=\langle W|D^{2}EAD^{2}ED|V\rangle$. So as in Section 5, for each particle we put a $D$, for each hole an $E$ and for the second class particle we put the matrix $A$. As in Section 3, it is easy to see that all the matrix elements of the form
$\langle W|D^{m}AE^{n}|V\rangle$ can be calculated immediately from Eq. (71),
$$\langle W|D^{n}AE^{n'}|V\rangle=[\rho_-\rho_+]^{n}\,[(1-\rho_-)(1-\rho_+)]^{n'} ,$$
and that, by using the algebra (71), all the other matrix elements can be reduced to such elements. One can calculate that way the profile seen from the second class particle
$$\langle\tau_n\rangle=\begin{cases}\langle W|A(D+E)^{n-1}D|V\rangle & \text{for } n\ge 1 ,\\[1mm] \langle W|D(D+E)^{-n-1}A|V\rangle & \text{for } n\le -1 .\end{cases} \tag{71}$$
One finds in particular
$$\langle\tau_1\rangle=\rho_+ +\rho_- -\rho_+\rho_-; \qquad \langle\tau_{-1}\rangle=\rho_+\rho_-; \qquad \langle\tau_{-2}\rangle=\rho_+\rho_- +\rho_-\rho_+(1-\rho_-)(1-\rho_+)$$
and so on, in agreement with [30]. One can check that the velocity $v$ of the second class particle given by
$$v=\langle(1-\tau_1)-\tau_{-1}\rangle=1-\rho_+ -\rho_-$$
coincides with the shock velocity (66), as it should since the second class particle is trapped by the shock. The algebra (71), which gives the measure seen from the second class particle, can of course be used to calculate all the intrinsic properties of the shock, in particular the sums $U_k$ defined in Eq. (68). It is possible to show [38] from the algebra (71) that there is a relation between the profile $\langle\tau_n\rangle$ seen from the second class particle and the intrinsic sums (68):
$$U_{k+1}-U_k=(\rho_+ -\rho_-)(\rho_+ -\langle\tau_k\rangle) .$$
Of course such a relation has no reason to remain valid when the profile $\langle\tau_k\rangle$ is seen from a different location than the second class particle.
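The reduction described in this section can be sketched as a small recursive procedure. The code below is an illustrative implementation, not the author's: it encodes a configuration around the second class particle as a string over `D`, `E` with exactly one `A`, and applies the quadratic rules together with $\langle W|E=\langle W|-\langle W|D$ and $D|V\rangle=|V\rangle-E|V\rangle$ until only elements of the form $\langle W|D^{n}AE^{n'}|V\rangle$ remain. The names `make_shock_bracket` and `profile` are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product


def make_shock_bracket(rho_minus, rho_plus):
    """Return a function computing <W| word |V> for words with exactly one 'A'."""
    rm, rp = Fraction(rho_minus), Fraction(rho_plus)
    a = (1 - rm) * (1 - rp)     # DE = a D + b E  and  AE = a A
    b = rm * rp                 # DA = b A

    @lru_cache(maxsize=None)
    def bracket(word):
        left, right = word.split("A")
        if "E" not in left and "D" not in right:
            return b ** len(left) * a ** len(right)     # <W|D^n A E^n'|V>
        if "DE" in left:
            i = left.index("DE")
            return (a * bracket(left[:i] + "D" + left[i + 2:] + "A" + right)
                    + b * bracket(left[:i] + "E" + left[i + 2:] + "A" + right))
        if "E" in left:         # left = E..ED..D: use <W|E = <W| - <W|D
            return (bracket(left[1:] + "A" + right)
                    - bracket("D" + left[1:] + "A" + right))
        if "DE" in right:
            i = right.index("DE")
            return (a * bracket(left + "A" + right[:i] + "D" + right[i + 2:])
                    + b * bracket(left + "A" + right[:i] + "E" + right[i + 2:]))
        # right = E..ED..D with at least one D: use D|V> = |V> - E|V>
        return (bracket(left + "A" + right[:-1])
                - bracket(left + "A" + right[:-1] + "E"))

    return bracket


def profile(n, bracket):
    """<tau_n> seen from the second class particle, for any integer n != 0."""
    total = Fraction(0)
    for letters in product("DE", repeat=abs(n) - 1):    # expand (D + E)^(|n|-1)
        mid = "".join(letters)
        total += bracket("A" + mid + "D" if n >= 1 else "D" + mid + "A")
    return total
```

The cost grows exponentially with $|n|$ because $(D+E)^{|n|-1}$ is expanded word by word, so this is only meant for checking the first few sites of the profile, for instance the three values of $\langle\tau_n\rangle$ quoted above and the velocity $1-\rho_+-\rho_-$.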
7. Conclusions

The matrix method described in Sections 3, 5 and 6 has been extended recently to a number of cases: parallel dynamics [39,40], partial asymmetry [17,37,41–44], transient properties [45], systems with two or more species of particles [46–49], reaction diffusion models [50,51], cases with disorder [52]. However, there remain a number of simple generalizations of the asymmetric exclusion model, such as the case of a fixed blockage [53], the calculation of general unequal time correlations [54], more general reaction diffusion models [55], and open systems with several species of particles [56], for which exact solutions are still lacking. Trying to extend the Bethe ansatz calculation of Section 4 to other cases (several species, partial asymmetry, open boundary conditions) would certainly lead to new interesting results and make more transparent the link with the matrix method.
Acknowledgements

The content of these lectures is based on work done in collaboration with E. Domany, M.R. Evans, S. Goldstein, V. Hakim, S. Janowsky, J.L. Lebowitz, K. Mallick, D. Mukamel, V. Pasquier, E.R. Speer. I would like to thank them as well as D. Kim, V. Rittenberg, G. Schütz, H. Spohn and R. Stinchcombe for useful discussions.
References

[1] F. Spitzer, Adv. Math. 5 (1970) 246.
[2] T.M. Liggett, Interacting Particle Systems, Springer, New York, 1985.
[3] H. Spohn, Large Scale Dynamics of Interacting Particles, Springer, New York, 1991.
[4] J. Krug, Phys. Rev. Lett. 67 (1991) 1882.
[5] B. Schmittmann, R.K.P. Zia, Statistical mechanics of driven diffusive systems, in: C. Domb, J.L. Lebowitz (Eds.), Phase Transitions and Critical Phenomena, vol. 17, Academic, New York, 1995.
[6] J.L. Lebowitz, E. Presutti, H. Spohn, J. Statist. Phys. 51 (1988) 841.
[7] T. Halpin-Healy, Y.-C. Zhang, Phys. Rep. 254 (1994) 215.
[8] A. De Masi, E. Presutti, E. Scacciatelli, Ann. Inst. Henri Poincaré: Probabilités et Statistiques 25 (1989) 1.
[9] C. Kipnis, S. Olla, S.R.S. Varadhan, Commun. Pure Appl. Math. 42 (1989) 115.
[10] J.L. Lebowitz, E. Orlandi, E. Presutti, Physica D 33 (1988) 165.
[11] P. Meakin, P. Ramanlal, L.M. Sander, R.C. Ball, Phys. Rev. A 34 (1986) 5091.
[12] A. De Masi, P.A. Ferrari, J. Statist. Phys. 38 (1985) 603.
[13] R. Kutner, H. van Beijeren, J. Statist. Phys. 39 (1985) 317.
[14] S.N. Majumdar, M. Barma, Phys. Rev. B 44 (1991) 5306.
[15] B. Derrida, M.R. Evans, D. Mukamel, J. Phys. A 26 (1993) 4911.
[16] B. Derrida, E. Domany, D. Mukamel, J. Stat. Phys. 69 (1992) 667.
[17] B. Derrida, M.R. Evans, V. Hakim, V. Pasquier, J. Phys. A 26 (1993) 1493.
[18] G. Schütz, E. Domany, J. Stat. Phys. 72 (1993) 277.
[19] L.D. Faddeev, Sov. Sci. Rev. C 1 (1980) 107.
[20] R.J. Baxter, Exactly Solved Models in Statistical Mechanics, Academic, New York, 1982.
[21] V. Hakim, J.P. Nadal, J. Phys. A 16 (1983) L213.
[22] B. Derrida, M.R. Evans, in: V. Privman (Ed.), Non-equilibrium Statistical Mechanics in One Dimension, Cambridge University Press, Cambridge, 1997, pp. 277.
[23] P. Ruján, J. Stat. Phys. 49 (1987) 139.
[24] B. Derrida, M.R. Evans, K. Mallick, J. Stat. Phys. 79 (1995) 833.
[25] D. Dhar, Phase Transitions 9 (1987) 51.
[26] L.H. Gwa, H. Spohn, Phys. Rev. A 46 (1992) 844.
[27] D. Kim, Phys. Rev. E 52 (1995) 3512; J. Phys. A 30 (1997) 3817.
[28] B. Derrida, J.L. Lebowitz, Phys. Rev. Lett. 80 (1998) 209.
[29] B. Derrida, K. Mallick, J. Phys. A 30 (1997) 1031.
[30] B. Derrida, S.A. Janowsky, J.L. Lebowitz, E.R. Speer, Europhys. Lett. 22 (1993) 651; J. Stat. Phys. 73 (1993) 813.
[31] B. Derrida, in: Hao Bailin (Ed.), Stat. Phys. 19, Xiamen, China, World Scientific, Singapore, 1996, pp. 243–253.
[32] K. Mallick, J. Phys. A 29 (1996) 5375.
[33] E.D. Andjel, M. Bramson, T.M. Liggett, Probab. Theory Rel. Fields 78 (1988) 231.
[34] P.A. Ferrari, C. Kipnis, E. Saada, Ann. Probab. 19 (1991) 226.
[35] C. Boldrighini, G. Cosimi, S. Frigio, M.G. Nuñes, J. Stat. Phys. 55 (1989) 611.
[36] E.R. Speer, The two species totally asymmetric exclusion process, in: M. Fannes, C. Maes, A. Verbeure (Eds.), Micro, Meso and Macroscopic Approaches in Physics, Plenum Press, New York, 1994.
[37] B. Derrida, J.L. Lebowitz, E.R. Speer, J. Stat. Phys. 89 (1997) 135.
[38] B. Derrida, S. Goldstein, J.L. Lebowitz, E.R. Speer, J. Stat. Phys., in press.
[39] H. Hinrichsen, J. Phys. A 29 (1996) 3659.
[40] A. Honecker, I. Peschel, J. Stat. Phys. 88 (1997) 319.
[41] S. Sandow, Phys. Rev. 50 (1994) 2660.
[42] S. Sandow, G.M. Schütz, Europhys. Lett. 26 (1994) 7.
[43] F.H.L. Essler, V. Rittenberg, J. Phys. A 29 (1996) 3375.
[44] K. Mallick, S. Sandow, J. Phys. A 30 (1997) 4513.
[45] R.B. Stinchcombe, G.M. Schütz, Europhys. Lett. 29 (1995) 663; Phys. Rev. Lett. 75 (1995) 140.
[46] M.R. Evans, Y. Kafri, M.H. Koduvely, D. Mukamel, Phys. Rev. Lett. 80 (1998) 425.
[47] P.F. Arndt, T. Heinzel, V. Rittenberg, J. Phys. A 31 (1998) 833; J. Stat. Phys. 90 (1998) 783.
[48] F.C. Alcaraz, S. Dasmahapatra, V. Rittenberg, J. Phys. A 31 (1998) 845.
[49] A.B. Kolomeisky, preprint, 1997.
[50] H. Hinrichsen, S. Sandow, I. Peschel, J. Phys. A 29 (1996) 2643.
[51] M.J.E. Richardson, M.R. Evans, J. Phys. A 30 (1997) 811.
[52] M.R. Evans, Europhys. Lett. 36 (1996) 13; J. Phys. A 30 (1997) 5669.
[53] S.A. Janowsky, J.L. Lebowitz, Phys. Rev. A 45 (1992) 618.
[54] G.M. Schütz, J. Stat. Phys. 88 (1997) 427.
[55] F.C. Alcaraz, M. Droz, M. Henkel, V. Rittenberg, Ann. Phys. (N.Y.) 230 (1994) 250.
[56] M.R. Evans, D.P. Foster, C. Godrèche, D. Mukamel, Phys. Rev. Lett. 74 (1995) 208.
Physics Reports 301 (1998) 85—112
Nonequilibrium dynamics of interfaces and lines Mehran Kardar Department of Physics, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Abstract

The lectures examine several problems related to non-equilibrium fluctuations of interfaces and flux lines. We start by introducing the phenomenology of depinning, with particular emphasis on interfaces and contact lines. The role of the anisotropy of the medium in producing different universality classes is elucidated. We then focus on the dynamics of lines, where transverse fluctuations are also important. We shall demonstrate how various non-linearities appear in the dynamics of driven flux lines. The universality classes of depinning, and also dynamic roughening, are illustrated in the contexts of moving flux lines, advancing crack fronts, and drifting polymers. © 1998 Elsevier Science B.V. All rights reserved.

PACS: 47.55.Mh; 74.60.Ge; 75.60.Ch; 05.70.Ln
Keywords: Depinning; Interfaces; Polymers; Flux lines
0370-1573/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S0370-1573(98)00007-6

1. Depinning of interfaces

1.1. Introduction and phenomenology

Depinning is a non-equilibrium critical phenomenon involving an external force and a pinning potential. When the force is weak the system is stationary, trapped in a metastable state. Beyond a threshold force the (last) metastable state disappears and the system starts to move. A simple example is provided by a point mass on a rough table. The mass is stationary until the external force $F$ exceeds that of static friction $F_c$. Larger forces lead to an initial period of acceleration, before the motion settles to a uniform velocity due to viscous forces. If the latter is proportional to velocity, the ultimate velocity of the point close to threshold behaves as $v\propto(F-F_c)$.

While there are many other macroscopic mechanical examples, our main interest comes from condensed matter systems such as Charge Density Waves (CDWs) [1], interfaces [2], and contact lines [3]. In CDWs, the control parameter is the external voltage; a finite CDW current appears only beyond a threshold applied voltage. Interfaces in porous media and domain walls in random magnets are stationary unless the applied force (magnetic field) is sufficiently strong. A key feature
of these examples is that they involve the collective depinning of many degrees of freedom that are elastically coupled. As such, these problems belong to the realm of collective critical phenomena, characterized by universal scaling laws. We shall introduce these laws and the corresponding exponents below for the depinning of a line (interface or contact line).

Consider a line in two dimensions (Fig. 1) oriented along the $x$ direction, and fluctuating along a perpendicular direction $r$. The configuration of the line at time $t$ is described by the function $r(x,t)$. The function $r$ is assumed to be single valued, thus excluding configurations with overhangs. In many cases [2], where viscous forces dominate over inertia, the local velocity of a point on the curve is given by
$$\frac{dr(x,t)}{dt}=F+f(x,r)+K[r] . \tag{1}$$
The first term on the right-hand side is a uniform applied force which is also the external control parameter. Fluctuations in the force due to randomness and impurities are represented by the second term. With the assumption that the medium is on average translationally invariant, the average of $f$ can be set to zero. The final term in Eq. (1) describes the elastic forces between different parts of the line. Short-range interactions can be described by a gradient expansion; for example, a line tension leads to $K[r(x)]=\nabla^{2}r$ or $K[r(q)]=-q^{2}r(q)$ for the Fourier modes. The surface of a drop of non-wetting liquid terminates at a contact line on a solid substrate [3]. Deformations of the contact line are accompanied by distortions of the liquid/gas surface. As shown by Joanny and de Gennes [4], the resulting energy and forces are non-local, described by $K[r(q)]=-|q|\,r(q)$.

For the case of a surface in three dimensions, deformations are described by $r(x_1,x_2)$. More generally, we shall consider $r(\mathbf{x})$, where $\mathbf{x}$ is a $d$-dimensional vector. In a similar spirit, we shall generalize the coupling to $K[r(q)]=-|q|^{\sigma}r(q)$, which interpolates between the above two cases as $\sigma$ changes from one to two. Note, however, that the equation of motion need not originate from variations of a Hamiltonian, and may include non-linear couplings which will be discussed later on.

When $F$ is small, the line is trapped in one of many metastable states in which $\partial r/\partial t=0$ at all points. For $F$ larger than a threshold $F_c$, the line is depinned from the last metastable state, and
Fig. 1. Geometry of the line in two dimensions. Fig. 2. Critical behavior of the velocity.
moves with an average velocity $v$. On approaching the threshold from above, the velocity vanishes as (Fig. 2)
$$v=A(F-F_c)^{\beta} , \tag{2}$$
where $\beta$ is the velocity exponent, and $A$ is a non-universal amplitude.

A mean-field estimate for $\beta$ was obtained by Fisher in the context of CDWs [5]. It corresponds to the limit $\sigma=0$, where every point is coupled to all others, and hence experiences a restoring force proportional to $\langle r(x)\rangle-r(x)$. The resulting equation of motion, $dr(x)/dt=\langle r(x)\rangle-r(x)+F+f(x,r(x))$, has to be supplemented with the condition $\langle r(x)\rangle=vt$. The self-consistent solution for the velocity, indeed, vanishes as $(F-F_c)^{\beta}$, with an exponent that depends on the details of the random force. If $f(x,r(x))$ varies smoothly with $r$, the exponent is $\beta=3/2$, while discontinuous jumps in the force (like a saw-tooth) result in $\beta=1$. In fact, the latter is a better starting point for depinning in finite dimensions. This is because of the avalanches in motion (discussed next), which lead to a discontinuous coarse grained force.

The motion just above threshold is not uniform, being composed of rapid jumps as large segments of the line depin from strong pinning centers, superposed on the slower steady advance. The jumps are reminiscent of avalanches in other slowly driven systems. In particular, Bak, Tang, and Wiesenfeld emphasized the role and importance of such events through the introduction of sand-pile models of self-organized criticality (SOC) [6]. The similarity to SOC is more clearly obtained if the depinning can be approached from below $F_c$ by monotonically increasing $F$ in small increments, each sufficient to cause a jump to the next metastable state. These jumps have a power-law distribution in size, cut off at a correlation length $\xi$ which diverges at the transition as
$$\xi\sim(F-F_c)^{-\nu} . \tag{3}$$
The size and width of avalanches become invariant on approaching $F_c$. For example,
$$\mathrm{Prob}(\text{width of avalanche}>\ell)\approx\frac{1}{\ell^{\,B+1}}\;\hat\rho(\ell/\xi_-) , \tag{4}$$
where the cutoff $\xi_-$ diverges as in Eq. (3).

The critical line is a self-affine fractal whose correlations satisfy the dynamic scaling form
$$\bigl\langle[r(x,t)-r(x',t')]^{2}\bigr\rangle=(x-x')^{2\zeta}\,g\bigl(|t-t'|/|x-x'|^{z}\bigr) , \tag{5}$$
defining the roughness and dynamic exponents, $\zeta$ and $z$, respectively. (Angular brackets reflect averaging over all realizations of the random force $f$.) The scaling function $g$ goes to a constant as its argument approaches 0; $\zeta$ is the wandering exponent of an instantaneous line profile, and $z$ relates the average lifetime of an avalanche to its size by $\tau(\xi)\sim\xi^{z}$.

Although the underlying issues of collective depinning for CDWs and interfaces have been around for some time, only recently has a systematic perturbative approach to the problem been developed. This functional renormalization group (RG) approach to the dynamical equations of motion was originally developed in the context of CDWs by Narayan and Fisher [7] (NF), and extended to interfaces by Nattermann et al. [8]. We shall provide a brief outline of this approach starting from Eq. (1). Before embarking on the details of the formalism, it is useful to point out
some scaling relations amongst the exponents which follow from underlying symmetries and non-renormalization conditions.

1. As mentioned earlier, the motion of the line close to the threshold is composed of jumps of segments of size $\xi$. Such jumps move the interface forward by $\xi^{\zeta}$ over a time period $\xi^{z}$. Thus, the velocity behaves as
$$v\sim\frac{\xi^{\zeta}}{\xi^{z}}\sim|F-F_c|^{\nu(z-\zeta)} \;\Longrightarrow\; \beta=\nu(z-\zeta) . \tag{6}$$

2. If the elastic couplings are linear, the response of the line to a static perturbation $\varepsilon(x)$ is obtained simply by considering
$$r_\varepsilon(x,t)=r(x,t)-K^{-1}[\varepsilon(x)] , \tag{7}$$
where $K^{-1}$ is the inverse kernel. Since $r_\varepsilon$ satisfies Eq. (1) subject to a force $F+\varepsilon(x)+f(x,r_\varepsilon)$, $r$ satisfies the same equation with a force $F+f(x,r-K^{-1}[\varepsilon(x)])$. As long as the statistical properties of the stochastic force are not modified by the above change in its argument, $\partial\langle r\rangle/\partial\varepsilon=0$, and
$$\left\langle\frac{\partial r_\varepsilon(x)}{\partial\varepsilon(x)}\right\rangle=-K^{-1} , \quad\text{or}\quad \left\langle\frac{\partial r_\varepsilon(q)}{\partial\varepsilon(q)}\right\rangle=\frac{1}{|q|^{\sigma}} . \tag{8}$$
Since it controls the macroscopic response of the line, the kernel $K$ cannot change under RG scaling. From Eqs. (5) and (3), we can read off the scaling of $r(x)$, and the force $\delta F$, which using the above non-renormalization must be related by the exponent relation
$$\zeta+1/\nu=\sigma . \tag{9}$$
Note that this identity depends on the statistical invariance of noise under the transformation in Eq. (7). It is satisfied as long as the force correlations $\langle f(x,r)f(x',r')\rangle$ only depend on $r-r'$. The identity does not hold if these correlations also depend on the slope $\partial r/\partial x$.

3. A scaling argument related to the Imry–Ma estimate of the lower critical dimension of the random field Ising model can be used to estimate the roughness exponent [9]. The elastic force on a segment of length $\xi$ scales as $\xi^{\zeta-\sigma}$. If fluctuations in force are uncorrelated in space, they scale as $\xi^{-(\zeta+1)/2}$ over the area of an avalanche. Assuming that these two forces must be of the same order to initiate the avalanche leads to
$$\zeta=(2\sigma-1)/3 . \tag{10}$$
This last argument is not as rigorous as the previous two. Nonetheless, all three exponent identities can be established within the RG framework. Thus, the only undetermined exponent is the dynamic one, $z$.

1.2. Functional renormalization group

A field theoretical description of the dynamics of Eq. (1) can be developed using the formalism of Martin et al. [10] (MSR): generalizing to a $d$-dimensional interface, an auxiliary field $\hat r(x,t)$ is
introduced to implement the equation of motion as a series of $\delta$-functions. Various dynamical response and correlation functions for the field $r(x,t)$ can then be generated from the functional,
$$Z=\int\mathcal{D}r(x,t)\,\mathcal{D}\hat r(x,t)\,J[r]\exp(S) , \tag{11}$$
where
$$S=i\int d^{d}x\,dt\;\hat r(x,t)\,\bigl\{\partial_t r-K[r]-F-f(x,r(x,t))\bigr\} . \tag{12}$$
The Jacobian $J[r]$ is introduced to ensure that the $\delta$-functions integrate to unity. It does not generate any new relevant terms and will be ignored henceforth.

The disorder-averaged generating functional $\bar Z$ can be evaluated by a saddle-point expansion around a mean-field (MF) solution obtained by setting $K_{MF}[r(x)]=vt-r(x)$. This amounts to replacing interaction forces with Hookean springs connected to the center of mass, which moves with a velocity $v$. The corresponding equation of motion is
$$\frac{dr_{MF}}{dt}=vt-r_{MF}(t)+f[r_{MF}(t)]+F_{MF}(v) , \tag{13}$$
where the relationship $F_{MF}(v)$ between the external force $F$ and average velocity $v$ is determined from the consistency condition $\langle r_{MF}(t)\rangle=vt$. The MF solution depends on the type of irregularity [7]: for smoothly varying random potentials, $\beta_{MF}=3/2$, whereas for cusped random potentials, $\beta_{MF}=1$. Following the treatment of NF [7,11], we use the mean-field solution for cusped potentials, anticipating jumps with velocity of $O(1)$, in which case $\beta_{MF}=1$. After rescaling and averaging over impurity configurations, we arrive at a generating functional whose low-frequency form is
$$\bar Z=\int\mathcal{D}R(x,t)\,\mathcal{D}\hat R(x,t)\exp(\tilde S) ,$$
$$\begin{aligned}\tilde S={}&-\int d^{d}x\,dt\,[F-F_{MF}(v)]\,\hat R(x,t)-\int\frac{d^{d}q}{(2\pi)^{d}}\frac{d\omega}{2\pi}\,\hat R(-q,-\omega)\,(-i\omega\rho+|q|^{\sigma})\,R(q,\omega)\\ &+\frac{1}{2}\int d^{d}x\,dt\,dt'\,\hat R(x,t)\,\hat R(x,t')\,C[vt-vt'+R(x,t)-R(x,t')] .\end{aligned} \tag{14}$$
In the above expressions, $R$ and $\hat R$ are coarse-grained forms of $r-vt$ and $i\hat r$, respectively. $F$ is adjusted to satisfy the condition $\langle R\rangle=0$. The function $C(v\tau)$ is initially the connected mean-field correlation function $\langle r_{MF}(t)\,r_{MF}(t+\tau)\rangle_c$.

Ignoring the $R$-dependent terms in the argument of $C$, the action becomes Gaussian, and is invariant under a scale transformation $x\to bx$, $t\to b^{\sigma}t$, $R\to b^{\sigma-d/2}R$, $\hat R\to b^{-\sigma-d/2}\hat R$, $F\to b^{-d/2}F$, and $v\to b^{-d/2}v$. Other terms in the action, of higher order in $R$ and $\hat R$, that result from the expansion of $C$ [and other terms not explicitly shown in Eq. (14)], decay away at large length and time scales if $d>d_c=2\sigma$. For $d>d_c$, the interface is smooth ($\zeta_0<0$) at long length scales, and the depinning exponents take the Gaussian values $z_0=\sigma$, $\nu_0=2/d$, $\beta_0=1$.

At $d=d_c$, the action $S$ has an infinite number of marginal terms that can be rearranged as a Taylor series for the function $C[vt-vt'+R(x,t)-R(x,t')]$, when $v\to 0$. The RG is carried out by
integrating over a momentum shell $\Lambda/b<|q|<\Lambda$ (we set the cutoff wave vector to $\Lambda=1$ for simplicity) and all frequencies, followed by a scale transformation $x\to bx$, $t\to b^{z}t$, $R\to b^{\zeta}R$, and $\hat R\to b^{\theta-d}\hat R$, where $b=e^{\ell}$. The resulting recursion relation for the linear part in the effective action (to all orders in perturbation theory) is
$$\frac{\partial(F-F_{MF})}{\partial\ell}=(z+\theta)(F-F_{MF})+\text{constant} , \tag{15}$$
which immediately implies (with a suitable definition of $F_c$)
$$\frac{\partial(F-F_c)}{\partial\ell}=y_F\,(F-F_c) \tag{16}$$
with the exponent identity
$$y_F=z+\theta=1/\nu . \tag{17}$$
The functional renormalization of $C(u)$ in $d=2\sigma-\epsilon$ interface dimensions, computed to one-loop order, gives the recursion relation,
$$\frac{\partial C(u)}{\partial\ell}=[\epsilon+2\theta+2(z-\sigma)]\,C(u)+\zeta u\,\frac{dC(u)}{du}-\frac{S_d}{(2\pi)^{d}}\,\frac{d}{du}\left\{[C(u)-C(0)]\,\frac{dC(u)}{du}\right\} , \tag{18}$$
where $S_d$ is the surface area of a unit sphere in $d$ dimensions. NF showed that all higher-order diagrams contribute to the renormalization of $C$ as total derivatives with respect to $u$; thus, integrating Eq. (18) at the fixed-point solution $\partial C^{*}/\partial\ell=0$, together with Eqs. (9) and (17), gives $\zeta=\epsilon/3$ to all orders in $\epsilon$, provided that $\int C^{*}\neq 0$. This gives Eq. (10) for a one-dimensional interface, as argued earlier. This is a consequence of the fact that $C(u)$ remains short-ranged upon renormalization, implying the absence of anomalous contributions to $\zeta$.

The dynamical exponent $z$ is calculated through the renormalization of $\rho$, the term proportional to $\hat R\,\partial_t R$, which yields
$$z=\sigma-2\epsilon/9+O(\epsilon^{2}) \tag{19}$$
and using the exponent identity (Eq. (6)),
$$\beta=1-2\epsilon/9\sigma+O(\epsilon^{2}) . \tag{20}$$
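To see how these results fit together numerically, the following sketch collects the exponents for given $d$ and $\sigma$. It assumes $\zeta=\epsilon/3$, takes $\nu$ from the identity (9) and $z$ from the one-loop expression (19) with the $O(\epsilon^{2})$ terms dropped, and returns $\beta$ both from the identity (6) and from the first-order expansion (20). The function name and the dictionary keys are illustrative choices.

```python
from fractions import Fraction


def depinning_exponents(d, sigma):
    """One-loop depinning exponents for a d-dimensional interface (d <= 2*sigma)."""
    d, sigma = Fraction(d), Fraction(sigma)
    eps = 2 * sigma - d                    # distance below d_c = 2*sigma
    zeta = eps / 3                         # roughness exponent
    nu = 1 / (sigma - zeta)                # from zeta + 1/nu = sigma, Eq. (9)
    z = sigma - 2 * eps / 9                # Eq. (19), O(eps^2) dropped
    return {
        "epsilon": eps,
        "zeta": zeta,
        "nu": nu,
        "z": z,
        "beta_identity": nu * (z - zeta),              # Eq. (6)
        "beta_first_order": 1 - 2 * eps / (9 * sigma)  # Eq. (20)
    }
```

The two values of $\beta$ agree to first order in $\epsilon$ but differ at $\epsilon=3$: for a line with $\sigma=2$ in $d=1$ the identity (6) gives $\beta=1/3$, while the truncated expansion (20) gives $2/3$. For the contact line ($\sigma=1$, $d=1$) the same function returns $\zeta=1/3$, in agreement with Eq. (10).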
Nattermann et al. [8] obtain the same results to $O(\epsilon)$ by directly averaging the MSR generating function in Eq. (11), and expanding perturbatively around a rigidly moving interface.

Numerical integration of Eq. (1) for an elastic interface [12] has yielded critical exponents $\zeta=0.97\pm0.05$ and $\nu=1.05\pm0.1$, in agreement with the theoretical result $\zeta=\nu=1$. The velocity exponent $\beta=0.24\pm0.1$ is also consistent with the one-loop theoretical result $1/3$; however, a logarithmic dependence $v\sim1/\ln(F-F_c)$, which corresponds to $\beta=0$, also describes the numerical data well. In contrast, experiments and various discrete models of interface growth have resulted in scaling behaviors that differ from system to system. A number of different experiments on fluid invasion in porous media [13] give roughness exponents of around 0.8, while imbibition experiments [14,15] have resulted in $\zeta\approx0.6$. A discrete model studied by Leschhorn [16], motivated by Eq. (1) with $\sigma=2$, gives a roughness exponent of 1.25 at threshold. Since the expansion leading to
M. Kardar / Physics Reports 301 (1998) 85—112
Eq. (1) breaks down when $\zeta$ approaches one, it is not clear how to reconcile the results of Leschhorn's numerical work [16] with the coarse-grained description of the RG calculation, especially since any model with $\zeta > 1$ cannot have a coarse-grained description based on gradient expansions.

1.3. Anisotropy

Amaral et al. (ABS) [17] recently pointed out that various models of interface depinning in 1+1 dimensions fall into two distinct classes, depending on the tilt dependence of the interface velocity:
1. For models like the random field Ising model [18] and some solid-on-solid models, the computed exponents are consistent with the exponents given by the RG analysis. It has been suggested [16], however, that the roughness exponent is systematically larger than $\epsilon/3$, casting doubt on the exactness of the RG result.
2. A number of different models, based on directed percolation (DP) [19,14], give a different roughness exponent, $\zeta \approx 0.63$. In these models, pinning sites are randomly distributed with a probability $p$, which is linearly related to the force $F$. The interface is stopped by the boundary of a DP cluster of pinning sites. The critical exponents at depinning can then be related to the longitudinal and transverse correlation length exponents $\nu_\parallel \approx 1.70$ and $\nu_\perp \approx 1.07$ of DP. In particular, $\zeta = \nu_\perp/\nu_\parallel \approx 0.63$ and $\beta = \nu_\parallel - \nu_\perp \approx 0.63$, in agreement with experiments.
The main difference between these models can be understood in terms of the dependence of the threshold force $F_c$ on the orientation. To include the possible dependence of the line mobility on its slope, $\partial_x r$, we can generalize the equation of motion to
$$ \partial_t r = K\partial_x^2 r + \kappa\,\partial_x r + \frac{\lambda}{2}(\partial_x r)^2 + F + f(x, r) . \tag{21} $$
The isotropic depinning studied by RG corresponds to $\kappa = \lambda = 0$. The usual mechanisms for generating a non-zero $\lambda$ are of kinematic origin [20] ($\lambda \propto v$) and can be shown to be irrelevant at the depinning threshold, where the velocity $v$ goes to zero [11].
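The directed-percolation relations quoted in item 2 above are simple arithmetic, and can be checked with a minimal sketch. The variable names are ours, and the inputs are only the rounded DP exponents quoted in the text, so the output is accurate to about two digits at best.

```python
# Rounded DP correlation-length exponents as quoted in the text (1+1 dimensions).
nu_par = 1.70   # longitudinal correlation length exponent
nu_perp = 1.07  # transverse correlation length exponent

# Relations quoted for the DP class of depinning models.
zeta = nu_perp / nu_par   # roughness exponent
beta = nu_par - nu_perp   # velocity exponent

print(f"zeta = {zeta:.2f}, beta = {beta:.2f}")
```

Both numbers come out close to 0.63, which is the near-coincidence of $\zeta$ and $\beta$ noted for this class.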
However, if $\lambda$ is not proportional to $v$ and stays finite at the transition, it is a relevant operator and is expected to modify the critical behavior. As we shall argue below, anisotropy in the medium is a possible source of the nonlinearity at the depinning transition.
A model flux line (FL) confined to move in a plane [12,21] provides an example where both mechanisms for the nonlinearity are present. Only the force normal to the FL is responsible for motion, and it is composed of three components: (1) a term proportional to curvature, arising from the smoothening effects of line tension; (2) the Lorentz force due to a uniform current density perpendicular to the plane, which acts in the normal direction and has a uniform magnitude $F$ (per unit line length); (3) a random force $\hat{n} \cdot \mathbf{f}$ due to impurities, where $\hat{n}$ is the unit normal vector [21]. Equating viscous dissipation with the work done by the normal force leads to the equation of motion
$$ \frac{\partial h}{\partial t} = \sqrt{1 + s^2}\left[\frac{\partial_x^2 h}{(1 + s^2)^{3/2}} + F + \frac{f_h - s f_x}{\sqrt{1 + s^2}}\right] , \tag{22} $$
where $h(x, t)$ denotes the transverse displacement of the line and $s \equiv \partial_x h$. The nonlinearities generated by $\sqrt{1 + s^2}$ are kinematic in origin [20] and irrelevant as $v \to 0$ [11], as can be seen easily by taking them to the left-hand side of Eq. (22). The shape of the pinned FL is determined by the competition of the terms in the square brackets. Although there is no explicit simple $s^2$ term in this group, it will be generated if the system is anisotropic. To illustrate the idea, let us take $f_h$ and $f_x$ to be independent random fields with amplitudes $\Delta_h^{1/2}$ and $\Delta_x^{1/2}$, respectively, each correlated isotropically in space within a distance $a$. For weak disorder, a deformation of order $a$ in the normal direction $\hat{n}$ takes place over a distance $L$
As emphasized above, the hallmark of anisotropic depinning is the dependence of the threshold force $F_c(s)$ on the slope $s$. Above this threshold, we expect $v(F, s)$ to be an analytic function of $F$ and $s$. In particular, for $F > F_c(0)$, there is a small-$s$ expansion $v(F, s) = v(F, s = 0) + \lambda_{\rm eff} s^2/2 + \cdots$. On the other hand, we can associate a characteristic slope $\bar{s} = \xi_\perp/\xi_\parallel \sim (\delta F)^{\nu(1 - \zeta)}$ to DP clusters, where $\delta F = F - F_c(0)$ and $\nu$ is the correlation length exponent. Scaling then suggests
$$ v(F, s) = (\delta F)^{\theta}\, g\!\left(s/\delta F^{\nu(1 - \zeta)}\right) , \tag{23} $$
where $\theta = \nu(z - \zeta)$. Matching Eq. (23) with the small-$s$ expansion, we see that $\lambda_{\rm eff}$ diverges as $(\delta F)^{-\phi}$ (as defined by ABS [17]) with $\phi = 2\nu(1 - \zeta) - \theta = \nu(2 - \zeta - z)$. In $d = 1$, the exponents $\nu$ and $\zeta$ are related to the correlation length exponents $\nu_\parallel$ and $\nu_\perp$ of DP [24] via $\nu = \nu_\parallel \approx 1.73$ and $\zeta = \nu_\perp/\nu_\parallel \approx 0.63$, while the dynamical exponent is $z = 1$. Scaling thus predicts $\phi \approx 0.63$, in agreement with the numerical result of $0.64 \pm 0.08$ in Ref. [17]. Close to the line $F = F_c(0)$ (but at a finite $s$), the dependence of $v$ on $\delta F$ drops out and we have
$$ v(F_c, s) \propto |s|^{\theta/[\nu(1 - \zeta)]} . \tag{24} $$
As $z = 1$ in $d = 1$, the above equation reduces to $v \propto |s|$, in agreement with Fig. 1 of Ref. [17]. Since $v(F, s) = 0$ at $F = F_c(s)$, Eq. (23) suggests
$$ F_c(s) - F_c(0) \propto -|s|^{1/[\nu(1 - \zeta)]} . \tag{25} $$
Note that Eqs. (24) and (25) are valid also in higher dimensions, though the values of the exponents quoted above vary with $d$ [24].
An interface tilted away from the hard direction not only has a different depinning threshold, but also completely different scaling behavior at its transition. This is because, due to the presence of an average interface gradient $\mathbf{s} = \langle \nabla_x h \rangle$, the isotropy in the internal $x$ space is lost. The equation of motion for fluctuations, $h'(x, t) = h(x, t) - \mathbf{s} \cdot \mathbf{x}$, around the average interface position may thus include a non-zero $\kappa$ in Eq. (21). The resulting depinning transition belongs to yet a new universality class with anisotropic response and correlation functions in directions parallel and perpendicular to $\mathbf{s}$; i.e.
$$ \langle [h(x) - h(x')]^2 \rangle = |x_\parallel - x'_\parallel|^{2\zeta}\, F\!\left(\frac{|x_t - x'_t|}{|x_\parallel - x'_\parallel|^{\eta}}\right) \sim \begin{cases} |x_\parallel - x'_\parallel|^{2\zeta} & \text{for } x_t - x'_t = 0 , \\ |x_t - x'_t|^{2\zeta/\eta} & \text{for } x_\parallel - x'_\parallel = 0 , \end{cases} $$
where $\eta$ is the anisotropy exponent, and $x_t$ denotes the $d - 1$ directions transverse to $\mathbf{s}$.
A suggestive mapping allows us to determine the exponents for depinning a tilted interface: consider the response to a perturbation in which all points along a $(d - 1)$-dimensional cross section of the interface at a fixed $x_\parallel$ are pushed up by a small amount. This move decreases the slope of the interface uphill but increases it downhill. Since $F_c(s)$ decreases with increasing $s$, at criticality the perturbation propagates only a finite distance uphill but causes a downhill avalanche. The disturbance front moves at a constant velocity ($\delta x_\parallel \propto t$) and hence $z_\parallel = 1$. (Such chains of moving sites were, indeed, seen in simulations of the $d = 2$ model discussed below.) Furthermore, the evolution of successive cross-sections along $x_\parallel$ is expected to be the same as the evolution in time of a $(d - 1)$-dimensional interface. The latter is governed by the Kardar-Parisi-Zhang (KPZ) equation [20], whose scaling behavior has been extensively studied. From this analogy we conclude
$$ \zeta(d) = \frac{\zeta_{\rm KPZ}(d - 1)}{z_{\rm KPZ}(d - 1)} , \qquad \eta(d) = \frac{1}{z_{\rm KPZ}(d - 1)} . \tag{26} $$
In particular, the tilted interface with $d = 2$ maps to the growth problem in 1+1 dimensions, where the exponents are known exactly, yielding $\zeta(2) = 1/3$ and $\eta(2) = 2/3$. This picture can be made more precise for a lattice model introduced below. Details will be presented elsewhere.
To get the exponent $\beta$ for the vanishing of the velocity of the tilted interface, we note that since $z_\parallel = 1$, $v$ scales as the excess slope $\delta s = s - s_c(F)$. The latter controls the density of the above moving fronts; $s_c(F)$ is the slope of the critical interface at a given driving force $F$, i.e., $F = F_c(s_c)$. Away from the symmetry direction, the function $F_c(s)$ has a non-vanishing derivative and hence
$$ \delta F = F - F_c(s) = F_c(s_c) - F_c(s) \sim \delta s \sim v . \tag{27} $$
We thus conclude that generically $\beta = 1$ for tilted interfaces, independent of dimension.
To check the above predictions, we performed simulations of the parallelized version of a previously studied percolation model of interface depinning [19]. A solid-on-solid (SOS) interface is described by a set of integer heights $\{h_i\}$, where $i$ is a group of $d$ integers. With each configuration is associated a random set of pinning forces $\{\eta_i \in [0, 1)\}$. The heights are updated in parallel according to the following rules: $h_i$ is increased by one if (i) $h_i \le h_j - 2$ for at least one $j$ which is a nearest neighbor of $i$, or (ii) $\eta_i < F$ for a pre-selected uniform force $F$. If $h_i$ is increased, the associated random force $\eta_i$ is also updated, i.e. replaced by a new random number in the interval $[0, 1)$. Otherwise, $h_i$ and $\eta_i$ are unchanged. The simulation is started with initial conditions $h_i(t = 0) = {\rm Int}[s i_x]$, and boundary conditions $h_{i + L} = {\rm Int}[sL] + h_i$ are enforced throughout. The CPU time is greatly reduced by only keeping track of active sites.
The above model has a simple analogy to a resistor-diode percolation problem [24].
Condition (i) ensures that, once a site $(i, h)$ is wet (i.e., on or behind the interface), all neighboring columns of $i$ must be wet up to height $h - 1$. Thus, there is always "conduction" from a site at height $h$ to sites in the neighboring columns at height $h - 1$. This relation can be represented by diodes pointing diagonally downward. Condition (ii) implies that "conduction" may also occur upward. Hence, a fraction $F$ of vertical bonds are turned into resistors, which allow for two-way conduction. Note that, due to the SOS condition, vertical downward conduction is always possible. For $F < F_c$, conducting sites connected to a point lead at the origin form a cone whose hull is the interface separating wet and dry regions. The opening angle of the cone increases with $F$, reaching 180° at $F = F_c$, beyond which percolation in the entire space takes place, so that all sites are eventually wet. If instead of a point we start with a planar lead defining the initial surface, the percolation threshold depends on the surface orientation, with the highest threshold for the untilted one.
Our simulations of lattices of 65 536 sites in $d = 1$ and of $512 \times 512$ and $840 \times 840$ sites in $d = 2$ confirm the exponents for depinning in the hard direction. For a tilted surface in $d = 1$, the roughness exponent determined from the height-height correlation function is consistent with the predicted value of $\zeta = 1/2$ and different from $\zeta \approx 0.63$ of the untilted one. The dependence of the depinning threshold on slope is clearly seen in Fig. 3, where the average velocity is plotted against the driving force for $s = 0$ (open) and $s = 1/2$ (solid). The $s = 0$ data can be fitted to a power law $v \sim (F - F_c)^{\beta}$, where $F_c \approx 0.461$, $\beta = 0.63 \pm 0.04$ for $d = 1$, and
Fig. 3. Average interface velocity $v$ versus the driving force $F$, for $d = 1$, $s = 0$ (open circles), $d = 1$, $s = 1/2$ (solid circles), $d = 2$, $s = 0$ (open squares), and $d = 2$, $s = 1/2$ (solid squares).
Fig. 4. Height-height correlation functions (a) along and (b) transverse to the tilt for an $840^2$ system at different times $32 \le t \le 1024$. The interface at $t = 0$ is flat; $d = 2$, $s = 1/2$, and $F = 0.144$.
$F_c \approx 0.201$, $\beta = 0.72 \pm 0.04$ for $d = 2$. Data at $s = 1/2$ are consistent with Eq. (27) close to the threshold.
We also measured height-height correlation functions at the depinning transition (Fig. 4). For a tilted surface in $d = 2$, the height fluctuations and corresponding dynamic behaviors are different parallel and transverse to the tilt. Fig. 4 shows a scaling plot of (a) $C_\parallel(r_\parallel, t) \equiv \langle [h(x_\parallel + r_\parallel, x_t, t) - h(x_\parallel, x_t, t)]^2 \rangle$ and (b) $C_t(r_t, t) \equiv \langle [h(x_\parallel, x_t + r_t, t) - h(x_\parallel, x_t, t)]^2 \rangle$ against the scaled distances at the depinning threshold of an $s = 1/2$ interface. Each curve shows data at a given $t = 32, 64, \ldots, 1024$, averaged over 50 realizations of the disorder. The data collapse is in agreement with the mapping to the KPZ equation in one less dimension.
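The parallel update rules of the lattice model described above can be sketched in a few lines for $d = 1$. This is only an illustrative sketch under our own assumptions, not the simulation code used for the figures: the function names (`step`, `simulate`), the default parameters, and the way the velocity is averaged are hypothetical, and the sketch visits every site at each step rather than tracking only active sites.

```python
import random

def step(h, eta, F, tilt_total, rng):
    """One parallel update of the d=1 SOS model; returns the number of advanced sites."""
    L = len(h)
    advance = []
    for i in range(L):
        # Tilted periodic boundaries: h[i + L] = h[i] + Int[s L].
        left = h[i - 1] if i > 0 else h[L - 1] - tilt_total
        right = h[i + 1] if i < L - 1 else h[0] + tilt_total
        lagging = h[i] <= left - 2 or h[i] <= right - 2   # rule (i)
        advance.append(lagging or eta[i] < F)             # rule (ii)
    for i in range(L):
        if advance[i]:
            h[i] += 1
            eta[i] = rng.random()  # draw a fresh pinning force in [0, 1)
    return sum(advance)

def simulate(L=256, F=0.5, s=0.0, steps=200, seed=0):
    """Return the average velocity (advances per site per step) over the whole run."""
    rng = random.Random(seed)
    h = [int(s * i) for i in range(L)]       # initial condition h_i = Int[s i]
    eta = [rng.random() for _ in range(L)]   # random pinning forces
    tilt_total = int(s * L)
    moved = 0
    for _ in range(steps):
        moved += step(h, eta, F, tilt_total, rng)
    return moved / (L * steps)
```

For example, `simulate(F=0.0)` returns 0.0 for a flat start, since neither rule can fire, while `simulate(F=1.0)` returns 1.0, since rule (ii) fires at every site. Estimating the threshold or the exponents would require much larger systems, a transient to be discarded, and disorder averaging.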
In summary, critical behavior at the depinning of an interface depends on the symmetries of the underlying medium. Different universality classes can be distinguished from the dependence of the threshold force (or velocity) on the slope, which is reminiscent of similar dependence in a model of resistor-diode percolation. In addition to isotropic depinning, we have so far identified two classes of anisotropic depinning: along a (hard) axis of inversion symmetry in the plane, and tilted away from it. We have no analytical results in the former case, but suggest a number of scaling relations that are validated by simulations. In the latter (more generic) case we have obtained exact information from a mapping to moving interfaces, and confirmed it by simulations in $d = 1$ and 2. As it is quite common to encounter (intrinsic or artificially fabricated) anisotropy for flux lines in superconductors, domain walls in magnets, and interfaces in porous media, we expect our results to have important experimental ramifications.
Another form of anisotropy is also possible for interfaces in 2+1 dimensions. If the directions $x$ and $y$ on the surface are not related by symmetry, the non-linear term in the KPZ equation can be generalized, leading to the depinning equation
$$ \partial_t r = K_x \partial_x^2 r + K_y \partial_y^2 r + \frac{\lambda_x}{2}(\partial_x r)^2 + \frac{\lambda_y}{2}(\partial_y r)^2 + F + f(x, y, r) . \tag{28} $$
In fact, the difference between $K_x$ and $K_y$ is not important as long as both are positive. It was first pointed out by Dietrich Wolf [25] that different signs of $\lambda_x$ and $\lambda_y$ lead to a different universality class for the case of annealed noise. More recently, it was demonstrated by Jeong et al. [26] that, with quenched noise, Eq. (28) describes a new universality class of depinning transitions with $\beta \approx 0.80(1)$, and anisotropic roughness exponents in the $x$ and $y$ directions.

2. Fluctuating lines

2.1. Flux line depinning

The pinning of flux lines (FLs) in Type-II superconductors is of fundamental importance to many technological applications that require large critical currents [27]. Upon application of an external current density $J$, the motion of FLs due to the Lorentz force causes undesirable dissipation of supercurrents. Major increases in the critical current density $J_c$ of a sample are achieved when the FLs are pinned to impurities. There are many recent studies, both experimental [28,29] and theoretical [30,31], on collective pinning of FLs to point or columnar defects. Another consequence of impurities is the strongly non-linear behavior of the current slightly above the depinning threshold, as the FLs start to move across the sample. Recent numerical simulations have concentrated on the low-temperature behavior of a single FL near depinning [32,12,21], mostly ignoring fluctuations transverse to the plane defined by the magnetic field and the Lorentz force. Common signatures of the depinning transition from $J < J_c$ to $J > J_c$ include a broad-band ($f^{-a}$ type) voltage noise spectrum, and self-similar fluctuations of the FL profile.
The FL provides yet another example of a depinning transition. We now extend the methods of the previous section to the full three-dimensional dynamics of a single FL at low temperatures (Fig. 5). The shape of the FL at a given time $t$ is described by $\mathbf{r}(x, t)$, where $x$ is along the magnetic field $\mathbf{B}$, and the unit vector $\mathbf{e}_\parallel$ is along the Lorentz force $\mathbf{F}$. Point impurities are modeled by a random potential $V(x, \mathbf{r})$, with zero mean and short-range correlations. In the presence of
Fig. 5. Geometry of the line in three dimensions.
impurities and a bulk Lorentz force $\mathbf{F}$, the energy of a FL with small fluctuations is
$$ H = \int dx \left\{ \frac{1}{2}(\partial_x \mathbf{r})^2 + V(x, \mathbf{r}(x, t)) - \mathbf{r}(x, t) \cdot \mathbf{F} \right\} . \tag{29} $$
The simplest possible Langevin equation for the FL, consistent with local, dissipative dynamics, is
$$ \mu^{-1} \frac{\partial \mathbf{r}}{\partial t} = -\frac{\delta H}{\delta \mathbf{r}} = \partial_x^2 \mathbf{r} + \mathbf{f}(x, \mathbf{r}(x, t)) + \mathbf{F} , \tag{30} $$
where $\mu$ is the mobility of the FL, and $\mathbf{f} = -\nabla_{\mathbf{r}} V$. The potential $V(x, \mathbf{r})$ need not be isotropic. For example, in a single crystal of ceramic superconductors with the field along the oxide planes, it will be easier to move the FL along the planes. This leads to a pinning threshold that depends on the orientation of the force. Anisotropy also modifies the line tension, and the elastic term in Eq. (30) is in general multiplied by a non-diagonal matrix $K_{\alpha\beta}$. The random force $\mathbf{f}(x, \mathbf{r})$ can be taken to have zero mean with correlations
$$ \langle f_\alpha(x, \mathbf{r}) f_\gamma(x', \mathbf{r}') \rangle = \delta(x - x') \Delta_{\alpha\gamma}(\mathbf{r} - \mathbf{r}') . \tag{31} $$
We shall focus mostly on the isotropic case, with $\Delta_{\alpha\gamma}(\mathbf{r} - \mathbf{r}') = \delta_{\alpha\gamma} \Delta(|\mathbf{r} - \mathbf{r}'|)$, where $\Delta$ is a function that decays rapidly for large values of its argument.
While the flux line is pinned by impurities when $F < F_c$, for $F$ slightly above threshold we expect the average velocity $v = |\mathbf{v}|$ to scale as in Eq. (23). Superposed on the steady advance of the FL are rapid "jumps" as portions of the line depin from strong pinning centers. The cut-off length $\xi$ on avalanche sizes diverges on approaching the threshold as $\xi \sim (F - F_c)^{-\nu}$. At length scales up to $\xi$, the correlated fluctuations satisfy the dynamic scaling forms
$$ \langle [r_\parallel(x, t) - r_\parallel(0, 0)]^2 \rangle = |x|^{2\zeta_\parallel} g_\parallel(t/|x|^{z_\parallel}) , \qquad \langle [r_\perp(x, t) - r_\perp(0, 0)]^2 \rangle = |x|^{2\zeta_\perp} g_\perp(t/|x|^{z_\perp}) , \tag{32} $$
where $\zeta_\alpha$ and $z_\alpha$ are the roughness and dynamic exponents, respectively. The scaling functions $g_\alpha$ go to a constant as their arguments approach 0. Beyond the length scale $\xi$, different regions of the FL
depin more or less independently and the system crosses over to a moving state, described by different exponents, which will be considered in the next section.
The major difference of this model from the previously studied interface is that the position of the flux line, $\mathbf{r}(x, t)$, is now a two-dimensional vector instead of a scalar, fluctuating along both the $\mathbf{e}_\parallel$ and $\mathbf{e}_\perp$ directions. One consequence is that a "no passing" rule [33], applicable to CDWs and interfaces, does not apply to FLs. It is possible to have coexistence of moving and stationary FLs in particular realizations of the random potential. How do these transverse fluctuations scale near the depinning transition, and do they in turn influence the critical dynamics of longitudinal fluctuations near threshold? The answer to the second question can be obtained by the following qualitative argument: Consider Eq. (30) for a particular realization of randomness $\mathbf{f}(x, \mathbf{r})$. Assuming that portions of the FL always move in the forward direction, there is a unique point $r_\perp(x, r_\parallel)$ that is visited by the line for given coordinates $(x, r_\parallel)$. We construct a new force field $f'$ on a two-dimensional space $(x, r_\parallel)$ through $f'(x, r_\parallel) \equiv f_\parallel(x, r_\parallel, r_\perp(x, r_\parallel))$. It is then clear that the dynamics of the longitudinal component $r_\parallel(x, t)$ in a given force field $\mathbf{f}(x, \mathbf{r})$ is identical to the dynamics of $r_\parallel(x, t)$ in a force field $f'(x, r_\parallel)$, with $r_\perp$ set to zero. It is quite plausible that, after averaging over all $\mathbf{f}$, the correlations in $f'$ will also be short-ranged, albeit different from those of $\mathbf{f}$. Thus, the scaling of longitudinal fluctuations of the depinning FL will not change upon taking into account transverse fluctuations. However, the question of how these transverse fluctuations scale still remains.
Certain statistical symmetries of the system restrict the form of response and correlation functions. For example, Eq. (30) has statistical space- and time-translational invariance, which enables us to work in Fourier space, i.e. $(x, t) \to (q, \omega)$. For an isotropic medium, $\mathbf{F}$ and $\mathbf{v}$ are parallel to each other, i.e., $\mathbf{v}(\mathbf{F}) = v(F) \hat{F}$, where $\hat{F}$ is the unit vector along $\mathbf{F}$. Furthermore, all expectation values involving odd powers of a transverse component are identically zero due to the statistical invariance under the transformation $r_\perp \to -r_\perp$. Thus, linear response and two-point correlation functions are diagonal. The introduced critical exponents are then related through scaling identities. These can be derived from the linear response to an infinitesimal external force field $\boldsymbol{\varepsilon}(q, \omega)$,
$$ \chi_{\alpha\beta}(q, \omega) = \left\langle \frac{\partial r_\alpha(q, \omega)}{\partial \varepsilon_\beta(q, \omega)} \right\rangle \equiv \delta_{\alpha\beta} \chi_\alpha(q, \omega) , \tag{33} $$
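in the $(q, \omega) \to (0, 0)$ limit. Eq. (30) is statistically invariant under the transformation $\mathbf{F} \to \mathbf{F} + \boldsymbol{\varepsilon}(q)$, $\mathbf{r}(q, \omega) \to \mathbf{r}(q, \omega) + q^{-2} \boldsymbol{\varepsilon}(q)$. Thus, the static linear response has the form $\chi_\parallel(q, \omega = 0) = \chi_\perp(q, \omega = 0) = q^{-2}$. Since $\varepsilon_\parallel$ scales like the applied force, the form of the linear response at the correlation length $\xi$ gives the exponent identity
$$ \zeta_\parallel + 1/\nu = 2 . \tag{34} $$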
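The step from the static response to Eq. (34) can be spelled out as a short power-counting sketch. It assumes, as stated in the text, that at the scale $q \sim \xi^{-1}$ the longitudinal displacement scales as $\xi^{\zeta_\parallel}$ and that the force perturbation scales like the reduced force, $\varepsilon_\parallel \sim F - F_c \sim \xi^{-1/\nu}$; numerical prefactors are ignored.

```latex
\begin{align}
  r_\parallel &\sim \chi_\parallel(q \sim \xi^{-1}, \omega = 0)\, \varepsilon_\parallel
               \sim \xi^{2}\, \xi^{-1/\nu} , \\
  \xi^{\zeta_\parallel} &\sim \xi^{\,2 - 1/\nu}
  \quad\Longrightarrow\quad \zeta_\parallel + \frac{1}{\nu} = 2 .
\end{align}
```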
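Considering the transverse linear response seems to imply $\zeta_\perp = \zeta_\parallel$. However, the static part of the transverse linear response is irrelevant at the critical RG fixed point, since $z_\perp > z_\parallel$, as shown below. When a slowly varying uniform external force $\boldsymbol{\varepsilon}(t)$ is applied, the FL responds as if the instantaneous external force $\mathbf{F} + \boldsymbol{\varepsilon}$ is a constant, acquiring an average velocity
$$ \langle \partial_t r_\alpha \rangle = v_\alpha(\mathbf{F} + \boldsymbol{\varepsilon}) \approx v_\alpha(\mathbf{F}) + \frac{\partial v_\alpha}{\partial F_\gamma} \varepsilon_\gamma . $$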
Substituting $\partial v_\parallel/\partial F_\parallel = dv/dF$ and $\partial v_\perp/\partial F_\perp = v/F$, and Fourier transforming, gives
$$ \chi_\parallel(q = 0, \omega) = \frac{1}{-i\omega (dv/dF)^{-1} + O(\omega^2)} , \qquad \chi_\perp(q = 0, \omega) = \frac{1}{-i\omega (v/F)^{-1} + O(\omega^2)} . \tag{35} $$
Combining these with the static response, we see that the characteristic relaxation times of fluctuations with wavelength $\xi$ are
$$ \tau_\parallel(q = \xi^{-1}) \sim (q^2\, dv/dF)^{-1} \sim \xi^{2 + (\beta - 1)/\nu} \sim \xi^{z_\parallel} , \qquad \tau_\perp(q = \xi^{-1}) \sim (q^2\, v/F)^{-1} \sim \xi^{2 + \beta/\nu} \sim \xi^{z_\perp} , $$
which, using Eq. (34), yield the scaling relations
$$ \beta = (z_\parallel - \zeta_\parallel)\nu , \qquad z_\perp = z_\parallel + 1/\nu . \tag{36} $$
We already see that the dynamic relaxation of transverse fluctuations is much slower than that of longitudinal ones. All critical exponents can be calculated from $\zeta_\parallel$, $\zeta_\perp$, and $z_\parallel$, by using Eqs. (34) and (36).
Eq. (30) can be analyzed using the formalism of Martin, Siggia, and Rose (MSR) [10]. Ignoring transverse fluctuations, and generalizing to $d$-dimensional internal coordinates $x \in \mathbb{R}^d$, leads to an interface depinning model which was studied by Nattermann, Stepanow, Tang, and Leschhorn (NSTL) [8], and by Narayan and Fisher (NF) [11]. The RG treatment indicates that impurity disorder becomes relevant for dimensions $d \le 4$, and the critical exponents in $d = 4 - \epsilon$ dimensions are given to one-loop order as $\zeta = \epsilon/3$, $z = 2 - 2\epsilon/9$. NSTL obtained this result by directly averaging the MSR generating functional $Z$, and calculating the renormalization of the force-force correlation function $\Delta(r)$, perturbatively around the freely moving interface [$\Delta(r) = 0$]. NF, on the other hand, used a perturbative expansion of $Z$ around a saddle point corresponding to a mean-field approximation to Eq. (30) [34], which involved temporal force-force correlations $C(vt)$. They argue that a conventional low-frequency analysis is not sufficient to determine critical exponents. They also suggest that the roughness exponent is equal to $\epsilon/3$ to all orders in perturbation theory.
Following the approach of NF, we employ a perturbative expansion of the disorder-averaged MSR partition function around a mean-field solution for cusped impurity potentials [11].
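As a worked illustration of how Eqs. (34) and (36) fix the remaining exponents, the following sketch evaluates them in exact rational arithmetic from the one-loop inputs $\zeta = \epsilon/3$ and $z = 2 - 2\epsilon/9$ quoted above. The function name `line_exponents` is ours, and using the one-loop value of $z$ all the way to $\epsilon = 3$ is an extrapolation, since $O(\epsilon^2)$ corrections are dropped.

```python
from fractions import Fraction

def line_exponents(eps):
    """Exponents from the one-loop inputs plus the identities (34) and (36)."""
    eps = Fraction(eps)
    zeta_par = eps / 3                  # one-loop roughness exponent
    z_par = 2 - 2 * eps / 9             # one-loop dynamic exponent, O(eps^2) dropped
    nu = 1 / (2 - zeta_par)             # Eq. (34): zeta_par + 1/nu = 2
    beta = (z_par - zeta_par) * nu      # Eq. (36)
    z_perp = z_par + 1 / nu             # Eq. (36)
    return {"zeta_par": zeta_par, "z_par": z_par, "nu": nu,
            "beta": beta, "z_perp": z_perp}

# A line has one internal dimension, d = 1, i.e. eps = 3.
exponents = line_exponents(3)
for name, value in exponents.items():
    print(name, "=", value)
```

For $\epsilon = 3$ this gives $\nu = 1$, $\beta = 1/3$, and $z_\perp = 7/3$; the transverse roughness exponent is not fixed by these identities and needs a separate input.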
All terms in the expansion involving longitudinal fluctuations are identical to the interface case; thus we obtain the same critical exponents for longitudinal fluctuations, i.e., $\zeta_\parallel = \epsilon/3$, $z_\parallel = 2 - 2\epsilon/9 + O(\epsilon^2)$. Furthermore, for isotropic potentials, the renormalization of transverse temporal force-force correlations $C_\perp(vt)$ yields a transverse roughness exponent $\zeta_\perp = 5\zeta_\parallel/2 - 2$, to all orders in perturbation theory. For the FL ($\epsilon = 3$), the critical exponents are then given by
$$ \zeta_\parallel = 1 , \quad z_\parallel \approx 4/3 , \quad \nu = 1 , \quad \beta \approx 1/3 , \quad \zeta_\perp = 1/2 , \quad z_\perp \approx 7/3 . \tag{37} $$
To test the scaling forms and exponents predicted by Eqs. (23) and (32), we numerically integrated Eq. (30), discretized in coordinates $x$ and $t$. Free boundary conditions were used for system sizes of up to 2048, with a grid spacing $\Delta x = 1$ and a time step $\Delta t = 0.02$ (Fig. 6). Time averages were evaluated after the system reached steady state. Periodic boundary conditions gave similar results, but with larger finite-size effects. Smaller grid sizes did not change the results considerably. The behavior of $v(F)$ seems to fit the scaling form of Eq. (23) with an exponent
Fig. 6. A plot of average velocity versus external force for a system of 2048 points. Statistical errors are smaller than symbol sizes. Both fits have three adjustable parameters: the threshold force, the exponent, and an overall multiplicative constant.
Fig. 7. A plot of equal-time correlation functions versus separation, for the system shown in Fig. 6, at $F = 0.95$. The observed roughness exponents very closely follow the theoretical predictions of $\zeta_\parallel = 1$, $\zeta_\perp = 0.5$, which are shown as solid lines for comparison.
$\beta \approx 0.3$, but is also consistent with a logarithmic dependence on the reduced force, i.e., $\beta = 0$. The same behavior was observed by Dong et al. in a recent simulation of the 1+1 dimensional geometry [12]. Since $z_\parallel$, and consequently $\beta$, is known only to first order in $\epsilon$, higher-order corrections are expected. By looking at equal-time correlation functions, we find that transverse fluctuations are strongly suppressed, and that the roughness exponents are equal to our theoretical estimates within statistical accuracy. The excellent agreement for $\epsilon = 3$ suggests that the theoretical estimates are indeed exact.
The potential pinning the FL in a single superconducting crystal is likely to be highly anisotropic. For example, consider a magnetic field parallel to the copper oxide planes of a ceramic superconductor. The threshold force then depends on its orientation, with depinning easiest along the copper oxide planes. In general, the average velocity may depend on the orientations of the external force and the FL. The most general gradient expansion for the equation of motion is then
$$ \frac{\partial r_\alpha}{\partial t} = \mu_{\alpha\beta} F_\beta + \kappa_{\alpha\beta}\, \partial_x r_\beta + K_{\alpha\beta}\, \partial_x^2 r_\beta + \frac{1}{2} K_{\alpha,\beta\gamma}\, \partial_x r_\beta\, \partial_x r_\gamma + f_\alpha(x, \mathbf{r}(x, t)) + \cdots , \tag{38} $$
with
$$ \langle f_\alpha(x, \mathbf{r}) f_\beta(x', \mathbf{r}') \rangle = \delta(x - x') C_{\alpha\beta}(\mathbf{r} - \mathbf{r}') . \tag{39} $$
Depending on the presence or absence of various terms allowed by the symmetries of the system, the above set of equations encompasses many distinct universality classes. For example, consider the situation where $\mathbf{v}$ depends on $\mathbf{F}$, but not on the orientation of the line. Eqs. (35) have to be
modified, since $\mathbf{v}$ and $\mathbf{F}$ are no longer parallel (except along the axes with $r \to -r$ symmetry), and the linear response function is not diagonal. The RG analysis is more cumbersome: for depinning along a non-symmetric direction, the longitudinal exponents are not modified (in agreement with the argument presented earlier), while the transverse fluctuations are further suppressed to $\zeta_\perp = 2\zeta_\parallel - 2$ (equal to zero for $\zeta_\parallel = 1$) [35] (Fig. 7). Relaxation of transverse modes is still characterized by $z_\perp = z_\parallel + 1/\nu$, and the exponent identity (Eq. (34)) also holds. Surprisingly, the exponents for depinning along axes of reflection symmetry are the same as in the isotropic case.
If the velocity also depends on the tilt, there will be additional relevant terms in the MSR partition function, which invalidate the arguments leading to Eq. (34). The analogy to FLs in a plane suggests that the longitudinal exponents for $d = 1$ are controlled by DP clusters [19,14], with $\zeta_\parallel \approx 0.63$. Since no perturbative fixed point is present in this case, it is not clear how to explore the behavior of transverse fluctuations systematically.

2.2. Dynamic fluctuations of an unpinned flux line

So far, we investigated the dynamics of a flux line near the depinning transition. Now, we would like to consider its behavior in a different regime, when the external driving force is large, and the impurities appear as weak barriers that deflect portions of the line without impeding its overall drift. In such non-equilibrium systems, one can regard the evolution equations as more fundamental, and proceed by constructing the most general equations consistent with the symmetries and conservation laws of the situation under study [36]. Even in a system with isotropic randomness, which we will discuss here, the average drift velocity, $v$, breaks the symmetry between forward and backward motions, and allows the introduction of nonlinearities in the equations of motion [37,36].
Let us first concentrate on an interface in two dimensions (Fig. 1). By contracting up to two spatial derivatives of $r$, and keeping terms that are relevant, one obtains the Kardar-Parisi-Zhang [20] (KPZ) equation,
$$ \partial_t r(x, t) = \mu F + K \partial_x^2 r(x, t) + \frac{\lambda}{2}[\partial_x r(x, t)]^2 + f(x, t) , \tag{40} $$
with random force correlations
$$ \langle f(x, t) f(x', t') \rangle = 2T \delta(x - x') \delta(t - t') . \tag{41} $$
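To make Eqs. (40) and (41) concrete, here is a minimal sketch of an explicit Euler integration on a periodic lattice, written in the moving frame so that the constant $\mu F$ term is dropped. Everything beyond the equation itself is our assumption: the function names (`kpz_step`, `integrate`, `width`), the centered finite differences, and the default parameters. Naive discretizations of this kind are known to become unstable for strong nonlinearity or large time steps, so this is an illustration rather than a production scheme.

```python
import math
import random

def kpz_step(r, K, lam, T, dx, dt, rng):
    """One explicit Euler step of Eq. (40) in the moving frame, periodic in x."""
    n = len(r)
    # Discretized delta-correlated noise of Eq. (41): variance 2*T*dt/dx per site and step.
    noise_amp = math.sqrt(2.0 * T * dt / dx)
    new_r = []
    for i in range(n):
        left, right = r[i - 1], r[(i + 1) % n]   # periodic neighbors
        laplacian = (left - 2.0 * r[i] + right) / dx**2
        slope = (right - left) / (2.0 * dx)
        drift = K * laplacian + 0.5 * lam * slope**2
        new_r.append(r[i] + dt * drift + noise_amp * rng.gauss(0.0, 1.0))
    return new_r

def integrate(n=64, steps=200, K=1.0, lam=1.0, T=1.0, dx=1.0, dt=0.02, seed=0):
    """Evolve an initially flat profile and return the final heights."""
    rng = random.Random(seed)
    r = [0.0] * n
    for _ in range(steps):
        r = kpz_step(r, K, lam, T, dx, dt, rng)
    return r

def width(r):
    """Root-mean-square fluctuation of the profile about its mean."""
    mean = sum(r) / len(r)
    return math.sqrt(sum((x - mean) ** 2 for x in r) / len(r))
```

A typical use is to record `width(r)` at several times and system sizes and compare its growth with the dynamic scaling form; with `T=0.0` an initially flat profile stays flat, which is a quick sanity check of the drift terms.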
For a moving line, the term proportional to the external force can be absorbed, without loss of generality, by considering a suitable Galilean transformation, $r \to r - at$, to a moving frame. A large number of stochastic non-equilibrium growth models, like the Eden model and various ballistic deposition models, are known to be well described, at large length scales and times, by this equation, which is intimately related to several other problems. For example, the transformation $v(x, t) = -\lambda\, \partial_x r(x, t)$ maps Eq. (40) to the randomly stirred Burgers' equation for fluid flow [38,39],
$$ \partial_t v + v\, \partial_x v = K \partial_x^2 v - \lambda\, \partial_x f(x, t) . \tag{42} $$
The correlations of the line profile still satisfy the dynamic scaling form in Eq. (5), but with different scaling exponents $\zeta$, $z$ and scaling function $g$. This self-affine scaling is not critical, i.e., not obtained by fine tuning an external parameter like the force, and is quite different in nature
than the critical scaling of the line near the depinning transition, which ceases beyond the correlation length scale $\xi$.
Two important nonperturbative properties of Eq. (40) help us determine these exponents exactly in 1+1 dimensions:
1. Galilean invariance (GI): Eq. (40) is statistically invariant under the infinitesimal reparametrization
$$ r' = r + \epsilon x , \qquad x' = x + \lambda \epsilon t , \qquad t' = t , \tag{43} $$
provided that the random force $f$ does not have temporal correlations [40]. Since the parameter $\lambda$ appears both in the transformation and in Eq. (40), it is not renormalized under any RG procedure that preserves this invariance. This implies the exponent identity [39,40]
$$ \zeta + z = 2 . \tag{44} $$
2. Fluctuation-dissipation (FD) theorem: Eqs. (40) and (41) lead to a Fokker-Planck equation for the evolution of the joint probability $P[r(x)]$,
$$ \partial_t P = \int dx \left( \frac{\delta P}{\delta r(x)}\, \partial_t r + T \frac{\delta^2 P}{[\delta r(x)]^2} \right) . \tag{45} $$
It is easy to check that $P$ has a stationary solution
$$ P = \exp\left( -\frac{K}{2T} \int dx\, (\partial_x r)^2 \right) . \tag{46} $$
If $P$ converges to this solution, the long-time behavior of the correlation functions in Eq. (5) can be directly read off Eq. (46), giving $\zeta = 1/2$.
Combining these two results, the roughness and dynamic exponents are exactly determined for the line in two dimensions as
$$ \zeta = 1/2 , \qquad z = 3/2 . \tag{47} $$
Many direct numerical simulations and discrete growth models have verified these exponents to a very good accuracy. Exact exponents for the isotropic KPZ equation are not known in higher dimensions, since the FD property is only valid in two dimensions. These results have been summarized in a number of recent reviews [41-44].
As an aside, we remark that some exact information is available for the anisotropic KPZ equation in 2+1 dimensions. Using a perturbative RG approach, Wolf showed [25] that in the equation
$$ \partial_t r = K \nabla^2 r + \frac{\lambda_x}{2}(\partial_x r)^2 + \frac{\lambda_y}{2}(\partial_y r)^2 + f(x, y, t) , \tag{48} $$
the non-linearities $\{\lambda_x, \lambda_y\}$ renormalize to zero if they initially have opposite signs. This suggests logarithmic fluctuations for the resulting interface, as in the case of the linear Langevin equation. In fact, it is straightforward to demonstrate that Eq. (48) also satisfies a Fluctuation-Dissipation condition if $\lambda_x = -\lambda_y$. When this condition is satisfied, the associated Fokker-Planck equation has a steady-state solution
$$ P = \exp\left( -\frac{K}{2T} \int dx\, dy\, (\nabla r)^2 \right) . \tag{49} $$
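The statement that Eq. (48) satisfies a Fluctuation-Dissipation condition for $\lambda_x = -\lambda_y$ can be made plausible with a short integration-by-parts sketch. We assume periodic (or vanishing) boundary terms, and we assume that the functional-divergence term of the nonlinearity vanishes, as it does for symmetric lattice discretizations. Under these assumptions the Gaussian weight of Eq. (49) remains stationary if the nonlinear drift $N[r]$ has zero overlap with $\nabla^2 r$, since the linear and noise terms already balance as in the linear Langevin equation.

```latex
\begin{align}
  N[r] &= \frac{\lambda_x}{2}(\partial_x r)^2 + \frac{\lambda_y}{2}(\partial_y r)^2 , \\
  \int dx\,dy\,(\partial_x r)^2\,\partial_x^2 r
       &= \frac{1}{3}\int dx\,dy\,\partial_x\!\left[(\partial_x r)^3\right] = 0 , \\
  \int dx\,dy\,(\partial_x r)^2\,\partial_y^2 r
       &= -2\int dx\,dy\,(\partial_x r)(\partial_x\partial_y r)(\partial_y r)
        = \int dx\,dy\,(\partial_y r)^2\,\partial_x^2 r \equiv I , \\
  \int dx\,dy\,N[r]\,\nabla^2 r
       &= \frac{\lambda_x + \lambda_y}{2}\, I ,
\end{align}
```

so the overlap vanishes for arbitrary profiles only if $\lambda_x = -\lambda_y$.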
This is a non-perturbative result which again indicates the logarithmic fluctuations resulting from Eq. (48). In this context, it is interesting to note that the steady-state distribution for an exactly solvable discrete model of surface growth belonging to the above universality class has also been obtained [45].
Let us now turn to the case of a line in three dimensions (Fig. 5). Fluctuations of the line can be indicated by a two-dimensional vector $\mathbf{r}$. Even in an isotropic medium, the drift velocity breaks the isotropy in $\mathbf{r}$ by selecting a direction. A gradient expansion up to second order for the equation of motion gives [46]
$$ \partial_t r_\alpha = [K_1 \delta_{\alpha\beta} + K_2 v_\alpha v_\beta]\, \partial_x^2 r_\beta + [\lambda_1(\delta_{\alpha\beta} v_\gamma + \delta_{\alpha\gamma} v_\beta) + \lambda_2 v_\alpha \delta_{\beta\gamma} + \lambda_3 v_\alpha v_\beta v_\gamma]\, \frac{\partial_x r_\beta\, \partial_x r_\gamma}{2} + f_\alpha , \tag{50} $$
with random force correlations
$$ \langle f_\alpha(x, t) f_\beta(x', t') \rangle = 2[T_1 \delta_{\alpha\beta} + T_2 v_\alpha v_\beta]\, \delta(x - x') \delta(t - t') . \tag{51} $$
Higher-order non-linearities can be similarly constructed but are, in fact, irrelevant. In terms of components parallel and perpendicular to the velocity, the equations are
$$ \partial_t r_\parallel = K_\parallel \partial_x^2 r_\parallel + \frac{\lambda_\parallel}{2}(\partial_x r_\parallel)^2 + \frac{\lambda_\times}{2}(\partial_x r_\perp)^2 + f_\parallel(x, t) , \qquad \partial_t r_\perp = K_\perp \partial_x^2 r_\perp + \lambda_\perp\, \partial_x r_\parallel\, \partial_x r_\perp + f_\perp(x, t) , \tag{52} $$
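with

$$\begin{aligned}
\langle f_\parallel(x,t)\, f_\parallel(x',t')\rangle &= 2T_\parallel\,\delta(x-x')\,\delta(t-t'),\\
\langle f_\perp(x,t)\, f_\perp(x',t')\rangle &= 2T_\perp\,\delta(x-x')\,\delta(t-t').
\end{aligned} \tag{53}$$

The noise-averaged correlations have a dynamic scaling form like Eq. (32),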
$$\begin{aligned}
\langle[r_\parallel(x,t) - r_\parallel(x',t')]^2\rangle &= |x-x'|^{2\zeta_\parallel}\, g_\parallel\!\left(\frac{|t-t'|}{|x-x'|^{z_\parallel}}\right),\\
\langle[r_\perp(x,t) - r_\perp(x',t')]^2\rangle &= |x-x'|^{2\zeta_\perp}\, g_\perp\!\left(\frac{|t-t'|}{|x-x'|^{z_\perp}}\right).
\end{aligned} \tag{54}$$

In the absence of non-linearities ($\lambda_\parallel = \lambda_\times = \lambda_\perp = 0$), Eq. (52) can easily be solved to give $\zeta_\parallel = \zeta_\perp = 1/2$ and $z_\parallel = z_\perp = 2$. Simple dimensional counting indicates that all three non-linear terms are relevant and may modify the exponents in Eq. (54). Studies of related stochastic equations [47,25] indicate that interesting dynamic phase diagrams may emerge from the competition between non-linearities. Let us assume that $\lambda_\parallel$ is positive and finite (its sign can be changed by $r_\parallel \to -r_\parallel$), and focus on the dependence of the scaling exponents on the ratios $\lambda_\perp/\lambda_\parallel$ and $\lambda_\times/\lambda_\parallel$, as depicted in Fig. 8. (It is more convenient to set the vertical axis to $\lambda_\times K_\parallel T_\perp/\lambda_\parallel K_\perp T_\parallel$.)

The properties discussed for the KPZ equation can be extended to this higher-dimensional case:

1. Galilean invariance (GI): Consider the infinitesimal reparametrization

$$x' = x + \lambda_\parallel\epsilon t, \qquad t' = t, \qquad r'_\parallel = r_\parallel + \epsilon x, \qquad r'_\perp = r_\perp. \tag{55}$$

Eqs. (52) are invariant under this transformation provided that $\lambda_\parallel = \lambda_\perp$. Thus along this line in Fig. 8 there is GI, which once more implies the exponent identity

$$\zeta_\parallel + z_\parallel = 2. \tag{56}$$
Fig. 8. A projection of RG flows in the parameter space, for $n = 1$ transverse components.
2. Fluctuation-Dissipation (FD) condition: The Fokker-Planck equation for the evolution of the joint probability $P[r_\parallel(x), r_\perp(x)]$ has a stationary solution

$$P_0 \propto \exp\left\{-\int dx\left[\frac{K_\parallel}{2T_\parallel}(\partial_x r_\parallel)^2 + \frac{K_\perp}{2T_\perp}(\partial_x r_\perp)^2\right]\right\}, \tag{57}$$

provided that $\lambda_\times K_\parallel T_\perp = \lambda_\perp K_\perp T_\parallel$. Thus, for this special choice of parameters, depicted by a starred line in Fig. 8, if $P$ converges to this solution, the long-time behavior of the correlation functions in Eq. (54) can be directly read off Eq. (57), giving $\zeta_\parallel = \zeta_\perp = 1/2$.

3. The Cole-Hopf (CH) transformation is an important method for the exact study of solutions of the one-component non-linear diffusion equation [38]. Here we generalize this transformation to the complex plane by defining, for $\lambda_\times < 0$,
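$$\Psi(x,t) = \exp\left[\frac{\lambda_\parallel r_\parallel(x,t) + i\sqrt{-\lambda_\parallel\lambda_\times}\; r_\perp(x,t)}{2K}\right]. \tag{58}$$

The linear diffusion equation

$$\partial_t\Psi = K\,\partial_x^2\Psi + \mu(x,t)\,\Psi$$

then leads to Eq. (52) if $K_\parallel = K_\perp = K$ and $\lambda_\parallel = \lambda_\perp$. [Here $\mathrm{Re}(\mu) = \lambda_\parallel f_\parallel/2K$ and $\mathrm{Im}(\mu) = \sqrt{-\lambda_\parallel\lambda_\times}\, f_\perp/2K$.] This transformation enables an exact solution of the deterministic equation, and further allows us to write the solution to the stochastic equation in the form of a path integral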
$$\Psi(x,t) = \int_{(0,0)}^{(x,t)} \mathcal{D}x(\tau)\, \exp\left\{-\int_0^t d\tau\left[\frac{\dot{x}^2}{2K} + \mu(x,\tau)\right]\right\}. \tag{59}$$
Eq. (59) has been extensively studied in connection with quantum tunneling in a disordered medium [48], with $\Psi$ representing the wave function. In particular, results for the tunneling probability $|\Psi|^2$ suggest $z_\parallel = 3/2$ and $\zeta_\parallel = 1/2$. The transverse fluctuations correspond to the phase in the quantum problem, which is not an observable. Hence, this mapping does not provide any information on $\zeta_\perp$ and $z_\perp$ which are, in fact, observable for the moving line.

At the point $\lambda_\perp = \lambda_\times = 0$, $r_\parallel$ and $r_\perp$ decouple, and $z_\perp = 2$ while $z_\parallel = 3/2$. However, in general, $z_\parallel = z_\perp = z$ unless the effective $\lambda_\perp$ is zero. For example, at the intersection of the subspaces with GI and FD, the exponents $z_\parallel = z_\perp = 3/2$ are obtained from the exponent identities. Dynamic RG recursion relations can be computed to one-loop order [46,49], by standard methods of momentum-shell dynamic RG [39,40]. The renormalization of the seven parameters in Eq. (52), generalized to $n$ transverse directions, gives the recursion relations
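$$\begin{aligned}
\frac{dK_\parallel}{dl} &= K_\parallel\left[z - 2 + \frac{1}{\pi}\frac{\lambda_\parallel^2 T_\parallel}{4K_\parallel^3} + n\,\frac{1}{\pi}\frac{\lambda_\perp\lambda_\times T_\perp}{4K_\parallel K_\perp^2}\right],\\
\frac{dK_\perp}{dl} &= K_\perp\left[z - 2 + \frac{1}{\pi}\frac{\lambda_\perp\left((\lambda_\times T_\perp/K_\perp) + (\lambda_\perp T_\parallel/K_\parallel)\right)}{2K_\perp(K_\perp + K_\parallel)} + \frac{1}{\pi}\frac{K_\perp - K_\parallel}{K_\perp + K_\parallel}\,\frac{\lambda_\perp\left((\lambda_\times T_\perp/K_\perp) - (\lambda_\perp T_\parallel/K_\parallel)\right)}{K_\perp(K_\perp + K_\parallel)}\right],\\
\frac{dT_\parallel}{dl} &= T_\parallel\left[z - 2\zeta_\parallel - 1 + \frac{1}{\pi}\frac{\lambda_\parallel^2 T_\parallel}{4K_\parallel^3}\right] + n\,\frac{1}{\pi}\frac{\lambda_\times^2 T_\perp^2}{4K_\perp^3},\\
\frac{dT_\perp}{dl} &= T_\perp\left[z - 2\zeta_\perp - 1 + \frac{1}{\pi}\frac{\lambda_\perp^2 T_\parallel}{K_\perp K_\parallel(K_\perp + K_\parallel)}\right],\\
\frac{d\lambda_\parallel}{dl} &= \lambda_\parallel\left[\zeta_\parallel + z - 2\right],\\
\frac{d\lambda_\perp}{dl} &= \lambda_\perp\left[\zeta_\parallel + z - 2 - \frac{1}{\pi}\frac{\lambda_\parallel - \lambda_\perp}{(K_\perp + K_\parallel)^2}\left((\lambda_\times T_\perp/K_\perp) - (\lambda_\perp T_\parallel/K_\parallel)\right)\right],\\
\frac{d\lambda_\times}{dl} &= \lambda_\times\left[2\zeta_\perp - \zeta_\parallel + z - 2 + \frac{1}{\pi}\frac{\lambda_\parallel K_\perp - \lambda_\perp K_\parallel}{K_\perp K_\parallel(K_\perp + K_\parallel)}\left((\lambda_\times T_\perp/K_\perp) - (\lambda_\perp T_\parallel/K_\parallel)\right)\right].
\end{aligned} \tag{60}$$

The projections of the RG flows on the two-parameter subspace shown in Fig. 8 are indicated by trajectories. They naturally satisfy the constraints imposed by the non-perturbative results: the subspace of GI is closed under RG, while the FD condition appears as a fixed line. The RG flows, and the corresponding exponents, are different in each quadrant of Fig. 8, which implies that the scaling behavior is determined by the relative signs of the three non-linearities. This was confirmed by numerical integrations [46,49] of Eq. (52), performed for different sets of parameters. A summary of the computed exponents is given in Table 1.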
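The two constraints just mentioned (GI closed under RG, FD preserved by the flow) can be checked numerically from the right-hand sides of the recursion relations. The sketch below is illustrative only: the function and dictionary key names are hypothetical, and the formulas are a transcription of the one-loop relations of Eq. (60) as written in these notes, not code from Refs. [46,49].

```python
import math


def rg_rhs(p, z, zeta_par, zeta_perp, n=1):
    """Right-hand sides of the one-loop recursion relations, Eq. (60).

    p holds the seven parameters under the (hypothetical) keys
    K_par, K_perp, T_par, T_perp, lam_par, lam_perp, lam_x.
    """
    Kp, Kt = p["K_par"], p["K_perp"]
    Tp, Tt = p["T_par"], p["T_perp"]
    lp, lt, lx = p["lam_par"], p["lam_perp"], p["lam_x"]
    inv_pi = 1.0 / math.pi
    plus = lx * Tt / Kt + lt * Tp / Kp
    minus = lx * Tt / Kt - lt * Tp / Kp  # vanishes when the FD condition holds
    Ksum = Kt + Kp
    return {
        "K_par": Kp * (z - 2 + inv_pi * lp**2 * Tp / (4 * Kp**3)
                       + n * inv_pi * lt * lx * Tt / (4 * Kp * Kt**2)),
        "K_perp": Kt * (z - 2 + inv_pi * lt * plus / (2 * Kt * Ksum)
                        + inv_pi * (Kt - Kp) / Ksum * lt * minus / (Kt * Ksum)),
        "T_par": Tp * (z - 2 * zeta_par - 1 + inv_pi * lp**2 * Tp / (4 * Kp**3))
                 + n * inv_pi * lx**2 * Tt**2 / (4 * Kt**3),
        "T_perp": Tt * (z - 2 * zeta_perp - 1
                        + inv_pi * lt**2 * Tp / (Kt * Kp * Ksum)),
        "lam_par": lp * (zeta_par + z - 2),
        "lam_perp": lt * (zeta_par + z - 2
                          - inv_pi * (lp - lt) / Ksum**2 * minus),
        "lam_x": lx * (2 * zeta_perp - zeta_par + z - 2
                       + inv_pi * (lp * Kt - lt * Kp) / (Kt * Kp * Ksum) * minus),
    }


def fd_ratio_drift(p, d):
    """d/dl of ln[(lam_x K_par T_perp) / (lam_perp K_perp T_par)]."""
    return (d["lam_x"] / p["lam_x"] + d["K_par"] / p["K_par"]
            + d["T_perp"] / p["T_perp"] - d["lam_perp"] / p["lam_perp"]
            - d["K_perp"] / p["K_perp"] - d["T_par"] / p["T_par"])


if __name__ == "__main__":
    # A point on the FD line: lam_x K_par T_perp = lam_perp K_perp T_par.
    p = dict(K_par=1.3, K_perp=0.7, T_par=0.02, T_perp=0.05,
             lam_par=2.0, lam_perp=3.0, lam_x=3.0 * 0.7 * 0.02 / (1.3 * 0.05))
    d = rg_rhs(p, z=1.6, zeta_par=0.4, zeta_perp=0.45, n=2)
    print("drift of the FD ratio:", fd_ratio_drift(p, d))
```

With these relations the FD ratio does not flow for any choice of $z$, $\zeta_\parallel$, $\zeta_\perp$ and $n$, and setting $\lambda_\parallel = \lambda_\perp$ makes the two corresponding right-hand sides coincide. At the symmetric point $K_\parallel = K_\perp$, $T_\parallel = T_\perp$, $\lambda_\parallel = \lambda_\perp = \lambda_\times$ (with $n = 1$), all seven right-hand sides vanish for $\zeta = 1/2$, $z = 3/2$ when $\lambda^2 T/K^3 = \pi$.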
Table 1. Numerical estimates of the scaling exponents, for various values of model parameters for $n = 1$. In all cases, $K_\parallel = K_\perp = 1$ and $T_\parallel = T_\perp = 0.01$, unless indicated otherwise. Typical error bars are $\pm 0.05$ for $\zeta$, $\pm 0.01$ for $z/\zeta$. Entries in brackets are theoretical results. Exact values are given in fractional form.

$\lambda_\parallel$   $\lambda_\times$   $\lambda_\perp$   $\zeta_\parallel$                      $z_\parallel/\zeta_\parallel$   $\zeta_\perp$             $z_\perp/\zeta_\perp$
20     20     20     0.48 (1/2)                             3.0 (3)     0.48 (1/2)                3.0 (3)
20     20     2.5    0.75                                   1.7         0.50                      3.7
20     5      25     0.51                                   3.4         0.56                      2.9
5      5      -5     0.83                                   unstable    0.44                      3.6
                     (No fixed point for finite $\zeta$, $z$)
20     -20    -20    0.50 (1/2)                             3.1 (3)     0.50 (1/2)                2.9 (3)
5      -5     5      0.52 (1/2)                             3.3 (3)     0.57 (Strong coupling)    3.4
20     0      20     0.49 (1/2)                             3.1 (3)     0.72 (0.75)               2.2 (2)
20     0      -20    0.48 (1/2)                             3.0 (3)     0.65                      3.1 ($z_\perp > z_\parallel$)
20     20     0      0.84 ($z_\parallel < z_\perp$)         1.4         0.50 (1/2)                4.0 (4)
20     -20    0      0.55 ($z_\parallel < z_\perp$)         2.9         0.51 (1/2)                4.0 (4)
The analysis of analytical and numerical results can be summarized as follows:

$\lambda_\perp\lambda_\times > 0$: In this region, the scaling behavior is understood best. The RG flows terminate on the fixed line where FD conditions apply, hence $\zeta_\parallel = \zeta_\perp = 1/2$. All along this line, the one-loop RG exponent is $z = 3/2$. These results are consistent with the numerical simulations. The measured exponents rapidly converge to these values, except when $\lambda_\perp$ or $\lambda_\times$ are small.

$\lambda_\times = 0$: In this case the equation for $r_\parallel$ is the KPZ equation (40), thus $\zeta_\parallel = 1/2$ and $z_\parallel = 3/2$. The fluctuations in $r_\parallel$ act as a strong (multiplicative and correlated) noise on $r_\perp$. The one-loop RG yields the exponents $z_\perp = 3/2$, $\zeta_\perp = 0.75$ for $\lambda_\perp > 0$, while a negative $\lambda_\perp$ scales to 0, suggesting $z_\perp > z_\parallel$. Simulations are consistent with the RG calculations for $\lambda_\perp > 0$, yielding $\zeta_\perp = 0.72$, surprisingly close to the one-loop RG value. For $\lambda_\perp < 0$, simulations indicate $z_\perp \approx 2$ and $\zeta_\perp \approx 2/3$ along with the expected values for the longitudinal exponents.

$\lambda_\perp = 0$: The transverse fluctuations satisfy a simple diffusion equation with $\zeta_\perp = 1/2$ and $z_\perp = 2$. Through the term $\lambda_\times(\partial_x r_\perp)^2/2$, these fluctuations act as a correlated noise [40] for the longitudinal mode. A naive application of the results of this reference [40] gives $\zeta_\parallel = 2/3$ and $z_\parallel = 4/3$. Quite surprisingly, simulations indicate different behavior depending on the sign of $\lambda_\times$. For $\lambda_\times < 0$, $z_\parallel \approx 3/2$ and $\zeta_\parallel \approx 1/2$, whereas for $\lambda_\times > 0$, longitudinal fluctuations are much stronger, resulting in
$z_\parallel \approx 1.18$ and $\zeta_\parallel \approx 0.84$. Actually, $\zeta_\parallel$ increases steadily with system size, suggesting a breakdown of dynamic scaling, due to a change of sign in $\lambda_\perp\lambda_\times$. This dependence on the sign of $\lambda_\times$ may reflect the fundamental difference between behavior in quadrants II and IV of Fig. 8.

$\lambda_\perp < 0$ and $\lambda_\times > 0$: The analysis of this region (II) is the most difficult in that the RG flows do not converge upon a finite fixed point and $\lambda_\perp \to 0$, which may signal the breakdown of dynamic scaling. Simulations indicate strong longitudinal fluctuations that lead to instabilities in the discrete integration scheme, excluding the possibility of measuring the exponents reliably.

$\lambda_\perp > 0$ and $\lambda_\times < 0$: The projected RG flows in this quadrant (IV) converge to the point $\lambda_\perp/\lambda_\parallel = 1$ and $\lambda_\times T_\perp K_\parallel/\lambda_\parallel T_\parallel K_\perp = -1$. This is actually not a fixed point, as $K_\parallel$ and $K_\perp$ scale to infinity. The applicability of the CH transformation to this point implies $z_\parallel = 3/2$ and $\zeta_\parallel = 1/2$. Since $\lambda_\perp$ is finite, $z_\perp = z_\parallel = 3/2$ is expected, but this does not give any information on $\zeta_\perp$. Simulations indicate strong transverse fluctuations and suffer from difficulties similar to those in region II.

Eqs. (52) are the simplest non-linear, local, and dissipative equations that govern the fluctuations of a moving line in a random medium. They can be easily generalized to describe the time evolution of a manifold with arbitrary internal ($x \in \mathbb{R}^d$) and external ($\mathbf{r} \in \mathbb{R}^{n+1}$) dimensions, and to the motion of curves that are not necessarily stretched in a particular direction. Since the derivation only involves general symmetry arguments, the given results are widely applicable to a number of seemingly unrelated systems. We will discuss one application to drifting polymers in more detail in the next lecture, explicitly demonstrating the origin of the non-linear terms starting from more fundamental hydrodynamic equations. A simple model of crack front propagation in three dimensions [50] also arrives at Eq.
(52), implying the self-affine structure of the crack surface after the front has passed.

2.3. Drifting polymers

The dynamics of polymers in fluids is of much theoretical interest and has been extensively studied [51,52]. The combination of polymer flexibility, interactions, and hydrodynamics makes a first-principles approach to the problem quite difficult. There are, however, a number of phenomenological studies that describe various aspects of this problem [53]. One of the simplest is the Rouse model [54]: The configuration of the polymer at time $t$ is described by a vector $\mathbf{R}(x,t)$, where $x \in [0, N]$ is a continuous variable replacing the discrete monomer index (see Fig. 9). Ignoring inertial effects, the relaxation of the polymer in a viscous medium is approximated by

$$\partial_t\mathbf{R}(x,t) = \mu\,\mathbf{F}(\mathbf{R}(x,t)) = K\,\partial_x^2\mathbf{R}(x,t) + \boldsymbol{\eta}(x,t), \tag{61}$$

where $\mu$ is the mobility. The force $\mathbf{F}$ has a contribution from interactions with near neighbors that are treated as springs. Steric and other interactions are ignored. The effect of the medium is represented by the random forces $\boldsymbol{\eta}$ with zero mean. The Rouse model is a linear Langevin equation that is easily solved. It predicts that the mean square radius of gyration, $R_g^2 = \langle|\mathbf{R} - \langle\mathbf{R}\rangle|^2\rangle$, is proportional to the polymer size $N$, and that the relaxation rates scale as the fourth power of the wave number (i.e., in dynamic light scattering experiments, the half-width at half-maximum of the scattering amplitude scales as the fourth power of the scattering wave vector $q$). These results can be summarized as $R_g \sim N^\nu$ and $\Gamma(q) \sim q^z$, where $\nu$ and $z$ are called the swelling and dynamic exponents, respectively [55]. Thus, for the Rouse model, $\nu = 1/2$ and $z = 4$.
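The Rouse value $z = 4$ can be illustrated with a discrete bead-spring chain. The sketch below is not from the text: it assumes a chain of $N$ beads with free ends, for which the deterministic part of Eq. (61) becomes a lattice Laplacian whose normal modes relax at rates $4K\sin^2(p\pi/2N)$, and it takes the wave vector of the slowest mode to scale as $q \sim N^{-\nu}$ with $\nu = 1/2$. The function names are hypothetical.

```python
import math


def rouse_rate(p, n_beads, stiffness=1.0):
    """Relaxation rate of normal mode p for a free-end discrete Rouse chain.

    These are the eigenvalues of the lattice Laplacian with free ends;
    p = 0 is the center-of-mass mode and does not relax.
    """
    return 4.0 * stiffness * math.sin(p * math.pi / (2.0 * n_beads)) ** 2


def effective_dynamic_exponent(n1, n2, nu=0.5):
    """Estimate z in Gamma ~ q^z from the slowest internal mode at two chain sizes.

    The wave vector is taken to scale as q ~ N^(-nu), i.e. q ~ 1/R_g.
    """
    gamma_ratio = rouse_rate(1, n2) / rouse_rate(1, n1)
    q_ratio = (n2 / n1) ** (-nu)
    return math.log(gamma_ratio) / math.log(q_ratio)


if __name__ == "__main__":
    print(effective_dynamic_exponent(1000, 2000))  # close to 4 for long chains
```

The slowest rate falls off as $N^{-2}$, which becomes $q^4$ once lengths are measured in units of $R_g \sim N^{1/2}$; this is the change of notation described in Ref. [55].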
Fig. 9. The configuration of a polymer.
The Rouse model ignores hydrodynamic interactions mediated by the fluid. These effects were originally considered by Kirkwood and Risemann [56] and later on by Zimm [57]. The basic idea is that the motion of each monomer modifies the flow field at large distances. Consequently, each monomer experiences an additional velocity

$$\delta_H\,\partial_t\mathbf{R}(x,t) = \frac{1}{8\pi\eta_s}\int dx'\,\frac{\mathbf{F}(x')\,r_{xx'}^2 + (\mathbf{F}(x')\cdot\mathbf{r}_{xx'})\,\mathbf{r}_{xx'}}{|\mathbf{r}_{xx'}|^3} \approx \gamma\int dx'\,\frac{\partial_x^2\mathbf{R}}{|x-x'|^{\nu}}, \tag{62}$$

where $\mathbf{r}_{xx'} = \mathbf{R}(x) - \mathbf{R}(x')$ and the final approximation is obtained by replacing the actual distance between two monomers by their average value. The modified equation is still linear in $\mathbf{R}$ and easily solved. The main result is the speeding up of the relaxation dynamics as the exponent $z$ changes from 4 to 3. Most experiments on polymer dynamics [58], indeed, measure exponents close to 3. Rouse dynamics is still important in other circumstances, such as diffusion of a polymer in a solid matrix, stress and viscoelasticity in concentrated polymer solutions, and is also applicable to relaxation times in Monte Carlo simulations.

Since both of these models are linear, the dynamics remains invariant in the center of mass coordinates upon the application of a uniform external force. Hence, the results for a drifting polymer are identical to a stationary one. This conclusion is in fact not correct due to the hydrodynamic interactions. For example, consider a rod-like conformation of the polymer with monomer length $b_0$ where $\partial_x R_\alpha = b_0 t_\alpha$ everywhere on the polymer, so that the elastic (Rouse) force vanishes. If a uniform force $\mathbf{E}$ per monomer acts on this rod, the velocity of the rod can be solved using Kirkwood theory, and the result is [51]

$$\mathbf{v} = \frac{(-\ln\kappa)}{4\pi\eta_s b_0}\,\mathbf{E}\cdot[\mathbf{I} + \mathbf{t}\mathbf{t}]. \tag{63}$$

In the above equation, $\eta_s$ is the solvent viscosity, $\mathbf{t}$ is the unit tangent vector, and $\kappa = 2b/b_0 N$ is the ratio of the width $b$ to the half-length $b_0 N/2$ of the polymer. A more detailed calculation of the velocity in
the more general case of an arbitrarily shaped slender body by Khayat and Cox [59] shows that non-local contributions to the hydrodynamic force, which depend on the whole shape of the polymer rather than the local orientation, are $O(1/(\ln\kappa)^2)$. Therefore, corrections to Eq. (63) are small when N
different scaling regimes correspond to actual physical situations. The scaling results found by the RG analysis are verified by direct integration of equations, as mentioned in the earlier lectures. A more detailed discussion of the analysis and results can be found in our earlier work [49].

In constructing Eq. (64), we only allowed for local effects, and ignored the nonlocalities that are the hallmark of hydrodynamics. One consequence of hydrodynamic interactions is the back-flow velocity in Eq. (62) that can be added to the evolution equations, Eq. (64). Dimensional analysis gives the recursion relation

$$\frac{\partial\gamma}{\partial l} = \gamma\left[\nu z - 1 - (d-2)\nu\right] + O(\gamma^2), \tag{66}$$

which implies that, at the non-linear fixed point, this additional term is surprisingly irrelevant for $d > 3$, and $z = 3$ due to the non-linearities. For $d < 3$, $z = d$ due to hydrodynamics, and the non-linear terms are irrelevant. The situation in three dimensions is unclear, but a change in the exponents is unlikely. Similarly, one could consider the effect of self-avoidance by including the force generated by a softly repulsive contact potential
$$\frac{b}{2}\int dx\,dx'\;V\big(\mathbf{r}(x) - \mathbf{r}(x')\big). \tag{67}$$
The relevance of this term is also controlled by the scaling dimension $y_b = \nu z - 1 - (d-2)\nu$, and therefore this effect is marginal in three dimensions at the non-linear fixed point, in contrast with both Rouse and Zimm models where self-avoidance becomes relevant below four dimensions. Unfortunately, one is ultimately forced to consider non-local and non-linear terms based on similar grounds, and such terms are, indeed, relevant below four dimensions. In some cases, local or global arclength conservation may be an important consideration in writing down a dynamics for the system. However, a local description is likely to be more correct in a more complicated system with screening effects (motion in a gel that screens hydrodynamic interactions) where a first-principles approach becomes even more intractable. Therefore, this model is an important starting point towards understanding the scaling behavior of polymers under a uniform drift, a problem with great technological importance.
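The dimension counting used above for the back-flow and self-avoidance terms is simple enough to tabulate. The following sketch evaluates the linear coefficient $\nu z - 1 - (d-2)\nu$ with exact fractions; the function names are hypothetical, and applying the same expression with the Rouse value $z = 4$ is an assumption made here for comparison, chosen because it reproduces the familiar relevance of self-avoidance below four dimensions.

```python
from fractions import Fraction


def scaling_dimension(nu, z, d):
    """Linear coefficient nu*z - 1 - (d - 2)*nu; positive means relevant."""
    return nu * z - 1 - (d - 2) * nu


def hydrodynamic_z(nu, d):
    """Value of z at which the coefficient vanishes: z = 1/nu + d - 2."""
    return 1 / nu + d - 2


if __name__ == "__main__":
    half = Fraction(1, 2)
    for d in (2, 3, 4):
        print(d,
              scaling_dimension(half, 3, d),   # non-linear fixed point, z = 3
              scaling_dimension(half, 4, d),   # Rouse value, z = 4
              hydrodynamic_z(half, d))
```

For $\nu = 1/2$ and $z = 3$ the coefficient is $(3 - d)/2$, so the term is marginal in three dimensions and irrelevant above, while requiring the coefficient to vanish gives $z = d$, as stated for $d < 3$.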
Acknowledgements

The work described here is part of the doctoral thesis of Deniz Ertas in the Physics Department of MIT. Financial support from the NSF through grant number DMR-93-03667 is gratefully acknowledged.
References

[1] H. Fukuyama, P.A. Lee, Phys. Rev. B 17 (1978) 535; P.A. Lee, T.M. Rice, Phys. Rev. B 19 (1979) 3970.
[2] R. Bruinsma, G. Aeppli, Phys. Rev. Lett. 52 (1984) 1547; J. Koplik, H. Levine, Phys. Rev. B 32 (1985) 280.
[3] P.G. de Gennes, Rev. Mod. Phys. 57 (1985) 827.
[4] J.F. Joanny, P.G. de Gennes, J. Chem. Phys. 81 (1984) 552.
[5] D.S. Fisher, Phys. Rev. Lett. 50 (1983) 1486.
[6] P. Bak, C. Tang, K. Wiesenfeld, Phys. Rev. Lett. 59 (1987) 381; Phys. Rev. A 38 (1988) 364.
[7] D.S. Fisher, Phys. Rev. B 31 (1985) 1396; O. Narayan, D.S. Fisher, Phys. Rev. B 46 (1992) 11520.
[8] T. Nattermann, S. Stepanow, L.H. Tang, H. Leschhorn, J. Phys. II France 2 (1992) 1483.
[9] Y. Imry, S.K. Ma, Phys. Rev. Lett. 35 (1975) 1399.
[10] P.C. Martin, E. Siggia, H. Rose, Phys. Rev. A 8 (1973) 423.
[11] O. Narayan, D.S. Fisher, Phys. Rev. B 48 (1993) 7030.
[12] M. Dong, M.C. Marchetti, A.A. Middleton, V. Vinokur, Phys. Rev. Lett. 70 (1993) 662. The identification of the exponent $\zeta = 1$ from the correlation function has been questioned by H. Leschhorn, L.-H. Tang, Phys. Rev. Lett. 70 (1993) 2973.
[13] M.A. Rubio, C.A. Edwards, A. Dougherty, J.P. Gollub, Phys. Rev. Lett. 63 (1989) 1685; V.K. Horváth, F. Family, T. Vicsek, Phys. Rev. Lett. 67 (1991) 3207; S. He, G.L.M.K.S. Kahanda, P.-Z. Wong, Phys. Rev. Lett. 69 (1992) 3731.
[14] S.V. Buldyrev, A.-L. Barabasi, F. Caserta, S. Havlin, H.E. Stanley, T. Vicsek, Phys. Rev. A 45 (1992) R8313.
[15] F. Family, K.C.B. Chan, J.G. Amar, Surface Disordering: Growth, Roughening and Phase Transitions, Les Houches Series, Nova Science Publishers, New York, 1992.
[16] H. Leschhorn, Physica A 195 (1993) 324.
[17] L.A.N. Amaral, A.-L. Barabasi, H.E. Stanley, Phys. Rev. Lett. 73 (1994) 62.
[18] H. Ji, M.O. Robbins, Phys. Rev. B 44 (1991) 2538; B. Koiller, H. Ji, M.O. Robbins, Phys. Rev. B 46 (1992) 5258.
[19] L.H. Tang, H. Leschhorn, Phys. Rev. A 45 (1992) R8309.
[20] M. Kardar, G. Parisi, Y.C. Zhang, Phys. Rev. Lett. 56 (1986) 889.
[21] C. Tang, S. Feng, L. Golubovic, Phys. Rev. Lett. 72 (1994) 1264.
[22] S. Stepanow, J. Phys. II France 5 (1995) 11.
[23] Z. Csahók, K. Honda, T. Vicsek, J. Phys. A 26 (1993) L171; S. Galluccio, Y.-C. Zhang, Phys. Rev. E 51 (1995) 1686; H. Leschhorn, cond-mat 9605018.
[24] D. Dhar, M. Barma, M.K. Phani, Phys. Rev. Lett. 47 (1981) 1238; D. Dhar, J. Phys. A 15 (1982) 1859; S.V. Buldyrev, S. Havlin, H.E. Stanley, Physica 200 (1993) 200; and references therein.
[25] D. Wolf, Phys. Rev. Lett. 67 (1991) 1783.
[26] H. Jeong, B. Kahng, D. Kim, Phys. Rev. Lett. 77 (1996) 5094.
[27] See, for example, G. Blatter, M.V. Feigel'man, V.B. Geshkenbein, A.I. Larkin, V.M. Vinokur, Rev. Mod. Phys. 66 (1994) 1125; and references therein.
[28] R.H. Koch et al., Phys. Rev. Lett. 63 (1989) 1511; P.L. Gammel, L.F. Schneemener, D.J. Bishop, Phys. Rev. Lett. 66 (1991) 953.
[29] L. Civale et al., Phys. Rev. Lett. 67 (1991) 648; M. Leghissa et al., Phys. Rev. B 48 (1993) 1341.
[30] D.S. Fisher, M.P.A. Fisher, D.A. Huse, Phys. Rev. B 43 (1991) 130.
[31] D.R. Nelson, V.M. Vinokur, Phys. Rev. Lett. 68 (1992) 2398.
[32] Y. Enomoto, Phys. Lett. A 161 (1991) 185; Y. Enomoto, K. Katsumi, R. Kato, S. Maekawa, Physica C 192 (1992) 166.
[33] A.A. Middleton, D.S. Fisher, Phys. Rev. Lett. 66 (1991) 92; Phys. Rev. B 47 (1993) 3530.
[34] H. Sompolinsky, A. Zippelius, Phys. Rev. B 25 (1982) 6860; A. Zippelius, Phys. Rev. B 29 (1984) 2717.
[35] In this case, the longitudinal direction is chosen to be along the average velocity v, not the Lorentz force F.
[36] M. Kardar, in: J.C. Charmet, S. Roux, E. Guyon (Eds.), Disorder and Fracture, Plenum, New York, 1990; T. Hwa, M. Kardar, Phys. Rev. A 45 (1992) 7002.
[37] M. Plischke, Z. Rácz, D. Liu, Phys. Rev. B 35 (1987) 3485.
[38] J.M. Burgers, The Nonlinear Diffusion Equation, Riedel, Boston, 1974.
[39] D. Forster, D.R. Nelson, M.J. Stephen, Phys. Rev. A 16 (1977) 732.
[40] E. Medina, T. Hwa, M. Kardar, Y. Zhang, Phys. Rev. A 39 (1989) 3053.
[41] F. Family, T. Vicsek (Eds.), Dynamics of Fractal Surfaces, World Scientific, Singapore, 1991.
[42] J. Krug, H. Spohn, in: C. Godreche (Ed.), Solids Far From Equilibrium: Growth, Morphology and Defects, Cambridge University Press, Cambridge, 1991.
[43] T. Halpin-Healy, Y.-C. Zhang, Phys. Rep. 254 (1995) 215.
[44] A.L. Barabasi, H.E. Stanley, Fractal Concepts in Surface Growth, CUP, Cambridge, 1995.
[45] M. Prahofer, H. Spohn, J. Stat. Phys. (1997), in press.
[46] D. Ertas, M. Kardar, Phys. Rev. Lett. 69 (1992) 929.
[47] T. Hwa, Phys. Rev. Lett. 69 (1992) 1552.
[48] E. Medina, M. Kardar, Y. Shapir, X.-R. Wang, Phys. Rev. Lett. 62 (1989) 941; E. Medina, M. Kardar, Phys. Rev. B 46 (1992) 9984.
[49] D. Ertas, M. Kardar, Phys. Rev. E 48 (1993) 1228.
[50] J.P. Bouchaud, E. Bouchaud, G. Lapasset, J. Planes, Phys. Rev. Lett. 71 (1993) 2240.
[51] M. Doi, S.F. Edwards, Theory of Polymer Dynamics, Oxford University Press, Oxford, 1986.
[52] P.G. de Gennes, Scaling Concepts in Polymer Physics, Cornell University Press, Ithaca, NY, 1979.
[53] R.B. Bird, Dynamics of Polymeric Physics, vols. 1-2, Wiley, New York, 1987.
[54] P.E. Rouse, J. Chem. Phys. 21 (1953) 1272.
[55] We have changed the notation to conform with the traditions of polymer science. $\nu$ is $\zeta$ and $z$ is $z/\zeta$ in terms of the notation used previously.
[56] J. Kirkwood, J. Risemann, J. Chem. Phys. 16 (1948) 565.
[57] B.H. Zimm, J. Chem. Phys. 24 (1956) 269.
[58] See, for example, M. Adam, M. Delsanti, Macromolecules 10 (1977) 1229.
[59] R.E. Khayat, R.G. Cox, J. Fluid. Mech. 209 (1989) 435.
Physics Reports 301 (1998) 113-150

Collective transport in random media: from superconductors to earthquakes

Daniel S. Fisher*
Physics Department, Harvard University, Cambridge, MA 02138, USA
* Fax: +1 617 495-0416.

Abstract

In these lectures, a variety of non-equilibrium transport phenomena are introduced that all involve, in some way, elastic manifolds being driven through random media. A simple class of models is studied focussing on the behavior near to the critical “depinning” force above which persistent motion occurs in these systems. A simple mean field theory and a “toy” model of “avalanche” processes are analyzed and used to motivate the general scaling picture found in recent renormalization group studies. The general ideas and results are then applied to various systems: sliding charge density waves, critical current behavior of vortices in superconductors, dynamics of cracks, and simple models of a geological fault. The roles of thermal fluctuations, defects, inertia, and elastic wave propagation are all discussed briefly. © 1998 Elsevier Science B.V. All rights reserved.

PACS: 05.60.+w

1. Introduction

Many phenomena in nature involve transport of material or some other quantity from one region of space to another. In some cases transport occurs in systems that are close to equilibrium with the transport representing only a small perturbation such as flow or electrical current in a metal, while in other cases it involves systems that are far from equilibrium such as a landslide down a mountain, or a drop of water sliding down an irregular surface. Sometimes, particles or other constituents move relatively independently of each other like the electrons in a metal, while in other situations the interactions play an important role, as in the landslide and the water droplet. If the interactions are strong enough, all the particles (or other constituents) move together and the macroscopic dynamics involves only a small number of degrees of freedom. This is the case for a small water drop on, e.g. wax paper, which slides around while retaining its shape. But if the
D.S. Fisher / Physics Reports 301 (1998) 113—150
interactions are not so strong relative to the other forces acting on the constituents, then the transport involves in an essential way many interacting degrees of freedom. This is the case for a larger water drop on an irregular surface for which the contact line between the droplet and the surface continually deforms and adjusts its shape in response to the competition between the surface tension of the water and the interactions with the substrate [1,2]. Such a moving drop and the landslide are examples of non-equilibrium collective transport phenomena, which will be the general subject of these lectures. This is, of course, an impossibly broad subject! We must thus narrow the scope drastically. Although the range of systems discussed here will, nevertheless, be reasonably broad, we will primarily focus on systems in which the interactions are strong enough so that the transported object (or at least some part of it) is elastic. We will use this in a general and somewhat loose sense that the transported object has enough integrity that if one part of it moves a long distance then so, eventually, must the other parts as well. Thus, the fluid drop is elastic if it does not break up — i.e. its perimeter retains its integrity — while a landslide is not elastic as some rocks will fall much further than others and the relative positions of the rocks will be completely jumbled by the landslide. We will be interested in systems in which the medium in which the transport occurs has static random heterogeneities (“quenched randomness”) which exert forces on the transported object that depend on where it is in space. 
Examples we will discuss are: interfaces between two phases in random media [3,4], such as between two fluids in a porous medium [5], or domain walls in a random ferromagnetic alloy; lattices of vortices in dirty type II superconductors [6]; charge density waves which are spatially periodic modulations of the electron density that occur in certain solids [7-9]; and the motion of geological faults [10,11]. In addition to the contact line of the fluid drop already mentioned [1,2], another well known, but poorly understood, example that we will, however, not discuss is solid-on-solid friction. In all of these systems, a driving force, call it $F$, can be applied which acts to try to make the object move, but this will be resisted by the random “pinning” forces exerted by the medium or substrate. The primary questions of interest will involve the response of the system to such an applied driving force [12]. If $F$ is small, then one might guess that it will not be sufficient to overcome the resistance of the pinning forces; sections of the object would just move a bit and it would deform in response to $F$, but would afterwards be at rest. (Note that in most of what follows, we will ignore fluctuations so that the motion is deterministic and the objects can be said to be stationary.) If the force is increased, some segment might go unstable and move only to be stopped by stronger pinning regions or neighboring segments. But for large enough $F$, it should be possible to overcome the pinning forces (unless they are so strong that the object is broken up, an issue we will return to at the end) and the object will move, perhaps attaining some steady state velocity $v$. Basic questions one might ask are: is there a unique, history-independent force, $F_c$, separating the static from the moving regimes? How does $v$ depend on $F$ (and possibly on history)? Are there some kinds of non-equilibrium critical phenomena when $v$ is small?
How does the system respond to an additional time or space dependent applied force? These are all macroscopic properties of the system. But we will also be interested in some microscopic properties: how can one characterize (statistically) the deformations of the object when it is stationary [13]? The dynamic deformations and local velocities when it is moving? The response to a small local perturbation? etc.
Motivated by possible analogies with equilibrium phase transitions [14], we can ask if there are scaling laws that might obtain near a critical force which relate, for example, the characteristic length scale $L$ for some process to its characteristic time scale $\tau$ via a power-law relation of the form

$$\tau \sim L^z. \tag{1}$$
Trying to answer some of these questions — and to pose other more pointed questions — is the main aim of these lectures. In the next few sections a particular system and its natural (theoretical) generalizations will be studied and tools and ideas developed. In the last section, these are tentatively applied to various physical systems and some of the complicating features left out of the simple model systems are discussed. This leads naturally to many open questions.
2. Interfaces and models

In order to develop some of the general ideas, both conceptual and computational, we will focus initially on an interface between two phases that is driven by an applied force through an inhomogeneous medium [3,4]. The essential ingredients of a model of this system are: the forces of sections of the interface on nearby sections, i.e. the elasticity of the interface caused by its interfacial tension; the preference of the interface for some regions of the system over others due to the random heterogeneities; and some dynamical law which governs the time evolution of the local interface position. We will initially make several simplifying approximations, which we will come back and examine later. First, we assume that the interface is not too distorted away from a flat surface normal to the direction ($z$) of the driving force so that its configuration can be represented by its displacement field $u(\mathbf{r})$ away from a flat reference surface. The coordinates $\mathbf{R} = (x, y, z)$ of points on the surface are then

$$(x, y) = \mathbf{r} \quad\text{and}\quad z = u(\mathbf{r}). \tag{2}$$

Second, we will assume that the dynamics are purely dissipative, i.e. that inertia is negligible, a good approximation in many physical situations. Keeping only the lowest order terms in deviations from flat, we then have

$$\eta\,\frac{\partial u(\mathbf{r},t)}{\partial t} = F + \sigma(\mathbf{r},t) - f_p[\mathbf{r}, u(\mathbf{r},t)] \tag{3}$$

with $F$ representing the driving force on the interface; $f_p(\mathbf{R})$ representing the random “pinning” forces of the heterogeneous medium on the interface, which we assume for the present are not history or velocity dependent; $\eta$ a dissipative coefficient; and

$$\sigma(\mathbf{r},t) = \int d\mathbf{r}' \int^{t} dt'\, J(\mathbf{r}-\mathbf{r}', t-t')\,[u(\mathbf{r}',t') - u(\mathbf{r},t)] \tag{4}$$

the “stress” on the interface from its elasticity, which is “transmitted” by the kernel $J(\mathbf{r},t)$. Short-range elasticity of the interface corresponds to

$$J \propto \delta(t)\,\nabla^2\delta(\mathbf{r}). \tag{5}$$

A schematic of such an interface and the forces acting on it is shown in Fig. 1.
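To make the model of Eqs. (3)-(5) concrete, here is a minimal numerical sketch, not taken from the lectures. It assumes a one-dimensional periodic lattice, nearest-neighbor (short-range) elasticity, explicit Euler time stepping, and a toy pinning force $f_p = A_i\sin(k_i u + \phi_i)$ with random per-site parameters and $|f_p| \le 1$. All names and parameter values are illustrative choices.

```python
import math
import random


def make_pinning(n_sites, seed=0):
    """Random per-site pinning parameters (a toy choice, not from the text)."""
    rng = random.Random(seed)
    amps = [rng.uniform(0.5, 1.0) for _ in range(n_sites)]  # so |f_p| <= 1
    waves = [rng.uniform(0.5, 1.0) for _ in range(n_sites)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_sites)]
    return amps, waves, phases


def velocities(u, force, pin, stiffness=1.0, eta=1.0):
    """Right-hand side of Eq. (3) divided by eta, on a periodic 1D lattice."""
    amps, waves, phases = pin
    n = len(u)
    out = []
    for i in range(n):
        # Discrete short-range stress: stiffness times the lattice Laplacian of u.
        stress = stiffness * (u[(i + 1) % n] + u[i - 1] - 2.0 * u[i])
        f_p = amps[i] * math.sin(waves[i] * u[i] + phases[i])
        out.append((force + stress - f_p) / eta)
    return out


def evolve(u, force, pin, steps, dt=0.05):
    """Explicit Euler integration; returns the final configuration."""
    u = list(u)
    for _ in range(steps):
        v = velocities(u, force, pin)
        u = [ui + dt * vi for ui, vi in zip(u, v)]
    return u


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    pin = make_pinning(64, seed=1)
    flat = [0.0] * 64
    for force in (0.05, 2.0):
        u = evolve(flat, force, pin, steps=2000)
        print("F =", force, " mean velocity =", mean(velocities(u, force, pin)))
```

Because the elastic stresses sum to zero over a periodic interface, the mean velocity in this sketch is at least $(F - \max|f_p|)/\eta$, so a drive exceeding the largest pinning force always produces motion. For a small drive the printed mean velocity is typically much smaller, although how quickly it settles depends on the particular random sample.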
Fig. 1. Schematic of a one-dimensional interface in a two dimensional disordered system illustrating the forces acting on the interface.
Keeping in mind some of the other problems of interest [1,15,16] in addition to interfaces, we will abstract to a more general problem of a $d$-dimensional elastic “manifold” ($d = 2$ for the interface) with more general interactions, which can be long-range, embodied in $J(\mathbf{r},t)$. In addition to the form of $J(\mathbf{r},t)$, the system will be characterized by the statistics of the pinning forces which impede interface motion near points where the interface has lower (free) energy; $f_p(\mathbf{R})$ will generally have only short-range correlations in space, i.e., in both $u - u'$ and $\mathbf{r} - \mathbf{r}'$. Even with these simplifying assumptions, the model of Eqs. (3) and (4) is impossible to analyze fully due to the non-linearities implicit in the $u$ dependence of $f_p(\mathbf{r}, u)$. Nevertheless, a lot of the qualitative behavior can be guessed. If the driving force is sufficiently small, then it will be insufficient to overcome the pinning forces. But if $F$ is increased slowly, it may overcome the pinning of some small segment of the interface which can then jump forwards only to be stopped by stronger pinning forces or by the elastic forces from neighboring still-pinned parts of the interface. But if the drive is larger, the neighboring regions may themselves not be strongly enough pinned to resist the increase in stress from the jumping section and may themselves jump forward, leading to an “avalanche” of some larger region of the interface; this process might or might not eventually stop [13]. If the force is large enough, and certainly if it exceeds the maximum $f_p$, then it is not possible for the interface to be pinned and the interface will move forward with some average velocity $\bar{v}$, albeit via very jerky motion both in space and time. In addition to the basic questions raised in the Introduction, we will be interested in the statistics of the sizes and dynamical properties of the avalanches [4,13,17].
The existence of a unique critical force can be established by a simple convexity argument that is valid if $J(r, t)$ is non-negative [18]. If two configurations $u_a$ and $u_b$ of the interface at time $t_0$ have the property that one is “ahead” of the other, i.e. $u_a(r, t_0) > u_b(r, t_0)$ for all $r$, then $u_a$ will be ahead of $u_b$ at all later times. This can be seen by assuming the contrary and then considering the putative first
D.S. Fisher / Physics Reports 301 (1998) 113—150
time $t_1 > t_0$ at which there is a point of contact; say $u_a(r_1, t_1) = u_b(r_1, t_1)$ at some $r_1$. Then

$$\eta\,\frac{\partial}{\partial t}\left[u_a(r_1, t) - u_b(r_1, t)\right]\Big|_{t=t_1} = \sigma[r_1, \{u_a\}] - \sigma[r_1, \{u_b\}] = \int dr' \int^{t_1} dt'\, J(r_1 - r', t_1 - t')\,[u_a(r', t') - u_b(r', t')] > 0 \tag{6}$$

since the pinning force at $r_1$ is the same in both configurations and therefore cancels out. By assumption the last expression in Eq. (6) is positive as long as $J$ is non-negative, so that for $t > t_1$, $u_a$ is again ahead of $u_b$, violating the assumption. The condition that

$$J(r, t) \ge 0 \tag{7}$$
for all $r$, $t$ plays an important role in the theoretical analysis and frequently also in the physics of these types of systems. We will refer to models with this convexity property as monotonic; they have the property that if the displacements and the driving force $F(t)$ increase monotonically with time, then so will the total “pulling force” (see later) on any segment. Except in the final section we will focus solely on monotonic models. We have shown that in monotonic models one configuration that is initially behind cannot “pass” another that is ahead of it [18]; therefore stationary and continually moving solutions cannot coexist at the same $F$; therefore $F_c$ is unique. This is a big simplification and one that will not occur generally, in particular not in some of the systems that we discuss in the last section.

For forces well above $F_c$, one can use perturbative methods to study the effects of the random pinning and compute, for example, the mean velocity $\bar v(F)$, the spatio-temporal correlations of the local velocities, and responses to additional applied forces [8]. Near $F_c$, however, life is much more complicated, as is usually the case near critical points; but for conventional equilibrium critical points the theoretical framework for dealing with the complexities is well established [14]. The most interesting behavior occurs in the critical regime; in particular one might expect processes such as avalanches to occur on a wide range of length and time scales. In order to make real progress, we will have to, at least initially, make further simplifications or approximations. One of the lessons from equilibrium critical phenomena is that analyzing simple models exactly or by controlled approximations is more useful than analyzing more realistic models by uncontrolled approximations; we will thus take the former route. But we must first find some clues as to what simplifications preserve what we hope will be the most essential features.
Near the critical force, at any given time most of the interface will be moving very slowly if at all, so that the left hand side of Eq. (3) will be close to zero. Thus, a first try might be to replace the actual dynamics with the adiabatic approximation that the forces at every point always balance exactly. Let us focus on one point $r$ on the interface and divide the stress $\sigma(r)$, Eq. (4), into the local part $\tilde J u(r, t)$ and the non-local part, $f_\sigma$, involving $u(r', t')$ for $r' \ne r$, with

$$\tilde J \equiv \int dr' \int dt'\, J(r', t') . \tag{8}$$

We then have a balance between the local force, $f_p + \tilde J u$, and the pulling force,

$$\phi(r, t) \equiv f_\sigma(r, t) + F . \tag{9}$$

The adiabatic approximation corresponds to

$$f_p(r, t) \approx \phi(r, t) - \tilde J u(r, t) . \tag{10}$$

But generally, because of the non-linearities in $f_p(u)$, for a fixed $\phi$ “applied” to $u(r)$ there can be multiple values of $u$ which satisfy Eq. (10); as we shall see these play an important role in the physics.

At this point, it is helpful to be more concrete. Let us consider a simple model of the pinning consisting of pinning sites $u^p_\alpha(r)$ distributed for fixed $r$ with random spacings between the $u^p_\alpha(r)$,

$$\Upsilon_\alpha(r) \equiv u^p_{\alpha+1}(r) - u^p_\alpha(r) , \tag{11}$$

drawn, for each $r$ and $\alpha$, independently from a distribution $P(\Upsilon)\,d\Upsilon$. The pinning force $f_p[r, u(r)] = 0$ except if $u(r)$ is equal to one of the pinning positions, while for $u(r) = u^p_\alpha(r)$, $f_p$ can take any value between zero and a yield strength, $f_y$, which is the same for each pin. A typical realization of the pinning force $f_p(u)$ on some segment of the interface is plotted in Fig. 2a. Note that for a fixed $\phi$, there are several possible values of $u$ given by the intersection of the line $\phi - \tilde J u$ with $f_p(u)$.

If $\phi$ is increased, then the particular (history dependent) force-balanced position $u(\phi)$ that the interface point is following adiabatically can become unstable (for example, the configuration denoted by the circle in Fig. 2a) and $u$ must jump to a new position. During the jump $\eta\,\partial u/\partial t$ is clearly not small and Eq. (10) will not be satisfied. But if such jumps occur more rapidly than the time scales of interest then their primary effect will be a time lag between the actual $u[\phi(t)]$ and an adiabatic solution $u^{ad}[\phi(t)]$ to Eq. (3). A way to capture this feature while preserving both the physics of the time delays and the conceptually simplifying separation of motion into adiabatic and jump parts (with $\partial u/\partial t = 0$ in the adiabatic parts for the pinning model illustrated in Fig. 2), is to require

$$u[\phi(r, t)] = u^{ad}[\phi(r, t - t_d)] \tag{12}$$

with some fixed (microscopic) delay time $t_d$. This is illustrated in Fig. 2 (top). Note that, formally, this can be accomplished by taking $\eta \to 0$ and $J(r, t) = J(r)\,\delta(t - t_d)$.
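The multiplicity of force-balanced positions in the pin model can be made concrete with a small sketch. It assumes a single segment, a hypothetical list of pin positions, and illustrative values for the pulling force (`phi`), the local stiffness (`J`) and the yield strength (`f_yield`); none of these numbers come from the text. A pin holds the segment whenever the force it must supply, `phi - J*u`, lies between zero and the yield strength.

```python
def stationary_positions(pins, phi, J, f_yield):
    """Return the pin positions u at which the pin can balance phi - J*u."""
    # A pin at u holds the segment if the required force lies in [0, f_yield].
    return [u for u in sorted(pins) if 0.0 <= phi - J * u <= f_yield]


def depinning_increment(pins, phi, J, f_yield):
    """Increase in phi needed to depin the segment from its smallest stable pin."""
    stable = stationary_positions(pins, phi, J, f_yield)
    if not stable:
        return None  # no force-balanced position exists for this phi
    u_min = stable[0]
    # The pin gives way once phi - J*u_min exceeds the yield strength.
    return f_yield - (phi - J * u_min)


pins = [0.0, 1.0, 2.0, 3.5]  # hypothetical pin positions for one segment
print(stationary_positions(pins, phi=3.0, J=1.0, f_yield=2.5))
print(depinning_increment(pins, phi=3.0, J=1.0, f_yield=2.5))
```

With these illustrative numbers two of the four pins are possible stationary positions, which is the history dependence the adiabatic picture has to keep track of.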
3. Infinite-range model: mean-field theory

The above discussion in terms of the local pulling force $\phi(r, t)$ suggests that we could try to analyze the system crudely by assuming that the spatial and temporal fluctuations in $\phi(r, t)$ are small, so that $\phi$ can be replaced by some sort of time-dependent average $\bar\phi(t)$ which would then need to be determined self-consistently from the behavior of the neighboring regions that contribute to the stress at our chosen point $r$ [8]. This is very analogous to the well-known mean-field approximation to conventional phase transitions: for example, in a magnetic system the statistical mechanics (or dynamics) of a single spin $S(r)$ in the presence of a local effective field $h_{eff}(r)$ from its neighbors (the mean field) is analyzed, and the mean field determined self-consistently from the condition that at all points the assumed $\langle S \rangle$ entering the mean field, $h_{eff}(r) = \sum_{r'} J(r - r')\langle S(r')\rangle$, is
Fig. 2. (top) Simple model of the forces on one segment of an interface. The segment can be pinned at the positions, $u^p_{i\alpha}$, of the vertical lines at which the pinning force can take any value up to the yield strength $f_y$. The intersections of the “comb” representing the pinning force $f_p^i(u_i)$ and the diagonal line $\phi - \tilde J u_i$, with $\phi$ the total pulling force from the applied force and other segments of the interface, are the possible stationary positions of $u_i$ indicated by the dots. The one of these with the smallest $u_i$, $u_i^m$, plays a special role as discussed in the text. The amount $\Delta\phi_i$ that $\phi$ needs to increase by to depin the segment from this pinning position is $w_i \tilde J$. (bottom) Dynamics of the same segment of the interface as the pulling force is increased. The actual $u_i(t)$ (dotted), the adiabatic approximation to this (solid), and the time delayed approximation (dashed) that is used in the analysis in the text are all shown.
the same as the computed $\langle S \rangle$. One of the advantages of this approximation is that it has a well-defined regime of validity: in the limit that the range of the interaction is very long [or more properly that the effective number of neighbors, $\left(\sum_r |J(r)|\right)^2 / \sum_r |J(r)|^2$, is large] the mean field theory becomes exact. But (users beware!) for fixed but finite range interactions it can still fail near critical points, as we shall see.

In order to obtain analytic results in at least some model, we will study a strictly mean-field limit where each discrete segment of the interface (which we label simply by a subscript $i$, since it is no longer really a spatial coordinate) has an independent random pinning force $f_p^i(u_i)$, of the form in Fig. 2, and the $N$ segments are all coupled together by a uniform coupling

$$J_{ij} = J(t)/N , \tag{13}$$

i.e., infinite range forces. (Note that Eq. (13) includes a self-coupling piece but its effects are negligible in the desired $N \to \infty$ limit.) Much can be done for general non-negative $J(t)$ and more complicated forms of $f_p(u)$ using the actual dynamical evolution Eq. (3) [9], but to keep things simple we will use the time-delayed adiabatic approximation discussed above with

$$J(t) = \tilde J\,\delta(t - t_d) , \tag{14}$$

and the form of $f_p^i$ of Fig. 2a with independent randomness for each $i$. For simplicity, we will focus on the strong pinning limit, which corresponds to

$$f_y > \tilde J\,\Upsilon_{max} . \tag{15}$$

It is left to the reader to show that including some of the more “realistic” features within the infinite range model does not change the qualitative or other universal aspects of the results. Our task is now simple, at least in principle: we assume some mean field

$$\bar\phi(t) = F + \tilde J\,\frac{1}{N}\sum_i u_i(t - t_d) , \tag{16}$$

compute the evolution of each $u_i(t)$ (from $\phi_i = \bar\phi$ for all $i$) from some set of initial conditions, and then adjust our guessed $\bar\phi(t)$ until the computed

$$\langle u(t) \rangle \equiv \frac{1}{N}\sum_i u_i(t) \tag{17}$$

is equal to $[\bar\phi(t + t_d) - F]/\tilde J$ for all $t$.

We first try the simplest possibility: a constant $\bar\phi(t)$. We can proceed graphically. From Fig. 2a we select for each $i$ one of the possible stationary values of $u_i$. We have many choices, the only constraint being that

$$\langle u \rangle = (\bar\phi - F)/\tilde J . \tag{18}$$

But we must be careful: if we start choosing too many large $u_i$'s, we may find that $\langle u \rangle$ will become too large. We can thus ask: what are the minimum and maximum possible $\langle u \rangle$ for a given $\bar\phi$? The minimum will turn out to be of primary interest so we focus on this: for each $i$, the minimum $u_i$, $u_i^m(\bar\phi)$, corresponds to the first pinning position (i.e. one of the $\{u^p_{i\alpha}\}$) to the right of the intersection of the line $f = \bar\phi - \tilde J u$ with the line $f = f_y$ that passes through the tips of the “comb” (representing the yield strength) in Fig. 2a (footnote 1). Since the peaks are randomly positioned,

$$\langle u_i \rangle_{min} = \langle u_i^m(\bar\phi) \rangle = (\bar\phi - f_y)/\tilde J + \langle w_i \rangle \tag{19}$$
1 The strong pinning condition $f_y > \tilde J\,\Upsilon_{max}$ ensures that $u_i^m$ is at a pinning position. The general case can be worked out similarly.
with $w_i$ the distance to the next pin, which has the distribution (footnote 2)

$$\mathrm{Prob}(w) = \int_w^\infty \frac{1}{\Upsilon}\left(\frac{\Upsilon P(\Upsilon)}{\bar\Upsilon}\right) d\Upsilon . \tag{20}$$

Here the quantity in parentheses is the probability distribution that a random point is in an interval of width $\Upsilon$ between pins; this includes the factor of $\Upsilon/\bar\Upsilon$ with

$$\bar\Upsilon \equiv \int_0^\infty \Upsilon P(\Upsilon)\, d\Upsilon \tag{21}$$

because of the presence of more points in wider intervals. Integration of Eq. (20) by parts yields $\langle w \rangle = \overline{\Upsilon^2}/(2\bar\Upsilon)$ so that

$$\langle u_i \rangle_{min} = (\bar\phi - f_y)/\tilde J + \frac{\overline{\Upsilon^2}}{2\bar\Upsilon} = \frac{F}{\tilde J} + \langle u_i \rangle - \frac{f_y}{\tilde J} + \frac{\overline{\Upsilon^2}}{2\bar\Upsilon} \tag{22}$$

from Eqs. (16) and (19). For self-consistency, we must therefore have

$$F \le F_c = f_y - \tilde J\,\overline{\Upsilon^2}/(2\bar\Upsilon) , \tag{23}$$

a non-trivial result for the critical force above which no static solutions are possible. Note that as the interaction strength, $\tilde J$, is increased, the critical force decreases. Physically, this is a consequence of the elasticity causing the system to average over the randomness more effectively: pulling a stiff object over a rough surface is easier than pulling a flexible one.

For $F < F_c$ the number of stable solutions, $N_s$, will be exponentially large with an “entropy” per segment $(\ln N_s)/N$ which is of order one well below $F_c$ but decreases to zero at $F_c$, as most of the $u_i$'s will then need to take their minimum values to ensure self-consistency.

What happens as $F$ is increased slowly from below $F_c$ so that the non-adiabaticity is negligible? Since this will certainly result in $\langle u \rangle$ and hence $\bar\phi$ increasing, we can understand the behavior from Fig. 2. As $\bar\phi$ increases, some segments become unstable and jump to their next pinning positions. But this cannot happen unless they are stuck on the pin with smallest $u^p_{i\alpha}$, i.e. $u_i^m(\bar\phi)$. Furthermore, after a jump $u_i$ will again be on the new smallest $u^p_{i\alpha}$ for the increased $\bar\phi$. Thus the $u_i$'s are gradually swept to their minimum stable positions as $F$ is increased towards $F_c$.

Above $F_c$, the segments continue to jump from one $u_i^m(\bar\phi)$ to the next. But now the time delays must play a role. If we assume a solution which progresses uniformly on average, $\langle u \rangle = \bar v t$, then

$$\bar\phi = \tilde J \bar v t - \tilde J \bar v t_d + F . \tag{24}$$
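The mean distance to the next pin that enters the critical force can be checked numerically. The sketch below assumes, purely for illustration, pin spacings drawn uniformly from [0.5, 1.5] and values `f_yield = 2.0`, `J = 1.0` that satisfy the strong pinning condition; it estimates the mean distance from a random point to the next pin by sampling and compares it with the mean-square spacing divided by twice the mean spacing.

```python
import bisect
import itertools
import random


def mean_distance_to_next_pin(spacings, n_samples, rng):
    """Estimate <w>: distance from a uniformly random point to the next pin."""
    pins = list(itertools.accumulate(spacings))  # pin positions along u
    total = pins[-1]
    acc = 0.0
    for _ in range(n_samples):
        x = rng.random() * total                 # random point on the line
        nxt = pins[bisect.bisect_left(pins, x)]  # first pin at or beyond x
        acc += nxt - x
    return acc / n_samples


rng = random.Random(0)
# Assumed spacing distribution (not from the text): uniform on [0.5, 1.5].
spacings = [rng.uniform(0.5, 1.5) for _ in range(200_000)]
mean_sp = sum(spacings) / len(spacings)
mean_sq = sum(y * y for y in spacings) / len(spacings)

w_mc = mean_distance_to_next_pin(spacings, 100_000, rng)
w_formula = mean_sq / (2.0 * mean_sp)

f_yield, J = 2.0, 1.0  # assumed values with f_yield > J * (largest spacing)
F_c = f_yield - J * w_formula
print(w_mc, w_formula, F_c)
```

The sampled and predicted means should agree to within sampling noise; wider intervals are hit more often, which is why the mean-square spacing appears rather than the mean spacing alone.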
2 We use notations like “Prob(w)” to mean the probability that the continuous variable $w$ is in the range $w$ to $w + dw$, divided by $dw$; i.e. Prob(w) is the probability density (usually called by physicists “distribution”) of $w$. One must remember, however, that if variables are changed, e.g. from $w$ to $w'$, then there is a Jacobian needed: $\mathrm{Prob}(w') = (dw/dw')\,\mathrm{Prob}(w)$.
[Note that to ensure that $u$ does not stop between pins, we again need the strong pinning condition $\tilde J\,\Upsilon_{max} < f_y$.] With all $u_i = u_i^m(\bar\phi)$, our earlier analysis immediately yields self-consistency only when

$$\bar v = \frac{F - F_c}{\tilde J\, t_d} \tag{25}$$

so that near criticality

$$\bar v \sim (F - F_c)^\beta \tag{26}$$

with the critical exponent

$$\beta = \beta_{MF} = 1 \tag{27}$$

in this infinite range mean-field model [5]. Note that a comparison of Eq. (25) for $F$
Fig. 3. Velocity versus driving force in a typical mean field model is indicated by the solid line. Note the linear dependence of $\bar v$ on $F$ just above $F_c$. The dashed line is the behavior in the absence of pinning.
Fig. 4. Schematic of hysteresis loops that occur as the force is increased from zero to the critical force, decreased to the critical force in the opposite direction, and then cycled between these values. The direction of change of F is indicated by the arrows with the “1” denoting the first increase.
suggests that some of the asymptotic forms near $F_c$, such as Eq. (25), will be correct in realistic models if the dimension of the “interface” is sufficiently large or the interactions are sufficiently long range [2,19]. We will return to these questions later, but for now we stick to the simple mean field model and ask what can be learned about “microscopic” properties, in particular the properties of avalanches that occur as the driving force is increased towards $F_c$.

4. Avalanche statistics and dynamics

In the mean-field model introduced in the previous section, the statistics and other properties of the avalanches of jumps that occur as the driving force is increased slowly can be worked out in substantial detail [20,21]. We will carry out the analysis using methods which can be generalized to provide useful information about the behavior with more realistic interactions.

Let us consider what happens for $F < F_c$ when $F$ is increased by a very small amount. If the increase is sufficiently small, then no segments will jump. But a slightly bigger increase, typically of order $1/N$, will result in one jump. Let us call the time of this jump $t = 0$ and measure times in units of the delay time $t_d$ so that with $F$ held fixed after the avalanche starts, we have simple discrete time dynamics. The first jump can trigger $n_1$ other jumps at time $t = 1$, with these triggering $n_2$ further ones at $t = 2$, etc. As long as the total number of jumps is finite, then in a large system the mean $\langle u \rangle$, and hence $\bar\phi$, will have only advanced by an amount of order $1/N$. Thus, all that will matter is the distribution of segments which are very close to jumping, i.e. those which will jump when $\bar\phi$ is increased by a small amount $\Delta\phi_i$. From Fig. 2a, we see that $\Delta\phi_i = \tilde J w_i$. For large $N$, all but very special ways of preparing the conditions before the avalanche starts will yield a distribution of these small $\Delta\phi_i$ which are independent and randomly distributed with (initial-condition dependent) density

$$\rho \equiv \rho(\Delta\phi_i = 0) ; \tag{28}$$
$\rho$ thus measures a local susceptibility to jumping. We can now immediately conclude something about the mean number of jumps $\langle n_t \rangle$ at a time $t$ after the initial jump. The $n_{t-1}$ jumps at time $t - 1$ will cause an increase in $\bar\phi$ by, on average, $\tilde J \langle\Upsilon\rangle n_{t-1}$ (where we have used $\langle\Upsilon\rangle$ rather than $\bar\Upsilon$ since the distribution of $\Upsilon$'s for the almost unstable segments, and hence $\langle\Upsilon\rangle$, could depend on the initial conditions). This will cause, on average, $\rho\langle\Upsilon\rangle\tilde J\, n_{t-1}$ jumps at time $t$, i.e.

$$\langle n_t \rangle = \rho\langle\Upsilon\rangle\tilde J\,\langle n_{t-1} \rangle . \tag{29}$$

The crucial parameter is thus $\rho\langle\Upsilon\rangle\tilde J$; if this is greater than one the avalanche will run away. If the system is below $F_c$ as we have assumed, it will eventually be stopped only when a finite fraction of the segments have jumped and the system has found a stable (and more typical) configuration. But if $\rho\langle\Upsilon\rangle\tilde J < 1$, then the expected total size of an avalanche

$$s \equiv \sum_i \Delta u_i \tag{30}$$

is simply

$$\langle s \rangle = \langle\Upsilon\rangle/(1 - \rho\langle\Upsilon\rangle\tilde J) . \tag{31}$$

As we shall see however, this is not the typical size: even very close to criticality, i.e.

$$\epsilon \equiv 1 - \rho\langle\Upsilon\rangle\tilde J \ll 1 , \tag{32}$$

most avalanches will be small. The distribution of avalanche sizes, as well as other interesting information on their dynamics, etc., can be obtained from generating function methods. Since these are a widely applicable tool, we will go through some of the details. For simplicity, we work in units with

$$\langle\Upsilon\rangle = 1 , \tag{33}$$

$$\tilde J = 1 . \tag{34}$$
We are interested in the time evolution of the displacements, in particular the increments

$$m_t \equiv \sum_i [u_i(t) - u_i(t-1)] = \sum_{\alpha=1}^{n_t} \Upsilon_{t\alpha} , \tag{35}$$

where the $\{\Upsilon_{t\alpha}\}$ are the magnitudes of the $n_t$ jumps that occur at time $t$ (with $m_t = 0$ if $n_t = 0$). The total size is simply

$$s = \sum_{t=0}^{\infty} m_t . \tag{36}$$

The joint probability distribution of all the $\{m_t\}$ given the initial jump $n_0 = 1$,

$$P\{m_t\} \equiv \mathrm{Prob}[m_0, m_1, m_2, \ldots \,|\, n_0 = 1] , \tag{37}$$

contains the information of interest. Note that vertical bars as in Eq. (37) denote “given”, i.e. conditional probability. It is useful to define a generating function of the distribution including all times up to $T$:

$$\Gamma_T\{\mu_t\} \equiv \left\langle \exp\left( i \sum_{t=0}^{T} \mu_t m_t \right) \right\rangle_P \tag{38}$$

(usually we will drop the $P$). Note that $\Gamma_T$ is simply the Fourier transform of $P$ restricted to times $\le T$. We can derive a recursion relation for $\Gamma_T$ in terms of $\Gamma_{T-1}$. For a given $m_{T-1}$, the number of jumps triggered at time $T$ will be Poisson distributed with mean $\rho\, m_{T-1}$, i.e.

$$\mathrm{Prob}(n_T \,|\, m_{T-1}) = \frac{e^{-\rho m_{T-1}}}{n_T!}\,(\rho\, m_{T-1})^{n_T} . \tag{39}$$

Then, since $m_T$ depends only on $m_{T-1}$, we can compute

$$\langle e^{i\mu_T m_T} \,|\, \{m_t\}_{t<T} \rangle = \langle e^{i\mu_T m_T} \,|\, m_{T-1} \rangle = \sum_{n_T=0}^{\infty} \mathrm{Prob}(n_T \,|\, m_{T-1}) \prod_{\alpha=1}^{n_T} \int d\Upsilon_{T\alpha}\, P(\Upsilon_{T\alpha})\, e^{i\mu_T \Upsilon_{T\alpha}} = \exp\left[\rho\, m_{T-1}\left(\langle e^{i\mu_T \Upsilon} \rangle - 1\right)\right] . \tag{40}$$

The last equality follows from Eq. (39); the resulting expression has similar $m_{T-1}$ dependence to the $e^{i\mu_{T-1} m_{T-1}}$ factor in $\Gamma_{T-1}$. We thus find that

$$\Gamma_T[\mu_0, \ldots, \mu_T] = \Gamma_{T-1}[\mu_0, \ldots, \mu_{T-2}, \lambda_{T-1}] \tag{41}$$

with

$$\lambda_{T-1} = \mu_{T-1} - i\rho\left(\langle e^{i\lambda_T \Upsilon} \rangle - 1\right) , \tag{42}$$

where

$$\lambda_T = \mu_T . \tag{43}$$

We can now iterate with the recursion relation Eq. (42) from the “initial” condition (43), eventually obtaining

$$\Gamma_T\{\mu_t\} = \Gamma_0(\lambda_0\{\mu_t\}) = \langle e^{i\lambda_0\{\mu_t\}\Upsilon} \rangle \tag{44}$$

(since $n_0 = 1$). All the information has thus gone into $\lambda_0$. As long as the system is stable, i.e. $\rho \le 1$, we can simply take $T \to \infty$ to recover the full information. [If $\rho > 1$, then there is a non-zero (and computable) probability that $s = \infty$, and more care is needed.] To get the probability distribution of the total size $s$, we simply set all $\mu_t = \mu$ and then

$$\mathrm{Prob}(s) = \int_\mu e^{-i\mu s}\, e^{i\lambda^*(\mu)} \tag{45}$$

with

$$\int_\mu \equiv \frac{1}{2\pi} \int_{-\infty}^{\infty} d\mu \tag{46}$$
and $\lambda^*(\mu)$ the (stable) fixed point solution to Eq. (42). Let us first consider the behavior for large $s$. This will be dominated by the singularity in the lower half complex-$\mu$ plane nearest to the real axis, a general property of Fourier transforms that can be seen by deforming the integration contour away from the real axis until it encounters a singularity. We only expect to have large avalanches for

$$\epsilon \equiv 1 - \rho \tag{47}$$

small, so the interesting regime is $\mu$ and $\epsilon$ small, which suggests $\lambda^*$ small. We find that to leading order,

$$\lambda^* \approx \left[-i\epsilon + \sqrt{-\epsilon^2 + 2i\beta\mu}\,\right]/\beta \tag{48}$$

with

$$\beta \equiv \langle\Upsilon^2\rangle \tag{49}$$

and the sign of the square root that has positive imaginary part for $\mu$ real chosen. For $\mu \to 0$, this gives $\lambda^* = 0$, as it must for normalization of probability, $\Gamma\{\mu_t = 0\} = 1$. The integration contour in Eq. (45) can be deformed so that it is dominated by the cut at $\mu = -\tfrac{1}{2} i\epsilon^2/\beta$ for large $s$ and we thus have, after replacing the dummy variable $\mu$ by $\mu - \tfrac{1}{2} i\epsilon^2/\beta$ and expanding in small $\mu$,

$$\mathrm{Prob}(s) \approx \int_\mu i\sqrt{2i\mu/\beta}\; e^{-i\mu s}\, e^{-s\epsilon^2/(2\beta)} . \tag{50}$$

By “power counting”, we see that the branch cut must yield a $1/s^{3/2}$ dependence; hence for large $s$,

$$\mathrm{Prob}(s) \sim e^{-s\epsilon^2/(2\beta)}/s^{3/2} \tag{51}$$

[20]. Note that for small $\epsilon$, the mean $\langle s \rangle$ is dominated by large $s$ avalanches which are rare since

$$\mathrm{Prob}(s \sim 1/\epsilon^2) \sim \mathrm{Prob}(s > 1/\epsilon^2) \sim \epsilon . \tag{52}$$
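For readers who want the intermediate step behind the leading-order fixed point with the square-root singularity, the following is a sketch of one way to obtain it, assuming the small-argument expansion of the jump-size average is truncated at second order and that unit mean jump size is used; the truncation and the replacement of the product of the susceptibility and the second moment by the second moment alone in the quadratic term are approximations made here, not steps spelled out in the text.

```latex
\begin{align*}
\langle e^{i\lambda\Upsilon}\rangle - 1
  &\approx i\lambda - \tfrac{1}{2}\beta\lambda^{2},
  \qquad \langle\Upsilon\rangle = 1,\quad \beta = \langle\Upsilon^{2}\rangle, \\
\lambda^{*}
  &= \mu - i\rho\left(i\lambda^{*} - \tfrac{1}{2}\beta\lambda^{*2}\right)
   = \mu + \rho\lambda^{*} + \tfrac{i}{2}\rho\beta\lambda^{*2}, \\
0 &\approx \tfrac{i}{2}\beta\lambda^{*2} - \epsilon\lambda^{*} + \mu
  \qquad (\rho = 1-\epsilon,\ \rho\beta \to \beta), \\
\lambda^{*}
  &= \frac{\epsilon \pm \sqrt{\epsilon^{2} - 2i\beta\mu}}{i\beta}
   = \frac{-i\epsilon + \sqrt{-\epsilon^{2} + 2i\beta\mu}}{\beta}.
\end{align*}
```

In the last step the factor of $\mp i$ has been absorbed into the square root, whose branch is then fixed by requiring a positive imaginary part for real $\mu$; the square root vanishes at $\mu = -\tfrac{1}{2}i\epsilon^{2}/\beta$, which is the branch point that controls the large-size tail.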
We see from Eq. (51) that these yield $\langle s \rangle \sim 1/\epsilon$ as expected. Two questions now arise: First, can we trust this heuristic calculation? And, second, is the large $s$, small $\epsilon$ behavior in Eq. (51) generic? The latter we have answered already: the large $s$ behavior is the same (except for the coefficient $\beta$) as long as $\beta = \langle\Upsilon^2\rangle < \infty$. (The reader is encouraged to find the behavior associated with a power law tail in the distribution of $\Upsilon$; see footnote 3.) As far as justifying the result Eq. (51), for the case in which all the jumps are the same, $\Upsilon = 1$, one can compute $\mathrm{Prob}(s)$ exactly by changing the variable of integration to $z = e^{i\lambda^*(\mu)}$ so that the integral in Eq. (45) circles an $s$th order pole at $z = 0$. Cauchy's theorem then yields

$$\mathrm{Prob}(s) = e^{-\rho s}(\rho s)^{s-1}/s! , \tag{53}$$

with, of course, $\mathrm{Prob}(s = 0) = 0$.
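The exact size distribution for unit jumps is easy to check numerically. The sketch below evaluates it through logarithms to avoid overflow, verifies normalization and the mean, and compares one large-size value with the Stirling-based asymptotic form; the value `rho = 0.8` and the truncation of the sums at 5000 are illustrative assumptions.

```python
import math


def prob_size(s, rho):
    """Prob(s) = exp(-rho*s) * (rho*s)**(s-1) / s!, evaluated via logarithms."""
    log_p = -rho * s + (s - 1) * math.log(rho * s) - math.lgamma(s + 1)
    return math.exp(log_p)


def asymptotic(s, rho):
    """Large-s form from Stirling's formula, with a 1/s**1.5 prefactor."""
    rate = rho - 1.0 - math.log(rho)  # ~ eps**2 / 2 for small eps = 1 - rho
    return math.exp(-rate * s) / (rho * math.sqrt(2.0 * math.pi) * s ** 1.5)


rho = 0.8
s_values = range(1, 5001)
total = sum(prob_size(s, rho) for s in s_values)
mean = sum(s * prob_size(s, rho) for s in s_values)
ratio = prob_size(400, rho) / asymptotic(400, rho)
print(total)                    # close to 1
print(mean, 1.0 / (1.0 - rho))  # mean size vs. 1/(1 - rho)
print(ratio)                    # approaches 1 at large s
```

The decay rate `rho - 1 - log(rho)` reduces to half the square of `1 - rho` near criticality, which is the exponential cutoff quoted for the large-size tail when the second moment of the jump size is one.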
3 Note that arbitrarily large jumps can only occur in models that also have arbitrarily large local yield stresses, $f_y$.
From the limiting large $s$ form

$$s! \approx s^s e^{-s}\sqrt{2\pi s} \tag{54}$$

the asymptotic behavior, Eq. (51), is found, confirming the validity of the approximations made in our first derivation of this result. Note that the $\rho$ dependence is simply via the $(1/\rho)(\rho e^{-\rho})^s$ factor in Eq. (53). It is nice that the exact result can be found in this case, but in general, asymptotic methods like those we have used above give more understanding and are more widely applicable. Nevertheless, to convince skeptical colleagues, a few exact results are useful!

In addition to the distribution of avalanche sizes, we are also interested in their temporal evolution. For example, one might ask what is $\langle m_t | s \rangle$, i.e. what is the time development of an average event of size $s$? This can be computed using the generating function. If we choose

$$\mu_\tau = \mu \quad \text{for } \tau \ne t \tag{55}$$

and

$$\mu_t = \mu + \nu_t , \tag{56}$$

then

$$\frac{1}{i}\,\frac{\partial \Gamma_\infty}{\partial \nu_t}\Big|_{\nu_t = 0} = \langle m_t\, e^{i\mu s} \rangle = \frac{\partial \lambda_0}{\partial \nu_t}\Big|_{\nu_t = 0} \langle e^{i\lambda^*(\mu)\Upsilon} \rangle \tag{57}$$

whose Fourier transform in $\mu$ yields

$$\int m_t\, \mathrm{Prob}(m_t, s)\, dm_t = \int m_t\, \mathrm{Prob}(m_t | s)\, dm_t\; \mathrm{Prob}(s) = \langle m_t | s \rangle\, \mathrm{Prob}(s) . \tag{58}$$

But by multiple use of the chain rule and Eq. (42),

$$\frac{\partial \lambda_0}{\partial \nu_t} = \left(\frac{\partial \lambda_t}{\partial \mu_t}\right)_{\lambda_{t+1}} \left(\frac{\partial \lambda_{t-1}}{\partial \lambda_t}\right)_{\mu_{t-1}} \times \cdots \times \left(\frac{\partial \lambda_0}{\partial \lambda_1}\right)_{\mu_0} \tag{59}$$

evaluated with all $\lambda_\tau = \lambda^*(\mu)$ and all $\mu_\tau = \mu$. For the constant $\Upsilon = 1$ case we obtain

$$\frac{\partial \lambda_0}{\partial \nu_t}\Big|_{\nu_t = 0} = \left[\rho\, e^{i\lambda^*(\mu)}\right]^t . \tag{60}$$

After shifting $\mu$ as in Eq. (50) we see that, for $\epsilon$ small and $s$ and $t$ large,

$$\langle m_t | s \rangle \sim \frac{1}{\mathrm{Prob}(s)} \int_\mu e^{-s\epsilon^2/2}\, e^{-i\mu s + i t\sqrt{2i\mu}} . \tag{61}$$

This will be dominated by $\mu \sim 1/s$ and hence the typical duration of an avalanche of size $s$ will be

$$\tau \sim \sqrt{s} . \tag{62}$$

Note that this is much less than the maximum possible duration $\tau_{max} = s - 1$. The integral in Eq. (61) can be done exactly (by writing $\mu = -ix^2/2$), yielding

$$\langle m_t | s \rangle \approx t\, e^{-t^2/2s} \tag{63}$$

for large $s$, independent of $\rho$. Again, the behavior for large $s$ and $1 \ll t \ll s$ is generic up to a coefficient $\beta$ that should appear as in Eq. (51). For the particular constant $\Upsilon$ case, the exact result can be computed from Eq. (58), yielding

$$\langle m_t | s \rangle = \frac{(t+1)\,(s-1)!\; s^{-t}}{(s-t-1)!} \quad \text{for } 0 \le t \le s-1 . \tag{64}$$
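The exact conditional profile for unit jumps can be evaluated with rational arithmetic, which makes two checks straightforward: the increments summed over all times must equal the total size, and for times much smaller than the size the profile should be close to the Gaussian-cutoff form. The choice `s = 400` and the comparison time `t = 15` are illustrative assumptions.

```python
from fractions import Fraction
from math import exp, factorial


def mean_increment(t, s):
    """<m_t | s> = (t+1) (s-1)! s**(-t) / (s-t-1)!  for 0 <= t <= s-1, else 0."""
    if not 0 <= t <= s - 1:
        return Fraction(0)
    return Fraction((t + 1) * factorial(s - 1), factorial(s - t - 1) * s ** t)


s = 400
profile = [mean_increment(t, s) for t in range(s)]
print(sum(profile) == s)  # the increments add up to the total size exactly

t = 15
exact = float(profile[t])
approx = t * exp(-t * t / (2 * s))  # asymptotic form t * exp(-t**2 / (2 s))
print(exact, approx)
```

The residual difference between the two printed values at moderate times comes mainly from the prefactor being `t + 1` in the exact result and `t` in the asymptotic one.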
Note that this result includes the rare tail of events which have total duration $t$ of order $s$; the asymptotic methods needed to obtain this are a variant of those used above, although terms neglected in Eqs. (48) and (61) become important in this region. A schematic of the evolution of a large avalanche is shown in Fig. 5. Note that, on average, an avalanche which is going to be large starts with $m_t$ growing linearly in time, independent of how big it will become; this is very different from the behavior of typical avalanches, even near to criticality, which are small. Nevertheless, typical large avalanches will have large fluctuations of $m_t$ away from $\langle m_t | s \rangle$; these can be studied by computing, e.g., $\langle m_t^2 | s \rangle$, from which it can be concluded that a typical $m_t$ for a large $s$ avalanche is the same order as $\langle m_t | s \rangle$ except for $t \gg 1/\sqrt{s}$. The probability that a large avalanche stops before time $t$ can be computed from $\mathrm{Prob}(m_t = 0 | s)$, which is found from $\Gamma_\infty(\mu_t \to +i\infty,\ \mu_\tau = \mu \text{ for } \tau \ne t)$.

So far, we have not attempted to relate the local susceptibility to jumps, $\rho$, to the original mean field model of the previous section. In general, $\rho$ will depend on the past history. But on a generic approach to $F_c$ from below (e.g. after “training” the system by a slow increase to $F_c$ from $F = -F_c$), $\rho$ will approach unity at $F_c$ and the cutoff $\tilde s \sim 1/(1-\rho)^2$ in the jump size distribution will diverge. Right at $F_c$ there will be a power law distribution of avalanche sizes; this is analogous to the power law distribution of clusters that occur at the critical point for conventional percolation [22]. One might also hope to make connections to power law spatial structures at percolation [22] as well as to spatial correlations at conventional equilibrium critical points. But to do this we will certainly have to move away from the infinite-range mean field model. This we do in the next section.
Fig. 5. Typical avalanche in a mean-field model showing the increase in size of the avalanche, $m_t$, at each time step. Note that the fluctuations in $m_t$ are larger when $m_t$ is larger.
5. Toy model: Spatial structure of avalanches

The simple mean-field model of avalanches discussed in the previous section can be extended in a relatively straightforward way to include spatial and temporal structure like that which arises for general non-negative stress transfer functions $J(r, t)$, i.e. monotonic models. Spatial coordinates for $n(r, t)$ are now needed, and corresponding generating function variables $\mu(r, t)$. In the mean-field model, the probability that a segment $u(r)$ jumps in a small time interval is proportional to the increase in pulling force $\phi(r)$ on it in that interval times a local jump susceptibility $\rho$. If we assume the same is true here, then we can generalize the recursion relation Eq. (42) to include general $J(r, t)$:

$$\lambda(r, t) = \mu(r, t) - i\rho \int dr' \int_t^\infty dt'\, J(r' - r, t' - t)\,\{\exp[i\Upsilon\lambda(r', t')] - 1\} \tag{65}$$

for the case with all jump displacements $\Upsilon$ equal. Many quantities of interest are computable by similar techniques to those in the previous section, often in terms of the spatio-temporal Fourier transform of $J(r, t)$, $J(q, \omega)$. The mean number of jumps at $r$ at a time $t$ after an avalanche is triggered at $r = 0$ at time $t = 0$ is

$$\langle n(r, t) \rangle = \int_q \int_\omega e^{-i\omega t}\, e^{iq\cdot r}\, \frac{1}{1 - \rho\Upsilon J(q, \omega)} . \tag{66}$$

The critical point is thus still given by

$$(\rho\Upsilon\tilde J)_{crit} = 1 \tag{67}$$

with

$$\tilde J = J(q = 0, \omega = 0) \tag{68}$$

and the mean total size is

$$\langle s \rangle = \Upsilon/(1 - \rho\Upsilon\tilde J) \tag{69}$$

as before. We will henceforth work in units with $\Upsilon = \tilde J = 1$. A more interesting quantity is again the conditional mean:

$$\langle n(r, t) | s \rangle = \frac{1}{\mathrm{Prob}(s)} \int_\mu \int_q \int_\omega e^{-i\mu s}\, e^{i\lambda^*(\mu)}\, e^{iq\cdot r}\, e^{-i\omega t}\, \frac{1}{1 - \rho\Upsilon J(q, \omega)\, e^{i\lambda^*(\mu)}} \tag{70}$$

with $\lambda^*(\mu)$ the fixed point solution to the mean-field recursion relation (42) or, equivalently, Eq. (65) with $\lambda$ and $\mu$ independent of $r$ and $t$. By changing variables one can show that (as in the previous section) conditional statistics like Eq. (70) are independent of $\rho$. The important physical quantity is

$$K(q, \omega) \equiv 1 - J(q, \omega) \tag{71}$$

which embodies the information on the space and time-dependent elasticity. Changing variables to $\mu \to \mu + \text{constant}$ and noting that for large $s$, small $\mu$ will dominate as before, we obtain

$$\langle n(q, \omega) | s \rangle \approx \int_\mu \sqrt{2\pi s^3}\; e^{-is\mu} \left[\frac{1}{-i\sqrt{2i\mu} + K(q, \omega)}\right] = \int d\lambda\; e^{-\frac{1}{2} s\lambda^2} \sqrt{s^3/2\pi}\; \frac{\lambda^2}{\lambda^2 + K^2(q, \omega)} . \tag{72}$$
For an interface with dissipative dynamics and local elasticity, in the absence of pinning or driving forces we have, after rescaling lengths and times, simply

$$\partial u/\partial t = \nabla^2 u \tag{73}$$

so that

$$K(q, \omega) = -i\omega + q^2 . \tag{74}$$

But to understand the general behavior, and to apply the results to other physical systems, we would like to include the possibility of long range elasticity, i.e.,

$$\int dt\, J(r, t) \sim 1/r^{d + \tilde\alpha} \tag{75}$$

in $d$ dimensions, corresponding to the static

$$K_s(q) \equiv K(q, \omega = 0) \sim |q|^{\tilde\alpha} \ \text{ if } \tilde\alpha < 2 \quad \text{or} \quad K_s(q) \sim q^2 \ \text{ if } \tilde\alpha > 2 . \tag{76}$$

We thus consider the general case of $K_s(q) \sim |q|^\alpha$ with $\alpha \le 2$, with

$$\alpha \equiv \min(\tilde\alpha, 2) . \tag{77}$$

The total displacement a distance $r$ from the avalanche starting point, during an avalanche of large size $s$, is obtained from Eq. (72) with $\omega = 0$. We must thus evaluate $\int_q e^{iq\cdot r}$ of the last expression in Eq. (72). For $r = 0$, all $q$ can contribute but $\lambda$ is small so that we can ignore the $\lambda^2$ in the denominator, yielding

$$\langle \Delta u(r = 0) | s \rangle \equiv \Upsilon \left\langle \int dt\, n(r = 0, t) \,\Big|\, s \right\rangle \sim \int_q \frac{1}{[K_s(q)]^2} \tag{78}$$

which is of order one independent of $s$ if

$$d > d_c(\alpha) = 2\alpha . \tag{79}$$

We thus see the appearance of a special critical dimension above which no segment will jump more than a few times even in an arbitrarily large avalanche. Indeed, we will see that above the critical dimension driven interfaces will have only bounded small scale roughness.

For $d < d_c$, the integral in Eq. (78) is infinite so that small $q$ (i.e. small $K$) dominates and more care is needed. The cutoff of $\int_q$ of Eq. (72) when $K \sim \lambda$ yields, with a typical $\lambda \sim 1/\sqrt{s}$ and hence $q \sim s^{-1/2\alpha}$,

$$\langle \Delta u(r = 0) | s \rangle \sim s^{1 - d/2\alpha} . \tag{80}$$

Note the appearance of a non-trivial exponent relating $\Delta u$ and $s$. It depends, as is usually the case for critical phenomena, on the spatial dimension. As mentioned in the Introduction, Eq. (80) is just the kind of scaling law we expect near critical points. Eq. (80) relates characteristic scales of displacement to the characteristic scales of avalanche size.

We can also say something about the spatial extent and shape of large avalanches, by computing $\langle \Delta u(r) | s \rangle$. For $d > d_c(\alpha)$, the integral in Eq. (72) will be cut off for $q > 1/r$ by the $e^{iq\cdot r}$ oscillations and
hence dominated by q&1/r yielding S*u(r)DsT&1/rd~2a
(81)
for 1;r;s1@2a where the upper cutoff arises when K(q&1/r)&j&1/Js. For d(d , on the other hand, as long as r;s1@2a, :qe*q>r/(j2#K2) for typical j will be # 4 dominated by q&s~1@2a and S*u(r)DsT&s1~d@2p for r;s1@2p ,
(82)
i.e. the magnitude of typical displacements is approximately independent of r in this range. In all dimensions, *u(r) will fall off rapidly for larger r. The length ¸&s1@2a
(83)
is thus some measure of the diameter of an avalanche. Let us now try to interpret these results [Note that the skeptic could compute, e.g., S[*u(r)]2DsT, etc., to provide further support for the picture below]. For d'd , the fact that S*u(r)T is much less # than unity for r<1 strongly suggests that most segments will not jump even if they are within r;¸ of the avalanche center, rather only a fraction &1/rd~2a of them will jump, and these typically only once or a few times. The number of sites that have jumped at all within a distance R(¸ of the origin is of order R2a;Rd so that the avalanche is fractal. The total number of sites that jump, its “area” A is thus, by taking R&¸, A&¸d&&s;¸d
(84)
with the fractal dimension
$$d_f = 2\alpha \quad \text{for } d > d_c. \qquad (85)$$
In lower dimensions, the picture is quite different. The approximate independence of $\langle \Delta u(r)\,|\,s\rangle$ of $r$ for $r \ll L$ suggests that each site in this region jumps a comparable number of times, $\sim s^{1-d/(2\alpha)}$ (with fluctuations around this of the same order), and hence the avalanche is not fractal but has area
$$A \sim L^{d} \sim s^{d/2\alpha} \qquad (86)$$
while the typical displacement is
$$\Delta u(r) \sim L^{\zeta} \qquad (87)$$
for $r \lesssim L$ with
$$\zeta = 2\alpha - d \qquad (88)$$
(for $d > d_c$, $\zeta = 0$).

The distribution of $s$ at the critical point that occurs at $\rho \tilde{J}$ ¶ $= 1$ is the same as in the mean-field model. This implies that
$$\mathrm{Prob}(\text{diameter} > L) \sim 1/L^{\kappa} \qquad (89)$$
with
$$\kappa = \alpha. \qquad (90)$$
The duration of an avalanche with dissipative dynamics, corresponding to
$$K(q,\omega) \approx -i\omega + |q|^{\alpha}, \qquad (91)$$
is given simply by scaling, i.e.,
$$\tau \sim L^{z} \sim s^{1/2} \qquad (92)$$
with the dynamical critical exponent
$$z = \alpha. \qquad (93)$$
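The toy-model exponents collected in Eqs. (83)-(93) can be tabulated with a short script. This is only a minimal sketch: the function name `toy_model_exponents` and the dictionary keys are illustrative choices, not notation from the text, and the marginal case $d = d_c$ is simply lumped with $d < d_c$ (where it gives $\zeta = 0$ and $d_f = d_c$).

```python
from fractions import Fraction


def toy_model_exponents(d, alpha):
    """Toy-model avalanche exponents for manifold dimension d and range exponent alpha.

    Transcribes Eqs. (83)-(93): d_c = 2*alpha, L ~ s**(1/(2*alpha)),
    kappa = alpha and z = alpha in all dimensions.
    """
    d, alpha = Fraction(d), Fraction(alpha)
    d_c = 2 * alpha
    above = d > d_c
    return {
        "d_c": d_c,
        "d_f": d_c if above else d,                 # Eq. (85) above d_c, Eq. (86) below
        "zeta": Fraction(0) if above else d_c - d,  # Eq. (88); zero above d_c
        "kappa": alpha,                             # Eq. (90)
        "z": alpha,                                 # Eq. (93)
        "size_to_diameter": 1 / d_c,                # exponent in L ~ s**(1/(2*alpha)), Eq. (83)
    }


if __name__ == "__main__":
    # (d, alpha) = (2, 2): interface with short-range elasticity, below d_c = 4.
    # (d, alpha) = (5, 2): above the critical dimension.
    for d, alpha in [(2, 2), (5, 2)]:
        print(d, alpha, toy_model_exponents(d, alpha))
```

For $d = \alpha = 2$ this gives $\zeta = 2$ and $d_f = 2$; for $d = 5$, $\alpha = 2$ it gives $\zeta = 0$ and $d_f = 4$.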
Note that the relation $\tau \sim \sqrt{s}$ is (not surprisingly) the same as in the infinite-range mean-field model.

We have found that in our toy model, many of the properties of large avalanches near the critical point (actually of any large avalanche, although they are rare away from criticality) obey scaling laws which relate various characteristic physical properties to each other by power-law relationships. For example, for $d < d_c$, an avalanche of diameter $L$ has typical size $s \sim L^{2\alpha}$, displacement $\Delta u \sim L^{\zeta}$ and duration $\tau \sim L^{z}$. This type of scaling behavior is one of the key aspects of critical phenomena in both equilibrium and non-equilibrium systems. But there is more: if we scale all lengths by a correlation length
$$\xi \sim \epsilon^{-1/\alpha} \qquad (94)$$
which is the diameter above which avalanches become exponentially rare away from the critical point, and correspondingly displacements by $\xi^{\zeta}$, durations by $\xi^{z}$, etc., then functions such as those that occur in the distribution of avalanche sizes, Eq. (51), or the average growth of the displacements during an avalanche, $\langle \partial u(r,t)/\partial t\,|\,s\rangle$, will be universal functions of scaled variables such as $L/\xi$. For example, from Eq. (72) we obtain, for $K(q,\omega) = -\eta i\omega + D\eta|q|^{\alpha}$ and $d < d_c = 2\alpha$,
$$\left\langle \frac{\partial u(r,t)}{\partial t}\,\Big|\,s \right\rangle \approx \frac{C_u}{C_t}\,\xi^{\zeta - z}\; Y\!\left(\frac{r}{\xi},\ \frac{t}{C_t\,\xi^{z}},\ \frac{s}{C_u\,\xi^{d+\zeta}}\right), \qquad (95)$$
with $s = \int dr\,\Delta u(r)$ the total size, and $C_u$ and $C_t$ non-universal (dimensionful) coefficients which set the scales of the displacements and times; these depend on the random pinning, $\eta$, $D$, etc. The universal scaling function is
$$Y(R,T,m) = \int_Q \int_\Omega \int d\Lambda\; e^{iQ\cdot R - i\Omega T}\, e^{-\frac{1}{2} m \Lambda^2} \sqrt{\frac{m^3}{2\pi}} \left[\frac{\Lambda^2}{\Lambda^2 + (-i\Omega + |Q|^{\alpha})^2}\right] \qquad (96)$$
which depends only on the dimension, the range of interactions, and the type of dynamics (i.e. dissipative), as is manifested in the low-frequency form of the stress transfer function $K(q,\omega)$. As we shall see in the next section, a similar scaling structure is expected to exist in more realistic models.

Let us now try applying the toy model results to the interface problem with $d = 2$ and short-range elasticity, so that $\alpha = 2$. This dimension is less than
$$d_c^{\text{short-range}} = d_c(\alpha = 2) = 4, \qquad (97)$$
so we have $\zeta = 2$, i.e. $\Delta u(L) \gg L$, for large avalanches. But this is clearly unphysical: our original model for the interface assumed that it was close to flat so that, at least on large scales, we need small angles of the interface, i.e. $\nabla u \ll 1$. Thus, the result Eq. (87) violates the assumptions of our original model in this case. What has gone wrong? Is the original model bad or have we made some grievous errors in trying to analyze it? The answer is the latter, and understanding why gives some clues as to how to do better.

In the infinite-range mean-field model, the odds of any given segment jumping more than once are very low as long as $s \ll \sqrt{N}$, since the odds of a specific segment jumping at all are small and each jump is almost independent by (justifiable) assumption. In the finite-range toy model, we assumed that this independence was still true, i.e. that the probability of a segment $u(r)$ jumping in a short time interval depends only on the increase in pulling force, $\Delta\phi(r)$, on it during that interval. But this is problematic: if a segment has just jumped, it is much less likely to do so again until its neighbors have caught up. The needed increase in $\phi(r)$ for a subsequent jump will typically be of order one. [Actually this is true even in our toy model, but the cumulative effect of many jumps causes the problem: the needed $\Delta\phi(r)$ to cause a large $\Delta u(r)$ that consists of many jumps should be $\Delta\phi(r) \approx \tilde{J}\,\Delta u(r) \pm O(1)$, while in the toy model it is, for $\rho = 1$ where large avalanches can occur, $\Delta\phi(r) \approx \tilde{J}\,\Delta u(r) \pm \tilde{J}\sqrt{\Delta u(r)}$. This difference is responsible for major errors when avalanches involve large $\Delta u(r)$'s, but not for $d > d_c(\alpha)$, where the $\Delta u(r)$'s remain of order one.]

Our task, then, is to somehow take into account properly the anticorrelations between local susceptibilities to successive jumps. This will certainly involve ensuring that the statistical properties of $\{f_p[r,u(r)]\}$ are preserved when all $u(r)$ are increased by any fixed amount, i.e. the statistical translational invariance of the system, which is lacking in our toy avalanche model.

Remarkably, in spite of its problems, the toy model correctly gives the statistics and properties of avalanches for large $s$ and small $\epsilon$ in dimensions greater than $d_c(\alpha)$. The basic reason for this is the observation mentioned above that each segment is unlikely to jump many times during even very large avalanches; a real understanding, however, relies on the renormalization group treatment discussed briefly in the next section.
6. Interfaces and scaling laws

Motivated by the partial success of the toy model and general scaling concepts from more conventional critical phenomena, we will now approach the interface problem by making a scaling Ansatz. Specifically, we conjecture that large avalanches near the critical point have properties which scale with their diameter $L$ (or size $s$) as powers of $L$ [3,4]. In the toy model we found that the critical exponents which characterize these scaling laws, $\zeta$, $d_f$, $\kappa$ and $z$, depend on the dimension of the elastic manifold, the power-law decay of the interactions if they are long range, and on the type of dynamics, but not on other details of the system. This is the fundamental property of “universality”. In contrast, the critical force and coefficients in the scaling functions, such as those in Eq. (95), will generally depend on details and hence are non-universal. We might thus hope that in dimensions $d < d_c(\alpha)$, for which the toy model fails, the exponents will still be universal functions of $d$ and $\alpha$.
In addition to the four exponents already introduced, there should also be an exponent which characterizes the correlation length: this is the diameter above which avalanches become unlikely (like $\xi \sim s_{\max}^{1/2\alpha} \sim \epsilon^{-1/\alpha}$ in the toy model) [23]. As $F$ increases towards $F_c$ with a generic past history, we conjecture that
$$\xi \sim 1/(F_c - F)^{\nu}. \qquad (98)$$
If in the mean-field model the local jump susceptibility $\rho$ is smooth at $F_c$ [17], so that $\epsilon \sim F_c - F$, then $\nu_{\text{toy-model}} = 1/\alpha$. This turns out to be the case for the fuller mean-field model defined in Section 3.

It appears that we have five separate exponents and we would like to understand whether there are some relations among them. Fortunately, this will turn out to be the case. First, however, we can eliminate one exponent. If, as we expect for $d < d_c$, $\zeta > 0$ so that some interface segments advance large distances in big avalanches, then the elasticity will certainly make neighboring regions advance as well, so that it seems implausible that avalanches would be fractal; thus we expect
$$d_f = d \qquad (99)$$
for $d < d_c$.

The other relations between exponents are much more subtle. The simplest assumption (and one that is borne out) is that there is one basic length scale $\xi$ within the interface and a basic scale
$$\Delta u \sim \xi^{\zeta} \qquad (100)$$
in the direction of motion. [Note that we have been sloppy and not put the needed non-universal coefficients, as appeared in Eq. (95), into Eq. (100).] Thus, for example, if we started with a flat interface and gradually increased $F$, when
$$\epsilon \equiv F_c - F \qquad (101)$$
is small, the interface would be rough on scales less than $\xi$, with
$$\langle [u(r) - u(r')]^2 \rangle \sim |r - r'|^{2\zeta} \qquad (102)$$
for $|r - r'| \ll \xi$, and be flat on longer scales with
$$\langle [u(r) - u(r')]^2 \rangle \sim \xi^{2\zeta} \qquad (103)$$
for $|r - r'| \gg \xi$, as shown in Fig. 6. Similarly, we conjecture that for $\epsilon$ small and $s$ large [4,13]
$$\mathrm{Prob}(s) \sim \frac{1}{s^{B+1}}\, H\!\left(s/\xi^{d_f+\zeta}\right) \qquad (104)$$
with $H(x \to 0) \to 1$ and $H(x \to \infty) \to 0$, i.e., a similar form to that found in the toy model. Scaling implies that the exponent for the distribution of avalanche probability as a function of diameter in Eq. (89) obeys
$$\kappa = B\,(d_f + \zeta). \qquad (105)$$
Fig. 6. Schematic of a one-dimensional interface with $F$ somewhat below the critical force. After starting from a flat initial configuration, the typical displacements of the interface are of order $\xi^{\zeta}$, with $\xi$ the correlation length. On smaller length scales, the statistics of relative displacements as a function of separation has a simple scaling behavior as shown. An avalanche that occurs as the force is increased slightly is also shown.
We can now compute the “polarizability” of the system
$$\chi \equiv d\langle u(r)\rangle / dF\!\uparrow \qquad (106)$$
where the arrow denotes that $F$ is increasing. This will be given by the sum over avalanches, which are triggered with probability $\rho\,dF$ per segment of the interface, yielding
$$\chi \sim \rho \int \frac{ds}{s}\,\frac{s}{s^{B}}\, H\!\left(s/\xi^{d_f+\zeta}\right). \qquad (107)$$
With $B < 1$, this is dominated by large $s$ so that
$$\chi \sim \xi^{(1-B)(d_f+\zeta)} \sim \epsilon^{-(1-B)(d_f+\zeta)\nu}. \qquad (108)$$
But from the earlier results about the interface roughness, Eq. (103), we also have
$$\chi \sim (d/dF)\,\xi^{\zeta} \sim \epsilon^{-(1+\zeta\nu)}. \qquad (109)$$
We thus have
$$B = (d_f - 1/\nu)/(d_f + \zeta) \qquad (110)$$
and hence
$$\kappa = d_f - 1/\nu \qquad (111)$$
[4]. Note that these relations work in the toy model for $d > d_c(\alpha)$ but not for $d < d_c(\alpha)$, due to the problems discussed earlier.

Another relation can be derived by considering the forces one avalanche exerts on another section of the interface. When an avalanche that moves a region of diameter $L \sim \xi$ occurs, we expect it to also cause quite a few nearby segments to jump, but not many that are far away. In particular, distances $R \sim \xi$ away should be borderline. The increase in pulling force from the original part of the avalanche on sections $\sim R$ away will be
$$\Delta\phi \sim \xi^{d+\zeta}/R^{d+\alpha} \qquad (112)$$
(for short-range elasticity, $\alpha = 2$, the argument is more subtle). For $R \sim \xi$, this $\Delta\phi$ should be comparable to the deviation $\epsilon$ from criticality, or else we must not have properly identified the crossover distance. Thus, we should have
$$\Delta\phi(R \sim \xi) \sim 1/\xi^{\alpha - \zeta} \sim \epsilon \sim \xi^{-1/\nu} \qquad (113)$$
yielding the scaling law
$$1/\nu = \alpha - \zeta. \qquad (114)$$
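For reference, the step from Eqs. (108) and (109) to Eqs. (110) and (111) can be written out explicitly. Equating the two powers of $\epsilon$ obtained for the polarizability $\chi$ gives the algebra sketched below; only the symbols already defined in the text are used.

```latex
\begin{align*}
(1-B)\,(d_f+\zeta)\,\nu &= 1+\zeta\nu
  && \text{(equate the exponents in Eqs. (108) and (109))} \\
(1-B)\,(d_f+\zeta) &= \frac{1}{\nu}+\zeta
  && \text{(divide by } \nu\text{)} \\
B\,(d_f+\zeta) &= d_f-\frac{1}{\nu}
  && \text{(solve for } B\text{, Eq. (110))} \\
\kappa = B\,(d_f+\zeta) &= d_f-\frac{1}{\nu}
  && \text{(use Eq. (105), giving Eq. (111))}
\end{align*}
```

As a consistency check, the toy-model values for $d > d_c(\alpha)$, namely $d_f = 2\alpha$, $\zeta = 0$ and $\nu = 1/\alpha$, give $\kappa = 2\alpha - \alpha = \alpha$, in agreement with Eq. (90), and satisfy Eq. (114) since $1/\nu = \alpha$.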
A further relation can be derived by considering the variations in the “local critical forces” $F_c(r,L)$ needed to make a region of diameter $L$ advance by of order $L^{\zeta}$. We define
$$\epsilon_l(r,L) \equiv F_c(r,L) - F \qquad (115)$$
to be, loosely, the deviation from “local criticality”. Since the volume of the region through which the section of scale $L$ will advance is $V \sim L^{d+\zeta}$, we expect that the random pinning forces in the region will make
$$\delta\epsilon_l(L) \equiv \sqrt{\mathrm{variance}[\epsilon_l(r,L)]} \gtrsim 1/L^{(d+\zeta)/2}, \qquad (116)$$
i.e. the region “does not know” $F_c$ to better than this accuracy [24]. If at scale $L \sim \xi$, $\delta\epsilon_l(L = \xi)$ were much larger than $\epsilon$, then we would expect many regions of size $\xi$ to be unstable, so large avalanches that occurred in the past should have been even larger. Thus, we should have
$$\delta\epsilon_l(\xi) \lesssim \epsilon, \qquad (117)$$
yielding
$$1/\nu \le \tfrac{1}{2}(d + \zeta) \qquad (118)$$
from Eq. (116). This can be combined with Eq. (114) to yield
$$\zeta \ge \tfrac{1}{3}(2\alpha - d). \qquad (119)$$
An upper bound of
$$\zeta \le 2\alpha - d \qquad (120)$$
follows from the observation that the toy model should overestimate the jumps of a typical segment in a large avalanche. For $d > d_c$, we expect $\zeta = 0$, i.e. interfaces which are flat on large scales.
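The lower bound on the roughness exponent follows in two lines from Eqs. (114), (116) and (117); the short derivation below is a sketch that fills in that step, using $\epsilon \sim \xi^{-1/\nu}$ from Eq. (98).

```latex
\begin{align*}
\delta\epsilon_l(\xi) \sim \xi^{-(d+\zeta)/2} \lesssim \epsilon \sim \xi^{-1/\nu}
  \quad (\xi \to \infty)
  &\;\Longrightarrow\; \frac{1}{\nu} \le \frac{d+\zeta}{2}, \\
\alpha-\zeta \le \frac{d+\zeta}{2}
  \quad \text{(insert } 1/\nu = \alpha-\zeta\text{)}
  &\;\Longrightarrow\; \zeta \ge \frac{2\alpha-d}{3}.
\end{align*}
```

Together with the upper bound of Eq. (120), this confines the exponent to the window $\tfrac{1}{3}(2\alpha - d) \le \zeta \le 2\alpha - d$ for $d < d_c = 2\alpha$; for the interface with $d = \alpha = 2$ this reads $\tfrac{2}{3} \le \zeta \le 2$.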
So far, we have not discussed the dynamics. One of the (helpful!) features of monotonic models is that at the end of an avalanche, how much each segment has moved, $\Delta u(r)$, is independent of the dynamics, although how long the avalanche takes, in what order jumps occur, etc., will depend on the dynamics [17,18]. Indeed, even the exponent $z$ can depend on the stress transfer $J(r,t)$. In particular, for long-range interactions whose effect is only felt after a time proportional to $r$ (i.e. finite velocity of information propagation) one must clearly have $z \ge 1$ (this can already be seen in the toy model if $\alpha < 1$) [25]. In general, however, we cannot say much about $z$ without much more work.

Nevertheless, using our scaling Ansatz, we can relate the dynamical behavior in the moving phase for $F$ just above $F_c$ to that for $F < F_c$ [3,4,8]. In particular, we conjecture that the jerkiness of the motion occurs on length scales up to $\xi \sim 1/(F - F_c)^{\nu}$ and times up to
$$\tau_\xi \sim \xi^{z} \qquad (121)$$
while the motion is smoother on longer scales. This jerky motion will look like the dynamics within avalanches. If we consider, crudely, the motion to be made up of avalanches of scale $\xi$ occurring at intervals of order $\tau_\xi$ within each region of diameter $\xi$, then the velocity is simply
$$\bar{v} \sim \xi^{\zeta}/\tau_\xi \sim (F - F_c)^{\beta} \qquad (122)$$
with
$$\beta = (z - \zeta)\,\nu. \qquad (123)$$
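The scaling relations (110), (111), (114) and (123) are easy to check numerically against any proposed set of exponents. The sketch below does this for the toy model; the function name `scaling_relation_residuals` is a hypothetical helper, and the value $B = 1/2$ is an inference (from $\kappa = \alpha$, $L \sim s^{1/2\alpha}$ and Eq. (105)) rather than a number quoted in this section.

```python
from fractions import Fraction


def scaling_relation_residuals(d_f, zeta, nu, B, kappa, alpha, z=None, beta=None):
    """Left-hand side minus right-hand side of each scaling relation.

    A residual of zero means the relation is satisfied by the given exponents.
    """
    residuals = {
        "(110)": B - (d_f - 1 / nu) / (d_f + zeta),
        "(111)": kappa - (d_f - 1 / nu),
        "(114)": 1 / nu - (alpha - zeta),
    }
    if z is not None and beta is not None:
        residuals["(123)"] = beta - (z - zeta) * nu
    return residuals


alpha = Fraction(2)
half = Fraction(1, 2)  # assumed toy-model value of B (see lead-in)

# Toy model above d_c = 2*alpha: d_f = 2*alpha, zeta = 0, nu = 1/alpha, z = alpha, beta = 1.
above = scaling_relation_residuals(
    d_f=2 * alpha, zeta=Fraction(0), nu=1 / alpha, B=half,
    kappa=alpha, alpha=alpha, z=alpha, beta=Fraction(1),
)

# Toy model below d_c (here d = 2): d_f = d, zeta = 2*alpha - d, nu = 1/alpha.
d = Fraction(2)
below = scaling_relation_residuals(
    d_f=d, zeta=2 * alpha - d, nu=1 / alpha, B=half, kappa=alpha, alpha=alpha,
)

print("d > d_c:", above)  # all residuals vanish
print("d < d_c:", below)  # non-zero residuals: the toy model violates the relations
```

All residuals vanish for $d > d_c$, while for $d < d_c$ they do not, which is the numerical counterpart of the statement that the toy model obeys the scaling laws only above the critical dimension.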
Note again that with the exponents of the toy model for $d > d_c$ ($z = \alpha$, $\nu = 1/\alpha$, $\zeta = 0$ and $\beta = 1$), the scaling law (123) is obeyed.

In the moving phase, the response of $\langle u(r,t) - \bar{v}t\rangle$ to a small additional static force $\delta F_q\, e^{iq\cdot r}$,
$$\chi(q) = \delta\langle u(q,\omega = 0)\rangle / \delta F_q, \qquad (124)$$
can be obtained exactly! (Note that here either sign of $\delta F_q$ is okay if it is sufficiently small.) If the variable change
$$u(r,t) = \tilde{u}(r,t) + \int_q e^{iq\cdot r}\, \frac{\delta F_q}{\tilde{J} - J(q,\omega = 0)}, \qquad (125)$$
which corresponds to transforming away the response in the absence of pinning, is made, then the statistical properties of the random pinning forces
$$\tilde{f}_p[r,\tilde{u}(r,t)] \equiv f_p[r,u(r,t)] \qquad (126)$$
as a function of $\tilde{u}$ and $r$ are identical to those of the original $f_p$. This is a consequence of the underlying statistical rotational invariance of the system (see footnote 4). We thus find that
$$\langle \tilde{u}(r,t)\rangle_{\delta F_q} = \langle u(r,t)\rangle_{\delta F_q = 0} \qquad (127)$$
so that the average response is given exactly by the second term in Eq. (125), i.e.
$$\chi(q) = \frac{1}{\tilde{J} - J(q,\omega = 0)} \sim \frac{1}{|q|^{\alpha}}. \qquad (128)$$
4 If the system is not rotationally invariant, the depinning critical behavior can be very different; see the discussion in [19].
Since $\chi$ should scale as $L^{\zeta}/L^{-1/\nu}$, we again obtain the scaling law (114). Note that since $\zeta$ is defined at $F_c$, the agreement of this result with that below threshold supports the notion that the appropriate characteristic length scales above and below threshold will diverge at $F_c$ with the same exponent $\nu$.

We have reduced the number of critical exponents to two basic ones, $\zeta$ and $z$, which relate displacement and time scales to length scales. But, so far, we have neither a means of computing these exponents, nor, more importantly, a way of justifying the scaling laws beyond the hand-waving arguments given above (or variants of these), nor even a way of understanding the claimed universality of the exponents. But with the basic scaling picture in mind, we can try to appeal to the framework of the renormalization group, which has been so successful in understanding equilibrium (and some non-equilibrium) critical phenomena.

There are two substantial difficulties. One is associated with the basic physics: we must find a way of properly dealing with two kinds of scales in the pinned phase near $F_c$. First, the jumps of small segments happen on the basic microscopic time scale, but their existence and discrete nature is crucial. Large avalanches last for times which scale with their size out to $\tau_\xi$, which is very large just below $F_c$, and thus avalanche activity spans a broad range of scales. But, in addition, there is the time between avalanches set by the rate at which $F$ is changed; as long as this is slow enough, it does not really matter, except that the important anticorrelations between successive avalanches in the same region will only be felt on this very long time scale.

The other main difficulty is associated with the history dependence in the pinned phase. For example, what should one average over to get sensible quantities? If the stress transfer, $J(r,t)$, is non-negative, this difficulty can be circumvented if $F$ is always increasing (or always decreasing): if this is the case, then from any stable initial condition the pulling force on every segment will increase monotonically with time. This feature is important for the physics as well as drastically simplifying the theoretical analysis. Nevertheless, even in such monotonic models, there will be history dependence. But near $F_c$ this should, at worst, only modify non-universal coefficients, as long as the critical force is approached monotonically starting from a much lower force. A natural reproducible history results from starting at $F = -F_c$.

For monotonic models, a perturbative renormalization group (RG) analysis for dimensions near the upper critical dimension has been carried out [2–4]. The first result is that for $d > d_c(\alpha)$, the decreased local susceptibility to jumping after a segment has jumped is irrelevant in the RG sense, except on the very long time scales during which the whole system has advanced. In particular, this justifies our claim that, while the critical force and other non-universal properties will not be given correctly, universal features of avalanches such as critical exponents and scaling functions [e.g. $H$ in Eq. (104) and $Y$ in Eq. (95)] will be given exactly by the toy model for $d > d_c$. In the moving phase, some of the effects left out of the toy model will be important, but mean-field results like $\beta = 1$ and $\zeta = 0$ will obtain (see footnote 5).
5 The velocity exponent, $\beta$, in mean-field theory is not fully universal. It depends on whether or not the pinning forces are continuously differentiable and, if not, on the nature of the singularities in $f_p(u)$. For all but discontinuous forces, this causes long time scales in mean-field theory associated with the acceleration away from configurations that have just become unstable. With short-range interactions, these should not play a role due to the jerky nature of $u(r,t)$ caused by jumps. Thus the mean-field models with $\beta_{MF} = 1$ are the “right” ones about which to attempt an RG analysis. See [19].
For $d < d_c$ many new results can be obtained from the RG analysis. In addition to the derivation of universality, scaling laws and perturbative computations of exponents that arise from a new critical fixed point of the RG, it is possible in principle (although not yet carried out) to compute such quantities as the new universal scaling function that replaces $Y$ in Eq. (95), anticorrelations between avalanches in the same region, local velocity correlations in the moving phase, etc. Here we just quote the results for the exponents [2–4]. All the scaling laws derived heuristically above are found to be obeyed. The exponent $\zeta$ is
$$\zeta \approx \tfrac{1}{3}\,[d_c(\alpha) - d] \qquad (129)$$
to all orders in powers of $d_c - d$; indeed, this result, which saturates the lower bound Eq. (119), may well be exact. Numerical computations for $d = \alpha = 1$ yield
$$\zeta \approx 0.34 \pm 0.02, \qquad (130)$$
consistent with $\tfrac{1}{3}$ [25,26]. The dynamic exponent is found to be
$$z = \alpha - \tfrac{2}{9}\,[d_c(\alpha) - d] + O[(d_c - d)^2]. \qquad (131)$$
We thus find an interesting effect: the non-linearities of the avalanche process cause disturbances of the interface to propagate more rapidly at long scales than for an unpinned interface. The velocity exponent is then, from scaling,
$$\beta = \frac{z - \zeta}{\alpha - \zeta} \approx 1 - \frac{2(2\alpha - d)}{9\alpha} < 1, \qquad (132)$$
i.e. a concave downwards $\bar{v}(F)$ curve. Making the somewhat dangerous extrapolation to the interface with $d = \alpha = 2$, we get predictions of
$$z \approx \tfrac{14}{9}, \quad \zeta \approx \tfrac{2}{3}, \quad \beta \approx \tfrac{2}{3}. \qquad (133)$$
At this point, except for some experiments on interfaces between two fluids that are being driven through porous media, which give somewhat inconclusive results [5], there are no experiments, of which this author is aware, that test these results for interfaces. But as we shall see in the next section, there have been both experiments and numerical tests carried out on other systems.
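The numbers in Eq. (133) follow from the first-order RG expressions for $\zeta$ and $z$ together with the scaling laws (114) and (123). The sketch below evaluates them with exact fractions; the function name `rg_exponents` is illustrative, the $O[(d_c - d)^2]$ corrections to $z$ are simply dropped, and for $d \ge d_c$ the expansion parameter is clamped to zero so that the mean-field values are returned.

```python
from fractions import Fraction


def rg_exponents(d, alpha):
    """First-order RG estimates of the depinning exponents for an elastic manifold."""
    d, alpha = Fraction(d), Fraction(alpha)
    eps = max(2 * alpha - d, Fraction(0))     # d_c(alpha) - d, clamped at mean field
    zeta = eps / 3                            # roughness exponent, Eq. (129)
    z = alpha - 2 * eps / 9                   # dynamic exponent to first order in eps
    nu = 1 / (alpha - zeta)                   # from the scaling law 1/nu = alpha - zeta
    beta = (z - zeta) * nu                    # velocity exponent, Eq. (123)
    beta_expanded = 1 - 2 * eps / (9 * alpha) # Eq. (132) expanded to first order
    return {"zeta": zeta, "z": z, "nu": nu, "beta": beta, "beta_expanded": beta_expanded}


if __name__ == "__main__":
    # Interface with short-range elasticity: zeta = 2/3, z = 14/9, beta = 2/3.
    print(rg_exponents(2, 2))
    # Manifold with d = alpha = 1: zeta = 1/3, z = 7/9.
    print(rg_exponents(1, 1))
```

Note that the ratio form $\beta = (z - \zeta)/(\alpha - \zeta)$ gives $2/3$ at $d = \alpha = 2$, whereas the expanded form in Eq. (132) gives $7/9$; the two agree only to first order in $d_c - d$, which is one measure of how uncertain the extrapolation is.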
7. Applications and complications

In the previous section we saw how scaling ideas and intuition that arise from the simple solvable toy model could be used to develop an understanding of the behavior of interfaces near the critical driving force that makes them move. Renormalization group methods can then be used to carry out concrete calculations and justify many of the conjectures. In particular, the general structure and existence of scaling laws and universality follow rather directly from the existence of an RG critical fixed point.

One of the advantages of this framework is that it enables us to apply the general structure to other systems, such as (but not limited to) different dimensionalities and ranges of interactions. But in addition we can introduce various physical features left out of the simple models, even our
relatively realistic Eq. (3), and ask whether they are relevant in the RG sense of changing (or destroying) the universal aspects of the critical behavior. In this section we will discuss several of the physical systems mentioned in the Introduction with an eye both to applying the ideas and seeing how they must be modified (or thrown away!) to account for the appropriate extra physics.

7.1. Charge density waves

The best and perhaps the only real test of critical behavior near a depinning transition of an elastic manifold in a random medium is that of charge density waves (CDW) driven by an electric field [7]. The periodic electron density waves that occur within this class of materials are incommensurate with respect to the underlying crystalline periodicity in one direction, so they could move freely in this direction (contributing to the current proportionally to their velocity) except for being pinned by randomly positioned impurities; see Fig. 7. These CDWs are thus three-dimensional elastic manifolds with short-range interactions (the Coulomb interactions are screened) in a three-dimensional random medium. Inertial effects are negligible. The primary difference from interfaces is in terms of the displacements $u(r)$: in the frame of the crystal the random pinning forces $f_p[r,u(r)]$ are periodic under
$$u(r) \to u(r) + \lambda_{CDW} \qquad (134)$$
for all $r$, where $\lambda_{CDW}$ is the wavelength of the CDW. Although the resulting behavior near the critical driving force is not much changed for $d > d_c = 4$, the extra symmetry associated with this periodicity changes the universality class for $d < d_c$. In particular, the exponents become [9]
$$\zeta = 0, \quad \nu = \tfrac{1}{2}, \quad z = 2 - \tfrac{4-d}{3} + O(4-d)^2, \qquad (135)$$
Fig. 7. Schematic of a crystal with a charge density wave that is incommensurate with respect to the underlying lattice periodicity of the solid. The maxima in the difference between the actual electron charge density and that without the CDW are shown. An electric field pulls on the CDW which could move freely, except for being impeded by pinning forces of the impurities.
yielding scaling behavior above $F_c$ involving the velocity exponent
$$\beta = z\nu = 1 - \tfrac{1}{6}(4 - d) + O(4-d)^2 \approx \tfrac{5}{6} \qquad (136)$$
in $d = 3$. Experiments [27] carried out on CDWs and numerical simulations yield $\beta \approx 0.75$–$0.9$ over $2\tfrac{1}{2}$ decades of $F - F_c$, in surprisingly good agreement with the theoretical prediction.

But should we really expect that CDWs will exhibit the critical behavior of an ideal elastic medium? In a precise sense, certainly not. Thermal fluctuations (or even quantum fluctuations) can cause sections of the CDW to overcome the barriers caused by the pinning and jump to lower-energy local minima. For any non-zero electric field, this will cause the CDW to gradually creep forwards, thereby contributing to the current. The critical behavior near the fluctuationless $F_c$ will thus be smeared out by fluctuations, which are hence a relevant perturbation [8,28]. This is quite analogous to the role of a magnetic field near ferromagnetic phase transitions, which also smears out the critical singularities. But as in the magnetic case, if the perturbation is small enough, critical behavior can still be observed over a wide range of scales and $F - F_c$, with the smearing only occurring very close to $F_c$. In general, fluctuation effects appear to be quite substantial in CDW systems and the critical behavior is smeared out. But the experiments quoted above [27] were performed by applying an ac driving force in addition to the dc drive; this reduces the effects of thermal fluctuations and appears to yield a wide range over which they do not play much of a role.

Another complication in CDWs, which might also be reduced by an additional ac drive, is defects in the CDW lattice, especially dislocations [29]. The pinning is very weak in CDWs, implying that the stresses that would cause dislocations to form are unlikely to occur except perhaps on long length scales. In this weak pinning regime, the effects of dislocations are poorly understood. It now appears, however, that dislocations will not destroy the existence of the CDW phase in equilibrium [30]; this is not directly relevant to the non-equilibrium physics of interest here but may point to progress also on the more relevant issues. Nevertheless, at this point, whether or not defects always destroy the elastic depinning critical behavior that we have studied is an open question.

7.2. Superconductors

One system that is quite similar in spirit to a CDW, but for which the strength of the pinning forces can readily be varied, is a vortex lattice in a type II superconductor. Vortex lattices are pinned by impurities that impede their motion under the action of a transport current which exerts a force on the vortices [6]. Since the voltage is proportional to the mean vortex velocity, $\bar{v}(F)$ plots are simply voltage-current curves, $F_c$ being simply proportional to the critical current density. By making bulk superconducting samples, thin superconducting films or wires, or a normal layer sandwiched between two bulk superconductors, three-, two- and one-dimensional systems can all be studied as well as, in the last case, a two-dimensional lattice of roughly parallel vortex lines with no dislocations allowed. In principle, many systems and regimes can thus be investigated, subject to the complication of non-uniform forces on the vortices, dissipative heating effects, and natural tendencies of experimentalists to care more about the magnitude of the critical current density than about what happens when the superconductor fails, i.e. when the critical current is exceeded! Surprisingly, although thermal-fluctuation-driven transitions in both clean and dirty superconductors have received a lot of attention recently [6,31], the behavior near critical currents has received far less [32].
If the pinning is very strong, the vortex lattice will be destroyed. What then (when thermal fluctuations can be neglected) will be the qualitative behavior near the critical force? In the case of two-dimensional films in a perpendicular magnetic field that produces point-like vortices, the vortex flow just above $F_c$ will almost certainly be confined to a sparse interconnected network of irregular channels across the system, as sketched in Fig. 8. Some preliminary theoretical studies [33] and experiments on the critical behavior that can arise in this regime have been carried out, and experiments that “see” the vortices performed [34], but even this simple case is far from understood. For more complicated cases of intermediate-strength pinning or when three-dimensional effects are important, even less is known. One thing that is clear, both theoretically [33] and experimentally [32], is that the critical force is history dependent. This and other history dependence, which can be caused by defect motion, is certain to play an important role in the physics.

We note that in principle (and soon, if not quite now, in practice [34]) measurements of local vortex flow on small scales will be possible in some of the parameter regimes of greatest potential interest. Both macroscopic and microscopic information should be available in these superconducting vortex systems.

Leaving the effects of lattice defects as an intriguing puzzle, we now turn to another type of physical effect that we have so far left out: the role of inertia and other related phenomena.

7.3. Cracks

In contrast to CDWs and vortex lattices, interfaces between phases are “topological” and hence have elasticity which is robust for weak pinning, although for strong pinning they can become fractal and a very different “invasion percolation” behavior occurs. Various other elastic systems are also not directly susceptible to destruction by defects. In particular, the front of a planar tensile
Fig. 8. A thin superconducting film in a magnetic field, H, perpendicular to the film. A supercurrent provides a driving force for the vortices to move across the film. Just above the critical current for vortex motion, the vortices in most of the film can remain stationary with the vortex flow confined to a sparse network of vortex channels, as shown. The electric field is proportional to the average vortex flow rate.
Fig. 9. A solid with a crack under tensile loading. If the load is increased, the crack front can progress through the solid. In some circumstances, the crack is confined to a plane; this is the case discussed in the text.
crack in a heterogeneous solid has long-range $1/r^2$ elastic forces, mediated by the elasticity of the solid, which act to keep the crack front roughly straight; see Fig. 9 [15,16,34]. But random variations in the local toughness (the energy per unit area needed for crack growth) act to deform the crack front. This system is thus an example of an elastic manifold with $d = 1$ and $\alpha = 1$. Numerical studies [25,26] with quasistatic (i.e. instantaneous on the time scales of the crack motion) stress transfer, which corresponds to $J(r,t) \propto \delta(t)/r^2$ and hence is a monotonic system, yield results for $\zeta$, $z$, $\nu$, and $\beta$ in excellent agreement with the $d = 2 - \epsilon$ expansion results [2]. But the resulting $z \approx \tfrac{7}{9}$ is less than one [2,25]. Therefore, even if the basic propagation of disturbances along the crack front is slow on small scales (so that the quasistatic approximations would seem to be justified), near the critical point disturbances would propagate arbitrarily fast, since the characteristic scale of the “velocity” along the crack front would scale as $\xi/\xi^{z}$, which diverges as $\xi \to \infty$. This is clearly unphysical, and so elastic wave propagation effects must change the asymptotic critical behavior.

The simplest way to take the time delays associated with wave propagation into account is to use
$$J(r,t) \sim (1/r^2)\,\delta(t - r/c) \qquad (137)$$
with $c$ an elastic wave velocity. This is still a monotonic model, so only the dynamic exponent $z$ will change. Theoretical arguments and numerical simulations [25] for this case yield $z = 1$ exactly, so there is no problem with causality. From Eq. (123), we obtain $\beta = 1$.

But real elastic wave propagation is much more complicated [25,35]. If the crack grows by a segment of the crack front jumping forward only to be stopped by a tougher region, then a point on the crack front a distance $r$ away will initially feel no change in stress. Furthermore, when the longitudinal waves with velocity $c_l$ arrive, the stress that tends to open the crack, analogous to $\sigma(r,t)$ for the interface model, will actually decrease, since $J(r, t \gtrsim r/c_l) < 0$ [25,36,37]. Only when the Rayleigh waves with slower velocity $c_R \lesssim c_t < c_l$ arrive will the opening stress become positive. But when this occurs, there will be a rapid jump to a large peak value of the opening stress followed by
a gradual fall-off to the eventual static stress increase, which is
$$J(r,\omega = 0) = \int_0^{\infty} J(r,t)\,dt \sim 1/r^2. \qquad (138)$$
This is highly non-monotonic behavior! Such stress overshoots can cause segments of the crack front to jump forward that would not have been triggered by the static stress changes. How often this occurs will depend on the heights of the stress peaks relative to the static stress increases, i.e. on the size of the overshoots. If the jumps of segments of the crack are slow and smooth (with duration long compared to the sound travel time across the jumping segment), then the stress overshoots will only occur far from the jumping segment and have small amplitude. Their effects should then be small and the quasistatic behavior should be observable, except, as we shall see, very near $F_c$. But any stress overshoots will at least occasionally cause some extra jumping and we must understand their effect.

Some intuition can be gleaned by considering what would happen if the stress increases never decayed to their static value, but stayed at their peak strength. This would correspond, roughly, to increasing the elastic interactions of the crack with itself. As we saw in the mean-field model, such enhanced elasticity causes the critical force to decrease because of more effective averaging over the randomness [see Eq. (20)]. Thus we would expect avalanches to run away at a force below the quasistatic $F_c$. A careful analysis of the effects of realistic stress transfer peaks followed by decay towards the static stress shows that the same basic picture obtains [11,25]. Consider a segment which, at the end of the quasistatic approximation to an avalanche, has a final stress on it which is very close to being enough to make it jump again. Then any overshoot in the stress on this segment that occurred during the actual avalanche will, if the resulting peak stress is above the quasistatic final stress, cause it to undergo an extra jump.
If, as is indeed the case, there is a small density of these extra jumps roughly randomly distributed within the avalanche, each of the extra jumps will trigger extra avalanches, for which one again has to consider the effects of the stress overshoots which make more avalanches, etc. One can show that this process will always run away for a sufficiently large avalanche, independent of how small the stress overshoots might be [11,25]. The stress peaks are thus a relevant perturbation which will change the critical behavior. Indeed, the runaway of large avalanches implies the coexistence of moving and stationary states and strongly suggests that the depinning transition in the presence of stress pulses will be either first order, with a discontinuous and probably hysteretic $\bar{v}(F)$, or have a continuous $\bar{v}(F)$ but with a smaller critical force and different critical behavior [25].

What happens if the stress overshoots are small, say with strength parameterized by $R$? Then only avalanches with

$$L \ge L_R \sim 1/R^{1/y_R} \tag{139}$$
will typically be much affected by the stress peaks. The exponent $y_R$ is the RG eigenvalue for the relevant perturbation, $R$; it has been computed analytically [11,25] and checked numerically [25] for some types of stress overshoots, but its value is not yet understood analytically for the type of long-tail stress peaks that occur in the crack problem [25,37].

Experiments on very slowly advancing cracks confined to a plane [38] yield an estimate of the roughness exponent, which characterizes the deviations of the crack front from a straight line, of

$$\zeta_{\text{crack front}} \approx 0.55 \pm 0.05 , \tag{140}$$
substantially larger than the quasistatic prediction of $\tfrac{1}{3}$. However, the large corrections to scaling predicted by the RG analysis [25,39] may account for this discrepancy; beware of quoted error bars for exponents! It is possible, however, that real elastodynamic effects play a role even for slowly moving planar cracks.

A related challenge is to understand the crack front distortions which manifest themselves in the roughness of the fracture surface left behind after a crack front passes, i.e. after the material is broken [16,36,40]. Again, elastodynamic effects may play a crucial role. In certain situations, cracks can form multiple branches, as in the shattering of glass, or just a few small branches near the crack front [35]. Under what circumstances crack branching might affect the onset of crack motion or the large scale shape of fracture surfaces is another open question.

7.4. Faults

So far, except for crack front roughness, we have only discussed systems for which the simple measurements are of macroscopic properties such as $\bar{v}(F)$. But there is one natural system in which, even though the length scales are huge, the measurements are of "microscopic" type: specifically, the statistics and other properties of "avalanches". A crude model of a geological fault is the motion of two large blocks of crust in contact along a disordered but roughly planar surface, the fault, which move relative to each other; see Fig. 10. The motion is driven by forces exerted from far away (e.g. from the viscoelastic region beneath the crust) that are transmitted to the fault by the elasticity of the blocks. Rather than being driven by a constant force, the blocks are driven at a fixed time-averaged velocity that is extremely slow (of order millimeters to centimeters per year) compared to other characteristic velocities, e.g. the speed of sound in the rock [10].
Segments of the two dimensional fault plane interact with each other by long range $1/r^3$ stress transfer (as can be seen by dimensional analysis) and are pinned by heterogeneities on the two fault surfaces that have to rub past each other [11]. Earthquakes, of course, are just the "avalanches" that occur when some segment of the fault becomes unpinned and jumps forward, triggering others [12,41]. The range of length scales of slipping regions is huge: from meters to hundreds of kilometers, with slips (changes in relative displacement across the fault plane, $\Delta u$) from millimeters to tens of meters.
Fig. 10. Schematic of two segments of crust which move relative to each other along a fault plane in a sequence of earthquakes. The driving forces are transmitted to the fault by the two halves of the crust moving with relative velocity, v.
Most theoretical approaches to modelling faults have involved physics driven by friction laws and inertial effects [42], with intrinsic heterogeneities playing a secondary role if included at all. An alternative approach [11], motivated by the systems we have discussed here, might be to start from the opposite end, a strongly disordered fault with quasistatic stress transfer, and then bring in other features such as elastic wave propagation (which is the manifestation of inertia) and frictional weakening. This has the advantage of building on an established theoretical framework in which the importance, or lack thereof, of various features can be considered. This is particularly useful as the parameter space in even relatively simple models is very large and thus hard to explore numerically, at least in ways that might convince a skeptic of the interpretations or predictions!

If the forces that one side of the fault exerts on the other are independent of the history and the local slip velocities $\partial u(\mathbf{r},t)/\partial t$, and the stress transfer is quasistatic, i.e. $J_{qs} \sim \delta(t)/r^3$, then this system falls into the class of generalized interface models with $d = 2$ and $\alpha = 1$. But for this $\alpha$, two is exactly the upper critical dimension so we can use the results of the toy model. The fault system acts as if it were driven just below $F_c$ (by an amount that goes to zero for large system size) so that we can set $\epsilon = 0$.$^6$ [Note that there should be logarithmic corrections to various quantities, as is usual at critical dimensions, but their effects are small.] The toy model is sufficiently simple that one can include a realistic way of driving the faults, but for now we consider the simpler idealized infinite system. Several results can immediately be used [11]: the area of the region of the fault that slips in an earthquake will scale as the square of its diameter, $L$, the typical slip in this region will depend only logarithmically on $L$ (i.e. $\zeta = 0$), and the size, in this context called the moment, will scale as

$$M = \int d\mathbf{r}\, \Delta u(\mathbf{r}) \sim L^2 . \tag{141}$$
Note that the moment magnitude often quoted in newspapers is

$$m = \log_{30} M + \text{constant} , \tag{142}$$

the base 30 being for historical reasons, particularly consistency with the older Richter magnitude scale. The probability of different sized events in the simple model is

$$\mathrm{Prob}(\text{moment} > M) \sim 1/M^{B} \tag{143}$$

with $B = \tfrac{1}{2}$.

How do these predictions compare with observations? The moment of quakes is the best measured quantity, although for medium and larger size events, the duration $\tau$ can also be measured well. The diameter (or generally the dimensions of the slipped region) and the slip $\Delta u$ can only be measured directly if the quake involves slip where the fault breaks through the surface of the earth. However, earthquake data are usually interpreted in terms of a crack picture in which
6 Whether critical behavior is considered “self-organized” or not is somewhat a matter of taste: if the systems we are considering are driven at very slow velocity, then they will be very close to critical. In another well known situation, when a fluid is stirred on large scales, turbulence exists on a wide range of length scales extending down to the scale at which viscous dissipation occurs. In both of these and in many other contexts the parameter which is “tuned” to get a wide range of scales is the ratio of some basic “microscopic” scale to the scale at which the system is driven.
$\zeta = 1$ and $z = 1$ so that

$$M_{\text{crack}} \sim L^3 \sim \tau^3 \tag{144}$$

[10]. This scaling appears to be reasonably well justified observationally for large earthquakes, and the $M \sim \tau^3$ scaling perhaps also for intermediate size events. But for small events, i.e. most of them, only the moment can be measured reasonably reliably. Thus, perhaps, our $M \sim L^2 \sim \tau^2$ might not be ruled out for small events although it probably is for large ones. Note however, that in quakes of magnitude $m = 7$ or so and above, the linear size (e.g. the diameter) is comparable to the depth of the crust (and other relevant length scales) so different scaling laws should in any case be expected for large earthquakes, with a crossover between the two regimes [43].

The question of earthquake magnitude statistics is a complicated and controversial one. The famous Gutenberg-Richter law [44] states that the distribution of all earthquakes approximately satisfies Eq. (143) with a $B \approx \tfrac{2}{3}$, although it changes somewhat (perhaps for the reasons mentioned above) around magnitude seven. The data cover over 12 orders of magnitude in the moment, equivalent to, assuming the crack scaling Eq. (144), four orders of magnitude in length scale. Understanding the Gutenberg-Richter law is a very interesting problem, tied closely to that of understanding the apparent fractal distribution of some geological features such as fault networks [41]. But our subject here is individual faults, at least if they really exist as well defined entities, a question which is also controversial. Observations of some highly disordered faults have been fitted, over a reasonable range, with $B \sim 0.5$ to $0.6$ [45]. This is perhaps encouraging although the apparent rough agreement with our Eq. (143) may well be fortuitous.
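The bookkeeping behind these numbers can be sketched in a few lines. The snippet below is only an illustration under the scaling forms quoted in the text: it assumes $M \sim L^{2+\zeta}$ (slipped area times typical slip), $m = \log_{30} M + \text{constant}$ as in Eq. (142), and a pure power law as in Eq. (143). The function names are hypothetical and do not come from any seismological library.

```python
import math

def moment_exponent(zeta, d=2):
    """Exponent in M ~ L**(d + zeta): slipped area L**d times typical slip L**zeta."""
    return d + zeta

def length_decades(moment_decades, zeta):
    """Decades of length scale spanned by a given number of decades in moment."""
    return moment_decades / moment_exponent(zeta)

def magnitude_b_value(B, base=30.0):
    """If Prob(moment > M) ~ M**(-B) and m = log_base(M) + const,
    then Prob(magnitude > m) ~ 10**(-b * m) with b = B * log10(base)."""
    return B * math.log10(base)

print(length_decades(12, zeta=1))    # crack picture, zeta = 1
print(length_decades(12, zeta=0))    # quasistatic toy model, zeta = 0
print(magnitude_b_value(2 / 3))      # Gutenberg-Richter value of B
print(magnitude_b_value(1 / 2))      # simple quasistatic model, B = 1/2
```

With the crack scaling, 12 decades in moment correspond to 4 decades in length, as stated above; with $\zeta = 0$ the same moment range would span 6 decades in length. For $B \approx 2/3$ the decay per unit of magnitude comes out close to one decade.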
Most faults, however, exhibit quite different behavior: a small regime (if at all) of power law statistics that can be fit by somewhat larger $B$s, then a gap with few events and a narrow peak (which dominates the total slip) at a characteristic earthquake size in which the whole fault section slips [45]. This behavior, sketched in Fig. 11 (bottom), appears to be very different from the power-law scaling behavior we have found in our simple quasistatic heterogeneous model. However, behavior qualitatively similar to Fig. 11 (bottom), with large characteristic earthquakes, has been found in simulations of models with inertia and frictional weakening but no intrinsic randomness, although analytic understanding of these results is rather limited [42].

Can we understand qualitatively how both power-law and characteristic-earthquake types of behavior might arise starting from our simple randomness-dominated picture? Elastodynamic effects, as for the crack front problem discussed earlier, will result in peaks in the dynamic stress transfer that are larger than the static stress transfer. In addition, frictional weakening (the tendency for dynamic frictional forces to be less than static frictional forces) will also be present. Both of these effects will cause extra segments to slip that would not have slipped in quasistatic events. These, as for the crack front, will cause runaway of large earthquakes which will eventually be stopped only by strong "pinning" caused by the boundaries of the fault section, or by the unloading of the shear stress that is driving the fault. If the stress overshoot and frictional weakening effects are small, there should be a wide regime of power law scaling with $B = \tfrac{1}{2}$ which can extend out to the largest quakes in the fault section.
But if these effects are strong, as one might guess would be the case for a weakly disordered fault that is close to planar, intermediate size events will not occur and the behavior will be qualitatively similar to the characteristic-quake behavior observed in many faults [11]. Which behavior obtains would be determined by how the length scale above which events typically run away compares with the length of the fault section.
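A caricature of this runaway can be simulated with a branching process. The sketch below is not the model analyzed in [11,25]. It assumes, purely for illustration, that each slipping segment independently triggers up to two further segments, each with probability `p`, so that `p = 0.5` plays the role of the critical quasistatic case and `p > 0.5` mimics the extra triggering caused by stress overshoots and frictional weakening. The names `avalanche_size` and `runaway_fraction`, and the cutoff `cap` (standing in for the size of the fault section), are hypothetical.

```python
import random

def avalanche_size(p, rng, cap=2000):
    """Total number of segments that slip in one toy avalanche.

    Each slipping segment triggers 0, 1 or 2 further segments, each with
    probability p (mean number triggered is 2p, critical at p = 0.5).
    The avalanche is stopped once it reaches `cap` segments.
    """
    active = 1
    size = 1
    while active > 0 and size < cap:
        triggered = 0
        for _ in range(active):
            triggered += (rng.random() < p) + (rng.random() < p)
        size += triggered
        active = triggered
    return min(size, cap)

def runaway_fraction(p, events=1000, cap=2000, seed=1):
    """Fraction of avalanches that reach the cutoff ('characteristic' events)."""
    rng = random.Random(seed)
    hits = sum(avalanche_size(p, rng, cap) >= cap for _ in range(events))
    return hits / events

for p in (0.5, 0.52, 0.55):
    print(p, runaway_fraction(p))
```

In this toy version only a small fraction of events reaches the cutoff at `p = 0.5`, and that fraction shrinks as `cap` is increased, whereas for `p` slightly above 0.5 a finite fraction of events runs away however large the cutoff is made.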
Fig. 11. Schematic of two types of earthquake statistics observed on different geological faults; see [42]. The probability of quakes with a given moment is plotted on a log—log scale. (top) A fault with power law statistics of events. (bottom) A fault that exhibits “characteristic earthquake” behavior which refers to the peak in the distribution for large events of a characteristic size, with not many intermediate size events occurring.
Although this basic scenario of fault dynamics that was found in the simple heterogeneous fault model may have nothing to do with what happens in the real earth, analog laboratory systems might be found, or synthesized, in which some of these ideas could be tested.

In these lecture notes, we have outlined a framework for studying the non-equilibrium "critical" behavior that occurs near the onset of macroscopic motion in many driven systems. This enables us to understand the origins, nature, and statistics of avalanche-like events that can occur in these systems, as well as other qualitative and quantitative aspects of the critical behavior. But the examples given in this section have also illustrated a key point of these lectures: to be really useful, a phenomenological framework should be broad enough and robust enough to show the roots of its own failure. With judicious choice of which experimental systems to focus on, and some luck, this should enable enough predictions to be made that theoretical results and underlying assumptions built into models are falsifiable!
Note on references. Since the literature in many of the areas discussed here is vast, I have generally tried to include references to papers that elucidate the main points using a framework similar to that of these notes, to review or introductory articles, and to recent papers that contain references to earlier literature, rather than referring directly to all the relevant original papers.
Acknowledgements Whatever understanding the author might have of the problems discussed here is due in large part to interactions over the past 15 years with students, postdocs and colleagues too numerous to mention. I am most grateful to all of them. I would also like to thank Jennifer Schwarz for comments on a preliminary version of the manuscript. This work has been supported in part by the National Science Foundation via grants DMR 9106237, DMR 9630064, and Harvard University’s MRSEC.
References
[1] P.G. deGennes, Rev. Mod. Phys. 57 (1985) 827.
[2] E. Ertas, M. Kardar, Phys. Rev. E 49 (1994) R2532.
[3] T. Natterman, S. Stepanow, L.-H. Tang, H. Leschorn, J. Phys. (France) II 2 (1992) 1483.
[4] O. Narayan, D.S. Fisher, Phys. Rev. B 48 (1993) 5949. The renormalization group analysis is based on that in [3].
[5] J.B. Stokes, D.A. Weitz, J.P. Gollub, A. Dougherty, M.O. Robbins, P.M. Chaikin, H.M. Lindsay, Phys. Rev. Lett. 57 (1986) 1718.
[6] G. Blatter, M.V. Feigelman, V.B. Geshkenbein, A.I. Larkin, V. Vinokur, Rev. Mod. Phys. 66 (1988) 1125 and references therein. For a general introduction to vortex dynamics, see M. Tinkham, Introduction to Superconductivity, 2nd ed., McGraw-Hill, New York, 1996.
[7] G. Gruner, Rev. Mod. Phys. 60 (1988) 1075 and references therein.
[8] D.S. Fisher, Phys. Rev. B 31 (1985) 1396.
[9] O. Narayan, D.S. Fisher, Phys. Rev. B 46 (1992) 11520.
[10] For a general reference on earthquake dynamics, see e.g. K. Kanimori, E. Boschi (Eds.), Earthquakes: Observation, Theory, and Interpretation, North-Holland, Amsterdam, 1983.
[11] D.S. Fisher, K. Dahmen, S. Ramanathan, Y. Ben-Zion, Phys. Rev. Lett. 78 (1997) 4885.
[12] D.S. Fisher, Friction and forced flow: collective transport in disordered media, in: A.R. Bishop, D.K. Campbell, P. Kumar, S. Trullinger (Eds.), Nonlinearity in Condensed Matter, Springer, New York, 1987.
[13] The importance of "avalanche" like phenomena in a variety of physical systems has been emphasized by P. Bak, C. Tang and collaborators, see e.g. P. Bak, C. Tang, C. Wiesenfeld, Phys. Rev. Lett. 59 (1987) 381; P. Bak, K. Chen, Sci. Am. 264 (1991) 46, and this and other critical-like phenomena dubbed "self-organized criticality". See also [6,12].
[14] P.C. Hohenberg, B.I. Halperin, Rev. Mod. Phys. 49 (1977) 435, for a review of equilibrium dynamic critical phenomena.
[15] S. Ramanathan, D. Ertas, D.S. Fisher, Phys. Rev. Lett. 79 (1997) 873.
[16] H. Larralde, R.C. Ball, Europhys. Lett. 30 (1995) 287.
[17] A.A. Middleton, D.S. Fisher, Phys. Rev. B 47 (1993) 3530.
[18] A.A. Middleton, Phys. Rev. Lett. 68 (1992) 670.
[19] M. Kardar, Phys. Rep. 299 (1998) in this volume.
[20] The scaling law for distribution of avalanche sizes is the same as that in mean field models of "sandpiles" (see P. Bak, C. Tang, J. Stat. Phys. 51 (1988) 797) as well as that of cluster sizes in the mean field theory of percolation [22].
[21] K. Dahmen, J.P. Sethna, Phys. Rev. B 53 (1996) 14872.
[22] D. Stauffer, A. Aharony, Introduction to Percolation Theory, 2nd ed., Taylor & Francis, London, 1991.
[23] O. Narayan, A.A. Middleton, Phys. Rev. B 49 (1994) 244.
[24] J.T. Chayes, L. Chayes, D.S. Fisher, T. Spencer, Phys. Rev. Lett. 57 (1986) 299; Comm. Math. Phys. 120 (1989) 501 present rigorous results and generalizations of a result of A.B. Harris, J. Phys. C 7 (1974) 1671, on correlation length exponents in random systems.
[25] S. Ramanathan, D.S. Fisher, Phys. Rev. B, in press; S. Ramanathan, Ph.D. Thesis, Harvard University, 1997.
[26] Other authors have found $\zeta \approx \tfrac{1}{2}$; see J. Schmittbuhl, S. Roux, J.-P. Vilotte, K.J. Måløy, Phys. Rev. Lett. 74 (1995) 1787; P.B. Thomas, M. Paczuski, cond-mat 9602023. It is not clear whether the differences between these results and $\zeta \approx \tfrac{1}{3}$ can be attributed to corrections to scaling or to some other source.
[27] M.J. Higgins, S. Bhattacharya, A.A. Middleton, Phys. Rev. Lett. 70 (1993) 3784.
[28] A.A. Middleton, Phys. Rev. B 45 (1992) 9465.
[29] S.N. Coppersmith, Phys. Rev. Lett. 65 (1990) 1044.
[30] The possible stability to dislocations of an elastic solid in a three dimensional weakly random medium has been discussed by various authors including J. Kierfeld, T. Natterman, T. Hwa, Phys. Rev. B 55 (1997) 626; T. Giamarchi, P. LeDoussal, Phys. Rev. B 52 (1995) 1242; D.S. Fisher, Phys. Rev. Lett. 78 (1997) 1964; see also [6].
[31] D.S. Fisher, M.P.A. Fisher, D.A. Huse, Phys. Rev. B 43 (1991) 130; D.A. Huse, D.S. Fisher, M.P.A. Fisher, Nature 358 (1992) 553, for an introduction to the basic issues involved.
[32] M.J. Higgins, S. Bhattacharya, Physica C 257 (1996) 232.
[33] J. Watson, D.S. Fisher, Phys. Rev. B 54 (1996) 938; Phys. Rev. B 55 (1997) 14909; A.-C. Shi, A.J. Berlinsky, Phys. Rev. Lett. 67 (1991) 1926.
[34] T. Matsuda et al., Science 271 (1996) 1393.
[35] For a general reference on analytical work on fracture dynamics, see L.B. Freund, Dynamic Fracture Mechanics, Cambridge University Press, Cambridge, 1990; and for a more phenomenological treatment, see B. Lawn, Fracture of Brittle Solids, Cambridge University Press, Cambridge, 1993.
[36] S. Ramanathan, D.S. Fisher, Phys. Rev. Lett. 79 (1997) 877.
[37] J.R. Willis, A.B. Movchan, J. Mech. Phys. Solids 43 (1995) 319.
[38] J. Schmittbuhl, K.J. Måløy, Phys. Rev. Lett. 78 (1997) 3888.
[39] O. Narayan, private communication.
[40] E. Bouchaud, J. Phys. C 9 (1997) 4319 and references therein.
[41] Various authors have studied the development and dynamics of fault networks using models that are somewhat related to those discussed here; see, e.g., K. Chen, P. Bak, S.P. Obukhov, Phys. Rev. A 43 (1991) 625; P.A. Cowie, C. Vanneste, D. Sornette, J. Geophys. Res. 98 (1993) 21809; P. Miltenberger, D. Sornette, C. Vanneste, Phys. Rev. Lett. 71 (1993) 3604; and references therein.
[42] See, e.g. J.M. Carlson, J.S. Langer, B.E. Shaw, Rev. Mod. Phys. 66 (1994) 658 and references therein.
[43] J.F. Pacheco, C.H. Schulz, L.R. Sykes, Nature 355 (1992) 71.
[44] B. Gutenberg, K.F. Richter, Ann. Geophys. 9 (1996) 1.
[45] S.G. Wesnowsky, Bull. Seismol. Soc. Am. 84 (1994) 1940, has analyzed the statistics of earthquakes on various faults in California.
Physics Reports 301 (1998) 151—185
Deterministic chaos and the foundations of the kinetic theory of gases J.R. Dorfman* Institute for Physical Science and Technology and Department of Physics, University of Maryland, College Park, Maryland, 20742, USA
Abstract Recent work in dynamical systems theory has shown that many properties that are associated with irreversible processes in fluids can be understood in terms of the dynamical properties of reversible, Hamiltonian systems. That is, stochastic-like behavior is possible for these systems. Here we review the basic theory for this stochastic-like behavior and show how it may be used to obtain an understanding of irreversible processes in gases and fluids. Recent, closely related, work on the use of kinetic theory to calculate dynamical quantities such as Lyapunov exponents is also discussed. © 1998 Elsevier Science B.V. All rights reserved. PACS: 05.40.+j; 05.45.+b; 05.60.+w
1. Introduction Dynamical systems theory [1], of which chaos theory is a part, has its origins in two famous problems of classical physics: to find a solution for the three-body problem of celestial mechanics [2], and to find a proof of Boltzmann’s ergodic hypothesis for the equivalence of time and ensemble averages of dynamical quantities (such as the force per unit area on the wall of the container) for isolated systems composed of a large number of particles [3]. In each case a great deal of progress toward an understanding of both the possibilities and difficulties in finding solutions to these problems has been made possible, over the past several decades, by the development of powerful and sophisticated mathematical and computer methods. These developments are well summarized in a number of excellent books and papers on dynamical systems theory, some of which are listed as Refs. [1,4—7] as well as references contained therein. This article is a summary of a more extensive set of notes on this subject developed by the author
* E-mail: [email protected]. 0370-1573/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S0370-1573(98)00009-X
J.R. Dorfman / Physics Reports 301 (1998) 151—185
which presents these topics in a somewhat more leisurely way, and also contains an extensive discussion of the Boltzmann transport equation [8]. Another discussion of many of these topics can be found in Ref. [9]. The purpose of this article is to explain some of these developments and to explore their application to the issues that preoccupied Boltzmann, namely the foundations of non-equilibrium statistical mechanics, with a particular emphasis on the foundations of the kinetic theory of dilute gases of particles that interact with short-range forces [10—12]. The dynamical properties of such a dilute gas are somewhat easier to understand than those for other states of matter since in the main, the dynamics of a particle in a dilute gas consists of long periods of motion free from interactions with the other particles, punctuated by occasional and essentially brief collisions with other particles in the gas. That is, for a gas of particles with short-range forces, the mean free time between collisions is much longer than the average duration of a collision. Even for such an apparently simple system complicated and difficult questions arise such as: How does one reconcile the reversible equations of motion with the irreversible macroscopic properties of such a system, particularly as seen in the Second Law of Thermodynamics concerning the irreversible increase of entropy? Why is the Boltzmann equation, which is admittedly derived on the basis of stochastic rather than mechanical assumptions, so successful in predicting the properties of dilute gases, and in explaining the origin of the hydrodynamic properties of the gas? and so on. This article will try to show how such questions might be answered by chaos theory and to indicate the general lines of the answers, to the extent that the issues are understood at present. 
Our discussions will be based at first upon the work of Bowen [13], Ruelle [4,14], and Sinai [15], who are the principal architects of the modern theory of the dynamical foundations of statistical mechanics. Recent developments due to major contributions of many other authors will also be described here, with appropriate, though certainly incomplete, references to the literature [16]. The plan of this article is as follows: In Section 2 we will review the basic ideas of ergodicity and mixing for dynamical systems, and show their importance for statistical mechanics. We discuss the role of systems with few degrees of freedom, such as the baker’s map and the Arnold cat map, as systems that exhibit ergodic and mixing properties and serve as paradigms for the kinds of properties we would like to see in systems with many degrees of freedom. The examples provided by such simple systems will motivate the important definitions of Lyapunov exponents, Kolmogorov—Sinai (KS) entropies, and hyperbolic, Anosov systems which will be important for the further discussions. In Section 3 we consider some recent applications of ideas from dynamical systems theory to the theory of transport phenomena, and illustrate the role of fractal structures that are responsible for diffusion and other transport processes. We consider there the application of escape-rate methods to transport theory, as developed by Gaspard and Nicolis [17], and the method of thermostatted dynamical systems, developed by D. Evans, W. Hoover, H.A. Posch, G. Morriss, and E.G.D. Cohen [18], as well as other authors. In Section 4 we use these ideas to present our current understanding of the dynamical origins of the Boltzmann equation, and to show how this equation, in turn, can be used to calculate quantities such as Lyapunov exponents which measure the chaotic nature of the relevant dynamics. We conclude in Section 5 with a number of remarks and questions. This article is dedicated to the memories of T.H. 
Berlin, M.S. Green, M. Kac, J. Kestin, E.A. Mason, P. Resibois, and G.E. Uhlenbeck.
2. Dynamical systems theory and statistical mechanics

We begin by recalling that classical statistical mechanics represents the mechanical state of a system of $N$ point particles in $d$-dimensional space by a single point in a phase-space of $2Nd$ dimensions, $\Gamma$-space, whose axes represent the position and momentum in each spatial direction for each of the $N$ particles [19]. If we denote a point in $\Gamma$-space by $X$, then the Newtonian motion of this phase point over a time $t$ from $X$ to $X_t$ will be described by a time displacement transformation $S_t$ with

$$X_t = S_t(X) . \tag{1}$$

Although statistical mechanics is usually concerned with systems where $N$ is large and $t$ is continuous, usually called "flows", here we will often consider simple dynamical systems in phase spaces of two dimensions and where the time $t$ takes on discrete values. Such low-dimensional systems with discrete time steps are called "maps" [1,7]. The reason for considering them is that they display the kind of dynamical properties that we would like to find in systems of more interest to physicists. In particular, we would like to identify the important features of these simple maps which are responsible for such properties as ergodic behavior and an approach to an equilibrium state, and then see if such properties are characteristic of more general flows for more realistic models of physical systems.

2.1. Invariant measures

Having defined the phase-space $\Gamma$ and the time displacement operator for either continuous or discrete times, we now add a third basic quantity to our discussion, an invariant measure, $\mu(A)$, where $A$ is some set in phase-space. A measure $\mu$ is invariant under the time displacement operator $S_t$ if for any set $A$, $\mu(A) = \mu(S_{-t}A)$, i.e. the measure of the pre-image of $A$, denoted by $S_{-t}A$, is equal to the measure of $A$ for any $t > 0$.
This condition, that an invariant measure be constant under time evolution, is closely related to the usual statements in statistical mechanics that follow from the evolution equation for the Liouville distribution function, $\rho(X,t) = \rho(S_{-t}(X), 0)$. An example of the point of the requirement that $t > 0$ will be seen shortly when we consider invariant measures for one-dimensional maps, which are not invertible. Well-known examples of invariant measures include the volume of a set in $\Gamma$-space, which defines the Liouville measure, $\mu_L(A)$,

$$\mu_L(A) = \int_A d\Gamma , \tag{2}$$

where the integration is over the range of position and momentum coordinates that define the set $A$ in $\Gamma$-space; and the microcanonical measure on the constant energy surface, $\mu_{m.c.}(A)$, defined by

$$\mu_{m.c.}(A) = \int_A \frac{dX}{|\nabla H|} , \tag{3}$$

where $dX$ is a surface element on the constant energy surface, and $\nabla H$ is the gradient of the Hamiltonian $H$ of the system with respect to all of the coordinates and momenta in $\Gamma$-space [19].
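As a small numerical illustration of an invariant measure, the sketch below checks that phase-space area (the Liouville measure of Eq. (2)) is unchanged by a Hamiltonian flow. It assumes the simplest possible case, a one-dimensional harmonic oscillator with unit mass and frequency, for which $S_t$ is a rotation of the $(q,p)$ plane. Because this flow is invertible, comparing a set with its image is equivalent to comparing it with its pre-image. The helper names are illustrative only.

```python
import math

def flow(point, t):
    """Exact time-t flow S_t of H = (p**2 + q**2) / 2: a rotation in the (q, p) plane."""
    q, p = point
    return (q * math.cos(t) + p * math.sin(t),
            -q * math.sin(t) + p * math.cos(t))

def polygon_area(vertices):
    """Area of a polygon from the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        q1, p1 = vertices[i]
        q2, p2 = vertices[(i + 1) % n]
        total += q1 * p2 - q2 * p1
    return abs(total) / 2

# A rectangular set A in phase space and its image S_t A.
# The flow is linear, so the image of a polygon is again a polygon.
region = [(1.0, 0.0), (2.0, 0.0), (2.0, 0.5), (1.0, 0.5)]
moved = [flow(vertex, t=0.7) for vertex in region]

print(polygon_area(region), polygon_area(moved))
```

The two printed areas agree to rounding error for any choice of `t`, which is the statement $\mu_L(A) = \mu_L(S_t A)$ for this particular flow.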
There are a number of useful systems of low dimensionality with simple invariant measures, as well. As they will be used in our further discussions, we describe the systems and their appropriate invariant measures now. We should mention that very often dynamical systems have infinitely many invariant measures. However of these, usually only one, called the natural or SRB measure, can be used to compute the type of ensemble averages used in statistical mechanics. The other measures are often defined on very special points or sets in phase-space. For example, one can define invariant delta function measures on periodic points of the system, but these measures cannot generally be used to characterize the statistical properties of the system. In all of the discussions to follow we will assume that the total measure of the appropriate phase-space is finite, which implies, of course, that the measure of any subset of the space is finite. The simplest map with an invariant measure is a piecewise linear, 1-d non-invertible map of the unit interval, [0,1] onto itself, an example of which is the tent map illustrated in Fig. 1. This map is given by [1,7]
$$x_{n+1} = \begin{cases} 2x_n & \text{for } 0 \le x_n < \tfrac{1}{2} , \\ 2(1 - x_n) & \text{for } \tfrac{1}{2} \le x_n \le 1 . \end{cases} \tag{4}$$
Here the invariant measure $\mu(A)$ of a set $A$ is its Lebesgue measure, i.e. simply its total length. For this map the pre-image of $A$ is the union of two sets each of Lebesgue measure $\mu(A)/2$. Note that the image of $A$ typically has Lebesgue measure $2\mu(A)$, so our definition of the invariant measure, using the condition $t > 0$ above, allows us to define such measures even for non-invertible maps, such as the tent map. An invariant measure should be defined by considering the pre-image of $A$, as we did above.

For our purposes the most useful maps will be two-dimensional (2-d) invertible maps, particularly the baker's map illustrated in Fig. 2, and the toral automorphism, often called the Arnold cat map, illustrated in Fig. 3 [20]. The baker's map is defined on the unit square $0 \le x, y \le 1$ by the dynamical equation
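$$\begin{pmatrix} x_{n+1} \\ y_{n+1} \end{pmatrix} = B \begin{pmatrix} x_n \\ y_n \end{pmatrix} = \begin{cases} \begin{pmatrix} 2x_n \\ y_n/2 \end{pmatrix} & \text{for } 0 \le x_n < \tfrac{1}{2} , \\[2ex] \begin{pmatrix} 2x_n - 1 \\ (y_n + 1)/2 \end{pmatrix} & \text{for } \tfrac{1}{2} \le x_n \le 1 . \end{cases} \tag{5}$$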
This map has an invariant measure which is just the Lebesgue measure of two-dimensional regions on the unit square. This follows immediately from the observation that horizontal lengths become twice as long, but vertical lengths become one half as long. One can also define the inverse map $B^{-1}$ in a simple way. Under the inverse map, horizontal lengths contract by a factor of 2, and vertical lengths become twice as long.
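Both statements can be checked with exact rational arithmetic. The following is a minimal sketch, assuming the tent map of Eq. (4) and the baker's map of Eq. (5) as written; the inverse `baker_inverse` is our own spelling-out of $B^{-1}$ (valid away from the measure-zero boundary lines), and all function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def tent(x):
    """Tent map of Eq. (4) on the unit interval."""
    return 2 * x if x < HALF else 2 * (1 - x)

def tent_preimage(a, b):
    """Pre-image of the interval [a, b] under the tent map: two intervals."""
    return [(a / 2, b / 2), (1 - b / 2, 1 - a / 2)]

def baker(x, y):
    """Baker's map of Eq. (5): stretch by 2 horizontally, squeeze by 2 vertically."""
    if x < HALF:
        return 2 * x, y / 2
    return 2 * x - 1, (y + 1) / 2

def baker_inverse(x, y):
    """Inverse map: squeeze horizontally, stretch vertically."""
    if y < HALF:
        return x / 2, 2 * y
    return (x + 1) / 2, 2 * y - 1

a, b = Fraction(1, 5), Fraction(3, 5)
pieces = tent_preimage(a, b)
print(pieces, sum(hi - lo for lo, hi in pieces))   # total length equals b - a

point = (Fraction(5, 7), Fraction(2, 9))
print(baker(*point), baker_inverse(*baker(*point)))
```

The pre-image of $[a,b]$ under the tent map consists of two intervals of length $(b-a)/2$ each, so its total length equals that of $[a,b]$, while the baker's map composed with its inverse returns the starting point.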
Fig. 1. The tent map on the unit interval $0 \le x \le 1$.
Fig. 2. The baker's map on the unit square $0 \le x, y \le 1$.
Toral automorphisms, $T$, are defined in general by the transformation

$$\begin{pmatrix} x_{n+1} \\ y_{n+1} \end{pmatrix} = T \cdot \begin{pmatrix} x_n \\ y_n \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \cdot \begin{pmatrix} x_n \\ y_n \end{pmatrix} \bmod 1 , \tag{6}$$

where $a, b, c, d$ are positive integers, and the determinant of the $2 \times 2$ matrix is $ad - bc = 1$. Thus a unit torus is mapped onto a unit torus, and the condition on the determinant guarantees that all areas on the unit torus are preserved by the mapping. For reasons to be clear shortly, we will restrict our attention here to so-called hyperbolic maps where the eigenvalues of $T$ do not lie on the unit circle. The Arnold cat map is a particular case of this where the matrix $T$ is given as $\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$. This is the case illustrated in Fig. 3.

2.2. Ergodic and mixing systems

For classical systems, the foundations of equilibrium statistical mechanics are based upon Boltzmann's ergodic hypothesis: The long time average of the dynamical quantities accessible to measurement, such as the force per unit area exerted by a fluid on the walls of a container, for an individual, isolated system, is equal to the ensemble average of the same dynamical quantity taken with respect to a microcanonical ensemble [19]. One of the tasks of dynamical systems theory is to identify the kinds of dynamical systems and dynamical quantities for which the ergodic hypothesis can be verified.
Fig. 3. The Arnold cat map on the unit square.
The first major result in this direction was Birkhoff’s ergodic theorem [20,21]: Consider some dynamical function, $F(X)$, defined on the constant energy surface, E, that satisfies the condition
$$\int d\mu\, |F(X)| < \infty , \tag{7}$$
where the integration is with respect to the microcanonical measure defined in Eq. (3), and is carried out over the whole surface. Suppose further that the total measure of the surface is finite. Then: (i) the time average of F, denoted by $\bar F(X) = \lim_{T\to\infty} \frac{1}{T} \int_0^T dt\, F(X(t))$, exists almost everywhere on the constant energy surface, that is, for almost every starting point X; (ii) the time average $\bar F(X)$ may depend on the particular trajectory but not upon the initial point of the trajectory; (iii) the ensemble average of $\bar F$ is equal to the ensemble average of F itself, that is, $\int d\mu\, F(X) = \int d\mu\, \bar F(X)$; and (iv) one can replace the entire constant-energy surface as the region of integration by any invariant subset of non-zero measure. Here an invariant set is one where all points (with the exception of a subset of measure zero) initially in the set remain in the set during the course of their time evolution, and the measure of the set does not change with time. Birkhoff’s theorem does not show
that systems of physical interest are ergodic in the sense of Boltzmann because the time average of an integrable dynamical quantity may depend on trajectories and thus not necessarily be equal to an ensemble average, which is, of course, just a number. We therefore define an ergodic system to be one where the time average $\bar F$ of any dynamical quantity F is a constant on the surface. For such a system the time average of F is indeed its ensemble average, which we denote by $\tilde F$, since
$$\int d\mu\, F(X) = \bar F \int d\mu , \tag{8}$$
from which it follows that
$$\bar F = \int d\mu\, F(X) \Big/ \int d\mu = \tilde F . \tag{9}$$
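Eq. (9) can be illustrated numerically with the rotation of the unit interval, $x' = x + a \bmod 1$, which is among the simplest ergodic examples when a is irrational. The sketch below is only an illustration under our own assumptions: the observable $F(x) = x$, the starting point 0.1, the golden-mean rotation number, and the discrete-time average (a sum over iterates rather than an integral over t) are all choices made here, not taken from the text.

```python
import math

def time_average(F, x0, a, n_steps):
    """Average F along the orbit x_n = x0 + n*a mod 1, n = 0..n_steps-1."""
    total = 0.0
    for n in range(n_steps):
        total += F((x0 + n * a) % 1.0)
    return total / n_steps

def F(x):
    # Ensemble (Lebesgue) average of this observable over [0, 1] is 1/2.
    return x

golden = (math.sqrt(5.0) - 1.0) / 2.0   # irrational rotation number

# Irrational rotation: time average approaches the ensemble average 1/2.
avg_irrational = time_average(F, 0.1, golden, 100_000)

# Rational rotation a = 1/4: the orbit visits only 0.1, 0.35, 0.6, 0.85,
# so the time average is 0.475 and depends on the starting point.
avg_rational = time_average(F, 0.1, 0.25, 100_000)
```

The rational case shows why the qualification matters: the time average exists, as Birkhoff’s theorem guarantees, but it is not constant over the interval, so the system is not ergodic.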
It should be remarked that although it is true that for ergodic systems the ensemble average of any integrable function is equal to its time average, the requirement of ergodicity may be demanding too much of the dynamical systems to which we would like to apply equilibrium statistical mechanics. For example, for applications of statistical mechanics we might restrict ourselves to a smaller class of functions for which we would need to prove the equivalence of time and ensemble averages, thus placing fewer demands on the dynamics.

Over the seventy or so years since Birkhoff’s theorem was proved, a number of systems have been shown to be ergodic [15,16,22]. These include simple transformations of the unit interval [0,1] onto itself by $x' = x + a \bmod 1$ (think of taking steps of length a on a circle of unit circumference), where a is irrational, the baker’s map, toral automorphisms of the type described above, geodesics on surfaces of constant negative curvature, and, of most interest for physics, billiard ball systems in two and three dimensions. These systems include a particle moving among a system of fixed, non-overlapping hard-disk or hard-sphere scatterers (the Lorentz gas) [15], as well as certain systems of moving hard disks or hard spheres in a box with periodic boundary conditions [23]. The proofs of ergodicity for billiard systems are not at all easy(!), but billiards may be the simplest physically realistic systems to analyze because of the simplicity of the collisions of the particles [24]. We will return to this point further on, but first we wish to develop further characterizations of the dynamical systems for which the methods of statistical mechanics apply.

Gibbs took Boltzmann’s ideas a step further by introducing the idea of a mixing system [16,19,20]. He proposed to look at an initial ensemble of mechanically identical systems all with the same total energy, but with different initial conditions.
Suppose the phase points for this ensemble are confined, initially, to a set of positive measure, A, on the constant energy surface, E. Then in the course of time this set will evolve to a new set, $A_t$, with the same measure as A. A system is called mixing in the sense of Gibbs, if for any fixed set B of positive measure on E
$$\lim_{t\to\infty} \frac{\mu(B \cap A_t)}{\mu(B)} = \frac{\mu(A)}{\mu(E)} . \tag{10}$$
That is, a system is mixing if the time evolution of any set of non-zero measure leads to the set being distributed uniformly with respect to the invariant measure $\mu$ over the constant-energy surface. It is easy to show that the condition that a dynamical system be mixing is stronger than requiring that it be ergodic [16,20]. That is, all mixing systems are ergodic but not vice versa. For example,
the ergodic transformation of the unit interval [0,1] onto itself, by $x' = x + a \bmod 1$, for irrational a is not mixing. To see this just let A, B each be small connected sets of the unit interval, and consider the set $C_n = A_n \cap B$, the intersection of B with the image of A after n applications of the transformation. The limit of $\mu(C_n)$ as n gets large is not well defined and the set A does not get uniformly mixed over the unit interval, but instead moves as a rigid set without mixing.

The mixing property of a dynamical system is even more interesting for our purposes than that of ergodicity, not only because mixing implies ergodicity, but also because if a system is mixing, then non-equilibrium ensemble averages approach their equilibrium values as time gets large [8,20]. That is, if one starts with some non-equilibrium ensemble distribution on E, not the microcanonical one, and if this distribution is a measurable function on E, then the time evolution of this distribution function is governed by Liouville’s equation. The ensemble average of any measurable dynamical quantity, F, at some time t is then determined by the integral of F with respect to the distribution function at time t. For a mixing system, this ensemble average approaches its value in the microcanonical ensemble as t approaches infinity. Physically this means that in a weak sense, that is, under integration with some well-behaved function, non-equilibrium distribution functions approach the microcanonical equilibrium distribution in the long time limit. This is the essential information needed for the validity of non-equilibrium statistical mechanics. Systems that have been rigorously proved to be mixing include the baker’s map, the hyperbolic toral automorphisms, and the Lorentz gases described above [16,25,26]. There is little doubt that the moving hard-sphere or hard-disk systems are mixing, but as yet there is no rigorous proof of this supposition.

2.3. Dynamics, symbolic dynamics, and Lyapunov exponents

While we will not prove here that the baker’s map or the toral automorphisms are mixing systems, it is very instructive to sketch the lines of the proof for the baker’s map. This will provide a simple example of a Markov partition, which is a useful construction in many other, and more general, contexts [27,28]. We begin by noting that any number in the interval [0,1] can be expressed in a binary series, the dyadic expansion, as
$$x = \frac{a_0}{2} + \frac{a_1}{2^2} + \frac{a_2}{2^3} + \cdots , \tag{11}$$
where each of the $a_i$ can take on the values 0, 1, depending on x, of course. We can represent x as an infinite sequence of 0’s and 1’s as
$$x = (a_0, a_1, a_2, \ldots) . \tag{12}$$
This representation is unique except for those fractions whose representations consist of a finite number of a’s followed by an infinite number of 0’s or of 1’s. Such fractions have two equivalent representations, one with an infinite sequence of 0’s and one with an infinite sequence of 1’s. As these particular fractions form a countable set, we can always require that we use one of these representations for them consistently. In a similar manner we may represent y in [0,1] as
$$y = \frac{b_{-1}}{2} + \frac{b_{-2}}{2^2} + \cdots \quad \text{or} \quad y = (b_{-1}, b_{-2}, \ldots) . \tag{13}$$
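A short worked example may help fix the notation of Eqs. (11)–(13). The particular numbers 5/8 and 1/2 are our own illustrative choices.

```latex
% A dyadic rational with a terminating expansion:
x = \tfrac{5}{8} = \frac{1}{2} + \frac{0}{2^2} + \frac{1}{2^3}
  \quad\Longrightarrow\quad x = (1, 0, 1, 0, 0, \ldots) .

% The two equivalent representations of such a fraction, here x = 1/2:
\tfrac{1}{2} = (1, 0, 0, 0, \ldots) = (0, 1, 1, 1, \ldots) ,
\qquad \text{since} \quad \sum_{k=2}^{\infty} \frac{1}{2^{k}} = \frac{1}{2} .
```

The second line is the binary analogue of 0.4999… = 0.5 in decimal notation; choosing one of the two forms consistently removes the ambiguity, as stated above.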
Combining these two representations, we see that any point in the unit square $0 \le x, y \le 1$ may be expressed by
$$(x, y) = (\ldots b_{-2}\, b_{-1} \,.\, a_0\, a_1\, a_2 \ldots) , \tag{14}$$
where we have separated the x-sequence from the y-sequence by a “dot”, and strung the y-sequence to the left, with the x-sequence to the right. We can clearly approximate any point in the unit square to an accuracy $2^{-N}$ by a finite sequence
$$(x, y) \approx (b_{-N}\, b_{-N+1} \ldots b_{-1} \,.\, a_0\, a_1 \ldots a_{N-1}) . \tag{15}$$
Such an approximation is equivalent to locating the point somewhere in one of the small squares obtained by dividing the unit square into small regions by drawing vertical lines at intervals of $2^{-N}$, and similarly with horizontal lines. This partition of the unit square into much smaller regions is an example of a Markov partition, because of the particular way in which the lines are drawn, as we explain below. The approximation to the point (x, y) consists in specifying the particular small region in which the point is located. This procedure for approximating a particular point in phase-space, here the unit square, is a mathematical echo of the idea of coarse graining often used in statistical mechanics. The central feature that we now want to emphasize is that given the representation of the point (x, y) on the unit square by Eq. (14), its image $(x', y')$ under the baker’s map becomes
$$(x', y') = (\ldots b_{-2}\, b_{-1}\, a_0 \,.\, a_1\, a_2 \ldots) , \tag{16}$$
where all the members of the sequence have shifted one unit to the left, and, in particular, the number $a_0$ has moved past the “dot” into the y-sequence. Thus, we have mapped a deterministic dynamical system onto something that looks like a “coin toss” experiment. A point on the unit square can be mapped onto a bi-infinite sequence of 0’s and 1’s, that is, onto a particular realization of a bi-infinite coin toss sequence, and the “dot” indicates the location of one particular moment in the sequence of tosses.
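The shift property of Eq. (16) is easy to verify on an example. The sketch below, in Python with exact rationals, uses a helper `binary_digits` and a sample point that are our own illustrative assumptions; it checks that one application of the baker’s map drops $a_0$ from the x-sequence and places it at the front of the y-sequence.

```python
from fractions import Fraction

def binary_digits(v, n):
    """First n digits of the dyadic expansion of v in [0, 1), cf. Eq. (11)."""
    digits = []
    for _ in range(n):
        v *= 2
        d = int(v)        # integer part is the next digit, 0 or 1
        digits.append(d)
        v -= d
    return digits

def baker(x, y):
    """One step of the baker's map, Eq. (5)."""
    if x < Fraction(1, 2):
        return 2 * x, y / 2
    return 2 * x - 1, (y + 1) / 2

# Illustrative point: x = 0.1011 and y = 0.01 in binary.
x, y = Fraction(11, 16), Fraction(1, 4)
a = binary_digits(x, 4)          # a_0 a_1 a_2 a_3
b = binary_digits(y, 2)          # b_{-1} b_{-2}

x1, y1 = baker(x, y)
a_shifted = binary_digits(x1, 3)  # expected: a_1 a_2 a_3
b_shifted = binary_digits(y1, 3)  # expected: a_0 b_{-1} b_{-2}
```

Read from the “dot” outward, the y-sequence has gained $a_0$ as its new leading digit and the x-sequence has lost it, which is exactly the left shift of the bi-infinite sequence.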
The next toss will simply be the next step in the process of generating the entire sequence, and corresponds, in the baker’s map, to one iteration of the map! A coin toss experiment is, of course, a stochastic process, with a given probability p for obtaining a “heads” and $q = 1 - p$ for a “tails”. Our ability to map certain deterministic dynamical systems onto random, stochastic processes is the main point of this article! Of course the central question is to identify those systems for which such a mapping is possible.

Now consider what happens to all of the points in the small square of order $2^{-N}$ on a side, which we will take to be the set A in the Gibbs picture. We know that for all such points the values of $b_{-N}, b_{-N+1}, \ldots, b_{-1}, a_0, a_1, \ldots, a_{N-1}$ will be the same. However, each of the additional, unspecified $a_{N+k}$, $b_{-N-k'}$, with $k > -1$, $k' > 0$, can take on the values 0 or 1. All of the points in the small square correspond to all possible values of these unspecified a’s and b’s. After N iterations of the baker’s map, all of the specified values for the a’s will have shifted onto the y coordinate, and the set of points in the original small square will be uniformly distributed along the x interval [0,1]. After 2N iterates of the map, we will have lost track of the y coordinates as well, at least to the accuracy we have specified, and the points originally confined to our small square will be uniformly distributed over the entire unit square. Thus the system is indeed mixing. Note also that a similar circumstance would apply if the map were run backward, and the sequence shifted to the right past
the “dot” at each step. Thus we can see an approach to a uniform distribution for both the “forward” and the “time reversed” motions. Consequently, we can say that it is certainly possible for the ensemble distribution for a reversible dynamical system to show an approach to a uniform, equilibrium state for both forward and backward motions, provided we look at the distribution on a slightly coarse grained scale. A very similar analysis applies to the hyperbolic toral automorphisms, but the partitioning of the unit square (or torus) should be done with lines whose directions and spacings are dictated by certain features of the particular automorphism [16,27]. These features will be determined by the directions of the stable and unstable manifolds, and by the Lyapunov exponents of the map, all of which we will discuss shortly.

The representation of the trajectory of a point on the unit square under the baker’s map in terms of sequences of 0’s and 1’s is an example of symbolic dynamics, whereby trajectories are coded into sequences whose elements are chosen from a finite number of symbols (here 0 or 1) [1,7,16]. The representation of dynamics using such sequences can be very useful in analyzing and calculating the properties of dynamical processes in simple enough systems. For example, readers might convince themselves that the Cantor “middle third” set can be coded by similar sequences of two symbols. (Hint: Consider the map $x' = 3x \bmod 1$, and represent all numbers on the unit interval by a series in inverse powers of 3, with coefficients 0, 1, 2. The “middle third” Cantor set is isomorphic to all sequences where a 1 never appears, and the above transformation just maps this set onto itself.)

Now a remarkable thing has just happened in our analysis of the baker’s map. We started with a set of points that were all within a distance $\epsilon = 2^{-N}$ of each other, and they spread out over the unit square.
Consequently, the rates and directions of separation, or conversely of approach, of the images of two nearby points are certainly quantities of some interest for understanding the mixing process. For the baker’s map, it is clear that the images of infinitesimally close points with the same y coordinate will separate in the x-direction, so that their separation, $\delta_n$, after n steps will be $2^n \delta_0$, where their initial infinitesimal separation is $\delta_0$. However, the images of two infinitesimally close points with the same x coordinate will approach each other as $\delta_n = 2^{-n} \delta_0$. The images of two infinitesimally close points that do not lie on a strictly vertical line will also separate exponentially, since the exponential separation in the x-direction will dominate the separation of the two points. Consequently, the images of any two infinitesimally close points will separate exponentially, except for a set of measure zero, namely those points with the same x coordinate. If we consider the time-reversed motion, then points will separate in the y-direction and approach in the x-direction, but the images of almost all nearby points will separate exponentially rapidly. The exponents that govern the rates of exponential separation or approach are called Lyapunov exponents [1,4]. We see that the baker’s map has two Lyapunov exponents, $\ln 2$ and $-\ln 2$. The fact that the sum of these exponents is zero is not accidental; it is a consequence of the invariance of the Lebesgue measure on the unit square. Moreover, the separation of the nearby points is exponential for a large number of iterations of the map and the rate of separation is given by the Lyapunov exponents on this time scale. However, this exponential rate of separation cannot continue forever since the measure of the phase-space is finite.
Eventually, the separation will become large enough that the folding mechanisms of the baker map will take over and the two points will find themselves in very different regions of the unit square. In general, we will find that exponential separation of trajectories in a bounded phase-space cannot continue indefinitely, but that there is always a folding mechanism that keeps trajectories within the bounded region.
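The two Lyapunov exponents of the baker’s map, $\ln 2$ and $-\ln 2$, can be recovered by following pairs of nearby points directly. The following Python sketch is an illustration under our own assumptions: the starting point, the initial separations $2^{-40}$ and $2^{-10}$, the number of steps, and the use of a distance with the ends of the unit interval identified (so that a pair straddling $x = 1/2$ is still measured sensibly in x) are all choices made here. Exact rationals are used so that no digits are lost under doubling.

```python
import math
from fractions import Fraction

def baker(x, y):
    """One step of the baker's map, Eq. (5)."""
    if x < Fraction(1, 2):
        return 2 * x, y / 2
    return 2 * x - 1, (y + 1) / 2

def circle_distance(u, v):
    """Distance between u and v with the ends of [0, 1] identified."""
    d = abs(u - v) % 1
    return min(d, 1 - d)

n = 20
x0, y0 = Fraction(3, 10), Fraction(7, 10)

# Pair 1: same y, initial x-separation delta_0 = 2**-40 (expanding direction).
p, q = (x0, y0), (x0 + Fraction(1, 2**40), y0)
d0 = circle_distance(p[0], q[0])
for _ in range(n):
    p, q = baker(*p), baker(*q)
rate_x = math.log(float(circle_distance(p[0], q[0]) / d0)) / n

# Pair 2: same x, initial y-separation 2**-10 (contracting direction).
p, q = (x0, y0), (x0, y0 + Fraction(1, 2**10))
d0 = abs(p[1] - q[1])
for _ in range(n):
    p, q = baker(*p), baker(*q)
rate_y = math.log(float(abs(p[1] - q[1]) / d0)) / n
```

With these parameters the x-separation stays far below the size of the square, so the folding mechanism described above does not yet intervene; for much larger n the growth in the first experiment would saturate.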
Consider now the case of the hyperbolic toral automorphisms, Eq. (6), and let us use the Arnold cat map as an example. An elementary calculation shows that the 2×2 matrix given below Eq. (6) has the eigenvalues $\Lambda_\pm$ given by
$$\Lambda_\pm = (3 \pm 5^{1/2})/2 . \tag{17}$$
Notice that $\Lambda_+ > 1$, $\Lambda_- < 1$, and the invariance of the Lebesgue measure on the unit torus requires that $\Lambda_+ \Lambda_- = 1$. Here again, we see that there is an expanding direction which is determined by the direction of the eigenvector corresponding to $\Lambda_+$, and a contracting direction determined by the direction of the eigenvector corresponding to $\Lambda_-$. The associated Lyapunov exponents, which we denote by $\lambda_\pm$, are given by $\lambda_\pm = \ln \Lambda_\pm$. Here again we note that the invariance of the measure on the unit torus requires that $\lambda_+ + \lambda_- = 0$. Consider a typical point on the unit square and draw a set of lines through it such that one is in the direction of the expanding eigenvector and the other is in the direction of the contracting eigenvector. We will refer to these two lines as the expanding or unstable manifold and the contracting or stable manifold, respectively. The images of two infinitesimally close points on the unstable manifold will separate exponentially, while the images of two close points on the stable manifold will approach exponentially [1,16].

Now we can return to the idea of a Markov partition mentioned at the beginning of this section. A Markov partition is a partition of the phase-space into a finite collection of small “parallelograms” with disjoint interiors and whose sides lie on stable and unstable manifolds, perhaps suitably extended, of points in the space. For the case of the baker’s map, the parallelograms have vertical and horizontal sides, while for the Arnold cat map, the parallelograms have sides that lie along the eigen directions of the matrix below Eq. (6).
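The eigenvalues of Eq. (17), the corresponding Lyapunov exponents, and the eigen directions along which the sides of the Markov parallelograms lie can be checked with a few lines of Python. This is a minimal sketch; the helper names `eigenvector` and `apply` are our own, and the eigenvectors are left unnormalized.

```python
import math

T = ((2, 1), (1, 1))   # Arnold cat map matrix given below Eq. (6)

trace = T[0][0] + T[1][1]
det = T[0][0] * T[1][1] - T[0][1] * T[1][0]

# Roots of the characteristic polynomial L**2 - trace*L + det = 0.
disc = math.sqrt(trace**2 - 4 * det)
Lambda_plus = (trace + disc) / 2
Lambda_minus = (trace - disc) / 2

lyap_plus = math.log(Lambda_plus)    # positive Lyapunov exponent
lyap_minus = math.log(Lambda_minus)  # negative Lyapunov exponent

def eigenvector(Lam):
    """Unnormalized eigenvector (1, Lam - 2), from the first row of (T - Lam) v = 0."""
    return (1.0, Lam - T[0][0])

def apply(M, v):
    """Matrix-vector product for a 2x2 matrix stored as nested tuples."""
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])

v_unstable = eigenvector(Lambda_plus)    # expanding direction
v_stable = eigenvector(Lambda_minus)     # contracting direction
```

Because the cat map matrix is symmetric, the two eigen directions are perpendicular; for a general toral automorphism they need only be transverse.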
The existence of a Markov partition for a dynamical system allows one to use symbolic dynamics to code the trajectories, and to use the theory of Markov processes to analyze the time dependence of distribution functions defined on the phase-space. For example, the solution of the Perron–Frobenius equation to be discussed in Section 4 is greatly facilitated if one can construct a suitable Markov partition for the system. The calculations described in Section 3.4 make extensive use of the method of Markov partitions. Further details and a discussion of the use of Markov partitions to compute other quantities of dynamical interest such as Kolmogorov–Sinai entropies, to be discussed below, can be found in Refs. [15,16].

In order to show that the Lyapunov exponents are non-zero for both the baker’s map and the hyperbolic toral automorphisms we have had to calculate them. Thus, given a dynamical system, it is not always obvious that it will have the positive Lyapunov exponents that are required for the exponential separation of trajectories. We can generalize the concept of Lyapunov exponents, stable, and unstable manifolds to higher dimensional systems and to flows, in addition to maps [4]. Usually the only fundamental difference between maps and flows is that flows typically have one zero Lyapunov exponent in the direction of the trajectory in phase-space [4,5]. Although the proof can sometimes be involved mathematically, the zero Lyapunov exponent in the direction of the trajectory can be understood by imagining two infinitesimally close points that lie on the same trajectory in phase-space. If the dynamics of the system consists of free particle motion punctuated by collisions that are instantaneous or almost so in comparison with other time scales of the system (the mean free time between collisions, say), then the two points will remain close over the course of their motion and their separation will certainly not be exponential.
Furthermore, if the dynamics is consistent with
an invariant measure, such as Hamiltonian dynamics, then the sum of all the non-zero Lyapunov exponents must be zero as a consequence of the invariance of the measure. This suggests that the Lyapunov exponents should be defined in such a way as to be consistent with the invariance of the measure, which is the case in the two examples we have considered. It is worth pointing out that Lyapunov exponents for periodic trajectories, which have delta function invariant measures, may very well be different from those defined with respect to an invariant measure defined on the entire phase-space. One additional property of the non-zero Lyapunov exponents for a Hamiltonian system that is important is the symplectic conjugate pairing rule. That is, for a system with symplectic dynamics, the non-zero Lyapunov exponents, should there be any, must come in “plus–minus” pairs, with each pair of exponents summing separately to zero, independent of the values of the other pairs. This is a consequence of the symplectic form of Hamilton’s equations of mechanics [29], and is not true for reversible systems that are not symplectic. We shall encounter such systems in the further sections.

2.4. The Kolmogorov–Sinai entropy

Let us now look at some properties of the Arnold cat map in more detail. In Fig. 4, we show an initial set A which is located in the lower left hand corner of the unit square, and in Figs. 5–7, we show the evolution of this set after 2, 3, and 10 iterations of the map, respectively. As the number of iterations increases the set becomes longer and thinner such that at the third iteration the set has begun to fold back across the unit square, and after 10 iterations the set is so stretched and folded that it appears to cover the unit square uniformly. If we were able to increase the resolution of this
Fig. 4. The initial set A is confined to the lower left corner of the unit square. The dynamics of the points in this set is governed by the Arnold cat map.
Fig. 5. The evolution of the set A after two iterations of the Arnold cat map.
Fig. 6. The evolution of the set A after three iterations of the map.
illustration beyond the $10^5$ points used to generate it, we would see that this apparently uniform distribution is made up of very many, on the order of $10^6$, thin lines parallel to the expanding direction of the cat map, and very close together. Since the initial square is getting stretched along the unstable manifold, at every iteration we learn more about the initial location of points within
Fig. 7. The evolution of the set A after 10 iterations. The initial set A consisted of $10^5$ points, and the continuous nature of the initial set is no longer preserved. With more points, this evolved set would appear to be a set of closely spaced parallel lines, nearly covering the unit square uniformly.
the small initial set A. That is, suppose we can distinguish two points on the unit square only if they are separated by a distance $\delta$, the resolution parameter, and suppose further that the initial set has a characteristic dimension on the order of $\delta$ [9]. We cannot resolve two points in the initial set then, but after a time t, the initial set will have stretched along the unstable direction to a length on the order of $\delta \exp(\lambda_+ t)$, and we can easily resolve the images of points in the initial set. Thus, as we look at the successive images of the initial set we are able to obtain more and more information about the location of points in the initial region; in fact, the information is growing at an exponential rate. The exponential rate at which information is obtained is measured by the Kolmogorov–Sinai (KS) entropy [1,4,16], $h_{KS}$, and for our simple system, the cat map, $h_{KS} = \lambda_+$. In general, one finds that for a dynamical system of several dimensions, with positive Lyapunov exponents, and where all of the points are confined to a bounded region of phase-space,
$$h_{KS} = \sum_i \lambda_{+,i} , \tag{18}$$
where the summation is over all of the positive Lyapunov exponents of the system. Eq. (18) is referred to as Pesin’s theorem [4,30], and we mention that there is a highly developed theory for the KS entropy [16,27], which we cannot expand upon here. We mention again that the Lyapunov exponents and the KS entropy should be defined with respect to an invariant measure. Thus, there may be many sets of exponents for a single system, e.g., Lyapunov exponents defined on periodic orbits may differ from those defined with respect to the natural invariant, or SRB, measure, which will be discussed in Section 2.5. Pesin’s theorem applies if $h_{KS}$ and the Lyapunov exponents are
defined with respect to the natural invariant measure. In the next section we will consider a situation where points can escape from a bounded phase-space region, and where the KS entropy is less than the sum of the positive Lyapunov exponents by an amount that is equal to the escape rate of points from the bounded region.

2.5. Hyperbolic and Anosov systems

We have argued that both the baker’s map and the hyperbolic toral automorphisms are mixing systems, and as a consequence, as simple models of dynamical systems, they show an approach to equilibrium. In fact we can map these simple reversible dynamical systems onto stochastic processes. The mechanisms responsible for this desirable behavior are: (1) the exponential separation of trajectories in phase-space, characterized by positive (with corresponding negative) Lyapunov exponents with corresponding expanding and contracting directions; and (2) the folding of phase-space regions during their time evolution due to the boundedness of the phase-space, leading to a uniform distribution of points over the phase-space, at least on a coarse grained scale. It is now time to formalize these properties so that we can see if the above mechanisms apply for more realistic systems. To do this we consider a general dynamical system with several degrees of freedom, with an associated phase-space $\Gamma$ and an invariant measure $\mu$ on the phase-space. Let us write the equations of motion for the phase-space variables, denoted by X, as
$$\dot X = G(X) , \tag{19}$$
where, typically, but not always, G is the appropriate derivative of the Hamiltonian function with respect to the phase-space variables. In order to consider the possible exponential separation of infinitesimally nearby trajectories, we consider the time evolution of a small displacement $\delta X$ in phase-space between two arbitrarily close points [5,31]. This displacement satisfies a linearized equation in the tangent space to $\Gamma$ at the point X,
$$\delta \dot X = \frac{\partial G(X)}{\partial X} \cdot \delta X . \tag{20}$$
The solution to this equation in the tangent space has the form
$$\delta X_t = M(t, X) \cdot \delta X , \tag{21}$$
where M satisfies the equation
$$\dot M(t, X) = \frac{\partial G(X_t)}{\partial X_t} \cdot M(t, X) . \tag{22}$$
The matrix M plays a special role in the discussion now, as it determines the dynamical evolution of the displacements in the tangent space. We define an Anosov dynamical system as a map or flow with the following properties [6,16,32] (we consider flows here, but our definition can be easily modified for maps):

1. For almost every point, X, in the phase-space $\Gamma$, there is a decomposition of the tangent space at X, $T\Gamma(X)$, into three subspaces, an unstable subspace $E^u_X$, a stable subspace $E^s_X$, and a center
subspace $E^0_X$ such that
$$T\Gamma(X) = E^u_X \oplus E^s_X \oplus E^0_X . \tag{23}$$
2. There are constants $C_s$, $C_u$, and $\Lambda$ with $C_s, C_u > 0$, and $0 < \Lambda < 1$, such that:
2.1. If $\delta X$ is in $E^s_X$, then
$$\| M(t, X) \cdot \delta X \| \le C_s \Lambda^t \| \delta X \| , \tag{24}$$
2.2. If $\delta X$ is in $E^u_X$, then (for $t > 0$)
$$\| M(-t, X) \cdot \delta X \| \le C_u \Lambda^t \| \delta X \| , \tag{25}$$
2.3. The subspaces $E^u_X$, $E^s_X$, $E^0_X$ vary continuously with X. This means that any vector $\delta X$ can be written as
$$\delta X = \delta X^u_X + \delta X^s_X + \delta X^0_X , \tag{26}$$
where each term is in the indicated subspace. Then each of the $\delta X^j_X$ varies continuously with X.
2.4. The various subspaces intersect transversely, without any tangencies.

The center subspace contains the directions of motion with zero Lyapunov exponent, such as the direction tangent to the trajectory of the system in phase-space, and any other directions associated with macroscopically fixed constants of motion, such as the total momentum, etc. Note that in condition (2.2) above we considered the time reversed motion on the unstable manifold, and the condition is that small deviations from the orbit converge exponentially to zero for the time reversed motion, while for the stable manifold, the deviations approach zero exponentially in the forward direction. The transversality condition (2.4) assures that there are no tangencies among the subspaces which would greatly complicate the description of the dynamics.

As we will see in the next section, there are a number of circumstances of physical interest where we will need to consider invariant subregions of the full phase-space. These regions are usually fractal attractors or repellers, to be defined and described in the next section, which may have Lebesgue measure zero in the full phase-space. It may happen that the dynamics, when restricted to such an invariant subregion, still satisfies conditions (2.1)–(2.4) above. In such a case we say that the dynamics is hyperbolic on the subregion. An Anosov system, then, is one which is hyperbolic on the full phase-space. Anosov systems of finite total measure have the properties that we need for equilibrium and non-equilibrium statistical mechanics: they are ergodic and mixing, with positive KS entropy.
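For the Arnold cat map, conditions (2.1) and (2.2) can be verified directly, since the tangent-space matrix after n iterations is simply the n-th power of the cat map matrix. The Python sketch below is an illustration under our own assumptions: it treats the discrete-time (map) version of the conditions, uses the Euclidean norm, takes the constant $\Lambda$ of the definition to be the contracting eigenvalue of the matrix with $C_s = C_u = 1$, and picks n = 10.

```python
import math

T = ((2, 1), (1, 1))         # Arnold cat map matrix
T_inv = ((1, -1), (-1, 2))   # its inverse; integer entries because det T = 1

def apply(M, v):
    """Matrix-vector product for a 2x2 matrix stored as nested tuples."""
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])

def norm(v):
    return math.hypot(v[0], v[1])

Lam_u = (3 + math.sqrt(5)) / 2   # expanding eigenvalue
Lam_s = (3 - math.sqrt(5)) / 2   # contracting eigenvalue, plays the role of Lambda

v_s = (1.0, Lam_s - 2.0)   # vector in the stable subspace
v_u = (1.0, Lam_u - 2.0)   # vector in the unstable subspace

def contraction(M, v, n):
    """Ratio |M^n v| / |v| after n applications of M."""
    w = v
    for _ in range(n):
        w = apply(M, w)
    return norm(w) / norm(v)

n = 10
forward_on_stable = contraction(T, v_s, n)          # condition (2.1)
backward_on_unstable = contraction(T_inv, v_u, n)   # condition (2.2)
```

Both ratios come out equal to $\Lambda_-^{\,n}$ up to round-off. The number of steps is kept modest on purpose: in floating point, a tiny component along the expanding direction grows by $\Lambda_+$ per step and would eventually swamp the contracting vector.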
However, with the exception of the simple models we have discussed already, we do not know if some of the most physically realistic systems are Anosov. Certainly, hard-disk or hard-sphere systems are not Anosov, since the collision dynamics produces discontinuities in which one member of a pencil of nearby trajectories just includes a grazing collision involving two particles, but many of the other trajectories miss that collision altogether. However, Sinai and co-workers have shown that such systems still have many of the nice properties of Anosov systems [15,24]. In view of the difficulty in establishing whether a physical system is Anosov, Gallavotti and Cohen [33] have proposed a chaotic hypothesis, namely, that until one proves that a system of interest is
not Anosov, one should assume that it is, and then calculate its dynamical properties based upon that supposition. Their reasoning is simply that had we waited for someone to prove that a given physical system is ergodic before applying the methods of equilibrium statistical mechanics to it, there would have been precious little progress in understanding the properties of equilibrium systems over the last hundred years. Thus it would seem practical for physicists, at least, to compute the dynamical properties of systems by assuming that the systems are Anosov. We will adopt this attitude here, for the most part, but we will also treat the hard-disk Lorentz gas in the later sections, and make explicit use of the discontinuous dynamics. We will see that the failure of this system to be Anosov is not a serious handicap.

We conclude this section with the statement of an important theorem due to Sinai, Ruelle, and Bowen [6] that is a generalization of the ergodic theorem to general hyperbolic systems, whether they be an Anosov system, or a system which evolves to an attractor. Suppose then that we have: (1) a hyperbolic system described by a flow $X_t = S_t(X)$ for almost all X in some invariant region R, and (2) g(X) is a continuous function of X. Then there exists a unique measure $\mu$ such that for almost all (with respect to Lebesgue measure) X in R
$$\lim_{T\to\infty} \frac{1}{T} \int_0^T dt\, g(S_t(X)) = \frac{\int g(Y)\, d\mu}{\int d\mu} , \tag{27}$$
where the integrals on the right-hand side of Eq. (27) are over the invariant region R. The measure that is proved to exist by this SRB theorem is called an SRB measure. In the case that R is the full phase-space $\Gamma$, then the SRB measure is the microcanonical measure, and the SRB theorem is equivalent to the statement that Anosov systems are ergodic. However, the theorem is much more general than that, and as we have said, it applies to hyperbolic attractors as well.
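For the Arnold cat map, where the invariant region is the whole torus and the SRB measure is the Lebesgue measure, the content of Eq. (27) can be illustrated by comparing a time average along a single orbit with the corresponding area average. The Python sketch below rests on several assumptions of ours: the discrete-time version of Eq. (27) (a sum over iterates), the observable $g = x^2$ (area average 1/3), the starting point, and the orbit length. Because of round-off the computed sequence is only a pseudo-orbit, which in practice behaves like a typical orbit; the result should be read as an illustration, not a proof.

```python
import math

def cat_map(x, y):
    """One step of the Arnold cat map on the unit torus."""
    return (2.0 * x + y) % 1.0, (x + y) % 1.0

def g(x, y):
    # Area average over the unit square: integral of x**2 is 1/3.
    return x * x

def time_average(g, x, y, n_steps):
    """Average g over the first n_steps points of the orbit starting at (x, y)."""
    total = 0.0
    for _ in range(n_steps):
        total += g(x, y)
        x, y = cat_map(x, y)
    return total / n_steps

# Starting point chosen away from simple rational coordinates, since points
# with rational coordinates lie on periodic orbits of the exact map.
avg = time_average(g, math.sqrt(2.0) - 1.0, math.sqrt(3.0) - 1.0, 200_000)
```

The computed average agrees with 1/3 to within the statistical scatter expected for an orbit of this length.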
3. Applications of dynamical systems theory to transport: repellers and attractors

3.1. The escape-rate method

In our discussion of the KS entropy, we argued, in effect, that if we have a closed hyperbolic or Anosov system, the sum of the positive Lyapunov exponents is equal to the KS entropy (Pesin’s theorem, Eq. (18)). Suppose now that we have a situation frequently encountered in transport phenomena, where particles can escape from the system. Consider, for example, the diffusion of particles in a fluid that has absorbing boundaries. In that case, particles that reach the boundary will be lost from the system. Here we argue that it is possible to apply dynamical systems theory to such a case, and that this application shows a deep connection between transport coefficients and the properties of chaotic dynamical systems. This connection was first made by Gaspard and Nicolis [17]. This is an appropriate place to say that by a chaotic dynamical system we mean one that has a positive KS entropy, i.e., exponential separation of trajectories, and a bounded phase-space so that folding takes place.

First we consider the macroscopic description of such a diffusion phenomenon. Suppose we have some macroscopic region of characteristic size L, much greater than any of the microscopic lengths that characterize the system. Suppose that the system consists of one moving particle and
J.R. Dorfman / Physics Reports 301 (1998) 151—185
a collection of fixed scatterers, arranged in such a way that for almost every initial point the moving particle is never trapped inside the system. Now, suppose that the boundary of the system is absorbing, so that any particle that reaches the boundary will be absorbed, and removed from the system. The macroscopic description of this process is through the Fokker-Planck equation for $P(\mathbf{r}, t)$, the probability of finding the particle at point $\mathbf{r}$ at time $t$, which is
$$\frac{\partial P(\mathbf{r}, t)}{\partial t} = D\nabla^2 P(\mathbf{r}, t), \tag{28}$$
supplemented by the absorbing boundary condition, $P = 0$ on the boundary. (In this case the Fokker-Planck equation coincides with the diffusion equation.) Here the quantity of interest is the diffusion coefficient $D$. If we solve this equation and ask for the probability that the particle will still be in the system at time $t$, $P(t) = \int d\mathbf{r}\, P(\mathbf{r}, t)$, we find that $P(t) \sim \exp[-\gamma t]$, where the escape-rate $\gamma$ is
$$\gamma = D\,a/L^2, \tag{29}$$
where $a$ is a numerical constant that depends on the geometry of the region of scatterers. A remarkable fact is that there is also a microscopic expression for the same escape-rate in terms of the dynamical properties of the moving particle in the scattering region. To understand the microscopic formula we need to realize that although the moving particle will eventually escape the region with probability 1, there are an infinite number of possible trajectories for the particle whereby it never leaves the scattering region, i.e., it is never absorbed at the walls. The set of initial positions and velocities of the moving particle, that is, the set of initial points in the phase space of the moving particle, that lead to orbits which are entirely within the scattering region is called the repeller for this system, and we denote it by $R$. The repeller is typically a fractal set of Lebesgue measure zero in the phase-space, but with an uncountable number of points. $R$ will have a Hausdorff dimension which is slightly less than the dimension of the phase-space. The escape-rate $\gamma$ can be expressed in terms of the dynamical properties of the trajectories that are entirely confined to the repeller as
$$\gamma = \sum_i \lambda_{+,i}(R) - h_{KS}(R), \tag{30}$$
where the summation is over all the positive Lyapunov exponents, $\lambda_{+,i}(R)$, for trajectories on the repeller, and $h_{KS}(R)$ is the KS entropy for these trajectories [1,17,31,34]. Note that we have a system for which Pesin’s theorem does not hold, but it fails only by a small amount of order $L^{-2}$. To get some feeling for the origin of the escape-rate formula, though by no means a derivation, we return to our argument for the validity of Pesin’s theorem [9]. We want to find the rate at which we are obtaining information about the location of the initial points on the repeller.
As before, the stretching mechanism provides an exponential rate of information growth about the initial location of the point in phase-space, but the probability that the system has not disappeared through the boundary is decreasing exponentially, too. Therefore, we should expect that the rate of information growth for points on the repeller is obtained by combining these two exponential rates as
$$e^{t h_{KS}(R)} = e^{t\sum_i \lambda_{+,i}(R)}\, e^{-\gamma t}, \tag{31}$$
or, equivalently, Eq. (30) above. Simple one-dimensional maps which illustrate the derivation of the escape-rate formula, and exhibit the structure of the fractal repeller are described in Refs. [1,8,9,35].
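To make the macroscopic escape rate of Eq. (29) concrete, the following minimal sketch integrates the diffusion equation, Eq. (28), on a one-dimensional interval of length L with absorbing ends and measures the late-time decay rate of the survival probability. The one-dimensional geometry, the explicit finite-difference scheme, and all parameter values are assumptions made here for illustration only. For this particular geometry the slowest-decaying mode is sin(πx/L), so the geometric constant should come out close to a = π².

```python
import math

def escape_rate_1d(D=1.0, L=1.0, n_cells=50, t_total=0.5):
    """Estimate the escape rate for 1D diffusion with absorbing ends.

    Explicit finite differences for dP/dt = D d2P/dx2 with P = 0 at x = 0 and x = L.
    Geometry, scheme and parameter values are illustrative assumptions.
    """
    h = L / n_cells
    dt = 0.2 * h * h / D          # inside the stability limit dt <= h^2 / (2 D)
    steps = int(t_total / dt)
    # Uniform initial density on the interior grid points; boundary values stay zero.
    p = [0.0] + [1.0] * (n_cells - 1) + [0.0]
    survival_prev = sum(p) * h
    gamma = 0.0
    for _ in range(steps):
        new = p[:]
        for i in range(1, n_cells):
            new[i] = p[i] + D * dt / (h * h) * (p[i + 1] - 2.0 * p[i] + p[i - 1])
        p = new
        survival = sum(p) * h
        # Instantaneous decay rate of the survival probability P(t) ~ exp(-gamma t).
        gamma = -math.log(survival / survival_prev) / dt
        survival_prev = survival
    return gamma

D_coeff, L_size = 1.0, 1.0
gamma = escape_rate_1d(D=D_coeff, L=L_size)
a_estimate = gamma * L_size ** 2 / D_coeff    # invert Eq. (29): a = gamma L^2 / D
print(a_estimate, math.pi ** 2)
```

Read the other way round, the same relation gives D from a measured escape rate once the geometric constant is known, which is the use made of Eq. (29) in the escape-rate method.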
As an example the reader might consider the map
$$x' = \begin{cases} 3x & \text{for } 0 \le x \le 1/2, \\ 3x - 2 & \text{for } 1/2 < x \le 1, \end{cases} \tag{32}$$
and suppose that points mapped outside the interval $[0,1]$ have escaped. For this map the fractal repeller is the middle-third Cantor set, the escape rate is $\gamma = \ln(3/2)$, the Lyapunov exponent on the repeller is $\ln 3$, and the KS entropy is $\ln 2$. These results can be understood by realizing that the map stretches all intervals by a factor of 3, accounting for the Lyapunov exponent, and that dynamics on the repeller can be coded by sequences of two symbols, each symbol appearing with equal probability, accounting for the KS entropy on the repeller. If we now combine the macroscopic expression for the escape-rate, Eq. (29), with the microscopic one, Eq. (30), we obtain an expression for the coefficient of diffusion, $D$, a transport coefficient, in terms of the dynamical properties of the repeller,
$$D = \lim_{L\to\infty} \frac{L^2}{a}\Big[\sum_i \lambda_{+,i}(R) - h_{KS}(R)\Big], \tag{33}$$
where we have taken the thermodynamic limit to ensure a result that does not contain finite size effects. Eq. (33) is due to Gaspard and Nicolis [17]. Gaspard and Dorfman have extended this method to apply to the other transport coefficients, such as the shear and bulk viscosities, and thermal conductivity of a fluid, as well as to chemical reaction rates [36]. When applying the escape-rate formalism to systems like the Lorentz gas, for example, one usually finds that the Lyapunov exponents and KS entropy for the repeller are equal to their infinite system values plus very small corrections of order $L^{-2}$. Thus the transport coefficients are contained in these small corrections to the infinite system values, and the transport process is “coded” in the fractal structure of the repeller.
The escape-rate method has been applied to compute the properties of the fractal repeller for a number of model systems, including the multibaker model for diffusion in one-dimensional systems [37], the two-dimensional periodic Lorentz gas at sufficiently high density that the free path length of the moving particle is small compared to the dimensions of the system [35,38], the random Lorentz gas at low densities [39,40], but still with a mean free path small compared to the size of the system, and Lorentz lattice gases [41]. In these cases the Lyapunov exponents and KS entropies on the repeller differ from their values in the thermodynamic limit by terms of order $L^{-2}$. The chief difficulty in using the escape-rate method as a method to compute transport coefficients is that it is as hard or harder to compute the dynamical quantities, particularly $h_{KS}(R)$, as it is to compute the transport coefficient directly, using kinetic theory or other methods.
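As a check on the numbers quoted for the map of Eq. (32), the following sketch follows the set of initial points that have not yet escaped, using exact rational arithmetic. The function name and the choice of ten iterations are illustrative assumptions. The KS entropy is estimated here from the growth rate of the number of surviving intervals, which is adequate for this map only because the two symbols are equally probable.

```python
from fractions import Fraction
import math

def surviving_intervals(n):
    """Intervals of initial points still inside [0, 1] after n iterations of Eq. (32)."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        pulled_back = []
        for lo, hi in intervals:
            pulled_back.append((lo / 3, hi / 3))              # inverse of x' = 3x
            pulled_back.append(((lo + 2) / 3, (hi + 2) / 3))  # inverse of x' = 3x - 2
        intervals = sorted(pulled_back)
    return intervals

n = 10
intervals = surviving_intervals(n)
measure = sum(hi - lo for lo, hi in intervals)       # equals (2/3)^n

escape_rate = -math.log(float(measure)) / n          # decay rate of the surviving measure
lyapunov = math.log(3.0)                             # every branch has slope 3
ks_entropy = math.log(len(intervals)) / n            # two equally likely symbols per step
print(escape_rate, lyapunov - ks_entropy)
```

The two printed numbers agree, which is Eq. (30) for this map: the surviving set consists of $2^n$ intervals of length $3^{-n}$, the first stages of the construction of the middle-third Cantor set.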
It would be extremely valuable, for example, to have analytic methods to compute $h_{KS}(R)$ directly rather than having to compute $\gamma$ and $\lambda_{+,i}(R)$ and using the escape rate to obtain the KS entropy on the repeller, as one has to do in the analytic studies of the random Lorentz gas.
3.2. The Gaussian thermostat method
Another way to relate transport coefficients to dynamical quantities, in this case Lyapunov exponents, was developed as a result of efforts to simulate transport processes on the computer
using molecular dynamics techniques. It was found early on that the simulations of viscous flow were plagued by the viscous heating of the fluid. To deal with this, Evans, Hoover, Nosé, and co-workers developed a thermostatting method that keeps either the kinetic or total energy constant in the fluid undergoing shear [18]. Although the thermostat destroys the symplectic, Hamiltonian structure of the dynamics, we get as compensation, so to speak, a useful and interesting connection between the transport coefficients in the thermostatted system and the Lyapunov exponents for the thermostatted dynamical system, assuming the system approaches a non-equilibrium steady state. The “destruction” of the symplectic structure is something of an overstatement. Liverani and Wojtkowski [42] have shown that these thermostatted systems are conformally symplectic, with useful consequences for the Lyapunov spectrum, and Dettmann and Morriss [43] have shown that the thermostatted dynamics can be mapped onto Hamiltonian dynamics with unusual canonical position and momentum variables. We refer to their papers for details.
We illustrate the method again for a Lorentz gas with a particle moving among hard disk or sphere scatterers [8,9,44,45]. We suppose that the scatterers are arranged randomly in space or in a regular configuration with a finite free path for the moving particle. We also provide the moving particle with a charge $q$ and suppose that it is acted upon by an external electric field $\mathbf{E}$, in addition to the scatterers. If there were no thermostat, then the average kinetic energy of the moving particle would increase steadily with time, since the collisions do not affect the kinetic energy, but the field does. To counter this average increase in kinetic energy, we introduce a “thermostat” which keeps the kinetic energy of the moving particle constant between collisions with the scatterers. The equations for the motion of the particle between collisions are
$$\dot{\mathbf{r}} = \mathbf{p}, \qquad \dot{\mathbf{p}} = \mathbf{E} - \alpha\mathbf{p}, \tag{34}$$
where we have set both the mass, $m$, and the charge, $q$, of the moving particle equal to unity. Here $\alpha$ is a function of the momentum and electric field that is fixed by the condition that the kinetic energy be constant, i.e. $\mathbf{p}\cdot\dot{\mathbf{p}} = 0$. Thus
$$\alpha = \mathbf{E}\cdot\mathbf{p}/p^2. \tag{35}$$
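A minimal numerical sketch of the thermostatted free flight may help here. It integrates the momentum equation of Eq. (34) with the multiplier of Eq. (35) in two dimensions and checks that the kinetic energy stays constant while the field turns the momentum toward its own direction. The fourth-order Runge-Kutta integrator, the field strength, the initial direction and the step size are all assumptions chosen for illustration, and collisions with scatterers are not included.

```python
import math

def thermostatted_step(p, E, dt):
    """One RK4 step of dp/dt = E - alpha p, with alpha = (E . p) / p^2 (Eqs. (34), (35))."""
    def rhs(q):
        alpha = (E[0] * q[0] + E[1] * q[1]) / (q[0] ** 2 + q[1] ** 2)
        return (E[0] - alpha * q[0], E[1] - alpha * q[1])
    k1 = rhs(p)
    k2 = rhs((p[0] + 0.5 * dt * k1[0], p[1] + 0.5 * dt * k1[1]))
    k3 = rhs((p[0] + 0.5 * dt * k2[0], p[1] + 0.5 * dt * k2[1]))
    k4 = rhs((p[0] + dt * k3[0], p[1] + dt * k3[1]))
    return (p[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
            p[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)

E = (0.5, 0.0)                       # illustrative field along the x-axis
theta0 = 2.0                         # initial direction of the unit momentum (radians)
p = (math.cos(theta0), math.sin(theta0))
dt, steps = 1.0e-3, 2000
for _ in range(steps):
    p = thermostatted_step(p, E, dt)

kinetic_drift = abs(p[0] ** 2 + p[1] ** 2 - 1.0)
theta_numeric = math.atan2(p[1], p[0])
# For |p| = 1 and E along x the direction obeys d(theta)/dt = -E sin(theta),
# whose solution is tan(theta/2) = tan(theta0/2) exp(-E t).
theta_exact = 2.0 * math.atan(math.tan(theta0 / 2.0) * math.exp(-E[0] * dt * steps))
print(kinetic_drift, theta_numeric, theta_exact)
```

The drift in $p^2$ is at the level of the integration error, so the multiplier $\alpha$ acts as a constraint force rather than as a conventional friction with a fixed coefficient.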
These equations of motion are supplemented by the equations for elastic collisions of the particle with the scatterers,
$$\mathbf{p}' = \mathbf{p} - 2(\hat{\mathbf{n}}\cdot\mathbf{p})\hat{\mathbf{n}}, \tag{36}$$
where $\mathbf{p}'$ is the momentum of the particle after collision, and $\hat{\mathbf{n}}$ is a unit vector in the direction from the center of the scatterer to the point of contact with the moving particle at the instant of collision. These equations of motion define a reversible (under $\mathbf{p}\to-\mathbf{p}$, $\mathbf{r}\to\mathbf{r}$, and $t\to-t$), non-Hamiltonian system in which the phase-space volume is not conserved. This latter remark follows from the observation that
$$\frac{\partial}{\partial\mathbf{r}}\cdot\dot{\mathbf{r}} + \frac{\partial}{\partial\mathbf{p}}\cdot\dot{\mathbf{p}} = -(d-1)\alpha, \tag{37}$$
where $d$ is the number of spatial dimensions of the system. These equations have a number of remarkable consequences. Computer simulations show that the system develops a steady state, and that in this steady state the average value of $\alpha$ is positive, $\langle\alpha\rangle > 0$, as expected if the thermostat is to keep the system at constant kinetic energy [44,46,47]. Now, if the phase-space volume is not conserved but decreases on the average, while the system still settles into a steady state, at least within the resolution of the computer experiment, then it may be heading to a fractal attractor of lower information dimension than the phase-space dimension, here $2d-1$. We will give a simple example of a system that does just this in a moment. Moreover, one may define a Gibbs entropy, $S_G$, for this system by
$$S_G = -k_B \int\!\!\int d\mathbf{r}\, d\mathbf{p}\; f(\mathbf{r}, \mathbf{p}, t)\,[\ln f(\mathbf{r}, \mathbf{p}, t) - 1], \tag{38}$$
where we have assumed the existence of a phase-space distribution function, $f(\mathbf{r}, \mathbf{p}, t)$, for the moving particle. If this function exists and is differentiable, then it satisfies the conservation equation
$$\frac{\partial f}{\partial t} + \nabla_{\mathbf{r}}\cdot(f\dot{\mathbf{r}}) + \nabla_{\mathbf{p}}\cdot(f\dot{\mathbf{p}}) = 0. \tag{39}$$
We now use this conservation equation to compute the time rate of change of the Gibbs entropy due to the thermostat [45]. Here we suppose that the Gibbs entropy of the system would not change with time if the thermostat were not present, as is true for systems whose distribution functions satisfy the usual Liouville equation. Now, by assuming that the distribution function vanishes at the boundaries (in space and momentum) of the system, we find that
$$\frac{dS_G(t)}{dt} = -k_B\langle\alpha\rangle(d-1), \tag{40}$$
where the angular brackets denote an average with respect to the phase-space distribution function $f$. This result that the entropy decreases with time can be understood, to some extent, by noting that if the system is heading toward an attractor it is getting localized into a more restricted region of phase-space with a concomitant decrease of the Gibbs entropy. The conservation equation, Eq. (39), has yet another consequence of considerable interest. We express $f$ as $f(\mathbf{r}, \mathbf{p}, t) = 1/V(\mathbf{r}, \mathbf{p}, t)$, where $V$ is the volume of a small phase-space region about $(\mathbf{r}, \mathbf{p})$ where the system is located at time $t$. Then the equation for $f$ along a trajectory, $\dot{f} = (d-1)\alpha f$, easily obtained from Eq. (37) and Eq. (39), leads to $d\ln V(t)/dt = -(d-1)\alpha$ or
$$\Big\langle\frac{d\ln V(t)}{dt}\Big\rangle = -(d-1)\langle\alpha\rangle. \tag{41}$$
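The step from Eq. (39) to Eq. (40) can be spelled out as follows. This is only a sketch, under the stated assumption that $f$ is normalized and vanishes at the boundaries; the shorthand $\mathbf{u} = (\dot{\mathbf{r}}, \dot{\mathbf{p}})$ for the phase-space velocity and $\nabla = (\nabla_{\mathbf{r}}, \nabla_{\mathbf{p}})$ are notations introduced here and are not used elsewhere in the text. The first equality uses $\partial[f(\ln f - 1)]/\partial t = (\partial f/\partial t)\ln f$, the second uses Eq. (39), the third and fourth are integrations by parts, and the last uses Eq. (37).

```latex
\begin{align*}
\frac{dS_G}{dt}
  &= -k_B \int\!\!\int d\mathbf{r}\, d\mathbf{p}\; \frac{\partial f}{\partial t}\,\ln f
   = k_B \int\!\!\int d\mathbf{r}\, d\mathbf{p}\; \ln f\; \nabla\cdot(f\mathbf{u}) \\
  &= -k_B \int\!\!\int d\mathbf{r}\, d\mathbf{p}\; \mathbf{u}\cdot\nabla f
   = k_B \int\!\!\int d\mathbf{r}\, d\mathbf{p}\; f\, \nabla\cdot\mathbf{u}
   = -k_B\,(d-1)\,\langle\alpha\rangle .
\end{align*}
```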
The rate of change of small volume elements in phase-space is governed by the sum of the Lyapunov exponents,
$$V(\mathbf{r}, \mathbf{p}, t) \sim \exp\Big[t\sum_i \lambda_i(\mathbf{r}, \mathbf{p})\Big], \tag{42}$$
where we may imagine that the Lyapunov exponents depend upon the phase-space point, and the sum is over all of the exponents, not just the positive ones. From this it follows that
$$(d-1)\langle\alpha\rangle = -\Big\langle\sum_i \lambda_i(\mathbf{r}, \mathbf{p})\Big\rangle. \tag{43}$$
Therefore, we can relate the average friction coefficient to the average value of the sum of the Lyapunov exponents [18].
We now push this analysis into the realm of transport theory by returning to the calculation of the rate of change of the Gibbs entropy. In a non-equilibrium steady state, the rate of decrease of the Gibbs entropy must be at least matched by an increase in entropy of the heat reservoir that is removing the heat generated by the dynamical processes in the system. If we identify this positive rate of entropy production with the irreversible entropy production required by irreversible thermodynamics, which is $\sigma E^2/(k_B T)$ when measured in units of $k_B$, where $\sigma$ is the coefficient of electrical conductivity, $T$ is the thermodynamic temperature, and $k_B$ is Boltzmann’s constant, we find that
$$\sigma = -\frac{k_B T}{E^2}\Big\langle\sum_i \lambda_i\Big\rangle. \tag{44}$$
Eq. (44) is the main result of this analysis [18]. It shows that a transport coefficient, in this case the electrical conductivity, is proportional to the average rate of phase-space contraction as measured by the sum of all of the Lyapunov exponents. We will comment in the final section upon the obviously problematic issues concerning the use of entropy production arguments here. For small fields the electrical conductivity of the moving particle and its diffusion coefficient in the Lorentz gas are directly related by a simple factor, $D = [p^2/(q^2 m)]\sigma$, and Eq. (44) then provides another expression for the diffusion coefficient in terms of dynamical quantities, the sum of all of the Lyapunov exponents of the system. Similar expressions can be obtained for other transport coefficients. The coefficient of shear viscosity, in particular, has been studied in great detail for particles that interact with short-ranged repulsive potentials, i.e. WCA particles [18,48].
The evaluation of these expressions for transport coefficients appears to be formidable since one has to determine the complete Lyapunov spectrum for the thermostatted system in an external field. This situation has been simplified to a great extent by the observation, and in some cases the proof [42,48,49], of a conjugate pairing rule for these thermostatted systems. That is, the non-zero Lyapunov exponents can be ordered in conjugate pairs, $\lambda_i, \lambda_{-i}$, say, such that the sum of each conjugate pair is the same for all conjugate pairs. Consequently, the sum of all of the Lyapunov exponents is equal to the product of the number of conjugate pairs and the sum of one conjugate pair. This is of course a generalization of the conjugate pairing rule for symplectic systems, where the sum is zero.
The conjugate pairing rule for thermostatted systems has been observed in computer simulations, and has been proved analytically for a number of systems, provided the thermostat keeps the total kinetic energy [42,49], rather than the total energy, kinetic plus potential, constant. Of course for hard-sphere-type systems, there is no difference between the kinetic and total energies.
3.3. A simple map with an attractor
In order to see how a fractal attractor might form in the phase-space for a thermostatted system we turn once again to a simple two-dimensional map. Here we construct a map that has some of the features that we expect to see in a thermostatted Lorentz gas, for example. That is, we consider a map with reversible, hyperbolic dynamics, but allow for a phase-space volume contraction in part of the phase-space. Such a map is provided by a map $\Phi$ of the unit square onto itself given by
$$\Phi(x, y) = \begin{cases} (x/l,\ ry) & \text{for } 0 \le x \le l, \\ ((x-l)/r,\ r + ly) & \text{for } l < x \le 1, \end{cases} \tag{45}$$
Fig. 8. A simple map on the unit square with an attractor. The map is given by Eq. (45), and the areas of regions I and II individually are not preserved by the map.
where $l, r > 0$, $l \ne r$, and $l + r = 1$. This map is illustrated in Fig. 8. One can easily calculate the positive and negative Lyapunov exponents for this map and they are
$$\lambda_+ = l\ln\frac{1}{l} + r\ln\frac{1}{r}, \tag{46}$$
$$\lambda_- = l\ln r + r\ln l, \tag{47}$$
and
$$\lambda_+ + \lambda_- = (l - r)\ln\frac{r}{l} \le 0. \tag{48}$$
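The Lyapunov exponents of the map of Eq. (45) can also be estimated by direct iteration, which gives a quick check of Eqs. (46)-(48). The sketch below averages the logarithms of the local stretching and contraction factors over an ensemble of orbits. The value $l = 0.3$, the ensemble size, the orbit length and the random seed are illustrative assumptions.

```python
import math
import random

def lyapunov_exponents(l, n_orbits=1000, n_steps=200, seed=0):
    """Estimate the two Lyapunov exponents of the map of Eq. (45), with r = 1 - l."""
    r = 1.0 - l
    rng = random.Random(seed)
    sum_expand = 0.0      # accumulated log of the stretching factor in x
    sum_contract = 0.0    # accumulated log of the contraction factor in y
    for _ in range(n_orbits):
        x, y = rng.random(), rng.random()
        for _ in range(n_steps):
            if x <= l:                        # region I: (x, y) -> (x / l, r y)
                x, y = x / l, r * y
                sum_expand += math.log(1.0 / l)
                sum_contract += math.log(r)
            else:                             # region II: (x, y) -> ((x - l) / r, r + l y)
                x, y = (x - l) / r, r + l * y
                sum_expand += math.log(1.0 / r)
                sum_contract += math.log(l)
    total = n_orbits * n_steps
    return sum_expand / total, sum_contract / total

l = 0.3
r = 1.0 - l
lam_plus, lam_minus = lyapunov_exponents(l)
exact_plus = l * math.log(1.0 / l) + r * math.log(1.0 / r)     # Eq. (46)
exact_minus = l * math.log(r) + r * math.log(l)                # Eq. (47)
print(lam_plus, exact_plus)
print(lam_minus, exact_minus)
print(lam_plus + lam_minus, (l - r) * math.log(r / l))         # Eq. (48)
```

The estimate works because the $x$-dynamics preserves the uniform density, so an orbit spends a fraction $l$ of its time in region I and a fraction $r$ in region II.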
Here the equality holds in Eq. (48) only if $l = r = 1/2$. The SRB theorem tells us that the time average of any continuous function defined on the unit square will approach an ensemble average taken with respect to the appropriate SRB measure. Some reflection will convince the reader that the SRB measure is smooth in the expanding, or $x$-direction but fractal in the contracting, or $y$-direction. The details of this and related SRB measures are described in Ref. [50]. We say that the system has evolved to a fractal attractor with a measure that is smooth in the $x$-direction and fractal in the $y$-direction. This attractor is invariant under the map $\Phi$ since the limiting form of the set is not changed by the map, or for that matter by its inverse, $\Phi^{-1}$, which can easily be constructed. Another interesting point about this map $\Phi$ is related to its reversibility property. The map $\Phi^{-1}$ has exactly the same Lyapunov exponents as the map $\Phi$. However, the SRB measure for the inverse map is smooth in the expanding $y$-direction but fractal in the contracting $x$-direction, and the time-reversed attractor is also invariant under the two maps, $\Phi$ and its inverse. Sometimes we call this pair of invariant sets an attractor with a corresponding repeller. Depending on the direction of the time, one of the two invariant sets is an attractor, and the other is a repeller.
3.4. Fractal forms for diffusion coefficients
Our discussions in this section have been largely formal. To conclude this section, we will consider the case of deterministic diffusion in one dimension under the action of a simple, piecewise linear map, such as that illustrated in Fig. 9. This map has the general form $x_{n+1} = M(x_n)$, where $M(x)$ is a piecewise linear function of $x$, and satisfies the condition $M(x+1) = 1 + M(x)$. In the
Fig. 9. The one-dimensional piecewise linear map given by Eq. (49). Here $a > 2$.
interval $0 \le x \le 1$, $M(x)$ is given explicitly by
$$M(x) = \begin{cases} ax & \text{for } 0 \le x \le 1/2, \\ ax + (1 - a) & \text{for } 1/2 < x \le 1, \end{cases} \tag{49}$$
where the slope of the map $a$ satisfies the condition $a > 2$, which is necessary for diffusion to take place at all. Klages studied this and similar maps for his Ph.D. work [51]. The diffusion coefficient could be obtained by mapping this system onto a Markov process using a one-dimensional version of Markov partitions. All of the appropriate details are given in Klages’ dissertation and in Refs. [52,53]. Here we will consider the shape of the diffusion coefficient, $D$, as a function of the slope $a$ for $a > 2$. The values of $D$ for even integer $a$ can easily be found by mapping this process onto a simple random walk on a line with a non-zero probability, depending on $a$, of staying at the same site. For odd integer $a$, the problem is not much harder and $D$ can also be obtained easily. The method of Markov partitions gives $D$ for a dense set of values of $a$ on the real line, and this is used to give the general structure of the dependence of $D$ upon $a$. This dependence is illustrated in Fig. 10. There we see that $D$ is a fractal function of $a$ with quite a rich fractal structure. This rather surprising structure shows that transport coefficients are not always simple functions of the parameters of the system. Further studies on models of this type are underway by Klages [54] and by Groeneveld [55]. We have established some interesting relations between dynamical and transport quantities and showed that fractal structures in the microscopic phase-space have properties that are directly related to macroscopic transport properties of the system. In the next section we look at questions of the dynamical foundations of irreversible kinetic equations and show how such equations can be
Fig. 10. The diffusion coefficient $D$ as a function of the slope of the map $a$, for the map in Fig. 9. Curve (a) is the function over the range $2 \le a \le 8$. Curves (b)-(f) show the diffusion coefficient at various magnifications. The error bars are too small to be noticeable on these graphs.
used to compute, among other things, Lyapunov exponents and KS entropies, for the Lorentz gas, as an example.
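Returning briefly to the map of Eq. (49), the random-walk argument for even integer slopes can be made explicit with a few lines of exact arithmetic. When $a$ is an even integer, each subinterval of length $1/a$ of the unit cell is mapped linearly onto one full unit cell, so successive jumps between cells are independent and $D$ is half the mean-square jump. The function below is a sketch written for this illustration and covers even integer slopes only; it makes no attempt at the general Markov-partition method of Refs. [52,53].

```python
from fractions import Fraction
import math

def diffusion_coefficient_even_slope(a):
    """Exact D for the map of Eq. (49) when the slope a is an even integer."""
    if a % 2 != 0 or a < 2:
        raise ValueError("this sketch only covers even integer slopes a >= 2")
    mean_square_jump = Fraction(0)
    for k in range(a):
        # Midpoint of the k-th subinterval of length 1/a inside the unit cell.
        x = Fraction(2 * k + 1, 2 * a)
        image = a * x if x <= Fraction(1, 2) else a * x + (1 - a)
        jump = math.floor(image)              # index of the unit cell the point lands in
        mean_square_jump += Fraction(jump * jump, a)
    return mean_square_jump / 2               # D = <jump^2> / 2 for independent steps

for slope in (2, 4, 6, 8):
    print(slope, diffusion_coefficient_even_slope(slope))
```

For $a = 2$ no point leaves its cell and $D$ vanishes, consistent with the condition $a > 2$ for diffusion. For $a = 4$ a point stays with probability $1/2$ and jumps one cell to the left or right with probability $1/4$ each, which gives $D = 1/4$.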
4. The dynamical foundations of kinetic equations
Two ideas have been developed in the previous sections which seem to be central to an understanding of the dynamical origins of irreversible phenomena in fluids. In particular, we can now isolate some of the key features that must be taken into account when one tries to understand the efficacy of stochastic equations such as the Boltzmann transport equation for describing irreversible processes in gases. Briefly stated, these two central ideas are:
1. One can, for Anosov systems and probably with suitable modifications for hard-sphere systems as well, separate the tangent space at almost every point in phase-space into expanding, contracting and center manifolds. The expanding and contracting manifolds exchange behavior on the time-reversed motion.
2. Measures tend to be smooth in the unstable, or expanding, directions, and fractal in the stable, or contracting, directions.
Therefore, one can imagine the following scenario for the derivation of an irreversible equation from a dynamical analysis. Consider some initial ensemble distribution in phase-space that is a measurable function of the phase-space variables with respect to the microcanonical measure, say. To be specific consider an initial distribution which is unity on a certain set $A$ of small measure and zero elsewhere. In the course of time, this set will be stretched along the unstable directions, becoming more and more uniformly distributed in these directions, but forming more and more of a fractal structure in the stable directions. If then we project the distribution function onto the unstable directions, say by integrating over the stable directions and thereby obtaining a reduced distribution function, we ought to see the approach to a uniform distribution function in the projected variables. To make this argument less abstract and more transparent, we consider its application to our familiar models, the baker map [8,56,57] and the Arnold cat map.
To begin we will need the version of Liouville’s equation appropriate for maps that evolve at discrete time intervals, called the Perron-Frobenius equation [1,58].
4.1. The Perron-Frobenius equation
We consider a map of some region $R$ onto itself. To be definite we take $R$ to be two dimensional, and the map to be of the form
$$(x', y') = \Phi(x, y) = (\Phi_1(x, y), \Phi_2(x, y)). \tag{50}$$
We consider $R$ to be our phase-space, and we suppose there is some phase-space distribution evolving from some initial distribution on $R$ denoted by $\rho_0(x, y)$. Then after $n$ iterations of the map, a new distribution function, $\rho_n(x, y)$, is obtained which satisfies the recursion equation
$$\rho_n(x, y) = \int\!\!\int_R dx'\, dy'\; \delta(x - \Phi_1(x', y'))\,\delta(y - \Phi_2(x', y'))\,\rho_{n-1}(x', y'). \tag{51}$$
This recursion equation, the Perron-Frobenius equation, is self-evident provided the delta functions are not to be evaluated at possible discontinuity points of the distribution function. The delta functions can be evaluated in terms of the pre-image points of $(x, y)$, and the Jacobian of the map $\Phi$ at these pre-image points. The examples to which we apply this equation are simple area preserving maps on the unit square for which the Jacobian of $\Phi$ is simply unity, but more general cases can be treated as well.
4.2. The baker’s map
To apply the Perron-Frobenius equation to the baker’s map, we take $R$ to be the unit square, and the mapping $\Phi$ to be given by Eq. (5). An elementary calculation leads to the recursion equation
$$\rho_n(x, y) = \begin{cases} \rho_{n-1}(x/2,\ 2y) & \text{for } 0 \le y < 1/2, \\ \rho_{n-1}\big(\tfrac{x+1}{2},\ 2y - 1\big) & \text{for } 1/2 \le y \le 1. \end{cases} \tag{52}$$
For this map the unstable direction is the $x$-direction, and we can define a reduced distribution function that only depends on $x$ by integrating over the $y$ variation of $\rho_n(x, y)$. We, therefore, define $W_n(x)$ by
$$W_n(x) = \int_0^1 dy\, \rho_n(x, y) = \int_0^{1/2} dy\, \rho_{n-1}(x/2, 2y) + \int_{1/2}^1 dy\, \rho_{n-1}\Big(\frac{x+1}{2}, 2y - 1\Big) = \frac{1}{2}\Big[W_{n-1}\Big(\frac{x}{2}\Big) + W_{n-1}\Big(\frac{x+1}{2}\Big)\Big]. \tag{53}$$
This equation can be considered to be a model “kinetic” equation since it is obtained from the discrete form of Liouville’s equation by integrating over the $y$ variation. The reader may also notice that Eq. (53) is itself the Perron-Frobenius equation for the map $x' = 2x \bmod 1$. It is easy to see that $W_n = 1$, or any non-zero constant, is a solution of this equation, and we require that $\rho_n$ and $W_n$ be non-negative so that they can represent probability distributions. Moreover, there is an H-theorem associated with Eq. (53), very similar to Boltzmann’s H-theorem. That is, we define $H_n = \int_0^1 dx\, W_n(x)\ln W_n(x)$, and elementary inequalities based upon the observation that a straight line connecting two points on the convex curve $z = u\ln u$ lies above the curve show that $H_n \le H_{n-1}$. The equality sign holds only if $W_n$ is a constant. Finally, Eq. (53) can be solved if $W_0(x)$ is a measurable function of $x$, and any arbitrary, normalized initial distribution will approach 1 as $n \to \infty$. It is important to note that we have done nothing more than integrate the Perron-Frobenius equation for this map over the coordinate in the stable direction, assuming only that the initial distribution function is well behaved. As the time increases, the reduced distribution function becomes an increasingly smoother function of $x$, finally reaching an equilibrium distribution, in a monotonic, irreversible manner. Now, if we were to start with the same initial distribution function $\rho_0(x, y)$, but follow the time-reversed map, then the unstable direction is parallel to the $y$-axis, and the reduced distribution function would be obtained by integrating the Perron-Frobenius equation for the time-reversed motion over $x$. The resulting equation would be identical to Eq. (53), with $x$ replaced by $y$. The reversibility of the underlying equations of motion is still preserved but irreversible equations are obtained by selecting the direction of time and projecting the full distribution function onto the unstable manifold. The reader is also invited to see what happens if the distribution function is projected onto the stable manifold, for the baker’s map. The paper by Tasaki and Gaspard [59] should prove useful for this analysis.
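The monotonic approach to equilibrium under Eq. (53) is easy to see numerically for a piecewise-constant reduced distribution. In the sketch below $W_n$ is stored on equal bins of the unit interval; for an even number of bins the recursion of Eq. (53) maps such a function exactly onto another one of the same kind. The number of bins and the initial condition, with all probability in the first bin, are assumptions made for illustration.

```python
import math

def reduced_step(w):
    """One step of W_n(x) = [W_{n-1}(x/2) + W_{n-1}((x+1)/2)] / 2 on equal bins."""
    n = len(w)                       # number of bins, assumed even
    half = n // 2
    # For x in bin j, x/2 lies in bin j // 2 and (x+1)/2 lies in bin half + j // 2.
    return [0.5 * (w[j // 2] + w[half + j // 2]) for j in range(n)]

def h_function(w):
    """H = integral over [0, 1] of W ln W, with the convention 0 ln 0 = 0."""
    return sum(v * math.log(v) for v in w if v > 0.0) / len(w)

bins = 16
w = [float(bins)] + [0.0] * (bins - 1)    # normalized: all probability in the first bin
history = [h_function(w)]
for _ in range(5):
    w = reduced_step(w)
    history.append(h_function(w))
print(w)
print(history)
```

For this initial condition the support of $W_n$ doubles at every step until the distribution is uniform, and $H_n$ decreases by $\ln 2$ per step from $\ln 16$ to zero.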
4.3. The Arnold cat map
As an extension of the ideas just discussed we consider now the Arnold cat map. The analysis of the corresponding Perron-Frobenius equation is complicated by the fact that the unstable and stable directions are not oriented along the $x$- and $y$-axes, respectively, but are rotated by a fixed angle. Instead of working out the mathematics let us just look at the map on the computer. We will be interested in the reduced distributions projected onto the $x$-axis or onto the $y$-axis. This is a little closer to what we need to understand the Boltzmann equation since for a dilute gas with $N$ particles we do not know the directions of the stable and unstable manifolds in phase-space, but we do consider a projected or reduced distribution function, namely the single-particle distribution function obtained by integrating the Liouville distribution over the phases of all but one particle. We start with the same initial distribution function for the Arnold cat map as illustrated in Fig. 4, that is, where all of the distribution is concentrated in the lower left corner of the unit square. In Fig. 11 we plot the distribution function projected onto the $x$-axis, for various times starting with a step function at $t = 0$. Note that this function also becomes smooth with time and appears to have reached equilibrium after four steps. Now, consider the same map but project the distribution function onto the $y$-axis. This is illustrated in Fig. 12. Exactly the same things happen. This reduced distribution function also approaches an equilibrium value after four time steps or so. One can also show that H-functions defined in terms of either of these distribution functions decrease monotonically in time. If one reverses the motion, then very similar things will happen: both of the reduced distribution functions, whether projected on the $x$ or $y$ axes, will approach equilibrium.
Only if we project somehow onto the stable direction will we not see an evolution to a smooth equilibrium distribution. We conclude from a study of these maps that the approach of a reduced distribution function to a smooth equilibrium value depends only on the construction of a reduced distribution function that is obtained by projecting the full distribution function onto a direction that is not orthogonal to unstable directions in phase-space. Of course this is really a conjecture. It is an important and open problem to work this out in detail for more realistic systems. Furthermore, we might be asking too much of realistic systems that this picture be realized in essence for them. Instead only much weaker properties may really be necessary for the variables of physical interest, such as the pressure exerted by a gas on the walls of the container, to show an approach to equilibrium. However, for Anosov systems, at least, where there is a very clear decomposition of the entire phase-space into stable, unstable, and center manifolds, the picture sketched above is likely to hold.
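A reader who wishes to repeat the computer experiment on the Arnold cat map might start from a sketch like the one below. It assumes the commonly used form $(x, y) \to (x + y,\ x + 2y) \bmod 1$ for the cat map, which may differ from the form defined earlier in these notes by a relabeling of the axes. The initial ensemble, uniform on the square $[0, 0.1]^2$ in the lower left corner, the number of points, the number of bins and the random seed are likewise illustrative assumptions.

```python
import random

def cat_map(x, y):
    """Arnold cat map on the unit torus (assumed form: (x, y) -> (x + y, x + 2y) mod 1)."""
    return (x + y) % 1.0, (x + 2.0 * y) % 1.0

def projected_histogram(points, axis, n_bins=10):
    """Fraction of points per bin after projecting onto the x (axis=0) or y (axis=1) axis."""
    counts = [0] * n_bins
    for pt in points:
        counts[min(int(pt[axis] * n_bins), n_bins - 1)] += 1
    return [c / len(points) for c in counts]

rng = random.Random(1)
# Initial ensemble concentrated in the lower left corner of the unit square.
points = [(0.1 * rng.random(), 0.1 * rng.random()) for _ in range(20000)]
for n in range(9):
    hx = projected_histogram(points, 0)
    hy = projected_histogram(points, 1)
    # Spread between the fullest and emptiest bin; it vanishes for a uniform distribution.
    print(n, max(hx) - min(hx), max(hy) - min(hy))
    points = [cat_map(x, y) for x, y in points]

hx_final = projected_histogram(points, 0)   # projections after nine iterations
hy_final = projected_histogram(points, 1)
```

Both projections flatten at the same rate, because neither coordinate axis is orthogonal to the unstable direction of the map.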
Fig. 11. The $x$-distribution function for the Arnold cat map at iterations $n = 0, 1, 2, 3, 4$, showing that the distribution becomes nearly uniform after 4 iterations. The initial condition is the same as in Fig. 4. These curves are affected by a small computer generated error.
Fig. 12. The y-distribution function for the Arnold cat map at the same iterations as in Fig. 11, for the same initial conditions.
4.4. The random Lorentz gas
In this section we briefly indicate how the Boltzmann equation itself may be used to calculate the Lyapunov exponents of a simple model of a gas, the hard-disk Lorentz model [39,40,46]. To do this we combine the ideas of Boltzmann with those of Ya.G. Sinai [15], thereby suggesting the existence of a deep connection between “molecular chaos”, the stochastic hypothesis that underlies the Boltzmann transport equation, and dynamical chaos, which shows that stochastic-like behavior is possible even for deterministic mechanical systems. Consider a collection of hard disks of radius $a$ placed at random on a plane with number density $n$, such that $na^2 \ll 1$. The moving particle always travels with the same speed, $v$, and makes elastic collisions with the scatterers. To calculate the Lyapunov exponent we consider two trajectories on the constant energy surface that are infinitesimally close to each other, but separating, nonetheless. We consider the separation of a diverging bundle of trajectories in position space, since if this separation grows exponentially, so will the separation in velocity space, since both have to grow with the same exponential. We suppose without loss of generality that if we follow the trajectory bundle backward in time all the trajectories go through a common point of intersection. The separation of two trajectories in the bundle can be expressed in terms of a radius of curvature, $\rho$, which is the distance from the present position of the trajectories to the point of intersection. If we denote the distance between the two trajectories by $D$, one finds that (see Fig. 13)
$$\frac{dD}{dt} = \frac{vD}{\rho}. \tag{54}$$
This equation has the simple solution
AP
D(t)"exp
t
0
dq
B
v D(0) . o(q)
(55)
It now follows immediately, from the fact that this system is ergodic that the positive Lyapunov exponent, j , is given as ` j "vS1/oT , `
(56)
where the brackets denote an ensemble average. Therefore, we need to determine the ensemble distribution for the radius of curvature. This is the point where the Boltzmann equation becomes useful. Let us consider an “extended” distribution function, $F(\mathbf{r}, \mathbf{v}, \rho, t)$, which includes the position of the moving particle, $\mathbf{r}$, its velocity, $\mathbf{v}$, the time, $t$, and the radius of curvature, $\rho$, as variables. In a sense this is a two-particle distribution function since the separation of two trajectories is included as one of the variables. Now we know how $\mathbf{r}$ and $\mathbf{v}$ change during free motion and at a collision with a scatterer. The radius of curvature $\rho$ grows linearly with time during the free motion as $\rho(t + dt) = \rho(t) + v\,dt$, and changes from a precollision value $\rho_-$ to a post-collision value
Fig. 13. The growth of the radius of curvature between collisions.
Fig. 14. The change in the radius of curvature at collision.
$\rho_+$ at a collision, where

$$\frac{1}{\rho_+} = \frac{1}{\rho_-} + \frac{2}{a \cos\phi} , \tag{57}$$

and $\phi$ is the angle of incidence at collision [15]. Typically $\rho_-$ is on the order of the mean free path between collisions, and $\rho_+$ is always less than or equal to $a/2$ (see Fig. 14). Using standard methods of kinetic theory one can derive the following equation for $F(\mathbf{r}, \mathbf{v}, \rho, t)$, the extended Lorentz-Boltzmann equation

$$\left[\frac{\partial}{\partial t} + \mathbf{v}\cdot\nabla + v\frac{\partial}{\partial \rho}\right] F(\mathbf{r}, \mathbf{v}, \rho, t) = -\nu F(\mathbf{r}, \mathbf{v}, \rho, t) + \frac{\nu}{2} \int_{-\pi/2}^{\pi/2} d\phi \int_0^{\infty} d\rho'\, \cos\phi\; \delta\!\left(\rho - \frac{(a\cos\phi)/2}{1 + (a\cos\phi)/(2\rho')}\right) F(\mathbf{r}, \mathbf{v}', \rho', t) , \tag{58}$$

where $\mathbf{v}' = \mathbf{v} - 2(\hat{\mathbf{n}}\cdot\mathbf{v})\hat{\mathbf{n}}$, as in Eq. (36), and $\nu = 2nav$ is the average collision frequency for the moving particle. This equation can be solved easily for an equilibrium system where $F$ does not depend on $\mathbf{r}$ or $t$. The details of the derivation and the solution are given elsewhere [9], but here we give the final result, to lowest order in the density,

$$\lambda_+ = 2nav\left(1 - C - \ln(2na^2)\right) , \tag{59}$$

where $C$ is Euler’s constant. We emphasize the point that not only does dynamical systems theory allow us to understand, to some extent at least, the dynamical origins of kinetic theory, but kinetic theory in turn can be used to compute important quantities needed to show that the ideas we have been discussing are indeed relevant for our understanding of irreversible processes in fluids.
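The chain of reasoning in Eqs. (55)-(59) can be checked numerically with a short stochastic sketch. This is not the solution method of Ref. [9]; it is an illustration under stated assumptions: free-flight times are drawn independently from an exponential distribution with rate $\nu = 2nav$ (the molecular-chaos assumption), and the collision angle is weighted by $\cos\phi$ as in Eq. (58). The function names and parameter values are hypothetical. Between collisions $\rho$ grows linearly, so the integral of $v/\rho$ over one free flight in Eq. (55) is $\ln(\rho_-/\rho_+)$, and the time average of $v/\rho$ estimates Eq. (56).

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler's constant, written C in Eq. (59)


def lyapunov_low_density(n, a, v):
    """Eq. (59): lambda_+ = 2 n a v (1 - C - ln(2 n a^2))."""
    return 2.0 * n * a * v * (1.0 - EULER_GAMMA - math.log(2.0 * n * a * a))


def lyapunov_sampled(n, a, v, collisions=100_000, seed=1):
    """Time average of v / rho along a stochastically sampled trajectory."""
    rng = random.Random(seed)
    nu = 2.0 * n * a * v      # average collision frequency
    rho_plus = 0.5 * a        # arbitrary start; its influence is negligible
    log_growth = 0.0          # accumulates the integral of v / rho over time
    elapsed = 0.0
    for _ in range(collisions):
        tau = rng.expovariate(nu)          # free-flight time (assumption)
        rho_minus = rho_plus + v * tau     # linear growth during free flight
        log_growth += math.log(rho_minus / rho_plus)
        elapsed += tau
        # phi distributed as cos(phi)/2 on [-pi/2, pi/2] <=> sin(phi) uniform
        sin_phi = rng.uniform(-1.0, 1.0)
        cos_phi = max(math.sqrt(1.0 - sin_phi * sin_phi), 1e-12)
        # Collision rule, Eq. (57)
        rho_plus = 1.0 / (1.0 / rho_minus + 2.0 / (a * cos_phi))
    return log_growth / elapsed


if __name__ == "__main__":
    n, a, v = 1e-3, 1.0, 1.0   # dilute gas: n a^2 << 1
    print("Eq. (59):", lyapunov_low_density(n, a, v))
    print("sampled :", lyapunov_sampled(n, a, v))
```

At low density the two numbers should agree up to statistical noise and corrections of higher order in $na^2$, which Eq. (59) neglects.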
5. Conclusions

Our excursion into the dynamical foundations of the kinetic theory of gases has provided us with a very stimulating and suggestive picture of irreversible processes. If we can show that a typical system like a gas of hard spheres or WCA particles is indeed an Anosov system, then many striking consequences are immediate. In particular, the phase space has a tangent space almost everywhere
which can be decomposed into unstable, stable, and neutral subspaces. Measures and distribution functions become smooth with time along the unstable directions, approaching equilibrium values as the time gets very large compared to characteristic microscopic times. The reversibility of the equations of motion reflects itself in the interchange of stable and unstable directions under time reversal, so that equilibrium states appear in the time-reversed motion as the distribution function becomes smooth along the unstable directions of the time-reversed motion. If this picture is correct then we can relate microscopic dynamical quantities such as Lyapunov exponents and KS entropies to macroscopic quantities such as transport coefficients. Here we illustrated this connection using the escape-rate formalism and the method of Gaussian thermostats. However, these are only two of a number of useful methods for making this connection. For example, methods based on periodic orbit expansion or Ruelle-Pollicott resonances are under very active study at the present time [5,60]. We conclude with a few remarks:

1. We have presented a picture here of how things might be for a typical system of interest to those applying kinetic theory methods to compute transport properties. It is not known whether this picture is indeed correct. We must study a number of systems of increasingly realistic structure in order to get deeper insights into this question. Progress is being made in this direction using both computer and analytical studies of a number of different systems.

2. We know very little about the complicated fractal structures that underlie transport processes in fluids, i.e., the attractors and repellers that we have discussed earlier. Only for simple two-dimensional maps do we have any good intuition about these fractals. Even higher-dimensional maps would provide useful examples for more complicated systems.

3. We have not mentioned many aspects of dynamical systems theory that are relevant for a further analysis of the topics discussed here. In particular, the thermodynamic formalism of Sinai, Bowen and Ruelle [14,61] provides a close, but often very formal, connection between dynamical systems theory and equilibrium statistical mechanics. This connection is surely not accidental, and can be very helpful for understanding the dynamical foundations of statistical mechanics in general.

4. The reader may have been surprised by the appearance of a negative entropy production in the discussion of the Gaussian thermostatted systems in Section 3.2. This negative sign is really not mysterious, since we are gaining more information about the system as its phase-space trajectory becomes more and more localized on an attractor. In equilibrium thermodynamics one can reduce the entropy of a gas by applying work in a reversible, isothermal compression. However, the entropy of the universe stays constant when one takes into account the entropy change of the thermostat that works on the gas. Here, similarly, one has to consider the system plus the thermostat in order to make sense of the entropy production. In a steady state the total entropy production must be zero (or positive), so the negative entropy production of the system must be matched by a positive entropy production in the thermostat. A number of papers have been written recently which discuss this topic in more detail. Ruelle has proved that the entropy production by the thermostat must be strictly positive [62]. Tel, Vollmer and Breymann have discussed the entropy production for a simple model based upon the baker map from the point of view of coarse graining the phase-space [63,64]. Here the point is that we are unable, experimentally, to follow the localization of the thermostatted system onto
the attractor because we are limited to the resolution of our observing devices. Therefore, beyond a certain point we will not see the entropy of the thermostatted system decrease. Instead, we will see the entropy of the coarse grained, observed, system increase as the microscopic processes take place on a scale that is too fine for our measuring devices. In the Tel, Vollmer, and Breymann analysis the positive entropy production is taken to be the difference between the information available in a coarse grained description where nothing is changing beyond a certain scale and the information available in a complete microscopic description where one can follow the trajectory onto the attractor. There still needs to be a study where one connects this idea of coarse graining to a physical description of a thermostat and shows that including the entropy production in the thermostat is equivalent to coarse graining the phase-space of the thermostatted system.

5. P. Gaspard [65] has discussed entropy production for a simple reversible system which consists of a chain of baker’s maps coupled in such a way that a density gradient may be established and maintained. He shows that the positive entropy production associated with this process is connected to the fractal structure that develops in the system as the steady density gradient is established. The phase-space density is not a differentiable function of the phase-space variables, and one can only define a coarse grained entropy for the system. Gaspard shows that this coarse grained entropy is positive and its production has the form required by irreversible thermodynamics.

6. We have not mentioned quantum mechanics at all. It seems inconsistent with our physical understanding of matter to restrict our attention exclusively to classical systems. The quantum versions of the ideas discussed here are still in the early phases of development and we refer the reader to the literature for further details [66]. This is an area where our understanding of even the simplest systems is not entirely secure.

To conclude, there is a lot of work still to do in order to understand in detail the chaotic foundations of transport theory and to provide a clear microscopic explanation of all of the phenomena that we associate with irreversible processes in fluids.

Acknowledgements

The author would like to thank Henk van Beijeren, E.G.D. Cohen, Matthieu Ernst, Pierre Gaspard, Edward Ott, and Tamas Tel for many helpful conversations as well as Thomas Gilbert for many helpful discussions on SRB measures. He would like to thank Charles Ferguson and Rainer Klages for supplying a number of the figures, and he would also like to thank them as well as Thomas Gilbert, Luis Nasser, and Debabrata Panja for many helpful and clarifying discussions, and especially Ramses van Zon for a critical reading of this manuscript. He thanks the National Science Foundation for support under Grant PHY-96-00428.

References

[1] E. Ott, Chaos in Dynamical Systems, Cambridge University Press, Cambridge, 1992.
[2] H. Poincaré, in: D.L. Goroff (Ed.), New Methods in Celestial Mechanics, AIP Press, New York, 1993.
[3] J. Lebowitz, O. Penrose, Physics Today 26 (1973) 23.
[4] J.-P. Eckmann, D. Ruelle, Rev. Mod. Phys. 57 (1985) 617.
[5] P. Gaspard, Chaos, Scattering, and Statistical Mechanics, Cambridge University Press, Cambridge, to appear.
[6] J. Guckenheimer, P. Holmes, Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields, Springer, Berlin, 1983.
[7] K.T. Alligood, T.D. Sauer, J.A. Yorke, Chaos: An Introduction to Chaotic Systems, Springer, New York, 1997.
[8] J.R. Dorfman, An Introduction to Chaos in Non-Equilibrium Statistical Mechanics, Lecture Notes, University of Utrecht, and University of Maryland, College Park, 1997.
[9] J.R. Dorfman, H. van Beijeren, Physica A 240 (1997) 12.
[10] J.R. Dorfman, H. van Beijeren, in: B. Berne (Ed.), Statistical Mechanics, B, Plenum Press, New York, 1977.
[11] P. Resibois, M. deLeener, Classical Kinetic Theory of Fluids, Wiley, New York, 1977.
[12] S. Chapman, T.G. Cowling, The Mathematical Theory of Non-Uniform Gases, 3rd Ed., Cambridge University Press, Cambridge, 1970.
[13] R. Bowen, Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms, Lecture Notes in Mathematics, vol. 470, Springer, Berlin, 1975; On Axiom A Diffeomorphisms, Regional Conference Series in Mathematics, vol. 35, American Mathematical Society, Providence, 1978; R. Bowen, D. Ruelle, Invent. Math. 29 (1975) 181.
[14] D. Ruelle, Thermodynamic Formalism, Addison-Wesley, Reading MA, 1978; Am. J. Math. 98 (1976) 619; Phys. Math. IHES 50 (1979) 275; Elements of Differentiable Dynamics and Bifurcation Theory, Academic Press, New York, 1989; D. Ruelle, Ya.G. Sinai, Physica A 140 (1986) 1.
[15] Ya.G. Sinai, Introduction to Ergodic Theory, Princeton University Press, Princeton, 1976; Russ. Math. Surv. 25 (1970) 137; Dynamical Systems, World Scientific, Singapore, 1991; Russ. Math. Surv. 21 (1972) 21; Topics in Ergodic Theory, Princeton University Press, Princeton, 1994.
[16] A. Katok, B. Hasselblatt, Introduction to the Modern Theory of Dynamical Systems, Cambridge University Press, Cambridge, 1995.
[17] P. Gaspard, G. Nicolis, Phys. Rev. Lett. 65 (1990) 1693.
[18] D.J. Evans, G.P. Morriss, Statistical Mechanics of Nonequilibrium Liquids, Academic Press, London, 1990; W.G. Hoover, Computational Statistical Mechanics, Elsevier, Amsterdam, 1991. See also papers collected in M. Mareschal, B. Holian (Eds.), Microscopic Simulations of Complex Hydrodynamic Phenomena, Plenum Press, New York, 1992.
[19] G.E. Uhlenbeck, G.W. Ford, Lectures in Statistical Mechanics, American Mathematical Society, Providence, 1963.
[20] V.I. Arnold, A. Avez, Ergodic Problems of Classical Mechanics, Benjamin, New York, 1968.
[21] G.D. Birkhoff, Proc. Nat. Acad. Sci. 17 (1931) 656.
[22] D. Szasz, Stud. Scient. Math. Hungarica 31 (1996) 299; Physica A 194 (1993) 86.
[23] N. Simanyi, D. Szasz, Ergodicity of Hard Spheres in a Box, to be published.
[24] C. Liverani, M. Wojtkowski, Ergodicity in Hamiltonian Systems in Dynamics Reported (New Series), vol. 4, Springer, Berlin, 1995, p. 130.
[25] G. Gallavotti, D. Ornstein, Comm. Math. Phys. 38 (1974) 83.
[26] Ya.G. Sinai, Funct. Anal. Appl. 13 (1980) 192.
[27] K. Peterson, Ergodic Theory, Cambridge University Press, Cambridge, 1983.
[28] P. Billingsley, Ergodic Theory and Information, Wiley, New York, 1965.
[29] V.I. Arnold, Mathematical Methods in Classical Mechanics, 2nd edn., Springer, Berlin, 1989.
[30] Ya. Pesin, Russ. Math. Surv. 32 (1997) 55.
[31] P. Gaspard, J.R. Dorfman, Phys. Rev. E 52 (1995) 3525.
[32] O.E. Lanford, in: G. Iooss, R.H.G. Helleman, R. Stora (Eds.), Chaotic Behavior of Deterministic Systems, North Holland, Amsterdam, 1983, p. 6.
[33] G. Gallavotti, E.G.D. Cohen, J. Stat. Phys. 80 (1995) 931.
[34] T. Tel, in: Hao Bai-Lin (Ed.), Directions in Chaos, World Scientific, Singapore, 1990.
[35] P. Gaspard, F. Baras, in: M. Mareschal, B. Holian (Eds.), Microscopic Simulations of Complex Hydrodynamic Phenomena, Plenum Press, New York, 1992, p. 301.
[36] J.R. Dorfman, P. Gaspard, Phys. Rev. E 51 (1995) 28.
[37] P. Gaspard, J. Stat. Phys. 68 (1992) 673.
[38] P. Gaspard, F. Baras, Phys. Rev. E 51 (1995) 5332.
[39] H. van Beijeren, J.R. Dorfman, Phys. Rev. Lett. 74 (1995) 4412; 76 (1996) 3238(E).
[40] A. Latz, H. van Beijeren, J.R. Dorfman, Phys. Rev. Lett. 78 (1997) 207.
[41] M.H. Ernst, J.R. Dorfman, R. Nix, D. Jacobs, Phys. Rev. Lett. 74 (1995) 4416.
[42] M.P. Wojtkowski, C. Liverani, Conformally Symplectic Dynamics and Symmetry of the Lyapunov Spectrum, preprint, 1997.
[43] C.P. Dettmann, G.P. Morriss, Phys. Rev. E 54 (1996) 2495; 55 (1997) 3693.
[44] G.P. Morriss, C.P. Dettmann, L. Rondoni, Physica A 240 (1997) 84; J.P. Lloyd, M. Niemeyer, L. Rondoni, G.P. Morriss, Chaos 5 (1995) 536.
[45] N.I. Chernov, G.I. Eyink, J.L. Lebowitz, Ya.G. Sinai, Comm. Math. Phys. 154 (1993) 569.
[46] H. van Beijeren, J.R. Dorfman, E.G.D. Cohen, Ch. Dellago, H.A. Posch, Phys. Rev. Lett. 77 (1996) 1974.
[47] Ch. Dellago, H.A. Posch, W.G. Hoover, Phys. Rev. E 53 (1996) 1485.
[48] D.J. Evans, E.G.D. Cohen, G.P. Morriss, Phys. Rev. A 42 (1990) 5990.
[49] C.P. Dettmann, G.P. Morriss, Phys. Rev. E 53 (1996) R5541.
[50] S. Tasaki, T. Gilbert, J.R. Dorfman, An Analytical Construction of the SRB Measures for Baker-type Maps, preprint, 1998.
[51] R. Klages, Deterministic diffusion in one-dimensional maps, Ph.D. Dissertation, Technical University of Berlin, Berlin, 1995.
[52] R. Klages, J.R. Dorfman, Phys. Rev. Lett. 74 (1995) 387.
[53] R. Klages, J.R. Dorfman, Phys. Rev. E 55 (1997) R1247.
[54] R. Klages, unpublished.
[55] J. Groenveld, unpublished.
[56] M.V. Berry, in: S. Jorna (Ed.), Topics in Nonlinear Dynamics, AIP Conf. Proc., No. 46, American Institute of Physics, New York, 1978.
[57] L. Reichl, A Modern Course in Statistical Mechanics, 2nd Ed., Wiley, New York, 1998.
[58] A. Lasota, M.C. Mackey, Chaos, Fractals, and Noise, 2nd edn., Springer, Berlin, 1994.
[59] S. Tasaki, P. Gaspard, J. Stat. Phys. 81 (1995) 935.
[60] P. Gaspard, Phys. Rev. E 53 (1996) 4379.
[61] C. Beck, F. Schlögl, Thermodynamics of Chaotic Systems, Cambridge University Press, Cambridge, 1993.
[62] D. Ruelle, J. Stat. Phys. 85 (1996) 1.
[63] T. Tel, J. Vollmer, W. Breymann, Europhys. Lett. 35 (1996) 659; Phys. Rev. Lett. 77 (1996) 2945.
[64] G.P. Morriss, L. Rondoni, Physica A 233 (1996) 767.
[65] P. Gaspard, Physica A 240 (1997) 54; J. Stat. Phys. 89 (1997) 1215.
[66] G. Casati, B. Chirikov (Eds.), Quantum Chaos, Cambridge University Press, Cambridge, 1995; K. Nakamura, Quantum Chaos, Cambridge University Press, Cambridge, 1994; see also Chaos 3 (4) (1993).
Physics Reports 301 (1998) 187—204
Strongly correlated electron systems and the density matrix renormalization group

Steven R. White
Department of Physics and Astronomy, University of California, Irvine, CA 92697-4575, USA

Abstract

This paper briefly reviews some of the models used to study strongly correlated electron systems, and some of the numerical methods used to study them. Then one particular method, the density matrix renormalization group (DMRG), is introduced in some depth. As a demonstration of the capability of DMRG, test calculations are presented for Heisenberg spin chains. © 1998 Elsevier Science B.V. All rights reserved.

PACS: 02.70.Lq

1. Introduction

One of the most powerful techniques for studying strongly correlated electron systems is the density matrix renormalization group (DMRG), which was developed by the author in 1992. In this paper we briefly introduce some of the most popular models for the study of strongly correlated systems, with the emphasis on the origins of the models rather than their properties. We then briefly discuss some of the numerical methods that have been used to study these models, including the determinantal quantum Monte Carlo method and exact diagonalization. We then discuss in more detail numerical renormalization groups, and present the density matrix renormalization group method. We illustrate the capability of the method by tests on Heisenberg chain systems.
0370-1573/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S0370-1573(98)00010-6

2. Models of strongly correlated electron systems

Lattice models for strongly correlated electron systems are based on the idea of atomic orbitals. An isolated atom has a set of orbitals which we usually get from a Hartree-Fock calculation. In such a calculation, the single-electron part of the Hamiltonian, consisting of both the
S.R. White / Physics Reports 301 (1998) 187—204
Coulomb interaction between the nuclei and the electrons and the kinetic energy of the electrons, is more important than the electron-electron Coulomb interaction. The most important role for the electron-electron interaction is in the screening of the nuclear charge by the core electrons, which substantially alters the orbitals for the outer electrons. Hartree-Fock takes into account the single-electron terms and the screening of the nuclear charge by electrons. Although the only correlation between different electrons comes from the antisymmetry of the Hartree-Fock wave function, Hartree-Fock often does a surprisingly good job at describing an atom or molecule. Hartree-Fock gives us, for a single atom, a set of electron levels. Levels below the Fermi energy are doubly occupied, and levels above are unoccupied. If there is a reasonable separation between the occupied and unoccupied levels, we expect that a solid made out of these atoms will be an insulator, which for our purposes is not interesting. However, some solids have atoms with an orbital or a set of orbitals which are very close to the Fermi energy. Generally, these orbitals are fractionally occupied: they are not completely empty, nor are they doubly occupied. In that case, the Hartree-Fock procedure loses its usefulness. Furthermore, nontrivial correlations between electrons in these orbitals are possible. The electron-electron interaction now plays a new role in inducing these correlations. In a typical metal, we have levels near the Fermi level, but the overlap between orbitals on different atoms is strong and the picture of a local orbital is not so useful. However, in some materials, the overlap between orbitals is not so large compared to other residual interactions, such as the electron-electron interaction between two electrons in the same orbital. The electron-electron interaction then can give rise to interesting effects like antiferromagnetism or superconductivity.
All of the standard lattice models used to study such “strongly correlated electron systems” ignore all but one or two levels close to the Fermi energy. In principle, we imagine that we have some renormalization procedure which integrates out all orbitals far from the Fermi energy. In practice, however, we generally choose to include only a few simple terms in the Hamiltonian and we then determine coefficients phenomenologically. This procedure can be dangerous: can we really leave out the next term? However, it is the best method currently available. Usually in describing the standard models, the Hamiltonians are immediately written down. I think it is more sensible to explain first how many orbitals per site a given model has and what states are allowed in each orbital. If a site has only a single orbital, there are only a few possibilities. First, a “Hubbard” site has four states: an empty state, with energy 0; a state with one up electron, with energy $\epsilon$; a state with one down electron, with energy $\epsilon$; and a state with two electrons, with energy $2\epsilon + U$. $U$ is the electron-electron interaction between two electrons in the same orbital. This type of site appears in the Hubbard model, and the “Hubbard-$U$” term is the key term. Now, the single-particle energy $\epsilon$ can be absorbed into the chemical potential. One additional term is needed to make a nontrivial model: the hopping term. The hopping term comes from putting the atoms close together in a solid, and its effect is to allow electrons to hop from one site to the next. In principle, it comes from the matrix element of the kinetic energy between two atomic orbitals on nearby sites, but in practice, it is determined phenomenologically. The Hubbard Hamiltonian is

$$H = -t \sum_{\langle ij \rangle, \sigma} \left( c^\dagger_{i\sigma} c_{j\sigma} + c^\dagger_{j\sigma} c_{i\sigma} \right) + U \sum_i n_{i\uparrow} n_{i\downarrow} , \tag{1}$$
where $c^\dagger_{i\sigma}$ creates an electron on site $i$ with spin $\sigma = \uparrow$ or $\downarrow$, $n_{i\sigma} = c^\dagger_{i\sigma} c_{i\sigma}$, and $\langle ij \rangle$ indicates a sum over all pairs of sites which are nearest neighbors. The filling of the model is the average number of electrons per site, $\langle n_i \rangle = \langle n_{i\uparrow} + n_{i\downarrow} \rangle$. Half-filling corresponds to $\langle n_i \rangle = 1$. The filling, in general, is set by chemistry, but it can sometimes be adjusted by doping, as in a semiconductor. Variations on the Hubbard model, usually called extended Hubbard models, come from including additional terms, such as next-nearest-neighbor hopping. Although these additional terms can be argued to be smaller and thus less important than the two terms included above, usually the only argument for excluding them is an appeal for simplicity. If $U$ is very large, and $\langle n_i \rangle \le 1$, we may be able to integrate out the doubly occupied state on each site. This is done perturbatively in $t/U$ and is given in a number of textbooks. A $t$-$J$ site has only three states: an empty state, with energy 0; and two states with either one up or one down electron, each with energy $\epsilon$. The $J$ term appears as a result of integrating out $U$. The Hamiltonian of the $t$-$J$ model is

$$H = -t \sum_{\langle ij \rangle, \sigma} \left[ c^\dagger_{i\sigma} c_{j\sigma} + \mathrm{h.c.} \right] + J \sum_{\langle ij \rangle} \left[ \mathbf{S}_i \cdot \mathbf{S}_j - \tfrac{1}{4} n_i n_j \right] , \tag{2}$$

where $J \sim 4t^2/U$. It is understood that the Hamiltonian only connects the states within the subspace of no double occupancy; this can be made more explicit by inserting projection operators on both sides of the hopping part of the Hamiltonian. Integrating out the doubly occupied sites has produced a term giving antiferromagnetism. The origin of the antiferromagnetism is rather simple: consider just two sites, with two electrons. The $U$ term makes it unlikely that both electrons are on the same site. Because of Pauli exclusion, if both sites have up electrons, there is no hopping energy at all (that is, the expectation value of the hopping term is zero). If the two electrons are antiferromagnetically aligned, then each can virtually hop onto the other site. This virtual state is higher in energy by $U$, but its perturbative effect allows the kinetic energy to be negative, of order $t^2/U$. At half-filling (one electron per site) the exclusion of doubly occupied sites in the $t$-$J$ model also eliminates the unoccupied sites. In that case, each site is a Heisenberg $S = \tfrac{1}{2}$ site, with only two states, an up electron or a down electron. The $n_i n_j$ term is now constant, and is dropped, and we have the Heisenberg Hamiltonian

$$H = J \sum_{\langle ij \rangle} \mathbf{S}_i \cdot \mathbf{S}_j . \tag{3}$$

Sometimes two orbitals are important. In the Anderson lattice model, there are two Hubbard sites per Anderson site, one with a smaller $\epsilon$, and one with $U = 0$. In the Kondo lattice model, there is one Hubbard site (with $U = 0$) and one Heisenberg site per Kondo site. These two models are often used to describe heavy fermion systems, whereas the Hubbard, $t$-$J$, and Heisenberg models are often used for high-temperature superconductors. In the high-temperature cuprate superconductors, values for the parameters of the models are known fairly well. Usually for theoretical calculations we set $t = 1$, i.e. we work in units of $t$. For the cuprates, in the Hubbard model $U/t \sim 12$, and in the $t$-$J$ model $J/t \sim 0.33$. It is clear from the value of $U/t$ that any theoretical technique starting from a noninteracting picture ($U = 0$) is highly questionable. In fact, it has been quite difficult to develop analytic
techniques for these models. Although progress has been made in analytical methods, numerical simulation methods have played a very prominent role for strongly correlated systems. We will discuss only numerical approaches in this review. What sorts of questions would we like to be answered about these models? Here are some examples: What are the zero-temperature phase diagrams of these models? When are they antiferromagnetic, when are they ferromagnetic, when are they superconducting, and when are they metallic? The same questions also extend to finite temperature. How can superconductivity arise without electron-phonon interactions? What symmetry does it have? (d-wave, ...) What is the transition temperature? Which models describe the cuprates? Is it possible to get higher transition temperatures? In half-filled systems, which are modeled by the Heisenberg model, when is there long-range antiferromagnetic order at zero temperature, and when do quantum fluctuations in the spin directions give a “spin liquid”? How do the lattice geometry and dimension affect properties? In particular, what are the properties of ladder systems, which are two-dimensional, but have a finite width and an infinite length? Most of these questions have answers that can be checked with experiments. In addition, one would like to calculate various specific momentum and frequency-dependent correlation functions which can be compared with experiments such as photoemission or neutron scattering. Trying to answer these difficult questions and calculate these properties will keep condensed matter theorists busy for a long time.
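The two-site argument for antiferromagnetism given earlier in this section can be worked out by hand. The following is a sketch rather than part of the original derivation; the choice of basis and the sign of the off-diagonal element depend on the fermion ordering convention (an assumption here), while the eigenvalues do not.

```latex
% Two-site Hubbard model, Eq. (1), with two electrons.
% Triplet states (e.g. both spins up): hopping is Pauli blocked, so
E_{\mathrm{triplet}} = 0 .
% Singlet sector, in the basis
%   |s> = ( |\uparrow,\downarrow> - |\downarrow,\uparrow> ) / \sqrt{2}   (one electron per site)
%   |d> = ( |\uparrow\downarrow,0> + |0,\uparrow\downarrow> ) / \sqrt{2} (doubly occupied site)
H_{\mathrm{singlet}} =
\begin{pmatrix} 0 & -2t \\ -2t & U \end{pmatrix} ,
\qquad
E_{\mathrm{singlet}} = \frac{U - \sqrt{U^{2} + 16 t^{2}}}{2}
\approx -\frac{4 t^{2}}{U} \quad (U \gg t) .
% Two-site t-J model at half filling, Eq. (2) with n_1 = n_2 = 1:
J \left[ \mathbf{S}_1 \cdot \mathbf{S}_2 - \tfrac{1}{4} \right] =
\begin{cases} -J & \text{singlet} \\ 0 & \text{triplet} \end{cases}
\qquad \Longrightarrow \qquad J = \frac{4 t^{2}}{U} .
```

For $U/t = 12$ the exact two-site singlet-triplet splitting is $(\sqrt{160} - 12)\,t/2 \approx 0.325\,t$, close to $4t^2/U \approx 0.333\,t$ and consistent with the value $J/t \sim 0.33$ quoted above for the cuprates.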
3. Numerical techniques for quantum lattice models

Probably the two most useful methods for studying quantum lattice models before the invention of the DMRG were the quantum Monte Carlo method and exact diagonalization (usually using the Lanczos method). Now, in addition to these methods, DMRG has become very useful. In this section we will briefly discuss quantum Monte Carlo and exact diagonalization.

3.1. Determinantal quantum Monte Carlo

There are a variety of types of quantum Monte Carlo, each with certain strengths and weaknesses [1]. Here, as an example of these approaches, we will outline one specific method, the determinantal quantum Monte Carlo method often used for the Hubbard model. In addition, we will consider only the Hubbard model. In the determinantal method, which is a finite temperature method, we express the partition function

$$Z = \mathrm{tr}\, e^{-\beta H} , \tag{4}$$

as a path integral. The first step is the Suzuki-Trotter breakup to separate the kinetic ($T$) and interaction ($V$) parts of the Hamiltonian

$$Z = \mathrm{tr} \left[ e^{-\Delta\tau H} \right]^L \approx \mathrm{tr} \left[ e^{-\Delta\tau T} e^{-\Delta\tau V} \right]^L , \tag{5}$$

where $\Delta\tau = \beta/L$ ($\Delta\tau$ is one variable). This approximation has errors of order $\Delta\tau^2 t U$, which can be made small by making $L$ large, at the expense of additional computer time. The purpose of this
breakup is to isolate the interaction part of the Hamiltonian into its own exponential term. Each pair of terms forms one “time-slice”, and this method is closely related to standard imaginary-time Green’s function methods. With the interaction terms isolated, a Hubbard-Stratonovich transformation can be used to eliminate the two-body terms in $V$. The price which must be paid for this is the introduction of a sum over an Ising variable $\{s_{i,l}\}$ for each term eliminated. Usually in analytic work, an integral rather than a sum is introduced. The sum is usually more convenient in numerical work. The following exact relation is due to Hirsch:

$$e^{-\Delta\tau U n_{i\uparrow} n_{i\downarrow}} \propto \sum_{s_{i,l} = \pm 1} e^{-\Delta\tau\, s_{i,l}\, \lambda\, (n_{i\uparrow} - n_{i\downarrow})} , \tag{6}$$

where the proportionality factor and $\lambda$ are constants (i.e. no operators!) depending on $\Delta\tau U$ [1]. The number of Ising variables is the number of sites times $L$. We think of having gone to $d + 1$ dimensions by this procedure, where the last dimension is imaginary time. Even though the term $\lambda (n_{i\uparrow} - n_{i\downarrow})$ is a one-particle term, it acts to suppress doubly occupied sites just as does the Hubbard term $U n_{i\uparrow} n_{i\downarrow}$. It acts as a strong magnetic field, which allows electrons in line with the field to exist on that site, but puts a strong penalty in energy for electrons antiparallel to the field. The field flips in the path integral, allowing either spin to occur on a site, but not at the same time. With the two-particle interaction terms thus eliminated, for each Ising spin variable configuration the noninteracting electron system can be solved exactly. The solution turns out to involve determinants of matrices whose dimension is equal to the number of sites. There is one determinant for up spins, and one for down

$$Z = \sum_{s_{i,l} = \pm 1} \det M_\uparrow \det M_\downarrow , \tag{7}$$

where

$$M_\sigma = 1 + B_\sigma(L) \cdots B_\sigma(1) , \tag{8}$$

where each $B_\sigma(l)$ is a matrix which depends in a simple way on the Ising variables of one time slice. Symbolically,

$$B_\sigma(l) = e^{-\Delta\tau\, h_\sigma(s_l)} , \tag{9}$$

where $h_\sigma$ is a one-particle Hamiltonian matrix which describes an electron in the magnetic field configuration determined by $s_{i,l}$. Once this is done, the product of determinants above defines a “probability” function, giving the probability of each spin configuration. Consequently, classical Monte Carlo methods can be applied. In practice, however, there are substantial complications. First, calculating the change in probability from flipping one spin requires a substantial calculation because of the presence of determinants. If $N$ is the number of sites in the system, then the best-known procedure requires $N^2$ operations to determine if a spin should be flipped. Second, at low temperatures this procedure becomes unstable because the matrices involved become singular. However, this problem has been solved by the use of special matrix orthogonalization methods. The third problem is the most well known, and the most difficult to deal with: the fermion sign problem. Here, this means that the determinants may become negative. Monte Carlo can still proceed using as the probability the
absolute value of the product of the determinants, but measurements of the quantities must be divided by the average sign of the determinants. Their statistical uncertainties blow up when the average sign approaches zero. In the doped Hubbard model the average sign has been found to decrease exponentially fast in $N/T$ as $T \to 0$. Furthermore, in the 2D Hubbard model, the sign problem is worst where the physics is most interesting: at low dopings which correspond to high-temperature superconductivity, if indeed the single-band Hubbard model adequately describes “high-$T_c$”. This has led to various attempts to “fix” the sign problem, and also to the development of other numerical methods (such as DMRG). Some of these approaches seem quite promising, but they are beyond the scope of this review.

3.2. Exact diagonalization

In exact diagonalization [2], one takes a small system and finds the ground state exactly. Since the number of states in the Hilbert space grows exponentially with the number of sites ($4^N$ for the Hubbard model) it is important to be as efficient as possible. One way to be more efficient is to use symmetries to reduce the size of the space. For example, if one separates the states by their total momentum, and one knows that the total momentum in the ground state is zero, then one can leave out all the states with nonzero momentum. Let the size of the symmetry-reduced Hilbert space be $M$. Naive full diagonalization takes $\sim M^3$ operations and requires storage of $\sim M^2$ numbers, which is unacceptable if we wish to treat $M \sim 10^7$ to $10^8$. For such a system we need an algorithm with both storage and operations of order $M$. If $H$ is not sparse, there can be no such algorithm, since then the number of nonzero elements of $H$ is $M^2$ and all must be accounted for. However, Hamiltonian matrices of reasonable models are extremely sparse. In particular, multiplication of a vector $\psi$ by $H$ takes of order $M$ operations.
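To make the sparsity argument concrete, here is a minimal sketch of applying $H$ to a vector without ever storing the matrix, using the open spin-1/2 Heisenberg chain of Eq. (3) as the example. The bit encoding (bit $i$ set means spin up on site $i$), the open boundary condition, and the function name are assumptions made for illustration; no symmetry reduction is attempted. Each basis state connects to at most one other state per bond, so the work is proportional to the number of bonds times the vector length, rather than to the square of the vector length.

```python
def apply_heisenberg(psi, n_sites, J=1.0):
    """Return H|psi> for H = J * sum_i S_i . S_{i+1} on an open chain.

    psi is a list of 2**n_sites amplitudes; the matrix H is never built.
    """
    dim = 1 << n_sites
    assert len(psi) == dim
    out = [0.0] * dim
    for state in range(dim):
        amp = psi[state]
        if amp == 0.0:
            continue
        for i in range(n_sites - 1):           # bond between sites i and i+1
            up_i = (state >> i) & 1
            up_j = (state >> (i + 1)) & 1
            if up_i == up_j:
                out[state] += 0.25 * J * amp   # S^z S^z, parallel spins
            else:
                out[state] -= 0.25 * J * amp   # S^z S^z, antiparallel spins
                flipped = state ^ (0b11 << i)  # (S+S- + S-S+)/2 swaps the pair
                out[flipped] += 0.5 * J * amp
    return out


if __name__ == "__main__":
    # Two-site singlet (unnormalized): eigenvector with eigenvalue -3J/4.
    print(apply_heisenberg([0.0, 1.0, -1.0, 0.0], 2))
```

For the two-site singlet the result is the input vector multiplied by $-3J/4$, the expected singlet energy.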
The simplest algorithm with operations and storage of order $M$ is a simple iteration, usually called the power method: take an initial guess $\psi$, and multiply by $(1 - \epsilon H)$, where $\epsilon$ is sufficiently small, to get a new $\psi$. Now, for $\epsilon$ sufficiently small, the eigenvector of $(1 - \epsilon H)$ with the maximum-magnitude eigenvalue is the ground state of $H$. Repeated multiplication by a matrix always projects out the eigenvector with maximum-magnitude eigenvalue, in this case the ground state. Of course, the starting vector must not be orthogonal to the ground state, although one often finds that finite floating point precision allows convergence even when the starting vector is orthogonal. The convergence rate depends in a simple way on the gap to the first excited state. In practice, this method is rather slow. The wave function in the $n$th iteration of this simple method is a polynomial in $H$ applied to the initial guess, i.e. a sum of terms of the form $a_i H^i \psi$, where $\psi$ is the initial guess. What if we could vary the $a_i$ freely (up to a given order $n$) in order to minimize the energy? Then we would hope for better convergence for a given order $n$. The Lanczos algorithm does exactly this. Specifically, within the subspace spanned by $\{\psi, H\psi, H^2\psi, \ldots\}$, the Lanczos approach finds the lowest energy state. The algorithm is quite simple, although we will not present it here. Another algorithm, which can sometimes converge faster than Lanczos, is the Davidson algorithm. It uses the diagonal elements of the Hamiltonian to generate a slightly different subspace than that used by Lanczos. Again, the state produced is the lowest-energy state within that subspace. The Davidson algorithm performs much better than Lanczos if the matrix is diagonally dominant in some loose sense. If the diagonal elements are constant, then it reduces to the Lanczos algorithm, although with some extra work involved.
With Lanczos or Davidson, and good use of symmetry, one can treat Hubbard systems with up to about 20 sites. This has been quite useful, but obviously one would like to treat much larger systems.
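To make the contrast between the power method and a Krylov-subspace minimization concrete, the following minimal sketch compares the two after the same number of multiplications by $H$. It is only an illustration under stated assumptions: the test matrix is a small dense tight-binding chain (a real calculation would use a sparse many-body Hamiltonian), the helper names `tight_binding`, `power_energy` and `krylov_energy` are hypothetical, and the Krylov basis is built by explicit Gram-Schmidt orthogonalization rather than by the Lanczos three-term recurrence.

```python
import numpy as np

def tight_binding(n):
    """Dense n-site open chain: 2 on the diagonal, -1 on nearest neighbours."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def power_energy(H, psi0, eps, n_mult):
    """Energy after n_mult multiplications by (1 - eps*H), renormalizing each time."""
    psi = psi0 / np.linalg.norm(psi0)
    for _ in range(n_mult):
        psi = psi - eps * (H @ psi)
        psi /= np.linalg.norm(psi)
    return psi @ H @ psi

def krylov_energy(H, psi0, n_mult):
    """Lowest energy within span{psi, H psi, ..., H^n psi}.

    The basis is orthonormalized by Gram-Schmidt (applied twice for
    numerical stability); this is an illustrative stand-in for Lanczos.
    """
    basis = [psi0 / np.linalg.norm(psi0)]
    for _ in range(n_mult):
        v = H @ basis[-1]
        for _sweep in range(2):
            for q in basis:
                v = v - (q @ v) * q
        basis.append(v / np.linalg.norm(v))
    Q = np.array(basis).T                    # columns span the Krylov subspace
    return np.linalg.eigvalsh(Q.T @ H @ Q)[0]

H = tight_binding(50)
psi0 = np.random.default_rng(0).standard_normal(50)
n_mult = 15
e_exact = np.linalg.eigvalsh(H)[0]
e_power = power_energy(H, psi0, eps=0.25, n_mult=n_mult)
e_krylov = krylov_energy(H, psi0, n_mult)
print(f"exact {e_exact:.6f}  power {e_power:.6f}  Krylov {e_krylov:.6f}")
```

Because the power-method iterate after $n$ multiplications lies inside the Krylov subspace of order $n$, the Krylov energy can never be higher than the power-method energy for the same starting vector, and both are bounded below by the exact ground-state energy.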
4. Numerical renormalization groups

In exact diagonalization, the number of states grows exponentially with the system size. An obvious question is, are all of these states necessary? A number of basis set truncation techniques have been used over the years. For example, the configuration interaction method of quantum chemistry diagonalizes within a subspace consisting of the Hartree-Fock state plus a limited number of particle-hole excitations relative to the Hartree-Fock state. In strongly correlated systems, the Hartree-Fock state is usually too far from the true ground state for this to be useful. A completely different basis set truncation technique was pioneered by Wilson in his renormalization group treatment of the Kondo impurity problem [3]. This technique is usually called the “numerical renormalization group”. We will describe this technique below in a simpler context than the Kondo problem. Shortly after Wilson developed his RG procedure, there was considerable interest in applying closely related techniques to a variety of problems. In particular, it seemed that a number of quantum lattice models (such as the Hubbard and Heisenberg models), particularly in one dimension (1D), could be treated with a real-space blocking version of this technique. It was clear from the beginning that one could not hope to achieve the accuracy Wilson obtained for the Kondo problem in these other systems, but it was hoped that the method would yield qualitatively reliable results. Unfortunately, the approach proved to be rather unreliable, particularly in comparison with other numerical approaches, such as Monte Carlo, which were being developed at the same time. The density matrix renormalization group (DMRG) [4,5] was developed to try to fix the problems with Wilson’s procedure for lattice systems. Although DMRG is based on Wilson’s numerical approach, it is very reliable and accurate for one-dimensional lattice systems.

4.1. Standard RG approach

We first describe in detail the standard RG approach in the simplest possible context, a real space blocking approach for a 1D lattice system. The notation and many of the central ideas will be very similar in the density matrix approach described later. The approach is used to find the ground state and some low-lying excited states. One begins by breaking the 1D chain into finite identical blocks. It is usually convenient to start at the first iteration with blocks consisting of just one site. We will label the blocks $B$ and the block Hamiltonian $H_B$. $H_B$ contains all terms of $H$ involving only sites contained in $B$. For example, for the Hubbard model at the first iteration, where $B$ consists of one site, $H_B = U n_{i\uparrow} n_{i\downarrow} - \mu (n_{i\uparrow} + n_{i\downarrow})$. For the Heisenberg model at the first iteration, $H_B = 0$. Rather than describe $B$ and $H_B$ in the usual way by listing the sites of $B$ and using second-quantized operator expressions for $H_B$, we describe $B$ by a list of the many-body states on the block, and by quantum numbers and matrix elements between these states. We store the number of states $m$, and for each state we list all quantum numbers which are to be used, such as $S$ and $S_z$ for a spin
Table 1
Standard numerical renormalization group algorithm for a 1D quantum system
(1) Isolate two blocks $BB$, and form $H_{BB}$.
(2) Diagonalize $H_{BB}$, obtaining the $m$ lowest eigenvectors $u^\alpha$.
(3) Form matrix representations of $S^z_l$, etc., for $BB$ from the corresponding matrices for $B$.
(4) Change basis to the $u^\alpha$, keeping only the lowest $m$ states, using $H_{B'} = O H_{BB} O^\dagger$, etc., with $O(\alpha; i_1, i_2) = u^\alpha_{i_1, i_2}$, $\alpha = 1, \ldots, m$.
(5) Replace $B$ with $B'$.
(6) Go to step 1.
system, or $N_\uparrow$, $N_\downarrow$, and $S$ for an electron system. $H_B$ is represented as an $m \times m$ matrix. In order to reconstruct $H$, additional information is needed besides $H_B$. The additional information describes the interactions between blocks. For a Heisenberg system with interaction
$$\mathbf{S}_i \cdot \mathbf{S}_{i+1} = S^z_i S^z_{i+1} + \tfrac{1}{2}\left(S^+_i S^-_{i+1} + S^-_i S^+_{i+1}\right), \tag{10}$$
one needs to store $m \times m$ matrix representations of $S^z_i$, $S^+_i$, and $S^-_i$ for $i$ equal to both the left- and right-end sites of $B$. (In practice, one need not store $S^-_i$, since it can be obtained by taking the Hermitian conjugate of $S^+_i$.) For a Hubbard model one would have to store matrices for $c^\dagger_{i\sigma}$ and $c_{i\sigma}$, with $\sigma = \uparrow$ and $\downarrow$, in order to reconstruct the hopping term $\sum_\sigma (c^\dagger_{i+1\sigma} c_{i\sigma} + c^\dagger_{i\sigma} c_{i+1\sigma})$. The standard procedure is summarized in Table 1. At the beginning of an iteration one forms the Hamiltonian for two blocks joined together, $H_{BB}$. $BB$ has $m^2$ states. The states are labeled by two indices, $i_1 i_2$. For a Heisenberg system with $J = 1$ the $m^2 \times m^2$ matrix for $H_{BB}$ is given by
$$[H_{BB}]_{i_1 i_2;\, i_1' i_2'} = [H_B]_{i_1 i_1'}\,\delta_{i_2 i_2'} + [H_B]_{i_2 i_2'}\,\delta_{i_1 i_1'} + [S^z_r]_{i_1 i_1'} [S^z_l]_{i_2 i_2'} + \tfrac{1}{2} [S^+_r]_{i_1 i_1'} [S^-_l]_{i_2 i_2'} + \tfrac{1}{2} [S^-_r]_{i_1 i_1'} [S^+_l]_{i_2 i_2'}, \tag{11}$$
where $r$ represents the right-most site of the left block, and $l$ the left-most site of the right block. In diagonalizing $H_{BB}$ it is useful to separate the basis states by quantum numbers, since $H_{BB}$ is block diagonal. It is very simple to use $S_z$ or $N_\uparrow$ and $N_\downarrow$ in this way. Utilizing the total spin $S$ is more tedious (especially when one puts four blocks together, as we do below), and we have not used $S$ to further reduce the dimension of $H_{BB}$. (The value of $S$ for a state can easily be inferred from the degeneracies for different values of $S_z$.) The lowest-lying eigenstates $u^\alpha_{i_1 i_2}$, $\alpha = 1, \ldots, m$, of $H_{BB}$ are the states used to describe $B'$ ($BB \to B'$). The new block Hamiltonian matrix $H_{B'}$ is diagonal.
However, in the more general case where the states kept, the $u^\alpha$, are not eigenstates of $H_{BB}$, we can write
$$H_{B'} = O H_{BB} O^\dagger, \tag{12}$$
where the $m \times m^2$ matrix $O_{\alpha;\, i_1 i_2} = u^\alpha_{i_1 i_2}$, i.e. the rows of $O$ are the states kept. If $O$ were square, this would be a unitary transformation. Since $O$ is not square, the transformation truncates away (integrates out) the high-energy states. In order to obtain new matrices for $S^z_l$, $S^z_r$, etc., it is necessary to use $O$ again. First, one must construct the operators for $S^z_l$, $S^z_r$, etc. for $BB$, which we denote by $\tilde{S}^z_l$, $\tilde{S}^z_r$, etc. For example,
$$[\tilde{S}^z_l]_{i_1 i_2;\, i_1' i_2'} = [S^z_l]_{i_1 i_1'}\,\delta_{i_2 i_2'}, \tag{13}$$
$$[\tilde{S}^z_r]_{i_1 i_2;\, i_1' i_2'} = [S^z_r]_{i_2 i_2'}\,\delta_{i_1 i_1'}. \tag{14}$$
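The index structure of Eqs. (11)-(14) maps directly onto Kronecker products. The sketch below, a minimal illustration rather than a production implementation, joins two one-site spin-1/2 blocks (for which $H_B = 0$), diagonalizes $H_{BB}$, and builds the truncation matrix $O$ whose rows are the states kept. The function name `join_blocks` and the choice $m = 2$ are assumptions made for the example; with only four states available, the cut necessarily passes through the degenerate triplet, which a real calculation would avoid.

```python
import numpy as np

# Single-site spin-1/2 operators in the basis (up, down)
Sz = np.array([[0.5, 0.0], [0.0, -0.5]])
Sp = np.array([[0.0, 1.0], [0.0, 0.0]])      # S^+; S^- is its Hermitian conjugate

def join_blocks(HB, Sz_r, Sp_r, Sz_l, Sp_l):
    """Eq. (11): Hamiltonian of two identical blocks joined by one Heisenberg bond.

    (Sz_r, Sp_r) act on the right-most site of the left block,
    (Sz_l, Sp_l) on the left-most site of the right block.
    np.kron(A, B) has elements A[i1, i1'] * B[i2, i2'] at row (i1, i2).
    """
    I = np.eye(HB.shape[0])
    Sm_r, Sm_l = Sp_r.conj().T, Sp_l.conj().T
    return (np.kron(HB, I) + np.kron(I, HB)
            + np.kron(Sz_r, Sz_l)
            + 0.5 * np.kron(Sp_r, Sm_l)
            + 0.5 * np.kron(Sm_r, Sp_l))

HB = np.zeros((2, 2))                        # one-site Heisenberg block
H_BB = join_blocks(HB, Sz, Sp, Sz, Sp)
E, U = np.linalg.eigh(H_BB)                  # eigenvalues in ascending order

m = 2                                        # number of states kept (illustrative)
O = U[:, :m].conj().T                        # rows of O are the states kept
H_new = O @ H_BB @ O.conj().T                # Eq. (12)

I = np.eye(2)
Sz_l_tilde = np.kron(Sz, I)                  # Eq. (13): left-end operator of BB
Sz_r_tilde = np.kron(I, Sz)                  # Eq. (14): right-end operator of BB
print(E)                                     # singlet at -3/4, triplet at +1/4
```

The same `O` is then applied to `Sz_l_tilde` and `Sz_r_tilde` in exactly the way it is applied to `H_BB`.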
Then the new matrices for $B'$ are given by
$$S^z_l = O \tilde{S}^z_l O^\dagger, \tag{15}$$
etc. After these new operator matrices are formed, we can replace $B$ by $B'$ and start the next iteration. The iteration is continued until the system is large enough to represent properties of the infinite system. As our main concern here is the iterative diagonalization procedure discussed above, we will not discuss the analysis, using fixed points, relevant and irrelevant operators, etc., of the effective Hamiltonians obtained with the procedure. Wilson’s approach to the Kondo problem is closely related to the method described here, despite some important differences. One difference is that rather than joining two identical blocks, the degrees of freedom associated with a single interval (an “onion layer”) were added to the system at each iteration. The analogous procedure for a 1D system would be to add a single site to a block at each iteration. From a computational point of view, this has a distinct advantage in that many more states can be kept ($m$ can be made larger) since at each iteration a system with $nm$ states, as opposed to $m^2$, must be diagonalized, where $n$ is the number of states on a single site ($n = 4$ for Hubbard models, $n = 2S + 1$ for spin models). The most important difference between the Kondo system and a 1D system is that the couplings between adjacent layers or “sites” decrease exponentially in the Kondo system, whereas they remain constant for a 1D system. This exponential decrease is the key to the success of the method for the Kondo system and related impurity systems. More discussion concerning how the detailed form of the Hamiltonian makes the numerical approach accurate is given by Wilson [3]. When applied to other systems, such as 1D spin systems, where the couplings do not decrease exponentially, this standard numerical RG approach generally performs poorly.

4.2. A toy model

The fundamental difficulty in the standard approach discussed above lies in choosing the eigenstates of $H_{BB}$ to be the states kept. Since $H_{BB}$ contains no connections to the rest of the lattice, its eigenstates have inappropriate features at the block ends. This can be understood in detail by considering a toy model, a single particle on a tight-binding chain [6]. Consider a 1D chain of sites $i$ with the single-particle Hamiltonian matrix
$$H_{ij} = \begin{cases} 2, & i = j, \\ -1, & |i - j| = 1, \\ 0, & \text{otherwise}. \end{cases} \tag{16}$$
This problem is equivalent in the continuum limit to a 1D particle in a box. To apply the standard real-space RG approach to this problem, we consider a group of sites to be a “block”, and diagonalize that block to find a set of eigenstates. We then truncate the set of eigenstates, keeping only the lowest m states (ordered by energy), and use those states to construct an approximate Hamiltonian for a new, larger block composed of two of the old blocks. At each
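iteration $s$ we can write the Hamiltonian of the infinite chain as a block tridiagonal matrix in terms of diagonal blocks $H^s$ and off-diagonal blocks $T^s$,
$$H = \begin{pmatrix} H^s & T^s & 0 & 0 & \cdots \\ (T^s)^\dagger & H^s & T^s & 0 & \cdots \\ 0 & (T^s)^\dagger & H^s & T^s & \cdots \\ \vdots & \vdots & \vdots & \vdots & \end{pmatrix}. \tag{17}$$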
Initially, the block size is 1 and $H^1$ and $T^1$ are $1 \times 1$ matrices equal to 2 and $-1$, respectively. We start iteration $s$ by forming the Hamiltonian matrix for a block composed of two blocks from the previous iteration,
$$\bar{H}^s = \begin{pmatrix} H^{s-1} & T^{s-1} \\ (T^{s-1})^\dagger & H^{s-1} \end{pmatrix} \tag{18}$$
and
$$\bar{T}^s = \begin{pmatrix} 0 & 0 \\ T^{s-1} & 0 \end{pmatrix}. \tag{19}$$
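The blocking step of Eqs. (18) and (19), combined with the truncation to the lowest $m$ eigenstates described above, is short enough to try directly. The following sketch is an illustration only: the function names `toy_rg` and `exact_ground_energy` are hypothetical, and the chain length (64 sites) and $m = 4$ are arbitrary choices. It compares the lowest energy of the final block with the exact open-chain value $2 - 2\cos[\pi/(n+1)]$ for $n$ sites.

```python
import numpy as np

def toy_rg(m, n_iter):
    """Standard real-space RG for the chain of Eq. (16), keeping m states per block.

    Returns the lowest eigenvalue of the final block Hamiltonian, which
    describes an isolated block of 2**n_iter sites.
    """
    H = np.array([[2.0]])                      # H^1
    T = np.array([[-1.0]])                     # T^1
    for _ in range(n_iter):
        Z = np.zeros_like(H)
        Hbar = np.block([[H, T], [T.T, H]])    # Eq. (18), real matrices
        Tbar = np.block([[Z, Z], [T, Z]])      # Eq. (19)
        _, U = np.linalg.eigh(Hbar)            # eigenvalues ascending
        U = U[:, :m]                           # lowest m states (all, if fewer exist)
        H = U.T @ Hbar @ U                     # change of basis and truncation
        T = U.T @ Tbar @ U
    return np.linalg.eigvalsh(H)[0]

def exact_ground_energy(n_sites):
    """Lowest eigenvalue of the n-site open chain of Eq. (16)."""
    return 2.0 - 2.0 * np.cos(np.pi / (n_sites + 1))

n_iter = 6                                     # 2**6 = 64 sites
e_exact = exact_ground_energy(2 ** n_iter)
e_rg = toy_rg(m=4, n_iter=n_iter)              # truncated
e_full = toy_rg(m=2 ** n_iter, n_iter=n_iter)  # no truncation at any step
print(f"exact {e_exact:.6e}  m=4 {e_rg:.6e}  untruncated {e_full:.6e}")
```

Keeping every state reproduces the exact energy, as it must, since the procedure is then only a change of basis. Any truncation yields a variational upper bound, so `e_rg` lies above `e_exact`.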
We diagonalize $\bar{H}^s$ and take the lowest $m$ eigenvalues $E^s_l$ and eigenstates $u^s_l$, $l = 1, \ldots, m$, discarding the rest. We then perform a change of basis to the eigenstates via
$$H^s_{ll'} = \sum_{i,j} u^s_{l,i}\, \bar{H}^s_{ij}\, u^s_{l',j} \tag{20}$$
and
$$T^s_{ll'} = \sum_{i,j} u^s_{l,i}\, \bar{T}^s_{ij}\, u^s_{l',j}. \tag{21}$$
Note that Eq. (20) puts $H^s$ into diagonal form. We then proceed on to the next iteration, starting with Eqs. (18) and (19). The accuracy can be increased by keeping more states, i.e., increasing $m$. It is easy to see in this simple example, however, that this procedure is quite poor in describing large-scale, low-energy behavior. The Hamiltonian in this example is just a finite-difference discretization of the kinetic energy of a 1D particle, and in the limit of large block size, the eigenstates are just particle-in-a-box eigenstates. The boundary condition of ignoring the connections $T$ to the neighboring blocks corresponds to setting the wave function to 0 at the sites just outside the block. Fig. 1 illustrates the difficulty. Any state made only of low-lying states from the previous iteration must have a “kink” in the middle. In order to accurately represent states in the larger block, one must make use of nearly all the states in the smaller block: any truncation leads to large errors. White and Noack [6] studied this simple model in detail, and suggested two alternatives to the standard approach. These methods shared a common feature: the states that were kept were not the eigenstates of $H_{BB}$. They differed in how the states to be kept were chosen. In the first method, the combination of boundary conditions (CBC) approach, the lowest-lying eigenstates of several different block Hamiltonians were kept. The block Hamiltonians differed only in the boundary condition applied to a block, e.g. one Hamiltonian might have periodic boundary
Fig. 1. Particle-in-a-box ground states for two different sized blocks. The standard RG algorithm attempts to write the open-square ground state in terms of the two filled circle ground states.
conditions applied and another antiperiodic. The rationale for this was that quantum fluctuations in the rest of the system effectively apply a variety of boundary conditions to the block. States from any single boundary condition cannot respond properly to these fluctuations. By applying a representative set of boundary conditions, which is in some sense “complete” enough for the problem at hand, one obtains a set of states which are able to respond to these fluctuations. This approach proved very effective for the simple single-particle problems studied by White and Noack, as well as for Anderson localization models [7].

4.3. RGs for many-particle systems

The CBC approach is, however, ill-suited for interacting systems. It is useful to consider a noninteracting many-particle system, such as the Hubbard model with $U = 0$. An arbitrary state of this system can be described in terms of the single-particle wave functions of each of its particles. Some of these single-particle wave functions may have nodes at the ends of a block, and some may have antinodes. It is easy to choose boundary conditions which generate block states where every particle on the block has a node or every particle has an antinode, but it is difficult to get different boundary behavior for different particles. In order to properly represent the block, the states kept not only need to allow for different end behavior for different particles, they must also represent a complete range of boundary behavior. This general line of reasoning is supported by numerical tests on Heisenberg chains. We have tried to find a simple set of boundary conditions which can be used to treat the $S = 1/2$ Heisenberg chain. We tried combinations of periodic and antiperiodic couplings between the ends of the block, as well as varying the magnitude of the coupling between the ends of a block. We were unable to find any set of boundary conditions which was at all satisfactory.
The other approach suggested by White and Noack, the superblock method, forms the basis for DMRG. In the superblock method, one diagonalizes a larger system (the “superblock”; the name is
analogous to “supercell”, as used in electronic structure calculations) composed of three or more blocks which includes the two blocks $BB$ which are used to form $B'$. The wave functions for the superblock are projected onto $BB$, and these projected states of $BB$ are kept. For a single-particle wave function, this projection is single-valued and trivial. The superblock method works quite well in the single-particle model, with the accuracy increasing rapidly with the number of extra blocks used. However, for a many-particle wave function, the “projection” of a wave function onto $BB$ is many-valued, and, in fact, a single many-particle state for the entire lattice generally “projects” onto a complete set of block states. However, some of these states are more important than others; the density matrix tells us which states are the most important. (The reader is urged to review Feynman’s introduction to density matrices [8] before proceeding further.) There are a variety of types of density matrices. The type we consider here is used by Feynman to consider the coupling of a quantum system with the rest of the universe. In our case the system is a block and the rest of the universe is the rest of the lattice, which may be finite or infinite. Let $|i\rangle$ be the complete set of many-body states of the block, and $|j\rangle$ be the many-body states of the rest of the lattice. If $\psi$ is a state of the entire lattice,
$$|\psi\rangle = \sum_{ij} \psi_{ij}\, |i\rangle |j\rangle. \tag{22}$$
The reduced density matrix for the block is
$$\rho_{ii'} = \sum_j \psi^*_{ij}\, \psi_{i'j}. \tag{23}$$
If operator $A$ acts only on the block,
$$\langle A \rangle = \sum_{ii'} A_{ii'}\, \rho_{i'i} = \mathrm{Tr}\, \rho A. \tag{24}$$
Let $\rho$ have eigenstates $|v_\alpha\rangle$ and eigenvalues $w_\alpha$. It is easy to see that $w_\alpha \ge 0$ and $\sum_\alpha w_\alpha = 1$. The $w_\alpha$ are the probabilities of the states $|v_\alpha\rangle$. We have
$$\langle A \rangle = \sum_\alpha w_\alpha \langle v_\alpha | A | v_\alpha \rangle. \tag{25}$$
This relation is very important, because it tells us that if for a particular $\alpha$, $w_\alpha \approx 0$, we make no error in $\langle A \rangle$ if we discard the state $|v_\alpha\rangle$, for any $A$. One can also show we make no error in our ability to represent $\psi$. Thus, the density matrix naturally gives a way to throw out states with minimal errors: throw out the eigenstates of the density matrix with the smallest eigenvalues. It is also very natural from other considerations to use the density matrix to choose the states which we wish to keep. Consider the following argument by analogy. For an isolated block at finite temperature, the probability that the block is in an eigenstate $\alpha$ of the block Hamiltonian is proportional to its Boltzmann weight $\exp(-\beta E_\alpha)$. The Boltzmann weight is an eigenvalue of the density matrix $\exp(-\beta H_B)$, and an eigenstate of the Hamiltonian is also an eigenstate of the density matrix. Since lowest energy corresponds to highest probability in the Boltzmann weight, we can view the standard RG approach as choosing the $m$ most probable eigenstates to represent the block given the assumption that the block is isolated. (Alternatively, we can view the rest of the lattice as a heat bath at an effective inverse temperature $\beta$, to which the system is very weakly coupled.) However, in reality the block is not isolated, the density matrix is not $\exp(-\beta H_B)$, and eigenstates
of the block Hamiltonian are not eigenstates of the block’s density matrix. For a system which is strongly coupled to the outside universe, it is much more appropriate to use the eigenstates of the density matrix to describe the system rather than the eigenstates of the system’s Hamiltonian. Thus, a natural generalization of the standard approach is to choose to keep the m most probable eigenstates of the block density matrix. This conclusion — that the optimal states to keep are the most probable eigenstates of the block density matrix — can be justified precisely [4,5].
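The role of the density-matrix eigenvalues as weights can be checked numerically on a small example. The sketch below is a minimal illustration with made-up dimensions and a random real wave function $\psi_{ij}$ (for real $\psi$ the complex conjugate in the reduced density matrix plays no role); the variable names are assumptions for the example. It verifies that $\langle A \rangle$ can be computed from $\rho$ alone, that the $w_\alpha$ sum to one, and that projecting $\psi$ onto the $m$ most probable block states loses a norm equal to the total weight of the discarded eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(1)
dim_block, dim_rest = 4, 6                     # illustrative sizes only

# Random normalized real state psi_ij of block (i) plus rest of lattice (j)
psi = rng.standard_normal((dim_block, dim_rest))
psi /= np.linalg.norm(psi)

rho = psi @ psi.T                              # reduced density matrix of the block
w, v = np.linalg.eigh(rho)                     # ascending eigenvalues
w, v = w[::-1], v[:, ::-1]                     # reorder: most probable first

# A random symmetric operator acting only on the block
A = rng.standard_normal((dim_block, dim_block))
A = A + A.T

expect_direct = np.einsum('ij,ik,kj->', psi, A, psi)   # <psi|A|psi>
expect_trace = np.trace(rho @ A)                       # Tr(rho A)
expect_eig = sum(w[a] * (v[:, a] @ A @ v[:, a]) for a in range(dim_block))

# Keep the m most probable eigenstates; rows of O are the states kept
m = 2
O = v[:, :m].T
psi_trunc = O.T @ (O @ psi)                    # psi projected onto the kept states
lost_norm = 1.0 - np.linalg.norm(psi_trunc) ** 2
discarded_weight = w[m:].sum()
print(expect_direct, expect_trace, expect_eig, lost_norm, discarded_weight)
```

The equality of `lost_norm` and `discarded_weight` is the quantitative sense in which discarding low-weight eigenstates costs little in the representation of $\psi$.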
5. The density matrix renormalization group algorithms

Incorporating the density matrix concept in a numerical renormalization group algorithm involves a fundamental change in the way the calculation is carried out. In the standard RG approach, to find the states to be kept, one diagonalizes only the system $BB$, which becomes $B'$. In the density matrix approach, in order to obtain any reasonable approximation to the density matrix, it is necessary to diagonalize the Hamiltonian of a larger system which includes $BB$, namely, some sort of superblock, and then use the eigenstates of the superblock to determine the density matrix. The density matrix is then diagonalized, and its most significant eigenstates are the states kept. The number of eigenstates of the Hamiltonian of the superblock used to produce the density matrix can be as small as one; this single state produces a density matrix for $BB$ which has many eigenstates to be used as block states to be kept. A density matrix algorithm is defined mainly by the form of the superblock and the manner in which the blocks are enlarged (such as by doubling the block, $B' = BB$, or by adding a single site, $B' = B + \text{site}$), and by the choice of superblock eigenstates used in constructing the density matrix (e.g., the two lowest-lying $S_z = 0$ states). An eigenstate of the superblock Hamiltonian is called a target state if it is used in forming the block density matrix. The most efficient algorithms use only a single target state (usually the ground state) in constructing the density matrix. By targeting only one state, the block states are more specialized for representing that state, and fewer are needed for a given accuracy. Probably the most important characteristic of a density matrix algorithm is the rate at which the accuracy increases with the number of states $m$. We have found that the accuracy of the representation of the target states increases roughly exponentially with $m$, at least for open boundary conditions.
The coefficient governing the increase of accuracy with $m$ is largest with a single target state. Several considerations enter in the construction of the superblock to be used in an algorithm. Generally, it is more efficient to enlarge the block by adding a single site, rather than doubling a block. The superblock configuration used here is represented symbolically as $B_l \bullet \bullet B^R_{l'}$, where $B_l$ represents a block composed of $l$ sites, $B^R_{l'}$ is a reflected block (right interchanged with left) of length $l'$, $\bullet$ represents a single site, and the total length of the superblock is $L = l + l' + 2$. Here $B' = B_{l+1}$ is formed from the left block plus a single site, i.e. $B_{l+1} = B_l \bullet$. Open boundary conditions are used. The right-hand block and site $\bullet B^R_{l'}$ are only used to help form the density matrix for $B_{l+1}$; in the construction of the density matrix, the states of $\bullet B^R_{l'}$ are traced over. This configuration can be used in two different ways: in an infinite chain method, in which the chain size increases by two at each step, and in a finite chain method, in which the chain size is fixed.
Table 2
Infinite system density-matrix algorithm for a 1D system
(1) Make four initial blocks, each consisting of a single site, representing the initial four-site system. Set up matrices representing the block Hamiltonian and other operators.
(2) Form the Hamiltonian matrix (in sparse form) for the superblock.
(3) Using the Davidson or Lanczos method, diagonalize the superblock Hamiltonian to find the target state $\psi(i_1, i_2, i_3, i_4)$. $\psi$ is usually the ground state. Expectation values of various operators can be measured at this point using $\psi$.
(4) Form the reduced density matrix for the two-block system 1-2, using $\rho(i_1, i_2; i_1', i_2') = \sum_{i_3, i_4} \psi(i_1, i_2, i_3, i_4)\, \psi(i_1', i_2', i_3, i_4)$.
(5) Diagonalize $\rho$ to find a set of eigenvalues $w_\alpha$ and eigenvectors $u^\alpha_{i_1, i_2}$. Discard all but the largest $m$ eigenvalues and associated eigenvectors.
(6) Form matrix representations of operators (such as $H$) for the two-block system 1-2 from operators for each separate block.
(7) Form a new block 1 by changing basis to the $u^\alpha$ and truncating to $m$ states using $H^{1'} = O H^{12} O^\dagger$, etc. If blocks 1 and 2 have $m_1$ and $m_2$ states, then $O$ is an $m \times m_1 m_2$ matrix, with matrix elements $O(\alpha; i_1, i_2) = u^\alpha_{i_1, i_2}$, $\alpha = 1, \ldots, m$.
(8) Replace old block 1 with new block 1.
(9) Replace old block 4 with the reflection of new block 1.
(10) Go to step 2.
5.1. The infinite system method

In the first step of the infinite system method, we start with a four site chain and diagonalize the Hamiltonian of the superblock configuration $B_1 \bullet \bullet B^R_1$, where $B_1$ and $B^R_1$ both represent a single site. We use the Davidson algorithm [9] for the sparse matrix diagonalization, but one could also use the better-known Lanczos method. Using the target states calculated with this configuration, we calculate a density matrix and form an effective Hamiltonian for $B_2 = B_1 \bullet$. In the second step we diagonalize $B_2 \bullet \bullet B^R_2$, where we have formed $B^R_2$ by reflecting $B_2$. We continue in this manner, diagonalizing the configuration $B_l \bullet \bullet B^R_l$, setting $B_{l+1} = B_l \bullet$, and using $B_{l+1}$ and its reflection in the next step of the iteration. At each step, both blocks increase in length by one site, and the total length of the chain increases by two at each step of the iteration. The infinite chain method is usually used when one is interested in ground-state properties of the infinite chain. Each step of the iteration pushes the ends of the chain farther from the two sites in the center. After many steps, each block approximately represents one-half of an infinite chain. In order to represent one-half of an infinite chain, $B$ must not only contain many sites itself, its effective Hamiltonian must also be formed from a system in which the rest of the chain has many sites. The effective Hamiltonian formed from the left-hand side $B_l \bullet$ depends strongly on the right-hand side $\bullet B^R_l$. The infinite chain algorithm converges in two senses simultaneously: in the length of $B_l$ going to infinity and in the sense that $B_l$ is adapted to respond to an infinite chain connected to it on the right. The infinite system algorithm is summarized in Table 2. The representation of the blocks is identical to that of the standard algorithm: we describe a block by listing how many states it has and the quantum numbers for each state, and by storing matrices for $H_B$, $S^z_i$, etc. Once the matrix $O$ is constructed using the most significant eigenvectors of the density matrix, the change of basis procedure is also identical to that of the standard algorithm. For the purposes of organizing the algorithm, it is easiest to think of the two sites in the middle as blocks which can be treated similarly to the two outer blocks, although they contain only a few states.
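The steps of Table 2 can be sketched compactly for the $S = 1/2$ Heisenberg chain with $J = 1$. The following is a simplified illustration under several assumptions, not the implementation used for the results in this review: the superblock is diagonalized with a dense eigensolver instead of Davidson or Lanczos, no quantum numbers are used, the reflected right-hand side is represented by the same matrices as the left-hand side (valid because the Hamiltonian is reflection symmetric), and the names `bond` and `infinite_dmrg` as well as the values $m = 10$ and 20 steps are choices made for the example.

```python
import numpy as np

# Single-site spin-1/2 operators
Sz1 = np.array([[0.5, 0.0], [0.0, -0.5]])
Sp1 = np.array([[0.0, 1.0], [0.0, 0.0]])
I2 = np.eye(2)

def bond(Sz_a, Sp_a, Sz_b, Sp_b):
    """Heisenberg coupling between a site in subsystem a and a site in subsystem b."""
    return (np.kron(Sz_a, Sz_b)
            + 0.5 * (np.kron(Sp_a, Sp_b.T) + np.kron(Sp_a.T, Sp_b)))

def infinite_dmrg(m, n_steps):
    """Ground-state energies of superblocks of length L = 4, 6, 8, ... (open ends)."""
    # B_1 is a single site: H_B = 0, end-site operators are the site operators
    H, Sz, Sp = np.zeros((2, 2)), Sz1, Sp1
    energies = []
    for _ in range(n_steps):
        d = H.shape[0]
        # Block plus one site; its new end site is the added site
        H_enl = np.kron(H, I2) + bond(Sz, Sp, Sz1, Sp1)
        Sz_enl = np.kron(np.eye(d), Sz1)
        Sp_enl = np.kron(np.eye(d), Sp1)
        D = 2 * d
        ID = np.eye(D)
        # Superblock: left half, mirrored right half, and the central bond
        H_super = (np.kron(H_enl, ID) + np.kron(ID, H_enl)
                   + bond(Sz_enl, Sp_enl, Sz_enl, Sp_enl))
        E, V = np.linalg.eigh(H_super)         # dense solver, for brevity
        energies.append(E[0])
        psi = V[:, 0].reshape(D, D)            # psi[left index, right index]
        rho = psi @ psi.T                      # trace over the right half
        _, v = np.linalg.eigh(rho)             # ascending eigenvalues
        O = v[:, -min(m, D):].T                # rows: most probable states
        H, Sz, Sp = O @ H_enl @ O.T, O @ Sz_enl @ O.T, O @ Sp_enl @ O.T
    return np.array(energies)

energies = infinite_dmrg(m=10, n_steps=20)     # final superblock has 42 sites
e_site = (energies[-1] - energies[-2]) / 2     # energy per added site
print(f"E(L=4) = {energies[0]:.7f}, energy per site = {e_site:.5f}")
```

The first superblock is an untruncated four-site chain, so its energy is exact. The energy gained per added site can be compared with the Bethe ansatz value for the infinite chain, $1/4 - \ln 2 \approx -0.4431$.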
Table 3
Finite system density-matrix algorithm for a 1D system consisting of $L$ sites. A calculation consists of several iterations, indexed by $I$, with each iteration consisting of $L - 3$ steps, indexed by $l$, where $l$ is the size of the first block
(1) (First half of $I = 1$.) Use the infinite system algorithm for $L/2 - 1$ steps to build up the lattice to $L$ sites. At each iteration store the block Hamiltonian and end operator matrices for block 1. Label the blocks by their size, $B_l$, $l = 1, \ldots, L/2$.
(2) (Start of second half of $I = 1$.) Set $l = L/2$. Use $B_l$ as block 1, and the reflection of $B_{L-l-2}$ as block 4.
(3) Steps 2-8 of Table 2.
(4) Store the new block 1 as $B_{l+1}$, replacing the old $B_{l+1}$.
(5) Replace block 4 with the reflection of $B_{L-l-2}$, obtained from the first half of this iteration.
(6) If $l < L - 3$, set $l = l + 1$ and go to step 3.
(7) (Start of iteration $I$, $I \ge 2$.) Make four initial blocks, the first three consisting of a single site, and the fourth consisting of the reflection of $B_{L-3}$ from the previous iteration. Set $l = 1$.
(8) Steps 2-8 of Table 2.
(9) Store the new block 1 as $B_{l+1}$, replacing the old $B_{l+1}$.
(10) Replace block 4 with the reflection of $B_{L-l-2}$, obtained from the previous iteration (if $l \le L/2 - 1$) or the first half of this iteration (if $l > L/2 - 1$).
(11) If $l < L - 3$, set $l = l + 1$ and go to step 8. If $l = L - 3$, start a new iteration by going to step 7. (Stop after 2 or 3 iterations.)
5.2. The finite system method

The finite system algorithm is designed to calculate accurately the properties of a finite system of size $L$, which we will assume for simplicity to be even. It is summarized in Table 3. It begins with the use of the infinite system algorithm for $L/2 - 1$ steps, so that the final superblock used is of size $L$. In the infinite system method, there is no need to store $B_l$ once we have $B_{l+1}$; we need only store the latest block. In the finite system method, we need to store $L - 3$ blocks, $B_1$ to $B_{L-3}$, and the infinite system method is used to get initial, approximate versions of $B_1$ to $B_{L/2}$. After the system $B_{L/2-1} \bullet \bullet B^R_{L/2-1}$ is used to form $B_{L/2}$, the next step is to use the configuration $B_{L/2} \bullet \bullet B^R_{L/2-2}$ to form $B_{L/2+1}$. This system, and all the other superblocks to follow, contain $L$ sites. We continue to form the other blocks up to size $L - 3$, using the superblock $B_l \bullet \bullet B^R_{L-l-2}$ to form $B_{l+1}$. This sequence of steps is the first iteration of the finite system algorithm. The second and subsequent iterations use the blocks obtained from the previous iteration as the right-hand reflected blocks in each superblock. The first step starts by diagonalizing the superblock $B_1 \bullet \bullet B^R_{L-3}$, where $B_1$ is a single site and is always known exactly, and $B^R_{L-3}$ is obtained from the last step of the previous iteration. Once a new $B_l$ is formed, it replaces the old $B_l$, so that only one set of blocks need be stored. Consequently, for the second half of the iteration, starting with the superblock $B_{L/2-1} \bullet \bullet B^R_{L/2-1}$, we use a block formed in the current iteration, rather than the last iteration, as the right-hand block. On the very last iteration, we usually stop after the diagonalization of $B_{L/2-1} \bullet \bullet B^R_{L/2-1}$, and then use this wave function of the $L$-site system to measure various properties, such as the local magnetization or correlation functions.
After a few iterations each $B_l$ accurately represents an $l$-site block which is the left-hand $l$ sites of an $L$-site chain. For many systems, such as most 1D spin chains, the method converges by the middle of the second iteration, although sometimes three iterations are necessary.
Table 4
Relative errors $(E - E_{\mathrm{exact}})/E_{\mathrm{exact}}$ in ground-state energies in the indicated spin sector $S_T$ of finite $S = 1/2$ chains of length $L = 16$ and 22. The exact energies were determined by a separate exact diagonalization

m     L = 16, S_T = 0         L = 16, S_T = 1         L = 16, S_T = 2         L = 22, S_T = 0
16    $9.3 \times 10^{-8}$    $1.2 \times 10^{-7}$    $5.9 \times 10^{-8}$    $8.0 \times 10^{-7}$
24    $2.2 \times 10^{-9}$    $5.4 \times 10^{-9}$    $4.0 \times 10^{-9}$    $8.1 \times 10^{-8}$
6. Test calculations

In recent years DMRG has been adopted widely by many groups and has been used in dozens of publications. It is beyond the scope of this review to discuss the use of DMRG in any detail. Here we will only show the results of some simple tests in cases where exact results are known. The first tests were done shortly after the development of DMRG; the last one gives more of an idea of the current capability of DMRG. The tests shown are for antiferromagnetic Heisenberg spin chains, with $S = 1/2$ and $S = 1$, with Hamiltonian
$$H = \sum_i \mathbf{S}_i \cdot \mathbf{S}_{i+1}, \tag{26}$$
where we have set $J = 1$. While the $S = 1/2$ case is soluble via the Bethe ansatz, the $S = 1$ case is not. The $S = 1$ chain has been the subject of considerable numerical effort [10,11] since Haldane argued that the infinite system has a finite gap between the ground and first excited state [12]. Thus these two cases provide excellent tests both for the accuracy of the methods and for their competitiveness with other numerical approaches. First, we consider short systems and compare with exact diagonalization. Results for the energies of 16 and 22 site blocks for $S = 1/2$ from the finite lattice method are compared with exact diagonalization in Table 4. DMRG is able to obtain energies for finite lattices with remarkable accuracy, even keeping only 16 states. We next consider calculations on much larger chains. For long $S = 1/2$ chains we can compare DMRG results with the exact results of the Bethe ansatz. The Bethe ansatz can be used to solve finite spin-1/2 Heisenberg chain systems with open boundary conditions [13]. (The Bethe ansatz equations need to be solved numerically for finite systems.) Fig. 2 shows the DMRG energy as a function of the DMRG step index $l$, for a 2000 site system. Reflection symmetry was used, which means that the sweeps only need to go half-way across the system before turning around. The line of stars shows the exact Bethe ansatz result.
A number of DMRG sweeps were performed, with the number of states kept per block $m$ gradually increased. The nearly vertical line in the upper right corner represents the last small part of an infinite system sweep which was used to initially grow the system to 2000 sites. For this sweep $m=10$. The next two sweeps also had $m=10$, but since the energy had already converged in the infinite system sweep for this value of $m$, only one horizontal line is visible. The lines labeled $m=15$ and $m=20$ are the two halves of the next sweep, with a vertical jump at the end when the new left block is reflected and replaces the right block. This is followed by another full sweep with $m=20$, during which the energy decreases only slightly. Subsequent sweeps have $m=30$, 40, 60, 80, 120, and 200.
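For very short chains the exact reference energies can be reproduced with a few lines of code. The following is only a minimal sketch, not the method used in this review: it finds the ground-state energy of the open $S=1/2$ chain of Eq. (26) by power iteration on a shifted Hamiltonian, in plain Python. The function names, the bit encoding of basis states and the choice of the Néel state as starting vector are assumptions made for illustration.

```python
import math

def apply_heisenberg(vec, n_sites):
    """Return H|vec> for H = sum_i S_i . S_{i+1} (open chain, S = 1/2, J = 1).

    Basis states are integers; bit i set means spin up on site i.
    """
    out = [0.0] * len(vec)
    for state, amp in enumerate(vec):
        if amp == 0.0:
            continue
        for i in range(n_sites - 1):
            bit_i = (state >> i) & 1
            bit_j = (state >> (i + 1)) & 1
            if bit_i == bit_j:
                # parallel spins: S^z S^z = +1/4, no spin flip possible
                out[state] += 0.25 * amp
            else:
                # antiparallel spins: S^z S^z = -1/4, flip term has weight 1/2
                out[state] -= 0.25 * amp
                flipped = state ^ ((1 << i) | (1 << (i + 1)))
                out[flipped] += 0.5 * amp
    return out

def ground_state_energy(n_sites, n_iter=2000):
    """Estimate the lowest eigenvalue by power iteration on (shift - H)."""
    dim = 1 << n_sites
    shift = n_sites / 4.0  # larger than the top eigenvalue (n_sites - 1) / 4
    neel = sum(1 << i for i in range(0, n_sites, 2))  # up, down, up, ...
    vec = [0.0] * dim
    vec[neel] = 1.0
    for _ in range(n_iter):
        hv = apply_heisenberg(vec, n_sites)
        vec = [shift * v - h for v, h in zip(vec, hv)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    hv = apply_heisenberg(vec, n_sites)
    return sum(v * h for v, h in zip(vec, hv))  # Rayleigh quotient

if __name__ == "__main__":
    for n in (2, 4, 6, 8):
        print(n, ground_state_energy(n))
```

For two sites the result is the singlet energy $-3/4$. The memory cost grows as $2^N$, which is why such brute-force checks are limited to short chains and DMRG is needed for the 2000 site system discussed here.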
Fig. 2. Energy as a function of DMRG step for a 2000 site $S=1/2$ Heisenberg chain.
Fig. 3. Error in the total energy as a function of the number of states kept per block $m$ for the 2000 site $S=1/2$ Heisenberg chain. The solid line represents the gap to the first excited state, as determined by the Bethe Ansatz. The exact ground-state energy is $-886.105634697$; the gap to the first excited state is 0.0021903.
Fig. 3 shows the error in the total energy relative to the exact Bethe Ansatz result. Although the $S=1/2$ chain is gapless, since the chain is finite, there is a finite gap to the first excited state. By keeping 80 or more states, our total energy is below the energy of the first excited state. This means that the DMRG wave function is an excellent approximation to the ground state, with very small components of excited states. This gives us confidence that any correlation functions measured by DMRG must be quite accurate, since they are exact measurements for the approximate DMRG wave function. The error in the energy probably falls off roughly as a power law in $m$ until the first excited state is reached, after which it probably falls off exponentially. For gapped systems, such as the $S=1$ chain, the errors appear to fall off exponentially starting with small values of $m$, independent of the length of the system.
7. Conclusions
In this review we have introduced some of the common models used for the study of strongly correlated systems and discussed some of the numerical methods used to study them. We have
described in some detail the density matrix renormalization group, and presented some test results showing some of its capabilities for 1D spin systems. Unfortunately, we have left out a substantial body of work extending DMRG to new regimes. During the last several years there has been progress in applying DMRG to classical 2D systems; in using it at finite temperature; in momentum space; in treating boson systems as well as fermion systems; in extracting dynamic information from DMRG; in using symmetry more effectively; in understanding the structure of the DMRG wave function; and in treating wider and wider ladder systems. We expect progress in these and other areas to continue for the foreseeable future.
Acknowledgements
We acknowledge support from the NSF under Grant No. DMR-9509945.
References
[1] W. von der Linden, Phys. Rep. 220 (1992) 53.
[2] E. Dagotto, Rev. Mod. Phys. 66 (1994) 763.
[3] K.G. Wilson, Rev. Mod. Phys. 47 (1975) 773.
[4] S.R. White, Phys. Rev. Lett. 69 (1992) 2863.
[5] S.R. White, Phys. Rev. B 48 (1993) 10345.
[6] S.R. White, R.M. Noack, Phys. Rev. Lett. 68 (1992) 3487.
[7] R.M. Noack, S.R. White, Phys. Rev. B 47 (1993) 9243.
[8] R.P. Feynman, Statistical Mechanics: A Set of Lectures, Benjamin, Reading, MA, 1972.
[9] E.R. Davidson, J. Comput. Phys. 17 (1975) 87.
[10] M.P. Nightingale, H.W.J. Blöte, Phys. Rev. B 33 (1986) 659.
[11] K. Nomura, Phys. Rev. B 40 (1989) 2421.
[12] F.D.M. Haldane, Phys. Lett. A 93 (1983) 464.
[13] F. Alcaraz et al., J. Phys. A 20 (1987) 6397.
Physics Reports 301 (1998) 205—234
Field theory of critical phenomena: quantitative analysis beyond powerlaws
Lothar Schäfer
Fachbereich Physik, Universität Essen, 45117 Essen, Germany
Abstract
We present a brief introduction to the theory of renormalization, where we base our discussion on the dilatation group and concentrate on the structural aspects. We then in particular consider the crossover between the Gaussian and the nontrivial fixed point. Crossover functions show two branches, depending on the physical coupling strength. We present recent Monte Carlo data for self-repelling random walks (polymer solutions) that verify this two-branched structure. We finally discuss aspects important for a perturbative calculation of the crossover functions. Using data for a variety of observables in polymer solutions we demonstrate that the theory has reached the stage of quantitative success. © 1998 Elsevier Science B.V. All rights reserved.
PACS: 64.60.-i
1. Introduction: The general framework
Systems at a critical point show power law and scaling behavior. The standard example is a ferromagnet at its Curie temperature $T_c$, where the spontaneous magnetization vanishes. Consider, for instance, the susceptibility
$$\chi(t_0,h_0)=\left.\frac{\partial M_0}{\partial h_0}\right|_{t_0\ \mathrm{fixed}},$$
where $M_0$ is the magnetization density, $h_0$ is the external magnetic field, and $t_0=(T-T_c)/T_c$ denotes a reduced temperature variable, vanishing at $T_c$. In the zero-field limit $h_0\to0$ one finds the power law
$$\chi(t_0,0)\sim t_0^{-\gamma},\qquad t_0\to0.$$
This is the limiting form of the more general scaling law
$$\chi(t_0,h_0)\sim t_0^{-\gamma}\,\bar\chi\!\left(h_0/t_0^{\gamma+\beta}\right),$$
valid for sufficiently small $t_0$ and $h_0$. The critical exponents $\gamma$, $\beta$ as well as the scaling function $\bar\chi(y)$ are universal, i.e., they are the same for large classes of magnets. Quite generally the critical point of any second order phase transition shows such scaling properties.
There also are other critical systems, not related to thermodynamic phase transitions. A well known example is a dilute solution of very long polymer molecules. For example, polystyrene $[-\mathrm{CH(C_6H_5)}-\mathrm{CH_2}-]_{N_0}$ consists of $N_0$ styrene units (‘monomers’), linked together to form a chain. Values up to $N_0\approx2\times10^5$ can be reached. If such a molecule is dissolved in toluene, its spatial configuration takes the shape of a coil, randomly fluctuating due to collisions with solvent particles. The average radius of the coil can be measured and is found to depend on chain length $N_0$ and number concentration $c_p$ of chain molecules in the solution as
$$R\sim N_0^{\nu}\,\bar R\!\left(N_0^{3\nu}c_p\right).$$
This scaling law is valid for small $N_0c_p$ and large $N_0$. For $c_p\to0$ it reduces to the power law $R\sim N_0^{\nu}$. Again the exponent $\nu$ and the scaling function $\bar R(y)$ are the same for a large class of polymer-solvent systems, i.e., universal.
This striking similarity in the structure of the physical laws governing completely different physical phenomena suggests that some very general mechanism must be at work. Among the most basic principles of physics is invariance under groups of spatial transformations. If such an invariance holds, it greatly restricts the functional form of the physical observables. Consider, for instance, some quantity $f(\mathbf r_1,\mathbf r_2)$ depending on two space points $\mathbf r_1$, $\mathbf r_2$. If the system is translationally invariant, we know that $f(\dots)$ can depend only on $\mathbf r_1-\mathbf r_2$: $f(\mathbf r_1,\mathbf r_2)=\bar f(\mathbf r_1-\mathbf r_2)$. If it in addition is rotationally invariant, $\bar f(\dots)$ under rotations must transform like a component of a tensor, otherwise depending on $\mathbf r_1$, $\mathbf r_2$ only via $|\mathbf r_1-\mathbf r_2|$.
It then is natural to go one step further and ask, what are the properties of a system which in addition is invariant under spatial dilatations: $\mathbf r\to\lambda\mathbf r$? We will see that this leads to the scaling laws, which reduce to power laws in specific limits. Dilatation invariance is the property that identifies a system as being critical.
Here we should pause a little to ask, in which sense physical systems like magnets or polymers can be invariant under spatial dilatations? All such systems show a fixed microscopic length scale of the order of ångströms. The lattice spacing of a magnetic crystal or the size of a monomer in a polymer molecule are examples of such scales. The microscopic hamiltonian of such systems will contain this fixed microscopic scale and cannot be dilatation invariant. Dilatation symmetry is a macroscopic feature, due to the cooperation of many fluctuating microscopic degrees of freedom. This is most easily explained with a simple example from polymer physics.
Assume that the configuration of a polymer in solution is modelled as a random walk of $N_0$ steps $\mathbf s_j=\mathbf r_j-\mathbf r_{j-1}$, $|\mathbf s_j|=l_0$; $j=1,\dots,N_0$, in $d$-dimensional space. The $\mathbf r_j$ fix the endpoints of the segments, which represent the monomers. Then the central limit theorem tells us that for large $N_0$ the distribution of the end-to-end vector $\mathbf r_e=\mathbf r_{N_0}-\mathbf r_0=\sum_{j=1}^{N_0}\mathbf s_j$ is Gaussian:
$$P(\mathbf r_e)=\left(\frac{d}{2\pi R_e^2}\right)^{d/2}\exp\!\left(-\frac{d\,r_e^2}{2R_e^2}\right)\left[1+O\!\left(\frac{l_0^2}{R_e^2}\right)\right]. \qquad (1.1)$$
Here $R_e^2$ is the mean squared end-to-end distance
$$R_e^2=\int d^dr_e\; r_e^2\,P(\mathbf r_e), \qquad (1.2)$$
which is a macroscopic quantity. Within our simple model $R_e^2$ is easily expressed in terms of microscopic quantities as $R_e^2=l_0^2N_0$, but this is completely irrelevant for the macroscopic law (1.1). Indeed, the central limit theorem guarantees that Eq. (1.1) holds for a large class of microscopic models, allowing also for correlations among neighbouring $\mathbf s_j$, where only the expression of $R_e^2$ in terms of microscopic parameters changes. It is the combined action of the many fluctuating variables $\mathbf s_j$ that suppresses the microstructure.
Indeed, concerning its basic structure the result (1.1) can be seen as a scaling law resulting from dilatation invariance. To show this we write $P(\mathbf r_e)$ somewhat more explicitly as $P(r_e^2,R_e^2)$. Note that we used rotational invariance, and we invoked the central limit theorem to suppress the microscopic variable $l_0$. The argument therefore is correct up to microstructure corrections $O(l_0^2/R_e^2)$. Being a length, $R_e$ under dilatations $\mathbf r_e\to\lambda\mathbf r_e$, $\lambda>0$, must transform as $R_e\to\lambda R_e$. We now assume that $P(r_e^2,R_e^2)$ under dilatations transforms as
$$P(\lambda^2r_e^2,\lambda^2R_e^2)=\lambda^{-a_0}P(r_e^2,R_e^2). \qquad (1.3)$$
In the next section we will see that this is the typical form of a transformation law in a dilatation invariant theory (cf. Eq. (2.16)). The exponent $a_0$ is easily determined by noting that $P$ is normalized, independent of $R_e$:
$$1\equiv\int d^dr_e\,P(r_e^2,R_e^2)\overset{(1.3)}{=}\lambda^{a_0}\int\frac{d^d(\lambda r_e)}{\lambda^d}\,P(\lambda^2r_e^2,\lambda^2R_e^2)=\lambda^{a_0-d}.$$
Thus $a_0=d$, and the transformation law (1.3) takes the form
$$P(r_e^2,R_e^2)=\lambda^{d}P(\lambda^2r_e^2,\lambda^2R_e^2). \qquad (1.4)$$
We may now choose the scaling parameter $\lambda$ as $\lambda=1/R_e$, to find the scaling law
$$P(r_e^2,R_e^2)=(R_e^2)^{-d/2}\,\bar P(r_e^2/R_e^2). \qquad (1.5)$$
Thus, assuming dilatation covariance in the form (1.3) we find the scaling form. However, dilatation symmetry clearly does not fix the explicit expression (1.1) of the scaling function $\bar P(y)$. (Indeed, the argument just given is nothing but an awkward form of dimensional analysis, that however clearly exhibits the underlying dilatation group.)
This example shows that there indeed are systems for which the microscopic scales are irrelevant for macroscopic behavior. However, there still are macroscopic scales like the end-to-end distance $R_e^2$ of a polymer. For the magnet the corresponding scale is the correlation length $\xi$, which gives the range of the correlations among the directions of the individual magnetic moments. Scale invariance in a strict sense holds only if these macroscopic lengths diverge. This, then, provides us with another definition of a critical point.
After this digression let us come back to the general aspects of symmetries. Symmetry operations generally are represented by operators acting in the space of physical states. (Think of a rotation matrix acting on a vector.) The groups of interest here analytically depend on certain parameters (rotation angles, dilatation factor), and for such groups it is sufficient to know the operators generating infinitesimal transformations as well as their commutation relations to reconstruct arbitrary group transformations. For the rotation group, for instance, the generators are the angular momentum operators $L_j$, $j=1,2,3$, obeying the commutation relations
$$[L_j,L_k]=i\,\epsilon_{jkl}L_l. \qquad (1.6)$$
Any set of three (hermitian) operators $L_j$ obeying Eq. (1.6) generates a representation of the rotation group. As we all learned in quantum mechanics, there are countably many representations, i.e., countably many sets of matrices obeying Eq. (1.6). They are distinguished by the ‘quantum number’ $l$ of total angular momentum, and the $l$-representation acts in a $(2l+1)$-dimensional vector space $|l\,m\rangle$.
If we now want to evaluate some matrix elements in quantum mechanics, say $\langle l\,m|r_j|l'\,m'\rangle$, for definiteness, then this task decomposes into two well separated steps. Knowing $l$, $l'$, i.e., knowing the relevant representations, we can use the group structure to calculate all such matrix elements, provided we know a single ‘reduced’ matrix element (Wigner-Eckart theorem). Even in evaluating the reduced matrix element the group helps somewhat since it allows us to rotate the coordinate frame into some convenient position. But then we have fully exploited the rotation group and the quantitative determination of the reduced matrix element is a task completely independent of the group structure. I here stress this point since it touches the basic philosophy underlying our approach to scaling functions.
Turning to the dilatation group we note that its structure is much simpler than that of the rotation group, since in contrast to two different rotations, two different dilatations always commute. As a result, the algebra of infinitesimal transformations worked out in the next section (we will discuss the combined action of translations and dilatations) is simpler than angular momentum algebra. However, it also does not restrict the representations as strongly. We will find a continuum of allowed infinitesimal transformations, parameterized by a real number $a$. This number is the analog to $l$ in the angular momentum problems, and it will turn out that some quantity transforming under the $a$-representation of dilatations shows scaling laws, where $a$ occurs as critical exponent.
Given now some dilatation invariant theory, it is a formidable task to determine the exponents and thus the representations of the dilatation group for the quantities of interest. In field theory, proving the dilatation invariance of a theory technically is known as showing that the theory is renormalizable. The renormalization group (RG) expresses the effect of infinitesimal dilatations and thus gives the generator of the group. Knowing the RG, i.e., knowing the representation, we know all the general structure of the scaling laws including the exponents. This is analogous to knowing the quantum number $l$ and the corresponding angular momentum operators. Calculating the scaling functions is a separate task, analogous to calculating the reduced matrix elements. The group helps in that it allows us to scale the system into a parameter region where our perturbative methods might be hoped to give good results. But still, basically the task of determining the RG and the task of calculating the scaling functions are well separated, even though this fact sometimes is masked by the complications of the theory and these two steps may be mixed in explicit calculations.
In the following lectures I aim at illustrating ideas and concepts useful in calculating scaling functions. But of course we first have to set up the renormalization group. Since this is covered by excellent books or reviews (see, for instance, [1,2]), I will not go into any technical details here. More detailed discussions will be presented for the scaling functions, where I mostly will use the example of polymer solutions, which show a particularly rich critical behavior. For a magnet we essentially can control temperature $t_0$ and magnetic field $h_0$. Changing other variables like the pressure will not do much. $t_0$ and $h_0$ for a magnet roughly correspond to inverse chain length $1/N_0$ and chain concentration $c_p$ for a polymer solution. Besides these variables we also can change the
temperature of the solution, and this may have a great effect. At a special temperature $T=\Theta$ the chain may well be represented by the random walk model discussed above, but with increasing $T$ the repulsive interaction among the monomer units becomes dominant, leading to nontrivial exponents and scaling functions. In technical terms: in polymer physics we most naturally can cover the crossover from the Gaussian (or rather the tricritical) fixed point to the nontrivial fixed point of the RG. We thus can measure crossover scaling functions depending continuously on a four dimensional parameter space $(N_0,c_p,T,q)$. Here $q$ is the scattering vector in a scattering experiment. The renormalization group immediately yields the qualitative form of the scaling laws and allows for a calculation of the asymptotic scaling exponents. But our ultimate goal here is to find an internally consistent set of quantitative predictions for all measured crossover scaling functions.
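As a small numerical aside on the random walk model referred to above: for uncorrelated steps of fixed length $l_0$ the mean squared end-to-end distance is $l_0^2N_0$. The sketch below samples such walks and estimates this average. It is an illustration only; the function names, the sample sizes and the use of normalized Gaussian vectors to draw isotropic step directions are assumptions of this example.

```python
import math
import random

def random_unit_vector(d, rng):
    """Draw an isotropic unit vector in d dimensions (normalized Gaussian vector)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:
            return [x / norm for x in v]

def mean_squared_end_to_end(n_steps, step_length=1.0, d=3, n_walks=2000, seed=0):
    """Monte Carlo estimate of <r_e^2> for walks of n_steps fixed-length steps."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_walks):
        end = [0.0] * d
        for _ in range(n_steps):
            u = random_unit_vector(d, rng)
            for k in range(d):
                end[k] += step_length * u[k]
        total += sum(x * x for x in end)
    return total / n_walks

if __name__ == "__main__":
    n0, l0 = 50, 2.0
    estimate = mean_squared_end_to_end(n0, step_length=l0)
    print("estimate:", estimate, "expected:", l0 * l0 * n0)
```

The estimate fluctuates around $l_0^2N_0$ with a statistical error that shrinks as the number of sampled walks grows. A self-repelling walk, relevant above the $\Theta$ temperature, would require a different sampling scheme and is not covered by this sketch.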
2. Dilatation transformations in field theory
In describing the macroscopic properties of a magnet we in a first attempt ignore the crystal structure and replace the magnetic moments of the individual ions by a continuous field $\varphi(\mathbf r)$ defined in $d$-dimensional space. We want to study the action of the dilatation group in such a field theory.¹ As a prerequisite, let us consider translations. We introduce the translation operator $T_{\mathbf a}$, that shifts the field by the vector $\mathbf a$:
$$T_{\mathbf a}\varphi(\mathbf r)=\varphi(\mathbf r-\mathbf a). \qquad (2.1)$$
¹ A reader interested in a more detailed discussion of the role of dilatations in field theory should consult [3].
Expanding this relation for infinitesimal $d\mathbf a$ we find
$$\left[1+\sum_{j=1}^{d}da_j\left.\frac{\partial}{\partial a_j}\right|_{0}T_{\mathbf a}\right]\varphi(\mathbf r)=\left[1-\sum_{j=1}^{d}da_j\frac{\partial}{\partial r_j}\right]\varphi(\mathbf r).$$
We thus read off the expressions for the generator of infinitesimal translations:
$$\left.\frac{\partial}{\partial a_j}\right|_{0}T_{\mathbf a}=-\frac{\partial}{\partial r_j}=p_j. \qquad (2.2)$$
(I omitted the factor of $i$ conventionally included into the definition of $\mathbf p$.)
We now consider dilatations $\mathbf r\to\lambda\mathbf r$, to be represented by an operator $D_\lambda$ in function space. To find the possible forms of $D_\lambda$ we note the defining properties of dilatations.
1. Dilatations and rotations commute. This implies that $D_\lambda$ must be a scalar operator.
2. First shifting by $\mathbf a$ and then scaling by $\lambda$ must yield the same result as first scaling by $\lambda$ and then shifting by $\lambda\mathbf a$: $\lambda(\mathbf r-\mathbf a)=(\lambda\mathbf r)-(\lambda\mathbf a)$. This implies the relation
$$D_\lambda T_{\mathbf a}=T_{\lambda\mathbf a}D_\lambda. \qquad (2.3)$$
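3. A dilatation by a factor $\lambda=1$ is the identity operation. This implies that the generator of infinitesimal dilatations is to be defined as
$$D=\left.\frac{\partial}{\partial\lambda}\right|_{\lambda=1}D_\lambda. \qquad (2.4)$$
We now take the derivative $\left.\frac{\partial}{\partial\lambda}\right|_{\lambda=1}\left.\frac{\partial}{\partial a_j}\right|_{\mathbf a=0}$ of Eq. (2.3) to find $Dp_j=p_j+p_jD$, or
$$[D,\mathbf p]=\mathbf p. \qquad (2.5)$$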
Any scalar operator obeying this relation generates a representation of dilatations. With the form (2.2) of $\mathbf p$ it is easily checked that
$$D^{(0)}=-\mathbf r\cdot\nabla_{\mathbf r} \qquad (2.6)$$
solves Eq. (2.5). We however may add any scalar operator that commutes with $\mathbf p$. The simplest extension of Eq. (2.6) is given by the set of operators
$$D^{(a)}=-\mathbf r\cdot\nabla_{\mathbf r}-a,\qquad a\ \text{real}. \qquad (2.7)$$
$a$ is known as the ‘scaling dimension’ of the representation. We now use the simple multiplicative structure of the dilatation group:
$$D^{(a)}_{\lambda'\cdot\lambda}\,\varphi(\mathbf r)=D^{(a)}_{\lambda'}\cdot D^{(a)}_{\lambda}\,\varphi(\mathbf r), \qquad (2.8)$$
to determine the effect of finite dilatations. We introduce the notation
$$\varphi_{\lambda,a}(\mathbf r)=D^{(a)}_{\lambda}\varphi(\mathbf r), \qquad (2.9)$$
and we take the derivative $\left.\frac{\partial}{\partial\lambda'}\right|_{\lambda'=1}$ of Eq. (2.8) to find
$$\lambda\frac{d}{d\lambda}\varphi_{\lambda,a}(\mathbf r)=D^{(a)}\varphi_{\lambda,a}(\mathbf r)=(-\mathbf r\cdot\nabla_{\mathbf r}-a)\,\varphi_{\lambda,a}(\mathbf r).$$
This equation can be integrated immediately to yield the transformation law under finite dilatations in the $a$-representation:
$$\varphi_{\lambda,a}(\mathbf r)=\lambda^{-a}\varphi(\mathbf r/\lambda). \qquad (2.10)$$
We now turn to our continuum model of a magnet. The field $\varphi(\mathbf r)$ is to be interpreted as a configuration of the magnetic moments, and the hamiltonian $H[\varphi]$ is a functional of the field. (Appropriate boundary conditions for $\varphi(\mathbf r)$ are understood.) The sum over all configurations becomes an integral over all functions. Thus the partition function is written as a functional integral
$$Z=\int D[\varphi]\,e^{-H[\varphi]}, \qquad (2.11)$$
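where the factor $1/k_BT$ is absorbed into $H$. The integration measure in function space formally is represented by $D[\varphi]$, which should not be confused with a dilatation operator. Besides the partition function also correlations of the field
$$G^{(M)}(\mathbf r_1,\dots,\mathbf r_M)=\frac1Z\int D[\varphi]\,e^{-H}\prod_{j=1}^{M}\varphi(\mathbf r_j) \qquad (2.12)$$
are of interest. In particular, two-point correlations $G^{(2)}$ are measured in scattering experiments.
We now assume that the hamiltonian is invariant under the $a=a_0$ representation of the dilatation group:
$$H[\varphi_{\lambda,a_0}]\equiv H[\varphi]. \qquad (2.13)$$
We relabel the integration variables $\varphi(\mathbf r)$ in Eq. (2.12) as $\varphi_{\lambda,a_0}(\mathbf r)$ and we use Eq. (2.13) to find
$$G^{(M)}(\mathbf r_1,\dots,\mathbf r_M)=\frac{\int D[\varphi_{\lambda,a_0}]\,\prod_{j=1}^{M}\varphi_{\lambda,a_0}(\mathbf r_j)\,e^{-H[\varphi]}}{\int D[\varphi_{\lambda,a_0}]\,e^{-H[\varphi]}}. \qquad (2.14)$$
Note that the relabelling of course does not change $G^{(M)}$, which stays independent of $\lambda$. We then take the derivative $\left.\frac{\partial}{\partial\lambda}\right|_{\lambda=1}$. It can be shown that the contribution due to the $\lambda$-dependence of the integration measure cancels. The resulting identity
$$0=\frac1Z\int D[\varphi]\sum_{j'=1}^{M}\left[\left.\frac{d}{d\lambda}\right|_{1}\varphi_{\lambda,a_0}(\mathbf r_{j'})\right]\prod_{j\neq j'}\varphi(\mathbf r_j)\,e^{-H}=\left(-\sum_{j=1}^{M}\mathbf r_j\cdot\nabla_{\mathbf r_j}-Ma_0\right)G^{(M)}(\mathbf r_1,\dots,\mathbf r_M) \qquad (2.15)$$
is known as Ward-identity of the dilatation group. It is easily integrated to yield the scaling properties of $G^{(M)}$:
$$G^{(M)}(\mathbf r_1,\dots,\mathbf r_M)=\lambda^{-Ma_0}\,G^{(M)}\!\left(\frac{\mathbf r_1}{\lambda},\dots,\frac{\mathbf r_M}{\lambda}\right). \qquad (2.16)$$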
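The single-field transformation law (2.10) that underlies Eq. (2.16) can be checked numerically. The sketch below works in one dimension and uses finite differences; the helper names, the test function and the step size `eps` are assumptions made for this illustration. It verifies that $\lambda\,d\varphi_{\lambda,a}/d\lambda$ agrees with $D^{(a)}\varphi_{\lambda,a}$ and that two successive dilatations compose multiplicatively as in Eq. (2.8).

```python
import math

def dilate(phi, lam, a):
    """Return the dilated field r -> lam**(-a) * phi(r / lam), Eq. (2.10) in 1D."""
    return lambda r: lam ** (-a) * phi(r / lam)

def generator(phi, a, r, eps=1e-5):
    """Apply D^(a) = -r d/dr - a to phi at the point r (central difference)."""
    dphi = (phi(r + eps) - phi(r - eps)) / (2.0 * eps)
    return -r * dphi - a * phi(r)

def lambda_derivative(phi, lam, a, r, eps=1e-5):
    """Evaluate lam * d/dlam of the dilated field at r (central difference in lam)."""
    up = dilate(phi, lam + eps, a)(r)
    down = dilate(phi, lam - eps, a)(r)
    return lam * (up - down) / (2.0 * eps)

if __name__ == "__main__":
    phi = lambda r: math.exp(-r * r)   # arbitrary smooth test field
    a, lam, lam2, r = 0.7, 1.8, 0.6, 1.3

    lhs = lambda_derivative(phi, lam, a, r)
    rhs = generator(dilate(phi, lam, a), a, r)
    print("flow equation:", lhs, rhs)

    twice = dilate(dilate(phi, lam, a), lam2, a)(r)
    once = dilate(phi, lam * lam2, a)(r)
    print("composition:  ", twice, once)
```

A pure power law $|r|^{-a}$ is left unchanged by `dilate` with the same $a$, which is the one-variable analogue of the invariance expressed by Eq. (2.16).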
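Invoking translational and rotational invariance we, in particular, find for the two-point function
$$G^{(2)}(\mathbf r_1,\mathbf r_2)=\lambda^{-2a_0}\,\bar G^{(2)}\!\left((\mathbf r_1-\mathbf r_2)^2/\lambda^2\right).$$
With the choice $\lambda^2=(\mathbf r_1-\mathbf r_2)^2$ the power law
$$G^{(2)}(\mathbf r_1,\mathbf r_2)=\frac{\bar G^{(2)}(1)}{\left[(\mathbf r_1-\mathbf r_2)^2\right]^{a_0}} \qquad (2.17)$$
results.
These results rely on the dilatation invariance of the hamiltonian. Close to a critical point $H$ generally is written in Landau-Ginzburg-Wilson form:
$$H[\varphi]=\int d^dr\left[\frac12(\nabla\varphi(\mathbf r))^2+\frac{m_0^2}{2}\varphi^2(\mathbf r)+\frac{u_0}{4!}\varphi^4(\mathbf r)-h_0\,\varphi(\mathbf r)\right], \qquad (2.18)$$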
where $h_0$ represents the magnetic field, the ‘mass’ $m_0$ incorporates the temperature dependence, and $u_0$ is known as the coupling constant. We now consider the properties of this expression under dilatations. With Eq. (2.10) we find
$$H[\varphi_{\lambda,a_0}]=\int d^dr\left[\frac{\lambda^{-2a_0}}{2}(\nabla\varphi(\mathbf r/\lambda))^2+\frac{m_0^2}{2}\lambda^{-2a_0}\varphi^2(\mathbf r/\lambda)+\lambda^{-4a_0}\frac{u_0}{4!}\varphi^4(\mathbf r/\lambda)-\lambda^{-a_0}h_0\,\varphi(\mathbf r/\lambda)\right]$$
$$=\int d^dr\left[\frac{\lambda^{d-2-2a_0}}{2}(\nabla\varphi(\mathbf r))^2+\lambda^{d-2a_0}\frac{m_0^2}{2}\varphi^2(\mathbf r)+\lambda^{d-4a_0}\frac{u_0}{4!}\varphi^4(\mathbf r)-\lambda^{d-a_0}h_0\,\varphi(\mathbf r)\right], \qquad (2.19)$$
where the last equation results from the substitution $\mathbf r\to\lambda\mathbf r$. Comparing to Eq. (2.18) we find that dilatation invariance holds only if
$$d-2-2a_0=0,\qquad d-4a_0=0,\qquad m_0=0=h_0. \qquad (2.20)$$
(Note that we take $u_0>0$ to have a nontrivial theory.) The first two equations yield $d=4$, $a_0=1$. Thus the only naively scale invariant theory of the form (2.18) is a massless theory with vanishing magnetic field in four dimensions. The scaling dimension $a_0$ of the field is then trivial.
This result is not completely discouraging. We note that the simple mean field theory (Landau theory) of phase transitions identifies the free energy with the hamiltonian (2.18) evaluated for constant field $\varphi(\mathbf r)\equiv M_0$. The critical point then is found for $h_0=0=m_0$, and $m_0^2$ is taken as measure of the reduced temperature: $m_0^2\sim t_0$. This is consistent with the above findings. We, however, are searching for a theory of critical phenomena in three dimensions, with in general nontrivial exponents, whereas dilatation invariance seems to hold only in $d=4$, the exponent being trivial: $a_0=1$. What has gone wrong?
3. Continuum limit and renormalization
The basic mistake made in the previous section is our very naive use of the continuum limit. Indeed, as we have written it, neither the partition function nor the correlation functions exist, since all the functional integrals diverge. This is easily demonstrated for the free energy $F=-\ln Z$. We restrict ourselves to the trivial case $u_0=0$, $h_0=0$. Expressed by the Fourier-components of the field
$$\tilde\varphi(\mathbf k)=\int\frac{d^dr}{(2\pi)^{d/2}}\,e^{i\mathbf k\cdot\mathbf r}\,\varphi(\mathbf r)=\tilde\varphi^*(-\mathbf k),$$
the functional integral (2.11) then factorizes:
$$Z=\prod_{\mathbf k}\int\frac{d\tilde\varphi(\mathbf k)\,d\tilde\varphi(-\mathbf k)}{\pi}\exp\!\left[-\tilde\varphi(-\mathbf k)(k^2+m_0^2)\tilde\varphi(\mathbf k)\right].$$
Here the product essentially ranges over a half-space of $\mathbf k$-values. The individual integral is of Gaussian form and yields $(k^2+m_0^2)^{-1}$, and the free energy is found as
$$F=\frac12\int d^dk\,\ln(k^2+m_0^2). \qquad (3.1)$$
This expression diverges for all $d=1,2,3,\dots$, due to integration over large $k$. This is called an ultraviolet (u.v.) divergence. Thus even in the simplest noninteracting ($u_0=0$) form of our naively written down field theory the free energy just does not exist. However, the Fourier-transformed correlation functions are finite, since the divergences from numerator and denominator cancel. But as soon as we allow for the interaction $u_0>0$, which has to be treated perturbatively, we in each order of perturbation theory find u.v. divergences for all $d\ge2$. Thus the above analysis of dilatation invariance was based on non-existing quantities and seems meaningless.
Now this is a completely artificial problem. In principle the degrees of freedom of our magnet are localized on a lattice of spacing $l_0$, and the Fourier-components are restricted to the first Brillouin zone. Thus $k$-integrals are cut off at a value $k\approx l_0^{-1}$, and all expressions stay finite. It is the continuum limit $l_0\to0$ that results in the u.v. divergences. However, as soon as we keep the finite cutoff $l_0^{-1}$, we ruin the dilatation invariance of the hamiltonian, and again our argument breaks down. This is the true problem.
The way out of this dilemma is renormalization, which in this context can be characterized as a very refined way to carry through the continuum limit. It can be shown that order for order in perturbation theory the leading cutoff dependence of the correlation functions can be absorbed into a redefinition of the parameters $m_0^2$, $u_0$, $h_0$ of the hamiltonian and an associated $\mathbf r$-independent factor multiplying the fields. In four dimensions the remaining cutoff dependence introduces corrections of order $l_0^2/\xi^2$, where $\xi$ is the correlation length. As mentioned in the first section, in the vicinity of a critical point $\xi$ becomes macroscopically large, and the continuum limit consists in dropping the small corrections $\sim l_0^2/\xi^2$.
Technically renormalization even for the very simple theory (2.18) is quite involved, and to low orders of perturbation theory it is carried through in standard references [1,2]. I therefore here omit all technicalities, except for mentioning one point: in Section 2 we found dilatation invariance in four dimensions. In some sense we want to analytically continue this result to $d=3$. We thus need to define the theory for continuous $d$, $3\le d\le4$. This strange concept easily is implemented within the framework of perturbation theory. As example we consider the expression (3.1) for $F$. Integration over the direction of $\mathbf k$ yields the surface of the $d$-dimensional unit sphere, which easily is determined for all integer dimensions. As a result, Eq. (3.1) takes the form
$$F=\frac{\pi^{d/2}}{\Gamma(d/2)}\int_0^{l_0^{-1}}dk\,k^{d-1}\ln(k^2+m_0^2),$$
where $\Gamma(z)$ denotes the gamma function and where we inserted the cutoff. This result can be evaluated for continuous $d$, $0<d<\infty$. Essentially, the same trick works for all terms occurring in perturbation theory, and in the perturbative sense we succeeded in defining the theory for noninteger $d$.
Another very important aspect concerns simple dimensional analysis. $\mathbf r$ clearly has dimensions of length: $[r]=[l_0]$, and $H$ necessarily is dimensionless. (Recall that we absorbed the factor $1/k_BT$.) Considering the $(\nabla\varphi)^2$-term in $H$, Eq. (2.18), we find the ‘naive’ dimension of $\varphi$: $[\varphi]=[l_0]^{1-d/2}$. Then the other terms yield
$$[m_0^2]=[l_0]^{-2};\qquad[u_0]=[l_0]^{d-4};\qquad[h_0]=[l_0]^{-1-d/2}. \qquad (3.2)$$
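The power counting leading to Eq. (3.2) is mechanical enough to be written as a short function. The sketch below is an illustration under the stated convention that $H$ is dimensionless and $[r]=[l_0]$; the function name and the dictionary keys are hypothetical. It returns the exponent $x$ in $[\text{quantity}]=[l_0]^x$ for each parameter, using exact rational arithmetic so that non-integer $d$ can be handled as well.

```python
from fractions import Fraction

def naive_dimensions(d):
    """Return exponents x with [quantity] = [l_0]**x, following Eq. (3.2).

    Each term of H in Eq. (2.18) must be dimensionless; the volume element
    d^d r contributes d and each gradient contributes -1.
    """
    d = Fraction(d)
    phi = 1 - d / 2            # gradient term: d - 2 + 2*phi = 0
    m2 = -(d + 2 * phi)        # mass term:     d + m2 + 2*phi = 0
    u = -(d + 4 * phi)         # quartic term:  d + u + 4*phi = 0
    h = -(d + phi)             # field term:    d + h + phi = 0
    return {"phi": phi, "m0^2": m2, "u0": u, "h0": h}

if __name__ == "__main__":
    for dim in (3, Fraction(7, 2), 4):
        print(dim, naive_dimensions(dim))
```

The exponent of $u_0$ equals $d-4$ and therefore vanishes in four dimensions, consistent with the naive scale invariance found there in Section 2.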
In the unrenormalized theory we would extract all dimensions in powers of $l_0$. In the renormalized theory $l_0$ is eliminated, and for the purpose of dimensional analysis we introduce some arbitrary scale $l_R$. This scale will play an important role in the theory. Its arbitrariness will be used to analyse the dilatation properties.
We now are prepared to formulate the results of renormalization. The unrenormalized or ‘bare’ theory is given by the hamiltonian (2.18) with a short-distance cutoff $l_0$ (or u.v. cutoff $l_0^{-1}$, equivalently). It is the critical properties of this theory that we want to determine. For clarity I decompose the approach into three steps.
First step: Within the bare theory we define the critical mass $m_{0c}^2$ by the vanishing of the inverse correlation length $\xi^{-1}$. The point $h_0=0$, $m_0^2=m_{0c}^2$ is the critical point of the theory, and, deviating from the naive theory of Section 2, $m_{0c}^2=l_0^{-2}\,\bar m_{0c}^2(u_0l_0^{4-d})$ in general is nonzero due to the interaction. We then replace $m_0^2$ by the reduced temperature defined as $t_0=m_0^2-m_{0c}^2$. Note that this step serves to identify the critical region.
Second step: We formally introduce renormalized parameters $u_R$, $t_R$, $h_R$ via the substitutions
$$u_0=Z_u\!\left(u_R,\frac{l_0}{l_R}\right)l_R^{d-4}\,u_R,\qquad t_0=Z_t\!\left(u_R,\frac{l_0}{l_R}\right)l_R^{-2}\,t_R,\qquad h_0=Z_h\!\left(u_R,\frac{l_0}{l_R}\right)l_R^{-1-d/2}\,h_R. \qquad (3.3)$$
(To keep notation simple I deviate somewhat from the historically grown standard choice.) We also define the renormalized correlation functions $G_R^{(M)}$ by the relation
$$G^{(M)}(\mathbf r_1,\dots,\mathbf r_M;m_0^2,h_0,u_0,l_0)=Z_h^{-M}\!\left(u_R,\frac{l_0}{l_R}\right)l_R^{M(1-d/2)}\,G_R^{(M)}\!\left(\frac{\mathbf r_1}{l_R},\dots,\frac{\mathbf r_M}{l_R};t_R,h_R,u_R,\frac{l_0}{l_R}\right). \qquad (3.4)$$
Note that by construction all renormalized quantities are dimensionless. Note further, that up to now nothing has happened. We just have rewritten the physical correlation function $G^{(M)}$ in terms of new variables.
Final step: We now can state the theorem of renormalizability: The renormalization factors $Z_u$, $Z_t$, $Z_h$ can be constructed as power series in $u_R$ such that in the perturbation expansion of $G_R^{(M)}$ in powers of $u_R$ the continuum limit $l_0\to0$, with $t_R$, $u_R$, $h_R$, $l_R$ fixed, can be taken order for order, for all $d\le4$. In $d=4$ the corrections are of order $(l_0/\xi)^2$, up to logarithmic terms. Thus
$$G_R^{(M)}\!\left(\frac{\mathbf r_1}{l_R},\dots,\frac{\mathbf r_M}{l_R};t_R,h_R,u_R,\frac{l_0}{l_R}\right)=G_R^{(M)}\!\left(\frac{\mathbf r_1}{l_R},\dots,\frac{\mathbf r_M}{l_R};t_R,h_R,u_R\right)\left[1+O(l_0^2/\xi^2)\right] \qquad (3.5)$$
holds to all orders of perturbation theory.
We of course hope that the result stays valid also beyond perturbation theory. We furthermore hope that the corrections, which are of order $l_0^2/\xi^2$ in $d=4$, stay small in the critical region $l_0^2\ll\xi^2$ also if the theory is analytically continued to $d=3$.
We stress that the continuum limit $l_0\to0$ is not taken in the bare theory or in the renormalization factors,² which in fact absorb the leading $l_0$-dependence of the theory. Eqs. (3.4) and (3.5) together show that the physical correlation functions $G^{(M)}$ up to factors $Z_h^{-M}$ and up to (hopefully) negligible corrections can be calculated in the renormalized theory, where the cutoff is eliminated. This should be compared to the naive continuum limit $l_0\to0$ in $G^{(M)}$, which leads to unphysical divergent quantities.
4. Renormalization group equations and scaling
Eqs. (3.3), (3.4) and (3.5) define a continuous set of renormalized theories, parametrized by $l_R$, which are all equivalent to the same bare theory. We now study the $l_R$-dependence of the renormalized parameters $u_R$, etc. We substitute Eq. (3.5) into Eq. (3.4), ignore the corrections $\sim l_0^2/\xi^2$, and we differentiate the result with respect to $l_R$, keeping the bare theory fixed. Since the unrenormalized correlation function $G^{(M)}$ is independent of $l_R$, its derivative vanishes, and we find the equation
$$0=\left\{-\sum_{j=1}^{M}\mathbf r_j\cdot\nabla_{\mathbf r_j}+l_R\frac{du_R}{dl_R}\frac{\partial}{\partial u_R}+l_R\frac{dt_R}{dl_R}\frac{\partial}{\partial t_R}+l_R\frac{dh_R}{dl_R}\frac{\partial}{\partial h_R}-M\,l_R\frac{d\ln Z_h}{dl_R}+M\left(1-\frac d2\right)\right\}G_R^{(M)}(\mathbf r_1,\dots,\mathbf r_M;t_R,h_R,u_R). \qquad (4.1)$$
We introduce the standard notation
$$l_R\frac{d\ln Z_t}{dl_R}=2-\frac{1}{\nu(u_R)}, \qquad (4.2)$$
$$l_R\frac{d\ln Z_h}{dl_R}=\frac12\,\eta(u_R), \qquad (4.3)$$
$$l_R\frac{du_R}{dl_R}=-W(u_R), \qquad (4.4)$$
and we take the derivatives of Eq. (3.3) to find
$$l_R\frac{dt_R}{dl_R}=\frac{t_R}{\nu(u_R)},\qquad l_R\frac{dh_R}{dl_R}=\left(1+\frac d2-\frac12\,\eta(u_R)\right)h_R. \qquad (4.5)$$
² The expert will realize that for pedagogical reasons I ignore the possibility of dimensional regularization, which in this context is just a powerful but technical trick.
Recall that all bare parameters are kept fixed in taking these derivatives. With these results Eq. (4.1) takes the form
\[
0 = \left\{ -\sum_{j=1}^{M} r_j \cdot \nabla_{r_j}
- W(u_R)\frac{\partial}{\partial u_R}
+ \frac{1}{\nu(u_R)}\, t_R \frac{\partial}{\partial t_R}
+ \left(1 + \frac{d}{2} - \frac{\eta(u_R)}{2}\right) h_R \frac{\partial}{\partial h_R}
- M\left(\frac{d}{2} - 1 + \frac{\eta(u_R)}{2}\right) \right\} G_R^{(M)} .
\tag{4.6}
\]
These are the renormalization group equations.

Here one point needs explanation. In Eq. (3.3) we introduced renormalization factors $Z_u$, etc., that depend on $u_R$ and $\ell_0/\ell_R$. Thus we expect that also the derivatives (4.2) to (4.4) depend on both these variables. However, we can read Eq. (4.6) also as a set of algebraic equations which allows us to determine $W(u_R)$, $\nu(u_R)$, $\eta(u_R)$ in terms of the $G_R^{(M)}$. Since these functions manifestly are independent of $\ell_0/\ell_R$, so are $W(u_R)$, etc.

We now compare the RG equation (4.6) to the Ward identity of dilatations (2.15). At the critical point $t_R = 0$, $h_R = 0$ these equations differ only by the presence of the term $\sim W(u_R)$ in Eq. (4.6). We are thus led to consider the role of zeros $u_R^*$ of $W(u_R)$. If as initial condition in the integration of Eq. (4.4) we choose $u_R(\ell_R^{(0)}) = u_R^*$, we immediately find
\[ u_R(\ell_R) \equiv u_R^* \quad \text{for all } \ell_R . \]
Thus the zeros of $W(u_R)$ are fixed points of the RG flow of the coupling constant. At such a fixed point we for $t_R = 0 = h_R$ recover dilatation invariance:
\[
0 = \left\{ -\sum_{j=1}^{M} r_j \cdot \nabla_{r_j} - M\left(\frac{d}{2} - 1 + \frac{\eta(u_R^*)}{2}\right) \right\} G_R^{(M)} .
\]
The scaling dimension ($a_0$) of the field equals $\frac{d}{2} - 1 + \frac{\eta}{2}$, where $\eta = \eta(u_R^*)$.

Within the fixed point theory $u_R \equiv u_R^*$ we next consider the neighbourhood of the critical point $t_R = 0 = h_R$. We introduce the notation $\nu = \nu(u_R^*)$ and the dimensionless scaling parameter $\lambda = \ell_R^{(0)}/\ell_R$, and we integrate Eqs. (4.2) and (4.3):
\[
Z_t(\ell_R) = Z_t(\ell_R^{(0)})\,\lambda^{1/\nu - 2} , \qquad
Z_h(\ell_R) = Z_h(\ell_R^{(0)})\,\lambda^{-\eta/2} .
\tag{4.7}
\]
Substituting these results into the relation (3.3) among bare and renormalized quantities we find
\[
t_R = c_t\, t_0\, \lambda^{-1/\nu} , \qquad
h_R = c_h\, h_0\, \lambda^{-1 - d/2 + \eta/2} ,
\tag{4.8}
\]
where $c_t, c_h$ absorb the dependence on the initial parameter $\ell_R^{(0)}$. These equations express the dilatation properties of temperature and magnetic field. We now substitute these results into the relation among bare and renormalized correlation functions (Eqs. (3.4) and (3.5)):
\[
G^{(M)}(r_1,\dots,r_M;\, m_0^2, h_0, u_0, \ell_0)
= (c_G)^M \lambda^{M(d/2 - 1 + \eta/2)}\,
G_R^{(M)}\!\left(\lambda\frac{r_1}{\ell_R^{(0)}},\dots,\lambda\frac{r_M}{\ell_R^{(0)}};\;
c_t t_0 \lambda^{-1/\nu},\; c_h h_0 \lambda^{-1 - d/2 + \eta/2},\; u_R^*\right) .
\tag{4.9}
\]
Here $c_G$ is another constant.
This result incorporates all the scaling and power laws. To illustrate this we consider the two-point function ($M = 2$) and we fix the arbitrary scaling parameter $\lambda$ by the relation
\[ 1 = c_t t_0 \lambda^{-1/\nu} = t_R . \]
Using also translation and rotation invariance we then find the scaling law
\[
G^{(2)}\big((r_1 - r_2)^2;\, m_0^2, h_0, u_0, \ell_0\big)
= c_G^2\, (c_t t_0)^{\nu(d - 2 + \eta)}\,
\bar G^{(2)}\!\big((c_t t_0)^{2\nu}(r_1 - r_2)^2;\; c_h h_0 (c_t t_0)^{-\nu(1 + d/2 - \eta/2)}\big) .
\tag{4.10}
\]
The susceptibility is found by integrating this expression over $r = r_1 - r_2$. In zero field $h_0 = 0$ this yields
\[
\chi = c_G^2\, (c_t t_0)^{\nu(d - 2 + \eta)} \int d^d r\; \bar G^{(2)}\big((c_t t_0)^{2\nu} r^2, 0\big)
= c_G^2\, (c_t t_0)^{-\nu(2 - \eta)} \int d^d r\; \bar G^{(2)}(r^2, 0) .
\tag{4.11}
\]
The integral yields a pure number, and we thus have derived the power law $\chi \sim t_0^{-\gamma}$ (Eq. (1.1)), with the well known exponent relation $\gamma = \nu(2 - \eta)$.

To summarize, all the critical power and scaling laws follow from dilatation invariance at a fixed point of the renormalization group. The exponents determine the transformation properties under dilatations of the physical quantities, as announced in the first section.

There remains the question why the critical theory corresponds to a fixed point of the RG. To understand this we have to look into the ‘infrared’ properties of the theory. We first consider the typical form of the flow equation (4.4) of the renormalized coupling. Its general structure follows from the first line of Eq. (3.3):
\[ \ell_R \frac{du_R}{d\ell_R} = \varepsilon u_R - \hat W(u_R) . \tag{4.12} \]
Here $\hat W(u_R)$ is due to the derivative of $Z_u$, and we introduced the standard notation
\[ \varepsilon = 4 - d . \tag{4.13} \]
In renormalized perturbation theory $\hat W(u_R)$ is found as
\[ \hat W(u_R) = a u_R^2 + O(u_R^3) , \tag{4.14} \]
where $a$ is a positive coefficient of order 1. It depends on some details of the theory. For instance, instead of a single field $\varphi(r)$ we could introduce an $n$-component vector field $(\varphi_1(r),\dots,\varphi_n(r))$ and write the hamiltonian in terms of scalar products of such vector fields. The resulting coefficient $a$ is a linear function of $n$.

The typical form of the r.h.s. of Eq. (4.12) is sketched in Fig. 1. Clearly, $u_R = 0$ is a fixed point, representing a trivial noninteracting or ‘Gaussian’ theory. For $\varepsilon > 0$, i.e. $d < 4$, this fixed point is ‘infrared’ (i.r.) unstable: for a starting value $u_R^{(0)} = u_R(\ell_R^{(0)}) > 0$ close to zero, $u_R(\ell_R)$ with increasing $\ell_R$ is driven away from zero. A second zero of $W(u_R)$ is found at
\[ u_R^* = \frac{\varepsilon}{a} + O(\varepsilon^2) . \tag{4.15} \]
This ‘nontrivial’ fixed point is i.r. stable: any starting value in some neighbourhood of $u_R^*$ that includes the interval $0 < u_R^{(0)} < u_R^*$ is driven into $u_R^*$ for $\ell_R \to \infty$.
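The flow between the two fixed points can be checked numerically. The following is a minimal sketch, not part of the original lectures: it assumes that $\hat W$ is truncated at the quadratic term of Eq. (4.14), sets $\varepsilon = a = 1$ purely for illustration (so that $u_R^* = \varepsilon/a = 1$), and integrates Eq. (4.12) in the variable $s = \ln(\ell_R/\ell_R^{(0)})$. All function names are hypothetical.

```python
import math

def flow_rhs(u, eps, a):
    """Right-hand side of Eq. (4.12), with W_hat truncated as in Eq. (4.14)."""
    return eps * u - a * u * u

def integrate_flow(u0, eps=1.0, a=1.0, s_max=20.0, steps=20000):
    """Integrate du/ds = eps*u - a*u**2, s = ln(l_R / l_R0), with classical RK4."""
    h = s_max / steps
    u = u0
    for _ in range(steps):
        k1 = flow_rhs(u, eps, a)
        k2 = flow_rhs(u + 0.5 * h * k1, eps, a)
        k3 = flow_rhs(u + 0.5 * h * k2, eps, a)
        k4 = flow_rhs(u + h * k3, eps, a)
        u += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return u

def exact_flow(u0, s, eps=1.0, a=1.0):
    """Closed-form solution of the truncated (logistic) flow, for comparison."""
    u_star = eps / a
    return u_star / (1.0 + (u_star / u0 - 1.0) * math.exp(-eps * s))

if __name__ == "__main__":
    # Illustrative starting values: one close to the Gaussian fixed point,
    # one above the nontrivial fixed point.
    for start in (0.01, 3.0):
        print(start, integrate_flow(start))
```

Within this truncation, a weak starting coupling and a strong one are both driven to $u_R^* = \varepsilon/a$ as $\ell_R$ grows, while a start at exactly $u_R = 0$ stays there.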
Fig. 1. Typical form of $-W(u_R)$. The arrows indicate the flow of $u_R$ with increasing $\ell_R$.
Why are we interested in the flow of $u_R$ with increasing $\ell_R$? After all we have stressed above that $\ell_R$ is arbitrary! Now, at a critical point even the renormalized theory is singular. In Fourier representation these ‘infrared’ singularities arise from integration over small $k$, and in contrast to the u.v. singularities of the naive continuum limit they have a good physical reason: the physical quantities themselves show power law singularities at the critical point. Since the critical point corresponds to a divergent correlation length $\xi$, these singularities in the dimensionless correlation functions $G_R^{(M)}$ occur for $\xi/\ell_R \to \infty$. Choosing $\ell_R \sim \xi$ we eliminate these singularities from $G_R^{(M)}$. They are then taken care of by the RG flow, and on approaching the critical point, $\ell_R \sim \xi \to \infty$, the coupling $u_R$ of the critical theory is driven into the i.r. stable fixed point $u_R^*$.

5. Qualitative features of crossover functions

Most work on critical phenomena concentrates on the fixed point behavior: $u_R \equiv u_R^*$. Going away from a fixed point, typically only the immediate neighbourhood of $u_R^*$ is considered. The renormalized correlation functions are expanded in powers of $u_R - u_R^*$, and only the linear term of this expansion, which yields the ‘corrections to scaling’, is discussed. Again this has a good reason: the models, like our hamiltonian (2.18), involve only those terms important for the critical behavior. In a more realistic modelling of the physical system we should include many more terms, like higher powers of the field or higher powers of the gradient. It can be argued that all such ‘irrelevant perturbations’ die out for $\ell_R \to \infty$, and in three dimensions their effect can be estimated to be of the same order as corrections to fixed point behavior resulting from terms of order $(u_R - u_R^*)^2$. Thus in general it does not make sense to go beyond the linear terms in $u_R - u_R^*$.

Still, the RG provides us with all the flow of the coupling from the Gaussian to the nontrivial fixed point. Are there systems for which this information is useful? For such systems the starting value $u_R^{(0)}$ of the renormalized coupling must be very close to the Gaussian fixed point, so that $\ell_R$ can be chosen large enough to suppress all irrelevant corrections, whereas $u_R(\ell_R)$ still is not close to the nontrivial fixed point $u_R^*$. By definition the starting value $u_R^{(0)} = u_R(\ell_R^{(0)})$ gives the value of the renormalized coupling on scale $\ell_R^{(0)}$. We take $\ell_R^{(0)}$ to be of the order of the microscopic cutoff $\ell_0$ of the physical system. Determined on that scale, $u_R^{(0)}$ should not feel the critical i.r. singularities but can be considered as an analytical function of parameters of the full physical model. We thus are
confronted with the question whether by changing physical parameters not explicitly included into our simple hamiltonian (2.18), we can continuously shift $u_R^{(0)}$ towards zero.

In the realm of phase transitions this is indeed possible, if we take the system to be complicated enough. We, for instance, may consider fluid mixtures consisting of more than two components. In such systems we may have a normal demixing transition with its critical point described by the hamiltonian (2.18). As a function of the composition this critical point will trace out a curve in the parameter space, and along this curve $u_R^{(0)}$ varies continuously. We may reach a ‘tricritical’ point, where $u_R^{(0)} = 0$. Beyond this point new phases show up, and a detailed phenomenology of tricritical phase diagrams may be found in [4].

Another interesting example where $u_R^{(0)}$ can be varied is $^4$He at the superfluid phase transition, where the pressure seems to have a significant effect. This has been studied in particular by Dohm and co-workers (see [5,6]). However, a particularly simple example is provided by the physics of polymer solutions. This system becomes critical for chain length $N_0 \to \infty$ and chain concentration $c_p \to 0$. (More precisely, we have to take the segment concentration $N_0 c_p$ towards zero.) Temperature essentially governs $u_R^{(0)}$, and simply changing $T$ often allows us to go from the tricritical point $u_R^{(0)} = 0$ to the nontrivial fixed point. Thus for polymer solutions we automatically are concerned with this crossover. In the sequel we will study this example.

One complication must be noted. Since at a tricritical point $u_R$ vanishes, we in principle have to include the next interaction term ($\varphi^6$) in our hamiltonian. This considerably complicates the theory (see [7] for a review) but introduces only small (logarithmic) corrections in three dimensions. We therefore will ignore this complication, treating the tricritical point as a trivial Gaussian fixed point.
We now translate the general RG results into the language of polymer physics. The simplest model of a polymer chain in solution is the ‘spring and bead’ model. Harmonic springs of mean extension $\sim \ell_0$ couple beads, $j = 0, 1, \dots, N_0$, that repel each other. The hamiltonian reads
\[
H = \sum_{j=1}^{N_0} \frac{(r_j - r_{j-1})^2}{4\ell_0^2}
+ \frac{u_0}{3}\,\ell_0^4 \sum_{j<j'} \delta^d(r_j - r_{j'}) ,
\tag{5.1}
\]
where $r_j$ denotes the position of the $j$th bead in $d$-dimensional space. The single chain partition function takes the form
\[
Z(N_0) = \int_\Omega \prod_{j=0}^{N_0} d^d r_j \; e^{-H} ,
\tag{5.2}
\]
where $\Omega$ denotes the volume of the system. It turns out that the Laplace transform of $Z(N_0)$, defined as
\[
\tilde Z(m_0^2) = \sum_{N_0=1}^{\infty} e^{-\ell_0^2 N_0 m_0^2}\, Z(N_0) ,
\tag{5.3}
\]
is closely related to the two-point function $G^{(2)}(r_1, r_2;\, m_0^2, h_0 = 0, u_0, \ell_0)$ of the field theoretic model (2.18), integrated over $r_1, r_2$. The equivalence can be established order for order in the perturbation expansion in powers of $u_0$, but it again involves a somewhat strange limit. We have to generalize the field theory to an $n$-component vector field $(\varphi_1(r),\dots,\varphi_n(r))$, and we then have to set $n = 0$. This causes no problems in perturbation theory, where all terms are found to be polynomials in $n$. Since all the theory is based on the perturbation expansion, this is all we need. The equivalence can be
extended to the correlation functions. $G^{(2)}$, for instance, up to normalization gives the distribution of the end-to-end vector of the chain. Furthermore, it is found that solutions of finite chain concentration $c_p$ correspond to the field theory for nonvanishing magnetic field. $h_0^2 > 0$ is to be interpreted as the chain fugacity. To summarize, we have the following relation between magnetic and polymer variables:

lattice spacing $\ell_0$ $\leftrightarrow$ average spring extension $\ell_0$
$\varphi^4$-coupling $u_0$ $\leftrightarrow$ excluded volume coupling $u_0$
mass $m_0^2$ $\leftrightarrow$ Laplace conjugate to $\ell_0^2 N_0$
magnetic field $h_0$ $\leftrightarrow$ chain fugacity $h_0^2$.

This embeds the problem of polymer solutions into the general scheme outlined above. (A complete discussion of the RG theory of polymer solutions may be found in [8], or in a forthcoming book of the present author.)

We can now translate all the RG results into polymer language. The renormalization of the coupling is not changed, and the renormalization of the chain length follows from its relation to $m_0^2$. We thus find from Eq. (3.3)
\[
u_0 = Z_u u_R \ell_R^{-\varepsilon} , \qquad
N_0 \ell_0^2 = Z_N N_R \ell_R^2 ,
\tag{5.4}
\]
where $N_R$ is the renormalized chain length and
\[ Z_N = Z_t^{-1} . \tag{5.5} \]
Since experimentally we control concentrations rather than fugacities, the chain fugacity $h_0^2$ is not of much interest. The relevant variable is the chain concentration $c_p$. Since neither the number of chains in the system nor the volume $\Omega$ is changed by renormalization, $c_p$ is not renormalized. We, however, define a dimensionless variable
\[ c_{pR} = \ell_R^d c_p . \tag{5.6} \]
We can now rewrite the results (3.4) and (3.5) for the correlation functions. For simplicity we give only the result for the normalized end-to-end vector distribution of a chain:
\[
P(r_0 - r_{N_0}, N_0, c_p, u_0, \ell_0)
= \ell_R^{-d}\, P_R\!\left(\frac{(r_0 - r_{N_0})^2}{\ell_R^2},\, N_R,\, c_{pR},\, u_R\right) .
\tag{5.7}
\]
All explicit prefactors found in Eq. (3.4) cancel by virtue of the normalization, but we extracted the dimension of the renormalized distribution function as $\ell_R^{-d}$. Eq. (5.7) generalizes the simple scaling result (1.4) to a solution of self-repelling chains.

Having introduced the polymer system, we come back to our discussion of the crossover. For polymer solutions the temperature $T$ is an uncritical variable that influences the coupling $u_0$. For many chemical systems a special temperature $T = \Theta$ is found, where the chains to good approximation are described by the simple random walk model discussed in the introduction. For instance, independent of concentration the mean squared end-to-end vector behaves as
\[ R_e^2(N_0, \Theta) = \ell^2(\Theta)\, N_0 , \tag{5.8} \]
where $\ell(\Theta)$ is some microscopic length. We therefore identify the $\Theta$-temperature with the noninteracting Gaussian fixed point $u_0 = 0$. (As mentioned above we ignore subtle corrections, which are due to three body interactions. See [9] and references given there.) For $T > \Theta$ we however find for a long isolated chain
\[ R_e^2(N_0, T) = B^2(T)\, N_0^{2\nu} , \qquad N_0 \to \infty , \tag{5.9} \]
where $\nu \approx 0.588$ is the universal correlation length exponent and $B(T)$ is a system-dependent coefficient. This corresponds to the nontrivial fixed point. Increasing $T$ from $T = \Theta$ we smoothly cross over between the two fixed points.

To establish the qualitative features of the crossover we consider the general RG flow more closely. We identify the starting value $\ell_R^{(0)}$ with $\ell_0$ and we introduce the dimensionless scaling parameter $\lambda = \ell_0/\ell_R$. The critical limit corresponds to $\lambda \to 0$. The flow equation (4.4) for $u_R = u_R(\lambda)$ takes the form
\[ \lambda \frac{du_R}{d\lambda} = W(u_R) , \tag{5.10} \]
which is easily integrated:
\[
\int_1^{\lambda} \frac{d\lambda'}{\lambda'} = \int_{u_R^{(0)}}^{u_R} \frac{du_R'}{W(u_R')} .
\tag{5.11}
\]
In the region of interest $W(u_R)$ shows two zeros $u_R = 0, u_R^*$, and $1/W(u_R')$ therefore can be written as
\[
\frac{1}{W(u_R')} = \frac{-1}{\varepsilon u_R'} + \frac{1}{\omega (u_R' - u_R^*)} + p_\omega(u_R') .
\tag{5.12}
\]
Here the first term follows from the general structure (4.12) of $W(u_R)$, and
\[
\omega = \left.\frac{dW(u_R)}{du_R}\right|_{u_R^*}
\tag{5.13}
\]
is known as the ‘correction to scaling exponent’. $p_\omega(u_R)$ is assumed to be some regular function. With Eq. (5.12) the integrals in Eq. (5.11) are easily evaluated to yield
\[
\lambda = \left(\frac{u_R^{(0)}}{u_R}\right)^{1/\varepsilon}
\left(\frac{u_R^* - u_R}{u_R^* - u_R^{(0)}}\right)^{1/\omega}
\exp\!\big(P_\omega(u_R) - P_\omega(u_R^{(0)})\big) ,
\tag{5.14}
\]
where $P_\omega$ is the integral of $p_\omega$. This is an implicit equation for $u_R = u_R(\lambda)$.

Turning to the calculation of $Z_N$ we note that Eqs. (4.3) and (5.5) yield
\[
\lambda \frac{d}{d\lambda} \ln Z_N = 2 - \frac{1}{\nu(u_R)} .
\tag{5.15}
\]
With Eq. (5.10) this transforms to
\[
\frac{d \ln Z_N}{du_R} = \frac{1}{W(u_R)} \left(2 - \frac{1}{\nu(u_R)}\right) .
\tag{5.16}
\]
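Because Eq. (5.14) is implicit in $u_R$, it has to be inverted numerically in practice. The following is a minimal sketch on the weak coupling side, under loudly stated assumptions: the regular part is dropped ($P_\omega \equiv 0$), the fixed point is normalized to $u_R^* = 1$, and $\varepsilon = 1$, $\omega = 0.80$ are used as illustrative $d = 3$ values. The function names are hypothetical.

```python
import math

EPS = 1.0      # epsilon = 4 - d for d = 3
OMEGA = 0.80   # correction-to-scaling exponent (illustrative d = 3 value)
U_STAR = 1.0   # ASSUMPTION: fixed point normalized to 1

def lam_of_u(u, u0):
    """Eq. (5.14) with the regular part P_omega set to zero (assumption)."""
    return (u0 / u) ** (1.0 / EPS) * ((U_STAR - u) / (U_STAR - u0)) ** (1.0 / OMEGA)

def u_of_lam(lam, u0, tol=1e-14):
    """Invert Eq. (5.14) for 0 < u0 < u*, 0 < lam <= 1, by bisection.

    On this side lam decreases monotonically from 1 (at u = u0)
    to 0 (at u = u*), so the root is unique.
    """
    lo, hi = u0, U_STAR
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if lam_of_u(mid, u0) > lam:
            lo = mid      # lambda still too large: u must move closer to u*
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for lam in (1.0, 1e-2, 1e-4, 1e-6):
        u = u_of_lam(lam, 0.2)
        # (u* - u) / lam**omega should level off as lam -> 0
        print(lam, u, (U_STAR - u) / lam ** OMEGA)
```

In this simplified setting the printed ratio $(u_R^* - u_R)/\lambda^{\omega}$ approaches a constant for $\lambda \to 0$, which is the approach to the fixed point governed by the exponent $\omega$ of Eq. (5.13).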
Fig. 2. Flow in the $(u_R, 1/N_R)$-plane. Flow lines traced out for decreasing $\lambda$ (i.e. increasing $\ell_R$) are shown.
We now note the fixed point values of $\nu(u_R)$: $\nu(0) = 1/2$; $\nu(u_R^*) = \nu$. This suggests to write
\[
2 - \frac{1}{\nu(u_R)} = \frac{u_R}{u_R^*}\left(2 - \frac{1}{\nu}\right) + (u_R^* - u_R)\, p_N(u_R) .
\tag{5.17}
\]
Using Eqs. (5.13) and (5.17) we can easily integrate Eq. (5.16). Substituting the result into Eq. (5.4) for $N_R$ we find
\[
N_R = Z_N^{-1}(u_R^{(0)})\, \lambda^2
\left(\frac{u_R^* - u_R}{u_R^* - u_R^{(0)}}\right)^{\frac{1}{\omega}\left(\frac{1}{\nu} - 2\right)}
\exp\!\big(P_N(u_R) - P_N(u_R^{(0)})\big)\, N_0 .
\tag{5.18}
\]
Here $P_N$ is again some regular function.

Eqs. (5.14) and (5.18) give the general global form of the RG mapping. Let us first consider the region $u_R \approx u_R^*$. Eqs. (5.14) and (5.18) yield
\[
u_R - u_R^* \approx \mathrm{const.}\; (u_R^{(0)} - u_R^*)\, \lambda^{\omega} , \qquad
\frac{1}{N_R} \approx \mathrm{const.}\; \frac{1}{N_0}\, \lambda^{-1/\nu} .
\tag{5.19}
\]
This shows that the integrated flow in the $(u_R, 1/N_R)$ plane is singular at the line $u_R = u_R^*$, which separates two regions. Starting in the weak coupling regime $u_R^{(0)} < u_R^*$ we stay with $u_R(\lambda) < u_R^*$. Starting in the strong coupling regime $u_R^{(0)} > u_R^*$, we find $u_R(\lambda) > u_R^*$, always. The global RG flow is sketched in Fig. 2. The distinction between $u_R^{(0)} < u_R^*$ and $u_R^{(0)} > u_R^*$ will induce two branches of crossover behavior.

I should note that the validity of the strong coupling branch $u_R > u_R^*$ has been questioned. To be sure, there are physical systems which can be described within the present theory only if we allow for $u_R > u_R^*$. The most prominent example is the three-dimensional Ising model, where the sign of the corrections to scaling is incompatible with the weak coupling branch $u_R < u_R^*$. (See [10].) More precisely, we for instance may calculate the susceptibility to find
\[
\chi = a\, t_0^{-\gamma}\left(1 + b\,(u_R^{(0)} - u_R^*)\, t_0^{\nu\omega} + \dots\right) .
\]
The sign of the nonuniversal constant $b$ can be extracted from the theory, and the data are compatible with this expression only if we take $u_R^{(0)} > u_R^*$. The same phenomenon occurs for self-repelling polymer chains modelled as strictly self-avoiding walks on cubic lattices. (See [11], and references given there.) It has, however, been argued that our perturbative renormalization methods cannot reach the region $u_R > u_R^*$, since the flow function $W(u_R)$ itself may be singular at $u_R^*$. (See [12] and references given there.) Then of course our method would break down. A careful discussion shows [11,13,14] that the arguments for the existence of a singularity do not apply to the special renormalization schemes (‘minimal subtraction’) used in our work. However, we of course cannot prove the analyticity of $W(u_R)$. This is a task beyond perturbation theory. We thus take a pragmatic attitude, checking the usefulness of our approach in comparison to experiments.

In evaluating the theory, $\ell_R$ must be chosen to be of the order of the correlation length, which in the present problem may be identified with the end-to-end distance $R_e$ of the chain. To leading order $R_e^2 = \ell_R^2 N_R$, and we thus for the present general discussion fix $\ell_R$ by the condition
\[ N_R = 1 . \tag{5.20} \]
The RG mapping (5.14), (5.18) involves two adjustable parameters. The first one is the starting value $u_R^{(0)}$ of the coupling constant flow, the second is the unknown factor $Z_N(u_R^{(0)})$, resulting from integrating the flow of $Z_N$. To bring out the structure more clearly we write Eqs. (5.18) and (5.20) as
\[
\ell_R^2 = \ell_0^2\, Z_N^{-1}(u_R^{(0)})
\left(\frac{u_R^* - u_R}{u_R^* - u_R^{(0)}}\right)^{\frac{1}{\omega}\left(\frac{1}{\nu} - 2\right)}
\exp\!\big(P_N(u_R) - P_N(u_R^{(0)})\big)\, N_0
= \ell^2(T)\, N_0\, |u_R^* - u_R|^{\frac{1}{\omega}\left(\frac{1}{\nu} - 2\right)} \exp P_N(u_R) ,
\tag{5.21}
\]
where the nonuniversal parameter $\ell(T)$ combines all terms depending on $u_R^{(0)}$. Combining Eqs. (5.14) and (5.18) we find a second equation:
\[
u_R\, |u_R^* - u_R|^{-\varepsilon/(2\omega\nu)}
\exp\!\left(-\frac{\varepsilon}{2} P_N(u_R) - \varepsilon P_\omega(u_R)\right)
= v(T)\, N_0^{\varepsilon/2} .
\tag{5.22}
\]
Here $v(T)$ again absorbs $u_R^{(0)}$-dependent terms. Eqs. (5.21) and (5.22) yield the renormalized parameters $\ell_R$, $u_R$ in terms of the physical parameters $\ell^2(T) N_0$, $v(T) N_0^{\varepsilon/2}$. They in general allow for two solutions, $u_R < u_R^*$ (weak coupling branch) and $u_R > u_R^*$ (strong coupling branch). The parameters $\ell(T)$, $v(T)$ must be determined by fitting to experiment. Only on the weak coupling branch for $u_R^{(0)} \to 0$, i.e. $T \to \Theta$, can we use the analyticity of $u_R^{(0)}$ as function of $T$ to find
\[
\ell(T) = \mathrm{const.}\; \ell_0\, \big(1 + O(T - \Theta)\big) , \qquad
\tilde v(T) = \mathrm{const.}\; (T - \Theta)\big(1 + O(T - \Theta)\big) .
\]
We now consider the simplest physical quantity, which is the end-to-end distance of an isolated chain. The general renormalized expression reads
\[ R_e^2 = \ell_R^2\, \mathcal{R}(N_R, u_R) . \tag{5.23} \]
Using $N_R = 1$ and the general form (5.21) and (5.22) of the mapping to physical variables we find
\[ R_e^2 = \ell^2(T)\, N_0\, \alpha_{e,k}^2\big(v(T) N_0^{\varepsilon/2}\big) . \tag{5.24} \]
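The two-branch structure of the mapping can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions: the regular functions are dropped ($P_\omega = P_N = 0$), the fixed point is normalized to $u_R^* = 1$, and the $d = 3$ values $\nu = 0.588$, $\omega = 0.80$ are used. Eq. (5.22) is solved for $u_R$ on either branch at given $z = v(T) N_0^{\varepsilon/2}$, and Eq. (5.21) then gives the ratio $\ell_R^2/(\ell^2(T) N_0)$. Function names are hypothetical.

```python
import math

EPS, NU, OMEGA = 1.0, 0.588, 0.80   # d = 3 values quoted in the text
U_STAR = 1.0                        # ASSUMPTION: fixed point normalized to 1
KAPPA = EPS / (2.0 * OMEGA * NU)    # exponent of |u* - u| in Eq. (5.22)

def z_of_u(u):
    """Left-hand side of Eq. (5.22) with P_omega = P_N = 0 (assumption)."""
    return u * abs(U_STAR - u) ** (-KAPPA)

def solve_u(z, strong=False):
    """Solve z_of_u(u) = z on the weak (u < u*) or strong (u > u*) branch."""
    if strong:
        lo, hi = U_STAR, 2.0 * U_STAR
        while z_of_u(hi) > z:        # z_of_u decreases monotonically for u > u*
            hi *= 2.0
    else:
        lo, hi = 0.0, U_STAR         # z_of_u increases monotonically for u < u*
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:   # interval exhausted at float precision
            break
        if (z_of_u(mid) < z) != strong:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def scale_ratio(u):
    """l_R^2 / (l^2(T) N_0) from Eq. (5.21) with P_N = 0 (assumption)."""
    return abs(U_STAR - u) ** ((1.0 / NU - 2.0) / OMEGA)

def log_slope(z1, z2, strong=False):
    """Effective exponent d ln(scale_ratio) / d ln z between z1 and z2."""
    r1 = scale_ratio(solve_u(z1, strong))
    r2 = scale_ratio(solve_u(z2, strong))
    return math.log(r2 / r1) / math.log(z2 / z1)

if __name__ == "__main__":
    print(solve_u(3.0), solve_u(3.0, strong=True))   # two distinct solutions
    print(log_slope(1e6, 1e8), 2.0 * (2.0 * NU - 1.0))
```

For the same $z$ the two branches give different couplings, one below and one above $u_R^*$. For large $z$ both branches approach the same power law, with $\ell_R^2/(\ell^2(T)N_0) \sim z^{2(2\nu - 1)}$ in this simplified setting.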
Fig. 3. $\alpha_{e,k}$ (Eq. (5.24)) as a function of $z = v(T) N_0^{1/2}$, as determined in a simulation [15]. The dotted straight line represents the asymptotic power law $\alpha_e \sim z^{2\nu - 1}$. The curves represent dense sets of data points. Crosses are simulation results using a different algorithm.
Here the index $k$ distinguishes the two branches. This result, incorporating the general RG structure with no explicit perturbative calculations involved, shows that all data for $R_e^2$ can be brought onto two universal branches by adjusting two $N_0$-independent parameters. $R_e^2$ can be determined only in simulations, and we recently carried through [15] a large scale computer experiment to test this structure. In a three dimensional system we varied $u_0$ from zero to some large value and measured chains up to $N_0 = 16\,000$. We expect to see data collapse onto two universal branches for $N_0 \gtrsim 100$. Fig. 3 shows our results for the ‘swelling factor’ $\alpha_{e,k}(z)$ in a doubly logarithmic plot. The two-branched structure is clearly seen. Note that by changing the nonuniversal parameters $\ell^2(u_0)$, $v(u_0)$ we can only shift the curves, and we cannot bring the two branches to coincide.

This two-branched structure, which has not found much attention up to now, should be present for all observable quantities. To give another example I show in Fig. 4 the effective exponent $\gamma_{\mathrm{eff}}(v N_0^{1/2})$, derived from the single chain partition function $Z(N_0)$:
\[
\gamma_{\mathrm{eff}} = N_0 \frac{\partial}{\partial N_0} \ln\!\left[Z(N_0)\, e^{-\mu_s^* N_0}\right] + 1 .
\tag{5.25}
\]
Here $\mu_s^*$ is the (nonuniversal) chemical potential per segment of an infinitely long chain. Again the two-branched structure is clearly seen. Note that in analyzing this quantity we have to use the parameter $v(u_0)$ as determined from $R_e^2$. No freedom is left.

A further example, with data from physical solution experiments, is shown in Fig. 5. It gives the interpenetration ratio
\[
\psi = (4\pi)^{-3/2}\, \frac{A_2}{R_g^3} ,
\tag{5.26}
\]
where $A_2$ is the second virial coefficient of the osmotic pressure and $R_g$ is the radius of gyration of an isolated coil, a quantity which is essentially proportional to $R_e$ and which can be measured in
Fig. 4. $\gamma_{\mathrm{eff}}$ (Eq. (5.25)) as a function of $z = v(T) N_0^{1/2}$. The smooth curve gives the calculated scaling function. Data from [15].

Fig. 5. The interpenetration ratio $\psi$ (Eq. (5.26)) as a function of the swelling $\alpha_g^2$. Data for a variety of chemical systems are included. Small points are from a simulation of self-avoiding lattice chains [16].
scattering experiments. In the excluded volume limit, $\psi$ reaches a universal fixed point value $\psi^* \approx 0.245$. Fig. 5 shows $\psi$ as function of the swelling factor $\alpha_g^2 = R_g^2(T)/R_g^2(T = \Theta)$ and again clearly exhibits the two branches. We note that this plot eliminates $z = \tilde v(T) N_0^{1/2}$ in favour of $\alpha_g^2$ and therefore involves the single nonuniversal parameter $\ell(T)/\ell(\Theta)$, which furthermore is close to 1 for $T$ close to $\Theta$. In Figs. 4 and 5 I also included results from a calculation based on renormalized perturbation theory. (See the next section.) Clearly data and theory match very well. We also calculated $\alpha_e^2$, and the corresponding curves are omitted in Fig. 3 only because the theory falls right on top of the data. This strengthens our belief in the strong coupling branch as calculated from renormalized field theory.
6. Quantitative calculation of crossover functions

Up to now we studied the qualitative behavior of a critical system. We now consider quantitative calculations of physical observables.
We first note that a presumably very accurate form of the RG mapping is known for all $d$, $2 \le d \le 4$. The fixed point $u_R^*$ and the exponents $\nu$, $\omega$, etc. have been calculated by resummation of renormalized perturbation theory, pushed to high orders. (See [2,17] for a review.) For the polymer system in $d = 3$ values $\nu = 0.588$, $\omega = 0.80$ are found. Extending this analysis, Schloms and Dohm [18] calculated the full functions $W(u)$, $\nu(u)$, etc., which then yields excellent approximations for the functions $P_\omega(u_R)$, $P_N(u_R)$ occurring in the integrated form of the RG mapping. Indeed, these functions are found to not vary much in the region of interest. Thus the first step of our program is carried through: we have an excellent representation of the group.

We still have to calculate the crossover scaling functions, which is the second step, independent of the group structure. I exemplify the problems showing up with a calculation of the osmotic pressure $\Pi$ of a polymer solution. The general theory yields the form
\[
\frac{\Pi}{k_B T} = c_p \left[1 + P_n(N_R, c_{pR}, u_R)\right] .
\tag{6.1}
\]
The first term gives the osmotic pressure of an ideal solution and $P_n(\dots)$ is due to the interaction among the chains.

We first establish the scaling law at the fixed point. With $N_R = 1$, Eq. (5.19), valid for $u_R \to u_R^*$, yields $\lambda \sim N_0^{-\nu}$. Thus
\[
c_{pR} = \ell_R^d c_p = \ell_0^d \lambda^{-d} c_p = B^d c_p N_0^{\nu d} ,
\tag{6.2}
\]
and Eq. (6.1) reduces to the scaling law
\[
\frac{\Pi}{k_B T} = c_p \left[1 + P_n^*\big(B^d N_0^{\nu d} c_p\big)\right] .
\tag{6.3}
\]
$B$ is a nonuniversal constant, due only to the initial conditions in the RG mapping. Thus this same constant occurs in all scaling functions evaluated at the fixed point, and there it is the only microstructure dependent parameter in the theory. The two constants $\ell(T)$, $v(T)$ of the crossover $u_R \ne u_R^*$ combine into the single parameter $B$. Up to a universal number $a_s$, $B$ can be identified with $B(T)$ introduced in Eq. (5.9) for $R_e$, so that the scaling variable takes the form
\[
s = B^d N_0^{\nu d} c_p = a_s^d R_e^d c_p .
\tag{6.4}
\]
Thus the ‘overlap’ $s$ has a direct physical interpretation.

Expanding Eq. (6.3) in powers of $s$ we recover the virial expansion, valid in the dilute limit $s \ll 1$, where the chains essentially form isolated coils. In the critical region $N_0 \gg 1$, $N_0 \ell_0^d c_p \ll 1$ we can, however, also take the ‘semidilute’ limit $s \to \infty$. Then the coils interpenetrate strongly and the osmotic pressure depends only on the segment concentration
\[ c_s = N_0 c_p , \tag{6.5} \]
but not on $N_0$ or $c_p$ separately. In some sense the chains are so strongly entangled that they lose their individuality. For Eq. (6.3) to reduce to a function of $c_s$ only, the scaling function must behave as
\[
P_n^*(s) \sim s^{1/(\nu d - 1)} , \qquad s \to \infty ,
\tag{6.6}
\]
resulting in $\Pi/k_B T \sim c_s^{\nu d/(\nu d - 1)}$.
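The step from Eq. (6.6) to the concentration power law is pure exponent arithmetic, which the following short sketch spells out for the $d = 3$ value $\nu = 0.588$ quoted above. It assumes, for illustration only, that all nonuniversal prefactors (such as $B$) are set to 1; the function name is hypothetical.

```python
import math

NU, D = 0.588, 3   # exponent quoted in the text for polymers in d = 3

# Exponent of the scaling function in Eq. (6.6): P_n*(s) ~ s**X_SCALING
X_SCALING = 1.0 / (NU * D - 1.0)
# Resulting exponent of the osmotic pressure: Pi / k_B T ~ c_s**X_PRESSURE
X_PRESSURE = NU * D / (NU * D - 1.0)

def semidilute_pressure(c_p, n_chain):
    """Leading semidilute form c_p * s**X_SCALING with s = N0**(nu d) * c_p.

    ASSUMPTION: all nonuniversal prefactors (B, etc.) are set to 1.
    """
    s = n_chain ** (NU * D) * c_p
    return c_p * s ** X_SCALING

if __name__ == "__main__":
    print(X_SCALING, X_PRESSURE)
    # Same segment concentration c_s = N0 * c_p, different chain lengths:
    print(semidilute_pressure(1e-3, 1000.0), semidilute_pressure(5e-4, 2000.0))
```

The two calls use the same segment concentration $c_s = N_0 c_p$ but different chain lengths and return the same pressure, illustrating that in this limit only $c_s$ enters.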
We now turn to the calculation of $P_n$, which has to be done perturbatively within the renormalized theory. To first order we find³
\[
P_n = \frac{1}{2} W_R \left\{ 1 + \frac{2}{\varepsilon}\, u_R \left[ 1 -
\frac{2\,\Gamma\!\left(1 + \frac{\varepsilon}{2}\right) N_R^{\varepsilon/2}}{(4 - \varepsilon)(1 - \varepsilon/2)\, W_R}
\left( (1 + 2W_R)^{1 - \varepsilon/2} \left(1 - \frac{\varepsilon}{2} - \frac{1}{2W_R}\right) + \frac{1}{2W_R} \right) \right] \right\} ,
\tag{6.7}
\]
where
\[ W_R = u_R c_{pR} N_R^2 . \tag{6.8} \]
³ The result given is for an exponential distribution of the chain lengths in the solution. The influence of the chain length distribution can be handled without problem and in this review is ignored throughout.

This result is derived not as a simple expansion in powers of $u_R$, but in the so-called ‘loop’ expansion. It can be described as an expansion in powers of $u_R$, but with $W_R$ taken fixed, formally of order 1. This is important to reach the semidilute limit, since $W_R$ is found to correspond to the overlap and becomes large for $s \gg 1$. We furthermore note that the square bracket for $\varepsilon \to 0$ is of order $\varepsilon$, so the prefactor $1/\varepsilon$ is cancelled and the expression is finite for $\varepsilon = 0$, i.e. $d = 4$, as it should be.

Now a generally employed way of calculating scaling functions is the so-called $\varepsilon$-expansion. We write
\[ u_R = u_R^* f , \tag{6.9} \]
and we observe from Eq. (4.15) that $u_R^*$ is of order $\varepsilon$. To be consistent, in an expression linear in $u_R$ we thus should keep only terms linear in $\varepsilon$, i.e. $\varepsilon$ is our expansion parameter. In my conventions $u_R^* = \frac{\varepsilon}{4} + O(\varepsilon^2)$, and substituting $u_R = \frac{\varepsilon}{4} f$ into Eq. (6.7) we find to first order in $\varepsilon$
\[
P_n = \frac{1}{2} W_R \left[ 1 + \frac{\varepsilon}{4} f \left( \ln(1 + 2W_R)\left(1 - \frac{1}{4W_R^2}\right) + \frac{1}{2W_R} - \ln N_R + \gamma_{\mathrm{Eu}} - \frac{1}{2} \right) \right] + O(\varepsilon^2) ,
\tag{6.10}
\]
where $\gamma_{\mathrm{Eu}}$ is Euler's constant. We now restrict ourselves to the fixed point ($f = 1$) and tentatively choose $N_R = 1$. Eqs. (6.8) and (6.2) yield
\[ W_R = u_R^* c_{pR} = \mathrm{const.}\; s . \]
For $f = 1$, $s \gg 1$, we therefore find
\[
P_n \sim s \left[ 1 + \frac{\varepsilon}{4} (\ln s + \mathrm{const.}) \right] ,
\tag{6.11}
\]
which is not the expected result Eq. (6.6). Indeed, it is the $\varepsilon$-expansion of that result. Calculated perturbatively, $\nu$ takes the form $\nu = \frac{1}{2} + \frac{\varepsilon}{16} + O(\varepsilon^2)$, and $1/(\nu d - 1) = 1 + \frac{\varepsilon}{4} + O(\varepsilon^2)$. Thus
\[
s^{1/(\nu d - 1)} = s^{1 + \varepsilon/4 + O(\varepsilon^2)} = s \left[ 1 + \frac{\varepsilon}{4} \ln s + O(\varepsilon^2) \right] ,
\]
which explains Eq. (6.11). Thus Eq. (6.10) is correct in the sense of the $\varepsilon$-expansion, but certainly is not a good approximation for $\varepsilon = 1$.

One solution to this problem is to reexponentiate the logarithms so as to produce the expected power laws. This is not a good idea, however, since in general there are many ways to construct a smooth function interpolating among the correct limits ($s \to 0$ and $s \to \infty$, in our example) in a way consistent with a low order $\varepsilon$-expansion. Thus the results are quite arbitrary, a fact usually masked by making some ‘natural’ choice from the outset. This ambiguity could be cured only by pushing the expansion to high orders, a program forbiddingly difficult for scaling functions.

It is better to carefully look for the origin of the problem. In fact, the $\varepsilon$-expansion together with the choice $N_R = O(1)$ does not respect the strict separation into group structure and scaling functions. The exponents follow from the representation of the group alone, and here we have mixed their expansion with the calculation of the scaling functions. We thus have not fully exploited the structure of the problem. Another way to see this is to note that our expansions of the scaling functions, a priori supposed to be finite and well behaved, involve logarithmically singular terms. To cure this, we must change our choice of $\ell_R$.

We want to use the RG to scale the renormalized theory into an uncritical region, where the perturbation expansion works. To find this region we must discuss the relevant length scales. Of course, one scale is the radius $R_e$ of the coils, and choosing $N_R = 1$ we effectively use $\ell_R \approx R_e$. In strongly overlapping solutions, however, there is the ‘screening length’ $\xi_E$ as a second scale. It gives the distance beyond which the correlations within a self-repelling chain are destroyed by the interaction with the other chains. The physical mechanism of screening [19] is easily understood. An isolated chain swells beyond the entropy dominated random walk size in order to decrease its interaction energy. Also in a strongly overlapping system pairs of segments spaced a distance $r \lesssim \xi_E$ are predominantly found on the same chain, and therefore swelling is favourable on such scales. Pairs of segments spaced by $r \gg \xi_E$, however, are typically found on different chains, and swelling on such scales does not reduce the interaction energy. Entropy wins, and on scale $r \gg \xi_E$ the chain configuration stays random walk like. Thus $\xi_E$ gives the scale relevant for interaction effects in semidilute systems. To lowest order approximation $\xi_E$ in the semidilute limit is found as
\[
\xi_E^2 = \frac{\ell_R^2}{u_R c_{pR} N_R} \ll R_e^2 = \ell_R^2 N_R ,
\tag{6.12}
\]
and in semidilute systems we should choose $\ell_R \approx \xi_E$ as the relevant scale. This implies
\[ u_R c_{pR} N_R = O(1) . \]
Interpolating between $\ell_R \approx R_e$ and $\ell_R \approx \xi_E$ in the two limits, we thus implicitly fix $\ell_R$ by the choice
\[
1 = \frac{n_0}{N_R} + \frac{1}{c_0}\, u_R c_{pR} N_R ,
\tag{6.13}
\]
where $n_0, c_0$ are two constants discussed below.

With this choice, for most quantities of interest in polymer solutions even the lowest order calculation of the scaling functions gives qualitatively correct results. This clearly is an essential property of a good expansion, and I illustrate it with our expression for $P_n$. It reads to lowest order
\[
P_n = \frac{1}{2} W_R = \frac{1}{2}\, u_R c_{pR} N_R^2 .
\tag{6.14}
\]
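How the choice (6.13) interpolates between the dilute and the semidilute regime can be illustrated numerically. The sketch below is a simplified illustration, not the author's calculation. It assumes fixed point scaling $N_R = \lambda^{1/\nu} N_0$, $c_{pR} = \lambda^{-d} c_p$ with all proportionality constants set to 1, and it sets $u_R^* = n_0 = c_0 = 1$. Under these assumptions Eq. (6.13) is solved for $\lambda$ by bisection and inserted into Eq. (6.14). The function names are hypothetical.

```python
import math

NU, D = 0.588, 3.0
# ASSUMPTIONS: fixed point value and the O(1) constants of Eq. (6.13) set to 1.
U_STAR, N0_CONST, C0_CONST = 1.0, 1.0, 1.0

def constraint(log_lam, c_p, n_chain):
    """Right-hand side of Eq. (6.13) at the fixed point, as a function of ln(lambda)."""
    lam = math.exp(log_lam)
    n_r = lam ** (1.0 / NU) * n_chain   # N_R  ~ lambda**(1/nu) * N_0
    c_pr = lam ** (-D) * c_p            # c_pR ~ lambda**(-d)   * c_p
    return N0_CONST / n_r + U_STAR * c_pr * n_r / C0_CONST

def p_n_lowest_order(s, n_chain=1.0e4):
    """Eq. (6.14), with lambda fixed by Eq. (6.13); s = c_p * N_0**(nu d)."""
    c_p = s / n_chain ** (NU * D)
    lo, hi = -100.0, 100.0              # constraint > 1 at lo, < 1 at hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if constraint(mid, c_p, n_chain) > 1.0:
            lo = mid                    # both terms decrease with lambda
        else:
            hi = mid
    lam = math.exp(0.5 * (lo + hi))
    n_r = lam ** (1.0 / NU) * n_chain
    c_pr = lam ** (-D) * c_p
    return 0.5 * U_STAR * c_pr * n_r ** 2

def slope(s1, s2):
    """Effective exponent d ln P_n / d ln s between two overlaps."""
    return math.log(p_n_lowest_order(s2) / p_n_lowest_order(s1)) / math.log(s2 / s1)

if __name__ == "__main__":
    print(slope(1e-6, 1e-5))                        # dilute regime
    print(slope(1e6, 1e7), 1.0 / (NU * D - 1.0))    # semidilute regime
```

In the dilute regime the effective exponent is close to 1 (the leading virial term), while for large overlap it approaches $1/(\nu d - 1)$, as required by Eq. (6.6). The smooth interpolation in between depends on the assumed constants.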
L. Schäfer / Physics Reports 301 (1998) 205-234
Fig. 6. Structure of the uncritical region. The broken line connecting points A and D represents the excluded volume limit $f = u_R/u_R^* = 1$. The long-dashed lines correspond to experiments where $c_p$ is varied for $N_0$, $T$ fixed. The curve in the strong coupling region $f > 1$ runs away to some unknown limit.
We again restrict ourselves to the fixed point $u_R^*$, where $N_R \sim \lambda^{1/\nu} N_0$, $c_{pR} \sim \lambda^{-d} c_p$ (cf. Eq. (6.2)). For $c_p \to 0$ we find from Eq. (6.13) $N_R = n_0$ and thus $\lambda \sim N_0^{-\nu}$, $c_{pR} \sim c_p N_0^{\nu d}$, leading to

$$P_n \sim \frac{u_R^*}{2}\, c_p N_0^{\nu d} . \tag{6.15}$$

This is the correct form in the dilute limit. The semidilute limit corresponds to $N_R \to \infty$, thus $u_R^* c_{pR} N_R = c_0$, or $\lambda^{1/\nu - d} \sim 1/(c_p N_0)$. $P_n$ takes the form

$$P_n = \frac{c_0}{2} N_R \sim N_0 \lambda^{1/\nu} \sim N_0 (c_p N_0)^{1/(\nu d - 1)} \sim s^{1/(\nu d - 1)} . \tag{6.16}$$

The full result smoothly interpolates among the limits (6.15), (6.16), and the first order correction only changes the quantitative but not the qualitative result.

The RG together with the choice Eq. (6.13) maps our system into an uncritical region of the renormalized theory, parametrized by $f = u_R/u_R^*$ and $n_0/N_R$. Fig. 6 shows the structure of that region. The dilute excluded volume limit ($N_0 \to \infty$, $N_0 c_p \to 0$) corresponds to point A: $f = 1$, $n_0/N_R = 1$. Point B ($f = 0$, $n_0/N_R = 1$) is the dilute $\Theta$-limit. Increasing the overlap we reach the semidilute limit $n_0/N_R = 0$, where point D represents the excluded volume limit $f = 1$ and C is the semidilute $\Theta$-limit. Consider now an experiment, where we increase the concentration in a solution of very long chains. In the dilute limit we may start in the excluded volume region close to
point A. For $f < 1$ (weak coupling) we then first may cross over towards the semidilute excluded volume limit (D), until with increasing overlap the screening becomes so strong that we are driven towards $f = 0$ (point C). Starting with $f > 1$ (strong coupling), we again first are driven towards point D, but ultimately we leave the region where our perturbative treatment can be trusted. Other experiments (changing $T$, or changing $N_0$ with fixed $c_s = N_0 c_p$) trace out similar crossover curves.

The semidilute limit $n_0/N_R \to 0$ deserves some special attention. Clearly $N_R \to \infty$, and since the scaling functions depend on $N_R$ we might expect to find new singularities in that limit. However, the screening of the interaction suppresses these singularities. Therefore renormalized perturbation theory covers also this limit. Subleading terms $\sim N_R^{\epsilon/2 - 1}$ remain, which are due to the details of the screening mechanism. They have nothing to do with the critical singularities, but are the analogue of Goldstone singularities found in the ordered phase of a Heisenberg ferromagnet. They thus have a well defined meaning, their structure being rigorously given by powers of $N_R^{\epsilon/2 - 1}$. If we now used the $\epsilon$-expansion, extrapolated to $\epsilon = 1$, we would destroy this structure. It thus is most revealing that once we have mapped the system into the above described uncritical region we have a finite well behaved expansion in $u_R$, with no need to carry through the $\epsilon$-expansion. We thus can evaluate results like Eq. (6.7) for $P_n$ directly for $d = 3$ ($\epsilon = 1$), keeping the screening structure in the semidilute region intact.

We finally have to consider the parameters $n_0$, $c_0$ introduced in Eq. (6.13). They reflect the fact that our qualitative consideration of length scales does not fix $\ell_R$ precisely, but only up to constants of order 1. If we could evaluate the theory to all orders it would be strictly scale invariant, and the choice of $\ell_R$ and thus of $n_0$, $c_0$ would not matter. Our low-order approximation, however, breaks strict scale invariance, and the choice of $n_0$ and $c_0$ is important. It influences the region where the theory crosses over from the Gaussian behavior to nontrivial fixed point behavior ($n_0$) or from the dilute to the semidilute limit ($c_0$). In some sense the parameters $n_0$, $c_0$ are a rationalized form of the freedom one has in exponentiating strict $\epsilon$-expansions.

Since exact quantities are independent of $n_0$, $c_0$, we should look for a region where our approximations are insensitive to these parameters. I illustrate this with our choice of $n_0$, which is to be determined in the dilute limit $c_p \to 0$. An appropriate quantity is the interpenetration ratio $\psi = (4\pi)^{-3/2} A_2 / R_g^d$ (see Eq. (5.26)). Evaluated at $u_R^*$, $\psi = \psi^*$ is a universal number. Calculating $A_2$, $R_g$ in three dimensions to first order and inserting the fixed point value $u_R^*$ known from the RG mapping, we find

$$\psi^* = 0.182\, n_0^{1/2}\, \frac{1.728 - 0.521\, n_0^{1/2}}{\left(0.636 + 0.232\, n_0^{1/2}\right)^{3/2}} . \tag{6.17}$$
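As a rough numerical illustration of Eq. (6.17), the following sketch tabulates $\psi^*(n_0)$ and locates the two values of $n_0$ at which it equals the quoted experimental value 0.245. The function names, the bisection helper and the search brackets are assumptions made for this example only; they are not part of the original calculation.

```python
import math


def psi_star(n0: float) -> float:
    """Fixed-point interpenetration ratio of Eq. (6.17) as a function of n0."""
    x = math.sqrt(n0)
    return 0.182 * x * (1.728 - 0.521 * x) / (0.636 + 0.232 * x) ** 1.5


def bisect(func, lo: float, hi: float, tol: float = 1e-12) -> float:
    """Plain bisection; assumes func(lo) and func(hi) have opposite signs."""
    f_lo = func(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


PSI_EXPERIMENT = 0.245  # experimental value quoted in the text

# The curve crosses the experimental value twice, once on each side of its
# maximum; the two brackets below are assumptions chosen by inspection.
n0_small = bisect(lambda n0: psi_star(n0) - PSI_EXPERIMENT, 0.1, 1.0)
n0_large = bisect(lambda n0: psi_star(n0) - PSI_EXPERIMENT, 2.0, 4.0)

if __name__ == "__main__":
    for n0 in (0.25, 0.5, 1.0, 2.0, 4.0):
        print(f"n0 = {n0:4.2f}   psi* = {psi_star(n0):.4f}")
    print("psi* = 0.245 at n0 =", round(n0_small, 3), "and", round(n0_large, 3))
```

The flat maximum between the two crossings is what makes the result insensitive to the precise value of $n_0$ in that range.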
This function is plotted in Fig. 7. It indeed is insensitive to $n_0$ for $n_0 \approx 1$. Furthermore, it comes close to the experimentally observed value $\psi^* = 0.245$. We thus fix $n_0$ to reproduce this value, where it turns out not to be important which of the two possibilities we choose. $c_0$ can be determined in a similar way.

Fig. 7. Fixed point value of the interpenetration ratio Eq. (5.26) as a function of the theoretical parameter $n_0$. The broken line gives the experimental value.

We are now in a position to summarize our approach to crossover scaling functions. We, first of all, use the RG mapping to scale the renormalized theory into an uncritical region defined by the choice (6.13) of $\ell_R$. This fully exploits the group structure, and an excellent form of the RG mapping is available. We then calculate the scaling functions to first order of the renormalized loop expansion directly in three dimensions in the uncritical region, with no further manipulations. As we stressed, the determination of the group structure is a task strictly separated from the calculation of scaling functions. Thus there is no inconsistency in performing these two steps to different levels of approximation. All crossover functions will show two branches, due to the structure of the RG flow.

I should note that similar approaches have been around for a long time [7,20]. In my view the recent achievements are (i) the appearance of a most precise RG mapping, (ii) the observation of the two-branched structure, and (iii) the observation that we may use perturbation theory in $d = 3$, which incorporates also the subleading singularities due to the screening effect. The method should work for all crossover problems where one limit is governed by a nontrivial fixed point, whereas the other limiting situations can be treated perturbatively.

We have stressed that any approach towards crossover scaling functions necessarily involves theoretical parameters like $n_0$, $c_0$. Furthermore, in comparing to experiments, results at the fixed point $u_R^*$ involve a single nonuniversal parameter, depending on chemistry and temperature, but not on $N_0$ or $c_p$. Going away from the fixed point we encounter a second nonuniversal parameter of this type, and we have to take care of the two-branched structure of the scaling functions. With such freedom, for a meaningful test of the theory it is not sufficient to compare just a single scaling function to experimental results. We rather should compare as many observables as possible to an internally consistent set of calculations. Over the years we have carried through this program, and I finally show a small subset of our results. I first recall that the curves shown in Figs. 4 and 5 are calculated with our theory and that the theoretical results for the swelling factor $\alpha_e$ would lie right on top of the data points in Fig. 3.
This demonstrates that the theory can deal with dilute systems. For the crossover towards the semidilute limit I show in Figs. 8 and 9 results of scattering experiments for the system polystyrene-toluene. This system is close to $u_R^*$, but on the strong coupling branch. The two nonuniversal parameters have been determined from independent measurements ($R_g$, $A_2$) in the dilute limit. Thus these plots involve no further parameter. Scattering experiments measure the density correlation function $I_d(q)$, and Fig. 8 shows some quantity derived from $I_d(0)$ by extracting the trivial small overlap
Fig. 8. $J(s) = (c_p N_0^2 / I_d(0) - 1)\, s^{-1}$ as a function of the overlap $s$ for polystyrene in toluene. Chain lengths as indicated. Data: J. Amirzadeh, M.E. McDonnell, Macromol. 15 (1982) 927, and private communication. Full curves: crossover functions on the strong coupling branch. Broken curves: prediction for $u_R \equiv u_R^*$. For clarity the figure has been split into two, and curves and data shifted upwards by an $N_0$-dependent amount.

Fig. 9. $\xi_d^2 / R_g^2$ (points; fat curve) and $\xi_E^2 / R_g^2$ (open symbols, thin line) as functions of $s$ for polystyrene-toluene. Literature data from several groups are combined.
Fig. 10. Light scattering data for polystyrene-transdecaline (B. Chu, T. Nose, Macromol. 12 (1979) 590; 13 (1980) 122), plotted as a function of $s_\Theta = c_p R_{g\Theta}^3$; $R_{g\Theta} = R_g(T = \Theta)$. Temperatures are indicated. (a) $c_p N_0^2 / I_d(0)$ for $N_0 = 125\,000$ (upper three curves) and $N_0 = 1800$ (lower two curves). (b) $\xi_d^2 / R_{g\Theta}^2$ for $N_0 = 125\,000$.

behavior. Fig. 9 shows two relevant correlation lengths, normalized to the isolated chain radius $R_g^2$. The density correlation length $\xi_d$ is defined as

$$\xi_d^2 = 3 \left. \frac{\partial}{\partial q^2} \ln I_d^{-1}(q) \right|_{q = 0} . \tag{6.18}$$
It in principle differs from the screening length $\xi_E$. The latter has to be determined with radiation able to resolve a length scale $\xi_E \sim q^{-1}$. Thus

$$\xi_E^2 \approx 3 \left. \frac{\partial}{\partial q^2} \ln I_d^{-1}(q) \right|_{q \sim \xi_E^{-1}} . \tag{6.19}$$

A subleading Goldstone-mode type singularity of $I_d(q)$, found in the semidilute limit, enhances $\xi_E^2$ as compared to $\xi_d^2$. This is illustrated in Fig. 9, where the points result from light scattering, measuring $\xi_d^2$, whereas the open symbols give results from X-ray or neutron scattering, which are restricted to larger $q$-values and determine $\xi_E$.

Results for the system polystyrene-transdecaline, which is deep in the crossover region from the Gaussian to the nontrivial fixed point, are shown in Fig. 10. The experiments are close to the $\Theta$-temperature ($\Theta = 20.4\,^{\circ}$C), and thus the nonuniversal parameters to good approximation obey $\ell(T) = \ell(\Theta)$, $v(T) = v_0 (T - \Theta)/\Theta$. They again have been determined from independent measurements in dilute systems, so that Fig. 10 involves no parameter fitting.

In summary, these examples show that renormalization group theory, pushed beyond the determination of exponents, can nicely reproduce observed crossover scaling functions.
References

[1] D.J. Amit, Field Theory, the Renormalization Group, and Critical Phenomena, World Scientific, Singapore, 1984.
[2] J. Zinn-Justin, Quantum Field Theory and Critical Phenomena, Clarendon Press, Oxford, 1989.
[3] S. Coleman, R. Jackiw, Ann. Phys. 67 (1971) 552.
[4] C.M. Knobler, R.L. Scott, Multicritical Points in Fluid Mixtures, in: Phase Transitions and Critical Phenomena, vol. 9, Domb and Lebowitz (Eds.), Academic Press, London, 1984, p. 164.
[5] V. Dohm, J. Low Temp. Phys. 69 (1987) 51.
[6] V. Dohm, Phys. Scripta T 49 (1993) 46.
[7] I.D. Lawrie, S. Sarbach, Theory of tricritical points, in: Phase Transitions and Critical Phenomena, vol. 9, Domb, Lebowitz (Eds.), Academic Press, London, 1984, p. 2.
[8] J. des Cloizeaux, G. Jannink, Polymers in Solution, Clarendon Press, Oxford, 1990.
[9] B. Duplantier, Phys. Rev. A 38 (1988) 3647.
[10] A.J. Liu, M.E. Fisher, J. Stat. Phys. 58 (1990) 431.
[11] L. Schäfer, Phys. Rev. E 50 (1994) 3517.
[12] A.D. Sokal, Europhys. Lett. 27 (1994) 661.
[13] C. Bagnuls, C. Bervillier, Phys. Lett. A 195 (1994) 163.
[14] C. Bagnuls, C. Bervillier, Phys. Rev. B 41 (1990) 402.
[15] P. Grassberger, P. Sutter, L. Schäfer, J. Phys. A, in press.
[16] B. Li, N. Madras, A.D. Sokal, J. Stat. Phys. 80 (1995) 661.
[17] J. Zinn-Justin, Phys. Rep. 70 (1981) 109.
[18] R. Schloms, V. Dohm, Nucl. Phys. B 328 (1989) 639.
[19] S.F. Edwards, Proc. Phys. Soc. 85 (1965) 613.
[20] L. Schäfer, Macromolecules 17 (1984) 1357.
Physics Reports 301 (1998) 235—270
Exact results for two-dimensional Coulomb systems P.J. Forrester* Department of Mathematics, University of Melbourne, Parkville, Victoria 3052, Australia
Abstract

Four review topics concerning exact results for two-dimensional Coulomb systems are covered. These are the two-dimensional one-component plasma at a special value of the coupling, the two-dimensional Coulomb gas at the same special value of the coupling, exact results in the form of sum rules and asymptotic formulas deduced from physical principles, and a solvable model exhibiting a Kosterlitz-Thouless-like pairing transition. © 1998 Elsevier Science B.V. All rights reserved.

PACS: 02.90.+r

1. The 2dOCP at $\Gamma = 2$

The 2dOCP stands for the two-dimensional one-component plasma. We begin the section by reviewing the classification of Coulomb systems which leads to this terminology, along with the calculation of the corresponding Boltzmann factor. Then, for the special value of the coupling $\Gamma = 2$, analogies are noted between the Boltzmann factor of the 2dOCP in a disk and the eigenvalue probability density function for complex random matrices, as well as with the absolute value squared of the ground-state wave function for free fermions in a plane subject to a perpendicular magnetic field. The latter analogy with a non-interacting quantum system indicates that the 2dOCP is exactly solvable at $\Gamma = 2$. The final sections of the chapter review the exact solution for the free energy and particle distributions, and present an asymptotic analysis which is of interest in Section 3 when universal features of Coulomb systems are discussed.

1.1. Classification of Coulomb systems

This is done via the number of components, which is the number of different mobile species. Thus for the one-component plasma there is only one mobile species, the particles with positive charge
* Tel.: 61 3 344 9683; fax: 61 3 344 4599.
0370-1573/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S0370-1573(98)00012-X
P.J. Forrester / Physics Reports 301 (1998) 235—270
say. The system is overall charge neutral, with the negative charge forming a smeared-out background. The term two-component plasma may refer to a system with a background as above, except now there are two distinct mobile species, both having the same sign charge. The relative concentration of the mobile species is arbitrary. Alternatively, there may be no background. Then for charge neutrality, the two species must have opposite signed charges. In the symmetric case both species have the same magnitude charge, so the system consists of equal numbers of particles with charge $+q$ and $-q$, say.

Also, two-dimensional Coulomb systems may be distinguished by the domain to which they are confined. Possibilities include the disk, the surface of a cylinder (this is equivalent to imposing semi-periodic boundary conditions) and the surface of a sphere. Finally, there may be image forces due to, e.g., a perfectly conducting wall.

1.2. The Boltzmann factor

To study the equilibrium statistical mechanics of Coulomb systems, the first task is to compute the Boltzmann factor $e^{-\beta U}$. The potential energy $U$ is calculated according to the laws of two-dimensional electrostatics. The particles can be thought of as infinitely long parallel charged lines, which are perpendicular to the confining volume. The electrostatic potential $\Phi$ at a point $\mathbf r = (x, y)$ due to a unit charge at $\mathbf r' = (x', y')$ is given by the solution of the Poisson equation

$$\nabla^2_{\mathbf r}\, \Phi(\mathbf r, \mathbf r') = -2\pi \delta(\mathbf r - \mathbf r'), \qquad \nabla^2_{\mathbf r} := \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} . \tag{1.1}$$

In a plane, the solution subject to the boundary condition $\nabla \Phi(\mathbf r, \mathbf r') \to 0$ as $|\mathbf r - \mathbf r'| \to \infty$ is

$$\Phi(\mathbf r, \mathbf r') = -\log(|\mathbf r - \mathbf r'|/l) , \tag{1.2}$$
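where $l$ is an arbitrary length scale which we will henceforth set equal to unity. To verify this statement, we first recall that for any analytic function $f$ of a complex variable $z$, $\mathrm{Re}\, f(z)$ satisfies the Laplace equation $\nabla^2\, \mathrm{Re}\, f = 0$. Now $\log((z - z')/l)$ is analytic for $z \neq z'$, and with $z = x + iy$, $\mathrm{Re} \log((z - z')/l) = \log(|z - z'|/l) = \log(|\mathbf r - \mathbf r'|/l)$, so Poisson's equation is satisfied for $\mathbf r \neq \mathbf r'$. It remains to check the delta function property:

$$\int_D d\mathbf r\, \nabla^2_{\mathbf r} \Phi(\mathbf r, \mathbf r') = \oint_C dr\; \mathbf n \cdot \nabla_{\mathbf r} \Phi(\mathbf r, \mathbf r') = -\oint_C dr\; \mathbf n \cdot \left( \frac{x - x'}{|\mathbf r - \mathbf r'|^2}, \frac{y - y'}{|\mathbf r - \mathbf r'|^2} \right) = -2\pi , \tag{1.3}$$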
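The flux statement (1.3) is easy to check numerically. The sketch below evaluates the line integral of $\mathbf n \cdot \nabla \Phi$ over a circle with the midpoint rule. As an assumption made for illustration, the circle need not be centred on the source; by the divergence theorem the result should be $-2\pi$ whenever the source lies inside and $0$ when it lies outside. The function name and parameters are hypothetical.

```python
import math


def flux_of_grad_phi(source, centre, radius, n_points=2000):
    """Line integral of n . grad(Phi) over a circle, with Phi = -log|r - source|.

    grad(Phi) = -(r - source) / |r - source|^2, and n is the outward unit
    normal of the circle. The midpoint rule in the polar angle is used.
    """
    sx, sy = source
    cx, cy = centre
    total = 0.0
    for i in range(n_points):
        angle = 2.0 * math.pi * (i + 0.5) / n_points
        nx, ny = math.cos(angle), math.sin(angle)        # outward normal
        x, y = cx + radius * nx, cy + radius * ny        # point on the circle
        dx, dy = x - sx, y - sy
        total += -(nx * dx + ny * dy) / (dx * dx + dy * dy)
    return total * (2.0 * math.pi * radius / n_points)   # arc-length weight


if __name__ == "__main__":
    print(flux_of_grad_phi((0.3, -0.2), (0.0, 0.0), 1.0))  # source inside
    print(flux_of_grad_phi((3.0, 0.0), (0.0, 0.0), 1.0))   # source outside
```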
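where $D$ denotes a disk centred at $\mathbf r'$ and $C$ its boundary, $\mathbf n$ the corresponding unit normal, and the first equality follows from the divergence theorem. Note also that by writing

$$\Phi(\mathbf r, \mathbf r') = \frac{1}{(2\pi)^2} \int d\mathbf k\; v(\mathbf k)\, e^{-i \mathbf k \cdot (\mathbf r - \mathbf r')}, \qquad \delta(\mathbf r - \mathbf r') = \frac{1}{(2\pi)^2} \int d\mathbf k\; e^{-i \mathbf k \cdot (\mathbf r - \mathbf r')}$$

in Eq. (1.1) and using Eq. (1.2) we have that

$$v(\mathbf k) = -\int d\mathbf r\, \log|\mathbf r|\, e^{-i \mathbf k \cdot \mathbf r} = \frac{2\pi}{|\mathbf k|^2} , \tag{1.4}$$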
where the integrand is to be understood as a generalized function.

It follows immediately from Eq. (1.2) that for the symmetric two-component plasma (to be referred to henceforth as the 2dCG, the two-dimensional Coulomb gas), with coordinates of the positive charges given by $\{u_j\}$ and the negative charges by $\{v_j\}$,

$$U = -q^2 \sum_{1 \le j < k \le N} \log|u_k - u_j| - q^2 \sum_{1 \le j < k \le N} \log|v_k - v_j| + q^2 \sum_{j,k=1}^{N} \log|u_j - v_k| . \tag{1.5}$$

The computation of $U$ for the two-dimensional one-component plasma (2dOCP) requires more calculation. Due to the neutralizing background charge density $-q \rho_b(\mathbf r)$, the total potential energy $U$ consists of the sum of the electrostatic energy of particle-particle interaction

$$U_1 := -q^2 \sum_{1 \le j < k \le N} \log|\mathbf r_k - \mathbf r_j| ,$$

the particle-background interaction

$$U_2 := \sum_{j=1}^{N} V(\mathbf r_j) \quad \text{where} \quad V(\mathbf r_j) := q^2 \int_\Omega d\mathbf r\, \rho_b(\mathbf r) \log|\mathbf r - \mathbf r_j| ,$$

and the background-background interaction

$$U_3 := -\frac{q^2}{2} \int_\Omega d\mathbf r'\, \rho_b(\mathbf r') \int_\Omega d\mathbf r\, \rho_b(\mathbf r) \log|\mathbf r' - \mathbf r| = -\frac{1}{2} \int_\Omega d\mathbf r'\, \rho_b(\mathbf r') V(\mathbf r') .$$

Hence the Boltzmann factor is given by

$$e^{-\beta U_3} \prod_{l=1}^{N} e^{-\beta V(\mathbf r_l)} \prod_{1 \le j < k \le N} |\mathbf r_k - \mathbf r_j|^{\Gamma}, \qquad \Gamma := q^2 / k_B T . \tag{1.6}$$

In particular, if the system is a disk of radius $R$ with a uniform neutralizing background of density $\rho_b(\mathbf r) = \rho$ ($\rho = N/\pi R^2$), then use of the integral

$$\int_0^1 \log|1 - k e^{2\pi i \theta}|\, d\theta = 0, \qquad |k| \le 1$$

shows

$$V(\mathbf r) = q^2 \pi \rho \left( \frac{r^2}{2} + R^2 \log R - \frac{R^2}{2} \right) \quad \text{and} \quad U_3 = -q^2 \pi \rho N \left( \frac{R^2}{2} \log R - \frac{R^2}{8} \right) .$$

Substituting in Eq. (1.6) we see that the Boltzmann factor is equal to

$$e^{-\Gamma N^2 ((1/2) \log R - 3/8)}\; e^{-\pi \Gamma \rho \sum_{j=1}^{N} |\mathbf r_j|^2 / 2} \prod_{1 \le j < k \le N} |\mathbf r_k - \mathbf r_j|^{\Gamma} . \tag{1.7}$$
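The closed form for the background potential of a uniformly charged disk can be checked by direct quadrature. The sketch below computes $\int_{|\mathbf r'| \le R} \log|\mathbf r - \mathbf r'|\, d\mathbf r'$, which should equal $\pi (r^2/2 + R^2 \log R - R^2/2)$ for $r \le R$. It assumes that the angular integral is done analytically with the identity quoted above, which gives $2\pi \log \max(r, r')$; only the radial integral is done numerically. The function names and step count are illustrative choices.

```python
import math


def disk_log_potential(r: float, big_r: float, n_steps: int = 20000) -> float:
    """Integral of log|r - r'| over a disk of radius big_r, for 0 <= r <= big_r.

    The angular integral equals 2*pi*log(max(r, r')); the remaining radial
    integral of r' * log(max(r, r')) is evaluated with the midpoint rule.
    """
    h = big_r / n_steps
    total = 0.0
    for i in range(n_steps):
        rp = (i + 0.5) * h
        total += rp * math.log(max(r, rp))
    return 2.0 * math.pi * total * h


def closed_form(r: float, big_r: float) -> float:
    """pi * (r^2/2 + R^2 log R - R^2/2), i.e. V(r) divided by q^2 * rho."""
    return math.pi * (r * r / 2.0 + big_r ** 2 * math.log(big_r) - big_r ** 2 / 2.0)


if __name__ == "__main__":
    for r in (0.0, 0.7, 2.0):
        print(r, disk_log_potential(r, 2.0), closed_form(r, 2.0))
```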
1.3. Random matrix analogy

For $\Gamma = 2$, the Boltzmann factor is proportional to the p.d.f. for the eigenvalues of certain complex random matrices [1]. The result is that for an $N \times N$ random matrix $X$ in which the elements $u_{jk} + i v_{jk}$ are independently distributed with p.d.f.

$$\frac{1}{\pi}\, e^{-|u_{jk}|^2 - |v_{jk}|^2} ,$$

the corresponding eigenvalue p.d.f. for the (complex) eigenvalues $\lambda_j = x_j + i y_j$ is proportional to Eq. (1.7) with $\mathbf r_j = (x_j, y_j)$, $\Gamma = 2$ and $\rho = 1/\pi$.

The analogy allows the density of eigenvalues to be predicted. Now, the Boltzmann factor results from the 2dOCP in a disk of radius $R$ with a uniform background of charge density $\rho = 1/\pi$. But in general it is expected that Coulomb systems will be locally charge neutral, so this implies that the eigenvalues will uniformly occupy a disk in the complex plane of radius $R = \sqrt{N}$. Indeed this agrees with numerical computation, as is seen in Fig. 1, where the eigenvalues of a particular complex random matrix $X$ of dimension $100 \times 100$ are plotted.

Fig. 1. Eigenvalues in the complex plane of a $100 \times 100$ random matrix with complex Gaussian elements.
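1.4. Quantum particle in a plane with a perpendicular magnetic field

Suppose the particle has mass $m$ and charge $-e$, and is subject to a perpendicular magnetic field $\mathbf B = B \hat{\mathbf z}$, $B > 0$. The Hamiltonian is then (see e.g. [2])

$$H := \frac{1}{2m} \left( -i\hbar \nabla + \frac{e}{c} \mathbf A \right)^2 = \frac{1}{2m} \boldsymbol{\Pi}^2 = \frac{1}{2} \hbar \omega_c \left( a^\dagger a + a a^\dagger \right) , \tag{1.8}$$

where

$$\Pi_x = -i\hbar \frac{\partial}{\partial x} + \frac{e}{c} A_x, \quad \Pi_y = -i\hbar \frac{\partial}{\partial y} + \frac{e}{c} A_y, \quad a^\dagger = \frac{l}{\sqrt{2}\,\hbar} (\Pi_x + i \Pi_y), \quad a = \frac{l}{\sqrt{2}\,\hbar} (\Pi_x - i \Pi_y) , \tag{1.9}$$

$\omega_c := eB/mc$ ($c$ denotes the speed of light) is called the cyclotron frequency, $l := \sqrt{\hbar c / eB}$ is called the Landau length, and the vector potential $\mathbf A$ must satisfy

$$\nabla \times \mathbf A = B \hat{\mathbf z} . \tag{1.10}$$

We can check the commutation relation $[a, a^\dagger] = 1$, which allows Eq. (1.8) to be rewritten as $H = \hbar \omega_c (a^\dagger a + \tfrac{1}{2})$. Thus there are eigenstates $|n, 0\rangle$ of $H$ with energy $E_n = (n + \tfrac{1}{2}) \hbar \omega_c$ ($n = 0, 1, 2, \ldots$), referred to as Landau levels, which are given in terms of the ground state $|0, 0\rangle$ by

$$|n, 0\rangle = \frac{(a^\dagger)^n}{\sqrt{n!}}\, |0, 0\rangle, \qquad a |0, 0\rangle = 0 . \tag{1.11}$$

These states do not form a complete set (they are the quantum analogue of the cyclotron orbit). We also have a degree of freedom corresponding to the centre of the cyclotron orbit, specified by the operator $X^2 + Y^2$, where

$$X = x - \frac{l^2}{\hbar} \Pi_y, \qquad Y = y + \frac{l^2}{\hbar} \Pi_x . \tag{1.12}$$

Defining

$$b^\dagger = \frac{1}{\sqrt{2}\, l} (X - iY), \qquad b = \frac{1}{\sqrt{2}\, l} (X + iY) \tag{1.13}$$

we see that $[b, b^\dagger] = 1$ and furthermore, $X^2 + Y^2 = l^2 (b b^\dagger + b^\dagger b) = 2 l^2 (b^\dagger b + \tfrac{1}{2})$. Thus, this operator also has a harmonic oscillator form, and so has eigenstates

$$|n, m\rangle = \frac{(b^\dagger)^m}{\sqrt{m!}}\, |n, 0\rangle, \qquad b |n, 0\rangle = 0 , \tag{1.14}$$

$m = 0, 1, 2, \ldots$ with eigenvalue $R_m^2 = (2m + 1) l^2$.

Now $a, a^\dagger$ commute with $b, b^\dagger$ and so simultaneous eigenstates of $H$ and $X^2 + Y^2$ are permitted. In particular, all states $|0, m\rangle$ have the minimal allowed energy $\tfrac{1}{2} \hbar \omega_c$, and so lie in the lowest Landau level. They are orthogonal and form a complete set for states in the lowest Landau level. Consider now the symmetric gauge

$$\mathbf A = \frac{B}{2} (-y \hat{\mathbf x} + x \hat{\mathbf y}) .$$

The requirements $a |0, 0\rangle = 0$, $b |0, 0\rangle = 0$ specify that

$$|0, 0\rangle = \frac{1}{\sqrt{2\pi l^2}}\, e^{-z \bar z / 4 l^2} = \frac{1}{\sqrt{2\pi l^2}}\, e^{-(x^2 + y^2)/4 l^2} .$$

Substituting this in Eq. (1.14) with $b^\dagger$ given by Eq. (1.13) gives

$$|0, m\rangle = \frac{\bar z^{\,m}\, e^{-(x^2 + y^2)/4 l^2}}{(2\pi l^2\, 2^m l^{2m}\, m!)^{1/2}}, \qquad z = x + iy . \tag{1.15}$$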
1.4.1. Many-particle state

The interpretation of the states (1.15) is that they have definite values of the distance from the origin to the centre of their cyclotron orbit, which increases with $m$. The most dense $N$-particle ground state $\psi$, in which the particles are fermions but otherwise non-interacting, is therefore obtained by constructing a Slater determinant from the states $|0, m\rangle =: \psi_m(\mathbf r)$,

$$\psi(\mathbf r_1, \ldots, \mathbf r_N) = \frac{1}{\sqrt{N!}} \det[\psi_{j-1}(\mathbf r_k)]_{j,k=1,\ldots,N}$$
$$= \frac{1}{\sqrt{N!}} \prod_{j=1}^{N} \frac{e^{-(x_j^2 + y_j^2)/4 l^2}}{(2\pi l^2\, 2^{j-1} l^{2(j-1)} (j-1)!)^{1/2}}\, \det[(\bar z_k)^{j-1}]_{j,k=1,\ldots,N}$$
$$= \frac{1}{\sqrt{N!}} \prod_{j=1}^{N} \frac{e^{-(x_j^2 + y_j^2)/4 l^2}}{(2\pi l^2\, 2^{j-1} l^{2(j-1)} (j-1)!)^{1/2}} \prod_{1 \le j < k \le N} (\bar z_k - \bar z_j) , \tag{1.16}$$

where the last line uses the Vandermonde determinant formula. Thus $|\psi|^2$ is indeed proportional to the Boltzmann factor for the one-component plasma at $\Gamma = 2$ in a disk with $\rho = 1/2\pi l^2$. Note that this analogy predicts the density will be uniform in a disk of radius $R = (2N)^{1/2} l$ with value $1/2\pi l^2$, and equal to zero outside this radius. This prediction is consistent with the fact that the maximum distance from the origin to the centre of the cyclotron orbit of the single particle states is $\sqrt{2N - 1}\, l$.

1.5. Calculation of the free energy

In the canonical formalism of statistical mechanics, the total dimensionless free energy $\beta F$ is given by

$$\beta F = -\log \frac{1}{N!} \hat Z_N, \qquad \hat Z_N := \int_\Omega d\mathbf r_1 \cdots \int_\Omega d\mathbf r_N\; e^{-\beta U(\mathbf r_1, \ldots, \mathbf r_N)} , \tag{1.17}$$

and the corresponding dimensionless free energy per particle $\beta f$ is, in the thermodynamic limit, given by

$$\beta f = \lim_{\substack{N, |\Omega| \to \infty \\ N/|\Omega| = \rho}} \frac{1}{N} \beta F . \tag{1.18}$$

Now, substituting Eq. (1.7) in Eq. (1.17) gives that for the 2dOCP in a disk at $\Gamma = 2$,

$$\beta F = -\log(Q_N / N!) + N^2 \log R - \frac{3}{4} N^2 , \tag{1.19}$$

where

$$Q_N := \int_\Omega d\mathbf r_1 \cdots \int_\Omega d\mathbf r_N\; e^{-\pi \rho \sum_{j=1}^{N} |\mathbf r_j|^2} \prod_{1 \le j < k \le N} |\mathbf r_k - \mathbf r_j|^2 ,$$

with $\Omega$ denoting a disk of radius $R$ centred at the origin. But we will show in the course of calculating the distribution function below that

$$Q_N = N!\, \pi^N (\pi \rho)^{-N(N+1)/2} \prod_{j=1}^{N} \gamma(j; N) , \tag{1.20}$$

where

$$\gamma(k; a) := \int_0^a t^{k-1} e^{-t}\, dt \tag{1.21}$$

denotes the incomplete gamma function. Hence [3]
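$$\beta F = -N \log \pi + \frac{N}{2} \log \pi \rho + \frac{N^2}{2} \log N - \frac{3}{4} N^2 - \log \prod_{j=1}^{N} \frac{\gamma(j; N)}{\Gamma(j)} - \log \prod_{j=1}^{N} \Gamma(j) . \tag{1.22}$$

Now

$$\log \prod_{j=1}^{N} \Gamma(j) \sim \frac{N^2}{2} \log N - \frac{3}{4} N^2 + \frac{N}{2} \log 2\pi - \frac{1}{12} \log N + O(1) \tag{1.23}$$

and, from a uniform asymptotic expansion of the incomplete gamma function (Eq. (1.24), whose displayed form is not legible in this copy),

$$\log \prod_{j=1}^{N} \frac{\gamma(j; N)}{\Gamma(j)} \sim \sqrt{2N} \int_0^\infty \log\left( \tfrac{1}{2} (1 + \mathrm{erf}(t)) \right) dt + O(1) , \tag{1.25}$$

where

$$\mathrm{erf}(t) := \frac{2}{\sqrt{\pi}} \int_0^t e^{-s^2}\, ds .$$

Substituting Eqs. (1.23) and (1.25) in Eq. (1.22) gives [5]

$$\beta F \sim \pi R^2 \beta f_v + 2\pi R \beta \gamma + \frac{1}{6} \log\left( (\pi \rho)^{1/2} R \right) + O(1) , \tag{1.26}$$

where

$$\beta f_v = \frac{\rho}{2} \log\left( \rho / 2\pi^2 \right), \qquad \beta \gamma = -\left( \frac{\rho}{2\pi} \right)^{1/2} \int_0^\infty dy\, \log\left( \tfrac{1}{2} (1 + \mathrm{erf}(y)) \right) \tag{1.27}$$

($f_v$ corresponds to the free energy per volume and $\gamma$ is the surface tension).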
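The integral appearing in the surface tension of Eq. (1.27) has no elementary closed form, but it is straightforward to evaluate numerically. The sketch below does so with the midpoint rule, using the identity $\tfrac{1}{2}(1 + \mathrm{erf}(y)) = 1 - \tfrac{1}{2}\mathrm{erfc}(y)$ to keep precision in the tail. The cut-off of the integration range, the step count and the function names are assumptions made for this illustration.

```python
import math


def erf_log_integral(upper: float = 8.0, n_steps: int = 80000) -> float:
    """Midpoint-rule estimate of the integral of log((1 + erf(y))/2) over [0, inf).

    The integrand decays like erfc(y)/2, so truncating at `upper` = 8 is an
    assumption that introduces a negligible error.
    """
    h = upper / n_steps
    total = 0.0
    for i in range(n_steps):
        y = (i + 0.5) * h
        # (1 + erf(y))/2 = 1 - erfc(y)/2; log1p avoids cancellation for large y
        total += math.log1p(-0.5 * math.erfc(y))
    return total * h


def beta_gamma(rho: float) -> float:
    """Dimensionless surface tension of Eq. (1.27) for background density rho."""
    return -math.sqrt(rho / (2.0 * math.pi)) * erf_log_integral()


if __name__ == "__main__":
    print("integral   =", erf_log_integral())
    print("beta*gamma =", beta_gamma(1.0), "for rho = 1")
```

Since the integrand is negative everywhere, the surface tension obtained this way is positive.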
We can suppose that in addition to the uniform background charge density $\rho_b$ there is a uniform surface charge density $-q\sigma$ [6], and insist on global charge neutrality so that $\pi R^2 \rho_b + 2\pi R \sigma = N$. A straightforward calculation shows that the Boltzmann factor is the same as Eq. (1.7) provided $\rho$ therein is replaced by $\rho_b$ as defined above. Appropriate modification of the above working then shows that at $\Gamma = 2$, $\beta F$ again has the asymptotic expansion (1.26) except that the surface tension is now

$$\beta \gamma = -\left( \frac{\rho_b}{2\pi} \right)^{1/2} \int_{-\sigma (2\pi/\rho_b)^{1/2}}^{\infty} \log\left( \tfrac{1}{2} (1 + \mathrm{erf}(t)) \right) dt .$$
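1.6. Distribution functions

The $n$-particle distribution is defined as

$$\rho_n(\mathbf r_1, \ldots, \mathbf r_n) = \frac{N(N-1) \cdots (N-n+1)}{\hat Z_N} \int_\Omega d\mathbf r_{n+1} \cdots \int_\Omega d\mathbf r_N\; e^{-\beta U(\mathbf r_1, \ldots, \mathbf r_N)} . \tag{1.28}$$

We remark that $\rho_n(\mathbf r_1, \ldots, \mathbf r_n) / \rho_{n-1}(\mathbf r_1, \ldots, \mathbf r_{n-1})$ can be interpreted as the density at point $\mathbf r_n$ given that there are particles at points $\mathbf r_1, \ldots, \mathbf r_{n-1}$. Defining the functional derivative $\delta / \delta a(x)$ by

$$\frac{\delta}{\delta a(x)} \int_\Omega a(x') f(x')\, dx' = f(x), \qquad x \in \Omega , \tag{1.29}$$

we see that the $n$-particle distribution (1.28) can be written

$$\rho_n(\mathbf r_1, \ldots, \mathbf r_n) = \frac{1}{\hat Z_N} \left. \frac{\delta^n}{\delta a(\mathbf r_1) \cdots \delta a(\mathbf r_n)} \int_\Omega d\mathbf r_1\, a(\mathbf r_1) \cdots \int_\Omega d\mathbf r_N\, a(\mathbf r_N)\; e^{-\beta U(\mathbf r_1, \ldots, \mathbf r_N)} \right|_{a = 1} . \tag{1.30}$$

This formula can be used to compute $\rho_n$ for the 2dOCP in a disk at $\Gamma = 2$. Now, introducing polar coordinates gives

$$e^{-\beta U(\mathbf r_1, \ldots, \mathbf r_N)}\, d\mathbf r_1 \cdots d\mathbf r_N = \prod_{j=1}^{N} f(r_j) \prod_{1 \le j < k \le N} \left| r_k e^{i\theta_k} - r_j e^{i\theta_j} \right|^2 \prod_{j=1}^{N} r_j\, dr_j\, d\theta_j \tag{1.31}$$

with

$$f(r) = C e^{-\pi \rho r^2} . \tag{1.32}$$

Rewriting the product over pairs as the product of two Vandermonde determinants, and expanding out the determinants gives

$$\int_\Omega d\mathbf r_1\, a(\mathbf r_1) \cdots \int_\Omega d\mathbf r_N\, a(\mathbf r_N)\; e^{-\beta U(\mathbf r_1, \ldots, \mathbf r_N)} = \sum_{P \in S_N} \epsilon(P) \sum_{Q \in S_N} \epsilon(Q) \prod_{j=1}^{N} \int_0^R dr\, r f(r)\, r^{P(j) + Q(j) - 2} \int_0^{2\pi} d\theta\, a(r, \theta)\, e^{i\theta (P(j) - Q(j))} ,$$

where $S_N$ denotes the set of all permutations and $\epsilon(P)$ denotes the sign of the permutation $P$. But in general

$$\sum_{P \in S_N} \epsilon(P) \sum_{Q \in S_N} \epsilon(Q) \prod_{j=1}^{N} a_{P(j), Q(j)} = N! \det[a_{j,k}]_{j,k=1,\ldots,N}$$

so we have

$$\int_\Omega d\mathbf r_1\, a(\mathbf r_1) \cdots \int_\Omega d\mathbf r_N\, a(\mathbf r_N)\; e^{-\beta U(\mathbf r_1, \ldots, \mathbf r_N)} = N! \det\left[ \int_0^R dr\, r f(r)\, r^{j+k-2} \int_0^{2\pi} d\theta\, a(r, \theta)\, e^{i\theta (j-k)} \right]_{j,k=1,\ldots,N} .$$

With $a = 1$ we see that the above matrix is diagonal, as $\int_0^{2\pi} d\theta\, e^{i\theta (j-k)} = 2\pi \delta_{j,k}$, which gives

$$\hat Z_N = N!\, (2\pi)^N \prod_{j=1}^{N} \int_0^R dr\, r f(r)\, r^{2j-2} \tag{1.33}$$

(thus establishing Eq. (1.20)). Furthermore, the functional differentiation can be done row-by-row in the determinant, with $n$ rows affected by the functional differentiation in each term. Setting $a = 1$, the remaining $N - n$ rows unaffected by the functional differentiation are non-zero only in their diagonal element. Expanding the determinant by these elements and substituting the resulting expression along with Eq. (1.33) in Eq. (1.30) we see that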
P
1 1 N o (r ,2,r )" det[ f (r )rjk`jc~2 e*hk(jk~jc)] + n 1 n k k k,c/1,2,n (2p)n
C
D
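$$\lim_{\substack{N \to \infty \\ \rho\ \mathrm{fixed}}} \rho_n(\mathbf r_1, \ldots, \mathbf r_n) = \rho^n \det\left[ e^{-\pi \rho (r_k^2 + r_c^2)/2}\, e^{\pi \rho z_k \bar z_c} \right]_{k,c=1,\ldots,n} .$$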
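For $n = 2$ the determinant in the thermodynamic limit can be evaluated by hand, giving $\rho^2 (1 - e^{-\pi \rho |\mathbf r_1 - \mathbf r_2|^2})$. The sketch below checks this numerically for arbitrary points, representing each point as a complex number $z = x + iy$. The function names and the sample points are assumptions made for the example.

```python
import cmath
import math


def bulk_kernel(z1: complex, z2: complex, rho: float) -> complex:
    """Entry rho * exp(-pi*rho*(|z1|^2 + |z2|^2)/2) * exp(pi*rho*z1*conj(z2))."""
    exponent = (-math.pi * rho * (abs(z1) ** 2 + abs(z2) ** 2) / 2.0
                + math.pi * rho * z1 * z2.conjugate())
    return rho * cmath.exp(exponent)


def rho2_from_determinant(z1: complex, z2: complex, rho: float) -> float:
    """Two-particle distribution as the 2 x 2 determinant of the bulk kernel."""
    det = (bulk_kernel(z1, z1, rho) * bulk_kernel(z2, z2, rho)
           - bulk_kernel(z1, z2, rho) * bulk_kernel(z2, z1, rho))
    return det.real  # the imaginary part vanishes up to rounding


def rho2_closed_form(z1: complex, z2: complex, rho: float) -> float:
    """rho^2 * (1 - exp(-pi*rho*|z1 - z2|^2))."""
    return rho ** 2 * (1.0 - math.exp(-math.pi * rho * abs(z1 - z2) ** 2))


if __name__ == "__main__":
    z_a, z_b = 0.3 + 0.4j, -0.2 + 0.1j
    print(rho2_from_determinant(z_a, z_b, 1.0), rho2_closed_form(z_a, z_b, 1.0))
```

Note that the result depends only on the separation $|\mathbf r_1 - \mathbf r_2|$, as expected for a translationally invariant bulk state.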
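In particular,

$$\rho_1(\mathbf r) = \rho, \qquad \rho_2(\mathbf r_1, \mathbf r_2) = \rho^2 \left( 1 - e^{-\pi \rho |\mathbf r_1 - \mathbf r_2|^2} \right) , \tag{1.35}$$

where it is understood $\rho_1$ and $\rho_2$ refer to the thermodynamic values.

To calculate the surface distribution function, it is convenient to choose as the origin $(r, \theta) = (R, \pi)$, and to specify points in Cartesian coordinates from this origin:

$$x_k \sim R - r_k, \qquad y_k \sim R(\pi - \theta_k) .$$

Note that in the limit $R \to \infty$ the background occupies the right-half plane $x > 0$. Now, replacing $r_k, r_c$ by $x_k, x_c$ and $\theta_k, \theta_c$ by $y_k, y_c$, replacing $j$ by $N + 1 - j$ in the summation and introducing the notation (1.21) gives

$$\frac{1}{2\pi\rho} \sum_{j=1}^{N} \frac{e^{-\pi \rho (r_k^2 + r_c^2)/2}\, (r_k r_c)^{j-1}\, e^{i(\theta_k - \theta_c)(j-1)}}{\int_0^R dr\, r e^{-\pi \rho r^2} r^{2(j-1)}} \sim e^{-\pi \rho ((R - x_k)^2 + (R - x_c)^2)/2} \sum_{j=0}^{N-1} \frac{(\pi \rho)^{N-j}\, e^{(N-j)(\log(R - x_k) + \log(R - x_c))}\, e^{i(y_k - y_c)(j-1)/R}}{\gamma(N - j + 1; N)} .$$

Next, we make use of the asymptotic expansion (1.24), and we further use Stirling's formula to approximate the gamma function therein by $\Gamma(N - j + 1) \sim (2\pi N)^{1/2} N^{N-j} e^{-N} e^{j^2/2N}$. An expression of the form $(1/\sqrt{N}) \sum_{j=0}^{N-1} f(j/\sqrt{N})$ results. Then noting that this sum tends to $\int_0^\infty f(t)\, dt$ in the limit $N \to \infty$ gives [8]

$$\lim_{\substack{N \to \infty \\ \rho\ \mathrm{fixed}}} \rho_n(\mathbf r_1, \ldots, \mathbf r_n) = \rho^n \det\left[ e^{-\pi \rho (x_k^2 + x_c^2)}\, h\big( x_k + x_c + i(y_k - y_c) \big) \right]_{k,c=1,\ldots,n} ,$$

where

$$h(z) := \left( \frac{1}{\pi} \right)^{1/2} \int_0^\infty \frac{e^{(2\pi\rho)^{1/2} z t}\, e^{-t^2}}{\tfrac{1}{2} (1 + \mathrm{erf}(t))}\, dt .$$

We remark that integration by parts shows that

$$\rho_2^T(\mathbf r_1, \mathbf r_2) := \rho_2(\mathbf r_1, \mathbf r_2) - \rho_1(\mathbf r_1) \rho_1(\mathbf r_2) \underset{|y_1 - y_2| \to \infty}{\sim} -\frac{2\rho\, e^{-2\pi \rho (x_1^2 + x_2^2)}}{\pi^2 (y_1 - y_2)^2} . \tag{1.36}$$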
2. The 2dCG at $\Gamma = 2$ in planar geometry

The 2dCG consists of an equal number of positive and negative charged particles. In this section the grand partition function for a general planar domain at the special coupling $\Gamma = 2$ is written as a product over the eigenvalues of the free particle Dirac equation with particular boundary conditions. For the disk domain this spectrum can be calculated exactly, and this allows closed form expressions for the grand partition function as well as the particle distributions to be given and analysed in asymptotic limits. Also, it is shown how the formalism can be modified to account for a perfect conductor boundary.

2.1. Lattice domain

The potential energy for the 2dCG is given by Eq. (1.5). We will suppose the positive and negative charges are confined to distinct sublattices within a planar domain $\Omega$, with lattice points specified by the complex numbers $\{W_j\}_{j=1,\ldots,M}$ and $\{W'_j\}_{j=1,\ldots,M}$, respectively. We assume that in the limit $M \to \infty$ the lattice points fill $\Omega$ uniformly, and that away from the boundaries the two sublattices have unit cells which form squares of area $d^2$. Then, for the generalized partition function, we have

$$Z_{N\Gamma}[u, v] = d^{4N} \sum_{w \in W(N)} \sum_{w' \in W'(N)} \prod_{j=1}^{N} u(w_j) v(w'_j) \left( \frac{\prod_{1 \le j < k \le N} |w_j - w_k|\, |w'_j - w'_k|}{\prod_{j,k=1}^{N} |w_j - w'_k|} \right)^{\Gamma} , \tag{2.1}$$
where $W(N)$ denotes a subset of $\{W_j\}_{j=1,\ldots,M}$ consisting of $N$ ($\le M$) elements, and similarly the meaning of $W'(N)$.

Now, for the 2dOCP at $\Gamma = 2$, we have seen that the Vandermonde determinant identity plays an essential role in the exact calculation. For the 2dCG at $\Gamma = 2$, exact calculations are again possible, with the role of the Vandermonde determinant being played by the Cauchy double alternant identity

$$\frac{\prod_{1 \le j < k \le N} (x_k - x_j)(y_k - y_j)}{\prod_{j,k=1}^{N} (x_j - y_k)} = (-1)^{N(N-1)/2} \det\left[ \frac{1}{x_j - y_k} \right]_{j,k=1,\ldots,N} . \tag{2.2}$$
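The Cauchy double alternant identity (2.2), including its sign factor, can be confirmed in exact rational arithmetic for small $N$. The sketch below is only a spot check, not a proof; the helper names and the sample points are assumptions made for this example, and the Leibniz-formula determinant is adequate only for the small matrices used here.

```python
from fractions import Fraction
from itertools import permutations


def determinant(matrix):
    """Leibniz-formula determinant in exact arithmetic (small matrices only)."""
    n = len(matrix)
    total = Fraction(0)
    for perm in permutations(range(n)):
        # Sign of the permutation from its number of inversions.
        inversions = sum(1 for a in range(n) for b in range(a + 1, n)
                         if perm[a] > perm[b])
        term = Fraction(-1) ** inversions
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total


def product_side(xs, ys):
    """Left-hand side of Eq. (2.2): ratio of difference products."""
    n = len(xs)
    numerator = Fraction(1)
    for j in range(n):
        for k in range(j + 1, n):
            numerator *= (xs[k] - xs[j]) * (ys[k] - ys[j])
    denominator = Fraction(1)
    for x in xs:
        for y in ys:
            denominator *= x - y
    return numerator / denominator


def determinant_side(xs, ys):
    """Right-hand side of Eq. (2.2): signed determinant of [1/(x_j - y_k)]."""
    n = len(xs)
    matrix = [[1 / (x - y) for y in ys] for x in xs]
    return Fraction(-1) ** (n * (n - 1) // 2) * determinant(matrix)


if __name__ == "__main__":
    xs = [Fraction(1), Fraction(3), Fraction(7), Fraction(-4)]
    ys = [Fraction(2), Fraction(5), Fraction(11), Fraction(1, 2)]
    for n in (1, 2, 3, 4):
        print(n, product_side(xs[:n], ys[:n]) == determinant_side(xs[:n], ys[:n]))
```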
This allows us to write

$$Z_{N2}[u, v] = d^{4N} \sum_{w \in W(N)} \sum_{w' \in W'(N)} \prod_{j=1}^{N} u(w_j) v(w'_j)\; \det \begin{bmatrix} 0_N & \left[ \dfrac{1}{w_j - w'_k} \right]_{j,k=1,\ldots,N} \\[2ex] \left[ \dfrac{1}{\bar w'_j - \bar w_k} \right]_{j,k=1,\ldots,N} & 0_N \end{bmatrix} . \tag{2.3}$$

The corresponding generalized grand partition function can be summed as a single determinant [9]

$$\Xi_2[\zeta; u, v] := \sum_{N=0}^{M} \zeta^{2N} Z_{N2} = \det\left( 1_{2M} + \zeta d^2 K_{2M} \right) , \tag{2.4}$$

where

$$K_{2M} = \begin{bmatrix} 0_M & \left[ \dfrac{u(W_j)}{W_j - W'_k} \right]_{j,k=1,\ldots,M} \\[2ex] \left[ \dfrac{v(W'_j)}{\bar W'_j - \bar W_k} \right]_{j,k=1,\ldots,M} & 0_M \end{bmatrix} .$$

This can be seen by expanding the determinant in Eq. (2.4) as a power series in $\zeta$.
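2.2. The continuum limit

In the continuum limit $M \to \infty$, $d \to 0$ we expect the matrix $1_{2M} + \zeta K_{2M}$ to have eigenvectors of the form

$$\begin{bmatrix} \psi_1(W_1 d) \\ \vdots \\ \psi_1(W_M d) \\ \psi_2(W_1 d) \\ \vdots \\ \psi_2(W_M d) \end{bmatrix} ,$$

where $\psi_1$ and $\psi_2$ are continuous functions (note that since $K_{2M}$ depends on both $W_j$ and $\bar W_j$, $\psi_1$ and $\psi_2$ will not be analytic functions). This follows since the matrix is just a discrete approximation to the integral operator $A$, defined by the mapping rule

$$A \begin{bmatrix} f_1(z') \\ f_2(z') \end{bmatrix} = \begin{bmatrix} g_1(z) \\ g_2(z) \end{bmatrix} ,$$

where

$$A \begin{bmatrix} f_1(z') \\ f_2(z') \end{bmatrix} := \int_\Omega \left( \begin{bmatrix} \delta(z - z') & 0 \\ 0 & \delta(z - z') \end{bmatrix} + \zeta \begin{bmatrix} u(z) & 0 \\ 0 & v(z) \end{bmatrix} \begin{bmatrix} 0 & \dfrac{1}{z - z'} \\ \dfrac{1}{\bar z - \bar z'} & 0 \end{bmatrix} \right) \begin{bmatrix} f_1(z') \\ f_2(z') \end{bmatrix} dx'\, dy'$$

with $z := x + iy$, $z' := x' + iy'$, $\delta(z - z') = \delta(x - x') \delta(y - y')$. Furthermore, this approximation should become exact in the limit $M \to \infty$, $d \to 0$. Thus, if we denote by $I_2$, $s$ and $K$ the matrix integral operators with kernels

$$\begin{bmatrix} \delta(z - z') & 0 \\ 0 & \delta(z - z') \end{bmatrix}, \qquad \zeta \begin{bmatrix} u(z) & 0 \\ 0 & v(z) \end{bmatrix}, \qquad \begin{bmatrix} 0 & \dfrac{1}{z - z'} \\ \dfrac{1}{\bar z - \bar z'} & 0 \end{bmatrix} ,$$

respectively, then in the continuum limit

$$\Xi_2[\zeta; u, v] = \det(I_2 + sK) . \tag{2.5}$$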
2.3. The grand partition function
The determinant (2.5) can be written as the product of the eigenvalues of the integral operator $I_2 + sK$. Now, with $u=v=1$, when $\Xi_2(\zeta;u,v)$ reduces to the grand partition function $\Xi_2(\zeta)$, the operator $sK = \zeta K$ is antihermitian, so the eigenvalues of $K$ must occur in complex conjugate pairs $\pm i\lambda_j$, $\lambda_j > 0$ ($j=1,2,\dots$). Thus we can write [10]
$$\Xi_2(\zeta) = \det(I_2 + \zeta K) = \prod_{j=1}^{\infty} (1 + \zeta^2\lambda_j^2), \tag{2.6}$$
where the $\lambda_j$ are the positive eigenvalues of the coupled eigenvalue equations
$$\int_\Omega \frac{\psi_2(z')}{z-z'}\,dx'\,dy' = i\lambda\psi_1(z), \qquad \int_\Omega \frac{\psi_1(z')}{\bar z-\bar z'}\,dx'\,dy' = i\lambda\psi_2(z). \tag{2.7}$$
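In fact, the eigenvalue equation (2.7) is equivalent to the free-particle two-dimensional Dirac equation with special boundary conditions. To see this, introduce the complex derivative
$$\frac{\partial}{\partial\bar z} := \frac12\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right)$$
and use the formula
$$\frac{\partial}{\partial\bar z}\,\frac{1}{z-z'} = \pi\delta(x-x')\delta(y-y')$$
to deduce from Eq. (2.7) that
$$i\lambda\frac{\partial}{\partial\bar z}\psi_1(z) = \begin{cases} \pi\psi_2(z), & z\in\Omega-\partial\Omega, \\ 0, & z\notin\Omega, \end{cases} \qquad i\lambda\frac{\partial}{\partial z}\psi_2(z) = \begin{cases} \pi\psi_1(z), & z\in\Omega-\partial\Omega, \\ 0, & z\notin\Omega. \end{cases} \tag{2.8}$$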
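Indeed with $\sigma_x$, $\sigma_y$ denoting the usual Pauli matrices, these coupled equations are equivalent to the two-dimensional free-particle Dirac equation
$$\left(\sigma_x\frac{\partial}{\partial x} + \sigma_y\frac{\partial}{\partial y}\right)\begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix} = -\frac{2\pi i}{\lambda}\begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix}, \qquad (x,y)\in\Omega-\partial\Omega, \tag{2.9}$$
with the boundary conditions that $\psi_1$ and $\psi_2$ be continuous across the boundary, that $\psi_1$ be an analytic function of $z$ on and outside the boundary of $\Omega$ which vanishes as $|z|\to\infty$, and that $\psi_2$ be an analytic function of $\bar z$ on and outside $\Omega$ which also vanishes as $|z|\to\infty$. Alternatively the coupled equations can be combined to give the Helmholtz equation
$$\nabla^2\psi_1 = -(2\pi/\lambda)^2\psi_1 \quad\text{for } \mathbf{r}\in\Omega-\partial\Omega. \tag{2.10}$$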
2.4. Disk geometry
Suppose now that $\Omega$ is a disk centred at the origin of radius $R$. To compute Eq. (2.5) we must compute the positive eigenvalues of the coupled integral equations (2.7), or equivalently the coupled differential equations (2.8). We can solve the latter by introducing polar coordinates $(r,\theta)$ and noting from Eq. (2.10) that for $\mathbf{r}\in\Omega-\partial\Omega$,
$$\psi_1(\mathbf{r}) = J_l((2\pi/\lambda)r)\,e^{il\theta}, \qquad l\in\mathbb{Z}.$$
From Eq. (2.8) we then have
$$\psi_2(\mathbf{r}) = -(\lambda/4\pi)\,J_{l+1}((2\pi/\lambda)r)\,e^{i(l+1)\theta}.$$
If $l\ge 0$ the boundary conditions can be satisfied by choosing $J_l((2\pi/\lambda)R) = 0$ (and $\psi_1(\mathbf{r}) = 0$, $\psi_2(\mathbf{r}) = -(\lambda/4\pi)J_{l+1}((2\pi/\lambda)R)(R/\bar z)^{l+1}$ for $|\mathbf{r}|\ge R$), while for $l<0$ the boundary conditions can be satisfied by choosing $J_{l+1}((2\pi/\lambda)R) = 0$ (and $\psi_2(\mathbf{r}) = 0$, $\psi_1(\mathbf{r}) = J_l((2\pi/\lambda)R)(z/R)^{l}$ for $|\mathbf{r}|\ge R$). Since $J_{-l}(z) = (-1)^l J_l(z)$ we, therefore, have
$$\log\Xi_2(\zeta) = 2\sum_{j,l\ge 1}\log\left(1 + (2\pi R\zeta/j_{l-1,j})^2\right), \tag{2.11}$$
where $j_{l-1,j}$ denotes the $j$th positive root of the Bessel function $J_{l-1}(z)$.
Now, in general, for $f(z)$ an analytic function of $z$ with zeros at $z = c_j$, $j\in\mathbb{Z}$, and a product expansion of the form
$$f(z) = A\prod_j\left(1 - \frac{z}{c_j}\right),$$
we have
$$\sum_j\log\left(1 + \frac{c}{c_j}\right) = \log\frac{f(-c)}{f(0)}. \tag{2.12}$$
We remark that a contour integration derivation [11] of Eq. (2.12) shows that sufficient conditions for its validity are that $f(z)$ be analytic and exhibit the large-$z$ behaviour $f(z)/f'(z)\to c$ for $0<\arg(z)<\pi$ and $f(z)/f'(z)\to -c$ for $\pi<\arg(z)<2\pi$, where $c$ is a constant. Applying this result with $f(z) = z^{-(l-1)}J_{l-1}(z)$ (note that $f$ is then even and $f(0) = 1/2^{l-1}\Gamma(l)$) gives
$$\log\Xi_2(\zeta) = 2\sum_{l=1}^{\infty}\log\left(\frac{\Gamma(l)\,J_{l-1}(2\pi iR\zeta)}{(\pi iR\zeta)^{l-1}}\right). \tag{2.13}$$
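Eq. (2.12) can be sanity-checked numerically in a case where both sides are elementary. The sketch below is only an illustration and not part of the derivation: it assumes the test function $f(z) = \sin(z)/z$, which up to a constant factor is $z^{-1/2}J_{1/2}(z)$ and has zeros at $\pm j\pi$, and it takes $c = ix$ with $x$ real, so that pairing each zero with its negative gives $\sum_{j\ge 1}\log(1 + x^2/(j\pi)^2) = \log(\sinh(x)/x)$. The function names and the truncation length are choices made here.

```python
import math


def log_product_sum(x, n_terms=200000):
    """Truncated left-hand side of Eq. (2.12) for f(z) = sin(z)/z and c = i*x.

    The zeros +j*pi and -j*pi are paired, which turns the two terms
    log(1 + c/c_j) + log(1 - c/c_j) into log(1 + x**2 / (j*pi)**2).
    """
    return sum(math.log1p((x / (j * math.pi)) ** 2) for j in range(1, n_terms + 1))


def log_ratio(x):
    """Right-hand side log(f(-c)/f(0)) for the same f and c, i.e. log(sinh(x)/x)."""
    return math.log(math.sinh(x) / x)


if __name__ == "__main__":
    for x in (0.5, 3.0, 10.0):
        print(x, log_product_sum(x), log_ratio(x))
```

The truncated sum approaches the closed form from below, with a tail of order $x^2/(\pi^2 n)$ for $n$ retained terms, so the agreement improves slowly as more zeros are included.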
2.4.1. The thermodynamic limit
We will see that the sum (2.13) is divergent. This is because we are using the continuum formalism, in which there is no regularization of the short distance non-integrability of the Boltzmann factor $r^{-\Gamma}$ between opposite charges. However, at $\Gamma=2$ this divergence is weak (logarithmic) and sensible physical results can be obtained by introducing a cutoff $\Lambda$ (of order $R/\sigma$, where $\sigma$ is interpreted as a hard core radius) in the upper summation terminal. The large-$R$ asymptotics of Eq. (2.13) thus regularized is computed by using the Debye expansion of $I_l(z) = i^{-l}J_l(iz)$, valid for large $z$ uniformly in $l$. It reads [4]
$$\log I_l(z) = (z^2+l^2)^{1/2} - l\,\mathrm{arsinh}(l/z) - \frac14\log(z^2+l^2) - \frac12\log 2\pi + \frac{3t-5t^3}{24\,l} + O\!\left(\frac{1}{z^2+l^2}\right), \tag{2.14}$$
where $t := l/(z^2+l^2)^{1/2}$. This allows the asymptotic expansion of $\sum_{l=1}^{\Lambda}\log I_l(2\pi R\zeta)$ to be deduced from the Euler-Maclaurin summation formula
$$\sum_{l=0}^{N} f(l) = \int_0^N f(x)\,dx + \frac12\big(f(0)+f(N)\big) + \frac{1}{12}\big(f'(N)-f'(0)\big) + \cdots.$$
The sum $\sum_{l=1}^{\Lambda}\log\Gamma(l)$ is expanded using Eq. (1.23). We find that all terms except the leading term have finite limits for $\Lambda\to\infty$ (with $R$ large but still finite) [12,5]:
$$-\log\Xi_2(\zeta) = -\pi R^2\beta P + 2\pi R\,\beta\gamma + \frac16\log 2\pi\zeta R + O(1), \tag{2.16}$$
where
$$\beta P = 2\pi\zeta^2\left(1 + \log\frac{\Lambda}{\pi\zeta R}\right), \qquad \beta\gamma = 2\pi\zeta\left(\frac14 - \frac{1}{2\pi}\right) \tag{2.17}$$
($P$ denotes the pressure and $\gamma$ the surface tension).
2.5. The distribution functions
The distribution function for $j_1$ positive charges and $j_2$ negative charges in the lattice system is calculated according to
$$d^{2(j_1+j_2)}\rho_{j_1,j_2}(W_{l_1},\dots,W_{l_{j_1}};Z_{k_1},\dots,Z_{k_{j_2}}) = \frac{1}{\Xi_2(\zeta)}\,\frac{\delta^{j_1+j_2}}{\delta u(W_{l_1})\cdots\delta u(W_{l_{j_1}})\,\delta v(Z_{k_1})\cdots\delta v(Z_{k_{j_2}})}\,\Xi_2(\zeta;u,v)\Big|_{u=v=1}. \tag{2.18}$$
Substituting in Eq. (2.5) we see that the functional derivatives can be performed row by row to give
$$d^{2(j_1+j_2)}\rho_{j_1,j_2}(W_{l_1},\dots,W_{l_{j_1}};Z_{k_1},\dots,Z_{k_{j_2}}) = \frac{1}{\Xi_2(\zeta)}\det\left(1_{2M} + \zeta d^2K_{2M}\big|_{u=v=1} - D_{2M}\right), \tag{2.19}$$
where $D_{2M}$ is a $2M\times 2M$ diagonal matrix with non-zero elements, all equal to unity, in rows $l_1,\dots,l_{j_1}, M+k_1,\dots,M+k_{j_2}$ only. With
$$X = 1_{2M} - \left(1_{2M} + \zeta d^2K_{2M}\big|_{u=v=1}\right)^{-1} = \zeta d^2K_{2M}\left(1_{2M} + \zeta d^2K_{2M}\big|_{u=v=1}\right)^{-1}$$
and
$$d^2G_{++}(W_j,W_{j'}) = \langle j|X|j'\rangle, \qquad d^2G_{--}(Z_j,Z_{j'}) = \langle M+j|X|M+j'\rangle,$$
$$d^2G_{+-}(W_j,Z_{j'}) = \langle j|X|M+j'\rangle, \qquad d^2G_{-+}(Z_j,W_{j'}) = \langle M+j|X|j'\rangle,$$
where $\langle j|X|k\rangle$ denotes the element in row $j$ and column $k$ of $X$, substituting the first equality in Eq. (2.6) in Eq. (2.19) shows
$$\rho_{j_1,j_2}(W_{l_1},\dots,W_{l_{j_1}};Z_{k_1},\dots,Z_{k_{j_2}}) = \det\begin{bmatrix} [G_{++}(W_{l_j},W_{l_{j'}})]_{j,j'=1,\dots,j_1} & [G_{+-}(W_{l_j},Z_{k_{j'}})]_{j=1,\dots,j_1;\ j'=1,\dots,j_2} \\ [G_{-+}(Z_{k_j},W_{l_{j'}})]_{j=1,\dots,j_2;\ j'=1,\dots,j_1} & [G_{--}(Z_{k_j},Z_{k_{j'}})]_{j,j'=1,\dots,j_2} \end{bmatrix},$$
which in the continuum limit reads
$$\rho_{j_1,j_2}(w_1,\dots,w_{j_1};z_1,\dots,z_{j_2}) = \det\begin{bmatrix} [G_{++}(w_j,w_{j'})]_{j,j'=1,\dots,j_1} & [G_{+-}(w_j,z_{j'})]_{j=1,\dots,j_1;\ j'=1,\dots,j_2} \\ [G_{-+}(z_j,w_{j'})]_{j=1,\dots,j_2;\ j'=1,\dots,j_1} & [G_{--}(z_j,z_{j'})]_{j,j'=1,\dots,j_2} \end{bmatrix}, \tag{2.20}$$
where now $G_{s_1s_2}(w,z) := \zeta\langle w s_1|K(I+\zeta K)^{-1}|z s_2\rangle$ (here $\langle w s_1|X|z s_2\rangle$ denotes the kernel of the $2\times 2$ matrix integral operator $X$, with arguments $w$ and $z$ in position $(s_1 s_2)$ of the matrix).
Now, writing $G := \zeta K(1+\zeta K)^{-1}$ we must have $(1+\zeta K)G = \zeta K$, so $G_{s_1s_2}$ must satisfy
$$G_{+s}(z_1,z_3) + \zeta\int_\Omega dx_2\,dy_2\,\frac{G_{-s}(z_2,z_3)}{z_1-z_2} = \zeta\,\frac{\delta_{+(-s)}}{z_1-z_3},$$
$$G_{-s}(z_1,z_3) + \zeta\int_\Omega dx_2\,dy_2\,\frac{G_{+s}(z_2,z_3)}{\bar z_1-\bar z_2} = \zeta\,\frac{\delta_{-(-s)}}{\bar z_1-\bar z_3}. \tag{2.21}$$
Due to the appearance of $\bar z$ in Eq. (2.21), $G_{ss'}(w,z)$ will not be an analytic function of $w$ and $z$. From these equations we see that
$$G_{++}(z,z') = G_{--}(z,z'), \qquad G_{+-}(z,z') = \bar G_{-+}(z,z'). \tag{2.22}$$
Furthermore, we must also have $G(I+\zeta K) = \zeta K$, from which we can deduce that
$$G_{++}(z,z') = G_{++}(z',z), \qquad G_{+-}(z,z') = -G_{+-}(z',z). \tag{2.23}$$
The procedure used to derive Eq. (2.10) can be applied to Eq. (2.21) to deduce that for $z_1\in\Omega-\partial\Omega$,
$$-\frac14\nabla_1^2G_{++}(z_1,z_3) + \pi^2\zeta^2G_{++}(z_1,z_3) = \pi^2\zeta^2\delta(z_1-z_3) \tag{2.24}$$
and
$$G_{-+}(z_1,z_3) = -\frac{1}{\pi\zeta}\frac{\partial}{\partial\bar z_1}G_{++}(z_1,z_3). \tag{2.25}$$
2.5.1. Correlations in the bulk
Eqs. (2.24) and (2.25) are easy to solve in the bulk, when $G_{++}(z_1,z_3) = G_{++}(z_1-z_3)$ and $G_{-+}(z_1,z_3) = G_{-+}(z_1-z_3)$, by using Fourier transforms. Since the Green function of $-\nabla^2 + (2\pi\zeta)^2$ in the plane is $(1/2\pi)K_0(2\pi\zeta|z-z'|)$, one finds [13]
$$G_{++}(z,z') = 2\pi\zeta^2K_0(2\pi\zeta|z-z'|), \qquad G_{-+}(z,z') = 2\pi\zeta^2e^{i\phi}K_1(2\pi\zeta|z-z'|), \tag{2.26}$$
where $\phi$ is the polar angle of $z-z'$ and $K_0$ and $K_1$ are modified Bessel functions, defined by
$$K_\nu(z) := \int_0^\infty e^{-z\cosh t}\cosh(\nu t)\,dt, \qquad \mathrm{Re}(z)>0.$$
Now from Eqs. (2.20) and (2.22) we have that $\rho_+(\mathbf{r}) = \rho_-(\mathbf{r}) = G_{++}(\mathbf{r},\mathbf{r})$. However, substituting Eq. (2.26) gives an infinite result because $K_0(0)$ is infinite. This is consistent with $\rho = \rho_+(\mathbf{r}) + \rho_-(\mathbf{r})$ deduced from the thermodynamic formula
$$\rho = \zeta\frac{\partial\,\beta P}{\partial\zeta}.$$
According to Eq. (2.17) we have $\rho\sim 4\pi\zeta^2\log(\Lambda/\pi\zeta R)\sim 4\pi\zeta^2\log(1/\pi\sigma\zeta)$, which diverges as the hard core radius $\sigma$ approaches zero.
On the other hand, the truncated distributions, which do not involve $G_{ss}(z,z)$, remain finite for non-coincident points. In particular, the truncated two-particle distributions, which depend only on the displacement $r = |\mathbf{r}_1-\mathbf{r}_2|$, are given by
$$\rho^T_{++}(r) = -(2\pi\zeta^2)^2\big(K_0(2\pi\zeta r)\big)^2, \qquad \rho^T_{+-}(r) = (2\pi\zeta^2)^2\big(K_1(2\pi\zeta r)\big)^2. \tag{2.27}$$
We remark that, from the asymptotic expansion of $K_0(r)$, $K_1(r)$, these truncated distributions exhibit the large-$r$ behaviour
$$-\rho^T_{++}(r)\sim\rho^T_{+-}(r)\sim\frac{\pi^2\zeta^3}{r}\,e^{-4\pi\zeta r}. \tag{2.28}$$
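The exponential decay in Eq. (2.28) rests on the leading large-argument form $K_\nu(x)\sim(\pi/2x)^{1/2}e^{-x}$, which is the same for $\nu=0$ and $\nu=1$; with $x = 2\pi\zeta r$ each squared Bessel function therefore contributes a factor $e^{-4\pi\zeta r}/(4\zeta r)$. As a rough numerical illustration, the sketch below evaluates $K_\nu$ directly from the integral representation quoted above using Simpson's rule. The function names, the truncation rule for the $t$-integral and the number of quadrature points are assumptions made for this example.

```python
import math


def bessel_k(nu, z, n=4000):
    """K_nu(z) = int_0^inf exp(-z cosh t) cosh(nu t) dt for real z > 0 (Simpson's rule)."""
    # Truncate where exp(-z*cosh(t)) has dropped by a factor exp(-50) relative to t = 0.
    t_max = math.acosh(1.0 + 50.0 / z)
    h = t_max / n  # n must be even for Simpson's rule

    def integrand(t):
        return math.exp(-z * math.cosh(t)) * math.cosh(nu * t)

    total = integrand(0.0) + integrand(t_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def leading_asymptotic(z):
    """Leading large-z form sqrt(pi/(2z)) * exp(-z), common to K_0 and K_1."""
    return math.sqrt(math.pi / (2.0 * z)) * math.exp(-z)


if __name__ == "__main__":
    for x in (5.0, 20.0, 80.0):
        ratio0 = bessel_k(0, x) / leading_asymptotic(x)
        ratio1 = bessel_k(1, x) / leading_asymptotic(x)
        print(x, ratio0, ratio1)
```

Both ratios tend to 1 as $x$ grows, from below for $K_0$ and from above for $K_1$, which is why $-\rho^T_{++}$ and $\rho^T_{+-}$ share the same leading asymptotic form.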
2.6. Surface correlations
Eq. (2.21) makes sense for $|\Omega|$ infinite, so choosing $\Omega$ to be the half space $y>0$ we have
$$G_{+s}(z_1,z_3) + \zeta\int_0^\infty dy_2\int_{-\infty}^\infty dx_2\,\frac{G_{-s}(z_2,z_3)}{z_1-z_2} = \zeta\,\frac{\delta_{+(-s)}}{z_1-z_3},$$
$$G_{-s}(z_1,z_3) + \zeta\int_0^\infty dy_2\int_{-\infty}^\infty dx_2\,\frac{G_{+s}(z_2,z_3)}{\bar z_1-\bar z_2} = \zeta\,\frac{\delta_{-(-s)}}{\bar z_1-\bar z_3}. \tag{2.29}$$
Now, parallel to the wall, we expect the correlations to be translationally invariant, and thus $G_{ss'}(z,z') = G_{ss'}(x-x';y,y')$. This suggests we introduce the Fourier transform
$$g_{ss'}(\alpha;y,y') := \int_{-\infty}^\infty dx\,G_{ss'}(x;y,y')\,e^{2\pi i\alpha x},$$
which can be done by multiplying both sides of each equation in Eq. (2.29) by $e^{2\pi i\alpha(x_1-x_3)}$ and integrating with respect to $x_1$. The resulting expression depends on whether $\alpha>0$ or $\alpha<0$. Consider first the case $\alpha>0$. Using the result
$$\int_{-\infty}^\infty dt\,\frac{e^{2\pi i\alpha t}}{t+ik} = \begin{cases} 2\pi i\,e^{2\pi\alpha k}, & k<0, \\ 0, & k>0, \end{cases} \tag{2.30}$$
the equations become
$$g_{+s_3}(\alpha;y_1,y_3) - 2\pi\zeta\int_{y_1}^\infty dy_2\,g_{-s_3}(\alpha;y_2,y_3)\,e^{2\pi\alpha(y_1-y_2)} = -2\pi\zeta\,\delta_{-,s_3}\begin{cases} e^{2\pi(y_1-y_3)\alpha}, & y_1<y_3, \\ 0, & y_1>y_3, \end{cases}$$
$$g_{-s_3}(\alpha;y_1,y_3) + 2\pi\zeta\int_0^{y_1} dy_2\,g_{+s_3}(\alpha;y_2,y_3)\,e^{2\pi\alpha(y_2-y_1)} = 2\pi\zeta\,\delta_{+,s_3}\begin{cases} e^{2\pi(y_3-y_1)\alpha}, & y_3<y_1, \\ 0, & y_3>y_1. \end{cases}$$
Differentiating with respect to $y_1$, we then obtain four coupled first-order differential equations which can be summarized as the one equation
$$-\mathrm{sgn}(s)\,2\pi\alpha\,g_{ss'} + g'_{ss'} + 2\pi\zeta\,g_{(-s)s'} = 2\pi\zeta\,\delta_{(-s),s'}\,\delta(y_1-y_3), \tag{2.31}$$
where we have used the abbreviation $g_{ss'}(\alpha;y_1,y_3) = g_{ss'}$, and the differentiation is with respect to $y_1$.
For $\alpha<0$, use of the integration formula (2.30) and proceeding as above shows that the equations (2.31) again hold. These equations are equivalent to the coupled integral equations provided we can specify four independent boundary conditions. From the first set of integral equations we see that
$$g_{+s_3}(\alpha;y_1,y_3)\to 0 \quad\text{as } y_1\to\infty, \tag{2.32}$$
which constitutes two of the boundary conditions, while the second set of integral equations gives
$$g_{-s_3}(\alpha;0,y_3) = 0 \ \ (\alpha>0), \qquad g_{+s_3}(\alpha;0,y_3) = 0 \ \ (\alpha<0) \tag{2.33}$$
and thus the remaining two boundary conditions.
From Eq. (2.31) with $s,s' = +,+$ and $s,s' = -,+$, we see that
$$g_{-+} = (1/2\pi\zeta)\,(2\pi\alpha\,g_{++} - g'_{++}), \tag{2.34}$$
$$-g''_{++} + \big((2\pi\alpha)^2 + (2\pi\zeta)^2\big)g_{++} = (2\pi\zeta)^2\,\delta(y_1-y_3). \tag{2.35}$$
The general solution of Eq. (2.35) subject to the boundary condition (2.32) is easily verified to be
$$g_{++}(\alpha;y_1,y_3) = \frac{2(\pi\zeta)^2}{\kappa}\left(e^{-\kappa|y_1-y_3|} + A(\alpha)\,e^{-\kappa(y_1+y_3)}\right), \tag{2.36}$$
where $\kappa = 2\pi(\alpha^2+\zeta^2)^{1/2}$. Thus [10]
$$G_{++}((x,y),(x',y')) = 2(\pi\zeta)^2\int_{-\infty}^\infty d\alpha\,\frac{e^{-2\pi i\alpha(x-x')}}{\kappa}\left(e^{-\kappa|y-y'|} + A(\alpha)\,e^{-\kappa(y+y')}\right). \tag{2.37}$$
To determine $A(\alpha)$ we use Eq. (2.33) with $s_3 = +$ together with Eq. (2.34), which give
$$A(\alpha) = \frac{\kappa-2\pi\alpha}{\kappa+2\pi\alpha} \ \ (\alpha>0), \qquad A(\alpha) = -1 \ \ (\alpha<0). \tag{2.38}$$
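The way the boundary conditions (2.33) fix $A(\alpha)$ can be checked numerically. The sketch below is an illustration under stated assumptions: it codes $g_{++}$ from Eq. (2.36) with the amplitude of Eq. (2.38), obtains $g_{-+}$ from Eq. (2.34) using a central difference in $y_1$ (valid away from the kink at $y_1 = y_3$), and evaluates both at the wall $y_1 = 0$. The function names and the step size are choices made here.

```python
import math


def kappa(alpha, zeta):
    """kappa = 2*pi*sqrt(alpha**2 + zeta**2)."""
    return 2.0 * math.pi * math.hypot(alpha, zeta)


def amplitude(alpha, zeta):
    """A(alpha) from Eq. (2.38)."""
    if alpha > 0:
        k = kappa(alpha, zeta)
        return (k - 2.0 * math.pi * alpha) / (k + 2.0 * math.pi * alpha)
    return -1.0


def g_pp(alpha, zeta, y1, y3):
    """g_{++}(alpha; y1, y3) from Eq. (2.36)."""
    k = kappa(alpha, zeta)
    prefactor = 2.0 * (math.pi * zeta) ** 2 / k
    return prefactor * (math.exp(-k * abs(y1 - y3))
                        + amplitude(alpha, zeta) * math.exp(-k * (y1 + y3)))


def g_mp(alpha, zeta, y1, y3, h=1e-6):
    """g_{-+} from Eq. (2.34); the y1-derivative is a central difference (needs y1 != y3)."""
    dg = (g_pp(alpha, zeta, y1 + h, y3) - g_pp(alpha, zeta, y1 - h, y3)) / (2.0 * h)
    return (2.0 * math.pi * alpha * g_pp(alpha, zeta, y1, y3) - dg) / (2.0 * math.pi * zeta)


if __name__ == "__main__":
    zeta, y3 = 0.5, 1.0
    print("alpha > 0: g_-+ at the wall =", g_mp(0.3, zeta, 0.0, y3))
    print("alpha < 0: g_++ at the wall =", g_pp(-0.3, zeta, 0.0, y3))
```

Both printed values vanish up to discretization and rounding error, as required by Eq. (2.33).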
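Use of Eq. (2.33), together with Eq. (2.36), also allows us to deduce that
$$G_{-+}((x,y),(x',y')) = \pi\zeta\left(2\pi\int_{-\infty}^\infty d\alpha\,\alpha\,\frac{e^{-2\pi i\alpha(x-x')}}{\kappa}\left(e^{-\kappa|y-y'|} + A(\alpha)\,e^{-\kappa(y+y')}\right) + \int_{-\infty}^\infty d\alpha\,e^{-2\pi i\alpha(x-x')}\left(\mathrm{sgn}(y-y')\,e^{-\kappa|y-y'|} + A(\alpha)\,e^{-\kappa(y+y')}\right)\right). \tag{2.39}$$
We remark that by integrating by parts we can deduce the large $x-x'$ behaviour of Eqs. (2.38) and (2.39). Using these results, together with Eq. (2.21), we deduce that
$$-\rho^T_{++}(x-x';y,y')\sim\rho^T_{+-}(x-x';y,y')\sim\frac{\zeta^2}{(x-x')^2}\,e^{-4\pi\zeta(y+y')}. \tag{2.40}$$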
2.7. Metal boundary
Here we will consider the 2dCG in semi-periodic boundary conditions, period $L$ in the $x$-direction, with a metal (perfect conductor) occupying the region $y<0$ [14-16]. The pair potential must have this periodicity property, satisfy the 2d Poisson equation (1.1) and vanish on the boundary $y=0$. The unique solution is
$$\Phi(z,z') = -\log\left|\frac{\sin\pi(z-z')/L}{\sin\pi(z-\bar z')/L}\right|. \tag{2.41}$$
Due to the image charges, charge neutrality in the vicinity of the metal wall occurs independent of the relative number of positive and negative charges. We can therefore consider a system of $N_+$ positive charges with complex coordinates $\{z_j\}$ and $N_-$ negative charges with complex coordinates $\{z'_j\}$.
From Eq. (2.41) the Boltzmann factor is given by
$$(\pi/L)^{(N_++N_-)\Gamma/2}\,|W_1/W_2|^\Gamma,$$
where
$$W_1 = \prod_{j<k}^{N_+}\sin\pi(z_k-z_j)/L\ \prod_{j<k}^{N_-}\sin\pi(z'_k-z'_j)/L\ \prod_{j=1}^{N_+}\prod_{k=1}^{N_-}\sin\pi(\bar z_j-z'_k)/L,$$
$$W_2 = \prod_{j=1}^{N_+}\big(\sin\pi(\bar z_j-z_j)/L\big)^{1/2}\ \prod_{j=1}^{N_-}\big(\sin\pi(\bar z'_j-z'_j)/L\big)^{1/2}\ \prod_{j<k}^{N_+}\sin\pi(z_k-\bar z_j)/L\ \prod_{j<k}^{N_-}\sin\pi(z'_k-\bar z'_j)/L\ \prod_{j=1}^{N_+}\prod_{k=1}^{N_-}\sin\pi(z_j-z'_k)/L.$$
For $\Gamma=2$, use of the Cauchy double alternant identity (2.2) with $N = N_++N_-$, $u_j = e^{2\pi iz_j/L}$, $v_j = e^{2\pi i\bar z_j/L}$ ($j=1,\dots,N_+$), $u_{j+N_+} = e^{2\pi i\bar z'_j/L}$, $v_{j+N_+} = e^{2\pi iz'_j/L}$ ($j=1,\dots,N_-$) shows that
$$\left|\frac{W_1}{W_2}\right|^2 = \det\begin{bmatrix} A_1 & A_2 \\ A_3 & A_4 \end{bmatrix}, \tag{2.42}$$
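where
$$A_1 = i\left[\frac{1}{\sin\pi(z_j-\bar z_k)/L}\right]_{j,k=1,\dots,N_+}, \qquad A_2 = i\left[\frac{1}{\sin\pi(z_j-z'_k)/L}\right]_{j=1,\dots,N_+;\ k=1,\dots,N_-},$$
$$A_3 = -i\left[\frac{1}{\sin\pi(\bar z'_j-\bar z_k)/L}\right]_{j=1,\dots,N_-;\ k=1,\dots,N_+}, \qquad A_4 = -i\left[\frac{1}{\sin\pi(\bar z'_j-z'_k)/L}\right]_{j,k=1,\dots,N_-}.$$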
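Now suppose the particles are confined to the region $0<x<L$, $\epsilon<y<\epsilon+W$. Using the identity (2.42), and proceeding as in the derivation of Eqs. (2.5) and (2.6) shows that the corresponding grand partition function is given by
$$\Xi_2(\zeta_+,\zeta_-) = \lim_{M\to\infty,\ d\to 0}\ \sum_{N_+,N_-}\zeta_+^{N_+}\zeta_-^{N_-}\,d^{2(N_++N_-)}\sum_{\{z_j\},\{z'_j\}}\left|\frac{W_1}{W_2}\right|^2 = \prod_{j=1}^\infty(1+\lambda_j), \tag{2.43}$$
where the inner sum is over the lattice positions of the charges and the $\lambda_j$ are the eigenvalues of the coupled integral equations
$$\frac{\pi}{L}\int_\epsilon^{\epsilon+W}dy'\int_0^Ldx'\,\frac{\psi_1(z')}{\sin\pi(z-\bar z')/L} + \frac{\pi}{L}\int_\epsilon^{\epsilon+W}dy'\int_0^Ldx'\,\frac{\psi_2(z')}{\sin\pi(z-z')/L} = \frac{\lambda}{i\zeta_+}\,\psi_1(z),$$
$$\frac{\pi}{L}\int_\epsilon^{\epsilon+W}dy'\int_0^Ldx'\,\frac{\psi_1(z')}{\sin\pi(\bar z-\bar z')/L} + \frac{\pi}{L}\int_\epsilon^{\epsilon+W}dy'\int_0^Ldx'\,\frac{\psi_2(z')}{\sin\pi(\bar z-z')/L} = -\frac{\lambda}{i\zeta_-}\,\psi_2(z). \tag{2.44}$$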
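Analogous to the situation with the coupled integral equations (2.7), these integral equations can be reduced to coupled differential equations. This is done by applying the formula
$$\frac{\pi}{L}\,\frac{\partial}{\partial\bar z}\,\frac{1}{\sin\pi(z-z')/L} = \pi\delta(x-x')\delta(y-y')$$
and its complex conjugate, to obtain
$$\psi_2(z) = \frac{\lambda}{i\zeta_+}\frac{\partial}{\partial\bar z}\psi_1(z) \quad\text{and}\quad \psi_1(z) = -\frac{\lambda}{i\zeta_-}\frac{\partial}{\partial z}\psi_2(z),$$
$z\in\Omega-\partial\Omega$. In the special case $\zeta_+ = \zeta_- = \zeta$ these coupled equations are equivalent to the Dirac equation (2.9) with $\lambda$ replaced by $\lambda/\zeta$.
To calculate the eigenvalues of Eq. (2.44) we seek eigenfunctions of the form
$$\psi_1(z) = e^{-2\pi i(p-1/2)x/L}X(y), \qquad \psi_2(z) = e^{-2\pi i(p-1/2)x/L}Y(y),$$
$p\in\mathbb{Z}$. For $p\ge 1$ this leads to the eigenvalue condition
$$-\left(\frac{2\pi}{\lambda}\,\zeta\hat\zeta\,e^{-2k\epsilon} + k\right)\frac{\sin W\big((2\pi i\zeta/\lambda)^2-k^2\big)^{1/2}}{\big((2\pi i\zeta/\lambda)^2-k^2\big)^{1/2}} + \cos W\big((2\pi i\zeta/\lambda)^2-k^2\big)^{1/2} = 0,$$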
where
$$k := \pi(2p-1)/L, \qquad \zeta = (\zeta_+\zeta_-)^{1/2}, \qquad \hat\zeta = (\zeta_-/\zeta_+)^{1/2}, \tag{2.45}$$
while for $p<1$ the same equation holds with $k$ replaced by $-k$ and $\hat\zeta$ by $1/\hat\zeta$. The grand partition function (2.43) is therefore specified in terms of the roots of this equation, and this expression can be further simplified by the use of Eq. (2.12). The resulting expression has the large $L$ and $W$ expansion
$$-\log\Xi_2(\zeta_+,\zeta_-)\sim -LW\beta P + L(\beta\gamma+\beta u), \tag{2.46}$$
where $\beta P$ and $\beta\gamma$ are as in Eq. (2.16), while
$$\beta u = -\int_0^\infty dt\,\big(g(t,\hat\zeta)+g(t,1/\hat\zeta)\big) + (1-\pi/2)\zeta, \tag{2.47}$$
where
$$g(t,\hat\zeta) = \log\left(\frac12\left(1 + \frac{\zeta\hat\zeta\,e^{-4\pi t\epsilon}+t}{(\zeta^2+t^2)^{1/2}}\right)\right).$$
The calculation of the correlations near the metal boundary proceeds in an analogous way to the calculation of the surface correlations in the disk with a hard wall boundary. In fact, the final expressions in that calculation, Eqs. (2.37) and (2.39), again hold provided $A$ therein is replaced by
$$A = \frac{\kappa-2\pi\alpha-2\pi\zeta\hat\zeta\,e^{-4\pi\alpha\epsilon}}{\kappa+2\pi\alpha+2\pi\zeta\hat\zeta\,e^{-4\pi\alpha\epsilon}}, \tag{2.48}$$
with $\zeta$ and $\hat\zeta$ specified by Eq. (2.45), and that Eq. (2.39) is multiplied by $\hat\zeta$.

3. Exact results from physical principles
A number of features of the exact solutions are universal in the sense that they apply to all Coulomb systems in the conducting phase, independent of the details of the models. Such features to be covered in this section include the finite size corrections to the free energy, screening sum rules, fluctuation formulas and asymptotic properties of the surface correlations.

3.1. Finite size corrections to the free energy and grand potential
Following Jancovici and Téllez [16], we make the heuristic assumption that the universal features of the (grand) partition function $\Xi_C$ of a conducting Coulomb system are correctly accounted for by the continuum functional expression
$$\Xi_C = \int D\rho\,\exp\left(-\frac{\beta}{2}\iint d\mathbf{r}\,d\mathbf{r}'\,\rho(\mathbf{r})G(\mathbf{r},\mathbf{r}')\rho(\mathbf{r}')\right).$$
Here $\rho(\mathbf{r})$ is the continuum charge density and $G(\mathbf{r},\mathbf{r}')$ is the 2d Coulomb potential $-\log|\mathbf{r}-\mathbf{r}'|$. Let us change variables from the charge density $\rho(\mathbf{r})$ to the electric potential $\phi(\mathbf{r})$. Since
$$\nabla^2\phi(\mathbf{r}) = -2\pi\rho(\mathbf{r}),$$
the Jacobian is given by $\det(-(1/2\pi)\nabla^2)$ and we obtain
$$\Xi_C = \det\left(-(1/2\pi)\nabla^2\right)Z_G,$$
where
$$Z_G = \int D\phi\,\exp\left(-\frac{\beta}{4\pi}\int\phi(\mathbf{r})(-\nabla^2)\phi(\mathbf{r})\,d\mathbf{r}\right) = \left(\det\left(-\frac{1}{2\pi}\nabla^2\right)\right)^{-1/2}.$$
Thus
$$\log\Xi_C = -\log Z_G, \tag{3.1}$$
so $\log\Xi_C$ and $\log Z_G$ have the same universal term, except for its sign.
Now for the Gaussian theory in two dimensions with hard wall boundary conditions, it is known [17] that for large volumes $|\Omega|$,
$$-\log Z_G\sim A|\Omega| + B|\partial\Omega| - \frac{\chi}{6}\log|\Omega|^{1/2} + O(1),$$
where the first two terms are not universal (i.e. depend on cutoff procedures, etc.) but the term $-(\chi/6)\log|\Omega|^{1/2}$ is universal, being independent of all such detail. Here $\chi = 2-2h-b$ where $h$ is the number of handles and $b$ is the number of boundaries of $\Omega$; $\chi$ is referred to as the Euler number. Thus Eq. (3.1) predicts that the large volume expansion of $-\log Z_C$ or $-\log\Xi_C$ will contain a universal term $(\chi/6)\log|\Omega|^{1/2}$. In particular, for a disk $\chi=1$ and so the universal term is $(1/6)\log R$, which is in precise agreement with the exact results Eqs. (1.26) and (2.16).

3.2. Screening sum rules
A basic hypothesis relating to the conducting phase of a Coulomb system is that an external charge density will be perfectly screened in the long-wavelength limit. This can be used in conjunction with a linear response argument to predict the long-wavelength behaviour of the Fourier transform of the charge-charge correlation (see e.g. Ref. [18]).
Now, the linear response formula states that for any observable $A$ in a statistical mechanical system, the change in its mean value due to a perturbation $\delta U$ to the total energy $U$ is to leading order in $\delta U$ given by
$$\langle A\rangle_\epsilon - \langle A\rangle_0 = -\beta\langle A\,\delta U\rangle^T_0. \tag{3.2}$$
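Eq. (3.2) can be illustrated on a toy system with a handful of discrete states, where every average is a finite sum. The sketch below is an assumption-laden illustration rather than anything specific to Coulomb systems: the state energies, the observable and the perturbation are arbitrary numbers chosen here, the perturbation is written as $\epsilon$ times a fixed profile, and the truncated average is $\langle A\,\delta U\rangle^T_0 = \langle A\,\delta U\rangle_0 - \langle A\rangle_0\langle\delta U\rangle_0$.

```python
import math


def boltzmann_average(values, energies, beta):
    """Canonical average of `values` over states with the given energies."""
    weights = [math.exp(-beta * e) for e in energies]
    partition = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / partition


def linear_response_sides(A, U, dU, beta, eps):
    """Return (exact change of <A>, linear-response prediction) for delta U = eps * dU."""
    perturbed = [u + eps * d for u, d in zip(U, dU)]
    exact_change = boltzmann_average(A, perturbed, beta) - boltzmann_average(A, U, beta)

    mean_a = boltzmann_average(A, U, beta)
    mean_du = boltzmann_average(dU, U, beta)
    mean_a_du = boltzmann_average([a * d for a, d in zip(A, dU)], U, beta)
    prediction = -beta * eps * (mean_a_du - mean_a * mean_du)
    return exact_change, prediction


if __name__ == "__main__":
    A = [1.0, 2.0, 4.0]      # hypothetical observable values per state
    U = [0.0, 0.5, 1.3]      # hypothetical unperturbed energies
    dU = [0.3, -0.2, 0.7]    # hypothetical perturbation profile
    for eps in (1e-2, 1e-3, 1e-4):
        exact, predicted = linear_response_sides(A, U, dU, beta=2.0, eps=eps)
        print(eps, exact, predicted, exact - predicted)
```

The difference between the two sides shrinks like $\epsilon^2$, consistent with Eq. (3.2) holding to leading order in $\delta U$.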
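Here the subscript $\epsilon$ (0) denotes the presence (absence) of the perturbation in the average. It is a simple matter to derive Eq. (3.2) in the canonical ensemble from the meaning of the averages:
$$\langle A\rangle_\epsilon := \frac{1}{Z_\epsilon}\prod_{l=1}^N\int_\Omega d\mathbf{x}_l\,A\,e^{-\beta(U+\delta U)} \quad\text{with}\quad Z_\epsilon := \prod_{l=1}^N\int_\Omega d\mathbf{x}_l\,e^{-\beta(U+\delta U)}$$
and
$$\langle A\,\delta U\rangle^T_0 := \langle A\,\delta U\rangle_0 - \langle A\rangle_0\langle\delta U\rangle_0.$$
Consider a general $s$-component two-dimensional Coulomb system, and suppose we take for $\delta U$ the particle-background potential created by an external background charge density $\epsilon e^{i\mathbf{k}\cdot\mathbf{x}}$. Denoting the charge density $C(\mathbf{x})$ and its Fourier transform $\tilde C(\mathbf{k})$ by
$$C(\mathbf{x}) = \sum_{k=1}^sq_k\sum_{j=1}^{N_k}\delta(\mathbf{x}-\mathbf{r}^{(k)}_j), \qquad \tilde C(\mathbf{k}) = \sum_{k=1}^sq_k\sum_{j=1}^{N_k}e^{i\mathbf{k}\cdot\mathbf{r}^{(k)}_j}, \tag{3.3}$$
we have
$$\delta U = -\epsilon\iint\log|\mathbf{x}-\mathbf{y}|\,C(\mathbf{y})\,e^{i\mathbf{k}\cdot\mathbf{x}}\,d\mathbf{x}\,d\mathbf{y} = \epsilon\,\frac{2\pi}{k^2}\,\tilde C(\mathbf{k}), \tag{3.4}$$
where we have used Eq. (1.4). Now take for the observable $A$ the charge density $C(\mathbf{r})$. We then have
$$\langle A\,\delta U\rangle^T = \epsilon\,\frac{2\pi}{k^2}\int\langle C(\mathbf{r})C(\mathbf{r}')\rangle^T\,e^{i\mathbf{k}\cdot\mathbf{r}'}\,d\mathbf{r}'. \tag{3.5}$$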
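Note that for a fluid state the truncated charge-charge distribution in Eq. (3.5) will depend only on the difference $\mathbf{r}-\mathbf{r}'$, and we write $\langle C(\mathbf{r})C(\mathbf{r}')\rangle^T =: S(\mathbf{r}-\mathbf{r}')$. On the other hand, the screening hypothesis gives that for $|\mathbf{k}|\to 0$
$$\langle C(\mathbf{r})\rangle_\epsilon - \langle C(\mathbf{r})\rangle_0 = -\epsilon e^{i\mathbf{k}\cdot\mathbf{r}}, \tag{3.6}$$
so equating $-\beta$ times Eq. (3.5) with Eq. (3.6) according to the linear response formula (3.2) gives
$$\tilde S(\mathbf{k})\sim\frac{k^2}{2\pi\beta} \quad\text{as } |\mathbf{k}|\to 0, \tag{3.7}$$
or equivalently, expanding $\tilde S(\mathbf{k}) = \int S(\mathbf{r})e^{i\mathbf{k}\cdot\mathbf{r}}\,d\mathbf{r}$ to second order in $\mathbf{k}$,
$$\int S(\mathbf{r})\,d\mathbf{r} = 0 \tag{3.8}$$
and
$$\int r^2S(\mathbf{r})\,d\mathbf{r} = -\frac{2}{\pi\beta}. \tag{3.9}$$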
The first equation expresses the fact that the screening cloud about an internal charge at the origin exactly cancels that charge. It has the generalization that all multipoles of the internal screening cloud must vanish:
$$\int S(\mathbf{r})(x+iy)^n\,d\mathbf{r} = 0, \qquad n=0,1,\dots \tag{3.10}$$
which is an immediate consequence of the decay of $S(\mathbf{r})$ being faster than any power law, and rotational invariance. The second moment condition (3.9) is referred to as the Stillinger-Lovett sum rule.
For the 2dOCP,
$$S(\mathbf{r}) = q^2\big(\rho_2^T(\mathbf{r}) + \rho\,\delta(\mathbf{r})\big), \tag{3.11}$$
so from Eq. (1.35), at $\Gamma=2$,
$$S(\mathbf{r}) = q^2\big(-\rho^2e^{-\pi\rho|\mathbf{r}|^2} + \rho\,\delta(\mathbf{r})\big). \tag{3.12}$$
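The low-order moments of Eq. (3.12) are easy to evaluate numerically. The sketch below is a minimal check under stated assumptions: it takes $\Gamma = \beta q^2 = 2$, integrates the smooth part of $S$ radially with Simpson's rule, adds the delta-function weight $q^2\rho$ by hand to the zeroth moment, and uses the small-$|\mathbf{k}|$ expansion $\tilde S(\mathbf{k})\approx\int S\,d\mathbf{r} - (k^2/4)\int r^2S\,d\mathbf{r}$ in two dimensions to compare with the coefficient $1/(2\pi\beta)$ of Eq. (3.7). The function names and the numerical values of $\rho$ and $q$ are arbitrary choices made here.

```python
import math


def radial_moment(power, rho, q, n=2000):
    """2*pi * int_0^inf r**(power+1) * S_reg(r) dr, with S_reg(r) = -q^2 rho^2 exp(-pi rho r^2)."""
    r_max = math.sqrt(50.0 / (math.pi * rho))  # integrand is below exp(-50) beyond this radius
    h = r_max / n  # n must be even for Simpson's rule

    def integrand(r):
        s_reg = -(q * rho) ** 2 * math.exp(-math.pi * rho * r * r)
        return 2.0 * math.pi * r ** (power + 1) * s_reg

    total = integrand(0.0) + integrand(r_max)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def ocp_moments(rho, q):
    """Zeroth and second moments of S(r) in Eq. (3.12), delta-function term included."""
    zeroth = q * q * rho + radial_moment(0, rho, q)
    second = radial_moment(2, rho, q)  # the delta function does not contribute to r^2 S
    return zeroth, second


if __name__ == "__main__":
    rho, q = 0.7, 1.3
    beta = 2.0 / q ** 2  # Gamma = beta * q^2 = 2
    zeroth, second = ocp_moments(rho, q)
    print("zeroth moment:", zeroth)
    print("small-k coefficient -M2/4:", -second / 4.0, " vs 1/(2 pi beta):", 1.0 / (2.0 * math.pi * beta))
```

The zeroth moment vanishes and the second moment reproduces the $k^2$ coefficient of Eq. (3.7), independent of the density $\rho$.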
It is straightforward to verify the sum rules Eqs. (3.7), (3.8), (3.9) and (3.10).
For the 2dCG,
$$S(\mathbf{r}) = 2q^2\big(\rho^T_{++}(\mathbf{r}) - \rho^T_{+-}(\mathbf{r}) + \rho_+\delta(\mathbf{r})\big). \tag{3.13}$$
At $\Gamma=2$ in the continuum, we have remarked that $\rho_+$ is infinite. Nonetheless sensible results can be obtained by introducing a hard core cutoff $\sigma$ into the Green functions (2.26), by supposing that for $0<|z-z'|<\sigma$ they vanish. This then gives
$$S(\mathbf{r}) = 2q^2\begin{cases} -(2\pi\zeta^2)^2\big(K_0(2\pi\zeta r)\big)^2 - (2\pi\zeta^2)^2\big(K_1(2\pi\zeta r)\big)^2, & r>\sigma, \\ 0, & r<\sigma, \end{cases}\ +\ 2q^2\,(2\pi\zeta^2)\,K_0(2\pi\zeta\sigma)\,\delta(\mathbf{r})$$
and we can check [10] that the perfect screening sum rule (3.8) holds in the limit $\sigma\to 0$.
Another significant feature of Eq. (3.7) is that it is consistent with the asymptotic decay of the correlations in the conducting phase being faster than any inverse power law (recall Eqs. (1.35) and (2.27)). Indeed the small-$|\mathbf{k}|$ behaviour of the Fourier transform determines the large-$|\mathbf{r}|$ behaviour of the original function; if the small-$|\mathbf{k}|$ behaviour contains a leading singularity of the form $|\mathbf{k}|^\mu$ ($\mu>0$, $\mu\ne 2,4,\dots$) then the large-$|\mathbf{r}|$ asymptotic form will be proportional to $1/|\mathbf{r}|^{2+\mu}$.
On this point it is interesting to repeat the above perfect screening argument, with the potential $-\log|\mathbf{x}-\mathbf{y}|$ replaced by $1/|\mathbf{x}-\mathbf{y}|^\gamma$, $0<\gamma<2$. The Fourier transform of this potential is proportional to $|\mathbf{k}|^{\gamma-2}$, which means that $k^2$ in Eq. (3.7) is to be replaced by a constant times $|\mathbf{k}|^{2-\gamma}$. According to the above discussion, this implies that for large $|\mathbf{r}|$, $S(\mathbf{r})$ is proportional to $1/|\mathbf{r}|^{4-\gamma}$. We remark that this conclusion can also be reached by an analysis of the BGY equations [19,18].

3.3. Fluctuation formulas
One use for the charge-charge correlation $S(\mathbf{r}-\mathbf{r}')$ is in the calculation of the variance of a quantity of the form $A = \sum_{k=1}^sq_k\sum_{j=1}^{N_k}a(\mathbf{r}^{(k)}_j)$, which is referred to as a linear statistic. A short calculation shows
$$\mathrm{Var}(A) := \langle(A-\langle A\rangle)^2\rangle = \iint d\mathbf{r}\,d\mathbf{r}'\,a(\mathbf{r})a(\mathbf{r}')S(\mathbf{r}-\mathbf{r}'), \tag{3.14}$$
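where it is assumed the thermodynamic limit has been taken. In particular, choosing $a(\mathbf{r}) = \chi_\Lambda(\mathbf{r})$ where $\chi_\Lambda(\mathbf{r}) = 1$ for $\mathbf{r}\in\Lambda$, $\chi_\Lambda(\mathbf{r}) = 0$ otherwise, gives the variance of the charge $(\Delta Q_\Lambda)^2$ in the region $\Lambda$:
$$(\Delta Q_\Lambda)^2 = \int_\Lambda d\mathbf{r}\int_\Lambda d\mathbf{r}'\,S(\mathbf{r}-\mathbf{r}'). \tag{3.15}$$
Note that in the one-component case $(\Delta Q_\Lambda)^2$ is proportional to the variance of the number of particles $(\Delta N_\Lambda)^2$ in the region $\Lambda$. This quantity in a compressible gas is proportional to $|\Lambda|$. However, we will see that for the OCP the sum rule (3.8) shows that $(\Delta N_\Lambda)^2$ is proportional to $|\partial\Lambda|$, so the fluctuations are strongly suppressed.
Suppose for example that $\Lambda$ is a square of area $L^2$ [18]. Now we can rewrite Eq. (3.15) to read
$$(\Delta Q_\Lambda)^2 = \int d\mathbf{r}\,\gamma_\Lambda(\mathbf{r})S(\mathbf{r}), \tag{3.16}$$
where $\gamma_\Lambda(\mathbf{r}) := \int d\mathbf{r}'\,\chi_\Lambda(\mathbf{r}')\chi_\Lambda(\mathbf{r}+\mathbf{r}')$ is the volume of the intersection of $\Lambda$ with its $\mathbf{r}$-translate. But
$$\gamma_\Lambda(\mathbf{r}) = L^2 - (|x|+|y|)L + O(1)$$
so substituting in Eq. (3.16) and using Eq. (3.8) shows
$$(\Delta Q_\Lambda)^2\sim -L\int d\mathbf{r}\,(|x|+|y|)S(\mathbf{r}),$$
thus verifying that $(\Delta Q_\Lambda)^2$ is indeed proportional to $L$.
Another interesting linear statistic is the potential difference
$$A(\mathbf{u}) := -\sum_{k=1}^sq_k\sum_{j=1}^{N_k}\big(\log|\mathbf{r}^{(k)}_j-\mathbf{u}| - \log|\mathbf{r}^{(k)}_j|\big)$$
between the potential at the point $\mathbf{u}$ and the potential at the origin. Substituting in Eq. (3.14) and introducing Fourier transforms gives
$$\mathrm{Var}(A(\mathbf{u})) = \int d\mathbf{k}\,|1-e^{i\mathbf{k}\cdot\mathbf{u}}|^2\,\frac{\tilde S(\mathbf{k})}{|\mathbf{k}|^4}.$$
For $|\mathbf{u}|$ large, substituting the asymptotic formula (3.7) in this formula shows that [20]
$$\mathrm{Var}(A(\mathbf{u}))\sim(2/\beta)\log|\mathbf{u}|. \tag{3.17}$$
Thus even though the particle correlations in the bulk have a fast decay, the potential fluctuations are long-ranged.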
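For a class of linear statistics it is possible to argue [21,23] that the full distribution
$$\Pr(A=u) := \lim_{|\Omega|\to\infty}\frac{1}{\hat Z}\prod_{k=1}^s\int_\Omega d\mathbf{r}^{(k)}_1\cdots\int_\Omega d\mathbf{r}^{(k)}_{N_k}\ \delta\Big(u-\sum_{k=1}^sq_k\sum_{j=1}^{N_k}a(\mathbf{r}^{(k)}_j)\Big)\,e^{-\beta U} \tag{3.18}$$
will be Gaussian. These are linear statistics for which $a(\mathbf{r}) = b(\mathbf{r}/\alpha)$ and for which the limit $\alpha\to\infty$ is taken after Eq. (3.18) is computed in the thermodynamic limit. Note that as $\alpha\to\infty$ the function $a(\mathbf{r})$ varies only over macroscopic distances.
The key observation is that if we consider a Coulomb system perturbed by a potential $\delta U = A = \sum_{k=1}^sq_k\sum_{j=1}^{N_k}b(\mathbf{r}^{(k)}_j/\alpha)$ then for $\alpha\to\infty$, when $\delta U$ varies over macroscopic distances, macroscopic electrostatics gives that
$$-\int d\mathbf{r}'\,\log|\mathbf{r}-\mathbf{r}'|\,\big(\langle C(\mathbf{r}')\rangle_\epsilon - \langle C(\mathbf{r}')\rangle_0\big) = -b(\mathbf{r}/\alpha) + \mathrm{const.}$$
exactly. Thus, the relationship between $\langle C(\mathbf{r}')\rangle_\epsilon - \langle C(\mathbf{r}')\rangle_0$ and $b(\mathbf{r}/\alpha)$ is linear in the limit $\alpha\to\infty$. But, in general,
$$\langle C(\mathbf{r}')\rangle_\epsilon - \langle C(\mathbf{r}')\rangle_0 = -\beta\langle C(\mathbf{r}')A\rangle^T_0 + O(A^2),$$
so we must have that for $\alpha\to\infty$
$$\langle C(\mathbf{r}')\rangle_\epsilon - \langle C(\mathbf{r}')\rangle_0 = -\beta\langle C(\mathbf{r}')A\rangle^T_0$$
exactly. Multiplying by $b(\mathbf{r}'/\alpha)$ and integrating over $\mathbf{r}'$ gives
$$\langle A\rangle_\epsilon - \langle A\rangle_0 = -\beta\langle A^2\rangle^T_0.$$
This is precisely the linear response relation (3.2) with $\delta U = A$, so we have that this relation is exact in the limit $\alpha\to\infty$. By replacing $A$ by $ikA$ it follows from this that the distribution of $A$ must be Gaussian, and thus
$$\Pr(A=u) = (1/2\pi\sigma^2)^{1/2}\,e^{-u^2/2\sigma^2}, \qquad \sigma^2 = \mathrm{Var}(A). \tag{3.19}$$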
3.4. Surface correlations
The behaviour of surface correlations differs drastically from their bulk counterparts [22,23]. In particular, along a half-plane boundary at $y=0$ it is expected that
$$\beta\langle\sigma(x)\sigma(x')\rangle^T = -\frac{1}{2\pi^2(x-x')^2}, \tag{3.20}$$
where $\sigma(x)$ denotes the (smoothed) surface charge density. This means that the usual charge density correlation function $S(\mathbf{r},\mathbf{r}') = \langle C(\mathbf{r})C(\mathbf{r}')\rangle^T$ behaves asymptotically along the wall as $F(y,y';x-x')$, where
$$\int_0^\infty dy\int_0^\infty dy'\,F(y,y';x-x') = -\frac{1}{2\beta\pi^2(x-x')^2}. \tag{3.21}$$
From Eq. (1.36) (with $x$ and $y$ interchanged) and Eq. (2.40) we see that both exact results exhibit this behaviour. We will detail a method of deriving Eq. (3.21) due to Jancovici [23]. The method requires knowledge of $\langle\Phi(\mathbf{r}')\rangle_\epsilon - \langle\Phi(\mathbf{r}')\rangle_0$, where $\Phi(\mathbf{r}')$ denotes the potential at a point $\mathbf{r}'$ due to the charges in the Coulomb system when an external charge $\delta q$, regarded as a perturbation, is fixed at $\mathbf{r}$.
Suppose both $\mathbf{r}$ and $\mathbf{r}'$ are in the system. Then a screening cloud of charge $-\delta q$ will surround the charge $\delta q$ at $\mathbf{r}$. From large distances this appears as a point charge, and so creates a potential at $\mathbf{r}'$ equal to $\delta q\log|\mathbf{r}-\mathbf{r}'|$. Also, due to charge conservation, a charge $\delta q$ will spread itself around the boundaries creating a constant potential $\delta q/C$, where $C$ is the capacitance. Thus
$$\langle\Phi(\mathbf{r}')\rangle_\epsilon - \langle\Phi(\mathbf{r}')\rangle_0 = \delta q\log|\mathbf{r}-\mathbf{r}'| + \delta q/C. \tag{3.22}$$
For an infinite system in two dimensions $1/C$ will diverge (e.g. for a disk of radius $R$, $1/C = -\log R$). However, we will see below that it is only the derivatives $\partial^2/\partial r_\mu\partial r'_\nu$ that are relevant to the derivation of Eq. (3.21).
Now suppose $\mathbf{r}$ is in the system, but $\mathbf{r}'$ is outside. Reasoning as above gives
$$\langle\Phi(\mathbf{r}')\rangle_\epsilon - \langle\Phi(\mathbf{r}')\rangle_0 = \delta q\log|\mathbf{r}-\mathbf{r}'| + \delta q\,F(\mathbf{r}'), \tag{3.23}$$
where $\delta q\,F(\mathbf{r}')$ is the potential at $\mathbf{r}'$ due to the surface charge. In the case $\mathbf{r}$ is outside and the observation point $\mathbf{r}'$ is in, there is no screening cloud but there will be an induced surface charge of zero total charge giving
$$\langle\Phi(\mathbf{r}')\rangle_\epsilon - \langle\Phi(\mathbf{r}')\rangle_0 = -\delta q\,F(\mathbf{r}). \tag{3.24}$$
Finally, suppose both $\mathbf{r}$ and $\mathbf{r}'$ are outside the conductor. Then
$$\langle\Phi(\mathbf{r}')\rangle_\epsilon - \langle\Phi(\mathbf{r}')\rangle_0 = \delta q\log|\mathbf{r}-\mathbf{r}'| + \delta q\,G(\mathbf{r},\mathbf{r}'), \tag{3.25}$$
where $G(\mathbf{r},\mathbf{r}')$ is the potential at the point $\mathbf{r}'$ due to the charge at $\mathbf{r}$ and the induced surface charge. In particular, for a plane boundary, regarding the Coulomb system as a macroscopic conductor, the method of images gives
$$G(z,z') = -\log|z-z'| + \log|z-\bar z'|. \tag{3.26}$$
To make use of these formulas, one uses the linear response relation (3.2) with $A$ chosen to be the potential $\Phi(\mathbf{r}')$ to deduce that
$$\langle\Phi(\mathbf{r}')\rangle_\epsilon - \langle\Phi(\mathbf{r}')\rangle_0 = -\beta\,\delta q\,\langle\Phi(\mathbf{r})\Phi(\mathbf{r}')\rangle^T_0 \tag{3.27}$$
(here the fact that $\delta U = \delta q\,\Phi(\mathbf{r})$ has been used). From Eq. (3.27) and the results (3.22)-(3.25) the electric field-electric field correlation can be computed according to the formula
$$\langle E_\mu(\mathbf{r})E_\nu(\mathbf{r}')\rangle = \frac{\partial^2}{\partial r_\mu\partial r'_\nu}\langle\Phi(\mathbf{r})\Phi(\mathbf{r}')\rangle^T_0 \tag{3.28}$$
(recall $E_\mu(\mathbf{r}) = -\frac{\partial}{\partial r_\mu}\Phi(\mathbf{r})$). Now, according to the laws of two-dimensional electrostatics, the electric field is related to the surface charge:
$$E^{\rm out}_n(\mathbf{r}) - E^{\rm in}_n(\mathbf{r}) = 2\pi\sigma(\mathbf{r}),$$
where the index $n$ denotes the component normal to the surface (outwards is positive) and out (in) refers to the direction from which the limit to the surface is taken. Hence,
$$\langle\sigma(\mathbf{r})\sigma(\mathbf{r}')\rangle^T = \frac{1}{(2\pi)^2}\big\langle\big(E^{\rm out}_n(\mathbf{r}) - E^{\rm in}_n(\mathbf{r})\big)\big(E^{\rm out}_n(\mathbf{r}') - E^{\rm in}_n(\mathbf{r}')\big)\big\rangle,$$
which by use of Eq. (3.28) as well as Eqs. (3.22), (3.23), (3.24) and (3.25) reduces to
$$\beta\langle\sigma(\mathbf{r})\sigma(\mathbf{r}')\rangle^T = -\frac{1}{(2\pi)^2}\,\frac{\partial^2G(\mathbf{r},\mathbf{r}')}{\partial r_n\partial r'_n}\bigg|_{\mathbf{r},\mathbf{r}'\in\,\mathrm{surface}}. \tag{3.29}$$
With $G$ given by Eq. (3.26), and the surface being the line $y=0$, the formula (3.20) results.
Related to the slow decay (3.20) is the fact that the surface layer carries a non-zero dipole moment,
$$2\pi\beta\int_0^\infty dy\,y\int_0^\infty dy'\int_{-\infty}^\infty dx\,S((x,y),(0,y')) = -1. \tag{3.30}$$
In fact, it can be argued from the second BGY equation [18] that for this dipole moment to be non-zero, $S((x,y),(0,y'))$ must decay slower than $O(1/|\mathbf{r}|^3)$ in some direction.
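The step from Eq. (3.29) with the image kernel (3.26) to the surface correlation (3.20) can be checked with a finite-difference evaluation of the mixed normal derivative. The sketch below is illustrative only: the function names and the step size $h$ are choices made here, and the normal direction is taken to be $y$ with the surface at $y = y' = 0$.

```python
import math


def image_green(x, y, xp, yp):
    """G(z, z') = -log|z - z'| + log|z - conj(z')| with z = x + i y and z' = xp + i yp."""
    direct = 0.5 * math.log((x - xp) ** 2 + (y - yp) ** 2)
    image = 0.5 * math.log((x - xp) ** 2 + (y + yp) ** 2)
    return -direct + image


def surface_correlation(x, xp, h=1e-3):
    """-(1/(2 pi)^2) d^2 G / dy dy' at y = y' = 0, using a four-point central difference."""
    mixed = (image_green(x, h, xp, h) - image_green(x, h, xp, -h)
             - image_green(x, -h, xp, h) + image_green(x, -h, xp, -h)) / (4.0 * h * h)
    return -mixed / (2.0 * math.pi) ** 2


def expected(x, xp):
    """Right-hand side of Eq. (3.20): -1 / (2 pi^2 (x - x')^2)."""
    return -1.0 / (2.0 * math.pi ** 2 * (x - xp) ** 2)


if __name__ == "__main__":
    for separation in (0.5, 1.5, 4.0):
        print(separation, surface_correlation(0.0, separation), expected(0.0, separation))
```

The two columns agree up to a relative error of order $h^2/(x-x')^2$, and the result depends on the separation only through $1/(x-x')^2$.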
4. Solvable model with a pairing transition

4.1. Kosterlitz-Thouless transition
The 2dCG exhibits a low-density phase transition as the coupling $\Gamma$ is varied. It is referred to as the Kosterlitz-Thouless phase transition, in honour of the work of Kosterlitz and Thouless [24], who gave an iterated mean field analysis in the neighbourhood of the low-density critical point ($\Gamma=4$). The transition is from a high-temperature (small $\Gamma$) conductive phase in which the positive and negative charges are dissociated, to a low-temperature (large $\Gamma$) insulator phase in which the positive and negative charges are bound together in dipole pairs. In the conductive phase the system perfectly screens an external charge density in the long-wavelength limit, so the Stillinger-Lovett sum rule (3.9) holds, while in the dipole phase only some fraction $(1-(1/\epsilon))$, where $\epsilon$ is the dielectric constant, is screened. The r.h.s. of Eq. (3.9) must then be multiplied by this factor.
An essential ingredient in the analysis of Kosterlitz and Thouless is the so-called nested dipole hypothesis for the dominant contributions to $\epsilon$ as the transition is approached from the dipole phase. In a descriptive sense, this hypothesis gives that the dominant configurations are those in which there is a hierarchy of nested dipoles of increasing size, in which the small dipoles 'screen' the larger dipoles.
Below we will review a solvable model exhibiting a pairing transition, and the associated analysis of its critical point, which is based on the Kosterlitz-Thouless nested dipole hypothesis. The model therefore provides an opportunity to test the validity of the hypothesis.

4.2. The solvable model
The model is defined in the grand canonical ensemble. The particles, which are two-dimensional charges of like charge $q$, are confined to the half strip $x\in[-L/2,L/2]$, $y\ge d$, with periodic boundary conditions in the $x$-direction. The region $y<0$ is a perfect conductor, so the potential at point $\mathbf{x}' = (x',y')$ due to a charge $q$ at point $\mathbf{x} = (x,y)$ is given by
$$\phi(\mathbf{x}',\mathbf{x}) = -q\log\left|\frac{\sin(w-z)}{\sin(w-\bar z)}\right|,$$
where $w := \pi(x'+iy')/L$, $z := \pi(x+iy)/L$. Also imposed on the system is an external one body potential with Boltzmann factor $e^{-\beta V(y)} = y^{-\alpha}$. The total Boltzmann factor is therefore
$$W_{N\Gamma} = \left(\frac{i\pi}{L}\right)^{N\Gamma/2}\prod_{j=1}^Ny_j^{-\alpha}\left|\frac{\prod_{1\le j<k\le N}\sin(w_k-w_j)\sin(\bar w_k-\bar w_j)}{\prod_{j,k=1}^N\sin(w_j-\bar w_k)}\right|^{\Gamma/2}. \tag{4.1}$$
We will see below that in equilibrium there are only a finite number of particles per unit length of the interface. This is because the particles in the system all have the same sign, but there is no neutralizing background. The attraction of the particles by their images to the interface is only a surface effect, which is outweighed by the interparticle repulsion in the bulk. Also, we will see below that there are two distinct phases in the system: a phase for $\alpha\ge 1$ in which the particles and their images are dissociated, in the sense that their mean separation as measured by the first moment of the particle density perpendicular to the wall diverges, and a phase for $\alpha<1$ in which this first moment is finite. Thus there is a pairing of the particle and its image to form a dipole, and thus an analogy with the Kosterlitz-Thouless transition.
The model with Boltzmann factor (4.1) is exactly solvable at the coupling $\Gamma=2$ [25]. To see this, note that use of the Cauchy double alternant formula (2.2) with $x_j = e^{2iw_j}$, $y_j = e^{2i\bar w_j}$, gives
$$W_{N2} = \left(\frac{i\pi}{L}\right)^N\prod_{j=1}^Ny_j^{-\alpha}\ \det\left[\frac{1}{\sin(w_j-\bar w_k)}\right]_{j,k=1,\dots,N}.$$
Hence the generalized grand partition function can be written
$$\Xi[a] = \sum_{N=0}^\infty\frac{1}{N!}\left(\frac{\pi i\zeta}{L}\right)^N\prod_{l=1}^N\int_{-L/2}^{L/2}dx_l\int_d^\infty dy_l\,y_l^{-\alpha}a(x_l,y_l)\ \det\left[\frac{1}{\sin(w_j-\bar w_k)}\right]_{j,k=1,\dots,N} = \det(1+\zeta K), \tag{4.2}$$
where $K$ is the integral operator defined on the region $(x,y)\in[-L/2,L/2]\times[d,\infty)$ with kernel
$$\frac{\pi i}{L}\,\frac{a(x,y)\,y^{-\alpha}}{\sin\pi(x'-x+i(y'+y))/L}.$$
In particular, this means that the grand partition function itself is given by
$$\Xi = \prod_j(1+\zeta\lambda_j) \tag{4.3}$$
P.J. Forrester / Physics Reports 301 (1998) 235—270
where the $\lambda_j$ denote the eigenvalues of the eigenvalue equation
$$\frac{\pi i}{L}\int_{\delta}^{\infty}\frac{dy}{y^{\alpha}}\int_{-L/2}^{L/2}dx\,\frac{v(x,y)}{\sin\bigl(\pi(x'-x+i(y'+y))/L\bigr)} = \lambda\,v(x',y') . \tag{4.4}$$
The anti-periodicity of the integrand in $x'$ suggests that we seek eigenfunctions of the form
$$v(x,y) = e^{-\pi i(2l+1)x/L}\,g(y), \qquad l\in\mathbb{Z} . \tag{4.5}$$
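Substituting Eq. (4.5) in Eq. (4.4) and using the result
$$\int_{-L/2}^{L/2}\frac{e^{\pi i x(2l+1)/L}}{\sin\bigl(\pi(x+i\phi)/L\bigr)}\,dx = \begin{cases} 0, & l\ge0 ,\\ -2iL\,e^{\pi\phi(2l+1)/L}, & l<0 , \end{cases} \tag{4.6}$$
valid for $\phi>0$, we see that for $l\ge0$ the eigenvalue is zero, while for $l<0$ we have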
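$$\int_{\delta}^{\infty}\frac{dy}{y^{\alpha}}\,e^{\pi y(2l+1)/L}\,g(y) = \frac{\lambda}{2\pi}\,e^{-\pi y'(2l+1)/L}\,g(y') .$$
Since the left-hand side is independent of $y'$ we must have $g(y)=e^{\pi y(2l+1)/L}$, which gives
$$\lambda = 2\pi\int_{\delta}^{\infty}\frac{dy}{y^{\alpha}}\,e^{2\pi y(2l+1)/L} ,$$
and thus, relabelling $l\to-l$ so that $l=1,2,\dots$,
$$\Xi(\zeta) = \prod_{l=1}^{\infty}\left(1+2\pi\zeta\int_{\delta}^{\infty}\frac{dy}{y^{\alpha}}\,e^{-4\pi y(l-1/2)/L}\right) . \tag{4.7}$$
We therefore have for the one-dimensional pressure
$$\beta P := \lim_{L\to\infty}\frac{1}{L}\log\Xi(\zeta) = \int_0^{\infty}dt\,\log\left(1+2\pi\zeta\int_{\delta}^{\infty}\frac{dy}{y^{\alpha}}\,e^{-4\pi yt}\right) . \tag{4.8}$$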
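The passage from the product (4.7) to the integral (4.8) is a Riemann sum. A short sketch of that step follows; the grid points $t_l$ are a shorthand introduced here for illustration only.

```latex
\begin{align*}
\frac{1}{L}\log\Xi(\zeta)
 &= \frac{1}{L}\sum_{l=1}^{\infty}
    \log\Bigl(1+2\pi\zeta\int_\delta^\infty\frac{dy}{y^\alpha}\,e^{-4\pi y t_l}\Bigr),
    \qquad t_l := \frac{l-\tfrac12}{L},\\
 &\xrightarrow[L\to\infty]{}
    \int_0^\infty dt\,
    \log\Bigl(1+2\pi\zeta\int_\delta^\infty\frac{dy}{y^\alpha}\,e^{-4\pi y t}\Bigr).
\end{align*}
```

The $t_l$ are the midpoints of a grid of spacing $1/L$ on $[0,\infty)$, so the sum divided by $L$ is the midpoint-rule approximation of the $t$-integral.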
4.2.1. Singularities
In general, the singular points of the pressure as a function of the fugacity $\zeta$ occur at the zeros of the grand partition function in the thermodynamic limit. From Eq. (4.8), these zeros occur when
$$\zeta = -\frac{1}{2\pi\int_{\delta}^{\infty}\frac{dy}{y^{\alpha}}\,e^{-4\pi yt}} , \qquad 0\le t<\infty . \tag{4.9}$$
For $\alpha\le1$, any $\zeta\in(-\infty,0]$ satisfies this equation, while for $\alpha>1$ the solution of Eq. (4.9) is restricted to the interval $\zeta\in(-\infty,\zeta_0]$, where $\zeta_0$ is the value of the right-hand side at $t=0$:
$$\zeta_0 = -\frac{\alpha-1}{2\pi\,\delta^{1-\alpha}} .$$
As a consequence, for $\alpha\le1$ the pressure is not an analytic function of $\zeta$ at $\zeta=0$, while for $\alpha>1$ the pressure is an analytic function of $\zeta$ at $\zeta=0$ with radius of convergence $|\zeta_0|$.
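The location of the zeros described by Eq. (4.9) can be explored numerically. The following is a minimal sketch, not part of the original analysis: the function names, the midpoint-rule quadrature and the cutoff at $e^{-40}$ are all choices made here for illustration. It evaluates the end point $\zeta_0$ from the $t=0$ integral and the zero $\zeta(t)$ for $t>0$.

```python
import math


def zeta0(alpha, delta):
    """End point of the zero locus for alpha > 1: Eq. (4.9) evaluated at t = 0."""
    if alpha <= 1:
        raise ValueError("the t = 0 integral diverges for alpha <= 1")
    integral_at_t0 = delta ** (1 - alpha) / (alpha - 1)  # int_delta^inf y^(-alpha) dy
    return -1.0 / (2 * math.pi * integral_at_t0)


def zeta_of_t(t, alpha, delta, steps=20000):
    """Eq. (4.9) for t > 0, with the y-integral done by the midpoint rule."""
    # Beyond y_max the factor exp(-4 pi y t) is smaller than exp(-40).
    y_max = delta + 40.0 / (4 * math.pi * t)
    h = (y_max - delta) / steps
    total = 0.0
    for k in range(steps):
        y = delta + (k + 0.5) * h
        total += y ** (-alpha) * math.exp(-4 * math.pi * y * t)
    return -1.0 / (2 * math.pi * total * h)


if __name__ == "__main__":
    alpha, delta = 2.0, 1.0  # illustrative values only
    print("zeta_0     =", zeta0(alpha, delta))
    for t in (0.05, 0.5):
        print("zeta(t=%g) =" % t, zeta_of_t(t, alpha, delta))
```

For $\alpha>1$ the $y$-integral decreases as $t$ grows, so $\zeta(t)$ moves from $\zeta_0$ towards $-\infty$, which is the interval $(-\infty,\zeta_0]$ quoted above.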
The leading non-analytic term in the expansion of the pressure as a function of $\zeta$ about $\zeta=0$ for $\alpha<1$ comes from the small-$t$ portion of the integrand. Now, for $t\to0$ and $\alpha<1$,
$$\int_{\delta}^{\infty}\frac{dy}{y^{\alpha}}\,e^{-4\pi yt} \sim (4\pi t)^{\alpha-1}\,\Gamma(1-\alpha) . \tag{4.10}$$
After truncating the upper terminal of the $t$-integration in Eq. (4.8) at some small positive value $\mu$ say, Eq. (4.10) can be substituted in the integrand, which after the change of variables $t=h(\zeta,\alpha)\,s$ gives
$$\beta P \underset{\text{sing.}}{\sim} h(\zeta,\alpha)\int_0^{\mu/h(\zeta,\alpha)}\log\bigl(1+s^{\alpha-1}\bigr)\,ds , \tag{4.11}$$
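where $h(\zeta,\alpha) = \frac{1}{4\pi}\bigl(2\pi\zeta\,\Gamma(1-\alpha)\bigr)^{1/(1-\alpha)}$. For $\zeta$ small enough, $\mu/h(\zeta,\alpha)$ is always larger than unity, so the above integral can be evaluated by breaking up the range of integration into two segments: $[0,1)$ and $(1,\mu/h(\zeta,\alpha))$. Appropriate series expansions of the logarithm, and use of the identity
$$\sum_{k=-\infty}^{\infty}\frac{(-1)^{k-1}\,x}{k^2-x^2} = \frac{\pi}{\sin\pi x} ,$$
gives
$$\int_0^{\mu/h(\zeta,\alpha)}\log\bigl(1+s^{\alpha-1}\bigr)\,ds = \frac{\pi}{\sin\bigl(\pi/(1-\alpha)\bigr)} + c\sum_{n=1}^{\infty}(-1)^{n+1}\frac{c^{-(1-\alpha)n}}{n\bigl(1-(1-\alpha)n\bigr)} , \qquad c=\mu/h(\zeta,\alpha) , \tag{4.12}$$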
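assuming $1/(1-\alpha)\notin\mathbb{Z}^+$. For $1/(1-\alpha)\in\mathbb{Z}^+$, taking the limit $1/(1-\alpha)\to m\in\mathbb{Z}^+$ in (4.12) gives
$$\int_0^{\mu/h(\zeta,\alpha)}\log\bigl(1+s^{\alpha-1}\bigr)\,ds = \frac{(-1)^m}{m}\bigl(-\log c-1\bigr) + c\sum_{\substack{n=1\\ n\ne m}}^{\infty}(-1)^{n+1}\frac{c^{-(1-\alpha)n}}{n\bigl(1-(1-\alpha)n\bigr)} . \tag{4.13}$$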
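Substituting Eqs. (4.12) and (4.13) in Eq. (4.11) shows that only the first term in each expression gives a singular term in $\zeta$. We find
$$\beta P \underset{\text{sing.}}{\sim} \begin{cases} \dfrac{\pi}{\sin\bigl(\pi/(1-\alpha)\bigr)}\,\dfrac{1}{4\pi}\bigl(2\pi\zeta\,\Gamma(1-\alpha)\bigr)^{1/(1-\alpha)} , & \dfrac{1}{1-\alpha}\notin\mathbb{Z}^+ ,\\[2ex] \dfrac{(-1)^m}{4\pi}\bigl(2\pi\zeta\,\Gamma(1-\alpha)\bigr)^{1/(1-\alpha)}\log\zeta , & \dfrac{1}{1-\alpha}=m,\ m\in\mathbb{Z}^+ . \end{cases} \tag{4.14}$$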
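The series representation (4.12), from which the first case of (4.14) follows, can be checked against a direct quadrature. The sketch below is only an illustration: the function names, the midpoint rule, the truncation of the series at 200 terms and the sample values $\alpha=0.3$, $c=5$ are assumptions made here, not taken from the text.

```python
import math


def lhs_integral(alpha, c, steps=200000):
    """Midpoint-rule value of int_0^c log(1 + s^(alpha-1)) ds.

    The integrand has an integrable logarithmic singularity at s = 0,
    which the midpoint rule never samples directly.
    """
    h = c / steps
    return h * sum(math.log1p(((k + 0.5) * h) ** (alpha - 1)) for k in range(steps))


def rhs_series(alpha, c, terms=200):
    """Right-hand side of Eq. (4.12); needs c > 1 and 1/(1-alpha) not a positive integer."""
    a = 1.0 - alpha
    series = sum(
        (-1) ** (n + 1) * c ** (-a * n) / (n * (1 - a * n))
        for n in range(1, terms + 1)
    )
    return math.pi / math.sin(math.pi / a) + c * series


if __name__ == "__main__":
    alpha, c = 0.3, 5.0  # illustrative values only
    print("quadrature:", lhs_integral(alpha, c))
    print("series    :", rhs_series(alpha, c))
```

The constant $\pi/\sin(\pi/(1-\alpha))$ is the only part of the right-hand side that does not scale as an integer power of $\zeta$ once multiplied by $h(\zeta,\alpha)$, which is why it alone carries the singular behaviour.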
One deduces from Eq. (4.14) that $\beta P$ can be expanded as a power series in $\zeta$ up to order $[1/(1-\alpha)]-1$. Since this order becomes infinite as $\alpha\to1^-$, this is consistent with the fact that $\beta P$ is an analytic function of $\zeta$ for $\alpha>1$. Furthermore, for $\alpha\to1^-$, Eq. (4.14) exhibits an essential singularity as a function of $\alpha$, although from Eq. (4.8) it can be seen that for fixed $\zeta$, $\beta P$ is an analytic function of $\alpha$ across the boundary $\alpha=1$.

4.3. Distribution functions
The functional differentiation formula (1.29) applied to Eq. (4.2) gives that
$$\rho_n(\mathbf{r}_1,\dots,\mathbf{r}_n) = \det\bigl[G(\mathbf{r}_j,\mathbf{r}_k)\bigr]_{j,k=1,\dots,n} , \tag{4.15}$$
where
$$G(\mathbf{r},\mathbf{r}') = \zeta\,\langle\mathbf{r}|K(1+\zeta K)^{-1}|\mathbf{r}'\rangle . \tag{4.16}$$
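Evaluating Eq. (4.16) according to the method of Section 2.6 shows
$$G(\mathbf{r},\mathbf{r}') = G(x-x';y,y') = \frac{2\pi\zeta}{y'^{\alpha}}\int_0^{\infty}d\gamma\,\frac{e^{-2\pi\gamma(y+y')}\,e^{2\pi i(x-x')\gamma}}{1+2\pi\zeta\int_{\delta}^{\infty}\frac{du}{u^{\alpha}}\,e^{-4\pi\gamma u}} . \tag{4.17}$$
In particular,
$$\rho(y) = G(0;y,y) = \frac{2\pi\zeta}{y^{\alpha}}\int_0^{\infty}d\gamma\,\frac{e^{-4\pi\gamma y}}{1+2\pi\zeta\int_{\delta}^{\infty}\frac{du}{u^{\alpha}}\,e^{-4\pi\gamma u}} . \tag{4.18}$$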
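Now, integration by parts shows that provided $\alpha>1$, $\rho(y)=O(1/y^{\alpha+1})$. Thus
$$\langle y\rho(y)\rangle := \int_{\delta}^{\infty}y\,\rho(y)\,dy \tag{4.19}$$
is finite. Explicitly, we find
$$\langle y\rho(y)\rangle = \frac{1}{4\pi}\log\left(1+\frac{2\pi\zeta\,\delta^{1-\alpha}}{\alpha-1}\right) . \tag{4.20}$$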
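One way to arrive at a closed form for the first moment is to recognise a total derivative in Eq. (4.18). The sketch below assumes that the $y$- and $\gamma$-integrations may be interchanged; the shorthand $I(\gamma)$ is introduced here for illustration only.

```latex
\begin{align*}
I(\gamma) &:= \int_\delta^\infty \frac{du}{u^\alpha}\,e^{-4\pi\gamma u},
 \qquad
 \frac{dI}{d\gamma} = -4\pi\int_\delta^\infty dy\,y^{1-\alpha}\,e^{-4\pi\gamma y},\\
\langle y\rho(y)\rangle
 &= 2\pi\zeta\int_0^\infty d\gamma\,
    \frac{\int_\delta^\infty dy\,y^{1-\alpha}\,e^{-4\pi\gamma y}}{1+2\pi\zeta I(\gamma)}
  = -\frac{1}{4\pi}\int_0^\infty d\gamma\,
    \frac{d}{d\gamma}\log\bigl(1+2\pi\zeta I(\gamma)\bigr)\\
 &= \frac{1}{4\pi}\log\bigl(1+2\pi\zeta I(0)\bigr),
 \qquad I(0)=\frac{\delta^{1-\alpha}}{\alpha-1}\quad(\alpha>1).
\end{align*}
```

The last step uses $I(\gamma)\to0$ as $\gamma\to\infty$. The argument of the logarithm vanishes at the fugacity where the $t=0$ zero of Eq. (4.9) sits.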
On the other hand, for $\alpha\le1$, the change of variables $r=4\pi y\gamma$ and then $s=ru/y$ allows us to deduce that for large $y$
$$\rho(y)\sim\frac{1-\alpha}{4\pi y^2}\quad(\alpha\ne1), \qquad \rho(y)\sim\frac{1}{4\pi y^2\log y}\quad(\alpha=1) ,$$
so $\langle y\rho(y)\rangle$ is infinite. Hence, indeed the phase transition is a pairing transition from a state in which the particles and their images are dissociated ($\alpha\le1$) to a state in which they are separated by a finite distance ($\alpha>1$).

4.4. Resummation analysis
A feature of the exact solution is that the pressure and correlation functions are analytic functions of the fugacity $\zeta$ in the neighbourhood of the origin $\zeta=0$ for $\alpha>1$. For general $\Gamma$ we expect this feature to persist for $\Gamma+2\alpha>4$, and in particular $\langle y\rho(y)\rangle$ as defined by Eq. (4.19) is
expected to diverge as $\Gamma+2\alpha\to4^+$ (this will be verified below). Here, following [26], we will study the asymptotic density profile $\tilde\rho(y)$, which is defined as the portion of the low-fugacity expansion of $\rho(y)$ that gives the correct leading order singular behaviour of each term in the low-fugacity expansion of $\langle y\rho(y)\rangle$ for $\Gamma+2\alpha\to4^+$. Our strategy is to explicitly determine the configurations at order $\zeta^2$ of $\rho(y)$ which contribute to the leading order singularity of $\tilde\rho(y)$ at order $\zeta^2$. This will turn out to have an interpretation as the partial screening of a fixed particle-image pair of separation $2y$ by a smaller pair of separation $2y_1$. Analogous to the idea of Kosterlitz and Thouless, we will then proceed under the assumption that all configurations contributing to $\tilde\rho(y)$ at order $\zeta^n$ are nested chains of particle-image pairs. This is analogous to the strategy first used in Ref. [27] to study the Kosterlitz-Thouless transition in the 2dCG.

4.4.1. The expansion at $O(\zeta^2)$
Now, for a general one-component system near an interface, the low-fugacity expansion of the density can be computed by introducing in the definition of the grand partition function a position-dependent fugacity $\zeta(y)$. Then, from the formula
$$\rho(y) = \zeta\,\frac{\delta}{\delta a(y)}\log\Xi[\zeta,a(y)]\Big|_{a(y)=1}$$
we find
$$\begin{aligned} \rho(y) &= \zeta\,\frac{\delta}{\delta a(y)}Z_1[a]\Big|_{a(y)=1} + \zeta^2\left(\frac{\delta}{\delta a(y)}Z_2\Big|_{a(y)=1} - Z_1\,\frac{\delta}{\delta a(y)}Z_1\Big|_{a(y)=1}\right) + O(\zeta^3)\\ &= \frac{\zeta}{(2y)^{\Gamma/2}y^{\alpha}} + \frac{\zeta^2}{(2y)^{\Gamma/2}y^{\alpha}}\int_{-\infty}^{\infty}dx_1\int_{\delta}^{\infty}dy_1\,\frac{1}{(2y_1)^{\Gamma/2}y_1^{\alpha}}\left\{\left(\frac{(x-x_1)^2+(y-y_1)^2}{(x-x_1)^2+(y+y_1)^2}\right)^{\Gamma/2}-1\right\} + O(\zeta^3) . \end{aligned} \tag{4.21}$$
Substituting the term $O(\zeta)$ from Eq. (4.21) in (4.19) gives
$$\langle y\rho(y)\rangle^{(1)} = \frac{\zeta\,2^{-\Gamma/2}\,\delta^{-\Gamma/2-\alpha+2}}{\Gamma/2+\alpha-2}$$
(the superscript (1) indicates the order of $\zeta$), which indeed diverges as $\Gamma+2\alpha\to4^+$. Since there is only one term at order $\zeta$, trivially $\tilde\rho^{(1)}(y)=\rho^{(1)}(y)$. At order $O(\zeta^2)$, we break the integration over $y_1$ in the double integral into two intervals: $[\delta,y]$ and $[y,\infty)$. The change of variables $y_1\to yv_1$ shows that the latter interval of integration gives a contribution to $\langle y\rho(y)\rangle^{(2)}$ which is $O(1/(\Gamma+2\alpha-4))$. For the interval $[\delta,y]$, use of the large-$y$ expansion
$$\left(\frac{(x-x_1)^2+(y-y_1)^2}{(x-x_1)^2+(y+y_1)^2}\right)^{\Gamma/2}-1 \sim -\frac{2\Gamma\,y\,y_1}{x^2+y^2}$$
and integration over $x$ gives a contribution to the asymptotic expansion of $\rho^{(2)}(y)$ equal to
$$\frac{\zeta^2}{(2y)^{\Gamma/2}y^{\alpha}}\int_{\delta}^{y}dy_1\,S_y(y_1) , \tag{4.22}$$
where
$$S_y(y_1) := -\frac{2\pi\Gamma}{(2y_1)^{\Gamma/2}\,y_1^{\alpha-1}} , \qquad \delta\le y_1\le y . \tag{4.23}$$
Substitution into Eq. (4.19) gives a contribution to $\langle y\rho(y)\rangle^{(2)}$ which is $O(1/(\Gamma+2\alpha-4)^2)$, which is more singular than the contribution from the interval $[y,\infty)$. Thus $\tilde\rho^{(2)}(y)$ is given by (4.22), and this quantity indeed has the nested dipole interpretation alluded to above.

4.5. Expansion for general $O(\zeta^n)$ and resummation
The nested dipole hypothesis allows the contribution to $\tilde\rho^{(n)}(y)$ to be represented by labelled rooted trees (root at $y$). At order $\zeta^4$, for example, there are four distinct rooted trees given in Fig. 2. The ordering $y\ge y_1\ge y_2\ge y_3\ge\delta$ is equivalent to the nesting of the particle-image pairs so that each pair screens a pair of smaller size. For example, the third graph in Fig. 2 contributes
$$3\int_{\delta}^{y}dy_1\,S_y(y_1)\left(\int_{\delta}^{y_1}dy_2\,S_{y_1}(y_2)\right)^2$$
(the factor 3 accounts for the labelling degeneracy). This tree structure implies [27,26] that the sum over $n$ of $\tilde\rho^{(n)}(y)$, $\tilde\rho(y)$, satisfies the non-linear integral equation
$$\tilde\rho(y) = \frac{\zeta}{(2y)^{\Gamma/2}y^{\alpha}}\exp\left(\int_{\delta}^{y}dy_1\,S_y(y_1)\,(2y_1)^{\Gamma/2}y_1^{\alpha}\,\tilde\rho(y_1)\right) = \frac{\zeta}{(2y)^{\Gamma/2}y^{\alpha}}\exp\left(-2\pi\Gamma\int_{\delta}^{y}dy_1\,y_1\,\tilde\rho(y_1)\right) , \tag{4.24}$$
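where to obtain the second equality (4.23) has been used. Differentiation of (4.24) with respect to $y$ gives
$$\frac{d}{dy}g(y) = -\frac{2\pi\Gamma}{y^{\Gamma/2+\alpha-1}}\bigl(g(y)\bigr)^2 , \qquad g(y) := y^{\Gamma/2+\alpha}\,\tilde\rho(y) , \tag{4.25}$$
which is equivalent to Eq. (4.24) if we impose the initial condition $\tilde\rho(\delta)=\zeta/\bigl((2\delta)^{\Gamma/2}\delta^{\alpha}\bigr)$. Although Eq. (4.25) is non-linear, it is separable and easily solved to give
$$\tilde\rho(y) = \frac{\zeta}{(2y)^{\Gamma/2}y^{\alpha}}\,\frac{1}{1+\dfrac{2\pi\Gamma\zeta}{2^{\Gamma/2}(\Gamma/2+\alpha-2)}\bigl(\delta^{2-\Gamma/2-\alpha}-y^{2-\Gamma/2-\alpha}\bigr)} . \tag{4.26}$$
Substituting in Eq. (4.19) gives
$$\langle y\rho(y)\rangle = \frac{1}{2\pi\Gamma}\log\left(1+\frac{2\pi\Gamma\zeta\,\delta^{2-\Gamma/2-\alpha}}{2^{\Gamma/2}(\Gamma/2+\alpha-2)}\right) . \tag{4.27}$$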
Fig. 2. Graphical representation of the four distinct chains at O(f4).
Although it has been assumed in the derivation that $\Gamma+2\alpha\to4^+$, we note that both Eqs. (4.26) and (4.27) are well defined for all $\Gamma+2\alpha>4$. In particular, setting $\Gamma=2$ we see that (4.27) reproduces Eq. (4.20) not only in the asymptotic limit $\alpha\to1^+$, but is in fact exact for all $\alpha>1$. As the only assumption leading to Eq. (4.27) was the nested dipole hypothesis of Kosterlitz and Thouless, the agreement with the exact result proves its validity at $\Gamma=2$.
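The chain (4.25) to (4.27) lends itself to a quick numerical consistency check. The sketch below is an illustration under stated assumptions: the function names, the finite-difference step, the substitution $y=\delta/u$ used for the quadrature, and the sample parameters are all chosen here and do not come from the text.

```python
import math


def rho_tilde(y, zeta, gamma, alpha, delta):
    """Closed form (4.26) for the asymptotic density profile."""
    k = gamma / 2 + alpha - 2  # positive when Gamma + 2 alpha > 4
    c = 2 * math.pi * gamma * zeta / (2 ** (gamma / 2) * k)
    prefactor = zeta / ((2 * y) ** (gamma / 2) * y ** alpha)
    return prefactor / (1 + c * (delta ** -k - y ** -k))


def ode_residual(y, zeta, gamma, alpha, delta, step=1e-5):
    """Left minus right side of Eq. (4.25), with dg/dy from a central difference."""
    power = gamma / 2 + alpha

    def g(s):
        return s ** power * rho_tilde(s, zeta, gamma, alpha, delta)

    dg = (g(y + step) - g(y - step)) / (2 * step)
    return dg + 2 * math.pi * gamma * y ** (1 - power) * g(y) ** 2


def first_moment_closed(zeta, gamma, alpha, delta):
    """Closed form (4.27)."""
    k = gamma / 2 + alpha - 2
    c = 2 * math.pi * gamma * zeta / (2 ** (gamma / 2) * k)
    return math.log(1 + c * delta ** -k) / (2 * math.pi * gamma)


def first_moment_numeric(zeta, gamma, alpha, delta, steps=100000):
    """int_delta^inf y rho~(y) dy by the midpoint rule after substituting y = delta / u."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        y = delta / u
        total += y * rho_tilde(y, zeta, gamma, alpha, delta) * delta / u ** 2
    return total * h


if __name__ == "__main__":
    params = (0.3, 2.0, 2.5, 1.0)  # zeta, Gamma, alpha, delta: illustrative values
    print("ODE residual at y = 2:", ode_residual(2.0, *params))
    print("closed form (4.27)   :", first_moment_closed(*params))
    print("numerical moment     :", first_moment_numeric(*params))
```

At $\Gamma=2$ the closed form reduces to $\frac{1}{4\pi}\log\bigl(1+2\pi\zeta\,\delta^{1-\alpha}/(\alpha-1)\bigr)$, which is the comparison with the exact solution made in the text.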
Acknowledgements I thank A. Alastuey for comments on the manuscript, and the organisers and participants of the summer school for following these lectures. The support of the ARC is acknowledged.
References
[1] J. Ginibre, J. Math. Phys. 6 (1965) 440.
[2] C. Cohen-Tannoudji, B. Diu, F. Laloë, Mécanique Quantique, Hermann, Paris, 1977.
[3] A. Alastuey, B. Jancovici, J. Physique 42 (1981) 1.
[4] A. Erdélyi, Higher Transcendental Functions, McGraw-Hill, New York, 1953.
[5] B. Jancovici, G. Manificat, C. Pisani, J. Stat. Phys. 76 (1994) 307.
[6] P.J. Forrester, E.R. Smith, J. Phys. A 15 (1982) 3861.
[7] B. Jancovici, Phys. Rev. Lett. 46 (1981) 386.
[8] E.R. Smith, J. Phys. A 15 (1982) 3861.
[9] M. Gaudin, J. Phys. (Paris) 46 (1985) 1027.
[10] F. Cornu, B. Jancovici, J. Chem. Phys. 90 (1989) 2444.
[11] P.J. Forrester, J. Stat. Phys. 67 (1992) 433.
[12] P.J. Forrester, C. Pisani, Nucl. Phys. B 374 (1992) 720.
[13] F. Cornu, B. Jancovici, J. Stat. Phys. 49 (1987) 33.
[14] P.J. Forrester, J. Chem. Phys. 95 (1991) 4345.
[15] P.J. Forrester, J. Stat. Phys. 67 (1992) 433.
[16] B. Jancovici, G. Téllez, J. Stat. Phys. 82 (1996) 609.
[17] J.L. Cardy, I. Peschel, Nucl. Phys. B 300 (1988) 377.
[18] Ph.A. Martin, Rev. Mod. Phys. 60 (1988) 1075.
[19] A. Alastuey, Ph.A. Martin, J. Stat. Phys. 39 (1985) 405.
[20] A. Alastuey, B. Jancovici, J. Stat. Phys. 34 (1984) 557.
[21] H.D. Politzer, Phys. Rev. B 40 (1989) 11917.
[22] B. Jancovici, J. Stat. Phys. 29 (1982) 263.
[23] B. Jancovici, J. Stat. Phys. 80 (1995) 445.
[24] J.M. Kosterlitz, D.J. Thouless, J. Phys. C 6 (1973) 1181.
[25] P.J. Forrester, Int. J. Mod. Phys. A 7 (supp. 1A) 303.
[26] A. Alastuey, P.J. Forrester, J. Stat. Phys. 79 (1995) 503.
[27] A. Alastuey, F. Cornu, J. Stat. Phys. 66 (1992) 165.
Physics Reports 301 (1998) 271—292
Exact solution of random tiling models Bernard Nienhuis* Instituut voor Theoretische Fysica, Universiteit van Amsterdam, Valckenierstraat 65, 1018 XE Amsterdam, Netherlands
Abstract The quasicrystalline state of matter and the role of quasiperiodicity are discussed. Both energetic and entropic mechanisms may stabilize the quasicrystalline phase. For systems where entropy plays the dominant role, random tiling models are the appropriate description. These are discrete statistical models, but without an underlying lattice. A few quasicrystalline random tilings have been solved exactly, in the sense that the free energy has been calculated analytically in the thermodynamic limit. Besides a quasicrystalline phase, the models also have incommensurate phases of which the rotation symmetry is that of an ordinary crystal. The quasicrystalline phase maximizes the entropy. © 1998 Elsevier Science B.V. All rights reserved. PACS: 61.44.Br; 05.70.Ce; 02.30.Rz Keywords: Random tilings; Quasicrystals; Exact solution; Quasiperiodicity
1. Quasicrystals and quasiperiodicity
It is often assumed that the solid state of matter is dominated by the crystalline phase. So much so, that the word solid is often used as a synonym of crystal. Even so, the classical standard of the three thermodynamic states, solid, liquid and gas, has been unable to withstand the continuous wear and tear of the research of the last few decades. Materials like glasses and liquid crystals can be viewed as states in between the crystal and the liquid phase, in the sense that they combine properties of both. Quasicrystals, however, are heretics entirely within the realm of solids. Crystals are characterized by discrete versions of the fundamental spatial symmetry groups, rotations and translations. It is a well-known fact that their rotational symmetry groups have only elements of order two, three, four and six. This can readily be seen as follows. Suppose a structure is invariant under a specific finite rotation $R$ and $\mathbf{a}$ is the smallest vector over which it is translation
* E-mail: [email protected].
0370-1573/98/$19.00 ( 1998 Elsevier Science B.V. All rights reserved PII S 0 3 7 0 - 1 5 7 3 ( 9 8 ) 0 0 0 1 3 - 1
invariant. Then it is also invariant under translations over linear combinations of $R^m\mathbf{a}$ with integer $m$. When, in two or three dimensions, the rotation $R$ is over an angle $2\pi/n$, with $n=5$ or $n>6$, this set of linear combinations always contains elements smaller than $\mathbf{a}$, contradicting the premise.

In spite of this argument, in 1984 Shechtman et al. [1] discovered that a rapidly quenched Al-Mn alloy displays icosahedral symmetry. This symmetry group, shared by the regular icosahedron and dodecahedron, has five-fold axes, and is therefore forbidden by the just recounted basic rule of crystallography. Not only did the electron diffraction patterns show ten-fold symmetry, they also consisted of clear Bragg peaks, generally a sign of lattice periodicity. Since then, many other compounds have been found which combine non-crystallographic symmetries and sharp Bragg peaks, some in equilibrium and others in a metastable phase. These materials are now called quasicrystals [2,3].

The theory to describe this discovery had already been developed by Penrose [4] and De Bruijn [5]. They had described patterns of which the Fourier transform (diffraction pattern) (i) consists of distinct $\delta$-functions, and (ii) shows non-crystallographic symmetries. Interestingly, even the inventor of these structures had not anticipated [6] a spontaneous emergence of this type of order in nature.

Periodic functions have a discrete Fourier spectrum, or in other words they can be written as a Fourier sum, rather than a Fourier integral. This discreteness of the Fourier transform can be reconciled with non-crystallographic symmetries by the following construction:
$$W(\mathbf{r}) = \sum_{\mathbf{m}\in\mathbb{Z}^D}A_{\mathbf{m}}\exp\left(i\sum_{j=1}^{D}m_j\,\mathbf{a}_j\cdot\mathbf{r}\right) , \tag{1}$$
in which $d$ is the dimension of $\mathbf{r}$, and $D\ge d$ is the number of basis vectors $\mathbf{a}_j$. When $D=d$ the function $W$ is periodic, and can only have crystallographic rotation symmetry. But for a non-crystallographic symmetry $D$ must be larger than $d$, and in that case $W$ is called quasiperiodic. This last requirement implies that the basis vectors cannot be independent (under the real numbers). However, in order that the sum in the exponent of Eq. (1) is unique the basis vectors are chosen independently under the integers. In other words, the equation $\sum_j m_j\mathbf{a}_j=0$ has no integer solutions except $m_j=0$. The result is that the positions in Fourier space at which the Fourier transform of $W$ is non-zero are discrete, but they lie dense in $\mathbb{R}^d$. When two of these positions are very close to each other, they must be very different in the corresponding values of $m_j$. Since the amplitude $A_{\mathbf{m}}$ decays with increasing $\|\mathbf{m}\|$, at least one of the amplitudes at such nearby positions is typically very small. That is why the Fourier spectrum appears as separated peaks, even though the peaks are virtually everywhere.

It is always possible to view the vectors $\mathbf{a}_j$ as projections from a $D$-dimensional space in which they are independent also over $\mathbb{R}$. More formally, one could define $D$-dimensional independent basis vectors $\mathbf{p}_j=\mathbf{a}_j+\mathbf{b}_j$ such that the added $\mathbf{b}_j$ are orthogonal to any $\mathbf{r}$ in real space: $\mathbf{r}\cdot\mathbf{b}_j=0$. With these definitions, the vectors $\mathbf{a}_j$ can simply be replaced by $\mathbf{p}_j$ in Eq. (1) without any change in the function $W$. Then the only difference between that definition and a periodic function in $D$ dimensions is that the argument $\mathbf{r}$ is restricted to a $d$-dimensional subspace. Therefore, a quasiperiodic function is simply a restriction of a periodic function in a space of more dimensions. An example with $D=2$ and $d=1$ is shown in Fig. 1. Here a two-dimensional periodic function is constructed as a sum of $\delta$-functions supported on line segments through each of the vertices of
Fig. 1. A quasiperiodic distribution of atoms (dots) in one dimension. It is generated as the intersection of a line with a two-dimensional periodic arrangement of line segments. The intervals between consecutive dots are the projections of lattice edges which are shown bold.
Fig. 2. A quasiperiodic arrangement of disks, which can also be read as a rhombus covering.
a square lattice. This is intersected by a line, orthogonal to the line segments, at some angle to the lattice axes. The resulting quasiperiodic function can be interpreted as the distribution of atoms on a line with short and long intervals in between, in irregular alternation. When a similar construction is made for larger D and d the quasiperiodic function can have a rotational symmetry not permitted by the d-dimensional crystallography, but which is induced by the symmetry of the lattice in D-dimensions. For example, Fig. 2 shows a two-dimensional quasiperiodic arrangement of disks, which is an intersection of the plane with objects on the vertices of a four-dimensional periodic lattice.
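The $D=2$, $d=1$ construction of Fig. 1 can be sketched in a few lines. The code below is an illustration, not the author's procedure: it uses the equivalent strip formulation (lattice points whose orthogonal coordinate falls in a window are projected onto the line), and the function names, the half-open window, the golden-ratio slope and the cutoff radius are all assumptions made here.

```python
import math


def cut_and_project(theta, radius):
    """Project the points of Z^2 lying in a strip along a line at angle theta onto that line.

    Assumes 0 < theta < pi/2. The window width equals the projection of a
    unit square on the direction orthogonal to the line.
    """
    c, s = math.cos(theta), math.sin(theta)
    width = c + s
    m_max = int(radius) + 3  # large enough to contain every strip point with |x_par| < radius
    points = []
    for m in range(-m_max, m_max + 1):
        for n in range(-m_max, m_max + 1):
            x_par = m * c + n * s    # coordinate along the physical line
            x_perp = -m * s + n * c  # coordinate in the orthogonal direction
            if 0.0 <= x_perp < width and abs(x_par) < radius:
                points.append(x_par)
    return sorted(points)


def interval_word(points, theta):
    """Encode consecutive gaps as 'L' (long) or 'S' (short); any other gap is an error."""
    c, s = math.cos(theta), math.sin(theta)
    long_gap, short_gap = max(c, s), min(c, s)
    word = []
    for left, right in zip(points, points[1:]):
        gap = right - left
        if math.isclose(gap, long_gap, abs_tol=1e-9):
            word.append("L")
        elif math.isclose(gap, short_gap, abs_tol=1e-9):
            word.append("S")
        else:
            raise ValueError("unexpected gap %r" % gap)
    return "".join(word)


GOLDEN_THETA = math.atan(2.0 / (1.0 + math.sqrt(5.0)))  # slope 1/tau, an illustrative choice

if __name__ == "__main__":
    atoms = cut_and_project(GOLDEN_THETA, radius=30.0)
    print(interval_word(atoms, GOLDEN_THETA))
```

With an irrational slope the gaps take only two values, the projections of the horizontal and vertical lattice edges, and they alternate without ever repeating periodically, as described for Fig. 1.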
2. Energy versus entropy: random tilings Real crystals can be approximated by perfect lattices, but are always subject to thermal excitations of various types. Vibrations of the atoms around their average positions are called phonons. When an atom deviates far enough from its allocated position it may get trapped behind other atoms, thus creating a vacancy and an interstitial atom. Also a bounded additional layer of atoms can be inserted into a perfect crystal at the cost of a topological defect at the layer boundary, called a dislocation. All these excitations possible in crystals, are also permitted in quasicrystals. In addition, there is another type known by the name, phason. In Fig. 1 the interchange of an adjacent short and a long interval is a phason excitation. Since these intervals are projections of lattice edges of the two-dimensional lattice, this phason can be seen as an excursion of the chain of projected lattice edges (shown bold) in the orthogonal space. When a number of these exchanges is made in succession, this excursion can be more and more removed from the original almost straight sequence. In two physical dimensions the phason excitation cannot be readily visualized. However, note in Fig. 2 that there are many hexagons which consist of two fat and one thin rhombus. One of the possible phason excitations replaces the interior of such hexagons, by its mirror image. These variations, when applied repeatedly, destroy the perfect quasiperiodic order. But in any reasonable interaction scheme, the physical energy of such excitations would be very small. This is because the atomic configuration after a phason move still consists of the same local structures. Only on relatively large scales new configurations appear. Therefore, if the energy is determined by a strictly short-range potential, the phason moves may even be energetically neutral. 
These considerations are of some consequence if we consider the physical mechanism that stabilizes a real quasicrystal. If the quasiperiodic state minimizes the potential energy, the interparticle potential must be of sufficiently long range to exclude periodic repetitions of the same local pattern. This may be the case in some quasicrystals, but it is natural to also consider the possibility that a quasicrystal is not stabilized by potential energy alone, but only in conjunction with thermal effects. The quasicrystalline phase is then the minimum of the free energy, and the quasiperiodic characteristics of the diffraction pattern are the result of thermal averaging. In particular, if the inter-particle potential does not reach beyond first neighbors, this is a most likely scenario. Consider for example Fig. 3. This figure, (as also Fig. 2) shows only three distances between nearest-neighbor disks: two small disks, two big disks and a big and a small one. Suppose these distances minimize the two-particle potentials, and suppose further that these potentials are so short ranged that they do not reach to second neighbors. Then any configuration which realizes only the same distances between neighboring disks, will have the same potential energy. There are, in fact, extensively many of these configurations, and even at zero temperature the correlation functions and thus the diffraction pattern are the average of a whole ensemble of configurations. This example makes it plausible that there are quasicrystalline compounds in which the entropic effect dominates over the energetics in the stability of the quasicrystalline phase. It should be noted that in these considerations of energy and entropy it is still the energy which favors or selects specific angles between interatomic vectors. This effect, if strong enough, may induce a long-range orientational order, associated with a non-crystallographic symmetry. 
But it does not generate a structure with discrete Bragg peaks itself. Those are the results of the
Fig. 3. A quasiperiodic arrangement of disks with octagonal symmetry.
long-range properties of the positional correlations which are affected and can even be dominated by the entropy.

Random tiling models are designed to investigate precisely the scenario set out above [7,8]. Formally, they are simply defined as the ensemble of coverings of space, without holes or overlaps, by a limited set of geometrical objects. This in itself is rather general and many known statistical models can be cast in this form, see Ref. [9]. For our purposes here, the objects are chosen so that they support a non-crystallographic rotational symmetry. They could be, for instance, the two Penrose rhombi, with angle $\pi/5$ and $2\pi/5$, respectively [10,11], allowing a ten-fold symmetry, or a square and a rhombus with angle $\pi/4$, allowing an eight-fold symmetry [12]. In three dimensions the tiles would obviously be three-dimensional objects, rhombohedra [13], or otherwise [14]. Typically, in random tiling models one does not consider interactions between the tiles, except for the restriction that they fill space. This implies that the partition sum of the model is an integer, counting the number of ways that a given area can be covered by a given set of tiles. In practice, the number of tiles will not be fixed but controlled by means of a chemical potential.

There is a variety of physical quantities that can be addressed via random tiling models, such as a phase diagram, correlation functions, diffraction pattern and response functions [15]. In this lecture we focus on an exact solution and with this in mind we consider a specific two-dimensional tiling model which is solvable. Of this model we consider the entropy, not because this is such an important quantity, but because it is most directly calculable. Because the entropy is calculated as a function of a partial tile density it also gives insight into the phase diagram.
3. The square triangle tiling In the remainder of this lecture we consider a particular random tiling in which the plane is tiled with squares and equilateral triangles, with edge length unity [16,17]. As an example a configuration
Fig. 4. A random covering of the plane with squares and triangles. It is equivalent to a closest packing of a binary mixture of hard disks.
Fig. 5. Configurations of the square triangle tiling in which the area is very unevenly divided between squares and triangles. The left-hand side configuration is dominated by triangles and the right-hand side by squares.
is shown in Fig. 4, which also shows that the model can be used to describe a closest packing of a binary mixture of hard disks. When the densities of squares and triangles differ strongly, the configurations have a rather different appearance. Fig. 5 shows configurations dominated by triangles and squares, respectively. When most of the area is covered by squares, the squares in one of the possible orientations form rectangular domains, which are separated by domain walls consisting of triangles. At the intersections of the domain walls the squares in the other orientations appear. The highest rotational symmetry that can be achieved at this relative density is fourfold. When the configuration consists mostly of triangles, there are hexagonal domains filled with triangles in two orientations, and
Fig. 6. The tiles in their respective orientations.
separated by domain walls consisting of squares. At the vertices of these domain walls we see the triangles in the remaining orientations. The highest possible symmetry with these densities is hexagonal.

Though it is clearly a discrete statistical model, the vertices are not restricted to an underlying lattice. Notice, however, that the angles of the tiles are all integer multiples of $\pi/6$, and this property is transmitted to the angles between any two edges in a configuration. Once one of the tiles is fixed on the plane the possible orientations of all the other tiles are restricted to a finite set. In particular the squares can occur in three different orientations and the triangles in four, which makes for seven different oriented tiles, shown in Fig. 6.

The seven partial densities corresponding to the differently oriented tiles are not independent. In particular, an oriented triangle and its rotation over $\pi$ must occur in equal numbers in a tiling with periodic boundary conditions. Therefore, we will distinguish only two triangle orientations. As the total area is precisely covered, the partial densities expressed in area fractions must add up to unity. With these restrictions there remain four independent densities. In addition there is another constraint, which is explained in the appendix, where it is expressed in total numbers of the tiles in each orientation (A.2). This same relation can be expressed in the area fractions covered by the tiles, $p_j$ for the squares and $q_j$ for the triangles:
$$p_1p_2+p_2p_3+p_3p_1 = \tfrac{4}{3}\,q_1q_2 . \tag{2}$$
One of the interests of the model is in the possible existence of a phase which respects the complete twelvefold orientational symmetry. In such a symmetric phase, the area covered by differently oriented tiles must be the same: $p_1=p_2=p_3=p/3$, and $q_1=q_2=q/2$, in which $p$ and $q$ are the total area fractions covered by squares and triangles, respectively. From Eq. (2) it immediately follows that in the symmetric phase $p=q$.
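The step from Eq. (2) to $p=q$ can be written out explicitly. The sketch below reads the right-hand side of Eq. (2) as $\tfrac43 q_1q_2$; the tile counts $N_\triangle$ and $N_\square$ are symbols introduced here for illustration, and the count ratio uses only the areas of a unit square ($1$) and a unit equilateral triangle ($\sqrt3/4$).

```latex
\begin{align*}
p_1p_2+p_2p_3+p_3p_1 &= 3\Bigl(\frac{p}{3}\Bigr)^{2} = \frac{p^{2}}{3},
\qquad
\frac{4}{3}\,q_1q_2 = \frac{4}{3}\Bigl(\frac{q}{2}\Bigr)^{2} = \frac{q^{2}}{3},\\
\Rightarrow\quad p &= q = \tfrac12 \quad (\text{since } p+q=1),
\qquad
\frac{N_\triangle}{N_\square} = \frac{q/(\sqrt{3}/4)}{p/1} = \frac{4}{\sqrt{3}} \approx 2.31 .
\end{align*}
```

In the symmetric phase the squares and triangles therefore share the area equally, while the triangles outnumber the squares by a factor of about 2.31.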
3.1. Lattice deformation
Let the possible orientations of the edges have an angle with the horizontal equal to an odd multiple of $\pi/12$. We deform the tiling by rotating the edges over $\pm\pi/12$ such that the angle with the horizontal is a multiple of $\pi/3$. In Fig. 7 the possible edge orientations are shown together with the orientations after this deformation. Note that under this deformation the triangles remain triangles, turned over $\pm\pi/12$, and the squares become rhombi. Since each original tiling is mapped in a unique way into a deformed tiling, the partition sum, which counts the number of possible tiling configurations, is unchanged. The vertices of the deformed tiling coincide with those of a triangular lattice, so that the model is now turned into a lattice model. Every edge of the lattice represents either an edge of the original tiling or the diagonal of a square. The deformed tiling can be represented on the lattice by drawing only those lattice edges which represent a tile edge. Because two different orientations in the tiling
Fig. 7. The possible orientations of the tile edges are shown on the left. The tiling is deformed by rotating each edge over $\pi/12$ alternating to the right or left. The result is the set of orientations shown on the right, where coinciding directions are drawn separately.
Fig. 8. The possible states of the elementary triangles of the lattice. The drawn lattice edges are also edges in the tiling. The lattice edges which are omitted, represent the diagonal of a square tile. The bold lines and bold dotted lines are decorations which indicate the original orientation of the edges they intersect.
are mapped on one orientation of the lattice, it is convenient to decorate the lattice (by bold and dotted lines) to keep track of the original direction of the edge. Thus, the elementary triangles (faces) of the lattice can be in one of the five different states, depicted in Fig. 8. Two of them are triangular tiles, and three represent half of a square. The lattice edges can be in one of three states. The restriction that the original tiles fit together without gaps or overlaps translates in the lattice model as the requirement that the decorations of an edge agree between the two adjacent lattice faces.

Now that the tiling is translated into a lattice problem, the standard techniques of lattice statistical mechanics may be applied. In particular, here we will use the transfer matrix [18]. Let $a$ and $b$ specify the state of an entire row of horizontal edges of the lattice, which is taken to be periodic in two directions. The transfer matrix element $T_{a,b}$ is then given by the weight of the row of lattice faces that fits in between $a$ and $b$, and it is zero if there is no such row. If we simply wish to count the number of configurations, each allowed row configuration has weight one. In practice, we will control the relative number of squares and triangles by giving the triangles a weight, equal to $\sqrt{t}$. The partition sum of the model is given by
$$Z = \operatorname{Tr}\,T^N , \tag{3}$$
in which $N$ is the vertical size of the lattice. In the limit $N\to\infty$, the partition sum per row is thus given by the largest eigenvalue of $T$, so that the thermodynamic problem is reduced to an eigenvalue problem.
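The reduction of the partition sum to the largest eigenvalue can be illustrated on a toy example. The sketch below does not use the tiling transfer matrix: `TOY_T` is a hypothetical 2x2 matrix chosen here only because its largest eigenvalue, the golden ratio, is known in closed form, and the helper names are likewise assumptions.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def trace_of_power(t, n):
    """Tr T^n by repeated multiplication (exact for integer entries)."""
    size = len(t)
    power = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        power = mat_mul(power, t)
    return sum(power[i][i] for i in range(size))


# Hypothetical toy matrix, NOT the square-triangle transfer matrix.
TOY_T = [[1, 1], [1, 0]]
TOY_LARGEST_EIGENVALUE = (1 + math.sqrt(5)) / 2

if __name__ == "__main__":
    for n in (5, 20, 200):
        per_row = math.log(trace_of_power(TOY_T, n)) / n
        print(n, per_row, math.log(TOY_LARGEST_EIGENVALUE))
```

As $N$ grows, $\frac{1}{N}\log\operatorname{Tr}T^N$ approaches the logarithm of the largest eigenvalue, which is the statement used in the text to turn the thermodynamic problem into an eigenvalue problem.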
4. The Bethe Ansatz
The size of the transfer matrix is $3^L\times3^L$, with $L$ the number of edges in a row. Therefore, even for modest lattice sizes this matrix may already be too large to diagonalize with standard numerical techniques. It was discovered by Widom [19] that the transfer matrix of this model can be diagonalized by means of the Bethe Ansatz, an analytic technique which reduces the eigenvalue problem to the solution of at most $L$ equations (see Ref. [20], Ch. 9).

The first ingredient is the observation that the solid and dotted decorating lines indicate conserved quantities. Continuity requires that each horizontal row of edges is intersected by the same number of bold and dotted lines. If we interpret the vertical extent of the lattice as time (running upwards), the transfer matrix turns into a time evolution operator, and the decoration lines can be seen as world lines of particles. Note that in the configuration shown above in Fig. 5 (right-hand side) the particle trajectories are the domain walls between the square-filled domains. In the lattice the drawn horizontal edges are occupied by particles and the open horizontal edges are vacant. The particles shown by a solid or dotted line will be denoted as type $+$ or $-$, respectively. Since the number of particles of both types is conserved, the transfer matrix is block diagonal with the blocks labeled by the particle numbers.

We will first consider the smaller blocks. The block with zero particles is only $1\times1$. The matrix element is between two rows of open (undrawn) horizontal edges. Only one row of lattice faces fits in between, as shown in Fig. 9a. Therefore, the only eigenvalue in this sector is $\Lambda=1$, equal to the weight of this row of lattice faces. We now consider the sector with one particle of the type $+$. A vector in that sector is denoted as $\psi_+(x)$, representing the weight of the state in which the particle sits at position $x$. Fig. 9b shows the tile configuration corresponding to a matrix element in this sector. From one layer to the next the particle moves one-half lattice edge to the left. It is thus convenient to measure $x$ in units of one-half
Fig. 9. Configurations corresponding to elements of the transfer matrix in various sectors, corresponding to no particles (a), one particle (b and c), two particles (d-f) and three particles (g, h).
B. Nienhuis / Physics Reports 301 (1998) 271—292
edge. If the particle after the action of the transfer matrix sits at $x$, it was at $x+1$ before, thus the eigenvalue equation reads

$$\Lambda\,\psi_+(x) = (T\psi_+)(x) = t\,\psi_+(x+1). \tag{4}$$

The parameter $t$ is the weight of the two triangles. From translational invariance we expect a plane-wave solution $\psi_+(x) = \exp(ipx)$, which with the identification $\exp(ip) = u$ is written $\psi_+(x) = u^x$. Substitution of this Ansatz in the eigenvalue equation (4) results in

$$\Lambda = t\,u. \tag{5}$$

Periodic boundary conditions require $\psi_+(0) = \psi_+(2L)$, i.e. $u^{2L} = 1$. The sector with one particle of the other type (Fig. 9c) can be treated in the same way. The Ansatz $\psi_-(x) = v^x$ leads to the eigenvalue

$$\Lambda = t/v \tag{6}$$

and periodic boundary condition $v^{2L} = 1$.

The sector with one particle of each type requires more thought. First observe that when the particles are far apart they behave as if they were alone (Fig. 9d). For this reason it is reasonable to consider again a piecewise plane-wave solution,

$$\psi_{+-}(x,y) = A^{+-}\,u^x v^y \quad \text{for } x<y, \qquad \psi_{-+}(y,x) = A^{-+}\,u^x v^y \quad \text{for } y<x. \tag{7}$$

For the case that the particles are not in each other's way ($y \neq x+2$), the eigenvalue equations read

$$\Lambda\,\psi_{+-}(x,y) = t^2\,\psi_{+-}(x+1,\,y-1), \qquad \Lambda\,\psi_{-+}(y,x) = t^2\,\psi_{-+}(y-1,\,x+1). \tag{8}$$

After substitution of the Ansatz (7) this leads to

$$\Lambda = t^2\,u\,v^{-1}. \tag{9}$$
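As a small numerical illustration of the one-particle sector, Eqs. (4) and (5), the following sketch applies the shift rule $(T\psi_+)(x) = t\,\psi_+(x+1)$ on a ring of $2L$ half-edge positions and checks that every plane wave with $u^{2L} = 1$ is an eigenvector with eigenvalue $tu$. The values of `L` and `t` and all function names are illustrative assumptions, not taken from the text.

```python
import cmath

def apply_transfer_one_particle(psi, t):
    """Apply (T psi)(x) = t * psi(x + 1) on a periodic ring of len(psi) positions."""
    n = len(psi)
    return [t * psi[(x + 1) % n] for x in range(n)]

def plane_wave(u, n_sites):
    """Ansatz psi_+(x) = u**x for x = 0 .. n_sites - 1."""
    return [u ** x for x in range(n_sites)]

L = 4            # illustrative number of edges per row (assumption)
t = 0.7          # illustrative triangle weight (assumption)
n_sites = 2 * L  # positions are measured in units of one-half edge

max_residual = 0.0
for k in range(n_sites):
    u = cmath.exp(2j * cmath.pi * k / n_sites)  # any root of u**(2L) = 1
    psi = plane_wave(u, n_sites)
    t_psi = apply_transfer_one_particle(psi, t)
    eigenvalue = t * u                           # Eq. (5)
    residual = max(abs(a - eigenvalue * b) for a, b in zip(t_psi, psi))
    max_residual = max(max_residual, residual)

print(max_residual)  # of the order of floating-point rounding
```

The periodic wrap-around in `apply_transfer_one_particle` is exactly where the condition $u^{2L} = 1$ is needed; for other values of $u$ the residual at the last site would not vanish.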
Consider now the case that after the action of the transfer matrix the $+$ and $-$ particles sit on two consecutive lattice edges. Fig. 9e and f show the two possible rows of lattice faces that fit under this configuration of horizontal edges. It leads to the eigenvalue equation

$$\Lambda\,\psi_{+-}(x-1,\,x+1) = t\,\psi_{-+}(x-2,\,x) + t\,\psi_{-+}(x,\,x+2). \tag{10}$$

Since $\Lambda$ is now known from Eq. (9), this eigenvalue equation can be used to relate $A^{+-}$ and $A^{-+}$:

$$A^{+-} = (u^2 + v^{-2})\,t^{-1}\,A^{-+}. \tag{11}$$

But the amplitudes $A^{+-}$ and $A^{-+}$ are also related through the periodic boundary conditions

$$\psi_{+-}(x,\,2L) = \psi_{-+}(0,\,x) \quad \text{and} \quad \psi_{-+}(y,\,2L) = \psi_{+-}(0,\,y). \tag{12}$$

These are consistent with Eq. (11) only if $u$ and $v$ satisfy the conditions

$$v^{-2L} = u^{2L} = (u^2 + v^{-2})\,t^{-1}. \tag{13}$$
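The step from Eq. (10) to Eq. (11) is a short substitution, and it can be checked numerically. The sketch below inserts the Ansatz (7), the eigenvalue (9) and the amplitude ratio (11) into Eq. (10) for arbitrary complex $u$ and $v$; the particular numbers and the helper names are illustrative assumptions.

```python
def psi_pm(x, y, u, v, a_pm):
    """psi_{+-}(x, y): the + particle at x, the - particle at y, with x < y (Eq. 7)."""
    return a_pm * u ** x * v ** y

def psi_mp(y, x, u, v, a_mp):
    """psi_{-+}(y, x): the - particle at y, the + particle at x, with y < x (Eq. 7)."""
    return a_mp * u ** x * v ** y

# Arbitrary illustrative values (assumptions); Eq. (10) is an identity in u, v, t.
u = complex(0.8, 0.3)
v = complex(-0.4, 1.1)
t = 0.9

a_mp = 1.0
a_pm = (u ** 2 + v ** -2) / t * a_mp  # Eq. (11)
lam = t ** 2 * u / v                  # Eq. (9)

max_rel_error = 0.0
for x in range(2, 8):
    lhs = lam * psi_pm(x - 1, x + 1, u, v, a_pm)
    rhs = t * psi_mp(x - 2, x, u, v, a_mp) + t * psi_mp(x, x + 2, u, v, a_mp)
    max_rel_error = max(max_rel_error, abs(lhs - rhs) / abs(lhs))

print(max_rel_error)  # rounding level when Eq. (11) holds
```

Only the ratio $A^{+-}/A^{-+}$ is fixed by the collision; the overall normalization (`a_mp = 1.0` here) is arbitrary.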
The particles of different types interact with each other every time their trajectories intersect. Since particles of the same type travel in the same direction with the same speed, it seems impossible that they ever meet and interact. However, when two particles of the same type sit on two consecutive edges the moment they bump into a particle of the other type, then they are in each other's way, and thus necessarily interact. To investigate this effect, consider the sector with three particles, two of type $+$ and one of type $-$. Guided by the success so far we assume a wave function of piecewise plane-wave type, with coefficients that depend on the order in which the particles are positioned in space. The three positional arguments of $\psi$, namely $x_1$, $x_2$ and $y$, are given in increasing order:

$$\begin{aligned}
\psi_{++-}(x_1,x_2,y) &= A^{++-}_{12v}\,u_1^{x_1} u_2^{x_2} v^{y} + A^{++-}_{21v}\,u_2^{x_1} u_1^{x_2} v^{y},\\
\psi_{+-+}(x_1,y,x_2) &= A^{+-+}_{1v2}\,u_1^{x_1} v^{y} u_2^{x_2} + A^{+-+}_{2v1}\,u_2^{x_1} v^{y} u_1^{x_2},\\
\psi_{-++}(y,x_1,x_2) &= A^{-++}_{v12}\,v^{y} u_1^{x_1} u_2^{x_2} + A^{-++}_{v21}\,v^{y} u_2^{x_1} u_1^{x_2}.
\end{aligned} \tag{14}$$

In the eigenvalue equations for the cases when all particles are spatially separated the two terms in each line can be treated separately, always resulting in

$$\Lambda = t^3\,u_1 u_2 / v. \tag{15}$$

Since the eigenvalue factorizes in factors for each momentum, it is simple to treat the case as being that of two unlike particles colliding while the third particle is elsewhere. Also in such a case the two terms in each of the equations (14) can be treated separately. In both terms the contribution of the colliding particles and of the spectator particle factorize. The resulting amplitude relations are the obvious generalization of Eq. (11):

$$A^{++-}_{jkv} = (u_k^2 + v^{-2})\,t^{-1}\,A^{+-+}_{jvk}, \qquad A^{+-+}_{jvk} = (u_j^2 + v^{-2})\,t^{-1}\,A^{-++}_{vjk}. \tag{16}$$

But in the case where all three particles collide at once, the two terms in Eq. (14) must be taken together. The configurations are shown in Fig. 9g and h. The corresponding eigenvalue equation reads

$$\Lambda\,\psi_{++-}(x-2,\,x,\,x+2) = t\,\psi_{-++}(x-3,\,x-1,\,x+1) + t^2\,\psi_{+-+}(x-1,\,x+1,\,x+3). \tag{17}$$

When the Ansatz (14) and the eigenvalue (15) are substituted, and the relations (16) are applied, it results in

$$A^{++-}_{12v} = -A^{++-}_{21v} \quad \text{and} \quad A^{-++}_{v12} = -A^{-++}_{v21}. \tag{18}$$

It is remarkable that the $-$ particle, though it is necessary to facilitate the collision, does not enter these equations. Obviously, the sector with two particles of type $-$ and one particle of type $+$ is similar.

Now, we are ready to construct an Ansatz for the sector with general particle numbers. The natural generalization is a linear combination of products of plane waves,

$$\psi_c(x) = \sum_p A^c_p \prod_j \left(w^{c_j}_{p_j}\right)^{x_j}. \tag{19}$$
Here $x = (x_1, x_2, \ldots)$ stands for the sequence of positions in increasing order: $x_j < x_{j+1}$. The sequence of particle types is coded in $c = (c_1, c_2, \ldots)$, of the elements of which $n_+$ take the value $+$ and $n_-$ the value $-$. The sum is over permutations $p = (p_1, p_2, \ldots)$ of the indices, only mixing indices of particles of the same type. The exponentiated wave numbers will also be denoted as $w^+_j \equiv u_j$ and $w^-_k \equiv v_k$. The amplitudes thus depend on the order in which the particle types are distributed in space and on the distribution of the momenta over the particles. The eigenvalue follows immediately from the action of the transfer matrix on those configurations where all particles are separated:

$$\Lambda = \left(\prod_{j=1}^{n_+} t\,u_j\right)\left(\prod_{k=1}^{n_-} t\,v_k^{-1}\right). \tag{20}$$

Likewise, from the case that two or three particles collide as they are separated from the others, the amplitudes involved in the collision are related just like Eqs. (11) and (18):

$$A^{\ldots,+,-,\ldots}_{\ldots,j,k,\ldots} = t^{-1}\,(u_j^2 + v_k^{-2})\,A^{\ldots,-,+,\ldots}_{\ldots,k,j,\ldots}, \tag{21}$$

$$A^{\ldots,+,+,\ldots}_{\ldots,j,k,\ldots} = -A^{\ldots,+,+,\ldots}_{\ldots,k,j,\ldots} \quad \text{and} \quad A^{\ldots,-,-,\ldots}_{\ldots,j,k,\ldots} = -A^{\ldots,-,-,\ldots}_{\ldots,k,j,\ldots}. \tag{22}$$

Though the eigenvalue equations with Ansatz (19) are easily verified for those configurations in which the particles never occupy more than three consecutive edges, to prove that the Ansatz is generally valid is more difficult. It requires first the observation that each uninterrupted sequence of occupied horizontal edges, separated by unoccupied edges, can be treated independently. Then for such a sequence one has to consider general lengths of the uninterrupted subsequences of particles of the same type, and a general number of such subsequences. Kalugin [21] makes the first consideration but not the second. An algebraic proof from a different perspective has been given by De Gier and Nienhuis [22].

For the Ansatz to be consistent with periodic boundary conditions, it has to satisfy

$$\psi_{\ldots,c}(\ldots,\,2L) = \psi_{c,\ldots}(0,\,\ldots), \tag{23}$$

which requires for the amplitudes

$$A^{\ldots,c}_{\ldots,k}\,\left(w^c_k\right)^{2L} = A^{c,\ldots}_{k,\ldots}. \tag{24}$$

The left- and right-hand side of this equation can also be related by successive interchanges of momenta and particle types using Eqs. (21) and (22). This results in consistency equations on the momenta, known as the Bethe Ansatz equations (BAE):

$$\begin{aligned}
u_j^{2L} &= (-)^{n_+ + 1}\,t^{-n_-} \prod_{k=1}^{n_-} \left(u_j^2 + v_k^{-2}\right),\\
v_k^{-2L} &= (-)^{n_- + 1}\,t^{-n_+} \prod_{j=1}^{n_+} \left(u_j^2 + v_k^{-2}\right).
\end{aligned} \tag{25}$$

The numbers $n_+$ and $n_-$ are the particle numbers of each type. The momenta of the particles of the same type must all be different, since otherwise the wave function vanishes. With this condition each solution of the BAE (25) provides an eigenvalue (20) of the transfer matrix, and extensive numerical tests on finite systems unambiguously support that each eigenvalue corresponds to a solution of Eq. (25).
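To make the structure of the BAE (25) concrete, the following sketch evaluates the residuals (left-hand side minus right-hand side) of both sets of equations for a trial set of roots. It is only a checking aid under stated assumptions, not a solver: the function name and the test values are hypothetical, and the example uses the sector without particles of type $-$, where the products are empty and Eq. (25) reduces to $u_j^{2L} = (-)^{n_+ + 1}$.

```python
import cmath

def bae_residuals(us, vs, t, L):
    """Residuals (lhs - rhs) of Eq. (25) for trial roots us (type +) and vs (type -)."""
    n_plus, n_minus = len(us), len(vs)
    residuals = []
    for u in us:
        prod = 1.0
        for v in vs:
            prod *= u ** 2 + v ** -2
        rhs = (-1) ** (n_plus + 1) * t ** (-n_minus) * prod
        residuals.append(u ** (2 * L) - rhs)
    for v in vs:
        prod = 1.0
        for u in us:
            prod *= u ** 2 + v ** -2
        rhs = (-1) ** (n_minus + 1) * t ** (-n_plus) * prod
        residuals.append(v ** (-2 * L) - rhs)
    return residuals

# Illustrative check (assumed values): two + particles, no - particles, L = 3.
# Eq. (25) then demands u**(2L) = -1, solved by odd multiples of pi/(2L).
L = 3
t = 0.7
free_roots = [cmath.exp(1j * cmath.pi * (2 * m + 1) / (2 * L)) for m in (0, 1)]
max_free_residual = max(abs(r) for r in bae_residuals(free_roots, [], t, L))
print(max_free_residual)  # rounding level
```

For sectors with both particle types the same function can be handed to a root finder of one's choice; which solver is appropriate is left open here.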
In retrospect the Bethe Ansatz appears natural. When the particles interact only at short range, it is reasonable to assume a piecewise plane-wave solution. However, what is remarkable is that all collisions between any number of particles factorize into collisions between two particles each. We note that the general verification of this fact is not undertaken in this lecture.
5. Solution to the Bethe Ansatz equations

The BAE constitute significant progress in the diagonalization of the transfer matrix, since the size of the transfer matrix is exponential in the system size $L$ while the number of equations grows only linearly. In the thermodynamic limit Kalugin [21] managed to find an analytic solution, which we will follow in this section. First it is convenient to introduce new variables $\xi = u^2$ and $\eta = -v^{-2}$ and rewrite the BAE accordingly:

$$\begin{aligned}
\xi_j^{L} &= (-)^{n_+ + 1}\,t^{-n_-} \prod_{k=1}^{n_-} (\xi_j - \eta_k),\\
\eta_k^{L} &= (-)^{L + n_- + 1}\,t^{-n_+} \prod_{j=1}^{n_+} (\xi_j - \eta_k),
\end{aligned} \tag{26}$$

$$\Lambda = \left(\prod_{j=1}^{n_+} t\,\xi_j^{1/2}\right)\left(\prod_{k=1}^{n_-} t\,(-\eta_k)^{1/2}\right). \tag{27}$$

When we introduce the functions

$$\begin{aligned}
F_+(z) &= \log z - \frac{1}{L}\sum_{k=1}^{n_-} \log(z - \eta_k) + \frac{n_-}{L}\log t,\\
F_-(z) &= \log z - \frac{1}{L}\sum_{j=1}^{n_+} \log(\xi_j - z) + \frac{n_+}{L}\log t,
\end{aligned} \tag{28}$$

the BAE (26) can be rewritten as

$$\begin{aligned}
L\,F_+(\xi_j) \bmod 2\pi i &= (n_+ + 1 \bmod 2)\,\pi i,\\
L\,F_-(\eta_k) \bmod 2\pi i &= (L + n_- + 1 \bmod 2)\,\pi i.
\end{aligned} \tag{29}$$

From this form of the equations it is evident that the roots of the BAE live on curves in the complex plane, where the real part of the functions $F_+$ and $F_-$, respectively, vanishes. An example of the locus of a numerical solution is shown in Fig. 10. From the form of the eigenvalue (27) the largest eigenvalue is expected to be the product of all the solutions of Eq. (29) for which $|t^2 \xi_j| > 1$ or $|t^2 \eta_k| > 1$, respectively. Therefore, it is plausible that between two successive roots the imaginary part changes by $2\pi/L$ and not by a multiple of this amount, so that we may specify the order of the roots by

$$F_+(\xi_{j+1}) - F_+(\xi_j) = \frac{2\pi i}{L}, \qquad F_-(\eta_{k+1}) - F_-(\eta_k) = \frac{2\pi i}{L}. \tag{30}$$
Fig. 10. The locus of the roots of the Bethe Ansatz equations. The corresponding triangle area fraction is $q < 1/2$. The system size $L$ and the number of particles $n_\pm$ are so large that the individual roots are not visually separated.
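Therefore, in the thermodynamic limit, when the roots are close to each other, the density of roots on the solution curves is given by

$$\frac{L}{2\pi i}\frac{\partial F_+(z)}{\partial z} \equiv \frac{L}{2\pi i}\,f_+(z) \quad \text{and} \quad \frac{L}{2\pi i}\frac{\partial F_-(z)}{\partial z} \equiv \frac{L}{2\pi i}\,f_-(z).$$

From differentiating Eq. (28) these functions satisfy

$$f_+(z) = \frac{1}{z} - \frac{1}{L}\sum_{k=1}^{n_-} \frac{1}{z - \eta_k}, \qquad f_-(z) = \frac{1}{z} - \frac{1}{L}\sum_{j=1}^{n_+} \frac{1}{z - \xi_j}. \tag{31}$$

But since we know the density of roots $\xi$ and $\eta$ on the solution curves, the sums in Eq. (31) can be turned into integrals:

$$\begin{aligned}
f_+(z) &= \frac{1}{z} + \frac{1}{2\pi i}\int_{\eta_{n_-}}^{\eta_1} \frac{f_-(\eta)}{z - \eta}\,d\eta,\\
f_-(z) &= \frac{1}{z} - \frac{1}{2\pi i}\int_{\xi_1}^{\xi_{n_+}} \frac{f_+(\xi)}{z - \xi}\,d\xi.
\end{aligned} \tag{32}$$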
The difference in sign is caused by the relative position of the first and the last root of the sequence of $\xi$ and $\eta$, and the fact that the direction of integration is chosen in the positive imaginary direction. The integrals are to be taken along the contours of the solutions of the BAE (26). These contours can be defined self-consistently by the integral equations (32) as the locus where the forms $f_\pm(z)\,dz$ are imaginary.

The analytic properties of the functions $f_\pm(z)$ can be read directly from the integral equations (32): both have a pole in the origin, and $f_+$ has a cut at $\eta$ and $f_-$ at $\xi$, and otherwise the functions are analytic. The behavior at the cut can also be obtained formally by evaluating the integral equations for $z$ just to the left and right of the cut and comparing the results. Then it follows that:

$$f_+(\eta + \epsilon) - f_+(\eta - \epsilon) = f_-(\eta), \qquad f_-(\xi + \epsilon) - f_-(\xi - \epsilon) = -f_+(\xi). \tag{33}$$

These equations precisely define the analytic continuations of the functions $f_\pm$ through the cuts. When $f_+$ is continued analytically from left to right through the $\eta$-cut it is replaced by $f_+ - f_-$; likewise when $f_-$ is continued analytically from left to right through the $\xi$-cut it is replaced by $f_- + f_+$. This implies that the class of functions $f(z) = a_+ f_+(z) + a_- f_-(z)$ is closed under analytic continuation through the cuts. The analytic continuation of $f(z)$ from left to right through the $\eta$ and $\xi$ cut changes the vector $(a_+\ a_-)^T$ precisely as the multiplication by the matrices

$$C_\eta = \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix} \quad \text{and} \quad C_\xi = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \tag{34}$$

respectively.

The matrices $C_\xi$ and $C_\eta$ together generate the whole infinite group $SL(2,\mathbb{Z})$, which is thus the monodromy group of the function $f(z)$. However, the situation simplifies considerably if the endpoints of the cuts coincide: $\xi_1 \to \eta_{n_-}$ and $\eta_1 \to \xi_{n_+}$ in the thermodynamic limit ($L \to \infty$ while keeping $n_-/L$ and $n_+/L$ constant). In that case it is impossible to make a loop through one of the two cuts and return to the starting point, without going through the other cut as well. That implies that we have to deal with only one monodromy operator, $C_\eta C_\xi$. This operator has the convenient property that $(C_\eta C_\xi)^6 = 1$, which implies that the monodromy group is finite. This property is indeed sufficient to determine uniquely the solution of the integral equations, while in the general case we do not know how to proceed. Therefore, from here on we will simply assume that the endpoints of the cuts indeed coincide, and find out as we go what this implies for the parameters, such as $n_+$, $n_-$ and $t$.

We will take the function $f(z)$ together with its analytic continuation as a function on six Riemann sheets. These six Riemann sheets can be mapped into one, by introducing the parameter $w$:

$$w^6 = \frac{z/b - 1}{1 - z/b^*} \quad \text{and} \quad z = \frac{w^6 + 1}{w^6/b^* + 1/b}, \tag{35}$$

where $b = \eta_1 = \xi_{n_+}$ and $b^* = \xi_1 = \eta_{n_-}$ in the thermodynamic limit. Rather than expressing $f(z)$ itself as a function of $w$ it is convenient to introduce a single-valued function $g(w)$ defined by

$$g(w)\,dw = f(z)\,dz, \tag{36}$$
Table 1. List of the poles and residues of the function $g(w)$. The second column gives the value of $z$ corresponding to the pole, the third column gives the position of the pole, the fourth column gives the functional behavior of $f(z)$ in the corresponding $z$-sheet, and the last column gives the residue of the pole.

m    z          w_m                        f(z)                  Residue r_m
1    0          $e^{\pi i/6}$              $f_+(z)$              $1$
2    $\infty$   $e^{i(\pi+\gamma)/3}$      $f_+(z)$              $n_-/L - 1$
3    0          $i$                        $f_+(z) - f_-(z)$     $0$
4    $\infty$   $e^{i(2\pi+\gamma)/3}$     $-f_-(z)$             $1 - n_+/L$
5    0          $-e^{-\pi i/6}$            $-f_-(z)$             $-1$
6    $\infty$   $-e^{i\gamma/3}$           $-f_+(z) - f_-(z)$    $2 - (n_+ + n_-)/L$
7    0          $-e^{\pi i/6}$             $-f_+(z)$             $-1$
8    $\infty$   $-e^{i(\pi+\gamma)/3}$     $-f_+(z)$             $1 - n_-/L$
9    0          $-i$                       $-f_+(z) + f_-(z)$    $0$
10   $\infty$   $-e^{i(2\pi+\gamma)/3}$    $f_-(z)$              $n_+/L - 1$
11   0          $e^{-\pi i/6}$             $f_-(z)$              $1$
12   $\infty$   $e^{i\gamma/3}$            $f_+(z) + f_-(z)$     $(n_+ + n_-)/L - 2$
where $w$ and $z$ are related by Eq. (35). In this way the poles of $f(z)$ translate into poles of $g(w)$ with the same residues. That this is so can be seen from integrating the forms (36) over a closed contour containing a pole. From Eq. (31) it is clear that the functions $f_\pm(z)$ decay as $(1 - n_\mp/L)/z$ for $z \to \infty$. This behavior also translates into a pole of $g(w)$ at the $w$-image of $z = \infty$, with residue $n_\mp/L - 1$. The function $g(w)$ is not singular at the cuts of the functions $f_+$ and $f_-$, as it is defined by analytic continuation. But at the end points $z = b$ and $z = b^*$ it could have a pole. However, since the form $f_\pm\,dz$ measures the number of roots (up to a factor $L$), the integral of this form may not diverge at the locus of $\xi$ and $\eta$, and hence the function $f(z)$ is finite at $z = b$. Therefore, apart from poles at the images of $z = 0$ and $z = \infty$, the function $g(w)$ is completely analytic. The location and the residues at the poles are thus sufficient to determine $g(w)$ unambiguously.

The poles of $g(w)$ are located at the six images of $z = 0$, i.e. $w = e^{m\pi i/6}$ with odd $m$, and at the images of $z = \infty$, i.e. $w = e^{m\pi i/6 + i\gamma/3}$ with even $m$, where $\gamma$ is defined by $b = i|b|e^{-i\gamma}$. Table 1 lists the poles and residues of $g(w)$, which is chosen as $g(w)\,dw = f_+(z)\,dz$ in the neighborhood of $w = e^{\pi i/6}$. Fig. 11 shows the locus of the cuts in the $z$-plane, and the corresponding locus (sketched) in the $w$-plane. The table shows the position, $w_m$, of the poles in the $w$-plane, the functional behavior of $f(z) = g(w)\,dw/dz$ expressed in $f_\pm(z)$ and the residue $r_m$. The functional behavior in the various branches is found by applying the monodromy operators $C_\xi$ and $C_\eta$, and once that is found, the residues follow directly from Eq. (31). Now, $g(w)$ is simply expressed in its poles and residues as

$$g(w) = \sum_{m=1}^{12} \frac{r_m}{w - w_m}. \tag{37}$$

This also determines $f_\pm$ effectively, at least in terms of the parameters $b$, $n_+/L$ and $n_-/L$. It will turn out, however, that these parameters are not independent.
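The finiteness of the monodromy group used in this construction rests on the statement $(C_\eta C_\xi)^6 = 1$. A minimal sketch with exact integer arithmetic confirms this for the matrices of Eq. (34) and shows that 6 is the smallest such power; the helper names are illustrative assumptions.

```python
def matmul2(a, b):
    """Product of two 2x2 integer matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

IDENTITY = ((1, 0), (0, 1))
C_ETA = ((1, 0), (-1, 1))   # continuation through the eta-cut, Eq. (34)
C_XI = ((1, 1), (0, 1))     # continuation through the xi-cut, Eq. (34)

monodromy = matmul2(C_ETA, C_XI)

# Find the smallest n >= 1 with monodromy**n equal to the identity.
power = IDENTITY
order = None
for n in range(1, 13):
    power = matmul2(power, monodromy)
    if power == IDENTITY:
        order = n
        break

print(monodromy, order)
```

The third power of the product equals minus the identity, which is why the sheets in Table 1 come in pairs related by $w \to -w$ with opposite residues.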
Fig. 11. Locus of the cuts of $f(z)$ in the $z$-plane and of $g(w)$ in the $w$-plane. The function $g(w)$ has poles at the images of $z = 0$ and $z = \infty$, also marked.
The roots are determined by Eq. (29), so that at the solution curves the functions $F_\pm$ are purely imaginary. This implies that $f_\pm(z)\,dz$ also must be purely imaginary along the curves. Likewise the images of the solution curves in the $w$-plane are determined by $\mathrm{Re}\,g(w)\,dw = 0$. In $w = 0$ six of these curves intersect, which is only possible if $g(w=0) = 0$. With Eq. (37) this can simply be written as

$$\sum_{m=1}^{12} \frac{r_m}{w_m} = 0. \tag{38}$$

This complex equation is satisfied if

$$\frac{n_\pm}{L} = 1 - \frac{2}{\sqrt{3}} \cos\frac{\pi \mp \gamma}{3}. \tag{39}$$

These values of $n_\pm$ are consistent with hexagonal symmetry, see Eq. (A.3). The total triangle density (area fraction) $q$ can be related to $\gamma$, Eq. (A.4):

$$\sin\frac{\gamma}{3} = \frac{\sqrt{2q - 1}}{(1 - q)\sqrt{3} + 2q}. \tag{40}$$
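The claim that Eq. (39) solves the complex condition (38) can be checked directly from the pole data of Table 1. The sketch below builds the twelve pairs $(w_m, r_m)$ for a given $\gamma$, evaluates $g(0) = -\sum_m r_m/w_m$ from Eq. (37), and confirms that it vanishes when $n_\pm/L$ are taken from Eq. (39). The value of $\gamma$ and the function names are illustrative assumptions.

```python
import cmath
import math

def poles_and_residues(gamma, n_plus, n_minus):
    """Pairs (w_m, r_m), m = 1..12, as in Table 1; n_plus, n_minus denote n_+/L, n_-/L."""
    first_six = [
        (cmath.exp(1j * math.pi / 6), 1.0),
        (cmath.exp(1j * (math.pi + gamma) / 3), n_minus - 1.0),
        (1j, 0.0),
        (cmath.exp(1j * (2 * math.pi + gamma) / 3), 1.0 - n_plus),
        (-cmath.exp(-1j * math.pi / 6), -1.0),
        (-cmath.exp(1j * gamma / 3), 2.0 - n_plus - n_minus),
    ]
    # Rows 7..12 follow from rows 1..6 by w -> -w and r -> -r.
    return first_six + [(-w, -r) for w, r in first_six]

def g_at_zero(gamma, n_plus, n_minus):
    """g(0) from Eq. (37): sum of r_m / (0 - w_m)."""
    return -sum(r / w for w, r in poles_and_residues(gamma, n_plus, n_minus))

gamma = 0.4  # illustrative value (assumption)
n_plus = 1 - 2 / math.sqrt(3) * math.cos((math.pi - gamma) / 3)   # Eq. (39), upper sign
n_minus = 1 - 2 / math.sqrt(3) * math.cos((math.pi + gamma) / 3)  # Eq. (39), lower sign

print(abs(g_at_zero(gamma, n_plus, n_minus)))         # rounding level
print(abs(g_at_zero(gamma, n_plus + 0.05, n_minus)))  # clearly nonzero
```

Because Eq. (38) is one complex equation, it fixes two real quantities, which is why both $n_+/L$ and $n_-/L$ end up as functions of the single parameter $\gamma$.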
For $\gamma = 0$, we find $q = 1/2$, consistent with the maximal, twelve-fold, rotational symmetry, and for other values of $\gamma$ the triangle density $q > 1/2$. It is due to the assumption that the two solution curves of the BAE for $\xi$ and $\eta$ have common end points that at least half of the area is covered by triangles. In this region of the phase diagram we have the solution (37) to the integral equations (32).

We now wish to evaluate the corresponding eigenvalue (27) of the transfer matrix. Note that $\log\Lambda$ is a linear combination of $\log t$, $\sum_{j=1}^{n_+} \log|\xi_j|$ and $\sum_{k=1}^{n_-} \log|\eta_k|$. The functions $F_\pm$ (28), when evaluated in $z = 0$, yield terms of this type, but are hindered by the divergent $\log(z)$ term. Likewise, when $z \to \infty$, the functions $F_\pm$ have besides a logarithmic divergence also useful
finite contributions. Therefore, we simply subtract the divergent parts in these limits by introducing the functions

$$\begin{aligned}
F_1(z) &= F_+(z) - \log z,\\
F_2(z) &= F_+(z) - \left(1 - \frac{n_-}{L}\right)\log z,\\
F_3(z) &= F_-(z) - \log z,\\
F_4(z) &= F_-(z) - \left(1 - \frac{n_+}{L}\right)\log z.
\end{aligned} \tag{41}$$

Since $b$ is on the locus of the roots, $F_+(z=b)$ and $F_-(z=b)$ must be purely imaginary. Thus, directly from the definitions (28) and (41) we have the following relations:

$$\begin{aligned}
\mathrm{Re}[F_1(0) - F_1(b)] &= \log|b| + \frac{1}{L}\sum_{k=1}^{n_-} \log\left|\frac{t}{\eta_k}\right|,\\
\mathrm{Re}[F_2(\infty) - F_2(b)] &= \log|b| + \frac{n_-}{L}\log\frac{t}{|b|},\\
\mathrm{Re}[F_3(0) - F_3(b)] &= \log|b| + \frac{1}{L}\sum_{j=1}^{n_+} \log\left|\frac{t}{\xi_j}\right|,\\
\mathrm{Re}[F_4(\infty) - F_4(b)] &= \log|b| + \frac{n_+}{L}\log\frac{t}{|b|}.
\end{aligned} \tag{42}$$

But since we have the solution (37) of the integral equations (32), the left-hand sides can also be calculated as integrals of $f_\pm$:

$$\begin{aligned}
F_1(0) - F_1(b) &= \int_b^0 dz\left(f_+(z) - \frac{1}{z}\right) = \int_0^{w_1} dw\left(g(w) - \frac{1}{z}\frac{dz}{dw}\right),\\
F_2(\infty) - F_2(b) &= \int_0^{w_2} dw\left(g(w) - \left(1 - \frac{n_-}{L}\right)\frac{1}{z}\frac{dz}{dw}\right),\\
F_3(0) - F_3(b) &= \int_0^{w_{11}} dw\left(g(w) - \frac{1}{z}\frac{dz}{dw}\right),\\
F_4(\infty) - F_4(b) &= \int_0^{w_{10}} dw\left(g(w) - \left(1 - \frac{n_+}{L}\right)\frac{1}{z}\frac{dz}{dw}\right).
\end{aligned} \tag{43}$$

These integrals can be performed readily, knowing $g(w)$ from Eq. (37) and the second term in the integrand,

$$\frac{1}{z}\frac{dz}{dw} = \sum_{m=1}^{12} \frac{(-1)^{m+1}}{w - w_m}. \tag{44}$$

They remain finite, as the residue of the total integrand vanishes consistently at the upper integration limit. Thus, the combination of Eqs. (42) and (43) gives four equations linear in the quantities $\log t$, $\log|b|$, $\sum_j \log|\xi_j|$ and $\sum_k \log|\eta_k|$. The solution to these equations yields not only the
Fig. 12. The entropy of the square-triangle tiling as a function of the triangle area fraction. The analytic solution for $q > 1/2$ is shown as a solid line, and for $q < 1/2$ the dots indicate numerical results from the BAE.
eigenvalue (27), which is the partition sum per row, but also the entropy per vertex $S_v$, via the thermodynamic relation

$$S_v = \frac{1}{L}\log\Lambda - \frac{q_v}{2}\log t. \tag{45}$$

Here $q_v$ is the number of triangles per vertex,

$$q_v = 4\left(2 - \sqrt{3}\cos\frac{\gamma}{3}\right). \tag{46}$$

The entropy per vertex can now be expressed as

$$\begin{aligned}
S_v = \log\frac{108}{\cos^2\gamma}
&+ 2\cos\left(\frac{\pi}{6} + \frac{\gamma}{3}\right)\log\left[\tan\left(\frac{\pi}{12} + \frac{\gamma}{6}\right)\tan\left(\frac{\pi}{4} + \frac{\gamma}{6}\right)\right]\\
&+ 2\cos\left(\frac{\pi}{6} - \frac{\gamma}{3}\right)\log\left[\tan\left(\frac{\pi}{12} - \frac{\gamma}{6}\right)\tan\left(\frac{\pi}{4} - \frac{\gamma}{6}\right)\right],
\end{aligned} \tag{47}$$

in which $\gamma$ parametrizes the triangle density via Eq. (46) or Eq. (40). When the entropy is plotted versus the triangle area fraction $q$, as in Fig. 12, it shows a sharp maximum at $q = 1/2$. The analytic solution (47) applies only to the regime $q \geq 1/2$, but for smaller values of $q$ the BAE even without analytic solution permit numerical calculations for very large systems.
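As a sanity check on the closed form (47), the following sketch evaluates the entropy per vertex at the two ends of the regime $q \geq 1/2$. At $\gamma = 0$ (the twelve-fold symmetric point, $q = 1/2$) the expression reduces to $\log 108 + 2\sqrt{3}\log(2 - \sqrt{3}) \approx 0.12005$, and for $\gamma \to \pi/2$ (only triangles, $q = 1$) the divergent logarithms cancel and the entropy tends to zero. The function name is an illustrative assumption.

```python
import math

def entropy_per_vertex(gamma):
    """Entropy per vertex from Eq. (47); intended for 0 <= gamma < pi/2."""
    term_0 = math.log(108.0 / math.cos(gamma) ** 2)
    term_plus = 2 * math.cos(math.pi / 6 + gamma / 3) * math.log(
        math.tan(math.pi / 12 + gamma / 6) * math.tan(math.pi / 4 + gamma / 6)
    )
    term_minus = 2 * math.cos(math.pi / 6 - gamma / 3) * math.log(
        math.tan(math.pi / 12 - gamma / 6) * math.tan(math.pi / 4 - gamma / 6)
    )
    return term_0 + term_plus + term_minus

s_symmetric = entropy_per_vertex(0.0)
s_closed_form = math.log(108.0) + 2 * math.sqrt(3) * math.log(2 - math.sqrt(3))
s_all_triangles = entropy_per_vertex(math.pi / 2 - 1e-4)

print(s_symmetric, s_closed_form)  # both approximately 0.12005
print(s_all_triangles)             # close to zero
```

The vanishing entropy in the all-triangle limit is what one expects for a tiling that is then essentially unique.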
6. Summary and outlook

In this lecture we have discussed the nature of quasicrystals. Their diffraction patterns show non-crystallographic symmetries and at the same time sharp Bragg peaks. In contrast to the ordinary crystals the atoms are not arranged in periodic repetitions of the same unit cell. When the interparticle potential is sufficiently short ranged, random tiling models are a realistic description of these interesting solids.

In two dimensions, there are currently three solved random tiling models with a quasicrystalline phase [21,23,24]. All three describe tilings of the plane with rectangles and isosceles triangles. The angle between the equal sides of the triangle takes the values $2\pi/6$, $2\pi/4$ and $2\pi/5$, corresponding to dodecagonal, octagonal and decagonal symmetry, respectively. The solution of the first solved case is treated in this lecture. The other models are solved in a very similar way, but involve further complications. The solution results in a closed expression for the entropy as a function of the partial density of the tiles. Other physical quantities such as the phason elastic constants have also been calculated, but are not discussed here. We note, however, that the results are relatively recent and that one should also expect other physical quantities to be calculated in the near future (e.g. Ref. [25]).

Many aspects of these models have not yet been fully explored. It is not at all clear if and how the class of solvable random tiling models can be extended. In particular, it is not obvious why specifically the rectangle-triangle tilings share this favorable property. At the same time, it is tempting to think that rectangle-triangle tilings other than the three mentioned above may also be solvable. This is in fact the case: these models can be solved, but they are not related to quasicrystals. If the top angle of the triangle is $2\pi/n$ with $n > 6$, it is geometrically impossible to have a symmetric phase. The corresponding tiling model is effectively a deformation of the square-triangle tiling.

Appendix A. Geometric considerations

The constraint that the tiles have to cover the area without holes or overlaps gives a restriction on the number of tiles in each orientation which is not immediately obvious. Consider a configuration in which the area is mostly covered by triangles, such as in Fig. 5. Two orientations of the triangles form hexagonal domains, which are separated by domain walls consisting of squares. Configurations of this type permit fluctuations in which one hexagon grows at the expense of the surrounding hexagons. Under these variations the tile number in each orientation remains constant. Therefore, in order to derive a constraint on these numbers, it is sufficient to inspect the configurations in which all the domains have the same size and shape. Let in such configurations the length of the sides of the domains in the different directions be $l_1$, $l_2$ and $l_3$. Let $T_2$ be the number of triangles in the vertices of the domain walls, and $T_1$ the number of triangles that fill the hexagonal domains. The number of domains is equal to $T_2/2$. The number of triangles in each domain is equal to $2(l_1 l_2 + l_2 l_3 + l_3 l_1)$, so that

$$T_1 = T_2\,(l_1 l_2 + l_2 l_3 + l_3 l_1). \tag{A.1}$$

The number of squares in the three orientations is given by $S_j = l_j T_2/2$, so that the tile numbers must satisfy

$$S_1 S_2 + S_2 S_3 + S_3 S_1 = \tfrac{1}{4}\,T_1 T_2. \tag{A.2}$$
Fig. 13. Configuration of the lattice in the hexagonal phase.
This equation is valid for the total number of tiles in their respective orientations on a periodic manifold. Though in the derivation we have assumed that there are more triangles than squares, the validity is not dependent on this assumption.

In the phase with hexagonal symmetry, the squares in their different orientations occur in equal numbers, say $l_j = l$. We will use Eq. (A.2) in this situation to find a relation between the numbers of conserved particles and the tile numbers. Fig. 13 shows a configuration of the tiling, as it is deformed on the lattice, in which the particle trajectories are indicated by bold and dotted lines. The configuration shown is periodic with a somewhat complicated period. The number of horizontal edges in each period is $3l^2 + 3l + 1$. Of these there are $l$ open edges, $l + 1$ are marked with a bold line, and the remaining $3l^2 + l$ are marked with a dotted line. For the particle numbers this implies

$$\frac{n_+}{L} = \frac{l + 1}{3l^2 + 3l + 1}, \qquad \frac{n_-}{L} = \frac{3l^2 + l}{3l^2 + 3l + 1}. \tag{A.3}$$

The area fraction covered by the triangles $q$ can also be expressed in $l$. In each period there are $6l^2 + 2$ triangles and $3l$ squares. By accounting for the area of each we have for the triangle area fraction

$$q = \frac{3l^2 + 1}{(1 + l\sqrt{3})^2}. \tag{A.4}$$
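The geometric relations of this appendix can be cross-checked numerically against the formulas of Section 5. The sketch below starts from a domain size $l$, computes $n_\pm/L$ from Eq. (A.3) and $q$ from Eq. (A.4), obtains $\gamma$ from Eq. (40), and compares with Eq. (39) and Eq. (46). Two points are assumptions of this sketch rather than statements of the text: it restricts to $l \geq 1/\sqrt{3}$, so that the non-negative root in Eq. (40) applies, and it counts vertices per period by angle counting (half a vertex per triangle, one per square), which gives $3l^2 + 3l + 1$.

```python
import math

def geometry_mismatch(l):
    """Largest deviation between the appendix formulas and Eqs. (39), (46) for domain size l."""
    d = 3 * l * l + 3 * l + 1                      # horizontal edges per period
    n_plus = (l + 1) / d                           # Eq. (A.3)
    n_minus = (3 * l * l + l) / d                  # Eq. (A.3)
    q = (3 * l * l + 1) / (1 + l * math.sqrt(3)) ** 2   # Eq. (A.4)

    # Eq. (40), taking the non-negative root (assumes l >= 1/sqrt(3)).
    s = math.sqrt(2 * q - 1) / ((1 - q) * math.sqrt(3) + 2 * q)
    gamma = 3 * math.asin(s)

    # Eq. (39): upper sign for n_+, lower sign for n_-.
    n_plus_39 = 1 - 2 / math.sqrt(3) * math.cos((math.pi - gamma) / 3)
    n_minus_39 = 1 - 2 / math.sqrt(3) * math.cos((math.pi + gamma) / 3)

    # Eq. (46) against a direct count: 6 l^2 + 2 triangles per period, and
    # (6 l^2 + 2)/2 + 3 l vertices per period by angle counting (assumption).
    q_v_46 = 4 * (2 - math.sqrt(3) * math.cos(gamma / 3))
    q_v_count = (6 * l * l + 2) / ((6 * l * l + 2) / 2 + 3 * l)

    return max(
        abs(n_plus - n_plus_39),
        abs(n_minus - n_minus_39),
        abs(q_v_46 - q_v_count),
    )

worst = max(geometry_mismatch(l) for l in (1.0, 2.0, 5.0, 10.0))
print(worst)  # rounding level
```

The same three numbers also satisfy Eq. (A.2) trivially in the symmetric case, since then $S_1 = S_2 = S_3 = l\,T_2/2$ and $T_1 = 3 l^2 T_2$.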
References

[1] D. Shechtman, I. Blech, D. Gratias, J.W. Cahn, Metallic phase with long-range orientational order and no translational symmetry, Phys. Rev. Lett. 53 (1984) 1951.
[2] C. Janot, Quasicrystals: A Primer, 2nd ed., Clarendon Press, Oxford, 1994.
[3] P. Guyot, P. Kramer, M. de Boissieu, Quasicrystals, Rep. Prog. Phys. 54 (1991) 1373.
[4] R. Penrose, The role of aesthetics in pure and applied mathematical research, Bull. Inst. Math. Appl. 10 (1974) 266.
[5] N.G. de Bruijn, Algebraic theory of Penrose's non-periodic tilings of the plane, Proc. Konink. Nederl. Akad. Wetensch. A 84 (1981) 39.
[6] R. Penrose, Tilings and Quasi-crystals; a non-local growth problem? in: M.V. Jarić (Ed.), Aperiodicity and Order, vol. 2: Introduction to the Mathematics of Quasi-crystals, Academic Press, New York, 1989, p. 53.
[7] V. Elser, Comment on Quasicrystals: a new class of ordered structures, Phys. Rev. Lett. 54 (1985) 1730.
[8] V. Elser, Indexing problems in quasicrystal diffraction, Phys. Rev. B 32 (1985) 4892.
[9] C. Richard, M. Höpfe, J. Hermission, M. Baake, Random tilings: concepts and examples, cond-mat/9712267 (1997).
[10] M. Widom, D.P. Deng, C.L. Henley, Transfer-matrix analysis of a two-dimensional quasicrystal, Phys. Rev. Lett. 63 (1989) 310.
[11] K.J. Strandburg, L.H. Tang, M.V. Jarić, Phason elasticity in entropic quasicrystals, Phys. Rev. Lett. 63 (1989) 314.
[12] W. Li, H. Park, M. Widom, Phase diagram of a random tiling quasicrystal, J. Stat. Phys. 66 (1992) 1.
[13] D. Levine, P.J. Steinhardt, Quasicrystals, a new class of ordered materials, Phys. Rev. Lett. 53 (1984) 2477.
[14] C.L. Henley, Cell geometry for cluster based quasicrystal models, Phys. Rev. B 43 (1991) 993.
[15] C.L. Henley, Random tiling models, in: P.J. Steinhardt, D.P. Di Vincenzo (Eds.), Quasi-Crystals the State of the Art, Ch. 15, World Scientific, Singapore, 1991.
[16] H. Kawamura, Statistics of a two-dimensional amorphous lattice, Prog. Theor. Phys. 70 (1983) 352.
[17] M. Oxborrow, C.L. Henley, Random square-triangle tilings, a model of 12-fold quasicrystals, Phys. Rev. B 48 (1993) 6966.
[18] C.L. Henley, Random tilings with quasicrystal order: transfer-matrix approach, J. Phys. A 21 (1988) 1649.
[19] M. Widom, Bethe Ansatz solution of the square-triangle random tiling model, Phys. Rev. Lett. 70 (1993) 2094.
[20] R.J. Baxter, Exactly Solved Models in Statistical Mechanics, Academic Press, London, 1982.
[21] P.A. Kalugin, The square-triangle random-tiling model in the thermodynamic limit, J. Phys. A 27 (1994) 3599.
[22] J. de Gier, B. Nienhuis, Integrability of the square-triangle random tiling model, Phys. Rev. E 55 (1997) 3926.
[23] J. de Gier, B. Nienhuis, The exact solution of an octagonal rectangle-triangle random tiling, J. Stat. Phys. 87 (1997) 415.
[24] J. de Gier, B. Nienhuis, Bethe Ansatz solution of a decagonal rectangle triangle random tiling, J. Phys. A 31 (1998) 2141.
[25] P.A. Kalugin, Low-lying excitations in the square-triangle random tiling model, J. Phys. A 30 (1997) 7077.
Physics Reports 301 (1998) 299—486
Quantum chromodynamics and other field theories on the light cone

Stanley J. Brodsky (a), Hans-Christian Pauli (b), Stephen S. Pinsky (c)
(a) Stanford Linear Accelerator Center, Stanford University, Stanford, CA 94309, USA
(b) Max-Planck-Institut für Kernphysik, D-69029 Heidelberg, Germany
(c) Ohio State University, Columbus, OH 43210, USA

Received October 1997; editor: R. Petronzio
Contents

1. Introduction
2. Hamiltonian dynamics
2.1. Abelian gauge theory: quantum electrodynamics
2.2. Non-abelian gauge theory: Quantum chromodynamics
2.3. Parametrization of space-time
2.4. Forms of Hamiltonian dynamics
2.5. Parametrizations of the front form
2.6. The Poincaré symmetries in the front form
2.7. The equations of motion and the energy-momentum tensor
2.8. The interactions as operators acting in Fock space
3. Bound states on the light cone
3.1. The hadronic eigenvalue problem
3.2. The use of light-cone wavefunctions
3.3. Perturbation theory in the front form
3.4. Example 1: The $q\bar{q}$-scattering amplitude
3.5. Example 2: Perturbative mass renormalization in QED (KS)
3.6. Example 3: The anomalous magnetic moment
3.7. (1+1)-dimensional: Schwinger model (LB)
3.8. (3+1)-dimensional: Yukawa model
4. Discretized light-cone quantization
4.1. Why discretized momenta?
4.2. Quantum chromodynamics in 1+1 dimensions (KS)
4.3. The Hamiltonian operator in 3+1 dimensions (BL)
4.4. The Hamiltonian matrix and its regularization
4.5. Further evaluation of the Hamiltonian matrix elements
4.6. Retrieving the continuum formulation
4.7. Effective interactions in 3+1 dimensions
4.8. Quantum electrodynamics in 3+1 dimensions
4.9. The Coulomb interaction in the front form
5. The impact on hadronic physics
5.1. Light-cone methods in QCD
5.2. Moments of nucleons and nuclei in the light-cone formalism
5.3. Applications to nuclear systems
5.4. Exclusive nuclear processes
5.5. Conclusions
6. Exclusive processes and light-cone wavefunctions
6.1. Is PQCD factorization applicable to exclusive processes?
6.2. Light-cone quantization and heavy particle decays
6.3. Exclusive weak decays of heavy hadrons
6.4. Can light-cone wavefunctions be measured?
0370-1573/98/$19.00 Copyright ( 1998 Elsevier Science B.V. All rights reserved PII S 0 3 7 0 - 1 5 7 3 ( 9 7 ) 0 0 0 8 9 - 6
S.J. Brodsky et al. / Physics Reports 301 (1998) 299—486 7. The light-cone vacuum 7.1. Constrained zero modes 7.2. Physical picture and classification of zero modes 7.3. Dynamical zero modes 8. Non-perturbative regularization and renormalization 8.1. Tamm—Dancoff Integral equations 8.2. Wilson renormalization and confinement 9. Chiral symmetry breaking 9.1. Current algebra
419 419 429 434 436 437 442 446 447
9.2. Flavor symmetries 9.3. Quantum chromodynamics 9.4. Physical multiplets 10. The prospects and challenges Appendix A. General conventions Appendix B. The Lepage—Brodsky convention (LB) Appendix C. The Kogut—Soper convention (KS) Appendix D. Comparing BD- with LB-spinors Appendix E. The Dirac—Bergmann method References
Abstract
In recent years light-cone quantization of quantum field theory has emerged as a promising method for solving problems in the strong coupling regime. The approach has a number of unique features that make it particularly appealing, most notably, the ground state of the free theory is also a ground state of the full theory. We discuss the light-cone quantization of gauge theories from two perspectives: as a calculational tool for representing hadrons as QCD bound states of relativistic quarks and gluons, and also as a novel method for simulating quantum field theory on a computer. The light-cone Fock state expansion of wavefunctions provides a precise definition of the parton model and a general calculus for hadronic matrix elements. We present several new applications of light-cone Fock methods, including calculations of exclusive weak decays of heavy hadrons, and intrinsic heavy-quark contributions to structure functions. A general non-perturbative method for numerically solving quantum field theories, “discretized light-cone quantization”, is outlined and applied to several gauge theories. This method is invariant under the large class of light-cone Lorentz transformations, and it can be formulated such that ultraviolet regularization is independent of the momentum space discretization. Both the bound-state spectrum and the corresponding relativistic light-cone wavefunctions can be obtained by matrix diagonalization and related techniques. We also discuss the construction of the light-cone Fock basis, the structure of the light-cone vacuum, and outline the renormalization techniques required for solving gauge theories within the Hamiltonian formalism on the light cone. © 1998 Elsevier Science B.V. All rights reserved.
PACS: 11.10.Ef; 11.15.Tk; 12.38.Lg; 12.40.Yx
1. Introduction One of the outstanding central problems in particle physics is the determination of the structure of hadrons such as the proton and neutron in terms of their fundamental quark and gluon degrees of freedom. Over the past 20 years, two fundamentally different pictures of hadronic matter have developed. One, the constituent quark model (CQM) [469], or the quark parton model [144,145], is closely related to experimental observation. The other, quantum chromodynamics (QCD) is based on a covariant non-abelian quantum field theory. The front form of QCD [172] appears to be the only hope of reconciling these two. This elegant approach to quantum field theory is a Hamiltonian gauge-fixed formulation that avoids many of the most difficult problems in the equal-time formulation of the theory. The idea of deriving a front form constituent quark model from QCD actually dates from the early 1970s, and there is a rich literature on the subject [74,119,135,30,6,120,304,305,332,350,87,88,235—237]. The main thrust of this review will be to discuss the complexities that are unique to this formulation of QCD, and other quantum field theories, in varying degrees of detail. The goal is to present a self-consistent framework rather than trying to cover the subject exhaustively. We will attempt to present sufficient background material to allow the reader to see some of the advantages and complexities of light-front field theory. We will, however, not undertake to review all of the successes or applications of this approach. Along the way we clarify some obscure or little-known aspects, and offer some recent results. The light-cone wavefunctions encode the hadronic properties in terms of their quark and gluon degrees of freedom, and thus all hadronic properties can be derived from them. In the CQM, hadrons are relativistic bound states of a few confined quark and gluon quanta. 
The momentum distributions of quarks making up the nucleons in the CQM are well-determined experimentally from deep inelastic lepton scattering measurements, but there has been relatively little progress in computing the basic wavefunctions of hadrons from first principles. The bound-state structure of hadrons plays a critical role in virtually every area of particle physics phenomenology. For example, in the case of the nucleon form factors and open charm photoproduction the cross sections depend not only on the nature of the quark currents, but also on the coupling of the quarks to the initial and final hadronic states. Exclusive decay processes will be studied intensively at B-meson factories. They depend not only on the underlying weak transitions between the quark flavors, but also on the wavefunctions which describe how B-mesons and light hadrons are assembled in terms of their quark and gluon constituents. Unlike the leading twist structure functions measured in deep inelastic scattering, such exclusive channels are sensitive to the structure of the hadrons at the amplitude level and to the coherence between the contributions of the various quark currents and multi-parton amplitudes. In electro-weak theory, the central unknowns required for reliable calculations of weak decay amplitudes are the hadronic matrix elements. The coefficient functions in the operator product expansion needed to compute many types of experimental quantities are essentially unknown and can only be estimated at this point. The calculation of form factors and exclusive scattering processes, in general, depends in detail on the basic amplitude structure of the scattering hadrons in a general Lorentz frame. Even the calculation of the magnetic moment of a proton requires wavefunctions in a boosted frame. 
One thus needs a practical computational method for QCD which not only determines its spectrum, but which can provide also the non-perturbative hadronic matrix elements needed for general calculations in hadron physics.
An intuitive approach for solving relativistic bound-state problems would be to solve the gauge-fixed Hamiltonian eigenvalue problem. The natural gauge for light-cone Hamiltonian theories is the light-cone gauge $A^+ = 0$. In this physical gauge the gluons have only two physical transverse degrees of freedom. One imagines that there is an expansion in multi-particle occupation number Fock states. The solution of this problem is clearly a formidable task, and if successful, would allow one to calculate the structure of hadrons in terms of their fundamental degrees of freedom. But even in the case of the simpler abelian quantum theory of electrodynamics very little is known about the nature of the bound-state solutions in the strong-coupling domain. In the non-abelian quantum theory of chromodynamics, a calculation of bound-state structure has to deal with many difficult aspects of the theory simultaneously: confinement, vacuum structure, spontaneous breaking of chiral symmetry (for massless quarks), and describing a relativistic many-body system with unbounded particle number. The analytic problem of describing QCD bound states is compounded not only by the physics of confinement, but also by the fact that the wavefunction of a composite of relativistic constituents has to describe systems of an arbitrary number of quanta with arbitrary momenta and helicities. The conventional Fock state expansion based on equal-time quantization becomes quickly intractable because of the complexity of the vacuum in a relativistic quantum field theory. Furthermore, boosting such a wavefunction from the hadron’s rest frame to a moving frame is as complex a problem as solving the bound-state problem itself. In modern textbooks on quantum field theory [242,342], one therefore hardly finds any trace of a Hamiltonian. This reflects the contemporary conviction that the concept of a Hamiltonian is old-fashioned and littered with all kinds of almost intractable difficulties. 
The presence of the square root operator in the equal-time Hamiltonian approach presents severe mathematical difficulties. Even if these problems could be solved, the eigensolution is only determined in its rest system as noted above. Actually, the action and the Hamiltonian principle in some sense are complementary, and both have their own virtues. In solvable models they can be translated into each other. In the absence of such, it depends on the kind of problem one is interested in: The action method is particularly suited for calculating cross sections, while the Hamiltonian method is more suited for calculating bound states. Considering composite systems, systems of many constituent particles subject to their own interactions, the Hamiltonian approach seems to be indispensable in describing the connections between the constituent quark model, deep inelastic scattering, exclusive process, etc. In the CQM, one always describes mesons as made of a quark and an anti-quark, and baryons as made of three quarks (or three anti-quarks). These constituents are bound by some phenomenological potential which is tuned to account for the hadron’s properties such as masses, decay rates or magnetic moments. The CQM does not display any visible manifestation of spontaneous chiral symmetry breaking; actually, it totally prohibits such a symmetry since the constituent masses are large on a hadronic scale, typically of the order of one-half of a meson mass or one-third of a baryon mass. Standard values are 330 MeV for the up- and down-quark, and 490 MeV for the strange-quark, very far from the “current” masses of a few (tens) MeV. Even the ratio of the up- or down-quark masses to the strange-quark mass is vastly different in the two pictures. If one attempted to incorporate a bound gluon into the model, one would have to assign to it a mass at least of the order of magnitude of the quark mass, in order to limit its impact on the classification scheme. 
But a gluon mass violates the gauge invariance of QCD. Fortunately, “light-cone quantization”, which can be formulated independent of the Lorentz frame, offers an elegant avenue of escape. The square root operator does not appear, and the
vacuum structure is relatively simple. There is no spontaneous creation of massive fermions in the light-cone quantized vacuum. There are, in fact, many reasons to quantize relativistic field theories at fixed light-cone time. Dirac [123], in 1949, showed that in this so-called “front form” of Hamiltonian dynamics, a maximum number of Poincaré generators become independent of the interaction, including certain Lorentz boosts. In fact, unlike the traditional equal-time Hamiltonian formalism, quantization on a plane tangential to the light cone (“null plane”) can be formulated without reference to a specific Lorentz frame. One can construct an operator whose eigenvalues are the invariant mass squared $M^2$. The eigenvectors describe bound states of arbitrary four-momentum and invariant mass $M$ and allow the computation of scattering amplitudes and other dynamical quantities. The most remarkable feature of this approach, however, is the apparent simplicity of the light-cone vacuum. In many theories the vacuum state of the free Hamiltonian is also an eigenstate of the total light-cone Hamiltonian. The Fock expansion constructed on this vacuum state provides a complete relativistic many-particle basis for diagonalizing the full theory. The simplicity of the light-cone Fock representation as compared to that in equal-time quantization is directly linked to the fact that the physical vacuum state has a much simpler structure on the light cone because the Fock vacuum is an exact eigenstate of the full Hamiltonian. This follows from the fact that the total light-cone momentum $P^+ > 0$ and it is conserved. This means that all constituents in a physical eigenstate are directly related to that state, and not to disconnected vacuum fluctuations. In the Tamm–Dancoff method (TDA) and sometimes also in the method of discretized light-cone quantization (DLCQ), one approximates the field theory by truncating the Fock space. 
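The frame independence described above can be checked numerically for simple kinematics. The following is a minimal sketch, not part of the original review: it assumes the LB convention $p^\pm = p^0 \pm p^3$, so that $p^2 = p^+p^- - \vec p_\perp^{\,2}$, and the masses, momenta, rapidity and function names are hypothetical values chosen only for illustration. It shows that a boost along $z$ merely rescales $p^+$ by $e^{\eta}$, leaving the invariant mass squared and the momentum fraction $x = p^+/P^+$ of a constituent unchanged.

```python
import math

def on_shell(mass, px, py, pz):
    """Four-momentum (E, px, py, pz) of a free particle of the given mass."""
    return (math.sqrt(mass**2 + px**2 + py**2 + pz**2), px, py, pz)

def light_cone(p):
    """Return (p_plus, p_minus, px, py) with p^+- = E +- pz (LB convention, assumed)."""
    e, px, py, pz = p
    return (e + pz, e - pz, px, py)

def boost_z(p, eta):
    """Lorentz boost along z with rapidity eta."""
    e, px, py, pz = p
    ch, sh = math.cosh(eta), math.sinh(eta)
    return (ch * e + sh * pz, px, py, sh * e + ch * pz)

def mass_squared(p):
    """Invariant mass squared written in light-cone variables: p^+ p^- - p_perp^2."""
    plus, minus, px, py = light_cone(p)
    return plus * minus - px**2 - py**2

def add(p, q):
    return tuple(a + b for a, b in zip(p, q))

def fraction(constituent, total):
    """Longitudinal momentum fraction x = p^+ / P^+."""
    return light_cone(constituent)[0] / light_cone(total)[0]

# Two hypothetical constituents (illustrative numbers only).
q1 = on_shell(0.3, 0.1, -0.2, 0.5)
q2 = on_shell(0.3, -0.1, 0.2, 1.0)
total = add(q1, q2)

eta = 1.7  # arbitrary rapidity
q1_b, q2_b = boost_z(q1, eta), boost_z(q2, eta)
total_b = add(q1_b, q2_b)

print("M^2 before and after boost:", mass_squared(total), mass_squared(total_b))
print("x_1 before and after boost:", fraction(q1, total), fraction(q1_b, total_b))
```

Because the boost acts multiplicatively on $p^+$, any quantity built from ratios of plus-momenta is unaffected by it, which is the kinematical reason light-cone wavefunctions can be written in terms of momentum fractions.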
Based on the success of the constituent quark models, the assumption is that a few excitations describe the essential physics and that adding more Fock space excitations only refines the initial approximation. Wilson [455,456] has stressed the point that the success of the Feynman parton model provides hope for the eventual success of the front-form methods. One of the most important tasks in hadron physics is to calculate the spectrum and the wavefunctions of physical particles from a covariant theory, as mentioned. The method of “discretized light-cone quantization” has precisely this goal. Since its first formulation [354,355] many problems have been resolved but some remain open. To date, DLCQ has proved to be one of the most powerful tools available for solving bound-state problems in quantum field theory [363,68]. Let us review briefly the difficulties. As with conventional non-relativistic many-body theory one starts out with a Hamiltonian. The kinetic energy is a one-body operator and thus simple. The potential energy is at least a two-body operator and thus complicated. One has solved the problem if one has found one or several eigenvalues and eigenfunctions of the Hamiltonian equation. One can always expand the eigenstates in terms of products of single-particle states. These single-particle wavefunctions are solutions of an arbitrary “single-particle Hamiltonian”. In the Hamiltonian matrix for a two-body interaction most of the matrix elements vanish, since a two-body Hamiltonian changes the state of up to two particles. The structure of the Hamiltonian is that of a finite penta-diagonal block matrix. The dimension within a block, however, is infinite to start with. It is made finite by an artificial cut-off, for example on the single-particle quantum numbers. A finite matrix, however, can be diagonalized on a computer: the problem becomes ‘approximately soluble’. 
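The logic of "cut off the basis, diagonalize, and check the cut-off dependence" can be illustrated with a deliberately simple toy problem. The sketch below is not the calculation of Ref. [353]; it assumes a one-dimensional oscillator $H = p^2/2 + x^2/2 + \lambda x^2$ (units $\hbar = m = \omega = 1$) expanded in the even eigenstates of the unperturbed oscillator, where the matrix is tridiagonal and the exact ground-state energy $\sqrt{1+2\lambda}/2$ is known. The function names and the bisection (Sturm count) eigenvalue finder are choices made here for a dependency-free example.

```python
import math

def even_sector_matrix(lam, size):
    """Diagonal and off-diagonal of H in the states |0>, |2>, ..., |2(size-1)>.

    Uses <n|x^2|n> = (2n+1)/2 and <n+2|x^2|n> = sqrt((n+1)(n+2))/2.
    """
    diag = [(2 * k + 0.5) * (1.0 + lam) for k in range(size)]
    off = [0.5 * lam * math.sqrt((2 * k + 1) * (2 * k + 2)) for k in range(size - 1)]
    return diag, off

def count_below(diag, off, x):
    """Number of eigenvalues of the symmetric tridiagonal matrix below x (Sturm count)."""
    count, q = 0, 1.0
    for i, d in enumerate(diag):
        q = d - x - (off[i - 1] ** 2 / q if i > 0 else 0.0)
        if q == 0.0:
            q = 1e-300  # avoid division by zero in the next step
        if q < 0.0:
            count += 1
    return count

def lowest_eigenvalue(diag, off, iterations=200):
    """Bisection between Gershgorin bounds for the smallest eigenvalue."""
    n = len(diag)
    radii = [(abs(off[i - 1]) if i > 0 else 0.0) + (abs(off[i]) if i < n - 1 else 0.0)
             for i in range(n)]
    lo = min(d - r for d, r in zip(diag, radii))
    hi = max(d + r for d, r in zip(diag, radii))
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) >= 1:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

lam = 1.0
exact = 0.5 * math.sqrt(1.0 + 2.0 * lam)
energies = {size: lowest_eigenvalue(*even_sector_matrix(lam, size))
            for size in (1, 2, 4, 8, 16)}
for size, energy in energies.items():
    print(f"cut-off {size:2d} states: E0 = {energy:.12f}  (error {energy - exact:.2e})")
```

In this toy model the lowest eigenvalue decreases monotonically towards the exact value as the cut-off is raised, which is the kind of insensitivity to formal parameters that has to be verified in a realistic calculation.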
Of course, at the end, one must verify that the physical results are (more or less) insensitive to the cut-off(s) and other formal parameters. Early calculations [353], where this procedure was
actually carried out in one space dimension, showed rapid convergence to the exact eigenvalues. The method was successful in generating the exact eigenvalues and eigenfunctions for up to 30 particles. From these early calculations it was clear that discretized plane waves are a manifestly useful tool for many-body problems. In this review we will display the extension of this method (DLCQ) to various quantum field theories [137–140,227,228,258,259,261,264,354,355,358,422,29,272,359–361,392,393]. The first studies of model field theories had disregarded the so-called ‘zero modes’, the space-like constant field components defined in a finite spatial volume (discretization) and quantized at equal light-cone time. But subsequent studies have shown that they can support certain kinds of vacuum structure. The long range phenomena of spontaneous symmetry breaking [206–208,33,382,223,389] as well as the topological structure [259,261] can in fact be reproduced when they are included carefully. The phenomena are realized in quite different ways. For example, spontaneous breaking of $Z_2$ symmetry ($\phi \to -\phi$) in the $\phi^4$-theory in 1+1 dimension occurs via a constrained zero mode of the scalar field [33]. There the zero mode satisfies a nonlinear constraint equation that relates it to the dynamical modes in the problem. At the critical coupling a bifurcation of the solution occurs [209,210,389,33]. In formulating the theory, one must choose one of them. This choice is analogous to what in the conventional language we would call the choice of vacuum state. These solutions lead to new operators in the Hamiltonian which break the $Z_2$ symmetry at and beyond the critical coupling. The various solutions contain c-number pieces which produce the possible vacuum expectation values of $\phi$. The properties of the strong-coupling phase transition in this model are reproduced, including its second-order nature and a reasonable value for the critical coupling [33,382]. 
One should emphasize that solving the constraint equations really amounts to determining the Hamiltonian ($P^-$) and possibly other Poincaré generators, while the wavefunction of the vacuum remains simple. In general, $P^-$ becomes very complicated when the constraint zero modes are included, and this in some sense is the price to pay to have a formulation with a simple vacuum, combined with possibly finite vacuum expectation values. Alternatively, it should be possible to think of discretization as a cutoff which removes states with $0 < p^+ < \pi/L$, and the zero mode contributions to the Hamiltonian as effective interactions that restore the discarded physics. In the light-front power counting à la Wilson it is clear that there will be a huge number of allowed operators. Quite separately, Kalloniatis et al. [259] have shown that also a dynamical zero mode arises in a pure SU(2) Yang–Mills theory in 1+1 dimensions. A complete fixing of the gauge leaves the theory with one degree of freedom, the zero mode of the vector potential $A^+$. The theory has a discrete spectrum of zero-$P^+$ states corresponding to modes of the flux loop around the finite space. Only one state has a zero eigenvalue of the energy $P^-$, and is the true ground state of the theory. The non-zero eigenvalues are proportional to the length of the spatial box, consistent with the flux loop picture. This is a direct result of the topology of the space. Since the theory considered there was a purely topological field theory, the exact solution was identical to that in the conventional equal-time approach on the analogous spatial topology [217]. Much of the work so far performed has been for theories in 1+1 dimensions. For these theories there is much success to report. Numerical solutions have been obtained for a variety of gauge theories including U(1) and SU(N) for $N = 1, 2, 3$ and 4 [228,227,229,230,272]; Yukawa [182]; and to some extent $\phi^4$ [203,204]. 
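A small counting exercise shows why discretization makes such 1+1 dimensional problems finite. The sketch below rests on assumptions that are not spelled out in the text above: a single species of boson with periodic boundary conditions, allowed momenta $p^+ = n\pi/L$ with $n = 1, 2, \ldots$ (zero modes dropped), and a fixed total momentum $P^+ = K\pi/L$ with integer $K$. Under these assumptions every Fock state is a way of writing $K$ as a sum of positive integers, so the basis is finite for each $K$. The function name `fock_states` and the optional particle-number cut are hypothetical.

```python
def fock_states(total, max_part=None):
    """All non-increasing tuples of positive integers that sum to `total`.

    Each tuple lists the integer momenta n_i of the bosons in one Fock state.
    """
    if max_part is None:
        max_part = total
    if total == 0:
        return [()]
    states = []
    for n in range(min(total, max_part), 0, -1):
        for rest in fock_states(total - n, n):
            states.append((n,) + rest)
    return states

def truncated(states, max_particles):
    """Tamm-Dancoff-like cut: keep only states with at most `max_particles` quanta."""
    return [s for s in states if len(s) <= max_particles]

for K in (2, 4, 6, 8, 10):
    states = fock_states(K)
    print(f"K = {K:2d}: {len(states):3d} Fock states, "
          f"{len(truncated(states, 2)):2d} with at most two particles")

print("States at K = 4:", fock_states(4))
```

The number of states grows quickly with $K$ but stays finite, because positivity of each $p^+$ forbids arbitrarily many quanta at fixed total momentum. A cut on particle number reduces the basis much further.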
A considerable amount of analysis of $\phi^4$ [203,204,206–210,214] has been performed and a fairly complete discussion of the Schwinger model has been presented
[137–139,326,210,214,296]. The long-standing problem in reaching high numerical accuracy towards the massless limit has been resolved recently [438]. The extension of this program to physical theories in 3+1 dimensions is a formidable computational task because of the much larger number of degrees of freedom. The amount of work is therefore understandably smaller; however, progress is being made. Analyses of the spectrum and light-cone wavefunctions of positronium in QED$_{3+1}$ have been made by Tang et al. [422] and Krautgärtner et al. [279]. Numerical studies on positronium have provided the Bohr, the fine, and the hyperfine structure with very good accuracy [429]. Currently, Hiller et al. [222] are pursuing a non-perturbative calculation of the lepton anomalous moment in QED using the DLCQ method. Burkardt [79] and, more recently, van de Sande and Dalley [79,437,439,116] have solved gauge theories with transverse dimensions by combining a transverse lattice method with DLCQ, taking up an old suggestion of Bardeen and Pearson [17,18]. Also of interest is recent work of Hollenberg and Witte [225], who have shown how Lanczos tri-diagonalization can be combined with a plaquette expansion to obtain an analytic extrapolation of a physical system to infinite volume. The major problem one faces here is a reasonable definition of an effective interaction including the many-body amplitudes [357,361]. There has been considerable work focusing on the truncations required to reduce the space of states to a manageable level [363,367,368,456]. The natural language for this discussion is that of the renormalization group, with the goal being to understand the kinds of effective interactions that occur when states are removed, either by cutoffs of some kind or by an explicit Tamm–Dancoff truncation. Solutions of the resulting effective Hamiltonian can then be obtained by various means, for example using DLCQ or basis function techniques. 
Some calculations of the spectrum of heavy quarkonia in this approach have recently been reported [48]. Formal work on renormalization in 3+1 dimensions [339] has yielded some positive results but many questions remain. More recently, DLCQ has been applied to new variants of QCD$_{1+1}$ with quarks in the adjoint representation, thus obtaining color-singlet eigenstates analogous to gluonium states [121,360,437]. The physical nature of the light-cone Fock representation has important consequences for the description of hadronic states. As will be discussed in greater detail in Sections 3 and 5, one can compute electromagnetic and weak form factors rather directly from an overlap of light-cone wavefunctions $\psi_n(x_i, \vec{k}_{\perp i}, \lambda_i)$ [131,299,418]. Form factors are generally constructed from hadronic matrix elements of the current $\langle p|\,j^\mu(0)\,|p+q\rangle$. In the interaction picture one can identify the fully interacting Heisenberg current $J^\mu$ with the free current $j^\mu$ at the space-time point $x^\mu = 0$. Calculating matrix elements of the current $j^+ = j^0 + j^3$ in a frame with $q^+ = 0$, only diagonal matrix elements in particle number $n' = n$ are needed. In contrast, in the equal-time theory one must also consider off-diagonal matrix elements and fluctuations due to particle creation and annihilation in the vacuum. In the non-relativistic limit one can make contact with the usual formulas for form factors in Schrödinger many-body theory. In the case of inclusive reactions, the hadron and nuclear structure functions are the probability distributions constructed from integrals and sums over the absolute squares $|\psi_n|^2$. In the far off-shell domain of large parton virtuality, one can use perturbative QCD to derive the asymptotic fall-off of the Fock amplitudes, which then in turn leads to the QCD evolution equations for distribution amplitudes and structure functions. More generally, one can prove factorization theorems for exclusive and inclusive reactions which separate the hard and soft momentum transfer
regimes, thus obtaining rigorous predictions for the leading power behavior contributions to large momentum transfer cross sections. One can also compute the far off-shell amplitudes within the light-cone wavefunctions where heavy quark pairs appear in the Fock states. Such states persist over a time $\tau \simeq P^+/M^2$ until they are materialized in the hadron collisions. As we shall discuss in Section 6, this leads to a number of novel effects in the hadroproduction of heavy quark hadronic states [67]. A number of properties of the light-cone wavefunctions of the hadrons are known from both phenomenology and the basic properties of QCD. For example, the endpoint behavior of light-cone wave and structure functions can be determined from perturbative arguments and Regge arguments. Applications are presented in Ref. [70]. There are also correspondence principles. For example, for heavy quarks in the non-relativistic limit, the light-cone formalism reduces to conventional many-body Schrödinger theory. On the other hand, we can also build effective three-quark models which encode the static properties of relativistic baryons. The properties of such wavefunctions are discussed in Section 5. We will review the properties of vector and axial vector non-singlet charges and compare the space-time with their light-cone realization. We will show that the space-time and light-cone axial currents are distinct; this remark is at the root of the difference between the chiral properties of QCD in the two forms. We show for the free quark model that the front form is chirally symmetric in the SU(3) limit, whether the common mass is zero or not. In QCD chiral symmetry is broken both explicitly and dynamically. This is reflected on the light-cone by the fact that the axial-charges are not conserved even in the chiral limit. Vector and axial-vector charges annihilate the Fock space vacuum and so are bona fide operators. 
They form an $SU(3) \otimes SU(3)$ algebra and conserve the number of quarks and anti-quarks separately when acting on a hadron state. Hence, they classify hadrons, on the basis of their valence structure, into multiplets which are not mass degenerate. This classification however turns out to be phenomenologically deficient. The remedy of this situation is a unitary transformation between the charges and the physical generators of the classifying $SU(3) \otimes SU(3)$ algebra. Although we are still far from solving QCD explicitly, it now is the right time to give a presentation of the light-cone activities to a larger community. The front form can contribute to the physical insight and interpretation of experimental results. We therefore will combine a certain amount of pedagogical presentation of canonical field theory with the rather abstract and theoretical questions of most recent advances. The present attempt can neither be exhaustive nor complete, but we have in mind that we ultimately have to deal with the true physical questions of experiment. We will use two different metrics in this review. The literature is about evenly split in their use. We have, for the most part, used the metric that was used in the original work being reviewed. We label them the LB convention and the KS convention and discuss them in more detail in Section 2 and the appendix.
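As a quick orientation on the two conventions, the sketch below compares how the Lorentz scalar product looks in each. It assumes the commonly quoted definitions, $a^\pm = a^0 \pm a^3$ for the Lepage–Brodsky (LB) convention and $a^\pm = (a^0 \pm a^3)/\sqrt{2}$ for the Kogut–Soper (KS) convention; the appendices of the review remain the authoritative statement. The function names and sample vectors are hypothetical.

```python
import math

def to_light_cone(a, convention):
    """Return (a_plus, a_minus, a1, a2) for a = (a0, a1, a2, a3)."""
    if convention not in ("LB", "KS"):
        raise ValueError("convention must be 'LB' or 'KS'")
    a0, a1, a2, a3 = a
    norm = 1.0 if convention == "LB" else math.sqrt(2.0)
    return ((a0 + a3) / norm, (a0 - a3) / norm, a1, a2)

def dot_light_cone(a, b, convention):
    """Scalar product in light-cone components.

    LB: a.b = (a+ b- + a- b+)/2 - a_perp.b_perp
    KS: a.b =  a+ b- + a- b+    - a_perp.b_perp
    """
    ap, am, a1, a2 = to_light_cone(a, convention)
    bp, bm, b1, b2 = to_light_cone(b, convention)
    weight = 0.5 if convention == "LB" else 1.0
    return weight * (ap * bm + am * bp) - a1 * b1 - a2 * b2

def dot_minkowski(a, b):
    """Scalar product with metric (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

a = (2.0, 0.3, -0.4, 1.1)   # illustrative four-vectors
b = (1.5, -0.2, 0.7, 0.6)
print("Minkowski:", dot_minkowski(a, b))
print("LB       :", dot_light_cone(a, b, "LB"))
print("KS       :", dot_light_cone(a, b, "KS"))
```

Physical results do not depend on the choice, but factors of 2 and $\sqrt{2}$ move around between the two forms, which is why formulas taken from different papers should be compared only after checking which convention is in use.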
2. Hamiltonian dynamics
What is a Hamiltonian? Dirac [125] defines the Hamiltonian $H$ as that operator whose action on the state vector $|\psi\rangle$ of a physical system has the same effect as taking the partial derivative with respect to time $t$, i.e.
$$ H\,|\psi\rangle = i\,\frac{\partial}{\partial t}\,|\psi\rangle . \tag{2.1} $$
Its expectation value is a constant of the motion, referred to shortly as the “energy” of the system. We will not consider pathological constructs where a Hamiltonian depends explicitly on time. The concept of an energy has developed over many centuries and applies irrespective of whether one deals with the motion of a non-relativistic particle in classical mechanics or with a non-relativistic wavefunction in the Schrödinger equation, and it generalizes almost unchanged to a relativistic and covariant field theory. The Hamiltonian operator $P_0$ is a constant of the motion which acts as the displacement operator in time $x^0 \equiv t$,
$$ P_0\,|x^0\rangle = i\,\frac{\partial}{\partial x^0}\,|x^0\rangle . \tag{2.2} $$
This definition applies also in the front form, where the “Hamiltonian” operator $P_+$ is a constant of the motion whose action on the state vector,
$$ P_+\,|x^+\rangle = i\,\frac{\partial}{\partial x^+}\,|x^+\rangle , \tag{2.3} $$
has the same effect as the partial derivative with respect to “light-cone time” $x^+ \equiv (t+z)$. In this section we elaborate on these concepts and operational definitions in some detail for a relativistic theory, focusing on covariant gauge field theories. For the most part the LB convention is used; however, many of the results are convention independent.

2.1. Abelian gauge theory: Quantum electrodynamics
The prototype of a field theory is Faraday’s and Maxwell’s electrodynamics [323], which is gauge invariant as first pointed out by Weyl [449]. The non-trivial set of Maxwell’s equations has the four components
$$ \partial_\mu F^{\mu\nu} = g J^\nu . \tag{2.4} $$
The six components of the electric and magnetic fields are collected into the antisymmetric electromagnetic field tensor $F^{\mu\nu} \equiv \partial^\mu A^\nu - \partial^\nu A^\mu$ and expressed in terms of the vector potentials $A^\mu$ describing vector bosons with a strictly vanishing mass. Each component is a real-valued operator function of the three space coordinates $x^k = (x, y, z)$ and of the time $x^0 = t$. The space-time coordinates are arranged into the vector $x^\mu$ labeled by the Lorentz indices ($\kappa, \lambda, \mu, \nu = 0, 1, 2, 3$). The Lorentz indices are lowered by the metric tensor $g_{\mu\nu}$ and raised by $g^{\mu\nu}$ with $g_{\kappa\mu}g^{\mu\lambda} = \delta_\kappa^{\ \lambda}$. These and other conventions are collected in Appendix A. The coupling constant $g$ is related to the dimensionless fine structure constant by
$$ \alpha = \frac{g^2}{4\pi\hbar c} . \tag{2.5} $$
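For orientation on magnitudes, Eq. (2.5) can be inverted to obtain the numerical size of the coupling. The snippet below is only a worked-arithmetic sketch: it assumes natural units $\hbar = c = 1$ and uses the approximate measured value $\alpha \approx 1/137.036$, which is not quoted in the text above; the function name is hypothetical.

```python
import math

ALPHA = 1.0 / 137.036  # approximate fine structure constant (assumed input)

def coupling_from_alpha(alpha, hbar=1.0, c=1.0):
    """Invert alpha = g^2 / (4 pi hbar c) for the coupling constant g."""
    return math.sqrt(4.0 * math.pi * alpha * hbar * c)

def alpha_from_coupling(g, hbar=1.0, c=1.0):
    """Eq. (2.5) as written: alpha = g^2 / (4 pi hbar c)."""
    return g * g / (4.0 * math.pi * hbar * c)

g = coupling_from_alpha(ALPHA)
print(f"g = {g:.5f} in units with hbar = c = 1")
print(f"round trip: 1/alpha = {1.0 / alpha_from_coupling(g):.3f}")
```

The coupling comes out at roughly 0.3, which is what makes perturbation theory in $g$ a useful tool for QED.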
The antisymmetry of $F^{\mu\nu}$ implies a vanishing four-divergence of the current $J^\nu(x)$, i.e.
$$ \partial_\mu J^\mu = 0 . \tag{2.6} $$
In the equation of motion, the time derivatives of the vector potentials are expressed as functionals of the fields and their space-like derivatives, which in the present case are of second order in the time, like $\partial_0\partial_0 A^\mu = f[A^\nu, J^\mu]$. The Dirac equations
$$ (i\gamma^\mu\partial_\mu - m)\,\Psi = g\,\gamma^\mu A_\mu\,\Psi , \tag{2.7} $$
for given values of the vector potentials $A_\mu$, define the time derivatives of the four complex-valued spinor components $\Psi_\alpha(x)$ and their adjoints $\bar\Psi_\alpha(x) = \Psi^\dagger_\beta(x)(\gamma^0)_{\beta\alpha}$, and thus of the current $J^\nu \equiv \bar\Psi\gamma^\nu\Psi = \bar\Psi_\alpha\gamma^\nu_{\alpha\beta}\Psi_\beta$. The mass of the fermion is denoted by $m$, the four Dirac matrices by $\gamma^\mu = (\gamma^\mu)_{\alpha\beta}$. The Dirac indices $\alpha$ or $\beta$ enumerate the components from 1 to 4; doubly occurring indices are implicitly summed over without reference to their lowering or raising. The combined set of the Maxwell and Dirac equations is closed. The combined set of the 12 coupled differential equations in 3+1 space-time dimensions is called quantum electrodynamics (QED).
The trajectories of physical particles extremalize the action. Similarly, the equations of motion in a field theory like Eqs. (2.4) and (2.7) extremalize the action density, usually referred to as the Lagrangian $\mathcal{L}$. The Lagrangian of quantum electrodynamics (QED)
$$ \mathcal{L} = -\tfrac{1}{4}F^{\mu\nu}F_{\mu\nu} + \tfrac{1}{2}\left[\bar\Psi\,(i\gamma^\mu D_\mu - m)\,\Psi + \text{h.c.}\right] , \tag{2.8} $$
with the covariant derivative $D_\mu = \partial_\mu - igA_\mu$, is a local and hermitean operator, classically a real function of space-time $x^\mu$. This almost empirical fact can be cast into the familiar and canonical calculus of variation as displayed in many text books [39,242], whose essentials shall be recalled briefly.
The Lagrangian for QED is a functional of the 12 components $\Psi_\alpha(x)$, $\bar\Psi_\alpha(x)$, $A_\mu(x)$ and their space-time derivatives. Denoting them collectively by $\phi_r(x)$ and $\partial_\mu\phi_r(x)$ one has thus $\mathcal{L} = \mathcal{L}[\phi_r, \partial_\mu\phi_r]$. Crucial is that $\mathcal{L}$ depends on space-time only through the fields. Independent variation of the action with respect to $\phi_r$ and $\partial_\mu\phi_r$,
$$ \delta\int dx^0\,dx^1\,dx^2\,dx^3\ \mathcal{L}(x) = 0 , \tag{2.9} $$
results in the 12 equations of motion, the Euler equations
$$ \partial_\kappa\pi^\kappa_r - \frac{\delta\mathcal{L}}{\delta\phi_r} = 0 \quad\text{with}\quad \pi^\kappa_r[\phi] \equiv \frac{\delta\mathcal{L}}{\delta(\partial_\kappa\phi_r)} , \tag{2.10} $$
for $r = 1, 2, \ldots, 12$. The generalized momentum fields $\pi^\kappa_r[\phi]$ are introduced here for convenience and later use, with the argument $[\phi]$ usually suppressed except when useful to emphasize the field in question. The Euler equations symbolize the most compact form of equations of motion. Indeed, the variation with respect to the vector potentials
$$ \frac{\delta\mathcal{L}}{\delta(\partial_\kappa A_\lambda)} \equiv \pi^{\lambda\kappa}[A] = -F^{\kappa\lambda} \quad\text{and}\quad \frac{\delta\mathcal{L}}{\delta A_\lambda} \equiv gJ^\lambda = g\,\bar\Psi\gamma^\lambda\Psi \tag{2.11} $$
yields straightforwardly the Maxwell equations (2.4), and varying with respect to the spinors
$$ \pi^\kappa_\alpha[\psi] \equiv \frac{\delta\mathcal{L}}{\delta(\partial_\kappa\Psi_\alpha)} = \frac{i}{2}\,\bar\Psi_\beta\gamma^\kappa_{\beta\alpha} , \qquad \frac{\delta\mathcal{L}}{\delta\Psi_\alpha} = -\frac{i}{2}\,\partial_\mu\bar\Psi_\beta\gamma^\mu_{\beta\alpha} + g\,\bar\Psi_\beta\gamma^\mu_{\beta\alpha}A_\mu - m\bar\Psi_\alpha \tag{2.12} $$
and its adjoints give the Dirac equations (2.7).
The canonical formalism is particularly suited for discussing the symmetries of a field theory. According to a theorem of Noether [242,346] every continuous symmetry of the Lagrangian is associated with a four-current whose four-divergence vanishes. This in turn implies a conserved
charge as a constant of motion. Integrating the current $J^\mu$ in Eq. (2.6) over a three-dimensional surface of a hypersphere, embedded in four-dimensional space-time, generates a conserved charge. The surface element $d\omega_\lambda$ and the (finite) volume $\Omega$ are defined most conveniently in terms of the totally antisymmetric tensor $\epsilon_{\lambda\mu\nu\rho}$ ($\epsilon_{0123} = 1$):
$$ d\omega_\lambda = \frac{1}{3!}\,\epsilon_{\lambda\mu\nu\rho}\,dx^\mu\,dx^\nu\,dx^\rho , \qquad \Omega = \int d\omega_0 = \int dx^1\,dx^2\,dx^3 , \tag{2.13} $$
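respectively. Integrating Eq. (2.6) over the hyper-surface specified by $x^0 = \text{const.}$ then reads
$$ \frac{\partial}{\partial x^0}\int_\Omega dx^1\,dx^2\,dx^3\ J^0(x) + \int_\Omega dx^1\,dx^2\,dx^3\left[\frac{\partial}{\partial x^1}J^1(x) + \frac{\partial}{\partial x^2}J^2(x) + \frac{\partial}{\partial x^3}J^3(x)\right] = 0 . \tag{2.14} $$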
The terms in the square bracket reduce to surface terms which vanish if the boundary conditions are carefully defined. Under that proviso the charge
$$ Q = \int_\Omega d\omega_0\ J^0(x) = \int_\Omega dx^1\,dx^2\,dx^3\ J^0(x^0, x^1, x^2, x^3) \tag{2.15} $$
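is independent of time $x^0$ and a constant of the motion.

Since $\mathcal{L}$ is frame-independent, there must be ten conserved four-currents. Here they are

$$\partial_\lambda T^{\lambda\nu} = 0 , \qquad \partial_\lambda J^{\lambda,\mu\nu} = 0 , \tag{2.16}$$

where the energy-momentum $T^{\lambda\nu}$ and the boost-angular-momentum stress tensor $J^{\lambda,\mu\nu}$ are, respectively,

$$T^{\lambda\nu} = \pi_r^\lambda\,\partial^\nu\phi_r - g^{\lambda\nu}\mathcal{L} , \qquad J^{\lambda,\mu\nu} = x^\mu T^{\lambda\nu} - x^\nu T^{\lambda\mu} + \pi_r^\lambda\,\Sigma^{\mu\nu}_{rs}\,\phi_s . \tag{2.17}$$

As a consequence the Lorentz group has ten "conserved charges", the ten constants of the motion

$$P^\nu = \int_\Omega d\omega_0\, \left(\pi_r^0\,\partial^\nu\phi_r - g^{0\nu}\mathcal{L}\right) , \qquad M^{\mu\nu} = \int_\Omega d\omega_0\, \left(x^\mu T^{0\nu} - x^\nu T^{0\mu} + \pi_r^0\,\Sigma^{\mu\nu}_{rs}\,\phi_s(x)\right) , \tag{2.18}$$

the 4 components of energy-momentum and the 6 boost-angular momenta, respectively. The first two terms in $M^{\mu\nu}$ correspond to the orbital and the last term to the spin part of angular momentum. The spin part $\Sigma$ is either

$$\Sigma^{\mu\nu}_{\alpha\beta} = \tfrac{1}{4}[\gamma^\mu, \gamma^\nu]_{\alpha\beta} \quad\text{or}\quad \Sigma^{\mu\nu}_{\rho\sigma} = g^\mu_\rho\, g^\nu_\sigma - g^\mu_\sigma\, g^\nu_\rho , \tag{2.19}$$

depending on whether $\phi_r$ refers to spinor or to vector fields, respectively. In the latter case, we substitute $\pi_r^\lambda \to \pi^{\rho\lambda} = \delta\mathcal{L}/\delta(\partial_\lambda A_\rho)$ and $\phi_s \to A^\sigma$. Inserting Eqs. (2.11) and (2.12) one gets for gauge theory the familiar expressions [39]

$$J^{\lambda,\mu\nu} = x^\mu T^{\lambda\nu} - x^\nu T^{\lambda\mu} + \tfrac{i}{8}\bar\Psi\left(\gamma^\lambda[\gamma^\mu, \gamma^\nu] + [\gamma^\nu, \gamma^\mu]\gamma^\lambda\right)\Psi + A^\mu F^{\lambda\nu} - A^\nu F^{\lambda\mu} . \tag{2.20}$$

The symmetries will be discussed further in Section 2.6.

In deriving the energy-momentum stress tensor one might overlook that $\pi_r^\lambda[\phi]$ does not necessarily commute with $\partial^\mu\phi_r$. As a rule, one therefore should symmetrize in the boson and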
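anti-symmetrize in the fermion fields, i.e.

$$\pi_r^\lambda[\phi]\,\partial^\mu\phi_r \to \tfrac{1}{2}\left(\pi_r^\lambda[\phi]\,\partial^\mu\phi_r + \partial^\mu\phi_r\,\pi_r^\lambda[\phi]\right) , \qquad \pi_r^\lambda[\psi]\,\partial^\mu\psi_r \to \tfrac{1}{2}\left(\pi_r^\lambda[\psi]\,\partial^\mu\psi_r - \partial^\mu\psi_r\,\pi_r^\lambda[\psi]\right) , \tag{2.21}$$

respectively, but this will be done only implicitly.

The Lagrangian $\mathcal{L}$ is invariant under local gauge transformations, in general described by a unitary and space-time-dependent matrix operator $U^{-1}(x) = U^\dagger(x)$. In QED, the dimension of this matrix is 1 with the most general form $U(x) = e^{-ig\Lambda(x)}$. Its elements form the abelian group U(1), hence abelian gauge theory. If one substitutes the spinor and vector fields in $F^{\mu\nu}$ and $\bar\Psi_\alpha D_\mu\Psi_\beta$ according to

$$\tilde\Psi_\alpha = U\Psi_\alpha , \qquad \tilde A_\mu = U A_\mu U^\dagger + \frac{i}{g}(\partial_\mu U)U^\dagger , \tag{2.22}$$

one verifies their invariance under this transformation, as well as that of the whole Lagrangian. The Noether current associated with this symmetry is the $J^\mu$ of Eq. (2.11).

A straightforward application of the variational principle, Eqs. (2.11) and (2.12), does not immediately yield manifestly gauge-invariant expressions. Rather one gets

$$T^{\mu\nu} = F^{\mu\kappa}\partial^\nu A_\kappa + \tfrac{1}{2}\left[\bar\Psi i\gamma^\mu\partial^\nu\Psi + \mathrm{h.c.}\right] - g^{\mu\nu}\mathcal{L} . \tag{2.23}$$

However, using the Maxwell equations one derives the identity

$$F^{\mu\kappa}\partial^\nu A_\kappa = F^{\mu\kappa}F^\nu{}_\kappa + gJ^\mu A^\nu + \partial_\kappa(F^{\mu\kappa}A^\nu) . \tag{2.24}$$

Inserting that into the former gives

$$T^{\mu\nu} = F^{\mu\kappa}F_\kappa{}^\nu + \tfrac{1}{2}\left[i\bar\Psi\gamma^\mu D^\nu\Psi + \mathrm{h.c.}\right] - g^{\mu\nu}\mathcal{L} + \partial_\kappa(F^{\mu\kappa}A^\nu) . \tag{2.25}$$

All explicit gauge dependence resides in the last term in the form of a four-divergence. One can thus write

$$T^{\mu\nu} = F^{\mu\kappa}F_\kappa{}^\nu + \tfrac{1}{2}\left[i\bar\Psi\gamma^\mu D^\nu\Psi + \mathrm{h.c.}\right] - g^{\mu\nu}\mathcal{L} , \tag{2.26}$$

which together with energy-momentum

$$P^\nu = \int_\Omega d\omega_0\, \left(F^{0\kappa}F_\kappa{}^\nu - g^{0\nu}\mathcal{L} + \tfrac{1}{2}\left[i\bar\Psi\gamma^0 D^\nu\Psi + \mathrm{h.c.}\right]\right) \tag{2.27}$$

is manifestly gauge-invariant.

2.2. Non-abelian gauge theory: Quantum chromodynamics

For the gauge group SU(3), one replaces each local gauge field $A^\mu(x)$ by the 3×3 matrix $\mathbf{A}^\mu(x)$,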
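$$A^\mu \to (\mathbf{A}^\mu)_{cc'} = \frac{1}{2}\begin{pmatrix} \frac{1}{\sqrt{3}}A_8^\mu + A_3^\mu & A_1^\mu - iA_2^\mu & A_4^\mu - iA_5^\mu \\ A_1^\mu + iA_2^\mu & \frac{1}{\sqrt{3}}A_8^\mu - A_3^\mu & A_6^\mu - iA_7^\mu \\ A_4^\mu + iA_5^\mu & A_6^\mu + iA_7^\mu & -\frac{2}{\sqrt{3}}A_8^\mu \end{pmatrix} . \tag{2.28}$$

This way one moves from quantum electrodynamics to quantum chromodynamics with the eight real-valued color vector potentials $A_a^\mu$ enumerated by the gluon index $a = 1, \ldots, 8$. These matrices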
are all hermitean and traceless since the trace can always be absorbed into an abelian U(1) gauge theory. They belong thus to the class of special unitary 3×3 matrices SU(3). In order to make sense of expressions like $\bar\Psi\mathbf{A}^\mu\Psi$ the quark fields $\Psi(x)$ must carry a color index $c = 1, 2, 3$, which is usually suppressed, as are the Dirac indices in the color triplet spinor $\Psi_{c,\alpha}(x)$.

More generally for SU(N), the vector potentials $\mathbf{A}^\mu$ are hermitian and traceless N×N matrices. All such matrices can be parametrized $\mathbf{A}^\mu \equiv T^a_{cc'}A^\mu_a$. The color index $c$ (or $c'$) runs now from 1 to $n_c$, and correspondingly the gluon index $a$ (or $r$, $s$, $t$) from 1 to $n_c^2 - 1$. Both are implicitly summed, with no distinction of lowering or raising them. The color matrices $T^a_{cc'}$ obey

$$[T^r, T^s]_{cc'} = i f^{rsa}\, T^a_{cc'} , \qquad \mathrm{Tr}(T^r T^s) = \tfrac{1}{2}\delta^s_r . \tag{2.29}$$

The structure constants $f^{rst}$ are tabulated in the literature [242,342,343] for SU(3). For SU(2) they are the totally antisymmetric tensor $\epsilon_{rst}$, since $T^a = \tfrac{1}{2}\sigma^a$ with $\sigma^a$ being the Pauli matrices. For SU(3), the $T^a$ are related to the Gell-Mann matrices $\lambda^a$ by $T^a = \tfrac{1}{2}\lambda^a$. The gauge-invariant Lagrangian density for QCD or SU(N) is

$$\mathcal{L} = -\tfrac{1}{2}\mathrm{Tr}(\mathbf{F}^{\mu\nu}\mathbf{F}_{\mu\nu}) + \tfrac{1}{2}\left[\bar\Psi(i\gamma^\mu\mathbf{D}_\mu - \mathbf{m})\Psi + \mathrm{h.c.}\right] = -\tfrac{1}{4}F_a^{\mu\nu}F^a_{\mu\nu} + \tfrac{1}{2}\left[\bar\Psi(i\gamma^\mu\mathbf{D}_\mu - \mathbf{m})\Psi + \mathrm{h.c.}\right] , \tag{2.30}$$

in analogy to Eq. (2.8). The unfamiliar factor of 2 is because of the trace convention in Eq. (2.29). The mass matrix $\mathbf{m} = m\,\delta_{cc'}$ is diagonal in color space. The matrix notation is particularly suited for establishing gauge invariance according to Eq. (2.22) with the unitary operators $U$ now being N×N matrices, hence non-abelian gauge theory. The latter fact generates an extra term in the color-electro-magnetic fields

$$\mathbf{F}^{\mu\nu} \equiv \partial^\mu\mathbf{A}^\nu - \partial^\nu\mathbf{A}^\mu + ig[\mathbf{A}^\mu, \mathbf{A}^\nu] , \quad\text{or}\quad F_a^{\mu\nu} \equiv \partial^\mu A_a^\nu - \partial^\nu A_a^\mu - g f^{ars} A_r^\mu A_s^\nu , \tag{2.31}$$

but such that $\mathbf{F}^{\mu\nu}$ remains antisymmetric in the Lorentz indices. The covariant derivative matrix finally is $D^\mu_{cc'} = \delta_{cc'}\partial^\mu + igA^\mu_{cc'}$. The variational derivatives are now

$$\frac{\delta\mathcal{L}}{\delta(\partial_\kappa A^r_\lambda)} = -F_r^{\kappa\lambda} , \qquad \frac{\delta\mathcal{L}}{\delta A^r_\lambda} = -gJ_r^\lambda \quad\text{with}\quad J_r^\lambda = \bar\Psi\gamma^\lambda T^r\Psi + f^{ars}F_a^{\lambda\kappa}A^s_\kappa , \tag{2.32}$$

in analogy to Eq. (2.11), and yield the color-Maxwell equations

$$\partial_\mu\mathbf{F}^{\mu\nu} = g\mathbf{J}^\nu \quad\text{with}\quad \mathbf{J}^\nu = \bar\Psi\gamma^\nu T^a\Psi\, T^a + \frac{1}{i}[\mathbf{F}^{\nu\kappa}, \mathbf{A}_\kappa] . \tag{2.33}$$

The color-Maxwell current is conserved,

$$\partial_\mu\mathbf{J}^\mu = 0 . \tag{2.34}$$

Note that the color-fermion current $j_a^\mu = \bar\Psi\gamma^\mu T^a\Psi$ is not trivially conserved. The variational derivatives with respect to the spinor fields like Eq. (2.12) give correspondingly the color-Dirac equations

$$(i\gamma^\mu\mathbf{D}_\mu - \mathbf{m})\Psi = 0 . \tag{2.35}$$
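The color algebra of Eqs. (2.28) and (2.29) can be checked numerically. The following is a minimal sketch, assuming the standard textbook Gell-Mann matrices and NumPy; the function names (`gell_mann`, `color_matrix`, `explicit_matrix`, `structure_constants`) are illustrative and do not come from the text. It builds $T^a = \lambda^a/2$, compares $T^a A_a$ with the explicit matrix of Eq. (2.28) for one Lorentz component, and extracts $f^{rsa}$ from the commutator.

```python
import numpy as np


def gell_mann():
    """Return the eight Gell-Mann matrices (standard textbook definition)."""
    lam = np.zeros((8, 3, 3), dtype=complex)
    lam[0] = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
    lam[1] = [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]
    lam[2] = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
    lam[3] = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
    lam[4] = [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]]
    lam[5] = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
    lam[6] = [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]]
    lam[7] = np.diag([1, 1, -2]) / np.sqrt(3)
    return lam


T = gell_mann() / 2  # color matrices T^a = lambda^a / 2


def color_matrix(A):
    """T^a A_a for eight real potentials A = (A_1, ..., A_8)."""
    return np.einsum("a,aij->ij", np.asarray(A, dtype=complex), T)


def explicit_matrix(A):
    """The 3x3 matrix written out in Eq. (2.28), for one Lorentz component."""
    A1, A2, A3, A4, A5, A6, A7, A8 = A
    s3 = np.sqrt(3)
    return 0.5 * np.array([
        [A8 / s3 + A3, A1 - 1j * A2, A4 - 1j * A5],
        [A1 + 1j * A2, A8 / s3 - A3, A6 - 1j * A7],
        [A4 + 1j * A5, A6 + 1j * A7, -2 * A8 / s3],
    ])


def structure_constants():
    """f^{rsa} from [T^r, T^s] = i f^{rsa} T^a, using Tr(T^a T^b) = delta_ab / 2."""
    f = np.zeros((8, 8, 8))
    for r in range(8):
        for s in range(8):
            comm = T[r] @ T[s] - T[s] @ T[r]
            for a in range(8):
                f[r, s, a] = (-2j * np.trace(comm @ T[a])).real
    return f
```

Projecting the commutator onto $T^a$ with the trace is a design choice that relies only on the normalization in Eq. (2.29), so the same routine would work for any other hermitean basis with that normalization.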
Everything proceeds in analogy with QED. The color-Maxwell equations allow for the identity

$$F_a^{\mu\kappa}\partial^\nu A^a_\kappa = F_a^{\mu\kappa}F^\nu_{\kappa,a} + gJ_a^\mu A_a^\nu + g f^{ars}F_a^{\mu\kappa}A_r^\nu A^s_\kappa + \partial_\kappa(F_a^{\mu\kappa}A_a^\nu) . \tag{2.36}$$

The energy-momentum stress tensor becomes

$$T^{\mu\nu} = 2\,\mathrm{Tr}(\mathbf{F}^{\mu\kappa}\mathbf{F}_\kappa{}^\nu) + \tfrac{1}{2}\left[i\bar\Psi\gamma^\mu\mathbf{D}^\nu\Psi + \mathrm{h.c.}\right] - g^{\mu\nu}\mathcal{L} - 2\,\partial_\kappa\mathrm{Tr}(\mathbf{F}^{\mu\kappa}\mathbf{A}^\nu) . \tag{2.37}$$

Leaving out the four-divergence, $T^{\mu\nu}$ is manifestly gauge-invariant,

$$T^{\mu\nu} = 2\,\mathrm{Tr}(\mathbf{F}^{\mu\kappa}\mathbf{F}_\kappa{}^\nu) + \tfrac{1}{2}\left[i\bar\Psi\gamma^\mu\mathbf{D}^\nu\Psi + \mathrm{h.c.}\right] - g^{\mu\nu}\mathcal{L} , \tag{2.38}$$

as are the generalized momenta [245]

$$P^\nu = \int_\Omega d\omega_0\, \left(2\,\mathrm{Tr}(\mathbf{F}^{0\kappa}\mathbf{F}_\kappa{}^\nu) - g^{0\nu}\mathcal{L} + \tfrac{1}{2}\left[i\bar\Psi\gamma^0\mathbf{D}^\nu\Psi + \mathrm{h.c.}\right]\right) . \tag{2.39}$$

Note that all this holds for SU(N); in fact it holds for d+1 dimensions.

2.3. Parametrization of space-time

Let us review some aspects of canonical field theory. The Lagrangian determines both the equations of motion and the constants of motion. The equations of motion are differential equations. Solving differential equations one must give initial data. On a hypersphere in four-space, characterized by a fixed initial "time" $x^0 = 0$, one assumes to know all necessary field components $\phi_r(x^0_0, \vec{x})$. The goal is then to generate the fields for all space-time by means of the differential equations of motion.

Equivalently, one can propagate the initial configurations forward or backward in time with the Hamiltonian. In a classical field theory, particularly one in which every field $\phi_r$ has a conjugate momentum $\pi_r[\phi] \equiv \pi_r^0[\phi]$, see Eq. (2.10), one gets from the constant of motion $P_0$ to the Hamiltonian $P_0$ by substituting the velocity fields $\partial_0\phi_r$ with the canonically conjugate momenta $\pi_r$, thus $P_0 = P_0[\phi, \pi]$. Equations of motion are then given in terms of the classical Poisson brackets [186],

$$\partial_0\phi_r = \{P_0, \phi_r\}_{\mathrm{cl}} , \qquad \partial_0\pi_r = \{P_0, \pi_r\}_{\mathrm{cl}} . \tag{2.40}$$

They are discussed in greater detail in Appendix E. Following Dirac [125–127], the transition to an operator formalism like quantum mechanics is consistently achieved by replacing the classical Poisson brackets of two functions $A$ and $B$ by the "quantum Poisson brackets", the commutators of two operators $A$ and $B$

$$\{A, B\}_{\mathrm{cl}} \to \frac{1}{i\hbar}[A, B]_{x^0 = y^0} , \tag{2.41}$$

and correspondingly by the anti-commutator for two fermionic fields. Particularly, one substitutes the basic Poisson bracket

$$\{\phi_r(x), \pi_s(y)\}_{\mathrm{cl}} = \delta_{rs}\,\delta^{(3)}(\vec{x} - \vec{y}) \tag{2.42}$$

by the basic commutator

$$\frac{1}{i\hbar}[\phi_r(x), \pi_s(y)]_{x^0 = y^0} = \delta_{rs}\,\delta^{(3)}(\vec{x} - \vec{y}) . \tag{2.43}$$
The time derivatives of the operator fields are then given by the Heisenberg equations, see Eq. (2.57).

In gauge theory like QED and QCD, one cannot proceed so straightforwardly as in the above canonical procedure, for two reasons: (1) Not all of the fields have a conjugate momentum, that is, not all of them are independent; (2) Gauge theory has redundant degrees of freedom. There are plenty of conventions for how one can 'fix the gauge'. It suffices to say for the moment that 'canonical quantization' applies only to the independent fields. In Appendix E we will review the Dirac-Bergmann procedure for handling dependent degrees of freedom, or for 'quantizing under constraint'.

Thus far time $t$ and space $\vec{x}$ were treated as if they were completely separate issues. But in a covariant theory, time and space are only different aspects of four-dimensional space-time. One can however generalize the concepts of space and of time in an operational sense. One can define 'space' as that hypersphere in four-space on which one chooses the initial field configurations in accord with microcausality. The remaining, fourth coordinate can be thought of as being kind of normal to the hypersphere and understood as 'time'. Below we shall speak of space-like and time-like coordinates, correspondingly.

These concepts can be grasped more formally by conveniently introducing generalized coordinates $\tilde{x}^\nu$. Starting from a baseline parametrization of space-time like the above $x^\mu$ [39] with a given metric tensor $g^{\mu\nu}$ whose elements are all zero except $g^{00} = 1$, $g^{11} = -1$, $g^{22} = -1$, and $g^{33} = -1$, one parametrizes space-time by a certain functional relation

$$\tilde{x}^\nu = \tilde{x}^\nu(x^\mu) . \tag{2.44}$$

The freedom in choosing $\tilde{x}^\nu(x^\mu)$ is restricted only by the condition that the inverse $x^\mu(\tilde{x}^\nu)$ exists as well. The transformation conserves the arc length; thus $(ds)^2 = g_{\mu\nu}\,dx^\mu dx^\nu = \tilde{g}_{\kappa\lambda}\,d\tilde{x}^\kappa d\tilde{x}^\lambda$. The metric tensors for the two parametrizations are then related by

$$\tilde{g}_{\kappa\lambda} = \frac{\partial x^\mu}{\partial\tilde{x}^\kappa}\, g_{\mu\nu}\, \frac{\partial x^\nu}{\partial\tilde{x}^\lambda} . \tag{2.45}$$

The two four-volume elements are related by the Jacobian $J(\tilde{x}) = \|\partial x/\partial\tilde{x}\|$, particularly $d^4x = J(\tilde{x})\,d^4\tilde{x}$. We shall keep track of the Jacobian only implicitly. The three-volume element $d\omega_0$ is treated correspondingly.

All the above considerations must be independent of this reparametrization. The fundamental expressions like the Lagrangian can be expressed in terms of either $x$ or $\tilde{x}$. There is however one subtle point. As a matter of convenience, one defines the hypersphere as that locus in four-space on which one sets the "initial conditions" at the same "initial time", or on which one "quantizes" the system correspondingly in a quantum theory. The hypersphere is thus defined as that locus in four-space with the same value of the "time-like" coordinate $\tilde{x}^0$, i.e. $\tilde{x}^0(x^0, \vec{x}) = \mathrm{const}$. Correspondingly, the remaining coordinates are called 'space-like' and denoted by the spatial three-vector $\vec{\tilde{x}} = (\tilde{x}^1, \tilde{x}^2, \tilde{x}^3)$. Because of the (in general) more complicated metric, cuts through the four-space characterized by $\tilde{x}^0 = \mathrm{const}$ are quite different from those with $\tilde{x}_0 = \mathrm{const}$. In generalized coordinates the covariant and contravariant indices can have rather different interpretations, and one must be careful with the lowering and raising of the Lorentz indices. For example, only $\partial_0 = \partial/\partial\tilde{x}^0$ is a 'time derivative' and only $P_0$ a "Hamiltonian", as opposed to $\partial^0$ and $P^0$ which in general are completely different objects. The actual choice of $\tilde{x}(x)$ is a matter of preference and convenience.
2.4. Forms of Hamiltonian dynamics

Obviously, one has many possibilities to parametrize space-time by introducing some generalized coordinates $\tilde{x}(x)$. But one should exclude all those which are accessible by a Lorentz transformation. Those are included anyway in a covariant formalism. This limits the freedom considerably and excludes, for example, almost all rotation angles. Following Dirac [123] there are no more than three basically different parametrizations. They are illustrated in Fig. 1, and cannot be mapped on each other by a Lorentz transformation. They differ by the hypersphere on which the fields are initialized, and correspondingly one has different "times". Each of these space-time parametrizations has thus its own Hamiltonian, and correspondingly Dirac [123] speaks of the three forms of Hamiltonian dynamics: The instant form is the familiar one, with its hypersphere given by $t = 0$. In the front form the hypersphere is a tangent plane to the light cone. In the point form the time-like coordinate is identified with the eigentime of a physical system and the hypersphere has the shape of a hyperboloid.

Which of the three forms should be preferred? The question is difficult to answer; in fact it is ill-posed. In principle, all three forms should yield the same physical results, since physics should not depend on how one parametrizes the space (and the time). If it depends on it, one has made a mistake. But usually one adjusts the parametrization to the nature of the physical problem to reduce the amount of practical work. Since one knows so little about the typical solutions of a field theory, it might well be worth the effort to admit also forms other than the conventional "instant" form.

The bulk of research on field theory implicitly uses the instant form, which we do not even attempt to summarize. Although it is the conventional choice for quantizing field theory, it has
Fig. 1. Dirac’s three forms of Hamiltonian dynamics.
many practical disadvantages. For example, given the wavefunctions of an n-electron atom at an initial time $t = 0$, $\psi_n(\vec{x}_i, t = 0)$, one can use the Hamiltonian $H$ to evolve $\psi_n(\vec{x}_i, t)$ to later times $t$. However, an experiment which specifies the initial wavefunction would require the simultaneous measurement of the positions of all of the bound electrons. In contrast, determining the initial wavefunction at fixed light-cone time $\tau = 0$ only requires an experiment which scatters one plane-wave laser beam, since the signal reaches each of the n electrons, along the light front, at the same light-cone time $\tau = t_i + z_i/c$.

A reasonable choice of $\tilde{x}(x)$ is restricted by microcausality: a light signal emitted from any point on the hypersphere must not cross the hypersphere. This holds for the instant or for the point form, but the front form seems to be in trouble. The light cone corresponds to light emitted from the origin and touches the front form hypersphere at $(x, y) = (0, 0)$. A signal actually carrying information moves with the group velocity, always smaller than the phase velocity $c$. Thus, if no information is carried by the signal, points on the light cone are unable to communicate. Only when solving problems in one space and one time dimension does the front form initialize fields only on the characteristic. Whether this generates problems for pathological cases like massless bosons (or fermions) is still under debate.

Comparatively little work has been done in the point form [154,192,405]. Stech and collaborators [192] have investigated the free particle, by analyzing the Klein-Gordon and the Dirac equation. As it turns out, the orthonormal functions spanning the Hilbert space for these cases are rather difficult to work with. Their addition theorems are certainly more complicated than the simple plane-wave states applicable in the instant or the front form.

The front form has a number of advantages which we will review in this article. Dirac's legacy had been forgotten and re-invented several times, thus the approach carries names as different as infinite-momentum frame, null-plane quantization, light-cone quantization, or most unnecessarily light-front quantization. In essence these are the same.

The infinite-momentum frame first appeared in the work of Fubini and Furlan [153] in connection with current algebra as the limit of a reference frame moving with almost the speed of light. Weinberg [448] asked whether this limit might be more generally useful. He considered the infinite-momentum limit of the old-fashioned perturbation diagrams for scalar meson theories and showed that the vacuum structure of these theories simplified in this limit. Later Susskind [414,415] showed that the infinities which occur among the generators of the Poincaré group when they are boosted to a fast-moving reference frame can be scaled or subtracted out consistently. The result is essentially a change of the variables. Susskind used the new variables to draw attention to the (two-dimensional) Galilean subgroup of the Poincaré group. He pointed out that the simplified vacuum structure and the non-relativistic kinematics of theories at infinite momentum might offer potential-theoretic intuition in relativistic quantum mechanics. Bardakci and Halpern [16] further analyzed the structure of the theories at infinite momentum. They viewed the infinite-momentum limit as a change of variables from the laboratory time $t$ and space coordinate $z$ to a new "time" $\tau = (t + z)/\sqrt{2}$ and a new "space" $\zeta = (t - z)/\sqrt{2}$. Chang and Ma [92] considered the Feynman diagrams for a $\phi^3$-theory and quantum electrodynamics from this point of view and were able to demonstrate the advantage of their approach in several illustrative calculations.
Kogut and Soper [274] have examined the formal foundations of quantum electrodynamics in the infinite-momentum frame, and interpret the infinite-momentum limit as the change of variables thus avoiding limiting procedures. The time-ordered perturbation series of the S-matrix is due to them, see also
[40,41,406,274,275]. Drell et al. [130–133] have recognized that the formalism could serve as a kind of natural tool for formulating the quark-parton model.

Independent of and almost simultaneous with the infinite-momentum frame is the work on null-plane quantization by Leutwyler [302,303], Klauder et al. [273], and by Rohrlich [390]. In particular, they have investigated the stability of the so-called "little group" among the Poincaré generators [304–307]. Leutwyler recognized the utility of defining quark wavefunctions to give an unambiguous meaning to concepts used in the parton model.

The later developments using the infinite-momentum frame have shown that the naming is somewhat unfortunate since the total momentum is finite and since the front form needs no particular Lorentz frame. Rather it is frame-independent and covariant. Light-Cone Quantization seemed to be more appropriate. Casher [91] gave the first construction of the light-cone Hamiltonian for non-Abelian gauge theory and gave an overview of important considerations in light-cone quantization. Chang et al. [93–96] demonstrated the equivalence of light-cone quantization with standard covariant Feynman analysis. Brodsky et al. [53] calculated one-loop radiative corrections and demonstrated renormalizability. Light-cone Fock methods were used by Lepage and Brodsky in the analysis of exclusive processes in QCD [297–300,62,345]. In all of this work there was no citation of Dirac's work. It did reappear first in the work of Pauli and Brodsky [354,355], who explicitly diagonalize a light-cone Hamiltonian by the method of discretized light-cone quantization, see also Section 4. Light-front quantization appeared first in the work of Harindranath and Vary [203,204] adopting the above concepts without change. Franke and collaborators [14,146–148,385], Karmanov [267,268], and Pervushin [369] have also done important work on light-cone quantization. Comprehensive reviews can be found in [300,62,66,250,72,185,80].

2.5. Parametrizations of the front form

If one were free to parametrize the front form, one would choose it most naturally as a real rotation of the coordinate system, with an angle $\varphi = \pi/4$. The "time-like" coordinate would then be $x^+ = \tilde{x}^0$ and the "space-like" coordinate $x^- = \tilde{x}^3$, or collectively
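$$\begin{pmatrix} x^+ \\ x^- \end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} x^0 \\ x^3 \end{pmatrix} , \qquad g_{\alpha\beta} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} . \tag{2.46}$$

The metric tensor $g^{\mu\nu}$ obviously transforms according to Eq. (2.45), and the Jacobian for this transformation is unity. But this is not what has been done, starting way back with Bardakci and Halpern [16] and continuing with Kogut and Soper [274]. Their definition corresponds to a rotation of the coordinate system by $\varphi = -\pi/4$ and a reflection of $x^-$. The Kogut-Soper convention (KS) [274] is thus:

$$\begin{pmatrix} x^+ \\ x^- \end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} x^0 \\ x^3 \end{pmatrix} , \qquad g_{\alpha\beta} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \tag{2.47}$$

see also Appendix C. It is often convenient to distinguish longitudinal Lorentz indices $\alpha$ or $\beta$ $(+, -)$ from the transversal ones $i$ or $j$ $(1, 2)$, and to introduce transversal vectors by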
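$\vec{x}_\perp = (x^1, x^2)$. The KS-convention is particularly suited for theoretical work, since the raising and lowering of the Lorentz indices is simple. With the totally antisymmetric symbol

$$\epsilon_{+12}{}^{+} = 1 , \quad\text{thus}\quad \epsilon_{+12-} = 1 , \tag{2.48}$$

the volume integral becomes

$$\int d\omega_+ = \int dx^-\, d^2x_\perp = \int dx_+\, d^2x_\perp . \tag{2.49}$$

One should emphasize that $\partial_+ = \partial^-$ is a time-like derivative $\partial/\partial x^+ = \partial/\partial x_-$ as opposed to $\partial_- = \partial^+$, which is a space-like derivative $\partial/\partial x^- = \partial/\partial x_+$. Correspondingly, $P_+ = P^-$ is the Hamiltonian which propagates in the light-cone time $x^+$, while $P_- = P^+$ is the longitudinal space-like momentum.

In much of the practical work, however, one is bothered with the $\sqrt{2}$'s scattered all over the place. At the expense of having various factors of 2, this is avoided in the Lepage-Brodsky (LB) convention [299]:

$$\begin{pmatrix} x^+ \\ x^- \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} x^0 \\ x^3 \end{pmatrix} , \quad\text{thus}\quad g^{\alpha\beta} = \begin{pmatrix} 0 & 2 \\ 2 & 0 \end{pmatrix} , \qquad g_{\alpha\beta} = \begin{pmatrix} 0 & \tfrac{1}{2} \\ \tfrac{1}{2} & 0 \end{pmatrix} , \tag{2.50}$$

see also Appendix B. Here, $\partial_+ = \tfrac{1}{2}\partial^-$ is a time-like and $\partial_- = \tfrac{1}{2}\partial^+$ a space-like derivative. The Hamiltonian is $P_+ = \tfrac{1}{2}P^-$, and $P_- = \tfrac{1}{2}P^+$ is the longitudinal momentum. With the totally antisymmetric symbol

$$\epsilon_{+12}{}^{+} = 1 , \quad\text{thus}\quad \epsilon_{+12-} = \tfrac{1}{2} , \tag{2.51}$$

the volume integral becomes

$$\int d\omega_+ = \frac{1}{2}\int dx^-\, d^2x_\perp = \int dx_+\, d^2x_\perp . \tag{2.52}$$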
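The metric tensors quoted for the three parametrizations follow directly from the transformation rule of Eq. (2.45). The short sketch below, assuming NumPy and restricting to the $(x^0, x^3)$ block, applies that rule to the matrices as read from Eqs. (2.46), (2.47) and (2.50); the names `PARAMETRIZATIONS`, `transformed_metric` and `jacobian` are illustrative only.

```python
import numpy as np

# Instant-form metric g_{mu nu}, restricted to the (x^0, x^3) block.
G_INSTANT = np.diag([1.0, -1.0])

# Matrices M with (x^+, x^-) = M (x^0, x^3), as read from Eqs. (2.46), (2.47), (2.50).
PARAMETRIZATIONS = {
    "rotation": np.array([[1.0, -1.0], [1.0, 1.0]]) / np.sqrt(2.0),
    "KS": np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0),
    "LB": np.array([[1.0, 1.0], [1.0, -1.0]]),
}


def transformed_metric(m):
    """Covariant metric in the new coordinates, following Eq. (2.45)."""
    dx_dxt = np.linalg.inv(m)  # entries dx^mu / dx~^kappa
    return dx_dxt.T @ G_INSTANT @ dx_dxt


def jacobian(m):
    """|det(dx / dx~)|, the factor relating the two volume elements."""
    return abs(np.linalg.det(np.linalg.inv(m)))


if __name__ == "__main__":
    for name, m in PARAMETRIZATIONS.items():
        print(name, transformed_metric(m).round(12).tolist(), round(jacobian(m), 12))
```

In this sketch the rotation and the KS convention both give an off-diagonal metric with unit entries and a Jacobian of modulus one, while the LB convention gives entries of 1/2 and a Jacobian of 1/2, which is the factor appearing in Eq. (2.52).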
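We will use both the LB-convention and the KS-convention in this review, and indicate in each section which convention we are using.

The transition from the instant form to the front form is quite simple: In all the equations found in Sections 2.1 and 2.2 one has to substitute the "0" by the "+" and the "3" by the "−". Take as an example the QED four-momentum in Eq. (2.27) to get

$$P_\nu = \int_\Omega d\omega_0\, \left(F^{0\kappa}F_{\kappa\nu} + \tfrac{1}{4}g^0_\nu\, F^{\kappa\lambda}F_{\kappa\lambda} + \tfrac{1}{2}\left[i\bar\Psi\gamma^0 D_\nu\Psi + \mathrm{h.c.}\right]\right) ,$$
$$P_\nu = \int_\Omega d\omega_+\, \left(F^{+\kappa}F_{\kappa\nu} + \tfrac{1}{4}g^+_\nu\, F^{\kappa\lambda}F_{\kappa\lambda} + \tfrac{1}{2}\left[i\bar\Psi\gamma^+ D_\nu\Psi + \mathrm{h.c.}\right]\right) , \tag{2.53}$$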
also in KS-convention. The instant and the front form look thus almost identical. However, after having worked out the Lorentz algebra, the expressions for the instant and front-form
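Hamiltonians are drastically different:

$$P_0 = \frac{1}{2}\int_\Omega d\omega_0\, (\vec{E}^2 + \vec{B}^2) + \frac{1}{2}\int_\Omega d\omega_0\, \left[i\bar\Psi\gamma^0 D_0\Psi + \mathrm{h.c.}\right] ,$$
$$P_+ = \frac{1}{2}\int_\Omega d\omega_+\, (E_\parallel^2 + B_\parallel^2) + \frac{1}{2}\int_\Omega d\omega_+\, \left[i\bar\Psi\gamma^+ D_+\Psi + \mathrm{h.c.}\right] , \tag{2.54}$$

for the instant and the front-form energy, respectively. In the former one has to deal with all three components of the electric and the magnetic field, in the latter only with two of them, namely with the longitudinal components $E_\parallel = \tfrac{1}{2}F^{+-} = E_z$ and $B_\parallel = F^{12} = B_z$. Correspondingly, energy-momentum for non-abelian gauge theory is

$$P_\nu = \int_\Omega d\omega_0\, \left(F_a^{0\kappa}F^a_{\kappa\nu} + \tfrac{1}{4}g^0_\nu\, F_a^{\kappa\lambda}F^a_{\kappa\lambda} + \tfrac{1}{2}\left[i\bar\Psi\gamma^0 T^a D^a_\nu\Psi + \mathrm{h.c.}\right]\right) ,$$
$$P_\nu = \int_\Omega d\omega_+\, \left(F_a^{+\kappa}F^a_{\kappa\nu} + \tfrac{1}{4}g^+_\nu\, F_a^{\kappa\lambda}F^a_{\kappa\lambda} + \tfrac{1}{2}\left[i\bar\Psi\gamma^+ T^a D^a_\nu\Psi + \mathrm{h.c.}\right]\right) . \tag{2.55}$$

These expressions are exact but not yet very useful, and we shall come back to them in later sections. But they are good enough to discuss their symmetries in general.

2.6. The Poincaré symmetries in the front form

The algebra of the four-energy-momentum $P^\mu = p^\mu$ and four-angular-momentum $M^{\mu\nu} = x^\mu p^\nu - x^\nu p^\mu$ for free particles [19,400,433,450] with the basic commutator $(1/i\hbar)[x^\mu, p_\nu] = \delta^\mu_\nu$ is

$$\frac{1}{i\hbar}[P^\rho, M^{\mu\nu}] = g^{\rho\mu}P^\nu - g^{\rho\nu}P^\mu ,$$
$$\frac{1}{i\hbar}[P^\rho, P^\mu] = 0 ,$$
$$\frac{1}{i\hbar}[M^{\rho\sigma}, M^{\mu\nu}] = g^{\rho\nu}M^{\sigma\mu} + g^{\sigma\mu}M^{\rho\nu} - g^{\rho\mu}M^{\sigma\nu} - g^{\sigma\nu}M^{\rho\mu} . \tag{2.56}$$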
It is postulated that the generalized momentum operators satisfy the same commutator relations. They form thus a group and act as propagators in the sense of the Heisenberg equations

$$\frac{1}{i\hbar}[P^\nu, \phi_r(x)] = i\partial^\nu\phi_r(x) , \qquad \frac{1}{i\hbar}[M^{\mu\nu}, \phi_r(x)] = (x^\mu\partial^\nu - x^\nu\partial^\mu)\phi_r(x) + \Sigma^{\mu\nu}_{rs}\phi_s . \tag{2.57}$$

Their validity for the front form was verified by Chang et al. [94–96], and partially even before that by Kogut and Soper [274]. Leutwyler and others have made important contributions [302–307]. The ten constants of motion $P^\mu$ and $M^{\mu\nu}$ are observables, thus hermitean operators with real eigenvalues. It is advantageous to construct representations in which the constants of motion are diagonal. The corresponding Heisenberg equations, for example, then become almost trivial. But one cannot diagonalize all ten constants of motion simultaneously because they do not commute. One has to make a choice.

The commutation relations, Eq. (2.56), define a group. The group is isomorphic to the Poincaré group, to the ten 4×4 matrices which generate an arbitrary inhomogeneous Lorentz
transformation. The question of how many and which operators can be diagonalized simultaneously turns out to be identical to the problem of classifying all irreducible unitary representations of the Poincaré group. According to Dirac [123] one cannot find more than seven mutually commuting operators.

It is convenient to discuss the structure of the Poincaré group [400,433] in terms of the Pauli-Lubanski vector $V^\kappa \equiv \epsilon^{\kappa\lambda\mu\nu}P_\lambda M_{\mu\nu}$, with $\epsilon^{\kappa\lambda\mu\nu}$ being the totally antisymmetric symbol in 4 dimensions. $V$ is orthogonal to the generalized momenta, $P_\mu V^\mu = 0$, and obeys the algebra

$$\frac{1}{i\hbar}[V^\kappa, P^\mu] = 0 , \qquad \frac{1}{i\hbar}[V^\kappa, M^{\mu\nu}] = g^{\kappa\nu}V^\mu - g^{\kappa\mu}V^\nu , \qquad \frac{1}{i\hbar}[V^\kappa, V^\lambda] = \epsilon^{\kappa\lambda\mu\nu}V_\mu P_\nu . \tag{2.58}$$

The two group invariants are the operator for the invariant mass-squared $M^2 = P^\mu P_\mu$ and the operator for intrinsic spin-squared $V^2 = V^\mu V_\mu$. They are Lorentz scalars and commute with all generators $P^\mu$ and $M^{\mu\nu}$, as well as with all $V^\mu$. A convenient choice of the six mutually commuting operators is therefore for the front form:
(1) the invariant mass squared, $M^2 = P^\mu P_\mu$,
(2–4) the three space-like momenta, $P^+$ and $\vec{P}_\perp$,
(5) the total spin squared, $S^2 = V^\mu V_\mu$,
(6) one component of $V$, say $V^+$, called $S_z$.
There are other equivalent choices. In constructing a representation which simultaneously diagonalizes the six mutually commuting operators one can proceed consecutively, in principle, by diagonalizing one after the other. At the end, one will have realized the old dream of Wigner [450] and of Dirac [123] to classify physical systems with the quantum numbers of the irreducible representations of the Poincaré group.

Inspecting the definition of boost-angular-momentum $M_{\mu\nu}$ in Eq. (2.18) one identifies which components are dependent on the interaction and which are not. Dirac [123,126] calls them complicated and simple, or dynamic and kinematic, or Hamiltonians and momenta, respectively. In the instant form, the three components of the boost vector $K_i = M_{i0}$ are dynamic, and the three components of angular momentum $J_i = \epsilon_{ijk}M_{jk}$ are kinematic. The cyclic symbol $\epsilon_{ijk}$ is 1 if the space-like indices $ijk$ are in cyclic order, and zero otherwise.

As noted already by Dirac [123], the front form is special in having four kinematic components of $M_{\mu\nu}$ ($M_{+-}$, $M_{12}$, $M_{1-}$, $M_{2-}$) and only two dynamic ones ($M_{+1}$ and $M_{+2}$). One checks this directly from the defining equation (2.18). Kogut and Soper [274] discuss and interpret them in terms of the above boosts and angular momenta. They introduce the transversal vector $\vec{B}_\perp$ with components

$$B_{\perp 1} = M_{+1} = \frac{1}{\sqrt{2}}(K_1 + J_2) , \qquad B_{\perp 2} = M_{+2} = \frac{1}{\sqrt{2}}(K_2 - J_1) . \tag{2.59}$$

In the front form they are kinematic and boost the system in the x- and y-direction, respectively. The kinematic operators

$$M_{12} = J_3 , \qquad M_{+-} = K_3 \tag{2.60}$$

rotate the system in the x-y plane and boost it in the longitudinal direction, respectively. In the front form one deals thus with seven mutually commuting operators [123]

$$M_{+-} , \quad \vec{B}_\perp \quad\text{and all}\quad P^\mu , \tag{2.61}$$
instead of the six in the instant form. The remaining two Poincaré generators are combined into a transversal angular-momentum vector $\vec{S}_\perp$ with

$$S_{\perp 1} = M_{1-} = \frac{1}{\sqrt{2}}(K_1 - J_2) , \qquad S_{\perp 2} = M_{2-} = \frac{1}{\sqrt{2}}(K_2 + J_1) . \tag{2.62}$$

They are both dynamical, but commute with each other and $M^2$. They are thus members of a dynamical subgroup [274], whose relevance has yet to be exploited.

Thus, one can diagonalize the light-cone energy $P^-$ within a Fock basis where the constituents have fixed total $P^+$ and $\vec{P}_\perp$. For convenience, we shall define a "light-cone Hamiltonian" as the operator

$$H_{LC} = P^\mu P_\mu = P^- P^+ - \vec{P}_\perp^{\,2} , \tag{2.63}$$

so that its eigenvalues correspond to the invariant mass spectrum $M_i$ of the theory. The boost invariance of the eigensolutions of $H_{LC}$ reflects the fact that the boost operators $K_3$ and $\vec{B}_\perp$ are kinematical. In fact, one can boost the system to an "intrinsic frame" in which the transversal momentum vanishes,

$$\vec{P}_\perp = \vec{0} , \quad\text{thus}\quad H_{LC} = P^- P^+ . \tag{2.64}$$

In this frame, the longitudinal component of the Pauli-Lubanski vector reduces to the longitudinal angular momentum $J_3 = J_z$, which allows for considerable reduction of the numerical work [429]. The transformation to an arbitrary frame with finite values of $\vec{P}_\perp$ is then trivially performed.

The above symmetries imply the very important aspect of the front form that both the Hamiltonian and all amplitudes obtained in light-cone perturbation theory (graph by graph!) are manifestly invariant under a large class of Lorentz transformations:
(1) boosts along the 3-direction: $p^+ \to C_\parallel\, p^+$, $\vec{p}_\perp \to \vec{p}_\perp$, and $p^- \to C_\parallel^{-1}\, p^-$,
(2) transverse boosts: $p^+ \to p^+$, $\vec{p}_\perp \to \vec{p}_\perp + p^+\vec{C}_\perp$, and $p^- \to p^- + 2\,\vec{p}_\perp\cdot\vec{C}_\perp + p^+\vec{C}_\perp^{\,2}$,
(3) rotations in the x-y plane: $p^+ \to p^+$, $\vec{p}_\perp^{\,2} \to \vec{p}_\perp^{\,2}$.
All of these hold for every single-particle momentum $p^\mu$, and for any set of dimensionless c-numbers $C_\parallel$ and $\vec{C}_\perp$. It is these invariances which also lead to the frame independence of the Fock state wave functions.
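The three kinematic transformations listed above can be tested on a single momentum. The sketch below is an illustration under the assumption that the invariant is written as $p^+p^- - \vec{p}_\perp^{\,2}$, as in Eq. (2.63); the function names are hypothetical and NumPy is assumed.

```python
import numpy as np


def invariant(p_plus, p_minus, p_perp):
    """p^mu p_mu = p^+ p^- - p_perp^2, in the convention of Eq. (2.63)."""
    p_perp = np.asarray(p_perp, dtype=float)
    return p_plus * p_minus - p_perp @ p_perp


def longitudinal_boost(p_plus, p_minus, p_perp, c_par):
    """Boost along the 3-direction: p^+ is scaled by C, p^- by 1/C."""
    return c_par * p_plus, p_minus / c_par, np.asarray(p_perp, dtype=float)


def transverse_boost(p_plus, p_minus, p_perp, c_perp):
    """Transverse boost: p_perp is shifted by p^+ C_perp, p^- adjusts accordingly."""
    p_perp = np.asarray(p_perp, dtype=float)
    c_perp = np.asarray(c_perp, dtype=float)
    new_minus = p_minus + 2.0 * (p_perp @ c_perp) + p_plus * (c_perp @ c_perp)
    return p_plus, new_minus, p_perp + p_plus * c_perp


def rotate_xy(p_plus, p_minus, p_perp, angle):
    """Rotation in the x-y plane: only the direction of p_perp changes."""
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    return p_plus, p_minus, rot @ np.asarray(p_perp, dtype=float)
```

For example, `invariant(*transverse_boost(1.3, 4.1, [0.4, -0.7], [2.0, 0.5]))` agrees with `invariant(1.3, 4.1, [0.4, -0.7])` up to rounding, because the cross terms generated in $p^+p^-$ cancel those generated in $\vec{p}_\perp^{\,2}$.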
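If a theory is rotationally invariant, then each eigenstate of the Hamiltonian which describes a state of non-zero mass can be classified in its rest frame by its spin eigenvalues

$$\vec{J}^{\,2}\,|P^0 = M, \vec{P} = \vec{0}\rangle = s(s+1)\,|P^0 = M, \vec{P} = \vec{0}\rangle , \qquad J_z\,|P^0 = M, \vec{P} = \vec{0}\rangle = s_z\,|P^0 = M, \vec{P} = \vec{0}\rangle . \tag{2.65}$$

This procedure is more complicated in the front form since the angular momentum operator does not commute with the invariant mass-squared operator $M^2$. Nevertheless, Hornbostel [228–230] constructs light-cone operators

$$\mathcal{J}^2 = \mathcal{J}_3^2 + \vec{\mathcal{J}}_\perp^{\,2} , \quad\text{with}\quad \mathcal{J}_3 = J_3 + \epsilon_{ij}B_{\perp i}P_{\perp j}/P^+ ,$$
$$\mathcal{J}_{\perp k} = \frac{1}{M}\,\epsilon_{kl}\left(S_{\perp l}P^+ - B_{\perp l}P^- - K_3 P_{\perp l} + \mathcal{J}_3\,\epsilon_{lm}P_{\perp m}\right) , \tag{2.66}$$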
which, in principle, could be applied to an eigenstate $|P^+, \vec P_\perp\rangle$ to obtain the rest-frame spin quantum numbers. This is straightforward for $\mathcal{J}_3$ since it is kinematical; in fact, $\mathcal{J}_3 = J_3$ in a frame with $\vec P_\perp = \vec 0$. However, $\mathcal{J}_\perp$ is dynamical and depends on the interactions. Thus, it is generally difficult to explicitly compute the total spin of a state using light-cone quantization. Some of the aspects have been discussed by Coester [106] and collaborators [105,102]. A practical and simple way has been applied by Trittmann [429]. Diagonalizing the light-cone Hamiltonian in the intrinsic frame for $J_z \neq 0$, he can ask for $J_{z,\max}$, the maximum eigenvalue of $J_z$ within a numerically degenerate multiplet of mass-squared eigenvalues. The total 'spin J' is then determined by $J = 2J_{z,\max}+1$, as to be discussed in Section 4. But more work on this question is certainly necessary, as well as on the discrete symmetries like parity and time reversal and their quantum numbers for a particular state, see also Hornbostel [228–230]. One needs the appropriate language for dealing with spin in highly relativistic systems.

2.7. The equations of motion and the energy-momentum tensor

Energy-momentum for gauge theory had been given in Eq. (2.55). They contain time derivatives of the fields which can be eliminated using the equations of motion.

The color-Maxwell equations are given in Eq. (2.33). They are four (sets of) equations for determining the four (sets of) functions $A^\mu_a$. One of the equations of motion is removed by fixing the gauge and we choose the light-cone gauge [22]

$$A^+_a = 0 . \tag{2.67}$$

Two of the equations of motion express the time derivatives of the two transversal components $\vec A^a_\perp$ in terms of the other fields. Since the front-form momenta in Eq. (2.55) do not depend on them, we discard them here. The fourth is the analogue of the Coulomb equation or of Gauss' law in the instant form, particularly $\partial_\mu F^{\mu+}_a = gJ^+_a$. In the light-cone gauge the color-Maxwell charge density $J^+_a$ is independent of the vector potentials, and the Coulomb equation reduces to

$$-\partial^+\partial_-A^-_a - \partial^+\partial_iA^i_{\perp a} = gJ^+_a . \tag{2.68}$$

This equation involves only (light-cone) space derivatives. Therefore, it can be satisfied only if one of the components is a functional of the others. There are subtleties involved in actually doing this, in particular one has to cope with the 'zero mode problem', see for example [358]. Disregarding this here, one inverts the equation by

$$A^a_+ = \tilde A^a_+ + \frac{g}{(i\partial^+)^2}\,J^+_a . \tag{2.69}$$

For the free case ($g=0$), $A^-$ reduces to $\tilde A^-$. Following Lepage and Brodsky [299], one can collect all components which survive the limit $g\to0$ into the 'free solution' $\tilde A^\mu_a$, defined by

$$\tilde A^a_+ = -\frac{1}{\partial^+}\,\partial_iA^i_{\perp a} , \quad\text{thus}\quad \tilde A^\mu_a = (0, \vec A_{\perp a}, \tilde A^+_a) . \tag{2.70}$$

Its four-divergence vanishes by construction and the Lorentz condition $\partial_\mu\tilde A^\mu_a = 0$ is satisfied as an operator. As a consequence, $\tilde A^\mu_a$ is purely transverse. The inverse space derivatives $(i\partial^+)^{-1}$ and $(i\partial^+)^{-2}$ are actually Green's functions. Since they depend only on $x^-$, they are comparatively simple, much simpler than in the instant form where $(\vec\nabla^2)^{-1}$ depends on all three space-like coordinates.
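The statement that the inverse derivatives are simple Green's functions in a single coordinate can be illustrated numerically. The following is a minimal sketch only: it treats a generic periodic coordinate `x` of length `length`, ignores convention-dependent factors of 2 between $\partial^+$ and $\partial/\partial x^-$, and simply drops the zero-momentum mode as a crude stand-in for the zero-mode problem. The grid, the function names and the periodic boundary condition are assumptions of this illustration, not part of the formalism above.

```python
import cmath
import math


def dft(values):
    """Naive discrete Fourier transform (O(N^2), adequate for a small sketch)."""
    n = len(values)
    return [
        sum(values[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n)) / n
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Inverse of dft(): rebuild the grid values from the mode coefficients."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n))
        for j in range(n)
    ]


def inverse_second_derivative(values, length):
    """Apply (i d/dx)^(-2) mode by mode on a periodic grid.

    Each Fourier mode exp(i p x) is an eigenfunction of (i d/dx)^2 with
    eigenvalue p^2, so the inverse operator divides by p^2.  The p = 0 mode
    is dropped by hand (an assumption of this sketch).
    """
    n = len(values)
    coeffs = dft(values)
    result = []
    for k, c in enumerate(coeffs):
        m = k if k <= n // 2 else k - n      # signed mode number
        p = 2.0 * math.pi * m / length       # momentum carried by this mode
        result.append(0.0 if m == 0 else c / p**2)
    return inverse_dft(result)
```

Applied to a single cosine of momentum $p_0$, the routine returns the same cosine divided by $p_0^2$, which is the momentum-space content of Eq. (2.69) for one mode.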
The color-Dirac equations are defined in Eq. (2.35) and are used here to express the time derivatives $\partial_+\Psi$ as a function of the other fields. After multiplication with $\beta=\gamma^0$ they read explicitly

$$\left(i\gamma^0\gamma^+T^aD^a_+ + i\gamma^0\gamma^-T^aD^a_- + i\alpha^i_\perp T^aD^a_{\perp i}\right)\Psi = m\beta\Psi , \tag{2.71}$$

with the usual $\alpha^k=\gamma^0\gamma^k$, $k=1,2,3$. In order to isolate the time derivative one introduces the projectors $\Lambda_\pm=\Lambda^\pm$ and projected spinors $\Psi_\pm=\Psi^\pm$ by

$$\Lambda_\pm = \tfrac12(1\pm\alpha^3) , \qquad \Psi_\pm = \Lambda_\pm\Psi . \tag{2.72}$$

Note that the raising or lowering of the projector labels $\pm$ is irrelevant. The $\gamma^0\gamma^\pm$ are obviously related to the $\Lambda^\pm$, but differently in the KS- and LB-convention

$$\gamma^0\gamma^\pm = 2\Lambda^\pm_{LB} = \sqrt2\,\Lambda^\pm_{KS} . \tag{2.73}$$

Multiplying the color-Dirac equation once with $\Lambda^+$ and once with $\Lambda^-$, one obtains a coupled set of spinor equations

$$2i\partial_+\Psi_+ = (m\beta - i\alpha^i_\perp T^aD^a_{\perp i})\Psi_- + 2gA^a_+T^a\Psi_+ ,$$
$$2i\partial_-\Psi_- = (m\beta - i\alpha^i_\perp T^aD^a_{\perp i})\Psi_+ + 2gA^a_-T^a\Psi_- . \tag{2.74}$$

Only the first of them involves a time derivative. The second is a constraint, similar to the above in the Coulomb equation. With the same proviso in mind, one defines

$$\Psi_- = \frac{1}{2i\partial_-}\,(m\beta - i\alpha^i_\perp T^aD^a_{\perp i})\Psi_+ . \tag{2.75}$$

Substituting this into the former, the time derivative is

$$2i\partial_+\Psi_+ = 2gA^a_+T^a\Psi_+ + (m\beta - i\alpha^j_\perp T^aD^a_{\perp j})\,\frac{1}{2i\partial_-}\,(m\beta - i\alpha^i_\perp T^aD^a_{\perp i})\Psi_+ . \tag{2.76}$$

Finally, in analogy to the color-Maxwell case, one can conveniently introduce the free spinors $\tilde\Psi=\tilde\Psi_++\tilde\Psi_-$ by

$$\tilde\Psi = \Psi_+ + (m\beta - i\alpha^i\partial_{\perp i})\,\frac{1}{2i\partial_-}\,\Psi_+ . \tag{2.77}$$

Contrary to the full spinor, see for example Eq. (2.75), $\tilde\Psi$ is independent of the interaction. To get the corresponding relations for the KS-convention, one substitutes the "2" by "$\sqrt2$" in accord with Eq. (2.73).

The front-form Hamiltonian according to Eq. (2.55) is

$$P_+ = \int_\Omega d\omega_+\left(F^{+\kappa}F_{\kappa+} + \frac14F^{\kappa\lambda}_aF^a_{\kappa\lambda} + \frac12\left[i\bar\Psi\gamma^+T^aD^a_+\Psi + \text{h.c.}\right]\right) . \tag{2.78}$$
Expressing it as a functional of the fields will finally lead to Eq. (2.89) below, but despite the straightforward calculation we display explicitly the intermediate steps. Consider first the energy density of the color-electro-magnetic fields $\frac14F^{\kappa\lambda}F_{\kappa\lambda} + F^{+\kappa}F_{\kappa+}$. Conveniently defining the abbreviations

$$B^{\mu\nu}_a = f^{abc}A^\mu_bA^\nu_c , \qquad \chi^\mu_a = f^{abc}\,\partial^\mu A^\nu_b\,A^c_\nu , \tag{2.79}$$
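the field tensors in Eq. (2.31) are rewritten as $F^{\mu\nu}_a = \partial^\mu A^\nu_a - \partial^\nu A^\mu_a - gB^{\mu\nu}_a$ and typical tensor contractions become

$$\frac12F^{\mu\nu}_aF^a_{\mu\nu} = \partial^\mu A^\nu_a\,\partial_\mu A^a_\nu - \partial^\mu A^\nu_a\,\partial_\nu A^a_\mu + 2\chi^\mu_aA^a_\mu + \frac12g^2B^{\mu\nu}_aB^a_{\mu\nu} . \tag{2.80}$$

Using $F^{\alpha\kappa}F_{\alpha\kappa} = 2F^{+\kappa}F_{+\kappa}$, the color-electro-magnetic energy density

$$\frac14F^{\kappa\lambda}F_{\kappa\lambda} + F^{+\kappa}F_{\kappa+} = \frac14F^{\kappa\lambda}F_{\kappa\lambda} - \frac12F^{\kappa\alpha}F_{\kappa\alpha} = \frac14F^{\kappa i}F_{\kappa i} - \frac14F^{\kappa\alpha}F_{\kappa\alpha} = \frac14F^{ij}F_{ij} - \frac14F^{\alpha\beta}F_{\alpha\beta} \tag{2.81}$$

separates completely into a longitudinal ($\alpha,\beta$) and a transversal contribution ($i,j$) [358]; see also Eq. (2.54). Substituting $A_+$ by Eq. (2.69), the color-electric and color-magnetic parts become

$$\frac14F^{\alpha\beta}F_{\beta\alpha} = \frac12\,\partial^+A_+\,\partial^+A_+ = \frac12g^2J^+\frac{1}{(i\partial^+)^2}J^+ + \frac12(\partial_iA^i_\perp)^2 + gJ^+\tilde A_+ ,$$
$$\frac14F^{ij}F_{ij} = \frac14g^2B^{ij}B_{ij} - \frac12(\partial_iA^i_\perp)^2 + \chi^iA_i + \frac12A^j(i\partial_i)^2A_j , \tag{2.82}$$

respectively. The role of the different terms will be discussed below.

The color-quark energy density is evaluated in the KS-convention, in which the factors of $\sqrt2$ below arise according to Eq. (2.73). With $i\bar\Psi\gamma^+D^a_+T^a\Psi = i\Psi^\dagger\gamma^0\gamma^+D^a_+T^a\Psi$ and the projectors of Eq. (2.72) one gets first $i\bar\Psi\gamma^+D^a_+T^a\Psi = i\sqrt2\,\Psi^\dagger_+D^a_+T^a\Psi_+$. Direct substitution of the time derivatives in Eq. (2.76) then gives

$$i\bar\Psi\gamma^+D^a_+T^a\Psi = \tilde\Psi^\dagger_+(m\beta - i\alpha^j_\perp D^a_{\perp j}T^a)\,\frac{1}{\sqrt2\,i\partial_-}\,(m\beta - i\alpha^i_\perp D^b_{\perp i}T^b)\Psi_+ . \tag{2.83}$$

Isolating the interaction in the covariant derivatives $iT^aD^a_\mu = i\partial_\mu - gT^aA^a_\mu$ produces

$$i\bar\Psi\gamma^+D^a_+T^a\Psi = g\tilde\Psi^\dagger_+\alpha^j_\perp A^a_{\perp j}T^a\tilde\Psi_- + g\tilde\Psi^\dagger_-\alpha^j_\perp A^a_{\perp j}T^a\tilde\Psi_+$$
$$\qquad + \frac{g^2}{\sqrt2}\,\Psi^\dagger_+\alpha^j_\perp A^a_{\perp j}T^a\,\frac{1}{i\partial_-}\,\alpha^i_\perp A^b_{\perp i}T^b\Psi_+$$
$$\qquad + \frac{1}{\sqrt2}\,\Psi^\dagger_+(m\beta - i\alpha^j_\perp\partial_{\perp j})\,\frac{1}{i\partial_-}\,(m\beta - i\alpha^i_\perp\partial_{\perp i})\Psi_+ . \tag{2.84}$$

Introducing $\tilde j^\mu_a$ as the color-fermion part of the total current $\tilde J^\mu_a$, that is

$$\tilde j^\nu_a(x) = \overline{\tilde\Psi}\gamma^\nu T^a\tilde\Psi \quad\text{with}\quad \tilde J^\nu_a(x) = \tilde j^\nu_a(x) + \tilde\chi^\nu_a(x) , \tag{2.85}$$

one notes that $J^+_a = \tilde J^+_a$ when comparing with the defining equation (2.77). For the transversal parts holds obviously

$$\tilde j^i_{\perp a} = \tilde\Psi^\dagger\alpha^i_\perp T^a\tilde\Psi = \tilde\Psi^\dagger_+\alpha^i_\perp T^a\tilde\Psi_- + \tilde\Psi^\dagger_-\alpha^i_\perp T^a\tilde\Psi_+ . \tag{2.86}$$

With $\gamma^+\gamma^+=0$ one finds

$$\overline{\tilde\Psi}\gamma^\mu\tilde A_\mu\gamma^+\gamma^\nu\tilde A_\nu\tilde\Psi = \overline{\tilde\Psi}\gamma^i_\perp\tilde A_{\perp i}\gamma^+\gamma^j_\perp\tilde A_{\perp j}\tilde\Psi = \tilde\Psi^\dagger\alpha^i_\perp\tilde A_{\perp i}\gamma^+\gamma^0\alpha^j_\perp\tilde A_{\perp j}\tilde\Psi = \sqrt2\,\tilde\Psi^\dagger_+\alpha^i_\perp\tilde A_{\perp i}\alpha^j_\perp\tilde A_{\perp j}\tilde\Psi_+ , \tag{2.87}$$

see also [300]. The covariant time derivative of the dynamic spinors $\Psi_\alpha$ is therefore

$$i\bar\Psi\gamma^+D^a_+T^a\Psi = g\,\tilde j^i_\perp\tilde A_{\perp i} + \frac{g^2}{2}\,\overline{\tilde\Psi}\gamma^\mu\tilde A_\mu\frac{\gamma^+}{i\partial^+}\gamma^\nu\tilde A_\nu\tilde\Psi + \frac12\,\overline{\tilde\Psi}\gamma^+\frac{m^2-\nabla_\perp^2}{i\partial^+}\tilde\Psi \tag{2.88}$$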
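in terms of the fields $\tilde A_\mu$ and $\tilde\Psi_\alpha$. One finds the same expression in the LB-convention. Since it is a hermitean operator, one can add Eqs. (2.82) and (2.88) to finally get the front-form Hamiltonian as a sum of five terms,

$$P_+ = \frac12\int dx_+d^2x_\perp\left(\overline{\tilde\Psi}\gamma^+\frac{m^2+(i\nabla_\perp)^2}{i\partial^+}\tilde\Psi + \tilde A^\mu_a(i\nabla_\perp)^2\tilde A^a_\mu\right) + g\int dx_+d^2x_\perp\,\tilde J^\mu_a\tilde A^a_\mu$$
$$\qquad + \frac{g^2}{4}\int dx_+d^2x_\perp\,\tilde B^{\mu\nu}_a\tilde B^a_{\mu\nu} + \frac{g^2}{2}\int dx_+d^2x_\perp\,\tilde J^+_a\frac{1}{(i\partial^+)^2}\tilde J^+_a$$
$$\qquad + \frac{g^2}{2}\int dx_+d^2x_\perp\,\overline{\tilde\Psi}\gamma^\mu T^a\tilde A^a_\mu\,\frac{\gamma^+}{i\partial^+}\left(\gamma^\nu T^b\tilde A^b_\nu\tilde\Psi\right) . \tag{2.89}$$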
Only the first term survives the limit $g\to0$, hence $P^-\to\tilde P^-$, referred to as the free part of the Hamiltonian. For completeness, the space-like components of energy-momentum as given in Eq. (2.55) become

$$P_k = \int dx_+d^2x_\perp\left(F^{+\kappa}F_{\kappa k} + i\bar\Psi\gamma^+T^aD^a_k\Psi\right) = \int dx_+d^2x_\perp\left(\overline{\tilde\Psi}\gamma^+i\partial_k\tilde\Psi + \tilde A^\mu_a\,\partial^+\partial_k\tilde A^a_\mu\right) \quad\text{for } k=1,2,- . \tag{2.90}$$

Inserting the free solutions as given below in Eq. (2.100), one gets for $\tilde P^\mu = (P^+, \vec P_\perp, \tilde P^-)$

$$\tilde P^\mu = \sum_{\lambda,c,f}\int dp^+d^2p_\perp\,p^\mu\left(\tilde a^\dagger(q)\tilde a(q) + \tilde b^\dagger(q)\tilde b(q) + \tilde d^\dagger(q)\tilde d(q)\right) , \tag{2.91}$$

in line with expectation: in momentum representation the momenta $\tilde P^\mu$ are diagonal operators. Terms depending on the coupling constant are interactions and in general are non-diagonal operators in Fock space.

Eqs. (2.89) and (2.90) are quite generally applicable:

- They hold both in the Kogut–Soper and Lepage–Brodsky convention.
- They hold for arbitrary non-abelian gauge theory SU(N).
- They hold therefore also for QCD ($N=3$) and are manifestly invariant under color rotations.
- They hold for abelian gauge theory (QED), formally by replacing the color matrices $T^a_{c,c'}$ with the unit matrix and by setting to zero the structure constants $f^{abc}$, thus $B^{\mu\nu}=0$ and $\chi^\mu=0$.
- They hold for 1 time dimension and arbitrary $d+1$ space dimensions, with $i=1,\dots,d$. All that has to be adjusted is the volume integral $\int dx_+d^2x_\perp$.
- They thus hold also for the popular toy models in 1+1 dimensions.
- Last but not least, they hold for the 'dimensionally reduced models' of gauge theory, formally by setting the transversal derivatives of the free fields to zero, that is $\vec\partial_\perp\tilde\Psi_\alpha=0$ and $\vec\partial_\perp\tilde A_\mu=0$.
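The abelian limit in the list above can be made concrete with a small numerical sketch. It evaluates $B^{\mu\nu}_a = f^{abc}A^\mu_bA^\nu_c$ of Eq. (2.79) once with SU(2) structure constants and once with all structure constants set to zero. The choice of SU(2) with $f^{abc}=\epsilon^{abc}$, the helper names and the arbitrary field values are assumptions made purely for illustration.

```python
def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices 0, 1, 2 (SU(2) structure constants)."""
    return (a - b) * (b - c) * (c - a) // 2


def abelian(a, b, c):
    """Vanishing structure constants, as in the formal QED limit."""
    return 0


def b_tensor(f, fields, a, mu, nu):
    """B^{mu nu}_a = f^{abc} A^mu_b A^nu_c, with fields[b][mu] the value of A^mu_b."""
    n_colors = len(fields)
    return sum(
        f(a, b, c) * fields[b][mu] * fields[c][nu]
        for b in range(n_colors)
        for c in range(n_colors)
    )


# Arbitrary illustrative field values: three colors, four Lorentz components.
example_fields = [
    [1, 2, 3, 4],
    [0, 1, 0, 2],
    [3, 0, 1, 1],
]
```

With these inputs `b_tensor(levi_civita, example_fields, 0, 0, 1)` is non-zero, the tensor is antisymmetric in its Lorentz indices, and every component vanishes once `abelian` is used, which is the sense in which the four-gluon term drops out for QED.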
Most remarkable, however, is that the relativistic Hamiltonian in Eq. (2.89) is additive [274] in the "kinetic" and the "potential" energy, very much like a non-relativistic Hamiltonian

$$H = T + U . \tag{2.92}$$
In this respect, the front form is distinctly different from the conventional instant form. With $H\equiv P_+$ the kinetic energy

$$T = \tilde P_+ = \frac12\int dx_+d^2x_\perp\left(\overline{\tilde\Psi}\gamma^+\frac{m^2+(i\nabla_\perp)^2}{i\partial^+}\tilde\Psi + \tilde A^\mu_a(i\nabla_\perp)^2\tilde A^a_\mu\right) \tag{2.93}$$

is the only term surviving the limit $g\to0$ in Eq. (2.89). The potential energy $U$ is correspondingly the sum of the four terms

$$U = V + W_1 + W_2 + W_3 . \tag{2.94}$$

Each of them has a different origin and interpretation. The vertex interaction

$$V = g\int dx_+d^2x_\perp\,\tilde J^\mu_a\tilde A^a_\mu \tag{2.95}$$
is the light-cone analogue of the $J_\mu A^\mu$-structures known from covariant theories, particularly electrodynamics. It generates three-point vertices describing bremsstrahlung and pair creation. However, since $\tilde J^\mu$ contains also the pure gluon part $\tilde\chi^\mu$, it includes the three-point-gluon vertices as well. The four-point-gluon interactions

$$W_1 = \frac{g^2}{4}\int dx_+d^2x_\perp\,\tilde B^{\mu\nu}_a\tilde B^a_{\mu\nu} \tag{2.96}$$

describe the four-point-gluon vertices. They are typical for non-abelian gauge theory and come only from the color-magnetic fields in Eq. (2.82). The instantaneous-gluon interaction

$$W_2 = \frac{g^2}{2}\int dx_+d^2x_\perp\,\tilde J^+_a\frac{1}{(i\partial^+)^2}\tilde J^+_a \tag{2.97}$$

is the light-cone analogue of the Coulomb energy, having the same structure (density-propagator-density) and the same origin, namely Gauss' equation (2.69). $W_2$ describes quark-quark, gluon-gluon, and quark-gluon instantaneous-gluon interactions. The last term, finally, is the instantaneous-fermion interaction

$$W_3 = \frac{g^2}{2}\int dx_+d^2x_\perp\,\overline{\tilde\Psi}\gamma^\mu T^a\tilde A^a_\mu\,\frac{\gamma^+}{i\partial^+}\left(\gamma^\nu T^b\tilde A^b_\nu\tilde\Psi\right) . \tag{2.98}$$

It originates from the light-cone specific decomposition of Dirac's equation (2.74) and has no counterpart in conventional theories. The present formalism is however more symmetric: the instantaneous gluons and the instantaneous fermions are partners. This has some interesting consequences, as we shall see below. Actually, the instantaneous interactions were seen first by Kogut and Soper [274] in the time-dependent analysis of the scattering amplitude as remnants of choosing the light-cone gauge.

One should carefully distinguish the above front-form Hamiltonian $H$ from the light-cone Hamiltonian $H_{LC}$, defined in Eqs. (2.63) and (2.64) as the operator of invariant mass-squared. The former is the time-like component of a four-vector and therefore frame-dependent. The latter is a Lorentz scalar and therefore independent of the frame. The former is covariant, the latter invariant under Lorentz transformations, particularly under boosts. The two are related to each
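other by multiplying $H$ with a number, the eigenvalue of $2P^+$:

$$H_{LC} = 2P^+H . \tag{2.99}$$

The above discussion and interpretation of $H$ applies therefore also to $H_{LC}$. Note that matrix elements of the "Hamiltonian" have the dimension $\langle\text{energy}\rangle^2$.

2.8. The interactions as operators acting in Fock space

In Section 2.7 the energy-momentum four-vector $P_\mu$ was expressed in terms of the free fields. One inserts them into the expressions for the interactions and integrates over configuration space. The free fields are

$$\tilde\Psi_{\alpha cf}(x) = \sum_\lambda\int\frac{dp^+d^2p_\perp}{\sqrt{2p^+(2\pi)^3}}\left(\tilde b(q)\,u_\alpha(p,\lambda)\,e^{-ipx} + \tilde d^\dagger(q)\,v_\alpha(p,\lambda)\,e^{+ipx}\right) ,$$
$$\tilde A^a_\mu(x) = \sum_\lambda\int\frac{dp^+d^2p_\perp}{\sqrt{2p^+(2\pi)^3}}\left(\tilde a(q)\,\epsilon_\mu(p,\lambda)\,e^{-ipx} + \tilde a^\dagger(q)\,\epsilon^*_\mu(p,\lambda)\,e^{+ipx}\right) , \tag{2.100}$$

where the properties of the $u_\alpha$, $v_\alpha$ and $\epsilon_\mu$ are given in the appendices and where

$$[\tilde a(q),\tilde a^\dagger(q')] = \{\tilde b(q),\tilde b^\dagger(q')\} = \{\tilde d(q),\tilde d^\dagger(q')\} = \delta(p^+-p^{+\prime})\,\delta^{(2)}(\vec p_\perp-\vec p^{\,\prime}_\perp)\,\delta^{\lambda'}_\lambda\delta^{c'}_c\delta^{f'}_f . \tag{2.101}$$

Carrying out the insertion in detail is quite laborious. We therefore restrict ourselves here to a few instructive examples, the vertex interaction $V$, the instantaneous-gluon interaction $W_2$ and the instantaneous-fermion interaction $W_3$.

According to Eq. (2.95) the fermionic contribution to the vertex interaction is

$$V_f = g\int dx_+d^2x_\perp\,\tilde j^\mu_a\tilde A^a_\mu = g\int dx_+d^2x_\perp\,\overline{\tilde\Psi}(x)\gamma^\mu T^a\tilde\Psi(x)\,\tilde A^a_\mu(x)\Big|_{x^+=0}$$
$$= \frac{g}{\sqrt{(2\pi)^3}}\sum_{\lambda_1,\lambda_2,\lambda_3}\ \sum_{c_1,c_2,a_3}\int\frac{dp^+_1d^2p_{\perp1}}{\sqrt{2p^+_1}}\int\frac{dp^+_2d^2p_{\perp2}}{\sqrt{2p^+_2}}\int\frac{dp^+_3d^2p_{\perp3}}{\sqrt{2p^+_3}}\int\frac{dx_+d^2x_\perp}{(2\pi)^3}$$
$$\times\left[\left(\tilde b^\dagger(q_1)\bar u_\alpha(p_1,\lambda_1)e^{+ip_1x} + \tilde d(q_1)\bar v_\alpha(p_1,\lambda_1)e^{-ip_1x}\right)T^{a_3}_{c_1,c_2}\gamma^\mu_{\alpha\beta}\left(\tilde d^\dagger(q_2)v_\beta(p_2,\lambda_2)e^{+ip_2x} + \tilde b(q_2)u_\beta(p_2,\lambda_2)e^{-ip_2x}\right)\right]$$
$$\times\left(\tilde a^\dagger(q_3)\epsilon^*_\mu(p_3,\lambda_3)e^{+ip_3x} + \tilde a(q_3)\epsilon_\mu(p_3,\lambda_3)e^{-ip_3x}\right) . \tag{2.102}$$

The integration over configuration space produces essentially Dirac delta-functions in the single-particle momenta, which reflect momentum conservation:

$$\int\frac{dx_+}{2\pi}\,e^{ix_+\left(\sum_jp^+_j\right)} = \delta\Big(\sum_jp^+_j\Big) , \qquad \int\frac{d^2x_\perp}{(2\pi)^2}\,e^{-i\vec x_\perp\cdot\left(\sum_j\vec p_{\perp j}\right)} = \delta^{(2)}\Big(\sum_j\vec p_{\perp j}\Big) . \tag{2.103}$$

Note that the sum of these single-particle momenta is essentially the sum of the particle momenta minus the sum of the hole momenta. Consequently, if a particular term has only creation or only destruction operators as in

$$b^\dagger(q_1)\,d^\dagger(q_2)\,a^\dagger(q_3)\,\delta(p^+_1+p^+_2+p^+_3) \simeq 0 ,$$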
its contribution vanishes since the light-cone longitudinal momenta $p^+$ are all positive and cannot add to zero. The case that they are exactly equal to zero is excluded by the regularization procedures discussed below in Section 4. As a consequence, all energy diagrams which generate the vacuum fluctuations in the usual formulation of quantum field theory are absent from the outset in the front form.

The purely fermionic part of the instantaneous-gluon interaction given by Eq. (2.97) becomes, respectively,

$$W_{2,f} = \frac{g^2}{2}\int dx_+d^2x_\perp\,\tilde j^+_a\frac{1}{(i\partial^+)^2}\tilde j^+_a = \frac{g^2}{2}\int dx_+d^2x_\perp\,\overline{\tilde\Psi}(x)\gamma^+T^a\tilde\Psi(x)\,\frac{1}{(i\partial^+)^2}\,\overline{\tilde\Psi}(x)\gamma^+T^a\tilde\Psi(x)\Big|_{x^+=0}$$
$$= \frac{g^2}{2(2\pi)^3}\sum_{\lambda_j}\ \sum_{c_1,c_2,c_3,c_4}\int\frac{dp^+_1d^2p_{\perp1}}{\sqrt{2p^+_1}}\int\frac{dp^+_2d^2p_{\perp2}}{\sqrt{2p^+_2}}\int\frac{dp^+_3d^2p_{\perp3}}{\sqrt{2p^+_3}}\int\frac{dp^+_4d^2p_{\perp4}}{\sqrt{2p^+_4}}\int\frac{dx_+d^2x_\perp}{(2\pi)^3}$$
$$\times\left[\left(\tilde b^\dagger(q_1)\bar u_\alpha(p_1,\lambda_1)e^{+ip_1x} + \tilde d(q_1)\bar v_\alpha(p_1,\lambda_1)e^{-ip_1x}\right)T^a_{c_1,c_2}\gamma^+_{\alpha\beta}\left(\tilde d^\dagger(q_2)v_\beta(p_2,\lambda_2)e^{+ip_2x} + \tilde b(q_2)u_\beta(p_2,\lambda_2)e^{-ip_2x}\right)\right]$$
$$\times\frac{1}{(i\partial^+)^2}\left[\left(\tilde b^\dagger(q_3)\bar u_\alpha(p_3,\lambda_3)e^{+ip_3x} + \tilde d(q_3)\bar v_\alpha(p_3,\lambda_3)e^{-ip_3x}\right)T^a_{c_3,c_4}\gamma^+_{\alpha\beta}\left(\tilde d^\dagger(q_4)v_\beta(p_4,\lambda_4)e^{+ip_4x} + \tilde b(q_4)u_\beta(p_4,\lambda_4)e^{-ip_4x}\right)\right] . \tag{2.104}$$

For the same reason as discussed above, there will be no contributions from terms with only creation or only destruction operators. The instantaneous-fermion interaction, finally, becomes according to Eq. (2.98),

$$W_3 = \frac{g^2}{2}\int dx_+d^2x_\perp\,\overline{\tilde\Psi}\gamma^\mu T^a\tilde A^a_\mu\,\frac{\gamma^+}{i\partial^+}\left(\gamma^\nu T^b\tilde A^b_\nu\tilde\Psi\right)$$
$$= \frac{g^2}{2(2\pi)^3}\sum_{\lambda_j}\ \sum_{c_1,a_2,a_3,c_4}\int\frac{dp^+_1d^2p_{\perp1}}{\sqrt{2p^+_1}}\int\frac{dp^+_2d^2p_{\perp2}}{\sqrt{2p^+_2}}\int\frac{dp^+_3d^2p_{\perp3}}{\sqrt{2p^+_3}}\int\frac{dp^+_4d^2p_{\perp4}}{\sqrt{2p^+_4}}\int\frac{dx_+d^2x_\perp}{(2\pi)^3}$$
$$\times\left(\tilde b^\dagger(q_1)\bar u(p_1,\lambda_1)e^{+ip_1x} + \tilde d(q_1)\bar v(p_1,\lambda_1)e^{-ip_1x}\right)T^{a_2}_{c_1,c}\,\gamma^\mu\left(\tilde a^\dagger(q_2)\epsilon^*_\mu(p_2,\lambda_2)e^{+ip_2x} + \tilde a(q_2)\epsilon_\mu(p_2,\lambda_2)e^{-ip_2x}\right)$$
$$\times\frac{\gamma^+}{i\partial^+}\left(\tilde a^\dagger(q_3)\epsilon^*_\nu(p_3,\lambda_3)e^{+ip_3x} + \tilde a(q_3)\epsilon_\nu(p_3,\lambda_3)e^{-ip_3x}\right)T^{a_3}_{c,c_4}\,\gamma^\nu\left(\tilde d^\dagger(q_4)v(p_4,\lambda_4)e^{+ip_4x} + \tilde b(q_4)u(p_4,\lambda_4)e^{-ip_4x}\right) . \tag{2.105}$$

Each of the instantaneous interaction types has primarily $2^4-2=14$ individual contributions, which will not be enumerated in all detail. In Section 4 complete tables of all interactions will be tabulated in their final normal-ordered form, that is, with all creation operators to the left of the destruction operators. All instantaneous interactions like those shown above are four-point interactions and the creation and destruction operators appear in a natural order. According to Wick's theorem this "time-ordered" product equals the normal-ordered product plus the sum of
all possible pairwise contractions. The fully contracted interactions are simple c-numbers which can be omitted due to vacuum renormalization. The one-pair contracted operators, however, cannot be thrown away and typically have a structure like

$$I(q)\,\tilde b^\dagger(q)\,\tilde b(q) . \tag{2.106}$$

Due to the properties of the spinors and polarization functions $u_\alpha$, $v_\alpha$ and $\epsilon_\mu$ they become diagonal operators in momentum space. The coefficients $I(q)$ are a kind of mass term and have been labeled as "self-induced inertias" [354]. Even if they formally diverge, they are part of the operator structure of field theory, and therefore should not be discarded but need careful regularization. In Section 4 they will be tabulated as well.
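The counting of $2^4-2=14$ contributions quoted above for each instantaneous interaction can be reproduced by brute force. The sketch below assumes only what the text states: each of the four fields contributes either a creation or a destruction operator, and the two assignments with only creation or only destruction operators vanish because the positive $p^+$ cannot add to zero. The function names and the labels `"c"` and `"d"` are hypothetical conveniences for this illustration.

```python
from itertools import product


def operator_assignments(n_legs=4):
    """All ways to pick creation ('c') or destruction ('d') on each field."""
    return list(product("cd", repeat=n_legs))


def surviving_assignments(n_legs=4):
    """Drop the pure-creation and pure-destruction terms.

    These vanish by longitudinal momentum conservation, since all p^+ are
    positive and cannot sum to zero.
    """
    return [a for a in operator_assignments(n_legs) if len(set(a)) > 1]


if __name__ == "__main__":
    print(len(surviving_assignments(4)))  # four-point (instantaneous) terms
    print(len(surviving_assignments(3)))  # three-point (vertex) terms
```

The same counting applied to the three fields of the vertex interaction in Eq. (2.102) leaves $2^3-2=6$ terms.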
3. Bound states on the light cone

In principle, the problem of computing for quantum chromodynamics the spectrum and the corresponding wavefunctions can be reduced to diagonalizing the light-cone Hamiltonian. Any hadronic state must be an eigenstate of the light-cone Hamiltonian, thus a bound state of mass $M$, which satisfies $(M^2-H_{LC})|M\rangle=0$. Projecting the Hamiltonian eigenvalue equation onto the various Fock states $\langle q\bar q|$, $\langle q\bar qg|$, $\dots$ results in an infinite number of coupled integral eigenvalue equations. Solving these equations is equivalent to solving the field theory. The light-cone Fock basis is a very physical tool for discussing these theories because the vacuum state is simple and the wavefunctions can be written in terms of relative coordinates which are frame-independent. In terms of the Fock-space wavefunction, one can give exact expressions for the form factors and structure functions of physical states. As an example we evaluate these expressions with a perturbative wavefunction for the electron and calculate the anomalous magnetic moment of the electron.

In order to lay down the groundwork for upcoming non-perturbative studies, it is indispensable to gain control over the perturbative treatment. We devote therefore a section to the perturbative treatment of quantum electrodynamics and gauge theory on the light cone. Light-cone perturbation theory is really Hamiltonian perturbation theory, and we give the complete set of rules which are the analogues of the Feynman rules. We shall demonstrate in a selected example that one gets the same covariant and gauge-invariant scattering amplitude as in Feynman theory. We also shall discuss one-loop renormalization of QED in the Hamiltonian formalism.

Quantization is done in the light-cone gauge, and the light-cone time-ordered perturbation theory is developed in the null-plane Hamiltonian formalism. For gauge-invariant quantities, this is very loosely equivalent to the use of Feynman diagrams together with an integration over $p^-$ by residues [426,427]. The one-loop renormalization of QED quantized on the null plane looks very different from the standard treatment. In addition to not being manifestly covariant, $x^+$-ordered perturbation theory is fraught with singularities, even at tree level. The origin of these unusual, "spurious", infrared divergences is not a mystery. Consider, for example, a free particle whose transverse momentum $\vec p_\perp=(p^1,p^2)$ is fixed, and whose third component $p^3$ is cut at some momentum $\Lambda$. Using the mass-shell relation, $p^-=(m^2+p_\perp^2)/2p^+$, one sees that $p^+$ has a lower bound proportional to $\Lambda^{-1}$. Hence, the light-cone spurious infrared divergences are simply a manifestation of space-time ultraviolet divergences. A great deal of work is continuing on how to treat these divergences in a self-consistent manner [456]. Bona fide infrared divergences are of
course also present, and can be taken care of as usual by giving the photon a small mass, consistent with light-cone quantization [406].

As a matter of practical experience, and quite opposed to the instant form of the Hamiltonian approach, one gets reasonable results even if the infinite number of integral equations is truncated. The Schwinger model is particularly illustrative because in the instant form this bound state has a very complicated structure in terms of Fock states while in the front form the bound state consists of a single electron-positron pair. One might hope that a similar simplification occurs in QCD. The Yukawa model is treated here in Tamm-Dancoff truncation in 3+1 dimensions [182,373,374]. This model is particularly important because it features a number of the renormalization problems inherent to the front form, and it motivates the approach of Wilson to be discussed later.

3.1. The hadronic eigenvalue problem

The first step is to find a language in which one can represent hadrons in terms of relativistic confined quarks and gluons. The Bethe-Salpeter formalism [37,312] has been the central method for analyzing hydrogenic atoms in QED, providing a completely covariant procedure for obtaining bound-state solutions. However, calculations using this method are extremely complex and appear to be intractable much beyond the ladder approximation. It also appears impractical to extend this method to systems with more than a few constituent particles. A review can be found in Ref. [312].

An intuitive approach for solving relativistic bound-state problems would be to solve the Hamiltonian eigenvalue problem

$$H|\Psi\rangle = \sqrt{M^2+\vec P^2}\;|\Psi\rangle \tag{3.1}$$

for the particle's mass, $M$, and wavefunction, $|\Psi\rangle$. Here, one imagines that $|\Psi\rangle$ is an expansion in multi-particle occupation number Fock states, and that the operators $H$ and $\vec P$ are second-quantized Heisenberg operators. Unfortunately, this method, as described by Tamm and Dancoff [117,421], is complicated by its non-covariant structure and the necessity to first understand its complicated vacuum eigensolution over all space and time. The presence of the square-root operator also presents severe mathematical difficulties. Even if these problems could be solved, the eigensolution is only determined in its rest system ($\vec P=0$); determining the boosted wavefunction is as complicated as diagonalizing $H$ itself.

In principle, the front-form approach works in the same way. One aims at solving the Hamiltonian eigenvalue problem

$$H|\Psi\rangle = \frac{M^2+\vec P_\perp^2}{2P^+}\,|\Psi\rangle , \tag{3.2}$$

which for several reasons is easier: contrary to $P_z$ the operator $P^+$ is positive, having only positive eigenvalues. The square-root operator is absent, and the boost operators are kinematic, see Section 2.6. As discussed there, in both the instant and the front form, the eigenfunctions can be labeled with six numbers, the six eigenvalues of the invariant mass $M$, of the three space-like momenta $P^+$, $\vec P_\perp$, and of the generalized total spin-squared $S^2$ and its longitudinal projection $S_z$, that is

$$|\Psi\rangle = |\Psi;\,M,P^+,\vec P_\perp,S^2,S_z;\,h\rangle . \tag{3.3}$$
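The two advantages claimed for Eq. (3.2), a positive $P^+$ and the absence of a square-root operator, can be checked numerically for a single free state. The sketch below assumes the Lepage-Brodsky normalization $P^\pm=P^0\pm P^3$, so that $P_+=P^-/2$; the function names and the sample values are chosen only for illustration.

```python
import math


def front_form_momenta(mass, px, py, pz):
    """Return (P^+, P^-, P_perp^2) with P^± = E ± P_z (assumed LB convention)."""
    energy = math.sqrt(mass**2 + px**2 + py**2 + pz**2)  # instant-form energy, Eq. (3.1)
    return energy + pz, energy - pz, px**2 + py**2


def front_form_energy(mass, p_plus, p_perp_sq):
    """Right-hand side of Eq. (3.2): (M^2 + P_perp^2) / (2 P^+), no square root."""
    return (mass**2 + p_perp_sq) / (2.0 * p_plus)


if __name__ == "__main__":
    # A state moving in the negative z direction still has P^+ > 0.
    p_plus, p_minus, p_perp_sq = front_form_momenta(1.0, 0.3, -0.4, -5.0)
    print(p_plus > 0.0)
    print(abs(front_form_energy(1.0, p_plus, p_perp_sq) - 0.5 * p_minus) < 1e-12)
```

Under the stated convention the eigenvalue in Eq. (3.2) equals $P^-/2$ for any sign of $P_z$, which follows from $P^+P^- = M^2+\vec P_\perp^2$.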
In addition, the eigenfunction is labeled by quantum numbers like charge, parity, or baryon number which specify a particular hadron $h$. The ket $|\Psi\rangle$ can be calculated in terms of a complete set of functions $|\mu\rangle$ or $|\mu_n\rangle$,

$$\int d[\mu]\,|\mu\rangle\langle\mu| = \sum_n\int d[\mu_n]\,|\mu_n\rangle\langle\mu_n| = 1 . \tag{3.4}$$

The transformation between the complete set of eigenstates $|\Psi\rangle$ and the complete set of basis states $|\mu_n\rangle$ is then given by $\langle\mu_n|\Psi\rangle$. The projections of $|\Psi\rangle$ on $|\mu_n\rangle$ are usually called the wavefunctions

$$\Psi_{n/h(M,P^+,\vec P_\perp,S^2,S_z)}(\mu) \equiv \langle\mu_n|\Psi\rangle . \tag{3.5}$$

Since the values of $(M,P^+,\vec P_\perp,S^2,S_z)$ are obvious in the context of a concrete case, we agree to drop reference to them and write simply

$$|\Psi\rangle = \sum_n\int d[\mu_n]\,|\mu_n\rangle\,\Psi_{n/h}(\mu) \equiv \sum_n\int d[\mu_n]\,|\mu_n\rangle\langle\mu_n|\Psi;\,M,P^+,\vec P_\perp,S^2,S_z;\,h\rangle . \tag{3.6}$$

One constructs the complete basis of Fock states $|\mu_n\rangle$ in the usual way by applying products of free-field creation operators to the vacuum state $|0\rangle$:

$$\begin{array}{lll}
n=0: & |0\rangle , & \\
n=1: & |q\bar q:\ k^+_i,\vec k_{\perp i},\lambda_i\rangle & = b^\dagger(q_1)\,d^\dagger(q_2)\,|0\rangle , \\
n=2: & |q\bar qg:\ k^+_i,\vec k_{\perp i},\lambda_i\rangle & = b^\dagger(q_1)\,d^\dagger(q_2)\,a^\dagger(q_3)\,|0\rangle , \\
n=3: & |gg:\ k^+_i,\vec k_{\perp i},\lambda_i\rangle & = a^\dagger(q_1)\,a^\dagger(q_2)\,|0\rangle , \\
\ \vdots & \qquad\vdots & \qquad\vdots
\end{array} \tag{3.7}$$

The operators $b^\dagger(q)$, $d^\dagger(q)$ and $a^\dagger(q)$ create bare leptons (electrons or quarks), bare antileptons (positrons or antiquarks) and bare vector bosons (photons or gluons). In the above notation, one explicitly keeps track of only the three continuous momenta $k^+_i$ and $\vec k_{\perp i}$ and of the discrete helicities $\lambda_i$. The various Fock-space classes are conveniently labeled with a running index $n$. Each Fock state $|\mu_n\rangle = |n:\ k^+_i,\vec k_{\perp i},\lambda_i\rangle$ is an eigenstate of $P^+$ and $\vec P_\perp$. The eigenvalues are

$$\vec P_\perp = \sum_{i\in n}\vec k_{\perp i} , \qquad P^+ = \sum_{i\in n}k^+_i \quad\text{with}\quad k^+_i>0 . \tag{3.8}$$

The vacuum has eigenvalue 0, i.e. $\vec P_\perp|0\rangle = \vec 0$ and $P^+|0\rangle = 0$.

The restriction $k^+>0$ for massive quanta is a key difference between light-cone quantization and ordinary equal-time quantization. In equal-time quantization, the state of a parton is specified by its ordinary three-momentum $\vec k=(k_x,k_y,k_z)$. Since each component of $\vec k$ can be either positive or negative, there exist zero total momentum Fock states of arbitrary particle number, and these will mix with the zero-particle state to build up the ground state, the physical vacuum. However, in light-cone quantization each of the particles forming a zero-momentum state must have vanishingly small $k^+$. The free or Fock-space vacuum $|0\rangle$ is then an exact eigenstate of the full front-form Hamiltonian $H$, in stark contrast to the quantization at equal usual time. However, as we shall see later, the vacuum in QCD is undoubtedly more complicated due to the possibility of color-singlet
states with $P^+=0$ built on zero-mode massless gluon quanta [189], but as discussed in Section 7, the physical vacuum is still far simpler than is usual.

Since $k^+_i>0$ and $P^+>0$, one can define boost-invariant longitudinal momentum fractions

$$x_i = \frac{k^+_i}{P^+} \quad\text{with}\quad 0<x_i<1 , \tag{3.9}$$

and adjust the notation. All particles in a Fock state $|\mu_n\rangle = |n:\ x_i,\vec k_{\perp i},\lambda_i\rangle$ have then four-momentum

$$k^\mu_i \equiv (k^+,\vec k_\perp,k^-)_i = \left(x_iP^+,\ \vec k_{\perp i},\ \frac{m_i^2+k_{\perp i}^2}{x_iP^+}\right) \quad\text{for } i=1,\dots,N_n , \tag{3.10}$$

and are "on shell", $(k^\mu k_\mu)_i = m_i^2$. Also the Fock state is "on shell" since one can interpret

$$\Big(\sum_{i=1}^{N_n}k^-_i\Big)P^+ - \vec P_\perp^2 = \sum_{i=1}^{N_n}\frac{(\vec k_{\perp i}+x_i\vec P_\perp)^2+m_i^2}{x_i} - \vec P_\perp^2 = \sum_{i=1}^{N_n}\left(\frac{\vec k_\perp^2+m^2}{x}\right)_i \tag{3.11}$$

as its free invariant mass squared $\tilde M^2 = \tilde P^\mu\tilde P_\mu$. There is some confusion over the terms "on-shell" and "off-shell" in the literature [367]. The single-particle states are on-shell, as mentioned, but the Fock states $\mu_n$ are off the energy shell since $\tilde M$ in general is different from the bound-state mass $M$ which appears in Eq. (3.2). In the intrinsic frame ($\vec P_\perp=\vec 0$), the values of $x_i$ and $\vec k_{\perp i}$ are constrained by

$$\sum_{i=1}^{N_n}x_i = 1 , \qquad \sum_{i=1}^{N_n}\vec k_{\perp i} = \vec 0 , \tag{3.12}$$

because of Eq. (3.8). The phase-space differential $d[\mu_n]$ depends on how one normalizes the single-particle states. In the convention where commutators are normalized to a Dirac $\delta$-function, the phase-space integration is

$$\int d[\mu_n]\cdots = \sum_{\lambda_i\in n}\int[dx_i\,d^2k_{\perp i}]\cdots , \quad\text{with}$$
$$[dx_i\,d^2k_{\perp i}] = \delta\Big(1-\sum_{j=1}^{N_n}x_j\Big)\,\delta^{(2)}\Big(\sum_{j=1}^{N_n}\vec k_{\perp j}\Big)\,dx_1\cdots dx_{N_n}\,d^2k_{\perp1}\cdots d^2k_{\perp N_n} . \tag{3.13}$$

The additional Dirac $\delta$-functions account for the constraints (3.12). The eigenvalue equation (3.2) therefore stands for an infinite set of coupled integral equations

$$\sum_{n'}\int[d\mu'_{n'}]\,\langle n:\ x_i,\vec k_{\perp i},\lambda_i|H|n':\ x'_i,\vec k'_{\perp i},\lambda'_i\rangle\,\Psi_{n'/h}(x'_i,\vec k'_{\perp i},\lambda'_i) = \frac{M^2+\vec P_\perp^2}{2P^+}\,\Psi_{n/h}(x_i,\vec k_{\perp i},\lambda_i) \tag{3.14}$$

for $n=1,\dots,\infty$. The major difficulty is not primarily the large number of coupled integral equations, but rather that the above equations are ill-defined for very large values of the transversal momenta ("ultraviolet singularities") and for values of the longitudinal momenta close to the endpoints $x\sim0$ or $x\sim1$ ("endpoint singularities"). One often has to introduce cut-offs $\Lambda$, to regulate the theory in some convenient way, and subsequently to renormalize it at a particular
mass or momentum scale $Q$. The corresponding wavefunction will be indicated by corresponding upper scripts,

$$\Psi^{(\Lambda)}_{n/h}(x_i,\vec k_\perp,\lambda_i) \quad\text{or}\quad \Psi^{(Q)}_{n/h}(x_i,\vec k_\perp,\lambda_i) . \tag{3.15}$$

Consider a pion in QCD with momentum $P=(P^+,\vec P_\perp)$ as an example. It is described by

$$|\pi:\ P\rangle = \sum_{n=1}^{\infty}\int d[\mu_n]\,|n:\ x_iP^+,\ \vec k_{\perp i}+x_i\vec P_\perp,\ \lambda_i\rangle\,\Psi_{n/\pi}(x_i,\vec k_{\perp i},\lambda_i) , \tag{3.16}$$

where the sum is over all Fock space sectors of Eq. (3.7). The ability to specify wavefunctions simultaneously in any frame is a special feature of light-cone quantization. The light-cone wavefunctions $\Psi_{n/\pi}$ do not depend on the total momentum, since $x_i$ is the longitudinal momentum fraction carried by the $i$-th parton and $\vec k_{\perp i}$ is its momentum "transverse" to the direction of the meson; both of these are frame-independent quantities. They are the probability amplitudes to find a Fock state of bare particles in the physical pion.

More generally, consider a meson in SU(N). The kernel of the integral equation (3.14) is illustrated in Fig. 2 in terms of the block matrix $\langle n:\ x_i,\vec k_{\perp i},\lambda_i|H|n':\ x'_i,\vec k'_{\perp i},\lambda'_i\rangle$. The structure of this matrix depends of course on the way one has arranged the Fock space, see Eq. (3.7). Note that most of the block matrix elements vanish due to the nature of the light-cone interaction as defined in Eq. (2.94). The vertex interaction in Eq. (2.95) changes the particle number by one, while the instantaneous interactions in Eqs. (2.96), (2.97) and (2.98) change the particle number only up to two.

Fig. 2. The Hamiltonian matrix for a SU(N)-meson. The matrix elements are represented by energy diagrams. Within each block they are all of the same type: either vertex, fork or seagull diagrams. Zero matrices are denoted by a dot (·). The single gluon is absent since it cannot be color neutral.

3.2. The use of light-cone wavefunctions

The infinite set of integral equations (3.14) is difficult if not impossible to solve. But given the light-cone wavefunctions $\Psi_{n/h}(x_i,\vec k_{\perp i},\lambda_i)$, one can compute any hadronic quantity by convolution with the appropriate quark and gluon matrix elements. In many cases of practical interest it suffices to know less information than the complete wavefunction. As an example consider
$$G_{a/h}(x,Q) = \sum_n\int d[\mu_n]\,\left|\Psi^{(Q)}_{n/h}(x_i,\vec k_{\perp i},\lambda_i)\right|^2\sum_i\delta(x-x_i) . \tag{3.17}$$

$G_{a/h}$ is a function of one variable, characteristic for a particular hadron, and depends parametrically on the typical scale $Q$. It gives the probability to find in that hadron a particle with longitudinal momentum fraction $x$, irrespective of the particle type, and irrespective of its spin, color, flavor or transversal momentum $\vec k_\perp$. Because of wavefunction normalization the integrated probability is normalized to one.

One can ask also for conditional probabilities, for example, for the probability to find a quark of a particular flavor $f$ and its momentum fraction $x$, but again irrespective of the other quantum numbers. Thus,

$$G_{f/h}(x;Q) = \sum_n\int d[\mu_n]\,\left|\Psi^{(Q)}_{n/h}(x_i,\vec k_{\perp i},\lambda_i)\right|^2\sum_i\delta(x-x_i)\,\delta_{i,f} . \tag{3.18}$$

The conditional probability is not normalized, even if one sums over all flavors. Such probability functions can be measured. For exclusive cross sections, one often needs only the probability amplitudes of the valence part

$$\Phi_{f/h}(x;Q) = \sum_n\int d[\mu_n]\,\Psi^{(Q)}_{n/h}(x_i,\vec k_{\perp i},\lambda_i)\sum_i\delta(x-x_i)\,\delta_{i,f}\,\delta_{n,\text{valence}}\,\Theta(\vec k_{\perp i}^2\le Q^2) . \tag{3.19}$$

Here, the transverse momenta are integrated up to momentum transfer $Q^2$.

The leading-twist structure functions measured in deep-inelastic lepton scattering are immediately related to the above light-cone probability distributions by

$$2MF_1(x,Q) = \frac{F_2(x,Q)}{x} \approx \sum_fe_f^2\,G_{f/p}(x,Q) . \tag{3.20}$$

This follows from the observation that deep-inelastic lepton scattering in the Bjorken-scaling limit occurs if $x_{bj}$ matches the light-cone fraction of the struck quark with charge $e_f$. However, the light-cone wavefunctions contain much more information for the final state of deep-inelastic scattering, such as the multi-parton distributions, spin and flavor correlations, and the spectator jet composition.

One of the most remarkable simplicities of the light-cone formalism is that one can write down the exact expressions for the electro-magnetic form factors. In the interaction picture, one can
equate the full Heisenberg current to the free (quark) current Jk(0) described by the free Hamiltonian at x`"0. As was first shown by Drell and Yan [133], it is advantageous to choose a special coordinate frame to compute form factors, structure functions, and other current matrix elements at space-like photon momentum. One then has to examine only the J` component to get form factors like F (q2)"SP@, S@DJ`DP,ST with q "P@ !P . (3.21) S?S{ u k k This holds for any (composite) hadron of mass M, and any initial or final spins S [133,56]. In the Drell frame, as illustrated in Fig. 3, the photon’s momentum is transverse to the momentum of the incident hadron and the incident hadron can be directed along the z-direction, thus
A
B
M2 Pk" P`, 0 , , M P`
A
B
2q ) P qk" 0, q , . M P`
(3.22)
With such a choice the four-momentum transfer is $-q_\mu q^\mu\equiv Q^2=\vec q_\perp^{\,2}$, and the quark current can neither create pairs nor annihilate the vacuum. This is distinctly different from the conventional treatment, where there are contributions from terms in which the current is annihilated by the vacuum, as illustrated in Fig. 4. Front-form kinematics allow one to trivially boost the hadron's four-momentum from $P$ to $P'$, and therefore the space-like form factor for a hadron is just a sum of overlap integrals analogous to the corresponding non-relativistic formula [133]
$$F_{S\to S'}(Q^2)=\sum_n\sum_f e_f\int d[\mu_n]\,\Psi^*_{n,S'}(x_i,\vec\ell_{\perp i},\lambda_i)\,\Psi_{n,S}(x_i,\vec k_{\perp i},\lambda_i). \tag{3.23}$$
Here $e_f$ is the charge of the struck quark, and
$$\vec\ell_{\perp i}\equiv\begin{cases}\vec k_{\perp i}-x_i\vec q_\perp+\vec q_\perp&\text{for the struck quark},\\ \vec k_{\perp i}-x_i\vec q_\perp&\text{for all other partons}.\end{cases} \tag{3.24}$$
This is particularly simple for a spin-zero hadron like a pion. Notice that the transverse momenta appearing as arguments of the first wavefunction correspond not to the actual momenta carried by the partons but to the actual momenta minus $x_i\vec q_\perp$, to account for the motion of the final hadron. Notice also that $\vec\ell_{\perp i}$ and $\vec k_{\perp i}$ become equal as $\vec q_\perp\to0$, and that $F_\pi\to1$ in this limit due to
Fig. 3. Calculation of the form factor of a bound state from the convolution of light-cone Fock amplitudes. The result is exact if one sums over all $\Psi_n$.
Fig. 4. (a) Illustration of a vacuum creation graph in time-ordered perturbation theory. A corresponding contribution to the form factor of a bound state is shown in (b).
wavefunction normalization. In most of the cases it suffices to treat the problem in perturbation theory.

3.3. Perturbation theory in the front form

The light-cone Green's functions $\tilde G_{fi}(x^+)$ are the probability amplitudes that a state starting in Fock state $|i\rangle$ ends up in Fock state $|f\rangle$ a (light-cone) time $x^+$ later
$$\langle f|\tilde G(x^+)|i\rangle=\langle f|e^{-iP_+x^+}|i\rangle=\langle f|e^{-iHx^+}|i\rangle=i\int\frac{d\epsilon}{2\pi}\,e^{-i\epsilon x^+}\langle f|G(\epsilon)|i\rangle. \tag{3.25}$$
The Fourier transform $\langle f|G(\epsilon)|i\rangle$ is usually called the resolvent of the Hamiltonian $H$ [333], i.e.
$$\langle f|G(\epsilon)|i\rangle=\left\langle f\left|\frac{1}{\epsilon-H+i0_+}\right|i\right\rangle=\left\langle f\left|\frac{1}{\epsilon-H_0-U+i0_+}\right|i\right\rangle. \tag{3.26}$$
Separating the Hamiltonian $H=H_0+U$ according to Eq. (2.92) into a free part $H_0$ and an interaction $U$, one can expand the resolvent into the usual series
$$\langle f|G(\epsilon)|i\rangle=\left\langle f\left|\frac{1}{\epsilon-H_0+i0_+}+\frac{1}{\epsilon-H_0+i0_+}U\frac{1}{\epsilon-H_0+i0_+}+\frac{1}{\epsilon-H_0+i0_+}U\frac{1}{\epsilon-H_0+i0_+}U\frac{1}{\epsilon-H_0+i0_+}+\cdots\right|i\right\rangle. \tag{3.27}$$
The rules for $x^+$-ordered perturbation theory follow immediately when the resolvent of the free Hamiltonian $(\epsilon-H_0)^{-1}$ is replaced by its spectral decomposition,
$$\frac{1}{\epsilon-H_0+i0_+}=\sum_n\int d[\mu_n]\,|n:k_i^+,\vec k_{\perp i},\lambda_i\rangle\,\frac{1}{\epsilon-\sum_i\left(\frac{k_\perp^2+m^2}{2k^+}\right)_i+i0_+}\,\langle n:k_i^+,\vec k_{\perp i},\lambda_i|. \tag{3.28}$$
The sum becomes a sum over all states $n$ intermediate between two interactions $U$. To calculate then $\langle f|G(\epsilon)|i\rangle$ perturbatively, all $x^+$-ordered diagrams must be considered, the contribution from each graph computed according to the rules of old-fashioned Hamiltonian perturbation theory [274,299]:
1. Draw all topologically distinct $x^+$-ordered diagrams.
2. Assign to each line a momentum $k^\mu$, a helicity $\lambda$, as well as color and flavor, corresponding to a single particle on-shell, with $k^\mu k_\mu=m^2$. With fermions (electrons or quarks) associate a spinor $u_\alpha(k,\lambda)$, with antifermions $v_\alpha(k,\lambda)$, and with vector bosons (photons or gluons) a polarization vector $\epsilon_\mu(k,\lambda)$. These are given explicitly in Appendices B and C.
3. For each vertex include a factor $V$ as given in Fig. 5 for QED and Fig. 6 for QCD, with further tables given in Section 4. To convert incoming into outgoing lines or vice versa replace $u\leftrightarrow v$, $\bar u\leftrightarrow-\bar v$, $\epsilon\leftrightarrow\epsilon^*$ in any of these vertices (see also items 8, 9, and 10).
4. For each intermediate state there is a factor
$$\frac{1}{\epsilon-\sum_i\left(\frac{k_\perp^2+m^2}{2k^+}\right)_i+i0_+},$$
where $\epsilon=\tilde P_{in,+}$ is the incident light-cone energy.
5. To account for three-momentum conservation include for each intermediate state the delta functions $\delta(P^+-\sum_ik_i^+)$ and $\delta^{(2)}(\vec P_\perp-\sum_i\vec k_{\perp i})$.
6. Integrate over each internal $k$ with the weight
$$\int d^2k_\perp\,dk^+\,\frac{\theta(k^+)}{(2\pi)^{3/2}}$$
and sum over internal helicities (and colors for gauge theories).
7. Include a factor $-1$ for each closed fermion loop, for each fermion line that both begins and ends in the initial state, and for each diagram in which fermion lines are interchanged in either of the initial or final states.
8. Imagine that every internal line is a sum of a "dynamic" and an "instantaneous" line, and draw all diagrams with $1,2,3,\ldots$ instantaneous lines.
9. Two consecutive instantaneous interactions give a vanishing contribution.
Fig. 5. A few selected matrix elements of the QED front-form Hamiltonian $H=P_+$ in KS-convention.
Fig. 6. A few selected matrix elements of the QCD front-form Hamiltonian $H=P_+$ in LB-convention.
10. For the instantaneous fermion lines use the factor $W_f$ in Fig. 5 or Fig. 6, or the corresponding tables in Section 4. For the instantaneous boson lines use the factor $W_b$.

The light-cone Fock state representation can thus be used advantageously in perturbation theory. The sum over intermediate Fock states is equivalent to summing all $x^+$-ordered diagrams and integrating over the transverse momentum and light-cone fractions $x$. Because of the restriction to positive $x$, diagrams corresponding to vacuum fluctuations or those containing backward-moving lines are eliminated.

3.4. Example 1: The $q\bar q$-scattering amplitude

The simplest application of the above rules is the calculation of the electron-muon scattering amplitude to lowest non-trivial order. But the quark-antiquark scattering is only marginally more difficult. We thus imagine an initial $(q,\bar q)$-pair with different flavors $f\neq\bar f$ to be scattered off each other by exchanging a gluon. Let us treat this problem as a pedagogical example to demonstrate the rules. Rule 1: There are two time-ordered diagrams associated with this process. In the first one the gluon is emitted by the quark and absorbed by the antiquark, and in the second it is emitted by the antiquark and absorbed by the quark. For the first diagram, we assign the momenta required in rule 2 by giving explicitly the initial and final Fock states
$$|q,\bar q\rangle=\frac{1}{\sqrt{n_c}}\sum_{c=1}^{n_c}b^\dagger_{cf}(k_q,\lambda_q)\,d^\dagger_{c\bar f}(k_{\bar q},\lambda_{\bar q})|0\rangle, \tag{3.29}$$
$$|q',\bar q'\rangle=\frac{1}{\sqrt{n_c}}\sum_{c=1}^{n_c}b^\dagger_{cf}(k_q',\lambda_q')\,d^\dagger_{c\bar f}(k_{\bar q}',\lambda_{\bar q}')|0\rangle, \tag{3.30}$$
respectively. Note that both states are invariant under SU($n_c$). The usual color singlets of QCD are obtained by setting $n_c=3$. The intermediate state
$$|q',\bar q,g\rangle=\sqrt{\frac{2}{n_c^2-1}}\sum_{c=1}^{n_c}\sum_{c'=1}^{n_c}\sum_{a=1}^{n_c^2-1}T^a_{c,c'}\,b^\dagger_{cf}(k_q',\lambda_q')\,d^\dagger_{c'\bar f}(k_{\bar q},\lambda_{\bar q})\,a^\dagger_a(k_g,\lambda_g)|0\rangle \tag{3.31}$$
has "a gluon in flight". Under that impact, the quark has changed its momentum (and spin), while the antiquark as a spectator is still in its initial state. At the second vertex, the gluon in flight is absorbed by the antiquark, the latter acquiring its final values $(k_{\bar q}',\lambda_{\bar q}')$. Since the gluon's longitudinal momentum is positive, the diagram allows only for $k_q'^+<k_q^+$. Rule 3 requires at each vertex the factors
$$\langle q,\bar q|V|q',\bar q,g\rangle=\frac{g}{(2\pi)^{3/2}}\sqrt{\frac{n_c^2-1}{2n_c}}\,\frac{[\bar u(k_q,\lambda_q)\gamma^\mu\epsilon_\mu(k_g,\lambda_g)u(k_q',\lambda_q')]}{\sqrt{2k_q^+}\sqrt{2k_g^+}\sqrt{2k_q'^+}}, \tag{3.32}$$
$$\langle q',\bar q,g|V|q',\bar q'\rangle=\frac{g}{(2\pi)^{3/2}}\sqrt{\frac{n_c^2-1}{2n_c}}\,\frac{[\bar u(k_{\bar q},\lambda_{\bar q})\gamma^\nu\epsilon^*_\nu(k_g,\lambda_g)u(k_{\bar q}',\lambda_{\bar q}')]}{\sqrt{2k_{\bar q}^+}\sqrt{2k_g^+}\sqrt{2k_{\bar q}'^+}}, \tag{3.33}$$
respectively. If one works with color neutral Fock states, all color structure reduces to an overall factor $C$, with $C^2=(n_c^2-1)/2n_c$. This factor is the only difference between QCD and QED for this example. For QCD $C^2=\tfrac43$ and for QED $C^2=1$. Rule 4 requires the energy denominator $1/\Delta E$. With the initial energy
$$\epsilon=\tilde P_+=\tfrac12\tilde P^-=(k_q+k_{\bar q})_+=\tfrac12(k_q+k_{\bar q})^-,$$
the energy denominator
$$2\Delta E=(k_q+k_{\bar q})^--(k_g+k_q'+k_{\bar q})^-=-\frac{Q^2}{k_g^+} \tag{3.34}$$
can be expressed in terms of the Feynman four-momentum transfers
$$Q^2=k_g^+(k_g+k_q'-k_q)^-,\qquad\bar Q^2=k_g^+(k_g+k_{\bar q}-k_{\bar q}')^-. \tag{3.35}$$
Rule 5 requires two Dirac-delta functions, one at each vertex, to account for conservation of three-momentum. One of them is removed by the requirement of rule 6, namely to integrate over all intermediate internal momenta, and the other remains in the final equation (3.43). The momentum of the exchanged gluon is thus fixed by the external legs of the graph. Rule 6 requires that one sums over the gluon helicities. The polarization sum gives
$$d_{\mu\nu}(k_g)\equiv\sum_{\lambda_g}\epsilon_\mu(k_g,\lambda_g)\,\epsilon^*_\nu(k_g,\lambda_g)=-g_{\mu\nu}+\frac{k_{g,\mu}\eta_\nu+k_{g,\nu}\eta_\mu}{k_g^\kappa\eta_\kappa}, \tag{3.36}$$
see Appendix B. The null vector $\eta^\mu$ has the components [299]
$$\eta^\mu=(\eta^+,\vec\eta_\perp,\eta^-)=(0,\vec0_\perp,2), \tag{3.37}$$
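and thus the properties $\eta^2\equiv\eta^\mu\eta_\mu=0$ and $k\eta=k^+$. In light-cone gauge, we find for the $\eta$-dependent terms
$$\Bigl(\sum_{\lambda_g}\langle q,\bar q|V|q',\bar q,g\rangle\langle q',\bar q,g|V|q',\bar q'\rangle\Bigr)_\eta=\frac{(gC)^2}{(2\pi)^3}\,\frac{1}{2k_g^+(k_g\eta)}\left(\frac{[\bar u(q)\gamma_\mu k_g^\mu u(q')]}{\sqrt{4k_q^+k_q'^+}}\frac{[\bar u(\bar q)\gamma_\nu\eta^\nu u(\bar q')]}{\sqrt{4k_{\bar q}^+k_{\bar q}'^+}}+\frac{[\bar u(q)\gamma_\mu\eta^\mu u(q')]}{\sqrt{4k_q^+k_q'^+}}\frac{[\bar u(\bar q)\gamma_\nu k_g^\nu u(\bar q')]}{\sqrt{4k_{\bar q}^+k_{\bar q}'^+}}\right). \tag{3.38}$$
Next, we introduce four-vectors like $l_q^\mu=(k_g+k_q'-k_q)^\mu$. Since its three-components vanish by momentum conservation, $l_q^\mu$ must be proportional to the null vector $\eta^\mu$. With Eq. (3.35) one gets
$$l_q^\mu=(k_g+k_q'-k_q)^\mu=\frac{Q^2}{2k_g^+}\eta^\mu,\qquad l_{\bar q}^\mu=(k_g+k_{\bar q}-k_{\bar q}')^\mu=\frac{\bar Q^2}{2k_g^+}\eta^\mu. \tag{3.39}$$
The well-known property of the Dirac spinors, $(k_q-k_q')^\mu[\bar u(k_q,\lambda_q)\gamma_\mu u(k_q',\lambda_q')]=0$, yields then
$$[\bar u(q)\gamma_\mu k_g^\mu u(q')]=[\bar u(q)\gamma_\mu\eta^\mu u(q')]\frac{Q^2}{2k_g^+}=[\bar u(q)\gamma^+u(q')]\frac{Q^2}{2k_g^+},$$
and Eq. (3.38) becomes
$$\Bigl\{\sum_{\lambda_g}\langle q,\bar q|V|q',\bar q,g\rangle\langle q',\bar q,g|V|q',\bar q'\rangle\Bigr\}_\eta=\frac{(gC)^2}{(2\pi)^3}\,\frac{Q^2}{2(k_g^+)^3}\,\frac{[\bar u(q)\gamma^+u(q')]}{\sqrt{4k_q^+k_q'^+}}\frac{[\bar u(\bar q)\gamma^+u(\bar q')]}{\sqrt{4k_{\bar q}^+k_{\bar q}'^+}}. \tag{3.40}$$
Including the $g_{\mu\nu}$ contribution, the diagram of second order in $V$ gives thus
$$V\frac{1}{\tilde P_+-H_0}V=\frac{g^2C^2}{(2\pi)^3}\,\frac{1}{Q^2}\,\frac{[\bar u(k_q,\lambda_q)\gamma^\mu u(k_q',\lambda_q')]}{\sqrt{4k_q^+k_q'^+}}\frac{[\bar u(k_{\bar q},\lambda_{\bar q})\gamma_\mu u(k_{\bar q}',\lambda_{\bar q}')]}{\sqrt{4k_{\bar q}^+k_{\bar q}'^+}}-\frac{g^2C^2}{(2\pi)^3}\,\frac{1}{(k_g^+)^2}\,\frac{[\bar u(k_q,\lambda_q)\gamma^+u(k_q',\lambda_q')]}{\sqrt{4k_q^+k_q'^+}}\frac{[\bar u(k_{\bar q},\lambda_{\bar q})\gamma^+u(k_{\bar q}',\lambda_{\bar q}')]}{\sqrt{4k_{\bar q}^+k_{\bar q}'^+}}, \tag{3.41}$$
up to the delta functions, and a step function $\Theta(k_q'^+\le k_q^+)$, which truncates the final momenta $k'^+$. Evaluating the second time-ordered diagram, one gets the same result up to the step function $\Theta(k_q'^+\ge k_q^+)$. Using
$$\Theta(k_q'^+\le k_q^+)+\Theta(k_q'^+\ge k_q^+)=1,$$
the final sum of all time-ordered diagrams to order $g^2$ is Eq. (3.41). One proceeds with rule 8, by including consecutively the instantaneous lines. In the present case, there is only one. From Fig. 5 we find
$$\langle q,\bar q|W_b|q',\bar q'\rangle=\frac{g^2C^2}{(2\pi)^3}\,\frac{1}{(k_q^+-k_q'^+)^2}\,\frac{[\bar u(k_q,\lambda_q)\gamma^+u(k_q',\lambda_q')]}{\sqrt{4k_q^+k_q'^+}}\frac{[\bar u(k_{\bar q},\lambda_{\bar q})\gamma^+u(k_{\bar q}',\lambda_{\bar q}')]}{\sqrt{4k_{\bar q}^+k_{\bar q}'^+}}. \tag{3.42}$$
Since $k_g^+=k_q^+-k_q'^+$, this term is equal and opposite to the second term of Eq. (3.41). Finally, adding up all contributions up to order $g^2$, the $q\bar q$-scattering amplitude becomes
$$W+V\frac{1}{\tilde P_+-H_0}V=\frac{(gC)^2}{(2\pi)^3}\,\frac{(-1)}{(k_q-k_q')^2}\,\frac{[\bar u(k_q,\lambda_q)\gamma^\mu u(k_q',\lambda_q')]\,[\bar u(k_{\bar q},\lambda_{\bar q})\gamma_\mu u(k_{\bar q}',\lambda_{\bar q}')]}{\sqrt{4k_q^+k_q'^+}\,\sqrt{4k_{\bar q}^+k_{\bar q}'^+}}\,\delta(P^+-P'^+)\,\delta^{(2)}(\vec P_\perp-\vec P_\perp'). \tag{3.43}$$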
The instantaneous diagram $W$ is thus cancelled exactly against a corresponding term in the diagram of second order in the vertex interaction $V$. Their sum gives the correct second-order result.

3.5. Example 2: Perturbative mass renormalization in QED (KS)

As an example for light-cone perturbation theory we follow here the work of Mustaki et al. [339,372] to calculate the second-order mass renormalization of the electron and the renormalization constants $Z_2$ and $Z_3$ in the KS convention. Since all particles are on-shell in light-cone time-ordered perturbation theory, the electron wavefunction renormalization $Z_2$ must be obtained separately from the mass renormalization $\delta m$. At order $e^2$, one finds three contributions. First, the perturbation expansion
$$T=W+V\frac{1}{p_+-H_0}V \tag{3.44}$$
yields a second-order contribution in $V$, as shown in Fig. 7a. The initial (or final) electron four-momentum is denoted by
$$p^\mu=\left(p^+,\vec p_\perp,\frac{p_\perp^2+m^2}{2p^+}\right). \tag{3.45}$$
Second and finally, one has first-order contributions from $W_f$ and $W_g$, corresponding to Fig. 7b and Fig. 7c. In the literature [354,422,66] these two-point vertices have been called "seagulls" or "self-induced inertias". One has to calculate the transition matrix amplitude $T_{pp}\,\delta_{s,\sigma}=\langle p,s|T|p,\sigma\rangle$ between a free electron state with momentum and spin $(p,s)$ and one with momentum and spin $(p,\sigma)$. The
Fig. 7. One loop self-energy correction for the electron. Time flows upward in these diagrams.
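normalization of states as in Eq. (3.3) was thus far
$$\langle p',s'|p,s\rangle=\delta(p^+-p'^+)\,\delta^2(\vec p_\perp-\vec p_\perp')\,\delta_{s,s'}, \tag{3.46}$$
but for an invariant normalization it is better to use $\langle\tilde p,s|\equiv\sqrt{2p^+}\,\langle p,s|$. Then one finds
$$2m\,\delta m\,\delta_{s\sigma}\equiv T_{\tilde p\tilde p}=2p^+T_{pp}\;\Longrightarrow\;\delta m\,\delta_{s\sigma}=\frac{p^+}{m}T_{pp}. \tag{3.47}$$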
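The other momenta appearing in Fig. 7a are
$$k=\left(k^+,\vec k_\perp,\frac{k_\perp^2}{2k^+}\right), \tag{3.48}$$
$$k'=\left(p^+-k^+,\;\vec p_\perp-\vec k_\perp,\;\frac{(\vec p_\perp-\vec k_\perp)^2+m^2}{2(p^+-k^+)}\right). \tag{3.49}$$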
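Using the above rules to calculate $T_{pp}$, one obtains for the contribution from Fig. 7a,
$$\delta m_a\,\delta_{s\sigma}=\frac{e^2}{m}\int\frac{d^2k_\perp}{(4\pi)^3}\int_0^{p^+}dk^+\sum_{\lambda,s'}\frac{[\bar u(p,\sigma)\not\epsilon^*(k,\lambda)u(k',s')]\,[\bar u(k',s')\not\epsilon(k,\lambda)u(p,s)]}{k^+(p^+-k^+)(p^--k^--k'^-)}. \tag{3.50}$$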
It can be shown that
$$[\bar u(p,\sigma)\gamma^\mu(\not k'+m)\gamma^\nu u(p,s)]\,d_{\mu\nu}(k)=4\delta_{s\sigma}\left[\left(\frac{2p^+}{k^+}+\frac{k^+}{p^+-k^+}\right)(p\cdot k)-m^2\right], \tag{3.51}$$
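which leads to the expression given below for $\delta m_a$. For Fig. 7b, one gets, using the rule for the instantaneous fermion,
$$\delta m_b=e^2\frac{p^+}{2m}\int\frac{d^2k_\perp}{(2\pi)^3}\int_0^{+\infty}dk^+\sum_\lambda\frac{\bar u(p,s)\not\epsilon^*(k,\lambda)\gamma^+\not\epsilon(k,\lambda)u(p,\sigma)}{2p^+\,2k^+\,2(p^+-k^+)}=e^2\frac{p^+}{2m}\int\frac{d^2k_\perp}{(2\pi)^3}\int_0^{+\infty}\frac{dk^+}{k^+(p^+-k^+)}. \tag{3.52}$$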
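For Fig. 7c one finds,
$$\delta m_c=\frac{e^2p^+}{2m}\int\frac{d^2k_\perp}{(2\pi)^3}\int_0^{+\infty}\frac{dk^+}{2k^+}\sum_s\frac{\bar u(p,s)\gamma^+}{\sqrt{2p^+}}\left[\frac{u(k,s)\bar u(k,s)}{2(p^+-k^+)^2}-\frac{v(k,s)\bar v(k,s)}{2(p^++k^+)^2}\right]\frac{\gamma^+u(p,\sigma)}{\sqrt{2p^+}}=\frac{e^2p^+}{2m}\int\frac{d^2k_\perp}{(2\pi)^3}\left[\int_0^{+\infty}\frac{dk^+}{(p^+-k^+)^2}-\int_0^{+\infty}\frac{dk^+}{(p^++k^+)^2}\right]. \tag{3.53}$$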
These integrals have potential singularities at $k^+=0$ and $k^+=p^+$, as well as an ultra-violet divergence in $\vec k_\perp$. To regularize them, we introduce in a first step small cut-offs $\alpha$ and $\beta$:
$$\alpha<k^+<p^+-\beta, \tag{3.54}$$
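and get rid of the pole at $k^+=p^+$ in $\delta m_b$ and $\delta m_c$ by a principal value prescription. One then obtains
$$\begin{aligned}\delta m_a&=\frac{e^2}{2m}\int\frac{d^2k_\perp}{(2\pi)^3}\left[\int_0^{p^+}\frac{dk^+}{k^+}\frac{m^2}{p\cdot k}-2\left(\frac{p^+}{\alpha}-1\right)-\ln\left(\frac{p^+}{\beta}\right)\right],\\ \delta m_b&=\frac{e^2}{2m}\int\frac{d^2k_\perp}{(2\pi)^3}\ln\left(\frac{p^+}{\alpha}\right),\\ \delta m_c&=\frac{e^2}{m}\int\frac{d^2k_\perp}{(2\pi)^3}\left(\frac{p^+}{\alpha}-1\right),\end{aligned} \tag{3.55}$$
where
$$p\cdot k=\frac{m^2(k^+)^2+(p^+)^2k_\perp^2}{2p^+k^+}. \tag{3.56}$$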
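Adding these three contributions yields
$$\delta m=\frac{e^2}{2m}\int\frac{d^2k_\perp}{(2\pi)^3}\left[\int_0^{p^+}\frac{dk^+}{k^+}\frac{m^2}{p\cdot k}+\ln\left(\frac{\beta}{\alpha}\right)\right]. \tag{3.57}$$
Note the cancelation of the most singular infrared divergence. To complete the calculation, we present two possible regularization procedures:

1. Transverse dimensional regularization. The dimension of transverse space, $d$, is continued from its physical value of 2 to $d=2-2\epsilon$ and all integrals are replaced by
$$\int d^2k_\perp\to(\mu^2)^\epsilon\int d^dk_\perp, \tag{3.58}$$
using $\epsilon=1-d/2$ as a small quantity. One thus gets
$$\begin{aligned}(\mu^2)^\epsilon\int d^dk_\perp\,(k_\perp^2)^\alpha&=0\quad\text{for }\alpha\ge0,\\ (\mu^2)^\epsilon\int d^dk_\perp\,\frac{1}{k_\perp^2+M^2}&=\left(\frac{\mu^2}{M^2}\right)^\epsilon\frac{\pi}{\epsilon},\\ (\mu^2)^\epsilon\int d^dk_\perp\,\frac{1}{(k_\perp^2+M^2)^2}&=\left(\frac{\mu^2}{M^2}\right)^\epsilon\frac{\pi}{M^2},\\ (\mu^2)^\epsilon\int d^dk_\perp\,\frac{k_\perp^2}{k_\perp^2+M^2}&=-\left(\frac{\mu^2}{M^2}\right)^\epsilon\frac{\pi M^2}{\epsilon}.\end{aligned} \tag{3.59}$$
In this method, $\alpha$ and $\beta$ in Eq. (3.57) are treated as constants. Dimensional regularization gives zero for the logarithmic term, and for the remainder
$$\delta m=\frac{e^2m}{(2\pi)^3}\int_0^1dx\int\frac{d^2k_\perp}{k_\perp^2+m^2x^2}, \tag{3.60}$$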
with $x\equiv(k^+/p^+)$, the above integral yields
$$\delta m=\frac{e^2m}{8\pi^2\epsilon} \tag{3.61}$$
as the final result.

2. Cut-offs. In this method [299,422,66], one restricts the momenta of any intermediate Fock state by means of the invariant condition
$$\tilde P^2=\sum_i\left(\frac{m^2+k_\perp^2}{x}\right)_i\le\Lambda^2, \tag{3.62}$$
where $\tilde P$ is the free total four-momentum of the intermediate state, and where $\Lambda$ is a large cut-off. Furthermore, one assumes that all transverse momenta are smaller than a certain cut-off $\Lambda_\perp$, with
$$\Lambda_\perp\ll\Lambda. \tag{3.63}$$
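In the case of Fig. 5a, Eq. (3.62) reads
$$\frac{k_\perp^2}{k^+}+\frac{(\vec p_\perp-\vec k_\perp)^2+m^2}{p^+-k^+}<\Lambda'\quad\text{with}\quad\Lambda'\equiv\frac{\Lambda^2+p_\perp^2}{p^+}. \tag{3.64}$$
Hence
$$\alpha=\frac{k_\perp^2}{\Lambda'},\qquad\beta=\frac{(\vec p_\perp-\vec k_\perp)^2+m^2}{\Lambda'}\;\Longrightarrow\;\frac{\beta}{\alpha}=\frac{(\vec p_\perp-\vec k_\perp)^2+m^2}{k_\perp^2}. \tag{3.65}$$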
In Ref. [339] it is shown that
$$\int d^2k_\perp\ln\left(\frac{\beta}{\alpha}\right)=\int d^2k_\perp\int_0^{p^+}\frac{dk^+}{p^+}\frac{m^2}{p\cdot k}. \tag{3.66}$$
Now
$$\delta m=\frac{e^2}{2m}\int\frac{d^2k_\perp}{(2\pi)^3}\int_0^{p^+}dk^+\,\frac{m^2}{p\cdot k}\left(\frac{1}{p^+}+\frac{1}{k^+}\right). \tag{3.67}$$
Upon integration, and dropping the finite part, one finds
$$\delta m=\frac{3e^2m}{16\pi^2}\ln\left(\frac{\Lambda_\perp^2}{m^2}\right), \tag{3.68}$$
which is of the same form as the standard result [39]. Since $\delta m$ is not by itself a measurable quantity, there is no contradiction in finding different results. Note that the seagulls are necessary for obtaining the conventional result. Finally, the wavefunction renormalization $Z_2$, at order $e^2$, is given by
$$1-Z_2={\sum_m}'\,\frac{|\langle p|V|m\rangle|^2}{(p_+-\tilde P_{+,m})^2}, \tag{3.69}$$
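where $\tilde P_{+,m}$ is the free total energy of the intermediate state $m$. Note that this expression is the same as one of the contributions to $\delta m$, except that here the denominator is squared. One has thus
$$\begin{aligned}(1-Z_2)\,\delta_{s\sigma}&=\frac{e^2}{p^+}\int\frac{d^2k_\perp}{(4\pi)^3}\int_0^{p^+}dk^+\,\frac{\bar u(p,\sigma)\gamma^\mu(\gamma^\alpha k'_\alpha+m)\gamma^\nu u(p,s)\,d_{\mu\nu}(k)}{k^+(p^+-k^+)(p^--k^--k'^-)^2}\\&=\frac{e^2\delta_{s\sigma}}{(2\pi)^3}\int_0^1dx\int\frac{d^2k_\perp}{k_\perp^2+m^2x^2}\left[\frac{2(1-x)k_\perp^2}{x(k_\perp^2+m^2x^2)}+x\right],\end{aligned} \tag{3.70}$$
which is the same result as that obtained by Kogut and Soper [274]. Naturally, this integral is both infrared and ultraviolet divergent. Using the above rules, one gets
$$Z_2(p^+)=1+\frac{e^2}{8\pi^2\epsilon}\left[\frac32-2\ln\left(\frac{p^+}{\alpha}\right)\right]+\frac{e^2}{(2\pi)^2}\ln\left(\frac{p^+}{\alpha}\right)\left[1-2\ln\left(\frac{\mu}{m}\right)-\ln\left(\frac{p^+}{\alpha}\right)\right], \tag{3.71}$$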
where $\mu^2$ is the scale introduced by dimensional regularization. Note that $Z_2$ has an unusual dependence on the longitudinal momentum, not found in the conventional instant form. But this may vary with the choice of regularization. A similar $p^+$ dependence was found for scalar QED by Thorn [426,427]. In Refs. [339,340] the full renormalization of front-form QED was carried out to the one-loop level. Electron and photon mass corrections were evaluated, as well as the wavefunction renormalization constants $Z_2$ and $Z_3$, and the vertex correction $Z_1$. One feature that distinguishes the front-form from the instant-form results is that the ultraviolet-divergent parts of $Z_1$ and $Z_2$ exhibit momentum dependence. For physical quantities such as the renormalized charge $e_R$, this momentum dependence cancels due to the Ward identity $Z_1(p^+,p'^+)=\sqrt{Z_2(p^+)Z_2(p'^+)}$. On the other hand, momentum-dependent renormalization constants imply non-local counter terms. Given that the tree-level Hamiltonian is non-local in $x^-$, it is actually not surprising to find counter terms exhibiting non-locality. As mentioned in Ref. [456], the power counting works differently in the front form than in the instant form. This is already indicated by the presence of four-point interactions in the Hamiltonian. The momentum dependence in $Z_1$ and $Z_2$ is another manifestation of unusual power counting laws. It will be interesting to apply them systematically in the case of QED. Power counting alone does not provide information about cancelation of divergences between diagrams. It is therefore important to gain more insight into the mechanism of cancelation in cases where one does expect this to occur, as in the calculation of the electron mass shift.

3.6. Example 3: The anomalous magnetic moment

The anomalous magnetic moment of the electron had been calculated in the front form by Brodsky et al. [53], using the method of alternating denominators.
Its calculation is a transparent example of calculating electro-magnetic form factors for both elementary and composite systems [41,56] as presented in Section 3.2 and for applying light-cone perturbation theory. Langnau and Burkardt [76,77,291,292] have calculated the anomalous magnetic moment at very strong coupling, by combining this method with discretized light-cone quantization, see below. We choose light-cone coordinates corresponding to the Drell frame, Eq. (3.14), and denote as in the preceding section the electron’s four-momentum and spin with (p, s). In line with Eq. (3.21), the Dirac and
Pauli form factors can be identified from the spin-conserving and spin-flip current matrix elements:
$$M^+_{\uparrow\uparrow}=\left\langle p+q,\uparrow\left|\frac{J^+(0)}{p^+}\right|p,\uparrow\right\rangle=2F_1(q^2), \tag{3.72}$$
$$M^+_{\uparrow\downarrow}=\left\langle p+q,\uparrow\left|\frac{J^+(0)}{p^+}\right|p,\downarrow\right\rangle=-2(q_1-iq_2)\frac{F_2(q^2)}{2M}, \tag{3.73}$$
where $\uparrow$ corresponds to positive spin projection $s_z=+\tfrac12$ along the $z$-axis. The mass of the composite system $M$ is of course the physical mass $m$ of the lepton. The interaction of the current $J^+(0)$ conserves the helicity of the struck constituent fermion, $(\bar u_{\lambda'}\gamma^+u_\lambda)/k^+=2\delta_{\lambda\lambda'}$. Thus, one has from Eqs. (3.23), (3.72) and (3.73)
$$F_1(q^2)=\tfrac12M^+_{\uparrow\uparrow}=\sum_je_j\int[d\mu_n]\,\psi^{*(n)}_{p+q,\uparrow}(x,\vec k_\perp,\lambda)\,\psi^{(n)}_{p,\uparrow}(x,\vec k_\perp,\lambda), \tag{3.74}$$
$$-\left(\frac{q_1-iq_2}{2M}\right)F_2(q^2)=\tfrac12M^+_{\uparrow\downarrow}=\sum_je_j\int[d\mu_n]\,\psi^{*(n)}_{p+q,\uparrow}(x,\vec k_\perp,\lambda)\,\psi^{(n)}_{p,\downarrow}(x,\vec k_\perp,\lambda). \tag{3.75}$$
In this notation, the summation over all contributing Fock states $(n)$ and helicities $(\lambda)$ is assumed, and the reference to single-particle states $i$ in the Fock states is suppressed. Momentum conservation is used to eliminate the explicit reference to the momentum of the struck lepton in Eq. (3.24). Finally, the lepton's wavefunction directed along the final direction $p+q$ in the current matrix element is denoted as
$$\psi^{(n)}_{p+q,s_z}(x,\vec k_\perp,\lambda)=\Psi_{n/e(p+q,s^2,s_z)}(x_i,\vec k_{\perp i},\lambda_i).$$
One recalls that $F_1(q^2)$ evaluated in the limit $q^2\to0$ with $F_1\to1$ is equivalent to wavefunction normalization
$$\int[d\mu]\,\psi^*_{p\uparrow}\psi_{p\uparrow}=1,\qquad\int[d\mu]\,\psi^*_{p\downarrow}\psi_{p\downarrow}=1. \tag{3.76}$$
The anomalous moment $a=F_2(0)/F_1(0)$ can be determined from the coefficient linear in $q_1-iq_2$ from $\psi^*_{p+q}$ in Eq. (3.75). Since according to Eq. (3.24)
$$\frac{\partial}{\partial\vec q_\perp}\psi^*_{p+q}\equiv-\sum_{i\ne j}x_i\frac{\partial}{\partial\vec k_{\perp i}}\psi^*_{p+q}, \tag{3.77}$$
one can, after integration by parts, write explicitly
$$\frac{a}{M}=-\sum_je_j\int[d\mu_n]\sum_{i\ne j}\psi^*_{p\uparrow}\,x_i\left(\frac{\partial}{\partial k_1}+i\frac{\partial}{\partial k_2}\right)_i\psi_{p\downarrow}. \tag{3.78}$$
The anomalous moment can thus be expressed in terms of a local matrix element at zero momentum transfer (see also Section 5 below). It should be emphasized that Eq. (3.78) is exact, valid for the anomalous element of actually any spin-$\tfrac12$ system.
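As an example for the above perturbative formalism, one can evaluate the electron's anomalous moment to order $\alpha$ [53]. In principle, one would have to account for all $x^+$-ordered diagrams as displayed in Fig. 8. But most of them do not contribute, because either the vacuum fluctuation graphs vanish in the front form or they vanish because of using the Drell frame. Only the diagram in the upper-left corner of Fig. 8 contributes the two electron-photon Fock states with spins $|\tfrac12\lambda_e,\lambda_\gamma\rangle=|-\tfrac12,1\rangle$ and $|\tfrac12,-1\rangle$:
$$\psi_{p\downarrow}=\frac{e/\sqrt x}{M^2-\dfrac{k_\perp^2+\lambda^2}{x}-\dfrac{k_\perp^2+\hat m^2}{1-x}}\times\begin{cases}\sqrt2\,\dfrac{(k_1-ik_2)}{x}&\text{for }|-\tfrac12\rangle\to|-\tfrac12,1\rangle,\\[2mm]\sqrt2\,\dfrac{M(1-x)-\hat m}{1-x}&\text{for }|-\tfrac12\rangle\to|\tfrac12,-1\rangle,\end{cases} \tag{3.79}$$
$$\psi^*_{p\uparrow}=\frac{e/\sqrt x}{M^2-\dfrac{k_\perp^2+\lambda^2}{x}-\dfrac{k_\perp^2+\hat m^2}{1-x}}\times\begin{cases}-\sqrt2\,\dfrac{M(1-x)-\hat m}{1-x}&\text{for }|-\tfrac12,1\rangle\to|\tfrac12\rangle,\\[2mm]-\sqrt2\,\dfrac{(k_1-ik_2)}{x}&\text{for }|\tfrac12,-1\rangle\to|\tfrac12\rangle.\end{cases} \tag{3.80}$$
The quantities to the left of the curly bracket in Eqs. (3.79) and (3.80) are the matrix elements of
$$\frac{\bar u(p-k,\lambda)}{(p^+-k^+)^{1/2}}\,\gamma^\mu\epsilon^*_\mu(k,\lambda'')\,\frac{u(p,\lambda')}{(p^+)^{1/2}},\qquad\frac{\bar u(p,\lambda)}{(p^+)^{1/2}}\,\gamma^\mu\epsilon_\mu(k,\lambda'')\,\frac{u(p-k,\lambda')}{(p^+-k^+)^{1/2}},$$
respectively, where $k^\mu\epsilon_\mu(k,\lambda)=0$ and in light-cone gauge $\epsilon^+(k,\lambda)=0$. In LB-convention one has $\vec\epsilon_\perp(\vec k_\perp,\lambda)\to\vec\epsilon_\perp(\vec k_\perp,\pm)=\pm(1/\sqrt2)(\hat x\pm i\hat y)$, see also Appendix B [41]. For the sake of generality, we let the intermediate lepton and boson have mass $\hat m$ and $\tilde m$, respectively. Substituting (3.79) and (3.80) into Eq. (3.78), one finds that only the $|-\tfrac12,1\rangle$ intermediate state actually contributes to $a$,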
Fig. 8. Time-ordered contributions to the electron's anomalous magnetic moment. In light-cone quantization with $q^+=0$, only the upper-left graph needs to be computed to obtain the Schwinger result.
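The Schwinger value quoted in the caption can be reproduced from the one-dimensional form of Eq. (3.81). The sketch below assumes that this form reads $a=(\alpha/\pi)\int_0^1dx\,M[\hat m-M(1-x)]x(1-x)/[\hat m^2x+\tilde m^2(1-x)-M^2x(1-x)]$, with $\hat m$ and $\tilde m$ the intermediate lepton and boson masses. The function name and the midpoint quadrature are illustrative choices.

```python
import math


def a_over_alpha(M, m_hat, m_tilde, n=20000):
    """a/alpha from the x-integral form of Eq. (3.81), evaluated by the midpoint rule."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        num = M * (m_hat - M * (1.0 - x)) * x * (1.0 - x)
        den = m_hat ** 2 * x + m_tilde ** 2 * (1.0 - x) - M ** 2 * x * (1.0 - x)
        total += num / den
    return h * total / math.pi


# QED limit: intermediate lepton mass equal to M, massless photon.
schwinger = a_over_alpha(M=1.0, m_hat=1.0, m_tilde=0.0)
```

In the QED limit the integrand collapses to $1-x$, so the integral is $1/2$ and $a=\alpha/2\pi$. A non-zero boson mass enlarges the denominator and lowers the value.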
since terms which involve differentiation of the denominator of $\psi_{p\downarrow}$ cancel. One thus gets [56]
$$a=4Me^2\int\frac{d^2k_\perp}{16\pi^3}\int_0^1dx\,\frac{[\hat m-(1-x)M]/x(1-x)}{[M^2-(k_\perp^2+\hat m^2)/(1-x)-(k_\perp^2+\tilde m^2)/x]^2}=\frac{\alpha}{\pi}\int_0^1dx\,\frac{M[\hat m-M(1-x)]\,x(1-x)}{\hat m^2x+\tilde m^2(1-x)-M^2x(1-x)}, \tag{3.81}$$
which, in the case of QED ($\hat m=M$, $\tilde m=0$) gives the Schwinger result $a=\alpha/2\pi$ [53]. As compared to Schwinger the above is an almost trivial calculation. The general result Eq. (3.78) can also be written in matrix form
$$\frac{a}{2M}=-\sum_je_j\int[dx\,d^2k_\perp]\,\psi^*\,\vec S_\perp\cdot\vec L_\perp\,\psi, \tag{3.82}$$
where $\vec S_\perp$ is the spin operator for the total system and $\vec L_\perp$ is the generator of "Galilean" transverse boosts [41] on the light cone, i.e. $\vec S_\perp\cdot\vec L_\perp=(S_+L_-+S_-L_+)/2$ where $S_\pm=(S_1\pm iS_2)$ is the spin-ladder operator and
$$L_\pm=\sum_{i\ne j}x_i\left(\frac{\partial}{\partial k_{1i}}\mp i\frac{\partial}{\partial k_{2i}}\right) \tag{3.83}$$
(summed over spectators) is the analog of the angular momentum operator $\vec r\times\vec p$. Eq. (3.78) can also be written simply as an expectation value in impact space. The results given in Eqs. (3.74), (3.75) and (3.78) may also be convenient for calculating the anomalous moments and form factors of hadrons in quantum chromodynamics directly from the quark and gluon wavefunctions $\psi(x,\vec k_\perp,\lambda)$. These wavefunctions can also be used to calculate the structure functions and distribution amplitudes which control large momentum transfer inclusive and exclusive processes. The charge radius of a composite system can also be written in the form of a local, forward matrix element:
$$\left.\frac{\partial F_1(q^2)}{\partial q^2}\right|_{q^2=0}=-\sum_je_j\int[dx\,d^2k_\perp]\,\psi^*_{p,\uparrow}\left(\sum_{i\ne j}x_i\frac{\partial}{\partial\vec k_{\perp i}}\right)^2\psi_{p,\uparrow}. \tag{3.84}$$
We thus find that, in general, any Fock state $|n\rangle$ which couples to both $\psi^*_\uparrow$ and $\psi_\downarrow$ will give a contribution to the anomalous moment. Notice that because of rotational symmetry in the $\hat x$- and $\hat y$-direction, the contribution to $a=F_2(0)$ in Eq. (3.78) always involves the form ($a,b=1,2,\ldots,n$)
$$M\sum_{i\ne j}\psi^*_\uparrow\,x_i\frac{\partial}{\partial\vec k_{\perp i}}\psi_\downarrow\sim\mu M\,\rho(\vec k^a_\perp\cdot\vec k^b_\perp), \tag{3.85}$$
compared to the integral (3.76) for wavefunction normalization which has terms of order
$$\psi^*_\uparrow\psi_\uparrow\sim\vec k^a_\perp\cdot\vec k^b_\perp\,\rho(\vec k^a_\perp\cdot\vec k^b_\perp),\quad\mu^2\rho(\vec k^a_\perp\cdot\vec k^b_\perp). \tag{3.86}$$
Here $\rho$ is a rotationally invariant function of the transverse momenta and $\mu$ is a constant with dimensions of mass. Thus, by order of magnitude
$$a=O\!\left(\frac{\mu M}{\mu^2+\langle k_\perp^2\rangle}\right) \tag{3.87}$$
summed and weighted over the Fock states. In the case of a renormalizable theory, the only parameters $\mu$ with the dimension of mass are fermion masses. In super-renormalizable theories, $\mu$ can be proportional to a coupling constant $g$ with dimension of mass. In the case where all the mass-scale parameters of the composite state are of the same order of magnitude, we obtain $a=O(MR)$ as in Eq. (3.13), where $R=\langle k_\perp^2\rangle^{-1/2}$ is the characteristic size of the Fock state. On the other hand, in theories where $\mu^2\ll\langle k_\perp^2\rangle$, we obtain the quadratic relation $a=O(\mu MR^2)$. Thus, composite models for leptons can avoid conflict with the high-precision QED measurements in several ways.
• There can be strong cancelations between the contributions of different Fock states.
• The parameter $\mu$ can be minimized. For example, in a renormalizable theory this can be accomplished by having the bound state of light fermions and heavy bosons. Since $\mu\ge M$, we then have $a\ge O(M^2R^2)$.
• If the parameter $\mu$ is of the same order as the other mass scales in the composite state, then we have a linear condition $a=O(MR)$.

3.7. (1+1)-dimensional: Schwinger model (LB)

Quantum electrodynamics in one-space and one-time dimension (QED$_{1+1}$) with massless charged fermions is known as the Schwinger model. It is one of the very few models of field theory which can be solved analytically [311,401,402,108-110]. The charged particles are confined because the Coulomb interaction in one-space dimension is linear in the relative distance, and there is only one physical particle, a massive neutral scalar particle with no self-interactions. The Fock-space content of the physical states depends crucially on the coordinate system and on the gauge. It is only in the front form that a simple constituent picture emerges [34,326,317]. It is the best example of the type of simplification that people hope will occur for QCD in physical space-time.
Recent studies of similar models with massive fermions, and of non-abelian theories with the fermions in the fundamental and adjoint representations, show however that many properties are unique to the Schwinger model [193,347]. The Schwinger model in Hamiltonian front-form field theory was studied first by Bergknoff [34]. The description here follows him closely, as well as Perry's recent lectures [367]. There is an extensive literature on this subject: DLCQ [137,460], lattice gauge theory [113], light-front integral equations [315], and light-front Tamm-Dancoff approaches [338] have used the model for testing the various methods. Bergknoff showed that the physical boson in the Schwinger model in light-cone gauge is a pure electron-positron state. This is an amazing result in a strong-coupling theory of massless bare particles, and it illustrates how a constituent picture may arise in QCD. The kinetic energy vanishes in the massless limit, and the potential energy is minimized by a wavefunction that is flat in momentum space. One might expect this, since a linear potential produces a state that is as localized as possible in position space. Consider first the massive Schwinger model. The finite fermion mass $m$ is a parameter to be set to zero later. The Lagrangian for the theory takes the same form as the QED Lagrangian, Eq. (2.8). Again one works in the light-cone gauge $A^+=0$, and uses the same projection operators $\Lambda_\pm$ as in
Section 2. The analogue of Eq. (2.74) becomes now simply
$$i\partial^-\psi_+=-im\psi_-+eA^-\psi_+,\qquad i\partial^+\psi_-=im\psi_+. \tag{3.88}$$
The equation for $\psi_+$ involves the light-front time derivative $\partial^-$, so $\psi_+$ is a dynamical degree of freedom that must be quantized. On the other hand, the equation for $\psi_-$ involves only spatial derivatives, so $\psi_-$ is a constrained degree of freedom that should be eliminated in favor of $\psi_+$. Formally,
$$\psi_-=\frac{m}{\partial^+}\psi_+. \tag{3.89}$$
It is necessary to specify boundary conditions in order to invert the operator $\partial^+$. If we had not chosen a finite mass for the fermions then both $\psi_+$ and $\psi_-$ would be independent degrees of freedom and we would have to specify initial conditions for both. Furthermore, in the front form, it has only been possible to calculate the condensate $\langle0|\bar\psi\psi|0\rangle$ for the Schwinger model by identifying it as the coefficient of the linear term in the mass expansion of the matrix element of the currents [34]. Due to the gauge, one component is fixed to $A^+=0$, but the other component $A^-$ of the gauge field is also a constrained degree of freedom. It can be formally eliminated by the light-cone analogue of Gauss's law:
$$A^-=-\frac{4e}{(\partial^+)^2}\psi^\dagger_+\psi_+. \tag{3.90}$$
One is left with a single dynamical degree of freedom, $\psi_+$, which is canonically quantized at $x^+=0$,
$$\{\psi_+(x^-),\psi^\dagger_+(y^-)\}=\Lambda_+\delta(x^--y^-), \tag{3.91}$$
similar to what was done in QED. The field operator at $x^+=0$, expanded in terms of the free particle creation and annihilation operators, takes the very simple form
$$\psi_+(x^-)=\int_{k^+>0}\frac{dk^+}{4\pi}\left[b_ke^{-ik\cdot x}+d^\dagger_ke^{ik\cdot x}\right]\quad\text{with}\quad\{d_k,d^\dagger_p\}=\{b_k,b^\dagger_p\}=4\pi\delta(k^+-p^+). \tag{3.92}$$
The canonical Hamiltonian $H=P_+=\tfrac12P^-$ is divided into three parts,
$$H=H_0+H_0'+V'. \tag{3.93}$$
These Fock-space operators are obtained by inserting the free fields in Eq. (3.92) into the canonical expressions in Eq. (2.89). The free part of the Hamiltonian becomes
$$H_0=\int_{k>0}\frac{dk}{8\pi}\left(\frac{m^2}{k}\right)(b^\dagger_kb_k+d^\dagger_kd_k). \tag{3.94}$$
$H_0'$ is the one-body operator which is obtained by normal ordering the interaction, i.e.
$$H_0'=\frac{e^2}{4\pi}\int_{k>0}\frac{dk}{4\pi}\int_{p>0}dp\left(\frac{1}{(k-p)^2}-\frac{1}{(k+p)^2}\right)(b^\dagger_kb_k+d^\dagger_kd_k). \tag{3.95}$$
The divergent momentum integral is regulated by the momentum cut-off, $|k-p|>\epsilon$. One finds
$$H_0'=\frac{e^2}{2\pi}\int\frac{dk}{4\pi}\left(\frac1\epsilon-\frac1k+O(\epsilon)\right)(b^\dagger_kb_k+d^\dagger_kd_k). \tag{3.96}$$
The normal-ordered interaction is
$$V'=4\pi e^2\int\frac{dk_1}{4\pi}\cdots\frac{dk_4}{4\pi}\,\delta(k_1+k_2-k_3-k_4)\left\{-\frac{2}{(k_1-k_3)^2}b^\dagger_1d^\dagger_2d_4b_3+\frac{2}{(k_1+k_2)^2}b^\dagger_1d^\dagger_2d_3b_4-\frac{1}{(k_1-k_3)^2}\left(b^\dagger_1b^\dagger_2b_3b_4+d^\dagger_1d^\dagger_2d_3d_4\right)+\cdots\right\}. \tag{3.97}$$
The interactions that involve the creation or annihilation of electron-positron pairs are not displayed. The first term in $V'$ is the electron-positron interaction. The longitudinal momentum cut-off requires $|k_1-k_3|>\epsilon$ and leads to the potential
$$v(x^-)=4q_1q_2\int_{-\infty}^{\infty}\frac{dk}{4\pi}\,\frac{\theta(|k|-\epsilon)}{k^2}\,e^{-ikx^-/2}=q_1q_2\left[\frac{2}{\pi\epsilon}-|x^-|+O(\epsilon)\right]. \tag{3.98}$$
This potential contains a linear Coulomb potential that we expect in two dimensions, but it also contains a divergent constant, being negative for unlike charges and positive for like charges. In charge neutral states the infinite constant in $V'$ is exactly canceled by the divergent "mass" term in $H_0'$. This Hamiltonian assigns an infinite energy to states with net charge, and a finite energy, as $\epsilon\to0$, to charge zero states. This does not imply that charged particles are confined, but that the linear potential prevents charged particles from moving to arbitrarily large separation except as charge-neutral states. One should emphasize that even though the interaction between charges is long-ranged, there are no van der Waals forces in 1+1 dimensions. It is a simple geometrical calculation to show that all long-range forces between two neutral states cancel exactly. This does not happen in higher dimensions, and if we use long-range two-body operators to implement confinement we must also find many-body operators that cancel the strong long-range van der Waals interactions. Given the complete Hamiltonian in normal-ordered form we can study bound states. A powerful tool for getting started is the variational wavefunction. In this case, one can begin with a state that contains a single electron-positron pair
P
H
C
D
P
Pdp /(p)bsds D0T . (3.99) p P~p 4p 0 The normalization of this state is SW(P@)DW(P)T"4pPd(P@!P). The expectation value of the one-body operators in the Hamiltonian is DW(P)T"
$$\langle\Psi|H_0+H'_0|\Psi\rangle=\frac{1}{2P}\int\frac{dk}{4\pi}\left[\frac{m^2-e^2/\pi}{k}+\frac{m^2-e^2/\pi}{P-k}+\frac{2e^2}{\pi\epsilon}\right]|\phi(k)|^2, \tag{3.100}$$
and the expectation value of the normal-ordered interaction is
$$\langle\Psi|V'|\Psi\rangle=-\frac{e^2}{P}\int'\frac{dk_1}{4\pi}\frac{dk_2}{4\pi}\left[\frac{1}{(k_1-k_2)^2}+\frac{1}{P^2}\right]\phi^*(k_1)\phi(k_2). \tag{3.101}$$

The prime on the last integral indicates that the range of integration in which $|k_1-k_2|<\epsilon$ must be removed. By expanding the integrand about $k_1=k_2$, one can easily confirm that the $1/\epsilon$ divergences cancel.
The easiest case to study is the massless Schwinger model. With $m=0$, the energy is minimized when

$$\phi(k)=\sqrt{4\pi}. \tag{3.102}$$

The invariant mass squared, $M^2=2PH$, then becomes

$$M^2=e^2/\pi. \tag{3.103}$$
This type of simple analysis can be used to show that this electron-positron state is actually the exact ground state of the theory with momentum $P$, and that bound states do not interact with one another [367]. It is intriguing that for massless fermions, the massive bound state is a simple bound state of an electron and a positron when the theory is formulated in the front form using the light-cone gauge. This is not true in other gauges and coordinate systems. This happens because the charges screen one another perfectly, and this may be the way a constituent picture emerges in QCD. On the other hand, there are many differences between two and four dimensions. In two dimensions, for example, the coupling has the dimension of mass, making it natural for the bound-state mass to be proportional to the coupling in the massless limit. In four dimensions, by contrast, the coupling is dimensionless, and the bound states in a four-dimensional massless theory must acquire a mass through dimensional transmutation. A simple model of how this might happen is discussed in the renormalization of the Yukawa model and in some simple models in the section on renormalization.

3.8. (3+1)-dimensional: Yukawa model

Our ultimate aim is to study the bound-state problem in QCD. However, light-front QCD is plagued with divergences arising from both small longitudinal momentum and large transverse momentum. To gain experience with the novel renormalization programs that this requires, it is useful to study a simpler model. The two-fermion bound-state problem in the 3+1 light-front Yukawa model has many of the non-perturbative problems of QCD while still being tractable in the Tamm-Dancoff approximation. This section follows closely the work in Refs. [182,373,374].
The problems that were encountered in this calculation are typical of any (3+1)-dimensional non-perturbative calculation and laid the basis for Wilson's current light-front program [451,452,456,364–366], which will be briefly discussed in the section on renormalization. The light-front Tamm-Dancoff method (LFTD) is a Tamm-Dancoff truncation of the Fock space in light-front quantum field theory and was proposed [363,422] to overcome some of the problems in the equal-time Tamm-Dancoff method [68]. In this approach, one introduces a longitudinal momentum cut-off $\epsilon$ to remove all the troublesome vacuum diagrams. The bare vacuum state is then an eigenstate of the Hamiltonian. One can also introduce a transverse momentum cut-off $\Lambda$ to regulate ultraviolet divergences. Of course, the particle truncation and momentum cut-offs spoil Lorentz symmetries. In a properly renormalized theory, one has to remove the cut-off dependence from the observables and recover the lost Lorentz symmetries. One has avoided the original vacuum problem, but now the construction of a properly renormalized Hamiltonian is a nontrivial problem. In particular, the light-front Tamm-Dancoff approximation breaks rotational invariance with respect to the two transverse directions. This is visible in the spectrum which does not exhibit
the degeneracy associated with the total angular momentum multiplets. It is seen that renormalization has sufficient flexibility to restore the degeneracy. Retaining only two-fermion and two-fermion, one-boson states, one obtains a two-fermion bound-state problem in the lowest-order Tamm-Dancoff truncation. This is accomplished by eliminating the three-body sector algebraically, which leaves an integral equation for the two-body state. This bound-state equation has both divergent self-energy and divergent one-boson exchange contributions. In the renormalization of the one-boson exchange divergences the self-energy corrections are ignored. Related work can be found in Refs. [176,279,459]. Different counter terms are introduced to renormalize the divergences associated with one-boson exchange. The basis for these counter terms is easily understood, and uses a momentum-space slicing called the high-low analysis. It was introduced by Wilson [454] and is discussed in detail for a simple one-dimensional model in the section on renormalization. To remove the self-energy divergences one first introduces a sector-dependent mass counter term which removes the quadratic divergence. The remaining logarithmic divergence is removed by a redefinition of the coupling constant. Here one faces the well-known problem of triviality: for a fixed renormalized coupling the bare coupling becomes imaginary beyond a certain ultraviolet cut-off. This was probably seen first in the Lee model [293] and then in meson-nucleon scattering using the equal-time Tamm-Dancoff method [114]. The canonical light-front Hamiltonian for the (3+1)-dimensional Yukawa model is given by
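$$P^-=\frac12\int dx^-\,d^2x_\perp\left[2i\psi_-^\dagger\partial^+\psi_-+m_B^2\phi^2+\partial_\perp\phi\cdot\partial_\perp\phi\right]. \tag{3.104}$$

The equations of motion are used to express $\psi_-$ in terms of $\psi_+$, i.e.

$$\psi_-=\frac{1}{i\partial^+}\left[i\alpha_\perp\cdot\partial_\perp+\beta(m_F+g\phi)\right]\psi_+. \tag{3.105}$$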
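For simplicity, the two fermions are taken to be of different flavors, one denoted by $b_\sigma$ and the other by $B_\sigma$. We divide the Hamiltonian $P^-$ into $P^-_{\rm free}$ and $P^-_{\rm int}$, where

$$P^-_{\rm free}=\int[d^3k]\,\frac{m_B^2+k_\perp^2}{k^+}\,a^\dagger(k)a(k)+\sum_\sigma\int[d^3k]\,\frac{m_F^2+k_\perp^2}{k^+}\left[b_\sigma^\dagger(k)b_\sigma(k)+B_\sigma^\dagger(k)B_\sigma(k)\right], \tag{3.106}$$

$$\begin{aligned}P^-_{\rm int}=g\sum_{\sigma_1,\sigma_2}\int[d^3k_1]\int[d^3k_2]\int[d^3k_3]\;&2(2\pi)^3\delta^3(k_1-k_2-k_3)\\ \times\Big[&\big(b^\dagger_{\sigma_1}(k_1)b_{\sigma_2}(k_2)+B^\dagger_{\sigma_1}(k_1)B_{\sigma_2}(k_2)\big)\,a(k_3)\,\bar u_{\sigma_1}(k_1)u_{\sigma_2}(k_2)\\ &+\big(b^\dagger_{\sigma_2}(k_2)b_{\sigma_1}(k_1)+B^\dagger_{\sigma_2}(k_2)B_{\sigma_1}(k_1)\big)\,a^\dagger(k_3)\,\bar u_{\sigma_2}(k_2)u_{\sigma_1}(k_1)\Big].\end{aligned} \tag{3.107}$$

Note that the instantaneous interaction was dropped from $P^-_{\rm int}$ for simplicity. The fermion-number-2 state that is an eigenstate of $P^-$ with momentum $P$ and helicity $\sigma$ is denoted as $|\Psi(P,\sigma)\rangle$. The wavefunction is normalized in the truncated Fock space, with $\langle\Psi(P',\sigma')|\Psi(P,\sigma)\rangle=2(2\pi)^3P^+\delta^3(P-P')\,\delta_{\sigma\sigma'}$.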
In the lowest-order Tamm—Dancoff truncation one has
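$$\begin{aligned}|\Psi(P,\sigma)\rangle=&\sum_{\sigma_1\sigma_2}\int[d^3k_1]\int[d^3k_2]\,\Phi_2(P,\sigma|k_1\sigma_1,k_2\sigma_2)\,b^\dagger_{\sigma_1}(k_1)B^\dagger_{\sigma_2}(k_2)|0\rangle\\ &+\sum_{\sigma_1\sigma_2}\int[d^3k_1]\int[d^3k_2]\int[d^3k_3]\,\Phi_3(P,\sigma|k_1\sigma_1,k_2\sigma_2,k_3)\,b^\dagger_{\sigma_1}(k_1)B^\dagger_{\sigma_2}(k_2)a^\dagger(k_3)|0\rangle,\end{aligned}$$

where $\Phi_2$ is the two-particle and $\Phi_3$ the three-particle amplitude, and where $|0\rangle$ is the vacuum state. For notational convenience, one introduces the amplitudes $\Psi_2$ and $\Psi_3$ by

$$\Phi_2(P,\sigma|k_1\sigma_1,k_2\sigma_2)=2(2\pi)^3P^+\delta^3(P-k_1-k_2)\sqrt{x_1x_2}\;\Psi_2^{\sigma_1\sigma_2}(\vec\kappa_1,x_1;\vec\kappa_2,x_2), \tag{3.108}$$

$$\Phi_3(P,\sigma|k_1\sigma_1,k_2\sigma_2,k_3)=2(2\pi)^3P^+\delta^3\Big(P-\sum_ik_i\Big)\sqrt{x_1x_2x_3}\;\Psi_3^{\sigma_1\sigma_2}(\vec\kappa_1,x_1;\vec\kappa_2,x_2;\vec\kappa_3,x_3). \tag{3.109}$$

As usual, the intrinsic variables are $x_i$ and $\vec\kappa_i=\vec\kappa_{\perp i}$:

$$k_i^\mu=\left(x_iP^+,\ \vec\kappa_{\perp i},\ \frac{\kappa_{\perp i}^2+m^2}{x_iP^+}\right),$$

with $\sum_ix_i=1$ and $\sum_i\vec\kappa_{\perp i}=0$. By projecting the eigenvalue equation

$$(P^+P^--P_\perp^2)|\Psi\rangle=M^2|\Psi\rangle \tag{3.110}$$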
onto a set of free Fock states, one obtains two coupled integral equations:
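$$\begin{aligned}\left[M^2-\frac{m_F^2+(\vec\kappa_1)^2}{x_1}-\frac{m_F^2+(\vec\kappa_2)^2}{x_2}\right]\Psi_2^{\sigma_1\sigma_2}(\vec\kappa_1,x_1)=&\ \frac{g}{2(2\pi)^3}\sum_{s_1}\int\frac{dy_1\,d^2q_1}{\sqrt{(x_1-y_1)x_1y_1}}\,\Psi_3^{s_1,\sigma_2}(\vec q_1,y_1;\vec\kappa_2,x_2)\,\bar u_{\sigma_1}(\vec\kappa_1,x_1)u_{s_1}(\vec q_1,y_1)\\ &+\frac{g}{2(2\pi)^3}\sum_{s_2}\int\frac{dy_2\,d^2q_2}{\sqrt{(x_2-y_2)x_2y_2}}\,\Psi_3^{\sigma_1s_2}(\vec\kappa_1,x_1;\vec q_2,y_2)\,\bar u_{\sigma_2}(\vec\kappa_2,x_2)u_{s_2}(\vec q_2,y_2)\end{aligned} \tag{3.111}$$

and

$$\begin{aligned}\left[M^2-\frac{m_F^2+(\vec\kappa_1)^2}{x_1}-\frac{m_F^2+(\vec\kappa_2)^2}{x_2}-\frac{m_B^2+(\vec\kappa_1+\vec\kappa_2)^2}{x_3}\right]\Psi_3^{\sigma_1\sigma_2}(\vec\kappa_1,x_1;\vec\kappa_2,x_2)=&\ g\sum_{s_1}\frac{\Psi_2^{s_1\sigma_2}(-\vec\kappa_2,x_1+x_3)}{\sqrt{x_3x_1(x_1+x_3)}}\,\bar u_{\sigma_1}(\vec\kappa_1,x_1)u_{s_1}(-\vec\kappa_2,x_1+x_3)\\ &+g\sum_{s_2}\frac{\Psi_2^{\sigma_1s_2}(\vec\kappa_1,x_1)}{\sqrt{x_3x_2(x_3+x_2)}}\,\bar u_{\sigma_2}(\vec\kappa_2,x_2)u_{s_2}(-\vec\kappa_1,x_2+x_3).\end{aligned} \tag{3.112}$$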
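The free energies subtracted from $M^2$ in the brackets of Eqs. (3.111) and (3.112) are sums of $(m^2+\kappa_\perp^2)/x$ over the constituents. The following minimal Python sketch checks numerically that this sum equals $P^+P^--P_\perp^2$ when the light-front momenta are built from the intrinsic variables defined before Eq. (3.110). The function names and the numerical values are illustrative assumptions, not taken from the original work, and the frame $P_\perp=0$ is assumed.

```python
# Sketch: free invariant mass squared of a Fock state from intrinsic variables.
# Assumption: frame with vanishing total transverse momentum (P_perp = 0),
# so that k_i = (x_i P+, kappa_i, (kappa_i^2 + m^2) / (x_i P+)).

def light_front_momenta(xs, kappas, masses, p_plus):
    """Return (k_plus, (k_x, k_y), k_minus) for each constituent."""
    momenta = []
    for x, (kx, ky), m in zip(xs, kappas, masses):
        k_plus = x * p_plus
        k_minus = (kx * kx + ky * ky + m * m) / k_plus
        momenta.append((k_plus, (kx, ky), k_minus))
    return momenta

def invariant_mass_squared(momenta):
    """Evaluate P+ P- - P_perp^2 for the summed momenta."""
    p_plus = sum(k[0] for k in momenta)
    p_x = sum(k[1][0] for k in momenta)
    p_y = sum(k[1][1] for k in momenta)
    p_minus = sum(k[2] for k in momenta)
    return p_plus * p_minus - (p_x * p_x + p_y * p_y)

# Hypothetical two-fermion configuration in arbitrary units.
m_f = 1.0
xs = [0.3, 0.7]                      # momentum fractions, sum to 1
kappas = [(0.4, -0.2), (-0.4, 0.2)]  # intrinsic transverse momenta, sum to 0

state = light_front_momenta(xs, kappas, [m_f, m_f], p_plus=5.0)
m2_from_momenta = invariant_mass_squared(state)
m2_from_intrinsic = sum(
    (kx * kx + ky * ky + m_f * m_f) / x for x, (kx, ky) in zip(xs, kappas)
)
print(m2_from_momenta, m2_from_intrinsic)
```

The result does not depend on the value chosen for `p_plus`, which reflects the boost invariance of the intrinsic variables.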
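After eliminating $\Psi_3$ one ends up with an integral equation for $\Psi_2$ and the eigenvalue $M^2$:

$$M^2\Psi_2^{\sigma_1\sigma_2}(\vec\kappa,x)=\left(\frac{m_F^2+\kappa^2}{x(1-x)}+[SE]\right)\Psi_2^{\sigma_1\sigma_2}(\vec\kappa,x)+\frac{\alpha}{4\pi^2}\sum_{s_1,s_2}\int dy\,d^2q\;K(\vec\kappa,x;\vec q,y;\omega)_{\sigma_1\sigma_2;s_1s_2}\,\Psi_2^{s_1s_2}(\vec q,y)+\text{counterterms}, \tag{3.113}$$

where $\alpha=g^2/4\pi$ is the fine structure constant. The absorption of the boson on the same fermion gives rise to the self-energy term $[SE]$; its absorption by the other fermion generates an effective interaction, the boson-exchange kernel $K$,

$$K(\vec\kappa,x;\vec q,y;\omega)_{\sigma_1\sigma_2;s_1s_2}=\frac{[\bar u(\vec\kappa,x;\sigma_1)u(\vec q,y;s_1)]\,[\bar u(-\vec\kappa,1-x;\sigma_2)u(-\vec q,1-y;s_2)]}{\big(a+2(\vec\kappa\cdot\vec q)\big)\sqrt{x(1-x)y(1-y)}}, \tag{3.114}$$

with

$$a=|x-y|\left\{\omega-\frac12\left[\frac{m_F^2+k^2}{x(1-x)}+\frac{m_F^2+q^2}{y(1-y)}\right]\right\}-m_B^2+2m_F^2-\frac{m_F^2+k^2}{2}\left[\frac{y}{x}+\frac{1-y}{1-x}\right]-\frac{m_F^2+q^2}{2}\left[\frac{x}{y}+\frac{1-x}{1-y}\right], \tag{3.115}$$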
with $k=|\vec\kappa|$ and $\omega\equiv M^2$. Possible counter terms will be discussed below. Since $\sigma=\uparrow,\downarrow$, one thus faces $4\times4=16$ coupled integral equations in the three variables $x$ and $\vec\kappa_\perp$. But the problem is simplified considerably by exploiting the rotational symmetry around the $z$-axis. Let us demonstrate that briefly. By Fourier transforming over the angle $\phi$, one first introduces states $\Phi$ with good total spin projection $S_z=\sigma_1+\sigma_2$ and angular index $m$,

$$\Psi_2^{\sigma_1\sigma_2}(\vec\kappa,x)=\sum_m\frac{e^{im\phi}}{\sqrt{2\pi}}\,\Phi_{\sigma_1\sigma_2}(k,x;m), \tag{3.116}$$

and then uses them to redefine the kernel:

$$V(k,x,m;q,y,m';M^2)_{\sigma_1\sigma_2;s_1s_2}=\int\frac{d\phi\,d\phi'}{2\pi}\,e^{-im\phi}\,e^{im'\phi'}\,K(k,\phi,x;q,\phi',y;M^2)_{\sigma_1\sigma_2;s_1s_2}. \tag{3.117}$$
The $\phi$-integrals can be done analytically. Now, recall that neither $S_z$ nor $L_z$ is conserved; only $J_z=S_z+L_z$ is a good quantum number. In the two-particle sector of spin-$\frac12$ particles the spin projections are limited to $|S_z|\le1$, and thus, for given $J_z$, one has to consider only the four amplitudes $\Phi_{\uparrow\uparrow}(k,x;J_z-1)$, $\Phi_{\uparrow\downarrow}(k,x;J_z)$, $\Phi_{\downarrow\uparrow}(k,x;J_z)$, and $\Phi_{\downarrow\downarrow}(k,x;J_z+1)$. Rotational symmetry thus allows one to reduce the number of coupled equations from 16 to 4, and the number of integration variables from 3 to 2. Finally, one can always add and subtract the states, introducing

$$\Phi_t^\pm(k,x)=\frac{1}{\sqrt2}\big(\Phi_{\uparrow\uparrow}(k,x;J_z-1)\pm\Phi_{\downarrow\downarrow}(k,x;J_z+1)\big),\qquad\Phi_s^\pm(k,x)=\frac{1}{\sqrt2}\big(\Phi_{\uparrow\downarrow}(k,x;J_z)\pm\Phi_{\downarrow\uparrow}(k,x;J_z)\big). \tag{3.118}$$
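The counting behind this reduction can be made explicit with a few lines of Python. The sketch below simply enumerates the four helicity combinations of two spin-$\frac12$ fermions and assigns to each the orbital projection $L_z=J_z-S_z$ required by $J_z$ conservation. The function name and the string labels are illustrative choices.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)
ARROW = {HALF: "up", -HALF: "down"}

def amplitudes_for(j_z):
    """List (sigma1, sigma2, L_z) with S_z + L_z = J_z for two spin-1/2 fermions."""
    result = []
    for s1, s2 in product((HALF, -HALF), repeat=2):
        s_z = s1 + s2        # total spin projection, |S_z| <= 1
        l_z = j_z - s_z      # orbital projection fixed by J_z conservation
        result.append((ARROW[s1], ARROW[s2], int(l_z)))
    return result

for entry in amplitudes_for(0):
    print(entry)
```

For any fixed $J_z$ the list has exactly four entries, matching the four amplitudes $\Phi_{\uparrow\uparrow}(k,x;J_z-1)$, $\Phi_{\uparrow\downarrow}(k,x;J_z)$, $\Phi_{\downarrow\uparrow}(k,x;J_z)$ and $\Phi_{\downarrow\downarrow}(k,x;J_z+1)$.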
The integral equations couple the sets $(t^-,s^+)$ and $(t^+,s^-)$. For $J_z=0$, the “singlet” and the “triplet” states uncouple completely, and one has to solve only two pairs of two coupled integral equations. In a way, these reductions are quite natural and straightforward, and have been applied independently also by Krautgärtner et al. [279] and most recently by Trittmann and Pauli [429]. Next, let us discuss the structure of the integrand in Eq. (3.113) and analyze possible divergences. Restrict first to $J_z=0$, and consider $[\bar u(x,\vec\kappa;\sigma_1)u(y,\vec q;s_1)]$ for large $q$, taken from the tables in Section 4. They are such that the kernel $K$ becomes independent of $|\vec q|$ in the limit $|\vec q|\to\infty$. Thus, unless $\Phi$ vanishes faster than $|\vec q|^{-2}$, the $q$-integral potentially diverges. In fact, introducing an ultraviolet cut-off $\Lambda$ to regularize the $|\vec q|$-dependence, the integrals involving the singlet wavefunctions $\Phi_s^\pm$ diverge logarithmically with $\Lambda$. In the $J_z=1$ sector one must solve a system of four coupled integral equations. One finds that the kernel $V_{\uparrow\uparrow,\downarrow\downarrow}$ approaches the same limit $-f(x,y)$ as $q$ becomes large relative to $k$. All other kernels fall off faster with $q$. For higher values of $J_z$, the integrand converges since the wavefunctions fall off faster than $|\vec q|^{-2}$. Counter terms are therefore needed only for $J_z=0$ and $J_z=\pm1$. These boson-exchange counter terms have no analogue in equal-time perturbation theory, and will be discussed below. These integral equations are solved numerically, using Gauss-Legendre quadratures to evaluate the $q$ and $y$ integrals. Note that the eigenvalue $M^2$ appears on both the left- and right-hand side of the integral equation. One handles this by choosing some “starting point” value $\omega$ on the r.h.s. By solving the resulting matrix eigenvalue problem one obtains the eigenvalue $M^2(\omega)$. Taking that as the new starting point value, one iterates the procedure until $M^2(\omega)=\omega$ is numerically fulfilled sufficiently well.
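The self-consistent treatment of the eigenvalue can be illustrated with a deliberately simple toy problem. In the Python sketch below, a made-up $2\times2$ symmetric matrix with an $\omega$-dependent entry stands in for the discretized integral equation; it is not the Yukawa kernel, and the numbers are arbitrary. Only the iteration $\omega\to M^2(\omega)$ is meant to mirror the procedure described above.

```python
import math

def lowest_eigenvalue(omega):
    """Lowest eigenvalue of a toy symmetric matrix [[a, c], [c, b(omega)]].

    The omega-dependence of b mimics a kernel that depends on the eigenvalue;
    all numbers are illustrative assumptions.
    """
    a, c = 1.0, 0.3
    b = 2.0 + 0.1 * omega
    mean = 0.5 * (a + b)
    radius = math.hypot(0.5 * (a - b), c)
    return mean - radius

def solve_self_consistently(omega_start, tol=1e-12, max_iter=200):
    """Iterate omega -> M^2(omega) until M^2(omega) = omega within tol."""
    omega = omega_start
    for iteration in range(1, max_iter + 1):
        m2 = lowest_eigenvalue(omega)
        if abs(m2 - omega) < tol:
            return m2, iteration
        omega = m2
    raise RuntimeError("self-consistency iteration did not converge")

m2_fixed, n_steps = solve_self_consistently(omega_start=0.0)
print(m2_fixed, n_steps)
```

In this toy case the map is a strong contraction, so a handful of iterations suffices. For a realistic kernel the convergence has to be monitored, and a damped update may be needed.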
For the parameter values $1\le\alpha\le2$ and $2\le m_F/m_B\le4$ one finds only two stable bound states, one for each $|J_z|\le1$. In the corresponding wavefunctions, one observes a dominance of the spin-zero configuration $S_z=0$. The admixture from higher values of $L_z$ increases gradually with increasing $\alpha$, but the predominance of $L_z=0$ persists also when counter terms are included in the calculation. With the above parameter choice no bound states have been found numerically for $J_z>1$. They start to appear only when $\alpha$ is significantly increased. The above bound-state equations are regularized. How are they renormalized? In the section on renormalization, below, we shall show in simple one-dimensional models that it is possible to add counter terms to integral equations of this type that completely remove all the cut-off dependence from both the wavefunctions and the bound-state spectrum. In these one-dimensional models the finite part of the counter term contains an arbitrary dimensionful scale $\mu$ and an associated arbitrary constant. In two-dimensional models the arbitrary constant becomes an arbitrary function. The analysis presented here is based on the methods used in the one-dimensional models. It is convenient to subdivide the study of these counter terms into two categories: the asymptotic counter terms and the perturbative counter terms. Studies of the simple models and general power-counting arguments show that the integral equations should be supplemented by a counter term of the form

$$G(\Lambda)\int q\,dq\,dy\,F(x,y)\,\phi(q,y). \tag{3.119}$$
For the Yukawa model one has not been able to solve for $G(\Lambda)F(x,y)$ exactly such that it removes all the cut-off dependence. One can, however, estimate $G(\Lambda)F(x,y)$ perturbatively. The
lowest-order perturbative counter terms, those of order $\alpha^2$, correspond to the box graphs in the integral equation. They are thus called the “box counter terms” (BCT). Applying this to the Yukawa model, one finds that the integral equation should be modified according to

$$V(k,x;q,y;\omega)\to V(k,x;q,y;\omega)-V_{\rm BCT}(x,y). \tag{3.120}$$

$V_{\rm BCT}(x,y)$ contains an undetermined parameter ‘C’. Redoing the bound-state mass calculations with this counter term one finds that the cut-off dependence of the solutions is greatly reduced. Thus, one has an (almost) finite calculation involving arbitrary parameters, one $C$ for each sector. Adjusting the $C$'s allows us to move eigenvalues around only in a limited way. It is possible, however, to make the $J_z=1$ state degenerate with either of the two $J_z=0$ states. The splitting among the two $J_z=0$ states remains small. One can also eliminate divergences non-perturbatively by subtracting the large transverse momentum limit of the kernel. We call this type of counter term the asymptotic counter term. In the Yukawa model one is only able to employ such counter terms in the $J_z=0$ sector. One then has

$$V(k,x;q,y;M^2)_{s^+,s^+}\to V(k,x;q,y;M^2)_{s^+,s^+}+f(x,y), \tag{3.121}$$

$$V(k,x;q,y;M^2)_{s^-,s^-}\to V(k,x;q,y;M^2)_{s^-,s^-}-f(x,y). \tag{3.122}$$
One can find an extra interaction allowed by power counting in the LC-Hamiltonian that would give rise to more terms. One finds that with the asymptotic counter term the cut-off dependence has been eliminated for the $(t^-,s^+)$ states and improved for the $(t^+,s^-)$ states. We also find that this counter term modifies the large-$k$ behavior of the amplitudes $\Phi(k,x)$, making them fall off faster than before. The asymptotic counter term, as it stands, does not include any arbitrary constants that can be tuned to renormalize the theory to some experimental input. This differs from the case with the box counter term where such a constant appeared. One may, however, add an adjustable piece which in general involves an arbitrary function of longitudinal momenta. This is motivated by the simple models discussed in the section on renormalization. One replaces

$$f(x,y)\to f(x,y)-\frac{G_\mu}{1+\frac16G_\mu\ln(\Lambda/\mu)}. \tag{3.123}$$

$G_\mu$ and the scale $\mu$ are not independent. A change in $\mu$ can be compensated by adjusting $G_\mu$ such that $(1/G_\mu)-\frac16\ln\mu=\text{constant}$. This ‘constant’ is arbitrary and plays the role of the constant “C” in the box counter term. One finds that by adjusting the constant a much wider range of possible eigenvalues can be covered, compared to the situation with the box counter term. Consider now the effects of the self-energy term $[SE]$. Note that in the bound-state problem the self-energy is a function of the bound-state energy $M^2$. The most severe ultraviolet divergence in $[SE]_{(M^2)}$ is a quadratic divergence. One eliminates this divergence by subtracting at the threshold $M^2=M_0^2\equiv(m_F^2+k^2)/(x(1-x))$:

$$[SE]_{(M^2)}\to[SE]_{(M^2)}-[SE]_{(M_0^2)}\equiv g^2(M^2-M_0^2)\,\sigma_{(M^2)}. \tag{3.124}$$

$\sigma_{(M^2)}$ is still logarithmically divergent. The remaining logarithmically divergent piece corresponds to wavefunction renormalization of the two fermion lines. One finds

$$\sigma_{\text{log div. part}}=\left(\frac{\partial[SE]}{\partial M^2}\right)_{\text{log div.}}\equiv-W(\Lambda). \tag{3.125}$$
One can absorb this divergence into a new definition of the coupling constant. After the subtraction (but ignoring all “boson-exchange” counter terms) the integral equation becomes

$$(M^2-M_0^2)\,\Psi_2^{\sigma_1\sigma_2}(k,x)=\frac{\alpha}{4\pi^2}[BE]+\frac{\alpha}{4\pi^2}(M^2-M_0^2)\,\sigma_{(M^2)}\,\Psi_2^{\sigma_1\sigma_2}(k,x), \tag{3.126}$$

where $[BE]$ stands for the term with the kernel $K$. After rearranging the terms one finds, with all spin indices suppressed,

$$(M^2-M_0^2)\left[1+\frac{\alpha}{4\pi^2}W(\Lambda)\right]\Psi=\frac{\alpha}{4\pi^2}[BE]+\frac{\alpha}{4\pi^2}(M^2-M_0^2)\big(\sigma+W(\Lambda)\big)\Psi. \tag{3.127}$$

The r.h.s. is now finite. One must still deal with the divergent piece $W$ on the l.h.s. of the equation. Define

$$\alpha_R=\frac{\alpha}{1+(\alpha/4\pi^2)\,W(\Lambda)}. \tag{3.128}$$

Then one can trade the $\Lambda$-dependent bare coupling $\alpha$ for a finite renormalized coupling $\alpha_R$. One has

$$(M^2-M_0^2)\,\Psi=\frac{\alpha_R/4\pi^2}{1-(\alpha_R/4\pi^2)\big(\sigma_{(M^2)}+W(\Lambda)\big)}\,[BE]. \tag{3.129}$$

One sees that the form of the equation is identical to what was solved earlier (where all counter terms were ignored) with $\alpha$ replaced by $\alpha_R/[1-(\alpha_R/4\pi^2)(\sigma+W)]$. One should note that $\sigma$ is a function of $x$ and $k$, and therefore effectively changes the kernel. In lowest-order Tamm-Dancoff the divergent parts of $[SE]$ can hence be absorbed into a renormalized mass and coupling. It is, however, not clear whether this method will work in higher orders. Inverting the equation for $\alpha_R$ one has

$$\alpha(\Lambda)=\frac{\alpha_R}{1-(\alpha_R/4\pi^2)\,W(\Lambda)}. \tag{3.130}$$

One sees that for every value of $\alpha_R$ other than $\alpha_R=0$ there will be a cut-off $\Lambda$ at which the denominator vanishes and $\alpha$ becomes infinite. This is just a manifestation of “triviality” in this model. The only way the theory can be sensible for arbitrarily large cut-off, $\Lambda\to\infty$, is when $\alpha_R\to0$. In practice, this means that for fixed cut-off there will be an upper bound on $\alpha_R$.

4. Discretized light-cone quantization

Constructing even the lowest state, the “vacuum”, of a quantum field theory has been so notoriously difficult that the conventional Hamiltonian approach was given up altogether in the 1950s, in favor of action-oriented approaches. It was overlooked that Dirac's “front form of Hamiltonian dynamics” [123] might have less severe problems. Of course, the action and the Hamiltonian forms of dynamics are equivalent to each other. The action is more suitable for deriving cross sections, the Hamiltonian more convenient when considering the structure of bound states in atoms, nuclei, and hadrons.
In fact, in the front form with periodic boundary conditions one can combine the aspects of a simple vacuum [448] and a careful treatment of the infrared
degrees of freedom. This method is called “discretized light-cone quantization” (DLCQ) [354] and has three important aspects:
1. the theory is formulated in a Hamiltonian approach;
2. calculations are done in momentum representation;
3. quantization is done at equal light-cone time rather than at equal usual time.
As a method, “discretized light-cone quantization” has the ambitious goal of calculating the spectra and wavefunctions of physical hadrons from a covariant gauge field theory. In fact, in 1+1 dimensions this method provides the first total solutions to non-trivial quantum field theories. In 3+1 dimensions, the conversion of this non-perturbative method into a reliable tool for hadronic physics is beset with many difficulties [185]. Their resolution will continue to take time. Since its first formulation [354,355] many problems have been resolved but many remain, as we shall see. Many of these challenges are actually not peculiar to the front form but appear also in conventional Hamiltonian dynamics. For example, the renormalization program for a quantum field theory has been formulated thus far only in order-by-order perturbation theory. Little work has been done on formulating a non-perturbative Hamiltonian renormalization [363,456]. At the beginning, one should emphasize a rather important aspect of periodic boundary conditions: all charges are strictly conserved. Every local Lagrangian field theory has vanishing four-divergences of some “currents”, of the form $\partial_\mu J^\mu=0$. Written out explicitly this reads

$$\partial_+J^++\partial_-J^-=0. \tag{4.1}$$

The restriction to 1+1 dimensions suffices for the argument. The case of 3+1 dimensions is a simple generalization. The “charge” is defined by

$$Q(x^+)\equiv\int_{-L}^{+L}dx^-\,J^+(x^+,x^-). \tag{4.2}$$

Conservation is proven by integrating Eq. (4.1),

$$\frac{d}{dx^+}Q(x^+)=0, \tag{4.3}$$

provided that the terms from the boundaries vanish, i.e.

$$J^+(x^+,L)-J^+(x^+,-L)=0. \tag{4.4}$$
This is precisely the condition for periodic boundary conditions. If one does not use periodic boundary conditions, then one has to ensure that all fields tend to vanish “sufficiently fast” at the boundaries. To guarantee the latter is much more difficult than taking the limit $L\to\infty$ at the end of a calculation. Examples of conserved four-currents are the components of the energy-momentum stress tensor with $\partial_\mu\Theta^{\mu\nu}=0$, the conserved “charges” being the four components of the energy-momentum four-vector $P^\nu$. Discretized light-cone quantization applied to abelian and non-abelian quantum field theories faces a number of problems, only part of which have been resolved by recent work. Here is a rather incomplete list:
1. Is the front form of Hamiltonian dynamics equivalent to the instant form? Does one get the same results in both approaches? Except for a class of problems involving massless left-handed
fields, it has been established that all explicit calculations with the front form yield the same results as in the instant form, provided the latter are available and reliable.
2. One of the major problems is to find a suitable and appropriate gauge. One has to fix the gauge before one can formulate the Hamiltonian. One faces the problem of quantizing a quantum field theory “under constraints”. Today one knows much better how to cope with these problems, and the Dirac-Bergmann method is discussed in detail in Appendix E.
3. Can a Hamiltonian matrix be properly renormalized with a cut-off such that the physical results are independent of the cut-off? Hamiltonian renormalization theory is just starting to be understood.
4. In hadron phenomenology the aspects of chiral symmetry breaking play a central role. In DLCQ applied to QCD they have not been tackled yet.
In this section we shall give a number of concrete examples where the method has been successful.

4.1. Why discretized momenta?

The goal of rigorously diagonalizing a Hamiltonian has not been realized even for a conventional quantum many-body problem. How can one dare to address a field theory, where not even the particle number is conserved? Let us briefly review the difficulties for a conventional non-relativistic many-body theory. One starts with a many-body Hamiltonian $H=T+U$. The kinetic energy $T$ is usually a one-body operator and thus simple. The potential energy $U$ is at least a two-body operator and thus complicated. One has solved the problem if one has found the eigenvalues and eigenfunctions of the Hamiltonian equation, $H\Psi=E\Psi$. One can always expand the eigenstates in terms of products of single-particle states $\langle x|m\rangle$, which usually belong to a complete set of orthonormal functions of position $x$, labeled by a quantum number $m$. When antisymmetrized, one refers to them as “Slater determinants”. All Slater determinants with a fixed particle number form a complete set. One can proceed as follows.
In the first step one chooses a complete set of single-particle wavefunctions. These single-particle wavefunctions are solutions of an arbitrary “single-particle Hamiltonian”, and its selection is a science of its own. In the second step, one defines one (and only one) reference state, which in field theory finds its analogue as the “Fock-space vacuum”. All Slater determinants can be classified relative to this reference state as 1-particle-1-hole (1-ph) states, 2-particle-2-hole (2-ph) states, and so on. The Hilbert space is truncated at some level. In a third step, one calculates the Hamiltonian matrix within this Hilbert space. In Fig. 9, the Hamiltonian matrix for a two-body interaction is displayed schematically. Most of the matrix elements vanish, since a two-body Hamiltonian changes the state by up to two particles. Therefore, the structure of the Hamiltonian is a finite penta-diagonal block matrix. The dimension within a block, however, is infinite. It is made finite by an artificial cut-off on the kinetic energy, i.e. on the single-particle quantum numbers $m$. A finite matrix, however, can be diagonalized on a computer: the problem becomes “approximately soluble”. Of course, at the end one must verify that the physical results are reasonably insensitive to the cut-off(s) and other formal parameters. This procedure was actually carried out in one space dimension [353] with two different sets of single-particle functions,

$$\langle x|m\rangle=N_mH_m(x/L)\exp\{-\tfrac12(x/L)^2\},\qquad\langle x|m\rangle=N_m\exp\{im(x/L)\pi\}. \tag{4.5}$$
Fig. 9. Non-relativistic many-body theory.
The two sets are the eigenfunctions of the harmonic oscillator ($L\equiv\hbar/m\omega$) with its Hermite polynomials $H_m$, and the eigenfunctions of the momentum of a free particle with periodic boundary conditions. Both are suitably normalized ($N_m$), and both depend parametrically on a characteristic length parameter $L$. The calculations are particularly easy for particle number 2, and for a harmonic two-body interaction. The results are displayed in Fig. 9, and are surprisingly different. For the plane waves, the results converge rapidly to the exact eigenvalues $E=\frac32,\frac72,\frac{11}{2},\dots$, as shown in the right part of the figure. In contrast, the results with the oscillator states converge extremely slowly. Obviously, the larger part of the Slater determinants is wasted on building up the plane-wave states of center-of-mass motion from the Slater determinants of oscillator wavefunctions. It is obvious that the plane waves are superior, since they account for the symmetry of the problem, namely Galilean covariance. For completeness one should mention that the approach with discretized plane waves was successful in getting the exact eigenvalues and eigenfunctions for up to 30 particles in one dimension [353] for harmonic and other interactions. From these calculations, one should conclude:
1. Discretized plane waves are a useful tool for many-body problems.
2. Discretized plane waves and their Slater determinants are denumerable, and thus allow the construction of a Hamiltonian matrix.
3. Periodic boundary conditions generate good wavefunctions even for a “confining” potential like the harmonic oscillator.
A numerical “solution” of the many-body problem is thus possible at least in one space dimension. Periodic boundary conditions should also be applicable to gauge field theory.
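The third conclusion can be tried out in a much simpler setting than the two-body calculation of Ref. [353]. The Python sketch below treats a single particle in an external harmonic well, $H=p^2/2+x^2/2$ in units $\hbar=m=\omega=1$, in the discretized plane-wave basis of Eq. (4.5) on the interval $[-L,L]$. The one-body setting, the units, the box size, the basis cut-off and the small Jacobi diagonalizer are all assumptions of this illustration. The matrix elements $\langle m|x^2|n\rangle=2L^2(-1)^{m-n}/(\pi(m-n))^2$ for $m\ne n$ and $L^2/3$ for $m=n$ follow from elementary integration.

```python
import math

def plane_wave_hamiltonian(n_max, box_half_length):
    """H = p^2/2 + x^2/2 in the basis exp(i*m*pi*x/L)/sqrt(2L), m = -n_max..n_max.

    The matrix is real and symmetric because x^2 is an even function.
    """
    L = box_half_length
    ms = list(range(-n_max, n_max + 1))
    h = [[0.0] * len(ms) for _ in ms]
    for i, m in enumerate(ms):
        for j, n in enumerate(ms):
            d = m - n
            if d == 0:
                kinetic = 0.5 * (m * math.pi / L) ** 2
                h[i][j] = kinetic + 0.5 * L * L / 3.0
            else:
                sign = -1.0 if d % 2 else 1.0
                h[i][j] = 0.5 * 2.0 * L * L * sign / (math.pi * d) ** 2
    return h

def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a real symmetric matrix by cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = math.sqrt(sum(a[p][q] ** 2 for p in range(n) for q in range(p + 1, n)))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-300:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

energies = jacobi_eigenvalues(plane_wave_hamiltonian(n_max=15, box_half_length=6.0))
print(energies[:3])
```

With a box that is large compared to the oscillator length and a momentum cut-off well above the inverse oscillator length, the lowest eigenvalues should lie close to the oscillator values $\frac12,\frac32,\frac52$. Shrinking the box or the basis shows how the result depends on these formal parameters.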
4.2. Quantum chromodynamics in 1+1 dimensions (KS)

DLCQ [354] in one space and one time dimension was applied first to Yukawa theory [354,355], followed by applications to QED [137] and to QCD [227], but the advantages of working with periodic boundary conditions, particularly when discussing the “zero modes” (see Section 7), had been noted first in 1976 by Maskawa and Yamawaki [324]. However, before we go into the technical details, let us first see how much we can say about the theory without doing any calculations. With only one space dimension there are no rotations, hence no angular momentum. The Dirac equation is only a two-component equation. Chirality can still formally be defined. Second, the gauge field does not contain any dynamical degree of freedom (up to a zero mode which will be discussed in a later section) since there are no transverse dimensions. This can be understood as follows. In four dimensions, the $A^\mu$ field has four components. One is eliminated by fixing the gauge. A second component corresponds to the static Coulomb field and only the remaining two transverse components are dynamical (their “equations of motion” contain a time derivative). In contrast, in 1+1 dimensions, one starts with only two components of the $A^\mu$ field. Thus, after fixing the gauge and eliminating the Coulomb part, there are no dynamical degrees of freedom left. Furthermore, in an axial gauge the nonlinear term in the only non-vanishing component of $F^{\mu\nu}$ drops out, and there are no gluon-gluon interactions. Nevertheless, the theory confines quarks. One way to see that is to analyze the solution to the Poisson equation in one space dimension, which gives rise to a linearly rising potential. This, however, is not peculiar to QCD$_{1+1}$. Most if not all field theories confine in 1+1 dimensions. In 1+1 dimensions quantum electrodynamics [137] and quantum chromodynamics [227] show many similarities, both from the technical and from the phenomenological point of view.
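The statement about the Poisson equation can be checked with a few lines of Python. In one space dimension the solution of $-\phi''(x)=\delta(x)$ is $\phi(x)=-\frac12|x|$, up to terms linear in $x$, so the potential between two unlike charges rises linearly with their separation. The sketch below verifies on a grid that the discrete second derivative of this $\phi$ vanishes away from the origin and carries unit total charge. The normalization of the source and the grid parameters are assumptions of this illustration.

```python
def potential_of_point_charge(x):
    """Solution of -phi''(x) = delta(x) in one space dimension (unit charge assumed)."""
    return -0.5 * abs(x)

h = 0.01                                  # grid spacing (arbitrary choice)
grid = [i * h for i in range(-300, 301)]  # symmetric grid containing x = 0
phi = [potential_of_point_charge(x) for x in grid]

# Discrete charge density rho_i = -(phi[i+1] - 2*phi[i] + phi[i-1]) / h^2
rho = [
    -(phi[i + 1] - 2.0 * phi[i] + phi[i - 1]) / h**2
    for i in range(1, len(grid) - 1)
]
total_charge = sum(r * h for r in rho)

# Potential difference between the origin and a point at distance r grows like r/2.
separations = [0.5, 1.0, 2.0]
rises = [potential_of_point_charge(0.0) - potential_of_point_charge(r) for r in separations]
print(total_charge, rises)
```

The charge density is concentrated in the single grid cell at the origin, and doubling the separation doubles the potential difference, which is the linear rise referred to in the text.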
A plot like that on the left side of Fig. 10 was first given by Eller for periodic boundary conditions on the fermion fields [137], and repeated recently for anti-periodic ones [139]. For a fixed value of the resolution, it shows the full mass spectrum of QED in the charge-zero sector for all values of the coupling constant and the fermion mass, parametrized by $\lambda=(1+\pi(m/g)^2)^{-1/2}$. It includes the free case $\lambda=0$ ($g=0$) and the Schwinger model $\lambda=1$ ($m=0$). The eigenvalues $M_i$ are plotted in units where the mass of the lowest “positronium” state has the numerical value 1. All states with $M>2$ are unbound. The lower left part of the figure illustrates the following point. The rich complexity of the spectrum allows for multi-particle Fock states at the same invariant mass as the “simple $q\bar q$ states” shown in the figure as the “2 particle sector”. The spectrum includes not only the simple bound-state spectrum, but also the associated discretized continuum of the same particles in relative motion. One can identify the simple bound states as two quarks connected by a confining string as displayed in the figure. The smallest residual interaction mixes the simple configuration with the large number of “continuum states” at the same mass. The few simple states have a much smaller statistical weight, and it looks as if the long string “breaks” into several pieces of smaller strings. Loosely speaking, one can interpret such a process as the decay of an excited pion into multi-pion configurations, $\pi^*\to\pi\pi\pi$. In the right part of Fig. 10 some of the results of Hornbostel [227] on the spectrum and the wavefunctions for QCD are displayed. Fock states in non-abelian gauge theory SU(N) can be made color singlets for any order of the gauge group and thus one can calculate mass spectra for mesons and baryons for almost arbitrary values of $N$. In the upper right part of the figure the lowest mass eigenvalue of a meson is given for $N=2,3,4$.
Fig. 10. Spectra and wavefunctions in 1+1 dimensions, taken from Refs. [137,227]. Lattice results are from Refs. [195–197].

Lattice gauge calculations to compare with are available only for $N = 2$ and for the lowest two eigenstates; the agreement is very good. In the left lower part of the figure the structure function of a baryon is plotted versus (Bjørken-)$x$ for $m/g = 1.6$. With DLCQ it is possible to calculate also higher Fock-space components. As an example, the figure includes the probability distribution to find a quark in a $qqq\,q\bar q$ state.

Meanwhile, many calculations have been done for 1+1 dimensions, among them those by Eller et al. [137,138], Hornbostel et al. [226–230], Antonuccio et al. [9–11,13], Burkardt et al. [75–79], Dalley et al. [115,116,439], Elser et al. [139,140,219], Fields et al. [143,348,442,443], Fujita et al. [162–167,349,428], Harada et al. [198–202], Harindranath et al. [182,203–205,364,461–463], Hiller et al. [58,4,220–223,315,440,458], Hollenberg et al. [224,225], Itakura et al. [239–241], Pesando et al. [370,371], Kalloniatis et al. [140,219,258–263,358,383], Klebanov et al. [38,115,121], McCartor et al. [325–331], Nardelli et al. [25,27,28], van de Sande et al. [33,116,223,382,435–439], Sugihara et al. [410–412], Tachibana et al. [423], Thies et al. [296,425], Tsujimaru et al. [231,271,344,441], and others [3,253,278,336,391,408]. Aspects of reaction theory can be studied now. Hiller [220], for example, has calculated the total annihilation cross section $R_{e\bar e}$ in 1+1 dimensions, with success.

We will use the work of Hornbostel et al. [227] as an example to demonstrate how DLCQ works. Consider the light-cone gauge, $A^+ = 0$, with the gauge group SU($N$). In a representation in which $\gamma^5$ is diagonal one introduces the chiral components of the fermion spinors:
$$ \psi = \begin{pmatrix} \psi_L \\ \psi_R \end{pmatrix} . \tag{4.6} $$
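The usual group generators for SU($N$) are the $T^a = \tfrac{1}{2}\lambda^a$. In a box with length $2L$ one finds

$$ \psi_L(x) = -\frac{im}{4} \int_{-L}^{+L} dy^-\, \epsilon(x^- - y^-)\, \psi_R(x^+, y^-) , \tag{4.7} $$

$$ A^{-a}(x) = -\frac{g}{2} \int_{-L}^{+L} dy^-\, |x^- - y^-|\, \psi_R^\dagger T^a \psi_R (x^+, y^-) . \tag{4.8} $$

The light-cone momentum and light-cone energy operators are

$$ P^+ = \int_{-L}^{+L} dx^-\, \psi_R^\dagger\, i\partial_-\, \psi_R , \tag{4.9} $$

$$ P^- = -\frac{im^2}{4} \int_{-L}^{+L} dx^- \int_{-L}^{+L} dy^-\, \psi_R^\dagger(x^-)\, \epsilon(x^- - y^-)\, \psi_R(y^-) - \frac{g^2}{2} \int_{-L}^{+L} dx^- \int_{-L}^{+L} dy^-\, \psi_R^\dagger T^a \psi_R(x^-)\, |x^- - y^-|\, \psi_R^\dagger T^a \psi_R(y^-) , \tag{4.10} $$

respectively. Here, $\psi_R$ is subject to the canonical anti-commutation relations. For example, for anti-periodic boundary conditions one can expand

$$ \psi_R(x^-)_c = \frac{1}{\sqrt{2L}} \sum_{n=\frac12,\frac32,\dots}^{\infty} \left( b_{n,c}\, e^{-i n\pi x^-/L} + d^\dagger_{n,c}\, e^{i n\pi x^-/L} \right) , \tag{4.11} $$

where

$$ \{ b^\dagger_{n,c_1}, b_{m,c_2} \} = \{ d^\dagger_{n,c_1}, d_{m,c_2} \} = \delta_{c_1,c_2}\, \delta_{n,m} , \tag{4.12} $$

with all other anticommutators vanishing. Inserting this expansion into the expression for $P^+$, Eq. (4.9), one thus finds

$$ P^+ = \left( \frac{2\pi}{L} \right) \sum_{n=\frac12,\frac32,\dots}^{\infty} n \left( b^\dagger_{n,c} b_{n,c} + d^\dagger_{n,c} d_{n,c} \right) . \tag{4.13} $$

Similarly, one finds for $P^-$ of Eq. (4.10)

$$ P^- = \left( \frac{L}{2\pi} \right) (H_0 + V) , \tag{4.14} $$

where

$$ H_0 = \sum_{n=\frac12,\frac32,\dots}^{\infty} \frac{m^2}{n} \left( b^\dagger_{n,c} b_{n,c} + d^\dagger_{n,c} d_{n,c} \right) \tag{4.15} $$

is the free kinetic term, and the interaction term $V$ is given by

$$ V = \frac{g^2}{\pi} \sum_{k=-\infty}^{\infty} j^a(k)\, \frac{1}{k^2}\, j^a(-k) , \tag{4.16} $$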
where

$$ j^a(k) = T^a_{c_1,c_2} \sum_{n=-\infty}^{\infty} \left( \Theta(n)\, b^\dagger_{n,c_1} + \Theta(-n)\, d_{-n,c_1} \right) \left( \Theta(n-k)\, b_{n-k,c_2} + \Theta(k-n)\, d^\dagger_{k-n,c_2} \right) . \tag{4.17} $$

Since we will restrict ourselves to the color-singlet sector, there is no problem from $k = 0$ in Eq. (4.16), since $j^a(0) = 0$ acting on color-singlet states. Normal ordering the interaction (4.16) gives a diagonal operator piece

$$ V = {:}V{:} + \frac{g^2 C_F}{\pi} \sum_{n=\frac12,\frac32,\dots}^{\infty} \frac{I_n}{n} \left( b^\dagger_{n,c} b_{n,c} + d^\dagger_{n,c} d_{n,c} \right) , \tag{4.18} $$

with the “self-induced inertia”

$$ I_n = -\frac{1}{2n} + \sum_{m=1}^{n+1/2} \frac{1}{m^2} . \tag{4.19} $$

The color factor is $C_F = (N^2 - 1)/2N$. The explicit form of the normal-ordered piece ${:}V{:}$ can be found in Ref. [227] or in the explicit tables below in this section. It is very important to keep the self-induced inertias from the normal ordering, because they are needed to cancel the infrared singularity in the interaction term in the continuum limit. Already classically, the self-energy of a single quark is infrared divergent because its color electric field extends to infinity. The same infrared singularity (with opposite sign) appears in the interaction term. They cancel for color-singlet states, because there the color electric field is nonzero only inside the hadron. Since the hadron has a finite size, the resulting total color electric field energy must be infrared finite.

The next step is to actually solve the equations of motion in the discretized space. Typically one proceeds as follows. Since $P^+$ and $P^-$ commute they can be diagonalized simultaneously. Actually, in the momentum representation, $P^+$ is already diagonal, with eigenvalues proportional to $2\pi/L$. Therefore, the harmonic resolution $K$ [137],

$$ K = \frac{L}{2\pi}\, P^+ , \tag{4.20} $$
determines the size of the Fock space and thus the dimension of the Hamiltonian matrix, which simplifies the calculations considerably. For a given $K = 1, 2, 3, \dots$, there are only a finite number of Fock states due to the positivity condition on the light-cone momenta. One now selects one value for $K$ and constructs all color-singlet states. In the next step one can either diagonalize $H$ in the full space of states with momentum $K$ (DLCQ approximation) or in a subspace of that space (for example with a Tamm-Dancoff approximation). The eigenvalue $E_i(K)$ corresponds to the invariant mass

$$ M_i^2(K) \equiv 2 P_i^+ P_i^- = K E_i(K) , \tag{4.21} $$

where we indicated the parametric dependence of the eigenvalues on $K$. Notice that the length $L$ drops out in the invariant mass, and that one gets a spectrum for any value of $K$. The most recent developments in string theory, the so-called “M(atrix) theory” [416], emphasize this aspect, but for the present one should consider the solutions to be physical only in the continuum limit $K \to \infty$. Of course there are limitations on the size of matrices that one can diagonalize (although the Lanczos algorithm allows quite impressive sizes [220]). Therefore what one typically does is to
repeat the calculations for increasing values of $K$ and to extrapolate observables to $K \to \infty$. The first QCD$_2$ calculations in that direction were performed in Refs. [227,75]. In these pioneering works it was shown that the numerics actually converged rather quickly (except for very small quark masses, where ground state mesons and ground state baryons become massless) since the lowest Fock component dominates these hadrons (typically less than 1% of the momentum is carried by the sea component). The suppression of the higher particle Fock states is presumably special to super-renormalizable theories where the couplings which change particle number are suppressed as $g^2/M^2$. Due to these fortunate circumstances a variety of phenomena could be investigated. For example, Hornbostel studied hadron masses and structure functions for various $N$, which showed very simple scaling behavior with $N$. A correspondence with the analytic work of Einhorn [136] for meson form factors in QCD$_{1+1}$ was also established. Ref. [75] focused more on nuclear phenomena. There it was shown that two nucleons in QCD$_{1+1}$ with two colors and two flavors form a loosely bound state, the “deuteron”. Since the calculation was based entirely on quark degrees of freedom it was possible to study binding effects on the nuclear structure function (“EMC effect”). Other applications include a study of “Pauli blocking” in QCD$_{1+1}$. Since quarks are fermions, one would expect that sea quarks which have the same flavor as the majority of the valence quarks (the up quarks in a proton) are suppressed compared to those which have the minority flavor (the down quarks in a proton), at least if isospin breaking effects are small. However, an explicit calculation shows that the opposite is true in QCD$_2$! This so-called “anti-Pauli-blocking” has been investigated in Refs. [76,77], where one can also find an intuitive explanation.

4.3. The Hamiltonian operator in 3+1 dimensions (BL)

Periodic boundary conditions on $\mathcal{L}$ can be realized by periodic boundary conditions on the vector potentials $A_\mu$ and anti-periodic boundary conditions on the spinor fields, since $\mathcal{L}$ is bilinear in the $\Psi_\alpha$. In momentum representation one expands these fields into plane wave states $e^{-ip_\mu x^\mu}$, and satisfies the boundary conditions by discretized momenta

$$ p_- = \begin{cases} \dfrac{\pi}{L}\, n & \text{with } n = \tfrac12, \tfrac32, \dots, \infty \text{ for fermions} , \\[2mm] \dfrac{\pi}{L}\, n & \text{with } n = 1, 2, \dots, \infty \text{ for bosons} , \end{cases} \qquad \mathbf{p}_\perp = \frac{\pi}{L_\perp}\, \mathbf{n}_\perp \ \text{ with } n_x, n_y = 0, \pm 1, \pm 2, \dots, \pm\infty \text{ for both} . \tag{4.22} $$

The price is that one has to introduce two artificial length parameters, $L$ and $L_\perp$. They also define the normalization volume $\Omega \equiv 2L(2L_\perp)^2$. More explicitly, the free fields are expanded as

$$ \tilde\Psi_\alpha(x) = \frac{1}{\sqrt{\Omega}} \sum_q \frac{1}{\sqrt{p^+}} \left( b_q\, u_\alpha(p,\lambda)\, e^{-ipx} + d_q^\dagger\, v_\alpha(p,\lambda)\, e^{ipx} \right) , \qquad \tilde A_\mu(x) = \frac{1}{\sqrt{\Omega}} \sum_q \frac{1}{\sqrt{p^+}} \left( a_q\, \epsilon_\mu(p,\lambda)\, e^{-ipx} + a_q^\dagger\, \epsilon_\mu^*(p,\lambda)\, e^{ipx} \right) , \tag{4.23} $$
particularly for the two transverse vector potentials $\tilde A^i \equiv \tilde A^i_\perp$ ($i = 1, 2$). The light-cone gauge and the light-cone Gauss equation, i.e. $A^+ = 0$ and $A^- = \frac{2g}{(i\partial^+)^2}\, J^+ - \frac{2}{(i\partial^+)}\, i\partial_j A^j_\perp$, respectively, complete the specification of the vector potentials $A^\mu$. The subtlety of the missing zero mode $n = 0$ in the expansion of the $\tilde A_\perp$ will be discussed below. Each denumerable single-particle state “$q$” is specified by at least six quantum numbers, i.e.

$$ q = \{ q \,|\, k^+, k_{\perp x}, k_{\perp y}, \lambda, c, f \} = \{ q \,|\, n, n_x, n_y, \lambda, c, f \} . \tag{4.24} $$

The quantum numbers denote the three discrete momenta $n, n_x, n_y$, the two helicities $\lambda = (\uparrow, \downarrow)$, the color index $c = 1, 2, \dots, N_C$, and the flavor index $f = 1, 2, \dots, N_F$. For a gluon state, the color index is replaced by the glue index $a = 1, 2, \dots, N_C^2 - 1$ and the flavor index is absent. Correspondingly, for QED the color and flavor indices are absent. The creation and destruction operators like $a_q^\dagger$ and $a_q$ create and destroy single-particle states $q$, and obey (anti-)commutation relations like

$$ [a_q, a^\dagger_{q'}] = \{ b_q, b^\dagger_{q'} \} = \{ d_q, d^\dagger_{q'} \} = \delta_{q,q'} . \tag{4.25} $$

The Kronecker symbol is unity only if all six quantum numbers coincide. The spinors $u_\alpha$ and $v_\alpha$, and the transverse polarization vectors $\epsilon_\perp$ are the usual ones, and can be found in Ref. [66] and in the appendix. Finally, after inserting all fields in terms of the expansions in Eq. (4.23), one performs the space-like integrations and ends up with the light-cone energy-momenta $P^\nu = P^\nu(a_q, a_q^\dagger, b_q, b_q^\dagger, d_q, d_q^\dagger)$ as operators acting in Fock space. The space-like components of $P^\nu$ are simple and diagonal, and its time-like component is complicated and off-diagonal. Its Lorentz-invariant contraction

$$ H_{LC} \equiv P^\nu P_\nu = P^+ P^- - \mathbf{P}_\perp^2 \tag{4.26} $$

is then also off-diagonal. For simplicity, it is referred to as the light-cone Hamiltonian $H_{LC}$, and often abbreviated as $H = H_{LC}$. It carries the dimension of an invariant mass squared. In a frame in which $\mathbf{P}_\perp = 0$, it reduces to $H = P^+ P^-$.
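The labeling of Eq. (4.24) and the Kronecker symbol of Eq. (4.25) can be illustrated with a small data structure. This is only a sketch: the class `SingleParticleState` and the functions `kronecker` and `momenta` are hypothetical names introduced here, and the longitudinal quantum number is stored as the integer $2n$ (an implementation choice, not part of the text) so that the half-integer fermion values of Eq. (4.22) are represented exactly.

```python
import math
from typing import NamedTuple, Optional, Tuple


class SingleParticleState(NamedTuple):
    """Hypothetical container for the six labels q of Eq. (4.24)."""
    two_n: int             # 2n: odd for fermions, even for bosons
    nx: int                # transverse mode number n_x
    ny: int                # transverse mode number n_y
    helicity: int          # +1 or -1
    color: int             # color (or glue) index
    flavor: Optional[int]  # None for gluons, which carry no flavor


def kronecker(q1: SingleParticleState, q2: SingleParticleState) -> int:
    """delta_{q,q'}: unity only if all six quantum numbers coincide."""
    return 1 if q1 == q2 else 0


def momenta(q: SingleParticleState, L: float, L_perp: float) -> Tuple[float, float, float]:
    """Discretized momenta (p_-, p_x, p_y) following Eq. (4.22)."""
    n = q.two_n / 2
    return (math.pi * n / L, math.pi * q.nx / L_perp, math.pi * q.ny / L_perp)
```

Two states that differ in a single label, for instance only in helicity, give `kronecker(q1, q2) == 0`, which is the content of the sentence following Eq. (4.25).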
It is useful to give its general structure in terms of Fock-space operators.

4.3.1. A typical term of the Hamiltonian: Pure gauge theory

As an example consider a typical term in the Hamiltonian, the pure gauge contribution $W_1$ as defined in Eq. (2.96), i.e.

$$ P^-_{pg} = \frac{g^2}{4} \int dx^-\, d^2x_\perp\, \tilde B^a_{\mu\nu} \tilde B^{\mu\nu}_a . $$

Inserting the free field solutions $\tilde A^\mu_a$ from Eq. (4.23), one deals with $2^4 = 16$ terms. The various terms can be classified according to their operator structure and belong to one of the six classes

$$ a_{q_1} a_{q_2} a_{q_3} a_{q_4} , \quad a^\dagger_{q_4} a^\dagger_{q_3} a^\dagger_{q_2} a^\dagger_{q_1} , \quad a^\dagger_{q_1} a_{q_2} a_{q_3} a_{q_4} , \quad a^\dagger_{q_4} a^\dagger_{q_3} a^\dagger_{q_2} a_{q_1} , \quad a^\dagger_{q_1} a^\dagger_{q_2} a_{q_3} a_{q_4} , \quad a^\dagger_{q_4} a^\dagger_{q_3} a_{q_2} a_{q_1} . $$

In the first step, we pick out only those terms with one creation and three destruction operators. Integration over the space-like coordinates produces a product of three Kronecker delta functions
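$\delta(k_1^+|k_2^+ + k_3^+ + k_4^+)\, \delta^{(2)}(\mathbf{k}_{\perp 1}|\mathbf{k}_{\perp 2} + \mathbf{k}_{\perp 3} + \mathbf{k}_{\perp 4})$, as opposed to the Dirac delta functions in Section 2.8. The Kronecker delta functions are conveniently defined by

$$ \delta(k^+|p^+) = \frac{1}{2L} \int_{-L}^{+L} dx^-\, e^{+i(k_- - p_-)x^-} = \frac{1}{2L} \int_{-L}^{+L} dx^-\, e^{+i(n-m)\pi x^-/L} = \delta_{n,m} , \tag{4.27} $$

and similarly for the transverse delta functions. One then gets very explicitly (Tables 1–6)

$$ \begin{aligned} P^+ P^-_{pgf} = \frac{g^2 P^+}{8L(2L_\perp)^2} \sum_{q_1,q_2} \sum_{q_3,q_4} \frac{1}{\sqrt{k_1^+ k_2^+ k_3^+ k_4^+}}\, C^a_{a_1 a_2} C^a_{a_3 a_4} \Big( & a^\dagger_{q_1} a_{q_2} a_{q_3} a_{q_4}\, (\epsilon_1^* \epsilon_3)(\epsilon_2 \epsilon_4)\, \delta(k_1^+|k_2^+ + k_3^+ + k_4^+)\, \delta^{(2)}(\mathbf{k}_{\perp 1}|\mathbf{k}_{\perp 2} + \mathbf{k}_{\perp 3} + \mathbf{k}_{\perp 4}) \\ + {} & a_{q_1} a^\dagger_{q_2} a_{q_3} a_{q_4}\, (\epsilon_1 \epsilon_3)(\epsilon_2^* \epsilon_4)\, \delta(k_2^+|k_3^+ + k_4^+ + k_1^+)\, \delta^{(2)}(\mathbf{k}_{\perp 2}|\mathbf{k}_{\perp 3} + \mathbf{k}_{\perp 4} + \mathbf{k}_{\perp 1}) \\ + {} & a_{q_1} a_{q_2} a^\dagger_{q_3} a_{q_4}\, (\epsilon_1 \epsilon_3^*)(\epsilon_2 \epsilon_4)\, \delta(k_3^+|k_4^+ + k_1^+ + k_2^+)\, \delta^{(2)}(\mathbf{k}_{\perp 3}|\mathbf{k}_{\perp 4} + \mathbf{k}_{\perp 1} + \mathbf{k}_{\perp 2}) \\ + {} & a_{q_1} a_{q_2} a_{q_3} a^\dagger_{q_4}\, (\epsilon_1 \epsilon_3)(\epsilon_2 \epsilon_4^*)\, \delta(k_4^+|k_1^+ + k_2^+ + k_3^+)\, \delta^{(2)}(\mathbf{k}_{\perp 4}|\mathbf{k}_{\perp 1} + \mathbf{k}_{\perp 2} + \mathbf{k}_{\perp 3}) \Big) . \end{aligned} $$

For convenience, introduce the function

$$ F_{6,2}(q_1; q_2, q_3, q_4) = \frac{2\Delta}{\sqrt{k_1^+ k_2^+ k_3^+ k_4^+}}\, \big( \epsilon^*_\mu(k_1, \lambda_1)\, \epsilon^\mu(k_3, \lambda_3) \big) \big( \epsilon_\nu(k_2, \lambda_2)\, \epsilon^\nu(k_4, \lambda_4) \big)\, C^a_{a_1 a_2} C^a_{a_3 a_4} , \tag{4.28} $$

with the overall factor $\Delta$ containing the Kronecker deltas

$$ \Delta(q_1; q_2, q_3, q_4) = \tilde g^2\, \delta(k_1^+|k_2^+ + k_3^+ + k_4^+)\, \delta^{(2)}(\mathbf{k}_{\perp 1}|\mathbf{k}_{\perp 2} + \mathbf{k}_{\perp 3} + \mathbf{k}_{\perp 4}) \tag{4.29} $$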
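Table 1. The vertex interaction in terms of Dirac spinors. The matrix elements $V_n$ are displayed on the right, the corresponding (energy) graphs on the left. All matrix elements are proportional to $\Delta_V = \hat g\, \delta(k_1^+|k_2^+ + k_3^+)\, \delta^{(2)}(\mathbf{k}_{\perp,1}|\mathbf{k}_{\perp,2} + \mathbf{k}_{\perp,3})$, with $\hat g = g P^+/\sqrt{\Omega}$. In the continuum limit, see Section 4.6, $\hat g = g P^+/\sqrt{2(2\pi)^3}$.

Table 2. The fork interaction in terms of Dirac spinors. The matrix elements $F_{n,j}$ are displayed on the right, the corresponding (energy) graphs on the left. All matrix elements are proportional to $\Delta = \tilde g^2\, \delta(k_1^+|k_2^+ + k_3^+ + k_4^+)\, \delta^{(2)}(\mathbf{k}_{\perp,1}|\mathbf{k}_{\perp,2} + \mathbf{k}_{\perp,3} + \mathbf{k}_{\perp,4})$, with $\tilde g^2 = g^2 P^+/(2\Omega)$. In the continuum limit, see Section 4.6, one uses $\tilde g^2 = g^2 P^+/(4(2\pi)^3)$.

Table 3. The coefficient functions $c_{n,j}$ in terms of matrix elements of the seagull interaction are displayed on the right, the corresponding (energy) graphs on the left.

Table 4. The seagull interaction in terms of Dirac spinors. The matrix elements $S_{n,j}$ are displayed on the right, the corresponding (energy) graphs on the left. All matrix elements are proportional to $\Delta = \tilde g^2\, \delta(k_1^+ + k_2^+|k_3^+ + k_4^+)\, \delta^{(2)}(\mathbf{k}_{\perp,1} + \mathbf{k}_{\perp,2}|\mathbf{k}_{\perp,3} + \mathbf{k}_{\perp,4})$, with $\tilde g^2 = g^2 P^+/(2\Omega)$. In the continuum limit, see Section 4.6, one uses $\tilde g^2 = g^2 P^+/(4(2\pi)^3)$.

Table 5. Matrix elements of Dirac spinors $\bar u(p) M u(q)$.

| $M$ | $\frac{1}{\sqrt{p^+q^+}}\,(\bar u(p) M u(q))\,\delta_{\lambda_p,\lambda_q}$ | $\frac{1}{\sqrt{p^+q^+}}\,(\bar u(p) M u(q))\,\delta_{\lambda_p,-\lambda_q}$ |
|---|---|---|
| $\gamma^+$ | $2$ | $0$ |
| $\gamma^-$ | $\frac{2}{p^+q^+}\,(\mathbf{p}_\perp \cdot \mathbf{q}_\perp + m^2 + i\lambda_q\, \mathbf{p}_\perp \wedge \mathbf{q}_\perp)$ | $\frac{2m}{p^+q^+}\,(p_\perp(\lambda_q) - q_\perp(\lambda_q))$ |
| $\gamma_\perp \cdot \mathbf{a}_\perp$ | $\mathbf{a}_\perp \cdot \left( \frac{\mathbf{p}_\perp}{p^+} + \frac{\mathbf{q}_\perp}{q^+} \right) - i\lambda_q\, \mathbf{a}_\perp \wedge \left( \frac{\mathbf{p}_\perp}{p^+} - \frac{\mathbf{q}_\perp}{q^+} \right)$ | $-a_\perp(\lambda_q) \left( \frac{m}{p^+} - \frac{m}{q^+} \right)$ |
| $1$ | $\frac{m}{p^+} + \frac{m}{q^+}$ | $\frac{p_\perp(\lambda_q)}{q^+} - \frac{q_\perp(\lambda_q)}{p^+}$ |
| $\gamma^-\gamma^+\gamma^-$ | $\frac{8}{p^+q^+}\,(\mathbf{p}_\perp \cdot \mathbf{q}_\perp + m^2 + i\lambda_q\, \mathbf{p}_\perp \wedge \mathbf{q}_\perp)$ | $\frac{8m}{p^+q^+}\,(p_\perp(\lambda_q) - q_\perp(\lambda_q))$ |
| $\gamma^-\gamma^+\,\gamma_\perp \cdot \mathbf{a}_\perp$ | $\frac{4}{p^+}\,(\mathbf{a}_\perp \cdot \mathbf{p}_\perp - i\lambda_q\, \mathbf{a}_\perp \wedge \mathbf{p}_\perp)$ | $-\frac{4m}{p^+}\, a_\perp(\lambda_q)$ |
| $\mathbf{a}_\perp \cdot \gamma_\perp\, \gamma^+\gamma^-$ | $\frac{4}{q^+}\,(\mathbf{a}_\perp \cdot \mathbf{q}_\perp + i\lambda_q\, \mathbf{a}_\perp \wedge \mathbf{q}_\perp)$ | $\frac{4m}{q^+}\, a_\perp(\lambda_q)$ |
| $\mathbf{a}_\perp \cdot \gamma_\perp\, \gamma^+\, \gamma_\perp \cdot \mathbf{b}_\perp$ | $2\,(\mathbf{a}_\perp \cdot \mathbf{b}_\perp + i\lambda_q\, \mathbf{a}_\perp \wedge \mathbf{b}_\perp)$ | $0$ |

Notation: $\lambda = \pm 1$, $a_\perp(\lambda) = -\lambda a_x - i a_y$, $\mathbf{a}_\perp \cdot \mathbf{b}_\perp = a_x b_x + a_y b_y$, $\mathbf{a}_\perp \wedge \mathbf{b}_\perp = a_x b_y - a_y b_x$.
Symmetries: $\bar v(p) v(q) = -\bar u(q) u(p)$, $\bar v(p) \gamma^\mu v(q) = \bar u(q) \gamma^\mu u(p)$, $\bar v(p) \gamma^\mu \gamma^\nu \gamma^\rho v(q) = \bar u(q) \gamma^\rho \gamma^\nu \gamma^\mu u(p)$.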
and the “tilded coupling constant”

$$ \tilde g^2 = g^2 P^+/(2\Omega) , \tag{4.30} $$

as abbreviations. Recall the normalization volume $\Omega = 2L(2L_\perp)^2$. One can relabel the summation indices in the above equation and get identically

$$ P^+ P^-_{pgf} = \frac{1}{4} \sum_{q_1,q_2} \sum_{q_3,q_4} F_{6,2}(q_1; q_2, q_3, q_4) \left( a^\dagger_{q_1} a_{q_2} a_{q_3} a_{q_4} + a_{q_2} a^\dagger_{q_1} a_{q_4} a_{q_3} + a_{q_3} a_{q_4} a^\dagger_{q_1} a_{q_2} + a_{q_4} a_{q_3} a_{q_2} a^\dagger_{q_1} \right) . $$

We can consider these expressions as the “time-ordered products” in the sense of Wick's theorem and bring them into normal-ordered form. In the present case all pairwise contractions are either zero identically or vanish by the properties of $F_{6,2}$. The normal-ordered contribution to the Hamiltonian then becomes

$$ P^+ P^-_{pgf} = \sum_{q_1,q_2,q_3,q_4} F_{6,2}(q_1; q_2, q_3, q_4)\, a^\dagger_{q_1} a_{q_2} a_{q_3} a_{q_4} \equiv \sum_{1,2,3,4} F_{6,2}(1; 2, 3, 4)\, a_1^\dagger a_2 a_3 a_4 . \tag{4.31} $$
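Table 6. Matrix elements of Dirac spinors $\bar v(p) M u(q)$.

| $M$ | $\frac{1}{\sqrt{p^+q^+}}\,(\bar v(p) M u(q))\,\delta_{\lambda_p,\lambda_q}$ | $\frac{1}{\sqrt{p^+q^+}}\,(\bar v(p) M u(q))\,\delta_{\lambda_p,-\lambda_q}$ |
|---|---|---|
| $\gamma^+$ | $0$ | $2$ |
| $\gamma^-$ | $\frac{2m}{p^+q^+}\,(p_\perp(\lambda_q) + q_\perp(\lambda_q))$ | $\frac{2}{p^+q^+}\,(\mathbf{p}_\perp \cdot \mathbf{q}_\perp - m^2 + i\lambda_q\, \mathbf{p}_\perp \wedge \mathbf{q}_\perp)$ |
| $\gamma_\perp \cdot \mathbf{a}_\perp$ | $a_\perp(\lambda_q) \left( \frac{m}{p^+} + \frac{m}{q^+} \right)$ | $\mathbf{a}_\perp \cdot \left( \frac{\mathbf{p}_\perp}{p^+} + \frac{\mathbf{q}_\perp}{q^+} \right) - i\lambda_q\, \mathbf{a}_\perp \wedge \left( \frac{\mathbf{p}_\perp}{p^+} - \frac{\mathbf{q}_\perp}{q^+} \right)$ |
| $1$ | $\frac{p_\perp(\lambda_q)}{p^+} + \frac{q_\perp(\lambda_q)}{q^+}$ | $-\frac{m}{p^+} + \frac{m}{q^+}$ |
| $\gamma^-\gamma^+\gamma^-$ | $\frac{8m}{p^+q^+}\,(p_\perp(\lambda_q) + q_\perp(\lambda_q))$ | $\frac{8}{p^+q^+}\,(\mathbf{p}_\perp \cdot \mathbf{q}_\perp - m^2 + i\lambda_q\, \mathbf{p}_\perp \wedge \mathbf{q}_\perp)$ |
| $\gamma^-\gamma^+\,\gamma_\perp \cdot \mathbf{a}_\perp$ | $\frac{4m}{p^+}\, a_\perp(\lambda_q)$ | $\frac{4}{p^+}\,(\mathbf{a}_\perp \cdot \mathbf{p}_\perp - i\lambda_q\, \mathbf{a}_\perp \wedge \mathbf{p}_\perp)$ |
| $\mathbf{a}_\perp \cdot \gamma_\perp\, \gamma^+\gamma^-$ | $\frac{4m}{q^+}\, a_\perp(\lambda_q)$ | $\frac{4}{q^+}\,(\mathbf{a}_\perp \cdot \mathbf{q}_\perp + i\lambda_q\, \mathbf{a}_\perp \wedge \mathbf{q}_\perp)$ |
| $\mathbf{a}_\perp \cdot \gamma_\perp\, \gamma^+\, \gamma_\perp \cdot \mathbf{b}_\perp$ | $0$ | $2\,(\mathbf{a}_\perp \cdot \mathbf{b}_\perp + i\lambda_q\, \mathbf{a}_\perp \wedge \mathbf{b}_\perp)$ |

Notation: $\lambda = \pm 1$, $a_\perp(\lambda) = -\lambda a_x - i a_y$, $\mathbf{a}_\perp \cdot \mathbf{b}_\perp = a_x b_x + a_y b_y$, $\mathbf{a}_\perp \wedge \mathbf{b}_\perp = a_x b_y - a_y b_x$.
Symmetries: $\bar v(p) v(q) = -\bar u(q) u(p)$, $\bar v(p) \gamma^\mu v(q) = \bar u(q) \gamma^\mu u(p)$, $\bar v(p) \gamma^\mu \gamma^\nu \gamma^\rho v(q) = \bar u(q) \gamma^\rho \gamma^\nu \gamma^\mu u(p)$.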
In the second step a self-explanatory “compact notation” was introduced which will be used quite often in the sequel. Below, one refers to these contributions as the “fork part”, since their energy diagrams in Table 2 look like the analogous silverware. Next, consider terms with two creation and two destruction operators. There are six of them,

$$ \begin{aligned} 2 P^+ P^-_{pgs} = {} & \sum_{1,2,3,4} \left( a_1^\dagger a_2^\dagger a_3 a_4\, S_{7,3}(1, 2; 3, 4) + a_4 a_3 a_2^\dagger a_1^\dagger\, S_{7,3}(3, 4; 1, 2) \right) \\ + {} & \sum_{1,2,3,4} \left( a_1^\dagger a_3 a_2^\dagger a_4\, S_{7,4}(1, 2; 3, 4) + a_2^\dagger a_4 a_1^\dagger a_3\, S_{7,4}(2, 1; 4, 3) \right) \\ + {} & \sum_{1,2,3,4} \left( a_1^\dagger a_3 a_4 a_2^\dagger\, S_{7,5}(1, 2; 3, 4) + a_2^\dagger a_4 a_3 a_1^\dagger\, S_{7,5}(2, 1; 4, 3) \right) , \end{aligned} \tag{4.32} $$
using compact notation. The functions $S_{7,3}$, $S_{7,4}$ and $S_{7,5}$ can be found in Table 4. According to Wick's theorem, the normal-ordered part is

$$ P^+ P^-_{pgs} = \sum_{1,2,3,4} \left( S_{7,3}(1, 2; 3, 4) + S_{7,4}(1, 2; 3, 4) + S_{7,5}(1, 2; 3, 4) \right) a_1^\dagger a_2^\dagger a_3 a_4 . \tag{4.33} $$

The normal-ordered part of all possible, non-vanishing pairwise contractions is called the contraction term

$$ P^+ P^-_{pgc} = \sum_{1,2} a_1^\dagger a_1 \left( S_{7,3}(1, 2; 1, 2) + S_{7,3}(1, 2; 2, 1) + S_{7,4}(1, 2; 1, 2) + S_{7,5}(1, 2; 2, 1) \right) . \tag{4.34} $$

As a bona fide operator it should not be dropped from the outset. It gives rise to the self-induced inertias as tabulated in Table 3. Finally, focus on terms with only creation or only destruction operators. Integration over the space-like coordinates leads to a product of three Kronecker deltas

$$ \delta(k_1^+ + k_2^+ + k_3^+ + k_4^+|0)\, \delta^{(2)}(\mathbf{k}_{\perp 1} + \mathbf{k}_{\perp 2} + \mathbf{k}_{\perp 3} + \mathbf{k}_{\perp 4}|0) , \tag{4.35} $$

as a consequence of momentum conservation. With $k^+ = n\pi/(2L)$ and $n$ positive one has thus

$$ \delta(n_1 + n_2 + n_3 + n_4|0) \equiv 0 . \tag{4.36} $$

The sum of positive numbers can never add up to zero, and therefore all parts of the light-cone Hamiltonian with only creation operators or only destruction operators are strictly zero. This is the deeper reason why the vacuum state cannot couple to any Fock state and why the Fock-space vacuum is identical with the physical vacuum. “The vacuum is trivial.” This holds in general, as long as one disregards the impact of the so-called zero modes, see, for example, Refs. [258–262,358] and Section 7.

4.3.2. The explicit Hamiltonian for QCD

Unlike in the instant form, the front-form Hamiltonian is additive in the free part $T$ and the interaction $U$,

$$ H \equiv H_{LC} = P^+ P^- - \mathbf{P}_\perp^2 = T + U . \tag{4.37} $$

The kinetic energy $T$ is defined as that part of $H$ which is independent of the coupling constant. It is the sum of the three diagonal operators

$$ T = T_1 + T_2 + T_3 = \sum_q \left( t_1(q)\, b_q^\dagger b_q + t_2(q)\, d_q^\dagger d_q + t_3(q)\, a_q^\dagger a_q \right) , \tag{4.38} $$

with the coefficient functions

$$ t_1(q) = t_2(q) = \left( \frac{m^2 + \mathbf{k}_\perp^2}{x} \right)_q , \qquad t_3(q) = \left( \frac{\mathbf{k}_\perp^2}{x} \right)_q . \tag{4.39} $$

The interaction energy $U$ breaks up into 20 different operators, grouped into [66]

$$ U = V + F + S + C . \tag{4.40} $$

They will be defined one after another.
The vertex interaction $V$ is a sum of four operators,

$$ \begin{aligned} V = V_1 + V_2 + V_3 + V_4 = {} & \sum_{1,2,3} \left[ b_1^\dagger b_2 a_3\, V_1(1; 2, 3) + \mathrm{h.c.} \right] + \sum_{1,2,3} \left[ d_1^\dagger d_2 a_3\, V_2(1; 2, 3) + \mathrm{h.c.} \right] \\ + {} & \sum_{1,2,3} \left[ a_1^\dagger d_2 b_3\, V_3(1; 2, 3) + \mathrm{h.c.} \right] + \sum_{1,2,3} \left[ a_1^\dagger a_2 a_3\, V_4(1; 2, 3) + \mathrm{h.c.} \right] . \end{aligned} \tag{4.41} $$

It changes the particle number by 1. The matrix elements $V_n(1; 2, 3)$ are c-numbers with $V_2(1; 2, 3) = -V_1^\star(1; 2, 3)$. They are functions of the various single-particle momenta $k^+$, $\mathbf{k}_\perp$, helicities, colors and flavors, as tabulated in Tables 1 and 9. Again, we use the compact notation $b_i = b(q_i)$ and $V_n(1; 2, 3) = V_n(q_1; q_2, q_3)$, and again we emphasize that the graphs in these tables are energy graphs but not Feynman diagrams. They symbolize matrix elements but not scattering amplitudes. They conserve three-momentum but not four-momentum. One also should emphasize that the present labeling of matrix elements is different from Ref. [66].

The fork interaction $F$ is a sum of six operators,

$$ \begin{aligned} F = F_1 + F_2 + F_3 + F_4 + F_5 + F_6 = {} & \sum_{1,2,3,4} \Big\{ \left[ b_1^\dagger b_2 d_3 b_4\, F_1(1; 2, 3, 4) + \mathrm{h.c.} \right] + \left[ d_1^\dagger d_2 b_3 d_4\, F_2(1; 2, 3, 4) + \mathrm{h.c.} \right] \Big\} \\ + {} & \sum_{1,2,3,4} \Big\{ \left[ b_1^\dagger b_2 a_3 a_4\, F_3(1; 2, 3, 4) + \mathrm{h.c.} \right] + \left[ d_1^\dagger d_2 a_3 a_4\, F_4(1; 2, 3, 4) + \mathrm{h.c.} \right] \Big\} \\ + {} & \sum_{1,2,3,4} \Big\{ \left[ a_1^\dagger a_2 d_3 b_4\, F_5(1; 2, 3, 4) + \mathrm{h.c.} \right] + \left[ a_1^\dagger a_2 a_3 a_4\, F_6(1; 2, 3, 4) + \mathrm{h.c.} \right] \Big\} . \end{aligned} \tag{4.42} $$

It changes the particle number by 2. The matrix elements $F_n(1; 2, 3, 4)$ and their graphs are tabulated in Tables 2 and 10, with $F_2 = F_1$ and $F_4 = F_3$, and for example $F_5 = F_{5,1} + F_{5,2} + F_{5,3}$.

The seagull interaction $S$ is a sum of seven operators,

$$ \begin{aligned} S = S_1 + S_2 + S_3 + S_4 + S_5 + S_6 + S_7 = {} & \sum_{1,2,3,4} b_1^\dagger b_2^\dagger b_3 b_4\, S_1(1, 2; 3, 4) + \sum_{1,2,3,4} d_1^\dagger d_2^\dagger d_3 d_4\, S_2(1, 2; 3, 4) + \sum_{1,2,3,4} b_1^\dagger d_2^\dagger b_3 d_4\, S_3(1, 2; 3, 4) \\ + {} & \sum_{1,2,3,4} b_1^\dagger a_2^\dagger b_3 a_4\, S_4(1, 2; 3, 4) + \sum_{1,2,3,4} d_1^\dagger a_2^\dagger d_3 a_4\, S_5(1, 2; 3, 4) \\ + {} & \sum_{1,2,3,4} \left( b_1^\dagger d_2^\dagger a_3 a_4\, S_6(1, 2; 3, 4) + \mathrm{h.c.} \right) + \sum_{1,2,3,4} a_1^\dagger a_2^\dagger a_3 a_4\, S_7(1, 2; 3, 4) . \end{aligned} \tag{4.43} $$
It does not change particle number. The matrix elements $S_n(1, 2; 3, 4)$ and their energy graphs are tabulated in Tables 4 and 12, with $S_2 = S_1$ and $S_5 = S_4$.

The contraction operator $C$ is a sum of three terms,

$$ C = C_1 + C_2 + C_3 = \sum_q \left( C_1(q)\, b_q^\dagger b_q + C_2(q)\, d_q^\dagger d_q + C_3(q)\, a_q^\dagger a_q \right) , \tag{4.44} $$

using the symbolic labeling of Eq. (4.24). The coefficient functions $C_i$ are

$$ C_1(q) = \left( \frac{I_1(q)}{x} \right)_q , \qquad C_3(q) = \left( \frac{I_3(q)}{x} \right)_q . \tag{4.45} $$

For the same flavors one has $C_2(q) = C_1(q)$. The functions $I_i$ are the so-called self-induced inertias, in analogy to the mass terms in Eq. (4.38) [354]. They along with their graphs are tabulated in Tables 3 and 11. The contraction operators arise from bringing $P^-$ into normal-ordered form [354,355]. They are part of the operator structure and should not be omitted. But their structure allows one to interpret them as mass terms which often can be absorbed into the mass counter terms. Such counter terms are often introduced when regulating the theory, see below. The contraction terms diverge badly. In the continuum limit, i.e.

$$ C_1 \to \sum_{\lambda,c,f} \int dk^+\, d^2k_\perp\, \frac{I_1(q)}{x}\, \tilde b^\dagger(q)\, \tilde b(q) , \tag{4.46} $$

they diverge like $C_i \sim \Lambda^2$, where $\Lambda$ is the cut-off scale to be introduced below in Section 4.4.1.

Summarizing these considerations, one can state that the light-cone Hamiltonian $H \equiv H_{LC}$ consists of 23 operators with different operator structure. Some of the pieces are diagonal, some conserve the particle number, and some change it. The piece $S_6$ is special since it makes two gluons out of a quark-antiquark pair. It conserves the charge, but it changes the number of gluons. This was displayed already in Fig. 2 and it is emphasized again in Table 7. The block matrix structure of this table and the division of the Fock space into sectors will be discussed more thoroughly in Section 4.4.

Table 7. The Fock-space sectors and the Hamiltonian block matrix structure for QCD. Diagonal blocks are marked by D. Off-diagonal blocks are labeled by $V$, $F$ and $S_6$, corresponding to vertex, fork and seagull interactions, respectively. Zero matrices are denoted by dots. Taken from Ref. [361]; see also Fig. 2.

| Sector | $n$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $q\bar q$ | 1 | D | $S_6$ | $V$ | $F$ | . | $F$ | . | . | . | . | . | . | . |
| $gg$ | 2 | $S_6$ | D | $V$ | . | $V$ | $F$ | . | . | $F$ | . | . | . | . |
| $q\bar q\, g$ | 3 | $V$ | $V$ | D | $V$ | $S_6$ | $V$ | $F$ | . | . | $F$ | . | . | . |
| $q\bar q\, q\bar q$ | 4 | $F$ | . | $V$ | D | . | $S_6$ | $V$ | $F$ | . | . | $F$ | . | . |
| $ggg$ | 5 | . | $V$ | $S_6$ | . | D | $V$ | . | . | $V$ | $F$ | . | . | . |
| $q\bar q\, gg$ | 6 | $F$ | $F$ | $V$ | $S_6$ | $V$ | D | $V$ | . | $S_6$ | $V$ | $F$ | . | . |
| $q\bar q\, q\bar q\, g$ | 7 | . | . | $F$ | $V$ | . | $V$ | D | $V$ | . | $S_6$ | $V$ | $F$ | . |
| $q\bar q\, q\bar q\, q\bar q$ | 8 | . | . | . | $F$ | . | . | $V$ | D | . | . | $S_6$ | $V$ | $F$ |
| $gggg$ | 9 | . | $F$ | . | . | $V$ | $S_6$ | . | . | D | $V$ | . | . | . |
| $q\bar q\, ggg$ | 10 | . | . | $F$ | . | $F$ | $V$ | $S_6$ | . | $V$ | D | $V$ | . | . |
| $q\bar q\, q\bar q\, gg$ | 11 | . | . | . | $F$ | . | $F$ | $V$ | $S_6$ | . | $V$ | D | $V$ | . |
| $q\bar q\, q\bar q\, q\bar q\, g$ | 12 | . | . | . | . | . | . | $F$ | $V$ | . | . | $V$ | D | $V$ |
| $q\bar q\, q\bar q\, q\bar q\, q\bar q$ | 13 | . | . | . | . | . | . | . | $F$ | . | . | . | $V$ | D |

Some of the operators are diagonal ($D$) and some of them off-diagonal ($R$) in the sector number, i.e.

$$ D = T + C + S - S_6 = T_1 + T_2 + T_3 + C_1 + C_2 + C_3 + S_1 + S_2 + S_3 + S_4 + S_5 + S_7 , \tag{4.47} $$

$$ R = V + F + S_6 = V_1 + V_2 + V_3 + V_4 + F_1 + F_2 + F_3 + F_4 + F_5 + F_6 + S_6 . \tag{4.48} $$

Most of the blocks are actually plain zero matrices, since the light-cone Hamiltonian

$$ H \equiv H_{LC} = D + R \tag{4.49} $$

has zero matrix element between the corresponding sectors, see Fig. 2 and Table 7. One should make use of that! In DLCQ one aims at diagonalizing the Hamiltonian

$$ H |\Psi_i\rangle = E_i |\Psi_i\rangle , \qquad E_i = M_i^2 . \tag{4.50} $$

In principle, its eigenvalues and eigenfunctions are equivalent to the compactified gauge-field Lagrangian in the light-cone gauge.

4.4. The Hamiltonian matrix and its regularization

The Hilbert space for the single-particle creation and destruction operators is the Fock space, i.e. the complete set of all possible Fock states

$$ |\Phi_i\rangle = \sqrt{N_i}\; b^\dagger_{q_1} b^\dagger_{q_2} \cdots b^\dagger_{q_N}\; d^\dagger_{q_1} d^\dagger_{q_2} \cdots d^\dagger_{q_{\bar N}}\; a^\dagger_{q_1} a^\dagger_{q_2} \cdots a^\dagger_{q_{\tilde N}} |0\rangle . \tag{4.51} $$

They are the analogues of the Slater determinants of Section 4.1. As a consequence of discretization, the Fock states are orthonormal and enumerable. Only one Fock state, the reference state or Fock-space vacuum $|0\rangle$, is annihilated by all destruction operators. It is natural to decompose the Fock space into sectors, labeled with the number of quarks, antiquarks and gluons, $N$, $\bar N$ and $\tilde N$, respectively. Mesons (or positronium) have total charge $Q = 0$, and thus $N = \bar N$. These sectors can be arranged arbitrarily, and can be enumerated differently. A particular example was given in Fig. 2 and Table 7. A second and not unreasonable choice is given in Fig. 11, where the Fock-space sectors are arranged according to total particle number $N + \bar N + \tilde N$. The resulting block matrix structure is that of a penta-diagonal block matrix, very much in analogy to the block matrix structure of a non-relativistic many-body Hamiltonian, see Fig. 9.

Since all components of the energy-momentum commute with each other, and since the space-like momenta are diagonal in momentum representation, all Fock states must have the same value of $P^+ = \sum_\nu p^+_\nu$ and $\mathbf{P}_\perp = \sum_\nu (\mathbf{p}_\perp)_\nu$, with the sums running over all partons $\nu \in n$ in a particular Fock-space sector. For any fixed $P^+$ and thus for any fixed resolution $K$, the number of Fock-space sectors is finite.
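The finiteness of the sector list and the sparsity of the block matrix can be illustrated with a short sketch. It rests on two assumptions that are spelled out here rather than taken from the text: (i) following Eq. (4.22), every quark or antiquark carries at least 1/2 and every gluon at least 1 unit of the resolution $K$, so that a meson sector with a given number of $q\bar q$ pairs and gluons requires pairs + gluons $\le K$; (ii) the block type follows from the operator content of Eqs. (4.41)–(4.43), i.e. $V$ changes the gluon number by one or converts a gluon into a pair, $F$ changes the pair number by one or the gluon number by two, and $S_6$ converts two gluons into a pair. The function names `meson_sectors` and `block_type` are hypothetical.

```python
def meson_sectors(K):
    """Enumerate meson Fock-space sectors as (pairs, gluons) at resolution K.

    Assumption: a sector needs pairs + gluons <= K (minimal momenta 1/2 per
    fermion and 1 per gluon).  The vacuum (0, 0) and the single gluon (0, 1)
    are excluded because they are not color-singlet meson states.
    """
    sectors = [(p, g)
               for p in range(K + 1)
               for g in range(K - p + 1)
               if (p, g) not in {(0, 0), (0, 1)}]
    # Order by total particle number, then by number of pairs.
    return sorted(sectors, key=lambda s: (2 * s[0] + s[1], s[0]))


def block_type(s1, s2):
    """Label the Hamiltonian block between two sectors: D, V, F, S6 or '.'."""
    dp, dg = s2[0] - s1[0], s2[1] - s1[1]
    if (dp, dg) == (0, 0):
        return "D"
    # The matrix is hermitian, so only the relative change matters.
    if dp < 0 or (dp == 0 and dg < 0):
        dp, dg = -dp, -dg
    rules = {
        (0, 1): "V",    # gluon emission or absorption
        (1, -1): "V",   # gluon converts into a quark-antiquark pair
        (1, 0): "F",    # fork creating a pair at fixed gluon number
        (0, 2): "F",    # fork changing the gluon number by two
        (1, -2): "S6",  # two gluons convert into a pair
    }
    return rules.get((dp, dg), ".")
```

For $K = 4$ this enumeration gives 13 sectors, and the blocks between $q\bar q$ and $gg$, $q\bar q\,g$ and $q\bar q\,q\bar q$ come out as $S_6$, $V$ and $F$, in line with the first row of Table 7.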
As a consequence, the DLCQ Hamiltonian matrix has a finite number of blocks, as illustrated in Fig. 11. For the example chosen, the maximum parton number is 5, corresponding to 11 sectors. In 3+1 dimensions, the number of Fock states is unlimited within each sector. After regulating the formalism, as discussed in Section 4.4.1, the number of Fock states in a sector is strictly finite. The light-cone energy operator $P^-(a_q, a_q^\dagger, b_q, b_q^\dagger, d_q, d_q^\dagger)$ is realized [66] as a strictly finite
Fig. 11. The Hamiltonian matrix for a meson. Allowing for a maximum parton number 5, the Fock space can be divided into 11 sectors. Each sector contains enumerably many Fock states. Matrix elements are represented by “energy” diagrams which are characteristic for each block. Blocks with no diagrams are zero matrices. Note that the figure mixes aspects of QCD where the single gluon is absent and of QED which has no three-photon vertices.
Heisenberg matrix with strictly finite matrix elements. From a technical point of view this is simpler than the complicated integral equations discussed in Section 3.

4.4.1. Fock-space regularization

In an arbitrary frame, each particle is on its mass shell, $p^2 = m^2$. Its four-momentum is $p^\mu = (p^+, \mathbf{p}_\perp, p^-)$ with $p^- = (m^2 + \mathbf{p}_\perp^2)/p^+$. For the free theory ($g = 0$), the total four-momentum is $P^\mu_{\rm free} = \sum_\nu p^\mu_\nu$, where the index $\nu$ runs over all particles in a particular Fock-space sector $|\Phi_n\rangle = \langle n|\Phi_n\rangle$. The components of the free four-momentum are

$$ P^\mu_{\rm free} = \sum_{\nu \in n} (p^\mu)_\nu . \tag{4.52} $$

Note that the spatial components are $P^\mu = P^\mu_{\rm free}$. As usual, one introduces intrinsic momenta $x$ and $\mathbf{k}_\perp$ by

$$ x_\nu = \frac{p^+_\nu}{P^+} , \qquad (\mathbf{p}_\perp)_\nu = (\mathbf{k}_\perp)_\nu + x_\nu \mathbf{P}_\perp . \tag{4.53} $$
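The change of variables in Eq. (4.53) can be checked numerically. The following sketch (with the hypothetical helper names `intrinsic_momenta` and `free_invariant_mass_sq`, and partons represented as plain tuples, which is an assumption made only for this illustration) converts single-particle momenta to intrinsic ones and evaluates the sum $\sum_\nu (m^2 + \mathbf{k}_\perp^2)_\nu / x_\nu$. For on-shell partons this sum equals $P^+ P^-_{\rm free} - \mathbf{P}_\perp^2$, with $p^- = (m^2 + \mathbf{p}_\perp^2)/p^+$ as given above.

```python
def intrinsic_momenta(partons):
    """Convert partons (p_plus, p_x, p_y, m) to intrinsic (x, k_x, k_y, m).

    Implements x = p+/P+ and k_perp = p_perp - x * P_perp, cf. Eq. (4.53).
    Returns the totals P+, (P_x, P_y) and the list of intrinsic momenta.
    """
    P_plus = sum(p[0] for p in partons)
    P_x = sum(p[1] for p in partons)
    P_y = sum(p[2] for p in partons)
    intrinsic = []
    for p_plus, p_x, p_y, m in partons:
        x = p_plus / P_plus
        intrinsic.append((x, p_x - x * P_x, p_y - x * P_y, m))
    return P_plus, (P_x, P_y), intrinsic


def free_invariant_mass_sq(intrinsic):
    """Sum of (m^2 + k_perp^2) / x over all partons of a Fock state."""
    return sum((m * m + kx * kx + ky * ky) / x for x, kx, ky, m in intrinsic)
```

By construction the momentum fractions add up to one and the intrinsic transverse momenta add up to zero, independently of the frame in which the single-particle momenta are given.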
The spatial components of Eq. (4.52) then become the constraints

$$ \sum_\nu x_\nu = 1 , \qquad \sum_\nu (\mathbf{k}_\perp)_\nu = 0 , \tag{4.54} $$

while the free invariant mass of a Fock state becomes

$$ M^2_{\rm free} = P^\mu_{\rm free} P_{{\rm free},\mu} = \left( \sum_{\nu \in n} (p^\mu)_\nu \right) \left( \sum_{\nu \in n} (p_\mu)_\nu \right) \tag{4.55} $$

and, therefore,

$$ M^2_{\rm free} = \sum_{\nu \in n} \left( \frac{m^2 + \mathbf{k}_\perp^2}{x} \right)_\nu . \tag{4.56} $$

The free invariant mass squared has a minimum with respect to $\mathbf{k}_\perp$ and $x$, at $\mathbf{k}_\perp = 0$ and at

$$ (x_\nu)_{\rm min} = \frac{m_\nu}{\sum_{\nu \in n} m_\nu} , \tag{4.57} $$

respectively, and has the value

$$ \bar M^2_{\rm free} \equiv \min \left( \sum_{\nu \in n} \left( \frac{m^2 + \mathbf{k}_\perp^2}{x} \right)_\nu \right) \simeq \left( \sum_{\nu \in n} m_\nu \right)^2 . \tag{4.58} $$

In the continuum limit the equality is strict. Physically, this corresponds to having all constituents at the same rapidity. The minimal mass-squared is frozen-in and cannot be shared by the particles off-shell. The available mass-squared $M^2_{\rm free} - \bar M^2_{\rm free}$ is therefore the physically meaningful quantity. It plays the same role in DLCQ as the kinetic energy $T$ in non-relativistic quantum mechanics, see also Section 4.1. The analogy can be used to regulate the Fock space: a Fock state is admitted only when its “off-shell” kinetic energy is below a certain cut-off, i.e.

$$ \sum_{\nu \in n} \left( \frac{m^2 + \mathbf{k}_\perp^2}{x} \right)_\nu - \bar M^2_{\rm free} \le \Lambda^2 . \tag{4.59} $$

Except for the term $\bar M^2_{\rm free}$, this dynamic regularization scheme is nothing but the Brodsky-Lepage regularization [62,53,299,300]. It also defines a factorization scheme for hard scattering processes in perturbative QCD. Since only Lorentz scalars appear, this regularization is Lorentz-invariant but not necessarily gauge-invariant. The constant $\Lambda$ has the dimension of a $\langle$mass$\rangle$ and is at our disposal. Other cut-offs have also been proposed [363,456]. It is understood that the cut-off scale can be made sector dependent. DLCQ has an option for having as many “sector-dependent regularization parameters” as might be convenient. As a result of Fock-space regularization the Fock space, and therefore the dimension of the Hamiltonian matrix, is strictly finite.

4.4.2. Vertex regularization

Fock-space regularization turns out to be insufficient when dealing with products of operators like $VV$. More specifically, sums over intermediate states diverge badly for almost all matrix elements $\langle 1|VV|4\rangle = \sum_{2,3} \langle 1|V|2, 3\rangle \langle 2, 3|V|4\rangle$. One must introduce additional regularization
schemes, which is not always easy. One possible choice is to regulate the interaction on the operator level, for example by multiplying each matrix element in Eq. (4.41) with a cut-off function $\Theta$. For the vertex interaction one can thus define

$$ V_n(q_1; q_2, q_3) \Longrightarrow \bar V_n(q_1; q_2, q_3) = V_n(q_1; q_2, q_3)\, \Theta_V(q_1; q_2, q_3) , \quad \text{with } \Theta_V(q_1; q_2, q_3) = \Theta_Q(q_1; q_2)\, \Theta_M(q_2, q_3) . \tag{4.60} $$

The two step functions are

$$ \Theta_Q(q_1; q_2) = \begin{cases} 1 & \text{if } [(p_1 - p_2)^2 - (m_1 - m_2)^2] \le \Lambda^2 , \\ 0 & \text{otherwise,} \end{cases} \tag{4.61} $$

$$ \Theta_M(q_1; q_2) = \begin{cases} 1 & \text{if } [(p_1 + p_2)^2 - (m_1 + m_2)^2] \le \Lambda^2 , \\ 0 & \text{otherwise.} \end{cases} \tag{4.62} $$

The single-particle momenta $p_i$ are those associated with the state $q_i$, see above. $\Theta_Q$ is a measure for the momentum transfer and $\Theta_M$ a measure for the off-shell mass induced across the vertex. The scale parameter $\Lambda$ may be (or may not be) the same as in Section 4.4.1. This vertex regularization realizes what has been referred to by Lepage as a local regulator, as opposed to the global Fock-space regulator in Eq. (4.59). It is generalizable to forks and seagulls and reads there

$$ F_n(q_1; q_2, q_3, q_4) \Longrightarrow \bar F_n(q_1; q_2, q_3, q_4) = F_n(q_1; q_2, q_3, q_4)\, \Theta_F(q_1; q_2, q_3, q_4) , \quad \text{with } \Theta_F(q_1; q_2, q_3, q_4) = \Theta_Q(q_1; q_2)\, \Theta_M(q_3, q_4) , \tag{4.63} $$

$$ S_n(q_1, q_2; q_3, q_4) \Longrightarrow \bar S_n(q_1, q_2; q_3, q_4) = S_n(q_1, q_2; q_3, q_4)\, \Theta_S(q_1, q_2; q_3, q_4) , \quad \text{with } \Theta_S(q_1, q_2; q_3, q_4) = \Theta_Q(q_1; q_2)\, \Theta_Q(q_3, q_4)\, \Theta_M(q_3, q_4) . \tag{4.64} $$

It regulates automatically the contractions, i.e. the regulated expressions in Eq. (4.34) become finite:

$$ P^+ P^-_{pgc} = \sum_{1,2} a_1^\dagger a_1 \left( \bar S_{7,3}(1, 2; 1, 2) + \bar S_{7,3}(1, 2; 2, 1) + \bar S_{7,4}(1, 2; 1, 2) + \bar S_{7,5}(1, 2; 2, 1) \right) . \tag{4.65} $$

Note that vertex regularization is frame-independent.
4.4.3. Renormalization

Renormalization is simple in principle: the eigenvalues may not depend on the regulator scale(s) $\Lambda$. Thus, if the eigenvalue equation is

$$H(\Lambda)|\Psi_i\rangle = E_i(\Lambda)|\Psi_i\rangle, \qquad E_i = M_i^2, \tag{4.66}$$

one must require that

$$\frac{dE_i(\Lambda)}{d\Lambda} = 0 \quad\text{for all } i, \tag{4.67}$$
up to terms of order $1/\Lambda$. To require this is easier than to find a practical realization. As a matter of fact, none has been found yet, irrespective of whether one deals with the compactified or the continuum theory, see Section 8.

4.4.4. The key challenge in DLCQ

In principle, one can proceed for 3+1 dimensions as in Section 4.2 for 1+1 dimensions: one selects a particular value of the harmonic resolution $K$ and of the cut-off $\Lambda$, and diagonalizes the finite-dimensional Hamiltonian matrix by numerical methods. If this is all so simple, why was the problem not solved long ago? What is the key problem?

The bottleneck of any field-theoretic Hamiltonian approach in physical space-time is that the dimension of the Hamiltonian matrix increases exponentially fast with $\Lambda$, and that one may not simply truncate the Fock space at will, because of gauge invariance. Let us consider the concrete example of harmonic resolution $K=5$, as illustrated in Fig. 2 and Table 7. Suppose the regularization procedure allows for 10 discrete momentum states in each direction. A single particle then has about $10^3$ degrees of freedom, and a Fock-space sector with $n$ particles has roughly $10^{3(n-1)}$ different Fock states. The $q\bar q$-sector thus has about $10^3$ Fock states, and sector 13 in Fig. 2 with its 8 particles has about $10^{21}$ Fock states. Chemists are able to handle matrices with some $10^7$ dimensions, but $10^{21}$ dimensions exceed the calculational capacity of any computer in the foreseeable future. The problem is a grave one, in particular since one has to diagonalize the Hamiltonian for $\Lambda\to\infty$.

For physical space-time one is thus thrown back to the same (insoluble?) problem as in conventional many-body physics, as displayed in Fig. 9: one has to diagonalize finite matrices with exponentially large dimensions (typically $>10^6$). In fact, in quantum field theory the problem is worse, since particle number is not conserved.
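The counting argument above can be made concrete with a few lines of Python. This is only a sketch of the order-of-magnitude estimate: the function names are hypothetical, and the exponent is read as $3(n-1)$, which is the reading that reproduces both numbers quoted in the text ($10^3$ for the $q\bar q$-sector and $10^{21}$ for the 8-particle sector).

```python
# Order-of-magnitude count of Fock states per sector, following the estimate in the text.
# Assumptions (illustrative, not taken from a real calculation):
#   - 10 discrete momentum states per spatial direction, 3 directions,
#   - a sector with n particles has about (10**3)**(n - 1) states, which is the
#     reading that reproduces both numbers quoted in the text (10^3 and 10^21).

STATES_PER_DIRECTION = 10
SPATIAL_DIMENSIONS = 3


def fock_states(n_particles: int) -> int:
    """Rough number of Fock states in a sector with n_particles particles."""
    if n_particles < 1:
        raise ValueError("a sector needs at least one particle")
    per_particle = STATES_PER_DIRECTION ** SPATIAL_DIMENSIONS
    return per_particle ** (n_particles - 1)


def order_of_magnitude(count: int) -> int:
    """Exponent p such that count is about 10**p (exact for powers of ten)."""
    return len(str(count)) - 1


if __name__ == "__main__":
    for n in (2, 4, 6, 8):
        print(f"n = {n}: about 10^{order_of_magnitude(fock_states(n))} Fock states")
```

Changing `STATES_PER_DIRECTION` shows how quickly the dimension of the largest sector outgrows anything that can be diagonalized directly.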
Even advanced numerical methods or the methods of modern quantum chemistry are apparently insufficient for the task. One needs to develop effective interactions which act in smaller matrix spaces and are still related to the full interaction. In a way, deriving an effective interaction can be understood as the aim of reducing the dimension of a matrix in a diagonalization problem from $10^{21}$ to $10^3$! Here we mention three perhaps promising developments, namely the Tamm-Dancoff approach, the similarity scheme and the Hamiltonian flow equations, but for physical space-time the final breakthrough has not yet been achieved.

1. The effective interaction à la Tamm [421] and Dancoff [117] can be generalized by the method of iterated resolvents [188,357,361,362] to avoid the most brutal violations of gauge invariance. Because of this, Wolfgang Pauli is reported to have called the Tamm-Dancoff approach "the most stupid idea I have ever seen". This approach will be presented more thoroughly in Section 4.7.
2. The similarity scheme of Głazek and Wilson [184,185,456] will be discussed in Section 8.
3. The Hamiltonian flow equations have been proposed recently by Wegner [188,447]. In view of recent work by Lenz and Wegner on the particle-phonon model in solid state theory [294] they seem to be rather promising. Wegner has found and applied a unitary transformation $U(l)$ to an arbitrary matrix, as given for example in Table 8, such that $U(l)$ depends on a continuous parameter $l$. It has the property that

$$\frac{d}{dl}\,\mathrm{Tr}\big(R(l)R(l)\big) < 0 , \tag{4.68}$$
Table 8. A typical Hamiltonian matrix with diagonal (D) and off-diagonal (R) block matrices, corresponding to H = D + R. Zero matrices are symbolized by dots.

 n    1  2  3  4  5  6  7  8  9 10 11 12 13
 1    D  R  R  R  .  R  .  .  .  .  .  .  .
 2    R  D  R  .  R  R  .  .  R  .  .  .  .
 3    R  R  D  R  R  R  R  .  .  R  .  .  .
 4    R  .  R  D  .  R  R  R  .  .  R  .  .
 5    .  R  R  .  D  R  .  .  R  R  .  .  .
 6    R  R  R  R  R  D  R  .  R  R  R  .  .
 7    .  .  R  R  .  R  D  R  .  R  R  R  .
 8    .  .  .  R  .  .  R  D  .  .  R  R  R
 9    .  R  .  .  R  R  .  .  D  R  .  .  .
10    .  .  R  .  R  R  R  .  R  D  R  .  .
11    .  .  .  R  .  R  R  R  .  R  D  R  .
12    .  .  .  .  .  .  R  R  .  .  R  D  R
13    .  .  .  .  .  .  .  R  .  .  .  R  D
for all $l$. In the limit $l\to\infty$, the off-diagonal blocks therefore tend to zero and can be neglected. Only the diagonal blocks survive and can be diagonalized blockwise. Except for some rather preliminary work [194], the flow equations have not yet been applied to gauge theory.

4.5. Further evaluation of the Hamiltonian matrix elements

The light-cone Hamiltonian matrix elements in Figs. 1–4 are expressed in terms of the Dirac spinors and polarization vectors, $u_\alpha(k,\lambda)$, $v_\alpha(k,\lambda)$ and $\epsilon_\mu(k,\lambda)$, respectively, which can be found in Appendices A and B. This representation is particularly useful for perturbative calculations, as we have seen in Section 3. However, the practitioner often needs these matrix elements as explicit functions of the single-particle momenta $k^+$ and $k_\perp$. The calculation is straightforward but cumbersome. To facilitate it, the tables of Lepage and Brodsky [299] on the spinor contractions $\bar u_\alpha\Gamma_{\alpha\beta}u_\beta$ and $\bar v_\alpha\Gamma_{\alpha\beta}u_\beta$ are included here in Tables 5 and 6, respectively, adapted to the present notation. The general symmetry relations between spinor matrix elements are given in Appendix A. Inserting them into the matrix elements of Tables 1–4, one obtains those in Tables 9–12, respectively. As in Section 2.7, one should emphasize that all of these tables hold for QED as well as for non-abelian gauge theory SU(N), including QCD. Essentially, they hold for arbitrary $n$-space and 1-time dimensions as well. Using the translation keys in Section 4.6, the matrix elements in all of these tables can be translated easily into the continuum formulation.

4.6. Retrieving the continuum formulation

As argued in Section 3, the continuum formulation of the Hamiltonian problem in gauge field theory, with its endless multiple integrals, is usually cumbersome and opaque. In DLCQ, the
Table 9. The explicit matrix elements of the vertex interaction in terms of Dirac spinors. The matrix elements $V_n$ are displayed on the right, the corresponding (energy) graphs on the left. All matrix elements are proportional to $\Delta_V = \hat g\,\delta(k_1^+|k_2^+ + k_3^+)\,\delta^{(2)}(k_{\perp,1}|k_{\perp,2}+k_{\perp,3})$, with $\hat g = gP^+/\sqrt{\Omega}$. In the continuum limit, see Section 4.6, one uses $\hat g = gP^+/\sqrt{2(2\pi)^3}$.
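continuum limit corresponds to harmonic resolution $K\to\infty$. The compactified formulation with its simple multiple sums is straightforward. The key relation is the connection between sums and integrals,

$$\int dk^+\, f(k^+,k_\perp) \;\Longleftrightarrow\; \frac{\pi}{2L}\sum_{n} f(k^+,k_\perp), \qquad \int d^2k_\perp\, f(k^+,k_\perp) \;\Longleftrightarrow\; \frac{\pi}{L_\perp}\sum_{n_\perp} f(k^+,k_\perp). \tag{4.69}$$

Combined they yield

$$\int dk^+\,d^2k_\perp\, f(k^+,k_\perp) \;\Longleftrightarrow\; \frac{2(2\pi)^3}{\Omega}\sum_{n,n_\perp} f(k^+,k_\perp). \tag{4.70}$$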
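The correspondence between momentum sums and integrals can be illustrated numerically in one dimension. The following is a minimal sketch under stated assumptions: the grid spacing is taken as $\pi/L$ purely for illustration (the precise prefactors of Eqs. (4.69) and (4.70) depend on the conventions of the paper), the test function is a Gaussian, and the names `momentum_sum` and `gaussian` are hypothetical.

```python
import math


def momentum_sum(f, box_length, k_max):
    """Discretized version of the integral of f(k) over (0, k_max].

    Momenta are k_n = n * dk with n = 1, 2, ... and spacing dk = pi / box_length
    (an assumed convention for this sketch).
    """
    dk = math.pi / box_length
    n_max = int(k_max / dk)
    return dk * sum(f(n * dk) for n in range(1, n_max + 1))


def gaussian(k):
    """Smooth, rapidly decaying test function."""
    return math.exp(-k * k)


# Exact value of the integral of exp(-k^2) from 0 to infinity.
EXACT = math.sqrt(math.pi) / 2.0

if __name__ == "__main__":
    for box_length in (10.0, 100.0, 1000.0):
        approx = momentum_sum(gaussian, box_length, k_max=10.0)
        print(f"L = {box_length:7.1f}: sum = {approx:.6f}, "
              f"error = {abs(approx - EXACT):.2e}")
```

The error shrinks in proportion to the grid spacing, i.e. like $1/L$, which is the sense in which the discretized sums go over into integrals as the box size grows.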
Table 10. The matrix elements of the fork interaction. The matrix elements $F_{n,j}$ are displayed on the right, the corresponding (energy) graphs on the left. All matrix elements are proportional to $\Delta = \tilde g^2\,\delta(k_1^+|k_2^+ + k_3^+ + k_4^+)\,\delta^{(2)}(k_{\perp,1}|k_{\perp,2}+k_{\perp,3}+k_{\perp,4})$, with $\tilde g^2 = g^2P^+/(2\Omega)$. In the continuum limit, see Section 4.6, one uses $\tilde g^2 = g^2P^+/(4(2\pi)^3)$.

Table 11. The matrix elements of the contractions. The self-induced inertias $I_{n,j}$ are displayed on the right, the corresponding (energy) graphs on the left. The number of colors and flavors is denoted by $N_c$ and $N_f$, respectively. In the discrete case, one uses $\bar g^2 = 2g^2/(\Omega P^+)$. In the continuum limit, see Section 4.6, one uses $\bar g^2 = g^2/(2\pi)^3$.
Table 12. The matrix elements of the seagull interaction. The matrix elements $S_{n,j}$ are displayed on the right, the corresponding (energy) graphs on the left. All matrix elements are proportional to $\Delta = \tilde g^2\,\delta(k_1^+ + k_2^+|k_3^+ + k_4^+)\,\delta^{(2)}(k_{\perp,1}+k_{\perp,2}|k_{\perp,3}+k_{\perp,4})$, with $\tilde g^2 = g^2P^+/(2\Omega)$. In the continuum limit, see Section 4.6, one uses $\tilde g^2 = g^2P^+/(4(2\pi)^3)$.
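Similarly, Dirac delta and Kronecker delta functions are related by

$$\delta(k^+)\,\delta^{(2)}(k_\perp) \;\Longleftrightarrow\; \frac{\Omega}{2(2\pi)^3}\,\delta(k^+|0)\,\delta^{(2)}(k_\perp|0). \tag{4.71}$$

Because of that, in order to satisfy the respective commutation relations, the boson operators $\tilde a$ and $a$ must be related by

$$\tilde a(q) \;\Longleftrightarrow\; \sqrt{\frac{\Omega}{2(2\pi)^3}}\;a_q , \tag{4.72}$$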
and correspondingly for fermion operators. The commutation relations, Eq. (4.25), then become, for example,

$$[a_q, a^\dagger_{q'}] \;\Longleftrightarrow\; [\tilde a(q_1), \tilde a^\dagger(q_2)] = \delta_{a_1,a_2}\,\delta_{\lambda_1,\lambda_2}\,\delta(k_1^+-k_2^+)\,\delta^{(2)}(k_{\perp 1}-k_{\perp 2}). \tag{4.73}$$

Substituting Eqs. (4.70), (4.71) and (4.72) into Eq. (4.31), for example, one gets straightforwardly

$$P^+P^-_{pg} = \int dk_1^+ d^2k_{\perp 1}\int dk_2^+ d^2k_{\perp 2}\int dk_3^+ d^2k_{\perp 3}\int dk_4^+ d^2k_{\perp 4} \sum_{a_1,a_2,a_3,a_4}\ \sum_{\lambda_1,\lambda_2,\lambda_3,\lambda_4} F_{6,2}(q_1;q_2,q_3,q_4)\,\tilde a^\dagger(q_1)\,\tilde a(q_2)\,\tilde a(q_3)\,\tilde a(q_4). \tag{4.74}$$

This appears to be a clumsy expression compared to Eq. (4.31), but the physics is the same in both of them. The matrix element $F_{6,2}$ is defined formally as in Eq. (4.28), except for the Dirac delta functions, i.e.

$$\Delta(q_1;q_2,q_3,q_4) = \frac{g^2P^+}{4(2\pi)^3}\,\delta(k_1^+-k_2^+-k_3^+-k_4^+)\,\delta^{(2)}(k_{\perp 1}-k_{\perp 2}-k_{\perp 3}-k_{\perp 4}). \tag{4.75}$$

Of course, one formally has to replace sums by integrals, Kronecker delta by Dirac delta functions, and single-particle operators by their tilded versions. But as a net effect and in practice, one replaces

$$\tilde g^2 = \frac{g^2P^+}{2\Omega} \quad\text{by}\quad \tilde g^2 = \frac{g^2P^+}{4(2\pi)^3} \tag{4.76}$$
in order to convert the discretized expressions in Tables 1–4 and Tables 9–12 to the continuum limit. See also the captions to the tables.

The DLCQ method can be considered a general framework for solving problems such as relativistic many-body theories or approximate models. The general procedure is:

1. Phrase your physics problem in a compactified version like DLCQ.
2. Apply your approximations and simplifications.
3. Derive your final result.
4. At the end of the calculation, convert expressions back to the continuum formulation.

This procedure is applied in Section 4.8.

4.7. Effective interactions in 3+1 dimensions

Instead of an infinite set of coupled integral equations like in Eq. (3.14), the eigenvalue equation $H_{LC}|\Psi\rangle = M^2|\Psi\rangle$ leads in DLCQ to a strictly finite set of coupled matrix equations

$$\sum_{j=1}^{N}\langle i|H|j\rangle\,\langle j|\Psi\rangle = E\,\langle i|\Psi\rangle \quad\text{for all } i = 1,2,\dots,N. \tag{4.77}$$

The rows and columns of the block matrices $\langle i|H|j\rangle$ are enumerated by the sector numbers $i,j = 1,2,\dots,N$, in accord with the Fock-space sectors in Fig. 2 or Fig. 11. Each sector contains many individual Fock states with different values of $x$, $p_\perp$ and $\lambda$, but due to Fock-space regularization ($\Lambda$), their number is finite.
Effective interactions are a well known tool in many-body physics [337]. In field theory the method is known as the Tamm-Dancoff approach. It was applied first by Tamm [421] and by Dancoff [117] to Yukawa theory for describing the nucleon-nucleon interaction. For the front form, a considerable amount of work has been done thus far, for instance by Tang et al. [422], Burkardt et al. [80–84], Fuda et al. [155–161], Głazek et al. [182,456], Gubankova et al. [194], Hamer et al. [420], Heinzl et al. [211–215], Hiller et al. [458], Hollenberg et al. [224], Jones et al. [255,256], Kaluza et al. [264], Kalloniatis et al. [260,386], Krautgärtner et al. [279], Prokhatilov et al. [14], Trittmann et al. [429–431], Wort [459], Zhang et al. [461–466], and others [49,134,191,233,251,252,270,285], but the subject continues to be a challenge for QCD. In particular, one faces the problem of non-perturbative renormalization, although with recent progress in Refs. [8,2,85,464]; see particularly the work of Bakker et al. [306,310,399], Bassetto et al. [5,23–26], and Brisudova et al. [48–50], as will be discussed in Section 8.

Let us review in short the general procedure [337] on which the Tamm-Dancoff approach [117,421] is based. The rows and columns of any Hamiltonian matrix can always be split into two parts. One speaks of the $P$-space and of the rest, the $Q$-space $Q\equiv 1-P$. The division is arbitrary, but to be specific let us first identify the $P$-space with the $q\bar q$-space:

$$P = |1\rangle\langle 1| , \qquad Q = \sum_{j=2}^{N}|j\rangle\langle j| . \tag{4.78}$$

Eq. (4.77) can then be rewritten conveniently as a $2\times 2$ block matrix

$$\begin{pmatrix} \langle P|H|P\rangle & \langle P|H|Q\rangle \\ \langle Q|H|P\rangle & \langle Q|H|Q\rangle \end{pmatrix} \begin{pmatrix} \langle P|\Psi\rangle \\ \langle Q|\Psi\rangle \end{pmatrix} = E \begin{pmatrix} \langle P|\Psi\rangle \\ \langle Q|\Psi\rangle \end{pmatrix}, \tag{4.79}$$

or explicitly

$$\langle P|H|P\rangle\,\langle P|\Psi\rangle + \langle P|H|Q\rangle\,\langle Q|\Psi\rangle = E\,\langle P|\Psi\rangle , \tag{4.80}$$

$$\langle Q|H|P\rangle\,\langle P|\Psi\rangle + \langle Q|H|Q\rangle\,\langle Q|\Psi\rangle = E\,\langle Q|\Psi\rangle . \tag{4.81}$$

Rewriting the second equation as

$$\langle Q|E-H|Q\rangle\,\langle Q|\Psi\rangle = \langle Q|H|P\rangle\,\langle P|\Psi\rangle , \tag{4.82}$$
one observes that the square matrix $\langle Q|E-H|Q\rangle$ could be inverted to express the $Q$-space wavefunction $\langle Q|\Psi\rangle$ in terms of the $P$-space wavefunction $\langle P|\Psi\rangle$. But the eigenvalue $E$ is unknown at this point. To avoid this, one first solves another problem: one introduces the starting point energy $\omega$ as a redundant parameter at one's disposal, and defines the $Q$-space resolvent as the inverse of the block matrix $\langle Q|\omega-H|Q\rangle$,

$$G_Q(\omega) = \frac{1}{\langle Q|\omega-H|Q\rangle} . \tag{4.83}$$

In line with Eq. (4.82) one thus defines

$$\langle Q|\Psi\rangle \equiv \langle Q|\Psi(\omega)\rangle = G_Q(\omega)\,\langle Q|H|P\rangle\,\langle P|\Psi\rangle , \tag{4.84}$$

and inserts it into Eq. (4.80). This yields an eigenvalue equation

$$H_{\rm eff}(\omega)|P\rangle\langle P|\Psi_k(\omega)\rangle = E_k(\omega)|\Psi_k(\omega)\rangle , \tag{4.85}$$
which defines unambiguously the effective interaction

$$\langle P|H_{\rm eff}(\omega)|P\rangle = \langle P|H|P\rangle + \langle P|H|Q\rangle\,G_Q(\omega)\,\langle Q|H|P\rangle . \tag{4.86}$$

Both of them act only in the usually much smaller model space, the $P$-space. The effective interaction is thus well defined: it is the original block matrix $\langle P|H|P\rangle$ plus a part where the system is scattered virtually into the $Q$-space, propagates there under the influence of the true interaction, and finally is scattered back into the $P$-space: $\langle P|H|Q\rangle\,G_Q(\omega)\,\langle Q|H|P\rangle$. Every numerical value of $\omega$ defines a different Hamiltonian and a different spectrum. Varying $\omega$ one generates a set of energy functions $E_k(\omega)$. Whenever one finds a solution to the fixed-point equation [352,467]

$$E_k(\omega) = \omega , \tag{4.87}$$

one has found one of the true eigenvalues and eigenfunctions of $H$, by construction.

It therefore looks as if one has mapped a difficult problem, the diagonalization of a large matrix ($10^{21}$), onto a simpler problem, the diagonalization of a much smaller matrix in the model space ($10^3$). But this is only true in a restricted sense. One has to invert a matrix, and the numerical inversion of a matrix takes about the same effort as its diagonalization. In addition, one has to vary $\omega$ and solve the fixed-point equation (4.87). The numerical work is thus larger rather than smaller compared to a direct diagonalization. But the procedure is exact in principle. In particular, one can find all eigenvalues of the full Hamiltonian $H$, irrespective of how small one chooses the $P$-space. Explicit examples of this can be found in Refs. [352,361,467].

The key problem is how to get $(\langle Q|\omega-H|Q\rangle)^{-1}$, the inversion of the Hamiltonian matrix in the $Q$-sector, as required by Eq. (4.83). Once this is achieved, for example by an approximation (see below), the sparseness of the Hamiltonian matrix can be exploited rather effectively: only comparatively few block matrices $\langle P|H|Q\rangle$ differ from strict zero matrices, see Fig. 2 or Fig. 11.
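The construction of the effective interaction and the fixed-point equation can be tried out on a toy matrix. The sketch below is not taken from the cited references: the $3\times3$ matrix `H`, the choice of a one-dimensional $P$-space, the bisection solver and all function names are assumptions made for illustration. It evaluates $E(\omega)$ as in Eq. (4.86), solves $E(\omega)=\omega$, and checks that the solution is a root of the characteristic determinant of the full matrix.

```python
# Toy illustration of the effective interaction and the fixed-point equation.
# The 3x3 symmetric matrix H is an arbitrary stand-in for a Hamiltonian matrix;
# P-space = state 0, Q-space = states 1 and 2 (all choices are assumptions).

H = [[1.0, 0.5, 0.2],
     [0.5, 2.0, 0.3],
     [0.2, 0.3, 3.0]]


def effective_energy(omega):
    """E(omega) = <P|H|P> + <P|H|Q> G_Q(omega) <Q|H|P> for a one-dimensional P-space."""
    # Q-space block of (omega - H), a symmetric 2x2 matrix [[a, b], [b, d]].
    a = omega - H[1][1]
    d = omega - H[2][2]
    b = -H[1][2]
    det = a * d - b * b
    # Resolvent G_Q(omega): explicit inverse of the 2x2 block.
    g11, g12, g22 = d / det, -b / det, a / det
    v1, v2 = H[0][1], H[0][2]  # coupling <P|H|Q>
    return H[0][0] + v1 * g11 * v1 + 2.0 * v1 * g12 * v2 + v2 * g22 * v2


def solve_fixed_point(lower, upper, iterations=200):
    """Bisection for E(omega) = omega on an interval that brackets one solution."""
    f_lower = effective_energy(lower) - lower
    for _ in range(iterations):
        middle = 0.5 * (lower + upper)
        f_middle = effective_energy(middle) - middle
        if (f_middle > 0.0) == (f_lower > 0.0):
            lower, f_lower = middle, f_middle
        else:
            upper = middle
    return 0.5 * (lower + upper)


def characteristic_determinant(energy):
    """det(H - energy * 1) for the full 3x3 matrix; it vanishes at a true eigenvalue."""
    m = [[H[i][j] - (energy if i == j else 0.0) for j in range(3)] for i in range(3)]
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


if __name__ == "__main__":
    # The interval lies below the lowest pole of G_Q (about 1.917 for this H).
    omega = solve_fixed_point(-10.0, 1.9)
    print("fixed point omega =", omega)
    print("E(omega) - omega  =", effective_energy(omega) - omega)
    print("det(H - omega)    =", characteristic_determinant(omega))
```

Searching other intervals between the poles of $G_Q(\omega)$ yields the remaining eigenvalues, which illustrates the remark that all eigenvalues of the full matrix are accessible however small the $P$-space is chosen.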
In fact, the sparseness of the Hamiltonian matrix can be exploited even more effectively by introducing more than two projectors, as done in the method of iterated resolvents [357,361,362]. One easily recognizes that Eqs. (4.79)–(4.86) can be interpreted as the reduction of the block matrix dimension from 2 to 1. But there is no need to identify the $P$-space with the lowest sector. One can also choose the $Q$-space identical to the last sector and identify the $P$-space with the rest, $P = 1-Q$:

$$P = \sum_{j=1}^{n}|j\rangle\langle j| \quad\text{with}\quad 1\le n\le N , \qquad Q\equiv 1-P . \tag{4.88}$$

The same steps as above then reduce the block matrix dimension from $N$ to $N-1$. The effective interaction acts in the now smaller space of $N-1$ sectors. This procedure can be repeated until one arrives at a block matrix dimension 1, where the procedure stops: the effective interaction in the Fock-space sector with only one quark and one antiquark is again defined unambiguously. More explicitly, suppose that in the course of this reduction one has arrived at block matrix dimension $n$. Denote the corresponding effective interaction by $H_n(\omega)$. The eigenvalue problem corresponding to Eq. (4.77) then reads

$$\sum_{j=1}^{n}\langle i|H_n(\omega)|j\rangle\,\langle j|\Psi(\omega)\rangle = E(\omega)\,\langle i|\Psi(\omega)\rangle \quad\text{for } i = 1,2,\dots,n . \tag{4.89}$$
Observe that $i$ and $j$ refer here to sector numbers. Since one has started from the full Hamiltonian in the last sector, one has to convene that $H_N = H$. Now, in analogy with Eqs. (4.83) and (4.84), define the resolvent of the effective sector Hamiltonian $H_n(\omega)$ by

$$G_n(\omega) = \frac{1}{\langle n|\omega-H_n(\omega)|n\rangle} , \tag{4.90}$$

$$\langle n|\Psi(\omega)\rangle = G_n(\omega)\sum_{j=1}^{n-1}\langle n|H_n(\omega)|j\rangle\,\langle j|\Psi(\omega)\rangle , \tag{4.91}$$

respectively. The effective interaction in the $(n-1)$-space then becomes [357]

$$H_{n-1}(\omega) = H_n(\omega) + H_n(\omega)\,G_n(\omega)\,H_n(\omega) \tag{4.92}$$

for every block matrix element $\langle i|H_{n-1}(\omega)|j\rangle$. To obtain the corresponding eigenvalue equation one substitutes $n$ by $n-1$ everywhere in Eq. (4.89). Everything proceeds as above, including the fixed-point equation $E(\omega)=\omega$. But one has achieved much more: Eq. (4.92) is a recursion relation which holds for all $1<n<N$. Notice that this so-called method of iterated resolvents requires only the inversion of the effective sector Hamiltonians $\langle n|H_n|n\rangle$. On a computer, this is an easier problem than the inversion of the full $Q$-space matrix as in Eq. (4.83). Moreover, one can now make use of all zero block matrices in the Hamiltonian, as worked out in Ref. [361].

The Tamm-Dancoff approach (TDA) as used in the literature, however, does not literally follow the outline given in Eqs. (4.78)–(4.86); rather, one substitutes the "energy denominator" in Eq. (4.83) according to

$$\frac{1}{\langle Q|\omega-T-U|Q\rangle} = \frac{1}{\langle Q|T^*-T-\delta U(\omega)|Q\rangle} \;\to\; \frac{1}{\langle Q|T^*-T|Q\rangle} , \tag{4.93}$$

with $\delta U(\omega) = T^* + U - \omega$, the sign being fixed by requiring that the two denominators on the left agree. Here, $T^*$ is not an operator but a c-number, denoting the mean kinetic energy in the $P$-space [117,421]. In fact, the two resolvents

$$G_Q(\omega) = \frac{1}{\langle Q|T^*-T-\delta U(\omega)|Q\rangle} , \qquad G_0 = \frac{1}{\langle Q|T^*-T|Q\rangle} \tag{4.94}$$

are identically related by

$$G_Q(\omega) = G_0 + G_0\,\delta U(\omega)\,G_Q(\omega) \tag{4.95}$$

or by the infinite series of perturbation theory

$$G_Q(\omega) = G_0 + G_0\,\delta U(\omega)\,G_0 + G_0\,\delta U(\omega)\,G_0\,\delta U(\omega)\,G_0 + \cdots . \tag{4.96}$$

The idea is that the operator $\delta U(\omega)$ is in some sense small, or at least that its mean value in the $Q$-space is close to zero, $\langle\delta U(\omega)\rangle\approx 0$. In such a case it is justified to restrict to the very first term in the expansion, $G_Q(\omega) = G_0$, as usually done in TDA. Notice that the diagonal kinetic energy $T^*-T$ can be inverted trivially to get the resolvent $G_0$.
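The relation between the full resolvent and $G_0$ can be checked numerically on small matrices. The following sketch uses arbitrary $2\times2$ toy matrices for the $Q$-space blocks of $T^*-T$ and $\delta U$ (both are assumptions chosen so that the series converges); the helper names are hypothetical. It verifies the identity $G_Q = G_0 + G_0\,\delta U\,G_Q$ and shows how the partial sums of the perturbative series approach the exact resolvent, with the zeroth-order term corresponding to the Tamm-Dancoff replacement.

```python
# Numerical check of the resolvent identity and its perturbative series for 2x2 matrices.
# All numbers are arbitrary toy values (assumptions), chosen so that the series converges.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]


def inverse(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


KINETIC = [[-2.0, 0.0], [0.0, -4.0]]   # stands for <Q|T* - T|Q>, diagonal
DELTA_U = [[0.3, 0.1], [0.1, -0.2]]    # stands for <Q|dU|Q>, assumed small

G0 = inverse(KINETIC)                      # trivial, since KINETIC is diagonal
G_EXACT = inverse(sub(KINETIC, DELTA_U))   # 1 / (T* - T - dU)


def series(order):
    """Partial sum G0 + G0 dU G0 + ... including terms up to the given order in dU."""
    term = G0
    total = G0
    for _ in range(order):
        term = matmul(matmul(G0, DELTA_U), term)
        total = add(total, term)
    return total


if __name__ == "__main__":
    identity_rhs = add(G0, matmul(matmul(G0, DELTA_U), G_EXACT))
    print("identity violation:", max_abs_diff(identity_rhs, G_EXACT))
    for order in (0, 1, 2, 5, 20):
        print(f"order {order:2d}: error = {max_abs_diff(series(order), G_EXACT):.2e}")
```

The order-0 line of the output is the error made by keeping only $G_0$, which is small here only because the toy $\delta U$ was chosen small compared with $T^*-T$.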
4.8. Quantum electrodynamics in 3+1 dimensions

Tang et al. [422] gave the first application of DLCQ to QED$_{3+1}$ at strong coupling, followed later by Kaluza et al. [264]. Both works addressed the positronium eigenvalue spectrum as a test of the method. In either case the Fock space was truncated to include only the $q\bar q$ and $q\bar q g$ states. The so-truncated DLCQ-matrix was diagonalized numerically, with rather slow convergence of the results. By omitting the one-photon state $g$, they excluded the impact of annihilation. Therefore, rather than "positronium", one should call such models "muonium with equal masses". Langnau and Burkardt have calculated the anomalous magnetic moment of the electron for very strong coupling [76,77,291,292]. Krautgärtner et al. [279,265] proceeded in a more general way by using the effective interaction of the Tamm-Dancoff approach. A detailed analysis of the Coulomb singularity and its impact on numerical calculations in momentum representation led them to develop a Coulomb counter-term technology, which improved the rate of numerical convergence significantly. It was then possible to reproduce quantitatively the Bohr aspects of the spectrum, as well as the fine and hyperfine structure.

One should emphasize that calculating the positronium spectrum from a Hamiltonian eigenvalue equation is by no means a trivial problem. In the instant form, for example, the hyperfine interaction is so singular that thus far the Hamiltonian eigenvalue equation has not been solved. The hyperfine corrections have only been calculated in the lowest non-trivial orders of perturbation theory, see Ref. [44]. One should also note that the usual problems in configuration space associated with recoil and reduced mass are simply absent in momentum representation.

Although the Tamm-Dancoff approach was originally applied in the instant form [117,421], one can translate it easily into the front form, as we have discussed above.
The approximation of Eqs. (4.93) and (4.86) gives $G_Q(\omega)\approx G_0$, and thus the virtual scattering into the $Q$-space produces an additional $P$-space interaction, the one-photon exchange interaction $VG_0V$. Its two time orderings are given diagrammatically in Fig. 12. The original $P$-space interaction is the kinetic energy, of course, plus the seagull interaction. Of the latter, we keep here only the instantaneous-photon exchange and denote it as $W$, which is represented by the first graph in Fig. 12. Without the annihilation terms, the effective Hamiltonian is thus

$$H_{\rm eff} = T + W + VG_0V = T + U_{\rm eff} . \tag{4.97}$$

The only difference is that the unperturbed energy has to be replaced by the mean kinetic energy $T^*$ as introduced in Eq. (4.93), which in light-cone quantization is given by

$$T^* = \frac{1}{2}\left(\frac{m_q^2+k_\perp^2}{x} + \frac{m_{\bar q}^2+k_\perp^2}{1-x} + \frac{m_q^2+k_\perp'^2}{x'} + \frac{m_{\bar q}^2+k_\perp'^2}{1-x'}\right) . \tag{4.98}$$

The same "trick" was applied by Tamm and by Dancoff in their original work [117,421]. In correspondence to Eq. (3.34), the energy denominator in the intermediate state of the $Q$-space,

$$T^*-T = -\frac{Q^2}{|x-x'|} , \tag{4.99}$$

can now be expressed in terms of $Q$, the average four-momentum transfer along the electron and the positron line, i.e.

$$Q^2 = -\frac{1}{2}\left((k_q-k_q')^2 + (k_{\bar q}-k_{\bar q}')^2\right) . \tag{4.100}$$
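As a small worked example, the mean kinetic energy $T^*$ of Eq. (4.98) can be evaluated as the average of the free invariant mass squared of the incoming and outgoing $q\bar q$ states. The function names and the numerical inputs below are hypothetical; masses and momenta are in arbitrary units with $m_q = m_{\bar q} = 1$ by default.

```python
def free_mass_squared(x, kperp_sq, m_q=1.0, m_qbar=1.0):
    """Free invariant mass squared of a q-qbar state:
    (m_q^2 + k_perp^2) / x + (m_qbar^2 + k_perp^2) / (1 - x)."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    return (m_q ** 2 + kperp_sq) / x + (m_qbar ** 2 + kperp_sq) / (1.0 - x)


def mean_kinetic_energy(x, kperp_sq, x_prime, kperp_prime_sq, m_q=1.0, m_qbar=1.0):
    """T* as in Eq. (4.98): average over the incoming (x, k_perp) and
    outgoing (x', k_perp') free invariant mass squared."""
    incoming = free_mass_squared(x, kperp_sq, m_q, m_qbar)
    outgoing = free_mass_squared(x_prime, kperp_prime_sq, m_q, m_qbar)
    return 0.5 * (incoming + outgoing)


if __name__ == "__main__":
    # Both particles at rest in the pair frame: T* equals the threshold value 4 m^2.
    print(mean_kinetic_energy(0.5, 0.0, 0.5, 0.0))
    # Scattering into a state with x' = 0.25 and k_perp'^2 = 0.5.
    print(mean_kinetic_energy(0.5, 0.0, 0.25, 0.5))
```

For identical incoming and outgoing momenta $T^*$ reduces to the free invariant mass squared of that state, so it is a c-number that depends only on the external $P$-space momenta.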
Fig. 12. The graphs of the effective one-photon exchange interaction. The effective interaction is a sum of the dynamic one-photon exchange with both time orderings, the instantaneous one-photon exchange, and the dynamic and the instantaneous annihilation interactions, all represented by energy graphs. The hashed rectangles represent the effective photon or the effective propagator $G_0$. Taken from Ref. [431].
As illustrated in Fig. 12, the effective interaction $U_{\rm eff}$ scatters an electron with on-shell four-momentum $k_q$ and helicity $\lambda_q$ into a state with $k_q'$ and $\lambda_q'$, and correspondingly the positron from $k_{\bar q}$ and $\lambda_{\bar q}$ to $k_{\bar q}'$ and $\lambda_{\bar q}'$. The evaluation of the so-defined effective interaction has been done explicitly in Section 3.4. In the sequel, we follow the more recent work of Trittmann et al. [429–431], where the Coulomb counter-term technology was improved further, to the extent that a calculation of all spin-parity multiplets of positronium was meaningful. In particular, it was possible to investigate the important question of the extent to which the members of the multiplets are numerically degenerate with $J_z$. One recalls that the operator for the projection of total angular momentum $J_z$ is kinematic in the front form, whereas total angular momentum $J^2$ is not, see Section 2.6.

Up to this point it was convenient to work with DLCQ and coupled matrix equations. All spatial momenta $k^+$ and $k_\perp$ are still discrete. But now that all the approximations have been made, one conveniently goes over to the continuum limit according to Section 4.6. The DLCQ-matrix equation is converted into an integral equation in momentum space,

$$M^2\,\langle x,k_\perp;\lambda_q,\lambda_{\bar q}|\Psi\rangle = \left[\frac{m_q^2+k_\perp^2}{x} + \frac{m_{\bar q}^2+k_\perp^2}{1-x}\right]\langle x,k_\perp;\lambda_q,\lambda_{\bar q}|\Psi\rangle + \sum_{\lambda_q',\lambda_{\bar q}'}\int_D dx'\,d^2k_\perp'\,\langle x,k_\perp;\lambda_q,\lambda_{\bar q}|U_{\rm eff}|x',k_\perp';\lambda_q',\lambda_{\bar q}'\rangle\,\langle x',k_\perp';\lambda_q',\lambda_{\bar q}'|\Psi\rangle . \tag{4.101}$$

The domain $D$ restricts integration in line with Fock-space regularization,

$$\frac{m_q^2+k_\perp^2}{x} + \frac{m_{\bar q}^2+k_\perp^2}{1-x} \le (m_q+m_{\bar q})^2 + \Lambda^2 . \tag{4.102}$$
The bras and kets refer to $q\bar q$ Fock states, $|x,k_\perp;\lambda_q,\lambda_{\bar q}\rangle = b^\dagger(k_q,\lambda_q)\,d^\dagger(k_{\bar q},\lambda_{\bar q})|0\rangle$. The goal of the calculation is to obtain the momentum-space wavefunctions $\langle x,k_\perp;\lambda_q,\lambda_{\bar q}|\psi\rangle$ and the eigenvalues $M^2$. The former are the probability amplitudes for finding the quark with helicity projection $\lambda_q$, longitudinal momentum fraction $x\equiv k_q^+/P^+$ and transverse momentum $k_\perp$, and simultaneously the antiquark with $\lambda_{\bar q}$, $1-x$ and $-k_\perp$. According to Eq. (3.43), the effective interaction $U_{\rm eff}$ becomes

$$U_{\rm eff} = -\frac{1}{4\pi^2}\,\frac{\alpha}{Q^2}\,\frac{[\bar u(k_q,\lambda_q)\gamma^\mu u(k_q',\lambda_q')]\,[\bar u(k_{\bar q},\lambda_{\bar q})\gamma_\mu u(k_{\bar q}',\lambda_{\bar q}')]}{\sqrt{x(1-x)\,x'(1-x')}} , \tag{4.103}$$

with $\alpha\equiv g^2/4\pi$. Notice that both the dynamic and the instantaneous one-photon exchange interaction in Eqs. (3.41) and (3.42), respectively, contain a non-integrable singularity $\sim(x-x')^{-2}$; these cancel each other in the final expressions, Eq. (3.43) or Eq. (4.103). Only the square integrable "Coulomb singularity" $1/Q^2$ remains, see also Ref. [299].

In the numerical work [279,429] it is favorable to replace the two transverse momenta $k_{\perp x}$ and $k_{\perp y}$ by the absolute value of $k_\perp$ and the angles $\theta$, $\varphi$. The integral equation is approximated by Gaussian quadratures, and the results are studied as a function of the number of integration points $N$, as displayed in Fig. 13. One sees there that the results stabilize quickly. All eigenvalues displayed have the same eigenvalue of total angular momentum projection, i.e. $J_z=0$. Since one calculates the values of an invariant mass squared, a comparatively large value of the fine structure constant, $\alpha=0.3$, has been chosen. One recognizes the ionization threshold at $M^2\sim4m^2$, the Bohr spectrum, and, even more important, the fine structure. The two lowest eigenvalues correspond to the singlet and triplet state of positronium, respectively. The agreement is quantitative, particularly for the physical value of the fine structure constant $\alpha=\frac{1}{137}$. In order to verify this agreement, one needs a relative numerical accuracy of roughly $10^{-11}$. The numerical stability and precision is remarkable, indeed. The stability with respect to the cut-off $\Lambda$ has also been studied (Fig. 14).

An inspection of the numerical wavefunctions $\psi(x,k_\perp)$, as displayed for example in Fig. 15, reveals that they are strongly peaked around $k_\perp\sim0$ and $x\sim\frac{1}{2}$. Outside the region

$$k_\perp^2\ll m^2 , \qquad \left(x-\tfrac{1}{2}\right)^2\ll1 , \tag{4.104}$$

they are smaller than the peak value by many orders of magnitude. Also, the singlet wavefunction with anti-parallel helicities dominates by more than a factor 20 over the component with parallel helicities. The latter would be zero in a non-relativistic calculation. Relativistic effects are also responsible for the fact that the singlet-($\uparrow\downarrow$) wavefunction is not rotationally symmetric. To see this, the wavefunction is plotted in Fig. 14 versus the off-shell momentum variable $\mu$, defined by [268,392,393]

$$x = \frac{1}{2}\left(1+\frac{\mu\cos\theta}{\sqrt{m^2+\mu^2}}\right) , \tag{4.105}$$

$$k_\perp = (\mu\sin\theta\cos\varphi,\ \mu\sin\theta\sin\varphi) , \tag{4.106}$$

for different values of $\theta$. A numerically significant deviation, however, occurs only for very relativistic momenta $\mu\ge10m$, as displayed in Fig. 14.

Trittmann et al. [429–431] have also included the annihilation interaction as illustrated in Fig. 12 and calculated numerically the spectrum for various values of $J_z$. The results are compiled
Fig. 13. Stability of the positronium spectrum for $J_z=0$, without the annihilation interaction. Eigenvalues $M_i^2$ for $\alpha=0.3$ and $\Lambda=1$ are plotted versus $N$, the number of Gaussian integration points. Masses are in units of the electron mass. Taken from Ref. [430].

Fig. 14. The decrease of the $J_z=0$ singlet ground-state wavefunction with antiparallel helicities as a function of the momentum variable $\mu$ for $\alpha=0.3$ and $\Lambda=1.0$. The six different curves correspond to six values of $\theta$, see Eq. (4.106). Taken from Ref. [429].
Fig. 15. Singlet wavefunctions of positronium [430].
in Fig. 16. As one can see, certain mass eigenvalues at $J_z=0$ are degenerate with certain eigenvalues at other $J_z$ to a very high degree of numerical precision. As an example, consider the second lowest eigenvalue for $J_z=0$. It is degenerate with the lowest eigenvalue for $J_z=\pm1$, and can thus be classified as a member of the triplet with $J=1$. Correspondingly, the lowest eigenvalue for $J_z=0$, having no companion, can be classified as the singlet state with $J=0$. Quite in general, one can interpret a degenerate multiplet of $2J+1$ members as a state with total angular momentum $J=J_{z,\max}$. An inspection of the wavefunctions allows one to conclude whether the component with parallel or anti-parallel helicity is the leading one. In a pragmatic sense, one can thus conclude on the "total spin" $S$ and on the "total orbital angular momentum" $L$, although in the front form neither $J$ nor $S$ nor $L$ make sense as operator eigenvalues. In fact they are not, as discussed in Section 2.6. Nevertheless, one can make contact with the conventional classification scheme $^{2S+1}L_J^{J_z}$, as indicated in Fig. 16. It is remarkable that one finds all the expected states [431], that is, all members of the multiplets are found without a single exception.

4.9. The Coulomb interaction in the front form

The $j^\mu j_\mu$-term in Eq. (4.103) represents retardation and mediates the fine and hyperfine interactions. One can switch them off by substituting the momenta by the equilibrium values,

$$\bar k_\perp = 0 , \qquad \bar x = \frac{m_q}{m_q+m_{\bar q}} , \tag{4.107}$$

which gives by means of Table 5:

$$[\bar u(k_q,\lambda_q)\gamma^\mu u(k_q',\lambda_q')]\,[\bar u(k_{\bar q},\lambda_{\bar q})\gamma_\mu u(k_{\bar q}',\lambda_{\bar q}')] \;\to\; (m_q+m_{\bar q})^2\,\delta_{\lambda_q,\lambda_q'}\,\delta_{\lambda_{\bar q},\lambda_{\bar q}'} . \tag{4.108}$$
Fig. 16. Positronium spectrum for $-3\le J_z\le3$, $\alpha=0.3$ and $\Lambda=1$, including the annihilation interaction. For an easier identification of the spin-parity multiplets, the corresponding non-relativistic notation $^{2S+1}L_J^{J_z}$ is inserted. Masses are given in units of the electron mass. Taken from Ref. [431].
The effective interaction in Eq. (4.103) simplifies correspondingly and becomes the front form Coulomb interaction:

$$U_{\rm eff} = -\frac{1}{4\pi^2}\,\frac{\alpha}{Q^2}\,\frac{(m_q+m_{\bar q})^2}{\sqrt{x(1-x)\,x'(1-x')}} . \tag{4.109}$$

To see that, one performs a variable transformation from $x$ to $k_z(x)$. The inverse transformation [359],

$$x = x(k_z) = \frac{E_1+k_z}{E_1+E_2} , \quad\text{with}\quad E_i = \sqrt{m_i^2+k_\perp^2+k_z^2} , \quad i=1,2 , \tag{4.110}$$

maps the domain of integration $-\infty\le k_z\le\infty$ into the domain $0\le x\le1$, and produces the equilibrium value for $k_z=0$, Eq. (4.107). One can combine $k_z$ and $k_\perp$ into a three-vector $\mathbf{k}=(k_\perp,k_z)$. By means of the identity

$$x(1-x) = \frac{(E_1+k_z)(E_2-k_z)}{(E_1+E_2)^2} , \tag{4.111}$$

the Jacobian of the transformation becomes straightforwardly

$$\frac{dx'}{\sqrt{x(1-x)\,x'(1-x')}} = dk_z'\left(\frac{1}{E_2}+\frac{1}{E_1}\right)\sqrt{\frac{(E_1+k_z')(E_2-k_z')}{(E_1+k_z)(E_2-k_z)}} . \tag{4.112}$$

For equal masses $m_1=m_2=m$ (positronium), the kinetic energy is

$$\frac{m^2+k_\perp^2}{x(1-x)} = 4m^2+4\mathbf{k}^2 , \tag{4.113}$$

and the domain of integration, Eq. (4.102), reduces to $4\mathbf{k}^2\le\Lambda^2$. The momentum scale $\mu$ [268,392,393], as introduced in Eq. (4.106), identifies itself as $\mu=2|\mathbf{k}|$. As shown in Ref. [359], the four-momentum transfer, Eq. (4.100), can be rewritten exactly as

$$Q^2 = (\mathbf{k}-\mathbf{k}')^2 . \tag{4.114}$$

Finally, after substituting the invariant mass squared eigenvalue $M^2$ by an energy eigenvalue $E$,

$$M^2 = 4m^2+4mE , \tag{4.115}$$

and introducing a new wavefunction $\phi$,

$$\phi(\mathbf{k}) = \langle x(k_z),k_\perp;\lambda_q,\lambda_{\bar q}|\psi\rangle\,\frac{1}{m}\sqrt{m^2+k_\perp^2} , \tag{4.116}$$

one rewrites Eq. (4.101) with Eq. (4.109) identically as

$$\left(E-\frac{\mathbf{k}^2}{2m_r}\right)\phi(\mathbf{k}) = -\frac{\alpha}{2\pi^2}\,\frac{m}{\sqrt{m^2+\mathbf{k}^2}}\int_D d^3k'\,\frac{1}{(\mathbf{k}-\mathbf{k}')^2}\,\phi(\mathbf{k}') . \tag{4.117}$$

Since $m_r=m/2$ is the reduced mass, this is the non-relativistic Schrödinger equation in momentum representation for $\mathbf{k}^2\ll m^2$ (see also Ref. [359]). Notice that only retardation was suppressed to get this result. The impact of the relativistic light-cone treatment resides in the factor $(1+\mathbf{k}^2/m^2)^{-1/2}$. It induces a weak non-locality in the
effective Coulomb potential. Notice also that the solution of Eq. (4.117) is rotationally symmetric for the lowest state. Therefore, the original front form wavefunction $\langle x(k_z), \vec k_\perp | \psi \rangle$ in Eq. (4.116) cannot be rotationally symmetric. The deviations from rotational symmetry, however, are small and can occur only for $\vec k^{\,2} \gg m^2$, as can be observed in Fig. 14.
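To get a feeling for the size of the binding in Eq. (4.115), one can drop the factor $(1 + \vec k^{\,2}/m^2)^{-1/2}$ in Eq. (4.117) and use the textbook Coulomb levels $E_n = -m_r \alpha^2 / (2n^2)$ with $m_r = m/2$. The sketch below is only such an estimate under that stated assumption; it is not the numerical front form solution, the function names are hypothetical, and $\alpha = 0.3$ is taken from the parameters quoted for Fig. 16.

```python
import math

def bohr_energy(n, alpha, m=1.0):
    """Non-relativistic Coulomb level for reduced mass m_r = m/2 (natural units)."""
    m_r = m / 2.0
    return -m_r * alpha**2 / (2.0 * n**2)

def mass_eigenvalue(n, alpha, m=1.0):
    """Invariant mass M from Eq. (4.115), M^2 = 4 m^2 + 4 m E."""
    return math.sqrt(4.0 * m**2 + 4.0 * m * bohr_energy(n, alpha, m))

# Masses in units of the constituent mass m, for the coupling quoted with Fig. 16.
for n in (1, 2, 3):
    print(n, bohr_energy(n, 0.3), mass_eigenvalue(n, 0.3))
```

Because the estimate ignores retardation, annihilation and the relativistic factor, it carries no fine or hyperfine splitting and should be read only as the non-relativistic baseline.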
5. The impact on hadronic physics

In this section we discuss a number of novel applications of quantum chromodynamics to nuclear structure and dynamics, such as the reduced amplitude formalism for exclusive nuclear amplitudes. We particularly emphasize the importance of light-cone Hamiltonian and Fock state methods as a tool for describing the wavefunctions of composite relativistic many-body systems and their interactions. We also show that the use of covariant kinematics leads to non-trivial corrections to the standard formulae for the axial, magnetic, and quadrupole moments of nucleons and nuclei.

In principle, quantum chromodynamics can provide a fundamental description of hadron and nuclear structure and dynamics in terms of elementary quark and gluon degrees of freedom. In practice, the direct application of QCD to hadron and nuclear phenomena is extremely complex because of the interplay of non-perturbative effects such as color confinement and multi-quark coherence. Despite these challenging theoretical difficulties, there has been substantial progress in identifying specific QCD effects in nuclear physics. A crucial tool in these analyses is the use of relativistic light-cone quantum mechanics and Fock state methods in order to provide a tractable and consistent treatment of relativistic many-body effects. In some applications, such as exclusive processes at large momentum transfer, one can make first-principle predictions using factorization theorems which separate hard perturbative dynamics from the non-perturbative physics associated with hadron or nuclear binding. In other applications, such as the passage of hadrons through nuclear matter and the calculation of the axial, magnetic, and quadrupole moments of light nuclei, the QCD description provides new insights which go well beyond the usual assumptions of traditional nuclear physics.

5.1. Light-cone methods in QCD

In recent years quantization of quantum chromodynamics at fixed light-cone time $\tau = t - z/c$ has emerged as a promising method for solving relativistic bound-state problems in the strong coupling regime including nuclear systems. Light-cone quantization has a number of unique features that make it appealing. Most notably, the ground state of the free theory is also a ground state of the full theory, and the Fock expansion constructed on this vacuum state provides a complete relativistic many-particle basis for diagonalizing the full theory. The light-cone wavefunctions $\psi_n(x_i, \vec k_{\perp i}, \lambda_i)$, which describe the hadrons and nuclei in terms of their fundamental quark and gluon degrees of freedom, are frame-independent. The essential variables are the boost-invariant light-cone momentum fractions $x_i = p_i^+/P^+$, where $P^\mu$ and $p_i^\mu$ are the hadron and quark or gluon momenta, respectively, with $P^\pm = P^0 \pm P^z$. The internal transverse momentum variables $\vec k_{\perp i}$ are given by $\vec k_{\perp i} = \vec p_{\perp i} - x_i \vec P_\perp$ with the constraints $\sum \vec k_{\perp i} = 0$ and $\sum x_i = 1$, i.e., the light-cone momentum fractions $x_i$ and $\vec k_{\perp i}$ are relative coordinates, and they describe the hadronic
system independent of its total four momentum $p^\mu$. The entire spectrum of hadrons and nuclei and their scattering states is given by the set of eigenstates of the light-cone Hamiltonian $H_{LC}$ of QCD. The Heisenberg problem takes the form

$$H_{LC} |\Psi\rangle = M^2 |\Psi\rangle. \tag{5.1}$$

For example, each hadron has the eigenfunction $|\Psi_H\rangle$ of $H_{LC}^{QCD}$ with eigenvalue $M^2 = M_H^2$. If we could solve the light-cone Heisenberg problem for the proton in QCD, we could then expand its eigenstate on the complete set of quark and gluon eigensolutions $|n\rangle = |uud\rangle, |uudg\rangle, \ldots$ of the free Hamiltonian $H_{LC}^0$ with the same global quantum numbers:

$$|\Psi_p\rangle = \sum_n |n\rangle\, \psi_n(x_i, \vec k_{\perp i}, \lambda_i). \tag{5.2}$$

The $\psi_n$ ($n = 3, 4, \ldots$) are first-quantized amplitudes analogous to the Schrödinger wave function, but they are Lorentz-frame-independent. Particle number is generally not conserved in a relativistic quantum field theory. Thus, each eigenstate is represented as a sum over Fock states of arbitrary particle number, and in QCD each hadron is expanded as second-quantized sums over fluctuations of color-singlet quark and gluon states of different momenta and number. The coefficients of these fluctuations are the light-cone wavefunctions $\psi_n(x_i, \vec k_{\perp i}, \lambda_i)$. The invariant mass $\mathcal{M}$ of the partons in a given Fock state can be written in the elegant form $\mathcal{M}^2 = \sum_{i=1}^{3} (\vec k_{\perp i}^{\,2} + m_i^2)/x_i$. The dominant configurations in the wavefunction are generally those with minimum values of $\mathcal{M}^2$. Note that except for the case $m_i = 0$ and $\vec k_{\perp i} = 0$, the limit $x_i \to 0$ is an ultraviolet limit; i.e. it corresponds to particles moving with infinite momentum in the negative $z$ direction: $k_i^z \to -k_i^0 \to -\infty$.

In the case of QCD in one-space and one-time dimensions, the application of discretized light-cone quantization [66], see Section 4, provides complete solutions of the theory, including the entire spectrum of mesons, baryons, and nuclei, and their wavefunctions [227].
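To illustrate why discretization makes the Fock basis finite, the sketch below enumerates the momentum configurations available at a given harmonic resolution $K$ for a deliberately simplified case: a single species of identical bosons in 1+1 dimensions, with momenta $n_i = 1, 2, \ldots$ in units of the basic momentum quantum, $\sum_i n_i = K$, and the zero mode dropped. These simplifications and the function name are assumptions made for illustration only; the QCD(1+1) calculations cited above additionally carry fermion statistics, color and flavor.

```python
def dlcq_fock_states(K):
    """List all multisets of positive integers summing to K.

    Each tuple (n_1 >= n_2 >= ...) is one Fock-state momentum configuration;
    the light-cone fractions are x_i = n_i / K.
    """
    def partitions(remaining, max_part):
        if remaining == 0:
            yield ()
            return
        for first in range(min(remaining, max_part), 0, -1):
            for rest in partitions(remaining - first, first):
                yield (first,) + rest

    return list(partitions(K, K))

# Toy example: every configuration at K = 4 with its momentum fractions.
for state in dlcq_fock_states(4):
    print(state, [n / 4 for n in state])
```

The number of configurations is finite for every $K$ and grows quickly with $K$, which is the reason the Hamiltonian can be diagonalized as an ordinary matrix and also the reason the method becomes numerically demanding at high resolution.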
In the DLCQ method, one simply diagonalizes the light-cone Hamiltonian for QCD on a discretized Fock state basis. The DLCQ solutions can be obtained for arbitrary parameters including the number of flavors and colors and quark masses. More recently, DLCQ has been applied to new variants of QCD(1+1) with quarks in the adjoint representation, thus obtaining color-singlet eigenstates analogous to gluonium states [115].

The DLCQ method becomes much more numerically intense when applied to physical theories in 3+1 dimensions; however, progress is being made. An analysis of the spectrum and light-cone wavefunctions of positronium in QED(3+1) is given in Ref. [279]. Currently, Hiller et al. [222] are pursuing a non-perturbative calculation of the lepton anomalous moment in QED using this method. Burkardt has recently solved scalar theories with transverse dimensions by combining a Monte Carlo lattice method with DLCQ [79].

Given the light-cone wavefunctions, $\psi_{n/H}(x_i, \vec k_{\perp i}, \lambda_i)$, one can compute virtually any hadronic quantity by convolution with the appropriate quark and gluon matrix elements. For example, the leading-twist structure functions measured in deep inelastic lepton scattering are immediately related to the light-cone probability distributions:

$$2M F_1(x, Q) = \frac{F_2(x, Q)}{x} \approx \sum_a e_a^2\, G_{a/p}(x, Q), \tag{5.3}$$
where

$$G_{a/p}(x, Q) = \sum_{n, \lambda_i} \int \prod_i \frac{dx_i\, d^2k_{\perp i}}{16\pi^3}\, |\psi_n^{(Q)}(x_i, \vec k_{\perp i}, \lambda_i)|^2 \sum_{b=a} \delta(x_b - x) \tag{5.4}$$

is the number density of partons of type $a$ with longitudinal momentum fraction $x$ in the proton. This follows from the observation that deep-inelastic lepton scattering in the Bjorken-scaling limit occurs if $x_{bj}$ matches the light-cone fraction of the struck quark. (The $\sum_b$ is over all partons of type $a$ in state $n$.) However, the light cone wavefunctions contain much more information for the final state of deep-inelastic scattering, such as the multi-parton distributions, spin and flavor correlations, and the spectator jet composition.

As was first shown by Drell and Yan [133], it is advantageous to choose a coordinate frame where $q^+ = 0$ to compute form factors $F_i(q^2)$, structure functions, and other current matrix elements at space-like photon momentum. With such a choice the quark current cannot create or annihilate pairs, and $\langle p' | j^+ | p \rangle$ can be computed as a simple overlap of Fock space wavefunctions; all off-diagonal terms involving pair production or annihilation by the current or vacuum vanish. In the interaction picture, one can equate the full Heisenberg current to the quark current described by the free Hamiltonian at $\tau = 0$. Accordingly, the form factor is easily expressed in terms of the pion’s light cone wavefunctions by examining the $\mu = +$ component of this equation in a frame where the photon’s momentum is transverse to the incident pion momentum, with $\vec q_\perp^{\,2} = Q^2 = -q^2$. The space-like form factor is then just a sum of overlap integrals analogous to the corresponding non-relativistic formula [133] (see Fig. 17):

$$F(q^2) = \sum_{n, \lambda_i} \sum_a e_a \int \prod_i \frac{dx_i\, d^2k_{\perp i}}{16\pi^3}\, \psi_n^{(\Lambda)*}(x_i, \vec l_{\perp i}, \lambda_i)\, \psi_n^{(\Lambda)}(x_i, \vec k_{\perp i}, \lambda_i). \tag{5.5}$$

Here $e_a$ is the charge of the struck quark, $\Lambda^2 \gg \vec q_\perp^{\,2}$, and

$$\vec l_{\perp i} \equiv \begin{cases} \vec k_{\perp i} - x_i \vec q_\perp + \vec q_\perp & \text{for the struck quark}, \\ \vec k_{\perp i} - x_i \vec q_\perp & \text{for all other partons}. \end{cases} \tag{5.6}$$

Notice that the transverse momenta appearing as arguments of the first wavefunctions correspond not to the actual momenta carried by the partons but to the actual momenta minus $x_i \vec q_\perp$, to account for the motion of the final hadron. Notice also that $\vec l_\perp$ and $\vec k_\perp$ become equal as $\vec q_\perp \to 0$, and that $F_\pi \to 1$ in this limit due to wavefunction normalization. All of the various form factors of hadrons with spin can be obtained by computing the matrix element of the plus current between states of different initial and final hadron helicities [39].

As we have emphasized above, in principle, the light-cone wavefunctions determine all properties of a hadron. The general rule for calculating an amplitude involving the wavefunctions $\psi_n^{(\Lambda)}$, describing Fock state $n$ in a hadron with $P = (P^+, \vec P_\perp)$, has the form [62] (see Fig. 18):

$$\sum_{\lambda_i} \int \prod_i \frac{dx_i\, d^2k_{\perp i}}{\sqrt{x_i}\, 16\pi^3}\, \psi_n^{(\Lambda)}(x_i, \vec k_{\perp i}, \lambda_i)\, T_n^{(\Lambda)}(x_i P^+, x_i \vec P_\perp + \vec k_{\perp i}, \lambda_i), \tag{5.7}$$

where $T_n^{(\Lambda)}$ is the irreducible scattering amplitude in LCPTh with the hadron replaced by Fock state $n$. If only the valence wavefunction is to be used, $T_n^{(\Lambda)}$ is irreducible with respect to the valence
Fock state only, e.g. $T_n^{(\Lambda)}$ for a pion has no $q\bar q$ intermediate states. Otherwise, contributions from all Fock states must be summed, and $T_n^{(\Lambda)}$ is completely irreducible.

Fig. 17. Calculation of the form factor of a bound state from the convolution of light-cone Fock amplitudes. The result is exact if one sums over all $\psi_n$.

Fig. 18. Calculation of hadronic amplitudes in the light-cone Fock formalism.

The leptonic decay of the $\pi^\pm$ is one of the simplest processes to compute since it involves only the $q\bar q$ Fock state. The sole contribution to $\pi^-$ decay is from

$$\langle 0 | \bar\psi_u \gamma^+ (1 - \gamma_5) \psi_d | \pi^- \rangle = -\sqrt{2}\, P^+ f_\pi = \int \frac{dx\, d^2k_\perp}{16\pi^3}\, \psi_{d\bar u}^{(\Lambda)}(x, \vec k_\perp)\, \frac{\sqrt{n_c}}{\sqrt{2}} \left\{ \frac{\bar v_\downarrow}{\sqrt{1-x}}\, \gamma^+ (1 - \gamma_5)\, \frac{u_\uparrow}{\sqrt{x}} + (\uparrow \leftrightarrow \downarrow) \right\}, \tag{5.8}$$
where $n_c = 3$ is the number of colors, $f_\pi \approx 93$ MeV, and where only the $L_z = S_z = 0$ component of the general $q\bar q$ wavefunction contributes. Thus, we have

$$\int \frac{dx\, d^2k_\perp}{16\pi^3}\, \psi_{d\bar u}^{(\Lambda)}(x, \vec k_\perp) = \frac{f_\pi}{2\sqrt{3}}. \tag{5.9}$$
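How Eq. (5.9) fixes the normalization can be made concrete with a model wavefunction. The sketch below assumes, purely for illustration, a Gaussian valence ansatz for massless quarks, $\psi = A \exp[-b\, k_\perp^2 / (x(1-x))]$, with a hypothetical width $b = 0.78\ \mathrm{GeV}^{-2}$; neither the ansatz nor the value of $b$ is taken from the text at this point. For this form the integrals are elementary: Eq. (5.9) gives $A = 16\sqrt{3}\, \pi^2 b f_\pi$, and the valence probability $\int dx\, d^2k_\perp / (16\pi^3)\, |\psi|^2$ becomes $4\pi^2 b f_\pi^2$. The code checks both closed forms against a direct midpoint-rule integration.

```python
import math

F_PI = 0.093  # GeV, the value quoted in the text

def gaussian_psi(x, k, amp, b):
    """Hypothetical valence ansatz amp * exp(-b k^2 / (x (1 - x))), massless quarks."""
    return amp * math.exp(-b * k * k / (x * (1.0 - x)))

def amplitude_from_fpi(b, f_pi=F_PI):
    """Amplitude fixed by Eq. (5.9): amp / (96 pi^2 b) = f_pi / (2 sqrt(3))."""
    return 16.0 * math.sqrt(3.0) * math.pi**2 * b * f_pi

def valence_probability(b, f_pi=F_PI):
    """Closed form of the |psi|^2 integral for this ansatz: 4 pi^2 b f_pi^2."""
    return 4.0 * math.pi**2 * b * f_pi**2

def integrate(power, amp, b, nx=200, nk=400, kmax=5.0):
    """Midpoint rule for int dx d^2k / (16 pi^3) psi**power, with d^2k = 2 pi k dk."""
    dx, dk = 1.0 / nx, kmax / nk
    total = 0.0
    for i in range(nx):
        x = (i + 0.5) * dx
        for j in range(nk):
            k = (j + 0.5) * dk
            total += 2.0 * math.pi * k * gaussian_psi(x, k, amp, b) ** power
    return total * dx * dk / (16.0 * math.pi**3)

b = 0.78  # GeV^-2, illustrative width only
amp = amplitude_from_fpi(b)
print("constraint (5.9):", integrate(1, amp, b), F_PI / (2.0 * math.sqrt(3.0)))
print("valence probability:", integrate(2, amp, b), valence_probability(b))
```

With this illustrative width the valence probability comes out at about 0.27, that is, finite and well below one. The number depends directly on the assumed $b$ and on the Gaussian shape, so it should not be read as a prediction.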
Eq. (5.9) must be independent of the ultraviolet cutoff $\Lambda$ of the theory provided $\Lambda$ is large compared with typical hadronic scales. This equation is an important constraint upon the normalization of the $d\bar u$ wavefunction. It also shows that there is a finite probability for finding a $\pi^-$ in a pure $d\bar u$ Fock state.

The fact that a hadron can have a non-zero projection on a Fock state of fixed particle number seems to conflict with the notion that bound states in QCD have an infinitely recurring parton substructure, both from the infrared region (from soft gluons) and the ultraviolet regime (from QCD evolution to high momentum). In fact, there is no conflict. Because of coherent color-screening in the color-singlet hadrons, the infrared gluons with wavelength longer than the hadron size decouple from the hadron wavefunction.

The question of parton substructure is related to the resolution scale or ultraviolet cut-off of the theory. Any renormalizable theory must be defined by imposing an ultraviolet cutoff $\Lambda$ on the momenta occurring in the theory. The scale $\Lambda$ is usually chosen to be much larger than the physical scales $\mu$ of interest; however, it is usually more useful to choose a smaller value for $\Lambda$, but at the
expense of introducing new higher-twist terms in an effective Lagrangian [301]:

$$\mathcal{L}^{(\Lambda)} = \mathcal{L}_0^{(\Lambda)}(\alpha_s(\Lambda), m(\Lambda)) + \sum_{n=1}^{N} \left( \frac{1}{\Lambda} \right)^{n} \delta\mathcal{L}_n^{(\Lambda)}(\alpha_s(\Lambda), m(\Lambda)) + O\!\left( \frac{1}{\Lambda} \right)^{N+1}, \tag{5.10}$$

where

$$\mathcal{L}_0^{(\Lambda)} = -\tfrac{1}{4} F^{(\Lambda)}_{a\mu\nu} F^{(\Lambda)\, a\mu\nu} + \bar\psi^{(\Lambda)} \left[ i\, \not{\!\!D}^{(\Lambda)} - m(\Lambda) \right] \psi^{(\Lambda)}. \tag{5.11}$$

The neglected physics of parton momenta and substructure beyond the cutoff scale has the effect of renormalizing the values of the input coupling constant $g(\Lambda^2)$ and the input mass parameter $m(\Lambda^2)$ of the quark partons in the Lagrangian. One clearly should choose $\Lambda$ large enough to avoid large contributions from the higher-twist terms in the effective Lagrangian, but small enough so that the Fock space domain is minimized. Thus, if $\Lambda$ is chosen of order 5 to 10 times the typical QCD momentum scale, then it is reasonable to hope that the mass, magnetic moment and other low momentum properties of the hadron could be well-described on a Fock basis of limited size. Furthermore, by iterating the equations of motion, one can construct a relativistic Schrödinger equation with an effective potential acting on the valence lowest-particle number state wavefunction [297,298]. Such a picture would explain the apparent success of constituent quark models for explaining the hadronic spectrum and low-energy properties of hadrons.

It should be emphasized that the infinitely growing parton content of hadrons due to the evolution of the deep inelastic structure functions at increasing momentum transfer is associated with the renormalization group substructure of the quarks themselves, rather than the “intrinsic” structure of the bound state wavefunction [63,65]. The fact that the light-cone kinetic energy $\langle (\vec k_\perp^{\,2} + m^2)/x \rangle$ of the constituents in the bound state is bounded by $\Lambda^2$ excludes singular behavior of the Fock wavefunctions at $x \to 0$.

There are several examples where the light-cone Fock structure of the bound-state solutions is known. In the case of the super-renormalizable gauge theory QCD(1+1), the probability of having non-valence states in the light-cone expansion of the lowest lying meson and baryon eigenstates is less than $10^{-3}$, even at very strong coupling [227].
In the case of QED(3+1), the lowest state of positronium can be well described on a light-cone basis with two to four particles, $|e^+e^-\rangle$, $|e^+e^-\gamma\rangle$, $|e^+e^-\gamma\gamma\rangle$, and $|e^+e^-e^+e^-\rangle$; in particular, the description of the Lamb shift in positronium requires the coupling of the system to light-cone Fock states with two photons “in flight” in light-cone gauge. The ultraviolet cut-off scale $\Lambda$ only needs to be taken large compared to the electron mass. On the other hand, a charged particle such as the electron does not have a finite Fock decomposition, unless one imposes an artificial infrared cut-off.

We thus expect that a limited light-cone Fock basis should be sufficient to represent bound color-singlet states of heavy quarks in QCD(3+1) because of the coherent color cancelations and the suppressed amplitude for transversely polarized gluon emission by heavy quarks. However, the description of light hadrons is undoubtedly much more complex due to the likely influence of chiral symmetry breaking and zero-mode gluons in the light-cone vacuum. We return to this problem later.

Even without solving the QCD light-cone equations of motion, we can anticipate some general features of the behavior of the light-cone wavefunctions. Each Fock component describes a system of free particles with kinematic invariant mass squared:

$$\mathcal{M}^2 = \sum_i^n \frac{\vec k_{\perp i}^{\,2} + m_i^2}{x_i}. \tag{5.12}$$
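Eq. (5.12) is simple to evaluate for a given parton configuration. The sketch below does so while enforcing the constraints $\sum_i x_i = 1$ and $\sum_i \vec k_{\perp i} = 0$ stated in Section 5.1. The function name, the tolerance, and the sample quark mass of 0.3 GeV are illustrative assumptions.

```python
def invariant_mass_squared(x, kperp, masses, tol=1e-9):
    """Invariant mass squared of a Fock state, Eq. (5.12).

    x      : light-cone momentum fractions x_i (must sum to 1)
    kperp  : transverse momenta as (kx, ky) pairs (must sum to zero)
    masses : parton masses m_i
    """
    if not (len(x) == len(kperp) == len(masses)):
        raise ValueError("x, kperp and masses must have the same length")
    if any(xi <= 0.0 for xi in x):
        raise ValueError("momentum fractions must be positive")
    if abs(sum(x) - 1.0) > tol:
        raise ValueError("momentum fractions must sum to 1")
    if abs(sum(k[0] for k in kperp)) > tol or abs(sum(k[1] for k in kperp)) > tol:
        raise ValueError("transverse momenta must sum to zero")
    return sum((kx * kx + ky * ky + m * m) / xi
               for xi, (kx, ky), m in zip(x, kperp, masses))

# Three equal-mass quarks (hypothetical m = 0.3 GeV) with no transverse motion.
m = 0.3
at_rest = [(0.0, 0.0)] * 3
symmetric = invariant_mass_squared([1 / 3, 1 / 3, 1 / 3], at_rest, [m] * 3)
skewed = invariant_mass_squared([0.8, 0.1, 0.1], at_rest, [m] * 3)
print(symmetric, skewed)
```

For equal masses and vanishing transverse momenta the minimum sits at equal momentum fractions, where $\mathcal{M}^2 = (3m)^2$; moving any $x_i$ toward zero drives $\mathcal{M}^2$ up, which is the sense in which small $x$ probes high-mass configurations.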
On general dynamical grounds, we can expect that states with very high $\mathcal{M}^2$ are suppressed in physical hadrons, with the highest mass configurations computable from perturbative considerations. We also note that $\ln x_i = \ln[(k^0 + k^z)_i / (P^0 + P^z)] = y_i - y_P$ is the rapidity difference between the constituent with light-cone fraction $x_i$ and the rapidity of the hadron itself. Since correlations between particles rarely extend over two units of rapidity in hadron physics, this argues that constituents which are correlated with the hadron’s quantum numbers are primarily found with $x > 0.2$.

The limit $x \to 0$ is normally an ultraviolet limit in a light-cone wavefunction. Recall that in any Lorentz frame, the light-cone fraction is $x = k^+/p^+ = (k^0 + k^z)/(P^0 + P^z)$. Thus in a frame where the bound state is moving infinitely fast in the positive $z$ direction (“the infinite momentum frame”), the light-cone fraction becomes the momentum fraction $x \to k^z/p^z$. However, in the rest frame $\vec P = 0$, $x = (k^0 + k^z)/M$. Thus, $x \to 0$ generally implies very large constituent momentum $k^z \to -k^0 \to -\infty$ in the rest frame; it is excluded by the ultraviolet regulation of the theory, unless the particle has strictly zero mass and transverse momentum.

If a particle has non-relativistic momentum in the bound state, then we can identify $k^z \sim xM - m$. This correspondence is useful when one matches physics at the relativistic/non-relativistic interface. In fact, any non-relativistic solution to the Schrödinger equation can be immediately written in light-cone form by identifying the two forms of coordinates. For example, the Schrödinger solution for particles bound in a harmonic oscillator potential can be taken as a model for the light-cone wavefunction for quarks in a confining linear potential [299]:

$$\psi(x_i, \vec k_{\perp i}) = A \exp(-b \mathcal{M}^2) = A \exp\left( -b \sum_i^n \frac{\vec k_{\perp i}^{\,2} + m_i^2}{x_i} \right). \tag{5.13}$$

This form exhibits the strong fall-off at large relative transverse momentum and at the $x \to 0$ and $x \to 1$ endpoints expected for soft non-perturbative solutions in QCD. The perturbative corrections due to hard gluon exchange give amplitudes suppressed only by power laws and thus will eventually dominate wave function behavior over the soft contributions in these regions. This ansatz is the central assumption required to derive dimensional counting perturbative QCD predictions for exclusive processes at large momentum transfer and the $x \to 1$ behavior of deep-inelastic structure functions. A review is given in Ref. [62]. A model for the polarized and unpolarized gluon distributions in the proton which takes into account both perturbative QCD constraints at large $x$ and coherent cancelations at low $x$ and small transverse momentum is given in Refs. [63,65].

The light-cone approach to QCD has immediate application to nuclear systems: the formalism provides a covariant many-body description of nuclear systems formally similar to non-relativistic many-body theory. One can derive rigorous predictions for the leading power-law fall-off of nuclear amplitudes, including the nucleon-nucleon potential, the deuteron form factor, and the distributions of nucleons within nuclei at large momentum fraction. For example, the leading electromagnetic form factor of the deuteron falls as $F_d(Q^2) = f(\alpha_s(Q^2))/(Q^2)^5$, where, asymptotically, $f(\alpha_s(Q^2)) \propto \alpha_s(Q^2)^{5+\gamma}$. The leading anomalous dimension $\gamma$ is computed in Ref. [59].

In general, the six-quark Fock state of the deuteron is a mixture of five different color-singlet states. The dominant color configuration of the six quarks corresponds to the usual proton-neutron bound state. However, as $Q^2$ increases, the deuteron form factor becomes sensitive to
deuteron wavefunction configurations where all six quarks overlap within an impact separation $b_{\perp i} < O(1/Q)$. In the asymptotic domain, all five Fock color-singlet components acquire equal weight; i.e., the deuteron wavefunction becomes 80% “hidden color” at short distances. The derivation of the evolution equation for the deuteron distribution amplitude is given in Refs. [59,249].

QCD predicts that Fock components of a hadron with a small color dipole moment can pass through nuclear matter without interactions [36,60]; see also [334]. Thus, in the case of large momentum transfer reactions where only small-size valence Fock state configurations enter the hard scattering amplitude, both the initial and final state interactions of the hadron states become negligible. There is now evidence for QCD “color transparency” in exclusive virtual photon $\rho$ production for both nuclear coherent and incoherent reactions in the E665 experiment at Fermilab [141], as well as the original measurement at BNL in quasi-elastic $pp$ scattering in nuclei [216]. The recent NE18 measurement of quasi-elastic electron-proton scattering at SLAC finds results which do not clearly distinguish between conventional Glauber theory predictions and PQCD color transparency [320]. In contrast to color transparency, Fock states with large-scale color configurations strongly interact with high particle number production [42].

The traditional nuclear physics assumption that the nuclear form factor factorizes in the form $F_A(Q^2) = \sum_N F_N(Q^2)\, F^{\rm body}_{N/A}(Q^2)$, where $F_N(Q^2)$ is the on-shell nucleon form factor, is in general incorrect. The struck nucleon is necessarily off-shell, since it must transmit momentum to align the spectator nucleons along the direction of the recoiling nucleus. Nuclear form factors and scattering amplitudes can be factored in the form given by the reduced amplitude formalism [55], which follows from the cluster decomposition of the nucleus in the limit of zero nuclear binding. The reduced form factor formalism takes into account the fact that each nucleon in an exclusive nuclear transition typically absorbs momentum $Q_N \simeq Q/N$. Tests of this formalism are discussed in a later section.

The use of covariant kinematics leads to a number of striking conclusions for the electromagnetic and weak moments of nucleons and nuclei. For example, magnetic moments cannot be written as the naive sum $\vec\mu = \sum \vec\mu_i$ of the magnetic moments of the constituents, except in the non-relativistic limit where the radius of the bound state is much larger than its Compton scale: $R_A M_A \gg 1$. The deuteron quadrupole moment is in general non-zero even if the nucleon-nucleon bound state has no D-wave component [58]. Such effects are due to the fact that even “static” moments have to be computed as transitions between states of different momentum $p^\mu$ and $p^\mu + q^\mu$ with $q^\mu \to 0$. Thus, one must construct current matrix elements between boosted states. The Wigner boost generates non-trivial corrections to the current interactions of bound systems [51].

One can also use light-cone methods to show that the proton’s magnetic moment $\mu_p$ and its axial-vector coupling $g_A$ have a relationship independent of the assumed form of the light-cone wave function [71]. At the physical value of the proton radius computed from the slope of the Dirac form factor, $R_1 = 0.76$ fm, one obtains the experimental values for both $\mu_p$ and $g_A$; the helicities carried by the valence $u$ and $d$ quarks are each reduced by a factor $\simeq 0.75$ relative to their non-relativistic values. At infinitely small radius $R_p M_p \to 0$, $\mu_p$ becomes equal to the Dirac moment, as demanded by the Drell-Hearn-Gerasimov sum rule [174,129]. Another surprising fact is that as $R_1 \to 0$, the constituent quark helicities become completely disoriented and $g_A \to 0$. We discuss these features in more detail in the following section.
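The power counting behind the statements above can be written out in a few lines. The sketch assumes the standard dimensional-counting rule $F(Q^2) \sim (Q^2)^{-(n-1)}$ for a bound state of $n$ elementary constituents, with logarithmic factors ignored; the text quotes only the deuteron case, so the general rule is an assumption here. It also uses a simplified reading of the reduced form factor, in which one nucleon form factor per nucleon is divided out. The function names are hypothetical.

```python
def form_factor_power(n_constituents):
    """Leading power p in F(Q^2) ~ (Q^2)^(-p) from dimensional counting, p = n - 1."""
    if n_constituents < 2:
        raise ValueError("a bound state needs at least two constituents")
    return n_constituents - 1

def reduced_form_factor_power(n_nucleons):
    """Power of 1/Q^2 remaining after dividing out one nucleon form factor per nucleon."""
    nucleus = form_factor_power(3 * n_nucleons)   # 3 valence quarks per nucleon
    per_nucleon = form_factor_power(3)
    return nucleus - n_nucleons * per_nucleon

def hidden_color_fraction(n_color_singlets=5):
    """Weight outside the proton-neutron configuration when all singlets are equal."""
    return 1.0 - 1.0 / n_color_singlets

print(form_factor_power(6))            # deuteron, six quarks
print(reduced_form_factor_power(2))    # reduced deuteron form factor
print(hidden_color_fraction())         # asymptotic hidden-color weight
```

For the deuteron this reproduces the $(Q^2)^{-5}$ fall-off quoted above, leaves a single power of $1/Q^2$ for the reduced form factor, and gives the 80% hidden-color weight as $1 - 1/5$.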
In the case of the deuteron, both the quadrupole and magnetic moments become equal to that of an elementary vector boson in the Standard Model in the limit $M_d R_d \to 0$. The three form factors of the deuteron have the same ratio as that of the $W$ boson in the Standard Model [58].

The basic amplitude controlling the nuclear force, the nucleon-nucleon scattering amplitude, can be systematically analyzed in QCD in terms of basic quark and gluon scattering subprocesses. The high momentum transfer behavior of the amplitude from dimensional counting is $M_{pp \to pp} \simeq f_{pp \to pp}(t/s)/t^4$ at fixed center of mass angle. A review is given in Ref. [62]. The fundamental subprocesses, including pinch contributions [289], can be classified as arising from both quark interchange and gluon exchange contributions. In the case of meson-nucleon scattering, the quark exchange graphs [43] can explain virtually all of the observed features of large momentum transfer fixed CM angle scattering distributions and ratios [90]. The connection between Regge behavior and fixed angle scattering in perturbative QCD for quark exchange reactions is discussed in Ref. [69]. Sotiropoulos and Sterman [407] have shown how one can consistently interpolate from fixed angle scaling behavior to the $1/t^8$ scaling behavior of the elastic cross section in the $s \gg -t$, large $-t$ regime.

One of the most striking anomalies in elastic proton-proton scattering is the large spin correlation $A_{NN}$ observed at large angles [280]. At $\sqrt{s} \simeq 5$ GeV, the rate for scattering with incident proton spins parallel and normal to the scattering plane is four times larger than scattering with anti-parallel polarization. This phenomenon in elastic $pp$ scattering can be explained as the effect due to the onset of charm production in the intermediate state at this energy [61]. The intermediate state $|uuduudc\bar c\rangle$ has odd intrinsic parity and couples to the $J = S = 1$ initial state, thus strongly enhancing scattering when the incident projectile and target protons have their spins parallel and normal to the scattering plane.

The simplest form of the nuclear force is the interaction between two heavy quarkonium states, such as the $\Upsilon(b\bar b)$ and the $J/\psi(c\bar c)$. Since there are no valence quarks in common, the dominant color-singlet interaction arises simply from the exchange of two or more gluons, the analog of the van der Waals molecular force in QED. In principle, one could measure the interactions of such systems by producing pairs of quarkonia in high energy hadron collisions. The same fundamental QCD van der Waals potential also dominates the interactions of heavy quarkonia with ordinary hadrons and nuclei. As shown in Ref. [313], the small size of the $Q\bar Q$ bound state relative to the much larger hadron sizes allows a systematic expansion of the gluonic potential using the operator product expansion. The matrix elements of multigluon exchange in the quarkonium state can be computed from non-relativistic heavy quark theory. The coupling of the scalar part of the interaction to large-size hadrons is rigorously normalized to the mass of the state via the trace anomaly. This attractive potential dominates the interactions at low relative velocity. In this way, one establishes that the nuclear force between heavy quarkonia and ordinary nuclei is attractive and sufficiently strong to produce nuclear-bound quarkonium [64].

5.2. Moments of nucleons and nuclei in the light-cone formalism

Let us consider an effective three-quark light-cone Fock description of the nucleon in which additional degrees of freedom (including zero modes) are parameterized in an effective potential. After truncation, one could in principle obtain the mass $M$ and light-cone wavefunction of the
three-quark bound states by solving the Hamiltonian eigenvalue problem. It is reasonable to assume that adding more quark and gluonic excitations will only refine this initial approximation [363]. In such a theory the constituent quarks will also acquire effective masses and form factors. However, even without explicit solutions, one knows that the helicity and flavor structure of the baryon eigenfunctions will reflect the assumed global SU(6) symmetry and Lorentz invariance of the theory. Since we do not have an explicit representation for the effective potential in the light-cone Hamiltonian $H_{LC}^{\rm effective}$ for three quarks, we shall proceed by making an ansatz for the momentum space structure of the wavefunction $\Psi$. As we will show below, for a given size of the proton, the predictions and interrelations between observables at $Q^2 = 0$, such as the proton magnetic moment $\mu_p$ and its axial coupling $g_A$, turn out to be essentially independent of the shape of the wavefunction [71].

The light-cone model given in Refs. [395-397] provides a framework for representing the general structure of the effective three-quark wavefunctions for baryons. The wavefunction $\Psi$ is constructed as the product of a momentum wavefunction, which is spherically symmetric and invariant under permutations, and a spin-isospin wavefunction, which is uniquely determined by SU(6)-symmetry requirements. A Wigner-Melosh [450,332] rotation is applied to the spinors, so that the wavefunction of the proton is an eigenfunction of $J$ and $J_z$ in its rest frame [105,68]. To represent the range of uncertainty in the possible form of the momentum wave function, we shall choose two simple functions of the invariant mass $\mathcal{M}$ of the quarks:

$$\psi_{\rm HO}(\mathcal{M}^2) = N_{\rm HO} \exp(-\mathcal{M}^2 / 2\beta^2), \qquad \psi_{\rm Power}(\mathcal{M}^2) = N_{\rm Power} (1 + \mathcal{M}^2/\beta^2)^{-p}, \tag{5.14}$$

where $\beta$ sets the characteristic internal momentum scale. Perturbative QCD predicts a nominal power-law fall off at large $k_\perp$ corresponding to $p = 3.5$ [299,395-398]. The Melosh rotation insures that the nucleon has $j = \frac{1}{2}$ in its rest system. It has the matrix representation [332]

$$R_M(x_i, \vec k_{\perp i}, m) = \frac{m + x_i \mathcal{M} - i \vec\sigma \cdot (\vec n \times \vec k_i)}{\sqrt{(m + x_i \mathcal{M})^2 + \vec k_{\perp i}^{\,2}}} \tag{5.15}$$

with $\vec n = (0, 0, 1)$, and it becomes the unit matrix if the quarks are collinear, $R_M(x_i, 0, m) = 1$. Thus, the internal transverse momentum dependence of the light-cone wavefunctions also affects its helicity structure [51].

The Dirac and Pauli form factors $F_1(Q^2)$ and $F_2(Q^2)$ of the nucleons are given by the spin-conserving and the spin-flip vector current $J_V^+$ matrix elements ($Q^2 = -q^2$) [61]

$$F_1(Q^2) = \langle p + q, \uparrow | J_V^+ | p, \uparrow \rangle, \tag{5.16}$$

$$(Q_1 - iQ_2)\, F_2(Q^2) = -2M \langle p + q, \uparrow | J_V^+ | p, \downarrow \rangle. \tag{5.17}$$

We can then calculate the anomalous magnetic moment $a = \lim_{Q^2 \to 0} F_2(Q^2)$. [The total proton magnetic moment is $\mu_p = (e/2M)(1 + a_p)$.] The same parameters as in Ref. [396] are chosen; namely $m = 0.263$ GeV (0.26 GeV) for the up- and down-quark masses, and $\beta = 0.607$ GeV (0.55 GeV) for $\psi_{\rm Power}$ ($\psi_{\rm HO}$) and $p = 3.5$. The quark currents are taken as elementary currents with Dirac moments $e_q/2m_q$. All of the baryon moments are well-fit if one takes the strange quark mass as 0.38 GeV. With the above values, the proton magnetic moment is 2.81 nuclear magnetons and the neutron magnetic moment is $-1.66$ nuclear magnetons. (The neutron value can be improved by
relaxing the assumption of isospin symmetry.) The radius of the proton is 0.76 fm, i.e., M R "3.63. p 1 In Fig. 19 we show the functional relationship between the anomalous moment a and its Dirac p radius predicted by the three-quark light-cone model. The value of R2"!6dF (Q2)/dQ2D 2 is 1 1 Q /0 varied by changing b in the light-cone wave function while keeping the quark mass m fixed. The prediction for the power-law wavefunction t is given by the broken line; the continuous line P08%3 represents t . Fig. 19 shows that when one plots the dimensionless observable a against the HO p dimensionless observable MR the prediction is essentially independent of the assumed power-law 1 or Gaussian form of the three-quark light-cone wavefunction. Different values of p'2 also do not affect the functional dependence of a (M R ) shown in Fig. 19. In this sense the predictions of the p p 1 three-quark light-cone model relating the Q2P0 observables are essentially model-independent. The only parameter controlling the relation between the dimensionless observables in the lightcone three-quark model is m/M which is set to 0.28. For the physical proton radius M R "3.63 p p 1 one obtains the empirical value for a "1.79 (indicated by the dotted lines in Fig. 19). p The prediction for the anomalous moment a can be written analytically as a" Sc TaNR, where aNR"2M /3m is the non-relativistic (RPR) value and c is given as [103] V p V 3m (1!x )M(m#x M)!k2 /2 3 3 M3 c (x , k , m)" . (5.18) V i Mi M (m#x M)2#k2 3 M3 The expectation value Sc T is evaluated as V

$$\langle \gamma_V \rangle = \frac{\int [d^3k]\, \gamma_V\, |\psi|^2}{\int [d^3k]\, |\psi|^2} , \tag{5.19}$$

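where $[d^3k] = d\mathbf{k}_1\, d\mathbf{k}_2\, d\mathbf{k}_3\, \delta(\mathbf{k}_1 + \mathbf{k}_2 + \mathbf{k}_3)$. The third component of $\mathbf{k}$ is defined as $k_{3i} = \frac{1}{2}\left(x_i M - (m^2 + k_{\perp i}^2)/(x_i M)\right)$. This measure differs from the usual one used in Ref. [299] by the Jacobian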
Fig. 19. The anomalous magnetic moment $a = F_2(0)$ of the proton as a function of $M_p R_1$: broken line, pole type wavefunction; continuous line, Gaussian wavefunction. The experimental value is given by the dotted lines. The prediction of the model is independent of the wavefunction for $Q^2 = 0$.
Let us take a closer look at the two limits $R \to \infty$ and $R \to 0$. In the non-relativistic limit we let $\beta \to 0$ and keep the quark mass $m$ and the proton mass $M_p$ fixed. In this limit the proton radius $R_1 \to \infty$ and $a_p \to 2M_p/3m = 2.38$ since $\langle \gamma_V \rangle \to 1$. (This differs slightly from the usual non-relativistic formula $1 + a = \sum_q (e_q/e)\, M_p/m_q$ due to the non-vanishing binding energy, which results in $M_p \neq 3m_q$.) Thus, the physical value of the anomalous magnetic moment at the empirical proton radius $M_p R_1 = 3.63$ is reduced by 25% from its non-relativistic value due to relativistic recoil and nonzero $k_\perp$. (The non-relativistic value of the neutron magnetic moment is reduced by 31%.)

To obtain the ultra-relativistic limit, we let $\beta \to \infty$ while keeping $m$ fixed. In this limit the proton becomes pointlike ($M_p R_1 \to 0$) and the internal transverse momenta $k_\perp \to \infty$. The anomalous magnetic moment of the proton goes linearly to zero as $a = 0.43\, M_p R_1$ since $\langle \gamma_V \rangle \to 0$. Indeed, the Drell-Hearn-Gerasimov sum rule [174,129] demands that the proton magnetic moment becomes equal to the Dirac moment at small radius. For a spin-$\frac{1}{2}$ system

$$a^2 = \frac{M^2}{2\pi^2 \alpha} \int_{s_{th}}^{\infty} \frac{ds}{s}\, \left[\sigma_P(s) - \sigma_A(s)\right] , \tag{5.20}$$

where $\sigma_{P(A)}$ is the total photo-absorption cross section with parallel (anti-parallel) photon and target spins. If we take the point-like limit, such that the threshold for inelastic excitation becomes infinite while the mass of the system is kept finite, the integral over the photo-absorption cross section vanishes and $a = 0$ [61]. In contrast, the anomalous magnetic moment of the proton does not vanish in the non-relativistic quark model as $R \to 0$. The non-relativistic quark model does not take into account the fact that the magnetic moment of a baryon is derived from lepton scattering at non-zero momentum transfer, i.e. the calculation of a magnetic moment requires knowledge of the boosted wave function. The Melosh transformation is also essential for deriving the DHG sum rule and low energy theorems of composite systems [51].
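
The numbers quoted in this discussion can be cross-checked with a few lines of Python. The sketch below evaluates $\gamma_V$ of Eq. (5.18) at a single static point and the non-relativistic moment $a^{NR} = 2M_p/3m$. The function name `gamma_v`, the proton mass of 0.938 GeV, and the use of the point $x_3 = 1/3$, $k_{\perp 3} = 0$, $M = 3m$ as a stand-in for the non-relativistic limit are assumptions made here for illustration. The full expectation value requires the wavefunction integral of Eq. (5.19), which is not attempted.

```python
def gamma_v(x3, kperp3_sq, m, M):
    """Relativistic factor gamma_V of Eq. (5.18) for the struck quark (index 3)."""
    numerator = (1.0 - x3) * M * (m + x3 * M) - 0.5 * kperp3_sq
    denominator = (m + x3 * M) ** 2 + kperp3_sq
    return (3.0 * m / M) * numerator / denominator


m_quark = 0.263    # GeV, quark mass quoted in the text
M_proton = 0.938   # GeV, assumed physical proton mass (not stated in the text)

# Static point (an assumption of this sketch): equal momentum fractions,
# no transverse momentum, and vanishing binding so that M = 3m.
gamma_static = gamma_v(1.0 / 3.0, 0.0, m_quark, 3.0 * m_quark)

# Non-relativistic anomalous moment a_NR = 2 M_p / (3 m).
a_nr = 2.0 * M_proton / (3.0 * m_quark)

# Reduction implied by the empirical value a_p = 1.79 quoted in the text.
a_empirical = 1.79
reduction = 1.0 - a_empirical / a_nr

print(f"gamma_V at the static point: {gamma_static:.3f}")
print(f"a_NR = {a_nr:.2f}")
print(f"relative reduction = {reduction:.1%}")
```

At the static point $\gamma_V$ equals one, $a^{NR}$ comes out close to 2.38, and the reduction is close to the 25% quoted above.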
A similar analysis can be performed for the axial-vector coupling measured in neutron decay. The coupling $g_A$ is given by the spin-conserving axial current $J^+_A$ matrix element $g_A(0) = \langle p, \uparrow | J^+_A | p, \uparrow \rangle$. The value for $g_A$ can be written as $g_A = \langle \gamma_A \rangle g_A^{NR}$, with $g_A^{NR}$ being the non-relativistic value of $g_A$ and with $\gamma_A$ as [103,316]

$$\gamma_A(x_i, k_{\perp i}, m) = \frac{(m + x_3 M)^2 - k_{\perp 3}^2}{(m + x_3 M)^2 + k_{\perp 3}^2} . \tag{5.21}$$

Fig. 20a shows $g_A$ as a function of $M_p R_1$. At the physical radius $\langle \gamma_A \rangle = 0.75$, which gives $g_A = 0.75 \times 5/3 = 1.25$; the measured value is $g_A = 1.2573 \pm 0.0028$ [351]. This is a 25% reduction compared to the non-relativistic SU(6) value $g_A = 5/3$, which is only valid for a proton with large radius $R_1 \gg 1/M_p$. As shown in Ref. [316], the Melosh rotation generated by the internal transverse momentum spoils the usual identification of the $\gamma^+ \gamma_5$ quark current matrix element with the total rest-frame spin projection $s_z$, thus resulting in a reduction of $g_A$.

Thus, given the empirical values for the proton's anomalous moment $a_p$ and radius $M_p R_1$, its axial-vector coupling is automatically fixed at the value $g_A = 1.25$. This is an essentially model-independent prediction of the three-quark structure of the proton in QCD. The Melosh rotation of the light-cone wavefunction is crucial for reducing the value of the axial coupling from its non-relativistic value 5/3 to its empirical value. In Fig. 20b we plot $g_A/g_A(R_1 \to \infty)$ versus $a_p/a_p(R_1 \to \infty)$ by varying the proton radius $R_1$. The near equality of these ratios reflects the relativistic spinor structure of the nucleon bound state, which is essentially independent of the
Fig. 20. (a) The axial vector coupling $g_A$ of the neutron to proton decay as a function of $M_p R_1$. The experimental value is given by the dotted lines. (b) The ratio $g_A/g_A(R_1 \to \infty)$ versus $a_p/a_p(R_1 \to \infty)$ as a function of the proton radius $R_1$.
detailed shape of the momentum-space dependence of the light-cone wave function. We emphasize that at small proton radius the light-cone model predicts not only a vanishing anomalous moment but also $\lim_{R_1 \to 0} g_A(M_p R_1) = 0$. One can understand this physically: in the zero radius limit the internal transverse momenta become infinite and the quark helicities become completely disoriented. This is in contradiction with chiral models, which suggest that for a zero radius composite baryon one should obtain the chiral symmetry result $g_A = 1$.

The helicity measures $\Delta u$ and $\Delta d$ of the nucleon each experience the same reduction as $g_A$ due to the Melosh effect. Indeed, the quantity $\Delta q$ is defined by the axial current matrix element

$$\Delta q = \langle p, \uparrow | \bar{q}\, \gamma^+ \gamma_5\, q | p, \uparrow \rangle , \tag{5.22}$$

and the value for $\Delta q$ can be written analytically as $\Delta q = \langle \gamma_A \rangle \Delta q^{NR}$, with $\Delta q^{NR}$ being the non-relativistic or naive value of $\Delta q$ and with $\gamma_A$ as in Eq. (5.21).

The light-cone model also predicts that the quark helicity sum $\Delta\Sigma = \Delta u + \Delta d$ vanishes as the proton radius $R_1 \to 0$. Since the helicity sum $\Delta\Sigma$ depends on the proton size, it cannot be identified as the vector sum of the rest-frame constituent spins. As emphasized in Refs. [316,52], the rest-frame spin sum is not a Lorentz invariant for a composite system. Empirically, one measures $\Delta q$ from the first moment of the leading twist polarized structure function $g_1(x, Q)$. In the light-cone and parton model descriptions, $\Delta q = \int_0^1 dx\, [q^{\uparrow}(x) - q^{\downarrow}(x)]$, where $q^{\uparrow}(x)$ and $q^{\downarrow}(x)$ can be interpreted as the probability for finding a quark or antiquark with longitudinal momentum fraction $x$ and polarization parallel or anti-parallel to the proton helicity in the proton's infinite momentum frame [299]. (In the infinite momentum frame there is no distinction between the quark helicity and its spin-projection $s_z$.)
Thus, $\Delta q$ refers to the difference of helicities at fixed light-cone time or at infinite momentum; it cannot be identified with $q(s_z = +\frac{1}{2}) - q(s_z = -\frac{1}{2})$, the spin carried by each quark flavor in the proton rest frame in the equal time formalism.

Thus, the usual SU(6) values $\Delta u^{NR} = 4/3$ and $\Delta d^{NR} = -1/3$ are only valid predictions for the proton at large $M R_1$. At the physical radius the quark helicities are reduced by the same ratio 0.75 as $g_A/g_A^{NR}$ due to the Melosh rotation. Qualitative arguments for such a reduction have been given in Refs. [266,151]. For $M_p R_1 = 3.63$, the three-quark model predicts $\Delta u = 1$, $\Delta d = -1/4$, and
$\Delta\Sigma = \Delta u + \Delta d = 0.75$. Although the gluon contribution $\Delta G = 0$ in our model, the general sum rule

$$\tfrac{1}{2} \Delta\Sigma + \Delta G + L_z = \tfrac{1}{2} \tag{5.23}$$

is still satisfied, since the Melosh transformation effectively contributes to $L_z$.

Suppose one adds polarized gluons to the three-quark light-cone model. Then the flavor-singlet quark-loop radiative corrections to the gluon propagator will give an anomalous contribution $\delta(\Delta q) = -(\alpha_s/2\pi)\, \Delta G$ to each light quark helicity. The predicted value of $g_A = \Delta u - \Delta d$ is of course unchanged. For illustration we shall choose $(\alpha_s/2\pi)\, \Delta G = 0.15$. The gluon-enhanced quark model then gives the values in Table 13, which agree well with the present experimental values. Note that the gluon anomaly contribution to $\Delta s$ has probably been overestimated here due to the large strange quark mass. One could also envision other sources for this shift of $\Delta q$ such as intrinsic flavor [151]. A specific model for the gluon helicity distribution in the nucleon bound state is given in Ref. [70].

In summary, we have shown that relativistic effects are crucial for understanding the spin structure of the nucleons. By plotting dimensionless observables against dimensionless observables we obtain model-independent relations independent of the momentum-space form of the three-quark light-cone wavefunctions. For example, the value of $g_A \simeq 1.25$ is correctly predicted from the empirical value of the proton's anomalous moment. For the physical proton radius $M_p R_1 = 3.63$ the inclusion of the Wigner (Melosh) rotation due to the finite relative transverse momenta of the three quarks results in a $\simeq 25\%$ reduction of the non-relativistic predictions for the anomalous magnetic moment, the axial vector coupling, and the quark helicity content of the proton. At zero radius, the quark helicities become completely disoriented because of the large internal momenta, resulting in the vanishing of $g_A$ and the total quark helicity $\Delta\Sigma$.

5.3. Applications to nuclear systems

We can analyze a nuclear system in the same way as we did the nucleon in the preceding section. The triton, for instance, is modeled as a bound state of a proton and two neutrons. The same formulae as in the preceding section are valid (for spin-$\frac{1}{2}$ nuclei); we only have to use the appropriate parameters for the constituents. The light-cone analysis yields non-trivial corrections to the moments of nuclei. For example, consider the anomalous magnetic moment $a_d$ and anomalous quadrupole moment
Table 13
Comparison of the quark content of the proton in the non-relativistic quark model (NR), in the three-quark model (3q), in a gluon-enhanced three-quark model (3q+g), and with experiment

Quantity          NR      3q      3q+g     Experiment
$\Delta u$        4/3     1       0.85     0.83 ± 0.03
$\Delta d$        -1/3    -1/4    -0.40    -0.43 ± 0.03
$\Delta s$        0       0       -0.15    -0.10 ± 0.03
$\Delta\Sigma$    1       3/4     0.30     0.31 ± 0.07
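
The model columns of Table 13 follow from two steps described in the text: scale the SU(6) values by $\langle \gamma_A \rangle = 0.75$, then shift each light-quark helicity by $-(\alpha_s/2\pi)\, \Delta G = -0.15$. The Python sketch below reproduces that arithmetic with exact fractions. The dictionary layout and helper names are illustrative choices made here, and the value of $L_z$ is computed only for the pure three-quark case, where the text states $\Delta G = 0$.

```python
from fractions import Fraction

# Non-relativistic SU(6) quark helicities quoted in the text.
nr = {"u": Fraction(4, 3), "d": Fraction(-1, 3), "s": Fraction(0)}

melosh_factor = Fraction(3, 4)     # <gamma_A> = 0.75 at M_p R_1 = 3.63
gluon_shift = Fraction(15, 100)    # illustrative (alpha_s / 2 pi) * Delta G = 0.15

# Three-quark light-cone model: every helicity is scaled by <gamma_A>.
three_q = {flavor: melosh_factor * value for flavor, value in nr.items()}

# Gluon-enhanced model: each helicity is shifted by -(alpha_s / 2 pi) * Delta G.
three_q_g = {flavor: value - gluon_shift for flavor, value in three_q.items()}


def helicity_sum(model):
    """Total quark helicity Delta Sigma (sum over the listed flavors)."""
    return sum(model.values())


def axial_coupling(model):
    """g_A = Delta u - Delta d."""
    return model["u"] - model["d"]


# Orbital contribution required by the sum rule (5.23) when Delta G = 0.
l_z_three_q = Fraction(1, 2) - Fraction(1, 2) * helicity_sum(three_q)

for name, model in [("NR", nr), ("3q", three_q), ("3q+g", three_q_g)]:
    row = ", ".join(f"Delta {f} = {float(v):+.2f}" for f, v in model.items())
    print(f"{name:5s} {row}, Delta Sigma = {float(helicity_sum(model)):.2f}, "
          f"g_A = {float(axial_coupling(model)):.2f}")
print(f"L_z (3q, Delta G = 0) = {float(l_z_three_q):.3f}")
```

The uniform shift cancels in $\Delta u - \Delta d$, which is why $g_A$ is the same in the 3q and 3q+g columns.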
$Q^a_d = Q_d + e/M_d^2$ of the deuteron. As shown in Ref. [432], these moments satisfy the sum rule

$$a_d^2 + \frac{2t}{M_d^2} \left( a_d + \frac{M_d}{2}\, Q^a_d \right)^2 = \frac{1}{4\pi} \int_{\nu_{th}^2}^{\infty} \frac{d\nu^2}{(\nu - t/4)^3}\, \left( {\rm Im}\, f_P(\nu, t) - {\rm Im}\, f_A(\nu, t) \right) . \tag{5.24}$$

Here $f_{P(A)}(\nu, t)$ is the non-forward Compton amplitude for incident parallel (anti-parallel) photon-deuteron helicities. Thus, in the point-like limit where the threshold for particle excitation $\nu_{th} \to \infty$, the deuteron acquires the same electro-magnetic moments $Q^a_d \to 0$, $a_d \to 0$ as that of the $W$ in the Standard Model [58]. The approach to zero anomalous magnetic and quadrupole moments for $R_d \to 0$ is shown in Figs. 21 and 22. Thus, even if the deuteron has no D-wave component, a non-zero quadrupole moment arises from the relativistic recoil correction. This correction, which is mandated by relativity, could cure a long-standing discrepancy between experiment and the traditional nuclear physics predictions for the deuteron quadrupole. Conventional nuclear theory predicts a quadrupole moment of 7.233 GeV$^{-2}$, which is smaller than the experimental value $(7.369 \pm 0.039)$ GeV$^{-2}$. The light-cone calculation for a pure S-wave gives a positive contribution of 0.08 GeV$^{-2}$, which accounts for most of the previous discrepancy.

In the case of the tritium nucleus, the value of the Gamow-Teller matrix element can be calculated in the same way as we calculated the axial vector coupling $g_A$ of the nucleon in the previous section. The correction to the non-relativistic limit for the S-wave contribution is
Fig. 21. The anomalous moment $a_d$ of the deuteron as a function of the deuteron radius $R_d$. In the limit of zero radius, the anomalous moment vanishes.
Fig. 22. The quadrupole moment $Q_d$ of the deuteron as a function of the deuteron radius $R_d$. In the limit of zero radius, the quadrupole moment approaches its canonical value $Q_d = -e/M_d^2$.
$g_A = \langle \gamma_A \rangle g_A^{NR}$. For the physical quantities of the triton we get $\langle \gamma_A \rangle = 0.99$. This means that even at the physical radius, we find a non-trivial non-zero correction of order $-0.01$ to $g_A^{\rm triton}/g_A^{\rm nucleon}$ due to the relativistic recoil correction implicit in the light-cone formalism. The Gamow-Teller matrix element is measured to be $0.961 \pm 0.003$. The wavefunction of the tritium ($^3$H) is a superposition of a dominant S-state and small D- and S'-state components, $\phi = \phi_S + \phi_{S'} + \phi_D$. The Gamow-Teller matrix element in the non-relativistic theory is then given by $g_A^{\rm triton}/g_A^{\rm nucleon} = (|\phi_S|^2 - \frac{1}{3}|\phi_{S'}|^2 + \frac{1}{3}|\phi_D|^2)(1 + 0.0589) = 0.974$, where the last term is a correction due to meson exchange currents. Fig. 23 shows that the Gamow-Teller matrix element of tritium must approach zero in the limit of small nuclear radius, just as in the case of the nucleon as a bound state of three quarks. This phenomenon is confirmed in the light-cone analysis.

5.4. Exclusive nuclear processes

One of the most elegant areas of application of QCD to nuclear physics is the domain of large momentum transfer exclusive nuclear processes [102]. Rigorous results for the asymptotic properties of the deuteron form factor at large momentum transfer are given in Ref. [59]. In the asymptotic limit $Q^2 \to \infty$ the deuteron distribution amplitude, which controls large momentum transfer deuteron reactions, becomes fully symmetric among the five possible color-singlet combinations of the six quarks. One can also study the evolution of the "hidden color" components (orthogonal to the $np$ and $\Delta\Delta$ degrees of freedom) from intermediate to large momentum transfer scales; the results also give constraints on the nature of the nuclear force at short distances in QCD. The existence of hidden color degrees of freedom further illustrates the complexity of nuclear systems in QCD.
It is conceivable that six-quark $d^*$ resonances corresponding to these new degrees of freedom may be found by careful searches of the $\gamma^* d \to \gamma d$ and $\gamma^* d \to \pi d$ channels.

The basic scaling law for the helicity-conserving deuteron form factor, $F_d(Q^2) \sim 1/Q^{10}$, comes from simple quark counting rules, as well as perturbative QCD. One cannot expect this asymptotic prediction to become accurate until very large $Q^2$ since the momentum transfer has to be shared by at least six constituents. However, one can identify the QCD physics due to the compositeness of the nucleus, with respect to its nucleon degrees of freedom, by using the reduced amplitude
Fig. 23. The reduced Gamow-Teller matrix element for tritium decay as a function of the tritium radius.
formalism [68]. For example, consider the deuteron form factor in QCD. By definition this quantity is the probability amplitude for the deuteron to scatter from $p$ to $p + q$ but remain intact. Note that for vanishing nuclear binding energy $\epsilon_d \to 0$, the deuteron can be regarded as two nucleons sharing the deuteron four-momentum (see Fig. 24a). In the zero-binding limit, one can show that the nuclear light-cone wavefunction properly decomposes into a product of uncorrelated nucleon wavefunctions [249,308]. The momentum $\ell$ is limited by the binding and can thus be neglected, and to first approximation, the proton and neutron share the deuteron's momentum equally. Since the deuteron form factor contains the probability amplitudes for the proton and neutron to scatter from $p/2$ to $p/2 + q/2$, it is natural to define the reduced deuteron form factor [68,59,249]:

$$f_d(Q^2) \equiv \frac{F_d(Q^2)}{F_{1N}\!\left(\tfrac{1}{4} Q^2\right) F_{1N}\!\left(\tfrac{1}{4} Q^2\right)} . \tag{5.25}$$

The effect of nucleon compositeness is removed from the reduced form factor. QCD then predicts the scaling

$$f_d(Q^2) \sim \frac{1}{Q^2} , \tag{5.26}$$

i.e. the same scaling law as a meson form factor. Diagrammatically, the extra power of $1/Q^2$ comes from the propagator of the struck quark line, the one propagator not contained in the nucleon form factors. Because of hadron helicity conservation, the prediction is for the leading helicity-conserving deuteron form factor ($\lambda = \lambda' = 0$). As shown in Fig. 25, this scaling is consistent with experiment for $Q = p_T \sim 1$ GeV. The data are summarized in Ref. [58].

The distinction between the QCD and other treatments of nuclear amplitudes is particularly clear in the reaction $\gamma d \to np$, i.e. photo-disintegration of the deuteron at fixed center of mass angle. Using dimensional counting [54], the leading power-law prediction from QCD is simply $(d\sigma/dt)(\gamma d \to np) \sim F(\theta_{cm})/s^{11}$. A comparison of the QCD prediction with the recent experiment of Ref. [31] is shown in Fig. 26, confirming the validity of the QCD scaling prediction up to $E_\gamma \simeq 3$ GeV. One can take into account much of the finite-mass, higher-twist corrections by using the reduced amplitude formalism [58]. The photo-disintegration
Fig. 24. (a) Application of the reduced amplitude formalism to the deuteron form factor at large momentum transfer. (b) Construction of the reduced nuclear amplitude for two-body inelastic deuteron reactions.
Fig. 25. Scaling of the deuteron reduced form factor.
Fig. 26. Comparison of deuteron photo-disintegration data with the scaling prediction which requires $s^{11}\, d\sigma/dt(s, \theta_{cm})$ to be at most logarithmically dependent on energy at large momentum transfer.
amplitude contains the probability amplitude (i.e. nucleon form factors) for the proton and neutron to each remain intact after absorbing momentum transfers $p_p - \frac{1}{2} p_d$ and $p_n - \frac{1}{2} p_d$, respectively (see Fig. 24b). After the form factors are removed, the remaining "reduced" amplitude should scale as $F(\theta_{cm})/p_T$. The single inverse power of transverse momentum $p_T$ is the slowest conceivable in any theory, but it is the unique power predicted by PQCD. The data and predictions from conventional nuclear theory are summarized in Ref. [133].

There are a number of related tests of QCD and reduced amplitudes which require $\bar{p}$ beams [249], such as $\bar{p} d \to \gamma n$ and $\bar{p} d \to \pi p$ in the fixed $\theta_{cm}$ region. These reactions are particularly interesting tests of QCD in nuclei. Dimensional counting rules predict the asymptotic behavior $(d\sigma/dt)(\bar{p} d \to \pi p) \sim (1/(p_T^2)^{12})\, f(\theta_{cm})$ since there are 14 initial and final quanta involved. Again one notes that the $\bar{p} d \to \pi p$ amplitude contains a factor representing the probability amplitude (i.e. form factor) for the proton to remain intact after absorbing momentum transfer squared $\hat{t} = (p - \frac{1}{2} p_d)^2$ and the $\bar{N} N$ time-like form factor at $\hat{s} = (\bar{p} + \frac{1}{2} p_d)^2$. Thus, $\mathcal{M}_{\bar{p} d \to \pi p} \sim F_{1N}(\hat{t})\, F_{1N}(\hat{s})\, \mathcal{M}_r$, where $\mathcal{M}_r$ has the same QCD scaling properties as quark-meson scattering. One thus predicts

$$\frac{(d\sigma/d\Omega)(\bar{p} d \to \pi p)}{F_{1N}^2(\hat{t})\, F_{1N}^2(\hat{s})} \sim \frac{f(\Omega)}{p_T^2} . \tag{5.27}$$

Other work has been done by Cardarelli et al. [86].

5.5. Conclusions

As we have emphasized in this section, QCD and relativistic light-cone Fock methods provide a new perspective on nuclear dynamics and properties. In many cases the covariant approach fundamentally contradicts standard nuclear assumptions. More generally, the synthesis of QCD with the standard non-relativistic approach can be used to constrain the analytic form and
unknown parameters in the conventional theory, as in Bohr's correspondence principle. For example, the reduced amplitude formalism and PQCD scaling laws provide analytic constraints on the nuclear amplitudes and potentials at short distances and large momentum transfers.

6. Exclusive processes and light-cone wavefunctions

One of the major advantages of the light-cone formalism is that many properties of large momentum transfer exclusive reactions can be calculated without explicit knowledge of the form of the non-perturbative light-cone wavefunctions. The main ingredients of this analysis are asymptotic freedom, and the power-law scaling relations and quark helicity conservation rules of perturbative QCD. For example, consider the light-cone expression (5.5) for a meson form factor at high momentum transfer $Q^2$. If the internal momentum transfer is large then one can iterate the gluon-exchange term in the effective potential for the light-cone wavefunctions. The result is that the hadron form factors can be written in a factorized form as a convolution of quark "distribution amplitudes" $\phi(x_i, Q)$, one for each hadron involved in the amplitude, with a hard-scattering amplitude $T_H$ [297-299]. The pion's electro-magnetic form factor, for example, can be written as

$$F_\pi(Q^2) = \int_0^1 dx \int_0^1 dy\, \phi^*_\pi(y, Q)\, T_H(x, y, Q)\, \phi_\pi(x, Q)\, \left(1 + O(1/Q)\right) . \tag{6.1}$$

Here $T_H$ is the scattering amplitude for the form factor but with the pions replaced by collinear $q\bar{q}$ pairs, i.e. the pions are replaced by their valence partons. We can also regard $T_H$ as the free particle matrix element of the order $1/q^2$ term in the effective Lagrangian for $\gamma^* q\bar{q} \to q\bar{q}$.

The process-independent distribution amplitude [297-299] $\phi_\pi(x, Q)$ is the probability amplitude for finding the $q\bar{q}$ pair in the pion with $x_q = x$ and $x_{\bar{q}} = 1 - x$. It is directly related to the light-cone valence wavefunction:

$$\phi_\pi(x, Q) = \int \frac{d^2 k_\perp}{16\pi^3}\, \psi^{(Q)}_{q\bar{q}/\pi}(x, k_\perp) \tag{6.2}$$

$$= P^+_\pi \int \frac{dz^-}{4\pi}\, e^{i x P^+_\pi z^-/2}\, \left. \langle 0 |\, \bar{\psi}(0)\, \frac{\gamma^+ \gamma_5}{2\sqrt{2 n_c}}\, \psi(z)\, | \pi \rangle^{(Q)} \right|_{z^+ = z_\perp = 0} . \tag{6.3}$$

The $k_\perp$ integration in Eq. (6.2) is cut off by the ultraviolet cutoff $\Lambda = Q$ implicit in the wave function; thus only Fock states with invariant mass squared $\mathcal{M}^2 < Q^2$ contribute. We will return later to the discussion of ultraviolet regularization in the light-cone formalism.

It is important to note that the distribution amplitude is gauge-invariant. In gauges other than light-cone gauge, a path-ordered "string operator" $P \exp\left(\int_0^1 ds\, i g\, A(sz) \cdot z\right)$ must be included between the $\bar{\psi}$ and $\psi$. The line integral vanishes in light-cone gauge because $A \cdot z = A^+ z^-/2 = 0$, and so the factor can be omitted in that gauge. This (non-perturbative) definition of $\phi$ uniquely fixes the definition of $T_H$, which must itself then be gauge-invariant.

The above result is in the form of a factorization theorem; all of the non-perturbative dynamics is factorized into the non-perturbative distribution amplitudes, which sum all internal momentum transfers up to the scale $Q^2$. On the other hand, all momentum transfers higher than $Q^2$ appear in $T_H$, which, because of asymptotic freedom, can be computed perturbatively in powers of the QCD running coupling constant $\alpha_s(Q^2)$.
Given the factorized structure, one can read off a number of general features of the PQCD predictions, e.g. the dimensional counting rules, hadron helicity conservation, color transparency, etc. [62]. In addition, the scaling behavior of the exclusive amplitude is modified by the logarithmic dependence of the distribution amplitudes in $\ln Q^2$, which is in turn determined by QCD evolution equations [297-299].

An important application of the PQCD analysis is exclusive Compton scattering and the related cross process $\gamma\gamma \to \bar{p} p$. Each helicity amplitude for $\gamma p \to \gamma p$ can be computed at high momentum transfer from the convolution of the proton distribution amplitude with the $O(\alpha_s^2)$ amplitudes for $qqq\gamma \to qqq\gamma$. The result is a cross section which scales as

$$\frac{d\sigma}{dt}(\gamma p \to \gamma p) = \frac{F(\theta_{CM}, \ln s)}{s^6} \tag{6.4}$$

if the proton helicity is conserved. The helicity-flip amplitude and contributions involving more quarks or gluons in the proton wavefunction are power-law suppressed. The nominal $s^{-6}$ fixed angle scaling follows from dimensional counting rules [54]. It is modified logarithmically due to the evolution of the proton distribution amplitude and the running of the QCD coupling constant [297-299]. The normalization, angular dependence, and phase structure are highly sensitive to the detailed shape of the non-perturbative form of $\phi_p(x_i, Q^2)$. Recently, Kronfeld and Nizic [284] have calculated the leading Compton amplitudes using model forms for $\phi_p$ predicted in the QCD sum rule analyses [100]; the calculation is complicated by the presence of integrable poles in the hard-scattering subprocess $T_H$. The results for the unpolarized cross section are shown in Fig. 27.

There also has been important progress testing PQCD experimentally using measurements of the $p \to N^*$ form factors. In an analysis of existing SLAC data, Stoler [409] has obtained measurements of several transition form factors of the proton to resonances at $W = 1232$, 1535, and 1680 MeV.
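
The power laws quoted in Sections 5.4 and 6 can be recovered by simply counting elementary fields. The Python sketch below assumes the dimensional counting rule of Ref. [54] in its commonly stated form, $d\sigma/dt \sim f(\theta_{cm})/s^{\,n-2}$ with $n$ the total number of elementary fields in the initial and final states, and $F_H(Q^2) \sim 1/(Q^2)^{n_H - 1}$ for a hadron with $n_H$ constituents. The table `FIELDS` and the function names are illustrative and not taken from the original text.

```python
# Minimum number of elementary fields per particle (valence quarks; a photon counts as one).
FIELDS = {"gamma": 1, "pi": 2, "p": 3, "n": 3, "pbar": 3, "d": 6}


def fixed_angle_power(initial, final):
    """Power N in d(sigma)/dt ~ f(theta_cm) / s**N, using N = n - 2."""
    n_fields = sum(FIELDS[name] for name in initial + final)
    return n_fields - 2


def form_factor_power(hadron):
    """Power K in F(Q^2) ~ 1 / (Q^2)**K, using K = n_H - 1."""
    return FIELDS[hadron] - 1


compton = fixed_angle_power(["gamma", "p"], ["gamma", "p"])
photodisintegration = fixed_angle_power(["gamma", "d"], ["n", "p"])
pbar_deuteron = fixed_angle_power(["pbar", "d"], ["pi", "p"])

# Reduced deuteron form factor, Eq. (5.25): F_d divided by two nucleon form factors.
reduced_deuteron = form_factor_power("d") - 2 * form_factor_power("p")

print(f"gamma p -> gamma p : 1/s^{compton}")
print(f"gamma d -> n p     : 1/s^{photodisintegration}")
print(f"pbar d -> pi p     : 1/s^{pbar_deuteron}")
print(f"F_d                : 1/(Q^2)^{form_factor_power('d')}")
print(f"f_d (reduced)      : 1/(Q^2)^{reduced_deuteron}")
```

The counts reproduce the $s^{-6}$, $s^{-11}$ and $(p_T^2)^{-12}$ behavior quoted in the text, as well as $F_d \sim 1/Q^{10}$ and the meson-like $1/Q^2$ scaling of the reduced form factor in Eq. (5.26).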
As is the case of the elastic proton form factor, the observed behavior of the transition form factors to the $N^*(1535)$ and $N^*(1680)$ are each consistent with the $Q^{-4}$ fall-off and dipole scaling predicted by PQCD and hadron helicity conservation over the measured range $1 < Q^2 < 21$ GeV$^2$. In contrast, the $p \to \Delta(1232)$ form factor decreases faster than $1/Q^4$, suggesting that non-leading processes are dominant in this case. Remarkably, this pattern of scaling behavior
Fig. 27. Comparison of the order $\alpha_s^4/s^6$ PQCD prediction for proton Compton scattering with the available data. The calculation assumes PQCD factorization and distribution amplitudes computed from QCD sum rule moments.
is what is expected from PQCD and the QCD sum rule analyses [100], since, unlike the case of the proton and its other resonances, the distribution amplitude $\phi_{N^*}(x_1, x_2, x_3, Q)$ of the $\Delta$ resonance is predicted to be nearly symmetric in the $x_i$, and a symmetric distribution leads to a strong cancelation [89] of the leading helicity-conserving terms in the matrix elements of the hard scattering amplitude for $qqq \to \gamma^* qqq$.

These comparisons of the proton form factor and Compton scattering predictions with experiment are very encouraging, showing agreement in both the fixed-angle scaling behavior predicted by PQCD and the normalization predicted by QCD sum rule forms for the proton distribution amplitude. Assuming one can trust the validity of the leading order analysis, a systematic series of polarized target and beam Compton scattering measurements on proton and neutron targets and the corresponding two-photon reactions $\gamma\gamma \to p\bar{p}$ will strongly constrain a fundamental quantity in QCD, the nucleon distribution amplitude $\phi(x_i, Q^2)$. It is thus imperative for theorists to develop methods to calculate the shape and normalization of the non-perturbative distribution amplitudes from first principles in QCD.

6.1. Is PQCD factorization applicable to exclusive processes?

One of the concerns in the derivation of the PQCD results for exclusive amplitudes is whether the momentum transfer carried by the exchanged gluons in the hard scattering amplitude $T_H$ is sufficiently large to allow a safe application of perturbation theory [238]. The problem appears to be especially serious if one assumes a form for the hadron distribution amplitudes $\phi_H(x_i, Q^2)$ which has strong support at the endpoints, as in the QCD sum rule model forms suggested by Chernyak and Zhitnitskii and others [100,468].

This problem has now been clarified by two groups: Gari et al. [170] in the case of baryon form factors, and Mankiewicz and Szczepaniak [419], for the case of meson form factors.
Each of these authors has pointed out that the assumed non-perturbative input for the distribution amplitudes must vanish strongly in the endpoint region; otherwise, there is a double-counting problem for momentum transfers occurring in the hard scattering amplitude and the distribution amplitudes. Once one enforces this constraint on the basis functions used to represent the QCD moments (e.g. by using exponentially suppressed wavefunctions [299]), or uses a sufficiently large number of polynomial basis functions, the resulting distribution amplitudes do not allow a significant contribution to the high $Q^2$ form factors to come from the soft gluon exchange region. The comparison of the PQCD predictions with experiment thus becomes phenomenologically and analytically consistent. An analysis of exclusive reactions based on the effective Lagrangian method is also consistent with this approach. In addition, as discussed by Botts [47], potentially soft contributions to large angle hadron-hadron scattering reactions from Landshoff pinch contributions [289] are strongly suppressed by Sudakov form factor effects.

The empirical successes of the PQCD approach, together with the evidence for color transparency in quasi-elastic $pp$ scattering [62], give strong support for the validity of PQCD factorization for exclusive processes at moderate momentum transfer. It seems difficult to understand this pattern of form factor behavior if it is due to simple convolutions of soft wavefunctions. Thus, it should be possible to use these processes to empirically constrain the form of the hadron distribution amplitudes, and thus confront non-perturbative QCD in detail. For recent work, see Refs. [7,122,254,334].
6.2. Light-cone quantization and heavy particle decays

One of the most interesting applications of the light-cone PQCD formalism for large momentum transfer exclusive processes is to heavy quark decays. For example, consider the decay $\eta_c \to \gamma\gamma$. If we choose the Lagrangian cutoff $\Lambda^2 \sim m_c^2$, then to leading order in $1/m_c$, all of the bound state physics and virtual loop corrections are contained in the $c\bar{c}$ Fock wavefunction $\psi_{\eta_c}(x_i, k_{\perp i})$. The hard scattering matrix element of the effective Lagrangian coupling $c\bar{c} \to \gamma\gamma$ contains all of the higher corrections in $\alpha_s(\Lambda^2)$ from virtual momenta $|k^2| > \Lambda^2$. Thus,

$$\mathcal{M}(\eta_c \to \gamma\gamma) = \int d^2 k_\perp \int_0^1 dx\, \psi^{(\Lambda)}_{\eta_c}(x, k_\perp)\, T_H^{(\Lambda)}(c\bar{c} \to \gamma\gamma) \;\Rightarrow\; \int_0^1 dx\, \phi(x, \Lambda)\, T_H^{(\Lambda)}(c\bar{c} \to \gamma\gamma) , \tag{6.5}$$

where $\phi(x, \Lambda^2)$ is the $\eta_c$ distribution amplitude. This factorization and separation of scales is shown in Fig. 28. Since $\eta_c$ is quite non-relativistic, its distribution amplitude is peaked at $x = 1/2$, and its integral over $x$ is essentially equivalent to the wavefunction at the origin, $\psi(r = 0)$.

Another interesting calculational example of quarkonium decay in PQCD is the annihilation of the $J/\psi$ into baryon pairs. The calculation requires the convolution of the hard annihilation amplitude $T_H(c\bar{c} \to ggg \to uud\, \overline{uud})$ with the $J/\psi$, baryon, and anti-baryon distribution amplitudes [297-299] (see Fig. 29). The magnitude of the computed decay amplitude for $\psi \to \bar{p} p$ is consistent with experiment assuming the proton distribution amplitude computed from QCD sum rules [100]; see also Keister [269]. The angular distribution of the proton in $e^+ e^- \to J/\psi \to \bar{p} p$ is also consistent with the hadron helicity conservation rule predicted by PQCD, i.e. opposite proton and anti-proton helicity. The spin structure of hadrons has been investigated by Ma [318,319], using light-cone methods.
Fig. 28. Factorization of perturbative and non-perturbative contributions to the decay $\eta_c \to \gamma\gamma$.
Fig. 29. Calculation of $J/\psi \to p\bar{p}$ in PQCD.
The effective Lagrangian method was used by Lepage et al. [301] to systematically compute the order $\alpha_s(\hat{Q})$ corrections to the hadronic and photon decays of quarkonium. The scale $\hat{Q}$ can then be set by incorporating vacuum polarization corrections into the running coupling constant [57]. A summary of the results can be found in Ref. [286].

6.3. Exclusive weak decays of heavy hadrons

An important application of the PQCD effective Lagrangian formalism is to the exclusive decays of heavy hadrons to light hadrons, such as $B^0 \to \pi^+\pi^-,\ K^+ K^-$ [418]. To a good approximation, the decay amplitude $\mathcal{M} = \langle B | H_{Wk} | \pi^+\pi^- \rangle$ is caused by the transition $\bar{b} \to W^+ \bar{u}$; thus $\mathcal{M} = f_\pi\, p^\mu_\pi\, (G_F/\sqrt{2})\, \langle \pi^- | J_\mu | B^0 \rangle$, where $J_\mu$ is the $\bar{b} \to \bar{u}$ weak current. The problem is then to recouple the spectator $d$ quark and the other gluon and possible quark pairs in each $B^0$ Fock state to the corresponding Fock state of the final state $\pi^-$ (see Fig. 30). The kinematic constraint that $(p_B - p_\pi)^2 = m_\pi^2$ then demands that at least one quark line is far off shell: $p_{\bar{u}}^2 = (y p_B - p_\pi)^2 \sim -\mu m_B \sim -1.5$ GeV$^2$, where we have noted that the light quark takes only a fraction $(1 - y) \sim \sqrt{k_\perp^2 + m_d^2}/m_B$ of the heavy meson's momentum since all of the valence quarks must have nearly equal velocity in a bound state. In view of the successful applications [409] of PQCD factorization to form factors at momentum transfers in the few GeV$^2$ range, it is reasonable to assume that $\langle |p_{\bar{u}}^2| \rangle$ is sufficiently large that we can begin to apply perturbative QCD methods.

The analysis of the exclusive weak decay amplitude can be carried out in parallel to the PQCD analysis of electro-weak form factors [57] at large $Q^2$. The first step is to iterate the wavefunction equations of motion so that the large momentum transfer through the gluon exchange potential is exposed. The heavy quark decay amplitude can then be written as a convolution of the hard scattering amplitude for $Q\bar{q} \to W^+ q\bar{q}$ with the $B$ and $\pi$ distribution amplitudes.
The valence Fock state of each hadron, which has the minimum number of constituents, gives the leading power-law contribution. Equivalently, we can choose the ultraviolet cut-off scale in the Lagrangian at $\Lambda^2 < \mu m_B$, so that the hard scattering amplitude $T_H(Q\bar{q} \to W^+ q\bar{q})$ must be computed from the matrix elements of the order $1/\Lambda^2$ terms in $\delta\mathcal{L}$. Thus, $T_H$ contains all perturbative virtual loop corrections of order $\alpha_s(\Lambda^2)$. The result is the factorized form
$$M(B \to \pi\pi) = \int_0^1 dx \int_0^1 dy\; \phi_B(y,\Lambda)\, T_H\, \phi_\pi(x,\Lambda), \tag{6.6}$$
which is expected to
S.J. Brodsky et al. / Physics Reports 301 (1998) 299—486
Fig. 30. Calculation of the weak decay $B \to \pi\pi$ in the PQCD formalism of Ref. [418]. The gluon exchange kernel of the hadron wavefunction is exposed where hard momentum transfer is required.
be correct up to terms of order $1/\Lambda^4$. All of the non-perturbative corrections with momenta $|k^2| < \Lambda^2$ are summed in the distribution amplitudes. In order to make an estimate of the size of the $B \to \pi\pi$ amplitude, in Ref. [418] we have taken the simplest possible forms for the required wavefunctions,
$$\phi_\pi(y) \propto \gamma_5\, \not{p}_\pi\; y(1-y) \tag{6.7}$$
for the pion and
$$\phi_B(x) \propto \frac{\gamma_5\,[\not{p}_B + m_B\, g(x)]}{[1 - (1/x) - \epsilon^2/(1-x)]^2} \tag{6.8}$$
for the $B$, each normalized to its meson decay constant. The above form for the heavy quark distribution amplitude is chosen so that the wavefunction peaks at equal velocity; this is consistent with the phenomenological forms used to describe heavy quark fragmentation into heavy hadrons. We estimate $\epsilon \sim 0.05$ to $0.10$. The functional dependence of the mass term $g(x)$ is unknown; however, it should be reasonable to take $g(x) \sim 1$, which is correct in the weak binding approximation. One now can compute the leading order PQCD decay amplitude
$$M(B^0 \to \pi^-\pi^+) = \frac{G_F}{\sqrt{2}}\, V_{ud}^* V_{ub}\, P^\mu_{\pi^+} \langle\pi^-|V^\mu|B^0\rangle, \tag{6.9}$$
where
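$$\langle\pi^-|V^\mu|B^0\rangle = \frac{8\pi\alpha_s(Q^2)}{3} \int_0^1 dx \int_0^{1-\epsilon} dy\; \phi_B(x)\,\phi_\pi(y) \left\{ \frac{\mathrm{Tr}[\not{P}_{\pi^-}\gamma_5\gamma^\nu \not{k}_1 \gamma^\mu (\not{P}_B + M_B\, g(x))\gamma_5\gamma_\nu]}{k_1^2\, Q^2} + \frac{\mathrm{Tr}[\not{P}_{\pi^-}\gamma_5\gamma^\mu (\not{k}_2 + M_B)\gamma^\nu (\not{P}_B + M_B\, g(x))\gamma_5\gamma_\nu]}{(k_2^2 - M_B^2)\, Q^2} \right\}. \tag{6.10}$$
Numerically, this gives the branching ratio
$$BR(B^0 \to \pi^+\pi^-) \sim 10^{-8}\, \xi^2 N, \tag{6.11}$$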
where $\xi = 10\,|V_{ub}/V_{cb}|$ is probably less than unity, and $N$ has strong dependence on the value of $g$: $N = 180$ for $g = 1$ and $N = 5.8$ for $g = 1/2$. The present experimental limit [21] is
$$BR(B^0 \to \pi^+\pi^-) < 3 \times 10^{-4}. \tag{6.12}$$
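To see how far the estimate of Eq. (6.11) lies below the limit of Eq. (6.12), the arithmetic can be sketched as follows. This is only an illustration: the function name is hypothetical, and setting $\xi = 1$ is an assumption that takes the parameter at the upper end of what the text considers plausible.

```python
# Illustrative arithmetic for Eqs. (6.11) and (6.12); names are hypothetical.

EXPERIMENTAL_LIMIT = 3.0e-4  # upper limit on BR(B0 -> pi+ pi-) quoted in Eq. (6.12)

# N as quoted in the text for two choices of the mass-term function g
N_BY_G = {1.0: 180.0, 0.5: 5.8}


def branching_ratio(xi: float, n_factor: float) -> float:
    """Order-of-magnitude estimate BR ~ 1e-8 * xi^2 * N from Eq. (6.11)."""
    return 1.0e-8 * xi ** 2 * n_factor


xi = 1.0  # assumption: xi at its largest plausible value
for g_value, n_factor in N_BY_G.items():
    br = branching_ratio(xi, n_factor)
    print(f"g = {g_value}: BR ~ {br:.1e}, limit / BR ~ {EXPERIMENTAL_LIMIT / br:.0f}")
```

Under these assumptions both estimates sit at least two orders of magnitude below the quoted experimental limit.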
A similar PQCD analysis can be applied to other two-body decays of the $B$; the ratios of the widths will not be so sensitive to the form of the distribution amplitude, allowing tests of the flavor symmetries of the weak interaction. Semi-leptonic decay rates can be calculated [99,128,187,246,404], and the construction of the heavy quark wavefunctions [101,465] can be helpful for that.

6.4. Can light-cone wavefunctions be measured?

Essential information on the shape and form of the valence light-cone wavefunctions can be obtained empirically through measurements of exclusive processes at large momentum transfer. In the case of the pion, data for the scaling and magnitude of the photon transition form factor $F_{\gamma\pi^0}(q^2)$ suggest that the distribution amplitude of the pion $\phi_\pi(x,Q)$ is close in form to the asymptotic form $\phi_\pi^\infty(x) = \sqrt{3}\, f_\pi\, x(1-x)$, the solution to the evolution equation for the pion at infinite resolution $Q \to \infty$ [299]. Note that the pion distribution amplitude is constrained by $\pi \to \mu\nu$ decay,
$$\int_0^1 dx\; \phi_\pi(x,Q) = \frac{f_\pi}{2\sqrt{3}}. \tag{6.13}$$
The proton distribution amplitude, as determined by the proton form factor at large momentum transfer and by Compton scattering, is apparently highly asymmetric, as suggested by QCD sum rules and SU(6) flavor-spin symmetry.

The most direct way to measure the hadron distribution wavefunction is through the diffractive dissociation of a high energy hadron to jets or nuclei, e.g. $\pi A \to \mathrm{Jet} + \mathrm{Jet} + A'$, where the final-state nucleus remains intact [36,149]. The incoming hadron is a sum over all of its $H^0_{LC}$ fluctuations. When the pion fluctuates into a $q\bar{q}$ state with small impact separation $b_\perp$ of order $1/Q$, its color interactions are minimal: this is the “color transparency” property of QCD [60]. Thus, this fluctuation will interact coherently throughout the nucleus without initial or final state absorption corrections. The result is that the pion is coherently materialized into two jets of invariant mass squared $M^2$ with minimal momentum transfer to the nucleus,
$$\Delta Q_L = \frac{M^2 - m_\pi^2}{2E_L}. \tag{6.14}$$
Thus, the jets carry nearly all of the momentum of the pion. The forward amplitude at $Q_\perp, Q_L \ll R_\pi^{-1}$ is linear in the number of nucleons. The total rate integrated over the forward diffraction peak is thus proportional to $A^2/R_\pi^2 \propto A^{1/3}$.

The most remarkable feature of the diffractive $\pi A \to \mathrm{Jet} + \mathrm{Jet} + X$ reaction is its potential to measure the shape of the pion wavefunction. The partition of jet longitudinal momentum gives the $x$-distribution; the relative transverse momentum distribution provides the $k_\perp$-distribution of $\psi_{q\bar{q}/\pi}(x, k_\perp)$. Such measurements are now being carried out by the E791 collaboration at Fermilab.
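The size of the longitudinal momentum transfer in Eq. (6.14) can be illustrated numerically. The sketch below is not taken from the text: the 500 GeV beam energy and the dijet masses are assumed inputs chosen only for illustration, and the helper names are hypothetical.

```python
# Sketch of Eq. (6.14): Delta Q_L = (M^2 - m_pi^2) / (2 E_L).
# All numerical inputs are illustrative assumptions, not values from the text.

M_PION = 0.13957  # charged pion mass in GeV
HBAR_C = 0.19733  # GeV * fm, converts a momentum into an inverse length


def delta_q_long(dijet_mass: float, beam_energy: float, m_pi: float = M_PION) -> float:
    """Minimal longitudinal momentum transfer in GeV (inputs in GeV)."""
    return (dijet_mass ** 2 - m_pi ** 2) / (2.0 * beam_energy)


def coherence_length_fm(dijet_mass: float, beam_energy: float) -> float:
    """Length hbar*c / Delta Q_L in fm associated with the momentum transfer."""
    return HBAR_C / delta_q_long(dijet_mass, beam_energy)


beam_energy = 500.0  # assumed pion beam energy in GeV
for mass in (2.0, 5.0, 10.0):  # hypothetical dijet masses in GeV
    dq = delta_q_long(mass, beam_energy)
    length = coherence_length_fm(mass, beam_energy)
    print(f"M = {mass:4.1f} GeV: dQ_L = {dq:.4f} GeV, hbar*c/dQ_L = {length:.1f} fm")
```

Because $\Delta Q_L$ falls as $1/E_L$ and grows as $M^2$, the coherence condition is easiest to satisfy for light dijet systems produced by an energetic beam.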
In principle, such experiments can be carried out with a photon beam, which should confirm the $x^2 + (1-x)^2$ $\gamma \to q\bar{q}$ distribution of the basic photon wavefunction. Measurements of $pA \to \mathrm{Jet}\ p + \mathrm{Jet}\ A$ could, in principle, provide a direct measurement of the proton distribution amplitude $\phi_p(x_i; Q)$.

7. The light-cone vacuum

The unique features of “front form” or light-cone quantized field theory [123] provide a powerful tool for the study of QCD. Of primary importance in this approach is the existence of a vacuum state that is the ground state of the full theory. The existence of this state gives a firm basis for the investigation of many of the complexities that must exist in QCD. In this picture the rich structure of the vacuum is transferred to the zero modes of the theory. Within this context the long-range physical phenomena of spontaneous symmetry breaking [206-208] [33,382,223,389,375,376] as well as the topological structure of the theory [259,377,378,383,384,261] can be associated with the zero mode(s) of the fields in a quantum field theory defined in a finite spatial volume and quantized at equal light-cone time [299].

7.1. Constrained zero modes

As mentioned previously, the light-front vacuum state is simple; it contains no particles in a massive theory. In other words, the Fock space vacuum is the physical vacuum. However, one commonly associates important long-range properties of a field theory with the vacuum: spontaneous symmetry breaking, the Goldstone pion, and color confinement. How do these complicated phenomena manifest themselves in light-front field theory?

If one cannot associate long-range phenomena with the vacuum state itself, then the only alternative is the zero momentum components or “zero modes” of the field (long range $\leftrightarrow$ zero momentum). In some cases, the zero mode operator is not an independent degree of freedom but obeys a constraint equation. Consequently, it is a complicated operator-valued function of all the other modes of the field.
Zero modes of this type were first investigated by Maskawa and Yamawaki as early as 1976 [324]. This problem has recently been attacked from several directions. The question of whether boundary conditions can be consistently defined in light-front quantization has been discussed by McCartor and Robertson [325-331,388], and by Lenz [295,296]. They have shown that for massive theories the energy and momentum derived from light-front quantization are conserved and are equivalent to the energy and momentum one would normally write down in an equal-time theory. In the analyses of Lenz et al. [295,296] and Hornbostel [230] one traces the fate of the equal-time vacuum in the limit $P^3 \to \infty$ and, equivalently, in the limit $\theta \to \pi/2$ when rotating the evolution parameter $\tau = x^0\cos\theta + x^3\sin\theta$ from the instant parametrization to the front parametrization. Heinzl and Werner et al. [206-210,214] considered $\phi^4$ theory in (1+1) dimensions and attempted to solve the zero mode constraint equation by truncating the equation to one particle. Other authors [203,204,389] find that, for theories allowing spontaneous symmetry breaking, there is a degeneracy of light-front vacua and the true vacuum state can differ from the perturbative vacuum through the addition of zero mode quanta. In addition to these approaches there
are many others, like Refs. [78,356,262], Refs. [73,112,232,257], or Refs. [107,214,276]. Grangé et al. [45,46] have dealt with a broken phase in such scalar models; see also Refs. [97,175,283].

An analysis of the zero mode constraint equation for (1+1)-dimensional $\phi^4$ field theory [$(\phi^4)_{1+1}$] with symmetric boundary conditions shows how spontaneous symmetry breaking occurs within the context of this model. This theory has a $Z_2$ symmetry $\phi \to -\phi$ which is spontaneously broken for some values of the mass and coupling. The approach of Pinsky et al. [33,382,223] is to apply a Tamm-Dancoff truncation to the Fock space. Thus, operators are finite matrices and the operator-valued constraint equation can be solved numerically. The truncation assumes that states with a large number of particles or large momentum do not have an important contribution to the zero mode. Since this represents a completely new paradigm for spontaneous symmetry breaking we will present this calculation in some detail.

One finds the following general behavior: for small coupling (large $g$, where $g \propto 1/\mathrm{coupling}$) the constraint equation has a single solution and the field has no vacuum expectation value (VEV). As one increases the coupling (decreases $g$) to the “critical coupling” $g_{\rm critical}$, two additional solutions which give the field a non-zero VEV appear. These solutions differ only infinitesimally from the first solution near the critical coupling, indicating the presence of a second-order phase transition. Above the critical coupling ($g < g_{\rm critical}$), there are three solutions: one with zero VEV, the “unbroken phase”, and two with non-zero VEV, the “broken phase”. The “critical curve”, shown in Fig. 31, is a plot of the VEV as a function of $g$.

Since the vacuum in this theory is trivial, all of the long-range properties must occur in the operator structure of the Hamiltonian. Above the critical coupling ($g < g_{\rm critical}$) quantum oscillations spontaneously break the $Z_2$ symmetry of the theory. In a loose analogy with a symmetric double-well potential, one has two new Hamiltonians for the broken phase, each producing states localized in one of the wells. The structure of the two Hamiltonians is determined from the broken phase solutions of the zero mode constraint equation. One finds that the two Hamiltonians have equivalent spectra. In a discrete theory without zero modes it is well known that, if one increases
Fig. 31. $f_0 = \sqrt{4\pi}\,\langle 0|\phi|0\rangle$ vs. $g = 24\pi\mu^2/\lambda$ in the one mode case with $N = 10$.
the coupling sufficiently, quantum corrections will generate tachyons, causing the theory to break down near the critical coupling. Here the zero mode generates new interactions that prevent tachyons from developing. In effect what happens is that, while quantum corrections attempt to drive the mass negative, they also change the vacuum energy through the zero mode, and the mass eigenvalue can never catch the vacuum eigenvalue. Thus, tachyons never appear in the spectra.

In the weak coupling limit ($g$ large) the solution to the constraint equation can be obtained in perturbation theory. This solution does not break the $Z_2$ symmetry and is believed to simply insert the missing zero momentum contributions into internal propagators. This must happen if light-front perturbation theory is to agree with equal-time perturbation theory [94-96].

Another way to investigate the zero mode is to study the spectrum of the field operator $\phi$. Here one finds a picture that agrees with the symmetric double-well potential analogy. In the broken phase, the field is localized in one of the minima of the potential and there is tunneling to the other minimum.

7.1.1. Canonical quantization

For a classical field the $(\phi^4)_{1+1}$ Lagrange density is
$$\mathcal{L} = \partial_+\phi\,\partial_-\phi - \tfrac{1}{2}\mu^2\phi^2 - \frac{\lambda}{4!}\phi^4. \tag{7.1}$$
One puts the system in a box of length $d$ and imposes periodic boundary conditions. Then
$$\phi(x) = \frac{1}{\sqrt{d}} \sum_n q_n(x^+)\, e^{i k_n^+ x^-}, \tag{7.2}$$
where $k_n^+ = 2\pi n/d$ and summations run over all integers unless otherwise noted.

It is convenient to define the integral $\int dx^-\, \phi(x)^n - (\mathrm{zero\ modes}) = \Sigma_n$. In terms of the modes of the field it has the form
$$\Sigma_n = \frac{1}{n!} \sum_{i_1, i_2, \ldots, i_n \neq 0} q_{i_1} q_{i_2} \cdots q_{i_n}\, \delta_{i_1 + i_2 + \cdots + i_n,\, 0}. \tag{7.3}$$
Then the canonical Hamiltonian is
$$P^- = \frac{\mu^2 q_0^2}{2} + \mu^2\Sigma_2 + \frac{\lambda q_0^4}{4!\,d} + \frac{\lambda q_0^2 \Sigma_2}{2!\,d} + \frac{\lambda q_0 \Sigma_3}{d} + \frac{\lambda \Sigma_4}{d}. \tag{7.4}$$
Following the Dirac-Bergmann prescription, described in Appendix E, one identifies first-class constraints which define the conjugate momenta
$$0 = p_n - i k_n^+ q_{-n}, \tag{7.5}$$
where
$$[q_m, p_n] = \tfrac{1}{2}\,\delta_{n,m}, \quad m, n \neq 0. \tag{7.6}$$
The secondary constraint is [457]
$$0 = \mu^2 q_0 + \frac{\lambda q_0^3}{3!\,d} + \frac{\lambda q_0 \Sigma_2}{d} + \frac{\lambda \Sigma_3}{d}, \tag{7.7}$$
which determines the zero mode $q_0$. This result can also be obtained by integrating the equations of motion.

To quantize the system one replaces the classical fields with the corresponding field operators, and the Dirac bracket by $i$ times a commutator. One must choose a regularization and an operator-ordering prescription in order to make the system well-defined.

One begins by defining creation and annihilation operators $a_k^\dagger$ and $a_k$,
$$q_k = \sqrt{\frac{d}{4\pi|k|}}\; a_k, \quad a_k = a_{-k}^\dagger, \quad k \neq 0, \tag{7.8}$$
which satisfy the usual commutation relations
$$[a_k, a_l] = 0, \quad [a_k^\dagger, a_l^\dagger] = 0, \quad [a_k, a_l^\dagger] = \delta_{k,l}, \quad k, l > 0. \tag{7.9}$$
Likewise, one defines the zero mode operator
$$q_0 = \sqrt{d/4\pi}\; a_0. \tag{7.10}$$
In the quantum case, one normal orders the operator $\Sigma_n$.

General arguments suggest that the Hamiltonian should be symmetric ordered [32]. However, it is not clear how one should treat the zero mode since it is not a dynamical field. As an ansatz one treats $a_0$ as an ordinary field operator when symmetric ordering the Hamiltonian. The tadpoles are removed from the symmetric ordered Hamiltonian by normal ordering the terms having no zero mode factors and by subtracting
$$\frac{3}{2}\, a_0^2 \sum_{n \neq 0} \frac{1}{|n|}. \tag{7.11}$$
In addition, one subtracts a constant so that the VEV of $H$ is zero. Note that this renormalization prescription is equivalent to a conventional mass renormalization and does not introduce any new operators into the Hamiltonian.

The constraint equation for the zero mode can be obtained by taking a derivative of $P^-$ with respect to $a_0$. One finds
$$0 = g a_0 + a_0^3 + \sum_{n \neq 0} \frac{1}{|n|} \left( a_0 a_n a_{-n} + a_n a_{-n} a_0 + a_n a_0 a_{-n} - \frac{3 a_0}{2} \right) + 6\Sigma_3, \tag{7.12}$$
where $g = 24\pi\mu^2/\lambda$. It is clear from the general structure of Eq. (7.12) that $a_0$ as a function of the other modes is not necessarily odd under the transform $a_k \to -a_k$ ($k \neq 0$) associated with the $Z_2$ symmetry of the system. Consequently, the zero mode can induce $Z_2$ symmetry breaking in the Hamiltonian.

In order to render the problem tractable, we impose a Tamm-Dancoff truncation on the Fock space. One defines $M$ to be the number of non-zero modes and $N$ to be the maximum number of allowed particles. Thus, each state in the truncated Fock space can be represented by a vector of length $S = (M+N)!/(M!\,N!)$ and operators can be represented by $S \times S$ matrices. One can define the usual Fock space basis, $|n_1, n_2, \ldots, n_M\rangle$, where $n_1 + n_2 + \cdots + n_M \leq N$. In matrix form, $a_0$ is real and symmetric. Moreover, it is block diagonal in states of equal $P^+$ eigenvalue.
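The counting behind the Tamm-Dancoff truncation can be made concrete with a short enumeration. The sketch below is an illustration with hypothetical function names: it lists the occupation-number basis, checks the dimension $S$, and counts the independent entries of a real symmetric matrix that is block diagonal in $P^+$. States are grouped by the integer $K = \sum_k k\, n_k$, which labels the $P^+$ eigenvalue $2\pi K/d$ since $k_n^+ = 2\pi n/d$.

```python
from collections import defaultdict
from itertools import product
from math import comb


def fock_basis(n_modes: int, max_particles: int):
    """All occupation tuples (n_1, ..., n_M) with n_1 + ... + n_M <= N."""
    return [occ for occ in product(range(max_particles + 1), repeat=n_modes)
            if sum(occ) <= max_particles]


def sectors_by_momentum(basis):
    """Group basis states by K = sum_k k * n_k, the label of the P+ eigenvalue."""
    sectors = defaultdict(list)
    for occ in basis:
        total_k = sum(k * n for k, n in enumerate(occ, start=1))
        sectors[total_k].append(occ)
    return dict(sectors)


def independent_elements(sectors) -> int:
    """Independent entries of a real symmetric matrix that is block diagonal in K."""
    return sum(len(block) * (len(block) + 1) // 2 for block in sectors.values())


M, N = 3, 3
basis = fock_basis(M, N)
S = len(basis)
print(S, comb(M + N, N))  # dimension from enumeration and from (M+N)!/(M! N!)
print(S * S, independent_elements(sectors_by_momentum(basis)))
```

For $M = 3$, $N = 3$ the enumeration gives $S = 20$, so a general matrix would have $S^2 = 400$ entries, while the symmetric block-diagonal structure leaves 34 independent ones, the number of coupled equations quoted for this case in the discussion of the many-mode calculation.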
7.1.2. Perturbative solution of the constraints

In the limit of large $g$, one can solve the constraint equation perturbatively. Then one substitutes the solution back into the Hamiltonian and calculates various amplitudes to arbitrary order in $1/g$ using Hamiltonian perturbation theory.

It can be shown that the solutions of the constraint equation and the resulting Hamiltonian are divergence free to all orders in perturbation theory for both the broken and unbroken phases. To do this one starts with the perturbative solution for the zero mode in the unbroken phase,
$$a_0 = -\frac{6}{g}\Sigma_3 + \frac{6}{g^2} \left( 2\Sigma_2\Sigma_3 + 2\Sigma_3\Sigma_2 + \sum_{k=1}^{M} \frac{a_k \Sigma_3 a_k^\dagger + a_k^\dagger \Sigma_3 a_k - \Sigma_3}{k} \right) + O(1/g^3), \tag{7.13}$$
and substitutes this into the Hamiltonian to obtain a complicated but well-defined expression for the Hamiltonian in terms of the dynamical operators.

The finite volume box acts as an infra-red regulator and the only possible divergences are ultraviolet. Using diagrammatic language, any loop of momentum $k$ with $l$ internal lines has asymptotic form $k^{-l}$. Only the case of tadpoles, $l = 1$, is divergent. If there are multiple loops, the effect is to put factors of $\ln(k)$ in the numerator and the divergence structure is unchanged. Looking at Eq. (7.13), the only possible tadpole is from the contraction in the term
$$a_k \Sigma_3 a_{-k}/k, \tag{7.14}$$
which is canceled by the $\Sigma_3/k$ term. This happens to all orders in perturbation theory: each tadpole has an associated term which cancels it. Likewise, in the Hamiltonian one has similar cancelations to all orders in perturbation theory.

For the unbroken phase, the effect of the zero mode should vanish in the infinite volume limit, giving a “measure zero” contribution to the continuum Hamiltonian. However, for finite box volume the zero mode does contribute, compensating for the fact that the longest wavelength mode has been removed from the system. Thus, inclusion of the zero mode improves convergence to the infinite volume limit. In addition, one can use the perturbative expansion of the zero mode to study the operator ordering problem. One can directly compare our operator ordering ansatz with a truly Weyl ordered Hamiltonian and with Maeno's operator ordering ansatz [320].

As an example, let us examine $O(\lambda^2)$ contributions to the process $1 \to 1$. As shown in Fig. 32, including the zero mode greatly improves convergence to the large volume limit. The zero mode compensates for the fact that one has removed the longest wavelength mode from the system.

7.1.3. Non-perturbative solution: One mode, many particles

Consider the case of one mode, $M = 1$, and many particles. In this case, the zero mode is diagonal and can be written as
$$a_0 = f_0\, |0\rangle\langle 0| + \sum_{k=1}^{N} f_k\, |k\rangle\langle k|. \tag{7.15}$$
Note that $a_0$ in Eq. (7.15) is even under $a_k \to -a_k$, $k \neq 0$, and any non-zero solution breaks the $Z_2$ symmetry of the original Hamiltonian. The VEV is given by
$$\langle 0|\phi|0\rangle = \frac{1}{\sqrt{4\pi}} \langle 0|a_0|0\rangle = \frac{1}{\sqrt{4\pi}}\, f_0. \tag{7.16}$$
Fig. 32. Convergence to the large $d$ limit of $1 \to 1$, setting $E = g/p$ and dropping any constant terms.
Substituting Eq. (7.15) into the constraint Eq. (7.12) and sandwiching the constraint equation between Fock states, one gets a recursion relation for $\{f_n\}$:
$$0 = g f_n + f_n^3 + (4n - 1) f_n + (n+1) f_{n+1} + n f_{n-1}, \tag{7.17}$$
where $n \leq N$, and one defines $f_{N+1}$ to be unknown. Thus, $\{f_1, f_2, \ldots, f_{N+1}\}$ is uniquely determined by a given choice of $g$ and $f_0$. In particular, if $f_0 = 0$ all the $f_k$'s are zero independent of $g$. This is the unbroken phase.

Consider the asymptotic behavior for large $n$. If $f_n \gg 1$, the $f_n^3$ term will dominate and
$$f_{n+1} \sim f_n^3/n, \tag{7.18}$$
thus,
$$\lim_{n \to \infty} f_n \sim (-1)^n \exp(3^n\, \mathrm{constant}). \tag{7.19}$$
One must reject this rapidly growing solution. One only seeks solutions where $f_n$ is small for large $n$. For large $n$, the terms linear in $n$ dominate and Eq. (7.17) becomes
$$f_{n+1} + 4 f_n + f_{n-1} = 0. \tag{7.20}$$
There are two solutions to this equation:
$$f_n \propto (\sqrt{3} \pm 2)^n. \tag{7.21}$$
One must reject the plus solution because it grows with $n$. This gives the condition
$$-\frac{\sqrt{3} - 3 + g}{2\sqrt{3}} = K, \quad K = 0, 1, 2, \ldots \tag{7.22}$$
Concentrating on the $K = 0$ case, one finds a critical coupling
$$g_{\rm critical} = 3 - \sqrt{3} \tag{7.23}$$
or
$$\lambda_{\rm critical} = 4\pi(3 + \sqrt{3})\,\mu^2 \approx 60\,\mu^2. \tag{7.24}$$
In comparison, values of $\lambda_{\rm critical}$ from $22\mu^2$ to $55\mu^2$ have been reported for equal-time quantized calculations [93,1,168,282]. The solution to the linearized equation is an approximate solution to the full Eq. (7.17) for $f_0$ sufficiently small. Next, one needs to determine solutions of the full non-linear equation which converge for large $n$.

One can study the critical curves by looking for numerical solutions to Eq. (7.17). The method used here is to find values of $f_0$ and $g$ such that $f_{N+1} = 0$. Since one seeks a solution where $f_n$ is decreasing with $n$, this is a good approximation. One finds that for $g > 3 - \sqrt{3}$ the only real solution is $f_n = 0$ for all $n$. For $g$ less than $3 - \sqrt{3}$ there are two additional solutions. Near the critical point $|f_0|$ is small and
$$f_n \approx f_0\, (2 - \sqrt{3})^n. \tag{7.25}$$
The critical curves are shown in Fig. 31. These solutions converge quite rapidly with $N$. The critical curve for the broken phase is approximately parabolic in shape:
$$g \approx 3 - \sqrt{3} - 0.9177 f_0^2. \tag{7.26}$$
One can also study the eigenvalues of the Hamiltonian for the one-mode case. The Hamiltonian is diagonal for this Fock space truncation and
$$\langle n|H|n\rangle = \frac{3}{2} n(n-1) + n g - \frac{f_n^4}{4} - \frac{2n+1}{4} f_n^2 + \frac{n+1}{4} f_{n+1}^2 + \frac{n}{4} f_{n-1}^2 - C. \tag{7.27}$$
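The numerical procedure described above for the critical curve, namely iterating Eq. (7.17) upward from $f_0$ and tuning $g$ until $f_{N+1} = 0$, can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function names are hypothetical, plain bisection is an assumed choice of root finder, and the bracket in $g$ is kept narrow because the cubic term makes the iterates grow extremely fast away from the solution. With Eq. (7.17) as written, successive $f_n$ on the decaying branch alternate in sign, so Eq. (7.25) is reproduced in magnitude.

```python
import math


def shoot(g: float, f0: float, n_max: int, nonlinear: bool = True) -> float:
    """Iterate Eq. (7.17) upward from f_0 and return f_{N+1}."""
    f_prev, f_curr = 0.0, f0  # f_{-1} never contributes: its coefficient is n = 0
    for n in range(n_max + 1):
        cubic = f_curr ** 3 if nonlinear else 0.0
        f_next = -((g + 4 * n - 1) * f_curr + cubic + n * f_prev) / (n + 1)
        f_prev, f_curr = f_curr, f_next
    return f_curr


def solve_g(f0: float, n_max: int = 10, lo: float = 1.0, hi: float = 1.5,
            nonlinear: bool = True, tol: float = 1e-13) -> float:
    """Bisect for the value of g at which f_{N+1} = 0 for a given f_0."""
    f_lo = shoot(lo, f0, n_max, nonlinear)
    f_hi = shoot(hi, f0, n_max, nonlinear)
    if f_lo * f_hi > 0:
        raise ValueError("f_{N+1} does not change sign inside the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = shoot(mid, f0, n_max, nonlinear)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# Linearized recursion: the truncated problem reproduces g_critical of Eq. (7.23)
g_crit = solve_g(1.0, nonlinear=False)
lambda_crit = 24 * math.pi / g_crit  # in units of mu^2, from g = 24 pi mu^2 / lambda
print(g_crit, 3 - math.sqrt(3), lambda_crit)

# Full recursion: points on the broken-phase critical curve for small f_0
for f0 in (0.05, 0.1):
    g_broken = solve_g(f0)
    print(f0, g_broken, (g_crit - g_broken) / f0 ** 2)
```

The last column printed for each $f_0$ is the local slope $(g_{\rm critical} - g)/f_0^2$, which can be compared with the fitted coefficient of the approximately parabolic form in Eq. (7.26).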
The invariant mass eigenvalues are given by
$$P^2 |n\rangle = 2 P^+ P^- |n\rangle = \frac{n \lambda\, \langle n|H|n\rangle}{24\pi}\, |n\rangle. \tag{7.28}$$
In Fig. 33 the dashed lines show the first few eigenvalues as a function of $g$ without the zero mode. When one includes the broken phase of the zero mode, the energy levels shift as shown by the solid curves. For $g < g_{\rm critical}$ the energy levels increase above the value they had without the zero mode. The higher levels change very little because $f_n$ is small for large $n$.

In the more general case of many modes and many particles, many of the features that were seen in the one-mode and one-particle cases remain. In order to calculate the zero mode for a given value of $g$ one converts the constraint Eq. (7.12) into an $S \times S$ matrix equation in the truncated Fock space. This becomes a set of $S^2$ coupled cubic equations and one can solve for the matrix
Fig. 33. The lowest three energy eigenvalues for the one-mode case as a function of $g$ from the numerical solution of Eq. (7.27) with $N = 10$. The dashed lines are for the unbroken phase $f_0 = 0$ and the solid lines are for the broken phase $f_0 \neq 0$.
elements of $a_0$ numerically. Considerable simplification occurs because $a_0$ is symmetric and is block diagonal in states of equal momentum. For example, in the case $M = 3$, $N = 3$, the number of coupled equations is 34 instead of $S^2 = 400$. In order to find the critical coupling, one takes $\langle 0|a_0|0\rangle$ as given and $g$ as unknown and solves the constraint equation for $g$ and the other matrix elements of $a_0$ in the limit of small but non-zero $\langle 0|a_0|0\rangle$. One sees that the solution converges quickly as $N$ increases, and that there is a logarithmic divergence as $M$ increases. The logarithmic divergence of $g_{\rm critical}$ is the major remaining missing part of this calculation and requires a careful non-perturbative renormalization [281].

When one substitutes the solutions for the broken phase of $a_0$ into the Hamiltonian, one gets two Hamiltonians $H^+$ and $H^-$ corresponding to the two signs of $\langle 0|a_0|0\rangle$ and the two branches of the curve in Fig. 31. This is the new paradigm for spontaneous symmetry breaking: multiple vacua are replaced by multiple Hamiltonians. Picking the Hamiltonian defines the theory in the same sense that picking the vacuum defines the theory in the equal-time paradigm. The two solutions for $a_0$ are related to each other in a very specific way. Let $\Pi$ be the unitary operator associated with the $Z_2$ symmetry of the system; $\Pi a_k \Pi^\dagger = -a_k$, $k \neq 0$. One breaks up $a_0$ into an even part, $\Pi a_0^E \Pi^\dagger = a_0^E$, and an odd part, $\Pi a_0^O \Pi^\dagger = -a_0^O$. The even part $a_0^E$ breaks the $Z_2$ symmetry of the theory. For $g < g_{\rm critical}$, the three solutions of the constraint equation are: $a_0^O$ corresponding to the unbroken phase, $a_0^O + a_0^E$ corresponding to the $\langle 0|a_0|0\rangle > 0$ solution, and $a_0^O - a_0^E$ for the $\langle 0|a_0|0\rangle < 0$
solution. Thus, the two Hamiltonians are
$$H^+ = H(a_k,\, a_0^O + a_0^E) \tag{7.29}$$
and
$$H^- = H(a_k,\, a_0^O - a_0^E), \tag{7.30}$$
where $H$ has the property
$$H(a_k, a_0) = H(-a_k, -a_0) \tag{7.31}$$
and $a_k$ represents the non-zero modes. Since $\Pi$ is a unitary operator, if $|\Psi\rangle$ is an eigenvector of $H$ with eigenvalue $E$ then $\Pi|\Psi\rangle$ is an eigenvector of $\Pi H \Pi^\dagger$ with eigenvalue $E$. Since
$$\Pi H^- \Pi^\dagger = \Pi H(a_k,\, a_0^O - a_0^E) \Pi^\dagger = H(-a_k,\, -a_0^O - a_0^E) = H(a_k,\, a_0^O + a_0^E) = H^+, \tag{7.32}$$
$H^+$ and $H^-$ have the same eigenvalues.

Consider the $M = 3$, $N = 3$ case as an example and let us examine the spectrum of $H$. For large $g$ the eigenvalues are obviously $0$, $g$, $g/2$, $2g$, $g/3$, $3g/2$ and $3g$. However, as one decreases $g$ one of the last three eigenvalues will be driven negative. This signals the breakdown of the theory near the critical coupling when the zero mode is not included.

Including the zero mode fixes this problem. Fig. 34 shows the spectrum for the three lowest nonzero momentum sectors. This spectrum illustrates several characteristics which seem to hold generally (at least for truncations that have been examined, $N + M \leq 6$). For the broken phase, the vacuum is the lowest energy state, there are no level crossings as a function of $g$, and the theory does
Fig. 34. The spectrum for (a) $P^+ = 2\pi/d$, (b) $P^+ = 4\pi/d$, and (c) $P^+ = 6\pi/d$, all with $M = 3$, $N = 3$. The dashed line shows the spectrum with no zero mode. The dotted line is the unbroken phase and the solid line is the broken phase.
not break down in the vicinity of the critical point. None of these are true for the spectrum with the zero mode removed or for the unbroken phase below the critical coupling.

One can also investigate the shape of the critical curve near the critical coupling as a function of the cutoff $\Lambda$. In scalar field theory, $\langle 0|\phi|0\rangle$ acts as the order parameter of the theory. Near the critical coupling, one can fit the VEV to some power of $g - g_{\rm critical}$; this will give us the associated critical exponent $\beta$,
$$\langle 0|a_0|0\rangle \propto (g_{\rm critical} - g)^\beta. \tag{7.33}$$
Pinsky et al. [223] have calculated this as a function of cutoff and found a result consistent with $\beta = 1/2$, independent of the cutoff $\Lambda$. The theory $(\phi^4)_{1+1}$ is in the same universality class as the Ising model in 2 dimensions and the correct critical exponent for this universality class is $\beta = 1/8$. If one were to use the mean field approximation to calculate the critical exponent, the result would be $\beta = 1/2$. This is what was obtained in this calculation. Usually, the presence of a mean field result indicates that one is not probing all length scales properly. If one had a cutoff $\Lambda$ large enough to include many length scales, then the critical exponent should approach the correct value. However, one cannot be certain that this is the correct explanation of our result, since no evidence that $\beta$ decreases with increasing $\Lambda$ is seen.

7.1.4. Spectrum of the field operator

How does the zero mode affect the field itself? Since $\phi$ is a Hermitian operator it is an observable of the system and one can measure $\phi$ for a given state $|\alpha\rangle$. $\tilde{\phi}_i$ and $|\chi_i\rangle$ are the eigenvalue and eigenvector, respectively, of $\sqrt{4\pi}\,\phi$:
$$\sqrt{4\pi}\,\phi\, |\chi_i\rangle = \tilde{\phi}_i\, |\chi_i\rangle, \quad \langle\chi_i|\chi_j\rangle = \delta_{i,j}. \tag{7.34}$$
The probability of obtaining $\tilde{\phi}_i$ as the result of a measurement of $\sqrt{4\pi}\,\phi$ in the state $|\alpha\rangle$ is $|\langle\chi_i|\alpha\rangle|^2$.

In the limit of large $N$, the probability distribution becomes continuous. If one ignores the zero mode, the probability of obtaining $\tilde{\phi}$ as the result of a measurement of $\sqrt{4\pi}\,\phi$ for the vacuum state is
$$P(\tilde{\phi}) = \frac{1}{\sqrt{2\pi\tau}} \exp\left( -\frac{\tilde{\phi}^2}{2\tau} \right) d\tilde{\phi}, \tag{7.35}$$
where $\tau = \sum_{k=1}^{M} 1/k$. The probability distribution comes from the ground-state wavefunction of the harmonic oscillator, where one identifies $\phi$ with the position operator. This is just the Gaussian fluctuation of a free field. Note that the width of the Gaussian diverges logarithmically in $M$. When $N$ is finite, the distribution becomes discrete as shown in Fig. 35.

In general, there are $N + 1$ eigenvalues such that $\langle\chi_i|0\rangle \neq 0$, independent of $M$. Thus, if one wants to examine the spectrum of the field operator for the vacuum state, it is better to choose Fock space truncations where $N$ is large. With this in mind, one examines the $N = 50$ and $M = 1$ case as a function of $g$ in Fig. 36. Note that near the critical point, Fig. 36a, the distribution is approximately equal to the free field case shown in Fig. 35. As one moves away from the critical point, Figs. 36b-d, the distribution becomes increasingly narrow with a peak located at the VEV of what would be the minimum of the symmetric double-well potential in the equal-time paradigm. In addition, there is a small peak corresponding to minus the VEV. In the language of the equal-time
Fig. 35. Probability distribution of eigenvalues of $\sqrt{4\pi}\,\phi$ for the vacuum with $M = 1$, $N = 10$, and no zero mode. Also shown is the infinite $N$ limit from Eq. (7.35).
paradigm, there is tunneling between the two minima of the potential. The spectrum of $\phi$ has been examined for other values of $M$ and $N$; the results are consistent with the example discussed here.

7.2. Physical picture and classification of zero modes

When considering a gauge theory, there is a “zero-mode” problem associated with the choice of gauge in the compactified case. This subtlety, however, is not particular to the light cone; indeed, its occurrence is quite familiar in equal-time quantization on a torus [321,335,290]. In the present context, the difficulty is that the zero mode in $A^+$ is in fact gauge-invariant, so that the light-cone gauge $A^+ = 0$ cannot be reached. Thus we have a pair of interconnected problems: first, a practical choice of gauge; and second, the presence of constrained zero modes of the gauge field. In two recent papers [258,259] these problems were separated and consistent gauge fixing conditions were introduced to allow isolation of the dynamical and constrained fields. In Ref. [259] the generalized gauge fixing is described, and the Poincaré generators are constructed in perturbation theory.

One observes that in the traditional treatment, choosing the light-cone gauge $A^+ = 0$ enables Gauss's law to be solved for $A^-$. In any case the spinor projection $\psi_-$ is constrained and determined from the equations of motion.

Discretization is achieved by putting the theory in a light-cone “box”, with $-L_\perp \leq x^i \leq L_\perp$ and $-L \leq x^- \leq L$, and imposing boundary conditions on the fields. $A_\mu$ must be taken to be periodic in both $x^-$ and $x_\perp$. It is most convenient to choose the Fermion fields to be periodic in $x_\perp$ and anti-periodic in $x^-$. This eliminates the zero longitudinal momentum mode while still allowing an expansion of the field in a complete set of basis functions.

The functions used to expand the fields may be taken to be plane waves, and for periodic fields these will of course include zero-momentum modes. Let us define, for a periodic quantity $f$, its
Fig. 36. Probability distribution of eigenvalues of $\sqrt{4\pi}\,\phi$ for the vacuum with couplings $g = 1$, $g = 0$, $g = -1$, and $g = -2$, all for $M = 1$ and $N = 50$. The positive VEV solution to the constraint equation is used.
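longitudinal zero mode
$$\langle f \rangle_0 \equiv \frac{1}{2L} \int_{-L}^{L} dx^-\, f(x^-, x_\perp) \tag{7.36}$$
and the corresponding normal mode part
$$\langle f \rangle_n \equiv f - \langle f \rangle_0. \tag{7.37}$$
We shall further denote the “global zero mode”, the mode independent of all the spatial coordinates, by $\langle f \rangle$:
$$\langle f \rangle \equiv \frac{1}{\Omega} \int_{-L}^{L} dx^- \int_{-L_\perp}^{L_\perp} d^2 x_\perp\, f(x^-, x_\perp). \tag{7.38}$$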
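The averages in Eqs. (7.36)-(7.38) can be illustrated on a discretized periodic function. The sketch below makes several simplifying assumptions that are not in the text: a single transverse direction instead of two, a uniform grid in place of the integrals, and a hypothetical test function. It also forms the difference $\langle f \rangle_0 - \langle f \rangle$, the combination that the text goes on to call the proper zero mode.

```python
import math


def longitudinal_zero_mode(f):
    """<f>_0: average over x^- at fixed transverse point; f[i][j] = f(x^-_i, x_perp_j)."""
    n_minus = len(f)
    return [sum(f[i][j] for i in range(n_minus)) / n_minus for j in range(len(f[0]))]


def global_zero_mode(f):
    """<f>: average over every grid point of the box."""
    return sum(sum(row) for row in f) / (len(f) * len(f[0]))


def decompose(f):
    """Return (normal-mode part, proper zero mode, global zero mode)."""
    zero = longitudinal_zero_mode(f)
    glob = global_zero_mode(f)
    normal = [[f[i][j] - zero[j] for j in range(len(zero))] for i in range(len(f))]
    proper = [z - glob for z in zero]
    return normal, proper, glob


# Hypothetical periodic test function: constant + x^- independent part + normal modes
n_minus, n_perp = 16, 12
f = [[1.5
      + math.cos(2 * math.pi * j / n_perp)
      + math.sin(2 * math.pi * i / n_minus) * math.cos(4 * math.pi * j / n_perp)
      for j in range(n_perp)]
     for i in range(n_minus)]

normal, proper, glob = decompose(f)
print(glob)                       # the constant piece
print(max(abs(p) for p in proper))  # size of the x^- independent, x_perp dependent piece
```

In this example the constant term ends up in the global zero mode, the term depending only on the transverse coordinate ends up in the proper zero mode, and the remaining term, which oscillates in $x^-$, is the normal mode part.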
Finally, the quantity which will be of most interest to us is the “proper zero mode”, defined by
$$f_0 \equiv \langle f \rangle_0 - \langle f \rangle. \tag{7.39}$$
By integrating over the appropriate direction(s) of space, we can project the equations of motion onto the various sectors. The global zero mode sector requires some special treatment, and will not be discussed here.

We concentrate our attention on the proper zero mode sector, in which the equations of motion become
$$-\partial_\perp^2 A_0^+ = g J_0^+, \tag{7.40}$$
$$-2(\partial_+)^2 A_0^+ - \partial_\perp^2 A_0^- - 2\partial_i \partial_+ A_0^i = g J_0^-, \tag{7.41}$$
$$-\partial_\perp^2 A_0^i + \partial_i \partial_+ A_0^+ + \partial_i \partial_j A_0^j = g J_0^i. \tag{7.42}$$
We first observe that Eq. (7.40), the projection of Gauss' law, is a constraint which determines the proper zero mode of $A^+$ in terms of the current $J^+$:
$$A_0^+ = -g\, \frac{1}{\partial_\perp^2}\, J_0^+. \tag{7.43}$$
Eqs. (7.41) and (7.42) then determine the zero modes $A_0^-$ and $A_0^i$.

Eq. (7.43) is clearly incompatible with the strict light-cone gauge $A^+ = 0$, which is most natural in light-cone analyses of gauge theories. Here we encounter a common problem in treating axial gauges on compact spaces, which has nothing to do with light-cone quantization per se. The point is that any $x^-$-independent part of $A^+$ is in fact gauge invariant, since under a gauge transformation
$$A^+ \to A^+ + 2\partial_- \Lambda, \tag{7.44}$$
where $\Lambda$ is a function periodic in all coordinates. Thus, it is not possible to bring an arbitrary gauge field configuration to one satisfying $A^+ = 0$ via a gauge transformation, and the light-cone gauge is incompatible with the chosen boundary conditions. The closest we can come is to set the normal mode part of $A^+$ to zero, which is equivalent to
$$\partial_- A^+ = 0. \tag{7.45}$$
This condition does not, however, completely fix the gauge: we are free to make arbitrary $x^-$-independent gauge transformations without undoing Eq. (7.45). We may therefore impose further conditions on $A_\mu$ in the zero-mode sector of the theory.

To see what might be useful in this regard, let us consider solving Eq. (7.42). We begin by acting on Eq. (7.42) with $\partial_i$.
The transverse field Ai then drops out and we obtain an expression for the i 0 time derivative of A`: 0 A`"g(1/2 ) Ji . (7.46) ` 0 M i 0 [This can also be obtained by taking a time derivative of Eq. (7.43), and using current conservation to re-express the right-hand side in terms of Ji]. Inserting this back into Eq. (7.42) we then find, after some rearrangement, !2 (di ! /2 )Aj "g(di ! /2 )Jj . M j i j M 0 j i j M 0
(7.47)
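The step from Eq. (7.42) to Eqs. (7.46) and (7.47) can be checked numerically in transverse Fourier space. The following is a minimal sketch, assuming the replacements $\partial_i \to i k_i$ and $\partial_\perp^2 \to -k^2$ for a single transverse momentum $k$; the function names and the random test values are illustrative and not part of the original treatment. It builds $gJ_0^i$ from Eq. (7.42) for arbitrary $A_0^i$ and $\partial_+A_0^+$, and then confirms Eq. (7.46), Eq. (7.47), and the fact that a pure-gradient shift of $A_0^i$ drops out.

```python
import random


def dot(a, b):
    """Plain (unconjugated) dot product of two 2-component vectors."""
    return a[0] * b[0] + a[1] * b[1]


def transverse_projector(k, v):
    """Apply (delta_ij - k_i k_j / k^2), the Fourier form of the projector in Eq. (7.47)."""
    k2 = dot(k, k)
    kv = dot(k, v)
    return [v[i] - k[i] * kv / k2 for i in range(2)]


def g_times_current(k, A, dplus_Aplus):
    """Left-hand side of Eq. (7.42) in Fourier space, which equals g J^i."""
    k2 = dot(k, k)
    kA = dot(k, A)
    return [k2 * A[i] + 1j * k[i] * dplus_Aplus - k[i] * kA for i in range(2)]


def check_zero_mode_relations(seed=0):
    """Return the violations of (7.46), (7.47) and of gradient-shift invariance."""
    rng = random.Random(seed)
    k = [rng.uniform(0.5, 2.0), rng.uniform(0.5, 2.0)]
    A = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2)]
    dplus_Aplus = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    gJ = g_times_current(k, A, dplus_Aplus)
    k2 = dot(k, k)

    # Eq. (7.46): d_+ A^+ = g (1/d_perp^2) d_i J^i  ->  -i k.(gJ) / k^2
    err_746 = abs(dplus_Aplus - (-1j) * dot(k, gJ) / k2)

    # Eq. (7.47): k^2 P A = g P J, with P the transverse projector
    PA = transverse_projector(k, A)
    PJ = transverse_projector(k, gJ)
    err_747 = max(abs(k2 * PA[i] - PJ[i]) for i in range(2))

    # A shift A^i -> A^i + d_i phi (i k_i phi in Fourier space) leaves g J^i unchanged
    phi = complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
    A_shifted = [A[i] + 1j * k[i] * phi for i in range(2)]
    gJ_shifted = g_times_current(k, A_shifted, dplus_Aplus)
    err_shift = max(abs(gJ_shifted[i] - gJ[i]) for i in range(2))
    return err_746, err_747, err_shift
```

All three returned numbers should vanish to rounding accuracy; the last one is the Fourier-space statement that the longitudinal part of $A_0^i$ is not fixed by the equations of motion.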
Now the operator $(\delta^i_j - \partial_i\partial_j/\partial_\perp^2)$ is nothing more than the projector of the two-dimensional transverse part of the vector fields $A_0^i$ and $J_0^i$. No trace remains of the longitudinal projection of the field $(\partial_i\partial_j/\partial_\perp^2)A_0^j$ in Eq. (7.47). This reflects precisely the residual gauge freedom with respect to $x^-$-independent transformations. To determine the longitudinal part, an additional condition is required.

More concretely, the general solution to Eq. (7.47) is

$$A_0^i = -g\,\frac{1}{\partial_\perp^2}\,J_0^i + \partial_i\varphi(x^+, x_\perp) , \tag{7.48}$$

where $\varphi$ must be independent of $x^-$ but is otherwise arbitrary. Imposing a condition on, say, $\partial_i A_0^i$ will uniquely determine $\varphi$. In Ref. [259], for example, the condition

$$\partial_i A_0^i = 0 \tag{7.49}$$

was proposed as being particularly natural. This choice, taken with the other gauge conditions we have imposed, has been called the “compactification gauge”. In this case

$$\varphi = g\,\frac{1}{(\partial_\perp^2)^2}\,\partial_i J_0^i . \tag{7.50}$$

Of course, other choices are also possible. For example, we might generalize Eq. (7.50) to

$$\varphi = \alpha g\,\frac{1}{(\partial_\perp^2)^2}\,\partial_i J_0^i , \tag{7.51}$$

with $\alpha$ a real parameter. The gauge condition corresponding to this solution is

$$\partial_i A_0^i = -g(1-\alpha)\,\frac{1}{\partial_\perp^2}\,\partial_i J_0^i . \tag{7.52}$$

We shall refer to this as the “generalized compactification gauge”. An arbitrary gauge field configuration $B^\mu$ can be brought to one satisfying Eq. (7.52) via the gauge function

$$\Lambda(x_\perp) = -\frac{1}{\partial_\perp^2}\left[g(1-\alpha)\,\frac{1}{\partial_\perp^2}\,\partial_i J_0^i + \partial_i B_0^i\right] . \tag{7.53}$$

This is somewhat unusual in that $\Lambda(x_\perp)$ involves the sources as well as the initial field configuration, but this is perfectly acceptable. More generally, $\varphi$ can be any (dimensionless) function of gauge invariants constructed from the fields in the theory, including the currents $J^\pm$. For our purposes Eq. (7.52) suffices.

We now have relations defining the proper zero modes of $A^i$,

$$A_0^i = -g\,\frac{1}{\partial_\perp^2}\left(\delta^i_j - \alpha\,\frac{\partial_i\partial_j}{\partial_\perp^2}\right)J_0^j , \tag{7.54}$$

as well as $A_0^+$ [Eq. (7.43)]. All that remains is to use the final constraint, Eq. (7.41), to determine $A_0^-$. Using Eqs. (7.46) and (7.52), we find that Eq. (7.41) can be written as

$$\partial_\perp^2 A_0^- = -g J_0^- - 2\alpha g\,\frac{1}{\partial_\perp^2}\,\partial_+\partial_i J_0^i . \tag{7.55}$$

After using the equations of motion to express $\partial_+ J_0^i$ in terms of the dynamical fields at $x^+ = 0$, this may be straightforwardly solved for $A_0^-$ by inverting the $\partial_\perp^2$. In what follows, however, we shall have no need of $A_0^-$. It does not enter the Hamiltonian, for example; as usual, it plays the role of a multiplier to Gauss’ law, Eq. (7.40), which we are able to implement as an operator identity.

We have shown how to perform a general gauge fixing of Abelian gauge theory in DLCQ and cleanly separate the dynamical from the constrained zero-longitudinal momentum fields. The
various zero-mode fields must be retained in the theory if the equations of motion are to be realized as the Heisenberg equations. We have further seen that taking the constrained fields properly into account renders the ultraviolet behavior of the theory more benign, in that it results in the automatic generation of a counter term for a non-covariant divergence in the fermion self-energy in lowest-order perturbation theory.

The solutions to the constraint relations for the $A_0^i$ are all physically equivalent, being related by different choices of gauge in the zero mode sector of the theory. There is a gauge which is particularly simple, however, in that the fields may be taken to satisfy the usual canonical anti-commutation relations. This is most easily exposed by examining the kinematical Poincaré generators and finding the solution for which these retain their free-field forms. The unique solution that achieves this is $\varphi = 0$ in Eq. (7.48). For solutions other than this one, complicated commutation relations between the fields will be necessary to correctly translate them in the initial-value surface.

It would be interesting to study the structure of the operators induced by the zero modes from the point of view of the light-cone power-counting analysis of Wilson [456]. As noted in the Introduction, to the extent that DLCQ coincides with reality, effects which we would normally associate with the vacuum must be incorporated into the formalism through the new, noncanonical interactions arising from the zero modes. Particularly interesting is the appearance of operators that are non-local in the transverse directions. These are interesting because the strong infrared effects they presumably mediate could give rise to transverse confinement in the effective Hamiltonian for QCD. There is longitudinal confinement already at the level of the canonical Hamiltonian; that is, the effective potential between charges separated only in $x^-$ grows linearly with the separation.
This comes about essentially from the non-locality in $x^-$ (i.e. the small-$k^+$ divergences) of the light-cone formalism.

It is clearly of interest to develop non-perturbative methods for solving the constraints, since we are ultimately interested in non-perturbative diagonalization of $P^-$. Several approaches to this problem have recently appeared in the literature [209,210,33,382], in the context of scalar field theories in 1+1 dimensions. For QED with a realistic value of the electric charge, however, it might be that a perturbative treatment of the constraints could suffice; that is, that we could use a perturbative solution of the constraint to construct the Hamiltonian, which would then be diagonalized non-perturbatively. An approach similar in spirit has been proposed in Ref. [456], where the idea is to use a perturbative realization of the renormalization group to construct an effective Hamiltonian for QCD, which is then solved non-perturbatively. There is some evidence that this kind of approach might be useful. Wivoda and Hiller have recently used DLCQ to study a theory of neutral and interacting charged scalar fields in 3+1 dimensions [458]. They discovered that including four-fermion operators precisely analogous to the perturbative ones appearing in $P^-$ significantly improved the numerical behavior of the simulation.

The extension of the present work to the case of QCD is complicated by the fact that the constraint relations for the gluonic zero modes are non-linear, as in the $\phi^4$ theory. A perturbative solution of the constraints is of course still possible, but in this case, since the effective coupling at the relevant (hadronic) scale is large, it is clearly desirable to go beyond perturbation theory.
In addition, because of the central role played by gauge fixing in the present work, we may expect complications due to the Gribov ambiguity [189], which prevents the selection of unique representatives on gauge orbits in non-perturbative treatments of Yang-Mills theory. As a step in this
direction, work is in progress on the pure glue theory in 2+1 dimensions [259]. There it is expected that some of the non-perturbative techniques used recently in 1+1 dimensions [33,382,379-381,331] can be applied.

7.3. Dynamical zero modes

Our concern in this section is with zero modes that are true dynamical independent fields. They can arise due to the boundary conditions in gauge theory: one cannot fully implement the traditional light-cone gauge $A^+ = 0$. The development of the understanding of this problem in DLCQ can be traced in Refs. [206-209,325,326]. The field $A^+$ turns out to have a zero mode which cannot be gauged away [258,259,261,379-381,331]. This mode is indeed dynamical, and is the object we study in this paper. It has its analogue in instant form approaches to gauge theory. For example, there exists a large body of work on Abelian and non-Abelian gauge theories in 1+1 dimensions quantized on a cylinder geometry [321,217]. There indeed this dynamical zero mode plays an important role. We too shall concern ourselves in the present section with non-Abelian gauge theory in 1+1 dimensions, revisiting the model introduced by ’t Hooft [424].

The specific task we undertake here is to understand the zero-mode subsector of the pure glue theory, namely where only zero-mode external sources excite only zero-mode gluons. We shall see that this is not an approximation but rather a consistent solution, a sub-regime within the complete theory. A similar framing of the problem lies behind the work of Lüscher [314] and van Baal [434] using the instant form Hamiltonian approach to pure glue gauge theory in 3+1 dimensions. The beauty of this reduction in the (1+1)-dimensional theory is twofold. First, it yields a theory which is exactly soluble. This is useful given the dearth of soluble models in field theory. Secondly, the zero-mode theory represents a paring down to the point where the front and instant forms are manifestly identical, which is nice to know indeed.
We solve the theory in this specific dynamical regime and find a discrete spectrum of states whose wavefunctions can be completely determined. These states have the quantum numbers of the vacuum.

We consider an SU(2) non-Abelian gauge theory in 1+1 dimensions with classical sources coupled to the gluons. The Lagrangian density is

$$\mathcal{L} = \tfrac{1}{2}\,\mathrm{Tr}(F_{\mu\nu}F^{\mu\nu}) + 2\,\mathrm{Tr}(J_\mu A^\mu) , \tag{7.56}$$

where $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu - g[A_\mu, A_\nu]$. With a finite interval in $x^-$ from $-L$ to $L$, we impose periodic boundary conditions on all gauge potentials $A_\mu$.

We cannot eliminate the zero mode of the gauge potential. The reason is evident: it is invariant under periodic gauge transformations. But, of course, we can always perform a rotation in color space. In line with other authors [14,385,146-148], we choose this so that $\overset{o}{A}{}^+_3$ is the only non-zero element, since in our representation only $\sigma^3$ is diagonal. In addition, we can impose the subsidiary gauge condition $\overset{o}{A}{}^-_3 = 0$. The reason is that there still remains freedom to perform gauge transformations that depend only on light-cone time $x^+$ and the color matrix $\sigma^3$.

The above procedure would appear to have enabled complete fixing of the gauge. This is still not so. Gauge transformations

$$V = \exp\{i x^- (n\pi/2L)\,\sigma^3\} \tag{7.57}$$
generate shifts, according to Eq. (7.53), in the zero-mode component

$$\overset{o}{A}{}^+_3 \to \overset{o}{A}{}^+_3 + n\pi/gL . \tag{7.58}$$

All of these possibilities, labelled by the integer $n$, of course still satisfy $\partial_- A^+ = 0$, but as one sees $n = 0$ should not really be included. One can verify that the transformations $V$ also preserve the subsidiary condition. One notes that the transformation is $x^-$-dependent and $Z_2$ periodic. It is thus a simple example of a Gribov copy [189] in 1+1 dimensions. We follow the conventional procedure by demanding

$$\overset{o}{A}{}^+_3 \neq n\pi/gL , \quad n = \pm 1, \pm 2, \ldots . \tag{7.59}$$

This eliminates singularity points at the Gribov “horizons” which in turn correspond to a vanishing Faddeev-Popov determinant [434].

For convenience, we henceforth use the notation

$$\overset{o}{A}{}^+_3 = v , \quad x^+ = t , \quad w^2 = \overset{o}{J}{}^+_+\,\overset{o}{J}{}^-_+ / g^2 \quad \text{and} \quad \overset{o}{J}{}^-_3 = \tfrac{1}{2}B . \tag{7.60}$$

We pursue a Hamiltonian formulation. The only conjugate momentum is

$$\pi \equiv \overset{o}{\Pi}{}^-_3 = \partial^-\overset{o}{A}{}^+_3 = \partial^- v . \tag{7.61}$$

The Hamiltonian density $T^{+-} = \partial^-\overset{o}{A}{}^+_3\,\Pi^-_3 - \mathcal{L}$ leads to the Hamiltonian

$$H = \tfrac{1}{2}\left[\pi^2 + (w^2/v^2) + Bv\right](2L) . \tag{7.62}$$

Quantization is achieved by imposing a commutation relation at equal light-cone time on the dynamical degree of freedom. Introducing the variable $q = 2Lv$, the appropriate commutation relation is $[q(x^+), \pi(x^+)] = i$. The field theoretic problem reduces to quantum mechanics of a single particle as in Manton’s treatment of the Schwinger model in Ref. [321]. One thus has to solve the Schrödinger equation

$$\frac{1}{2}\left(-\frac{d^2}{dq^2} + \frac{(2Lw)^2}{q^2} + \frac{Bq}{2L}\right)\psi = \mathcal{E}\psi , \tag{7.63}$$

with the eigenvalue $\mathcal{E} = E/(2L)$ actually being an energy density.

All eigenstates $\psi$ have the quantum numbers of the naive vacuum adopted in standard front form field theory: all of them are eigenstates of the light-cone momentum operator $P^+$ with zero eigenvalue. The true vacuum is now that state with lowest $P^-$ eigenvalue. In order to get an exactly soluble system we eliminate the source $2B = \overset{o}{J}{}^-_3$.

The boundary condition that is to be imposed comes from the treatment of the Gribov problem. Since the wavefunction vanishes at $q = 0$ we must demand that the wavefunctions vanish at the first Gribov horizon $q = \pm 2\pi/g$. The overall constant $R$ is then fixed by normalization. This leads
to the energy density only assuming the discrete values

$$\mathcal{E}^{(\nu)}_m = \frac{g^2}{8\pi^2}\left(X^{(\nu)}_m\right)^2 , \quad m = 1, 2, \ldots , \tag{7.64}$$

where $X^{(\nu)}_m$ denotes the $m$th zero of the $\nu$th Bessel function $J_\nu$. In general, these zeroes can only be obtained numerically. Thus

$$\psi_m(q) = R\sqrt{q}\,J_\nu\!\left(\sqrt{2\mathcal{E}^{(\nu)}_m}\,q\right) \tag{7.65}$$

is the complete solution. The true vacuum is the state of lowest energy, namely that with $m = 1$.

The exact solution we obtained is genuinely non-perturbative in character. It describes vacuum-like states since for all of these states $P^+ = 0$. Consequently, they all have zero invariant mass $M^2 = P^+P^-$. The states are labelled by the eigenvalues of the operator $P^-$. The linear dependence on $L$ in the result for the discrete energy levels is also consistent with what one would expect from a loop of color flux running around the cylinder.

In the source-free equal time case Hetrick [217,218] uses a wavefunction that is symmetric about $q = 0$. For our problem this corresponds to

$$\psi_m(q) = N\cos\!\left(\sqrt{2\epsilon_m}\,q\right) , \tag{7.66}$$

where $N$ is fixed by normalization. At the boundary of the fundamental modular region $q = 2\pi/g$ and $\psi_m = (-1)^m N$, thus $\sqrt{2\epsilon_m}\,2\pi/g = m\pi$ and

$$\epsilon = \tfrac{1}{8}\,g^2(m^2 - 1) . \tag{7.67}$$

Note that $m = 1$ is the lowest-energy state and has as expected one node in the allowed region $0 \le q \le 2\pi/g$. Hetrick [217] discusses the connection to the results of Rajeev [387] but it amounts to a shift in $\epsilon$ and a redefining of $m \to m/2$. It has been argued by van Baal that the correct boundary condition at $q = 0$ is $\psi(0) = 0$. This would give a sine which matches smoothly with the Bessel function solution.

This calculation offers the lesson that even in a front-form approach, the vacuum might not be just the simple Fock vacuum. Dynamical zero modes do imbue the vacuum with a rich structure.
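Since the zeroes entering Eq. (7.64) must in general be found numerically, a small illustration may help. The sketch below uses only the Python standard library: a truncated power series for $J_\nu$ and a bracketing bisection search. It assumes the order $\nu = \sqrt{(2Lw)^2 + 1/4}$, which is what Eq. (7.63) with $B = 0$ implies for a solution of the form (7.65) but is not stated explicitly in the text; the function names are hypothetical. The series is only adequate for moderate arguments, so the search is restricted to $x \le 15$.

```python
import math


def bessel_j(nu, x, terms=60):
    """Truncated power series for J_nu(x); adequate for 0 < x < ~15."""
    total = 0.0
    for k in range(terms):
        total += ((-1) ** k / (math.factorial(k) * math.gamma(k + nu + 1.0))
                  * (x / 2.0) ** (2 * k + nu))
    return total


def bessel_zeros(nu, count, step=0.01, x_max=15.0):
    """First positive zeros of J_nu, located by sign changes and bisection."""
    zeros = []
    x_left = step
    f_left = bessel_j(nu, x_left)
    while len(zeros) < count and x_left + step <= x_max:
        x_right = x_left + step
        f_right = bessel_j(nu, x_right)
        if f_left * f_right < 0.0:
            lo, hi, f_lo = x_left, x_right, f_left
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                f_mid = bessel_j(nu, mid)
                if f_lo * f_mid <= 0.0:
                    hi = mid
                else:
                    lo, f_lo = mid, f_mid
            zeros.append(0.5 * (lo + hi))
        x_left, f_left = x_right, f_right
    return zeros


def energy_densities(g, two_L_w, count):
    """Energy densities of Eq. (7.64) for coupling g and source strength 2Lw."""
    nu = math.sqrt(two_L_w ** 2 + 0.25)  # assumed relation, see lead-in
    return [g ** 2 / (8.0 * math.pi ** 2) * x ** 2
            for x in bessel_zeros(nu, count)]
```

For vanishing source strength, $w = 0$, the order is $\nu = 1/2$, the zeroes are $m\pi$, and the sketch gives $\mathcal{E}_m = g^2 m^2/8$, in line with the sine solution mentioned above.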
8. Non-perturbative regularization and renormalization

The subject of renormalization is a large one and high-energy theorists have developed a standard set of renormalization techniques based on perturbation theory (see, for example, Ref. [111]). However, many of these techniques are poorly suited for light-front field theory. Researchers in light-front field theory must either borrow techniques from condensed matter physics [374,435,453] or nuclear physics or come up with entirely new approaches. Some progress in this direction has already been made, see, for example, Refs. [294,444-447,456]. A considerable amount of work is focusing on these questions [8,2,85,134,194,233,464]; particularly see the work of Bassetto et al. [5,23-26], Bakker et al. [309,310,399], Brisudova et al. [48-50], and Zhang et al. [461-466].

It should be noted, however, that the work of Perry and collaborators [48-50,461-466] has strange aspects beyond all the effort. Front-form QCD is a theory with many useful symmetries like gauge invariance, Lorentz invariance (and thus boost and rotational invariance), and chiral invariance. But these
authors, particularly in Refs. [49,50,466], somehow manage to admittedly give up each and every one of them, including rotational invariance (thus no degeneracies of multiplets). This shows how difficult the problem is. But it is not really what one aims at.

The biggest challenge to renormalization of light-front field theory is the infrared divergences that arise. Recall that the Hamiltonian for a free particle is

$$P^- = \frac{P_\perp^2 + m^2}{2P^+} . \tag{8.1}$$

Small longitudinal momentum $P^+$ is associated with large energies. Thus, light-front field theory is subject to infrared longitudinal divergences. These divergences are quite different in nature from the infrared divergences found in equal-time quantized field theory. In order to remove small $P^+$ states, one must introduce non-local counter terms into the Hamiltonian. Power counting arguments allow arbitrary functions of transverse momenta to be associated with these counter terms. This is in contrast to more conventional approaches where demanding locality strongly constrains the number of allowed operators.

One hopes to use light-front field theory to perform bound state calculations. In this case one represents a bound state by a finite number of particles (a Tamm-Dancoff truncation) whose momenta are restricted to some finite interval. This has a number of implications. In particular, momentum cutoffs and Tamm-Dancoff truncations both tend to break various symmetries of a theory. Proper renormalization must restore these symmetries. In contrast, conventional calculations choose regulators (like dimensional regularization) that do not break many symmetries.

In conventional approaches, one is often concerned simply whether the system is renormalizable, that is, whether the large cutoff limit is well defined. In bound-state calculations, one is also interested in how quickly the results converge as one increases the cutoffs since numerical calculations must be performed with a finite cutoff.
Thus, one is potentially interested in the effects of irrelevant operators along with the usual marginal and relevant operators.

Conventional renormalization is inherently perturbative in nature. However, we are interested in many phenomena that are essentially non-perturbative: bound states, confinement, and spontaneous symmetry breaking. The bulk of renormalization studies in light-front field theory to date have used perturbative techniques [456,182]. Non-perturbative techniques must be developed.

Generally, one expects that renormalization will produce a large number of operators in the light-front Hamiltonian. A successful approach to renormalization must be able to produce these operators automatically (say, as part of a numerical algorithm). In addition, there should be only a few free parameters which must be fixed phenomenologically. Otherwise, the predictive power of a theory will be lost.

8.1. Tamm-Dancoff integral equations

Let us start by looking at a simple toy model that has been studied by a number of authors [435,182,374,426,427,244]. In fact, it is the famous Kondo problem truncated to one-particle states [277]. Consider the homogeneous integral equation

$$(p - E)\,\phi(p) + g\int_0^\Lambda dp'\,\phi(p') = 0 \tag{8.2}$$
with eigenvalue $E$ and eigenvector $\phi(p)$. This is a model for the Tamm-Dancoff equation of a single particle of momentum $p$ with Hamiltonian $H(p, p') = p\,\delta(p - p') + g$. We will focus on the $E < 0$ bound state solution:

$$\phi(p) = \frac{\text{constant}}{p - E} , \qquad E = \frac{\Lambda}{1 - e^{-1/g}} . \tag{8.3}$$
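As a quick consistency check of Eq. (8.3), one can insert $\phi(p) = 1/(p - E)$ into Eq. (8.2) and evaluate the integral numerically. The sketch below is only an illustration: the function names, the midpoint quadrature, and the sample values $g = -0.2$, $\Lambda = 10$ are our own choices (a negative coupling is needed for $E < 0$).

```python
import math


def bound_state_energy(g, cutoff):
    """Eigenvalue of Eq. (8.3): E = Lambda / (1 - exp(-1/g))."""
    return cutoff / (1.0 - math.exp(-1.0 / g))


def residual(E, g, cutoff, n_steps=20000):
    """Eq. (8.2) divided by the constant in phi(p) = constant / (p - E).

    The first term becomes 1; the integral of 1/(p' - E) over [0, cutoff]
    is approximated with the midpoint rule.
    """
    h = cutoff / n_steps
    integral = sum(h / ((i + 0.5) * h - E) for i in range(n_steps))
    return 1.0 + g * integral


g, cutoff = -0.2, 10.0
E = bound_state_energy(g, cutoff)
print(E, residual(E, g, cutoff))
```

The residual vanishes up to quadrature error, and `bound_state_energy` scales linearly with the cutoff, which is the divergence that the renormalization discussed next has to remove.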
Note that the eigenvalue diverges in the limit $\Lambda \to \infty$. Proper renormalization involves modifying the system to make $E$ and $\phi(p)$ independent of $\Lambda$ in the limit $\Lambda \to \infty$. Towards this end, we add a counterterm $C_\Lambda$ to the Hamiltonian. Invoking the high-low analysis [454], we divide the interval $0 < p < \Lambda$ into two subintervals: $0 < p < L$, a “low-momentum region”, and $L < p < \Lambda$, a “high-momentum region”, where the momentum scales characterized by $E$, $L$, and $\Lambda$ are assumed to be widely separated. The idea is that the eigenvalue and eigenvector should be independent of the behavior of the system in the high momentum region. The eigenvalue equation can be written as two coupled equations

$$p \in [0, L] , \quad (p - E)\,\phi(p) + (g + C_\Lambda)\int_0^L dp'\,\phi(p') + (g + C_\Lambda)\int_L^\Lambda dp'\,\phi(p') = 0 , \tag{8.4}$$

$$p \in [L, \Lambda] , \quad (p - E)\,\phi(p) + (g + C_\Lambda)\int_0^L dp'\,\phi(p') + (g + C_\Lambda)\int_L^\Lambda dp'\,\phi(p') = 0 . \tag{8.5}$$

Integrating Eq. (8.5) in the limit $L, \Lambda \gg E$,

$$\int_L^\Lambda dp\,\phi(p) = -\frac{(g + C_\Lambda)\log(\Lambda/L)}{1 + (g + C_\Lambda)\log(\Lambda/L)}\int_0^L dp'\,\phi(p') , \tag{8.6}$$

and substituting this expression into Eq. (8.4), we obtain an eigenvalue equation with the high-momentum region integrated out

$$p \in [0, L] , \quad (p - E)\,\phi(p) + \frac{g + C_\Lambda}{1 + (g + C_\Lambda)\log(\Lambda/L)}\int_0^L dp'\,\phi(p') = 0 . \tag{8.7}$$

If we demand this expression to be independent of $\Lambda$,

$$\frac{d}{d\Lambda}\left(\frac{g + C_\Lambda}{1 + (g + C_\Lambda)\log(\Lambda/L)}\right) = 0 , \tag{8.8}$$

we obtain a differential equation for $C_\Lambda$,

$$\frac{dC_\Lambda}{d\Lambda} = \frac{(g + C_\Lambda)^2}{\Lambda} . \tag{8.9}$$

Solving this equation, we are free to insert an arbitrary constant $-1/A_\mu - \log\mu$:

$$g + C_\Lambda = \frac{A_\mu}{1 - A_\mu\log(\Lambda/\mu)} . \tag{8.10}$$

Substituting this result back into Eq. (8.7),

$$p \in [0, L] , \quad (p - E)\,\phi(p) + \frac{A_\mu}{1 - A_\mu\log(L/\mu)}\int_0^L dp'\,\phi(p') = 0 , \tag{8.11}$$
we see that $\Lambda$ has been removed from the equation entirely. Using Eq. (8.10) in the original eigenvalue equation

$$(p - E)\,\phi(p) + \frac{A_\mu}{1 - A_\mu\log(\Lambda/\mu)}\int_0^\Lambda dp'\,\phi(p') = 0 \tag{8.12}$$

gives the same equation as Eq. (8.11) with $L$ replaced by $\Lambda$. The eigenvalue is now

$$E = \frac{\Lambda}{1 - (\Lambda/\mu)\,e^{-1/A_\mu}} , \qquad \lim_{\Lambda\to\infty} E = -\mu\,e^{1/A_\mu} . \tag{8.13}$$

Although the eigenvalue is still a function of the cutoff for finite $\Lambda$, the eigenvalue does become independent of the cutoff in the limit $\Lambda \to \infty$, and the system is properly renormalized.

One can think of $A_\mu$ as the renormalized coupling constant and $\mu$ as the renormalization scale. In that case, the eigenvalue should depend on the choice of $A_\mu$ for a given $\mu$ but be independent of $\mu$ itself. Suppose, for Eq. (8.13), we want to change $\mu$ to a new value, say $\mu'$. In order that the eigenvalue remains the same, we must also change the coupling constant from $A_\mu$ to $A_{\mu'}$,

$$\mu\,e^{1/A_\mu} = \mu'\,e^{1/A_{\mu'}} . \tag{8.14}$$
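The statements above can be made concrete with a few lines of arithmetic. The sketch below, with hypothetical function names and the sample values $A_\mu = -0.5$, $\mu = 1$ chosen by us, evaluates the bare coupling of Eq. (8.10), checks that it obeys Eq. (8.9) by a finite difference, compares the finite-cutoff eigenvalue with its limit in Eq. (8.13), and moves the renormalization scale according to Eq. (8.14).

```python
import math


def bare_coupling(A_mu, mu, cutoff):
    """g + C_Lambda of Eq. (8.10)."""
    return A_mu / (1.0 - A_mu * math.log(cutoff / mu))


def ode_residual(A_mu, mu, cutoff, rel_step=1e-6):
    """Central-difference check of Eq. (8.9): d(g + C)/dLambda - (g + C)^2 / Lambda."""
    h = cutoff * rel_step
    derivative = (bare_coupling(A_mu, mu, cutoff + h)
                  - bare_coupling(A_mu, mu, cutoff - h)) / (2.0 * h)
    return derivative - bare_coupling(A_mu, mu, cutoff) ** 2 / cutoff


def eigenvalue(A_mu, mu, cutoff):
    """Finite-cutoff eigenvalue of Eq. (8.13)."""
    return cutoff / (1.0 - (cutoff / mu) * math.exp(-1.0 / A_mu))


def eigenvalue_limit(A_mu, mu):
    """Cutoff-independent limit of Eq. (8.13)."""
    return -mu * math.exp(1.0 / A_mu)


def run_coupling(A_mu, mu, mu_new):
    """Coupling at the new scale, solving Eq. (8.14) for A at mu_new."""
    return 1.0 / (1.0 / A_mu + math.log(mu / mu_new))


A_mu, mu = -0.5, 1.0
for cutoff in (1e2, 1e4, 1e8):
    print(cutoff, eigenvalue(A_mu, mu, cutoff))
print(eigenvalue_limit(A_mu, mu),
      eigenvalue_limit(run_coupling(A_mu, mu, 3.0), 3.0))
```

The finite-cutoff values approach the limit as the cutoff grows, and the two numbers on the last line coincide: a change of $\mu$ compensated by Eq. (8.14) leaves the eigenvalue untouched.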
In the same manner, one can write down a $\beta$-function for $A_\mu$ [436],

$$\mu\,\frac{d}{d\mu}A_\mu = A_\mu^2 . \tag{8.15}$$

Using these ideas one can examine the general case. Throughout, we will be working with operators projected onto some Tamm-Dancoff subspace (finite particle number) of the full Fock space. In addition, we will regulate the system by demanding that each component of momentum of each particle lies within some finite interval. One defines the “cutoff” $\Lambda$ to be an operator which projects onto this subspace of finite particle number and finite momenta. Thus, for any operator $O$, $O \equiv \Lambda O \Lambda$. Consider the Hamiltonian

$$H = H_0 + V + C_\Lambda , \tag{8.16}$$

where, in the standard momentum space basis, $H_0$ is the diagonal part of the Hamiltonian, $V$ is the interaction term, and $C_\Lambda$ is the counter term which is to be determined and is a function of the cutoff. Each term of the Hamiltonian is hermitian and compact. Schrödinger’s equation can be written

$$(H_0 - E)|\phi\rangle + (V + C_\Lambda)|\phi\rangle = 0 \tag{8.17}$$

with energy eigenvalue $E$ and eigenvector $|\phi\rangle$. The goal is to choose $C_\Lambda$ such that $E$ and $|\phi\rangle$ are independent of $\Lambda$ in the limit of large cutoff.

One now makes an important assumption: the physics of interest is characterized by energy scale $E$ and is independent of physics near the boundary of the space spanned by $\Lambda$. Following the approach of the previous section, one defines two projection operators, $Q$ and $P$, where $\Lambda = Q + P$, $QP = PQ = 0$, and $Q$ and $P$ commute with $H_0$. $Q$ projects onto a “high-momentum region” which contains energy scales one does not care about, and $P$ projects onto a “low-momentum region” which contains energy scales characterized by $E$. Schrödinger’s equation (8.17)
can be rewritten as two coupled equations:

$$(H_0 - E)P|\phi\rangle + P(V + C_\Lambda)P|\phi\rangle + P(V + C_\Lambda)Q|\phi\rangle = 0 , \tag{8.18}$$

and

$$(H_0 - E)Q|\phi\rangle + Q(V + C_\Lambda)Q|\phi\rangle + Q(V + C_\Lambda)P|\phi\rangle = 0 . \tag{8.19}$$

Using Eq. (8.19), one can formally solve for $Q|\phi\rangle$ in terms of $P|\phi\rangle$,

$$Q|\phi\rangle = \frac{1}{Q(E - H)Q}\,(V + C_\Lambda)P|\phi\rangle . \tag{8.20}$$

The term with the denominator is understood to be defined in terms of its series expansion in $V$. One can substitute this result back into Eq. (8.18),

$$(H_0 - E)P|\phi\rangle + P(V + C_\Lambda)P|\phi\rangle + P(V + C_\Lambda)\,\frac{1}{Q(E - H)Q}\,(V + C_\Lambda)P|\phi\rangle = 0 . \tag{8.21}$$

In order to properly renormalize the system, we could choose $C_\Lambda$ such that Eq. (8.21) is independent of one’s choice of $\Lambda$ for a fixed $P$ in the limit of large cutoffs. However, we will make a stronger demand: that Eq. (8.21) should be equal to Eq. (8.17) with the cutoff $\Lambda$ replaced by $P$. One can express $C_\Lambda$ as the solution of an operator equation, the “counter term equation”,

$$V_\Lambda = V - V F V_\Lambda , \tag{8.22}$$

where $V_\Lambda = V + C_\Lambda$, and provided that we can make the approximation

$$V\,\frac{Q}{E - H_0}\,V \approx V Q F V . \tag{8.23}$$

This is what we will call the “renormalizability condition”. A system is properly renormalized if, as we increase the cutoffs $\Lambda$ and $P$, Eq. (8.23) becomes an increasingly good approximation. In the standard momentum space basis, this becomes a set of coupled inhomogeneous integral equations. Such equations generally have a unique solution, allowing us to renormalize systems without having to resort to perturbation theory. This includes cases where the perturbative expansion diverges or converges slowly.

There are many possible choices for $F$ that satisfy the renormalizability condition. For instance, one might argue that we want $F$ to resemble $1/(E - H_0)$ as much as possible and choose

$$F = \frac{1}{\mu - H_0} , \tag{8.24}$$

where the arbitrary constant $\mu$ is chosen to be reasonably close to $E$. In this case, one might be able to use a smaller cutoff in numerical calculations.

One might argue that physics above some energy scale $\mu$ is simpler and that it is numerically too difficult to include the complications of the physics at energy scale $E$ in the solution of the counter term equation. Thus, one could choose

$$F = -\frac{\theta(H_0 - \mu)}{H_0} , \tag{8.25}$$

where the arbitrary constant $\mu$ is chosen to be somewhat larger than $E$ but smaller than the energy scale associated with the cutoff. The $\theta$-function is assumed to act on each diagonal element in the standard momentum space basis. The difficulty with this renormalization scheme is that it involves three different energy scales, $E$, $\mu$ and the cutoff, which might make the numerical problem more
difficult.

One can relate our approach to conventional renormalization group concepts. In renormalization group language, $V_\Lambda$ is the bare interaction term and $V$ is the renormalized interaction term. In both of the renormalization schemes introduced above, we introduced an arbitrary energy scale $\mu$; this is the renormalization scale. Now, physics (the energy eigenvalues and eigenvectors) should not depend on this parameter or on the renormalization scheme itself, for that matter. How does one move from one renormalization scheme to another? Consider a particular choice of renormalized interaction term $V$ associated with a renormalization scheme which uses $F$ in the counter term equation. We can use the counter term equation to find the bare coupling $V_\Lambda$ in terms of $V$. Now, to find the renormalized interaction term $V'$ associated with a different renormalization scheme using a different operator $F'$ in the counter term equation, we simply use the counter term equation with $V_\Lambda$ as given and solve for $V'$

$$V' = V_\Lambda + V_\Lambda F' V' . \tag{8.26}$$

Expanding this procedure order by order in $V$ and summing the result, we can obtain an operator equation relating the two renormalized interaction terms directly

$$V' = V + V(F' - F)V' . \tag{8.27}$$
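The chain from Eq. (8.22) through Eq. (8.26) to Eq. (8.27) is pure operator algebra, so it can be illustrated with small matrices. The sketch below is a toy under stated assumptions: all operators are replaced by $2\times 2$ real matrices, $H_0$ is diagonal, $F$ is taken from Eq. (8.24) at two different scales, the projection onto the cutoff subspace is ignored, and the numerical entries and function names are hypothetical.

```python
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_add(a, b, sign=1.0):
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]


def mat_inv(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


def propagator(mu, h0_diag):
    """F = 1/(mu - H_0) for diagonal H_0, as in Eq. (8.24)."""
    return [[1.0 / (mu - h0_diag[0]), 0.0],
            [0.0, 1.0 / (mu - h0_diag[1])]]


def bare_interaction(V, F):
    """Solve Eq. (8.22), V_L = V - V F V_L, as V_L = (1 + V F)^(-1) V."""
    return mat_mul(mat_inv(mat_add(IDENTITY, mat_mul(V, F))), V)


def renormalized_interaction(V_bare, F_new):
    """Solve Eq. (8.26), V' = V_L + V_L F' V', as V' = (1 - V_L F')^(-1) V_L."""
    return mat_mul(mat_inv(mat_add(IDENTITY, mat_mul(V_bare, F_new), -1.0)),
                   V_bare)


def residual_8_27(V, F, F_new):
    """Largest entry of V' - [V + V (F' - F) V'] with V' built via Eq. (8.26)."""
    V_new = renormalized_interaction(bare_interaction(V, F), F_new)
    rhs = mat_add(V, mat_mul(mat_mul(V, mat_add(F_new, F, -1.0)), V_new))
    return max(abs(V_new[i][j] - rhs[i][j]) for i in range(2) for j in range(2))


h0 = [1.0, 3.0]
V = [[0.3, 0.1], [0.1, -0.2]]
print(residual_8_27(V, propagator(0.5, h0), propagator(0.2, h0)))
```

The printed residual vanishes to rounding accuracy, showing that the bare interaction drops out of the relation between the two schemes, as Eq. (8.27) asserts.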
The renormalizability condition ensures that this expression will be independent of the cutoff in the limit of large cutoff. For the two particular renormalization schemes mentioned above, Eqs. (8.24) and (8.25), we can regard the renormalized interaction term $V$ as an implicit function of $\mu$. We can see how the renormalized interaction term changes with $\mu$ in the case of Eq. (8.24):

$$\mu\,\frac{dV}{d\mu} = -V\,\frac{\mu}{(H_0 - \mu)^2}\,V , \tag{8.28}$$

and in the case of Eq. (8.25),

$$\mu\,\frac{dV}{d\mu} = V\,\delta(H_0 - \mu)\,V . \tag{8.29}$$

This is a generalization of the $\beta$-function.

The basic idea of asymptotic and box counter term renormalization in the 3+1 Yukawa model calculation in an earlier section can be illustrated with a simple example. Consider an eigenvalue equation of the form [426,427],

$$k\,\phi(k) - g\int_0^\Lambda dq\,V(k, q)\,\phi(q) = E\,\phi(k) . \tag{8.30}$$

Making a high-low analysis of this equation as above and assuming that

$$V_{LH}(k, q) = V_{HL}(k, q) = V_{HH}(k, q) = f , \tag{8.31}$$

one finds the following renormalized equation:

$$k\,\phi(k) - g\int_0^\Lambda dq\,[V(k, q) - f]\,\phi(q) - \frac{A_\mu}{1 + A_\mu\ln(\Lambda/\mu)}\int_0^\Lambda dq\,\phi(q) = E\,\phi(k) . \tag{8.32}$$

One has renormalized the original equation in the sense that the low-energy eigenvalue $E$ is independent of the high-energy cutoff and we have an arbitrary parameter C which can be adjusted to fit the ground-state energy level.
One can motivate both the asymptotic counter term and one-box counter term in the Yukawa calculation as different choices in our analysis. For a fixed $\mu$ we are free to choose $A_\mu$ at will. The simple asymptotic counter term corresponds to $A_\mu = 0$. However, subtracting the asymptotic behavior of the kernel with the term $g f$ causes the wavefunction to fall off more rapidly than it would otherwise at large $q$. As a result the term $\bigl(A_\mu/(1 + A_\mu\ln\Lambda/\mu)\bigr)\int\phi\,dq$ is finite, and this term can be retained as an arbitrary adjustable finite counter term.

The perturbative counter terms correspond to $A_\mu = g f$, then expanding in $g\ln\Lambda/\mu$. Then one finds

$$k\,\phi(k) - g\int_0^\Lambda dq\,[V(k, q) - f]\,\phi(q) - g f\sum_{n=0}^{\infty}\left(-g f\ln\Lambda/\mu\right)^n\int_0^\Lambda dq\,\phi(q) = E\,\phi(k) . \tag{8.33}$$

Keeping the first two terms in the expansion one gets the so-called “Box counter term”

$$k\,\phi(k) - g\int_0^\Lambda dq\,V(k, q)\,\phi(q) + g^2 f^2\ln(\Lambda/\mu)\int_0^\Lambda dq\,\phi(q) = E\,\phi(k) . \tag{8.34}$$
Note that the box counter term contains $f^2$, indicating that it involves the kernel at high momentum twice.

Ideally, one would like to carry out the non-perturbative renormalization program rigorously in the sense that the cutoff independence is achieved for any value of the coupling constant and any value of the cutoff. In practical cases, either one may not have the luxury to go to very large cutoff or the analysis itself may get too complicated. For example, the assumption of a uniform high-energy limit was essential for summing up the series. In reality, $V_{HH}$ may differ from $V_{LH}$.

The following is a simplified two-variable problem that is more closely related to the equations and approximations used in the Yukawa calculation. The form of the asymptotic counter term that was used can be understood by considering the following equation:

$$\frac{k}{x(1-x)}\,\phi(k, x) - g\int_0^\Lambda dq\int_0^1 dy\,K(k, q)\,\phi(q, y) = E\,\phi(k, x) . \tag{8.35}$$

This problem contains only $x$ dependence associated with the free energy, and no $x$ dependence in the kernel. It is easily solved using the high-low analysis used above and one finds

$$\frac{k}{x(1-x)}\,\phi(k, x) - g\int_0^\Lambda dq\int_0^1 dy\,\bigl(K(k, q) - f\bigr)\,\phi(q, y) \tag{8.36}$$

$$\qquad - \frac{A_\mu}{1 + \tfrac{1}{6}A_\mu\ln\Lambda/\mu}\int_0^\Lambda dq\int_0^1 dy\,\phi(q, y) = E\,\phi(k, x) . \tag{8.37}$$

The factor of 1/6 comes from the integral $\int_0^1 dx\,x(1-x)$. This result motivates our choice for $G_\Lambda$ in the Yukawa calculation.
8.2. Wilson renormalization and confinement

QCD was a step backwards in the sense that it forced upon us a complex and mysterious vacuum. In QCD, because the effective coupling grows at long distances, there is always copious production of low-momentum gluons, which immediately invalidates any picture based on a few
constituents. Of course, this step was necessary to understand the nature of confinement and of chiral symmetry breaking, both of which imply a nontrivial vacuum structure. But for 20 years we have avoided the question: Why did the CQM work so well that no one saw any need for a complicated vacuum before QCD came along?

A bridge between equal-time quantized QCD and the equal-time CQM would clearly be extremely complicated, because in the equal-time formalism there is no easy non-perturbative way to make the vacuum simple. Thus, a sensible description of constituent quarks and gluons would be in terms of quasiparticle states, i.e., complicated collective excitations above a complicated ground state. Understanding the relation between the bare states and the collective states would involve understanding the full solution to the theory. Wilson and collaborators argue that on the light front, however, simply implementing a cutoff on small longitudinal momenta suffices to make the vacuum completely trivial. Thus, one immediately obtains a constituent-type picture, in which all partons in a hadronic state are connected directly to the hadron. The price one pays to achieve this constituent framework is that the renormalization problem becomes considerably more complicated on the light front [178-183].

Wilson and collaborators also included a mass term for the gluons as well as the quarks (they include only transverse polarization states for the gluons) in $H_{\rm free}$. They have in mind here that all masses that occur in $H_{\rm free}$ should roughly correspond to constituent rather than current masses. There are two points that should be emphasized in this regard.

First, cutoff-dependent masses for both the quarks and gluons will be needed anyway as counter terms. This occurs because all the cutoffs one has for non-perturbative Hamiltonian calculations violate both equal-time chiral symmetry and gauge invariance.
These symmetries, if present, would have protected the quarks and gluons from acquiring this kind of mass correction. Instead, in the calculations discussed here both the fermion and gluon self-masses are quadratically divergent in a transverse momentum cutoff $\Lambda$. The second point is more physical. When setting up perturbation theory (more on this below) one should always keep the zeroth order problem as close to the observed physics as possible. Furthermore, the division of a Hamiltonian into free and interacting parts is always completely arbitrary, though the convergence of the perturbative expansion may hinge crucially on how this division is made. Non-zero constituent masses for both quarks and gluons clearly come closer to the phenomenological reality (for hadrons) than do massless gluons and nearly massless light quarks.
Now, the presence of a non-zero gluon mass has important consequences. First, it automatically stops the running of the coupling below a scale comparable to the mass itself. This allows one to (arbitrarily) start from a small coupling at the gluon mass scale so that perturbation theory is everywhere valid, and only extrapolate back to the physical value of the coupling at the end. The quark and gluon masses also provide a kinematic barrier to parton production; the minimum free energy that a massive parton can carry is $m^2/p^+$, so that as more partons are added to a state and the typical $p^+$ of each parton becomes small, the added partons are forced to have high energies. Finally, the gluon mass eliminates any infrared problems of the conventional equal-time type.
In their initial work they use a simple cutoff on constituent energies, that is, requiring
$$\frac{p_\perp^2 + m^2}{p^+} < \frac{\Lambda^2}{P^+} \tag{8.38}$$
for each constituent in a given Fock state. Imposing Eq. (8.38) does not completely regulate the theory, however; there are additional small-$p^+$ divergences coming from the instantaneous terms in
the Hamiltonian. They regulate these by treating them as if the instantaneously exchanged gluons and quarks were actually constituents.
Having stopped the running of the coupling below the constituent mass scale, one arbitrarily takes it to be small at this scale, so that perturbation theory is valid at all energy scales. Now one can use power counting to identify all relevant and marginal operators (relevant or marginal in the renormalization group sense). Because of the cutoffs one must use, these operators are not restricted by Lorentz or gauge invariance. Because we have forced the vacuum to be trivial, the effects of spontaneous chiral symmetry breaking must be manifested in explicit chiral symmetry-breaking effective interactions. This means the operators are not restricted by chiral invariance either. There are thus a large number of allowed operators. Furthermore, since transverse divergences occur for any longitudinal momentum, the operators that remove transverse cutoff dependence contain functions of dimensionless ratios of all available longitudinal momenta. That is, many counter terms are not parameterized by single coupling constants, but rather by entire functions of longitudinal momenta. A precisely analogous result obtains for the counter terms for light-front infrared divergences; these will involve entire functions of transverse momenta. The counter term functions can in principle be determined by requiring that Lorentz and gauge invariance be restored in the full theory.
The cutoff Hamiltonian, with renormalization counter terms, will thus be given as a power series in $g_\Lambda$:
$$H(\Lambda) = H^{(0)} + g_\Lambda H^{(1)} + g_\Lambda^2 H^{(2)} + \cdots , \tag{8.39}$$
where all dependence on the cutoff $\Lambda$ occurs through the running coupling $g_\Lambda$ and cutoff-dependent masses.
The next stage in building a bridge from the CQM to QCD is to establish a connection between the ad hoc $q\bar q$ potentials of the CQM and the complex many-body Hamiltonian of QCD. In lowest order the canonical QCD Hamiltonian contains gluon emission and absorption terms, including emission and absorption of high-energy gluons. Since a gluon has energy $(k_\perp^2 + \mu^2)/k^+$ for momentum $k$, a high-energy gluon can result either if $k_\perp$ is large or $k^+$ is small. But in the CQM, gluon emission is ignored and only low-energy states matter. How can one overcome this double disparity? The answer is that we can change the initial cutoff Hamiltonian $H(\Lambda)$ by applying a unitary transformation to it. We imagine constructing a transformation $U$ that generates a new effective Hamiltonian $H_{\rm eff}$:
$$H_{\rm eff} = U^\dagger H(\Lambda)\, U . \tag{8.40}$$
We then choose $U$ to cause $H_{\rm eff}$ to look as much like a CQM as we can [35,444-446].
The essential idea is to start out as though we were going to diagonalize the Hamiltonian $H(\Lambda)$, except that we stop short of computing actual bound states. A complete diagonalization would generate an effective Hamiltonian $H_{\rm eff}$ in diagonal form; all its off-diagonal matrix elements would be zero. Furthermore, in the presence of bound states the fully diagonalized Hamiltonian would act in a Hilbert space with discrete bound states as well as continuum quark-gluon states. In a confined theory there would only be bound states. What we seek is a compromise: an effective Hamiltonian in which some of the off-diagonal elements can be nonzero, but in return the Hilbert space for $H_{\rm eff}$ remains the quark-gluon continuum that is the basis for $H(\Lambda)$. No bound states
should arise. All bound states are to occur through the diagonalization of $H_{\rm eff}$, rather than being part of the basis in which $H_{\rm eff}$ acts.
To obtain a CQM-like effective Hamiltonian, we would ideally eliminate all off-diagonal elements that involve emission and absorption of gluons or of $q\bar q$ pairs. It is the emission and absorption processes that are absent from the CQM, so we should remove them by the unitary transformation. However, we would allow off-diagonal terms to remain within any given Fock sector, such as $q\bar q \to q\bar q$ off-diagonal terms or $qqq \to qqq$ terms. This means we allow off-diagonal potentials to remain, and trust that bound states appear only when the potentials are diagonalized.
Actually, as discussed in Ref. [456], we cannot remove all the off-diagonal emission and absorption terms. This is because the transformation $U$ is sufficiently complex that we only know how to compute it in perturbation theory. Thus, we can reliably remove in this way only matrix elements that connect states with a large energy difference; perturbation theory breaks down if we try to remove, for example, the coupling of a low-energy quark to a low-energy quark-gluon pair. They therefore introduce a second cutoff parameter $\lambda^2/P^+$, and design the similarity transformation to remove off-diagonal matrix elements between sectors where the energy difference between the initial and final states is greater than this cutoff. For example, in second order the effective Hamiltonian has a one-gluon exchange contribution in which the intermediate gluon state has an energy above the running cutoff. Since the gluon energy is $(k_\perp^2 + \mu^2)/k^+$, where $k$ is the exchanged gluon momentum, the cutoff requirement is
$$\frac{k_\perp^2 + \mu^2}{k^+} > \frac{\lambda^2}{P^+} . \tag{8.41}$$
This procedure is known as the “similarity renormalization group” method. For a more detailed discussion and for connections to renormalization group concepts see Ref. [456].
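The two kinematic conditions used above, the constituent-energy cutoff of Eq. (8.38) and the exchanged-gluon requirement of Eq. (8.41), can be illustrated with a minimal Python sketch. The function names and all numerical values are hypothetical, chosen only to show how a small $p^+$ drives the light-front free energy above a cutoff; they are not taken from the calculations of Wilson and collaborators.

```python
# Toy illustration of the kinematic conditions in Eqs. (8.38) and (8.41).
# Function names and numbers are illustrative assumptions, in arbitrary units.

def free_energy(p_perp_sq, mass_sq, p_plus):
    """Light-front free energy (p_perp^2 + m^2) / p^+ of one parton."""
    if p_plus <= 0:
        raise ValueError("p^+ must be strictly positive")
    return (p_perp_sq + mass_sq) / p_plus


def passes_constituent_cutoff(constituents, cutoff_sq, total_p_plus):
    """Eq. (8.38): every constituent must have energy below Lambda^2 / P^+.

    `constituents` is a list of (p_perp^2, m^2, p^+) tuples.
    """
    bound = cutoff_sq / total_p_plus
    return all(free_energy(pp, m2, pl) < bound for (pp, m2, pl) in constituents)


def removed_by_similarity(k_perp_sq, gluon_mass_sq, k_plus, lam_sq, total_p_plus):
    """Eq. (8.41): the exchange is removed if the gluon energy exceeds lambda^2 / P^+."""
    return free_energy(k_perp_sq, gluon_mass_sq, k_plus) > lam_sq / total_p_plus


if __name__ == "__main__":
    P_PLUS, MASS_SQ, CUTOFF_SQ = 1.0, 0.1, 10.0
    # Two constituents sharing P^+ equally stay well below the cutoff.
    print(passes_constituent_cutoff([(0.0, MASS_SQ, 0.5)] * 2, CUTOFF_SQ, P_PLUS))
    # A massive parton with very small p^+ is forced above it.
    print(passes_constituent_cutoff([(0.0, MASS_SQ, 0.005)], CUTOFF_SQ, P_PLUS))
```

The second call shows the kinematic barrier mentioned in the text: with a non-zero mass, a parton carrying a small fraction of $P^+$ has a large free energy even at zero transverse momentum.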
The result of the similarity transformation is to generate an effective light-front Hamiltonian $H_{\rm eff}$, which must be solved non-perturbatively. Guided by the assumption that a constituent picture emerges, in which the physics is dominated by potentials in the various Fock space sectors, we can proceed as follows.
We first split $H_{\rm eff}$ anew into an unperturbed part $H_0$ and a perturbation $V$. The principle guiding this new division is that $H_0$ should contain the most physically relevant operators, e.g., constituent-scale masses and the potentials that are most important for determining the bound-state structure. All operators that change particle number should be put into $V$, as we anticipate that transitions between sectors should be a small effect. This is consistent with our expectation that a constituent picture results, but this must be verified by explicit calculations. Next we solve $H_0$ non-perturbatively in the various Fock space sectors, using techniques from many-body physics. Finally, we use bound-state perturbation theory to compute corrections due to $V$.
We thus introduce a second perturbation theory as part of building the bridge. The first perturbation theory is that used in the computation of the unitary transformation $U$ for the incomplete diagonalization. The second perturbation theory is used in the diagonalization of $H_{\rm eff}$ to yield bound-state properties. Perry in particular has emphasized the importance of distinguishing these two different perturbative treatments [367]. The first is a normal field-theoretic perturbation theory based on an unperturbed free field theory. In the second perturbation theory a different unperturbed Hamiltonian is chosen, one that includes the dominant potentials that establish the bound-state structure of the theory. Our working assumption is that the dominant potentials come from the lowest-order potential terms generated in the perturbation
expansion for $H_{\rm eff}$ itself. Higher-order terms in $H_{\rm eff}$ would be treated as perturbations relative to these dominant potentials.
It is only in the second perturbative analysis that constituent masses are employed for the free quark and gluon masses. In the first perturbation theory, where we remove transitions to high-mass intermediate states, it is assumed that the expected field theoretic masses can be used, i.e., near-zero up- and down-quark masses and a gluon mass of zero. Because of renormalization effects, however, there are divergent mass counter terms in second order in $H(\Lambda)$. $H_{\rm eff}$ also has second-order mass terms, but they must be finite: all divergent renormalizations are accomplished through the transformation $U$. When we split $H_{\rm eff}$ into $H_0$ and $V$, we include in $H_0$ both constituent quark and gluon masses and the dominant potential terms necessary to give a reasonable qualitative description of hadronic bound states. Whatever is left in $H_{\rm eff}$ after subtracting $H_0$ is defined to be $V$.
In both perturbation computations the same expansion parameter is used, namely the coupling constant $g$. In the second perturbation theory the running value of $g$ measured at the hadronic mass scale is used. In relativistic field theory $g$ at the hadronic scale has a fixed value $g_s$ of order one; but in the computations an expansion for arbitrarily small $g$ is used. It is important to realize that covariance and gauge invariance are violated when $g$ differs from $g_s$; the QCD coupling at any given scale is not a free parameter. These symmetries can only be fully restored when the coupling at the hadronic scale takes its physical value $g_s$.
The conventional wisdom is that any weak-coupling Hamiltonian derived from QCD will have only Coulomb-like potentials, and certainly will not contain confining potentials. Only a strong-coupling theory can exhibit confinement. This wisdom is wrong [456]. When $H_{\rm eff}$ is constructed by the unitary transformation of Eq. (8.40), with $U$ determined by the “similarity renormalization group” method, $H_{\rm eff}$ has an explicit confining potential already in second order! We shall explain this result below. However, first we should give the bad news. If quantum electrodynamics (QED) is solved by the same process as we propose for QCD, then the effective Hamiltonian for QED has a confining potential too. In the electrodynamic case, the confining potential is purely an artifact of the construction of $H_{\rm eff}$, an artifact which disappears when the bound states of $H_{\rm eff}$ are computed. Thus the key issues, discussed below, are to understand how the confining potential is cancelled in the case of electrodynamics, and then to establish what circumstances would prevent a similar cancellation in QCD.
9. Chiral symmetry breaking
In the mid-70s QCD emerged from current algebra and the parton model. In current algebra one makes use of the partially conserved axial-current hypothesis (PCAC), which states that light hadrons would be subjected to a fermionic symmetry called “chiral symmetry” if only the pion mass were zero. If this were the case, the symmetry would be spontaneously broken, and the pions and kaons would be the corresponding Goldstone bosons. The real world slightly misses this state of affairs by effects quantifiable in terms of the pion mass and decay constant. This violation can be expressed in terms of explicit symmetry breaking due to the nonzero masses of the fundamental fermion fields, quarks of three light flavors, and typically one assigns values of 4 MeV for the up-quark, 7 MeV for the down-quark and 130 MeV for the strange-quark [171,403]. Light-front
field theory is particularly well suited to study these symmetries [152]. This section follows closely the review of Daniel Mustaki [341].
9.1. Current algebra
To any given transformation of the fermion field we associate a current
$$\frac{\delta\mathcal{L}}{\delta(\partial_\mu\psi)}\,\frac{\delta\psi}{\theta} = i\bar\psi\gamma^\mu\,\frac{\delta\psi}{\theta} , \tag{9.1}$$
where $\delta\psi$ is the infinitesimal variation parameterized by $\theta$. Consider first the free Dirac theory in space-time and light-front frames. For example, the vector transformation is defined in space-time by
$$\psi \to e^{-i\theta}\psi , \qquad \delta\psi = -i\theta\psi , \tag{9.2}$$
whence the current
$$j^\mu = \bar\psi\gamma^\mu\psi . \tag{9.3}$$
In a light-front frame the vector transformation will be defined as
$$\psi_+ \to e^{-i\theta}\psi_+ , \qquad \delta\psi_+ = -i\theta\psi_+ , \qquad \delta\psi = \delta\psi_+ + \delta\psi_- , \tag{9.4}$$
where $\delta\psi_-$ is calculated in Section 2. The distinction in the case of the vector is of course academic:
$$\delta\psi_- = -i\theta\psi_- \;\Rightarrow\; \delta\psi = -i\theta\psi . \tag{9.5}$$
Therefore, for the free Dirac theory the light-front current $\tilde j^\mu$ is
$$\tilde j^\mu = j^\mu . \tag{9.6}$$
One checks easily that the vector current is conserved:
$$\partial_\mu j^\mu = 0 , \tag{9.7}$$
therefore the space-time and light-front vector charges, which measure fermion number,
$$Q \equiv \int d^3x\, j^0(x) , \qquad \tilde Q \equiv \int d^3\tilde x\, j^+(x) , \tag{9.8}$$
are equal [325]. The space-time chiral transformation is defined by
$$\psi \to e^{-i\theta\gamma_5}\psi , \qquad \delta\psi = -i\theta\gamma_5\psi , \tag{9.9}$$
where $\gamma_5 \equiv i\gamma^0\gamma^1\gamma^2\gamma^3$. From the Hamiltonian, one sees that the space-time theory with nonzero fermion masses is not chirally symmetric. The space-time axial-vector current associated to the transformation is
$$j_5^\mu = \bar\psi\gamma^\mu\gamma_5\psi \tag{9.10}$$
and
$$\partial_\mu j_5^\mu = 2im\,\bar\psi\gamma_5\psi . \tag{9.11}$$
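As expected, this current is not conserved for non-zero fermion mass. The associated charge is
$$Q_5 \equiv \int d^3x\, j_5^0 = \int d^3x\, \bar\psi\gamma^0\gamma_5\psi . \tag{9.12}$$
The light-front chiral transformation is
$$\psi_+ \to e^{-i\theta\gamma_5}\psi_+ , \qquad \delta\psi_+ = -i\theta\gamma_5\psi_+ . \tag{9.13}$$
This is a symmetry of the light-front theory without requiring zero bare masses. Using $\{\gamma^\mu, \gamma_5\} = 0$, one finds
$$\delta\psi_-(x) = -\theta\gamma_5 \int dy^-\, \frac{\epsilon(x^- - y^-)}{4}\, (i\gamma_\perp\cdot\partial_\perp - m)\,\gamma^+\psi_+(y) . \tag{9.14}$$
This expression differs from
$$-i\theta\gamma_5\psi_- = -\theta\gamma_5 \int dy^-\, \frac{\epsilon(x^- - y^-)}{4}\, (i\gamma_\perp\cdot\partial_\perp + m)\,\gamma^+\psi_+(y) , \tag{9.15}$$
therefore $\tilde j_5^\mu \neq j_5^\mu$ (except for the plus component, due to $(\gamma^+)^2 = 0$). To be precise,
$$\tilde j_5^\mu = j_5^\mu + im\,\bar\psi\gamma^\mu\gamma_5 \int dy^-\, \frac{\epsilon(x^- - y^-)}{2}\, \gamma^+\psi_+(y) . \tag{9.16}$$
A straightforward calculation shows that
$$\partial_\mu \tilde j_5^\mu = 0 , \tag{9.17}$$
as expected. Finally, the light-front chiral charge is
$$\tilde Q_5 \equiv \int d^3\tilde x\, \tilde j_5^+ = \int d^3\tilde x\, \bar\psi\gamma^+\gamma_5\psi . \tag{9.18}$$
From the canonical anti-commutator
$$\{\psi(x), \psi^\dagger(y)\}_{x^0 = y^0} = \delta^3(x - y) , \tag{9.19}$$
one derives
$$[\psi, Q_5] = \gamma_5\psi \;\Rightarrow\; [Q, Q_5] = 0 , \tag{9.20}$$
so that fermion number, viz., the number of quarks minus the number of anti-quarks, is conserved by the chiral charge. However, the latter are not conserved separately. This can be seen from the momentum expansion of the field, with which one finds
$$Q_5 = \int \frac{d^3p}{2p^0} \sum_{s=\pm 1} s \left[ \frac{|\mathbf{p}|}{p^0} \left( b^\dagger(p,s)\, b(p,s) + d^\dagger(p,s)\, d(p,s) \right) + \frac{m}{p^0} \left( d^\dagger(-p,s)\, b^\dagger(p,s)\, e^{2ip^0 t} + b(p,s)\, d(-p,s)\, e^{-2ip^0 t} \right) \right] . \tag{9.21}$$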
This implies that when $Q_5$ acts on a hadronic state, it will add or absorb a continuum of quark-antiquark pairs (the well-known pion pole) with a probability amplitude proportional to the fermion mass and inversely proportional to the energy of the pair. Thus, $Q_5$ is most unsuited for classification purposes. In contrast, the light-front chiral charge conserves not only fermion number, but also the number of quarks and anti-quarks separately. In effect, the canonical anti-commutator is
$$\{\psi_+(x), \psi_+^\dagger(y)\}_{x^+ = y^+} = \frac{\Lambda_+}{\sqrt{2}}\, \delta^3(\tilde x - \tilde y) , \tag{9.22}$$
hence the momentum expansion of the field reads
$$\psi_+(x) = \int \frac{d^3\tilde p}{(2\pi)^{3/2}\, 2^{3/4} \sqrt{p^+}} \sum_{h=\pm 1/2} \left[ w(h)\, e^{-ipx}\, b(\tilde p, h) + w(-h)\, e^{+ipx}\, d^\dagger(\tilde p, h) \right] , \tag{9.23}$$
with
$$\{b(\tilde p, h), b^\dagger(\tilde q, h')\} = 2p^+\, \delta^3(\tilde p - \tilde q)\, \delta_{hh'} = \{d(\tilde p, h), d^\dagger(\tilde q, h')\} , \tag{9.24}$$
and
$$\sum_{h=\pm 1/2} w(h)\, w^\dagger(h) = \Lambda_+ . \tag{9.25}$$
In the rest frame of a system, its total angular momentum along the z-axis is called “light-front helicity”; the helicity of an elementary particle is just the usual spin projection; we label the eigenvalues of helicity with the letter “h”. It is easiest to work in the so-called “chiral representation” of Dirac matrices, where
$$\gamma_5 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} , \qquad w\!\left(+\tfrac12\right) = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} , \qquad w\!\left(-\tfrac12\right) = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} \tag{9.26}$$
$$\Rightarrow\; w^\dagger(h)\, \gamma_5\, w(h') = 2h\, \delta_{hh'} . \tag{9.27}$$
Inserting Eq. (9.23) into Eq. (9.18), one finds
$$\tilde Q_5 = \int \frac{d^3\tilde p}{2p^+} \sum_h 2h \left[ b^\dagger(\tilde p, h)\, b(\tilde p, h) + d^\dagger(\tilde p, h)\, d(\tilde p, h) \right] . \tag{9.28}$$
This is just a superposition of fermion and anti-fermion number operators, and thus our claim is proved. This expression also shows that $\tilde Q_5$ annihilates the vacuum, and that it simply measures (twice) the sum of the helicities of all the quarks and anti-quarks of a given state. Indeed, in a light-front frame, the handedness of an individual fermion is automatically determined by its helicity. To show this, note that
$$\gamma_5\, w\!\left(\pm\tfrac12\right) = \pm w\!\left(\pm\tfrac12\right) \;\Rightarrow\; \tfrac12 (1 \pm \gamma_5)\, w\!\left(\pm\tfrac12\right) = w\!\left(\pm\tfrac12\right) , \qquad \tfrac12 (1 \pm \gamma_5)\, w\!\left(\mp\tfrac12\right) = 0 . \tag{9.29}$$
Defining as usual
$$\psi_{+R} \equiv \tfrac12 (1 + \gamma_5)\, \psi_+ , \qquad \psi_{+L} \equiv \tfrac12 (1 - \gamma_5)\, \psi_+ , \tag{9.30}$$
it follows from Eq. (9.23) that $\psi_{+R}$ contains only fermions of helicity $+\tfrac12$ and anti-fermions of helicity $-\tfrac12$, while $\psi_{+L}$ contains only fermions of helicity $-\tfrac12$ and anti-fermions of helicity $+\tfrac12$. Also, we see that when acted upon by the right- and left-hand charges
$$\tilde Q_R \equiv \tfrac12 (\tilde Q + \tilde Q_5) , \qquad \tilde Q_L \equiv \tfrac12 (\tilde Q - \tilde Q_5) , \tag{9.31}$$
a chiral fermion (or anti-fermion) state may have eigenvalues $+1$ (resp. $-1$) or zero. In a space-time frame, this identification between helicity and chirality applies only to massless fermions.
9.2. Flavor symmetries
We proceed now to the theory of three flavors of free fermions $\psi_f$, where $f = u, d, s$, and
$$\psi \equiv \begin{pmatrix} \psi_u \\ \psi_d \\ \psi_s \end{pmatrix} \qquad \text{and} \qquad M \equiv \begin{pmatrix} m_u & 0 & 0 \\ 0 & m_d & 0 \\ 0 & 0 & m_s \end{pmatrix} . \tag{9.32}$$
The vector, and axial-vector, flavor non-singlet transformations are defined, respectively, as
$$\psi \to e^{-i\lambda^a\theta^a/2}\, \psi , \qquad \psi \to e^{-i\lambda^a\theta^a\gamma_5/2}\, \psi , \tag{9.33}$$
where the summation index $a$ runs from 1 to 8. The space-time Hamiltonian $P^0$ is invariant under vector transformations if the quarks have equal masses (“SU(3) limit”), and invariant under chiral transformations if all masses are zero (“chiral limit”). The light-front Hamiltonian is
$$P^- = \sum_f \frac{i\sqrt{2}}{4} \int d^3\tilde x \int dy^-\, \epsilon(x^- - y^-)\, \psi_{f+}^\dagger(y)\, (m_f^2 - \Delta_\perp)\, \psi_{f+}(x) = \frac{i\sqrt{2}}{4} \int d^3\tilde x \int dy^-\, \epsilon(x^- - y^-)\, \psi_+^\dagger(y)\, (M^2 - \Delta_\perp)\, \psi_+(x) . \tag{9.34}$$
Naturally, $P^-$ is not invariant under the vector transformations
$$\psi_+ \to e^{-i\lambda^a\theta^a/2}\, \psi_+ \tag{9.35}$$
unless the quarks have equal masses. But if they do, then $P^-$ is also invariant under the chiral transformations
$$\psi_+ \to e^{-i\lambda^a\theta^a\gamma_5/2}\, \psi_+ , \tag{9.36}$$
whether this common mass is zero or not. One finds that the space-time currents
$$j^{\mu a} = \bar\psi\gamma^\mu \frac{\lambda^a}{2} \psi , \qquad j_5^{\mu a} = \bar\psi\gamma^\mu\gamma_5 \frac{\lambda^a}{2} \psi , \tag{9.37}$$
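have the following divergences:
$$\partial_\mu j^{\mu a} = i\bar\psi \left[ M, \frac{\lambda^a}{2} \right] \psi , \qquad \partial_\mu j_5^{\mu a} = i\bar\psi\gamma_5 \left\{ M, \frac{\lambda^a}{2} \right\} \psi . \tag{9.38}$$
These currents have obviously the expected conservation properties. Turning to the light-front frame we find
$$\tilde j^{\mu a} = j^{\mu a} - i\bar\psi \left[ M, \frac{\lambda^a}{2} \right] \gamma^\mu \int dy^-\, \frac{\epsilon(x^- - y^-)}{4}\, \gamma^+\psi_+(y) . \tag{9.39}$$
So $\tilde j^{\mu a}$ and $j^{\mu a}$ may be equal for all $\mu$ only if the quarks have equal masses. The vector, flavor non-singlet charges in each frame are two different octets of operators, except in the SU(3) limit. For the light-front current associated with axial transformations, we get
$$\tilde j_5^{\mu a} = j_5^{\mu a} - i\bar\psi \left\{ M, \frac{\lambda^a}{2} \right\} \gamma_5\gamma^\mu \int dy^-\, \frac{\epsilon(x^- - y^-)}{4}\, \gamma^+\psi_+(y) . \tag{9.40}$$
Hence, $\tilde j_5^{\mu a}$ and $j_5^{\mu a}$ are not equal (except for $\mu = +$), even in the SU(3) limit, unless all quark masses are zero. Finally, one obtains the following divergences:
$$\partial_\mu \tilde j^{\mu a} = \bar\psi \left[ M^2, \frac{\lambda^a}{2} \right] \int dy^-\, \frac{\epsilon(x^- - y^-)}{4}\, \gamma^+\psi_+(y) , \tag{9.41}$$
$$\partial_\mu \tilde j_5^{\mu a} = -\bar\psi \left[ M^2, \frac{\lambda^a}{2} \right] \gamma_5 \int dy^-\, \frac{\epsilon(x^- - y^-)}{4}\, \gamma^+\psi_+(y) . \tag{9.42}$$
As expected, both light-front currents are conserved in the SU(3) limit, without requiring zero masses. Also note how light-front relations often seem to involve the masses squared, while the corresponding space-time relations are linear in the masses. The integral operator
$$\int dy^-\, \frac{\epsilon(x^- - y^-)}{2} \equiv \frac{1}{\partial_-^x}$$
compensates for the extra power of mass. The associated light-front charges are
$$\tilde Q^a \equiv \int d^3\tilde x\, \bar\psi\gamma^+ \frac{\lambda^a}{2} \psi , \qquad \tilde Q_5^a \equiv \int d^3\tilde x\, \bar\psi\gamma^+\gamma_5 \frac{\lambda^a}{2} \psi . \tag{9.43}$$
Using the momentum expansion of the fermion triplet, Eq. (9.23), where now
$$b(\tilde p, h) \equiv [b_u(\tilde p, h), b_d(\tilde p, h), b_s(\tilde p, h)] \qquad \text{and} \qquad d(\tilde p, h) \equiv [d_u(\tilde p, h), d_d(\tilde p, h), d_s(\tilde p, h)] , \tag{9.44}$$
one can express the charges as
$$\tilde Q^a = \int \frac{d^3\tilde p}{2p^+} \sum_h \left[ b^\dagger(\tilde p, h)\, \frac{\lambda^a}{2}\, b(\tilde p, h) - d^\dagger(\tilde p, h)\, \frac{\lambda^{aT}}{2}\, d(\tilde p, h) \right] , \tag{9.45}$$
$$\tilde Q_5^a = \int \frac{d^3\tilde p}{2p^+} \sum_h 2h \left[ b^\dagger(\tilde p, h)\, \frac{\lambda^a}{2}\, b(\tilde p, h) - d^\dagger(\tilde p, h)\, \frac{\lambda^{aT}}{2}\, d(\tilde p, h) \right] , \tag{9.46}$$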
where the superscript T denotes matrix transposition. Clearly, all 16 charges annihilate the vacuum [235,247,248,302,303,394]. As $\tilde Q^a$ and $\tilde Q_5^a$ conserve the number of quarks and anti-quarks separately, these charges are well-suited for classifying hadrons in terms of their valence constituents, whether the quark masses are equal or not [120]. Since the charges commute with $P^+$ and $P_\perp$, all hadrons belonging to the same multiplet have the same momentum. But this common value of momentum is arbitrary, because in a light-front frame one can boost between any two values of momentum, using only kinematic operators.
One finds that these charges generate an $SU(3)\otimes SU(3)$ algebra:
$$[\tilde Q^a, \tilde Q^b] = i f_{abc}\, \tilde Q^c , \qquad [\tilde Q^a, \tilde Q_5^b] = i f_{abc}\, \tilde Q_5^c , \qquad [\tilde Q_5^a, \tilde Q_5^b] = i f_{abc}\, \tilde Q^c , \tag{9.47}$$
and the corresponding right- and left-hand charges generate two commuting algebras denoted $SU(3)_R$ and $SU(3)_L$ [247,248,302,303,305,306,30,74,118-120,135,142,234-237,332,350,87,88,394]. Most of these papers in fact study a larger algebra of light-like charges, namely SU(6), but the sub-algebra $SU(3)_R \otimes SU(3)_L$ suffices for our purposes.
Since
$$[\psi_+, \tilde Q_5^a] = \tfrac12 \gamma_5 \lambda^a \psi_+ , \tag{9.48}$$
the quarks form an irreducible representation of this algebra. To be precise, the quarks (resp. anti-quarks) with helicity $+\tfrac12$ (resp. $-\tfrac12$) transform as a triplet of $SU(3)_R$ and a singlet of $SU(3)_L$, the quarks (resp. anti-quarks) with helicity $-\tfrac12$ (resp. $+\tfrac12$) transform as a triplet of $SU(3)_L$ and a singlet of $SU(3)_R$. Then, for example, the ordinary vector SU(3) decuplet of $J = \tfrac32$ baryons with $h = +\tfrac32$ is a pure right-handed (10,1) under $SU(3)_R \otimes SU(3)_L$. The octet ($J = \tfrac12$) and decuplet ($J = \tfrac32$) with $h = +\tfrac12$ transform together as a (6,3). For bosonic states we expect both chiralities to contribute with equal probability. For example, the octet of pseudo-scalar mesons arises from a superposition of irreducible representations of $SU(3)_R \otimes SU(3)_L$:
$$|J^{PC} = 0^{-+}\rangle = \frac{1}{\sqrt{2}}\, |(8,1) - (1,8)\rangle , \tag{9.49}$$
while the octet of vector mesons with zero helicity corresponds to
$$|J^{PC} = 1^{--}\rangle = \frac{1}{\sqrt{2}}\, |(8,1) + (1,8)\rangle , \tag{9.50}$$
and so on. These low-lying states have $L_z = 0$, where
$$L_z = -i \int d^3\tilde x\, \bar\psi\gamma^+ (x^1\partial_2 - x^2\partial_1)\, \psi \tag{9.51}$$
is the orbital angular momentum along z.
In the realistic case of unequal masses, the chiral charges are not conserved. Hence, they generate multiplets which are not mass-degenerate, a welcome feature. The fact that the invariance of the vacuum does not enforce the “invariance of the world” (viz., of energy), in sharp contrast with the order of things in space-time (Coleman’s theorem), is yet another remarkable property of the light-front frame.
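The algebra of Eq. (9.47) and the commuting right- and left-hand charges can be checked numerically in a minimal Python sketch. The sketch assumes that on a single-quark state the vector charge acts as $\lambda^a/2$ and the axial charge as $2h\,\lambda^a/2$, which is what the quark part of Eqs. (9.45) and (9.46) suggests; the matrix helpers and variable names are hypothetical, and the structure constants are computed from the Gell-Mann matrices rather than tabulated.

```python
import math

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def lin(alpha, A, beta, B):
    """Linear combination alpha*A + beta*B."""
    return [[alpha * a + beta * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def comm(A, B):
    """Commutator [A, B]."""
    return lin(1, matmul(A, B), -1, matmul(B, A))

def kron(A, B):
    """Kronecker product of A and B."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def max_abs(A):
    return max(abs(x) for row in A for x in row)

S3 = 1 / math.sqrt(3)
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]
T = [lin(0.5, lam, 0, lam) for lam in GELL_MANN]  # generators lambda^a / 2

# Structure constants from f_abc = -2i Tr([T_a, T_b] T_c).
F = [[[(-2j * trace(matmul(comm(T[a], T[b]), T[c]))).real
       for c in range(8)] for b in range(8)] for a in range(8)]

# One-quark representation: flavor (3) times helicity (2), with 2h = diag(+1, -1).
IDENT2 = [[1, 0], [0, 1]]
TWO_H = [[1, 0], [0, -1]]
Q = [kron(t, IDENT2) for t in T]   # vector charges (assumed one-quark action)
Q5 = [kron(t, TWO_H) for t in T]   # axial charges (assumed one-quark action)

def algebra_residual(X, Y, Z):
    """Largest entry of [X_a, Y_b] - i f_abc Z_c over all a, b."""
    n, worst = len(Z[0]), 0.0
    for a in range(8):
        for b in range(8):
            rhs = [[0j] * n for _ in range(n)]
            for c in range(8):
                rhs = lin(1, rhs, 1j * F[a][b][c], Z[c])
            worst = max(worst, max_abs(lin(1, comm(X[a], Y[b]), -1, rhs)))
    return worst

QR = [lin(0.5, q, 0.5, q5) for q, q5 in zip(Q, Q5)]   # right-hand charges
QL = [lin(0.5, q, -0.5, q5) for q, q5 in zip(Q, Q5)]  # left-hand charges
RL_RESIDUAL = max(max_abs(comm(r, l)) for r in QR for l in QL)

if __name__ == "__main__":
    print(algebra_residual(Q, Q, Q), algebra_residual(Q, Q5, Q5),
          algebra_residual(Q5, Q5, Q), RL_RESIDUAL)
```

All four residuals vanish to rounding accuracy, so in this representation the three relations of Eq. (9.47) hold and the right- and left-hand charges commute, as expected for two independent SU(3) factors acting on the two helicity sectors.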
In contrast with the space-time picture, free light-front current quarks are also constituent quarks because
• They can be massive without preventing chiral symmetry, which we know is (approximately) obeyed by hadrons,
• They form a basis for a classification of hadrons under the light-like chiral algebra.
9.3. Quantum chromodynamics
In the quark-quark-gluon vertex $g j^\mu A_\mu$, the transverse component of the vector current is
$$j_\perp(x) = \cdots + \frac{im}{4} \int dy^-\, \epsilon(x^- - y^-) \left[ \bar\psi_+(y)\, \gamma^+\gamma_\perp\, \psi_+(x) + \bar\psi_+(x)\, \gamma^+\gamma_\perp\, \psi_+(y) \right] , \tag{9.52}$$
where the dots represent chirally symmetric terms, and where color, as well as flavor, factors and indices have been omitted for clarity. The term explicitly written out breaks chiral symmetry for non-zero quark mass. Not surprisingly, it generates vertices in which the two quark lines have opposite helicity.
The canonical anti-commutator for the bare fermion fields still holds in the interactive theory (for each flavor). The momentum expansion of $\psi_+(x)$ remains the same except that the $x^+$ dependence is now carried by $b$ and $d$, and
$$\{b(\tilde p, h, x^+), b^\dagger(\tilde q, h', y^+)\}_{x^+ = y^+} = 2p^+\, \delta^3(\tilde p - \tilde q)\, \delta_{hh'} = \{d(\tilde p, h, x^+), d^\dagger(\tilde q, h', y^+)\}_{x^+ = y^+} . \tag{9.53}$$
The momentum expansions of the light-like charges remain the same (keeping in mind that the creation and annihilation operators are now unknown functions of “time”). Hence, the charges still annihilate the Fock vacuum, and are suitable for classification purposes.
We do not require annihilation of the physical vacuum (QCD ground state). The successes of CQMs suggest that to understand the properties of the hadronic spectrum, it may not be necessary to take the physical vacuum into account. This is also the point of view taken by the authors of a recent paper on the renormalization of QCD [182]. Their approach consists in imposing an “infrared” cutoff in longitudinal momentum, and in compensating for this suppression by means of Hamiltonian counter terms. Now, only terms that annihilate the Fock vacuum are allowed in their Hamiltonian $P^-$. Since all states in the truncated Hilbert space have strictly positive longitudinal momentum except for the Fock vacuum (which has $p^+ = 0$), the authors hope to be able to adjust the renormalizations in order to fit the observed spectrum, without having to solve first for the physical vacuum.
Making the standard choice of gauge, $A_- = 0$, one finds that the properties of vector and axial-vector currents are also unaffected by the inclusion of QCD interactions, except for the replacement of the derivative by the covariant derivative. The divergence of the renormalized, space-time, non-singlet axial current is anomaly-free [111]. As $j_5^{\mu a}$ and $\tilde j_5^{\mu a}$ become equal in the chiral limit, the divergence of the light-front current is also anomaly-free (and goes to zero in the chiral limit). The corresponding charges, however, do not become equal in the chiral limit. This can only be due to contributions at $x^-$-infinity coming from the Goldstone boson fields, which presumably cancel the pion pole of the space-time axial charges. Equivalently, if one chooses
periodic boundary conditions, one can say that this effect comes from the longitudinal zero modes of the fundamental fields.
From soft pion physics we know that the chiral limit of $SU(2)\otimes SU(2)$ is well-described by PCAC. Now, using PCAC one can show that in the chiral limit $Q_5^a$ ($a = 1, 2, 3$) is conserved, but $\tilde Q_5^a$ is not [87,234]. In other words, the renormalized light-front charges are sensitive to spontaneous symmetry breaking, although they do annihilate the vacuum. It is likely that this behavior generalizes to $SU(3)\otimes SU(3)$, viz., to the other five light-like axial charges. Its origin, again, must lie in zero modes.
In view of this “time”-dependence, one might wonder whether the light-front axial charges are observables. From PCAC, we know that it is indeed the case: their matrix elements between hadron states are directly related to off-shell pion emission [142,87]. For a hadron A decaying into a hadron B and a pion, one finds
$$\langle B|\tilde Q_5^a(0)|A\rangle = -\frac{2i(2\pi)^3 p_A^+}{m_A^2 - m_B^2}\, \langle B, \pi^a|A\rangle\, \delta^3(\tilde p_A - \tilde p_B) . \tag{9.54}$$
Note that in this reaction, the mass of hadron A must be larger than the mass of B due to the pion momentum.
9.4. Physical multiplets
Naturally, we shall assume that real hadrons fall into representations of an $SU(3)\otimes SU(3)$ algebra. We have identified the generators of this algebra with the light-like chiral charges. But this was done in the artificial case of the free quark model. It remains to check whether this identification works in the real world.
Of course, we already know that the predictions based on isospin ($a = 1, 2, 3$) and hyper-charge ($a = 8$) are true. Also, the nucleon-octet ratio D/F is correctly predicted to be 3/2, and several relations between magnetic moments match well with experimental data.
Unfortunately, several other predictions are in disagreement with observations [104]. For example, $G_A/G_V$ for the nucleon is expected to be equal to 5/3, while the experimental value is about 1.25. Dominant decay channels such as $N^* \to N\pi$, or $b_1 \to \omega\pi$, are forbidden by the light-like current algebra. The anomalous magnetic moments of nucleons, and all form factors of the rho-meson would have to vanish. De Alwis and Stern [120] point out that the matrix element of $\tilde j^{\mu a}$ between two given hadrons would be equal to the matrix element of $\tilde j_5^{\mu a}$ between the same two hadrons, up to a ratio of Clebsch-Gordan coefficients. This is excluded though because vector and axial-vector form factors have very different analytic properties as functions of momentum transfer.
In addition there is, in general, disagreement between the values of $L_z$ assigned to any given hadron. This comes about because in the classification scheme, the value of $L_z$ is essentially an afterthought, when group-theoretical considerations based on flavor and helicity have been taken care of. On the other hand, at the level of the current quarks, this value is determined by covariance and external symmetries.
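The values 5/3 and 3/2 quoted above can be reproduced by a short counting exercise. The block below is a sketch that assumes the static SU(6) spin-flavor wave function of a spin-up proton, which the text does not spell out; the quark spin fractions $\Delta u$, $\Delta d$, $\Delta s$ are notation introduced here for illustration only.

```latex
\begin{align*}
  % Assumed input: quark spin fractions in the static SU(6) proton
  \Delta u &= \tfrac{4}{3}, \qquad \Delta d = -\tfrac{1}{3}, \qquad \Delta s = 0, \\
  % Isovector axial coupling of the nucleon
  \frac{G_A}{G_V} &= \Delta u - \Delta d = \tfrac{5}{3}, \\
  % Octet couplings in the same limit
  F &= \tfrac{2}{3}, \qquad D = 1
  \quad\Longrightarrow\quad
  F + D = \tfrac{5}{3}, \qquad \frac{D}{F} = \tfrac{3}{2}.
\end{align*}
```

Under this assumption the same wave function fixes both numbers at once, so the measured $G_A/G_V \approx 1.25$ cannot be accommodated without modifying the identification of the classifying generators.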
Consider, for example, the $L_z$ assignments in the case of the pion, and of the rho-meson with zero helicity. As we mentioned earlier, the classification assigns to these states a pure value of $L_z$, namely zero. However, at the fundamental level, one expects these mesons to contain a wave-function $\phi_1$ attached to $L_z = 0$ (anti-parallel $q\bar q$ helicities), and also a wave-function
$\phi_2$ attached to $L_z = \pm 1$ (parallel helicities). Actually, the distinction between the pion and the zero-helicity rho is only based on the different momentum dependence of $\phi_1$ and $\phi_2$ [305,306]. If the interactions were turned off, $\phi_2$ would vanish and the masses of the two mesons would be degenerate (and equal to $(m_u + m_d)$).

We conclude from this comparison with experimental data, that if indeed real hadrons are representations of some $SU(3)\otimes SU(3)$ algebra, then the generators $G^\alpha$ and $G^\alpha_5$ of this classifying algebra must be different from the current light-like charges $\tilde Q^\alpha$ and $\tilde Q^\alpha_5$ (except however for $\alpha = 1, 2, 3, 8$). Furthermore, in order to avoid the phenomenological discrepancies discussed above, one must forego kinematical invariance for these generators; that is, $G^\alpha(\tilde k)$ and $G^\alpha_5(\tilde k)$ must depend on the momentum $\tilde k$ of the hadrons in a particular irreducible multiplet.

Does that mean that our efforts to relate the physical properties of hadrons to the underlying field theory turn out to be fruitless? Fortunately no, as argued by De Alwis and Stern [120]. The fact that these two sets of generators (the $\tilde Q$'s and the $G$'s) act in the same Hilbert space, in addition to satisfying the same commutation relations, implies that they must actually be unitary equivalent (this equivalence was originally suggested by Dashen, and by Gell-Mann [173]). There exists a set of momentum-dependent unitary operators $U(\tilde k)$ such that

$$G^\alpha(\tilde k) = U(\tilde k)\,\tilde Q^\alpha\,U^\dagger(\tilde k), \qquad G^\alpha_5(\tilde k) = U(\tilde k)\,\tilde Q^\alpha_5\,U^\dagger(\tilde k). \tag{9.55}$$

Current quarks, and the real-world hadrons built out of them, fall into representations of this algebra. Equivalently (e.g., when calculating electro-weak matrix elements), one may consider the original current algebra, and define its representations as “constituent” quarks and “constituent” hadrons. These quarks (and antiquarks) within a hadron of momentum $\tilde k$ are represented by a “constituent fermion field”,

$$\chi^{\tilde k}_+(x)\big|_{x^+=0} \equiv U(\tilde k)\,\psi_+(x)\big|_{x^+=0}\,U^\dagger(\tilde k),$$

on the basis of which the physical generators can be written in canonical form:

$$\tilde G^\alpha \equiv \int d^3\tilde x\;\bar\chi\,\gamma^+\frac{\lambda^\alpha}{2}\,\chi, \tag{9.56}$$

$$\tilde G^\alpha_5 \equiv \int d^3\tilde x\;\bar\chi\,\gamma^+\gamma_5\frac{\lambda^\alpha}{2}\,\chi. \tag{9.57}$$

It follows that the constituent annihilation/creation operators are derived from the current operators via

$$a^{\tilde k}(\tilde p, h) \equiv U(\tilde k)\,b(\tilde p, h)\,U^\dagger(\tilde k), \qquad c^{\tilde k\dagger}(\tilde p, h) \equiv U(\tilde k)\,d^\dagger(\tilde p, h)\,U^\dagger(\tilde k). \tag{9.58}$$
Due to isospin invariance, this unitary transformation cannot mix flavors; it only mixes helicities. It can therefore be represented by three unitary 2×2 matrices $T^f(\tilde k, \tilde p)$ such that

$$a^{\tilde k}_f(\tilde p, h) = \sum_{h'=\pm 1/2} T^f_{hh'}(\tilde k, \tilde p)\,b_f(\tilde p, h'), \qquad c^{\tilde k}_f(\tilde p, h) = \sum_{h'=\pm 1/2} T^{f*}_{hh'}(\tilde k, \tilde p)\,d_f(\tilde p, h'), \tag{9.59}$$

one for each flavor $f = u, d, s$. Since we need the transformation to be unaffected when $\tilde k$ and $\tilde p$ are boosted along $z$ or rotated around $z$ together, the matrix $T$ must actually be a function of only kinematical invariants. These are $\xi \equiv p^+/k^+$ and $\kappa_\perp \equiv p_\perp - \xi k_\perp$, where

$$\sum_{\rm constituents} \xi = 1, \qquad \sum_{\rm constituents} \kappa_\perp = 0. \tag{9.60}$$
Invariance under time reversal ($x^+ \to -x^+$) and parity ($x^1 \to -x^1$) further constrains its functional form, so that finally [305,306]

$$T^f(\tilde k, \tilde p) = \exp\left[-i\,\frac{\kappa_\perp}{|\kappa_\perp|}\cdot\sigma_\perp\,\beta_f(\xi, \kappa_\perp^2)\right]. \tag{9.61}$$

Thus, the relationship between current and constituent quarks is embodied in the three functions $\beta_f(\xi, \kappa_\perp^2)$, which we must try to extract from comparison with experiment. (In first approximation it is legitimate to take $\beta_u$ and $\beta_d$ equal since SU(2) is such a good symmetry.)

Based on some assumptions abstracted from the free-quark model, Leutwyler [150,304–306] has derived a set of sum rules obeyed by mesonic wavefunctions. Implementing then the transformation described above, Leutwyler finds various relations involving form factors and scaling functions of mesons, and computes the current quark masses. For example, he obtains

$$F_\pi < F_\rho, \qquad F_\rho = 3F_\omega, \qquad F_\rho < (3/\sqrt{2})\,|F_\phi|, \tag{9.62}$$

and the $\omega/\phi$ mixing angle is estimated to be about 0.07 rad. Ref. [304] also shows that the average transverse momentum of a quark inside a meson is substantial ($|p_\perp|_{\rm rms} > 400$ MeV), thus justifying a posteriori the basic assumptions of the relativistic CQM (e.g., Fock space truncation and relativistic energies). This large value also provides an explanation for the above-mentioned failures of the $SU(3)\otimes SU(3)$ classification scheme [104]. On the negative side, it appears that the functional dependence of the $\beta_f$'s cannot be easily determined with satisfactory precision.

10. The prospects and challenges

Future work on light-cone physics can be discussed in terms of developments along two distinct lines. One direction focuses on solving phenomenological problems while the other focuses on the use of light-cone methods to understand various properties of quantum field theory. Ultimately both point towards understanding the physical world.

An essential feature of relativistic quantum field theories such as QCD is that particle number is not conserved, i.e. if we examine the wavefunction of a hadron at fixed time $t$ or light-cone time $x^+$, any number of particles can be in flight. The expansion of a hadronic eigenstate of the full Hamiltonian has to be represented as a sum of amplitudes representing the fluctuations over particle number, momentum, coordinate configurations, color partitions, and helicities. The advantage of the light-cone Hamiltonian formalism is that one can conceivably predict the individual amplitudes for each of these configurations. As we have discussed in this review, the basic procedure is to diagonalize the full light-cone Hamiltonian in the free light-cone Hamiltonian basis. The eigenvalues are the invariant masses squared of the discrete and continuum eigenstates of the spectrum. The projections of an eigenstate on the free Fock basis are the light-cone wavefunctions and provide a rigorous relativistic many-body representation in terms of its degrees of freedom. Given the light-cone wavefunction one can compute the structure functions and distribution amplitudes. More generally, the light-cone wavefunctions provide the interpolation between hadron scattering amplitudes and the underlying parton subprocesses.

The unique property of light-cone quantization that makes the calculations of light-cone wavefunctions particularly useful is that they are independent of the reference frame. Thus, when
one does a non-perturbative bound-state calculation of a light-cone wavefunction, that same wavefunction can be used in many different problems.

Light-cone methods have been quite successful in understanding recent experimental results, as we discussed in Sections 5 and 6. We have seen that light-cone methods are very useful for understanding a number of properties of nucleons as well as many exclusive processes. We also saw that these methods can be applied in conjunction with perturbative QCD calculations. Future phenomenological applications will continue to address specific experimental results that have a distinct non-perturbative character and which are therefore difficult to address by other methods.

The simple structure of the light-cone Hamiltonian can be used as a basis to infer information on the non-perturbative and perturbative structure of QCD. Examples are the factorization theorems separating hard and soft physics in large momentum transfer exclusive and inclusive reactions [299]. Mueller et al. [67,98] have pioneered the investigation of structure functions at $x \to 0$ in the light-cone Hamiltonian formalism. Mueller's approach is to consider the light-cone wavefunctions of heavy quarkonium in the large $N_c$ limit. The resulting structure functions display energy dependence related to the Pomeron.

One can also consider the hard structure of the light-cone wavefunction. The wavefunctions of a hadron contain fluctuations which are arbitrarily far off the energy shell. In the case of light-cone quantization, the hadron wavefunction contains partonic states of arbitrarily high invariant mass. If the light-cone wavefunction is known in the domain of low invariant mass, then one can use the projection operator formalism to construct the wavefunction for large invariant mass by integration of the hard interactions.
Two types of hard fluctuations emerge: “extrinsic” components associated with gluon splitting $g \to q\bar q$ and the $q \to qg$ bremsstrahlung process, and “intrinsic” components associated with multi-parton interactions within the hadrons, $gg \to Q\bar Q$, etc. One can use the probability of the intrinsic contribution to compute the $x \to 1$ power-law behavior of structure functions, the high relative transverse momentum fall-off of the light-cone wavefunctions, and the probability for high mass $Q\bar Q$ pairs in the sea quark distribution of the hadrons [67]. The full analysis of the hard components of hadron wavefunctions can be carried out systematically using an effective Hamiltonian operator approach.

If we contrast the light-cone approach with lattice calculations we see the potential power of the light-cone method. In the lattice approach one calculates a set of numbers, for example a set of operator product coefficients [322], and then one uses them to calculate a physical observable where the expansion is valid. This should be contrasted with the calculation of a light-cone wavefunction, which gives predictions for all physical observables independent of the reference frame. There is a further advantage in that the shape of the light-cone wavefunction can provide a deeper understanding of the physics that underlies a particular experiment.

The focus is then on how to find reasonable approximations to light-cone wavefunctions that make non-perturbative calculations tractable. For many problems it is not necessary to know everything about the wavefunction to make physically interesting predictions. Thus, one attempts to isolate and calculate the important aspects of the light-cone wavefunction. We saw in the discussion of the properties of nuclei in this review that spectacular results can be obtained this way with a minimal input. Simply incorporating the angular momentum properties leads to very successful results almost independent of the rest of the structure of the light-cone wavefunction.

Thus far there has been remarkable success in applying the light-cone method to theories in one space and one time dimension. Virtually any 1+1 quantum field theory can be solved using
light-cone methods. For calculations in 3+1 dimensions the essential problem is that the number of degrees of freedom needed to specify each Fock state, even in a discrete basis, quickly grows since each particle's color, helicity, transverse momenta and light-cone longitudinal momenta have to be specified. Conceivably, advanced computational algorithms for matrix diagonalization, such as the Lanczos method, could allow the diagonalization of sufficiently large matrix representations to give physically meaningful results. A test of this procedure in QED is now being carried out by J. Hiller et al. [222] for the diagonalization of the physical electron in QED. The goal is to compute the electron's anomalous moment at large $\alpha_{\rm QED}$ non-perturbatively.

Much of the current work in this area attempts to find approximate solutions to problems in 3+1 dimensions by starting from a (1+1)-dimensionally-reduced version of that theory. In some calculations this reduction is very explicit while in others it is hidden.

An interesting approach has been proposed by Klebanov and coworkers [38,121,115]. One decomposes the Hamiltonian into two classes of terms: those whose matrix elements are at least linear in the transverse momentum (non-collinear) and those that are independent of the transverse momentum (collinear). In the collinear models one discards the non-collinear interactions and calculates distribution functions which do not explicitly depend on transverse dimensions. These can then be directly compared with data. In this approximation QCD (3+1) reduces to a 1+1 theory in which all the partons move along $k_{\perp i} = 0$. However, the transverse polarization of the dynamical gluons is retained. In effect the physical gluons are replaced by two scalar fields representing left- and right-handed polarized quanta.

Collinear QCD has been solved in detail by Antonuccio et al. [9–11,13]. The results are hadronic eigenstates such as mesons with a full complement of $q\bar q$ and $g$ light-cone Fock states. Antonuccio and Dalley also obtain a glueball spectrum which closely resembles the gluonium states predicted by lattice gauge theory in 3+1 QCD. They have also computed the wavefunction and structure functions of the mesons, including the quark and gluon helicity structure functions. One interesting result shows that the gluon helicity is strongly correlated with the helicity of the parent hadron, a result also expected in 3+1 QCD [70]. While collinear QCD is a drastic approximation to physical QCD, it provides a solvable basis as a first step toward the actual theory.

More recently, Antonuccio et al. [9–11,13] have noted that Fock states differing by 1 or 2 gluons are coupled in the form of ladder relations which constrain the light-cone wavefunctions at the edge of phase space. These relations in turn allow one to construct the leading behavior of the polarized and unpolarized structure functions at $x \to 0$; see in particular Ref. [12].

The transverse lattice method includes the transverse behavior approximately through a lattice that only operates in the transverse directions. In this method, which was proposed by Bardeen et al. [17,18], the transverse degrees of freedom of the gauge theory are represented by lattice variables and the longitudinal degrees of freedom are treated with light-cone variables. Considerable progress has been made in recent years on the integrated method by Burkardt [78,79], Griffin [190] and van de Sande et al. [116,437–439]; see also Gaete et al. [169]. This method is particularly promising for analyzing confinement in QCD.

The importance of renormalization is seen in the Tamm–Dancoff solution of the Yukawa model. We presented some simple examples of non-perturbative renormalization in the context of integral equations which seem to have all the ingredients one would want. However, the method has not been successfully transported to a (3+1)-dimensional field theory. We also discussed the Wilson approach, which focuses on this issue as a guide to developing their light-cone method. They use
a unique unitary transformation to band-diagonalize the theory on the way to renormalization. The method, however, is perturbative at its core, which calls into question its applicability as a true non-perturbative renormalization. They essentially start from the confining potential one gets from the longitudinal confinement that is fundamental to lower dimensional theories and then build the three-dimensional structure on that. The method has been successfully applied to solving for the low-lying levels of positronium and their light-cone wavefunctions. Jones and Perry [255,256] have also shown how the Lamb shift and its associated non-perturbative Bethe logarithm arise in the light-cone Hamiltonian formulation of QED.

There are now many examples, some of which were reviewed here, that show that DLCQ as a numerical method provides excellent solutions to almost all two-dimensional theories with a minimal effort. For models in 3+1 dimensions, the method is also applicable, although much more complicated. To date only QED has been solved with a high degree of precision and some of those results are presented in this review [279,264,429–431]. Of course, there one has high-order perturbative results to check against. This has proven to be an important laboratory for developing light-cone methods. Among the most interesting results of these calculations is the fact that rotational symmetry of the result appears in spite of the fact that the approximation must break that symmetry.

One can use light-cone quantization to study the structure of quantum field theory. The theories considered are often not physical, but are selected to help in the understanding of a particular non-perturbative phenomenon. The relatively simple vacuum properties of light-front field theories underlie many of these “analytical” approaches. The relative simplicity of the light-cone vacuum provides a firm starting point to attack many non-perturbative issues. As we saw in this review, in two dimensions not only are the problems tractable from the outset, but in many cases, like the Schwinger model, the solution gives a unique insight and understanding. In the Schwinger model we saw that the Schwinger particle indeed has the simple parton structure that one hopes to see in QCD.

It has been known for some time that light-cone field theory is uniquely suited to address problems in string theory. In addition, new developments in formal field theory associated with string theory, matrix models and M-theory have recently appeared which also seem particularly well suited to the light-cone approach [416]. Some issues in formal field theory which have proven to be intractable analytically, such as the density of the states at high energy, have been successfully addressed with numerical light-cone methods.

In the future one hopes to address a number of outstanding issues, and one of the most interesting is spontaneous symmetry breaking. We have already seen in this review that the light cone provides a new paradigm for spontaneous symmetry breaking in $\phi^4$ in 2 dimensions. Since the vacuum is simple in the light-cone approach, the physics of spontaneous symmetry breaking must reside in the zero-mode operators. It has been known for some time that these operators satisfy a constraint equation. We reviewed here the now well-known fact that the solution of this constraint equation can spontaneously break a symmetry. In fact, in the simple $\phi^4$-model the numerical results for the critical coupling constant and the critical exponent are quite good.

The light cone has a number of unique properties with respect to chiral symmetry. It has been known for a long time, for example, that the free theory of a fermion with a mass still has a chiral symmetry in a light-cone theory. In Section 9 we reviewed chiral symmetry on the light cone. There
have recently been a few applications of light-cone methods to solve supersymmetry, but as yet no one has addressed the issue of dynamical supersymmetry breaking.

Finally, let us highlight the intrinsic advantages of light-cone field theory:
• The light-cone wavefunctions are independent of the momentum of the bound state; only relative momentum coordinates appear.
• The vacuum state is simple and in many cases trivial.
• Fermions and fermion derivatives are treated exactly; there is no fermion doubling problem.
• The minimum number of physical degrees of freedom are used because of the light-cone gauge. No Gupta–Bleuler or Faddeev–Popov ghosts occur and unitarity is explicit.
• The output is the full color-singlet spectrum of the theory, both bound states and continuum, together with their respective wavefunctions.
Appendix A. General conventions

For completeness notational conventions are collected in line with the textbooks [39,242].

Lorentz vectors. We write contravariant four-vectors of position $x^\mu$ in the instant form as

$$x^\mu = (x^0, x^1, x^2, x^3) = (t, x, y, z) = (x^0, \vec x_\perp, x^3) = (x^0, \vec x). \tag{A.1}$$

The covariant four-vector $x_\mu$ is given by

$$x_\mu = (x_0, x_1, x_2, x_3) = (t, -x, -y, -z) = g_{\mu\nu}x^\nu, \tag{A.2}$$

and obtained from the contravariant vector by the metric tensor

$$g_{\mu\nu} = \begin{pmatrix} +1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{A.3}$$
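Implicit summation over repeated Lorentz ($\mu, \nu, \kappa$) or space ($i, j, k$) indices is understood. Scalar products are

$$x\cdot p = x^\mu p_\mu = x^0p_0 + x^1p_1 + x^2p_2 + x^3p_3 = tE - \vec x\cdot\vec p, \tag{A.4}$$

with four-momentum $p^\mu = (p^0, p^1, p^2, p^3) = (E, \vec p)$. The metric tensor $g^{\mu\nu}$ raises the indices.

Dirac matrices. Up to unitary transformations, the 4×4 Dirac matrices $\gamma^\mu$ are defined by

$$\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2g^{\mu\nu}. \tag{A.5}$$

$\gamma^0$ is hermitean and $\gamma^k$ anti-hermitean. Useful combinations are $\beta = \gamma^0$ and $\alpha^k = \gamma^0\gamma^k$, as well as

$$\sigma^{\mu\nu} = \tfrac{1}{2}i\,(\gamma^\mu\gamma^\nu - \gamma^\nu\gamma^\mu), \qquad \gamma_5 = \gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3. \tag{A.6}$$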
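They usually are expressed in terms of the 2×2 Pauli matrices

$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{A.7}$$

In Dirac representation [39,242] the matrices are

$$\gamma^0 = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}, \quad \gamma^k = \begin{pmatrix} 0 & \sigma^k \\ -\sigma^k & 0 \end{pmatrix}, \tag{A.8}$$

$$\gamma_5 = \begin{pmatrix} 0 & +I \\ I & 0 \end{pmatrix}, \quad \alpha^k = \begin{pmatrix} 0 & \sigma^k \\ +\sigma^k & 0 \end{pmatrix}, \quad \sigma^{ij} = \begin{pmatrix} \sigma^k & 0 \\ 0 & \sigma^k \end{pmatrix}. \tag{A.9}$$

In chiral representation [242] $\gamma_0$ and $\gamma_5$ are interchanged:

$$\gamma^0 = \begin{pmatrix} 0 & +I \\ I & 0 \end{pmatrix}, \quad \gamma^k = \begin{pmatrix} 0 & -\sigma^k \\ \sigma^k & 0 \end{pmatrix}, \tag{A.10}$$

$$\gamma_5 = \begin{pmatrix} I & 0 \\ 0 & -I \end{pmatrix}, \quad \alpha^k = \begin{pmatrix} \sigma^k & 0 \\ 0 & -\sigma^k \end{pmatrix}, \quad \sigma^{ij} = \begin{pmatrix} \sigma^k & 0 \\ 0 & \sigma^k \end{pmatrix}. \tag{A.11}$$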
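$(i, j, k) = 1, 2, 3$ are used cyclically.

Projection operators. Combinations of Dirac matrices like the hermitean matrices

$$\Lambda_+ = \tfrac{1}{2}(1 + \alpha^3) = \tfrac{1}{2}\gamma^0(\gamma^0 + \gamma^3) \quad\text{and}\quad \Lambda_- = \tfrac{1}{2}(1 - \alpha^3) = \tfrac{1}{2}\gamma^0(\gamma^0 - \gamma^3) \tag{A.12}$$

often have projector properties, particularly

$$\Lambda_+ + \Lambda_- = 1, \quad \Lambda_+\Lambda_- = 0, \quad \Lambda_+^2 = \Lambda_+, \quad \Lambda_-^2 = \Lambda_-. \tag{A.13}$$

They are diagonal in the chiral and maximally off-diagonal in the Dirac representation:

$$(\Lambda_+)_{\rm chiral} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad (\Lambda_+)_{\rm Dirac} = \frac{1}{2}\begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & -1 \\ 1 & 0 & 1 & 0 \\ 0 & -1 & 0 & 1 \end{pmatrix}. \tag{A.14}$$

Dirac spinors. The spinors $u_\alpha(p, \lambda)$ and $v_\alpha(p, \lambda)$ are solutions of the Dirac equation

$$(\not{p} - m)\,u(p, \lambda) = 0, \qquad (\not{p} + m)\,v(p, \lambda) = 0. \tag{A.15}$$

They are orthonormal and complete:

$$\bar u(p, \lambda)\,u(p, \lambda') = -\bar v(p, \lambda')\,v(p, \lambda) = 2m\,\delta_{\lambda\lambda'}, \tag{A.16}$$

$$\sum_\lambda u(p, \lambda)\,\bar u(p, \lambda) = \not{p} + m, \qquad \sum_\lambda v(p, \lambda)\,\bar v(p, \lambda) = \not{p} - m. \tag{A.17}$$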
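Note the different normalization as compared to the textbooks [39,242]. The “Feynman slash” is $\not{p} = p_\mu\gamma^\mu$. The Gordon decomposition of the currents is useful:

$$\bar u(p, \lambda)\gamma^\mu u(q, \lambda') = \bar v(q, \lambda')\gamma^\mu v(p, \lambda) = \frac{1}{2m}\,\bar u(p, \lambda)\big((p + q)^\mu + i\sigma^{\mu\nu}(p - q)_\nu\big)u(q, \lambda'). \tag{A.18}$$

With $\lambda = \pm 1$, the spin projection is $s = \lambda/2$. The relations

$$\gamma^\mu \not{a}\,\gamma_\mu = -2\not{a}, \tag{A.19}$$

$$\gamma^\mu \not{a}\not{b}\,\gamma_\mu = 4\,a\cdot b, \tag{A.20}$$

$$\gamma^\mu \not{a}\not{b}\not{c}\,\gamma_\mu = -2\not{c}\not{b}\not{a} \tag{A.21}$$

are useful.

Polarization vectors. The two polarization four-vectors $\epsilon_\mu(p, \lambda)$ are labeled by the spin projections $\lambda = \pm 1$. As solutions of the free Maxwell equations they are orthonormal and complete:

$$\epsilon^\mu(p, \lambda)\,\epsilon^\star_\mu(p, \lambda') = -\delta_{\lambda\lambda'}, \qquad p^\mu\epsilon_\mu(p, \lambda) = 0. \tag{A.22}$$

The star ($\star$) refers to complex conjugation. The polarization sum is

$$d_{\mu\nu}(p) = \sum_\lambda \epsilon_\mu(p, \lambda)\,\epsilon^\star_\nu(p, \lambda) = -g_{\mu\nu} + \frac{\eta_\mu p_\nu + \eta_\nu p_\mu}{p^\kappa\eta_\kappa}, \tag{A.23}$$

with the null vector $\eta^\mu\eta_\mu = 0$ given below.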
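The algebraic relations of this appendix are easy to check numerically. The following is a minimal sketch, assuming the Dirac representation (A.8) and using only the Python standard library; the helper names (`matmul`, `combine`, `block`, and so on) are illustrative and not part of the conventions above. It tests the anticommutator (A.5) and the projector identities (A.13).

```python
# Illustrative check of (A.5) and (A.13) in the Dirac representation (A.8).
# All helper names are hypothetical; only the matrices come from the text.

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(a, b, cb=1):
    """Entrywise a + cb * b."""
    return [[x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

ID4 = block(I2, Z2, Z2, I2)
GAMMA = [block(I2, Z2, Z2, scale(-1, I2))]                 # gamma^0, Eq. (A.8)
GAMMA += [block(Z2, s, scale(-1, s), Z2) for s in PAULI]   # gamma^k, Eq. (A.8)
METRIC = [1, -1, -1, -1]                                   # diagonal of (A.3)

def clifford_holds():
    """True if gamma^mu gamma^nu + gamma^nu gamma^mu = 2 g^{mu nu}, Eq. (A.5)."""
    for mu in range(4):
        for nu in range(4):
            anti = combine(matmul(GAMMA[mu], GAMMA[nu]),
                           matmul(GAMMA[nu], GAMMA[mu]))
            g = METRIC[mu] if mu == nu else 0
            if not close(anti, scale(2 * g, ID4)):
                return False
    return True

ALPHA3 = matmul(GAMMA[0], GAMMA[3])                  # alpha^3 = gamma^0 gamma^3
LAMBDA_PLUS = scale(0.5, combine(ID4, ALPHA3))       # Eq. (A.12)
LAMBDA_MINUS = scale(0.5, combine(ID4, ALPHA3, -1))

def projector_identities_hold():
    """True if the four relations of Eq. (A.13) are satisfied."""
    zero = scale(0, ID4)
    return (close(combine(LAMBDA_PLUS, LAMBDA_MINUS), ID4)
            and close(matmul(LAMBDA_PLUS, LAMBDA_MINUS), zero)
            and close(matmul(LAMBDA_PLUS, LAMBDA_PLUS), LAMBDA_PLUS)
            and close(matmul(LAMBDA_MINUS, LAMBDA_MINUS), LAMBDA_MINUS))

print(clifford_holds(), projector_identities_hold())
```

Because both sets of relations are invariant under unitary changes of representation, the same check should pass for any other representation substituted into `GAMMA`.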
Appendix B. The Lepage–Brodsky convention (LB)

This section summarizes the conventions which have been used by Lepage, Brodsky and others [66,300,299].

Lorentz vectors. The contravariant four-vectors of position $x^\mu$ are written as

$$x^\mu = (x^+, x^-, x^1, x^2) = (x^+, x^-, \vec x_\perp). \tag{B.1}$$

Its time-like and space-like components are related to the instant form by [66,300,299]

$$x^+ = x^0 + x^3 \quad\text{and}\quad x^- = x^0 - x^3, \tag{B.2}$$

respectively, and referred to as the “light-cone time” and “light-cone position”. The covariant vectors are obtained by $x_\mu = g_{\mu\nu}x^\nu$, with the metric tensor(s)

$$g^{\mu\nu} = \begin{pmatrix} 0 & 2 & 0 & 0 \\ 2 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \quad\text{and}\quad g_{\mu\nu} = \begin{pmatrix} 0 & \tfrac{1}{2} & 0 & 0 \\ \tfrac{1}{2} & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{B.3}$$

Scalar products are

$$x\cdot p = x^\mu p_\mu = x^+p_+ + x^-p_- + x^1p_1 + x^2p_2 = \tfrac{1}{2}(x^+p^- + x^-p^+) - \vec x_\perp\cdot\vec p_\perp. \tag{B.4}$$

All other four-vectors including $\gamma^\mu$ are treated correspondingly.
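Dirac matrices. The Dirac representation of the $\gamma$-matrices is used, particularly

$$\gamma^+\gamma^+ = \gamma^-\gamma^- = 0. \tag{B.5}$$

Alternating products are, for example,

$$\gamma^+\gamma^-\gamma^+ = 4\gamma^+ \quad\text{and}\quad \gamma^-\gamma^+\gamma^- = 4\gamma^-. \tag{B.6}$$

Projection operators. The projection matrices become

$$\Lambda_+ = \tfrac{1}{2}\gamma^0\gamma^+ = \tfrac{1}{4}\gamma^-\gamma^+ \quad\text{and}\quad \Lambda_- = \tfrac{1}{2}\gamma^0\gamma^- = \tfrac{1}{4}\gamma^+\gamma^-. \tag{B.7}$$

Dirac spinors. Lepage and Brodsky [66,300,299] use a particularly simple spinor representation

$$u(p, \lambda) = \frac{1}{\sqrt{p^+}}\,(p^+ + \beta m + \vec\alpha_\perp\cdot\vec p_\perp) \times \begin{cases} \chi(\uparrow) & \text{for } \lambda = +1, \\ \chi(\downarrow) & \text{for } \lambda = -1, \end{cases} \tag{B.8}$$

$$v(p, \lambda) = \frac{1}{\sqrt{p^+}}\,(p^+ - \beta m + \vec\alpha_\perp\cdot\vec p_\perp) \times \begin{cases} \chi(\downarrow) & \text{for } \lambda = +1, \\ \chi(\uparrow) & \text{for } \lambda = -1. \end{cases} \tag{B.9}$$

The two $\chi$-spinors are

$$\chi(\uparrow) = \frac{1}{\sqrt 2}\begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix} \quad\text{and}\quad \chi(\downarrow) = \frac{1}{\sqrt 2}\begin{pmatrix} 0 \\ 1 \\ 0 \\ -1 \end{pmatrix}. \tag{B.10}$$

Polarization vectors. The null vector is

$$\eta^\mu = (0, 2, \vec 0). \tag{B.11}$$

In Bjørken–Drell convention [39], one works with circular polarization, with spin projections $\lambda = \pm 1 = \uparrow\downarrow$. The transversal polarization vectors are $\vec\epsilon_\perp(\uparrow) = -\tfrac{1}{\sqrt 2}(1, i)$ and $\vec\epsilon_\perp(\downarrow) = \tfrac{1}{\sqrt 2}(1, -i)$, or collectively

$$\vec\epsilon_\perp(\lambda) = -\frac{1}{\sqrt 2}\,(\lambda\,\vec e_x + i\,\vec e_y), \tag{B.12}$$

with $\vec e_x$ and $\vec e_y$ as unit vectors in $p_x$- and $p_y$-direction, respectively. With $\epsilon^+(p, \lambda) = 0$, induced by the light-cone gauge, the polarization vector is

$$\epsilon^\mu(p, \lambda) = \left(0,\ \frac{2\,\vec\epsilon_\perp(\lambda)\cdot\vec p_\perp}{p^+},\ \vec\epsilon_\perp(\lambda)\right), \tag{B.13}$$

which satisfies $p_\mu\epsilon^\mu(p, \lambda) = 0$.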
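As a small illustration of the LB kinematics, the sketch below converts instant-form components to light-cone components via (B.2), evaluates the scalar product with the metric of (B.4), and builds the polarization vector (B.13) from (B.12). It is only a sketch: the function names and the sample momenta are assumptions made for this example, and four-vectors are stored as tuples in the order $(v^+, v^-, v^1, v^2)$.

```python
import math

def to_lb(v):
    """Instant-form (v0, v1, v2, v3) -> LB components (v+, v-, v1, v2), Eq. (B.2)."""
    v0, v1, v2, v3 = v
    return (v0 + v3, v0 - v3, v1, v2)

def dot_instant(x, p):
    """Scalar product (A.4) for instant-form components."""
    return x[0] * p[0] - x[1] * p[1] - x[2] * p[2] - x[3] * p[3]

def dot_lb(x, p):
    """Scalar product (B.4) for LB components (v+, v-, v1, v2)."""
    return 0.5 * (x[0] * p[1] + x[1] * p[0]) - x[2] * p[2] - x[3] * p[3]

def eps_perp(lam):
    """Transverse polarization (B.12): -(1/sqrt 2) (lam e_x + i e_y)."""
    return (-lam / math.sqrt(2), -1j / math.sqrt(2))

def eps_lb(p_lb, lam):
    """Polarization vector (B.13) in LB components; requires p+ != 0."""
    e1, e2 = eps_perp(lam)
    minus = 2 * (e1 * p_lb[2] + e2 * p_lb[3]) / p_lb[0]
    return (0, minus, e1, e2)

# Sample vectors (arbitrary illustrative numbers).
x = (1.0, 0.3, -0.2, 0.5)
p = (2.0, 0.1, 0.4, -0.7)

print(dot_instant(x, p), dot_lb(to_lb(x), to_lb(p)))   # the two agree
for lam in (+1, -1):
    e = eps_lb(to_lb(p), lam)
    print(lam, abs(dot_lb(to_lb(p), e)))               # transversality, ~0
```

Transversality holds here for any $p$ with $p^+ \neq 0$, on-shell or not, because the $\epsilon^-$ component in (B.13) is constructed to cancel the transverse term.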
Appendix C. The Kogut–Soper convention (KS)

Lorentz vectors. Kogut and Soper [274,406,41,275] have used

$$x^+ = \tfrac{1}{\sqrt 2}(x^0 + x^3) \quad\text{and}\quad x^- = \tfrac{1}{\sqrt 2}(x^0 - x^3), \tag{C.1}$$
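respectively, referred to as the “light-cone time” and “light-cone position”. The covariant vectors are obtained by $x_\mu = g_{\mu\nu}x^\nu$, with the metric tensor

$$g^{\mu\nu} = g_{\mu\nu} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{C.2}$$

Scalar products are

$$x\cdot p = x^\mu p_\mu = x^+p_+ + x^-p_- + x^1p_1 + x^2p_2 = x^+p^- + x^-p^+ - \vec x_\perp\cdot\vec p_\perp. \tag{C.3}$$

All other four-vectors including $\gamma^\mu$ are treated correspondingly.

Dirac matrices. The chiral representation of the $\gamma$-matrices is used, particularly

$$\gamma^+\gamma^+ = \gamma^-\gamma^- = 0. \tag{C.4}$$

Alternating products are, for example,

$$\gamma^+\gamma^-\gamma^+ = 2\gamma^+ \quad\text{and}\quad \gamma^-\gamma^+\gamma^- = 2\gamma^-. \tag{C.5}$$

Projection operators. The projection matrices become

$$\Lambda_+ = \tfrac{1}{\sqrt 2}\gamma^0\gamma^+ = \tfrac{1}{2}\gamma^-\gamma^+ \quad\text{and}\quad \Lambda_- = \tfrac{1}{\sqrt 2}\gamma^0\gamma^- = \tfrac{1}{2}\gamma^+\gamma^-. \tag{C.6}$$

In the chiral representation the projection matrices have a particularly simple structure, see Eq. (A.14).

Dirac spinors. Kogut and Soper [274] use as Dirac spinors

$$u(k, \uparrow) = \frac{1}{2^{1/4}\sqrt{k^+}}\begin{pmatrix} \sqrt 2\,k^+ \\ k_x + ik_y \\ m \\ 0 \end{pmatrix}, \qquad u(k, \downarrow) = \frac{1}{2^{1/4}\sqrt{k^+}}\begin{pmatrix} 0 \\ m \\ -k_x + ik_y \\ \sqrt 2\,k^+ \end{pmatrix},$$

$$v(k, \uparrow) = \frac{1}{2^{1/4}\sqrt{k^+}}\begin{pmatrix} 0 \\ -m \\ -k_x + ik_y \\ \sqrt 2\,k^+ \end{pmatrix}, \qquad v(k, \downarrow) = \frac{1}{2^{1/4}\sqrt{k^+}}\begin{pmatrix} \sqrt 2\,k^+ \\ k_x + ik_y \\ -m \\ 0 \end{pmatrix}. \tag{C.7}$$

Polarization vectors. The null vector is

$$\eta^\mu = (0, 1, \vec 0). \tag{C.8}$$

The polarization vectors of Kogut and Soper [274] correspond to linear polarization $\lambda = 1$ and $\lambda = 2$:

$$\epsilon^\mu(p, \lambda = 1) = (0,\ p_x/p^+,\ 1,\ 0), \qquad \epsilon^\mu(p, \lambda = 2) = (0,\ p_y/p^+,\ 0,\ 1). \tag{C.9}$$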
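The following are useful relations:

$$\gamma^\alpha\gamma^\beta d_{\alpha\beta}(p) = -2,$$

$$\gamma^\alpha\gamma^\nu\gamma^\beta d_{\alpha\beta}(p) = \frac{2}{p^+}\,(\gamma^+p^\nu + g^{+\nu}\not{p}),$$

$$\gamma^\alpha\gamma^\mu\gamma^\nu\gamma^\beta d_{\alpha\beta}(p) = -4g^{\mu\nu} + 2\,\frac{p_\alpha}{p^+}\left\{g^{\mu\alpha}\gamma^\nu\gamma^+ - g^{\alpha\nu}\gamma^\mu\gamma^+ + g^{\alpha+}\gamma^\mu\gamma^\nu - g^{+\nu}\gamma^\mu\gamma^\alpha + g^{+\mu}\gamma^\nu\gamma^\alpha\right\}. \tag{C.10}$$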
The remainder is the same as in Appendix A.
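The LB and KS conventions differ only in the normalization of the light-cone components, which is why the alternating products (B.6) and (C.5) carry factors 4 and 2 while both conventions yield the same projector $\Lambda_+ = \tfrac{1}{2}(1 + \alpha^3)$. The sketch below checks this side by side. It assumes the Dirac representation (A.8) for both cases, which is harmless since the relations tested are representation independent; the helper names and the function `check_convention` are illustrative only.

```python
import math

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(a, b, cb=1):
    """Entrywise a + cb * b."""
    return [[x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
S3 = [[1, 0], [0, -1]]
ID4 = block(I2, Z2, Z2, I2)
G0 = block(I2, Z2, Z2, scale(-1, I2))    # gamma^0 in the Dirac representation
G3 = block(Z2, S3, scale(-1, S3), Z2)    # gamma^3 in the Dirac representation

def check_convention(c, n, lam_factor):
    """Test gamma^+- = c (gamma^0 +- gamma^3) against
    (B.5)-(B.7) [c = 1, n = 4, lam_factor = 1/2] or
    (C.4)-(C.6) [c = 1/sqrt 2, n = 2, lam_factor = 1/sqrt 2]."""
    gp = scale(c, combine(G0, G3))
    gm = scale(c, combine(G0, G3, -1))
    zero = scale(0, ID4)
    nilpotent = close(matmul(gp, gp), zero) and close(matmul(gm, gm), zero)
    alternating = (close(matmul(matmul(gp, gm), gp), scale(n, gp))
                   and close(matmul(matmul(gm, gp), gm), scale(n, gm)))
    lam_plus = scale(lam_factor, matmul(G0, gp))
    expected = scale(0.5, combine(ID4, matmul(G0, G3)))   # (1 + alpha^3)/2
    return nilpotent and alternating and close(lam_plus, expected)

LB_OK = check_convention(1.0, 4.0, 0.5)
KS_OK = check_convention(1 / math.sqrt(2), 2.0, 1 / math.sqrt(2))
print(LB_OK, KS_OK)
```

Passing the wrong pair of factors, for instance `check_convention(1.0, 2.0, 0.5)`, makes the check fail, which is a quick way to detect mixed conventions in a calculation.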
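Appendix D. Comparing BD- with LB-spinors

The Dirac spinors $u_\alpha(p, \lambda)$ and $v_\alpha(p, \lambda)$ (with $\lambda = \pm 1$) are the four linearly independent solutions of the free Dirac equations $(\not{p} - m)\,u(p, \lambda) = 0$ and $(\not{p} + m)\,v(p, \lambda) = 0$. Instead of $u(p, \lambda)$ and $v(p, \lambda)$, it is sometimes convenient [39] to use spinors $w^r(p)$ defined by

$$w^1_\alpha(p) = u_\alpha(p, \uparrow), \quad w^2_\alpha(p) = u_\alpha(p, \downarrow), \quad w^3_\alpha(p) = v_\alpha(p, \uparrow), \quad w^4_\alpha(p) = v_\alpha(p, \downarrow). \tag{D.1}$$

With $p^0 = E = \sqrt{m^2 + \vec p^{\,2}}$, one has quite generally

$$u(p, \lambda) = \frac{1}{\sqrt N}\,(E + \vec\alpha\cdot\vec p + \beta m)\,\chi^r \quad\text{for } r = 1, 2, \tag{D.2}$$

$$v(p, \lambda) = \frac{1}{\sqrt N}\,(E + \vec\alpha\cdot\vec p - \beta m)\,\chi^r \quad\text{for } r = 3, 4. \tag{D.3}$$

Bjørken–Drell (BD) [39] choose $\chi^r_\alpha = \delta_{\alpha r}$. With $N = 2m(E + m)$, the four spinors are then explicitly

$$w^r_\alpha(p) = \frac{1}{\sqrt N}\begin{pmatrix} E + m & 0 & p_z & p_x - ip_y \\ 0 & E + m & p_x + ip_y & -p_z \\ p_z & p_x - ip_y & E + m & 0 \\ p_x + ip_y & -p_z & 0 & E + m \end{pmatrix}. \tag{D.4}$$

Alternatively (A), one can choose

$$\chi^1_\alpha = \chi_\alpha(\uparrow), \quad \chi^2_\alpha = \chi_\alpha(\downarrow), \quad \chi^3_\alpha = \chi_\alpha(\uparrow), \quad \chi^4_\alpha = \chi_\alpha(\downarrow), \tag{D.5}$$

with $\chi$ given in Eq. (B.10). With $N = 2m(E + p_z)$, the spinors become explicitly

$$w^r_\alpha(p) = \frac{1}{\sqrt{2N}}\begin{pmatrix} E + p_z + m & -p_x + ip_y & E + p_z - m & -p_x + ip_y \\ p_x + ip_y & E + p_z + m & p_x + ip_y & E + p_z - m \\ E + p_z - m & p_x - ip_y & E + p_z + m & p_x - ip_y \\ p_x + ip_y & -E - p_z + m & p_x + ip_y & -E - p_z - m \end{pmatrix}. \tag{D.6}$$

One verifies that both spinor conventions (BD) and (A) satisfy orthogonality and completeness

$$\sum_{\alpha=1}^{4} \bar w^r_\alpha\,w^{r'}_\alpha = \gamma^0_{rr'}, \qquad \sum_{r=1}^{4} \gamma^0_{rr}\,w^r_\alpha\,\bar w^r_\beta = \delta_{\alpha\beta}, \tag{D.7}$$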
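respectively, with $\bar w = w^\dagger\gamma^0$. But the two do not have the same form for a particle at rest, $\vec p = 0$, namely

$$w^r_\alpha(m)\big|_{\rm BD} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \quad\text{and}\quad w^r_\alpha(m)\big|_{\rm A} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}, \tag{D.8}$$

respectively, but they have the same spin projection:

$$\sigma^{12}u(m, \lambda) = \lambda\,u(m, \lambda) \quad\text{and}\quad \sigma^{12}v(m, \lambda) = \lambda\,v(m, \lambda). \tag{D.9}$$

Actually, Lepage and Brodsky [299] have not used Eq. (D.5), but rather

$$\chi^1_\alpha = \chi_\alpha(\uparrow), \quad \chi^2_\alpha = \chi_\alpha(\downarrow), \quad \chi^3_\alpha = \chi_\alpha(\downarrow), \quad \chi^4_\alpha = \chi_\alpha(\uparrow), \tag{D.10}$$

by which reason Eq. (D.9) becomes

$$\sigma^{12}u(0, \lambda) = \lambda\,u(0, \lambda) \quad\text{and}\quad \sigma^{12}v(0, \lambda) = -\lambda\,v(0, \lambda). \tag{D.11}$$

In the LC formulation the $\sigma/2$ operator is a helicity operator which has a different spin for fermions and anti-fermions.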
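The construction (D.2) and (D.3) can be tested numerically for either choice of $\chi^r$. The sketch below is illustrative only: it assumes the Dirac representation (A.8), an arbitrary sample momentum and mass, and hypothetical helper names. It builds one $u$ and one $v$ spinor for the BD choice $\chi^r_\alpha = \delta_{\alpha r}$ and for the choice (D.5), then evaluates the residual of the Dirac equation and the norm $\bar w w$.

```python
import math

I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
PAULI = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

def block(a, b, c, d):
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def neg(m):
    return [[-x for x in row] for row in m]

def combo(terms):
    """Linear combination sum_k c_k M_k of 4x4 matrices, terms = [(c_k, M_k), ...]."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(4)] for i in range(4)]

def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

ID = block(I2, Z2, Z2, I2)
G0 = block(I2, Z2, Z2, neg(I2))                 # gamma^0 = beta, Eq. (A.8)
GK = [block(Z2, s, neg(s), Z2) for s in PAULI]  # gamma^k, Eq. (A.8)
ALPHA = [block(Z2, s, s, Z2) for s in PAULI]    # alpha^k = gamma^0 gamma^k, Eq. (A.9)

def energy(p, m):
    return math.sqrt(m * m + sum(x * x for x in p))

def spinor(p, m, chi, norm, sign):
    """(E + alpha.p + sign*beta*m) chi / sqrt(norm); sign=+1: (D.2), sign=-1: (D.3)."""
    op = combo([(energy(p, m), ID), (sign * m, G0)] + list(zip(p, ALPHA)))
    return [x / math.sqrt(norm) for x in matvec(op, chi)]

def dirac_residual(w, p, m, sign):
    """Largest |component| of (pslash - sign*m) w; zero for a solution."""
    pslash = combo([(energy(p, m), G0)] + [(-pk, gk) for pk, gk in zip(p, GK)])
    return max(abs(x) for x in matvec(combo([(1, pslash), (-sign * m, ID)]), w))

def bar_product(w1, w2):
    """wbar1 w2 with wbar = w^dagger gamma^0."""
    return sum(a.conjugate() * b for a, b in zip(w1, matvec(G0, w2)))

p, m = (0.3, -0.4, 1.2), 1.0                     # sample momentum and mass
E = energy(p, m)
chi_up = [1 / math.sqrt(2), 0, 1 / math.sqrt(2), 0]   # chi(up) of Eq. (B.10)

u_bd = spinor(p, m, [1, 0, 0, 0], 2 * m * (E + m), +1)   # BD, r = 1
v_bd = spinor(p, m, [0, 0, 1, 0], 2 * m * (E + m), -1)   # BD, r = 3
u_a = spinor(p, m, chi_up, 2 * m * (E + p[2]), +1)       # choice (D.5), r = 1
v_a = spinor(p, m, chi_up, 2 * m * (E + p[2]), -1)       # choice (D.5), r = 3

for name, w, sign in (("u_bd", u_bd, +1), ("v_bd", v_bd, -1),
                      ("u_a", u_a, +1), ("v_a", v_a, -1)):
    print(name, dirac_residual(w, p, m, sign), bar_product(w, w))
```

The Dirac equation is satisfied for any $\chi$ because $E + \vec\alpha\cdot\vec p \pm \beta m = (\not{p} \pm m)\gamma^0$ and $(\not{p} \mp m)(\not{p} \pm m) = p^2 - m^2 = 0$ on shell; only the normalization depends on the choice of $\chi^r$ and $N$.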
Appendix E. The Dirac–Bergmann method

The dynamics of a classical, non-relativistic system with $N$ degrees of freedom can be derived from the Lagrangian. Obtained from an action principle, this Lagrangian is a function of the velocity phase space variables:

$$L = L(q_n, \dot q_n), \qquad n = 1, \ldots, N, \tag{E.1}$$

where the $q$'s and $\dot q$'s are the generalized coordinates and velocities respectively. For simplicity we consider only Lagrangians without explicit time dependence. The momenta conjugate to the generalized coordinates are defined by

$$p_n = \partial L/\partial\dot q_n. \tag{E.2}$$

Now it may turn out that not all the momenta may be expressed as independent functions of the velocities. If this is the case, the Legendre transformation that takes us from the Lagrangian to the Hamiltonian is not defined uniquely over the whole phase space $(q, p)$. There then exist a number of constraints connecting the $q$'s and $p$'s:

$$\phi_m(q, p) = 0, \qquad m = 1, \ldots, M. \tag{E.3}$$

These constraints restrict the motion to a subspace of the full $2N$-dimensional phase space defined by the $(p, q)$.
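A standard way to see whether the momenta (E.2) are independent functions of the velocities is to look at the matrix of second derivatives $\partial^2 L/\partial\dot q_i\,\partial\dot q_j$: if its determinant vanishes, the velocities cannot all be solved for and constraints of the type (E.3) appear. The sketch below illustrates this with finite differences. The two toy Lagrangians and all function names are assumptions chosen for this example, not taken from the text.

```python
def momenta(L, q, qdot, h=1e-5):
    """p_n = dL/d(qdot_n), Eq. (E.2), by central differences."""
    out = []
    for n in range(len(qdot)):
        up, dn = list(qdot), list(qdot)
        up[n] += h
        dn[n] -= h
        out.append((L(q, up) - L(q, dn)) / (2 * h))
    return out

def velocity_hessian(L, q, qdot, h=1e-3):
    """W_ij = d^2 L / d(qdot_i) d(qdot_j) by central differences."""
    n = len(qdot)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                v = list(qdot)
                v[i] += si * h
                v[j] += sj * h
                return L(q, v)
            W[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return W

def det2(W):
    return W[0][0] * W[1][1] - W[0][1] * W[1][0]

# Hypothetical examples with N = 2 degrees of freedom.
def L_regular(q, v):
    return 0.5 * (v[0] ** 2 + v[1] ** 2)     # free particle, no constraint

def L_singular(q, v):
    return 0.5 * (v[0] + v[1]) ** 2          # depends on v1 + v2 only

q, qdot = [0.2, -0.1], [0.7, 1.3]
print(det2(velocity_hessian(L_regular, q, qdot)))    # close to 1
print(det2(velocity_hessian(L_singular, q, qdot)))   # close to 0
p = momenta(L_singular, q, qdot)
print(p[0] - p[1])   # primary constraint phi = p1 - p2, close to 0
```

For the singular example both momenta equal $\dot q_1 + \dot q_2$, so only one combination of velocities can be recovered from them and $\phi = p_1 - p_2 = 0$ plays the role of the single constraint in (E.3).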
Eventually, we would like to formulate the dynamics in terms of Poisson brackets defined for any two dynamical quantities $A(q, p)$ and $B(q, p)$:

$$\{A, B\} = \frac{\partial A}{\partial q_n}\frac{\partial B}{\partial p_n} - \frac{\partial A}{\partial p_n}\frac{\partial B}{\partial q_n}. \tag{E.4}$$

The Poisson bracket (PB) formulation is the stage from which we launch into quantum mechanics. Since the PB is defined over the whole phase space only for independent variables $(q, p)$, we are faced with the problem of extending the PB definition (among other things) onto a constrained phase space.

The constraints are a consequence of the form of the Lagrangian alone. Following Anderson and Bergmann [15], we will call the $\phi_m$ primary constraints. Now to develop the theory, consider the quantity $p_n\dot q_n - L$. If we make variations in the quantities $q$, $\dot q$ and $p$ we obtain

$$\delta(p_n\dot q_n - L) = \delta p_n\,\dot q_n - \dot p_n\,\delta q_n, \tag{E.5}$$

using Eq. (E.2) and the Lagrange equation $\dot p_n = \partial L/\partial q_n$. Since the right-hand side of Eq. (E.5) is independent of $\delta\dot q_n$ we will call $p_n\dot q_n - L$ the Hamiltonian $H$. Notice that this Hamiltonian is not unique. We can add to $H$ any linear combination of the primary constraints and the resulting new Hamiltonian is just as good as the original one.

How do the primary constraints affect the equations of motion? Since not all the $q$'s and $p$'s are independent, the variations in Eq. (E.5) cannot be made independently. Rather, for Eq. (E.5) to hold, the variations must preserve the conditions Eq. (E.3). The result is [413]

$$\dot q_n = \frac{\partial H}{\partial p_n} + u_m\frac{\partial\phi_m}{\partial p_n} \tag{E.6}$$

and

$$\dot p_n = -\frac{\partial H}{\partial q_n} - u_m\frac{\partial\phi_m}{\partial q_n}, \tag{E.7}$$

where the $u_m$ are unknown coefficients. The $N$ $\dot q$'s are fixed by the $N$ $q$'s, the $N - M$ independent $p$'s and the $M$ $u$'s. Dirac takes the variables $q$, $p$ and $u$ as the Hamiltonian variables.

Recalling the definition of the Poisson bracket Eq. (E.4) we can write, for any function $g$ of the $q$'s and $p$'s,

$$\dot g = \frac{\partial g}{\partial q_n}\dot q_n + \frac{\partial g}{\partial p_n}\dot p_n = \{g, H\} + u_m\{g, \phi_m\}, \tag{E.8}$$

using Eqs. (E.6) and (E.7). As mentioned already, the Poisson bracket has meaning only for two dynamical functions defined uniquely over the whole phase space. Since the $\phi_m$ restrict the independence of some of the $p$'s, we must not use the condition $\phi_m = 0$ within the PB. The PB should be evaluated based on the functional form of the primary constraints. After all PB's have been calculated, then we may impose $\phi_m = 0$. From now on, such restricted relations will be denoted with a squiggly equal sign:

$$\phi_m \approx 0. \tag{E.9}$$

This is called a weak equality. The equation of motion for $g$ is now

$$\dot g \approx \{g, H_T\}, \tag{E.10}$$
where
$$H_T = H + u_m\phi_m \tag{E.11}$$
is the total Hamiltonian [126]. If we take $g$ in Eq. (E.10) to be one of the $\phi$’s we will get some consistency conditions since the primary constraints should remain zero throughout all time:
$$\{\phi_m, H\} + u_{m'}\{\phi_m, \phi_{m'}\} \approx 0. \tag{E.12}$$
What are the possible outcomes of Eq. (E.12)? Unless they all reduce to $0 = 0$, i.e., are identically satisfied, we will get more conditions between the Hamiltonian variables $q$, $p$ and $u$. We will exclude the case where an inappropriate Lagrangian leads to an inconsistency like $1 = 0$. There are then two cases of interest. The first possibility is that Eq. (E.12) provides no new information but imposes conditions on the $u$’s. The second possibility is that we get an equation independent of $u$ but relating the $p$’s and $q$’s. This can happen if the $M \times M$ matrix $\{\phi_m, \phi_{m'}\}$ has any rows (or columns) which are linearly dependent. These new conditions between the $q$’s and $p$’s are called secondary constraints
$$\chi_{k'} \approx 0, \qquad k' = 1, \ldots, K' \tag{E.13}$$
by Anderson and Bergmann [15]. Notice that primary constraints follow from the form of the Lagrangian alone whereas secondary constraints involve the equations of motion as well. These secondary constraints, like the primary constraints, must remain zero throughout all time so we can perform the same consistency operation on the $\chi$’s:
$$\dot\chi_k = \{\chi_k, H\} + u_m\{\chi_k, \phi_m\} \approx 0. \tag{E.14}$$
This equation is treated in the same manner as Eq. (E.12). If it leads to more conditions on the $p$’s and $q$’s the process is repeated again. We continue like this until either all the consistency conditions are exhausted or we get an identity. Let us write all the constraints obtained in the above manner under one index as
$$\phi_j \approx 0, \qquad j = 1, \ldots, M+K \equiv J; \tag{E.15}$$
then we obtain the following matrix equation for the $u_m$:
$$\{\phi_j, H\} + u_m\{\phi_j, \phi_m\} \approx 0. \tag{E.16}$$
The most general solution to Eq. (E.16) is
$$u_m = U_m + v_a V_{am}, \qquad a = 1, \ldots, A, \tag{E.17}$$
where $V_m$ is a solution of the homogeneous part of Eq. (E.16) and $v_a V_{am}$ is a linear combination of all such independent solutions. The coefficients $v_a$ are arbitrary. Substitute Eq. (E.17) into Eq. (E.11). This gives
$$H_T = H + U_m\phi_m + v_a V_{am}\phi_m = H' + v_a\phi_a, \tag{E.18}$$
where
$$H' = H + U_m\phi_m \tag{E.19}$$
and
$$\phi_a = V_{am}\phi_m. \tag{E.20}$$
Note that the $u$’s must satisfy consistency requirements whereas the $v$’s are totally arbitrary functions of time. Later, we will have more to say about the appearance of these arbitrary features in our theory.

To further classify the quantities in our theory, consider the following definitions given by Dirac [124]. Any dynamical variable, $F(q, p)$, is called first class if
$$\{F, \phi_j\} \approx 0, \qquad j = 1, \ldots, J, \tag{E.21}$$
i.e., $F$ has zero PB with all the $\phi$’s. If $\{F, \phi_j\}$ is not weakly zero, $F$ is called second class. Since the $\phi$’s are the only independent quantities which are weakly zero, we can write the following strong equations when $F$ is first class:
$$\{F, \phi_j\} = c_{jj'}\phi_{j'}. \tag{E.22}$$
Any quantity which is weakly zero is strongly equal to some linear combination of the $\phi$’s. Given Eqs. (E.21) and (E.22) it is easy to show that $H'$ and $\phi_a$ (see Eqs. (E.19) and (E.20)) are first class quantities. Since $\phi_a$ is a linear combination of primary constraints, Eq. (E.20), it too is a primary constraint. Thus, the total Hamiltonian, Eq. (E.18), which is expressed as the sum of a first class Hamiltonian plus a linear combination of primary first class constraints, is a first class quantity.

Notice that the number of arbitrary functions of the time appearing in our theory is equal to the number of independent primary first class constraints. This can be seen by looking at Eq. (E.17) where all the independent first class primary constraints are included in the sum. This same number will also appear in the general equation of motion because of Eq. (E.18). Let us make a small digression on the role of these arbitrary functions of time. The physical state of any system is determined by the $q$’s and $p$’s only and not by the $v$’s. However, if we start out at $t = t_0$ with fixed initial values $(q_0, p_0)$ we arrive at different values of $(q, p)$ at later times depending on our choice of $v$.
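The dependence of the evolution on the arbitrary multiplier can be made concrete with a small symbolic sketch. It is only an illustration under stated assumptions: the toy Hamiltonian $H' = p_1^2/2$ and the single first class primary constraint $\phi = p_2$ (as would arise from a hypothetical Lagrangian $L = \tfrac12\dot q_1^2$ in which $q_2$ has no velocity term) are not taken from the text, and the helper name `poisson_bracket` is likewise hypothetical.

```python
import sympy as sp

# Toy phase space (an assumption for illustration): two canonical pairs.
q1, q2, p1, p2, v = sp.symbols("q1 q2 p1 p2 v")
coords, momenta = (q1, q2), (p1, p2)


def poisson_bracket(f, g):
    """Canonical Poisson bracket {f, g} for the pairs (q1, p1) and (q2, p2)."""
    return sum(
        sp.diff(f, q) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, q)
        for q, p in zip(coords, momenta)
    )


H_prime = p1**2 / 2        # hypothetical first class Hamiltonian H'
phi = p2                   # hypothetical first class primary constraint
H_T = H_prime + v * phi    # total Hamiltonian; v is the arbitrary multiplier

# Equations of motion g_dot = {g, H_T}, evaluated before imposing phi = 0.
q1_dot = poisson_bracket(q1, H_T)    # does not involve v
q2_dot = poisson_bracket(q2, H_T)    # equals the arbitrary multiplier v
p1_dot = poisson_bracket(p1, H_T)
phi_dot = poisson_bracket(phi, H_T)  # the constraint is preserved in time
```

In this toy model only $q_2$ inherits the arbitrariness of $v$, while $q_1$ and $p_1$ evolve independently of it, which is the situation the digression describes.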
The physical state does not uniquely determine a set of $q$’s and $p$’s but a given set of $q$’s and $p$’s must determine the physical state. We thus have the situation where there may be several sets of the dynamical variables which correspond to the same physical state. To understand this better consider two functions $A_{v_a}$ and $A_{v'_a}$ of the dynamical variables which evolve from some $A_0$ with different multipliers. Compare the two functions after a short time interval $\Delta t$ by considering a Taylor expansion to first order in $\Delta t$:
$$A_{v_a}(t) = A_0 + \dot A_{v_a}\Delta t = A_0 + \{A_0, H_T\}\Delta t = A_0 + \Delta t\,[\{A_0, H'\} + v_a\{A_0, \phi_a\}]. \tag{E.23}$$
Thus,
$$A_{v_a} - A_{v'_a} = \Delta t\,(v_a - v'_a)\{A_0, \phi_a\} \tag{E.24}$$
or
$$\Delta A = \epsilon_a\{A_0, \phi_a\}, \tag{E.25}$$
where
$$\epsilon_a = \Delta t\,(v_a - v'_a) \tag{E.26}$$
is a small, arbitrary quantity. This relationship between $A_{v_a}$ and $A_{v'_a}$ tells us that the two functions are related by an infinitesimal canonical transformation (ICT) [186] whose generator is a first class primary constraint $\phi_a$. This ICT leads to changes in the $q$’s and $p$’s which do not affect the physical state. Furthermore, it can also be shown [126], by considering successive ICTs, that the generators need not be primary but can be secondary as well. To be completely general then, we should allow for such variations which do not change the physical state in our equations of motion. This can be accomplished by redefining $H_T$ to include the first class secondary constraints with arbitrary coefficients. Since the distinction between first class primary and first class secondary is not significant [413], in what follows we will not make any explicit changes.

For future considerations, let us call those transformations which do not change the physical state gauge transformations. The ability to perform gauge transformations is a sign that the mathematical framework of our theory has some arbitrary features. Suppose we can add conditions to our theory that eliminate our ability to make gauge transformations. These conditions would enter as secondary constraints since they do not follow from the form of the Lagrangian. Therefore, upon imposing these conditions, all constraints become second class. If there were any more first class constraints we would have generators for gauge transformations which, by assumption, can no longer be made. This is the end of the digression although we will see examples of gauge transformations later.

In general, of the $J$ constraints, some are first class and some are second class. A linear combination of constraints is again a constraint so we can replace the $\phi_j$ with independent linear combinations of them. In doing so, we will try to make as many of the constraints first class as possible.
Those constraints which cannot be brought into the first class through appropriate linear combinations are labeled by $\xi_s$, $s = 1, \ldots, S$. Now form the PBs of all the $\xi$’s with each other and arrange them into a matrix:
$$\Delta \equiv \begin{pmatrix} 0 & \{\xi_1, \xi_2\} & \cdots & \{\xi_1, \xi_S\} \\ \{\xi_2, \xi_1\} & 0 & \cdots & \{\xi_2, \xi_S\} \\ \vdots & \vdots & \ddots & \vdots \\ \{\xi_S, \xi_1\} & \{\xi_S, \xi_2\} & \cdots & 0 \end{pmatrix}. \tag{E.27}$$
Dirac has proven that the determinant of $\Delta$ is non-zero (not even weakly zero). Therefore, the inverse of $\Delta$ exists:
$$(\Delta^{-1})_{ss'}\{\xi_{s'}, \xi_{s''}\} = \delta_{ss''}. \tag{E.28}$$
Define the Dirac bracket (DB) (Dirac called them “new Poisson brackets”) between any two dynamical quantities $A$ and $B$ to be
$$\{A, B\}^* = \{A, B\} - \{A, \xi_s\}(\Delta^{-1})_{ss'}\{\xi_{s'}, B\}. \tag{E.29}$$
The DB satisfies all the same algebraic properties (anti-symmetry, linearity, product law, Jacobi identity) as the ordinary PB. Also, the equations of motion can be written in terms of the DB since for any $g(p, q)$,
$$\{g, H_T\}^* = \{g, H_T\} - \{g, \xi_s\}(\Delta^{-1})_{ss'}\{\xi_{s'}, H_T\} \approx \{g, H_T\}. \tag{E.30}$$
The last step follows because $H_T$ is first class.
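As an illustration of Eqs. (E.27)-(E.29), the following sketch evaluates a Dirac bracket symbolically. The choice of system is an assumption made purely for the example: a particle restricted to the unit circle, with the hypothetical second class pair $\xi_1 = q_1^2 + q_2^2 - 1$ and $\xi_2 = q_1 p_1 + q_2 p_2$. The function names are hypothetical as well.

```python
import sympy as sp

q1, q2, p1, p2 = sp.symbols("q1 q2 p1 p2", real=True)
coords, momenta = (q1, q2), (p1, p2)


def poisson_bracket(f, g):
    """Canonical Poisson bracket {f, g}."""
    return sum(
        sp.diff(f, q) * sp.diff(g, p) - sp.diff(f, p) * sp.diff(g, q)
        for q, p in zip(coords, momenta)
    )


# Hypothetical second class constraints (particle on the unit circle).
xi = [q1**2 + q2**2 - 1, q1 * p1 + q2 * p2]

# Matrix of Poisson brackets among the second class constraints, cf. Eq. (E.27).
Delta = sp.Matrix([[poisson_bracket(a, b) for b in xi] for a in xi])
Delta_inv = Delta.inv()  # exists because det(Delta) = 4 (q1^2 + q2^2)^2 is non-zero


def dirac_bracket(A, B):
    """Dirac bracket {A, B}* as defined in Eq. (E.29)."""
    correction = sum(
        poisson_bracket(A, xi[s]) * Delta_inv[s, t] * poisson_bracket(xi[t], B)
        for s in range(len(xi))
        for t in range(len(xi))
    )
    return sp.simplify(poisson_bracket(A, B) - correction)


db_q1_p1 = dirac_bracket(q1, p1)       # modified fundamental bracket
db_p1_xi2 = dirac_bracket(p1, xi[1])   # bracket with a second class constraint
```

For this toy system the fundamental bracket is modified to $\{q_1, p_1\}^* = 1 - q_1^2/(q_1^2 + q_2^2)$, and the Dirac bracket of any quantity with either $\xi$ vanishes identically, in line with the role the DB plays for second class constraints.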
Perhaps the most important feature of the DB is the way it handles second class constraints. Consider the DB of a dynamical quantity with one of the (remaining) $\xi$’s:
$$\{g, \xi_{s''}\}^* = \{g, \xi_{s''}\} - \{g, \xi_s\}(\Delta^{-1})_{ss'}\{\xi_{s'}, \xi_{s''}\} = \{g, \xi_{s''}\} - \{g, \xi_s\}\delta_{ss''} = 0. \tag{E.31}$$
The definition Eq. (E.28) was used in the second step above. Thus, the $\xi$’s may be set strongly equal to zero before working out the Dirac bracket. Of course, we must still be careful that we do not set $\xi$ strongly to zero within a Poisson bracket. If we now replace all PBs by DBs (which is legitimate since the dynamics can be written in terms of DBs via Eq. (E.30)) any second class constraints in $H_T$ will appear in the DB in Eq. (E.30). Eq. (E.31) then tells us that those constraints can be set to zero. Thus, all we are left with in our Hamiltonian are first class constraints:
$$\tilde H_T = H + v_i\Phi_i, \qquad i = 1, \ldots, I, \tag{E.32}$$
where the sum is over the remaining constraints which are first class. It must be emphasized that this is possible only because we have reformulated the theory in terms of the Dirac brackets. Of course, this reformulation in terms of the DB does not uniquely determine the dynamics for us since we still have arbitrary functions of the time accompanying the first class constraints. If the Lagrangian is such that it exhibits no first class constraints then the dynamics are completely defined.

Before doing an example from classical field theory, we should note some features of a field theory that differentiate it from point mechanics. In the classical theory with a finite number of degrees of freedom we had constraints which were functions of the phase space variables. Going over to field theory these constraints become functionals which in general may depend upon the spatial derivatives of the fields and conjugate momenta as well as the fields and momenta themselves:
$$\phi_m = \phi_m[\varphi(x), \pi(x), \partial_i\varphi, \partial_i\pi]. \tag{E.33}$$
The square brackets indicate a functional relationship and $\partial_i \equiv \partial/\partial x^i$. A consequence of this is that the constraints are differential equations in general. Furthermore, the constraint itself is no longer the only independent weakly vanishing quantity. Spatial derivatives of $\phi_m$ and integrals of constraints over spatial variables are weakly zero also. Since there are actually an infinite number of constraints for each $m$ (one at each space-time point $x$) we write
$$H_T = H + \int dx\, u_m(x)\phi_m(x). \tag{E.34}$$
Consistency requires that the primary constraints be conserved in time:
$$0 \approx \{\phi_m(x), H_T\} = \{\phi_m, H\} + \int dy\, u_n(y)\{\phi_m(x), \phi_n(y)\}. \tag{E.35}$$
The field theoretical Poisson bracket for any two phase space functionals is given by
$$\{A, B\}_{x^0 = y^0}(x, y) = \int dz \left(\frac{\delta A}{\delta\varphi_i(z)}\frac{\delta B}{\delta\pi_i(z)} - \frac{\delta A}{\delta\pi_i(z)}\frac{\delta B}{\delta\varphi_i(z)}\right) \tag{E.36}$$
with the subscript $x^0 = y^0$ reminding us that the bracket is defined for equal times only. Generally, there may be a number of fields present hence the discrete label $i$. The derivatives appearing in the
PB above are functional derivatives. If $F[f(x)]$ is a functional its derivative with respect to a function $f(y)$ is defined to be
$$\frac{\delta F[f(x)]}{\delta f(y)} = \lim_{\epsilon \to 0}\frac{1}{\epsilon}\Big[F[f(x) + \epsilon\,\delta(x-y)] - F[f(x)]\Big]. \tag{E.37}$$
Assuming Eq. (E.36) has a non-zero determinant we can define an inverse:
$$\int dy\, P_{lm}(x, y)P^{-1}_{mn}(y, z) = \int dy\, P^{-1}_{lm}(x, y)P_{mn}(y, z) = \delta_{ln}\,\delta(x-z), \tag{E.38}$$
where
$$P_{lm}(x, y) \equiv \{\phi_l(x), \phi_m(y)\}_{x^0 = y^0}. \tag{E.39}$$
Unlike the discrete case, the inverse of the PB matrix above is not unique in general. This introduces an arbitrariness which was not present in theories with a finite number of degrees of freedom. The arbitrariness makes itself manifest in the form of differential (rather than algebraic) equations for the multipliers. We must then supply boundary conditions to fix the multipliers [413].

The Maxwell theory for the free electro-magnetic field is defined by the action
$$S = \int d^4x\, \mathcal{L}(x), \tag{E.40}$$
where $\mathcal{L}$ is the Lagrangian density, Eq. (B.8). The action is invariant under local gauge transformations. The ability to perform such gauge transformations indicates the presence of first class constraints. To find them, we first obtain the momenta conjugate to the fields $A_\mu$: $\pi^\mu = -F^{0\mu}$ as defined in Eq. (B.12). This gives us a primary constraint, namely $\pi^0(x) = 0$. Using Eq. (B.26), we can write the canonical Hamiltonian density as
$$\mathcal{H} = \tfrac{1}{2}\pi_i\pi_i + \pi_i\partial_i A_0 + \tfrac{1}{4}F_{ik}F_{ik}, \tag{E.41}$$
where the velocity fields $\dot A_i$ have been expressed in terms of the momenta $\pi_i$. After a partial integration on the second term, the Hamiltonian becomes
$$H = \int d^3x \left(\tfrac{1}{2}\pi_i\pi_i - A_0\partial_i\pi_i + \tfrac{1}{4}F_{ik}F_{ik}\right) \tag{E.42}$$
$$\Rightarrow\quad H_T = H + \int d^3x\, v_1(x)\pi^0(x).$$
Again, for consistency, the primary constraints must be constant in time so that
$$0 \approx \{\pi^0, H_T\} = -\left\{\pi^0, \int d^3x\, A_0\partial_i\pi_i\right\} = \partial_i\pi_i. \tag{E.43}$$
Thus, $\partial_i\pi_i \approx 0$ is a secondary constraint. We must then check to see if Eq. (E.43) leads to further constraints by also requiring that $\partial_i\pi_i$ is conserved in time:
$$0 \approx \{\partial_i\pi_i, H_T\}. \tag{E.44}$$
The PB above vanishes identically however so there are no more constraints which follow from consistency requirements. So we have our two first class constraints:
$$\phi_1 = \pi^0 \approx 0 \tag{E.45}$$
and
$$\chi \equiv \phi_2 = \partial_i\pi_i \approx 0. \tag{E.46}$$
In light of the above statements the first class secondary constraints should be included in $H_T$ as well (some authors call the Hamiltonian with first class secondary constraints included the extended Hamiltonian):
$$H_T = H + \int d^3x\,(v_1\phi_1 + v_2\phi_2). \tag{E.47}$$
Notice that the fundamental PB’s among the $A_\mu$ and $\pi^\mu$,
$$\{A_\mu(x), \pi^\nu(y)\}_{x^0 = y^0} = \delta^\nu_\mu\,\delta(x-y), \tag{E.48}$$
are incompatible with the constraint $\pi^0 \approx 0$ so we will modify them using the Dirac-Bergmann procedure. The first step towards this end is to impose certain conditions to break the local gauge invariance. Since there are two first class constraints, we need two gauge conditions imposed as second class constraints. The traditional way to implement this is by imposing the radiation gauge conditions:
$$\Omega_1 \equiv A_0 \approx 0 \quad\text{and}\quad \Omega_2 \equiv \partial_i A_i \approx 0. \tag{E.49}$$
It can be shown [413] that the radiation gauge conditions completely break the gauge invariance thereby bringing all constraints into the second class. The next step is to form the matrix of second class constraints with matrix elements $\Delta_{ij} = \{\Omega_i, \phi_j\}_{x^0 = y^0}$ and $i, j = 1, 2$. Arranged in the ordering $(\Omega_1, \Omega_2, \phi_1, \phi_2)$, the full matrix reads
$$\Delta = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -\nabla^2 \\ -1 & 0 & 0 & 0 \\ 0 & \nabla^2 & 0 & 0 \end{pmatrix}\delta(x-y). \tag{E.50}$$
To get the Dirac bracket we need the inverse of $\Delta$. Recalling the definition, Eq. (E.38), we have
$$\int dy\, \Delta_{ij}(x, y)(\Delta^{-1})_{jk}(y, z) = \delta_{ik}\,\delta(x-z). \tag{E.51}$$
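With the help of $\nabla^2(1/|x-y|) = -4\pi\,\delta(x-y)$ we can easily solve Eq. (E.51) element by element to obtain
$$\Delta^{-1} = \begin{pmatrix} 0 & 0 & -\delta(x-y) & 0 \\ 0 & 0 & 0 & -\dfrac{1}{4\pi|x-y|} \\ \delta(x-y) & 0 & 0 & 0 \\ 0 & \dfrac{1}{4\pi|x-y|} & 0 & 0 \end{pmatrix}. \tag{E.52}$$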
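The inversion can be cross-checked in Fourier space, where the convolution in Eq. (E.51) becomes an ordinary matrix product. The sketch below is a consistency check under assumed conventions, not part of the original derivation: it takes $\delta(x-y) \to 1$, $\nabla^2 \to -k^2$ and $1/(4\pi|x-y|) \to 1/k^2$, the last being equivalent to the identity $\nabla^2(1/|x-y|) = -4\pi\,\delta(x-y)$ quoted above.

```python
import sympy as sp

k = sp.symbols("k", positive=True)  # magnitude of the wave vector

# Fourier-space symbols under the assumed conventions:
#   delta(x - y) -> 1,  nabla^2 -> -k**2,  1/(4*pi*|x - y|) -> 1/k**2
Delta_k = sp.Matrix([
    [0, 0, 1, 0],
    [0, 0, 0, k**2],     # -nabla^2  ->  +k**2
    [-1, 0, 0, 0],
    [0, -k**2, 0, 0],    # +nabla^2  ->  -k**2
])

Delta_inv_k = sp.Matrix([
    [0, 0, -1, 0],
    [0, 0, 0, -1 / k**2],
    [1, 0, 0, 0],
    [0, 1 / k**2, 0, 0],
])

# Both orderings should reproduce the 4x4 identity, i.e. delta_ik in Eq. (E.51).
right_product = (Delta_k * Delta_inv_k).applyfunc(sp.simplify)
left_product = (Delta_inv_k * Delta_k).applyfunc(sp.simplify)
```

Both products reduce to the identity matrix, so the entries of Eq. (E.52) are consistent with Eq. (E.50) mode by mode.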
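Thus, the Dirac bracket in the radiation gauge is (all brackets are at equal times)
$$\{A(x), B(y)\}^* = \{A(x), B(y)\} - \int\!\!\int du\, dv\, \{A(x), \psi_i(u)\}(\Delta^{-1})_{ij}(u, v)\{\psi_j(v), B(y)\}, \tag{E.53}$$
where $\psi_1 = \Omega_1$, $\psi_2 = \Omega_2$, $\psi_3 = \phi_1$ and $\psi_4 = \phi_2$. The fundamental Dirac brackets are
$$\begin{aligned} \{A_\mu(x), \pi^\nu(y)\}^* &= (\delta^\nu_\mu + \delta^0_\mu g^{\nu 0})\,\delta(x-y) - \partial_\mu\partial^\nu\frac{1}{4\pi|x-y|}, \\ \{A_\mu(x), A_\nu(y)\}^* &= 0 = \{\pi^\mu(x), \pi^\nu(y)\}^*. \end{aligned} \tag{E.54}$$
From the first of the above equations we obtain
$$\{A_i(x), \pi_j(y)\}^* = \delta_{ij}\,\delta(x-y) - \partial_i\partial_j\frac{1}{4\pi|x-y|}. \tag{E.55}$$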
The right-hand side of the above expression is often called the “transverse delta function” in the context of canonical quantization of the electro-magnetic field in the radiation gauge. In nearly all treatments of that subject, however, the transverse delta function is introduced “by hand” so to speak. This is done after realizing that the standard commutation relation $[A_i(x), \pi_j(y)] = i\delta_{ij}\,\delta(x-y)$ is in contradiction with Gauss’ law. In the Dirac-Bergmann approach the familiar equal-time commutator relation is obtained without any hand-waving arguments. The choice of the radiation gauge in the above example most naturally reflects the splitting of $A$ and $\pi$ into transverse and longitudinal parts. In fact, the gauge condition $\partial_i A_i = 0$ implies that the longitudinal part of $A$ is zero. This directly reflects the observation that no longitudinally polarized photons exist in nature. Given this observation, we should somehow be able to associate the true degrees of freedom with the transverse parts of $A$ and $\pi$. Sundermeyer [413] shows that this is indeed the case and that, for the true degrees of freedom, the DB and PB coincide.

We have up till now concerned ourselves with constrained dynamics at the classical level. Although all the previous developments have occurred quite naturally in the classical context, it was the problem of quantization which originally motivated Dirac and others to develop the previously described techniques. Also, more advanced techniques incorporating constraints into the path integral formulation of quantum theory have been developed. The general problem of quantizing theories with constraints is very formidable especially when considering general gauge theories. We will not attempt to address such problems. Rather, we will work in the non-relativistic framework of the Schrödinger equation where quantum states are described by a wave function. As a first case, let us consider a classical theory where all the constraints are first class.
The Hamiltonian is written then as the sum of the canonical Hamiltonian $H = p_i\dot q_i - L$ plus a linear combination of the first class constraints:
$$H' = H + v_j\phi_j. \tag{E.56}$$
Take the $p$’s and $q$’s to satisfy
$$\{q_i, p_j\} \Rightarrow \frac{i}{\hbar}[\hat q_i, \hat p_j], \tag{E.57}$$
where the hatted variables denote quantum operators and $[\hat q_i, \hat p_j] = \hat q_i\hat p_j - \hat p_j\hat q_i$ is the commutator. The Schrödinger equation reads
$$i\hbar\,\frac{d\psi}{dt} = H'\psi, \tag{E.58}$$
where $\psi$ is the wavefunction on which the dynamical variables operate. For each constraint $\phi_j$ impose supplementary conditions on the wavefunction
$$\hat\phi_j\psi = 0. \tag{E.59}$$
Consistency of Eqs. (E.59) with one another demands that
$$[\hat\phi_j, \hat\phi_{j'}]\psi = 0. \tag{E.60}$$
Recall the situation in the classical theory where anything that was weakly zero could be written strongly as a linear combination of the $\phi$’s:
$$\{\phi_j, \phi_{j'}\} = c^{j''}_{jj'}\phi_{j''}. \tag{E.61}$$
Now if we want Eq. (E.60) to be a consequence of Eq. (E.59), an analogous relation to Eq. (E.61) must hold in the quantum theory, namely,
$$[\hat\phi_j, \hat\phi_{j'}] = \hat c^{\,j''}_{jj'}\hat\phi_{j''}. \tag{E.62}$$
The problem is that the coefficients $\hat c$ in the quantum theory are in general functions of the operators $\hat p$ and $\hat q$ and do not necessarily commute with the $\hat\phi$’s. For consistency, then, we must have the coefficients in the quantum theory all appearing to the left of the $\hat\phi$’s. The same conclusion follows if we consider the consistency of Eq. (E.59) with the Schrödinger equation. If we cannot arrange to have the coefficients to the left of the constraints in the quantum theory then as Dirac says “we are out of luck” [126].

Consider now the case where there are second class constraints, $\xi_s$. The problems encountered when there are second class constraints are similar in nature to the first class case but appear even worse. This statement follows simply from the definition of second class. If we try to impose a condition on $\psi$ similarly to Eq. (E.59) but with a second class constraint we must get a contradiction since already $\{\xi_s, \phi_j\} \neq 0$ for all $j$ at the classical level.

Of course if we imposed $\hat\xi_s = 0$ as an operator identity then there is no contradiction. In the classical theory, the analogous constraint condition is the strong equality $\xi = 0$. We have seen that strong equalities for second class constraints emerge in the classical theory via the Dirac-Bergmann method. Thus it seems quite suggestive to postulate
$$\{A, B\}^* \Rightarrow \frac{i}{\hbar}[\hat A, \hat B] \tag{E.63}$$
as the rule for quantizing the theory while imposing $\hat\xi_s = 0$ as an operator identity. Any remaining weak equations are all first class and must then be treated as in the first case using supplementary conditions on the wave function. Hence, the operator ordering ambiguity still exists in general. We have seen that there is no definite way to guarantee a well defined quantum theory given the corresponding classical theory. It is possible, since the Dirac bracket depends on the gauge constraints imposed by hand, that we can choose such constraints in such a way as to avoid any
problems. For a general system however, such attempts would at best be difficult to implement. We have seen that there is a consistent formalism for determining (at least as much as one can) the dynamics of a generalized Hamiltonian system. The machinery is as follows:
1. Obtain the canonical momenta from the Lagrangian.
2. Identify the primary constraints and construct the total Hamiltonian.
3. Require the primary constraints to be conserved in time.
4. Require any additional constraints obtained by step 3 to also be conserved in time.
5. Separate all constraints into first class or second class.
6. Invert the matrix of second class constraints.
7. Form the Dirac bracket and write the equations of motion in terms of them.
8. Quantize by taking the DB over to the quantum commutator.
Of course, there are limitations throughout this program; especially in steps six and eight. If there are any remaining first class constraints it is a sign that we still have some gauge freedom left in our theory. Given the importance of gauge field theory in today’s physics it is certainly worth one’s while to understand the full implications of constrained dynamics. The material presented here is meant to serve as a primer for further study.
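The separation into first and second class constraints (step five of the program) can be sketched for a finite-dimensional toy model. Everything specific here is an assumption made for illustration: the three constraints $q_1$, $p_1$, $p_3$ are hypothetical, the function name `count_classes` is hypothetical, and the rank criterion is applied to a constant bracket matrix. In a general problem the matrix would have to be evaluated weakly, i.e. on the constraint surface, before its rank is taken.

```python
import sympy as sp

q = sp.symbols("q1:4", real=True)  # (q1, q2, q3)
p = sp.symbols("p1:4", real=True)  # (p1, p2, p3)


def poisson_bracket(f, g):
    """Canonical Poisson bracket {f, g} on the six-dimensional toy phase space."""
    return sum(
        sp.diff(f, qi) * sp.diff(g, pi) - sp.diff(f, pi) * sp.diff(g, qi)
        for qi, pi in zip(q, p)
    )


def count_classes(constraints):
    """Return (n_first_class, n_second_class) from the rank of {phi_i, phi_j}.

    The rank counts the constraints that cannot be made first class by taking
    linear combinations; the remainder are first class combinations.
    """
    bracket_matrix = sp.Matrix(
        [[poisson_bracket(a, b) for b in constraints] for a in constraints]
    )
    n_second = bracket_matrix.rank()
    return len(constraints) - n_second, n_second


# Hypothetical constraint set: q1 and p1 form a second class pair,
# while p3 has vanishing brackets with all constraints (first class).
toy_constraints = [q[0], p[0], p[2]]
n_first, n_second = count_classes(toy_constraints)
```

The second class block isolated in this way is the matrix that has to be inverted in step six, and any first class constraints left over signal the residual gauge freedom mentioned above.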
References
[1] J. Abad, J.G. Esteve, A.F. Pacheco, Phys. Rev. D 32 (1985) 2729.
[2] O. Abe, K. Tanaka, K.G. Wilson, Phys. Rev. D 48 (1993) 4856–4867.
[3] O. Abe, G.J. Aubrecht, K. Tanaka, Phys. Rev. D 56 (1997) 2242–2249.
[4] N.A. Aboud, J.R. Hiller, Phys. Rev. D 41 (1990) 937–945.
[5] C. Acerbi, A. Bassetto, Phys. Rev. D 49 (1994) 1067–1076.
[6] S. Adler, Nucl. Phys. B 415 (1994) 195.
[7] A. Ali, V.M. Braun, H. Simma, Z. Physik C 63 (1994) 437–454.
[8] E.A. Ammons, Phys. Rev. D 50 (1994) 980–990.
[9] F. Antonuccio, S. Dalley, Phys. Lett. B 348 (1995) 55.
[10] F. Antonuccio, S. Dalley, Nucl. Phys. B 461 (1996) 275.
[11] F. Antonuccio, S. Dalley, Phys. Lett. B 376 (1996) 154.
[12] F. Antonuccio, S.J. Brodsky, S. Dalley, Phys. Lett. B 412 (1997) 104.
[13] F. Antonuccio, S. Pinsky, Phys. Lett. B 397 (1997) 42–50.
[14] A.M. Annenkova, E.V. Prokhatilov, V.A. Franke, Phys. Atom. Nucl. 56 (1993) 813–825.
[15] J.L. Anderson, P.G. Bergmann, Phys. Rev. 83 (1951) 1018.
[16] K. Bardakci, M.B. Halpern, Phys. Rev. 176 (1968) 1786.
[17] W.A. Bardeen, R.B. Pearson, Phys. Rev. D 13 (1976) 547.
[18] W.A. Bardeen, R.B. Pearson, E. Rabinovici, Phys. Rev. D 21 (1980) 1037.
[19] V. Bargmann, Proc. Natl. Acad. Sci. (USA) 34 (1948) 211.
[21] D. Bartoletta et al., Phys. Rev. Lett. 62 (1989) 2436.
[22] A. Bassetto, G. Nardelli, R. Soldati, Yang-Mills Theories in Algebraic Noncovariant Gauges, World Scientific, Singapore, 1991.
[23] A. Bassetto, Phys. Rev. D 47 (1993) 727–729.
[24] A. Bassetto, M. Ryskin, Phys. Lett. B 316 (1993) 542–545.
[25] A. Bassetto, I.A. Korchemskaya, G.P. Korchemsky, G. Nardelli, Nucl. Phys. B 408 (1993) 62–90.
[26] A. Bassetto, Nucl. Phys. Proc. Suppl. C 51 (1996) 281–288.
[27] A. Bassetto, L. Griguolo, G. Nardelli, Phys. Rev. D 54 (1996) 2845–2852.
[28] A. Bassetto, G. Nardelli, Int. J. Mod. Phys. A 12 (1997) 1075–1090.
[29] R. Bayer, H.C. Pauli, Phys. Rev. D 53 (1996) 939.
[30] J.S. Bell, Acta Phys. Austriaca (Suppl.) 13 (1974) 395.
[31] E. Belz, ANL preprint, 1994.
[32] C.M. Bender, L.R. Mead, S.S. Pinsky, Phys. Rev. Lett. 56 (1986) 2445.
[33] C.M. Bender, S.S. Pinsky, B. Van de Sande, Phys. Rev. D 48 (1993) 816.
[34] H. Bergknoff, Nucl. Phys. B 122 (1977) 215.
[35] C. Bloch, Nucl. Phys. 6 (1958) 329.
[36] G. Bertsch, S.J. Brodsky, A.S. Goldhaber, J.F. Gunion, Phys. Rev. Lett. 47 (1981) 297.
[37] H.A. Bethe, F. de Hoffman, Mesons and Fields, Vol. II, Row, Peterson and Company, Evanston, Ill, 1955.
[38] G. Bhanot, K. Demeterfi, I.R. Klebanov, Phys. Rev. D 48 (1993) 4980.
[39] J.D. Bjørken, S.D. Drell, Relativistic Quantum Mechanics, McGraw-Hill, New York, 1964; J.D. Bjørken, S.D. Drell, Relativistic Quantum Fields, McGraw-Hill, New York, 1965.
[40] J.D. Bjørken, E.A. Paschos, Phys. Rev. 185 (1969) 1975.
[41] J.D. Bjørken, J.B. Kogut, D.E. Soper, Phys. Rev. D 3 (1971) 1382.
[42] B. Blaettel, G. Baym, L.L. Frankfurt, H. Heiselberg, M. Strikman, Phys. Rev. D 47 (1993) 2761.
[43] R. Blankenbecler, S.J. Brodsky, J.F. Gunion, R. Savit, Phys. Rev. D 8 (1973) 4117.
[44] G.T. Bodwin, D.R. Yennie, M.A. Gregorio, Rev. Mod. Phys. 56 (1985) 723.
[45] A. Borderies, P. Grangé, E. Werner, Phys. Lett. B 319 (1993) 490–496.
[46] A. Borderies, P. Grangé, E. Werner, Phys. Lett. B 345 (1995) 458–468.
[47] J. Botts, Nucl. Phys. B 353 (1991) 20.
[48] M. Brisudova, R.J. Perry, Phys. Rev. D 54 (1996) 1831–1843.
[49] M. Brisudova, R.J. Perry, Phys. Rev. D 54 (1996) 6453–6458.
[50] M. Brisudova, R.J. Perry, K.G. Wilson, Phys. Rev. Lett. 78 (1997) 1227–1230.
[51] S.J. Brodsky, J.R. Primack, Annals Phys. 52 (1960) 315.
[52] S.J. Brodsky, J.R. Primack, Phys. Rev. 174 (1968) 2071.
[53] S.J. Brodsky, R. Roskies, R. Suaya, Phys. Rev. D 8 (1973) 4574.
[54] S.J. Brodsky, G.R. Farrar, Phys. Rev. D 11 (1975) 1309.
[55] S.J. Brodsky, B.T. Chertok, Phys. Rev. D 14 (1976) 3003.
[56] S.J. Brodsky, S.D. Drell, Phys. Rev. D 22 (1980) 2236.
[57] S.J. Brodsky, G.P. Lepage, S.A.A. Zaidi, Phys. Rev. D 23 (1981) 1152.
[58] S.J. Brodsky, J.R. Hiller, Phys. Rev. C 28 (1983) 475.
[59] S.J. Brodsky, C.-R. Ji, G.P. Lepage, Phys. Rev. Lett. 51 (1983) 83.
[60] S.J. Brodsky, A.H. Mueller, Phys. Lett. B 206 (1988) 685.
[61] S.J. Brodsky, G.F. de Teramond, Phys. Rev. Lett. 60 (1988) 1924.
[62] S.J. Brodsky, G.P. Lepage, in: A.H. Mueller (Ed.), Perturbative Quantum Chromodynamics, World Scientific, Singapore, 1989.
[63] S.J. Brodsky, I.A. Schmidt, Phys. Lett. B 234 (1990) 144.
[64] S.J. Brodsky, G.F. de Teramond, I.A. Schmidt, Phys. Rev. Lett. 64 (1990) 1011.
[65] S.J. Brodsky, I.A. Schmidt, Phys. Rev. D 43 (1991) 179.
[66] S.J. Brodsky, H.C. Pauli, in: H. Mitter, H. Gausterer (Eds.), Recent Aspect of Quantum Fields, Lecture Notes in Physics, vol. 396, Springer, Berlin, 1991.
[67] S.J. Brodsky, P. Hoyer, A.H. Mueller, W.-K. Tang, Nucl. Phys. B 369 (1992) 519.
[68] S.J. Brodsky, G. McCartor, H.C. Pauli, S.S. Pinsky, Particle World 3 (1993) 109.
[69] S.J. Brodsky, W.-K. Tang, C.B. Thorn, Phys. Lett. B 318 (1993) 203.
[70] S.J. Brodsky, M. Burkardt, I. Schmidt, Nucl. Phys. B 441 (1994) 197.
[71] S.J. Brodsky, F. Schlumpf, Phys. Lett. B 329 (1994) 111.
[72] S.J. Brodsky, F. Schlumpf, Prog. Part. Nucl. Phys. 34 (1995) 69–86.
[73] R.W. Brown, J.W. Jun, S.M. Shvartsman, C.C. Taylor, Phys. Rev. D 48 (1993) 5873–5882.
[74] F. Buccella, E. Celeghin, H. Kleinert, C.A. Savoy, E. Sorace, Nuovo Cimento A 69 (1970) 133.
[75] M. Burkardt, Phys. A 504 (1989) 762.
[76] M. Burkardt, A. Langnau, Phys. Rev. D 44 (1991) 1187.
[77] M. Burkardt, A. Langnau, Phys. Rev. D 44 (1991) 3857.
[78] M. Burkardt, Phys. Rev. D 47 (1993) 4628.
[79] M. Burkardt, Phys. Rev. D 49 (1994) 5446.
[80] M. Burkardt, Adv. Nucl. Phys. 23 (1996) 1–74.
[81] M. Burkardt, Light front hamiltonians and confinement, hep-ph/9512318.
[82] M. Burkardt, H. El-Khozondar, Phys. Rev. D 55 (1997) 6514–6521.
[83] M. Burkardt, Phys. Rev. D 54 (1996) 2913–2920.
[84] M. Burkardt, B. Klindworth, Phys. Rev. D 55 (1997) 1001–1012.
[85] M. Burkardt, Phys. Rev. D 57 (1998) 1136.
[86] F. Cardarelli, I.L. Grach, I.M. Narodetskii, G. Salme, S. Simula, Phys. Lett. B 349 (1995) 393–399.
[87] R. Carlitz, D. Heckathorn, J. Kaur, W.-K. Tung, Phys. Rev. D 11 (1975) 1234.
[88] R. Carlitz, W.-K. Tung, Phys. Rev. D 13 (1976) 3446.
[89] C.E. Carlson, J.L. Poor, Phys. Rev. D 38 (1988) 2758.
[90] A. Carroll, Lecture at the Workshop on Exclusive Processes at High Momentum Transfer, Elba, Italy, 1993.
[91] A. Casher, Phys. Rev. D 14 (1976) 452.
[92] S. Chang, S. Ma, Phys. Rev. 180 (1969) 1506.
[93] S.J. Chang, Phys. Rev. D 13 (1976) 2778.
[94] S.J. Chang, T.M. Yan, Phys. Rev. D 7 (1973) 1147.
[95] S.J. Chang, R.G. Root, T.M. Yan, Phys. Rev. D 7 (1973) 1133.
[96] S.J. Chang, R.G. Root, T.M. Yan, Phys. Rev. D 7 (1973) 1133.
[97] L. Chao, Mod. Phys. Lett. A 8 (1993) 3165–3172.
[98] Z. Chen, A.H. Mueller, Nucl. Phys. B 451 (1995) 579.
[99] H.Y. Cheng, C.Y. Cheung, C.W. Hwang, Phys. Rev. D 55 (1997) 1559–1577.
[100] V.L. Chernyak, A.R. Zhitnitskii, Phys. Rep. 112 (1984) 173.
[101] C.Y. Cheung, W.M. Zhang, G.-L. Lin, Phys. Rev. D 52 (1995) 2915–2925.
[102] P.L. Chung, W.N. Polyzou, F. Coester, B.D. Keister, Phys. Rev. C 37 (1988) 2000.
[103] P.L. Chung, F. Coester, Phys. Rev. D 44 (1991) 229.
[104] F.E. Close, An Introduction to Quarks and Partons, Academic Press, New York, 1979.
[105] F. Coester, W.N. Polyzou, Phys. Rev. D 26 (1982) 1349.
[106] F. Coester, Prog. Nuc. Part. Phys. 29 (1992) 1.
[107] F. Coester, W. Polyzou, Found. Phys. 24 (1994) 387–400.
[108] S. Coleman, Comm. Math. Phys. 31 (1973) 259.
[109] S. Coleman, R. Jackiw, L. Susskind, Ann. Phys. (NY) 93 (1975) 267.
[110] S. Coleman, Ann. Phys. (NY) 101 (1976) 239.
[111] J. Collins, Renormalization, Cambridge University Press, New York, 1984.
[112] M.E. Convery, C.C. Taylor, J.W. Jun, Phys. Rev. D 51 (1995) 4445–4450.
[113] D.P. Crewther, C.J. Hamer, Nucl. Phys. B 170 (1980) 353.
[114] R.H. Dalitz, F.J. Dyson, Phys. Rev. 99 (1955) 301.
[115] S. Dalley, I.R. Klebanov, Phys. Rev. D 47 (1993) 2517.
[116] S. Dalley, B. van de Sande, Nucl. Phys B Proc. Suppl. 53 (1997) 827–830.
[117] S.M. Dancoff, Phys. Rev. 78 (1950) 382.
[118] R. Dashen, M. Gell-Mann, Phys. Rev. Lett. 17 (1966) 340.
[119] S.P. De Alwis, Nucl. Phys. B 55 (1973) 427.
[120] S.P. De Alwis, J. Stern, Nucl. Phys. B 77 (1974) 509.
[121] K. Demeterfi, I.R. Klebanov, G. Bhanot, Nucl. Phys. B 418 (1994) 15.
[122] N.B. Demchuk, P.Yu. Kulikov, I.M. Narodetskii, P.J. O’Donnell, Phys. Atom. Nucl. 60 (1997) 1292–1304.
[123] P.A.M. Dirac, Rev. Mod. Phys. 21 (1949) 392.
[124] P.A.M. Dirac, Can. Jour. Math. 2 (1950) 129.
[125] P.A.M. Dirac, The Principles of Quantum Mechanics, 4th ed., Oxford Univ. Press, Oxford, 1958.
[126] P.A.M. Dirac, Lectures on Quantum Mechanics, Belfer Graduate School of Science, Yeshiva University, 1964.
[127] P.A.M. Dirac, in: D.W. Duke, J.F. Owens (Eds.), Perturbative Quantum Chromodynamics, Am. Inst. Phys., New York, 1981.
[128] P.J. O’Donnell, Q.P. Xu, H.K.K. Tung, Phys. Rev. D 52 (1995) 3966–3977.
[129] S.D. Drell, A.C. Hearn, Phys. Rev. Lett. 16 (1966) 908.
[130] S.D. Drell, D. Levy, T.M. Yan, Phys. Rev. 187 (1969) 2159.
[131] S.D. Drell, D. Levy, T.M. Yan, Phys. Rev. D 1 (1970) 1035.
[132] S.D. Drell, D. Levy, T.M. Yan, Phys. Rev. D 1 (1970) 1617.
[133] S.D. Drell, T.M. Yan, Phys. Rev. Lett. 24 (1970) 181.
[134] A.Yu. Dubin, A.B. Kaidalov, Yu.A. Simonov, Phys. Lett. B 343 (1995) 310–314.
[135] E. Eichten, F. Feinberg, J.F. Willemsen, Phys. Rev. D 8 (1973) 1204.
[136] M.B. Einhorn, Phys. Rev. D 14 (1976) 3451.
[137] T. Eller, H.C. Pauli, S.J. Brodsky, Phys. Rev. D 35 (1987) 1493.
[138] T. Eller, H.C. Pauli, Z. Physik 42 C (1989) 59.
[139] S. Elser, Hadron Structure ’94, Kosice, Slowakia, 1994; Diplomarbeit, U. Heidelberg, 1994.
[140] S. Elser, A.C. Kalloniatis, Phys. Lett. B 375 (1996) 285–291.
[141] G. Fang et al., Presented at the INT-Fermilab Workshop on Perspectives of High Energy Strong Interaction Physics at Hadron Facilities, 1993.
[142] F.L. Feinberg, Phys. Rev. D 7 (1973) 540.
[143] T.J. Fields, K.S. Gupta, J.P. Vary, Mod. Phys. Lett. A 11 (1996) 2233–2240.
[144] R.P. Feynman, Phys. Rev. Lett. 23 (1969) 1415.
[145] R.P. Feynman, Photon-Hadron Interactions, Benjamin, Reading, MA, 1972.
[146] V.A. Franke, Yu.A. Novozhilov, E.V. Prokhvatilov, Lett. Math. Phys. 5 (1981) 239.
[147] V.A. Franke, Yu.A. Novozhilov, E.V. Prokhvatilov, Lett. Math. Phys. 5 (1981) 437.
[148] V.A. Franke, Yu.A. Novozhilov, E.V. Prokhvatilov, in: Dynamical Systems and Microphysics, Academic Press, New York, 1982, pp. 389–400.
[149] L. Frankfurt, T.S.H. Lee, G.A. Miller, M. Strikman, Phys. Rev. C 55 (1997) 909.
[150] H. Fritzsch, M. Gell-Mann, Proc. 16th. Int. Conf. on HEP, Batavia, IL, 1972.
[151] H. Fritzsch, Mod. Phys. Lett. A 5 (1990) 625.
[152] H. Fritzsch, Constituent Quarks, Chiral Symmetry, the Nucleon Spin, talk given at the Leipzig Workshop on Quantum Field Theory Aspects of High Energy Physics CERN, September 1993, preprint TH. 7079/93.
[153] S. Fubini, G. Furlan, Physics 1 (1965) 229.
[154] S. Fubini, A.J. Hanson, R. Jackiw, Phys. Rev. D 7 (1973) 1732.
[155] M.G. Fuda, Phys. Rev. C 36 (1987) 702–709.
[156] M.G. Fuda, Phys. Rev. D 42 (1990) 2898–2910.
[157] M.G. Fuda, Phys. Rev. D 44 (1991) 1880–1890.
[158] M.G. Fuda, Nucl. Phys. A 543 (1992) 111c–126c.
[159] M.G. Fuda, Ann. Phys. 231 (1994) 1–40.
[160] M.G. Fuda, Phys. Rev. C 52 (1995) 1260–1269.
[161] M.G. Fuda, Phys. Rev. D 54 (1996) 5135–5147.
[162] T. Fujita, A. Ogura, Prog. Theor. Phys. 89 (1993) 23–36.
[163] T. Fujita, C. Itoi, A. Ogura, M. Taki, J. Phys. G 20 (1994) 1143–1157.
[164] T. Fujita, Y. Sekiguchi, Prog. Theor. Phys. 93 (1995) 151–160.
[165] M. Fujita, Sh.M. Shvartsman, Role of zero modes in quantization of QCD in light-cone coordinates, CWRU-TH95-11, hep-th/9506046, June 1995, 21pp.
[166] T. Fujita, M. Hiramoto, H. Takahashi, Bound states of (1+1)-dimensional field theories, hep-th/9609224.
[167] T. Fujita, K. Yamamoto, Y. Sekiguchi, Ann. Phys. 255 (1997) 204–227.
[168] M. Funke, V. Kaulfass, H. Kummel, Phys. Rev. D 35 (1987) 621.
[169] P. Gaete, J. Gamboa, I. Schmidt, Phys. Rev. D 49 (1994) 5621–5624.
[170] M. Gari, N.G. Stephanis, Phys. Lett. B 175 (1986) 462.
[171] A. Gasser, H. Leutwyler, Nucl. Phys. B 94 (1975) 269.
[172] M. Gell-Mann, Phys. Lett. 8 (1964) 214.
[173] M. Gell-Mann, Lectures given at the 1972 Schladming Winter School, Acta Phys. Austriaca 9 (Suppl.) 733 (1972).
[174] S.B. Gerasimov, Yad. Fiz. 2 (1965) 598 [Sov. J. Nucl. Phys. 2 (1966) 430].
[175] S.N. Ghosh, Phys. Rev. D 46 (1992) 5497–5503.
[176] S. Głazek, Acta Phys. Pol. B 15 (1984) 889.
[178] S. Głazek, R.J. Perry, Phys. Rev. D 45 (1992) 3734.
[179] S. Głazek, R.J. Perry, Phys. Rev. D 45 (1992) 3740.
[180] S. Głazek, K.G. Wilson, Phys. Rev. D 47 (1993) 4657.
[181] S. Głazek, K.G. Wilson, Phys. Rev. D 48 (1993) 5863.
[182] S. Głazek, A. Harindranath, S. Pinsky, J. Shigemitsu, K. Wilson, Phys. Rev. D 47 (1993) 1599.
[183] S. Głazek, K.G. Wilson, Phys. Rev. D 49 (1994) 4214.
[184] S. Głazek, K.G. Wilson, Phys. Rev. D 49 (1994) 6720.
[185] S. Glazek (Ed.), Theory of Hadrons and Light-front QCD, World Scientific, Singapore, 1995.
[186] H. Goldstein, Classical Mechanics, Addison-Wesley, Reading, MA, 1950.
[187] I.L. Grach, I.M. Narodetskii, S. Simula, Phys. Lett. B 385 (1996) 317–323.
[188] P. Grangé, A. Neveu, H.C. Pauli, S. Pinsky, E. Werner (Eds.), New Non-perturbative Methods and Quantization on the Light Cone, “Les Houches Series”, Vol. 8, Springer, Berlin, 1997; Proc. Workshop at Centre de Physique des Houches, France, 24 February–7 March 1997.
[189] V.N. Gribov, Nucl. Phys. B 139 (1978) 1.
[190] P.A. Griffin, Nucl. Phys. B 372 (1992) 270.
[191] P.A. Griffin, Phys. Rev. D 46 (1992) 3538–3543.
[192] D. Gromes, H.J. Rothe, B. Stech, Nucl. Phys. B 75 (1974) 313.
[193] D.J. Gross, I.R. Klebanov, A.V. Matytsin, A.V. Smilga, Nucl. Phys. B 461 (1996) 109–130.
[194] E. Gubankova, F. Wegner, Exact renormalization group analysis in Hamiltonian theory: I. QED Hamiltonian on the light front, hep-th/9702162.
[195] C.J. Hamer, Nucl. Phys. B 121 (1977) 159.
[196] C.J. Hamer, Nucl. Phys. B 132 (1978) 542.
[197] C.J. Hamer, Nucl. Phys. B 195 (1982) 503.
[198] K. Harada, A. Okazaki, M. Taniguchi, M. Yahiro, Phys. Rev. D 49 (1994) 4226–4245.
[199] K. Harada, A. Okazaki, M. Taniguchi, Phys. Rev. D 52 (1995) 2429–2438.
[200] K. Harada, A. Okazaki, Phys. Rev. D 55 (1997) 6198–6208.
[201] K. Harada, A. Okazaki, M. Taniguchi, Phys. Rev. D 54 (1996) 7656–7663.
[202] K. Harada, A. Okazaki, M. Taniguchi, Phys. Rev. D 55 (1997) 4910–4919.
[203] A. Harindranath, J.P. Vary, Phys. Rev. D 36 (1987) 1141.
[204] A. Harindranath, J.P. Vary, Phys. Rev. D 37 (1988) 1064–1069.
[205] A. Harindranath, R.J. Perry, J. Shigemitsu, Ohio State preprint, 1991.
[206] T. Heinzl, S. Krusche, E. Werner, B. Zellermann, Phys. Lett. B 272 (1991) 54.
[207] T. Heinzl, S. Krusche, E. Werner, Phys. Lett. B 256 (1991) 55.
[208] T. Heinzl, S. Krusche, E. Werner, Nucl. Phys. A 532 (1991) 4290.
[209] T. Heinzl, S. Krusche, S. Simburger, E. Werner, Z. Phys. C 56 (1992) 415.
[210] T. Heinzl, S. Krusche, E. Werner, Phys. Lett. B 275 (1992) 410.
[211] T. Heinzl, E. Werner, Z. Phys. C 62 (1994) 521–532.
[212] T. Heinzl, Nucl. Phys. Proc. Suppl. B 39C (1995) 217–219.
[213] T. Heinzl, Hamiltonian formulations of Yang-Mills quantum theory, the Gribov problem, hep-th/9604018.
[214] T. Heinzl, Phys. Lett. B 388 (1996) 129–136.
[215] T. Heinzl, Nucl. Phys. Proc. Suppl. A 54 (1997) 194–197.
[216] S. Heppelmann, Nucl. Phys. B Proc. Suppl. 12 (1990) 159 and references therein.
[217] J.E. Hetrick, Nucl. Phys. B 30 (1993) 228–231.
[218] J.E. Hetrick, Int. J. Mod. Phys. A 9 (1994) 3153.
[219] M. Heyssler, A.C. Kalloniatis, Phys. Lett. B 354 (1995) 453.
[220] J.R. Hiller, Phys. Rev. D 43 (1991) 2418.
[221] J.R. Hiller, Phys. Rev. D 44 (1991) 2504.
[222] J.R. Hiller, S.J. Brodsky, Y. Okamoto (in progress).
[223] J. Hiller, S.S. Pinsky, B. van de Sande, Phys. Rev. D 51 (1995) 726.
[224] L.C.L. Hollenberg, K. Higashijima, R.C. Warner, B.H.J. McKellar, Prog. Theor. Phys. 87 (1991) 3411.
[225] L.C.L. Hollenberg, N.S. Witte, Phys. Rev. D 50 (1994) 3382.
[226] K. Hornbostel, S.J. Brodsky, H.C. Pauli, Phys. Rev. D 38 (1988) 2363.
[227] K. Hornbostel, S.J. Brodsky, H.C. Pauli, Phys. Rev. D 41 (1990) 3814.
[228] K. Hornbostel, Constructing Hadrons on the Light Cone, Workshop on From Fundamental Fields to Nuclear Phenomena, Boulder, CO, 20–22 September 1990; Cornell preprint CLNS 90/1038, 1990.
[229] K. Hornbostel, Cornell Preprint CLNS 91/1078, August 1991.
[230] K. Hornbostel, Phys. Rev. D 45 (1992) 3781.
[231] S. Hosono, S. Tsujimaru, Int. J. Mod. Phys. A 8 (1993) 4627–4648.
[232] S.Z. Huang, W. Lin, Ann. Phys. 226 (1993) 248–270.
[233] T. Hyer, Phys. Rev. D 49 (1994) 2074–2080.
[234] M. Ida, Progr. Theor. Phys. 51 (1974) 1521.
[235] M. Ida, Progr. Theor. Phys. 54 (1975) 1199.
[236] M. Ida, Progr. Theor. Phys. 54 (1975) 1519.
[237] M. Ida, Progr. Theor. Phys. 54 (1975) 1775.
[238] N. Isgur, C.H. Llewellyn Smith, Phys. Rev. B 217 (1989) 2758.
[239] K. Itakura, Phys. Rev. D 54 (1996) 2853–2862.
[240] K. Itakura, Dynamical symmetry breaking in light front Gross-Neveu model, UT-KOMABA-96-15, August 1996, 12pp. hep-th/9608062.
[241] K. Itakura, S. Maedan, Prog. Theor. Phys. 97 (1997) 635–652.
[242] C. Itzykson, J.B. Zuber, Quantum Field Theory, McGraw-Hill, New York, 1985.
[244] J. Jackiw, in: A. Ali, P. Hoodbhoy (Eds.), M.A.B. Bég Memorial Volume, World Scientific, Singapore, 1991.
[245] R. Jackiw, N.S. Manton, Ann. Phys. (NY) 127 (1980) 257.
[246] W. Jaus, Phys. Rev. D 41 (1990) 3394.
[247] J. Jersák, J. Stern, Nucl. Phys. B 7 (1968) 413.
[248] J. Jersák, J. Stern, Nuovo Cimento 59 (1969) 315.
[249] C.R. Ji, S.J. Brodsky, Phys. Rev. D 34 (1986) 1460; D 33 (1986) 1951, 1406, 2653.
[250] X.D. Ji, Comments Nucl. Part. Phys. 21 (1993) 123–136.
[251] C.R. Ji, Phys. Lett. B 322 (1994) 389–396.
[252] C.R. Ji, G.H. Kim, D.P. Min, Phys. Rev. D 51 (1995) 879–889.
[253] C.R. Ji, S.J. Rey, Phys. Rev. D 53 (1996) 5815–5820.
[254] C.R. Ji, A. Pang, A. Szczepaniak, Phys. Rev. D 52 (1995) 4038–4041.
[255] B.D. Jones, R.J. Perry, Phys. Rev. D 55 (1997) 7715–7730.
[256] B.D. Jones, R.J. Perry, S.D. Glazek, Phys. Rev. D 55 (1997) 6561–6583.
[257] J.W. Jun, C.K. Jue, Phys. Rev. D 50 (1994) 2939–2941.
[258] A.C. Kalloniatis, H.C. Pauli, Z. Phys. C 60 (1993) 255.
[259] A.C. Kalloniatis, H.C. Pauli, Z. Phys. C 63 (1994) 161.
[260] A.C. Kalloniatis, D.G. Robertson, Phys. Rev. D 50 (1994) 5262.
[261] A.C. Kalloniatis, H.C. Pauli, S.S. Pinsky, Phys. Rev. D 50 (1994) 6633.
[262] A.C. Kalloniatis, Phys. Rev. D 54 (1996) 2876.
[263] A.C. Kalloniatis, D.G. Robertson, Phys. Lett. B 381 (1996) 209–215.
[264] M. Kaluża, H.C. Pauli, Phys. Rev. D 45 (1992) 2968.
[265] M. Kaluża, H.-J. Pirner, Phys. Rev. D 47 (1993) 1620.
[266] G. Karl, Phys. Rev. D 45 (1992) 247.
[267] V.A. Karmanov, Nucl. Phys. B 166 (1980) 378.
[268] V.A. Karmanov, Nucl. Phys. A 362 (1981) 331.
[269] B.D. Keister, Phys. Rev. C 43 (1991) 2783–2790.
[270] B.D. Keister, Phys. Rev. D 49 (1994) 1500–1505.
[271] Y. Kim, S. Tsujimaru, K. Yamawaki, Phys. Rev. Lett. 74 (1995) 4771–4774; Erratum-ibid. 75 (1995) 2632.
[272] D. Klabučar, H.C. Pauli, Z. Phys. C 47 (1990) 141.
[273] J.R. Klauder, H. Leutwyler, L. Streit, Nuovo Cimento 59 (1969) 315.
[274] J.B. Kogut, D.E. Soper, Phys. Rev. D 1 (1970) 2901.
[275] J.B. Kogut, L. Susskind, Phys. Rep. C 8 (1973) 75.
[276] S. Kojima, N. Sakai, T. Sakai, Prog. Theor. Phys. 95 (1996) 621–636.
[277] J. Kondo, Solid State Physics 23 (1969) 183.
[278] V.G. Koures, Phys. Lett. B 348 (1995) 170–177.
[279] M. Krautgärtner, H.C. Pauli, F. Wölz, Phys. Rev. D 45 (1992) 3755.
[280] A.D. Krisch, Nucl. Phys. B (Proc. Suppl.) B 25 (1992) 285.
[281] H.R. Krishnamurthy, J.W. Wilkins, K.G. Wilson, Phys. Rev. B 21 (1980) 1003.
[282] H. Kröger, R. Girard, G. Dufour, Phys. Rev. D 35 (1987) 3944.
[283] H. Kröger, H.C. Pauli, Phys. Lett. B 319 (1993) 163–170.
[284] A.S. Kronfeld, B. Nizic, Phys. Rev. D 44 (1991) 3445.
[285] S. Krusche, Phys. Lett. B 298 (1993) 127–131.
[286] W. Kwong, P.B. Mackenzie, R. Rosenfeld, J.L. Rosner, Phys. Rev. D 37 (1988) 3210.
[289] P.V. Landshoff, Phys. Rev. D 10 (1974) 10241.
[290] E. Langmann, G.W. Semenoff, Phys. Lett. B 296 (1992) 117.
[291] A. Langnau, S.J. Brodsky, J. Comput. Phys. 109 (1993) 84–92.
[292] A. Langnau, M. Burkardt, Phys. Rev. D 47 (1993) 3452–3464.
[293] T.D. Lee, Phys. Rev. 95 (1954) 1329.
[294] P. Lenz, F.J. Wegner, cond-mat/9604087; Nucl. Phys. B 482 (1996) 693.
[295] F. Lenz, in: D. Vautherin, F. Lenz, J.W. Negele (Eds.), Nonperturbative Quantum Field Theory, Plenum Press, New York, 1990.
[296] F. Lenz, M. Thies, S. Levit, K. Yazaki, Ann. Phys. 208 (1991) 1.
[297] G.P. Lepage, S.J. Brodsky, Phys. Lett. B 87 (1979) 359.
[298] G.P. Lepage, S.J. Brodsky, Phys. Rev. Lett. 43 (1979) 545, 1625 (E).
[299] G.P. Lepage, S.J. Brodsky, Phys. Rev. D 22 (1980) 2157.
[300] G.P. Lepage, S.J. Brodsky, T. Huang, P.B. Mackenzie, in: A.Z. Capri, A.N. Kamal (Eds.), Particles and Fields 2, Plenum Press, New York, 1983.
[301] G.P. Lepage, B.A. Thacker, CLNS-87, 1987.
[302] H. Leutwyler, Acta Phys. Austriaca 5 (Suppl.) (1968) 320.
[303] H. Leutwyler, in: Springer Tracts in Modern Physics, Vol. 50, Springer, New York, 1969, p. 29.
[304] H. Leutwyler, Phys. Lett. B 48 (1974) 45.
[305] H. Leutwyler, Phys. Lett. B 48 (1974) 431.
[306] H. Leutwyler, Nucl. Phys. B 76 (1974) 413.
[307] H. Leutwyler, J. Stern, Ann. Phys. 112 (1978) 94.
[308] H. Li, G. Sterman, Nucl. Phys. B 381 (1992) 129.
[309] N.E. Ligterink, B.L.G. Bakker, Phys. Rev. D 52 (1995) 5917–5925.
[310] N.E. Ligterink, B.L.G. Bakker, Phys. Rev. D 52 (1995) 5954–5979.
[311] J. Lowenstein, A. Swieca, Ann. Phys. (NY) 68 (1971) 172.
[312] W. Lucha, F.F. Schöberl, D. Gromes, Phys. Rep. 200 (1991) 127.
[313] M. Luke, A.V. Manohar, M.J. Savage, Phys. Lett. B 288 (1992) 355.
[314] M. Lüscher, Nucl. Phys. B 219 (1983) 233.
[315] Y. Ma, J.R. Hiller, J. Comp. Phys. 82 (1989) 229.
[316] B.Q. Ma, J. Phys. G 17 (1991) L53.
[317] B.Q. Ma, Qi-Ren Zhang, Z. Phys. C 58 (1993) 479.
[318] B.Q. Ma, Z. Physik A 345 (1993) 321–325.
[319] B.Q. Ma, The proton spin structure in a light cone quark spectator diquark model, hep-ph/9703425.
[320] N. Makins et al., NE-18 Collaboration, MIT preprint, 1994.
[321] N.S. Manton, Ann. Phys. (NY) 159 (1985) 220.
[322] G. Martinelli, C.T. Sachrajda, Phys. Lett. B 217 (1989) 319.
[323] J.C. Maxwell, Treatise on Electricity and Magnetism, 3rd ed., 2 Vols., reprint by Dover, New York, 1954.
[324] T. Maskawa, K. Yamawaki, Prog. Theor. Phys. 56 (1976) 270.
[325] G. McCartor, Z. Phys. C 41 (1988) 271.
[326] G. McCartor, Z. Phys. C 52 (1991) 611.
[327] G. McCartor, D.G. Robertson, Z. Phys. C 53 (1992) 679.
[328] G. McCartor, D.G. Robertson, Z. Physik C 62 (1994) 349–356.
[329] G. McCartor, Z. Physik C 64 (1994) 349–354.
[330] G. McCartor, D.G. Robertson, Z. Physik C 68 (1995) 345–351.
[331] G. McCartor, D.G. Robertson, S. Pinsky, Vacuum structure of 2D gauge theories on the light front, hep-th/96112083.
[332] H.J. Melosh, Phys. Rev. D 9 (1974) 1095.
[333] A. Messiah, Quantum Mechanics, 2 Vols., North-Holland, Amsterdam, 1962.
[334] G.A. Miller, Phys. Rev. C 56 (1997) 8–11.
[335] J.A. Minahan, A.P. Polychronakos, Phys. Lett. B 326 (1994) 288.
[336] A. Misra, Phys. Rev. D 53 (1996) 5874–5885.
[337] P.M. Morse, H. Feshbach, Methods of Theoretical Physics, 2 Vols., McGraw-Hill, New York, 1953.
[338] Y. Mo, R.J. Perry, J. Comp. Phys. 108 (1993) 159.
[339] D. Mustaki, S. Pinsky, J. Shigemitsu, K. Wilson, Phys. Rev. D 43 (1991) 3411.
[340] D. Mustaki, S. Pinsky, Phys. Rev. D 45 (1992) 3775.
[341] D. Mustaki, Bowling Green State Univ. preprint, 1994.
[342] T. Muta, Foundations of Quantum Chromodynamics: Lecture Notes in Physics, vol. 5, World Scientific, Singapore, 1987.
[343] O. Nachtmann, Elementarteilchenphysik, Vieweg, Braunschweig, 1986.
[344] T. Nakatsu, K. Takasaki, S. Tsujimaru, Nucl. Phys. B 443 (1995) 155–200.
[345] J.M. Namyslowski, Prog. Part. Nucl. Phys. 74 (1984) 1.
[346] E. Noether, Kgl. Ges. d. Wiss. Nachrichten, Math.-phys. Klasse, Göttingen, 1918.
[347] Y. Nakawaki, Prog. Theor. Phys. 70 (1983) 1105.
[348] H.W.L. Naus, H.J. Pirner, T.J. Fields, J.P. Vary, QCD near the light cone, hep-th/9704135.
[349] A. Ogura, T. Tomachi, T. Fujita, Ann. Phys. 237 (1995) 12–45.
[350] H. Osborn, Nucl. Phys. B 80 (1974) 90.
[351] Particle Data Group, Phys. Rev. D 45 (1992) 1.
[352] H.C. Pauli, Nucl. Phys. A 396 (1981) 413.
[353] H.C. Pauli, Z. Phys. A 319 (1984) 303.
[354] H.C. Pauli, S.J. Brodsky, Phys. Rev. D 32 (1985) 1993.
[355] H.C. Pauli, S.J. Brodsky, Phys. Rev. D 32 (1985) 2001.
[356] H.C. Pauli, Nucl. Phys. A 560 (1993) 501.
[357] H.C. Pauli, in: B. Geyer, E.M. Ilgenfritz (Eds.), Quantum Field Theoretical Aspects of High Energy Physics, Naturwissenschaftlich Theoretisches Zentrum der Universität, Leipzig, 1993.
[358] H.C. Pauli, A.C. Kalloniatis, S.S. Pinsky, Phys. Rev. D 52 (1995) 1176.
[359] H.C. Pauli, J. Merkel, Phys. Rev. D 55 (1997) 2486–2496.
[360] H.C. Pauli, R. Bayer, Phys. Rev. D 53 (1996) 939.
[361] H.C. Pauli, Solving gauge field theory by discretized light-cone quantization, Heidelberg Preprint MPIH-V251996, hep-th/9608035.
[362] H.C. Pauli, in: B.N. Kursunoglu, S. Mintz, A. Perlmutter (Eds.), Neutrino Mass, Monopole Condensation, Dark matter, Gravitational waves, Light-Cone Quantization, Plenum Press, New York, 1996, pp. 183–204.
[363] R.J. Perry, A. Harindranath, K.G. Wilson, Phys. Rev. Lett. 65 (1990) 2959.
[364] R.J. Perry, A. Harindranath, Phys. Rev. D 43 (1991) 4051.
[365] R.J. Perry, Phys. Lett. B 300 (1993) 8.
[366] R.J. Perry, K.G. Wilson, Nucl. Phys. B 403 (1993) 587.
[367] R.J. Perry, Hamiltonian light-front field theory and quantum chromodynamics, Hadron 94, Gramado, Brasil, April 1994.
[368] R.J. Perry, Ann. Phys. 232 (1994) 116–222.
[369] V.N. Pervushin, Nucl. Phys. B 15 (1990) 197.
[370] I. Pesando, Mod. Phys. Lett. A 10 (1995) 525–538.
[371] I. Pesando, Mod. Phys. Lett. A 10 (1995) 2339–2352.
[372] S.S. Pinsky, Proc. 4th Conf. on the Intersections between Particle and Nuclear Physics, Tucson, AZ, May 1991, World Scientific, Singapore, 1991.
[373] S.S. Pinsky, Proc. Division of Particles and Fields of the APS, Vancouver, B.C., Canada, 18–22 August 1991, World Scientific, Singapore.
[374] S.S. Pinsky, in: J. Tran Thanh Van (Ed.), Proc. 27th Rencontre de Moriond, Les Arcs, Savoie, France, 22–28 March, 1992, Editions Frontieres, Dreux.
[375] S.S. Pinsky, Proc. Orbis Scientiae, 25–27 January 1993, Coral Gables, FL, Nova Science.
[376] S.S. Pinsky, in: S. Dubnicka, A. Dubnickova (Eds.), Proc. “Hadron Structure ’93”, Banska Stiavnica, Slovakia, 5–10 September 1993, Institute of Physics, Slovak Academy of Science.
[377] S.S. Pinsky, in: St. Glazek (Ed.), Proc. Theory of Hadrons and Light-Front QCD, Polona Zgorselisko, Poland, August 1994, World Scientific, Singapore.
[378] S.S. Pinsky, Proc. Orbis Scientiae, 25–28 January 1996, Coral Gables, FL, Nova Science.
[379] S.S. Pinsky, R. Mohr, Proc. Conf. on Low Dimensional Field Theory, Telluride Summer Research Institute, August 1996; Int. J. of Mod. Phys. A, to appear.
[380] S.S. Pinsky, Wilson loop on a Light-cone cylinder, hep-th/9702091.
[381] S. Pinsky, Phys. Rev. D 56 (1997) 5040–5049.
[382] S.S. Pinsky, B. van de Sande, Phys. Rev. D 49 (1994) 2001.
[383] S.S. Pinsky, A.C. Kalloniatis, Phys. Lett. B 365 (1996) 225–232.
[384] S.S. Pinsky, D.G. Robertson, Phys. Lett. B 379 (1996) 169–178.
[385] E.V. Prokhvatilov, V.A. Franke, Sov. J. Nucl. Phys. 49 (1989) 688.
[386] J. Przeszowski, H.W.L. Naus, A.C. Kalloniatis, Phys. Rev. D 54 (1996) 5135–5147.
[387] S.G. Rajeev, Phys. Lett. B 212 (1988) 203.
[388] D.G. Robertson, G. McCartor, Z. Phys. C 53 (1992) 661.
[389] D.G. Robertson, Phys. Rev. D 47 (1993) 2549–2553.
[390] F. Rohrlich, Acta Phys. Austriaca VIII (Suppl.) (1971) 2777.
[391] R.E. Rudd, Nucl. Phys. B 427 (1994) 81–110.
[392] M. Sawicki, Phys. Rev. D 32 (1985) 2666.
[393] M. Sawicki, Phys. Rev. D 33 (1986) 1103.
[394] H. Sazdjian, J. Stern, Nucl. Phys. B 94 (1975) 163.
[395] F. Schlumpf, Phys. Rev. D 47 (1993) 4114.
[396] F. Schlumpf, Phys. Rev. D 48 (1993) 4478.
[397] F. Schlumpf, Mod. Phys. Lett. A 8 (1993) 2135.
[398] F. Schlumpf, J. Phys. G 20 (1994) 237.
[399] N.C.J. Schoonderwoerd, B.L.G. Bakker, Equivalence of renormalized covariant and light front perturbation theory, 11pp. hep-ph/9702311, February 1997.
[400] S. Schweber, Relativistic Quantum Field Theory, Harper and Row, New York, 1961.
[401] J. Schwinger, Phys. Rev. 125 (1962) 397.
[402] J. Schwinger, Phys. Rev. 128 (1962) 2425.
[403] M.A. Shifman, A.I. Vainshtein, V.I. Zakharov, Nucl. Phys. B 147 (1979) 38.
[404] S. Simula, Phys. Lett. B 373 (1996) 193–199.
[405] C.M. Sommerfield, Yale preprint, July 1973.
[406] D.E. Soper, Ph.D. Thesis, 1971; SLAC Report No. 137, 1971.
[407] M.G. Sotiropoulos, G. Sterman, Nucl. Phys. B 425 (1994) 489.
[408] P.P. Srivastava, Nuovo Cim. A 107 (1994) 549–558.
[409] P. Stoler, Phys. Rev. Lett. 66 (1991) 1003.
[410] T. Sugihara, M. Matsuzaki, M. Yahiro, Phys. Rev. D 50 (1994) 5274–5288.
[411] T. Sugihara, M. Yahiro, Phys. Rev. D 53 (1996) 7239–7249.
[412] T. Sugihara, M. Yahiro, Phys. Rev. D 55 (1997) 2218–2226.
[413] K. Sundermeyer, Constrained Dynamics, Springer, New York, 1982.
[414] L. Susskind, G. Frye, Phys. Rev. 164 (1967) 2003.
[415] L. Susskind, Phys. Rev. 165 (1968) 1535.
[416] L. Susskind, Stanford preprint SU-ITP-97-11, April 1997, hep-th/9704080; hep-th/9611164.
[418] A. Szczepaniak, E.M. Henley, S.J. Brodsky, Phys. Lett. B 243 (1990) 287.
[419] A. Szczepaniak, L. Mankiewicz, Univ. of Florida Preprint, 1991.
[420] A. Tam, C.J. Hamer, C.M. Yung, J. Phys. G 21 (1995) 1463–1482.
[421] I. Tamm, J. Phys. (USSR) 9 (1945) 449.
[422] A.C. Tang, S.J. Brodsky, H.C. Pauli, Phys. Rev. D 44 (1991) 1842.
[423] M. Tachibana, Phys. Rev. D 52 (1995) 6008–6015.
[424] G. ’t Hooft, Published in Erice Subnucl. Phys. (1975) 261.
[425] M. Thies, K. Ohta, Phys. Rev. D 48 (1993) 5883–5894.
[426] C.B. Thorn, Phys. Rev. D 19 (1979) 639.
[427] C.B. Thorn, Phys. Rev. D 20 (1979) 1934.
[428] T. Tomachi, T. Fujita, NUP-A-91-10, September 1991, 45pp.
[429] U. Trittmann, H.C. Pauli, Heidelberg preprint MPI H-V4-1997, January 1997, hep-th/9704215.
[430] U. Trittmann, H.C. Pauli, Heidelberg preprint MPI H-V7-1997, April 1997, hep-th/9705021.
[431] U. Trittmann, Heidelberg preprint MPI H-V17-1997, April 1997, hep-th/9705072.
[432] W.K. Tung, Phys. Rev. 176 (1968) 2127.
[433] W.K. Tung, Group Theory in Physics, World Scientific, Singapore, 1985.
[434] P. van Baal, Nucl. Phys. B 369 (1992) 259.
[435] B. van de Sande, S. Pinsky, Phys. Rev. D 46 (1992) 5479.
[436] B. van de Sande, S.S. Pinsky, Phys. Rev. D 49 (1994) 2001.
[437] B. van de Sande, M. Burkardt, Phys. Rev. D 53 (1996) 4628.
[438] B. van de Sande, Phys. Rev. D 54 (1996) 6347.
[439] B. van de Sande, S. Dalley, Orbis Scientiae: Neutrino Mass, Dark Matter, Gravitational Waves, Condensation of Atoms, Monopoles, Light-Cone Quantization, Miami Beach, FL, 25–28 January 1996.
[440] J.B. Swenson, J.R. Hiller, Phys. Rev. D 48 (1993) 1774.
[441] S. Tsujimaru, K. Yamawaki, MPI H-V19-1997, April 1997, 47pp., hep-th/9704171.
[442] J.P. Vary, T.J. Fields, H.J. Pirner, ISU-NP-94-14, August 1994, 5pp., hep-ph/9411263.
[443] J.P. Vary, T.J. Fields, H.-J. Pirner, Phys. Rev. D 53 (1996) 7231–7238.
[444] F.J. Wegner, Phys. Rev. B 5 (1972) 4529.
[445] F.J. Wegner, Phys. Rev. B 6 (1972) 1891.
[446] F.J. Wegner, in: C. Domb, M.S. Green (Eds.), Phase Transitions and Critical Phenomena, vol. 6, Academic Press, London, 1976.
[447] F.J. Wegner, Annalen Physik 3 (1994) 77.
[448] S. Weinberg, Phys. Rev. 150 (1966) 1313.
[449] H. Weyl, Z. Phys. 56 (1929) 330.
[450] E. Wigner, Ann. Math. 40 (1939) 149.
[451] K.G. Wilson, Phys. Rev. B 140 (1965) 445.
[452] K.G. Wilson, Phys. Rev. D 2 (1970) 1438.
[453] K.G. Wilson, Rev. Mod. Phys. 47 (1975) 773.
[454] K.G. Wilson, in: A. Perlmutter (Ed.), New Pathways in High Energy Physics, vol. II, Plenum Press, New York, 1976, pp. 243–264.
[455] K.G. Wilson, in: R. Petronzio et al. (Eds.), Proc. Int. Symp. Capri, Italy, 1989 [Nucl. Phys. B (Proc. Suppl.) 17 (1989)].
[456] K.G. Wilson, T.S. Walhout, A. Harindranath, W.M. Zhang, R.J. Perry, S.D. Głazek, Phys. Rev. D 49 (1994) 6720–6766.
[457] R.S. Wittman, in: M.B. Johnson, L.S. Kisslinger (Eds.), Nuclear and Particle Physics on the Light Cone, World Scientific, Singapore, 1989.
[458] J.J. Wivoda, J.R. Hiller, Phys. Rev. D 47 (1993) 4647.
[459] P.M. Wort, Carleton University preprint, February 1992.
[460] C.M. Yung, C.J. Hamer, Phys. Rev. D 44 (1991) 2598.
[461] W.M. Zhang, A. Harindranath, Phys. Rev. D 48 (1993) 4868–4880.
[462] W.M. Zhang, A. Harindranath, Phys. Rev. D 48 (1993) 4881.
[463] W.M. Zhang, A. Harindranath, Phys. Rev. D 48 (1993) 4903.
[464] W.M. Zhang, Phys. Lett. B 333 (1994) 158–165.
[465] W.M. Zhang, G.L. Lin, C.Y. Cheung, Int. J. Mod. Phys. A 11 (1996) 3297–3306.
[466] W.M. Zhang, Phys. Rev. D 56 (1997) 1528–1548.
[467] D.C. Zheng, J.P. Vary, B.R. Barret, Nucl. Phys. A 560 (1993) 211.
[468] A.R. Zhitnitskii, Phys. Lett. B 165 (1985) 405.
[469] G. Zweig, CERN Reports Th. 401 and 412, 1964; in: A. Zichichi (Ed.), Proc. Int. School of Phys. Ettore Majorana, Erice, Italy, 1964, Academic, New York, p. 192.