Physics Reports 308 (1999) 1—64
Interdisciplinary application of nonlinear time series methods

Thomas Schreiber
Physics Department, University of Wuppertal, D-42097 Wuppertal, Germany

Received May 1998; editor: I. Procaccia

Contents
1. Introduction
2. Theoretical foundation
2.1. Dynamical systems and predictability
2.2. Phase space, embedding, Poincaré sections
2.3. Quantitative description
2.4. Comparing dynamics and attractors
3. Nonlinear analysis of limited data
3.1. Embedding finite, noisy time series
3.2. Practical aspects of embedding
3.3. Estimating dynamics and predicting
3.4. Estimating invariants
3.5. Non-invariant characterisation
3.6. Measures of dissimilarity
4. Nonstationarity
4.1. Moving windows
4.2. Recurrence plots
4.3. Tracing parameter variation
4.4. An application
5. Testing for nonlinearity
5.1. Detecting weak nonlinearity
5.2. Surrogate data tests
5.3. What can be learned
6. Nonlinear signal processing
6.1. Nonlinear noise reduction
6.2. Signal separation
7. Comparison and classification
7.1. Classification by histograms
7.2. Classification by clustering
8. Conclusion and future perspectives
Acknowledgements
References
Abstract

This paper reports on the application to field measurements of time series methods developed on the basis of the theory of deterministic chaos. It points out the major difficulties that arise when the data cannot be assumed to be purely deterministic, and discusses the potential that remains in this situation. For signals with weakly nonlinear structure, the presence of nonlinearity in a general sense has to be inferred statistically. The paper reviews the relevant methods and discusses the implications for deterministic modeling. Most field measurements yield nonstationary time series, which poses a severe problem for their analysis. Recent progress in the detection and understanding of nonstationarity is reported. If a clear signature of approximate determinism is found, the notions of phase space, attractors, invariant manifolds, etc., provide a convenient framework for time series analysis. Although the results have to be interpreted with great care, superior performance can be achieved for typical signal processing tasks. In particular, prediction and filtering of signals are discussed, as well as the classification of system states by means of time series recordings. © 1999 Elsevier Science B.V. All rights reserved.

PACS: 05.45.+b; 07.05.Kf; 02.50.Wp
Keywords: Time series; Data analysis; Nonlinear dynamics
PII: S0370-1573(98)00035-0
1. Introduction

The most direct link between chaos theory and the real world is the analysis of time series data in terms of nonlinear dynamics. Most of the fundamental properties of nonlinear dynamical systems have by now been observed in the laboratory. However, the usefulness of chaos theory in cases where the system is not manifestly deterministic is much more controversial. In particular, evidence for chaotic behaviour in field measurements has been claimed, and disputed, in many areas of science, including biology, physiology, and medicine; geo- and astrophysics, as well as the social sciences and finance. This article will take a critical look at the published literature, evaluating the perspectives and the limitations of the approach. While common misconceptions will be elucidated, I will try to adopt a constructive point of view by highlighting those cases where information in fact has been gained by the application of methods from chaos theory. Along with the treatment of conceptual issues, instructive practical examples from the literature and from work done in our group will be evaluated. I will try to depict the state of the art of the application of chaos theory to real time series. Neither naive enthusiasm to explain all kinds of unsolved time series problems by nonlinear determinism is justified, nor is the pessimistic view that no real system is ever sufficiently deterministic and thus out of reach for analysis. At least, chaos theory has inspired a new set of useful time series tools and provides a new language to formulate time series problems, and to find their solutions.

Previous works of review character are Grassberger et al. [1], Abarbanel et al. [2], as well as Kugiumtzis et al. [3,4]. Apart from a collection of research articles by Ott et al. [5], two books on nonlinear time series from the point of view of chaos theory are available so far, one by Abarbanel [6] and one by Kantz and Schreiber [7].
While in the former volume chaoticity is usually assumed — as already reflected in the title — the latter book puts some emphasis on practical applications to time series that are not manifestly found, nor simply assumed, to be deterministic and chaotic. Apart from these works, a number of conference proceedings volumes are devoted to chaotic time series, including Refs. [8—12]. Nonlinear time series methods that arise as extensions and generalisations of linear tools can be mostly found in the statistical literature. Major references are the books by Tong [13] and by Priestley [14]. The application of nonlinear time series methods to field measurements has been accompanied by considerable controversy in the literature. Early enthusiasm has led to straightforward attempts to find, and even quantify, deterministic chaos in many types of systems, ranging from atmospheric dynamics [15—17] and financial markets [18—20] to heart [21,22] and brain activity [23]. In Ref. [24] it is even claimed that cigarette smoking optimises the “dimensional complexity” of an indicator of brain function. This wave of publications has been followed by a number of critical papers pointing out the methodological deficiencies of the former. Some of these will be cited below in their proper contexts. This review is however not the place to repeat the known arguments or the discussion. In my opinion, part of the controversy and the resulting frustration is due to the misconception that low-dimensional chaos is such an appealing theory that it can be expected to be present generically in nature. Of course, most researchers would deny that they have made such an a priori assumption. Nevertheless, the amount of evidence we require for a “climate attractor”, etc., does depend on how likely, that is, how convincing, we find such a concept. For example, in
medical research it is extremely tempting to have a means of measuring the “complexity” of the cardiac rhythm or even the brain function.

Now, if we assume chaoticity in the sense of low-dimensional determinism as a starting point of our analysis, we can directly justify the use of delay coordinates by Takens’ theorem. The number of degrees of freedom in the system is readily estimated as the embedding dimension where the number of false neighbours drops below the noise floor, the rate of increase of uncertainty is identified with the Lyapunov exponent, and so forth. This rationale is pursued for example in Ref. [6]. An experience I share with many other researchers is that this way of proceeding is quite dangerous, since the possibilities for spurious results and wrong conclusions are overwhelming. As an example, take the analysis of the Salt Lake area data in Ref. [6]. Taking for granted that nonlinear dynamics is at work, locally linear phase space predictions seem most appropriate to forecast future values. The predictions shown in Ref. [6] seem fair enough, but a closer inspection of the data shows that even the simple linear rule of following the trend of the last two observations, $x_{n+1} = 2x_n - x_{n-1}$, is more appropriate than the local linear approach in that it gives forecast errors of about half the rms magnitude.

A second possibility is to try to establish low-dimensional chaos by positive evidence. We could for example look for self-similar geometry over a reasonable range of length scales and demonstrate that uncertainties indeed grow exponentially over a certain period of time. Eventually, we should be able to extract empirical deterministic models that predict future values and that can be iterated to yield time series with statistical properties comparable to the data. This approach is preferred by most theoretical researchers and has been emphasised for example in Ref. [7].
The major drawback is that only exceptional time series show such a clear signature, all of which are from laboratory experiments set up specifically for the study of chaotic phenomena. If this were the only way to go, applications to real world problems would have to be largely abandoned. Finally, we can move the focus of our study from the question of whether deterministic chaos is really present to the question of whether deterministic chaos provides a useful language for the evaluation of a given signal. The concept that superior performance alone is a valid argument for the use of a particular method is not as surprising for an engineer or clinician as it may be for a physicist. In particular in time series analysis, very few people actually believe that the stock market or the brain actually are linear autoregressive machines. Nevertheless, linear time series methods have been applied to time series from these systems with considerable success. Evidence for the practical superiority of chaotic time series methods has so far been rather scarce. In the following I will at no point assume deterministic chaos. However, I will first review in Section 2 the fundamentals of dynamical systems theory as the theoretical basis for nonlinear time series methods. In many cases this will only amount to a theoretical motivation; very few facts are rigorously proven for finite, noisy time series. I will briefly review the concepts of dynamical systems, strange attractors, phase space embedding, and the invariant characteristics of a process. Next I will try to give an understanding of the signatures of determinism in finite observations. Section 3 will discuss what happens to the theoretical concepts when they are applied to real data of finite resolution and length. Some limitations are known rigorously, others can be understood heuristically. Some problems seem to be of purely technical nature but nevertheless may prove to be serious in practice. 
In particular, extended embedding theorems and amendments of the embedding procedure will be discussed. Estimators for characteristic quantities like dimension, entropy, and Lyapunov exponents are studied with respect to their practical viability. This material
will allow us to gauge for a given time series problem how far we are from the linear case and how close we are to a nonlinear deterministic situation. Accordingly, we will choose either linear methods, coarse but robust nonlinear tools, or more refined phase space methods. One formal requirement for almost all time series methods is stationarity. Specific tests for nonstationarity in a nonlinear context that have been proposed in the literature are discussed in Section 4, where also hints will be given what can be done in the presence of nonstationarity apart from choosing a different time series problem. In Section 5, formal statistical tests for nonlinearity in a time series will be set up, with particular emphasis on possible nonlinear determinism in the data. The section will state what has to be done in order to perform such a test correctly, but it will also discuss what can (and what cannot) be learned from such a test. While standard solutions for the forecasting and filtering of linearly correlated but otherwise random sequences exist, and methods for strongly deterministic but chaotic systems are also well established, signals of a mixed character are more difficult to deal with. Section 6 will discuss the problems that arise in practice and give some specific successful applications, including medical data analysis. While most estimates of invariants based on short and noisy data are dubious as absolute numbers, many authors have found the comparison of such numbers, or of nonlinear qualitative features, across an ensemble of systems (e.g. sick and healthy patients) quite promising. As discussed in Section 7, some of these works are of little use since the discrimination task could have been solved equally well by standard methods, other examples seem to give results that are by far superior to previous approaches. Pure low-dimensional determinism is quite special and can be found in nature only to a crude approximation. 
The range of potential practical applications of nonlinear theory can only be increased significantly if the underlying paradigm is generalised in some respects. Current efforts concerning the analysis of data from extensively chaotic (e.g. spatio-temporal) systems, as well as from mixed, nonlinear and stochastic, sources are discussed in Section 8. In this paper, I will not give many technical details of the practical implementation of the methods. These will be reviewed in a forthcoming article [25], which is accompanied by the publicly available software package TISEAN, containing many of the methods discussed here.
Footnote: The TISEAN software package is publicly available for download from either http://www.mpipks-dresden.mpg.de/~tsa/TISEAN/docs/welcome.html or http://wptu38.physik.uni-wuppertal.de/Chaos/DOCS/welcome.html. The distribution includes an on-line documentation system.

2. Theoretical foundation

This section will briefly recall the definitions and properties of some concepts of chaos theory, insofar as they are relevant to applied time series problems. Most of the basic concepts are usually formulated in a purely deterministic setting; that is, without any external noise. The time evolution is then given by a dynamical system in phase space. Since usually the state points cannot be observed directly but only through a measurement function, typically involving a projection onto fewer variables than phase space dimensions, we have to recover the missing information in some way. This can be done by time delay embeddings and related methods. We can then quantify properties of the system through measurements made on the embedded time series. Since it is eventually the underlying system we want to characterise, these properties should ideally be unaffected by the measurement and the embedding procedure.

The presentation here serves the main purpose of fixing the notation for the following section and is therefore extremely brief. Theoretical issues that are directly related to time series methods are discussed in more detail in the monographs [6,7]. More general references on the theory of deterministic dynamical systems include the volumes by Ott [26], as well as the older books by Bergé et al. [27] and by Schuster [28]. More advanced material is contained in the work by Katok and Hasselblatt [29]. A gentle introduction to dynamics is given by Kaplan and Glass [30]. The volume by Tsonis [31] puts more emphasis on the applied side.

2.1. Dynamical systems and predictability

When we are trying to understand an irregular (which essentially here means non-periodic) sequence of measurements, an immediate question is what kind of process can generate such a series. In the deterministic picture, irregularity can be autonomously generated by the nonlinearity of the intrinsic dynamics. Let the possible states of a system be represented by points in a finite-dimensional phase space, say some $\mathbb{R}^d$. The transition from the system’s state $\mathbf{x}(t_1)$ at time $t_1$ to its state at time $t_2$ is then governed by a deterministic rule: $\mathbf{x}(t_2) = T_{t_2 - t_1}(\mathbf{x}(t_1))$. This can be realised either in continuous time by a set of ordinary differential equations:

$$\dot{\mathbf{x}}(t) = \mathbf{F}(\mathbf{x}(t)), \tag{1}$$
or in discrete time $t = n\Delta t$ by a map of $\mathbb{R}^d$ onto itself:

$$\mathbf{x}_{n+1} = \mathbf{f}(\mathbf{x}_n). \tag{2}$$

The family of transition rules $T_t$, or its realisation in the forms (1) or (2), is referred to as a dynamical system. The particular choice of $\mathbf{F}$ (resp. $\mathbf{f}$) allows for many types of solutions, ranging from fixed points and limit cycles to irregular behaviour. If the dynamics is dissipative (area contracting, the case assumed throughout this work), the points visited by the system after transient behaviour has died out will be concentrated on a subset of Lebesgue measure zero of phase space. This set is referred to as an attractor, and the set of points that are mapped onto it for $t \to \infty$ as its basin of attraction. Since not all points on an attractor are visited with the same frequency, one defines a measure $\mu(\mathbf{x})\,d\mathbf{x}$, the average fraction of time a typical trajectory spends in the phase space element $d\mathbf{x}$. In an ergodic system, $\mu(\mathbf{x})$ is the same for almost all initial conditions. Phase space averages taken with respect to $\mu(\mathbf{x})\,d\mathbf{x}$ are then equal to time averages taken over a typical trajectory.

Footnote: The notion of an attractor is mathematically difficult to define satisfactorily, see Milnor [96]. The existence of a natural measure has been proven only for hyperbolic systems, see Eckmann and Ruelle [227], and for a small number of specific systems, see for example Benedicks [228].

In real-world systems, pure determinism is rather unlikely to be realised since all systems somehow interact with their surroundings. Thus, the deterministic picture should be regarded only as a limiting case of a more general framework involving fluctuations in the environment and in the system itself. However, it is the limiting case that is best studied theoretically and that is expected to show the clearest signatures in observations.

The deterministic approach is not the most common way to explain irregularity in a time series. The traditional answer given by the time series literature is that external random influences may be acting on the system. The external randomness explains the irregularity, while linear dynamical rules may be sufficient to explain structure found in the sequence. The most general linear (univariate) model is the autoregressive moving average (ARMA) process, given by

$$x_n = \sum_{i=1}^{M} a_i x_{n-i} + \sum_{i=0}^{N} b_i \eta_{n-i}, \tag{3}$$

where $\{\eta_n\}$ are Gaussian uncorrelated random increments. The linear stochastic description is attractive mainly because many rigorous results are available, including the properties of finite sample estimators.

Most sources of irregular signals, take for example the brain or the atmosphere, are known to be nonlinear. Nevertheless, if many weakly coupled degrees of freedom are active, their evolution may be averaged to quantities that are to a good approximation Gaussian random variables. If this approximation is valid, it is also reasonable to assume that an observed degree of freedom interacts with the averaged variable in a mean field way, justifying the linear dynamics in Eq. (3). However, there are many situations where this approximation fails, for example if the degrees of freedom of a system act in a coherent way, which can happen in nonlinear systems even when the coupling among the degrees of freedom is weak.

The two paradigms, nonlinear deterministic and linear stochastic behaviour, are the extreme positions in the area spanned by the properties “nonlinearity” and “stochasticity”. They are singled out not because they are particularly realistic for most situations, but rather because of their paradigmatic role and their solid mathematical background. Since the literature abounds with premature conclusions, such as that a system that is not found to be linear must be deterministic instead, let us emphasise that we are dealing with rather narrow limiting cases by drawing a sketch of the world of time series models that also contains all kinds of mixtures, see Fig. 1.

The description of a particular time series by an empirical model will of course be guided by the paradigm adopted for the study. The idea is that one cannot possibly do a better modelling job than recovering the equations that actually govern the system under observation.
One should note, however, that a description of a system which includes external influences is only complete if the input sequence (for example in Eq. (3) the sequence of increments $\{\eta_n\}$) is known. One can ask what constitutes a complete description in the case of a deterministic system. In the mathematical sense, the system equations (1) or (2) together with the initial conditions are sufficient. For this to be true, both parts must be known with infinite precision, which is unphysical. If the system is chaotic, then even in the noise free case, errors in the initial condition and in the specification of the model will grow in time and will have to be corrected. Of course, as soon as noise is present, even if only in the measurement procedure, the situation becomes worse. If we could recover $\mathbf{F}$ (resp. $\mathbf{f}$) from the observations correctly, we still could not simply generate future values by applying $\mathbf{f}$ to the observed present state because we could not account for the noise. Thus, we may face the confusing situation that the original equations of motion do not necessarily give the best model in terms of predictions. (See Refs. [32,33] for illustrative examples.)
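To make the linear stochastic paradigm of Eq. (3) concrete, the following is a minimal sketch of how such a process could be simulated. The function name `simulate_arma`, the zero initial conditions, the burn-in length, and the example coefficients are all illustrative assumptions and are not taken from the paper or from TISEAN.

```python
import numpy as np

def simulate_arma(a, b, n_samples, burn_in=500, seed=0):
    """Simulate x_n = sum_{i=1}^{M} a_i x_{n-i} + sum_{i=0}^{N} b_i eta_{n-i}.

    a : AR coefficients a_1..a_M, b : MA coefficients b_0..b_N (assumed inputs).
    """
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    M, N = len(a), len(b) - 1
    total = n_samples + burn_in
    # Gaussian, uncorrelated increments; N extra values supply eta_{-N}..eta_{-1}
    eta = rng.standard_normal(total + N)
    # M leading zeros act as (arbitrary) initial conditions x_{-M}..x_{-1}
    x = np.zeros(total + M)
    for n in range(total):
        ar_part = np.dot(a, x[n:n + M][::-1])        # a_1 x_{n-1} + ... + a_M x_{n-M}
        ma_part = np.dot(b, eta[n:n + N + 1][::-1])  # b_0 eta_n + ... + b_N eta_{n-N}
        x[n + M] = ar_part + ma_part
    # Drop the initial conditions and the transient
    return x[M + burn_in:]

# Example: an AR(1) process, whose lag-one autocorrelation should be close to a_1
x = simulate_arma(a=[0.8], b=[1.0], n_samples=20000)
r1 = np.corrcoef(x[:-1], x[1:])[0, 1]
print("lag-one autocorrelation:", r1)
```

The irregularity of the output is entirely due to the input sequence of increments; the linear rule only shapes its correlations, which is the sense in which the description is incomplete without knowledge of that sequence.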
Fig. 1. Sketch of the variety of systems spanned by the properties “nonlinearity” and “stochasticity”. Areas where theoretical knowledge and technology for the analysis of time series are available are outlined. Besides the (possibly noisy) periodic oscillations (a), these are mainly the deterministic chaotic and the linear stochastic areas. Also the common routes to chaos (c) and (d), and extensions for small nonlinearity (b) or small noise (e) are marked. There are a few “islands” (f), like hidden Markov models and a few others, where a connection can be made between a nonlinear stochastic model approach and particular real-world phenomena. This sketch has been inspired by a similar representation by O. Kaplan.
Since there is not much that can be done by way of modelling the noise in the system, modelling deterministic systems will still attempt to fit a function $\mathbf{f}$ (or $\mathbf{F}$) to the data such that Eq. (2) (or (1), resp.) will hold to the best available approximation. The most widespread approach is to solve Eq. (2) in the least-squares sense, minimising

$$\sum_{n=1}^{N-1} \left[\mathbf{x}_{n+1} - \hat{\mathbf{f}}(\mathbf{x}_n)\right]^2. \tag{4}$$

Historical references on the prediction problem include the papers by Farmer and Sidorowich [34], and by Casdagli [35]. Advanced procedures are available that take better care of the noise problem, see Refs. [36,37], as well as Section 3.3. As an alternative to explicitly fitting equations of motion to the data, it has been proposed [38] to synchronise a model system with the observed phenomenon (which may be given a posteriori by a time series) as a means of identifying the correct model equations. This approach is conceptually less unequivocal since there exist examples where systems lock into a generalised synchronised mode although the systems are quite different in structure.

2.2. Phase space, embedding, Poincaré sections

An immediate consequence of the formulation of the dynamics in a vector space is that when analysing time series, we will almost always have only incomplete information. Although more and more multi-probe measurements are being carried out, still the vast majority of time series taken outside the laboratory are single-valued. But even if multiple simultaneous measurements are available, they will not typically cover all the degrees of freedom of the system. Fortunately, however, the missing information can be recovered from time delayed copies of the available signal, if certain requirements are fulfilled. The theoretical framework for this approach is set by a number
of theorems, all of which specify the precise conditions under which an attractor in delay coordinate space is equivalent to the original attractor of a dynamical system in phase space. Let $\{\mathbf{x}(t)\}$ be a trajectory of a dynamical system in $\mathbb{R}^d$ and $\{s(t) = s(\mathbf{x}(t))\}$ the result of a scalar measurement on it. Then a delay reconstruction with delay time $\tau$ and embedding dimension $m$ is given by

$$\mathbf{s}(t) = \bigl(s(t-(m-1)\tau),\; s(t-(m-2)\tau),\; \ldots,\; s(t)\bigr). \tag{5}$$

The celebrated delay embedding theorem by Takens [39] states that among all delay maps of dimension $m = 2d+1$, those that form an embedding of a compact manifold with dimension $d$ are dense, provided that the measurement function $s : \mathbb{R}^d \to \mathbb{R}$ is $C^2$ and that both the dynamics and the measurement function are generic in the sense that they couple all degrees of freedom. In the original version by Takens, $d$ is the integer dimension of a smooth manifold, the phase space containing the attractor. Thus $d$ can be much larger than the attractor dimension. Sauer et al. [40] were able to generalise the theorem to what they call the fractal delay embedding prevalence theorem. It states that under certain genericity conditions, the embedding property is already given when $m > 2d$, where $d$ now denotes the box counting dimension of the attractor of the dynamical system. Further generalisations in Ref. [40] assert that, provided sufficiently many coordinates are used, more general schemes than simple delay embeddings are also allowed. For practical purposes, filtered and SVD embeddings have interesting properties. Generalisations to periodically or stochastically driven systems (where the driving force is assumed to be known) are heuristically straightforward but the relevant genericity requirements are rather involved. The proofs have been worked out by Stark and coworkers [41].

Depending on the application, a reconstruction of the state space up to ambiguities on sets of measure zero may be tolerated. For example, for the determination of the correlation dimension, events of measure zero can be neglected and thus any embedding with a dimension larger than the (box counting) attractor dimension is sufficient [42,229].

Although the embedding theorems provide an important means of understanding the reconstruction procedure, none of them is formally applicable in practice. The reason is that they all deal with infinite, noise free trajectories of a dynamical system. It is not obvious that the theorems should be “approximately valid” if the requirements are “approximately fulfilled”, for example, if the data sequence is long but finite and reasonably clean but not noise free, which is the best we can hope for in time series analysis. Some of these issues will be discussed in Section 3.3.

Footnote: A finite piece of trajectory is a one-dimensional curve which is generically and trivially embeddable in three dimensions.

When analysing time continuous systems, Poincaré sections are an attractive alternative to the reconstruction with fixed delay times. Instead of the time continuous trajectory, only its intersection points with a fixed surface of section are regarded. Generically, the resulting set has a dimension which is exactly one less than the attractor dimension. This concept is particularly useful if the system is driven periodically and the surface of section can be taken as the hyperplane defined by a fixed phase of the driving force. In this case, the intersection points are equally spaced in time. Reducing the dimensionality of the problem can be an advantage but it comes at the price of reducing the number of available points for a statistical analysis.
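The delay reconstruction of Eq. (5) is straightforward to carry out on a sampled signal. The sketch below is one possible implementation, assuming that the delay is given as an integer number of sampling steps; the function name `delay_embed` and the quasi-periodic test signal are hypothetical choices made for illustration only.

```python
import numpy as np

def delay_embed(s, m, tau):
    """Return the delay vectors (s[t-(m-1)tau], ..., s[t-tau], s[t]) as rows.

    s   : scalar time series (1-d array)
    m   : embedding dimension
    tau : delay in units of the sampling interval (assumed to be an integer)
    """
    s = np.asarray(s, dtype=float)
    n_vectors = len(s) - (m - 1) * tau
    if m < 1 or tau < 1 or n_vectors < 1:
        raise ValueError("time series too short for this choice of m and tau")
    # Column j holds the signal shifted by j*tau samples, so that row i reads
    # (s[i], s[i+tau], ..., s[i+(m-1)tau]), i.e. Eq. (5) with t = i + (m-1)tau.
    return np.column_stack([s[j * tau: j * tau + n_vectors] for j in range(m)])

# Example: a quasi-periodic test signal embedded in three dimensions
t = 0.05 * np.arange(2000)
signal = np.sin(t) + 0.5 * np.sin(np.sqrt(2.0) * t)
vectors = delay_embed(signal, m=3, tau=10)
print(vectors.shape)
```

Note that the first $(m-1)\tau$ samples do not yield complete delay vectors, so the embedded series is shorter than the original one. For short recordings and large $m$ or $\tau$ this loss can be noticeable.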
2.3. Quantitative description

A time series is usually not a very compact representation of a time evolving phenomenon. It is necessary to condense the information and find a parametrisation that contains the features that are most relevant for the underlying system. Most ways to quantitatively describe a time series are derived from methods to describe an assumed underlying process. Thus, for example, measures of chaoticity in a time series are usually derived from measures of chaoticity in a dynamical system. The rationale is that a certain class of processes is assumed to have generated the time series and then the measure quantifying that process is estimated from the data. Therefore, it is often necessary to distinguish between the abstract quantity, for example the power spectrum of a stochastic process, and its estimate from a time series, for example the periodogram.

Since the underlying process is only observed through some measurement procedure, it is most useful to attempt to estimate quantities that are invariant under reasonable changes in the measurement procedure. As will be seen below (in Section 3.4), the finite resolution and duration of time series recordings damage the invariance properties of quantities which are formally invariant for infinite data. If the value of an observable depends on the observation procedure, it loses its value as an absolute characteristic. While in some cases we can still make approximate statements, the interpretation of results has to be undertaken with great care. If we want to compare the results between different experiments, we at least have to unify the measurement procedure and the details of the analysis. If in a realistic situation invariance has been given up anyway, the quantities discussed in the following are no longer singled out that strongly among all possible ways of turning a time series into a number.
Consequently, there is no lack of ad hoc definitions and characteristics that have been used in the literature. Since they are invariably defined for time series, rather than the underlying processes, some of them will be discussed later in Section 3.

2.3.1. Linear observables

In the linear approach to time series analysis, a quantitative characterisation of a process is done on the basis of either the two-point autocovariance function or the power spectrum. If only a finite time series $\{s_n\}$, $n = 1, \ldots, N$, is available, the autocovariance function can be estimated, e.g. by

$$C(\tau) = \frac{1}{N-\tau} \sum_{n=\tau+1}^{N} s_n s_{n-\tau}. \tag{6}$$

Depending on the circumstances, other estimators may be preferable. A whole branch of research is devoted to the proper estimation of the power spectrum from a time series. The simplest estimator, known as the periodogram $P_k$, is based on the Fourier transform of $\{s_n\}$,

$$S_k = \sum_{n=0}^{N-1} s_n\, e^{2\pi i k n / N}, \tag{7}$$

through $P_k = |S_k|^2$. Issues of spectral estimation will not be discussed here. An introduction and pointers to the literature can be found for example in Numerical Recipes [43]. According to the Wiener–Khinchin theorem, the power spectrum of a process equals the Fourier transform of its autocovariance function. For finite time series this is only true if either $C(\tau)$ is computed on a periodically continued version of $\{s_n\}$, or $P_k$ is computed on a version of $\{s_n\}$ that is
extended to $n = -N, \ldots, N$ by padding with $N$ zeroes. Nevertheless, both descriptions contain basically the same information, only that it is presented in different forms. Furthermore, there is a direct connection between the power spectrum and the coefficients of an ARMA model, Eq. (3), yielding a third possible representation.

The power spectrum of a process (and its autocovariance function) is unchanged by the time evolution of the system (if it is stationary, see Section 4 below). However, it is affected by smooth coordinate changes, e.g. by the characteristics of a measurement device. Usually, the noninvariance of the power spectrum is not a serious drawback. The power spectrum is most useful for the study of oscillatory signals with sharp frequency peaks. The location of these peaks is conserved; only their relative magnitude may be affected by the change of coordinates, see Refs. [230,231]. Sharp peaks in the power spectrum indicate oscillatory behaviour and are useful indicators in linear as well as in nonlinear signals. Broad band contributions, however, have a less clear interpretation since they can be due to either deterministic or stochastic irregularity. Therefore, the power spectrum is only of limited use for the study of signals with possible nonlinear deterministic structure.

2.3.2. Lyapunov exponents

The hallmark of deterministic chaos is the sensitive dependence of future states on the initial conditions. An initial infinitesimal perturbation will typically grow exponentially; the growth rate is called the Lyapunov exponent. Let $\mathbf{x}_k$ and $\mathbf{x}_l$ be two points in state space with distance $\|\mathbf{x}_k - \mathbf{x}_l\| = d_0 \ll 1$. Denote by $d_{\Delta n}$ the distance after a time $\Delta n$ between the two trajectories emerging from these points, $d_{\Delta n} = \|\mathbf{x}_{k+\Delta n} - \mathbf{x}_{l+\Delta n}\|$. Then the Lyapunov exponent $\lambda$ is determined by

$$d_{\Delta n} \simeq d_0\, e^{\lambda \Delta n}, \qquad d_0 \ll 1, \quad \Delta n \gg 1. \tag{8}$$

A positive, finite value of $\lambda$ means an exponential divergence of nearby trajectories, which defines chaos.
A mathematically more rigorous definition will have to involve a first limit $\delta_0 \to 0$ such that a second limit $\Delta n \to \infty$ can be performed without saturation due to the finite size of the attractor. Here, only the single (maximal) Lyapunov exponent will be discussed. Lyapunov spectra can be defined that take into account the different growth rates in different local directions of phase space. However, the nonleading exponents are notoriously difficult to estimate from time series data. Only in very few cases of clean laboratory time series have trustworthy results been obtained so far. (See [7] for a discussion of the arising problems.) For field data, Lyapunov spectra beyond the first exponent have not so far been demonstrated to be a useful concept.

There have been a number of attempts to generalise the Lyapunov exponent to systems which are not purely deterministic. For the usual definition, an arbitrarily small amount of noise leads to a diffusive separation of initially close trajectories and a divergent Lyapunov exponent (mind the order of the two limits involved). For very small noise levels, there may still be a range of length scales where the separation proceeds exponentially, until the finite size saturation is reached. This is the behaviour that is probed by the real space methods of estimating Lyapunov exponents from data, in particular the two very similar algorithms introduced independently by Rosenstein et al. [44] and by Kantz [45]. From the theoretical point of view, intermediate length scale definitions are less attractive since the resulting quantities are no longer invariant under smooth coordinate transformations.
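As a rough illustration of the real space approach, the following sketch averages the logarithm of the distance between initially close segments of a scalar series and reads an exponent off the initial slope, in the sense of Eq. (8). It is written in the spirit of the algorithms of Refs. [44,45] but is not a reproduction of either. The logistic map test signal (for which the exponent is known to be $\ln 2$), the function names, and all parameter values (`eps`, `max_dn`, `min_sep`, `stride`) are assumptions made for this example only.

```python
import numpy as np

def logistic_series(n, x0=0.3, burn=100):
    """Iterate x -> 4x(1-x); an assumed test signal with exponent ln 2."""
    x, out = x0, np.empty(n)
    for i in range(n + burn):
        x = 4.0 * x * (1.0 - x)
        if i >= burn:
            out[i - burn] = x
    return out

def divergence_curve(s, eps=1e-3, max_dn=6, min_sep=10, stride=5):
    """Mean over reference points of log(mean neighbour distance after dn steps)."""
    n = len(s) - max_dn
    idx_all = np.arange(n)
    curve, used = np.zeros(max_dn + 1), 0
    for i in range(0, n, stride):
        d0 = np.abs(s[:n] - s[i])
        # neighbours closer than eps, excluding points that are close in time
        nb = idx_all[(d0 < eps) & (np.abs(idx_all - i) > min_sep)]
        if nb.size == 0:
            continue
        dist = np.array([np.mean(np.abs(s[nb + dn] - s[i + dn]))
                         for dn in range(max_dn + 1)])
        if np.all(dist > 0):
            curve += np.log(dist)
            used += 1
    return curve / used

s = logistic_series(5000)
curve = divergence_curve(s)
# slope of the initial, approximately linear part of the curve
lyap = np.polyfit(np.arange(5), curve[:5], 1)[0]
```

For measured data the series would first be embedded, and the range of $\Delta n$ used for the fit has to be chosen by inspecting the curve; a clean linear region cannot be taken for granted.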
An alternative way to introduce noise into the definition of Lyapunov exponents is to study the separation of initially close trajectories of two identical copies of a system which are evolving subject to the same noise realisation. Then the Lyapunov exponent quantifies the contribution to the divergence that originates in the intrinsic instability of the deterministic part of the system. This is essentially the kind of instability probed by the tangent space methods to obtain Lyapunov exponents from data, most prominently Refs. [46–49].

Lyapunov exponents quantify the average exponential growth rate of infinitesimal initial errors. Their natural units are therefore inverse times. However, this does not justify quoting inverse Lyapunov exponents as average predictability horizons or predictability times. (The two processes of averaging and taking the inverse of a quantity do not commute.) In fact, the degree of instability and predictability can vary considerably throughout phase space, as has been pointed out for example by Abarbanel et al. [50] and by Smith [51]. The Lyapunov exponents constitute a particular way of averaging over these variations. They are constructed in such a way that the average becomes independent of the initial condition and invariant under smooth coordinate changes. For typical prediction times, one has to form different averages which cannot be expected to be invariant.

It has been argued that the loss of information about the system by averaging in a specific way over local variations of the instability or predictability is too severe. Several people [52,53] have therefore proposed concepts of local Lyapunov exponents and predictabilities. Local Lyapunov exponents are defined in much the same way as the usual exponents, except that the limit $\Delta n \to \infty$ is omitted, whence they become position dependent, or local. In particular, Bailey et al. [54] have studied the statistical properties of these exponents.
They consider the case that dynamical noise is perturbing the system. In that case they can prove a central limit theorem about the existence and convergence of finite time Lyapunov exponents. Unfortunately, local quantities are almost never invariant in any useful sense. Quite trivially, they will change whenever the positions are transformed. These changes can easily be kept track of. But as soon as the coordinate changes are not isometries, the statistical weights of different areas in phase space are changed. Thus, the values, and distributions of values, of local Lyapunov exponents and local predictability times are manifestly non-invariant.

2.3.3. Dimension and entropy

Besides the exponential divergence of trajectories, the most striking feature of chaotic dynamical systems is the irregular geometry of the sets in phase space visited by the system state point in the course of time. This fractal geometry is a natural consequence of the divergence of trajectories which can be realised in a finite phase space only through some folding mechanism. [Footnote: While the connection between chaoticity and fractality arises naturally, there are counterexamples, both for strange but non-chaotic and for chaotic but non-fractal attractors.] Stretching, folding, and volume contraction lead to statistically self-similar structure on small length scales. While the average stretching rate is quantified by the Lyapunov exponent, the loss of information due to the folding is reflected by the entropy of the process. The self-similar character of the resulting point sets and measures defined on them can be characterised by fractal dimensions. Several definitions of non-integer dimensions have been proposed in the literature. Most well
known are the Hausdorff dimension of a set and the more easily computable box counting (or capacity) dimension. We can also weight the points in the set by the frequency with which they are visited on average. Then we need a definition of the dimension in terms of the natural measure $\mu(x)\,dx$ defined on the set. One way to proceed is to take weighted averages of the number of points contained in the elements of a partition of phase space and study their dependence on the refinement of the partition. The translation of this scheme into a time series context leads to the box-counting methods of dimension estimation. The practical problems that arise when a space of moderate dimensionality must be covered by boxes of small length can be overcome by sophisticated bookkeeping algorithms. However, these methods make rather inefficient use of the statistics available and suffer from severe finite size effects on the larger length scales. They are therefore not recommended for the study of invariant properties of real world time series.

An alternative way to define the dimension of a measure $\mu(x)\,dx$ is by means of correlation integrals $C_q(\epsilon)$. Let us define the locally averaged density $\rho_\epsilon$ to be the convolution of $\mu$ with a kernel function $K_\epsilon(r) = K(r/\epsilon)$ of bandwidth $\epsilon$ that falls off sufficiently fast for the convolution to exist:

$$\rho_\epsilon(x) = \int_y dy\, \mu(y)\, K_\epsilon(\|x - y\|). \qquad (9)$$

Most commonly, the kernel is chosen to be $K_\epsilon(r) = \Theta(1 - r/\epsilon)$, where $\Theta(\cdot)$ is the Heaviside step function, $\Theta(x) = 0$ if $x \le 0$ and $\Theta(x) = 1$ for $x > 0$. Other kernels are popular in statistical density estimation. The correlation integral of order $q$ is given by the order-$q$ average of $\rho_\epsilon$:

$$C_q(\epsilon) = \int_x dx\, \mu(x)\, [\rho_\epsilon(x)]^{q-1}. \qquad (10)$$

For a self-similar measure we have

$$C_q(\epsilon) \propto \epsilon^{(q-1) D_q}, \qquad \epsilon \to 0. \qquad (11)$$

In the literature, $D_q$ is called the order-$q$ dimension. This definition includes the dimension $D_0$, which has been shown to coincide with the Hausdorff dimension in many cases, and the information dimension $D_1$ through l'Hospital's rule. Although $D_1$ is the most relevant because of its information theoretic meaning (it quantifies the scaling of the amount of information needed to specify the state of the system with the required accuracy), we will usually at most be able to estimate a lower bound on it, the correlation dimension $D_2$. The correlation dimension as a means of quantifying the "strangeness" of an attractor has been introduced by Grassberger and Procaccia [55]. For finite samples, the double integral in $C_2$ can be evaluated down to much smaller scales than the other $C_q$'s. Although generic attractors are expected to be multifractal, that is, $D_q$ depends on $q$, this property is difficult to study in real time series. Only for exceptionally long, clean signals can $D_q$ be obtained for $q \neq 1, 2$. For real world recordings it is already an ambitious goal to establish a possible fractal nature by finding a scaling region of $C_2$.

When analysing time series we are usually dealing with distributions of delay vectors with delay $\tau$ in an $m$-dimensional reconstructed phase space. The $m$ dependence of $C_q$ in the limit of large $m$ can then be expressed as

$$C_q(m, \epsilon) = \alpha(m)\, e^{-(q-1) h_q \tau m}\, \epsilon^{(q-1) D_q}, \qquad \epsilon \to 0, \quad m \to \infty, \qquad (12)$$
which defines the order-$q$ entropy $h_q$. The pre-factor $\alpha(m)$ depends on the norm $\|\cdot\|$ and the kernel function. Although $\alpha(m)$ does not affect the asymptotic value of the entropy $h_q$, the convergence for finite $m$ can be dramatically different, as demonstrated in Ref. [56]. (See also Ref. [57].) Again, the case $q = 1$ is singled out since the (Shannon or Kolmogorov) entropy $h_1$ is additive when independent processes are joined. Also, $h_1$ is related to the Lyapunov exponents via Pesin's identity, a fact that can be used for consistency checks. However, as with the dimensions, the case $q = 2$ is much more accessible with time series data. See for example Ref. [58]. An algorithm for the determination of the Kolmogorov entropy is given in [59]. Equivalent scaling behaviour to that of Eq. (12) is valid for a large class of kernel functions in the average Eq. (9), see Refs. [60,61]. Besides the hard kernel given by the Heaviside function, the most natural choice is a Gaussian, so that for example $C_2$ reads:

$$C_2^G(\epsilon) = \int_x dx \int_y dy\, \mu(x)\,\mu(y)\, e^{-\|x-y\|^2/4\epsilon^2}. \qquad (13)$$

Apart from yielding smoother curves for finite sample estimates, Gaussian kernel correlation integrals have some other attractive properties. For example, $\log C_2^G$ is additive under the pointwise summation of independent variables, in particular, a deterministic signal and measurement noise. It is not much in use, mainly because its numerical implementation seems quite awkward when the definition, Eq. (13), is used straightforwardly. However, it is quite easily obtained from the usual (step kernel) correlation integral by

$$C_2^G(\epsilon) = \frac{1}{2\epsilon^2} \int_0^\infty d\tilde\epsilon\; e^{-\tilde\epsilon^2/4\epsilon^2}\, \tilde\epsilon\, C_2(\tilde\epsilon). \qquad (14)$$
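The relation between the step kernel and Gaussian kernel correlation integrals can be checked numerically on a finite sample. The sketch below replaces the integrals over the measure by averages over all distinct pairs of points, which is an assumption of this example rather than a prescription from the text, and compares the direct evaluation of Eq. (13) with the route via Eq. (14). The test set (random points on a circle, for which $D_2 = 1$), the function names and the grid parameters are likewise illustrative choices.

```python
import numpy as np

def pair_distances(points):
    """Euclidean distances between all distinct pairs of points."""
    pts = np.asarray(points, dtype=float)
    i, j = np.triu_indices(len(pts), k=1)
    return np.sqrt(np.sum((pts[i] - pts[j]) ** 2, axis=1))

def c2_step(dist, eps):
    """Step-kernel correlation sum: fraction of pairs closer than eps."""
    return np.mean(dist < eps)

def c2_gauss_direct(dist, eps):
    """Gaussian-kernel correlation sum, finite-sample version of Eq. (13)."""
    return np.mean(np.exp(-dist ** 2 / (4.0 * eps ** 2)))

def c2_gauss_from_step(dist, eps, n_grid=4000):
    """Eq. (14): integrate the step-kernel sum against a Gaussian weight."""
    grid = np.linspace(0.0, 12.0 * eps, n_grid)   # weight is negligible beyond
    c2 = np.searchsorted(np.sort(dist), grid, side="left") / dist.size
    y = np.exp(-grid ** 2 / (4.0 * eps ** 2)) * grid * c2
    integral = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(grid))  # trapezoid rule
    return integral / (2.0 * eps ** 2)

rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, 800)
circle = np.column_stack([np.cos(angles), np.sin(angles)])  # a set with D_2 = 1
dist = pair_distances(circle)

# slope of log C_2 versus log eps over an assumed scaling range
eps_values = np.logspace(-2, -1, 8)
log_c2 = np.log([c2_step(dist, e) for e in eps_values])
d2_estimate = np.polyfit(np.log(eps_values), log_c2, 1)[0]
```

For a time series, `points` would be the delay vectors, and pairs that are close in time would normally be excluded from the sum; that correction is left out here to keep the sketch short.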
2.4. Comparing dynamics and attractors

The quantities considered so far were all meant to characterise a single process. Different processes can of course be compared by comparing these numbers. It may however be interesting to have some means to answer the question of how different two processes are directly, without going through the reduction to a small number of characteristics. Recently, several authors have independently begun to use relative measures for the classification of systems through time series [62,63] and for the study of nonstationary signals [64–66]. Therefore, some theoretical background will be given for the less ad hoc measures of dissimilarity. Practical aspects as well as some more informal but useful quantities will be taken up in Section 3.6.

Let us first make a clear distinction between the problem of defining a measure of dissimilarity between attractors (dynamics, measures, probability distributions) and between trajectories. The latter is related to the question of whether two systems are dynamically synchronised in a general sense. Generalised synchronisation means that there is a smooth mapping that relates the states of the two systems at any time. The present work does not address the latter question. Relevant references include Refs. [67–71].

[Footnote to Eq. (14): It can be easily seen that $C_2^G(\epsilon) = \int_0^\infty d\tilde\epsilon\; e^{-\tilde\epsilon^2/4\epsilon^2}\, \frac{d}{d\tilde\epsilon} C_2(\tilde\epsilon)$, from which the result follows by partial integration.]
Information theory provides a measure of distance between two probability densities $\mu(x)$ and $\nu(x)$ which is based on the Kullback entropy [72]. Let

$$H_K(\mu,\nu) = \int_x dx\, \mu(x) \log\frac{\mu(x)}{\nu(x)}. \qquad (15)$$

Then

$$\gamma_K(\mu,\nu) = H_K(\mu,\nu) + H_K(\nu,\mu) = \int_x dx\, [\mu(x) - \nu(x)]\,[\log\mu(x) - \log\nu(x)] \qquad (16)$$

is positive definite, symmetric, and fulfills the triangle inequality. Thus $\gamma_K(\cdot,\cdot)$ is a metric. For the same reasons discussed above, expression (16) will be replaced by its analog based on second order correlation integrals. A distance $\gamma_2(\mu,\nu)$ can then be defined by

$$\gamma_2(\mu,\nu) = \int_x dx\, [\mu(x) - \nu(x)]^2 = \lim_{\epsilon \to 0}\, [C_2(\epsilon;\nu) + C_2(\epsilon;\mu) - 2 C_2(\epsilon;\mu,\nu)], \qquad (17)$$

where $C_2(\epsilon;\mu,\nu)$ is the cross-correlation integral

$$C_2(\epsilon;\mu,\nu) = \int_x dx \int_y dy\, \mu(x)\,\nu(y)\, K_\epsilon(\|x - y\|). \qquad (18)$$

[Footnote to Eq. (17): Strictly speaking, this formula is only valid for smooth distributions $\mu$, $\nu$. For measured data we can safely assume that smoothness is imposed on fractal distributions by measurement errors.]

The case $K_\epsilon(r) = \Theta(\epsilon - r)$ as the generalisation of the Grassberger-Procaccia correlation integral [55] has been introduced by Kantz [73]. The Gaussian kernel case $K_\epsilon(r) = e^{-r^2/4\epsilon^2}$ together with its finite sample properties has been studied by Diks and coworkers [74]. If the proper limit $\epsilon \to 0$ is taken, $\gamma_2(\mu,\nu)$ is nothing but the $L^2$ distance of the two probability densities. For finite $\epsilon$, $C_2(\epsilon;\nu) + C_2(\epsilon;\mu) - 2C_2(\epsilon;\mu,\nu)$ is no longer formally a distance, except for particular kernels including the Gaussian case. One drawback of the second order distance $\gamma_2(\mu,\nu)$ as compared to the Kullback distance $\gamma_K(\mu,\nu)$ is that it is no longer invariant under smooth coordinate transformations acting on both distributions. Other measures of distance, in particular for discrete point sets, have been discussed by Moeckel and Murray [75].

One could further ask what happens if two distributions are observed but they may have been obtained with different measurement functions. This amounts to the question of whether the two distributions are absolutely continuous with respect to an unknown reference distribution. The hope for a time-series-based answer seems unrealistic at this stage.

The quantities mentioned so far are all based on geometrical ideas. Apart from asking for similar phase space geometry one can also ask for similar dynamical evolution laws. If approximate predictive models can be established for the time series, one can derive measures of dissimilarity that often allow stable estimates for rather short sequences. Kadtke [76] uses global models of the form

$$s_{n+1} = f(\mathbf{s}_n) = \sum_i a_i f_i(\mathbf{s}_n) \qquad (19)$$
to fit several time series or segments individually. Then changes and differences in the dynamics are monitored by changes in the model parameters $a_i$. Technically, it is important to choose a model class with as few basis functions $f_i$ as possible. Otherwise the values of individual coefficients $a_i$ may strongly depend on unimportant details of the signal. Two numerically distinct sets of parameters may equally well model the same data. This approach requires the dynamical models to involve globally adjustable parameters, excluding locally constant or locally linear methods.

There are at least two other ways to compare the dynamics of two predictive models $f$ and $\phi$. One approach that has been taken by Hernández and coworkers [62] is to use both models to make predictions on a time series $\{s_n\}$, and compare them for each time step $n$:

$$\gamma_P(f,\phi;\{s_n\}) = \frac{1}{N} \sum_{n=1}^{N} [f(\mathbf{s}_n) - \phi(\mathbf{s}_n)]^2. \qquad (20)$$

Since $\gamma_P(f,\phi)$ is nothing but the $L^2$ distance of the vectors formed by the individual predictions, it is a distance measure in the mathematical sense. In general its value depends on the choice of time series $\{s_n\}$. If $f$ has been obtained by a fit to the signal $\{s_n^{(1)}\}$ and $\phi$ by a fit to $\{s_n^{(2)}\}$, a symmetric distance measure between $\{s_n^{(1)}\}$ and $\{s_n^{(2)}\}$ is given by

$$\gamma_P(f,\phi) = \gamma_P(f,\phi;\{s_n^{(1)}\}) + \gamma_P(f,\phi;\{s_n^{(2)}\}). \qquad (21)$$

The cross-prediction error used in Refs. [63,66] is quite similar, but it compares the individual predictions to the observed values rather than to each other:

$$\gamma_C(f;\{s_n\}) = \frac{1}{N-1} \sum_{n=1}^{N-1} [s_{n+1} - f(\mathbf{s}_n)]^2. \qquad (22)$$

A small value of $\gamma_C(f;\{s_n^{(2)}\})$ indicates that the dynamics on $\{s_n^{(2)}\}$ is a subset of the dynamics found on $\{s_n^{(1)}\}$ and modelled by $f$. A symmetric measure of dissimilarity (not a formal distance measure in general) is given by

$$\gamma_C(f,\phi) = \gamma_C(f;\{s_n^{(2)}\}) + \gamma_C(\phi;\{s_n^{(1)}\}). \qquad (23)$$

In the last two schemes, in principle any method of prediction can be used. In Refs. [63,66] stable results have been obtained with simple locally constant phase space predictors.
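A cross-prediction error of the type of Eq. (22) can be sketched with a locally constant predictor as follows. This is only a minimal illustration and not the implementation used in Refs. [63,66]: the logistic-map test signals, the neighbourhood size `eps`, the fall-back to the nearest neighbour, and all function names are assumptions introduced here, and the error is normalised by the number of available delay vectors rather than by $N-1$.

```python
import numpy as np

def logistic(n, r, x0=0.3, burn=100):
    """Assumed test signal: iterates of x -> r x (1 - x) after a transient."""
    x, out = x0, np.empty(n)
    for i in range(n + burn):
        x = r * x * (1.0 - x)
        if i >= burn:
            out[i - burn] = x
    return out

def delay_vectors(s, m, tau):
    """Rows (s[n-(m-1)tau], ..., s[n]) and successors s[n+1]."""
    n_idx = np.arange((m - 1) * tau, len(s) - 1)
    lags = np.arange(m - 1, -1, -1) * tau
    return s[n_idx[:, None] - lags[None, :]], s[n_idx + 1]

def fit_locally_constant(train, m=2, tau=1, eps=0.02):
    """Return f(v): mean successor of training vectors within eps of v (max norm)."""
    vecs, targets = delay_vectors(train, m, tau)
    def f(v):
        d = np.max(np.abs(vecs - v), axis=1)
        near = d < eps
        if not near.any():          # fall back to the single nearest neighbour
            near = d == d.min()
        return targets[near].mean()
    return f

def cross_prediction_error(f, test, m=2, tau=1):
    """Mean squared one-step error of model f on the series test, cf. Eq. (22)."""
    vecs, targets = delay_vectors(test, m, tau)
    pred = np.array([f(v) for v in vecs])
    return np.mean((targets - pred) ** 2)

a_train = logistic(2000, 4.0, x0=0.3)
a_test = logistic(500, 4.0, x0=0.41)    # same dynamics, different orbit
b_test = logistic(500, 3.8, x0=0.3)     # different dynamics
f = fit_locally_constant(a_train)
err_same = cross_prediction_error(f, a_test)
err_other = cross_prediction_error(f, b_test)
```

With these choices the model fitted to the first system predicts a second orbit of the same system better than an orbit of the modified system; the symmetric measure of Eq. (23) would add the error of a model fitted to the second series and tested on the first.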
Prediction errors will be discussed in Section 3.5. Section 3.6 will discuss a few practical aspects of the above approaches.
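For the geometrical measure of Eqs. (17) and (18), a finite-sample version at fixed $\epsilon$ with the Gaussian kernel might look as follows. The sketch replaces the integrals by averages over all pairs of sample points, including self-pairs, which introduces a bias of order $1/N$; this estimator, the test distributions and the function names are assumptions of the example and are not taken from Refs. [73,74].

```python
import numpy as np

def gauss_cross_sum(a, b, eps):
    """Finite-sample Eq. (18) with kernel exp(-r^2 / (4 eps^2)); rows are points."""
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=2)
    return np.mean(np.exp(-d2 / (4.0 * eps ** 2)))

def gamma2(a, b, eps):
    """Bracket of Eq. (17) evaluated at finite eps for two point sets."""
    return (gauss_cross_sum(a, a, eps) + gauss_cross_sum(b, b, eps)
            - 2.0 * gauss_cross_sum(a, b, eps))

rng = np.random.default_rng(1)
a = rng.normal(0.0, 1.0, size=(400, 2))
b_same = rng.normal(0.0, 1.0, size=(400, 2))    # same distribution as a
b_shift = rng.normal(1.5, 1.0, size=(400, 2))   # shifted distribution

g_same = gamma2(a, b_same, 0.5)
g_shift = gamma2(a, b_shift, 0.5)
```

In a time series setting the rows of `a` and `b` would be delay vectors of the two recordings, and the result depends on the chosen $\epsilon$.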
3. Nonlinear analysis of limited data

In the previous section, some definitions and theoretical motivation were given for a number of concepts which now have to be adapted to the case that instead of a measure, an attractor, a dynamical system, all we have is a finite, noisy time series. The way to proceed crucially depends on the point of view we want to assume about the nature of the system. As said earlier, we cannot assume deterministic chaos for any measured time series. If we want to use the theoretical results available, we need to establish it from the data, maybe backed up by additional considerations. Often, we will not be able to find low-dimensional structure, but we may still borrow some concepts just because they give a convenient framework for certain problems.
3.1. Embedding finite, noisy time series How much information can be recovered from time delayed copies of finite sets of noisy measurements is quite a complicated question and a general answer is not available. The embedding theorems mentioned previously all assume that the observations are available with arbitrary precision. For some results, in particular those concerning the attractor dimension, it is also assumed that arbitrarily small length scales can be accessed which implies that an infinite amount of data is available. A mathematical theorem cannot simply be expected to be almost valid if the conditions are almost fulfilled. Consequently, several authors have investigated what happens to the embedding procedure when noise is present and the sequence is of finite length. For the embedding procedure, noise seems to be the dominant limiting factor. Only a few theoretical results relevant for practical work are available on the embedding of noisy signals. First of all, we have to make a fundamental distinction between noise due to measurement error and noise that is intrinsic to the dynamics. In the first case, we suppose that there is a deterministic dynamical system underlying the signal. Thus, it is clear what we want to reconstruct by the embedding procedure. If the noise is coupled to the system we have to specify in what sense we want to use an embedding in the first place. Unfortunately, the nature of the noise is usually not known independently. There is no general straightforward way to infer its properties from a time series without making strong assumptions about the dynamical system or the spectral properties of the noise. One remarkable paper about the effect of measurement noise on the embedding procedure is that by Casdagli and coworkers [77]. Their main result is that a reconstruction technique that leads to a formally valid embedding with noise-free data can nevertheless amplify noise even in a singular way. 
That means that in such a case not all degrees of freedom of the system can be recovered from a scalar time series even for arbitrarily small amounts of noise. The examples studied in Ref. [77] suggest that this situation is quite typical and not just found in constructed pathological examples. Thus, bold interpretations of Takens' theorem, for example, that we can recover the full dynamics of the human body from a recording of a single variable, are not only in contradiction with common sense but also disproven by mathematical arguments. Some results are available on the embedding of noise-driven signals. One line of thought supposes that the driving noise sequence is known. The dynamical system then becomes a nonlinear input-output device. Casdagli [78] and Stark et al. [41] formalise the idea that the observations of the output of such a system can be embedded in the sense that time delayed copies of the observation sequence together with the state of the input variable specify the state of the system as well as the full output together with the input state. These results may be more useful for time series analysis than they seem, given the fact that we almost never know the noise sequence. There are certain signals where the external influence can be inferred to some extent from the observed output. Consider, for example, a recording of the cardiac cycle such as an electrocardiogram (ECG). The cycle itself is fairly regular but the initiation of a new cycle seems not to be fully determined by degrees of freedom of the heart itself. But even if the beat times were random, we could always infer a posteriori that triggering must have occurred once we observe a new cycle. An illustration of this point will be given in an example below. There are other theoretical works that also follow the idea that dynamical noise can be isolated in certain cases where the observations contain sufficient redundancy. Muldoon and coworkers
[79] study the case that more probes are available than necessary to cover the degrees of freedom of the system. They demonstrate in a number of examples that sufficient redundancy in the measurements allows for a distinction between the deterministic part of the signal and the dynamical noise. This also allows missing variables to be recovered by an embedding procedure.

3.2. Practical aspects of embedding

In most interdisciplinary applications we do not know much about the nature of the noise. For example, biological systems are almost never isolated, and measurements are always of finite accuracy. Observational noise is not always white and Gaussian, although this is often the case. If we make any assumption about the statistical properties of the noise, we have to carefully check the consistency of the results. One of the most immediate restrictions of the embedding theorems for finite data is that the information contained in a time delay representation of real data is influenced by the choice of embedding parameters. While the theorems do not restrict the delay time $\tau$ (only a few exceptional cases must be excluded), the proper choice of $\tau$ does matter for practical applications. Also, there are many cases where the theoretically sufficient embedding dimension $m$ is not optimal for a certain purpose. Larger (but sometimes also smaller) values may give superior results. The literature on this issue is quite confusing and at times contradictory. Part of the confusion is due to the fact that optimality can only be assessed with respect to a specific application. When fitting the dynamics by a global polynomial model, the embedding dimension should be as small as possible in order to limit the number of coefficients in the model. On the other hand, for local projective noise reduction, the redundancy of an embedding with small $\tau$ and large $m$ allows for better noise averaging.
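Since several embedding strategies usually have to be tried, it helps to have the construction of delay vectors in one small routine. The following sketch builds vectors from an arbitrary list of lags, so that the uniform case (dimension $m$, delay $\tau$) and unequally spaced lags are both covered. The function name, the argument convention and the test signal are assumptions made for illustration.

```python
import numpy as np

def delay_embed(s, lags):
    """Rows (s[n - lags[0]], ..., s[n - lags[-1]]) for all admissible n.

    For a uniform embedding with dimension m and delay tau use
    lags = [(m - 1) * tau, ..., tau, 0]; unequally spaced lags work as well.
    """
    s = np.asarray(s, dtype=float)
    lags = np.asarray(lags, dtype=int)
    n = np.arange(lags.max(), len(s))      # first index with a full history
    return s[n[:, None] - lags[None, :]]

s = np.sin(0.1 * np.arange(200))
uniform = delay_embed(s, [4, 2, 0])        # m = 3, tau = 2
nonuniform = delay_embed(s, [5, 2, 0])     # unequally spaced lags
```

Comparing the results of an analysis for a few such choices is a simple way to follow the advice of carrying out each study with several embeddings.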
For signal classification we do not even need a formal embedding since the difference between states may be statistically better defined in a low-dimensional projection where small neighbourhoods tend to contain more points. The discussion will therefore be cut short by giving some pointers to the literature and by proposing simply to carry out each study with several embedding strategies and to compare the results. Theoretical work on the embeddability of noisy sequences is found in Refs. [77,80,81]. More heuristic studies are Refs. [82–88]. One general remark is that if one attempts to formally optimise the performance of an embedding, one should use a large enough class of possible embeddings. There is no theoretical reason to restrict the study to time delay embeddings with equally weighted lags that are integer multiples of a common lag time. In fact, it was already reported in Ref. [1] that minimising the redundancy in a reconstruction $(s(t - \tau_{m-1}), s(t - \tau_{m-2}), \ldots, s(t))$ does not necessarily yield $\tau_n = n\tau$ but for example $\tau_1 < \tau_2 < 2\tau_1$. More general functions of $s(t')$, $t - w \le t' \le t$, in a time window of length $w$ may be considered. The singular value decomposition constitutes the special case of maximising the variance among all linear combinations within a time window. But neither is it necessary to consider linear functions only, nor is the variance always the most interesting characteristic.

Let us finish this section with a particular type of signal where a time delay embedding proves useful even though the signal has a strong stochastic (or high-dimensional, in any case unpredictable) component, the electrocardiogram (ECG). The ECG records the electro-chemical activity of the heart which is essential for its pumping mechanism. The cardiac muscle can be regarded as a spatially extended excitable medium with an excitable, an excited, and a refractory phase. At the
onset of a cardiac cycle, a stimulus is initiated at the sino-atrial (SA) node, a specialised collection of muscle cells. The resulting depolarisation wave proceeds along a well defined pathway, first through the atria and then to the ventricles. The excited tissue contracts and thereby ejects blood to the body and the lungs. Eventually, all cardiac tissue has been excited and is refractory whence the stimulus dies out. (If this condition fails, then the re-entry phenomenon can occur which is the cause of serious arrhythmiae.) The pathway of the depolarisation wave is quite similar from cycle to cycle; the variation over a few cycles can be parametrised approximately by a one- or two-parameter family of ECG curves. However, the onset of a new cycle fluctuates from beat to beat in a way that is not well understood. Certainly, the interbeat fluctuations cannot be modelled successfully by a low-dimensional deterministic approach. Coupling to the breath activity, blood pressure, as well as more complex control signals from the central nervous system have to be taken into account. With this picture in mind, it could not be expected from the embedding theorems that a delay coordinate representation of the ECG is useful at all.

Let us consider a stochastically driven, damped harmonic oscillator as a toy model for such an input-output system with an unknown, fluctuating input sequence:

$$\ddot{x} + \dot{x} + x = a(t). \qquad (24)$$

The driving term is taken to be zero except for kicks of random strength at times $t_i$ such that the inter-beat intervals, $t_i - t_{i-1}$, are random in the interval $[p, q]$. Fig. 2 shows two trajectories of such a system with different choices of the inter-beat time interval $[p, q]$. The kicks are realised by finite jumps by a random amount in the interval $[0,1]$. To the left, solutions of Eq. (24) are plotted versus time. In the middle, the true phase space spanned by $x(t)$ and $\dot{x}(t)$ is shown while to the right a delay representation is used with a delay of one time unit. In the upper row, beats are initiated with a time separation of $p = 0$ to $q = T/2$ where $T = 4\pi/\sqrt{3} \approx 7.26$ time units is the period of oscillation. This does not allow the system to relax sufficiently between kicks in order to form characteristic structure in phase space. Consequently, an embedding provides no clear picture and additional information would be needed for an analysis of such a time series. In the lower row, no kicks were closer in time than $p = T$, the maximal separation being $q = 3T$. The inter-beat parts of the trajectory are distinct because of the different kick strength, but since this is the only fluctuating parameter except for the inter-beat interval, they are essentially restricted to a two-dimensional manifold. This manifold is preserved under time delay embedding although neither the sequence of beat times nor the beat amplitudes are used in any way. The randomness acts locally around the origin in the indeterminacy as to when the next beat will occur and how far the system will be taken by the kick.

Similarly to this toy example, most of the lacking information in the ECG, that is, the times at which a new beat is triggered and possible other parameters of the new cycle, can be deduced from the recording itself. Once the cycle is on its way, we can find its origin quite easily. Thus, the redundancy in the ECG trace explains why delay representations of ECGs are found to be approximately confined to low-dimensional manifolds, see for example Fig. 3. In the left panel the delay has been set to 12 ms in order to resolve best the large spike (the QRS-complex) that corresponds to the depolarisation of the ventricle (the large loop in Fig. 3).

Fig. 2. Trajectories of a kicked, damped harmonic oscillator. Left: signal plotted versus time. Middle: true two-dimensional phase space. Right: delay embedding. Upper: kicks occur close in time. Lower: kicks are well separated in time and the system can relax between kicks. See text for discussion.

Fig. 3. Delay coordinate embeddings of a human electrocardiogram. The delay time is 12 ms (resp. 24 ms) at an (interpolated) sampling rate of 500 Hz. Note that trajectories spend fluctuating stretches of time near the origin, where therefore an indeterminacy occurs. (The ECG voltages are in μV.)
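The toy model of Eq. (24) is easy to simulate, which allows the two regimes of Fig. 2 to be explored qualitatively. The sketch below propagates the unforced oscillator exactly over each time step and realises a kick as a jump of the velocity, which corresponds to a delta-shaped driving term. Both this interpretation of the "finite jumps" and the rounding of kick times to the sampling grid are assumptions of the example, as are the function name and the parameter values; the inter-beat interval must have positive length.

```python
import numpy as np

OMEGA = np.sqrt(3.0) / 2.0           # damped frequency of x'' + x' + x = 0
PERIOD = 2.0 * np.pi / OMEGA         # equals 4*pi/sqrt(3), about 7.26

def kicked_oscillator(n_steps, dt, p, q, seed=0):
    """Sample x(t) of Eq. (24) on a grid; kicks are velocity jumps in [0, 1]."""
    rng = np.random.default_rng(seed)
    c, s, e = np.cos(OMEGA * dt), np.sin(OMEGA * dt), np.exp(-dt / 2.0)
    # exact propagator of the unforced oscillator over one time step dt
    step = e * np.array([[c + s / (2 * OMEGA), s / OMEGA],
                         [-s / OMEGA, c - s / (2 * OMEGA)]])
    state = np.zeros(2)              # (x, dx/dt)
    next_kick = 0.0
    x = np.empty(n_steps)
    for k in range(n_steps):
        while k * dt >= next_kick:   # apply all kicks due by this grid time
            state[1] += rng.uniform(0.0, 1.0)
            next_kick += rng.uniform(p, q)
        x[k] = state[0]
        state = step @ state
    return x

dt = PERIOD / 400
dense = kicked_oscillator(20000, dt, 0.0, PERIOD / 2)        # kicks close in time
sparse = kicked_oscillator(20000, dt, PERIOD, 3 * PERIOD)    # well separated kicks
delay = int(round(1.0 / dt))                                 # one time unit
embedded = np.column_stack([sparse[:-delay], sparse[delay:]])
```

Plotting the two columns of `embedded` against each other gives a delay representation of the well separated case; repeating this for `dense` shows how the structure is lost when the system cannot relax between kicks.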
In the right panel, a longer delay of 24 ms has been used in order to unfold the smaller structures around the origin which represent the atrial depolarisation (P-wave) and ventricular re-polarisation (T-wave) phases. After each beat, the signal returns to the baseline, or the origin in delay space, where it may spend some time until a new beat occurs. Since at that point the future is indeterminate, the set visited by the trajectory is essentially finite dimensional, without being formally deterministic.

3.3. Estimating dynamics and predicting

Perhaps the most fundamental idea behind the approach to time series analysis taken in this paper is that an irregular signal may have been generated by a nonlinear dynamical system with
only a few active degrees of freedom. Therefore, one of the most important goals should be to establish effective equations of motion that follow this principle and are consistent with the data. The ability to generate a time series that is equivalent to the measured one can be taken as evidence for the validity of the approach, and is therefore interesting in its own right. But there are many other situations where effective model equations can be of great value. Most properties of chaotic systems are much more easily determined from equations than from a time series. Thus, if a time series can be well represented by model equations, one might even abandon the analysis of the series in favour of an analysis of the model. But this situation is rather rare. Except for well controlled laboratory experiments, dynamical modelling is seldom faithful enough to justify such an approach. Nevertheless, analysing an empirical model, and maybe synthetic time series data generated from it, can provide a valuable consistency test for the results of time series analysis. The best we can hope for when fitting a model to data is that the result comes close to the real underlying dynamics. However, chaotic dynamical systems generically show the phenomenon of structural instability. This means that models with very similar parameters may exhibit qualitatively different global dynamics, for example close to an attractor crisis (see for example Ref. [26]). Therefore, if we simply iterate fitted model equations, we may see substantially different behaviour from the actual system even if the model in itself is faithful. One way to moderate this danger is to introduce a small amount of dynamical noise comparable to the modelling error when iterating the equations. Dynamical noise softens the sensitive dependence on parameters to some extent. Alternatively, or additionally, one could study ensembles of models which are compatible with the data. 
The ensemble variation of statistical properties can then be taken as an indicator for the expected effect of the remaining modelling error.

The most obvious reason to reconstruct model equations from a time series is that one may be interested in predictions of future values. This task is of course quite common in meteorology, finance and several other fields. Moreover, in many situations the average error when predicting a time series can be taken as an indicator of the structure present in the signal. Thus, we will often use a nonlinear prediction error as a quantifier for the comparison of signals. The use of prediction errors will be discussed in Section 3.5. A choice of models that can be employed will be described below.

In the context of nonlinear dynamics, the modelling task consists in estimating the function F or f that is supposed to generate the data via Eqs. (1) and (2) respectively. This may look like the common problem of estimating a nonlinear function, and there is indeed a close relation. However, the information available is not quite what one would like to have for that purpose. All we usually have is a noisy scalar time series:

$$ s_n = s(\mathbf{x}_n) + \xi_n , \qquad \mathbf{x}_n = f(\mathbf{x}_{n-1}) + \boldsymbol{\eta}_n . \tag{25} $$

Here, an intrinsic noise term $\boldsymbol{\eta}_n$ has also been included, since no real system is ever really isolated. Since we cannot completely recover $\{\mathbf{x}_n\}$ from $\{s_n\}$, the best we can do is to use some kind of embedding of $\{s_n\}$ and look for a mapping $f_s$ that acts on the embedding vectors. If the original phase space is d-dimensional this mapping may have to be defined in up to 2d dimensions according to the theory of embeddings. However, information about $f_s$ is only given through the data, that is, on the attractor. This may render the estimation problem singular, depending on the model class from which $f_s$ shall be estimated. Even if the embedding problem can be solved (or avoided, if multiple simultaneous measurements are available) the estimation problem remains difficult.
The standard approach would be to
choose some parameter-dependent model for $\hat{f}_s$ and optimise the parameters using a maximum likelihood or least-squares procedure. This however implies that the value $y = f_s(\mathbf{x})$ is known at a number of locations, perhaps with some uncertainty. But for the usual procedure to work, the locations $\mathbf{x}$ have to be given without error. This cannot be assumed in time series analysis because $f_s$ is sampled only at the noisy data points. This and other practical problems in estimating dynamics from a time series are discussed in Kostelich [36]. In Refs. [32,89] illustrative material and a partial solution can be found. The most thorough discussion, the one that also comes closest to a satisfactory solution of the problem, is offered by Jaeger and Kantz [37]. The solution involves two ingredients. The first is to replace the ordinary least-squares procedure by a procedure that also optimises the positions $\{\mathbf{x}\}$ (sometimes called total least-squares). Unfortunately, this renders the fitting problem nonlinear, even if the model class is a linear combination of basis functions. The second ingredient is to minimise not only the one-step error $(y - \hat{f}_s(\mathbf{x}))^2$ but simultaneously to optimise the precision of $f_s(\mathbf{x})$, $f_s(f_s(\mathbf{x}))$, etc., a difficult nonlinear minimisation problem which is quite computer time intensive.

As for the model class from which $\hat{f}_s$ is to be determined, a number of different propositions have been made. One possibility favoured by many authors is to expand the dynamics in a Taylor series locally in phase space. This was first proposed in the context of Lyapunov exponent estimation by Eckmann and coworkers [46]. They perform local linearisation on time series data to obtain the dynamics in tangent space. In a classical paper [34], Farmer and Sidorowich port the idea to the prediction problem. In practice, the expansion is carried out up to at most linear order.
Since one has to work in several dimensions, the number of coefficients in higher order approximations becomes too large for a local treatment. In m-dimensional delay coordinates, the local model is then quite simply

$$ \hat{s}_{n+\Delta n} = a_0^{(n)} + \sum_{j=1}^{m} a_j^{(n)} s_{n-(j-1)\tau} , \tag{26} $$

where $\Delta n$ is the time over which predictions are being made and $\tau$ is the time delay as usual. The coefficients $a_j^{(n)}$, $j = 0, \ldots, m$, may be determined by a least-squares procedure, involving only points $\mathbf{s}_k$ within a small neighbourhood around the reference point $\mathbf{s}_n$. Thus, the coefficients will vary throughout phase space. The fit procedure amounts to solving m+1 linear equations for the m+1 unknowns.

When fitting the parameters a, several problems are encountered that seem purely technical at first but are related to the nonlinear properties of the system. If the system is low-dimensional, the data that can be used for fitting will typically not span all the available dimensions locally but only a subspace. Therefore, the linear system of equations to be solved for the fit will be ill-conditioned. However, we are only interested in that part of the linear map (26) which relates points on the attractor to their future. There are several ways to regularise the least-squares problem. In the presence of noise, the equations are not formally ill-conditioned, but the part of the solution that relates the noise directions to the future point is still meaningless (and uninteresting). Equivalently to adding a small amount of noise, one can add a small factor times the unit matrix before the singular matrix is inverted. The most appealing approach, however, is to restrict the fitting procedure to the directions spanned by the data, which can locally be identified with the principal components or singular vectors of the data distribution. These and a few
other regularisation schemes for locally linear predictions are discussed in great detail by Kugiumtzis and coworkers [90]. In his contribution to the Santa Fe Institute time series contest in 1991, Sauer [91] has emphasised the close interplay between phase space embedding and fitting of the dynamics.

The optimal degree of locality of a locally linear modelling approach has been used by Casdagli [92] as a measure for nonlinearity in a time series. He compares the predictive quality of models fitted using different numbers of neighbours. In the absence of nonlinearity, the globally linear fit using all available points as neighbours should give the best results since it uses the largest number of points and is structurally more robust. For increasing degrees of nonlinearity, the tradeoff between lack of statistics with few neighbours and curvature error with large neighbourhoods should move the optimum towards smaller and smaller length scales. The rationale of this paper is quite attractive: to define the degree of nonlinearity by what is the most useful assumption for modelling.

If one is interested in a robust, low variance measure of nonlinear predictability without necessarily aiming at optimal forecasting power, one should consider using locally constant approximations to the dynamics. The idea is simply that determinism will cause similar present states to evolve into similar future states. (This idea was first used for predictions by Lorenz [93], who called it the method of analogues.) Since we expect the signal to be noisy, it is advantageous to consider a collection of similar states rather than the single most similar state observed so far (Lorenz' analogue). Thus, in order to make a prediction at the point $\mathbf{s}_n$, we form a neighbourhood $\mathcal{U}_n$, either with a fixed radius or a fixed number of elements. The prediction model is that $\hat{s}_{n+\Delta n} = a_0^{(n)}$, where $a_0^{(n)}$ may vary throughout phase space.
The fitting problem then degenerates to finding the local (in phase space) average over the future points of $\mathbf{s}_k$, $k \in \mathcal{U}_n$:

$$ \hat{s}_{n+\Delta n} = \frac{1}{|\mathcal{U}_n|} \sum_{k \in \mathcal{U}_n} s_{k+\Delta n} . \tag{27} $$

Here, $|\mathcal{U}_n|$ denotes the number of points in the neighbourhood. In its simplest implementation, all that has to be set are the embedding parameters and a length scale, the radius of phase space neighbourhoods. Usually, none of these parameters need to be determined by a fit. Therefore the locally constant model can be regarded as almost parameter free. A very similar algorithm was first used by Pikovsky [94] and later by Sugihara and May [95]. It is quite popular in the context of nonlinearity testing [85], see also Section 5. Extensions to the locally constant or linear multivariate function interpolation procedure by spline smoothing and adaptive parameter optimisation are implemented in the MARS (multivariate adaptive regression splines) package which is described in [97].

An approach that is quite different from those mentioned so far attempts to fit the dynamics by a nonlinear function that is globally defined in phase space. The price for having a single expression for the model function is that it has to accommodate the nonlinear structure appropriately. The most straightforward generalisation of the linear autoregressive models is to include higher-order polynomial terms in the ansatz for $\hat{f}_s$. This is sometimes called a Volterra series expansion. Another popular model class is that of radial basis functions. Their use is documented by a large body of papers, for example Powell [98], Broomhead and Lowe [99], Casdagli [35], and Smith [100]. As usual, the model is given by a linear combination of basis functions $\Phi_i(\mathbf{s})$. Here, the functions are essentially of the same form, radially symmetric about some centre point $\mathbf{s}_i$: $\Phi_i(\mathbf{s}) = \Phi(\|\mathbf{s} - \mathbf{s}_i\|)$.
This yields

$$ \hat{f}_s(\mathbf{s}) = a_0 + \sum_{i=1}^{M} a_i \Phi(\|\mathbf{s} - \mathbf{s}_i\|) . \tag{28} $$

Quite a variety of functional forms of $\Phi(r)$ can be used, including Gaussians or powers. For the art of choosing the centre points $\mathbf{s}_i$ (which do not have to be points from the data set), reference is made to the literature; individual experimentation is also recommended.

After giving examples of model classes which are most popular in the nonlinear time series community, it should be stressed again that multivariate function estimation is a common problem in statistics and has a rich literature. Neural networks have been very fashionable over the past few years. They have proven to have remarkable capabilities, but, as the huge literature indicates, they need a lot of expertise and experience to be used reliably. Another branch of research pursues the idea of a regression tree in order to organise multidimensional structure for prediction, see [101] for a classical reference.

If any parameter dependent model is used, for example one like Eq. (28), or a neural network, care has to be taken to avoid overfitting. Overfitting means that a larger model class can always increase the accuracy of a fit, even though at some point only the details of the particular realisation of the process that is available for fitting are accommodated. This problem can be avoided by limiting the number of adjustable coefficients in the model. There are at least two ways to look at the problem. Both result in a penalty for the number of coefficients in the cost function that is to be minimised, but the exact form of this penalty differs for the two approaches. Akaike [102] observes that if we are planning to use the model for making predictions, then what we want to minimise is not the least-squares error of the model on the data but the expectation of that error for the case that the model is applied to new data from the same source.
This can mean that a model obtained from such a fit may outperform the original equations when prediction errors are compared. The second approach is due to Rissanen [103]. The idea here is that modelling provides a way to represent a data set in a more compact way than storing the measured data. If the data can be reproduced by a simple model, one can store the model and the errors instead, which is enough to recover all the information. In such a context, fitting is a tradeoff between reducing errors and increasing the size of the model. Practically usable formulas have only been derived in this context for linear (AR) models. One of the reasons is that it is difficult to compare the importance of parameters across different model classes. In order to avoid overfitting in practical problems, one has to validate the predictive power of a model on yet unused data. Simply splitting the available data into two parts, one for fitting and one for testing, is the cleanest possibility, but unfortunately is quite wasteful in terms of data usage. Alternatively, one can use a cross-validation technique, as they are common in the statistical literature. For k-fold cross-validation, the data set is split into two segments in k different ways and the fitting and testing is repeated k times. At each time, a different part of the data is used as a test set and the remaining points are used for fitting. The errors of the k tests may then be averaged together. (For the correct averaging, in particular if the different test sets overlap, consult the statistical literature.) The advantage is that testing is done on all available points, making best use of the data base. At the same time, each fit is based on a large part of the time series. If the expense in computer time that is necessary to repeat the fit k times is feasible, N-fold cross-validation can be used on N data points, thereby optimising the available statistics for the fits. This case is sometimes
called take-one-out statistics for obvious reasons. Strictly speaking, k-fold cross-validation assumes that there are no serial dependencies between data in the different segments, which may or may not be true. If in doubt, one can ensure that the training section and the test section are sufficiently far apart in time. With this restriction, locally constant and locally linear predictors provide take-one-out out-of-sample errors automatically since the current point has to be excluded from the neighbourhoods. In the presence of serial correlations, one should also exclude temporally close points that may still be dynamically related.

It should be stressed that cross-validation is a technique for model verification and not for model optimisation. If several models are proposed and their out-of-sample errors are compared, the error quoted for the best of these models can no longer be regarded as an out-of-sample statistic since it is obtained by optimisation over a known training set.

Section 3.5 will discuss a few issues that arise when the error with respect to a nonlinear prediction scheme is to be used as a quantifier for the predictability in a system, or a measure of "complexity". If such a quantifier is only used in a relative way to compare different signals, it is not formally necessary to use out-of-sample errors. In fact, unless cross-validation at high k is carried out, in-sample errors usually have lower sample variance and may therefore give better discriminative power. For the case of nonlinearity testing, this has been discussed by Theiler and Prichard [104].

3.4. Estimating invariants

All quantitative indicators of chaos involve in their definitions some kind of limit. If these indicators are to be estimated from a finite time series measurement, none of these limits can actually be carried out. Formally, the desirable theoretical properties of these indicators, in particular their invariance under smooth coordinate transformations, will be lost.
If the indicator is to be computed for the purpose of a comparative study, lack of invariance may be compensated for by standardising the characterisation procedure. Nonlinear indicators which are not necessarily invariant but are optimised for their power to discriminate between different dynamical states are discussed in Section 3.5 below.

With the typical data quality in non-laboratory experiments, and given that pure low-dimensional determinism is quite a particular phenomenon in nature, we will very seldom be able to reliably estimate the proper dimension or Lyapunov exponent of a real-world phenomenon. The issues we have to consider in such an attempt will be discussed below. Most of them are well covered by the literature, for example the book by Kantz and Schreiber [7]. One of the sharpest critics of naive use of the Grassberger-Procaccia correlation dimension describes the situation roughly like this: there is little use in computing the correlation dimension; if it is less than three one does not need it because the structure of the attractor is obvious, if it is larger than three it cannot be estimated reliably. Very similar statements can be made for the other invariants from chaos theory as well. Later, the discussion will proceed to quantities that have less theoretical value but are easier to compute or give statistically more powerful results.
P. Grassberger, private communication.
3.4.1. Lyapunov exponents

Lyapunov exponents measure the rate of divergence of initially close trajectories. A positive but finite Lyapunov exponent is therefore a sharp criterion for the existence of deterministic chaos. The older literature on the determination of exponents from time series can be seen as extensions of techniques that have been developed for the analysis of systems with known evolution equations. From these they inherit the assumption that such dynamical equations actually exist and that trajectory separation indeed evolves exponentially. Sano and Sawada [47] as well as Eckmann and coworkers [46] introduce locally linear fits to the dynamics in order to follow the evolution in tangent space. The algorithm by Wolf et al. [105] follows several nearby trajectories to measure the average increase of local volume. Many refinements of these methods have been proposed, see Ref. [106] for a comparative discussion of all but the most recent Lyapunov algorithms. It is not wise to use these algorithms when it cannot be taken for granted that the dynamics is deterministic since none of them actually verifies the exponential behaviour of trajectories.

More recently, the emphasis has shifted from the estimation of exponents under the assumption of determinism to the verification of exponential growth of errors. Very similar algorithms for this purpose have been proposed independently by Rosenstein et al. [44] and by Kantz [45]. We follow the latter reference here. The key idea is that initially close trajectories do not necessarily diverge exactly exponentially, but only on average. In order to cancel fluctuations around the general exponential growth, one has to average appropriately over many trajectory segments. Let $\mathcal{W}$ denote a set of delay reconstructed points $\mathbf{s}_k$ selected at random from a long trajectory such that they approximate the true probability distribution. Let $|\mathcal{W}|$ denote the number of members in $\mathcal{W}$.
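The set of points in an $\varepsilon$-neighbourhood of $\mathbf{s}_k$ is denoted by $\mathcal{U}_k$. Now define

$$ S(\Delta n) = \frac{1}{|\mathcal{W}|} \sum_{k \in \mathcal{W}} \ln \left( \frac{1}{|\mathcal{U}_k|} \sum_{l \in \mathcal{U}_k} \left| s_{k+\Delta n} - s_{l+\Delta n} \right| \right) . \tag{29} $$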
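The averaging in Eq. (29) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the implementation of Refs. [44,45]: every admissible point is used as a reference point instead of a random subset, distances between delay vectors are taken in the maximum norm, and the function names, the parameter `min_sep` (minimal index separation of neighbours) and the logistic map test signal are hypothetical choices made for this example only.

```python
import math


def logistic_series(n, x0=0.3, discard=100):
    """Hypothetical test signal: iterates of x -> 4x(1-x) after a transient."""
    x, out = x0, []
    for i in range(n + discard):
        x = 4.0 * x * (1.0 - x)
        if i >= discard:
            out.append(x)
    return out


def stretching_curve(s, m, tau, eps, max_dn, min_sep=1):
    """Return [S(0), ..., S(max_dn)] in the sense of Eq. (29).

    Assumption: all admissible indices serve as reference points k.
    """
    first = (m - 1) * tau            # first index with a complete delay vector
    last = len(s) - 1 - max_dn       # keeps s[k + max_dn] inside the series
    idx = range(first, last + 1)

    def dist(k, l):                  # maximum-norm distance of delay vectors
        return max(abs(s[k - j * tau] - s[l - j * tau]) for j in range(m))

    sums = [0.0] * (max_dn + 1)
    n_ref = 0
    for k in idx:
        nbrs = [l for l in idx if abs(l - k) >= min_sep and dist(k, l) < eps]
        if not nbrs:
            continue                 # reference point without neighbours
        means = [sum(abs(s[k + dn] - s[l + dn]) for l in nbrs) / len(nbrs)
                 for dn in range(max_dn + 1)]
        if min(means) <= 0.0:
            continue                 # logarithm undefined, skip this point
        for dn, mean_d in enumerate(means):
            sums[dn] += math.log(mean_d)
        n_ref += 1
    if n_ref == 0:
        raise ValueError("no reference point has neighbours; increase eps")
    return [v / n_ref for v in sums]


if __name__ == "__main__":
    curve = stretching_curve(logistic_series(600), m=2, tau=1,
                             eps=0.005, max_dn=5)
    for dn, value in enumerate(curve):
        print(dn, round(value, 3))
```

For the noise-free logistic map the printed values should rise roughly linearly before saturating, and the slope of the linear part is the quantity that would be read off as the exponent. For field data the same curve has to be inspected before any slope is quoted.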
If the distances $|s_{k+\Delta n} - s_{l+\Delta n}|$ grow like $e^{\lambda \Delta n}$, then so does $\exp S(\Delta n)$, but with fewer fluctuations. Kantz [45] discusses why this is the correct (that is, unbiased) way to average. A plot of $S(\Delta n)$ versus $\Delta n$ must show a reasonably straight line over a range of length scales before we accept its slope as an estimate of the Lyapunov exponent $\lambda$. In order to find such a scaling range one has to choose the radius $\varepsilon$ of the neighbourhoods $\mathcal{U}_k$ as small as possible, but not so small that too few neighbours are found or that distances are dominated by noise. Examples for the successful use of this approach for computer generated sequences and for time series from low-dimensional systems in laboratory experiments have been given in the original articles by Rosenstein et al. [44] and by Kantz [45], as well as in Ref. [7].

In many time series from field measurements, initially close trajectories are found to diverge rapidly. Algorithms that assume that this divergence is due to an intrinsic instability of the dynamics will then issue a positive Lyapunov exponent. However, an intrinsic instability of a chaotic dynamical system should result in exponential growth of the discrepancies, which is difficult to establish. Consider, for example, the divergence rate plot shown in Fig. 4. For a sequence of 2000 time intervals between heartbeats in a normal human, the function $S(\Delta n)$ defined in Eq. (29) has been computed using unit delay; initial neighbourhoods $\mathcal{U}_k$ of diameter 0.015 s around each point were formed in 2 to 10 dimensions. Indeed, the trajectories diverge quite fast. However, they soon reach saturation. The curves are not fitted well by straight lines, which would be the case if the divergence were exponential. If one were to
Fig. 4. Divergence of initially close trajectories for a series of time intervals between heartbeats in a normal human (semilog scale). $S(\Delta n)$ is shown for $m = 2, \ldots, 10$. Trajectories do separate, but no straight line indicating exponential growth can be established.
assign a slope to the lines, the result would strongly depend on the length scale, embedding dimension, etc., and would be quite useless as an estimator of the Lyapunov exponent. It should, however, be remarked that the growth is not simply diffusive either. A non-invariant parameter, like the time of growth from S(0) to twice its value for given fixed embedding and neighbourhood parameters, may well be useful for the comparison of different subjects but may not have much relation to a possible intrinsic instability of the system.

3.4.2. Correlation dimension and entropy

The problems that arise when correlation integrals and the correlation dimension are estimated from finite time series have been discussed extensively in the literature. Statistical estimators for fractal dimensions and their theoretical properties are studied in Refs. [107-111]. Original contributions pointing out potential sources for spurious results are found for example in Refs. [112-116]. Some of the material has been reviewed for example in Refs. [1,7,117,118]. I will therefore only briefly state the main points. If the probability distribution implied by the natural measure is approximated by a sum of delta functions at N points $\{\mathbf{x}_i\}$ independently drawn from it, we can estimate the correlation integral $C_2$ by the correlation sum

$$ \hat{C}_2(\varepsilon) = \frac{2}{N(N-1)} \sum_{i=1}^{N} \sum_{j=i+1}^{N} \Theta(\varepsilon - \|\mathbf{x}_i - \mathbf{x}_j\|) . \tag{30} $$

Thus, $\hat{C}_2(\varepsilon)$ is just the fraction of all pairs of points that are closer than $\varepsilon$. It has been shown, for example, by Grassberger [119] that $\hat{C}_2$ is an unbiased estimator of $C_2$. The hat will henceforth be suppressed. In time series applications, the assumption that the $\{\mathbf{x}_i\}$ are independently drawn from the underlying distribution is usually violated due to serial (also called temporal) correlations. It has been pointed out by several authors that serial correlations can lead to spurious results for the correlation dimension, see for example Refs. [1,112,113,117].
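As a concrete reading of Eq. (30), the correlation sum is simply a count of close pairs. The following plain Python sketch assumes Euclidean distances and a strict inequality for the step function; the function name `correlation_sum` and the sample points are made up for illustration.

```python
import math
import random


def correlation_sum(points, eps):
    """Fraction of pairs (i < j) with Euclidean distance below eps, Eq. (30)."""
    n = len(points)
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < eps:
                close += 1
    return 2.0 * close / (n * (n - 1))


if __name__ == "__main__":
    rng = random.Random(0)
    # Points drawn independently on a line segment: C should scale roughly
    # like eps, so halving eps should roughly halve the correlation sum.
    line = [(rng.random(),) for _ in range(300)]
    print(correlation_sum(line, 0.1), correlation_sum(line, 0.05))
```

Note that this sketch takes the independence assumption of Eq. (30) at face value. It makes no attempt to handle serially correlated points.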
The necessary correction is also well known (it has been proposed by Grassberger [115] and by Theiler [117]): pairs of points i, j that are closer in time than some correlation time $t_{\min}$ have to be excluded from the double sum in
Eq. (30). The loss of statistics is not dramatic since the total number of pairs grows like $N^2$ while only a number of terms $\propto N$ is suppressed. Therefore it is advisable to be generous when choosing $t_{\min}$; the time scale given by the decay of the linear autocorrelation function is often not sufficient. A useful tool to determine the decay of nonlinear correlations is the space-time separation plot introduced by Provenzale et al. [120], see Section 4, Eq. (37) below. The effect of failure to exclude serially correlated pairs from the correlation sum can be seen by comparing Fig. 5 of the present section and Fig. 9 of Section 4.

It should be remarked that the literature might give the wrong impression that the sensitivity to serial correlations is a flaw specific to the Grassberger-Procaccia dimension algorithm. In fact, linear correlations and nonlinear determinism are sources of predictability which are detected by any algorithm that does not explicitly exclude the structure imposed by one of these sources. This is the reason why also, for example, prediction errors or false nearest-neighbour techniques have to be augmented by a comparison to linearly correlated random surrogates in tests for nonlinearity, unless similar corrections are carried out as for the correlation dimension. For the false nearest-neighbour approach [121] this has been pointed out for example in Ref. [122].

If one wants to estimate a correlation dimension, plotting $C_2(\varepsilon)$ in a log-log plot is not always the best thing to do since deviations from the desired power law scaling do not appear very pronounced in this representation. A better method might be to plot the local slopes of the log-log plot,

$$ D_2(\varepsilon) = \frac{d \log C_2(\varepsilon)}{d \log \varepsilon} , \tag{31} $$

versus $\log \varepsilon$. These slopes can, for example, be obtained by a straight line fit over a small range of values of $\varepsilon$. Theiler [123] gives a maximum likelihood estimator of the Grassberger-Procaccia correlation dimension. Since maximum likelihood estimation of the correlation dimension goes back to Takens [109], such quantities are often referred to as Takens' estimator. The estimator is given by

$$ D_{ML}(\varepsilon) = \frac{C_2(\varepsilon)}{\int_0^{\varepsilon} \left( C_2(\varepsilon') / \varepsilon' \right) d\varepsilon'} . \tag{32} $$
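When the correlation sum in Eq. (32) is the empirical pair count, the integral can be done by hand: each pair at distance $r < \varepsilon$ contributes $\ln(\varepsilon / r)$, so that $D_{ML}(\varepsilon)$ equals the number of pairs closer than $\varepsilon$ divided by the sum of $\ln(\varepsilon / r)$ over those pairs. The sketch below, in plain Python, uses this closed form. The function names and the parameter `t_min` (pairs with index separation below `t_min` are left out, in the spirit of the correction described above) are illustrative assumptions, and the test points are synthetic.

```python
import math
import random


def pair_distances(points, t_min=1):
    """Euclidean distances of all index pairs with j - i >= t_min (t_min >= 1)."""
    n = len(points)
    return [math.dist(points[i], points[j])
            for i in range(n) for j in range(i + t_min, n)]


def takens_estimator(distances, eps):
    """D_ML(eps) of Eq. (32) evaluated on an empirical set of pair distances."""
    logs = [math.log(eps / r) for r in distances if 0.0 < r < eps]
    if not logs:
        raise ValueError("no pairs closer than eps")
    return len(logs) / sum(logs)


if __name__ == "__main__":
    rng = random.Random(0)
    # Independent points filling a unit square: the estimate should come out
    # slightly below 2 because of edge effects.
    square = [(rng.random(), rng.random()) for _ in range(400)]
    print(takens_estimator(pair_distances(square), 0.1))
```

Evaluating `takens_estimator` on a grid of `eps` values and several embedding dimensions gives the kind of curve family that is inspected for a scaling region.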
Fig. 5. Maximum likelihood estimator of the correlation dimension as a function of the cutoff length scale and the embedding dimension for an intracranial recording [131] of the neural electric potential in a human. No scaling region of approximately constant $D_{ML}$ can be found. The time series was provided by Lehnertz and coworkers.
This quantity can also be plotted against $\log \varepsilon$ for different values of the embedding dimension. Its advantage over $D_2(\varepsilon)$ is that it incorporates all the statistical information that is available below the length scale $\varepsilon$. It is however implied that the dominating contamination at the small length scales is given by the effect of the finite size of the data set. In many practical situations this is not quite the case since measurement errors destroy the self-similarity as well. The effect of noise on the correlation integral has been studied in a number of papers [108,124-127] which are reviewed and compared in Ref. [128]. Olofsen and coworkers [129] as well as Schouten and coworkers [130] derive maximum likelihood estimators of the correlation dimension for data which are contaminated with noise. However, the noise amplitude enters the analysis as an unknown parameter, which complicates the application of their results in practical situations.

If all precautions are taken and a dimension estimate is attempted on a complex data set, one should not be surprised if the result is negative in the sense that no scaling region and no proper saturation can be found. In fact, few real systems are sufficiently low-dimensional for this kind of analysis. As an example that illustrates this statement but which also shows that estimates of the correlation sum can be useful nevertheless, let us study an intracranial recording of the electric field in the brain of a human. The data and their analysis are described thoroughly in Elger and Lehnertz [131]. The data were taken in an epilepsy patient and one of the questions is whether individual seizures can be anticipated from these recordings a few minutes before they occur. I will report on this application in more detail in Section 4.4. Huge amounts of data are available since multiple channel recordings are taken at 173 Hz continuously over several days for pre-surgical screening purposes.
Not surprisingly, however, the data are quite nonstationary, which is in fact essential for seizure anticipation to be possible. If one wants to assign an attractor dimension to such a time series, one first has to select windows in time which are short enough so that the dynamics can be considered to be effectively stationary within each window. A reasonable tradeoff between approximate stationarity and time series length is obtained with windows of 30 s duration. Fig. 5 shows the result of an attempt to calculate the correlation dimension of such a segment. After low-pass filtering (40 Hz cutoff), the correlation sum $C_2(\varepsilon)$ has been computed with a delay of 5 sampling time units and embedding dimensions 1 to 30. Dynamical correlations have been excluded by setting the minimal temporal separation of neighbours to 50 samples. From this data, the maximum likelihood estimator of the correlation dimension, Eq. (32), is obtained as a function of the upper cutoff length scale $\varepsilon$ and plotted versus $\log \varepsilon$. Clearly, there are no scaling regions where $D_{ML}$ becomes independent of $\varepsilon$ and m. Thus, it may be concluded that a low-dimensional attractor is not a good model for this data set. However, it will be discussed in Section 4.4 below how the correlation sum, although not suitable for a dimension estimate, could be used for monitoring changes in the brain state between different segments of a long recording.

3.5. Non-invariant characterisation

So far, invariance of an observable has been emphasised as a requirement for the objective characterisation of time series data. However, we have also seen that estimation of truly invariant quantities is an ambitious goal that is worth pursuing only with sufficiently high data quality and for systems from the appropriate class. This does not imply that we have to give up a quantitative description in all other cases. An absolute, portable characterisation is not always indispensable.
Non-invariant characterisation of time series data can, for example, be useful for the comparative
study of multiple data sets. Of course, if we want to use an observable for comparisons that is not invariant under changes in the measurement procedure, we have to standardize the measurement procedure. Practical issues concerning the comparison of time series are discussed in Section 7. Let us here only discuss some nonlinear time series measures which are not invariant under coordinate changes but which have been used in the literature for various reasons, for example because of their robustness to noise and to small sample sizes. Both noise and limited numbers of data points constitute severe problems whenever properties at small length scales in phase space have to be probed. In the presence of noise, length scales below the noise amplitude cannot be accessed without an explicit noise reduction step. Also, on small length scales the discrepancy between the finite collection of points and an underlying probability distribution becomes most pronounced. The obvious way to get away from these problems is to use coarse-grained quantities which are defined on intermediate length scales. While giving up invariance, the statistical properties of these quantities are often favourable. The most extreme step in this direction is to encode the time series as a symbol string and then analyse this sequence of discrete values. This approach is often called symbolic dynamics, although in the dynamical systems literature this name is reserved to a symbolic description that results from an encoding according to a generating partition. (See, for example, [132—134].) In the latter case, the symbol string has the same entropy as the full time series and, in fact, the full trajectory of a dynamical system can be in principle recovered from the (bi-infinite) symbol sequence. The partitions which are commonly used for symbolic encodings of time series data are almost never generating in that sense. 
Since any further refinement of a partition that is generating preserves this property, one often replaces a generating partition by a very fine ad hoc partition. Upon further refinement, the partition becomes "approximately generating" to a higher and higher degree. However, if fine partitions are used, the main advantage of symbolic dynamics, to reduce the information in a signal to the essential part, is lost. To obtain a symbolic encoding, one can directly partition the measurements into a small number of classes by defining suitable thresholds. Alternatively, a more general partitioning can be defined after a phase space reconstruction step in multidimensional space. The ergodic properties of symbol sequences are traditionally studied much more than continuous state dynamical systems. References from the dynamical systems context are, for example, the works by Herzel [135], Ebeling and coworkers [136], Schürmann and Grassberger [137], and Hao [138]. Applications are discussed in Refs. [139,140] and many others. Symbolic encoding constitutes a severe selection among the available information. This may be a desirable property in cases with high noise levels.

At a moderate level of coarse graining, any finite length scale version of the invariant quantities discussed above can in principle be used for comparative purposes. Most popular seem to be intermediate length scale estimates of the correlation dimension and of the maximal Lyapunov exponent. One example of such an estimate is the maximum likelihood estimator given by Theiler [123], see Eq. (32). In general, its value depends on the upper cutoff length scale $\varepsilon$ and the embedding parameters. Another popular statistic based on the correlation integral goes back to a paper by Brock et al. [141] and is usually referred to as the BDS statistic. The authors of Refs. [141,142] make use of the fact that for a sequence of independent random numbers, $C_m(\varepsilon) = C_1(\varepsilon)^m$ holds, where m is the embedding dimension.
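The factorisation property for independent numbers can be checked numerically. The plain Python sketch below is not the BDS statistic itself but only the ratio of the two sides of the relation; it assumes unit delay and the maximum norm (for which the relation holds exactly in expectation for independent data), and the function names and test signals are hypothetical.

```python
import random


def corr_sum_max_norm(s, m, eps):
    """Correlation sum of m-dimensional unit-delay vectors, maximum norm."""
    n_vec = len(s) - (m - 1)
    close = 0
    for i in range(n_vec):
        for j in range(i + 1, n_vec):
            if all(abs(s[i + k] - s[j + k]) < eps for k in range(m)):
                close += 1
    return 2.0 * close / (n_vec * (n_vec - 1))


def factorisation_ratio(s, m, eps):
    """C_m(eps) / C_1(eps)**m; close to 1 for independent random numbers."""
    return corr_sum_max_norm(s, m, eps) / corr_sum_max_norm(s, 1, eps) ** m


if __name__ == "__main__":
    rng = random.Random(1)
    iid = [rng.random() for _ in range(500)]
    x, chaotic = 0.3, []
    for _ in range(500):                 # logistic map as a deterministic signal
        x = 4.0 * x * (1.0 - x)
        chaotic.append(x)
    print("independent numbers:", factorisation_ratio(iid, 2, 0.1))
    print("logistic map:       ", factorisation_ratio(chaotic, 2, 0.1))
```

For the deterministic signal the ratio comes out well above one, because successive values are strongly dependent. Serially correlated but linear signals would also violate the relation, so a ratio different from one is not by itself evidence for nonlinearity.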
In these papers, a formal test for this property is also introduced. The original BDS statistic is specifically designed so that one can derive
the asymptotic distribution analytically. A simpler expression that contains the same information is $C_m(\epsilon)/C_1(\epsilon)^m$. A whole class of measures for nonlinearity is given by constructive measures of predictability that use a specific modelling approach to make forecasts of time series. If an estimate of the dynamics $\hat f$ has been produced (see Section 3.3), one can define an average prediction error for example by
$$ e = \sqrt{\frac{1}{N-1-(m-1)\tau} \sum_{n=(m-1)\tau+1}^{N-1} \left(s_{n+1} - \hat f(\mathbf{s}_n)\right)^2} \, . $$ (33)
This, or some differently averaged error of prediction, is then used as an indicator for the unpredictability of the signal. If such an interpretation is put forth, it is essential to use some cross-validation technique to ensure that $e$ is an out-of-sample error. An out-of-sample error is obtained if the data set that is used for the estimation of $e$ is a different one from that used to fit $\hat f$. See also Section 3.3. One of the simplest nonlinear predictive models is the locally constant approximation given by Eq. (27). Since it does not involve the numerical optimisation of parameters, the danger of overfitting is rather small. Of course, the predictions obtained are often not optimal but on the other hand statistically quite stable. A similar nonparametric prediction scheme has been used by Sugihara et al. [143] and by Kennel and Isabelle [85]. Barahona and Poon [144] (among others) have used a global polynomial model (a Volterra series) for nonlinearity testing. In Ref. [145], a number of popular measures for nonlinearity are compared quantitatively for the task of discriminating noisy chaotic data from randomised surrogates with the same linear properties. The finding there was that in this particular setting at the edge of detectability, the most stable statistics outperform more subtle measures. A number of test statistics for the detection and quantification of nonlinearity have been used in the literature which, while not explicitly called prediction errors, can be seen as specific ways to quantify nonlinear predictability in the sense used here. Among these, the test statistic proposed by Kaplan and Glass [146] is particularly suited to quantify deterministic structure in densely sampled data which permit the estimation of local flow vectors. Also the technique of false nearest neighbours advocated by Kennel et al. [121] can be regarded in this way.
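An out-of-sample error for a locally constant predictor might be computed as sketched below. The split into a fitting half and a test half, the logistic map test signal, the maximum norm and all function names are assumptions made for this illustration; the averaging follows the root-mean-square form of a prediction error.

```python
import numpy as np


def delay_embed(series, m, tau=1):
    """Delay vectors (s_n, s_{n+tau}, ..., s_{n+(m-1)tau}) as rows."""
    n = len(series) - (m - 1) * tau
    return np.column_stack([series[i * tau:i * tau + n] for i in range(m)])


def locally_constant_forecast(train, test, m, tau, eps):
    """One-step forecasts for `test` using eps-neighbours found only in `train`."""
    lib = delay_embed(train[:-1], m, tau)      # library vectors with known successor
    lib_next = train[(m - 1) * tau + 1:]       # successor of each library vector
    queries = delay_embed(test[:-1], m, tau)
    truth = test[(m - 1) * tau + 1:]
    pred = np.empty(len(queries))
    for i, q in enumerate(queries):
        close = np.max(np.abs(lib - q), axis=1) < eps
        # fall back to the library mean when no neighbour is found
        pred[i] = lib_next[close].mean() if close.any() else lib_next.mean()
    return pred, truth


def rms_error(pred, truth):
    """Root-mean-square prediction error."""
    return float(np.sqrt(np.mean((pred - truth) ** 2)))


# Illustrative data (an assumption): logistic map, first half to fit, second half to test.
s = np.empty(4000)
s[0] = 0.3
for n in range(len(s) - 1):
    s[n + 1] = 4.0 * s[n] * (1.0 - s[n])
train, test = s[:2000], s[2000:]

pred, truth = locally_constant_forecast(train, test, m=1, tau=1, eps=0.01)
e_out = rms_error(pred, truth)
relative_error = e_out / np.std(test)          # compare with predicting the mean
print(e_out, relative_error)
```

Because the neighbours are searched only in `train`, no test point contributes to its own forecast, which is the sense in which the error is out-of-sample.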
Pompe [147] and Paluš [148,149] advocate the use of coarse-grained redundancies, generalisations of the time-delayed mutual information. Prichard and Theiler [58] (among others) have pointed out that it can be computationally advantageous to estimate information theoretic quantities like redundancy and mutual information by their second order generalisations. The latter can be obtained from correlation integrals, thus avoiding the common problems with box-counting approaches. Correlation integrals are also much easier to compute than the adaptive partitionings used, for example, by Fraser and Swinney [82]. The drawback of using second-order quantities is that generalised entropies lack the additivity property. The generalised mutual information is therefore no longer positive definite. This is however unproblematic as long as it is used only as a relative measure. With measured data we will never be able to carry out the proper limit of small length scales. As a rule, the necessary coarse graining leads to a loss of invariance properties. There is one notable exception to this rule. Unstable periodic orbits embedded in a strange attractor define
a family of invariant quantities which are accessible at finite length scales. In particular, the existence, length, and stability of each orbit are such invariants. Consequently, many people have pursued the analysis of unstable periodic orbits from time series. Periodic orbit expansions [150–153] are of great theoretical appeal but they require knowledge of the dynamical system or data of exceptional quality for useful results. See Ref. [154] for a review. Similar data requirements are valid for the topological analysis of time series and the extraction of templates [155]. Recently, the emergence of methods for the stabilisation of chaotic systems [5,156] has attracted renewed interest in the detection of unstable fixed points or low order periodic orbits [157,158]. Unstable fixed points have been found and stabilised in a number of real world systems. Controllability with methods from chaos theory has often been taken as an indication of the presence of chaos in these systems. However, Christini and Collins [159] have shown that stochastic, nonchaotic systems can also be successfully controlled by such methods. If the detection and analysis of unstable fixed points is used as a means to detect nonlinearity and chaos, the same issues of significance and possible spurious results have to be considered as for other quantifiers of nonlinearity.
3.6. Measures of dissimilarity
The idea to use relative measures between time series or segments of a long sequence for signal classification and nonstationarity testing has been brought up independently in a number of recent publications [62–66]. In principle, it is desirable to use relative measures that can be interpreted as a distance or a dissimilarity. As we have seen in Section 2.4, one such measure can be derived from the cross-correlation integral, Eq. (18). For the practical estimation of the cross-correlation integral, refer to what has been said about the correlation sum (Section 3.4.2).
There are at least two other ways to construct an informal measure of dissimilarity from an estimator $C_{XY}(\epsilon)$ of the cross-correlation integral $C(\epsilon; \mu, \nu)$. (This notation implies that $\mu(x)$ (resp. $\nu(y)$) are the probability distributions of the random variables $X$ and $Y$, respectively.) Kantz [73] defines an informal distance between attractors by the minimal length scale $\epsilon_0$ above which the attractors are indistinguishable up to an accuracy $\delta$:
$$ \max\left( |\log C_{XX}(\epsilon) - \log C_{XY}(\epsilon)|, \; |\log C_{YY}(\epsilon) - \log C_{XY}(\epsilon)| \right) < \delta \quad \forall \epsilon > \epsilon_0 \, . $$ (34)
This approach is particularly useful when comparing clean model attractors and noisy measurements. In that case the length scale at which the two attractors start to differ indicates the noise level. Another possible definition of a dissimilarity based on the cross-correlation integral is $1 - C_{XY}(\epsilon)/\sqrt{C_{XX}(\epsilon) C_{YY}(\epsilon)}$. Further, Albano et al. [160] use a Kolmogorov–Smirnov test to detect dissimilarity of two correlation integrals.
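A normalised dissimilarity built from cross-correlation sums can be sketched as follows. The sketch assumes that the two data sets are already given as arrays of phase space vectors, uses the maximum norm, and keeps all pairs including self-pairs so that a set has zero dissimilarity to itself; for real time series, pairs that are close in time would have to be excluded. All names and the Gaussian test data are assumptions.

```python
import numpy as np


def cross_correlation_sum(x, y, eps):
    """Fraction of pairs (x_i, y_j) closer than eps (maximum norm, all pairs kept)."""
    d = np.max(np.abs(x[:, None, :] - y[None, :, :]), axis=2)
    return float(np.mean(d < eps))


def dissimilarity(x, y, eps):
    """1 - C_XY / sqrt(C_XX * C_YY) estimated from finite samples."""
    cxy = cross_correlation_sum(x, y, eps)
    cxx = cross_correlation_sum(x, x, eps)
    cyy = cross_correlation_sum(y, y, eps)
    return 1.0 - cxy / np.sqrt(cxx * cyy)


# Illustrative point sets (an assumption): two samples of one distribution, one shifted sample.
rng = np.random.default_rng(1)
x = rng.normal(size=(1000, 2))
y = rng.normal(size=(1000, 2))
z = rng.normal(size=(1000, 2)) + 3.0

d_same = dissimilarity(x, y, 0.5)   # small: same underlying distribution
d_diff = dissimilarity(x, z, 0.5)   # close to 1: hardly any close cross pairs
print(d_same, d_diff)
```

The value depends on the length scale `eps`, so comparisons are only meaningful at a fixed, standardised scale.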
Confusingly, the purely topological properties which one would expect to be invariants in the first place, are not in general. The topological length of a cycle in a flow system depends on the choice of Poincaré section and winding numbers, etc., are only invariant under families of transformations where the family depends smoothly on its parameters. For example, the knot structure changes under reflection. Reflection is smooth but cannot be connected with the identity by a smooth family of transformations.
The different measures of dissimilarity based on predictive models (see Section 2.4) have been introduced in a more ad hoc way. They have less of a theoretical foundation and weaker invariance properties than cross-correlation integrals. In practical work however, this is often the price to be paid for statistical robustness and modest data requirements. Locally constant phase space predictors can yield stable results with a few hundred points and global polynomial or radial basis function models with even less. It should be stressed here that it is not essential for the present purpose that the predictions are optimal in the usual sense, as long as the predictions are sensitive to differences in the dynamics. In other words, for comparative purposes it may be advantageous to trade a possible bias for a lower variance.
4. Nonstationarity
Almost all methods of time series analysis, traditional linear, or nonlinear, require some kind of stationarity. Therefore, changes in the dynamics during the measurement usually constitute an undesired complication of the analysis. There are however situations where such changes represent the most interesting structure in the recording. For example, electro-encephalographic (EEG) recordings are often taken with the main purpose of identifying changes in the dynamical state of the brain. Such changes occur e.g. between different sleep stages, or between epileptic seizures and normal brain activity. In the past, emphasis has been put on the question of how stationarity can be established. If nonstationarity was detected, often the time series was discarded as unsuitable for a detailed analysis, or it was split into segments that were short enough to be regarded as stationary. More recently, authors have begun to exploit the information contained in time-variable dynamics as an essential part of the underlying process. Thus, this section will discuss tests for stationarity but also report on the steps that have been taken towards a time resolved study of nonstationary signals. The most common definition of a stationary process found in textbooks (often called strong stationarity) is that all conditional probabilities are constant in time. Note that this definition is only applicable to the abstract generating process, and not to a realisation that produces a time series. If we regard a deterministic system as the limiting case of a stochastic process where the conditional probability density for a transition from state $x$ to state $x'$ is given by $\delta(x' - f(x))$, the definition requires $f(\cdot)$ to be unchanged with time. In the study of time series, the transition probabilities are unknown and have to be estimated from the data, subject to statistical fluctuations.
In some cases, for example in intermittent systems, these fluctuations are large and the properties of measured time series can change dramatically, even though the underlying process is formally stationary according to the above definition. There is no agreement on a definition of stationarity for time series. It seems reasonable to require that the duration of the measurement is long compared to the time scales of the systems. If this is the case, all temporal changes can be modelled as part of the dynamics. For this reason, processes with power law correlations are often considered nonstationary since no length of measurement could ever cover all time scales. On the other hand, processes with very well separated time scales can lead to time series which are stationary for practical purposes. The heart beat of a resting person is often homogeneous over several minutes. Longer recordings, however, cover new elements due to slower biological cycles. Since the common 24 h ECG recordings cover just a single cycle of the
circadian rhythm, they are more problematic with respect to stationarity than shorter or longer sequences.
4.1. Moving windows
A number of statistical tests for stationarity in a time series have been proposed in the literature. Most of the tests I am aware of are based on ideas similar to the following: Estimate a certain parameter using different parts of the sequence. If the observed variations are found to be significant, that is, outside the expected statistical fluctuations, the time series is regarded as nonstationary. In many applications of linear (frequency-based) time series analysis, stationarity has to be valid only up to the second moments (“weak stationarity”). Then, the obvious approach is to test for changes in quantities up to second order, like the mean, the variance, or the power spectrum. See e.g. [14] and references therein. In a nonlinear dynamical framework, weak stationarity is not an interesting property. Quite often, the linear properties of the processes do not carry much information anyway. It is therefore desirable to use some nonlinear quantifier in order to trace nonstationarity. In particular, Isliker and Kurths [161] use a binned probability distribution. The method proposed there, however, suffers from a problem that arises with most nonlinear quantifiers. Unless quite narrow assumptions are made, the probability distribution of these quantities is not known exactly. Therefore, we cannot usually assess the significance of changes in these quantities in a rigorous way. Also, a signal might be considered stationary for some purpose, but not for another. A typical case is dimension estimation which requires stationarity in the probability of close recurrences. The authors of Ref. [161] use a $\chi^2$-test for the difference of histograms on sections of the data. This test, however, assumes that the histogram is formed by independent draws from some probability distribution.
In the presence of serial correlations or deterministic structure, this is usually not justified. A possible remedy is to exclude points close in time from the histograms, thereby however losing statistical stability. Computing nonlinear indicators for moving windows of data is attractive because it allows for a time resolved study of possible changes. As we have seen, however, there is a tradeoff between time resolution and statistical accuracy. A different way to proceed therefore is to completely give up the detailed time information and concentrate on testing the null hypothesis that the sequence is stationary. Let us remark that stationarity is an awkward concept to test for. What we would like to have is the assertion that a given time series is stationary. The failure of some test to reject the hypothesis of stationarity is not sufficient — the test might just have no power against the particular kind of nonstationarity present. Quite generally, a statistical test can never prove the null hypothesis. Thus, we would rather like to test against the null hypothesis that the data is non-stationary. Unfortunately, this is such a hopelessly composite hypothesis that we do not know how to devise a statistical test for it. Therefore, formal tests against stationarity, like the one set up by Kennel [162] have to be understood with a particular alternative hypothesis in mind. The alternative in Ref. [162] is that the phase space geometry of the time series, reflected by the nearest-neighbour structure, is changing in time. The basic idea is that the expectation value of the number of reconstructed phase space points that have their nearest neighbour in the same half of the sequence is minimal for a stationary sequence. When thinking of geometry in phase space, nonstationarity introduces a tendency that points close in space are also close in time.
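The geometric idea in the last paragraph, that nonstationarity makes points close in space also close in time, can be illustrated with a very reduced statistic. This is not the test statistic of Ref. [162], only a sketch of the underlying counting; the function name, the `min_sep` parameter, the maximum norm and the test signals are assumptions.

```python
import numpy as np


def same_half_fraction(vectors, min_sep=1):
    """Fraction of points whose nearest neighbour lies in the same half of the record."""
    n = len(vectors)
    d = np.max(np.abs(vectors[:, None, :] - vectors[None, :, :]), axis=2)
    idx = np.arange(n)
    # drop self-matches and pairs closer in time than min_sep
    d[np.abs(idx[:, None] - idx[None, :]) < min_sep] = np.inf
    nearest = np.argmin(d, axis=1)
    first_half = idx < n // 2
    return float(np.mean(first_half == first_half[nearest]))


rng = np.random.default_rng(3)
stationary = rng.normal(size=1000)
shifted = stationary.copy()
shifted[500:] += 5.0                      # abrupt change of the mean half way through

f_stat = same_half_fraction(stationary[:, None])
f_shift = same_half_fraction(shifted[:, None])
print(f_stat, f_shift)                    # about 1/2 versus close to 1
```

For serially correlated data `min_sep` would have to exceed the correlation time, since temporal neighbours are trivially close in space as well.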
4.2. Recurrence plots
The relation between closeness in time and in phase space is the most relevant manifestation of nonstationarity in a nonlinear dynamical setting. The basic graphical tool that evaluates temporal and phase space distance of states is the classical recurrence plot of Eckmann et al. [163]. In its original version, a pair of times $n_1, n_2$ is called a recurrence if $\mathbf{s}_{n_2}$ is one of the $k$ nearest neighbours of $\mathbf{s}_{n_1}$, for some predefined value of $k$. An alternative given by Koebbe [164] is to define a recurrence to occur at times $n_1, n_2$ at resolution $\epsilon$ if $n_1 \neq n_2$ and $\|\mathbf{s}_{n_1} - \mathbf{s}_{n_2}\| \le \epsilon$. Usually, the $\mathbf{s}_n$ are delay embedding vectors and the results depend on the embedding parameters. A recurrence plot is generated by marking all recurrences at a given neighbour order $k$ or resolution $\epsilon$ in a graph with coordinates $n_1$ and $n_2$. In the second form, a recurrence plot can be simply identified with the expression
$$ r_\epsilon(n_1, n_2) = (1 - \delta_{n_1 n_2}) \, K_\epsilon(\|\mathbf{s}_{n_1} - \mathbf{s}_{n_2}\|) \, , $$ (35)
where the kernel function is usually taken to be the Heaviside step function $K_\epsilon(r) = \Theta(\epsilon - r)$. The full recurrence structure is contained in the recurrence matrix, which is simply defined by $R_{n_1 n_2} = \|\mathbf{s}_{n_1} - \mathbf{s}_{n_2}\|$. Obviously, $R$ is invariant under isometries (translations, rotations, and reflections) in phase space. McGuire and coworkers [165] give an algorithm to explicitly reconstruct an attractor up to isometries from a recurrence matrix. Of course, since a symmetric $N \times N$ matrix with vanishing diagonal has $N(N-1)/2$ independent entries, a recurrence matrix is not a very economical representation of $N$ vectors, and the requirement that the entries are distances in a space of dimension $m$ poses a strong constraint. Since recurrence plots are rather difficult to read they have not gained much popularity beyond the admiration of the intriguing patterns they exhibit [166]. Zbilut and coworkers [167] propose different parameters for the statistical quantification of recurrence plots but they give few clues on how to interpret these numbers. Nevertheless, the recurrence plot can be a useful starting point for the analysis of nonstationary sequences if the relevant information is extracted in a suitable way. The most detailed account of these techniques has been given by Casdagli [168], where also the interrelations to other methods are discussed thoroughly. Let us use a very simple nonstationary dynamical system as an illustration in the following. Consider a one-parameter family of sawtooth maps $[0,1] \to [0,1]$:
$$ x_{n+1} = f_{\phi_n}(x_n) = f(x_n + \phi_n \bmod 1) \, , \qquad f(x) = \begin{cases} 2x \, , & x < 1/2 \, , \\ 2 - 2x \, , & 1/2 \le x < 1 \, . \end{cases} $$ (36)
Take a time series of length $N = 20\,000$ and let the parameter $\phi_n$ vary with time such that it covers two oscillations of a damped sine function within the measurement period: $\phi_n = (1 + e^{-n/N} \sin 4\pi n/N)/2$. Recurrence plots of (a) the time series for two-dimensional embeddings, $\epsilon = 0.1$ and $0.01$, for a three-dimensional embedding, $\epsilon = 0.005$, and (b) for the parameter $\phi_n$ at $\epsilon = 0.001$ are shown in Fig. 6. Casdagli [168] has pointed out that for a faithful embedding and in the limit $\epsilon \to 0$, $N \to \infty$, the recurrence plot of the time series from a system approaches that of the varying parameter. For the above example, this can be verified from Fig. 6.
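The example system and its recurrence plots can be reproduced in outline as follows. This is a scaled-down sketch, not the computation behind Fig. 6: the series length, the resolutions, the maximum norm, the initial condition and all names (including the array `phi` for the drifting parameter) are assumptions chosen to keep the distance matrix small.

```python
import numpy as np


def drifting_tent_series(n_points, x0=0.3):
    """Tent map with a shift parameter that follows a damped sine (two oscillations)."""
    n = np.arange(n_points)
    phi = (1.0 + np.exp(-n / n_points) * np.sin(4.0 * np.pi * n / n_points)) / 2.0
    x = np.empty(n_points)
    x[0] = x0
    for k in range(n_points - 1):
        u = (x[k] + phi[k]) % 1.0
        x[k + 1] = 2.0 * u if u < 0.5 else 2.0 - 2.0 * u
    return x, phi


def delay_embed(series, m, tau=1):
    """Delay vectors (s_n, s_{n+tau}, ..., s_{n+(m-1)tau}) as rows."""
    n = len(series) - (m - 1) * tau
    return np.column_stack([series[i * tau:i * tau + n] for i in range(m)])


def recurrence_matrix(vectors, eps):
    """Boolean matrix of recurrences at resolution eps, diagonal excluded."""
    d = np.max(np.abs(vectors[:, None, :] - vectors[None, :, :]), axis=2)
    r = d <= eps
    np.fill_diagonal(r, False)
    return r


x, phi = drifting_tent_series(1000)
r_series = recurrence_matrix(delay_embed(x, 2), eps=0.05)   # signal, two-dimensional embedding
r_param = recurrence_matrix(phi[:, None], eps=0.01)         # parameter sequence itself
print(r_series.mean(), r_param.mean())                      # fractions of marked pairs
```

Plotting the two boolean matrices side by side (for example with an image plot) gives a small-scale analogue of the comparison between signal and parameter recurrences.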
Fig. 6. Recurrence plots of a tent map time series subject to a parameter drift (see text for details). The left panel was obtained with two-dimensional embeddings. Above the diagonal: $\epsilon = 0.01$ (50% of all points chosen at random for better contrast). Below the diagonal: $\epsilon = 0.1$ (0.2% shown). In the right panel, embedding was done in three dimensions. Above the diagonal: $\epsilon = 0.005$. Below the diagonal, the recurrence plot of the parameter sequence $\{\phi_n\}$ is shown with $\epsilon = 0.001$ (2% shown). For decreasing $\epsilon$, the recurrence plots of the signal indeed converge to that of the parameter sequence (right panel, below the diagonal).
Let us note that the total average of $r_\epsilon(n_1, n_2)$ equals the sample correlation integral $C(\epsilon)$ in Eq. (30). However, in the practical estimation of $d$ one has to exclude terms with $|n_1 - n_2| < t_{\min}$. One way to estimate the correlation time $t_{\min}$ is the following. The partial average
$$ C(\epsilon, \Delta n) = \frac{1}{N - \Delta n} \sum_{n=\Delta n+1}^{N} r_\epsilon(n, n - \Delta n) $$ (37)
yields the space–time separation plot introduced by Provenzale et al. [120]. Contour lines of $C(\epsilon, \Delta n)$ should not increase with $\Delta n$ except for possible oscillatory variation. The minimal $\Delta n$ for which this is the case yields a guideline for the minimal time separation $t_{\min}$ to be used in the correlation sum. Due to the temporal averaging, the space–time separation plot does not allow for a time-resolved analysis. Also, the effect of nonstationarity may average out in certain cases, as for example for the oscillating parameter in the tent map example. Let us remark that for the same reason also the time averaged statistic used by Kennel [162] fails to reject the null hypothesis of stationarity since the nearest neighbour of each point can have any temporal distance with about equal probability.
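The partial average over pairs at a fixed time separation is straightforward to compute. The sketch below assumes a step-function kernel, the maximum norm and, as illustrative data, a strongly correlated linear autoregressive process; the function name and all parameter values are assumptions.

```python
import numpy as np


def space_time_separation(vectors, eps, max_dn):
    """C(eps, dn) for dn = 1..max_dn: fraction of pairs at time lag dn closer than eps."""
    c = np.empty(max_dn)
    for dn in range(1, max_dn + 1):
        d = np.max(np.abs(vectors[dn:] - vectors[:-dn]), axis=1)
        c[dn - 1] = np.mean(d <= eps)          # average over the N - dn available pairs
    return c


# Illustrative data (an assumption): AR(1) process with a long correlation time.
rng = np.random.default_rng(2)
x = np.zeros(5000)
for n in range(len(x) - 1):
    x[n + 1] = 0.95 * x[n] + rng.normal()

c = space_time_separation(x[:, None], eps=0.5, max_dn=100)
print(c[0], c[-1])    # many close pairs at short lags, a lower plateau at long lags
```

The lag beyond which `c` stops decreasing gives a rough guide for the minimal time separation of pairs entering a correlation sum.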
4.3. Tracing parameter variation
A useful way to formulate nonstationary dynamics is by introducing a temporal variation of dynamical parameters into the system, as it was already done in the tent map example. If this variation is sufficiently slow, recurrence plots and similar techniques can assess these changes to some extent. In Ref. [168], it is shown by a scaling argument that for a dynamical system with time varying parameters, the recurrence plot in the limit of small $\epsilon$, large $N$, and sufficient $m$ approaches the recurrence plot of the fluctuating parameter. This can be seen in Fig. 6 for the time varying tent map. However, it is in general difficult to extract the time variation of the parameter from its
recurrence plot. Nevertheless, qualitative information, like the number of fluctuating parameters and the time scales of their fluctuations, can often be inferred from such a plot. It can be useful to average the number of recurrences over windows in time, in particular, if there is a stochastic component in the dynamics of the system:
$$ C_{XY}(\epsilon, w, n_1, n_2) = \frac{1}{a} \sum_{i=1}^{w} \sum_{j=1}^{w} r_\epsilon(n_1 + i, n_2 + j) \, , $$ (38)
where $a = w^2 - \sum_{i=1}^{w} \sum_{j=1}^{w} \delta_{n_1+i,\,n_2+j}$ is the normalization which takes the varying number of diagonal recurrences into account. The quantity $C_{XY}(\epsilon, w, n_1, n_2)$ defined in Eq. (38) is just the cross-correlation integral (Section 2.4, Eq. (17)) between segments of length $w$ of $\{s_n\}$, starting at $n_1$ and $n_2$, that is, $X$ and $Y$ here are two segments of the same time series. The cross-correlation integral has been introduced and discussed as a measure for the distance between attractors by Kantz [73]. Relative measures between time series or segments of a long sequence for signal classification and nonstationarity testing have been discussed in Sections 2.4 and 3.6 above. There we also discussed the conditions for the quantity $C_{XY}(\epsilon, w, n_1, n_1) + C_{XY}(\epsilon, w, n_2, n_2) - 2 C_{XY}(\epsilon, w, n_1, n_2)$ to become a formal distance fulfilling the triangle inequality in the limit $\epsilon \to 0$ and $w \to \infty$. The latter limit causes problems with signals which can be considered to be stationary only over short times $w$, if at all. If we give up the formal requirement of a distance, we can alternatively use nonlinear cross-prediction errors. The average error of a locally constant predictor (see also Eq. (23)) can be written in a compact way in terms of recurrences:
$$ \gamma(\epsilon, w, n_1, n_2) = \sqrt{\sum_{i=1}^{w} \left(\hat s_{n_1+i+1} - s_{n_1+i+1}\right)^2} \, , $$ (39)
where the prediction $\hat s_{n_1+i+1}$ is given by the average over an $\epsilon$-neighbourhood,
$$ \hat s_{n_1+i+1} = \begin{cases} \dfrac{1}{a} \sum_{j=1}^{w} r_\epsilon(n_1 + i, n_2 + j) \, s_{n_2+j+1} \, , & a > 0 \, , \\[1ex] \dfrac{1}{w} \sum_{j=1}^{w} s_{n_2+j+1} \, , & a = 0 \, , \end{cases} $$ (40)
where $a = \sum_{j=1}^{w} r_\epsilon(n_1 + i, n_2 + j)$ is the number of neighbours of $\mathbf{s}_{n_1+i}$ closer than $\epsilon$. Fig. 7 shows $\gamma(\epsilon, w, n_1, n_2)$ for the modulated tent map series with $w = 1000$ and $n_1, n_2$ in steps of 500. The variation of the prediction error with the segment location $n_1, n_2$ is clearly visible. The information contained in Fig. 7 can be processed in a way that makes the time dependence of the parameter more clear. The quantities $\gamma(\epsilon, w, n_1, n_2)$ can be regarded as a dissimilarity matrix and treated by a cluster algorithm. (This technique will be discussed in more detail below in Section 7.) If the analysis is successful, the clusters are localised in parameter space and can be used to define coordinates in that space. The time-varying “distances” of each segment $i$ to the clusters $\nu$ ($D_i^{(\nu)}$, to be defined in Eq. (50) of Section 7.2 below) then reflect the time variation of the parameter(s). A successful example with two clusters and one parameter is shown in Fig. 8.
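A matrix of cross-prediction errors between segments can be sketched along the lines just described, for one-dimensional states. The sketch is scaled down relative to the example in the text, uses zero-based indices, and normalises the squared errors by the window length (a root-mean-square variant); these choices, the initial condition and all names are assumptions for the illustration.

```python
import numpy as np


def drifting_tent_series(n_points, x0=0.3):
    """Tent map with a shift parameter that follows a damped sine (two oscillations)."""
    n = np.arange(n_points)
    phi = (1.0 + np.exp(-n / n_points) * np.sin(4.0 * np.pi * n / n_points)) / 2.0
    x = np.empty(n_points)
    x[0] = x0
    for k in range(n_points - 1):
        u = (x[k] + phi[k]) % 1.0
        x[k + 1] = 2.0 * u if u < 0.5 else 2.0 - 2.0 * u
    return x, phi


def cross_prediction_error(series, n1, n2, w, eps):
    """RMS error when the segment at n1 is predicted from eps-neighbours in the segment at n2."""
    a = series[n1:n1 + w + 1]
    b = series[n2:n2 + w + 1]
    times_a = n1 + np.arange(w)
    times_b = n2 + np.arange(w)
    close = np.abs(a[:-1, None] - b[None, :-1]) <= eps
    close &= times_a[:, None] != times_b[None, :]      # identical times are not recurrences
    counts = close.sum(axis=1)
    sums = close.astype(float) @ b[1:]
    # average successor of the neighbours, or the segment mean if there are none
    pred = np.where(counts > 0, sums / np.maximum(counts, 1), b[1:].mean())
    return float(np.sqrt(np.mean((pred - a[1:]) ** 2)))


n_points, w, step = 4000, 200, 100
x, phi = drifting_tent_series(n_points)
starts = np.arange(0, n_points - w, step)
gamma = np.array([[cross_prediction_error(x, n1, n2, w, eps=0.072) for n2 in starts]
                  for n1 in starts])
print(gamma[4, 4], gamma[4, 14])   # similar versus clearly different parameter values
```

The array `gamma` can be passed on as a dissimilarity matrix; it is in general not symmetric, since predicting one segment from another is not the same as the reverse.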
Fig. 7. Cross-prediction errors for the model (40) obtained with locally constant prediction in one dimension versus location in time of segments. Local neighbourhoods were formed with a radius of one-quarter of the standard deviation of the sequence, $\epsilon = 0.072$. Time windows of length $w = 1000$ overlapping by 500 time steps were used.
Fig. 8. Left panel: Time dependence of the parameter $\phi_n$ in the tent map example. Right panel: The information contained in Fig. 7 was used to form two clusters of similar time series. For each segment, the distances $D_i^{(1)}, D_i^{(2)}$ to clusters (1) and (2) are computed. The difference $D_i^{(1)} - D_i^{(2)}$ is plotted for each segment versus time. Such a plot can reveal not only that there is a single changing parameter but also the form of its change. That the units in both panels are of comparable magnitude is purely coincidental. Clustering of time series is discussed in Section 7.2 below, where $D_i^{(\nu)}$ is also defined.
4.4. An application
Let us finish the discussion of nonstationary time series with an application that is currently studied with considerable effort in a number of research groups: the anticipation of epileptic seizure onset from intracranial recordings of neural potentials. Epileptic seizures manifest themselves in specific patterns in the neural electric field. While traditional electro-encephalograms (EEG) with electrodes placed on the surface of the scalp show such patterns when the epileptic activity has reached a cortical region that is sufficiently close to the surface, for a detailed study of focal epilepsy in deeper regions of the brain electrodes have to be implanted in the epileptogenic region. This is a common clinical technique in pre-surgical screening. The specific activity during seizures is usually so pronounced that it can be detected visually and also automatically.
Many authors have discussed the question if there is evidence for low-dimensional chaos and strange attractors in normal, or, more likely, in epileptic EEG data. References include [169–172] and many more; conclusions are controversial. More recently, the focus has shifted from the question of chaos versus noise to the quantification of changes in the EEG. Most authors agree that ictal (during seizures) and inter-ictal EEG can be distinguished with nonlinear, but also with spectral methods. References for the former are, for example, Refs. [65,173–176], and Ref. [177] for the latter.
A far more
challenging problem is to detect specific changes in the dynamics of the recordings just prior to the actual seizure. First of all, a reliable anticipation of a seizure several minutes ahead potentially allows for pharmacological or electrophysiological intervention. The insights into the mechanism that leads to the large scale pathological activity are of equally high interest. The problem however is very intricate. At any given time, the recorded neural activity is very rich and far from being understood. Although usually simultaneous recordings at several positions are available, it is not clear to what extent multivariate studies provide more insight at this stage [178]. Electrode spacings down to fractions of a millimeter are still much larger than typical coherence lengths. The dynamics is time variable and only part of this variability is specific for the generation of epileptic activity. It is not expected that the brain falls into a single typical state prior to a seizure but rather that the pre-seizure activity shows some characteristic yet variable behaviour. The task of time series analysis is to find and specify such features that allow for the detection of the critical state. Elger and Lehnertz [131] claim statistically significant positive evidence for seizure predictability several minutes ahead of seizure onset. The authors use a sliding window version of the correlation integral that has been customised for this particular purpose. Since the intracranial EEG signal is nonstationary even in episodes without epileptic activity, the window length should be short enough for the segment to be effectively stationary but long enough to yield stable results. Half-overlapping windows of 30 s duration were chosen at a sampling rate of 173 Hz. The variance of the data segments is time dependent which makes it difficult to choose a length scale for the determination of an effective scaling index. 
The dominating contribution to the variance found in pathological regions often arises from spikes occurring at irregular intervals. Although these spikes are characteristic for epileptogenic tissue, they seem not to be specific precursors of seizures and should therefore not be overemphasised in the analysis. In Ref. [131] this is achieved by selecting approximate scaling regions for each individual window. These are usually found at much smaller scales than the spikes. Pre-seizure behaviour is found to be accompanied by epochs of smaller effective scaling indices as compared to the standard behaviour found in segments that are spatially and/or temporally well separated from the seizure. Correlation dimension data for a time series recorded by Elger and Lehnertz has already been shown in Section 3.4.2 and it may surprise the reader that in Ref. [131] the authors do find small approximate scaling regions. This finding can be reproduced by limiting the correction for the dynamical correlations to the exclusion of pairs which are not more than three sampling intervals apart in time. For the determination of a proper dimension, or for the interpretation of the effect as a signature of a finite attractor dimension, this would be disastrous, but it is fully justified by the resulting discriminative power for the particular purpose at hand. Fig. 9 repeats the same as was done in Fig. 5 but with the limited correction indicated above. The left panel shows a data segment that was taken far away in time from any seizure. The right panel was calculated from the same data as Fig. 5, a segment taken about 10 min prior to the onset of an epileptic seizure. Indeed we see tiny “plateaus” not present in Fig. 5. Further, we see that there is a difference between the two segments in where the pseudo-scaling is found. It is such differences that have been studied systematically in Ref. [131]. The statistical material presented there is based on 16 patients and shows
The particular shape of the deflection that could have been interpreted as a plateau resembles what is typically seen for intermittent systems, see [179].
Fig. 9. Maximum likelihood estimator of the correlation dimension for an intracranial EEG recording [131] of an epilepsy patient. The effect of dynamically correlated pairs was only incompletely corrected. Left: a data set measured long before the next seizure. Right: a data set measured about 10 min prior to seizure onset. The right panel was produced with the same time series as Fig. 5.
that the pre-seizure states and normal epochs follow significantly different distributions of approximate scaling indices. Certainly, a number of ad hoc decisions have been made in devising the algorithm and it is not fully clear from Ref. [131] to what extent the same sample of patients has been used to optimise parameters. Further research will have to evaluate whether the differences are strong enough to make reliable out-of-sample predictions for individual patients. For clinical applicability, it will eventually be necessary to compute and interpret the relevant quantities in real time. The finding that the discriminative power declines upon full correction of the dynamical correlations indicates that it is not exactly phase space geometry that distinguishes the different states. It should however be stressed that the shown segments are not distinguishable by their autocorrelation functions or spectra. Thus, what is represented in Fig. 9 is a difference in the nonlinear dynamical correlation. This suggests that there may be nonlinear indicators which are more sensitive to changes in the dynamics and that might correlate even more strongly with the seizure onset.
5. Testing for nonlinearity
There are two distinct motivations to use a nonlinear approach when analysing time series data. It might be that the arsenal of linear methods has been exploited thoroughly but all the efforts left certain structures in the time series unaccounted for. It is also common that a system is known to include nonlinear components and therefore a linear description seems unsatisfactory in the first place. Such an argument is often heard for example in brain research: nobody expects the brain to be a linear device. In fact, there is ample evidence for nonlinearity in particular in small assemblies of neurons. Nevertheless, the latter reasoning is rather dangerous. That a system is known to contain nonlinear components does not prove that this nonlinearity is also reflected in a specific signal we measure from that system. In particular, we do not know if it is of any practical use to go beyond the linear approximation. After all, we do not want our data analysis to reflect our prejudice about the underlying system but to represent a fair account of the structures that are present in the data. Consequently, the application of nonlinear time series methods has to be justified by establishing nonlinearity in the time series data.
T. Schreiber / Physics Reports 308 (1999) 1—64
This section will discuss formal statistical tests for nonlinearity. First, a suitable null hypothesis for the underlying process will be formulated covering all Gaussian linear processes or a class that is somewhat wider. We will then attempt to reject this null hypothesis by comparing the value of a nonlinear parameter estimated on the data with its probability distribution under the null hypothesis. Since only exceptional cases allow for the exact or asymptotic derivation of this distribution unless strong additional assumptions are made, we have to estimate it by a Monte Carlo resampling technique. This procedure is known in the nonlinear time series literature as the method of surrogate data, see Refs. [104,180,181]. Thus we have to face a two-fold task. We have to find a nonlinear parameter that is able to actually detect an existing deviation of the data from a given null hypothesis and we have to provide an ensemble of randomised time series that accurately represents the null hypothesis.

5.1. Detecting weak nonlinearity

In the preceding sections, several quantities have been discussed that can be used to characterise nonlinear time series. For the purpose of nonlinearity testing we need such quantities that are particularly powerful in discriminating linear dynamics and weakly nonlinear signatures; strong nonlinearity is usually more easily detectable. Quite a number of such measures have been proposed and used in the literature. An important objective criterion that can be used to guide the preferred choice is the discrimination power of the resulting test. The power $\beta$ is defined as the probability that the null hypothesis is rejected when it is indeed false. It will obviously depend on how and how strongly the data actually deviates from the null hypothesis. Traditional measures of nonlinearity are derived from generalisations of the two-point autocovariance function or the power spectrum.
The use of higher-order cumulants and bi- and multi-spectra is discussed for example in Ref. [182]. One particularly useful third-order quantity is
$$\phi^{\mathrm{rev}}(\tau) = \frac{\sum_{n=\tau+1}^{N} (s_n - s_{n-\tau})^3}{\left[\sum_{n=\tau+1}^{N} (s_n - s_{n-\tau})^2\right]^{3/2}} , \tag{41}$$
since it measures the asymmetry of a series under time reversal. (Remember that the statistics of linear stochastic processes is always symmetric under time reversal. This can be most easily seen when the statistical properties are given by the power spectrum which contains no information about the direction of time.) Time reversibility as a criterion for discriminating time series is discussed in detail in Ref. [183]. When a nonlinearity test is performed with the question in mind if nonlinear deterministic modelling of the signal may be useful, it seems most appropriate to use a test statistic that is related to a nonlinear deterministic approach. We have to keep in mind however that a positive test result only indicates nonlinearity, not necessarily determinism. Since nonlinearity tests are usually performed on data sets which do not show unambiguous signatures of low-dimensional determinism (like clear scaling over several orders of magnitude), one cannot simply estimate one of the quantitative indicators of chaos, like dimension or Lyapunov exponent. The formal answer would almost always be that both are probably infinite. Still, some useful test statistics are at least inspired by these quantities. Usually, some effective value at a finite length scale has to be computed rather than attempting to take the proper limits. We can largely follow the discussion in
Section 3.5, considering the limiting case that the deterministic signature to be detected is probably weak. In that case the major limiting factor for the performance of a statistical indicator is its variance since possible differences between two samples may be hidden among the statistical fluctuations. In Ref. [145], a number of popular measures of nonlinearity are compared quantitatively. The results can be summarised by stating that in the presence of time-reversal asymmetry, the three-point autocorrelation (Eq. (41)) gives very reliable results. However, many nonlinear evolution equations produce little or no time-reversal asymmetry in the statistical properties of the signal. In these cases, simple measures like a prediction error of a locally constant phase space predictor performed best. It was found to be advantageous to choose embedding and other parameters in order to obtain a quantity that has a small spread of values for different realisations of the same process, even if at these parameters no valid embedding could be expected.

5.2. Surrogate data tests

All of the measures of nonlinearity discussed above have in common that their probability distribution on finite data sets is not known analytically. Some authors have tried to give error bars for measures like predictabilities (e.g. see [144]) or averages of pointwise dimensions (e.g. see [184]) based on the observation that these quantities are averages (or medians) of many individual terms, in which case the variance (or quartile points) of the individual values yield an error estimate. This reasoning is however only valid if the individual terms are independent, which is usually not the case for time series data. In fact, it is found empirically that nonlinearity measures often do not even follow a Gaussian distribution. Also the standard error given by Roulston [185] for the mutual information is not quite correct except for uniformly distributed data.
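As a brief illustration of the time-reversal statistic of Eq. (41) referred to in Section 5.1, the following is a minimal sketch in Python with NumPy. The function name `time_reversal_asymmetry` and the default lag are our own choices for illustration; the normalisation (sum of cubed lagged differences divided by the 3/2 power of the sum of squared differences) is an assumption based on our reading of Eq. (41).

```python
import numpy as np

def time_reversal_asymmetry(s, tau=1):
    """Sketch of the third-order statistic of Eq. (41) at lag tau.

    Assumed form: sum_n (s_n - s_{n-tau})^3 / [sum_n (s_n - s_{n-tau})^2]^(3/2),
    with n running over all indices for which both samples exist.
    """
    s = np.asarray(s, dtype=float)
    d = s[tau:] - s[:-tau]            # lagged differences s_n - s_{n-tau}
    return np.sum(d**3) / np.sum(d**2) ** 1.5

# Usage: a sawtooth (slow rise, sudden drop) is strongly asymmetric in time,
# and reversing the series flips the sign of the statistic.
sawtooth = np.tile([0.0, 1.0, 2.0, 3.0], 25)
forward = time_reversal_asymmetry(sawtooth)
backward = time_reversal_asymmetry(sawtooth[::-1])
```

On its own the value is not interpretable as evidence of nonlinearity; it only becomes a test when compared with the values obtained on surrogate series.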
While a smooth rescaling to uniformity would not do harm to his derivation, rescaling to exact uniformity is in general non-smooth and introduces a bias in the joint probabilities. In order to determine the distribution of a nonlinear statistic on realisations of the null hypothesis, it is therefore preferable to use a Monte Carlo resampling technique. Traditional bootstrap methods use explicit model equations that have to be extracted from the data. This typical realizations approach can be very powerful for the computation of confidence intervals, provided the model equations can be extracted successfully. As discussed by Theiler and Prichard [186], the alternative approach of constrained realizations is more suitable for the purpose of hypothesis testing we are interested in here. It avoids the fitting of model equations by directly imposing the desired structures onto the randomised time series. However, the choice of possible null hypotheses is limited by the difficulty of imposing arbitrary structures on otherwise random sequences. The following section will discuss a number of null hypotheses and algorithms to provide the adequately constrained realisations. The most general method to generate constrained randomisations of time series is described in Ref. [187]. The price for its accuracy and generality is its high computational cost.

5.2.1. How to make surrogate data

It is essential for the validity of the statistical test that the surrogate series are created properly. If they contain spurious differences from the measured data, these may be detected by the test and interpreted as signatures of nonlinearity. More formally, the size of a test is the actual probability that the null hypothesis is rejected although it is in fact true. For a valid test, the size $\alpha$ must not exceed the level of significance p. The correct size crucially depends on the way the surrogates are
generated. Let us discuss a hierarchy of null hypotheses and the issues that arise when creating the corresponding surrogate data. A simple case is the null hypothesis that the data consists of independent draws from a fixed probability distribution. Surrogate time series can be simply obtained by randomly shuffling the measured data. If we find significantly different serial correlations in the data and the shuffles, we can reject the hypothesis of independence. The next step would be to explain the structures found by linear two-point autocorrelations. A corresponding null hypothesis is that the data have been generated by some linear stochastic process with Gaussian increments. The most general univariate linear process is given by Eq. (3). The statistical test is complicated by the fact that we do not want to test against one particular linear process only (one specific choice of the $a_i$ and $b_i$), but against a whole class of processes. This is called a composite null hypothesis. The unknown values $a_i$ and $b_i$ are sometimes referred to as nuisance parameters. There are basically three directions we can take in this situation. First, we could try to make the discriminating statistic independent of the nuisance parameters. This approach has not been demonstrated to be viable for any but some very simple statistics. Second, we could determine which linear model is most likely realised in the data by a fit for the coefficients $a_i$ and $b_i$, and then test against the hypothesis that the data has been generated by this particular model. Surrogates are simply created by running the fitted model. This typical realisations approach is the common choice in the bootstrap literature, see e.g. the classical book by Efron [188]. The main drawback is that we cannot recover the true underlying process by any fit procedure.
Apart from problems associated with the choice of the correct model orders M and N, the data is by construction a very likely realisation of the fitted process. Other realisations will fluctuate around the data which induces a bias against the rejection of the null hypothesis. This issue is discussed thoroughly in Ref. [104], where also a calibration scheme is proposed. The most attractive approach to testing for a composite null hypothesis seems to be to create constrained realisations [186]. Here it is useful to think of the measurable properties of the time series rather than its underlying model equations. The null hypothesis of an underlying Gaussian linear stochastic process can also be formulated by stating that all structure to be found in a time series is exhausted by computing first- and second-order quantities, the mean, the variance and the autocovariance function. This means that a randomised sample can be obtained by creating sequences with the same second-order properties as the measured data, but which are otherwise random. When the linear properties are specified by the squared amplitudes of the Fourier transform (that is, the periodogram estimator of the power spectrum), surrogate time series $\{\bar{s}_n\}$ are readily created by multiplying the Fourier transform of the data by random phases and then transforming back to the time domain:
$$\bar{s}_n = \sum_{k=0}^{N-1} e^{i\alpha_k} \sqrt{P_k}\, e^{-i 2\pi k n/N} , \tag{42}$$
where $\sqrt{P_k}$ are the Fourier amplitudes of the data and $0 \le \alpha_k < 2\pi$ are independent uniform random numbers.
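The phase randomisation of Eq. (42) can be sketched in a few lines of Python with NumPy. This is an illustration under stated assumptions rather than a reference implementation: the function name `phase_randomised_surrogate` is hypothetical, the real-input FFT is used so that the result is automatically real valued, and the zero-frequency (and, for even length, Nyquist) coefficients are copied from the data so that the mean is preserved.

```python
import numpy as np

def phase_randomised_surrogate(s, rng=None):
    """Return a series with the same periodogram as s but random Fourier phases."""
    rng = np.random.default_rng(rng)
    s = np.asarray(s, dtype=float)
    n = s.size
    spectrum = np.fft.rfft(s)                      # one-sided transform of real data
    phases = rng.uniform(0.0, 2.0 * np.pi, spectrum.size)
    randomised = np.abs(spectrum) * np.exp(1j * phases)
    randomised[0] = spectrum[0]                    # keep the mean unchanged
    if n % 2 == 0:
        randomised[-1] = spectrum[-1]              # Nyquist term must stay real
    return np.fft.irfft(randomised, n=n)

# Usage: the surrogate differs from the data but has identical Fourier amplitudes.
data = np.cumsum(np.random.default_rng(0).standard_normal(512))
surrogate = phase_randomised_surrogate(data, rng=1)
```

Because the discrete Fourier transform treats the series as periodic, a mismatch between the first and last values of `data` (as in the random walk used here) enters the surrogate as spurious high-frequency content, which is the end-point problem discussed later in this section.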
Independence seems not to be an interesting null hypothesis for most time series problems. It becomes relevant when the residual errors of a time series model are evaluated. For example in the BDS test for nonlinearity [141], an ARMA model is fitted to the data. If the data are linear, then the residuals are expected to be independent.
The two null hypotheses discussed so far (independent random numbers and Gaussian linear processes) are not what we want to test against in most realistic situations. In particular, the most obvious deviation from the Gaussian linear process is usually that the data do not follow a Gaussian single time probability distribution. This is quite obvious for data obtained by measuring intervals between events, e.g. heart beats, since intervals are strictly positive. There is however a simple generalisation of the null hypothesis that explains deviations from the normal distribution by the action of a monotone, static measurement function:
$$s_n = s(x_n), \qquad x_n = \sum_{i=1}^{M} a_i x_{n-i} + \sum_{i=0}^{N} b_i \eta_{n-i} . \tag{43}$$
We want to regard a time series from such a process as essentially linear since the only nonlinearity is contained in the (in principle invertible) measurement function $s(\cdot)$. The most common method to create surrogate data sets for this null hypothesis essentially attempts to invert $s(\cdot)$ by rescaling the time series $\{s_n\}$ to conform with a Gaussian distribution. The rescaled version is then phase randomised (conserving Gaussianity on average) and the result is rescaled to the empirical distribution of $\{s_n\}$. These amplitude-adjusted Fourier transformed surrogates (AAFT) yield a correct test when N is large, the correlation in the data is not too strong and $s(\cdot)$ is close to the identity. It is argued in Ref. [189] that for short and strongly correlated sequences, the AAFT algorithm can yield an incorrect test since it introduces a bias towards a slightly flatter spectrum. In fact, the formal requirement the surrogates have to fulfill for the test to be correct is that they have the same sample periodogram and the same single time sample probability distribution as the data. Schreiber and Schmitz [189] propose a method which iteratively corrects deviations in spectrum and distribution.
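The three steps of the plain AAFT procedure described above (rescale to a Gaussian, phase randomise, rescale back to the empirical distribution) may be sketched as follows. This is a minimal illustration in Python with NumPy, not the iterative refinement of Ref. [189]; the function names are hypothetical, and the rescaling steps are realised by rank ordering, which is one common way to do it and is assumed here.

```python
import numpy as np

def _randomise_phases(x, rng):
    """Helper: keep Fourier amplitudes of x, draw new uniform phases."""
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, spectrum.size)
    randomised = np.abs(spectrum) * np.exp(1j * phases)
    randomised[0] = spectrum[0]                  # preserve the mean
    if x.size % 2 == 0:
        randomised[-1] = spectrum[-1]            # Nyquist term stays real
    return np.fft.irfft(randomised, n=x.size)

def aaft_surrogate(s, rng=None):
    """Amplitude-adjusted Fourier transform surrogate (sketch)."""
    rng = np.random.default_rng(rng)
    s = np.asarray(s, dtype=float)
    ranks = np.argsort(np.argsort(s))            # rank of every sample of s
    gaussian = np.sort(rng.standard_normal(s.size))
    rescaled = gaussian[ranks]                   # Gaussian values in the rank order of s
    shuffled = _randomise_phases(rescaled, rng)  # linear surrogate of the Gaussian series
    new_ranks = np.argsort(np.argsort(shuffled))
    return np.sort(s)[new_ranks]                 # original values in the new rank order

# Usage: strictly positive, non-Gaussian data keep their exact set of values.
data = np.exp(0.1 * np.cumsum(np.random.default_rng(0).standard_normal(300)))
surrogate = aaft_surrogate(data, rng=2)
```

By construction the surrogate is a permutation of the data, so the single time distribution is reproduced exactly, while the periodogram is matched only approximately. That residual mismatch is the bias that motivates the iterative correction.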
Alternately, the surrogate is filtered towards the correct Fourier amplitudes and rank-ordered to the correct distribution. The accuracy that can be reached depends on the size and structure of the data and is generally sufficient for hypothesis testing. The above schemes are all based on the Fourier amplitudes of the data, which is however not exactly what we want. Remember that the autocorrelation structure given by Eq. (6) corresponds to the Fourier amplitudes only if the time series is one period of a sequence that repeats itself every N time steps. Conserving the Fourier amplitudes of the data means that the periodic autocovariance function
$$C_p(\tau) = \frac{1}{N} \sum_{n=1}^{N} x_n x_{(n-\tau-1) \bmod N + 1} \tag{44}$$
is reproduced, rather than $C(\tau)$. The difference can lead to serious artefacts in the surrogates, and, consequently, spurious rejections in a test. In particular, any mismatch between the beginning and the end of a time series poses problems, as discussed e.g. in Ref. [190]. In spectral estimation, problems caused by edge effects are dealt with by windowing and zero padding. None of these have been successfully implemented for the phase randomisation of surrogates. The problem of nonmatching ends can be partly overcome by choosing a subinterval of the recording such that the end points do match approximately. Still, there may remain a finite phase slip at the matching points. The only method that has been proposed so far that strictly implements $C(\tau)$ rather than $C_p(\tau)$ is given in Ref. [187]. The method is very accurate but also rather costly in terms of computer time. In
practical situations, the matching of end points is a simple and mostly sufficient precaution that should not be neglected. Since the randomisation algorithm of Ref. [187] is of very general applicability and conceptually quite simple, let us give a brief description. In order to create randomised sequences with the correct distribution of values, only permutations of the original time series are considered. The shuffling is however carried out under the constraint that the autocovariances of the surrogate, $\tilde{C}(\tau)$, are the same as those of the data, $C(\tau)$. This is done by specifying the discrepancy as a cost function, e.g.
$$E^{(q)} = \left[ \sum_{\tau=0}^{N-1} \left| \tilde{C}(\tau) - C(\tau) \right|^{q} \right]^{1/q} . \tag{45}$$
The way the average over all lags $\tau$ is taken can be influenced by the choice of $q$. Now $E^{(q)}(\{\tilde{s}_n\})$ is minimised among all permutations $\{\tilde{s}_n\}$ of the original time series $\{s_n\}$ using the method of simulated annealing. Configurations are updated by exchanging pairs in $\{\tilde{s}_n\}$. With an appropriate cooling scheme, the annealing procedure can reach any desired accuracy. Simulated annealing has a rich literature; classical references are Metropolis et al. [191] and Kirkpatrick [192], more recent material can be found for example in [193]. Constrained randomisation using combinatorial minimisation is a very flexible method since in principle arbitrary constraints can be realised. Although it is seldom possible to specify a formal null hypothesis for more general constraints, it can be quite useful to be able to incorporate into the surrogates any feature of the data that is understood already or that is uninteresting. Let us give an example for the flexibility of the approach, a simultaneous recording of the breath rate and the instantaneous heart rate of a human subject during sleep. (Data set B of the Santa Fe Institute time series contest in 1991 [194], sample points 1800-4350.) Regarding the heart rate recording on its own, one easily detects nonlinearity, in particular via an asymmetry under time reversal. An interesting question however is how much of this structure can be explained by linear dependence on the breath rate, the breath rate also being non-time-reversible. In order to answer this question, we need to make surrogates that have the same autocorrelation structure but also the same cross-correlation with respect to the fixed input signal, the breath rate. Accordingly, the constraint is imposed that lags $0, \ldots, 500$ of the autocovariance function and lags $-500, \ldots, 500$ of the cross-covariance function with the reference (breath) signal are given by the data.
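To make the annealing scheme with the cost function of Eq. (45) more concrete, here is a deliberately simple sketch in Python with NumPy. Several choices are assumptions made only for illustration: the names `autocov`, `cost` and `annealed_surrogate`, the restriction to lags up to `max_lag`, the particular autocovariance estimator, the starting temperature and the exponential cooling factor. The cost is recomputed from scratch after every exchange, which is far slower than an incremental update would be.

```python
import numpy as np

def autocov(x, max_lag):
    """Autocovariance estimates for lags 0..max_lag (assumed estimator)."""
    x = x - x.mean()
    n = x.size
    return np.array([np.dot(x[tau:], x[:n - tau]) / (n - tau)
                     for tau in range(max_lag + 1)])

def cost(c_surrogate, c_data, q=2.0):
    """Discrepancy of Eq. (45), restricted to the lags supplied."""
    return np.sum(np.abs(c_surrogate - c_data) ** q) ** (1.0 / q)

def annealed_surrogate(s, max_lag=10, q=2.0, n_steps=20000, cooling=0.9995, rng=None):
    """Permute s so that its autocovariance approaches that of the data."""
    rng = np.random.default_rng(rng)
    s = np.asarray(s, dtype=float)
    target = autocov(s, max_lag)
    surrogate = rng.permutation(s)               # start from a random shuffle
    energy = cost(autocov(surrogate, max_lag), target, q)
    temperature = max(energy / s.size, 1e-12)    # heuristic starting temperature
    for _ in range(n_steps):
        i, j = rng.integers(0, s.size, size=2)
        surrogate[i], surrogate[j] = surrogate[j], surrogate[i]   # exchange a pair
        trial = cost(autocov(surrogate, max_lag), target, q)
        accept = trial <= energy or rng.random() < np.exp((energy - trial) / temperature)
        if accept:
            energy = trial
        else:
            surrogate[i], surrogate[j] = surrogate[j], surrogate[i]   # undo the exchange
        temperature *= cooling                   # exponential cooling
    return surrogate, energy
```

Additional constraints, such as a cross-covariance with a reference signal or fixed data points that must not be moved, would enter by adding terms to `cost` or by excluding indices from the exchange step.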
Further suppose that within the 20 min of observation, during one minute the equipment spuriously recorded a constant value. In order not to interpret this artefact as structure, the same artefact is imposed on the surrogates, simply by excluding these data points from the permutation scheme. Fig. 10 shows the measured breath rate (upper trace) and instantaneous heart rate (middle trace). The lower trace shows a surrogate conserving both, auto- and cross-correlations. The visual
Strictly speaking, these constraints overspecify the problem and it is likely that the only permutation that fulfills them exactly is the original time series itself. However, it can be expected that there are a large number of permutations which are essentially different and which fulfill the constraint almost exactly. In fact, it has been observed [195] for very short sequences of $N < 50$ points and strong correlations that the annealing scheme settled on the original data. If this seems to happen, one can introduce a term in the cost function that discourages similarity to the original permutation.
Fig. 10. Simultaneous measurements of breath and heart rates [194], upper and middle traces. Lower trace: a surrogate heart rate series preserving the autocorrelation structure and the cross-correlation to the fixed breath rate series, as well as a gap in the data. Auto- and cross-correlation together seem to explain some, but not all of the structure present in the heart rate series.
Fig. 11. Nonstationary financial time series (BUND Future returns, top) and a surrogate (bottom) preserving the nonstationary structure quantified by running window estimates of the local mean and variance (middle).
impression from Fig. 10 is that while the linear cross-correlation with the breath rate explains the cyclic structure of the heart rate data, other features, in particular the asymmetry under time reversal, remain unexplained. This finding can be verified at the 95% level of significance, using the time asymmetry statistic given in Eq. (41). Possible explanations include artefacts due to the peculiar way of deriving heart rate from inter-beat intervals, nonlinear coupling to the breath activity, nonlinearity in the cardiac system, and others. Let us finish the section by giving a more exotic example, from finance. The time series consists of 1500 daily returns (until the end of 1996) of the BUND Future, a derived German financial instrument. As can be seen in the upper panel of Fig. 11, the sequence is nonstationary in the sense that the local variance and also the local mean undergo changes on a time scale that is long compared to the fluctuations of the series itself. This property is known in the statistical
The data were kindly provided by Thomas Schürmann, WGZ-Bank Düsseldorf.
literature as heteroscedasticity and modeled by the so-called GARCH [196] and related models. Here, we want to avoid the construction of an explicit model from the data but rather ask the question if the data is compatible with the null hypothesis of a correlated linear stochastic process with time-dependent local mean and variance. We can answer this question in a statistical sense by creating surrogate time series that show the same linear correlations and the same time dependence of the running mean and running variance as the data and comparing a nonlinear statistic between data and surrogates. The lower panel in Fig. 11 shows a surrogate time series generated using the annealing method described above. The cost function was set up to match the autocorrelation function up to five days and the moving mean and variance in sliding windows of 100 days duration. In Fig. 11 the running mean and variance are shown as points and error bars, respectively, in the middle trace. The deviation of these between data and surrogate has been minimised to such a degree that it can no longer be resolved. A comparison of the time-asymmetry statistic Eq. (41) for the data and 19 surrogates did not reveal any discrepancy, and the null hypothesis could not be rejected.

5.3. What can be learned

Having set up all the ingredients for a statistical hypothesis test of nonlinearity, we may ask what we can learn from the outcome of such a test. The formal answer is of course that we have, or have not, rejected a specific hypothesis at a given level of significance. How interesting this information is, however, depends on the null hypothesis we have chosen. The test is most meaningful if the null hypothesis is plausible enough so that we are prepared to believe it in the absence of evidence against it. If this is not the case, we may be tempted to go beyond what is justified by the test in our interpretation. Take as a simple example a recording of hormone concentration in a human.
We can test for the null hypothesis of a stationary Gaussian linear random process by comparing the data to phase randomised Fourier surrogates. Without any test, we know that the hypothesis cannot be true since hormone concentration, unlike Gaussian variates, is strictly nonnegative. If we failed to reject the null hypothesis by a statistical argument, we will therefore go ahead and reject it anyway by common sense, and the test was pointless. If we did reject the null hypothesis by finding a coarse grained “dimension” which is significantly lower in the data than in the surrogates, the result formally does not give any new information but we might be tempted to speculate on the possible interpretation of the “nonlinearity” detected. This example is maybe too obvious; it was meant only to illustrate that the hypothesis we test against is often not what we would actually accept to be true. Other, less obvious and more common, examples include signals which are known (or found by inspection) to be nonstationary (which is not covered by most null hypotheses), or signals which are likely to be the squares of some fundamental quantity. An example of the latter is the celebrated sunspot numbers. Sunspot activity is generally connected with magnetic fields and is to first approximation proportional to the squared field strength. Obviously, sunspot numbers are nonnegative, but also the null hypothesis of a monotonically rescaled Gaussian linear random process is to be rejected since taking squares is not a monotonic operation. Unfortunately, the framework of surrogate data does not currently provide a method to test against null hypotheses involving noninvertible measurement functions. Yet another example is given by linearly filtered time series. Even if the null hypothesis of a monotonically rescaled Gaussian linear random process is true for the underlying signal, it is
usually not true for filtered copies of it, in particular sequences of first differences, see [197] for a discussion of this problem. Recent efforts on the generalisation of randomisation schemes try to broaden the repertoire of null hypotheses we can test against. The hope is that we can eventually choose one that is general enough to be acceptable if we fail to reject it with the methods we have. Still, we cannot prove that there is not any structure in the data beyond what is covered by the null hypothesis. From a practical point of view, however, there is not much of a difference between structure that is not there and structure that is undetectable with our observational means.

6. Nonlinear signal processing

The goals of time series analysis are probably as diverse as the methods. In basic research, the ultimate aim is a deeper understanding of some phenomenon in nature. In engineering, clinical research, finance, etc., a better understanding of the processes is also most welcome but the actual purpose of the work is different, making better devices, making people healthier, making money. Pursuing such a goal often involves a rather specific task of time series analysis. The interesting problem of signal classification will be dealt with in Section 7. One of the most well-known objectives is the prediction of future values of some quantity, for instance the price of an entity at the stock market. The prediction problem has been discussed in Section 3.3. It is almost identical to the problem of estimating the dynamics underlying a time series. An intermediate step in most time series studies is to filter the data in order to enhance the relevant information. Prediction and filtering, or noise reduction, have many things in common, but there are notable differences. In particular, for noise reduction it is not enough to have a description of the dynamics. One also has to have a means of finding a cleaner signal that is consistent with this dynamics.

6.1. Nonlinear noise reduction

Originally, phase space methods of nonlinear noise reduction have been developed [198-201] under the premise that there is a low-dimensional dynamical system which is only observed with some observational error that is to be suppressed. Conceptual as well as technical issues arising in such a situation have been well discussed in the literature, see Kostelich and Schreiber [202] for a review containing the relevant references. In interdisciplinary applications, we usually face a different situation: the signals themselves often contain a stochastic component. Before we apply a filtering technique we therefore have to specify what exactly we want to separate. Phase space projection techniques, like those employed by Grassberger and coworkers [201], rely on the assumption that the signal of interest is approximately described by a manifold that has a lower dimension than some phase space it is embedded in. This statement can be formalised as follows. Let $\{x_n\}$ be the states of the system at times $n = 1, \ldots, N$, represented in some space $\mathbb{R}^d$. A $(d-Q)$-dimensional submanifold $F$ of this space can be specified by $F_q(y) = 0$, $q = 1, \ldots, Q$. Even if $F$ is not known exactly, or if $\{x_n\}$ is corrupted by noise, we can always find $\{\epsilon_n\}$ such that $y_n = x_n + \epsilon_n$ and
$$F_q(x_n + \epsilon_n) = 0 , \quad \forall q, n . \tag{46}$$
Then $\sqrt{\langle \epsilon^2 \rangle}$ denotes the (root mean squared) average error we make by approximating the points $\{x_n\}$ by the manifold $F$. In a measurement we can only obtain noisy data $y_n = x_n + \eta_n$, where $\{\eta_n\}$ is
some random contamination. By projecting these values onto some estimated manifold $\hat{F}$ we may be able to recover $\hat{x}_n = x_n + \epsilon_n$. If we can find a suitable manifold, and carry out the projections, such that $\langle \epsilon^2 \rangle < \langle \eta^2 \rangle$, then we have reduced the observational error. For dynamical systems embedded in delay coordinate space there always exists a manifold for which $\epsilon_n \equiv 0$, but as we can see, noise reduction is possible as soon as the $\epsilon_n$ are smaller than the observational error. Of course, we will not only reduce the magnitude of the errors but also alter their structure. Therefore, we will have to be careful (for example by statistically analysing the corrections) when we are going to interpret the structure we find in the corrected data. Since the full, true phase space of a system is not usually accessible to time series measurements, phase space filtering has to make heavy use of time delay or related embedding techniques. Filtering implies that the signal is not pure but a mixture of several components to be separated, and the use of the embedding theorems for dynamical systems is limited. We will rather have to take a pragmatic attitude. Consult Section 3.1 for material and references on the embedding of finite noisy time series. In time series work, the most practical way to approximate data by a manifold is by a locally linear representation. It should in principle be possible to fit global nonlinear constraints $\hat{F}_q$ from data but the problem is complicated by the necessity to have $Q$ locally independent equations. In the locally linear case this is achieved by establishing local principal components. The derivation will not be repeated here, it is carried out for example in Refs. [7,203]. The resulting algorithm proceeds as follows. In an embedding space of dimension $m$ we form delay vectors $\mathbf{s}_n$. For each of these we construct small neighbourhoods $U_n$, so that the neighbouring points are $\mathbf{s}_k$, $k \in U_n$. Within each neighbourhood, we compute the local mean
$$\bar{\mathbf{s}}^{(n)} = \frac{1}{|U_n|} \sum_{k \in U_n} \mathbf{s}_k \tag{47}$$
and the $(m \times m)$ covariance matrix
$$C_{ij} = \frac{1}{|U_n|} \sum_{k \in U_n} (\mathbf{s}_k)_i (\mathbf{s}_k)_j - \bar{s}^{(n)}_i \bar{s}^{(n)}_j . \tag{48}$$
The eigenvectors of this matrix are the semi-axes of the best approximating ellipsoid of this cloud of points (these are local versions of the well-known principal components, or singular vectors, see for example Refs. [204,205]). If the clean data lives near a smooth manifold with dimension $m_0 < m$, and if the variance of the noise is sufficiently small for the linearisation to be valid, then for the noisy data the covariance matrix will have large eigenvalues spanning the smooth manifold and small eigenvalues in all other directions. Therefore, we move the vector under consideration towards the manifold by projecting onto the subspace spanned by the eigenvectors with large eigenvalues. The procedure is
It has been found advantageous [201] to introduce a diagonal weight matrix $R$ and define a transformed version of the covariance matrix $\Gamma_{ij} = R_{ii} C_{ij} R_{jj}$ for the calculation of the principal directions. In order to penalise corrections based on the first and last coordinates in the delay window one puts $R_{11} = R_{mm} = r$ where $r$ is large. The other values on the diagonal of $R$ are 1.
For this to be valid, the neighbourhoods should be larger than the noise level. In practice, a tradeoff between the clear definition of the noise directions and a good linear approximation has to be balanced.
illustrated in Fig. 12. The correction is done for each embedding vector, resulting in a set of corrected vectors in embedding space. Since each element of the scalar time series occurs in $m$ different embedding vectors, we finally have as many different suggested corrections, of which we simply take the average. Therefore, in embedding space the corrected vectors do not precisely lie on the local subspaces but are only moved towards them. As an application, Fig. 13 shows the result of the noise reduction scheme applied to a noisy ECG. As discussed already in Section 3.2, a delay coordinate embedding of an electrocardiogram seems to be well approximated by a lower-dimensional manifold. This is apparent already in a two-dimensional representation (Fig. 3), but for the purpose of noise reduction, embeddings in higher dimensions are advantageous. The data shown in Fig. 13 was produced with delay windows covering 200 ms, that is, $m = 50$ at a delay time of 4 ms (equal to the sampling interval). See Ref. [206] for more details on the nonlinear projective filtering of ECG signals. Applications of
Fig. 12. Illustration of the local projection scheme. For each point to be corrected, a neighbourhood is formed (grey shaded area), the point cloud in which is then approximated by an ellipsoid. An approximately two-dimensional manifold embedded in a three-dimensional space could, for example, be cleaned by projecting onto the first two principal directions.
Fig. 13. Nonlinear noise reduction applied to electrocardiogram data. Upper trace: original recording. Middle: the same contaminated with typical baseline noise. Lower: the same after nonlinear noise reduction. The enlargements on the right show that indeed clinically important features like the small downward deflection of the P-wave preceding the large QRS-complex (see for example Goldberger and Goldberger [210] for an introduction to electrocardiography) are recovered by the procedure. Note that the noise and the signal have very similar spectral contents and could thus not be separated by Fourier methods.
nonlinear noise reduction to chaotic laboratory data are given in Ref. [207]. It should be noted that, as it stands, nonlinear noise reduction is quite computer time intensive, in particular if compared to Fourier-based filters. For small and moderate noise levels, this can be mitigated by using fast neighbour search strategies; see for example Ref. [208] for a review set in the context of time series analysis. Recently, a fast version of nonlinear projective noise reduction has been developed [209] that can be used not only a posteriori but also for real-time processing in a data stream.

If the data quality does not permit the use of the local linear approach, one can try to use locally constant approximations instead [211]. This is done in exactly the same way as for locally constant predictions, see Section 3.3, Eq. (27). The only difference is that instead of predicting a future value with $\Delta n > 0$, the middle coordinate of the embedding window, $\Delta n = m/2$, is estimated.

6.2. Signal separation

Noise reduction can be regarded as the particular case of the more general task of signal separation where one of the signals is the noise contribution. It turns out that the methodology developed for noise reduction can be generalised to the separation of other types of signals. As a specific example, let us discuss the extraction of the fetal electrocardiogram (FECG) from noninvasive maternal recordings. Other very similar applications include the removal of ECG artefacts from electro-myogram (EMG) recordings (electric potentials of muscle) and spike detection in electro-encephalogram (EEG) data [212]. Fetal ECG extraction can be regarded as a three-way filtering problem since we have to assume that a maternal abdominal ECG recording consists of three main components: the maternal ECG, the fetal ECG, and exogenous noise, mostly from action potentials of intervening muscle tissue.
All three components have quite similar broad-band power spectra and cannot be filtered apart by spectral methods. The fetal component is detectable from as early as the 11th week of pregnancy. After about the 20th week, the signal becomes weaker since the electric potential of the fetal heart is shielded by the vernix caseosa forming on the skin of the fetus. It appears again towards delivery. In Refs. [213,214], it has been proposed to use a nonlinear phase space projection technique for the separation of the fetal signal from maternal and noise artefacts. A typical example of output of this procedure is shown in Fig. 14. The assumption made about the nature of the data is that the maternal signal is well approximated by a low-dimensional manifold in delay reconstruction space. After projection onto this manifold, the maternal signal is separated from the noisy fetal component. Now it is assumed that the fetal ECG is also approximated by a low-dimensional manifold and the noise is removed by projection. Since both manifolds are curved, the projections have to be made onto linear approximations. For technical details see Refs. [213,214].
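The local projection step of Section 6.1, which the separation scheme above applies twice in succession, can be illustrated by a minimal Python sketch. It is not the implementation of Refs. [201,213,214]: the function names, the brute-force neighbour search in the maximum norm, the fixed number `k` of neighbours, the unit delay and the default value of `r` are all assumptions made for illustration, and only a single pass is performed.

```python
import numpy as np


def delay_embed(x, m):
    """Return the (len(x) - m + 1, m) array of unit-delay embedding vectors."""
    n_vec = len(x) - m + 1
    return np.stack([x[i:i + n_vec] for i in range(m)], axis=1)


def local_projection(x, m=10, q=2, k=100, r=1e3):
    """One pass of locally linear projective noise reduction (illustrative).

    m: embedding dimension, q: assumed manifold dimension,
    k: number of neighbours per point (hypothetical choice),
    r: weight on the first and last delay coordinate (hypothetical value).
    """
    x = np.asarray(x, dtype=float)
    emb = delay_embed(x, m)
    weights = np.ones(m)          # diagonal of the weight matrix R
    weights[0] = weights[-1] = r
    corr_sum = np.zeros_like(x)   # accumulated corrections per sample
    corr_cnt = np.zeros_like(x)   # number of vectors containing each sample
    for n in range(emb.shape[0]):
        # Brute-force search for the k nearest neighbours (maximum norm).
        dist = np.max(np.abs(emb - emb[n]), axis=1)
        nbrs = emb[np.argpartition(dist, k - 1)[:k]]
        centre = nbrs.mean(axis=0)
        # Weighted local covariance matrix, R C R.
        z = (nbrs - centre) * weights
        gamma = z.T @ z / k
        # eigh returns eigenvalues in ascending order, so the first
        # m - q eigenvectors span the assumed noise directions.
        _, vecs = np.linalg.eigh(gamma)
        noise_dirs = vecs[:, :m - q]
        dev = (emb[n] - centre) * weights
        corr = noise_dirs @ (noise_dirs.T @ dev) / weights
        corr_sum[n:n + m] -= corr
        corr_cnt[n:n + m] += 1
    # Each sample occurs in up to m vectors; average the suggested corrections.
    return x + corr_sum / corr_cnt
```

For a recording `y`, `local_projection(y, m=10, q=2, k=100)` returns a corrected series of the same length. A two-stage separation in the spirit of this section would subtract the result of a first projection from the recording and then clean the remainder with a second projection; suitable values of `m`, `q` and `k` depend on the data and are not specified here.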
7. Comparison and classification

With current methods, many real-world systems cannot be fully understood on the basis of time series measurements. Approaches aiming at an absolute analysis, like the reliable determination of the fractal dimension of a strange attractor, have often been found to fail for various reasons, the most prominent being that most of the systems are not low-dimensional deterministic. However, many phenomena can still be studied in a comparative way. In that case, we do not have to worry
Fig. 14. Signal separation by locally linear projections in phase space. The original recording (upper trace) contains the fetal ECG hidden under noise and the large maternal signal. Projection onto the manifold formed by the maternal ECG (middle trace) yields fetus plus noise, another projection yields a fairly clean fetal ECG (lower trace). The data was kindly provided by J.F. Hofmeister [215].
too much about the theoretical basis of the quantities we use. The results are validated by the statistical significance of the discriminative power. The classification of states can give valuable insights into the structure of a problem, and very often, signal classification is desirable in its own right. In clinical applications, for example, it is common to define quantities by a standardised procedure, even if this procedure yields an observable which has no immediate physical interpretation. Take the standard procedure of determining the blood pressure non-invasively (the Riva-Rocci method). Although the measurement is indirect and does not yield invariant results, the standardisation of the procedure ensures good comparability of the results. For its value as a diagnostic tool, it is irrelevant whether the measured numbers actually represent the pressure of the blood in a specific part of the body or not.

Comparison and classification of time series is most often done in much the same spirit as, for example, the blood pressure measurement. A complex phenomenon is reduced in a well defined way to a single number or a small set of numbers. Further analysis is then carried out on these numbers. In nonlinear time series analysis these numbers can be, for example, nonlinear prediction errors, coarse grained dimensions or entropies, etc. Below, an alternative approach will be discussed which attempts to carry out the actual comparison between the signals directly rather than between single numbers abstracted from them.

Before we set up a classification problem, we have to decide how we want to use the available information. We need to make economical use of the data we have since we need them to set up and maybe optimise a classification scheme and then to independently verify it. 
One way to proceed is by splitting the available data base into two parts, a learning set where the correct classification is known (for example, which of the patients were in the control group), and a test set which is only used at the very end of the analysis to verify the validity of the classification without using the correct answer. (See the discussion of the overfitting problem in Section 3.3.) The advantage of this approach is that the training phase can be supervised and directed to the desired behaviour. The disadvantage is that we need a sufficient number of test cases to be kept apart. Another possibility is to perform unsupervised classification of the whole database. One tries to find whether the whole ensemble of signals falls into distinct groups naturally without help from a supervisor who knows
the correct answer. One can then regard the whole available data base as the test set for the correctness and significance of the grouping. The latter approach of unsupervised learning will mostly be considered here. Note that in both cases, supervised and unsupervised classification, the test set cannot be used repeatedly to optimise strategies or parameters, unless claims of significance are modified accordingly.

7.1. Classification by histograms

The standard approach to classification of time series is to express the information in each sequence by a single number or a few of them. One can then form a histogram of these numbers, either in one or a few dimensions. Let us give a simple example. Consider a generalised baker map

$$
\begin{array}{lll}
v_n \le a: & u_{n+1} = b\,u_n, & v_{n+1} = v_n / a, \\
v_n > a: & u_{n+1} = 0.5 + b\,u_n, & v_{n+1} = (v_n - a)/(1 - a)
\end{array}
\qquad (49)
$$

with $a = 0.4$. The parameter $b$ can be varied without changing the positive Lyapunov exponent [28]. Two groups of sequences were generated, the first (group a) containing 50 sets with $b = 0.6$ and the second (group b) containing 50 with $b = 0.8$. Each sequence has a length of 400 points. Prediction errors $c_i$ are calculated from Eqs. (33) and (27) and collected in a histogram (see Fig. 15). For unsupervised learning, this histogram would not provide enough information since the observed distribution (black bars) does not fall into two groups naturally. In fact, in the best case we can find a threshold value (arrow) that minimises the number of misclassifications. Still, at least 20 series will be assigned to the wrong group since the individual distributions (white and grey bars) overlap.

7.2. Classification by clustering

The success of the usual classification schemes based on histograms or scatter plots of a few quantities crucially depends on the right choice of observable to characterise the signals. In general, it seems quite a loss of information to express nonlinear dynamics by a few numbers. An alternative
Fig. 15. Classification using a histogram. The arrow indicates the threshold value resulting in the fewest misclassifications possible.
is to compare the individual series directly without first extracting an observable. As we will see in the baker map example of the preceding section, two series produced with different parameters may be well distinguishable even though they have comparable predictability. This motivates generalising the usual measures of nonlinearity discussed in Sections 2.3, 3.4 and 3.5 to comparative measures, or measures of similarity, as was done in Sections 2.4 and 3.6. If we want to compare $w$ signals, the study of symmetric dissimilarities yields $w(w+1)/2$ independent relative quantities $c_{ij}$ rather than just $w$ characteristics $c_i = c_{ii}$. In this section it will be shown how such matrices can be obtained and used for the task of classification. The method has been proposed in Ref. [63], where more technical details and further examples can also be found.

The main idea is to use a cluster algorithm to find groups of data based on a dissimilarity matrix. There are several standard methods to do so [232] and the choice made below is not meant to be exclusive by any means. The task now is to classify $w$ objects into $K$ groups or clusters. Let us define a membership index $u_i^{(l)}$ to be 1 if object $i$ is in cluster $l$, and 0 if not. A cluster is given by all points with membership index 1: $C^{(l)} = \{i : u_i^{(l)} = 1\}$. The size of a cluster is $|C^{(l)}| = \sum_{i=1}^{w} u_i^{(l)}$. The average dissimilarity of object $i$ to cluster $l$ (the "distance" of $i$ to $l$) is then given by

$$D_i^{(l)} = \frac{1}{|C^{(l)}|} \sum_{j=1}^{w} u_j^{(l)} c_{ij} . \qquad (50)$$

The average dissimilarity within cluster $l$ is

$$D^{(l)} = \frac{1}{|C^{(l)}|} \sum_{i=1}^{w} u_i^{(l)} D_i^{(l)} \qquad (51)$$

and the total average intra-cluster dissimilarity is

$$D = \frac{1}{K} \sum_{l=1}^{K} |C^{(l)}| \, D^{(l)} . \qquad (52)$$

This finally yields the cost function

$$E = K D = \sum_{l=1}^{K} \frac{1}{|C^{(l)}|} \sum_{i,j=1}^{w} u_i^{(l)} u_j^{(l)} c_{ij} \qquad (53)$$

that quantifies the average distance within the clusters. The cost function $E$ can be minimised numerically, for example with simulated annealing (see Ref. [63] for details).

Let us illustrate the use of this approach with the same example studied previously, the collection of generalised baker map data. Eq. (23) generalises the prediction error $c_i$ used for the histogram approach to a symmetrised cross-prediction error $c_{ij}$. With this, two clusters are formed by minimising $E$. In Fig. 16, for each series the average dissimilarity $D_i^{(2)}$ to cluster 2 is plotted against that to cluster 1 ($D_i^{(1)}$). Two distinct groups can easily be seen that coincide perfectly with the correct classes. Indeed the algorithm forms exactly the two desired clusters.

Of course, to some extent the problem has only been shifted from finding a magic characteristic number to finding a suitable (not much less magic) dissimilarity measure. However, the approach augments the set of available tools in a meaningful way. After all, we want to classify signals by their dynamics, a feature that is not usually well described by a few parameters. For any classification method, the major problem remains to separate those differences that are relevant for
Fig. 16. Distances of objects from two clusters generated for 100 time series in two groups of 50. Baker map with $b = 0.6$ in the first group (+) and $b = 0.8$ in the second (×). The two original groups are readily separated by the cluster algorithm without any misclassification, as compared to more than 20 mistakes using a histogram of prediction errors.
the discrimination task from those that are not. Sometimes, the calibration of the measurement is known to be of no significance, in which case we can subtract the mean and rescale to unit variance. But apart from such simple transformations we have so far little means of being selective in a controlled way. It is possible to exclude the linear correlation structure from the analysis by normalising characteristic parameters to values obtained with surrogate data.
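As an illustration of Section 7.2, the cost function of Eq. (53) and its numerical minimisation can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the function names, the Metropolis-type acceptance rule, the geometric cooling schedule, the starting temperature and the rule that empty clusters are forbidden are all hypothetical choices made here, and the annealing procedure of Ref. [63] may differ in its details. The dissimilarity matrix `c` is taken as given.

```python
import numpy as np


def cluster_cost(c, labels, K):
    """Cost E of Eq. (53); labels[i] == l encodes u_i^(l) = 1."""
    labels = np.asarray(labels)
    E = 0.0
    for l in range(K):
        members = np.flatnonzero(labels == l)
        if members.size == 0:
            return np.inf  # assumption: empty clusters are not allowed
        E += c[np.ix_(members, members)].sum() / members.size
    return E


def distances_to_clusters(c, labels, K):
    """Average dissimilarities D_i^(l) of Eq. (50) as a (w, K) array."""
    labels = np.asarray(labels)
    return np.stack([c[:, labels == l].mean(axis=1) for l in range(K)], axis=1)


def anneal_clusters(c, K, n_sweeps=200, cooling=0.97, seed=0):
    """Minimise E by simulated annealing over single-object moves."""
    rng = np.random.default_rng(seed)
    w = c.shape[0]
    labels = rng.integers(K, size=w)
    labels[:K] = np.arange(K)      # make sure no cluster starts empty
    E = cluster_cost(c, labels, K)
    best_labels, best_E = labels.copy(), E
    T = float(np.mean(c))          # heuristic starting temperature
    for _ in range(n_sweeps):
        for i in rng.permutation(w):
            old, new = labels[i], rng.integers(K)
            if new == old:
                continue
            labels[i] = new
            E_new = cluster_cost(c, labels, K)
            # Accept downhill moves always, uphill moves with
            # Boltzmann probability exp(-(E_new - E) / T).
            if E_new <= E or rng.random() < np.exp((E - E_new) / T):
                E = E_new
                if E < best_E:
                    best_labels, best_E = labels.copy(), E
            else:
                labels[i] = old
        T *= cooling
    return best_labels, best_E
```

Given a symmetric matrix `c` of cross-prediction errors, `labels, E = anneal_clusters(c, K=2)` yields a grouping, and `distances_to_clusters(c, labels, 2)` gives the pairs of distances that a plot like Fig. 16 displays. Recomputing the full cost for every trial move is wasteful but keeps the correspondence with Eq. (53) explicit.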
8. Conclusion and future perspectives

This review paper tries to give an impression of how useful time series methods from chaos theory can be in practical applications. Chaos theory has attracted researchers from many areas for various reasons, in particular, because of its ability to explain complicated temporal behaviour by equations with only a few degrees of freedom and without assuming random forcing to act on the system. The attractiveness of the new paradigm (or the desperation in fields where standard time series methods fail miserably) has tempted people to take several steps at a time, and high expectations have been raised. Only now is the path being retraced step by step. Starting from a theoretical understanding of the new class of systems, time series methods have been tested on computer generated and well controlled laboratory data. Some of these studies provided sobering experience as to how fragile chaos and fractals can be and one could now be tempted to become quite pessimistic about the usefulness of algorithms derived from these concepts.

Now that procedures have been revised, limitations and pitfalls have been pointed out, and intuition has been gained, we can again try to expose the algorithms to field measurements. We will do it less naively than people have done previously, but also with much more modest expectations. The aim of this paper is to get away from naive enthusiasm, but also from an outright abandonment of the approach. Realistic applications will only be possible with a pragmatic attitude: what can we learn from the new methods even if the assumption of determinism is not really valid for the system we study?

The obvious goal for the near future is thus to enlarge the class of time series problems that are only, or better, or more efficiently, solvable by the nonlinear approach. The main obstacle will
probably not be the lack of good quality data. Experimentalists have come a long way towards controlling devices and measurement apparatuses. The major challenge lies in the nature of the systems that are most interesting. Many outstanding time series problems in the bio-, geo- and social sciences involve multiple time scales, or put differently, lead to nonstationary signals. Also, natural systems are never isolated and thus are of a mixed nature, containing intrinsic and external dynamical components.

There has been a noticeable shift of focus in recent research from the mere testing for nonstationarity and, if the result is positive, excluding a time series from study, to the development of tools to understand the nature of the changes in a system's dynamics. If a faithful parametric or empirical model for the process is lacking, there is no obvious set of parameters whose changes could be monitored. A promising approach is to define a basis to describe changes in the dynamics either by a number of reference sets [64] or a number of clusters of dynamically similar reference states (Refs. [63,66], Sections 4.3 and 7). The distances, or dissimilarities, of the dynamics at a given time to these reference dynamical states then constitute a natural set of parameters. Further work on the problem of how to quantify the similarity of dynamical states and on how to use that information could be rewarding. It may then be possible to answer questions about the number of time-dependent parameters and the time scales and nature of these variations.

Systems with many degrees of freedom are notoriously difficult to study through time series, even if multiple recordings are available. As an extreme example take the dynamics of the human brain. The neurons and synapses are not only enormous in number but they are also highly connected. The connection structure can moreover change slowly with time. 
The system is quite inhomogeneous and has to carry out many different tasks at different times. Certainly, only very specific questions can be hoped to be answered on the basis of time series recordings with a few channels. But even time-resolved imaging techniques can only give a coarsened picture and do not adequately represent the connection structure.

Fortunately, there are some interesting intermediate problems that carry more promise to be tractable with dynamical methods. Spatial homogeneity and local coupling lead to a class of systems which can show interesting but still understandable dynamics. Apart from steady states and static patterns, they can exhibit phenomena which are summarised under the term weak turbulence. Neither this term, nor the notions of spatio-temporal chaos or, more fashionably, extensive chaos have been clearly defined so far. This is a direct consequence of the lack of a unifying framework for the study of these systems. Extensivity in this context means that quantities like the number of degrees of freedom, attractor dimension, entropy, etc., asymptotically grow linearly with the volume of the system. In the large system limit, one can then define intensive quantities like a dimension density. This paper is certainly not the place to review the huge literature on nonlinear, spatially extended dynamics, in particular since few of the approaches have been shown to be useful when analysing observational data. The reader may find interesting material and additional pointers to the literature in the proceedings volumes Refs. [216,217], as well as in [218–220].

Probably, the most immediate problem when analysing spatio-temporal data is the choice of a useful representation of the system states and dynamics. A high-frequency sequence of images contains an amount of information that is hardly manageable, even with a powerful computer. 
The other extreme, a small number of local probes, causes severe problems since the time delay embedding technique is of very limited use with high dimensional data [221]. Popular schemes to reduce the spatial information to a few modes (like for example the Karhunen-Loève decomposition, see Ref. [222] for a recent
application) are most often linear in nature and therefore not quite appropriate for nonlinear systems. In certain situations [223], nonlinear mode dynamics have been used successfully to describe spatio-temporal phenomena. A different approach that carries promise in this respect is the representation by temporally periodic recurrent patterns, or unstable periodic orbits, see the works by Christiansen and coworkers [224], and by Zoldi and Greenside [225]. Despite these efforts, the expectation raised in 1991 at the Santa Fe Institute Time Series Contest that within five years we may have enough experience to enter a second contest, this time on spatio-temporal data, has not been substantiated and it seems that more than a slight relaxation of the time frame will be necessary.

High-dimensional signals can also be produced by systems with only a few components when a delayed feedback is involved. In biology, delayed feedback loops are quite common due to the retarded response of subsystems to changes in other parts of the system. In other fields, delayed feedback can be realised, for example, when part of the output of a device is reflected back from a finite distance, as sometimes happens in laser or radar equipment, but also with seismic waves. Delayed feedback is also often used for the control of chaotic systems. There has been recent progress in the analysis of such systems, in particular if some knowledge about the feedback structure is available. Bünner and coworkers [226] have been able to extract relatively simple dynamical equations from scalar time delay systems on the basis of time series data, despite the high dimensionality of the signals. It should be possible in principle also to infer the delay structure from observations. The recovery of dynamical equations from data could then provide a better understanding of many systems in nature.

Acknowledgements

Let me first thank Peter Grassberger who has accompanied my work since I started doing science. 
My work on nonlinear time series has again and again led to close and enjoyable collaboration with Holger Kantz. Among the people who had an impact on the research leading to this paper let me name James Theiler, Daniel Kaplan, Leon Glass, Martin Casdagli, Tim Sauer, Rainer Hegger, and Lenny Smith. I am grateful to Petr Saparin, John F. Hofmeister, Klaus Lehnertz, and Thomas Schürmann for letting me use their time series data in this publication. This work was supported by the SFB 237 of the Deutsche Forschungsgemeinschaft. Peter Grassberger, James Theiler, Floris Takens, and Johannes Müller-Gerking were so kind as to read and comment on the manuscript prior to publication.

References

[1] P. Grassberger, T. Schreiber, C. Schaffrath, Nonlinear time sequence analysis, Int. J. Bifurcation Chaos 1 (1991) 521.
[2] H.D.I. Abarbanel, R. Brown, J.J. Sidorowich, L.Sh. Tsimring, The analysis of observed chaotic data in physical systems, Rev. Mod. Phys. 65 (1993) 1331.
[3] D. Kugiumtzis, B. Lillekjendlie, N. Christophersen, Chaotic time series I, Modeling, Identification Control 15 (1994) 205.
[4] D. Kugiumtzis, B. Lillekjendlie, N. Christophersen, Chaotic time series II, Modeling, Identification Control 15 (1994) 225.
[5] E. Ott, T. Sauer, J.A. Yorke, Coping with Chaos, Wiley, New York, 1994.
[6] H.D.I. Abarbanel, Analysis of Observed Chaotic Data, Springer, New York, 1996.
[7] H. Kantz, T. Schreiber, Nonlinear Time Series Analysis, Cambridge University Press, Cambridge, 1997.
[8] G. Mayer-Kress (Ed.), Dimensions and Entropies in Chaotic Systems, Springer, Berlin, 1986.
[9] M. Casdagli, S. Eubank (Eds.), Nonlinear modeling and forecasting, Santa Fe Institute Studies in the Science of Complexity, Proc. Vol. XII, Addison-Wesley, Reading, MA, 1992.
[10] A.S. Weigend, N.A. Gershenfeld (Eds.), Time series prediction: forecasting the future and understanding the past, Santa Fe Institute Studies in the Science of Complexity, Proc. Vol. XV, Addison-Wesley, Reading, MA, 1993.
[11] J. Bélair, L. Glass, U. an der Heiden, J. Milton (Eds.), Dynamical Disease, AIP Press, New York, 1995.
[12] H. Kantz, J. Kurths, G. Mayer-Kress (Eds.), Nonlinear Techniques in Physiological Time Series Analysis, Springer Series in Synergetics, Springer, Heidelberg, 1998.
[13] H. Tong, Non-linear Time Series Analysis, Oxford University Press, Oxford, 1990.
[14] M.B. Priestley, Non-linear and Non-stationary Time Series Analysis, Academic Press, London, 1988.
[15] C. Nicolis, G. Nicolis, Is there a climatic attractor?, Nature 326 (1987) 523.
[16] K. Fraedrich, Estimating the dimensions of weather and climate attractors, J. Atmos. Sci. 43 (1986) 419.
[17] C. Essex, T. Lockman, M.A.H. Nerenberg, The climate attractor over short time scales, Nature 326 (1987) 64.
[18] D.A. Hsieh, Chaos and nonlinear dynamics: applications to financial markets, J. Finance 46 (1991) 1839.
[19] E.E. Peters, A chaotic attractor for the S&P 500, Financial Analysts J. 3 (1991) 55.
[20] G. DeCoster, W. Labys, D. Mitchell, Evidence of chaos in commodity future prices, J. Futures Markets 12 (1992) 291.
[21] A.L. Goldberger, D.R. Rigney, J. Mietus, E.M. Antman, S. Greenwald, Nonlinear dynamics in sudden cardiac death syndrome: heart rate oscillations and bifurcations, Experientia 44 (1988) 983.
[22] D.R. Chialvo, J. Jalife, Non-linear dynamics in cardiac excitation and impulse propagation, Nature 330 (1987) 749.
[23] A. Babloyantz, Strange attractors in the dynamics of brain activity, in: H. Haken (Ed.), Complex Systems, Springer, Berlin, 1985.
[24] W.S. Pritchard, Electroencephalographic effects of cigarette smoking, Psychopharmacology 104 (1991) 485.
[25] R. Hegger, H. Kantz, T. Schreiber, Practical implementation of nonlinear time series methods, to be published, 1998.
[26] E. Ott, Chaos in Dynamical Systems, Cambridge University Press, Cambridge, 1993.
[27] P. Bergé, Y. Pomeau, C. Vidal, Order Within Chaos: Towards a Deterministic Approach to Turbulence, Wiley, New York, 1986.
[28] H.-G. Schuster, Deterministic Chaos: An Introduction, Physik Verlag, Weinheim, 1988.
[29] A. Katok, B. Hasselblatt, Introduction to the Modern Theory of Dynamical Systems, Cambridge University Press, Cambridge, 1996.
[30] D. Kaplan, L. Glass, Understanding Nonlinear Dynamics, Springer, New York, 1995.
[31] A.A. Tsonis, Chaos: From Theory to Applications, Plenum, New York, 1992.
[32] T. Schreiber, H. Kantz, Noise in chaotic data: Diagnosis and treatment, CHAOS 5 (1995) 133; Reprinted in [11].
[33] L. Jaeger, H. Kantz, Effective deterministic models for chaotic dynamics perturbed by noise, Phys. Rev. E 55 (1997) 5234.
[34] J.D. Farmer, J. Sidorowich, Predicting chaotic time series, Phys. Rev. Lett. 59 (1987) 845; Reprinted in [5].
[35] M. Casdagli, Nonlinear prediction of chaotic time series, Physica D 35 (1989) 335; Reprinted in [5].
[36] E.J. Kostelich, Problems in estimating dynamics from data, Physica D 58 (1992) 138.
[37] L. Jaeger, H. Kantz, Unbiased reconstruction underlying a noisy chaotic time series, CHAOS 6 (1996) 440.
[38] R. Brown, N.F. Rulkov, E.R. Tracy, Modeling and synchronizing chaotic systems from time-series data, Phys. Rev. E 49 (1994) 3784.
[39] F. Takens, Detecting strange attractors in turbulence, in: D.A. Rand, L.-S. Young (Eds.), Dynamical Systems and Turbulence, Lecture Notes in Mathematics, vol. 898, Springer, New York, 1981.
[40] T. Sauer, J. Yorke, M. Casdagli, Embedology, J. Stat. Phys. 65 (1991) 579.
[41] J. Stark, D.S. Broomhead, M.E. Davies, J. Huke, Takens embedding theorems for forced and stochastic systems, Nonlinear Analysis 30 (1997) 5303.
[42] T. Sauer, J. Yorke, How many delay coordinates do you need?, Int. J. Bifurcation Chaos 3 (1993) 737.
[43] W.H. Press, B.P. Flannery, S.A. Teukolsky, W.T. Vetterling, Numerical Recipes, 2nd ed., Cambridge University Press, Cambridge, 1992.
[44] M.T. Rosenstein, J.J. Collins, C.J. De Luca, A practical method for calculating largest Lyapunov exponents from small data sets, Physica D 65 (1993) 117.
[45] H. Kantz, A robust method to estimate the maximal Lyapunov exponent of a time series, Phys. Lett. A 185 (1994) 77.
[46] J.-P. Eckmann, S. Oliffson Kamphorst, D. Ruelle, S. Ciliberto, Lyapunov exponents from a time series, Phys. Rev. A 34 (1986) 4971; Reprinted in [5].
[47] M. Sano, Y. Sawada, Measurement of the Lyapunov spectrum from a chaotic time series, Phys. Rev. Lett. 55 (1985) 1082.
[48] R. Stoop, P.F. Meier, Evaluation of Lyapunov exponents and scaling functions from time series, J. Opt. Soc. Am. B 5 (1988) 1037.
[49] R. Stoop, J. Parisi, Calculation of Lyapunov exponents avoiding spurious elements, Physica D 50 (1991) 89.
[50] H.D.I. Abarbanel, R. Brown, M.B. Kennel, Variation of Lyapunov exponents on a strange attractor, J. Nonlinear Sci. 1 (1991) 175.
[51] L.A. Smith, Local optimal prediction: exploiting strangeness and the variation of sensitivity to initial condition, Philos. Trans. Roy. Soc. A 348 (1994) 371.
[52] H.D.I. Abarbanel, R. Brown, M.B. Kennel, Local Lyapunov exponents from observed data, J. Nonlinear Sci. 2 (1992) 343.
[53] B. Eckhardt, D. Yao, Local Lyapunov exponents in chaotic systems, Physica D 65 (1993) 100.
[54] B.A. Bailey, S. Ellner, D.W. Nychka, Chaos with confidence: asymptotics and applications of local Lyapunov exponents, Fields Inst. Comm. 11 (1997) 115.
[55] P. Grassberger, I. Procaccia, Measuring the strangeness of strange attractors, Physica D 9 (1983) 189.
[56] M. Frank, H.-R. Blank, J. Heindl, M. Kaltenhäuser, H. Köchner, N. Müller, S. Pocher, R. Sporer, T. Wagner, Improvement of $K_2$-entropy calculations by means of dimension scaled distances, Physica D 65 (1993) 359.
[57] D. Kugiumtzis, Assessing different norms in nonlinear analysis of noisy time series, Physica D 105 (1997) 62.
[58] D. Prichard, J. Theiler, Generalized redundancies for time series analysis, Physica D 84 (1995) 476.
[59] A. Cohen, I. Procaccia, Computing the Kolmogorov entropy from time signals of dissipative and conservative dynamical systems, Phys. Rev. A 31 (1985) 1872.
[60] J.M. Ghez, S. Vaienti, Integrated wavelets on fractal sets I: the correlation dimension, Nonlinearity 5 (1992) 777.
[61] J.M. Ghez, S. Vaienti, Integrated wavelets on fractal sets II: the generalized dimensions, Nonlinearity 5 (1992) 791.
[62] J.L. Hernández, R. Biscay, J.C. Jimenez, P. Valdes, R. Grave de Peralta, Measuring the dissimilarity between EEG recordings through a non-linear dynamical system approach, Int. J. Bio-Med. Comp. 38 (1995) 121.
[63] T. Schreiber, A. Schmitz, Classification of time series data with nonlinear similarity measures, Phys. Rev. Lett. 79 (1997) 1475.
[64] R. Manuca, R. Savit, Stationarity and nonstationarity in time series analysis, Physica D 99 (1996) 134.
[65] M.C. Casdagli, L.D. Iasemidis, R.S. Savit, R.L. Gilmore, S. Roper, J.C. Sackellares, Non-linearity in invasive EEG recordings from patients with temporal lobe epilepsy, Electroencephalogr. Clin. Neurophysiol. 102 (1997) 98.
[66] T. Schreiber, Detecting and analysing nonstationarity in a time series using nonlinear cross predictions, Phys. Rev. Lett. 78 (1997) 843.
[67] L.M. Pecora, T.L. Carroll, J.F. Heagy, Statistics for mathematical properties of maps between time series embeddings, Phys. Rev. E 52 (1995) 3420.
[68] L. Kocarev, U. Parlitz, General approach for chaotic synchronization with applications to communication, Phys. Rev. Lett. 74 (1995) 5028.
[69] L. Kocarev, U. Parlitz, Generalized synchronization, predictability, and equivalence of unidirectionally coupled dynamical systems, Phys. Rev. Lett. 76 (1996) 1816.
[70] N.F. Rulkov, M.M. Sushchik, L.S. Tsimring, H.D.I. Abarbanel, Generalized synchronization of chaos in directionally coupled chaotic systems, Phys. Rev. E 51 (1995) 980.
[71] M.G. Rosenblum, A.S. Pikovsky, J. Kurths, Phase synchronisation of chaotic attractors, Phys. Rev. Lett. 76 (1996) 1804.
[72] S. Kullback, Information Theory and Statistics, Wiley, New York, 1959.
[73] H. Kantz, Quantifying the closeness of fractal measures, Phys. Rev. E 49 (1994) 5091.
[74] C. Diks, W.R. van Zwet, F. Takens, J. DeGoede, Detecting differences between delay vector distributions, Phys. Rev. E 53 (1996) 2169.
[75] R. Moeckel, B. Murray, Measuring the distance between time series, Physica D 102 (1997) 187.
[76] J. Kadtke, Classification of highly noisy signals using global dynamical models, Phys. Lett. A 203 (1995) 196.
[77] M. Casdagli, S. Eubank, J.D. Farmer, J. Gibson, State space reconstruction in the presence of noise, Physica D 51 (1991) 52.
[78] M. Casdagli, A dynamical systems approach to modeling input-output systems, in [9].
[79] M.R. Muldoon, D.S. Broomhead, J.P. Huke, R. Hegger, Delay embedding in the presence of dynamical noise, Dyn. Stab. Systems 13 (1998) 175.
[80] M. Ding, C. Grebogi, E. Ott, T. Sauer, J.A. Yorke, Plateau onset for correlation dimension: When does it occur? Phys. Rev. Lett. 70 (1993) 3872; Reprinted in [5].
[81] G.G. Malinetskii, A.B. Potapov, A.I. Rakhmanov, E.B. Rodichev, Limitations of delay reconstruction for chaotic systems with a broad spectrum, Phys. Lett. A 179 (1993) 15.
[82] A.M. Fraser, H.L. Swinney, Independent coordinates for strange attractors from mutual information, Phys. Rev. A 33 (1986) 1134.
[83] W. Liebert, H.G. Schuster, Proper choice of the time delays for the analysis of chaotic time series, Phys. Lett. A 142 (1989) 107.
[84] W. Liebert, K. Pawelzik, H.G. Schuster, Optimal embeddings of chaotic attractors from topological considerations, Europhys. Lett. 14 (1991) 521.
[85] M.B. Kennel, S. Isabelle, Method to distinguish possible chaos from colored noise and to determine embedding parameters, Phys. Rev. A 46 (1992) 3111.
[86] T. Buzug, G. Pfister, Comparison of algorithms calculating optimal parameters for delay time coordinates, Physica D 58 (1992) 127.
[87] T. Buzug, T. Reimers, G. Pfister, Optimal reconstruction of strange attractors from purely geometrical arguments, Europhys. Lett. 13 (1990) 605.
[88] D. Kugiumtzis, State space reconstruction parameters in the analysis of chaotic time series - the role of the time window length, Physica D 95 (1996) 13.
[89] T. Schreiber, H. Kantz, Observing and predicting chaotic signals: is 2% noise too much? in: Y. Kravtsov, J. Kadtke (Eds.), Predictability of Complex Dynamical Systems, Springer, New York, 1996.
[90] D. Kugiumtzis, O.C. Lingjaerde, N. Christophersen, Regularized local linear prediction of chaotic time series, Physica D 112 (1998) 344.
[91] T. Sauer, Time series prediction using delay coordinate embedding, in [10].
[92] M. Casdagli, Chaos and deterministic versus stochastic nonlinear modeling, J. Roy. Stat. Soc. 54 (1991) 303.
[93] E.N. Lorenz, Atmospheric predictability as revealed by naturally occurring analogues, J. Atmos. Sci. 26 (1969) 636.
[94] A. Pikovsky, Discrete-time dynamic noise filtering, Sov. J. Commun. Technol. Electron. 31 (1986) 81.
[95] G. Sugihara, R. May, Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series, Nature 344 (1990) 734; Reprinted in [5].
[96] J. Milnor, On the concept of attractor, Comm. Math. Phys. 99 (1985) 177.
[97] J.H. Friedman, Multivariate adaptive regression splines (with discussion), Ann. Stat. 19 (1991) 1.
[98] M.J.D. Powell, Radial basis functions for multivariable interpolation: a review, Proc. IMA Conf. Algorithms for the approximation of functions and data, RMCS, Shrivenham, 1985.
[99] D. Broomhead, D. Lowe, Multivariable function interpolation and adaptive networks, Complex Syst. 2 (1988) 321.
[100] L.A. Smith, Identification and prediction of low-dimensional dynamics, Physica D 58 (1992) 50.
[101] L. Breiman, J.H. Friedman, R.A. Olshen, C.J. Stone, Classification and Regression Trees, Chapman & Hall, New York, 1993.
[102] H. Akaike, A new look at the statistical model identification, IEEE Trans. Automat. Control AC-19 (1974) 716.
[103] J. Rissanen, Consistent order estimates of autoregressive processes by shortest description of data, in: O. Jacobs et al. (Eds.), Analysis and Optimisation of Stochastic Systems, Academic Press, New York, 1980.
T. Schreiber / Physics Reports 308 (1999) 1—64
61
[104] J. Theiler, D. Prichard, Using ‘Surrogate Surrogate Data’ to calibrate the actual rate of false positives in tests for nonlinearity in time series, Fields Inst. Comm. 11 (1997) 99. [105] A. Wolf, J.B. Swift, H.L. Swinney, J.A. Vastano, Determining Lyapunov exponents from a time series, Physica D 16 (1985) 285. [106] K. Geist, U. Parlitz, W. Lauterborn, Comparison of different methods for computing Lyapunov exponents, Prog. Thoeor. Phys. 83 (1990) 875. [107] J. Theiler, Statistical precision of dimension estimators, Phys. Rev. A 41 (1990) 3038. [108] R.L. Smith, Estimating dimension in noisy chaotic time-series, J.R. Statist. Soc. B 54 (1992) 329. [109] F. Takens, On the numerical determination of the dimension of an attractor, in: B.L.J. Braaksma, H.W. Broer, F. Takens (Eds.), Dynamical systems and bifurcations, Lecture Notes in Mathematics, vol. 1125, Springer, Heidelberg, 1985. [110] C.D. Cutler, Some results on the behavior and estimation of the fractal dimensions of distributions on attractors, J. Stat. Phys. 62 (1991) 651. [111] C.D. Cutler, A theory of correlation dimension for stationary time series, Philosoph. Trans. Royal Soc. London A 348 (1995) 343. [112] A.R. Osborne, A. Provenzale, Finite correlation dimension for stochastic systems with power-law spectra, Physica D 35 (1989) 357. [113] J. Theiler, Some comments on the correlation dimension of 1/f ? noise, Phys. Lett. A 155 (1991) 480. [114] P. Grassberger, Do climatic attractors exist? Nature 323 (1986) 609. [115] P. Grassberger, Evidence or climatic attractors: Grassberger replies, Nature 326 (1987) 524. [116] D. Ruelle, Deterministic chaos: the science and the fiction, Proc. R. Soc. London A 427 (1990) 241. [117] J. Theiler, Estimating fractal dimension, J. Opt. Soc. Amer. A 7 (1990) 1055. [118] H. Kantz, T. Schreiber, Dimension estimates and physiological data, CHAOS 5 (1995) 143; Reprinted in [11]. [119] P. Grassberger, Finite sample corrections to entropy and dimension estimates, Phys. 
Lett. A 128 (1988) 369. [120] A. Provenzale, L.A. Smith, R. Vio, G. Murante, Distinguishing between low-dimensional dynamics and randomness in measured time series, Physica D 58 (1992) 31. [121] M.B. Kennel, R. Brown, H.D.I. Abarbanel, Determining embedding dimension for phase-space reconstruction using a geometrical construction, Phys. Rev. A 45 (1992) 3403; Reprinted in [5]. [122] M.B. Kennel, H.D.I. Abarbanel, False neighbors and false strands: A reliable minimum embedding dimension algorithm, INLS Preprint, 1994. [123] J. Theiler, Lacunarity in a best estimator of fractal dimension, Phys. Lett. A 135 (1988) 195. [124] C. Diks, Estimating invariants of noisy attractors, Phys. Rev. E 53 (1996) 4263. [125] D. Kugiumtzis, Correction of the correlation dimension for noisy time series, Int. J. Bifurcation Chaos 7 (1997) 1283. [126] H. Oltmans, P.J.T. Verheijen, The influence of noise on power law scaling functions and an algorithm for dimension estimations, Phys. Rev. E 56 (1997) 1160. [127] T. Schreiber, Determination of the noise level of chaotic time series,, Phys. Rev. E 48 (1993) 13. [128] T. Schreiber, Influence of Gaussian noise on the correlation exponent, Phys. Rev. E 56 (1997) 274. [129] E. Olofsen, J. Degoede, R. Heijungs, A maximum likelihood approach to correlation dimension and entropy estimation, Bull. Math. Biol. 54 (1992) 45. [130] J.C. Schouten, F. Takens, C.M. van den Bleek, Maximum likelihood estimation of the entropy of an attractor, Phys. Rev. E 49 (1994) 126. [131] C. Elger, K. Lehnertz, Seizure prediction by nonlinear time series analysis of brain electrical activity, European J. Neuroscience 10 (1998) 786. [132] R. Bowen, Symbolic dynamics for hyperbolic flows, Amer. J. Math. 95 (1973) 429. [133] F. Christiansen, A. Politi, Symbolic encoding in symplectic maps, Nonlinearity 9 (1996) 1623. [134] P. Grassberger, H. Kantz, Generating partitions for the dissipative He´non map, Phys. Lett. A 113 (1985) 235. [135] H. 
Herzel, Compelxity of symbol sequences, Syst. Anal. Modl. Simul. 5 (1988) 435. [136] W. Ebeling, Th. Po¨schel, K.-F. Albrecht, Entropy, transinformation and word distribution of informationcarrying sequences, Int. J. Bifurcation Chaos 5 (1995) 51. [137] T. Schu¨rmann, P. Grassberger, Entropy estimation of symbol sequences, CHAOS 6 (1996) 414.
62
T. Schreiber / Physics Reports 308 (1999) 1—64
[138] B.-L. Hao, Elementary Symbolic Dynamics, World Scientific, Singapore, 1989. [139] R. Wackerbauer, A. Witt, H. Atmanspacher, J. Kurths, H. Scheingraber, A comparative classification of complexity measures, Chaos Solitons Fractals 4 (1994) 133. [140] J. Kurths, A. Voss, P. Saparin, A. Witt, H.J. Kleiner, N. Wessel, Complexity measures for the analysis of heart rate variability, CHAOS 5 (1995) 88. [141] W.A. Brock, W.D. Dechert, J.A. Scheinkman, B. LeBaron, A Test for Independence Based on the Correlation Dimension, University of Wisconsin Press, Madison, 1988. [142] W.A. Brock, D.A. Hseih, B. LeBaron, Nonlinear Dynamics, Chaos, and Instability: Statistical Theory and Economic Evidence, MIT Press, Cambridge, MA, 1991. [143] G. Sugihara, B. Grenfell, R.M. May, Distinguishing error from chaos in ecological time series, Phil. Trans. R. Soc. Lond. B 330 (1990) 235. [144] M. Barahona, C.-S. Poon, Detection of nonlinear dynamics in short, noisy time series, Nature 381 (1996) 215. [145] T. Schreiber, A. Schmitz, Discrimination power of measures for nonlinearity in a time series, Phys. Rev. E 55 (1997) 5443. [146] D.T. Kaplan, L. Glass, Direct test for determinism in a time series, Phys. Rev. Lett. 68 (1992) 427; Reprinted in [5]. [147] B. Pompe, Measuring statistical dependencies in a time series, J. Stat. Phys. 73 (1993) 587. [148] M. Palus\ , Testing for nonlinearity using redundancies: Quantitative and qualitative aspects, Physica D 80 (1995) 186. [149] M. Palus\ , On entropy rates of dynamical systems and Gaussian processes, Phys. Lett. A 227 (1997) 301. [150] D. Auerbach, P. Cvitanovic´, J.-P. Eckmann, G. Gunaratne, I. Procaccia, Exploring chaotic motion through periodic orbits, Phys. Rev. Lett. 58 (1987) 2387. [151] R. Artuso, E. Aurell, P. Cvitanovic´, Recycling of strange sets I, Nonlinearity 3 (1990) 325. [152] R. Artuso, E. Aurell, P. Cvitanovic´, Recycling of strange sets II, Nonlinearity 3 (1990) 361. [153] P. Cvitanovic´, R. Artuso, R. Mainieri, G. 
Vattay, Classical and quantum chaos: a cyclist treatise, Web-book in progress. Available from http://www.nbi.dk/ChaosBook (1998). [154] R. Badii, E. Brun, M. Finardi, L. Flepp, R. Holzner, J. Parisi, C. Reyl, J. Simonet, Progress in the analysis of experimental chaos through periodic orbits, Rev. Mod. Phys. 66 (1994) 1389. [155] G.B. Mindlin, X.-J. Hou, H.G. Solari, R. Gilmore, N.B. Tufillaro, Classification of strange attractors by integers, Phys. Rev. Lett. 64 (1990) 2350. [156] E. Ott, C. Grebogi, J.A. Yorke, Controlling chaos, Phys. Rev. Lett. 64 (1990) 1196; Reprinted in [5]. [157] D. Pierson, F. Moss, Detecting periodic unstable points in noisy chaotic and limit cycle attractors with applications to biology, Phys. Rev. Lett. 75 (1995) 2124. [158] P. So, E. Ott, S.J. Schiff, D.T. Kaplan, T. Sauer, C. Grebogi, Detecting unstable periodic orbits in chaotic experimental data, Phys. Rev. Lett. 76 (1996) 4705. [159] D.J. Christini, J.J. Collins, Controlling nonchaotic neuronal noise using chaos control techniques, Phys. Rev. Lett. 75 (1995) 2782. [160] A.M. Albano, P.E. Rapp, A. Passamante, Kolmogorov—Smirnov test distinguishes attractors with similar dimensions, Phys. Rev. E 52 (1995) 196. [161] H. Isliker, J. Kurths, A test for stationarity: finding parts in a time series apt for correlation dimension estimates, Int. J. Bifurcation Chaos 3 (1993) 1573. [162] M.B. Kennel, Statistical test for dynamical nonstationarity in observed time-series data, Phys. Rev. E 56 (1997) 316. [163] J.P. Eckmann, S. Oliffson Kamphorst, D. Ruelle, Recurrence plots of dynamical systems, Europhys. Lett. 4 (1987) 973. [164] M. Koebbe, G. Mayer-Kress, Use of the recurrence plots in the analysis of time-series data, in [9]. [165] G. McGuire, N.B. Azar, M. Shelhamer, Recurrence matrices and the preservation of dynamical properties, Phys. Lett. A 237 (1997) 43. [166] C.L. Webber, J.P. Zbilut, Dynamical assessment of physiological systems and states using recurrence plot strategies, J. Appl. 
Physiol. 76 (1994) 965. [167] J.P. Zbilut, A. Giuliani, C.L. Webber, Recurrence quantification analysis and principal components in the detection of short complex signals, Phys. Lett. A 237 (1998) 131.
T. Schreiber / Physics Reports 308 (1999) 1—64
63
[168] M. Casdagli, Recurrence plots revisited, Physica D 108 (1997) 206. [169] A. Babloyantz, A. Destexhe, Low-dimensional chaos in an instance of epilepsy, Proc. Natl. Acad. Sci. USA 83 (1986) 3513. [170] G.W. Frank, T. Lookman, M.A.H. Nerenberg, C. Essex, J. Lemieux, W. Blume, Chaotic time series analyses of epileptic seizures, Physica D 46 (1990) 427. [171] J. Theiler, On the evidence for low-dimensional chaos in an epileptic electroencephalogram, Phys. Lett. A 196 (1995) 335. [172] J. Theiler, P.E. Rapp, Re-examination of the evidence for low-dimensional, nonlinear structure in the human electroencephalogram, Electroencephalogr. Clin. Neurophysiol. 98 (1996) 213. [173] D.E. Lerner, Monitoring changing dynamics with correlation integrals: case study of an epileptic seizure, Physica D 97 (1996) 563. [174] J.P. Pijn, J. Van Neerven, A. Noest, F.H. Lopes da Silva, Chaos or noise in EEG signals; dependence on state and brain site, Electroencephalogr. Clin. Neurophysiol. 79 (1991) 371. [175] K. Lehnertz, C.E. Elger, Spatio-temporal dynamics of the primary epileptogenic area in temporal lobe epilepsy characterized by neuronal complexity loss, Electroencephalogr. Clin. Neurophysiol. 95 (1995) 108. [176] L. Pe´zard, J. Martinerie, J. Mu¨ller-Gerking, F.J. Varela, B. Renault, Entropy quantification of human brain spatio-temporal dynamics, Physica D 96 (1996) 344. [177] Z. Rogovski, I. Gath, E. Bental, On the prediction of epileptic seizures, Biol. Cybernet. 42 (1981) 9. [178] I. Dvora´k, Takens versus multichannel reconstruction in EEG correlation exponent estimates, Phys. Lett. A 151 (1990) 225. [179] R. Hegger, H. Kantz, E. Olbrich, Correlation dimension of intermittent signals, Phys. Rev. E 56 (1997) 199. [180] J. Theiler, S. Eubank, A. Longtin, B. Galdrikian, J.D. Farmer, Testing for nonlinearity in time series: the method of surrogate data Physica D 58 (1992) 77; Reprinted in [5]. [181] J. Theiler, B. Galdrikian, A. Longtin, S. Eubank, J.D. 
Farmer, Using surrogate data to detect nonlinearity in time series, in [9]. [182] T. Subba Rao, M.M. Gabr, An Introduction to Bispectral Analysis and Bilinear Time Series Models, Lecture Notes in Statistics, vol. 24, Springer, New York, 1984. [183] C. Diks, J.C. van Houwelingen, F. Takens, J. DeGoede, Reversibility as a criterion for discriminating time series, Phys. Lett. A 201 (1995) 221. [184] J.E. Skinner, M. Molnar, C. Tomberg, The point correlation dimension: performance with nonstationary surrogate data and noise, Integrative Physiological Behavioral Sci. 29 (1994) 217. [185] M.S. Roulston, Significance testing on information theoretic functionals, Physica D 110 (1997) 62. [186] J. Theiler, D. Prichard, Constrained-realization Monte-Carlo method for hypothesis testing, Physica D 94 (1996) 221. [187] T. Schreiber, Constrained randomization of time series data, Phys. Rev. Lett. 80 (1998) 2105. [188] B. Efron, The Jackknife, the Bootstrap and Other Resampling Plans, SIAM, Philadelphia, PA, 1982. [189] T. Schreiber, A. Schmitz, Improved surrogate data for nonlinearity tests, Phys. Rev. Lett. 77 (1996) 635. [190] J. Theiler, P.S. Linsay, D.M. Rubin, Detecting nonlinearity in data with long coherence times, in [10]. [191] N. Metropolis, A. Rosenbluth, M. Rosenbluth, A. Teller, E. Teller, Equations of state calculations by fast computing machine, J. Chem. Phys. 21 (1953) 1097. [192] S. Kirkpatrick, C.D. Gelatt Jr., M.P. Vecchi, Optimization by simulated annealing, Science 220 (1983) 671. [193] R.V.V. Vidal (Ed.), Applied Simulated Annealing, Lecture Notes in Economics and Mathematical Systems, vol. 396, Springer, Berlin, 1993. [194] D.R. Rigney, A.L. Goldberger, W. Ocasio, Y. Ichimaru, G.B. Moody, R. Mark, Multi-channel physiological data: description and analysis, in [10]. [195] T. Schreiber, A. Schmitz, unpublished, 1997. [196] T. Bollerslev, Generalized autoregressive conditional heteroscedasticity, J. Econometrics 31 (1986) 207. [197] D. 
Prichard, The correlation dimension of differenced data, Phys. Lett. A 191 (1994) 245. [198] J.D. Farmer, J. Sidorowich, Exploiting chaos to predict the future and reduce noise, in: Y.C. Lee (Ed.), Evolution, Learning and Cognition, World Scientific, Singapore, 1988. [199] E.J. Kostelich, J.A. Yorke, Noise reduction in dynamical systems, Phys. Rev. A 38 (1988) 1649; Reprinted in [5].
64
T. Schreiber / Physics Reports 308 (1999) 1—64
[200] T. Schreiber, P. Grassberger, A simple noise-reduction method for real data, Phys. Lett. A 160 (1991) 411. [201] P. Grassberger, R. Hegger, H. Kantz, C. Schaffrath, T. Schreiber, On noise reduction methods for chaotic data, CHAOS 3 (1993) 127; Reprinted in [5]. [202] E.J. Kostelich, T. Schreiber, Noise reduction in chaotic time series data: a survey of common methods, Phys. Rev. E 48 (1993) 1752. [203] T. Schreiber, Processing of physiological data, in [12]. [204] I.T. Jolliffe, Principal Component Analysis, Springer, New York, 1986. [205] D. Broomhead, G.P. King, Extracting qualitative dynamics from experimental data,, Physica D 20 (1986) 217. [206] T. Schreiber, D.T. Kaplan, Nonlinear noise reduction for electrocardiograms, CHAOS 6 (1996) 87. [207] H. Kantz, T. Schreiber, I. Hoffmann, T. Buzug, G. Pfister, L.G. Flepp, J. Simonet, R. Badii, E. Brun, Nonlinear noise reduction: a case study on experimental data, Phys. Rev. E 48 (1993) 1529. [208] T. Schreiber, Efficient neighbor searching in nonlinear time series analysis, Int. J. Bifurcation Chaos 5 (1995) 349. [209] T. Schreiber, M. Richter, Nonlinear projective filtering in a data stream, Wuppertal preprint WUB-98-8, 1998. [210] A.L. Goldberger, E. Goldberger, Clinical Electrocardiography, Mosby, St. Louis, 1977. [211] T. Schreiber, Extremely simple nonlinear noise reduction method, Phys. Rev. E 47 (1993) 2401. [212] T. Schreiber, unpublished, 1997. [213] T. Schreiber, D.T. Kaplan, Signal separation by nonlinear projections: the fetal electrocardiogram, Phys. Rev. E 53 (1996) 4326. [214] M. Richter, T. Schreiber, D.T. Kaplan, Fetal ECG extraction with nonlinear phase space projections, IEEE Trans. Bio-Med. Eng. 45 (1998) 133. [215] J.F Hofmeister, J.C. Slocumb, L.M. Kottmann, J.B. Picchiottino, D.G. Ellis, A noninvasive method for recording the electrical activity of the human uterus in vivo, Biomed. Instr. Technol. (1994) 391. [216] F.H. Busse, L. 
Kramer (Eds.), Nonlinear evolution of spatio-temporal structures in dissipative continuous systems, Plenum, New York, 1990. [217] A.M. Albano, P.E. Rapp, N.B. Abraham, A. Passamante (Eds.), Measures of spatio-temporal dynamics, Physica D 96 (1996). [218] P. Manneville, Dissipative Structures and Weak Turbulence, Academic Press, New York, 1989. [219] R. Kapral, K. Showalter, Chemical Waves and Patterns, Kluwer, Dordrecht, 1995. [220] M.C. Cross, P.C. Hohenberg, Pattern formation outside of equilibrium, Rev. Mod. Phys. 65 (1993) 851. [221] E. Olbrich, H. Kantz, Inferring chaotic dynamics from time-series: On which length scale determinism becomes visible, Phys. Lett. A 232 (1997) 63. [222] S.M. Zoldi, H.S. Greenside, Karhunen-Loe´ve decomposition of extensive chaos, Phys. Rev. Lett. 78 (1997) 9. [223] V.K. Jirsa, R. Friedrich, H. Haken, Reconstrution of the spatio-temporal dynamics of a human magnetoencephalogram, Physica D 89 (1995) 100. [224] F. Christiansen, P. Cvitanovic´, V. Putkaradze, Hopf ’s last hope: spatiotemporal chaos in terms of unstable recurrent patterns, Nonlinearity 10 (1997) 1. [225] S.M. Zoldi, H.S. Greenside, Spatially localized unstable periodic orbits of a high-dimensional chaotic system, Phys. Rev. E 57 (1998) R2511. [226] M.J. Bu¨nner, M. Popp, Th. Meyer, A. Kittel, J. Parisi, Tool to recover scalar time-delay systems from experimental time series, Phys. Rev. E 54 (1996) 3082. [227] J.-P. Eckmann, D. Ruelle, Ergodic theory of chaos and strange attractors, Rev. Mod. Phys. 57 (1985) 617. [228] M. Benedicks, L. Carleson, The dynamics of the He´non map, Ann. Math. 133 (1991) 73. [229] T. Sauer, J. Yorke, Are the dimensions of a set and its image equal under typical smooth functions? Ergodic Th. Dyn. Syst. 17 (1997) 941. [230] D. Ruelle, Resonances of chaotic dynamical systems, Phys. Rev. Lett. 56 (1986) 405. [231] V. Baladi, J.-P. Eckmann, Resonances for intermittent systems, Nonlinearity 2 (1989) 119. [232] L. 
Kaufman, Finding Groups in Data: An Introduction to Cluster Analysis, Wiley, New York, 1990.
Physics Reports 308 (1999) 65—233
Hadronic and electromagnetic probes of hot and dense nuclear matter W. Cassing*, E.L. Bratkovskaya Institut für Theoretische Physik, Universität Giessen, 35392 Giessen, Germany Received March 1998; editor: G.E. Brown
Contents
1. Introduction
2. Chiral symmetry restoration and deconfinement
   2.1. The symmetries of QCD
   2.2. Results from lattice QCD
   2.3. Brown-Rho scaling
   2.4. QCD sum rules
   2.5. NJL effective Lagrangians
3. A covariant transport approach
   3.1. Baryon self-energies
   3.2. Meson self-energies
   3.3. Quasiparticle parametrization and propagation
   3.4. Baryon-baryon reaction channels
   3.5. Meson-baryon reaction channels
   3.6. Meson-meson reaction channels
   3.7. Formation time $\tau_F$
4. Heavy-ion reaction dynamics
   4.1. Protons, pions and $\eta$'s at SIS energies
   4.2. Protons and pions at AGS energies
   4.3. Protons and pions at SPS energies
   4.4. Optimizing for high baryon density
5. $K^+$, $K^-$ and $\bar{p}$ production
   5.1. SIS energies
   5.2. AGS energies
   5.3. SPS energies
6. Dilepton production
   6.1. Elementary production channels and form factors
   6.2. BEVALAC/SIS energies
   6.3. SPS energies
   6.4. How to disentangle the different scenarios?
   6.5. Systematics of dilepton production from AGS to SPS energies
   6.6. Direct photons
7. Charmonium production and suppression
   7.1. Elementary production cross sections
   7.2. Analysis of experimental data
8. Future perspectives
   8.1. Meson $m_T$-scaling at SIS energies
   8.2. Pion-nucleus reactions
   8.3. Dilepton anisotropies
   8.4. Stepping towards RHIC energies
9. Summary
Acknowledgements
References

* Corresponding author. E-mail: [email protected]. Supported by BMBF, DFG, GSI Darmstadt and FZ Jülich.
0370-1573/99/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0370-1573(98)00028-3
Abstract
We review the constraints imposed by the spontaneously broken chiral symmetry of the QCD vacuum on the hadron properties at finite temperature $T$ and baryon density $\rho_B$. A restoration of chiral symmetry is indicated by the dropping of the scalar quark condensate $\langle \bar{q}q \rangle$ at finite $T$ and $\rho_B$ in various approaches. This suggests that the hadrons become approximately massless in hot and dense nuclear matter, or that the vector and axial vector currents become equal. In this respect we study the properties of hadrons, as produced in relativistic nucleus-nucleus collisions, by means of a covariant hadronic transport approach in which scalar and vector hadron self-energies, modelled in terms of effective 'chiral' Lagrangians, are taken into account explicitly. Within this transport approach we investigate the reaction dynamics of relativistic heavy-ion collisions and analyse experimental data on $\pi$, $\eta$, $K^+$, $K^-$, $\rho$, $\omega$, $\Phi$, $\bar{p}$ and charmonium production for proton-nucleus and nucleus-nucleus collisions from SIS to SPS energies (1-200 A GeV). Whereas $\pi$, $\eta$ and to some extent $K^+$ mesons are found not to change their properties in the nuclear medium substantially, antiprotons and antikaons do show sizeable attractive self-energies, as can be extracted from their experimental abundances and spectra. The properties of the vector mesons $\rho$, $\omega$ and $\Phi$ at finite baryon density are investigated by their dileptonic decay; the CERES and HELIOS-3 data at SPS energies are found to be incompatible with a 'bare' vector meson mass scenario. Here, a description by 'dropping' $\rho$ and $\omega$ masses leads to a very good reproduction of the data. However, approaches based on more conventional hadronic interactions, such as pion polarizations and meson-nucleon scattering amplitudes, are also compatible with the present dilepton spectra at SPS energies. Constraints from dilepton studies at BEVALAC/SIS energies are investigated in all decay schemes, as well as a variety of further observables that allow one to disentangle the different scenarios experimentally. Furthermore, the charmonium production and suppression in proton-nucleus and nucleus-nucleus collisions is investigated within the transport approach in order to probe a possible transition to a quark-gluon plasma (QGP) phase. We finally discuss 'optimized' observables for an experimental investigation of the restoration of chiral symmetry and/or the phase transition to a quark-gluon plasma. © 1999 Elsevier Science B.V. All rights reserved.
PACS: 11.30.Rd; 12.38.Mh; 12.40.Vv; 21.65.+f; 25.75.-q
Keywords: Chiral symmetries; Quark-gluon plasma; Vector-meson dominance; Nuclear matter; Relativistic heavy-ion collisions
1. Introduction

The study of hot and dense hadronic matter by means of relativistic nucleus-nucleus collisions is the major aim of high energy heavy-ion physics. Here the question of chiral symmetry restoration at high baryon density and/or high temperature and a phase transition to the quark-gluon plasma (QGP) are of primary interest [1,2]. Independent information from lattice QCD calculations, QCD sum rule studies, chiral perturbation theory or the approximate scale invariance of the QCD Lagrangian points towards a phase transition from hadronic to partonic degrees of freedom. The change of the scalar quark condensate $\langle \bar{q}q \rangle$ (which is nonvanishing in the vacuum due to a spontaneous breaking of chiral symmetry) with temperature $T$ and baryon density $\rho_B$ towards a chirally symmetric phase characterized by $\langle \bar{q}q \rangle = 0$ should also lead to a change of the hadron properties with density and temperature, i.e. in a chirally restored phase the hadrons might become approximately massless [3]. We will refer to such a scenario as the dropping mass scheme. On the other hand, hadrons are strongly interacting objects, and many-body excitations in the medium may carry the same quantum numbers as the hadrons under investigation. In this picture the change of the hadron properties is due to their mutual interactions; as a consequence of these interactions their spectral function will split into different branches, which then entirely overlap at high density or temperature. In simpler words: the hadron will become very short-lived or melt in the nuclear medium. It is not clear so far to what extent the dropping mass and melting hadron scenarios are complementary to each other or whether they describe the same physics. At first sight, however, they seem to predict different hadronic in-medium spectral functions.

Nowadays our knowledge about the hadron properties at high temperature or baryon density is based on heavy-ion experiments from BEVALAC/SIS to SPS energies, where hot and dense nuclear systems are produced on a timescale of a few fm/c. However, any conclusions about the properties of hadrons in the nuclear environment are based on the comparison of experimental data with nonequilibrium kinetic transport theory [4-6]. Among these, the covariant RBUU approach [7-16], the QMD [17-19], RQMD/UrQMD [20,21], ARC [22,23], ART [24] and HSD [25] approaches have been successfully used in the past to provide a glance at the nonequilibrium aspects of high energy nuclear reactions. Transport theories have two essential ingredients: the baryon (and meson) scalar and vector self-energies (which are neglected in some approaches) and the in-medium elastic and inelastic cross sections for all hadrons involved. Whereas in the low-energy regime these 'transport coefficients' can be calculated in the Dirac-Brueckner approach starting from the bare nucleon-nucleon interaction [26-28], this is no longer possible at high baryon density ($\rho_B \geq 2$-$3\,\rho_0$) and high temperature, since the number of independent hadronic degrees of freedom increases drastically and the interacting hadronic system is expected to enter a phase with a vanishing scalar quark condensate $\langle \bar{q}q \rangle \approx 0$ [3,29-32]. As a consequence, the hadron self-energies or spectral functions in the nuclear medium will change substantially, especially close to the chiral phase transition, and transport theoretical studies should include the generic properties of QCD that so far are known from nonperturbative computations on the lattice [33-36]. Since QCD lattice calculations will not be possible for high baryon densities within the next years, we have to rely on suitable effective Lagrangians that lead to the same physical condensates and thermodynamic behaviour as the original QCD problem.
In this review we will address the results from chiral perturbation theory (ChPT) for the in-medium properties of mesons and those from an extended Nambu-Jona-Lasinio (NJL) model for the nucleon properties and for nuclear matter at high density. Furthermore, a relativistic hadronic transport approach (HSD) is formulated in line with the in-medium dispersion relation of hadrons following from the set of effective Lagrangians; its results will be confronted with a large set of experimental data at SIS ($\leq 2$ A GeV), AGS ($\leq 15$ A GeV) and SPS energies ($\leq 200$ A GeV). We will start with the dynamics and properties of hadrons composed predominantly of light quarks and antiquarks ($u$, $d$, $\bar{u}$, $\bar{d}$), then continue with strange hadrons containing $s$ and/or $\bar{s}$ quarks, and finally investigate the production and propagation of $c\bar{c}$ pairs in the dense medium created in ultrarelativistic nucleus-nucleus collisions. A summary as well as detailed proposals for future experimental investigations will close this work.
2. Chiral symmetry restoration and deconfinement

The theory of quantum chromodynamics (QCD) is expected to represent the fundamental theory of the strong interaction and so far is well tested in its short distance or large momentum transfer range. However, its dynamical properties at large distances or low momentum transfer are not well understood, since standard perturbation theory is not applicable in this case. On the other hand, the low energy excitations of QCD are well known experimentally in the vacuum in terms of hadron spectra or spectral functions, respectively. The question thus arises how the low energy QCD spectrum will change when heating the vacuum or filling it up with additional valence quarks. Since this question addresses the nonperturbative regime of QCD, combined theoretical and experimental efforts are necessary to shed some light on this issue. We start with a brief overview of the theoretical approaches in this section and then continue with an analysis of the experimental data obtained in this context in the following sections.

As for any quantum field theory, without knowledge about the actual gauge field configurations, the symmetries (or approximate symmetries) of its Lagrangian help to classify the physical subspaces of field configurations. In the case of QCD we have

$$\mathcal{L}(x) = \bar{\psi}_q(x)\left(i\gamma^\mu D_\mu - \hat{M}\right)\psi_q(x) - \frac{1}{4} G^a_{\mu\nu}(x)\, G^{\mu\nu a}(x) \tag{2.1}$$

with the gluonic field strength tensor

$$G^a_{\mu\nu}(x) = \partial_\mu \mathcal{A}^a_\nu(x) - \partial_\nu \mathcal{A}^a_\mu(x) + g f^{abc} \mathcal{A}^b_\mu(x)\, \mathcal{A}^c_\nu(x) \tag{2.2}$$

and the gauge covariant derivative

$$D_\mu = \partial_\mu - i g\, t^a \mathcal{A}^a_\mu(x) \tag{2.3}$$

generating the coupling between the fermion and gauge fields $\mathcal{A}^a_\mu$. In Eq. (2.2) $f^{abc}$ denote the structure constants of the group SU(3), while the $3\times 3$ matrices $t^a$ follow the commutator algebra

$$[t^a, t^b] = i f^{abc}\, t^c \tag{2.4}$$

with the normalization $\mathrm{Tr}(t^a t^b) = \frac{1}{2}\delta^{ab}$. In the case of SU(3) the color indices $a, b, c$ run from 1 to 8. Restricting ourselves for a while to 3 quark flavors, the spinor $\psi_q$ is represented, apart from the four Dirac indices, by $u$, $d$ and $s$ quarks, i.e. in the fundamental representation $\bar{\psi}_q = (\bar{u}, \bar{d}, \bar{s})$. In this case $\hat{M}$ is a $3\times 3$ diagonal matrix in flavor space with the bare quark masses $m_u$, $m_d$, $m_s$ on the diagonal.
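The commutator algebra (2.4) and the trace normalization quoted above can be checked numerically. The following is a minimal sketch in Python with NumPy, assuming the standard Gell-Mann representation $t^a = \lambda^a/2$; the text does not fix an explicit representation of the color matrices, so this choice, as well as all function and variable names, is an illustrative assumption. The structure constants are extracted from the commutators via $f^{abc} = -2i\,\mathrm{Tr}([t^a, t^b]\,t^c)$, which follows from (2.4) together with the normalization.

```python
import numpy as np


def gell_mann():
    """Return the eight Gell-Mann matrices as an array of shape (8, 3, 3).

    Assumption: standard representation; index 0 corresponds to lambda^1.
    """
    lam = np.zeros((8, 3, 3), dtype=complex)
    lam[0] = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
    lam[1] = [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]
    lam[2] = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
    lam[3] = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
    lam[4] = [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]]
    lam[5] = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
    lam[6] = [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]]
    lam[7] = np.diag([1, 1, -2]) / np.sqrt(3)
    return lam


# Generators t^a = lambda^a / 2
t = gell_mann() / 2.0

# Normalization: Tr(t^a t^b) = delta^{ab} / 2
gram = np.einsum("aij,bji->ab", t, t)
norm_ok = np.allclose(gram, 0.5 * np.eye(8))

# Commutators [t^a, t^b], stored with shape (8, 8, 3, 3)
comm = np.einsum("aij,bjk->abik", t, t) - np.einsum("bij,ajk->abik", t, t)

# Structure constants f^{abc} = -2i Tr([t^a, t^b] t^c); the result is real
f = (-2j * np.einsum("abij,cji->abc", comm, t)).real

# Check Eq. (2.4): [t^a, t^b] = i f^{abc} t^c
algebra_ok = np.allclose(comm, 1j * np.einsum("abc,cij->abij", f, t))

# f^{abc} should be antisymmetric under exchange of any two indices
antisym_ok = (np.allclose(f, -f.transpose(1, 0, 2))
              and np.allclose(f, -f.transpose(0, 2, 1)))

print("normalization:", norm_ok)
print("commutator algebra:", algebra_ok)
print("total antisymmetry:", antisym_ok)
print("f^{123} =", f[0, 1, 2])
```

The arrays are zero-based, so `f[0, 1, 2]` corresponds to $f^{123}$. Deriving $f^{abc}$ from the matrices, rather than typing in a table of values, keeps the check independent of any hand-copied list of structure constants.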
In Eqs. (2.2) and (2.3) $g$ is the strong coupling constant; $\gamma_\mu$ ($\mu = 0, 1, 2, 3$) are the Dirac $4\times 4$ matrices.

2.1. The symmetries of QCD

By construction, the Lagrangian (2.1) is invariant under the local SU(3) gauge transformations

$$\mathcal{A}^a_\mu(x) \to \mathcal{A}^a_\mu(x) - \partial_\mu \Theta^a(x) + g f^{abc}\, \Theta^b(x)\, \mathcal{A}^c_\mu(x), \tag{2.5}$$

$$\psi_q(x) \to \exp\left(-i t^a \Theta^a(x)\right)\psi_q(x), \qquad \bar{\psi}_q(x) \to \exp\left(i t^b \Theta^b(x)\right)\bar{\psi}_q(x), \tag{2.6}$$

where $\Theta^a(x)$ are infinitesimal real-valued functions; this invariance has to be fulfilled in any approximation scheme. Apart from a global $U(1)$ symmetry, i.e. multiplication of the spinors by a constant phase, which expresses the conservation of the fermion number, the fermion-gluon interaction density

$$\mathcal{L}_{\mathrm{int}} = g\, \bar{\psi}_q(x)\, \gamma^\mu t^a\, \psi_q(x)\, \mathcal{A}^a_\mu(x) \tag{2.7}$$

is independent of the flavors ($u$, $d$, $s$). In the case of a vanishing mass matrix $\hat{M}$ the Lagrangian (2.1) is invariant under the transformations

$$\psi_q \to \exp\left(-\tfrac{i}{2}\lambda^b \alpha^b_V\right)\psi_q, \qquad \bar{\psi}_q \to \exp\left(\tfrac{i}{2}\lambda^b \alpha^b_V\right)\bar{\psi}_q, \tag{2.8}$$

$$\psi_q \to \exp\left(-\tfrac{i}{2}\lambda^b \alpha^b_A \gamma_5\right)\psi_q, \qquad \psi^\dagger_q \to \exp\left(\tfrac{i}{2}\lambda^b \alpha^b_A \gamma_5\right)\psi^\dagger_q, \tag{2.9}$$

where the $3\times 3$ matrices $\lambda^b = 2t^b$ are the Gell-Mann matrices in the standard representation. They are denoted differently from the color matrices $t^a$ in order to specify rotations in flavor; the parameters $\alpha_V$ and $\alpha_A$ are constant, but arbitrary, vectors in flavor space. Now defining right- and left-handed quarks by the linear combinations

$$\psi_{R,L} = \frac{1}{2}\left(1 \pm \gamma_5\right)\psi_q \tag{2.10}$$
the transformations (2.8) and (2.9) translate to a global SU(3)$_R$ × SU(3)$_L$ symmetry in flavor space, implying that left- (right-) handed quarks are not mixed dynamically and conserve their 'handedness'. This global symmetry is referred to as the chiral invariance of QCD in the limit $\hat M^0 = 0$. It expresses the physical effect that the sign of the projection of the fermion spin on its momentum direction cannot be changed by the dynamics if $\hat M^0 = 0$. On the other hand, chiral invariance implies that right- and left-handed currents are the same, i.e. that the vector and axial vector current-current correlation functions should be identical. However, chiral symmetry is broken in the vacuum, as can be seen directly from the different masses of the ρ-meson and its chiral partner, the $a_1$-meson, which are the meson poles in the isovector and axial vector current-current correlation functions [32,37]. A quite successful approximation scheme for low energy hadron physics is the separation of the Lagrangian (2.1) into a chirally symmetric part $\mathcal{L}_0$ and the symmetry violating term $\mathcal{L}_{sb}$, i.e.

$\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_{sb} = \mathcal{L}_0 + \bar\psi_q\hat M^0\psi_q$, (2.11)

which is physically relevant since the bare light quark masses in particular are small compared to hadronic energy/mass scales, as pointed out by Gasser and Leutwyler [38]. The separation (2.11) is
the starting point for chiral perturbation theory (ChPT), i.e. an approximation of $\mathcal{L}_0$ in terms of the chiral fields $U$, $U^\dagger$ and their derivatives $\partial U$, $\partial^2 U$, ...:

$\mathcal{L}_0 \to \mathcal{L}_{eff} \approx \mathcal{L}(U, U^\dagger, \partial U, \partial U^\dagger, \partial^2 U, \partial^2 U^\dagger, \ldots)$, (2.12)

with

$U = \exp(i\pi/f_\pi)$, (2.13)

where π denotes the pseudoscalar meson octet and $f_\pi \approx 93$ MeV is the pion decay constant. Here the pion decay constant $f_\pi$ enters because it provides a scale for the breaking of the axial vector current conservation (PCAC) in the matrix element

$\langle 0|\partial^\mu A^a_\mu|\pi^b(x)\rangle = -i f_\pi m_\pi^2\,\delta^{ab}\exp(-iqx)$, (2.14)

where $\langle 0|$ and $|\pi^b(x)\rangle$ stand for the vacuum and pion fields, respectively, and $m_\pi$ is the pion mass. Furthermore, a and b stand for the flavor indices and $A^a_\mu$ for the axial vector current $A^a_\mu = \bar\psi_q\frac{\lambda^a}{2}\gamma_\mu\gamma_5\psi_q$, which should not be mixed up with the gauge field $\mathcal{A}^a_\mu(x)$ in Eq. (2.2). The ansatz (2.12) has been shown to be quite successful in low energy hadron physics [38-40]. We will come back to this approach in Sections 2.3 and 3.2 when modelling effective meson-baryon interactions.

2.2. Results from lattice QCD

The dynamical problem (2.1) so far is not well understood theoretically in the nonperturbative regime, since the problem of confinement is related to its long distance properties. Conventional nonperturbative calculations here face the problem of gauge invariance, which is violated in any finite order truncation scheme [41], whereas actual calculations in a finite 'Hilbert space' with infrared and ultraviolet cutoffs in momentum space in general violate gauge invariance as well. For a vanishing quark chemical potential $\mu_q$, however, qualitative and quantitative results are obtained in the thermodynamic limit from lattice QCD calculations, where the partition function as well as a variety of 2-point functions are calculated from a discretized action on a Euclidean lattice in 1 time and 3 space dimensions. Whereas the pure Yang-Mills sector (without fermions) is numerically sufficiently under control nowadays [42], light fermions such as u and d quarks are still not easy to handle. For a more detailed discussion of this issue we refer the reader to Ref. [43].
The simplest 2-point function to be extracted from these calculations is the scalar quark condensate $\langle\bar qq\rangle$, which is nonvanishing in the vacuum state due to chiral symmetry breaking, i.e.

$\langle 0|\bar qq|0\rangle = \langle\bar qq\rangle_0 \approx -(230\ \mathrm{MeV}/c)^3 \approx -1.6\ \mathrm{fm}^{-3}$. (2.15)

The expectation value (2.15) implies that there are about 1.6 virtual $\bar uu$ pairs per fm$^3$ and the same number of $\bar dd$ pairs. However, when heating up the vacuum the scalar condensate decreases in magnitude and almost vanishes above a critical temperature $T_c$, as can be seen in Fig. 2.1 (taken from Ref. [44]), where the ratio of the scalar condensate at finite T to its vacuum value is displayed as a function of the temperature T. The scalar quark condensate thus plays the role of an order parameter for the QCD phase transition to a chirally restored phase at high temperature, where the density of virtual quark-antiquark pairs approaches zero, whereas the density of real quark-antiquark pairs increases with T.
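The two numbers quoted in Eq. (2.15) are related by a unit conversion with $\hbar c$. A minimal sketch of this arithmetic follows; the value $\hbar c \approx 0.19733$ GeV fm and the function name are not taken from the text and are introduced here for illustration.

```python
HBARC_GEV_FM = 0.1973269804  # hbar*c in GeV fm (assumed standard value)

def condensate_in_inverse_fm3(scale_mev):
    """Convert a condensate -(scale)^3, with scale in MeV, to units of fm^-3."""
    scale_inverse_fm = (scale_mev / 1000.0) / HBARC_GEV_FM
    return -scale_inverse_fm ** 3

# Roughly -1.6 fm^-3 for a scale of 230 MeV, as quoted in Eq. (2.15).
print(round(condensate_in_inverse_fm3(230.0), 2))
```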
Fig. 2.1. Lattice QCD result for the chiral condensate at finite temperature, extrapolated to the chiral limit ($m_q \to 0$) and normalized to its value at T = 0, for various numbers of flavors $N_F$ and time-like grid points $N_\tau$ (from Ref. [44]).

Fig. 2.2. Energy density (upper symbols) and pressure (lower symbols) divided by $T^4$ for different fermion masses (from [45]). The solid lines indicate the Stefan-Boltzmann limit for different grid points in the time-like direction.
The question now arises about the actual value of $T_c$, which presently is still uncertain in the range 150 MeV ≤ $T_c$ ≤ 200 MeV [43]. The energy density as well as the pressure in the vacuum can be extracted from the energy momentum tensor as computed on the lattice. In fact, the energy density ε(T), when plotted as $\varepsilon(T)/T^4$, shows a jump at the same critical temperature $T_c$, as seen from Fig. 2.2 taken from [45]. Note that the energy density for any system of quasi-particles in the high temperature phase should be proportional to $T^4$ due to thermodynamics if $\omega(p) \approx p$. The lattice results [45] in Fig. 2.2 indicate that above the critical temperature $T_c$ (≈ 160 MeV in this case) the Stefan-Boltzmann limit is not yet reached; this implies that the partons (quarks and gluons) still behave as interacting (or massive) particles with a nontrivial dispersion relation ω(p). On the other hand, the pressure P, when displayed as $P/T^4$, remains small in the region of the phase transition and rises substantially in the new phase, denoted as the quark-gluon plasma (QGP) phase. Presently, it is still a matter of debate whether this phase transition is of first or second order or just a rapid crossover, which would imply quite different scenarios for the expansion phase during the hadronization of a QGP droplet in the course of an ultra-relativistic heavy-ion collision [46,47]. On the other hand, at vanishing quark density the restoration of chiral symmetry goes along with a phase transition to a quark-gluon plasma (QGP) phase. It is presently unclear whether this will also be the case for finite quark or baryon density. A further quantity of relevance is the gluon condensate $\langle\alpha_s G^a_{\mu\nu}G^{\mu\nu a}\rangle$, which is nonvanishing in the vacuum due to the breaking of scale invariance of QCD, and of order

$\langle\alpha_s G^a_{\mu\nu}G^{\mu\nu a}\rangle \approx 4\pi \times 0.5\ \mathrm{GeV/fm^3}$. (2.16)
Apart from the scalar quark condensate (2.15) this quantity, which expresses the energy density of virtual gluon pairs in the vacuum, sets a second relevant scale for the physics to be discussed throughout this work. It is worthwhile to state the results for the condensates in chiral perturbation theory in the case of $N_f = 3$, i.e. [40]

$\langle\bar qq\rangle_T = \langle\bar qq\rangle_0\left[1 - \left(\frac{T}{\sqrt{8}f_\pi}\right)^2 - \frac{1}{6}\left(\frac{T}{\sqrt{8}f_\pi}\right)^4 - \frac{16}{9}\left(\frac{T}{\sqrt{8}f_\pi}\right)^6\ln\frac{\Lambda}{T} + \ldots\right]$, (2.17)
$\langle\alpha_s G^a_{\mu\nu}G^{\mu\nu a}\rangle_T = \langle\alpha_s G^a_{\mu\nu}G^{\mu\nu a}\rangle_0 - \frac{32\pi^2}{405}\frac{T^8}{f_\pi^4}\left[\log\frac{\Lambda}{T} - \frac{1}{4}\right] - \ldots$, (2.18)

(with Λ ≈ 0.14-0.25 GeV), which, however, is only valid for $T \ll T_c$ and does not provide information on the condensates close to the critical temperature. One should note, however, that the gluon condensate is much more stable against thermal excitations than the scalar quark condensate due to its high power in $(T/f_\pi)$. For the physical scales to be discussed it is important to translate the findings from lattice QCD to actual numbers for critical energy densities. Adopting SU(3), the number of degrees of freedom in the plasma phase is

$N = N_f \times 4 \times N_c + N_g \times 2 = 36 + 16 = 52$ (2.19)

for $N_f = N_c = 3$ and $N_g = 8$ in the case of SU(3). For massless partons thermal perturbation theory up to order $g^2$ gives for the energy density [48-50]
$\varepsilon = \left(1 - \frac{15}{16\pi^2}g^2\right)\frac{8\pi^2}{15}T^4 + N_f\left(1 - \frac{50}{84\pi^2}g^2\right)\frac{7\pi^2}{10}T^4 + \sum_f\left(1 - \frac{2}{4\pi^2}g^2\right)\frac{3}{\pi^2}\mu_f^2\left(\pi^2T^2 + \frac{1}{2}\mu_f^2\right)$, (2.20)

where $\mu_f$ denotes the quark chemical potential for each flavor. Assuming the strong coupling constant $\alpha_s$ to be ≈ 0.3 at the critical temperature, we get g ≈ 2 for $\alpha_s = g^2/4\pi$. This gives ε ≈ 1.25 GeV/fm$^3$ at $T_c$ = 150 MeV and ≈ 4 GeV/fm$^3$ for $T_c$ = 200 MeV in the case of vanishing quark chemical potentials. On the other hand, the nonperturbative lattice calculations presented in Fig. 2.2 for $T_c \approx 160$ MeV give an energy density of ε ≈ 1.1 GeV/fm$^3$, which indicates that the perturbative result (2.20), valid at high temperature, cannot be directly extrapolated down to $T_c$. Nevertheless, the critical energy density for a phase transition to the QGP phase presently is expected to be within the limits 1 GeV/fm$^3$ ≤ ε ≤ 4 GeV/fm$^3$ for $\mu_q = 0$. So far, these high energy densities in a baryon-free regime have not yet been reached experimentally, but they are expected to be probed in experiments at the Relativistic Heavy-Ion Collider (RHIC) at Brookhaven or the Large Hadron Collider (LHC) at CERN in the next years or next decade, respectively.

2.3. Brown-Rho scaling

Whereas the change of the condensates with temperature can quantitatively be computed by lattice QCD at vanishing baryon density $\rho_B = 0$ or quark chemical potential $\mu_q = 0$, the change of
the condensates with the net quark density $\langle q^\dagger q\rangle$ cannot be extracted from lattice calculations by now. One thus has to resort to low density theorems or relations which connect symmetry breaking terms of the QCD Lagrangian to physical observables like the pion mass $m_\pi$ and the pion decay constant $f_\pi$ as in Eq. (2.14). A general relation of this kind has been derived by Drukarev and Levin [51] and reads for the condensate at finite (scalar) baryon density $\rho = \rho_S \approx \rho_B$,

$\frac{\langle\bar qq\rangle_\rho}{\langle\bar qq\rangle_0} = 1 - \frac{\rho}{f_\pi^2 m_\pi^2}\left[\Sigma_{\pi N} + m_q\frac{d}{dm_q}\frac{E(\rho)}{A}\right]$, (2.21)

where E(ρ)/A is the binding energy per nucleon and $\Sigma_{\pi N} \approx 45 \pm 7$ MeV is the pion-nucleon Σ-term [52]. Another approach has been put forward by Brown and Rho [3], based on the argument that due to the scale invariance of QCD (in the classical limit and for massless fermions) the symmetries of QCD remain valid also at finite baryon density. Thus the properties of hadrons, as low mass excitations of the vacuum, should scale with the changing vacuum as
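$\frac{m_N^*}{m_N} \approx \frac{m_\sigma^*}{m_\sigma} \approx \frac{m_\rho^*}{m_\rho} \approx \frac{m_\omega^*}{m_\omega}$, (2.22)

where in-medium quantities are denoted by an asterisk, as in $m^*$. This argument is demonstrated for the effective Lagrangian (cf. Eq. (2.12))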
e s 1 f s Tr(j ºjIºR)# Tr[ºRj º, ºRj º]# j sjIs# F(M K º) , L " L I I J I 4 s 4 s 2 where º is the chiral field (of scale dimension 0),
(2.23)
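$U = \exp(i\pi/f_\pi)$, (2.24)

while π denotes the pseudoscalar meson octet. In Eq. (2.23) χ is an effective scalar 'glueball' field,

$\chi \sim (\mathrm{Tr}[G^a_{\mu\nu}G^{\mu\nu a}])^{1/4}$; $\langle 0|\chi|0\rangle = \chi_0$, (2.25)

of scale dimension 1. The scale breaking term $\sim F(\hat M^0 U)$, which depends on the fermion mass $\hat M^0$ and the chiral field U, is $\sim(\chi/\chi_0)^3$, which implies

$\frac{\langle 0^*|\bar qq|0^*\rangle}{\langle 0|\bar qq|0\rangle} = \left(\frac{\chi^*}{\chi_0}\right)^3$ (2.26)

with $\chi^* = \langle 0^*|\chi|0^*\rangle$. Defining an effective pion decay constant by

$f_\pi^* = f_\pi\frac{\chi^*}{\chi_0}$, (2.27)

this gives

$\frac{\langle 0^*|\bar qq|0^*\rangle}{\langle 0|\bar qq|0\rangle} = \left(\frac{f_\pi^*}{f_\pi}\right)^3$. (2.28)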
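To get a feeling for the cubic relation between the condensate ratio and the effective pion decay constant, Eq. (2.28) can be inverted for $f_\pi^*/f_\pi$. The following is a small sketch of this arithmetic only; the function names and the sample condensate ratios are illustrative assumptions and are not results quoted in the text.

```python
def decay_constant_ratio(condensate_ratio):
    """f_pi*/f_pi from the in-medium to vacuum condensate ratio, inverting the cubic law."""
    return condensate_ratio ** (1.0 / 3.0)

def condensate_ratio(decay_ratio):
    """Condensate ratio from f_pi*/f_pi, i.e. the cubic law itself."""
    return decay_ratio ** 3

# A 30% reduction of the condensate changes f_pi* by only about 11%.
for ratio in (1.0, 0.7, 0.5):
    print(ratio, round(decay_constant_ratio(ratio), 3))
```

Because of the cube root, even a sizeable drop of the condensate translates into a comparatively modest change of $f_\pi^*$ and hence of the scaled hadron masses.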
This suggests expanding the χ field as $\chi = \chi^* + \chi'$, where $\chi'$ denotes a fluctuating field and $\chi^*$ characterizes the minimum potential energy at finite baryon density. The effective Lagrangian for the chiral field U in the modified vacuum characterized by $\chi^*$ then can be written as

$\mathcal{L}_{eff} = \frac{f_\pi^{*2}}{4}\mathrm{Tr}(\partial_\mu U\partial^\mu U^\dagger) + \frac{\epsilon^2}{4}\mathrm{Tr}[U^\dagger\partial_\mu U, U^\dagger\partial_\nu U]^2 + c\left(\frac{f_\pi^*}{f_\pi}\right)^3\mathrm{Tr}(\hat M^0 U + \mathrm{h.c.}) + \ldots$, (2.29)

with $U = \exp(i\pi^*/f_\pi^*)$ and $\pi^* = \pi\chi^*/\chi_0$ in shorthand form, such that the meson field has scale dimension 1. Identifying the effective σ field, which plays a dominant role in low energy nuclear physics, with a mixture of the scalar 2-pion field and the 'glueball' field χ, one may relate

$\frac{m_\chi^*}{m_\chi} \approx \frac{m_\sigma^*}{m_\sigma} \approx \frac{f_\pi^*}{f_\pi}$. (2.30)

The Lagrangian (2.29), within the assumptions $\langle r^2\rangle \sim g_A/f_\pi^2$ and $m_N \sim \sqrt{g_A}\,f_\pi$ [53], leads to

$\frac{m_N^*}{m_N} \approx \sqrt{\frac{g_A^*}{g_A}}\,\frac{f_\pi^*}{f_\pi} \approx \frac{f_\pi^*}{f_\pi}$, (2.31)

which approximately scales in the same way. Here $g_A$, $g_A^*$ denote the axial vector coupling constant in the vacuum and medium, respectively. Furthermore, using the KSFR relation [54]
$m_V^2 = 2g^2f_\pi^2$, (2.32)

where g denotes the hidden gauge coupling, the vector mesons (V = ρ, ω) then also should scale as the nucleon and the σ, in line with Eq. (2.22). The suggestion (2.22) of Brown and Rho has stimulated the field of nuclear physics at high baryon density to a large extent during this decade.

2.4. QCD sum rules

Some model independent information on current-current correlation functions in the vacuum as well as at finite nuclear density can be extracted from QCD sum rules [55-57]. We recall that vector mesons are resonances in the current-current correlation functions [2,37,58-60],
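$\Pi_{\mu\nu}(q) = i\int d^4x\,\exp(iqx)\,\langle 0|\hat T j_\mu(x)j_\nu(0)|0\rangle$, (2.33)

where $\hat T$ denotes time ordering while $j_\mu$ is the electromagnetic current, which can be decomposed as

$j_\mu = j^\rho_\mu + j^\omega_\mu + j^\phi_\mu$, (2.34)

with components that correspond to the physical resonances ρ, ω and φ. Their actual quark content is given by

$j^\rho_\mu = \frac{1}{2}(\bar u\gamma_\mu u - \bar d\gamma_\mu d)$, $j^\omega_\mu = \frac{1}{6}(\bar u\gamma_\mu u + \bar d\gamma_\mu d)$, $j^\phi_\mu = -\frac{1}{3}(\bar s\gamma_\mu s)$. (2.35)

The tensor (2.33) is purely transversal in the vacuum due to current conservation, i.e.

$\Pi_{\mu\nu}(q) = \left(g_{\mu\nu} - \frac{q_\mu q_\nu}{q^2}\right)\Pi(q^2)$ (2.36)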
with the scalar correlation function

$\Pi(q^2) = \frac{1}{3}g^{\mu\nu}\Pi_{\mu\nu}(q)$. (2.37)

The negative imaginary part of Π(s), i.e. $-(12\pi/s)\,\mathrm{Im}\,\Pi(s)$, is proportional to the hadron production cross section in $e^+e^-$ collisions at the squared invariant energy s. Similarly, the invariant mass spectrum of dileptons ($e^+e^-$, $\mu^+\mu^-$) stemming from the vector mesons ρ, ω and φ is proportional to $-\pi\,\mathrm{Im}\,\Pi(M^2)/M^2$, where M is the invariant mass of the lepton pair. Thus dilepton studies (in principle) allow one to measure the imaginary part of the current-current correlation function in the medium. Here the ρ meson is of particular interest due to its short lifetime because, once produced in a dense and hot hadronic environment, it will decay predominantly at finite baryon density. We will come back to this issue in more detail in Section 6. The underlying idea of QCD sum rules now is to combine two different representations of $\Pi(q^2)$ at large spacelike $Q^2 = -q^2$ for the individual channels (ρ, ω, φ) [61]. One way is to use the dispersion relation,

$\Pi(q^2) = \Pi(0) + c\,q^2 + \frac{q^4}{\pi}\int ds\,\frac{\mathrm{Im}\,\Pi(s)}{s^2(s - q^2 - i\epsilon)}$, (2.38)
in twice subtracted form. Since the photon mass vanishes in the vacuum we have Π(0) = 0 for the vacuum case. Relation (2.38) is compared with the correlation function in the operator product expansion

$\Pi(Q^2 = -q^2) = \frac{d\,Q^2}{12\pi^2}\left[-c_0\ln\frac{Q^2}{\mu^2} + \frac{c_1}{Q^2} + \frac{c_2}{Q^4} + \frac{c_3}{Q^6} + \ldots\right]$, (2.39)

at some scale μ ≈ 1 GeV. Here the coefficients $c_i$ reflect the QCD coupling, bare masses and condensates, i.e. for the isovector (or ρ) channel:

$c_0 = 1 + \frac{\alpha_s(Q^2)}{\pi}$, (2.40)

$c_1 = -3\left((m_u^0)^2 + (m_d^0)^2\right)$,

$c_2 = \frac{\pi^2}{3}\langle 0|\frac{\alpha_s}{\pi}G_{\mu\nu}G^{\mu\nu}|0\rangle + 4\pi^2\langle 0|m_u^0\bar uu + m_d^0\bar dd|0\rangle$,

$c_3 = -4\pi^3\left[\langle 0|\alpha_s(\bar u\gamma_\mu\gamma_5\lambda^a u - \bar d\gamma_\mu\gamma_5\lambda^a d)^2|0\rangle + \langle 0|\alpha_s(\bar u\gamma_\mu\lambda^a u + \bar d\gamma_\mu\lambda^a d)\sum_{q=u,d,s}\bar q\gamma^\mu\lambda^a q|0\rangle\right]$,

with

$\alpha_s(Q^2) = \frac{4\pi}{9\ln(Q^2/\Lambda_{QCD}^2)}$ (2.41)

in the case of SU(3) with $\Lambda_{QCD} = \Lambda \approx 140$ MeV. The constant d = 3/2 in the case of the ρ-meson. The 4-point condensate in $c_3$ usually is approximated by the square of the 2-point condensate $\langle 0|\bar qq|0\rangle$ (times some parameter κ), and 4-point functions with mixed flavor content are neglected. This approximation seems to hold for vacuum expectation values; however, it becomes questionable
at high baryon density, where $\langle 0|\bar qq|0\rangle$ approaches zero, because quantum fluctuations should become dominant in the 4-point functions. One then compares Eq. (2.38), involving some effective model for Im Π(s), with the r.h.s. of Eq. (2.39). In order to improve convergence a Borel transformation [62,63] is applied additionally, i.e.

$f(M^2) = \lim_{Q^2,\,n\to\infty;\ Q^2/n = M^2}\frac{(Q^2)^n}{(n-1)!}\left(-\frac{d}{dQ^2}\right)^n f(Q^2)$, (2.42)

where the mass scale M ≈ 1 GeV is used. It turns out that for the vector mesons the results are stable with respect to moderate variations in this 'convergence' scale M [56]. In Fig. 2.3 we show the results from Hatsuda and Lee [56] for the vector meson masses as a function of the baryon density in units of $\rho_0$. The in-medium mass here is defined as the energy in the medium at vanishing 3-momentum, i.e.

$m_V^* = \omega(p = 0) = \sqrt{\Pi(\omega, p = 0) + m_V^2}$. (2.43)

The effective masses $m_V^*$ (V = ρ, ω, φ) in Fig. 2.3 drop almost linearly with density $\rho = \rho_B$, which suggests the parametrization
o 5m#mN (2.44) m* "m 1!a 4 4 4o O O with a "a K0.18 and a K0.025 (yK0.12) for practical applications and estimates. The dashed M S ( lines in Fig. 2.3 indicate the vacuum threshold for the decay of the into KM K and K>K\, respectively. Thus QCD sum rules also yield a decrease of the vector meson masses in the nuclear medium which, however, is approximately linear in the scalar quark condensate and not of 3rd order as in Eq. (2.28).
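As a numerical illustration of the linear density dependence found from the QCD sum rules, the sketch below evaluates a parametrization of the form $m_V^* = m_V(1 - \alpha_V\,\rho/\rho_0)$ with the slope parameters quoted above (0.18 for ρ and ω, 0.025 for φ). The linear form is assumed here as suggested by the text; the vacuum masses of 770, 782 and 1020 MeV and the function name are assumptions introduced for illustration and are not quoted in this section.

```python
# Assumed approximate vacuum masses in MeV (not quoted in the text).
VACUUM_MASS_MEV = {"rho": 770.0, "omega": 782.0, "phi": 1020.0}
# Slope parameters quoted in the text for the linear parametrization.
ALPHA = {"rho": 0.18, "omega": 0.18, "phi": 0.025}

def in_medium_mass(meson, density_ratio):
    """Return m*_V in MeV for a density given in units of rho_0."""
    return VACUUM_MASS_MEV[meson] * (1.0 - ALPHA[meson] * density_ratio)

for meson in ("rho", "omega", "phi"):
    print(meson, [round(in_medium_mass(meson, x), 1) for x in (0.0, 1.0, 2.0)])
```

With these inputs the ρ and ω masses drop by 18% at normal nuclear matter density, whereas the φ mass changes by only a few percent.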
Fig. 2.3. L.h.s.: the ρ-ω meson mass $m_{\rho,\omega}$ and the continuum threshold $S_0$ as a function of $\rho/\rho_0$, as resulting from the QCD sum rule analysis of Ref. [56] using a spectral function of the form $\delta(s - M^2)$. R.h.s.: the effective mass for the φ-meson for different values of y (the strangeness content in the nucleon). The dashed lines indicate the $\bar KK$ and $K^+K^-$ thresholds at ρ = 0, which are the main decay modes.
A note of caution has to be added here because in the latter analysis a ρ spectral function $\sim\delta(s - M^2)$ has been adopted. More recent studies by Klingl et al. [61] as well as Leupold et al. [64] show that the QCD sum rules can also be satisfied if the ρ-meson pole does not shift but the width of the ρ-meson becomes very broad, i.e. its lifetime very short. We will come back to these different scenarios in Section 6 when discussing dilepton production experiments. Nevertheless, whatever the hadronic model for a vector meson might be, the resulting spectral function for the vector meson will have to be compatible with the r.h.s. of Eq. (2.39), which provides stringent constraints on the hadronic Lagrangians employed for the evaluation of Im Π(s) in Eq. (2.38).

2.5. NJL effective Lagrangians

As discussed above, the symmetries of the Lagrangian (2.1) as well as its symmetry breaking mass term $\bar\psi_q\hat M^0\psi_q$ play a fundamental role in hadron physics. It is thus tempting to explore simpler effective Lagrangians with the same symmetries for the quark degrees of freedom, however, leaving the gluon condensate (or scalar 'glueball') unchanged, or even discarding the gluon dynamics completely and shifting them to a constant background term. Thus one attempts to model the quark/antiquark dynamics in an effective quark Lagrangian, where the gluon fields are assumed to be integrated out, leading to an effective fermion-fermion interaction that might be used in a Hartree-Fock approximation scheme. For pedagogical purposes and also for orientation at high baryon density and/or temperature we will present here an extended NJL model which in similar form has been used as a guideline to low energy QCD physics [65,66]. The underlying idea of an effective 4-point interaction for quarks has been discussed, e.g., by Vogl and Weise in Ref. [67]. Since the fundamental currents in QCD are color currents, i.e. $J^a_\mu = \bar\psi_q\gamma_\mu t^a\psi_q$, an elementary color current interaction with a universal coupling $G_c$ is expected to be dominant. An effective Lagrangian for $\delta(x_1 - x_2)$-like quark interactions, which are easiest to handle, thus reads

$\mathcal{L}_q(x) = \bar\psi_q(i\gamma^\mu\partial_\mu - \hat M^0)\psi_q - G_c\sum_a(\bar\psi_q\gamma_\mu t^a\psi_q)^2$, (2.45)

where $t^a$ (a = 1, ..., 8) again are the SU(3) matrices, $\hat M^0$ a diagonal mass matrix in flavor space, i.e. $\hat M^0 = \mathrm{diag}(m_u^0, m_d^0, m_s^0)$, and $\bar\psi_q = (\bar u, \bar d, \bar s)$ is the quark spinor in the case of SU(3) as before. The color-current interaction is invariant under chiral transformations (2.8), (2.9) or SU(3)$_R$ × SU(3)$_L$ flavor rotations. The Lagrangian (2.45), however, in its present form is not yet well suited for the formulation of quark dynamics on the mean-field level because antisymmetrization generates a further mixing of color, flavor and Dirac indices. It is thus more convenient to introduce a Fierz transformation, i.e. to antisymmetrize the 4-point interaction, and to proceed with further computations on the Hartree level. The Fierz transform then generates color singlet as well as color octet terms, i.e. [67,68]
j j tM Gt # tM ic Gt L (x)"tM (icIj !M K )t #G O2 O O 2 O O O I O 1 G j j !G tM c Gt # tM c cI Gt 4 O I2 O O 2 O G
2 jG jG !G (tM c t?t )! G tM t?t # tM ic t?t ! O I O ! O O O O 3 2 2 ? ? G 1 jG jG # G tM c t?t # tM c c t?t , (2.46) ! O I O O I O 3 2 2 ? G where G"2G "G. In Eq. (2.46) the matrices jG (i"1, 2 , 8) again stand for the SU(3) 1 4 ! degrees of freedom with Tr(j j )"2d while j is given by j"(I with I denoting the 3;3 G H GH unitary matrix in flavor space. The Lagrangian (2.46) in its color-singlet version has been the starting point for RPA-type calculations for the bosonic excitations of the nonperturbative QCD vacuum, i.e. the mesonic degrees of freedom n, g K, KM [67—69]. Similar Lagrangian densities have also been exploited by a variety of authors [39,65,66,70—74] following an early suggestion by Nambu and Jona-Lasinio (NJL) [75]. In the following we will discard the mesonic (RPA-type) sector and concentrate on the determination of a static effective quark-quark interaction by nucleon properties as well as nuclear matter related quantities. Similar concepts have been proposed by Guichon [76] and Saito and Thomas [77] based on bag-model wavefunctions. Here we present a slightly different concept (following Ref. [25]) by determining the quark wavefunctions for the nucleon from the experimental data for the proton electromagnetic formfactor. In this way we ‘circumvent’ the problem of absolute confinement which cannot be dealt with properly using only a color neutral mean-field approach of the NJL-type. Since we will be only interested in energy densities for given quark configurations, the resulting Lagrangian should not be used for dynamical studies such as the RPA response (mesonic sector). Furthermore, it is not expected that the respective soliton solution of Eq. (2.46) for a nucleon presents a dynamically stable object of size and shape consistent with the experimental proton formfactor.
2.5.1. Isospin symmetric systems

In this subsection we concentrate on vacuum as well as nucleon properties, where the nucleons are assumed to be represented by three valence quarks with a fixed phase-space distribution on top of the (truncated) Dirac sea, with a formfactor in line with the experimental data. The color singlet terms of the Lagrangian (2.46) in the mean-field limit, performing the sum over the flavor matrix elements, then lead to the following Lagrangian for $\bar\psi_q = (\bar u, \bar d, \bar s)$,

$\mathcal{L}_q(x) = \sum_{k=u,d,s}\bar\psi_k(i\gamma^\mu\partial_\mu - m_k^0)\psi_k + \sum_{k=u,d,s}\left[\frac{G_S}{2}\{(\bar\psi_k\psi_k)^2 + (\bar\psi_k i\gamma_5\psi_k)^2\} - \frac{G_V}{2}\{(\bar\psi_k\gamma_\mu\psi_k)^2 + (\bar\psi_k\gamma_5\gamma^\mu\psi_k)^2\}\right]$, (2.47)

where the couplings $G_S$ and $G_V$ are now considered as free parameters. For the systems of positive parity also the pseudoscalar and pseudovector terms ($\sim\gamma_5$) vanish in the Hartree limit, such that one is left with the scalar and vector terms only. This is quite similar to the σ-ω model [78,79] in the nuclear physics context. The hamiltonian density then is given by

$H(x) = \sum_{k=u,d,s}\left[\bar\psi_k(-i\gamma^i\partial_i + m_k^0 - G_S\langle\bar\psi_k\psi_k\rangle)\psi_k + \frac{G_S}{2}\langle\bar\psi_k\psi_k\rangle^2 + \frac{G_V}{2}\langle\bar\psi_k\gamma^0\psi_k\rangle^2\right]$, (2.48)

which leads to the gap equations for the effective masses $m_k$, i.e.

$m_k = m_k^0 - G_S\langle\bar\psi_k\psi_k\rangle$. (2.49)

Footnote: The most general four-point interaction compatible with QCD symmetries starts from combinations of all possible vector and axial currents. Therefore, in general, there is no strict relationship between $G_S$ and $G_V$.
Since the problem (2.48) decouples in the flavor degrees of freedom we will in the following consider only u-quarks, assuming $m_u^0 = m_d^0$, and neglect a possible strangeness content of the nucleon. For the nonperturbative vacuum we then end up with the gap equation in phase space for the effective quark mass $m_u$ of u or d quarks:

$m_u = m_u^0 + G_S\frac{g}{(2\pi)^3}\int d^3p\,\frac{m_u}{\sqrt{p^2 + m_u^2}}\,\Theta(\Lambda_S - |p|) = m_V$, (2.50)
where we have introduced a cutoff parameter $\Lambda_S$ to regularize the divergent integral over the Dirac sea. Alternatively, one might also introduce covariant cutoff schemes as in [67], but for reasons to be discussed below in the context of Eq. (2.52) we prefer to use the scheme (2.50), since we are basically interested in quark configurations with a well defined rest frame. In Eq. (2.50) the factor g = 6 arises from the trace over color and spin in Eq. (2.49); it should not be mixed up with the strong coupling constant in Eqs. (2.2) and (2.3). The gap equation (2.50) then leads to a constituent quark mass $m_u$ larger than $m_u^0$ in the nonperturbative vacuum, provided that the coupling $G_S$ is sufficiently large. The coupling constant $G_S$ together with the cutoff parameter $\Lambda_S$ can now be determined via the Gell-Mann, Oakes and Renner relation [80], assuming $\langle\bar uu\rangle = \langle\bar dd\rangle$,

$m_\pi^2 f_\pi^2 = -(m_u^0 + m_d^0)\langle\bar uu\rangle$, (2.51)
where $f_\pi = 93.3$ MeV is the pion decay constant, $m_\pi$ the physical pion mass and $\langle\bar uu\rangle$ the scalar condensate (for u or d quarks in the vacuum). Choosing $m_u^0 = 7$ MeV as an average value of the light quark mass, the quark condensate then amounts to $\langle\bar uu\rangle \approx -(230\ \mathrm{MeV})^3$, a value which is achieved by choosing a cutoff $\Lambda_S \approx 0.59$ GeV and the coupling constant $G_S \approx 4.95$ GeV$^{-2}$ in Eq. (2.50). In the presence of additional localized light valence quarks on top of the Dirac sea the gap equation (2.50) modifies locally to

$m_u(r) = m_u^0 - G_S\frac{g}{(2\pi)^3}\int d^3p\,\frac{m_u(r)}{\sqrt{p^2 + m_u(r)^2}}\,f_u(r, p) + G_S\frac{g}{(2\pi)^3}\int d^3p\,\frac{m_u(r)}{\sqrt{p^2 + m_u(r)^2}}\,\Theta(\Lambda_S - |p|)$, (2.52)

where $f_u(r, p)$ denotes the phase-space distribution of a single u-quark (with fixed spin and color).
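The consistency of the parameters entering the Gell-Mann, Oakes and Renner relation (2.51) can be checked with a few lines of arithmetic. The sketch below solves Eq. (2.51) for the pion mass, using the values quoted above ($m_u^0 = m_d^0 = 7$ MeV, a condensate of $-(230\ \mathrm{MeV})^3$ and $f_\pi = 93.3$ MeV); the function name is an assumption introduced for illustration.

```python
import math

def pion_mass_from_gor(m_u0, m_d0, condensate_scale, f_pi):
    """Solve m_pi^2 f_pi^2 = -(m_u0 + m_d0) <ubar u> for m_pi (all inputs in MeV).

    The condensate is taken as <ubar u> = -(condensate_scale)^3.
    """
    condensate = -condensate_scale ** 3
    m_pi_squared = -(m_u0 + m_d0) * condensate / f_pi ** 2
    return math.sqrt(m_pi_squared)

m_pi = pion_mass_from_gor(7.0, 7.0, 230.0, 93.3)
print(f"m_pi = {m_pi:.1f} MeV")  # close to the physical pion mass of about 140 MeV
```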
Fig. 2.4. The quark condensate $\langle\bar qq\rangle$, normalized to its vacuum value, as a function of temperature T (l.h.s.) and the quark density $\langle q^\dagger q\rangle$ (r.h.s.) within the NJL model for the coupling constants specified in the text.
In the case of homogeneous systems at finite temperature the distribution $f_u(p)$ is well defined either by the quark density $\langle q^\dagger q\rangle$ or the temperature T of the system. Using the coupling constants fixed above, the scalar condensate $\langle\bar qq\rangle$ at finite temperature, i.e.

$f_u(p; T) = \frac{1}{1 + \exp(\epsilon(p)/T)}$ (2.53)

with $\epsilon(p) = \sqrt{p^2 + m_u^2}$, or at finite density ($f_u$ = const for T = 0), drops as shown in Fig. 2.4. The sudden drop of the condensate with temperature (l.h.s. of Fig. 2.4) is comparable with that from the lattice calculations (cf. Fig. 2.1) and appears at $T_c \approx 160$ MeV for the parameters used here. Thus the NJL model also yields the restoration of chiral symmetry at a temperature which is comparable to that expected from lattice QCD calculations. Furthermore, the NJL model can be used to evaluate the scalar condensate as a function of the quark or baryon density, too. As seen from Fig. 2.4 (r.h.s.) the scalar condensate drops almost linearly with the quark density and approaches zero close to $\langle q^\dagger q\rangle \approx 0.6$ fm$^{-3}$, which corresponds to 3-4 $\rho_0$ (nuclear matter density). To combine the effects from finite density and temperature we show in Fig. 2.5 the results from Vogl and Weise [67] for $\langle\bar qq\rangle(T, \rho/\rho_0)$ calculated within a similar approach. It becomes obvious that, in order to explore observable consequences from a dropping quark condensate, regions of high baryon (or quark) density are more favorable than those of vanishing quark density and moderate temperatures. Thus also at SIS energies (≤ 2A GeV) precursor effects from a dropping condensate should be observable, since here densities up to 3 $\rho_0$ can be achieved in nucleus-nucleus collisions [6,7]. For finite hadronic systems the phase-space density $f_u(r, p)$ has to be specified in some convenient model. In mean-field theory $f_u(r, p)$ results from the solution of the Dirac equation

$(-i\gamma^i\partial_i + m_k^0 - G_S\rho^k_S(r) + \gamma^0 G_V\rho^k_V(r))\,\psi_k(r) = \gamma^0 E_k\,\psi_k(r)$ (2.54)

with

$\rho^k_S(r) = \langle\bar\psi_k(r)\psi_k(r)\rangle$, $\rho^k_V(r) = \langle\psi_k^\dagger(r)\psi_k(r)\rangle$, (2.55)

and subsequent Wigner transformation of $\psi_k^\dagger(r - s/2)\psi_k(r + s/2)$. However, since we do not aim at a dynamical theory for the nucleon, due to the lack of confinement in $\mathcal{L}_q(x)$ (2.47), and we are
Fig. 2.5. The mass of a constituent u-quark as a function of temperature T and density ρ (in units of $\rho_0$) from Ref. [67].
only interested in the total energy of well defined quark configurations, we fix $f_u(r, p)$ by the experimental electromagnetic formfactor of the proton, which is well represented in momentum space by a dipole approximation up to momentum transfers $Q^2 \approx 25\ \mathrm{GeV}^2/c^2$ [81]. This implies that the quark charge distribution (of a proton) is of the exponential form [82]

$\langle\psi_q^\dagger(r)\psi_q(r)\rangle \approx N_0\exp(-|r|/b_0) = \rho_q(r)$, (2.56)

where r is given in fm, $b_0 \approx 0.25$ fm and $N_0 = (8\pi b_0^3)^{-1}$ provides normalization to 1. Considering now a nucleon state averaged over spin and isospin, i.e. a mixture of proton, neutron and Δ's of average mass $M_N \approx 1.085$ GeV, we obtain for the u-quark density

$\rho_u(r) \approx \frac{3}{2}N_0\exp(-|r|/b_0)$, (2.57)

where the factor 3/2 reflects the average u-quark content of the states considered. In the local density approximation the phase-space distribution for u-quarks (at T = 0) then is given by

$f_u(r, p) = \Theta(p_F(r) - |p|)$ (2.58)

with the local Fermi momentum

$p_F(r) = \left(\frac{6}{g}\pi^2\rho_u(r)\right)^{1/3}$. (2.59)

This approximation has been quite successfully applied in the nuclear physics context [7,8] and has also been adopted in [83,84] for quark oriented models. It is a legitimate approximation for the quark phase-space distribution as long as one is interested only in expectation values like the total energy. Inserting $f_u(r, p)$ (2.58) with (2.57) and (2.59) in the gap equation (2.52), one can compute the effective quark mass $m_u(r)$ for the 'nucleon' described above. The resulting coordinate-space dependence of $m_u(r)$ for the 'nucleon' is shown in Fig. 2.6 (full line) together with the u-quark
Fig. 2.6. Effective quark mass $m_u = m_u(r)$ (full line), quark density $\rho_u(r)$ (dashed line) and scalar condensate $-\langle\bar qq\rangle$ (dotted line) as a function of the radial distance r from the center of the 'nucleon' within the NJL model. The figure is taken from Ref. [25].
density $\langle u^\dagger(r)u(r)\rangle = \rho_u(r)$ (dashed line). In the interior of the 'nucleon' the effective quark mass drops to about $m_u^0 = 7$ MeV, and thus the quark scalar selfenergy $U_S^q$ to zero, while it is about 320 MeV in the vacuum, i.e. at large r. Whereas the scalar sector now is fixed by the gap equation (2.52) for arbitrary quark phase-space distributions $f_u(r, p)$ that are at rest within the frame of reference considered here, the local vector quark interaction is modified in order to allow for an explicit momentum dependence. We note that nonlocal generalizations of the NJL Lagrangian have been suggested by Bowler and Birse [85]. We adopt a similar concept and assume that the vector interaction in Eq. (2.47) is mediated by massive color neutral (vector) mesons, which implies modifying the coupling as

$G_V \to G_V\frac{\Lambda_V^2}{\Lambda_V^2 + q^2}$, (2.60)
where $\Lambda_V \approx 1.0$-$1.5$ GeV is a vector cutoff and q denotes the momentum transfer in the quark-quark interaction. This strategy is similar to that used in effective meson-exchange interactions for hadron-hadron scattering [86]. We note that the formfactor (2.60) will essentially weaken the vector interaction at high baryon density, i.e. large $p_F$, or high relative momenta, and thus also lead to a drop of the optical potential for the 'nucleon' at high momenta with respect to the nuclear matter rest frame. The energy density T(r) in phase-space representation thus reads (including a factor of 2 from the summation over u and d quarks)
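$$T(r) = 2g \int \frac{d^3p}{(2\pi)^3} \sqrt{p^2 + m_u^2(r)}\; f_u(r, p) - 2g \int \frac{d^3p}{(2\pi)^3} \sqrt{p^2 + m_u^2(r)}\; \Theta(\Lambda_S - |p|)$$
$$\qquad + 2 \left\{ \frac{1}{2} G_S\, \rho_S^2(r) + \frac{1}{2} G_V\, \frac{g^2}{(2\pi)^6} \int d^3p_1\, d^3p_2\; f_u(r, p_1)\, \frac{\Lambda_V^2}{\Lambda_V^2 + (p_1 - p_2)^2}\, f_u(r, p_2) \right\} - E_0 , \tag{2.61}$$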
where the vacuum contribution
1 dp (p#mH(K !"p")# Go E "!2 g S 1 2 1 1 (2n)
(2.62)
has been subtracted. In Eq. (2.61) $\rho_S = (m_u - m_u^0)/G_S$ is the scalar quark density, whereas the vector quark density is $\rho_V = g/(2\pi)^3 \int d^3p\, f_u(r, p)$ in the local rest frame of the system. The total energy $\langle H \rangle$ of a quark configuration described by $f_u(r, p)$ is then obtained from $\int d^3r\, T(r)$.

The average nucleon energy to be fixed in our case corresponds to 1.085 GeV, which is the average of the nucleon and the $\Delta$ mass. Since all quantities in Eq. (2.61) for $T(r)$ are determined except the quark vector coupling $G_V$ and the cutoff $\Lambda_V$, the vector coupling (for fixed $\Lambda_V \approx 1.5$ GeV) is well determined by the total energy of the quark configuration. Our fit provides $G_V = 4.2$ GeV$^{-2}$ using $f_u(r, p) = \Theta(p_F(r) - |p|)$ with $p_F(r)$ from Eq. (2.59).

The pion-nucleon $\Sigma$-term, defined by the following matrix element with the nucleon state $|N\rangle$,

$$\Sigma_{\pi N} = \frac{1}{2} (m_u^0 + m_d^0)\, \langle N | \bar{u} u + \bar{d} d | N \rangle , \tag{2.63}$$

leads, within the parameters stated above, to $\Sigma_{\pi N} \approx 47$ MeV, which is well in line with the value of $45 \pm 7$ MeV extracted from pion-nucleon s-wave scattering [52]. Thus the NJL model can be tuned well to QCD-related quantities with a minimum of ‘free’ parameters. However, one should always keep in mind that this effective model can only serve for qualitative considerations and should not be considered a dynamical approach equivalent to QCD. On the other hand, it is instructive to explore its predictions for nuclear matter configurations, at least for pedagogical purposes.

2.5.2. Symmetric nuclear matter

In order to evaluate the energy density for symmetric nuclear matter configurations we have to introduce, in addition to $f_q = f_u(r, p)$, a phase-space distribution $f_N$ for the nucleons or ‘localized’ quark states. Denoting by $(r_N, p_N)$ the position and momentum of a nucleon, the corresponding quark phase-space distribution $f_q(r, p)_{r_N, p_N}$ is obtained from a translation of the center of $f_q$ by $r_N$ and a proper Lorentz transformation by $\beta_N = p_N/\sqrt{p_N^2 + m_N^2}$ in phase space, i.e. a contraction of $f_q$ by $\gamma_N^{-1} = \sqrt{1 - \beta_N^2}$ in coordinate space and a dilation in momentum space by $\gamma_N$, which keeps the individual phase-space integral invariant. For the isospin-symmetric nuclear matter problem the nucleon phase-space distribution for fixed spin and isospin at $T = 0$ is given by

$$f_N(r_N, p_N) = \Theta(p_F - |p_N|) , \tag{2.64}$$

with the nucleon Fermi momentum $p_F = (\tfrac{3}{2} \pi^2 \rho_N)^{1/3}$, where $\rho_N$ is the nuclear density.

Here a further problem is related to the change of the nucleon form factor in the medium. As suggested e.g. by the interpretation of the EMC effect by Close et al. [87] or by arguments based on chiral symmetry by Brown and Rho [3] (cf. Section 2.3), the nucleon might change its size in the nuclear medium ($\langle r^2 \rangle \sim g_A^*/f_\pi^{*2}$) [53], such that the vector density of a quark is no longer given by Eq. (2.56). A fully dynamical model of the nucleon in the nuclear medium should give this modification of the form factor in a self-consistent manner. In fact, the dynamical calculations of Christov et al. [72] in a model similar to Eq. (2.47) predict a sizeable swelling of the nucleon with baryon density. Since the Lagrangian (2.47) here is only considered to provide an effective
quark-quark interaction for the energy density (2.61), we model such in-medium effects by modifying the width parameter $b$ in Eq. (2.56) as

$$b_N(\rho_N) = 0.25\ [\mathrm{fm}] \left( 1 + \alpha_N \frac{\rho_N}{\rho_0} \right) , \tag{2.65}$$

with a parameter $\alpha_N$ ($\approx 0.18$) to be determined by the nuclear matter saturation point (see below). The linear dependence on $\rho_N$ in Eq. (2.65) might be questionable at higher baryon density, and alternative functions will yield a different nuclear equation of state due to a modified energy density from the vector interaction; the scalar energy density will not change sensitively, since the quark scalar density essentially drops to zero for $\rho_N > 4\rho_0$ (see below). At $\rho_N > 7\rho_0$ the energy density from the vector interaction also becomes almost independent of the nucleon swelling, due to the large overlap of the ‘quark’ wavefunctions, which leads to an approximately homogeneous quark distribution in r-space. However, one should keep in mind that the region from $2\rho_0$ to $7\rho_0$, below the ‘homogeneous’ quark phase, actually depends on the modification of the nucleon form factor in the medium.

In order to carry out computations for the nuclear matter problem, the quark phase-space distribution $f_u(r, p)$, which enters $T$ in Eq. (2.61), is simulated by characteristic samples $\mu$,

$$f_u^\mu(r, p) = \sum_j f_u(r - r_N^j,\, p - p_N^j;\, b_N(\rho_N)) , \tag{2.66}$$

where $f_u(r, p; b_N)$ denotes the semiclassical quark phase-space distribution for a ‘nucleon’ of width $b_N(\rho_N)$ from Eq. (2.65). The nucleon positions $r_N^j$ are determined by Monte Carlo in a box of volume $V = a^3$ with $-a/2 \le x_N^j, y_N^j, z_N^j \le a/2$. Only those samples are accepted for which the average distance to the next neighbour agrees within 3% with the respective infinite nuclear matter value. The nucleon momenta $p_N^j$ are then selected by Monte Carlo with the constraints $|p_N^j| \le p_F(\rho_N)$ and $\sum_j p_N^j = 0$. Additionally, samples are rejected where the average kinetic energy

$$T_N = \frac{1}{A} \sum_j \left( \sqrt{(p_N^j)^2 + M_N^2} - M_N \right) \tag{2.67}$$

does not match the nuclear matter value within 3%. The density $\rho_N$ in these simulations is given by $\rho_N = A/V$ (input), while $A = 64$ has been adopted throughout the calculations. In order to compute the dependence of the total energy on $\rho_N$ we have scaled the individual positions $r_N^j$ with $a \sim \rho_N^{-1/3}$ and the momenta $p_N^j$ with $\rho_N^{1/3}$.

A snapshot of the quark density (for fixed $z$) for a characteristic sample $\mu$ at normal nuclear matter density $\rho_0$ is shown in the upper part of Fig. 2.7 (l.h.s.); the resulting effective mass $m_u(r)$ according to the gap equation (2.52), for the configuration shown in the upper part (l.h.s.) of Fig. 2.7, is displayed in its lower part. Since at normal nuclear matter density the overlap of the nucleons is only moderate, the individual scalar ‘quark bags’ can still approximately be separated in space at a given time. As an example for higher nucleon density we show a snapshot of the quark distribution at $4 \times \rho_0$ for $\alpha_N = 0.18$ on the r.h.s. of Fig. 2.7 (upper part), together with the corresponding quark mass $m_u(x, y, z = \mathrm{const.})$ (lower part) from the gap equation (2.52). Since the overlap of the quark distributions now becomes substantial, the average quark mass drops to about 30 MeV, indicating almost a restoration of chiral symmetry at $4 \times \rho_0$.
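The Monte Carlo selection of nucleon configurations described above can be illustrated with a short sketch. It draws $A = 64$ positions uniformly in a cubic box and momenta uniformly inside the Fermi sphere, shifts the momenta so that they sum to zero, and rejects samples whose mean kinetic energy deviates by more than 3% from the Fermi-gas value. The function names, the use of NumPy, the value $M_N = 0.938$ GeV and the way the zero-momentum constraint is imposed (subtracting the mean, which may move individual momenta slightly beyond $p_F$) are assumptions made for illustration only; the next-neighbour criterion is omitted.

```python
import numpy as np

HBARC = 0.197327  # GeV fm
M_N = 0.938       # GeV, nucleon mass (assumed value)


def fermi_momentum(rho_n):
    """Fermi momentum in GeV for a density rho_n in fm^-3 (fixed spin and isospin)."""
    return HBARC * (1.5 * np.pi**2 * rho_n) ** (1.0 / 3.0)


def fermi_gas_kinetic_energy(p_f, n_grid=2000):
    """Mean kinetic energy (GeV) of a uniformly filled Fermi sphere, midpoint rule."""
    p = (np.arange(n_grid) + 0.5) * p_f / n_grid
    t = np.sqrt(p**2 + M_N**2) - M_N
    return np.sum(p**2 * t) / np.sum(p**2)


def sample_configuration(rho_n, a_nucl=64, tol=0.03, rng=None, max_tries=10000):
    """Return (positions, momenta, mean kinetic energy, reference kinetic energy)."""
    rng = np.random.default_rng() if rng is None else rng
    box = (a_nucl / rho_n) ** (1.0 / 3.0)      # rho_N = A / V with V = a^3
    p_f = fermi_momentum(rho_n)
    t_ref = fermi_gas_kinetic_energy(p_f)
    for _ in range(max_tries):
        r = rng.uniform(-box / 2, box / 2, size=(a_nucl, 3))
        # uniform filling of the Fermi sphere: isotropic direction, |p| = p_F * u^(1/3)
        direction = rng.normal(size=(a_nucl, 3))
        direction /= np.linalg.norm(direction, axis=1, keepdims=True)
        p = direction * (p_f * rng.uniform(size=(a_nucl, 1)) ** (1.0 / 3.0))
        p -= p.mean(axis=0)                    # enforce sum_j p_j = 0 (assumed method)
        t_mean = np.mean(np.sqrt(np.sum(p**2, axis=1) + M_N**2) - M_N)
        if abs(t_mean - t_ref) <= tol * t_ref:  # 3% kinetic-energy criterion
            return r, p, t_mean, t_ref
    raise RuntimeError("no accepted sample within max_tries")
```

Rescaling an accepted sample to another density then only requires multiplying the positions by $(\rho_N/\rho_N')^{1/3}$ and the momenta by the inverse factor.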
Now performing the integration of $T(r)$ over coordinate space and averaging over characteristic samples $\mu$ for nuclear matter configurations (as shown on the l.h.s. of Fig. 2.7), dividing by the number of nucleons on the grid and subtracting the bare nucleon mass, we can compute the energy per nucleon ($N_\mu \approx 100$),

$$\frac{E}{A}\left( \frac{\rho}{\rho_0} \right) = \frac{1}{N_\mu} \sum_{\mu}^{N_\mu} \frac{1}{A} \int d^3r\; T_\mu(r) - M_N , \tag{2.68}$$
and thus establish a direct link between the energy density of quarks and the energy per nucleon of isospin-symmetric nuclear matter at finite density $\rho/\rho_0$. In Eq. (2.68) the energy density $T_\mu(r)$ is defined by Eq. (2.61) with $f_u(r, p)$ replaced by $f_u^\mu(r, p)$ from Eq. (2.66).

The energy per nucleon (2.68) (for $\alpha_N = 0.18$) is shown in Fig. 2.8 (full line denoted by HSD) in comparison to the Dirac-Brueckner results from [27] (full squares) and the parametrizations POL6 and POL7 of the RBUU approach [16], which were found to optimally describe heavy-ion reactions in the energy regime up to about 1 A GeV. We find the binding energy per nucleon ($\approx -16$ MeV at $\rho_N = \rho_0$) to be reproduced well for $\alpha_N \approx 0.18$, which corresponds to a swelling of the nucleon by 18% at normal nuclear matter density. We note that for $\alpha_N = 0$ there is no minimum in $E/A$ due to the Pauli pressure, such that the swelling of the nucleon, which enhances the scalar attraction and reduces the vector repulsion, is a necessary phenomenon here to achieve proper binding. The resulting incompressibility $K$ of nuclear matter amounts to $K \approx 250$ MeV.
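Eq. (2.68) amounts to a grid integration followed by a sample average. A minimal sketch, assuming that the energy density of each sample is available on a regular Cartesian grid; the array layout, the function name and the value of $M_N$ are illustrative assumptions and not the authors' code:

```python
import numpy as np

M_N = 0.938  # GeV, bare nucleon mass (assumed value)


def energy_per_nucleon(t_samples, cell_volume, a_nucl=64):
    """E/A in GeV following Eq. (2.68).

    t_samples   : array of shape (N_mu, nx, ny, nz), energy density T_mu(r) in GeV/fm^3
    cell_volume : volume of one grid cell in fm^3
    a_nucl      : number of nucleons A on the grid
    """
    t_samples = np.asarray(t_samples, dtype=float)
    # integral d^3r T_mu(r) for each sample mu
    total_energy = t_samples.sum(axis=(1, 2, 3)) * cell_volume
    # average over samples, divide by A, subtract the bare nucleon mass
    return np.mean(total_energy / a_nucl) - M_N
```

As a consistency check, a homogeneous energy density $\rho_0 (M_N - 0.016\ \mathrm{GeV})$ in a box of volume $A/\rho_0$ returns $E/A = -0.016$ GeV by construction.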
Fig. 2.7. Snapshot of the spatial quark distribution (upper parts) (for fixed $z$) at normal nuclear matter density $\rho_0$ (l.h.s.) and $4 \times \rho_0$ (r.h.s.) for $\alpha_N = 0.18$, together with the resulting effective quark mass $m = m_u(x, y)$ (lower parts). The figures are taken from Ref. [25].
Fig. 2.8. Equation of state for nuclear matter; HSD (solid line), DBHF (full squares); RBUU results: POL6 (dotted line), POL7 (dashed line) from Ref. [16]. The figure is taken from Ref. [25].
Since the energy per nucleon from the NJL model lies well between the limits of the parametrizations POL6 and POL7, as extracted from detailed comparisons in Ref. [16] for nucleus-nucleus collisions in the SIS energy regime, we infer that the equation of state generated by the model is quite realistic in the lower density regime ($\rho \le 3\rho_0$). Its extension to $10\rho_0$ (lower part of Fig. 2.8), however, is still questionable and has to be examined in comparison to experimental flow data at much higher (e.g. AGS) bombarding energies. The resulting nuclear equation of state (EOS) in Fig. 2.8 shows no density isomer up to $10\rho_0$ on the basis of the effective NJL model. The thermodynamic pressure

$$P = \rho^2\, \frac{\partial}{\partial \rho} \frac{E}{A} , \tag{2.69}$$
furthermore, increases quadratically for $\rho/\rho_0 > 2$ and slightly levels off at high density, but does not drop to zero in the range considered here.

Summarizing this section, there are many theoretical arguments and model-independent relations that predict a decrease of the scalar quark condensate $\langle \bar{q} q \rangle$ with temperature and density. However, the scalar quark density itself is not a directly measurable observable. On the other hand, the hadron properties should change in the dense medium and in a nuclear environment at high temperature, in line with the scalar quark condensate. Whether there is any scaling at all, and whether this scaling is $\sim \langle \bar{q} q \rangle^{1/3}$ as in the Brown-Rho conjecture or $\sim \langle \bar{q} q \rangle$ as indicated by the QCD sum rule analysis of Hatsuda and Lee, remains to be determined by experiment. Experimental data will also have to clarify whether the ‘mean-field approaches’ discussed in this section are suitable at all in view of the strong residual interactions.
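The pressure relation (2.69) is straightforward to evaluate numerically once $E/A$ is known as a function of density. The sketch below uses a purely illustrative parabolic EOS around saturation, built from the binding energy ($-16$ MeV) and the incompressibility ($K \approx 250$ MeV) quoted in this section. This toy parametrization, the value $\rho_0 = 0.16$ fm$^{-3}$ and the function names are assumptions; it is not the NJL result of Fig. 2.8.

```python
RHO0 = 0.16      # fm^-3, saturation density (assumed value)
E_BIND = -16.0   # MeV, binding energy per nucleon at saturation
K_INC = 250.0    # MeV, incompressibility


def toy_eos(rho):
    """Toy E/A in MeV: parabola with K = 9 rho0^2 d^2(E/A)/drho^2 at rho0."""
    x = (rho - RHO0) / RHO0
    return E_BIND + K_INC / 18.0 * x**2


def pressure(rho, eos=toy_eos, h=1e-5):
    """P = rho^2 d(E/A)/drho in MeV/fm^3, using a central finite difference."""
    return rho**2 * (eos(rho + h) - eos(rho - h)) / (2.0 * h)
```

For this toy EOS the pressure vanishes at $\rho_0$, as required at the saturation point, and equals $(4K/9)\,\rho_0$ at $2\rho_0$.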
3. A covariant transport approach

Covariant transport theories have so far been very successful in describing the dynamics of hadron-nucleus or nucleus-nucleus reactions from the initial nonequilibrium phase to the final stage, where a thermodynamic or chemical equilibrium might be achieved [2,88,89]. Such theories can be derived from standard many-body theory for connected Green functions in phase-space representation by adopting the limit $\hbar \to 0$ in the mean-field propagation part and neglecting off-shell transitions in the set of coupled collision terms [7,26,27]. Since these derivations are well known from the literature [6,7,26,27], we omit an explicit presentation here.

In this section we present a ‘chiral’ transport theory for the hadronic degrees of freedom, denoted as Hadron-String-Dynamics (HSD) [25], which in covariant notation can formally be written as a coupled set of transport equations for the phase-space distributions $f_h(x, p)$ of hadron $h$ [7-9,14,15], i.e.

$$\left\{ \left( \Pi_\mu - \Pi_\nu \partial_\mu^p U_h^\nu - M_h^* \partial_\mu^p U_h^S \right) \partial_x^\mu + \left( \Pi_\nu \partial_\mu^x U_h^\nu + M_h^* \partial_\mu^x U_h^S \right) \partial_p^\mu \right\} f_h(x, p)$$
$$= \sum_{h_2 h_3 h_4 \ldots} \int d2\, d3\, d4 \ldots\; [G^\dagger G]_{12 \to 34 \ldots}\; \delta_\Gamma^4(\Pi + \Pi_2 - \Pi_3 - \Pi_4 \ldots)$$
$$\times \left\{ f_{h_3}(x, p_3) f_{h_4}(x, p_4) \bar{f}_h(x, p) \bar{f}_{h_2}(x, p_2) - f_h(x, p) f_{h_2}(x, p_2) \bar{f}_{h_3}(x, p_3) \bar{f}_{h_4}(x, p_4) \right\} \ldots . \tag{3.1}$$

In Eq. (3.1) $U_h^S(x, p)$ and $U_h^\mu(x, p)$ denote the real parts of the scalar and vector hadron self-energies, respectively, while $[G^\dagger G]_{12 \to 34 \ldots}\, \delta_\Gamma^4(\Pi + \Pi_2 - \Pi_3 - \Pi_4 \ldots)$ is the ‘transition rate’ for the process $1 + 2 \to 3 + 4 + \ldots$, which is taken to be on-shell in the semiclassical limit adopted. (The index $\Gamma$ at the $\delta$-function indicates that off-shell transitions of width $\Gamma$ should also be allowed. In the actual transport simulations, however, we will use the on-shell limit $\Gamma = 0$.) The hadron quasi-particle properties in Eq. (3.1) are defined via the mass-shell constraint [14],

$$\delta(\Pi_\mu \Pi^\mu - M_h^{*2}) , \tag{3.2}$$

with effective masses and momenta (for a hadron of bare mass $M_h$ and momentum $p^\mu$) given by

$$M_h^*(x, p) = M_h + U_h^S(x, p) , \qquad \Pi^\mu(x, p) = p^\mu - U_h^\mu(x, p) , \tag{3.3}$$

while the phase-space factors

$$\bar{f}_h(x, p) = 1 \pm f_h(x, p) \tag{3.4}$$

are responsible for fermion Pauli blocking or Bose enhancement, respectively, depending on the type of hadron in the final/initial channel. The dots in Eq. (3.1) stand for further contributions to the collision term with more than two hadrons in the final/initial channels. The transport approach (3.1) is fully specified by $U_h^S(x, p)$ and $U_h^\mu(x, p)$ ($\mu = 0, 1, 2, 3$), which determine the mean-field
propagation of the hadrons, and by the transition rates $G^\dagger G\, \delta^4(\ldots)$ in the collision term, which describe the scattering and hadron production/absorption rates.

The scalar and vector mean fields $U_h^S$ and $U_h^\mu$ are conventionally determined in the mean-field limit from an effective hadronic Lagrangian density $\mathcal{L}_H$, which is the sum of the Lagrangian densities for the free fields $\mathcal{L}_h^0$ and some interaction density $\mathcal{L}_H^{\mathrm{int}}$, i.e.

$$\mathcal{L}_H = \sum_h \mathcal{L}_h^0 + \mathcal{L}_H^{\mathrm{int}} . \tag{3.5}$$

The actual form of $\mathcal{L}_H^{\mathrm{int}}$, however, is only known for simpler cases at low baryon density $\rho_B$; its general form at high $\rho_B$ and for large relative momenta between the interacting hadrons, which is probed in nucleus-nucleus collisions up to 200 A GeV, is essentially undetermined. This opens up a large parameter space for coupling constants $g_{hh'}$, form factors at the vertices as well as respective powers in the hadron fields, which might lead to various density isomers in the nuclear equation of state or to mesonic condensates, respectively.

In order to reduce this large parameter space and to incorporate aspects of chiral symmetry, we adopt the strategy of specifying $\mathcal{L}_H$ for baryons via the effective Lagrangian for the underlying quark degrees of freedom $\mathcal{L}_q$ for nuclear-matter phase-space configurations (2.46). By comparing the energy density for nuclear-matter configurations from $\mathcal{L}_q$ with that of $\mathcal{L}_H$ on the hadronic side, one can fit the hadronic couplings and vertices in $\mathcal{L}_H$ even for high baryon densities and thus determine $U_h^S$ and $U_h^\mu$ in the transport equation (3.1) in a less arbitrary way.

The ‘hard’ hadronic processes, on the other hand, which govern the r.h.s. of Eq. (3.1), are modeled by the LUND string fragmentation [90], which is known to describe inelastic hadronic reactions in a wide energy regime at zero baryon density. The medium modifications due to the hadron self-energies, however, require the introduction of some conserving approximations in line with the in-medium quasi-particle properties.
With the specification of $U_h^S(x, p)$ and $U_h^\mu(x, p)$ and of the inelastic collision rates $G^\dagger G\, \delta^4(\ldots)$, the transport approach (3.1) is fully defined and can be confronted with experiment.

3.1. Baryon self-energies

In this section we specify the evaluation of the mean fields $U_h^S$ and $U_h^\mu$ for baryons that enter the l.h.s. of the transport equation (3.1) for the mean-field propagation. In view of the rather simple shape of the EOS in Fig. 2.8 and its similarity to the RBUU parameter sets from Ref. [16], it is now almost straightforward to ‘extract’ nucleon self-energies $U_N^S$ and $U_N^\mu$ for the hadronic transport approach.

The scalar and vector mean fields $U_h^S$ and $U_h^\mu$ for nucleons are specified along the lines of Ref. [14]. In order to achieve a covariant transport approach that is also thermodynamically consistent, we parametrize the scalar and vector self-energies in phase-space representation as
KM 4 gN 1 dp M*(x, p) 1 f (x, p) , º1(x, p)"º1 (x)! KM !(p!p) , (2n) m 1 1 KM 4 gN 4 dp PI(x, p) 4 f (x, p) , ºI(x, p)"ºI (x)# KM !(p!p) , (2n) m 4 4
(3.6)
where $f_N(x, p)$ is the nucleon phase-space distribution. Here the effective nucleon mass $M^*(x, p)$ and the kinetic momentum $\Pi^\mu(x, p)$ are given by Eq. (3.3). In Eq. (3.6) $U^S(x)$ and $U^\mu(x)$ are the local parts of the self-energy,

$$U^S(x) = -g_S\, \sigma_H(x) , \qquad U^\mu(x) = g_V\, \omega_H^\mu(x) , \tag{3.7}$$

with
g 4 dp PI I(x, p) f (x, p) , uI (x)" 4 , & m (2n) 4
(3.8)
PI I"PI!PJ(jIº )!M*(jIº1) , N N J while p (x) is obtained from the solution of & 4 dp M*(x, p) f (x, p). mp #Bp #Cp "g , 1 & & & 1 (2n)
(3.9)
The quasi-particle properties are defined via the mass-shell constraint (3.2) and the associated energy—momentum tensor reads
4 dp PI IpJf (x, p)#(jIp (x))(jJp (x))!(jIuH (x))(jJu&(x)) ¹IJ(x)" , & & & H , (2n)
#
1 1 1 1 1 1 mp # Bp # Cp ! (j p )(jHp )! m u&uH # (j uB )(jHu&) 1 & & & H & & 4 H & B 2 3 4 2 2 2 H &
2 4 gN KM 1 1 dp M*(x, p) ! dp M*(x, p) f (x, p) (2n) (2n) m KM !(p!p) , 1 1 2 4 gN KM 4 dp P (x, p) 4 ! dp PH(x, p) f (x, p) gIJ . (3.10) H (2n) (2n) m KM !(p!p) , 4 4 In this hadronic approach with momentum-dependent fields the ‘free’ parameters g, gN , g , gN , m , 1 1 4 4 1 m , KM , KM , B, C allow to describe a variety of equations of state and nucleon self-energies. For 4 1 4 nuclear matter at density o the energy per nucleon is given by , E ¹ " , !M , (3.11) , A o , where M denotes the bare nucleon mass. The evaluation of ¹ for , , f (p)"2H(P )d(P!M*)H(p,!"p") (3.12) , $ at ¹"0 with the nucleon Fermi momentum p, then reduces to the coupled Eqs. (44)—(49) in $ Ref. [14]. The key link for determining the free parameters in the hadronic model above now is the model independent relation for the effective quark mass as a function of (small) o (cf. Eq. (2.21)) , R m (o )"m 1! L, o #2 , (3.13) S , 4 f m , L L
which follows from the Hellmann-Feynman theorem and the GOR relation (2.51) [51,91]. In Eq. (3.13) $m_V$ is the vacuum effective quark mass from Eq. (2.50). In this context we show in Fig. 3.1 the average effective quark mass (in units of the vacuum mass $m_V$) (solid line) as a function of the nuclear density $\rho_N = \rho$, as obtained from the nuclear matter simulations in Section 2.5. The effective quark mass drops by about 35% at $\rho_0$ according to Eq. (3.13) with $\Sigma_{\pi N} = 47$ MeV and essentially continues with a constant slope up to about $2 \times \rho_0 \approx 0.33$ fm$^{-3}$, in line with the Dirac-Brueckner analysis in Ref. [92] (cf. also Ref. [93]). The bare quark mass is then reached at about $\rho_N \approx 0.6$ fm$^{-3}$. It is important to note that the effective nucleon mass $M^*(p_F^N)$ (normalized to the vacuum mass) in the RBUU approach of Ref. [16] shows the same scaling with density up to about $\rho_0$ (parameter sets POL6, POL7 in Fig. 3.1), which is also well in line with Dirac phenomenology.

Thus, observing that the equation of state from the effective NJL model (Fig. 2.8) as well as the relative scaling of the quark mass with nucleon density $\rho_N$ (Fig. 3.1) are very similar to the more traditional RBUU transport approach from Refs. [14,16] at low density $\rho_N$, we fix the parameters $g_S, \bar{g}_S, g_V, \bar{g}_V, \bar{\Lambda}_S, \bar{\Lambda}_V, \ldots$ by the condition

$$\frac{M^*(\rho_N, p_F^N)}{M_N} = \frac{m_u(\rho_N)}{m_V} , \tag{3.14}$$

which essentially determines the scalar self-energy of the nucleon, as well as by the equation of state from Fig. 2.8,

$$\left. \frac{E}{A} \right|_{\mathrm{HSD}} = \frac{T_N}{\rho_N} - M_N . \tag{3.15}$$

The actual values obtained are (parameter set HSD): $m_S = 0.55$ GeV, $m_V = 0.783$ GeV, $g_S = 0.01$, $g_V = 0$, $B = C = 0$, $\bar{g}_S = 9.9$, $\bar{\Lambda}_S = 0.8$ GeV, $\bar{g}_V = 12.4$ and $\bar{\Lambda}_V = 0.75$ GeV.

Via Eqs. (3.14) and (3.15) the scalar and vector self-energies are well determined up to nucleon momenta $P \approx 0.6$ GeV/c, which corresponds to a kinetic energy of about 180 MeV. The further
Fig. 3.1. Effective mass divided by the vacuum mass as a function of the nucleon density $\rho_N = \rho$; quark mass $m = m_u(\rho_N)$ in the NJL approach (solid line, HSD); nucleon mass in the RBUU approach: POL6 (dotted line), POL7 (dashed line). The figure is taken from Ref. [25].
Fig. 3.2. Nucleon self-energies $U_S$, $U_0$ and the Schroedinger equivalent potential $U_{SEP}$ as a function of the nucleon kinetic energy $E$ with respect to the nuclear matter rest frame. HSD (solid lines); DBHF (full squares); exp. data from Hama et al. [94] (crosses). The figure is taken from Ref. [25].

Fig. 3.3. Nucleon self-energies $U_S$, $U_0$ and the Schroedinger equivalent potential $U_{SEP}$ as a function of the nucleon kinetic energy $E$ at normal nuclear matter density $\rho_0$ within the parameter set denoted by HSD. The figure is taken from Ref. [25].
extension up to momenta of 1.7 GeV/c is provided by the optical potential analysis from Hama et al. [94], following Ref. [14]. The scalar and vector nucleon self-energies are then uniquely determined by Eqs. (3.6), (3.7), (3.8) and (3.9) for arbitrary nucleon phase-space distributions (within the parameter set HSD).

In Fig. 3.2 we compare the resulting momentum dependence of the nucleon self-energies at density $\rho_0$ with Dirac-Brueckner results from [27] (full squares). In the lower part of Fig. 3.2 the real part of the Schroedinger equivalent potential (SEP)

$$U_{SEP} = U_S(\rho_0, P) + U_0(\rho_0, P) + \frac{1}{2 M_N} \left( U_S^2(\rho_0, P) - U_0^2(\rho_0, P) \right) + \frac{\sqrt{P^2 + M_N^2} - M_N}{M_N}\, U_0(\rho_0, P) \tag{3.16}$$

is additionally shown (full line) in comparison to the analysis from Hama et al. [94] (crosses) and Dirac-Brueckner computations from [27] up to a kinetic energy $E$ of 1 GeV. This comparison shows that the overall properties of the nucleon self-energies for $\rho_N \le 3\rho_0$ and $E \le 1$ GeV are reasonably well met.

Apart from the close analogy of our results with the $\sigma$-$\omega$ model at ‘low’ momenta (cf. Figs. 2.8 and 3.2), we are especially interested in the ‘high’ momentum properties of the present approach, where the standard $\sigma$-$\omega$ model is known to fail significantly. The respective results from our present approach for the scalar and vector nucleon self-energies as well as the Schroedinger equivalent potential, in analogy to Fig. 3.2, are displayed in Fig. 3.3 up to relative kinetic energies of 14 GeV.
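Eq. (3.16) is simple to evaluate for given self-energies. The following minimal sketch also converts a momentum into a kinetic energy; the value of $M_N$ and the self-energies used in the example below are hypothetical round numbers chosen for illustration and are not the HSD values shown in Fig. 3.2.

```python
import math

M_N = 0.938  # GeV, nucleon mass (assumed value)


def kinetic_energy(p):
    """Kinetic energy in GeV of a nucleon with momentum p in GeV/c."""
    return math.sqrt(p**2 + M_N**2) - M_N


def u_sep(u_s, u_0, p):
    """Schroedinger equivalent potential of Eq. (3.16); all energies in GeV."""
    return (u_s + u_0
            + (u_s**2 - u_0**2) / (2.0 * M_N)
            + kinetic_energy(p) / M_N * u_0)
```

For example, `kinetic_energy(0.6)` gives roughly 0.175 GeV, consistent with the statement that $P \approx 0.6$ GeV/c corresponds to a kinetic energy of about 180 MeV. With the hypothetical inputs $U_S = -0.35$ GeV and $U_0 = 0.29$ GeV, `u_sep` is attractive at $P = 0$ and grows with momentum because of the term linear in the kinetic energy.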
Whereas the scalar and vector nucleon self-energies gradually decrease with momentum (or kinetic energy), which is a consequence of the cutoff $\Lambda_V \approx 1.5$ GeV introduced in Eq. (2.60), the Schroedinger equivalent potential approximately reaches a constant value of about 110 MeV at high energy. Since this is partly a consequence of our restricted parameter space, we do not expect our extrapolations to high momenta to have predictive power. In order to obtain closer bounds on the momentum dependence of the nucleon scalar and vector self-energies, experimental data up to about $E = 10$ GeV are definitely needed. However, comparing the actual (or possible) values of $U_{SEP}(E)$ at high energy, we expect the effects from the real part of the nucleon self-energies to be of minor importance in the initial phase of nucleus-nucleus collisions at bombarding energies of a few A GeV, where nucleon cascading with inelastic nucleon excitations, i.e. the imaginary part of the hadron self-energies, should be dominant (cf. Sections 3.4 and 3.5).

Since a transport approach for high-energy nucleus-nucleus collisions also has to include excited states of the nucleon as well as hyperons (we include nucleons, $\Delta$'s, N*(1440), N*(1535), $\Lambda$ and $\Sigma$ hyperons as well as their antiparticles), their respective self-energies have to be specified, too. In a first approximation we assume here that all baryons made out of light (u, d) quarks have the same scalar and vector self-energies as the nucleons, while the hyperons pick up a factor 2/3 according to their light quark content. Antibaryons will be treated separately (cf. Section 5.1.3).

3.2. Meson self-energies

Whereas the baryon self-energies $U_h^S$ and $U_h^\mu$ are a necessary ingredient for a relativistic transport model to achieve a realistic description of finite nuclei and intermediate-energy nucleus-nucleus reactions, the meson self-energies might be neglected in zeroth order, as in conventional cascade simulations.
However, in order to explore dynamical effects from a phase where chiral symmetry might be restored, they have to be specified as well (on the one-loop level), e.g. by a suitable Lagrangian density. In the HSD approach, where we explicitly propagate pions, kaons, $\eta$'s, $\eta'$'s, the vector mesons $\omega$, $\rho$, $\phi$, $K^*(892)$ and the axial vector meson $a_1$, we assume that the pions as Goldstone bosons do not change their properties in the medium; we also discard self-energies for the $\eta$-mesons in the ‘default’ version. Thus a Lagrangian density for the coupled system of baryons and mesons can be written as

$$\mathcal{L}_H = \mathcal{L}_B + \sum_m \mathcal{L}_m^0 + \mathcal{L}_\rho + \mathcal{L}_\omega + \mathcal{L}_\phi + \mathcal{L}_K + \mathcal{L}_{K^*} , \tag{3.17}$$

where $\mathcal{L}_B$ corresponds to the baryon Lagrangian (density) specified in Section 3.1, $\mathcal{L}_m^0$ is the free meson Lagrangian density for a meson of type $m$ and the $\mathcal{L}_m$ denote the meson-baryon interaction densities. The problem now is to fix $\mathcal{L}_m$ in connection with chiral symmetry constraints.

3.2.1. Kaons and antikaons

The original idea goes back to Kaplan and Nelson [95], who start from an $SU(3)_L \times SU(3)_R$ nonlinear chiral Lagrangian describing the interactions of pseudoscalar mesons and baryons
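(nucleons and hyperons). Here we follow the recent review of Li, Lee and Brown [96] as well as (approximately) their notation:

$$\mathcal{L} = \frac{f_\pi^2}{4} \mathrm{Tr}\, \partial^\mu \Sigma\, \partial_\mu \Sigma^\dagger + \frac{1}{2} f_\pi^2 \Lambda \{ \mathrm{Tr}\, M_q (\Sigma - 1) + \mathrm{h.c.} \} + \mathrm{Tr}\, \bar{B} (i \gamma^\mu \partial_\mu - m_B) B$$
$$\qquad + i\, \mathrm{Tr}\, \bar{B} \gamma^\mu [V_\mu, B] + D\, \mathrm{Tr}\, \bar{B} \gamma^\mu \gamma^5 \{A_\mu, B\} + F\, \mathrm{Tr}\, \bar{B} \gamma^\mu \gamma^5 [A_\mu, B]$$
$$\qquad + a_1 \mathrm{Tr}\, \bar{B} (\xi M_q \xi + \mathrm{h.c.}) B + a_2 \mathrm{Tr}\, \bar{B} B (\xi M_q \xi + \mathrm{h.c.}) + a_3 \{ \mathrm{Tr}\, M_q \Sigma + \mathrm{h.c.} \}\, \mathrm{Tr}\, \bar{B} B , \tag{3.18}$$

where $B$ denotes the spinor for the baryon octet with mass $m_B$ and $M_q$ is the quark mass matrix. $\Lambda$, $D$, $F$, $a_1$, $a_2$ and $a_3$ are free parameters here. Furthermore, $\Sigma$ and $\xi$ are the chiral fields

$$\Sigma = \exp(2 i \pi / f_\pi) , \qquad \xi = \sqrt{\Sigma} = \exp(i \pi / f_\pi) , \tag{3.19}$$

with $\pi$ denoting the pseudoscalar meson octet. The vector $V_\mu$ and axial vector $A_\mu$ currents are defined as

$$V_\mu = \frac{1}{2} (\xi^\dagger \partial_\mu \xi + \xi \partial_\mu \xi^\dagger) , \qquad A_\mu = \frac{i}{2} (\xi^\dagger \partial_\mu \xi - \xi \partial_\mu \xi^\dagger) . \tag{3.20}$$

Expanding $\Sigma$ up to order $1/f_\pi^2$ and keeping only the kaon field $K = (K^+, K^0)^T$, $\bar{K} = (K^-, \bar{K}^0)$, the first terms in Eq. (3.18) give a kinetic and a mass term:

$$\partial^\mu \bar{K} \partial_\mu K - \Lambda (m_u + m_s) \bar{K} K + \ldots , \tag{3.21}$$

where $m_d = m_u$ has been assumed. Restricting to nucleons and kaons, the next terms in Eq. (3.18) yield the bare Lagrangian for the nucleon ($\bar{N} = (\bar{p}\ \bar{n})$) as well as a nucleon-kaon interaction term, i.e.

$$\bar{N} (i \gamma^\mu \partial_\mu - m_B) N - \frac{3i}{8 f_\pi^2} \bar{N} \gamma^0 N (\bar{K} \partial_t K - (\partial_t \bar{K}) K) + \ldots . \tag{3.22}$$

The last 3 terms give

$$\mathrm{Tr}\, \bar{B} (\xi M_q \xi + \mathrm{h.c.}) B \approx 2 m_u \bar{N} N - \frac{\bar{N} N}{2 f_\pi^2} (m_u + m_s) \bar{K} K ,$$
$$\mathrm{Tr}\, \bar{B} B (\xi M_q \xi + \mathrm{h.c.}) \approx 2 m_s \bar{N} N - \frac{\bar{N} N}{f_\pi^2} (m_u + m_s) \bar{K} K ,$$
$$\{ \mathrm{Tr}\, M_q \Sigma + \mathrm{h.c.} \}\, \mathrm{Tr}\, \bar{B} B \approx 2 (2 m_u + m_s) \bar{N} N - \frac{2 \bar{N} N}{f_\pi^2} (m_u + m_s) \bar{K} K . \tag{3.23}$$

Writing all terms in compact form then gives

$$\mathcal{L} = \bar{N} (i \gamma^\mu \partial_\mu - m_N) N - \frac{3i}{8 f_\pi^2} \bar{N} \gamma^0 N (\bar{K} \partial_t K - (\partial_t \bar{K}) K) + \partial^\mu \bar{K} \partial_\mu K - \left( m_K^2 - \frac{\Sigma_{KN}}{f_\pi^2} \bar{N} N \right) \bar{K} K + \ldots , \tag{3.24}$$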
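with the kaon mass squared

$$m_K^2 = \Lambda (m_u + m_s) , \tag{3.25}$$

while the nucleon mass is given by

$$m_N = m_B - 2 \{ a_1 m_u + a_2 m_s + a_3 (2 m_u + m_s) \} . \tag{3.26}$$

The kaon-nucleon $\Sigma$-term, defined by

$$\Sigma_{KN} = \frac{1}{2} (m_u + m_s)\, \langle N | \bar{u} u + \bar{s} s | N \rangle , \tag{3.27}$$

then amounts to

$$\Sigma_{KN} = -\frac{1}{2} (m_u + m_s)(a_1 + 2 a_2 + 4 a_3) . \tag{3.28}$$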
It leads to a reduction of the kaon mass at finite scalar density $\rho_S = \langle \bar{N} N \rangle$ and is denoted as the Kaplan-Nelson term. The other interaction term in Eq. (3.24), which includes the time derivatives of the kaon fields, is the Weinberg-Tomozawa term, which gives a repulsive vector potential for kaons and an attractive vector potential for antikaons. Thus the dispersion relations for kaons and antikaons in the nuclear medium read in leading order [97]:

$$\omega_K(p) = \left[ p^2 + m_K^2 \left( 1 - \frac{\Sigma_{KN}}{f_\pi^2 m_K^2}\, \rho_S + \left( \frac{3 \rho_N}{8 f_\pi^2 m_K} \right)^2 \right) \right]^{1/2} + \frac{3}{8} \frac{\rho_N}{f_\pi^2} ,$$
$$\omega_{\bar{K}}(p) = \left[ p^2 + m_K^2 \left( 1 - \frac{\Sigma_{KN}}{f_\pi^2 m_K^2}\, \rho_S + \left( \frac{3 \rho_N}{8 f_\pi^2 m_K} \right)^2 \right) \right]^{1/2} - \frac{3}{8} \frac{\rho_N}{f_\pi^2} , \tag{3.29}$$

with $m_K$ denoting the bare kaon mass, $f_\pi \approx 93$ MeV and $\Sigma_{KN} \approx 350$ MeV, while $\rho_S$ and $\rho_N$ are the scalar and vector baryon densities, respectively. We note that $\Sigma_{KN}$ is not known very well and may vary from 270 to 450 MeV. Furthermore, there are a couple of corrections to the mean-field result (3.29), as pointed out in [98].

Another approach to the kaon in-medium properties is to extend the relativistic mean-field model to strange mesons and hyperons [99]. Without going into the details here, we show the results of Schaffner et al. [99] for the $K^+$ and $K^-$ mass as a function of the nuclear density $\rho/\rho_0$ in Fig. 3.4, where the label TM1 indicates a ‘soft’ nuclear equation of state while the label NL-Z stands for a ‘hard’ equation of state (cf. Table 1 in [99]). All approaches presented in Fig. 3.4 (RMF denotes the relativistic mean-field model, ChPT the results from chiral perturbation theory incorporating different values for $\Sigma_{KN}$, and the coupled-channel model is described in [99]) agree on the relative sign of the $K$ and $\bar{K}$ potentials, although they differ quite sizeably in the absolute strength. Our intention is not to favor any of these models but merely to point out that strong attractive potentials are expected for antikaons, which should lead to enhanced $K^-$ cross sections in nucleus-nucleus collisions, especially at subthreshold energies (SIS regime), whereas kaons should only see a slightly repulsive potential in the medium and thus be suppressed.
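The leading-order dispersion relations (3.29) can be evaluated directly. The sketch below is an illustration only: the function name, the bare kaon mass of 495 MeV, the unit conversion via $\hbar c$ and the default choice $\rho_S = \rho_N$ (a low-density simplification) are assumptions and are not taken from the text.

```python
import math

HBARC = 197.327   # MeV fm
F_PI = 93.0       # MeV
M_K = 495.0       # MeV, bare kaon mass (assumed value)
SIGMA_KN = 350.0  # MeV, kaon-nucleon Sigma-term


def kaon_energies(p, rho_n, rho_s=None, sigma_kn=SIGMA_KN):
    """Return (omega_K, omega_Kbar) in MeV from Eq. (3.29).

    p is the momentum in MeV/c, densities are given in fm^-3.
    If rho_s is omitted, rho_s = rho_n is used (assumption).
    """
    rho_s = rho_n if rho_s is None else rho_s
    to_mev3 = HBARC**3                                         # fm^-3 -> MeV^3
    vector = 3.0 * rho_n * to_mev3 / (8.0 * F_PI**2)           # Weinberg-Tomozawa term
    scalar = sigma_kn * rho_s * to_mev3 / (F_PI**2 * M_K**2)   # Kaplan-Nelson term
    root = math.sqrt(p**2 + M_K**2 * (1.0 - scalar + (vector / M_K) ** 2))
    return root + vector, root - vector
```

With these inputs, `kaon_energies(0.0, 0.16)` gives a kaon energy of roughly 498 MeV and an antikaon energy of roughly 392 MeV, i.e. a weak repulsion for kaons and a strong attraction for antikaons, in line with the qualitative statement above.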
Fig. 3.4. The energy of kaons and antikaons in nuclear matter (at p"0) as a function of density for the ‘soft’ EOS (l.h.s.) and ‘hard’ EOS (r.h.s.) from Ref. [99] within different approaches (see text).
In the transport approach HSD we assume the same form of $\mathcal{L}$ for the $K^*$-mesons, too. Since this identification lacks fundamental arguments, the vector mesons $K^*$ might couple differently; this question, however, has to be worked out in the future.

3.2.2. The vector mesons $\rho$, $\omega$ and $\phi$

With respect to the interaction of the vector mesons $\rho$, $\omega$, $\phi$ with baryons, a suitable $\mathcal{L}_m$ has been proposed by Meißner in chiral perturbation theory [39]. This Lagrangian has been taken up by Klingl, Kaiser and Weise [100] and worked out in detail with respect to meson photoproduction [101] and vector meson self-energies in the nuclear medium [61]. Here we briefly present the effective Lagrangian and the results obtained in Ref. [61]. Writing the pseudoscalar field (including pions and kaons) as
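$$\Phi = \begin{pmatrix} \frac{\pi^0}{\sqrt{2}} & \pi^+ & K^+ \\ \pi^- & -\frac{\pi^0}{\sqrt{2}} & K^0 \\ K^- & \bar{K}^0 & 0 \end{pmatrix} \tag{3.30}$$

and the vector fields as

$$V_\mu = \begin{pmatrix} \rho + \omega & 0 & 0 \\ 0 & -\rho + \omega & 0 \\ 0 & 0 & \sqrt{2}\, \phi \end{pmatrix}_\mu , \tag{3.31}$$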
an interaction between pseudoscalar and vector mesons is introduced by minimal coupling, i.e.

$$\partial_\mu \Phi \to \partial_\mu \Phi + \frac{i g}{2} [\Phi, V_\mu] , \tag{3.32}$$
whereas pseudoscalar meson interactions with baryons are introduced via [101] F D L "! Tr(BM c c [jIU, B])! Tr(BM c +jIU, B, ) , (3.33) 1 I I > 2f 2f L L where B represents the SU(3) matrix fields of the baryon octet, while F+ 0.51 and D+ 0.75 give the axial coupling g "F#D"1.26. The interaction between the vector fields and the baryons is taken as g L " [Tr(BM c [»I, B])!Tr(BM c B)Tr(»I)] 4 I I 2
(3.34)
with g"g "1/3g according to SU(3) symmetry. For the oN and uN interaction they, M, S, furthermore, include an anomalous tensor coupling with i "6.0 and i "0.1, i.e. M S gi gi L " M NM sp NjIqJ# S NM p NjIuJ , (3.35) 4, 4M IJ IJ 4M , , where M is the bare nucleon mass and N the nucleon spinor. , The current—current correlation functions calculated within this approach [61] are displayed in Fig. 3.5 for the vector mesons o, u and , i.e. !Im P4(u; o)/u at relative momentum p"0 as a function of the energy u which in this case is equal to the invariant mass M. At nuclear densities o /2 and o the o-meson current—current correlation function broadens significantly thereby shifting a large fraction of strength to the low mass region, however, without showing a major shift of the o-meson pole. In case of the u-meson a sizeable shift of the pole to lower masses with density o/o is found as well as a large broadening due to inelastic reactions uNPnN. The meson, furthermore, broadens slightly with density o/o and also shows no sizeable shift of its pole at density o . The question, if such medium modifications of the mesons can actually be tested in pion—nucleus, proton—nucleus or nucleus—nucleus collisions requires the comparison of a variety of experimental data with the corresponding results from nonequilibrium transport approaches which incorporate the main aspects of the meson spectral functions in the dense medium (cf. Sections 5 and 6). 3.3. Quasiparticle parametrization and propagation The hadronic Lagrangian specified above now can be expressed in form of a Hamiltonian by a Legendre transformation and be rewritten in phase-space representation. To facilitate the actual computations, however, a couple of approximations are included which involve a representation of the phase-space distribution f (r, p; t) in terms of ‘testparticles’ [6], i.e. 
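\[ f_h(\mathbf{r}, \mathbf{p}; t) = \frac{1}{N} \sum_{i=1}^{N_h(t)\times N} \delta(\mathbf{r} - \mathbf{r}_i(t))\, \delta(\mathbf{p} - \mathbf{p}_i(t)) , \tag{3.36} \]

where $N$ is the number of testparticles per hadron $h$ and $N_h(t)$ the actual number of hadrons $h$ at time $t$. ‘Stable’ baryons ($N$, $\Lambda$, $\Sigma$, etc.) are propagated with the mass-shell constraint (3.2) while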
W. Cassing, E.L. Bratkovskaya / Physics Reports 308 (1999) 65—233
Fig. 3.5. The current-current correlation functions for the $\rho$ (a), $\omega$ (b) and $\phi$ meson (c) at zero (dashed lines), half (long dashed lines) and normal nuclear density $\rho_0$ (solid lines). The figure is taken from Ref. [61].
unstable hadrons (of short lifetime) are approximated by a Breit-Wigner spectral distribution

\[ F(M) = \frac{2\tilde N}{\pi}\, \frac{M^2\, \Gamma(M)}{(M^2 - M_R^2)^2 + M^2\, \Gamma^2(M)} , \tag{3.37} \]

where $M_R$ denotes the pole of the resonance and $\Gamma(M)$ its mass dependent width given by the imaginary part of the baryon self-energy,

\[ \Gamma(M) = -\mathrm{Im}\, \Sigma_R(M; \rho_B) \tag{3.38} \]

at finite baryon density $\rho_B$. In Eq. (3.37) $\tilde N$ ensures the normalization $\int dM\, F(M) = 1$. For mesons we employ the dispersion relation

\[ \omega_h^2(\mathbf{p}) = \mathbf{p}^2 + m_h^2 + \mathrm{Re}\, \Pi_h(\rho_B, \rho_S; \mathbf{p}) , \tag{3.39} \]
where $\Pi_h(\rho_B, \rho_S)$ is in this case the meson self-energy as a function of the baryon density $\rho_B$ and scalar density $\rho_S$ as determined by the Lagrangian. We note that $\Pi_h$ in general is also momentum dependent; in the calculations to be discussed in this work, however, this explicit momentum dependence is neglected for the mesons throughout. The actual lifetime $\tau_h$ of the hadron with self-energy $\Pi_h$ is given by

\[ \tau_h^{-1} = -\frac{1}{2\omega_h}\, \mathrm{Im}\, \Pi_h . \tag{3.40} \]

The mean-field propagation of all hadrons (for fixed mass) is then determined by the Hamilton equations of motion that follow from their in-medium dispersion relation $\omega_h(\mathbf{p})$ at fixed density. Furthermore, the decay of unstable particles is treated by Monte Carlo according to the exponential decay law

\[ N_h(t) \sim \exp\left( -\frac{t}{\gamma \tau_h} \right) \tag{3.41} \]

with $\tau_h$ from (3.40), while $\gamma$ denotes the Lorentz $\gamma$-factor of the hadron in the calculational frame.

With the specification of the meson-baryon interaction densities $\mathcal{L}$ the real part of the hadron self-energies is now fully defined on the one-loop level and the (semiclassical) propagation of the hadrons by the total derivative of their in-medium dispersion relation. The actual elastic and inelastic scattering or decay process, as described by the imaginary part of the hadron self-energies, is modelled by a set of collision terms.

3.4. Baryon-baryon reaction channels

Whereas in a fully selfconsistent relativistic transport theory the real part and the imaginary part of hadron self-energies are related by means of dispersion relations [7,9,27], it is not justified to employ the model self-energies (specified in Sections 3.1 and 3.2) in dispersion integrals for the imaginary part because the inelastic scattering rate of nucleons and mesons turns out to be wrong in the limit of vanishing baryon density. As known from transport studies at energies below 2A GeV, the elementary cross sections in Eq. (3.1) may be approximated by their values in free space. Thus we will adopt the same strategy and use the explicit cross sections of the BUU model [102] (for $\sqrt{s} \le 2.6$ GeV), which have been successfully tested in the energy regime below 2A GeV bombarding energy, and those of the LUND string formation and fragmentation model (LSM) [90] (for $\sqrt{s} > 2.6$ GeV) in the case of baryon-baryon collisions.

In order to obtain a rough idea about the inelastic cross sections from the LUND string fragmentation model we show in Fig. 3.6 the rapidity spectra for $\pi^+$, $\pi^-$ and $K^0_S$ mesons from pp collisions at $p_{\rm lab} = 12$ GeV/c and 24 GeV/c in comparison to the experimental data from Ref. [103]: the full dots indicate $\pi^+$ mesons, the squares correspond to $\pi^-$ and the triangles denote $K^0_S$. Whereas the pion rapidity distributions are described rather well here, the $K^0_S$ spectra turn out to be slightly too broad in comparison to the data; this has to be kept in mind when analyzing p + A and A + A data later on. In Fig. 3.7 we present the rapidity distributions for $\pi^+$, $\pi^-$, $K^+$ and $K^-$ mesons from the LUND string fragmentation model for pp collisions at SPS energies ($p_{\rm lab} = 400$ GeV/c) in comparison to
Fig. 3.6. Rapidity distributions for $\pi^+$, $\pi^-$ and $K^0_S$ from the LUND string fragmentation model for pp collisions at $p_{\rm lab} = 12$ GeV/c (l.h.s.) and 24 GeV/c (r.h.s.) in comparison to the experimental data from Ref. [103].

Fig. 3.7. Rapidity distributions for $\pi^+$, $\pi^-$, $K^+$ and $K^-$ mesons from the LUND string fragmentation model (solid lines) for pp collisions at $p_{\rm lab} = 400$ GeV/c in comparison to the experimental data from Ref. [104].

the experimental data from Ref. [104]: the full dots indicate $\pi^+$ mesons, the squares correspond to $\pi^-$, ‘up’ triangles to $K^+$ and ‘down’ triangles to $K^-$ mesons. As in the case of Fig. 3.6, the LUND string fragmentation model (which has been slightly modified as compared to the original version [90]) describes the pp data quite well and thus can be used confidently within the transport approach. Whereas the LSM is well suited for describing inelastic hadron-hadron collisions at rather high invariant energy, its results for meson production close to the respective thresholds become
questionable. Therefore, explicit parametrizations of the individual channels NNPNNm have to be used at low invariant energy in accordance with experimental data. This is of particular importance when studying ‘subthreshold’ particle production, i.e. particle production at bombarding energies per nucleon below the threshold in ‘free’ nucleon—nucleon collisions. We start with the production of kaons close to threshold since any conclusion on in-medium effects of kaons (cf. Section 3.2) will strongly depend on the accuracy of the production cross sections used. We thus present the explicit parametrizations used to allow for independent control and future improvement by experimental data-sets. The isospin averaged production cross section of a K>K and K>R pair in a nucleon—nucleon collision is related to the measured isospin channels as " p , (3.42) p ,,)>K, NN)>KN p " (p #p ). (3.43) ,,)>R, NN)R>N NN)>RN Following [105] the reaction cross sections are approximated by a fit to the experimental data as
s s p [lb] , >K (s)"732 1! NN) N s s
(3.44)
s s p [lb] , R> (s)"338.46 1! NN) N s s
(3.45)
s s [lb] p (s)"275.27 1! NN)>RN s s
(3.46)
with $\sqrt{s_0} = m_\Lambda + m_N + m_K$ for the $\Lambda$ channel and $\sqrt{s_0} = m_\Sigma + m_N + m_K$ for the $\Sigma$ channels. According to isospin relations the $N\Delta$ and $\Delta\Delta$ production channels in our approach get additional factors of 3/4 and 1/2, respectively. The latter isospin factors are taken differently in the literature depending on the underlying boson exchange model [93]. This has to be kept in mind before drawing final conclusions because the $\Delta N$ and $\Delta\Delta$ channels cannot be measured experimentally. The isospin averaged $K^-$ production cross section from nucleon-nucleon collisions is taken from [106] in the parametrization ($K = (K^0, K^+)$, $\bar K = (\bar K^0, K^-)$)
s s (s)+a 1! p ,,,,))M s s
(3.47)
with $a = 1.5$ mb and $\sqrt{s_0} = 2m_N + m_K + m_K^*$. The channels $N\Delta$ and $\Delta\Delta$ are taken to be the same as Eq. (3.47) due to the $K\bar K$ pair in the final state. The latter channels play an essential role due to the formation of resonance matter in the heavy-ion collision zone [107] and, due to the lack of experimental information, provide a major source of uncertainty in the analysis also for antikaon production. The parametrizations for the $K^+$ and $K^-$ cross sections from pp collisions are displayed in Fig. 3.8 by the solid lines in comparison to the inclusive experimental data for $K^+$ production (open squares) and $K^-$ production (full circles) as a function of the invariant energy above threshold. The cross section for $K^+$ production includes both $\Lambda$- and $\Sigma$-hyperon reaction channels. The dash-dotted line (denoted by Z&S) reflects the parametrization from Ref. [108] that has been
Fig. 3.8. The parametrizations for the isospin averaged inclusive $K^+$ and $K^-$ cross sections from NN collisions as a function of the invariant energy $\sqrt{s}$ above threshold $\sqrt{s_0}$ (solid lines) in comparison to the experimental data [111]. The dash-dotted line is the parametrization from Ref. [108] for $K^-$ production whereas the open square (denoted by OBE) is from the OBE calculation of Ref. [106]. The figure is taken from Ref. [160].
used in many of the earlier studies [109,110] of $K^-$ production in heavy-ion collisions and close to threshold is larger than our cross section for $K^-$ by several orders of magnitude. Our parametrization for the inclusive $K^-$ cross section is essentially based on the results of the boson-exchange model from [106] for the isospin averaged cross section close to threshold (open square denoted by OBE in Fig. 3.8). In this OBE model the different exclusive isospin channels are related to each other via the same Feynman graphs, so that a comparison to a much larger set of experimental data can be established. As an example for antikaon production we show in the upper left part of Fig. 3.9 the result of the calculation for the reaction $pp \to pnK^+\bar K^0$ in comparison to the experimental data from [111]. Other channels like $pp \to p\Lambda K^+$, $pp \to p\Sigma^0 K^+$ or $pp \to K^0\Sigma^+ p$ are also described reasonably well within this OBE approach, as seen from Fig. 3.9. We thus expect the parametrization (3.47) to be more realistic than that of Ref. [108]. However, a detailed experimental study close to threshold energies should be performed to obtain accurate numbers for the elementary production channels. We finally note that the ratio of the $K^+/K^-$ production cross sections increases when going closer to threshold; this is expected because the final phase space for $K^+$ production is of three-body type while it is a four-body phase space in the case of antikaon production.

We have furthermore incorporated the antikaon production by hyperon-baryon collisions, which is evaluated in the OBE approach, too. The resulting cross sections for $\Sigma N \to NN\bar K$ and $N\Lambda \to NN\bar K$ are displayed in Fig. 3.10 by the dashed and solid line, which show maxima below
Fig. 3.9. Comparison of our parametrizations with the experimental cross sections [112] for the isospin channels $pp \to pnK^+\bar K^0$, $pp \to p\Lambda K^+$, $pp \to p\Sigma^0 K^+$, $pp \to p\Sigma^+ K^0$. The figure is taken from Ref. [160].
1 mb. We will find in Section 5 that these channels are of minor importance in p-nucleus and nucleus-nucleus collisions.

The cross sections for the inclusive $K^+$, $\rho$, $\omega$ and $\phi$-meson production from p + p collisions within the Lund-String-Model (LSM) are shown in Fig. 3.11 as a function of the invariant collision energy $\sqrt{s}$ in comparison to the experimental data from [112]. The solid lines show fits to the LSM results according to

\[ \sigma(p + p \to M + X) = a\, (x - 1)^b\, x^{-c} \tag{3.48} \]
Fig. 3.10. Antikaon production cross sections from hyperon—nucleon collisions according to an OBE calculation. The figure is taken from Ref. [160].
with $x = s/s_0$ and $a$, $b$, $c$ from Table 1, whereas in the case of $\omega$ production the full dots stand for the LSM results, as do the full squares in the case of $\rho$ production. We additionally show the cross sections for exclusive meson production within the One-Boson Exchange Model (OBE) from Sibirtsev [113] by the dashed lines in comparison to the respective data from [112] (triangles). Whereas in the case of $K^+$, $\rho$, $\omega$ production the inclusive LSM results smoothly match the OBE calculations for the exclusive channels close to threshold, this is no longer the case for $\phi$-meson production because the string fragmentation model requires the formation of two $s\bar s$ pairs, which shifts the threshold up to higher energies. Since there are presently no experimental data available for $\sqrt{s} < 3.5$ GeV in the case of p + p reactions, we adopt the maximum of the cross sections from the OBE and LSM calculations for the inclusive $\phi$-meson production.

For antiproton production we adopt the results from Lykasov et al. [114] obtained within an OBE model for the exclusive reaction $pp \to ppp\bar p$. In this approach the $\bar p$ cross section above threshold increases proportionally to the 4-body phase space; at higher energies the inclusive cross section from the LSM is again adopted. Combining the results from the OBE and LSM calculations, the inclusive $\bar p$ cross section can be parametrized by [115]

\[ \sigma(pp \to ppp\bar p) = 0.12\, (x - 1)^{\ldots}\, x^{-\ldots}\ [\mathrm{mb}] \tag{3.49} \]

with $x = s/(4M_N)^2$ (cf. Section 5.1.3). This cross section is considerably smaller close to threshold energies than the parametrization from Batko et al. [116], which has been adopted in most of the transport studies on antiproton production in nucleus-nucleus collisions so far. For $N\Delta$ and $\Delta\Delta$ production channels we will also use the parametrization (3.49).
Fig. 3.11. Cross sections for the exclusive reactions $p + p \to K^+ + \Lambda + p$, $p + p \to \rho + p + p$, $p + p \to \omega + p + p$ and $p + p \to \phi + p + p$ (triangles) and the inclusive reactions $p + p \to K^+ + X$, $p + p \to \rho + X$, $p + p \to \omega + X$ and $p + p \to \phi + X$ (dots) from [112]. The solid lines show the fits of the LSM calculations, while the dashed lines refer to the OBE calculations for the exclusive channels. The full dots (for $pp \to \omega + X$) and full squares (for $pp \to \rho + X$) denote the explicit values from the LSM calculations. The figures are taken from Ref. [117].

Table 1
The parameters of Eq. (3.48) for proton-proton reactions. Here the index implies a shift $x \to x - 1.3$

Meson     s_0 (GeV^2)   a (mb)   b      c
K^+       6.49          1.12     1.47   1.22
rho       7.01          2.2      1.47   1.1
omega     7.06          2.5      1.47   1.11
phi       8.38          0.09     2.54   2.09
eta       5.88          2.5      1.47   1.25
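The fit function of Eq. (3.48) with the Table 1 parameters can be evaluated as in the following minimal sketch. The names `TABLE_1` and `sigma_inclusive` are illustrative and not part of the transport code. The shift $x \to x - 1.3$ mentioned in the table caption is exposed as an optional argument and is not applied by default, since this sketch makes no assumption about which meson it belongs to.

```python
# Sketch of Eq. (3.48): sigma = a * (x - 1)**b * x**(-c), with x = s / s_0.
# Parameter values are copied from Table 1; all names are illustrative.

TABLE_1 = {
    # meson: (s_0 [GeV^2], a [mb], b, c)
    "K+":    (6.49, 1.12, 1.47, 1.22),
    "rho":   (7.01, 2.2,  1.47, 1.1),
    "omega": (7.06, 2.5,  1.47, 1.11),
    "phi":   (8.38, 0.09, 2.54, 2.09),
    "eta":   (5.88, 2.5,  1.47, 1.25),
}


def sigma_inclusive(meson, s, shift=0.0):
    """Inclusive p + p -> M + X cross section in mb.

    s is the squared invariant energy in GeV^2. The optional shift
    replaces x by x - shift (assumption: off unless requested).
    """
    s0, a, b, c = TABLE_1[meson]
    x = s / s0 - shift
    if x <= 1.0:
        # At or below threshold the cross section vanishes.
        return 0.0
    return a * (x - 1.0) ** b * x ** (-c)


if __name__ == "__main__":
    for name in TABLE_1:
        print(name, sigma_inclusive(name, 25.0))
```

By construction the cross section vanishes at $x = 1$ and grows like $(x-1)^b$ just above threshold, while the factor $x^{-c}$ controls the high-energy behaviour.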
3.5. Meson-baryon reaction channels

Elastic and inelastic reactions of mesons with baryons are treated in analogy to baryon-baryon collisions via the LUND string formation and fragmentation model [90] above $\sqrt{s} = 2.3$ GeV, whereas for lower invariant energies experimental cross sections from [112] are incorporated whenever possible since the latter are dominated by intermediate baryon resonance formation and decay. For $\sqrt{s} \le 2.3$ GeV the cross section for $\rho$, $\omega$ and $\phi$-meson production from the reaction

\[ \pi + N \to R \to M + N \tag{3.50} \]

is most conveniently described within a ‘resonance’ model (cf. Ref. [117]). Assuming that the squared matrix element is proportional to a Breit-Wigner function, the cross section for the reaction (3.50) can be written (nonrelativistically) as

\[ \sigma(\pi + N \to M + N; s) = \frac{\pi}{k^2}\, \frac{2J + 1}{2}\, \frac{B_{\rm in} B_{\rm out}\, \Gamma^2}{(\sqrt{s} - M_R)^2 + \Gamma^2/4} \times R_2(s) , \tag{3.51} \]

where $J$, $B_{\rm in}$ and $B_{\rm out}$ are the resonance spin and the branching ratios of the incoming and outgoing channels, respectively, while the factor $R_2$ stands for the phase-space volume of the final particles,

\[ R_2 = \pi\, \lambda^{1/2}(s, m_N, m_M)/s^{1/2} , \tag{3.52} \]

with $m_N$ and $m_M$ denoting the masses of the final nucleon and meson, respectively. In Eq. (3.51) $k$ is given by

\[ k^2 = \lambda(s, m_\pi, m_N) , \tag{3.53} \]

with the Källén function

\[ \lambda(z, x, y) = [z - (x + y)^2]\,[z - (x - y)^2]/4z . \tag{3.54} \]

The parameters $M_R$ and $\Gamma$ from Eq. (3.51) are the mass of the ‘baryon resonance’ and its full width, respectively. We note that within the resonance description we neglect coherent contributions from the available baryons as well as possible interference terms since we do not have experimental information on their actual magnitude.

Within this semiphenomenological approach the experimental data on $\rho$, $\omega$ and $\phi$-meson production in pion induced reactions have been fitted in order to extract the mass and width of the ‘effective’ resonance $R$. The fits to the data [112] (taken from Sibirtsev et al. [117]) are shown in Fig. 3.12 with the parameters listed in Table 2. For convenience, the data are plotted as a function of $\sqrt{s} - \sqrt{s_0}$ with $s$ denoting the squared invariant energy of the colliding particles and $\sqrt{s_0} = m_N + m_M$ in accordance with the reaction threshold. In Fig. 3.12 the fitted resonance cross sections for the exclusive reactions $\pi + N \to \omega + N$, $\rho + N$ and $\phi + N$ are given by the dashed lines in comparison to the data from [112].

Furthermore, the calculated results within the LUND-String-Model (LSM) for these reactions can be fitted again by a function of the form

\[ \sigma(\pi + p \to M + X) = a\, (x - 1)^b\, x^{-c} \tag{3.55} \]
Fig. 3.12. Experimental cross sections for the exclusive reaction $\pi^+ + p \to \rho^+ + p$ (triangles) (l.h.s.), $\pi^+ + n \to \omega + p$ (squares), $\pi^+ + n \to \phi + p$ (triangles) (r.h.s.) and the inclusive reactions $\pi^+ + p \to \rho, \omega, \phi + X$ (full dots) from Ref. [112]. The lines represent the parametrizations as discussed in the text. The figures are taken from Ref. [117].
Table 2
The effective mass and width of the baryonic resonance in Eq. (3.51), as well as $B = B_{\rm in} \times B_{\rm out} \times (2J + 1)$

Meson     M_R (GeV)   Gamma (GeV)   B (mub GeV^{-...})
rho       1.809       0.99          413
omega     1.809       0.99          302
phi       1.8         0.99          5.88
Table 3
The parameters of Eq. (3.55) for pion induced reactions

Meson     s_0 (GeV^2)   a (mb)   b      c
rho       2.917         3.6      1.47   1.25
omega     2.958         4.8      1.47   1.26
phi       3.831         0.09     2.54   2.1

Implies a shift $x \to x - 1.3$.
up to a few 100 GeV, where the scaling variable is defined as $x = s/s_0$. The parameters $a$, $b$, $c$ and $s_0$ for all channels of interest are listed in Table 3. The fits to the LSM results are shown by the solid lines in Fig. 3.12 and reproduce reasonably well the available data on vector meson production from inclusive processes at high energies.
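The kinematic functions entering the resonance model, Eqs. (3.52)-(3.54), can be sketched as follows. This is a minimal illustration under two assumptions: the Källén function is taken with squared mass sums, $\lambda(z, x, y) = [z - (x+y)^2][z - (x-y)^2]/4z$, so that $\lambda(s, m_1, m_2)$ is the squared cms momentum, and the two-body phase space is taken as $R_2 = \pi \lambda^{1/2}/\sqrt{s}$. The function names are illustrative.

```python
import math


def kallen(z, x, y):
    """Kallen function in the convention of Eq. (3.54); x and y are masses."""
    return (z - (x + y) ** 2) * (z - (x - y) ** 2) / (4.0 * z)


def cms_momentum(sqrt_s, m1, m2):
    """Momentum of either particle in the two-body cms (GeV/c)."""
    lam = kallen(sqrt_s ** 2, m1, m2)
    # Below threshold lambda is negative; return zero momentum there.
    return math.sqrt(lam) if lam > 0.0 else 0.0


def two_body_phase_space(sqrt_s, m1, m2):
    """Two-body phase-space volume R_2 = pi * p_cms / sqrt(s) (dimensionless)."""
    return math.pi * cms_momentum(sqrt_s, m1, m2) / sqrt_s


if __name__ == "__main__":
    # Approximate nucleon and pion masses in GeV (assumed values).
    print(cms_momentum(1.5, 0.938, 0.138))
    print(two_body_phase_space(1.5, 0.938, 0.138))
```

A quick consistency check is energy conservation: with $k$ from `cms_momentum`, the sum $\sqrt{k^2 + m_1^2} + \sqrt{k^2 + m_2^2}$ reproduces $\sqrt{s}$.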
In the case of $p\bar p$ production in pion-baryon reactions we adopt the results from Sibirtsev et al. [115], which are parametrized as in Eq. (3.55) with $a = 1$ mb, $b = 2.31$ and $c = 2.3$ (cf. Section 5.1.3).

The cross sections for $\rho + N$, $\omega + N$ and $\phi + N$ interactions at low relative momenta can be extracted from the exclusive $\pi + N$ reactions by assuming that this is the dominant absorption channel. Detailed balance provides the relation

\[ \frac{\sigma_{\pi + N \to M + N}}{\sigma_{M + N \to \pi + N}} = \frac{(2I_M + 1)(2I_N + 1)}{(2I_\pi + 1)(2I_N + 1)}\, \frac{\lambda(s, m_M, m_N)}{\lambda(s, m_\pi, m_N)} , \tag{3.56} \]

where $s$ is the squared invariant energy, $m_N$, $m_\pi$, $m_M$ are the masses, while $I_N$, $I_\pi$ and $I_M$ are the spins of nucleon, pion and vector meson, respectively. Here, the function $\lambda$ is given by Eq. (3.54). Using the parametrizations for the cross sections $\sigma_{\pi + N \to M + N}$ of Eq. (3.51) we then obtain the cross sections for vector meson-nucleon interactions via Eq. (3.56). We note that the cross sections extracted from detailed balance are only related to the particular inelastic reaction channel $MN \to \pi N$ and do not saturate the total cross sections at momenta $\ge 100$ MeV/c. Indeed the $2\pi N$, $3\pi N$, $m\pi N$ or $K\bar K N$ final states as well as elastic scattering can also occur, which all contribute to the total cross sections.

For $\eta$ production in $\pi N$ reactions at low invariant energies we use the parametrization from [118] given by
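\[ \sigma_{\pi N \to \eta N}(\sqrt{s}) = \begin{cases} 13.07 \left( \dfrac{\sqrt{s} - \sqrt{s_0}}{\mathrm{GeV}} \right)^{\ldots} [\mathrm{mb}] & \text{for } \sqrt{s_0} \le \sqrt{s} \le 1.589\ \mathrm{GeV} , \\[2mm] 0.1449 \left( \dfrac{\mathrm{GeV}}{\sqrt{s} - \sqrt{s_0}} \right)^{\ldots} [\mathrm{mb}] & \text{for } 1.589\ \mathrm{GeV} \le \sqrt{s} , \end{cases} \tag{3.57} \]

whereas for kaon production channels we adopt the detailed parametrizations from Tsushima et al. [119] that have been fitted to the experimental data. The experimental $\pi^- p \to pK^0K^-$ cross section, furthermore, can be expressed by [106],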
s s [mb] , p(n\pPpKK\)"1.121 1! s s
(3.58)
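where $\sqrt{s}$ is the invariant mass of the $\pi N$ system and $\sqrt{s_0} = m_N + m_K + m_K^*$. Exploring isospin symmetries [106], the other cross sections can be related to $\sigma(\pi^- p \to pK^0K^-)$ by

\[ \begin{aligned} 2\sigma(\pi^+ p \to pK^+\bar K^0) &= 2\sigma(\pi^+ n \to nK^+\bar K^0) = \sigma(\pi^+ n \to pK^+K^-) \\ &= \sigma(\pi^+ n \to pK^0\bar K^0) = \sigma(\pi^0 p \to nK^+\bar K^0) = 4\sigma(\pi^0 p \to pK^+K^-) \\ &= 4\sigma(\pi^0 p \to pK^0\bar K^0) = \sigma(\pi^0 n \to pK^0K^-) = 4\sigma(\pi^0 n \to nK^+K^-) \\ &= 4\sigma(\pi^0 n \to nK^0\bar K^0) = 2\sigma(\pi^- p \to pK^0K^-) = \sigma(\pi^- p \to nK^+K^-) \\ &= \sigma(\pi^- p \to nK^0\bar K^0) = 2\sigma(\pi^- n \to nK^0K^-) . \end{aligned} \tag{3.59} \]

These isospin relations have been found in [106] to be well in line with the experimental data from [112].

Note that in Ref. [115] the parameter $c$ must be replaced by $-c$.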
Fig. 3.13. $K^-N$ inelastic (solid line) and elastic (dashed line) cross section as a function of the kaon laboratory momentum $p_K$ as fitted to the experimental data from [112]. The dotted line displays the elastic $K^+N$ cross section. The figure is taken from Ref. [160].
A further important production channel for antikaons is given by the flavor exchange reactions $\pi\Lambda \to K^-N$ and $\pi\Sigma \to K^-N$, where the strange quark from the hyperon is exchanged with a light (u, d) quark. These channels are determined by detailed balance from the inverse reactions, which are the dominant channels for $\bar K$ absorption on nucleons at low energies. The latter absorption cross section from Ref. [120] is displayed in Fig. 3.13 by the solid line together with the elastic $K^-N$ cross section (dashed line) as a function of the K-meson momentum with respect to the nucleon at rest. It should be noted that these cross sections are rather well known experimentally [112] and that the parametrization provides an optimal fit through the data.

Apart from the antikaon final state interactions shown in Fig. 3.13, $K^+$ elastic scattering with nucleons also has an impact on the final kaon spectra. The elastic cross section employed is displayed in Fig. 3.13 by the dotted line and indicates that $K^+$ rescattering is of minor importance; however, it is explicitly included in the actual calculations since it slightly modifies the $K^+$ spectra from p + A and A + A collisions.

Using detailed balance, i.e. Eq. (3.56), the $\pi$-hyperon production channels can be computed from the parametrizations of the isospin averaged $\bar K N \to \pi Y$ cross sections [121]. The corresponding results are displayed in Fig. 3.14 as a function of the invariant energy $\sqrt{s}$ (solid and dashed line). Due to the rather well known data for $K^-N$ scattering [112] these flavor exchange reactions are expected to be well determined in the actual calculations.
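The detailed-balance step can be illustrated with a short sketch in the spirit of Eq. (3.56): the cross section of the inverse reaction follows from the forward one through the ratio of spin multiplicities and of the squared cms momenta. The generic two-body signature, the function names and the mass values below are illustrative assumptions, and the Källén function is taken with squared mass sums as in the convention of Eq. (3.54).

```python
def kallen(z, x, y):
    """Squared cms momentum for masses x, y at squared invariant energy z."""
    return (z - (x + y) ** 2) * (z - (x - y) ** 2) / (4.0 * z)


def detailed_balance(sigma_ab_to_cd, sqrt_s,
                     m_a, m_b, m_c, m_d,
                     spin_a, spin_b, spin_c, spin_d):
    """Cross section for c + d -> a + b from the one for a + b -> c + d.

    sigma(cd -> ab) = sigma(ab -> cd)
        * (2 s_a + 1)(2 s_b + 1) * lambda(s, m_a, m_b)
        / [(2 s_c + 1)(2 s_d + 1) * lambda(s, m_c, m_d)]
    """
    s = sqrt_s ** 2
    lam_ab = kallen(s, m_a, m_b)
    lam_cd = kallen(s, m_c, m_d)
    if lam_ab <= 0.0 or lam_cd <= 0.0:
        # One of the channels is closed at this energy.
        return 0.0
    g_ab = (2.0 * spin_a + 1.0) * (2.0 * spin_b + 1.0)
    g_cd = (2.0 * spin_c + 1.0) * (2.0 * spin_d + 1.0)
    return sigma_ab_to_cd * g_ab * lam_ab / (g_cd * lam_cd)


if __name__ == "__main__":
    # Kbar N -> pi Lambda turned into pi Lambda -> Kbar N at sqrt(s) = 1.7 GeV.
    # The forward cross section of 10 mb is a placeholder, not a fitted value.
    print(detailed_balance(10.0, 1.7, 0.494, 0.938, 0.138, 1.116,
                           0.0, 0.5, 0.0, 0.5))
```

Applying the function twice with the channels interchanged returns the original cross section, which is a convenient check on the bookkeeping of the spin and momentum factors.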
The Lorentz invariant differential cross sections entering the r.h.s. of Eq. (3.1) as well as all low energy reactions, finally, due to two-body kinematics near threshold are completely determined when assuming an isotropic angular distribution in the pion-nucleon cms.

3.6. Meson-meson reaction channels

Especially in ultrarelativistic nucleus-nucleus collisions the meson density may become much higher than the baryon density such that meson-meson reaction channels become even more important. Since these reactions are of secondary, ternary and higher order, the average invariant energy in these collisions is rather small. It is thus convenient to use a Breit-Wigner resonance scheme because the total width and the individual branching ratios can be adopted from the particle data booklet [122] without introducing any new parameters. In this sense reactions of the type $a + b \to m_R \to c + d$ are described by Breit-Wigner cross sections

\[ \sigma_{ab \to R \to cd}(\sqrt{s}) = \frac{2J_R + 1}{(2S_a + 1)(2S_b + 1)}\, \frac{4\pi}{p_i^2}\, \frac{s\, \Gamma_{R \to ab}\, \Gamma_{R \to cd}}{(s - M_R^2)^2 + s\, \Gamma_{\rm tot}^2} . \tag{3.60} \]

In Eq. (3.60) $ab$ and $cd$ denote the mesons in the initial and final channel while $R$ is the mesonic resonance (e.g. $\rho$, $a_1$, $\phi$, ...); $J_R$, $S_a$ and $S_b$ are the spins of the resonance $R$ and the initial mesons, respectively. $\Gamma_{R \to ab}$ and $\Gamma_{R \to cd}$ denote the partial widths in the initial and final channel while $\Gamma_{\rm tot}$ is the total resonance width and $M_R$ its mass, which may depend on the properties of the mesons in the nuclear medium (cf. Section 3.2). The quantity $p_i$, finally, is the initial meson momentum in the resonance rest frame. We note that especially the p-wave pion-pion scattering with an intermediate $\rho$ as well as $K\bar K \to \phi$ and $\pi\rho \to \phi$ or $\pi\rho \to a_1$ play an important role in dilepton physics to be discussed in Section 6.

The strangeness production in meson-meson collisions also becomes an important issue at high bombarding energies where especially high pion densities are achieved. We thus include the $K\bar K$ production by meson-meson reactions. The isospin averaged cross section for the $\pi\pi \to K\bar K$ channel can be parametrized by [30]
s +a 1! pN LL))M s
(3.61)
with $a = 2.7$ mb and $s_0 = (m_K + m_{\bar K})^2$. This cross section is also shown in Fig. 3.14 by the dash-dotted line. For the other nonstrange mesons we use the same expression as a function of the invariant energy squared.

3.7. Formation time $t_F$

The implementation of the LUND string formation and fragmentation model [90], which describes the free transition probabilities, in a covariant transport theory requires a time scale to transform the cross sections to collision rates and particle production rates. An
Fig. 3.14. Antikaon production cross section from $\pi$ + hyperon collisions according to Eq. (3.56). The dash-dotted line represents the channel $\pi\pi \to K\bar K$ according to Eq. (3.61). The figure is taken from Ref. [160].
appropriate time scale is given by a string formation time $t_F$, which denotes the time between the formation and fragmentation of the string in the individual hadron-hadron center-of-mass system (cms) for a particle of rapidity $y = 0$. Due to covariance this time should also be related to the spatial extension of the interacting hadrons, which on average gives $t_F \approx 0.7$-$0.8$ fm/c for the pseudoscalar and vector mesons. The actual value of $t_F$ is defined for a hadron being formed at rest. Due to covariance, hadrons with finite momentum $p$ are then formed at time $t_F \times \gamma$, where $\gamma$ denotes the Lorentz $\gamma$-factor in the computational frame.

In order to demonstrate the sensitivity of the proton rapidity spectra $dN/dy$ to the actual value of $t_F$ we show in Fig. 3.15 the proton rapidity spectra for central collisions of Ca + Ca at 30A GeV (from [25]). It is seen that $t_F$ essentially controls the rapidity distribution at midrapidity and at projectile and target rapidity, i.e. the baryon stopping in relativistic nucleus-nucleus collisions. We will adopt $t_F = 0.7$-$0.8$ fm/c for the calculations to be presented below; similar values are also used in the RQMD approach [123] and in the UrQMD model of the Frankfurt group [21].

Having now specified the ‘ingredients’ of the HSD transport approach, we are able to perform calculations for $\pi p$, $pp$, $\pi A$, $pA$ and $AA$ collisions from threshold up to $\sqrt{s} \approx 200$-$300$ GeV per elementary collision. Any ‘new’ physics should indicate its presence in the failure of the hadronic transport approach.
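The time dilation of the formation time described in this section amounts to multiplying $t_F$ by the Lorentz factor of the produced hadron. A minimal sketch follows; the function names and the default $t_F = 0.8$ fm/c (the upper end of the range quoted above) are illustrative choices, not the transport code's interface.

```python
import math


def lorentz_gamma(p, m):
    """Lorentz gamma factor E/m for momentum p and mass m (same units, c = 1)."""
    return math.sqrt(p * p + m * m) / m


def formation_time(p, m, t_f=0.8):
    """Formation time in the computational frame, t_F * gamma, in fm/c.

    t_f is the formation time for a hadron formed at rest (assumed default).
    """
    return t_f * lorentz_gamma(p, m)


if __name__ == "__main__":
    # A pion (mass about 0.138 GeV, assumed) with 1 GeV/c momentum.
    print(formation_time(1.0, 0.138))
```

A light hadron with large momentum is therefore formed much later in the computational frame than a slow heavy one, which is what makes the rapidity distributions sensitive to $t_F$.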
Fig. 3.15. The proton rapidity distribution for central collisions of Ca + Ca at 30A GeV using $t_F = 0.5$, 0.6, 0.8, 1.0 fm/c, respectively. The figure is taken from Ref. [25].
4. Heavy-ion reaction dynamics

The relativistic transport approach HSD outlined in Section 3 is now applied to hadron-nucleus and nucleus-nucleus collisions from the SIS to the SPS energy regime (1-200A GeV) with particular emphasis on rapidity distributions and particle spectra to control the stopping and energy-momentum transfer achieved in these reactions. The numerical implementation of the self-energies and collision rates is performed in close analogy to Refs. [16,102,124,125]; modifications of the latter prescriptions will be indicated explicitly.

4.1. Protons, pions and $\eta$’s at SIS energies

The SIS energy regime up to 2A GeV allows for the study of nuclear densities up to $\approx 3\rho_0$ and temperatures (for heavy systems) up to $T \approx 90$ MeV. At these energies the stopping of baryons is
Fig. 4.1. The calculated rapidity distribution of protons (solid histogram) for central collisions of Ni + Ni at 1.93A GeV ($b \le 3$ fm) in comparison to the experimental data from the FOPI Collaboration [126]. The dashed line corresponds to an isotropic thermal model scenario normalized at midrapidity. The calculations have been scaled by a factor 0.78 since deuterons have been measured separately in the experiment but are included in the calculated spectrum.
quite pronounced, as can be seen from Fig. 4.1, where we show the calculated rapidity distribution of protons (solid histogram) for central collisions of Ni + Ni at 1.93A GeV ($b \le 3$ fm) in comparison to the experimental data from the FOPI Collaboration [126]. The proton rapidity distribution clearly peaks at midrapidity but is somewhat broader than expected for a complete thermalization of the system [126] (dashed line; normalized at $y = 0$). Furthermore, nucleons show a sizeable amount of radial flow as well as directed flow [126], the latter being expressed as the derivative of the average momentum per nucleon in the reaction plane versus the rapidity in the cms,
dP /A(y ) . F" V dy W
(4.1)
This quantity shows a decent sensitivity to the momentum and density dependence of the nucleon self-energies $U_S$ and $U^\mu_N$ in Eq. (3.1) [7,8]. Recent calculations by Sahu et al. [127] indicate that the vector potential $U^\mu_N$ has to become softer at high relative momenta and density to reproduce the dropping of the flow for Ni + Ni and Au + Au reactions above 1A GeV. We do not pursue this question further here because we are primarily interested in in-medium effects for the mesons.

As shown in Ref. [107] the pion density may reach up to 30% of the baryon density at 2A GeV, which implies that up to 30% of the nucleons are excited to the $\Delta$-resonance in the compressed phase. Whereas energetic pions are emitted rather early from the compressed phase, low energy
Fig. 4.2. Calculated transverse $\pi^0$-spectra for Ar + Ca at 1.5A GeV (full line) in comparison to the experimental data from Ref. [134]. The figure is taken from Ref. [25].

Fig. 4.3. Inclusive $\pi^+$ spectra from Ni + Ni collisions at 1.0 and 1.8A GeV and Au + Au at 1.0A GeV in comparison to the experimental data from Ref. [135] at $\theta_{\rm lab} = 44° \pm 4°$. The figure is taken from Ref. [177].
pions, which are more sensitive to in-medium effects, only freeze out during the late expansion phase at low nuclear density [128-130]. Thus pion self-energies are hard to determine from pion spectra in nucleus-nucleus collisions. In fact, the low-$p_T$ enhancement seen experimentally (especially in heavy systems) is interpreted differently in the literature [129,131,132]. Since we consider the pion as a Goldstone boson not to be modified substantially in the nuclear medium, we treat pions as free particles throughout this work; deviations between our calculations and experimental data may then indicate the necessity to introduce pion polarizations [133].

As a first example we show in Fig. 4.2 the transverse $\pi^0$-spectra from Ar + Ca collisions at 1.5A GeV in comparison to the data of Berg et al. [134] as a characteristic system at SIS energies. Since at these energies the HSD approach is close to the results achieved with the former BUU model [131], the reproduction of the data is of similar quality. Apparently, there are no pronounced deviations between the calculation and the data, such that for this system there is no indication for pion self-energies.

We continue with a comparison of our calculations for pion production at SIS energies with the available $\pi^+$ data for Ni + Ni at 1.0 and 1.8A GeV and Au + Au at 1.0A GeV in Fig. 4.3. The experimental spectra of the KaoS Collaboration at $\theta_{\rm lab} = 44° \pm 4°$ [135] are described reasonably well in the whole kinematical range not only for Ni + Ni, but also for Au + Au. We slightly underestimate the pion spectrum for Ni + Ni at 1.8A GeV at low momenta, which might reflect limitations of our configuration space showing up at higher bombarding energy.

The transverse-mass spectra of $\pi^0$ and $\eta$ mesons in heavy-ion collisions at SIS energies were measured by the TAPS Collaboration [136-138] and an $m_T$ scaling has been observed for both
Fig. 4.4. The calculated transverse-mass spectra of $\pi^0$ and $\eta$ mesons in comparison with the TAPS data [137,138]. The upper part shows the $m_T$ spectra for $\pi^0$’s (dashed histogram) and $\eta$’s (solid histogram) for C + C at 1.0A GeV in the rapidity interval $0.42 \le y \le 0.74$ and at 2.0A GeV for $0.8 \le y \le 1.08$. The theoretical results as well as the experimental data at 2.0A GeV are multiplied by a factor of 10. The middle part corresponds to Ca + Ca at 1.0A GeV for $0.48 \le y \le 0.88$ (multiplied by $10^{-\ldots}$) and at 2.0A GeV for $0.8 \le y \le 1.1$. The lower part shows the calculated $m_T$ spectra for Ni + Ni at 1.93A GeV for rapidities $0.8 \le y \le 1.1$ in comparison with the data from Ref. [138]. The figure is taken from Ref. [140].
mesons. Such a universal property of the meson spectra at SIS energies has been predicted by the Quark Gluon String Model calculations in Ref. [139] for the Ar+Ca system. We find the same scaling in our transport calculations when no η-meson self-energies are included. In Fig. 4.4 we compare the results of our calculation [140] for the inclusive transverse-mass spectra of π⁰ and η mesons with the TAPS data. The upper part shows the m_T spectra for π⁰'s (dashed histogram) and η's (solid histogram) for C+C at 1.0A GeV in the rapidity interval 0.42 ≤ y ≤ 0.74 and at 2.0A GeV for 0.8 ≤ y ≤ 1.08. The experimental data (open circles for π⁰ mesons, solid squares for η mesons) are taken from Ref. [137]. The theoretical results as well as the experimental data at 2.0A GeV are multiplied here by a factor of 10. The middle part corresponds to Ca+Ca at 1.0A GeV for 0.48 ≤ y ≤ 0.88 (multiplied by a negative power of 10) and at 2.0A GeV for 0.8 ≤ y ≤ 1.1 in comparison with the data from Ref. [138]. The lower part shows the calculated m_T spectra for Ni+Ni at 1.93A GeV for 0.8 ≤ y ≤ 1.1 in comparison with the data from Ref. [138].
As seen from Fig. 4.4, the HSD transport model gives a reasonable description of the m_T spectra of pions and η's as measured by the TAPS Collaboration without incorporating any medium modifications for either meson. It is important to point out that these calculations are parameter-free in the sense that all production cross sections for η mesons are extracted from experimental data in the vacuum and the η-nucleon elastic and inelastic cross sections are obtained by using detailed balance on the basis of an intermediate N(1535) resonance. Though we do not find an unambiguous indication of π or η self-energies, we note that the ‘low energy dynamics’ involving essentially nucleons, Δ's, N(1440), N(1535), pions and η's is reasonably well described by the transport calculations. This will become important again for the analysis of dilepton production, since the low-mass dilepton pairs dominantly stem from the Dalitz decay of π⁰'s and η's in the vacuum, which can be used as an independent normalization of the dilepton spectra (cf. Section 6).

4.2. Protons and pions at AGS energies

At AGS energies (≤ 15A GeV) the initial nucleon-nucleon collisions occur at √s ≈ 5 GeV and the Lorentz contraction of the nuclear density in the nucleon-nucleon cms amounts to γ_cm ≈ 3. Thus most of the mesons produced in p+Be reactions hadronize (after their formation time t_F γ) in the vacuum without rescattering, so that this light system may serve as a test for the LUND string model employed at √s ≈ 5 GeV. In this respect we show in Fig. 4.5 the inclusive proton and π⁻ rapidity spectra for p+Be (l.h.s.) and p+Au (r.h.s.) at 14.6A GeV in comparison to the data from the E802 Collaboration [141]. The approximate symmetry of the π⁻ rapidity distribution around midrapidity (y = 0) for p+Be indicates very little rescattering of the pions. The proton distribution is also rather well reproduced by the calculation for t_F = 0.7 fm/c, which is the ‘default’ value for the formation time.
The effect of pion rescattering on nucleons in p+Au collisions at 14.6 GeV can be extracted from the r.h.s. of Fig. 4.5, where the pion rapidity distribution is no longer symmetric around y = 0 but sizeably enhanced at target rapidity (y ≈ −2). The stopping of protons is also clearly visible in the proton rapidity distribution (upper parts of Fig. 4.5), both in the calculations and in the data of the E802 Collaboration [141]. We will come back to these systems in Section 5.2 when studying the production of strangeness, where secondary pion-nucleon collisions will play a specific role.

The next system addressed is Si+Al at 14.6A GeV. The computed rapidity distribution of protons and π⁺ mesons for b = 1.5 fm is compared in Fig. 4.6 to the data from Refs. [142,89]. Whereas the proton rapidity distribution turns out to be quite flat in rapidity y due to proton rescattering, the pion rapidity distribution is essentially of Gaussian shape, which reflects the pion rapidity spectrum from the string fragmentation model outlined in Section 3.4 (cf. Figs. 3.6 and 3.7). We note that, similar to SIS energies [7,8], the proton rapidity distribution is rather insensitive to variations of the nucleon scalar and vector mean fields within the numerical accuracy.

In analogy to Fig. 4.1 we show in Fig. 4.7 the calculated transverse-mass spectra of π⁺ mesons for Si+Al at 14.6A GeV (solid lines) in comparison to the experimental data from Ref. [142]. The overall agreement for laboratory rapidities of y = 0.9, 1.7 and 2.7 seems to indicate that the general reaction dynamics for pions is rather well reproduced within the HSD approach, although the π⁺ rapidity spectrum is slightly narrower in experiment than in the calculations.
Fig. 4.5. The inclusive proton and π⁻ rapidity spectra for p+Be (l.h.s.) and p+Au (r.h.s.) at 14.6A GeV in comparison to the data from the E802 Collaboration [141].
The flat proton rapidity spectrum in Fig. 4.6 might lead to the interpretation that there is already a substantial amount of stopping in the light system Si+Al. This, however, has to be taken with care because the actual snapshots of the baryon density distribution from our computations shown in Fig. 4.8 (l.h.s.) as well as the phase-space distribution (r.h.s.)

$$ f(z, p_z; t) = (2\pi)^{-2} \sum_i \int d^2 r_\perp \, d^2 p_\perp \, f_i(\mathbf{r}_\perp, z, \mathbf{p}_\perp, p_z; t) , \tag{4.2} $$

where $\sum_i$ denotes a sum over all baryon species, indicate a dominant transparency for the light system. This is essentially due to the large surface of the two light nuclei with a nucleon-nucleon collision probability less than 1. Furthermore, the time evolution in momentum space (middle column) shows that the system is far from kinetic equilibrium in the baryon degrees of freedom in the final state. Nucleon stopping becomes more pronounced for the system Si+Au at 14.6A GeV as seen from Fig. 4.9, where the calculated proton and π⁻ rapidity distributions (for b ≤ 2 fm) are compared to
Fig. 4.6. The proton and π⁻ rapidity spectra for Si+Al at 14.6A GeV for b = 1.5 fm in comparison to the data from [89,142].
the data from E802 [142] (full squares) and E810 [143] (full triangles). Whereas the proton stopping is reasonably well reproduced by the calculation, the pion spectra again clearly come out too broad in rapidity as compared to the experimental data.

The amount of stopping at AGS energies is most pronounced for central Au+Au reactions, as displayed in Fig. 4.10 for the proton and π⁻ rapidity distributions in comparison to the experimental data from Refs. [144,145]. Though the pion rapidity spectrum, which again comes out slightly too broad in the calculation, does not at first sight differ very much in shape from that of the Si+Al system in Fig. 4.7, the time evolution of the baryon distribution in coordinate space, momentum space and phase space (Fig. 4.11) for Au+Au at 14.6A GeV shows a clear approach towards equilibration. However, the coordinate space evolution indicates a dominant longitudinal expansion, which is also reflected in the baryon momentum distribution that does not show full isotropy. We note that the proton rapidity spectrum for central Au+Au collisions at this energy shows a similar amount of stopping as the RQMD approach [123], the ART calculations by Li and Ko [24] or the ARC calculations by Kahana et al. [23].

4.3. Protons and pions at SPS energies

We continue our comparison to experimental data with the system S+S at 200A GeV, i.e. the SPS regime. In Fig. 4.12 we show the proton and negative hadron (essentially π⁻) rapidity distributions in comparison to the experimental data from [146,147]. Though the experimental proton and h⁻ rapidity spectra are approximately reproduced, we cannot conclude on the general
Fig. 4.7. Comparison of the calculated transverse-mass spectra of π⁺ mesons for Si+Al at 14.6A GeV with the experimental data from Ref. [142] for rapidities y = 0.9, 1.7, 2.7 in the laboratory system. The figure is taken from Ref. [25].
applicability of our approach at SPS energies, because simpler models like HIJING or VENUS [148], with less rescattering, can also reproduce the data in a similar way [149]. This is due to the fact that at 200A GeV the Lorentz contraction in the cms amounts to γ_cm ≈ 10, such that hadronization again occurs essentially in the vacuum and little rescattering occurs in S+S collisions.

The situation changes for central S+Au reactions at 200A GeV, as shown in Fig. 4.13 (taken from Ref. [248]) for the π⁻ and proton rapidity distributions in comparison to the data from [150,151]. Here the proton rapidity spectrum shows a narrow peak at target rapidity (y ≈ −3.03), which is easily attributed to the spectators from the Au target. The bump at y ≈ −2 is mainly due to rescattering of target nucleons. Note that there is no longer any yield at projectile rapidity (y ≈ 3.03), which implies that all nucleons from the projectile have undergone inelastic scatterings. Furthermore, around midrapidity the π⁻ distribution is large compared to the proton distribution.

The transverse momentum spectra of neutral pions for central S+Au reactions at 200A GeV are shown in Fig. 4.14 (from Ref. [152]) in comparison to the data in the rapidity interval 2.1 ≤ y ≤ 2.9 from [153]. The agreement between the data and the HSD calculations is sufficiently good that the baryon and pion dynamics for the system S+Au, which we will address again in the context of dilepton production in Section 6.2, is reasonably well under control.

Baryon stopping is most clearly seen for the system Pb+Pb at 160A GeV. In Fig. 4.15 we show the proton and h⁻ rapidity distributions in comparison to the data from NA49 [154]. Our computed proton rapidity spectrum for central collisions (b ≤ 2.5 fm) shows only a slight dip at midrapidity, which is much more pronounced in HIJING or VENUS simulations [149], but no peak at midrapidity as compared to RQMD simulations [20].
Thus full stopping is not achieved at SPS energies even for this heavy system. On the other hand, the h⁻ rapidity distributions are very similar to the S+S case, though enhanced by about a factor of 6.5 = 208/32.
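The kinematic numbers quoted in Sections 4.2 and 4.3 (√s, the Lorentz factor in the cms and the beam rapidity) can be cross-checked with a few lines of standard two-body kinematics. The following is only a sketch under stated assumptions: the function name `cms_kinematics`, the nucleon mass value and the reading of "A GeV" as kinetic energy per nucleon on a nucleon at rest are illustrative choices, not taken from the transport code.

```python
import math

M_N = 0.938  # nucleon mass in GeV (assumed round value)


def cms_kinematics(t_lab):
    """Return (sqrt_s, gamma_cm, y_beam) for a nucleon-nucleon collision.

    t_lab is the kinetic energy per nucleon (GeV) of the beam nucleon;
    the target nucleon is taken at rest and Fermi motion is ignored.
    """
    e_lab = t_lab + M_N                    # total energy of the beam nucleon
    s = 2.0 * M_N**2 + 2.0 * M_N * e_lab   # invariant energy squared
    sqrt_s = math.sqrt(s)
    gamma_cm = sqrt_s / (2.0 * M_N)        # Lorentz factor of each nucleon in the cms
    y_beam = math.acosh(gamma_cm)          # beam rapidity in the cms (target: -y_beam)
    return sqrt_s, gamma_cm, y_beam


for label, t_lab in [("AGS, 14.6A GeV", 14.6), ("SPS, 200A GeV", 200.0)]:
    sqrt_s, gamma_cm, y_beam = cms_kinematics(t_lab)
    print(f"{label}: sqrt(s) = {sqrt_s:.2f} GeV, "
          f"gamma_cm = {gamma_cm:.2f}, y_beam = {y_beam:.2f}")
```

Under these assumptions the Lorentz factor comes out close to 3 at 14.6A GeV and close to 10 at 200A GeV, with a beam rapidity near 3.03 in the latter case, in line with the values used in the text.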
Fig. 4.8. Baryon density distribution (left column), momentum space (middle column) and phase-space distribution (right column) for a 14.6A GeV Si+Al collision at b = 1 fm for various times in fm/c. The figure is taken from Ref. [25].
Since the system Pb+Pb at 160A GeV is explored experimentally at the SPS in great detail, it is advantageous to have a look at the space-time evolution of the baryon one-body density for central collisions. In this respect we show in Fig. 4.16 the space-time evolution of baryons for this system at b = 0 fm: (l.h.s.) contour plot of the baryon density distribution in coordinate space ρ(x, y = 0, z; t), (middle column) contour plot of the baryon momentum distribution ρ(p_x, p_y = 0, p_z; t), (r.h.s.) the phase-space distribution f(z, p_z; t) (4.2). In the time evolution of the
Fig. 4.9. The proton and π⁻ rapidity spectra for Si+Au at 14.6A GeV (for b ≤ 2 fm) in comparison to the data from E802 [142] (full squares) and E810 [143] (full triangles).
density distribution ρ(x, y = 0, z; t) we explicitly mention the short phase of high baryon density from about 5–8 fm/c as well as the sizeable fraction of ‘spectators’ from the nuclear corona. The time evolution in momentum space (middle column) shows that the system reaches its final distribution within a few fm/c but is far from kinetic equilibrium in the baryon degrees of freedom, which would be reflected by a spherical distribution here. It clearly indicates a dominant longitudinal expansion of the system, which is much more pronounced than at AGS energies (cf. Fig. 4.11).

4.4. Optimizing for high baryon density

In order to probe the restoration of chiral symmetry at high baryon density in nucleus-nucleus collisions, one has to perform experiments with heavy nuclei (e.g. Pb+Pb) and optimize the beam energy to achieve a large volume of high baryon density for a sufficiently long time. In this respect central collisions of Pb+Pb have been investigated within the HSD transport approach, and the ‘stopped’ baryon density ρ_s(t), including only baryons with rapidity |y| ≤ 0.7 in the cms, has been computed in a central cylinder of volume V = πR²Δ_z/γ with Δ_z = R = 4 fm, where γ is the Lorentz factor in the nucleus-nucleus center-of-mass system. Since we are interested in high baryon densities above some value ρ_min for long times, we consider the quantity

$$ F = \int dt \, \frac{\rho_s(t) - \rho_{\min}}{\rho_0} \, \Theta\big(\rho_s(t) - \rho_{\min}\big) \quad [\mathrm{fm}/c] , \tag{4.3} $$
Fig. 4.10. The proton and π⁻ rapidity distributions for Au+Au at 11A GeV (for b ≤ 3 fm) in comparison to the experimental data from Refs. [144,145].
which should serve as a useful guide in the optimization problem. The quantity F (4.3) is displayed in Fig. 4.17 (from Ref. [155]) for central collisions of Pb+Pb from 1–200A GeV for different values of ρ_min from 2ρ_0 to 5ρ_0 (ρ_0 ≈ 0.168 fm⁻³). Accordingly, optimal bombarding energies for baryon densities above 4ρ_0 should be around 20–30A GeV in order to explore the properties of an intermediate phase, where chiral symmetry might approximately be restored and the hadron masses (except for the Goldstone bosons) might be close to their current quark masses m_q + m_q̄. However, lower bombarding energies (2–10A GeV) are also seen to qualify for studies of a partial restoration of chiral symmetry, since sizeable space-time volumes with baryon densities above 2–3ρ_0 can be achieved.

Summarizing this Section, the proton and pion rapidity distributions and transverse pion (and η) spectra as calculated within the HSD transport approach agree reasonably well with the data for the systems studied experimentally at SIS, AGS and SPS energies, such that we can proceed with more detailed investigations of the in-medium properties of kaons, antikaons, antibaryons and vector mesons.
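As an illustration of how the quantity F of Eq. (4.3) could be evaluated from a sampled density history, the following sketch integrates the excess of the stopped baryon density over a threshold with the trapezoidal rule. It is not the authors' implementation: the function names, the trapezoidal rule, the sample arrays and the normalization of the integrand by ρ_0 are assumptions made for this example, and the threshold is called `rho_min` here purely for illustration.

```python
import math

RHO_0 = 0.168  # normal nuclear matter density in fm^-3, as quoted in the text


def cylinder_volume(radius=4.0, delta_z=4.0, gamma=1.0):
    """Lorentz-contracted cylinder volume V = pi R^2 Delta_z / gamma in fm^3."""
    return math.pi * radius**2 * delta_z / gamma


def density_time_integral(times, rho_s, rho_min, rho_0=RHO_0):
    """Trapezoidal estimate of F in fm/c.

    times   : sampling times in fm/c (increasing)
    rho_s   : 'stopped' baryon density at those times in fm^-3
    rho_min : density threshold in fm^-3 (hypothetical name)
    Only the part of rho_s above rho_min contributes (step function).
    """
    total = 0.0
    for k in range(len(times) - 1):
        g_left = max(rho_s[k] - rho_min, 0.0) / rho_0
        g_right = max(rho_s[k + 1] - rho_min, 0.0) / rho_0
        total += 0.5 * (g_left + g_right) * (times[k + 1] - times[k])
    return total


# Toy density history (made-up numbers, only to exercise the function).
times = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
rho_s = [0.0, 0.4, 0.9, 0.7, 0.3, 0.1]
for n in (2, 3, 4, 5):
    f_value = density_time_integral(times, rho_s, n * RHO_0)
    print(f"threshold {n} rho_0: F = {f_value:.2f} fm/c")
```

Clipping the integrand at the sampling points slightly misestimates the contribution of intervals in which the density crosses the threshold; a finer time grid reduces this error.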
Fig. 4.11. Baryon density distribution (left column), momentum space (middle column) and phase-space distribution (right column) for a 14.6A GeV Au+Au collision at b = 0 fm for various times in fm/c. The figure is taken from Ref. [25].
5. K⁺, K⁻ and p̄ production

The production of particles, especially at ‘subthreshold’ energies, is expected to provide valuable information about the properties of hadrons at high baryon density and temperature [7]. Their relative abundance and spectra should reflect the in-medium properties of the particles produced
Fig. 4.12. The calculated proton and negative hadron (essentially π⁻) rapidity distributions for S+S at 200A GeV in comparison to the experimental data from [146,147].
Fig. 4.13. Comparison of our calculations for the proton (dashed line) and π⁻ rapidity distribution (solid line) with the experimental data [150,151] for S+Au at 200A GeV. The figure is taken from Ref. [248].
since for a ‘dropping’ mass, i.e. a reduced quasiparticle energy in the medium,

$$ m^* = \omega(p = 0) = \sqrt{m^2 + \Pi_h(\rho_B, \rho_S, p = 0)} , \tag{5.1} $$

where Π_h(ρ_B, ρ_S, …) denotes the meson self-energy as a function of the baryon density ρ_B and scalar density ρ_S, the particle can be created more abundantly. On the other hand, it will be suppressed in case of repulsive potentials. Furthermore, a quasiparticle feeling an attractive potential at finite baryon density will be decelerated during its propagation out of the medium, and thus asymptotically its momentum spectrum will be enhanced at low relative momenta with respect to the baryon matter rest frame. The opposite holds in case of repulsive potentials.

According to Section 3, self-energy effects are expected for practically all hadrons in the medium, though of different strength and sign. Antikaons (according to Section 3.2.1) should feel strong attractive forces at finite baryon density, whereas the kaon potential is expected to be slightly repulsive (cf. Fig. 3.4). Due to the attractive scalar and repulsive vector interactions of the nucleon, antinucleons should feel an even more attractive potential at finite density, since the strong vector interaction will be attractive in their case, too.

The first dynamical studies of antiproton and antikaon production in nucleus-nucleus collisions were performed more than a decade ago without including any self-energies for the
Fig. 4.14. The calculated invariant cross section (open circles) for π⁰ mesons in the rapidity interval 2.1 ≤ y ≤ 2.9 as a function of the transverse pion mass for S+Au at 200A GeV in comparison with the experimental data (full circles) of the WA80 Collaboration [153]. The figure is taken from Ref. [152].
Fig. 4.15. The proton and h⁻ rapidity distributions from the HSD calculations (for b ≤ 2.5 fm) in comparison to the data from NA49 [154].
particles [110,116,156,157]. The first studies of p̄ production including scalar and vector self-energies for antiprotons were reported in Ref. [158], showing clear evidence for attractive p̄ self-energies. On the other hand, Li et al. [109] performed first explorative calculations for antikaons which indicated that sizeable attractive K⁻ potentials might be needed to explain the experimental spectra from [159] for Ni+Ni at 1.85A GeV. Indeed, their findings could be substantiated in Ref. [160] in a systematic analysis of antikaon production in nucleus-nucleus collisions at SIS energies. Independent studies by Li et al. [96] have since confirmed the findings of Ref. [160] to a large extent.

The most serious problem related to K⁺, K⁻ production at ‘subthreshold’ energies is posed by the baryon-baryon and pion-baryon elementary production cross sections close to threshold, where only limited experimental data are available so far and experimental information on reactions involving resonances is entirely lacking. In fact, earlier extrapolations from high energy data [108] used in [109] overestimate the elementary K⁻ yield by more than an order of magnitude close to threshold [105,106,161]. Meanwhile, detailed calculations within boson exchange models have been carried out for the near-threshold production cross section of antiprotons [114], vector mesons (ρ, ω, φ) [117] as well as antikaons [106] from nucleon-nucleon and pion-nucleon reactions (cf. also Ref. [162] in case of K⁺ mesons). Since within the boson exchange model one can interpolate between different isospin channels and thus compare to a much larger set of experimental data, the results of these studies should be more reliable than the early parametrizations
Fig. 4.16. Baryon density distribution (left column), momentum space (middle column) and phase-space distribution (right column) for a 160A GeV Pb+Pb collision at b = 0 fm for various times in fm/c. The figure is taken from Ref. [155].
Fig. 4.17. The quantity F (4.3) for central collisions of Pb+Pb from 1–200A GeV as a function of the bombarding energy per nucleon for 4 different cuts in ρ_min. The figure is taken from Ref. [155].
[108,163,164] that are partly in severe contradiction to the available phase space. A variety of strangeness production cross sections, as implemented in the HSD approach, have been presented in Sections 3.4, 3.5 and 3.6; for a general presentation of most of the input channels we also refer the reader to the recent work of Li et al. [96].

Since the real part of the actual K⁺ and K⁻ self-energy is still a matter of debate (cf. Fig. 3.4), we adopt a more practical point of view and, as a guide for our analysis, use a linear extrapolation of the form

$$ m_K^*(\rho_B) = m_K \left( 1 - \alpha_K \frac{\rho_B}{\rho_0} \right) , \tag{5.2} $$

with α_K̄ ≈ 0.2–0.25 for antikaons and α_K ≈ −0.06 for kaons. Alternative fits to the antikaon self-energies lead to different values for the parameter α_K̄ in the range 0.1 ≤ α_K̄ ≤ 0.3 (cf. Fig. 3.4). The choice α_K̄ ≈ 0.2 leads to a fairly reasonable reproduction of the antikaon mass from Refs. [95,97,165] and the results from Waas et al. [98] (cf. Fig. 5.1), whereas α_K ≈ −0.06 corresponds to an isospin-averaged kaon-nucleon scattering length ā_KN ≈ −0.255 fm from Ref. [166] within the impulse approximation [167]. In Eq. (5.2) we have neglected a momentum dependence of the kaon or antikaon potential for reasons of numerical simplicity. The dispersion analysis of Sibirtsev et al. [168] shows that this is roughly fulfilled for the kaon potential, whereas the antikaon potential should be strongly momentum dependent. We furthermore note that the dropping of the antikaon mass is associated with a corresponding scalar energy density in the baryon/meson Lagrangian, such that the total energy-momentum is conserved within the transport approach during the heavy-ion collision or proton-nucleus reaction.
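The linear extrapolation of Eq. (5.2) is simple enough to tabulate directly. The sketch below evaluates the effective mass for the two parameter choices discussed above; the function name `effective_kaon_mass` and the vacuum kaon mass of 0.494 GeV are assumptions of this example, and the density enters only through the ratio ρ_B/ρ_0.

```python
M_KAON = 0.494  # vacuum kaon mass in GeV (assumed round value)


def effective_kaon_mass(rho_ratio, alpha, m_vacuum=M_KAON):
    """Linear in-medium mass of Eq. (5.2); rho_ratio is rho_B / rho_0."""
    return m_vacuum * (1.0 - alpha * rho_ratio)


# alpha = -0.06 (kaons, slightly repulsive) and alpha = 0.2 (antikaons, attractive)
for name, alpha in [("kaon", -0.06), ("antikaon", 0.2)]:
    for rho_ratio in (1.0, 2.0, 3.0):
        shift_mev = 1000.0 * (effective_kaon_mass(rho_ratio, alpha) - M_KAON)
        print(f"{name:8s} rho_B = {rho_ratio:.0f} rho_0: mass shift = {shift_mev:+.1f} MeV")
```

With α_K = −0.06 the kaon mass rises by roughly 30 MeV at ρ_0, which is the size of the repulsive shift referred to in the discussion of the K⁺ spectra.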
5.1. SIS energies

The calculation of ‘subthreshold’ particle production is described in detail in Refs. [7,6] and has to be treated perturbatively at SIS energies (≤ 2A GeV) due to the small cross sections involved.
Fig. 5.1. The antikaon and kaon masses as a function of the baryon density in units of ρ_0 ≈ 0.16 fm⁻³ according to Kaplan and Nelson [95,97] (dashed lines) and Waas et al. [98] (thick solid lines). The thin solid lines (with α_K̄ = 0.2 and α_K = −0.06) present the linear fits to the K⁻ and K⁺ effective masses which are used in the actual transport calculations.
Since we work within the parallel ensemble algorithm, each parallel run of the transport calculation can be considered approximately as an individual reaction event, where binary reactions in the entrance channel at given invariant energy √s lead to final states with 2 (e.g. K⁺Y in πB channels), 3 (e.g. K⁺YN channels in BB collisions) or 4 particles (e.g. K K̄ NN and p p̄ NN in BB collisions) with a relative weight W_i for each event i, which is defined by the ratio of the production cross section to the total hadron-hadron cross section. The perturbative treatment now implies that in case of strangeness or antibaryon production channels the initial hadrons are not modified in the respective final channel. On the other hand, each strange hadron is represented by a test particle with weight W_i and propagated according to the Hamilton equations of motion. Elastic and inelastic reactions with pions, η's or nonstrange baryons are computed in the standard way [7,102] and the final cross section is obtained by multiplying each test particle with its weight W_i. In this way one achieves a realistic simulation of the strangeness or antibaryon production, propagation and reabsorption during the heavy-ion collision, where only the dynamical feedback of the ‘perturbative’ hadrons to the nonstrange mesons and baryons is neglected.

5.1.1. Kaon production

Before going over to K⁺ production in nucleus-nucleus collisions we present the calculated differential K⁺ spectra for p+Pb at 2.1 GeV in comparison to the data from Schnetzer et al. [169] in Fig. 5.2 for laboratory angles of 15°, 35°, 60°, and 80°. Fig. 5.3 furthermore shows the kaon spectra for p+C and p+Pb at 1.5 GeV in comparison to the data of the KaoS Collaboration [172] at θ_lab = 40°. In these calculations we have used α_K = 0, i.e. neglected a kaon potential.
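The weighting used in the perturbative treatment of rare production channels can be illustrated schematically: every binary collision above threshold contributes a test particle whose weight is the ratio of the production to the total cross section, and observables are accumulated as sums of weights. The sketch below is a toy bookkeeping example only; the function names, the input format and the normalization per parallel run are assumptions and do not reproduce the actual transport code.

```python
def production_weight(sigma_production, sigma_total):
    """Weight W_i = sigma_production / sigma_total for one binary collision."""
    if sigma_total <= 0.0:
        raise ValueError("total cross section must be positive")
    return sigma_production / sigma_total


def perturbative_multiplicity(collisions, n_parallel_runs):
    """Average multiplicity per event from weighted test particles.

    collisions      : list of (sigma_production, sigma_total) pairs in mb,
                      one per binary collision above threshold (toy input)
    n_parallel_runs : number of parallel ensembles, each treated as one event
    """
    weights = [production_weight(sp, st) for sp, st in collisions]
    return sum(weights) / n_parallel_runs


# Two collisions with made-up cross sections, collected from two parallel runs.
toy_collisions = [(0.01, 40.0), (0.02, 40.0)]
print(perturbative_multiplicity(toy_collisions, n_parallel_runs=2))
```

Because the weights are small, the colliding hadrons are left unchanged in this scheme, which is what allows rare channels to be sampled in every collision rather than once in many thousand events.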
Since the data are described quite well in this limit within the error bars, we cannot extract further information on the kaon potential from these reactions because the kaons are produced at densities
The actual final states are chosen by Monte Carlo according to the 2-, 3-, or 4-body phase space.
Fig. 5.2. The calculated differential K⁺ spectra for p+Pb at 2.1 GeV in comparison to the data from Schnetzer et al. [169] for laboratory angles of 15°, 35°, 60°, and 80°. The figure is taken from Ref. [170].
Fig. 5.3. The calculated kaon spectra for p+C and p+Pb at 1.5 GeV in comparison to the data of the KaoS Collaboration [172] at θ_lab = 40°. The figure is taken from Ref. [171].
ρ_N ≤ ρ_0 in this case. Thus for kaons much higher production densities will be needed to obtain further information on their potential. We consequently continue with nucleus-nucleus collisions.

The Lorentz-invariant K⁺ spectra for Ni+Ni at 0.8, 1.0 and 1.8A GeV are shown in Fig. 5.4 (l.h.s.) in comparison to the data from the KaoS Collaboration [173]. Here the full lines reflect calculations including only bare K⁺ masses (α_K = 0), while the dashed lines correspond to calculations with α_K = −0.06 in Eq. (5.2), which leads to an increase of the kaon mass at ρ_0 by about 30 MeV. The general tendency seen at all bombarding energies is that our calculations with a bare kaon mass seem to provide a better description of the experimental data for Ni+Ni than those with an enhanced kaon mass.

The latter tendency is confirmed by the independent calculations from Li et al. [96] for Ni+Ni at 1.8A GeV and 44° in the laboratory in comparison to the data of the KaoS Collaboration [173] (r.h.s. of Fig. 5.4). In their calculation the dotted histogram corresponds to α_K = 0, while the solid histogram reflects an in-medium calculation with α_K ≈ −0.05. As in our calculations, the slope of the kaon spectra increases for a repulsive kaon potential due to the acceleration of the kaons in the expansion phase by their repulsive interaction with baryons. Note that in the study by Li et al. [96] elementary K⁺ production cross sections from BB channels have been used which are about a factor of 2 larger than ours. Consequently these authors also end up with higher spectra in case of nucleus-nucleus collisions.

Further independent information stems from the FOPI Collaboration, which has measured the K⁺ spectra for Ni+Ni at 1.93A GeV at backward angles in the nucleus-nucleus cms. In Fig. 5.5 we compare our calculations for the K⁺ rapidity distribution versus the normalized
Fig. 5.4. L.h.s.: Inclusive K⁺ spectra from Ni+Ni collisions at 0.8, 1.0 and 1.8A GeV in comparison to the experimental data from Ref. [173] at θ_lab = 44° ± 4°. The solid lines represent calculations with bare K⁺ masses, while the dashed lines result for α_K = −0.06 in Eq. (5.2). The figure is taken from Ref. [177]. R.h.s.: Inclusive K⁺ spectra calculated by Li et al. [96] for Ni+Ni at 1.8A GeV and 44° in the laboratory in comparison to the data of the KaoS Collaboration [173]. The dotted histogram corresponds to α_K = 0, while the solid histogram reflects an in-medium calculation with α_K ≈ −0.05 in Eq. (5.2).
rapidity y_0 = y_cm/y_proj with their data [126] (which have been reflected around midrapidity) for the ‘bare’ (α_K = 0) and ‘in-medium’ mass (α_K = −0.06) scenarios. Again, the calculations describe the data much better for a bare kaon mass than for repulsive kaon potentials, as in case of the data from the KaoS Collaboration (cf. Fig. 5.4).

We furthermore compare our calculations for the heavier systems, i.e. Bi+Pb at 0.8A GeV and Au+Au at 1.0A GeV, with the respective experimental data from the KaoS Collaboration [173–175] in Fig. 5.6. The full dots represent the earlier data for Au+Au at 1.0A GeV [175], while the open circles result from a new measurement [173] of the system at the same bombarding energy. Both calculations (solid line: α_K = 0; dashed line: α_K = −0.06) are compatible with the data within the experimental uncertainties. Only the more recent data for Au+Au from [173] favor a slightly repulsive potential.

We note that with the elementary cross sections from Sections 3.4 and 3.5 the relative weights of the K⁺ production channels change considerably compared to earlier calculations [176], as can be seen from Fig. 5.7 for Au+Au at 1A GeV. Whereas the NN production channels (dashed line) are almost negligible [176], the dominant yield stems from πN reactions (dot-dot-dashed line), which surpass the NΔ channel (dot-dashed line) [177]. This dominance of the secondary pion-induced reaction channels for heavy systems has also been found by the Tübingen group in Ref. [178]. The calculation in Fig. 5.7 has been performed for a bare K⁺ mass; we note that the relative channel decomposition does not change very much when performing a calculation with a slightly repulsive kaon potential.
Fig. 5.5. The calculated K⁺ rapidity distribution versus the normalized rapidity y_0 = y_cm/y_proj in comparison with the FOPI data [126] (which have been reflected around midrapidity) for the ‘bare’ (solid line) and ‘in-medium’ mass (dashed line) scenarios.
Fig. 5.6. Inclusive K⁺ spectra from Bi+Pb collisions at 0.8A GeV and Au+Au at 1.0A GeV in comparison to the experimental data from Refs. [173–175] at θ_lab = 44° ± 4°. The solid lines represent calculations with ‘bare’ K⁺ masses, while the dashed lines result for α_K = −0.06 in Eq. (5.2). The figure is taken from Ref. [177].
Fig. 5.7. Inclusive K⁺ spectra from Au+Au collisions at 1.0A GeV (solid line) in comparison to the experimental data from Refs. [173,175] at θ_lab = 44° ± 4° for the ‘bare kaon mass’ scenario. The dashed line represents the contribution from NN collisions, while the dot-dashed and dot-dot-dashed lines show the contributions from ΔN and πN collisions, respectively. The figure is taken from Ref. [177].
Fig. 5.8. L.h.s.: Kaon flow in the reaction plane (⟨p_x⟩/m_K) as a function of the normalized rapidity y_0 = y_cm/y_proj for Ni+Ni at 1.93A GeV. We have gated on central collisions (b ≤ 4 fm) and applied a transverse momentum cut p_T ≥ 0.5 m_K as for the experimental data of the FOPI Collaboration [126] (full dots). The open dots are obtained by reflection at y = 0. The solid line and dashed line display the results of our transport calculations without (α_K = 0) and with a slightly repulsive kaon potential (α_K = −0.06), respectively. The figure is taken from Ref. [177]. R.h.s.: Kaon flow in the reaction plane (⟨p_x⟩) as a function of rapidity for Ni+Ni at 1.93A GeV from Ref. [181] without potentials (dotted line) and with potentials (solid line).
A further quantity of interest is the kaon flow in the reaction plane, which should show some sensitivity to the kaon potential in the nuclear medium, as put forward by Li, Ko and Brown [179–182], and which was investigated by the FOPI Collaboration [126,183]. Here, due to elastic scattering with nucleons, the kaons partly flow in the direction of the nucleons, thus showing a positive flow in case of no mean-field potentials [179]. With increasing repulsive kaon potential the positive flow will turn to zero and then become negative; experimental data on kaon flow are thus expected to discriminate further between the potentials seen in the medium.

In order to investigate this question we have performed detailed (and high statistics) calculations for K⁺ production in Ni+Ni reactions at 1.93A GeV as measured by the FOPI Collaboration [126]. In order to compare with their data we have included a transverse momentum cut p_T ≥ 0.5 m_K, where m_K is the kaon mass, and gated on central collisions with impact parameter b ≤ 4 fm as in Ref. [181]. The results of our calculations are displayed in Fig. 5.8 (l.h.s.) in terms of ⟨p_x⟩/m_K versus the normalized rapidity y_0. The calculations without any kaon potential (α_K = 0, full line) indeed show a positive flow as expected, which still appears to be compatible with the data within the error bars and is close to the respective calculations from Ref. [181] (r.h.s. of Fig. 5.8). On the other hand, when increasing the kaon potential (α_K = −0.06, dashed line) the kaon flow becomes slightly negative or comparable to zero, again almost quantitatively in line with the calculations from Ref. [181] (r.h.s. of Fig. 5.8) and in somewhat better agreement with the data. Thus in case of the flow observable a slightly repulsive kaon potential (≈ 20–30 MeV at ρ_0) is more favored by the present data of the FOPI Collaboration [126] for Ni+Ni at 1.93A GeV, whereas the K⁺ rapidity distributions show no indication of this (cf. Fig. 5.5).
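The flow observable discussed above, the mean in-plane transverse momentum per kaon mass as a function of normalized rapidity with a transverse momentum cut, can be sketched as a simple binned average. This is an illustrative example only: the function name `flow_profile`, the input format (p_x, p_y and cms rapidity per kaon, with the x axis along the reaction plane), the bin width and the kaon mass value are assumptions, and the test-particle weights of the perturbative scheme are omitted for brevity.

```python
import math
from collections import defaultdict

M_KAON = 0.494  # kaon mass in GeV (assumed round value)


def flow_profile(particles, y_proj, m_k=M_KAON, bin_width=0.2, pt_cut=0.5):
    """Return {bin centre in normalized rapidity: <p_x>/m_K}.

    particles : iterable of (p_x, p_y, y_cm) with momenta in GeV
    y_proj    : projectile rapidity in the cms, used for normalization
    pt_cut    : kaons with p_T < pt_cut * m_K are discarded
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for p_x, p_y, y_cm in particles:
        if math.hypot(p_x, p_y) < pt_cut * m_k:
            continue                       # transverse momentum cut
        index = math.floor((y_cm / y_proj) / bin_width)
        sums[index] += p_x / m_k
        counts[index] += 1
    return {(index + 0.5) * bin_width: sums[index] / counts[index]
            for index in sorted(counts)}


# Toy sample (made-up momenta): two forward kaons pass the cut, one fails it,
# and one backward kaon carries negative p_x.
sample = [(0.3, 0.0, 0.45), (0.5, 0.0, 0.45), (0.1, 0.0, 0.45), (-0.4, 0.0, -0.45)]
for centre, value in flow_profile(sample, y_proj=0.9).items():
    print(f"y_0 = {centre:+.1f}: <p_x>/m_K = {value:+.3f}")
```

A positive value at forward rapidity corresponds to kaons flowing with the nucleons; in the discussion above a repulsive potential pushes this value towards zero or to negative values.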
The deviations between the calculations and
W. Cassing, E.L. Bratkovskaya / Physics Reports 308 (1999) 65—233
Fig. 5.9. Preliminary proton (triangles) and $K^+$ flow (full dots) in the reaction plane ($\langle p_x \rangle/m_K$) as a function of the normalized rapidity $y$ (a ratio of two rapidities) for Ru+Ru at 1.69A GeV for central events ($b = 2$ fm) (l.h.s.) and for intermediate impact parameter ($b = 4.5$ fm) (r.h.s.) in comparison to the transport calculations (lower parts) for $\alpha_K = 0$ (bare mass) and $\alpha_K = -0.06$ (in-medium mass). The figures have been taken from [185].
also those of the Tübingen group [184] reflect the relative model uncertainties and the different numerical recipes that enter the transport calculations. More detailed constraints might be obtained from the system Ru+Ru at 1.69A GeV including different event classes. Here the preliminary transverse flow for kaons, $\langle p_x \rangle/m_K$ versus the normalized rapidity $y$, from the FOPI Collaboration [185] is shown in Fig. 5.9 for central events ($b = 2$ fm) (l.h.s.) and intermediate impact parameter ($b = 4.5$ fm) (r.h.s.), respectively. Whereas the protons (triangles) show a strong attractive flow $\langle p_x \rangle/m$ in the reaction plane versus the normalized rapidity $y$ for both event classes, the $K^+$ mesons (full dots) show almost no flow for central events (l.h.s. of Fig. 5.9) and a slightly repulsive flow for $b \approx 4.5$ fm (r.h.s.). This tendency is reproduced in the calculations only when a repulsive kaon potential (denoted as ‘in-medium mass’) is included, as shown in the lower part of Fig. 5.9 (taken from Ref. [185]). Unfortunately, the differential spectra for different event classes have not been evaluated by the FOPI Collaboration so far, so that more detailed conclusions have to be deferred to the future.

5.1.2. Antikaon production

We now turn to the production of antikaons and start our analysis with the system Ni+Ni at 1.85A GeV without including any self-energies. The inclusive Lorentz-invariant cross section for negative pions and antikaons in the nucleus-nucleus cms (for $\theta = 0°$) is shown in Fig. 5.10 (l.h.s.) by the solid lines in comparison to the data of Refs. [159] that were taken at 0° in the laboratory system and have been transformed to the nucleus-nucleus cms. We note that our calculations yield
Fig. 5.10. L.h.s.: The inclusive Lorentz-invariant cross section as a function of the meson momentum in the nucleus-nucleus cms for $\pi^-$ and $K^-$ mesons at $\theta = 0°$ for Ni+Ni at 1.85A GeV. The full dots and full squares represent the experimental data from Refs. [159,186]. In these calculations no meson self-energies have been taken into account. R.h.s.: The inclusive Lorentz-invariant cross section as a function of the kaon momentum in the nucleus-nucleus cms for $K^-$ mesons at $\theta = 0°$ for Ni+Ni at 1.85A GeV without including $K^-$ self-energies in comparison to the experimental data from [159,186] (full squares) and the data for Ni+Ni ($\theta = 44°$) at 1.8A GeV from [187] (open circles). The dashed line displays the cross section from baryon-baryon (BB) channels, the dotted line that from pion-baryon ($\pi B$) channels, while the dash-dotted line shows the contribution from $\pi$-hyperon collisions. The solid line represents the sum of all channels taken into account. The figures have been taken from Ref. [160].
an anisotropic $\pi^-$ angular distribution in this reference frame, in line with the analysis in Ref. [129]; however, the $K^-$ angular distribution is found to be isotropic within the numerical accuracy. Whereas the $\pi^-$ spectra in Fig. 5.10 are reasonably well reproduced, as discussed in Section 4.2, the antikaon spectra are underestimated by up to a factor of 5, especially at low momenta. This finding agrees qualitatively with that of Li et al. in Ref. [109], who also underestimated these antikaon data substantially when using a vacuum $K^-$ mass, even when adopting the parametrization of the elementary cross section from Ref. [108] (dash-dotted line in Fig. 3.8). It is, however, interesting to look at the contributions from the different production channels in this case (r.h.s. of Fig. 5.10) in comparison to the experimental data from [159,186] (full squares) and the data from the KaoS Collaboration [187] for Ni+Ni at 1.8A GeV. Here the BB channels are of approximately the same order of magnitude as the $\pi B$ channels, but the $\pi Y$ channels provide the dominant contribution, as suggested early by Ko [188] and also found in Ref. [110]. This is due to the fact that in more central collisions the pion density reaches about 0.15 fm$^{-3}$ while the hyperons have almost the same abundance as the $K^+$ mesons. Thus a substantial fraction of the hyperons undergo a quark exchange ($s \to u, d$) when propagating out of the nuclear medium.
Fig. 5.11. L.h.s.: The inclusive Lorentz-invariant cross section as a function of the kaon momentum in the nucleus-nucleus cms for $K^-$ mesons for Ni+Ni at 1.85A GeV in comparison to the experimental data from Refs. [159,186] and the data for Ni+Ni at 1.8A GeV from Ref. [187]. The dash-dotted line corresponds to a calculation with bare kaon masses, whereas the solid and dashed lines display the results with antikaon self-energies according to Eq. (5.2) for $\alpha_{\bar K} = 0.2$ and 0.24, respectively. R.h.s.: The inclusive Lorentz-invariant cross section as a function of the kaon momentum in the nucleus-nucleus cms for $K^-$ mesons at $\theta = 0°$ for Ni+Ni at 1.66A GeV in comparison to the experimental data from Ref. [159].
We now address the possible medium effects on the antikaon according to Fig. 5.1 or Eq. (5.2), respectively. We note that for $\alpha_{\bar K} = 0$ we recover the limit of vanishing antikaon self-energy, whereas for $\alpha_{\bar K} \approx 0.2$ we approximately describe the scenario of Kaplan and Nelson [95,97] or Waas et al. [98]. For practical purposes one should consider $\alpha_{\bar K}$ to be a free parameter, to be fixed in comparison to the experimental data in order to learn about the magnitude of the antikaon self-energy. The $K^-$ spectra for Ni+Ni at 1.85 and 1.66A GeV from Refs. [159,186] are shown in Fig. 5.11 for $\alpha_{\bar K} = 0$, 0.2 and 0.24, where the latter two cases correspond to an attractive potential of $-100$ and $-120$ MeV at density $\rho_0$, respectively. We note that, due to the uncertainties involved in the elementary BB production cross sections, we cannot determine this value very reliably. With increasing $\alpha_{\bar K}$ not only is the magnitude of the spectra increased, but the slope also becomes softer. This is most clearly seen at low antikaon momenta because the net attraction squeezes the spectrum towards low momenta. The analysis from Ref. [160] has been repeated independently by Li et al. [96] using, however, slightly different cross sections for the $NN \to K^+ Y N$ channel, which are almost a factor of 2 larger than ours (cf. Section 3.4), as well as for the $\Delta N$ and $\Delta\Delta$ channels. The antikaon production channels, however, are practically the same in both approaches and are based on the analysis by Sibirtsev et al. [106]. Their result for the antikaon spectrum for Ni+Ni at 1.8A GeV at $\theta = 44°$ without $K^-$ potential is shown in Fig. 5.12 in terms of the dotted histogram, which underestimates the data of
Fig. 5.12. The inclusive Lorentz-invariant antikaon cross section from Li et al. [96] as a function of the kaon kinetic energy in the nucleus-nucleus cms at $\theta = 44°$ for Ni+Ni at 1.8A GeV in comparison to the data from Ref. [187].
the KaoS Collaboration by factors of 3-5. Only when an attractive antikaon potential ($\approx -110$ MeV at density $\rho_0$) is included can the experimental spectra be described reasonably well (solid histogram). The KaoS Collaboration has recently also investigated kaon and antikaon production in C+C collisions at 1.8A GeV. The comparison of their data [189] with the theoretical predictions is displayed in Fig. 5.13 (l.h.s.). Here the dashed lines correspond to our calculations for antikaons while the solid lines reflect the calculations for $K^+$ mesons. In the latter case the upper solid line results for $\alpha_K = 0$ while the lower solid line is obtained for $\alpha_K = -0.06$. As for all the other systems shown before, the $K^+$ spectra are reproduced without a kaon potential. On the other hand, the $K^-$ spectra are described well for $\alpha_{\bar K} = 0.24$ (upper dashed line) whereas the calculations for $\alpha_{\bar K} = 0$ (lower dashed line) underestimate the data by up to a factor of 5. Though the medium effects are not as pronounced as for Ni+Ni collisions, this light system also shows clear evidence for an attractive $K^-$ potential, fully in line with that obtained for the heavier system. It is remarkable that the preliminary experimental $\pi^-$, $K^+$ and $K^-$ spectra show an approximate scaling as a function of the ‘effective’ energy of the meson in the nucleus-nucleus cms, $E^* = E + E_0$ with $E_0 = m_\pi$ for pions, $E_0 = m_K - M_N + M_\Lambda$ for kaons and $E_0 = 2 m_K$ for antikaons, as shown in Fig. 5.13 (r.h.s.) for C+C collisions at 1.8A GeV. This scaling had been predicted in Ref. [190] for the case of attractive antikaon potentials and a vanishing or weak kaon potential. We will come back to this surprising behaviour in Section 8.1. Probably the most sensitive test for kaon and antikaon self-energies is the $K^-/K^+$ ratio for central collisions of heavy systems as a function of rapidity. In this respect we show in Fig. 5.14 the
Fig. 5.13. L.h.s.: The inclusive Lorentz-invariant cross section for $K^+$ (solid lines) and $K^-$ mesons (dashed lines) as a function of the kaon/antikaon momentum in the nucleus-nucleus cms at $\theta = 40 \pm 4°$ for C+C at 1.8A GeV in comparison to the preliminary data from Ref. [189]. The upper solid line corresponds to a calculation with bare kaon masses, whereas the upper dashed line displays the results for $\alpha_{\bar K} = 0.24$. The lower solid line corresponds to a repulsive kaon potential with $\alpha_K = -0.06$ and the lower dashed line to a calculation without self-energies for the kaon and antikaon. R.h.s.: The inclusive $\pi^-$, $K^+$ and $K^-$ spectra as a function of $E^* = E + E_0$ (see text) for C+C at 1.8A GeV. The figures are taken from Ref. [189].
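The offsets entering the ‘effective’ energy scaling can be made concrete with a small sketch. It assumes the reading $E^* = E + E_0$ with $E_0 = m_\pi$, $m_K - M_N + M_\Lambda$ and $2 m_K$ for pions, kaons and antikaons, respectively; the rounded hadron masses and all names in the code are illustrative choices, not values taken from the text.

```python
# Hadron masses in GeV (rounded standard values; an assumption of this sketch,
# the text does not quote them).
M_PI, M_K, M_N, M_LAMBDA = 0.1396, 0.4937, 0.9389, 1.1157

# Offsets E_0 as read from the text: the energy that has to be supplied in
# addition to the meson energy for the lowest production channel.
E0 = {
    "pi-": M_PI,                 # pions
    "K+": M_K - M_N + M_LAMBDA,  # kaons (associated production with a Lambda)
    "K-": 2.0 * M_K,             # antikaons (pair production)
}


def effective_energy(species, meson_energy):
    """Return E* = E + E_0 in GeV for a meson energy E in the nucleus-nucleus cms."""
    return meson_energy + E0[species]


if __name__ == "__main__":
    for species, offset in E0.items():
        print(f"{species}: E_0 = {offset:.3f} GeV")
```

With these masses the offsets are ordered as $E_0(\pi^-) < E_0(K^+) < E_0(K^-)$, so spectra of the three species that look very different as a function of the meson energy are shifted onto a common axis.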
corresponding preliminary ratio for Ru+Ru ($b \le 4$ fm) at 1.69A GeV in comparison to our transport calculations with and without potentials as worked out by the FOPI Collaboration [191]. Here the preliminary data for the $K^-/K^+$ ratio (full dots) rise steeply towards midrapidity, almost in line with our predictions (solid line, using $\alpha_K = -0.06$ and $\alpha_{\bar K} = 0.2$), whereas this ratio is practically constant as a function of rapidity when no medium effects are incorporated (dotted line). As becomes clear from this figure, much better statistics will be needed in future, both on the experimental and on the theoretical side, to allow for a more precise determination of the antikaon potential in the nuclear medium.

In summary, the analysis at SIS energies shows that the $K^+$ inclusive spectra for C+C and Ni+Ni from the KaoS and FOPI Collaborations are reasonably well described without introducing any medium modifications for these mesons. Only for heavy systems like Au+Au do the inclusive spectra indicate reduced kaon cross sections in favor of a slightly repulsive kaon potential. On the other hand, more exclusive studies for the system Ru+Ru at 1.69A GeV suggest that a slightly repulsive potential of about 20-30 MeV at density $\rho_0$ might be needed to explain the preliminary differential $K^+$ flow data. Such values for the kaon potential are fully in line with the independent analysis of Li et al. [96]. The antikaon spectra, furthermore, are underestimated severely when incorporating only bare kaon masses. When including an attractive antikaon potential comparable to that proposed early by Kaplan and Nelson [95] or Waas et al. [98], a satisfactory description of the $K^-$ spectra can be given, both in magnitude and in slope. These findings are again in line with the computations by Li et al. [96], which imply an attractive antikaon potential at $\rho_0$ of $-(110 \pm 15)$ MeV in the momentum range $0.3 \le p_{\bar K} \le 0.8$ GeV/c with respect to the baryon matter at rest.
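The correspondence between the parameters $\alpha$ and the potential depths quoted in this section can be checked with a few lines of arithmetic. The sketch below assumes that Eq. (5.2), which is not reproduced in this part of the text, has the linear form $m^*(\rho) = m_K (1 - \alpha\, \rho/\rho_0)$, and it uses a free kaon mass of 493.7 MeV; the function name is illustrative.

```python
M_K = 493.7  # free kaon mass in MeV (standard value, not quoted in the text)


def mass_shift(alpha, density_ratio=1.0):
    """Mass shift m*(rho) - m_K in MeV for the assumed linear form
    m*(rho) = m_K * (1 - alpha * rho / rho_0)."""
    return -alpha * density_ratio * M_K


if __name__ == "__main__":
    for label, alpha in [("K+", -0.06), ("K-", 0.20), ("K-", 0.24)]:
        shift = mass_shift(alpha)
        print(f"{label}: alpha = {alpha:+.2f} -> shift at rho_0 = {shift:+.1f} MeV")
```

Under these assumptions the shift at $\rho_0$ is about $+30$ MeV for $\alpha_K = -0.06$ and about $-99$ and $-118$ MeV for $\alpha_{\bar K} = 0.2$ and 0.24, consistent with the rounded values quoted in the text.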
Fig. 5.14. The calculated $K^-/K^+$ ratios for Ru+Ru ($b \le 4$ fm) at 1.69A GeV with (solid line) and without potentials (dotted line) as a function of the normalized rapidity in comparison to the preliminary data of the FOPI Collaboration [191] (full dots). The figure has been taken from [191].
We mention again that experimental studies of antikaon production in NN reactions close to threshold are urgently needed to gain final control of the input cross sections used in the transport analysis. Another uncertainty in the calculations, i.e. the behaviour of the $\Delta N$ production cross sections, can be eliminated by performing studies of $K^+$ and $K^-$ production in proton-nucleus reactions, where the $\Delta$ excitation plays a minor role [192].

5.1.3. Antiproton production

Antiproton production at energies of a few A GeV is the most extreme subthreshold production process and was observed in proton-nucleus collisions a few decades ago [193-195]. Experiments at the JINR [196] and at the BEVALAC [197,198] have provided, furthermore, first measurements of subthreshold antiproton production in nucleus-nucleus collisions, followed by measurements at KEK [199] and GSI [159,200] with new detector setups. Various descriptions of these data have been proposed. Based on thermal models it has been suggested that the antiproton yield contains large contributions from $\Delta N \to \bar p + X$, $\Delta\Delta \to \bar p + X$ and $\rho\rho \to \bar p N$ production mechanisms [201-203]. Other models have attempted to explain these data in terms of multiparticle interactions [204]. Nowadays, the relative strength of the various production channels as well as possible in-medium effects are most effectively controlled by means of transport approaches [9,12,116,157,158]. First results of a fully relativistic transport calculation for antiproton production, including $\bar p$ annihilation as well as the change of the quasiparticle properties in the medium, have been reported in [158]. There it was found that, owing to the reduced nucleon mass in the medium, the threshold for $\bar p$ production is shifted to lower energy and the antiproton cross section prior to annihilation becomes enhanced substantially as compared to a relativistic cascade
calculation where no in-medium effects are incorporated. A variety of transport studies have been performed since then, most of them using the parametrization of the elementary production cross section proposed by Batko et al. [116], and have confirmed the necessity of attractive $\bar p$ potentials, however, with some debate about their actual magnitude [9,12,157]. For a review on this subject we refer the reader to Ref. [13]. More recently, the question of the elementary $\bar p$ production amplitude has been addressed by Lykasov et al. in an effective boson exchange model [114]. Here the production cross section close to threshold accurately follows the 4-body phase-space constraints, but is much lower than the commonly adopted parametrization from [116]. Furthermore, the secondary pion-induced channel might dominate, as suggested in Ref. [205], so that the $\pi N \to N p \bar p$ production channels also have to be considered in addition to the studies reviewed in Ref. [13]. Here we report the reanalysis of $\bar p$ production performed in Ref. [115]. The basis for the description of $\bar p$ production by baryon-baryon collisions is the process

$$B + B \to \bar p + p + N + N \;\equiv\; 1 + 2 \to \bar p + 3 + 4 + 5, \tag{5.3}$$

for which the corresponding covariant collision integral reads
$$
\begin{aligned}
I^{\bar p}(x, P_{\bar p}) = {} & \frac{4}{(2\pi)} \int d^3P_1\, d^3P_2\, d^3P_3\, d^3P_4\, d^3P_5\; \frac{m_1^* m_2^* m_3^* m_4^* m_5^* m_{\bar p}^*}{P_1^0 P_2^0 P_3^0 P_4^0 P_5^0 P_{\bar p}^0} \\
& \times W(P_1^\mu, P_2^\mu \,|\, P_3^\mu, P_4^\mu, P_5^\mu, P_{\bar p}^\mu)\; \delta^4(P_1^\mu + P_2^\mu - P_3^\mu - P_4^\mu - P_5^\mu - P_{\bar p}^\mu) \\
& \times \{ f(x, P_1)\, f(x, P_2)\, (1 - f(x, P_3))(1 - f(x, P_4))(1 - f(x, P_5)) \},
\end{aligned}
\tag{5.4}
$$

where $W$ is the transition probability for the reaction $P_1 + P_2 \to P_3 + P_4 + P_5 + P_{\bar p}$ in terms of momentum coordinates. We have omitted the Pauli-blocking factor for the antiproton in the final state because the number of antiprotons created during a heavy-ion collision in the subthreshold energy regime is negligible. For the same reason we neglect the effects of the reaction (5.3) on the phase-space distribution function of the baryons, as in the case of kaons and antikaons. For antiproton absorption the ‘free’ annihilation cross section (parametrized as a function of the invariant energy above threshold) is adopted [115]. In order to derive an expression for the differential $\bar p$ multiplicity it is assumed, as in Refs. [116,204,206], that the differential elementary $\bar p$ production cross section is proportional to the phase space available for the final state in BB reactions:
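$$E_3 E_4 E_5 E_{\bar p}\, \frac{d^{12}\sigma_{NN \to NNN\bar p}(\sqrt{s})}{d^3P_3\, d^3P_4\, d^3P_5\, d^3P_{\bar p}} = \sigma_{NN \to NNN\bar p}(\sqrt{s})\, \frac{1}{16\, R_4(\sqrt{s})}\, \delta^4(P_1^\mu + P_2^\mu - P_3^\mu - P_4^\mu - P_5^\mu - P_{\bar p}^\mu - \Delta^\mu) \tag{5.5}$$

and

$$E_3 E_4 E_{\bar p}\, \frac{d^{9}\sigma_{\pi N \to NN\bar p}(\sqrt{s})}{d^3P_3\, d^3P_4\, d^3P_{\bar p}} = \sigma_{\pi N \to NN\bar p}(\sqrt{s})\, \frac{1}{8\, R_3(\sqrt{s})}\, \delta^4(P_\pi^\mu + P_N^\mu - P_3^\mu - P_4^\mu - P_{\bar p}^\mu - \Delta_\pi^\mu) \tag{5.6}$$

in case of $\pi B$ reactions.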
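The numerical factors 16 and 8 in Eqs. (5.5) and (5.6) can be understood from a short normalization check. The derivation below assumes the conventional definition of the $n$-body phase-space integral, $R_n(\sqrt{s}) = \int \prod_{i=1}^{n} \frac{d^3P_i}{2E_i}\, \delta^4(P - \sum_i P_i)$, with $P$ the total four-momentum available to the final state (including the self-energy shift); this definition is an assumption here, since the text only cites Ref. [207] for it.

```latex
\begin{align*}
\int d^3P_3\, d^3P_4\, d^3P_5\, d^3P_{\bar p}\;
  \frac{d^{12}\sigma}{d^3P_3\, d^3P_4\, d^3P_5\, d^3P_{\bar p}}
 &= \frac{\sigma(\sqrt{s})}{16\, R_4(\sqrt{s})}
    \int \prod_{i \in \{3,4,5,\bar p\}} \frac{d^3P_i}{E_i}\;
    \delta^4\Big(P - \sum_i P_i\Big) \\
 &= \frac{\sigma(\sqrt{s})}{16\, R_4(\sqrt{s})}\; 2^4\, R_4(\sqrt{s})
  \;=\; \sigma(\sqrt{s}) .
\end{align*}
```

The same steps with three final-state particles give the factor $2^3 = 8$ for the $\pi B$ case, so both differential cross sections integrate to the respective total cross section.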
Table 4
The parameters in the approximation (5.7)

Reaction                  $s_0$         $a$ [mb]   $b$     $c$
$\pi N \to \bar p X$      $9\,m_N^2$    1          2.31    2.3
$NN \to \bar p X$         $16\,m_N^2$   0.12       3.5     2.7
Here, the $\delta$-functions guarantee energy and momentum conservation and $\sqrt{s}$ is the invariant energy available for the quasiparticles in the initial state. The quantities $\Delta^\mu$ and $\Delta_\pi^\mu$ in Eqs. (5.5) and (5.6) stand for the differences in the quasiparticle self-energies, respectively (cf. Ref. [115]). $R_4(\sqrt{s})$ ($R_3$) is the 4-body (3-body) phase-space integral [207]; it has been included to ensure that the differential cross sections are normalized to the total cross section. For practical use in transport calculations the antiproton production cross section in $\pi^- p$ reactions from Ref. [115] is parametrized in the form

$$\sigma_{\pi^- p \to \bar p X}(s) = a \left(\frac{s}{s_0} - 1\right)^{b} \left(\frac{s}{s_0}\right)^{-c} \tag{5.7}$$
with parameters listed in Table 4. By averaging over isospin we get an additional factor 2/3 for the reaction $\pi N \to \bar p + X$. This approximation is sufficient in view of the rather complicated reaction dynamics described by the transport approach. Using the form (5.7) we also parametrize the cross section for antiproton production from nucleon-nucleon collisions as calculated within the OBE approach in Ref. [114] (cf. Table 4). Note that the first term of Eq. (5.7) reflects the energy dependence of the cross section near the reaction threshold; its rise with the excess energy indicates a phase-space dominance for the antiproton production cross sections from both pion- and nucleon-induced reactions, which supports the ansatz (5.6) for the multi-differential cross section. Fig. 5.15 shows the parametrizations of the cross sections for antiproton production from $\pi p$ and $pp$ collisions as a function of the excess energy $\sqrt{s} - \sqrt{s_0}$. The solid circles indicate the experimental data for the reaction $p + p \to \bar p X$ [208]. Note that the $\pi + N$ channel above threshold is much larger than the $p + p$ channel since it increases with 3-body phase space compared to 4-body phase space for $p + p$. This indicates that the contribution from secondary pion-induced reactions to antiproton production in proton-nucleus and heavy-ion collisions might be important or even of leading order. In the following calculations it is assumed that the cross sections are the same for the reactions with protons, neutrons and $\Delta$-resonances.

Detailed experimental studies on antiproton production in p+A collisions at beam energies below the free $NN \to NNN\bar p$ reaction threshold were performed at KEK [199,209]. Antiprotons with momenta of 1.0-2.5 GeV/c were detected at an emission angle of 5.1° in the laboratory and at beam energies of 3.5, 4.0, 5.0 and 12.0 GeV. The experiments show a very high production cross section at the incident kinetic energy of 3.5 GeV, which is substantially below the free NN threshold of 5.6 GeV. Fig. 5.16 shows the invariant differential cross section for antiproton production in p+C collisions at a beam energy of 3.5 GeV and an emission angle of 5° in the laboratory. The experimental data are taken from [209] while the histograms indicate the results calculated with different antiproton potentials (in MeV at saturation density $\rho_0 = 0.16$ fm$^{-3}$). It is clearly seen that
Fig. 5.15. Cross sections for inclusive antiproton production from $\pi p$ and $pp$ collisions. The circles show the experimental data for the reaction $pp \to \bar p X$ [208] while the lines are the parametrizations (5.7) used in the transport calculations. The figure is taken from Ref. [115].

Fig. 5.16. Antiproton spectra from p+C collisions at a beam energy of 3.5 GeV. The experimental data are from Ref. [209] while the histograms indicate the calculations with different antiproton potentials $U$ in MeV at density $\rho_0$. The figure is taken from Ref. [115].
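The parametrization (5.7) with the Table 4 parameters is simple enough to evaluate directly. The following is a minimal sketch, assuming the form $\sigma(s) = a\,(s/s_0 - 1)^b\,(s/s_0)^{-c}$ with thresholds $s_0 = 9\,m_N^2$ and $16\,m_N^2$; the nucleon mass value, the channel labels and the function names are illustrative, and the isospin factor 2/3 mentioned in the text is not applied.

```python
M_N = 0.938  # nucleon mass in GeV (standard value; an assumption of this sketch)

# Table 4: threshold s_0 (GeV^2), a (mb), b, c
PARAMS = {
    "piN": (9.0 * M_N**2, 1.0, 2.31, 2.3),
    "NN": (16.0 * M_N**2, 0.12, 3.5, 2.7),
}


def sigma_pbar(channel, s):
    """Inclusive antiproton production cross section in mb, form of Eq. (5.7).

    s is the squared invariant energy in GeV^2; at or below threshold the
    result is zero.
    """
    s0, a, b, c = PARAMS[channel]
    x = s / s0
    if x <= 1.0:
        return 0.0
    return a * (x - 1.0) ** b * x ** (-c)


def sigma_at_excess(channel, excess):
    """Cross section in mb at excess energy sqrt(s) - sqrt(s_0), given in GeV."""
    s0 = PARAMS[channel][0]
    sqrt_s = s0**0.5 + excess
    return sigma_pbar(channel, sqrt_s**2)


if __name__ == "__main__":
    for excess in (0.1, 0.5, 1.0):
        print(excess, sigma_at_excess("piN", excess), sigma_at_excess("NN", excess))
```

At equal excess energy the pion-induced channel comes out well above the nucleon-induced one in this sketch, in line with the comparison of 3-body and 4-body phase space made in the text.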
without in-medium effects ($U = 0$) one underestimates the production cross section by more than an order of magnitude. The agreement becomes better for an attractive antiproton potential of $-(125 \pm 25)$ MeV at normal nuclear matter density. Figs. 5.17 and 5.18, furthermore, show the experimental data together with the calculations for the reaction $p + C \to \bar p X$ at bombarding energies of 4.0 and 5.0 GeV, respectively. Note that at higher energy the data are no longer as sensitive to a variation of the antiproton potential. The antiproton spectra from p+Cu collisions at bombarding energies of 3.5, 4.0 and 5.0 GeV are shown in Fig. 5.19. The dashed histograms are the results for $U = 0$, while the solid histograms indicate the calculations with an antiproton potential of $U = -100$ MeV at $\rho_0$, which describe the experimental data almost perfectly at all energies.

Experimental studies on antiproton production from heavy-ion collisions at energies below the free NN threshold have been performed at the BEVALAC [197,206] and at GSI Darmstadt [159,200]. Whereas in proton-nucleus collisions the antiprotons have quite large momenta relative to the nuclear matter, in heavy-ion reactions the antiprotons have small momenta in the nucleus-nucleus center-of-mass and are comoving with the expanding ‘fireball’. Moreover, heavy-ion reactions probe much higher baryon densities, which influences both the antiproton self-energy and their final state interactions. Fig. 5.20 shows the antiproton spectra from Si+Si collisions at a bombarding energy of 2A GeV (l.h.s.). The experimental data are from Ref. [198] while the histograms indicate the
Fig. 5.17. Antiproton spectra from p+C collisions at a beam energy of 4.0 GeV at $\theta = 0°$. The experimental data are from Ref. [209] while the histograms indicate the calculations with different antiproton potentials $U$ in MeV at density $\rho_0$. The figure is taken from Ref. [115].

Fig. 5.18. Antiproton spectra from p+C collisions at a beam energy of 5.0 GeV. The experimental data are from Ref. [209] while the histograms indicate the calculations with different antiproton potentials $U$ in MeV at $\rho_0$. The figure is taken from Ref. [115].

Fig. 5.19. Antiproton spectra from p+Cu collisions at beam energies of 3.5, 4.0 and 5.0 GeV. The experimental data are from Ref. [209] while the solid histograms show the calculations with an antiproton potential $U = -100$ MeV at $\rho_0$; the dashed histograms indicate the results for $U = 0$. The figure is taken from Ref. [115].
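The quoted free NN threshold of 5.6 GeV follows from elementary kinematics. As a worked check, assuming a nucleon of kinetic energy $T$ hitting a nucleon at rest and a nucleon mass $m_N \approx 0.938$ GeV (a standard value, not quoted in the text):

```latex
\begin{align*}
s &= (T + 2 m_N)^2 - \big[(T + m_N)^2 - m_N^2\big] = 4 m_N^2 + 2 m_N T, \\
s_0 &= (4 m_N)^2 = 16\, m_N^2
  \quad\Longrightarrow\quad
  T_{\rm thr} = \frac{16\, m_N^2 - 4\, m_N^2}{2\, m_N} = 6\, m_N \approx 5.63~\text{GeV}.
\end{align*}
```

The beam energies of 3.5, 4.0 and 5.0 GeV shown in the figures are therefore all below the free threshold, while 12.0 GeV is well above it.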
Fig. 5.20. L.h.s.: Antiproton spectra from Si+Si collisions at 2.0A GeV. The experimental data are from Ref. [198] while the histograms show the calculations with different antiproton potentials (at $\rho_0$). R.h.s.: Antiproton spectra from Ni+Ni collisions at 1.85A GeV and $\theta = 0°$. The experimental data are from Refs. [159,186] while the histograms show the calculations with different antiproton potentials $U$ at density $\rho_0$. The figures are taken from Ref. [115].

Table 5
The differential antiproton multiplicity for Si+Si collisions at 2A GeV multiplied by the factor $2\pi b$ for an antiproton potential of $U = -150$ MeV at $\rho_0$. The decomposition is performed for different impact parameters $b$ and for nucleon-nucleon, $\Delta$-nucleon and pion-nucleon reaction channels

             $d\sigma/db$ (nb/fm)
$b$ (fm)     NN       $\Delta N$   $\pi N$
1            0.64     7.09         22.0
2            1.1      9.5          28.9
3            1.0      7.3          24.5
4            0.67     3.3          4.7
5            0.027    0.63         1.7
calculations for different values of the antiproton potential $U$ (in MeV at $\rho_0$). In Table 5 we illustrate the relative contribution to antiproton production from NN, $\Delta N$ and $\pi N$ reaction channels for a potential $U = -150$ MeV at $\rho_0$. In order to investigate the variation of the cross section with the impact parameter $b$, Table 5 lists the differential antiproton multiplicity multiplied by the factor $2\pi b$. It becomes clear that the dominant contribution to $\bar p$ production in A+A collisions also stems from the secondary pion-induced reactions, as in the case of p+A collisions. Moreover, the dynamics of antiproton production and propagation show no sizeable difference in the channel decomposition as a function of the centrality of the collision.
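The channel decomposition in Table 5 can be turned into relative shares per impact parameter. The sketch below assumes the column assignment NN, $\Delta N$, $\pi N$ for the three rows of numbers in the table as printed; the variable and function names are illustrative.

```python
# Table 5 (U = -150 MeV at rho_0): d(sigma)/db in nb/fm per channel,
# for impact parameters b = 1..5 fm. The column assignment is an assumption
# based on the order of the channel labels in the table.
B_FM = [1, 2, 3, 4, 5]
TABLE5 = {
    "NN": [0.64, 1.1, 1.0, 0.67, 0.027],
    "DeltaN": [7.09, 9.5, 7.3, 3.3, 0.63],
    "piN": [22.0, 28.9, 24.5, 4.7, 1.7],
}


def channel_fractions(index):
    """Relative share of each channel at the impact parameter B_FM[index]."""
    total = sum(values[index] for values in TABLE5.values())
    return {channel: values[index] / total for channel, values in TABLE5.items()}


if __name__ == "__main__":
    for i, b in enumerate(B_FM):
        shares = channel_fractions(i)
        print(b, {channel: round(share, 2) for channel, share in shares.items()})
```

With this reading of the table the $\pi N$ share stays above one half at every impact parameter, between roughly 0.54 (at $b = 4$ fm) and 0.75, while the direct NN channel contributes at the few-percent level, except at $b = 4$ fm where it reaches about 8%.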
The r.h.s. of Fig. 5.20 shows the antiproton spectra from Ni+Ni collisions at 1.85A GeV. The experimental data are taken from [159,186] and can be reasonably reproduced by our calculations with $U \approx -(125 \pm 25)$ MeV at $\rho_0$, whereas the data are underestimated by more than an order of magnitude if no $\bar p$ potential is included. We thus find a consistent description of all data employing an attractive potential for the antiprotons of about $-100$ to $-150$ MeV at $\rho_0$, which is roughly in line with a dispersive potential extracted from the dominant imaginary part of the antiproton self-energy due to annihilation (cf. Ref. [115]). The preceding analysis at subthreshold energies, on the other hand, suggests weaker antiproton potentials than anticipated from studies at AGS energies [31,210], where the total antiproton production cross section is no longer that sensitive to its value near the reaction threshold. There a $\bar p$ potential of $\approx -250$ MeV at $\rho_0$ is proposed in Refs. [31,210] for an antiproton at rest with respect to the nuclear matter, whereas an antiproton potential of $\approx -160$ MeV is reported for a $\bar p$ momentum of 1 GeV/c in Ref. [210].

5.2. AGS energies

Whereas our investigation of strangeness production up to 2A GeV has given some clear evidence for antikaon potentials in the medium due to the strong increase of the elementary production cross section with the excess energy $\sqrt{s} - \sqrt{s_0}$, the question arises about strangeness production at AGS energies (10-15A GeV). Here the invariant energy in first-chance NN collisions is $\sqrt{s} \approx 5$ GeV, which is far above threshold, where the production cross section changes only smoothly with energy. Furthermore, due to the higher meson densities, meson-meson reaction channels will also become important, especially for heavy systems such as Au+Au. The analysis in this section follows closely the work of Geiss et al. [211].
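The statement that first-chance NN collisions at AGS energies have $\sqrt{s} \approx 5$ GeV can be checked with fixed-target kinematics. This is a minimal sketch, assuming a nucleon of kinetic energy $T$ per nucleon hitting a nucleon at rest (no Fermi motion) and a nucleon mass of 0.938 GeV; the function name is illustrative.

```python
import math

M_N = 0.938  # nucleon mass in GeV (standard value; an assumption of this sketch)


def sqrt_s_nn(t_kin):
    """Invariant energy in GeV for a nucleon of kinetic energy t_kin (GeV)
    colliding with a nucleon at rest: s = 4 m_N^2 + 2 m_N T."""
    return math.sqrt(4.0 * M_N**2 + 2.0 * M_N * t_kin)


if __name__ == "__main__":
    for t_kin in (2.0, 10.0, 14.6):
        print(f"T = {t_kin:5.1f} GeV -> sqrt(s) = {sqrt_s_nn(t_kin):.2f} GeV")
```

For $T$ between 10 and 14.6 GeV this gives $\sqrt{s}$ between about 4.7 and 5.6 GeV, compared with about 2.7 GeV at 2A GeV, which illustrates why the SIS-energy reactions sit close to the strangeness thresholds while the AGS-energy reactions are far above them.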
We will investigate the same systems as in Section 4.2, where we concentrated on proton and pion rapidity distributions that were found to be reasonably in line with the HSD transport calculations. We start with p+Be at 14.6 GeV and display in Fig. 5.21 (l.h.s.) the calculated $K^+$ and $K^-$ rapidity distributions in the nucleon-nucleon cms in comparison to the data of the E802 Collaboration [141]. Both the $K^+$ (upper part) and $K^-$ rapidity distributions (lower part) are almost symmetric around midrapidity, indicating little reabsorption of both mesons due to the small size of the target. Whereas the $K^+$ spectra are described quite well, the $K^-$ spectrum is clearly overestimated by the calculation and is also too broad in rapidity. We attribute this deficiency to an inadequate description of $\bar K$ production by the LUND model in pN production channels, which has to be kept in mind when comparing to the heavier systems further on. The calculated $K^+$ and $K^-$ spectra for p+Au at 14.6 GeV shown in Fig. 5.21 (r.h.s.) are no longer symmetric around midrapidity due to rescattering and especially $K^-$ absorption on target nucleons (at rapidity $\approx -2$). However, both spectra from the E802 Collaboration [141] are not well described by the transport approach and the $K^-$ yield is slightly overestimated for $y \ge 0$. This indicates that antikaon production at AGS energies is not well understood even for p+A reactions. The present calculations have so far been performed without any kaon and antikaon potentials. Attractive $K^-$ potentials do not improve the situation, but make it worse; on the other hand, repulsive $K^-$ potentials at high momentum might help.
Fig. 5.21. The $K^+$ (upper parts) and $K^-$ (lower parts) rapidity distributions in the nucleon-nucleon cms for p+Be at 14.6 GeV (l.h.s.) and p+Au at 14.6 GeV (r.h.s.) in comparison to the data of the E802 Collaboration [141].
How does the situation look in light nucleus-nucleus collisions? The $K^+$ and $K^-$ rapidity distributions for central Si+Al reactions at 14.6A GeV are shown in Fig. 5.22 (l.h.s.) in comparison to the data from [89,142]. Here the $K^+$ yield as well as the $K^-$ yield are underestimated by roughly 30-40%. The result for antikaons in particular is surprising in view of Fig. 5.21, where the yield for p+Au was overestimated for $y \ge 0$. The situation becomes worse for Si+Au at 14.6A GeV, as shown in Fig. 5.22 (r.h.s.), where our calculations with bare kaon and antikaon masses (solid histograms) significantly underpredict the $K^+$ and $K^-$ rapidity distributions from E802 and E859 [142,212,213]. Introducing in-medium kaon and antikaon masses according to Eq. (5.2) with $\alpha_K = -0.06$ and $\alpha_{\bar K} = 0.24$ (as appropriate for SIS energies), the situation improves for the $K^-$ yield; however, the $K^+$ rapidity spectra are still underestimated by the transport calculation (dashed histograms). Note that, in spite of a repulsive $K^+$ potential, the kaon yield is increased in this case because the channel $\pi\pi \to \bar K K$ is enhanced, since the vector interactions cancel out in this channel and only the scalar attraction is left for the $\bar K K$ pair. The situation is similar in the RQMD approach for Si+Au, as demonstrated in Fig. 5.23 (taken from Ref. [214]). Whereas RQMD also describes the $\pi^+$ and $\pi^-$ spectra reasonably well [214], the $K^+/K^-$ ratio (upper part) and especially the $K^+/\pi^+$ ratio (lower part) are sizeably underestimated in comparison to the data from E802 (full squares). Since strangeness is
Fig. 5.22. L.h.s.: The $K^+$ (upper parts) and $K^-$ (lower parts) rapidity distributions for central Si+Al reactions at 14.6A GeV in comparison to the data from [142] (full squares) and [89] (open dots). R.h.s.: The $K^+$ and $K^-$ rapidity distributions for central Si+Au reactions at 14.6A GeV calculated with bare masses (solid histograms) and with in-medium masses (dashed histograms) in comparison to the data from Refs. [142] (full dots), [212] (full triangles) and [213] (open dots).
conserved in both calculations (RQMD and HSD), we have to conclude here that the initial production of strangeness, i.e. of $s\bar s$ pairs, is underestimated in the hadronic models. The heaviest system studied at the AGS is Au+Au at $\approx 11$A GeV. Our calculated kaon and antikaon rapidity spectra for central (0-10%) reactions ($b \le 2$ fm) are displayed on the l.h.s. of Fig. 5.24 in comparison to the data from Ref. [144]. The $K^+$ and $K^-$ rapidity spectra for mid-central (5-12%) reactions are shown on the r.h.s. of Fig. 5.24 in comparison to the experimental data from [215]. The solid histograms correspond to the ‘bare mass’ scenario and strongly underestimate the data, whereas the dashed histograms are obtained for $\alpha_K = -0.06$ and $\alpha_{\bar K} = 0.24$, respectively. Whereas the $K^-$ yield is almost reproduced in the latter scheme, the $K^+$ yield is still underestimated, as in the case of the Si+Au system at 14.6A GeV. It is worth mentioning that in the latter case the $\pi Y \to \bar K N$ production channel is as important as the BB production channel for antikaons, since for $\rho \ge 2\rho_0$ this channel is now above the in-medium threshold, contrary to the ‘bare mass’ case. Furthermore, the pion density is of the same order as the baryon density in central Au+Au collisions at 11A GeV. Nonstrange meson-baryon channels contribute only about 30% in the ‘in-medium mass’ scenario.
Fig. 5.23. Comparison of multiplicity ratios from E802 (black symbols) for Si # Au reactions at 14.6A GeV and RQMD analyzed in Ref. [214] within the experimental acceptance (open symbols). The upper part shows the Dn(K>)/Dn(K\) ratio plotted versus laboratory rapidity, while the lower part shows the ratio Dn(K>)/Dn(n>). The figure is taken from Ref. [214].
In view of the systematic presentation of our results in comparison to data from p+Be to Au+Au collisions we infer that the hadronic transport model does not describe these systems accurately enough to allow for definite conclusions. Though there is some tendency to get a better reproduction of the experimental data when including in-medium kaon and antikaon masses as found for the SIS energy regime (and also suggested by Brown et al. [216]), we do not consider this to be conclusive. Furthermore, independent calculations within the ART-code from Li and Ko [24] or the ARC-code from Kahana et al. [23] seem to describe the $K^+$ spectra for central Au+Au reactions without any in-medium effects. Unfortunately, the latter calculations have not been applied to the other systems presented here, so that in case of conflicting results between different transport calculations no unbiased message can be extracted. However, when relying on the results presented in this Section within the HSD approach, the failure in reproducing the $K^+/\pi^+$ ratios suggests that nonhadronic or partonic degrees of freedom become important in nucleus-nucleus collisions at $\sim$10A GeV. Although the actual rapidity distributions for kaons and antikaons are not well described, it is worthwhile to explore the flow of kaons and antikaons because the flow reflects more sensitively the sign of the interaction of mesons with baryons. In this respect we show in Fig. 5.25 the transverse momentum per particle (of the kaon and antikaon) in the reaction plane for central collisions ($b \le 6$ fm) and peripheral reactions ($b \ge 7$ fm) for both scenarios, the ‘bare mass’ scheme (solid lines) as well as for the in-medium $K^+$ and $K^-$ masses (dashed lines). Whereas the kaon flow
Fig. 5.24. L.h.s.: The $K^+$ (upper parts) and $K^-$ (lower parts) rapidity distributions for central (0–10%) Au + Au reactions at 11A GeV calculated with bare masses (solid histograms) and with in-medium masses (dashed histograms) in comparison to the data from Ref. [144]. R.h.s.: the same for mid-central (5–12%) reactions. The experimental data are taken from Ref. [215].
in both cases is almost zero within the statistics (especially for central collisions), the antikaons exhibit a strong negative flow for peripheral reactions in the ‘bare mass’ case which is sizeably reduced in the ‘in-medium mass’ scheme and even slightly positive close to midrapidity. This effect is due to $K^-$ absorption in the spectators and an additional attraction to the baryon current in the ‘medium mass’ scenario. On the other hand, for central collisions the spectator absorption is quite low such that the antikaon flow in the ‘bare mass’ case is only weakly repulsive; this weak repulsion turns to a slight attraction in case of the in-medium masses. It remains to be seen experimentally if these correlations actually hold and, especially, if an attractive flow of antikaons can be established for central Au+Au collisions.
5.3. SPS energies
For about two decades the strangeness enhancement in ultrarelativistic nucleus-nucleus collisions has been proposed as a possible signature for the formation of a quark-gluon plasma (QGP) [217,218]. However, strangeness is also produced in all energetic collisions of nonstrange mesons with nonstrange baryons as well as in nonstrange meson-meson collisions. Thus the relative
Fig. 5.25. Kaon (l.h.s.) and antikaon (r.h.s.) flow in the reaction plane for Au + Au central collisions ($b \le 6$ fm) and peripheral reactions ($b > 7$ fm) at 11A GeV for both scenarios, the ‘bare mass’ scheme (solid lines) as well as for in-medium $K^+$ and $K^-$ masses (dashed lines).
abundance of these secondary and ternary reaction channels will be essential in determining the relative $\bar s s$ enhancement compared to pp collisions at the same energy. In this respect we display in Fig. 5.26 the number of BB and mB collisions as a function of the invariant collision energy $\sqrt{s}$ for central S+S collisions at 200A GeV and Pb+Pb collisions at 160A GeV. The baryon distributions for both systems show slight peaks around the initial $\sqrt{s} = \sqrt{4m_N^2 + 2m_N T}$, but extend over the whole $\sqrt{s}$ regime with a strong peak slightly above $\sqrt{s} = 2m_N$. Whereas the peak at high $\sqrt{s}$ corresponds to the first chance nucleon-nucleon collisions, the latter one represents low energy ‘comover’ scattering. Due to the larger size of the system, low and intermediate energy BB collisions are enhanced for Pb+Pb as compared to S+S. In order to achieve a relative normalization the collision numbers in Fig. 5.26 for S+S have been multiplied by 208/32 = 6.5. Meson-baryon collisions and meson-meson collisions (not shown) are about factors of 2 and 4, respectively, higher in Pb+Pb as compared to S+S. In view of the strangeness production threshold in mB reactions of 1.612 GeV for kaons and 1.932 GeV for $K\bar K$ pairs, respectively, a considerable part of secondary mB reactions can still contribute to the net strangeness production. Since the Lund-string-model (LSM) describes the strangeness production in pp collisions very well (cf. Figs. 3.6 and 3.7) and also the low energy production channels are reasonably well under
Fig. 5.26. The number of baryon-baryon (upper part) and meson-baryon collisions (lower part) as a function of the invariant collision energy $\sqrt{s}$ for central S+S collisions at 200A GeV (multiplied by 208/32) and Pb + Pb collisions at 160A GeV.
Fig. 5.27. The rapidity distributions of $K^+$ (upper part) and $K^-$ (lower part) mesons for central collisions of S+S at 200A GeV calculated with bare masses (solid histograms) and with in-medium masses (dashed histograms) in comparison to the data from Refs. [219,220].
control (cf. Section 5.1), it is now a quantitative question whether a purely hadronic model will be able to describe the strangeness production channels in nucleus-nucleus collisions at SPS energies. Our results for the rapidity distributions of $K^+$ and $K^-$ in central collisions of S+S at 200A GeV are displayed in Fig. 5.27 in comparison to the data from Refs. [219,220]. Since these data as well as the corresponding pion rapidity distributions (cf. Fig. 4.12) are described quite reasonably in the hadronic transport approach, the quoted strangeness enhancement might also be explained in a purely hadronic scenario including rescattering. This was pointed out by Sorge a couple of years ago [221]; our independent calculations thus support his findings. On the other hand, using in-medium kaon masses with $\alpha_K = -0.06$ and $\alpha_{\bar K} = 0.24$ in Eq. (2), the kaon yield for S+S is only enhanced by about 4% whereas the rapidity distributions become slightly broader (dashed histograms in Fig. 5.27). Both results are within the experimental error bars such that no clear distinction can be made when comparing to the present data. For the light system S+S about 90% of $K^+$ and 82% of $K^-$ stem from BB collisions whereas the contribution from mB reactions is 6% for $K^+$ and 8% for $K^-$; 4% of $K^+$ and about 7% of $K^-$ mesons arise from mm reactions. The residual $K^-$ seen asymptotically stem from $\pi Y$ channels. Since
in Pb#Pb collisions these secondary and ternary reactions are more frequent, one might expect a drastic enhancement of strangeness production for the heavier system. Our results for the kaon and antikaon rapidity distributions for central collisions of Pb#Pb at 160A GeV are shown in Fig. 5.28 (l.h.s.) in comparison to the data from [222]. The solid histograms correspond to the ‘bare mass’ scenario whereas the dashed histograms reflect the ‘in-medium mass’ case with a "!0.06 and a M "0.24. As for S#S at 200A GeV the K>, K\ distributions are ) ) reproduced rather well showing even a tendency for an excess of kaons and antikaons in the calculations rather than missing strangeness. This is also confirmed by the K rapidity distribution (r.h.s. of Fig. 5.28) in comparison to the data from Ref. [222]. The distribution in rapidity again becomes broader for in-medium kaon masses (dashed histograms in Fig. 5.28), but also here no final conclusion can be drawn on the explicit in-medium masses. Contrary to S#S the kaon production by mB channels for Pb#Pb increases to about 20% and mm channels give roughly 15% in case of K> mesons. Antikaons, that are detected finally, stem from BB collisions by +52%, further 20% come from mB reactions, 13% from mm channels and 15% from n½ channels which indicates the relative importance of secondary and ternary reactions for the heavy system. Thus the strangeness enhancement seen at SPS energies appears to be compatible with a hadronic reaction scenario. A similar conclusion has been also obtained by Letessier et al. [223] who have studied the abundancies of all strange particles for Pb#Pb at 158A GeV within a model incorporating a hadron gas in thermal and chemical equilibrium. We note that the production of antihyperons as well as NM is enhanced in S#S and Pb#Pb collisions at the SPS and color rope formation has been included in the RQMD approach in order to describe this phenomenon [224]. 
In the VENUS model [148] the enhancement of antihyperons is due to ‘preclusters’ of high energy density whereas Capella [225] attributes the enhancement to additional strings. However, our transport simulations indicate a strong sensitivity of the pN /KM ratio on the annihilation cross section with baryons, i.e. assuming the annihilation cross section of KM ’s to be roughly 2/3 of the pN annihilation cross section the experimental ratios might be explained since antiprotons are more suppressed than KM ’s. In summarizing this Chapter we like to point out that the subthreshold production studies on kaons at SIS energies indicate either a vanishing or slightly repulsive kaon potential (+ 20—30 MeV at o ) in the nuclear medium whereas antikaon and antiproton potentials are strongly attractive, i.e. +!(110$15) MeV and +!(125$25) MeV, respectively. The size of the potentials for kaons and antikaons are practically the same in the studies by Li et al. [96] such that our conclusions on the meson self-energies might be considered as model independent to a large extent. However, more detailed experimental exclusive data (as a function of centrality) are urgently needed to allow for final conclusions. From the theoretical side the explicit momentum dependence of the meson self-energies will have to be included in future. At AGS energies no consistent picture could be extracted so far. Here the HSD approach underestimates the kaon and antikaon yields in nucleus—nucleus reactions even when including the self-energies in line with the SIS data. Furthermore, the experimental rapidity distributions for pions are narrower in A#A reactions than for p#p collisions; this effect is also not described by the hadronic transport model and might indicate new physics such as a major influence from partonic degrees of freedom. 
On the other hand, the production of strangeness at SPS energies is fully in line with the hadronic transport approach such that here the signal of ‘strangeness enhancement’ in A+A collisions does not qualify as a sensitive observable for an intermediate QGP phase.
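As a small numerical check of the kinematics quoted in Section 5.3, the following sketch evaluates the initial nucleon-nucleon invariant energy $\sqrt{s} = \sqrt{4m_N^2 + 2m_N T}$ for the two SPS beam energies, together with the mass-number factor used to normalize the S+S collision numbers. The nucleon mass value and the function name are assumptions made for illustration; they are not taken from the text.

```python
import math

M_N = 0.938  # nucleon mass in GeV (assumed value)


def sqrt_s_nn(t_lab):
    """Invariant energy (GeV) of a nucleon with kinetic energy t_lab (GeV)
    hitting a nucleon at rest: sqrt(4 m_N^2 + 2 m_N T)."""
    return math.sqrt(4.0 * M_N**2 + 2.0 * M_N * t_lab)


if __name__ == "__main__":
    for system, t_lab in (("S+S", 200.0), ("Pb+Pb", 160.0)):
        print(f"{system}: sqrt(s) = {sqrt_s_nn(t_lab):.2f} GeV")
    # Relative normalization of the S+S collision numbers in Fig. 5.26
    print("208/32 =", 208 / 32)
```

Under these assumptions the first-chance collisions sit near 19.5 GeV (S+S) and 17.4 GeV (Pb+Pb), far above the mB strangeness thresholds of 1.612 GeV and 1.932 GeV quoted in the text.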
Fig. 5.28. The rapidity distributions of $K^+$ and $K^-$ (l.h.s.) and of $\Lambda$-hyperons (r.h.s.) for central collisions of Pb + Pb at 160A GeV calculated with bare masses (solid histograms) and with in-medium masses (dashed histograms) in comparison to the data from Ref. [222].
6. Dilepton production
Electromagnetic decays to virtual photons ($e^+e^-$ or $\mu^+\mu^-$ pairs) were suggested long ago to serve as a possible signature for a phase transition to the QGP [226–230] or to be an ideal probe for vector meson spectroscopy in the nuclear medium. Furthermore, dileptons of low mass and transverse momentum might also indicate the presence of disordered chiral condensates (DCC) [231]. As pointed out in Refs. [37,61] (cf. Section 2.4), the isovector current-current correlation function (2.33) is proportional to the imaginary part of the $\rho$-meson propagator and also to the dilepton invariant mass spectra. Dileptons are particularly well suited for an investigation of the violent phases of a high-energy heavy-ion collision because they can leave the reaction volume essentially undistorted by final-state interactions. Indeed, dileptons from heavy-ion collisions have been observed by the DLS Collaboration at the BEVALAC [232–234] and by the CERES [235,236], HELIOS [237,238], NA38 [239] and NA50 Collaborations [240] at SPS energies.
Fig. 6.1. Dilepton invariant mass spectra for Ca+Ca at 1 and 2A GeV in comparison to the data from Refs. [232–234]. The dashed curves labelled by ‘background’ contain all processes except $\pi^+\pi^-$ annihilation. The dotted and solid lines are obtained with a free and a medium dependent $\rho$ spectral function from Herrmann et al. [244], respectively. The figures are taken from Ref. [243].
Some years ago it was found within microscopic transport studies at BEVALAC/SIS energies [7,125,241,242] that above about 0.5 GeV invariant mass (of the lepton pair) the dominant production channel is $\pi^+\pi^-$ annihilation, such that the properties of the short-lived $\rho$-meson could be explored at high baryon density. It was then speculated [243] that the coupling of the $\rho$ to the two-pion channel and the coupling of the pion to the Delta-nucleon(hole) channel might lead to a disappearance of the $\rho$ peak in the dilepton spectra as shown in Fig. 6.1 (from Ref. [243]), where the calculated dilepton spectra for Ca + Ca at 1 and 2A GeV are displayed for a free $\rho$ spectral function (dotted lines) and an in-medium spectral function according to the model by Herrmann et al. [244] (solid lines) in comparison to the old DLS data [232–234]. Furthermore, a shift and broadening of the $\phi$-meson peak has been proposed by Li and Ko in Ref. [245] at SIS energies. The data at that time, however, did not allow for a closer distinction of the various models proposed. The recent data on $e^+e^-$ or $\mu^+\mu^-$ spectra at SPS energies, on the other hand, appear to be more conclusive. The enhancement of the low mass dimuon yield in S+W compared to p+W collisions [237,238] at 200A GeV was first suggested by Koch et al. [246] to be due to $\pi^+\pi^-$ annihilation. Furthermore, Li et al. [247] have proposed that the enhancement of the $e^+e^-$ yield in S+Au collisions as observed by the CERES Collaboration [235] should be due to an enhanced $\rho$-meson production (via $\pi^+\pi^-$ annihilation) and a dropping $\rho$-mass in the medium. In fact, their analysis, which was based on an expanding fireball scenario in chemical equilibrium, could be confirmed within the microscopic transport calculations in Ref. [248] including a dropping $\rho$-mass in line with the QCD sum rule approach by Hatsuda and Lee [56]. Meanwhile, various authors have substantiated the observation in Refs. [248,249] that the spectral shape of the dilepton yield is incompatible with ‘free’ meson formfactors [250–254]. This is demonstrated in Fig. 6.2 (taken from Ref. [255]), where the calculations within various dynamical models [247–252,256,257] involving a bare $\rho$ spectral function are compared to the data from the CERES [235,236] and HELIOS-3 Collaborations [237]. All of these approaches underestimate the dilepton yield for invariant
Fig. 6.2. Comparison of dilepton data from the CERES and HELIOS-3 Collaborations to calculations from different groups [247—252,256,257]. The upper figures show the calculations with bare meson masses, whereas the lower figures show the results from Refs. [248,256,286] which assume ‘dropping’ meson masses in the dense nuclear medium. The figures are taken from Ref. [255].
masses $0.3\ \mathrm{GeV} \le M \le 0.6\ \mathrm{GeV}$, thus indicating a sizeable change of the $\rho$-meson properties in the dense medium. In fact, the transport models from Refs. [247–249,256] provide a very good description of the data when involving dropping vector meson masses either in line with Brown-Rho scaling (2.22) or the sum rule analysis of Hatsuda and Lee (lower part of Fig. 6.2 taken from Ref. [255]). The latter independent studies thus have suggested that the low mass dilepton enhancement might be interpreted as a step towards the restoration of chiral symmetry. However, a more conventional approach including the change of the $\rho$-meson spectral function in the medium due to the coupling of the $\rho$, $\pi$, $\Delta$ and nucleon dynamics along the line of Refs. [244,258,259] was found to be roughly compatible with the CERES data [248,260], too. Meanwhile, our knowledge of the $\rho$ properties in the medium has improved considerably. As first pointed out by Friman and Pirner, the scattering of $\rho$-mesons on nucleons is dominated by higher resonances, and resonance-nucleon(hole) loops induce a shift of the $\rho$ strength to lower invariant masses in its spectral function [261]. More elaborate calculations along this line have been performed by Rapp et al. [262]; within the expanding fireball model from [247] they achieved
a good reproduction of the CERES data, too. In shorthand form the question thus emerges as: Is the $\rho$-meson dropping in the medium or just melting due to the strong hadronic couplings? This simplifying question, however, might be misleading. We recall that the question of chiral symmetry restoration does not necessarily imply that the vector meson masses have to drop with baryon density or temperature [263]. Actually, chiral symmetry restoration (ChSR) only implies that the isovector current-current correlation function (cf. Section 2.4) and the axial vector current-current correlation function (dominated by the chiral partner of the $\rho$, the $a_1$-meson) should become identical at high $\rho_B$ or temperature $T$, respectively, because there should be no more differences between left- and right-handed particles or equivalently vector and axial vector currents [37,263]. Thus also a strong broadening of the $\rho$- as well as the $a_1$-meson and their mixing in the medium can be considered as a signature for the ChSR. Before investigating the different scenarios more closely in comparison to experimental data from 1–200A GeV we briefly recall the elementary sources for dilepton production and give the corresponding formfactors employed in the following calculations.
6.1. Elementary production channels and formfactors
In our analysis we calculate dilepton production by taking into account the contributions from the Dalitz-decays $\Delta \to N e^+e^-$, $\eta \to \gamma e^+e^-$, $\omega \to \pi e^+e^-$, $\eta' \to \gamma e^+e^-$, $a_1 \to \pi e^+e^-$ and the direct dilepton decays of the vector mesons $\rho$, $\omega$, $\phi$, where the $\rho$, $\phi$ and $a_1$ mesons may as well be produced in $\pi\pi$, $K\bar K$ and $\pi\rho$ collisions, respectively. In case of a perturbative treatment of the channel $\pi^+\pi^- \to \rho \to e^+e^-$ the cross section is parametrized as in Refs. [125,243,264] by

$$\sigma_{\pi^+\pi^- \to e^+e^-}(M) = \frac{4\pi}{3}\left(\frac{\alpha}{M}\right)^2 \sqrt{1 - \frac{4m_\pi^2}{M^2}}\; |F_\pi(M)|^2 , \tag{6.1}$$

where the ‘free’ formfactor of the pion is approximated by

$$|F_\pi(M)|^2 = \frac{m_\rho^4}{(M^2 - m_\rho^2)^2 + m_\rho^2 \Gamma_\rho^2} . \tag{6.2}$$

In Eq. (6.1) $M$ is the dilepton invariant mass, $\alpha$ is the fine structure constant, and $m_\rho = 775$ MeV, $\Gamma_\rho = 118$ MeV. (The perturbative treatment is used only to test the dynamical scheme described below in case of the bare $\rho$-mass scenario.)
The cross section for $K^+K^-$ annihilation is parametrized as [245]

$$\sigma_{K^+K^- \to e^+e^-}(M) = \frac{4\pi}{3}\left(\frac{\alpha}{M}\right)^2 \sqrt{1 - \frac{4m_K^2}{M^2}}\; |F_K(M)|^2 , \tag{6.3}$$

where the ‘free’ formfactor of the kaon is approximated by

$$|F_K(M)|^2 = \frac{1}{9}\, \frac{m_\phi^4}{(M^2 - m_\phi^2)^2 + m_\phi^2 \Gamma_\phi^2} \tag{6.4}$$

with $m_\phi = 1020$ MeV, $\Gamma_\phi = 4.43$ MeV.
The cross section for dilepton production in $\pi^+\rho^- \to \phi \to e^+e^-$, $\pi^-\rho^+ \to \phi \to e^+e^-$ scattering can be represented as

$$\sigma_{\pi\rho \to \phi \to e^+e^-}(M) = \frac{1}{3}\, \frac{B_{\rho\pi}}{B_{K^+K^-}}\; \sigma_{K^+K^- \to \phi \to e^+e^-}(M) \tag{6.5}$$

with $B_{\rho\pi} = 0.13$ and $B_{K^+K^-} = 0.49$.
The $\eta$ Dalitz-decay is given by [265]
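As a rough numerical illustration of Eqs. (6.1) to (6.4), the sketch below evaluates the two perturbative annihilation cross sections with the pole formfactors. The resonance parameters are those quoted with Eqs. (6.2) and (6.4); the pion and kaon masses, the numerical value of $\alpha$, the conversion constant to millibarn and all function names are assumptions introduced here for illustration.

```python
import math

ALPHA = 1.0 / 137.035999        # fine structure constant (assumed value)
M_PI = 0.13957                  # charged pion mass in GeV (assumed value)
M_K = 0.49368                   # charged kaon mass in GeV (assumed value)
M_RHO, G_RHO = 0.775, 0.118     # GeV, quoted with Eq. (6.2)
M_PHI, G_PHI = 1.020, 0.00443   # GeV, quoted with Eq. (6.4)
GEV2_TO_MB = 0.389379           # (hbar c)^2 in GeV^2 mb (assumed value)


def pole_formfactor_sq(M, m_v, gamma_v, weight=1.0):
    """|F(M)|^2 = weight * m_v^4 / ((M^2 - m_v^2)^2 + m_v^2 gamma_v^2)."""
    return weight * m_v**4 / ((M**2 - m_v**2) ** 2 + (m_v * gamma_v) ** 2)


def sigma_annihilation(M, m_meson, ff_sq):
    """Common form of Eqs. (6.1) and (6.3) in mb; zero below threshold."""
    if M <= 2.0 * m_meson:
        return 0.0
    phase = math.sqrt(1.0 - 4.0 * m_meson**2 / M**2)
    return GEV2_TO_MB * (4.0 * math.pi / 3.0) * (ALPHA / M) ** 2 * phase * ff_sq


def sigma_pipi(M):
    """Eq. (6.1) with the pion formfactor of Eq. (6.2)."""
    return sigma_annihilation(M, M_PI, pole_formfactor_sq(M, M_RHO, G_RHO))


def sigma_kk(M):
    """Eq. (6.3) with the kaon formfactor of Eq. (6.4), including the factor 1/9."""
    ff_sq = pole_formfactor_sq(M, M_PHI, G_PHI, weight=1.0 / 9.0)
    return sigma_annihilation(M, M_K, ff_sq)


if __name__ == "__main__":
    for mass in (0.3, 0.5, 0.775, 1.02):
        print(f"M = {mass:5.3f} GeV  pipi: {sigma_pipi(mass):.3e} mb  KK: {sigma_kk(mass):.3e} mb")
```

At the pole the pion formfactor reduces to $(m_\rho/\Gamma_\rho)^2 \approx 43$, which is why the $\pi^+\pi^-$ channel dominates the spectrum around the $\rho$ mass in the bare-mass scenario.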
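
$$\frac{d\Gamma_{\eta \to \gamma e^+e^-}}{dM} = \frac{4\alpha}{3\pi}\, \frac{\Gamma_{\eta \to \gamma\gamma}}{M} \left(1 - \frac{4m_e^2}{M^2}\right)^{1/2} \left(1 + \frac{2m_e^2}{M^2}\right) \left(1 - \frac{M^2}{m_\eta^2}\right)^{3} |F_{\eta \to \gamma e^+e^-}(M)|^2 , \tag{6.6}$$

where the formfactor is parametrized in the pole approximation as

$$F_{\eta \to \gamma e^+e^-}(M) = \left(1 - \frac{M^2}{\Lambda_\eta^2}\right)^{-1} \tag{6.7}$$

with the cut-off parameter $\Lambda_\eta \simeq 0.72$ GeV. Similarly, the $\omega$ Dalitz-decay is [265]

$$\frac{d\Gamma_{\omega \to \pi e^+e^-}}{dM} = \frac{2\alpha}{3\pi}\, \frac{\Gamma_{\omega \to \pi\gamma}}{M} \left(1 - \frac{4m_e^2}{M^2}\right)^{1/2} \left(1 + \frac{2m_e^2}{M^2}\right) \left[\left(1 + \frac{M^2}{m_\omega^2 - m_\pi^2}\right)^{2} - \frac{4 m_\omega^2 M^2}{(m_\omega^2 - m_\pi^2)^2}\right]^{3/2} |F_{\omega \to \pi e^+e^-}(M)|^2 , \tag{6.8}$$

where the formfactor squared is parametrized as

$$|F_{\omega \to \pi e^+e^-}(M)|^2 = \frac{\Lambda_\omega^4}{(\Lambda_\omega^2 - M^2)^2 + \Lambda_\omega^2 \Gamma_\omega^2} \tag{6.9}$$
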
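The Dalitz-decay spectra of Eqs. (6.6) to (6.9) can be sketched numerically as follows. The functions return the differential width divided by the corresponding photonic width, so no partial width has to be assumed. The cut-off $\Lambda_\eta \simeq 0.72$ GeV and the $\omega$ formfactor parameters (0.65 GeV and 75 MeV) are taken from the text; the particle masses, the value of $\alpha$ and the function names are assumptions made for this illustration.

```python
import math

ALPHA = 1.0 / 137.035999   # fine structure constant (assumed value)
M_E = 0.000511             # electron mass in GeV (assumed value)
M_ETA = 0.5478             # eta mass in GeV (assumed value)
M_OMEGA = 0.7826           # omega mass in GeV (assumed value)
M_PI0 = 0.13498            # neutral pion mass in GeV (assumed value)
LAMBDA_ETA = 0.72          # GeV, cut-off of Eq. (6.7)
LAMBDA_OMEGA = 0.65        # GeV, parameter of Eq. (6.9)
GAMMA_OMEGA_FF = 0.075     # GeV, parameter of Eq. (6.9)


def lepton_factor(M):
    """(1 - 4 m_e^2/M^2)^(1/2) * (1 + 2 m_e^2/M^2), common to Eqs. (6.6) and (6.8)."""
    r = M_E**2 / M**2
    return math.sqrt(1.0 - 4.0 * r) * (1.0 + 2.0 * r)


def eta_formfactor(M):
    """Pole approximation of Eq. (6.7)."""
    return 1.0 / (1.0 - M**2 / LAMBDA_ETA**2)


def eta_dalitz_shape(M):
    """Eq. (6.6) divided by Gamma(eta -> gamma gamma), in 1/GeV."""
    if M <= 2.0 * M_E or M >= M_ETA:
        return 0.0
    kin = (1.0 - M**2 / M_ETA**2) ** 3
    return 4.0 * ALPHA / (3.0 * math.pi * M) * lepton_factor(M) * kin * eta_formfactor(M) ** 2


def omega_formfactor_sq(M):
    """Formfactor squared of Eq. (6.9)."""
    lam2 = LAMBDA_OMEGA**2
    return lam2**2 / ((lam2 - M**2) ** 2 + lam2 * GAMMA_OMEGA_FF**2)


def omega_dalitz_shape(M):
    """Eq. (6.8) divided by Gamma(omega -> pi gamma), in 1/GeV."""
    if M <= 2.0 * M_E or M >= M_OMEGA - M_PI0:
        return 0.0
    d = M_OMEGA**2 - M_PI0**2
    kin = max((1.0 + M**2 / d) ** 2 - 4.0 * M_OMEGA**2 * M**2 / d**2, 0.0)
    return 2.0 * ALPHA / (3.0 * math.pi * M) * lepton_factor(M) * kin**1.5 * omega_formfactor_sq(M)


if __name__ == "__main__":
    for mass in (0.05, 0.2, 0.4, 0.6):
        print(f"M = {mass:4.2f} GeV  eta: {eta_dalitz_shape(mass):.3e}  omega: {omega_dalitz_shape(mass):.3e}")
```

Both spectra fall steeply with $M$ because of the $1/M$ photon propagator factor and vanish at the kinematical limits $M = m_\eta$ and $M = m_\omega - m_\pi$.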
with K "0.65 GeV, C "75 MeV . S S For gPce>e\ we use a similar expression as Eq. (6.6). However, in this case the pole approximation (6.7) is no longer valid since the vector meson pole occurs in the physical region of the dilepton spectrum (M(m ). Instead we use a formfactor of the form (6.9) with EY K "0.75 GeV, C "0.14 GeV , EY EY which reproduces the experimental data from [265] reasonably well. The direct decays of the vector mesons u, to e>e\ are taken as 1 m C C >\ dp 4 4 4C C (M)" n (M!m )#m C C dM 4 4 4 4 with C > \/C "2.5;10\ and C > \/C "7.1;10\. SC C S (C C (
(6.10)
The dilepton channels $\eta \to \gamma e^+e^-$, $\omega \to \pi e^+e^-$, $\omega \to e^+e^-$, $\phi \to e^+e^-$, $\pi\rho \to \phi \to e^+e^-$ and $K^+K^- \to \phi \to e^+e^-$ are treated perturbatively and are computed at the end of the transport calculation due to the ‘long’ lifetime of the mesons $\eta$, $\eta'$, $\omega$, $\phi$. The decays of the short-lived $\rho$ and $a_1$ mesons, however, have to be treated explicitly since these mesons change their properties rapidly (in case of the dropping mass scenario) and decay during the expansion phase of the system. The mesons $V = (\rho, a_1)$ stemming from a string decay with invariant mass $m_V^*$ at baryon density $\rho_B$ are selected by Monte Carlo according to the Breit-Wigner distribution

$$f(M) = N_V\, \frac{2}{\pi}\, \frac{M\, m_V^*\, \Gamma_V^*}{(M^2 - m_V^{*2})^2 + m_V^{*2}\, \Gamma_V^{*2}} , \tag{6.11}$$

where $N_V$ guarantees normalization to unity, i.e. $\int f(M)\, dM = 1$. The width $\Gamma_V^*(M)$ is approximated by
m* q 4 C* (M)"C , (6.12) 4 M q 4 where C is the full width at the mean resonance energy, q and q are the pion three-momenta in the 4 restframe of the resonance with mass M and m* , respectively. 4 Apart from string decay the mesons (o, a ) are abundantly also created from nn or no collisions, respectively. For the a formation cross section in the reaction n>o\Pa , n\o>Pa or for the o cross section the reaction n>n\Po we use the Breit—Wigner form [266,267]: 2J #1 4n m*C* 4 4 p (s)" 4 , (6.13) 4 2S#1 k (s!m*)#m*C* 4 4 4 where k is the pion momentum in the center-of-mass of the produced meson »"(o, a ), s is the invariant energy squared while J stands for the spin of the produced meson and S for the spin of 4 the collision partner of the pion, respectively. In the time reversed processes the vector mesons of actual mass M may decay in each timestep according to the probability P"exp(!C* (M)Dt/c) , (6.14) 4 where Dt is the actual timestep size and c the Lorentz factor of the resonance with respect to the calculational frame. The o decay to e>e\ with invariant mass M is calculated by integrating the equation (using the mass bin DM) 1 dNM> \ 1 ,M+R C C " C > \(M) M C C DM DM dt
G in time with
(6.15)
(6.16) C > \(M)"8.8;10\ M , M C C where N (M, t) is the number of o mesons of mass M at time t in the calculation. The factor M 8.8;10\ stems from the measured width of the o to e>e\ at resonance. It can be shown [245]
that the method described above leads to the same result as Eq. (6.1) if the o meson does not change its properties in time. We note that by treating explicitly the o formation by nn collisions the perturbative channel (6.1) has been switched off to avoid double counting. The a decay to ne>e\ with invariant mass M is calculated in analogy by integrating the equation dN?> \ 1 ,?R dC > \ * ? LC C (m , M) C C " (6.17) ? dM dt
dM G in time with
2a C 4m m dC > \ * C ? LC C (m , M)" ? LA 1! C 1#2 ? 3n M M M dM
4m*M M ! * ? , (6.18) 1# * (m !m) m !m ? L ? L C "6.4;10\ GeV, while N (t) is the number of a mesons at time t in the calculation. ? LA ? The nucleon—nucleon bremsstrahlung contribution is calculated within the soft-photon approximation [264], where the radiation from internal lines is neglected and the strong interaction vertex is assumed to be on-shell. In this case the strong interaction part and the electromagnetic part separate, however, the cross section for dileptons has to be corrected [268,269] by reducing the phase-space for the colliding particles in their final-state. The phase-space corrected soft-photon cross section then can be written as ;
dp a pN (s) R (s ) , " dy dq dM 6n Mq R (s) 2
R (s)"(1!(m #m )/s ,
(6.19) s!(m #m ) p(s) , pN (s)" s "s#M!2q (s, 2m where m is the mass of the charged accelerated particle, M is the dilepton invariant mass, q the energy, q the transverse momentum and y the rapidity of the dilepton pair. This approximation 2 works quite reasonable for pn bremsstrahlung at lower energies as shown in Ref. [269], however, overestimates the dilepton radiation in the general case substantially [270]. Fortunately, the bremsstrahlungs contributions are not very strong in relativistic heavy-ion collisions such that we will use (6.19) furtheron keeping in mind that it gives more likely an upper limit for this channel. For the D Dalitz-decay we use the NDc vertex L "eAIWM D@ C W , @I , where
(6.20)
3 mD#m , , g "!Ms #s #0.5s , f"! @I @I @I @I 2 ((mD#m )!M) , s "(q c !q cJg )c , s "(q PM !q PM Jg )c , (6.21) @I @ I J @I @I @ I J @I s "(q q !Mg )c , PM "(pD#p ) , @I @ I @I , and g"2.72 is the coupling constant fitted to the photonic decay width C (0)"0.72 MeV. C "g f g , @I @I
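Returning to the explicit treatment of the short-lived mesons, the Monte Carlo mass selection according to the Breit-Wigner distribution (6.11) and the timestep decay test based on Eq. (6.14) can be sketched as below. This is a minimal illustration under stated assumptions, not the transport code itself: the width is held constant instead of using the mass-dependent form of Eq. (6.12), the exponential of Eq. (6.14) is read as the probability that the resonance survives the timestep (so it decays with probability $1 - P$), $\hbar c$ is inserted explicitly to convert between GeV and fm/c, and the mass window and function names are hypothetical.

```python
import math
import random

HBARC = 0.1973269804  # hbar*c in GeV fm (assumed value)


def breit_wigner(M, m_star, gamma_star):
    """Shape of Eq. (6.11) without N_V; gamma_star is held fixed here."""
    return (2.0 / math.pi) * M * m_star * gamma_star / (
        (M**2 - m_star**2) ** 2 + (m_star * gamma_star) ** 2
    )


def sample_mass(m_star, gamma_star, m_min, m_max, rng):
    """Draw one mass from Eq. (6.11) on [m_min, m_max] by rejection sampling."""
    # For a constant width the shape has a single maximum, found from df/dM = 0.
    m_peak = math.sqrt(
        (m_star**2 + math.sqrt(4.0 * m_star**4 + 3.0 * (m_star * gamma_star) ** 2)) / 3.0
    )
    f_max = breit_wigner(min(max(m_peak, m_min), m_max), m_star, gamma_star)
    while True:
        M = rng.uniform(m_min, m_max)
        if rng.random() * f_max <= breit_wigner(M, m_star, gamma_star):
            return M


def survival_probability(gamma_star, dt, lorentz_gamma):
    """P of Eq. (6.14) with dt in fm/c and the width in GeV."""
    return math.exp(-gamma_star * dt / (lorentz_gamma * HBARC))


def decays_in_step(gamma_star, dt, lorentz_gamma, rng):
    """True if the resonance decays during the timestep dt."""
    return rng.random() > survival_probability(gamma_star, dt, lorentz_gamma)


if __name__ == "__main__":
    rng = random.Random(1)
    masses = [sample_mass(0.775, 0.118, 0.28, 1.5, rng) for _ in range(5)]
    print([round(m, 3) for m in masses])
    print(decays_in_step(0.118, 0.5, 1.2, rng))
```

Rejection sampling is used because it only needs the unnormalized shape, so the constant $N_V$ never has to be computed. With a mass-dependent width the envelope `f_max` would have to be determined numerically.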
In case of the o spectral function approach we cannot represent the density, temperature and momentum dependent spectral function by a set of ‘Monte Carlo’ testparticles, since their propagation would be inconsistent with the actual spectral density of the Dyson—Schwinger approach. We thus calculate the dilepton radiation from direct o mesons as 2M dN > \ J J "!Br(M) Im D (q , q; o ,¹) , M dM n
(6.22)
where D is the o-meson propagator in the hadronic medium, depending on the baryon density M o and temperature ¹ as well as on energy q and 3-momentum q,"q" in the local rest frame of the baryon current (‘comoving’ frame) while Br(M) is the branching ratio to dileptons which is fixed at the vaccum mass. This approximation should hold for configurations where the spectral width is broad and the o decays before the hadronic environment changes considerably. Note that when using vector meson dominance for the in-medium o decay the branching ratio Br(M)&(m /M) M in Eq. (6.22) where m is the bare o mass. However, it is not clear if vector meson dominance M should hold in case of a spectral function approach where the properties of the in-medium o-meson are dominated by oN* couplings. This problem is presently unsolved and needs further analysis. The invariant mass M in Eq. (6.22) is related to the o-meson 4-momentum in the nuclear medium as M"q!q ,
(6.23)
while Im D "(Im D*#2Im D2 ) M M M is spin averaged over the longitudinal and transverse part of the o-propagator:
(6.24)
Im R*2(q , q; o , ¹) M Im D*2(q , q; o ,¹)" . (6.25) M "M!(m )!R*2(q , q; o , ¹)" M M The o-meson self-energy R*2(q , q, o , ¹) itself is model dependent. Here we adopt the approach M from Rapp et al. [262] obtained by combining the effects of the following hadronic interactions as R*"R #R* #R* #R* M M , (6.26) M MLL ML? M)) M)) R2"R #R \#R2 #R2 #R2 M M . (6.27) M MLL M ML? M)) M)) The explicit evaluation of the various self-energy components is discussed in detail in Ref. [262]. As in Ref. [271] we show the results for an extended spectral function in the following respects: 1. in addition to p-wave oNPB interactions (B"N, D, N(1720) and D(1905)) also s-wave excitations to N(1520), D(1620) and D(1700) resonances are taken into account (the corresponding coupling constants are estimated from the oN partial decay width); 2. the coupling of the resonances to (virtual) photons is calculated within the improved vector dominance model of Kroll et al. [272] to avoid overestimates of the BPNc branching ratios. This leads to a modification in the coupling of the transverse part of the o propagator entering
Eq. (6.22) such that the combination (m) Im D2(q , q; o , ¹) is replaced by the ‘transition M M formfactor’ FM 2(q , q; o )"!Im [R2 #R2 #R2 ] "d !1"!Im R2 M)) M M MLL ML? M!R2 !R2 !R2 !r R2 M)) M MLL ML? d (q , q; o )" M M!(m)!R2 M M where
\
"d !r " , M
(6.28)
\
,
k r " f (m) M , M g m M denotes the ratio of the photon coupling to its value in the naive VDM. 3. the medium modifications of the two-pion self-energy R are extended to arbitrary 3MLL momentum within the model of Urban et al. [273]. The combined o-meson self-energy is then further constrained by experimental information on cp and cA absorption cross sections as described in Ref. [274]. With the aforementioned improvements (1) to (3) a satisfactory description of the photoabsorption data, which represent the MP0 limit of the dilepton regime, can be achieved on protons as well as on nuclei. In fact, the melting of the resonance structures (above the D mass), as seen in the photoabsorption data on nuclei [275], can be explained by the broadening of the higher resonances due to a strong coupling to the short-lived o-meson in the medium [274]. As an example — relevant for the heavy-ion reactions at SPS energies — we show in Fig. 6.3 the spin averaged !Im D (q , q, o , ¹) as a function of the invariant mass M and the 3-momentum M q for a temperature of 150 MeV at o "0, o , 2o and 3o , respectively. With increasing baryon density we find the o spectral function to increase substantially in width showing only minor structures at high density. Thus the lifetime of the o-meson in the nuclear medium becomes very short due to well established hadronic interactions such that its average propagation in space is limited to distances less than 0.5 fm already at normal nuclear matter density (even for high momenta). The pion annihilation channel n>n\PoPe>e\ is treated perturbatively in case of the spectral function approach again. The cross section is taken in line with Ref. [276] as 16na 1 (m )Im D (q , q, o , ¹) (6.29) p > \ > \(M)"! M L L C C kM M g MLL where k"(M!4m)/2 is the pion 3-momentum in the center-of-mass frame and a is the fine L structure constant. 
The $\rho\pi\pi$ coupling constant and bare $\rho$ mass are fixed to reproduce the pion electromagnetic formfactor in free space [262]. We note that there are quite a number of further channels (e.g. nonresonant $\pi\rho \to \pi e^+e^-$ [253]) that lead to dileptons since any reaction involving charged particles will have a dilepton component. Especially the issue of dilepton production for $M \ge 1$ GeV has been addressed in Refs. [277–279]. Furthermore, field theoretical calculations at finite temperature and off-equilibrium
Fig. 6.3. The (negative) imaginary part of the $\rho$ propagator averaged over the longitudinal and transverse components as a function of the invariant mass $M$ and the momentum $q$ for baryon densities of 0, $1\rho_0$, $2\rho_0$ and $3\rho_0$ and temperature $T = 150$ MeV within the approach of Rapp et al. [262]. Note the different absolute scales in the individual figures (taken from Ref. [271]).
[257] show sizeable corrections, especially for meson-meson channels. For brevity we omit an explicit discussion of these studies and refer the reader to the original literature.

6.2. BEVALAC/SIS energies

As mentioned before, the first studies on dilepton production at BEVALAC energies were performed by the DLS Collaboration [232–234], however, with limited statistics and resolution at
that time. The DLS Collaboration has continued their efforts and recently presented new data on pp and pd collisions from 1–4.9 GeV as well as for C + C and Ca + Ca at 1A GeV [280]. In fact, the dilepton yields in the new A + A measurements have increased by about a factor of 6–7 in comparison to the previous data set [232–234] due to an improvement of the DLS detector and a data analysis that corrects for dead time losses. The first generation of DLS data, however, was reasonably described by several theoretical models [125,281–283] without incorporating any medium effects. These models agree in the general tendency that dilepton pairs of invariant mass below 0.4 GeV are primarily produced from hadronic sources, such as the Dalitz decays of $\pi^0$, $\Delta$ and $\eta$ and pn bremsstrahlung, whereas for $M \ge 0.4$ GeV the dominant contributions stem from the $\pi^+\pi^-$ annihilation channel and direct decays of vector mesons ($\rho$, $\omega$). Since this led to a fair agreement with the old DLS data, the aforementioned theoretical models are not able to explain the new DLS data set [280] without incorporating new ingredients such as vector meson self-energies. In this Section we investigate in particular the issue of $\rho$-meson self-energies as advocated in the studies of Ref. [262].

The dynamical calculations are carried out as follows [140]: the time evolution of the proton-nucleus or nucleus-nucleus collision is described within the HSD transport approach without any dropping vector meson masses. Whenever a $\rho$-meson is produced in the course of the hadronic cascade (by baryon-baryon, meson-baryon or pion-pion collisions), its 4-momentum in the local rest frame of the baryon current is recorded together with the information on the local baryon density, the local ‘transverse’ temperature and its production source. We note that the definition of a local temperature is model dependent; here we have used a logarithmic fit to the transverse $p_T$ spectra of mesons at midrapidity.
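The functional form used for the logarithmic fit is not specified in the text. A minimal sketch, assuming a purely exponential transverse spectrum $dN/dp_T \propto \exp(-p_T/T)$ so that $\ln(dN/dp_T)$ is linear in $p_T$ with slope $-1/T$, could look as follows; the function name and this spectral shape are assumptions made for illustration only.

```python
import math


def fit_slope_temperature(p_t, yields):
    """Least-squares fit of ln(yield) = a - p_t / T; returns the slope parameter T.

    p_t    : transverse momenta (or transverse masses) in GeV
    yields : strictly positive spectrum values at these points
    Assumes an exponential spectrum, which is an assumption of this sketch.
    """
    n = len(p_t)
    logs = [math.log(y) for y in yields]
    mean_x = sum(p_t) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in p_t)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(p_t, logs))
    slope = sxy / sxx  # equals -1/T for an exponential spectrum
    return -1.0 / slope
```

Applied to the mesons in a local cell at midrapidity, the returned value would play the role of the local ‘transverse’ temperature recorded with each $\rho$-meson.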
Without going into a detailed discussion of this issue we note that the results for dilepton spectra do not change within the numerical accuracy when using a constant temperature $T = 70$ MeV at BEVALAC energies, since $\mathrm{Im}\, D_\rho$ depends rather weakly on $T$ (at fixed nucleon density). For an explicit representation of $\mathrm{Im}\, D_\rho$ within the spectral function approach of Rapp et al. [262] at $T = 70$ MeV and that of Peters et al. [284] we refer the reader to Ref. [140].

In Fig. 6.4 we present the calculated inclusive dilepton invariant mass spectra $d\sigma/dM$ for Ca + Ca (upper part) and C + C (lower part) at the bombarding energy of 1.0A GeV in comparison with the new experimental data of the DLS Collaboration [280], including the DLS acceptance filter (version 4.1) as well as a mass resolution $\Delta M/M = 10\%$. The thin lines indicate the individual contributions from the different production channels; i.e. starting from low $M$: Dalitz decay $\pi^0 \to \gamma e^+e^-$ (dashed line), $\eta \to \gamma e^+e^-$ (dotted line), $\Delta \to N e^+e^-$ (dashed line), $\omega \to \pi^0 e^+e^-$ (dot-dashed line), $N^* \to N e^+e^-$ (dotted line), proton-neutron bremsstrahlung (dot-dashed line), $\pi N$ bremsstrahlung (dot-dot-dashed line); for $M \approx 0.8$ GeV: $\omega \to e^+e^-$ (dot-dashed line), $\rho \to e^+e^-$ (dashed line), $\pi^+\pi^- \to \rho \to e^+e^-$ (dot-dashed line). The pion annihilation channel as well as the direct decay of the vector mesons have been calculated here with the ‘free’ $\rho$ spectral function. The full solid lines represent the sum of all sources.

As seen from Fig. 6.4 the new BEVALAC data cannot be properly described in terms of a ‘free’ $\rho$ spectral function, as argued before. The discrepancy between the data and the calculations for Ca + Ca as well as for C + C at $0.15 \le M \le 0.4$ GeV is about a factor of 3–5. At $M \sim m_\rho$ the calculation is within the error bars except for the last experimental point.

The description of the DLS data is slightly improved when including the $\rho$ spectral functions from Ref. [262] or Ref. [284]. In Fig. 6.5 we show the dilepton spectra (full solid line) for Ca + Ca
Fig. 6.4. The dilepton spectra (full solid line) for Ca + Ca (upper part) and C + C (lower part) at 1.0A GeV calculated with the ‘free’ $\rho$ spectral function including the DLS acceptance filter (version 4.1) as well as a mass resolution $\Delta M/M = 10\%$ in comparison with the data from Ref. [280]. The thin lines indicate the individual contributions from the different production channels including the DLS acceptance and mass resolution; i.e. starting from low $M$: Dalitz decay $\pi^0 \to \gamma e^+e^-$ (dashed line), $\eta \to \gamma e^+e^-$ (dotted line), $\Delta \to N e^+e^-$ (dashed line), $\omega \to \pi^0 e^+e^-$ (dot-dashed line), $N^* \to N e^+e^-$ (dotted line), proton-neutron bremsstrahlung (dot-dashed line), $\pi N$ bremsstrahlung (dot-dot-dashed line); for $M \approx 0.8$ GeV: $\omega \to e^+e^-$ (dot-dashed line), $\rho \to e^+e^-$ (dashed line), $\pi^+\pi^- \to \rho \to e^+e^-$ (dot-dashed line). The figures are taken from Ref. [140].
(upper part) and C + C (lower part) at 1.0A GeV calculated with the $\rho$ spectral function from Ref. [262] (l.h.s.) and that from Ref. [284] (r.h.s.) in comparison with the data [280]. The dashed lines correspond to the channel $\rho \to e^+e^-$ while the thin solid lines indicate the contribution from
Fig. 6.5. The dilepton spectra (full solid line) for Ca + Ca (upper part) and C + C (lower part) at 1.0A GeV calculated with the full $\rho$ spectral function in the extended approach of Ref. [262] (l.h.s.) and Peters et al. [284] (r.h.s.) in comparison with the data from Ref. [280]. The figures are taken from Ref. [140].
$\pi^+\pi^- \to \rho \to e^+e^-$. The full $\rho$ spectral functions lead to a shift of the pion annihilation contribution to the small invariant mass region as well as to a broadening of the $\rho$-meson contribution in both approaches. We note that in our transport calculations the pions are on-shell, thus we cannot probe the $\rho$ spectral function in the reactions $\pi^+\pi^- \to \rho \to e^+e^-$ below the two-pion threshold at $M < 2m_\pi$. The tails from the $\pi^+\pi^-$ annihilation and the $\rho$ meson at $M < 2m_\pi$, which are seen in Fig. 6.5, exist only due to the finite mass resolution of $\Delta M/M \approx 10\%$. We note that we discard meson-baryon (mB) and meson-meson (mm) bremsstrahlung channels as well as the Dalitz decays of the baryon resonances (stemming from secondary pion induced reactions) in order to avoid double counting when employing the full $\rho$ spectral functions.

Due to the on-shell treatment of pions in the transport approach the dilepton spectra for $M \approx 2m_\pi$ might not be properly described. In order to investigate the effects from an off-shell propagation of pions in the channel $\pi^+\pi^- \to \rho \to l^+l^-$, additional calculations have been performed within the thermodynamical approach used in Ref. [262] at SPS energies. For BEVALAC
Fig. 6.6. The dilepton spectra for Ca + Ca at 1.0A GeV calculated within the thermodynamical approach [262] (solid lines) and within the transport model (dashed lines). The figure is taken from Ref. [140].
energies the same procedure is used, however, with temperature and density profiles obtained from the HSD transport model for Ca + Ca at 1.0A GeV. The solid curves in Fig. 6.6 correspond to the sum of all channels and the $\rho$ decay (from $\pi^+\pi^-$ annihilation and direct production), respectively, while the dashed lines are from the transport model (l.h.s. of Fig. 6.5). As seen from Fig. 6.6 an off-shell pion propagation leads to a large enhancement in the $\rho$-decay channel, especially below the $2m_\pi$ threshold. However, considering all channels simultaneously, the maximum increase compared to the transport model is only 30%. Thus, all independent calculations give practically the same results; the new DLS data are underestimated at least by a factor of 2–3 for $0.15 \le M \le 0.4$ GeV even when taking into account the modification of the $\rho$-meson properties in the nuclear medium.

According to a general Brown-Rho scaling [3] $\eta$-mesons might also change their properties in the medium. We have examined the possibility of a dropping $\eta$ mass at BEVALAC/SIS energies as well as for $\eta$ photoproduction on nuclei by using a linear extrapolation of the $\eta$ mass,
$$m_\eta^* = m_\eta \left(1 - \alpha_\eta \frac{\rho_B}{\rho_0}\right), \qquad (6.30)$$
with a ‘large’ value of $\alpha_\eta \approx 0.18$, which might be considered as an upper limit. In fact the new DLS data can be reproduced quite well in this limit [140]; however, the transverse mass spectra for $\eta$-mesons are then overestimated considerably in comparison to the data from the TAPS Collaboration (cf. Section 4.1). Furthermore, the calculated angular distribution of $\eta$'s in the cms is found to be slightly anisotropic [140], but this anisotropy does not account for the dilepton spectra reported in Ref. [280].

Dilepton spectra for the systems Ca + Ca and C + C at 1A GeV have recently also been calculated independently by Ernst et al. within the UrQMD model [285]. In Fig. 6.7 we display
Fig. 6.7. Dilepton spectra for the systems d + Ca, $\alpha$ + Ca, Ca + Ca and C + C at 1A GeV calculated by Ernst et al. within the UrQMD model [285] in comparison to the DLS data [280]. The figures are taken from Ref. [285].
their result for the $e^+e^-$ spectra in comparison to the DLS data from [280]. Since their analysis also underestimates the data by a factor of 2–3 for invariant masses 0.3 GeV $\le M \le$ 0.6 GeV, these data might indicate new physics that is not incorporated in the dynamical calculations so far, e.g. the ‘subthreshold’ channel $\pi + N \to N^*(1520) \to N e^+e^-$. On the other hand, independent spectra with a higher mass resolution are urgently needed to resolve this ‘puzzle’; a task that will be taken up by the HADES Collaboration in the near future at the SIS.

6.3. SPS energies

Since no dilepton studies have been performed experimentally at AGS energies so far, we directly step to SPS energies where the situation appears more promising. As noted in the introduction of this Section all nucleus-nucleus data from the CERES [235,236] and HELIOS-3 Collaborations [237] are incompatible with a ‘free’ $\rho$ spectral function whereas the low mass dileptons can well be explained by a ‘hadronic cocktail’, i.e. the Dalitz and direct decays of mesons in free space. As an example for dilepton spectra at SPS energies Fig. 6.8 shows the spectral channel decomposition as a function of the $e^+e^-$ or $\mu^+\mu^-$ invariant mass $M$ for p + Be at 450 GeV and
Fig. 6.8. Upper part: the calculated dielectron spectra (full solid line) for p + Be at 450A GeV in comparison with the data from Ref. [235]. Lower part: the dimuon spectra (full solid line) for p + W at 200A GeV in comparison with the data from Ref. [237]. The thin lines indicate the individual contributions from the different production channels including the experimental acceptance and mass resolution; i.e. starting from low $M$: $\eta \to \gamma e^+e^-$ (dashed line), $\omega \to \pi^0 e^+e^-$ (dot-dashed line), $\eta' \to \gamma e^+e^-$ (long dashed line); for $M \approx 0.8$ GeV: $\omega \to e^+e^-$ (dot-dot-dashed line), $\rho \to e^+e^-$ (dashed line), $\pi^+\pi^- \to \rho \to e^+e^-$ (dot-dashed line); for $M \approx 1$ GeV: $\phi \to e^+e^-$ (dot-dashed line), $\pi\rho \to e^+e^-$ (dot-long dashed line), $K\bar{K} \to e^+e^-$ (dashed line). The figures are taken from Refs. [155,271].
p + W at 200 GeV in comparison to the data of the CERES [235] and HELIOS-3 Collaboration [237], respectively, including the experimental cuts in transverse mass and rapidity as well as the experimental mass resolution as in Refs. [248,256]. As is apparent from Fig. 6.8, the spectra for p + Be and p + W can be fully accounted for by the electromagnetic decays of the $\eta$, $\eta'$ and the vector mesons $\rho$, $\omega$ and $\phi$; contributions from meson-meson channels ($\pi^+\pi^-$, $K\bar{K}$, $\pi\rho$) are of minor importance here. Furthermore, since the $\rho$-meson essentially hadronizes in the vacuum, there is no noticeable difference between the summed spectra in p + A reactions when using alternatively the in-medium spectral function from Rapp et al. [262] (cf. Fig. 2 in Ref. [271]).

The situation changes appreciably when turning to nucleus-nucleus collisions. In Fig. 6.9 we compare the results of our calculation for the differential dilepton spectra for S + W at 200A GeV with the experimental data [237] employing the experimental cuts and mass resolution as before.
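Folding a calculated spectrum with a finite mass resolution, as done for all comparisons with data in this Section, can be sketched as follows. This is a minimal illustration only: the Gaussian shape of the resolution function, the reading of the relative resolution as a standard deviation that scales with the true pair mass, and all names are assumptions of the sketch rather than the procedure of the experimental filters.

```python
import math


def smear_spectrum(masses, dndm, rel_resolution):
    """Fold dN/dM, given on a uniform grid of positive masses, with a Gaussian.

    masses         : uniformly spaced invariant masses in GeV (all > 0)
    dndm           : spectrum values at these masses
    rel_resolution : Delta M / M, interpreted here as sigma / M (assumption)
    """
    dm = masses[1] - masses[0]  # uniform grid spacing
    smeared = []
    for m_obs in masses:
        total = 0.0
        for m_true, y in zip(masses, dndm):
            sigma = rel_resolution * m_true
            weight = math.exp(-0.5 * ((m_obs - m_true) / sigma) ** 2)
            weight /= math.sqrt(2.0 * math.pi) * sigma
            total += y * weight * dm
        smeared.append(total)
    return smeared
```

For a spectrum that vanishes below the two-pion threshold, the folded result acquires a tail below the threshold, which is the effect of the finite mass resolution on the pion annihilation channel.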
Fig. 6.9. Dimuon invariant mass spectra for central collisions of S + W at 200A GeV (full solid lines) in comparison to the data of the HELIOS-3 Collaboration [237]. The upper part shows the results of a calculation with a ‘free’ $\rho$ spectral function whereas the middle part includes the full $\rho$ spectral function (from Ref. [271]). The lower part corresponds to the calculations within the ‘dropping mass’ scenario (from Ref. [155]).
The upper part shows our results for a ‘free’ $\rho$ spectral function, the lower part the dilepton spectra in the ‘dropping mass’ scenario from Ref. [256], whereas the middle part represents the results including the full medium modifications of the $\rho$ propagator (from Ref. [271]). Even at forward rapidities as measured in central S + W (200A GeV) a sizeable contribution for invariant masses 0.3 GeV $\le M \le$ 0.7 GeV stems from the $\pi^+\pi^- \to \rho \to \mu^+\mu^-$ channel. Also in the mass regime around 1 GeV there is a significant contribution from $K\bar{K}$ and $\pi\rho$ annihilation to dimuons, which explains the enhanced $\phi/(\rho + \omega)$ ratio in S + W relative to p + W. In accordance with Refs. [256,286] the experimental spectrum cannot be properly described in terms of a ‘free’ $\rho$ spectral function: there is an excess for invariant masses $M \approx 0.8$ GeV and too little yield around $M \approx 0.5$ GeV. The description of the data is significantly improved when including the hadronic medium modifications of the $\rho$ propagator (middle part of Fig. 6.9) as well as for the ‘dropping mass’ scenario [256] when employing a $\rho$-mass shift according to Hatsuda and Lee [56]. However,
Fig. 6.10. Dielectron invariant mass spectra for mid-central collisions of Pb + Au at 160A GeV (full solid lines) in comparison to the data of the CERES Collaboration [287]. The upper part shows the results of a calculation with a ‘free’ $\rho$ spectral function whereas the middle part includes the full $\rho$ spectral function from Ref. [271]. The lower part corresponds to the calculations within the ‘dropping mass’ scenario from Ref. [155].
there is a significant difference: whereas in the ‘dropping mass’ scenario (lower part of Fig. 6.9) the dimuon yield [155] is underestimated for $M \ge 1.2$ GeV, the in-medium $\rho$ spectral function seems to give sufficient pairs in this region due to its substantial broadening in dense matter (cf. Fig. 6.9).

In Fig. 6.10 we also compare the results of our calculation for the differential dilepton spectra in Pb + Au at 160A GeV and $b = 5$ fm with the experimental data [287]. The $e^+e^-$ acceptance cuts in pseudorapidity ($2.1 \le \eta \le 2.65$), transverse momentum ($p_T \ge 0.175$ GeV/c) as well as in the opening angle of the $e^+e^-$ pair ($\Theta \ge 35$ mrad) are taken into account. Furthermore, the experimental
mass resolution has been included in evaluating the final mass spectrum, which is normalized by the number of charged particles $dn_{ch}/d\eta$ in the pseudorapidity bin $2.1 \le \eta \le 3.1$. Again, the upper part shows our results for the ‘free’ $\rho$ spectral function, the middle part the result when including the full medium modifications from Ref. [271], whereas the lower part represents the results within the ‘dropping mass’ scenario as in Ref. [155]. For Pb + Au at 160A GeV (and semicentral collisions) the dominant yield for invariant masses $0.3 \le M \le 0.7$ GeV stems from $\pi^+\pi^-$ annihilation. In the mass regime around 1 GeV there is again a large contribution from $K\bar{K}$ and $\pi\rho$ annihilation to dileptons. The ‘free’ spectral function approach underestimates the experimental data for $0.25 \le M \le 0.55$ GeV by up to a factor of 3, similar to the S + Au reaction at 200A GeV (cf. Fig. 6.2), which provides further evidence for in-medium $\rho$ modifications. However, the calculations with the full spectral function from Rapp et al. [262] as well as the calculations within the ‘dropping mass’ scenario are equally well compatible with the CERES data [287], as in the case of the S + W reaction at 200A GeV (cf. Fig. 6.9).

6.4. How to disentangle the different scenarios?

Since the present dilepton invariant mass spectra do not discriminate between the different models, we explore whether the momentum distributions of the lepton pairs give some further insight (as suggested by Friman and Pirner [261] and in Ref. [155]). Our calculated transverse momentum spectra, integrated over rapidity and the invariant mass range $0.3 \le M \le 0.7$ GeV, are shown in Fig. 6.11 for Pb + Au at 160A GeV and $b = 5$ fm using $q_T$-bins of 50 MeV/c. The dashed line in Fig. 6.11 displays the result with the ‘free’ spectral function while the solid line shows the result for the in-medium spectral function from Rapp et al. [262]. Apart from an overall increase of the spectrum we do not observe significant changes of the slope in $q_T$ as in case of the ‘dropping mass’ scenario (cf. Fig.
13 of Ref. [155]).

First transverse momentum spectra for Pb + Au at 160A GeV have been presented recently by the CERES Collaboration [287]; here we compare our calculated distributions in $q_T$ for the same cuts in mass as performed experimentally. The l.h.s. of Fig. 6.12 shows the $q_T$ distribution for $M \le 0.2$ GeV that dominantly stems from $\pi^0$ decay in the vacuum and thus reflects the $\pi^0$ transverse momentum spectrum. In fact, the observed dilepton yield is in reasonable agreement with the transport calculations that also provide a good description for the pion spectra (cf. Section 4.3). For invariant masses 0.2 GeV $\le M \le$ 0.6 GeV the calculation for a bare $\rho$ spectral function (dotted line) misses the spectrum by about a factor of 2; only for the ‘full’ spectral function from [262] (solid line) or within the ‘dropping mass’ scenario (dashed line) is the height of the spectrum reproduced. There is still some excess at very low $q_T$ experimentally which is not understood theoretically so far. The dilepton $q_T$ spectra for $M \ge 0.6$ GeV, furthermore, are well described within practically all scenarios. As noted in Ref. [271] the $q_T$ distributions are not very promising to disentangle the ‘dropping mass’ scenario from the spectral function approach.

Another signature studied in Ref. [155] is related to the total dilepton yield between the free $\omega$ and $\phi$ peaks where the dominant contribution stems from $\rho$ decays. In this regime the ‘dropping mass’ scenario leads to a reduction by about a factor of 2–3 as compared to a ‘free’ $\rho$ spectral function calculation for Pb + Pb at 160A GeV [155], while a broadening of the $\rho$ spectral function due to hadronic interactions shows an enhancement. This feature is demonstrated quantitatively in Fig. 6.13 for Pb + Au at $b = 5$ fm for the different scenarios employing an experimental mass
Fig. 6.11. The dielectron transverse momentum distribution for the invariant mass range $0.3 \le M \le 0.7$ GeV for Pb + Au at 160A GeV at $b = 5$ fm for the ‘free’ $\rho$ spectral function (dotted line) and with the in-medium spectral function (solid line) (from Ref. [271]).
Fig. 6.12. The transverse momentum spectra for Pb + Au at 160A GeV calculated for the ‘free’ $\rho$ spectral function (dotted lines), with the in-medium spectral function from Rapp et al. [262] (solid lines) and within the ‘dropping mass’ scenario (dashed lines) in comparison to the data of the CERES Collaboration [287].
resolution of $\Delta M = 10$ MeV. We find a difference by about a factor of 5 between the ‘dropping mass’ scenario (dotted line) and the hadronic spectral function approach (solid line) which might be accessible experimentally with the CERES detector at improved mass resolution. We note, furthermore, that in the spectral function approach the relative enhancement is most pronounced
Fig. 6.13. The dielectron invariant mass spectra for Pb + Au at 160A GeV at $b = 5$ fm for the ‘free’ $\rho$ spectral function (dashed line) and with the in-medium spectral function (solid line) taken from Ref. [271]. The dotted line corresponds to the ‘dropping mass’ scenario from Ref. [155].
at $M \approx 0.4$ GeV while in the dropping mass scheme we find a maximum enhancement at $M \approx 0.55$ GeV for this system.

Another signature for modified in-medium properties of the $\rho$-meson might be the differential dilepton yield as a function of the centrality of the nucleus-nucleus collision [255], where the latter can be related to the number of charged particles detected simultaneously with a dilepton event. Here we investigate this question by comparing the ‘free’ spectral function approach with the ‘dropping mass’ scheme (cf. Ref. [155]).

The number of charged particles $dn_{ch}/d\eta$ in the pseudorapidity bin $2.1 \le \eta \le 3.1$ for Pb + Au at 160A GeV is shown in Fig. 6.14 (l.h.s.). The open circles are the result of the computations with ‘free’ meson masses, while the solid circles correspond to calculations when including the in-medium modifications of the meson masses. In both cases the charged particle multiplicity decreases practically linearly with impact parameter. For peripheral collisions there is no essential difference between both schemes, as expected. For central collisions, however, the charged particle multiplicity in the ‘dropping meson mass’ scenario is slightly larger due to the reduction of the vector meson production thresholds, which enhances the respective particle formation cross sections at high baryon density. Especially the subsequent decay of $\rho$- and $\omega$-mesons to pions leads to a slightly larger number of pions in the final expansion phase. We note that due to the conservation of energy and momentum in each production event the enhanced number of vector mesons and antikaons at finite baryon density goes along with a lower number of those mesons that do not change their quasiparticle properties in the medium ($\pi$, $\eta$, $\eta'$ etc.).

Including the CERES acceptance cuts and mass resolution as described above, we show in Fig. 6.14 (r.h.s.) the dilepton yield integrated over the invariant mass range $0.3 \le M \le 1.0$ GeV,
$$\frac{dN/d\eta}{dn_{ch}/d\eta} = \int_{0.3\ \mathrm{GeV}}^{1\ \mathrm{GeV}} dM\, \frac{dn_{e^+e^-}/(dM\, d\eta)}{dn_{ch}/d\eta} \qquad (6.31)$$
as a function of the charged particle multiplicity $dn_{ch}/d\eta$ without (open circles) and with (solid circles) in-medium mass modification. At small charged particle multiplicity, which corresponds to
Fig. 6.14. L.H.S.: The charged particle multiplicity $dn_{ch}/d\eta$ for Pb + Au at 160A GeV as a function of the impact parameter $b$ for the ‘free meson mass’ scheme (dashed line) and the ‘dropping meson mass’ scenario (solid line). R.H.S.: The differential dilepton spectra for Pb + Au at 160A GeV integrated over the invariant mass region $0.3 \le M \le 1.0$ GeV as a function of the charged particle multiplicity $dn_{ch}/d\eta$ without (open circles) and with (solid circles) the in-medium mass modification of the mesons. The figures are taken from Ref. [155].
very peripheral collisions, the integrated dilepton yields coincide for both cases. With decreasing impact parameter the average baryon density and especially the pion density increases; as a consequence the contribution from pion annihilation to $\rho$ and subsequent decay to dileptons becomes larger. Since we gate on dileptons above the $\pi^+\pi^-$ annihilation threshold, the integrated dilepton spectra also increase with $dn_{ch}/d\eta$. Using ‘free’ meson masses we reach some plateau for low impact parameter, which implies that the $\pi\pi$ annihilation contribution, divided by the pion density, becomes approximately constant. However, for the ‘dropping mass’ scenario the absolute dilepton yield above 0.3 GeV from the directly produced $\rho$-mesons (at high initial baryon density) is smaller. Furthermore, due to an initially higher vector meson density the initial pion density (due to energy conservation) is reduced as compared to the ‘free’ mass scenario and the corresponding pion annihilation contribution is also lowered to some extent. All effects together lead to an approximately linear increase of the integrated dilepton yield with the charged particle multiplicity in the ‘dropping mass’ picture. Experimental data with sufficient statistics should allow one to disentangle the two schemes or disqualify the hadronic scenario as employed in the HSD transport approach. Unfortunately, the available data [287] are not accurate enough to distinguish between the two cases so far.

6.5. Systematics of dilepton production from AGS to SPS energies

A very detailed discussion of dilepton production in p + A collisions from 10 to 450 GeV as well as Pb + Pb collisions from 10 to 160A GeV has been presented in Ref. [155] with respect to integrated spectra, transverse momentum and rapidity distributions for the ‘free’ spectral function approach as well as the ‘dropping mass’ scheme. For the actual channel decompositions as well as the individual results we refer the reader to the original literature [155].
Here we only show the differential dilepton spectra, integrated over rapidity and transverse momentum, with respect to their ‘cocktail’ decomposition. The respective spectra for p + Be at 10, 50, 450 GeV and for Pb + Pb at 10, 50, 160A GeV for $b = 2$ fm are shown in Figs. 6.15, 6.16 and 6.17. For all cases we include a mass resolution of $\Delta M = 10$ MeV, which can be expected for future dilepton detector systems. As seen from Figs. 6.15, 6.16 and 6.17, there are no dramatic differences in the relative contribution of the various dilepton channels; the total yield (and especially the
$\phi$ decay) increases with bombarding energy quite smoothly. In case of central Pb + Pb collisions the dominance of the $\pi^+\pi^-$ annihilation component is most pronounced at 10A GeV in both scenarios and decreases with bombarding energy.

Since experimentally only the total spectra can be observed, we show in Fig. 6.18 the sum of all contributions for the ‘free’ mass (dashed lines) and the ‘dropping mass’ scenario (solid lines) for central collisions of Pb + Pb at different bombarding energies to demonstrate the influence of in-medium effects for the mesons. For a mass resolution $\Delta M = 10$ MeV one can expect to observe not only the ‘usual’ enhancement of the dilepton spectra by about a factor of 2 at invariant masses $0.3 \le M \le 0.6$ GeV, but also a sharp drop of the spectrum above the $\omega$ mass by a factor of 4–6 due to the shift of the $\rho$ contribution to lower invariant masses. Furthermore, the peak from the $\omega$ meson becomes more pronounced since the ‘background’ from the $\rho$ decay (either from $\pi B$, $BB$ or $\pi\pi$ collisions) is significantly reduced. In the $\phi$ mass region, furthermore, we find a small increase of the yield for the ‘dropping mass’ scheme at all energies from 10 to 160A GeV. Again, the relative modifications of the total spectrum are most pronounced at 10A GeV, which suggests performing experimental dilepton studies also in the 5–40A GeV range since here the space-time integral for high baryon density has a maximum (cf. Fig. 4.17).

6.6. Direct photons

Directly radiated thermal photons have been considered as an independent probe to study the hot and dense nuclear matter produced in ultrarelativistic nucleus-nucleus collisions [228,288]. However, an experimental measurement of direct photons is a quite complicated task due to the background from hadronic decays. Only a few years ago first upper limits for direct photon spectra were reported by the WA80 Collaboration [153] for S + Au at 200A GeV.
In the latter study photons from $\pi^0$ and $\eta$ Dalitz decays have been subtracted from the total photon signal; their spectra thus can be interpreted as upper bounds for the direct photon cross section. A first calculation of the direct photon radiation from a quark-gluon-plasma (QGP) was performed a couple of years ago in Ref. [226]; various hydrodynamical model calculations followed (cf. [153,289] and references therein), where the radiation from a QGP [290] has been compared to the radiation from a pure hadron gas scenario [291]. The comparison of the various models with the WA80 data [153], however, has demonstrated only the inapplicability of hadronic thermal models with high initial temperature. In this respect it is useful to compare the WA80 upper limits with the results of a nonthermal model, such as the HSD transport approach, to find out possible conflicts with the hadronic scenario employed.

In our analysis we take into account the following processes for photon production: $a_1 \to \pi\gamma$, $\omega \to \pi^0\gamma$, $\eta' \to \rho\gamma$ or $\omega\gamma$. The $\pi^0$ and $\eta$ decays are already subtracted experimentally and thus do not have to be taken into account in our calculations.
Fig. 6.15. The differential dilepton spectra for p + Be at 10, 50, and 450 GeV. The thick solid lines display the sum of all channels whereas the individual contributions are given in terms of the thinner lines. The mass resolution employed is $\Delta M = 10$ MeV. The figures are taken from Ref. [155].
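The treatment of photon production in the HSD approach is quite similar to that for dileptons, however, using the branching ratios:

$$\frac{\Gamma_{a_1 \to \pi\gamma}}{\Gamma_{a_1}} \approx 1.6 \times 10^{-3}, \quad \frac{\Gamma_{\omega \to \pi\gamma}}{\Gamma_{\omega}} = 0.085, \quad \frac{\Gamma_{\eta' \to \rho\gamma}}{\Gamma_{\eta'}} = 0.3, \quad \frac{\Gamma_{\eta' \to \omega\gamma}}{\Gamma_{\eta'}} = 0.03. \qquad (6.32)$$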
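The role of the branching ratios in Eq. (6.32) can be illustrated by a deliberately simplified estimate that multiplies a given number of mesons by the radiative branching ratio of each channel. This is a sketch under stated assumptions and not the transport treatment itself: the dictionary layout, the function name and the meson counts are hypothetical, and the exponent of the $a_1$ ratio is taken here as $10^{-3}$.

```python
# Radiative branching ratios as quoted in Eq. (6.32).
# The a1 value is read as 1.6e-3 (assumed exponent in this sketch).
BRANCHING_RATIOS = {
    ("a1", "pi gamma"): 1.6e-3,
    ("omega", "pi gamma"): 0.085,
    ("eta_prime", "rho gamma"): 0.3,
    ("eta_prime", "omega gamma"): 0.03,
}


def expected_photons(meson_counts):
    """Sum N_meson * BR over all radiative channels.

    meson_counts : dict mapping a meson name ("a1", "omega", "eta_prime")
                   to the number of decaying mesons (hypothetical input)
    """
    total = 0.0
    for (meson, _channel), ratio in BRANCHING_RATIOS.items():
        total += meson_counts.get(meson, 0.0) * ratio
    return total
```

For instance, 100 decaying $\omega$ mesons would contribute 8.5 photons on average in this simple counting.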
We discard baryon-baryon, meson-baryon and meson-meson bremsstrahlung in our present study since these channels were found to be of minor importance for dilepton production in case of
Fig. 6.16. The differential dilepton spectra for Pb + Pb at 10, 50, 160A GeV at $b = 2$ fm within the ‘free meson mass’ scenario. The mass resolution employed is $\Delta M = 10$ MeV. The figures are taken from Ref. [155].
S + Au at 200A GeV in Ref. [248]. Furthermore, the soft-photon approximation employed in [248] is questionable at these energies according to the studies in Refs. [270,292].

In Fig. 6.19 (l.h.s.) we show the result of our calculations for photon production in central S + Au collisions at 200A GeV in comparison with the experimental data [153]. The computations were performed at $b = 2$ fm including the experimental rapidity cut $2.1 \le y \le 2.9$. As seen from Fig. 6.19 the main contribution in our calculation comes from $\eta' \to \rho/\omega\,\gamma$ decays at low $p_T \le 0.4$ GeV and from $\omega \to \pi^0\gamma$ for $p_T \ge 0.4$ GeV; the solid line is the sum of all contributions, which is still well below the upper limits of WA80.

For the process $a_1 \to \pi\gamma$ we explore again both scenarios, i.e. without (dashed line) and with (dashed-dotted line) dropping mass modification [155]. In fact, the dropping of the $a_1$ mass leads to a sizeable enhancement of $a_1$ mesons in the reaction zone and thus to a significant enhancement of
Fig. 6.17. The differential dilepton spectra for Pb + Pb at 10, 50, 160A GeV at $b = 2$ fm within the ‘dropping meson mass’ scenario. The mass resolution employed is $\Delta M = 10$ MeV. The figures are taken from Ref. [155].
the photon spectra from the $a_1$ as pointed out in Refs. [266,289]. However, the relative contribution from the $a_1$ decay is still far below the ‘background’ from $\eta'$ and $\omega$ decays. Thus, even in case of the ‘dropping meson masses’ we do not get into any conflict with the upper limits imposed by the WA80 data [153].

Our studies on direct photon production have independently been repeated by Li et al. in Ref. [293]. In addition to our calculations they also include nonresonant photon production channels such as $\pi + \rho \to \pi + \gamma$; thus their studies are more complete than ours. In Fig. 6.19 (r.h.s.) we show their result for the direct photon spectrum in comparison to the upper limit imposed by the WA80 Collaboration [153]. Here the dashed line corresponds to the ‘free mass’ scenario whereas the solid line is obtained in the ‘dropping mass’ scheme. In fact their photon yield is higher than that from Ref. [155] due to the additional channels included, however, still well below the upper limits such
Fig. 6.18. The differential dilepton spectra for Pb+Pb at 10, 50, 160A GeV at b = 2 fm for the ‘free’ ρ spectral function (dashed lines) and for the ‘dropping mass’ scenario (solid lines) including a mass resolution ΔM = 10 MeV. The figures are taken from Ref. [155].
that all dilepton transport calculations performed so far are not in conflict with the photon data. On the other hand, the calculations by Srivastava and Sinha [290] (long dashed line in Fig. 6.19 (r.h.s.)) overestimate the upper limits by more than two orders of magnitude and thus can clearly be ruled out. The direct production of photons has been studied, furthermore, in great detail in Ref. [294] in order to distinguish a purely hadronic scenario from one with a QGP. Their results (shown in Fig. 6.20), however, demonstrate that the production of direct photons should be even slightly suppressed in a QGP scenario compared to a hadron gas calculation, such that it appears very questionable whether direct photons are suitable to prove the presence of an intermediate QGP phase. In summarizing this Section we point out that the present dilepton data from the DLS, CERES and HELIOS Collaborations indicate without doubt that something is happening with the ρ
Fig. 6.19. L.H.S.: The calculated differential photon multiplicity for central S+Au collisions at 200A GeV in comparison to the upper limits from the WA80 Collaboration [153]; all contributions (solid line). The figure is taken from Ref. [155]. R.H.S.: The differential photon multiplicity for central S+Au collisions at 200A GeV calculated by Li et al. in Ref. [293] in comparison to the upper limits from the WA80 Collaboration [153]. Note the different scales in the two figures.
Fig. 6.20. Transverse momentum distribution of photons at rapidity y = 0 for a gas containing all known resonances with masses below 2.5 GeV with (full line) and without (dashed line) a phase transition to a QGP. The figure is taken from Ref. [294].
in the dense medium. At BEVALAC energies the detailed transport studies presently are unable to describe the data when proper constraints on the individual sources from independent experiments are imposed. On the other hand, at SPS energies both scenarios, the ‘dropping mass’ scheme in the sense of Brown–Rho scaling or according to the QCD sum rule analysis of Hatsuda and Lee as well as the hadronic spectral function approach [262], describe the data equally well. We have argued that data with improved mass resolution might be able to distinguish the different schemes, especially when looking in the mass regime between the ω- and the φ-meson. Direct photons do not appear as promising observables for an intermediate QGP phase. Future data with high mass resolution and various cuts on centrality should tell in more detail what might actually be going on.
7. Charmonium production and suppression

The search for a restoration of chiral symmetry at high baryon density and temperature or for a phase transition to the quark-gluon plasma (QGP) continues. Matsui and Satz [295] have proposed that a suppression of the J/Ψ yield in ultra-relativistic heavy-ion collisions is a plausible signature for the formation of the quark-gluon plasma because the J/Ψ should dissolve in the QGP due to color screening [295]. This suggestion has stimulated a number of heavy-ion experiments at CERN SPS to measure the J/Ψ via its dimuon decay. Indeed, these experiments have shown a significant reduction of the J/Ψ yield when going from proton–nucleus to nucleus–nucleus collisions [239]. Especially for Pb+Pb at 160A GeV an even more dramatic reduction of J/Ψ has been reported by the NA50 Collaboration [240,296]. To interpret the experimental results, various models based on J/Ψ absorption by hadrons have also been proposed. In Ref. [297] Gerschel and Hüfner have shown within the Glauber model that the observed suppression of J/Ψ in nuclear collisions is consistent with the hadronic absorption scenario if one assumes a J/Ψ-nucleon absorption cross section of about 6–7 mb. Similar but more recent analyses by the NA50 Collaboration [296] and Kharzeev [298] have led to the same conclusion. However, this model has failed to explain the ‘anomalous’ suppression reported in central Pb+Pb collisions, thus leading to the suggestion of a possible formation of a quark-gluon plasma in these collisions [296,298–300]. On the other hand, Gavin et al. [301,302], based also on the hadronic absorption model, have found that although J/Ψ absorption by nucleons is sufficient to account for the measured total J/Ψ cross sections in both proton–nucleus and nucleus–nucleus collisions, it cannot explain the transverse energy dependence of J/Ψ suppression in nucleus–nucleus collisions.
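The hadronic absorption picture referred to above can be illustrated with a deliberately simplified estimate. The sketch below is not the Glauber calculation of Refs. [297,298]: it assumes a constant nuclear density `RHO_0` = 0.16 fm⁻³ and a hypothetical straight path length through nuclear matter, and only shows how an absorption cross section in mb translates into an exponential survival factor.

```python
import math

MB_TO_FM2 = 0.1  # unit conversion: 1 mb = 0.1 fm^2
RHO_0 = 0.16     # assumed constant nuclear density in fm^-3 (not from the text)

def survival_probability(sigma_abs_mb, path_length_fm, density=RHO_0):
    """Exponential survival factor exp(-rho * sigma * L) for a straight path."""
    return math.exp(-density * sigma_abs_mb * MB_TO_FM2 * path_length_fm)

# Hypothetical path lengths (fm); 6 and 7 mb bracket the range quoted in the text.
for length in (2.0, 5.0, 8.0):
    s6 = survival_probability(6.0, length)
    s7 = survival_probability(7.0, length)
    print(f"L = {length:.0f} fm: S(6 mb) = {s6:.2f}, S(7 mb) = {s7:.2f}")
```

A full Glauber treatment averages over production points and impact parameters with realistic density profiles; the single-path form is only meant to make the scale of the effect visible.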
To account for the nucleus–nucleus data Gavin et al. have additionally introduced the absorption on mesons (‘comovers’) with a cross section of about 3 mb. A similar model has also been proposed by Capella et al. [303] to describe the J/Ψ and Ψ′ suppression in nucleus–nucleus collisions. On the other hand, Kharzeev et al. [304] claim the ‘comover’ absorption model to be inconsistent when considering all data on J/Ψ production simultaneously. In all these studies the dynamics of the collisions is based on the Glauber model, so a detailed space and time evolution of the colliding system is not included. In particular, the transverse expansion of the system and the finite hadron formation time are ignored in the Glauber models. Especially for nucleus–nucleus collisions involving heavier beams, such as the Pb+Pb collisions at 160A GeV, the dynamics is more subtle than in proton and S induced reactions. Thus dynamical
models are needed to complement our information on the reaction dynamics. In this respect Loh et al. [305] have investigated the J/Ψ dissociation in a color electric flux tube in a semiclassical model based on the Friedberg-Lee color dielectric Lagrangian. They find that the cc̄ dissociation time is of the order of 1 fm/c. The first transport theoretical analysis of J/Ψ production and absorption has been performed in Ref. [306], where the cc̄ production was based on the LUND string formation and fragmentation model [90]. Indeed, substantial differences to the Glauber approaches have been found due to a finite formation time of the cc̄ pair and its subsequent interactions with baryons and mesons. However, due to the low statistics achieved in the numerical calculations, a definite conclusion about the charmonium suppression could not be obtained in the latter study since alternative production and absorption schemes also could lead to different absorption rates. We here discuss the extended studies from Ref. [307] on cc̄ production and suppression at SPS energies within the HSD approach using different production and absorption models. As shown before, the HSD transport approach provides a more realistic description of the heavy-ion reaction dynamics than the models used in Refs. [296–299,301,302]. Within this approach we can check if the cc̄ pair might be destroyed by nucleons before the mesons are produced, as argued in Ref. [299] (cf. also Refs. [298,300]). Furthermore, we can test if a finite lifetime of the cc̄ pre-resonance system as suggested by Kharzeev et al. [298,308] comes in conflict with the data, since according to Ref. [309] the J/Ψ-meson cross section might be negligibly small in hadronic matter due to the small size of the J/Ψ and its large mass gap from open charm.
However, it is expected that the cc̄ pair is first produced in a color-octet state together with a gluon (‘pre-resonance state’) and that this more extended configuration has a larger interaction cross section with baryons and mesons before the J/Ψ singlet state finally emerges. We will not address the question of whether the magnitude of the cc̄-hadron cross sections used is correct or can be justified by nonperturbative QCD, and concentrate on the question of whether specific reaction models can be ruled out in comparison to the data available so far. Before discussing the question of charmonium production we show in Fig. 7.1 the time evolution of the baryon density ρ_B(x, y, z; t) as a function of z and time t for x = y = 0 in a central collision of Pb+Pb at 160A GeV in the nucleon–nucleon center-of-mass frame. The two Pb ions start to overlap at t ≈ 1 fm/c, get compressed up to a maximum density of about 2.5 fm⁻³ at t ≈ 2.5 fm/c and expand in the longitudinal (z) direction later on, indicating a sizeable amount of transparency. We note that the space-time evolution in Fig. 7.1 is controlled by the experimental rapidity distributions dN/dy for protons and negatively charged particles from NA49 [154] (cf. Section 4.3); the streaming of hadrons, or more precisely their distribution in velocity β, i.e. dN/dβ, can also be directly extracted from the experimental data using dN/dβ = dN/dy(β) · (1/2)(1/(1+β) + 1/(1−β)). The produced mesons in this reaction (in a central cylinder of radius R = 3 fm and volume V ≈ 15 fm³) appear at about t ≈ 2 fm/c, as can be extracted from Fig. 7.2 (lower part), where the densities of pions, ρ-, ω- and η-mesons are displayed separately as a function of time. The maximum in the meson density (≈ 1 fm⁻³ for pions) at t ≈ 3.2 fm/c appears with a delay of t_F = 0.7 fm/c with respect to the maximum baryon density (cf. Fig. 7.1). For comparison we also show in Fig. 7.2 the meson densities for a central S+U collision at 200A GeV within the same volume.
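The change of variables from rapidity to velocity mentioned above can be checked numerically. The following is a minimal sketch, not the authors' code: the Gaussian dN/dy and its parameters are hypothetical placeholders rather than NA49 data, and the Jacobian dy/dβ = 1/(1 − β²) is written in partial-fraction form with its explicit factor 1/2.

```python
import math

def jacobian_dy_dbeta(beta):
    """dy/dbeta for y = artanh(beta), written in partial-fraction form."""
    return 0.5 * (1.0 / (1.0 + beta) + 1.0 / (1.0 - beta))

def to_velocity_distribution(ys, dn_dy):
    """Map samples of dN/dy at rapidities ys to (beta, dN/dbeta) samples."""
    betas = [math.tanh(y) for y in ys]
    dn_dbeta = [f * jacobian_dy_dbeta(b) for f, b in zip(dn_dy, betas)]
    return betas, dn_dbeta

def trapezoid(xs, fs):
    """Trapezoidal integral of samples fs on a (possibly non-uniform) grid xs."""
    return sum(0.5 * (fs[k] + fs[k + 1]) * (xs[k + 1] - xs[k])
               for k in range(len(xs) - 1))

# Hypothetical Gaussian rapidity distribution (placeholder, not measured data).
ys = [-3.0 + 0.01 * k for k in range(601)]
dn_dy = [80.0 * math.exp(-0.5 * (y / 1.2) ** 2) for y in ys]

betas, dn_dbeta = to_velocity_distribution(ys, dn_dy)
n_from_y = trapezoid(ys, dn_dy)
n_from_beta = trapezoid(betas, dn_dbeta)
print(f"N from dN/dy: {n_from_y:.2f}, N from dN/dbeta: {n_from_beta:.2f}")
```

Because the transformation only relabels the same particles, the integrated yield should agree in both variables up to discretization error, which is a convenient consistency check on the Jacobian.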
When comparing to the central Pb+Pb collision at 160A GeV we observe a lower meson density for the S+U case in the central overlap region; especially the ρ-meson density is lower by about a factor of 2. Nevertheless, high baryon and meson densities are encountered in these
Fig. 7.1. The baryon density ρ_B(x = 0, y = 0, z; t) for a Pb+Pb collision at 160A GeV and impact parameter b = 1 fm within the HSD transport approach. The figure is taken from Ref. [307].
reactions for t ≥ 2 fm/c, where a cc̄-pair (which can be produced from 1 to 3.5 fm/c) has to pass through. This situation is summarized schematically in Fig. 7.3 for a p+Pb (upper part) and S+U collision at 200A GeV (middle part) as well as for a central Pb+Pb collision at 160A GeV (lower part) for freely streaming baryons (thick lines). In all cases the initial string formation space-time points are indicated by the full dots; the mesons (indicated by arrows) hadronize after a time delay t_F ≈ 0.7 fm/c as shown by the first hyperbola. A cc̄-pair produced in the initial hard nucleon–nucleon collision cannot be absorbed by mesons in the dark shaded areas in space and time; however, in case of Pb+Pb, where cc̄ pairs should be produced within the inner rectangles, a sizeable fraction will also be produced in a dense mesonic environment. This fraction of pairs produced at finite meson density for S+U is much reduced, as can be seen from Fig. 7.3 (middle part). The upper hyperbolas in Fig. 7.3 represent the boundaries for the appearance of mesons from the second interaction points (full dots), which appear somewhat later in time; they stand for a representative further nucleon–nucleon collision during the reaction. Following Refs. [299,300], we explore if the energy density in these reactions might be large enough to create a quark-gluon plasma in some region of space and time. In this respect we show in Fig. 7.4 as a function of time the volume with energy density above 2, 3, and 4 GeV/fm³ for S+W at 200A GeV and Pb+Pb at 160A GeV for a reaction at b = 2 fm. In these calculations the energy
Fig. 7.2. The density of π-, ρ-, ω- and η-mesons in a central cylinder of radius R = 3 fm and volume V ≈ 15 fm³ for a S+U collision at 200A GeV and a Pb+Pb collision at 160A GeV for b = 1 fm within the HSD transport approach. The figure is taken from Ref. [307].
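density is computed as

$$E(x) = \bigl(\Delta V(x)\,\gamma(x)\bigr)^{-1} \left[\, \sum_{i \in \Delta V} \sqrt{p_i^2 + m_i^2} \;+\; \sum_{j \in \Delta V} \sqrt{p_j^2 + m_j^2} \,\right] , \tag{7.1}$$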
where all mesons, but only baryons that have scattered at least once, have been counted. In Eq. (7.1), γ(x) is the Lorentz factor associated with the cell ΔV(x), which is taken to be 1 fm³ in the nucleus–nucleus center of mass. It is seen that in Pb+Pb collisions there is an appreciable volume of high energy density above 3 GeV/fm³ (≈ 300 fm³) or even 4 GeV/fm³ (≈ 200 fm³), as claimed in Ref. [300], for time scales of a few fm/c, where a quark-gluon plasma (QGP) might be formed in the reaction. These regions of high energy density are about 20% or 13%, respectively, of the volume of Pb and, from our point of view, should not be adequately described by a hadronic transport theory. The actual volume of high energy density (above 4 GeV/fm³) for S+W is about 25 fm³, i.e. 11% of the volume of S, which is only slightly lower than in case of Pb+Pb. Thus our calculations, which are in line with experimental hadronic rapidity and transverse momentum distributions for both systems, do not indicate a sizeable increase in the energy density for the heavier system. The critical energy density for a phase transition to the QGP is not accurately known (cf. Section 2); the value of 2–4 GeV/fm³ is chosen for convenience.
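The cell energy density of Eq. (7.1) and the volume fractions quoted above can be illustrated as follows. This is a sketch under stated assumptions rather than the HSD implementation: the cell content is a hypothetical list of hadron momenta, and the nuclear volumes assume a sharp sphere with R = r0 A^(1/3) and r0 = 1.2 fm, a value that is not given in the text.

```python
import math

def cell_energy_density(hadrons, cell_volume_fm3=1.0, gamma=1.0):
    """Energy density (GeV/fm^3) of one cell, following the form of Eq. (7.1).

    hadrons: iterable of (px, py, pz, m) in GeV for the hadrons counted in the cell.
    """
    total_energy = sum(math.sqrt(px * px + py * py + pz * pz + m * m)
                       for px, py, pz, m in hadrons)
    return total_energy / (cell_volume_fm3 * gamma)

def sphere_volume(mass_number, r0_fm=1.2):
    """Volume (fm^3) of a sharp sphere with R = r0 * A^(1/3); r0 is an assumption."""
    radius = r0_fm * mass_number ** (1.0 / 3.0)
    return 4.0 / 3.0 * math.pi * radius ** 3

# Hypothetical cell content: two pions and one nucleon (momenta and masses in GeV).
cell = [(0.3, 0.0, 0.4, 0.138), (0.0, 0.2, -0.5, 0.138), (0.1, 0.1, 1.0, 0.938)]
print(f"cell energy density: {cell_energy_density(cell):.2f} GeV/fm^3")

# High-density volumes quoted in the text, relative to the nuclear volumes.
for label, volume, a in [("Pb, above 3 GeV/fm^3", 300.0, 208),
                         ("Pb, above 4 GeV/fm^3", 200.0, 208),
                         ("S, above 4 GeV/fm^3", 25.0, 32)]:
    fraction = 100.0 * volume / sphere_volume(a)
    print(f"{label}: {fraction:.0f}% of the nuclear volume")
```

With r0 = 1.2 fm the sharp-sphere estimate gives fractions close to the 20%, 13% and 11% stated in the text, which suggests a comparable radius convention was used there.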
Fig. 7.3. Schematic representation of a p+Pb collision at 200 GeV (upper part), a S+U collision at 200A GeV (middle part) and a Pb+Pb collision at 160A GeV (lower part) in space–time. The full dots represent early hard collision events (for Drell–Yan and cc̄-pairs) while mesons (π, η, ρ, ω, etc.; arrows) only appear after a respective formation time t_F ≈ 0.7 fm/c. The overlap area (inner rectangle) specifies the space–time region of hard production events. The figure is taken from Ref. [307].
7.1. Elementary production cross sections

Since the probability of producing a cc̄ or Drell–Yan pair is very small, a perturbative approach is used for technical reasons, as in the case of kaons and antikaons at SIS energies (cf. Section 5.4). Whenever two nucleons collide, a cc̄-pair is produced with a probability factor W, which is given by the ratio of the J/Ψ (or Ψ′) to NN cross section at a center-of-mass energy √s of the baryon–baryon collision,

$$W = \frac{\sigma_{BB \to J/\Psi + X}(\sqrt{s})}{\sigma_{BB \to BB + X}(\sqrt{s})} . \tag{7.2}$$

The parametrization used for the J/Ψ cross section is

$$\sigma_{BB \to J/\Psi + X}(\sqrt{s}) = 110 \left(1 - \frac{a}{\sqrt{s}}\right) \ldots \ [\mathrm{nb}] \tag{7.3}$$
Fig. 7.4. Time evolution of the reaction volume with an energy density above 2, 3, and 4 GeV/fm³ for S+W at 200A GeV and Pb+Pb at 160A GeV within the HSD transport approach. The figure is taken from Ref. [306].
with a = 7.47 GeV, which is displayed in Fig. 7.5 (solid line) in comparison to the experimental data from Ref. [310]. For the systems to be studied in the energy regime √s ≤ 23 GeV, the estimated uncertainty of our parametrization is about 20%. We note that the parametrization used conventionally [310],

$$\sigma_{BB \to J/\Psi + X}(\sqrt{s}) = d \left(1 - \frac{c}{\sqrt{s}}\right)^{12} \ [\mathrm{nb}] , \tag{7.4}$$

with c = 3.097 GeV and d = 2 · 37/0.0597, is in fair agreement with Eq. (7.3) in the energy range of interest (dashed line). The rapidity distribution of the cc̄-pair is approximated by a Gaussian in the nucleon–nucleon center-of-mass of width σ ≈ 0.6, while the transverse momentum distribution is fitted to experimental data (see below). For Ψ′ production we employ the same model but scale the experimental cross section by a factor of 0.122 relative to the J/Ψ production cross section. In extension to Ref. [306] the Drell–Yan process is taken into account explicitly. The generation of Drell–Yan events was performed with the PYTHIA event generator [311] (version 5.7) using GRV LO (leading order) or MRS A (next to leading order) structure functions from the PDFLIB
Fig. 7.5. The parametrization (7.3) for the elementary J/Ψ cross section in pp collisions (solid line) in comparison to the experimental data from Ref. [310]. The dashed line shows the parametrization (7.4). The figure is taken from Ref. [307].
package [312] with k_T = 1.0 GeV. This yields a dimuon cross section of 270 pb in pp collisions at 200 GeV and 261 pb in pn collisions, respectively, as in [239,313]. According to our dynamical prescription the Drell–Yan pairs can be created in each hard pp, pn, np or nn collision (√s ≥ 10 GeV). Since PYTHIA calculates the Drell–Yan process in leading order only (using GRV LO structure functions) we have multiplied the Drell–Yan yield for NN collisions by a K-factor of 2.0 (cf. Refs. [239,313]). In case of MRS A structure functions, which include NLO corrections, a K-factor of about 1.6 had to be introduced (cf. [310]). The energy distribution of the hard NN collisions dN/d√s for nucleus–nucleus collisions in our transport approach shows a peak around √s₀ = (2m_N(T + 2m_N))^{1/2}, where T is the kinetic energy per nucleon in the laboratory frame. Thus the main contribution to the dimuon yield, summed over all NN events, comes from NN collisions with √s ≈ √s₀. However, there are also Drell–Yan pairs from NN collisions with √s larger or smaller than √s₀. We have compared our results within the production scheme given above with that used in Refs. [239,240,313], where the Drell–Yan yield from p+A and A+A collisions is calculated as the isotopical combination of the yield from pp and pn at fixed √s scaled by A_P × A_T. We found that the variation from the scheme used in Refs. [239,240,313] is less than 10%. We, furthermore, note that the difference in the dimuon spectrum using different structure functions (GRV LO or MRS A) is less than 5%. In the present analysis we have discarded dimuon production from open charm channels because a recent analysis by Braun-Munzinger et al. [314] on the basis of the same PYTHIA event generator has shown that the open charm contributions at low and high invariant masses are of minor importance.
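The kinematics above can be made concrete with a short calculation. The sketch below evaluates the nominal nucleon–nucleon energy for the two beam energies discussed in this section and, for orientation, the conventional J/Ψ parametrization of Eq. (7.4). The nucleon mass value, the function names, and the exponent 12 (taken here as the commonly quoted form of that parametrization) are assumptions of this illustration, not values confirmed by the text.

```python
import math

M_N = 0.938  # assumed nucleon mass in GeV

def sqrt_s_nn(t_lab):
    """Nucleon-nucleon c.m. energy (GeV) for lab kinetic energy t_lab per nucleon (GeV)."""
    return math.sqrt(2.0 * M_N * (t_lab + 2.0 * M_N))

def sigma_jpsi_conventional(sqrt_s, c=3.097, d=2.0 * 37.0 / 0.0597, power=12):
    """Conventional J/Psi parametrization d * (1 - c/sqrt_s)**power in nb.

    c and d are the values quoted in the text; power=12 is an assumption.
    """
    if sqrt_s <= c:
        return 0.0
    return d * (1.0 - c / sqrt_s) ** power

HARD_THRESHOLD = 10.0  # GeV, minimum sqrt(s) of a 'hard' NN collision in the text

for system, t_lab in [("S+U, 200A GeV", 200.0), ("Pb+Pb, 160A GeV", 160.0)]:
    rs = sqrt_s_nn(t_lab)
    print(f"{system}: sqrt(s0) = {rs:.1f} GeV, "
          f"hard collision: {rs >= HARD_THRESHOLD}, "
          f"sigma(J/Psi) ~ {sigma_jpsi_conventional(rs):.0f} nb")
```

Both nominal energies lie well above the 10 GeV threshold, so primary NN collisions count as hard; secondary collisions at lower √s fall below it and do not contribute Drell–Yan pairs in this scheme.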
Since the production scheme is the same for cc̄-pairs, their total cross section (without reabsorption) also scales with A_P × A_T; the ratio of the J/Ψ to the Drell–Yan cross section thus provides a direct measure for the J/Ψ suppression. In order to obtain some information about the primary distribution of the produced ‘pre-resonance’ states in coordinate space we show in Fig. 7.6 the cc̄ distribution in the (x, z)-plane
Fig. 7.6. The production probability for cc̄-pairs in a central Pb+Pb collision at 160A GeV in the (x, z)-plane integrated over y within the HSD transport approach. The z-axis is scaled by the Lorentz factor γ = 9.3 to compensate for the Lorentz contraction in beam direction. The figure is taken from Ref. [307].
integrated over y for a central collision of Pb+Pb at 160A GeV. The collision of the two nuclei proceeds along the z-direction and the actual z-axis has been stretched by the Lorentz factor γ ≈ 9.3 to compensate for the Lorentz contraction in beam direction. It is clearly seen from Fig. 7.6 that the production of the charmonium state by hard nucleon–nucleon collisions is enhanced in the center and drops rapidly in the surface region of the overlapping nuclei. The motion of the cc̄-pair in hadronic matter is followed throughout the collision dynamics by propagating it as a free particle. In the simulations the cc̄-pair, furthermore, may be destroyed in collisions with hadrons using the minimum distance concept as described in Section 2.3 of Ref. [125]. For the actual cross sections employed we study two models (denoted by I and II), which both assume that the cc̄ pair initially is produced in a color-octet state and immediately picks up a soft gluon to form a color neutral cc̄-g Fock state [298] (color dipole). This extended configuration in space is assumed to have a 6 mb dissociation cross section in collisions with baryons (cc̄ + B → Λ_c + D̄) as in Refs. [296–298] during the lifetime τ of the cc̄-g state, which is a parameter. In the model I we assume τ = 10 fm/c, which is large compared to the nucleus–nucleus reaction time such that the final resonance states J/Ψ and Ψ′ are formed in the vacuum without
Fig. 7.7. The transverse momentum distribution of J/Ψ mesons in p+U (upper part) and S+U collisions (lower part) at 200A GeV from the HSD calculations (histograms). The solid lines are fits to the experimental data from Ref. [239]. The figure is taken from Ref. [307].
further interactions with hadrons. In the model II we adopt τ = 0.3 fm/c as suggested by Kharzeev [298], which also requires specifying the dissociation cross sections of the formed resonances J/Ψ and Ψ′ on baryons. For simplicity we use 3 mb following Ref. [308]. The cross section for cc̄-g, J/Ψ, or Ψ′ dissociation on mesons (cc̄ + m → D D̄) is treated as a free parameter ranging from 0 to 3 mb. In order to test the transverse momentum dependence of the J/Ψ production, calculations have been performed for p+U and S+U at 200A GeV (using model I with σ = 6 mb on baryons and σ = 1 mb on mesons), which are scaled in magnitude to the respective data for the same systems from Ref. [239]. The calculated results are shown in Fig. 7.7 for both systems in terms of the histograms while the solid lines represent fits to the experimental transverse momentum distributions. Since this p_T dependence is described fairly well, one can expect the cc̄ event distributions within the HSD transport simulations to be quite realistic.

7.2. Analysis of experimental data

Since the J/Ψ and Ψ′ are measured in nuclear collisions through their decay into dimuons, we calculate explicitly the dimuon invariant mass spectra from these collisions. This includes not only
the decay of the J/Ψ, Ψ′ but also the decay of other mesons (η, ρ, ω, and φ) as well as the Dalitz-decay of the η, ω, η′ etc. (cf. Section 6). Since the Drell–Yan contribution is important for dileptons with invariant masses above 1.5 GeV [313] we have included these channels by computing for each ‘hard’ nucleon–nucleon collision (√s ≥ 10 GeV) their contribution via PYTHIA 5.7 [311] as described above. Calculations have been carried out for p+W and central S+W collisions at 200A GeV in order to check on an absolute scale if the production and absorption schemes employed in the HSD approach are included properly. In Fig. 7.8 we show the dimuon invariant mass spectra for p+W and S+W at 200A GeV normalized to the number of charged particles in the pseudorapidity bin 3.7 ≤ η ≤ 5.2 and compare them with the experimental data from Ref. [237] including their acceptance and mass resolution. Again model I has been employed with σ = 6 mb on baryons and σ = 1 mb on mesons (see below). It is seen from Fig. 7.8 that in the p+W and S+W cases the theoretical results agree well with the data on an absolute scale, which implies that apart from the low mass dimuon spectrum (cf. Section 6) also the J/Ψ and Ψ′ region is described reasonably well. At low dimuon masses the explicit contributions from the mesons η, ω, φ are displayed in terms of the thin lines, while for invariant masses above 1.5 GeV the Drell–Yan (DY), J/Ψ and Ψ′ contributions are shown explicitly. The full solid curve represents the sum of all contributions (without open charm channels). It is interrupted from M ≈ 1.2 to 1.5 GeV because neither the continuation of the Drell–Yan contribution to lower energies nor the tail of the ρ-meson contribution at higher M is clear. Furthermore, open charm channels (from D and D̄ mesons) have not been included yet (cf. Ref. [314]), nor have Drell–Yan pairs from secondary pion–baryon reactions [315].
We note that for S+W in the invariant mass range 1.6 ≤ M ≤ 2.5 GeV the dimuon yield is almost twice that for p+W within the present normalization and that our calculations reproduce also the intermediate mass range on the basis of PYTHIA 5.7 reasonably well. Since the absorption of cc̄-pairs on secondary mesons in proton–nucleus collisions is practically negligible [306], these reactions allow one to fix ‘experimentally’ the cc̄-baryon dissociation cross section. Furthermore, the Glauber approach for cc̄ absorption in this case should be approximately valid at energies of about 200–450 GeV such that the transport calculations can be tested additionally in comparison e.g. to the model calculations of Refs. [297,298,301,302,308]. In Fig. 7.9 we show the results of the calculations for the J/Ψ survival probability (open squares) using 6 mb for the absorption cross section of the cc̄-pairs on nucleons in comparison to the data [296]. The experimental survival probabilities in this figure as well as in the following comparisons are defined by the ratio of experimental J/Ψ to Drell–Yan cross sections as
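
$$S_{\mathrm{exp}} = \left. \frac{B_{\mu\mu}\,\sigma^{J/\Psi}_{AB}}{\sigma^{DY}_{AB}} \right/ \frac{B_{\mu\mu}\,\sigma^{J/\Psi}_{pd}}{\sigma^{DY}_{pd}} ,$$
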
where A and B denote the target and projectile mass number while σ^{J/Ψ} and σ^{DY} stand for the J/Ψ and Drell–Yan cross sections from AB collisions, respectively, and B_{μμ} is the branching ratio of J/Ψ to dimuons. We note that due to the large statistical error bars of the experimental data, absorption cross sections σ of 6 ± 1 mb are compatible also. These values are slightly smaller than those of Kharzeev et al. [304] in the Glauber model claiming 7.3 ± 0.6 mb, but in the same range as those used in the Glauber models of Ref. [297]. We do not expect to get exactly the same values as in the Glauber model because the transverse expansion of the scattered nucleons as well as of the cc̄-pairs
Fig. 7.8. Dimuon invariant mass spectra from p+W and S+W collisions at 200A GeV in comparison to the data of the HELIOS-3 Collaboration [237]. In the low mass region the individual contributions from the η Dalitz-decay, ω and φ decay are shown by the thin lines. The Drell–Yan contribution (denoted by DY) is calculated only for invariant masses M ≥ 1.5 GeV. The explicit contributions from J/Ψ and Ψ′ decays are indicated by thin lines for M ≥ 2.7 GeV. The thick solid lines represent the sum of all dimuon channels (except for open charm contributions). The figure is taken from Ref. [307].
is neglected there. Due to the ‘optimal’ reproduction of the available data with a cross section of about 6 mb, this value is used also for nucleus–nucleus collisions in the following. In order to perform a detailed comparison to the data of the NA38 and NA50 Collaborations for S+U at 200A GeV and Pb+Pb at 160A GeV, one first has to fix the experimental event classes as a function of the neutral transverse energy E_T in order to allow for an event-by-event analysis. In this respect we compute the differential cross section

$$\frac{d\sigma}{dE_T} = 2\pi \int b\, db\, \frac{dN}{dE_T}(b) \tag{7.5}$$

Fig. 7.9. The J/Ψ survival probability for p+A reactions at 200 GeV assuming a 6 mb cross section for the cc̄ dissociation on baryons (open squares) within the model I in comparison to the experimental data from [296] (full circles). The figure is taken from Ref. [307].
as a function of the impact parameter b, where dN(b)/dE_T is the differential E_T distribution for fixed b. Since the detector response is not known to the authors, the E_T distribution is rescaled to reproduce the tail in the experimental E_T distributions, which results in the dashed histograms in Fig. 7.10 that increase for low E_T. In the actual experiments, however, only events are recorded with a μ⁺μ⁻ pair of invariant mass M ≥ 1.5 GeV and rapidity 3 ≤ y ≤ 4 for NA38 and 2.93 ≤ y ≤ 3.93 for NA50. A respective selection in the transport calculation is obtained by

$$\frac{d\sigma_{\mu\mu}}{dE_T} = 2\pi N \int b\, db \sum_i \frac{dN}{dE_T}(b)\, W_i^{\mu\mu}(b) , \tag{7.6}$$

where W_i^{μμ}(b) are the weights for produced μ⁺μ⁻ pairs within the experimental cuts. The calculated distributions (7.6) are shown in Fig. 7.10 in terms of the solid histograms, which reasonably reproduce the experimental distributions (grey histogram for S+U, full dots for Pb+Pb). The average values for E_T in the five bins are indicated in Fig. 7.10 by the open squares and coincide with the corresponding experimental values from Ref. [240]. We are now in the position to perform a comparison to the experimental survival probabilities for J/Ψ and Ψ′ production in the respective transverse energy bins. We first show the results for S+U at 200A GeV and Pb+Pb at 160A GeV within the model I, varying the dissociation cross section on mesons of the cc̄-g object from 0 to 1.5 mb while keeping the absorption cross section on baryons fixed at 6 mb. The calculated J/Ψ survival probabilities are displayed in Fig. 7.11 (l.h.s.) in comparison to the data for both systems; the dashed lines are obtained for a meson cross section σ = 0 mb while the solid lines correspond to σ = 1.5 mb. Whereas the data for S+U appear to be approximately compatible with our calculations without any
Fig. 7.10. The distributions in the neutral transverse energy for S+U at 200A GeV and Pb+Pb at 160A GeV. Experimental distributions: grey histogram for S+U, full dots for Pb+Pb; dashed histograms: result of the transport calculations without any constraints on the experimental acceptance (7.5); solid histograms: HSD calculations for the E_T distribution according to Eq. (7.6). The open squares denote the average E_T in the experimental E_T bins. The figure is taken from Ref. [307].
dissociation by mesons, the Pb+Pb system shows an additional suppression. This finding is in agreement with the results of Glauber models [298,304]. On the other hand, the Pb+Pb data are well reproduced with a cross section of 1.5 mb (in model I) for the J/Ψ absorption on mesons, which, however, then slightly overestimates the suppression for the S+U data for the 3 middle E_T bins. Our calculations thus do not indicate a strong argument in favour of a QGP phase in the Pb+Pb reaction to interpret the J/Ψ survival probabilities within the model I. The latter conclusion is different from the Glauber calculations of Refs. [298,304] and should be due to the simplified assumptions about the actual meson abundance with which the cc̄ pair can be dissociated. Note again that mesons appear in our dynamical approach only after t_F = 0.7 fm/c after the first ‘hard’ collision. As has been pointed out by Gavin et al. [301], especially the J/Ψ in the comoving frame should be dissociated by ρ- and ω-mesons and less by pions due to the large gap in energy for D D̄ dissociation. We have investigated this suggestion more quantitatively within our
The S+U data are best described with σ ≈ 6.5 mb.
Fig. 7.11. L.H.S.: The J/Ψ survival probability for S+U at 200A GeV (upper part) and Pb+Pb at 160A GeV (lower part) as a function of the transverse energy in comparison to the experimental data from [240] (full circles) within the model I assuming a long lifetime for the cc̄-g system. The absorption cross section on mesons is varied from 0 (dashed lines) to 1.5 mb (solid lines). R.H.S.: The J/Ψ survival probability for S+U at 200A GeV (upper part) and Pb+Pb at 160A GeV (lower part) as a function of the transverse energy in comparison to the experimental data from [240] within the model II (see text). The absorption cross section on mesons is varied from 0 (dashed lines) to 3 mb (solid lines); the dissociation cross section on baryons for the pre-resonance cc̄-g system was taken as 6 mb, while for the J/Ψ singlet cross section with baryons 3 mb was adopted. The open diamonds represent the calculated survival probabilities (within model II) for more peripheral reactions (see text). The figures are taken from Ref. [307].
microscopic approach and find the following hadronic decomposition for J/Ψ and Ψ′ absorption on mesons: S+U (central): pions (35%), ρ’s (42%), ω’s (15%), η’s (8%); Pb+Pb (central): pions (37%), ρ’s (42%), ω’s (13%), η’s (8%). Thus the hadronic decomposition practically does not change when going from S+U to the heavier Pb+Pb system. As a side remark we note that the dilepton yield from central S+Au and Pb+Au collisions, normalized to the charged particle multiplicity in the rapidity bin 2.1 ≤ y ≤ 3.1, experimentally appears to be the same [236], which is in line with our findings.
Fig. 7.12. The J/Ψ and Ψ′ absorption rate dN(t)/dt on mesons for central collisions of S+U at 200A GeV and Pb+Pb at 160A GeV within the HSD transport approach. The absorption rate for S+U has been multiplied by a factor 208/32 to compensate for the different number of projectile nucleons. The figure is taken from Ref. [307].
A remarkable difference, however, is found when comparing the number of J/Ψ's and Ψ′'s absorbed by mesons in central collisions of S+U and Pb+Pb as a function of time. In order to compare the two absorption rates dN(t)/dt on a relative scale, we have multiplied the absorption rate for the system S+U by the ratio of projectile nucleons (208/32) in Fig. 7.12. Here the heavier system Pb+Pb shows higher absorption rates from the beginning (at t = 2 fm/c), which correlates with the central meson densities shown in Fig. 7.2; the J/Ψ absorption on mesons also lasts longer, in accordance with the calculations in Ref. [306]. The latter effect can also be extracted from the schematic picture in Fig. 7.3, where the number of produced cc̄-pairs at finite meson density is expected to be much larger for Pb+Pb than for S+U.

The J/Ψ suppression for S+U at 200A GeV and Pb+Pb at 160A GeV within model II is shown in Fig. 7.11 (r.h.s.) for a cc̄-g lifetime τ = 0.3 fm/c, varying the dissociation cross section of the cc̄-pair with mesons from 0 to 3 mb while keeping the absorption cross section on baryons fixed at σ = 6 mb for the 'pre-resonance' state and at 3 mb for the formed J/Ψ resonance. Again the dashed lines correspond to the calculations without any charmonium absorption on mesons, whereas the solid lines represent the calculations for a meson absorption cross section of 3 mb. In the absorption model II the data for S+U are no longer compatible with our calculations without any dissociation by mesons. The S+U data here need an absorption by mesons in the range of 3 mb, as in the phenomenological model of Gavin et al. [301,302]. With σ ≈ 3 mb for the absorption on mesons, however, the Pb+Pb data appear to be compatible, too. We also display in Fig. 7.11 (r.h.s.) the calculated survival probabilities (within model II) for more peripheral reactions (open diamonds), which correspond to an E_T-bin from 5 to 10 GeV in the case of S+U at 200A GeV and to an E_T-bin from 10 to 20 GeV in the case of Pb+Pb at 160A GeV.

We note that the Ψ′ suppression as measured by NA38 and NA50 for S+U and Pb+Pb provides additional information on the reaction mechanism. In our transport approach the Ψ′'s are dominantly absorbed in collisions with pions, which are above the DD̄ threshold in this case, and
Fig. 7.13. The distribution in the invariant collision energy √s for J/Ψ (upper part) and Ψ′ (lower part) absorption on π-, η-, ρ- and ω-mesons in central (b = 2 fm) Pb+Pb collisions at 160A GeV within the HSD transport approach.
less in ρ and ω collisions, as shown in Fig. 7.13, where the distribution in the invariant collision energy √s is shown for J/Ψ and Ψ′ absorption on π, η, ρ and ω mesons in central (b = 2 fm) Pb+Pb collisions at 160A GeV. Thus, to control the 'comover absorption' independently, the respective charmonium cross sections are needed from the DD̄ threshold up to about 5 GeV of invariant energy √s.

The transport calculations show that the absorption of 'pre-resonance' cc̄-g states by both nucleons and produced mesons can reasonably explain not only the inclusive J/Ψ cross sections but also the transverse energy (E_T) dependence of the J/Ψ suppression measured in nucleus-nucleus collisions. In particular, the absorption of J/Ψ's by produced mesons is found to be important especially for Pb+Pb reactions, where the J/Ψ-hadron reactions extend to longer times as compared to the S+W or S+U reactions. This is in contrast with results based on a simple Glauber model, which neglects both the transverse expansion of the hadronic system and the finite meson formation times, where the cc̄-N absorption is roughly sufficient even for S-induced collisions. As a consequence we do not find evidence for the formation of a quark-gluon plasma in Pb+Pb collisions. Such evidence could only be established through experimental or theoretical proofs that the
charmonium-meson cross sections employed (≈ 1.5 to 3 mb) in our analysis are too large. Since the cc̄ dissociation on mesons is expected to be dominated by flavor exchange reactions, the present cross sections should, in our opinion, be reasonable.

We close this section by noting that the charmonium suppression can also be described in a scenario where, prior to absorption by 'comovers', a cc̄ color octet state is dissolved in the color electric field of neighbouring strings [316]. Since there are several 'plausible' explanations for the J/Ψ suppression observed experimentally, the J/Ψ signal appears very questionable as a 'smoking gun' for a QGP phase. On the other hand, we have to point out that all hadronic models have in common that the J/Ψ suppression is a smooth function of the transverse energy E_T; the recently reported step in the suppression profile for Pb+Pb at E_T ≈ 50 GeV might point against the hadronic reaction scenario used here, as argued in Refs. [317,318]. However, before drawing premature conclusions, independent data with good statistics have to show such an effect independently of the E_T-bins taken.
8. Future perspectives

The ultimate goal of relativistic nucleus-nucleus collisions is to reanalyze the early 'big-bang' under laboratory conditions and to find the 'smoking gun' for a phase transition from the initial quark-gluon plasma (QGP) to a phase characterized by an interacting hadron gas. A theoretical approach might describe such a phase transition starting from the hadronic side by involving hadronic degrees of freedom, i.e. hadrons with proper self-energies or spectral functions at high baryon density or temperature, or by starting from the partonic side with strongly interacting quarks and gluons. It remains to be seen which approach will prove to be more successful, economic and transparent. Presently, there are no indications for a 'smoking gun', as shown in the previous sections. As is characteristic for the field of heavy-ion physics, it is the 'whole picture' that will provide first 'additive' experimental information, which finally should converge towards a subtle understanding of the nature of the strong interaction and the phases of its constituents. In this Section we address a couple of steps that should help in clarifying the problem using different probes under well defined experimental conditions.

8.1. Meson m_T-scaling at SIS energies

At SIS energies we propose to study the meson m_T-spectra in heavy-ion collisions, where m_T = (p_T² + m²)^{1/2} is the transverse mass and p_T the transverse momentum of a meson with bare mass m. The transverse-mass spectra are a common way to represent experimental information on particle production in heavy-ion physics [319]. Usually the measurement of m_T-spectra is associated with studies on equilibration phenomena of the system due to the trivial exponential behaviour: if the spectrum 1/m_T dσ/dm_T is of Boltzmann type ∼ exp(−β m_T), the slope parameter β might be related to the global (inverse) temperature at freeze-out in the absence of collective flow [88,89]. On the other hand, the m_T-spectra might also indicate the presence of attractive or repulsive forces, as suggested already by Fang et al. [320] or Koch et al. [321] in their analysis at AGS energies.
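As an illustration of the slope parameter discussed above, the following minimal Python sketch extracts β from a Boltzmann-type spectrum ∼ exp(−β m_T) by linear regression on the logarithm of the yield. The function name, the synthetic data points and the chosen inverse slope of 80 MeV are assumptions made for this example only; they are not taken from the HSD calculations.

```python
import math


def fit_boltzmann_slope(m_t_values, spectrum_values):
    """Fit ln(spectrum) = ln(A) - beta * m_T by ordinary least squares.

    Returns the pair (A, beta). Units follow the input: m_T in GeV gives
    beta in 1/GeV.
    """
    n = len(m_t_values)
    logs = [math.log(s) for s in spectrum_values]
    mean_x = sum(m_t_values) / n
    mean_y = sum(logs) / n
    s_xx = sum((x - mean_x) ** 2 for x in m_t_values)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(m_t_values, logs))
    slope = s_xy / s_xx
    amplitude = math.exp(mean_y - slope * mean_x)
    return amplitude, -slope


# Synthetic spectrum with an assumed inverse slope of 80 MeV (illustration only).
m_t_grid = [0.2 + 0.1 * i for i in range(8)]              # GeV
true_beta = 1.0 / 0.080                                    # 1/GeV
synthetic = [5.0 * math.exp(-true_beta * m) for m in m_t_grid]

amplitude, beta = fit_boltzmann_slope(m_t_grid, synthetic)
inverse_slope_mev = 1000.0 / beta
```

A straight line in a logarithmic plot thus fixes β; whether 1/β may be read as a temperature is subject to the caveats on equilibration and collective flow noted above.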
The m_T-spectra of π- and η-mesons in heavy-ion collisions at SIS energies have been measured by the TAPS Collaboration [136-138], and m_T-scaling has been found for both mesons and all systems investigated. Such a universal property of the meson spectra at SIS energies has also been predicted by the Quark-Gluon-String Model calculations in Ref. [139] for the Ar+Ca system at several energies. A systematic analysis of π and η spectra, furthermore, was performed in Ref. [140] within the HSD transport approach.

Following Ref. [190] we propose to examine the m_T-spectra of all mesons that can be produced at SIS energies in order to obtain information about their in-medium properties. We will use the same cross sections for all reaction channels as specified in Section 3 and employ the in-medium mass scheme as in Eq. (5.2), i.e.
$$ m^* = m_0 \left(1 - \alpha\, \frac{\rho_B}{\rho_0}\right) \ge m_q + m_{\bar q} \tag{8.1} $$

with α_{K⁺} = −0.06, α_ρ = 0.18, α_ω = 0.18, α_{K̄} = 0.24 and α_φ = 0.025.

In Fig. 8.1 we show the results of the HSD calculations with bare meson masses for the transverse-mass spectra of π (solid line with solid circles), η (dashed line with open circles), ω (dot-dashed line with solid 'up' triangles), φ (dotted line with solid squares), K⁺ (short dashed line with open diamonds) and K⁻ (dotted line with open 'down' triangles) mesons for C+C collisions at 2.0A GeV (upper part), Ni+Ni at 1.93A GeV (middle part) and Au+Au at 1.5A GeV (lower part). For pseudoscalar mesons (π, η) and vector mesons (ω, φ) we use

$$ m_T^* = m_T = \left(p_T^2 + m^2\right)^{1/2} , \tag{8.2} $$

whereas for kaons m_T* contains the shift in threshold due to the Λ-N mass difference, i.e.

$$ m_T^*(K^+) = \left(p_T^2 + (m_K + m_\Lambda - m_N)^2\right)^{1/2} , \tag{8.3} $$

while for antikaons, due to the associated (K⁺, K⁻) production, we use

$$ m_T^*(K^-) = \left(p_T^2 + (2 m_K)^2\right)^{1/2} , \tag{8.4} $$

where m_K, m_Λ and m_N are the masses of the kaon, the Λ and the nucleon, respectively. The contributions of the vector mesons ω and φ are divided by a factor of 3 due to the 3 different polarizations of the vector mesons.

As can be seen from Fig. 8.1, the π, η, ω and K⁺ spectra indicate m_T-scaling, whereas the contribution of the φ-meson is suppressed by a factor ≈ 10 but has the same slope as the other mesons. The K⁻ spectra (using Eq. (8.4)) approximately scale only for the light system C+C; for heavy systems such as Ni+Ni (middle part) and especially Au+Au (lower part) the K⁻ spectra are essentially below the scaling line due to a stronger absorption by baryons. This result for K⁻ is also consistent with the analysis in Ref. [160], which shows that the antikaon yield calculated with a bare mass significantly underestimates the experimental data [159,186,187] (cf. Section 5.1.2).

We note that the apparent m_T-scaling, especially for C+C at 2A GeV, is not due to thermal and chemical equilibration as suggested in Ref. [322] or in Refs. [88,89] for AGS and SPS energies. Here it merely reflects the fact that the production of a meson (per degree of freedom), after folding over the baryon-baryon and pion-baryon collisional distribution in the invariant energy √s, essentially depends on the excess energy available, as suggested by Metag [323]. This notion is also
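The in-medium mass scheme of Eq. (8.1) and the threshold-corrected transverse masses of Eqs. (8.2)-(8.4) can be sketched in a few lines of Python. The function names, the rounded vacuum masses and the optional lower bound `floor` (standing in for the quark-mass limit m_q + m_q̄) are assumptions of this illustration and are not part of the HSD code.

```python
import math

# Rounded vacuum masses in GeV (assumed values for illustration).
M_KAON, M_LAMBDA, M_NUCLEON = 0.494, 1.116, 0.938


def in_medium_mass(m0, alpha, rho_over_rho0, floor=0.0):
    """Linear density dependence of Eq. (8.1), bounded from below by `floor`."""
    return max(m0 * (1.0 - alpha * rho_over_rho0), floor)


def transverse_mass(p_t, mass):
    """Eq. (8.2): m_T = sqrt(p_T^2 + m^2)."""
    return math.hypot(p_t, mass)


def threshold_mt_kplus(p_t):
    """Eq. (8.3): K+ transverse mass including the Lambda-N threshold shift."""
    return math.hypot(p_t, M_KAON + M_LAMBDA - M_NUCLEON)


def threshold_mt_kminus(p_t):
    """Eq. (8.4): K- transverse mass for associated kaon-antikaon production."""
    return math.hypot(p_t, 2.0 * M_KAON)


# K+ becomes heavier (alpha < 0), the antikaon lighter (alpha > 0).
kplus_at_rho0 = in_medium_mass(M_KAON, -0.06, 1.0)
kminus_at_2rho0 = in_medium_mass(M_KAON, 0.24, 2.0)
```

Plotting spectra against these shifted variables rather than against the bare m_T is what allows kaons and antikaons to be compared with the pion reference line.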
Fig. 8.1. The calculated inclusive transverse-mass spectra of π- (solid line with solid circles), η- (dashed line with open circles), ω- (dot-dashed line with solid 'up' triangles), φ- (dotted line with solid squares), K⁺- (short dashed line with open diamonds) and K⁻- (dotted line with open 'down' triangles) mesons for C+C collisions at 2.0A GeV (upper part), Ni+Ni at 1.93A GeV (middle part) and Au+Au at 1.5A GeV (lower part). The calculations have been performed for bare meson masses. The figure is taken from Ref. [190].
consistent with the mass shifts in Eqs. (8.3) and (8.4) for kaons and antikaons, where the shift in threshold due to the associated strange hadron is taken into account explicitly. On the other hand, in chemical equilibrium no shift in the transverse mass as in Eqs. (8.3) and (8.4) should be considered. Our calculations, furthermore, show a strong anisotropy in the center-of-mass angular
Fig. 8.2. Inclusive m_T-spectra of π-, η-, ω- and K⁺-mesons for C+C (upper part) and Au+Au (middle part) at 1.0A GeV within the HSD transport approach. The lower part corresponds to Au+Au at 1.0A GeV for a central rapidity bin −0.3 ≤ y ≤ 0.3. The figure is taken from Ref. [190].
distribution for all mesons, similar to that of pions [9,324], which raises severe doubts on the issue of thermal equilibration, too.

In Fig. 8.2 we show, again for bare meson masses, our calculations for the m_T-spectra of π-, η-, ω- and K⁺-mesons for C+C (upper part) and Au+Au (middle part) at 1.0A GeV. As seen from Fig. 8.2, the spectra at 1.0A GeV show the same scaling behaviour as in Fig. 8.1. The lower
part of Fig. 8.2 corresponds to calculations for Au+Au at 1.0A GeV gating on central rapidity −0.3 ≤ y ≤ 0.3. The rapidity cut decreases the particle yield but does not 'destroy' the global scaling behaviour. Thus data at midrapidity provide similar information.

In order to investigate the dynamical origin of the scaling behaviour found for bare meson masses more closely, we show in Fig. 8.3 the channel decomposition for the η, ω, K⁺ and K⁻ spectra for Au+Au at 1.5A GeV. The solid lines with solid circles indicate the sum over all contributions, the dashed lines with open circles correspond to the NN production channel, the dotted lines with open squares are the ΔN channel, the dot-dashed lines with solid 'up' triangles show the πN contribution and the short dashed line with open diamonds in the lower part of Fig. 8.3 represents the pion-hyperon (πY) channel for K⁻ production. For η production all included channels (NN, ΔN, πN) give practically the same contribution because at 1.5A GeV we are above the pion-baryon and baryon-baryon η production threshold. For the ω-meson the πN channel provides the dominant yield; the NN and ΔN contributions are lower due to the large threshold for ω production in baryon-baryon collisions. For K⁺ production the πN and ΔN channels are approximately of the same order of magnitude; the contribution from the ΔN channel is more important at 1.5A GeV than at the lower energy of 1.0A GeV, where the πN channel is dominant (cf. Ref. [177]). As first pointed out by Ko in Ref. [188] and in line with our analysis (cf. Ref. [160]), the πY channel gives the main contribution for K⁻ production. This is due to the fact that in more central collisions the pion density reaches about 0.15 fm⁻³ while the hyperons have almost the same abundance as the K⁺ and K⁰ mesons. Thus a substantial number of hyperons undergo a quark exchange with pions (s → u, d) when propagating out of the nuclear medium. On the other hand, the same type of process (i.e. flavor exchange) leads to antikaon absorption on baryons.

Fig. 8.3 demonstrates that the secondary channels πN and ΔN give the main contribution for meson production at SIS energies, which are very close to (or even below) the elementary production threshold for most of the mesons. The primary NN collisions become important only if the incoming energy is substantially above the elementary production threshold (as in the η case).

We now turn to the question of in-medium meson properties by adopting the scaling hypothesis of Brown and Rho [3]. For our calculations we use a linear extrapolation of the meson masses with baryon density as employed in Eq. (8.1). Since in Eq. (8.1) we have neglected an explicit momentum dependence of the meson self-energies, the following calculations should be considered as a more qualitative study.

The symbols in Fig. 8.4 indicate the results from the HSD calculations for C+C at 2.0A GeV, Ni+Ni at 1.93A GeV and Au+Au at 1.5A GeV (the assignment is the same as in Fig. 8.1). The solid lines are exponential fits to the HSD results for orientation. The pion spectra are shown by the thick straight lines indicating the general scaling behaviour (cf. Fig. 8.1) with the slope parameters 77 MeV for C+C at 2.0A GeV, 82 MeV for Ni+Ni at 1.93A GeV and 83 MeV for Au+Au at 1.5A GeV. The 'dropping' of the η, ω and K⁻ masses according to Eq. (8.1) leads to an enhancement of the m_T-spectra, especially at low m_T, due to the shift of the production thresholds to lower energy. The enhancement of the φ yield is hardly visible due to the very small coefficient α_φ in Eq. (8.1). The increasing K⁺ mass in the medium according to Eq. (8.1) correspondingly leads to a suppression of the K⁺ yield at low m_T.

Thus the in-medium modifications of the mesons according to Eq. (8.1) destroy the m_T-scaling picture presented in Fig. 8.1. As already shown in Ref. [140], the simple 'dropping' η mass scheme is
Fig. 8.3. The channel decomposition for the η, ω, K⁺ and K⁻ inclusive m_T-spectra for Au+Au at 1.5A GeV. The calculations have been performed with bare meson masses within the HSD transport approach. The solid lines with solid circles indicate the sum over all contributions, the dashed lines with open circles correspond to the NN production channel, the dotted lines with open squares are the ΔN channel, the dot-dashed lines with solid 'up' triangles show the πN contributions while the short dashed line with open diamonds is the pion-hyperon (πY) channel for K⁻ production. The figure is taken from Ref. [190].
not consistent with the m_T-scaling observed by the TAPS Collaboration [136-138]. On the other hand, the present experimental data for K⁺ and K⁻ [126,159,173] indicate that the in-medium mass scheme for strange mesons leads to a reasonable agreement with the data. So one can expect that the kaon and antikaon m_T-spectra should not show a strict scaling behaviour, i.e. the slopes
Fig. 8.4. The calculated inclusive transverse-mass spectra of η-, ω-, φ-, K⁺- and K⁻-mesons with in-medium masses according to Eq. (8.1). The symbols indicate the results from the HSD calculations (the assignment is the same as in Fig. 8.1). The solid lines are exponential fits to the HSD results to guide the eye. The pion yield is shown by straight thick lines. The figure is taken from Ref. [190].
should be different from the pions, especially for K⁻ mesons. A first indication of this phenomenon can be seen in Fig. 5.13 (r.h.s.) in the preliminary data from the KaoS Collaboration.

Summarizing this Section, our analysis shows that the m_T-spectra of mesons (corrected by production thresholds as in Eqs. (8.3) and (8.4)) are quite sensitive to the in-medium
modifications of the mesons. For bare meson masses we find a general scaling for π-, η-, ω- and K⁺-mesons, whereas K⁻ and φ are suppressed but have the same slope. The m_T-scaling found within our transport simulations is not due to thermal and chemical equilibration as suggested in Ref. [322], but is a genuine nonequilibrium effect which is seen most prominently for the light system C+C at 2A GeV. Dynamically it results from the observation [323] that the production probability (per degree of freedom) of the mesons considered here (except the φ) in heavy-ion reactions 'only' depends on the excess energy available in hadron-hadron collisions. Any in-medium modifications thus show up in softer m_T-slopes in the case of attractive potentials and higher onsets at p_T = 0; the opposite holds for repulsive potentials. The change in slope relative to pions, which are usually detected simultaneously with the heavier mesons, is proportional to the α parameter in Eq. (8.1), which reflects the sign and strength of the meson potentials in the medium. According to the HSD transport calculations these relations do not vary much when restricting to midrapidity or changing the bombarding energy. Thus the m_T-spectra should provide valuable information on the in-medium properties of the heavier mesons once the pion m_T-spectra are measured in the same experiment for reference.

8.2. Pion-nucleus reactions

As noted in Section 6, it is necessary to have independent information on the vector meson properties from reactions where the dynamical picture is more transparent, e.g. in pion-nucleus collisions. Here, especially the ω-meson can be produced with low momenta in the laboratory system, such that a substantial fraction of them will also decay inside a heavy nucleus [325,326]. The same holds for the φ-meson; however, its vacuum decay will still dominate except for very low momenta in the laboratory.
The mass distributions of the vector mesons in πA collisions are expected to have a two-component structure [327] in the dilepton invariant mass spectrum: the first component corresponds to resonances decaying in the vacuum, thus showing the free spectral function, which is very narrow in the case of the ω- or φ-meson; the second (broader) component corresponds to the resonance decay inside the nucleus. We will use that (in first order) the in-medium resonance can also be described by a Breit-Wigner formula with a mass and width distorted by the nuclear environment.

Here we report on microscopic calculations [326,328] for the production and dileptonic decay of ω, ρ and φ resonances in π⁻A collisions at pion momenta of 1.1 to 1.7 GeV/c, available at GSI in the near future. The calculations are performed within the framework of the intranuclear cascade model (INC) [329], which for πA reactions practically coincides with the HSD transport calculations once the same cross sections are employed. A recent reanalysis of this issue in a coupled channel transport approach has been performed by Weidmann et al. [330]. Here we explicitly consider π⁻ induced reactions and also compute the background sources in the dilepton spectrum from pion-nucleon bremsstrahlung as well as the Dalitz decays ω → π⁰e⁺e⁻ and η → γe⁺e⁻ following Section 6.

When the resonance decays inside the nucleus, its explicit form is described (in a first approximation) by a Breit-Wigner formula

$$ F(M) = \frac{1}{2\pi}\, \frac{\Gamma_R^*}{(M - M_R^*)^2 + \Gamma_R^{*2}/4} \tag{8.5} $$
that contains the effects of collisional broadening,

$$ \Gamma_R^* = \Gamma_R + \delta\Gamma , \tag{8.6} $$

where

$$ \delta\Gamma = \gamma v \sigma_{RN} \rho , \tag{8.7} $$

and a shift of the meson mass

$$ M_R^* = M_R + \delta M_R , \tag{8.8} $$

where

$$ \delta M_R = -\gamma v \sigma_{RN} \rho\, \alpha . \tag{8.9} $$

In Eqs. (8.7) and (8.9) v is the resonance velocity with respect to the target at rest, γ is the associated Lorentz factor, ρ is the nuclear density, σ_RN is the resonance-nucleon total cross section and α = (Re f(0))/(Im f(0)), with f(0) denoting the forward scattering amplitude. If the ratio α is small, which is actually the case for the reactions considered because many reaction channels are open, the broadening of the resonance will be the main effect. The sign of the mass shift depends on the sign of the real part of the forward RN scattering amplitude which, in principle, also depends on the momentum of the resonance. The latter dependence, however, will be disregarded in the following calculations.

Restricting to pion momenta below 1.7 GeV/c, one has to take into account the following elementary processes:

$$ \pi^- p \to \omega n , \tag{8.10} $$

$$ \pi^- N \to \omega \pi N , \tag{8.11} $$

$$ \pi^- p \to \rho n , \tag{8.12} $$

$$ \pi^- N \to \rho \pi N , \tag{8.13} $$

$$ \pi^- p \to \phi n . \tag{8.14} $$
For the process (8.10) we use a parametrization of the experimental data from Refs. [331,121]:

$$ \sigma(\pi^- p \to \omega n) = C\, \frac{P_{\pi N} - P_\omega}{P_{\pi N}^{a} - d} , \tag{8.15} $$

where P_πN is the relative momentum (in GeV/c) of the pion-nucleon pair while P_ω = 1.095 GeV/c is the threshold value. The parameters C = 13.76 mb (GeV/c)^{a−1}, a = 3.33 and d = 1.07 (GeV/c)^a satisfactorily describe the data on the energy-dependent cross section (8.10) in the near-threshold energy region. The other production channels are parametrized in a similar way as in Section 3.4. For further details the reader is referred to Ref. [328].

The decay of the resonances to e⁺e⁻ with their actual spectral shape is performed as described in Section 6: when the resonance decays into dileptons inside the nucleus, its mass is generated according to a Breit-Wigner distribution with average mass M_R* and width Γ_R* = Γ_R + δΓ (8.6),
Fig. 8.5. The momentum distribution of the ω-mesons produced in the reaction π⁻ + Pb → ωX at 1.3 GeV/c as a function of the longitudinal momentum in the laboratory P_z (upper part) and the transverse momentum P_T (lower part); solid histograms: ω-mesons decaying inside the nucleus, dashed histograms: ω-mesons decaying in the vacuum. The figure is taken from Ref. [328].
where the collisional broadening and the mass shift are calculated according to the local nuclear density. Its decay to dileptons is recorded as a function of the corresponding invariant mass bin and the local nucleon density ρ. If the resonance leaves the nucleus, its spectral function automatically coincides with the free distribution because δM_R and δΓ are zero in this case.

8.2.1. Production and decay of ω-mesons

Following the suggestion by Schön et al. [325], we start with the reaction π⁻ + Pb at 1.3 GeV/c and present in Fig. 8.5 the calculated longitudinal (P_z) and transverse (P_T) momentum distributions of ω-mesons which decay inside (solid histograms) and outside (dashed histograms) of the Pb-nucleus. Since the ω decays have been recorded as a function of the nucleon density, the 'inside' component is defined as those ω mesons which decay at densities ρ ≥ 0.03 ρ_0. As one might expect for kinematical reasons, the longitudinal momentum distribution of ω mesons for the 'inside' component is shifted to lower momenta P_z, whereas fast ω's predominantly decay in the vacuum. A similar correlation also holds for the transverse momentum distribution, though it is not as pronounced as for the longitudinal momentum distribution. Thus it is clear that cuts for low P_z and P_T are favourable for studying in-medium decays of ω mesons.

Including the mass shift of the ω's as well as collisional broadening (as described before), we show in Fig. 8.6 the inclusive differential cross section of the e⁺e⁻ pairs from direct ω decays for different cuts in P_z and P_T. In the lowest momentum interval (P_z ≤ 0.25 GeV/c, P_T ≤ 0.25 GeV/c) one clearly observes a two-peak structure corresponding to the in-medium and vacuum decays, respectively. In order to quantify the ratio of both components we have introduced a cut in invariant mass at
Fig. 8.6. The inclusive differential cross section of e⁺e⁻ pairs from direct ω decays for π⁻ + Pb at 1.3 GeV/c for different cuts in longitudinal (P_z) and transverse (P_T) momentum. The ratio R (8.16) provides a measure for the relative weight of the in-medium component relative to the vacuum component using a cut in invariant mass at M_c = 0.725 GeV. The figure is taken from Ref. [328].
M_c = 0.725 GeV and integrated the dilepton yield below and above M_c. The quantity

$$ R = \frac{\int^{M_c} dM\, N_{e^+e^-}(M)}{\int_{M_c} dM\, N_{e^+e^-}(M)} \tag{8.16} $$
thus provides a measure for the in-medium ω decay relative to the vacuum decay. Its actual values for the various momentum cuts are given in Fig. 8.6 and decrease from R = 1.16 (for the lowest momentum bin) with increasing total momentum of the dilepton pair to R = 0.21 (for the highest
momentum bin). Thus, given the large momentum acceptance of the dilepton spectrometer (HADES [332]), one can use different cuts in longitudinal and transverse momentum to also explore the in-medium properties of the ω-meson as a function of its momentum with respect to the target nucleus at rest.

Since in light nuclei the vacuum decay of the ω will dominate, one has to investigate the target mass dependence of the in-medium component. The results of the calculations at a pion laboratory momentum of 1.3 GeV/c are displayed in Fig. 8.7 for Pb, Zr, Ca and C in the lowest momentum bin (P_z ≤ 0.25 GeV/c, P_T ≤ 0.25 GeV/c) for the dilepton pair. The ratio R (8.16) here decreases from R = 1.16 to R = 0.44 when going from the heavy (Pb) to the light target (C). In the case of C no explicit in-medium peak is visible anymore, even for the lowest momentum cut; only a low mass tail of the pronounced peak for the vacuum decay appears. This explicit mass dependence can also be exploited experimentally to prove or disprove in-medium modifications of the ω-meson by directly comparing dilepton spectra from light and heavy targets.

A further question is related to the dependence of the in-medium component on the pion laboratory momentum. One expects the ω production cross section to increase with the pion momentum; however, the average ω momentum distribution will be shifted to higher momenta, too, such that the in-medium decay component might be reduced at higher energy. Thus calculations have been performed for π⁻ + Pb collisions for pion momenta of 1.1, 1.3, 1.5 and 1.7 GeV/c. The results of the computations for π⁻ + Pb reactions for the lowest momentum bin (P_z ≤ 0.25 GeV/c, P_T ≤ 0.25 GeV/c) of the dilepton pair indicate (cf. Fig. 8.8) that the ratio R here (within the numerical accuracy) is roughly constant from 1.3 to 1.7 GeV/c while the cross section increases by about 50%.
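The ratio R of Eq. (8.16) quoted above can be evaluated from a binned dilepton spectrum as in the following Python sketch. It assumes bins of equal width, so that the width cancels in the ratio; the function name and the toy histogram are hypothetical and serve only to illustrate the definition.

```python
def in_medium_ratio(bin_centers, bin_contents, m_cut=0.725):
    """Ratio of the dilepton yield below the mass cut to the yield above it.

    Assumes equally wide invariant-mass bins (centers in GeV), so the bin
    width drops out of the ratio of the two integrals in Eq. (8.16).
    """
    below = sum(n for m, n in zip(bin_centers, bin_contents) if m < m_cut)
    above = sum(n for m, n in zip(bin_centers, bin_contents) if m >= m_cut)
    if above <= 0.0:
        raise ValueError("no yield above the mass cut; ratio undefined")
    return below / above


# Toy histogram (arbitrary units), not taken from Fig. 8.6.
centers = [0.60, 0.65, 0.70, 0.75, 0.80]
contents = [1.0, 2.0, 3.0, 4.0, 1.0]
ratio = in_medium_ratio(centers, contents)
```

Applying the same function to spectra with different P_z and P_T cuts, targets or beam momenta gives the kind of comparison shown in Figs. 8.6 to 8.8.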
The largest signal R = 1.44 is obtained at 1.1 GeV/c; however, here the cross section is already down by about a factor of 2 as compared to a beam momentum of 1.3 GeV/c. Thus the INC calculations favor laboratory momenta of about 1.3 GeV/c for the study of the in-medium properties of the ω-meson, which is in line with the suggestion by Schön et al. [325].

8.2.2. Background processes

A first calculation for the background processes at invariant masses above 0.2 GeV for the reaction π⁻Pb at 1.3 GeV/c, integrated over all momenta of the dilepton pair, has been presented in Ref. [326]. Here we show the results of the studies of Ref. [328] that differentiate with respect to momentum bins. The inclusive dilepton yield integrated over all momenta (l.h.s.) as well as for the lowest momentum bin (P_z ≤ 0.25 GeV/c, P_T ≤ 0.25 GeV/c) (r.h.s.) is shown in Fig. 8.9 for the reaction π⁻Pb at 1.3 GeV/c. The background contributions above 0.65 GeV in the low momentum bin are found to be of the same order of magnitude as for the momentum integrated spectrum, whereas the inclusive cross section drops by about a factor of 15 for the lowest momentum bin.

For experimental purposes we furthermore show in Fig. 8.10 the results of the calculations from Ref. [328], including the background processes discussed above, for the reactions π⁻C (dashed histograms) and π⁻Pb (solid histograms) at 1.3 GeV/c, integrated over all momenta (l.h.s.) as well as for the lowest momentum bin (P_z ≤ 0.25 GeV/c, P_T ≤ 0.25 GeV/c) (r.h.s.). In order to allow for a direct comparison, both systems have been normalized to the same differential cross section in the ω vacuum decay peaks. The relative enhancement at invariant masses of 0.65
Fig. 8.7. The inclusive differential cross section of e⁺e⁻ pairs from direct ω decays for π⁻ induced reactions at 1.3 GeV/c on Pb, Zr, Ca and C for a low momentum cut (P_z ≤ 0.25 GeV/c, P_T ≤ 0.25 GeV/c). The ratio R (8.16) provides a measure for the relative weight of the in-medium component relative to the vacuum component using a cut in invariant mass at M_c = 0.725 GeV. The figure is taken from Ref. [328].
Fig. 8.8. The inclusive differential cross section of e⁺e⁻ pairs from direct ω decays for π⁻ induced reactions on Pb at 1.1, 1.3, 1.5 and 1.7 GeV/c for a low momentum cut (P_z ≤ 0.25 GeV/c, P_T ≤ 0.25 GeV/c). The ratio R (8.16) provides a measure for the relative weight of the in-medium component relative to the vacuum component using a cut in invariant mass at M_c = 0.725 GeV. The figure is taken from Ref. [328].
GeV for the Pb-target is quite pronounced for the lowest momentum bin, but also survives in the momentum integrated spectra (l.h.s.). Thus the in-medium ω-meson properties can be extracted experimentally by directly comparing the dilepton yield from light and heavy targets, especially for low momentum cuts. We note that the relative dilepton yield at invariant masses M ≤ 0.65 GeV is smaller for C than for Pb because we have normalized to the free ω decay peak, which is more pronounced for C since most of the ω-mesons decay in the vacuum here.

8.2.3. φ-meson production and decay

Apart from the ω-meson, the in-medium properties of the φ-meson can be studied as well by π⁻A reactions. Since for pion momenta of 1.3 GeV/c the cross section is rather low even for Pb-targets, we report on an analysis at a laboratory momentum of 1.7 GeV/c [328]. The distributions of
Fig. 8.9. The inclusive differential cross section of e⁺e⁻ pairs from π⁻ induced reactions on Pb at 1.3 GeV/c; L.H.S.: integrated over all dilepton momenta; R.H.S.: for a low momentum cut ($P_z \le 0.25$ GeV/c, $P_T \le 0.25$ GeV/c). The upper solid line represents the sum of all contributions; the thin histograms present the yields from pion-nucleon bremsstrahlung (πN), the η Dalitz-decay (η), the ω Dalitz-decay and the direct decays of the vector mesons ρ and ω, respectively. The figure is taken from Ref. [328].

Fig. 8.10. Comparison of the inclusive differential cross section of e⁺e⁻ pairs from π⁻ induced reactions on Pb (solid histograms) and C (dashed histograms) at 1.3 GeV/c including all sources; L.H.S.: integrated over all dilepton momenta; R.H.S.: for a low momentum cut ($P_z \le 0.25$ GeV/c, $P_T \le 0.25$ GeV/c). Both systems have been normalized to the same differential cross section in the ω vacuum decay peaks. The figure is taken from Ref. [328].
Fig. 8.11. The momentum distribution of the φ-mesons produced in the reaction π⁻Pb → φX at 1.7 GeV/c as a function of the longitudinal momentum $P_z$ in the laboratory (upper part) and the transverse momentum $P_T$ (lower part) for the ‘inside’ (solid histograms) and the ‘outside’ components (dashed histograms). The figure is taken from Ref. [328].
φ-mesons in longitudinal momentum ($P_z$) and transverse momentum ($P_T$) are displayed in Fig. 8.11 for the ‘inside’ and ‘outside’ component, respectively. Due to the longer lifetime of the
φ-mesons (as compared to ω-mesons) and lower in-medium scattering cross sections [333], the ‘outside’ momentum distribution is considerably larger than the ‘inside’ momentum distribution. The in-medium properties of the φ will thus be harder to observe.

In Fig. 8.12 we show the invariant mass distribution of dilepton pairs from the background processes as well as ω, ρ and φ decays including the mass shift (8.1) for ω and ρ as well as collisional broadening for all mesons, however, discarding a mass shift of the φ-meson. The momentum integrated mass spectra for π⁻Pb at 1.7 GeV/c are shown in the l.h.s. of Fig. 8.12, whereas a low momentum cut ($P_z \le 0.25$ GeV/c, $P_T \le 0.25$ GeV/c) has been applied in the r.h.s. of Fig. 8.12. The φ peak clearly emerges out of the background from the ρ decay even for the integrated spectra, while the φ decay is even more pronounced in the low momentum bin. Thus the
φ-meson can clearly be studied experimentally if the mass resolution ΔM is 10 MeV or less.

We, furthermore, address the question of in-medium effects on the φ-meson again for the system π⁻Pb at 1.7 GeV/c. The results of the INC calculations from Ref. [328] are displayed in Fig. 8.13 for a low momentum cut ($P_z \le 0.25$ GeV/c, $P_T \le 0.25$ GeV/c). The full histogram shows the dilepton spectra without a mass shift for the φ (cf. Fig. 8.12), while the dashed histogram includes a mass shift according to Eq. (8.1) with a 2.5% reduction at ρ₀ according to [56], leading to an in-medium φ peak shifted by about 25 MeV, which is only visible when applying the cut on low momenta; otherwise only a slight asymmetry in the mass spectrum survives.
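The size of the quoted shift can be checked with a few lines of Python. This is only a sketch: Eq. (8.1) is not reproduced in this section, so the linear density dependence m* = m₀(1 - α ρ/ρ₀) used below, the function names and the rounded vacuum φ mass are assumptions made for illustration.

```python
PHI_MASS_GEV = 1.0195  # vacuum phi mass in GeV (rounded; assumption of this sketch)


def in_medium_mass(m0, alpha, rho_over_rho0):
    """Mass dropped linearly with density: m* = m0 * (1 - alpha * rho/rho0).

    The linear form is an assumption standing in for Eq. (8.1).
    """
    return m0 * (1.0 - alpha * rho_over_rho0)


def mass_shift_mev(m0, alpha, rho_over_rho0=1.0):
    """Downward shift m0 - m* in MeV."""
    return (m0 - in_medium_mass(m0, alpha, rho_over_rho0)) * 1000.0


if __name__ == "__main__":
    # 2.5% reduction at normal nuclear matter density
    shift = mass_shift_mev(PHI_MASS_GEV, alpha=0.025)
    print(f"phi mass shift at rho0: {shift:.1f} MeV")
```

Under these assumptions a 2.5% reduction of a mass of about 1.02 GeV indeed corresponds to roughly 25 MeV, which is comparable to the mass resolution of 10 MeV or less required above.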
Fig. 8.12. The invariant mass distribution of dilepton pairs from ω, ρ and φ decays for π⁻Pb at 1.7 GeV/c including the mass shift (8.1) for ω's and ρ's as well as collisional broadening for all mesons, however, discarding a mass shift of the φ-meson; L.H.S.: momentum integrated mass spectra; R.H.S.: for a low momentum cut ($P_z \le 0.25$ GeV/c, $P_T \le 0.25$ GeV/c). The figure is taken from Ref. [328].

Fig. 8.13. The invariant mass distribution of dilepton pairs from ω, ρ and φ decays for π⁻Pb at 1.7 GeV/c including the mass shift (8.1) for ω's and ρ's as well as collisional broadening for these mesons for a low momentum cut ($P_z \le 0.25$ GeV/c, $P_T \le 0.25$ GeV/c). The full histogram at ≈ 1.02 GeV of invariant mass shows the φ decay without a mass shift but with collisional broadening (cf. Fig. 8.12), while the dashed histogram additionally includes a mass shift according to Eq. (8.1) by 2.5% at ρ₀. The figure is taken from Ref. [328].
We note, however, that the lifetime of the φ-meson at normal nuclear matter density might be shorter due to in-medium modifications of the kaons and antikaons, because for dropping kaon masses the phase space for the decay to kaon and antikaon in the medium increases [93] (cf. Section 5). As a consequence, the ‘inside’ component of the φ decay should be enhanced in such a scenario. Golubeva et al. [328] have also performed studies with a modified decay width of the
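φ to $K\bar K$ as
$$ \Gamma^*_K = \Gamma_K \left( \frac{\left(M^{*2}_\phi - (M_K + M^*_{\bar K})^2\right)\left(M^{*2}_\phi - (M_K - M^*_{\bar K})^2\right)/M^{*2}_\phi}{M^2_\phi - 4M^2_K} \right)^{3/2} , \qquad (8.17) $$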
where $M_\phi$, $\Gamma_K$ are the bare mass and decay width to $K\bar K$, respectively, and $M^*_\phi$ is the in-medium mass of the φ that drops with density according to Eq. (8.1). At normal nuclear matter density the decay width of the φ to $K\bar K$ thus increases from 3.8 MeV to ≈ 8 MeV, i.e. by roughly a factor of 2, due to the kaonic decay; however, this additional width is smaller than the collisional broadening of δΓ ≈ 15-20 MeV, such that this effect will be hard to see experimentally.

Summarizing this section, we have presented detailed predictions for dilepton spectra from pion-nucleus reactions (following Refs. [325,326,328,330]) to learn especially about the in-medium properties of the ω- and φ-mesons. The results for π⁻Pb at 1.3 GeV/c indicate that the dominant background for invariant masses M above 0.6 GeV arises from π⁻N bremsstrahlung which, however, is still small compared to the yield from the direct vector meson decays. A mass shift of the ρ- and ω-mesons should be seen experimentally by an enhanced yield in the mass regime 0.65 ≤ M ≤ 0.75 GeV, and a mass shift of the ρ-meson especially in a decrease of the dilepton yield for M ≥ 0.85 GeV, because the ρ almost entirely decays inside a Pb nucleus. The in-medium modifications of the ω-mesons are most pronounced for small momentum cuts on the e⁺e⁻ pair in the laboratory ($P_z \le 0.25$ GeV/c, $P_T \le 0.25$ GeV/c). Furthermore, for π⁻Pb at 1.7 GeV/c a sufficient cross section for φ production is predicted; its dilepton decay signal should be clearly visible above the background from ρ decays. The in-medium mass shift of the φ is expected to be much smaller than that of the ρ- and ω-mesons, and the dominant effect expected is a broadening of the φ peak due to elastic φN collisions and an enhanced kaonic decay width in the medium due to a dropping antikaon mass [160,98].
In order to distinguish experimentally the in-medium φ peak from the vacuum decay, our analysis indicates that sensible cuts on low dilepton momenta in the laboratory as well as an experimental mass resolution ΔM ≤ 5 MeV will be necessary.

8.3. Dilepton anisotropies

As discussed in Section 6, dileptons are used as electromagnetic signals to probe the hot and dense nuclear phase in heavy-ion collisions or the in-medium properties of vector mesons in pion-nucleus reactions. However, there are many hadronic sources of dileptons because the electromagnetic field couples to all charges and magnetic moments. In particular, in hadron-hadron collisions the e⁺e⁻ pairs are created by the electromagnetic decay of time-like virtual photons, which can result from the bremsstrahlung process or from the decay of baryonic and mesonic resonances, including the direct conversion of vector mesons into virtual photons in accordance with the vector meson dominance hypothesis. In the nuclear medium, the properties of
these sources should be modified, and it is thus very desirable to have additional experimental observables which allow one to disentangle the various channels of dilepton production. Apart from the differential e⁺e⁻ spectra, the investigation of lepton angular distributions is promising, too, because the virtual photon created in hadronic interactions is polarized. The coupling of a virtual photon to hadrons induces a dynamical spin alignment of both the resonances and the virtual photons due to the conservation laws, and consequently the angular distribution of a lepton pair with respect to the polarized photon momentum is anisotropic.

The angular distributions of leptons for large invariant masses (above the J/Ψ regime) have been investigated at high energy, both experimentally [334] and theoretically (cf. the review of Strojnowski [335]), more than a decade ago. Here, the dominant source at large invariant masses is the quark annihilation (Drell-Yan) process. The former measurements allowed the observation of a small deviation of the polar angular distribution from the prediction of the naive quark-parton model; the deviations observed, however, are in line with more detailed QCD computations. Thus the sensitivity of lepton angular characteristics has already been used successfully to differentiate between models.

It has been proposed in Refs. [336,337] to use the lepton pair angular distributions also for a distinction between different sources in the ‘low’ invariant mass region (M < 1 GeV), where many dilepton sources contribute. Furthermore, it has been demonstrated that, due to the spin alignment of the virtual photon and the spins of the colliding or decaying hadrons, this lepton decay anisotropy turns out to be sensitive to the specific hadronic production channel. In Refs. [336-338] the dilepton anisotropy has been calculated for elementary nucleon-nucleon collisions and for p + d reactions at BEVALAC energies.
First results for proton-nucleus and nucleus-nucleus collisions at 1-2A GeV have been reported in [283]. Here we present the results of a more systematic study on the dilepton angular anisotropy for p + A and A + A reactions from 2A GeV to 200A GeV within the HSD transport approach from Ref. [152].

8.3.1. Definition of the anisotropy coefficient

The general form of the angular distribution for the decay of a virtual photon into a lepton pair may be written [339] as
$$ W(\theta,\varphi) = \frac{3}{8\pi}\left[\rho_{11}(1+\cos^2\theta) + (1-2\rho_{11})\sin^2\theta + \rho_{1-1}\sin^2\theta\cos 2\varphi + \sqrt{2}\,\mathrm{Re}\,\rho_{10}\sin 2\theta\cos\varphi\right] , \qquad (8.18) $$
where the angles θ, φ of the electron momentum are measured with respect to a fixed z-axis in the virtual photon rest frame. The density matrix elements $\rho_{ij}$ depend on the choice of the reference frame as well as on all the variables describing the virtual photon. An integration over the azimuthal angle gives
$$ W(\theta) \sim 1 + B\cos^2\theta , \qquad (8.19) $$
where the coefficient B may vary between -1 and +1. Adopting the normalization condition $\rho_{00} + 2\rho_{11} = 1$, the coefficient B can be represented as (cf. [340])
$$ B = \frac{3\rho_{11}-1}{1-\rho_{11}} . \qquad (8.20) $$
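The relation between the density matrix element and the anisotropy coefficient in Eq. (8.20) can be illustrated numerically. The following Python sketch assumes the azimuth-integrated form of Eq. (8.18), in which only the first two terms of the bracket survive; the function names are hypothetical and chosen for this illustration only.

```python
def anisotropy_from_rho11(rho11):
    """B = (3*rho11 - 1) / (1 - rho11), Eq. (8.20), with rho00 + 2*rho11 = 1."""
    if not 0.0 <= rho11 <= 0.5:
        raise ValueError("rho11 must lie in [0, 1/2] so that rho00 = 1 - 2*rho11 >= 0")
    return (3.0 * rho11 - 1.0) / (1.0 - rho11)


def polar_distribution(cos_theta, rho11):
    """Azimuth-integrated bracket of Eq. (8.18), up to a constant prefactor."""
    c2 = cos_theta * cos_theta
    return rho11 * (1.0 + c2) + (1.0 - 2.0 * rho11) * (1.0 - c2)


if __name__ == "__main__":
    for rho11 in (0.0, 1.0 / 3.0, 0.5):
        b = anisotropy_from_rho11(rho11)
        # W(0 deg)/W(90 deg) - 1 must reproduce the same coefficient
        ratio = polar_distribution(1.0, rho11) / polar_distribution(0.0, rho11) - 1.0
        print(f"rho11 = {rho11:.3f}  B = {b:+.3f}  W(0)/W(90) - 1 = {ratio:+.3f}")
```

The three sample values span the allowed range: a purely longitudinal photon (ρ₁₁ = 0) gives B = -1, an unpolarized photon (ρ₁₁ = 1/3) gives B = 0, and a purely transverse photon (ρ₁₁ = 1/2) gives B = +1.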
Adopting the general form Eq. (8.18), the angular distribution of dileptons created in channel i in a hadron-hadron (h + h) reaction can then be written as [336,337]
$$ \frac{d\sigma_i^{hh}}{dM\, d\cos\theta_{hh}} = A_i^{hh}\left(1 + B_i^{hh}\cos^2\theta_{hh}\right) . \qquad (8.21) $$
Here $\theta_{hh}$ is defined by $\theta_{hh} = \angle(\mathbf{l}^*_-, \boldsymbol{\beta}_q^{hh})$ with the electron momentum $\mathbf{l}^*_-$ measured in the dilepton center-of-mass system ($\mathbf{q}^* \equiv \mathbf{l}^*_- + \mathbf{l}^*_+ = 0$), while $\boldsymbol{\beta}_q^{hh} = \mathbf{q}^{hh}/q_0^{hh}$ is the velocity of the dilepton cms relative to the h + h cms.

The coefficient $B_i^{hh}$ in Eq. (8.21) describes the anisotropy of the angular distribution ($B_i^{hh} = 0$ in case of isotropy), while $A_i^{hh}$ determines the magnitude of the respective cross section. The absolute value of $B_i^{hh}$ now depends on the choice of the coordinate system. For example, in the Gottfried-Jackson frame the z-axis coincides with the direction of the incident beam in the dimuon rest frame; this definition was adopted in the earlier experiments [334]. Following [336,337] we will use for our analysis the ‘helicity’ system, where the z-axis is chosen along the direction of the virtual photon momentum q in the center-of-mass system (cms) of the colliding hadrons.

The total differential cross section for h + h collisions can now be represented as a sum of the differential cross sections for all channels i,
$$ \frac{d\sigma^{hh}}{dM\, d\cos\theta_{hh}} = \sum_{i=\mathrm{channel}} \frac{d\sigma_i^{hh}}{dM\, d\cos\theta_{hh}} = A^{hh}(M)\left(1 + B^{hh}(M)\cos^2\theta_{hh}\right) , \qquad (8.22) $$
which leads to the total anisotropy coefficient,
$$ B^{hh}(M) = \sum_{i=\mathrm{channel}} \langle B_i^{hh}(M)\rangle , \qquad \langle B_i^{hh}(M)\rangle = \frac{\dfrac{d\sigma_i^{hh}}{dM}\,\dfrac{B_i^{hh}}{1+\frac{1}{3}B_i^{hh}}}{\sum\limits_{j=\mathrm{channel}} \dfrac{d\sigma_j^{hh}}{dM}\,\dfrac{1}{1+\frac{1}{3}B_j^{hh}}} , \qquad (8.23) $$
where the special weighting factors originate from the necessary angle integrations. Thus, the anisotropy coefficient $B^{hh}$ for h + h reactions is the sum of the ‘weighted’ anisotropy coefficients $\langle B_i^{hh}\rangle$ for each channel i, obtained by means of the convolution of $B_i^{hh}$ with the corresponding invariant mass distribution (cf. [338]).
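The weighting in Eq. (8.23) can be made concrete with a short Python sketch. It assumes that the weights arise from integrating Eq. (8.21) over cos θ from -1 to 1, which gives dσ_i/dM = 2A_i(1 + B_i/3); the channel names and numbers below are invented purely for illustration and are not results of the calculations discussed here.

```python
def weighted_anisotropies(channels):
    """Map channel name -> weighted coefficient <B_i> at fixed invariant mass M.

    channels: dict name -> (dsigma_dM, B_i), with B_i in [-1, 1].
    """
    norm = sum(ds / (1.0 + b / 3.0) for ds, b in channels.values())
    return {
        name: ds * b / (1.0 + b / 3.0) / norm
        for name, (ds, b) in channels.items()
    }


def total_anisotropy(channels):
    """Total B(M) as the sum of the weighted channel coefficients."""
    return sum(weighted_anisotropies(channels).values())


def total_anisotropy_direct(channels):
    """Cross-check: add the angular distributions A_i (1 + B_i cos^2 theta) directly."""
    amplitudes = {n: ds / (2.0 * (1.0 + b / 3.0)) for n, (ds, b) in channels.items()}
    a_total = sum(amplitudes.values())
    return sum(amplitudes[n] * channels[n][1] for n in channels) / a_total


if __name__ == "__main__":
    # Hypothetical cross sections (arbitrary units) and elementary coefficients
    example = {"eta": (2.0, 1.0), "pi+pi-": (0.5, -1.0), "brems": (2.0, 0.0)}
    for name, value in weighted_anisotropies(example).items():
        print(f"<B_{name}> = {value:+.4f}")
    print(f"B total = {total_anisotropy(example):+.4f}")
    print(f"B direct = {total_anisotropy_direct(example):+.4f}")
```

Both routes give the same total coefficient, which shows that a channel with a vanishing elementary coefficient still dilutes the total anisotropy through the normalization.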
The anisotropy coefficients for the bremsstrahlung and the Dalitz-decays of the Δ-resonance and η-meson in pn and pp interactions have been calculated on the basis of a one-boson-exchange model fitted to elastic NN scattering in Ref. [337]. These anisotropy coefficients are a function of the invariant mass M, the masses $m_a$, $m_b$ and the initial invariant energy √s of the hadrons a + b involved in the reaction. As shown in Ref. [336], the anisotropy coefficient for pion annihilation in the ππ cms is given by $B_{\pi^+\pi^-} = -1$. For this particular situation, where there are only two particles in the initial and final states, $\theta_{hh}$ is the angle of the lepton momentum with respect to the pion momentum in the cms of the leptons (or pions), i.e. $\mathbf{q}^* = \mathbf{p}^*_a + \mathbf{p}^*_b = \mathbf{l}^*_+ + \mathbf{l}^*_- = 0$.

The production of dilepton pairs (e⁺e⁻) includes the Dalitz-decays Δ → Ne⁺e⁻, η → γe⁺e⁻, ω → π⁰e⁺e⁻, the bremsstrahlung for charged NN, πN and ππ collisions as well as the direct decays of the vector mesons ρ, ω and φ. Furthermore, the secondary mesonic channels π⁺π⁻ → ρ → e⁺e⁻, $K\bar K$ → φ → e⁺e⁻ and πρ → φ → e⁺e⁻ are taken into account, too. The decay ratios are taken from the particle data table [341], whereas the formfactors for the Dalitz-decays are adopted from Landsberg [265] (cf. Section 6).
For heavy-ion reactions the situation becomes more complicated due to the nuclear dynamics and the explicit time evolution of the interacting system. For the calculation of the nuclear anisotropy coefficient we start from the fact that the form of the angular distribution for all ‘elementary’ hadron-hadron interactions a + b that occur in the nucleus-nucleus reaction A + B is known from the microscopic calculations [336-338]. In case of A + A reactions we have to take into account that the ‘elementary’ a + b cms (in which the $B_i^{ab}$ are computed) are moving relative to the A + B cms. This implies that the direction of the virtual photon momentum $\mathbf{q}^{ab}$ relative to the hadron-hadron cms a + b is different for each elementary interaction a + b and cannot be restored from the experimental data. However, the direction of the virtual photon momentum q in the nucleus-nucleus cms A + B can be reconstructed for each lepton pair. Thus, it is necessary to perform an angular transformation from $\theta_{ab}$ to θ, where θ is the angle between the lepton momentum $\mathbf{l}^*$ and the virtual photon momentum in the cms of the colliding nuclei A + B.

At high energies (∼200A GeV) the main contributions to the dilepton spectra come from the Dalitz-decays of η- and ω-mesons, π⁺π⁻ annihilation and the direct decay of the vector mesons ρ, ω, φ (cf. Section 6). Furthermore, the polarization of vector mesons created in direct nucleon-nucleon reactions is practically zero according to the measurements of Blobel et al. [342]. This is due to the fact that the exclusive cross section NN → NNρ, ω, φ is very small compared to the inclusive cross section NN → NNρ, ω, φ + X containing several pions in the final channel, too. Consequently, we do not consider the contribution to the anisotropy coefficient from the decays of vector mesons produced in primary NN collisions. Thus, there are only three dominant channels at intermediate and high energies, i.e. 
the Dalitz-decays of η- and ω-mesons as well as π⁺π⁻ annihilation, that have to be taken into account in the analysis.

For the computation of the nuclear anisotropy coefficient we simulate explicitly lepton events for each channel i with fixed angular distribution in the individual a + b system and then transform the lepton and photon 4-momenta to the A + B system. For example, for the Dalitz-decay of the η-meson, the lepton angular distribution in the rest frame of the η-meson has the form (8.19) with $B_\eta^{ab} = +1$ [336], where $\theta_{ab}$ is the angle between the lepton momentum $\mathbf{l}^*$ and the direction of $\mathbf{q}_\eta$, while $\mathbf{q}_\eta$ is the virtual photon momentum in the rest frame of the η. We also use that the distribution of the virtual photon momentum $\mathbf{q}_\eta$ in the rest frame of the η is isotropic. Thus, for each η event with energy $E_\eta$ and momentum $\mathbf{P}_\eta$ in the nucleus cms A + B we generate lepton events distributed with respect to $\mathbf{q}_\eta$ according to Eq. (8.19) with $B_\eta^{ab} = +1$ in the lepton cms. The next step then is the Lorentz transformation from the η rest frame to the cms of A + B: $q = L(\mathbf{P}_\eta/E_\eta)\, q_\eta$, where $q_{\eta 0} = (m_\eta^2 + M^2)/2m_\eta$, $|\mathbf{q}_\eta| = \sqrt{\lambda(m_\eta^2, M^2, 0)}/2m_\eta$. Finally we compute the angle θ between $\mathbf{l}^*$ and the direction of the photon momentum q in the cms of A + B as
$$ \cos\theta = \frac{\mathbf{l}^* \cdot \boldsymbol{\beta}_q}{|\mathbf{l}^*|\,|\boldsymbol{\beta}_q|} , \qquad (8.24) $$
where $\boldsymbol{\beta}_q = \mathbf{q}/q_0$ and $|\mathbf{l}^*| = M/2$ for fixed invariant mass M. Thus the angular distribution as a function of cos θ for all generated lepton events is recovered and $B_\eta$ can be computed via
$$ B_\eta = \frac{W(\theta = 0^\circ)}{W(\theta = 90^\circ)} - 1 . \qquad (8.25) $$
The weighted $\langle B_\eta\rangle$ then follows from Eq. (8.23).
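Two ingredients of this event generation can be sketched in Python: the virtual photon kinematics in the η rest frame and the sampling of lepton angles according to Eq. (8.19). The sketch deliberately omits the Lorentz transformation to the A + B cms, so it only treats an η at rest, and it recovers B from the second moment of cos θ instead of the ratio in Eq. (8.25), which is a substitution made here for numerical stability. The function names, the rounded η mass and the sample sizes are assumptions of this illustration.

```python
import math
import random

M_ETA = 0.5479  # eta mass in GeV (rounded; assumption of this sketch)


def virtual_photon_in_eta_frame(M, m_eta=M_ETA):
    """Energy and |momentum| of a virtual photon of mass M from eta -> gamma gamma*."""
    if not 0.0 < M < m_eta:
        raise ValueError("need 0 < M < m_eta")
    q0 = (m_eta**2 + M**2) / (2.0 * m_eta)
    # lambda(x, y, 0) = (x - y)^2, hence sqrt(lambda(m^2, M^2, 0)) = m^2 - M^2
    q_abs = (m_eta**2 - M**2) / (2.0 * m_eta)
    return q0, q_abs


def sample_cos_theta(B, rng):
    """Draw cos(theta) from W ~ 1 + B cos^2(theta) by rejection sampling."""
    w_max = 1.0 + max(B, 0.0)
    while True:
        c = rng.uniform(-1.0, 1.0)
        if rng.uniform(0.0, w_max) <= 1.0 + B * c * c:
            return c


def estimate_B(cos_values):
    """Moment estimator: <cos^2> = (1/3 + B/5) / (1 + B/3), solved for B."""
    m = sum(c * c for c in cos_values) / len(cos_values)
    return 5.0 * (3.0 * m - 1.0) / (3.0 - 5.0 * m)


if __name__ == "__main__":
    rng = random.Random(12345)
    q0, q_abs = virtual_photon_in_eta_frame(M=0.2)
    print(f"q0 = {q0:.4f} GeV, |q| = {q_abs:.4f} GeV")
    events = [sample_cos_theta(+1.0, rng) for _ in range(100_000)]
    print(f"recovered B = {estimate_B(events):.2f}")
```

For an η at rest in the A + B cms the recovered coefficient should be close to the elementary value +1; the change of B for moving η-mesons enters only through the omitted boost and the angle of Eq. (8.24).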
Fig. 8.14. The weighted anisotropy coefficients $\langle B_i(M)\rangle$ for p + Be collisions at the bombarding energy of 2.1 GeV and Ca + Ca collisions at the bombarding energy of 2.0A GeV. The ‘η’ denotes the contribution of the η-channel, the ‘Δ’ labels the contribution of the Δ Dalitz-decay, ‘pn’ the proton-neutron bremsstrahlung, and ‘π⁺π⁻’ the pion annihilation channel. The solid curves (denoted by ‘all’) show the sum of all sources. The figure is taken from Ref. [152].
In a similar way we calculate the anisotropy coefficient for the pion annihilation channel by using that the anisotropy coefficient in the cms of the leptons (or pions) is $B^{ab}_{\pi^+\pi^-} = -1$ [336]. For each pion annihilation event we then restore the virtual photon energy and momentum in the cms of A + B: $q_0 = E_a + E_b$, $\mathbf{q} = \mathbf{p}_a + \mathbf{p}_b$, where $E_a$, $E_b$, $\mathbf{p}_a$, $\mathbf{p}_b$ are the energies and momenta of the pions in the cms of A + B. Then we perform a Lorentz transformation for the pion momentum from the A + B cms to the lepton cms: $\mathbf{p}^*_a = L(\mathbf{q}/q_0)\,\mathbf{p}_a$. In the lepton cms we generate the lepton events with $|\mathbf{l}^*| = M/2$ ($M^2 = q_0^2 - \mathbf{q}^2$) and angular distribution $W(\theta_{ab}) = 1 - \cos^2\theta_{ab}$, where $\theta_{ab}$ is the angle between $\mathbf{l}^*$ and the pion momentum $\mathbf{p}^*_a$ in the lepton cms, $\cos\theta_{ab} = (\mathbf{l}^* \cdot \mathbf{p}^*_a)/(|\mathbf{l}^*|\,|\mathbf{p}^*_a|)$. For each selected lepton event the angle θ can be calculated according to Eq. (8.24). The weighted coefficient $\langle B_{\pi^+\pi^-}\rangle$ can be computed in a similar manner as in the η case. The anisotropy coefficient for the Dalitz-decay of the ω-meson, furthermore, is computed in analogy to the η case using the elementary coefficient $B_\omega^{ab} = +1$. This discrete algorithm is close to an experimental event-by-event analysis and includes the full hadron dynamics.

8.3.2. Numerical results for p + A and A + A reactions

In this section we present the results of our calculations for the dilepton spectra and the anisotropy coefficient for the systems p + Be, Ca + Ca, S + Au, Au + Au from BEVALAC to SPS
Fig. 8.15. The weighted anisotropy coefficients $\langle B_i(M)\rangle$ for p + Be and central Au + Au collisions at 10A GeV; the notation is the same as in Fig. 8.14. The figure is taken from Ref. [152].
energies without employing any medium effects for the mesons, i.e. $U_m^S = 0$, $U_m^\mu = 0$ in the set of transport equations Eq. (3.1) for all mesons.

In Fig. 8.14 we show the computed weighted anisotropy coefficients $\langle B_i(M)\rangle$ for p + Be and Ca + Ca collisions at the bombarding energy of 2A GeV. We note that the inclusive dilepton spectra for the reactions have been presented in Ref. [283] together with a ‘cocktail’ decomposition. The main contributions arise from the η and Δ Dalitz decays due to their large ‘elementary’ anisotropy coefficients and cross sections, respectively. The contribution from pn bremsstrahlung is practically zero at all invariant masses due to a small ‘elementary’ anisotropy coefficient. The weighted coefficient from π⁺π⁻ annihilation is rather small (≈ 0.1) even for the Ca + Ca reaction and decreases for $M \ge m_\rho$ due to the threshold behaviour of the cross section. However, compared to $\langle B_{\pi^+\pi^-}(M)\rangle$ for p + Be, where pion annihilation is very low, a clear (but moderate) enhancement can be extracted.

In Fig. 8.15 we display the weighted anisotropy coefficients $\langle B_i(M)\rangle$ for p + Be and Au + Au collisions at 10A GeV using the same notation as in Fig. 8.14. The main contribution arises from the η Dalitz-decays, whereas the contribution from the ω Dalitz-decay is quite small due to its reduced cross section. The weighted anisotropy coefficient from pion annihilation for p + Be is zero for the same reason. However, the $\langle B_{\pi^+\pi^-}(M)\rangle$ for Au + Au is zero because the angular distribution of the annihilating pions relative to the q direction becomes practically isotropic at
Fig. 8.16. The weighted anisotropy coefficients $\langle B_i(M)\rangle$ for p + Be collisions at 450A GeV, S + Au at 200A GeV and Au + Au at 160A GeV; the notation is the same as in Fig. 8.14. The figure is taken from Ref. [152].
this energy, which leads automatically to an isotropic dilepton spectrum. Obviously, this effect is related to our choice of the z-axis along q. One can show, e.g. in the Gottfried-Jackson frame, that the distribution of annihilating pions is anisotropic and $\langle B_{\pi^+\pi^-}(M)\rangle$ is small and negative at this energy.

In Fig. 8.16 we present the weighted anisotropy coefficients $\langle B_i(M)\rangle$ for p + Be collisions at 450A GeV, S + Au at 200A GeV and Au + Au at 160A GeV. For p + Be the situation is similar to the previous cases at lower energies: the main contribution at small invariant masses comes from the η Dalitz decay; $\langle B_{\pi^+\pi^-}(M)\rangle$ is approximately zero due to the small cross section. For nucleus-nucleus collisions, however, the contribution of the pion annihilation channel becomes more essential. As seen from Fig. 8.16, contrary to the lower bombarding energies, the weighted anisotropy coefficient for pion annihilation is negative, an effect related to some differences in the pion dynamics at low and high energy.

Summarizing this section, we find that the dilepton anisotropy coefficient $\langle B\rangle$ is different from zero in p + A and A + A collisions at all bombarding energies. Thus, the calculated anisotropy
coefficients for nucleus-nucleus collisions suggest that the dilepton decay anisotropy may serve as an additional observable to decompose the dilepton spectra into the various sources and to obtain further information on the reaction dynamics. We note, however, that our transport calculations correspond to the full acceptance for the lepton pairs. Cuts in the transverse momenta, in particular, affect the anisotropies sensitively. This should be borne in mind when analyzing experimental data with a limited acceptance.

8.4. Stepping towards RHIC energies

Nucleus-nucleus collisions with initial energies per nucleon of √s = 200 GeV or ≈ 21.5A TeV will be available soon at the Relativistic Heavy-Ion Collider (RHIC) in Brookhaven. In central collisions of Au + Au, energy densities above 5 GeV/fm³ are expected here, such that the critical energy density for a QGP phase (cf. Section 2.2) should be overcome in considerable space-time volumes where the relevant degrees of freedom are partons (quarks and gluons). Parton cascade calculations have been used so far [343-345] to estimate the energy densities and particle production yields in violent reactions at √s = 200 GeV, an order of magnitude higher than at SPS energies.

Intuitively one expects that the initial nonequilibrium phase of a nucleus-nucleus collision at RHIC energies should be described by parton degrees of freedom, whereas hadrons are only formed (by ‘condensation’) at a later stage of the reaction, which might be a couple of fm/c from the initial contact of the heavy ions. Thus parton cascade calculations, including transition rates from perturbative QCD, should be adequate for all initial reactions involving a large 4-momentum transfer between the constituents, since QCD is well tested in its short distance properties. The question, however, remains to which extent the parton calculations can be extrapolated to low Q², where hadronic scales become important. 
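The correspondence between √s = 200 GeV per nucleon pair and a fixed-target energy of roughly 21A TeV, quoted at the beginning of this subsection, follows from two-body kinematics, s = 2m² + 2mE_lab. The Python sketch below assumes a nucleon mass of 0.938 GeV and uses hypothetical function names; the precise number depends on the mass value adopted, so it is meant as an order-of-magnitude check rather than a reproduction of the quoted figure.

```python
import math

M_NUCLEON = 0.938  # GeV, approximate nucleon mass (assumption of this sketch)


def lab_energy_per_nucleon(sqrt_s, m=M_NUCLEON):
    """Total projectile energy per nucleon (GeV) on a fixed target: s = 2 m^2 + 2 m E_lab."""
    return (sqrt_s**2 - 2.0 * m**2) / (2.0 * m)


def sqrt_s_from_lab_energy(e_lab, m=M_NUCLEON):
    """Inverse relation: invariant energy per nucleon pair for a given lab energy."""
    return math.sqrt(2.0 * m**2 + 2.0 * m * e_lab)


if __name__ == "__main__":
    e_lab = lab_energy_per_nucleon(200.0)
    print(f"sqrt(s) = 200 GeV  ->  E_lab = {e_lab / 1000.0:.1f} A TeV")
    # For comparison: a 200A GeV fixed-target beam, as at the SPS
    print(f"E_lab = 200 GeV   ->  sqrt(s) = {sqrt_s_from_lab_energy(200.0):.1f} GeV")
```

The second line of output illustrates the statement that √s at RHIC is an order of magnitude higher than at SPS energies.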
As a rough estimate one can employ here the average mass of vector mesons, the nucleon and its first excited state, which gives Q² ≈ 1 GeV². On the other hand, using the uncertainty relation this implies timescales of < 0.2 fm/c or relative separations of partons < 0.2 fm, which are small compared to the hadronic size or average lifetime of the ρ, Δ, etc. in free space or the formation time of hadrons $t_F$ ≈ 0.7-0.8 fm/c as used in the HSD transport approach. Turning the argument around, a nonequilibrium hadronic approach involving a timescale of 0.7-0.8 fm/c cannot tell anything about shorter times because the uncertainty relation does not allow one to distinguish states which are separated in mass by less than ≈ 300 MeV, which is the N-Δ mass difference. Thus one faces the problem that neither the parton description nor a nonequilibrium hadronic model should be valid for times 0.2 fm/c ≤ t ≤ 0.7-0.8 fm/c in individual hadronic reactions, which corresponds to the nonperturbative formation time of the hadronic wavefunction. This regime of the ‘soft’ QCD physics is presently not understood and appropriate dynamical models are urgently needed.

In spite of these apparent difficulties, the practical question is now whether nonequilibrium partonic and hadronic models can be distinguished at all, i.e. do they lead to different predictions for experimental observables? In fact, first applications of the parton cascade model developed by Geiger [344] to nucleus-nucleus collisions at SPS energies [346] show that a reasonable description of the meson and baryon rapidity distributions can also be achieved on the basis of partonic degrees of freedom. We will thus investigate in the following if similar observations also hold at RHIC
Fig. 8.17. The proton (L.H.S.), π⁺ and K⁺ (R.H.S.) rapidity distributions in the cms from the hadronic transport approach HSD (solid histograms) and the parton cascade VNI [347] (dashed histograms) for pp collisions at √s = 200 GeV.

Fig. 8.18. The proton (i.e. $p - \bar p$), antiproton (L.H.S.), π⁺ and K⁺ (R.H.S.) rapidity distributions in the cms from the hadronic transport approach HSD (solid histograms) and the parton cascade VNI [347] (dashed histograms) for central (b ≤ 2 fm) Au + Au collisions at √s = 200 GeV.
energies by comparing the predictions from the parton cascade [347] with those from the HSD transport approach.

We start with pp collisions at √s = 200 GeV. The calculated results for the proton, π⁺ and K⁺ rapidity distributions in the cms are shown in Fig. 8.17 for both models, which are denoted individually by the labels VNI and HSD in obvious notation. Already on the level of pp collisions we find considerable differences between the two kinetic models. The parton cascade shows a higher amount of proton stopping than the hadronic model (l.h.s.) and a higher production of π⁺ and K⁺ mesons (r.h.s.), because the energy taken from the relative motion of the leading
Fig. 8.19. The number of baryon-baryon and meson-baryon collisions as a function of the invariant energy √s for central (b ≤ 2 fm) Au + Au collisions at √s = 200 GeV in the HSD model.
baryons is converted to the production of mesons. We note that the parton cascade (VNI) has been run with the same set of default parameters that also describe nucleus-nucleus collisions at SPS energies [346], as in case of the HSD approach. Furthermore, the rapidity distributions from the parton cascade for produced particles are broader than those of the HSD approach, which in this case is equivalent to the LUND string model [90]. Experiments will first have to clarify for pp collisions which is the more appropriate description.

We directly step towards central collisions (b ≤ 2 fm) for Au + Au at √s = 200 GeV. The calculated results for the proton (here $p - \bar p$), antiproton, π⁺ and K⁺ rapidity distributions in the cms are shown in Fig. 8.18 for both models, which are denoted again by the labels VNI and HSD. Here the hadronic model shows a larger stopping than the parton cascade (l.h.s.) and a flatter distribution in rapidity of antiprotons than the partonic cascade. The pion and kaon multiplicities turn out to be roughly the same, but the results from the parton cascade are more strongly peaked around midrapidity than those from the HSD approach. These differences are large enough to allow for experimental distinction provided that the experimental acceptance in rapidity is sufficiently large.

In physical terms the larger stopping of the HSD approach and the additional production of pions and kaons (relative to pp collisions) stems from secondary and ternary reaction channels in the hadronic rescattering phase, which are quite abundant since the meson densities achieved are very high. The narrow width of the antiproton and meson rapidity distribution (as compared to pp collisions) is due to an approximate thermalization of the partonic degrees of freedom; in the hadronic scenario essentially ‘comover’ scattering occurs with a low change of the meson rapidity distribution. Thus the meson rapidity distributions are practically the same as for pp collisions. 
Also note that at midrapidity the net baryon density $\sim N_p - N_{\bar p}$ is practically zero; however, even at midrapidity a lot of baryons appear that are produced together with antibaryons. Thus mesons (especially $c\bar c$ pairs) will also encounter a lot of baryons and antibaryons on their way to the continuum.

The amount of higher order hadronic rescattering processes is depicted in Fig. 8.19 as emerging from the HSD calculation, where the number of baryon-baryon and meson-baryon collisions is
Fig. 8.20. The soft dilepton multiplicity for invariant masses M ≤ 1.4 GeV for central Au + Au collisions (b ≤ 2 fm) for bombarding energies from 2A GeV to 21.5A TeV without incorporating any medium effects as discussed in Section 6.
shown as a function of the invariant energy √s. Apart from the initial small peak at √s = 200 GeV, a substantial amount of intermediate and low energy rescattering processes with maxima at 2.5 GeV and 1.8 GeV is found, which essentially stand for flavor exchange processes, multiple pion production in mB and BB collisions as well as secondary strangeness production channels. This has to be kept in mind when comparing to pp and pA reactions at √s = 200 GeV.

Soft dilepton pairs will also be measured at RHIC. In Fig. 8.20 we show the calculated soft dilepton multiplicity for invariant masses M ≤ 1.4 GeV, integrated over rapidity and transverse momentum, for central Au + Au collisions (b ≤ 2 fm) for bombarding energies from 2A GeV to 21.5A TeV without incorporating any medium effects as discussed in Section 6. The global shape of the dilepton spectrum is roughly the same in the whole energy domain, showing essentially an increase in the meson abundancy and a steeper rise in the φ mass region due to more abundant
φ production with increasing bombarding energy. In the present calculations we have adopted a mass resolution ΔM = 10 MeV, which might be considered as a lower limit for the experiments at RHIC.

We have, furthermore, performed calculations for the differential dilepton multiplicity including the ρ spectral function from Ref. [262] using the same methods as discussed in Section 6. Without explicit representation we mention that at all energies an enhancement for invariant masses around 0.5 GeV is found which is most pronounced, however, at AGS energies (10A GeV) (cf. [155]). Though low mass dilepton pairs are more abundant at RHIC energies and essentially better statistics can be achieved as compared to lower bombarding energies, RHIC energies are not well suited e.g. for in-medium ρ spectroscopy.

We have also performed calculations for high mass dimuons following Section 7, where especially the J/Ψ and Ψ′ peaks are of interest. Without explicit representation we note that the dimuon spectra are similar in shape to Fig. 7.8 (S + W at 200A GeV); however, in the ‘comover’ absorption scenario the Ψ′ peak vanishes almost completely, whereas the J/Ψ peak is suppressed by ≈ 90% in central Au + Au collisions due to the high meson densities achieved at these energies. We note that we have used the same charmonium absorption cross sections on baryons and mesons as in
W. Cassing, E.L. Bratkovskaya / Physics Reports 308 (1999) 65—233
model II of Section 7; it might well be that especially the absorption on mesons at RHIC energies will occur with even higher cross sections, leading to a complete disappearance of the J/Ψ signal in central Au+Au collisions. Thus the J/Ψ signal no longer appears as a promising criterion for the presence of an intermediate QGP phase, also at RHIC energies. In summarizing this section, we have pointed out a couple of perspectives for experiments to be performed in the near future. We find that by increasing the bombarding energy from SIS to RHIC energies the ‘classical’ observables, i.e. strangeness enhancement, low mass dilepton enhancement and charmonium suppression, do not qualify so far as a ‘smoking gun’ for a restoration of chiral symmetry at high baryon density and temperature or a phase transition to the QGP phase according to our studies within the hadronic transport approach. Only a systematic measurement of vacuum production amplitudes and differential meson and dilepton cross sections under controlled circumstances (πA and pA reactions) as well as excitation functions for central nucleus-nucleus collisions will provide the necessary steps towards a solution of the questions raised in the introduction of this work. Furthermore, hadronic observables, as shown in Section 8.4, also qualify to disentangle the relevant degrees of freedom and the amount of thermalization by comparing pp, pA and central AA collisions at the highest bombarding energies.
9. Summary

In this review we have discussed the issue of chiral symmetry restoration at high baryon density and/or temperature as well as a possible phase transition from an interacting hadron gas to a quark-gluon plasma (QGP). Both transitions occur at the same critical temperature according to lattice calculations at vanishing quark chemical potential μ_q. Extrapolations to finite density within chiral perturbation theory (ChPT), effective low energy Lagrangians (with the same symmetry breaking terms as QCD) and arguments based on scale invariance of QCD (in the classical limit) as well as results from QCD sum rules have been briefly reviewed. All these extrapolations to high density and temperature have to be taken with care; however, they all indicate that the hadron properties should change dramatically in the dense nuclear medium, especially close to the phase transition. These changes are reflected in their spectral function at finite density, temperature and 3-momentum, where the real part of the self-energy indicates a drop or rise in the pole position, i.e. the mass, whereas the imaginary part reflects the inverse lifetime of the hadron. In simple terms the questions reduce to: do the hadrons melt or drop in mass when approaching a phase of chiral symmetry?

The properties of hadrons so far have been explored experimentally by means of proton-nucleus and nucleus-nucleus collisions from SIS to SPS energies, i.e. from 1 to 200A GeV. Conclusions about the hadron properties at high temperature or baryon densities have been based on the comparison of the experimental data with nonequilibrium kinetic transport theory denoted by BUU, IQMD, RBUU, RQMD, ART, ARC, UrQMD or HSD, respectively.
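The role of the real and imaginary parts of the self-energy described above can be made explicit with a generic textbook form of a hadron spectral function. This is only a sketch under assumed conventions: a retarded self-energy Π with Im Π ≤ 0, a vacuum mass m_h, and the symbols A_h, m_h^* and Γ_h, which are introduced here for illustration. It is not necessarily the parametrization used in the calculations reviewed in this work.

```latex
\begin{align}
A_h(\omega,\mathbf{p}) &= -\frac{1}{\pi}\,
  \frac{\mathrm{Im}\,\Pi(\omega,\mathbf{p})}
       {\left[\omega^2-\mathbf{p}^2-m_h^2-\mathrm{Re}\,\Pi(\omega,\mathbf{p})\right]^2
        +\left[\mathrm{Im}\,\Pi(\omega,\mathbf{p})\right]^2},\\
m_h^{*2} &\simeq m_h^2+\mathrm{Re}\,\Pi, \qquad
\Gamma_h \simeq -\frac{\mathrm{Im}\,\Pi}{m_h^{*}} .
\end{align}
```

In this form a negative Re Π lowers the pole position (a ‘dropping’ mass), whereas a large |Im Π| broadens the peak and shortens the lifetime (a ‘melting’ hadron).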
Here our analysis has been predominantly based on the covariant HSD approach (Hadron-String-Dynamics), which incorporates momentum-dependent self-energies for baryons and, in a first approximation, density-dependent scalar self-energies for the mesons in line with the results from chiral effective Lagrangians (cf. Section 3). The in-medium hadron production and scattering cross sections have been approximated by their values in free space as a function of the invariant energy above the
production threshold (√s − √s_0). Our modelling of in-medium production and scattering rates is based on the assumption that all the reactions are dominated by phase space; this hypothesis has to be investigated in more detail in the future; however, it holds quite well for ‘free’ processes. We have then performed detailed investigations on proton, pion and η spectra and rapidity distributions for p+A and A+A collisions from 1 to 200A GeV in comparison to the available experimental data. Since all data so far can be described reasonably well without including any self-energies for the pseudoscalar mesons π and η, we conclude that pion or η self-energies should be weak in the dense medium and might be neglected to a good approximation. However, at small transverse momenta the experimental data indicate an excess of pions for heavy systems, which points towards attractive potentials for the pion at low momenta. This issue will have to be reinvestigated in the future with more refined models. The production of strangeness in terms of kaons, antikaons and hyperons as well as the production of antiprotons has shown clear evidence for rather strong attractive antikaon and antiproton potentials at finite baryon density at ‘subthreshold’ bombarding energies per nucleon. This conclusion is rather model independent; however, an explicit momentum dependence of the self-energies has not been considered in the present models and more exclusive data (also from p+A reactions) are needed to pin down this momentum dependence explicitly. On the other hand, the kaon flow data point towards a slightly repulsive potential of K⁺ mesons in the medium as expected from the chiral Lagrangian models. The production of kaons and antikaons at AGS energies (11 to 15A GeV), however, is not well reproduced within the HSD transport approach even when including self-energies in line with the analysis at SIS energies.
One might speculate that something ‘new’ happens for nucleus-nucleus collisions at these energies, but before drawing final conclusions more detailed investigations of strangeness production in the energy range from 1 to 10A GeV are needed with cuts on the centrality of the reactions. In fact, corresponding studies at the SIS and AGS will be performed in the near future. At SPS energies the production of strangeness is reasonably described for ‘free’ as well as ‘in-medium’ kaons, antikaons and hyperons due to a sizeable amount of rescattering processes. Self-energy effects for these particles can be seen in slightly broader rapidity distributions, but the present data are compatible with both scenarios. On the other hand, the strangeness enhancement is ruled out as a clear signature for a phase transition to a QGP. We have then investigated the production of dileptons in comparison to the available data from 1 to 200A GeV. Absolute normalizations of the spectra here are obtained by the low mass dilepton yield, which is dominated by the pion and η Dalitz decays. This is an important issue since within the same transport approach we can investigate if the π and η yields are consistent with the measurements from other collaborations. On the other hand, the properties of vector mesons can be studied by their dileptonic decay in the dense medium. Here the ρ-meson spectroscopy is of primary interest in the case of nucleus-nucleus collisions. In fact, the data of the CERES and HELIOS-3 Collaborations cannot be explained by ‘free’ or vacuum form factors, as has been found by a couple of groups. A ‘dropping’ mass for the ρ meson in line with the scaling hypothesis of Brown and Rho or according to the sum rule analysis of Hatsuda and Lee is much better in line with the data at SPS energies. However, approaches based on more conventional hadronic interactions, such as pion polarizations and meson-nucleon scattering amplitudes, describe the present dilepton spectra equally well.
Thus the question of a dropping or melting ρ-meson in the medium cannot be answered uniquely on the basis of the present data sets. We have argued, furthermore,
that the dilepton yield between the ω and φ mass should be well suited to disentangle the different scenarios provided that a sufficiently high resolution in mass is achieved experimentally. Furthermore, the charmonium production and suppression in proton-nucleus and nucleus-nucleus collisions has been investigated within the transport approach in order to probe a possible transition to a quark-gluon plasma (QGP) phase. The detailed comparisons with experimental data on an event-by-event basis for p+A and A+A reactions, however, show that all data are compatible with an additional ‘comover’ absorption scenario. Alternatively, the extra suppression in nucleus-nucleus collisions can be described by a prehadronic dissociation of color octet cc̄ states in the strong chromoelectric field of the neighbouring strings generated by the hard hadronic reactions. All hadronic scenarios show a smooth increase of J/Ψ and Ψ′ suppression with centrality or transverse energy. The presently reported step in the J/Ψ absorption versus E_T by the NA50 Collaboration, if confirmed in an independent experiment with good statistics, indeed would indicate a nonhadronic scenario. We have, furthermore, presented detailed predictions for a couple of experiments to be performed in the near future; e.g. meson m_T-spectra at SIS energies should provide valuable information on the in-medium properties of the heavier mesons once the pion m_T-spectra are measured in the same experiment for reference. Dilepton spectra from pion-nucleus reactions are well suited to learn especially about the in-medium properties of the ω- and φ-mesons, since the vector mesons can be produced with low momenta with respect to the residual nucleus, as suggested by Metag and collaborators. Additionally, dilepton angular anisotropies help in disentangling the various dilepton sources and provide further information from the same measured data set.
Finally, we have argued that at RHIC energies already pion and kaon spectra and rapidity distributions will differentiate between a partonic (quarks and gluons) and a hadronic scenario (mesons and baryons), while soft dileptons as well as J/Ψ suppression appear less conclusive. Thus presently there are no stringent indications for a ‘smoking gun’ showing the restoration of chiral symmetry at high baryon density and temperature or a phase transition to the QGP. We recall that the restoration of chiral symmetry might also be realized by a mixing of the vector and axial vector particles at high density and temperature, which might be hard to investigate experimentally. As is characteristic of the field of heavy-ion physics, it is the ‘whole picture’ that will provide first ‘additive’ experimental information, which finally converges towards a subtle understanding of the nature of the strong interaction and the phases of its constituents. There is still a long, but interesting and exciting, way to go.
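As a worked illustration of the invariant energy above threshold, √s − √s_0, and of the ‘subthreshold’ bombarding energies referred to in this summary, the following minimal sketch evaluates fixed-target kinematics for a free nucleon-nucleon collision. It assumes rounded vacuum masses and neglects Fermi motion and any in-medium potentials; the function names are hypothetical and chosen only for this example, and the sketch is not part of the transport calculations discussed in this review.

```python
import math

# Vacuum masses in GeV (rounded values, used here for illustration only).
M_N = 0.938272       # nucleon (proton)
M_LAMBDA = 1.115683  # Lambda hyperon
M_KAON = 0.493677    # charged kaon


def sqrt_s_fixed_target(t_lab, m_proj=M_N, m_targ=M_N):
    """Invariant energy sqrt(s) in GeV for a projectile with laboratory
    kinetic energy t_lab (GeV) hitting a target at rest."""
    e_proj = t_lab + m_proj
    s = m_proj**2 + m_targ**2 + 2.0 * m_targ * e_proj
    return math.sqrt(s)


def threshold_kinetic_energy(final_masses, m_proj=M_N, m_targ=M_N):
    """Smallest laboratory kinetic energy (GeV) at which sqrt(s) reaches
    sqrt(s_0), the sum of the final-state masses."""
    sqrt_s0 = sum(final_masses)
    return (sqrt_s0**2 - (m_proj + m_targ)**2) / (2.0 * m_targ)


def excess_energy(t_lab, final_masses):
    """sqrt(s) - sqrt(s_0) in GeV; a negative value means the channel is
    closed in a free nucleon-nucleon collision ('subthreshold')."""
    return sqrt_s_fixed_target(t_lab) - sum(final_masses)


if __name__ == "__main__":
    channel = [M_N, M_LAMBDA, M_KAON]  # N N -> N Lambda K+
    t_thr = threshold_kinetic_energy(channel)
    print(f"K+ threshold in free NN collisions: {t_thr:.2f} GeV")
    for t_lab in (1.0, 1.8, 2.0):
        print(f"T_lab = {t_lab:.1f} GeV: sqrt(s) - sqrt(s_0) = "
              f"{excess_energy(t_lab, channel):+.3f} GeV")
```

With these masses the free threshold for the channel N N → N Λ K⁺ comes out at roughly 1.58 GeV, so a bombarding energy of 1A GeV is ‘subthreshold’ for this channel in the sense used above.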
Acknowledgements

We are grateful for many illuminating discussions with our theoretical and experimental colleagues, in particular to H. Bokemeyer, G.E. Brown, A. Drees, C. Gale, C. Gerschel, C. Greiner, B. Friman, K. Haglin, R. Holzmann, J. Hüfner, D. Kharzeev, F. Klingl, L. Kluberg, V. Koch, W. Koenig, W. Kühn, H. Lenske, S. Leupold, G.Q. Li, C. Lourenço, V. Metag, K. Redlich, H. Satz, S.S. Shimanskij, W. Schön, H. Specht, H. Stöcker, M.H. Thoma, R. Vogt, W. Weise, Gy. Wolf and I. Zahed. Furthermore, we would like to thank N. Herrmann, H. Oeschler and P. Senger for many interesting discussions and access to their data prior to publication. Finally we wish to thank especially our colleagues at Giessen, Darmstadt, Cracow, Dubna, Moscow and Texas A&M
University for their work, on which many of the results presented in this review are based, in particular W. Ehehalt, J. Geiss, U. Mosel, A. Sibirtsev, R. Rapp, J. Wambach, B. Kamys, L. Jarczyk, Z. Rudy, G.I. Lykasov, M.V. Rzjanin, O.V. Teryaev, A.I. Titov, V.D. Toneev, Ye.S. Golubeva, L.A. Kondratyuk and C.M. Ko.
References

[1] J.W. Harris, B. Müller, Annu. Rev. Nucl. Part. Sci. 46 (1996) 71.
[2] E.V. Shuryak, Rev. Mod. Phys. 65 (1993) 1.
[3] G.E. Brown, M. Rho, Phys. Rev. Lett. 66 (1991) 2720.
[4] H. Stöcker, W. Greiner, Phys. Rep. 137 (1986) 277.
[5] G.F. Bertsch, S. Das Gupta, Phys. Rep. 160 (1988) 189.
[6] W. Cassing, V. Metag, U. Mosel, K. Niita, Phys. Rep. 188 (1990) 363.
[7] W. Cassing, U. Mosel, Prog. Part. Nucl. Phys. 25 (1990) 235.
[8] B. Blättel, V. Koch, U. Mosel, Rep. Prog. Phys. 56 (1993) 1.
[9] S. Teis, W. Cassing, T. Maruyama, U. Mosel, Phys. Lett. B 319 (1993) 47; Phys. Rev. C 50 (1994) 388.
[10] C.M. Ko, Q. Li, R. Wang, Phys. Rev. Lett. 59 (1987) 1084; C.M. Ko, Q. Li, Phys. Rev. C 37 (1988) 2270.
[11] X.S. Fang, C.M. Ko, G.Q. Li, Y.M. Zheng, Phys. Rev. C 49 (1994) R608; Nucl. Phys. A 575 (1994) 766.
[12] G.Q. Li, C.M. Ko, X.S. Fang, Y.M. Zheng, Phys. Rev. C 49 (1994) 1139.
[13] C.M. Ko, G.Q. Li, J. Phys. G 22 (1996) 1673.
[14] K. Weber, B. Blättel, W. Cassing, H.-C. Dönges, V. Koch, A. Lang, U. Mosel, Nucl. Phys. A 539 (1992) 713.
[15] K. Weber, B. Blättel, W. Cassing, H.-C. Dönges, A. Lang, T. Maruyama, U. Mosel, Nucl. Phys. A 552 (1993) 571.
[16] T. Maruyama, W. Cassing, U. Mosel, S. Teis, K. Weber, Nucl. Phys. A 573 (1994) 653.
[17] J. Aichelin, Phys. Rep. 202 (1991) 233; J. Aichelin, H. Stöcker, Phys. Lett. B 176 (1986) 14; J. Aichelin, G. Peilert, A. Bohnet, A. Rosenhauer, H. Stöcker, W. Greiner, Phys. Rev. C 37 (1988) 2451.
[18] S.W. Huang, A. Faessler, G.Q. Li, D.T. Khoa, E. Lehmann, R.K. Puri, M.A. Matin, N. Ohtsuka, Prog. Part. Nucl. Phys. 30 (1993) 105.
[19] G.Q. Li, A. Faessler, S.W. Huang, Prog. Part. Nucl. Phys. 30 (1993) 159.
[20] H. Sorge, H. Stöcker, W. Greiner, Ann. Phys. 192 (1989) 266; Nucl. Phys. A 498 (1989) 567c; H. Sorge, A.V. Keitz, R. Matiello, H. Stöcker, W. Greiner, Z. Phys. C 47 (1990) 629; Phys. Lett. B 243 (1990) 7; A. Jahns, H. Sorge, H. Stöcker, W. Greiner, Z. Phys. A 341 (1992) 243; H. Sorge, Phys. Rev. Lett. 78 (1997) 2309.
[21] S. Bass, M. Belkacem, M. Bleicher, M. Brandstetter, C. Ernst, L. Gerland, M. Hofmann, S. Hofmann, J. Kanopka, G. Mao, L. Neise, S. Soff, C. Spieles, H. Weber, L.A. Winckelmann, H. Stöcker, W. Greiner, Prog. Part. Nucl. Phys. (1998), to appear.
[22] Y. Pang, T.J. Schlagel, S.H. Kahana, Phys. Rev. Lett. 68 (1992) 2743.
[23] S.H. Kahana, D.E. Kahana, Y. Pang, T.J. Schlagel, Annu. Rev. Nucl. Part. Sci. 46 (1996) 31.
[24] B.A. Li, C.M. Ko, Phys. Rev. C 52 (1995) 2037.
[25] W. Ehehalt, W. Cassing, Nucl. Phys. A 602 (1996) 449.
[26] W. Botermans, R. Malfliet, Phys. Rep. 198 (1990) 115.
[27] R. Malfliet, Prog. Part. Nucl. Phys. 21 (1988) 207.
[28] A. Faessler, Prog. Part. Nucl. Phys. 30 (1993) 229.
[29] G.E. Brown, Prog. Theor. Phys. 91 (1987) 85.
[30] G.E. Brown, C.M. Ko, Z.G. Wu, L.H. Xia, Phys. Rev. C 43 (1991) 1881.
[31] V. Koch, G.E. Brown, Nucl. Phys. A 560 (1993) 345.
[32] V. Koch, Int. J. Mod. Phys. E 6 (1997) 203.
[33] B. Peterson, Nucl. Phys. B 30 (1992) 66.
[34] F. Karsch, Nucl. Phys. B 34 (1993) 63.
[35] J. Engels, Phys. Lett. B 252 (1990) 625; J. Engels, F. Karsch, K. Redlich, Nucl. Phys. B 435 (1995) 295.
[36] F. Karsch, E. Laermann, Phys. Rev. D 50 (1994) 6954.
[37] J.V. Steele, H. Yamagishi, I. Zahed, Phys. Lett. B 384 (1996) 255; Phys. Rev. D 56 (1997) 5605.
[38] J. Gasser, H. Leutwyler, Phys. Lett. B 184 (1987) 83.
[39] U. Meissner, Rep. Prog. Phys. 56 (1993) 903.
[40] H. Leutwyler, Ann. of Phys. 235 (1994) 165.
[41] J.M. Häuser, W. Cassing, S. Leupold, M.H. Thoma, Ann. of Phys. 265 (1998) 155.
[42] G. Boyd et al., Phys. Rev. Lett. 75 (1995) 4169; Phys. Lett. B 349 (1995) 170; Nucl. Phys. B 469 (1996) 419.
[43] E. Laermann, Nucl. Phys. A 610 (1996) 1c.
[44] F. Karsch, Nucl. Phys. B (Proc. Suppl.) 34 (1994) 63; Proc. CORINNE II (Nantes 1994), World Scientific, Singapore, 1995, p. 203.
[45] C. Bernard et al., Nucl. Phys. (Proc. Suppl.) 53 (1997) 442.
[46] C.M. Hung, E.V. Shuryak, Phys. Rev. Lett. 75 (1995) 4003.
[47] D. Rischke, Nucl. Phys. A 610 (1996) 88c.
[48] G. Baym, S.A. Chin, Phys. Lett. B 62 (1976) 241.
[49] B. Friedman, L. McLerran, Phys. Rev. D 17 (1978) 1109.
[50] E.V. Shuryak, Phys. Rep. 61 (1980) 71.
[51] E.G. Drukarev, E.M. Levin, Nucl. Phys. A 511 (1990) 679; Prog. Part. Nucl. Phys. 27 (1991) 77.
[52] J. Gasser, H. Leutwyler, M.E. Sainio, Phys. Lett. B 253 (1991) 252.
[53] I. Zahed, G.E. Brown, Phys. Rep. 142 (1986) 1.
[54] M. Bando, T. Kugo, K. Yamawaki, Phys. Rep. 164 (1988) 217.
[55] T.D. Cohen, R.J. Furnstahl, D.K. Griegel, X. Jin, Prog. Part. Nucl. Phys. 35 (1995) 221.
[56] T. Hatsuda, S.H. Lee, Phys. Rev. C 46 (1992) R34.
[57] X. Jin, D.B. Leinweber, Phys. Rev. C 52 (1995) 3344.
[58] C. Gale, J.I. Kapusta, Nucl. Phys. B 357 (1991) 65.
[59] J.I. Kapusta, E.V. Shuryak, Phys. Rev. D 49 (1994) 4694.
[60] R.D. Pisarski, Phys. Rev. D 52 (1995) R3773.
[61] F. Klingl, N. Kaiser, W. Weise, Nucl. Phys. A 624 (1997) 527.
[62] L.J. Reinders, H. Rubinstein, S. Yazaki, Phys. Rep. 127 (1985) 1.
[63] M.A. Shifman, A.I. Vainshtein, V.I. Zakharov, Nucl. Phys. B 147 (1979) 385.
[64] S. Leupold, W. Peters, U. Mosel, Nucl. Phys. A 628 (1998) 311.
[65] M.C. Birse, J. Phys. G 20 (1994) 1537.
[66] S.P. Klevansky, Rev. Mod. Phys. 64 (1992) 649.
[67] U. Vogl, W. Weise, Prog. Part. Nucl. Phys. 27 (1991) 195.
[68] S. Klimt, M. Lutz, U. Vogl, W. Weise, Nucl. Phys. A 516 (1990) 429.
[69] U. Vogl, M. Lutz, S. Klimt, W. Weise, Nucl. Phys. A 516 (1990) 469.
[70] J. Hüfner, S.P. Klevansky, P. Zhuang, H. Voss, Ann. Phys. 234 (1994) 225.
[71] P.P. Domitrovich, H. Müther, J. Phys. G 20 (1994) 1885.
[72] Chr.V. Christov, E. Ruiz Arriola, K. Goeke, Nucl. Phys. A 556 (1993) 641.
[73] T. Meissner, E. Ruiz Arriola, K. Goeke, Z. Phys. A 339 (1990) 91.
[74] V. Bernard, U.-G. Meißner, A.A. Osipov, Phys. Lett. B 324 (1994) 201.
[75] Y. Nambu, G. Jona-Lasinio, Phys. Rev. 122 (1961) 345; Phys. Rev. 124 (1961) 246.
[76] P.A.M. Guichon, Phys. Lett. B 200 (1988) 235.
[77] K. Saito, A.W. Thomas, Phys. Rev. C 51 (1995) 2757.
[78] J.D. Walecka, Ann. Phys. 83 (1974) 491.
[79] B.D. Serot, J.D. Walecka, Adv. Nucl. Phys. 16 (1986) 1.
[80] M. Gell-Mann, R. Oakes, B. Renner, Phys. Rev. 175 (1968) 2195.
[81] M. Gourdin, Phys. Rep. 11 (1974) 29; D. Krupa, S. Dubnicka, V. Kundrat, V.A. Meshcheryakov, J. Phys. G 10 (1984) 455.
[82] B.W. Bush, J.R. Nix, Ann. Phys. 227 (1993) 97.
[83] U. Kalmbach, T. Vetter, T.S. Biro, U. Mosel, Nucl. Phys. A 563 (1993) 584.
[84] T. Vetter, T.S. Biro, U. Mosel, Nucl. Phys. A 581 (1995) 598.
[85] R.D. Bowler, M.C. Birse, Nucl. Phys. A 582 (1995) 655.
[86] R. Machleidt, K. Holinde, Ch. Elster, Phys. Rep. 149 (1987) 1.
[87] F.E. Close, R.L. Jaffe, R.G. Roberts, G.G. Ross, Phys. Rev. D 31 (1985) 1004.
[88] P. Braun-Munzinger, J. Stachel, J.P. Wessels, N. Xu, Phys. Lett. B 344 (1995) 43.
[89] P. Braun-Munzinger et al., Phys. Lett. B 365 (1996) 1.
[90] B. Nilsson-Almqvist, E. Stenlund, Comput. Phys. Comm. 43 (1987) 387.
[91] T.D. Cohen, R.J. Furnstahl, D.K. Griegel, Phys. Rev. C 45 (1992) 1881.
[92] R. Brockmann, W. Weise, Phys. Lett. B 367 (1996) 40.
[93] G.Q. Li, C.M. Ko, Phys. Lett. B 338 (1994) 118.
[94] S. Hama, B.C. Clark, E.D. Cooper, H.S. Sherif, R.L. Mercer, Phys. Rev. C 41 (1990) 2737.
[95] D.B. Kaplan, A.E. Nelson, Phys. Lett. B 175 (1986) 57.
[96] G.Q. Li, C.-H. Lee, G.E. Brown, Nucl. Phys. A 625 (1997) 372.
[97] A.E. Nelson, D. Kaplan, Phys. Lett. B 192 (1987) 193.
[98] T. Waas, N. Kaiser, W. Weise, Phys. Lett. B 379 (1996) 34.
[99] J. Schaffner-Bielich, I.N. Mishustin, J. Bondorf, Nucl. Phys. A 625 (1997) 325.
[100] F. Klingl, N. Kaiser, W. Weise, Z. Phys. A 356 (1996) 193.
[101] N. Kaiser, T. Waas, W. Weise, Nucl. Phys. A 612 (1997) 297.
[102] Gy. Wolf, W. Cassing, W. Ehehalt, U. Mosel, Prog. Part. Nucl. Phys. 30 (1993) 273.
[103] V. Blobel et al., Nucl. Phys. B 69 (1974) 454.
[104] M. Aguilar-Benitez et al., Z. Phys. C 50 (1991) 405.
[105] A. Sibirtsev, Phys. Lett. B 359 (1995) 29.
[106] A. Sibirtsev, W. Cassing, C.M. Ko, Z. Phys. A 358 (1997) 101.
[107] W. Ehehalt, W. Cassing, A. Engel, U. Mosel, Gy. Wolf, Phys. Rev. C 47 (1993) R2467.
[108] W. Zwermann, B. Schürmann, Phys. Lett. B 145 (1984) 315.
[109] G.Q. Li, C.M. Ko, X.S. Fang, Phys. Lett. B 329 (1994) 149.
[110] H.W. Barz, H. Iwe, Phys. Lett. B 153 (1985) 217.
[111] G. Giacomelli, Int. J. Mod. Phys. A 5 (1990) 223.
[112] H. Schopper (Ed.), Landolt-Börnstein, New Series, vol. I (12), Springer, Berlin, 1988.
[113] A.A. Sibirtsev, Nucl. Phys. A 604 (1996) 455.
[114] G.I. Lykasov, M.V. Rzjanin, W. Cassing, Phys. Lett. B 387 (1996) 691.
[115] A. Sibirtsev, W. Cassing, G.I. Lykasov, M.V. Rzjanin, Nucl. Phys. A 632 (1998) 131.
[116] G. Batko, W. Cassing, U. Mosel, K. Niita, Gy. Wolf, Phys. Lett. B 256 (1991) 331.
[117] A. Sibirtsev, W. Cassing, U. Mosel, Z. Phys. A 358 (1997) 357.
[118] W. Cassing, G. Batko, T. Vetter, Gy. Wolf, Z. Phys. A 340 (1991) 51.
[119] K. Tsushima, S.W. Huang, A. Faessler, J. Phys. G 21 (1995) 33; Phys. Lett. B 337 (1994) 245.
[120] S.V. Efremov, E.A. Paryev, Z. Phys. A 348 (1994) 217.
[121] J. Cugnon, P. Deneye, J. Vandermeulen, Phys. Rev. C 41 (1990) 1701.
[122] Particle Data Booklet, Phys. Rev. D 50 (1994) 1173.
[123] H. Sorge, Phys. Rev. C 52 (1995) 3291.
[124] K. Niita, W. Cassing, U. Mosel, Nucl. Phys. A 504 (1989) 391.
[125] Gy. Wolf, G. Batko, W. Cassing, U. Mosel, K. Niita, M. Schäfer, Nucl. Phys. A 517 (1990) 615.
[126] N. Herrmann, Nucl. Phys. A 610 (1996) 49c; B. Hong et al., FOPI Coll., Phys. Rev. C 57 (1998) 244.
[127] P.K. Sahu, A. Hombach, W. Cassing, U. Mosel, M. Effenberger, Nucl. Phys. A 640 (1998) 693.
[128] S.A. Bass, C. Hartnack, H. Stöcker, W. Greiner, Phys. Rev. C 51 (1995) 12.
[129] S. Teis, W. Cassing, M. Effenberger, A. Hombach, U. Mosel, Gy. Wolf, Z. Phys. A 356 (1997) 421.
[130] S. Teis, W. Cassing, M. Effenberger, A. Hombach, U. Mosel, Gy. Wolf, Z. Phys. A 359 (1997) 297.
[131] W. Ehehalt, W. Cassing, A. Engel, U. Mosel, Gy. Wolf, Phys. Lett. B 298 (1993) 31.
[132] L. Xiong, C.M. Ko, V. Koch, Phys. Rev. C 47 (1993) 788.
[133] J. Helgesson, J. Randrup, Ann. Phys. 244 (1995) 12.
[134] F.D. Berg et al., Phys. Rev. Lett. 72 (1994) 977.
[135] C. Müntz et al., Z. Phys. A 352 (1995) 17; D. Brill et al., Z. Phys. A 357 (1997) 207.
[136] O. Schwalb et al., Phys. Lett. B 321 (1994) 20; F.D. Berg et al., Phys. Rev. Lett. 72 (1994) 977.
[137] R. Averbeck et al., Z. Phys. A 359 (1997) 65.
[138] M. Appenheimer et al., GSI Annual Report 1996, p. 58.
[139] K.K. Gudima, M. Ploszajczak, V.D. Toneev, Phys. Lett. B 328 (1994) 249.
[140] E.L. Bratkovskaya, W. Cassing, R. Rapp, J. Wambach, Nucl. Phys. A 634 (1998) 168.
[141] T. Abbott et al., E802 Collaboration, Phys. Rev. D 45 (1992) 3906.
[142] T. Abbott et al., E802 Collaboration, Phys. Rev. C 50 (1994) 1024.
[143] S.E. Eiseman et al., E810 Collaboration, Phys. Lett. B 292 (1992) 10.
[144] Y. Akiba, for the E802 Collaboration, Nucl. Phys. A 610 (1996) 139c.
[145] R. Lacasse, for the E877 Collaboration, Nucl. Phys. A 610 (1996) 153c.
[146] J. Baechler et al., Phys. Rev. Lett. 72 (1994) 1419.
[147] H. Stroebele et al., Nucl. Phys. A 525 (1991) 59c.
[148] K. Werner, Phys. Rep. 232 (1993) 87.
[149] M. Gyulassy, Nucl. Phys. A 590 (1995) 431c.
[150] R. Bauer et al., Nucl. Phys. A 566 (1994) 87c.
[151] R. Santo et al., Nucl. Phys. A 566 (1994) 61c.
[152] E.L. Bratkovskaya, W. Cassing, U. Mosel, Z. Phys. C 75 (1997) 119.
[153] R. Albrecht et al., Phys. Rev. Lett. 76 (1996) 3506.
[154] P.G. Jones and the NA49 Collaboration, Nucl. Phys. A 610 (1996) 188c.
[155] E.L. Bratkovskaya, W. Cassing, Nucl. Phys. A 619 (1997) 413.
[156] C.M. Ko, Phys. Lett. B 120 (1983) 294; B 138 (1984) 361.
[157] S.W. Huang, G.Q. Li, T. Maruyama, A. Faessler, Nucl. Phys. A 547 (1992) 653.
[158] W. Cassing, A. Lang, S. Teis, K. Weber, Nucl. Phys. A 545 (1992) 123c.
[159] A. Schröter et al., Z. Phys. A 350 (1994) 101.
[160] W. Cassing, E.L. Bratkovskaya, U. Mosel, S. Teis, A. Sibirtsev, Nucl. Phys. A 614 (1997) 415.
[161] A. Sibirtsev, H. Müller, C. Schneidereit, M. Büscher, Z. Phys. A 351 (1995) 333.
[162] J.Q. Wu, C.M. Ko, Nucl. Phys. A 499 (1989) 810.
[163] J. Randrup, C.M. Ko, Nucl. Phys. A 343 (1980) 519; A 411 (1983) 537.
[164] B. Schürmann, W. Zwermann, Phys. Lett. B 183 (1987) 31.
[165] G.Q. Li, C.M. Ko, Nucl. Phys. A 594 (1995) 439.
[166] T. Barnes, E.S. Swanson, Phys. Rev. C 49 (1994) 1166.
[167] G.E. Brown, C.-H. Lee, M. Rho, V. Thorsson, Nucl. Phys. A 567 (1994) 937.
[168] A. Sibirtsev, W. Cassing, Preprint UGI-98-18, nucl-th/9805021.
[169] S. Schnetzer et al., Phys. Rev. C 40 (1989) 640.
[170] W. Cassing, Z. Rudy, L. Jarczyk, B. Kamys, P. Kulessa, O.W.B. Schult, A. Sibirtsev, A. Strzalkowski, in: E. Gadioli (Ed.), Proc. VIII. Int. Conf. on Nuclear Reaction Mechanisms, Varenna, June 1997, p. 142.
[171] Z. Rudy, Habilitation thesis, University of Cracow, Poland, unpublished.
[172] M. Debowski et al., Z. Phys. A 356 (1996) 313.
[173] P. Senger for the KaoS Collaboration, Acta Phys. Polon. B 27 (1996) 2993.
[174] P. Senger et al., Proc. Int. Workshop XXIII on Gross Properties of Nuclei and Nuclear Exitations, Hirschegg, Austria, January, 1995, p. 306.
[175] D. Miskowiec, W. Ahner, R. Barth et al., Phys. Rev. Lett. 72 (1994) 3650.
[176] A. Lang, W. Cassing, U. Mosel, K. Weber, Nucl. Phys. A 541 (1992) 507.
[177] E.L. Bratkovskaya, W. Cassing, U. Mosel, Nucl. Phys. A 622 (1997) 593.
[178] C. Fuchs et al., Phys. Rev. C 56 (1997) 606.
[179] G.Q. Li, C.M. Ko, B.A. Li, Phys. Rev. Lett. 74 (1995) 235.
[180] G.Q. Li, C.M. Ko, Nucl. Phys. A 594 (1995) 460.
[181] G.E. Brown, C.M. Ko, G.Q. Li, nucl-th/9608039.
[182] G.Q. Li, C.M. Ko, Phys. Rev. C 54 (1996) R2159.
[183] J.L. Ritman et al., Z. Phys. A 352 (1995) 355.
[184] Z.S. Wang, A. Faessler, C. Fuchs, V.S. Uma Maheswari, D.S. Kosov, Nucl. Phys. A 628 (1998) 151.
[185] P. Crochet et al., GSI Annual Report, 1997, p. 59.
[186] P. Kienle, A. Gillitzer, in: H. Stöcker, A. Gallmann, J.H. Hamilton (Eds.), Structure of Vacuum and Elementary Matter, World Scientific, Singapore, 1997, p. 249.
[187] R. Barth et al., Phys. Rev. Lett. 78 (1997) 4007.
[188] C.M. Ko, Phys. Lett. B 120 (1983) 294.
[189] C. Sturm et al., GSI Annual Report, 1997, p. 47.
[190] E.L. Bratkovskaya, W. Cassing, U. Mosel, Phys. Lett. B 424 (1998) 244.
[191] K. Wisniewski et al., GSI Annual Report, 1997, p. 60.
[192] W. Cassing, T. Demski, L. Jarczyk, B. Kamys, Z. Rudy, O.W.B. Schult, A. Strzalkowski, Z. Phys. A 349 (1994) 77.
[193] O. Chamberlain et al., Nuovo Cimento 3 (1956) 447.
[194] T. Elioff et al., Phys. Rev. 128 (1962) 869.
[195] D. Dorfan et al., Phys. Rev. Lett. 14 (1965) 995.
[196] A.A. Baldin et al., JETP Lett. 47 (1988) 137.
[197] J.B. Carroll et al., Phys. Rev. Lett. 62 (1989) 1829.
[198] A. Shor, V. Perez-Mendez, K. Ganezer, Phys. Rev. Lett. 63 (1989) 2192.
[199] J. Chiba et al., Nucl. Phys. A 553 (1993) 771c.
[200] A. Schröter et al., Nucl. Phys. A 553 (1993) 775c.
[201] P. Koch, C.B. Dover, Phys. Rev. C 40 (1989) 145.
[202] C.M. Ko, X. Ge, Phys. Lett. B 205 (1988) 195.
[203] C.M. Ko, L.H. Xia, Phys. Rev. C 40 (1989) R1118.
[204] P. Danielewicz, Phys. Rev. C 42 (1990) 1564.
[205] V.B. Kopeliovich, Phys. Rep. 139 (1986) 51.
[206] A. Shor, V. Perez-Mendez, K. Ganezer, Nucl. Phys. A 514 (1990) 717.
[207] E. Byckling, K. Kajantie, Particle Kinematics, Wiley, London, 1973.
[208] H. Schopper (Ed.), Landolt-Börnstein, New Series, vol. I (12), Springer, Berlin, 1988.
[209] J. Chiba et al., Nucl. Phys. A 634 (1998) 115.
[210] C. Spieles, M. Bleicher, A. Jahns, R. Mattiello, H. Sorge, H. Stöcker, W. Greiner, Phys. Rev. C 53 (1996) 2011.
[211] J. Geiss, Ph.D. thesis, University of Giessen, 1998; J. Geiss, W. Cassing, C. Greiner, nucl-th/9805012.
[212] G.S.F. Stephans and the E802 Collaboration, Nucl. Phys. A 566 (1994) 269c.
[213] B.A. Cole et al., Nucl. Phys. A 590 (1995) 179c.
[214] M. Gonin, Ole Hansen, B. Moskowitz, F. Videbæk, H. Sorge, R. Mattiello, Phys. Rev. C 51 (1995) 310.
[215] C.A. Ogilvie for the E866 and E917 Collaboration, Nucl. Phys. A 638 (1998) 57c.
[216] G.E. Brown, K. Kubodera, M. Rho, Phys. Lett. B 192 (1987) 273; G.E. Brown, C.M. Ko, K. Kubodera, Z. Phys. A 341 (1992) 301.
[217] P. Koch, B. Müller, J. Rafelski, Phys. Rep. 142 (1986) 167.
[218] J. Letessier, J. Rafelski, A. Tounsi, Phys. Lett. B 321 (1994) 394; B 323 (1994) 393; B 333 (1994) 484; B 390 (1997) 363.
[219] J. Baechler et al., Z. Phys. C 58 (1993) 367.
[220] V. Topor Pop et al., Phys. Rev. C 52 (1995) 1618; M. Gaździcki et al., Nucl. Phys. A 590 (1995) 197c.
[221] H. Sorge, Z. Phys. C 67 (1995) 479.
[222] C. Bormann et al., J. Phys. G 23 (1997) 1817.
[223] J. Letessier, J. Rafelski, A. Tounsi, Phys. Lett. B 410 (1997) 315.
[224] H. Sorge, L. Winckelmann, H. Stöcker, W. Greiner, Z. Phys. C 59 (1993) 85; M. Berenguer, H. Sorge, W. Greiner, Phys. Lett. B 332 (1994) 15.
[225] A. Capella, Phys. Lett. B 364 (1995) 175.
[226] E. Shuryak, Phys. Lett. B 78 (1978) 150; Sov. J. Nucl. Phys. 28 (1978) 408.
[227] K. Kajantie, J. Kapusta, L. McLerran, A. Mekjian, Phys. Rev. D 34 (1986) 2746.
[228] P.V. Ruuskanen, Nucl. Phys. A 544 (1992) 169c.
[229] J. Kleymans, K. Redlich, H. Satz, Z. Phys. C 52 (1991) 517.
[230] U. Heinz, K.S. Lee, Phys. Lett. B 259 (1991) 162.
[231] Y. Kluger, V. Koch, J. Randrup, X.N. Wang, Phys. Rev. C 57 (1998) 280.
[232] G. Roche et al., Phys. Rev. Lett. 61 (1988) 1069.
[233] C. Naudet et al., Phys. Rev. Lett. 62 (1989) 2652.
[234] G. Roche et al., Phys. Lett. B 226 (1989) 228.
[235] G. Agakichiev et al., Phys. Rev. Lett. 75 (1995) 1272.
[236] Th. Ullrich et al., Nucl. Phys. A 610 (1996) 317c.
[237] M.A. Mazzoni, Nucl. Phys. A 566 (1994) 95c; M. Masera, Nucl. Phys. A 590 (1995) 93c.
[238] T. Åkesson et al., Z. Phys. C 68 (1995) 47.
[239] C. Baglin et al., Phys. Lett. B 220 (1989) 471; B 251 (1990) 465; B 270 (1991) 105; B 345 (1995) 617; S. Ramos, Nucl. Phys. A 590 (1995) 117c.
[240] F. Fleuret et al., in: J. Tran Thanh Van (Ed.), 97 QCD and High Energy Hadronic Interactions, Editions Frontières, Dreux, 1997, p. 503.
[241] L. Xiong, Z.G. Wu, C.M. Ko, J.Q. Wu, Nucl. Phys. A 512 (1990) 772.
[242] L.A. Winckelmann, H. Sorge, H. Stöcker, W. Greiner, Phys. Rev. C 51 (1995) R9.
[243] Gy. Wolf, W. Cassing, U. Mosel, Nucl. Phys. A 552 (1993) 549.
[244] M. Herrmann, B. Friman, W. Nörenberg, Nucl. Phys. A 560 (1993) 411.
[245] G.Q. Li, C.M. Ko, Nucl. Phys. A 582 (1995) 731.
[246] P. Koch, Phys. Lett. B 288 (1992) 187; B. Kämpfer, P. Koch, O.P. Pavlenko, Phys. Rev. C 49 (1994) 1132.
[247] G.Q. Li, C.M. Ko, G.E. Brown, Phys. Rev. Lett. 75 (1995) 4007.
[248] W. Cassing, W. Ehehalt, C.M. Ko, Phys. Lett. B 363 (1995) 35.
[249] G.Q. Li, C.M. Ko, G.E. Brown, H. Sorge, Nucl. Phys. A 611 (1996) 539.
[250] V. Koch, C. Song, Phys. Rev. C 54 (1996) 1903.
[251] D.K. Srivastava, B. Sinha, C. Gale, Phys. Rev. C 53 (1996) R567.
[252] L.A. Winckelmann, C. Ernst, L. Gerland, J. Konopka, S. Soff et al., nucl-th/9610042.
[253] J. Murray, W. Bauer, K. Haglin, Phys. Rev. C 57 (1998) 882.
[254] H.-J. Schulze, D. Blaschke, Phys. Lett. B 386 (1996) 429.
[255] A. Drees, Nucl. Phys. A 610 (1996) 536c.
[256] W. Cassing, W. Ehehalt, I. Kralik, Phys. Lett. B 377 (1996) 5.
[257] R. Baier, M. Dirks, K. Redlich, Phys. Rev. D 55 (1997) 4344; D 56 (1997) 2548.
[258] M. Asakawa, C.M. Ko, P. Lévai, X.J. Qiu, Phys. Rev. C 46 (1992) R1159.
[259] G. Chanfray, P. Schuck, Nucl. Phys. A 545 (1992) 271c.
[260] R. Rapp, G. Chanfray, J. Wambach, Phys. Rev. Lett. 76 (1996) 368.
[261] B. Friman, H.J. Pirner, Nucl. Phys. A 617 (1997) 496.
[262] R. Rapp, G. Chanfray, J. Wambach, Nucl. Phys. A 617 (1997) 472.
[263] C.M. Ko, V. Koch, G.Q. Li, Ann. Rev. Nucl. Part. Sci. 47 (1997) 505.
[264] C. Gale, J. Kapusta, Phys. Rev. C 35 (1987) 2107.
[265] L.G. Landsberg, Phys. Rep. 128 (1985) 301.
[266] L. Xiong, E. Shuryak, G.E. Brown, Phys. Rev. D 46 (1992) 3789.
[267] C.M. Ko, Phys. Rev. C 23 (1981) 2760.
[268] C. Gale, J. Kapusta, Nucl. Phys. A 495 (1989) 423c.
[269] M. Schäfer, T.S. Biro, W. Cassing, U. Mosel, Phys. Lett. B 221 (1989) 1.
[270] P. Lichard, Phys. Rev. D 51 (1995) 6017.
[271] W. Cassing, E.L. Bratkovskaya, R. Rapp, J. Wambach, Phys. Rev. C 57 (1998) 916.
[272] N.M. Kroll, T.D. Lee, B. Zumino, Phys. Rev. 157 (1967) 1376.
[273] M. Urban et al., private communication and to be published.
[274] R. Rapp, M. Urban, M. Buballa, J. Wambach, Phys. Lett. B 417 (1998) 1.
[275] N. Bianchi et al., Phys. Lett. B 299 (1993) 219; B 309 (1993) 5; B 325 (1994) 333.
[276] P. Koch, Z. Phys. C 57 (1993) 283.
[277] K.L. Haglin, Phys. Rev. C 53 (1996) R2606.
[278] K. Haglin, C. Gale, Phys. Rev. C 49 (1994) 401.
[279] G.Q. Li, C. Gale, Nucl. Phys. A 638 (1998) 491c.
[280] R.J. Porter et al., Phys. Rev. Lett. 79 (1997) 1229.
[281] L. Xiong, Z.G. Wu, C.M. Ko, J.Q. Wu, Nucl. Phys. A 512 (1990) 772.
[282] K.K. Gudima, A.I. Titov, V.D. Toneev, Sov. J. Nucl. Phys. 55 (1992) 1715.
[283] E.L. Bratkovskaya, W. Cassing, U. Mosel, Phys. Lett. B 376 (1996) 12.
[284] W. Peters, M. Post, H. Lenske, S. Leupold, U. Mosel, Nucl. Phys. A 632 (1998) 109.
[285] C. Ernst, S.A. Bass, M. Belkacem, H. Stöcker, W. Greiner, Phys. Rev. C 58 (1998) 447.
[286] C.M. Ko, G.Q. Li, G.E. Brown, H. Sorge, Nucl. Phys. A 610 (1996) 342c.
232 [287] [288] [289] [290] [291] [292] [293] [294] [295] [296] [297] [298] [299] [300] [301] [302] [303] [304] [305] [306] [307] [308] [309] [310] [311] [312] [313] [314] [315] [316] [317] [318] [319] [320] [321] [322] [323] [324] [325] [326] [327] [328] [329] [330] [331] [332] [333] [334] [335]
W. Cassing, E.L. Bratkovskaya / Physics Reports 308 (1999) 65—233 G. Agakichiev et al., CERES Coll., Phys. Lett. B 422 (1998) 405. J. Kapusta, Nucl. Phys. A 566 (1994) 45c. C.M. Hung, E.V. Shuryak, Phys. Rev. C 56 (1997) 453. D.K. Srivastava, B. Sinha, Phys. Rev. Lett. 73 (1994) 316. A. Dumitru, U. Katscher, J.A. Maruhn, H. Sto¨cker, W. Greiner, D.H. Rischke, Phys. Rev. C 51 (1995) 2166. H.C. Eggers, C. Gale, R. Tabti, K. Haglin, hep-ph/9604372. G.Q. Li, G.E. Brown, C. Gale, C.M. Ko, nucl-th/9712048. J. Cleymans, K. Redlich, D.K. Srivastava, Phys. Rev. C 55 (1997) 1431. T. Matsui, H. Satz, Phys. Lett. B 178 (1986) 416. M. Gonin et al., Nucl. Phys. A 610 (1996) 404c. C. Gerschel, J. Hu¨fner, Z. Phys. C 56 (1992) 71; C. Gerschel, Nucl. Phys. A 583 (1995) 643; C. Gerschel, J. Hu¨fner, hep-ph/9802245. D. Kharzeev, Nucl. Phys. A 610 (1996) 418c. J.P. Blaizot, J.Y. Ollitrault, Phys. Rev. Lett. 77 (1996) 1703; Nucl. Phys. A 610 (1996) 452c. C.-Y. Wong, Nucl. Phys. A 610 (1996) 434c; Phys. Rev. C 55 (1997) 2621. S. Gavin, R. Vogt, Nucl. Phys. B 345 (1990) 104; S. Gavin, H. Satz, R.L. Thews, R. Vogt, Z. Phys. C 61 (1994) 351; S. Gavin, Nucl. Phys. A 566 (1994) 383c. S. Gavin, R. Vogt, Nucl. Phys. A 610 (1996) 442c; Phys. Rev. Lett. 78 (1997) 1006. A. Capella, A. Kaidalov, A. Kouider Akil, C. Gerschel, Phys. Lett. B 393 (1997) 431; N. Armesto, A. Capella, Phys. Lett. B 430 (1998) 23. D. Kharzeev, C. Lourenc7 o, M. Nardi, H. Satz, Z. Phys. C 74 (1997) 307. S. Loh, C. Greiner, U. Mosel, Phys. Lett. B 404 (1997) 238. W. Cassing, C.M. Ko, Phys. Lett. B 396 (1997) 39. W. Cassing, E.L. Bratkovskaya, Nucl. Phys. A 623 (1997) 570. D. Kharzeev, H. Satz, Phys. Lett. B 366 (1996) 316. D. Kharzeev, H. Satz, Phys. Lett. B 334 (1994) 155. C. Lourenc7 o, Nucl. Phys. A 610 (1996) 552c. H.-U. Bengtsson, T. Sjo¨strand, Comput. Phys. Commun. 46 (1987) 43. H. Plothow-Besch, Int. J. Mod. Phys. A 10 (1995) 2901. C. Lourenc7 o, Ph.D. Thesis, Lisbon 1995. P. Braun-Munzinger, D. Miskowiec, A. 
Dress, C. Lourenc7 o, Eur. Phys. J. C 1 (1998) 123. C. Spieles et al., hep-ph/9706525. J. Geiss, C. Greiner, E.L. Bratkovskaya, W. Cassing, U. Mosel, nucl-th/9803008, Phys. Lett. B, in press. D. Kharzeev, Nucl. Phys. A 638 (1998) 279c. D. Kharzeev, M. Nardi, H. Satz, hep-ph/9707308. Quark Matter 96, Nucl. Phys. A 610 (1996) and refs. therein. X.S. Fang, C.M. Ko, G.E. Brown, V. Koch, Phys. Rev. C 47 (1993) 1678. V. Koch, C.M. Ko, G.E. Brown, Phys. Lett. B 265 (1991) 29. J. Cleymans, D. Elliott, A. Kera¨nen, E. Suhonen, Phys. Rev. C 57 (1998) 3319. V. Metag, Prog. Part. Nucl. Phys. 30 (1993) 75; GSI preprint 97-43, Nucl. Phys. A 630 (1998) 1c. D. Pelte et al., Z. Phys. A 359 (1997) 47. W. Scho¨n, H. Bokemeyer, W. Koenig, V. Metag, Acta Phys. Polon. B 27 (1996) 2959. W. Cassing, Ye.S. Golubeva, A.S. Iljinov, L.A. Kondratyuk, Phys. Lett. B 396 (1997) 26. K.G. Boreskov, J. Koch, L.A. Kondratyuk, M.I. Krivoruchenko, Phys. Atom. Nuclei 59 (1996) 1908. Ye.S. Golubeva, L.A. Kondratyuk, W. Cassing, Nucl. Phys. A 625 (1997) 832. Ye.S. Golubeva, A.S. Iljinov, B.V. Krippa, I.A. Pshenichnov, Nucl. Phys. A 537 (1992) 393. Th. Weidmann, E.L. Bratkovskaya, W. Cassing, U. Mosel, nucl-th/9711004. Ye.S. Golubeva, A.S. Iljinov, I.A. Pshenichnov, Nucl. Phys. A 562 (1993) 389. HADES-Collaboration, Proposal for a high-acceptance di-electron spectrometer, GSI 1994. W.S. Chung, G.Q. Li, C.M. Ko, Nucl. Phys. A 625 (1997) 325. J. Badier et. al., Phys. Lett. B 89 (1979) 145; K.J. Anderson et al., Phys. Rev. Lett. 42 (1979) 944. R. Strojnowski, Phys. Rep. 71 (1981) 1.
W. Cassing, E.L. Bratkovskaya / Physics Reports 308 (1999) 65—233 [336] [337] [338] [339] [340] [341] [342] [343] [344] [345] [346] [347]
233
E.L. Bratkovskaya, O.V. Teryaev, V.D. Toneev, Phys. Lett. B 348 (1995) 283. E.L. Bratkovskaya, M. Scha¨fer, W. Cassing, U. Mosel, O.V. Teryaev, V.D. Toneev, Phys. Lett. B 348 (1995) 325. E.L. Bratkovskaya, W. Cassing, U. Mosel, O.V. Teryaev, A.I. Titov, V.D. Toneev, Phys. Lett. B 362 (1995) 17. K. Gottfried, J.D. Jackson, Nuovo Cimento 33 (1964) 309. K.V. Vasavada, Phys. Rev. D 16 (1977) 146. Review of Particle Properties, Phys. Rev. D 50 (1994) 1173. V. Blobel et al., Phys. Lett. B 48 (1974) 73. K. Geiger, B. Mu¨ller, Nucl. Phys. B 369 (1992) 600. K. Geiger, Phys. Rep. 258 (1995) 237. X.-N. Wang, Phys. Rep. 280 (1997) 287. K. Geiger, D.K. Srivastava, Phys. Rev. C 56 (1997) 2718. K. Geiger, Comput. Phys. Commun. 104 (1997) 70.
Physics Reports 308 (1999) 235—331
Infinite-volume limit of continuous n-particle quantum systems
J. Maćkowiak
Institute of Physics, N. Copernicus University, ul. Grudziądzka 5/7, 87-100 Toruń, Poland
Received March 1998; editor: M.L. Klein
Contents
1. Introduction
2. The Hamiltonian and canonical density operator of a continuous quantum system of n particles
2.1. n particles in a bounded region $\Lambda$
2.2. n-particle quantum systems in a compact Riemann space
3. Thermodynamic limit of free energy density $f(H^{(n)}_{\sigma\Lambda}, \beta)$ for n quantum particles obeying Boltzmann statistics
3.1. n particles in a bounded region $\Lambda$
3.2. Example: a system of coupled oscillators on a sphere in $\mathbb{R}^\nu$
4. Thermodynamic limit of free energy density for continuous n-fermion systems
4.1. Contractions and expansions of p-particle fermion operators
4.2. Asymptotic equality of the free energy density of noninteracting Fermi gases in the canonical and grand canonical ensemble
4.3. Thermodynamic limit of free energy density $f(H^{(n)}_{\sigma\Lambda}, \beta)$ for continuous n-fermion systems with separable interactions
5. Thermodynamic limit of free energy density for continuous n-boson systems
5.1. Contractions and expansions of p-particle boson operators
5.2. Asymptotic equality of the free energy density of noninteracting Bose gases in the canonical and grand canonical ensemble
5.3. Thermodynamic limit of free energy density of continuous n-boson systems in a bounded region
5.4. A mean-field approach to the gas-liquid transition
5.5. Bose-Einstein condensation in the liquid phase of an interacting gas
6. The Kondo effect in terms of a reduced s-d model
6.1. Introduction
6.2. Reduction of H to H and asymptotic treatment of H
6.3. The resistivity due to interaction of conducting electrons with localized impurity spins
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
Appendix F
References
Abstract
The Feynman-Kac theorem is applied in order to establish the infinite-volume limit behaviour of the free energy per particle of continuous n-particle quantum systems with bounded separable 2-body interactions defined in the configuration space of particle positions. The mean-field character of such systems is demonstrated.
A similar technique is applied to n-particle quantum systems with separable interactions defined in the space of particle momenta and spins. Three examples of systems with separable interactions are given and solved, one of which deals with an electron gas interacting with localized impurity spins in a dilute magnetic alloy (DMA) and extension of Kondo's resistivity formula for DMA to temperatures close to 0 K. Most of the results are generalizations or more detailed presentations of those published earlier. 1999 Elsevier Science B.V. All rights reserved.
PACS: 72.15.Cz; 75.10.Dg; 75.30.Et
Keywords: n-particle system; Thermodynamic limit; Mean field; Bose-Einstein condensation; Resistivity
1. Introduction

Infinite-volume limit investigations of interacting continuous many-particle quantum systems were initiated by Fisher (1964), Ginibre (1965) and Ruelle (1969), who first derived several properties of the thermodynamic functions of such systems. Ginibre (1965, 1971, 1972) was the first to apply functional integration techniques in these studies, which allowed him to establish several high-temperature and low-activity results for reduced density matrices of these systems under Dirichlet boundary conditions. These techniques were extended to other boundary conditions by Novikov (1969) and Angelescu and Nenciu (1973), who proved that the infinite-volume limit of the pressure is independent of these conditions. The Neumann boundary conditions were dealt with by Robinson (1971). Bratteli and Robinson (1981) established further extensions of these and other results by generalizing functional integration methods and the Feynman-Kac theorem on which these techniques are based. A recent review of these developments can be found in the monograph of Roepstorff (1994).

The functional integration technique was applied by the author to continuous quantum systems with bounded separable 2-body interactions (Maćkowiak, 1983a,b). Combined with the saddle-point method of Tindemans and Capel (1974), this technique made it possible to establish the infinite-volume limit of the free energy density for such systems and to demonstrate their mean-field character. The resulting asymptotic mean-field equations proved to be similar in form to those of asymptotic Thomas-Fermi theory (Narnhofer and Thirring, 1981; Thirring, 1982). A stronger version of the theory developed in 1983 (Maćkowiak, 1983a,b) proved possible by adapting the ideas of Pearce and Thompson (1975). The aim of this review is to present this version with new simplified proofs.

Systems of quantum Boltzmann, Fermi and Bose particles in the canonical ensemble and with separable bounded 2-body interactions are considered in Sections 3-5. The infinite-volume limit of the free energy density for these systems is analysed by exploiting the Feynman-Kac functional integral representation of the canonical density matrix and proofs of equivalence of canonical and grand canonical quantum ensembles of noninteracting gases. The method applied is almost identical in all three types of statistics. The proofs, however, are most complex for n-fermion systems, and for this reason the case of Fermi statistics is presented in full detail. Examples of continuous interacting quantum systems are given in Sections 3 and 5. The asymptotic mean-field equations for these models are solved and the phase transitions that occur are discussed. The analysis of an interacting Bose gas, carried out in Section 5 in this manner, is made possible by generalizing the treatment of an ideal Bose gas due to Landau and Wilde (1979).

Section 6 deals with a reduced s-d model (Rs-dM) of a Fermi gas interacting via an equivalent-neighbour Heisenberg potential with localized impurity spins. The Hamiltonian of the Rs-dM results from simplifying the s-d exchange Hamiltonian of Kasuya (1956), which describes the interaction of conducting electrons with magnetic impurities in a dilute magnetic alloy. The infinite-volume limit of the free energy density for the Rs-dM is analysed by a method similar to the one applied in Section 4 to n-fermion systems. The mean-field Hamiltonian asymptotically equivalent to the Rs-dM Hamiltonian is then exploited in order to extend Kondo's (1964) theory of the resistivity minimum in dilute magnetic alloys to temperatures close to 0 K.
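2. The Hamiltonian and canonical density operator of a continuous quantum system of n particles

2.1. n particles in a bounded region $\Lambda$

Consider a quantum system of n identical spinless point particles obeying Boltzmann statistics, confined to a bounded region $\Lambda \subset \mathbb{R}^\nu$ with a piecewise differentiable boundary $\partial\Lambda$. A detailed construction of the Hamiltonian of such a system for various types of interactions and boundary conditions was given by Bratteli and Robinson (1981). The construction proceeds as follows. Let $T_{\sigma\Lambda}$ denote the self-adjoint extension, by the quadratic form technique, of

$$-\Delta = -\sum_{k=1}^{\nu} \frac{\partial^2}{\partial x_k^2}$$

defined on the set of functions $\psi$ belonging to $L^2(\Lambda)$ which are twice continuously differentiable and fulfill on $\partial\Lambda$ the boundary condition

$$\frac{\partial\psi}{\partial n} = \sigma\psi, \tag{2.1.1}$$

where $\sigma \in C^1(\partial\Lambda)$ is a real nonnegative differentiable function on $\partial\Lambda$ and $\partial/\partial n$ denotes the inward normal derivative. For $\sigma = 0$, Eq. (2.1.1) defines the Neumann conditions, and setting formally $\sigma = \infty$, one obtains the Dirichlet condition: $\psi = 0$ on $\partial\Lambda$. If $\Lambda$ is a $\nu$-dimensional parallelepiped, one can define a self-adjoint extension of $-\Delta$ on the domain of periodic functions.

We can now define $T^{(n)}_{\sigma\Lambda}$, the n-particle expansion of $T_{\sigma\Lambda}$ in $L^2(\Lambda)^n = L^2(\Lambda)\otimes\cdots\otimes L^2(\Lambda)$, as

$$T^{(n)}_{\sigma\Lambda} = T_{\sigma\Lambda}\otimes\mathbb{1}_\Lambda\otimes\cdots\otimes\mathbb{1}_\Lambda + \mathbb{1}_\Lambda\otimes T_{\sigma\Lambda}\otimes\cdots\otimes\mathbb{1}_\Lambda + \cdots + \mathbb{1}_\Lambda\otimes\cdots\otimes\mathbb{1}_\Lambda\otimes T_{\sigma\Lambda} \equiv n\Gamma_1^n T_{\sigma\Lambda}, \tag{2.1.2}$$

where $T_{\sigma\Lambda}\otimes\mathbb{1}_\Lambda$ is defined in $L^2(\Lambda)^2$ by the equality

$$(T_{\sigma\Lambda}\otimes\mathbb{1}_\Lambda)(\psi\otimes\varphi) = (T_{\sigma\Lambda}\psi)\otimes\varphi.$$

The interaction operator $W^{(n)}_\Lambda$ in $L^2(\Lambda)^n$ is now introduced in such a manner that the operator sum

$$H^{(n)}_{\sigma\Lambda} = T^{(n)}_{\sigma\Lambda} + W^{(n)}_\Lambda \tag{2.1.3}$$

is essentially self-adjoint on a dense subset of $L^2(\Lambda)^n$. Operators $W^{(n)}_\Lambda$ which are relatively bounded with respect to $T^{(n)}_{\sigma\Lambda}$, viz.,

$$\|W^{(n)}_\Lambda\psi\| \le a\|\psi\| + b\|T^{(n)}_{\sigma\Lambda}\psi\| \quad \text{for } a, b \ge 0,\ \psi \in L^2(\Lambda)^n, \tag{2.1.4}$$

allow one to define such a class of Hamiltonians $H^{(n)}_{\sigma\Lambda}$. The infimum over all $b$ for which bounds of this type are satisfied is called the relative bound of $W^{(n)}_\Lambda$ with respect to $T^{(n)}_{\sigma\Lambda}$. Obviously, any bounded $W^{(n)}_\Lambda$ satisfies Eq. (2.1.4) with $a = \|W^{(n)}_\Lambda\|$ and arbitrary $b \ge 0$.

In the next sections our interest will be focused on the mean-field description of some simple n-particle quantum systems $H^{(n)}_{\sigma\Lambda}$ in the canonical ensemble with n approaching infinity. This investigation can be carried out most conveniently by exploiting the representation of the operator $\exp(-\beta H^{(n)}_{\sigma\Lambda})$ as an integral operator with respect to the conditional Wiener measure. For n-particle quantum systems in $\mathbb{R}^\nu$ this representation follows immediately from the Feynman-Kac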
theorem. It was frequently exploited in the past (Kac, 1959b; Ginibre, 1965, 1971, 1972; Feynman and Hibbs, 1965). For the kernel of $\exp(-\beta H^{(n)}_{\Lambda})$ this representation was first introduced by Ginibre (1965). Novikov (1969) and Angelescu and Nenciu (1973) subsequently constructed this representation for other boundary conditions. Bratteli and Robinson (1981) generalized their method to potentials relatively bounded with respect to $T^{(n)}_{\sigma\Lambda}$ by proving the following form of the Feynman-Kac theorem:

Theorem 2.1 (Feynman-Kac). Let $W^{(n)}_\Lambda$ be a multiplication operator on $L^2(\Lambda)^n$ by a real function $W(x_1,\dots,x_n)$. If the relative bound of $W^{(n)}_\Lambda$ with respect to $T^{(n)}_{\sigma\Lambda}$ is less than one, then for $\beta > 0$ the operator $\exp(-\beta H^{(n)}_{\sigma\Lambda})$ is an integral operator on $L^2(\Lambda)^n$ with the kernel

$$\exp(-\beta H^{(n)}_{\sigma\Lambda})(x_1,\dots,x_n;\, y_1,\dots,y_n) = \int_{\Omega^n_{\Lambda\beta}} \prod_{i=1}^{n} d\mu^{\sigma\Lambda\beta}_{x_i y_i}(\omega_i)\, \exp\Big[-\int_0^\beta W(\omega_1(s),\dots,\omega_n(s))\,ds\Big], \tag{2.1.5}$$

where $\Omega_{\Lambda\beta} = \prod_{0\le t\le\beta}\bar\Lambda$ is an uncountable Cartesian product of the closure $\bar\Lambda$ of $\Lambda$, i.e., the set of all paths $\omega(s)$ in $\bar\Lambda$ parameterized in the interval $[0,\beta]$, and $\mu^{\sigma\Lambda\beta}_{x_i y_i}$ is the conditional Wiener measure corresponding to the boundary condition (2.1.1) with support on the set of continuous paths in $\Omega_{\Lambda\beta}$ satisfying $\omega_i(0) = x_i$, $\omega_i(\beta) = y_i$, $i = 1,\dots,n$.

Eq. (2.1.5) provides a representation for the canonical density matrix $\rho = (\mathrm{Tr}\exp(-\beta H^{(n)}_{\sigma\Lambda}))^{-1}\exp(-\beta H^{(n)}_{\sigma\Lambda})$. This representation was exploited by Angelescu et al. (1994) in an investigation of the asymptotic form of reduced density matrices of $\rho$ for a large quantum system of particles obeying Boltzmann statistics and interacting through a repulsive 2-body potential. It was also applied by the author in 1983 (Maćkowiak, 1983a,b) in an investigation of the asymptotic behaviour of the free energy density of $H^{(n)}_{\sigma\Lambda}$, as $n\to\infty$, by a method of Tindemans and Capel (1974). These results are reviewed and generalized in Sections 3-5.

The following property of the integral (2.1.5) will be exploited in the proofs of asymptotic formulae for the free energy density: for $0 < t_1 \le t_2 \le \cdots \le t_m = \beta$ (with $t_0 = 0$),

$$\int_\Lambda dx \int_{\Omega_{\Lambda\beta}} d\mu^{\sigma\Lambda\beta}_{xx}(\omega)\, \exp\Big[-\sum_{k=1}^{m}\int_{t_{k-1}}^{t_k} W_k(\omega(s))\,ds\Big] = \mathrm{Tr}\prod_{k=1}^{m}\exp\big[-(t_k - t_{k-1})(T_{\sigma\Lambda} + W_{k\Lambda})\big]. \tag{2.1.6}$$

The proof of Eq. (2.1.6) can be inferred from the construction of the integral (2.1.5) and subsequent theorems in the monographs by Glimm and Jaffe (1981) and Roepstorff (1994).
2.2. n-particle quantum systems in a compact Riemann space

The configurational space of an n-particle classical system with constraints is, in general, a differentiable manifold. Geometric quantization of such systems has been studied by several authors (e.g. Souriau, 1970; Blattner, 1973). The quantization scheme applied in this section follows the one developed by Sniatycki (1980).
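Suppose we are dealing with a physical system whose configurational space is a connected compact orientable p-dimensional C Riemann space $R^p_r$, embedded in $\mathbb{R}^\nu$, which is given by the equations

$$\varphi_\gamma(x^1,\dots,x^\nu) = r_\gamma, \quad \gamma = 1,\dots,\nu-p,$$

where $\varphi_\gamma : \mathbb{R}^\nu \to \mathbb{R}^+$, $\gamma = 1,\dots,\nu-p$, are C functions monotonically increasing in $|x^\alpha|$, $\alpha = 1,\dots,\nu$. Thus,

$$R^p_r = \{(x^1,\dots,x^\nu):\ \varphi_\alpha(x^1,\dots,x^\nu) = r_\alpha,\ \alpha = 1,\dots,\nu-p\}.$$

Suppose $R^p_r$ has a natural parametric representation

$$x^\alpha = x^\alpha(q^1,\dots,q^p, r), \quad r = (r_1,\dots,r_{\nu-p}), \quad \alpha = 1,\dots,\nu, \tag{2.2.1}$$

where $x^\alpha = x^\alpha(q^1,\dots,q^p, r)$ are C functions defined on an open bounded set $S^p \subset \mathbb{R}^p$ (independent of $r$) and that the mapping $x : S^p \to R^p_r$ given in Eq. (2.2.1) is a surjection a.e. onto $R^p_r$. The metric in $R^p_r$ is defined by the covariant tensor

$$g_{\alpha\beta} = \sum_{\gamma=1}^{\nu} \frac{\partial x^\gamma}{\partial q^\alpha}\frac{\partial x^\gamma}{\partial q^\beta},$$

which determines the surface measure on $R^p_r$ as follows (Schwartz, 1967, Chap. IV, Section 10):

$$|\Lambda^p_r| = \int_{L^p} J_p(q, r)\, dq^1\cdots dq^p, \qquad L^p = \{q = (q^1,\dots,q^p):\ x(q, r) \in \Lambda^p_r \subset R^p_r\},$$

where $J_p(q, r) = (\det g_{\alpha\beta})^{1/2}$.

Let us now consider a quantum system of n identical point particles confined to $R^p_r$. As proved by Sniatycki (1980, Section 7.2), the Hilbert space of such a system is isomorphic with $\mathcal{H}^n_r$, the n-fold tensor product of $\mathcal{H}_r = L^2(S^p, J_p(q, r)\,dq)$. The kinetic energy operator of a single particle (in appropriate units) is defined in $\mathcal{H}_r$ in terms of $S_r = -\Delta_r + R_r$ (Sniatycki, 1980, Section 7.2), where $\Delta_r$ denotes the Laplace-Beltrami operator

$$\Delta_r = \sum_{\lambda=1}^{\nu}\sum_{\alpha,\beta=1}^{p} \frac{\partial q^\alpha}{\partial x^\lambda}\frac{\partial}{\partial q^\alpha}\frac{\partial q^\beta}{\partial x^\lambda}\frac{\partial}{\partial q^\beta} = \sum_{\alpha,\beta=1}^{p} J_p^{-1}\frac{\partial}{\partial q^\alpha}\Big(J_p\, g^{\alpha\beta}\frac{\partial}{\partial q^\beta}\Big)$$

and $R_r$ is the scalar curvature of $R^p_r$. Let $T_r$ denote the self-adjoint Friedrichs extension of $S_r$. The kinetic energy operator of an n-particle system in $R^p_r$ is then given by

$$T^{(n)}_r = T_r\otimes\mathbb{1}\otimes\cdots\otimes\mathbb{1} + \cdots + \mathbb{1}\otimes\cdots\otimes\mathbb{1}\otimes T_r.$$

Suppose the interaction in the system is defined as a multiplication operator $n^{-1}W^{(n)}_r$ on $\mathcal{H}^n_r$ by the real-valued function

$$n^{-1}W(q_1,\dots,q_n) = \sum_{i=1}^{n} V(q_i) + n^{-1}\sum_{1\le i<j\le n} U(q_i, q_j), \tag{2.2.2}$$

where $V(q) \in L^\infty(S^p)$ and $U(q, q') = \sum_{\alpha=1}^{\mu} \lambda_\alpha\varphi_\alpha(q)\varphi_\alpha(q') \in L^\infty(S^p\times S^p)$. The Hamiltonian of the system is defined in $\mathcal{H}^n_r$ as the sum

$$H^{(n)}_r = T^{(n)}_r + n^{-1}W^{(n)}_r. \tag{2.2.3}$$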
The next step in setting up a mean-field theory for the system (2.2.3) in the Gibbs canonical ensemble is the construction of the conditional Wiener measure $\mu^{r\beta}_{qq'}$ over the space of paths $\Omega_{r\beta} = \{\omega : [0, \beta] \to R^p_r\}$. It turns out that this is a relatively simple task once one has found the integral kernel of the operator $\exp(-\beta T_r)$ on $\mathcal{H}_r$. The latter can be constructed by a method of Ito and McKean (1965, Section 4.11) and Maćkowiak (1983b). The classical procedure of Nelson (1964) then immediately produces the measure $\mu^{r\beta}_{qq'}$. Existence of the Wiener measure $\mu^{r\beta}_{qq'}$ implies validity of the Feynman-Kac theorem in $\mathcal{H}^n_r$ (Angelescu et al., 1973; Bratteli et al., 1981; Roepstorff, 1994). By applying the method of Sections 3-5, one can investigate the limiting behaviour of the free energy density

$$f(H^{(n)}_r, \beta) = -(n\beta)^{-1}\ln \mathrm{Tr}_{\mathcal{H}^n_r}\exp(-\beta H^{(n)}_r)$$

in the thermodynamic limit

$$n\to\infty, \quad r_\alpha\to\infty, \quad r_\alpha^{-1}r_\beta = a_{\alpha\beta} = \text{const.} \ (\alpha, \beta = 1,\dots,\nu-p), \quad n|R^p_r|^{-1} = d. \tag{2.2.4}$$

One arrives in this manner at word-for-word analogues of the theorems proved in these sections for n-particle quantum systems in a bounded region.
3. Thermodynamic limit of free energy density $f(H^{(n)}_{\sigma\Lambda}, \beta)$ for n quantum particles obeying Boltzmann statistics

3.1. n particles in a bounded region $\Lambda$

Suppose the potential $W^{(n)}_\Lambda$ of n quantum particles obeying Boltzmann statistics, confined to a bounded region $\Lambda \subset \mathbb{R}^\nu$, is defined as a multiplication operator in $L^2(\Lambda)^n$ by a real-valued function

$$W(x_1,\dots,x_n) = \sum_{1\le i<j\le n} W_2(x_i, x_j) = \sum_{i=1}^{n} V(x_i) + n^{-1}\sum_{1\le i<j\le n} U(x_i, x_j), \tag{3.1.1}$$

where $U(x, y)$ is a real function of the form

$$U(x, y) = -\sum_{\alpha=1}^{\mu}\varphi_\alpha(x)\varphi_\alpha(y) + \sum_{\gamma=1}^{\nu}\psi_\gamma(x)\psi_\gamma(y) \tag{3.1.2}$$

with $\varphi_\alpha, \psi_\gamma \in L^\infty(\mathbb{R}^\nu)$ ($\alpha = 1,\dots,\mu$, $\gamma = 1,\dots,\nu$) continuous almost everywhere (a.e.) on $\mathbb{R}^\nu$ and $\varphi_1,\dots,\varphi_\mu, \psi_1,\dots,\psi_\nu, 1$ linearly independent on any sufficiently large subset of $\mathbb{R}^\nu$. The factor $n^{-1}$ on the r.h.s. of Eq. (3.1.1) is introduced in order to ensure linearity of the energy in n as $n\to\infty$. No restrictions will be imposed on $V(x)$ as long as $H^{(n)}_{\sigma\Lambda} = T^{(n)}_{\sigma\Lambda} + W^{(n)}_\Lambda$ is self-adjoint and $\exp(-\beta H^{(n)}_{\sigma\Lambda})$ is a trace-class operator admitting a Feynman-Kac representation. The set of trace-class operators of this type is nonempty. In particular, for any bounded $W(x_1,\dots,x_n)$, $\exp(-\beta H^{(n)}_{\sigma\Lambda})$ is trace-class, since $\exp(-\beta T^{(n)}_{\sigma\Lambda})$ is such (Bratteli and Robinson, 1981).
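The separable form (3.1.2) is what makes the model tractable: the pair sum in Eq. (3.1.1) collapses into squares of one-particle sums plus a diagonal remainder. The following Python sketch checks this rearrangement numerically. The functions in `PHI` and `PSI` are hypothetical bounded stand-ins for the $\varphi_\alpha$ and $\psi_\gamma$, chosen only for illustration, and the sample points are one-dimensional for brevity.

```python
import math
import random

# Hypothetical bounded one-particle functions standing in for phi_alpha, psi_gamma.
PHI = [math.sin, math.cos]   # two "phi" functions
PSI = [math.tanh]            # one "psi" function


def U(x, y):
    """Separable two-body interaction of the form (3.1.2)."""
    return (-sum(p(x) * p(y) for p in PHI)
            + sum(q(x) * q(y) for q in PSI))


def pair_sum(xs):
    """n^{-1} times the sum of U(x_i, x_j) over pairs i < j, as in (3.1.1)."""
    n = len(xs)
    total = sum(U(xs[i], xs[j]) for i in range(n) for j in range(i + 1, n))
    return total / n


def collective_form(xs):
    """Same quantity written through squares of one-particle sums."""
    n = len(xs)
    phi_sq = sum(sum(p(x) for x in xs) ** 2 for p in PHI)
    psi_sq = sum(sum(q(x) for x in xs) ** 2 for q in PSI)
    diagonal = sum(U(x, x) for x in xs)
    return (-phi_sq + psi_sq - diagonal) / (2 * n)


random.seed(0)
points = [random.uniform(-3.0, 3.0) for _ in range(25)]
print(abs(pair_sum(points) - collective_form(points)))
```

The printed difference should be at rounding level. The diagonal remainder is the term that stays bounded in n and is discarded later in the derivation.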
We shall focus our interest on the thermodynamic limit of the free energy per particle of the system $H^{(n)}_{\sigma\Lambda}$ in the canonical ensemble, viz.,

$$f(\beta, d) = \lim_{n,\Lambda} f(H^{(n)}_{\sigma\Lambda}, \beta), \tag{3.1.3}$$

where

$$f(H^{(n)}_{\sigma\Lambda}, \beta) = -(\beta n)^{-1}\ln\,(n!)^{-1}Z_{\sigma\Lambda}, \qquad Z_{\sigma\Lambda} = \mathrm{Tr}\exp(-\beta H^{(n)}_{\sigma\Lambda}), \tag{3.1.4}$$

the limit being approached with particle density tending to $d$:

$$n\to\infty, \quad |\Lambda|\to\infty, \quad \lim n|\Lambda|^{-1} = d. \tag{3.1.5}$$

The conditions under which the thermodynamic limit of free energy per particle exists for many-particle quantum systems in the canonical and grand canonical ensemble, as well as its properties, were investigated thoroughly by Lieb (1966), Ruelle (1969), Novikov (1969), Robinson (1971), Ginibre (1965, 1971, 1972), and Bratteli and Robinson (1981). The properties of $f(\beta, d)$ for systems of identical quantum particles interacting via stable and tempered potentials $W(x_1,\dots,x_n)$, i.e., satisfying

$$W(x_1,\dots,x_n) \ge -nB \quad \text{for some } B \ge 0 \text{ and all } n > 0 \quad \text{(stability)},$$

$$W_2(x_i, x_j) \le A|x_i - x_j|^{-\lambda} \quad \text{for some } \lambda > \nu,\ A \ge 0,\ |x_i - x_j| \ge R_0 > 0 \quad \text{(temperedness)},$$

and under Dirichlet boundary conditions were investigated by Ruelle (1969) and Novikov (1969). Angelescu and Nenciu (1973) proved that for such potentials satisfying, moreover,

$$\sum_{i=1}^{m} W_2(x_0, x_i) \ge -2B$$

for any finite family of points $x_0, x_1,\dots,x_m \in \mathbb{R}^\nu$ with $|x_i - x_j| \ge a$, $i \ne j$, where $a$ is the hard core radius of $W_2(x_i, x_j)$, the thermodynamic limit of pressure for Boltzmann particles and bosons in the grand canonical ensemble is independent of the type of boundary conditions imposed.

In order to calculate the limit (3.1.3) we shall derive an upper and a lower bound on $f(H^{(n)}_{\sigma\Lambda}, \beta)$ and prove that both bounds coalesce in the limit (3.1.5). The proof exploits a technique developed by Tindemans and Capel (1974) and Pearce and Thompson (1975).

3.1.1. Lower bound on $f(H^{(n)}_{\sigma\Lambda}, \beta)$

A lower bound on $f(H^{(n)}_{\sigma\Lambda}, \beta)$ can be derived by exploiting the Feynman-Kac representation for the integral kernel of the operator $\exp(-\beta H^{(n)}_{\sigma\Lambda})$. As an integral with respect to the conditional Wiener measure, $Z_{\sigma\Lambda}$ assumes the form

$$Z_{\sigma\Lambda} = \int_{\Omega^n_{\Lambda\beta}} d\mu^{\sigma\Lambda\beta}_V(\omega_1,\dots,\omega_n)\, \exp\Big[(2n)^{-1}\sum_{\alpha}\int_0^\beta\Big(\sum_{i=1}^{n}\varphi_\alpha(\omega_i(s))\Big)^2 ds - (2n)^{-1}\sum_{\gamma}\int_0^\beta\Big(\sum_{i=1}^{n}\psi_\gamma(\omega_i(s))\Big)^2 ds + (2n)^{-1}\sum_{i=1}^{n}\int_0^\beta U(\omega_i(s), \omega_i(s))\,ds\Big], \tag{3.1.6}$$
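the measure $\mu^{\sigma\Lambda\beta}_V$ being defined as follows:

$$\mu^{\sigma\Lambda\beta}_V(\Gamma_1,\dots,\Gamma_n) = \int_{\Lambda^n} dx_1\cdots dx_n \prod_{i=1}^{n}\int_{\Gamma_i} d\mu^{\sigma\Lambda\beta}_{x_i x_i}(\omega_i)\, \exp\Big[-\int_0^\beta V(\omega_i(s))\,ds\Big] \tag{3.1.7}$$

for $\Gamma_i \subset \Omega_{\Lambda\beta}$, $i = 1,\dots,n$. The third integral in the exponent of the integrand in Eq. (3.1.6) is bounded in n, viz.,

$$\Big|(2n)^{-1}\sum_{i=1}^{n}\int_0^\beta U(\omega_i(s), \omega_i(s))\,ds\Big| \le \frac{1}{2}\beta\Big(\sum_\alpha\|\varphi_\alpha\|^2 + \sum_\gamma\|\psi_\gamma\|^2\Big),$$

therefore it does not contribute to $f(\beta, d)$. In further calculations it will be discarded. The functions $\varphi_\alpha, \psi_\gamma \in L^\infty(\mathbb{R}^\nu)$, therefore

$$(2n)^{-1}\int_0^\beta\Big[\sum_\alpha\Big(\sum_i\varphi_\alpha(\omega_i(s))\Big)^2 - \sum_\gamma\Big(\sum_i\psi_\gamma(\omega_i(s))\Big)^2\Big]ds = \lim_{m\to\infty}\beta(2nm)^{-1}\sum_{k=1}^{m}\Big[\sum_\alpha\Big(\sum_i\varphi_\alpha\big(\omega_i(k\beta/m)\big)\Big)^2 - \sum_\gamma\Big(\sum_i\psi_\gamma\big(\omega_i(k\beta/m)\big)\Big)^2\Big]$$

for any continuous trajectory $\{\omega_1(s),\dots,\omega_n(s)\}$. Furthermore,

$$\int_{\Omega^n_{\Lambda\beta}} d\mu^{\sigma\Lambda\beta}_V(\omega_1,\dots,\omega_n)\, \exp\Big\{(2mn)^{-1}\beta\sum_{k=1}^{m}\Big[\sum_\alpha\Big(\sum_i\varphi_\alpha\big(\omega_i(k\beta/m)\big)\Big)^2 - \sum_\gamma\Big(\sum_i\psi_\gamma\big(\omega_i(k\beta/m)\big)\Big)^2\Big]\Big\} \le \exp\Big[\frac{1}{2}\beta n\Big(\sum_\alpha\|\varphi_\alpha\|^2 + \sum_\gamma\|\psi_\gamma\|^2\Big)\Big]\mathrm{Tr}\exp(-\beta T^{(n)}_{\sigma\Lambda} - \beta V^{(n)}_\Lambda) < \infty,$$

so by the dominated convergence theorem

$$Z_{\sigma\Lambda} = \lim_{m\to\infty}\int_{\Omega^n_{\Lambda\beta}} d\mu^{\sigma\Lambda\beta}_V(\omega_1,\dots,\omega_n)\, \exp\Big\{\beta(2nm)^{-1}\sum_{k=1}^{m}\Big[\sum_\alpha\Big(\sum_i\varphi_\alpha\big(\omega_i(k\beta/m)\big)\Big)^2 - \sum_\gamma\Big(\sum_i\psi_\gamma\big(\omega_i(k\beta/m)\big)\Big)^2\Big]\Big\}. \tag{3.1.8}$$

Let us now linearize the quadratic terms $\big(\sum_i\varphi_\alpha(\omega_i(k\beta/m))\big)^2$ in the exponential of the integrand in Eq. (3.1.8) using the identity

$$\exp(a^2) = (2\pi)^{-1/2}\int_{-\infty}^{\infty}\exp\Big(-\frac{1}{2}\xi^2 + \sqrt{2}\,a\xi\Big)d\xi$$

and those containing the $\psi_\gamma$ functions by performing the substitution

$$-(2n)^{-1}\Big(\sum_{i=1}^{n}\psi_\gamma(x_i)\Big)^2 = \frac{1}{2}n\eta_\gamma^2 - \sum_{i=1}^{n}\psi_\gamma(x_i)\eta_\gamma - (2n)^{-1}\Big(\sum_{i=1}^{n}\psi_\gamma(x_i) - \eta_\gamma n\Big)^2. \tag{3.1.9}$$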
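Discarding the last term on the r.h.s. of Eq. (3.1.9), one obtains, by exploiting positivity of the measure $\mu^{\sigma\Lambda\beta}_{xy}$, an upper bound on $Z_{\sigma\Lambda}$:

$$Z_{\sigma\Lambda} \le \lim_{m\to\infty}(2\pi)^{-m\mu/2}\int_{\mathbb{R}^{m\mu}}\prod_{k,\alpha}d\zeta_{k\alpha}\int_{\Omega^n_{\Lambda\beta}}d\mu^{\sigma\Lambda\beta}_V(\omega_1,\dots,\omega_n)\, \exp\Big[\sum_{j,\alpha}\Big(-\frac{1}{2}\zeta_{j\alpha}^2 + \Big(\frac{\beta}{mn}\Big)^{1/2}\zeta_{j\alpha}\sum_{i=1}^{n}\varphi_\alpha\big(\omega_i(j\beta/m)\big)\Big) + \frac{\beta n}{2m}\sum_{k,\gamma}\eta_{k\gamma}^2 - \frac{\beta}{m}\sum_{k,\gamma}\eta_{k\gamma}\sum_{i=1}^{n}\psi_\gamma\big(\omega_i(k\beta/m)\big)\Big],$$

where the $\eta_{k\gamma}$'s are arbitrary functions of $\zeta_{11},\dots,\zeta_{m\mu}$. In new variables $\xi_{j\alpha} = (m/\beta n)^{1/2}\zeta_{j\alpha}$, $j = 1,\dots,m$, and for the choice $\eta_{q\gamma} \equiv \eta_\gamma(\xi_{q1},\dots,\xi_{q\mu})$, $\gamma = 1,\dots,\nu$, the bound takes the form

$$Z_{\sigma\Lambda} \le \lim_{m\to\infty}\Big(\frac{\beta n}{2\pi m}\Big)^{m\mu/2}\int_{\mathbb{R}^{m\mu}}\prod_{k,\alpha}d\xi_{k\alpha}\Big\{\int_{\Omega_{\Lambda\beta}}d\mu^{\sigma\Lambda\beta}_V(\omega)\, \exp\Big[\frac{\beta}{m}\sum_{j,\alpha}\Big(-\frac{1}{2}\xi_{j\alpha}^2 + \xi_{j\alpha}\varphi_\alpha\big(\omega(j\beta/m)\big)\Big) + \frac{\beta}{m}\sum_{q,\gamma}\Big(\frac{1}{2}\eta_\gamma^2(\xi_{q1},\dots,\xi_{q\mu}) - \eta_\gamma(\xi_{q1},\dots,\xi_{q\mu})\psi_\gamma\big(\omega(q\beta/m)\big)\Big)\Big]\Big\}^n.$$

To carry out the next step one now adds and subtracts the term $n\beta\big(\sum_{s=1}^{m}\xi_s\big)^2/2m^2x$, where $\xi_s = (\xi_{s1},\dots,\xi_{s\mu})$, in the exponent of the integrand:

$$Z_{\sigma\Lambda} \le \lim_{m\to\infty}\Big(\frac{\beta n}{2\pi m}\Big)^{m\mu/2}\int_{\mathbb{R}^{m\mu}}\prod_{k,\alpha}d\xi_{k\alpha}\, \exp\big[n g^{\Lambda}_x(\xi_1,\dots,\xi_m, \eta_1,\dots,\eta_\nu)\big]\exp\Big[-\frac{\beta n}{2m}\sum_{r,s}\big(\delta_{rs} - (mx)^{-1}\big)\xi_r\cdot\xi_s\Big], \tag{3.1.10}$$

where

$$g^{\Lambda}_x(\xi_1,\dots,\xi_m, \eta_1,\dots,\eta_\nu) = -\beta(2m^2x)^{-1}\Big(\sum_{s}\xi_s\Big)^2 + \beta(2m)^{-1}\sum_{p,\gamma}\eta_\gamma^2(\xi_{p1},\dots,\xi_{p\mu}) + \ln\int_{\Omega_{\Lambda\beta}}d\mu^{\sigma\Lambda\beta}_V(\omega)\, \exp\Big[\frac{\beta}{m}\sum_{j,\alpha}\xi_{j\alpha}\varphi_\alpha\big(\omega(j\beta/m)\big) - \frac{\beta}{m}\sum_{k,\gamma}\eta_\gamma(\xi_{k1},\dots,\xi_{k\mu})\psi_\gamma\big(\omega(k\beta/m)\big)\Big]$$

and, following Pearce and Thompson (1975), replaces the first exponential function in the integrand on the r.h.s. of Eq. (3.1.10) by its maximum with respect to $\xi_1,\dots,\xi_m$:

$$Z_{\sigma\Lambda} \le \lim_{m\to\infty}\max_{\{\xi_s\}}\exp\big[n g^{\Lambda}_x(\xi_1,\dots,\xi_m, \eta_1,\dots,\eta_\nu)\big]\Big(\frac{\beta n}{2\pi m}\Big)^{m\mu/2}\int_{\mathbb{R}^{m\mu}}\prod_{k,\alpha}d\xi_{k\alpha}\, \exp\Big[-\frac{n\beta}{2m}\sum_{r,s}\big(\delta_{rs} - (mx)^{-1}\big)\xi_r\cdot\xi_s\Big].$$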
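The $m\times m$ matrix with entries $\delta_{rs} - (mx)^{-1}$ has one nondegenerate eigenvalue $1 - x^{-1}$ and one $(m-1)$-fold eigenvalue equal to 1. Thus for $x > 1$,

$$Z_{\sigma\Lambda} \le \lim_{m\to\infty}\max_{\{\xi_s\}}\exp\big[n g^{\Lambda}_x(\xi_1,\dots,\xi_m, \eta_1,\dots,\eta_\nu)\big](1 - x^{-1})^{-\mu/2} = \max_{\{\xi_\alpha\}}\exp\big[n G^{\Lambda}_x(\xi_1,\dots,\xi_\mu, \eta_1,\dots,\eta_\nu)\big](1 - x^{-1})^{-\mu/2}, \tag{3.1.11}$$

where $G^{\Lambda}_x$ is a functional, defined on real functions $\xi_\alpha \in L^2[0, \beta]$, of the form

$$G^{\Lambda}_x(\xi_1,\dots,\xi_\mu, \eta_1,\dots,\eta_\nu) = -(2\beta x)^{-1}\sum_\alpha\Big(\int_0^\beta\xi_\alpha(s)\,ds\Big)^2 + \frac{1}{2}\sum_\gamma\int_0^\beta\eta_\gamma^2(\xi_1(s),\dots,\xi_\mu(s))\,ds + \ln\int_{\Omega_{\Lambda\beta}}d\mu^{\sigma\Lambda\beta}_V(\omega)\, \exp\Big[\sum_\alpha\int_0^\beta\xi_\alpha(s)\varphi_\alpha(\omega(s))\,ds - \sum_\gamma\int_0^\beta\eta_\gamma(\xi_1(s),\dots,\xi_\mu(s))\psi_\gamma(\omega(s))\,ds\Big].$$

In order to find the maximum on the r.h.s. of Eq. (3.1.11) we shall need the following result.

Lemma 3.1. $G^{\Lambda}_x$ assumes its absolute maximum on constant functions

$$\xi_\alpha(s) = \xi_\alpha = \text{const.} \quad \text{for } \alpha = 1,\dots,\mu. \tag{3.1.12}$$

Proof. $G^{\Lambda}_x$ can be written alternatively as

$$\lim_{m\to\infty} g^{\Lambda}_m(\xi_{11},\dots,\xi_{m\mu}, \eta_1,\dots,\eta_\nu),$$

where

$$g^{\Lambda}_m(\xi_{11},\dots,\xi_{m\mu}, \eta_1,\dots,\eta_\nu) = -\beta(2m^2x)^{-1}\sum_\alpha\Big(\sum_{k}\xi_{k\alpha}\Big)^2 + \beta(2m)^{-1}\sum_{\gamma,q}\eta_\gamma^2(\xi_{q1},\dots,\xi_{q\mu}) + \ln\int_{\Omega_{\Lambda\beta}}d\mu^{\sigma\Lambda\beta}_V(\omega)\, \exp\Big[\sum_{k,\alpha}\xi_{k\alpha}\int_{(k-1)\beta/m}^{k\beta/m}\varphi_\alpha(\omega(s))\,ds - \sum_{q,\gamma}\eta_\gamma(\xi_{q1},\dots,\xi_{q\mu})\int_{(q-1)\beta/m}^{q\beta/m}\psi_\gamma(\omega(s))\,ds\Big]$$

and $\xi_{k\alpha} := \xi_\alpha(s_k)$, where $s_k \in [(k-1)m^{-1}\beta, km^{-1}\beta)$. According to Eq. (2.1.6) and the construction of integrals with respect to the Wiener measure $\mu^{\sigma\Lambda\beta}_{xy}$ (Bratteli and Robinson, 1981; Glimm and Jaffe, 1981; Roepstorff, 1994),

$$g^{\Lambda}_m(\xi_{11},\dots,\xi_{m\mu}, \eta_1,\dots,\eta_\nu) = -\beta(2m^2x)^{-1}\sum_\alpha\Big(\sum_{k}\xi_{k\alpha}\Big)^2 + \beta(2m)^{-1}\sum_{\gamma,q}\eta_\gamma^2(\xi_{q1},\dots,\xi_{q\mu}) + \ln\mathrm{Tr}\prod_{j=1}^{m}\exp(-\beta m^{-1}\tilde h_{j\Lambda}),$$

where

$$\tilde h_{j\Lambda} = T_{\sigma\Lambda} + V_\Lambda + \sum_\gamma\eta_\gamma(\xi_{j1},\dots,\xi_{j\mu})\psi_{\gamma\Lambda} - \sum_\alpha\xi_{j\alpha}\varphi_{\alpha\Lambda}.$$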
The form of $g^{\Lambda}_m$ in the neighbourhood of its absolute maximum can be found by applying Hölder's inequality in the form

$$\Big|\mathrm{Tr}\prod_{i=1}^{m}A_i\Big| \le \prod_{i=1}^{m}\big(\mathrm{Tr}\,A_i^m\big)^{1/m}, \tag{3.1.13}$$

where $A_i$ are positive definite trace-class operators (Dunford and Schwartz, 1963). Capel and Tindemans (1974) proved this inequality, together with a necessary and sufficient condition for equality to hold: there exist constants $\lambda_{ij}$ such that $A_i\lambda_{ij} = A_j$ for $i, j = 1,\dots,m$, in finite dimensional space. Since finite rank operators are dense in the Banach space of trace-class operators with the trace norm $\|A\|_1 = \mathrm{Tr}(A^+A)^{1/2}$ and the trace is a continuous functional with respect to the trace norm (Reed and Simon, 1973), it follows that their proof of Eq. (3.1.13) for finite rank operators, together with the condition for equality, extends to trace-class operators. Using Eq. (3.1.13), we get a bound on $g^{\Lambda}_m$:

$$g^{\Lambda}_m(\xi_{11},\dots,\xi_{m\mu}, \eta_1,\dots,\eta_\nu) \le -\beta(2m^2x)^{-1}\sum_\alpha\Big(\sum_{k}\xi_{k\alpha}\Big)^2 + m^{-1}\sum_{k=1}^{m}\Big[\frac{1}{2}\beta\sum_\gamma\eta_\gamma^2(\xi_{k1},\dots,\xi_{k\mu}) + \ln\mathrm{Tr}\exp(-\beta\tilde h_{k\Lambda})\Big]. \tag{3.1.14}$$

The necessary condition for the maximum of the bound on the r.h.s. of Eq. (3.1.14) is, for $t = 1,\dots,m$, $\alpha = 1,\dots,\mu$,

$$-\beta(mx)^{-1}\sum_{s}\xi_{s\alpha} + \frac{\partial}{\partial\xi_{t\alpha}}\Big[\frac{1}{2}\beta\sum_\gamma\eta_\gamma^2(\xi_{t1},\dots,\xi_{t\mu}) + \ln\mathrm{Tr}\exp(-\beta\tilde h_{t\Lambda})\Big] = 0. \tag{3.1.15}$$

The solutions $\xi_t = (\xi_{t1},\dots,\xi_{t\mu})$ of this equation are, clearly, independent of $t$. The r.h.s. of Eq. (3.1.14) thus assumes its maximum at a point $\{\xi_{k\alpha}\} = \{\xi_\alpha\}$ for $k = 1,\dots,m$, $\alpha = 1,\dots,\mu$, and so does $g^{\Lambda}_m$, since equality in Eq. (3.1.14) holds if and only if

$$\tilde h_{l\Lambda} = -(\beta/m)\ln\lambda_{kl} + \tilde h_{k\Lambda}, \quad k, l = 1,\dots,m,$$

which, by the linear independence of the functions $\varphi_1,\dots,\varphi_\mu, \psi_1,\dots,\psi_\nu, 1$, imposes the constraints (3.1.12) on $\xi_1(s),\dots,\xi_\mu(s)$. QED

Let us verify whether the constraint (3.1.12) agrees with the necessary conditions for the maximum of $G^{\Lambda}_x$:

$$\frac{\delta G^{\Lambda}_x}{\delta\xi_\delta(t)} = 0, \quad \delta = 1,\dots,\mu. \tag{3.1.16}$$

Explicitly, these equations are

$$-(\beta x)^{-1}\int_0^\beta\xi_\delta(s)\,ds + \sum_\gamma\frac{\partial\eta_\gamma}{\partial\xi_\delta(t)}\eta_\gamma(\xi_1(t),\dots,\xi_\mu(t)) + \Big\langle\varphi_\delta(\omega(t)) - \sum_\gamma\frac{\partial\eta_\gamma(\xi_1(t),\dots,\xi_\mu(t))}{\partial\xi_\delta(t)}\psi_\gamma(\omega(t))\Big\rangle_{\Lambda\beta} = 0, \quad \delta = 1,\dots,\mu, \tag{3.1.17}$$
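where

$$\langle s(\omega)\rangle_{\Lambda\beta} = \frac{\int_{\Omega_{\Lambda\beta}}d\mu^{\sigma\Lambda\beta}_V(\omega)\,s(\omega)\exp\big[\sum_\alpha\int_0^\beta\xi_\alpha(s)\varphi_\alpha(\omega(s))\,ds - \sum_\gamma\int_0^\beta\eta_\gamma(\xi_1(s),\dots,\xi_\mu(s))\psi_\gamma(\omega(s))\,ds\big]}{\int_{\Omega_{\Lambda\beta}}d\mu^{\sigma\Lambda\beta}_V(\omega)\exp\big[\sum_\alpha\int_0^\beta\xi_\alpha(s)\varphi_\alpha(\omega(s))\,ds - \sum_\gamma\int_0^\beta\eta_\gamma(\xi_1(s),\dots,\xi_\mu(s))\psi_\gamma(\omega(s))\,ds\big]}.$$

According to Eq. (2.1.6), for constant functions $\xi_\alpha(s) = \xi_\alpha$, Eqs. (3.1.17) take the form

$$\big(\mathrm{Tr}\exp(-\beta\tilde h_{\sigma\Lambda})\big)^{-1}\mathrm{Tr}\Big\{\exp(-t\tilde h_{\sigma\Lambda})\Big[\varphi_{\delta\Lambda} - \sum_\gamma\frac{\partial\eta_\gamma(\xi_1,\dots,\xi_\mu)}{\partial\xi_\delta}\psi_{\gamma\Lambda}\Big]\exp\big(-(\beta - t)\tilde h_{\sigma\Lambda}\big)\Big\} - x^{-1}\xi_\delta + \sum_\gamma\frac{\partial\eta_\gamma}{\partial\xi_\delta}\eta_\gamma(\xi_1,\dots,\xi_\mu) = 0, \quad \delta = 1,\dots,\mu, \tag{3.1.18}$$

and, clearly, do not depend on $t$. The condition (3.1.12) is therefore compatible with the necessary conditions for the maximum of $G^{\Lambda}_x$.

For the functions $\eta_\gamma(\xi_1,\dots,\xi_\mu)$ let us now take the solutions of the equations

$$\frac{\delta G^{\Lambda}_x}{\delta\eta_j(\xi_1(t),\dots,\xi_\mu(t))} = 0, \quad j = 1,\dots,\nu. \tag{3.1.19}$$

On substitution $\xi_\alpha(s) = \xi_\alpha$, $\alpha = 1,\dots,\mu$, these equations assume the form

$$\eta_j(\xi_1,\dots,\xi_\mu) = \langle\psi_{j\Lambda}\rangle_{\tilde h_{\sigma\Lambda}}, \quad j = 1,\dots,\nu, \tag{3.1.20}$$

where

$$\langle\psi\rangle_h = (\mathrm{Tr}\exp(-\beta h))^{-1}\,\mathrm{Tr}(\psi\exp(-\beta h)).$$

With the choice of $\eta_\gamma(\xi_1,\dots,\xi_\mu)$ given by Eqs. (3.1.20), Eqs. (3.1.18) reduce to

$$x^{-1}\xi_\delta = \langle\varphi_{\delta\Lambda}\rangle_{\tilde h_{\sigma\Lambda}}, \quad \delta = 1,\dots,\mu. \tag{3.1.21}$$

As for the functional $G^{\Lambda}_x$, for constant $\xi_1,\dots,\xi_\mu$ it simplifies to the form

$$G^{\Lambda}_x(\xi_1,\dots,\xi_\mu, \eta_1,\dots,\eta_\nu) = -\frac{1}{2x}\beta\sum_\alpha\xi_\alpha^2 + \frac{1}{2}\beta\sum_\gamma\eta_\gamma^2 - \beta f(\tilde h_{\sigma\Lambda}(\xi, \eta), \beta) \tag{3.1.22}$$

and has the following property:

Lemma 3.2 (Tindemans and Capel, 1974). For fixed $\xi_1,\dots,\xi_\mu$ the solution $\eta^*_1,\dots,\eta^*_\nu$ of Eqs. (3.1.20) is unique and minimizes $G^{\Lambda}_x$.

Proof. Let $\bar\eta_1,\dots,\bar\eta_\nu$ be fixed. Then Bogoliubov's inequality

$$f(H_0 + H_1, \beta) \le f(H_0, \beta) + n^{-1}\langle H_1\rangle_{H_0} \tag{3.1.23}$$

implies

$$f(\tilde h_{\sigma\Lambda}(\xi, \eta), \beta) \le f(\tilde h_{\sigma\Lambda}(\xi, \bar\eta), \beta) + \sum_\gamma(\eta_\gamma - \bar\eta_\gamma)\langle\psi_{\gamma\Lambda}\rangle_{\tilde h_{\sigma\Lambda}(\xi, \bar\eta)}.$$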
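On the other hand,

$$f(\tilde h_{\sigma\Lambda}(\xi, \eta), \beta) = f(\tilde h_{\sigma\Lambda}(\xi, \bar\eta), \beta) + \sum_\gamma(\eta_\gamma - \bar\eta_\gamma)\langle\psi_{\gamma\Lambda}\rangle_{\tilde h_{\sigma\Lambda}(\xi, \bar\eta)} + \frac{1}{2}\sum_{\gamma,\delta}(\eta_\gamma - \bar\eta_\gamma)(\eta_\delta - \bar\eta_\delta)\frac{\partial^2 f}{\partial\eta_\gamma\partial\eta_\delta}\Big|_{\eta=\bar\eta} + \cdots,$$

therefore, by comparison, the matrix $(\partial^2 f/\partial\eta_\gamma\partial\eta_\delta)(\eta = \bar\eta)$ is nonpositive. Furthermore, for $\eta \ne \bar\eta$,

$$\sum_{\gamma,\delta}\frac{\partial^2 G^{\Lambda}_x}{\partial\eta_\gamma\partial\eta_\delta}\Big|_{\eta=\bar\eta}(\eta_\gamma - \bar\eta_\gamma)(\eta_\delta - \bar\eta_\delta) = \beta\sum_\gamma(\eta_\gamma - \bar\eta_\gamma)^2 - \beta\sum_{\gamma,\delta}\frac{\partial^2 f}{\partial\eta_\gamma\partial\eta_\delta}\Big|_{\eta=\bar\eta}(\eta_\gamma - \bar\eta_\gamma)(\eta_\delta - \bar\eta_\delta) > 0. \tag{3.1.24}$$

Suppose now that Eqs. (3.1.20) have two different solutions $\{\eta_\gamma\}$ and $\{\eta'_\gamma\} = \{\eta_\gamma + \varepsilon_\gamma\}$. Then, according to the Lagrange theorem, there exists a point $\{\eta''_\gamma\} = \{\eta_\gamma + \theta\varepsilon_\gamma\}$, with $0 < \theta < 1$, such that

$$\sum_{\gamma,\delta}\frac{\partial^2 G^{\Lambda}_x}{\partial\eta_\gamma\partial\eta_\delta}\Big|_{\eta=\eta''}\varepsilon_\gamma\varepsilon_\delta = 0,$$

which contradicts Eq. (3.1.24). Thus there can be at most one solution, its existence being guaranteed by the property

$$\lim_{|\eta|\to\infty} G^{\Lambda}_x = \infty \quad \text{for fixed } \{\xi_\alpha\}$$

of $G^{\Lambda}_x$. QED

Let us denote the unique solution of Eq. (3.1.20) by $\{\eta^*_\gamma\}$. Since $|\eta^*_\gamma| \le \|\psi_\gamma\|$ according to Eq. (3.1.20), therefore

$$\lim_{|\xi|\to\infty} G^{\Lambda}_x(\xi_1,\dots,\xi_\mu, \eta^*_1,\dots,\eta^*_\nu) = -\infty,$$

whence $G^{\Lambda}_x(\xi_1,\dots,\xi_\mu, \eta^*_1,\dots,\eta^*_\nu)$ has at least one maximum with respect to $\{\xi_\alpha\}$. Let

$$\max_{\{\xi_\alpha\}} G^{\Lambda}_x(\xi_1,\dots,\xi_\mu, \eta^*_1,\dots,\eta^*_\nu) = G^{\Lambda}_x(\xi^*_1,\dots,\xi^*_\mu) = -\frac{\beta}{2x}\sum_\alpha\xi^{*2}_\alpha + \frac{1}{2}\beta\sum_\gamma\eta^{*2}_\gamma(\xi^*_1,\dots,\xi^*_\mu) + \ln\mathrm{Tr}\exp(-\beta\tilde h_{\sigma\Lambda}(\xi^*, \eta^*)). \tag{3.1.25}$$

The bound (3.1.11) now yields

$$f(H^{(n)}_{\sigma\Lambda}, \beta) \ge f(\tilde h^{(n)}_{\sigma\Lambda}(\xi^*, \eta^*(\xi^*)), \beta) + \frac{1}{2x}\sum_\alpha\xi^{*2}_\alpha - \frac{1}{2}\sum_\gamma\eta^{*2}_\gamma + \mu(2n\beta)^{-1}\ln(1 - x^{-1}). \tag{3.1.26}$$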
On passing to the limit nPR, xP1, we get 1 1 lim f (HLK, b)5 lim f (hI LK(m*, g*(m*)), b)# m*! g* N ? A N 2 2 L ? A L
with
" lim min f (hLK(m, g*), b) N L +K?,
(3.1.27)
1 1 hK(m, g)"hI K# m ! g , hLK"nCI KL hK N ? 2 A N N N 2 ? A and the necessary conditions (3.1.21) for the minimum in Eq. (3.1.27) simplify in this limit to m "1uK2 p K * . B B F KE
(3.1.28)
3.1.2. Upper bound on f (HLK, b) N An upper bound on f (HLK, b) follows from the Bogoliubov inequality (3.1.23) by taking N H "hI LK(m, g*) and splitting HLK accordingly N N HLK"¹LK#¼L K N N 1 1 "hI LK(m, g*)# n m! g* K L! ( LK!nm K L ) N ? A ? ? 2 2n ? A ? 1 1 (3.1.29) # (WLK!ng* K L)! ºLK , A A 2n B 2n A where m ,2, m are arbitrary real-valued parameters, g*,2, g* are the unique solutions of J I Eqs. (3.1.20),
* hI LK(m, g*)"¹LK#»L K ! m LK# g WLK N A A N ? ? ? A and LK, WLK, ºLK are multiplication operators on ¸(K)L by the functions ? A B L L
(x ,2, x )" u (x ), W (x ,2, x )" t (x ) , ? L ? G A L A G G G L º (x ,2, x )" º(x , x ) , B L G G G respectively. Bogoliubov’s inequality now yields the following upper bound on f (HLK, b): N 1 1 f (HLK, b)4f (hI LK(m, g*), b)# m! g* ! 1( LK!nm )2 I Lp K * N N ? A ? ? F KE 2 2n ? A ? 1 1 1 # 1(tK)2 I p K * ! 1tK2I p K * ! 1ºLK2 I Lp K * A F A F KE KE 2n 2n 2n B F KE A A 1 1 4f (hI LK(m, g*), b)# m! g* # 3 #t # # #u # . (3.1.30) N ? A A ? 2 2n ? A A ?
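The upper bound above rests on Bogoliubov's inequality (3.1.23). As a hedged numerical illustration, the following sketch checks the inequality in its unnormalised form, F(H1 + H2) <= F(H1) + <H2>_{H1} with F(H) = -(1/beta) ln Tr exp(-beta H), for real symmetric 2x2 matrices. The matrices, the helper names and the restriction to 2x2 are assumptions made for illustration only; the text applies the inequality to free energy densities of n-particle operators.

```python
import math

def eig_sym2(h):
    """Eigenvalues and orthonormal eigenvectors of a real symmetric 2x2 matrix."""
    (a, b), (_, d) = h
    mean, half = 0.5 * (a + d), 0.5 * (a - d)
    r = math.hypot(half, b)
    theta = 0.5 * math.atan2(b, half)  # rotation angle that diagonalises h
    c, s = math.cos(theta), math.sin(theta)
    return (mean + r, mean - r), ((c, s), (-s, c))

def free_energy(h, beta):
    """F(h) = -(1/beta) ln Tr exp(-beta h), shifted by the lowest level for stability."""
    lam, _ = eig_sym2(h)
    lo = min(lam)
    return lo - math.log(sum(math.exp(-beta * (x - lo)) for x in lam)) / beta

def thermal_average(obs, h, beta):
    """<obs>_h = Tr(obs exp(-beta h)) / Tr exp(-beta h)."""
    lam, vecs = eig_sym2(h)
    lo = min(lam)
    weights = [math.exp(-beta * (x - lo)) for x in lam]
    # Diagonal matrix elements of obs in the eigenbasis of h.
    diag = [sum(v[i] * obs[i][j] * v[j] for i in range(2) for j in range(2)) for v in vecs]
    return sum(w * e for w, e in zip(weights, diag)) / sum(weights)

def add(h1, h2):
    return [[h1[i][j] + h2[i][j] for j in range(2)] for i in range(2)]

def bogoliubov_gap(h1, h2, beta):
    """F(h1) + <h2>_{h1} - F(h1 + h2); Bogoliubov's inequality states this is >= 0."""
    return free_energy(h1, beta) + thermal_average(h2, h1, beta) - free_energy(add(h1, h2), beta)

if __name__ == "__main__":
    h1 = [[1.0, 0.3], [0.3, -0.5]]
    h2 = [[0.2, -0.7], [-0.7, 0.4]]
    print(bogoliubov_gap(h1, h2, beta=2.0))
```

When H2 is a multiple of the identity the gap vanishes, which is the trivial case of equality; for non-commuting perturbations the gap is expected to be strictly positive.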
3.1.3. Asymptotic equality of upper and lower bound on f (HL K, b) N Comparison of the bounds (3.1.27) and (3.1.30) shows that for +m ,"+m*,, where +m*, is the ? ? ? solution of Eq. (3.1.28) which minimizes GK , the two bounds coalesce in the limit (3.1.5). As could have been expected, Eqs. (3.1.20) and (3.1.28) result by setting
j 1 f (hI LK(m, g), b)# m! g "0 , N ? A jg 2 A ? A 1 j f (hI LK(m, g), b)# m! g "0 . N ? A 2 jm ? ? A We have thus proved the following theorem:
(3.1.31) (3.1.32)
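Theorem 3.3. For separable bounded 2-particle interactions
\[ U(x,y) = \sum_{\alpha=1}^{k}\varphi_\alpha(x)\varphi_\alpha(y) - \sum_{\gamma=1}^{l}\psi_\gamma(x)\psi_\gamma(y) , \]
the equality
\[ \lim_{n,\Lambda} f(H^{n\Lambda}_{p},\beta) = \lim_{n,\Lambda} f(h^{n\Lambda}_{p}(m^{*},g^{*}(m^{*})),\beta) , \tag{3.1.33} \]
where $\{g^{*}_{\gamma}(m_1,\dots,m_k)\}$ are the unique solutions of the equations
\[ g_\gamma = \langle\psi^{\Lambda}_{\gamma}\rangle_{h^{\Lambda}_{p}(m,g)}, \qquad \gamma = 1,\dots,l \tag{3.1.20a} \]
and $\{m^{*}_{\alpha}\}$ are the solutions of the equations
\[ m_\alpha = \langle\varphi^{\Lambda}_{\alpha}\rangle_{h^{\Lambda}_{p}(m,g^{*})}, \qquad \alpha = 1,\dots,k \tag{3.1.28a} \]
which minimize $f(h^{n\Lambda}_{p}(m,g^{*}(m)),\beta)$, holds for all $\beta>0$.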
Eqs. (3.1.20a) and (3.1.28a) can be written equivalently as one equation for hK, viz., N (3.1.34) hK"Tr (HM K(KoJ K))!Tr(HM K(oJ KoJ K)) , N N N where oJ K"(Tr exp(!bhK))\ exp(!bhK) and N N HM K"(¹K#» K )K#K(¹K#» K )#º K , N N N Tr denoting the partial trace over the Hilbert space of the second particle, or as an equation for the density matrix oJ K: )])\ exp(!b(¹K#» )) , (3.1.35) oJ K"(Tr exp[!b(¹K#» K #º K #º N MJ N MJ "Tr (º(KoJ K)). Eq. (3.1.34) are identical with those one would obtain by minimizwhere º MJ ing the free energy of n Boltzmann particles F[HLK]"(n!)\(Tr(HLKoL)#b\ Tr (oLln oL)) N N over product density matrices oL"o K o K 2o K .
As for Eq. (3.1.35), it was derived by Fannes et al. (1980) for an infinite system of distinguishable particles, using correlation inequalities. They proved that the free energy density of an infinite system of such particles with a 2-body interaction expressed in terms of a bounded operator $U$ is equal to the free energy density of an infinite system of non-interacting particles in a state which is locally defined by a density matrix of the form $\rho^{n} = \rho_\Lambda\otimes\rho_\Lambda\otimes\cdots\otimes\rho_\Lambda$ with $\rho_\Lambda$ satisfying Eq. (3.1.35).

To conclude this section, let us note that Theorem 3.3 allows one to set up an approximation scheme for the limiting free energy density of a gas of Boltzmann particles with a 2-body interaction given by a bounded symmetric function $U(x,y)$, square-integrable on $R^{\nu}\times R^{\nu}$ with respect to the Lebesgue measure $\mu\otimes\mu$. The construction of such a scheme exploits the fact that such a $U(x,y)$ can be approximated arbitrarily closely in the norm of $L^{2}(R^{\nu}\times R^{\nu})$ by sums
\[ S_j(x,y) = \sum_{\alpha=1}^{j}\lambda_\alpha\,\varphi_\alpha(x)\varphi_\alpha(y) , \]
where $\lambda_1\ge\lambda_2\ge\lambda_3\ge\cdots$ are the eigenvalues of the integral operator on $L^{2}(R^{\nu})$ with the kernel $U(x,y)$ and $\varphi_\alpha$ the corresponding eigenfunctions. As a consequence,
\[ \lim_{j\to\infty}\ \mu\text{-ess}\sup_{x,y\in R^{\nu}} |U(x,y)-S_j(x,y)| = 0 . \]
Since
\[ |\ln \mathrm{Tr}\, e^{A} - \ln \mathrm{Tr}\, e^{B}| \le \|A-B\| \tag{3.1.36} \]
for any self-adjoint finite-dimensional operators A, B (Ruelle, 1969), therefore, for any finitedimensional space ML¸(K)L, n\"ln TrM exp[!b(¹LK#»L K #n\ºL K )]!ln TrM exp[!b(¹LK#»L K #n\SLK)]" N N H n k-ess sup "º(x, y)!S (x, y)" , (3.1.37) 4n\b#ºL K !SLK#M4bn\ H H 2 V WZ0J º(x , x ) and S (x , x ), where ºL K , SLK are multiplication operators in ¸(K)L by H GHXL G H GHXL H G H respectively, and # #M denotes the operator norm of linear operators on ML¸(K)L. Taking a sequence of subspaces +M , such that M LM and M "¸(K)L, one obtains from G G G> G G Eq. (3.1.37).
n k-ess sup "º(x, y)!S (x, y)" . " f (HLK, b)!f (¹LK#»L K #n\SLK, b)"4n\ N H H N 2 VWZ0J Let hLK (m*, g*) denote the operator on ¸(K)L defined in Theorem 3.3 and satisfying N H lim f (¹LK#»L f (hLK (m*, g*), b) . K #SLK, b)" lim H N N H K K L L The triangle inequality, together with Eqs. (3.1.37) and (3.1.38), then yields lim " f (HLK, b)!f (hLK (m*, g*)b)"4k-ess sup "º(x, y)!S (x, y)" N H H N LK VWZ0J
(3.1.38)
(3.1.39)
(3.1.40)
which shows that lim f (HLK, b) can be approximated arbitrarily closely by limiting free energy N densities of solvable systems hLK (m*, g*(m*)). This result indicates that, since the r.h.s. of Eq. (3.1.40) N H can be made arbitrarily small by choosing j sufficiently large, a perturbation expansion of f (HLK, b) N in powers of º!S , with hLK treated as the unperturbed Hamiltonian, should be rapidly H N H convergent for such j. Perturbation theories for the free energy density of many-particle systems have been often exploited, e.g. Negele and Orland (1988). 3.2. Example — a system of coupled oscillators on a sphere in RJ Consider a quantum system of n identical point particles in RJ confined to a sphere SJ\, with P radius r, and interacting via the potential º(x, y)"c(2r)\"x!y", c'0, x, y3SJ\ , (3.2.1) P where " " denotes the Euclidean distance in RJ. In order to define the Hamiltonian of such a system, let us denote by SJ\ the range of spherical coordinates h for the sphere SJ\ and by J (h, r) the P J Jacobian of the transformation expressing Cartesian coordinates x in terms of h: x "r (h), h"(h ,2, h ), a"1,2, l . ? ? J\ The Hilbert space of the system is HL"¸(SJ\, J (h, r) dh)L and the Hamiltonian has the form P J HL"¹L#n\ º(x (h ), x (h )) , (3.2.2) P P G G H H GH where ¹"!r\DI #RJ is the single-particle kinetic energy operator, DI denoting the P J P J Friedrichs self-adjoint extension of the spherical Laplacian j j j (sin h )J\ #(sin h )D , D " D "(sin h )\J J\ J\ J\ jh J J\ jh jh J\ J\ and RJ the scalar curvature of SJ\ (Lightman et al., 1975): RJ"(l!1)(l!2)r\. Since lim RJ"0 P P P P as rPR, the term nRJ can be dropped in the calculation of the thermodynamic limit of f (HL, b). P P In spherical coordinates the interaction º(x (h ), x (h )) takes the form G G H H J (3.2.3) º(h , h )"c 1! (h ) (h ) . ? G ? H G H ? 
According to Section 2.2 the construction of the Wiener measure on the trajectory space +u : [0, b]PSJ\,"X can be carried out in the same manner as that of the measure kNK@ on VW P P@ XK . The Feynman—Kac theorem in ¸(SJ\, J (h, r) dh)L also remains therefore valid. By applying @ J the same technique as in Sections 3.1.1, 3.1.2 and 3.1.3, one can thus prove that in the limit
n"SJ\"\Pd , P the following asymptotic equality holds: nPR, rPR,
(3.2.4)
lim f (HL, b)" lim f (hL, b) , P P LP LP
(3.2.5)
where hL"nCI KL hL and h is the solution of the equation P P P (3.2.6) h"Tr (HM (o ))!Tr(HM (o o )) P P P P P P with o "(Tr exp(!bh))\ exp(!bh), which minimizes f (hL, b). h in Eq. (3.2.6) has the P P P P P form h"!r\DI ! J m #c\o, o" m with (m ,2,m )"m denoting the minim? ? J J ? ? ?P J J P izing solution of the equations (3.2.7) m "c1 2 P, a"1,2, l . ? ?P F Given the integral kernel o (h, h) of exp(!bh)"o , Eq. (3.2.7) could be written down explicitly. P P P However, derivation of o (h, h) would require knowledge of the eigenstructure of h. The difficulty P P can be circumvented by making use of the following property of the Wiener measure k@ defined by VW the infinite-volume Laplacian D: Lemma 3.4. ¸et X denote the space of paths in RQ J (one-point compactification of RJ) parametrized @ in the interval [0, b]: X "+u : [0, b]PRQ J,. ¸et u be a continuous function on X : u3C(X ). @ @ @ If x,y3SJ\, then P
u (u) dkP@ (u) , lim (2pb) u(u) dk@ (u)" lim VW P FVFW X X @ P@ P P where u is the restriction of u to X : u "u"XP@3C(X ). P P@ P P@ The proof of this lemma can be found in Appendix A. The property of the measure k@ stated in VW the lemma allows to replace o (h, h) in Eq. (3.2.7) by the integral kernel R (x, x) of the operator P P (2pb) exp (!bh)"(2pb) exp [!b(!D#o r\(x!ro\m)!o #m )] J J J with m "c#c\o, defined on ¸(RJ), the range of variables of the kernel R (x, x) being J P restricted to SJ\. This can be seen from the expression for R (x, y) in terms of k@ : P P V W @ R (x, y)"(2pb) e@MJ\K dk@ (u) exp ! o (2r)\(u(s)!o\rm) ds . P VW J J X @ After passing to the limit n, rPR, both kernels o (h, h) and R (x(h), x(h)) yield, according to P P Lemma 3.4, the same limiting form of Eq. (3.2.7). The explicit form of R (x, y) is well known P (Feynman and Hibbs, 1965; Pruski and Mac´kowiak, 1971):
a J e@MJ\K exp[!a((x!o\rm)#(y!o\rm))coth X R (x, y)"(2pb)(2p)\J J J P sinh X #a(x!o\rm)(y!o\rm) (sinh X)\] , J J where a"o r\, X"ba. Using the identity J J\ exp m (h) J (h, 1) dh "(2p)Jo\J>I (o ) , ? ? J A J J\ J J\ 1 ? A
one obtains
a J exp [!bm #bo R (x(h), x(h)) dh "(2pb) J P A sinh X J\ 1 A !2ar tanh (X)](2ra tanh (X))\JI (2ra tanh (X)) . J\
(3.2.8)
Hence, (bo ) lim Tr exp (!bh)"(2pb)b\J e\@K(bo )\JI J J\ J P P and
a J exp [b(o !m )!2ar tanh X] lim Tr ( exp (!bh))"lim (2pb) J ? P sinh X P P j ; (2ra tanh X)\JI (2ar tanh X) J\ jm ? "(2pb)b\J e\@K(bo )\JI (bo )m o\ . J J J ? J Eqs. (3.2.7) thus reduce to the following equation for o : J I (bo ) o "c J J (3.2.9) J I (bo ) J\ J which coincides with the asymptotic form of the equation for the mean field of a classical system of n particles on SJ\ with the interaction (3.2.1) (cf. Mac´kowiak, 1982). P Calculation of the limit (3.2.5) is now easy. One obtains
l 1 d(2pb)\J#1! lb#ln max K (b, o) , lim f (HL,b)"!b\ ln C J J P 2 2 MJZ0, LP where
(3.2.10)
K (b, o)"(bo )\JI (bo )exp (!c\bo) . J J J J J\ J The condition for the maximum in Eq. (3.2.10) coincides with Eq. (3.2.9) and the limit (3.2.10) has the same form as the corresponding quantity for a classical system of n particles on SJ\ with the P interaction (3.2.1). The quantum system with the Hamiltonian HL thus behaves in the thermodynP amic limit exactly in the same manner as its classical prototype (Mac´kowiak, 1982). Such behaviour of large quantum systems has been demonstrated on many other occasions, e.g. Hepp and Lieb (1973) and Fuller and Lenard (1979).
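The mean-field equation (3.2.9) can be explored numerically. The sketch below assumes that it has the form rho = c I_{nu/2}(beta rho) / I_{nu/2-1}(beta rho), where nu denotes the dimension of the embedding space and I is the modified Bessel function; the Bessel orders are not legible in this copy of the text and are taken from the standard result for classical mean-field models on a sphere, so they should be treated as an assumption. The function names, the plain fixed-point iteration and the series evaluation of the Bessel ratio are illustrative choices, not the author's procedure.

```python
import math

def bessel_ratio(order, x, tol=1e-16, max_terms=10000):
    """I_{order+1}(x) / I_{order}(x) for x >= 0 and order > -1, from the power series."""
    q = 0.25 * x * x

    def scaled_series(mu):
        # sum_k q**k / (k! * Gamma(k + mu + 1)), which equals I_mu(x) / (x/2)**mu
        term = 1.0 / math.gamma(mu + 1.0)
        total = term
        for k in range(max_terms):
            term *= q / ((k + 1) * (k + mu + 1.0))
            total += term
            if term < tol * total:
                break
        return total

    return 0.5 * x * scaled_series(order + 1.0) / scaled_series(order)

def mean_field(beta, c, nu, iterations=2000):
    """Iterate rho <- c * I_{nu/2}(beta*rho) / I_{nu/2-1}(beta*rho), starting from rho = c."""
    rho = c
    for _ in range(iterations):
        rho = c * bessel_ratio(0.5 * nu - 1.0, beta * rho)
    return rho

if __name__ == "__main__":
    for beta in (2.0, 6.0):
        print(beta, mean_field(beta, c=1.0, nu=3))
```

Under the assumed form, the ratio behaves like x/nu for small x, so a nonzero solution is expected only for beta c > nu; for nu = 3 the ratio reduces to the Langevin function coth x - 1/x, which gives an independent check of the series.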
4. Thermodynamic limit of free energy density for continuous n-fermion systems

4.1. Contractions and expansions of p-particle fermion operators

Suppose we are dealing with a system of n identical fermions. If $H$ denotes the Hilbert space of a single fermion, the Hilbert space of n such fermions is the n-fold Grassmann product $H^{\wedge n}$ of $H$, viz.,
\[ H^{\wedge n} = A_n (H\otimes H\otimes\cdots\otimes H) A_n = A_n H^{\otimes n} A_n , \]
where $A_n$ denotes the projector on the subspace of antisymmetric vectors in $H^{\otimes n}$. $A_n$ acts on product vectors in the following manner:
\[ A_n\,\varphi_1\otimes\cdots\otimes\varphi_n = (n!)^{-1}\sum_{P_n}\operatorname{sgn}(P_n)\,\varphi_{P_n 1}\otimes\cdots\otimes\varphi_{P_n n} , \]
summation running over all permutations of n elements. Obviously, $H^{\wedge n}\neq\{0\}$ if and only if $\dim H\ge n$. The p-particle observables of an n-fermion system are represented by self-adjoint operators in $H^{\wedge n}$ of the form
\[ \Gamma^{n}_{p}B^{p} := A_n (B^{p}\otimes 1^{n-p}) A_n , \]
where $1^{q}$ denotes the identity operator in $H^{\otimes q}$. The mapping $\Gamma^{n}_{p}$ was introduced by Kummer (1967) and called the (p, n)-expansion. The set of states $S^{n}$ of an n-fermion system consists of all density matrices $\rho^{n}$ on $H^{\wedge n}$:
\[ S^{n} = \{\rho^{n}:\ \rho^{n}\ge 0,\ \mathrm{Tr}\,\rho^{n}=1,\ \rho^{n}=A_n\rho^{n}A_n\} . \]
As shown by Kummer (1967), for bounded operators $B^{p}$, the mapping $\Gamma^{n}_{p}$ is the adjoint of another mapping, the so-called (n, p)-contraction $L^{p}_{n}: T^{n}\to T^{p}$, which maps n-particle self-adjoint trace class operators into p-particle self-adjoint trace class operators:
\[ \mathrm{Tr}(\rho^{n}\,\Gamma^{n}_{p}B^{p}) = \mathrm{Tr}(B^{p}L^{p}_{n}\rho^{n}) . \tag{4.1.1} \]
One easily verifies that $L^{p}_{n}\rho^{n}$ is the partial trace of $\rho^{n}$ over the Hilbert space of n-p fermions: if $\rho^{n}(p+1,\dots,n;(p+1)',\dots,n')$ represents the kernel of $\rho^{n}$ with respect to n-p fermion variables, then
\[ \mathrm{Tr}(\rho^{n}\,\Gamma^{n}_{p}B^{p}) = \mathrm{Tr}(\rho^{n}(B^{p}\otimes 1^{n-p})) \]
\[ = \mathrm{Tr}_{H^{\wedge p}}\int d(p+1)\,d(p+1)'\cdots dn\,dn'\ \rho^{n}(p+1,\dots,n;(p+1)',\dots,n')\,B^{p}\,\delta(p+1,(p+1)')\cdots\delta(n,n') \]
\[ = \mathrm{Tr}_{H^{\wedge p}}(B^{p}\,\mathrm{Tr}_{H^{\wedge (n-p)}}\rho^{n}) = \mathrm{Tr}_{H^{\wedge p}}(B^{p}L^{p}_{n}\rho^{n}) . \]
The (n, p)-contraction has another important property:

Lemma 4.1. $L^{p}_{n}$ is order preserving: if $\rho^{n}\ge 0$, then $L^{p}_{n}\rho^{n}\ge 0$.

Proof. Let $\psi\in H^{\wedge p}$ and let $P^{p}_{\psi}$ denote the projector on $\psi$. Then
\[ \mathrm{Tr}(L^{p}_{n}\rho^{n}P^{p}_{\psi}) = \mathrm{Tr}(\rho^{n}\,\Gamma^{n}_{p}P^{p}_{\psi}) = \mathrm{Tr}(\rho^{n}(1^{n-p}\otimes P^{p}_{\psi})) \ge 0 \]
since $1^{n-p}\otimes P^{p}_{\psi}$ is a projector on a subspace of $H^{\otimes n}$ where $\rho^{n}$ is nonnegative. QED
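The duality (4.1.1) between expansion and contraction, and the order-preserving property of Lemma 4.1, can be checked on a small example. The sketch below takes n = 2 fermions, p = 1 and a three-dimensional single-particle space; these sizes, the random test matrices and all function names are assumptions chosen for illustration. The contraction is implemented as the plain partial trace over the second particle, as described in the text.

```python
import random

D = 3  # dimension of the single-fermion space (illustrative choice)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def kron(a, b):
    """Kronecker product; the product vector |i>|j> has index i*len(b) + j."""
    nb = len(b)
    n = len(a) * nb
    return [[a[r // nb][c // nb] * b[r % nb][c % nb] for c in range(n)] for r in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def antisymmetrizer():
    """A_2 = (1 - SWAP)/2 on the two-particle space."""
    n = D * D
    a = [[0.0] * n for _ in range(n)]
    for i in range(D):
        for j in range(D):
            a[i * D + j][i * D + j] += 0.5
            a[i * D + j][j * D + i] -= 0.5
    return a

def expansion(b):
    """(1,2)-expansion of a one-particle operator: A_2 (B x 1) A_2."""
    a = antisymmetrizer()
    return matmul(a, matmul(kron(b, identity(D)), a))

def contraction(rho2):
    """(2,1)-contraction: partial trace over the second particle."""
    return [[sum(rho2[i * D + j][k * D + j] for j in range(D)) for k in range(D)]
            for i in range(D)]

def random_fermion_state(rng):
    """A density matrix supported on the antisymmetric subspace."""
    n = D * D
    g = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    gt = [list(col) for col in zip(*g)]
    a = antisymmetrizer()
    rho = matmul(a, matmul(matmul(g, gt), a))  # A G G^T A is positive semidefinite
    t = trace(rho)
    return [[x / t for x in row] for row in rho]

def random_observable(rng):
    b = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(D)]
    return [[0.5 * (b[i][j] + b[j][i]) for j in range(D)] for i in range(D)]

if __name__ == "__main__":
    rng = random.Random(0)
    rho, b = random_fermion_state(rng), random_observable(rng)
    print(trace(matmul(rho, expansion(b))), trace(matmul(b, contraction(rho))))
```

The two printed traces should agree to rounding error, and quadratic forms of the contracted matrix should be nonnegative, in line with Lemma 4.1.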
4.2. Asymptotic equality of the free energy density of noninteracting Fermi gases in the canonical and grand canonical ensemble It is generally accepted that thermodynamic functions corresponding to different equilibrium ensembles are equal up to additive terms which vanish in the thermodynamic limit. The asymptotic equality of such functions for various ensembles and models has been demonstrated on many occasions, e.g. by Glimm and Jaffe (1981), Thirring (1982), Baumgartner et al. (1983) and Lewis et al. (1994a,b). Here we shall prove that the expressions for the free energy density of arbitrary noninteracting Fermi gases in the canonical and grand canonical ensemble are equal in the thermodynamic limit. The proof consists in deriving an equation for the free energy density of n fermions and finding its asymptotic solution for large n (Mac´kowiak, 1988). 4.2.1. An equation for the free energy density f (nCLKh K ,b) of n noninteracting fermions Suppose we are dealing with a system of n identical noninteracting point fermions in a bounded region KLRJ. Let HK denote the Hilbert space of a single fermion and HKKL that of n fermions. Let : !DI K#» I K is the self-adjoint extension h K " K be the Hamiltonian of a single fermion, where D a multiplication of the Laplacian in ¸(K)"HK under given boundary conditions and » K N operator in HK by a function » : RJPR, semibounded from below. Suppose the n-fermion system is in the Gibbs canonical ensemble. Then its state is given by with the Hamiltonian hL K "nCL h K the density matrix exp(!bnCL h )"f\ oL, b'0 , f\ K LK K LK where oK"exp(!bh K ), oK L denotes the nth Grassmann power of oK: : AL(oKoK2oK)AL"ALoBL o K L" K AL and f
L
"Tro K L. oK is a trace class operator on HK:
Tr oK4exp(!b inf »(x))TrK exp(bDI K)(R (4.2.1) J VZ0 (lim K "K"\ TrK exp(bDI K)(R), and therefore bounded (Bratteli and Robinson, 1981): #oK#4exp(!b inf »(x)#b sup e) , VZ0J CZ1N DI K lim #oK#4exp(!b inf »(x)) . VZ0J The conditions under which the limit
(4.2.2) (4.2.3)
K
f (h, b) " : !lim (nb)\ ln f K" lim f (hL, b) (4.2.4) L LK LK exists, with lim n"K"\"d, are well known (Ruelle, 1969; Lieb, 1966; Ginibre, 1965; Bratteli and Robinson, 1981). Ruelle proved that f (h, b) is finite for any tempered and stable potential »L K , provided "K"PR in the sense of Fisher and d does not exceed the “close packing density” d .
Let us assume that these conditions are satisfied and the limit (4.2.4) exists.
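Direct calculation of the limit (4.2.4) is difficult. The grand canonical ensemble approach is more accessible and yields the following expression for $f(h,\beta)$:
\[ f(h,\beta) = \beta^{-1}\lim_{n,\Lambda}\ln z_{n\Lambda} - \lim_{n,\Lambda}(n\beta)^{-1}\,\mathrm{Tr}\ln(1_\Lambda + z_{n\Lambda}\rho_\Lambda) , \tag{4.2.5} \]
where the fugacity $z_{n\Lambda}$ is given implicitly as the unique solution of the equation
\[ \mathrm{Tr}(z_{n\Lambda}\rho_\Lambda(1_\Lambda + z_{n\Lambda}\rho_\Lambda)^{-1}) = n , \tag{4.2.6} \]
$1_\Lambda$ denoting the identity operator in $H_\Lambda$. We shall prove that Eqs. (4.2.4) and (4.2.5) express asymptotically the same quantity in the limit (3.1.5). To this end let us first calculate $L^{1}_{n}\rho_\Lambda^{\wedge n}$. The integral kernel of $L^{1}_{n}\rho_\Lambda^{\wedge n}$, in terms of the kernel $\rho_\Lambda(x,x')$ of $\rho_\Lambda$, can be written in the form
\[ (L^{1}_{n}\rho_\Lambda^{\wedge n})(x_1,x_1') = (n!)^{-1}\int_{\Lambda^{n-1}} \det\begin{pmatrix} \rho_\Lambda(x_1,x_1') & \rho_\Lambda(x_1,x_2) & \rho_\Lambda(x_1,x_3) & \cdots & \rho_\Lambda(x_1,x_n) \\ \rho_\Lambda(x_2,x_1') & \rho_\Lambda(x_2,x_2) & \rho_\Lambda(x_2,x_3) & \cdots & \rho_\Lambda(x_2,x_n) \\ \vdots & \vdots & \vdots & & \vdots \\ \rho_\Lambda(x_n,x_1') & \rho_\Lambda(x_n,x_2) & \rho_\Lambda(x_n,x_3) & \cdots & \rho_\Lambda(x_n,x_n) \end{pmatrix} dx_2\cdots dx_n . \tag{4.2.7} \]
Performing a Laplace expansion of the determinant in the integrand with respect to the first row, one obtains
\[ (L^{1}_{n}\rho_\Lambda^{\wedge n})(x_1,x_1') = f_{n-1\,\Lambda}\,n^{-1}\rho_\Lambda(x_1,x_1') - n^{-1}(n-1)\int_{\Lambda}\rho_\Lambda(x_1,x_2)\,(L^{1}_{n-1}\rho_\Lambda^{\wedge (n-1)})(x_2,x_1')\,dx_2 , \tag{4.2.8} \]
which, in terms of $L^{1}_{n}\rho_\Lambda^{\wedge n}$, $L^{1}_{n-1}\rho_\Lambda^{\wedge (n-1)}$, $\rho_\Lambda$ and their products, can be rewritten as
\[ n L^{1}_{n}\rho_\Lambda^{\wedge n} = f_{n-1\,\Lambda}\,\rho_\Lambda - (n-1)\,\rho_\Lambda L^{1}_{n-1}\rho_\Lambda^{\wedge (n-1)} . \tag{4.2.9} \]
Repeated application of this formula to the last term on the r.h.s. of Eq. (4.2.9) yields
\[ n L^{1}_{n}\rho_\Lambda^{\wedge n} = -\sum_{k=1}^{n} f_{n-k\,\Lambda}\,(-\rho_\Lambda)^{k} , \tag{4.2.10} \]
where $f_{0\Lambda}=1$. Next we shall prove by induction that the Fréchet derivative of $f_{n\Lambda}$ with respect to $\rho_\Lambda$ equals
\[ \frac{d f_{n\Lambda}}{d\rho_\Lambda} = n\,\rho_\Lambda^{-1}L^{1}_{n}\rho_\Lambda^{\wedge n}, \qquad n = 1,2,\dots . \tag{4.2.11} \]
For n = 1, formula (4.2.11) takes the form
\[ \frac{d}{d\rho_\Lambda}\mathrm{Tr}\,\rho_\Lambda = \rho_\Lambda^{-1}L^{1}_{1}\rho_\Lambda = \rho_\Lambda^{-1}\rho_\Lambda = 1_\Lambda . \tag{4.2.12} \]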
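Let us verify it for n = m. Taking the trace of both sides of Eq. (4.2.10) with n = m, we get
\[ m f_{m\Lambda} = -\sum_{k=1}^{m}(-1)^{k} f_{m-k\,\Lambda}\,\mathrm{Tr}\,\rho_\Lambda^{k} . \tag{4.2.13} \]
Next, equating the derivatives of both sides of Eq. (4.2.13) and making use of Eq. (4.2.11) for n = 1, ..., m-1 and of Eq. (4.2.10), one obtains
\[ m\frac{d f_{m\Lambda}}{d\rho_\Lambda} = \sum_{k=1}^{m}(-1)^{k+1}\left[\frac{d f_{m-k\,\Lambda}}{d\rho_\Lambda}\,\mathrm{Tr}\,\rho_\Lambda^{k} + f_{m-k\,\Lambda}\frac{d\,\mathrm{Tr}\,\rho_\Lambda^{k}}{d\rho_\Lambda}\right] \]
\[ = \rho_\Lambda^{-1}\sum_{k=1}^{m}(-1)^{k+1}\left[(m-k)\,L^{1}_{m-k}\rho_\Lambda^{\wedge (m-k)}\,\mathrm{Tr}\,\rho_\Lambda^{k} + f_{m-k\,\Lambda}\,k\,\rho_\Lambda^{k}\right] \]
\[ = \rho_\Lambda^{-1}\sum_{k=1}^{m}(-1)^{k+1}\left[\sum_{j=1}^{m-k}(-1)^{j+1} f_{m-k-j\,\Lambda}\,\rho_\Lambda^{j}\,\mathrm{Tr}\,\rho_\Lambda^{k} + f_{m-k\,\Lambda}\,k\,\rho_\Lambda^{k}\right] \]
\[ = \rho_\Lambda^{-1}\sum_{j=1}^{m}(-1)^{j+1}\rho_\Lambda^{j}\sum_{k=1}^{m-j}(-1)^{k+1} f_{m-k-j\,\Lambda}\,\mathrm{Tr}\,\rho_\Lambda^{k} + \rho_\Lambda^{-1}\sum_{k=1}^{m}(-1)^{k+1} f_{m-k\,\Lambda}\,k\,\rho_\Lambda^{k} . \tag{4.2.14} \]
Making use of Eq. (4.2.13), we can simplify the first sum after the last equality in Eq. (4.2.14):
\[ m\frac{d f_{m\Lambda}}{d\rho_\Lambda} = \sum_{j=1}^{m}(m-j)\,f_{m-j\,\Lambda}\,(-\rho_\Lambda)^{j-1} + \sum_{k=1}^{m} f_{m-k\,\Lambda}\,k\,(-\rho_\Lambda)^{k-1} = m\sum_{k=1}^{m} f_{m-k\,\Lambda}\,(-\rho_\Lambda)^{k-1} = m^{2}\rho_\Lambda^{-1}L^{1}_{m}\rho_\Lambda^{\wedge m} , \tag{4.2.15} \]
which proves Eq. (4.2.11) for n = m. Formula (4.2.11) can be written equivalently in the form
\[ \frac{d}{d\rho_\Lambda}\,n^{-1}\ln f_{n\Lambda} = \rho_\Lambda^{-1} f_{n\Lambda}^{-1} L^{1}_{n}\rho_\Lambda^{\wedge n} . \tag{4.2.16} \]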
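Recursion (4.2.13) determines the canonical sums f_n = Tr rho^{wedge n} from the power traces Tr rho^k. The following sketch evaluates it for a diagonal rho and compares the result with a brute-force sum over all occupations of n distinct levels. The eigenvalue list and the function names are hypothetical, chosen only to illustrate the formula; the brute-force check is practical only for very small level numbers.

```python
import itertools
import math

def partition_functions(eigs, n_max):
    """f_0..f_{n_max} from m*f_m = sum_{k=1}^{m} (-1)**(k+1) * f_{m-k} * Tr rho^k, f_0 = 1."""
    power_traces = [sum(e ** k for e in eigs) for k in range(n_max + 1)]
    f = [1.0]
    for m in range(1, n_max + 1):
        total = sum((-1) ** (k + 1) * f[m - k] * power_traces[k] for k in range(1, m + 1))
        f.append(total / m)
    return f

def brute_force(eigs, n):
    """Trace of the n-th Grassmann power: sum over n distinct levels of the product of eigenvalues."""
    return sum(math.prod(combo) for combo in itertools.combinations(eigs, n))

if __name__ == "__main__":
    # Hypothetical single-particle spectrum: eigenvalues of rho = exp(-beta*h).
    eigs = [math.exp(-0.3 * i) for i in range(6)]
    for n, value in enumerate(partition_functions(eigs, 6)):
        print(n, value, brute_force(eigs, n))
```

Because the terms alternate in sign, the recursion loses accuracy for large n; it is used here only to illustrate (4.2.13) on a small spectrum.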
Eq. (4.2.16) can be viewed as a differential equation for $n^{-1}\ln f_{n\Lambda}$. The asymptotic form of this functional for large n, $|\Lambda|$ can thus be found by solving Eq. (4.2.16) with the r.h.s. replaced by its asymptotic expression.

4.2.2. Asymptotic solution of the equation for $f(n\Gamma^{n}_{1}h_\Lambda,\beta)$

The asymptotic form of $f_{n\Lambda}^{-1}L^{1}_{n}\rho_\Lambda^{\wedge n}$ is expressed by the following lemma.

Lemma 4.2. For any sequence of bounded operators $B_\Lambda$, each defined on $H_\Lambda$ and such that $\lim\|B_\Lambda\|<\infty$ and $\lim f_{n\Lambda}^{-1}\,\mathrm{Tr}(B_\Lambda L^{1}_{n}\rho_\Lambda^{\wedge n})$ (as n, $|\Lambda|\to\infty$) exists, the following equality holds:
\[ \lim_{n,\Lambda} f_{n\Lambda}^{-1}\,\mathrm{Tr}(B_\Lambda L^{1}_{n}\rho_\Lambda^{\wedge n}) = \lim_{n,\Lambda} n^{-1}\,\mathrm{Tr}(B_\Lambda\rho_\Lambda z_{n\Lambda}(1_\Lambda + z_{n\Lambda}\rho_\Lambda)^{-1}) , \tag{4.2.17} \]
where $z_{n\Lambda}$ is the unique solution of Eq. (4.2.6).
Proof. By the assumed existence of the limit (4.2.4), for large n, "K", f K has the asymptotic form L f K"exp(!bnf (b, d)!b"K"f (b, d)#g(n)), L
lim g(n)n\"0 , L : f\ Kf K where f (b, d)#d\f (b, d)"f (h, b). Thus for s K " L> L L
(4.2.18)
lim s " lim s K"exp(bf (b, d)) . L\K L LK LK : f\ ¸oKL. Then by Eq. (4.2.10) Let q(oK) " LK L L
(4.2.19)
1 L f\ f (!1)I> Tr(B Tr(B K q(oK))" K oI K) LK L\IK L n I implying, in view of Eq. (4.2.19), that
(4.2.20)
Tr(B (4.2.21) lim Tr(B K q(oK))" lim K q (oK)) . L L> K K L L An asymptotic equality of the form (4.2.21) will be subsequently written as q(oK)+q (oK). Thus L> L according to Eqs. (4.2.9) and (4.2.21), q(oK)+(n#1)\s KoK!s KoKq(oK)+q (oK) . L L L L> L Hence, (K#s KoK)q(oK)+(n#1)\s KoK . L L L
(4.2.22)
Obviously, K#s KoK'0 and #(K#s KoK)\#(1, which implies L L 0( lim #(K#s KoK)\#41 . L LK Eq. (4.2.22) can be therefore rewritten in the form q (oK)+(n#1)\s KoK(K#s KoK)\ . L L L
(4.2.23)
Since Tr q(oK)"1, the asymptotic equality (4.2.23) can be extended L q (oK)+(n#1)\s KoK(K#s KoK)\+n\z KoK(K#z KoK)\ , L L L L L
(4.2.24)
where $z_{n\Lambda}$ is the solution of Eq. (4.2.6). Uniqueness of this solution can be inferred from the properties of the function
\[ T(z) = \mathrm{Tr}(z\rho_\Lambda(1_\Lambda + z\rho_\Lambda)^{-1}) . \]
These are: (1) $T(0)=0$, (2) $\lim_{z\to\infty}T(z)=\infty$, (3) $T(z)$ is monotonically increasing. QED
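The proof above uses two facts that are easy to probe numerically for a diagonal rho: the equation T(z) = n has a unique root because T is monotone, and the ratio s_n = f_n / f_{n+1} of successive canonical sums approaches that root for large n. The sketch below is a hedged illustration under assumed inputs: the evenly spaced spectrum, the particle number and the function names are hypothetical. The sums f_n are obtained by expanding the product of the factors (1 + z*lambda_i) level by level rather than by recursion (4.2.13), since the alternating recursion is numerically delicate for large n.

```python
import math

def partition_sums(eigs, n_max):
    """f_k = Tr rho^{wedge k}, k = 0..n_max, as coefficients of prod_i (1 + z*lambda_i)."""
    f = [1.0] + [0.0] * n_max
    for lam in eigs:
        # Descending order in k keeps f[k-1] at its value before this level was added.
        for k in range(n_max, 0, -1):
            f[k] += lam * f[k - 1]
    return f

def occupation(z, eigs):
    """T(z) = Tr(z*rho*(1 + z*rho)^{-1}) for diagonal rho."""
    return sum(z * lam / (1.0 + z * lam) for lam in eigs)

def fugacity(n, eigs, iterations=200):
    """Solve T(z) = n by bisection in ln z; assumes 0 < n < len(eigs)."""
    lo, hi = -50.0, 50.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if occupation(math.exp(mid), eigs) < n:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))

if __name__ == "__main__":
    # Hypothetical spectrum: 1000 levels with spacing 0.01 at beta = 1.
    eigs = [math.exp(-0.01 * i) for i in range(1000)]
    n = 200
    f = partition_sums(eigs, n + 1)
    z = fugacity(n, eigs)
    print("s_n =", f[n] / f[n + 1], " z_n =", z)
    grand = sum(math.log(1.0 + z * lam) for lam in eigs) - n * math.log(z)
    print("ln f_n / n =", math.log(f[n]) / n, " grand canonical form =", grand / n)
```

The second line of output compares n^{-1} ln f_n with n^{-1} Tr ln(1 + z rho) - ln z; the two are expected to differ by a term that vanishes as n grows.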
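Replacing the r.h.s. of Eq. (4.2.16) by its asymptotic form given by Eq. (4.2.24), we obtain
\[ n^{-1}\frac{d}{d\rho_\Lambda}\ln f_{n\Lambda} \approx n^{-1} z_{n\Lambda}(1_\Lambda + z_{n\Lambda}\rho_\Lambda)^{-1} . \tag{4.2.25} \]
The solution of this equation, up to additive terms which vanish in the thermodynamic limit, is
\[ n^{-1}\ln f_{n\Lambda} = n^{-1}\,\mathrm{Tr}(\ln(1_\Lambda + z_{n\Lambda}\rho_\Lambda)) + n^{-1}\ln g_n(z_{n\Lambda}) , \tag{4.2.26} \]
where $g_n$ is an unknown function of $z_{n\Lambda}$ and n. To determine $g_n$, let us rewrite Eq. (4.2.26) in the form
\[ f_{n\Lambda} = g_n(z_{n\Lambda})\exp(\mathrm{Tr}\ln(1_\Lambda + z_{n\Lambda}\rho_\Lambda)) . \tag{4.2.27} \]
The density matrix $q_n(\rho_\Lambda)$ is obviously invariant under any transformation of $h_\Lambda$ of the form $\tilde h_\Lambda = h_\Lambda + a 1_\Lambda$, $a\in R$. Thus by Eq. (4.2.17)
\[ z_{n\Lambda}(h_\Lambda + a 1_\Lambda) = z_{n\Lambda}(h_\Lambda)\,e^{\beta a} . \tag{4.2.28} \]
Furthermore, since
\[ f_{n\Lambda}(h_\Lambda + a 1_\Lambda) = f_{n\Lambda}(h_\Lambda)\,e^{-n\beta a} , \tag{4.2.29} \]
it follows from Eqs. (4.2.27), (4.2.28) and (4.2.29) that
\[ g_n(z_{n\Lambda}(h_\Lambda + a 1_\Lambda)) = g_n(z_{n\Lambda}(h_\Lambda)e^{\beta a}) = g_n(z_{n\Lambda}(h_\Lambda))\,e^{-n\beta a} , \]
and therefore,
\[ g_n(z_{n\Lambda}) = c_n z_{n\Lambda}^{-n} , \tag{4.2.30} \]
where $c_n$ is a constant. The latter can be determined from the equality
\[ \lim_{n,\Lambda} s_{n\Lambda} = \lim_{n,\Lambda} f_{n+1\,\Lambda}^{-1} f_{n\Lambda} = \lim_{n,\Lambda} z_{n\Lambda} , \tag{4.2.31} \]
which results from Eq. (4.2.24). Substituting into Eq. (4.2.31) $f_{n\Lambda}$ given by Eqs. (4.2.27) and (4.2.30), we get
\[ \lim_{n\to\infty} c_{n+1}^{-1} c_n = 1 . \tag{4.2.32} \]
Furthermore, according to Eq. (4.2.27), if $c_n$ were to give a nonvanishing contribution $\ln c$ to $\lim n^{-1}\ln f_{n\Lambda}$ as $n\to\infty$, it would have to be of the form $c_n = c^{\,n+t(n)}$, where $\lim n^{-1}t(n)=0$. But then Eq. (4.2.32) implies c = 1. Thus, $g_n(z_{n\Lambda}) = z_{n\Lambda}^{-n}$ and for such $g_n$ Eq. (4.2.26) yields the formula
\[ -\lim_{n,\Lambda}(n\beta)^{-1}\ln f_{n\Lambda} = \beta^{-1}\Big(\lim_{n,\Lambda}\big(\ln z_{n\Lambda} - n^{-1}\,\mathrm{Tr}\ln(1_\Lambda + z_{n\Lambda}\rho_\Lambda)\big)\Big) \tag{4.2.33} \]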
which states asymptotic equality of expressions for the free energy density of a noninteracting Fermi gas in the canonical and grand canonical ensemble. The r.h.s. of Eq. (4.2.33) is easier to handle than the l.h.s. In particular, it can be shown (Grandy and Rosa, 1980) that the second term on the r.h.s. can be represented as a contour integral over a function of Tr oK. The representation of the free energy per fermion given by the r.h.s. of Eq. (4.2.33) has also proved useful in mean-field theories of interacting gases (Thirring, 1982; Baumgartner et al., 1983; Mac´kowiak, 1991) and will be exploited in Sections 4.3 and 6. 4.3. Thermodynamic limit of free energy density f (HLK , b) for continuous n-fermion systems N $ with separable interactions Suppose we are dealing with a system of n identical fermions in a bounded region K under the boundary conditions (2.1.1), with a Hamiltonian HLK "ALHLKAL defined in terms of the N N $ potential (3.1.1), (3.1.2). For simplicity, let us assume that the fermions have no internal degrees of freedom. The discussion of system of n fermions each having nonzero spin, but with a 2-body interaction depending only on position coordinates, is closely parallel to the spinless case and requires only minor modification of notation, in particular, introduction of additional traces over the spin space. Systems of n fermions with spin-dependent interactions are considered in Section 6.2. The Hilbert space of our system is the n-fold Grassmann product ¸(K)L of ¸(K): ¸(K)L"¸(K)¸(K)2¸(K)"AL¸(K)LAL and HLK in this space equals N $
n : ALHLKAL"nCL (¹K#» HLK " CL º K )#n\ K . N N $ N 2
(4.3.1)
The quantity f (b, d)" lim f (HLK , b) , (4.3.2) $ N $ L the limit being approached as in Eq. (3.1.5), will be now investigated similarly as f (b, d) in Section 3: a lower and upper bound on f (HLK , b) will be derived and both bounds will be shown to coalesce N $ in the limit (3.1.5). 4.3.1. Lower bound on f (HLK , b) N $ The most convenient representation of the statistical sum Z K "Tr exp(!bHLK ), allowing N $ N $ evaluation of the limit (4.3.2), is provided by the Feynman—Kac theorem: exp(!bHLK )(x ,2, x , y ,2,y ) L L N $ L @ dkNG KP@ (u ) exp ! ¼(u (s),2, u (s)) ds . " (n!)\ (!1)CPL L V W LG G XK b P G L
(4.3.3)
Thus,
@ I L u (u (s)) ds ? G ? G @ J L @ L (4.3.4) !(2n)\ t (u (s)) ds#(2n)\ º(u (s), u (s)) ds A G G G A G G the measure kNK@ being defined as 4$ L @ 1 dx dkNG KP@LG(u ) exp ! »(u (s)) ds kNK@(X ,2, X )" (!1)CPL G 4$ L V G V G n! P KL X G L G on measurable subsets X ;2;X LXKL . For the same reasons as in Section 3.1.1, the last @ L integral in the exponent of the integrand on the r.h.s. of Eq. (4.3.4) will be discarded in further calculations. With this simplification, the integral (4.3.4) will be also denoted by Z K in the sequel. N $ After performing similar rearrangements of Z K as those of Z K in Section 3.1.1 and by N N $ exploiting the growth property of Z K with respect to the integrand in Eq. (4.3.4) (details are N $ given in Appendix B), one obtains Z K " N $
dkNK@(u ,2, u ) exp (2n)\ 4$ L XKL b
bn KI b L K Z K 4 lim dm dkNK@(u ,2,u ) exp N $ I? XKL b 4$ L 2pm m KI 0 G I I? K I kb kb 1 J 1 # g(m ,2,m )!g (m ,2,m )t u ; ! m #m u u I? ? G m II A I II A G m 2 A I 2 I? ? A
,
(4.3.5) where the g ’s are arbitrary, but fixed, functions of m ,2,m . By adding and subtracting the term A I II bn( K m )(2mx)\ with x'1 and m "(m ,2,m ), in the exponent of the integrand, introducing Q Q Q Q QI a bound with respect to m ,2,m , we get, similarly as in Eq. (3.1.11), KI Z K 4 lim max exp[gK (m ,2, m , g ,2, g )](1!x\)\I , $V KI J N $ K +KI?, where
(4.3.6)
gK (m ,2, m , g ,2, g )"nb(2mx)\ m #nb(2m)\ g(m ,2, m ) $V KI J Q A I II Q IA b L K I kb m u u # ln dkNK@(u ,2, u ) exp I? ? 4$ L G m m XKL @ G I ? J kb ! g (m ,2, m )t u . A I II A G m A approaches a functional GK defined on real functions On passing to the limit mPR, gK $V $V m 3¸[0, b], which, contrary to GK , is not necessarily real, since the measure kNK@ is not V ? 4$
non-negative. Thus, GK
"Re GK #i Im GK , $V $V $V
where Re GK (m ,2, m , g ,2, g ) $V I J @ 1 @ "!(2bx)\n m (s) ds # n g(m (s),2, m (s)) ds ? A I 2 ? A L @ dkNK@(u ,2, u ) exp #ln m (s)u (u (s)) ds 4$ L ? ? G XKL b G ? @ ! g (m (s),2, m (s))t (u (s)) ds A I A G A Im GK (m ,2, m , g ,2, g ) $V I J L @ "arg dkNK@(u ,2, u ) exp m (s)u (u (s)) ds 4$ L ? ? G XKL b G ? @ ! g (m (s),2, m (s))t (u (s)) ds . A I A G A Since the integral in Eq. (4.3.7b) is real, Im GK equals zero or p. Thus, $V
Z K 4 max exp[Re GK (m ,2, m , g ,2, g )](1!x\)\I . N $ $V I J K?Z* @
As for Re GK , it is maximized by constant functions m (s): $V ?
(4.3.7a)
(4.3.7b)
(4.3.8)
Lemma 4.3. ¹he functional Re GK assumes absolute maximum on constant functions m (s): $V ? m (s)"m "const. for a"1,2, k . (4.3.9) ? ? Proof. The proof can be carried out in a similar manner as the proof of Lemma 3.1. It suffices to write Re GK in the form $V bn Re GK (m ,2, m , g ,2, g )" lim !nb(2mx)\ m # g(m ,2, m ) $V I J I? A O OI 2m K ? I OA K b (4.3.10) #ln Tr exp ! nCL hI K m H H with
# g (m ,2, m )tK , hI K"¹K#» K ! m u N I? ?K A I II A I ? A where m "m (s), with s3[(k!1)bm\, kbm\), and apply Ho¨lder’s inequality (3.1.13) to the last I? ? term on the r.h.s. of Eq. (4.3.10). QED
Further derivation of the lower bound on f (HLK , b) runs parallel to the discussion in SecN $ tion 3.1 and requires only minor modifications. For completeness, full details are presented below. The constraints (4.3.9) are compatible with the necessary conditions for the maximum of Re GK , $V dRe GK $V"0, dm (t) B
d"1,2, k .
(4.3.11)
To verify this statement, consider the explicit form of Eq. (4.3.11):
!n(bx)\
@
jg A g (m (t), , m (t)) m (s) ds#n B jm (t) A 2 I A B
L jg (m (t), , m (t)) # u (u (t)! A 2 I t (u (t))) B G A G jm (t) B G A
"0, d"1,2, k , $@
(4.3.12)
K
where 1s(u )2K " H $@ X
dkNK@(u ,2, u ) s(u )exp[ (@ m (s)u (u (s)) ds!@ g (m (s),2, m (s))t (u (s)) ds)] 4$ L H ? G I A G G ? ? A A . X L dkNK@(u ,2, u ) exp[ (@ m (s)u (u (s)) ds!@ g (mR,2, m (t))t (u (t)) dt)] K @ 4$ L ? G I A G G ? ? A A
KL@
According to Eq. (2.1.6), for m (s)"m "const., a"1,2, k, Eqs. (4.3.12) simplify to ? ?
with
jg (m , , m ) (Tr exp(!bn CL hI K))\Tr exp(!tnCL hI K) CL uK! A 2 I tK N B A N jm B A jg ;exp[!(b!t)nCL hI K] !x\m # Ag (m ,2, m )"0, d"1,2, k , N B I jm A A B
(4.3.13)
"¹K#» # g (m ,2, m )tK , K ! m u NK N ? ?K A I A ? A because hI
AL(uKK2K#2#K2KuK)AL"nCL uK B B B and, obviously, the dependence on t in Eq. (4.3.13) vanishes due to the cyclicity of operator products under the trace. The assertion of Lemma 4.3 thus agrees with Eq. (4.3.11). The functions g (m ,2, m ) will be now chosen as solutions of the equations A I d Re GK $V "0, j"1,2, l . dg (m ,2, m ) H I
(4.3.14)
For constant functions m (s)"m , a"1,2, k, these equations reduce to ? ? g (m ,2, m )"1CL tK2 CL I p K, j"1,2, l H I H L F
(4.3.15)
and for functions g (m ,2, m ), c"1,2, l, satisfying these equations, Eq. (4.3.13) take the form A I x\m "1CL uK2 CL I p K, d"1,2, k . B B L F
(4.3.16)
As for the functional Re GK , it also simplifies for constant m (s)"m , a"1,2, k, and assumes $V ? ? the form 1 1 Re GK (m ,2, m , g ,2, g )"! nb m# nb g!nbf (nCL hI K(m, g), b) . $V I J ? 2 A N 2x ? A uniquely, as stated by The choice of g (m ,2,m ) made above determines Re GK $V A I
(4.3.17)
Lemma 4.4. For fixed m ,2, m the solution (g*,2, g*) of Eq. (4.3.15) is unique and minimizes J I Re GK . $V Proof. For fixed g ,2, g the Bogoliubov inequality yields J f (nCL hI K(m, g), b)4f (nCL hI K(m, g ), b)# (g !g )1nCL tK2 CL I pK N N A A A L F A whereas expansion of f (nCL hI K(m, g), b) around (g ,2, g ) takes the form J N
(4.3.18)
f (nCL hI K(m, g), b)"f (nCL hI K(m, g ), b)# (g !g )1nCL tK2 CL I p K N N A A A L F K E A jf 1 #2. # (g !g )(g !g ) A A B B jg jg 2 A B EE AB
The matrix (jf/jg jg )(g"g ) is therefore nonpositive. A B As a consequence, for gOg , ?B
j ReGK $V (g !g )(g !g ) A A B B jg jg A B
jf "nb (g !g )!nb (g !g )(g !g )'0 . (4.3.19) A A A A B B jg jg A B EE A AB Given two different solutions +g , and +g ,"+g #e , of Eq. (4.3.15), then by Lagrange’s A A A A theorem, there exists a point +g ,"+g #he ,, with 0(h(1, such that A A A AB
j Re GK $V e e "0 AB jg jg A B
contrary to Eq. (4.3.19). Thus there can be at most one solution of these equations. Since lim Re GK "#R for fixed +m , $V ? E? this solution does exist. QED Let +g*, denote the unique solution of Eq. (4.3.15). Since "g*"4#t # , therefore A A A lim Re GK (m ,2, m , g*,2, g*)"!R $V I J KA and, as a consequence, Re GK (m , 2, m , g*,2, g*) has at least one maximum with respect to $V I J +m ,. Let ? sup Re GK (m ,2, m , g*,2, g*)"Re GK (m*,2, m*) $V I J $V I K?Z* @
1 1 "! nb m*# nb g*(m*,2,m*)#ln Tr exp(!bnCL hI K(m*, g*(m*))) . ? A I N 2x 2 ? A Insertion of this expression into the r.h.s. of Eq. (4.3.8), yields
(4.3.20)
1 f (HLK , b)5f (nCL hI K(m*, g*(m*)), b)# m* N ? N $ 2x ? k 1 ! g*(m*,2, m*)# ln(1!x\) . (4.3.21) A I 2nb 2 A After passing to the limit nPR, xP1 in inequality (4.3.21), we get, by virtue of Eq. (4.2.33), 1 1 lim f (HLK , b)5 lim f (nCL hI K(m*, g*(m*)), b)# m*! g*(m*,2, m*) N $ N ? A I 2 2 L ? A L
" lim min f (nCL hK(m, g*(m)), b)"b\ lim min (ln z K!n\ Tr ln(K#z KoK)) , N L L L +K?, L +K?, (4.3.22) where oK"exp(!bhK(m, g*(m))) and z K is the unique solution of Eq. (4.2.6). By Lemma 4.2 the N L necessary conditions (4.3.16) for the minimum in Eq. (4.3.22) take the asymptotic form (as nPR, xP1) m "n\ Tr (uKz KoK(K#z KoK))\, d"1,2, k L B B L * and Eqs. (4.3.15) for g (m ,2, m ) assume the form A I g "n\ Tr (tKz KoK(K#z KoK)\), c"1,2, l . A A L L
(4.3.23)
(4.3.24)
4.3.2. Upper bound on f (HLK , b) N $ The upper bound on f (HLK, b) can be derived by adapting the method of Section 3.2 to the N antisymmetrized operators HLK and nCL hK. By acting on Eq. (3.1.29) with AL one obtains N $ N 1 I HLK "nCL hK(m, g*)! AL ( LK!nm KL)AL N ? ? N $ 2n ? 1 J 1 # AL (WLK!ng*KL)AL! ALºLKAL, m 3R, a"1,2, k . (4.3.25) A A B ? 2n 2n A The second term on the r.h.s. of Eq. (4.3.25) is irrelevant on applying Bogoliubov’s inequality. The third term equals J 1 AL (WLK!ng*KL)AL A A 2n A 1 L " AL (K2KtK(i)K2K)(K2KctK(j)K2K) A A 2n A GH L !2ng* K2KtK(k)K2K#ng*K2K AL A A A I 1 (4.3.26) " +nCL (t) #n(n!1)CL (tKtK)!2ng*CL tK#ng*AL, . A A A AK A A 2n A By virtue of Lemma C.1 in Appendix C
lim f\ Tr(o tK)) K LCL (t LK AK A L " lim f\ Tr ((tKtK)¸o L)" lim f\ (Tr (t ¸oL)) (4.3.27) LK A A L K LK K AK L K L L Next, using Eqs. (4.3.26), (4.3.27) and (4.3.15) which determine the functions g*, one obtains A 04 lim n\1(WLK!ng*KL)2 CL p K * A A L F KE L Tr ((t) ¸oL)#n(n!1)f\ (Tr (tK¸o L))!nf\ (Tr (t ¸oL)), "lim n\+nf\ AK L K LK A L K LK K AK L K LK L " lim n\f\ +Tr[(t) ¸oL]!f\ (Tr(tK¸o L)),4 lim 2n\#t # "0 . LK AK L K LK A L K A L L Similarly, " lim n\"1ºLK2 CL B L FpKKE* L I J " lim f\ n\ Tr (u) ! (t) ¸o L LK ?K AK L K L ? A 4 lim n\ #u # # #t # "0 . ? A L ? A
(4.3.28)
Bogoliubov’s inequality (3.1.23) applied to HLK decomposed according to Eq. (4.3.25) now N $ yields lim f (HLK b)4 lim f ( nCL hK(m, g*), b) , N N $Y L L where
(4.3.29)
1 1 hK(m, g*)"hI K(m, g*)# mK! g*(m ,2, m )K . N ? A I N 2 2 ? A
(4.3.30)
4.3.3. Asymptotic equality of upper and lower bound on f (HLK , b) N $ By comparing the two asymptotic bounds (4.3.22) and (4.3.29) one immediately arrives at the following. Theorem 4.5. For an n-fermion system with separable 2-body potential (3.1.1), (3.1.2) and hK defined N : exp(!bhK), the following equality holds: by Eq. (4.3.30), oK " N lim f (HLK , b)" lim f (nCL hK(m*, g*(m*)), b) N N $ L L " lim +b\(ln z K!n\Tr ln (K#z KoK)), , (4.3.31) L L L where g*"+g*(m ,2,m ), is the unique solution of Eq. (4.3.24), m*"+m*,2,m*, is the solution of A I I Eq. (4.3.23) which minimizes ln z K!n\ Tr ln (K#z KoK) and z K is the unique solution of L L L Eq. (4.2.6). Eqs. (4.3.23) and (4.3.24) can be written in the compact form X ( q(oK))]!Tr[º K "Tr [º K (q(oK) q(oK))]K , K K L L L where
(4.3.32)
1 X ! g*(m ,2,m )tK! m! g* (m ,2, m ) K K " m u ? ?K ? A I A I A 2 ? A ? A Eq. (4.3.32) differs only by the presence of the additive scalar term on the r.h.s. from the equation which results by minimizing the free energy of a large n-fermion system with the 2-body potential oL (Kossakowski and Mac´kowiak, 1986). (3.1.1), (3.1.2) over product states of the form f\ LK K The mean-field equations (4.3.32) can be written in the form of an equation for the electron density operator n "zoJ K(#ZoJ K)\, oJ K"exp (!bhI K), Tr n "n N
(4.3.33)
hI K"¹K#n\ Tr º (n )#» K . N N K
(4.3.34)
as
In this form they are similar to the mean-field equations of asymptotic Thomas—Fermi theory derived by Narnhofer et al. (1981), Thirring (1982) for the electron density n (x): dp (exp(bh (p, x))#1)\ , (4.3.35) n (x)"2 L (2p) 0 z n (x) I # h (p, x)"p/2m! dx!k , (4.3.36) L $ "x!X " "x!x" 0 I I which determines the limiting free energy density of an n electron system interacting with nuclei positioned at X , k"1,2, M. The constant k in h is determined uniquely by the condition I $ L
n (x) dx"n .
There is a striking similarity between Eq. (4.3.34) and Eq. (4.3.36): the trace average in the former is simply replaced by a phase-space average in the latter. This correspondence suggests that, at least for some types of interactions, and as in Section 3.2, the following asymptotic equality holds:
dp \ L I exp(!bH(x ,2, x , p ,2, p )) "1 , dx lim ln Tr exp(!bHL K ) ln I L L (2p) 0 I K LK where H(x ,2, x , p ,2, p ) is the classical counterpart of the quantum Hamiltonian HL. So far, L L only a bound of the form
dp I exp(!bH(x ,2, x , p ,2, p )) dx Tr exp (!bHL)4 I L L (2p) 0 I 0 has been proved (Simon, 1979, Theorem 9.2).
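5. Thermodynamic limit of free energy density for continuous n-boson systems
5.1. Contractions and expansions of p-particle boson operators
The definitions relating to n-fermion systems introduced in Section 4.1 can be easily reformulated for systems of n identical bosons. If $\mathcal{H}$ denotes the Hilbert space of a single boson, the Hilbert space of n bosons is
$$\mathcal{H}^{\vee n} = S^n(\mathcal{H}\otimes\cdots\otimes\mathcal{H})S^n = S^n\mathcal{H}^{\otimes n}S^n ,$$
where $S^n$ denotes the projector on the subspace of symmetric vectors in $\mathcal{H}^{\otimes n}$. $S^n$ acts on product vectors as follows:
$$S^n\,\varphi_1\otimes\cdots\otimes\varphi_n = \frac{1}{n!}\sum_{P}\varphi_{P1}\otimes\cdots\otimes\varphi_{Pn} ,$$
the sum running over all permutations $P$ of $n$ elements.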
The p-particle observables of an n-boson system are represented by self-adjoint operators in $\mathcal{H}^{\vee n}$ of the form (Kummer, 1967)
$$\Gamma_p^n B^p = S^n(B^p\otimes 1^{n-p})S^n ,$$
and the set of states of such a system consists of all symmetrized density matrices: $S = \{\rho^n : \rho^n \ge 0,\ \mathrm{Tr}\,\rho^n = 1,\ \rho^n = S^n\rho^n S^n\}$. The contraction mapping $L_n^p : T^n \to T^p$ ($T^k$ denoting the set of all self-adjoint trace-class operators acting in $\mathcal{H}^{\vee k}$) is defined analogously as for fermions (Kummer, 1967):
$$\mathrm{Tr}_{\mathcal{H}^{\vee n}}(\rho^n\,\Gamma_p^n B^p) = \mathrm{Tr}_{\mathcal{H}^{\vee p}}(B^p L_n^p\rho^n) . \qquad (5.1.1)$$
As in the case of n-fermion systems, one proves that $L_n^p$ is order-preserving and that $L_n^p\rho^n$ is equal to the partial trace of $\rho^n$ over the Hilbert space of $n-p$ bosons.
5.2. Asymptotic equality of the free energy density of noninteracting Bose gases in the canonical and grand canonical ensemble
The canonical and grand canonical ensembles of an ideal Boson gas are known not to be completely equivalent (Davies, 1972; Ziff et al., 1977; Buffet and Pulé, 1983), although the pressure and free energy density in both ensembles are equal in the thermodynamic limit. The imperfect Bose gas, i.e. one with a repulsive potential, is free of this anomaly (Buffet and Pulé, 1983). The asymptotic equality of free energy per particle for noninteracting Bose gases in the canonical and grand canonical ensemble takes the form
$$\lim_{n,\Lambda} f(n\Gamma_1^n h_\Lambda, \beta) = \beta^{-1}\lim_{n,\Lambda}\big(\ln z_{n\Lambda} + n^{-1}\,\mathrm{Tr}\ln(1_\Lambda - z_{n\Lambda}\rho_\Lambda)\big) , \qquad (5.2.1)$$
where $h_\Lambda$ and $\rho_\Lambda$ are defined in the same manner as in Section 4.2 and $z_{n\Lambda}$ is the unique solution of the equation
$$\mathrm{Tr}\big(z_{n\Lambda}\rho_\Lambda(1_\Lambda - z_{n\Lambda}\rho_\Lambda)^{-1}\big) = n . \qquad (5.2.2)$$
The proof of Eq. (5.2.1) can be carried out as in the case of Fermi gases. First, let us assume that the limits on both sides of Eq. (5.2.1) exist. The conditions under which this requirement is fulfilled are the same as for Fermi gases. Next, let us find the asymptotic form of the density matrix $f_{n\Lambda}^{-1}L_n^1\rho_\Lambda^{\vee n}$, where $f_{n\Lambda} = \mathrm{Tr}\,\rho_\Lambda^{\vee n}$ and
$$\rho_\Lambda^{\vee n} = S^n(\rho_\Lambda\otimes\cdots\otimes\rho_\Lambda)S^n .$$
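Assuming that $\rho_\Lambda$ can be represented by an integral operator with the kernel $\rho_\Lambda(x, x')$, the integral kernel of $L_n^1\rho_\Lambda^{\vee n}$ can be written down as
$$(L_n^1\rho_\Lambda^{\vee n})(x_1, x_1') = (n!)^{-1}\int_{\Lambda^{n-1}} \mathrm{per}\begin{pmatrix} \rho_\Lambda(x_1, x_1') & \rho_\Lambda(x_1, x_2) & \cdots & \rho_\Lambda(x_1, x_n) \\ \rho_\Lambda(x_2, x_1') & \rho_\Lambda(x_2, x_2) & \cdots & \rho_\Lambda(x_2, x_n) \\ \vdots & \vdots & & \vdots \\ \rho_\Lambda(x_n, x_1') & \rho_\Lambda(x_n, x_2) & \cdots & \rho_\Lambda(x_n, x_n) \end{pmatrix} dx_2\cdots dx_n . \qquad (5.2.3)$$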
Performing an expansion of the permanent
$$\mathrm{per}\big[\rho_\Lambda(x_i, y_j)\big]_{i,j=1}^{n} = \sum_{P}\rho_\Lambda(x_{P1}, y_1)\,\rho_\Lambda(x_{P2}, y_2)\cdots\rho_\Lambda(x_{Pn}, y_n), \qquad y_1 = x_1',\quad y_j = x_j\ (j \ge 2),$$
with respect to the first row, one obtains
$$n\,L_n^1\rho_\Lambda^{\vee n} = f_{n-1,\Lambda}\,\rho_\Lambda + (n-1)\,\rho_\Lambda\,L_{n-1}^1\rho_\Lambda^{\vee(n-1)} . \qquad (5.2.4)$$
Successive application of this formula to the second term on the r.h.s. yields
L"f o #(n!1)oK¸ oR L\ . (5.2.5) n¸oR L\K K L\ K L K The Freche´t derivative of f K with respect to oK expresses in terms of p(oK) " : f\ ¸oRL in the L L LK L K same manner as for Fermi gases: d n\ ln f K"o\ n"1, 2,2 . K p(oK), L doK L
(5.2.6)
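The recursion (5.2.4) can be sanity-checked on a small example. If it is iterated down to $n = 0$ and the trace is taken, one arrives (on our reading, with $f_0 = 1$) at $n f_n = \sum_{k=1}^{n}\mathrm{Tr}(\rho^k)\,f_{n-k}$, where $f_n$ is the trace of the symmetrized $n$-fold product. The following Python sketch compares this recursion with a brute-force sum over boson occupation numbers. The function names and the three-level spectrum are illustrative assumptions, not part of the original text.

```python
from itertools import combinations_with_replacement
from math import prod


def weights_by_recursion(eigs, n_max):
    """Return [f_0, ..., f_n_max] from n*f_n = sum_k Tr(rho^k) * f_{n-k}.

    `eigs` are the eigenvalues of a (hypothetical) one-particle operator rho.
    """
    f = [1.0]  # assumption: f_0 = 1 (empty system)
    for n in range(1, n_max + 1):
        total = 0.0
        for k in range(1, n + 1):
            trace_rho_k = sum(lam ** k for lam in eigs)
            total += trace_rho_k * f[n - k]
        f.append(total / n)
    return f


def weight_brute_force(eigs, n):
    """Trace of rho x ... x rho over the symmetric subspace.

    Each multiset of n levels (a boson occupation-number configuration)
    contributes the product of its eigenvalues.
    """
    return sum(prod(levels) for levels in combinations_with_replacement(eigs, n))
```

For a spectrum such as `[0.9, 0.5, 0.2]` the two functions should agree for every particle number up to rounding error, which is the content of the recursion.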
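The proof of Eq. (5.2.6) for Bose gases requires only replacement of all minus signs in Eqs. (4.2.14) and (4.2.15) by plus signs. The r.h.s. of Eq. (5.2.6) assumes a simple asymptotic form for large $n$, $|\Lambda|$, given by
Lemma 5.1. For any sequence of bounded operators $B_\Lambda$, each defined on $\mathcal{H}_\Lambda$ and such that $\lim\|B_\Lambda\| < \infty$ as $|\Lambda| \to \infty$ and $\lim \mathrm{Tr}(B_\Lambda\pi_n(\rho_\Lambda))$ as $n, |\Lambda| \to \infty$ exists, the following equality holds:
$$\lim_{n,\Lambda}\mathrm{Tr}\big(B_\Lambda\pi_n(\rho_\Lambda)\big) = \lim_{n,\Lambda} n^{-1}\,\mathrm{Tr}\big(B_\Lambda z_{n\Lambda}\rho_\Lambda(1_\Lambda - z_{n\Lambda}\rho_\Lambda)^{-1}\big) , \qquad (5.2.7)$$
where $z_{n\Lambda}$ is the unique solution of Eq. (5.2.2).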
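Lemma 5.1 can be illustrated numerically by comparing canonical and grand canonical level occupations for a finite spectrum. The sketch below is a rough check under stated assumptions, not a proof: the canonical occupations are obtained from the iterated recursion of Section 5.2 (read here as $n f_n = \sum_k \mathrm{Tr}(\rho^k) f_{n-k}$), the parameter $z$ is found from Eq. (5.2.2) by bisection, and the spectrum, particle number and function names are hypothetical.

```python
import math


def canonical_weights(eigs, n):
    """f_0..f_n from n*f_n = sum_k Tr(rho^k) f_{n-k} (assumed reading of the recursion)."""
    traces = [sum(lam ** k for lam in eigs) for k in range(n + 1)]
    f = [1.0]
    for m in range(1, n + 1):
        f.append(sum(traces[k] * f[m - k] for k in range(1, m + 1)) / m)
    return f


def canonical_occupations(eigs, n):
    """Mean occupation of each level for exactly n bosons."""
    f = canonical_weights(eigs, n)
    return [sum(f[n - k] * lam ** k for k in range(1, n + 1)) / f[n] for lam in eigs]


def solve_fugacity(eigs, n, iterations=200):
    """Bisection for z in (0, 1/max(eigs)) with sum_i z*lam_i/(1 - z*lam_i) = n."""
    lo, hi = 0.0, 1.0 / max(eigs)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:  # interval exhausted in floating point
            break
        total = sum(mid * lam / (1.0 - mid * lam) for lam in eigs)
        if total < n:
            lo = mid
        else:
            hi = mid
    return lo


def grand_canonical_occupations(eigs, n):
    """Occupations z*lam/(1 - z*lam) with z fixed by the particle number."""
    z = solve_fugacity(eigs, n)
    return [z * lam / (1.0 - z * lam) for lam in eigs]
```

Both sets of occupations sum to the particle number by construction. How closely they agree level by level for a given spectrum and $n$ can be inspected directly, e.g. with `max(abs(a - b) for a, b in zip(canonical_occupations(eigs, n), grand_canonical_occupations(eigs, n)))`.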
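Proof. Proceeding in the same manner as in Section 4.2.2, one finds that $\pi_n(\rho_\Lambda) \approx \pi_{n+1}(\rho_\Lambda)$. Furthermore, since for $S_{n\Lambda} := f_{n+1,\Lambda}^{-1}f_{n\Lambda}$,
$$\pi_{n+1}(\rho_\Lambda) = (n+1)^{-1}S_{n\Lambda}\rho_\Lambda + n(n+1)^{-1}S_{n\Lambda}\rho_\Lambda\,\pi_n(\rho_\Lambda) ,$$
therefore
$$(1_\Lambda - S_{n\Lambda}\rho_\Lambda)\,\pi_n(\rho_\Lambda) \approx (n+1)^{-1}S_{n\Lambda}\rho_\Lambda . \qquad (5.2.8)$$
It follows from (5.2.8) that the operator $(1_\Lambda - S_{n\Lambda}\rho_\Lambda)^{-1}$ exists for sufficiently large $n$, $|\Lambda|$, since the r.h.s. represents a positive operator. Hence $0 < S_{n\Lambda}\rho_\Lambda < 1_\Lambda$ and therefore
$$\|(1_\Lambda - S_{n\Lambda}\rho_\Lambda)^{-1}\| = (1 - \|S_{n\Lambda}\rho_\Lambda\|)^{-1} < \infty .$$
The asymptotic equality (5.2.8) can thus be rewritten in the form
$$\pi_n(\rho_\Lambda) \approx (n+1)^{-1}S_{n\Lambda}\rho_\Lambda(1_\Lambda - S_{n\Lambda}\rho_\Lambda)^{-1} \approx n^{-1}z_{n\Lambda}\rho_\Lambda(1_\Lambda - z_{n\Lambda}\rho_\Lambda)^{-1} , \qquad (5.2.9)$$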
$z_{n\Lambda}$ being the unique solution of Eq. (5.2.2). This solution obviously fulfills the requirement $0 < z_{n\Lambda}\rho_\Lambda < 1_\Lambda$, allowing one to pass from Eq. (5.2.8) to (5.2.9). QED
As follows from Lemma 5.1, Eq. (5.2.6) assumes the asymptotic form
$$\frac{d}{d\rho_\Lambda}\,n^{-1}\ln f_{n\Lambda} \approx n^{-1}z_{n\Lambda}(1_\Lambda - z_{n\Lambda}\rho_\Lambda)^{-1} \qquad (5.2.10)$$
for large n and is satisfied by (5.2.11) n\ ln f K+!n\ Tr ln(K!z KoK)#n\ ln e (z K) . L L L L The unknown function e (z K) can be determined in the same manner as g (z K) in Section 4.2.2: L L L L e (z)"z\L, which substituted into Eq. (5.2.11) yields the asymptotic equality (5.2.1). L 5.3. Thermodynamic limit of free energy density of continuous n-boson systems in a bounded region The discussion of n-fermion systems with spin-independent interactions in Section 4.3, after minor modifications, can be repeated in the case of systems of n bosons with zero spin. The Hilbert space of n spin-zero bosons in a bounded region K is ¸(K)RL"SL(¸(K)2¸(K))SL and the kernel of the integral operator exp(!bSLHLKSL)"exp(!bHLK ) N N is, according to the Feynman—Kac theorem, exp(!bHLK (x ,2, x ,y ,2, y )) L L N L @ "(n!)\ (u ) exp ! ¼(u (s),2, u (s)) ds . dkNG K@ PLG W V G L XKL P b G L Here the Wiener measure n!\ PL dkNG KyP@LG(u ) is strictly positive, so the proofs of all theorems G V G ¸oRL for large are even easier than in the case of Fermi statistics and the asymptotic form of f\ LK L K R ¸o L, viz., f\ ¸o L+p(oK)p(o ) (Appendix C). The final result n is similar to that of f\ LK L K L L L LK L K can be stated as follows:
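Theorem 5.2. The free energy per particle of an n-boson gas with 2-body potential of the form (3.1.1), (3.1.2) fulfills the following asymptotic equality:
$$\lim_{n,\Lambda} f(H^n_{p\Lambda}, \beta) = \lim_{n,\Lambda} f\big(n\Gamma_1^n h_{p\Lambda}(\xi^*, \eta^*(\xi^*)), \beta\big) = \lim_{n,\Lambda}\beta^{-1}\big(\ln z_{n\Lambda} + n^{-1}\,\mathrm{Tr}\ln(1_\Lambda - z_{n\Lambda}\rho_\Lambda)\big) , \qquad (5.3.1)$$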
where
1 * : ¹K#» # g*tK# K m*! g* , oK " : exp(!bhK) , hK " K ! m u ? ?K A A N ? A N N 2 ? A ? A and g*(m)"+g* (m ,2, m ),2, g*(m ,2, m ), is the unique solution of the equations I J I n\ Tr((tK!g K)z KoK(K!z KoK)\)"0, c"1,2, l , (5.3.2) A A L L m*"+m*,2,m*, is the solution of equations I n\ Tr((uK!m K )z KoK(K!z KoK)\)"0, a"1,2, k (5.3.3) ? ? L L which minimizes ln z K#n\ Tr ln (K!z KoK) and z K is the unique solution of Eq. (5.2.2). L L L Similarly as in the case of Fermi statistics, Eqs. (5.3.2) and (5.3.3) can be written in the compact form ( p(oK))) (Tr p(oK))X K "(Tr p(oK))Tr (º L K K L L ! Tr[º (p (o )p (o ))] K . (5.3.4) K L K L K The factors Tr(z KoK(K!z KoK)\) and Tr(p(oK)) in Eqs. (5.3.2), (5.3.3) and (5.3.4) are retained L L L and not put equal one, because, unlike in fermion systems, in the case of bosons, Eq. (5.2.2) is not always asymptotically solvable in the thermodynamic limit. Examples are given in Section 5.5. Eq. (5.3.4) is identical (apart from the additive scalar term on the r.h.s.) to the equation ( p(oK))) X K "Tr (º K K L which results by minimizing the free energy of a large, but finite, n-boson system with the 2-body o4L (Kossakowski and Mac´kowiak, potential (3.1.1), (3.1.2), over product states of the form f\ LK K 1986). However, they differ in the limit (3.1.5) if the limiting equation lim n\ Tr(z KoK(K!z KoK)\)"1 L L LK is unsolvable, in which case the correct limiting form of mean-field equations follows from Eq. (5.3.4). The method of dealing with an interacting Boson gas developed in Sections 5.1, 5.2 and 5.3 is applied in Section 5.5 to a Boson gas defined in the momentum representation. First, in Section 5.4, the classical version of this model is considered. 5.4. A mean-field approach to the gas—liquid transition The theory of the gas—liquid transition has a long history. The first fundamental contribution to the understanding of this phenomenon was made by van der Waals who in 1873 discovered his famous equation of state. 
Around 1930, Ursell and Mayer (Mayer and Mayer, 1940) derived their virial expansion of the equation of state, which led to further theoretical investigations based on microscopic properties of gases, such as the interparticle potential. Substantial contributions in this direction were made by Yang and Lee (1952), Kac (1959a), Kac et al. (1963), Lebowitz and
Penrose (1966), Widom and Rowlinson (1970) and many others. The search for a complete theory of the gas—liquid transition and liquid state is still continuing and new models with a first-order phase transition are being introduced, e.g. Diren and Leeuwen (1984), Bricomt et al. (1985) and Mac´kowiak (1987). The last of these has some advantages over the well-known model of Kac (1959a), Kac et al. (1963) and its generalization due to Lebowitz and Penrose (1966): the presence of the first-order phase transition does not depend on any additional limiting procedure after the infinite volume limit (as in the Kac model and its extension) and the quantized Bose version is also solvable. It is convenient to consider first the classical solution of this model, as it appears later as the high-temperature form of the quantum solution. 5.4.1. A mean-field Hamiltonian H(n) of a classical gas The essential difficulty in setting up a solvable model of a classical gas consists in finding a simple, but physically meaningful, approximation to the 2-body potential. A typical 2-body potential º(r), representing the interaction between gas molecules, has a minimum at some value r of the interparticle distance r. For r(r , º(r) is rapidly increasing and lim º(r)"R as rP2r (with r equal to the hard-core radius), whereas for r'r , º(r) increases gradually and lim º(r)"0 as rPR. Calculation of the limiting free energy density of n particles with such interaction is a formidable task (see e.g. Boublik et al., 1980). However, liquefaction is presumably due to the increasing part º of the deep well around r , as it is accompanied by a drop of specific volume. G º attracts the slow particles most effectively, the dominating interaction between the fast ones G being repulsion. 
An approximation to º can be thus obtained by assuming r "0 and G º (1, 2)"const#c ("r !r ")c (k )c (k )(r !r ) , G where r , k are the position and momentum of the ith particle, respectively, and G G c '0 for x3P "[o !d , o #d ] , G G G G G c (x)" G 0 for x , P . G The Hamiltonian of n interacting particles then takes the form (up to an additive constant)
L 1 H (n)" k/2m #n\ c ("r !r ")c (k )c (k )(r !r ) . (5.4.1) G G H G H G H 2 G GH The canonical transformation k "Q , r "!P , i"1,2, n, where Q , P are the new position G G G G G G vectors and momenta, respectively, transforms H (n) into
H (n)" Q/2m #n\ c ("P !P ")c (Q )c (Q )P G G H G H G H G !n\ c ("P !P ")c (Q )c (Q )P ) P . G H G H G H GH Suppose now that d is small and, as a consequence, c ("P !P ")P ) P Koc ("P !P ")(P P )\P ) P . G H G H G H G H G H
(5.4.2)
(5.4.3)
Then for a separable approximation to the 2-body potential (5.4.3), viz., oc ("P !P ")(P P )\P ) P Kg(p )g(p )(p p )\p ) p , G H G H G H G H G H G H g '0 for p3P"[p !D, p #D], p !D'0 , g(p)" 0 for p , P
in terms of the variables
L P\P , i"1,2, n p "(2m Q/2m #n\ c ("P !P ")c (Q )c (Q )P G G G G G H G H G H the Hamiltonian H (n) expresses as L H(n)" p/2m!n\ g(p )g(p )(p p )\p ) p . (5.4.4) G G H G H G H G GH H(n) can be viewed as representing a system of interacting quasi-particles with momenta p and G some effective mass m. As shown in the next section, the thermodynamics of H(n) mimics some features of the gas—liquid transition. 5.4.2. The gas—liquid transition in terms of H(n) Suppose the system H(n) is enclosed in a cube K"¸. The limit lim f (H(n), b) as nPR, with n¸\"d, where
f (H(n), b)"!(nb)\ln(n!)\
dp 2dp dq 2dq exp(!bHL) L KL L 0L can be found by a simplified method similar to the one applied in Section 3 to quantum systems of n Boltzmann particles. One obtains lim f (H(n), b)" lim f (h(n), b)"f (x , b) K L L p 1 "!b\ ln 2p p dp sin a exp(!bp/2m#bx g(p)cos a) da# x #b\ ln d!b\ , K 2 K (5.4.5) where
L h(n)" h(x , p ) , K G G (5.4.6) h(x, p)"p/2m!p\g(p)x ) p#x , and x ""x " is the solution of the equation jf/jx"0 which minimizes f (x, b). This equation takes K K the form F (x, g , b)"x ,
(5.4.7)
where pg(p) dp p sin a cos a exp(!bp/2m#bxg(p)cos a) da . F (x, g , b)" p dp p sin a exp(!bp/2m#bxg(p)cos a) da F has the following properties: (1) F (0, g , b)"F (x, g , 0)"0. (2) Let s(p , p , a , a )"(g(p )cos a !g(p )cos a ) ;exp(!b(p#p)/2m#bxg(p )cos a #bxg(p )cos a ) . Then jF (x, g , b) 1 p dp p dp p sin a da p sin a da s(p , p , a , a ) '0 . " b 2 (p dp p sin a da exp(!bp/2m#bxg(p)cos a)) jx (3) jF (0, g , b)"0 , jx jF (0, g , b) jx "p
m g b
exp(!bp/2m)p dp
P
(p
1 b ! 5(2 3 m
exp(!bq/2m)q dq .
P
(3) implies that for sufficiently small D, (jF /jx)(0, g , b)'0 for any b'0 and hence that F (x, g , b) is convex for positive x in some neighbourhood of x"0 for such D. Further properties of F can be inferred from the following representation of this function: F (x, g , b) (2p(bxg )\I (bxg )P exp(!bp/2m)p dp , (5.4.8) "g (2p(bxg )\I (bxg )P exp(!bq/2m) dq#2 P exp(!bp/2m)p dp 0 > where I (x) is Bessel’s modified function, N p (x)N I (x)" exp(x cos a)sinN a da . (5.4.9) N C(p#)(p Eq. (5.4.8) results by performing the rearrangements
p
p j sin a cos a exp(bxg cos a) da" sin a exp(bxg cos a) da jbxg j 1 \ I (bxg )"(2p(bxg )\I (bxg ) . bxg "(p jbxg 2
The functions I (x) are increasing in x. Thus, according to Eq. (5.4.8): N (4) For increasing x or g , F(x, g , b) approaches the function G (x, g , b)"g I (bxg )/I (bxg ) with the following growth and concavity properties: (A) jG /jx'0, (B) jG /jx(0 for x'0, (C) G (0, g , b)"0, (D) lim G (x, g , b)"g , b'0. V As a consequence of (D), (5) lim F (x, g , b)"g for b'0. V The growth properties of F (x, g , b) with respect to b can be found from Eq. (5.4.8) and m (jF /jb)(x, g , b)"xb\(jF /jx)(x, g , b)#m\ b
(bxg )\I (bxg )Pp(3!bp/m)exp(!bp/2m) dp . (5.4.10) ; ((bxg )\I (bxg )Pqexp(!bq/2m) dq#(p > Ppexp(!bp/2m) dp) 0 The first term on the r.h.s. of Eq. (5.4.10) is positive, but the second one becomes negative for b exceeding some b . Furthermore, substituting the asymptotic formula (Gradshteyn and Ryshik, 1965, Eq. 6.461.5) I (x)K(2px)\ exp x for large x'0 (5.4.11) N into Eq. (5.4.8), one finds that jF /jb'0 for sufficiently large x. Thus (6) For small b'0 and any x3R , (jF /jb)(x, g , b)'0. For large enough x, jF /jb'0 for all b. > The limit lim F (x, g , b) can be found using the asymptotic form of the integral @ N exp(!bp)p dp, p '0 N for large b, which can be inferred from the expansion
N
N
exp(!bp)p dp"!((2b)\p#(2b)\p\)exp(!bp)"N N K ! exp(!bp) (!1)I>(2k#1)!! (2b)\I\p\I\"N N I N # (2b)\K\(2m#1)!! (!1)K> exp(!bq)q\K\ dq N
(5.4.12)
obtained by integration by parts (Erde´lyi, 1956):
N>D
exp(!bp)p dpK(2b)\(p !D) exp(!b(p !D)) as bPR . N\D Using Eq. (5.4.13), one obtains (7)
(5.4.13)
g for x'(2mg )\(p !D) , lim F (x, g , b)" 0 for 04x4(2mg )\(p !D) . @ According to (1)—(7) and (A)—(D) above, the behaviour of F (for small D and large enough g ) with varying x and b can be depicted as in Fig. 1. Thus, Eq. (5.4.7) has the trivial solution x"0 for all b50 and one or two nonzero solutions x 5x '0 for large enough b, provided: (i) g , m are sufficiently large or p , D sufficiently small. In the sequel we shall assume that g , m, p , D satisfy condition (i). The solution x which minimizes f (x, b) must be now determined. Since Eq. (5.4.7) results by K setting (j/jx) f (x, b)"0 ,
(5.4.14)
the solution x"0 of this equation minimizes f (x, b) in the range of low inverse temperatures. However, F (x, g , b) is increasing in b at large enough x, therefore if g , m, p , D are appropriately adjusted according to (i), the solution x will be the minimizing one for values of b exceeding some b . The minimizing solution thus displays a jump at b , R R lim x (b)"x (b )'0 . (5.4.15) lim x (b)"0, K K \ > @@ @@ As a consequence, the system undergoes a first-order phase transition at b , since the entropy per particle j d s(b)"kb f (h(x (b)), b)"kb f (h(x (b)), b) , K K jb db where k denotes Boltzmann’s constant, is continuous in x , but due to Eq. (5.4.15) exhibits a finite K discontinuity in b at b . Let us verify whether this discontinuity corresponds to an increase or drop in the value of entropy: jf (x , b)/jb"!b\x #Q(x , b)!b\f (x , b) , K K K K where
(5.4.16)
p dpp sin a exp(!bp/2m#bxg(p)cos a) da , Q(x, b)"(2m)\ p dpp sin a exp(!bp/2m#bxg(p)cos a) da m (g x\)I (bxg )Pp dp p!3 exp(!bp/2m) jQ b "p(mb\ . jx (p dpp sin a exp(!bp/2m#bxg(p)cos a) da)
(5.4.17)
Fig. 1. The function F (x,g ,b) for b (b (b (b .
The first term on the r.h.s. of Eq. (5.4.16) is non-positive and, according to Eq. (5.4.17), the second one is decreasing in $x$ at $\beta = \beta_t$, if $\beta_t$ is small enough (which is the case for large $g_0$) or the particle mass $m$ is large. This choice of the constants $g_0$, $m$ is guaranteed by condition (i). The entropy per particle $s(\beta)$ thus drops discontinuously at $\beta_t$ under increasing $\beta$, which is a feature of the gas-liquid transition. The form of the free Hamiltonian $h(n)$ confirms this character of the transition: at $\beta \le \beta_t$, $h(n) = \sum_{i=1}^{n} p_i^2/2m$, therefore the system is in the vapour phase, as it behaves like a noninteracting gas; at $\beta > \beta_t$ the system is in the liquid phase, since the motion of particles is locally ordered, the dominating motion being that parallel to the vector $\mathbf{x}_m$. As for the vector $x_m^{-1}\mathbf{x}_m$, it can be viewed as a random variable with respect to time and space coordinates, because neither of these variables enters into $H(n)$ or Eq. (5.4.7). For a sufficiently strong coupling in the interaction (5.4.4), $\partial x_m/\partial\beta > 0$, in which case the local ordering increases with $\beta$. This form of the Hamiltonian $h(n)$ agrees with the generally accepted views on the liquid phase, as one in which the motion of the molecules is ordered only locally; e.g. a simulation of the molecular dynamics of a two-dimensional Lennard-Jones liquid performed by Mitus et al. (1991) revealed a picture of a liquid as a locally ordered system.
Not all properties of the system with the Hamiltonian (5.4.4) are satisfactory. For example, its isotherms coincide with those of an ideal gas. This deficiency can be removed by letting $g_0$ vary appropriately with specific volume $v$ (Maćkowiak, 1987). However, a satisfactory form of the isotherms results only after replacing the 2-body potential in $H(n)$ given by Eq. (5.4.4) by a more general one of the form
$$U(\mathbf{p}_i, \mathbf{p}_j) = n^{-1}\big(g_r(p_i, v)\,g_r(p_j, v) - g_a(p_i, v)\,g_a(p_j, v)\big)(p_i p_j)^{-1}\,\mathbf{p}_i\cdot\mathbf{p}_j , \qquad (5.4.18)$$
where $g_r$, $g_a$ are suitably chosen functions of momentum and $v$ (Maćkowiak, 1989a).
5.5. Bose-Einstein condensation in the liquid phase of an interacting gas
Bose-Einstein (BE) condensation in various models of interacting and noninteracting Boson gases has been the subject of extensive investigations (e.g. Bogoliubov, 1947; Cannon, 1973; Lewis, 1972; Pulé and Lewis, 1974, 1975; Landau and Wilde, 1979; Fannes and Verbeure, 1980; Bratteli and Robinson, 1981; Baumgartner et al., 1983; Fannes et al., 1982; Angelescu et al., 1992). It is generally accepted (e.g. Huang, 1963; Baym, 1967) that the λ-transition in $^4$He at $T = 2.19$ K, which manifests itself in the appearance of unusual superfluid properties of liquid $^4$He, is due to BE condensation of $^4$He atoms. The quantum origin of this transition is evident: it occurs at a very low temperature, where only subtle quantum effects can produce qualitative changes on a macroscopic scale. $^4$He atoms are in fact bosons with spin equal to zero, which strongly supports the conjecture of a close link between BE condensation and the λ-transition, since the presence of the former effect has been rigorously proved in various models of a gas of zero-spin particles. Thus, purely quantum models of low-temperature liquid $^4$He, which reproduce only the λ-transition in $^4$He in the form of BE condensation, suggest themselves as the most natural models of $^4$He at low temperatures. Apart from the free Boson gas (e.g. Lewis, 1972; Landau and Wilde, 1979; Bratteli and Robinson, 1981), various models with an attractive or repulsive interaction have been considered. Angelescu et al. (1992) have argued that the Bogoliubov model of a superfluid (Bogoliubov, 1947) has an attractive interaction. They also considered an extension of this model containing a repulsive 2-body potential. Their model, as well as its prototype introduced by Bogoliubov in 1947, reproduces the linearity of the quasi-particle excitation spectrum at small momenta observed in superfluid $^4$He. A model of a superfluid with attraction was also considered by Valatin (1964).
Other Boson Hamiltonians in Fock space, with a 2-body repulsion, were investigated by Huang (1963), Fannes and Verbeure (1980), and Baumgartner et al. (1983). None of these microscopic models, however, accounts for the singularity of specific heat in $^4$He at $T_\lambda$. Only phenomenological models, such as the one introduced by Fliessenbach (1991) on the grounds of phenomenological assumptions about the occupation numbers $n_k$ of single-particle states in a Bose gas, have explained this phenomenon. The presence of the liquid phase in $^4$He, at all sufficiently low temperatures and pressures below 25 atm, has also remained a question unaccounted for by microscopic theories. The next sections deal with some aspects of fluidity and BE condensation of a Boson gas (Maćkowiak, 1989b). A convenient interaction for describing both these phenomena results by quantizing the classical Hamiltonian (5.4.4). The quantized Boson system to be considered thus consists of $n$ identical particles, each having zero spin, enclosed in a cube with edge $L$ under periodic boundary conditions and interacting via a 2-body potential of the form
$$U(\mathbf{p}_i, \mathbf{p}_j) = -n^{-1}g(p_i)\,g(p_j)\cos(\mathbf{p}_i, \mathbf{p}_j) , \qquad (5.5.1)$$
where
$$g(p) = \begin{cases} g_0 > 0 & \text{for } p \in P = (0, \Delta] , \\ 0 & \text{for } p \notin P , \end{cases}$$
$\mathbf{p}_i$ denoting the particle momenta and $\cos(\mathbf{p}_i, \mathbf{p}_j)$ standing for the cosine of the angle between $\mathbf{p}_i$ and $\mathbf{p}_j$. The system will be assumed in equilibrium and the canonical ensemble approach will be
applied. The thermodynamic limit of free energy per particle can be found by a technique analogous to the one applied in Sections 6.2.1, 6.2.2, 6.2.3, 6.2.4 and 6.2.5 to the fermion reduced s-d model and proved equal to that of a noninteracting system of bosons in a mean field given implicitly as the solution of a set of two equations. The asymptotic solutions of these equations are next found by a method of Landau and Wilde (1979), and these allow one to conclude that in the presence of the interaction (5.5.1) a Boson gas exhibits a first-order gas-liquid transition at some temperature $T_t$, which manifests itself in a discontinuous drop of entropy and local ordering of particle momenta. If this interaction is sufficiently strong, BE condensation occurs in the liquid phase at $T_E < T_t$.
5.5.1. Asymptotic exactness of the mean-field description
The Hamiltonian
$$H^n = \sum_{i} p_i^2/2m + \sum_{i<j} U(\mathbf{p}_i, \mathbf{p}_j) \qquad (5.5.2)$$
of $n$ zero-spin bosons is a multiplication operator in $L^2(\Lambda)^{\otimes n}$. Under the assumed periodic boundary conditions in $\Lambda = L^3$, the admissible values of 1-particle momenta $\mathbf{p}$ are
$$\mathbf{p} = 2\pi L^{-1}(k_1, k_2, k_3), \qquad k_i \in \mathbb{Z} .$$
The asymptotic form of the free energy per particle $f(H^n, \beta)$ in the infinite-volume limit
$$n \to \infty, \qquad |\Lambda| = L^3 \to \infty, \qquad n|\Lambda|^{-1} = d \qquad (5.5.3)$$
can be found by a method analogous to the one developed in Sections 5.1, 5.2 and 5.3. Details of this technique, applied to the n-fermion reduced s—d Hamiltonian, are presented in Sections 6.2.1, 6.2.2, 6.2.3, 6.2.4 and 6.2.5. As for HL, defined in Eq. (5.5.2), one proves that lim f (HL, b)" lim f (nCL h(x ), b)"b\ lim ln z #b\ lim n\ Tr ln(!z o ) , L L L L L L L L where h(x)"p/2m!xg(p)cos (x, p)#x, x""x" and x is the solution of the equations L n\ Tr((g(p) cos(x , p)!x )z o (!z o )\)"0 , L L L L L L n\ Tr (z o (!z o )\)"1 , L L L L with o " : exp(!bh(x )), which minimizes f (nCL h(x ), b). L L L
(5.5.4)
(5.5.5) (5.5.6)
5.5.2. Solutions of the mean-field equations The explicit form of Eq. (5.5.5), is n\ (g(p) cos(x , p)!x )f . (x , p, b)(1!f . (x , p, b))\"0 L L L L L L L L p
(5.5.7)
and that of condition (5.5.6) n\ f . (x , p, b)(1!f . (x , p, b))\"1 , L L L L L L p
(5.5.8)
where f "z exp(!bx/2) and L L L . (x , p, b)"exp(!bp/2m#bx g(p) cos(x , p)) . L L L L The range of the variables x , f in Eqs. (5.5.7) and (5.5.8) must be, clearly, restricted to the set L L R "+(x , f ) : f exp(!be (x ))(1,, where e (x ) is the smallest eigenvalue of the Hamiltonian L L L L L L h(x )!x/2: L L e (x )" inf ( p/2m!g x cos(x , p)) . L L L p NZP In order to investigate the limit lim x , lim f of solutions x , f of Eqs. (5.5.7) and (5.5.8), L L L L L L let us first consider Eq. (5.5.8) for f and find the limit lim f for a sequence +x , with L L L L x "x "x "2"x. A possible technique for investigating this limit is offered by Landau’s L L> L> and Wilde’s proof of BE condensation of an ideal gas (Sections 2 and 3 of Landau and Wilde (1979)). By mimicking their argument, one can prove the following lemmas: Lemma 5.3. For fixed K, b, d, x there is a unique f 3(0, exp(be (x))) satisfying the equation D(n, b, x, f )"d , (5.5.9) L where D(n, b, x, f )""K"\ f . (x, p, b)(1!f . (x, p, b))\ . L L L L L p
(5.5.10)
Proof. The sum in Eq. (5.5.10) converges almost uniformly with respect to $\zeta_n \in R_n(x, \zeta_n)$ and therefore is continuous in $\zeta_n$. Furthermore, it is strictly increasing in $\zeta_n$, since each summand has this property. The assertion holds since $D(n, \beta, x, 0) = 0$ and $\lim D(n, \beta, x, \zeta_n) = \infty$ for $\zeta_n \to \exp(\beta\varepsilon_n(x))$. QED
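The monotonicity argument of Lemma 5.3 translates directly into a bisection search. The Python sketch below solves Eq. (5.5.9) for the fugacity-like variable on a finite box. It is an illustration under stated assumptions: the momentum lattice is truncated at $|k_i| \le$ `kmax`, the mean-field vector is taken along the z-axis, $g(p) = g_0$ on $(0, \Delta]$ and zero otherwise, and all names and parameter values are hypothetical.

```python
import math


def boltzmann_factors(L, beta, x, g0, delta, m=1.0, kmax=6):
    """Factors exp(-beta*e(x,p)) on the lattice p = (2*pi/L)*(k1,k2,k3), |k_i| <= kmax.

    e(x,p) = p^2/(2m) - x*g(p)*cos(x,p); the mean field is assumed to point
    along the z-axis, so cos(x,p) = p_z/|p|.
    """
    factors = []
    ks = range(-kmax, kmax + 1)
    for k1 in ks:
        for k2 in ks:
            for k3 in ks:
                px, py, pz = (2.0 * math.pi / L * k for k in (k1, k2, k3))
                p = math.sqrt(px * px + py * py + pz * pz)
                coupling = g0 * pz / p if 0.0 < p <= delta else 0.0
                energy = p * p / (2.0 * m) - x * coupling
                factors.append(math.exp(-beta * energy))
    return factors


def density(zeta, factors, volume):
    """Finite-volume sum of zeta*phi/(1 - zeta*phi) over the lattice, divided by the volume."""
    return sum(zeta * phi / (1.0 - zeta * phi) for phi in factors) / volume


def solve_zeta(d, factors, volume, iterations=200):
    """Bisection on (0, 1/max(factors)), where the density is strictly increasing."""
    lo, hi = 0.0, 1.0 / max(factors)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid <= lo or mid >= hi:  # interval exhausted in floating point
            break
        if density(mid, factors, volume) < d:
            lo = mid
        else:
            hi = mid
    return lo
```

For example, `solve_zeta(0.05, boltzmann_factors(5.0, 1.0, 0.5, 1.0, 2.0), 5.0 ** 3)` returns a value strictly inside the admissible interval. Because the density diverges at the upper end of the interval, a root exists for every target density, which mirrors the existence part of the lemma.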
The set of f ’s, each satisfying Eq. (5.5.9), is a subset of the interval (0, exp(be (x))). As an infinite L bounded set, +f , has at least one limit point. It remains to prove uniqueness of the limit point L L for each x, b3R ;R . > > For fixed x and e'!xg , let us define D (n, b, x, f)""K"\ f. (x, p, b)(1!f. (x, p, b))\ , L L C Cx pC where e(x, p)"p/2m!xg(p) cos(x, p).
Lemma 5.4. ¸et f* be a limit point of +f ,, so there is a sequence n PR with f IPf*. ¹hen L I L D (n , b, x, f I)PD (b, x, f*), where C C I L
D (b, x, f)"(2p)\ C
f(exp(be(x, p))!f)\ dp Cx pC
for 06f6exp(!bg x). Proof. Obviously, lim I e (x)"!xg , so f*6exp(!bg x) and D (b, x, f*) is well defined for all C L e'!g x. The result follows by writing D (n , b, x, f I) as a Riemann sum in terms of the admissible C I L momenta and passing to the limit n PR. QED I D (b, x, f*) can be viewed as the density of particles with energy exceeding e, provided f* is proved C to be unique. Thus for unique f* D (b, x, f*) " : sup D (b, x, f*)"D (b, x, f*) C \VE C\VE
"(2p)\
f*(exp(be(x, p))!f*)\ dp
0 is the density of excited particles in the infinite system with the 1-particle Hamiltonian h(x)!x/2 and D "d!D the density of particles in the ground state. Clearly, D (b, x, f*)4d, C implying D (b, x, f*)4d and D (b, x, f*)50. Obviously, (j/jf)D (b, x, f)'0 , (j/jb)D (b, x, exp(!bxg ))(0 , D (b, x,f)6D (b, x, exp(!bxg )) equality in Eq. (5.5.13) holding if and only if f"exp(!bxg ). Let us now consider the equation
(5.5.11)
D (b, x , exp(!bx g ))"d , @ @ where x is the unknown. Clearly, @ (j/jx)D (b, x, exp(!bxg ))(0 , therefore, for x'0 and any b3(0,R)
(5.5.14)
(5.5.12) (5.5.13)
(5.5.15)
D (b, 0, 1)7D (b, x, exp(!bxg ))7 lim D (b, x, exp(!bxg ))"0 , (5.5.16) V the last equality being implied by the asymptotic form of I (z) (Gradshteyn and Ryshik, 1965, Eq. 6.461.5) I (z)+(2pz)\eX for large z'0 . and (5.5.29).
(5.5.17)
Furthermore, (j/jb)D (b, 0, 1)(0 . (5.5.18) Thus by Eqs. (5.5.15), (5.5.16) and (5.5.18) for fixed d, Eq. (5.5.14) has a solution x 70 for small @ enough b'0 and no solution whenever D (b, 0, 1)(d , (5.5.19) i.e. for values of b exceeding b , the latter being defined as the solution of the equation B D (b , 0, 1)"d . (5.5.20) B Let us define x as the solution of Eq. (5.5.14) for b(b and x "0 for b7b . Obviously, @ B @ B jx /jb(0 for b(b . (5.5.21) @ B Lemma 5.5. Suppose x '0. If 0(x(x , then any limit point f* of +f , satisfies f*( L @ @ exp(!bxg ) and D (b, x, f*)"0. Conversely, if there is a limit point f*(exp(!bxg ), then x(x @ and so all limit points lie in (0, exp(!bxg )). Proof. Suppose f* is a limit point of +f , and x(x . Then L @ * D (b, x, f )6d"D (b, x , exp(!bx g ))(D (b, x, exp(!bxg )) . @ @ Thus by Eq. (5.5.11), f*(exp(!bxg ). Furthermore, d""K"\ f I(exp(be(x, p))!f I)\PD (b, x, f*) L L p
(5.5.22)
for some sequence f IPf*. Thus D (b, x, f*)"d and therefore D (b, x, f*)"0. L Conversely, if f*(exp(!bxg ) is a limit point, then by Eq. (5.5.22), D (b, x, f*)"d, we obtain D (b, x , exp(!bx g ))"d"D (b, x, f*)(D (b, x, exp(!bxg )) . @ @ Hence, by Eq. (5.5.15) x(x . QED @ From Lemma 5.5 and Eq. (5.5.11) we infer Lemma 5.6. If 0(x(x , then f Pf* as nPR, where f* is the unique solution of the equation @ L * D (b, x, f )"d . (5.5.23) It follows from Eq. (5.5.23) that for x3(0, x ), with x '0, D (b, x, f*)"0, so there is no BE @ @ condensation. However, there is BE condensation for x'x 70. @ Lemma 5.7. If x7x '0, then f Pexp(!bxg ) as nPR and for x'x , D (b, x, exp(!bxg )) @ L @ '0.
Proof. If x7x '0, then according to Lemma 5.3, +f , has no limit points smaller than @ L exp(!bxg ). Thus f Pexp(!bxg ) as nPR and by Eq. (5.5.15) for x'x , L @ D (b, x, exp(!bxg ))"d!D (b, x, exp(!bxg )) 'd!D (b, x , exp(!bxg ))"d!d"0 . @
QED
It can be seen from Lemma 5.7 by letting x P0, that if x "0 (i.e. b7b ), then for any x70 the @ @ B sequence +f , has one limit point f*"exp(!bxg ), and by Eqs. (5.5.16) and (5.5.19) L D (b, x, exp(!bxg ))'0 for x'x "0 and b7b . @ B The first consequence of these lemmas, in particular, of Lemmas 5.6 and 5.7, is that for a fixed b(b the limiting form of Eqs. (5.5.7) and (5.5.8) is B G(x, f, b, d)"x ,
(5.5.24)
D (b, x, f)"d ,
(5.5.25)
where
G"d\(2p)\
g p dp P
L
sin a cos a f(exp(bp/2m!bg x cos a)!f)\ da
provided the variable x "x "x "2"x satisfies x3[0, x ), and if x7x , then their L L> L> @ @ limiting form is (for any b'0) G(x, exp(!bxg ), b, d)"xd\D (b, x, exp(!bxg )), x5x . @
(5.5.26)
The orientation of the vector x"lim x in the infinite system is thus undetermined as it does not L enter into the limiting equations (5.5.24), (5.5.25) and (5.5.26). As for Eq. (5.5.26), a solution x '0 # will appear once b is increased sufficiently, thereby extending the range [x ,R) of the x variable in @ this equation (see Eqs. (5.5.33) and (5.5.39)). Suppose now that b(b and x3[0, x ). Since Eq. (5.5.25) is solvable for f in this range of b, x, it B @ follows that Eqs. (5.5.24) and (5.5.25) reduce to F(x, b, d)"x ,
(5.5.27)
where F(x, b, d) " : G(x, f(x, b, d), b, d) and f(x, b, d) is the implicit solution of Eq. (5.5.25). Let us examine the properties of F(x, b, d) for fixed d and x, b3[0, x );[0, b ): @ B 1°. F(0, b, d)"F(x, 0, d)"0; 2°. Let s(p, q, a, u):"(g(p) cos a!g(q) cos u)pq sin a sin u , K(x, f, p, a):"(f\exp(bp/2m!bg(p)x cos a)!1)\f\exp(bp/2m!xbg(p) cos a) .
Then
jF jG jG jD jD \ " ! jf jx jf jx jx 1 dp dqp daL du s(p, q, a, u)K(x, f, p, a)K(x, f, q, u) " bd\(2p)\ '0 . 2 dpp da sin a K(x, f, p, a) 3°. Let f " : f(0, b, d), t (p, b) " : exp(!bp/2m). Then K jF (0, b, d)"0 , jx jF d(2p)(2(2g )\ fI k(kb)\ (0, b, d) jx I 2m "(30(p)\(bg ) p dp kfI tI (p, b) lfJ P K lb I J
p dp kfI tI (p, b) K I (p '2(3p)\((bg ) p dp lfJ tJ (p, b) k\fI (2m) P K 20 J I 1 ! q da bktI (q, b) . K 3 P !2(9p)\(bg )
P
The function pbktI (p, b) is bounded on R " : R ;R ;R with respect to p, b, k. Let K > > > > M denote its upper bound in R . Then 20\(p(2m)!3\MD'0 for sufficiently small D. > Hence, (jF/jx)(0, b, d)'0 for such D, which implies convexity of F with respect to x for all b, d and positive x’s in some neighbourhood of x"0. 4°. With decreasing b3R , F(x, b, d), as function of x, approaches F (x, g , b) within any fixed > interval [0, x ]. D This property of F can be inferred from the following representations of G and D : p dp fItI (p,b)(bkxg )\I (bkxg ) , (5.5.28) G"(2p)\d\g K P I D "(2p)\ p dp fItI (p, b)(bkxg )\I (bkxg )#2(2p)\ p dp fItI (p, b) , K K P D I I (5.5.29)
where I are Bessel’s modified functions (cf. Section 5.4). To this end let us note that for any fixed N x, lim f(x, b, d)"0 as bP0, since the assumption lim f(x, b, d)'0 implies @ lim p dp fItI (p, b)"R K D @ I
which contradicts Eq. (5.5.25). Furthermore, according to Eqs. (5.5.25), (5.5.28) and (5.5.29), F can be written in the form F(x, b, d)" (2pPp dp fI\tI (p,b)(bkxg )\I (bkxg ) I K g . (2pPp dp fI\tI (p, b)(bkxg )\I (bkxg )#2p dp fJ\tJ (p, b) D K K I J (5.5.30) For x3[0, x ] and sufficiently small b only the first terms of each series in Eq. (5.5.30) are D meaningful owing to the smallness of f(x, b, d). These terms are the only ones which enter into F (x, g , b), which proves 4°. Let us recall the relevant properties of F as function of x: For small positive x, F (x, g , b) is convex with respect to x. With increasing x, the convexity of F diminishes; F becomes concave for large x and approaches asymptotically the value g as xPR (cf. Fig. 1). 5°. F(x, b, d) is increasing in b for small values of b and any x3R . For values of x greater than > some x , jF/jb'0 for all b satisfying x 'x . @ Direct calculation of jF/jb yields: jF jF "xb\ #R(x, b, d) , jx jb
(5.5.31)
where (2p)R(x, b, d)
2m lbp 3! "(8(2 dm)\g (xkbg )\I (xkbg )klfI>J p dp lb m P IJ ;tI (p, b)P\#(2md)\g p dp kfIQ (k, x, b, p) q dq K P P I ; lfJQ (l, x, b, q)! p dp kfIQ (k, x, b, p) qpq P P J I 2 (2md)\g ; lfJQ (l, x, b, q) P\# n J p dp kfIQ (k, x, b, p) q dq lfJtJ (q, b) ; K P P I J ! p dp kfIQ (k, x, b, p) q dq lfJtJ (q, b) P\ , K P P I J 2 p dp Q (k, x, b, p)# q dq tI (q, b) , P" kfI K p > P P 0 I Q (k, x, b, p)"(bxkg )\I (bxkg )tI (p, b) . G G K
(5.5.32)
The first and second term in Eq. (5.5.31), with R(x, b, d) substituted from Eq. (5.5.32), are of order D, whereas the third and fourth of order D. The latter two can be thus disregarded, owing to the smallness of D. As for the first term, it is positive by 2°, and the second term is positive for small enough b. This proves the first statement of 5°. To verify the second statement, let us use Eq. (5.5.17). According to this formula and the property 2°, for sufficiently large x, b, i.e. where the b-dependence of F is dominated by the I , I functions, F is increasing in b, which is equivalent to the second part of 5°. The range of such x and b can be extended to smaller values by increasing g and m. The properties of the solutions x(b) of Eq. (5.5.27) for 04x4x can be now inferred from 1°—5°: @ For sufficiently small b, x is large and Eq. (5.5.27) has only the trivial solution x"0. According to @ 2°—5°, for large enough g , m, D\, a nonzero solution x of Eq. (5.5.27) will appear at some b , if b (b , x (b )(x , which will subsequently split into two nonzero solutions x 'x under B @ growing b. By 5°, jx /jb'0 for large b. If b is increased still further, a point b is reached at # which the nonzero solution x (b ) of Eq. (5.5.26) appears and # # (5.5.33) x (b )"x #"x (b ) . # # # @ For b'b the solution x of Eq. (5.5.27) no longer exists. At b "(k¹ )\ it passes over # # # continuously to the solution x (b) of Eq. (5.5.26) and by Eq. (5.5.39), the latter persists as bPR. # Furthermore, since x (b)'x for b'b , it follows from Eq. (5.5.15) that in this range of # @ # temperatures D (b, x (b), exp(!bx (b)g )) # # "D (b, x , exp(!bx g ))!D (b, x (b), exp(!bx (b)g ))'0 . (5.5.34) @ @ # # Eqs. (5.5.26) and (5.5.27) have in general more than one solution x(b). Thus the solution x (b) K which minimizes the limiting free energy density lim f (nCL h, b), as nPR, of the system should be determined. According to Eq. (5.5.4) f ( h, b) " : lim f (nCL h(x), b) L
"b\ln z#(2n)\(bd)\
p dp
#bxg(p)cos a!bx/2)) da .
L sin a ln (1!z exp(!bp/2m (5.5.35)
The equations jf (nCL h, b) jf (nCL h, b) "0, lim z "0 (5.5.36) lim L jz jx L L L L reduce to Eq. (5.5.26) for x5x 50 and to Eq. (5.5.27) for x 'x50. The analysis of the @ @ solutions of these equations carried out above allows to follow the variation with b of the growth properties of f (h, b) as function of x: At small values of b, f (h, b) has one minimum at x "0. If K g , m, D\ are sufficiently large, two nonzero solutions x 'x are present in some range above
b and f (h, b) has a maximum at x and a second minimum at x . Again, it these constants are large enough, the minimum at x becomes deeper than the one at x"0 for b exceeding some b . In such case x "x for b3[b , b ) if x (b )(x , and x "x for b3[b ,R). The second equality K # # K # @ follows from Eq. (5.5.26) (cf. Eq. (5.5.30)) and its asymptotic form for large b'0, which in turn can be derived from Eq. (5.5.17) and the asymptotic expressions for the integrals
exp(!bp)p dp,
D
exp(!bp)p dp . (5.5.37) The first of these was dealt with in Eq. (5.4.13) which shows that its asymptotic form for large b'0 is D
exp(!bp)p dp+(2b)\D exp(!bD) .
D
The asymptotic form of the second integral in Eq. (5.5.37) can be found by the Laplace saddle point method (Erde´lyi, 1956):
D
lim 4b exp(!bp)p dp"(p . @ Thus, according to Eqs. (5.5.28) and (5.5.29), for large b'0 Eq. (5.5.26) takes the form
p m g (kbxg )\ #O(b\) 2 kb I p m \ (kbxg )\ #O(b\) "x 2 kb I m #2(2p)\ exp(!kb(xg #D/2m)) #O(b\) 2kb I and therefore
(5.5.38)
dG(x, exp(!bxg ), b, d) lim "g for x'0 D (b, x, exp(!bxg )) @ implying lim x (b)" lim x (b)"g . # K @ @
(5.5.39)
5.5.3. Phase transitions in the system The discussion in Section 5.5.2 allows to follow the thermodynamic behaviour of our gas under varying temperature. At high temperatures, x "0, so the system behaves like a collection of free K particles in the gaseous phase. If g , m, D\ are sufficiently large and b (b , x (b )(x , then the B @ gas exhibits a first order transition at b , since the derivative jf (h, b)/jb is continuous in b and x , K but due to the discontinuity of x at b , it exhibits a jump at b . Let us verify whether this K R
discontinuity corresponds to an increase or decrease of the entropy per particle s(b): jf (h, b) . s(b)"kb jb
(5.5.40)
By direct calculation, using Eqs. (5.5.35) and (5.5.36), one obtains jf (h, b) "!b\f (h, b)!b\x /2#Q(x , b, d) , K K jb
(5.5.41)
where
Q(x, b, d)"(2p)\(2bmd)\
p p dp sin a f(exp(bp/2m!bxg(p) cos a)!f)\ da ,
jQ (x, b, d)"!R(x, b, d) . jx
(5.5.42)
As noted in Section 5.5.2, R(x, b, d)'0 for sufficiently small b, D. For fixed b, D, R(x, b, d)'0 also for large enough values of the particle mass m. The inverse transition temperature b can be thus R lowered by taking sufficiently large g , m, D\, in which case, according to Eqs. (5.5.41) and (5.5.42) jf (h, b) jf (h, b) ( lim . lim jb jb > \ @@ @@ The entropy of the system thus drops discontinuously at b under increasing b, which allows to interpret the new phase (for b'b ) as liquid. The structure of the 1-particle Hamiltonian h(x ) is K consistent with such interpretation: in the new phase the prevailing motion of particles is a flow parallel to x . As in the classical case, x\x can be interpreted as a random variable with constant K K K distribution with respect to space coordinates and time. Suppose now that b (b and x (b )(x , i.e. liquefaction occurs at a temperature ¹ "(kb )\ @ higher than ¹ "(kb )\, the BE condensation temperature of an ideal gas. This can be achieved by increasing sufficiently g or D\ and keeping m constant, thereby decreasing b whilst b remains unaltered. Then up to b the Boson gas with the interaction (5.5.1) behaves in the same manner as its # classical analogue discussed in Section 5.4. Deviations from the classical behaviour appear at b5b . According to the discussion concluding Section 5.5.2, x "x '0 for b3[b ,R), so the # K # # system remains in the liquid phase at these temperatures and, by Eq. (5.5.34), there is a nonzero density of bosons occupying the ground state in this range. Thus by virtue of the continuity (5.5.33), at b"b the Boson liquid undergoes a second order phase transition and for b3[b ,R) exhibits # # BE condensation in the liquid phase. Since lim x "g as bPR, by Eq. (5.5.39), and lim D "d # as bPR, by Eq. (5.5.34), at ¹"0 all bosons condense in the ground state with 1-particle energy e "!g#g and the liquid phase achieves maximum ordering. 
The value of e agrees with the 1-particle ground state energy of the Hamiltonian HL:
n E " lim n\ (!g)"!g/2"e . 2 L
The fugacity z"f exp(bx /2)"exp(!bx g #bx /2)(1 for b3[b ,R) implying a negative # # # # value of the chemical potential k in this range of b. This agrees qualitatively with experiment: in He, k"!7.16 K at ¹"0 (Baym, 1967). Another property of the gas described by HL, which is partially reflected by measurements on superfluid He is its excitation spectrum. The latter can be inferred from the structure of h(x ). For p3P the spectrum of h(x ) consists of a band of K K breadth equal 2x g and for p , P it is the same as that of a free gas. The excitation spectrum of K superfluid He also contains a diffused band close above the phonon excitation branch and the former has a finite breadth at all values of the 1-particle momentum (see Fig. 6 of Woods and Cowley, 1970). Linearity of the boundary of this band for small p as observed in experiment could be achieved in the model with Hamiltonian HL by modifying g(p), so that lim p\g(p)"const. as pP0 and such diffused 1-particle spectrum would also result in the present approach if an interparticle repulsion of the form n\g(p )g(p )cos ( p , p ) was added to the interaction (5.5.1). The roton pairs, so characteristic of superfluid He (Ruvalds and Zawadowski, 1970; Zawadowski et al., 1972) can be also identified in the present model. Each pair of bosons with momenta p , p in a common plane with x and such that cos ( p , x )"cos ( p , x ), constitutes a roton pair, since the K K K mean occupation numbers of such bosons are equal. Apart from these satisfactory features, the Boson gas with the interaction (5.5.1) has some apparent deficiencies. In particular, lim D "d as bPR, whereas in superfluid He the density of the condensate D containing all the particles at rest satisfies lim D (d as bPR (Baym, 1967). Furthermore, the temperature ¹ "(kb )\ is higher than the temperature ¹ of BE condensa# # tion of a free gas, since D (b, x , exp(!bx g ))4D (b, 0, 1) by Eq. (5.5.16) and ¹ "(kb )\, # # where b is the solution of Eq. 
(5.5.20) (Landau and Wilde, 1979). On the other hand, BE condensation in a free gas with the same parameters (particle mass, density) as those of He occurs at ¹ "3.2 K which is somewhat higher than the j-transition temperature ¹ "2.19 K in He. H The approximation to ¹ provided by the ideal gas is therefore better than by the Hamiltonian H (5.5.2). This shortcoming of the Hamiltonian (5.5.2) could be amended by adding to the interaction (5.5.1) a repulsive 2-body potential, e.g. by replacing Eq. (5.5.1) with Eq. (5.4.18). It should be stressed, however, that bounded potentials of the form (5.4.18) cannot provide explanation of the j-type specific heat singularity observed in He as long as g (p) and g (p) are ? P bounded functions. Systems with interactions of this type are mean-field ones and renormalization group theory (Wilson and Kogut, 1974) asserts that in the vicinity of a phase transition at ¹ the specific heat c(¹)Kconst. "¹!¹ "\? with a"0 for mean-field models. The repulsive term in Eq. (5.4.18), which, in fact, represents a hard core potential, should therefore be singular or unbounded in p , p . Alternatively, one could consider a sequence of models HL"HL#n\ g (p )g (p )cos(p , p ), s"1, 2,2 Q PQ G PQ H G H GH with g approaching some unbounded function as sPR, as a possible approach to the question PQ of explaining the singularity of specific heat at ¹ in He. H A different microscopic model of He, which yields ¹ "¹ and explains why at ¹"0 K the # H density of the condensed mode in superfluid He (consisting of particles at rest) constitutes only
a fraction of the total density, is discussed in (Maćkowiak, 1989c). A system combining the structure of this model and that of a Hamiltonian of the type (5.5.2) could also be adjusted to yield $T_E = T_\lambda$ and the appropriate excitation spectrum.
These questions, related to the refinement of the n-boson system (5.5.2), are left as open problems.
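The free-gas condensation temperature quoted above for helium can be checked against the standard ideal Bose gas formula $T = (2\pi\hbar^2/mk)\,(n/\zeta(3/2))^{2/3}$. The sketch below is only an order-of-magnitude check: the He-4 atomic mass and the liquid mass density of about 145 kg/m³ are assumed inputs that do not appear in the text, and the function name is hypothetical. With these inputs the result comes out near 3 K, close to the figure quoted in the text and above the $\lambda$-transition temperature of 2.19 K.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant, J s
K_B = 1.380649e-23        # Boltzmann constant, J/K
ZETA_3_2 = 2.6123753487   # Riemann zeta(3/2)

def ideal_bose_condensation_temperature(mass_kg, number_density_m3):
    """Condensation temperature of a free (ideal) Bose gas in three dimensions."""
    prefactor = 2.0 * math.pi * HBAR**2 / (mass_kg * K_B)
    return prefactor * (number_density_m3 / ZETA_3_2) ** (2.0 / 3.0)

# Assumed inputs (not from the text): He-4 atomic mass and liquid mass density
HE4_MASS = 6.6464731e-27       # kg
HE4_MASS_DENSITY = 145.0       # kg / m^3, approximate
number_density = HE4_MASS_DENSITY / HE4_MASS

t_free = ideal_bose_condensation_temperature(HE4_MASS, number_density)
print(f"free-gas condensation temperature: {t_free:.2f} K")
```

The exact value depends on the density assumed, so the comparison with the quoted figure should be read as approximate.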
6. The Kondo effect in terms of a reduced s-d model

6.1. Introduction

An anomalous increase of resistivity in dilute magnetic alloys (DMA) (e.g., high-purity gold and copper samples with trace impurities of iron), observed at low temperatures, was first reported by Meissner and Voigt (1930). Measurements of magnetization and magnetic susceptibility in DMA made it possible to infer the presence of spin-spin type interactions between conduction electrons and localized magnetic moments of impurities, which give rise to this anomaly. Theoreticians have devoted considerable effort to explaining this effect, as well as the striking magnetic properties of DMA, but a complete theory that explains all phenomena observed in DMA in terms of transport theory and the thermodynamics of a single Hamiltonian is still lacking. One of the most commonly exploited models of DMA is the one expressed by the s-d exchange Hamiltonian (Kasuya, 1956)

$$H_{KM} = H_0 + H' = \sum_{k\sigma} \varepsilon_k a^*_{k\sigma} a_{k\sigma} + \sum_{kk'} V_{kk'} = H_0 - J \sum_{\alpha=1}^{M} \sigma(R_\alpha) \cdot S_\alpha , \qquad J < 0 ,$$

with

$$V_{kk'} = -J N^{-1} \sum_{\alpha=1}^{M} \exp[i(k-k')R_\alpha] \{ (a^*_{k'+} a_{k+} - a^*_{k'-} a_{k-}) S_{\alpha z} + a^*_{k'+} a_{k-} S_{\alpha -} + a^*_{k'-} a_{k+} S_{\alpha +} \} ,$$

describing the interaction of conduction electrons, with spin density $\sigma(R_\alpha)$ at $R_\alpha$, with M magnetic impurities localized at $R_1, \ldots, R_M$ in a nonmagnetic host metal with N atoms (e.g. Kondo, 1964; Hepp, 1970; Wilson, 1975; Andrei, 1980; Andrei and Lowenstein, 1981; Andrei et al., 1983; Filyov et al., 1981; Wiegmann, 1982). The anomalous increase of resistivity of DMA at low temperatures was first explained by Kondo (1964) in terms of $H_{KM}$. Kondo derived the temperature dependence of the resistivity $R_K(T)$ due to the interaction of current-carrying electrons with localized impurity spins by calculating the second-order Born approximation to the transition probability per unit time between two free-electron states, with $H'$ treated as a perturbation.
The resulting expression for $R_K(T)$,

$$R_K(T) = R_0 [1 - 2 J D^{-1} \ln(D/kT)] , \qquad (6.1.1)$$

where D is the half-bandwidth of mobile electrons around the Fermi level $\varepsilon_F$, has a logarithmic singularity at T = 0 K but, combined with the resistivity $R_L(T)$ due to lattice vibrations in a metal, correctly accounts for the observed resistivity minimum in DMA (Kondo, 1964) in the vicinity of
$T_{\min}$, defined by $R'(T_{\min}) = 0$, where

$$R(T) = R_K(T) + R_L(T) + R_i \qquad (6.1.2)$$

is the total resistivity, $R_L(T) = aT^5$, and $R_i$ is the temperature-independent resistivity arising from the impurity potential. In 1970 Hepp rigorously proved the absence of any such singularities for $H_{KM}$ when the number of impurities remains finite in passing to the infinite-volume limit and, in particular, the absence of phase transitions in a system with a finite number of impurities described by $H_{KM}$. Theoretical investigations have therefore centered on the finite-impurity version of $H_{KM}$ with the number of electrons $n \to \infty$. Wilson (1973) devised a numerical method of diagonalizing $H_K$ by the renormalization group technique (RGT). The resulting spectrum of $H_K$ revealed a singlet state of the impurity spin coupled with a single electron at low temperatures and made it possible to calculate the temperature dependence of magnetic susceptibility and specific heat. The next step was made in 1980 and 1981 by Andrei and Wiegmann, who independently solved the equilibrium thermodynamics of $H_K$ by applying a generalized Bethe Ansatz to $H_K$ with $\varepsilon_k$ linearized in k around the Fermi momentum $k_F$ (Andrei, 1980; Andrei and Lowenstein, 1981; Andrei et al., 1983; Filyov et al., 1981; Wiegmann, 1982). Their analysis confirmed experimental observations of different behaviour of DMA in the low-temperature region $T \ll T_K$ and the high-temperature region $T_K \ll T \ll T_D$, where D corresponds to the momentum cut-off within which linearization of $\varepsilon_k$ remains valid, the Kondo temperature $T_K$, defined by $T_K = D k^{-1} \exp(-D/2|J|)$, providing a scale separating the two regions. Expansions of various quantities in powers of $T T_K^{-1}$ also prove to be different for $T T_K^{-1} \ll 1$ and $T T_K^{-1} \gg 1$ (Caplin and Rizzuto, 1968). Magnetization and specific heat curves as functions of temperature and magnetic field obtained in the Bethe Ansatz approach showed correspondence with experiment (Andrei et al., 1983). The $H_K$ and $H_{KM}$ Hamiltonians are still frequently exploited, e.g. Sacramento and Schlottmann (1990), Ingersent et al.
(1992) and Crisan and Popoviciu (1992). In the early stage of development of the theory Anderson (1961) introduced another model which gives a more detailed account of localized moment formation and origin of the exchange interaction with conduction electrons. The Anderson Hamiltonian H contains, as usual, a term repres enting the band of conduction electrons with spectrum e , furthermore, the interaction H of the I electrons on the incomplete d or f shell of impurity atoms and, finally, a term describing the interaction between impurity atoms and conduction electrons H : ' H " e a*k ak #H #H , ' k I N N N where " ¼KNKNd* d* d d !A¸ S K N K N K N K N K N K N KGNG the first term of H representing that part of the atomic interaction which is invariant under rotations in the spin and coordinate spaces and the second term the spin—orbit coupling. H has the form ' H " » (a* d #h.c.) , ' IK IKN KN IKN * where d is the creation operator for an electron with spin p and orbital angular momentum KN component m in the unfilled shell of the impurity atom. H
The finite-impurity version of $H_A$ has been extensively studied, e.g. Clogston and Anderson (1961), de Gennes (1962), Kondo (1962), Schrieffer and Wolf (1966). In 1985 Arai derived expressions for the temperature dependence of specific heat, resistivity and susceptibility for the single-impurity $H_A$, using Baym's criterion (Baym, 1962) for the macroscopic conservation laws. His approach yields a finite zero-temperature value of the resistivity of DMA, and specific heat and susceptibility curves compatible with those of RGT and experiment.

The single-impurity theory of DMA formulated on the grounds of the Andrei-Wiegmann solution of $H_K$ (Andrei, 1980; Andrei et al., 1981; Andrei et al., 1983; Wiegmann, 1982) has also proved successful in explaining some properties of these materials, for example the behaviour of the specific heat of (LaCe)Al$_2$ in the vicinity of $T_K$ and the magnetization of (LaCe)Al$_2$ with 1.5% content of Ce. However, like Arai's theory (Arai, 1985), it does not involve the impurity concentration $c = N^{-1}M$, a parameter on which experimental measurements depend. For example, the position of the resistivity minimum of DMA is proportional to $c^{1/5}$ and its depth to $c$ (Kondo, 1964; Ashcroft and Mermin, 1976). Dependence on c is also visible in the shape of magnetization and susceptibility curves, e.g. Rizzuto (1974) and Felsch et al. (1975). It would therefore be advantageous to develop a theory of the Kondo effect along the lines initiated by Kondo in 1964 and, in passing to the infinite-volume limit, to keep the impurity concentration c constant. The complexity of the interaction in $H_{KM}$ and $H_A$ does not allow evaluation of $\lim f(H^{(n)}_{KM}, \beta)$ or $\lim f(H^{(n)}_A, \beta)$, as $n \to \infty$, under this condition. A simplified version of $H_{KM}$ or $H_A$ therefore appears to be necessary. To this end a reduction procedure was proposed for $H_{KM}$ (Maćkowiak, 1993) which imposes no restrictions on the number of impurities, but simplifies $H'$.
The resulting reduced Hamiltonian H is proved in subsequent sections to be asymptotically equivalent in the thermodynamic limit (nPR, MPR, NPR, n\M"l"const., N\M"c"const.) to a noninteracting Hamiltonian h representing an electron gas decoupled from the impurity spins. h exhibits a second-order phase transition. The spins of current carrying electrons align antiparallel to those of impurities below the transition temperature ¹ . ¹ can be thus identified as the Curie temperature of DMA in the H approximation (cf. Rizzuto, 1974; Crisan, 1992). The transition, accompanied by a discontinuity of specific heat, occurs at very low temperatures, e.g. at 0.04 K in CuCr with 51;10\% content of Cr (Tripplett and Philips, 1971), at 3.6 K in PbSnMnTe (Story et al., 1986), at 6.7 K in CePd Ga (Bauer et al., 1994). The Hamiltonians H and h prove to be insufficiently accurate at temperatures beyond the vicinity of ¹ . Specific heat, magnetization and susceptibility curves in H theory differ significantly from the experimental ones in this range (Mac´kowiak and Wisniewski, 1994). In particular, the temperature-dependent magnetization curves of (LaCe)Al alloy with 1.5% content of Ce found by Felsch et al. (1975) and successfully modelled by Andrei et al. (1983) using H theory, cannot be ) recovered by exploiting the Hamiltonian H with fixed c, because the relevant range of temper atures is far above ¹ , where H represents a free electron gas. H can thus serve only as the Hamiltonian of an unperturbed system with the perturbation equal H !H . In this manner ) a perturbation theory can be developed for the free energy of H (Mac´kowiak and Wisniewski, ) 1997a) and for the resistivity of DMA. The latter question is dealt with in Section 6.3: with h treated as the unperturbed Hamiltonian and H !H as the perturbation, the resistivity, due to )+ the interaction of current carrying electrons with localized spins, is calculated by Kondo’s method (Mac´kowiak, 1993). 
The resulting expression is finite at ¹"0 and equal R (¹) above ¹ , ) thereby extending Kondo’s theory of resistivity minimum in DMA to temperatures ¹3[0,¹ ].
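The resistivity minimum that follows from Eqs. (6.1.1) and (6.1.2) can be made concrete with a short numerical sketch. It assumes a lattice term of the form $aT^5$ and a prefactor `r0` multiplying the logarithmic term; all parameter values, the function names, and the choice of units with Boltzmann's constant set to 1 are hypothetical. Setting the derivative of the total resistivity to zero gives a minimum at $(-2 r_0 J / 5aD)^{1/5}$, which exists only for J < 0.

```python
import math

def kondo_temperature(half_bandwidth, coupling, k_boltzmann=1.0):
    """Kondo temperature scale D / k * exp(-D / (2 |J|))."""
    return half_bandwidth / k_boltzmann * math.exp(-half_bandwidth / (2.0 * abs(coupling)))

def total_resistivity(temp, r0, coupling, half_bandwidth, a_lattice, r_impurity,
                      k_boltzmann=1.0):
    """Logarithmic spin term plus an assumed a*T**5 lattice term plus a constant."""
    spin_term = r0 * (1.0 - 2.0 * coupling / half_bandwidth
                      * math.log(half_bandwidth / (k_boltzmann * temp)))
    return spin_term + a_lattice * temp**5 + r_impurity

def minimum_temperature(r0, coupling, half_bandwidth, a_lattice):
    """Zero of dR/dT = 2*r0*J/(D*T) + 5*a*T**4; requires J < 0."""
    if coupling >= 0.0:
        raise ValueError("a resistivity minimum requires J < 0")
    return (-2.0 * r0 * coupling / (5.0 * a_lattice * half_bandwidth)) ** 0.2

# Hypothetical parameters in units with k = 1 (energies in kelvin)
R0, J, D, A, R_IMP = 1.0, -500.0, 1.0e4, 1.0e-7, 0.5
t_min = minimum_temperature(R0, J, D, A)
print("T_min =", t_min, " T_K =", kondo_temperature(D, J))
```

If the prefactor `r0` is taken proportional to the impurity concentration, the one-fifth power in `minimum_temperature` reproduces the weak concentration dependence of the position of the minimum.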
Kondo was aware of the limitations of his approach and the reason of the logarithmic singularity ln ¹ in his formula for R . In particular, he emphasized validity of his theory only “above the ) temperature at which steep rise of resistivity towards low temperature is suppressed by spin ordering and below the temperature where lattice resistivity becomes predominant” (Section 3, Kondo, 1964). We now turn to a discussion of the reduction procedure for H and asymptotic behaviour of )+ the resulting reduced Hamiltonian H . 6.2. Reduction of H to H and asymptotic treatment of H )+ 6.2.1. ¹he reduced s—d Hamiltonians H and H ' As mentioned earlier, due to the complex structure of H , the limit lim f (HL , b) as )+ )+ nPR, MPR with n\M"l"const., c"const., cannot be easily evaluated. Thus if the constraint c'0, l'0 is to remain imposed, which would be advantageous for the theory, a simplified version of H , solvable under this restriction and retaining the essential physics of )+ H , would be sufficient for providing a zero-order approximation to lim f (HL , b). It appears that )+ )+ a solvable H results by simplifying H in the following manner: rejecting the sum kOk»kk in H )+ and next restricting the remaining single sum over the first Brillouin zone to a thin layer P defined $ as P "+k: k !D4k4k #D,, D;k . $ $ $ $ The first step can be justified by the fact that spin-flip processes and not momentum exchange processes account primarily for the anomalous resistivity at low temperatures (Kondo, 1964). The second restriction is a consequence of the assumed low-temperature regime. The interaction H, simplified in this manner, reduces to the form + H "!JN\ +(a*k ak !a*k ak )S #a*k ak S #a*k ak S , . > > \ \ ?X > \ ?\ \ > ?> ? k Z P$ In terms of Pauli spin operators (p , p , p ), (S , S , S ) with eigenvalues $1, H for spin impurities V W X V W X can be rewritten as 1 + H "! JN\ r(k) ) S , ? 2 ? I Z P$ where r ) S"p S #p S #p S . 
The first quantization version of H "H #H is thus V V W W X X L L + HL"AL e G#n\ cg (p )r ) S AL , $ G G ? N G G ? where AL denotes the antisymmetrizer with respect to fermion variables with indices 1,2, n,
g '0 for p3P , $ g (p)" $ 0 for p , P , $ cg "l\c"J""C'0. The system represented by HL will be called the reduced s—d model (Rs-dM). As a preliminary step in the investigation of HL we first consider a simplified version of
this Hamiltonian, viz.,
L L + HL"AL e G#n\ cg (p )p S AL . N $ G XG X? ' G G ? In order to evaluate lim f (HL, b) the technique of Tindemans and Capel (1974) and Pearce and Thompson (1975) is adapted to antisymmetric projections of n-particle operators. Similarly as in previous chapters, an upper and lower bound to f (HL, b) are derived and shown to coalesce as nPR. The following notation and assumptions are introduced: The system HL or HL is ' enclosed in a cube ¸ under periodic boundary conditions on electron states, which restrict the admissible electron momenta: k " 2p¸\(k , k , k ) with k 3Z. For constant density d"n¸\ G H of electrons and constant l"n\M, the density d equals d"k/3p, the density of mobile $ electrons with momenta k3P for a half-filled P band at ¹"0 K equals d " : pkD and the $ $ K $ density of mobile electrons per lattice site equals d"d d\"d M\¸"3D(lk )\. K Q K $ 6.2.2. Upper bound to f (HL, b) ' The Hamiltonian HL of a DMA with spin impurities acts in the Hilbert space HLH+, $ ' where H "¸(¸)C and H"C. Thus in terms of CI + defined by Eq. (2.1.2) and CL in $ Section 4.1, HL takes the form '
L L + HL "AL e G!(2n)\ g (p )p ! cS N $ H XH X? ' G H ?
L + #(2n)\ cS #(2n)\ g (p )p AL $ G XG X? G ? "nCL ¹+!(2n)\[nCL (g p )+!LMCI +(cp )] $ X $ X #(2n)\[nCL (g p )]+#(2n)\L[MCI +(cS )] , $ X $ X
(6.2.1)
and stand for the identity operators on H , H, respectively, and L,AL. The upper bound $ $ $ on f (HL, b) results by applying the Bogoliubov inequality (3.1.23) with ' H "hL(x, y, z)"nCL ¹+#(y!x)nCL (g p )+ ' $ X 1 #L(x!z)MCI +(cS )# n(x!y!z)L+ , $ X $ 2
(6.2.2)
where x3R and y, z are defined as arbitrary solutions of the equations Tr[g p ¸ exp(!bhL )] $ X L '$ , y" Tr exp(!bhL ) '$ z"!lc
Tr[S exp(!bc(x!z)S )] X X Tr exp(!bc (x!z)S ) X
(6.2.3a)
(6.2.3b)
with hL "nCL ¹#(y!z)nCL (g p ). For such a choice of H , the Bogoliubov inequality '$ $ X (3.1.23) yields f (HL, b)4f (hL, b)#n\1[nCL (g p )!nSL]2 L'$ $ X $ F ' ' #n\1[MCI +(cS )#nz+]2 +G' X F !n\1[nCL (g p )+!LMCI +(cS )!nxL+]2 L' $ X $ X $ F 4f (hL, b)#n\1[nCL (g p )!nS L]2 L'$ ' $ X $ F #n\1[MCI +(cS )#nz+]2 +G' , h+"(x!z)MCI +(cS ) . X F G' X One proves that lim n\1[nCL (g p )!nSL]2 L'$"0 , $ X $ F L lim n\1[MCI +(cS )#nz+]2 + ' "0 X F +L follows by performing similar rearrangements. As a consequence,
(6.2.4)
(6.2.5) (6.2.6)
lim f (HL, b)4 lim f (hL, b) ' ' L L " lim f (hL , b)!b\l ln [2cosh (bc(x!z))#(x!y!z)] '$ L and by Eq. (4.2.33), lim f (HL, b)4 lim +b\ln z !(bn)\ Tr ln[ #z exp(!bh )], ' L $ L '$ L L !b\l ln[2 cosh(bc(x!z))]#(x!y!z) , where z is the unique solution of the equation L n\ Tr[z o( #z o)\]"1, o"exp(!bh ) , L $ L '$ Eq. (6.2.9) assumes the limiting form
(2pd)\
dp p Tr+z\exp[be #b(y!x)g (p)p ]#1,\"1 N $ X
(6.2.7)
(6.2.8)
(6.2.9)
(6.2.10)
as nPR. An approximate solution of Eq. (6.2.10) can be found by proceeding similarly as in BCS theory, i.e., by replacing the integral over P by the product of the integrand at p with "P ""2D, $ $ $ justifying this step by the smallness of D. Eq. (6.2.10) then reduces to dl(+1#z\ exp[be #b(y!x)g ],\#+1#z\ exp[be !b(y!x)g ],\) $ $
#2(2pd)\
!P >
0
dp p[z\ exp(be )#1]\"1, P"[k !D, k #D] . N $ $
(6.2.11)
Under the same approximation, Eq. (6.2.9), for n noninteracting electrons, takes the limiting form
dl#2(2pd)\
dp p+z\ exp(be )#1,\"1 . (6.2.12) $ 0> !P The solution of Eq. (6.2.10) is therefore z"z , since for this value of z, Eq. (6.2.11) reduces to $ Eq. (6.2.12). The inequality (6.2.8) can be thus rewritten as lim f (HL, b)4f (x, y(x, b), z(x, b), b) , ' L where
(6.2.13)
f (x, y, z, b)"e !b\G ("x!y", b)!b\l ln[2 cosh(bc(x!z))]#(x!y!z) $ $ with
G (m, b)"(2pd)\ $
dp p ln(+1#z exp[bg m!be ],+1#z exp[!bg m!be ],) . $ $ N $ $ N (6.2.14)
Taking into account the asymptotic form of m\¸oL given by Lemma 4.2, we may write L L Eq. (6.2.3a) in the limit nPR, as y"n\ Tr[g p z o( #z o)\] $ X $ $ $ or, explicitly, exploiting the smallness of D, as y"dlg tanh[bg (x!y)] . As for Eq. (6.2.3b), its explicit form is z"lc tanh[bc(x!z)] .
(6.2.15)
(6.2.16a)
(6.2.16b)
Both Eqs. (6.2.16a) and (6.2.16b) result by setting
$$\frac{\partial f}{\partial y} = 0, \qquad \frac{\partial f}{\partial z} = 0$$
(6.2.17)
and the first under the additional assumption that D is sufficiently small. In Appendix D it is shown that equations of this type have unique solutions. 6.2.3. Lower bound to f (HL, b) ' A suitable lower bound to f (HL, b) obtains by first linearizing the quadratic term ' (2n)\b[nCL (g p )+!LMCI +(cS )] $ X $ X in the exponential of exp(!bHL) (cf. Eq. (6.2.1)) with use of the Gaussian integral (in the sense of ' Bochner)
$$\exp(A^2) = (2\pi)^{-1/2} \int_{\mathbb{R}} \exp\left(-\tfrac{1}{2}\xi^2 + \sqrt{2}\,A\,\xi\right) d\xi .$$
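As a quick plausibility check of the Gaussian linearization used here, the following sketch evaluates the scalar identity exp(A^2) = (2 pi)^(-1/2) * integral over R of exp(-xi^2/2 + sqrt(2) A xi) d(xi) numerically. It treats A as a real number rather than an operator, and the function name `gaussian_linearization`, the integration window and the grid size are illustrative assumptions, not part of the original derivation.

```python
import math


def gaussian_linearization(a: float, half_width: float = 12.0, steps: int = 20000) -> float:
    """Approximate (2*pi)**-0.5 * integral of exp(-xi**2/2 + sqrt(2)*a*xi) d(xi).

    The integrand is a Gaussian centred at xi = sqrt(2)*a with unit width,
    so the integration window is centred there. The trapezoidal rule on a
    uniform grid is used (an illustrative choice of quadrature).
    """
    centre = math.sqrt(2.0) * a
    lower = centre - half_width
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        xi = lower + i * h
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * math.exp(-0.5 * xi * xi + math.sqrt(2.0) * a * xi)
    return total * h / math.sqrt(2.0 * math.pi)


if __name__ == "__main__":
    for a in (-1.5, 0.0, 0.7, 2.0):
        print(f"A = {a:+.1f}: exp(A^2) = {math.exp(a * a):.10f}, "
              f"integral = {gaussian_linearization(a):.10f}")
```

The two printed columns should agree to within the quadrature error, which is the scalar content of the linearization step: a quadratic term in the exponent is traded for a linear one at the cost of an extra Gaussian integration.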
Then
f (HL, b)"!(nb)\ ln(nb/2p) '
dx Tr exp+!bnCL ¹+!nbxL+ $
0 #bx[nCL (g p )+!LMCI +(cS )] $ X $ X !b(2n)\[nCL (g p )]+!b(2n)\L[MCI +(cS )], . $ X $ X In order to linearize the remaining quadratic terms, one exploits the inequalities
[MCI +(cS )#nz*+]50 , [nCL (g p )!ny*KL]50, $ X $ X where y* and z* are the unique solutions of Eqs. (6.2.16a) and (6.2.16b), respectively. theorem (Ruelle, 1969) stating monotonicity of the function t(X)"Tr e6, applied to the Eq. (6.2.18), then yields the following bound to f (HL, b): ' nb dx exp[G (x, y*(x, b), z*(x, b), b)] , f (HL, b)5!(nb)\ ln L ' 2p 0 where
(6.2.18)
(6.2.19) Peierls’ r.h.s. of
(6.2.20)
G (x, y, z, b)"ln Tr exp(!bhL) L ' "ln Tr exp[!bhL ]#M ln[2 cosh(bc(x!z))!nb(x!y!z) . '$ By exploiting Eq. (4.2.33), one obtains exp G (x, y, z, b)" L z\L exp[Tr ln( #z o)#u(n)][2 cosh(bc(x!z))]+ exp[!nb(x!y!z)] L $ L "z\L exp[nG ("x!y",b)#u(n)][2 cosh(bc(x!z))]+ exp[!nb(x!y!z)] , (6.2.21) L $ where u(n) is such that lim n\u(n)"0 as nPR. Substitution of exp(G ) given by Eq. (6.2.21) into L Eq. (6.2.20), yields
nb Tr exp[!bHL]4 ' 2p
0
exp+n[!ln z #G ("x!y*", b) $ $
#l ln[2 cosh(bc(x!z*))]!b(x!y*!z*)#n\u(n)], dx
4max exp+n[!ln z #G ("x!y*", b)#l ln[2 cosh(bc(x!z*))] $ $ V nb !b(m\x!y*!z*)#n\u(n)], exp[!nbu(1!m\)] du 2p 0
"max exp +!nbf (x, y*(x, b), z*(x, b), b)#nbx(1!m\)#n\u(n),(1!m\)\ V (6.2.22)
for $\xi > 1$. The necessary condition for the maximum in Eq. (6.2.22) is, in the limit $n \to \infty$,
$$\frac{\partial f}{\partial x} - x\,(1 - \xi^{-1}) = 0$$
(6.2.23)
and, under the assumed smallness of D, takes the form xm\"dlg tanh[bg (x!y*)]#lc tanh[bc(x!z*)] . From Eq. (6.2.22), we get
(6.2.24)
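$$\lim_{n\to\infty} f(H_I^{n}, \beta) \ge \min_x \left\{ f(x, y^*(x,\beta), z^*(x,\beta), \beta) - \tfrac{1}{2}\,x^2 (1 - \xi^{-1}) \right\} .$$
(6.2.25)
After passing to the limit $\xi \to 1$, Eq. (6.2.25) takes the form
$$\lim_{n\to\infty} f(H_I^{n}, \beta) \ge \min_x f(x, y^*(x,\beta), z^*(x,\beta), \beta)$$
(6.2.26)
and Eq. (6.2.24) simplifies to $\partial f/\partial x = 0$, i.e. to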
x"jlg tanh[bg (x!y*)]#lc tanh[bc(x!z*)] .
(6.2.27)
6.2.4. Asymptotic equality of upper and lower bounds to f (HL, b) ' Comparison of the upper bound (6.2.13) and the lower bound (6.2.26) shows that (6.2.28) lim f (HL, b)" lim f (hL, b)"min f (x, y*(x, b), z*(x, b), b) , ' ' L L V where y*(x, b), z*(x, b) are the unique solutions of Eqs. (6.2.16a) and (6.2.16b), respectively. Combined with Eq. (6.2.27), these equations constitute the following system: x"f (x!y)#f (x!z) , (6.2.29a) y"f (x!y) , (6.2.29b) z"f (x!z) , (6.2.29c) where f (m)"dlg tanh(bg m), f (m)"lc tanh(bcm). Equations of the form (6.2.29b), (6.2.29c) are discussed in Appendix D, where it is shown that they have unique solutions y*(x, b), z*(x, b) with growth and convexity properties analogous to those of tanh(bx) and that
x
for !dlg 4x4dlg , lim y*(x, b)" dlg for x5dlg , (6.2.30) @ !dlg for x4!dlg * and lim z (x, b), as bPR, determined in the same manner, with lc replacing dlg . The system (6.2.29a)—(6.2.29c) therefore reduces to the equation x"y*(x, b)#z*(x, b) ,
(6.2.31)
which has the solution x "0 for any b50 and two nonzero solutions: x (b)'0, x (b)"!x (b) whenever b'b , b being the solution of the equation j * [y (x, b)#z*(x, b)] "1 . (6.2.32) V jx One finds b "(2(dlcg )\ . Eqs. (6.2.29a), (6.2.29b) and (6.2.29c) yield furthermore
(6.2.33)
x = y + z ,
(6.2.34a)
y"dlg tanh(bg z) , z"lc tanh(bcy)
(6.2.34b) (6.2.34c)
implying f (x , y*(x , b), z*(x , b), b)"e !G (z*(x , b), b) G G $ $ G G !b\l ln[2 cosh(bcy*(x , b)]#y*(x , b)z*(x , b), i"0, 1, 2 . G G G Let x , y , z be defined by K K K
(6.2.35)
f (x , y , z , b)"min f (x, y*(x, b), z*(x, b), b) . K K K V These minimizing values of x,y*,z* can be determined by analysing the graph of the derivative (d/dx) f (x, y* (x, b), z*(x, b), b) as functions of x and b. The graph itself can be inferred from Eq. (6.2.31). The latter shows that x "0 for b4b and x "x (b) (or x "x (b)) for b'b . Since y*, z* are unique, therefore K K K y "z "0 for b4b and y "y*(x (b), b), z "z*(x (b), b) for b'b . The phase transition at K K K K b is second order due to the evenness of f (x , y , z , b) as function of g y , cz (cf. Eq. (6.2.35)) and K K K K K discontinuity of jy /jb, jz /jb at b : K K 6d((dlC) jy lim K" , jb c(2(1#2d) @@>
jy lim K"0 , jb @@\
(6.2.36a)
12((dlC) jz , lim K" jb g(2(1#2d) @@>
jz lim K"0 , jb @@\
(6.2.36b)
Eqs. (6.2.36a) and (6.2.36b) enable calculation of the specific heat discontinuity at b . The necessary explicit expression for f (b)"l\f (x , y , z , b) can be found from Eq. (6.2.35): G K K K (6.2.37) f "!2db\ ln [cosh(bv )]!b\ ln [2 cosh bu ]#(Cl)\u v #f , K K K G K
where u "cy , v "g z and K K K K f "l\e !(lbdp)\ dp pln[1#z exp(!bp/2m)] . $ $ Using Eq. (6.2.37), one obtains for the specific heat per impurity
ju 1 1 jv jf K cosh\(bu )# b K cosh\ bv c (¹)"!¹ G "kb K G jb 2 K 2 jb j¹
ju 1 ju jv K cosh\(bu ) ! (lCu v )\ K K#b(2u )\ K K K K jb 2 jb jb
jv 1 1 1 1 # db(2v )\ K cosh\ bv # dv cosh\ bv K K K jb 2 2 K 2b 2
#b\u cosh\(bv ) #c , K K
(6.2.38)
where c is the specific heat of the free electron gas. The resulting discontinuity of c at ¹ equals G 1 ju 1 jv K# d K "12kd(1#2d)\ . (6.2.39) Dc " lim kb G 2 jb 2 jb > @@ A discontinuity of c has been observed in CuCr with 51;10\% content of Cr at around 0.04 K G (Tripplett and Philips, 1971), in the semi-magnetic semiconductor PbSnMnTe at 3.6 K (Story et al., 1986) and in CePd Ga at 6.7 K (Bauer et al., 1994). The electron structure of these materials is described by the Hamiltonian H . The low-temperature phase is magnetically ordered, similarly )+ as in the case of h , where y*(x (b), b)'0 and z*(x (b), b)'0 according to Eqs. (6.2.29b), (6.2.29c) ' and x 'y*(x , b)'0, x 'z*(x , b)'0 according to Eq. (6.2.31). The form of h in Eq. (6.2.2) ' thus shows that the spins of mobile electrons tend to align antiparallel to those of the magnetic impurities below ¹ "(kb )\.
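The bifurcation described in this subsection can be illustrated numerically. The sketch below solves the reduced system x = y + z, y = f1(z), z = f2(y) with f_i(s) = a_i tanh(beta c_i s). The amplitudes a1, a2 and slopes c1, c2 are placeholder constants standing in for the coefficient combinations of Eqs. (6.2.34b) and (6.2.34c), and the function names are hypothetical. Under these assumptions a nonzero solution exists only when the slope of f1(f2(y)) at the origin exceeds one, i.e. for beta > (a1 c1 a2 c2)^(-1/2).

```python
import math


def critical_beta(a1: float, c1: float, a2: float, c2: float) -> float:
    """Inverse temperature at which the slope of f1(f2(y)) at y = 0 equals 1."""
    return 1.0 / math.sqrt(a1 * c1 * a2 * c2)


def mean_field_solution(beta, a1, c1, a2, c2, tol=1e-13):
    """Return the non-negative solution (x, y, z) of
        y = a1*tanh(beta*c1*z),  z = a2*tanh(beta*c2*y),  x = y + z.
    Below the critical beta only the trivial solution exists.
    """
    def f1(s):
        return a1 * math.tanh(beta * c1 * s)

    def f2(s):
        return a2 * math.tanh(beta * c2 * s)

    def g(y):
        # g is concave for y > 0, with g(0) = 0 and g(a1) < 0 because tanh < 1.
        return f1(f2(y)) - y

    if beta <= critical_beta(a1, c1, a2, c2):
        return 0.0, 0.0, 0.0

    # Bracket the positive root: halve lo until g(lo) > 0, keep g(hi) <= 0.
    lo = hi = a1
    while g(lo) <= 0.0:
        hi = lo
        lo *= 0.5
        if lo < 1e-12:  # numerically indistinguishable from the trivial solution
            return 0.0, 0.0, 0.0

    # Plain bisection on the bracket [lo, hi].
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    y = 0.5 * (lo + hi)
    z = f2(y)
    return y + z, y, z


if __name__ == "__main__":
    params = (0.5, 1.0, 1.0, 2.0)  # illustrative a1, c1, a2, c2
    print("critical beta:", critical_beta(*params))
    for beta in (0.8, 1.05, 1.5, 3.0):
        print(beta, mean_field_solution(beta, *params))
```

Bisection is used instead of plain iteration because the concavity of f1(f2(y)) guarantees a single sign change of g on (0, a1], so the bracket cannot miss the root. The order parameters vanish continuously as beta approaches the critical value from above, which is the numerical signature of the second-order transition discussed in the text.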
6.2.5. Mean-field description of the system H The thermodynamic limit of free energy per electron for HL can be examined in a similar manner, with some generalizations as in Sections 6.2.2, 6.2.3 and 6.2.4 in the case of HL. According ' to Eq. (6.2.3a) and Eq. (6.2.3b), HL"nCL ¹+!(2n)\ [nCL (g p )+!LMCI +(cS )] $ S $ S S (6.2.40) #(2n)\ +[nCL (g p )]+#L[MCI +(cS )], $ S $ S S with summation over u"x, y, z. The appropriate upper bound to f (HL, b) follows from the Bogoliubov inequality (3.1.23) with H "hL(x, y, z)"nCL ¹+#nCL ((y!x) ) g r)+ $ 1 #LMCI +((x#z) ) cS)# n(x!y!z)L+ , $ $ 2
(6.2.41)
where x3R and y, z are arbitrary solutions of the equations y"1CL (g r)2 L$, hL"nCL (¹#(y!x) ) g r) , $ $ $ F z"1CI +(cS)2 + , h+"MCI +((x#z) ) cS) . F G Similarly as in Section 6.2.2, one obtains
(6.2.42a) (6.2.42b)
f (HL, b)4f (hL, b)#n\ 1[nCL (g p )!ny L]2 L$ $ S S $ F S #n\ 1[MCI +(cS )!nz +]2 + . S S F S The two averages on the r.h.s. of Eq. (6.2.43) vanish as nPR, yielding lim f (HL, b)4 lim f (hL, b) L L "e !b\G ("x!y", b)!b\l ln[2 cosh(bc"x#z")]#(x!y!z) $ $ ,f (x, y(x, b), z(x, b), b) .
(6.2.43)
(6.2.44)
By Lemma 4.2, for sufficiently small D, Eq. (6.2.42a) takes the asymptotic form
x!y 1 y" dlg tanh bg "x!y" "x!y" 2
(6.2.45a)
as nPR, and the explicit form of Eq. (6.2.42b) is x#z lc tanh(bc"x#z") . z"! "x#z"
(6.2.45b)
Both equations result by setting
$$\frac{\partial f}{\partial \mathbf{y}} = 0, \qquad \frac{\partial f}{\partial \mathbf{z}} = 0 .$$
(6.2.46)
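Equations of the form (6.2.45a) determine a vector y for a given vector x. As a minimal sketch, assuming a generic profile f(r) = a tanh(b r) with placeholder constants a, b > 0 (not the paper's coefficients), the equation y = f(|x - y|) (x - y)/|x - y| can be solved by a damped fixed-point iteration. The helper names and the damping factor 1/(1 + a b) are illustrative choices. With this damping the iteration map is a contraction, so its fixed point is unique and, by symmetry, parallel to x.

```python
import math


def norm(v):
    """Euclidean length of a vector given as a list of floats."""
    return math.sqrt(sum(c * c for c in v))


def radial_map(v, a, b):
    """F(v) = (v/|v|) * a*tanh(b*|v|), with F(0) = 0."""
    r = norm(v)
    if r == 0.0:
        return [0.0] * len(v)
    scale = a * math.tanh(b * r) / r
    return [scale * c for c in v]


def solve_vector_equation(x, a, b, iterations=400):
    """Solve y = F(x - y) by damped fixed-point iteration.

    The Jacobian of F is symmetric with eigenvalues in (0, a*b], so the
    damped map y -> (1 - t)*y + t*F(x - y) with t = 1/(1 + a*b) has
    Lipschitz constant at most a*b/(1 + a*b) < 1.
    """
    t = 1.0 / (1.0 + a * b)
    y = [0.0] * len(x)
    for _ in range(iterations):
        target = radial_map([xc - yc for xc, yc in zip(x, y)], a, b)
        y = [(1.0 - t) * yc + t * tc for yc, tc in zip(y, target)]
    return y


if __name__ == "__main__":
    x = [0.3, 0.4, 1.2]
    y = solve_vector_equation(x, a=1.0, b=2.0)
    print("y =", y)
    print("|y| =", norm(y), " scalar check:", math.tanh(2.0 * (norm(x) - norm(y))))
```

Because the solution is parallel to x, its length satisfies the scalar equation |y| = f(|x| - |y|), which is how the vector system collapses onto the scalar one treated earlier.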
In order to derive an appropriate lower bound to f (HL, b) the technique of Section 6.2.3 must be generalized. The ordinary notation, without use of Kummer’s mapping CL , will prove more convenient:
L L + ZL"Tr AL exp !b ¹ !(2n)\ g (p )p ! cS G $ G SG S? G S G ? L + AL . (6.2.47) #(2n)\ g ( p )p #(2n)\ cS $ G SG S? S G S ? Since the trace on the r.h.s. can be represented by a uniformly convergent series, Trotter’s product formula
A B e\\ "s!lim exp ! exp ! m m K
K
for operators A, B semibounded from below, can be applied in the following manner:
1 L L + ¹ !(2n)\ g (p )p ! cS ZL"Tr AL exp !b G $ G SG S? 3 G G ? S 1 L + # n\ g (p )p #(2n)\ cS AL $ G SG S? 2 G ? !b L b L + ¹# g (p )p ! cS " lim Tr AL exp G $ G SG S? 3m 2mn G G ? K S b L b + K ! g ( p )p ! cS AL . (6.2.48) $ G SG S? 2mn 2mn G ? Similarly as in Section 6.2.3, one linearizes the first quadratic term in the exponents by a Gaussian integral:
nb K K !b L Tr AL dx exp ¹ ZL" lim QS G 2pm 3m 0K Q S G K L + L b g (p )p #m\bx g (p )p ! cS ! $ G SG S? $ G SG QS 2mn G ? G + nb b cS ! x AL (6.2.49) ! S? 2m QS 2mn ? and the remaining two quadratic terms by exploiting a generalization of Peierls’ theorem for ZL (Appendix E). This yields an upper bound on ZL:
n K K K nb ZL4 lim dx exp ! x Re Tr B , R Q R 2pm 2m 0K R K Q where x "(x , x , x ), y , z are some functions of x , x , x and R RV RW RX QS QS QV QW QX b b B "AL exp ! ¹ # (x !y ) g ( p )p G m QS QS $ H SH Q 3m G H SXWV nb b ! c(x #z ) S # (y #z ) AL. QS S? 2m QS QS m QS ? The real part of Tr( B ) can be further bounded using Ho¨lder’s inequality Q 1 K K Tr B 4 "Tr(B*B )K"K . Q Q 1 Q Q The operator B*B has the form B*B "oLqB+, where Q Q Q Q Q Q o (i)"o (i)o (i)o (i)o (i)o (i) , Q QV QW QX QW QV q (a)"q (a)q (a)q (a)q (a)q (a) Q QV QW QX QW QV
(6.2.50)
(6.2.51)
(6.2.52a) (6.2.52b)
and
b b o (i)"exp ! ¹ # (x !y )p g (p ) , QS QS SG $ G 3m G m QS
bc b q (a)"exp ! (x #z )S # (y #z ) . QS QS S? 2ml QS QS m QS The explicit expressions for the operators o (i), q (a) can be found by direct calculation. With Q Q notation suppressing dependence on i, a, they can be written in the form
o "exp[!2b¹/m]a (1#t)#2p t (1#t)(1#t) Q Q S VV W X S #2p t (1#t)(1!t)#2p t (1!t)(1!t), , WW X V XX V W where
(6.2.53a)
a " cosh[bg (p)(x !y )/m] , Q $ QS QS S t "tanh[bg (p)(x !y )/m] S $ QS QS and q "exp[b(ml)\(y#z)]d + (1#r)#2S r (1#r)(1#r) Q Q Q Q S VV W X S #2S r (1#r)(1!r)#2S r (1!r)(1!r), WW X V XX V W
(6.2.53b)
with d "cosh[bc(x #z )/m] , Q QS QS r "tanh[bc(x #z )/m] . S QS QS Defining a "a (1#t) , Q Q S S b "2 (1#t)\(t (1#t)(1#t), t (1#t)(1!t), t (1!t)(1!t)) , Q S V W X W X V X V W S and analogously d , e in terms of d , r , r , r , formulae (6.2.53a) and (6.2.53b) can be rewritten as Q Q Q V W X o "exp[!2b¹/m]a (1#b ) r) , (6.2.54a) Q Q Q q "exp[b(y#z)/ml]d (1#e ) S) . (6.2.54b) Q Q Q Q Q Bounds on a and d can be obtained using the inequality exp(3x)5coshx#sinhx, which yields Q Q 14a 4exp(3bg(p)"x !y "m\) , (6.2.55a) Q $ Q Q 14d 4exp(3bc"x #z "m\) . (6.2.55b) Q Q Q
Bounds on "b ", "e " are implied by the inequality tanhx4x: Q Q b ""b "4bg (p)"x !y "m\ , Q Q $ Q Q e ""e "4bc"x #z "m\ . Q Q Q Q A further bound on the r.h.s. of Eq. (6.2.51) can be obtained by noting that (oLqB+)K"(oK)L(qK)B+ Q Q Q Q and that according to Eq. (4.2.33)
(6.2.56a) (6.2.56b)
(6.2.57)
Tr(oK)L"z\L exp+Tr[ln( #z oK)]#u(n, m), , (6.2.58) Q LQ $ LQ Q where, due to the convergence of oK as mPR, lim n\ u(n, m)"0 as nPR, lim m\u(n, m)"0 Q as mPR. As for z , it is the unique solution of the equation LQ n\ Tr[z oK( #z oK)\]"1 . (6.2.59) LQ Q $ LQ Q After substituting o given by Eq. (6.2.54a) into this equation, it assumes the form Q n\ +[1#z\ exp(b¹)a\K(1#b )\K]\#[1#z\exp(b¹)a\K(1!b )\K]\, LQ Q Q LQ Q Q p P Z $ (6.2.60) #2n\ [1#z\ exp(b¹)]\"1 . LQ p P A $ By virtue of Eqs. (6.2.55a) and (6.2.56a), lim a "1 and lim b "0 as mPR. Under the assumed Q Q smallness of D'0, exp(b¹(p)) can be replaced for p3P by z , implying that Eq. (6.2.60) has the $ $ asymptotic solution z "z as mPR. LQ $ The trace of ln( #z oK) thus equals $ LQ Q Tr[ln( #z oK)]"n(2pd)\ dp p ln+[1#z aK(1#b )K exp(!b¹(p))] $ LQ Q $ Q Q ;[1#z aK(1!b )K exp[!b¹(p))], . (6.2.61) $ Q Q Using the bound (6.2.55a) and the inequalities (1$b )K4exp($b m/2), we get Q Q b3 Tr[ln( #z oK)]4n(2pd)\ dp p ln 1#z exp !b¹# g"x !y "#b m/2 $ Q Q $ LQ Q 2m $ Q b3 1 ; 1#z exp !b¹# g"x !y "! b m . (6.2.62) $ Q 2m $ Q 2 Q
Furthermore, since the function u(x)"ln[(1#ceV)(1#ce\V)] is increasing in x for c'0, the bound (6.2.56a) allows to extend the inequality (6.2.62) to Tr(ln( #z oK)]4nG ("x !y ", b) , $ LQ Q $K Q Q
(6.2.63)
where
G (m, b)"(2pd)\ $K
dp p ln
3 1#z exp bgm#bg m!b¹ $ $ 2m $
3 ; 1#z exp bgm!bg m!b¹ $ $ 2m $
.
(6.2.64)
Analogously, for (Tr qK)+K one obtains Q (Tr qK)+K4exp+3Mbc"x #z "/2m Q Q Q #(M/m)ln[2 cosh(bc"x#z")]#nb(y#z)/2m, . Q Q Combining Eqs. (6.2.50), (6.2.51), (6.2.58), (6.2.63) and (6.2.65), one obtains
(6.2.65)
nb K K nb dx exp ! [d !(mm)\]x ) x ZL4 lim R PQ P Q 2pm 2m 0K R PQ K n nb ;exp ! x ! [ln z !G ("x !y ", b) Q $ $K Q Q m 2mm Q Q
!l ln[2 cosh(bc"x#z")]!3lbc"x#z"/2m!b(y#z)] , Q Q
(6.2.66)
where, in order to carry out the next step, the term nb( x )/2mm has been added and subtracted Q Q in the exponent of the integrand. As in Section 3.1, the second exponential function in the integrand on the r.h.s. of Eq. (6.2.66) is now replaced by its maximum with respect to +x ,: R n bn ZL4 lim max exp ! x ! ln z !G ("x !y ", b) Q $ $K Q Q m 2mm +x , K R Q Q b !l ln[2 cosh(bc"x #z ")]!3lbc"x #z "/2m! (y#z) Q Q Q Q Q 2 Q
bn K 2pm
K bn dx exp ! [d !(mm)\]x ) x . PQ P Q R 2m 0K R PQ The necessary condition for this maximum is ;
(6.2.67)
j !b(mm)\ x # +G ("x !y ",b)#l ln[2 cosh(bc"x #z ")]#3lbc"x #z "/2m, QS jx $K N N N N N N NS Q j 1 j jy NT# # G ("x !y ", b)# by +l ln[2 cosh(bc"x #z ")] $K N N N N jy 2 N jx jz NT NT NS T T jz (6.2.68) #3lbc"x #z "/2m#bz, NT "0 N N N jx NS
for arbitrary p, u. For large m and y , z adjusted so that the derivatives with respect to y , z in NT NT NT NT Eq. (6.2.68) vanish, i.e., taking into account lim G "G $K $ K and imposing the conditions
(6.2.69)
1 j G ("x !y ", b)# by "0 , $ N N 2 N jy NS 1 j l ln[2 cosh(bc"x #z ")]# bz "0 , N N 2 N jz NS one arrives at the following asymptotic form of Eq. (6.2.68) in the limit mPR:
(6.2.70a) (6.2.70b)
x #z j NS tanh(bc"x #z ")"0 . G ("x !y ", b)#lbc NS (6.2.71) b(mm)\ x ! N N N QS jx $ N "x #z " N N NS Q The form of Eqs. (6.2.71), (6.2.70a) and (6.2.70b) shows that their solutions x , y , z are independent N N N of p and, therefore, under the assumed smallness of D, they simplify to x !y x #z Sf ("x!y")# S Sf ("x#z") , m\x " S S "x!y" "x#z"
(6.2.72a)
x !y Sf ("x!y") , y" S S "x!y"
(6.2.72b)
x #z Sf ("x#z") . z "! S S "x#z"
(6.2.72c)
The integration in Eq. (6.2.67) can be performed, provided m'1, in which case one obtains, similarly as in Eq. (3.1.11),
n ZL4max exp !b x!n ln z #nG ("x!y", b)#nl ln[2 cosh(bc"x#z")] $ $ 2m x
#nb(y#z) (1!m\)\ .
(6.2.73)
For the free energy per electron, on passing to the limit nPR and subsequently mP1, one obtains lim f (HL, b)5min f (x, y(x, b), z(x, b), b) , x L where x, y(x, b), z(x, b) are solutions of the equations x#z x!y f ("x!y")# f ("x#z") , x" "x#z" "x!y"
(6.2.74)
(6.2.75a)
x!y y" f ("x!y") , "x!y"
(6.2.75b)
x#z z"! f ("x#z") . "x#z"
(6.2.75c)
By comparing the upper bound (6.2.44) and the lower bound (6.2.74), we see that lim f (HL, b)" lim f (hL, b)"min f (x, y(x, b), z(x, b), b) . (6.2.76) x L L The system HL is thus asymptotically equivalent to the noninteracting gas with the Hamiltonian hL, similarly as HL is asymptotically equivalent to hL. Further correspondence between HL ' ' ' and HL can be deduced from Eqs. (6.2.75a), (6.2.75b) and (6.2.75c). These equations have the solution x "y "z "0 for any b50. The nonzero solution, whenever it exists, satisfies x ) y""x""y""xy'0 according to Eq. (6.2.75b) and x ) z"!"x""z""!xz(0 according to Eq. (6.2.75c). Furthermore, Eq. (6.2.75a) yields x"y!z, implying x'y, x'z. As a consequence, Eqs. (6.2.75a), (6.2.75b) and (6.2.75c) reduce to Eqs. (6.2.29a), (6.2.29b) and (6.2.29c) and (6.2.77) lim f (HL, b)" lim f (HL, b)"min f (x, y*(x, b), z*(x, b), b) . ' L L V The thermodynamic properties of HL and HL are thus identical: HL is asymptotically equivalent ' to hL and hL and exhibits at ¹ a second-order phase transition, with a discontinuity of specific ' heat given by Eq. (6.2.39), which is accompanied below ¹ by antiparallel alignment of the spins of mobile electrons and those of impurities (as can be seen from the structure of hL). The single impurity Kondo Hamiltonian has a similar property: the spin of the ground state is equal S/2! (Mattis, 1967; Wilson, 1975; Cragg and Lloyd, 1979). The presence of a second-order phase transition in the phase diagram of hL suggests that the full Kondo Hamiltonian H also exhibits a transition in the infinite-volume limit. This conjecture is )+ supported by evidence of such transitions in PbSnMnTe (Story et al., 1986) and CePd Ga (Bauer et al., 1994) as well as by maxima in the zero-field susceptibility diagrams of (La Ce )Al alloys \V V discovered by Felsch et al. (1975). 
These maxima were found to be compatible with those resulting from the thermodynamics of $H^{n}$ in the presence of an external magnetic field (Maćkowiak et al., 1994). The model described by the Hamiltonian $H^{n}$ was extended (ERs-dM) by adding an antiferromagnetic equivalent-neighbour interaction between impurity spins, which made it possible to simulate the magnetic properties of (La$_{1-x}$Ce$_x$)Al$_2$ and CePd$_2$Ga$_3$ (Maćkowiak et al., 1995; Wiśniewski et al., 1997). The assumption that the impurity spin is $S = 1/2$ is not essential in the proof of the asymptotic equivalence of H and h expressed by Eq. (6.2.76). A similar proof can be given for higher impurity spins as well as for the ERs-dM (Maćkowiak et al., 1997b).

6.3. The resistivity due to interaction of conducting electrons with localized impurity spins

The asymptotic equivalence of the Hamiltonians H and h expressed by Eq. (6.2.76) suggests a modification of Kondo's theory of the resistivity minimum in DMA (Kondo, 1964). The crucial part
of Kondo’s calculation consists in deriving an expression for the transition probability between two free-electron states, with the s—d exchange H treated as a perturbation. Since the Hamiltonian h is asymptotically equivalent to H , which includes part of H, it appears natural to replace the free-electron states in Kondo’s theory by eigenstates of h and the perturbation H by »"H !H . The perturbation H yields, in fact, the same resistivity as », as it leads to integrals )+ which differ only on a set of zero measure from those resulting from ». ¹ proves to be lower than ¹ (see Mac´kowiak et al., 1994, 1995) and as Kondo’s theory agrees very well with experiment in
the vicinity of ¹ (Kondo, 1964, Mattis, 1988), such modification of his theory would retain
validity above ¹ , where h "H , and, simultaneously, extend his resistivity formula to temper atures below ¹ . The transition probability per unit time ¼(aPb) from the initial state a to the final state b, up to second-order Born approximation, equals ¼(aPb)"2pd(E !E )[» » # » (E !E )\] , ? @ ?@ @? ?@A ? A A$?
(6.3.1)
where » "» » » #c.c. The electron eigenstates of h are the same as those of H , but the ?@A ?A A@ @? electron eigenvalues e of h differ from e for k3P : IN I $ k e " #pg (k)z . $ K IN 2m Let us proceed in the same manner as Kondo (1964) and first calculate the sum in Eq. (6.3.1) A$? for a process in which the electron in the state "k#2 is scattered to the final state "k#2. (Here and in the sequel all correlations between pairs of impurity spins will be neglected and, following Kondo, all multiple sums over these spins will be replaced by single sums over diagonal terms. The reason for this approximation is that the position and depth of the resistivity minimum in DMA depends on impurity concentration c at most linearly, whereas multiple sums yield contributions proportional to higher powers of c (Kondo, 1964).) According to the type of intermediate state, four types of these processes can be distinguished: (1) "2q#2k#22P"2q#2q#22P"2q#2k#22: d(E !E ) ? @ " »k q »q k »k k (1!f )(e !e )\d(e !e )#c.c. , » ?@A E !E > Y> Y> Y> Y> > OY> I> OY> I> IY> aq ? A A$? where f "z exp(!be )[1#z exp(!be )]\. ON $ ON $ ON (2) "2q#2k#22P"2k#2k#22P"2k#2q#22: d(E !E ) ? @ "! »q k »k q »k k f (e !e )\d(e !e )#c.c. , » ?@A E !E > Y> > > Y> > O> O> IY> I> IY> q ? A A$? Y the minus sign on the r.h.s. being due to the opposite signs of final states resulting in these two processes.
(3) "q!2k#2M 2P"q!2q!2M #12P"q!2k#2M 2: ? ? ? d(E !E ) ? @ » ?@A E !E ? A A$? » (1!f )(e !e !2cy )\d(e !e )#c.c. " »k » > +? qY\ +a# qY\+?> IY> +? kY> k> OY\ I> OY\ K I> IY> ?qY (4) "2q!2k#, M 2P"2k#2k#, M !12P"2k#2q!, M 2: ? ? ? d(E !E ) ? @ " »q » » » \ +? kY>+?\ k> +?\ q\ +? kY>k> ?@A E !E ? A ?q A$? ;f (e !e #2cy )\d(e !e )#c.c. O O\ IY> K I> IY> The final states of processes (3) and (4) also differ by sign. Following Kondo (1964) and adding the above four contributions, all terms not involving fq being discarded as weakly dependent on e , one obtains N IN d(E !E ) ? @ "!2("J"N\) M (S!M )(S#M #1) » ? ? ? ?@A E !E ? A ? A$? ; f (e !e !2cy )\d(e !e )!2("J"N\) M (S#M )(S!M #1) OY\ I> OY\ K I> IY> ? ? ? q Y ? ; f (e !e #2cy )\d(e !e ) O\ O\ IY> K I> IY> q "4(JN\) M f (e !e #2cy )\d(e !e ) I> K I> IY> ? q O\ O\ ? 4 " (JN\)S(S#1)cN f (e !e #2cy )\d(e !e ) , O\ O\ I> K I> I> 3 q where M "$ is the y component of the ath localized spin. Adding to the above expressions K ? contributions from the first-order Born approximation, one obtains for the transition probability from "k#2 to "k#2: ¼(k#Pk#)"pJc(2N)\[1#4Jg (e !2cy )]d(e !e ) , \ I> K I> IY> where g (x)"N\ q f (e !x)\. N ON ON Similarly, one obtains
(6.3.2)
¼(k!Pk!)"pJc(2N)\[1#4Jg (ek #2cy )]d(ek !ek ) . (6.3.3) > \ K \ Y\ Consider now the processes "k#, M 2P"k!, M #12. Again, according to the type of inter? ? mediate state, four types of these processes can be distinguished:
(1) "2q!2k#, M 2P"2q!2q!, M #12P"2q!2k!, M #12: ? ? ? d(E !E ) ? A » ?@A E !E ? A A$?
d(e !e !2cy ) »q »k " »k IY\ K q k k ) I> ? Y\ +?> ?> Y\+?> ?> >+? (1!f >+ Y\+ Y\+ OY\ e !e !2cy q Y I> OY\ K d(e !e !2cy ) IY\ K . "2(JN\) (M #1)(S!M )(S#M #1)(1!f ) I> ? ? ? OY\ e !e !2cy q I> OY\ K Y (2) "2q!2k#, M 2P"2k!2k#, M 2P"2k!2q!, M #12: ? ? ? d(E !E ) ? A » ?@A E !E ? A A$?
d(e !e !2cy ) I> IY\ K »k » f q k ? ? ? ? \+ Y\+ I> + \+ > Y\+ > >+ O\ e !e #2cy q O\ I> K d(e !e !2cy ) I> IY\ K . "!2(JN\) M (S!M )(S#M #1)f ? ? ? O\ e !e #2cy q O\ I> K Y (3) "2q#2k#, M 2P"2q#2q#, M 2P"q#2k!, M #12: ? ? ? d(E !E ) ? @ » ?@A E !E ? @ A$? d(e !e !2cy ) IY\ K »k " »k »q (1!f ) I> q k k ? Y> +? ? Y\+?> ?> >+? > + Y>+ Y\+ OY> e !e q I> OY> Y d(e !e !2cy ) IY\ K . "!2(JN\) M (S!M )(S#M #1)(1!f ) I> ? ? ? OY> e !e q I> OY> Y (4) "2q#2k#, M 2P"2k!2k#, M #12P"2k!2q#, M #12: ? ? ? "! »q
?
k
?
d(E !E ) ? @ » ?@A E !E ? A A$?
d(e !e !2cy ) I> IY\ K f »k »k q k ? IY\ +?> ?> > +?> ?> > +? O> > + > + Y\ + e !e q O> I> d(e !e !2cy ) I> IY\ K . "2(JN\) (M #1)(S!M )(S#M #1) f ? ? ? O> e !e O> I> O Adding these contributions, as well as those from the first-order Born approximation, one obtains " »q
¼(k#, M Pk!, M #1)"2pJN\(S!M )(S#M #1)[1#2Jg (e !2cy ) ? ? ? ? \ I> K #2Jg (e )]d(e !e !2cy ) . > I> I> IY\ K
Analogously, ¼(k!, M Pk#, M !1)"2pJN\(S#M )(S!M #1)[1#2Jg (e ) ? ? ? ? \ I\ #2Jg (e #2cy )]d(e !e #2cy ) . > I\ K I\ IY> K Performing summation over the localized spins, we get, after substitution S", ¼(k#Pk!)" ¼(k#, M Pk!, M #1) ? ? ? "pJcN\[1#2Jg (e !2cy )#2Jg (e )]d(e !e !2cy ) , \ I> K I> IY\ K > I> (6.3.4) ¼(k!Pk#)" ¼(k!, M Pk#, M !1) ? ? ? "pJcN\[1#2Jg (e )#2Jg (e #2cy )]d(e !e #2cy ) . \ I\ > I\ K I\ IY> K (6.3.5) The relaxation times of return to the equilibrium distributions f , f equal I> I\ q "+¼(k$Pk$)#¼(k$Pk$),\ . (6.3.6) I! In the sequel, values of the g functions on the interval (e !k¹, e #k¹), with k¹;e , are only ! $ $ $ relevant; therefore 3c lim N\ d(e !e )" lim N\ d(e !e $2cy )" (6.3.7) I! IY! I8 IY! K 4le k k $ L Y L Y since e +e . Furthermore, the assumption that the interaction in H is weak, i.e. 0(C;e , I! $ )+ $ implies, according to Eqs. (6.2.34b) and (6.2.34c) that g z ;e , cy ;e . As a consequence, K $ K $ expressions of the form e $g z $cy will be replaced by e , as already done in Eq. (6.3.7). $ K K $ Summation over k in Eq. (6.3.6) thus yields q\"3pcJ(8le )\[3#8Jg (e !2cy )#4Jg (e )] , I> $ \ I> K > I> q\"3pcJ(8le )\[3#8Jg (e #2cy )#4Jg (e )] . I\ $ > I\ K \ I\ The resistivity is defined as (Dugdale, 1977)
(6.3.8) (6.3.9)
\ df , q v I dk I I de I where q is the relaxation time of return to the equilibrium distribution f given by I I f "z e\@CI (1#z e\@CI )\ I $ $ and V "km\. Since f +( f #f ), we may put q "(q #q ). Substituting q , q given I I I> I\ I I> I\ I> I\ by Eqs. (6.3.8) and (6.3.9) and retaining only the lowest-order expansion terms of o in powers of g , g , we get > \ o "(o #o ), (6.3.10) > \ o "!12pe\
where o ,o are the contributions from q , q respectively, > \ I> I\ gp cJm 4J df o " 1! [2g (e !2cy )#2g (e )](e I de , (6.3.11) > \ I> K > I> I I 8 ele d de 3(e $ I $ gp cJm df 4J o " 1! [2g (e #2cy )#2g (e )](e I de . (6.3.12) \ > I\ K \ I\ I I 8 ele d de 3(e $ I $ Discrepancies between resistivity given by Eq. (6.3.10) and Kondo’s resistivity formula (6.1.1) arise below ¹ and can be expected to be most significant close to 0 K, where deviations of f , f from I> I\ f are largest. Calculation of the limit lim o as ¹P0, confirms this conjecture: Using the integral I formulae derived in Appendix F, one obtains for lC'D (high concentrations c)
9p cJm Jc 4e "D#2lC(1!d)" 4e $ $ lim o " 6!2 ln !ln #1 , > 8 ele d le "D#lC(1!2d)"D#2lC(d!1)" "lC#D" $ $ 2 (6.3.13)
4e 4e 9p cJm Jc $ $ 6!2 ln !ln #1 , lim o " \ "D#lC(2d!1)" "lC!D" 8 ele d e l $ $ 2 and for lC(D (low concentrations c),
(6.3.14)
9p cJm Jc 4e "D#2lC(1!d)" 4e D $ $ lim o " 6!2 ln !ln #1 , > 8 ele d e l lC"D#lC(1!2d)"1!2d" lC"lC#D" $ $ 2 (6.3.15) 4e "D#2lC(d!1)" 4e D 9p cJm Jc $ $ 6!2 ln !ln #1 lim o " \ lC"D#lC(2d!1)"2d!1" lC"lC#D" 8 ele d e l $ $ 2 (6.3.16) which, unlike Kondo’s formula (6.1.1), yields, a finite value of resistivity at 0 K (in all cases considered D#lC(2d!1)O0, 2d!1(0; Mac´kowiak et al., 1994). For temperatures above ¹ both formulae coincide: o (¹)"o (¹) for ¹'¹ . (6.3.17) I According to Eq. (6.3.17), compatibility of the resistivity formula (6.3.10) for ¹'¹ with Kondo’s theory of resistivity minimum in DMA (Kondo, 1964) would require ¹ (¹ , where ¹ is the
temperature at which the total resistivity R(¹) of DMA assumes minimum and in the vicinity of which Kondo’s theory is valid. Such a comparison of ¹ and ¹ for (LaCe)Al alloys (Mac´kowiak
et al., 1994, 1995) shows that ¹ (¹ , confirming compatibility of Kondo’s theory with its
extension developed in this section. Appendix A. Proof of Lemma 3.4 For x, y3RJ, let p(t, x, y)"(2pt)\J exp(!"x!y"/2t)
denote the Green function of the differential equation jt(t, x) "Dt(t, x), (t, x)3(0,R);RJ. jt p(t, x, y) satisfies the normalization lim p(t, x, y)"d(x!y) , R
(A.1)
lim p(t, x, y)rJ\"(2pt)\d(h !h ) for x, y3SJ\ , (A.2) V W P P where (r, h ), (r, h ) are the spherical coordinates of x, y. Now let 0(t(b, x, z3SJ\, f : RQ JPC V W P and let u3C (X ) be defined in terms of f by f (u(s)) for s3[0, b]. Consider the limit @
p(t, x, y) f (y)p(b!t, y, z) dy rJ\ lim k@ (u)rJ\"lim VX P P 0J t r ( (h )) W J (h ,1) dh "lim rJ\ dr p , (h ), W W V W W r r J\ J W 1 P b!t ;f (r , h )p , r (h ), r (h ) , W W X r W W
where
(A.3)
x"r (h )"(r (h ),2, r (h )) , V V J V y"r (h )"(r (h ),2, r (h )) . W W W W W J W Direct calculation shows that for any g(h )"cos(nh ), sin(nh ), the asymptotic form of the W W W J (h , r ) dh dr p(t, r (h ), r (h ))g(h )rI, in the function h(r, h ) defined by the integral J\ V W W W W V 1 " J W W W W limit rPR, is exactly the same as that of J\ J (h , r) dh dr p(t, r (h ), r (h ))g(h )rI. One 1 " J W W W V W W W may thus use r, r interchangeably in the integrand f in Eq. (A.3) when estimating this limit. That W such replacements are admissible can be also seen from the behaviour of the sequence of functions p(t/r, (h ), r (h )/r) for x3SJ\, y3RJ, in the limit rPR. This sequence behaves asymptotically V W W P like a sequence of Dirac delta functions d( (h )!r (h )/r). One may thus write V W W exp[!(r!r )(2t)\] W lim k@ (u)rJ\"lim VX (2pt) J\ P P "1
exp[!r( (h )! (h ))(2t)\] exp[!(r!r )(2(b!t))\] V W W ; f (r, h ) W (2ptr\)J\ (2p(b!t)) exp[!r( (h )! (h ))(2(b!t))\] X W ; dr J (h , r) dh W J W W (2p(b!t)r\)J\
"(2pb)\lim d( (h )! (h )) f (r, h )d( (h )! (h ))J (h ,1) dh V W W W X J W W P 1J\
"(2pb)\lim rJ\ G (tr\, h , h ) f (r, h ) G ((b!t)r\, h , h )J (h , r) dh V W W P W X J W W J\ P 1 P (u )rJ\ . "lim (2pb)\k@P FVFX P P Thus, (u ) for x, z3SJ\ . (A.4) lim k@ (u)"(2pb)\lim kP@ VX FVFX P P P P For f : (RJ)K P C the proof is analogous. Now let u3C (X ), t3C(X ) and u"XP@"u 3C (X ), P P@ @ @ t"XP@"t 3C(X ) and sup X@"u(u)!t(u)"(e. P P@ SZ Then for x, z3SJ\, P lim "kP@ (u )!kP@ (t )"4 lim eG (br\h , h )"0 , FVFX P FVFX P P V X P P
(A.5)
lim "k@ (u)!k@ (t)"4lim ep(b, x, z)"0 . VX VX P P Making use of Eqs. (A.4), (A.5) and (A.6), one can now easily prove that for x, z3SJ\, P lim (2pb)k@ (t)"lim kP@ (t ) . VX FVFX P P P Appendix B. Growth property of Z
(A.6)
QED
NK$
The measure kNK@ is not nonnegative, contrary to kNK@ used in Section 3.1.1. It is convenient to 4$ 4 write Z K as a limit of Gaussian integrals: N $
bn KI Z K " lim dm eK (m ,2, m ) , KI N $ I? K 2pm KI 0 I? K where
b L K dkNK@(u ,2,u ) exp 4$ L m XKL @ G I L kb b K J t u ! A G m 2mn I A G
eK (m ,2, m )" K KI
I kb 1 ! m #m u u I? I? ? G m 2 ? .
As mPR, eK approaches a functional EK[m ,2,m ] defined on m 3¸[0, b], a"1,2, k, of the K I ? form @ @ 1 EK[m ,2, m ]" dkNK@(u ,2, u ) exp ! n m(s) ds# m (s)u (u (s)) ds ? ? ? G I 4$ L 2 XKL @ G? ? @ 1 t (u (s)) ds . ! A G 2n A G EK can be written as I@K\ b 1 EK[m ,2,m ]" lim dkNK@(u ,2,u ) exp ! n mY# m u (u (s)) ds I 4$ L I? ? G m I? 2 XKL \ @ I\@K K ?I G?I I@K\ 1 t (u (s)) ds , ! A G 2n AI I\@K\ G
where m "m (s ), s 3[(k!1)b/m, kb/m), or, alternatively, in the form I? ? I I K bn EK[m ,2,m ]" lim Tr B exp ! mY , I I I? 2m K I ? where
b b b B "exp ! nCL (¹#»)# n CL (m u K)! (nCL t K) , I I? ? A m m 2nm ? A bn KI bn Tr B exp ! mY dm . Z K " lim I I? N $ 2pm 2m I? 0KI I ? K Writing the integral kernel of B as an integral with respect to the conditional Wiener measure: I
I@K\ L dkNG K@K (u ) exp V WPLG G XKL @K G L H? I\@K\ \ \ I@K I@K 1 ;m u (u (s)) ds! »(u (s)) ds! t (u (s)) ds I? ? H H A H 2n H I\@K\ A I\@K\ H
B (x ,2, x , y ,2, y )"(n!)\ (!1)EPL I L L P
and integrating the product B with respect to exp(!(bn/2m)mY) dm , we get I I I? I? I? bn KI B (x ,2, x , y ,2, y ) e\@LKmY dm lim I L L I? I? 2pm I 0I ? K I@K\ K m dkNG K@K (u ) exp ds u (u (s)) " lim (n!)\ (!1)CPL P V W LG G ? H 2bn XKL P @K G L I\@K\ H ? K I I@K\ I@K\ 1 ! »(u (s)) ds! t (u (s)) ds H A H 2n H I\@K\ A I\@K\ H
I@K\ K 1 " lim (n!)\ (!1)KPL dkNG K@K (u ) exp u (u (s)) ds P LG V G ? H W 2n XKL P @K G L K I ? I\@K\ H \ \ I@K I@K 1 ! »(u (s)) ds! t (u (s)) ds H A H 2n H I\@K\ A I\@K\ H b b "exp !bnCL (¹#»)# (nCL u K)! (nCL t K) (x ,2, x ,y ,2, y ) ? A L L 2n 2n ? A which shows that the operator
bn KI bn B exp ! m dm lim I I? 2pm 2m I? I 0I ? K is positive. In the same way, one finds that
bn KI d K bn Tr B exp ! m dm lim I I? 2pm dB KI 2m I? Q 0 I ? K bn KI I bn bn " lim B exp ! m dm exp ! m dm 50 I I? QB 2pm 2m I? 2m QB 0KI I$Q B K ? proving that Z K is increasing in each operator B . This property of Z K allows to exploit N $ I N $ equality (3.1.9) in the same manner as in Section 3.1.1.
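The step of writing the partition function as a limit of Gaussian integrals rests on a Gaussian linearization identity. The following is only a scalar sketch of that idea, not the operator-valued, many-variable formula of this appendix: it numerically checks that sqrt(a/2π) ∫ exp(−aξ²/2 + uξ) dξ = exp(u²/(2a)). The names `a`, `u`, `xi` and the function `gaussian_linearization` are assumptions introduced for illustration.

```python
import math


def gaussian_linearization(u, a, half_width=12.0, steps=20001):
    """Trapezoid estimate of sqrt(a/2pi) * integral of exp(-a*xi^2/2 + u*xi) d xi.

    Illustrative scalar analogue only: u is a plain number here, whereas the
    appendix deals with path functionals and operators.
    """
    # The integrand is a Gaussian centred at u/a with standard deviation 1/sqrt(a),
    # so integrating over +/- half_width standard deviations captures it fully.
    center = u / a
    sigma = 1.0 / math.sqrt(a)
    lo = center - half_width * sigma
    hi = center + half_width * sigma
    h = (hi - lo) / (steps - 1)
    total = 0.0
    for i in range(steps):
        xi = lo + i * h
        weight = 0.5 if i in (0, steps - 1) else 1.0  # trapezoid end-point weights
        total += weight * math.exp(-0.5 * a * xi * xi + u * xi)
    return math.sqrt(a / (2.0 * math.pi)) * h * total


if __name__ == "__main__":
    for u, a in ((0.7, 2.0), (-1.3, 0.5), (3.0, 1.0)):
        print(u, a, gaussian_linearization(u, a), math.exp(u * u / (2.0 * a)))
```

The quadratic term u² on the right-hand side is traded for a term linear in u under a Gaussian weight, which is what allows a quadratic interaction to be handled as an average over auxiliary variables.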
Appendix C. Asymptotic form of the (n, 2)-contraction of o KL The proof of asymptotic formulae for the free energy density of n-fermion systems with a 2-body potential of the form (3.1.1) and (3.1.2) requires knowledge of the limiting form of the (n, 2)¸f\oL is, e.g. exploited in Section 4.3.2. contraction of o K L. The formula for lim L L LK K To simplify notation, we shall write oO+pO whenever two q-particle density matrices oO, pO defined on H K O, satisfy lim Tr(BO Tr(BO K oO)" lim K pO) K K L L for any bounded operators BO 2b , where lim #b #(R as K on HK O of the form BO K "b K OK GK "K"PR for i"1,2, q. Let us also denote qO(oK) " : f\ ¸O o L for q"0, 1, , n. 2 L LK L K For finite n52, the (n,2)-contraction of o K L is given by the following formula derived by Pruski (1978):
n \ L\ L\I oI oHK(!1)I>H . ¸o f K L" L L\I\HK K 2 I H Proof. The proof can be carried out by induction. For n"2, Eq. (C.1) takes the form "f KoKoK"oKoK, f K " : 1, ¸o L K
(C.1)
and is obviously valid. Suppose now that Eq. (C.1) holds for ¸o O with q"2,2, n. Then, O K L> in the form expressing the integral kernel of ¸ o L> K ¸ oK L>(x , x , x , x )" L> oK(x , x )oK(x , x )oK(x , x )2 oK(x , x ) L> oK(x , x )oK(x , x )oK(x , x )2 oK(x , x ) L> 1 det oK(x , x )oK(x , x )oK(x , x )2 oK(x , x ) dx 2dx L> L> (n#1)! KL\ $ $
oK(x
, x )o (x , x )o (x , x )2 oK(x , x ) L> K L> K L> L> L>
and expanding the determinant of the integrand with respect to the first row, we get L>(x , x , x , x )"(n#1)\oK(x , x )(¸o L)(x , x )!(n#1)\oK(x , x )(¸o L)(x , x ) ¸ o L K L K L> K oK(x , x )oK(x , x )oK(x , x )2 oK(x , x ) L> oK(x , x )oK(x , x )oK(x , x )2 oK(x , x ) n!1 L> # oK(x , x ) det dx 2dx . L> (n#1)! KL\ $ $
oK(x , x )oK(x , x )oK(x , x )2 oK(x , x ) L> L> L> L> L>
Performing a Laplace expansion with respect to the first row of the determinant in the last term, one obtains L>)(x , x , x , x )" (¸ o L> K L)(x , x )!(n#1)\oK(x , x )(¸o L)(x , x ) (n#1)\oK(x , x )(¸o L K L K n!1 # oK(x , x ) oK(x , x )(¸ )o L\(x , x ) dx K L\ K n(n#1)
!oK(x , x ) oK(x , x )(¸ o L\)(x , x ) dx K L\ K
#(n!2) oK(x , x )oK(x , x )(¸ o L\)(x , x , x , x ) dx dx . L\ K K
(C.2)
By Eq. (4.2.9), the first two terms in the curly brackets on the r.h.s. of Eq. (C.2) equal 1 +oK(x , x )[f o (x , x )!n(¸o L)(x , x )] L\K K L K n(n#1) o (x , x )!n(¸o L)(x , x )], !oK(x , x )[f L\K K L K and the third term in the curly brackets on the r.h.s. of Eq. (C.2) equals (n!1)(n!2) ((oKoK)(¸ o L\))(x , x , x , x ) . L\ K n(n#1)
(C.3)
(C.4)
Furthermore, since (pq)(x , x , x , x )"+p(x , x )q(x , x )#p(x , x )q(x , x ) !p(x , x )q(x , x )!p(x , x )q(x , x ), , therefore in terms of operator and Grassmann products Eq. (C.2) takes the form 4 2 ¸ o L>" oK¸o f K L! KoKoK L> K L n#1 n(n#1) L\ (n!1)(n!2) # (oKoK)(¸ o L\) . L\ K n(n#1)
(C.5)
given by Eqs. (C.1) and (C.5) can be easily verified using Compatibility of the expressions for ¸o K Eq. (4.2.10). Substituting into the r.h.s. of Eq. (C.5) the expansion of ¸o L given by Eq. (4.2.10) and L K n!1 \ L\ L\\I f oI oHK(!1)I>H , ¸ o K L\" L\\I\HK K L\ 2 I H one obtains
n#1 \ L 2 (!1)I>f oI oK!f o oK L\IK K L\K K 2 I L\ L\\I # f oI>oH> (!1)I>H K L\\I\HK K I H n#1 \ L L (!1)I>f " oI oK KoI KoK# (!1)I>f L\I L\IK K 2 I I L\ L\I # f oI>oHK(!1)I>H> L\I\HK K I H n#1 \ L L\ (!1)I>f " oI>oK KoI KoK# (!1)If L\I L\I\K K 2 I I L\ L\I # f oI>oHK(!1)I>H> L\I\HK K I H n#1 \ L L\ L\I " oI>oHK(!1)I>H> (!1)H>f K oHKoK# f L\I\H K K L\H 2 I H H n#1 \ L L>\I f " oI oHK(!1)I>H L>\I\HK K 2 I H which proves validity of Eq. (C.1) for q"n#1. ¸oL can be deduced from Eq. (C.1), viz., The asymptotic formula for f\ LK L K ¸ o L>" L> K
Lemma C.1. ¸qL(oK)+q(oK)q(oK)+2q(oK)q(oK). L L L L L L Proof. Let us rearrange the expansion Eq. (C.1): L\ L\I L I\ (!1)I>Hf KoI KoHK" (!1)If K oGKoI\G K L\I\H L\I I H I G I\ L\ L\ oGKoI\G "! (!1)If oI oK!A(KoK) (!1)If K A L\I\K K L\I\K G I I n!1 "(n!1) (¸ o ¸ o L\ , (C.6) K L\)oK!A(KoK) L\ L\ K 2
the second and third equality being implied by the commutativity I\ I\ oGKoI\G . K A"A oGKoI\G K G G , Eq. (C.6) may be rewritten in the form In terms of qG(oK), i"1, 2, and s L\K L n n!1 q(oK)"(n!1)n\s A(Ks o )q (o ) n\ KoKq (oK)!n\ L L\ L\ L\K K L\ K 2 2
(C.7) and since by Eq. (C.1), similarly as in Eq. (4.2.20)
n\
n
q(oK)+(n#1)\ 2 L
n#1 2
q (oK) , L>
(C.8)
therefore
n\
n
n q(oK)+n(n#1)\s KoKq(oK)!n\ A(KoKs K)q(oK) L L L L 2 L 2
(C.9)
which implies
n\
n
q(oK)(K(K#s KoK))A+n\q(oK)s KoK . L L L 2 L
Let us now write n\(L )q (o ) in the form L L n q(oK)"q(oK)q(oK)#R . n\ L L L 2 L
(C.10)
(C.11)
Then by Eq. (4.2.22)
n\
n
q(oK)+n\s KoK(K#s KoK)\s KoK(K#s KoK)\#R . L L L L L 2 L
(C.12)
Substituting Eq. (C.12) into Eq. (C.10), we get R(K(K#oKs K))A#n\s KoK(K#s KoK)\s KoK L L L L L +n\s KoK(#s KoK)\s KoK . (C.13) L L L Since the sequence s K is convergent and +oK, is bounded as "K"PR, therefore Eq. (C.13) implies L R+0 and, as a consequence L q(oK)+q(oK)q(oK) . (C.14) L L L By direct calculation, one finds 2 Tr((g ))!Tr(qg qg) . (C.15) K g K )qq)"(Tr(qg L L L K L K L K Furthermore, since ¸ is order-preserving, Eq. (4.2.9) yields L q(oK)4n\ . (C.16) L (Density matrices of the form ¸.oL are said to be n-representable. Coleman (1963), who introduced L the notion of n-representability, proved that a 1-fermion density matrix o is n-representable if and only if o4n\. Eq. (C.16) exemplifies his theorem.) Thus, qg)"4 lim n\#g lim "Tr(qg K # Tr q" lim n\"g K #"0 . L L K L K L L L Eqs. (C.15) and (C.17) show that
(C.17)
2q(oK)q(oK)+q(oK)q(oK) L L L L which combined with Eq. (C.14) proves the lemma. QED
(C.18)
The asymptotic form of the (n, 2)-contraction of oR K L for an n-boson system can be found by L : The (n, 2)-contraction of oR a method similar to the one applied above to ¸o K L equals L K n \ L\ L\I f ¸oR L" oI RoHK (C.19) L\I\HK K L K 2 I H and for large n
1 R f\ K ¸oK L+p(oK)Rp(oK) . L L 2 L L
(C.20)
In particular, lim f\ Tr(¸(oR L)BB K LK L K K L " lim (Tr B K p(oK))# lim Tr(B K p(oK)B K p(oK)) . L L L L L
(C.21)
Since for B K "K the trace Tr(B K p(oK))"1, it follows that in order that normalization L R ¸ o L )"1 be preserved, by the r.h.s. of Eq. (C.21). Tr(f\ K K L L lim Tr(p(oK))"0 , L L and therefore, necessarily,
(C.22)
lim Tr(B K p(oK)B K p(oK))"0 L L L for any uniformly bounded sequence +B K ,. Thus, ¸oRL+p(oK)p(oK)+2p(oK)Rp(oK) . (C.23) f\ L L L L LK L K Eq. (C.23) allows to evaluate the upper bound to lim f (HLK , b) as n,"K"PR for an n-boson N @ system with the potential (3.1.1), (3.1.2), in the same manner as for n fermions in Section 4.3.2.
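Appendix D. The solutions of Eqs. (6.2.29b) and (6.2.29c)

Lemma. Given an equation of the form

$$y = f(x - y, \beta), \qquad \beta \ge 0, \quad x \in \mathbb{R}, \tag{D.1}$$

where $f$ has the following properties:
(a) $f(0, \beta) = f(x, 0) = 0$, $f(x) = -f(-x)$, $|f(x, \beta)| < \infty$,
(b) $(\partial/\partial x) f(x, \beta) > 0$, $(\partial/\partial \beta) f(x, \beta) > 0$ for $x > 0$,
(c) $(\partial^2/\partial x^2) f(x, \beta) < 0$ for $x > 0$,
then there exists a unique solution $y^*(x, \beta)$ satisfying this equation, with the same properties (a)–(c).

Proof. Uniqueness of the solution $y^*(x, \beta)$ and the property (a) can be deduced from the graphical representation of Eq. (D.1). With $y^*(x, \beta)$ substituted for $y$, Eq. (D.1) becomes an identity

$$y^*(x, \beta) = f(x - y^*(x, \beta), \beta). \tag{D.2}$$

(A) Differentiation of Eq. (D.2) yields

$$\frac{\partial y^*}{\partial x} = f'(x - y^*) - f'(x - y^*)\,\frac{\partial y^*}{\partial x},$$

whence

$$\frac{\partial y^*}{\partial x} = f'(x - y^*)\,[1 + f'(x - y^*)]^{-1} > 0. \tag{D.3}$$

(B) For $x > 0$, $y^*(x, \beta) > 0$, therefore by virtue of (b),

$$x - y^*(x, \beta) > 0 \quad \text{for } x > 0. \tag{D.4}$$

(C) Differentiation of Eq. (D.3) yields

$$\frac{\partial^2 y^*}{\partial x^2} = -f'(x - y^*)\,\frac{\partial^2 y^*}{\partial x^2} + \left(1 - \frac{\partial y^*}{\partial x}\right)^{2} f''(x - y^*),$$

whence by Eq. (D.4)

$$\frac{\partial^2 y^*}{\partial x^2} = \left(1 - \frac{\partial y^*}{\partial x}\right)^{2} [1 + f'(x - y^*)]^{-1} f''(x - y^*) < 0 \quad \text{for } x > 0.$$

(D) Differentiation of Eq. (D.2) with respect to $\beta$ yields

$$\frac{\partial y^*}{\partial \beta} = -f'(x - y^*)\,\frac{\partial y^*}{\partial \beta} + \frac{\partial f}{\partial \beta},$$

whence

$$\frac{\partial y^*}{\partial \beta} = [1 + f'(x - y^*)]^{-1}\,\frac{\partial f}{\partial \beta} > 0 \quad \text{for } x > 0.$$

Let

$$F(x, \beta) = d \tanh(\beta g x).$$

According to the lemma, the equation $y = F(x - y, \beta)$ has a unique solution $y^*(x, \beta)$ with the properties (a)–(c) and

$$\lim_{\beta \to \infty} y^*(x, \beta) = \begin{cases} x & \text{for } -d \le x \le d, \\ d & \text{for } x \ge d, \\ -d & \text{for } x \le -d. \end{cases}$$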
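The lemma can be illustrated numerically. The sketch below solves y = F(x − y, β) with F(x, β) = d tanh(βgx) by bisection and compares the result at large β with the piecewise-linear limit stated above. Bisection is a choice made here for illustration, not the method of the proof. The values d = g = 1 and β = 200 are arbitrary assumptions, and the symbols d and g are taken as they appear in the extracted text.

```python
import math


def F(x, beta, d=1.0, g=1.0):
    """Model function F(x, beta) = d * tanh(beta * g * x)."""
    return d * math.tanh(beta * g * x)


def solve_y(x, beta, d=1.0, g=1.0, tol=1e-12):
    """Solve y = F(x - y, beta) for y by bisection on [-d, d].

    h(y) = y - F(x - y, beta) is strictly increasing in y because dF/dx >= 0,
    and h(-d) <= 0 <= h(d) because |F| <= d, so the root is unique.
    """
    lo, hi = -d, d
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid - F(x - mid, beta, d, g) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def limit_solution(x, d=1.0):
    """Large-beta limit of y*: x clipped to the interval [-d, d]."""
    return max(-d, min(d, x))


if __name__ == "__main__":
    for x in (-2.0, -0.5, 0.0, 0.3, 1.5):
        print(x, solve_y(x, beta=200.0), limit_solution(x))
```

At finite β the computed solution stays slightly inside the limiting curve; the difference is largest near x = ±d, where the limit has a kink.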
Appendix E. Growth property of ZL[C ,2,C ] V +X ZL given by Eq. (6.2.49) can be written in the form ZL" lim ZL , K K
where
K dx D (x )C , RS S RS RS 0K R S nb b L + nb ALexp x g ( p )p ! cS ! D (x )" x !bm\t AL , $ G SG S? S RS 2pm m RS 2mn RS G ? b b b ! #bm\t AL . C "ALexp ! ¹ ! g (p )p cS RS G 2mn $ G SG S? 2mn 3m G G ? Obviously, ZL"Tr K
K dx D (x )C "exp(!bHL) s! lim RS S RS RS P K 0K R S is a positive operator. Therefore, the operator
(E.1)
d Tr dx D (x )C RS S RS RS dC 0K R QT S K Q\ " (dx D(x )C )D (x ) dx dx D (x )C dx D (x )C RU U RU RU OU U OU OU QS QS QS T QT QT K 0 RQ> U S$T O U (E.2)
is also positive, up to additive terms vanishing as mPR or multiplicative terms approaching 1 as mPR, because
d Tr dx D (x )C "exp(!bHL) s!lim RS S RS RS P dC K 0 R QT S according to Eqs. (E.1) and (E.2). As a consequence, lim ZL(C )4lim ZL(C ) if C 4C . K K RS K K RS RS RS Appendix F. Evaluation of the integrals (6.3.11) and (6.3.12) Evaluation of the integrals appearing in Eqs. (6.3.11) and (6.3.12) in the limit ¹P0 is straightforward. Using the equality
C C df (ef (e)(e!x)\ de"f (e)G(e, x)"C! G(e,x) de , C de C C
where G(e, x)"(e(e!x)\ de, and the approximation
!(2k¹)\ df (e) " de 0
for e3(e !k¹, e #k¹) , $ $ for e , (e !k¹, e #k¹) , $ $
one obtains for lC'D
C$>I2 df g (e #a)(e I de " lim g (e #a)(e (2k¹)\ de !lim > I\ I de I I I > I\ I 2 2 C$\I2
(e de 3c 3c C$\" " 2(e !D) " $ e!(e !lC#a) 4e l 4e l $ $ $
"(e !lC#a)#(e !D)" $ !(e !lC#a) ln $ $ "a#D!lC"
3c 4e $ K 2!ln , for "a", D, lC;e ; $ "a#D!lC" 4(e l $ 3c C$\" (e de df ! lim g (e #a)(e I de " I de I 4e l > I> e!(e #lC#a) $ I $ 2
3c "(e #lC#a)#(e !D)" $ " 2(e !D)!(e #lC#a) ln $ $ $ 4e l "lC#a#D" $ 3c 4e $ K 2!ln for "a", D, lC;e ; $ "lC#a#D" 4(e l $ ? C$\" C$>" (e de 3c (e de df ! lim g (e #a)(e I de " # I de I 4e l \ I\ e!(e !lC#a) (e!(e #a) $ I $ $ C$\" 2
3c "(e !lC#a)#(e !D)" $ " 2(e #D)!(e !lC#a) ln $ $ $ 4e l "D#a!lC" $ "(e #a)#(e #lC)""a#D" $ !(e #a) ln $ $ "(e #a)#(e !D)""a!D" $ $ 3c 4e "a#D" $ K 2!ln for "a", D, lC;e ; $ "D#a!lC""a!D" 4(e l $ C$\" C$>" (e de (e de df 3c g (e #a)(e I de " # !lim I de I 4e l \ I> e!(e #lC#a) e!(e #2lC#a) I $ $ $ C$\" 2
3c "(e #lC#a)#(e !D)" $ " 2(e #D)!(e #lC#a) ln $ $ $ 4e l "D#a#lC" $ "(e #a#2lC)#(e #D)""a#2lC#D" $ !(e #a#2lC) ln $ $ "(e #a#2lC)#(e !D)""a#2lC!D" $ $ 3c 4e "a#2lC#D" $ K 2!ln for "a", D, lC;e ; $ "D#a#lC""a#2lC!D" 4(e l $
and for lC(D,
df C$>I2 !lim g (e #a)(e I de " lim g (e #a)(e (2k¹)\ de > I\ I de I > I\ I I $ I 2 2 C \I2
(e de C$\" (e de C$\JC 3c # " e!(e !lC#a) e!(e !2lC#a) 4e l C$\" $ $ $ 3c "(e !lC#a)#(e !D)" $ " 2(e !lC!(e !lC#a ln $ $ $ 4e l "D!lC#a" $ "(e !2lC#a)#(e !lC)""D!2lC#a" $ $ ! (e !2lC#a ln $ "(e !2lC#a)#(e !D)""lC!a" $ $ 3c 4e "D!2lC#a" $ + 2!ln !ln for "a", D, lC;e ; $ "D!lC#a" "lC#a" 4(e l $ df g (e #a)(e I de !lim > I> I de I I 2
C$\" (e de C$\JC (e de 3c # " e!(e #lC#a) e!(e #a) 4e l C$\" $ $ $ 3c "(e #lC#a)#(e !D)" $ " 2(e !lC!(e #lC#a ln $ $ $ 4e l "D#lC#a" $ "D#a" "(e #a)#(e !lC)" $ $ !(e #a ln $ "lC#a" "(e #a)#(e !D)" $ $ 3c 4e "D#a" $ + 2!ln for "a", D, lC;e ; $ "D#lC#a" "lC#a" 4(e l $ df !lim g (e #a)(e I de \ I\ I de I I 2 C$\" (e de C$>JC (e de 3c # " e!(e !lC#a) e!(e #a) 4e l C$\" $ $ $ 3c "(e !lC#a)#(e #D)" $ " 2(e #lC!(e !lC#a ln $ $ $ 4e l "D!lC#a" $ "D#a" "(e #a)#(e #lC)" $ $ !(e #a ln $ "lC!a" "(e #a)#(e !D)" $ $ 3c 4e "D#a" $ + 2!ln for "a", D, lC;e ; $ "D!lC#a""lC!a" 4(e l $
df lim g (e #a)(e I de \ I> I de I I 2 (e de 3c C$\" (e de C$>JC " # 4e l e!(e #lC#a) e!(e #2lC#a) C$\" $ $ $ 3c "(e #lC#a)#(e !D)" $ " 2(e #lC!(e !lC#a ln $ $ $ 4e l "D#lC#a" $ "D#a#2lC" "(e #lC#a)#(e #lC)" $ $ !(e #2lC#a ln $ "lC#a" "(e #2lC#a)#(e !D)" $ $ 3c 4e "D#a#2lC" $ + 2!ln for "a", D, lC;e . $ "D#lC#a""lC#a" 4(e l $
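The evaluations in this appendix replace the derivative of the Fermi function by a constant, −(2kT)^(−1), on the window (ε_F − kT, ε_F + kT) and by zero elsewhere. The sketch below is a rough numerical check that, for a smooth weight, this box approximation and the exact derivative give the same low-temperature limit, namely minus the weight at ε_F. The weight √ε, the values ε_F = 1 and kT = 0.01, and all function names are assumptions chosen for illustration.

```python
import math


def fermi_derivative(e, e_f, kT):
    """Exact derivative df/de of f(e) = 1 / (exp((e - e_f)/kT) + 1)."""
    z = (e - e_f) / (2.0 * kT)
    return -1.0 / (4.0 * kT * math.cosh(z) ** 2)


def box_derivative(e, e_f, kT):
    """Box approximation: -(2kT)^-1 inside (e_f - kT, e_f + kT), zero outside."""
    return -1.0 / (2.0 * kT) if abs(e - e_f) < kT else 0.0


def integrate(func, lo, hi, steps=40000):
    """Midpoint-rule quadrature of func over [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))


def weighted(deriv, g, e_f, kT, width=40.0):
    """Integral of g(e) * deriv(e) over e_f +/- width*kT."""
    lo, hi = e_f - width * kT, e_f + width * kT
    return integrate(lambda e: g(e) * deriv(e, e_f, kT), lo, hi)


if __name__ == "__main__":
    e_f, kT = 1.0, 0.01
    print("exact:", weighted(fermi_derivative, math.sqrt, e_f, kT))
    print("box:  ", weighted(box_derivative, math.sqrt, e_f, kT))
```

Both integrals differ from −√ε_F only by corrections of order (kT)², so the two agree as T → 0.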
References
Anderson, P.W., 1961. Phys. Rev. 124, 41.
Andrei, N., 1980. Phys. Rev. Lett. 45, 379.
Andrei, N., Lowenstein, J.H., 1981. Phys. Rev. Lett. 46, 356.
Andrei, N., Furuya, K., Lowenstein, J.H., 1983. Rev. Mod. Phys. 55, 331.
Angelescu, N., Nenciu, G., 1973. Commun. Math. Phys. 29, 15.
Angelescu, N., Verbeure, A., Zagrebnov, V.A., 1992. J. Phys. A 25, 3473.
Angelescu, N., Pulvirenti, M., Teta, A., 1994. J. Stat. Phys. 74, 167.
Arai, T., 1985. J. Appl. Phys. 57, 3161.
Ashcroft, N.W., Mermin, N.D., 1976. Solid State Physics. Saunders, New York, London, Sydney.
Bauer, E., Schaudy, G., Hilscher, G., Keller, L., Fischer, D., Dönni, A., 1994. Z. Phys. B 94, 359.
Baumgartner, B., Narnhofer, H., Thirring, W., 1983. Ann. Phys. N.Y. 150, 373.
Baym, G., 1962. Phys. Rev. 127, 1391.
Baym, G., 1967. The microscopic description of superfluidity. In: Clark, R.C., Derrick, G.H. (Eds.), Mathematical Methods in Solid State and Superfluid Theory. Scottish Universities Summer School. Oliver and Boyd, Edinburgh.
Blattner, R.J., 1973. Quantization and representation theory. In: Harmonic Analysis on Homogeneous Spaces. Proc. Symp. Pure Math. A.M.S., Providence, R.I., vol. 26, pp. 147–165.
Bogoliubov, N., 1947. J. Phys. USSR 11, 23.
Boublik, T., Nezbeda, I., Hlavaty, K., 1980. Statistical Thermodynamics of Simple Liquids and their Mixtures. Academia, Prague.
Bratteli, O., Robinson, D.W., 1981. Operator Algebras and Quantum Statistical Mechanics, Vol. II. Springer, New York.
Bricmont, J., Kuroda, K., Lebowitz, J.L., 1985. Commun. Math. Phys. 101, 501.
Buffet, E., Pulé, J.V., 1983. J. Math. Phys. 24, 1608.
Cannon, J.T., 1973. Commun. Math. Phys. 29, 89.
Capel, H.W., Tindemans, P.A.J., 1974. Rep. Math. Phys. 6, 225.
Caplin, A.D., Rizzuto, C., 1968. Phys. Rev. Lett. 21, 746.
Clogston, A.M., Anderson, P.W., 1961. Bull. Am. Phys. Soc. 6, 124.
Coleman, A.J., 1963. Rev. Mod. Phys. 35, 668.
Cragg, D., Lloyd, P., 1979. J. Phys. C 12, L215.
Crisan, M., Popoviciu, P., 1992. Mod. Phys. Lett. 6, 1329.
Davies, E.B., 1972. Commun. Math. Phys. 28, 69.
de Gennes, P.G., 1962. J. Phys. Paris 23, 510.
Dugdale, J.S., 1977. The Electrical Properties of Metals and Alloys. Edward Arnold, London.
Dunford, N., Schwartz, J., 1963. Linear Operators. Wiley, New York.
Erdélyi, A., 1956. Asymptotic Expansions. Dover, London.
Fannes, M., Spohn, H., Verbeure, A., 1980. J. Math. Phys. 21, 355.
Fannes, M., Verbeure, A., 1980. J. Math. Phys. 21, 1809.
Fannes, M., Verbeure, A., Pulé, J.V., 1982. Helv. Phys. Acta 55, 391.
Felsch, W., Winzer, K., Minnigerode, G.V., 1975. Z. Phys. B 21, 151.
Feynman, R.P., Hibbs, A.R., 1965. Quantum Mechanics and Path Integrals. McGraw-Hill, New York.
Filyov, V.M., Tsvelik, A.M., Wiegmann, P.B., 1981. Phys. Lett. A 81, 115.
Fisher, M.E., 1964. Arch. Rational Mech. Appl. 17, 377.
Fliessenbach, T., 1991. Nuovo Cimento 13D, 211.
Fuller, W., Lenard, A., 1979. Commun. Math. Phys. 67, 69.
Ginibre, J., 1965. J. Math. Phys. 6, 238, 252.
Ginibre, J., 1971. Some applications of functional integration. In: de Witt, C., Stora, R. (Eds.), Statistical Mechanics and Field Theory. Gordon and Breach, New York, London, Paris.
Ginibre, J., 1972. Dilute quantum systems. In: Domb, C., Green, M.S. (Eds.), Phase Transitions and Critical Phenomena. Academic, London, San Francisco, New York, pp. 111–136.
Glimm, J., Jaffe, A., 1981. Quantum Physics, A Functional Integral Point of View. Springer, New York, Heidelberg, Berlin.
Gradshteyn, I.S., Ryshik, I.M., 1965. Tables of Integrals, Series and Products. Academic, New York.
Grandy Jr., W.T., Rosa Jr., S.G., 1980. Am. J. Phys. 49, 570.
Hepp, K., 1970. Solid State Commun. 8, 2087.
Hepp, K., Lieb, E.H., 1973. Helv. Phys. Acta 46, 573.
Huang, K., 1963. Statistical Mechanics. Wiley, New York, London.
Ingersent, K., Jones, B.A., Wilkins, J.W., 1992. Phys. Rev. Lett. 69, 2594.
Ito, K., McKean, H.P., 1965. Diffusion Processes and Their Sample Paths. Springer, Berlin, Heidelberg, New York.
Kac, M., 1959a. Phys. Fluids 2, 8.
Kac, M., 1959b. Probability and Related Topics in Physical Sciences. Interscience, London, New York.
Kac, M., Uhlenbeck, G.E., Hemmer, P., 1963. J. Math. Phys. 4, 216.
Kasuya, T., 1956. Progr. Theor. Phys. 16, 45.
Kondo, J., 1962. Progr. Theor. Phys. 28, 846.
Kondo, J., 1964. Progr. Theor. Phys. 32, 37.
Kossakowski, A., Maćkowiak, J., 1986. Rep. Math. Phys. 24, 365.
Kummer, H., 1967. J. Math. Phys. 8, 2063.
Landau, L., Wilde, I.F., 1979. Commun. Math. Phys. 70, 43.
Lebowitz, J.L., Penrose, O., 1966. J. Math. Phys. 7, 98.
Lewis, J.T., 1972. The free boson gas. In: Streater, R.F. (Ed.), Mathematics of Contemporary Physics. Academic, London, New York.
Lewis, T.J., Pfister, C.E., Sullivan, W.G., 1994a. In: Fannes, M. et al. (Eds.), Large Deviations and the Thermodynamic Formalism: a New Proof of the Equivalence of Ensembles in On Three Levels. Plenum Press, New York.
Lewis, T.J., Pfister, C.E., Sullivan, W.G., 1994b. J. Stat. Phys. 77, 397.
Lieb, E.H., 1966. J. Math. Phys. 7, 1016.
Lightman, A.P., Press, W.H., Price, R.H., Teukolsky, S.A., 1975. Problem Book in Relativity and Gravitation. Princeton University Press, Princeton, New Jersey.
Maćkowiak, J., 1982. Physica A 110, 302.
Maćkowiak, J., 1983a. Physica A 117, 47; 119, 646.
Maćkowiak, J., 1983b. Physica A 121, 59.
Maćkowiak, J., 1987. Physica A 143, 239.
Maćkowiak, J., 1988. Physica Scripta 38, 513.
Maćkowiak, J., 1989a. Physica A 155, 352.
Maćkowiak, J., 1989b. Fortschr. Phys. 37, 101.
Maćkowiak, J., 1989c. Rep. Math. Phys. 27, 11.
Maćkowiak, J., 1991. Physica A 180, 171.
Maćkowiak, J., 1993. Physica A 199, 165.
Maćkowiak, J., Wiśniewski, M., 1994. Physica A 212, 382.
Maćkowiak, J., Wiśniewski, M., 1995. Physica A 220, 585.
Maćkowiak, J., Wiśniewski, M., 1997a. Physica A 242, 482.
Maćkowiak, J., Wiśniewski, M., 1997b. Rep. Math. Phys. 39, 113.
Mayer, J.E., Mayer, M.G., 1940. Statistical Mechanics. Wiley, New York.
Meissner, W., Voigt, G., 1930. Ann. Phys. 7, 892.
Mitus, A.C., Patashinski, A.Z., Sokolowski, S., 1991. Physica A 174, 244.
Narnhofer, H., Thirring, W., 1981. Ann. Phys. N.Y. 134, 128.
Negele, J.W., Orland, H., 1988. Quantum Many-Particle Systems. Addison-Wesley, New York, Amsterdam, Tokyo.
Nelson, E., 1964. J. Math. Phys. 5, 332.
Novikov, S.P., 1969. Func. Anal. Appl. 3, 58.
Pearce, P.A., Thompson, C.J., 1975. Commun. Math. Phys. 41, 191.
Pruski, S., 1978. Private communication.
Pruski, S., Maćkowiak, J., 1971. Rep. Math. Phys. 1, 309.
Pulé, J.V., Lewis, J.T., 1974. Commun. Math. Phys. 36, 1.
Pulé, J.V., Lewis, J.T., 1975. Commun. Math. Phys. 45, 115.
Reed, M., Simon, B., 1973. Methods of Modern Mathematical Physics, vol. I. Academic, New York and London.
Rizzuto, C., 1974. Rep. Progr. Phys. 37, 147.
Robinson, D.W., 1971. Lecture Notes in Physics 9. Springer, Berlin.
Roepstorff, G., 1994. Path Integral Approach to Quantum Physics, An Introduction. Springer, Berlin.
Ruelle, D., 1969. Statistical Mechanics. Benjamin, New York.
Ruvalds, J., Zawadowski, A., 1970. Phys. Rev. Lett. 25, 333.
Sacramento, P.D., Schlottmann, P., 1990. Physica B 163, 231.
Schwartz, L., 1967. Analyse Mathématique. Hermann, Paris.
Schrieffer, J.R., Wolf, P., 1966. Phys. Rev. 149, 491.
Souriau, J., 1970. Structure des Systèmes Dynamiques. Dunod, Paris.
Sniatycki, J., 1980. Geometric Quantization and Quantum Mechanics, Applied Mathematical Sciences Series, vol. 30. Springer, New York, Heidelberg, Berlin.
Story, T., Gałązka, R.R., Frankel, R.B., Wolff, P.A., 1986. Phys. Rev. Lett. 56, 777.
Thirring, W., 1982. Quantum Mechanics of Large Systems. Springer, New York, Heidelberg, Berlin.
Tindemans, P.A.J., Capel, H.W., 1974. Physica 72, 433; 75, 407.
Tripplett, B.B., Philips, N.E., 1971. Phys. Rev. Lett. 27, 1001.
Valatin, J.G., 1964. Theory of Superfluids in Lectures in Theoretical Physics. University of Colorado Press, Boulder, Colorado.
van Diren, F., van Leeuwen, J.M.J., 1984. Physica A 128, 383.
Wiegmann, P.B., 1982. An exact solution of the Kondo problem. In: Lifshits, I.M. (Ed.), Quantum Theory of Solids. MIR Publishers, Moscow.
Widom, B., Rowlinson, J.S., 1970. J. Chem. Phys. 52, 1270.
Wilson, K.G., 1973. Proc. Nobel Symp. XXIV Uppsala: Almquist and Wiksell Informations-Industrie AB, p. 68.
Wilson, K.G., Kogut, J., 1974. Phys. Rep. 12C, 75.
Wilson, K.G., 1975. Rev. Mod. Phys. 47, 773.
Wiśniewski, M., Maćkowiak, J., 1997. Physica A 243, 415.
Woods, A.D.B., Cowley, R.A., 1970. Canad. J. Phys. 49, 177.
Yang, C.N., Lee, T.D., 1952. Phys. Rev. 87, 404, 410.
Zawadowski, A., Ruvalds, J., Solana, J., 1972. Phys. Rev. A 51, 399.
Ziff, R.M., Uhlenbeck, G.E., Kac, M., 1977. Phys. Rep. 32, 169.
Physics Reports 308 (1999) 333—428
Surface magnetoplasma waves at the interface between a plasma-like medium and a metal in the Voigt geometry

N.A. Azarenkov, K.N. Ostrikov*

Kharkov State University, Department of Physics and Technology, 4, Svobody sq., 310077, Kharkov, Ukraine
Kharkov State University, Scientific Centre for Physical Technologies, Novgorodskaya st., 2, C 93, Kharkov, 310145, Kharkov, Ukraine

Received January 1998; editor: A.A. Maradudin

Contents
0. Introduction
1. Linear theory of the SW at the interface between a plasma-like medium and a metal in the Voigt geometry
1.1. SW in a uniform semi-infinite medium
1.2. The effects of a transverse inhomogeneity of the medium
1.3. SW in a homogeneous plasma layer
1.4. SW in a two-layer plasma structure bounded by a metal
1.5. Azimuthal surface waves in magnetoactive plasma-filled waveguides
1.6. Other types of waves and geometries of the problem
2. Excitation mechanisms of surface-type waves
2.1. Parametric excitation in semi-infinite plasmas
2.2. Parametric excitation of SW in a plasma layer
2.3. Excitation of SW in crossed E and H fields
2.4. Parametric excitation of azimuthal surface waves
2.5. Other mechanisms of SW excitation
3. Nonlinear theory of SW
3.1. Resonant second harmonic generation
3.2. Self-interaction of SW. Formalism of the third order nonlinear susceptibilities
3.3. Self-interaction caused by hydrodynamic nonlinearities
3.4. The effect of the nonparabolicity of the free carriers spectrum
3.5. The influence of the heating nonlinearity on the SW propagation
3.6. Ionization nonlinearity of the SW in gas discharge plasmas
3.7. Nonlinear theory of azimuthal SW
3.8. Other nonlinear effects
4. Basic experimental results
5. Conclusions
Acknowledgements
References
Note added in proof
Abstract
Theoretical and experimental results associated with the studies of different properties of surface-type waves (SW) in plasma-like medium–metal structures are reviewed. The propagation of surface waves in the Voigt geometry (the SW propagate across the external magnetic field, which is parallel to the interface) is considered. Various problems dealing with the linear properties of the SW (dispersion characteristics, electromagnetic field topography, influence of the inhomogeneity of the medium, etc.), the excitation mechanisms of the plasma–metal waveguide structures (parametric, drift, diffraction, etc.), and the nonlinear effects associated with SW propagation (higher harmonics generation, self-interaction, nonlinear damping, nonlinear interactions, etc.) are presented. In many cases the results are valid for both gaseous and solid-state plasmas. © 1999 Elsevier Science B.V. All rights reserved.

PACS: 72.30.+q

*Corresponding author. E-mail: [email protected]; present address: Institute for Theoretical Physics I, Ruhr-University Bochum, 44780 Bochum, Germany.

0370-1573/99/$ - see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0370-1573(98)00032-5
N.A. Azarenkov, K.N. Ostrikov / Physics Reports 308 (1999) 333—428
0. Introduction

Bounded plasma-like structures are the subject of growing interest in many fields of fundamental physics as well as in science-intensive technologies. In particular, the high-frequency properties of such structures are actively studied both experimentally and theoretically because of the applied needs of plasma and semiconductor electronics. One of the main specific features of bounded plasma-containing structures is that they can sustain a particular kind of resonant high-frequency excitation, surface-type waves, which are intimately associated with the interfaces [1–12]. Although the properties of these excitations depend significantly on the characteristics of the media in contact, as well as on those of the interface itself, they all possess the same main features [6,7]. Namely, when dissipative processes are weak, the energy of the surface-type waves (SW) is localized near the interface and is not radiated into the volume of the medium. The amplitude of a SW usually decreases rapidly in the direction perpendicular to the interface, while the SW phase velocity is most often significantly lower than the speed of light. Another valuable feature of the surface-type waves is that in inhomogeneous media their eigenfrequencies are integral functions of the medium's density, and can undergo only small changes under a significant variation of the latter. This is in contrast with the local dependence of the eigenfrequencies of volume waves on the medium's density [7]. Moreover, the localization of the electromagnetic energy near the interface makes the SW suitable for applications in functional devices of solid-state and gaseous plasma electronics, owing to the relative simplicity of the excitation and the energy output, and to the convenience of realizing the interaction with electron flows and external electromagnetic fields [4–6].

These circumstances, as well as a number of others, have attracted the interest of many researchers throughout the world to the investigation of various properties of surface-type waves in different bounded plasma-like structures. Surface-type waves have found numerous applications in solid-state and gas-discharge low-temperature plasmas, as well as in the high-temperature plasmas of fusion devices. In particular, SW are actively used in a number of functional elements of semiconductor electronics, optoelectronics, and integrated optics [13,14]. The excitation of the SW can serve as an effective tool for non-contact and non-destructive spectroscopy of solid-state surfaces and interfaces, as well as for the diagnostics of a number of parameters characterizing the contacting media [9,11,15–17]. In gaseous plasmas SW are often used as operating modes of high-power plasma generators and amplifiers of electromagnetic radiation, owing to their ability to interact with beams of charged particles [6]. One of the most recent and promising applications of SW is in RF and microwave surface-wave-based plasma sources, which are used in a large number of low-temperature plasma technologies, such as materials processing, pumping of laser media, etc. [18–25]. SW also take an active part in many physical phenomena in both laboratory and space plasmas [8,26]. The eigenfrequencies of the SW vary over a broad range, from radio [18] to optical [17] frequencies.

As was mentioned above, the nature of the bounding media significantly influences the linear and the nonlinear properties of SW. From a general point of view, one can subdivide all possible interfaces into three main categories: the interface between two non-conducting (dielectric) media, between a dielectric and a conducting medium, and between two conducting media. Here we deal with bounded systems which consist of two conducting media.
Metal–metal, metal–semimetal, metal–semiconductor, semimetal–semiconductor, semiconductor–semiconductor, gaseous plasma–metal (semiconductor, semimetal) interfaces, and several others, can be treated as examples of interfaces between two conducting media. An interesting feature of the above-mentioned interfaces is that all of the contacting media to a greater or lesser extent possess plasma properties, i.e. can be considered as plasma-like media. In the present report we limit consideration to the situations where the conductivities of the media in contact differ significantly (p
article. It should also be pointed out that in the review of the nonlinear theory of surface waves [10] there is no analysis of the features of nonlinear effects caused by wave processes in bounded plasma–metal structures. In a recent review [37] it was pointed out that the presence of a metal as a bounding medium can lead to new effects in the theory of nonlinear plasma surface waves. Therefore, there is a vital need to disseminate the results obtained in this field, and to inform the world scientific community about the progress made and the future perspectives in theoretical and experimental studies of surface-type waves at the plasma-like medium–metal interface.

Of the great number of possible propagating modes of the plasma-like medium–metal interface, we restrict consideration to the SW in one of the most typical and interesting geometries, the Voigt geometry (the SW propagate across an external magnetic field that is parallel to the interface). It should be pointed out that this geometry is being actively studied for different wave processes at plasma-like medium–dielectric or plasma-like medium–plasma-like medium interfaces [38–42]. This geometry is characterized by a significant non-reciprocity of the SW propagation, which means that the eigenfrequency changes when the direction of propagation is reversed. This non-reciprocity arises from the gyrotropic form of the dielectric tensor, i.e. the presence of non-zero off-diagonal terms proportional to the value of the external magnetic field [42]. As is shown in this report, non-reciprocity effects are most significant for the SW at the plasma-like medium–metal interface.

In the present report, in the framework of a quasihydrodynamical description, we consider two situations: the plasma-like medium is either assumed cold or possesses a finite electron pressure. The assumption of a cold plasma-like medium is usually valid when the SW phase velocity significantly exceeds the electron thermal velocity [7].
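The gyrotropic form of the dielectric tensor mentioned above can be made concrete with the textbook expression for a cold, collisionless electron plasma in a static magnetic field along z. This is a minimal sketch under stated assumptions, not the specific model used later in this report: the function name, the choice of field direction, the neglect of ion motion and collisions, and the sign convention of the off-diagonal term (which varies between references) are all illustrative.

```python
def cold_plasma_tensor(omega, omega_p, omega_c):
    """Dielectric tensor of a cold, collisionless electron plasma.

    Assumptions (illustrative only): static magnetic field along z, immobile
    ions, no collisions. omega_c is the electron cyclotron frequency and
    carries the sign of the magnetic field; omega_p is the plasma frequency.
    """
    eps_perp = 1.0 - omega_p**2 / (omega**2 - omega_c**2)
    eps_par = 1.0 - omega_p**2 / omega**2
    # Off-diagonal (gyrotropic) component: odd in omega_c, so it vanishes
    # without a magnetic field and flips sign when the field is reversed.
    g = omega_p**2 * omega_c / (omega * (omega**2 - omega_c**2))
    return [
        [eps_perp, 1j * g, 0.0],
        [-1j * g, eps_perp, 0.0],
        [0.0, 0.0, eps_par],
    ]


if __name__ == "__main__":
    forward = cold_plasma_tensor(1.0, 2.0, 0.5)
    reversed_field = cold_plasma_tensor(1.0, 2.0, -0.5)
    print("off-diagonal, field up:  ", forward[0][1])
    print("off-diagonal, field down:", reversed_field[0][1])
```

The diagonal components are even in the field while the off-diagonal ones are odd, which is why reversing the field (or, equivalently, the propagation direction across it) can change the wave's eigenfrequency.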
The quasihydrodynamical description of the linear and nonlinear properties of the SW is valid when the SW wavelength and skin depth significantly exceed the mean free paths of the charge carriers [43]. The interface is assumed sharp. This assumption is used in many boundary value problems, and is valid if the SW skin depth significantly exceeds the width of the near-wall inhomogeneous transient layer [7]. In a cold plasma-like medium bounded by a metal, the SW in the Voigt geometry possess a number of very interesting, even unique, properties. First, their propagation is strongly nonreciprocal (namely, the SW can propagate only in a definite direction across $H_0$), which makes the SW useful for nonreciprocal elements. Second, they are purely transverse electromagnetic waves, yet at the same time slow ones (their phase velocities are significantly smaller than the speed of light). Third, nonlinear effects for these SW appear at relatively low pump field amplitudes (mainly because the SW are slow waves). Fourth, in a homogeneous plasma-like medium they possess the minimal possible number of electromagnetic field components, namely, the electric field component normal to the interface (which defines the electric charge at the interface) and the tangential magnetic field component (which defines the electric current at the interface). This simplicity of the SW field structure makes these waves convenient for non-contact, non-destructive spectroscopy of various parameters of surfaces, interfaces, and thin films. It should also be pointed out that the SW of interest exist in a broad range of frequencies, from radiowave to optical frequencies, and that they can be observed in quite different Voigt-like geometries.
However, in all these cases the properties of the SW are described from a unified point of view. And, finally, one of the particular
features of the SW is that in a spatially homogeneous plasma-like medium the tangential electric field component of the SW is equal to zero not only at the interface with the perfectly conducting metal but also in the entire volume of the medium (as a consequence, the SW phase velocity in a homogeneous plasma layer bounded on two sides by metal planes does not depend on the transverse size of the layer). A different situation arises when one takes into account the finite electron pressure, i.e. when one considers a warm plasma-like medium. In this case, even if the SW phase velocity significantly exceeds the electrons’ thermal velocity, potential (i.e. unretarded) SW exist at the plasma-like medium–metal interface. Waves of this type are not possible in cold plasmas. Like the electromagnetic SW, the potential SW are strongly nonreciprocal, but they possess a number of distinctive features. One of them is that the tangential electric field component of the SW, which vanishes at the interface, no longer vanishes inside the plasma-like medium. It should be pointed out that the SW at the plasma-like medium–metal interface are very specific from the standpoint of their description. This specificity is mainly due to the boundary condition that the tangential component of the electric field must vanish at the interface with a perfectly conducting metal, and in many respects it determines how their linear and nonlinear properties are described. Separately we consider an essentially new kind of surface-type wave, the azimuthal surface waves (ASW), which propagate across an external axial magnetic field in a cylindrical metal waveguide filled with plasma (i.e. in a cylindrical Voigt-like geometry). Active studies of ASW in such structures began only in the early 1990s, so the understanding of their linear and nonlinear properties is far from complete. The ASW possess a number of distinctive features compared with the ordinary SW in the Voigt geometry.
This specificity is mainly caused by the discreteness of their spectrum, which determines many aspects of their linear and nonlinear theory. In this review we discuss a large number of problems dealing with the dispersion characteristics, excitation mechanisms, and nonlinear properties of SW and ASW. The applied aspects of all the problems solved are discussed as well. In Section 1 the SW dispersion characteristics and the electromagnetic field topography are studied in semi-infinite plasmas and in a plasma layer for both spatially homogeneous and inhomogeneous media. A specific feature of SW excitation in the Voigt geometry is that the beam excitation mechanism is not convenient in planar structures, because the SW propagate across the external magnetic field whereas beams usually propagate along it. Therefore, one has to search for other mechanisms for exciting SW in the Voigt geometry. The parametric, diffraction, and drift (in crossed E and H fields) mechanisms of exciting SW are discussed in Section 2. The ASW can be excited by azimuthally rotating beams as well as in a radial electric pump field. The efficiency of excitation is estimated in each specific case. The nonlinear effects associated with the propagation of SW at the plasma-like medium–metal interface are studied in the weak nonlinearity approximation and are discussed in Section 3. In this approximation the following nonlinear effects are discussed: nonlinear wave interaction and self-interaction, second and third harmonic generation, nonlinear damping, etc. In studying the SW interaction and self-interaction we use the formalisms of the second and third order nonlinear susceptibilities, and discuss the possible mechanisms of nonlinearity such as the nonlinearities of the basic set of quasihydrodynamic equations (hydrodynamic nonlinearities), nonparabolicity of
the free carrier spectrum, electron heating in the wave field (heating nonlinearity), and ionization nonlinearity. The Report is completed by a description of the available basic results of experimental investigations of the SW (Section 4). In Conclusion, we summarize the main results reviewed in this work, give a brief outline of possible applications of the SW in a Voigt geometry, and express our hope that this Report will attract the attention of experimentalists and will help them to overcome the existing gap between theory and experiment in the area studied. The present report has the aim of informing the scientific community about the progress made and the future perspectives of the linear and nonlinear theory of surface-type waves at the interface between a plasma-like medium and a metal in the Voigt geometry. However, to obtain a more complete coverage of the current status of research into high-frequency properties of bounded plasma-like medium—metal structures, references concerning other kinds of waves and geometries of the problem are presented as well.
1. Linear theory of the SW at the interface between a plasma-like medium and a metal in the Voigt geometry

In this section we consider the possibility of the existence of surface-type waves and their properties in a cold magnetoactive plasma-like medium bounded by a metal. It was long assumed that surface-type waves at such an interface are impossible. This statement is based on the following circumstance. The presence of a spatially varying surface charge density (travelling wave), connected with the existence of a SW, leads to a finite value of the tangential component of the electric field $E_\tau$. A finite $E_\tau$ is impossible at a metal interface: the high conductivity of the metal forces the field $E_\tau$ produced by the collective excitations in the plasma to vanish there. This situation occurs if $E_\tau$ in a plasma is taken in the form

$$E_\tau = E_0 \exp(-q_1 x)\exp[i(q_\tau r_\tau - \omega t)],$$

where $E_0$ is a constant amplitude, and $q_\tau$ and $r_\tau$ are the wavevector and the radius-vector along the interface. At the plasma–metal interface ($x = 0$) the tangential component vanishes ($E_\tau = 0$), and, consequently, $E_0 = 0$. However, if one takes into account the electrons’ thermal motion, or introduces an external magnetic field, the necessary boundary condition can be satisfied without equating the amplitude $E_0$ of the solution to zero. In the first case, when one takes into account the carriers’ thermal motion, i.e. the effects of spatial dispersion, the electric field in the plasma medium can be considered as a superposition of the fields caused by the interaction of the electric field and the electrons’ thermal motion (the latter is taken into account through the introduction of a term proportional to $\nabla p$ in the quasihydrodynamic equations of motion for the electrons, where $p$ is the kinetic pressure of the electron gas). As a result of this superposition, the resulting field in a plasma can be presented in the following form [7]:

$$E_\tau = E_0 f(\omega, q, x)\exp[i(q_\tau r_\tau - \omega t)], \qquad (1.1)$$
where $f(\omega, q, x)$ is a definite function, satisfying $f(\omega, q, 0) = 0$ and tending to zero at $x = \infty$. Therefore, the possibility of satisfying the boundary condition $E_\tau = 0$ at $x = 0$ for a nonzero value of $E_0$ arises. In this case a surface-type electromagnetic field in a plasma-like medium can exist. In a cold, but magnetized, medium the resulting field of the SW can also be represented in the form (1.1). In this case the presence of the function $f(\omega, q, x)$ in the expression (1.1) is due to the magnetic field induced anisotropy of the medium. In particular, in the Voigt geometry studied this function does not depend on the x coordinate, and equating $f(\omega, q)$ to zero yields the linear dispersion equation for the SW. Moreover, all wave perturbations of the surface type satisfying this dispersion equation will automatically possess a tangential component of the electric field that vanishes in the entire plasma-like medium, not only at the interface (see expression (1.1)). In this Review we discuss both of the cases mentioned above. However, the first case is somewhat more complicated from the mathematical point of view, so we begin with the SW in a cold magnetoactive plasma-like medium bounded by a metal. As is shown in this chapter, the presence of an external magnetic field $H_0$ in a plasma leads to the occurrence of surface-type waves, and their properties depend significantly on the direction of their propagation. The SW can be reciprocal (for propagation along an external magnetic field) or nonreciprocal (for propagation that is oblique or purely transverse with respect to $H_0$), and they can be purely surface waves (in a planar geometry the SW amplitude decreases exponentially at large distances from the interface) or generalized surface waves (the SW amplitude undergoes oscillations while decreasing away from the interface). Here we pay the most attention to the SW propagating purely transverse to $H_0$, i.e. in the Voigt geometry.
Although in all the situations considered the SW of interest propagate across $H_0$, the exact geometry of the problem is specified in each subsection separately. Other cases of propagation are briefly discussed in Section 1.6.

1.1. SW in a uniform semi-infinite medium

Consider the situation where an external uniform and steady magnetic field is directed along the interface, which is assumed sharp. The latter assumption is valid if the SW skin depth significantly exceeds the width of the near-interface transient layer. A plasma-like medium occupies the half-space $x > 0$, while the z-axis is directed along the external magnetic field. In the y- and z-directions the waveguide structure studied is assumed infinite. The dependence of all wave quantities on the coordinates x, y and time t is assumed to be of the form

$$A(\mathbf{r}, t) = A(x)\exp[i(k_2 y - \omega t)], \qquad (1.2)$$

where $k_2$ and $\omega$ are the SW wavenumber and eigenfrequency, respectively. The SW electromagnetic field is defined by the following set of Maxwell equations:

$$\operatorname{curl}\mathbf{E} = -\partial\mathbf{B}/\partial t, \qquad \operatorname{curl}\mathbf{H} = \partial\mathbf{D}/\partial t, \qquad (1.3)$$

where $D_i = \varepsilon_0\varepsilon_{ik}E_k$, $\varepsilon_{ik}$ is the cold magnetoactive plasma dielectric tensor, $\mathbf{B} = \mu_0\mathbf{H}$, and $\varepsilon_0$ and $\mu_0$ are the electric and magnetic SI constants. The tensor components are

$$\varepsilon_{xx} = \varepsilon_{yy} = \varepsilon_1 = \varepsilon_L + \sum_\alpha \frac{\Omega_\alpha^2}{\omega_\alpha^2 - \omega^2}; \qquad \varepsilon_{xy} = -\varepsilon_{yx} = i\varepsilon_2, \quad \varepsilon_2 = \sum_\alpha \frac{\omega_\alpha\Omega_\alpha^2}{\omega(\omega_\alpha^2 - \omega^2)}; \qquad \varepsilon_{zz} = \varepsilon_3 = \varepsilon_L - \sum_\alpha \frac{\Omega_\alpha^2}{\omega^2};$$

$$\Omega_\alpha^2 = n_\alpha q_\alpha^2/(\varepsilon_0 m_\alpha); \qquad \omega_\alpha = q_\alpha B_0/m_\alpha; \qquad B_0 = |\mathbf{B}_0|;$$

$\varepsilon_L$ is the dielectric function of the crystal lattice (for a gaseous plasma $\varepsilon_L = 1$); and $n_\alpha$, $q_\alpha$, $m_\alpha$ are the density, the charge, and the effective mass of $\alpha$-species particles. The index e refers to electrons while the index i corresponds to ions (in gaseous plasmas) and holes (in semiconductor plasmas). For propagation purely transverse to $\mathbf{B}_0$ (or $\mathbf{H}_0$) the set of Maxwell equations breaks up into two independent sets. One of these sets (E-wave) allows a solution in the form of a surface wave. In a homogeneous plasma-like medium the SW field topography is defined by the following expressions [44]:

$$E_x = \frac{k}{\varepsilon_1 k_\perp^2}(\varepsilon_2\kappa - \varepsilon_1 k_2)\sqrt{\mu_0/\varepsilon_0}\,H_z, \qquad E_y = i\frac{k}{\varepsilon_1 k_\perp^2}(\varepsilon_1\kappa - \varepsilon_2 k_2)\sqrt{\mu_0/\varepsilon_0}\,H_z, \qquad H_z = H_z(0)\exp(-\kappa x), \qquad (1.4)$$

where $\kappa^2 = k_2^2 - k_\perp^2$, $k_\perp^2 = k^2(\varepsilon_1^2 - \varepsilon_2^2)/\varepsilon_1$, and $k = \omega\sqrt{\mu_0\varepsilon_0}$. At the plasma-like medium–metal interface the tangential component of the SW electric field vanishes: $E_y(0) = 0$. This equality yields the following dispersion equation of the SW considered:

$$\varepsilon_1\kappa - \varepsilon_2 k_2 = 0. \qquad (1.5)$$

Eq. (1.5) was also obtained e.g. in [45,46]. However, it should be pointed out that in those papers a detailed analysis of the propagation conditions and the frequency ranges of existence was not given. It is interesting to notice that in the case considered the tangential component of the SW electric field vanishes ($E_y = 0$) not only at the interface, but also in the entire plasma volume, owing to the homogeneity of the plasma medium. This means that the SW is a transverse electromagnetic wave, and the values $E_x(0)$ and $H_z(0)$ define the perturbations of the surface charge and the value of the surface current per unit length of the interface, respectively. If the SW propagate along the y-axis, then $k_2 > 0$. Because the SW are possible for $\kappa > 0$, from Eq. (1.5) one can obtain the condition that the dielectric tensor components $\varepsilon_1$ and $\varepsilon_2$ must have the same signs.
This is possible in the frequency ranges u(u and u'u (u is the upper hybrid frequency), where both e and e are positive. From Eq. (1.5) one can easily obtain k"ke , s"ke/e . (1.6) The condition for SW propagation is e '0. It follows from Eq. (1.5) that e is to be positive for wave propagation with k '0. The effect of the anisotropy of the plasma-like medium makes possible another solution of Eq. (1.5), corresponding to a wave propagating along the negative y-axis (k (0). In this case e '0, while e (0. This solution exists in the frequency range u (u(u . (1.7) J
Here u is the lower-hybrid frequency. The expressions for k and s in this range are also given by J Eq. (1.6). In the frequency ranges u;u and u
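The relation between the signs of the tensor components and the direction of propagation can be checked numerically. The following is only a minimal sketch under stated assumptions, not the authors' procedure: it assumes the standard cold-plasma form of the tensor components with a signed cyclotron frequency (negative for electrons) and the convention $\varepsilon_{xy} = i\varepsilon_2$, and it solves the dispersion equation (1.5) in the closed form $k_2 = \pm k\sqrt{\varepsilon_1}$, $\kappa = k|\varepsilon_2|/\sqrt{\varepsilon_1}$ that follows from Eq. (1.6). The function names `dielectric_components` and `sw_wavenumbers` are hypothetical.

```python
import math


def dielectric_components(omega, species, eps_lattice=1.0):
    """Return (eps1, eps2) for a cold magnetoactive plasma-like medium.

    species: iterable of (plasma_frequency, signed_cyclotron_frequency).
    Assumed convention (not confirmed by the source): eps_xy = i*eps2 and
    the cyclotron frequency carries the sign of the charge.
    """
    eps1 = eps_lattice
    eps2 = 0.0
    for plasma_freq, cyclotron_freq in species:
        denom = cyclotron_freq**2 - omega**2
        eps1 += plasma_freq**2 / denom
        eps2 += cyclotron_freq * plasma_freq**2 / (omega * denom)
    return eps1, eps2


def sw_wavenumbers(k, eps1, eps2):
    """Solve eps1*kappa - eps2*k2 = 0 with kappa**2 = k2**2 - k_perp**2.

    Returns (k2, kappa) with kappa > 0, or None when no surface wave
    exists (eps1 <= 0 or eps2 == 0). The sign of k2 follows eps2.
    """
    if eps1 <= 0 or eps2 == 0:
        return None
    k2 = math.copysign(k * math.sqrt(eps1), eps2)
    kappa = k * abs(eps2) / math.sqrt(eps1)
    return k2, kappa


# Electron-only example in arbitrary frequency units (illustrative values).
eps1, eps2 = dielectric_components(3.0, [(2.0, -1.0)])
k2, kappa = sw_wavenumbers(1.0, eps1, eps2)
residual = eps1 * kappa - eps2 * k2  # should vanish for a valid solution
print(k2, kappa, residual)
```

With these illustrative parameters the frequency lies above the upper hybrid frequency, both components are positive, and the wave propagates in the positive y-direction; choosing $\varepsilon_2 < 0$ with $\varepsilon_1 > 0$ reverses the sign of $k_2$, which is the nonreciprocity discussed in the text.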
where W(x) is the coordinate-dependent amplitude of the potential. Using the quasihydrodynamic equations for electron and ion motions in the SW field, together with the Poisson equation, one can obtain the following solution for the potential of the waves considered: W(x) " A exp(!"k "x)#A exp(!c x) , (1.11) where c"k!e (u!u)/e v , e "1!X/u, and v is the electrons’ thermal velocity. Sub G 2 G 2 stituting the solution (1.11) into the boundary conditions representing the vanishing of the SW potential W and the normal component of the electrons’ hydrodynamical velocity v at the interface [7], it is easy to derive the dispersion relation given below:
c u!u u k u!u e "0 . e ! 1# (1.12) 1# "k " X u "k " X Studies of the dispersion equation (1.12) was performed for both high-frequency (HF) and low-frequency (LF) waves. The SW eigenfrequencies were assumed to exceed the characteristic frequencies of ion motions X significantly, while that of the LFSW were assumed to have values of the same order as X . Using Eq. (1.12), one can show that the nature of HF SW propagation with the frequencies u(u and u'u is different: in the range u(u SW propagation is possible only in the positive y-direction, while in the range u'u solutions with both k '0 and k (0 are possible. LF waves exist in the frequency range below the low-hybrid one. In this case only solutions with k '0 satisfy the basic dispersion relation. In both the LF and HF cases the applicability conditions for the quasihydrodynamic approach were verified. These conditions in both cases can be written as "v "
X uu 1 v " v , k " '0, u 2 X r " u "k "" v\, k (0 , (u#X 2
1 j " , r "
(1.13)
(u#X , v "v 2 u
j "1/r ; "
(1.14)
LF SW u v "v (1!u/u , k " '0, 2 v D 2 1 v (1#uD/X . j " r D c "
(1.15)
Here r is the electron Debye radius and c"1/(k e ) is the speed of light. Notice that the " expression for the HFSW wavenumber coincides with the analogous expression in paper [49], which is valid for the case of an isotropic plasma. This is due to the fact that in this frequency range the effect of the non-reciprocity of SW propagation, caused by an external magnetic field, is weak, and the eigenfrequencies of the SW with k '0 and k (0 coincide within small terms of the order of "k "r ;1 and u /u;1. " In the above mentioned cases a strong inequality between the skin depths "k "\ and c\, namely, c
1 dk k (0) 1# (1.16) !s (0) e (0)"0 , 2s(0) k (0) dx V where s"k!k!k k(d/dx)(e /e k). In the limiting case of a homogeneous plasma-like medium this equation reduces to Eq. (1.5). As an illustration, we give the solution of Eq. (1.6) for waves whose frequency satisfies u ;u;u (the medium is assumed to be dense so that J u/X(0);e ): * e (0)k #
k "!k(X (0)/u )(c(1#a) , where
1 dX 3 , a" c(c X(0) dx 4 V
uu c"1! . u
(1.17)
From the solution (1.17) it is easy to obtain the expression for the SW phase and group velocities:
u a(u#u u ) u . v "c (c 1! v "c (1!a), X(0) u X (0) It follows from Eqs. (1.16) and (1.17) that for an inhomogeneous medium density profile for which the density increases away from the interface, the absolute value of the wavenumber increases, while the absolute values of the phase and group velocities decrease. Ref. [44] also deals with the approximation of the inhomogeneity of the medium by a step function (the medium density is n for 0(x(a, while it is n for x'a). In the case when n (n a density profile increasing into the medium is modelled, while if n 'n — a density profile decreasing into the medium is modelled. The dispersion equation for such a spatially inhomogeneous waveguide structure has the following form: [e(n )!e(n )][k!ke (n )]sh (s a)"[s e (n )!k e (n )] ;[s e (n )ch (s a)!k e (n )sh (s a)] , s "[k!k(n )], (1.18) where ch(x) and sh(x) are the hyperbolic cosine and sine. Assuming a"0 or n "n in Eq. (1.18), we obtain the SW dispersion equation in a homogeneous plasma-like medium under the assumption of a transient layer thin with respect to the SW skin depth (s a;1) for waves in the frequency range u (u(u : X (n ) n aX (n ) , k "!k (c . (1.19) k "k 1# !1 u n cc C One can easily see from the expression (1.19) that in the case of a density profile increasing into the medium (n 'n ) the value of "k " increases. This result is in a good agreement with the conclusions obtained by WKB methods. An increase of "k " means a decrease of the wave phase velocity, and reflects the fact that a positive density gradient acts as a slowing-down structure. Note that for a density profile increasing into the medium the SW phase velocity decreases for the waves in the range u(u , while it decreases for the SW with u'u . Following [44], let us also present the results of a numerical simulation of the influence of the inhomogeneity of the medium density on the SW dispersion. In this paper the density profile was taken in the form
n (0)[1#(p!1) sin(nx/2g)], 0(x(g, n (x)" pn (0), x'g , where p"n (g)/n (0) is the inhomogeneity parameter, and g is a quantity that characterizes the SW skin depth. In particular, this quantity has the value g"c/X (0) for the waves from the frequency range u (u(u in a dense plasma. In the computations it was assumed that X(0)/u"10. In this case the numerical solutions show that the SW eigenfrequency and phase velocity decrease with the increase of the inhomogeneity parameter. In the frequency range given above the topography of the SW electromagnetic fields and its dependence on the inhomogeneity parameter were studied. Fig. 1.1 illustrates the dependence of the normalized (to (k /e )H (0)) amplitudes of X
Fig. 1.1. The dependence of normalized (to H (0)) amplitudes of components H (curves 1, 2), E (curves 3, 4), 0.1 X X V E (curves 5, 6) of the SW electromagnetic fields on the distance x for different values of the inhomogeneity parameter p. W Curves 1, 3, 5 correspond to p"2 while the curves 2, 4, 6 to p"3. It is assumed that X(0)/u"10.
the components H (curves 1, 2), E (curves 3, 4), 0.1 E (curves 5, 6) of the SW electromagnetic fields X V W on the distance x for different values of the inhomogeneity parameter p. One can easily see that the tangential component of the SW electric field E , which is zero in the entire volume of the W homogeneous plasma-like medium, becomes nonzero in the volume of the medium while remaining equal to zero at the interface with a metal. The inhomogeneity also influences the x-dependence of the E and H components. Namely, the SW field in a plasma decreases faster with the increase of V X the inhomogeneity parameter, and the amplitude of the E component increases at the metal V boundary. Now let us discuss the influence of the transverse inhomogeneity of the plasma-like medium on the dispersion properties of the potential SW in a magnetized medium with a nonzero electron pressure [48]. As was mentioned above, the strong inequality c
(1.20)
Using the quasihydrodynamic equations and the Poisson equation in a nonuniform plasma-like medium, one can show that the function W(x) satisfies the following equation:
de (x) dW de (x) dW # ! k #k e (x) W"0 . e (x) dx dx dx dx
(1.21)
Eq. (1.21) can be solved analytically for only a few density profiles n (x). In particular, in the case of a linear profile, n "n (0)(1#ax), its solutions can be expressed in terms of the confluent hypergeometric functions of the second order G [50]: W"A exp(!m/2)G(1/2!k, 1, m) , (1.22) where m"2"(k /ab)"e (x), k"!sgn(k a)e (0)/b, and b"e (0)!1. The substitution of the solution (1.20) with the result given by Eq. (1.22) into the boundary conditions W(0)"0 and v (0)"0 yields the SW dispersion equation, which is not presented here due to its unobservability. Analytical studies of the SW dispersion properties in nonuniform plasmas were carried out in the limiting cases of a weak "a";"k " and a strong "a"<"k " inhomogeneity. In the former case the expressions for the potential SW eigenfrequencies are u"u (1#a/k ), u;u ; (1.23) ar (0) u X(0) , u
k "k #dk , (1.25) where dk "!a(u/u )(m /m , and k is defined by the expression (1.15). The solution (1.25) J shows that as in the HF SW case, the phase velocity increases for a'0. It is important to note that all these results are in a good agreement with the dependence of the phase velocities of the HF and LF waves in a homogeneous medium (1.13)—(1.15) on the medium density n [7]. In the expressions (1.23)—(1.25) the terms caused by the inhomogeneity are small with respect to u and k in homogeneous plasmas. It is easy to see that this is in a good agreement with the weak inhomogeneity approximation. More interesting results are obtained in the limiting case of a strong inhomogeneity ("a"<"k "). Under such conditions the SW eigenfrequency and the wavenumber are coupled by a simple relation u"(u /3)(k /a) , (1.26) showing a linear dependence of u on k . Moreover, it follows from the expression (1.26) that the HFSW eigenfrequency is significantly smaller than the electron cyclotron frequency, and the direction of the SW propagation is now defined by the sign of a. Therefore, in the limiting case "a"<"k " the HFSW are strongly nonreciprocal (unidirectional waves), and their frequency range of existence is significantly narrower in comparison with the homogeneous plasma-like medium case. In the same limiting case one can also obtain the following result for the LFSW eigenfrequency
k u "k " m "k " , u"!u #ln a m k a u which is valid in the frequency range u;u;X(0).
(1.27)
One can conclude from Eq. (1.27) that under definite conditions the LFSW phase velocity is significantly smaller than the electrons’ thermal velocity in the entire range mentioned above. The waves considered remain nonreciprocal also in the case of a strong inhomogeneity. However, SW with $k_2 < 0$ become possible. Therefore, the frequency range in which LFSW exist in strongly inhomogeneous plasmas broadens, and SW can propagate in the negative y-direction. These results are in contrast with those for HFSW.

1.3. SW in a homogeneous plasma layer

In this section, following [51], we consider a homogeneous layer of a plasma-like medium, which occupies the region $0 < x < d$ and is bounded at the planes $x = 0, d$ by perfectly conducting metal surfaces. As in Section 1.1, an external magnetic field is directed along the z-axis, while the SW propagate along the interface perpendicular to $H_0$ (y-axis). The solution of the set of Maxwell equations for the SW electromagnetic field components can be presented in the following form:

$$H_z(x) = A_1\exp(-\kappa x) + A_2\exp(\kappa x),$$

$$E_y(x) = -i\frac{k}{\varepsilon_1 k_\perp^2}\left[A_1(\varepsilon_2 k_2 - \varepsilon_1\kappa)\exp(-\kappa x) + A_2(\varepsilon_2 k_2 + \varepsilon_1\kappa)\exp(\kappa x)\right]\sqrt{\mu_0/\varepsilon_0}, \qquad (1.28)$$

$$E_x(x) = -\frac{k}{\varepsilon_1 k_\perp^2}\left[A_1(\varepsilon_1 k_2 - \varepsilon_2\kappa)\exp(-\kappa x) + A_2(\varepsilon_1 k_2 + \varepsilon_2\kappa)\exp(\kappa x)\right]\sqrt{\mu_0/\varepsilon_0}.$$

The substitution of the solutions (1.28) into the boundary conditions $E_y(x = 0) = E_y(x = d) = 0$ leads to the following dispersion equation for SW propagating normal to an external magnetic field in a layer of a homogeneous plasma medium bounded by a metal [51]:

$$(\varepsilon_2 k_2 - \varepsilon_1\kappa)(\varepsilon_2 k_2 + \varepsilon_1\kappa) = 0. \qquad (1.29)$$

Eq. (1.29) leads to the following relation between the SW eigenfrequencies and wavenumbers:

$$k_2 = \pm k\sqrt{\varepsilon_1}. \qquad (1.30)$$

It follows from Eq. (1.30) that the SW phase velocity in the considered plasma layer does not depend on the thickness of the structure. Without going into the details, which can be found in the original paper [51], we show here that Eq.
(1.29) is the dispersion equation of two independent unidirectional SW, localized near the two different boundaries of the structure. Suppose that the SW frequency belongs to one of the ranges u(u or u'u . In these ranges the substitution of the solution k "k(e in Eqs. (1.28) and J (1.29) causes the expression in the first bracket in (1.29) to vanish. Therefore, in expression (1.28) A "0, while A is nonzero. If one, however, performs the substitution k "!k(e , which is valid in the range u (u(u , it is easy to obtain that in this case the expression e k #e s vanishes, i.e. A "0 while A is nonzero. One can conclude that, as in the case of a semi-infinite structure, E "0 in the entire plasma layer. In this case the waves, localized near the different W boundaries of the structure are independent, and propagate in different directions. Namely, in the ranges u(u and u'u the SW localized near the interface x"0 (amplitude A ) propagates in the positive y-direction (k '0), while the SW localized near the interface x"d (amplitude A ) — in
the negative y-direction ($k_2 < 0$). The electric field component $E_x$ has different signs in the waves with $k_2 > 0$ and $k_2 < 0$, and these wave processes take place independently. We assume that this nature of the SW propagation in a layer is a consequence of their nonreciprocity. The independence of the SW phase velocity from the layer thickness mentioned above is caused by the vanishing of the tangential component of the electric field both at the layer boundaries and throughout the volume of the plasma-like medium. Under such conditions there is no force causing the charged particles to drift across the layer. As will be shown in the following section, such a force can arise when an inhomogeneity is introduced in a layer of a plasma-like medium.

1.4. SW in a two-layer plasma structure bounded by a metal

Section 1.2 showed that in a plasma-like medium with an inhomogeneous density the tangential component of the electric field, which vanishes at the interface, becomes nonzero in the plasma volume. One may therefore expect the SW dispersion properties to depend on the transverse sizes of the layer, and the SW localized near the different boundaries of the structure to become coupled [52]. One can qualitatively follow the influence of the plasma density inhomogeneity on the SW dispersion properties using as an example a two-layer plasma structure bounded by metal surfaces. It should be emphasized that such a structure is typical for solid-state plasmas, and can be realized in gaseous plasmas if the width of the transient region between the plasmas with different densities is significantly smaller than the SW skin depth. Suppose that the plasma layer with a constant electron density $N_1$ occupies the region $-b < x < 0$ (I) and a layer with a density $N_2$ occupies the region $0 < x < a$ (II). At the planes $x = -b$ and $x = a$ the layers are bounded by perfectly conducting metal surfaces.
The directions of the external magnetic field and the SW propagation are the same as in Sections 1.1, 1.2 and 1.3. According to the results of the original paper [52] the dispersion equation of the waveguide structure considered can be written as k!ke (1) e (1)k !e (1)s cth s b , " (1.31) k!ke (2) e (2)k #e (2)s cth s a where e (i)"e (N ), e (i)"e (N ), s "s(N ), i"1, 2. In the limiting case s b<1, s a<1 the G G G G dispersion equation (1.31) breaks up into three dispersion equations for three independent waves propagating at the interfaces x"!b, x"a, x"0: D(u, k )"D (u, k )D (u, k )D (u, k ) , where D (u, k )"e (2)s !e (2)k , D (u, k )"e (1)s #e (1)k , D (u, k )"e (2)k [e (1)k #e (1)s ]!e (1)k [e (2)k #e (2)s ], and k "k (N ). D (u, k)"0 and D (u, k)"0 are the G G dispersion equations for the SW propagating at the plasma-like medium—metal interfaces x"!b, x"a. The solutions of such equations for the SW wavenumbers are given by equations (1.30). The equation D (u, k )"0 describes the SW dispersion at the interface between two plasma-like medium half-spaces [40]. In the case of a homogeneous plasma layer (N "N ) the equation D (u, k )"0 degenerates into an identity. This means that the wave at the interface between the two plasma-like media disappears, and the basic dispersion equation (1.31) is reduced
to Eq. (1.29). It should be pointed out that using the equation D (u, k )"0 one can obtain the SW dispersion at the plasma-like medium—metal interface, if one assumes the ideal metal as a plasma with an infinitely great particle density. Assuming N to be infinite, we obtain the SW dispersion at the x"!b interface, while assuming N to be infinite, we obtain the SW dispersion at the interface x"a. In the limiting case of thin plasma layers (s b;1, s a;1) it is easy to obtain analytical solutions for the SW wavenumbers. In the frequency range u(u in a dense plasma-like medium (X <e u) when the k are finite and N does not differ significantly from N one can obtain: * X X (1#b/a) , k "k [X #X (b/a)](u!u) (1.32) u b 1 k "! X # X (X !X )\ . a b u It should be pointed out that the first two solutions correspond to the SW at the metal interfaces. Namely, the solution with a sign (#) corresponds to the wave at the x"a interface while the solution with a sign (!) to the wave at the x"!b interface. The third solution corresponds to the SW at the x"0 interface. In this case the sign of the wavenumber k is defined by the relation between N and N . For a given u the wavenumbers of the SW at the metal interfaces are maximal for N "N . The absolute values of the wavenumbers of the SW at the x"0 interface are decreased with the increase of the difference in electron densities. The conditions of applicability of the expressions (1.32) can be written in the form given below:
X ac\;(u!u)u\, uu\;"N !N "N\ . Detailed investigations of the dispersion equation (1.31) in the frequency range u (u(u were carried out numerically for various values of the parameters b"X u\, m "a(a#b)\, and m "N /N . In this frequency range the SW at the x"a interface can propagate only in the positive y-direction, while that at the interface x"!b ! in the negative y-direction. Solutions with both k '0 and k (0 exist for the wave at the x"0 interface. In the computations it was assumed that N "const., a#b"const. The transverse structure size a#b was selected so that the SW skin depths were quantities of the same order as the transverse sizes of the layers. In this case one can numerically demonstrate the dependence of the SW dispersion properties on the relation between the layer widths a and b, and between the electron densities of the layers N and N . Namely, for fixed m the waves’ eigenfrequencies at the interfaces x"a (k '0) and x"0 (k (0) decrease with an increase of the parameter m , while that of the SW at the interfaces x"!b (k (0), x"0 (k '0) increase with m . The numerical computations showed that for a fixed m the eigenfrequency of the SW at the interface x"a is a monotonically decreasing function of m (Fig. 1.2) and for a fixed m decreases with m (Fig. 1.3). The SW dispersion properties can be varied by changing the plasma density and the external magnetic field (parameter b). In particular, the wave frequency at the x"!b interface increases with a decrease of b, and for b"0.1 the dependence of the frequency on the wavenumber is a straight line (Fig. 1.4), i.e. the magnetoplasma wave perturbations considered become purely electromagnetic ones. Therefore, a numerical simulation of the inhomogeneity in a plasma layer bounded by a metal confirmed the assumption about the possibility of a dependence of the SW dispersion properties on the layer sizes in an inhomogeneous plasma-like medium [44].
Fig. 1.2. The dependence of the normalized SW eigenfrequency p"u/u on the ratio of the electron plasma densities m "N /N in a two-layer plasma structure bounded by a metal. Curves 1—3 correspond to m "0.1; 0.3; 0.5, respectively, and Q"ck /u "3. Fig. 1.3. The dependence of the normalized SW eigenfrequency p on the parameter m . Curves 1 and 2 are plotted for the wave with Q"3 for m "2; 6, respectively. Curves 3 and 4 correspond to Q"23 and the same values of m .
Fig. 1.4. Dispersion curves of the SW for different values of the parameter b. Curves 1—4 correspond to b"0.1; 2; 10; 20, respectively.
1.5. Azimuthal surface waves in magnetoactive plasma-filled waveguides
In this section we consider the possibility of the existence, and study the dispersion properties, of azimuthal surface waves (ASW) in a coaxial plasma structure with metal walls (cold plasma-like medium), and in a cylindrical metal waveguide filled by a finite-pressure medium. Both structures are immersed in a constant axial external magnetic field. The coaxial structure studied consists of two coaxial metal cylinders with radii a and b (b > a). The space between the cylinders is filled by
a homogeneous plasma-like medium. The dependence of all perturbations in the ASW field is given in the form
$$A(r, t) = A(r)\exp[i(m\varphi - \omega t)], \tag{1.33}$$
where φ and m are the azimuthal angle and the azimuthal wavenumber, respectively. The waves considered are the analogs of the SW in a planar Voigt geometry; however, many of their properties are different. ASW in a hollow cylindrical metal waveguide completely filled by a cold magnetoactive plasma were studied in [53]. In that paper the following frequency ranges of existence u (u(min(u !u , u ), u (u(u (1.34) * were obtained. Here u "u /2#(u/4#X). In contrast with the SW in a planar Voigt geometry the ASW do not exist in the low-frequency range u(u , while the high-frequency range is limited by an upper limiting frequency u . In [53] the possibility of the existence of ASW in the same frequency ranges (1.34) at the interface of a metal rod immersed in a magnetoplasma was also shown. In both cases studied the ASW are unidirectional wave perturbations. Namely, for the ASW in a waveguide the azimuthal wavenumbers are positive in the first frequency range and negative in the second. For the ASW propagating at the interface of a metal rod the signs of m are opposite to those mentioned above. According to the results of papers [54,55] the expressions for the ASW electromagnetic field components in a cylindrical plasma layer can be given in the form H (r)"[AI (sr)#BK (sr)](e /k ) , X
i e [D (r)A!D (r)B] , E (r)" P k e!e D (r) m D (r) m e e !B I (sr)# K (sr)# , E (r)"A ! P e kr e kr e!e k e!e k where
(1.35)
me K (sr) #s , D (r)"!K (sr)
re K (sr)
me I (sr) #s D (r)"I (sr) ,
re I (sr)
s"k(e!e)e\, I , K are the modified Bessel function and the McDonald function of the
mth order, and the prime denotes differentiation with respect to the argument. Substituting the solutions (1.35) into the boundary conditions E (a)"E (b)"0, we obtain the dispersion equation P P of ASW in the structure studied
D (a)D (b)"D (a)D (b) . (1.36) The estimates of the ASW skin depths given in the papers [53—55] show that in a dense plasma for realistic sizes of waveguide structures the inequalities sa<1, sb<1 are satisfied well enough.
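The functions D(r) defined above contain the modified Bessel functions through ratios of the form I'_m/I_m and K'_m/K_m, so the simplification for sa ≫ 1, sb ≫ 1 rests on the large-argument behaviour of these ratios. As a rough numerical illustration (a sketch only, not part of the original analysis), the Python snippet below evaluates I'_m(x)/I_m(x) from the power series and shows that it approaches 1 for x ≫ 1 and |m|/x for x ≪ 1. The function names and the sample values of m and x are hypothetical choices made for this illustration.

```python
import math


def bessel_i(m: int, x: float, terms: int = 200) -> float:
    """Modified Bessel function I_m(x), integer m >= 0, x > 0, from its power series."""
    total = 0.0
    for k in range(terms):
        # Work with the logarithm of each term to avoid overflowing factorials.
        log_term = ((2 * k + m) * math.log(x / 2.0)
                    - math.lgamma(k + 1) - math.lgamma(k + m + 1))
        total += math.exp(log_term)
    return total


def bessel_i_prime(m: int, x: float) -> float:
    """Derivative I_m'(x) from the recurrence I_m' = (I_{m-1} + I_{m+1}) / 2, m >= 1."""
    return 0.5 * (bessel_i(m - 1, x) + bessel_i(m + 1, x))


def log_derivative(m: int, x: float) -> float:
    """Ratio I_m'(x) / I_m(x) that enters the dispersion functions."""
    return bessel_i_prime(m, x) / bessel_i(m, x)


if __name__ == "__main__":
    for m in (1, 3, 5):
        wide = log_derivative(m, 50.0)                 # "wide" limit, x >> 1
        narrow = log_derivative(m, 0.01) * 0.01 / m    # "narrow" limit, scaled by x/m
        print(m, round(wide, 3), round(narrow, 5))
```

In the wide limit the ratio is close to unity, which is why the cylindrical dispersion relation takes a planar-like form; in the narrow limit the scaled ratio is close to unity, i.e. I'_m/I_m ≈ |m|/x.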
In this limiting case the dispersion equation (1.36) can be simplified, and its solution can be easily written as follows:
m u cd m 4ab c 1! u"! 1# c , (1.37) 2 X ab "m" d where d"b!a, and c"[1#exp(!2sd)][1!exp(!2sd)]\. The expression (1.37) illustrates the dependence of the wave frequencies on the sign of the azimuthal wavenumber: the frequency of ASW localized near the interface r"a (m(0) is higher than that of ASW located near the interface r"b (m'0). This means that the nonreciprocal nature of the ASW propagation remains when the coupling of the wave perturbations localized near the inner and outer boundaries of the structure is taken into account. The coupling of waves in the structure is the most significant under the realization of the condition sd;1. In this case the solution of Eq. (1.36) is
m 1 1 mu 1! # . u" "m" 2s(ab 8sab s(ab
(1.38)
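The coupling factor defined after Eq. (1.37), [1 + exp(−2sd)][1 − exp(−2sd)]^{-1}, is identically coth(sd). It therefore tends to 1 when the layer is thick compared with the skin depth (sd ≫ 1), so that the two boundaries decouple, and grows as 1/(sd) in the strong-coupling limit sd ≪ 1. The short Python sketch below checks both statements numerically; the function name and the sample values are illustrative assumptions, and the argument sd stands for the product of s and d in the notation of the text.

```python
import math


def coupling_factor(sd: float) -> float:
    """[1 + exp(-2 s d)] / [1 - exp(-2 s d)] for sd = s * d > 0."""
    e = math.exp(-2.0 * sd)
    return (1.0 + e) / (1.0 - e)


if __name__ == "__main__":
    for sd in (0.01, 0.1, 1.0, 5.0):
        # Second column: the factor itself; third column: coth(sd) for comparison.
        print(sd, coupling_factor(sd), 1.0 / math.tanh(sd))
```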
In a thin layer of a plasma-like medium the corresponding wave structure cannot be separated into wave perturbations localized near the opposite boundaries. This is due to the fact that the normal component of the SW electric field is practically constant in the layer of a plasma-like medium. In this case there exist wave perturbations in the structure with equal amplitudes at the different boundaries, and eigenfrequencies (1.37) that depend on the propagation direction. For arbitrary values of the parameters sa, sb the dispersion equation (1.36) was solved numerically. Fig. 1.5 presents the dependence of the relative wave eigenfrequency variation "Du"/u (u is the ASW eigenfrequency for a given value of m and sa<1) on the width of a plasma sheath for two values of the parameter a "au /c and for three values of m"1,3,5. One can easily see from Fig. 1.5 that the quantity "Du"/u asymptotically tends to zero with the increase of the parameter sd. Meanwhile, the relative frequency variation decreases with increasing structure sizes for the same value of d.
Fig. 1.5. The dependence of the relative ASW frequency variation "Du"/u on the normalized width of the plasma sheath xd for two values of the parameter a "au /c"0.05; 0.5, and for three values of the ASW mode numbers m"1, 3, 5, respectively.
It should be noted that the structure considered above can act as the basis for the design of phase-converting devices of plasma and semiconductor electronics [54,55]. It is very useful to emphasize the main features of ASW in a layer of a cold magnetized plasma-like medium. The propagation of ASW along a curved waveguide surface leads to significant differences with SW propagation in a planar plasma layer (Section 1.3). The eigenfrequencies of ASW with the same "m" and localized near different boundaries of the structure are different and depend on the width of the plasma layer. The latter option is impossible in a planar layer of a homogeneous plasma-like medium (Section 1.3). This is due to the fact that in contrast with the SW in a planar geometry the tangential component of the ASW electric field E is not P equal to zero in the interior of the layer. Moreover, the frequency ranges in which SW and ASW exist significantly differ. Now, following [36,56], let us show that in a cylindrical metal waveguide of radius a completely filled by a finite pressure magnetoactive plasma-like medium potential ASW can exist, and let us also study their potential distribution and dispersion properties in a spatially homogeneous medium. As before, the external magnetic field is directed along the waveguide axis. The dependence of all perturbations in the wave field is given in the form (1.33). Then, using quasihydrodynamic and Poisson equations, one can obtain the following solution for the ASW potential: W(r)"A r #A I (s r) , (1.39) K where s"e (u!u)/e v . Substituting the solution (1.39) into the boundary conditions G 2 W(0)"0, » (0)"0 (» is the radial component of electron velocity), one can easily obtain the dispersion equation of the ASW considered
u m u!u s a u!u I (s a) 1# e ! 1! e "0 . (1.40) I (s a) u "m" X "m" X
Here the prime denotes differentiation with respect to the argument. Eq. (1.40) is similar in structure to Eq. (1.12) for the potential SW in the Voigt geometry and coincides with it in the limiting case of a “wide” waveguide (s a<1), if, however, one makes the substitution !m/aPk . Therefore, one has to expect that in the limiting case s a<1 the form of the solutions for ASW will be analogous to those in the case of a planar geometry. Eq. (1.40) also allows the limit H "0 to be taken. In this case it is written as u "m" I (sa)
" , (1.41) X sa I (sa)
where s"!ue/» e . It should be noted that Eq. (1.41) can be easily obtained under a simplifi2 G cation of the dispersion equation (3) of the paper [57] (one has to assume k "0 in it). The latter equation describes the axially nonsymmetric surface waves (the dependence of wave perturbations in the field of these waves is given by exp[i(mu#k z!ut)], k is the SW wavenumber along the cylinder axis) in a cylindrical metal waveguide completely filled by a finite plasma-like medium with a nonzero electron pressure. The analysis of the dispersion equation (1.40) shows that for the waveguide studied the existence of both high-frequency and low-frequency ASW is possible. For LF waves only the solutions with m(0 are allowed, while for HF waves the solutions with both k '0, and k (0 are possible. The analytical solutions of Eq. (1.40) are studied in the limiting
cases of “wide” (s a<1) and “narrow” (s a;1) waveguides. In the limiting case s a<1 it is easy to obtain the following expression for LF branch of ASW u"u (1!au/2m» ) . (1.42) J J 2 The expression (1.42) shows that the LF ASW eigenfrequency is close to the low-hybrid one, and increases with increasing azimuthal wavenumber, tending to the low hybrid frequency for large values of "m". Note, that for the validity of the quasihydrodynamical description the value ua/"m" acting as the ASW phase velocity [53], in the case of low-frequency waves must be significantly smaller than the electrons’ thermal velocity » . Using the expression 2 ua/"m""» (1!u/u) 2 J for the ASW phase velocity, one can easily see that the ASW eigenfrequency u must satisfy the following inequality (u!u)/u;1 J for the validity of our analysis. In a rarefied plasma-like medium and in other frequency ranges in a dense medium the low-frequency ASW are strongly damped due to the fact that the value of ua/"m" is close to » . 2 In the same limiting case there also exists a solution for high-frequency ASW with frequencies that significantly exceed the characteristic frequencies of ionic motion. For HF ASW the dispersion equation (1.40) can be presented as u(uGu ) s a u$u u , " (1.43) 1$ "m" X X where s "(u!u/» , the upper sign is related to the solutions with m'0 while the lower 2 one to the solutions with m(0. One can see that Eq. (1.43) is analogous to (3) of the paper [47] and reduces to it after the substitution "m"/aP"k ". From Eq. (1.43) one can obtain the following solution for the HF ASW eigenfrequency:
u r u . #"m" "X u"$ # 4 a 2
(1.44)
The latter solution is valid under the satisfaction of the following inequalities: u;u, "m"r /a;1. In the frequency range u;u solutions with both m'0 and m(0 are " possible. For u;u the following expression is the solution of the dispersion equation (1.44): "m" X (1.45) u" r , m(0 , a "u which is analogous to the solution (1.13) for a semi-infinite plasma-like medium. The phase velocity of ASW with u;u is ua/"m""» (X /u ), and in a dense plasma significantly exceeds the 2 electrons’ thermal velocity. Under the realization of the inequality X
(1.46)
Eq. (1.46) is analogous to expression (3.1.6) of the paper [36], which is valid in the case of a semi-infinite free plasma. The kinetic damping of the waves is also small because ua/"m"<» . 2 Note that in a rarefied plasma high-frequency waves are strongly damped because their phase velocity is close to the electron thermal velocity. Here we also present the results for the case of a “thin” plasma waveguide (s a;1). This limiting case can be realized for ASW eigenfrequencies which are close to the lower and upper hybrid frequencies. Namely, in a dense plasma the following solution is valid for LF waves [36]: u"u (1!d), m(0, d;1 , J where d"(1/2)((m /m #u/X). The dependence of the ASW eigenfrequency on the azimuthal wavenumber can be obtained by taking into account the next terms in the expansion of the modified Bessel functions in powers of the small parameter s a. An analogous solution can easily be obtained for HF waves as well (u&u ). However, because the frequency depends only weakly on m, it is not of significant interest, and we do not present it here.
1.6. Other types of waves and geometries of the problem
Unfortunately, the limited length of the present review does not allow even a detailed analysis of the various linear properties of surface waves in the Voigt geometry. However, complete coverage of the field of surface waves in plasma-like medium–metal structures requires at least a mention of the results for other types of surface waves and for other geometries of the problem. We begin this brief review with the results related to the Voigt geometry. The first attempt to identify the wave perturbations in Toda’s experiments was made in the paper [58]. However, it was only in the paper [30], which appeared two years later, that the observed waves were termed “surface waves” at the semiconductor–metal interface.
The first detailed study of the SW dispersion properties at a semiconductor–perfect conductor interface was performed in the paper [59]. The paper [60] clarifies the influence of near-wall transient layers with an inhomogeneous medium density on the SW dispersion properties in a layer of a plasma-like medium bounded by a metal. It is shown that when the density decreases from the metal interface into the volume (such a profile can be realized, e.g., in activated semiconductor layers bounded by a metal and in plasma sources), a linear transformation of the surface wave into small-scale lower or upper hybrid oscillations can take place in the region of the local resonance. The temporal evolution of the SW damped in this way is obtained as well. The paper [61] investigates the influence of a near-wall dielectric layer and of the finite conductivity of the metal on the SW dispersion characteristics and on the electromagnetic field topography. In [62] the application of the surface wave perturbations studied here in nonreciprocal elements of solid-state electronics, such as directional couplers and electrooptical modulators, is discussed. The structure studied is formed by two identical n-type semiconductor waveguiding channels separated by a thin dielectric element and covered by a metal film. The energy exchange between the waveguiding channels is studied both without and with a DC voltage supplied directly to one of the channels, and the characteristic spatial scales of this exchange are obtained. Numerical estimates show that the structure studied can, in principle, be used as an effective directional coupler and electrooptical modulator operating in the millimeter
and submillimeter frequency ranges. In the papers [63–65] surface waves propagating across an external magnetic field in gyrotropic plasmas bounded by metal waveguides of different cross-sections are studied. Namely, square and rectangular cross-sections are assumed in [63] and [65], respectively, while in [64] a waveguide with an arbitrary cross-section close to a circular one is considered. In all cases the spectrum and the electromagnetic fields of the SW in such waveguides are presented. Moreover, the conditions under which SW analogous to the azimuthal surface modes discussed in Section 1.5 can exist in these waveguides are obtained. The papers [66,67] deal with surface waves that exist at the harmonics of the ion and electron cyclotron frequencies, respectively, in the planar Voigt geometry. These waves are considered within the framework of kinetic theory under the assumption of weak spatial dispersion. The influence of the transverse inhomogeneity of the plasma density (the inhomogeneity profile is modelled by a step function) on the SW phase velocity is discussed as well. The papers [68–70] cover various aspects of the theory of azimuthal surface waves in the cylindrical Voigt-like geometry. In particular, in [68] the influence of a dielectric sheath between the plasma column and a cylindrical metal waveguide on the frequency ranges of ASW existence, on the ASW eigenfrequencies, and on their electromagnetic fields is discussed. The paper [69] is devoted to the influence of the radial inhomogeneity of the plasma density on the ASW properties (the density increases from the waveguide wall into the medium). In the paper [70] a problem similar to that of [68] is considered, but in the presence of an axial current. The current gives rise to an azimuthal component $H_\varphi$ of the ASW magnetic field and makes it impossible to separate the wave field into E and H modes.
However, if the inequality H
fields on the angle of wave propagation and on other parameters of the problem are obtained. It is shown that at a definite angle of propagation the generalized surface waves transform into purely surface waves. Papers [49,57,84–87] study the possibility of the existence of surface waves at the interface between a plasma-like medium with a finite carrier thermal velocity and a metal in the absence of an external magnetic field. The results of these papers suggest that taking spatial dispersion into account makes possible the propagation of potential waves in the structure studied, in frequency ranges which differ significantly from those of the surface waves in magnetoactive plasmas. It is important to note that spatial dispersion makes surface-type wave perturbations possible at the free plasma-like medium–metal interface (if the plasma-like medium is cold, the SW at this interface are possible only in the presence of near-wall inhomogeneities with a special profile [88,89]). Moreover, in the papers [49,57] this result is obtained in both planar and cylindrical geometries. In the paper [49] the dispersion equation of the SW, which takes into account the collisions of the plasma electrons and the finite conductivity of the metal, is derived for both homogeneous and inhomogeneous media. It is shown that in the cases considered the finite conductivity of the metal leads to a weak damping of the SW studied. This damping vanishes if the high-frequency conductivity of the metal is assumed to be infinite. If, however, the electrons’ thermal velocity is assumed to vanish, the SW investigated in the papers mentioned above become impossible. Nevertheless, as is shown in the paper [85], the presence of even a thin vacuum or dielectric sheath (e.g. a thin oxide layer or a vacuum gap between a plasma and a metal) makes the existence of surface-type waves possible. A detailed analysis of the SW dispersion properties was carried out both analytically and numerically for a structure consisting of a metal, a dielectric, and a plasma-like medium with a nonzero electron pressure. The paper [57] deals with potential axially non-symmetric surface waves propagating at the interface of a metal cylinder completely filled with a plasma-like medium of nonzero gas-kinetic pressure, as well as at the interface of a metal rod immersed in an analogous medium. It is shown that by varying such parameters of the problem as the azimuthal number m, the plasma density, and the temperature one can effectively vary the eigenfrequency of the waves studied (their frequencies can be either significantly smaller than or of the same order as the electron plasma frequency). In the paper [86] the dispersion properties of the SW at the interface between a plasma with a finite gas-kinetic pressure and a corrugated metal surface are considered. In this case the dispersion curves are modified and non-transparency bands are formed. The kinetic theory of the SW in a homogeneous layer of a hot plasma-like medium bounded by two perfectly conducting metal surfaces, together with a comparison of the results of the kinetic approach with those of the hydrodynamical theory, is presented in the paper [87].
To conclude this section, we briefly note a series of papers devoted to applied aspects of the theory of surface-type waves at the plasma–metal interface [90–92], such as travelling-wave antennas in a magnetoplasma [32,90], a gas discharge produced and sustained by surface waves in a planar plasma producer [91], and the polishing of metal surfaces [92]. We also note papers on the possibility of the existence of electron and ion cyclotron SW in a warm plasma in the presence of a magnetic field transverse to the interface (kinetic description) [93,94], and of SW at the interface between a plasma-molecular medium and a metal [95].
2. Excitation mechanisms of surface-type waves
Chapter 2 is devoted to different mechanisms of excitation of plasma-like medium–metal waveguide structures. Sections 2.1, 2.2 and 2.4 deal with the parametric mechanism of excitation of surface waves in different plasma-like medium–metal waveguide structures. We begin with the excitation of a semi-bounded plasma–metal structure in an external electric field directed normally to the interface (Section 2.1). Two limiting cases, those of high-frequency SW (HFSW) and low-frequency SW (LFSW), are considered. Then the influence of a near-wall transient layer with an inhomogeneous plasma density and (or) external magnetic field is considered. Next, the excitation of layered plasma–metal structures in an external electric field directed normally to the interface is studied (Section 2.2). Section 2.4 is devoted to the parametric excitation of azimuthal surface waves in a magnetoactive plasma waveguide in a radial electric pump field. Section 2.3 is devoted to the drift excitation mechanism of SW in crossed electric and magnetic fields. The last section of this chapter gives a brief review of other possibilities for exciting surface-type waves in plasma-like medium–metal waveguide structures.
2.1. Parametric excitation in semi-infinite plasmas
In this section we consider the parametric excitation of SW propagating across an external magnetic field and study the possibility of saturation of the instability due to the self-interaction of the waves studied [96,97]. Let us consider a cold magnetoactive plasma-like medium that occupies the half-space x > 0 and is bounded at the plane x = 0 by a perfectly conducting metal surface. An external magnetic field $H_0$ is directed along the z-axis. An external high-frequency electric pump field of frequency $\omega_0$ is applied normally to the interface:
$$\tilde{E}_x = E_0 \cos\omega_0 t. \tag{2.1}$$
We are interested in the parametric excitation of the SW propagating across $H_0$ (along the y-axis).
The dependence of all wave perturbations on the y coordinate and time t is taken in the form exp [i(k y!ut)]. The plasma is assumed dense (X/u<e ). To understand the features of the H * excitation of these waves we briefly recall their linear properties [44] (see Section 1.1). The SW studied exist in the frequency ranges u(u , u (u(u , and u'u . The SW wavenumbers J and eigenfrequencies are coupled by the linear dispersion relation (1.5). This equality provides the nonreciprocal character of the SW propagation, namely, only the solutions with k '0 are G possible in the frequency ranges u(u , u'u , and the SW can propagate only along the negative y-axis in the frequency range u (u(u (see Section 1.1). The nonreciprocal nature J of the SW propagation forbids the possibility of the simultaneous excitation of the SW from the same frequency range of existence. So we have to consider the set of parametrically coupled waves from different frequency ranges in the pump field (2.1). In the geometry studied the decay conditions u "u #u , k "!k , (2.2) where u , u are the eigenfrequencies of the excited waves, and k (u ), k (u ) are their wavenum bers, can be satisfied in following two cases:
(1) The frequency of the HF wave u is close to the low hybrid frequency u and the LF wave J frequency satisfies: u;u. In this case k (0 and k '0. (2) The frequency of the HF wave is close to the upper hybrid one u , and the LF wave frequency satisfies: u;u. In this case k '0, while k (0. We call first case the low-frequency one due to the fact that it is necessary to take into account the motion of the ions or holes in considering the process studied. The second case we call the high-frequency case, due to the fact that the motions of the ions or holes can be neglected. The set of parametrically coupled equations for the slowly varying amplitudes E of the excited SW in the pump field (2.1) has the following form: jE jE #» #l E "ib E*E , jt jy
(2.3)
jE jE #» #l E "ib E*E , jy jt where l are the SW collisional damping decrements, » are the SW group velocities, and b are the coefficients of the SW parametric coupling in the pump field (2.1). The corresponding values of » and b for the low-frequency and high-frequency cases are given below in Table 1. Here X is the ion (hole) plasma frequency, d"u(u u )\!1, X"u!u, c"1/(e k ). The set (2.3) is valid at the intial stage of the instability when the amplitudes of the excited waves are small compared with that of the pump field E , and the influence of the waves with u and u on E can be neglected. If this condition is not satisfied, then the dynamical equation for the pump field amplitude is to be added to Eq. (2.3). Assuming the solutions of Eq. (2.3) in the form E "E exp [i(fy!Xt)], j"1, 2, it is not H H difficult to obtain the following dispersion equation for the SW envelope: D(X, f)"(X#f» #il )(X#f» #il )#b b E"0 , (2.4) where the signs # or ! are taken correspondingly to the signs of the SW group velocities given in Table 1. Eq. (2.4) yields the following expressions for the increment of growth of the amplitude of the SW field envelope
b b l l » » E \ # c"Im X" 2 » » » » » #» Table 1 Values of »
.
and b in the low-frequency and high-frequency cases »
»
b
b
LF case
!cdu /)
»
HF case
cu /X
!cu /X
e u d 2m » u e u mc u
2e u (1/d) m» u e mc
The instability takes place if the pump field amplitude exceeds the threshold value determined by
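b b l l E " # . » » » » In the present case the spatial homogeneity of the pump field allows us to consider only the temporal evolution of spatially homogeneous perturbations (f"0). In this case the expression for the instability increment can be presented as follows:
$$\gamma = -\frac{\nu_1+\nu_2}{2} + \left[\left(\frac{\nu_1-\nu_2}{2}\right)^2 + \beta_1\beta_2 E_0^2\right]^{1/2}, \tag{2.5}$$
where $\nu_{1,2}$ are the collisional damping decrements and $\beta_{1,2}$ the parametric coupling coefficients of the set (2.3).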
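As a quick consistency check (a sketch under stated assumptions, not taken from the original papers), the increment (2.5) can be compared with the roots of the envelope dispersion equation (2.4) for spatially homogeneous perturbations. The snippet assumes that Eq. (2.4) at zero envelope wavenumber reads (Ω + iν₁)(Ω + iν₂) + β₁β₂E₀² = 0, with ν₁, ν₂ the damping decrements and β₁, β₂ the coupling coefficients, and that β₁β₂ > 0. The function names and the dimensionless parameter values are arbitrary choices made only for illustration.

```python
import cmath
import math


def increment_closed_form(nu1, nu2, beta1, beta2, e0):
    """Growth rate in the form of Eq. (2.5)."""
    return (-0.5 * (nu1 + nu2)
            + math.sqrt(0.25 * (nu1 - nu2) ** 2 + beta1 * beta2 * e0 ** 2))


def increment_from_roots(nu1, nu2, beta1, beta2, e0):
    """Largest Im(Omega) among the roots of the assumed quadratic
    Omega^2 + i (nu1 + nu2) Omega - nu1 nu2 + beta1 beta2 e0^2 = 0."""
    b = 1j * (nu1 + nu2)
    c = -nu1 * nu2 + beta1 * beta2 * e0 ** 2
    disc = cmath.sqrt(b * b - 4.0 * c)
    roots = ((-b + disc) / 2.0, (-b - disc) / 2.0)
    return max(r.imag for r in roots)


def threshold_field(nu1, nu2, beta1, beta2):
    """Pump amplitude at which the increment changes sign (assumes beta1 * beta2 > 0)."""
    return math.sqrt(nu1 * nu2 / (beta1 * beta2))


if __name__ == "__main__":
    nu1, nu2, beta1, beta2 = 0.3, 0.1, 2.0, 0.5   # illustrative values only
    e_th = threshold_field(nu1, nu2, beta1, beta2)
    for e0 in (0.5 * e_th, e_th, 2.0 * e_th):
        print(e0,
              increment_closed_form(nu1, nu2, beta1, beta2, e0),
              increment_from_roots(nu1, nu2, beta1, beta2, e0))
```

Under these assumptions the increment is negative below the threshold, vanishes at E₀² = ν₁ν₂/(β₁β₂), and is positive above it, which is the condition for the onset of the parametric instability.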
Expression (2.5) allows one to obtain the following restriction on the magnitude of the pump field E showing the possibility of the realization of the parametric instability E 'E , which gives the following pump field threshold values for the low-frequency and high-frequency cases, respectively:
u c mm X l l ; LF case E " e 2 u X
(2.6)
mc u u (l l . HF case E " (2.7) e u X If the damping decrement l satisfies the inequalities l
dissipation losses in the system. The dependence of E in the high-frequency case on the external magnetic field value is weak and appears in the next approximation with respect to the small parameter u/X;1. The value of the instability increment in this case is shown to decrease with the parameter X /u for the fixed values of u , u , l and E . It is necessary to point out that the results associated with the excitation of HF waves can be appropriate for a semiconductor plasma if, however, one makes the following substitutions: X to X /(e , where e is * * the semiconductor lattice dielectric constant, and m to m , where m is the electron’s effect ive mass in the semiconductor. The semiconductor plasma is characterized by larger values of the dissipation decrements l and l , but the electron’s effective mass is significantly smaller than m . This leads to the fact that the threshold pump field values are of the same order in solid state plasmas. The next step is to discuss the possibility of the saturation of the instability studied and to estimate the instability saturation levels. Estimates of the latter levels can provide valuable information about the subsequent nonlinear evolution of the excited waves as well as help one to reach conclusions about the efficiency of the saturation mechanism studied. Here we investigate the possibility of the saturation of the parametric instability due to the self-interaction of the excited SW. The action of the nonlinear self-interaction mechanism on SW propagation is discussed in detail in the papers [98,99] (see also Section 3 of the present review). Here we only use the results for the SW nonlinear frequency shifts from the original papers [98,99] and briefly describe the procedure of deriving the basic set of coupled equations. 
So, to obtain the set of nonlinear equations describing the parametric excitation of the SW with the nonlinear self-interaction effect taken into account, we start from the nonlinear quasihydrodynamic and Maxwell equations. We retain the nonlinearities that are cubic in the excited SW amplitudes, of the form $|E_j|^2 E_j$, $j = 1, 2$, and use the method described in [100]. The self-interaction channels taken into account are second harmonic generation ($2\omega - \omega = \omega$) and static surface perturbations arising from the action of a ponderomotive force ($0 + \omega = \omega$). Following the standard procedure used above to obtain the set (2.3), we obtain the following set of equations for the slowly varying amplitudes $E_1$ and $E_2$:

$$\frac{\partial E_1}{\partial t} + \nu_1 E_1 = i\beta_1 E_2^* E_0 + i m_1 |E_1|^2 E_1 ,$$

$$\frac{\partial E_2}{\partial t} + \nu_2 E_2 = i\beta_2 E_1^* E_0 + i m_2 |E_2|^2 E_2 , \qquad (2.8)$$

where $m_1$ and $m_2$ are the self-interaction coefficients of the excited waves, which may be found in [98] for the LF case and in [99] for the HF case. We do not present them here because they are too lengthy. The corresponding nonlinear frequency shifts of the excited waves are given by

$$\Delta\omega_1^{NL} = m_1 |E_1|^2 , \qquad \Delta\omega_2^{NL} = m_2 |E_2|^2 .$$
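To see how a nonlinear frequency shift of this kind can limit the growth of the excited waves, it helps to look at a deliberately simplified case. The sketch below is an illustration under stated assumptions and is not the result (2.9): it assumes a set of the form (2.8) with two identical waves ($\nu_1 = \nu_2 = \nu$, $\beta_1 = \beta_2 = \beta$, $m_1 = m_2 = m$, $E_1 = E_2 = E$), real coefficients, and a pump amplitude above threshold, $\beta E_0 > \nu$. Writing $E = A\exp(i\varphi)$ and setting the time derivative to zero gives $\sin 2\varphi = \nu/(\beta E_0)$ and $\cos 2\varphi = -mA^2/(\beta E_0)$, hence $A^2 = \sqrt{(\beta E_0)^2 - \nu^2}/|m|$. The function names `rhs` and `stationary_state` are hypothetical.

```python
import cmath
import math


def rhs(E, E0, nu, beta, m):
    """dE/dt for the assumed symmetric model
    dE/dt = -nu*E + i*beta*E0*conj(E) + i*m*|E|^2*E."""
    return -nu * E + 1j * beta * E0 * E.conjugate() + 1j * m * abs(E) ** 2 * E


def stationary_state(E0, nu, beta, m):
    """Nonzero fixed point of the symmetric model (illustrative assumption).

    With E = A*exp(i*phi), dE/dt = 0 requires
        sin(2*phi) = nu / (beta*E0),
        cos(2*phi) = -m*A**2 / (beta*E0),
    so that A**2 = sqrt((beta*E0)**2 - nu**2) / |m|.
    """
    drive = beta * E0
    if drive <= nu:
        raise ValueError("pump below threshold: no nonzero stationary state")
    if m == 0:
        raise ValueError("m = 0: self-interaction cannot saturate the growth")
    amp_sq = math.sqrt(drive**2 - nu**2) / abs(m)
    sin2 = nu / drive
    cos2 = -m * amp_sq / drive
    phi = 0.5 * math.atan2(sin2, cos2)
    return math.sqrt(amp_sq) * cmath.exp(1j * phi)


if __name__ == "__main__":
    E = stationary_state(E0=1.5, nu=1.0, beta=1.0, m=0.2)
    # The residual of the right-hand side should vanish at the fixed point.
    print(abs(E) ** 2, abs(rhs(E, 1.5, 1.0, 1.0, 0.2)))
```

The sketch only verifies that the amplitude is a fixed point of the model equation. Whether the system actually relaxes to it, and whether the resulting amplitude stays within the weak nonlinearity approximation, has to be checked separately.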
The occurrence of the frequency shifts leads to a mismatch of the decay conditions (2.2) and, therefore, can lead to the saturation of the parametric instability [101]. Assuming
364
N.A. Azarenkov, K.N. Ostrikov / Physics Reports 308 (1999) 333—428
$E_j = |E_j| \exp(i\theta_j)$, $\theta = \theta_1 + \theta_2$ in the set (2.8), it is not difficult to obtain the following values of the instability saturation levels due to the action of self-interaction effects:
b b \ b l E!1 m #m "E " "(l #l ) , (2.9) b l l l b l "E " " "E " . b l Numerical estimates of the instability saturation levels (2.9) show that in the LF case these values of the stationary amplitudes are outside the weak nonlinearity assumption used here (k ;1, where k "» /» are the nonlinearity parameters of the first and the second waves, » is the # # characteristic velocity of electronic oscillations in the SW field, and » is the SW phase velocity). This means that the nonlinear self-interaction effect is not the effective mechanism for the saturation of the parametric instability in the LF case in the framework of a weak nonlinearity approximation. So, strongly nonlinear solutions can be formed as a result of the evolution of this instability. Another situation is appropriate for the HF case, when the stationary amplitude values (2.9) correspond to a weak nonlinearity approximation. This fact enables us to state that the instability saturation which is due to the self-interaction of the excited HF waves can really take place and the levels of the instability saturation are defined by Eq. (2.9). The corresponding nonlinear solutions which are formed as a result of the evolution of the instability can be described in a weak nonlinearity approximation. However, the following restriction for the electrons effective collision frequency and the pump field amplitude is to be satisfied:
E u l !1; , (2.10) E X u so as to satisfy k ;1. The inequality (2.10) shows that the above-mentioned instability saturation takes place if the dissipative losses are small enough and the amplitude of the pump field does not greatly exceed its threshold value. Note that the inequality k ;1 also provides a restriction analogous to (2.10), but it is less rigid than (2.10) and there is no need to present it here. Estimates also show that the stationary amplitude values of the excited SW are smaller than the pump field threshold amplitude and that probably there is no further decay of the excited waves if, however, the thresholds of the latter processes are of the same order as the calculated E values. 2.1.1. The influence of the near-wall transient layer on the parametric excitation of S¼ In the preceding section we discussed the possibility of the parametric excitation of surface waves propagating across the external magnetic field in an external electric pump field directed normally to the interface. The analysis was carried out under the assumption of a homogeneous plasma-like medium. Here we take into account the near-wall transient layer where the plasma density or (and) an external magnetic field are inhomogeneous. This inhomogeneity under some conditions is shown to decrease the instability threshold and, therefore, facilitates the excitation of the plasma—metal structure studied. The geometry of the problem is the same as that considered above in the preceding section. However, the near-wall transient layer of thickness d separates a plasmalike medium from the metal surface. The following inequalities are assumed to be satisfied: j ;d;k , # H
j ;d;(s(d))\;¸ , #
(2.11)
where k "(u/c)(e (d) is the absolute value of the SW wavenumber, s(x)"(ue (x))/ H ce (x), (s(d))\ is the skin depth, ¸ the characteristic scale of the pump field localization along x axis, j that of the electronic oscillations in the pump field (2.1), e (x)"e (x)#ie (x) are the # components of the cold magnetoactive plasma dielectric tensor, and c"1/(e k ). If conditions (2.11) are satisfied, the near-wall transient layer can be assumed thin compared with the characteristic scales of the SW propagation and the method of successive approximations can be used. Note that the pump field (2.1) can be realized, e.g., in a planar capacitor of width ¸. If the condition s\(d);¸ is satisfied the wave processes can be considered at only one of the capacitor surfaces, i.e. one can use a semi-infinite plasma approximation. Here we study the influence of the near-wall transient layer only in application to the excitation of LF waves [102]. The HF case can be considered analogously. Let us briefly recall that the parametric excitation of the SW with the frequencies u and u and wavenumbers k and k satisfying Eq. (2.2) is being considered. The frequency u is close to the low-hybrid one, and the frequency u satisfies u;u. The SW wavenumbers are given by: k "k(u )"!(X (d)/c)a (d) , k "k(u )"u /» (d) , where a(x)"u(u (x)u (x))\!1, and » (x)"B (x)[k m n (x)]\ is the Alfven velocity. The presence of the near-wall transient layer leads to a shift of the SW frequency as well as to an additional damping of the SW in the regions of local hybrid resonances [7]. The component of the SW electric field normal to the interface significantly increases in the regions of local resonances, and this enhances the nonlinear effects [103—105]. This fact allows one to expect changes in the threshold value of the pump field and in the instability increment. 
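As an order-of-magnitude illustration of the second wavenumber above, which has the form $\omega/V_A(d)$, the sketch below evaluates the Alfven velocity $V_A = B_0/\sqrt{\mu_0 m_i n_0}$ and the corresponding wavenumber. The parameter values in the example (protons, $B_0 = 0.1$ T, $n_0 = 10^{18}$ m$^{-3}$, $\omega = 10^{6}$ s$^{-1}$) and the function names are assumptions chosen for illustration; they are not taken from the text.

```python
import math

MU_0 = 4.0e-7 * math.pi        # vacuum permeability, H/m (approximate)
PROTON_MASS = 1.67262192e-27   # kg


def alfven_velocity(B0, n0, ion_mass=PROTON_MASS):
    """Alfven velocity V_A = B0 / sqrt(mu_0 * m_i * n0), SI units."""
    if n0 <= 0 or ion_mass <= 0:
        raise ValueError("density and ion mass must be positive")
    return B0 / math.sqrt(MU_0 * ion_mass * n0)


def lf_wavenumber(omega, B0, n0, ion_mass=PROTON_MASS):
    """Wavenumber of the form k = omega / V_A used for the LF wave."""
    return omega / alfven_velocity(B0, n0, ion_mass)


if __name__ == "__main__":
    # Illustrative values only (assumed, not from the source).
    print(alfven_velocity(B0=0.1, n0=1e18))
    print(lf_wavenumber(omega=1e6, B0=0.1, n0=1e18))
```

Because $V_A$ depends on the local density and magnetic field, evaluating it at $x = d$ rather than inside the transient layer is what ties the wavenumber to the homogeneous plasma parameters.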
For the sake of definiteness we consider the situation when in the transient layer there is a point of the local low-hybrid resonance for the wave with u and k , i.e. u "u (x ), x (d. One can easily show that the J inequalities H (x )'H (d) and (or) n (x )'n (d) have to be satisfied for the equation u "u (x ) J to have a solution. Under these conditions the SW with u and k is not damped in the transient layer, its frequency shift due to resonant damping can be neglected, and the plasma medium can be assumed homogeneous in its field. This is due to the fact that (s (d))\<(s (d))\ [44] and the resonant frequency shift of each wave is proportional to s (d)d;1 [7]. H By solving the set of Maxwell and quasihydrodynamic equations, it is easy to obtain the expressions for the SW electromagnetic field components. Moreover, one has to take into account a weak collisional dissipation to obtain finite values of the normal component of the SW electric field and not to go outside the weak nonlinearity approximation. In the case studied the set of parametrically coupled equations for the slowly varying amplitudes of the excited SW has the following form: jE #l E " i(b #b )E E* , jt jE #l E #iDu E !Du E "ib E E* , jt
(2.12)
where b , b are given in Table 1, b is the change in b due to the presence of the near-wall transient layer, m u(x)u(x) j p (x) B u (d) u (x) ¼ (x)# , b "b #ib "i» (d) G m (u!u(x)) jx u(x) u u
u (d)a(d) B X(x)u (x) dx , f (x)"B "f (d)!f (0), Du "! G p(x) 2» (d) is the linear shift of the SW frequency u , Du "nu (d)a(d)(2» (d))\u (x )X(x )(p (x)/x)\ (0 V is the SW resonant damping decrement,
jp (x) p(x)u (d)a (d) j m(u ,x) ! jx jx X(x) » (d) eX(x)u u (d)a (d) u(x)" , 2m » (d)u(x)u p(x) p (x)"p (x)#ip (x)"(u(x)#X(x))d(x)#ic u\X(x) , V d(x)"(u!u(x))u\, m(u , x)" e (u , x)e\(u , x) dx, e (u , x)"p (x)u\(x), e (u , x)"!X(x)(u u (x))\ . In a homogeneous plasma-like medium (d"0) the coefficients b , Du , Du vanish. The increment of the parametric instability is easily obtained from Eq. (2.12): ¼ (x)"iu(x)
c "(g!p)/2'2 , (2.13) where g"(!h/2#2\((h)#(h)), p"c #c !Du , h"!(l !l )#(Du )!(Du ) #2Du (l !l )!4b (b #b )E, and h"!2Du (l !l )#2Du Du !4b b E. The threshold value of the pump field in an inhomogeneous plasma easily follows from Eq. (2.13):
4(b b ) l (l !Du ) pb (b #b ) 1! 1# . (2.14) (E )"! (b #b ) pb 2(b b ) The latter expression is valid if the SW frequency shift "Du " is small compared with "Du " and l . The solutions (2.13) and (2.14) for the instability increment and threshold obtained above reduce to expressions (2.5) and (2.6) in the limiting case of the absence of the near-wall transient layer (d"0). The subsequent analysis has been carried for the situation where the plasma density is inhomogeneous in the transient layer, while an external magnetic field H is homogeneous. The plasma density profile n (x) is assumed to satisfy jn (x)/jx" "jn (x)/jx" ). In this case the B expressions for b and b can be significantly simplified: jX(x) eu(d)a (d)d , (2.15) b " 2m u u u X(d)d(d) jx VB b "!b .
To make a qualitative analysis of the influence of the near-wall transient layer on the excitation of the plasma-like medium—metal structure, we consider two limiting cases. In the first one we assume the terms associated with the inhomogeneity to be small, viz. b /b ;1, "b "b ;1, "Du ";l , "Du ";l . Such a situation occurs if the inhomogeneity in the near-wall transient layer is weak. The expression for the threshold value of the pump field in this case can be written as
"Du " b ! , E "E 1# (2.16) 2l 2b where E is the value of the pump field in a homogeneous plasma and is given by Eq. (2.6). Expressions (2.15) and (2.16) show that two additional terms in Eq. (2.16), which are due to the inhomogeneity, have the opposite signs: the resonant damping taking into account (Du ) leads to an increase of the instability threshold, and an increase of the SW coupling coefficient compared with that in a homogeneous plasma (b ) leads to a decrease of the instability threshold. It is possible to show that in our case the second effect prevails over the first one. So, the decrease of the instability threshold compared with the homogeneous plasma case takes place. In the limiting case studied the instability increment can be written in the following way: c "c #c , j"1, 2, c ;c (2.17) H H H H where c "!(l /2)(1!(1#4(E /E )l /l ) c "l (b /b )(E /E ) in the limiting case (l /2l )&2E/(E )<1 (the pump field significantly exceeds the instability threshold), and c "l [(E /E )!1] c "l (b /b )(E /E )#(Du )/4l #(3/4)(Du /l )l #l Du /l in the limiting case (l /2l )<2E/(E )&1 (dissipative parametric instability). In both cases taking the inhomogeneity into account leads to an increase of the parametric instability increment. In another limiting case, when the inhomogeneity in the near-wall transient layer is strong (b ,"b "
Fig. 2.1. The dependence of the threshold pump field value on the dimensionless inhomogeneity parameter; l /u "0.1; l /u "0.2; a (d)"d(d)"0.25; u /u "4; X(d)/u"10; x (d), d"0.05. Fig. 2.2. The instability increment for E "1.5E and the same parameters as in Fig. 2.1.
easy to show that the ratio b /b &S. Therefore, it is reasonable to expect that the increase of S leads to a decrease of E and to an increase of c . This fact is illustrated in Figs. 2.1 and 2.2, where the dependences of the normalized threshold pump field value and the instability increment on S are presented. Fig. 2.1 shows that the value of E decreases with an increase of S and is smaller than the value of the pump field in a homogeneous plasma. It is easily seen from Fig. 2.2 that the increment exceeds the corresponding value in the absence of the transient layer and increases with an increase of the plasma density inhomogeneity parameter. The E and c values significantly differ from EP and c even for S(0.7. To end this section, we state that the increase of the amplitudes of the excited waves can lead to a plasma density redistribution in the vicinity of the resonant point, and this fact has to be taken into account in the framework of the dynamic problem of the excitation of the plasma-like medium—metal waveguide structure. 2.2. Parametric excitation of SW in a plasma layer In this section we briefly discuss the possibility of the parametric excitation of surface waves in a plasma layer bounded by a metal in a high-frequency electric pump field [106]. A layer of a homogeneous plasma-like medium occupies the space !a(x(a. An external magnetic field B is directed along the z axis. In the layer the spatially homogeneous electric pump field of frequency u and directed normally to its interfaces is excited: E "E cos u t . V The SW propagate at both interfaces between the plasma-like medium and a metal along the y-axis. The dependence of all wave perturbations on the y-coordinate and time t is taken in the form exp[i(k y!ut)], where k are the wavenumbers of the SW propagating at the upper and
the lower boundaries of the layer, respectively (due to the nonreciprocal nature of the propagation of the waves studied k is always equal to !k [51]). We study the excitation of LF and HF waves with eigenfrequencies satisfying u;u and u;u, respectively. In both cases the conditions of symmetric decay u "u#u, k "!k are satisfied. Under these conditions the parametric interaction between the SW propagating at the different plasma layer boundaries can take place. This interaction is shown to be unstable. It is not difficult to show that to provide a stable generation of waves with frequencies u and to avoid the generation of undesirable higher harmonics the inequalities 4u"u:u are to be satisfied. In this case the amplitudes of the excited waves of frequency u are maximal due to the fact that the second and other higher harmonics of the SW are the forced perturbations of the system and do not satisfy the conditions for their resonant generation. The set of parametrically coupled equations describing the interaction of the SW propagating at the different boundaries of the layer can be written in the following form:
j j #» E #lE "iaE E , jy jt
(2.20)
where l is a decrement of the SW collisional damping, » is the SW group velocity, » "» for LF waves, and » "cu /X for HF waves, a is the coefficient of the SW parametric coupling, c"1/(e k ). a"3euua[2m » (uu) sinh(2s a)]\ s "u(u » )\, (u;u) a"eXua[mcu(u!u)e sinh(2s a)]\ * s "X/cu (e , (u;u) . * The set (2.20) is valid when the inequality "E ";E is satisfied. The latter situation happens at the initial stage of parametric instability when the influence of the excited waves on the pump field can be neglected. Assuming E "E exp[i(fy!Xt)], j"1, 2, it is not difficult to derive the parametric H H instability increment: c"Im X"aE!l .
(2.21)
The latter equality yields the following values of the pump field threshold amplitude: 2lm » (u!u) sinh(2s a) (2.22) E " 3euau in the low-frequency case (u;u), and lm ce u(u!u) sinh(2s a) E " * (2.23) euaX in the high-frequency case (u;u). It is necessary to point out that the parametric instability increment in our case coincides with that of a temporal problem (j/jy"0). This corresponds to a well-known fact: the maximal increment equal to that of a temporal problem is reached when two waves move in different directions with the same group velocities [107].
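The statement that the maximal increment coincides with that of the temporal problem can be checked on a minimal model. The sketch below assumes that the set (2.20) has the counter-propagating form $(\partial/\partial t \pm V\,\partial/\partial y)E_{1,2} + \nu E_{1,2} = i\alpha E_0 E_{2,1}^*$ with real $\alpha$, $\nu$ and $V$; this sign convention, the symbol $\zeta$ for the envelope wavenumber, and the function names are assumptions. Substituting $E_1, E_2^* \propto \exp[i(\zeta y - \Omega t)]$ then gives $(\nu - i\Omega)^2 + V^2\zeta^2 = \alpha^2 E_0^2$, so that $\mathrm{Im}\,\Omega = -\nu + \mathrm{Re}\sqrt{\alpha^2 E_0^2 - V^2\zeta^2}$, which reduces to Eq. (2.21) at $\zeta = 0$.

```python
import cmath


def increment(alpha, E0, nu, V=0.0, zeta=0.0):
    """Growth rate Im(Omega) of the growing branch for the assumed
    counter-propagating model; zeta is the envelope wavenumber."""
    s = cmath.sqrt((alpha * E0) ** 2 - (V * zeta) ** 2)
    # For V*zeta > alpha*E0 the root is purely imaginary and only damping remains.
    return s.real - nu


def threshold_field(alpha, nu):
    """Pump amplitude at which the zeta = 0 increment vanishes."""
    if alpha <= 0:
        raise ValueError("coupling coefficient must be positive")
    return nu / alpha


if __name__ == "__main__":
    alpha, nu = 2.0, 1.0
    E_th = threshold_field(alpha, nu)
    for zeta in (0.0, 1.0, 2.0):
        print(zeta, increment(alpha, 3.0 * E_th, nu, V=1.0, zeta=zeta))
```

In this model any spatial modulation of the envelope ($\zeta \neq 0$) lowers the growth rate, so the largest increment is the temporal one, $\alpha E_0 - \nu$, and the threshold is $E_0 = \nu/\alpha$.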
Now let us make some numerical estimates for the threshold pump field values (2.22), (2.23) and the parametric instability increment (2.21) for various parameters of the structure studied. We start with the low-frequency waves with the frequency u"u /3 in a weakly collisional (l"u/5) plasma with the density n "10 m\ and immersed in an external magnetic field B "0.1 Tl. For these parameters and a plasma layer thickness 2a"0.01 m the threshold pump field value is about 3 kV/m and about 3.15 kV/m for 2a"0.05 m. It is not difficult to see that the increase of the layer thickness leads to an increase of the value of E . The relative increment c/u is about 0.18 for 2a"0.01 m, E "4 kV/m and the same n , B , u, l values, and about 0.14 for 2a"0.05 m and the same values of E , n , B , u, l. The increase of the layer thickness leads to a decrease of the SW coupling coefficient and hence to a decrease of the parametric instability increment. It should be also noted that the increase of an external magnetic field intensity leads to a decrease of c. The estimates for HF waves have been carried out in application to semiconductor plasmas [106]. These estimates show that the pump field threshold values are higher than in the LF case. One can easily show that to excite the SW with u"u /3 in a thin film of n-GaAs (n "5;10 m\, e "13, m "6;10\ kg, l"10 s\, 2a"1 lm) immersed in an ex * ternal magnetic field B "0.6 Tl, it is necessary to apply a voltage 0.19 V (E "190 kV/m) to the structure. The increase of the film thickness also leads to an increase of E . E is about 220 kV/m for 2a"2 lm (a voltage of 0.4 V is to be applied). The relative instability increment c/u is about 0.11 for 2a"2 lm and E "230 kV/m, and about 0.15 for 2a"1 lm and the same E value. Note that as in the LF case the value of c decreases with an increase of the layer thickness and with an increase of the value of B . 2.3. 
Excitation of SW in crossed E and H fields As was shown in [44], in the presence of an external magnetic field parallel to the plasma-like medium—metal interface SW can propagate in one particular direction along the interface (unidirectional waves). This direction is normal to the imposed magnetic field and varies for the different frequency ranges in which surface waves can propagate. Because the unidirectional waves under consideration travel across the magnetic field, their excitation by beams of charged particles is troublesome. In this section we study the drift mechanism for exciting these unidirectional waves [108]. This drift is caused by crossed external magnetic and electric fields. We consider the case when the electric field is transverse to both the interface and the external magnetic field, and when the direction of the resulting drift coincides exactly with the direction of SW propagation. We will study the conditions for the excitation of unidirectional SW in a magnetoplasma and will find the excitation efficiency. Below, for brevity, a plasma-like medium (either gaseous or solid state) will be referred to as a plasma. Let a homogeneous plasma occupy the half-space x'0 and be bounded at x"0 by a perfectly conducting metal surface. Let external steady magnetic B and electric E fields be directed along the z- and x-axis, respectively. We consider perturbations in the form of a SW with an amplitude decreasing with the distance from the plasma—metal interface, and assume that the wave propagates transverse to both B and E , i.e. along the y-axis. We assume the dependence of the perturbations on the y-coordinate and time t in the form exp[i(k y!ut)], where u and k are the frequency and the wavenumber of the SW, respectively. From the equations of two-fluid quasi-
hydrodynamics it follows that all the plasma particles drift with the constant velocity u"(E ;B )/B either in the direction of or opposite to the y-axis. From the set of Maxwell equations and the equations of quasihydrodynamics for a cold magnetized plasma, we find the following expressions for the components of the electromagnetic field of the SW: H (x)"A exp(!jx) , X E (x)"i[k (k #b)!keJ ]H (x)/[k(eJ j!k eJ )](k e ) , W X E (x)"(keJ !j(k #b))H (x)/[k(e j!k e )](k e ) , V X where we use the following notation:
(2.24)
e (ke !k)#e (e !e )(u)/c!(keJ ) , j" (e (u/c)!1)(e !e )!e u X X ? , e " ? ? , e "e # * u u!u u!u ? ? ? ? u X ? u"u!k u, eJ "e # , u u!u ? ? u u X u X ? , b" k ? , eJ " ? c u u!u u u!u ? ? ? ? X"n q/m e , u "e B /m ; k"u/c; u""u"; c"1/(k e ) , ? ? ? ? ? ? ? B ""B " , e is the dielectric function of the crystal lattice (for a gaseous plasma e "1); and n , q , m are the * * M? ? ? density, the charge number, and the effective mass of a-species particles. Using the boundary condition for the tangential component of the electric field, which vanishes at the plasma—metal interface (E (x"0)"0), we find the following dispersion relation for the SW: W k (k #b)!keJ "0 . (2.25) In the absence of an electric field (E "0), we can reduce the dispersion relation to the form k!ke "0 , (2.25a) which was obtained in [44]. In Eq. (2.25a) the notation e "eJ (E "0) is used. The solution of Eq. (2.25) has the form k "!b/2#k+eJ [1#(b/k)/4eJ ],. (2.26) From the dispersion relation (2.25), we can find that SW with frequencies in the range determined by the inequality eJ 'b/4k are possible. We assume that the drift velocity of the particles is small (u;c). In this case, the quantity b/4k is negligible, and a SW can propagate with a frequency in the same ranges as in the absence of an electric field [44]. Relation (2.25) shows that, in both gaseous and semiconductor plasmas, a SW with the frequency u"k u#nu , can be excited via the Doppler nonresonant mechanism (where n"$1 and
k is the wavenumber of the wave that satisfies Eq. (2.25a) when E "0). We seek the solution of the equation in the form k "k #dk , where we assume (dk );(k ), u/c;1, and dk u;u,u . Note that for a gaseous plasma, the additional assumption should be made that u;v (where v is the phase velocity of the SW). Under these conditions, the solution for the correction to the wavenumber dk can be obtained in the form Xnu 2b , (2.27) dk " !b # b#2b uc
where b "k #X/c!ke and b "2k !uX/(2nu c). The surface wave can be excited when the term under the radical in expression (2.27) is negative. To be specific, we consider the case when u;» and the excitation is possible for negative values of k n/u. In the frequency range u (u(u , for n"1 and k /"k "(0, only a drift along the * y-axis can cause the instability. This corresponds to "u"/u'0 and determines the condition E e "!E for the orientation of the x-component of the electric field E . V From Eq. (2.27), we can find an expression for "Im(dk )": X (u!u) 1 X u . (2.28) ! "Im(dk )"" 4uc 16 c u!u When the inequality
1uX u uX ( 1! (4 (2.29) 4cu u cu holds, the condition (dk );k is satisfied, and the expression under the radical in formula (2.27) is negative. From Eq. (2.29), we can see that, in the case of interest, the frequency of the SW should be close to the cyclotron frequency u . This is in agreement with our assumptions. To be specific, we consider the characteristic parameters of the problem under conditions typical of a gas discharge plasma: B "0.02 Tl; E "6 kV/m; X /u "10. Under these conditions, the drift velocity of the plasma particles is u"3;10 m/s, and the phase velocity of the wave is » "10 m/s, so that the inequality k u;u is easily satisfied. When the wave frequency is u"0.99u , from Eq. (2.28) we find that the value of the normalized spatial growth rate of the instability "Im(dk )/k " is 8.8;10\. The instability occurs when the typical length scale corresponding to the instability growth rate is smaller than the characteristic decay length of the SW. Numerical estimates show that it is easier to satisfy this condition in a gas discharge plasma than in a semiconductor plasma. This is due to the fairly large frequency of electron collisions l in a semiconductor. For a solid-state plasma the range of parameters where the above condition holds is the following: n&10 m\, B " 0.1—0.5 Tl, e &12—16, E &30—150 kV/m, and l /u&10\—10\. These parameters are typical for * experiments on the excitation of surface waves in a semiconductor in the presence of a strong magnetic field [109]. When the condition mentioned above does not hold, the instability does not arise. We note that in a solid-state plasma, in addition to the instability that we consider here, the ion sound instability is possible. This instability occurs when the relative drift of the charge carriers and the electric current are collinear: u "u(l /u ) . V
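As a quick check of the drift velocity quoted above for the gas discharge parameters ($B_0 = 0.02$ T, $E_0 = 6$ kV/m), the sketch below evaluates $\mathbf{u} = (\mathbf{E}_0 \times \mathbf{B}_0)/B_0^2$ with $\mathbf{E}_0$ along the x-axis and $\mathbf{B}_0$ along the z-axis, as in the geometry of this section. The function name `exb_drift` is hypothetical, and choosing both fields along the positive axes is an assumption made for the example.

```python
def exb_drift(E, B):
    """Drift velocity u = (E x B) / |B|^2 for 3-component tuples in SI units."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    B2 = Bx * Bx + By * By + Bz * Bz
    if B2 == 0:
        raise ValueError("magnetic field must be nonzero")
    return (
        (Ey * Bz - Ez * By) / B2,
        (Ez * Bx - Ex * Bz) / B2,
        (Ex * By - Ey * Bx) / B2,
    )


if __name__ == "__main__":
    # E0 = 6 kV/m along +x, B0 = 0.02 T along +z (values quoted in the text).
    u = exb_drift((6.0e3, 0.0, 0.0), (0.0, 0.0, 0.02))
    print(u)  # magnitude E0/B0, about 3e5 m/s, directed along -y
```

For this orientation the drift is directed along the negative y-axis with magnitude $E_0/B_0 \approx 3\times10^{5}$ m/s; reversing the sign of $E_0$ reverses the drift, which is how the direction of the drift is matched to the direction of SW propagation.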
Under the conditions of the experiment [109], when the value of u exceeds the value of the sound V velocity of the holes, the instability arises that was studied in detail in [110]. For the same parameters, comparing the normalized growth rate of this instability c/u&10\ and the growth rate corresponding to the excitation of the SW in crossed E and B fields, we find that the latter growth rate is substantially greater. We note that the scheme of SW excitation that we discuss here can also be used to resonantly excite (when u"k u#nu ) a wave with a frequency u(u in a gaseous plasma. In this case the instability can arise when the conditions k /"k "'0 (excited waves propagate along the y-axis in the positive direction), n"1, and "u"/u(0 hold; the expressions for dk , "Im(dk )" and the limitations for the allowed frequency range (for which the excitation is possible) coincide with those following from Eqs. (2.27), (2.28) and (2.29) after the replacements u to u , X to X . Our analytical study of expression (2.27) and the numerical solution of Eq. (2.25) show that, in the frequency range above the upper hybrid frequency, the value of dk is real, and no excitation of SW is possible. We studied the dispersion relation (2.25) numerically in order to understand the conditions under which the SW is excited, to obtain additional information about the dependences of the growth rate of the instability on the parameters, and to verify our analytical estimates. Our study has shown that, within the frequency range where the inequality uX 1uX ((1!(u/u ))( cu 2cu holds, the analytical results following from the formula (2.28) essentially coincide with the results of our numerical calculations. Figs. 2.3, 2.4 and 2.5 show the normalized spatial growth rate of the instability "Im(k )/Re(k )" as a function of the frequency in the range 0.9u (u(0.99u . Each figure shows several curves
Fig. 2.3. Normalized spatial growth rate as a function of frequency for several values of the drift velocity u: (1) 3;10 m/s, (2) 5;10 m/s, and (3) 10 m/s. The curves correspond to a gaseous plasma with the parameters u "3;10 1/s, and X /u "5. Fig. 2.4. Normalized spatial growth rate as a function of frequency for several values of the magnetic field. Curves 1, 2, 3 are for u "10, 3;10, 6;10 1/s, respectively. Other parameters are u"5;10 m/s and X "1.5;10 1/s.
Fig. 2.5. Normalized spatial growth rate as a function of frequency for several values of the plasma density. Curves 1, 2, 3 are for X /u "10, 5 and 3, respectively. Other parameters are u"10 m/s and u "3;10 1/s.
corresponding to different values of one of the parameters of the problem while the other parameters are fixed. Fig. 2.3 presents curves for different values of the drift velocity. It shows that both the spatial growth rate and the width of the instability region (in the frequency range where the excitation of a SW is possible) grow when the drift velocity increases. In all the cases, the value of "Im(k )/Re(k )" increases when the frequency of the wave approaches the electron cyclotron frequency. Fig. 2.4 presents the dependences of "Im(k )/Re(k )" on the frequency; different curves correspond to different values of the external magnetic field. This figure shows that both "Im(k )/Re(k )" and the width of the region in the frequency range (in which the SW can be excited) decrease when the electron cyclotron frequency increases. Fig. 2.5 shows the frequency dependences of the spatial growth rate for different values of the plasma density. The figure shows that the spatial growth rate increases with plasma density. We note that the behavior of the quantity "Im(k )/Re(k )" is the same for both a semiconductor plasma in the range u(u and a gaseous plasma for u(u . In all other frequency ranges, where the condition u!k u"nu does not hold, the correction ? to the wavenumber k is small. For small drift velocities ("u";v ), this correction can be obtained in the form: Xu ? ? . (2.30) dk "!k(u/c) (u!u) ? ? In this case, because dk is real, the excitation of SWs is not possible. Our numerical calculation confirms this conclusion. The values of dk obtained analytically and numerically are close. Since, in our approximate analysis, the value of dk is much smaller than k , the dispersion relation for SWs deviates only slightly from that in the case of the absence of a drift (u"0). In concluding this section, we note that the instability under consideration may cause the heating of charge carriers, which can be important in some cases. 
This heating can modify the dispersion characteristics of the waves considered. However, since the phase velocity of the wave substantially exceeds both the drift velocity and the electron thermal velocity, this modification is insignificant, and our model of a cold plasma is valid for describing the linear instability. Because the heating effect is proportional to the square of the SW field, it cannot be ignored in studying the nonlinear
dynamics of excited waves. We can also mention that the instability can lead to the formation of a double layer. However, both these effects are outside the scope of this review. 2.4. Parametric excitation of azimuthal surface waves In this section the parametric excitation of a specific type of waves existing in magnetoactive plasma waveguides — azimuthal surface waves (ASW) — is considered [111]. They are non-ordinary electromagnetic waves propagating in an azimuthal direction across an external magnetic field at the plasma—metal interface [53]. The waveguide structure studied is a cylindrical metal waveguide of radius a fully filled by a cold plasma and immersed in an axial magnetic field H . In this structure the ASW propagation is possible in the following frequency ranges [53] (see also Section 1.5 of the present review): u (u(u , u (u(u , (2.31) J where u "u /2#[u/4#X]. Moreover, these waves are strongly nonreciprocal and can propagate only in one azimuthal direction (they are also unidirectional waves) depending on the sign of the azimuthal wavenumber of the SW. So, in the first frequency range in Eq. (2.31) m (0 and in the second one m '0. This feature of ASW allows the simultaneous excitation of two waves in an azimuthally homogeneous pump field u (m "0). To do this, it is necessary to satisfy the following decay conditions m "m "m, u "u (m )#u (m ) (2.32) by the proper choice of the pump field frequency and the excited ASW mode number. The dependences u (m ) are presented in [54] for different limiting cases. Here we consider only H H high-frequency waves caused by the HF motions of the electrons. To be specific, let us consider the situation when the plasma waveguide is “wide” for both excited ASW. This case may be realized if the following inequality (a/d)<m(X /u ) (2.33) is satisfied, where d"c/X . The electric pump field is assumed homogeneous and directed radially: EI "E cos u t . 
(2.34) It should be pointed out that this field can be realized, e.g., in a coaxial cylindrical capacitor with plasma filling. To consider this field homogeneous, the value
1 dE q" E dr which characterizes the inhomogeneity of E in a cylindrical geometry, must satisfy q(r"a);s , j"1, 2 H where s is the inverse skin depth of ASW. For E &1/(r the latter inequality is reduced to s a<1 H H which is less rigid than Eq. (2.33) and there is no need to use it here. The result presented in this section may be used to study the excitation of ASW in such a capacitor in the case where the distance between its cylinders greatly exceeds the ASW skin depth, and the wave processes at the
N.A. Azarenkov, K.N. Ostrikov / Physics Reports 308 (1999) 333—428
different interfaces may be considered separately. If the inequality (2.33) is satisfied, the following expressions for the excited ASW eigenfrequencies are valid: u "mu d/a , u "u #2\mXd/2a . These expressions show that the frequency u is close to the upper hybrid one, while the second frequency satisfies u;u. To calculate the components of the electromagnetic fields of the ASW with frequencies u and u under conditions of their parametric interaction in the pump field (2.34), we start from Maxwell’s equations and the quasihydrodynamic equations for a cold magnetoactive plasma. Taking into account the coupling between the ASW, we have
me g (a)e\QH? (2ns a s# H Hs E E* , (2.35) D E "! H H H H s!s ae H H H where D "D(u ,m )"(m e )(ae )\#s is the ASW dispersion relation in the absence of the H H H H H H H parametric coupling, g (a)"g (a)#e e\g (a), g "!f (2u u\#mda\) , H H H H H a u (ms)\ , g "0, g "f 1! d u a u a u 1! (ms)\ # 1# (ms)\ , g "!f u d u d
e exp(s a) d f " , 2m u u d 2na e S exp(s a) mu u e !e , s" H H H, S"X u\ , f " 2m u u d 2na e c H e "e (u ), e "e (u ), c"1/(e k ), i, j"1,2, iOj . H H H H Replacing u by u #ij/jt and taking into account the weak dissipative losses in Eq. (2.35), one H H can easily obtain the following set of coupled equations for the slowly varying ASW amplitudes E : H jE H#l E "ib E E* , (2.36) H H H jt
jD \(2ns a m H H g (a)e\QH? s# He e\ where b " H s!s H ju a H H H are the coefficients of the ASW parametric coupling in the pump field (2.34), and l are the ASW H collisional damping decrements. The set (2.36) is valid at the initial stage of the evolution of the instability, when the influence of the excited waves on the pump field can be neglected. The set
Fig. 2.6. The dependence of the dimensionless parametric instability increment $\tilde{\gamma}$ of the ASW on the external magnetic field strength for $p = 20$, $E_0 = 1.3E_{th}$, $\nu_j/\omega_j = 0.1$. Curves 1-3 correspond to $m = 1, 2, 3$, respectively.
Fig. 2.7. The dependence of the ASW increment $\tilde{\gamma}$ on the plasma density for $d = a\omega_e/c = 7$ and the same parameters as in Fig. 2.6.
(2.36) yields the following expressions for the instability increment and the pump field threshold value:
$$\gamma = -\tfrac{1}{2}(\nu_1 + \nu_2) + \sqrt{\tfrac{1}{4}(\nu_1 - \nu_2)^2 + \beta_1\beta_2 E_0^2}, \qquad E_{th} = \sqrt{\nu_1\nu_2/(\beta_1\beta_2)}. \qquad (2.37)$$
Now let us make some numerical estimates of $E_{th}$ and $\gamma$ for several parameters of the waveguide structure studied. The value of $E_{th}$ is about 370 kV/m for $n_0 = 2\times10^{\ldots}$ m$^{-3}$, $B_0 = 0.05$ T, $\nu_j\omega_j^{-1} = 0.1$, $p = a/\delta = 20$, and $m = 1$. An increase of the mode number leads to an increase of the pump field threshold value: $E_{th}$ is about 420 kV/m for $m = 3$. The value of $E_{th}$ can be decreased by decreasing the plasma density or by increasing the external magnetic field. The instability threshold is decreased from 650 kV/m to 330 kV/m by decreasing the plasma density from $2\times10^{\ldots}$ m$^{-3}$ to $1\times10^{\ldots}$ m$^{-3}$ ($B_0 = 0.05$ T). Analogously, $E_{th}$ can be decreased from $1.1\times10^{\ldots}$ V/m to 200 kV/m by increasing the magnetic field from 0.025 to 0.13 T ($n_0 = 10^{\ldots}$ m$^{-3}$). Fig. 2.6 presents the dependence of the dimensionless instability increment $\tilde{\gamma} = (\gamma m_e c)(eE_0)^{-1}$ on the external magnetic field. The increment increases with $H_0$ according to the analytically obtained proportionality $\gamma \propto H_0$ ($\gamma \propto S^{-1/2}$). The dependence of the increment on the plasma density is plotted in Fig. 2.7. This figure shows that the instability increment decreases with increasing plasma density according to the analytically obtained proportionality $\gamma(n_0)$, which can be written as $\gamma \propto n_0^{-1/2}$ ($\gamma \propto 1/\sqrt{S}$).
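As a quick numerical check of Eq. (2.37), the sketch below evaluates the increment and the threshold field for given damping decrements and coupling coefficients. The function names and the sample values are illustrative assumptions and are not taken from the review; the coupling coefficients are assumed real with a positive product.

```python
import math


def instability_increment(e0, nu1, nu2, beta1, beta2):
    """Increment gamma of Eq. (2.37) for a pump amplitude e0.

    nu1, nu2     -- collisional damping decrements of the two waves
    beta1, beta2 -- coupling coefficients (assumed real, beta1*beta2 > 0)
    """
    mean_damping = 0.5 * (nu1 + nu2)
    under_root = 0.25 * (nu1 - nu2) ** 2 + beta1 * beta2 * e0 ** 2
    return -mean_damping + math.sqrt(under_root)


def threshold_field(nu1, nu2, beta1, beta2):
    """Pump amplitude at which the increment of Eq. (2.37) changes sign."""
    return math.sqrt(nu1 * nu2 / (beta1 * beta2))


# Hypothetical dimensionless inputs, chosen only to exercise the formulas.
nu1, nu2 = 0.1, 0.05
beta1, beta2 = 2.0e-3, 1.0e-3

e_th = threshold_field(nu1, nu2, beta1, beta2)
gamma_at_threshold = instability_increment(e_th, nu1, nu2, beta1, beta2)
gamma_above = instability_increment(1.3 * e_th, nu1, nu2, beta1, beta2)
gamma_no_pump = instability_increment(0.0, nu1, nu2, beta1, beta2)
```

At the threshold the increment vanishes, above it the waves grow, and without a pump the expression reduces to the decay rate of the less damped wave.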
It is not difficult to see from Figs. 2.6 and 2.7 that the first mode is characterized by larger values of the increment and "Dc /DS" than the modes with m'1 (D is the variation of the corresponding value). 2.5. Other mechanisms of SW excitation In this section we briefly cover the other problems associated with the excitation of surface-type waves in plasma-like medium—metal waveguide structures. In the paper [45] the excitation of the SW by a line source of magnetic current is considered. This paper is often referred to as the first one devoted to the excitation of surface waves at a plasma-like medium—metal interface. The field due to a line source of a magnetic current situated in a lossless plasma region above a perfectly conducting screen is considered when the uniform static magnetic field is applied throughout the plasma region parallel to the direction of the line source. It is shown that under certain conditions the surface waves are excited at the plasma-like medium—metal interface. The dependence of the efficiency of the SW excitation on the distance d of the line source from the interface is examined as well. Moreover, it is also shown that by a proper choice of d, it is possible to decrease the leading term in the asymptotic series for the radiation field significantly, and thereby obtain a surface wave field that is much stronger than the radiation field near the waveguiding surface. In the papers [112,113] problems connected with the beam mechanism for the excitation of the plasma—metal structure are considered. The excitation of slow waves with phase velocities » satisfying » ;» ;c (» is the electrons thermal velocity) is considered. The low density 2 2 electron and ion beams are assumed to be nonrelativistic and monoenergetic ones. In the paper [112] a finite electronic pressure plasma without an external magnetic field is assumed. 
Here an external magnetic field is chosen to satisfy X;u;X (X are electron plasma and beam frequencies, and u is the electron cyclotron frequency, respectively). The latter inequality allows one to consider the electron beam magnetized, and to neglect the influence of an external magnetic field in the plasma. The influence of the sheath region between a plasma and a metal on the studied excitation of the structure is demonstrated by studying separately the excitation of the SW by electron beams moving in the vacuum sheath region and in the plasma region. A comparison of the excitation efficiencies shows that the SW are more effectively excited by electron beams moving in the plasma region. The papers [113] are devoted to excitation of the SW propagating along an external magnetic field by ion and electron beams. (The low-frequency SW from the range u(u , where u is an ion cyclotron frequency, are excited by monoenergetic ion beams, and the high-frequency SW from the range u ;u(u , where u is the low-hybrid frequency are excited J J by monoenergetic electron beams.) The waves propagating along the beam are shown to be excited under conditions of Cherenkov resonance. Under conditions of a normal Doppler resonance the backward waves are excited. The corresponding instability increments are calculated. The subject of studies in the paper [114] is the examination of the SW excitation at the plasma-like medium—metal interface in a vacuum—plasma-layer—metal structure by an electromagnetic E wave incident from the vacuum. The frequency of the incident wave is chosen to provide the exponential decay of its field away from the plasma—vacuum interface. Such a field can excite two pairs of waves propagating in different directions at the plasma—vacuum and plasma—metal interfaces. The electromagnetic SW are excited at the plasma—vacuum interface and potential ones at the plasma—metal interface. Due to a significant difference of the excitation increments of
potential and electromagnetic SW the correlation between the efficiencies of wave excitation at the different boundaries can be effectively varied in their dependence on the structure parameters. Estimates of the structure parameters providing the most efficient excitation of the SW at the plasma-like medium—metal interface are presented. In [115,116] the diffraction mechanism of SW excitation is examined. The excitation of two types of the SW is considered, namely the SW in a cold magnetoactive plasma (Voigt geometry) [115] and in a plasma with a finite electron pressure [116]. Diffraction excitation can be realized when a surface wave propagating at a plasma—vacuum interface or an external electromagnetic wave are incident at the edge of a metal half-plane. The excitation efficiencies of the SW are presented as well in application to different situations. The papers [117,118] deal with mechanisms of excitation of the azimuthal surface waves in magnetoactive metal waveguides filled by the plasma-like medium differing from those considered in Section 2.4. Namely, in [117] the generation of the electromagnetic azimuthal surface waves by an angular relativistic electron beam in a gyrotron-like structure was examined. The dependence of the beam instability increments on the waveguide parameters is presented. The intervals of the effective ASW wavenumber k "mc/RX (R is the plasma column radius) providing the most effective excitation of wave perturbations are defined as well. In the paper [118] the excitation of the ASW in a cylindrical semiconductor structures in the presence of the electrons’ drift motions is studied. Azimuthal drift motion of the electrons is provided by crossed electric and magnetic fields (the magnetic field is directed along the waveguide axis while the electric field is directed in a radial direction). The dependence of the dissipative instability increments on the waveguide parameters is investigated.
3. Nonlinear theory of SW

This section is devoted to studies of the main nonlinear processes that are realized under the propagation of finite amplitude surface waves at a plasma-like medium-metal interface in the Voigt geometry. In all sections the results of investigations carried out within the framework of the weak nonlinearity approximation are presented. The criterion for the validity of this approximation is given in the appropriate places.

3.1. Resonant second harmonic generation

Here the nonlinear process of the resonant second harmonic generation of the surface waves propagating at a cold magnetoactive plasma-like medium-metal interface in the Voigt geometry is considered [119]. Two types of plasma-like medium, namely an n-type semiconductor [119] and a semiconductor superlattice [120], are considered. In both cases the external magnetic field $\mathbf{B}_0$ is directed parallel to the plasma-like medium-metal interface along the $z$ axis. The medium occupies the half-space $x > 0$ while the metal occupies the half-space $x < 0$. The second harmonic generation process is one of the basic nonlinear processes taking place in media with quadratic nonlinearities [17,121,122]. However, it should be emphasized that generally the second harmonic generated is not an eigenwave of the system. In the situation where the phase synchronism conditions discussed below are satisfied, the second harmonic is an eigenwave, and
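the efficiency of second harmonic generation is the highest. This case is called resonant second harmonic generation. In both cases, to describe the nonlinear process of second harmonic generation in the plasma-like medium-metal structure we start from the following quasihydrodynamic equations for the motion of the electrons in the SW field and the Maxwell equations:
$$\frac{\partial \mathbf{v}}{\partial t} + (\mathbf{v}\cdot\mathrm{grad})\,\mathbf{v} = -\frac{e}{m^*}\mathbf{E} - \frac{e}{m^*}[\mathbf{v}\times\mathbf{B}_0] - \frac{e}{m^*}[\mathbf{v}\times\mathbf{B}],$$
$$\frac{\partial n}{\partial t} + \mathrm{div}(n_0\mathbf{v}) + \mathrm{div}(n\mathbf{v}) = 0,$$
$$\mathrm{curl}\,\mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad \mathrm{curl}\,\mathbf{H} = \varepsilon_0\varepsilon_L\frac{\partial \mathbf{E}}{\partial t} - e n_0\mathbf{v} - e n\mathbf{v}, \qquad \mathbf{B} = \mu_0\mathbf{H}, \qquad (3.1)$$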
where $\mathbf{v} = d\mathcal{E}/d\mathbf{p}$ is the velocity of the electrons' motion in the SW field, $\mathbf{E}$ and $\mathbf{H}$ are the SW electric and magnetic field strengths, $n_0$ is the equilibrium density of the electrons, $n$ is the perturbation of the electron density due to wave processes, $e$ and $m^*$ are the charge and the effective mass of the electrons, $\mathcal{E}$ and $\mathbf{p}$ are the electron energy and quasimomentum, respectively, and $\varepsilon_L$ is the semiconductor lattice dielectric constant ($\varepsilon_L = 1$ in a gaseous plasma). The set (3.1) should be supplemented by the relation between the electrons' energy and quasimomentum. Here we assume this dependence to be parabolic ($\mathcal{E} = p^2/2m^*$). The influence of the nonparabolicity of this relation on resonant second harmonic generation is briefly discussed in this section, while the theory of the SW self-interaction caused by this nonlinearity is discussed in detail in Section 3.4. Only high-frequency (HF) wave processes which are due to HF motions of the electrons are considered, and all possible low-frequency elementary excitations such as ions, phonons, etc. are neglected.

To solve the set (3.1) we assume that the amplitude of the nonlinear surface wave satisfies $\mu = V_E/V_{ph} \ll 1$ ($V_E$ is the characteristic velocity of electronic oscillations in the wave field, $V_{ph}$ is the SW phase velocity). In this case the weak nonlinearity approximation is valid, and the solution of the nonlinear set of equations can be obtained using perturbation theory methods [100]. The solution of the set (3.1) for the wave perturbations can then be represented in the form of a series expansion in the harmonics of the nonlinear SW eigenfrequency $\omega$ and wavenumber $k$:
$$A(\mathbf{r}, t) = \sum_j A_j(x)\exp(i\Psi_j),$$
where $A_j(x)$ is the amplitude of the wave perturbation, $\Psi_j = j\Psi$, $\Psi = ky - \omega t$.

According to the results of the paper [123], in the weak nonlinearity approximation the ratio of the amplitudes of the $(j+1)$th and $j$th harmonics is proportional to the following expression:
$$A_{j+1}/A_j \propto \mu\,[D_{j+1}]^{-1},$$
where $D_j = D(k_j, \omega_j)$ is the dispersion relation coupling the eigenfrequency $\omega_j$ and the wavenumber $k_j$ of the $j$th harmonic, and $k_j(\omega_j)$ is the corresponding solution of the latter dispersion equation. When the $j$th harmonic is an eigenwave of the structure ($D_j = 0$) and the $(j+1)$th harmonic is not an eigenwave ($D_{j+1} \neq 0$), the ratio $A_{j+1}/A_j \propto \mu \ll 1$. In this case the efficiency of the generation of higher harmonics is low. The situation changes considerably when both harmonics considered are eigenwaves of the structure ($D_j = 0$, $D_{j+1} = 0$). In this case even for $\mu \ll 1$ the amplitudes of the $j$th
and ( j#1)th harmonics can be of the same order. Such a process is called resonant higher harmonic generation [121,124]. It should be noted that the latter process may be effective enough to be used in electronic frequency multipliers. In particular, resonant second harmonic generation is very promising for frequency doublers if, however, one can obtain the necessary efficiency of energy transfer from the first harmonic to the second [125]. It is not difficult to show that to satisfy the conditions D "D(k (u), u)"0 and D "D(2k (u), 2u)"0 the frequencies and the wavenumbers of the first and second harmonics must be coupled by the following spatial and temporal synchronism conditions [17,121,122,124] u#u"2u, 2k (u)"k (2u) . (3.2) To restrict our consideration only to the interaction of two harmonics, the third harmonics must be a non-eigenwave of the structure (D(3k (u), 3u)O0). In this case its amplitude is k\
(3.5)
5 H (x)"B exp(!s x)! k uX\A exp(!2s x) , X 3 s "X /c, k "(eX A)(m uu c)\, c"1/(k e ) , where A and B are the amplitudes of the first and second harmonics, respectively. One can easily see from Eq. (3.5) that the second harmonic does not disappear when the first harmonic signal is switched off (A"0). This means that it is an eigenwave of the structure studied. In this case the static amplitude approximation, which is usually used for the study of the generation of higher harmonics that are non-eigenwaves of the structure [100], becomes invalid. This is due to the fact that the latter approximation is valid when the influence of the excited waves on the pump wave can be neglected. If the excited waves are not eigenwaves, the latter is possible because the amplitudes of the excited waves are k\<1 times smaller than the pump wave amplitude [100]. In the case when both harmonics of the SW are eigenwaves, their amplitudes may become quantities of the same order and, thus, the influence of the excited waves on the pump wave cannot be neglected. The process of energy transfer becomes periodic and so the following dynamic equations for the amplitudes A and B are necessary jA jA #» , "ib ,A*B , jy jt jB jB #» , "ib ,A , jt jy
(3.6)
where b ,"b ,"!eu[2m cu ]\ are the coupling coefficients of the first and the second harmonics, and » ,"!c u /X is the SW group velocity. The solution of the set (3.6) for the temporal dependence of the first and the second harmonic amplitudes can be presented in the following form: a(q)"a #(a !a )sn[u(q), K] , where
(3.7)
K"[(a !a )(a !a )\] , u(q)"(a !a )q#sn\[(a (a !a )\), K] , a(q)"a!a"b!b, a "a(q"0), b "b(q"0) , b"Bb /u, a"Au\(b b , A"A exp(ih ) , B"B exp(ih ), q"ut, h"h !2h , h "h (q"0), a "a(1$(b /a )cos h ), a "b/3, sn is an elliptic function. The solution (3.7) describes a periodic process of energy transfer from the first harmonic to the second and vice versa. When the first harmonic is initially excited ("A "<"B "), its energy is transferred to the second harmonic within the following period of time:
4 ¹ " ds[(1!s)(1!Ks)]\ . ,* u
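The periodic energy exchange described by the set (3.6) can be illustrated numerically. The sketch below integrates the spatially homogeneous limit of the amplitude equations, $dA/dt = i\beta_1 A^{*}B$ and $dB/dt = i\beta_2 A^{2}$, with a fourth-order Runge-Kutta scheme. Dropping the $\partial/\partial y$ terms, setting $\beta_1 = \beta_2 = 1$, and working with dimensionless amplitudes are simplifying assumptions made here for illustration only; the function names are hypothetical, and the initial ratio $|A|/|B| = 10$ is just one choice satisfying $|A_0| \gg |B_0|$.

```python
def rhs(a, b, beta1, beta2):
    """Right-hand sides of dA/dt = i*beta1*conj(A)*B and dB/dt = i*beta2*A**2."""
    return 1j * beta1 * a.conjugate() * b, 1j * beta2 * a * a


def rk4_step(a, b, dt, beta1, beta2):
    """Advance both complex amplitudes by one classical Runge-Kutta step."""
    k1a, k1b = rhs(a, b, beta1, beta2)
    k2a, k2b = rhs(a + 0.5 * dt * k1a, b + 0.5 * dt * k1b, beta1, beta2)
    k3a, k3b = rhs(a + 0.5 * dt * k2a, b + 0.5 * dt * k2b, beta1, beta2)
    k4a, k4b = rhs(a + dt * k3a, b + dt * k3b, beta1, beta2)
    a_next = a + dt * (k1a + 2 * k2a + 2 * k3a + k4a) / 6
    b_next = b + dt * (k1b + 2 * k2b + 2 * k3b + k4b) / 6
    return a_next, b_next


def simulate(a0, b0, beta1=1.0, beta2=1.0, dt=1e-3, steps=10000):
    """Return a list of (time, |A|^2, |B|^2) samples."""
    a, b = complex(a0), complex(b0)
    history = [(0.0, abs(a) ** 2, abs(b) ** 2)]
    for n in range(1, steps + 1):
        a, b = rk4_step(a, b, dt, beta1, beta2)
        history.append((n * dt, abs(a) ** 2, abs(b) ** 2))
    return history


# First harmonic initially dominant: |A0| / |B0| = 10 (illustrative choice).
history = simulate(a0=1.0, b0=0.1)
total_intensity = [pa + pb for _, pa, pb in history]
peak_second = max(pb for _, _, pb in history)
```

With equal real coupling coefficients the sum $|A|^2 + |B|^2$ is conserved, so `total_intensity` serves as a check on the integrator, while `peak_second` shows that the bulk of the intensity passes to the second harmonic before returning to the first.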
Now let us briefly discuss the influence of self-interaction effects on resonant second harmonic generation (here we do not give the details because the self-interaction effects are discussed in detail in Sections 3.3, 3.4, 3.5 and 3.6 of this review). To do this it is necessary to introduce the term ib "A"A in the right-hand side of the first equation in (3.6). This term is responsible for the self-interaction of the pump wave with a fundamental frequency u. The set of equations obtained is valid at the initial stage of the energy transfer, i.e. when "A"<"B". According to the results of the papers [99,126] the coefficient b can be represented as b "b,*#b,., where b,* is the self-interaction coefficient which is due to the quadratic nonlinearities of the nonlinear set of quasihydrodynamical and Maxwell’s equations, and b,. is the self-interaction coefficient which is due to the cubic nonlinearity caused by the nonparabolicity of the electronic spectrum. It should be noted that the possibility of the influence of the latter nonlinearity on the second harmonic generation of magnetoplasma SW in semiconductors was discussed in the paper [109]. Under conditions of a weak nonlinearity this influence can be realized only through the self-interaction of the first harmonic. Without giving the exact solution of the set of nonlinear equations obtained, let us make some numerical estimates for the characteristic times of this nonlinear process ¹ and ,* compare them with the linear period ¹ "2n/u. In the semiconductor n-In As (n "2;10 m\, * E "0.4 eV, E is the energy gap width) for B "0.1 Tl, u/u "0.2, "A(0)"/"B(0)""10, k"0.1, the ratio ¹ /¹ "20 and in the semiconductor sample n-GaAs (n "2;10 m\, E "1.5 eV) the ,* * ratio ¹ /¹ "17. Without the self-interaction effects taken into account this ratio ¹ /¹ "17 ,* * ,* * in n-In As and 16.2 in n-GaAs. It is not difficult to see that the self-interaction effects lead to an increase of the characteristic time of the nonlinear process. 
This is in a good agreement with the results of [124]. Now, following [120], let us briefly discuss the features of the resonant second harmonic generation of the SW propagating at the lateral surface of a semiconductor superlattice bounded by a metal (Voigt geometry). Recently, there has been a considerable number of theoretical and experimental studies devoted to collective excitations in semiconductor superlattices and heterostructures [43]. This interest is caused by a number of features of superlattices that are not typical for bulk semiconductor samples, such as a strong anisotropy which is due to the two-dimensionality of the structure, the possibility of the modification of the energy spectrum, the possibility of a selective variation of the free carrier concentration, etc. In this section we are interested in compositional superlattices of the first kind with periodically alternating layers of wide-band and narrow-band semiconductors, where the formation of two-dimensional conducting layers takes place due to a significant distortion of the energy band structure (the superlattices based on GaAs—AlGaAs heterostructures can be taken as an example in this case). In the presence of an external quantizing magnetic field the realization of the quantum Hall effect (QHE) in two-dimensional conducting layers is possible [127]. Here we deal with the surface magnetoplasma modes propagating on the lateral surface (i.e. on the surface that is perpendicular to the two-dimensional conducting layers and is parallel to the superlattice axis) of the semiconductor superlattice coated by a metal plane. The linear properties of the surface magnetoplasma waves on the lateral surface of the superlattice bounded by a dielectric medium have been investigated in the papers [128] (periodic bulk two-dimensional gas approximation) and in [129] (continuous anisotropic medium approximation). The paper [130] deals with a linear theory of the SW in the geometry studied here. 
Here we study the features of the resonant second harmonic generation of the SW, which can be of interest for the design of
semiconductor frequency multipliers. The semiconductor superlattice occupies the half-space x'0 and its 2D electron layers are parallel to the xy-plane. An external magnetic field is parallel to the superlattice axis (z-direction). The heterostructure studied is bounded at its lateral surface x"0 by a perfectly conducting metal plane. The SW we are interested in propagate in the y-direction along the superlattice lateral surface—metal interface across an external magnetic field. The basic assumptions of the problem are listed below. The analysis is carried out, as was mentioned above, in the continuous anisotropic medium approximation, which is appropriate when the SW wavelength and the skin depth greatly exceed the thickness of a 2D layer d. When the characteristic electron mean free path also satisfies l
, belongs to the submillimeter range. The characteristic spatial scale $L_{NL}$ of the second harmonic generation can be estimated using the basic set of coupled equations. For the parameters of the structure mentioned above, and $B_0/A_0 = 0.01$, $\mu = 0.1$, the value of $L_{NL}$ is of the order of $3.7\times10^{-\ldots}$ m, i.e. it is a quantity of the same order as the typical sizes of submillimeter-wave devices. This estimate shows that the second harmonic generation process studied here should be observable experimentally. The value of $L_{NL}$ increases with an increase of the external magnetic field, and decreases with an increase of the average electron density. To obtain an estimate for the characteristic time of the second harmonic generation process one can use the simple relation $\tau_{NL} \sim L_{NL}/V_g$, where $V_g$ is the SW group velocity. The value of $\tau_{NL}$ calculated for the same parameters of the structure is approximately 20 times greater than the linear period of the SW, $2\pi/\omega$. Therefore, the present consideration shows the possibility of using semiconductor superlattice-metal structures as effective second harmonic generators.

3.2. Self-interaction of SW. Formalism of the third order nonlinear susceptibilities

In this section results associated with the nonlinear theory of surface magnetoplasma waves (SW) propagating across an external magnetic field at a plasma-like medium-metal interface are
presented [131]. In the weak nonlinearity approximation using the formalism of third-order nonlinear susceptibilities the nonlinear frequency shift of the SW is calculated. The possible nonlinear mechanisms that influence the SW propagation are discussed. In several cases a comparison of the results obtained with those of previous papers gives the possibility of estimating the third order nonlinear susceptibility of the nonlinear medium bounded by a metal. A plasma-like medium occupies the half-space x'0 and is bounded at the plane x"0 by a perfectly conducting metal surface. An external magnetic field is directed along the z-axis which is parallel to the interface. The SW we are interested in propagate in the y-direction across the external magnetic field (Voigt geometry). The plasma-like medium is assumed cold and weakly collisional. We do not consider the collective excitations in a metal because we study wave processes with eigenfrequencies that are significantly smaller than the characteristic frequencies of collective excitations in a metal. In the approximation linear with respect to the wave field amplitude the plasma-like medium sustaining the studied SW is described by the standard dielectric tensor of a gyrotropic medium [46]. Therefore, our results in the most general form are applicable for any gyrotropic nonlinear medium bounded by a metal. In this section we are interested in nonlinearities cubic in the SW amplitude of the kind "A"A , where A is the amplitude of the initially excited first harmonic of the SW, i.e. we study the nonlinear self-interaction process. The first step in studying this process is to obtain the amplitude dependent dispersion relation of the SW in the weak nonlinearity approximation which has already been used above. To describe the nonlinear properties of the SW studied, we start from the set of Maxwell equations. 
By solving the basic set of equations with the necessary boundary condition E (0)"0 W (E is the tangential component of the SW electric field), one can derive the following nonlinear W dispersion relation of the surface magnetoplasma waves at the plasma-like medium—metal interface
e e nu e 1! (s #e H #(e H #(e H ) D(u, k )"! e "e " ce e #e 1! (s #e H #(e H #(e H "E""c"E", (3.8) "e " where D(u, k )"k e !e s , and s are the components of the third order nonlinear susceptibil GHIJ ity tensor [17]. The quantities H describe the influence of the so-called “magnetic nonlinearities” GHIJ on the self-interaction of the SW. These kinds of nonlinearities arise e.g. in gyrotropic quasihydrodynamic media with free carriers which undergo not only the action of the external magnetic field but also the action of the magnetic field of the wave itself. It should be noted that in the linear approximation the latter influence is neglected, but it is very significant in the description of nonlinear processes. This nonlinearity is of hydrodynamic origin and is not present in studies of nonlinear self-interaction effects of different natures such as heating nonlinearity, nonlinearity that is due to a nonparabolic relation between the electron velocity and quasimomentum, striction nonlinearity, etc. It is assumed that in the quantities s and H only the impact of selfGHIJ GHIJ interaction processes leading to a nonlinear response at the fundamental frequency is taken into account. It is very important to note that only two components s and s of the third order nonlinear susceptibility tensor, including all possible nonlinearities except the “magnetic nonlinearity”, are taken into account. This is mainly due to the simplicity of the wave field pattern in the linear approximation (only the components E and H are present). This means that the V X
experimental observation of the nonlinear frequency shift of the SW of interest to us can provide information about the effective nonlinear susceptibility of the plasma-like medium. Using the nonlinear dispersion relation (3.8) it is easy to obtain the following general expression for the nonlinear frequency shift of the SW:
$$\delta\omega_{NL} = \gamma\left[\frac{\partial D(\omega, k)}{\partial \omega}\right]^{-1}|E|^2 \equiv \tilde{\gamma}\,|E|^2,$$
(3.9) S I where u is a solution of the linear dispersion equation D(u , k )"0 [44]. Now we discuss the applicability of the general theory presented to several particular cases determined by the mechanism of nonlinearity chosen, estimate the parameters of the problem when the weak nonlinearity assumption is valid, and follow the possible results of the nonlinear evolution of the SW. Let us begin with an estimate of the criteria for the applicability of the weak nonlinearity theory. The condition k;1 mentioned in Section 3.1 is valid for the wave amplitudes satisfying E;m cu/e(e . One can easily show that the latter inequality can be realized e.g. in n-type semiconductors with m /m "(0.1—0.01) (m is the free electron mass, m is the effective electron mass in the medium), n "10—10 m\ (n is the unperturbed carrier density) immersed in an external magnetic field B "0.15 Tl in the frequency range u&10—10 rad/s for wave amplitudes belonging to the range up to 200—300 kV/m. Such an amplitude level is quite typical in experiments with surface magnetoplasma waves in semiconductor structures [109]. This estimate allows us to hope that many observable nonlinear effects at the amplitude level mentioned above can be interpreted by means of the weak nonlinearity theory. To show the possibility of estimating the third order nonlinear susceptibility of the medium from a measurement of the SW nonlinear frequency shift we compare the results of the general theory given here with the results obtained in two particular cases. It should be noted that one interesting feature of the expression (3.8) allows one to obtain precise information about the coefficients s and s . It is easily seen that the expression 1!e /"e " on the right-hand side of Eq. (3.8) is equal to zero in the frequency range u'u where e '0. 
Therefore, in situations when the action of “magnetic nonlinearities” is not significant, a measurement of the nonlinear frequency shift Du ,* of the SW in the range u 9u yields the value of the coefficient s : e \ ce Du jD 1# . (3.10) s " ,* "E " ju nu e e S Then, the measurement of the nonlinear frequency shift Du of the SW in the range u (u ,* yields:
(e (u )) Du jD (e (u )) c Du jD ,* ! ,* . (3.11) s " n "E " ju u e (u ) "E " ju u e (u ) S S Here we use this procedure in two particular cases when it is possible to obtain the expressions for Du and Du under certain definite assumptions about the medium and the predominant ,* ,* nonlinear mechanism (i.e. we use the calculated values Du and Du instead of the experimental ,* ,* data). The first case is related to the nonlinearity that is due to a nonparabolic relation between the electron energy and quasimomentum. The use of the expressions for Du obtained in paper ,* [126] (see also Section 3.4 of the present review) gives the values s "5.7;10\ C m/V,
s "4.7;10\ C m/V for the following set of the parameters of the problem: n "5;10 m\, m /m "0.01, B "0.1 Tl, u "1.3u , u "u /3, E "0.1 eV (E is the width of the energy gap) and s "3.1;10\ C m/V, s "2.6;10\ C m/V for the same set of parameters except E "1 eV. This means that a decrease of the energy gap leads to an increase of the nonlinear susceptibilities of the medium. This seems natural due to the fact that the nonlinearity which is due to the nonparabolicity of the free carrier spectrum is stronger in narrow-gap semiconductors. One can also show that the nonlinear response of the medium increases with an increase of the free carrier density and decreases with an increase of the external magnetic field. In our opinion, this fact can be explained, e.g., in the framework of the hydrodynamical model by following the dependence of the weak nonlinearity parameter k on n and B . It is easily shown that k increases with n and decreases with B , respectively. The second case is related to the study of the nonlinearity caused by electron heating in the wave field. Using the results of the papers [132,133] (see also Section 3.5 of this review) concerning the nonlinear frequency shifts Du and Du taken near the source of the SW excitation, one can ,* ,* obtain the values of the components s and s of the third order nonlinear susceptibility tensor s "7.42;10\ C m/V, s "6.6;10\ C m/V for the following set of the para meters of the structure: n "5;10 m\, m /m "0.01, B "0.1 Tl, u "1.3u , u "u /3, ¹ "77 K, l(¹ )/lJ (¹ )"10, l(¹ )/u "10\, l(¹ )/u "0.1, l(H)&H (the scattering by acoustic phonons is assumed to be predominant), l(¹ ) and lJ (¹ ) are the effective collisional frequencies leading to a transfer of the quasimomentum and energy, respectively. It is easy to see that these values are 30—40% higher than those obtained in the case of the nonlinearity that is due to the nonparabolicity of the free carrier spectrum. 
We assume that this difference is caused by the relative arbitrariness in the choice of the scattering mechanism and of the values of the effective collision frequencies for quasimomentum and energy transfer. An exact analysis of the processes leading to the heating nonlinearity for SW of this type has yet to be carried out. It should also be pointed out that the susceptibility components obtained here are of the same order of magnitude as those measured in [134]. Measuring the nonlinear frequency shift of the SW in two different frequency ranges significantly simplifies the determination of the third-order nonlinear susceptibility. The corresponding measurement of the two nonlinear frequency shifts $\Delta\omega_{NL}$ for SW from the same frequency range also provides this information, although the expressions (3.10) and (3.11) then become more complicated. The nonlinear evolution of the SW is discussed in detail in Sections 3.3 and 3.4, which deal with specific mechanisms of nonlinearity.

3.3. Self-interaction caused by hydrodynamic nonlinearities

In this section we study the nonlinear self-interaction of SW propagating across an external magnetic field at a cold magnetoactive plasma-metal interface. Waves from all three possible frequency ranges of existence are considered. The self-interaction is due to the action of the quadratic hydrodynamic nonlinearities (the terms $(\mathbf{v}\cdot\nabla)\mathbf{v}$, $(e/m)[\mathbf{v}\times\mathbf{B}]$, $\mathrm{div}(n\mathbf{v})$, and $en\mathbf{v}$ in the basic set (3.1)). The nonlinearity is assumed weak, so the ratio $\mu$ of the particle oscillation velocity in the wave field to the wave phase velocity, which is usually referred to as the weak nonlinearity parameter [100], is small compared with unity. In this approximation the self-interaction due to the quadratic hydrodynamic nonlinearities proceeds in two stages [100,135]. At the first stage the second harmonic of the SW and the static surface-type
perturbations are generated (the processes u#u"2u and u!u"0 take place). It is important that the second harmonic of the SW should not be an eigenwave of the structure for the realization of the self-interaction effects. In the opposite case the resonant interaction between the first and the second harmonics, considered in Section 3.1, occurs. The conditions which allow us to assume that the second harmonic is not an eigenwave of the waveguide structure are given separately in every frequency range. Then the second harmonic and the static surface perturbation interact with the first harmonic (the processes 2u!u"u and 0#u"u, respectively), and such interactions lead to an increase of the nonlinear response (its amplitude is proportional to "E"E, where E is the SW amplitude) at the frequency u (it is the second stage of the self-interaction). The nonlinear frequency shift and the nonlinear envelope waves arise as a result of this process. Depending on several conditions these nonlinear envelope waves may be unstable with respect to longitudinal or transverse perturbations. The results of the nonlinear evolution are also given for a definite frequency range. In this section the geometry of the problem is the same as in the preceding section. Let us consider the SW from the frequency range u;u. According to the results of the paper [44] (see also the review [36]) the dispersion of these waves in the linear approximation is described by the following expression u"k » (1!(k » )/2u) , (3.12) where » and k are the Alfven velocity and the SW wavenumber, respectively. It is necessary to point out that in this frequency range the dynamics of the SW studied is defined by both ion and electron nonlinearities. Therefore, both electron and ion oscillation velocities must be significantly smaller than the phase velocity of the wave (the latter is of the order of the Alfven velocity). 
Using the method given in [100] one can obtain the expressions for the second harmonics of the wave field: G(r, t)"[G (x)exp(it )#G *(x)exp(!it )] , G"(E, H), E (x)"!i(u/X )xb(k /e ) , (3.13) W 5 u u 3 k E exp(!2s x) 1! (u/u) , H (x)"bx# X 4 X X 10 u u 2u!u bx!2 k E exp(!2s x) (k /e ) , E (x)" V X X uX 3 u u!u k E exp(!2s x) , b(x)"! X 4» eE k " , t "2t , t "k y!ut, X"4u!u , m u» X is the ion plasma frequency, m is the ion (electron) mass, and E is the SW amplitude. The second harmonic (3.13) is not an eigenwave of the structure when the inequality 4u(u is satisfied. So, our results are valid in the frequency range where both the inequalities u;u and 4u(u are valid. Alongside the second harmonic generation, the processes of the generation of
a static surface current and a static magnetic field take place. These surface type perturbations also decay exponentially from the interface and are caused by the interaction u!u"0: e"E" » (x)"! exp(!2s x) , W 2m» u e"E" exp(!2s x) , (3.14) » (x)"! W 2m» u e"E" exp(!2s x) . H(x)" X 4m » u X It should be noted that to calculate the static surface perturbations (3.14) there is no need to introduce collisions or the thermal motion of particles specially as was done in the paper [135]. In the third approximation with respect to the SW field it is possible to obtain the following dispersion equation: u"u #Q"E", (3.15) where 1 e u b(u ), X"4u!u , Q"! 128 mc Xu u u u u u b(u )"18!45 !256 #270 #896 #360 , u u u u u c"1/(k e ), and u is defined by the expression (3.12). In calculating the nonlinear frequency shift du "Q"E" both self-interaction channels 2u!u"u (through the second harmonic ,* generation) and 0#u"u (through the action of the ponderomotive force, the force of highfrequency pressure acting on electrons) are taken into account, and the corresponding frequency shifts DS"QS"E" and D"Q"E" are calculated. The numerical estimates show that in the frequency range 0.27u (u(0.33u the inequality "DS"<"D" is valid. When 0.25u (u(0.38u and 0.4u (u(0.45u "DS"'"D". When 0.25u (u(0.38u the signs of DS and D are opposite, while when 0.38u (u(0.45u they are the same. In the entire frequency range of interest D(0. This means that taking the self-interaction channel 0#u"u into account leads to a decrease of the wave phase velocity. If the channel 2u!u"u is considered separately, the SW phase velocity increases in the frequency range 0.25u (u(0.38u and decreases when 0.38u (u(0.45u . In the first frequency range both processes lead to opposite results for the change of the SW phase velocity, and in the second they both lead to a decrease of the wave phase velocity. The resulting nonlinear frequency shift is positive when u(0.37u and is negative when u(0.37u . 
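With the nonlinear dispersion equation (3.15) in mind, one can derive the nonlinear Schrödinger equation for the envelope amplitude [136]:

$$ i\left(\frac{\partial E}{\partial t} + V_g \frac{\partial E}{\partial y}\right) + i\nu E + P_\parallel \frac{\partial^2 E}{\partial y^2} + P_\perp \frac{\partial^2 E}{\partial z^2} = Q|E|^2 E , \qquad (3.16) $$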
where » "» is the SW group velocity, 1 j» 3» u » » P " "! , P " " , , 2 jk , 2u 2k 2u II and l is the linear decrement of the collisional damping. Due to the fact that the quantity Q changes its sign in the range 0.25u (u(0.45u , in the vicinity of the frequency u "0.37u we can reach the following conclusions about the stability of the SW under consideration. 1. In the frequency range 0.25u (u(0.37u (Q'0) the stationary waves of the SW envelope are unstable with respect to longitudinal perturbations. The greatest instability increment is reached when s "!EQP\ (s and s are the wavenumbers of longitudinal and transverse , , , , perturbations, respectively) and is equal to c "!l/2#(EQ#l/4). The maximum in crement in a lossless medium is c "EQ. In this frequency range, assuming P jE/jz and l to
be equal to zero in Eq. (3.16), one can obtain the following soliton-type solution for the SW field envelope:
A"E exp(i )exp(!s x) . (3.17) Here E "(2B ch\[(c B (m!m )] and "!QBt. When P jE/jy"0, l"0, the anti , soliton solution for the steady state spatial distribution of the SW field envelope can be realized as a result of an evolution of stable transverse perturbations: A"E exp(i )exp(!s x) , (3.18) where E "B "th[(c B (z!z )]", "c B(y!y )/2k , c "k Q/» , z "z (t"0), and y "y (t"0). 2. In the frequency range 0.37u (u(0.45u an instability with respect to transverse perturba tions takes place. The maximum increment c "!l/2#(EQ#l/4) is reached when
s "!EQ P\. In a lossless medium c "EQ. As a result of the instability with respect to , ,
transverse perturbations a stationary solution in the form of a wave with the soliton profile of its intensity which does not depend on the y-coordinate, can be realized in the collisionless case. It is given as A"E exp(i )exp(!s x) , (3.19) where E "B ch\[(c B (z!z )], "c B(y!y )/2k , and c k Q/» . The solution (3.19) describes the steady-state spatial distribution of the SW field envelope. In the same range of frequencies in a collisionless plasma the solution in the form of an antisoliton of the SW field envelope can be realized. Omitting P jE/jy and ilE in Eq. (3.16), and , following [137], we obtain: A"E exp(i )exp(!s x) , (3.20) where E "B "th[(c B (m!m )]", "!QBt, and c "Q/4P . Taking into account damping , effects in Eq. (3.16) leads to a decrease of the amplitude in solutions (3.17) and (3.20) that is proportional to exp(!c"Q"Elt/» u ). The solutions (3.17)—(3.20) have been analyzed for stability according to the stability criterion for soliton-type solutions of the nonlinear Schro¨dinger equation [138,139] and have been found to be
stable. This allows us to state that such nonlinear solutions for the surface waves at the plasmametal interface may arise as a result of the nonlinear evolution of the SW. Now let us study the self-interaction of the SW with a frequency satisfying u;u;u [99]. J C In this frequency range the SW dynamics is defined by the motions of the electrons and, thus, the effect of ions or other low-frequency excitations on the self-interaction can be neglected. The wavenumber of the SW in this range is negative (k (0). The frequency and the absolute value of the wavenumber are coupled by the following linear dispersion equation [44]: u""k "c(u /X )(1!kc/2X). (3.21) The weak nonlinearity approximation is used as in the preceding section. The corresponding nonlinearity parameter k"eX E/mcuu ;1. The results of this section are valid in the frequency range where both the inequalities u;u and 4u(u are satisfied. The nonlinear dispersion equation of the SW considered, derived in the third approximation with respect to the wave amplitude, can be written in the following way: u"u #Q"E" , (3.22) where X a \ eX [2#a]\ 1! h(u ) , Q" 24m c(u!4u)e u(e * * the expression for the h(u ) can be written as [99] h(u )" (u /u )Hp , H H e u (e u * !(30#15a) * , p "20#10a#18a X X (e u eu (e u * !(27#147a) * #22.5a * ; p "4.5!21a#(157#278a) X X X (e u p "57!45a * , X e u (e u * , p "!264!60a#(644#212a) * #168 X X p "!276!156a, a"[1#4u/u], c"1/(e k ) and u is given by the expression (3.21). The nonlinear Schro¨dinger equation which describes the evolution of the wave field envelope of the weakly nonlinear magnetoplasma SW is presented in the same form as (3.16), where
3 u » "!c 1! (k c/X ) , 2 X P "!3cu/2X, P "cu/2Xu , , ,
and l is the linear dissipation decrement of the magnetoplasma surface waves. It is not difficult to show that in the frequency range considered Q(0, P (0, P '0. This means that , , according to the Lighthill criterion [136], the solutions for the SW envelope amplitude are stable with respect to longitudinal perturbations and are unstable with respect to transverse perturbations (Q(0). The realization of the condition Q(0 leads to the fact that the nonlinear self-interaction processes taking into account results in a decrease of the SW phase velocity. It is not difficult to show that both self-interaction channels 2u!u"u and 0#u"u yield a negative frequency shift. In the frequency range studied the effect of the first channel appears to exceed by 2.5—4 times the impact of the second one. For the SW that are unstable with respect to transverse perturbations the stationary solution (3.19) is valid, while the solution (3.20) is the result of the nonlinear evolution of the stable longitudinal perturbations. In this section we also consider the self-interaction of high-frequency surface waves, which are caused by the motion of the electrons, and exist in the frequency range u'u (u is the upper hybrid frequency) [140]. The linear dispersion relation of the SW in this frequency range can be represented as u"(X/e )(1#(k c/X )e ) (3.23) * * (e is the dielectric constant of the semiconductor lattice). This expression is valid in the * dense plasma approximation. In the frequency range studied the waves of interest to us are also unidirectional ones and can propagate in the positive y-direction. It should be noted that the results of this section can be applied to both gaseous and electron semiconductor plasmas. The second harmonic of the SW is not an eigenwave of the structure if the following inequality is satisfied u9X/e . If u<X/e then the spectrum of the SW studied becomes linear, and * * resonant second harmonic generation takes place. 
If we impose the restriction u(5X/e on the * eigenfrequency, then our results are valid in the frequency range which approximately can be taken as u (u(2.25X /e. * The distinguishing feature of the second harmonic in the case studied is the fact that it is a superposition of two perturbations, of surface and volume types. The forced volume-type wave propagating at a definite angle with respect to the metal surface transfers the electromagnetic energy of the SW into the plasma volume. Therefore, in the second approximation with respect to the SW amplitude a nonlinear energy transfer from surface waves to volume waves takes place. This process leads to the nonlinear dissipation of the SW energy [141]. The conclusion about the generation of the volume wave at the second harmonic of the surface wave can be made using the expression for the SW inverse skin depth [44] after the substitution of 2k and 2u for k and u. The results of the paper [140] suggest that the impact of the nonlinear energy dissipation is significant. This means that an external source of energy that compensates the dissipative losses is necessary for the realization of the self-interaction effect. The nonlinear dispersion equation is obtained absolutely analogously as above. It can be represented in the following form: u"u #Re Q"E"#i Im Q"E" ,
(3.24)
Fig. 3.1. The dependence of the quantities q — (curves 1—5, respectively) on the SW eigenfrequency ¼"u/X for a gaseous plasma with the parameters k"0.1, X /u "3.
where u is given by the expression (3.23), Re Q"!(Xe /u u X)[D#D#Im D] (3.24a) * Im Q"!(Xe /u u X) Re D (3.24b) * X"u!X/e , the superscripts s, 0, v denote the corresponding self-interaction channels: * s denotes the self-interaction through the surface-type second harmonic (2u(s)!u"u), v denotes the self-interaction through the volume wave at the second harmonic (2u(v)!u"u), and 0 denotes the self-interaction through the static surface perturbations (0#u"u). The correlation between the nonlinear frequency shifts that are due to the different self-interaction channels is presented in Fig. 3.1. The following notations are used: q "Im Q"E"/u (curve 1), q "Re Q"E"/u (curve 2), and q "Re Q"E"/u (curves 3—5). Fig. 3.1 illustrates the \ dependence of these quantities on the wave eigenfrequency for a gaseous plasma. If ¼(1.8 (¼"u/X ), both self-interaction channels 0#u"u and 2u(s)!u"u lead to an increase of the wave phase velocity, while their effects on the resulting SW nonlinear frequency shift are quantities of the same order. In the whole frequency range where the present consideration is valid, the inequality Re Q(0 is satisfied. If, however, ¼'1.8, the quantity Re Q becomes positive and significantly exceeds the absolute value of Re Q. The resulting nonlinear frequency shift is positive for ¼'1.8 and is negative if ¼(1.8. It should be noticed that the value of the parameter ¼, at which the sign of Re Q changes, varies slightly with changes of the parameters of the plasma-like medium. However, this variation is not significant and is neglected in what follows. The effect of the process 2u(v)!u"u on Re Q is not significant. One can show that the process 0#u"u is the predominant self-interaction channel, in contrast to the case of the magnetoplasma SW considered above. This fact can be explained as follows. 
The relative impact of the processes 2u!u"u and 0#u"u can be characterized by the ratio k/g, where g"» /» , and » is the charactertistic velocity of the static surface drift #
[140]. The ratio of the quantities (k/g) and (k/g) related to the SW with u(u and with u'u , respectively, is (k/g) /(k/g) &X/ue <1 , * i.e. the relative impact of the self-interaction process through the second harmonic generation is more significant for the waves from the first frequency range. One can easily show that the nonlinear Schro¨dinger equation in the case considered can be written in the following form:
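$$ i\left(\frac{\partial E}{\partial t} + V_g \frac{\partial E}{\partial y}\right) + i\nu E + P_\parallel \frac{\partial^2 E}{\partial y^2} + P_\perp \frac{\partial^2 E}{\partial z^2} = (\mathrm{Re}\,Q + i\,\mathrm{Im}\,Q)|E|^2 E + f , \qquad (3.25) $$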
where » "!cX /u is the SW group velocity, cX » 1 j» , "! P " "c/u , P " , 2k , 2 jk 2ue II * l is the linear decrement of collisional damping, and f is the external source compensating the dissipative losses in the system. Without repeating the detailed analysis of Eq. (3.25) we note that because P '0, P '0, and the quantity Re Q changes its sign at ¼"1.8, the solutions of the , , nonlinear Schro¨dinger equation for the SW envelope amplitude are stable in the frequency range ¼'1.8 while in the range ¼(1.8 they are unstable with respect to both longitudinal and transverse perturbations. The maximal instability increment is obtained for s "!"E" Re QP\, , , s "0 or s "!"E" Re Q ; P\, s "0 (2n/s , 2n/s are the characteristic spatial scales of the , , , , , , longitudinal and transverse perturbations). Due to the fact that the unstable perturbations with maximal increments grow most quickly among the spectrum of unstable perturbations [115], it is enough to follow their nonlinear evolution. The SW which are unstable with respect to transverse perturbations evolve into an envelope wave with a soliton intensity profile which does not depend on the direction of the SW propagation (3.19). And the SW which are unstable with respect to longitudinal perturbations evolve into the soliton-type solution (3.17). At the end of this section it is necessary to note that in the absence of the external energy source compensating the dissipative losses, a dissipative-type instability can be realized in the structure [115].
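The stability statements used throughout this section follow from the standard linear analysis of a uniform envelope in the nonlinear Schrödinger equation. As a minimal sketch, assume the lossless one-dimensional form $i\,\partial A/\partial t + P\,\partial^2 A/\partial y^2 = Q|A|^2 A$ in the frame moving with the group velocity (the damping, the source $f$, and the second spatial direction are dropped here). A perturbation proportional to $\exp(i\kappa y + \gamma t)$ of a uniform envelope with amplitude $E_0$ then obeys $\gamma^2 = -P\kappa^2(P\kappa^2 + 2QE_0^2)$. The function names and the dimensionless sample values below are illustrative assumptions and are not taken from the original papers.

```python
import math


def growth_rate_squared(P, Q, E0, kappa):
    """gamma^2 for a perturbation ~ exp(i*kappa*y + gamma*t) of a uniform
    envelope of amplitude E0, assuming i*A_t + P*A_yy = Q*|A|^2*A (no damping)."""
    x = P * kappa**2
    return -x * (x + 2.0 * Q * E0**2)


def most_unstable_mode(P, Q, E0):
    """Return (kappa, gamma) of the fastest growing perturbation,
    or None when P*Q >= 0 and no growing perturbation exists."""
    if P * Q >= 0:
        return None
    kappa = math.sqrt(-Q * E0**2 / P)  # kappa^2 = -Q*E0^2/P
    gamma = math.sqrt(growth_rate_squared(P, Q, E0, kappa))
    return kappa, gamma


if __name__ == "__main__":
    # Hypothetical dimensionless values with P < 0 and Q > 0.
    kappa, gamma = most_unstable_mode(P=-1.5, Q=2.0, E0=0.5)
    print(kappa, gamma)
    # Same Q with P > 0: the envelope is stable in this direction.
    print(most_unstable_mode(P=1.0, Q=2.0, E0=0.5))
```

In this sketch the maximum growth rate is reached at $\kappa^2 = -QE_0^2/P$ and equals $|Q|E_0^2$, in line with the lossless increment quoted in the text, while for $PQ > 0$ no growing mode exists, which is the content of the Lighthill criterion [136].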
3.4. The effect of the nonparabolicity of the free carrier spectrum

In this section we report results on the influence of the nonparabolicity of the free carrier dispersion law on the propagation of surface waves localized near the plane interface between an n-type semiconductor and a metal [126,142–144]. The structure is immersed in an external magnetic field directed along the semiconductor-metal interface. The SW studied propagate across the external magnetic field and are called “surface magnetoplasmons” (SMP). The nonparabolicity of the electron dispersion law produces two effects. The first is the nonlinear self-interaction of the SMP [126,142,143], and the second is SMP third harmonic generation [144].
It should be noted that the nonlinearity due to the nonparabolicity of the free carrier spectrum acts alongside other nonlinear mechanisms that may affect the propagation of SMP in a semiconductor plasma. These mechanisms can be due to the features of the lattice, to nonlinear hydrodynamic motions of the electrons, to heating of the electrons in the wave field, etc. [110]. Here we compare the action of these types of nonlinearity with that of the mechanism studied. An analogous problem has been solved, e.g., in [145] for quasielectrostatic waves in a semiconductor layer bounded by a dielectric, where the possibility of soliton-type nonlinear solutions was shown. The geometry of the problem is analogous to that used in the previous sections, and there is no need to specify it here. The assumption of a semi-infinite structure is applicable, e.g., to a metal-semiconductor-dielectric structure when the transverse dimensions of the semiconductor layer are large compared with the skin depth of the surface waves studied. The n-type semiconductor plasma is assumed to be weakly collisional ($\nu \ll \omega$, where $\nu$ is the effective electron collision frequency). The analysis takes into account the nonparabolicity of the free carrier spectrum, which is assumed to be described by the isotropic nonparabolic law of the Kane model [146]. Such a free carrier dispersion law is realized in a number of semiconductor materials (most often in $\mathrm{A^{III}B^{V}}$ semiconductors), even in the presence of not very strong magnetic fields $H_0$ [147]. Although the Kane model is approximate, it gives good results in the description of the nonlinear properties of semiconductors that are caused by the nonparabolicity of the electronic spectrum [148]. It should be pointed out that the action of this nonlinearity in semiconductors is analogous to that of the relativistic nonlinearity in gaseous plasmas [149].
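The analogy with the relativistic case can be illustrated with a short numerical sketch. It assumes a two-band dispersion of relativistic form, $E(p) = (E_g^2 + E_g p^2/m)^{1/2} - E_g$, in which the gap width $E_g$ plays the role of the rest energy. This normalization is an assumption made for the illustration; it is chosen so that the weak nonparabolicity limit reads $v \approx (p/m)(1 - p^2/\tilde{p}^2)$ with $\tilde{p}^2 = 2mE_g$. The function names and the dimensionless units ($m = E_g = 1$) are likewise hypothetical.

```python
import math


def kane_energy(p, m, Eg):
    """Assumed two-band (relativistic-like) dispersion E(p)."""
    return math.sqrt(Eg**2 + Eg * p**2 / m) - Eg


def kane_velocity(p, m, Eg):
    """Exact velocity v = dE/dp for the assumed dispersion."""
    return (Eg * p / m) / math.sqrt(Eg**2 + Eg * p**2 / m)


def weak_nonparabolic_velocity(p, m, Eg):
    """Leading-order expansion v = (p/m) * (1 - p^2 / p_tilde^2)."""
    p_tilde_sq = 2.0 * m * Eg
    return (p / m) * (1.0 - p**2 / p_tilde_sq)


if __name__ == "__main__":
    m, Eg = 1.0, 1.0  # dimensionless units, for illustration only
    for p in (0.01, 0.1, 0.3):
        exact = kane_velocity(p, m, Eg)
        approx = weak_nonparabolic_velocity(p, m, Eg)
        # The two values approach each other as p^2 / (2*m*Eg) -> 0.
        print(p, exact, approx, abs(exact - approx))
```

The difference between the exact and expanded velocities shrinks rapidly as $p^2/2mE_g$ decreases, which is the regime in which a cubic correction to the velocity is sufficient.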
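In the model of the isotropic nonparabolic dispersion law, the velocity $\mathbf{v}$, quasimomentum $\mathbf{p}$, and energy $E$ of the electrons are related by

$$ \mathbf{v} = \frac{\partial E}{\partial \mathbf{p}} , \qquad E = \left(E_g^2 + \frac{E_g p^2}{m^*}\right)^{1/2} - E_g , $$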
where $E_g$ is the width of the energy gap, and the values of $m^*$, $\mathbf{p}$, and $\mathbf{v}$ are evaluated near the bottom of the conduction band. In the weak nonparabolicity approximation ($p^2/2m^*E_g \ll 1$) the relation between the electron quasimomentum and velocity can be expressed as $\mathbf{v} = (\mathbf{p}/m^*)(1 - p^2/\tilde{p}^2)$, where $\tilde{p}^2 = 2m^*E_g$. This relation is introduced into the basic set of quasihydrodynamic equations (3.1) for high-frequency electron motions. The nonlinearity is assumed to be weak, as in the previous sections, and the weak nonlinearity parameter $\mu$ is also used here. Following the original paper [126], we first study the self-interaction of the non-potential magnetoplasma surface waves that is due to the nonparabolicity of the free carrier spectrum. We then briefly recall the specific features of the self-interaction of potential SW of the magnetoplasma type [142,143], and also discuss third harmonic generation for both types of SW, which is likewise caused by the nonparabolicity of the electronic spectrum. The nonlinear dispersion equation of the surface magnetoplasma waves in the Voigt geometry (the frequency range of Section 3.3 in which the wavenumber is negative), obtained in the framework of the approximation used, can be written as follows [126]:

$$ \omega = \omega_0 + Q_{NP}|E|^2 , \qquad (3.26) $$
Table 2

Sample   n (m^-3)   E_g (eV)   B_0 (T)   m*/m_e   Q_NP/Q_NL
n-GaAs   2×10^…     1.5        0.5       0.066    0.27
n-InSb   5×10^…     0.18       0.1       0.013    9.1
n-InAs   1.7×10^…   0.4        0.1       0.02     0.42
n-HgTe   1.7×10^…   -0.17      0.5       0.017    -1.42
n-HgSe   4×10^…     -0.2       0.75      0.019    -2.1
where $\omega_0$ is given by expression (3.21), and $Q_{NP}|E|^2$ is the SMP nonlinear frequency shift caused by the nonlinearity mechanism studied:
eu "E" u Q,."E"" 7!12 . 32m E X u The evolution of the weakly nonlinear SMP envelope in the structure studied is described by the nonlinear Schro¨dinger equation (3.16) with the corresponding replacement of Q by Q,.. Further conclusions regarding the nature of the SMP self-interaction are based on the analysis of the latter equation by the method given in [136]. To estimate the impacts of different nonlinearity mechanisms on the resulting nonlinear SMP frequency shift, a comparison of the nonlinear frequency shift Q,."E" with that caused by the hydrodynamic nonlinearities (3.22), (in this section it is defined as Q,*"E") has been made. The values of the ratio Q,./Q,*, which characterizes the relative impact of different processes in the nonlinear self-interaction of the SMP, are presented in Table 2 for different semiconductor samples at liquid nitrogen temperatures, wave frequency u"u /3, and the nonlinearity parameter k"0.1. The results presented in Table 2 suggest that for wide-band semiconductor materials the resulting nonlinear frequency shift of the SMP is defined by the action of the hydrodynamic nonlinearities (see the data for the wide-band sample of n-GaAs). In this case the nonlinear frequency shift is negative (see Section 3.3, Q(0). Therefore [136], the nonlinear envelope wave is unstable with respect to transverse perturbations (the instability increment c"!Q,*"E"), and evolves into the stationary solution with the soliton intensity profile "E" (3.19). The antisoliton state (3.20) is realized for stable longitudinal perturbations. The cubic nonlinearity, caused by the nonparabolicity of the carriers spectrum, is very significant for narrow-band semiconductors. Moreover, the sign of the nonlinear frequency shift depends significantly on whether the semiconductor with a normal (E '0) or inverted (E (0) band structure is used. For E '0 in the frequency range considered the quantity Q,."E" is negative. 
Therefore, in this case the impact of the nonlinearity mechanism studied leads to an increase of the instability increment with respect to transverse perturbations, considered in Section 3.3, i.e. to the SMP self-focusing effect becoming stronger. The data given in Table 2 also suggest that the impact of the nonlinearity caused by the nonparabolicity of the free carrier spectrum is the most significant in semiconductor materials with a not very high electron density (n&10—10 m\) and with a small energy gap. A more interesting result can be obtained in studies of the nonlinear SMP in semiconductors with an inverted band structure, i.e. with E (0 (e.g. n-HgTe and n-HgSe (AB) [147]). In these
materials the SMP nonlinear frequency shift, caused by the nonparabolicity of the carrier spectrum is positive, and leads to results which differ qualitatively from those obtained upon taking into account only hydrodynamic nonlinearities. Namely, in Table 2 the conditions yielding a positive total nonlinear frequency shift of the SMP are presented. Due to the fact that in the materials with E (0 the quantity Q'0, the nonlinear SMP studied are unstable with respect to longitudinal perturbations (the instability increment c"Q"E"). The envelope solitons (3.17) are formed as a result of the evolution of such perturbations. It should be pointed out that in gaseous plasmas and in semiconductors with a normal band structure the existence of envelope solitons in this frequency range is impossible (see Section 3.3). The stable transverse perturbations evolve into an antisoliton-type state for the steady spatial amplitude distribution (3.18). It is necessary to note that the formation of SMP solitons and antisolitons requires that the SMP amplitude be a quantity of the same order of magnitude as the critical value E , which is defined by the competition between the dispersion and the nonlinearity [145]. In our case for a pulse duration of q"20 ns, and the values n &10 m\, m &10\ kg, B &0.1 Tl, E &0.1 eV, and u"u /3, the critical value E corresponds to the characteristic velocity of electron oscillations in the wave field » &3;10 m/s. For the given parameters the SMP phase # velocity » &3;10 m/s, and the soliton solutions can be realized for rather low values of the nonlinearity parameter k&3;10\. Now consider the problem of the self-interaction effect of the potential SMP in a plasma-like medium with a finite electron pressure taken into account [142,143]. The frequency range u;X/e #u is considered. 
The nonlinear dispersion equations describing the effect of the * nonparabolicity of the electron spectrum on the SMP propagation at a semiconductor—metal interface can be written as follows: u"u #du , ,*
(3.27)
where u "k r X/u (e is the linear solution for the SM frequency (u ;u ), du and ,* " * is the nonlinear frequency shift of the SMP, whose value can be found in the original paper [142]. The corresponding expression for u in the limiting case u
Now let us obtain estimates for the given nonlinear frequency shifts (3.27) for waves from different frequency ranges and compare them. This comparison shows that
du E u X X ,*& <1 (3.28) du E u ue ue ,* * * for u ;u , u ;u ; E , E are the amplitudes of the SM from these frequency ranges, respective ly. The estimate (3.28) is valid if the amplitudes of the SMP from different frequency ranges are of the same order (E &E ), and shows that the nonlinear effects are stronger in the frequency range where the action of the external magnetic field is significant (u;u ). It is also interesting to compare the nonlinear frequency shifts (3.27) due to the nonparabolicity of the electron spectrum (du /u) with that caused by hydrodynamic nonlinearities (du /u) . ,* ,. ,* &7" The ratio of normalized nonlinear frequency shifts yields the following estimate (the quantity (du /u) has been calculated in [150]): ,* &7" du du \ 30¹ ue ,* ,* *. " u u E X ,. &7" This ratio can be of the order of unity. The impact of the mechanism studied increases with an increase of the electron temperature and wave frequency, and with a decrease of the semiconductor plasma density and energy gap width. Now let us briefly describe the results associated with the third harmonic generation of the SMP in a cold magnetoactive plasma bounded by a metal [144]. The signal at the third harmonic is also caused by the action of the cubic nonlinearity, which is due to the nonparabolicity of the electron spectrum. We also compare the amplitude of the third harmonic with that of the magnetoplasma surface wave, which is generated as a result of the action of the hydrodynamic nonlinearities. Note that in a gaseous plasma the generation of higher harmonics is completely defined by the latter quadratic nonlinearities, and in the framework of the weak nonlinearity theory the amplitude of the third harmonic is k\<1 times smaller than that of the second harmonic [100]. In a semiconductor plasma an additional source of third harmonic generation arises (the cubic nonlinearity, caused by the nonparabolicity of the carrier spectrum). 
The sources of the latter nonlinearity (the terms which are proportional to (p/2m E )p) are inversely proportional to the width of the energy gap of the semiconductor, and one has to expect that in narrow-band semiconductor materials the amplitude of the third harmonic can be comparable with that of the second harmonic of the SMP, which is not an eigenwave of the structure. The SMP eigenfrequency must simultaneously satisfy both of the inequalities u;u, 4u(u [144]. The satisfaction of the second one allows one to consider the third and the second harmonics as non-eigenwaves of the system, and to exclude from consideration the resonant harmonics interaction. The expressions for the electromagnetic field components of the third harmonic in the structure studied are given by [144]:
E(r, t)"[E (x) exp(iW )#E*(x) exp(!iW )] , * H(r, t)"[H (x) exp(iW )#H (x) exp(!iW )] , km cuu (exp(!s x)!exp(!3s x)) , E (x)"iA W 8E X
km cuu ((9u/u!a)(a!1)\exp(!s x)!3 exp(!3s x)) , 24E X km cuu (a!1)\(exp(!s x)!(u/8u)(a!1)exp(!3s x))(e /k ) , H (x)"!A X 24E X (3.29)
E (x)"A V
where s "X a/c, s "X /c, k"(eX A)(m uu c)\, W "3(k y!ut), c"1/(e /k ), a" (9u/u#1), and A is the amplitude of the SMP at the fundamental frequency. The expressions (3.29) show that the third harmonic of the waves considered is a superposition of two surface-type wave perturbations with the inverse skin depths s and 3s . This is true even in the case when the inequality 9u'u holds, i.e. when the frequency of the SMP third harmonic belongs to the range where the SMP of the fundamental frequency cannot exist [44]. Therefore, one reaches the conclusion that in the case considered the transformation of the surface waves into volume ones and, hence, the nonlinear damping of the SW with the fundamental frequency, is not realized. To estimate the efficiency of the generation of the signal with frequency 3u let us compare the amplitudes of z components of the third (H (0)) and the second (H (0)) harmonic magnetic fields X X at the metal interface. The estimates are carried out for different semiconductor samples at liquid nitrogen temperatures (the expressions for the SMP second harmonic electromagnetic field components are presented in the paper [99]), and also verify the validity of the basic assumptions. Namely, for the narrow-gap semiconductors n-InSb and n-InAs the semiconductor plasma is weakly collisional for the wave frequencies &5;10—10 1/s and higher. At the typical experimental values of the magnetic field 0.2—0.5 Tl the weak nonlinearity approximation is realized for the above mentioned semiconductors for SMP field amplitudes &150 kV/m and lower (for n-InSb the value k"0.1 corresponds to E"110 kV/m, while for n-InAs the same value k"0.1 corresponds to SMP amplitudes of the order of 130 kV/m). For the values of the SMP amplitude and eigenfrequency, and of the external magnetic field, given above the ratio H (0)/H (0)&1/3—1/2 in X X an n-InSb sample, and H (0)/H (0)&1/5—1/4 in an n-InAs sample. 
In the semiconductor n-GaAs, which has a significantly larger energy gap (1.5 eV), the ratio $H_{z3}(0)/H_{z2}(0)$ for the same parameters is only 0.01. Therefore, one can conclude that in narrow-gap semiconductors the amplitude of the third harmonic can significantly exceed that of the signal at the frequency $3\omega$ in a gaseous plasma and in wide-gap semiconductors, and can be of the same order of magnitude as the amplitude of the SMP second harmonic. Finally, let us note that third-harmonic generation of potential SMP in the Voigt geometry was studied in [142,143]. The results of these papers suggest that, depending on the SMP eigenfrequency and the other parameters of the structure, the SMP third harmonic can be a superposition of purely surface perturbations, a superposition of purely surface perturbations with pseudosurface ones, or a superposition of purely surface perturbations with purely volume ones. In the last case, nonlinear damping of the SMP caused by the excitation of a volume wave at the third harmonic can take place.

3.5. The influence of the heating nonlinearity on the SW propagation

In this section we investigate how electron heating in a high-frequency SW field influences the dispersion properties of the SW considered (this nonlinearity is often called a thermal
N.A. Azarenkov, K.N. Ostrikov / Physics Reports 308 (1999) 333—428
nonlinearity). As in the previous sections, a semi-infinite geometry is used. This process has been studied in detail only for SW in a cold plasma-like medium [132,133]. An analogous problem is also discussed below for SW propagating across the external magnetic field in a warm magnetoactive plasma-like medium [151]. First of all, let us state the basic assumptions and qualitatively illustrate how this nonlinearity mechanism acts. We assume that the electron collision frequency $\nu$ responsible for momentum transfer is much smaller than the SW eigenfrequency $\omega$. The electron collisions with the various scattering centres (e.g. optical and acoustic phonons, or charged impurities) are assumed to be quasi-elastic. The condition for the quasi-elasticity of the collisions, $\nu$
(3.30)
where
jl(h) dl(x, y, "E")" jH H
H(x, y, "E"). 2 The dependence (3.30) is introduced into the quasihydrodynamic equation for the HF electron motions. The latter equation is solved together with the Maxwell (in a cold medium) or Poisson (in a medium with a finite electron pressure) equations, and the quasihydrodynamic and stationary energy balance [152] equations to describe the influence of the heating nonlinearity on the SW propagation studied. Note that the energy balance equation has been solved in the following approximation: ¸H<¸, where ¸"[s(¹)/n lJ (¹)] is a characteristic scale of temperature vari ation, where s(¹) is the heat conduction coefficient, and ¸H the temperature variation due to the wave energy transfer effect. In this case heat conduction effects are neglected compared with that of heat transfer from the waves to electrons, and the SW electric field and the temperature are locally coupled as in the case of the normal skin-effect [152]. The above mentioned stationary energy balance equation yields the quantity H(x, y, "E") which is to be substituted into Eq. (3.30). The quantity l(H) obtained in this way is to be substituted into
the basic set of quasihydrodynamic and Maxwell equations, and the expressions for the SW electromagnetic field components can be obtained (they are presented for waves from different frequency ranges in the original papers [132,133]). The substitution of the latter solutions into the boundary condition E (x"0)"0 after the necessary separation of the real and imaginary parts O leads to the following nonlinear dispersion equation: D(u,k )#i Re Q"E"exp(!2k "y")!Im Q"E"exp(!2k "y")"0 , (3.31) where eX(u#u) e jl(H) l(¹) u#u Im Q" 2m c(u!u) e (e ) jH lJ (¹) 4e 2 e e ; ((e )!3(e ))! (17(e )!5(e )) 2e 2e 9e #2u u e !e 1# , c"1/(e k ), 4e eX(u#u) (e )!(e ) jl(H) l(¹) Re Q" , lJ (¹) 8m c(u!u) e (e ) jH 2 D(u, k )"e s !k e "0 is the linear dispersion equation of the waves considered, and e "e #ie , e "e #ie are components of the cold magnetoactive semiconductor plasma dielectric tensor [110]. In Eq. (3.31) the expression D(u, k ) incorporates the linear collision damping; the term which is proportional to Re Q is responsible for the nonlinear damping of the surface waves; while the term proportional to Im Q is proportional to the SW nonlinear frequency shift. One can easily conclude from Eq. (3.31) that the nonlinear damping decrement and the nonlinear frequency shift are dependent on the square of the wave amplitude, and decrease with increasing distance from the source of the SW excitation (y"0). It is also not difficult to see that taking the heating nonlinearity into account leads to a change of the field structure and the dispersion properties of the SW. Before simplifying and studying the dispersion equation (3.31), we verify the weak heating assumption, i.e. determine the range of the SW amplitude values where the latter assumption is realized. The satisfaction of this condition gives one the possibility of not imposing any rigid limits on the nonlinear wave pulse duration t, caused by the Joule heating. 
However, the following restriction on the value of t can be written as t
lJ (¹) (X/uu) . k;(¹/m c) l(¹)
(3.32)
The parameters $T/(mc^2)$ and $\tilde\nu(T)/\nu(T)$ are much smaller than unity [152]. This means that experimentally realizable values of the wave field amplitude can occur in semiconductor samples with sufficiently high electron densities. The estimates are carried out for
a semiconductor with the electron density n &5;10 m\, an electron effective mass m &5;10\ kg, and immersed in an external magnetic field of magnitude B "0.5 Tl at liquid nitrogen temperatures. For specificity scattering by acoustic phonons is chosen as the dominant mechanism of energy dissipation. In this case the expression on the right-hand side of Eq. (3.32) is a quantity of the order of unity. This means that in the case considered the weak heating assumption corresponds with a good accuracy to the weak nonlinearity approximation k;1. For the waves studied the nonlinearity parameter k"0.1 for the parameters given above corresponds to the following value of the SW amplitude E&100 kV/m, i.e. to a quite experimentally realizable field value. It should also be noted that the condition (3.32) is hardly realizable in a rarefied semiconductor plasma. Therefore, the subsequent discussion applies only to a dense semiconductor plasma-like medium. In this case in the frequency range u;u the following expressions for D(u, k ), Re Q, and Im Q are valid: uX l X X #i (4u/u!1) , (3.33) D(u, k )" k # cu 2c u uu jl(H) l(¹) eX Re Q"! , (3.34) lJ (¹) 8m cuu jH 2 eX l jl(H) l(¹) Im Q"! (!1#53u/u#32u/u!72u/u), 16m cuu jH lJ (¹) 2 c"1/(e k ) . (3.35) The wave field amplitude decreases with y and, therefore, we study the quantities Re Q and Im Q near the source of the SW excitation. One can show that the value of the SW nonlinear frequency shift increases with respect to the SW nonlinear damping decrement with an increase of the collision frequency and the wave eigenfrequency. Moreover, the quantities Re Q and Im Q are of the same order of magnitude in practically the entire frequency range of interest to us. The expression (3.34) suggests that the sign of Re Q is defined by the sign of the derivative jl(H)/jH" . If 2 jl(H)/jH" '0, i.e. when the collision frequency increases with the temperature, the quantity Re Q 2 is negative. 
This means that an increase of the SW linear damping caused by the nonlinear heating occurs. In the opposite case, when the collision frequency decreases with temperature, i.e. when the inequality jl(H)/jH" (0 holds, the quantity Re Q(0. This means the decrease of the SW 2 collisional damping with respect to the linear approximation. The case jl(H)/jH" '0 is realized, 2 e.g. in the scattering of HF electrons by phonon oscillations, while the situation jl(H)/jH" (0 is 2 realized in the scattering of HF electrons by charged impurities [152]. In the first case the SW nonlinear frequency shift Du"c(u/X) Im Q"E" is negative for u(u /7 and is positive for u'u /7 (these values are obtained under the assumption of a dense plasma). If the inequality jl(H)/jH" (0 holds, the signs of Du in both of these frequency ranges are reversed. 2 To estimate the possible impact of the nonlinearity mechanism studied on the resulting SW nonlinear frequency shift, we compare the values of the nonlinear shifts caused by the action of the heating nonlinearity with that caused by the nonparabolicity of the free carrier spectrum (see the preceding section). The ratio Du /Du , evaluated near the source of the SW excitation, is 2 ,. approximately 400 for the following parameters: l/u"0.1, u/u "0.3, E &0.2 eV, ¹"80 K, and
scattering by acoustic phonons (l(H)&H under the assumption of the polarization potential of the electron—phonon interaction, while l(H)&H under the assumption of the deformation potential [152]). However, at a distance from the source of the order of two wavelengths the nonlinear frequency shifts compared can become quantities of the same order of magnitude due to the dissipative nature of the heating nonlinearity. It should also be noted that the impact of the heating nonlinearity increases with increasing u. Now, following the original paper [133], we briefly recall the main features of the influence of carrier heating on the propagation of high-frequency surface waves (the frequency range above the upper hybrid frequency is considered). First of all, it should be noted that the conditions of applicability of the weak heating approximation in this frequency range are more rigid. Namely, for the same parameters that were used for the verification of the condition (3.32), and for wave frequencies of the order of an upper hybrid one, the weak heating assumption is realized for SW field amplitudes of the order of several hundreds of V/cm. For frequencies u&u the sign of the SW nonlinear frequency shift coincides with that of the expression jl(H)/jH" , i.e. is completely 2 defined by the scattering mechanism. When jl(H)/jH" '0 holds, i.e. when the collision frequency 2 increases with temperature, the SW nonlinear frequency shift is positive and taking the nonlinear effects into account leads to an increase of the SW phase velocity. If the opposite inequality holds, the SW nonlinear frequency shift is negative, and the wave phase velocity decreases. The analogous tendencies for the SW nonlinear damping decrement are valid. In the paper [133] a comparison of the SW nonlinear frequency shifts caused by the heating and hydrodynamic nonlinearities is presented (the latter results are presented in the paper [140], see also Section 3.3 of this review). 
Namely, for typical parameters of semiconductor structures the heating nonlinearity is again shown to predominate near the source of the SW excitation. However, the ratio of the corresponding nonlinear frequency shifts is significantly smaller than in the low-frequency range. In other words, the impact of the heating nonlinearity is more significant in the low-frequency range than in the high-frequency one.

Now, following [151], let us briefly discuss the influence of electron heating on the propagation of potential SW in a plasma-like medium with a finite electron pressure. In this case the electron temperature profile undergoes spatial oscillations in addition to the exponential decay with distance from the interface, and resembles the field profile of generalized surface waves [7]. Here we present only a simplified estimate for the amplitude of the electron temperature perturbation:

$$\theta \sim T\,\frac{\nu(T)}{\tilde\nu(T)}\,\frac{V_E^2}{V_T^2}, \qquad (3.36)$$

where $V_E$ is the characteristic velocity of electron oscillations in the wave's electric field, and $V_T$ is the electron thermal velocity. In our case the estimate (3.36) satisfies the weak heating assumption. One can easily see from Eq. (3.36) that the temperature perturbation $\theta$ increases with the parameters $(V_E/V_T)^2$ and $\nu/\tilde\nu$. This is reasonable: the larger the SW amplitude, the greater the heating losses. The increase of $\theta$ with the parameter $\nu/\tilde\nu$ arises because the smaller the fraction of the total electron energy transferred from the electrons to the scattering centres, the more significant the electron heating. Using the nonlinear dispersion relation of the potential SW presented in [151], one can obtain the expression for the SW wavenumber which confirms that the nonlinear wavenumber shift and
the nonlinear damping decrement possess opposite signs. This means that the nonlinear phase velocity decrease corresponds to a decrease of dissipative losses and vice versa. It should be noted that the increase of the electron collision frequency with temperature leads to an increase of the SW phase velocity and to an increase of the damping decrement compared with the results of the linear theory. The opposite situation, when the electron collision frequency decreases with h, leads to a decrease of the SW phase velocity and to a decrease of the damping decrement. One can also show that the nonlinear wavenumber shift and the nonlinear damping decrement decrease with the SW frequency. However, this dependence is less significant than that on the parameter (» /» ). # 2 This seems quite natural due to the fact that this parameter is proportional to "E", and the intensity of nonlinear effects increases with the increase of the wave amplitude. An increase of the parameters X e\/l and l(¹)/lJ (¹) leads to an increase of the SW wavenumber. * Now let us compare the action of the heating nonlinearity mechanism with that caused by the nonparabolicity of the free carrier spectrum (the latter has been calculated in papers [142,143], see also Section 3.4 of the present review). The ratio of the corresponding normalized nonlinear frequency shifts (du /u) and (du /u) for waves in the frequency range higher than the ,* ,. ,* 2 electron cyclotron frequency can be written as
du ,* u
du \ ¹ ,* " . u E &7" ,. To obtain this estimate both nonlinear frequency shifts have been evaluated for the same parameters, and the value (du /u) was evaluated near the source of the wave excitation, due ,* 2 to its dissipative nature. The heating nonlinearity mechanism can be predominant because the parameter ¹ /E is usually small. But at distances from the point y"0 of the order of two wavelengths the contribution of the nonlinear mechanism that is due to the nonparabolicity of the electron spectrum to the total nonlinear frequency shift becomes predominant. 3.6. Ionization nonlinearity of the SW in gas discharge plasmas In this section we study the influence of the ionization nonlinearity on the propagation of SW in the plasma of a stationary gas discharge [153]. This nonlinearity is very significant in studies of wave properties in surface wave-sustained plasmas, and is of great current interest [154]. The mechanism of the ionization nonlinearity is in some aspects analogous to that of the heating nonlinearity discussed in Section 3.5. The heating of the electrons by the high-frequency field of the surface wave leads to an additional ionization and, therefore, to a stationary density perturbation. In this density perturbed medium a nonlinear frequency shift arises, i.e. the nonlinear SW self-interaction effect is realized. The procedure for describing the ionization nonlinearity in SW-sustained gas discharges was developed in the papers [155,156], and is used in this section to describe its effect on the propagation of SW at the interface between a magnetized gas discharge plasma and a metal. The perturbation of the stationary electron density caused by the ionization nonlinearity can be obtained from the following set of electron transport equations [157,158]: !div(D n grad T )#nl*u*"!enE , D(D n)#l n"on .
(3.37a) (3.37b)
In Eq. (3.37a) the electron energy loss by inelastic collisions with excitation, which usually predominates [157,158], is the only loss mechanism included; o is the volume recombination coefficient, D "¹ /m l and D "¹ /m l are the electron and ambipolar diffusion coefficients, respectively, and l and l are the electron-neutral and ion-neutral collision frequencies, respec tively. The following expressions are valid for the excitation frequencies [155,156], involved in Eqs. (3.37a) and (3.37b): l*"l* exp(!u*/¹ ) , l "l exp(!u*/¹ ), ¹ '(u !u*) , (3.38) l "l exp(!u /¹ ), ¹ ((u !u*) . In Eqs. (3.37a), (3.37b) and (3.38) u* is the threshold energy for the first excited level and u is the ionization energy. The explicit forms of the slowly varying functions l*, l , l are presented in [159]. The results of this section are valid when the ambipolar diffusion in magnetized plasmas dominates (the conditions providing the realizability of ambipolar diffusion and Bohm-like (anomalous) diffusion in RF magnetoplasmas are discussed in [160]). Since the weak nonlinearity approximation is made in this section, the expressions for the electron temperature, the excitation and ionization frequencies can be presented as follows: ¹ "¹ ¹I l* "l* #lJ * where
(3.39a) (3.39b)
jl u u lJ "l * ¹I . lJ *" * ¹I "l* * ¹I , ¹ ¹ j¹ 2 Under the assumption of a local plasma heating (this assumption was already used in Section 3.5 in the studies of the influence of the heating nonlinearity on SW propagation), the heat conduction effects can be neglected. In this case, using the electron energy balance equation (3.37), one can obtain the following expression for the electron temperature variation in the discharge: l ¹I (u#u) e ¹I " "E " , (3.40) 3l* u* (u!u) m 6* where E "(c/u) (e k !e s )/(e!e ) A exp(!s x) exp(!k "y") is the linear solution for the 6* normal component of the SW electric field. The expression (3.40) was obtained for the SW eigenfrequencies satisfying u
Under the assumption of the volume recombination-controlled regime the equilibrium electron density and its modification caused by the action of the ionization nonlinearity can be easily obtained from Eq. (3.37b): nJ "E " " 6* , (3.41) n E L where 3m (u!u) l* u* (3.42) E " LG (u#u)u* l e is the normalizing field amplitude which characterizes the effect of the ionization nonlinearity. The expression (3.41) suggests that in the case studied an increase of the electron density with respect to its stationary value occurs. It should also be noted that by assuming that the external magnetic field vanishes in Eq. (3.41) one can easily obtain the expression for the normalizing electric field of the ionization nonlinearity presented in [154] in the case of SW in a free plasma (the inequality u
Du "!u (nJ /2n ) . (3.43) ,* Therefore, the action of the ionization nonlinearity in the volume-recombination-controlled regime leads to a decrease of the SW phase velocity according to the expression (3.43). Now let us briefly discuss the features of the nonlinearity studied under conditions of the diffusion-controlled regime. In this case the recombination particle losses are neglected in Eq. (3.37b). Nevertheless, even in this case Eq. (3.37b) is a differential equation, and it is not easy to obtain its general solution in the perturbed state. However, simple analytical solutions of Eq. (3.37b) can be obtained in two limiting cases [154]. The first case is realized if the skin depth of the SW, 2n/s , is larger than the characteristic ambipolar diffusion length K"(D /l ). In this case the reduced electron density is realized in the regions of high field amplitudes (nJ (0). In the opposite case, which is realized when the SW skin depth is smaller than K, the ionization nonlinearity leads to a positive value of the electron density perturbation. The scope of the SW nonlinear evolution under such conditions can be discussed in analogy to the discussion in Section 3.3. Finally, let us note that numerical estimates suggest that the ionization nonlinearity is the most significant nonlinear self-interaction mechanism of the SW of interest to us in low-temperature gas discharge plasmas. 3.7. Nonlinear theory of azimuthal SW The aim of this section is to study three- and four-wave interactions of azimuthal-type wave perturbations (ASW) propagating in the azimuthal direction across an external magnetic field in magnetoactive plasma waveguides [161]. The elements of the linear theory of this type of wave perturbations have been considered above (Section 1.5). The frequency ranges of existence of ASW are given by the expressions (1.34), while the dependence of all perturbations in the wave field by (1.33).
Let us consider a cylindrical metal waveguide of radius a, fully filled by a cold and homogeneous plasma and immersed in an axial magnetic field. Our analysis is carried out in the dense plasma-like medium approximation used many times above. To be specific, we consider the possible interactions of the azimuthal-type waves from the first frequency range of existence (the frequencies are below the electron cyclotron frequency), namely, the possibility of interactions between the high-frequency and low-frequency wave perturbations. In a HF wave field the ion motions are neglected because the HF ASW eigenfrequency u significantly exceeds the character istic eigenfrequencies of ion motion. The case of the interaction of HF and LF waves can be realized if, e.g. u <X, m
n mc \ , (3.45) u "u 1$ 1# 8 m au where u is given by Eq. (3.44). The frequency of the LF wave perturbation follows from Eq. (3.45) mc \ n . (3.46) 1# X"u m au In the case considered the satellites with frequencies (3.45) are eigenmodes of the structure considered, while the LF wave perturbation with the frequency X is not an eigenwave in general. A careful analysis of the dispersion relations shows that a decay-type of interaction is realized when the following inequalities are satisfied:
X
quasihydrodynamic and Maxwell equations. The analysis is carried out in the weak nonlinearity approximation with the corresponding parameter k"» /» ;1, where » is the characteristic # # velocity of electron oscillations in the wave field, and » "ua/m is the quantity acting as the ASW phase velocity [53]. The solutions for the electromagnetic fields of the surface-type wave perturbations obtained within the framework of the latter approximation have been introduced into the boundary condition E (r"a)"0 representing the vanishing of the tangential component of the P ASW electric field at the interface with a perfectly conducting metal. The use of the procedure described above leads to the following set of equations governing the three-wave interactions being studied: D(m , u )E "b E E*!l E , \ \ \ \ \ \ * (3.49) D(n, X)E "b E E !l E , \ where D(m , u )"u !u /a , a "aX /m c, E is the amplitude of the pump ASW with the \ \ \ \ \ \ frequency u , E is that for the ASW satellite with u , E is the LF ASW amplitude, \ \ D(n, X)"X!u /a , a "aX /nc, l and l are the linear damping decrements, b and b are the \ \ wave coupling coefficients, that are given by
p (m , u , a) exp(!s a) m e (u ) \ (2ns a s# \ \ s b " \ \ \ , \ \ s!s a e (u ) \ \ \ p (n, X, a) exp(!s a) n e (X) (2ns a s ! s , b " \ s !s a e (X) \ p (m , u , a)"e (u )e\(u )g (m , u , a)#g (m , u , a) , \ \ \ \ \ \ \ \ \ p (n, X, a)"!e (X)e\(X)g (n, X, a)#g (n,X,a) , where
X u 2u m u X m c eX exp(!s a) ! \ 1# \ # \ , g (m , u , a)" \ \ 4na u Xa u X u 4m cu u \ \ 2eXu 7u 2m cu X u 3X \ g (n, X, a)" exp(!s a) 1# \ 1# \ \ , (3.50) \ 7m cu Xu 4na u u u 5Xa X \ X u u 2m c u 17 X u 3eXu \ exp(!s a) 3# \ \# \ \ g (m , u , a)"! \ \ Xa X 2 u X 7m cu Xu 2na u u u 2m c u !4 \ 1# \ \ , u Xa X X 2u 12 X u X 17 eXu exp(!s a) 1! 2# \ g (n, X, a)"! \ u 7u 5 u u 2 m cuX 2na \ 3 X X 1u # 5# 1# , c"1/(e k ) . 2u 2u 5X \
Fig. 3.2. The dependence of the normalized decay instability increment c /u on the parameter u /X . Curves 1—4 are "#! plotted for the following values of the parameters m , n, aX /c, respectively: (1) 7, 1, 40; (2) 7, 1, 50; (3) 5, 1, 25; (4) 5, 1, 30. In \ all figures E /E "2.
It should be noted that the coupling coefficients (3.50) have been calculated using a method that is analogous to the general procedure for analyzing three-wave interactions in bounded plasmas (one can find the details of this procedure in application to a cylindrical plasma column, e.g. in the papers [162,163]). After some algebra it is not difficult to show that in the case studied the condition for realizing the instability,
jD b b* \ ju
jD \ '0 , ju X S\ is satisfied. The corresponding decay instability increment and the threshold value of the pump field can be written as follows:
l #l # c "! \ "#! 2
E "[l l /b b ] , \ \ where
l #l \ #b b* , \ 2
(3.51)
l "l [jD/ju"X]\ , l "l [jD/ju" \]\, \ \ S b "b [jD/ju"X]\ . b "b [jD/ju" \]\, \ \ S The dependence of the normalized decay instability increment c /u on the parameter X /u is "#! presented in Fig. 3.2 for several parameters of the plasma waveguide structure studied. One can easily see from this figure that the quantity c increases with an increase of the ASW eigen"#! frequency and the external magnetic field strength, and decreases with an increase of the waveguide radius and the plasma density. Now let us make some numerical estimates for the value of the pump field threshold amplitude. For n &2 ; 10 m\, B "0.05 Tl, l /u "0.05, X/u "0.1, aX /c"40, the value E is about \ \ 32 kV/m for the decay interaction of the ASW with m "7, n"1; and is about 34 kV/m for \ m "5, n"1. One can easily see that such values of the electric field amplitudes can be realized \ experimentally.
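The relation between the growth rate and the pump threshold can be illustrated numerically. The following is a minimal sketch, assuming the generic textbook form of the decay-instability increment, $\gamma = -(\nu_a+\nu_b)/2 + [((\nu_a-\nu_b)/2)^2 + \beta E_0^2]^{1/2}$, which may differ in detail from Eq. (3.51). The function names, the lumped coupling constant `coupling`, and the sample numbers are hypothetical and are not taken from the review.

```python
import math


def decay_increment(nu_a, nu_b, coupling, pump_amplitude):
    """Growth rate of a three-wave decay instability (generic textbook form).

    nu_a, nu_b     : normalized linear damping decrements of the two daughter waves
    coupling       : product of the two normalized coupling coefficients
                     (assumed real and positive, i.e. the instability condition holds)
    pump_amplitude : pump field amplitude, in the units used to define `coupling`
    """
    mean_damping = 0.5 * (nu_a + nu_b)
    half_difference = 0.5 * (nu_a - nu_b)
    return -mean_damping + math.sqrt(half_difference**2 + coupling * pump_amplitude**2)


def threshold_amplitude(nu_a, nu_b, coupling):
    """Pump amplitude at which the growth rate changes sign."""
    return math.sqrt(nu_a * nu_b / coupling)
```

With hypothetical values such as `nu_a = 0.05`, `nu_b = 0.02`, `coupling = 4.0`, the increment returned by `decay_increment` is negative below `threshold_amplitude(...)`, vanishes at it, and is positive above it, which is the behaviour the threshold estimate in the text relies on.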
If, however, the inequalities (3.47) are not satisfied, a non-decaying spectrum is realized in the system. This situation can be realized if at least one of the relations below is valid: (aX /mc)&1, X &u , J (aX /c)&m, X \ interaction is described by the following set of equations: D(X , m )E "p E E , > > > > D(u , m )E "p E E* , \ \ \ \ D(n, X)E "p [E*E #E E* ] , > \ where
(3.52)
D(m , u )"u !u /m , m "[1#(aX /m c)] , \ \ \ \ \ \ D(m , u )"u !u /m , m "[1#(aX /m c)] , > > > > > > p , p , p are the interaction coefficients [161], and D(n, X) is the dispersion function for the LF > \ wave perturbation. In the limiting case s(X)a<1 the dispersion function D(n,X) of the lowfrequency wave perturbation is represented in the form D(n, X)"(X X/c(X!u)[1!ncu /aX (X!u] . J J The set (3.52) can be reduced to the following dispersion equation of a standard form [163], which describes the initial stage of the modulational instability:
1 1 p p "E " # . (3.53) 1" > D(n, X) D(m , X ) D(n , X ) > > \ \ One can conclude from Eq. (3.53) that a modulational instability in the system is realized when the inequality p p /D(n, X)(0 holds. In this case the corresponding instability increment is > n p p cm \ > 1# c " "E " u . (3.54) +-" m D(n, X) aX
Fig. 3.3. The dependence of the normalized modulational instability increment c /u on the parameter u /X . Curves +-" 1—4 are plotted for the following values of the parameters m , n, aX /c, respectively: (1) 10, 1, 40; (2) 10, 1, 34; (3) 7, 1, 30; (4) 7, 1, 35. All figures correspond to the following value of the weak nonlinearity parameter, k"0.1.
A comparison of the increments of the three-wave and four-wave interactions (3.51) and (3.54) shows that n aX (s(X)a)\<1 . c /c " "#! +-" m mc
The dependence of the modulational instability increment on the plasma parameters is presented in Fig. 3.3. The quantity $\gamma_{MOD}$ also increases with an increase of the wave frequency and the external magnetic field strength, and decreases with an increase of the waveguide radius and the plasma density. Finally, it should be noted that the nonlinear theory of the ASW is not limited to the three- and four-wave interactions considered in this section. However, at present the paper [161] is the only one dealing with this theory. Therefore, one can conclude that the nonlinear theory of azimuthal-type waves in plasma-like medium-metal structures remains largely undeveloped.

3.8. Other nonlinear effects

Unfortunately, the limited volume of the present review prevents us from covering all of the nonlinear effects studied so far that involve SW at the interface between a plasma-like medium and a metal. In this section we briefly summarize results on several nonlinear effects that were not discussed above. We begin with the nonlinear damping of the SW. As mentioned above, the nonlinear damping of the SW is an important phenomenon and usually enhances the dissipation compared with the linear approximation. It is not difficult to show that the equation describing the temporal variation of the SW amplitude $A_1$ has the following form (see the details in [164] for the SW in a cold magnetoactive plasma, and in [142,143,165] for the SW in free and magnetoactive warm nonisothermal plasma-like media):

$$\frac{D_1}{D_2}\,\frac{\partial |A_1|^2}{\partial t} = -V\,|A_1|^4 ,$$
where $D_2 = W_2/|A_1|^4$, $W_2$ is the energy density of the volume wave perturbation with the frequency $2\omega$ ($3\omega$), $D_1 = W_1/|A_1|^2$, and $W_1$ is the surface energy density of the SW first harmonic. The solution of the latter equation with the initial condition $|A_1|(t=0) = A_0$ can be written as

$$|A_1|^2(t) = A_0^2\left[1 + (D_2/D_1)\,V A_0^2\,t\right]^{-1} .$$

The quantity $(D_2/D_1)VA_0^2$ characterizes the rate of the nonlinear dissipation of the SW energy. One can see that the SW amplitude decreases with time due to a nonlinear transformation into a volume wave, which removes electromagnetic energy from the interface. Using the latter solution, it is not difficult to show that the amplitude of the first harmonic is halved within the time

$$T_{1/2} = 3D_1\left[D_2 V A_0^2\right]^{-1} .$$

Estimates of the half-decrease time $T_{1/2}$ for SW propagating across an external magnetic field can be found in [164].

The next nonlinear effect we discuss in this section is the interaction of high-frequency SW with low-frequency wave perturbations, which modulate the plasma density and the external magnetic field. This interaction can be realized as a three-wave or a four-wave interaction (the latter can lead to a modulational instability [163]). It should be noted that the LF wave perturbation mentioned above can be of different types (eigenmodes of the structure or forced perturbations, surface or volume type perturbations, etc.). Such problems have already been studied for SW in a planar Voigt geometry [166]. In this geometry the interaction of the SW with LF perturbations gives rise to two satellite waves with frequencies $\omega^{\pm} = \omega \pm \omega_L$ and wavenumbers $k^{\pm} = k \pm k_L$ ($\omega_L$ and $k_L$ are the frequency and wavenumber of the LF perturbations, $\omega_L \ll \omega$). One can show that $\omega_L$ and $k_L$ are coupled by $\omega_L = k_L V_g$, where $V_g$ is the HFSW group velocity. In this case a nonresonant type of modulational interaction of the HF and LF waves is possible. The LF wave perturbation is a forced mixed-type perturbation, i.e.
a superposition of forced perturbations of volume and surface types. An analytical calculation of the nonlinear frequency shift suggests that it is defined by the LF modulation of the magnetic field. However, the estimates of the threshold value of the SW amplitude allows one to state that the LF modulation of the magnetic field mentioned above does not increase in time. This fact allows us to state that the SW in the Voigt geometry studied in the framework of the weak nonlinearity approximation are stable with respect to a LF modulation of the plasma density and the external magnetic field strength. In the paper [167] the possibility of the realization of nonlinear three-wave interactions between the surface eigenmodes of a plasma-like medium—metal waveguide structure in a planar Voigt geometry is discussed. In this paper the situations providing the realization of the necessary conditions for spatial and temporal synchronism, u "u #u , k "k #k , (3.55) are discussed. It should be pointed out that the phase mismatch can be taken into account in the general case. However, we are interested in the realization of the most effective interaction between three surface eigenmodes of the structure when the conditions (3.55) are satisfied exactly. We discuss this problem here in application to two surface modes: the SW propagating across an
N.A. Azarenkov, K.N. Ostrikov / Physics Reports 308 (1999) 333—428
external magnetic field in a cold plasma-like medium and the SW in a warm nonisothermal plasma-like medium without an external magnetic field. In the first case, using the linear dispersion relation [44] $k_j = \dfrac{\varepsilon_2(\omega_j)}{|\varepsilon_2(\omega_j)|}\,\dfrac{\omega_j}{c}\,\sqrt{\varepsilon_1(\omega_j)}$, one can reduce the conditions (3.55) to the following relation:
e (u ) e (u ) e (u ) e (u ) u (e (u ) !(e (u ) "u (e (u ) !(e (u ) (3.56) "e (u )" "e (u )" "e (u )" "e (u )" which is useful in the subsequent analysis. It is easily seen from (3.56), that a decay-type interaction between three eigenmodes from the same frequency range of existence is forbidden (the SW in the geometry studied possess a non-decaying spectrum). However, in the geometry studied one can realize a decay-type interaction between three eigenmodes from different frequency ranges of existence, namely u (u and u 'u . In this case the relation (3.56) allows one to obtain the G F following relation between the frequencies of the high-frequency and low-frequency SW: (e (u )!(e (u ) . u "u (e (u ) The latter condition gives one the possibility to obtain the following estimate: u X , (e (u )!(e (u )" u (u!u G which demonstrates that the frequencies u of the HF SW are close to the upper hybrid frequency, while the relation between the frequency u and the ion cyclotron frequency is arbitrary. The second case mentioned above occurs in the interaction between three eigenmodes of the interface between a warm nonisothermal plasma-like medium without an external magnetic field and a metal. In this case we discuss the possibility of the interaction between a pump wave propagating along the z-axis and two excited waves with eigenfrequencies u "u /2 and wavenumbers satisfying 2k(u /2)cos h"k(u ), where h is the angle between the z-axis and the directions of propagation of the SW with u (i.e. a realization of symmetric decay is discussed in this case). Bearing in mind that cos h(1, we show that k(u /2)'k(u )/2. The latter relation is not realized for the dispersion law of the surface modes studied. 
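The argument for the symmetric decay can be written out in two lines. The following is only a sketch of the reasoning stated above; the label $\omega_1$ for the pump frequency is an assumption made here for illustration.

```latex
% Symmetric decay of a pump wave (frequency \omega_1, an assumed label)
% into two waves of frequency \omega_1/2 propagating at angles \pm\theta
% to the z-axis. Projecting the wave vectors on the z-axis gives:
\begin{align}
  2\,k(\omega_1/2)\cos\theta &= k(\omega_1), \\
  k(\omega_1/2) &= \frac{k(\omega_1)}{2\cos\theta} \;>\; \frac{k(\omega_1)}{2}
  \qquad (0 < \cos\theta < 1).
\end{align}
```

The symmetric decay would therefore require a dispersion law for which the wavenumber at half the pump frequency exceeds half the wavenumber at the pump frequency, which is the condition stated above as not being met by the surface modes studied.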
In the first case the wave coupling coefficients are obtained in the most general form and are expressed through the second order nonlinear susceptibilities [167] (a limiting case allowing the calculation of the latter coefficients in the framework of the hydrodynamic theory is also presented). The conditions providing the most effective interaction between the waves are discussed, and analytical expressions describing the evolution of the amplitudes of the SW in the presence of such an interaction are given. The possibility of realizing a parametric generator based on the nonlinear interaction of SW is discussed. The characteristic length of the nonlinear medium providing the effective interaction is estimated for several important cases. In application to the problems studied above it is also interesting to discuss the influence of the finite spectral width on the three-wave interactions studied. Preliminary investigations in this
direction [168] show that there are several qualitative features related to the finite spectral width. First, the period for energy exchange between the waves grows with the spectral width. Second, the excitation of a broad spectrum of SW occurs in several cases, i.e. the realization of interactions with a finite spectral width yields the conditions for the occurrence of turbulence. And third, the fraction of the total wave energy exchanged in the interaction is a decreasing function of the spectral width. In considering the self-interaction of SW caused by hydrodynamic nonlinearities, Section 3.3, we have already mentioned the action of the ponderomotive force (force of high-frequency pressure acting on electrons) leading to an increase of the static surface perturbations (3.14) and the corresponding self-interaction channel 0#u"0. This analysis was carried out in a cold plasmalike medium approximation. If we, however, take into account the finite electron temperature ¹ in studies of SW nonlinear self-interaction effects, we have to take into account the so-called “striction” nonlinearity [169]. One can easily show that the SW studied in the non-linear regime are characterized by the normal component of the ponderomotive force F (the y-component of ,*V this force is equal to zero) [99]. This force yields the nonlinear electric field at zero frequency directed normally to the interface. The occurrence of the latter field leads to a corresponding drift of the plasma particles (see Eq. (3.14)) in the direction perpendicular to both the nonlinear electric and the external magnetic fields (drift in crossed E and H fields). In finite pressure plasmas the balance between the ponderomotive force pressure and the plasma gaskinetic pressure leads to the following electron density disturbance caused by the action of the striction nonlinearity [153]: "E " d L "! 
6* , (3.57) E n where E is a linear solution for the SW electric field, and E is the normalizing field which 6* characterizes the action of the striction nonlinearity. The expression (3.57) is valid for the SW in the Voigt geometry in the frequency range u(u . In Eq. (3.57) u E " (4m ¹ . (3.58) e The expression (3.57) confirms the well-known fact that the action of the striction nonlinearity leads to a decrease of the electron equilibrium density in the regions with an intense electromagnetic field [169]. To compare the impacts of striction and ionization nonlinearities on the SW nonlinear frequency shift (3.43), the ratio of the characteristic normalizing electric fields (3.58) and (3.42) was studied. The results of these studies suggest that the striction nonlinearity is characterized by larger values of E . Therefore, in gas discharge plasmas the impact of the ionization nonlinearity is more L significant. The presence of the ponderomotive force also gives rise to the possibility of the interaction of SW with low-frequency quasi-neutral plasma responses of the magnetosonic (in magnetized plasmas) or the ion (hole) acoustic types. However, the corresponding variations of the stationary plasma density caused by this mechanism appear to be smaller than Eq. (3.57) in a wide range of the plasma parameters [153]. Another interesting nonlinear effect which can be realized in a plasma-like medium—metal waveguide structure is the realization of bistable states under the excitation of the SW by means of the attenuated total reflection (ATR) method. The paper [170] deals with this nonlinear effect in
application to the excitation of the SW propagating across an external magnetic field at a semiconductor–metal interface by means of the ATR method in the Kretschmann configuration (a planar slab of a free-electron metal is sandwiched between an n-type semiconductor and a dielectric simulating the prism material in a prism coupler; a plane TM wave is incident on the interface between the prism and the metal). The significant field amplification provided by this modification of the ATR method results in nonlinear behaviour of the excited modes. The possible nonlinear effects resulting in the SW nonlinear frequency shift are taken into account through the formalism of the third-order nonlinear susceptibilities. The resulting amplitude-dependent frequency shift of the SW modifies the conditions of excitation by the ATR method, so that the excited SW acts back on the excitation system. This feedback results in a bistable regime of SW excitation in the structure studied: as the input power is varied, the system switches between the regimes of total reflection (TR) and ATR, and the switching in the two directions occurs at different values of the input power. The numerical results suggest a hysteretic dependence of the output amplitude on the input amplitude, and show that this effect is realized at reasonably low values of the input power. It should be pointed out that the nonlinear solutions discussed above relate to homogeneous plasma-like media. The nonlinear hydrodynamic echo, by contrast, is a nonlinear effect which is due to the inhomogeneity of the near-wall plasma region. In the paper [171] the possibility of the hydrodynamic echo effect in both gaseous and semiconductor plasmas bounded by a metal wall (in the Voigt geometry) is shown. This effect is possible because small-scale oscillations in the vicinity of local resonances may retain the phase memory of the initial perturbations.
The analysis shows the possibility of using this effect in diagnostics of near-surface transient layers in semiconductor and gaseous plasmas. In [171] a temporal two-pulse hydrodynamic echo has been studied. The criterion for the existence of the echo effect and the expressions for the electromagnetic field signals are obtained. The generation time of the plasma response turns out to depend on the parameters of the near-wall transient layer, and provides information about the density profile and other parameters of the nonuniform plasma-like medium. We must also mention the paper [172] where strongly nonlinear surface-type solutions of the MHD equations were obtained in a plasma layer bounded by metal walls. These solutions correspond to the Alfven-type nonlinear MHD waves, which are the superposition of a stationary nonuniform flow and a high-amplitude low-frequency wave perturbation propagating across the external magnetic field. Now let us briefly mention several papers which are not related to the Voigt geometry. In the paper [173] the possibility of the propagation of a finite-amplitude nonlinear surface wave at the interface between a cold free plasma and a perfectly conducting metal is shown (note that linear SW at this interface are impossible [100]). General solutions are obtained without any simplifying assumptions about the level of the nonlinearity, i.e. strongly nonlinear surface waves are considered. The ionization and the striction nonlinearities are assumed as the possible mechanisms providing the nonlinear plasma response. In this paper the nonlinear SW is shown to create the propagation conditions for itself by acting on the medium where the linear SW are impossible. 
It is natural to expect that the action of the SW field leads to some redistribution of the plasma density (recall that in the case studied SW are possible in an inhomogeneous plasma with special profiles of the plasma density [88]), and that this redistribution changes the
electrodynamic properties of the medium. These changes should be strong enough to provide the conditions for the propagation of the surface wave. This means that the SW amplitude must be too large to be described by the weak nonlinearity approximation used in the previous sections. The nonlinear SW considered in [173] has a linear analog, namely the SW at the plasma–dielectric interface in a metal–plasma–dielectric structure. In this case the plasma–dielectric interface acts as the waveguide surface (the surface separating two media with opposite signs of $\varepsilon$). Therefore, the nonlinear surface wave propagating along the metal boundary creates for itself a complicated waveguide structure which consists of a region with a negative dielectric constant (the near-wall region, $\varepsilon < 0$) and a region with $\varepsilon > 0$ which acts as a dielectric. It is also interesting to mention the papers on the resonant second harmonic generation of waves propagating along an external magnetic field at the interface between a plasma-like medium and a metal [123,174], to make numerical estimates of the characteristic times of energy transfer $T_{NL}$, and to compare them with the linear period $T_L$ and with the characteristic times of energy transfer in the Voigt geometry. In samples of the semiconductor n-GaAs, for the same parameters as in Section 3.1, the ratios $T_{NL}/T_L$ are about 13.4 and 12.5, respectively. It is not difficult to see that in this case the characteristic times of the second harmonic generation process are shorter than in the Voigt geometry. This can be explained by the fact that the ratio of the corresponding coupling coefficients in the cases of longitudinal and transverse propagation is of the order of $\Omega_e/\omega \gg 1$. Therefore, the parametric coupling of the first and second harmonics is stronger in the case of longitudinal propagation.
In this case taking the nonlinear self-interaction effects into account also leads to an increase of the characteristic times $T_{NL}$ [126,174]. Finally, we mention the papers [165,175,176] dealing with the nonlinear self-interaction processes that take place when the SW propagate at the interface between a warm plasma-like medium and a metal (the effect of the nonparabolicity of the free carrier spectrum is studied in [165,175], while the influence of electron heating on the SW propagation is discussed in [176]).
4. Basic experimental results

Although the present report is mainly devoted to the theoretical basis and the prospects of studies and applications of the surface waves propagating in the Voigt geometry at a plasma-like medium–metal interface, we briefly review here the basic experimental data demonstrating the existence of these SW in several plasma structures. This section is therefore devoted to a review of experimental investigations of the dispersion properties and spatial field distributions of surface-type waves in plasma waveguide structures bounded by a metal. As was mentioned earlier, the first experimental studies of the SW considered were carried out for semiconductor plasmas [29,30]. In these papers SW in the Voigt geometry were studied. The experimental studies of the properties of SW in a gas-discharge plasma bounded by a metal [31,32,177] are connected with investigations of the impedance properties of metal antennas immersed in a plasma. Although the latter studies are not related to the Voigt geometry, we still present them here because these results are of significant interest for understanding wave processes at the interface between a plasma-like medium and a metal. In
papers [31,32,177] the propagation of a high-frequency signal along an antenna was explained as a surface wave at the interface between the antenna and a plasma. In our view, the experimental results most closely related to the subject of this review are presented in the paper [32]. Toda [29] observed a microwave surface signal in the part of the semiconductor waveguide filled by a crystal of n-In Sb with an average electron density &8;10 m\. The experiment was carried out at a liquid nitrogen temperature ¹"77 K. The surface of the crystal was covered by copper plates which formed a rectangular waveguide with a cross-section 2;2 mm and a length 5 mm. The sample studied was placed in a basic waveguide as shown in Fig. 4.1. An external magnetic field H was directed transverse to the transmitted microwave power. In the absence of the external magnetic field the signal was registered at a noise level, while the intensity of the registered signal increased with H . This result is in a good agreement with the theoretical assumptions about the absence of SW at the interface of a cold homogeneous plasma and a metal in the absence of an external magnetic field [100], and about the possibility of the existence of SW at the interface between a cold magnetoactive plasma-like medium and a metal (see Section 1.1 of this review). Then the metal plane 1 covering the top of the sample was removed, and the microwave power transmitted along the lower boundary of the waveguide for two opposite directions of the external magnetic field H and H was registered (Fig. 4.1). It was shown that the ratio of the amplitudes of the registered microwave signals A /A corresponding to the external magnetic field directions H and H is about 100. It was also mentioned that the direction H is “closing” for the surface signal propagating along the lower waveguide boundary. If only the lower plane was removed, then the direction of wave propagation was reversed. Fig. 
4.2 presents the dependence of the transmitted power P along the lower waveguide boundary on the value of the external magnetic field. One can easily see from this figure that the amplitudes of the transmitted
Fig. 4.1. Schematic view of the waveguide configuration used in Toda’s experiments. Here 1 — rectangular waveguide; 2 — basic waveguide. H and H denote two different orientations of the external magnetic field. Fig. 4.2. The dependence of the transmitted power P along the lower waveguide boundary on the external magnetic field strength. The input power is about 2 mW, the wave frequency u"1.5;10 rad/s. Curve 1 corresponds to the H orientation of the external magnetic field, curve 2 — to the H orientation.
Fig. 4.3. Schematic diagram of the waveguide configuration used in the experiment [30]. Here H and H denote two different orientations of the external magnetic field.
signals differ significantly. These results demonstrate that the SW propagating at the interface between a cold plasma-like medium and a metal are unidirectional waves. This fact is in good agreement with the results of Section 1.1 of the present review. In the experiment an increase of the wave amplitude with an increase in the strength of the external magnetic field was also observed. In our view, the latter fact is due to a variation of the ratio u/u with an external magnetic filed H . The latter ratio defines the damping of the waves considered. The paper of Hirota and Suzuki [30] is a logical extension of the paper [29]. They suggested a new type of solid-state plasma waveguide in which a layered semiconductor—dielectric structure bounded on both sides by a metal (Cu) was used. A single-crystal of InSb with n "9;10 m\ at a temperature ¹"77 K was used. The polycrystal CuO was used as the dielectric. Note that the dielectric constants of CuO (18.1) and InSb (18.7) are close so that the authors obtained the possibility to synchronize the phase velocities of the SW propagating along the InSb—Cu and InSb—CuO interfaces (Fig. 4.3) with an error of less than 1.6%. In the experiment considered, as in the paper [29], two opposite directions of the external magnetic field H and H were used, as shown in Fig. 4.3. The metal cover, which was in contact with InSb could be removed simply. The case where the cover was present was called a “closed boundary state” by the authors, while the opposite situation was called an “open boundary state”. In the experiments the sample sizes and the external magnetic field strengths were comparable with those used in Toda’s experiments. The dependences of the transmitted power on the value of H for different microwave frequencies, external magnetic field orientations, and boundary states were obtained. 
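The quoted synchronization error can be checked with a rough order-of-magnitude estimate. The sketch below assumes, purely for illustration, that the phase velocity at each interface scales as $1/\sqrt{\varepsilon}$ of the adjoining dielectric background; this scaling is a simplification introduced here and is not the dispersion relation of the actual surface waves. The function name is hypothetical.

```python
import math


def phase_velocity_mismatch(eps_a: float, eps_b: float) -> float:
    """Relative difference of two phase velocities, assuming v ~ 1/sqrt(eps).

    The 1/sqrt(eps) scaling is an illustrative assumption only; it is not
    taken from the experiment or from the surface-wave dispersion relation.
    """
    v_a = 1.0 / math.sqrt(eps_a)
    v_b = 1.0 / math.sqrt(eps_b)
    # Normalize by the larger velocity so the result is symmetric in a and b.
    return abs(v_a - v_b) / max(v_a, v_b)


EPS_CUO = 18.1   # dielectric constant of CuO quoted in the text
EPS_INSB = 18.7  # dielectric constant of InSb quoted in the text

mismatch = phase_velocity_mismatch(EPS_CUO, EPS_INSB)
print(f"relative phase-velocity mismatch: {100 * mismatch:.2f} %")
```

Under this assumption the mismatch comes out at roughly 1.6 %, which is of the same order as the synchronization error quoted for the experiment.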
The measurements showed that in the “closed boundary state”, for the external magnetic field orientation H , the microwave power is transmitted along the InSb—Cu interface. For the opposite orientation H the microwave power is transmitted along the InSb—CuO interface. The authors claim that as in the paper of Toda [29], the magnetic field direction H is “closing” for the SW at the plasma-like medium—metal interface, i.e. for this orientation the microwave power is transmitted through the SW at the CuO—InSb interface. However, here we are only interested in the SW at the interface of a plasma-like medium and a metal, and the results associated with the studies of the wave at the CuO—InSb interface are outside the scope of this review. In our view, the results associated with the identification and measurement of the level of the losses of the SW at the plasma-like medium—metal interface, as well as with the coupling of these SW with a wave at the
Fig. 4.4. Schematic view of the cylindrical metal antenna in a magnetized plasma used in the experiment [32]. Here 1 — anode, 2 — coaxial cable, 3 — antenna. Fig. 4.5. Dispersion curves of the SW for different values of the transient layer width. Here u "3.52;10 rad/s, X "2.5u , l"u /22. Antenna parameters: width — 21 cm, diameter — 0.8 cm. Curvers 1—3 correspond to the following values of the transient layer width s"16r ; 28r ; 39r , respectively. " " "
semiconductor–dielectric interface, are the most interesting results of the paper [30] related to the subject of the present report. In the paper [32] the wave properties of an antenna immersed in a magnetoactive gas discharge plasma were studied. The cylindrical antenna made from a non-magnetic material (length 20 cm, diameter 1–4 mm), coupled with a coaxial cable, was inserted into a discharge chamber through a special slot in the anode of the gas discharge tube. The latter was placed in an axial magnetic field as shown in Fig. 4.4. A helium plasma was produced in a glass tube under the conditions of a DC glow discharge (cold cathode) at a pressure of 6–14 Pa. The value of the external magnetic field at the center of the discharge chamber was varied from 0 to 0.02 T. The dispersion characteristics of the SW propagating along the external magnetic field were studied. One of the main problems posed by the authors of [32] was to study the influence of the transient layer between the metal and the plasma on the HF properties of the antenna. The parameters of the transient layer were varied by applying a weak negative DC voltage with respect to the anode. The width of the transient layer was varied in a series of experiments from 10 to 1000 Debye radii. Fig. 4.5 presents the SW dispersion curves for different values of the transient layer width $s$. As $s$ decreases, the wave frequency decreases and can reach a value that is smaller than $\omega_e$. We assume that this occurs because $\omega_e$ is the limiting frequency of the SW at the plasma–metal interface, while the SW at the plasma–sheath interface possess the limiting frequency $\Omega_e/\sqrt{2}$. The decrease of the transient layer width leads to a transformation of the wave studied into the SW at the plasma–metal interface with the limiting frequency $\Omega_e/\sqrt{2}$. In the paper [32] an investigation of the plasma–metal interface was presented as well. The results of this investigation are given in Fig.
4.6, and demonstrate that the electron cyclotron frequency is the limiting frequency of this type of SW. In the first case, as is noted by the authors, the antenna diameter is smaller than the SW skin depth, and their dispersion is to be described in a cylindrical geometry. In the second case the antenna diameter is larger than the SW
Fig. 4.6. Dispersion diagrams (curves 1, 2) and the attenuation coefficients (curves 3, 4) of the SW at the antenna— metal interface. These diagrams are valid for the following plasma parameters: u "3.52;10 rad/s, X "4u , l"u /28. Curves 1, 3 correspond to the antenna with the diameter d"0.08 cm, while curves 2, 4 to that with d"0.45 cm.
skin depth, and the semi-infinite plasma model can be used to make the comparison between the theory and the experiment. Our numerical estimates show that for the plasma parameters corresponding to Fig. 4.6 the SW skin depth is of the order of 3 mm for frequencies u;u , and decreases with an increase of the wave eigenfrequency. In our view, the semi-infinite plasma mode can be used only for qualitative estimates in this case. The paper [32] also states that when the transient layer is absent the frequency dependence of the wavenumber obtained experimentally corresponds with a good accuracy to the approximate formula for an infinitely thin wire immersed in a magnetoplasma, k"(u/c)(e , which to within a factor of (2 coincides with the wave number of a SW propagating along an external magnetic field at a plasma—metal interface [77]. This demonstrates, at least, a good qualitative agreement between the experimental results of the paper [32] and the theoretical conclusions of the paper [77]. Fig. 4.6 also illustrates the frequency dependence of the SW attenuation coefficient (curves 3,4), showing that the SW damping increases as the frequency approaches the electron cyclotron frequency. In our view, the confirmation of the existence of surface waves at the gas discharge plasma—metal interface, and the investigation of their dispersion characteristics and attenuation coefficients are the most interesting experimental results of the paper [32]. The results of studies of the influence of a specially formed waveguide channel between a plasma and a metal, allowing to increase the SW limiting frequency to X /(2 (i.e. to the limiting frequency of the SW at the plasma—dielectric interface), are of a specific interest. These investigations are carried out in parallel with studies of the influence of a metal screen on the propagation of SW at the plasma—dielectric interface and with the studies of
properties of layered metal–dielectric–semiconductor (MDS), MSD, and MSDM structures. However, these results are not closely related to the subject of this review, and we do not discuss them here. Thus, in this section the basic experimental results verifying the existence of surface waves at the interface between a plasma-like medium and a metal have been presented for the particular cases of their propagation transverse to the external magnetic field in semiconductor plasmas, and parallel to the magnetic field in gas discharge plasmas.
5. Conclusions

In the present Report the results obtained in the theory of surface magnetoplasma waves propagating at the interface between a plasma-like medium and a metal in the Voigt geometry are reviewed. The basic experimental data demonstrating the existence of SW in the structures studied are presented as well. Let us briefly recall the main results reviewed above. First, the application of an external magnetic field parallel to the metal interface makes possible the existence of purely electromagnetic surface waves with very interesting, and somewhat unique, properties (unidirectionality of propagation; purely electromagnetic modes that are at the same time slow waves; minima of the electromagnetic field components, etc.). In the Voigt geometry unidirectional SW can also exist in plasma-like media when a finite electron pressure is taken into account (in this case the SW are potential waves). Second, surface waves propagate in different Voigt-like geometries (planar and cylindrical geometries, including planar and cylindrical slabs, rectangular waveguides, waveguides with an arbitrary shape of their cross-section, and current-driven cylindrical configurations). In all these geometries the SW are described from the same point of view. In particular, in cylindrical metal waveguides fully filled by magnetoactive plasmas the existence of SW with a discrete spectrum, namely azimuthal surface waves (ASW), is possible. Although the ASW have much in common with the usual SW in a planar geometry (unidirectionality, possibility of the existence of potential waves in finite pressure plasmas, etc.), they possess many specific features caused mainly by the curved waveguide surface (discrete spectrum, different frequency ranges of existence, field topography, etc.). Another interesting feature of the SW studied is the following.
In a planar slab of a homogeneous plasma medium bounded by metal planes, the SW dispersion characteristics in the Voigt geometry do not depend on the layer sizes. Taking into account transverse inhomogeneity of the medium leads to the occurrence of this dependence. In the case of the cylindrical coaxial structure filled with a plasma, the dispersion properties of the azimuthal surface waves depend on the layer sizes. Concerning the excitation of SW in the Voigt geometry, the parametric, diffraction, and drift mechanisms can be regarded as the most convenient excitation mechanisms. Under the parametric excitation in a spatially homogeneous electric pump field two surface eigenmodes of the structure from different frequency ranges of existence and propagating in different directions are excited. This kind of parametric instability may be saturated due to the self-interaction effect of the excited SW. In the framework of the approach used this saturation mechanism is more effective for high-frequency waves. The influence of a near-wall transient layer on the parametric excitation of the plasma—metal waveguide structure is also reviewed. The inhomogeneity in the transient layer is shown to lead to an increase of the instability increment and to a decrease of the pump field
threshold value compared with the homogeneous plasma case. The parametric excitation of ASW is also reviewed. In a cylindrical metal waveguide filled with a plasma two azimuthal surface waves from different frequency ranges of existence and possessing equal absolute values of the mode numbers are parametrically excited in a radial, spatially homogeneous, electric pump field. The main results of the nonlinear theory of the SW in plasma—metal waveguide structures are reported as well. Although the analysis is limited by the weak nonlinearity approximation, many interesting nonlinear effects associated with the SW in the Voigt plasma—metal geometry arise. One of the most interesting nonlinear effects is the resonant second harmonic generation of SW. This effect can be realized in different plasma-like medium—metal structures (gaseous and semiconductor plasmas, and semiconductor superlattices). In all these situations the relatively high efficiency of this process is caused by an interesting feature of the SW, namely, the presence in their spectrum of a region with linear dispersion, which provides a good phase matching between the first and the second harmonics of the SW. The self-interaction of the SW with the fundamental frequency leads to an increase of the characteristic temporal and spatial scales of this process. The scope of the nonlinear interactions of SW and ASW is also discussed, and the conditions providing such interactions are derived. Close attention is paid to the nonlinear self-interaction effects associated with the SW studied. We reviewed the SW self-interaction from a general point of view by means of the formalism of the third-order nonlinear susceptibilities. We also specified this nonlinear effect by the discussion of several possible self-interaction mechanisms such as the striction, heating, and ionization nonlinearities, and the nonlinearity caused by the nonparabolicity of the free carrier spectrum. 
Under certain conditions each kind of nonlinearity can be significant in studies of the SW nonlinear self-interaction effect. In many cases a comparison of the impacts of different nonlinearity mechanisms on the resulting SW nonlinear frequency shift is provided. The results associated with the nonlinear evolution of the SW field envelope are reviewed as well. An outline of the basic experimental data showing that the SW in plasma-like medium–metal structures can indeed be observed experimentally is also presented. However, a brief look at the available theoretical and experimental results associated with the SW at the plasma-like medium–metal interface shows that at present there is a gap between theory and experiment in this area. Meanwhile, the SW in the Voigt geometry are very attractive for practical use due to their interesting features. Let us mention only a few possible applications of this kind of surface wave. One of them is the design of radiating elements with a narrow field pattern. The excitation of the SW studied can serve as an effective diagnostic tool for plasma-like materials such as semimetals, semiconductors, gyrotropic dielectrics, etc. A large number of parameters characterizing the media in contact (inhomogeneity, conductivity, averaged density in a transient layer, band structure, nonlinear susceptibilities of the second and the third order of the material, dominant carrier scattering mechanisms, etc.) can be diagnosed by the excitation of SW. Another opportunity for the application of the SW is the design of a number of functional electron devices, such as frequency converters, phase shifters, transmission lines, non-reciprocal elements, power dividers, input and output coupling elements, electro-optical switches and modulators, polarization plane rotators, etc. Finally, different types of waves in the Voigt-like geometry can produce and sustain RF and microwave gas discharges.
An example of such discharges based on surface cyclotron waves at a plasma—metal interface is discussed in [178].
N.A. Azarenkov, K.N. Ostrikov / Physics Reports 308 (1999) 333—428
Therefore, we hope that the interesting features of the SW discussed above, and the prospects for their applications outlined, will attract the attention of experimentalists to this subject and will help to close the existing gap between theory and experiment in this area. We believe that the subject covered has a future from a theoretical point of view as well. There are still many unresolved problems, especially in the nonlinear theory of SW (in particular, the nonlinear effects associated with the propagation of azimuthal surface waves are still far from being studied in detail). This means that further studies of the properties of surface waves in the geometry reviewed and in several other geometries are necessary. Finally, we emphasize that this report aims to attract the attention of the scientific community to the interesting wave phenomena occurring at the interface between a plasma-like medium and a metal, and to stimulate experimental investigations and technological applications of them.
Acknowledgements
The authors thank many of their colleagues, in particular Prof. A.N. Kondratenko, for collaboration in some of the work discussed and for stimulating discussions, and the referee for valuable and constructive comments. This work was partially supported by the Science and Technology Center in Ukraine. K.N.O. thanks the Alexander von Humboldt Foundation for financial support during the preparation of the final version, and Professor M.Y. Yu for fruitful discussions and hospitality at the Institute for Theoretical Physics I, Ruhr-University-Bochum.
References
[1] L.D. Landau, E.M. Lifshits, Electrodynamics of Continuous Media, Pergamon Press, Oxford, 1960.
[2] P.E. Vandenplas, Electron Waves and Resonances in Bounded Plasmas, Interscience, New York, 1968.
[3] P. Halevi, Surface Sci. 79 (1978) 64.
[4] V.M. Agranovich, D.L. Mills (Eds.), Surface Polaritons, North-Holland, Amsterdam, 1982 **.
[5] A.D. Boardman (Ed.), Electromagnetic Surface Modes, Wiley, New York, 1982 **.
[6] M. Moisan, A. Shivarova, A.W. Trievelpiece, Plasma Phys. 24 (1982) 1331 **.
[7] A.N. Kondratenko, Surface and Volume Waves in Bounded Plasmas, Energoatomizdat, Moscow, 1985 *.
[8] S. Vucovic (Ed.), Surface Waves in Plasma and Solids, World Scientific, New York, 1986.
[9] P. Halevi (Ed.), Spatial Dispersion in Solids and Plasmas, North-Holland, Amsterdam, 1992.
[10] S.V. Vladimirov, M.Y. Yu, V.N. Tsytovich, Phys. Rep. 241 (1994) 1 **.
[11] P. Halevi (Ed.), Photonic Probes of Surfaces. Electromagnetic Waves: Recent Developments in Research, Elsevier, Amsterdam, 1995.
[12] R.F. Wallis, in: M. Balkanski (Ed.), Optical Properties of Semiconductors, Handbook on Semiconductors, vol. 2, Elsevier, Amsterdam, 1995.
[13] T. Tamir, Guided-Wave Optoelectronics, 2nd ed., Springer Ser. Electronics and Photonics, vol. 26, Springer, Berlin, 1990.
[14] R.G. Hunsperger, Integrated Optics. Theory and Technology, Springer, Berlin, 1995.
[15] R.J. Bell, R.W. Alexander et al., Surface Sci. 48 (1975) 253.
[16] N. Bloembergen (Ed.), Nonlinear Spectroscopy, North-Holland, Amsterdam, 1977.
[17] Y.R. Shen, Principles of Nonlinear Optics, Wiley, New York, 1984.
[18] C.M. Ferreira, M. Moisan (Eds.), Microwave Discharges: Fundamentals and Applications, Plenum Press, New York, 1993 *.
[19] F.F. Chen, Phys. Plasmas 2 (1995) 2164.
[20] M. Moisan, Z. Zakrzewski, Plasma Sources Sci. Technol. 4 (1995) 379.
[21] M. Moisan, G. Sauve, Z. Zakrzewski, J. Hubert, Plasma Sources Sci. Technol. 3 (1994) 584.
[22] I. Zhelyazkov, V. Atanassov, Phys. Rep. 255 (1995) 79 **.
[23] Yu.M. Aliev, U. Kortchagen, H. Schluter, A. Shivarova, Phys. Rev. E 51 (1995) 6091.
[24] Yu.M. Aliev, H. Schluter, A. Shivarova, Plasma Sources Sci. Technol. 5 (1996) 514.
[25] C.F.M. Borges, M. Moisan, A. Gicquel, Diamond and Related Materials 4 (1995) 149.
[26] J.K. Kirk, D.B. Melrose, E.R. Priest, Plasma Astrophysics, Springer, Berlin, 1995.
[27] E.A. Kaner, V.G. Skobov, Adv. Phys. 17 (1968) 605.
[28] V.M. Agranovich, V.L. Ginzburg, Crystal Optics with Spatial Dispersion and Excitons, Springer, Berlin, 1984.
[29] M. Toda, J. Phys. Soc. Japan 19 (1964) 1126 *.
[30] R. Hirota, K. Suzuki, J. Phys. Soc. Japan 21 (1966) 1112 *.
[31] T. Ishisone et al., Proc. IEEE 58 (1970) 1843, 1852.
[32] J.J. Laurin, G.A. Morin, K.G. Balmain, Radio Sci. 24 (1989) 289 *.
[33] A.B. Murphy, Fusion Eng. Des. 12 (1990) 79.
[34] O. Pogutse, J.G. Gordey, W. Kerner, 22nd EPS Conf. Contr. Fusion and Plasma Physics, Bournemouth, UK, July 1995, Abstr. Inv. and Contr. Papers, p. 362.
[35] J.G. Laframboise, J. Rubinstein, Phys. Fluids 12 (1976) 1900.
[36] N.A. Azarenkov, A.N. Kondratenko, K.N. Ostrikov, Sov. Phys. Radiophys. Quantum Electron. 36 (1993) 335 *.
[37] L. Stenflo, Phys. Scr. T63 (1996) 59 **.
[38] R.F. Wallis et al., Phys. Rev. Lett. 28 (1972) 1455.
[39] B.G. Martin, A.A. Maradudin, R.F. Wallis, Surface Sci. 77 (1978) 416 *.
[40] R.S. Brazis, Lit. Phys. J. 21 (1981) 73.
[41] M.S. Kushwaha, P. Halevi, Phys. Rev. B 36 (1987) 5960.
[42] F.G. Elmzughi, D.R. Tilley, J. Phys. Condensed Matter 6 (1994) 4233.
[43] F.G. Bass, A.A. Bulgakov, A.P. Tetervov, High-Frequency Properties of Semiconductors with Superlattices, Nauka, Moscow, 1989.
[44] N.A. Azarenkov, A.N. Kondratenko, V.N. Melnik, V.P. Olefir, Sov. J. Commun. Technol. Electronics 30 (1985) 2195 *.
[45] S.R. Seshadri, IRE Trans. MTT MTT-10 (1962) 573 *.
[46] W.P. Allis, S.J. Buchsbaum, A. Bers, Waves in Anisotropic Plasmas, MIT Press, Cambridge, MA, 1963 *.
[47] N.A. Azarenkov, K.N. Ostrikov, Plasma Phys. Rep. 17 (1991) 316.
[48] N.A. Azarenkov, K.N. Ostrikov, Contrib. to Plasma Phys. 31 (1991) 637.
[49] N.A. Azarenkov, A.N. Kondratenko, Ukraine Phys. J. 30 (1985) 718.
[50] A. Erdelyi et al., Higher Transcendental Functions, McGraw-Hill, New York, 1953.
[51] N.A. Azarenkov, A.N. Kondratenko, Sov. J. Commun. Technol. Electron. 34 (1989) 1525.
[52] N.A. Azarenkov, A.N. Kondratenko, K.N. Ostrikov, Sov. J. Commun. Technol. Electron. 35 (1990) 29.
[53] V.O. Girka, I.O. Girka, A.N. Kondratenko, V.I. Tkachenko, Sov. J. Commun. Technol. Electron. 34 (1989) 96.
[54] N.A. Azarenkov, A.N. Kondratenko, K.N. Ostrikov, Sov. Techn. Phys. Lett. 15 (1989) 68.
[55] N.A. Azarenkov, K.N. Ostrikov, I.B. Scherbinina, Sov. J. Commun. Technol. Electron. 35 (1990) 2031.
[56] N.A. Azarenkov, K.N. Ostrikov, Plasma Phys. Rep. 16 (1990) 225.
[57] N.A. Azarenkov, Sov. Phys. Techn. Phys. 57 (1987) 1165.
[58] R. Hirota, J. Phys. Soc. Japan 19 (1964) 1130.
[59] P. Halevi, C. Guerra-Vela, Phys. Rev. B 18 (1978) 5248 *.
[60] K.N. Ostrikov, Sov. Phys. Radiophys. Quantum Electron. 34 (1991) 610.
[61] N.A. Azarenkov, K.N. Ostrikov, I.B. Denisenko, Sov. Phys. Techn. Phys. 64 (1994) 23.
[62] N.A. Azarenkov, K.N. Ostrikov, O.A. Osmayev, Appl. Phys. A Materials Science and Processing 61 (1995) 435.
[63] V.O. Girka, I.O. Girka, A.V. Zolotukhin, Mater. Int. Conf. Physics in Ukraine, Kiev, Ukraine, 1993, vol. Plasma Physics, p. 104.
[64] I.O. Girka, A.V. Zolotukhin, Ukraine Phys. J. 39 (1994) 682.
[65] I.O. Girka, A.V. Zolotukhin, Sov. J. Commun. Technol. Electron. 39 (1994) 1961.
[66] N.A. Azarenkov, V.O. Girka, A.E. Sporov, Plasma Phys. Rep. 23 (1997) 616.
[67] N.A. Azarenkov, V.O. Girka, A.N. Kondratenko, A.E. Sporov, Plasma Phys. Contrib. Fusion 39 (1997) 375.
[68] V.O. Girka, I.O. Girka, A.N. Kondratenko, V.I. Tkachenko, Sov. J. Commun. Technol. Electron. 33 (1988) 37.
[69] V.O. Girka, I.O. Girka, Sov. Phys. Radiophys. Quantum Electron. 34 (1991) 324.
[70] V.O. Girka, I.O. Girka, Sov. J. Commun. Technol. Electron. 37 (1992) 23.
[71] V.L. Golub, A.N. Kondratenko, Sov. Phys. Radiophys. Quantum Electron. 34 (1991) 1209.
[72] A.B. Davydov, V.A. Zakharov, Sov. Solid State Phys. 17 (1975) 201 *.
[73] V.A. Zakharov, Sov. Solid State Phys. 18 (1976) 1164.
[74] N.N. Beletsky, V.M. Yakovenko, Sov. Phys. Semiconductors 14 (1980) 1627.
[75] P. Halevi, Phys. Rev. B 23 (1981) 2635.
[76] V.A. Zakharov, Sov. Phys. Semiconductors 16 (1982) 712.
[77] N.A. Azarenkov, A.N. Kondratenko, G.I. Zaginailov, Sov. Phys. Technol. Phys. 55 (1985) 635; Ukraine Phys. J. 30 (1985) 231.
[78] N.A. Azarenkov, V.V. Kostenko, Sov. Phys. Techn. Phys. 60 (1990) 159.
[79] N.N. Beletsky, E.A. Gasan, V.M. Yakovenko, Preprint, Inst. of Radiophys. and Electronics, Kharkov, 1986.
[80] V.A. Zakharov, G.L. Strapenin, Sov. Phys. Semiconductors 18 (1984) 802.
[81] N.N. Beletsky, V.M. Yakovenko, Sov. Phys. Semiconductors 19 (1985) 486.
[82] N.A. Azarenkov, A.N. Kondratenko, V.V. Kostenko, Sov. Phys. Techn. Phys. 57 (1987) 591.
[83] N.A. Azarenkov, V.V. Kostenko, O.I. Dementiy, Sov. J. Commun. Technol. Electron. 33 (1988) 546.
[84] V.S. Popov, S.V. Troitsky, I.P. Yakimenko, Sov. Phys. Radiophys. Quantum Electron. 22 (1979) 46 *.
[85] N.A. Azarenkov, A.N. Kondratenko, V.V. Kostenko, Sov. Phys. Techn. Phys. 56 (1986) 391.
[86] N.A. Azarenkov, V.V. Kostenko, Sov. J. Commun. Technol. Electron. 33 (1988) 1027.
[87] N.A. Azarenkov, V.V. Kostenko, Ukraine Phys. J. 31 (1986) 547.
[88] M.I. Bakunov, Sov. Phys. Radiophys. Quantum Electron. 31 (1988) 25.
[89] M.I. Bakunov, Yu.M. Sorokin, Sov. Phys. Radiophys. Quantum Electron. 30 (1987) 1402.
[90] N.A. Azarenkov, K.N. Ostrikov, I.B. Denisenko, J. Plasma Phys. 50 (1993) 369.
[91] N.A. Azarenkov, K.N. Ostrikov, I.B. Denisenko, J. Phys. D 28 (1995) 2465 *.
[92] N.A. Azarenkov, A.A. Bizukov, 1996 IEEE ICOPS, Boston, USA, Conf. Record-Abstracts, p. 143.
[93] V.O. Girka, I.O. Girka, A.N. Kondratenko, I.V. Pavlenko, Contrib. Plasma Phys. 36 (1996) 679.
[94] V.O. Girka, I.O. Girka, I.V. Pavlenko, J. Plasma Phys. 57 (1997) 277.
[95] N.A. Azarenkov, V.K. Galaydych, Bull. Amer. Phys. Soc. 39 (1994) 1765.
[96] N.A. Azarenkov, A.N. Kondratenko, K.N. Ostrikov, Sov. Phys. Techn. Phys. 69 (1990) 31.
[97] N.A. Azarenkov, K.N. Ostrikov, M.V. Dolgopolov, Plasma Phys. Contrib. Fusion 37 (1995) 513.
[98] N.A. Azarenkov, A.N. Kondratenko, K.N. Ostrikov, Ukraine Phys. J. 35 (1990) 1662.
[99] N.A. Azarenkov, K.N. Ostrikov, Sov. J. Commun. Technol. Electron. 35 (1990) 325.
[100] A.N. Kondratenko, Plasma Waveguides, Atomizdat, Moscow, 1976 *.
[101] G.I. Zaginailov, V.M. Kuklin, Plasma Phys. Rep. 16 (1990) 324.
[102] N.A. Azarenkov, K.N. Ostrikov, Plasma Phys. Rep. 16 (1990) 226.
[103] A.R. Barakate, V.V. Dolgopolov, N.M. El-Siragy, Plasma Phys. 17 (1975) 89.
[104] V.M. Kuklin, I.P. Panchenko, V.M. Chernousenko, Rep. Ukraine Acad. Sci. Ser. A (1988) 52.
[105] V.B. Taranov, K.P. Shamrai, Plasma Phys. Contrib. Fusion 27 (1985) 925.
[106] N.A. Azarenkov, A.N. Kondratenko, K.N. Ostrikov, Sov. Radiophys. Quantum Electron. 33 (1990) 642.
[107] L.M. Gorbunov, Sov. Phys. Techn. Phys. 47 (1977) 36.
[108] N.A. Azarenkov, K.N. Ostrikov, I.B. Denisenko, Plasma Phys. Rep. 22 (1996) 226.
[109] V.S. Ambrazyavichene, R.S. Brazis, A.A. Kunigelis, Nonlinear Surface Polaritons of a Magnetoplasma type in the Semiconductors, Inst. Phys. Semicond., Vilnus, 1987.
[110] N.N. Beletsky, A.A. Bulgakov, S.I. Khankina, V.M. Yakovenko, Plasma Instabilities and Nonlinear Phenomena in Semiconductors, Naukova Dumka, Kiev, 1984.
[111] N.A. Azarenkov, K.N. Ostrikov, Sov. Techn. Phys. Lett. 17 (1991) 62.
[112] N.A. Azarenkov, V.V. Kostenko, Sov. J. Commun. Technol. Electron. 31 (1986) 831.
[113] N.A. Azarenkov, K.N. Ostrikov, Sov. Phys. Techn. Phys. 58 (1988) 2393; Ukraine Phys. J. 34 (1989) 213.
[114] N.A. Azarenkov, A.N. Kondratenko, V.V. Kostenko, Sov. Techn. Phys. Lett. 14 (1988) 564.
[115] A.N. Kondratenko, V.M. Kuklin, Principles of Plasma Electronics, Energoatomizdat, Moscow, 1985.
[116] N.A. Azarenkov, G.I. Zaginailov, Sov. J. Commun. Technol. Electron. 31 (1986) 1889.
[117] V.O. Girka, I.O. Girka, V.P. Olefir, V.I. Tkachenko, Sov. Techn. Phys. Lett. 17 (1991) 35.
[118] V.O. Girka, I.O. Girka, V.I. Tkachenko, Sov. Phys. Techn. Phys. 75 (1996) 114.
[119] N.A. Azarenkov, K.N. Ostrikov, Sov. Phys. Techn. Phys. 61 (1991) 66.
[120] N.A. Azarenkov, K.N. Ostrikov, I.B. Denisenko, Surface Rev. Lett. 2 (1995) 579.
[121] N. Bloembergen, Nonlinear Optics, Benjamin, New York, 1965.
[122] S.A. Akhmanov, R.V. Khokhlov, Problems of Nonlinear Optics, Gordon and Breach, New York, 1972.
[123] N.A. Azarenkov, A.N. Kondratenko, K.N. Ostrikov, Sov. Phys. Radiophys. Quantum Electron. 34 (1991) 672.
[124] J. Weiland, H. Wilhelmsson, Coherent Nonlinear Interaction of Waves in Plasmas, Pergamon, New York, 1977.
[125] A. Yariv, Optical Electronics in Modern Communication, Oxford Univ. Press, Oxford, 1997.
[126] N.A. Azarenkov, A.N. Kondratenko, K.N. Ostrikov, Ukraine Phys. J. 35 (1990) 1715.
[127] E.I. Raschba, V.B. Timofeev, Sov. Phys. Semiconductors 20 (1986) 977.
[128] Wu Ji Wei et al., Phys. Rev. B 33 (1986) 7091.
[129] N.N. Beletsky et al., Sov. Techn. Phys. Lett. 45 (1987) 589.
[130] N.A. Azarenkov, K.N. Ostrikov, I.B. Denisenko, Contrib. Papers. Intern. Conf. Physics in Ukraine, Kiev, 22–27 June 1993, vol. Plasma Physics, pp. 26–29.
[131] N.A. Azarenkov, K.N. Ostrikov, A.V. Kuzmenko, Phys. Scr. 52 (1995) 668.
[132] N.A. Azarenkov, K.N. Ostrikov, Sov. Phys. Semiconductors 25 (1991) 1344.
[133] N.A. Azarenkov, K.N. Ostrikov, Sov. J. Commun. Techn. Electron. 39 (1994) 688.
[134] T. Sahu, S.K. Nayak, Proc. 33 Solid State Phys. Symp. Bombay, 1991, Vol. 33 (C), p. 407.
[135] D. Grosev, A. Shivarova, J. Plasma Phys. 31 (1984) 177.
[136] V.I. Karpman, Nonlinear Waves in Dispersive Media, Pergamon Press, Oxford, 1975.
[137] D. Grosev, A. Shivarova, A.D. Boardman, J. Plasma Phys. 38 (1987) 427 *.
[138] N.G. Vakhitov, A.A. Kolokolov, Sov. Phys. Radiophys. Quantum Electron. 16 (1973) 1020.
[139] V.E. Zakharov, E.A. Kuznetsov, A.M. Rubenchik, Stability of solitons, Preprint No. 18, Inst. of Automatics and Electrometry, Novosibirsk, 1982.
[140] N.A. Azarenkov, K.N. Ostrikov, Sov. J. Commun. Technol. Electron. 36 (1991) 1161.
[141] Yu.R. Alanakyan, Sov. Phys. Techn. Phys. 35 (1965) 1552.
[142] N.A. Azarenkov, K.N. Ostrikov, I.B. Denisenko, Phys. Scr. 49 (1994) 502.
[143] K.N. Ostrikov, O.A. Osmayev, Z. Physik B 95 (1994) 453.
[144] N.A. Azarenkov, K.N. Ostrikov, Sov. Phys. Radiophys. Quantum Electron. 34 (1991) 724.
[145] S.I. Khankina, V.M. Yakovenko, Sov. Phys. Radiophys. Quantum Electron. 32 (1989) 389.
[146] E.O. Kane, J. Phys. Chem. Solids 1 (1957) 249.
[147] I.M. Tsydilkowsky, Electrons and Holes in Semiconductors, Naukova Dumka, Kiev, 1972.
[148] P.A. Wolff, G.A. Pearson, Phys. Rev. Lett. 17 (1966) 1015.
[149] F.G. Bass, V.A. Pogrebnyak, Sov. Solid State Phys. 14 (1972) 1766.
[150] N.A. Azarenkov, A.N. Kondratenko, V.V. Kostenko, Interaction and Self-Action of Waves in Nonlinear Media, Part 1, Donish, Dushanbe, 1988, p. 214.
[151] N.A. Azarenkov, K.N. Ostrikov, Contrib. Plasma Phys. 34 (1994) 593.
[152] F.G. Bass, Yu.G. Gurevich, Hot Electrons and Strong Electromagnetic Waves in Semiconductor and Gas Discharge Plasma, Nauka, Moscow, 1975.
[153] N.A. Azarenkov, K.N. Ostrikov, Contrib. Papers 23rd EPS Conf. Contrib. Fusion and Plasma Physics, Kiev, Ukraine, 1996, Part 3, p. 1195.
[154] M. Georgieva, A. Shivarova, Phys. Scr. 50 (1994) 523; J. Plasma Phys. 52 (1994) 391.
[155] Yu.M. Aliev, A.G. Boev, A. Shivarova, J. Phys. D 17 (1984) 2233.
[156] Yu.M. Aliev, K. Ivanova, M. Moisan, A. Shivarova, Plasma Sources Sci. Technol. 2 (1993) 145 *.
[157] A.V. Gurevich, A.B. Shvartsburg, Nonlinear Theory of Radiowave Propagation in Ionosphere, Nauka, Moscow, 1973.
[158] V.E. Golant, A.P. Zhilinsky, I.E. Sakharov, Fundamentals of Plasma Physics, Wiley, New York, 1980 *.
[159] L.M. Biberman, V.S. Vorob’ev, I.T. Yakubov, Kinetics of Non-Equilibrium Low-Temperature Plasmas, Consultants Bureau, New York, 1987.
[160] J. Margot, M. Moisan, M. Fortin, J. Vac. Sci. Technol. A 13 (1995) 2890 *.
[161] K.N. Ostrikov, O.A. Osmayev, Contrib. Plasma Phys. 34 (1994) 661.
[162] T. Lindgren, J. Larsson, L. Stenflo, Plasma Phys. 24 (1982) 1177.
[163] D. Ter-Haar, V.N. Tsytovich, Rev. Mod. Phys. 13 (1981) 177.
[164] N.A. Azarenkov, A.N. Kondratenko, K.N. Ostrikov, Sov. Phys. Techn. Phys. 69 (1990) 143.
[165] K.N. Ostrikov, O.A. Osmayev, Surface Sci. 310 (1994) 413.
[166] N.A. Azarenkov, K.N. Ostrikov, Plasma Phys. Rep. 18 (1992) 650.
[167] K.N. Ostrikov, Contrib. Plasma Phys. 35 (1995) 481.
[168] K.N. Ostrikov, IEEE Int. Conf. on Plasma Science, 5–8 June 1995, Madison, USA. Conference Addendum – Post Deadline Abstracts, p. 46.
[169] Ya.L. Alpert, A.V. Gurevich, L.P. Pitaewsky, Space Physics with Artificial Satellites, Consultants Bureau, New York, 1965.
[170] K.N. Ostrikov, O.A. Osmayev, Surface Sci. 344 (1995) 283.
[171] E.A. Fedutenko, V.I. Lapshin, Ya.F. Leleko, Phys. Scr. 50 (1994) 310.
[172] E.A. Fedutenko, V.I. Lapshin, K.N. Stepanov, Plasma Phys. Rep. 19 (1993) 177.
[173] A.G. Boev, Ukraine Acad. Sci. Rep., Ser. A (1982) 57.
[174] N.A. Azarenkov, K.N. Ostrikov, Sov. Phys. Techn. Phys. Lett. 15 (1989) 11.
[175] N.A. Azarenkov, K.N. Ostrikov, O.A. Osmayev, Solid State Commun. 85 (1993) 1063.
[176] K.N. Ostrikov, A.V. Smolin, Phys. Lett. A 177 (1993) 327.
[177] K.T. Sawaya, T. Ishisone, Y. Mushiake, Radio Sci. 13 (1978) 21.
[178] V.O. Girka, Physica Scripta, in press.
[179] D.J. Cooperberg, Phys. Plasmas 5 (1998) 853, 862.
[180] D.J. Cooperberg, C.K. Birdsall, Plasma Sources Sci. Technol. 7 (1998) 41.
[181] D. Korzec, F. Werner, R. Winter, J. Engemann, Plasma Sources Sci. Technol. 5 (1996) 216.
[182] X.L. Zhang, F.M. Dias, C.M. Ferreira, Plasma Sources Sci. Technol. 6 (1997) 29, 110.
[183] N.A. Azarenkov, I.B. Denisenko, K.N. Ostrikov, J. Plasma Phys. 59 (1998) 15.
[184] H. Sugai, I. Ghanashev, M. Nagatsu, Plasma Sources Sci. Technol. 7 (1998) 192.
[185] I. Peres, J. Margot, M. Chaker, Plasma Sources Sci. Technol. 5 (1996) 653.
[186] L. St-Onge, J. Margot, M. Chaker, Plasma Sources Sci. Technol. 7 (1998) 154.
[187] F.G. Elmzughi, N.G. Constantinou, D.R. Tilley, Phys. Rev. B 51 (1995) 11515.
[188] T. Dumelow, R.E. Camley, K. Abraha, D.R. Tilley, Phys. Rev. B 58 (1998) 897.
[189] R.E. Camley, Surf. Sci. Rep. 7 (1987) 103.
[190] K.N. Ostrikov, M.Y. Yu, N.A. Azarenkov, J. Appl. Phys. 84 (1998) to appear.
[191] K.N. Ostrikov, M.Y. Yu, N.A. Azarenkov, Phys. Plasmas 5 (1998) to appear.
[192] K.N. Ostrikov, M.Y. Yu, N.A. Azarenkov, A.D. Boardman, 23rd Intern. Conf. Infrared and Millimeter Waves, Colchester, Essex, UK, September 1998, Confer. Proceedings, p. 245.
[193] K.N. Ostrikov, M.Y. Yu, N.A. Azarenkov, Phys. Rev. E 58 (1998) to appear.
[194] K.N. Ostrikov, S.V. Vladimirov, M.Y. Yu, Journ. Geophys. Res. Space Phys. (1998) in press.
Note added in proof
Since this report was completed, new results related to the subject have appeared. The papers [179] deal with numerical simulations of electron surface waves in metal-bounded plasma slabs with uniform and nonuniform plasma densities. Results of numerical simulations of surface-wave-sustained gas discharges in planar plasma-metal slabs are presented in [180]. Theoretical and experimental results related to RF and microwave plasmas in the presence of metal surfaces are discussed in [181–184]. The problems of energy balance and
diffusion processes in magnetically confined surface-wave discharges have been investigated in [185,186]. The theory of collective excitations in the Voigt geometry has been further developed in [187,188]. The review on nonreciprocity [189] is very useful for understanding the general properties of wave perturbations of different nature in the Voigt geometry. The results of further studies on the ionization and striction nonlinearities briefly discussed in Section 3 are given in [190,191]. The nonlinear theory of ASW has been developed further in [192]. The effect of dust particles on the SW in plasma-metal structures is one of the most recent topics [193]. The propagation of cross-field Alfvén-like surface waves in dusty space plasmas (an example of the realization of the Voigt geometry in space) is studied in [194].