Physics Reports 352 (2001) 1–214
The pdf approach to turbulent polydispersed two-phase flows
Jean-Pierre Minier a,∗, Eric Peirano b
a Electricité de France, Div. R&D, MFTT, 6 Quai Watier, 78400 Chatou, France
b Energy Conversion Department, Chalmers University of Technology, S-41296 Göteborg, Sweden
Received December 2000; editor: I. Procaccia
Contents
1. Introduction
1.1. Two-phase flow regimes
1.2. An industrial example of dispersed two-phase flows
1.3. Mathematical and physical approach
1.4. Description of the contents
2. Mathematical background on stochastic processes
2.1. Random variables
2.2. Stochastic processes
2.3. Markov processes
2.4. Key Markov processes
2.5. General Chapman–Kolmogorov equations
2.6. Stochastic differential equations and diffusion processes
2.7. Stochastic calculus
2.8. Langevin and Fokker–Planck equations
2.9. The probabilistic interpretation of PDEs
2.10. A word on numerical schemes
3. Hierarchy of pdf descriptions
3.1. Complete and reduced pdf equations
3.2. BBGKY hierarchy
3.3. Hierarchy between state vectors
4. Stochastic diffusion processes for modelling purposes
4.1. The shift from an ODE to a SDE
4.2. Modelling principles
4.3. Example for typical stochastic models
5. The physics of turbulence
5.1. The turbulence problem
5.2. Characteristic scales
5.3. Kolmogorov theory
5.4. Difficulties and refinements
5.5. Experimental and numerical results
5.6. Simplified images of turbulence and Lagrangian models
5.7. Closing remarks
6. One-point pdf models in single-phase turbulence
6.1. Motivation and basic ideas
6.2. Coarse-grained description and stochastic modelling
6.3. Relations to classical approaches
6.4. Probabilistic description of continuous fields
6.5. Choice of the pdf description
6.6. Present one-point models
6.7. Mean field equations
6.8. Physical and information contents
6.9. Numerical examples
7. One-point particle pdf models in two-phase flows
7.1. Fundamental equations and modelling approaches
7.2. Interest of the pdf approach
7.3. Choice of the pdf description
7.4. Present models
7.5. Properties of present class of models
7.6. Numerical examples and typical simulations
8. Two-point fluid–particle pdf models in dispersed two-phase flows
8.1. Motivations and basic ideas
8.2. Probabilistic description of dispersed two-phase flows
8.3. Choice of the pdf description
8.4. Present ‘two-point’ models
8.5. Mean field equations
8.6. Concluding remarks
9. Summary and propositions for new developments
9.1. Difficulties with conventional approaches and interest of a pdf description
9.2. Assessment of current modelling state
9.3. Open issues and suggestions
References

∗ Corresponding author. Tel.: +33-1-30-87-71-40; fax: +33-1-30-87-79-16. E-mail addresses: [email protected] (J.-P. Minier), [email protected] (E. Peirano).
0370-1573/01/$ - see front matter © 2001 Elsevier Science B.V. All rights reserved. PII: S0370-1573(01)00011-4
Abstract

The purpose of this paper is to develop a probabilistic approach to turbulent polydispersed two-phase flows. The two-phase flows considered are composed of a continuous phase, which is a turbulent fluid, and a dispersed phase, which represents an ensemble of discrete particles (solid particles, droplets or bubbles). Combining the difficulties of turbulent flows and of particle motion, the challenge is to work out a general modelling approach that meets three requirements: to treat accurately the physically relevant phenomena, to provide enough information to address issues of complex physics (combustion, polydispersed particle flows, ...) and to remain tractable for general non-homogeneous flows. The present probabilistic approach models the statistical dynamics of the system and consists in simulating the joint probability density function (pdf) of a number of fluid and discrete particle properties. A new point is that both the fluid and the particles are included in the pdf description. The derivation of the joint pdf model for the fluid and for the discrete particles is worked out in several steps. The mathematical properties of stochastic processes are first recalled. The various hierarchies of pdf descriptions are detailed and the physical principles that are used in the construction of the models are explained. The Lagrangian one-particle probabilistic description is developed first for the fluid alone, then for the discrete particles and finally for the joint fluid and particle turbulent systems. In the case of the probabilistic description for the fluid alone or for the discrete particles alone, numerical computations are presented and discussed to illustrate how the method works in practice and the kind of information that can be extracted from it. Comments on the current modelling state and propositions for future investigations which try to link the present work with other ideas in physics are made at the end of the paper. © 2001 Elsevier Science B.V. All rights reserved.
PACS: 47.27.Eq; 47.55.Kf; 02.40.+j; 02.50.Ey
Keywords: Turbulence; Two-phase flows; Probability density function; Stochastic process
1. Introduction

In March 1999, Reviews of Modern Physics issued a special volume, for the commemoration of its 100th birthday, which discussed historical developments and gave a general outlook on a wide range of physical questions. Numerous articles, written by world experts and often major contributors to their fields, provided an overview of past achievements and of the current state of each domain. Apart from the fields which have traditionally formed the main core of theoretical physics (quantum theory, particle physics, relativity, astrophysics, etc.), the selection of other subjects (such as fluid turbulence, granular matter, soft matter, biological physics) which also found their place in this prestigious assembly is an indication of the interest in the issues raised by these subjects. This is also an indication that improved physical understanding is needed to bring these subjects to a more mature state. In the section devoted to statistical physics, two reviews, written by Sreenivasan and De Gennes, respectively, discussed separately the present understanding of fluid turbulence [1] and of granular matter [2] (that is, the behaviour of non-Brownian small solid particles). Broadly speaking, both are subjects where the basic equations (for example, the Navier–Stokes equations) or the elementary behaviour (for instance, how two grains interact) may be believed to be known, but where the issue is to understand the complicated and collective behaviour of a large number of interacting degrees of freedom. Both represent problems at a human-size level. They are actually everyday-life concerns and could, at first, have been thought to be mere engineering problems. They are indeed engineering problems, but even if only approximate results or rough estimates are sought, this often requires a clear and precise physical understanding of the important phenomena at play.
There is another interesting domain which is simply obtained when the two difficulties are mixed: the case of turbulent dispersed two-phase flows. An easy way to picture this is to imagine dealing with granular matter but embedded in a turbulent flow. These flows are of crucial importance in a large variety of industrial problems. Yet, they have not received the same attention as turbulence or as granular matter. As a consequence, physical understanding remains limited and appears to be scarce compared to each of the separate sub-cases, fluid turbulence in the absence of particles, and granular matter in the absence of any underlying or interstitial fluid. The purpose of the present work is to discuss some of the physical issues involved in two-phase flows and to put forward a probabilistic formalism that can bridge the gap between physical understanding of basic phenomena and practical simulations. That middle-road approach is that of a modeller, where one invents a model, which has simplified rules compared to the real phenomena, and which is used to simulate the overall and collective behaviour of a complex system. The question is therefore whether the model contains the right ‘physics’ (thus the need to understand clearly the important phenomena) and then how to reach an acceptable compromise between the simplicity of the model and its physical realism (thus the need for an appropriate formalism).

Before going into the details of the approach followed in this work, a clear definition of two-phase flows and particularly of dispersed two-phase flows must be given. Secondly, a better idea of their importance in natural and industrial situations as well as an outline of the modelling issues involved must be provided. Introducing these notions is perhaps best achieved through typical examples. Dispersed two-phase flows occur in many natural phenomena. They are met for example in fogs, in water sprays, in smokes, when desert sand or sediments are carried away, or (to provide a more vivid image) when an erupting volcano billows out smoke and various particles. They are increasingly important in environmental problems when one or several species (not necessarily pollutants) are dispersed in a turbulent atmosphere. Nevertheless, in the following two sections and to complement these first examples, we introduce dispersed two-phase flows and discuss the modelling questions through industrial examples.

1.1. Two-phase flow regimes

As their name indicates, two-phase flows are encountered when two immiscible phases coexist. Depending on the form of the interface between the two media, different regimes can be found. This is illustrated in Fig. 1 which shows a range of regimes for the case of a boiling liquid (for example water) in a classical heat exchanger.

Fig. 1. Different two-phase flow regimes in a heat exchanger.

At the bottom of the tube, the liquid has not yet started to boil and we have a single-phase turbulent flow. When nucleation starts at the walls, bubbles can be found as separate inclusions within the liquid (bubbly flows). Then, as more vapour is created we go through the so-called slug and plug regimes where vapour occupies a larger volumetric fraction. Then, as the liquid continues to boil, we find the annular regime with a thin liquid layer at the walls and a central vapour flow with small droplets carried by the vapour. Other regimes can also be found when horizontal channels are considered, but their detailed description is outside the scope of the present article. The wide variety of regimes, merely outlined above, is typical of immiscible liquid–gas or liquid–liquid flows since the interface can be deformed. Two of these regimes (the bubbly and annular regimes) are characterized by the presence of one phase, either liquid or vapour, as separate inclusions embedded in the other phase.
These are two examples of what is defined as dispersed turbulent two-phase flows, where one phase (called the continuous phase) is a continuum and the other phase (called the dispersed phase) appears as separate inclusions dispersed within the continuous one, assumed here to be a turbulent fluid. When the dispersed phase is characterized by a distribution in size, one speaks of polydispersed turbulent two-phase flows. The dispersed regime (either mono or polydispersed) is of primary importance in most cases. It is always found when the dispersed phase is made up of solid particles (solid particles in a gas or a liquid turbulent flow). It is often found for a liquid dispersed as separate droplets in a gas flow (sprays for example) or for two immiscible liquids where one liquid is dispersed in the other liquid. Indeed, the dispersion of one phase within another one increases considerably the surface of the separating interface and thus allows better mass and energy transfer between the two phases. These higher transfer rates explain why the dispersed regime is preferable. In the present work, we limit ourselves exclusively to the dispersed regime and we will talk of a fluid (the continuous phase) and of discrete particles (which can represent either solid particles, bubbles or droplets). In most of the problems encountered, the dispersed particles have a distribution in size. Since the polydispersed case obviously contains the monodispersed situation as a simple sub-case, we will consider the realistic problem of polydispersed two-phase flows.

1.2. An industrial example of dispersed two-phase flows

The limitation to the regime of dispersed two-phase flows is of course a simplification with respect to general two-phase flows. Yet, the range of problems remains large, and each of these problems is difficult. What are the main problems and what are the key issues? To provide some answers to that question, it is perhaps better to describe a relevant industrial example, circulating fluidized bed (CFB) boilers. This is an industrial process for thermal energy generation. A sketch of a typical unit is displayed in Fig. 2.

Fig. 2. A circulating fluidized bed combustor.

In a confined domain (the combustion chamber), solids (inert sand and solid fuel or coal particles with a size distribution ranging from 100 μm to 1 mm and an average density of order of magnitude 1000 kg m⁻³) are transported vertically by a gas (injected at the bottom) through the combustion chamber. The solids are captured at the exit by a separator (usually a cyclone), and reintroduced near the bottom of the combustion chamber, whereas the gas leaves the cyclone through an outlet duct. The solid particles are therefore
recycled to the combustion chamber and some particles may thus perform several loops in the process. This solid circulation is the key factor of the whole process (hence its name). It ensures an approximately homogeneous temperature within the combustion chamber which can thus be chosen as an optimum between the efficiency of the process and the formation of noxious pollutants (the resulting low level of emission is one of the strong points of CFB units). It represents also a central aspect that must be mastered if one is to expect a satisfactory performance of the overall process. The flow of the gas–solid mixture is non-stationary and non-homogeneous with a vertical distribution of the particle concentration inside the combustion chamber. This vertical distribution is characterized by three interacting zones (as a function of increasing height): (1) a bottom bed which has the characteristics of a bubbling bed (gas flows through the bed in the form of large structures) and where the concentration of discrete particles is so high that particle–particle interaction is a dominant mechanism (particles collide and possibly slide against each other), (2) a splash zone with high clustering and back-mixing activity and (3) a transport zone which exhibits a core/wall-layer structure (particles are entrained upwards in the core and fall down along the walls in the form of a thin boundary layer). In these regions large scale spatial inhomogeneities in the discrete particle concentration field can be observed and for the gas large scale instabilities (pseudo-like turbulence) are present. In regions (2) and (3), particle loading (the local instantaneous ratio between the weight of particles and the weight of gas) is high enough so that turbulence is modulated and possibly modified by the presence of the particles. In addition particle segregation can be observed, that is to say the mean particle diameter decreases with height and large particles tend to migrate to the boundary layers.
At the exit of the chamber, the particle-laden flow enters the cyclone(s). Cyclones are used here as separators (separating the solid coal particles from the gas in order to recycle them) and are key elements of the whole process. Indeed, should they fail to ensure a proper separation and consequently a proper particle recirculation, the whole process would not be able to run correctly. Cyclone performances are quantified by their efficiency curve which is the fraction of solid particles being collected (and thus recycled) as a function of the particle diameters. It is important to be aware that cyclone efficiencies are due to the complicated swirling motions and gas flow patterns within the cyclone and not to external forces such as gravity. In other words, both within the combustion chamber and within the cyclone separators, satisfactory performances of a CFB process are ensured by the local hydrodynamics of the two-phase flows rather than by external monitoring. In particular, a key parameter for a good functioning of a CFB boiler is the particle size distribution, to ensure, for example, suitable particle spatial distribution and residence time in the combustion chamber. It is mainly controlled, for small diameters, by the collection efficiency of the cyclone, and for large diameters, by the characteristics of the fuel particles. This is, after all, an engineering problem. Numerous industrial or engineering problems may also involve difficult questions. In some situations, complex questions or issues may become less relevant or secondary if engineers apply a high-enough ‘margin coefficient’. This easy way around theoretical issues may lead researchers to believe that only clever or astute fixing or tinkering is needed. However, from the above outline of the CFB process, it is clear that this is not the case here, since the overall performance is a result of the local hydrodynamics throughout the process.
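The efficiency curve mentioned above, the fraction of particles collected as a function of particle diameter, can be made concrete with a small counting example. The following is only a minimal sketch under stated assumptions: the function name `efficiency_curve`, the use of half-open diameter bins and the input values are illustrative and are not taken from the paper or from any measurement.

```python
from bisect import bisect_right


def efficiency_curve(diameters, collected, bin_edges):
    """Fraction of particles collected in each diameter bin.

    diameters : particle diameters at the cyclone inlet (any consistent unit)
    collected : booleans, True if the corresponding particle was captured
    bin_edges : increasing edges; bin k covers [bin_edges[k], bin_edges[k+1])

    Returns one value per bin, or None for a bin that received no particle.
    (Hypothetical helper written for illustration only.)
    """
    n_bins = len(bin_edges) - 1
    total = [0] * n_bins
    caught = [0] * n_bins
    for d, hit in zip(diameters, collected):
        if d < bin_edges[0] or d >= bin_edges[-1]:
            continue  # particle falls outside the binned diameter range
        k = bisect_right(bin_edges, d) - 1  # index of the bin containing d
        total[k] += 1
        if hit:
            caught[k] += 1
    return [c / t if t else None for c, t in zip(caught, total)]


# Illustrative (made-up) sample: six particles, three diameter bins.
sample_diameters = [10, 20, 110, 120, 130, 250]
sample_collected = [False, True, True, True, False, True]
curve = efficiency_curve(sample_diameters, sample_collected, [0, 100, 200, 300])
print(curve)
```

In this picture the curve is purely an outcome of which particles the swirling gas flow manages to separate; nothing in the counting itself encodes the hydrodynamics, which is why a model of the local two-phase flow is needed to predict it.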
In summary, the physics of turbulent gas–solid flows (in the case of circulating fluidized bed boilers) contains a large spectrum of problems. Three main themes emerge, namely:
(A) polydispersed two-phase flows (there is a range of particle diameters rather than a single value),
(B) combustion, either within reactive gas flows or of the solid particles (as in the case of fuel particles within the CFB boiler),
(C) turbulence, which is the central and common issue.
Of course, these three main categories overlap and a wide range of sub-categories or classes of problems can be enumerated. For example, this concerns:
• reactive flows (heterogeneous and homogeneous combustion),
• particle dispersion (and diffusion of combustion gases),
• turbulence modulation, possibly modification of its nature, by the presence and motion of solid particles embedded in the turbulent flow,
• particle–particle interactions (short- and long-duration collisions),
• swirling two-phase flows (selectivity curve of the cyclone) and
• particle segregation, etc.
Other industrial and practical needs involving two-phase flows, such as pollutant dispersion in the atmosphere, combustion of fuel droplets within car engines, etc. would reveal the same picture and the same categories with an emphasis on one of these categories depending on the application. From the previous analysis, it appears that one has to build the link between fluid mechanics, classical mechanics, tribology, combustion and chemistry. The question to be answered is: how can we achieve this goal with a tractable formalism which has to be, in addition, suitable for numerical applications?

1.3. Mathematical and physical approach

1.3.1. The present objectives

For the two-phase flows we consider in the present work, the central subject is turbulence. Turbulence of continuous-phase flows is further compounded by the effects of the discrete particles. Direct numerical simulations are possible in theory but are quite impossible in practice, at least for the typical examples described above. Most turbulent dispersed two-phase flows involve far too many degrees of freedom to be directly simulated. The issue is therefore to reduce the number of degrees of freedom to a tractable number and to come up with a contracted description. We are thus faced with a problem of non-equilibrium statistical physics where one tries to obtain a statistical model for a reduced number of degrees of freedom. Given the inherent complexity of the problems we have to deal with, the first choice is to limit ourselves to mean or average quantities. This is classical in most problems of statistical mechanics. In other words, we treat the solutions of the fundamental equations as random variables and we are interested in some statistics. Compared to the high complexity and to the beauty of the initial problem, these may look like limited and perhaps unchallenging objectives. However, it must be remembered that we are not dealing with only one problem, either single-phase turbulence, combustion or
J.-P. Minier, E. Peirano / Physics Reports 352 (2001) 1–214
two-phase flows, but typically with the three of them. Consequently, the aim is to develop a mathematical and physical approach that meets the following requirements:
1. the important physical phenomena, such as convection or mean pressure-gradient, are treated without approximation,
2. enough information is available to handle correctly issues of complex physics (combustion, polydispersed particles),
3. the resulting numerical model is tractable for non-homogeneous flows,
4. the model can be coupled to other approaches, either more fundamental or applied descriptions.
It is clear from the third constraint that a compromise between detailed descriptions and practical applicability must be reached. The approach will necessarily be less fundamental than a number of theoretical approaches in turbulence, for example [3–5]. This does not mean that the present choice contradicts more theoretically oriented works, but rather that the underlying objectives are somewhat different. Actually, a satisfactory model respecting the first three requirements should easily benefit from fundamental progress made in one of the three main themes (A), (B) or (C) listed above, which are of concern here. This is one of the reasons for the fourth item, which also suggests that the approach can be used in relation to coarser descriptions in a multi-scale or multi-level simulation. The main challenge comes from the second constraint. It implies that we are not looking for a model or an approach which performs very well for only one theme but for an approach that can handle complex physics. For example, we are not looking for an approach which is perhaps the best candidate at the moment to simulate, say, isothermal incompressible single-phase turbulent flows but which requires new formalisms or new models for combustion.
We are looking for an approach that can do a fine job for single-phase turbulence and still be easily extended to handle combustion and dispersed two-phase flow issues within the same framework. This ‘engineering’ constraint has far-reaching consequences in terms of modelling choices and justifies advanced methods.

1.3.2. Choice of the modelling approach

Since we are mainly interested in some local mean statistics on a number of fluid and discrete particle properties and since we have emphasized the practical side of the problem, it would seem that the path of least dissipation (for the modeller) consists in trying to derive directly a set of closed partial differential equations (PDEs) for those mean variables. We refer to this approach as the moment approach or the conventional approach. It is indeed in line with the classical or conventional approaches in continuum mechanics where one handles fields which are solutions of some PDEs. For example, if we are interested in the mean fluid velocity, we start from the Navier–Stokes equations (we consider here an incompressible flow for the sake of simplicity)
\[ \frac{\partial U_i}{\partial t} + U_j \frac{\partial U_i}{\partial x_j} = -\frac{1}{\rho}\frac{\partial P}{\partial x_i} + \nu \frac{\partial^2 U_i}{\partial x_j^2} . \tag{1} \]
This equation contains all the information for the fluid velocity. Then, following the classical approach we apply to this equation an averaging operator (the nature of this averaging operator, be it the Reynolds average or a spatial filter, does not change the present point so we use
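here the Reynolds decomposition while comments on spatial filtering will be developed in Section 9), written $\langle \cdot \rangle$. We obtain the unclosed PDE directly for the variable of interest $\langle U \rangle$
\[ \frac{\partial \langle U_i \rangle}{\partial t} + \langle U_j \rangle \frac{\partial \langle U_i \rangle}{\partial x_j} = -\frac{1}{\rho}\frac{\partial \langle P \rangle}{\partial x_i} + \nu \Delta \langle U_i \rangle - \frac{\partial \langle u_i u_j \rangle}{\partial x_j} . \tag{2} \]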
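As an illustration of where the unclosed term in Eq. (2) comes from, the following minimal sketch checks numerically that the average of a product splits into the product of the averages plus the correlation of the fluctuations, $\langle U_i U_j \rangle = \langle U_i \rangle \langle U_j \rangle + \langle u_i u_j \rangle$. The synthetic "velocity" samples, the sample size and the helper name `reynolds_split` are assumptions made only for this example; they are not taken from the paper.

```python
import random


def reynolds_split(samples_i, samples_j):
    """Return (<Ui Uj>, <Ui><Uj>, <ui uj>) estimated from two sample lists."""
    n = len(samples_i)
    mean_i = sum(samples_i) / n
    mean_j = sum(samples_j) / n
    mean_product = sum(a * b for a, b in zip(samples_i, samples_j)) / n
    # Fluctuations are deviations from the ensemble mean.
    stress = sum((a - mean_i) * (b - mean_j)
                 for a, b in zip(samples_i, samples_j)) / n
    return mean_product, mean_i * mean_j, stress


rng = random.Random(0)
# Hypothetical synthetic samples: a mean value plus correlated fluctuations.
common = [rng.gauss(0.0, 1.0) for _ in range(10000)]
u1 = [2.0 + c + 0.5 * rng.gauss(0.0, 1.0) for c in common]
u2 = [1.0 - 0.8 * c + 0.5 * rng.gauss(0.0, 1.0) for c in common]

mean_product, product_of_means, stress = reynolds_split(u1, u2)
print(f"<U1 U2>            = {mean_product:.4f}")
print(f"<U1><U2> + <u1 u2> = {product_of_means + stress:.4f}")
```

The extra term `stress` cannot be computed from the two mean values alone, which is the sense in which the averaged equation is open.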
This open equation has to be closed by resorting to a constitutive relation giving the unknown quantity, here $\langle u_i u_j \rangle$ (the Reynolds stress tensor), as a function of known variables, here the mean velocity. If we use vocabulary from statistical physics, we can say that the conventional approach is a macroscopic approach where one tries to obtain the macroscopic laws through closure relations which are written directly at the macroscopic level. If an acceptable macroscopic constitutive relation can be found, then this route is certainly the most cost-effective one since we explicitly calculate only what we want and nothing more. However, the success of this macroscopic approach hinges on the possibility of expressing unclosed terms through macroscopic laws. If such macroscopic relations cannot be explicitly written or involve far too drastic assumptions to yield acceptable results, then the conventional approach fails. Two typical examples of such problems are provided by the reactive source terms which enter the equations of single-phase turbulent combustion with finite-rate chemistry and by the existence of a range of particle diameters (a polydispersed two-phase flow). These issues will be explained in detail in Section 6 for the former and in Section 7 for the latter. In each case, one has to express the average value of a complicated function of some instantaneous variables (the fluid instantaneous species mass fractions or the particle instantaneous diameters), whereas the conventional approach can only provide information on the first moments (usually the first two moments). We are faced with the problem of having to express a quantity such as $\langle S(\phi) \rangle$, where $S$ is a complicated function of some scalar $\phi$, in terms of the available information, usually limited to $\langle \phi \rangle$ or $\langle \phi^2 \rangle$. This results in an intractable problem and more information is needed to address these typical issues of complex physics.
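The difficulty can be made concrete with a small sketch. Below, two discrete pdfs of a scalar share the same mean and the same variance, yet give different mean values of a nonlinear source term, and neither value equals the source term evaluated at the mean. The Arrhenius-like function `source`, its parameter and the two pdfs are hypothetical choices made only for illustration.

```python
import math


def expectation(values, probs, func=lambda x: x):
    """Mean of func(phi) for a discrete pdf given by (values, probs)."""
    return sum(p * func(v) for v, p in zip(values, probs))


def source(phi, activation=2.0):
    # Hypothetical Arrhenius-like source term, chosen only for illustration.
    return math.exp(-activation / phi)


# Two discrete pdfs of the scalar with identical mean (1) and variance (0.25).
pdf_a = ([0.5, 1.5], [0.5, 0.5])
pdf_b = ([0.25, 1.0, 1.75], [2 / 9, 5 / 9, 2 / 9])


def summary(pdf):
    """Return (mean, variance, mean of the source term) for a discrete pdf."""
    values, probs = pdf
    mean = expectation(values, probs)
    variance = expectation(values, probs, lambda x: (x - mean) ** 2)
    mean_source = expectation(values, probs, source)
    return mean, variance, mean_source


mean_a, var_a, s_a = summary(pdf_a)
mean_b, var_b, s_b = summary(pdf_b)
print(f"pdf a: mean={mean_a:.3f} var={var_a:.3f} <S>={s_a:.5f}")
print(f"pdf b: mean={mean_b:.3f} var={var_b:.3f} <S>={s_b:.5f}")
print(f"S(<phi>) = {source(mean_a):.5f}")
```

Knowing only the first two moments therefore does not determine the mean source term; some information on the shape of the pdf is needed.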
In other words, even if we are interested mainly in estimates of macroscopic quantities, an advanced method providing more detailed information is required. That problem will be emphasized and explained in more detail (and for general averaging operators) in the course of the paper. From the above outline, it can be concluded that the macroscopic path is not well suited for our present objectives. On the other hand, we have also seen that the direct simulation is not tractable. In the language of statistical physics, this direct simulation is a microscopic description since all degrees of freedom are explicitly tracked. A reasonable solution is therefore to choose what can be referred to as a mesoscopic approach, or as a middle-road approach between the microscopic and macroscopic descriptions. The mesoscopic approach retained in this work is a probabilistic approach. Its aim is to model and to simulate the probability density functions (pdfs) of the variables which are of first interest. For this reason, the present approach can be defined as a pdf approach to turbulent dispersed two-phase flows. The different pdfs that will be manipulated are modelled pdfs, that is to say the basis of the approach will be to propose probabilistic models to describe the joint pdf of some variables. It will be seen that probabilistic models can be developed either in terms of the pdf or in terms of the trajectories of the stochastic processes involved. In the present work, we will mainly adopt this second point of view and we will be talking about stochastic particles. The stochastic models will be developed directly for the variables attached to these stochastic particles, providing at the same time
a Monte Carlo evaluation of the pdf. Thus, the approach can also be referred to as a particle stochastic approach. At the moment, multi-point approaches have not been extended to non-homogeneous wall-bounded flows whereas one-point pdf models have been put forward. From the third requirement mentioned above, it follows that one-point pdf models will often be considered. Yet, this is not a strict limitation of the present work. Two-point or multi-point pdfs will be discussed and considered at different stages. This will reflect the fourth constraint since multi-point pdfs represent finer probabilistic descriptions. Our purpose is a general pdf approach and the relation between these different descriptions is contained in the presentation.

1.3.3. Choice of a rigorous presentation

pdf models are already well established in the combustion community. A number of reviews are available which discuss the necessary formalism and the present state of modelling [6–9]. In particular, Pope’s work [6] has clarified the one-point pdf formalism and has given the relations between the Lagrangian and the Eulerian pdfs. In these works, the presentation is strictly tailored for one-point pdf descriptions and the derivation of the stochastic models is often based on the previous knowledge of a macroscopic closure relation [10]. This is probably a reasonable choice (and perhaps the best compromise between model complexity and tractability) since closures of the reactive source terms, which are local source terms, require only one-point pdfs. In most presentations, the stochastic models are not derived from statistical arguments but from their correspondence with given mean moment equations, although recent proposals have tried to use only arguments from statistical physics [11]. This approach can indeed be regarded as a satisfactory answer for two of the main themes, turbulence and combustion, but application to dispersed turbulent two-phase flows requires further work.
Indeed, different physical effects are present when discrete particles are considered. Furthermore, the mean equations are not known in advance and should precisely be derived from a probabilistic approach. On the other hand, a particle approach and stochastic models have been used for some time to simulate dispersed two-phase flows, see among others the review of Stock [12] and the references therein. A wide variety of stochastic models have been devised, most of the time from a heuristic point of view. In two-phase flows, the notion of a stochastic particle is, at first sight, less surprising than in single-phase turbulence, and it is tempting to skip the careful construction of rigorous foundations since the stochastic concepts may be believed to be ‘evident’. However, this direct approach to stochastic modelling can create severe problems that will be detailed in Section 7. Given that no macroscopic relations are known in advance (and could thus be used as safeguards), the development of a rigorous approach to the pdf description of dispersed turbulent two-phase flows is needed. There is also a new element compared to single-phase reactive flows where the choice of the variables which are explicitly modelled is rather obvious. In two-phase flows, the selection of the basic variables is less obvious and is subject to debate. Then, various choices can be made for a pdf description, and the technical aspects of the hierarchies between these different pdfs must be well understood. As a consequence of the above analysis, the aim of this paper is to build a rigorous probabilistic framework that extends current models developed for single-phase reactive flows to include dispersed turbulent two-phase flows. Such a construction requires a mathematically oriented
presentation and the definition of a clear methodology. As such, it will be somewhat different from the above-cited references. The presentation will be based on ideas from statistical physics and on the hierarchies of pdf equations with a particle stochastic point of view. The stochastic models will be developed as much as possible from the arguments of statistical mechanics and statistical physics. The purpose of the present work is basically to propose a probabilistic description of a mixed system, composed of a continuous field and of discrete particles. The central notion that is adopted is the Lagrangian point of view which will appear as the ‘propagator of information’ for our complex system.

1.4. Description of the contents

The paper has been organized to answer the following general questions:
• What are the mathematical tools which are required? What are their main characteristics?
• How are they used for physical modelling in general?
• How are they precisely used in our case? What do we obtain from them?
• Are present models the end of the story or can they be improved or coupled to other methods?
These questions correspond to three categories: the mathematics of stochastic processes, the general physical meaning of stochastic modelling, and the development of a specific framework. As a consequence, the paper has a three-fold objective. The first objective is to provide the reader with a comprehensive and understandable picture of the theoretical tools used in the pdf approach (Section 2). Several notions must be understood: the mathematical properties of stochastic processes, the notion of the trajectories of a stochastic process as well as the correspondence between the trajectory point of view and pdf equations. Once the notion of a Markovian stochastic process (and more precisely the subclass ‘diffusion process’) and its associated pdf is clear, the second objective is the description of the use of diffusion stochastic processes for physical modelling. This is carried out by first recalling the concepts of statistical physics, i.e. the N-body problem. A general framework is given to work out the relations between the different levels of contraction (Section 3). Then, the modelling principles that allow stochastic processes to be used are presented (Section 4). From this results the definition of a pdf description, which is made up of the choice of the variables which constitute the state vector and of the choice of a stochastic model for this state vector. The third objective concerns the development of a consistent and self-contained framework for the probabilistic description of two-phase flows. This derivation is the core subject of the present work and is performed in four steps. A gradual construction of the complete description has several advantages. It avoids dealing immediately with a complicated formalism which may hide or blur some physical points. By gradually building the complete description, we can discuss at length the physical meaning of the different stochastic terms, for the continuous phase and for the dispersed phase.
Since the discrete particles are embedded in a turbulent fluid, their motion (and the associated statistical properties) is governed by the underlying turbulent flow. It is then important to detail the physical characteristics of turbulent flows. This is the first step of our modelling approach where the reader is given a comprehensive, but still general, overview of the physics of turbulence,
Section 5. The second modelling step is the probabilistic description of single-phase flows, that is the probabilistic description of a continuous field (Section 6). Emphasis is put on the level of information which is needed for successful closures (Kolmogorov theory), on the Lagrangian point of view (which is the natural choice in fluid mechanics) and on the existence of a propagator. Correspondences with the Eulerian description and classical mean field equations are given. The third modelling step addresses the question of the probabilistic modelling of discrete particles. The usual issues of particle-tracking models are discussed at length, Section 7. In both the second and the third modelling steps, numerical computations are presented and discussed to illustrate how the method performs in different flows and the kind of information that can be extracted from it. The fourth modelling step, i.e. the complete fluid–particle pdf approach, is achieved in Section 8. It is shown, once again, that the Lagrangian point of view is the natural choice and that there exists a propagator. Correspondence with Eulerian tools is put forward and mean field equations are derived using rigorous probabilistic arguments. By the end of Section 8, the reader has a clear picture of the pdf approach to turbulent dispersed two-phase flows. Then, the concepts of the probabilistic approach can be summarized and prospects for new developments can be put forward, Section 9.

2. Mathematical background on stochastic processes

The purpose of this chapter is to provide clear definitions of a stochastic process and of stochastic differential equations. These equations appear rather naturally in physical or engineering sciences where one would like to introduce ‘randomness’ or ‘noise’ into the differential equations that describe the evolution of a physical system. For example, one would like to give a precise meaning to the equation
\[ \frac{dX_t}{dt} = A(t, X_t) + B(t, X_t)\,\xi_t , \tag{3} \]
where $\xi_t$ is the so-called ‘white-noise’ process that represents some ‘rapid fluctuations’. It turns out that the definition and proper treatment of such an equation cannot be made directly with classical methods from ordinary differential equations (ODEs). Special mathematical notions have to be introduced to explain stochastic calculus, which has its subtleties that can be surprising at first sight. The following results and notions will be presented, as much as possible, in a logical way while trying to avoid being too mathematically involved. Most of these results will be stated without proofs and not all definitions are given. However, complements and detailed presentations of this material can be found in mathematical textbooks [13,14] or in physically-oriented books [15]. An excellent presentation combining mathematical correctness and an application-oriented discussion can be found in Öttinger [16]. Most of the material needed to handle probability concepts in a simple way has been developed in Pope’s seminal work [6] for single-phase pdf methods. It did not appear useful to repeat this presentation here and the objective of this section is to go into more mathematically advanced details. None of the following subsections can pretend to give a comprehensive description of its subject, but the themes and the order of the presentation reflect the important issues.
2.1. Random variables

In applied physics, random variables are often introduced directly through their probability density functions (pdf), which can be either discrete or continuous. The random variable, say $X$, can take a range of possible values $x \in A$, for example $x \in \mathbb{R}$ or $\mathbb{R}^d$. The probability that $X$ takes a value between $x$ and $x + dx$ is
\[ P[x \le X \le x + dx] = p(x)\,dx . \tag{4} \]
The actual and more rigorous construction of a random variable is based on an underlying probability space $(\Omega, \mathcal{F}, P)$ and on measure theory. One defines a reference space $\Omega$ equipped with a $\sigma$-algebra $\mathcal{F}$ (an ensemble of subsets such that complements and unions of them still give a subset that belongs to the ensemble) on which a measure $P$ (with $P(\Omega) = 1$) is defined. A random variable $X$ is mathematically defined as a measurable function from this reference space to the one where $X$ takes its values, here $A$, which is also equipped with a $\sigma$-algebra $\mathcal{G}$
\[ (\Omega, \mathcal{F}, P) \to (A, \mathcal{G}), \qquad \omega \mapsto X(\omega) , \tag{5} \]
and the law of probability of $X$ is simply the image of the reference measure $P$, that is
\[ P_X(A) = P(X^{-1}(A)), \quad \forall A \in \mathcal{G} . \]
2.1.1. Conditional expectations

The first central notion is the expectation of a random variable, which is the integral of the possible values against their measures
\[ \langle X \rangle = \int_{\Omega} X(\omega)\, dP(\omega) . \tag{6} \]
The expected or mean value is written here as $\langle X \rangle$ following the usual notation in applied physics but is written as $E(X)$ in the mathematical literature. The level of abstraction used in the definition of random variables is not just for the sake of doing mathematics but helps to make some notions precise, first for random variables and then for stochastic processes. One such notion is the conditional expectation, which is very important for the physical idea of coarse-grained descriptions but can only be fully understood with reference to $\sigma$-algebras. It is worth giving the formal definition:

Definition 1. If $X$ is a random variable on the probability space $(\Omega, \mathcal{F}, P)$ and if $\mathcal{F}'$ is a sub-$\sigma$-algebra of $\mathcal{F}$, that is $\mathcal{F}' \subset \mathcal{F}$, the conditional expectation (or conditional average) of $X$ given $\mathcal{F}'$, written $\langle X | \mathcal{F}' \rangle$, is a random variable defined on the sub-$\sigma$-algebra $\mathcal{F}'$ and such that its expectation or its mean value on any subset $A'$ of the sub-$\sigma$-algebra $\mathcal{F}'$ is equal to the mean value of $X$ on the same subset, or
\[ \langle X\, \mathbf{1}_{A'} \rangle = \langle \langle X | \mathcal{F}' \rangle\, \mathbf{1}_{A'} \rangle \quad \forall A' \in \mathcal{F}' , \tag{7} \]
where $\mathbf{1}_{A'}$ is the indicator of the subset $A'$.
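Definition 1 can be made concrete on a finite sample space. In the sketch below, the coarser $\sigma$-algebra is generated by a partition of six outcomes into three blocks, the conditional expectation is the block-wise average of $X$, and the defining property of Eq. (7) is checked on every set of the coarser $\sigma$-algebra. The sample space, the probabilities, the random variable and the partition are all hypothetical choices made for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite probability space: six equally likely outcomes.
omega = range(6)
prob = {w: Fraction(1, 6) for w in omega}
X = {w: Fraction(w * w) for w in omega}

# The coarser sigma-algebra is generated by this partition of the outcomes.
partition = [(0, 1), (2, 3), (4, 5)]


def conditional_expectation(X, prob, partition):
    """Block-wise average of X: constant on each block of the partition."""
    cond = {}
    for block in partition:
        mass = sum(prob[w] for w in block)
        average = sum(X[w] * prob[w] for w in block) / mass
        for w in block:
            cond[w] = average
    return cond


def mean_on(Y, prob, subset):
    """Mean value of Y restricted to a subset, i.e. <Y 1_subset>."""
    return sum(Y[w] * prob[w] for w in subset)


cond_X = conditional_expectation(X, prob, partition)

# Every set of the coarse sigma-algebra is a union of blocks (8 sets here,
# including the empty set and the whole sample space).
coarse_sets = [[w for block in blocks for w in block]
               for r in range(len(partition) + 1)
               for blocks in combinations(partition, r)]

checks = [mean_on(X, prob, A) == mean_on(cond_X, prob, A) for A in coarse_sets]
print(all(checks))
```

Exact rational arithmetic is used so that the equality in Eq. (7) can be tested without rounding tolerance.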
This formal statement can be translated into words. The first important point is that the conditional average is also a random variable but defined on a coarser $\sigma$-algebra. A $\sigma$-algebra can be regarded as describing mathematically the notion of the ‘level of information resolved’ or the ‘information content’ of the random variable $X$. If more information is provided on a physical object which is represented by the random variable $X$, this corresponds in the mathematical definition to a function defined on a finer ensemble $\mathcal{F}$. Conversely, if less information is known or resolved by $X$, this translates into the fact that the function $X$ is defined on a coarser ensemble $\mathcal{F}'$. Therefore, for a physical object represented by a random variable $X$, the idea of a coarse-grained description, when not enough details are resolved or when one voluntarily disregards some pieces of information, can be well represented by a conditional average. The second important point in the definition given above is that the conditional average is the mean of the actual random variable $X$ ‘averaged’ over the unresolved information or, in more mathematical terms, over the finer decomposition of any subset $A'$ of $\mathcal{F}'$ into unions of subsets of $\mathcal{F}$. This appears as the only way to properly define the notion of the conditional average of one random variable. However, when one in fact handles two joint random variables $X$ and $Y$ and simply considers the sub-$\sigma$-algebras obtained by fixing the value of one of the two random variables, say for example the sub-$\sigma$-algebra obtained with $Y = y$, we retrieve the usual and more intuitive notion of a conditional random variable given the value of another one, whose pdf is then
\[ p(X = x | Y = y) = \frac{p(x, y)}{p_Y(y)} . \tag{8} \]

2.1.2. Weak and strong convergence of random variables

Random variables are not often known directly and are generally obtained as limits of approximate and simpler random variables.
This is the case when a process is simulated by numerical integration with finite time steps $\Delta t$, see Section 2.10. This also happens from a physical point of view since models are used to get practical but then approximate answers. One must be able to know how properties or characteristics of the various approximations can be carried over to the actual solution. Several modes of convergence can be defined for random variables, which must be well understood, in particular the distinction between strong and weak convergence. For these reasons, we explicitly give the following definitions.

Definition 2. The sequence $(X_n)$ converges towards the random variable $X$, defined on the same probability space, almost surely if and only if
\[ P(\{\omega \text{ for which } |X_n(\omega) - X(\omega)| \to 0 \text{ as } n \to \infty\}) = 1 . \tag{9} \]

Definition 3. The sequence $(X_n)$ converges towards the random variable $X$, defined on the same probability space, in the mean square sense if and only if
\[ \langle |X_n - X|^2 \rangle \to 0 \quad \text{as } n \to \infty . \tag{10} \]
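Definition 4. The sequence $(X_n)$ converges towards the random variable $X$, not necessarily defined on the same probability space, in distribution or in law if and only if, for every bounded continuous function $f$,
\[ \langle f(X_n) \rangle \to \langle f(X) \rangle \quad \text{as } n \to \infty . \tag{11} \]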
There is actually a fourth possible mode of convergence, the notion of stochastic limit, which is not given here. This mode of convergence is important in the theory but since it will not be explicitly used here and since its absence does not prevent the key concepts from being presented, it is left out so as to limit the mathematical burden. The first mode of convergence, the almost sure limit, is the strongest possible one. It is actually the idea of classical pointwise convergence of real-valued functions used for the realizations of the random variable. It means that the sequence $(X_n)$ converges to $X$ ‘everywhere’, or that the subset on which $(X_n)$ does not converge to $X$ is negligible (in the sense that the measure of its importance is zero). The second mode of convergence has a more familiar connotation since it manipulates something that is basically an energy. Yet, these first two modes are similar. The third one is somewhat different since what is required is that only mean quantities derived from the sequence $(X_n)$ converge to a mean value derived from the limit process $X$. That limit process does not need to be known explicitly, and we only deal here with some information extracted from the different processes. That mode of convergence is therefore weaker than the first two. Indeed, the first two modes depend ‘directly’ on the values of the variables $X_n$ and $X$ whereas in the third mode the knowledge of $X$ is ‘indirect’. Loosely speaking, we can give the overall picture and say that the almost sure and the mean square convergence are strong modes of convergence while convergence in distribution refers to a weak convergence.
The distinction between these two ways of approximating random variables is important within the mathematical theory (definition of the Itô integral, solutions of equations, ...) but also for numerical reasons (see Section 2.10) and for physical purposes since it helps to clarify the ideas of the pdf approach in single- and two-phase flows developed in Sections 6–8.

2.2. Stochastic processes

Another interest of the exact definition of random variables given previously is to pave the way for the notion of stochastic processes and of trajectories of a process. A stochastic process is simply a family of random variables $X = (X_t)$ indexed by a parameter which is usually the time $t$. This notion arises naturally when one wishes to use random variables to model a time-dependent physical system. The mathematical definition of a stochastic process is in fact a measurable function of two variables
\[ T \times (\Omega, \mathcal{F}, P) \to (A, \mathcal{G}), \qquad (t, \omega) \mapsto X(t, \omega) . \tag{12} \]
The equivalence mentioned in the introduction between different points of view can now be made clear by fixing one of the two variables.
(a) for each fixed $t \in T$, $X_t$ is a random variable and we can define its pdf $p(t, x)$,
(b) for each fixed $\omega$, we have simply a function $t \mapsto X_t(\omega) = X(t, \omega)$ which is called a trajectory of the stochastic process $X$ or a sample path,
(c) there is a third point of view which generalizes the trajectory point of view. In this point of view, the stochastic process $(X_t)$ is regarded as a random variable for which the range of values is the set of real functions $X_{\cdot}(\omega)$. This defines the path-integral point of view.
In (a), we address the problem by considering the time-dependent pdf and the question is, for a given problem, to write the equation satisfied by this pdf. This is the pdf point of view. In (b), we first discretize the reference space introducing ‘particles’ and we follow the time evolution of these particles which define the trajectories of the process. This is the trajectory point of view. It is now clear that these particles actually represent different realizations of a stochastic process whose evolution is tracked in time. The path-integral point of view will not be used in the discussion of present models for turbulent dispersed two-phase flows, but will be referred to as an attractive tool in Section 9.3.4.

2.3. Markov processes

Manipulation of general stochastic processes is difficult since it requires handling N-point distributions, that is the joint distribution functions of the values of the process
\[ p(t_1, x_1, t_2, x_2, \ldots, t_N, x_N) , \tag{13} \]
at $N$ different times, and for any value of $N$. An important simplification can be obtained for a class of special processes to which we nearly always limit ourselves, Markov processes. A Markov process is defined as a process for which knowledge of the present is sufficient to predict the future. This is actually a simple notion which is carried over from ordinary differential equations (ODEs). In classical mechanics, when an ODE is written to describe the time evolution of a system, knowledge of the initial condition is sufficient. For stochastic processes, the Markov property means that if the state of the system is known at time $t_0$, additional information on the system at previous times $s$ ($s \le t_0$) has no effect on the future at $t > t_0$. The Markov property simplifies the situation since it can be shown that Markov processes are completely determined by their initial distribution $p(t_0, x_0)$ and their transitional pdf $p(t, x | t_0, x_0)$. This transitional pdf represents the probability that $X$ takes a value $x$ at time $t$ conditioned on the fact that at time $t_0$ its value was $x_0$. The Markov property manifests itself in the following consistency relation which is the Chapman–Kolmogorov formula
\[ p(t, x | t_0, x_0) = \int p(t, x | t_1, x_1)\, p(t_1, x_1 | t_0, x_0)\, dx_1 . \tag{14} \]
This equation states that the probability to go from $(t_0, x_0)$ to $(t, x)$ is the sum over all intermediate locations $x_1$ at an intermediate time $t_1$. The factorization inside the integral reflects the independence of the past and the future at $t_1$ when the present is known. A Markov process can be characterized directly in terms of its transitional pdf or its trajectories or more indirectly (in a weak or distribution sense) by its action on members of a function space. It is useful to define the infinitesimal operator for functions $g$ acting on the sample space
of $X_t$ by
\[ L_t g(x) = \lim_{dt \to 0} \frac{\langle g(X_{t+dt}) | X_t = x \rangle - g(x)}{dt} , \tag{15} \]
where $\langle g(X) \rangle$ denotes the mean or expectation
\[ \langle g(X) \rangle = \int g(x)\, p(x)\, dx . \tag{16} \]
The value of $L_t g(x)$ can be thought of as the mean infinitesimal rate of change of the process $g(X_s)$ conditioned on $X_t = x$. Using this operator, the Chapman–Kolmogorov relation can be turned into differential equations. Since the conditional pdf $p(t, x | t_0, x_0)$ is a function of two sets of variables, one can consider variations with respect to the initial variables $(t_0, x_0)$ or the final variables $(t, x)$. We then obtain two different equations (see [13]), whose meaning will become clearer for diffusion processes:
• Kolmogorov backward equation:
\[ \frac{\partial p}{\partial t_0} + L_{t_0}\, p = 0 , \qquad \text{end condition } p(t, x | t_0, x_0) = \delta(x - x_0) \text{ when } t_0 \to t . \tag{17} \]
• Kolmogorov forward equation:
\[ \frac{\partial p}{\partial t} = L_t^{*}\, p , \qquad \text{initial condition } p(t, x | t_0, x_0) = \delta(x - x_0) \text{ when } t \to t_0 , \tag{18} \]
where $L_t^{*}$ denotes the adjoint of the operator $L_t$. The forward Kolmogorov equation gives the well-known Fokker–Planck equation for diffusion processes as we will see below.

2.4. Key Markov processes

2.4.1. The Poisson process

Many situations, such as electron emission, telephone calls, shot noise or collisions, among other problems, require the notion of random points and eventually of Poisson processes. The important properties of the statistics of random points are first outlined. If a large number of points $n$ are placed at random within a wide interval, say $[-T/2, T/2]$, it can be shown that the probability to have $k$ points in an interval $I$ of length $t_I$, small with respect to $T$, is given by
\[ P(k \text{ in } I) = e^{-n t_I / T}\, \frac{(n t_I / T)^k}{k!} . \tag{19} \]
We then consider the case when $n, T \to \infty$ such that $n/T = \lambda$ remains finite. This defines the concept of random Poisson points for which the probability to have $k$ points in any interval $I$ of length $t_I$, say $n(I) = k$, is thus
\[ P(k \text{ in } I) = e^{-\lambda t_I}\, \frac{(\lambda t_I)^k}{k!} . \tag{20} \]
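The limit leading from Eq. (19) to Eq. (20) can be checked numerically. When $n$ points are placed uniformly and independently in an interval of length $T$, the number of points falling in a sub-interval of length $t_I$ follows a binomial law with parameter $t_I/T$; the sketch below compares this exact binomial probability with the Poisson expression for two values of $n$ and $T$ with the same density $n/T$. The numerical values and the helper names are assumptions chosen for illustration.

```python
import math


def binomial_pmf(k, n, p):
    """Exact probability of k points out of n landing in the sub-interval."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def poisson_pmf(k, mean):
    """Poisson probability exp(-mean) * mean**k / k!."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def max_gap(n, total_length, interval_length, k_max=10):
    """Largest difference between the binomial and Poisson laws for k <= k_max."""
    p = interval_length / total_length
    mean = n * interval_length / total_length
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, mean))
               for k in range(k_max + 1))


# Same density n / T = 2 points per unit length, increasingly wide interval.
gap_small = max_gap(n=20, total_length=10.0, interval_length=1.0)
gap_large = max_gap(n=2000, total_length=1000.0, interval_length=1.0)
print(f"n = 20,   T = 10   : max gap = {gap_small:.2e}")
print(f"n = 2000, T = 1000 : max gap = {gap_large:.2e}")
```

The gap shrinks as $n$ and $T$ grow at fixed density, which is the sense in which the Poisson law is reached in the limit.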
A very important property is that random points in non-overlapping intervals are independent. The parameter $\lambda$ which specifies Poisson points has a clear and simple meaning. This is shown by considering a small time interval of length $\Delta t$ and the probability to have one point within that interval. If $\Delta t$ is such that $\lambda \Delta t$ is much less than one, we have
\[ P(\text{one point in } [t, t + \Delta t]) \simeq \lambda \Delta t , \tag{21} \]
while we have
\[ P(\text{more than one point in } [t, t + \Delta t]) \ll \Delta t . \tag{22} \]
Consequently, the parameter # appears as the density of Poisson points. From the concept of random points, the deAnition of the Poisson process is straightforward. The Poisson process, Nt is deAned as the number of random points, or more generally of random events that take place in the interval [0; t] Nt = n(0; t) :
(23)
The Poisson process is therefore a stochastic process taking discrete values. The trajectories of the Poisson process are staircase functions, being constant between random points at which they jump to the next integer. The mean value of the Poisson process and its variance are given by Nt = #t ;
(24)
Nt = (Nt − Nt )2 1=2 = (#t)1=2 :
(25)
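As an illustration of Eqs. (24) and (25), the following minimal sketch samples the Poisson process numerically. It relies on a standard property that is not stated in the text, namely that the waiting times between successive Poisson points are exponentially distributed with mean $1/\lambda$; the function names and the parameter values are hypothetical.

```python
import math
import random


def poisson_count(rate, t, rng):
    """Number of random points in [0, t] for a Poisson intensity `rate`.

    Assumption (standard, not taken from the text): the waiting times
    between successive points are exponential with mean 1/rate.
    """
    n, clock = 0, rng.expovariate(rate)
    while clock <= t:
        n += 1
        clock += rng.expovariate(rate)
    return n


def sample_statistics(rate=2.0, t=5.0, n_paths=20000, seed=0):
    """Sample mean and standard deviation of N_t over n_paths realizations."""
    rng = random.Random(seed)
    counts = [poisson_count(rate, t, rng) for _ in range(n_paths)]
    mean = sum(counts) / n_paths
    var = sum((c - mean) ** 2 for c in counts) / n_paths
    return mean, math.sqrt(var)


mean, std = sample_statistics()
# To be compared with lambda * t = 10 and (lambda * t)**0.5, Eqs. (24)-(25).
print(mean, std)
```

The estimates carry the usual Monte Carlo sampling error, so they should only be expected to approach the theoretical values as the number of realizations grows.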
The parameter $\lambda$ which defines the process is still the density of the random points, or rather of the random times at which certain events take place (emission of an electron, arrival of a phone call, collision with another particle, ...). It is called the intensity of the Poisson process and has the dimension of a frequency or the inverse of a time scale, say $\tau_c$ (that is, $\lambda = 1/\tau_c$). The mean time interval between each random event is simply equal to $\tau_c$.

2.4.2. Wiener process and Brownian motion

The Wiener process is the key process for our present concerns. It represents directly a model for a Brownian particle and as such has direct physical applications for modelling issues. It is also the fundamental building block on which diffusion processes and stochastic differential equations are built. The Wiener process can be introduced in different ways, directly through its construction as a random walk in some applied textbooks or as a rather abstract mathematical object in more formal mathematical works. A middle path is sought here and further explanations can be found in [13–15]. We first limit ourselves to the one-dimensional case but all results are easily extended to the multi-dimensional case.

The Wiener process $W_t$ can be defined as a Gaussian process. Just as every Gaussian random variable is completely defined by its mean and variance, a Gaussian process is fully characterized by two functions, its mean and covariance, which are functions of one and two variables respectively:
$$M(t) = \langle X_t \rangle, \tag{26}$$
$$C(t, t') = \left\langle (X_t - \langle X_t \rangle)(X_{t'} - \langle X_{t'} \rangle) \right\rangle. \tag{27}$$
For the Wiener process, these defining functions are
$$M(t) = 0, \qquad C(t, t') = \min(t, t'), \tag{28}$$
and the transitional pdf has a Gaussian form
$$p(t, x|t_0, x_0) = \frac{1}{\sqrt{2\pi (t - t_0)}} \exp\left( -\frac{(x - x_0)^2}{2 (t - t_0)} \right). \tag{29}$$
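The defining functions of Eq. (28) can be checked on sampled trajectories. The sketch below is only illustrative: it builds each path from independent Gaussian increments of zero mean and variance $\Delta t$ (which is what the transitional pdf of Eq. (29) implies over one step), and the function names, time step and sample sizes are hypothetical choices.

```python
import math
import random


def wiener_path(n_steps, dt, rng):
    """Return [W_0, W_dt, ..., W_(n_steps*dt)] built from independent
    Gaussian increments with zero mean and variance dt."""
    w = [0.0]
    for _ in range(n_steps):
        w.append(w[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return w


def mean_and_covariance(i, j, n_steps=100, dt=0.01, n_paths=5000, seed=1):
    """Ensemble estimates of <W_s> and <W_s W_t> with s = i*dt, t = j*dt."""
    rng = random.Random(seed)
    sum_i = sum_ij = 0.0
    for _ in range(n_paths):
        w = wiener_path(n_steps, dt, rng)
        sum_i += w[i]
        sum_ij += w[i] * w[j]
    return sum_i / n_paths, sum_ij / n_paths


m, c = mean_and_covariance(i=30, j=80)
# To be compared with M(t) = 0 and C(0.3, 0.8) = min(0.3, 0.8) = 0.3.
print(m, c)
```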
The infinitesimal operator associated to the Wiener process is
$$L_t g(x) = \frac{1}{2} \frac{\partial^2 g(x)}{\partial x^2}, \tag{30}$$
and the forward Kolmogorov equation shows that the transitional pdf $p(t, x|t_0, x_0)$ is the solution of the heat equation
$$\frac{\partial p}{\partial t} = \frac{1}{2} \frac{\partial^2 p}{\partial x^2}, \qquad \text{initial condition } p(t, x|t_0, x_0) = \delta(x - x_0) \text{ when } t \to t_0. \tag{31}$$
This equation already reveals the physics of the problem. A quantity will diffuse in space (its value follows a diffusion equation such as the heat equation) because it is ‘carried’ by underlying and fast Brownian particles. In other words, the result of the mixing of fast Brownian particles which carry a piece of information is that the mean value of that information diffuses in space.

The Wiener process has a number of particular properties. The main ones are:
• the trajectories of $W_t$ are continuous yet nowhere differentiable. Even on a small interval, $W_t$ fluctuates enormously.
• the increments of $W_t$, $dW_t = W_{t+dt} - W_t$, over small time steps $dt$ are stationary and independent. Each increment is a Gaussian variable with mean, variance and higher moments given by
$$\langle dW_t \rangle = 0, \qquad \langle (dW_t)^2 \rangle = dt, \qquad \langle (dW_t)^n \rangle = o(dt) \quad (n > 2). \tag{32}$$
• the Wiener process is the only stochastic process with independent Gaussian increments and with continuous trajectories.
• the trajectories are of unbounded variation in every finite interval. This property explains why stochastic integrals will differ from classical Riemann–Stieltjes ones.

2.5. General Chapman–Kolmogorov equations
Some of the typical properties observed with the key stochastic processes described above can be generalized to a whole class of Markov processes, provided that certain assumptions are made on their behaviour over small time increments. From the correspondence between the trajectory and the pdf points of view, there are two ways to express this incremental behaviour. In this section, we follow the trajectory point of view and characterize these processes by the following conditions on the transitional pdf over small increments in time $\Delta t$:
$$\frac{1}{\Delta t}\, p(t + \Delta t, y|t, x) = W(y|t, x) + O(\Delta t), \quad \text{for } |x - y| > \varepsilon, \tag{33a}$$
$$\frac{1}{\Delta t} \int_{|y - x| < \varepsilon} (y - x)\, p(t + \Delta t, y|t, x)\, dy = A(t, x) + O(\Delta t), \tag{33b}$$
$$\frac{1}{\Delta t} \int_{|y - x| < \varepsilon} (y - x)^2\, p(t + \Delta t, y|t, x)\, dy = B^2(t, x) + O(\Delta t). \tag{33c}$$
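Conditions (33b) and (33c) can be read as a recipe for estimating the drift and diffusion coefficients from small-time increments. The sketch below is a hedged illustration under two assumptions that are not part of the conditions themselves: there are no jumps ($W = 0$), and the one-step transition is modelled as a Gaussian increment $A\,\Delta t + B\sqrt{\Delta t}\,e$ with $e$ a standard normal sample. The coefficient functions and numerical values are hypothetical.

```python
import math
import random


def incremental_moments(x, drift, diffusion, dt=1e-3, n_samples=200000, seed=5):
    """Estimate (1/dt) <y - x> and (1/dt) <(y - x)^2> over one small step.

    Assumption: no jumps (W = 0) and a Gaussian one-step transition,
    y = x + A dt + B sqrt(dt) e, with e a standard normal sample.
    """
    rng = random.Random(seed)
    a, b = drift(x), diffusion(x)
    s1 = s2 = 0.0
    for _ in range(n_samples):
        dy = a * dt + b * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        s1 += dy
        s2 += dy * dy
    return s1 / (n_samples * dt), s2 / (n_samples * dt)


# Hypothetical coefficients: A(x) = -x and B(x) = 0.5, evaluated at x = 2.
a_est, b2_est = incremental_moments(2.0, lambda x: -x, lambda x: 0.5)
# To be compared with A = -2 and B^2 = 0.25, up to O(dt) and sampling error.
print(a_est, b2_est)
```

Note that the drift estimate converges slowly: its statistical error scales as $B/\sqrt{N \Delta t}$, which is why a large number of samples is used for such a small time step.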
The first condition is the probability of a jump and trajectories are discontinuous when $W \neq 0$. The second one defines the drift coefficient $A(t, x)$ which is the mean increment of the conditional process $X_t$. The third one defines the diffusion coefficient which represents the variance of the increment or the spread around the mean incremental value.

The transitional probability density function $p(t, x|t_0, x_0)$ is a function of both the initial state $(t_0, x_0)$ and of the final state $(t, x)$. Consequently, two points of view can be adopted by holding either the initial or the final condition fixed and by varying the other state. Using the Chapman–Kolmogorov relation, Eq. (14), and the above hypotheses, it can be shown that, when the initial condition is held fixed and when $p$ is regarded as a function of the final state $(t, x)$, then $p(t, x|t_0, x_0)$ satisfies the forward Kolmogorov equation [15]
$$\frac{\partial p}{\partial t} = -\frac{\partial [A(t, x)\, p]}{\partial x} + \frac{1}{2} \frac{\partial^2 [B^2(t, x)\, p]}{\partial x^2} + \int \left\{ W(x|t, y)\, p(t, y|t_0, x_0) - W(y|t, x)\, p(t, x|t_0, x_0) \right\} dy. \tag{34}$$
Using similar considerations and more or less the same derivation, it can be shown that, as a function of the initial state $(t_0, x_0)$ when the final condition $(t, x)$ is held fixed, $p(t, x|t_0, x_0)$ satisfies the backward Kolmogorov equation [15]
$$\frac{\partial p(t, x|t_0, x_0)}{\partial t_0} = -A(t_0, x_0) \frac{\partial p(t, x|t_0, x_0)}{\partial x_0} - \frac{1}{2} B^2(t_0, x_0) \frac{\partial^2 p(t, x|t_0, x_0)}{\partial x_0^2} + \int W(y|t_0, x_0) \left\{ p(t, x|t_0, x_0) - p(t, x|t_0, y) \right\} dy. \tag{35}$$
It is important not to confuse the two points of view (forward or backward), which further justifies the central role of the transitional pdf. In the forward equation, the initial state does not appear explicitly in the jump, drift and diffusion coefficients, and we can integrate over all possible initial conditions. Since the pdf of the stochastic process $X_t$ at time $t$ is of course given by
$$p(t, x) = \int p(t, x|t_0, x_0)\, p(t_0, x_0)\, dx_0, \tag{36}$$
it follows that $p(t, x)$ satisfies the same forward equation.

From the general Chapman–Kolmogorov equations, various cases can be isolated by considering different possibilities for the jump, drift and diffusion coefficients. These particular equations have sometimes been obtained separately and carry different names, often for historical reasons. Yet, in the present formulation, they appear as subclasses of a general class of Markov processes.
2.5.1. The Master equation

When $A(t, x) = B(t, x) = 0$, the stochastic process involves only jumps and between each jump the trajectories of the process are constant (straight horizontal lines). The pdf equation is called the Master equation
$$\frac{\partial p}{\partial t} = \int \left[ W(x|t, y)\, p(t, y|t_0, x_0) - W(y|t, x)\, p(t, x|t_0, x_0) \right] dy. \tag{37}$$
The Master equation is indeed the central equation for processes which are typically discrete and is met when one deals with physical issues which are also by nature discrete. The classical example is molecular and chemical processes which involve either a complete change or no change at all. Another typical application is for particle collisions where particle velocities can be constant and change discontinuously at discrete and random times. Over a finite time step $\Delta t$, we have
$$p(t + \Delta t, y|t, x) = \delta(y - x) \left[ 1 - \Delta t \int W(y'|t, x)\, dy' \right] + W(y|t, x)\, \Delta t, \tag{38}$$
which shows that $W(y|t, x)$ is the probability to jump from state $x$ to $y$ at time $t$ per unit of time. The generic process in this subclass is the Poisson process described in the previous section, for which the sample space is discrete and $W(x + 1|t, x) = \lambda$.

2.5.2. The Liouville equation

When $W(y|t, x) = B(t, x) = 0$, the process is a continuous deterministic process and the pdf equation is the Liouville equation
$$\frac{\partial p}{\partial t} = -\frac{\partial [A(t, x)\, p]}{\partial x}. \tag{39}$$
The Liouville equation is central in classical mechanics. Its characteristic form, and the presence of only first-order partial derivatives, are closely related to the choice of a closed description of a mechanical system (each degree of freedom is explicitly tracked), as will be explained in detail later on in Sections 3 and 4.

2.5.3. The Fokker–Planck equation

When $W(y|t, x) = 0$, the forward Kolmogorov equation is called the Fokker–Planck equation
$$\frac{\partial p}{\partial t} = -\frac{\partial [A(t, x)\, p]}{\partial x} + \frac{1}{2} \frac{\partial^2 [B^2(t, x)\, p]}{\partial x^2}. \tag{40}$$
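As a quick consistency check (a worked example added here, using only Eqs. (29), (31) and (40)), setting $A = 0$ and $B = 1$ in the Fokker–Planck equation gives back the heat equation (31) of the Wiener process, and the Gaussian transitional pdf (29) can be verified to solve it. The shorthand $\theta = t - t_0$ is introduced only for this check.

```latex
\begin{align*}
\frac{\partial p}{\partial t} &= \frac{1}{2}\frac{\partial^2 p}{\partial x^2}
  && \text{(Eq. (40) with } A = 0,\ B = 1\text{)} \\
p(t, x|t_0, x_0) &= \frac{1}{\sqrt{2\pi\theta}}
  \exp\!\left(-\frac{(x - x_0)^2}{2\theta}\right),
  \qquad \theta = t - t_0 \\
\frac{\partial p}{\partial t} &=
  \left[\frac{(x - x_0)^2}{2\theta^2} - \frac{1}{2\theta}\right] p \\
\frac{\partial p}{\partial x} &= -\frac{x - x_0}{\theta}\, p,
  \qquad
\frac{\partial^2 p}{\partial x^2} =
  \left[\frac{(x - x_0)^2}{\theta^2} - \frac{1}{\theta}\right] p
\end{align*}
```

Half of the second spatial derivative is identical to the time derivative, so both sides of the equation agree for every $\theta > 0$.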
Compared to the Liouville equation, the Fokker–Planck equation involves a supplementary term with a second-order partial derivative. The existence of this term has deep consequences both mathematically and physically. From the mathematical point of view, the issue is to define clearly the corresponding behaviour of the trajectories of the process and to put the manipulation of these trajectories on a sound footing. From the physical point of view, the issue is to explain how this behaviour comes into play and the physical meaning behind its use. That question is addressed in Section 4. The solutions of Fokker–Planck equations are known as diffusion processes and the rest of the present section is devoted to clarifying their characteristics and how they are manipulated. The central example within the subclass of diffusion processes is the Wiener process, described in the previous section.
2.6. Stochastic differential equations and diffusion processes

Diffusion processes form a subclass of Markov processes. They have been carefully studied and their properties are rather well known mathematically, which makes them easier and safer to use in applied physics. They will be used extensively for modelling purposes since they represent models for diffusion phenomena (thus their name) and have continuous trajectories. The first example is the Wiener process described above and general diffusion processes are in fact extensions of it. The infinitesimal operator for such a diffusion process is given by
$$L = A(t, x) \frac{\partial}{\partial x} + \frac{1}{2} B^2(t, x) \frac{\partial^2}{\partial x^2} \tag{41}$$
and the transitional pdf $p(t, x|t_0, x_0)$ satisfies the Fokker–Planck equation, Eq. (40), with the initial condition $p(t, x|t_0, x_0) \to \delta(x - x_0)$ when $t \to t_0$.

The Fokker–Planck equation reflects the pdf point of view. On the other hand, the second point of view will give direct answers to the questions explained in the introduction of this chapter related to the meaning of Eq. (3). One would like to give a meaning to the ‘noise’ term, $\xi_t$, introduced in an ODE. The proper way to do so is to say that we are now dealing with a stochastic process $X_t$ and that we are writing differential equations for the trajectories of this process as defined above. We consider now $\xi_t$ as a rapidly fluctuating, highly irregular stochastic process. The ideal model is a Gaussian ‘white noise’ model where the process is stationary with zero mean and no correlation, that is
$$\langle \xi_t \rangle = 0 \quad \text{and} \quad \langle \xi_t\, \xi_{t'} \rangle = \delta(t' - t). \tag{42}$$
This process has a constant spectral density (thus the name white noise). However, the white-noise process cannot be defined directly since it has an infinite variance. One can give an abstract sense to this process (Arnold [13], Chapter 3). However, there is a simpler way out of this difficulty. The solution consists in considering the effect of the white-noise term over (small) time intervals. We define
$$Y_t = \int_0^t \xi_{t'}\, dt'. \tag{43}$$
$Y_t$ is a Gaussian Markov process whose mean and covariance functions are worked out from the properties of the white noise
$$\langle Y_t \rangle = 0 \quad \text{and} \quad \langle (Y_t)^2 \rangle = t. \tag{44}$$
Therefore, $Y_t$ can be identified with the Wiener process, $Y_t = W_t$. This indicates that the integral over a time interval of the white-noise process gives the Wiener process, and this justifies writing
$$W_t = \int_0^t \xi_{t'}\, dt' \quad \text{or} \quad dW_t = \xi_t\, dt. \tag{45}$$
The idea is thus to try to define not the derivatives of the trajectories, Eq. (3), but their increments over small time steps as
$$dX_t = A(t, x)\, dt + B(t, x)\, dW_t, \tag{46}$$
which is a short-hand notation for
$$X_t = X_{t_0} + \int_{t_0}^{t} A(s, X_s)\, ds + \int_{t_0}^{t} B(s, X_s)\, dW_s. \tag{47}$$
The first integral can be thought of as a classical one. The second one is a stochastic integral which must be properly defined. In the usual sense, one would split the time interval into a number of small time steps and write the integral as the limit
$$\int_{t_0}^{t} B(s, X_s)\, dW_s = \lim_{N \to \infty} \sum_{i=1}^{N} B(\tau_i, X_{\tau_i}) (W_{t_{i+1}} - W_{t_i}), \tag{48}$$
where $\tau_i$ is chosen in the interval $[t_i, t_{i+1}]$. However, it turns out that the limit result is not independent of the choice of the intermediate time $\tau_i$, as one would expect from classical integration. Different choices of this intermediate time yield finite but different results for the integral. This surprising result can be traced back to the ragged behaviour of the Wiener process and to the fact that its trajectories are of infinite total variation in any time interval. To obtain a meaningful and coherent theory (for later manipulations) of the stochastic integral, one must therefore choose the intermediate times right from the outset. The important message here is that speaking of a stochastic integral without specifying in what sense it is considered is not meaningful.

Two main choices have been made in the literature. The first one is called the Itô definition and consists in taking the value at the beginning of the time interval, $\tau_i = t_i$. There is a clear probabilistic interpretation of this. The integral is written
$$\int_{t_0}^{t} B(s, X_s)\, dW_s = \lim_{N \to \infty} \sum_{i=1}^{N} B(t_i, X_{t_i}) (W_{t_{i+1}} - W_{t_i}), \tag{49}$$
which shows that we consider the function $B(t, X_t)$ as a non-anticipating function with respect to the Wiener process. The choice of $\tau_i$ signifies that we express $B(t, x)$ as a function of the present state while the increment $dW_t$, which is independent of the present, is said to ‘point towards the future’. This choice is in fact rather natural when the ‘noise’ does not depend on the system. The following properties of the Itô stochastic integral result from it:
$$\left\langle \int_{t_0}^{t_1} X_t\, dW_t \right\rangle = 0, \tag{50}$$
$$\left\langle \int_{t_0}^{t_1} X_t\, dW_t \int_{t_2}^{t_3} Y_t\, dW_t \right\rangle = \int_{t_2}^{t_1} \langle X_t Y_t \rangle\, dt \quad \text{for } t_0 \le t_2 \le t_1 \le t_3. \tag{51}$$
The second choice is to take the intermediate point $\tau_i$ as the middle point of the interval, $\tau_i = (t_i + t_{i+1})/2$. This results in the Stratonovich definition. Actually, various definitions of the Stratonovich integral can be found depending upon the exact expression of the term involving $B(\tau_i, X_{\tau_i})$ in the limit sum, Eq. (48). For example, one can choose to take $B((t_i + t_{i+1})/2, X_{(t_i + t_{i+1})/2})$ or $B(t_i, X_{(t_i + t_{i+1})/2})$ as in Arnold [13]. The most common definition met in mathematical books is (written with a characteristic symbol $\circ$)
$$\int_{t_0}^{t} B(s, X_s) \circ dW_s = \lim_{N \to \infty} \sum_{i=1}^{N} \frac{1}{2} \left[ B(t_i, X_{t_i}) + B(t_{i+1}, X_{t_{i+1}}) \right] (W_{t_{i+1}} - W_{t_i}). \tag{52}$$
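The dependence of the limit on the intermediate point can be made concrete with a small numerical sketch. It evaluates the sums of Eqs. (49) and (52) for the hypothetical integrand $B(t, X_t) = W_t$ on one sampled trajectory. The reference values quoted in the comments, $(W_T^2 - T)/2$ for the Itô sum and $W_T^2/2$ for the Stratonovich sum, are standard results that are not derived in the text; step count and seed are arbitrary choices.

```python
import math
import random


def stochastic_sums(n_steps=20000, t_end=1.0, seed=2):
    """Evaluate the discrete sums of Eqs. (49) and (52) for B(t, X_t) = W_t
    along a single sampled Wiener trajectory on [0, t_end]."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    w = 0.0
    ito = strato = quad_var = 0.0
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        ito += w * dw                        # integrand at the start of the step
        strato += 0.5 * (w + (w + dw)) * dw  # average of the two end values
        quad_var += dw * dw                  # sum of squared increments
        w += dw
    return ito, strato, quad_var, w


ito, strato, quad_var, w_end = stochastic_sums()
print(ito, 0.5 * (w_end ** 2 - 1.0))  # Ito sum vs (W_T^2 - T) / 2 with T = 1
print(strato, 0.5 * w_end ** 2)       # Stratonovich sum vs W_T^2 / 2
print(quad_var)                       # sum of (dW)^2, expected to be close to T
```

The two sums differ by half the sum of the squared increments, which does not vanish as the time step is refined but tends to $T/2$; this is the discrete counterpart of the first-order behaviour of $(dW_t)^2$.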
These various definitions differ only by the assumptions required on $B$ for the sums to converge, and if $B$ is smooth enough they lead to the same limit object. Therefore, the above sum can be taken as the present definition of the Stratonovich integral.

The question of which definition of the stochastic integral, Itô or Stratonovich, should be chosen has led to continuous debate in applied physics textbooks. A detailed discussion of this dilemma is outside the scope of the present notes. However, the key point is to be aware of this apparent peculiarity so as to avoid confusion. Indeed, if these different definitions and properties are ignored, it is hard to understand why calculations performed with seemingly identical procedures can lead to conflicting results. Actually, the two forms can be transformed one into the other. The Stratonovich definition of an SDE
$$dX_t = A(t, x)\, dt + B(t, x) \circ dW_t, \tag{53}$$
can be shown to be equivalent to the Itô SDE [13,15]
$$dX_t = A(t, x)\, dt + \frac{1}{2} B(t, x) \frac{\partial B(t, x)}{\partial x}\, dt + B(t, x)\, dW_t. \tag{54}$$
The difference between the two definitions is therefore a mean drift term and is not ‘negligible’. This illustrates and further stresses that, even if one is not interested in mathematical subtleties, a careful definition and at least some understanding of what these definitions embody is unavoidable. The best illustration of such pitfalls is perhaps numerical schemes for the integration of the trajectories of the process in practical computations, see Section 2.10.

Finally, for a stochastic process $X_t$ whose trajectories satisfy stochastic differential equations in the Stratonovich sense, it can be seen from the correspondence with an Itô form, Eq. (54), and the Fokker–Planck equation verified for diffusion processes in the Itô sense, Eq. (40), that the pdf of $X_t$ is the solution of
$$\frac{\partial p}{\partial t} = -\frac{\partial [A(t, x)\, p]}{\partial x} + \frac{1}{2} \frac{\partial}{\partial x} \left( B(t, x) \frac{\partial [B(t, x)\, p]}{\partial x} \right). \tag{55}$$

2.7. Stochastic calculus

Most of the strangeness of stochastic processes and of SDEs is embodied in stochastic calculus. Although surprising at first sight, the differences with ordinary differential rules are not too difficult to grasp. They stem from the irregular behaviour of the trajectories of the Wiener process $W_t$. Indeed, we have seen that on a small time increment $dt$ the variance of the increments of the Wiener process, $\langle (dW_t)^2 \rangle$, is linear in $dt$ (in fact, it is equal to $dt$). This already contradicts the ‘normal’ calculus result which says that the square of an increment should be of order $(dt)^2$. The explanation is that the ‘correct’ behaviour is expected for a differentiable process (a process whose trajectories are differentiable) while the Wiener process is precisely not differentiable. As a consequence, normal calculus rules must be modified by going over to the second-order derivatives, which in normal cases give only terms of order $(dt)^2$ but will bring a first-order contribution in our case. To illustrate this, we consider an SDE defined in the Itô sense
$$dX_t = A(t, x)\, dt + B(t, x)\, dW_t \tag{56}$$
and we want to derive the SDE verified by a function $g(t, X_t)$ of the stochastic process $X_t$. The rule of thumb is thus to write the Taylor series up to the second order and not to forget the contribution that arises from the term involving $(dW_t)^2$ whose mean is $dt$. The result is Itô's formula
$$\begin{aligned} dg(t, X_t) = {} & \frac{\partial g}{\partial t}(t, X_t)\, dt + A(t, X_t) \frac{\partial g}{\partial x}(t, X_t)\, dt + B(t, X_t) \frac{\partial g}{\partial x}(t, X_t)\, dW_t \\ & + \frac{1}{2} B^2(t, X_t) \frac{\partial^2 g}{\partial x^2}(t, X_t)\, dt, \end{aligned} \tag{57}$$
where the last term on the second line is the ‘new term’ with respect to classical calculus, which would have produced only the first line. On the other hand, the choice of the Stratonovich definition leads to calculus rules which are identical to classical ones [13,15]. However, this nice point is offset by the difficulty in manipulating the stochastic integral, and Itô's simple properties, Eqs. (50)–(51), are no longer valid.
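A short worked example may help to see the extra term of Eq. (57) at work. It is added here as an illustration and takes the simplest possible case, chosen for convenience: the Wiener process itself ($A = 0$, $B = 1$, $X_t = W_t$ with $W_0 = 0$) and the function $g(x) = x^2$.

```latex
\begin{align*}
g(x) = x^2: \quad
  & \frac{\partial g}{\partial t} = 0, \qquad
    \frac{\partial g}{\partial x} = 2x, \qquad
    \frac{\partial^2 g}{\partial x^2} = 2 \\
\text{Eq. (57) with } A = 0,\ B = 1: \quad
  & d(W_t^2) = 2 W_t\, dW_t + dt \\
\text{integrating from } 0 \text{ to } t: \quad
  & \int_0^t W_s\, dW_s = \tfrac{1}{2}\left(W_t^2 - t\right) \\
\text{taking the mean and using Eq. (50)}: \quad
  & \langle W_t^2 \rangle = t
\end{align*}
```

Classical calculus would give $d(W_t^2) = 2 W_t\, dW_t$ only; the additional $dt$ is what makes the result consistent with the covariance $C(t, t) = t$ of Eq. (28).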
2.8. Langevin and Fokker–Planck equations

Once stochastic calculus has been defined and the meaning of SDEs has been given, the picture is complete. We can state what is in fact the main point of this whole section: when dealing with stochastic processes there are two ways to characterize their properties, the time-evolution equation of the trajectories of the process or the equation satisfied in sample space by its pdf. This correspondence is particularly clear for diffusion processes and is central in the present paper. We use this summary as an opportunity to write results in the multi-dimensional case. If $\mathbf{Z}(t) = (Z_1, \ldots, Z_n)$ is a diffusion process with a vector drift $\mathbf{A} = (A_i)$ and a diffusion matrix $\mathbf{B} = (B_{ij})$, the trajectories of the process are solutions of the following SDE
$$dZ_i = A_i(t, \mathbf{Z}(t))\, dt + B_{ij}(t, \mathbf{Z}(t))\, dW_j, \tag{58}$$
where $\mathbf{W}_t = (W_1, \ldots, W_n)$ is a set of independent Wiener processes. The SDEs are called Langevin equations in the physical literature. This corresponds in sample space to the Fokker–Planck equation for the transitional pdf written $p(t, \mathbf{z}|t_0, \mathbf{z}_0)$
$$\frac{\partial p}{\partial t} = -\frac{\partial [A_i(t, \mathbf{z})\, p]}{\partial z_i} + \frac{1}{2} \frac{\partial^2 [(\mathbf{B}\mathbf{B}^T)_{ij}(t, \mathbf{z})\, p]}{\partial z_i\, \partial z_j}, \tag{59}$$
where $\mathbf{B}^T$ is the transpose matrix of $\mathbf{B}$. Actually, the correspondence between the two points of view is not a strict equivalence. Indeed, the matrix $\mathbf{D}$ that enters the Fokker–Planck equation is related to the diffusion matrix $\mathbf{B}$ of the SDEs by $\mathbf{D} = \mathbf{B}\mathbf{B}^T$. Since the decomposition of a given positive-definite matrix $\mathbf{D}$ is not always unique, there may exist several choices for the diffusion matrix $\mathbf{B}$. Therefore, we can have different models for the trajectories that still correspond to the same transitional pdf. In other words, there is more information in the trajectories of a diffusion process than in the solution of the Fokker–Planck equation. However, since we are in the present work interested mainly in statistics extracted from the stochastic process, or in a weak approach (in the sense already used in Section 2.1.2), we can consider that the different models for the trajectories belong to the same class and then speak of the equivalence between SDEs and Fokker–Planck equations.

2.9. The probabilistic interpretation of PDEs

The equivalence between the trajectory and the pdf points of view is also the basis of the probabilistic interpretation of some PDEs. The starting idea is to interpret the solution of a PDE as a function or a functional of some stochastic process. Instead of solving the PDE by classical numerical methods, the idea is then to simulate directly the trajectories of the process and to obtain the solution by some sort of averaging operation. This methodology can be applied to a large variety of PDEs (see [14,17]). We limit ourselves to the case of parabolic equations, and the relation between stochastic processes and PDEs of convection–diffusion type is simply the relation between the two definitions of a diffusion process. The probabilistic interpretation reverses that point of view and regards a convection–diffusion PDE as a kind of Fokker–Planck equation. For example, the solution of the problem
$$\frac{\partial u}{\partial t} = -\frac{\partial [A(t, x)\, u]}{\partial x} + \frac{1}{2} \frac{\partial^2 [B^2(t, x)\, u]}{\partial x^2}, \qquad u(0, x) = h(x) \text{ when } t = 0, \tag{60}$$
can be built from the transitional pdf of the diffusion process $X_t$, $p(t, x|t_0, x_0)$, as
$$u(t, x) = \int p(t, x|t_0, x_0)\, h(x_0)\, dx_0, \tag{61}$$
where $A(t, x)$ and $B(t, x)$ are, respectively, the drift and diffusion coefficients of the process $X_t$. Therefore, in physical terms, $X_t$ appears as the propagator of the initial function $h(x_0)$. Or in other words, $X_t$ is the carrier of the information. At the initial time, particles start at $x_0$ with an ‘information’ that is $h(x_0)$. Then they follow the SDE
$$dX_t = A(t, x)\, dt + B(t, x)\, dW_t. \tag{62}$$
As a consequence of this motion, information is carried from the initial state $(t_0, x_0)$ to another one $(t, x)$. The average result is then the solution of the PDE which is of convection–diffusion type. It is seen that the diffusion term in the PDE reflects in fact the fast and random motion expressed by the Wiener process, $dW_t$, in the ‘particle’ evolution equation. Conversely, when particles undergo a random walk, the result of their mixing is to produce a diffusion in space. Then, in practical simulations, any statistics that are continuously obtained from the pdf of the process can be approximated, at a given time $t$, from an ensemble of realizations of the process by the Monte Carlo evaluation
$$\langle f(X_t) \rangle \simeq \frac{1}{N} \sum_{i=1}^{N} f(X_t^i). \tag{63}$$
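The Monte Carlo evaluation of Eq. (63) can be sketched in a few lines. The example below is a hedged illustration, not a reference implementation: the trajectories are advanced with the simple Euler discretization presented in Section 2.10, and the test case (linear drift $A = -x$, constant diffusion $B = 0.5$, all particles starting at $x_0 = 1$) is a hypothetical choice for which the exact mean $x_0 e^{-t}$ is a standard result.

```python
import math
import random


def euler_ensemble(drift, diffusion, x0, t_end, n_steps, n_paths, seed=3):
    """Advance n_paths independent realizations of
    dX = drift(t, X) dt + diffusion(t, X) dW with the Euler scheme
    and return their values at t_end."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    sqrt_dt = math.sqrt(dt)
    finals = []
    for _ in range(n_paths):
        x, t = x0, 0.0
        for _ in range(n_steps):
            x += drift(t, x) * dt + diffusion(t, x) * sqrt_dt * rng.gauss(0.0, 1.0)
            t += dt
        finals.append(x)
    return finals


def monte_carlo_mean(f, samples):
    """Ensemble average (1/N) * sum_i f(X_t^i), cf. Eq. (63)."""
    return sum(f(x) for x in samples) / len(samples)


# Hypothetical test case: A(t, x) = -x, B(t, x) = 0.5, X_0 = 1, t = 1.
samples = euler_ensemble(lambda t, x: -x, lambda t, x: 0.5,
                         x0=1.0, t_end=1.0, n_steps=100, n_paths=5000)
estimate = monte_carlo_mean(lambda x: x, samples)
print(estimate, math.exp(-1.0))  # Monte Carlo estimate vs exact mean x0 * exp(-t)
```

Two error sources are mixed in such an estimate: the statistical error, which decreases as $N^{-1/2}$, and the time-discretization error of the scheme.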
For a stochastic process when statistics are required at various times, the different realizations at time $t$ are simply provided by the values at the corresponding time of the trajectories of the process. Indeed, it is now clear that the trajectory point of view consists in practice in following in time a number of trajectories, that we can write as $X^i(t) = X(t, \omega_i)$ for a certain number of possible events represented here by $\omega_i$. In other words, simulating stochastic processes from the trajectory point of view corresponds to performing Monte Carlo integration at each time. One speaks then of the Monte Carlo integration of partial differential equations.

2.10. A word on numerical schemes

The problem of how to devise accurate numerical schemes for the integration of SDEs is a difficult issue, and also a recent concern. This is the subject of current research [18,19]. The detailed presentation of current state-of-the-art proposals is not within the scope of the present paper and we limit ourselves to the main points that also illustrate the notions put forward in the previous sections.

Compared to similar numerical schemes that are now well established for ordinary differential equations, the question of the consistency of stochastic numerical schemes must be carefully analysed. Actually, most of the difficulties arise from a lack of understanding of the exact definition of the stochastic integral, see Section 2.7. Numerical schemes, as well as the manipulation of a function of the stochastic process $X_t$, can only be handled after an interpretation of the stochastic integral has been chosen. If one has chosen the Itô interpretation, then it is implicitly assumed that the discretization of $B(t, x)$ should not anticipate the future. As a result, Runge–Kutta schemes cannot be applied directly. More precisely, careless applications of high-order Runge–Kutta schemes can introduce spurious drifts which may not be easy to detect. For the Langevin equation
$$dX_t = A(t, X_t)\, dt + B(t, X_t)\, dW_t, \tag{64}$$
the Euler scheme is the simplest choice and is written as
$$X^i(t + \Delta t) - X^i(t) = A(t, X^i(t))\, \Delta t + B(t, X^i(t))\, \Delta W_t, \tag{65}$$
where the random term $\Delta W_t$ is expressed as $\sqrt{\Delta t} \times e$, $e$ being a value sampled from a normalized Gaussian random variable, independently at each time step and for each trajectory.

A rather illuminating example of typical pitfalls is seen if one tries to apply directly the well-known predictor–corrector scheme. This is a two-step scheme with the Euler scheme acting as a predictor
$$\tilde{X}^i(t + \Delta t) - X^i(t) = A(t, X^i(t))\, \Delta t + B(t, X^i(t))\, \Delta W_t, \tag{66a}$$
$$X^i(t + \Delta t) - X^i(t) = \frac{1}{2} \left( A(t, X^i(t)) + A(t + \Delta t, \tilde{X}^i(t + \Delta t)) \right) \Delta t + \frac{1}{2} \left( B(t, X^i(t)) + B(t + \Delta t, \tilde{X}^i(t + \Delta t)) \right) \Delta W_t. \tag{66b}$$
Yet, a series expansion of this scheme in time reveals that, due to the first-order behaviour of $(\Delta W_t)^2$ in time, the corresponding differential equation turns out to be Eq. (54) rather than the Itô SDE which is here Eq. (64). In other words, the predictor–corrector scheme is consistent with the Stratonovich interpretation of SDEs, Eq. (53), but not (in general) with the Itô interpretation. Therefore, if the Itô interpretation has been chosen and the stochastic integrals are manipulated using the simple Itô's rules (see Section 2.6), the scheme is not consistent. The key point here is that the numerical discretization must be in line with the mathematical definition of the stochastic terms. To stay on somewhat safer grounds, one can stick to the Euler scheme or pay enough attention to the validity of the numerical schemes.

After consistency is checked, the quality of schemes must be measured to analyse how they actually approximate solutions, and for this the notion of order of convergence must be properly defined. For stochastic processes, various definitions can be adopted which mirror the different ways random variables may converge to a limit random variable, see Section 2.1.2. One can define a strong order of convergence and a weak order of convergence. Let us consider a numerical approximation of the process $X_t$ obtained with a finite time step $\Delta t$, called $X_t^{\Delta t}$. On the one hand, the numerical scheme will have a strong order of convergence $m$ if at a time $t_{max}$ we have that
$$\left\langle |X_{t_{max}} - X_{t_{max}}^{\Delta t}|^2 \right\rangle^{1/2} \le C (\Delta t)^m. \tag{67}$$
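The root-mean-square error of Eq. (67) can be estimated numerically, which also makes the consistency pitfall of the predictor–corrector scheme visible. The sketch below is an illustration under stated assumptions: the test problem is the hypothetical linear SDE with $A = \mu x$ and $B = \sigma x$, whose closed-form solutions in the Itô sense, $x_0 \exp[(\mu - \sigma^2/2) t + \sigma W_t]$, and in the Stratonovich sense, $x_0 \exp[\mu t + \sigma W_t]$, are standard results not derived in the text. Parameter values, names and sample sizes are arbitrary.

```python
import math
import random


def strong_errors(mu=0.1, sigma=0.8, x0=1.0, t_max=1.0,
                  n_steps=200, n_paths=2000, seed=4):
    """RMS errors at t_max, in the sense of Eq. (67), for the Euler and the
    predictor-corrector schemes applied to dX = mu X dt + sigma X dW."""
    rng = random.Random(seed)
    dt = t_max / n_steps
    sq = {"euler_ito": 0.0, "pc_ito": 0.0, "pc_strato": 0.0}
    for _ in range(n_paths):
        x_euler = x_pc = x0
        w = 0.0
        for _ in range(n_steps):
            dw = math.sqrt(dt) * rng.gauss(0.0, 1.0)
            # Euler scheme, Eq. (65)
            x_euler += mu * x_euler * dt + sigma * x_euler * dw
            # Predictor step, Eq. (66a), then corrector step, Eq. (66b)
            x_tilde = x_pc + mu * x_pc * dt + sigma * x_pc * dw
            x_pc += (0.5 * mu * (x_pc + x_tilde) * dt
                     + 0.5 * sigma * (x_pc + x_tilde) * dw)
            w += dw
        # Closed-form solutions driven by the same Wiener path (assumed known).
        exact_ito = x0 * math.exp((mu - 0.5 * sigma ** 2) * t_max + sigma * w)
        exact_strato = x0 * math.exp(mu * t_max + sigma * w)
        sq["euler_ito"] += (x_euler - exact_ito) ** 2
        sq["pc_ito"] += (x_pc - exact_ito) ** 2
        sq["pc_strato"] += (x_pc - exact_strato) ** 2
    return {key: math.sqrt(value / n_paths) for key, value in sq.items()}


errors = strong_errors()
print(errors)
```

In this test, the predictor–corrector result is expected to stay close to the Stratonovich solution and to show an error against the Itô solution that does not shrink when the time step is refined, whereas the Euler error against the Itô solution decreases with the time step.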
On the other hand, the numerical scheme will have the weak order of convergence $m$ if at time $t_{max}$ we have that
$$\left| \langle f(X_{t_{max}}) \rangle - \langle f(X_{t_{max}}^{\Delta t}) \rangle \right| \le C (\Delta t)^m \tag{68}$$
for all su@ciently smooth functions f. For example, the Euler scheme has a strong order of convergence m = 1=2 but a weak order of convergence m = 1. As already explained before, since we are mainly interested in approximating various statistics of single- and two-phase ows the natural notion is the notion of weak convergence. 3. Hierarchy of pdf descriptions Most of the necessary mathematical elements concerning stochastic di8erential equations have been given in the preceding section. For our purposes, attention has been focused on Markovian processes, and more speciAcally on a particular subset of Markovian processes, di8usion processes. These processes will be used as building blocks, Arst in turbulent single-phase ow modelling in Section 6 and then in turbulent two-phase ow modelling in Sections 7 and 8. Up to now, emphasis has been mainly put on the mathematical characteristics of di8usion processes rather than on their application for physical purposes. Such an application requires further analysis and discussion. Indeed, even in the multi-dimensional case where the stochastic process Z(t) is a vector of d real stochastic processes Z(t)=(Z1 (t); : : : ; Zd (t)), the selection of variables that make up the stochastic process Z(t) in a practical case, its dimension d, and the choice of the evolution equation (through the drift and di8usion coe@cients), were not discussed and were considered as given. However, a pdf description appears in a closed form only when: (i) the stochastic process Z(t) is chosen, (ii) the model for its time evolution equation is speciAed. The form and the nature of the di8erent models used for two-phase ow modelling will be presented in detail in later sections. In the present one, we discuss issues related to the choice of
J.-P. Minier, E. Peirano / Physics Reports 352 (2001) 1–214
29
the stochastic process Z(t) which is used to describe a physical system. By considering di8erent stochastic descriptions, either Aner or coarser, di8erent pdf equations result. It is important to be aware of the interplay between the di8erent and increasingly coarser descriptions and the structure of the corresponding reduced pdf equations. In practice, the cornerstone of the contracted description is, of course, to be able to choose the ‘correct’ reduced number of variables, which must be small enough to make up a tractable system while still capturing the essence of the physics of the problem. The discussion on how to perform such a choice in some cases is postponed to the next section. In the present one, we limit ourselves to the technical presentation of this interplay which manifests itself by various pdf hierarchies. These hierarchies will be referred to continuously in the rest of the paper. The general issue of a pdf hierarchy is Arst presented, and is then illustrated by two examples. The Arst hierarchy is very well known in Statistical Physics. However, the second hierarchy is not often described, though it is of the same nature. Both hierarchies appear constantly in the modelling considerations later on. 3.1. Complete and reduced pdf equations Numerous physicals situations fall into the category of what is called N -body problems. That is, we have N objects, identical or not, which interact mutually. This situation can be loosely referred to as a N -particle problem by deAning each ‘object’ as being a particle. This terminology will be retained here. In this general approach, each particle represents the particular value of a set of variables and is fully determined by the knowledge of these ‘internal’ variables. A classical example is molecular dynamics problems, where each particle represents a molecule and can be thought of as a point particle deAned by the value of its location and velocity. 
In other cases, the knowledge of the state of each particle may require more variables. The way these N particles interact and influence one another is considered to be known when the state of the N particles is known, that is, the mutual forces are internal with respect to the whole system made up by the ensemble of the N particles. The dimension of the system (or the number of degrees of freedom), $d = \dim(\mathbf{Z})$, is given by $d = N \times p$, where N is the number of particles included in the system and p represents the number of variables attached to each particle. For this system, the complete vector which gathers all available information is then

$$\mathbf{Z} = (Z_1^1, Z_2^1, \ldots, Z_p^1;\; Z_1^2, Z_2^2, \ldots, Z_p^2;\; \ldots;\; Z_1^N, Z_2^N, \ldots, Z_p^N).$$

This vector is the state vector of the N-particle system. The vector defined by the p variables attached to each particle, $\mathbf{Z}^i = (Z_1^i, Z_2^i, \ldots, Z_p^i)$, is called the one-particle state vector, in this case for the particle labelled i. In practice the dimension of the system is huge (it might be infinite) and one has to come up with a reduced (or contracted) description, or in other words to consider a subset of dimension $d' = s \times p \ll d$. Such a reduced description is needed to achieve a practical formulation of the behaviour of the system, that is to formulate a set of equations in closed form which can be solved numerically with the help of modern computer technology. The key point is that, in the general case, such a contraction is accompanied by a loss of information, and that knowledge of higher-order pdfs has to be provided through closure relations.
To illustrate this problem, let us consider an N-particle system where the time evolution equation simply involves a deterministic force,

$$\frac{d\mathbf{Z}(t)}{dt} = \mathbf{A}(t, \mathbf{Z}(t)). \tag{69}$$

The dimension of the complete state vector Z is equal to d, and the corresponding pdf $p(t;\mathbf{z})$ satisfies the Liouville equation

$$\frac{\partial p(t;\mathbf{z})}{\partial t} + \frac{\partial}{\partial \mathbf{z}}\big(\mathbf{A}(t,\mathbf{z})\,p(t;\mathbf{z})\big) = 0. \tag{70}$$

This equation is closed since all the degrees of freedom of the system are explicitly tracked. We now consider a reduced pdf $p^r(t;\mathbf{z}^r)$, where $\dim(\mathbf{Z}^r) = d'$ and $p(t;\mathbf{z}) = p(t;\mathbf{z}^r,\mathbf{y})$ with, of course, $\dim(\mathbf{Y}) = d - d'$. By integrating the previous equation over y, the transport equation for the marginal (reduced) pdf becomes

$$\frac{\partial p^r(t;\mathbf{z}^r)}{\partial t} + \frac{\partial}{\partial \mathbf{z}^r}\big[\langle \mathbf{A}\,|\,\mathbf{z}^r\rangle\, p^r(t;\mathbf{z}^r)\big] = 0, \tag{71}$$

where the conditional expectation is defined by

$$\langle \mathbf{A}\,|\,\mathbf{z}^r\rangle = \int \mathbf{A}(t,\mathbf{z}^r,\mathbf{y})\,p(\mathbf{y}\,|\,t;\mathbf{z}^r)\,d\mathbf{y} = \frac{1}{p^r(t;\mathbf{z}^r)}\int \mathbf{A}(t,\mathbf{z}^r,\mathbf{y})\,p(t;\mathbf{z}^r,\mathbf{y})\,d\mathbf{y}. \tag{72}$$

Eq. (71) is now unclosed. This illustrates the fact that when a reduced description (in terms of a subset of degrees of freedom) is performed, information is lost, and one has to come up with a closure equation for higher-order pdfs. We have moved from a complete description, and therefore a closed pdf equation, Eq. (70), to a contracted description and thus an unclosed pdf equation, Eq. (71).

At this point, two sets of reduced descriptions can be chosen in the N-particle example, by varying either the number of particles retained in the state vector of the reduced system or the number of variables attached to each particle. The first one corresponds to the classical BBGKY hierarchy (the initials are those of the authors who derived it independently: Bogoliubov, Born, Green, Kirkwood and Yvon) encountered in kinetic theory (p = 2), and is fully described in textbooks, for example [20,21]. In the second one, the dimension of the state vector is addressed from a single-particle point of view, s = 1.

3.2. BBGKY hierarchy

Classical mechanical questions are well represented by N-particle deterministic problems, involving N particles of identical mass m in mutual interaction and with no external forces. The dimension of the one-particle state vector is, almost always, taken as two, including particle location and velocity. This is a consequence of the search for a kinetic description and of the hypothesis that forces derive from a location-dependent potential. Consequently, the dimension of the complete state vector is $d = 2 \times N$. The drift vector is $\mathbf{A} = (\mathbf{U}, \mathbf{F})$, where the mutual acceleration, taken in the direction $\mathbf{x}_i - \mathbf{x}_j$, is denoted $\mathbf{F}_{ij}$ and is given in terms of a potential $\phi_{ij} = \phi(|\mathbf{x}_i - \mathbf{x}_j|)$, which is the mutual potential energy of the pair of particles (i, j). Therefore $m\mathbf{F}_{ij} = -\partial\phi_{ij}/\partial\mathbf{x}_i$ represents the force on particle i due to particle j. In the classical mechanical framework, a reduced description is meant as a description of the system using identical variables for each particle but using only a subset of the total number of particles. The reduced pdf for a subset of
s particles, $p^s(t; \mathbf{y}_1, \mathbf{V}_1, \ldots, \mathbf{y}_s, \mathbf{V}_s)$, is written for the sake of simplicity as $p^s(t; 1, \ldots, s) = p^s$, and consequently the integration element $d\mathbf{y}_s\,d\mathbf{V}_s$ reads $d s$. Integration of the Liouville equation yields (summation over the index i should not be confused with tensor notation, as it runs over the particles in the subset, that is from 1 to s, with $1 \le s \le N$)

$$L_s(p^s) + \frac{\partial}{\partial \mathbf{V}_i}\int \sum_{j>s} \mathbf{F}_{ij}\, p^N \, d(s+1)\ldots dN = 0, \tag{73}$$

where the $L_s$ operator is given by

$$L_s(\cdot) = \frac{\partial\,\cdot}{\partial t} + \frac{\partial}{\partial \mathbf{y}_i}(\mathbf{V}_i\,\cdot) + \frac{\partial}{\partial \mathbf{V}_i}\Big(\sum_{j=1}^{s}\mathbf{F}_{ij}\,\cdot\Big). \tag{74}$$

Eq. (73) has been obtained by applying the correspondence between a diffusion process and its Fokker–Planck equation, and more specifically between a deterministic process and its Liouville equation. It can also be derived using Classical Mechanics, i.e. the properties of the Liouville operator, Liboff [21], or the Hamiltonian, Balescu [22]. Noticing that (by permutation and variable changes)

$$\int \sum_{j>s} \mathbf{F}_{ij}\, p^N \, d(s+1)\ldots dN = (N-s)\int \mathbf{F}_{i(s+1)}\, p^{s+1}\, d(s+1), \tag{75}$$

the following set of equations is obtained:

$$L_s(p^s) + (N-s)\,\frac{\partial}{\partial \mathbf{V}_i}\int \mathbf{F}_{i(s+1)}\, p^{s+1}\, d(s+1) = 0, \tag{76}$$

which is a set of N coupled equations and is often called the BBGKY hierarchy. This simply states that, for a deterministic ensemble of N particles, a contracted description of the system gives an unclosed equation for the reduced pdf, as illustrated by Eq. (76). For s = 1 (one-point pdf), one recognizes the kinetic equation, which involves the two-point pdf, and so on. At this point, it should be mentioned that, in the case of mutual interactions given by a simple potential, it was quite straightforward to illustrate the hierarchy of pdfs. However, in the case of discrete particles (or even fluid particles) carried by a turbulent fluid, for example, the expression of the force exerted on a particle does not exhibit a simple analytical form, as it depends simultaneously on all other particles. In this case, the hierarchy problem is given by Eq. (71). Finally, this type of hierarchy is not specific to the pdf approach but is typical, in general, of problems where a reduction is made, as for example in the case of the Reynolds decomposition of the local instantaneous Navier–Stokes equations.

3.2.1. Normalization of the distribution function

In the previous approach, a pdf, $p(t;\mathbf{x})$, has been used: $p(t;\mathbf{x})\,d\mathbf{x}$ is in fact the probability to find the system (the N particles) in a given state in the range $[\mathbf{x}, \mathbf{x} + d\mathbf{x}]$, cf. Section 2.2 (this can be understood more easily using the notion of an ensemble density function, introduced by Gibbs, cf. e.g. [21]). The marginal $p^s$ then represents the probability to find the reduced system (s particles) in a given state.
In many applications, as will be seen later, it is convenient to work with the s-tuple distribution function, $f^s(t;1,\ldots,s)$, where $f^s(t;1,\ldots,s)\,d1\ldots ds$ represents the probable number of s-tuples in a given state in the range $[1, 1+d1], \ldots, [s, s+ds]$ at time t. The relation between $p^s$ and $f^s$ is directly given by combinatorics, that is by the number of ways of taking s elements from a population of N elements, without replacement and with regard to order. The answer is the falling factorial $(N)_s = N!/(N-s)!$, that is

$$f^s(t;1,\ldots,s) = \frac{N!}{(N-s)!}\,p^s(t;1,\ldots,s). \tag{77}$$

Normalization is given by

$$\int f^s(t;1,\ldots,s)\,d1\ldots ds = \frac{N!}{(N-s)!}\int p^s(t;1,\ldots,s)\,d1\ldots ds = \frac{N!}{(N-s)!}, \tag{78}$$

and with $r < s \le N$, the r- and s-tuple distribution functions satisfy the following relation:

$$f^r(t;1,\ldots,r) = \frac{(N-s)!}{(N-r)!}\int f^s(t;1,\ldots,s)\,d(r+1)\ldots ds. \tag{79}$$

The BBGKY hierarchy, Eq. (76), can be written in a slightly different form:

$$L_s(f^s) + \frac{\partial}{\partial \mathbf{V}_i}\int \mathbf{F}_{i(s+1)}\, f^{s+1}\, d(s+1) = 0. \tag{80}$$
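The combinatorial relations of Eqs. (77)–(79) can be checked on a toy system. The sketch below is only illustrative: it assumes N = 5 statistically independent particles that each occupy one of three discrete states, so that integrals become sums and $p^s$ factorizes. The names `P1`, `p_s`, `f_s` and `ordered_tuples` are hypothetical helpers introduced here and do not come from the paper.

```python
from itertools import product
from math import factorial, isclose

N = 5                              # total number of particles (illustrative)
STATES = (0, 1, 2)                 # toy discrete one-particle states
P1 = {0: 0.5, 1: 0.3, 2: 0.2}      # hypothetical one-particle pdf, sums to 1


def ordered_tuples(n, s):
    """Number of ordered s-tuples drawn without replacement: n!/(n-s)!."""
    return factorial(n) // factorial(n - s)


def p_s(*zs):
    """Reduced pdf p^s; it factorizes because the toy particles are independent."""
    value = 1.0
    for z in zs:
        value *= P1[z]
    return value


def f_s(*zs):
    """s-tuple distribution function of Eq. (77): f^s = N!/(N-s)! * p^s."""
    return ordered_tuples(N, len(zs)) * p_s(*zs)


# Eq. (78) for s = 2: the sum (discrete integral) of f^2 equals N!/(N-2)!.
norm_f2 = sum(f_s(a, b) for a, b in product(STATES, repeat=2))

# Eq. (79) for r = 1, s = 2: f^1 is recovered from f^2 by summing over particle 2.
prefactor = factorial(N - 2) / factorial(N - 1)
f1_from_f2 = {a: prefactor * sum(f_s(a, b) for b in STATES) for a in STATES}

print(norm_f2, ordered_tuples(N, 2))
print(all(isclose(f1_from_f2[a], f_s(a)) for a in STATES))
```

The independence assumption is made only to keep the example short; Eqs. (77)–(79) themselves do not require it.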
3.3. Hierarchy between state vectors

The BBGKY hierarchy gives a comprehensive picture of the resulting modelling problem in the frame of Classical Mechanics. The issue is now to express the statistical effect of all the disregarded particles on the statistical properties of the small number (usually one or two) of particles that are kept in the state vector. In this hierarchy, the choice of the one-particle state vector and its dimension, here p = 2, remains unchanged. However, in different situations, various choices can be made for the one-particle state vector, and it is useful to consider a second set of pdf equations which corresponds to different and increasingly large one-particle state vectors. This happens already when we consider an N-particle problem where the force acting on one particle due to the other ones can be any function of particle properties, for example a function of particle acceleration or other ‘internal’ particle properties. It is therefore important to express also the interplay between the choice of the one-particle state vector and the structure of the corresponding pdf equation, even when a given subset of s particles is considered.

There is another strong justification for considering this second pdf hierarchy with respect to modelling purposes. Indeed, to obtain a closed pdf equation at some chosen level, a model must be introduced to simulate the behaviour of the degrees of freedom that are summed over. As will be explained in more detail in the following section in the case of white-noise terms, it is important to select the ‘correct’ variable that can be well modelled by a certain stochastic process. A very precise example of this choice will be given by the choice of the variable to model in the one-point particle pdf for two-phase flows, see Section 7.

The BBGKY hierarchy was presented using a top-down approach, that is starting from the complete Liouville equation and deriving from it the different reduced descriptions. The
hierarchy between state vectors will be presented here from a bottom-up approach, starting from the most reduced level, moving to higher levels and introducing modelling concerns. We consider only one particle (s = 1), and follow a presentation based on the historical case of a Brownian particle that will be taken up again in the next section. First of all, we can restrict ourselves to following the position of the particle (that was Einstein’s point of view with time steps that are large enough, see the next section). With that choice of the state vector, $\mathbf{Z}(t) = (\mathbf{X}(t))$, the particle velocity is an external variable and the pdf equation for $p(t;\mathbf{y})$ is unclosed:

$$\frac{\partial p(t;\mathbf{y})}{\partial t} + \frac{\partial}{\partial \mathbf{y}}\big(\langle \mathbf{U}\,|\,\mathbf{y}\rangle\, p(t;\mathbf{y})\big) = 0. \tag{81}$$

To obtain a closed model, the effect of the particle velocity has to be replaced by a model,

$$\frac{dX^+(t)}{dt} = U^+(t) \;\Rightarrow\; \frac{dX(t)}{dt} = F[t; X(t)], \tag{82}$$

where the superscript + denotes the exact equation and $F[t; X(t)]$ represents a functional of the position X(t). If the functional F is deterministic we end up with a reduced Liouville equation. However, if F is stochastic, the techniques of Section 2 may be applied.

If this first picture is believed to be too crude, one can include the velocity of the particle in the state vector, which then becomes $\mathbf{Z}(t) = (\mathbf{X}(t), \mathbf{U}(t))$ (Langevin’s point of view). Now, the particle acceleration A(t) becomes an external variable and the corresponding pdf equation for $p(t;\mathbf{y},\mathbf{V})$ is unclosed:

$$\frac{\partial p(t;\mathbf{y},\mathbf{V})}{\partial t} + \frac{\partial (V_i\, p(t;\mathbf{y},\mathbf{V}))}{\partial y_i} + \frac{\partial}{\partial V_i}\big(\langle A_i\,|\,\mathbf{y},\mathbf{V}\rangle\, p(t;\mathbf{y},\mathbf{V})\big) = 0. \tag{83}$$

To obtain a closed form, the acceleration has to be eliminated or replaced by a model:

$$\begin{cases} \dfrac{dX^+(t)}{dt} = U^+(t) \\[2mm] \dfrac{dU^+(t)}{dt} = A^+(t) \end{cases} \;\Rightarrow\; \begin{cases} \dfrac{dX(t)}{dt} = U(t) \\[2mm] \dfrac{dU(t)}{dt} = F[t; X(t), U(t)]. \end{cases} \tag{84}$$

It is thus clear that the second description encompasses the first one. It contains more information and, in physical terms, corresponds to a description performed with a finer resolution. From a modelling point of view, the task is also different depending upon the choice of the one-particle state vector. In the first case (Einstein’s point of view), one has to model particle velocities. In the second case (Langevin’s point of view), one has to model particle accelerations.

From the above example, a general picture emerges. We consider a one-particle reduced description (s = 1) but with many internal degrees of freedom, i.e. $\mathbf{Z}^1 = (Z_1^1, Z_2^1, \ldots, Z_p^1, \ldots)$. The complete one-particle state vector is written here for a particle labelled i = 1, but in a one-particle pdf description the label is irrelevant (the same would be valid for any particle i) and the superscript is therefore dropped in the following. If the time rate of change of the particle degrees of freedom has the following form:

$$\frac{dZ_1}{dt} = g(t; Z_1, Z_2), \tag{85a}$$

$$\frac{dZ_2}{dt} = g(t; Z_1, Z_2, Z_3), \tag{85b}$$

$$\vdots \tag{85c}$$

$$\frac{dZ_p}{dt} = g(t; Z_1, \ldots, Z_p, Z_{p+1}), \tag{85d}$$

$$\vdots \tag{85e}$$

and if the chosen one-particle reduced state vector contains only a limited number of degrees of freedom, say p, $\mathbf{Z}^r = (Z_1, \ldots, Z_p)$, then the corresponding pdf equation for $p^r(t; z_1, z_2, \ldots, z_p)$ is unclosed since it involves an external variable, namely $Z_{p+1}$:

$$\frac{\partial p^r}{\partial t} + \frac{\partial (g(t; z_1, z_2)\,p^r)}{\partial z_1} + \cdots + \frac{\partial (g(t; z_1, \ldots, z_p)\,p^r)}{\partial z_{p-1}} + \frac{\partial \big(\langle g(t; Z_1, \ldots, Z_p, Z_{p+1})\,|\,\mathbf{Z}^r = \mathbf{z}^r\rangle\, p^r\big)}{\partial z_p} = 0. \tag{86}$$

To obtain a closed model, the external variable $Z_{p+1}$ must be expressed as a function of the variables contained in the chosen state vector. The equations for the modelled system then have the following form, with a model written $g_m$ for the time rate of change of $Z_p$:

$$\frac{dZ_1}{dt} = g(t; Z_1, Z_2), \tag{87a}$$

$$\frac{dZ_2}{dt} = g(t; Z_1, Z_2, Z_3), \tag{87b}$$

$$\vdots \tag{87c}$$

$$\frac{dZ_p}{dt} = g_m(t; Z_1, \ldots, Z_p). \tag{87d}$$
4. Stochastic diffusion processes for modelling purposes

The purpose of the present section is to show how stochastic processes can be used in applied situations for modelling issues. Indeed, we have seen in the previous section that the practical need to limit ourselves to reduced descriptions results in unclosed pdf equations. To obtain closed equations, the disregarded degrees of freedom may be replaced by stochastic models. The objective in this section is to clarify what is meant when a stochastic process is written to replace a real physical process. This is not always an easy question, though there are some situations in which such a move is clear. For example, if we are dealing with a mechanical system subject to an external force F(t) which fluctuates rapidly with a variance $\sigma^2(t)$ around a mean term $F_d(t)$, then the obvious model is to write the equivalent of Newton’s law as

$$\frac{dX_t}{dt} = F(t) \;\Rightarrow\; dX_t = F_d(t)\,dt + \sigma(t)\,dW_t. \tag{88}$$

However, the situation is perhaps less clear when we are dealing with internal degrees of freedom. The methodology is thus detailed in the rest of this section, starting with a ‘simple’ example.
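As an illustration of how the stochastic model on the right-hand side of Eq. (88) can be simulated, the following minimal sketch uses the Euler–Maruyama scheme. The scheme itself, the function name `euler_maruyama` and the constant values chosen for $F_d$ and $\sigma$ are assumptions made for this example and are not prescribed by the text.

```python
import math
import random


def euler_maruyama(f_d, sigma, x0, t_end, n_steps, rng):
    """Integrate dX = F_d(t) dt + sigma(t) dW with the Euler-Maruyama scheme."""
    dt = t_end / n_steps
    x = x0
    for k in range(n_steps):
        t = k * dt
        dw = rng.gauss(0.0, math.sqrt(dt))   # Wiener increment with variance dt
        x += f_d(t) * dt + sigma(t) * dw
    return x


rng = random.Random(1)                        # fixed seed for reproducibility
F_MEAN, SIGMA, T_END = 1.0, 0.5, 1.0          # illustrative constant coefficients
samples = [
    euler_maruyama(lambda t: F_MEAN, lambda t: SIGMA, 0.0, T_END, 50, rng)
    for _ in range(4000)
]
mean_x = sum(samples) / len(samples)
var_x = sum((x - mean_x) ** 2 for x in samples) / (len(samples) - 1)

# For constant coefficients, the mean should approach F_MEAN * T_END and the
# variance SIGMA**2 * T_END, up to sampling error.
print(mean_x, var_x)
```

With constant coefficients the scheme reproduces the exact distribution of $X_t$ at the final time, which makes this case a convenient sanity check before time-dependent $F_d(t)$ and $\sigma(t)$ are introduced.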
4.1. The shift from an ODE to a SDE

Let us consider the case of a system $X_t$ whose time rate of change is $Y_t$,

$$\frac{dX_t}{dt} = Y_t. \tag{89}$$

We consider that we are dealing with stochastic processes (due, for example, to random initial conditions) which are differentiable and can thus be handled with normal calculus rules. Taking X(0) = 0, this gives

$$\frac{d\langle X^2(t)\rangle}{dt} = 2\int_0^t \langle Y(t)\,Y(t')\rangle\,dt'. \tag{90}$$

If we consider, for the sake of simplicity, Y(t) as a stationary process and introduce its autocorrelation $R_y(s)$, defined by $R_y(s) = \langle Y(t)Y(t+s)\rangle/\langle Y^2\rangle$, we can write

$$\frac{d\langle X^2\rangle}{dt} = 2\langle Y^2\rangle\int_0^t R_y(s)\,ds. \tag{91}$$

The important scale in that reasoning is the integral time scale of Y(t), say T, which is defined as the integral of the autocorrelation:

$$T = \int_0^\infty R_y(s)\,ds. \tag{92}$$

This time scale is a measure of the ‘memory’ of the process. If we consider time intervals s that are small with respect to T, successive values of Y(t) are well correlated. On the other hand, successive values of Y(t) over time intervals that are large with respect to T are nearly uncorrelated. Therefore, in this second limit, we have

$$\text{for } t \gg T, \quad \int_0^t R_y(s)\,ds \sim T \;\Rightarrow\; \langle X^2\rangle \simeq 2\langle Y^2\rangle\, T \times t, \tag{93}$$

that is, the mean square of X(t) varies linearly with the time interval, here t. This is the ‘diffusive regime’. It should be noted that this regime is always reached (for long enough time spans) and that, once it is reached, the behaviour of $\langle X^2\rangle$ does not depend on the particular form of $R_y(s)$ but simply on two mean quantities, namely the variance and integral time scale of Y(t). This reasoning is certainly not new. Applied to the position and velocity of a fluid particle, this point was described by Taylor in 1921 and has been detailed in most textbooks. However, we are not simply interested in reformulating known results concerning the statistics of X(t) but in modelling the instantaneous trajectories. Indeed, if we assume that the trajectories of X(t) are continuous, the previous result suggests that, in the range $t \gg T$, X(t) can be seen as a Wiener process, that is, as undergoing a random walk.

The previous behaviour is obtained with finite time differences, by first introducing T and then making t or $\Delta t$ large enough. The reasoning can be reversed to reveal what the introduction of a white noise means. We still consider $X_t$ whose time rate of change is $Y_t$. Let us consider that there is a separation of scales: we introduce a time step $\Delta t \sim dt$ representing the time interval over which we observe the process $X_t$. This time increment dt is therefore assumed to be small with respect to a characteristic time of $X_t$. Nevertheless, we assume that the integral time scale of $Y_t$, T, is very small with respect to dt. Thus, $Y_t$ is a fast and rapidly changing
variable. Actually, we would like to take directly the limit $T \to 0$, since dt is assumed to be arbitrarily small. Yet, if we take that limit, assuming that $Y_t$ is a normal process having a finite variance, Eq. (93) shows that the effect of the fluctuations of $Y_t$ vanishes completely. Consequently, to retain a finite limit when $T \to 0$, we are forced to consider that $\langle Y^2\rangle$ becomes arbitrarily large, in the sense that

$$\langle Y^2\rangle \to +\infty \quad\text{such that}\quad \langle Y^2\rangle\,T \xrightarrow[T \to 0]{} D, \tag{94}$$

where D is a finite constant. In that case, the modelling step consists in replacing the differentiable process $Y_t$ by a white noise and writing that $X_t$ becomes a diffusion process defined by the SDE

$$\frac{dX_t}{dt} = Y(t) \;\xrightarrow[T \to 0]{}\; dX_t = \sqrt{2D}\,dW_t, \qquad D = \lim_{T \to 0}\langle Y^2\rangle\,T. \tag{95}$$

By making this step, $X_t$ becomes a Markov process since the memory of $Y_t$ becomes infinitesimally small. It also implies that some ‘information’ has been lost (the information associated with $Y_t$) in an irreversible way. The significance of this modelling step can be further clarified by writing the consequences for the pdf equation. If p(t; y) is the pdf associated with the process $X_t$, we have

$$\frac{\partial p}{\partial t} = -\frac{\partial\big[\langle Y_t\,|\,X_t = y\rangle\,p\big]}{\partial y} \;\xrightarrow[T \to 0]{}\; \frac{\partial p}{\partial t} = D\,\frac{\partial^2 p}{\partial y^2}, \tag{96}$$

which shows that we have in fact introduced a ‘transport coefficient’, namely D.

The discussion above is presented in the framework of continuous-time stochastic processes, and to be put on firm mathematical grounds the limit expressed in Eq. (94) is required. On a discrete time basis, the time scale T of $Y_t$ does not have to go exactly to zero. What is required is that this time scale be small with respect to the time step, which is the reference time scale $\Delta t$ we introduced at the outset. It is important to realize that, in practice, the introduction of a white-noise term is a relative notion. With regard to one time scale, another process is assumed to vary ‘sufficiently quickly’. Therefore, the details of this fast process are not crucial: the wild variations can be expressed by a Wiener increment. Yet, the eliminated fast process leaves its trace through its variance and integral time scale, which define the transport coefficient D. Using a discrete representation of $X_t$, this step can be expressed by

$$\Delta X(t) = X(t+\Delta t) - X(t) = \int_t^{t+\Delta t} Y_s\,ds \;\xrightarrow[T \ll \Delta t]{}\; \Delta X(t) = \sqrt{2D}\int_t^{t+\Delta t} dW_s. \tag{97}$$
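The emergence of the diffusive regime of Eq. (93) can be illustrated numerically. The sketch below assumes, purely for illustration, that $Y_t$ is a stationary Ornstein–Uhlenbeck process with variance $\sigma^2$ and autocorrelation $R_y(s) = e^{-|s|/T}$; the text only requires a stationary process with integral time scale T. Under this assumption, integrating Eq. (91) gives $\langle X^2\rangle(t) = 2\sigma^2[Tt - T^2(1 - e^{-t/T})]$, which is used as a reference. The function name and all numerical values are hypothetical.

```python
import math
import random


def mean_square_displacement(sigma, T, t_end, dt, n_paths, rng):
    """Estimate <X^2>(t_end) for dX/dt = Y, with Y a stationary OU process.

    Y has variance sigma**2 and autocorrelation exp(-|s|/T), so that T is its
    integral time scale in the sense of Eq. (92).
    """
    n_steps = int(round(t_end / dt))
    decay = math.exp(-dt / T)                        # exact one-step OU decay
    kick = sigma * math.sqrt(1.0 - decay * decay)    # keeps Var(Y) = sigma**2
    total = 0.0
    for _path in range(n_paths):
        y = rng.gauss(0.0, sigma)                    # start in the stationary state
        x = 0.0
        for _step in range(n_steps):
            x += y * dt                              # ordinary calculus: dX = Y dt
            y = decay * y + kick * rng.gauss(0.0, 1.0)
        total += x * x
    return total / n_paths


SIGMA, T, T_END = 1.0, 1.0, 20.0                     # illustrative values, t >> T
msd = mean_square_displacement(SIGMA, T, T_END, dt=0.1, n_paths=3000,
                               rng=random.Random(0))
diffusive = 2.0 * SIGMA ** 2 * T * T_END             # Eq. (93)
exact = 2.0 * SIGMA ** 2 * (T * T_END - T ** 2 * (1.0 - math.exp(-T_END / T)))
print(msd, diffusive, exact)
```

For $t = 20\,T$ the reference value differs from the diffusive estimate $2\langle Y^2\rangle T t$ by only 5%, which gives a feeling for how quickly the details of $R_y(s)$ stop mattering.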
4.2. Modelling principles

All the necessary elements have been given in the above example and can be developed in a more complex context to propose a general methodology. The idea is that introducing a local closure in an (open) set of equations means a Markovian approximation. Such an approximation
can be justified by a coarse-graining procedure, that is by observing the system on ‘large enough’ time intervals. This is precisely what we did in the previous example by taking a not-too-small dt in order to disregard the ‘information’ related to $Y_t$ and to retain only its effects on $X_t$ through the coefficient D. The success of such a procedure will therefore rest upon a satisfactory choice of the ‘size of the grain’ (in practice a time or length scale) and upon a separation of scales, as explained in Section 3.

Let us build on these ideas in a complex situation to help us select the proper degrees of freedom to retain in the state vector. As explained in the previous sections for the case of N interacting particles, even if we limit ourselves to characterizing the statistical behaviour of one particle (one-point pdf), we still have a huge (maybe infinite) number of degrees of freedom. We could limit ourselves to the position of the particle, say $X_t$, or include its velocity to have $(X_t, U_t)$, or also its acceleration, $(X_t, U_t, A_t)$, and so on. Using the language of Statistical Mechanics or of Synergetics [23,24], the principle is to introduce first a reference scale, which in our example with one particle would be a reference time scale dt. Then, the degrees of freedom written as $Z_t = (Z_1, \ldots, Z_n)$ are classified with respect to that scale as slow and fast variables: in the ordered list $(Z_1, Z_2, \ldots, Z_n, \ldots)$, the reference scale marks the division between the two groups. A slow variable is a variable whose integral time scale T is greater than the reference scale dt, while fast variables are those with an integral time scale $\tau$ smaller than the reference scale,

$$\tau \ll dt \ll T. \tag{98}$$

The guiding principle is then to retain only the slow modes or variables in the state vector used to build the model and to ‘eliminate’ the fast ones. The latter modes are eliminated by expressing them as functions of the slow ones. This is called the slaving principle [23] and is in fact an equilibrium hypothesis. The fast modes are assumed to relax ‘very rapidly’ to equilibrium values or distributions which are determined or parameterized by the values taken by the slow modes. This corresponds to sorting out the degrees of freedom in terms of solutions of transport equations and local source terms. The slow modes $(Z_1, Z_2, \ldots, Z_d)$ that are kept in the state vector will satisfy differential equations, while the fast ones $(Z_{d+1}, Z_{d+2}, \ldots)$ will be given by algebraic relations. In fluid mechanics applications, statistics on the slow modes will be solutions of transport equations, while statistics on the fast modes will appear as local source terms.

Of course, this procedure will be successful if there exists a clear separation of scales between the integral time scales of the slow modes and of the fast ones. This was indeed the case in the previous example, and this allows the fast modes to be replaced by white noise or increments of Wiener processes. In the general case, there is no such clear-cut separation, and replacing the fast variables by white-noise terms appears as a less justified approximation. However, the interest of this principle is at least to provide a convenient and coherent framework and to suggest in practice which variables have the ‘best chances’ of being replaced by a model.
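The classification rule of Eq. (98) can be written down as a short helper. This is only a sketch of the bookkeeping involved: the function names, the variable labels and the numerical time scales are hypothetical, and a real application would first have to estimate the integral time scales.

```python
def classify_modes(time_scales, dt):
    """Split variables into slow (time scale > dt) and fast (time scale <= dt)."""
    slow = [name for name, scale in time_scales.items() if scale > dt]
    fast = [name for name, scale in time_scales.items() if scale <= dt]
    return slow, fast


def separation_ratio(time_scales, slow, fast):
    """Ratio of the shortest slow time scale to the longest fast time scale.

    A value much larger than one indicates the clear separation of scales
    that justifies replacing the fast modes by white-noise terms.
    """
    if not slow or not fast:
        return float("inf")
    return min(time_scales[s] for s in slow) / max(time_scales[f] for f in fast)


# Hypothetical integral time scales (arbitrary units) for position, velocity
# and acceleration of one particle, observed with a reference scale dt = 0.1.
scales = {"X": 10.0, "U": 1.0, "A": 1e-3}
slow, fast = classify_modes(scales, dt=0.1)
ratio = separation_ratio(scales, slow, fast)
print(slow, fast, ratio)
```

In this made-up case the state vector would be $(X_t, U_t)$ and the acceleration would be the candidate for a white-noise model, which corresponds to Langevin’s point of view.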
4.3. Example for typical stochastic models

A typical example of this reasoning is the historical case of a Brownian particle. This example was already used in Section 3 to illustrate the pdf hierarchy with respect to increasing one-particle state vectors. We can return to that case and go one step further by introducing specific models following the general methodology above.

The first and simplest description retains only the position of the particle (that was Einstein’s point of view). With that choice of the state vector, $Z_t = (X_t)$, the particle velocity is an external variable and has to be eliminated to obtain a closed model, as already explained in Section 3. When a large enough time step $\Delta t$ or dt is used, the particle velocity can be regarded as a fast variable and the resulting stochastic model for the Brownian particle location is expressed by

$$\frac{dX_t}{dt} = U_t \;\to\; dX_t = \sqrt{2D}\,dW_t. \tag{99}$$

That procedure implicitly assumes that the time scale of the particle velocity $U_t$, say $T_U$, is small with respect to dt. The corresponding pdf equation is a simple diffusion equation in sample space (identical to a heat equation):

$$\frac{\partial p}{\partial t} = D\,\frac{\partial^2 p}{\partial y^2}. \tag{100}$$
The correlation between successive particle locations is given (for $X_0 = 0$) by

$$\langle X_t X_s\rangle = 2D\,\min(t, s). \tag{101}$$
In Einstein’s picture, particle velocities do not exist. If this first picture is believed to be too crude, one can include the velocity of the particle in the state vector, which then becomes $Z_t = (X_t, U_t)$ (Langevin’s point of view). Now, the particle acceleration $A_t$ becomes an external variable that has to be eliminated. The model proposed by Langevin is written as

$$\begin{cases} \dfrac{dX_t}{dt} = U_t \\[2mm] \dfrac{dU_t}{dt} = A_t \end{cases} \;\to\; \begin{cases} dX_t = U_t\,dt \\[1mm] dU_t = -\dfrac{U_t}{T}\,dt + \sqrt{K}\,dW_t \end{cases} \tag{102}$$

and the corresponding pdf equation for p(t; y, V) is

$$\frac{\partial p}{\partial t} + V\,\frac{\partial p}{\partial y} = \frac{\partial}{\partial V}\Big[\frac{1}{T}\,V p\Big] + \frac{1}{2}\,\frac{\partial^2 [K p]}{\partial V^2}. \tag{103}$$

The correlation between successive particle velocities is now given by

$$\langle U_t U_{t'}\rangle = \langle U_0^2\rangle\, e^{-(t+t')/T} - \frac{KT}{2}\, e^{-(t+t')/T} + \frac{KT}{2}\, e^{-|t-t'|/T}. \tag{104}$$

When both times are long enough with respect to the initial time of the process, the correlation takes the simplified form

$$t, t' \gg T: \quad \langle U_t U_{t'}\rangle = \frac{KT}{2}\, e^{-|t-t'|/T}. \tag{105}$$
This reveals that the time scale T used in the stochastic velocity equation is the time scale of particle velocity correlations, since

$$R_U(s) = e^{-|s|/T} \quad\text{and thus}\quad \int_0^\infty R_U(s)\,ds = T. \tag{106}$$
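The stationary statistics of Eqs. (105) and (106) can be checked with a short simulation of the velocity equation in Eq. (102). The sketch below uses the exact one-step transition of the Ornstein–Uhlenbeck process rather than a first-order discretization; this choice, the function names and the values of K, T and the time step are assumptions made for the example.

```python
import math
import random


def simulate_langevin_velocity(K, T, dt, n_steps, rng):
    """Sample dU = -(U/T) dt + sqrt(K) dW using the exact OU transition."""
    decay = math.exp(-dt / T)
    stationary_var = 0.5 * K * T                     # Eq. (105) with t = t'
    kick = math.sqrt(stationary_var * (1.0 - decay * decay))
    u = rng.gauss(0.0, math.sqrt(stationary_var))    # start at equilibrium
    path = []
    for _step in range(n_steps):
        path.append(u)
        u = decay * u + kick * rng.gauss(0.0, 1.0)
    return path


def autocorrelation(path, lag):
    """Normalized time-average estimate of R_U at a lag given in steps."""
    n = len(path) - lag
    var = sum(u * u for u in path) / len(path)
    cov = sum(path[i] * path[i + lag] for i in range(n)) / n
    return cov / var


K, T, DT = 2.0, 1.0, 0.1                             # illustrative values
path = simulate_langevin_velocity(K, T, DT, 200_000, random.Random(2))
variance = sum(u * u for u in path) / len(path)
lag_steps = int(round(T / DT))                       # lag equal to one time scale T
r_at_T = autocorrelation(path, lag_steps)

# The variance should be close to K*T/2 and R_U(T) close to exp(-1).
print(variance, 0.5 * K * T)
print(r_at_T, math.exp(-1.0))
```

Time averages over one long trajectory are used here in place of ensemble averages, which is legitimate for this stationary process once the trajectory is much longer than T.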
The Langevin model has better support if the acceleration can easily be replaced by a model. In the case of a Brownian particle, the acceleration is due to the large number of collisions with fluid molecules. Due to the large inertia of the Brownian particle compared to the inertia of fluid molecules, we can select a time step which is small with respect to the time scale of particle velocities and yet large with respect to the time scale of fluid molecule velocities. The motion of these molecules can thus be seen as a fast and purely random process. The total action of the collisions is written as the sum of two contributions: a purely deterministic one opposed to the Brownian particle motion and a purely random one expressed as a white-noise process. For that precise example, the complete form of the Langevin model is written, with $k_B$ the Boltzmann constant, $\alpha$ the friction coefficient and $\Theta$ the fluid temperature, as

$$dX_t = V_t\,dt, \tag{107}$$

$$dV_t = -\alpha V_t\,dt + \sqrt{2 k_B \Theta \alpha}\,dW_t. \tag{108}$$

The Langevin model is really the archetype of stochastic processes for fluid dynamical modelling problems and will be extensively referred to in the next chapters. It is therefore important to be aware of its physical justification and, consequently, of its inherent limitations.

In Langevin’s picture, one part of the particle acceleration is taken as a fast process and replaced by a white-noise term. Consequently, information related to the acceleration is lost. If such information is needed, or if acceleration cannot be seen as infinitely fast, the same procedure can be pursued by shifting the introduction of the necessary model to the time rate of change of $A_t$. A useful model can be written as

$$dX_t = U_t\,dt, \qquad dU_t = A_t\,dt, \qquad A_t = -\frac{U_t}{T} + \gamma_t, \tag{109}$$

$$d\gamma_t = -\frac{\gamma_t}{\tau}\,dt + \sqrt{B}\,dW_t. \tag{110}$$

The value of $\gamma(t)$ at time t depends upon its past values. In the acceleration-based model, the process $(X_t, U_t)$ is not Markovian anymore. The corresponding pdf equation for $p(t; y, V, \gamma)$ is now

$$\frac{\partial p}{\partial t} + V\,\frac{\partial p}{\partial y} = \frac{\partial}{\partial V}\Big[\frac{1}{T}\,V p\Big] - \frac{\partial}{\partial V}[\gamma\, p] + \frac{\partial}{\partial \gamma}\Big[\frac{1}{\tau}\,\gamma\, p\Big] + \frac{1}{2}\,\frac{\partial^2 [B p]}{\partial \gamma^2}. \tag{111}$$
Far from the chosen origin of time, the correlation between successive fluid particle velocities is expressed by

$$\langle U(t)\,U(t+s)\rangle = \frac{B\tau}{2} \times \frac{T\tau}{1 + \tau/T} \times \frac{1}{1 - \tau/T} \times \Big(e^{-s/T} - \frac{\tau}{T}\,e^{-s/\tau}\Big) \tag{112}$$

and the integral time scale of the fluid velocities is equal to $T + \tau$.

Finally, it is also instructive to consider the interplay between these different pdf descriptions. In the last model, the pdf equation for $p(t; y, V, \gamma)$ is closed. From the knowledge of this pdf, information concerning only particle location and velocity can be retrieved by integrating over the extra variable $\gamma$, $p(t; y, V) = \int p(t; y, V, \gamma)\,d\gamma$, since $p(t; y, V)$ is simply the marginal of $p(t; y, V, \gamma)$. From the pdf equation for $p(t; y, V, \gamma)$, Eq. (111), the equation satisfied by the marginal is readily obtained. The integration yields

$$\frac{\partial p}{\partial t} + V\,\frac{\partial p}{\partial y} = \frac{\partial}{\partial V}\Big[\frac{1}{T}\,V p\Big] - \frac{\partial}{\partial V}\big[\langle \gamma\,|\,(y, V)\rangle\, p\big], \tag{113}$$

where

$$\langle \gamma\,|\,(y, V)\rangle\, p(t; y, V) = \int \gamma\; p_{\gamma|(X,U)}(\gamma\,|\,y, V)\; p(t; y, V)\,d\gamma \tag{114}$$

$$= \int \gamma\; p(t; y, V, \gamma)\,d\gamma \tag{115}$$

$$= \langle \gamma\,\delta(U - V)\,\delta(X - y)\rangle. \tag{116}$$

However, the pdf equation satisfied by p(t; y, V) is now unclosed. Obtaining a closed form requires expressing the conditional expectation of $\gamma$ as a function of (y, V). In general, additional information or assumptions must be input at that stage.

5. The physics of turbulence

In this section, we discuss the physics of single-phase flow turbulence. The core of the present work is not directly devoted to this problem, but to the statistical properties of discrete particles. However, since these discrete particles are transported and dispersed by the turbulence of the carrier fluid, their statistical properties are strongly influenced or governed by the underlying turbulent flow. It is then important to detail the physical characteristics of turbulent flows.

Turbulence is a very difficult subject. It is also a subject where different people may actually be following different objectives. Quoting Lesieur [3], this is a subject where people may completely disagree (sometimes in violent terms) on what the problem is. In line with this statement, the point of view adopted in this work is to follow a ‘middle-road approach’ between a more theoretically-inclined point of view and a more engineering-inclined point of view. On the one hand, this section does not claim to present a comprehensive overview of the physics of turbulence and of all theoretical issues. On the other hand, some of these issues are not swept aside on the grounds that they do not concern ‘engineering’ aspects of the problem. The
presentation remains as general as possible and, as such, this section is the last section devoted to background issues (as well as Sections 2– 4). It provides a link with the following sections which will be devoted to the developments of speciAc models. With respect to these points, the precise aims of this section are (i) to provide a background on the physics of turbulence and in particular on the classical Kolmogorov theories and the concept of the energy cascade, (ii) to describe speciAc results, such as Lagrangian statistics, that will be used in the developments of PDF models in the following sections, (iii) to present and discuss some of the known deAciencies in the classical theories and to introduce recent theoretical issues or results, such as small scale intermittency or coherent structures. These points will help to clarify what will be captured by proposed models, what is explicitly left out and also what could be captured by a certain formalism, (iv) to try to somewhat reconcile theoretical and practical points of view by indicating how some theoretical Andings or concepts could be used in approximate descriptions and then in simple models, (v) to emphasize the physical aspects of turbulence which make the problem inherently di@cult to understand and to model in a satisfactory way. The literature on the subject is immense and impossible to cite in detail. Complementary aspects can be found in several textbooks on turbulence [3,5,25 –27] and in recent reviews which provide up-to-date lists of references [28,1]. 5.1. The turbulence problem The starting point is provided by the Navier–Stokes equations supplemented with conservation equations for a set of scalars. 
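These are field equations (the different fields are the density $\rho(t,\mathbf{x})$, the pressure $P(t,\mathbf{x})$, the velocity $\mathbf{U}(t,\mathbf{x})$ and the scalars $\boldsymbol{\phi}(t,\mathbf{x})$, where $\mathbf{x}$ represents the coordinates in physical space)
\[ \frac{\partial \rho}{\partial t} + \frac{\partial (\rho U_j)}{\partial x_j} = 0, \tag{117a} \]
\[ \frac{\partial U_i}{\partial t} + U_j \frac{\partial U_i}{\partial x_j} = -\frac{1}{\rho}\frac{\partial P}{\partial x_i} + \nu \frac{\partial^2 U_i}{\partial x_j^2}, \tag{117b} \]
\[ \frac{\partial \phi_l}{\partial t} + U_j \frac{\partial \phi_l}{\partial x_j} = \Gamma \frac{\partial^2 \phi_l}{\partial x_j^2} + S_l(\boldsymbol{\phi}). \tag{117c} \]
The set of scalars $\boldsymbol{\phi}$ is a compact notation that gathers in one vector the mass fractions $m_\alpha$ ($\alpha = 1, \ldots, N_s$) of the $N_s$ species that compose the reactive mixture and the enthalpy (or the temperature $T$) of the mixture, thus $\boldsymbol{\phi} = (m_1, \ldots, m_{N_s}, T)$. This allows us to express the equation of state that gives the mixture density as a function of the scalars for low Mach number flows, where pressure does not influence density significantly, as $\rho = \rho(\boldsymbol{\phi})$. The conservation equations for the different scalars involve reactive source terms, $S_l$ for the species $l$, which depend upon the whole set of species mass fractions and upon the enthalpy or temperature. With the general set of scalars $\boldsymbol{\phi}$, the dependence of the source terms on the variables can be written as $S_l = S_l(\boldsymbol{\phi})$.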
The basic equations are known. However, we are dealing here with fully turbulent flows. In that case, it is known that the solutions of the Navier–Stokes equations exhibit fluctuations over a wide range of scales. It will be shown in Section 5.3 that the number of degrees of freedom is of the order of $Re^{9/4}$, for a Reynolds number $Re$ that is typically $10^5$–$10^8$. The existence of such a wide range of scales, and the acute sensitivity of turbulent flows to small perturbations in initial and boundary conditions (which are never known exactly), explain the search for a statistical description of turbulent flows. Turbulence is such a vast subject that many different objectives can be pursued. Broadly speaking, the ‘turbulence problem’ is the search for a reduced statistical description. In particular, from a modelling point of view, the ‘turbulence problem’ is to come up with a tractable statistical model. This implies reasoning in which statistical quantities are manipulated or compared. Of primary importance for statistical descriptions are the characteristic scales, which are defined first, before going into the details of Kolmogorov theory.

5.2. Characteristic scales

Integral time scales have already been introduced in Section 4 through the discussion of simple diffusive behaviour. In this section, we define the characteristic time and length scales which are of importance for turbulence modelling and, in the process, we introduce notations that will be used in the rest of the paper. In the statistical approach to turbulence, all variables (velocity, pressure, temperature, etc.) are regarded as random functions or random fields. Yet, a fundamental aspect of turbulence, compared with other random phenomena, is that the values of each random variable (say velocity) at different locations or at different times are not independent. They are usually correlated or, in other words, turbulence has a non-zero memory.
These non-zero memories are best quantified by the use of the autocorrelation functions and of the corresponding integral scales. In this section, we limit ourselves, for the sake of simplicity, to homogeneous isotropic stationary turbulence. If we first consider a Lagrangian point of view, and record the successive velocities in time of a marked fluid element, say $U_a(t)$, where the index $a$ indicates that the particle starts from location $a$ at time $t = 0$, the autocorrelation coefficient of the Lagrangian fluid particle velocity is then defined as
\[ R_L(s) = \frac{\langle U_a(t)\, U_a(t+s) \rangle}{\langle u^2 \rangle}. \tag{118} \]
In this equation, $\langle u^2 \rangle$ stands for the constant velocity variance. The autocorrelation coefficient does not depend on absolute times but on time differences since we are considering homogeneous stationary turbulence. A typical form of $R_L(s)$ is shown in Fig. 3. As already indicated in Section 4.1, the value of the autocorrelation coefficient $R_L(s)$ for a time lag $s$ is an indication of the degree of correlation between the two velocities $U_a(t)$ and $U_a(t+s)$. For small $s$, $R_L(s) \simeq 1$ and the two velocities are strongly correlated, while for long enough time intervals, $R_L(s) \simeq 0$ and the two velocities are nearly uncorrelated. A rough measure of the time interval over which velocities remain typically correlated is the integral
Fig. 3. Typical form of the autocorrelation coefficient.
time scale defined as
\[ T_L = \int_0^{+\infty} R_L(s)\, ds. \tag{119} \]
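As an illustration of Eq. (119), the following minimal Python sketch estimates the integral time scale by numerical quadrature. It assumes an exponential shape for $R_L(s)$ and an arbitrary value of $T_L$; the names `integral_time_scale` and `T_L_true`, as well as the grid, are illustrative choices and are not part of the original text.

```python
import numpy as np

def integral_time_scale(s, R):
    """Trapezoidal estimate of T_L = int_0^inf R_L(s) ds on a finite grid."""
    return float(np.sum(0.5 * (R[1:] + R[:-1]) * np.diff(s)))

T_L_true = 0.5                                  # assumed value, arbitrary time units
s = np.linspace(0.0, 20.0 * T_L_true, 20001)    # lags up to 20 T_L (truncated integral)
R_exp = np.exp(-s / T_L_true)                   # assumed exponential model for R_L(s)

T_L_est = integral_time_scale(s, R_exp)
print(f"estimated T_L = {T_L_est:.6f}, prescribed T_L = {T_L_true}")
```

With measured or simulated data, `R_exp` would simply be replaced by the sampled autocorrelation coefficient; the truncation of the upper limit then has to be checked against the decay of the tail.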
As shown in Section 4.1, this characteristic time scale is the key parameter for turbulent diffusion. The definition of the integral time scale $T_L$ allows the notion of ‘long enough time intervals’ used above to be specified: a time interval $s$ is long enough for the successive velocities to be uncorrelated when $s$ is large with respect to the integral time scale, $s \gg T_L$. Another scale can be introduced from the curvature of $R_L$ at the origin, see Fig. 3:
\[ \left. \frac{d^2 R_L(s)}{ds^2} \right|_{s=0} = -\frac{2}{\lambda^2}, \qquad R_L(s) \simeq 1 - \frac{s^2}{\lambda^2}, \quad s \ll T_L. \tag{120} \]
The Taylor scale, $\lambda$, is the time interval over which velocities are strongly correlated (that point will be discussed again in Section 5.3 for spatial velocity differences). Since the process $U_a(s)$ is stationary, $\lambda$ also represents a measure of the acceleration variance. Indeed, we have
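\[ \left\langle \frac{dU_a(s)}{ds} \frac{dU_a(s')}{ds'} \right\rangle = \langle u^2 \rangle \frac{d^2 R_L(s'-s)}{ds\, ds'} \;\Rightarrow\; \left\langle \left( \frac{dU_a(s)}{ds} \right)^2 \right\rangle = -\langle u^2 \rangle \left. \frac{d^2 R_L(s)}{ds^2} \right|_{s=0} = \frac{2 \langle u^2 \rangle}{\lambda^2}. \tag{121} \]
The exact form of the autocorrelation coefficient $R_L(s)$ does not play a role for the long-time diffusive behaviour. However, for other concerns (in particular for the expression of $\langle x^2(t) \rangle$ when $t \sim T_L$), the shape of $R_L$ does play a role. Little theoretical information is available but three requirements can be stated:
\[ |R_L(s)| \leq 1, \qquad R_L(0) = 1, \qquad R_L(\infty) = 0, \tag{122a} \]
\[ \left. \frac{dR_L(s)}{ds} \right|_{s=0} = 0, \tag{122b} \]
\[ \left. \frac{d^2 R_L(s)}{ds^2} \right|_{s=0} \leq 0. \tag{122c} \]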
Fig. 4. Definition of the longitudinal and transversal directions.
A simple form that is often assumed is the exponential formula $R_L(s) = \exp(-s/T_L)$. For example, this expression results from a Langevin stochastic model, see Section 4.3. It is obvious that this expression does not respect the last two constraints, Eqs. (122b) and (122c), since its slope at the origin is $-1/T_L$ and its curvature there is positive. Nevertheless, these constraints are governed by the behaviour of $R_L(s)$ very close to the origin. Away from the origin, the exponential form is acceptable and has been confirmed by both experimental and numerical studies for the case of homogeneous turbulence [29–31]. The question of the form of the autocorrelation coefficient and of the limitations of the exponential expression is taken up again and discussed in Sections 6.8 and 7.5.3.

Similar reasoning can be applied to the Eulerian velocity field. If we consider the velocities at two different locations and at two different times, we can define the Eulerian space–time correlation tensor as
\[ R_{E,ij}(\mathbf{r}, t) = \langle U_i(\mathbf{x}_0, t_0)\, U_j(\mathbf{x}_0 + \mathbf{r}, t_0 + t) \rangle. \tag{123} \]
The space and time dependence are generally considered separately. We thus first define the Eulerian tensor at the same point by
\[ R_{E,ij}(t) = \langle U_i(\mathbf{x}_0, t_0)\, U_j(\mathbf{x}_0, t_0 + t) \rangle. \tag{124} \]
This directly introduces the Eulerian time scale which represents the memory of the turbulent velocities seen by an immobile observer
\[ \langle u^2 \rangle\, T_{E,ij}(\mathbf{x}_0) = \int_0^{+\infty} R_{E,ij}(t)\, dt. \tag{125} \]
In an analogous way, the spatial Eulerian tensor is defined as the correlation between velocities at the same time but at different locations
\[ R_{E,ij}(\mathbf{r}) = \langle U_i(\mathbf{x}_0, t)\, U_j(\mathbf{x}_0 + \mathbf{r}, t) \rangle. \tag{126} \]
This is the quantity usually considered and most analyses are limited to the spatial correlation. For homogeneous isotropic turbulence, the form of the tensor $R_{E,ij}(\mathbf{r})$ can be further developed. Indeed, it can be shown that the only non-zero correlations are the correlations along the longitudinal and transversal directions with respect to the separation vector $\mathbf{r}$, see Fig. 4.
Fig. 5. Shapes of the longitudinal and transversal correlations.
In a general system of reference, the Eulerian tensor has the form
\[ R_{E,ij}(\mathbf{r}) = \langle u^2 \rangle \left[ g(r)\, \delta_{ij} + \frac{f(r) - g(r)}{|\mathbf{r}|^2}\, r_i r_j \right], \qquad r = |\mathbf{r}|, \tag{127} \]
where $f(r)$ is the longitudinal autocorrelation coefficient and $g(r)$ the transversal coefficient. This relation results from symmetry conditions. Additional information is provided by dynamical constraints. For example, the continuity equation implies that
\[ \frac{\partial R_{E,ij}(\mathbf{r})}{\partial r_i} = 0. \tag{128} \]
Application of Eq. (128) in isotropic turbulence yields the differential equation
\[ g(r) = f(r) + \frac{r}{2} \frac{df(r)}{dr}. \tag{129} \]
As for the Lagrangian velocity autocorrelation function, few requirements can be stated. The conditions listed above for $R_L(s)$ are still valid and it can be shown that the transversal correlation function $g(r)$ must contain negative loops, since mass conservation across a plane implies that
\[ \int_{-\infty}^{+\infty} g(r)\, dr = 0. \tag{130} \]
On the other hand, experimental and numerical studies tend to suggest that the longitudinal autocorrelation function $f(r)$ does not present such negative loops. A sketch of the shapes of both correlations is given in Fig. 5. Numerical simulations support an exponential form for $f(r)$ except in the vicinity of the origin (as for $R_L(s)$). Hence, $f(r)$ and $g(r)$ can be approximated by the relations
\[ f(r) \simeq \exp(-r/L_{E,\parallel}), \tag{131a} \]
\[ g(r) \simeq \left( 1 - \frac{r}{2 L_{E,\parallel}} \right) \exp(-r/L_{E,\parallel}). \tag{131b} \]
In these equations, $L_{E,\parallel}$ stands for the Eulerian longitudinal length scale defined by
\[ L_{E,\parallel} = \int_0^{+\infty} f(r)\, dr. \tag{132} \]
Fig. 6. Sketch of the Kolmogorov cascade.
Similarly, a transversal length scale is introduced by
\[ L_{E,\perp} = \int_0^{+\infty} g(r)\, dr. \tag{133} \]
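The link between the model correlations (131a) and (131b), the continuity relation (129) and the two length scales (132) and (133) can be checked numerically. The sketch below is only an illustration under stated assumptions: the value of `L_par`, the finite integration range and the helper `trapezoid` are hypothetical choices made for the example.

```python
import numpy as np

def trapezoid(y, x):
    """Plain trapezoidal rule, written out to avoid version-specific NumPy helpers."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

L_par = 1.0                                   # assumed longitudinal length scale
r = np.linspace(0.0, 40.0 * L_par, 400001)    # separations up to 40 L_par

f = np.exp(-r / L_par)                                # Eq. (131a)
g = (1.0 - r / (2.0 * L_par)) * np.exp(-r / L_par)    # Eq. (131b)

# Continuity relation, Eq. (129): g = f + (r/2) df/dr, with a finite-difference derivative
g_from_f = f + 0.5 * r * np.gradient(f, r)
max_gap = float(np.max(np.abs(g - g_from_f)))

L_par_num = trapezoid(f, r)       # Eq. (132)
L_perp_num = trapezoid(g, r)      # Eq. (133)
first_moment = trapezoid(r * g, r)

print(f"L_par = {L_par_num:.6f}, L_perp = {L_perp_num:.6f}, ratio = {L_perp_num / L_par_num:.6f}")
print(f"min g = {g.min():.4f} (negative loop), int r g dr = {first_moment:.2e}")
```

For this particular model form, the transversal correlation becomes negative beyond $r = 2 L_{E,\parallel}$ and its $r$-weighted integral vanishes, while the plain integral of $g$ over $[0, +\infty)$ is half that of $f$.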
An important result is that, even in isotropic turbulence, the Eulerian tensor is not isotropic and that correlations depend upon whether the longitudinal or transversal direction is considered. The two length scales are related by $L_{E,\perp} = L_{E,\parallel}/2$.

5.3. Kolmogorov theory

Ever since their first formulation by Kolmogorov, the set of hypotheses that form what is now referred to as the ‘Kolmogorov theory’ has been the cornerstone of turbulence modelling. This picture or description of turbulence has indeed acquired a central role, acting as a reference theory in most analyses, whether the point is to confirm its predictions or to pinpoint its limitations. We limit ourselves to presenting only the salient points of the Kolmogorov theory, and we emphasise mostly the applications that are of particular interest for the development of Lagrangian stochastic models to be considered in Sections 6 and 7. A comprehensive presentation of the Kolmogorov theory can be found in Monin and Yaglom [25], and the theory is discussed in several textbooks on turbulence [5,3,27].

The Kolmogorov theory is a phenomenological description of turbulence based on the idea of a cascade of energy from large to small scales through a range of so-called inertial scales, where energy is merely transferred to smaller scales without creation or dissipation. The Kolmogorov picture is one of quasi-equilibrium where $\langle \epsilon \rangle$ is assumed to represent at the same time the rate of production of kinetic energy at the large scales, the rate of transfer in the inertial range and finally the rate of viscous dissipation at the end of the turbulence spectrum, though they are different phenomena. It is represented in Fig. 6 where three different ranges have been separated: the production range (P.R.) where kinetic energy is produced by large-scale motions, the inertial range (I.R.) where kinetic energy is merely transferred to smaller scales and the dissipative range (D.R.) where kinetic energy is dissipated into heat by viscous forces.
If we call $u$ a typical fluctuating velocity, the characteristic large scales are expressed by
\[ L \sim \frac{u^3}{\langle \epsilon \rangle}, \qquad T \sim \frac{u^2}{\langle \epsilon \rangle}. \tag{134} \]
The first hypothesis in the Kolmogorov theory is to say that, at high Reynolds numbers, turbulence can be seen as locally isotropic, at least for the range of scales ‘sufficiently’ far from the large energy-producing ones. The local statistics of quantities belonging to that range, for example velocity increments in space and time $\mathbf{v}(\tau, \mathbf{r}) = \mathbf{U}(t + \tau, \mathbf{x} + \mathbf{r}) - \mathbf{U}(t, \mathbf{x})$ relative to the constant motion of one fluid particle at the velocity $\mathbf{U}(t, \mathbf{x})$ (to have a Galilean transformation), are actually independent of the reference velocity $\mathbf{U}(t, \mathbf{x})$ and are uniquely defined by the two parameters $\langle \epsilon \rangle$ and $\nu$ and the corresponding time and space increments ($\tau$ and/or $\mathbf{r}$) which define the local quantity itself. The characteristic scales at which viscous damping gets the upper hand over turbulent fluctuations are thus given from simple dimensional analysis by
\[ \eta = \left( \frac{\nu^3}{\langle \epsilon \rangle} \right)^{1/4}, \qquad \tau_\eta = \left( \frac{\nu}{\langle \epsilon \rangle} \right)^{1/2}. \tag{135} \]
The ratios of the smallest to the largest scales can be expressed from the Kolmogorov scales and from the relation for the integral length scale $L$ given above in Eq. (134):
\[ \frac{\eta}{L} \sim Re^{-3/4}, \tag{136a} \]
\[ \frac{u_\eta}{u} \sim Re^{-1/4}, \tag{136b} \]
\[ \frac{\tau_\eta}{T} \sim Re^{-1/2}. \tag{136c} \]
With the expression of the inner scales, the inertial range can now be defined precisely as the scales $(r, \tau)$ such that, with $r = |\mathbf{r}|$,
\[ \eta \ll r \ll L, \qquad \tau_\eta \ll \tau \ll T. \tag{137} \]
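To make the orders of magnitude in Eqs. (134)–(137) concrete, here is a small Python sketch. The numerical values of `u`, `L` and `nu` are illustrative assumptions, Eq. (134) is treated as an equality, and the Kolmogorov velocity scale is taken as $u_\eta = \eta/\tau_\eta$, which is an assumed definition consistent with Eq. (136b).

```python
def kolmogorov_scales(nu, eps):
    """Kolmogorov length, time and velocity scales built from Eq. (135)."""
    eta = (nu**3 / eps) ** 0.25      # dissipative length scale
    tau_eta = (nu / eps) ** 0.5      # dissipative time scale
    u_eta = eta / tau_eta            # velocity scale (assumed definition)
    return eta, tau_eta, u_eta

# Illustrative large-scale values (assumptions, SI units)
u, L, nu = 1.0, 1.0, 1.0e-5
eps = u**3 / L                       # Eq. (134), taken here as an equality
T = u**2 / eps
Re = u * L / nu

eta, tau_eta, u_eta = kolmogorov_scales(nu, eps)
ratios = (eta / L, u_eta / u, tau_eta / T)
predicted = (Re**-0.75, Re**-0.25, Re**-0.5)   # Eqs. (136a)-(136c)
dof = (L / eta) ** 3                           # number of degrees of freedom, ~ Re^(9/4)

for name, got, expected in zip(("eta/L", "u_eta/u", "tau_eta/T"), ratios, predicted):
    print(f"{name:10s} = {got:.3e}   Re-scaling = {expected:.3e}")
print(f"(L/eta)^3 = {dof:.3e}   Re^(9/4) = {Re**2.25:.3e}")
```

The last line recovers the estimate of the number of degrees of freedom quoted in Section 5.1, obtained here by counting cells of size $\eta$ in a volume $L^3$.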
The second hypothesis states that in that inertial range molecular viscosity becomes irrelevant and that the statistics of $\delta \mathbf{U}(\tau, \mathbf{r}) = \mathbf{v}(\tau, \mathbf{r})$ depend only on $\tau$, $\mathbf{r}$ and $\langle \epsilon \rangle$. Once the large and small scales have been introduced, we can give another description of the energy cascade in a more quantitative way. If we already adopt an Eulerian point of view (that will be pursued just below) and define eddies as ‘coherent’ turbulent motions localised in a region of size $r$ with characteristic velocities $u(r)$ and time scales $\tau(r)$, we can now express these scales as functions of the large and small scales. The key notion is the idea of a constant transfer of energy, $\langle \epsilon \rangle$, which implies that relations similar to Eq. (134) are valid not just for the largest scales but at any scale $r$. This yields the following relations:
\[ u(r) = (\langle \epsilon \rangle r)^{1/3} = u_\eta (r/\eta)^{1/3} \sim u\, (r/L)^{1/3}, \tag{138a} \]
\[ \tau(r) = r/u(r) = (r^2/\langle \epsilon \rangle)^{1/3} = \tau_\eta (r/\eta)^{2/3} \sim T\, (r/L)^{2/3}. \tag{138b} \]
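Equations (138a) and (138b) can be evaluated directly to see how eddy velocities and turnover times decrease along the cascade. The following sketch uses assumed values of `nu`, `eps` and `L`; the function name `eddy_scales` and the list of sizes are hypothetical.

```python
def eddy_scales(r, eps):
    """Characteristic velocity and time scale of eddies of size r, Eqs. (138a)-(138b)."""
    u_r = (eps * r) ** (1.0 / 3.0)
    tau_r = r / u_r
    return u_r, tau_r

# Illustrative values (assumptions): viscosity, mean dissipation rate, integral scale
nu, eps, L = 1.0e-5, 1.0, 1.0
eta = (nu**3 / eps) ** 0.25          # Kolmogorov length scale, Eq. (135)
tau_eta = (nu / eps) ** 0.5          # Kolmogorov time scale, Eq. (135)

radii = [L, 0.1 * L, 0.01 * L, eta]  # from the integral scale down to eta
scales = [eddy_scales(r, eps) for r in radii]

for r, (u_r, tau_r) in zip(radii, scales):
    print(f"r = {r:.3e}  u(r) = {u_r:.3e}  tau(r) = {tau_r:.3e}")
```

At $r = \eta$ the inertial-range expressions match the Kolmogorov scales, which is the content of the middle equalities in Eqs. (138a) and (138b).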
The Kolmogorov picture implies that, in the inertial range, the velocity scales $u(r)$ and the time scales $\tau(r)$ decrease as we consider smaller and smaller scales $r$. Then, for a small enough scale $r$, the time scales $\tau(r)$ are small enough compared to the time scale of the large-scale motion $\tau(L) \sim L/u$, so that these eddies can be regarded as fast processes or variables in the spirit of the discussion in Section 4. Following the general discussion given in Section 4.2, these small eddies will relax very rapidly to their equilibrium values while the large scales, which are the energy-containing scales, have not changed very much and therefore while the flux of energy remains approximately constant. This approximately constant energy flux therefore acts as an ‘external constraint’ for the small scales. We then obtain the physical notion of the universal equilibrium range, which is an important aspect of the Kolmogorov description. Furthermore, in the cascade process and the intrinsically chaotic size-by-size production of smaller eddies, directional information is lost and the resulting small scales are assumed to be statistically isotropic.

However, the consequence of the Kolmogorov theory is not as nice as it would seem for the purpose of turbulence modelling and for most practical purposes. Indeed, the universal behaviour, which provides the main justification for the search for a (universal) model, is only valid for the small scales. Yet, most of the turbulent energy is contained in the large scales or within a range of scales which are comparable with the integral length scale, $r_e \sim L$. These scales have a size comparable to the geometry of the flow and are still very much affected by the anisotropy of the flow. Furthermore, their characteristic time scale $\tau(r_e)$, which is thus of the order of magnitude of the integral time scale $T$, is not small compared to the time scale of the mean flow $L/\langle U \rangle$.
For example, in typical shear flows such as mixing layers, this characteristic time scale is actually four times higher than the mean-flow time scale. These scales contribute most to transport phenomena but do not have a universal form produced by statistical equilibrium. This simple conclusion illustrates the inherent difficulty of practical turbulence models, which often have to express the effects of these scales through constitutive laws.

This first form of the Kolmogorov theory, published in 1941, is currently referred to as K41 in the literature and this notation will be kept here. Before considering specific applications in more detail, it is worth emphasising that Kolmogorov theory is essentially a Lagrangian concept, since predictions are made for fields relative to the motion of one fluid particle, and that predictions concern the form of N-point distributions in the four-dimensional space $(\mathbf{x}, t)$ and include both space and time. Most presentations of Kolmogorov theory limit themselves to space N-point distributions and leave out temporal issues. Yet, the predictions for correlations in time of various quantities are of great importance for the Lagrangian models to be considered in Sections 6 and 7.

5.3.1. Eulerian statistics of velocity differences

Kolmogorov hypotheses are mostly applied to space correlations, that is for the statistics of $\delta \mathbf{U}_r = \mathbf{v}(0, \mathbf{r}) = \mathbf{U}(t, \mathbf{x} + \mathbf{r}) - \mathbf{U}(t, \mathbf{x})$. Statistics often considered are the velocity structure functions defined as
\[ F^n(r) = \langle (\delta U_r)^n \rangle, \tag{139} \]
which can have different forms depending on the orientation of the velocity component with respect to the separation vector $\mathbf{r}$ (the longitudinal and transversal components), see Fig. 4.
When the separation distance lies within the inertial range, we can apply Kolmogorov hypotheses. First of all, since turbulence is locally isotropic, the results given in the previous subsection still hold. The first-order moment is zero and the second-order tensor is expressed with the help of the longitudinal and transversal correlations as in Eq. (127). We thus consider separately the two Eulerian velocity structure functions along the longitudinal and transversal directions, $D_\parallel$ and $D_\perp$, respectively. Kolmogorov theory predicts that when $\eta \ll r \ll L$
\[ D_\parallel = \langle (\delta U_{\parallel,r})^2 \rangle = C (\langle \epsilon \rangle r)^{2/3}, \tag{140a} \]
\[ D_\perp = \langle (\delta U_{\perp,r})^2 \rangle = \tfrac{4}{3} C (\langle \epsilon \rangle r)^{2/3}, \tag{140b} \]
\[ F^3_\parallel = \langle (\delta U_{\parallel,r})^3 \rangle = -\tfrac{4}{5} \langle \epsilon \rangle r. \tag{140c} \]
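The Kolmogorov prediction for the second-order moment is often re-expressed in terms of the energy spectrum
\[ \left. \begin{array}{l} \langle (\delta U_r)^2 \rangle \sim (\langle \epsilon \rangle r)^{2/3} \\ \langle (\delta U_r)^2 \rangle \sim k E(k) \end{array} \right\} \;\Rightarrow\; E(k) \sim \langle \epsilon \rangle^{2/3} k^{-5/3}, \tag{141} \]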
leading to the famous expression of the energy spectrum ($k$ is the wave number). Eqs. (140) already indicate that relative velocities are not Gaussian since odd moments are not zero. By the same reasoning, Kolmogorov hypotheses imply that the $n$th-order velocity structure function has the form
\[ F^n_\parallel(r) = \langle (\delta U_{\parallel,r})^n \rangle = C_n (\langle \epsilon \rangle r)^{n/3}. \tag{142} \]
The immediate consequence of Kolmogorov theory is that the non-dimensional structure factors are independent of the separation distance $r$ and are universal constants
\[ f_n(r) = \frac{F^n(r)}{(F^2(r))^{n/2}} = \frac{\langle (\delta U_{\parallel,r})^n \rangle}{\langle (\delta U_{\parallel,r})^2 \rangle^{n/2}} = \alpha_n. \tag{143} \]
An important consequence of the above scaling is that the probability distribution of the normalized (with unit variance) variable $\delta U_{\parallel,r}/(\langle \epsilon \rangle r)^{1/3}$ should not depend on the scale $r$ when $r$ lies in the inertial range and should be universal.

Finally, the form of the second-order moments in the inertial range can be used to bring out another meaning of the Taylor length scale, $\lambda$. From the definition of the rate of dissipation of the turbulent kinetic energy,
\[ \langle \epsilon \rangle \simeq \nu \left\langle \left( \frac{\partial U}{\partial r} \right)^2 \right\rangle \tag{144} \]
and the relation between the variance of the velocity derivative and the Taylor scale, see Eq. (121), we have $\langle \epsilon \rangle \sim \nu u^2/\lambda^2$ and thus $\lambda \sim L^{1/3} \eta^{2/3}$. From this, we can show that $\lambda$ can be regarded as the averaged length scale of strong correlations when velocities are rescaled to $u$. Indeed, we have $\langle \delta U_r^2 \rangle \sim (\langle \epsilon \rangle r)^{2/3}$ and thus the dissipation of turbulent kinetic energy at that scale by viscous forces is
\[ \epsilon_r = \nu \frac{\langle \delta U_r^2 \rangle}{r^2} \simeq \nu \langle \epsilon \rangle^{2/3} r^{-4/3}. \tag{145} \]
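When the turbulent kinetic energy and its dissipation at $r$ are re-scaled we have
\[ \frac{\langle \delta U_r^2 \rangle}{u^2} \simeq \left( \frac{r}{L} \right)^{2/3}, \qquad \frac{\epsilon_r}{\langle \epsilon \rangle} \simeq \left( \frac{\eta}{r} \right)^{4/3}. \tag{146} \]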
The scale at which the relative quantities are of the same order is thus $r \sim L^{1/3} \eta^{2/3}$, that is when $r = \lambda$.

5.3.2. Lagrangian statistics of fluid particle velocity increments

The Kolmogorov theory can also be used to predict the more general statistics of the field $\mathbf{v}$ (which is in spirit a relative Lagrangian velocity field), and for Lagrangian statistics of a fluid particle. Direct application reveals that, if one considers time intervals $dt$ in the inertial range, $\tau_\eta \ll dt \ll T$, the successive velocities $U(t)$ and $U(t + dt)$ are strongly correlated. Indeed, we have
\[ dU_L = U(t + dt) - U(t) = v(dt, 0) \;\Rightarrow\; \langle (dU_L)^2 \rangle \sim C_0 \langle \epsilon \rangle\, dt, \tag{147} \]
where $C_0$ is a constant. Writing $D_L(dt) = \langle (dU_L)^2 \rangle$ for the Lagrangian structure function, this implies for the velocity autocorrelation function $R_L(dt)$ that
\[ R_L(dt) = \frac{\langle U(t)\, U(t + dt) \rangle}{\langle u^2 \rangle} \sim 1 - \frac{D_L(dt)}{2 \langle u^2 \rangle} \sim 1 - \frac{C_0}{2} \frac{dt}{T} \sim 1. \tag{148} \]
For the statistics of fluid particle acceleration $A(t)$, the same hypotheses yield that
\[ \langle A(t + dt)\, A(t) \rangle \sim \frac{\langle \epsilon \rangle}{dt}, \qquad \langle A^2 \rangle \sim \frac{\langle \epsilon \rangle}{\tau_\eta}, \tag{149} \]
which implies for the acceleration autocorrelation function $R_A(dt)$ that
\[ R_A(dt) = \frac{\langle A(t + dt)\, A(t) \rangle}{\langle A^2 \rangle} \sim \frac{\tau_\eta}{dt} \ll 1. \tag{150} \]
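The contrast between Eqs. (148) and (150) can be illustrated with a short Python sketch. It relies on assumptions that are not stated at this point of the text: the exponential autocorrelation of a Langevin model (see Section 4.3) with stationary variance $\langle u^2 \rangle = C_0 \langle \epsilon \rangle T_L / 2$, and purely illustrative values for `C0`, `eps`, `T_L` and `tau_eta`.

```python
import math

# Illustrative parameters (assumptions, arbitrary consistent units)
C0 = 2.1          # assumed value of the constant C_0
eps = 1.0         # mean dissipation rate
T_L = 1.0         # Lagrangian integral time scale
tau_eta = 1.0e-4  # Kolmogorov time scale, chosen so that tau_eta << dt << T_L
u2 = 0.5 * C0 * eps * T_L   # stationary velocity variance of the assumed Langevin model

def R_L(dt):
    """Exponential velocity autocorrelation of the assumed Langevin model."""
    return math.exp(-dt / T_L)

def D_L(dt):
    """Lagrangian structure function, <(dU_L)^2> = 2 <u^2> (1 - R_L)."""
    return 2.0 * u2 * (1.0 - R_L(dt))

dt = 0.01 * T_L                        # a time increment inside the inertial range
ratio = D_L(dt) / (C0 * eps * dt)      # close to 1 if Eq. (147) holds
R_A_estimate = tau_eta / dt            # order-of-magnitude estimate from Eq. (150)

print(f"R_L(dt) = {R_L(dt):.4f}  (velocities still strongly correlated)")
print(f"D_L(dt) / (C0 eps dt) = {ratio:.4f}")
print(f"R_A(dt) ~ tau_eta/dt = {R_A_estimate:.2e}  (accelerations nearly uncorrelated)")
```

Under these assumptions the structure function is linear in $dt$ to within about half a percent, while the acceleration correlation estimate is two orders of magnitude below one.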
Consequently, for a given reference time scale $dt$ which belongs to the inertial range, the successive fluid particle accelerations are nearly uncorrelated while their velocities are still well correlated. In the spirit of the discussion of Section 3, we can conclude that the fluid particle acceleration can be regarded as a fast variable. The Kolmogorov theory therefore suggests that the fluid particle acceleration is close to a white-noise process (its spectrum is constant in the inertial range, $E_A(\omega) \sim \langle \epsilon \rangle$, where $\omega$ is the frequency). This is a crucial result for the development of stochastic models, either in the single-phase flow case, for fluid particles in Section 6.6, or in the two-phase flow case in Section 7.4.2.

5.3.3. Statistics of temporal velocity increments

The Kolmogorov theory can actually be applied to numerous variables, either of a Lagrangian or an Eulerian form. This includes the statistics of temporal velocity increments, the statistics of pressure correlations, vorticity, etc. They are discussed at length in the reference textbook of Monin and Yaglom [25]. Since the results of Kolmogorov predictions for temporal velocity increments will be used for two-phase flow modelling in Section 7.4.2, they are briefly presented here. For temporal velocity increments, one is interested in obtaining the statistics of the Eulerian velocity difference at a fixed point $\mathbf{x}_0$ at two instants $t_0$ and $t_0 + \tau$. For small time intervals $\tau$, this velocity difference
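can be expressed in terms of the velocity field $\mathbf{v}$ defined in the reference system moving with velocity $\mathbf{U}(t_0, \mathbf{x}_0)$ as
\[ \delta \mathbf{U}_\tau \simeq \mathbf{v}(-\mathbf{U}(t_0, \mathbf{x}_0)\, \tau, \tau). \tag{151} \]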
However, the situation is in fact more complicated than for the Eulerian increments or Lagrangian differences considered before. Indeed, the variable $\delta U_\tau$ depends explicitly on the reference velocity $U_0 = U(t_0, \mathbf{x}_0)$. Therefore, applying Kolmogorov theory, one can only conclude that there is a conditional probability distribution for $\delta U_\tau$. For a given value of $U_0$, the tensor $D^*_{ij} = \langle \delta U_{i,\tau}\, \delta U_{j,\tau} \rangle$ depends on the two functions $D^*_\parallel$ and $D^*_\perp$, which correspond to the longitudinal and transversal directions, as in Eq. (127). These two functions depend on $\tau$ and on $r = U_0 \tau$, and in the inertial range for $(\tau, r)$ we have
\[ D^*_\parallel = \langle \epsilon \rangle \tau\, \alpha_\parallel\!\left( \frac{|U_0|^2}{\langle \epsilon \rangle \tau} \right), \qquad D^*_\perp = \langle \epsilon \rangle \tau\, \alpha_\perp\!\left( \frac{|U_0|^2}{\langle \epsilon \rangle \tau} \right), \tag{152} \]
where $\alpha_\parallel$ and $\alpha_\perp$ are universal functions which have to be specified. The dependence on $U_0$ and the conditional nature of the previous results can be removed by resorting to the Taylor, or frozen-turbulence, hypothesis. In this hypothesis, it is assumed that the turbulent fluctuations are much smaller than the mean velocity, or $u_0 \ll \langle U_0 \rangle$, with $u_0$ the fluctuating part of $U_0$. Then, the turbulent fluctuations at a fixed point and over the time interval $\tau$ can be regarded as the transport of the turbulent fluctuation with a constant velocity $U_0$ without distortion. This frozen-turbulence assumption not only removes the dependence on $U_0$, but also allows the statistics of $\delta U_\tau$ to be written in terms of the velocity structure functions derived for the Eulerian space increments in Section 5.3.1 by replacing $r$ by $U_0 \tau$. This leads to
\[ D^*_\parallel = D_\parallel(U_0 \tau), \qquad D^*_\perp = D_\perp(U_0 \tau) \tag{153} \]
and thus in the inertial range to
\[ D^*_\parallel = C (\langle \epsilon \rangle U_0 \tau)^{2/3}, \qquad D^*_\perp = \tfrac{4}{3} C (\langle \epsilon \rangle U_0 \tau)^{2/3}. \tag{154} \]
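As a small worked illustration of Eqs. (153) and (154), the sketch below converts a time lag into a spatial separation through the frozen-turbulence substitution $r = U_0 \tau$ and evaluates the resulting temporal structure functions. The value `C = 2.0` and all other numbers are assumptions made for the example; the function name is hypothetical.

```python
def temporal_structure_functions(tau, U0, eps, C=2.0):
    """Longitudinal and transversal temporal structure functions, Eq. (154).

    C is the Kolmogorov constant of Eq. (140a); its default value here is an
    assumption made only for illustration.
    """
    r = U0 * tau                                  # Taylor (frozen-turbulence) substitution
    D_par = C * (eps * r) ** (2.0 / 3.0)
    D_perp = (4.0 / 3.0) * D_par
    return D_par, D_perp

# Illustrative values (assumptions): convection velocity, dissipation rate, time lags
U0, eps = 10.0, 1.0
D_par_1, D_perp_1 = temporal_structure_functions(1.0e-3, U0, eps)
D_par_2, D_perp_2 = temporal_structure_functions(2.0e-3, U0, eps)

print(f"D_par(tau)   = {D_par_1:.4e}, D_perp(tau)  = {D_perp_1:.4e}")
print(f"D_par(2 tau) / D_par(tau) = {D_par_2 / D_par_1:.4f}  (expected 2^(2/3))")
```

Doubling the time lag multiplies both functions by $2^{2/3}$, which is the temporal counterpart of the spatial two-thirds law under this hypothesis.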
5.4. Difficulties and refinements

5.4.1. High-order statistics

Soon after the publication and dissemination (through Batchelor’s work) of Kolmogorov hypotheses, measurements were performed on particular velocity statistics. The statistics that were measured, and that have been repeatedly analysed, are the flatness and skewness factors of velocity derivatives. The skewness and flatness factors are the normalized third and fourth moments of the velocity spatial derivatives. For the case of the gradient of the longitudinal velocity component along the longitudinal direction, say $\partial U_1/\partial x_1$, these factors are defined by
\[ S = \frac{\langle (\partial U_1/\partial x_1)^3 \rangle}{\langle (\partial U_1/\partial x_1)^2 \rangle^{3/2}}, \qquad K = \frac{\langle (\partial U_1/\partial x_1)^4 \rangle}{\langle (\partial U_1/\partial x_1)^2 \rangle^{2}}. \tag{155} \]
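The definitions in Eq. (155) translate directly into sample estimates. The sketch below is a hedged illustration on synthetic data, not on velocity-derivative measurements: a Gaussian sample and a Laplace sample (chosen only because its tails are heavier) stand in for the signal, and the function name `skewness_flatness` is hypothetical.

```python
import numpy as np

def skewness_flatness(x):
    """Sample skewness S and flatness K of a zero-mean signal, following Eq. (155)."""
    x = np.asarray(x, dtype=float)
    m2 = np.mean(x**2)
    S = np.mean(x**3) / m2**1.5
    K = np.mean(x**4) / m2**2
    return float(S), float(K)

rng = np.random.default_rng(0)            # fixed seed for reproducibility
gauss = rng.normal(size=200_000)          # Gaussian reference sample
heavy = rng.laplace(size=200_000)         # heavier-tailed sample (assumed stand-in)

S_g, K_g = skewness_flatness(gauss)
S_h, K_h = skewness_flatness(heavy)
print(f"Gaussian sample:     S = {S_g:+.3f}, K = {K_g:.3f}")
print(f"heavy-tailed sample: S = {S_h:+.3f}, K = {K_h:.3f}")
```

A Gaussian variable has $S = 0$ and $K = 3$, so measured values of $K$ well above 3 are a direct signature of non-Gaussian tails.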
They are just special cases of high-order velocity-derivative moments which are expressed by
\[ M_n = \frac{\langle (\partial U_1/\partial x_1)^n \rangle}{\langle (\partial U_1/\partial x_1)^2 \rangle^{n/2}}. \tag{156} \]
According to Kolmogorov theory, for each value of $n$, $M_n$ is a universal constant. In particular, Kolmogorov theory therefore predicts constant values of $S$ and $K$ regardless of the Reynolds number, above a possible threshold since Kolmogorov theory is applicable to high-Reynolds-number flows. The early measurements were performed by Townsend in 1947 and are reported in Batchelor [32]. Since then, additional data have been gathered by numerous researchers both experimentally and numerically. Experimental data collected in the atmosphere are already discussed in Monin and Yaglom [25] and further results have been obtained in laboratory flows. Data were compiled in Van Atta and Antonia [33] and are discussed in more detail in the recent review of Sreenivasan and Antonia [28]. With the recent development of DNS, similar results have also been found in numerical simulation [34]. All the results indicate that, contrary to Kolmogorov predictions, the skewness (actually $-S$, which is a positive number) and flatness factors increase with the Reynolds number, apparently without reaching a limit value. Measurements show that the flatness increases from $K \sim 4$ at low Reynolds number (for $Re_\lambda \sim 40$ in grid turbulence) to $K \sim 40$ at high Reynolds number (for $Re_\lambda \sim 10^4$ in the atmospheric boundary layer) and still seems to increase with higher $Re_\lambda$. A similar trend is observed for $-S$ although the rate of increase as a function of the Reynolds number is slower.

Similar discrepancies arise, not just for high-order statistics governed by scales in the dissipative range, but also for statistics pertaining to the inertial range. We have seen that in the inertial range, Kolmogorov theory predicts that the $n$th-order velocity structure functions are given by Eq. (142), and therefore that $F^n(r)$ scales as $r^{n/3}$ when $r$ belongs to the inertial range.
High-order statistics have been measured in detail by Anselmet et al. [35] and have since been confirmed by numerous works (see the discussion in the review of Sreenivasan and Antonia [36]). Measurements confirm the power-law scaling in the inertial range, that is $F^n(r) \sim r^{\zeta_n}$, but the measured exponents $\zeta_n$ differ from the Kolmogorov values $n/3$, showing a slower increase of $\zeta_n$ with $n$ than predicted by Kolmogorov theory, with a marked difference from $n \geq 5$ onwards. The scaling exponents $\zeta_n$ are called anomalous exponents.

5.4.2. Intermittency

The discrepancies between Kolmogorov predictions and experimental findings for high-order statistics of velocity increments are mainly attributed to the phenomenon of internal intermittency. Indeed, Kolmogorov theory was rapidly questioned by a remark which is believed to have been first formulated by Landau (see [5]). In the K41 picture, the energy transfer (taken as identical to the dissipation rate in the equilibrium range, see Fig. 6) was assumed to be given by the mean dissipation rate $\langle \epsilon \rangle$. However, the dissipation rate is directly expressed in terms of the velocity field as
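\[ \epsilon = \frac{\nu}{2} \left( \frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i} \right)^2. \tag{157} \]
Therefore, $\epsilon$ is also a random variable which fluctuates around its mean value $\langle \epsilon \rangle$. These fluctuations may depend upon the Reynolds number and involve large scales. As a consequence,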
statistics in the inertial range may depend on large scales and thus are not universal. For example, if we consider that $\epsilon$ takes different values in different regions, we can apply the K41 statistical results based on the value of $\epsilon$ that exists in each region. Since the structure functions are not linear functions of $\epsilon$ (with the exception of the third-order moment, which is thus a quantity on which intermittency has no effect), it is obvious that the result averaged over the different regions differs from the global one built on $\langle \epsilon \rangle$, since
\[ F^n(r) = C_n \langle \epsilon^{n/3} \rangle r^{n/3} \neq C_n \langle \epsilon \rangle^{n/3} r^{n/3} \quad \text{when } n \neq 3. \tag{158} \]
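The inequality in Eq. (158) can be made tangible with a two-region toy example. The sketch below assumes two regions of equal volume with prescribed local dissipation rates (illustrative numbers only) and compares $\langle \epsilon^{n/3} \rangle$ with $\langle \epsilon \rangle^{n/3}$ for a few orders $n$.

```python
# Local dissipation rates in two equal-volume regions (illustrative assumption);
# their mean is 1.0, so the 'global' K41 prediction is built on <eps> = 1.
eps_regions = [0.2, 1.8]

def mean(values):
    return sum(values) / len(values)

def local_vs_global(n):
    """Return (<eps^(n/3)>, <eps>^(n/3)) for structure-function order n."""
    local_average = mean([e ** (n / 3.0) for e in eps_regions])
    global_value = mean(eps_regions) ** (n / 3.0)
    return local_average, global_value

results = {n: local_vs_global(n) for n in (2, 3, 6)}
for n, (loc, glob) in results.items():
    print(f"n = {n}: <eps^(n/3)> = {loc:.4f}   <eps>^(n/3) = {glob:.4f}")
```

For $n < 3$ the region-averaged moment is smaller than the global one, for $n > 3$ it is larger, and only for $n = 3$ do the two coincide, in line with the remark that the third-order moment is insensitive to this effect.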
One can note that the distribution of the dissipation rate $\epsilon$ does not contradict the existence of power laws in the inertial range but rather the universal nature of the constants that appear in Eqs. (142) and (143). In 1962, Kolmogorov (as well as Obukhov) proposed the so-called refined similarity hypotheses, referred to as K62. They basically consist in stating that the statistical laws of K41 remain valid but should be regarded as conditional results for a fixed value of $\epsilon$ in a small time–space domain $V(t, \mathbf{x})$ centred around the point $\mathbf{x}$ where statistical laws are expressed, that is for $\epsilon_r$ defined as
\[ \epsilon_r(t, \mathbf{x}) = \frac{3}{4 \pi r^3} \int_{V(r)} \epsilon(t, \mathbf{x} + \mathbf{r}')\, d\mathbf{r}'. \tag{159} \]
Averaged results are then derived by integrating the conditional results over all possible values of the dissipation rate $\epsilon_r$. For example, the refined hypotheses predict that the Eulerian velocity structure functions are given by
\[ \langle (\delta U_r)^n \,|\, \epsilon_r = \epsilon \rangle = C_n (\epsilon r)^{n/3} \tag{160} \]
with $C_n$ a universal constant. The unconditional structure functions are then obtained as
\[ F^n(r) = \langle (\delta U_r)^n \rangle = \langle \langle (\delta U_r)^n \,|\, \epsilon_r = \epsilon \rangle \rangle = C_n \langle \epsilon_r^{n/3} \rangle r^{n/3}. \tag{161} \]
The refined hypotheses introduce the probability distribution of $\epsilon_r$ (or equivalently of spatial derivatives) in the picture. Furthermore, from Eq. (157), it is clear that $\epsilon$ is a small-scale variable and that $\epsilon$ and $\epsilon_r$ are mainly determined by small-scale behaviour. Thus, even though similar expressions are obtained in the K62 theory, the overall picture differs slightly from the one-way cascade (from the large to the small scales) in K41: small-scale statistics are determined from the $\epsilon$-distribution, which in turn results from small-scale behaviour. To work out global results, this probability distribution must be supplied.

The early measurements by Townsend reported in Batchelor [32] (see the discussion in Section 5.4.1) indicated that small scales deviate from Gaussianity and, as already presented, all subsequent studies have confirmed this trend. The pdf that has emerged for small-scale quantities has a typical feature: a higher central core and tails which decrease far less rapidly than for the Gaussian distribution (this point is taken up in the next Section 5.5). The deviation from Gaussianity is expressed in terms of an intermittency factor (through the measured flatness factor) which can be seen as the fraction of zero and non-zero values. This corresponds to an uneven spatial distribution of small-scale quantities: turbulence activity tends to concentrate in confined parts of the flow (meaning locally a high dissipation and the existence of smaller scales) surrounded by more quiescent regions. This intermittency becomes more and more marked as the Reynolds number is increased and is therefore related to the non-Gaussian character of small scales and the scale dependence of the probability distribution of $\delta U_r$.
J.-P. Minier, E. Peirano / Physics Reports 352 (2001) 1–214
In order to take intermittency into account, Kolmogorov proposed in 1962 a third hypothesis. It is assumed that $\epsilon_r/\langle\epsilon\rangle$ is a fluctuating quantity which is a function of $r/L$ and, by making the same assumption for $r = \eta$, that $\epsilon/\langle\epsilon\rangle$ is a function of $\eta/L$. More precisely, Kolmogorov assumed that $\epsilon_r$ (and thus $\epsilon$) is log-normally distributed with a variance given by
$$\sigma^2_{\ln(\epsilon_r)} = A + \mu \ln\left(\frac{L}{r}\right). \tag{162}$$
The log-normal distribution is the natural distribution for the result of a cascade process with multiplicative noise. In the refined picture K62, the probability distributions of the normalized velocity increment $\delta U_r/(\epsilon r)^{1/3}$ are no longer universal, but we still expect power-law scaling for the structure functions. Indeed, by using the log-normal assumption for $\epsilon_r$, the Eulerian velocity structure functions $F^n(r)$ scale as $F^n(r) \sim r^{\zeta_n}$ with scaling exponents given by
$$\zeta_n = \frac{1}{3}\, n \left[ 1 - \frac{1}{6}\,\mu\,(n-3) \right]. \tag{163}$$
For the second-order structure function, say for the longitudinal function $D(r)$, the correction is very small since the new exponent is now $\zeta_2 = 2/3 + 1/36$ (Eq. (163) gives $\zeta_2 = 2/3 + \mu/9$, so this value corresponds to $\mu = 1/4$) and thus differs from the K41 prediction only by $1/36$. Similarly, the modification to the $-5/3$ law of the energy spectrum $E(k)$ in the inertial range is minor. When compared to measured scaling exponents, the refined predictions perform better than the original K41 ones [35] and acceptable agreement is obtained for values of $n$ up to 10. For higher-order exponents, K62 values depart from measured ones, indicating that the refined turbulence theory can only claim to be an approximate description (but still a very reasonable one) and is not the ultimate theoretical word on turbulence. In spite of known theoretical deficiencies [5], the refined hypotheses have held up rather well in recent tests, both experimental [28] and numerical [37].

So far, the discussion on intermittency has mainly been focused on Eulerian statistics such as the (Eulerian) velocity structure functions $F^n(r)$. However, intermittency also affects Lagrangian statistics, and these effects are particularly relevant to the present work since the Lagrangian point of view will be central in the development of pdf models. In the Lagrangian formalism, the pressure field, or more precisely its gradient, plays an important role. Indeed, by writing the Navier–Stokes equation for a fluid particle
$$a_i(t) = \frac{dU_i(t)}{dt} = -\frac{1}{\rho}\frac{\partial P}{\partial x_i} + \nu\,\Delta U_i, \tag{164}$$
it is seen that the pressure field is closely related to the fluid particle acceleration statistics. The pressure is given by a Poisson equation
$$\Delta P = -\rho\,\frac{\partial U_i}{\partial x_j}\frac{\partial U_j}{\partial x_i}, \tag{165}$$
which shows that pressure is affected by large and small scales. Lagrangian statistics are much more difficult to measure experimentally and fewer data are available. Results have nevertheless been obtained [25,38]. Valuable information and insight have come from recent direct numerical simulations [29,39]. These results have also brought out discrepancies with the K41 description of turbulence statistics. According to Kolmogorov K41 predictions, the acceleration variance
varies as
$$\langle a^2 \rangle = a_0\, \frac{\langle\epsilon\rangle^{3/2}}{\nu^{1/2}} \tag{166}$$
with $a_0$ being a constant. Yet, in the DNS results of Yeung and Pope [29], it is reported that the acceleration variance increases with the Reynolds number $Re_\lambda$ as
$$\frac{\langle a^2 \rangle}{\langle\epsilon\rangle^{3/2}/\nu^{1/2}} \sim Re_\lambda^{1/2}. \tag{167}$$
This is due to intermittency effects, but Kolmogorov's refined hypothesis leads to a variation of $a_0$ as $a_0 \sim Re_\lambda^{3\mu/2}$, which is weaker than the one observed. The $Re_\lambda^{1/2}$ scaling has been confirmed in another analysis of DNS results [39] where it is also found, for $Re_\lambda$ up to 172, that $\langle(\nabla P)^2\rangle$ scales as $Re_\lambda^{1/2}$.

These issues of anomalous scaling, small scale intermittency, and thus of intermittency models, have received considerable attention and remain a major issue in most theoretical work. Detailed information and guidelines into the related vast literature on this specific subject can be found for example in recent reviews [1,28,40] and in textbooks [5]. Recent developments include for instance the multi-fractal formalism [5], ESS (extended self similarity) [41] or log-Poisson statistics [42] among other models.

5.5. Experimental and numerical results

Analysis and tests of Kolmogorov predictions were not limited to determining the scaling exponents. These exponents are of key theoretical importance but of less interest for practical modelling purposes. The different investigations have also helped to bring out useful information on the pdf of various quantities, both of large and of small scales. These results have been obtained from experiments [25,43–45] and from numerical simulations [46,47,29,39] which have been particularly useful for small scale quantities and for Lagrangian variables.

5.5.1. Experimental and numerical results on distributions

We present first one-point velocity pdfs, then velocity derivative and velocity increment distributions, and results for pressure-gradient pdfs.

Velocity distribution. First of all, both experimental and numerical approaches have confirmed that, for homogeneous turbulence, one-point velocities have a Gaussian distribution [46].
The normal distribution is the natural reference for systems with large numbers of degrees of freedom as a consequence of the Central Limit Theorem. Batchelor showed in 1953 [32] that the Gaussian nature of some band-limited eddies is equivalent to the independence of their Fourier coefficients over the corresponding wave number band. The distribution of one-point velocity involves the whole spectrum. Yet, we have seen that most of the energy (as measured by the energy density $E(k)$) is contained in the large scales, and thus that the pdf reflects mostly the large scale behaviour. These scales are excited by fluctuations of boundary conditions which are not controlled and it is therefore not too surprising to find an error law (Gaussian) distribution. Another way to describe this result is to say that the large scales do not exhibit (internal) intermittency. This is not the case for free shear flows, such as jets or mixing layers, which are known to be intermittent even on the large scale. However, this intermittency results from the mixing of a turbulent core with
Fig. 7. Typical forms of the distributions of velocity increments $X = \delta U_r$ for decreasing values of the separation distance $r$: of the order of $L$ (D), in the inertial range (C) and (B) and in the dissipative range (A). The curve (A) is identical to the distribution for velocity gradients.
laminar surroundings and is thus due to external forcing or conditions (this intermittency is referred to as ‘external intermittency’). For homogeneous turbulent flows, external intermittency is mainly absent, though this does not mean that large scale structures cannot be found (see below). The Gaussian nature of the large scale part of one-point velocity in homogeneous flows was well demonstrated by She et al. [47]. The authors used a Fourier band filtering method to isolate and quantify deviations from Gaussianity of separate ranges (large, inertial and dissipative ranges). They found that not only the full velocity field, but also the ‘inertial’ velocity field follows a Gaussian distribution [47] whereas the velocity field corresponding to the dissipative range displays deviations from Gaussianity with near exponential tails.

Velocity derivative distribution. The situation is quite different for velocity derivatives since we are now dealing with the other end of the spectrum (velocity gradients are mainly governed by small scale quantities). All available results, from the early experiments [43] to recent simulations [34], indicate that distributions of velocity gradients or derivatives deviate strongly from Gaussian pdfs. As already indicated in Section 5.4.2, the observed distribution is closer to an exponential law for the tails of the distribution and has a central core with higher probability compared to Gaussian values. When the separate contributions of different energy ranges (large scale, inertial or dissipative) are studied [47], the resulting distributions confirm that velocity gradients are indeed small scale phenomena since the distributions for the full field and for the dissipative range show similar behaviour whereas the distribution obtained for the velocity field corresponding to the inertial range is nearly Gaussian.
Velocity gradients are actually special cases of velocity increments $\delta U_r$ (when $r \to 0$), so it is useful to consider the distributions for $\delta U_r$ when $r$ varies from the large scale down to Kolmogorov scales. The outcome is that the distributions vary continuously from a Gaussian law when $r$ is of the order of the large scales to the near-exponential pdf characteristic of velocity gradients when $r \to \eta$. This behaviour, illustrated in Fig. 7, has been observed in many experiments and simulations [35,44,46,34].
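As a rough illustration of how a flatness factor separates the two limiting shapes described above, the following sketch compares synthetic Gaussian samples with synthetic symmetric-exponential (Laplace) samples. These samples are not turbulence data: they only stand in for the large-separation and small-separation limits, and the function name `flatness`, the sample size and the seed are assumptions made for this example.

```python
import random


def flatness(samples):
    """Flatness factor <x^4> / <x^2>^2 of the centred samples."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    return m4 / m2 ** 2


rng = random.Random(1)   # fixed seed so that the run is repeatable
n_samples = 200_000      # illustrative sample size (assumption)

# Stand-in for separations of the order of the large scales: Gaussian values.
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

# Stand-in for separations close to the Kolmogorov scale: exponential tails.
laplace = [rng.choice((-1.0, 1.0)) * rng.expovariate(1.0)
           for _ in range(n_samples)]

flat_gauss = flatness(gaussian)    # theoretical value for a Gaussian: 3
flat_laplace = flatness(laplace)   # theoretical value for a Laplace law: 6
print(f"Gaussian: {flat_gauss:.2f}, exponential tails: {flat_laplace:.2f}")
```

The flatness is used here only because it is a single number that is sensitive to the tails; a comparison of full histograms would show the same trend in more detail.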
Similar behaviour has been observed for the velocity increments in time of a fluid particle $\Delta U_\tau(t)$ (a Lagrangian quantity) by Yeung and Pope [29]. For time lags which are large enough, that is large with respect to the Kolmogorov timescale ($\tau \gg \tau_\eta$), the distribution is Gaussian. When smaller time lags are considered, the distributions change continuously, showing an increasingly non-Gaussian behaviour, and when $\tau \ll \tau_\eta$ the distribution presents the typical form known for $\partial U/\partial t$ ever since the early measurements of Van Atta and Chen [43].

Pressure and small scale invariants. Analysis of the distributions of the one-point pressure has shown a non-symmetric pdf [48,39]. For positive fluctuations, the distribution appears to be Gaussian and rather insensitive to variations of the Reynolds number $Re_\lambda$. For negative fluctuations, the form of the pdf tends towards an exponential distribution (or perhaps stretched exponential) and the tails of the distribution extend to larger negative values as the Reynolds number increases. The pdfs for pressure-gradients have been computed by Gotoh and Rogallo [39]. The resulting pdfs are isotropic and have the same form as those of velocity gradients or small scale quantities. The distributions differ markedly from the Gaussian one and have exponential tails (or stretched exponential forms). Therefore, the pressure-gradient field appears to be very intermittent with a variance that increases with $Re_\lambda$ as already indicated in Section 5.4.2.

Three small scale invariants can be formed from the tensor of the velocity gradients $G_{ij} = \partial U_i/\partial x_j$. By introducing the symmetrical tensor $S_{ij}$ and the anti-symmetrical tensor $R_{ij}$
$$S_{ij} = \frac{1}{2}\left(\frac{\partial U_i}{\partial x_j} + \frac{\partial U_j}{\partial x_i}\right), \tag{168}$$
$$R_{ij} = \frac{1}{2}\left(\frac{\partial U_i}{\partial x_j} - \frac{\partial U_j}{\partial x_i}\right), \tag{169}$$
we can define the three invariants
$$s^2 = 2\,S_{ij}^2 = 2\,\mathrm{Tr}(S S^{\perp}) = 2\,\mathrm{Tr}(S^2), \tag{170}$$
$$\omega^2 = 2\,R_{ij}^2 = 2\,\mathrm{Tr}(R R^{\perp}) = -2\,\mathrm{Tr}(R^2), \tag{171}$$
$$g^2 = G_{ij}^2 = \mathrm{Tr}(G G^{\perp}). \tag{172}$$
These invariants are directly related to dissipative quantities since
$$\epsilon = \nu s^2 \ \text{(mechanical dissipation)}, \qquad \Omega = \nu\omega^2 \ \text{(enstrophy)}, \qquad \varphi = \nu g^2 \ \text{(pseudo-dissipation)}. \tag{173}$$
In homogeneous turbulence, the mean values of these dissipative quantities are identical, $\langle\epsilon\rangle = \langle\Omega\rangle = \langle\varphi\rangle$. Then, by manipulating matrix identities, it can be shown that $\epsilon$, $\varphi$ and $\Omega$ satisfy the relations
$$\varphi = \frac{1}{2}\,[\epsilon + \Omega], \tag{174}$$
$$\epsilon - \varphi = \nu\,\frac{\partial U_i}{\partial x_j}\frac{\partial U_j}{\partial x_i}. \tag{175}$$
Kolmogorov's refined hypotheses (K62) assume that small scale quantities follow log-normal distributions. That hypothesis is often tested by calculating the pdf of the total or mechanical dissipation $\epsilon$, for which the agreement with log-normality is moderate [46,37]. However, in an interesting analysis, Yeung and Pope [29] showed that the distributions of the three small scale invariants $\epsilon$, $\varphi$ and $\Omega$ were not identical. It was found that while the pdfs of $\epsilon$ and $\Omega$ differ from the log-normal pdf, the pdf of the pseudo-dissipation $\varphi$ on the other hand is very close to being log-normal [29]. This could be related to the different physical meanings of the small scale invariants (see below and Section 5.6.2). Along the same line, a very interesting analysis of conditional pdfs of velocity increments was proposed by Gagne et al. [45]. Using experimental data, the authors showed that when conditional velocity increments $\delta U(r)\,|\,\epsilon(r) = \epsilon_0$ are considered, then the observed distributions are Gaussian regardless of whether the separation distance $r$ lies in the inertial range or in the dissipative range. This result was obtained only when $\epsilon(r)$ is defined as the energy transfer rate and not as the energy dissipation rate. From the results of Gagne et al., it appears that intermittent non-Gaussian statistics can be attributed essentially to fluctuations of the energy transfer rate. A strong point is that universal Gaussian behaviour can be obtained from quantities that essentially represent the energy transfer rate and not the energy dissipation rate (though the two have identical mean values).

5.5.2. Spatial structures

The presentation and the discussions developed up to now in this section have followed the statistical approach to turbulence. Since the early developments and Kolmogorov theory in particular, this statistical description has been the point of view mainly adopted. Yet, other points of view are possible, among which an interesting one is the geometrical approach.
In this approach, one attempts a geometrical description of turbulent flows as a collection of spatial structures. The underlying aim is to try to isolate elementary structures that would play for turbulent flows the role of molecules in classical statistical physics. This point of view was really initiated when it was realized that large structures can exist in turbulent flows, in particular in free shear flows such as mixing layers [49]. These structures are, however, large scale structures strongly related to the geometry of the flow and to external forcing. It is then difficult to come up with a universal theory for them. Another impetus for the geometrical approach was given by the discovery of the existence of typical small scale structures which are the result of Navier–Stokes dynamics and may present universal characteristics. These small scale structures have been mostly revealed by direct numerical simulations which have proved invaluable tools for their analysis [50,34,51]. Recent experiments have supported their existence and have helped to assess their characteristic features [52,53]. It was found that intense events related to small scales, especially small scale vorticity, were not evenly and randomly distributed in the flow domain, but organized in the form of vortex filaments. These filaments are also referred to as vortex tubes or worms. The analysis of these objects is still going on since open questions remain (see the discussion in the introduction of the article by Jimenez and Wray [51]). We simply summarize here present knowledge, drawing
in particular from the work of Jimenez and coworkers [34,51]. These authors have tried to determine the statistical properties of these geometrical objects. From the recent analysis of DNS results [51], it appears that
• a turbulent flow contains small scale intense structures which have the form of long tubes,
• these tubes have a radius of the order of the Kolmogorov length scale $\eta$ and a length of the order of the integral length scale $L$,
• the internal velocity differences within the tubes scale as $u'$, the large scale characteristic velocity fluctuations,
• the tubes occupy a volume fraction which seems to scale as $Re_\lambda^{-2}$ (or at least as $Re_\lambda^{-a}$ with $a \sim 2$).

Other typical properties can be deduced from the ones above. The typical vorticity of the tubes scales as
$$\omega \sim \frac{u'}{\eta} = \frac{u'}{\lambda}\,\frac{\lambda}{\eta} = \omega'\,\frac{\lambda}{\eta} \sim \omega'\, Re_\lambda^{1/2} \tag{176}$$
with $\omega'$ being the rms magnitude for the flow as a whole. With respect to the existence of these tubes, the key questions are: how relevant are they for the dynamics of the flow? What role do they play in the energy budget? How are they related to intermittency? For these purposes, the crucial step is the evaluation of the volume fraction $\alpha_{vt}$ occupied by the vortex filaments. From the estimation that $\alpha_{vt} \sim Re_\lambda^{-2}$, it appears that the vortex tubes give negligible contributions to the kinetic energy as well as to the total dissipation and the total enstrophy of the flow. In terms of statistics, the effects of the vortex filaments on velocity gradient statistics and on velocity structure functions are negligible for low-order statistics (say for $n \le 4$) but dominant for high-order statistics ($n > 4$), as discussed in the conclusion of Jimenez and Wray [51]. The interesting conclusion of these works is that the observed deviations of high-order statistics from Kolmogorov predictions could be attributed to the existence of special small scale intense structures, such as vortex filaments.

5.6. Simplified images of turbulence and Lagrangian models

In this subsection, we put forward a simple description of turbulence and we present a Lagrangian point of view in which turbulence is addressed as an N-particle problem. The description tries to include some of the recent ideas or findings on turbulent flows while remaining simple enough so that tractable models can be derived (or justified) from it. This somewhat crude image is therefore meant as a compromise between the complexity of some advanced theoretical concepts in turbulence and the necessary simplifications inherent to most models.
The purpose of this subsection is to show how some of these concepts can be used even for modelling issues and also to introduce a Lagrangian modelling point of view that will be adopted in the following sections. As such, this subsection provides a link between the theoretical-oriented presentation of the physics of turbulence and the modelling-oriented issues that will be addressed in the rest of this paper.
5.6.1. A two-fold picture of turbulence

We have seen that most of the difficulties in the theory of turbulence (in particular in Kolmogorov theories) come from the phenomena of small scale intermittency. Intermittency is responsible for the deviations of statistics from Gaussianity and can be seen as a manifestation of an underlying order in turbulent flows. The recent developments described in Section 5.5.2 have revealed that this order could be related to intense vortex filaments. Of course, these vortex tubes are just one of several typical structures that are bound to exist in turbulent flows. They correspond to the most intense small scale vorticity events and it is quite probable that there exists a whole range of structures with different properties. Yet, they suggest the following model picture of turbulence. In this simple image, turbulent flows are described as being made up of two completely different regions:
(a) a random and structureless background which occupies most of the flow (its volume fraction $\alpha_{bg}$ is very close to one). In this random background, correlations between small scale quantities at different locations, such as $\langle\omega^2(\mathbf{x})\,\omega^2(\mathbf{x}+\mathbf{r})\rangle$, are small and the overall statistics are well described by Kolmogorov K41 theory (or perhaps K62 refinements);
(b) a collection of intense vortex filaments which occupy only a very small volume fraction (of the order of $Re_\lambda^{-3/2}$ or $Re_\lambda^{-2}$). These vortex tubes correspond to regions of strong spatial correlations and are responsible for the deviations of high-order moments from Kolmogorov predictions.
In that simple picture, it is for example assumed that the anomalous $Re_\lambda^{1/2}$ scaling of the variance of the pressure-gradient terms is mainly due to the contributions of the intense coherent structures.

5.6.2. A Lagrangian point of view

From this point of view, we consider turbulent flows as being made up of a large number N of ‘fluid particles’.
This corresponds already to a discrete vision of a turbulent flow rather than a continuous one. This point of view is a statistical one in which one tries to describe the N-point pdf and from it the discrete approximation of statistical quantities. For the sake of simplicity, each particle can be thought of as having the same mass, with location $\mathbf{x}^n$ and velocity $\mathbf{U}^n$. The N differential equations that give the evolution in time of the Lagrangian properties are written from the Navier–Stokes equations as
$$\frac{dx_i^n}{dt} = U_i^n, \tag{177}$$
$$\frac{dU_i^n}{dt} = F_{p,i} + F_{v,i}, \tag{178}$$
for $n = 1, \ldots, N$ and where $F_p$ and $F_v$ stand for the pressure-gradient and the viscous forces, respectively,
$$F_{p,i} = -\frac{1}{\rho}\frac{\partial P}{\partial x_i}, \qquad F_{v,i} = \nu\,\Delta U_i. \tag{179}$$
The pressure-gradient and viscous terms that enter the time evolution equation for one particle labelled $n$ depend upon the values of all other particles, so the set of equations describes
indeed a coupled N-particle problem. From Kolmogorov theory, we know that $F_v$ is actually a short-range force whereas $F_p$ is a force which has both short- and long-range effects. The pressure-gradient force is a crucial term, particularly in a Lagrangian formalism since it connects even widely separated particles. This can be further brought out by reformulating the Poisson equation satisfied by $P$, Eq. (165). Using the three small scale invariants $\epsilon$, $\varphi$ and $\Omega$ defined in Section 5.5.1 and the relations given in Eqs. (174) and (175), we can write
$$\Delta P = \frac{\rho}{\nu}\,(\varphi - \epsilon) = \frac{\rho}{2\nu}\,(\Omega - \epsilon) = q, \tag{180}$$
where $q$ denotes here the source term of the Poisson equation, and by inversion of the Poisson equation, the pressure-gradient force is given by
$$F_p = -\frac{1}{4\pi\rho}\int \frac{1}{|\mathbf{x}-\mathbf{y}|^2}\, q(t,\mathbf{y})\, d\mathbf{y}. \tag{181}$$
This formulation reveals an interesting analogy with classical electrostatics. It is seen that the fluid particles interact through Coulombian elementary forces where $q$ plays the role of the electric charge. At each point, and thus for each particle, the difference between the enstrophy $\Omega$ and the mechanical dissipation $\epsilon$ creates a non-zero charge at that point. When the enstrophy is in excess over the mechanical dissipation, the charge is positive and this leads to attractive forces between the particles. It is thus seen that stable coherent structures must correspond to low-pressure events (or points where $\Delta P > 0$) and to high vorticity regions, which are indeed found in the vortex filaments.

The particle or Lagrangian point of view can help to clarify the different meanings of the three small scale invariants. Let us consider the total energy $W$ received by a particle as the result of the work performed by the surface forces acting on that particle ($n_j$ is a normal unit vector directed outward)
$$W = \int_S U_i\,\tau_{ij}\,n_j\,dS = \int_V \frac{\partial}{\partial x_j}\,(U_i\,\tau_{ij})\,dV, \tag{182}$$
where $\tau_{ij}$ is the stress tensor
$$\tau_{ij} = -P\,\delta_{ij} + \rho\nu\left(\frac{\partial U_i}{\partial x_j} + \frac{\partial U_j}{\partial x_i}\right). \tag{183}$$
The total work (per unit mass) is thus given by the sum of two contributions
$$W = W_1 + W_2 = U_i\,\frac{DU_i}{Dt} + \epsilon. \tag{184}$$
The first term, $W_1$, represents the change of kinetic energy of the particle while the second term, $W_2$, represents the change of internal energy of the particle. We have therefore that
$$W = \varphi + \epsilon, \tag{185}$$
$$\Delta P = \frac{\rho}{\nu}\,[\varphi - \epsilon], \tag{186}$$
which suggests that $\varphi$ and $\epsilon$ do not play the same physical role: $\epsilon$ is the rate of increase of a particle's internal energy while $\varphi$ (actually its difference with $\epsilon$) appears as the rate of energy transfer between fluid elements or particles. This could be related to the observed different distributions of these invariants in homogeneous turbulence as discussed in Section 5.5.1.
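The electrostatic analogy of Eq. (181) can be made concrete with a discrete sum over particles. The sketch below is only an illustration under stated assumptions: Eq. (181) is read in its vector Coulomb form, with kernel $(\mathbf{x}-\mathbf{y})/|\mathbf{x}-\mathbf{y}|^3$, the integral is replaced by a sum over particles of charge $q_k$ and volume $V_k$, and the function name `pressure_gradient_force` and all numerical values are hypothetical.

```python
import math


def pressure_gradient_force(x, positions, charges, volumes, rho=1.0):
    """Discrete Coulomb-like sum (assumed vector reading of Eq. (181)):

    F_p(x) = -(1 / (4 pi rho)) * sum_k q_k V_k (x - y_k) / |x - y_k|^3
    """
    force = [0.0, 0.0, 0.0]
    for y, q, vol in zip(positions, charges, volumes):
        sep = [xi - yi for xi, yi in zip(x, y)]
        dist = math.sqrt(sum(s * s for s in sep))
        if dist == 0.0:
            continue  # skip the self-interaction term
        weight = q * vol / (4.0 * math.pi * rho * dist ** 3)
        for i in range(3):
            force[i] -= weight * sep[i]
    return force


# One particle with a positive charge (enstrophy in excess over dissipation)
# located at the origin, acting on a fluid element at unit distance.
force = pressure_gradient_force(
    x=(1.0, 0.0, 0.0),
    positions=[(0.0, 0.0, 0.0)],
    charges=[2.0],
    volumes=[0.5],
)
print(force)  # the x-component is negative: the force points towards the charge
```

With a positive charge the force on the neighbouring element points towards the source, which is the attractive behaviour described in the text; the magnitude decays as the inverse square of the distance.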
Fig. 8. Sketch of the force created by one particle on other fluid elements.
It may be instructive (and perhaps also amusing) to re-derive usual Kolmogorov estimations from this Lagrangian point of view. For this purpose, let us evaluate the energy transfer through the work performed by the pressure-gradient force. We consider one particle, which has a small volume, say $V_0$, an associated charge $q_0$, a characteristic velocity $u_0$ and a typical length $d_0$. These ‘particles’, which are in fact small fluid elements having a certain coherence, are not stable. They are subject to viscous stabilizing effects which act with a typical time scale of the order of $\tau_v \sim d^2/\nu$ and to perturbations which act with a typical time scale of the order of $\tau_p \sim d/u$. Particles are only stable when $\tau_v \le \tau_p$, that is when the Reynolds number based on the particle length $d$ is less than one. In other words, particles are only stable when $d \sim \eta$. We therefore associate to each particle a characteristic lifetime, say $\tau(d)$. The single particle that we are considering (and which is labelled with the index 0) creates a force on all fluid elements, see Fig. 8. The force density at the distance $r$ from the reference particle is given by
$$f(r) = -\frac{1}{4\pi\rho\, r^2}\, q_0 V_0 \tag{187}$$
and a small volume element $dV$ around the point M is subject to the force
$$dF(r) = f(r)\, dV. \tag{188}$$
The small volume element will then have a typical velocity induced by the force created by the reference particle, say $u(r)$, that we can write as
$$u(r) = f(r)\, \tau_0, \tag{189}$$
where $\tau_0$ is the characteristic time during which the force will be applied on the small volume $dV$ at location M. The work performed by the force is
$$dP = dF(r)\, u(r) = \tau_0\, f(r)^2\, dV \tag{190}$$
and the total work, or in other words the total energy lost by the reference particle labelled 0 per unit mass, becomes
$$W = \frac{1}{V_0}\, P = \tau_0\, q_0^2\, \frac{V_0}{(4\pi\rho)^2} \int_{d_0}^{\infty} \frac{1}{r^4}\, 4\pi r^2\, dr. \tag{191}$$
We then obtain the simple expression of the energy transfer rate for a given fluid particle (and dropping now the label 0)
$$W \sim \tau(d)\, d^2 q^2. \tag{192}$$
Since the ‘charge’ is also expressed, up to a constant factor, by $q \sim \omega^2 - s^2$ by Eqs. (174) and (180), we can write
$$W \sim \tau(d)\, d^2 \omega^4. \tag{193}$$
The simplest estimations for $\omega$ and $\tau(d)$, for particles with a typical length greater than the Kolmogorov scale, are
$$\omega \sim \frac{u}{d}, \qquad \tau(d) \sim \frac{d}{u}. \tag{194}$$
This yields $W \sim u^3/d$ and, if we assume that the energy transfer by the pressure-gradient force is of the order of the characteristic energy transfer rate for the flow as a whole, thus of the order of $\langle\epsilon\rangle$, we finally retrieve Kolmogorov scaling, that is
$$\frac{u^3}{d} \sim \langle\epsilon\rangle. \tag{195}$$
The simple two-fold description of turbulence put forward in the preceding subsection and the derivation of the energy lost by work of the pressure-gradient force can be used to suggest modelling ideas for this pressure-gradient term. If we leave out the (small) region involving the coherent structures and if we consider only the structureless background, then with Eq. (181) and using the assumption that the charge density $q(\mathbf{x})$ is nearly uncorrelated in space, $\langle q(\mathbf{x})\, q(\mathbf{x}+\mathbf{r})\rangle \ll 1$, we obtain that the pressure-gradient force $F_p$ should be close to a Gaussian random variable as a consequence of the Central Limit Theorem. Using a different terminology, we can say that in the random background, pairwise interactions between fluid particles are weak and are similar to the ‘grazing collisions’ of particle physics for which the Boltzmann equation is sometimes replaced by a Fokker–Planck equation. These two points, loss of energy through the transfer towards all other fluid particles and a near Gaussian form when coherent structures are ignored, suggest considering the simple mechanical models often retained for the motion of a charged particle in classical electrodynamics
$$m\,\frac{d^2 x}{dt^2} + \gamma\,\frac{dx}{dt} + kx = F_{ext}. \tag{196}$$
In classical electrodynamics, a charged particle which accelerates (in a general sense this means its velocity is not constant) creates an electromagnetic field and therefore a force field that acts on all other charged particles. This force field corresponds to a transfer of energy from the charged particle that we are considering to the others. In that sense, it is said that an accelerating charged particle ‘radiates’ energy: its own energy $E$ decreases due to the transfer written
as $-dE/dt$ to all other particles. In the simple mechanical model, the friction coefficient $\gamma$ in Eq. (196) is meant to represent the radiative damping coefficient, thus
$$\gamma = -\frac{1}{E}\,\frac{dE}{dt}. \tag{197}$$
We can use similar ideas for the pressure-gradient term, building on the above-mentioned analogy with electrodynamics. In a homogeneous flow, where the mean velocity $\langle U\rangle$ is zero, this analogy suggests modelling $F_p$ by
$$F_p = -\gamma\, U + f_p, \tag{198}$$
where the friction coefficient is expressed as
$$\gamma \sim \frac{W}{u^2}, \tag{199}$$
with $W$ the energy transfer and $u^2$ the particle kinetic energy. Using the classical estimations in the K41 picture, $W \sim \langle\epsilon\rangle$, we obtain
$$\gamma \sim \frac{\langle\epsilon\rangle}{u^2} \sim \frac{1}{T_L}, \tag{200}$$
where $T_L$ is the Lagrangian integral time scale defined in Section 5.2 (see also Eq. (134)). Furthermore, we can take the fluctuating term $f_p$ as a Gaussian random term in the structureless background as proposed above. We then obtain a model equation which has the form of a Langevin equation and, generalising to non-zero mean velocities, this model can be written as
$$F_p \sim -\frac{1}{T_L}\,(U - \langle U\rangle) + G, \tag{201}$$
with $G$ being a Gaussian white noise. These ideas are of course just simple local models and should be regarded as crude representations. Nevertheless, they are not without connection to the physics of turbulence and they still retain some of the essential aspects of Kolmogorov theory. It will be seen in the following sections that they form the basis of present pdf models.

5.7. Closing remarks

A great body of work has accumulated over the years on turbulence, but the subject has retained its mysteries. At this point, it may be useful to repeat that different objectives can actually be pursued when dealing with turbulent flows. To somewhat simplify the picture, we can distinguish between a theoretical point of view and a modelling point of view. From the theoretical point of view, one looks for universal features and the concern is mainly with small scales, scaling exponents, intermittency and high-order statistics. The central issue is then to remedy the deficiencies of Kolmogorov theories. From a modelling or ‘engineering’ point of view, the emphasis is more on mixing and transport phenomena which are mainly governed by the large energy-containing scales. The two points of view are not completely separated and, as explained in the introduction of the paper, the purpose of the present work is actually to find a middle-road approach and
a compromise between the two. An example may be the first and simple modelling steps presented in Section 5.6.2. It is believed that theoretical improvements or new results, such as the determination of small scale statistics, can help even application-oriented models. However, from the modelling point of view, it is clear that high-order scaling exponents are not an issue of first importance. The most important aspects are well identified in Kolmogorov theories and in the description of the energy cascade. Even the first K41 theory does not describe a system in thermodynamical equilibrium, but rather a system maintained out of equilibrium by a constant flux. An issue of first importance is to determine the energy transfer rate from the large to the dissipative scales. The fluctuations of this variable are important for practical concerns as well as for theoretical aspects (see the discussion of recent results [45] in Section 5.5.1). The determination of the energy transfer rate is a very difficult point since the large scales are not isotropic and are not universal. It is thus seen that the turbulence problem is not, in that respect, only due to the large number of degrees of freedom but to the strong interactions between them, in particular between the large scales. Another main difficulty is that, at the moment, no small parameter has been identified from which successive approximations could be rigorously derived. Depending on the point of view, small parameters do appear but they only concern partial aspects of the whole problem. For example, by adopting a Lagrangian formalism, we have seen that there exists a separation of scales (of characteristic time scales) between fluid particle velocity and acceleration. This suggests already a Lagrangian point of view and the small parameter $T_a/T_L$ (or even $\tau_\eta/\Delta t$). Yet, deriving a local or one-particle model remains difficult since one has to model (for instance as in Section 5.6.2) the interactions of the large scales.
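As a concrete illustration of the kind of local model referred to here, the Langevin form of Eq. (201) can be integrated for a single particle velocity. The sketch below is a minimal example under stated assumptions: a one-dimensional velocity, a constant mean velocity, a constant noise intensity `noise_intensity` (the text does not specify the amplitude of $G$, so this parameter is hypothetical), and an exact update of the resulting Ornstein–Uhlenbeck process over each time step.

```python
import math
import random


def simulate_langevin(n_steps, dt, t_lagrangian, mean_velocity,
                      noise_intensity, seed=0):
    """Integrate dU = -(U - <U>) / T_L dt + B dW with an exact one-step update.

    B (noise_intensity) is an assumed constant; the stationary variance of
    this process is B^2 * T_L / 2.
    """
    rng = random.Random(seed)
    decay = math.exp(-dt / t_lagrangian)
    stationary_var = 0.5 * noise_intensity ** 2 * t_lagrangian
    kick = math.sqrt(stationary_var * (1.0 - decay ** 2))
    u = mean_velocity
    path = []
    for _ in range(n_steps):
        # Relaxation towards the mean velocity plus a Gaussian increment.
        u = mean_velocity + (u - mean_velocity) * decay + kick * rng.gauss(0.0, 1.0)
        path.append(u)
    return path


path = simulate_langevin(n_steps=200_000, dt=0.1, t_lagrangian=1.0,
                         mean_velocity=2.0, noise_intensity=math.sqrt(2.0))
sample_mean = sum(path) / len(path)
sample_var = sum((u - sample_mean) ** 2 for u in path) / len(path)
print(f"mean ~ {sample_mean:.2f}, variance ~ {sample_var:.2f}")
```

With these values the stationary variance is $B^2 T_L/2 = 1$, so the sample statistics should stay close to a mean of 2 and a variance of 1; the velocity correlation decays over the time scale $T_L$, which is the only physical input of this crude model.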
These models are then uncontrolled approximations or, in other words, guesses. On the other hand, one should not overlook their practical interest, as we will see in the next sections. The discussion related to large-scale direct interactions is taken up again in Section 9.3.2.

6. One-point pdf models in single-phase turbulence

The mathematical tool of stochastic processes (particularly, diffusion processes) and the general modelling ideas that have been developed so far can now be applied to turbulent flow modelling issues. The objective of the present paper is to formulate a pdf approach for two-phase flows, but it is useful to consider only single-phase turbulent flows as a first step before addressing the two-phase flow problem. Presentations of the pdf approach for single-phase turbulent reactive flows can be found in previous works, for example [6,10,8,9] or [54]. Therefore, we do not try to give a comprehensive presentation, but rather choose to insist on certain aspects of the approach. In particular, the emphasis in this section is put on
(i) clarifying the purpose of a pdf description in turbulence, with respect to other particle methods and with respect to classical approaches in turbulence,
(ii) detailing the formalism, especially the notions of Lagrangian and Eulerian pdfs,
(iii) explaining the reasoning behind the choice of a Lagrangian description, of a precise state vector and of diffusion processes as models,
(iv) outlining the physical content of present state-of-the-art stochastic models and their range of assumed validity.
Some of these aspects are not always covered in the previously cited works, or may complement them. Details on the formalism can already be found in Pope [6] but are presented here from a slightly different point of view. The general formalism and the stochastic models used for the two-phase flow modelling issue are actually extensions of those given in this section. In the following, $\langle \cdot \rangle$ will denote the classical Eulerian averaging operator (whose exact mathematical definition will be properly given in Section 6.4).

6.1. Motivation and basic ideas

As explained in Section 5, a turbulent flow displays far too many degrees of freedom: the range of scales can cover several decades, i.e. $L/\eta \gg 1$ and $T/\tau_\eta \gg 1$, and is therefore too extended to allow a direct calculation even with computers of the foreseeable future. Thus, although the basic equations are deterministic, the first modelling step consists in adopting a statistical point of view and in regarding the solutions of the basic equations as being random or stochastic processes. This is actually a classical step in statistical physics, where probabilistic arguments are used in (deterministic) systems involving a very large number of degrees of freedom. This step goes hand in hand with the fact that we are not interested in the full description of the flow in time and space but rather in some limited information about some characteristics of the fields. In the language of mathematics, we would say that we are interested in a weak approximation (see Section 2.1.2). In the language of statistical physics, we are interested in a reduced or contracted description of continuous fields.

6.2. Coarse-grained description and stochastic modelling

At that point, it is worth recalling that the basic equations, the Navier–Stokes equations, have precisely been obtained through such a procedure (see e.g. Balescu [22]), and this is an outstanding example of the success of a reduced description.
Their derivation follows the steps:

molecular equations: microscopic description
↓
Boltzmann equation: kinetic description
↓
Navier–Stokes equations: macroscopic description.

In this procedure, the microscopic equations of motion are not treated exactly but are bluntly replaced by a probabilistic model (where the assumption of molecular chaos is input). Once this is done, the macroscopic equations of motion are derived from the kinetic equation. The downward orientation of the arrows indicates a contraction of the information and an evolution towards less detailed descriptions of the system. The kinetic equation approach is called a mesoscopic description.

In turbulence modelling, the need to limit ourselves to a reduced set of degrees of freedom shows that we are faced with a similar task. The difference lies in the starting point, which is now the very hydrodynamical equations. What is regarded as 'macroscopic' in the classical kinetic approach is now taken as being the 'microscopic' level. The procedure can
be sketched as

Navier–Stokes equations: microscopic description
↓
'model' equations: mesoscopic description
↓
mean Reynolds equations: macroscopic description.

Here, the adjectives microscopic and macroscopic lose their meaning of small and large from a geometrical point of view and refer more to fine and coarse descriptions in terms of the number of degrees of freedom involved. The idea of the present approach is to introduce a model at some intermediate level between the exact Navier–Stokes equations and those for the mean moments in which we are interested.

The construction of a reduced state vector can be achieved by a coarse-graining procedure where the system is described on a 'large enough scale' to eliminate some degrees of freedom. Information is therefore lost and this lack of complete knowledge will be reflected by the use of a stochastic description for the remaining degrees of freedom. Resorting to a statistical description is common to all turbulence models. However, the key point is that we will not try to write directly a model in terms of macroscopic variables but, following the above sketch, we will try to introduce the model at an upstream level or, using the same terminology as above, at a mesoscopic level. In other words, the aim is to have a pdf description through the field of probability density functions, and not simply the knowledge of a limited number of moments at each point (usually not more than two).

6.3. Relations to classical approaches

First of all, present pdf models should not be confused with other particle methods, such as Vortex Methods or S.P.H. (smoothed particle hydrodynamics) in their basic forms. Indeed, in random vortex methods one interprets the Navier–Stokes equations (written in the vorticity form) as pdf equations to be solved by a particle method.
This corresponds exactly to the probabilistic interpretation of PDEs (here Navier–Stokes) mentioned in Section 2, and the particles used in this method are 'computational particles' whose distributions give the solutions of the Navier–Stokes equations. As such, this is equivalent to a direct numerical simulation with particles.

It is also useful to contrast Lagrangian stochastic models with classical turbulence approaches. The classical approach starts by applying some averaging or filtering operator to the exact equations. We obtain exact but unclosed 'mean' equations in which closure relations are then introduced. Closed 'mean' equations result. For example, in the moment approach one writes the mean Navier–Stokes equations (we assume constant density flows for the sake of simplicity)

\[
\frac{\partial \langle U_i \rangle}{\partial t} + \langle U_j \rangle \frac{\partial \langle U_i \rangle}{\partial x_j} = -\frac{1}{\rho} \frac{\partial \langle P \rangle}{\partial x_i} + \nu \Delta \langle U_i \rangle - \frac{\partial \langle u_i u_j \rangle}{\partial x_j}, \tag{202}
\]

in which the unknown Reynolds stress tensor $\langle u_i u_j \rangle$ has to be expressed. The mean momentum equation can be closed by assuming a constitutive law for the Reynolds stresses (this is the
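spirit of the $k$–$\varepsilon$ model) or by writing their own exact equation

\[
\frac{\partial \langle u_i u_j \rangle}{\partial t} + \langle U_k \rangle \frac{\partial \langle u_i u_j \rangle}{\partial x_k} = -\frac{\partial \langle u_i u_j u_k \rangle}{\partial x_k}
\underbrace{- \langle u_i u_k \rangle \frac{\partial \langle U_j \rangle}{\partial x_k} - \langle u_j u_k \rangle \frac{\partial \langle U_i \rangle}{\partial x_k}}_{P_{ij}}
\underbrace{- \frac{1}{\rho} \left\langle u_i \frac{\partial p}{\partial x_j} \right\rangle - \frac{1}{\rho} \left\langle u_j \frac{\partial p}{\partial x_i} \right\rangle}_{\Pi_{ij}}
+ \underbrace{\nu \langle u_i \Delta u_j \rangle + \nu \langle u_j \Delta u_i \rangle}_{\varphi_{ij}}. \tag{203}
\]

The closure problem has only been shifted to the rate of change of $\langle u_i u_j \rangle$ and one has now to come up with closures for the triple correlation $\langle u_i u_j u_k \rangle$, the pressure rate of strain correlation $\Pi_{ij}$ and the work of the viscous forces $\varphi_{ij}$. The latter can be decomposed as

\[
\varphi_{ij} = \nu \frac{\partial^2 \langle u_i u_j \rangle}{\partial x_l^2} - 2 \nu \left\langle \frac{\partial u_i}{\partial x_l} \frac{\partial u_j}{\partial x_l} \right\rangle, \tag{204}
\]

where the last term represents twice the pseudo-dissipation tensor written as $\varepsilon_{ij}$. Much of the current research in second-order modelling so far has focused on the closure of the $\Pi_{ij}$ terms (whether or not this is fully justified). Yet, closure expressions for the different $\Pi_{ij}$ are not only difficult to obtain (since limited information is available) but, as a result of the approach itself, it is also difficult to be sure that they are coherent among themselves. This is called the realizability problem. More than the particular details of certain models, it is important to underline the steps that are taken in this classical approach:

Navier–Stokes equations
↓
open or unclosed mean Reynolds equations → introduction of a model: closed mean equations.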
Therefore, closure is attempted directly at the macroscopic level and, when it is performed, available information is of course strictly limited to the very macroscopic variables that have been explicitly retained in the second step of the procedure. Compared to this, in the pdf approach, the introduction of a model is made at an 'upstream' level, as indicated in the sketch in Section 6.2, where far more information is still available (since we model a probability density function) and has not yet been eliminated. As a result, pdf approaches are particularly attractive if
(i) detailed information is needed to address a problem, at least more than the mere knowledge of a few moments,
(ii) it is too difficult to close directly at the macroscopic level through constitutive relations between macroscopic variables.
The first item is related to the fact that, depending upon the problem involved, one may be interested in more than one or two mean values. A clear example of the second item will be given in Section 6.5.3 for reactive flows. The difference can be further illustrated by the
following representation of the pdf approach

Navier–Stokes equations
↓
open or unclosed pdf equations → introduction of a model: closed pdf equations
                                                           ↓
                                                           closed mean equations,

where, as above, a horizontal arrow stands for the introduction of a model closure at the same level of description, while a vertical arrow means that we move from one level of description to a coarser one.

6.4. Probabilistic description of continuous fields

We have referred to classical kinetic theory and to the Boltzmann approach in order to bring out the ideas and the purpose of present pdf models for turbulent flows. However, the probabilistic description needed in the pdf approach is more complex than in the Boltzmann approach and it requires that the notions of particles and fields, or Lagrangian and Eulerian points of view, be clearly specified.

In classical kinetic theory, there is a clear difference between the meanings of the notions of particles and of fields and the way in which they enter the theory. We are dealing with an ensemble of 'real' particles and the system on which we apply a probabilistic description is discrete. It is therefore not surprising to use a particle point of view in the probabilistic description and to handle a Lagrangian probability density function or distribution function. There is no reason in that context to use a different point of view. On the other hand, the resulting local mean values or moments derived from the pdf (or kinetic) description are fields: the solutions of the Navier–Stokes equations. Thus, there is a clear difference between the two notions of particles and of fields in classical kinetic theory. The microscopic level, which is made up of discrete particles, is described from a Lagrangian point of view, and the macroscopic level, which is made up of the fields which are solutions of the Navier–Stokes equations, is described from an Eulerian point of view [22].

In the present pdf approach for turbulent flows, there is no such clear-cut separation between the two notions. We do not give a statistical description of a discrete system but instead a probabilistic description of fields using stochastic processes.
This can create some confusion with respect to the notion of Lagrangian and Eulerian descriptions, or particles and fields. Indeed, in this case, the system considered (the microscopic level, now represented by the solutions of the Navier–Stokes equations) can be described either from a Lagrangian or from an Eulerian point of view. The latter appears natural in the context of continuum mechanics, but the former can also be adopted to characterize the system. As a result, the probabilistic description can be performed either in an Eulerian or in a Lagrangian framework. Yet, the outcome of the whole approach, namely the different mean values or the local moments of the variables of interest, will still be fields. In other words, the resulting macroscopic quantities will be expressed in an Eulerian frame, but the road leading to them can be either a Lagrangian or an Eulerian one.
The purpose of the following subsections is to define the Eulerian and Lagrangian pdfs and the way they are related.

6.4.1. Dimension of the state vector

The following definitions and relations between probabilistic quantities will be developed with a certain expression of the state vector. For this, we slightly anticipate Section 6.6, where this choice will be discussed, and already use a specific state vector. It corresponds to a one-particle description and includes particle location, velocity and a number of scalars, denoted by $\boldsymbol{\phi}$ in physical space or $\boldsymbol{\psi}$ in sample space (see Section 3):

\[
\mathbf{Z} = (\mathbf{y}, \mathbf{V}, \boldsymbol{\psi}). \tag{205}
\]

At present, the meaning of the scalar variables $\boldsymbol{\phi}$ can be left undefined, standing for any particle properties, and the state vector $\mathbf{Z}$ remains therefore fairly general.

6.4.2. Eulerian and Lagrangian descriptions

First of all, the exact one-point pdf equation satisfied by $p^L(t; \mathbf{y}, \mathbf{V}, \boldsymbol{\psi})$ can be derived by writing the Navier–Stokes equations and a scalar equation in a Lagrangian formulation. The probability to find the reduced system (a fluid particle) in the state $[\mathbf{y}, \mathbf{y} + d\mathbf{y}]$, $[\mathbf{V}, \mathbf{V} + d\mathbf{V}]$ and $[\boldsymbol{\psi}, \boldsymbol{\psi} + d\boldsymbol{\psi}]$ is given by

\[
p^L(t; \mathbf{y}, \mathbf{V}, \boldsymbol{\psi}) \, d\mathbf{y} \, d\mathbf{V} \, d\boldsymbol{\psi}, \tag{206}
\]
where the superscript L is introduced to emphasize that we are dealing with a Lagrangian formulation. The two possible points of view (Eulerian or Lagrangian) differ essentially by the choice of the independent variables (or parameters). It is common knowledge that in the Lagrangian description, attention is focused on 'labelled' fluid particles as they move through the flow (initial position and time as parameters; position, velocity and scalars as variables) whereas in the Eulerian approach (field description), all flow properties are by definition monitored as functions of time and of a fixed point in space (time and space coordinates as parameters; velocity and scalars as variables). Therefore, we define the Eulerian pdf $p^E(t, \mathbf{x}; \mathbf{V}, \boldsymbol{\psi})$, where

\[
p^E(t, \mathbf{x}; \mathbf{V}, \boldsymbol{\psi}) \, d\mathbf{V} \, d\boldsymbol{\psi} \tag{207}
\]

is the probability to observe in the flow field a state in the range $[\mathbf{V}, \mathbf{V} + d\mathbf{V}]$ and $[\boldsymbol{\psi}, \boldsymbol{\psi} + d\boldsymbol{\psi}]$ at time $t$ and position $\mathbf{x}$. By doing so, we have defined a field of distribution functions since, for each point $(t, \mathbf{x})$ of the time-physical space, a distribution function of velocities and scalars has been associated. To emphasize the distinction between the two descriptions, a semicolon is introduced to separate the parameters from the variables, i.e. $p^E(t, \mathbf{x}; \mathbf{V}, \boldsymbol{\psi})$ and $p^L(t; \mathbf{y}, \mathbf{V}, \boldsymbol{\psi})$. Note that we also distinguish between phase space and physical space: $(\mathbf{y}, \mathbf{V}, \boldsymbol{\psi}) \leftrightarrow (\mathbf{x}, \mathbf{U}, \boldsymbol{\phi})$ for the Lagrangian description and $(\mathbf{V}, \boldsymbol{\psi}) \leftrightarrow (\mathbf{U}, \boldsymbol{\phi})$ for the Eulerian one.

To complete this formalism, we need to define physically the notion of fluid particles. A fluid particle is a small element of fluid whose characteristic length scale is smaller than the Kolmogorov length scale, Eq. (135) (which is the smallest length scale of fluid motion), but much larger than the molecular mean free path. The fluid particle (which can be deformed) has a mass $m(t, \boldsymbol{\phi})$, a volume $\mathcal{V}(t)$ where $\rho(t, \boldsymbol{\phi}) = m/\mathcal{V}$, and a velocity $\mathbf{U}(t)$ equal to the value of
the Eulerian velocity field at the location of the fluid particle, $\mathbf{U}(t) = \mathbf{U}(t, \mathbf{x}(t))$. This notion is in fact a direct extension of the particle (point) notion in classical mechanics. In the case of particles of constant volume (the arbitrary choice that we make from now on), the fluid particle object is fully defined by the following set of variables: $(m, \mathbf{y}, \mathbf{V}, \boldsymbol{\psi})$. Therefore, for a complete description, it is convenient to introduce a mass density function $F^L(t; \mathbf{y}, \mathbf{V}, \boldsymbol{\psi})$, where

\[
F^L(t; \mathbf{y}, \mathbf{V}, \boldsymbol{\psi}) \, d\mathbf{y} \, d\mathbf{V} \, d\boldsymbol{\psi} \tag{208}
\]

represents the probable mass of fluid particles contained in an element of volume $d\mathbf{y} \, d\mathbf{V} \, d\boldsymbol{\psi}$ in phase space. The mass density function is consequently normalized by the total mass of fluid $M$ (which is chosen to be constant in time for the sake of simplicity)

\[
M = \int F^L(t; \mathbf{y}, \mathbf{V}, \boldsymbol{\psi}) \, d\mathbf{y} \, d\mathbf{V} \, d\boldsymbol{\psi}. \tag{209}
\]

The Lagrangian mass density function is given by

\[
F^L(t; \mathbf{y}, \mathbf{V}, \boldsymbol{\psi}) = M \, p^L(t; \mathbf{y}, \mathbf{V}, \boldsymbol{\psi}). \tag{210}
\]
6.4.3. Relations between Eulerian and Lagrangian pdfs

We have previously introduced the notion of Eulerian and Lagrangian pdfs and it has been explained why, in the case of compressible or reactive flows, the proper quantity to be considered is the mass density function $F^L$ and not the pdf $p^L$ itself (which is of course the right choice in the incompressible case since all fluid particles have a constant mass for a given volume). Attention is now focused on the possible relations between the two points of view. Indeed, in order to derive partial differential equations verified by the mean moments, one has to be able to define the proper Eulerian quantity to construct an operator which gives the expected values (moments). The techniques presented in Section 2 are 'Lagrangian tools', and moment equations, which are field equations, can only be derived by means of 'Eulerian tools'. The question to be answered is: how are Eulerian and Lagrangian quantities related?

Following Balescu [22], the correspondence between the Eulerian and the Lagrangian descriptions is given by

\[
F^E(t, \mathbf{x}; \mathbf{V}, \boldsymbol{\psi}) = F^L(t; \mathbf{y} = \mathbf{x}, \mathbf{V}, \boldsymbol{\psi}) = \int F^L(t; \mathbf{y}, \mathbf{V}, \boldsymbol{\psi}) \, \delta(\mathbf{x} - \mathbf{y}) \, d\mathbf{y}, \tag{211}
\]

where $F^E$ is the Eulerian mass density function. This relation simply expresses the fact that, when $\mathbf{y} = \mathbf{x}$ in phase space, there is equivalence between the two points of view since we consider the same event (in the incompressible case this relation is verified by $p^E(t, \mathbf{x}; \mathbf{V})$ and $p^L(t; \mathbf{y}, \mathbf{V})$). By integration of the previous equality over phase space $(\mathbf{V}, \boldsymbol{\psi})$ and physical space, we can write

\[
M = \int F^E(t, \mathbf{x}; \mathbf{V}, \boldsymbol{\psi}) \, d\mathbf{x} \, d\mathbf{V} \, d\boldsymbol{\psi}. \tag{212}
\]

This constraint imposes that the integral of $F^E$ over phase space $(\mathbf{V}, \boldsymbol{\psi})$ is the expected density at $(t, \mathbf{x})$ (the probable mass of particles in a given state per unit volume). The expected density,
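denoted $\langle \rho \rangle(t, \mathbf{x})$, is of course defined by

\[
\langle \rho \rangle(t, \mathbf{x}) = \int \rho(\boldsymbol{\psi}) \, p^E(t, \mathbf{x}; \mathbf{V}, \boldsymbol{\psi}) \, d\mathbf{V} \, d\boldsymbol{\psi} \tag{213}
\]

(the integral of $p^E$ over the $(\mathbf{V}, \boldsymbol{\psi})$ phase space equals one since $p^E$ is a pdf). Since the integral of the expected density $\langle \rho \rangle(t, \mathbf{x})$ over the whole physical space is of course the total mass of fluid, the Eulerian mass density function is defined by

\[
F^E(t, \mathbf{x}; \mathbf{V}, \boldsymbol{\psi}) = \rho(\boldsymbol{\psi}) \, p^E(t, \mathbf{x}; \mathbf{V}, \boldsymbol{\psi}). \tag{214}
\]

The relation between both mass density functions has now been defined, Eq. (211), and the expression of the Eulerian mass density function is known, Eq. (214). The remaining task is to find the relations between the pdfs. Intuitively, one might think that the Eulerian pdf equals the conditional (conditioned by the position) Lagrangian pdf. Let us investigate this matter. Integration of both mass density functions over the $(\mathbf{V}, \boldsymbol{\psi})$ phase space gives

\[
\int F^L(t; \mathbf{x}, \mathbf{V}, \boldsymbol{\psi}) \, d\mathbf{V} \, d\boldsymbol{\psi} = M \, p^L(t; \mathbf{x}), \qquad
\int F^E(t, \mathbf{x}; \mathbf{V}, \boldsymbol{\psi}) \, d\mathbf{V} \, d\boldsymbol{\psi} = \langle \rho \rangle(t, \mathbf{x}), \tag{215}
\]

which, since both left-hand sides are equal by Eq. (211), gives $p^L(t; \mathbf{x}) = \langle \rho \rangle(t, \mathbf{x})/M$. The conditional pdf $p^L(t; \mathbf{V}, \boldsymbol{\psi} \,|\, \mathbf{x})$ can then be evaluated as follows (using the previous results):

\[
p^L(t; \mathbf{V}, \boldsymbol{\psi} \,|\, \mathbf{x}) = \frac{p^L(t; \mathbf{x}, \mathbf{V}, \boldsymbol{\psi})}{p^L(t; \mathbf{x})} = \frac{F^L(t; \mathbf{x}, \mathbf{V}, \boldsymbol{\psi})}{\langle \rho \rangle(t, \mathbf{x})} = \frac{\rho(\boldsymbol{\psi})}{\langle \rho \rangle(t, \mathbf{x})} \, p^E(t, \mathbf{x}; \mathbf{V}, \boldsymbol{\psi}). \tag{216}
\]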
Therefore, in a compressible flow (and also in a turbulent reactive flow) the Lagrangian pdf conditioned by the position is not the Eulerian pdf but the Favre pdf. Finally, the relation between the mass density function and the Lagrangian transitional pdf can be worked out from the definition of the Lagrangian pdf using the transitional pdf

\[
p^L(t; \mathbf{x}, \mathbf{V}, \boldsymbol{\psi}) = \int p^L(t; \mathbf{x}, \mathbf{V}, \boldsymbol{\psi} \,|\, t_0; \mathbf{x}_0, \mathbf{V}_0, \boldsymbol{\psi}_0) \, p^L(t_0; \mathbf{x}_0, \mathbf{V}_0, \boldsymbol{\psi}_0) \, d\mathbf{x}_0 \, d\mathbf{V}_0 \, d\boldsymbol{\psi}_0. \tag{217}
\]

By multiplying each side by the total mass $M$ and using the identity given by Eq. (211), we obtain

\[
F^E(t, \mathbf{x}; \mathbf{V}, \boldsymbol{\psi}) = \int p^L(t; \mathbf{x}, \mathbf{V}, \boldsymbol{\psi} \,|\, t_0; \mathbf{x}_0, \mathbf{V}_0, \boldsymbol{\psi}_0) \, F^E(t_0, \mathbf{x}_0; \mathbf{V}_0, \boldsymbol{\psi}_0) \, d\mathbf{x}_0 \, d\mathbf{V}_0 \, d\boldsymbol{\psi}_0. \tag{218}
\]

This relation is of extreme importance: it shows that the mass density function is 'propagated' by the transitional pdf or, in the language of statistical physics, that the transitional pdf is the propagator of an information which is the mass density function.

Before we carry on, let us rewrite the results derived in this section in the particular case of an incompressible flow; from now on, only this type of flow will be studied. When the flow is incompressible, fluid particles have a constant mass (density is constant) and all the information is contained in the pdfs. Eq. (211) shows then that $p^E$ and $p^L$ are related by
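(with $\rho = M/\mathcal{V}$, where $\mathcal{V}$ is the volume of the physical space occupied by the flow)

\[
\frac{1}{\mathcal{V}} \, p^E(t, \mathbf{x}; \mathbf{V}) = p^L(t; \mathbf{y} = \mathbf{x}, \mathbf{V}) = \int p^L(t; \mathbf{y}, \mathbf{V}) \, \delta(\mathbf{x} - \mathbf{y}) \, d\mathbf{y}. \tag{219}
\]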
The probability to find a fluid particle at a given position $\mathbf{x}$ is associated with the marginal $p^L(t; \mathbf{x}) = 1/\mathcal{V}$. This result is physically sound since, in the case of incompressible flows, mass is equally distributed in space and all positions are equiprobable. The Eulerian pdf is, in the incompressible case, the Lagrangian pdf conditioned by position, cf. Eq. (216),

\[
p^L(t; \mathbf{V} \,|\, \mathbf{x}) = \frac{p^L(t; \mathbf{x}, \mathbf{V})}{p^L(t; \mathbf{x})} = p^E(t, \mathbf{x}; \mathbf{V}). \tag{220}
\]

At last, for the incompressible case, Eq. (218) is verified by $p^E$,

\[
p^E(t, \mathbf{x}; \mathbf{V}) = \int p^L(t; \mathbf{x}, \mathbf{V} \,|\, t_0; \mathbf{x}_0, \mathbf{V}_0) \, p^E(t_0, \mathbf{x}_0; \mathbf{V}_0) \, d\mathbf{x}_0 \, d\mathbf{V}_0, \tag{221}
\]
which means, as mentioned previously, that the transitional pdf is the propagator of an Eulerian information which is, in this particular case, the Eulerian pdf.

6.4.4. Discrete representation and weak approximation

The different mdfs and pdfs have been presented so far in a continuous framework, as continuous densities or measures. In actual simulations, these measures are approximated by discrete measures using a finite number of particles, and practical implementations require manipulating these discrete mdfs. Nevertheless, it appears preferable to separate the handling of these approximate densities from the definitions of the actual densities (the Lagrangian and Eulerian mdfs and pdfs), so as to keep the presentation of the key notions as clear as possible.

In a practical simulation, a finite number of particles are followed in time. Each of these particles has a particular value of its attached variables, namely $\mathbf{x}(t)$, $\mathbf{U}(t)$ and $\boldsymbol{\phi}(t)$. The discrete approximation of the Lagrangian mass density function is therefore given by

\[
F^L_N(t; \mathbf{y}, \mathbf{V}, \boldsymbol{\psi}) = \sum_{i=1}^{N} \Delta m \, \delta(\mathbf{y} - \mathbf{x}^i(t)) \otimes \delta(\mathbf{V} - \mathbf{U}^i(t)) \otimes \delta(\boldsymbol{\psi} - \boldsymbol{\phi}^i(t)), \tag{222}
\]
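using $N$ particles with equal mass $\Delta m$. When particles have different masses, the discrete Lagrangian mdf has the form

\[
F^L_N(t; \mathbf{y}, \mathbf{V}, \boldsymbol{\psi}) = \sum_{i=1}^{N} m^i \, \delta(\mathbf{y} - \mathbf{x}^i(t)) \otimes \delta(\mathbf{V} - \mathbf{U}^i(t)) \otimes \delta(\boldsymbol{\psi} - \boldsymbol{\phi}^i(t)). \tag{223}
\]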
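The total expected mass in the volume occupied by the flow is then $M = \sum_{i=1}^{N} m^i$ or, in the case of equal particle mass, $M = N \times \Delta m$.

The approximation of a mean Eulerian quantity (Favre averaged) at a given point in space $\mathbf{x}$ and at time $t$, $\tilde{H}(t, \mathbf{x})$, can be obtained from the above discrete expression of $F^L_N$. Indeed, the density-weighted integral is

\[
\langle \rho \rangle(t, \mathbf{x}) \, \tilde{H}(t, \mathbf{x}) = \langle \rho(\boldsymbol{\phi}) \, H(\mathbf{U}(t, \mathbf{x}), \boldsymbol{\phi}(t, \mathbf{x})) \rangle = \int H(\mathbf{V}, \boldsymbol{\psi}) \, F^E(t, \mathbf{x}; \mathbf{V}, \boldsymbol{\psi}) \, d\mathbf{V} \, d\boldsymbol{\psi}. \tag{224}
\]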
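Using Eq. (211), where the delta function in space is replaced by $\delta(\mathbf{y} - \mathbf{x}) \simeq 1/\mathcal{V}_x$ when $\mathbf{y} \in \mathcal{V}_x$, $\mathcal{V}_x$ being a small volume around point $\mathbf{x}$, we obtain

\[
\langle \rho \rangle(t, \mathbf{x}) \, \tilde{H}(t, \mathbf{x}) \simeq \frac{1}{\mathcal{V}_x} \sum_{i=1}^{N_x} m^i \, H(\mathbf{U}^i(t), \boldsymbol{\phi}^i(t)). \tag{225}
\]

With the local (discrete) mean density being given by

\[
\langle \rho \rangle(t, \mathbf{x}) \simeq \frac{\sum_{i=1}^{N_x} m^i}{\mathcal{V}_x}, \tag{226}
\]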
it is seen that the discrete approximation, through the ensemble average over the $N_x$ particles present in the small volume $\mathcal{V}_x$ around $\mathbf{x}$, approximates the Favre-averaged mean quantity

\[
\tilde{H} \simeq H_N = \frac{\sum_{i=1}^{N_x} m^i \, H(\mathbf{U}^i(t), \boldsymbol{\phi}^i(t))}{\sum_{i=1}^{N_x} m^i}. \tag{227}
\]

In other words, the Favre-averaged mean value of any function of the local Eulerian velocity and scalars is obtained by a Monte Carlo calculation over the particles that are locally present. The procedure is valid in the general non-stationary and non-homogeneous case, but is actually based on a local homogeneity hypothesis. A small volume $\mathcal{V}_x$ is introduced around point $\mathbf{x}$, and the $N_x$ particles which are located in this small volume are regarded as independent realizations and samples of the mdf at point $\mathbf{x}$. This amounts to assuming spatial homogeneity within the small volume $\mathcal{V}_x$. Convergence of the discrete approximation is ensured by the Central Limit Theorem, which shows that there exists a constant $C$ such that

\[
\langle H_N \rangle = \tilde{H} \quad \text{and} \quad \langle (\tilde{H} - H_N)^2 \rangle \le \frac{C}{N_x}. \tag{228}
\]
It is therefore seen that the discrete distribution $F^L_N$ does not tend towards $F^L$ in a strong sense but in a weak sense or, to be more precise, in distribution (see Section 2.1.2), since in fact it is the mean value of functions of the stochastic process $\mathbf{Z}$ that converges as $N_x \to +\infty$, where $N_x$ stands for the simulated number of samples of $\mathbf{Z}$:

\[
H_N = \langle H(\mathbf{Z}_N) \rangle \xrightarrow[N \to \infty]{} \langle H(\mathbf{Z}) \rangle. \tag{229}
\]
This notion of weak convergence and of weak approximation is fully consistent with the overall aim of the pdf approach where, as explained in Section 6.1, the interest is to obtain information on various statistics of the flow. In practice, the challenge is to reach a satisfactory compromise between spatial errors (governed by the size of $\mathcal{V}_x$), time discretization errors (governed by the time step $\Delta t$ used to integrate the trajectories of the process, see Section 2.10) and statistical errors (which scale as $N_x^{-1/2}$, see Eq. (228)) for a tractable total number of particles. This constitutes the important field of practical Monte Carlo methods; see Kalos and Whitlock [55] for example and specialized articles [6,54,56].
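The local estimator of Eqs. (226) and (227) and the error bound of Eq. (228) can be illustrated with a short numerical sketch. It is not taken from the paper: the function names, the use of NumPy and the test case (equal-mass particles, a Gaussian velocity component with unit variance and $H(U) = U^2$, whose exact mean is 1) are all assumptions made for illustration only.

```python
import numpy as np


def favre_average(masses, values):
    """Mass-weighted (Favre) average over the particles of one cell, as in Eq. (227)."""
    masses = np.asarray(masses, dtype=float)
    values = np.asarray(values, dtype=float)
    return np.sum(masses * values) / np.sum(masses)


def local_mean_density(masses, cell_volume):
    """Discrete mean density in a cell of given volume, as in Eq. (226)."""
    return np.sum(np.asarray(masses, dtype=float)) / cell_volume


def rms_error(n_x, n_trials=2000, seed=0):
    """Root-mean-square error of the estimator for a hypothetical test case.

    Assumed test case (not from the paper): equal masses, U ~ N(0, 1)
    and H(U) = U**2, so that the exact Favre average equals 1.
    """
    rng = np.random.default_rng(seed)
    velocities = rng.standard_normal((n_trials, n_x))
    masses = np.ones(n_x)
    estimates = np.array([favre_average(masses, u**2) for u in velocities])
    return np.sqrt(np.mean((estimates - 1.0) ** 2))


if __name__ == "__main__":
    # Quadrupling the number of particles per cell should roughly halve the error.
    for n_x in (100, 400, 1600):
        print(n_x, rms_error(n_x))
```

In this sketch the statistical error decreases roughly as $N_x^{-1/2}$, in line with Eq. (228); the spatial error associated with the size of $\mathcal{V}_x$ is not represented, since all samples are drawn from the same distribution.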
6.5. Choice of the pdf description

6.5.1. The Lagrangian stochastic point of view

From the previous section, it appears that the central notion is the transitional Lagrangian pdf, from which Lagrangian and Eulerian mass density functions are derived. Consequently, the probabilistic description is best performed using a Lagrangian point of view, which will be followed from here onwards. From that point of view, the Navier–Stokes equations are regarded as expressing directly the time-evolution equations of a large number of fluid particles. These exact equations are replaced by models which (hopefully) produce the same statistics as those of real fluid particles. Following the reasoning of Sections 6.1 and 6.2, these models are stochastic models. In turn, these stochastic models can be developed and expressed from two points of view, either a pdf point of view or a trajectory point of view, see Section 2. The second important choice that is made is to use mainly the trajectory point of view. In a way, we can say that we 'merge' two notions into one: the particle or Lagrangian description in turbulence on the one hand and the trajectory point of view for the expression of stochastic processes on the other hand. A somewhat condensed presentation of these steps would be to say that the exact instantaneous equations are replaced by modelled but still instantaneous ones.

6.5.2. PDF equation in single-phase turbulence

Let us recall the original problem, which is the formulation of the exact one-point pdf equation satisfied by $p^L(t; \mathbf{x}, \mathbf{V}, \boldsymbol{\psi} \,|\, t_0; \mathbf{x}_0, \mathbf{V}_0, \boldsymbol{\psi}_0)$. We start by writing the Navier–Stokes equations and a scalar equation in a Lagrangian formulation (for a given fluid particle)

\[
\begin{aligned}
dx_i^+ &= U_i^+ \, dt, \\
dU_i^+ &= \left( -\frac{1}{\rho} \frac{\partial P^+}{\partial x_i} + \nu \Delta U_i^+ \right) dt, \\
d\phi_l^+ &= \Gamma \Delta \phi_l^+ \, dt + S_l(\boldsymbol{\phi}^+) \, dt.
\end{aligned} \tag{230}
\]
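Note that the superscript '+' has been used to distinguish between the exact equations and the modelled ones. By applying for example the techniques presented in Section 2 (other techniques can also be used), one obtains (where $p^{L+}$ stands for the transitional pdf associated with the exact trajectories)

\[
\begin{aligned}
\frac{\partial p^{L+}}{\partial t} + V_i \frac{\partial p^{L+}}{\partial y_i} = {} & -\frac{\partial}{\partial V_i} \left[ \left\langle -\frac{1}{\rho} \frac{\partial P}{\partial x_i} \,\Big|\, \mathbf{Z} = \mathbf{z} \right\rangle p^{L+} \right] - \frac{\partial}{\partial V_i} \left[ \langle \nu \Delta U_i \,|\, \mathbf{Z} = \mathbf{z} \rangle \, p^{L+} \right] \\
& - \frac{\partial}{\partial \psi_l} \left[ \langle \Gamma \Delta \phi_l \,|\, \mathbf{Z} = \mathbf{z} \rangle \, p^{L+} \right] - \frac{\partial}{\partial \psi_l} \left[ S_l(\boldsymbol{\psi}) \, p^{L+} \right].
\end{aligned} \tag{231}
\]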
In this equation, $\langle A \,|\, \mathbf{Z} = \mathbf{z} \rangle$ represents the average value of $A$ conditioned on the values of the state vector $\mathbf{Z} = \mathbf{z}$, cf. Section 3. Then, Eqs. (217) and (218) show that the same equation is satisfied by the Lagrangian pdf and by the Eulerian mass density function. From the latter, mean moment equations can be expressed (see Section 6.7).
6.5.3. Interest of the pdf approach

The two main features of Eq. (231) are that convection and the scalar source term appear in closed form. No modelling assumption is required to get a closed form for these terms.

First of all, treating convection without approximation means that whatever expression comes out of the convection term, in any moment equation, does not need to be modelled. Therefore, the closure difficulties that we mentioned with the classical moment approach, due to $\langle u_i u_j \rangle$ or $\langle u_i u_j u_k \rangle$, are not met in our case. This is already very attractive since, from our discussion of statistical closures and our presentation of modelling principles, one can expect that replacing correlations arising from the convection term (and Reynolds stresses are nothing but that) by accurate models is difficult.

The second feature is even more noteworthy. However complicated and non-linear the source term may be, it is in closed form in the present approach. On the contrary, in the classical approach to turbulence modelling (the moment approach), one has to express $\langle S(\boldsymbol{\phi}) \rangle$ directly in terms of available information, which is limited to $\langle \phi \rangle$ and $\langle \phi^2 \rangle$, for complex functions $S$. Such a direct closure at the macroscopic level is hopeless, unless specific and restrictive assumptions are made (fast chemistry, etc.), and this is a very clear example of the second interest of pdf descriptions as emphasized at the end of Section 6.3. Indeed, the calculation of the mean source term is handled correctly since it is given by

\[
\langle S(\boldsymbol{\phi}) \rangle = \int S(\boldsymbol{\psi}) \, p^E(t, \mathbf{x}; \boldsymbol{\psi}) \, d\boldsymbol{\psi}, \tag{232}
\]

where $p^E(t, \mathbf{x}; \boldsymbol{\psi}) \, d\boldsymbol{\psi}$ is now known. This success explains the great interest of present models for problems involving similar source terms, such as in combustion.

These advantages are easily explained by the present Lagrangian point of view. Following particles ensures that convection is treated without approximation. Furthermore, since we keep track of instantaneous values, any local source term such as $S(\boldsymbol{\phi})$ is also treated without assumption.
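This is easily seen from the Monte Carlo evaluation of the mean source term using the $N_x$ discrete particles which are located in a small volume around the point $\mathbf{x}$ at which the integral is evaluated, as

\[
\langle S(\boldsymbol{\phi}) \rangle \simeq \langle S(\boldsymbol{\phi}) \rangle_{N_x} = \frac{1}{N_x} \sum_{i=1}^{N_x} S(\boldsymbol{\phi}^{(i)}). \tag{233}
\]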
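A minimal numerical sketch may help to see why Eq. (233) needs no closure while a mean-value substitution does. Everything specific in it is an assumption chosen for illustration and does not come from the paper: the exponential source term $S(\phi) = \exp(\beta \phi)$, the Gaussian distribution of the particle scalars, the parameter values, and the function names. For this choice the exact mean is known analytically, $\exp(\beta \mu + \beta^2 \sigma^2 / 2)$, which allows a direct comparison.

```python
import numpy as np


def source(phi, beta=4.0):
    """Hypothetical non-linear source term S(phi) = exp(beta * phi)."""
    return np.exp(beta * np.asarray(phi, dtype=float))


def mean_source_monte_carlo(phi_particles, beta=4.0):
    """Particle average of S over the samples in a cell, as in Eq. (233)."""
    return np.mean(source(phi_particles, beta))


def mean_source_from_mean(phi_particles, beta=4.0):
    """Naive moment-level closure S(<phi>), shown only for comparison."""
    return source(np.mean(phi_particles), beta)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    mu, sigma, beta = 0.5, 0.3, 4.0  # assumed values for the illustration
    phi = rng.normal(mu, sigma, size=200_000)

    exact = np.exp(beta * mu + 0.5 * (beta * sigma) ** 2)
    print("exact mean source      :", exact)
    print("particle average (233) :", mean_source_monte_carlo(phi, beta))
    print("S evaluated at <phi>   :", mean_source_from_mean(phi, beta))
```

Under these assumptions the particle average approaches the exact mean as the number of samples grows, whereas evaluating $S$ at the mean scalar value underestimates it substantially, because the source term is strongly non-linear.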
On the other hand, the terms involving the pressure gradient and the molecular transport coefficients are not in closed form. This is not surprising from the particle equation of motion: these terms represent the instantaneous forces or actions exerted by the other particles (other elements of fluid). This expresses the BBGKY hierarchy of unclosed pdf equations explained in Section 3.

In order to develop a practical model in a pdf framework, the state vector and its dimension must be chosen. If we adopt the picture of a turbulent flow as an $N$-particle problem (the continuum is retrieved by taking $N \to \infty$), we must choose the values of $N$ and of $p$, see Section 3. One-point models can only provide one-point information (all mean moments) but no space correlations at a given time, since these correlations are in fact contained in a two-point description. However, at present, two-particle models are not as developed as one-particle ones, especially in the general case of non-homogeneous non-stationary flows. We therefore consider
J.-P. Minier, E. Peirano / Physics Reports 352 (2001) 1–214
only one-point or one-particle models in the following (in the notations of Section 3, we have $s = 1$). The remaining choice is the one-particle state vector.

6.6. Present one-point models

Once the pdf stage is set, there remains the question of which state variables are kept in the state vector and which are eliminated in turbulence modelling. From Kolmogorov's hypotheses described in Section 5.3, fluid particle accelerations are close to white-noise processes. Thus, the fluid particle acceleration is a good candidate to be replaced by a model, and there is some hope of deriving a one-point model by restricting the state vector to particle locations and velocities. This justifies the choice of the one-particle state vector that we have already used, namely $\mathbf{Z} = (\mathbf{y}, \mathbf{V}, \boldsymbol{\psi})$.

In this section, models for fluid particle location and velocity are presented. Models for scalar variables are left out and can be found in Pope [6], Fox [9] or Dopazo [8]. This could be surprising since one of the main interests of pdf models, for single-phase turbulent flows, is precisely the treatment of reactive scalars. However, the aim of the present paper is the presentation of pdf models for two-phase turbulent flows, where particle velocity stochastic models have similar interest. The stochastic models described here will be used as starting points in later sections.

6.6.1. Stochastic model for fluid particle velocity

A model is now derived for $p^L(t; \mathbf{x}, \mathbf{V})$. From the instantaneous particle equation of motion, Eq. (230), the two forces acting on a fluid particle, the pressure-gradient force and the viscous force, have to be expressed as functions of local variables (one-point closure). A global model can be proposed for the sum of these two forces and analysed with respect to classical second-order modelling [7,10].
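Another approach is to build a model in two steps, by modelling separately the viscous term and the pressure-gradient term, relying furthermore on arguments from Statistical Physics rather than on macroscopic mean equations. The starting point is to reformulate the exact one-point pdf equation, Eq. (231). We decompose the pressure $P$ into a mean component $\langle P \rangle$ and a fluctuating one $p'$, and it can be shown that the viscous term can be re-expressed as the sum of two contributions [6,11]

$$
\frac{\partial p^{L+}}{\partial t} + V_i \frac{\partial p^{L+}}{\partial y_i}
= \frac{1}{\rho} \frac{\partial \langle P \rangle}{\partial x_i} \frac{\partial p^{L+}}{\partial V_i}
+ \frac{\partial}{\partial V_i} \left[ \left\langle \frac{1}{\rho} \frac{\partial p'}{\partial x_i} \,\Big|\, \mathbf{Z} = \mathbf{z} \right\rangle p^{L+} \right]
+ \frac{\partial^2 [\nu\, p^{L+}]}{\partial y_i^2}
- \frac{\partial^2}{\partial V_i \partial V_j} \left[ \langle \epsilon_{ij} \,|\, \mathbf{Z} = \mathbf{z} \rangle\, p^{L+} \right] ,
\tag{234}
$$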
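where $\epsilon_{ij}$ is the pseudo-dissipation tensor defined in Section 6.3. By neglecting the third term involving molecular viscosity at high Reynolds numbers and by assuming an isotropic form for $\epsilon_{ij}$ using only the unconditional dissipation of the turbulent kinetic energy $\langle \epsilon \rangle$, we obtain the working form of the one-point pdf equation

$$
\frac{\partial p^{L}}{\partial t} + V_i \frac{\partial p^{L}}{\partial y_i}
= \frac{1}{\rho} \frac{\partial \langle P \rangle}{\partial x_i} \frac{\partial p^{L}}{\partial V_i}
+ \frac{\partial}{\partial V_i} \left[ \left\langle \frac{1}{\rho} \frac{\partial p'}{\partial x_i} \,\Big|\, \mathbf{Z} = \mathbf{z} \right\rangle p^{L} \right]
- \frac{\partial^2}{\partial V_i^2} \left[ \frac{1}{3} \langle \epsilon \rangle\, p^{L} \right] .
\tag{235}
$$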
The superscript L+ has been replaced simply by L since we are already dealing with a model equation. It is interesting to see that the only remaining trace of the molecular viscosity is through the finite and non-zero mean dissipation rate, $\langle \epsilon \rangle$, in line with Kolmogorov theory, and that we end up with a second-order term in the pdf equation, however with a negative sign.

A complete pdf model is worked out by putting forward a model for the fluctuating pressure-gradient force as a function of the particle velocity and local mean quantities. A recent proposal is to resort to Onsager's hypotheses [11] and to write a Langevin equation for the effect of the fluctuating pressure gradient (an external force in the one-point pdf description since this is a force created by all the other fluid particles in the flow). Apart from showing that present pdf models can be more than just rough guesses and pointing to some interesting links with other physical fields, this connection is also useful to reveal the interests and the inherent limitations of present closure relations. The real issue is to write a model to describe what is essentially a non-equilibrium statistical problem. Onsager's hypotheses are valid mainly in the near-equilibrium domain, and their use reveals that present models cannot claim to fully describe the complete problem, and particularly that far-from-equilibrium phenomena such as coherent structures may not be captured with present closure proposals. Extensions of the stochastic models or perhaps multi-point pdfs may be needed to include explicitly the statistical signature of these structures. The Langevin model for the fluctuating pressure-gradient force is combined with the deterministic mean pressure gradient and the viscous term to build a complete Langevin model for fluid particle velocities.
The derivation is also an example of the equivalence and the interest of the two points of view (trajectory and pdf) in stochastic modelling, since both frameworks are used in the course of the construction. Details on the modelling steps can be found in other works [54,11] and only the resulting global model is presented here. The complete model consists in retaining the instantaneous positions and velocities in the state vector $\mathbf{Z} = (\mathbf{x}, \mathbf{U})$ and in using a general diffusion process to simulate its time rate of change

$$
dx_i = U_i\, dt ,
\tag{236a}
$$

$$
dU_i = -\frac{1}{\rho} \frac{\partial \langle P \rangle}{\partial x_i}\, dt + D_i\, dt + \sqrt{C_0 \langle \epsilon \rangle}\, dW_i .
\tag{236b}
$$

The drift vector is given by the expression

$$
D_i(\mathbf{x}, \mathbf{U}) = -\frac{U_i - \langle U_i \rangle}{T_L} + G^a_{ij} \left( U_j - \langle U_j \rangle \right) ,
\tag{237}
$$

where $T_L$ stands for a time scale given by

$$
T_L = \frac{1}{\left( \frac{1}{2} + \frac{3}{4} C_0 \right)} \frac{k}{\langle \epsilon \rangle} .
\tag{238}
$$
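To make Eqs. (236)-(238) concrete, here is a minimal Python sketch of one explicit Euler-Maruyama step for a single fluid particle, in the simplest case $G^a_{ij} = 0$. The function names, the numerical value $C_0 = 2.1$ and the choice of an explicit scheme are assumptions made for illustration; they are not the numerical method of the paper.

```python
import math
import random

C0 = 2.1  # assumed value of the constant C_0 (illustrative, not from the text)


def lagrangian_time_scale(k, eps, c0=C0):
    """T_L from Eq. (238): T_L = k / ((1/2 + 3/4 C0) <eps>)."""
    return k / ((0.5 + 0.75 * c0) * eps)


def wiener_increment(rng, dt):
    """Three independent Gaussian increments with zero mean and variance dt."""
    return [rng.gauss(0.0, math.sqrt(dt)) for _ in range(3)]


def langevin_step(x, U, mean_U, grad_mean_P_over_rho, k, eps, dt, dW, c0=C0):
    """One explicit Euler-Maruyama step of Eqs. (236a)-(236b) with G^a = 0.

    x, U                 : particle position and velocity (3-component lists)
    mean_U               : mean velocity <U> at the particle location
    grad_mean_P_over_rho : (1/rho) d<P>/dx_i at the particle location
    k, eps               : turbulent kinetic energy and mean dissipation rate
    dW                   : Wiener increments over dt
    """
    T_L = lagrangian_time_scale(k, eps, c0)
    diffusion = math.sqrt(c0 * eps)
    x_new = [x[i] + U[i] * dt for i in range(3)]
    U_new = [
        U[i]
        - grad_mean_P_over_rho[i] * dt      # mean pressure-gradient force
        - (U[i] - mean_U[i]) / T_L * dt     # return-to-the-mean drift, Eq. (237)
        + diffusion * dW[i]                 # white-noise term
        for i in range(3)
    ]
    return x_new, U_new


if __name__ == "__main__":
    rng = random.Random(1)
    x, U = [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]
    zero = [0.0, 0.0, 0.0]
    dt = 1.0e-2
    for _ in range(5):
        x, U = langevin_step(x, U, zero, zero, 1.0, 1.0, dt, wiener_increment(rng, dt))
    print("position:", x)
    print("velocity:", U)
```

Passing the Wiener increment as an argument keeps the step deterministic for a given `dW`, which makes the drift part easy to check in isolation. In a real computation the mean fields would be evaluated at the particle location rather than supplied as constants.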
It is thus seen that the fluid velocity equation has the form of a Langevin equation. The drift coefficient has actually a very simple form, involving only return-to-equilibrium terms (where the mean velocity at particle locations acts as the local equilibrium value), and it seems difficult, apart from forgetting turbulence altogether, to obtain simpler stochastic models. The resulting model has indeed limitations but already contains interesting turbulent features [54,11]. When
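the matrix $G^a_{ij}$ is set to zero, which yields the simplest model, the time scale $T_L$ also represents the time scale of velocity correlations. This is how the model appears from the trajectory point of view. In the pdf point of view, it consists in saying that the pdf equation is modelled by a Fokker–Planck equation

$$
\frac{\partial p^{L}}{\partial t} + V_i \frac{\partial p^{L}}{\partial x_i}
= \frac{1}{\rho} \frac{\partial \langle P \rangle}{\partial x_i} \frac{\partial p^{L}}{\partial V_i}
- \frac{\partial}{\partial V_i} \left[ D_i(\mathbf{x}, \mathbf{V})\, p^{L} \right]
+ \frac{1}{2} \frac{\partial^2}{\partial V_i^2} \left[ C_0 \langle \epsilon \rangle\, p^{L} \right] .
\tag{239}
$$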
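6.6.2. Stochastic model for the dissipation rate

To obtain a self-sufficient model, information about $\langle \epsilon \rangle$, the mean dissipation rate of turbulent energy at the particle location, is needed. $\langle \epsilon \rangle$ is given by

$$
\langle \epsilon \rangle = \nu \left\langle \frac{\partial u_i}{\partial x_l} \frac{\partial u_i}{\partial x_l} \right\rangle .
\tag{240}
$$

In a one-particle approach, instantaneous gradients cannot be derived from available information. They remain external to the description and must be provided by another source of information. Although just $\langle \epsilon \rangle$ is strictly needed in the particle velocity equations written above, a model for the instantaneous values of $\epsilon$ along the particle trajectories is developed. The present model is constructed, not directly in terms of $\epsilon$, but rather in terms of the frequency rate $\omega = \epsilon / k$ whose mean is the inverse of the time scale. The derivation of the model is explained in detail in other papers [57,58] and is based on Kolmogorov's third hypothesis, namely that in homogeneous turbulence $\epsilon$ is log-normally distributed (that is, $\ln(\epsilon)$ is Gaussian). In non-homogeneous turbulence, the complete form of the model is [57]

$$
d\omega = -\omega \langle \omega \rangle S_\omega\, dt + \langle \omega \rangle^2 h\, dt
- \omega \langle \omega \rangle C_K \left[ \ln\!\left( \frac{\omega}{\langle \omega \rangle} \right) - \left\langle \frac{\omega}{\langle \omega \rangle} \ln\!\left( \frac{\omega}{\langle \omega \rangle} \right) \right\rangle \right] dt
+ \omega \sqrt{2 C_K \langle \omega \rangle \sigma^2}\, dW ,
\tag{241}
$$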
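where $\sigma^2$ is the variance of $\ln(\omega / \langle \omega \rangle)$ and is taken here as a constant. The drift term involving $h$ is added to represent the mechanism through which laminar particles become turbulent and plays a role at the edges of free shear flows. The second drift term, involving $S_\omega$, represents the normalized decay rate of the mean frequency and is modelled by reference to the standard equation for $\langle \epsilon \rangle$

$$
S_\omega = (C_{\epsilon 2} - 1) - (C_{\epsilon 1} - 1) \frac{P}{\langle \epsilon \rangle}
\quad \text{where} \quad
P = -\langle u_i u_j \rangle \frac{\partial \langle U_i \rangle}{\partial x_j} .
\tag{242}
$$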
The different parameters, $C_K$, $C_{\epsilon 1}$ and $C_{\epsilon 2}$, are constants of the model. Apart from providing the needed value of $\langle \epsilon \rangle$ at each point, the introduction of instantaneous particle values of $\omega$ allows external and internal intermittencies to be simulated (see [59]). Another model for the simulation of the instantaneous values of $\omega$ can be found in [60]. With the stochastic model for $\omega$, the complete state vector is now $\mathbf{Z} = (\mathbf{x}, \mathbf{U}, \omega)$ and the general pdf follows a Fokker–Planck equation in the corresponding sample space.
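A short worked check may help to see what the logarithmic drift in Eq. (241) does to the mean. The sketch below assumes homogeneous turbulence, assumes that $S_\omega$ and $h$ depend only on mean quantities, and uses the Itô property that the stochastic term has zero mean; under these assumptions the logarithmic terms cancel on averaging.

```latex
\begin{align*}
\frac{d\langle\omega\rangle}{dt}
 &= -\langle\omega\rangle^{2} S_\omega + \langle\omega\rangle^{2} h
    - C_K \langle\omega\rangle
      \left[ \left\langle \omega \ln\frac{\omega}{\langle\omega\rangle} \right\rangle
           - \langle\omega\rangle
             \left\langle \frac{\omega}{\langle\omega\rangle}
                          \ln\frac{\omega}{\langle\omega\rangle} \right\rangle \right] \\
 &= -\langle\omega\rangle^{2} \left( S_\omega - h \right) .
\end{align*}
```

The term proportional to $C_K$ therefore redistributes $\omega$ around its mean without changing $\langle \omega \rangle$, whose evolution is set by $S_\omega$ and $h$ alone.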
6.7. Mean field equations

The mean equations that correspond to the present Lagrangian stochastic models can be explicitly written. Once again, there are two possible ways to derive them. The first one is by writing the pdf equation and by integrating over all sample space after multiplication by the proper variable, and the second one is a derivation directly from the particle stochastic differential equations (trajectory point of view).

6.7.1. pdf point of view

The procedure developed here is identical to the one which is used in Kinetic Theory to obtain the Navier–Stokes equations, cf. e.g. [22,21], but at another scale, as explained at the beginning of this section. The transport equation for $p^E$ (the kinetic equation in Kinetic Theory) is used together with an averaging operator to obtain the macroscopic equations, the mean equations (the Navier–Stokes equations in Kinetic Theory). In the case of an incompressible flow (the procedure can be easily extended to the compressible case), it is straightforward to prove, using Eqs. (219) and (239), that $p^E$ verifies the following differential equation:

$$
\frac{\partial p^{E}}{\partial t} + V_i \frac{\partial p^{E}}{\partial x_i}
= -\frac{\partial}{\partial V_i} \left[ A_i(\mathbf{x}, \mathbf{V}, \langle P \rangle)\, p^{E} \right]
+ \frac{1}{2} \frac{\partial^2}{\partial V_i^2} \left[ C_0 \langle \epsilon \rangle\, p^{E} \right] ,
\tag{243}
$$

where $A_i(\mathbf{x}, \mathbf{V}, \langle P \rangle)$ is given by

$$
A_i(\mathbf{x}, \mathbf{V}, \langle P \rangle) = -\frac{1}{\rho} \frac{\partial \langle P \rangle}{\partial x_i} + D_i(\mathbf{x}, \mathbf{V}) .
\tag{244}
$$
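In order to obtain the (mean) field equations, a standard procedure is used, in analogy with the derivations which can be found in Kinetic Theory [20,21], as explained at the beginning of the section. Given any function $H(\mathbf{V})$ (it is one component of the tensor of order $n$, $V_{i_1} V_{i_2} \ldots V_{i_n}$), the following expectation $\langle \cdot \rangle$ (averaging operator) is defined

$$
\langle H \rangle(t, \mathbf{x}) = \langle H(\mathbf{U}(t, \mathbf{x})) \rangle = \int H(\mathbf{V})\, p^{E}(t, \mathbf{x}; \mathbf{V})\, d\mathbf{V} .
\tag{245}
$$

This corresponds to a simplified case (since $\rho$ is constant) of the definition of mean variables already used in Eq. (224). Let us multiply Eq. (243) by $H(\mathbf{V})$ (denoted $H$ for the sake of simplicity) and apply the previously defined operator. With

$$
\left[ H A_i\, p^{E} \right], \quad H \frac{\partial p^{E}}{\partial V_i}, \quad \frac{\partial H}{\partial V_i}\, p^{E} \;\xrightarrow[V_i \to \pm\infty]{}\; 0
\tag{246}
$$

(by construction $p^E$ and $\partial p^E / \partial V_i$ converge to zero when at least one component of the velocity goes to infinity, $V_i \to \pm\infty$) and if we assume that all generalized integrals converge, after some algebra, one can obtain the following equation:

$$
\frac{\partial \langle H \rangle}{\partial t} + \frac{\partial \langle V_i H \rangle}{\partial x_i}
= \left\langle A_i \frac{\partial H}{\partial V_i} \right\rangle
+ \frac{1}{2} \left\langle C_0 \langle \epsilon \rangle \frac{\partial^2 H}{\partial V_i^2} \right\rangle ,
\tag{247}
$$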
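provided that uniform convergence is achieved (so that one can write $\langle \partial \cdot / \partial t \rangle = \partial \langle \cdot \rangle / \partial t$ and $\langle \partial \cdot / \partial x_i \rangle = \partial \langle \cdot \rangle / \partial x_i$). By choosing $H = 1$, the mean continuity equation, $\partial \langle U_i \rangle / \partial x_i = 0$, is obtained, and with $H = V_i$, Eq. (247) yields the mean momentum (Navier–Stokes) equation,

$$
\frac{\partial \langle U_i \rangle}{\partial t} + \langle U_j \rangle \frac{\partial \langle U_i \rangle}{\partial x_j} + \frac{\partial \langle u_i u_j \rangle}{\partial x_j}
= -\frac{1}{\rho} \frac{\partial \langle P \rangle}{\partial x_i} .
\tag{248}
$$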
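The mean field equations of higher order cannot be obtained directly by the procedure which has just been introduced. As a matter of fact, a change of coordinates in phase space is necessary, $V_i = \langle U_i \rangle(\mathbf{x}, t) + v_i$ (velocity is represented in terms of the fluctuating component, $\mathbf{v} = \mathbf{v}(\mathbf{V}, \langle \mathbf{U} \rangle)$). By noticing that $p^E(t, \mathbf{x}; \mathbf{v})\, d\mathbf{v} = p^E(t, \mathbf{x}; \mathbf{V})\, d\mathbf{V}$, the mean moment of order $n$ is defined by

$$
\langle u_{i_1} \ldots u_{i_n} \rangle(t, \mathbf{x}) = \int \left( \prod_{k=1}^{n} v_{i_k} \right) p^{E}(t, \mathbf{x}; \mathbf{v})\, d\mathbf{v} .
\tag{249}
$$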
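The procedure developed so far with a function $H(\mathbf{V})$ is now repeated with another function $H(\mathbf{v})$ (it is one component of the tensor of order $n$, $v_{i_1} v_{i_2} \ldots v_{i_n}$), whose expected value is defined by

$$
\langle H(\mathbf{u}(t, \mathbf{x})) \rangle = \int H(\mathbf{v})\, p^{E}(t, \mathbf{x}; \mathbf{v})\, d\mathbf{v} .
\tag{250}
$$

Eq. (243) is first re-written in the new phase space. The time and spatial derivatives become (the velocity derivatives are not changed)

$$
\frac{\partial p^{E}(\mathbf{V})}{\partial t} = \frac{\partial p^{E}(\mathbf{v})}{\partial t} - \frac{\partial \langle U_i \rangle}{\partial t} \frac{\partial p^{E}(\mathbf{v})}{\partial v_i} ,
\tag{251}
$$

$$
\frac{\partial p^{E}(\mathbf{V})}{\partial x_i} = \frac{\partial p^{E}(\mathbf{v})}{\partial x_i} - \frac{\partial \langle U_j \rangle}{\partial x_i} \frac{\partial p^{E}(\mathbf{v})}{\partial v_j}
\tag{252}
$$

and consequently, the Fokker–Planck equation verified by $p^E(t, \mathbf{x}; \mathbf{v})$ is

$$
\frac{d p^{E}}{d t} + v_i \frac{\partial p^{E}}{\partial x_i}
= -\frac{\partial}{\partial v_i} \left[ A_i(\mathbf{x}, \mathbf{V}, \langle P \rangle)\, p^{E} \right]
+ \frac{1}{2} \frac{\partial^2}{\partial v_i^2} \left[ C_0 \langle \epsilon \rangle\, p^{E} \right]
+ \frac{d \langle U_i \rangle}{d t} \frac{\partial p^{E}}{\partial v_i}
+ v_i \frac{\partial \langle U_j \rangle}{\partial x_i} \frac{\partial p^{E}}{\partial v_j} ,
\tag{253}
$$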
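where $d/dt = \partial / \partial t + \langle U_i \rangle\, \partial / \partial x_i$. Let us multiply the previous equation by $H(\mathbf{v})$ and make the same assumptions as in the derivation of Eq. (247) (that is, Eq. (246) is verified with $p^E(\mathbf{v})$, all generalized integrals converge and uniform convergence is achieved). By doing so and after some algebra, one can write

$$
\frac{d \langle H \rangle}{d t} + \frac{\partial \langle v_i H \rangle}{\partial x_i}
= \left\langle A_i \frac{\partial H}{\partial v_i} \right\rangle
+ \frac{1}{2} \left\langle C_0 \langle \epsilon \rangle \frac{\partial^2 H}{\partial v_i^2} \right\rangle
- \frac{d \langle U_i \rangle}{d t} \left\langle \frac{\partial H}{\partial v_i} \right\rangle
- \frac{\partial \langle U_j \rangle}{\partial x_i} \left\langle \frac{\partial (v_i H)}{\partial v_j} \right\rangle .
\tag{254}
$$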
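Inserting $H = 1$ or $v_i$ in the previous equation, one obtains the mean continuity equation and the mean Navier–Stokes equations, respectively. With $H = v_i v_j$, the transport equation for the Reynolds stresses becomes

$$
\frac{\partial \langle u_i u_j \rangle}{\partial t} + \langle U_k \rangle \frac{\partial \langle u_i u_j \rangle}{\partial x_k} + \frac{\partial \langle u_i u_j u_k \rangle}{\partial x_k}
= -\langle u_i u_k \rangle \frac{\partial \langle U_j \rangle}{\partial x_k} - \langle u_j u_k \rangle \frac{\partial \langle U_i \rangle}{\partial x_k}
+ G_{ik} \langle u_j u_k \rangle + G_{jk} \langle u_i u_k \rangle + C_0 \langle \epsilon \rangle \delta_{ij} .
\tag{255}
$$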
6.7.2. Trajectory point of view

Pursuing our trajectory point of view, we present a derivation directly from the particle stochastic differential equations. This is a nice exercise in the application of stochastic calculus. In the pdf point of view, the normalization condition of $p^E$ yields immediately the zero-divergence equation. In the particle point of view, the condition of mass conservation is directly enforced to calculate the mean pressure gradient. Therefore, by construction, we have $\partial \langle U_i \rangle / \partial x_i = 0$. The mean momentum equation is simply obtained by applying the averaging operator to the particle velocity equation (236b)

$$
\langle dU_i \rangle = -\frac{1}{\rho} \frac{\partial \langle P \rangle}{\partial x_i}\, dt .
\tag{256}
$$

Using the relation between the instantaneous substantial derivative and its Eulerian counterpart, $d \cdot / dt = \partial \cdot / \partial t + U_j\, \partial \cdot / \partial x_j$, we obtain

$$
\frac{\partial \langle U_i \rangle}{\partial t} + \langle U_j \rangle \frac{\partial \langle U_i \rangle}{\partial x_j} + \frac{\partial \langle u_i u_j \rangle}{\partial x_j}
= -\frac{1}{\rho} \frac{\partial \langle P \rangle}{\partial x_i} .
\tag{257}
$$
Thus, the high-Reynolds form of the mean Navier–Stokes equation is satisfied. This should not be too surprising since convection is treated without approximation by the Lagrangian point of view and since the mean pressure gradient, which represents the mean value of the acceleration of a fluid particle, is properly taken into account in Eq. (236b). Contrary to the Eulerian approach, no model is needed for the Reynolds stress tensor.

The second-order equations are more interesting. First of all, one has to write the instantaneous equations for the fluctuating velocity components along a particle trajectory. This is done by writing $u_i = U_i - \langle U_i \rangle$ and consequently

$$
\frac{d u_i}{d t} = \frac{d U_i}{d t} - \left( \frac{\partial \langle U_i \rangle}{\partial t} + \langle U_j \rangle \frac{\partial \langle U_i \rangle}{\partial x_j} \right) - u_j \frac{\partial \langle U_i \rangle}{\partial x_j} .
\tag{258}
$$

We now write the equation in an incremental form to properly handle the stochastic terms, which, using the mean Navier–Stokes equation, is

$$
d u_i = \frac{\partial \langle u_i u_k \rangle}{\partial x_k}\, dt - u_k \frac{\partial \langle U_i \rangle}{\partial x_k}\, dt + G_{ik} u_k\, dt + \sqrt{C_0 \langle \epsilon \rangle}\, dW_i .
\tag{259}
$$

The first two terms on the rhs are exact and are independent of the form of the stochastic model. It is seen that, for non-homogeneous turbulence, the mean value of the fluctuating velocity increments is not zero. Failing to include that term is equivalent to mishandling the mean pressure-gradient term. This may lead to non-physical effects, such as spurious drifts
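unfortunately met in most two-phase flow models, see Section 7.5.2. The different SDEs are defined in the Itô sense and the derivatives of the products $u_i u_j$ are obtained from Itô's formula

$$
d(u_i u_j) = u_i\, du_j + u_j\, du_i + C_0 \langle \epsilon \rangle\, dt\, \delta_{ij} .
\tag{260}
$$

The mean second-order equations are then

$$
\frac{\partial \langle u_i u_j \rangle}{\partial t} + \langle U_k \rangle \frac{\partial \langle u_i u_j \rangle}{\partial x_k} + \frac{\partial \langle u_i u_j u_k \rangle}{\partial x_k}
= -\langle u_i u_k \rangle \frac{\partial \langle U_j \rangle}{\partial x_k} - \langle u_j u_k \rangle \frac{\partial \langle U_i \rangle}{\partial x_k}
+ G_{ik} \langle u_j u_k \rangle + G_{jk} \langle u_i u_k \rangle + C_0 \langle \epsilon \rangle \delta_{ij} .
\tag{261}
$$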
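As a quick consistency check on Eq. (261), one can contract the indices ($i = j$) and use the drift of Eqs. (237) and (238) in the simplest case $G^a_{ij} = 0$, for which $G_{ik} = -\delta_{ik} / T_L$. The sketch below is a worked calculation under that assumption, with $k = \frac{1}{2} \langle u_i u_i \rangle$ and $P = -\langle u_i u_k \rangle\, \partial \langle U_i \rangle / \partial x_k$.

```latex
\begin{align*}
\frac{\partial k}{\partial t}
  + \langle U_k \rangle \frac{\partial k}{\partial x_k}
  + \frac{1}{2}\frac{\partial \langle u_i u_i u_k \rangle}{\partial x_k}
 &= P + G_{ik}\langle u_i u_k \rangle + \frac{3}{2} C_0 \langle\epsilon\rangle \\
 &= P - \frac{2k}{T_L} + \frac{3}{2} C_0 \langle\epsilon\rangle \\
 &= P - \left( 1 + \frac{3}{2} C_0 \right)\langle\epsilon\rangle
      + \frac{3}{2} C_0 \langle\epsilon\rangle \\
 &= P - \langle\epsilon\rangle .
\end{align*}
```

The particular combination $\frac{1}{2} + \frac{3}{4} C_0$ in $T_L$ is thus what makes the return-to-the-mean drift and the white-noise term together remove energy at exactly the rate $\langle \epsilon \rangle$.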
The mean convective terms and the triple correlations arise naturally from the average of the derivative along the instantaneous trajectory and are therefore exact. The turbulence production terms on the rhs also stem from convection and, more precisely, from the second term on the rhs of Eq. (259). Since these terms come from exact manipulation, the production terms are treated without approximation. The effect of the model appears on the second line and is seen to correspond to a model for the pressure rate of strain correlation $H_{ij}$. By using the decomposition of the matrix $G$ already introduced

$$
G_{ij} = -\left( \frac{1}{2} + \frac{3}{4} C_0 \right) \frac{\langle \epsilon \rangle}{k}\, \delta_{ij} + G^a_{ij} ,
\tag{262}
$$

a more explicit form is obtained

$$
\frac{\partial \langle u_i u_j \rangle}{\partial t} + \langle U_k \rangle \frac{\partial \langle u_i u_j \rangle}{\partial x_k} + \frac{\partial \langle u_i u_j u_k \rangle}{\partial x_k}
= -\langle u_i u_k \rangle \frac{\partial \langle U_j \rangle}{\partial x_k} - \langle u_j u_k \rangle \frac{\partial \langle U_i \rangle}{\partial x_k}
+ G^a_{ik} \langle u_j u_k \rangle + G^a_{jk} \langle u_i u_k \rangle
- \left( 1 + \frac{3}{2} C_0 \right) \frac{\langle \epsilon \rangle}{k} \left( \langle u_i u_j \rangle - \frac{2}{3} k\, \delta_{ij} \right)
- \frac{2}{3} \langle \epsilon \rangle \delta_{ij} .
\tag{263}
$$

Several remarks can be made at this stage. The first one is that these second-order equations are the first level in the moment hierarchy where the model (through the matrix $G$) manifests itself. This reflects the fact that the stochastic model represents a model for the joint effects of the fluctuating pressure gradient and the fluctuating viscous term. The first mean equation where these effects appear is the second-order one, through the mean work performed by these forces. The particular form of the second-order equation is of course dependent on the expression of the matrix $G$ or $G^a$. For different forms, we get different models of $H_{ij}$, while with the simplest form, $G^a = 0$, the Rotta model (simple return-to-isotropy term) is retrieved. Through different constitutive laws for $G^a$, one can get various turbulence models. This connection between Lagrangian stochastic equations and Reynolds stress modelling has been comprehensively detailed in some papers [10,61] and these works are referred to for further details.

The mean equation picture is completed by writing the equation satisfied by the mean frequency rate $\langle \omega \rangle$. Following the same derivation, one gets

$$
\frac{\partial \langle \omega \rangle}{\partial t} + \langle U_j \rangle \frac{\partial \langle \omega \rangle}{\partial x_j} + \frac{\partial \langle u_j \omega \rangle}{\partial x_j}
= -\langle \omega \rangle^2 \left( S_\omega - h \right) .
\tag{264}
$$

6.7.3. Compressible flows

Mean equations for variable density and compressible flows are derived from the stochastic equations (provided the necessary supplementary models for scalars and other quantities entering
the chosen models are introduced) with the same procedure. In that case, the proper quantity to handle is the Eulerian mass density function $F^E$, which satisfies the same equation as the Lagrangian transitional pdf. The mean values obtained from this mdf are the Favre-averaged means defined by

$$
\langle \rho \rangle(t, \mathbf{x})\, \tilde{H}(t, \mathbf{x}) = \int H(\mathbf{V}, \boldsymbol{\psi})\, F^{E}(t, \mathbf{x}; \mathbf{V}, \boldsymbol{\psi})\, d\mathbf{V}\, d\boldsymbol{\psi} .
\tag{265}
$$

6.8. Physical and information contents

In this subsection, we provide a short summary of the main ideas and we discuss physical aspects of the modelling steps as well as the information content of present models. It is important to be aware of the relation between the choice of a state vector and of a stochastic model on the one hand and the corresponding loss of information on the other hand. This point is discussed in the last part of this subsection.

6.8.1. Summary of the modelling steps

Briefly speaking, we can merge the Lagrangian and the trajectory points of view and say that the exact instantaneous equations are replaced by modelled ones (stochastic models) which still describe the instantaneous behaviour. This is expressed by the following sketch:

$$
\begin{cases}
dx_i^{+} = U_i^{+}\, dt \\[4pt]
dU_i^{+} = -\dfrac{1}{\rho} \dfrac{\partial P^{+}}{\partial x_i}\, dt + \nu\, \Delta U_i^{+}\, dt
\end{cases}
\;\Longrightarrow\;
\begin{cases}
dx_i = U_i\, dt \\[4pt]
dU_i = -\dfrac{1}{\rho} \dfrac{\partial \langle P \rangle}{\partial x_i}\, dt + D_i\, dt + \sqrt{C_0 \langle \epsilon \rangle}\, dW_i
\end{cases}
\tag{266}
$$

The complete model consists therefore in a stochastic diffusion process in terms of the state vector $\mathbf{Z} = (\mathbf{x}, \mathbf{U})$ (the stand-alone model contains an equation for the dissipation rate and the state vector is extended to $\mathbf{Z} = (\mathbf{x}, \mathbf{U}, \epsilon)$, but we leave out this last model in the present section). The trajectories of the process are continuous and a linear law is used for the drift vector $D_i$ as a function of the instantaneous particle velocities. This translates, in turbulence modelling, classical arguments from Statistical Physics. However, the matrices $G$ and $G^a$ can be any function of mean quantities and the resulting pressure rate of strain model in the second-order equations can be of any form.
The precise expression for $G^a$ is not provided by the present derivation and a constitutive law still has to be assumed to get a closed expression.

The preceding section has shown that there is a strong connection between the present class of Lagrangian stochastic models and second-order equations [7], and this connection is even often put forward as a justification of the Lagrangian models. In the present document, another route has been followed. We have worked our way from the Navier–Stokes equations by directly injecting a model for the instantaneous variables which are considered. This is indeed a mesoscopic model in agreement with the general approach described at the beginning of this section. We have used the Kolmogorov theory to help us make a physically based choice of the variables retained in the state vector and to suggest that, at high Reynolds numbers, the fluid particle acceleration was the right variable to eliminate.
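6.8.2. Physical content

We can try to outline the physics accounted for or left out of the present approach by reformulating the modelling steps in slightly different language. The fluid particle acceleration $\mathbf{A}$ is in fact split into three contributions,

$$
\mathbf{A} = \underbrace{-\frac{1}{\rho} \frac{\partial \langle P \rangle}{\partial \mathbf{x}} + G \left( \mathbf{U} - \langle \mathbf{U} \rangle \right)}_{\mathbf{A}_l} + \underbrace{\boldsymbol{\gamma}}_{\mathbf{A}_s} .
\tag{267}
$$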
The sum of the first two terms on the rhs, denoted by $\mathbf{A}_l$, may be regarded as representing the pressure gradient due to the large scales. In the present one-point framework, this term has to be modelled and is here subdivided into a mean (in the spirit of the Reynolds decomposition) and a return-to-the-mean term. This is also a weakness of present one-point formulations since a satisfactory model for the rapid pressure term remains an open issue. A better idea would probably be to try to calculate directly this large-scale part of the interactions between particles, in the spirit of the decomposition performed in large eddy simulations (LES). Proposals along that line of reasoning and suggestions on how to couple present methods with other particle methods such as S.P.H. will be put forward in Section 9.3.2.

Yet, in spite of their current limitations, present stochastic models have a crucial advantage over classical approaches, including the more advanced LES method: they explicitly simulate all the scales at a given point within the flow and not simply the first two moments as Reynolds stress models do, see Section 6.3, or just the larger scales as in LES models. This is manifested by the explicit simulation of $\mathbf{A}_s = \boldsymbol{\gamma}$ for each particle and not just of its indirect influence on the average terms, such as $\langle \mathbf{A} \rangle$ or $\langle A_i U_j \rangle$. The direct consequence is that local source terms, and particularly those that depend upon the entire range of scales, such as chemical reactive source terms, are treated without approximation, see Section 6.5.3. On the contrary, even LES methods, in spite of their interest for large-scale predictions, are faced with the same difficult closure requirement for the small scales. Doing a poor job at that stage, or neglecting the effect of the sub-grid scalar fluctuations on the source terms, can result in predictions that are completely inaccurate, as was shown recently in simple mixing flows [62,63]. In the language of Statistical Physics, the present modelling step, Eq.
(266), can be seen as a kind of mean-field approximation. We have replaced the real problem of $N$ interacting particles by $N$ particles that interact with a mean field. The particles do not interact directly but create a mean field or 'potential' that is applied to each one. It is also called a 'weak interaction' approximation, an expression which already highlights some of its limitations in turbulence modelling.

The last term in the acceleration decomposition, $\boldsymbol{\gamma}$, represents small-scale fluctuations. At high Reynolds numbers, $\boldsymbol{\gamma}$ is treated as a fast process and is replaced by a white-noise term. Indeed, since it represents small-scale effects, it can be seen as a (real) process whose time scale is of the order of the Kolmogorov time scale $\tau_\eta$ and whose variance is, from Kolmogorov's hypotheses,

$$
\langle \gamma^2 \rangle \sim \frac{\langle \epsilon \rangle}{\tau_\eta} .
\tag{268}
$$

Therefore, when $Re \to \infty$ we have

$$
\tau_\eta \to 0, \quad \langle \gamma^2 \rangle \to \infty \quad \text{with} \quad \langle \gamma^2 \rangle \times \tau_\eta \to D = \langle \epsilon \rangle .
\tag{269}
$$
This is fully consistent with the reasoning presented in Section 4 to justify the shift from a real process to a white-noise process. Such a move is therefore justified and mathematically supported. Furthermore, the finite trace which is left when the fast process is eliminated is the expression of one keystone of Kolmogorov theory, which states that, as $\nu \to 0$ or $Re \to \infty$, the dissipation rate of turbulent kinetic energy tends towards a finite and non-zero limit. The closure model for $\boldsymbol{\gamma}$ is written even in the general case of non-homogeneous turbulence. In that case, the model is local in space

$$
\gamma\, dt \;\xrightarrow[Re \to \infty]{}\; \sqrt{C_0 \langle \epsilon \rangle(t, \mathbf{x})}\; dW .
\tag{270}
$$
This is also in line with the ideas of Section 4 and with the notion of slaving variables. The fast variable is slaved by the slow variable, here the position $\mathbf{x}$, and is replaced by its limit value. This is one way to express the idea of the (local) equilibrium of the small scales.

In summary, we can say that there is about as much physics in present pdf models as in Reynolds-stress models, as far as the predictions of mean velocity and/or temperature fields are concerned. The main interest is to provide more local information on the flow and, in particular, on small scales. On the one hand, phenomena governed by small scales as well as large scales are still correctly treated and deviations from Gaussianity can be captured (since the information contains higher-order moments). On the other hand, there is no instantaneous field in these methods and, as such, they cannot claim to simulate coherent structures. These questions will be addressed again in Sections 9.3.1 and 9.3.2. Numerical examples are presented below in Section 6.9 in order to illustrate the previous points by numerical outcomes and to give an idea of how the method works in practice.

6.8.3. Information content

It is also useful to emphasize the information content of present models: what kind of information is available and what is lost? Indeed, a modelling step, such as Eq. (270), in which a differentiable process is replaced by a non-differentiable process (a white noise) has important consequences. From the considerations developed in Section 4, it is clear that such a move implies an irreversible loss of information. In our case, acceleration is eliminated from the state vector, which is valid provided the system (a fluid particle) is observed on a 'large enough time scale', that is provided we have

$$
\tau_\eta \ll dt \ll T .
\tag{271}
$$
As a consequence, terms related to the acceleration or governed by what happens in the range $dt \lesssim \tau_\eta$, in which the model is not valid, may not be treated correctly. A good example is the behaviour of the autocorrelation near the origin. This quantity is of importance in dispersion models for two-phase flows (it will be discussed in Section 7 and in particular in Section 7.5.3). Let us consider, for the sake of simplicity, homogeneous stationary turbulence, where present models have the simple generic form

$$
dU = -\frac{U}{T_L}\, dt + \xi\, dt \quad \text{with} \quad \xi\, dt = \sqrt{K}\, dW .
\tag{272}
$$

The notations have been chosen so as to agree with those used in Section 4.3. From the results given in that section, we know that when the limit of a white-noise term is taken as in
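Eq. (270), the autocorrelation is an exponential function (with $T = T_L$ for the model of Eq. (272))

$$
R_f(s) = \frac{\langle U(t_0)\, U(t_0 + s) \rangle}{\langle U^2(t_0) \rangle} = \exp\left( -\frac{s}{T} \right) .
\tag{273}
$$

Then, the derivative of $R_f(s)$ at the origin is

$$
R_f'(0) = \left. \frac{d R_f(s)}{d s} \right|_{s=0} = -\frac{1}{T} ,
\tag{274}
$$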
instead of zero as it should be for a real process, see Section 5. However, this is not really a new element and it cannot be used to criticize the Langevin model. Indeed, the slope of $R_f(s)$ when $s \to 0$ is governed by the acceleration, or in other words by the behaviour of the increments of fluid velocities over time steps in the range $dt \lesssim \tau_\eta$, that is outside the range of validity of the model. This can be seen by considering increments of $R_f(s)$ around $s = 0$ for small values of $s$

$$
d R_f(s) = R_f(s) - R_f(0) = \frac{1}{\langle U^2 \rangle} \langle U(0)\, dU(s) \rangle = -\frac{1}{T}\, ds + \frac{1}{\langle U^2 \rangle} \langle \xi(0)\, U(0) \rangle\, ds .
\tag{275}
$$

When $\xi$ is treated as a white noise (details on the acceleration are lost), i.e. $\xi\, ds = \sqrt{K}\, dW$, then from the rules of stochastic calculus, we have

$$
\langle \xi(0)\, U(0) \rangle = 0 \;\Rightarrow\; d R_f(s) = -\frac{1}{T}\, ds \quad \text{or} \quad R_f'(0) = -\frac{1}{T} .
\tag{276}
$$

Yet, when acceleration remains a differentiable process, the rules of classical calculus apply. From the conservation of energy, $\langle U^2 \rangle$ being constant, we have

$$
d \langle U^2 \rangle = -2 \frac{\langle U^2 \rangle}{T}\, dt + 2 \langle \xi U \rangle\, dt = 0
\quad \text{so that} \quad
\langle \xi U \rangle = \frac{\langle U^2 \rangle}{T}
\tag{277}
$$

and, in that case, we obtain the correct derivative for $R_f(s)$, $R_f'(0) = 0$. In consequence, if one is interested in the behaviour of $R_f(t)$ near the origin, then particle acceleration must be properly handled in the modelling steps. From the discussion in Section 4.3, a satisfactory answer is to include the acceleration in the state vector and to shift the modelling to the time rate of change of the acceleration. For example, in the last process presented in Section 4.3, Eqs. (109), the noise term is not a white noise but a coloured noise. Thus, the acceleration exists ($U$ is now a differentiable process). Such a model may be useful to simulate fluid velocity even when $dt \lesssim \tau_\eta$, by assuming that the characteristic time scale of this noise, that is $\tau$ in Eqs. (109), is equal to the Kolmogorov time scale, $\tau = \tau_\eta$. From the results of Section 4, the form of the velocity autocorrelation is

$$
R_f(s) = \frac{1}{1 - \tau / T} \left[ e^{-s/T} - \frac{\tau}{T}\, e^{-s/\tau} \right] ,
\tag{278}
$$

which does respect the two constraints $R_f(0) = 1$ and $R_f'(0) = 0$. It is also clear from the above expression that the correct limit is a question of how the limit is taken when both $s$ and $\tau$ go to zero. If, for a fixed $s$, we first take the limit $\tau \to 0$, then we retrieve the Langevin model for fluid velocities and in consequence the slope is not correct. On the other hand, if we first take the limit $s \to 0$ for a fixed $\tau$, which means that we first calculate $R_f'(0)$ when fluid acceleration still exists, and then $\tau \to 0$, we have the correct limit.
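The order-of-limits argument can be checked numerically. The Python sketch below evaluates the two autocorrelation functions, Eq. (273) for the Langevin model and Eq. (278) for the coloured-noise model, and estimates their slopes at the origin by a one-sided finite difference. The function names and the numerical values of $T$, $\tau$ and the step `h` are illustrative assumptions.

```python
import math


def r_langevin(s, T):
    """Eq. (273): autocorrelation of the Langevin (white-noise) model."""
    return math.exp(-s / T)


def r_coloured(s, T, tau):
    """Eq. (278): autocorrelation when the noise has a finite time scale tau
    (requires tau != T)."""
    return (math.exp(-s / T) - (tau / T) * math.exp(-s / tau)) / (1.0 - tau / T)


def slope_at_origin(r, h=1.0e-6):
    """One-sided finite-difference estimate of R'(0+) for a function r(s)."""
    return (r(h) - r(0.0)) / h


if __name__ == "__main__":
    T, tau = 1.0, 0.01
    print("Langevin slope at 0 :", slope_at_origin(lambda s: r_langevin(s, T)))
    print("coloured slope at 0 :", slope_at_origin(lambda s: r_coloured(s, T, tau)))
    # Fixed s, then tau -> 0: the coloured-noise result tends to exp(-s/T).
    s = 0.1
    for small_tau in (1e-2, 1e-4, 1e-6):
        print(small_tau, r_coloured(s, T, small_tau), r_langevin(s, T))
```

With $s \to 0$ taken first at fixed $\tau$, the slope of Eq. (278) vanishes, whereas taking $\tau \to 0$ first at fixed $s$ recovers the exponential of Eq. (273) and its slope $-1/T$.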
The point emphasized by this example is that one should be aware of the loss of information implied by the necessary modelling steps such as in Eq. (270). A clear physical analysis must then be performed beforehand in order to identify what variables are needed for specific issues.

6.9. Numerical examples

In the last part of this section, we present and discuss numerical simulations based on pdf models. These numerical examples illustrate how the method works in practice and the kind of information that can be extracted from present models.

Present pdf or particle stochastic models can be applied in different ways. First of all, pdf models can be coupled to conventional Eulerian solvers in various ways. In some applications, the pdf approach is limited to scalar variables (in order to treat reaction source terms correctly) while the fields of mean velocity, Reynolds stresses and mean pressure are calculated by solving their PDEs (partial differential equations) on a mesh using classical numerical methods such as finite volume. Thus, an Eulerian solver provides a particle (or pdf) solver with these mean fields, and in return the particle solver sends back the value of the mean source terms and of the mean density to the Eulerian solver. In other numerical methods, the moment and pdf approaches are also coupled for the solution of the dynamical variables. For example, one can compute the mean Navier–Stokes equation with a classical Eulerian solver, but where the Reynolds stresses are calculated from the particle set instead of being modelled as in $k$–$\epsilon$ models. This coupled numerical method was applied by Haworth and El Tahry in an interesting industrial application [64]. A new coupling procedure has recently been put forward [65], in which the Eulerian and Lagrangian solvers are coupled in a consistent way (the introduction of this article provides an up-to-date overview of the various coupling strategies).
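To give an idea of what "calculated from the particle set" can mean in practice, here is a minimal Python sketch that bins particles on a uniform one-dimensional mesh and returns, for each cell, the ensemble mean velocity and the Reynolds stress tensor $\langle u_i u_j \rangle$. The function name, the 1-D mesh and the plain ensemble average inside each cell are assumptions made for illustration; the codes cited in the text may use different estimators.

```python
import math


def cell_statistics(positions, velocities, x_min, x_max, n_cells):
    """Estimate mean velocity and Reynolds stresses per cell from particles.

    positions  : list of particle x-coordinates
    velocities : list of 3-component velocity lists, one per particle
    Returns a list with one entry per cell: None for an empty cell,
    otherwise a tuple (mean_velocity, reynolds_stress).
    """
    width = (x_max - x_min) / n_cells
    members = [[] for _ in range(n_cells)]
    for x, U in zip(positions, velocities):
        idx = math.floor((x - x_min) / width)
        if 0 <= idx < n_cells:          # particles outside the mesh are ignored
            members[idx].append(U)

    stats = []
    for cell in members:
        if not cell:
            stats.append(None)
            continue
        n = len(cell)
        mean = [sum(U[i] for U in cell) / n for i in range(3)]
        # <u_i u_j> with u = U - <U>, estimated by the ensemble average in the cell
        stress = [
            [sum((U[i] - mean[i]) * (U[j] - mean[j]) for U in cell) / n
             for j in range(3)]
            for i in range(3)
        ]
        stats.append((mean, stress))
    return stats


if __name__ == "__main__":
    xs = [0.1, 0.2, 0.8]
    Us = [[1.0, 0.0, 0.0], [3.0, 0.0, 0.0], [2.0, 2.0, 0.0]]
    for cell in cell_statistics(xs, Us, 0.0, 1.0, 2):
        print(cell)
```

In a coupled computation these cell values would be handed to the Eulerian solver in place of a modelled Reynolds stress. The statistical error of such estimates decreases with the number of particles per cell, which is one reason why the number of particles matters in practice.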
The coupling of pdf models to Eulerian solvers is not a necessity and a second class of numerical methods makes use only of particles. There is no Eulerian solver in the background and everything (all the necessary or useful fields) is computed from the set of particles. This is called a stand-alone pdf code. This method is by construction fully consistent since there is only one approach (the Lagrangian one using stochastic particles). From a strictly practical point of view, this stand-alone approach may not be the most efficient in terms of computational costs [65]. However, a stand-alone simulation represents a more stringent numerical test compared to coupled approaches. Since there is no Eulerian calculation performed with classical models, any numerical bias or ill-posed numerical algorithm that could be present in the pdf solver is clearly revealed whereas it could remain hidden in a coupled numerical solution. The simulations presented below have been obtained with such a stand-alone approach. Various applications have been described by Pope and different co-workers over the years. These are summarized in a specific chapter (Chapter 12) of Pope’s textbook on Turbulent Flows [27]. A detailed analysis of numerical errors and performances of the related stand-alone code has been given by Xu and Pope recently [56]. Another stand-alone code has been developed by the first author of this paper in collaboration with Jacek Pozorski (IMP, Polish Academy of Science, Gdansk) and the simulations described below have been obtained with this code. Details on the numerical algorithm have been provided in a recent publication along with one specific application [66]. A comprehensive discussion of the overall algorithm or of specific numerical
Fig. 9. Weak approximation of the pdf.
steps is not within the scope of the present paper (which is devoted mainly to two-phase flows), but it may be useful to give the main lines:
• the flow is represented by a large number of particles. These particles correspond to samples of the one-point pdf, or represent different realizations of the flow,
• a number of variables (position, velocity, turbulent frequency, scalars, etc.) are attached to each particle. The particles are statistical models of real fluid particles and the time evolution equations of particle variables are (generally) stochastic equations,
• the meaningful quantities are statistical averages and are calculated locally as ensemble averages from the set of particles found in a small volume around the point of interest, see Fig. 9. This corresponds to a weak approximation of the underlying one-point pdf, as indicated in Sections 2.1.2 and 2.10, with the exact pdf being approximated by
$$ p(\psi) \simeq p_N(\psi) = \frac{1}{N} \sum_{i=1}^{N} \delta(\phi_i - \psi), \tag{279} $$
where $\phi_i$ is the value of the variable attached to the particle labelled $i$ and $\psi$ is the corresponding sample-space variable.
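The local ensemble averaging implied by Eq. (279) can be illustrated with a minimal sketch. It is not the algorithm of [66]: the one-dimensional domain, the function name `cell_averages` and the synthetic velocity field are assumptions made purely for illustration. Each particle is located in a cell, and the mean and variance of the attached variable are the plain averages over the particles found in that cell.

```python
import random

def cell_averages(positions, values, n_cells, x_min=0.0, x_max=1.0):
    """Per-cell (count, mean, variance) of `values` for particles binned by position.

    Assumes all positions lie in [x_min, x_max].
    """
    width = (x_max - x_min) / n_cells
    counts = [0] * n_cells
    sums = [0.0] * n_cells
    sq_sums = [0.0] * n_cells
    for x, v in zip(positions, values):
        # Locate the cell containing the particle (clamp the right boundary).
        k = min(int((x - x_min) / width), n_cells - 1)
        counts[k] += 1
        sums[k] += v
        sq_sums[k] += v * v
    stats = []
    for k in range(n_cells):
        if counts[k] == 0:
            stats.append((0, float("nan"), float("nan")))
            continue
        mean = sums[k] / counts[k]
        variance = sq_sums[k] / counts[k] - mean * mean
        stats.append((counts[k], mean, variance))
    return stats

random.seed(0)
n_particles = 20000
positions = [random.random() for _ in range(n_particles)]
# Hypothetical test field: mean velocity 1 + x with Gaussian fluctuations (r.m.s. 0.2).
velocities = [1.0 + x + random.gauss(0.0, 0.2) for x in positions]
for k, (n, mean, var) in enumerate(cell_averages(positions, velocities, n_cells=10)):
    print(f"cell {k}: N={n:5d}  mean={mean:.3f}  variance={var:.4f}")
```

The statistical error of each cell average decreases as the inverse square root of the number of particles in the cell, which is why the number of particles per cell is a key numerical parameter in such methods.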
For a given application, the domain is first covered with a mesh and is thus divided into small cells. Although the stand-alone approach is in spirit a grid-free method, a mesh is introduced for practical reasons. The mesh is used to locate particles and to calculate statistical averages from the set of particles present in each cell (or perhaps also in neighbouring cells, depending upon the chosen method) and also to compute the mean pressure field required to satisfy the mean continuity equation (which is therefore satisfied at the level of the size of a cell) [66]. The global numerical method therefore falls in the category of Monte Carlo particle/mesh approaches. A large number of particles are distributed continuously in the domain. At each time step, particle properties are updated by integrating the stochastic differential equations in time. Boundary conditions are applied and particles may leave or enter the domain depending upon the case considered. Then, particle locations and velocities are corrected so as to ensure the
Fig. 10. Sketch of a mixing layer.
mean continuity equation [66]. Finally, statistical quantities, such as the mean velocity field, Reynolds stresses, ..., are computed in each cell as local ensemble averages.
6.9.1. Mixing layer
The first example described is a mixing layer, see Fig. 10. This is a typical free shear flow which has been well studied over the years. A mixing layer is a turbulent flow obtained by mixing two streams of different velocities, say $U_l$ and $U_h$ (with $U_l \le U_h$). The characteristic scales are indicated in Fig. 10: a longitudinal scale $L$ which scales as $x$, a transverse scale $l$, a velocity scale $U_s = U_h - U_l$ and a velocity scale for turbulence $u$. We define the characteristic width of the flow by $\delta = y_{0.95} - y_{0.05}$, where $y_{0.05}$ is the location at which we have $U(y) = U_l + 0.05\,U_s$ and $y_{0.95}$ the location at which we have $U(y) = U_l + 0.95\,U_s$. Like other free shear flows, mixing layers are evolving flows and the transverse distributions of the different variables change with the downstream position. However, an experimentally supported assumption is to regard these flows as self-preserving: the different variables retain the same form and become independent of the longitudinal coordinate $x$ once they are scaled by the corresponding scales (which grow as a function of $x$). For a mixing layer, the driving force $U_s$ remains constant. The turbulent velocity scale $u$ satisfies $u/U_s = O((\delta/L)^{1/2})$. The characteristic width of the flow grows linearly with $x$ and consequently the ratio $\delta/L$ is constant. Then, $u$ is a constant fraction of $U_s$ and is also constant. Experimental evidence indicates that the entrainment parameter
$$ S = \frac{U_l + U_h}{U_h - U_l}\,\frac{d\delta}{dx} \tag{280} $$
is constant, $S \simeq 0.056$. We present some results for a mixing layer with $U_l = 1$ m/s and $U_h = 1.5$ m/s. Further details can be found in a proceedings paper which describes the computation results [59]. The ratio $U_l/U_h = 0.67$ is close to one, and thus the present case corresponds approximately to a temporal mixing layer.
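As an illustration of the definition $\delta = y_{0.95} - y_{0.05}$, the following sketch extracts the layer width from a discretized mean velocity profile by linear interpolation. The error-function profile, its thickness parameter 0.05 and the function names are assumptions chosen only to have a smooth monotonic test case; they are not taken from the computations reported here.

```python
import math

def crossing(y, u, target):
    """Location where the monotonically increasing profile u(y) equals `target`."""
    for k in range(len(y) - 1):
        if u[k] <= target <= u[k + 1] and u[k + 1] > u[k]:
            frac = (target - u[k]) / (u[k + 1] - u[k])
            return y[k] + frac * (y[k + 1] - y[k])
    raise ValueError("target value is outside the range of the profile")

def layer_width(y, u, u_low, u_high):
    """Width delta = y_0.95 - y_0.05 based on the velocity difference U_s."""
    u_s = u_high - u_low
    y05 = crossing(y, u, u_low + 0.05 * u_s)
    y95 = crossing(y, u, u_low + 0.95 * u_s)
    return y95 - y05

u_low, u_high = 1.0, 1.5  # stream velocities of the case discussed above, in m/s
y = [-0.5 + 0.001 * k for k in range(1001)]
# Hypothetical smooth profile used only as test data.
u = [u_low + 0.5 * (u_high - u_low) * (1.0 + math.erf(yy / 0.05)) for yy in y]
delta = layer_width(y, u, u_low, u_high)
print(f"delta = {delta:.4f} m")
```

Applied at successive downstream stations, the same routine would give $\delta(x)$, from which the spreading rate $d\delta/dx$ can be estimated by a linear fit.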
For this calculation, a specific version of the general algorithm was used. Computations were performed with the boundary-layer algorithm [6] in which particles represent a fixed amount of axial momentum instead of a fixed amount of mass and are marched downstream until a self-similar state is reached. The numbers of particles simulated varied from
Fig. 11. Profile of the mean axial velocity. Comparison with the experiments of Wygnanski and Fiedler (•), Patel (?) and Champagne et al. ( ).
about 15 000 at the beginning of the computations to about 100 000 at the end. The overall calculation took around 30–35 min for 1000 time steps on a 40 MEGAFLOPS HP workstation. It must be noted that the pdf model used in the calculations corresponds to the refined Langevin model [58], in which the diffusion coefficient in the velocity equation is written in terms of the instantaneous dissipation rate instead of the mean value. The detailed form of this model is more complicated than the models presented in Section 6.6, with many additional variables that increase the computational costs with respect to the simpler SLM (simple Langevin model). The calculated spreading rate $d\delta/dx$ is found to be 0.036 whereas the experimental formula predicts a value of 0.037. It is first assessed that a self-similar state is reached by checking that profiles plotted as a function of $\eta = y/\delta$ collapse on the same curves. Then, the profiles of mean quantities are compared to the experimental measurements of Patel [67], Champagne et al. [68] and Wygnanski and Fiedler [69]. The mean velocity and the non-zero components of the Reynolds stress tensor are shown in Figs. 11 and 12. Higher-order moments, which are not accessible for classical turbulence models, are easily calculated from the set of particles with nearly no extra computational costs. For example, triple correlations and skewness and flatness factors of the velocity components are displayed in Figs. 13 and 14. The following figures illustrate the type of information that can be extracted from the model. Fig. 15 is a scatter plot of the instantaneous streamwise velocity of a selection of particles (of the order of 10 000 particles). The low-speed and the high-speed sides are clearly visible; there, particles are in a laminar state and have nearly the same velocity, so that the points collapse on a single line. In the core of the mixing layer, the flow is turbulent and instantaneous velocities are scattered.
At a given lateral location, the spread is representative of the r.m.s. axial velocity fluctuation $\langle uu \rangle^{1/2}$. The next figure is very instructive (Fig. 16). It displays a scatter plot of the instantaneous value of the logarithm of the relaxation rate, $\log\omega$, for the same group of particles. At the centre of the mixing layer ($\eta = 0$), values of the random variable are spread around a central value. Only one group is visible and the level of $\log\omega$ is a first hint that they are all turbulent particles. At the edges of the layer (for example, $\eta = y/\delta = 0.15$), a different behaviour is evident: there is clearly a bimodal distribution. Particles which are located in the corresponding cell can be divided into two groups. The first group is made up of particles with roughly the same level of
Fig. 12. Profiles of the Reynolds stress components. Comparison with the experiments of Wygnanski and Fiedler (•), Patel (?) and Champagne et al. ( ).
Fig. 13. Various third-order moments. Comparison with the experiments of Wygnanski and Fiedler (•). The solid line is to be compared with the filled symbols while the dotted line is to be compared with the empty symbols.
$\log\omega$ as those which are in the centre of the layer. Compared with the particles which are in the centre, fewer particles constitute this first subset. The second subset is made up of particles having the same low value of $\omega$, $\omega \simeq 0$, indicative of laminar particles. These two subsets represent turbulent and laminar particles, respectively, and their simultaneous presence is the
Fig. 14. Skewness and flatness factors. Comparison with the experiments of Wygnanski and Fiedler (•) and Champagne et al. ( ).
Fig. 15. Scatter plot of axial instantaneous particle velocities. Fig. 16. Scatter plot of the logarithm of the instantaneous relaxation rate of an ensemble of particles.
translation of the intermittent nature of the flow. In order to bring out this intermittent behaviour in a more quantitative way, the pdf of the local relaxation rate can be derived. Two cells have been considered, one in the centre of the mixing layer where the fluid is fully turbulent and one at the edge of the layer where the flow is highly intermittent. By considering the ensemble of particles present in these cells and by sorting the value of $\omega$ into bins, it is possible to plot the pdf of the random variable $\omega$. Fig. 17 presents the result in the centre of the layer.
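The bin-sorting step can be sketched as follows. This is only a generic normalized histogram, with synthetic log-normal samples standing in for the particle values of $\omega$; the parameters `mu` and `sigma` and the bin range are assumptions for illustration, not values from the simulation.

```python
import math
import random

def binned_pdf(samples, n_bins, lo, hi):
    """Normalized histogram: returns (bin centres, pdf estimates) on [lo, hi)."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in samples:
        if lo <= s < hi:
            k = min(int((s - lo) / width), n_bins - 1)
            counts[k] += 1
    centres = [lo + (k + 0.5) * width for k in range(n_bins)]
    # Divide by the total sample size so that the pdf integrates to the
    # fraction of samples falling inside [lo, hi).
    pdf = [c / (len(samples) * width) for c in counts]
    return centres, pdf

random.seed(1)
mu, sigma = 0.0, 0.5  # hypothetical parameters of log(omega)
omega = [random.lognormvariate(mu, sigma) for _ in range(50000)]
log_omega = [math.log(w) for w in omega]
centres, pdf = binned_pdf(log_omega, n_bins=20, lo=-2.0, hi=2.0)
for c, p in zip(centres, pdf):
    gauss = math.exp(-0.5 * ((c - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
    print(f"log(omega)={c:+.2f}  binned pdf={p:.3f}  Gaussian={gauss:.3f}")
```

Binning $\log\omega$ rather than $\omega$ makes a log-normal distribution appear as a Gaussian curve, which is a convenient visual check.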
Fig. 17. Distribution of the pdf of the dissipation rate $\omega$ in the centre of the mixing layer. Fig. 18. Distribution of the pdf of the dissipation rate $\omega$ at the edge of the mixing layer.
Only one group of particles is present (turbulent particles, $\gamma \simeq 1$) and the distribution follows a log-normal law. Such an outcome should not be regarded as a prediction of the model. Indeed, the model is built upon the refined Kolmogorov hypotheses [58] and therefore assumes that the relaxation rate is log-normally distributed. However, this figure reflects an important physical behaviour: at one location, even when the flow is fully turbulent, dissipation is not given by a single value $\epsilon$ but is a random variable. In a homogeneous field, instantaneous snapshots of the dissipation would reveal regions of high turbulent activity (high value of $\epsilon$) concentrated in small volumes of the flow (low probability of these events). This spotty picture becomes more evident as the Reynolds number increases and is the signature of internal intermittency. Another type of behaviour is brought out by drawing similar results with the cell near one edge of the layer (Fig. 18). As described above, there are two groups of particles. Particles belonging to the laminar subset all have the value $\omega \simeq 0$, therefore building a Dirac peak at the origin. Particles in the turbulent subset have scattered values and give rise to the second part of the curve. Although not obvious from the graph, the instantaneous values within the turbulent subset follow a log-normal law. The typical log-normal distribution is here flattened since turbulent particles represent a small fraction of the whole set. The relative importance of the two distributions (Dirac versus log-normal) manifests the relative importance of the two subsets and represents directly the intermittency factor $\gamma$. It is here a result of the present approach and proves that the model is able to simulate external intermittency. The ability of pdf models to simulate external intermittency naturally, and to give access to conditional statistics (statistics calculated only with ‘turbulent’ particles) as well as unconditional statistics, is an attractive feature.
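A minimal sketch of how an intermittency factor and a conditional average could be extracted from the particles of one cell is given below. The criterion used here, a fixed threshold on $\omega$, is an assumption made for illustration (the analysis actually used is described in [59]), as are the synthetic two-population data and the function names.

```python
import random

def intermittency_factor(omega, threshold):
    """Fraction of particles flagged as turbulent (omega above the threshold)."""
    n_turbulent = sum(1 for w in omega if w > threshold)
    return n_turbulent / len(omega)

def conditional_mean(values, omega, threshold):
    """Average of `values` restricted to the particles flagged as turbulent."""
    selected = [v for v, w in zip(values, omega) if w > threshold]
    return sum(selected) / len(selected) if selected else float("nan")

random.seed(2)
omega, velocity = [], []
for _ in range(10000):
    if random.random() < 0.3:
        # Hypothetical turbulent particle: log-normal omega, scattered velocity.
        omega.append(random.lognormvariate(0.0, 0.5))
        velocity.append(random.gauss(1.2, 0.1))
    else:
        # Hypothetical laminar particle: omega = 0, free-stream velocity.
        omega.append(0.0)
        velocity.append(1.0)

gamma = intermittency_factor(omega, threshold=1e-3)
u_turb = conditional_mean(velocity, omega, threshold=1e-3)
u_all = sum(velocity) / len(velocity)
print(f"gamma = {gamma:.3f}, conditional mean = {u_turb:.3f}, unconditional mean = {u_all:.3f}")
```

The difference between the conditional and unconditional means shows why both kinds of statistics are useful near the edges of the layer.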
A detailed physical analysis of the external intermittency coefficient and its relation with variables of the model (in particular the dissipation-weighted kinetic energy) was proposed in [59]. The comparison between calculated and experimental intermittency coefficients is shown in Fig. 19. Other results showing the deviation of the pdf of velocity components from Gaussianity were also presented in the same work [59].
Fig. 19. External intermittency coefficient. Comparison with the experiments of Wygnanski and Fiedler (•). The solid line represents a direct computation based on the fraction of turbulent particles in each cell. The dotted line is the evaluation of $\gamma$ based on the ratio $k/\tilde{k}$ of the kinetic energy over the dissipation-weighted kinetic energy.
Fig. 20. Sketch of the turbulent channel flow with an indication of typical boundary conditions.
6.9.2. Channel flow
Another class of flows concerns wall-bounded turbulent flows. A typical example is the channel flow. A sketch of the geometry is given in Fig. 20. Compared to free shear flows, computations of wall-bounded flows require proper particle boundary conditions. These conditions have been developed in a recently published paper [66]. In this work, a complete 2D algorithm was detailed and results obtained on a high Reynolds number channel flow were presented, including an analysis of numerical errors and the importance of variance reduction techniques. The independence of results with respect to the choice of the mesh and to the choice of the number of particles per cell was also demonstrated [66].
Fig. 21. Mean streamwise velocity and Reynolds stress as a function of the normalized cross-stream coordinate.
Time-averaged results are presented below. They represent a typical pdf calculation of a high Reynolds number channel flow ($Re = 120\,000$ based on the channel half-width and the bulk velocity, or $Re_\tau = 4800$ based on the friction velocity) and are compared to the experimental results of Comte–Bellot. The computation took 5 min for 1000 time steps with 25 000 particles and a 10 × 25 mesh on an HP785/C3000 workstation. Since detailed numerical results are well described in [66], we limit ourselves to two main outcomes. These results have been obtained with the SLM instead of the RLM (refined Langevin model) used for the mixing layer case, and the equations of the model correspond to the models described in Section 6.6. The corresponding second-order model is the Rotta model, which includes only a return-to-isotropy term. This is not the best model to use to obtain the most satisfactory agreement with experimental data. However, the purpose of the computations presented here is more to illustrate the approach and to validate the overall numerical algorithm than to obtain the best possible results. The plots in Fig. 21 present the mean streamwise velocity as well as components of the Reynolds stress tensor $R_{ij}$. The limits obtained in the simulations for the values of the Reynolds stress components at the wall are in line with the analytical values derived from the choice of the model in the logarithmic layer [66]. Higher-order statistics have also been computed, for example the skewness and the flatness of velocity components, which are useful to analyse the deviations of velocity pdfs from Gaussianity. They are shown in Fig. 22.
6.9.3. Heated channel flow
Finally, we report computational results for scalar variables. The first two cases presented above concerned only dynamical variables. The interest in pdf models increases when more complex physics comes into play, since its inclusion represents only a small overhead. For
Fig. 22. Skewness and flatness factors of velocity components as a function of the normalized wall distance. Comparison with the experiment of Comte–Bellot.
Fig. 23. Sketch of the heated channel flow with an indication of typical boundary conditions.
example, to simulate a turbulent flow with heat exchange one has to extend the particle state vector to scalar variables for which modelled time-evolution equations have to be written. Typical models for scalars, as well as related issues, are discussed in Section 9.3.3. As a direct follow-up of previous calculations, we present below results for a heated channel flow where temperature is treated as a passive scalar with constant heat flux at the walls. A sketch of the problem with particle boundary conditions is shown in Fig. 23. This calculation is a recent computation obtained with a full velocity-scalar pdf stand-alone approach and with the numerical code developed with Jacek Pozorski (IMP, Polish Academy of Science, Gdansk). Results were first presented in a conference poster paper [70]. Yet, the
Fig. 24. Temperature pdf in the core region of the heated/cooled channel. Left: scatter plot of instantaneous particle temperatures. Right: resulting pdf and histogram.
Fig. 25. Temperature pdfs in the core region of the heated/cooled channel. Comparison of computed pdf (solid line) and Gaussian pdf (dotted line).
case is still being investigated and we only describe here preliminary results. Fig. 24 represents a scatter plot of instantaneous particle temperature values in a small volume located near the core region of the channel. The pdf corresponding to this temperature distribution is deduced simply by bin-counting and is also shown in Fig. 24. It is clear from this result that the pdf is non-Gaussian and this is further revealed in Fig. 25 by comparing directly the calculated pdf with the Gaussian form. It is seen that the model captures a deviation from Gaussian statistics in this non-homogeneous case.
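Deviations from Gaussianity such as the one discussed above are commonly quantified by the skewness and flatness factors, which equal 0 and 3, respectively, for a Gaussian variable. The sketch below computes both from a set of samples; the synthetic 'temperature' data (a Gaussian sample and a two-population mixture) are assumptions for illustration only and are unrelated to the results of Figs. 24 and 25.

```python
import random

def skewness_flatness(samples):
    """Skewness m3/m2^(3/2) and flatness m4/m2^2 from central moments."""
    n = len(samples)
    mean = sum(samples) / n
    m2 = sum((s - mean) ** 2 for s in samples) / n
    m3 = sum((s - mean) ** 3 for s in samples) / n
    m4 = sum((s - mean) ** 4 for s in samples) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

random.seed(3)
# Reference Gaussian sample.
gaussian = [random.gauss(0.0, 1.0) for _ in range(50000)]
# Hypothetical non-Gaussian sample: mixture of a narrow and a shifted broad population.
mixture = [random.gauss(0.0, 0.5) if random.random() < 0.8 else random.gauss(2.0, 1.0)
           for _ in range(50000)]

for name, data in (("Gaussian", gaussian), ("mixture", mixture)):
    s, f = skewness_flatness(data)
    print(f"{name:9s} skewness={s:+.3f} flatness={f:.3f}")
```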
7. One-point particle pdf models in two-phase flows
Most of the necessary preliminary work has been performed and we are in a position to develop the pdf approach to two-phase flows. Indeed, the mathematical tools were explained in Section 2, the relations between various state vectors and the hierarchies of pdf equations have been revealed in Section 3 and the physical meaning of using SDEs has been proposed in Section 4. Furthermore, Section 6 has shown how to apply these concepts for single-phase pdf modelling, already introducing all the ideas behind the probabilistic description of turbulent flows. All these tools and results will now be used directly. The pdf approach to two-phase flow modelling will be divided into two sections, the present one and the following one. The following one will deal with the complete formalism for the two phases, the fluid and the particle phases, while the present section adopts a pdf point of view only for the particle phase. This is done for two main reasons. First of all, this two-step approach towards the complete pdf formalism greatly simplifies the theoretical and probabilistic tools. We will then be able to provide an in-depth discussion of some of the physical issues involved without being impeded by the mathematical framework. A second reason is that, by making this choice, the approach presented in this section falls into the rather classical Lagrangian, also called particle-tracking, approach for which a vast literature exists, see for example one of the latest reviews by Stock [12] and the references quoted therein. In this literature, the modelling problem is addressed directly through a heuristic use of stochastic models and without making much use of the mathematical background. Although this short-cut method may allow one to obtain numerical answers to certain problems quickly, it is believed that paying insufficient attention to the theoretical framework has several drawbacks and quite often leads to severe shortcomings.
For instance, a lack of understanding of the assumptions behind stochastic modelling brings about confusion on the nature of the approach. Moreover, the fact that many models actually belong to the same class is missed and these models are believed to be different (for example, Markov chains and random walks are presented as if they were completely different mathematical objects, or the fact that current Lagrangian models are pdf models is not understood). Finally, and even more importantly, some difficulties are created that actually do not exist (one such example is the notion of the so-called spurious drifts, see Section 7.5.2) while important open questions do not receive enough attention. With respect to that context, the objectives of the present section are (i) to specify what a pdf description is, as well as its relation with macroscopic approaches and its interest, (ii) to give an idea of the physical content of present state-of-the-art models and to emphasize the real modelling issues, (iii) to show that, once mathematical properties and the physical assumptions behind stochastic models are clearly explained, a number of difficulties disappear, (iv) to present numerical examples that illustrate how pdf models work in practice and the kind of information they provide.
7.1. Fundamental equations and modelling approaches
The case of dispersed two-phase flows is now considered, i.e. flows where a continuous phase (a gas or a liquid) is carrying discrete particles (solid particles, droplets, bubbles, etc.), so that, for instance, cases like wavy or slugging two-phase flows are outside the scope of the present work. We will also assume that the characteristic length scale of the discrete particle is much smaller than the characteristic length scale of the continuous phase. New physical points, related to motion of discrete particles in turbulent flows, are introduced in this section and are worth detailing. The form of discrete particle equations is first discussed before considering various modelling approaches that can be followed.
7.1.1. Particle equation of motion
The starting point of the statistical approach is to write the exact equations of motion of the discrete particles. Loosely speaking, these equations are the equivalent of the fundamental equations for the fluid phase, that is the Navier–Stokes equations. Yet, the situation is more complex: the equations of motion for a discrete particle in a turbulent flow field do not have the same level of validity as the Navier–Stokes equations and the exact form remains a subject of current research. This question has a long history. Stokes first worked out the resistive drag force for a steady creeping flow in 1851. Basset (1888), Boussinesq (1903) and Oseen (1927) then extended the equation to the case of a particle accelerating in a fluid at rest. Tchen, in 1947 [71], was the first to generalize this equation for a particle in a turbulent flow. Since then, various contributors have addressed the question and have proposed modifications of the expression of the forces, however often in an ad hoc manner. The derivation of the forces was analytically carried out from first principles by Maxey and Riley [72] and more rigorously (in our opinion) by Gatignol [73].
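In the following, we first indicate the main results and then discuss specific issues at greater length. For negligible relative Reynolds numbers, $Re_p = d_p |\mathbf{U}_r| / \nu_f$, based on the particle diameter $d_p$ and on the local instantaneous relative velocity between the fluid and the particle, $\mathbf{U}_r = \mathbf{U}_s - \mathbf{U}_p$, the equation has the form
$$ \frac{d\mathbf{x}_p}{dt} = \mathbf{U}_p, \tag{281a} $$
$$ m_p \frac{d\mathbf{U}_p}{dt} = \mathbf{F}_1 + \mathbf{F}_2. \tag{281b} $$
Here, $m_p = \rho_p (\pi d_p^3/6)$ is the particle mass, $\mathbf{F}_1$ accounts for the so-called pressure-gradient and the buoyancy forces whereas $\mathbf{F}_2$ stands for the drag, added-mass and Basset forces
$$ \mathbf{F}_1 = \frac{\pi d_p^3}{6}\,\rho_f \frac{D\mathbf{U}_s}{Dt} + \frac{\pi d_p^3}{6}\,(\rho_p - \rho_f)\,\mathbf{g}, \tag{282a} $$
$$ \mathbf{F}_2 = 3\pi d_p \rho_f \nu_f (\mathbf{U}_s - \mathbf{U}_p) + \frac{\pi d_p^3}{12}\,\rho_f \left( \frac{D\mathbf{U}_s}{Dt} - \frac{d\mathbf{U}_p}{dt} \right) + \frac{3}{2}\, d_p^2\, \rho_f \sqrt{\pi \nu_f} \int_{-\infty}^{t} \frac{d}{d\tau}(\mathbf{U}_s - \mathbf{U}_p)\, \frac{d\tau}{\sqrt{t-\tau}}. \tag{282b} $$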
In these equations, $\mathbf{U}_s = \mathbf{U}(\mathbf{x}_p(t), t)$ is the fluid velocity seen, i.e. the fluid velocity sampled along the particle trajectory $\mathbf{x}_p(t)$, not to be confused with the fluid velocity $\mathbf{U}_f = \mathbf{U}(\mathbf{x}_f(t), t)$ denoted with the subscript f. The expressions above involve particle and fluid accelerations, respectively $d\mathbf{U}_p/dt$ and $D\mathbf{U}_s/Dt$. The precise form of the ‘fluid acceleration’ is still discussed. As will be seen below, that question is not important for heavy particles. It is however of significance for lighter particles, such as bubbles, and will be discussed later on in Section 7.5.6. The division of the forces exerted by the fluid on the particle into two terms has some physical meaning. In the course of the mathematical derivation, it is convenient to write the instantaneous velocity field surrounding the particle as the sum of an ‘undisturbed’ field (the velocity field that would exist if the particle was not present) and a ‘disturbance’ field which represents the influence of the particle. Such a division is possible since dropping the inertia term (thanks to the assumption of small relative Reynolds number) transforms the problem into a linear one. The ‘undisturbed’ field manifests itself through the pressure-gradient and buoyancy forces. On the other hand, the drag, added-mass and Basset forces arise from the ‘disturbance’ field (evidenced by the presence of the relative velocity in the mathematical expressions of the forces) and are expressed as a function of the ‘undisturbed’ velocity at the centre of the particle (which is the precise definition of the velocity of the fluid seen $\mathbf{U}_s$). It is worth noting that it is the mathematical treatment which first yields the above equations, which are then physically interpreted. The analytical results stated above rely upon the assumption that $Re_p \ll 1$. For higher $Re_p$, when exact calculation cannot be performed anymore, it is assumed that the different forces already isolated are still present, but with a generalized expression.
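For example, the drag force is generalized with the help of a drag coefficient $C_D$. The situation is more complicated for the added-mass and Basset forces. However, one can propose [74]
$$ m_p \frac{d\mathbf{U}_p}{dt} = \mathbf{F}_1 + \mathbf{F}_2, \tag{283} $$
where
$$ \mathbf{F}_2 = \frac{1}{2}\,\rho_f\, \frac{\pi d_p^2}{4}\, C_D\, |\mathbf{U}_s - \mathbf{U}_p| (\mathbf{U}_s - \mathbf{U}_p) + \frac{\pi d_p^3}{12}\,\rho_f \left( \frac{D\mathbf{U}_s}{Dt} - \frac{d\mathbf{U}_p}{dt} \right) + \frac{3}{2}\, d_p^2\, \rho_f \sqrt{\pi \nu_f} \int_{-\infty}^{t} \frac{d}{d\tau}(\mathbf{U}_s - \mathbf{U}_p)\, \frac{d\tau}{\sqrt{t-\tau}}. \tag{284} $$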
The added-mass and Basset forces are left unchanged while the drag force now has a quadratic form. The drag coefficient is an empirical coefficient that can be estimated through experiments. Various expressions have been put forward, cf. Clift et al. [75], among which an often retained form is
$$ C_D = \begin{cases} \dfrac{24}{Re_p} \left[ 1 + 0.15\, Re_p^{0.687} \right] & \text{if } Re_p \le 1000, \\ 0.44 & \text{if } Re_p > 1000. \end{cases} \tag{285} $$
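Eq. (285) translates directly into a small function. The sketch below is a plain transcription of that correlation; the function name and the sample Reynolds numbers are chosen here for illustration.

```python
def drag_coefficient(re_p):
    """Drag coefficient C_D as a function of the particle Reynolds number, Eq. (285)."""
    if re_p <= 0.0:
        raise ValueError("the particle Reynolds number must be positive")
    if re_p <= 1000.0:
        return 24.0 / re_p * (1.0 + 0.15 * re_p ** 0.687)
    return 0.44

for re_p in (0.01, 1.0, 100.0, 1000.0, 5000.0):
    stokes = 24.0 / re_p  # Stokes law, recovered when Re_p << 1
    print(f"Re_p={re_p:8.2f}  C_D={drag_coefficient(re_p):10.3f}  24/Re_p={stokes:10.3f}")
```

The two branches nearly match at $Re_p = 1000$, so the correlation is almost continuous there, and the first branch tends to the Stokes law $24/Re_p$ as $Re_p \to 0$.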
For heavy particles, where $\rho_p \gg \rho_f$, the drag and gravity forces are the dominant forces and the particle equation of motion is reduced to
$$ \frac{d\mathbf{U}_p}{dt} = \frac{1}{\tau_p} (\mathbf{U}_s - \mathbf{U}_p) + \mathbf{g}. \tag{286} $$
The drag force has been written in this form to bring out the particle relaxation time scale
$$ \tau_p = \frac{\rho_p}{\rho_f}\, \frac{4 d_p}{3 C_D |\mathbf{U}_r|}. \tag{287} $$
In Eq. (286), $\tau_p$ appears as the only scale, and is the time necessary for a particle to adjust to fluid velocities. In the limit when $Re_p \ll 1$, it is seen from the expression of $C_D$ in that range that
$$ \tau_p = \frac{\rho_p}{\rho_f}\, \frac{d_p^2}{18 \nu_f}, \tag{288} $$
which is the Stokes value. Even in that limit regime, the particle relaxation time scale is already a non-linear function (here quadratic) of particle properties (the particle diameter $d_p$). Outside the Stokes regime, the dependence of $\tau_p$ on particle properties and variables, such as $\mathbf{U}_s$ and $\mathbf{U}_p$, is more complicated. Actually, the fact that the added-mass and Basset forces can be neglected with respect to the drag force, when $\rho_p \gg \rho_f$, is not so obvious from the expressions of the different forces in Eq. (284) since the particle density is nowhere present on the right-hand side of the equation. This can be justified by going back to the analysis of the basic equations, see [73]. The Navier–Stokes equation for the ‘disturbance’ field, say $\mathbf{W}$, around the solid particle is, in the creeping regime,
$$ \frac{\partial W_i}{\partial t} = -\frac{1}{\rho_f} \frac{\partial p}{\partial x_i} + \nu_f \frac{\partial^2 W_i}{\partial x_j \partial x_j}. \tag{289} $$
The boundary conditions on the particle surface and at infinity are
$$ \mathbf{W} = -\mathbf{U}_f + \mathbf{U}_p \ \text{on the particle surface}, \qquad (\mathbf{W}, p) \to (0, 0) \ \text{when } r \to \infty. \tag{290} $$
The important step is dimensional analysis. We introduce the particle diameter $d_p$ as the relevant length scale, $W_r$ as a relevant scale for the relative velocity and $\tau_0$ as the relevant time scale. With these typical scales, and using a star to indicate non-dimensional quantities, the Navier–Stokes equation becomes
$$ (Re_p\, St)\, \frac{\partial W_i^*}{\partial t^*} = -\frac{\partial p^*}{\partial x_i^*} + \frac{\partial^2 W_i^*}{\partial x_j^* \partial x_j^*}, \qquad Re_p\, St = \frac{d_p^2}{\nu_f \tau_0}. \tag{291} $$
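To give an idea of the orders of magnitude involved, the sketch below evaluates the relaxation time of Eq. (287) with the drag law of Eq. (285), its Stokes limit, Eq. (288), and the product $Re_p\,St$ of Eq. (291) under the assumption that $\tau_0$ is identified with the Stokes relaxation time. The physical values (a 50 µm particle of density 2500 kg/m³ in air) and the function names are illustrative assumptions, not data from the present work.

```python
def drag_coefficient(re_p):
    """Drag coefficient of Eq. (285)."""
    if re_p <= 1000.0:
        return 24.0 / re_p * (1.0 + 0.15 * re_p ** 0.687)
    return 0.44

def relaxation_time(rho_p, rho_f, d_p, nu_f, u_rel):
    """Particle relaxation time, Eq. (287), for a relative velocity magnitude u_rel > 0."""
    re_p = d_p * u_rel / nu_f
    return rho_p / rho_f * 4.0 * d_p / (3.0 * drag_coefficient(re_p) * u_rel)

def stokes_relaxation_time(rho_p, rho_f, d_p, nu_f):
    """Stokes limit of the relaxation time, Eq. (288)."""
    return rho_p / rho_f * d_p ** 2 / (18.0 * nu_f)

# Hypothetical heavy particle in air (SI units).
rho_p, rho_f = 2500.0, 1.2
d_p, nu_f = 50.0e-6, 1.5e-5

tau_stokes = stokes_relaxation_time(rho_p, rho_f, d_p, nu_f)
tau_viscous = d_p ** 2 / nu_f          # viscous time scale d_p^2 / nu_f
rep_st = tau_viscous / tau_stokes      # Re_p St of Eq. (291) with tau_0 = tau_p (assumption)

print(f"Stokes relaxation time : {tau_stokes:.3e} s")
print(f"viscous time scale     : {tau_viscous:.3e} s")
print(f"Re_p St = 18 rho_f / rho_p = {rep_st:.3e}")
for u_rel in (0.01, 0.1, 1.0):
    tau = relaxation_time(rho_p, rho_f, d_p, nu_f, u_rel)
    print(f"|U_r| = {u_rel:4.2f} m/s  ->  tau_p = {tau:.3e} s")
```

With this choice of $\tau_0$, the product $Re_p\,St$ reduces to $18\rho_f/\rho_p$, which is small whenever the particle is much denser than the fluid.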
Obviously, the choice of the time scale is crucial to the problem. If the time scale $\tau_0$ is much larger than the viscous time scale, $\tau_0 \gg d_p^2/\nu_f$, the unsteady term (the time derivative of the disturbance field) can be neglected and the Navier–Stokes equation is reduced to the Stokes problem around a particle (stationary creeping flow). The force exerted on the particle is then limited to the drag force. On the other hand, if $\tau_0$ is of the same order of magnitude as the viscous time scale, $\tau_0 \sim d_p^2/\nu_f$, then the unsteady term must be retained and this leads to the unsteady forces, the added-mass and Basset forces. Consequently, everything relies on the evaluation of the time $\tau_0$. Now, we are interested in the particle motion and it seems reasonable to take $\tau_0$ as the characteristic time for particle velocity changes, that is as the particle relaxation time scale, $\tau_0 = \tau_p$. At this point of the analysis $\tau_p$ is still unknown, but can be worked out as follows. Let us first assume that when $\rho_p \gg \rho_f$, $\tau_p$ is very large with respect to $d_p^2/\nu_f$. Then, the added-mass and the Basset forces may be neglected as explained above. Consequently, the particle equation of motion is reduced to
$$ m_p \frac{d\mathbf{U}_p}{dt} = \frac{1}{2}\,\rho_f\, \frac{\pi d_p^2}{4}\, C_D\, |\mathbf{U}_s - \mathbf{U}_p| (\mathbf{U}_s - \mathbf{U}_p) + m_p \mathbf{g}, \tag{292} $$
or even
$$ m_p \frac{d\mathbf{U}_p}{dt} = 3\pi d_p \rho_f \nu_f (\mathbf{U}_s - \mathbf{U}_p) + m_p \mathbf{g}, \tag{293} $$
since we are in the small Reynolds number regime. From this equation, as already mentioned, we can isolate the particle relaxation time scale, Eq. (288). Since $\rho_p \gg \rho_f$, the relevant time scale $\tau_0$ that enters the dimensional analysis is indeed much larger than the viscous time scale and the assumption is confirmed. This result is only valid for heavy particles: when $\rho_p \sim \rho_f$, the reasoning fails to single out a particular time scale. The unsteady terms are then of the same order of magnitude as the stationary drag force and the only characteristic time is the viscous diffusion time $d_p^2/\nu_f$. In the following, we mainly limit ourselves to the case of heavy particles (the problem of bubbles or sediments is addressed in Section 7.5.6). The fundamental equations that describe heavy particle motion in turbulent flows are therefore Eqs. (281a), (285), (286) and (287).

When particle loading is not too small, other effects come into play. Particle–particle interactions or collisions have to be taken into account, and the exchange of momentum and of energy between the fluid and the particles can be expressed by additional source terms in the Navier–Stokes equations. The above form of the drag coefficient is essentially for a single particle embedded in a turbulent flow. When the particle loading increases and we move into the realm of dense (but still dispersed) two-phase flows, the drag coefficient for a particle may be modified by the presence and the influence of other particles. This reflects hydrodynamical forces between discrete particles, see for example [76]. For example, a particle in the wake of another particle may well experience a somewhat different drag force compared to the upstream one. This hydrodynamical effect remains difficult to take into account without explicitly simulating the exact behaviour of the fluid and of the $N$ interacting particles.

7.1.2. Various approaches for the dispersed phase

Supplemented with the field equations for the continuous phase, Eqs. (117), the discrete particle equations given above, Eqs. (281a), (285)–(287), form the set of equations needed to
J.-P. Minier, E. Peirano / Physics Reports 352 (2001) 1–214
describe the complete problem. Unfortunately, in practice, knowledge of the exact equations of motion (microscopic level) is not the end of the story. One can argue that an ‘exact approach’ (in the spirit of DNS) is possible: one can solve the local instantaneous Navier–Stokes equations in which a source term, representing the exchange of momentum with particles, is added, and the particle trajectories can then be computed since there are no unknowns left, e.g. [77]. However, in the case of a large number of particles and of turbulent flows at high Reynolds numbers, the number of degrees of freedom (the dimension of the state vector) is huge and one has to reduce this dimension or, in the language of statistical physics, to come up with a contracted description (probabilistic arguments have to be used).

As explained in the introduction of this section, we do not consider here the complete issue of a reduced description through a unique formalism that includes both the fluid and the particle phase (this will be done in Section 8). We first limit ourselves to a hybrid approach where the fluid phase is described only by classical macroscopic methods in single-phase turbulence, such as $k$–$\varepsilon$ or RSM models, see Section 6.3. This already implies that, for these practical reasons, some information is disregarded. For example, the hydrodynamical effects mentioned above cannot be explicitly calculated and, if present, would have to be introduced directly through modified constitutive relations for the drag coefficient $C_D$. In this simplified approach, the remaining issue is therefore to derive a reduced description of the particle phase.

In two-phase flow modelling, two different approaches can be followed for that purpose. In practice, one of the two possible ways to deal with the problem is the so-called Eulerian point of view.
In this approach, mean field equations are written for both phases, expressing the evolution in time and in space of statistical properties of the fluid and of the particles. These field equations are transport equations involving a limited number of mean properties, at best the first two moments of each phase [78,79]. The other possible approach is the Lagrangian, or particle-tracking, approach. The philosophy of this statistical approach is to consider the solutions of the exact equations as random processes, and therefore some statistical closures have to be introduced on the fluid velocity seen, $U_s$. This is achieved by looking at the process at a mesoscopic scale, that is by replacing the exact instantaneous equations of motion by modelled ones in the spirit of what was done for single-phase flow turbulence in the previous section (equations are written for a number of stochastic particles and therefore the particle-tracking approach is basically a Monte Carlo simulation of an underlying pdf [80]). Since the continuous phase is then described by classical macroscopic methods, the Lagrangian approach is in fact a mixed Eulerian/Lagrangian method or, from a numerical point of view, a grid/particle method.

The Eulerian and Lagrangian approaches are often compared directly. Yet, it should be pointed out that such a comparison is often misleading if one fails to take into account that the two methods do not belong to the same class of models. Actually, the terms Eulerian and Lagrangian do not help to clarify the issue. The central difference between the Eulerian and the Lagrangian approaches does not come from the variables chosen to write continuum field equations (the Lagrangian or Eulerian descriptions), but rather from the level of information contained in each approach. Most of what has been explained in Sections 6.2 and 6.3 can be applied here.
The Eulerian approach is in fact a macroscopic approach: at any point in space, the Eulerian approach gives at best the first two moments of the different variables. The Lagrangian approach (which should be called instantaneous Lagrangian or stochastic Lagrangian approach) is a mesoscopic model: it provides the actual pdf (probability density function) everywhere in
the domain, see Section 8.2.7. Consequently, comparing directly, for example, the computational costs of the two methods is pointless. One must judge the interest, the difficulty and the inherent costs of an approach with respect to the amount of information that is supplied.

7.2. Interest of the pdf approach

Even though the difference between the two approaches is accounted for, one question remains: what is the interest of going to a pdf description in two-phase flows? First of all, even if one is only interested in the first two moments of some variables, it may be useful to develop first a Lagrangian model, and then to derive the macroscopic equations satisfied by the moments which are of interest, such as the mean velocities and the kinetic energies. This follows strictly the steps described in Section 6.3 for single-phase flows. That methodology is particularly attractive since closing the moment equations directly at the macroscopic level through constitutive relations is difficult in two-phase flows [81]. Furthermore, the resulting macroscopic models are realizable by construction.

A second reason is related not to particular models, but simply to the level of detail of the reduced description. For many engineering problems, the macroscopic approach does not supply enough information to address certain questions. It is very often the case that one is interested in knowing conditional statistics, for example the mean temperature or the mean velocity of particles that enter the domain by one specific inlet section. Another common concern is to have indications on the particle residence time distribution within the flow. At a given point and at a given time, one would be interested in having more information than the sole volume fraction occupied by particles. For example, one would like to be able to separate the particles into subclasses depending upon their different residence times or their histories.
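As a minimal sketch of the kind of information discussed here, the Python snippet below extracts a conditional mean and a residence time distribution from a sample of stochastic particles. The sample is synthetic and purely illustrative: the array names (`inlet`, `residence`, `temperature`), the two-inlet configuration and the distributions used to fill the arrays are assumptions, not data from any calculation reported in this review.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical sample of stochastic particles found in one cell at one time.
inlet = rng.integers(0, 2, size=n)  # label (0 or 1) of the inlet each particle came through
# Synthetic residence times: particles from inlet 1 are assumed to stay longer.
residence = rng.exponential(scale=np.where(inlet == 0, 1.0, 3.0))
# Synthetic temperatures: particles from inlet 1 are assumed to be hotter.
temperature = 300.0 + 50.0 * inlet + rng.normal(0.0, 5.0, size=n)


def conditional_mean(values, mask):
    """Mean of `values` restricted to the particles selected by `mask`."""
    return values[mask].mean()


# A macroscopic (moment) description only carries the unconditional mean ...
mean_T_all = temperature.mean()
# ... whereas the particle sample also gives statistics conditioned on history.
mean_T_inlet0 = conditional_mean(temperature, inlet == 0)
mean_T_inlet1 = conditional_mean(temperature, inlet == 1)

# Residence time distribution estimated as a normalised histogram.
hist, edges = np.histogram(residence, bins=20, density=True)

print(mean_T_all, mean_T_inlet0, mean_T_inlet1)
```

The point of the sketch is only that conditioning on a particle's history is a one-line selection once individual particles are tracked, whereas the same quantity would require an extra transported field in a moment approach.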
In some nuclear applications, one is interested in knowing how many particles present at a certain point have, at least once, entered a given zone.

A third reason, and by far the most important one, is that there are situations where the particle-tracking approach resolves closure issues in a way which is very similar to the question of the reactive source terms, Section 6.5.3. Indeed, when particle diameters vary considerably from particle to particle or when we are confronted with a situation where particles have completely different histories (highly complicated but local laws), deriving partial differential equations for mean quantities is a thorny issue. The case of complicated source terms arises whenever we want to describe particle evaporation or combustion with complex expressions in terms of individual particle properties, and is identical to the issue of single-phase source terms in Section 6.5.3. When dealing with a distribution of particle diameters, one is faced with the problem of expressing, as a function of the mean velocities $\langle U_s \rangle$, $\langle U_p \rangle$ and the mean particle diameter $\langle d_p \rangle$, quantities such as

\[ \left\langle \frac{U_p}{\tau_p} \right\rangle, \qquad \left\langle \frac{U_s}{\tau_p} \right\rangle. \tag{294} \]

These are complicated functions, due to the complex dependence of $\tau_p$ on particle diameters $d_p$ and also on particle and fluid velocities, Eqs. (287) and (285). In these cases, the Lagrangian
approach is attractive since it treats these phenomena without approximation, while the derivation of closed moment equations is next to impossible unless very crude simplifications are introduced.

In practical calculations, the pdf approach is generally more demanding in terms of computational costs compared to the moment or Eulerian approach. As explained in the previous subsection, this is not surprising since far more information is resolved. It is not easy to come up with precise figures concerning computational overheads or memory requirements for a given calculation. Indications and illustrations of the approach will be provided through a discussion of a number of practical computations in Section 7.6, after the introduction of current pdf models. Yet, the point made in the present subsection is that, in spite of its current limitations (theoretical and computational), the pdf approach appears at the moment as the only candidate if one is to simulate flows involving the above-mentioned ‘complex physics’ terms.

7.3. Choice of the pdf description

7.3.1. The Lagrangian stochastic point of view

The pdf description of particle properties in turbulent flows will be developed from a Lagrangian point of view, using stochastic processes to simulate the variables which are retained in the state vector. In other words, we adopt a Lagrangian stochastic point of view. This mirrors the similar choice made in Section 6.5 for the probabilistic description of single-phase turbulent flows. For single-phase flows, since one is dealing with a continuous field, the Lagrangian description may be surprising, and the further notion of a Lagrangian stochastic point of view, which in fact collapses the two concepts of the trajectory point of view (see Section 2) and of a fluid–particle notion into one idea, has to be explained.
In dispersed turbulent two-phase flows, on the contrary, the Lagrangian point of view seems a natural choice since one is dealing with an ensemble of ‘real’ particles embedded in a flow. Consequently, this choice is not often discussed. Yet, this should not be passed over too quickly: the Lagrangian stochastic point of view adopted here is a more involved concept and the explanation given in Section 6.5 remains valid. We will be handling stochastic particles and we will be representing their behaviour in time, since the transitional pdf is the central notion, as will be detailed in Section 8. Finally, it is worth underlining the interest of the Lagrangian point of view repeatedly put forward in previous sections. The whole stochastic approach is contained in the replacement of the exact instantaneous particle equations, Eqs. (281), by modelled but still instantaneous ones.

7.3.2. Choice of the state vector

The present pdf description of particle properties is limited below to one-particle models, or $s = 1$ in the notation of Section 3. Higher-order pdfs (two-particle, or even $s$-particle pdfs) would require statistical information on the underlying fluid at a number of separate spatial locations for general flows, and this remains too difficult at the moment given current models for single-phase turbulence. The one-point particle pdf model imposes the first constraint on the number of degrees of freedom ($d = p$) and the remaining problem is to specify the dimension of the state vector, $p$. That choice is very important for modelling purposes and governs the chances of deriving a satisfactory model, as was explained in Sections 3 and 4. The discussion proposed there and the interplay between the choice of the one-particle state vector and the
resulting modelling task take on their full significance for our present need to develop pdf models for two-phase flows. A direct illustration is that, depending on the number of variables retained in the one-point pdf approach, several proposals have been made. The first one follows the work of Reeks [82,83], who has proposed, by analogy with kinetic theory, to retain only position and velocity to address the problem, $Z = (x_p, U_p)$ (kinetic equation). The fluid velocity seen is therefore an external variable and the major challenge is to express its statistical effect. For particles in turbulent flows, this closure is far from trivial since history terms are present and one has to use functional calculus to evaluate them. The proposed closure is developed in sample space, that is following the pdf point of view for $p(t; y_p, V_p)$
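\[ \frac{\partial p}{\partial t} + \frac{\partial}{\partial y_{p,i}} \left[ V_{p,i}\, p \right] = \frac{\partial}{\partial V_{p,i}} \left[ \frac{V_{p,i}}{\tau_p}\, p \right] - \frac{\partial}{\partial V_{p,i}} \left[ \frac{1}{\tau_p} \langle U_{s,i} \,|\, y_p, V_p \rangle\, p \right], \tag{295} \]

where the unknown flux due to the fluid seen is modelled by a gradient hypothesis involving two diffusion tensors $\lambda_{ij}$ and $\mu_{ij}$ (whose expressions can be found in [82,83])

\[ \langle U_{s,i} \,|\, y_p, V_p \rangle\, p = \langle U_{f,i} \rangle\, p - \lambda_{ij} \frac{\partial p}{\partial x_{p,j}} - \mu_{ij} \frac{\partial p}{\partial V_{p,j}}. \tag{296} \]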
The contracted term $\langle U_s \,|\, y_p, V_p \rangle$ is a short-hand notation for $\langle U_s \,|\, (x_p = y_p, U_p = V_p) \rangle$ and represents the following conditional expectation:

\[ \langle U_s \,|\, y_p, V_p \rangle\, p(t; y_p, V_p) = \int V_s\, p(t; y_p, V_p, V_s)\, dV_s, \tag{297} \]

which is the expected fluid velocity seen by a discrete particle, conditioned on the fact that the discrete particle is at a given position $y_p$ with a given velocity $V_p$.

This particular choice of a state vector seems to be made mainly by analogy with classical molecular dynamics, where it has its justification and where models can be written directly with these variables (the Boltzmann equation). However, in that case the driving mechanism is mainly due to kinetic effects and there is no underlying phenomenon. For turbulent flows, particles are carried around by a fluid and it is therefore natural to consider whether variables related to fluid properties should be included in the state vector. Indeed, if we consider the limit case of particles having small inertia, which means that particles nearly behave as fluid elements, we have seen in Section 6 that it is much better to replace fluid–particle accelerations by a model rather than fluid–particle velocities. Broadly speaking, fluid–particle accelerations are governed by small scales which have a better chance of showing some universal characteristics, whereas fluid–particle velocities are more likely to be problem or flow dependent. Therefore, building from the fluid case, it appears preferable to include fluid velocities in the state vector. Since the present pdf description is limited to particle properties, these fluid velocities should be the ones entering the particle momentum equation, Eq. (281). A second choice for the one-particle state vector is thus

\[ Z = (x_p, U_p, U_s). \tag{298} \]
That choice is in fact common to most Lagrangian models [84] and is also retained here.
Fig. 26. Fluid element and discrete particle motions. Fig. 27. Fluid element and particle paths.
7.3.3. Unclosed pdf equations

From the choice of the one-particle state vector, the exact pdf equation for $p(t; y_p, V_p, V_s)$ is obtained by applying the techniques of Section 3

\[ \frac{\partial p}{\partial t} + \frac{\partial}{\partial y_{p,i}} \left[ V_{p,i}\, p \right] + \frac{\partial}{\partial V_{p,i}} \left[ \frac{V_{s,i} - V_{p,i}}{\tau_p}\, p \right] + \frac{\partial}{\partial V_{s,i}} \left[ \langle \dot{U}_{s,i} \,|\, (y_p, V_p, V_s) \rangle\, p \right] = 0, \tag{299} \]

where $\dot{U}_s$ stands for the time rate of change of the fluid velocity seen along particle trajectories. To obtain a closed form, a model must be proposed for $\dot{U}_s$.

7.4. Present models

7.4.1. Modelling issues

We restrict ourselves to situations where particle dispersion is the only important issue (other issues such as turbulence modulation and particle collision are left out in this section for the sake of simplicity). The modelling problem is sketched in Fig. 26. The issue is to model not the successive velocities of a fluid element but rather the successive fluid velocities which are sampled or ‘seen’ by the discrete particles (either solid particles, droplets or even bubbles) as they move across the flow. This quantity is denoted by $U_s$ to emphasize that we are not necessarily dealing with the characteristics of the same fluid element (in Fig. 26, these velocities would be the velocity of the fluid particle at location $F_1$ at time $t_1$, then the velocity of the fluid particle located at $F_2$ at time $t_2$, and so on), and this variable is not a pure Lagrangian quantity in the general case.

The problem is more complicated than for pure diffusion models. Indeed, compared to a fluid particle, the determination of the fluid velocity seen is further compounded by particle inertia ($\tau_p$) and the effect of an external force field (gravity $g$ in our case). Both effects induce a separation of the fluid element and of the discrete particle which are located near the same point at the beginning of the time interval. This is represented in Fig. 27 between two discrete time steps $t_n$ and $t_{n+1}$.
In the absence of gravity (or other external force) and for small particle inertia, $\tau_p \to 0$, the separation effect disappears and, in that limit, the modelling issue is to represent the successive velocities of a fluid particle, for which the stochastic models developed in Section 6
Fig. 28. Typical fluid and discrete particle locations.
can be applied. For that reason, dispersion models (simulation of $(U_s)_t$) are extensions of diffusion models (simulation of $(U_f)_t$).

That sketch is often taken as a faithful description of the statistical picture. Most models are then built in two steps (see Fig. 27): a Lagrangian one (the fluid particle from $F(t)$ to $F(t + dt)$) and an Eulerian one (the two fluid particles from $F(t + dt)$ to $F'(t + dt)$). In that approach, there is no clear separation between the effects due to particle inertia and to mean drifts. Furthermore, by sticking to this sketch in the stochastic model, one is led to writing a dispersion model as a function of the instantaneous velocity difference between the discrete particle and one realization of the velocity of the surrounding fluid element, since we have $r = (U_p - U_f)\, dt$. This has been shown to lead to a spurious decorrelation between the successive values of the velocity of the fluid seen, $U_s(t)$ and $U_s(t')$, and consequently to an artificial decrease of the integral time scale of the velocity of the fluid seen, see [84]. This is not an obvious effect and it may go unnoticed unless carefully checked. Its origin may be traced back to the Eulerian step of the approach.

First of all, it can be illustrated in an extreme but illuminating situation. Let us imagine the case of very heavy particles in a turbulent flow in the absence of gravity and any other external forces, and when the fluid characteristic fluctuating velocity is much larger than the particle characteristic velocity. If we are dealing with a homogeneous flow without mean velocities, these characteristic velocities are typically given by the square roots of the kinetic energies, $\sqrt{\langle U_p^2 \rangle}$ and $\sqrt{\langle U_s^2 \rangle}$. Tchen's formulae, to be given in Section 7.5.5, will show that this corresponds to cases where the particle relaxation time scale is much larger than the fluid Lagrangian time scale, say $T_L \ll \tau_p$. The particle dispersion problem can then be represented as in Fig. 28.
Since the discrete particle kinetic energy is smaller than the fluid kinetic energy, at the next time step $t_{n+1} = t_n + dt$, the discrete particle $P$ that is considered is nearly at a standstill and has not moved very far from its previous location at time $t = t_n$. We can then expect that the discrete particle will see a new fluid velocity which is essentially correlated with the one that exists at its previous location at the new time $t_{n+1}$. In other words, the discrete particle will typically sample Eulerian fluid velocities which have a characteristic time scale, say $T_E$, which is of the same order as the Lagrangian one, $T_E \sim T_L$. This is represented in Fig. 28 by the direct arrow labelled [1]. Even with a time step $dt$ that is small with respect to the discrete particle inertia $\tau_p$ but can be comparable to $T_L$, the fluid particle $F$ can be located at the next
time step relatively far from its previous location. In the approach usually followed to derive the velocity of the fluid seen, one first simulates the Lagrangian step (indicated by the arrow labelled [2a: step L] in Fig. 28), which implies a correlation factor typically expressed by $\exp(-dt/T_L)$ (the exact form of the coefficient will be explained by the use of Langevin models, see Section 7.4.2, and the resulting autocorrelations, see Section 7.5.3). The Eulerian step (indicated by the arrow labelled [2b: step E] in Fig. 28) will typically involve a similar factor, $\exp(-r/L_E)$ or, since $r \sim \sqrt{\langle U_s^2 \rangle}\, dt$, also of the order of $\exp(-dt/T_L)$ for the situation sketched in Fig. 28. Since the two steps are considered independently in this approach, the resulting total correlation factor between the two successive velocities of the fluid seen by the discrete particles (between $F$ at time $t_n$ and $F'$ at time $t_{n+1}$) will be about $\exp(-2\, dt/T_L)$, and we end up with a characteristic time scale of the velocity of the fluid seen of around $T_L/2$ instead of $T_E$.

The key point in this construction, and the source of the resulting shortcoming, is that the second step does not exactly correspond to an Eulerian step. Actually, an Eulerian step consists in simulating the correlated velocities of two fluid elements located, respectively, at a point $x$ and at a point $x + r$, regardless of their previous history and their respective trajectories. The factor $\exp(-r/L_E)$ is only meant to represent (it is just a rough estimate) the typical correlation for the complete class of pairs of fluid particles with one member located at $x$ and the other member of the pair located at $x + r$. However, if we go back to the illustrative Fig. 27, it is seen that we need to correlate the velocities of two fluid particles, but conditioned on the fact that one of them was located at the discrete particle position at the previous time step.
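The halving of the time scale described above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function name `integral_time_scale` and the numerical values of `T_L` and `dt` are assumptions, and the autocorrelation is assumed to be a pure exponential so that a per-step factor can be converted back into a time scale.

```python
import math


def integral_time_scale(rho_per_step, dt):
    """Time scale T such that exp(-dt / T) equals the per-step correlation factor."""
    return -dt / math.log(rho_per_step)


T_L = 1.0   # fluid Lagrangian time scale (arbitrary units, assumed value)
dt = 0.01   # time step (assumed value)

rho_lagrangian = math.exp(-dt / T_L)  # factor attached to step [2a: step L]
rho_eulerian = math.exp(-dt / T_L)    # factor of the same order attached to step [2b: step E]

# Treating the two steps as independent multiplies the two factors.
rho_total = rho_lagrangian * rho_eulerian

T_effective = integral_time_scale(rho_total, dt)
print(T_effective / T_L)  # ratio of the resulting time scale to T_L
```

With equal factors for both steps, the product is $\exp(-2\,dt/T_L)$ whatever the value of `dt`, which is why the resulting time scale is close to $T_L/2$ rather than $T_E \sim T_L$.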
Thus, we are not actually dealing with an Eulerian correlation problem but with a conditional Eulerian problem. We need to correlate pairs of fluid particles that form only a subclass of all the possible pairs of fluid particles located at $x$ and at $x + r$. That problem remains an open issue at the moment.

Based on that analysis, and in order to work out a tractable model that nevertheless avoids the previous shortcoming, a somewhat different approach has been proposed [84]. In that approach, the construction of the velocity of the fluid seen by the discrete particles is still obtained in two steps, but these steps have a different purpose and do not have the same meaning:

(i) the first one accounts only for the effect of particle inertia in the absence of any external force such as gravity. The model for the successive velocities $U_s(t)$ is based on the chosen model for fluid particle velocities but with a modified time scale. This modified time scale, say $T_L^*$, is a function of particle inertia, as manifested in the particle relaxation time scale $\tau_p$, and varies between the fluid Lagrangian time scale $T_L$ and the Eulerian time scale $T_E$. Indeed, discrete particles with negligible inertia, $\tau_p \to 0$, tend to follow their surrounding fluid elements and therefore ‘see’ velocities which are correlated over a time interval of the order of $T_L$. On the contrary, discrete particles with very high inertia, $\tau_p \gg 1$, are nearly at a standstill with respect to the motion of fluid elements and, as explained above (see Fig. 28), will tend to ‘see’ fluid velocities at about the same point, whose integral time scale is $T_E$. In between, the time scale $T_L^*$ of the fluid velocities seen by the discrete particles can be regarded as changing continuously between these two asymptotic limits as a function of $\tau_p$.

(ii) the second step accounts for the effect of external forces.
These external forces induce a mean drift between the discrete particles and the surrounding fluid, and therefore a separation of the average trajectories of the discrete particles and of the fluid elements. This results
Fig. 29. Mean fluid and particle paths.
in a decorrelation of the velocity of the fluid seen with respect to the velocity of fluid particles, and is called the crossing-trajectory effect (CTE).

In that analysis, the crossing-trajectory effect should not be confused with the separation effect mentioned above. The separation of the trajectories of fluid and discrete particles is always present as soon as particle inertia is not negligible or a mean drift exists, whereas the crossing-trajectory effect is only present when there is a mean relative velocity between the fluid and the discrete particles. In the following, we will assume that the Eulerian time scale is of the same order as the Lagrangian one, $T_E \simeq T_L$, and that in the absence of gravity or similar external forces, the modified time scale of the velocity of the fluid seen can be taken as the fluid Lagrangian one. In other words, we will neglect the first effect described above, due to particle inertia, and write $T_L^*(g = 0) \simeq T_L$. That choice is mainly made for the sake of simplicity and in order to concentrate on the important modelling of the crossing-trajectory effect. Detailed proposals for the effect of particle inertia on $T_L^*$ can be found in Pozorski and Minier [84]. The key point in the above analysis of the crossing-trajectory effect is its origin in a mean drift and not in an instantaneous drift. The complete description of the dispersion modelling issue is represented in Fig. 29. That figure does not claim to be a representation of the instantaneous picture, but is simply a sketch drawn to illustrate the modelling steps that will be taken below.

At the end of this discussion on modelling issues, a few general comments can be made. It must be emphasized that the derivation of a satisfactory model (that is, one respecting a number of well-established constraints) for particle dispersion remains an open and difficult issue that still calls for new ideas and approaches.
First of all, there are no such well-established theoretical constraints against which the validity of proposed models can be checked. Secondly, the extension from the models retained for the velocity of fluid particles to models for particle dispersion, which, with our choice of the one-particle state vector, is equivalent to a model for the velocity of the fluid seen $U_s$, involves a number of supplementary choices. At the moment, there is no theoretically satisfactory route for the derivation of such particle dispersion models starting from fluid diffusion results. As an example of this situation, one possible route to the derivation of Lagrangian fluid diffusion models is to propose a stochastic model and to use what are then considered as given second-order mean equations. In particular, a certain expression of the pressure rate of strain
correlation $H_{ij}$ can be chosen and the form of the matrix $G_{ij}$ of the stochastic model can be guessed to be consistent with that choice, see [10] and Section 6.7. That approach cannot be followed in the two-phase flow case since the mean equations are not known or given. One seeks precisely to derive them from the particle stochastic equations and, furthermore, in the general case of polydispersed flows, obtaining a closed form of the mean equations is extremely difficult, see Section 8. At the moment, a number of results have been gathered from experiments and numerical simulations, and these results are used to propose models for particle dispersion, in a more ad hoc manner. To describe the modelling situation in other words: even if the Langevin equation for fluid particle velocities were taken as exact, the simulation of the velocity of the fluid seen by discrete particles in turbulent flows, $U_s$, would still require a model. Work remains to be done to improve present state-of-the-art models, an example of which is detailed below.

7.4.2. Langevin models

In the present subsection, the trajectory point of view is followed and a stochastic model is written to describe the time increments of the velocity of the fluid seen along the discrete particle trajectories. An attractive approach is to use a Langevin equation for $U_s$, as an extension of the Langevin models developed for fluid particle velocities in Section 6.6. In Section 5.3, Langevin models or, in mathematical terms, stochastic diffusion processes, were justified by the application of Kolmogorov hypotheses to fluid Lagrangian quantities, which show that, in high Reynolds number flows, the velocity of a fluid particle is well approximated by a random walk in velocity sample space. In the two-phase flow case, a similar application of Kolmogorov hypotheses also supplies some support for a Langevin model for the velocity of the fluid seen.
The crucial step in that respect is that, in our present approach, the difference between fluid particle models and models for the velocity of the fluid seen is due to crossing-trajectory effects, which are determined by the existence of a mean drift and not an instantaneous drift. Indeed, if we write the time increment of the velocity of the fluid seen using the assumptions described in the previous subsection and represented in Fig. 29, we have for a time increment $dt$

\[ dU_s = v(dt, \langle U_r \rangle\, dt), \tag{300} \]
where $\langle U_r \rangle = \langle U_p \rangle - \langle U_s \rangle$ is the mean relative velocity between the discrete particle and the surrounding fluid element. In that equation, $v$ represents the fluid velocity field relative to the motion of one fluid particle, as in the general Kolmogorov theory, see Section 5.3, and in our case the chosen fluid particle is the fluid particle that was located at the particle location at the beginning of the time step. With our present description of the crossing-trajectory effect, it is thus seen that the statistics of the time increment of $U_s$ depend only on $dt$ and on some statistics (here the mean relative velocity), as well as on the key variables of Kolmogorov theory, $\langle \varepsilon \rangle$ and $\nu$, but not explicitly on the actual fluid and particle velocities. The Kolmogorov theory can then be applied as in Section 5.3 to show, first, that the statistics of $dU_s$ do not depend on the values of $U_s(t)$ and, second, that in high Reynolds number flows and for a time increment $dt$ that belongs to the inertial range, we have

\[ \langle dU_{s,i}\, dU_{s,j} \rangle = D_{ij}(dt), \tag{301} \]
where the matrix $D_{ij}$ is determined by the two scalar functions $D_\parallel$ and $D_\perp$ through

\[ D_{ij} = D_\perp\, \delta_{ij} + \left[ D_\parallel - D_\perp \right] r_i r_j. \tag{302} \]
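To make the structure of Eq. (302) concrete, the short Python sketch below assembles the matrix from two given scalars and a unit vector. It is an illustration only: the function name `structure_tensor` and the numerical values are assumptions, and the unit vector is taken here along a prescribed mean relative velocity.

```python
import numpy as np


def structure_tensor(d_par, d_perp, mean_rel_velocity):
    """Assemble D_ij = D_perp * delta_ij + (D_par - D_perp) * r_i * r_j.

    `mean_rel_velocity` only sets the direction r; its norm is divided out.
    """
    u = np.asarray(mean_rel_velocity, dtype=float)
    norm = np.linalg.norm(u)
    if norm == 0.0:
        raise ValueError("the direction r is undefined for a zero relative velocity")
    r = u / norm
    return d_perp * np.eye(3) + (d_par - d_perp) * np.outer(r, r)


# Illustrative values: drift along the third axis, D_par = 2, D_perp = 1.
D = structure_tensor(2.0, 1.0, [0.0, 0.0, 3.0])
print(D)
```

By construction, the matrix has the value $D_\parallel$ along $r$ and $D_\perp$ in the two directions transverse to $r$, and it reduces to an isotropic matrix when the two scalars coincide.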
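The situation is analogous to the case of statistics of Eulerian velocity differences, see [25], with the separation vector $r$ being in the direction of the mean relative velocity, $r = \langle U_r \rangle / |\langle U_r \rangle|$. The functions $D_\parallel$ and $D_\perp$ correspond to the values of the correlation for the velocity components aligned with the separation vector $r$ or transverse to $r$. Dimensional analysis yields that, in the inertial range, we have

\[ D_\parallel(dt) = \langle \varepsilon \rangle\, dt\; \beta_\parallel\!\left( \frac{|\langle U_r \rangle|^2}{\langle \varepsilon \rangle\, dt} \right), \qquad D_\perp(dt) = \langle \varepsilon \rangle\, dt\; \beta_\perp\!\left( \frac{|\langle U_r \rangle|^2}{\langle \varepsilon \rangle\, dt} \right), \tag{303} \]

where $\beta_\parallel$ and $\beta_\perp$ are regarded as two universal functions for which there is no exact prediction. The form of these functions can be obtained in two limit cases. When the mean relative velocity is small, small meaning here that $|\langle U_r \rangle|$ is small with regard to $(\langle \varepsilon \rangle\, dt)^{1/2}$ for a given time interval $dt$, then we expect the statistics of the velocity of the fluid seen to be close to the fluid ones, and thus

\[ \frac{|\langle U_r \rangle|^2}{\langle \varepsilon \rangle\, dt} \ll 1 \;\Rightarrow\; \beta_\parallel \simeq \beta_\perp \simeq C_0. \tag{304} \]

On the other hand, when the mean relative velocity is large (larger than $(\langle \varepsilon \rangle\, dt)^{1/2}$), we can resort to the frozen turbulence hypothesis. In that case, we obtain that

\[ D_\parallel(dt) \simeq C \left( \langle \varepsilon \rangle\, |\langle U_r \rangle|\, dt \right)^{2/3}, \qquad D_\perp(dt) \simeq \tfrac{4}{3}\, C \left( \langle \varepsilon \rangle\, |\langle U_r \rangle|\, dt \right)^{2/3}, \tag{305} \]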
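which shows that, in that limit, the two functions $\beta_\parallel(x)$ and $\beta_\perp(x)$ vary as $x^{1/3}$, since $\langle \varepsilon \rangle\, dt\; x^{1/3} = (\langle \varepsilon \rangle\, |\langle U_r \rangle|\, dt)^{2/3}$ for $x = |\langle U_r \rangle|^2 / (\langle \varepsilon \rangle\, dt)$. Due to this variation of the functions $\beta$, and in particular to the explicit presence of the time step $dt$ in the argument of $\beta$, the situation is more complex than in the fluid case. In particular, the non-constant value does not directly support a Langevin model as in the fluid case, see Sections 5.3 and 6.6. Nevertheless, a useful approximation can be proposed. Indeed, if we freeze the values of the functions $\beta_\parallel$ and $\beta_\perp$ for a certain value of the time interval, say $\Delta t_r$, and write

\[ D_\parallel(dt) \simeq \langle \varepsilon \rangle\, dt\; \beta_\parallel\!\left( \frac{|\langle U_r \rangle|^2}{\langle \varepsilon \rangle\, \Delta t_r} \right), \qquad D_\perp(dt) \simeq \langle \varepsilon \rangle\, dt\; \beta_\perp\!\left( \frac{|\langle U_r \rangle|^2}{\langle \varepsilon \rangle\, \Delta t_r} \right), \tag{306} \]

we now have a linear variation of $D_\parallel(dt)$ and $D_\perp(dt)$ with respect to the time interval $dt$. A reasonable choice for the reference time lag may be the Lagrangian time scale, which is the time scale over which fluid velocities are correlated. Since $\langle \varepsilon \rangle\, T_L \simeq k$, we have

\[ D_\parallel(dt) \simeq \langle \varepsilon \rangle\, dt\; \beta_\parallel\!\left( \frac{|\langle U_r \rangle|^2}{k} \right), \qquad D_\perp(dt) \simeq \langle \varepsilon \rangle\, dt\; \beta_\perp\!\left( \frac{|\langle U_r \rangle|^2}{k} \right). \tag{307} \]

As for the fluid case, this result suggests using a Langevin equation model, which consists in simulating $U_s$ as a diffusion process. From the discussion above, it is clear that, as such, a Langevin model is only an approximate model having less support than in the fluid case. The Langevin model does not yield the correct spectrum (in the limit of large relative velocity or frozen turbulence). However, it must be remembered that the objective is more limited: the purpose of a Langevin model is to remain a somewhat simple and still tractable model while still respecting the integral scales (here the integral time scale of the velocity of the fluid seen,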
discussed at length in Section 7.4.1). Indeed, for macroscopic behaviour such as the diffusion coefficients, the important properties are the integral time scales rather than the precise form of the spectrum, see Section 4. Thus, Langevin models are simply 'reasonable compromises' between simplicity and physical accuracy at the moment. It is also clear that much work remains to be done to improve stochastic models. The general form of the Langevin model chosen for the velocity of the fluid seen consists in writing
$$dU_{s,i} = A_{s,i}(t,\mathbf{Z})\,dt + B_{s,ij}(t,\mathbf{Z})\,dW_j, \tag{308}$$
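where the drift vector $A_{s,i}$ and the diffusion matrix $B_{s,ij}$ have to be modelled. The complete Langevin equation model can therefore be written as
$$dx_{p,i} = U_{p,i}\,dt, \tag{309a}$$
$$dU_{p,i} = A_{p,i}\,dt, \tag{309b}$$
$$dU_{s,i} = A_{s,i}(t,\mathbf{Z})\,dt + B_{s,ij}(t,\mathbf{Z})\,dW_j, \tag{309c}$$
where the particle acceleration is $A_{p,i} = (U_{s,i} - U_{p,i})/\tau_p + g_i$. This formulation is equivalent to a Fokker–Planck equation given in closed form for the corresponding pdf $p(t; \mathbf{y}_p, \mathbf{V}_p, \mathbf{V}_s)$, which is, in sample space,
$$\frac{\partial p}{\partial t} + \frac{\partial}{\partial y_{p,i}}\left[V_{p,i}\,p\right] + \frac{\partial}{\partial V_{p,i}}\left[A_{p,i}\,p\right] + \frac{\partial}{\partial V_{s,i}}\left[A_{s,i}\,p\right] = \frac{1}{2}\,\frac{\partial^2}{\partial V_{s,i}\,\partial V_{s,j}}\left[(B_s B_s^T)_{ij}\,p\right]. \tag{310}$$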
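The set of equations (309a)–(309c) can be advanced in time with a simple explicit scheme. The following is only a minimal sketch, assuming an explicit Euler–Maruyama discretisation (which is not prescribed by the text) and a time step small compared with $\tau_p$ and with the time scale of the fluid seen. The function name and the two closures `drift_s` and `diffusion_s` are hypothetical placeholders for whatever closure of $A_{s,i}$ and $B_{s,ij}$ is retained; the numerical values in the example are arbitrary.

```python
import numpy as np

def euler_maruyama_step(x_p, U_p, U_s, dW, dt, tau_p, g, drift_s, diffusion_s):
    """One explicit Euler-Maruyama step of Eqs. (309a)-(309c).

    drift_s and diffusion_s are user-supplied closures (hypothetical names)
    returning A_s (shape (3,)) and B_s (shape (3, 3)) at the current state.
    dW holds the three Wiener increments, each of variance dt.
    """
    x_p, U_p, U_s = (np.asarray(v, dtype=float) for v in (x_p, U_p, U_s))
    dW = np.asarray(dW, dtype=float)
    # Particle acceleration A_p = (U_s - U_p) / tau_p + g
    A_p = (U_s - U_p) / tau_p + np.asarray(g, dtype=float)
    A_s = np.asarray(drift_s(x_p, U_p, U_s), dtype=float)
    B_s = np.asarray(diffusion_s(x_p, U_p, U_s), dtype=float)
    x_new = x_p + U_p * dt                    # Eq. (309a)
    U_p_new = U_p + A_p * dt                  # Eq. (309b)
    U_s_new = U_s + A_s * dt + B_s @ dW       # Eq. (309c)
    return x_new, U_p_new, U_s_new

# Illustrative run: linear return-to-zero drift and constant isotropic
# diffusion (assumed closures, arbitrary values).
rng = np.random.default_rng(0)
dt, tau_p, T_star, B = 1e-3, 0.05, 0.1, 1.0
state = (np.zeros(3), np.zeros(3), np.zeros(3))
for _ in range(1000):
    dW = rng.normal(0.0, np.sqrt(dt), size=3)
    state = euler_maruyama_step(*state, dW, dt, tau_p, np.zeros(3),
                                lambda x, up, us: -us / T_star,
                                lambda x, up, us: B * np.eye(3))
```

Passing the Wiener increments explicitly keeps the step function deterministic, which makes it easy to check against hand calculations.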
Closure relations for the drift vector and the diffusion matrix are detailed below. These relations have been proposed in a previous paper, Minier [85], but are developed here since they are recent and different from usual proposals and since they illustrate the modelling issues described above. A complete model is built by closing successively the drift vector and the diffusion matrix, and the two closure relations are considered separately.
Closure of the drift term. In the fluid case, the drift term entering the stochastic differential equation for a fluid particle is (considering only the simplest proposal where $G^a_{ij} = 0$)
$$-\frac{1}{\rho_f}\frac{\partial\langle P\rangle}{\partial x_i} - \frac{U_{f,i} - \langle U_{f,i}\rangle}{T_L}, \tag{311}$$
showing that the drift term is the sum of a mean term and a fluctuating term. In the two-phase flow case, these two terms need to be modified to account for the crossing-trajectory effect. We retain the idea of a decomposition into a mean and a fluctuating term. Based on the modelling ideas developed above and on the sketch in Fig. 29, the mean term can be obtained from the fluid case by performing a first-order Taylor expansion for small $dt$ and thus small mean relative displacement $\langle\mathbf{U}_r\rangle\,dt$,
$$-\frac{1}{\rho_f}\frac{\partial\langle P\rangle}{\partial x_i} + \left(\langle U_{p,j}\rangle - \langle U_{s,j}\rangle\right)\frac{\partial\langle U_{f,i}\rangle}{\partial x_j}. \tag{312}$$
Other forms have also been suggested [81], but the present one, although postulated, is consistent with the general description of the crossing-trajectory effect. Then, the fluctuating term which
is added to the mean one is still written as a return-to-equilibrium term but with a modified time scale $T_L^*$. The proposed drift term therefore has the following expression:
$$-\frac{1}{\rho_f}\frac{\partial\langle P\rangle}{\partial x_i} + \left(\langle U_{p,j}\rangle - \langle U_{s,j}\rangle\right)\frac{\partial\langle U_{f,i}\rangle}{\partial x_j} - \frac{U_{s,i} - \langle U_{s,i}\rangle}{T_L^*}. \tag{313}$$
The linear form of the drift term can be justified. For example, in homogeneous turbulence where the different coefficients and time scales are constant, a linear drift term implies that the resulting one-point pdf of $\mathbf{U}_s$ is Gaussian, as obtained in numerical simulations, Deutsch [31]. Then, from the results of Section 4.3, the autocorrelation of the velocity of the fluid seen is found to be an exponential, $R_L^*(\tau) = \exp(-\tau/T_L^*)$, showing that $T_L^*$ is indeed the integral time scale of the velocity of the fluid seen. The exponential form of the velocity autocorrelation has also received support from numerical simulations [31], apart from the vicinity of the origin; this behaviour is actually quite understandable, as in the fluid particle case, see the explanation of Section 6.8. According to Csanady's analysis, the integral time scale of the velocity of the fluid seen differs from the fluid Lagrangian time scale $T_L$, due to crossing-trajectory effects when a mean drift between particles and the fluid is present. Assuming for the sake of simplicity that the mean drift is aligned with the first coordinate axis, the modelled expressions for the time scales are, in the longitudinal direction,
$$T_{L,1}^* = \frac{T_L}{\sqrt{1 + \beta^2\,\dfrac{|\langle\mathbf{U}_r\rangle|^2}{2k/3}}}, \tag{314}$$
and in the transversal directions (axes labelled 2 and 3),
$$T_{L,2}^* = T_{L,3}^* = \frac{T_L}{\sqrt{1 + 4\beta^2\,\dfrac{|\langle\mathbf{U}_r\rangle|^2}{2k/3}}}, \tag{315}$$
where $\beta$ is the ratio of the Lagrangian and the Eulerian time scales of the fluid, $\beta = T_L/T_E$. The distinction between the longitudinal and the transversal directions (with respect to the mean drift) implies that, even for the simplest model, the drift term is not isotropic and involves different time scales for different directions, contrary to the fluid case. The expression of the time scale $T_L^*$ is in line with the discussion above on the justification of a Langevin model. Indeed, from the expressions retained for the approximate second-order velocity structure functions, Eqs. (307), and using a simple fluctuation–dissipation argument, we have
$$\frac{k}{T_L^*} \simeq \langle\epsilon\rangle\,\psi\!\left(\frac{|\langle\mathbf{U}_r\rangle|^2}{k}\right), \tag{316}$$
where the indexes $\parallel$ and $\perp$ have been skipped for the sake of simplicity. The precise form of Csanady's formulas corresponds to choosing the functions $\psi_\parallel$ and $\psi_\perp$ as $\psi(x) \simeq C\sqrt{1 + x}$.
Closure of the diffusion term. The complete Langevin equation model is obtained by closing the diffusion term in Eqs. (309). An expression for $B_{s,ij}$ is worked out in several steps, by considering first a simple case and then by gradually working our way up to general situations.
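The modelled time scales of Eqs. (314) and (315) are straightforward to evaluate. The short sketch below simply transcribes these two formulas; the function and argument names are illustrative choices, and the mean drift is assumed to be aligned with the first axis as in the text.

```python
import math

def csanady_time_scales(T_L, beta, U_r_mag, k):
    """Return (T_par, T_perp), the longitudinal and transversal time scales
    of the fluid seen, Eqs. (314)-(315).

    T_L     : fluid Lagrangian time scale
    beta    : ratio T_L / T_E
    U_r_mag : magnitude of the mean relative velocity
    k       : fluid turbulent kinetic energy
    """
    x = U_r_mag**2 / (2.0 * k / 3.0)
    T_par = T_L / math.sqrt(1.0 + beta**2 * x)
    T_perp = T_L / math.sqrt(1.0 + 4.0 * beta**2 * x)
    return T_par, T_perp

# Without mean drift both time scales reduce to the fluid one.
T_par0, T_perp0 = csanady_time_scales(T_L=1.0, beta=1.0, U_r_mag=0.0, k=1.5)
```

The corresponding correction factors follow as $b_i = T_L/T_{L,i}^*$, so that they are equal to one for fluid particles and larger than one as soon as a mean drift is present.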
Case a: we first consider the simplest case of stationary isotropic turbulence, when mean velocities can be taken as zero. The set of SDEs that constitute the model is written in a 1D form for the sake of simplicity (for a given coordinate, leaving out the index):
$$dx_p = U_p\,dt, \tag{317a}$$
$$dU_p = \frac{1}{\tau_p}\,(U_s - U_p)\,dt, \tag{317b}$$
$$dU_s = -\frac{U_s}{T_L^*}\,dt + B_s\,dW, \tag{317c}$$
where, by direct application of stochastic calculus, the stationarity constraint implies that $B_s$ and $T_L^*$ are related by $B_s^2 = 2\langle U_s^2\rangle/T_L^*$. The time scale of the fluid seen is expressed as a fraction of the fluid one as $T_L^* = T_L/b$, where $b$ is the correction factor from Csanady's formulae, Eqs. (314)–(315), in the corresponding direction. Using the isotropic assumption and the expression of $T_L$ in the stationary case, $T_L = 4k/(3C_0^{st}\langle\epsilon\rangle)$, where $C_0^{st}$ represents the value of the proportionality constant in the stationary case [11], whose relation with the previously used constant $C_0$ is given below (this expression of $T_L$ is easily obtained by basic stochastic calculus using Eq. (236b) for example), one gets
$$B_s^2 = \frac{4}{3}\,k\,b\,\frac{1}{T_L} = C_0^{st}\,b\,\langle\epsilon\rangle, \tag{318}$$
provided that there is no statistical bias between the fluid turbulent kinetic energy and the turbulent kinetic energy of the fluid seen, i.e. $\langle U_f^2\rangle = \langle U_s^2\rangle$. This closure of the diffusion coefficient ensures that correct dispersion coefficients are obtained, that is $D_p = D_f/b$, as it should be in Csanady's analysis (this result is shown in Section 7.5.4). Thus, in isotropic and stationary turbulence, a first proposal for the fluid velocity seen is (putting back the index $i$ of the coordinate)
$$dU_{s,i} = -\frac{U_{s,i}}{T_{L,i}^*}\,dt + \sqrt{C_0^{st}\,\langle\epsilon\rangle\,b_i}\;dW_i, \qquad b_i = \frac{T_L}{T_{L,i}^*}. \tag{319}$$
Case b: we now consider the case of a stationary fluid velocity seen, however not necessarily isotropic (for example, homogeneous turbulence). In that case, the contributions coming from the drift and diffusion terms must still balance in the equation for the time evolution of the kinetic energy of the fluid seen, or in other words $d\langle u_s^2\rangle = 0$. The above closure for $B_{s,ij}$ is now not satisfactory since, in the two-phase flow case, the drift vector is not isotropic. To resolve that difficulty, a new kinetic energy is introduced:
$$\tilde{k} = \frac{3}{2}\,\frac{\sum_{i=1}^{3} b_i\,\langle u_{f,i}^2\rangle}{\sum_{i=1}^{3} b_i}. \tag{320}$$
This represents the normal energies weighted by the corresponding Csanady factors, $b_i$.
Since these factors vary from direction to direction, the weighted kinetic energy $\tilde{k}$ differs from the plain one $k$. However, if all the factors become identical, i.e. $b_i = b$, then $\tilde{k} = k$. This is of course the case when fluid particles are considered, or when no crossing-trajectory effects come into play, since $b_i = 1$. Another case where there is no difference between $\tilde{k}$ and $k$ is isotropic turbulence:
$$\langle u_{f,i}^2\rangle = \frac{2}{3}\,k \;\Rightarrow\; \tilde{k} = \frac{2}{3}\,k \times \frac{3}{2}\,\frac{\sum_{i=1}^{3} b_i}{\sum_{i=1}^{3} b_i} = k. \tag{321}$$
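As a quick illustration of Eqs. (320) and (321), the weighted kinetic energy can be computed from the three normal stresses and the three Csanady factors. This is only a sketch with illustrative names and arbitrary numerical values.

```python
def weighted_kinetic_energy(b, u2):
    """Weighted kinetic energy of Eq. (320).

    b  : the three Csanady factors b_i
    u2 : the three fluid normal stresses <u_{f,i}^2>
    """
    weighted_sum = sum(bi * ui2 for bi, ui2 in zip(b, u2))
    return 1.5 * weighted_sum / sum(b)

# Isotropic turbulence: <u_i^2> = 2k/3 in every direction, so k_tilde = k
# whatever the factors b_i (Eq. (321)).
k = 1.5
k_tilde_iso = weighted_kinetic_energy([1.6, 2.4, 2.4], [2.0 * k / 3.0] * 3)

# Anisotropic normal stresses with direction-dependent factors: k_tilde
# differs from the plain kinetic energy 0.5 * (1 + 2 + 3) = 3.
k_tilde_aniso = weighted_kinetic_energy([2.0, 1.0, 1.0], [1.0, 2.0, 3.0])
```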
Using this weighted fluid kinetic energy (and supposing, as done in Case a, that $\langle u_s^2\rangle = \langle u_f^2\rangle$), the diffusion coefficient is modified and the Langevin equation model is written with the proper expression of the fluid Lagrangian time scale ($T_L = 4k/(3C_0^{st}\langle\epsilon\rangle)$ with $T_{L,i}^* = T_L/b_i$, see Case a),
$$dU_{s,i} = -\frac{3C_0^{st}\langle\epsilon\rangle}{4k}\,b_i\,(U_{s,i} - \langle U_{s,i}\rangle)\,dt + \sqrt{C_0^{st}\,\langle\epsilon\rangle\,b_i\,\frac{\tilde{k}}{k}}\;dW_i. \tag{322}$$
This proposal is called formulation 1. Indeed, in the above model, the drift term was left unchanged while the diffusion coefficient was modified (through the introduction of $\tilde{k}/k$) to obtain the correct macroscopic behaviour (that is $d\langle u_s^2\rangle = 0$ in the stationary case). Another possibility is to modify the drift term, or the time scale of the fluid seen, leading to formulation 2:
$$dU_{s,i} = -\frac{3C_0^{st}\langle\epsilon\rangle}{4\tilde{k}}\,b_i\,(U_{s,i} - \langle U_{s,i}\rangle)\,dt + \sqrt{C_0^{st}\,\langle\epsilon\rangle\,b_i}\;dW_i. \tag{323}$$
It should be noted that the modification of the integral time scale of the fluid seen is not apparent in isotropic conditions.
Case c: the next step is to consider homogeneous but non-stationary turbulence, for example isotropic decaying turbulence. This is much closer to real situations and we expect that the turbulent kinetic energy of the fluid seen satisfies the following equation:
$$\frac{1}{2}\,\frac{d\langle u_s^2\rangle}{dt} = -\langle\epsilon\rangle. \tag{324}$$
The diffusion coefficient is modified by taking into account this constraint (that is $d\langle u_s^2\rangle = -2\langle\epsilon\rangle\,dt$ in the terms of stochastic calculus). In the non-stationary case, the expression of $T_L$ is slightly modified, see Eq. (237) for example. The steps of the derivation follow the ideas of Section 6.6 and similar steps taken in the derivation of the fluid Langevin model, see [11]. As in the fluid case, the viscous dissipation term can be accounted for by adding a negative term, $-\tfrac{2}{3}\langle\epsilon\rangle$, within the square root of the diffusion coefficient. For the model referred to as formulation 1, we have
$$dU_{s,i} = -\frac{3C_0^{st}\langle\epsilon\rangle}{4k}\,b_i\,(U_{s,i} - \langle U_{s,i}\rangle)\,dt + \sqrt{C_0^{st}\,\langle\epsilon\rangle\,b_i\,\frac{\tilde{k}}{k} - \frac{2}{3}\langle\epsilon\rangle}\;dW_i. \tag{325}$$
Following what is done in the fluid case, we introduce the constant $C_0$, still defined by $C_0^{st} = C_0 + 2/3$, and we obtain the Langevin model written in that case as
$$dU_{s,i} = -\left(\frac{1}{2} + \frac{3}{4}C_0\right)\frac{\langle\epsilon\rangle}{k}\,b_i\,(U_{s,i} - \langle U_{s,i}\rangle)\,dt + B_{s,i}\,dW_i, \tag{326}$$
with $B_{s,i}$ given by
$$B_{s,i}^2 = \langle\epsilon\rangle\left(C_0\,b_i\,\frac{\tilde{k}}{k} + \frac{2}{3}\left(b_i\,\frac{\tilde{k}}{k} - 1\right)\right), \tag{327}$$
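assuming that the right-hand side is indeed positive. The general form of the Langevin equation model can also be developed for formulation 2 and gives
$$dU_{s,i} = -\left(\frac{1}{2} + \frac{3}{4}C_0\right)\frac{\langle\epsilon\rangle\,b_i}{\tilde{k}}\,(U_{s,i} - \langle U_{s,i}\rangle)\,dt + B_{s,i}\,dW_i, \tag{328}$$
with $B_{s,i}$ given by
$$B_{s,i}^2 = \langle\epsilon\rangle\left(C_0\,b_i + \tfrac{2}{3}\,(b_i - 1)\right), \tag{329}$$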
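where now the right-hand side is always positive since $b_i \geq 1$. The derivation of the diffusion coefficient made above completes the expression of the Langevin equation model in the general case. Using for example formulation 1, the complete model is obtained by adding the drift and the diffusion terms developed in the two previous subsections:
$$\begin{aligned} dU_{s,i} = {} & -\frac{1}{\rho_f}\frac{\partial\langle P\rangle}{\partial x_i}\,dt + \left(\langle U_{p,j}\rangle - \langle U_{s,j}\rangle\right)\frac{\partial\langle U_{f,i}\rangle}{\partial x_j}\,dt \\ & -\left(\frac{1}{2} + \frac{3}{4}C_0\right)\frac{\langle\epsilon\rangle}{k}\,b_i\,(U_{s,i} - \langle U_{s,i}\rangle)\,dt \\ & + \sqrt{\langle\epsilon\rangle\left(C_0\,b_i\,\tilde{k}/k + \tfrac{2}{3}\left(b_i\,\tilde{k}/k - 1\right)\right)}\;dW_i. \end{aligned} \tag{330}$$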
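The role of the two closures can be checked numerically. For a linear Langevin equation with relaxation rate $a_i$ and diffusion coefficient $B_{s,i}$, each normal stress evolves as $d\langle u_{s,i}^2\rangle/dt = -2a_i\langle u_{s,i}^2\rangle + B_{s,i}^2$ in homogeneous turbulence. The sketch below, which assumes $\langle u_{s,i}^2\rangle = \langle u_{f,i}^2\rangle$ as in Case a and uses arbitrary numerical values and illustrative names, sums this budget over the three directions for formulations 1 and 2 and recovers $-2\langle\epsilon\rangle$, that is Eq. (324).

```python
def energy_budget(formulation, C0, eps, b, u2):
    """Sum over i of d<u_{s,i}^2>/dt for formulation 1 (Eqs. (326)-(327))
    or formulation 2 (Eqs. (328)-(329)) in homogeneous turbulence.

    b  : the three Csanady factors b_i
    u2 : the three normal stresses, assumed identical for the fluid
         and for the fluid seen
    """
    k = 0.5 * sum(u2)
    k_tilde = 1.5 * sum(bi * ui2 for bi, ui2 in zip(b, u2)) / sum(b)
    coeff = 0.5 + 0.75 * C0
    total = 0.0
    for bi, ui2 in zip(b, u2):
        if formulation == 1:
            rate = coeff * eps / k * bi
            B2 = eps * (C0 * bi * k_tilde / k
                        + (2.0 / 3.0) * (bi * k_tilde / k - 1.0))
        else:
            rate = coeff * eps / k_tilde * bi
            B2 = eps * (C0 * bi + (2.0 / 3.0) * (bi - 1.0))
        # Drift removes energy at rate 2 * rate * <u_i^2>, diffusion adds B2.
        total += -2.0 * rate * ui2 + B2
    return total

# Arbitrary anisotropic state: both budgets should equal -2 * eps.
budget_1 = energy_budget(1, C0=2.1, eps=0.7, b=[1.5, 1.2, 1.2], u2=[1.0, 0.6, 0.8])
budget_2 = energy_budget(2, C0=2.1, eps=0.7, b=[1.5, 1.2, 1.2], u2=[1.0, 0.6, 0.8])
```

Only the sum over the three directions is constrained in this way; the individual components are not required to balance separately.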
Similar expressions are obtained with formulation 2 for the time scale of the fluid seen and the diffusion coefficients. Whatever the formulation, it is seen that the resulting Langevin equation, which is believed to represent the simplest model for two-phase flow, contains a diagonal but non-isotropic diffusion matrix, $B_{s,ij} = B_{s,i}\,\delta_{ij}$ (no summation over $i$). It is also worth emphasizing that the closure relations put forward just above reflect modelling choices. For instance, the closure of the diffusion coefficient proposed to satisfy the correct decrease of turbulent kinetic energy is only one possibility among different choices. From Kolmogorov's theory, the diffusion matrix is thought to be an isotropic matrix (actually simply a diffusion coefficient). In the two-phase flow case, the isotropic form cannot be obtained anymore, but a diagonal diffusion matrix is selected among the different possibilities. This makes it possible to recover the fluid particle case when $b_i = 1$ and still have a simplified form compared to a full diffusion matrix.
Expression of the model in general coordinates. The complete model, Eq. (330), has been obtained in a special coordinate system since we have assumed that the mean drift, or the mean relative velocity $\langle\mathbf{U}_r\rangle$, was aligned with the first axis of the reference system. The Langevin model is generalised for the case where $\langle\mathbf{U}_r\rangle$ has any orientation with respect to the coordinate system as follows.
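We first introduce the two time scales of the velocity of the fluid seen, still following Csanady's analysis, $T_{L,\parallel}^*$ and $T_{L,\perp}^*$, which correspond to the values in the longitudinal direction (aligned with $\langle\mathbf{U}_r\rangle$) and in the transversal direction, respectively. They are given by
$$T_{L,\parallel}^* = \frac{T_L}{\sqrt{1 + \beta^2\,\dfrac{|\langle\mathbf{U}_r\rangle|^2}{2k/3}}}, \qquad T_{L,\perp}^* = \frac{T_L}{\sqrt{1 + 4\beta^2\,\dfrac{|\langle\mathbf{U}_r\rangle|^2}{2k/3}}}. \tag{331}$$
We define the coefficients $b_\parallel$ and $b_\perp$ by the ratio between $T_L$ and $T_L^*$, $b_\parallel = T_L/T_{L,\parallel}^*$ and $b_\perp = T_L/T_{L,\perp}^*$, and $\mathbf{r}$ as the unit vector aligned with $\langle\mathbf{U}_r\rangle$, in other words $\mathbf{r} = \langle\mathbf{U}_r\rangle/|\langle\mathbf{U}_r\rangle|$. The general form of the Langevin model can be written as
$$dU_{s,i} = -\frac{1}{\rho_f}\frac{\partial\langle P\rangle}{\partial x_i}\,dt + \left(\langle U_{p,j}\rangle - \langle U_{s,j}\rangle\right)\frac{\partial\langle U_{f,i}\rangle}{\partial x_j}\,dt + G_{ij}\,(U_{s,j} - \langle U_{s,j}\rangle)\,dt + B_{ij}\,dW_j. \tag{332}$$
The matrix $G_{ij}$ entering the drift coefficient is
$$G_{ij} = -\frac{1}{T_{L,\perp}^*}\,\delta_{ij} - \left[\frac{1}{T_{L,\parallel}^*} - \frac{1}{T_{L,\perp}^*}\right] r_i r_j. \tag{333}$$
Given the expression of the (isotropic) Lagrangian fluid time scale $T_L$, $G_{ij}$ can be re-expressed as
$$G_{ij} = -\left(\frac{1}{2} + \frac{3}{4}C_0\right)\frac{\langle\epsilon\rangle}{k}\,H_{ij}, \tag{334}$$
where
$$H_{ij} = b_\perp\,\delta_{ij} + [b_\parallel - b_\perp]\,r_i r_j. \tag{335}$$
The diffusion matrix $B_{ij}$ is defined as the solution of the matrix equation (for example, $B_{ij}$ is obtained by a Cholesky decomposition)
$$(B B^t)_{ij} = D_{ij}, \tag{336}$$
where $B^t_{ij}$ denotes the transpose matrix of $B_{ij}$. The symmetric matrix $D_{ij}$ appearing on the right-hand side of the matrix equality is expressed in the general coordinate system by
$$D_{ij} = D_\perp\,\delta_{ij} + [D_\parallel - D_\perp]\,r_i r_j, \tag{337}$$
with the coefficients $D_\parallel$ and $D_\perp$ given, consistently with Eq. (327), by
$$D_\parallel = \langle\epsilon\rangle\left(C_0\,b_\parallel\,\tilde{k}/k + \tfrac{2}{3}\left(b_\parallel\,\tilde{k}/k - 1\right)\right), \tag{338}$$
$$D_\perp = \langle\epsilon\rangle\left(C_0\,b_\perp\,\tilde{k}/k + \tfrac{2}{3}\left(b_\perp\,\tilde{k}/k - 1\right)\right). \tag{339}$$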
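The construction of Eqs. (333)–(339) can be sketched with a few lines of linear algebra. The example below is an illustration under stated assumptions: the function name and argument list are hypothetical, the weighted energy $\tilde{k}$ is taken from the trace formula of Eq. (340), the coefficients $D_\parallel$ and $D_\perp$ are built with $b_\parallel$ and $b_\perp$ respectively, and `numpy.linalg.cholesky` is used as one possible way of solving Eq. (336) (it raises an error if $D_{ij}$ is not positive definite).

```python
import numpy as np

def langevin_matrices(U_r, b_par, b_perp, C0, eps, k, R):
    """Build G_ij (Eq. (334)), D_ij (Eq. (337)) and a factor B with
    B B^t = D (Eq. (336)) for an arbitrary orientation of the mean drift.

    U_r : mean relative velocity vector (must be non-zero)
    R   : 3x3 Reynolds stress matrix
    """
    U_r = np.asarray(U_r, dtype=float)
    R = np.asarray(R, dtype=float)
    r = U_r / np.linalg.norm(U_r)          # unit vector along the mean drift
    rr = np.outer(r, r)                    # projector r_i r_j
    I = np.eye(3)
    H = b_perp * I + (b_par - b_perp) * rr                 # Eq. (335)
    k_tilde = 1.5 * np.trace(H @ R) / np.trace(H)          # Eq. (340)
    G = -(0.5 + 0.75 * C0) * eps / k * H                   # Eq. (334)
    ratio = k_tilde / k
    D_par = eps * (C0 * b_par * ratio + (2.0 / 3.0) * (b_par * ratio - 1.0))
    D_perp = eps * (C0 * b_perp * ratio + (2.0 / 3.0) * (b_perp * ratio - 1.0))
    D = D_perp * I + (D_par - D_perp) * rr                 # Eq. (337)
    B = np.linalg.cholesky(D)              # lower-triangular, B @ B.T == D
    return G, D, B, k_tilde

# Illustrative values: drift along (1, 1, 0), isotropic Reynolds stresses.
G, D, B, k_tilde = langevin_matrices(U_r=[1.0, 1.0, 0.0], b_par=1.3, b_perp=1.8,
                                     C0=2.1, eps=1.0, k=1.5, R=np.eye(3))
```

With these expressions the trace budget $2\,\mathrm{Tr}(GR) + \mathrm{Tr}(D) = -2\langle\epsilon\rangle$ holds for any orientation of $\mathbf{r}$, which is the matrix counterpart of the kinetic-energy constraint used to close the diffusion coefficient.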
When the vector $\mathbf{r}$ is aligned with one of the coordinate axes, the matrix $D_{ij}$ is diagonal and then $B_{ij}$ is also a diagonal matrix formed by the square roots of the coefficients $D_\parallel$ and $D_\perp$, as can be seen in Eq. (330). In the above equations, the new kinetic energy $\tilde{k}$ is defined as
$$\tilde{k} = \frac{3}{2}\,\frac{\mathrm{Tr}(HR)}{\mathrm{Tr}(H)}, \tag{340}$$
where $\mathrm{Tr}(H)$ denotes the trace of the matrix $H_{ij}$, and where $R$ denotes the Reynolds stress matrix, $R_{ij} = \langle u_i u_j\rangle$. In the special case when $\mathbf{r}$ is aligned with one of the coordinate axes, it is easily seen that the matrix $H_{ij}$ is diagonal and that Eq. (320) is retrieved. It can also be checked that, with these general expressions, the correct mean transport equation for the turbulent kinetic energy is respected.

7.5. Properties of present class of models

The models proposed in the previous subsection, as well as other possible variants apart from the so-called formulations 1 and 2, form a class of models that can be named Langevin models. The derivation detailed above helps to outline the present state of modelling and the whole approach that is followed from fluid diffusion to particle dispersion, as well as the different assumptions that one needs to make in the course of the derivation. This reveals the limitations of present expressions. Nevertheless, Langevin models already have a number of interesting properties that will be presented below. Most of the characteristics analysed here are not due to a particular choice of the drift or diffusion coefficients, but belong to the class of Langevin models. They therefore have an interest which goes beyond the selection of one particular model. The purpose of this subsection is actually to use Langevin models as convenient tools to clarify as much as possible a number of issues which may be regarded as obscure when addressed from a different point of view. More specifically, our aim is to stress that, by choosing the one-particle state vector which includes the velocity of the fluid seen, $\mathbf{Z} = (\mathbf{x}_p, \mathbf{U}_p, \mathbf{U}_s)$, and by choosing a Langevin equation model for $\mathbf{U}_s$ which has a 'proper' form, interesting physics is already captured, while simple mathematical manipulation of the stochastic model both avoids some pitfalls and simplifies considerably the derivation of macroscopic relations.

7.5.1. Gaussian and non-Gaussian pdfs

The first property concerns the resulting form of the pdf of the velocity of the fluid seen that comes out of Langevin models. A reference to this form has already been made in Section 7.4.2; we follow up on that question here. We can write the general form of a Langevin model as
$$dU_{s,i} = A_{s,i}\,dt + B_{s,ij}\,dW_j, \tag{341}$$
where the drift vector $A_{s,i}$ and the diffusion matrix $B_{s,ij}$ correspond to the terms in Eq. (330) for example. In that case, and for other models that respect the characteristics put forward below, we have the following property. For homogeneous turbulence and constant mean drifts, the fluid turbulent quantities, such as $k$, $\langle\epsilon\rangle$ and the mean gradients, are constant. As a result, the drift term is linear with respect to $\mathbf{U}_s$ and the diffusion coefficient is constant. It is well known that the resulting stochastic process $\mathbf{U}_s$ is Gaussian, see [15] or [16]. However, this result is only valid in the simplified case of homogeneous turbulence. In the general case of non-homogeneous turbulence, the different mean quantities entering the drift and the diffusion coefficients of the stochastic differential equation of the trajectories of $\mathbf{U}_s$ become space dependent. We must then consider the joint process $\mathbf{Z}$ of all the variables contained in the state vector, and even if the statistics of the discrete particle
velocity were known and constant, we would still have to consider the joint process of $(\mathbf{x}_p, \mathbf{U}_s)$. In terms of the joint process, the stochastic differential equations, Eq. (341), are now non-linear with respect to the variables of the joint process. Consequently, the resulting one-point pdf for the velocity of the fluid seen is not Gaussian anymore. Nothing is assumed beforehand with regard to this deviation from Gaussianity: it is a consequence of the mixing at a given location of trajectories, or of particles, which have different histories. In other words, Langevin models do not suppose Gaussian behaviour: Gaussian pdfs obtained in homogeneous turbulence and deviations from Gaussianity in non-homogeneous turbulence are results of the model. An example of this property was displayed in a turbulent mixing layer, a non-homogeneous flow, for fluid particles, see [59], where the simulated pdfs of fluid velocity components were found to have non-Gaussian forms roughly in line with experimental findings. There is nevertheless, built within Langevin models, an assumption of a Gaussian form. This is due to the increments of the Wiener process, $d\mathbf{W}$, which are independent Gaussian variables. In the general case, this does not imply anything on the stochastic process $\mathbf{U}_s$. All that can be said is that Langevin models do assume a Gaussian term, but for the conditional increments of $\mathbf{U}_s$. Indeed, if we consider the increments of the velocity of the fluid seen over a small time interval, and for particles that come from a given location $\mathbf{x}_p = \mathbf{x}_p^0$ and with a given discrete particle velocity and a given velocity of the fluid seen, $\mathbf{U}_p = \mathbf{U}_p^0$ and $\mathbf{U}_s = \mathbf{U}_s^0$, we can write that
$$dU_{s,i}\,|\,(\mathbf{x}_p^0, \mathbf{U}_p^0, \mathbf{U}_s^0) = A_{s,i}(\mathbf{x}_p^0, \mathbf{U}_p^0, \mathbf{U}_s^0)\,dt + B_{s,ij}(\mathbf{x}_p^0, \mathbf{U}_p^0, \mathbf{U}_s^0)\,dW_j. \tag{342}$$
Consequently, the conditional increments are Gaussian variables, since in the above equation $A_{s,i}$ and $B_{s,ij}$ are now constant. Yet, this does not imply Gaussianity for the unconditional velocity of the fluid seen.

7.5.2. Spurious drifts

The issue of the so-called spurious drifts is recurrent in the discussion of two-phase flow models. The question refers to the limit case of particle tracers, or to the limit behaviour of discrete particles when their inertia becomes negligible. Indeed, when $\tau_p \to 0$, it can be shown that the discrete particle velocity $\mathbf{U}_p$ becomes identical to the fluid one $\mathbf{U}_s$. We are in fact dealing with an ensemble of marked fluid particles and the particle dispersion problem reverts to the fluid diffusion one. Then, for an incompressible flow, it follows from the mass continuity constraint (the velocity field is of zero divergence) that, if we start with a uniform concentration for the marked particles (which behave as fluid tracers) in any flow, the concentration must remain uniform in the domain. In other words, since we are dealing with an ensemble of marked fluid elements, which can be thought of as an ensemble of small fluid elements with equal mass, we cannot have concentration build-ups. Failure to respect that constraint is referred to as the 'spurious drift effect', since with models that suffer from this drawback, particles behave as if they were drifting from certain regions of the flow domain. This is a very serious shortcoming for a given model. It simply expresses that mass continuity is not ensured and that the mean Navier–Stokes equation is not satisfied. It was shown to affect most Lagrangian two-phase flow models in numerical simulations of non-homogeneous turbulent flows, see Mc Innes and Bracco [86]. Actually, that issue was first addressed at the beginning of the eighties in the first Lagrangian models that were developed to deal with environmental
diffusion questions. Among other works, it was resolved by Pope [87], but surprisingly these works appear to have gone unnoticed in the two-phase flow community. That issue can be further clarified with our present class of models, and with the stochastic tools developed in Section 6. Actually, everything needed to understand the origin of the problem in some models, and the easy way out, has been laid out in Section 6.7.2. For discrete particles in the limit of vanishing inertia, the equations are in fact already described in Section 6.6 for fluid particles, since in that case $\mathbf{U}_s = \mathbf{U}_f$:
$$dx_{f,i} = U_{f,i}\,dt, \tag{343a}$$
$$dU_{f,i} = -\frac{1}{\rho_f}\frac{\partial\langle P\rangle}{\partial x_i}\,dt - \frac{U_{f,i} - \langle U_{f,i}\rangle}{T_L}\,dt + \sqrt{C_0\langle\epsilon\rangle}\;dW_i, \tag{343b}$$
and thus the mean value of particle velocity increments is indeed equal to the local value of the mean pressure-gradient
$$\langle dU_{f,i}\rangle = -\frac{1}{\rho_f}\frac{\partial\langle P\rangle}{\partial x_i}\,dt. \tag{344}$$
This result, as shown in Section 6.7.2, indicates that the mean Navier–Stokes equation is indeed satisfied. Consequently, by the very form of the stochastic model itself, the Langevin models are free of any spurious drifts. This explanation may appear deceptively simple (there is actually nothing complicated in that question), so it is perhaps worth supplying further details. The Langevin model itself is not a key element in the discussion. What is important is that, first, we are working in a Lagrangian formulation (all the usual terms arising from convection in the Navier–Stokes equation are implicitly contained in the Lagrangian derivative) with the instantaneous particle velocity and, second, the mean pressure-gradient is properly included in the particle velocity time evolution equation. To emphasize even further that nothing more (but nothing less) than these two features is actually needed to avoid spurious drifts, we could generalize and consider a general model equation written as
$$dx_{f,i} = U_{f,i}\,dt, \tag{345a}$$
$$dU_{f,i} = -\frac{1}{\rho_f}\frac{\partial\langle P\rangle}{\partial x_i}\,dt + \mathcal{M}(\mathbf{x}_f, \mathbf{U}_f), \tag{345b}$$
where $\mathcal{M}(\mathbf{x}_f, \mathbf{U}_f)$ stands for any model in terms of particle properties, $(\mathbf{x}_f, \mathbf{U}_f)$, that respects the constraint that, at a given location $\mathbf{x}_f = \mathbf{x}_0$, we have
$$\langle\mathcal{M}(\mathbf{x}_0, \mathbf{U}_f)\rangle = 0. \tag{346}$$
Then, by the same reasoning as applied for the Langevin model, it is seen that the mean Navier–Stokes equation is satisfied. In the above general form, it is crucial that $\mathbf{U}_f$ represent the instantaneous velocity. However, the same model can be re-expressed as a model for particle
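fluctuating velocity $\mathbf{u}_f$, see Section 6.7.2, and we have
$$dx_{f,i} = (\langle U_{f,i}\rangle + u_{f,i})\,dt, \tag{347a}$$
$$du_{f,i} = \frac{\partial\langle u_{f,i}\,u_{f,k}\rangle}{\partial x_k}\,dt - u_{f,k}\,\frac{\partial\langle U_{f,i}\rangle}{\partial x_k}\,dt + \mathcal{M}(\mathbf{x}_f, \mathbf{u}_f). \tag{347b}$$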
Both forms are strictly equivalent, and the mean pressure-gradient term in the time evolution equation of the instantaneous velocity, Eq. (345b), has been replaced by the first two terms on the right-hand side of the time evolution equation of the fluctuating velocity, Eq. (347b). Therefore, if one of these two terms is not properly taken into account in a model for fluid particle fluctuating velocities, this is equivalent to mishandling the mean pressure-gradient term in the corresponding equation for the instantaneous velocities and, consequently, this leads to the existence of spurious drifts. Most particle-tracking models proposed for two-phase flow simulations are first developed in the case of homogeneous turbulence without mean gradients, where the first two terms in Eq. (347b) are zero. Unfortunately, when moving to the general case of non-homogeneous turbulent flows, these terms are forgotten, see [86]. The first term on the right-hand side of Eq. (347b) is indeed crucial since this is a mean non-zero term which prevents any concentration build-up. The second term in Eq. (347b) does not play a key role in the spurious drift effect, since this is a term of zero mean, but its absence would mean that the production term in the mean second-order equations would not be obtained, see Section 6.7.2.

7.5.3. Form of the autocorrelations

The form of the autocorrelations of the velocity of the fluid seen $\mathbf{U}_s$ and of the discrete particle velocity $\mathbf{U}_p$ is an important property for assessing current models. That property is best considered in the simplified and ideal case of stationary isotropic turbulence. The discussion concerning the velocity autocorrelations can, of course, be extended to other, more realistic cases, for example to the case of homogeneous turbulence with a constant mean velocity gradient (simple-shear flows).
The most meaningful quantity is then the autocorrelation of the particle velocity fluctuations, whose equations are slightly more complicated since velocity components are linked, and where the time scale is typically modified by the mean velocity gradients. However, for the sake of simplicity, we will limit ourselves to the isotropic case. For our present class of Langevin models, these forms are readily obtained by making direct use of the results of Section 4.3. Indeed, there is a clear correspondence between the set of stochastic equations in the two-phase flow case, written in 1D without axis index,
$$dx_p = U_p\,dt, \tag{348a}$$
$$dU_p = \frac{U_s - U_p}{\tau_p}\,dt, \tag{348b}$$
$$dU_s = -\frac{U_s}{T_L^*}\,dt + B_s\,dW_s, \tag{348c}$$
and the last model discussed in Section 4.3 for Brownian particles, where the acceleration is included in the one-particle state vector. The acceleration-based model proposed in that section
has the form (slightly changing the notations of Section 4.3 to avoid confusion between the diffusion coefficients)
$$dx = U\,dt, \tag{349a}$$
$$dU = -\frac{U}{T}\,dt + \xi\,dt, \tag{349b}$$
$$d\xi = -\frac{\xi}{\tau}\,dt + B\,dW. \tag{349c}$$
The two sets of stochastic differential equations are similar if one makes the following replacements:
$$U_p \leftrightarrow U; \qquad \tau_p \leftrightarrow T; \qquad U_s \leftrightarrow T\xi; \qquad T_L^* \leftrightarrow \tau. \tag{350}$$
The autocorrelations given in Section 4.3 for typical stochastic processes are thus easily transformed for our present purpose. It is seen that the autocorrelation of the velocity of the fluid seen is an exponential, as already discussed in Section 7.4.2,
$$R_L^*(s) = \frac{\langle U_s(t)\,U_s(t+s)\rangle}{\langle U_s(t)^2\rangle} = \exp\!\left(-\frac{s}{T_L^*}\right), \tag{351}$$
while the autocorrelation of particle velocities $R_p(s)$ is
$$R_p(s) = \frac{\langle U_p(t)\,U_p(t+s)\rangle}{\langle U_p(t)^2\rangle} = \frac{1}{1 - T_L^*/\tau_p}\left[\exp\!\left(-\frac{s}{\tau_p}\right) - \frac{T_L^*}{\tau_p}\,\exp\!\left(-\frac{s}{T_L^*}\right)\right]. \tag{352}$$
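The two autocorrelations of Eqs. (351) and (352) can be evaluated directly. The sketch below uses illustrative function names and arbitrary values of $T_L^*$ and $\tau_p$, and assumes $T_L^* \neq \tau_p$ (the expression of Eq. (352) is singular otherwise and must be replaced by its limit). A simple trapezoidal quadrature is used to estimate the integral time scale of $R_p$, which can be compared with $T_L^* + \tau_p$, and a finite difference shows that $R_p$, unlike $R_L^*$, has a vanishing slope at the origin.

```python
import math

def R_fluid_seen(s, T_star):
    """Autocorrelation of the velocity of the fluid seen, Eq. (351)."""
    return math.exp(-s / T_star)

def R_particle(s, T_star, tau_p):
    """Autocorrelation of the particle velocity, Eq. (352).
    Requires T_star != tau_p."""
    c = T_star / tau_p
    return (math.exp(-s / tau_p) - c * math.exp(-s / T_star)) / (1.0 - c)

def integral_time_scale(R, s_max, n):
    """Trapezoidal estimate of the integral of R over [0, s_max]."""
    h = s_max / n
    total = 0.5 * (R(0.0) + R(s_max))
    for i in range(1, n):
        total += R(i * h)
    return total * h

T_star, tau_p = 1.0, 0.5   # arbitrary illustrative values
T_p = integral_time_scale(lambda s: R_particle(s, T_star, tau_p), 40.0, 40000)

# One-sided finite-difference slopes at the origin.
ds = 1e-6
slope_particle = (R_particle(ds, T_star, tau_p) - 1.0) / ds
slope_fluid_seen = (R_fluid_seen(ds, T_star) - 1.0) / ds
```

For these values the quadrature returns a value close to $T_L^* + \tau_p = 1.5$, the slope of $R_p$ at the origin is close to zero, and the slope of $R_L^*$ is close to $-1/T_L^*$.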
Two general comments can be made at this stage. First of all, the forms of the autocorrelations obtained here are consequences of the choice of Langevin models and are derived after the complete model has been built. They represent properties of the model in a very special case (the ideal case of isotropic stationary turbulence) and therefore manifest, in the present approach, only a part of the physics contained in the Langevin model. In most particle-tracking approaches, on the contrary, the form of the autocorrelation of the fluid seen plays a leading role [86]. Most developments start by assuming a certain form for $R_L^*$ and then propose a stochastic model that respects that form. The resulting model therefore appears more as a statistical trick that happens to respect a macroscopic relation. This approach is still in the spirit of the 'weak approximations' explained in Section 6 and at the beginning of the present section, since we are after all interested in modelling and in approximating statistics of the fluid and of the particles. However, the autocorrelation of the velocity of the fluid seen is only one particular statistic among others and it may be too restrictive to isolate only that one. This approach to model development is similar to one approach mentioned in Section 6 for the construction of stochastic models for fluid particles, where some proposals are put forward only in order to respect a given form of the pressure-rate-of-strain correlation $H_{ij}$. The situation is somewhat safer in the case of single-phase fluid particle models, where we consider only some
terms in the fluid particle velocity equation (the terms that are meant to replace the fluctuating pressure-gradient and the viscous term) and where we know in advance the mean Navier–Stokes equation, which acts as a guideline. There is no such safe guideline in two-phase flow modelling, and by putting too much emphasis on a quantity (the autocorrelation) which takes its full value in isotropic turbulence, some difficulties related to the extension to the general case of non-homogeneous turbulent flows may then be missed, for example the terms that lead to the existence of the spurious drift effects described above. There are also some similarities with the issue of Gaussian pdfs discussed in Section 7.5.1: the exponential form of the autocorrelation (if it is an acceptable form in the inertial range) is a result of Langevin models in homogeneous turbulence. In non-homogeneous turbulence, conditioned on one location, the autocorrelation is more difficult to express and is bound to deviate from the simple exponential form. Yet, it is explicitly calculated from Langevin models, while the above-mentioned approach would still have to assume a certain form beforehand, leaving doubts as to the validity of this form and of the resulting fluid velocities. Furthermore, the physics contained in the Langevin equations may be hidden or simply missed altogether. The general approach promoted in the present work, both for single-phase fluid particle models and for discrete particle models, is different. We have tried, as much as possible, to show that the successive steps are based on physical analysis: (1) why do we retain certain variables in the one-particle state vector? (because the fluid acceleration appears as the best candidate to be replaced by a model); (2) why do we use a Langevin model for the velocity of the fluid seen? (description of the crossing-trajectory effect and reference to Kolmogorov hypotheses); and (3) how do we close the drift and diffusion coefficients?
Secondly, as a consequence of this first remark, it is believed that the present point of view is more helpful: it indicates the physics already contained in the models and consequently points to their limitations or even shortcomings. The main interest is that it helps to avoid certain pitfalls. One example is the criticism regularly made of the exponential form of the autocorrelation of the fluid velocity seen, which does not respect the zero value of the derivative at the origin. If the model is addressed directly with the exponential form of $R_L^*$, then this is a puzzling question. However, when addressed from the equations of the trajectories of the process and with the explanation of the physical ideas behind present models, see Section 4, the origin and the meaning of the non-zero value of the derivative of $R_L^*$ are clear. The detailed discussion put forward in Section 6.8 is totally applicable here. Then, if a satisfactory behaviour of $R_L^*$ is deemed necessary, the solution is obvious. One can replace the white-noise term $dW$ in the fluid velocity equation by a coloured noise, thus following completely the procedure of Section 4.3. This simply amounts to shifting the introduction of the model one step further and proposing a general model which in 1D can be written

\[ dx_p = U_p\,dt, \tag{353a} \]
\[ dU_p = \frac{U_s - U_p}{\tau_p}\,dt, \tag{353b} \]
\[ dU_s = -\frac{U_s}{T_L^*}\,dt + \xi\,dt, \tag{353c} \]
\[ d\xi = -\frac{\xi}{\tau}\,dt + B_\xi\,dW. \tag{353d} \]
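As a hedged illustration of the zero-slope property, the sketch below compares the exponential autocorrelation with the closed form that follows from Eqs. (353c) and (353d) when the noise is an Ornstein–Uhlenbeck process of time scale $\tau$. The expression $R(s) = (T_L^* e^{-s/T_L^*} - \tau e^{-s/\tau})/(T_L^* - \tau)$ is the standard stationary result for this linear system; it is not quoted in the text, and the function names and parameter values are assumptions chosen for illustration only.

```python
import numpy as np

def autocorr_exponential(s, T_L):
    """Normalised autocorrelation of U_s for the white-noise Langevin model."""
    return np.exp(-np.abs(s) / T_L)

def autocorr_coloured(s, T_L, tau):
    """Normalised autocorrelation of U_s when the noise is an OU process.

    Assumes tau != T_L (the two time scales must differ for this closed form).
    """
    s = np.abs(s)
    return (T_L * np.exp(-s / T_L) - tau * np.exp(-s / tau)) / (T_L - tau)

# Illustrative (assumed) values: noise time scale much shorter than T_L
T_L, tau, h = 1.0, 0.1, 1e-6

# One-sided finite-difference slopes at the origin
slope_exp = (autocorr_exponential(h, T_L) - autocorr_exponential(0.0, T_L)) / h
slope_col = (autocorr_coloured(h, T_L, tau) - autocorr_coloured(0.0, T_L, tau)) / h

print("slope at origin, white noise   :", slope_exp)  # close to -1/T_L
print("slope at origin, coloured noise:", slope_col)  # close to 0
```

For lags much larger than $\tau$ the two curves become indistinguishable, which is consistent with the white-noise model being the limit of the coloured-noise one.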
J.-P. Minier, E. Peirano / Physics Reports 352 (2001) 1–214
The fluid velocity seen is then a differentiable process and its autocorrelation respects the zero-derivative constraint at the origin. Finally, there is one situation where the present exponential form for $R_L^*$ is clearly limited. For particles having a very large mean drift with respect to the fluid, $|\langle U_r\rangle| \gg 1$, the correlation between velocity components transverse to the mean drift direction tends towards the Eulerian cross correlation denoted $g(r)$, see [26]. This is in fact the Taylor hypothesis of frozen turbulence. It is then known that $g(r)$ has negative loops for continuity reasons, see [26]. This is clearly outside the possibilities of present Langevin models since the exponential form remains positive. However, that case is only met in an asymptotic limit where, due to the very large drift, $T_L^*$ is very small and the value of the autocorrelation is already very small. In other words, the prediction is wrong, but only where the values are small. This characteristic is not considered too problematic, and is at least not the limiting problem at the moment.

7.5.4. Dispersion coefficients

In the stationary isotropic case, Eq. (317), the long-time dispersion coefficient is given by $D_p = T_p \langle U_p^2\rangle$, where $T_p$ is the discrete particle integral time scale. $T_p$ is given by the classical formula

\[ T_p = \int_0^{+\infty} R_p(t)\,dt, \tag{354} \]
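which gives $T_p = T_L^* + \tau_p$, see Section 4.3. Using the stationarity hypothesis and Itô's stochastic calculus in Eqs. (317), one obtains $\langle U_p^2\rangle = \langle U_p U_s\rangle$ and $\langle U_s^2\rangle = \langle U_p U_s\rangle\,(1 + \tau_p/T_L^*)$. It was previously shown under the same hypothesis that $\langle U_s^2\rangle = B^2 T_L^*/2$ and therefore

\[ \langle U_p^2\rangle = \frac{B^2 T_L^*}{2}\,\frac{T_L^*/\tau_p}{1 + T_L^*/\tau_p}. \tag{355} \]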
The dispersion coefficient becomes equal to

\[ D_p = (T_L^* + \tau_p)\,\frac{B^2 T_L^*}{2}\,\frac{T_L^*/\tau_p}{1 + T_L^*/\tau_p} = \frac{B^2 (T_L^*)^2}{2} = \frac{\langle U_f^2\rangle\,T_L}{b}, \tag{356} \]
which simply yields $D_p = D_f/b$, as it should in Csanady's analysis. Therefore the closure of the diffusion coefficient in Case a (stationary isotropic turbulence) ensures that correct dispersion coefficients are obtained.

7.5.5. Tchen's formulae

When a cloud of particles is put into a box filled with a homogeneous turbulent flow and is agitated by the fluid turbulence, then, after a transient period, the statistics of particle velocities reach equilibrium values. These limit values are of course functions of the (constant) statistics of the fluid (its mean kinetic energy and the Lagrangian time scale, among others). The relations giving the equilibrium values in terms of the fluid statistics are called Tchen's relations. They were first obtained by Tchen [71] and later reformulated by Hinze [88]. In Tchen's work, the formulation of the problem was different: homogeneous turbulence was not explicitly considered, but it was assumed that particles were 'seeing' the same fluid element having constant statistical properties as they move across the flow. This is not a realistic assumption in the
general case, due to particle inertia and crossing-trajectory effects, see Section 7.4.1. However, what is important is that the statistics of the fluid seen are constant, or that the particle driving force is a stationary process. The best physical situation where these assumptions are valid is homogeneous turbulence. In Tchen's or Hinze's works, the equilibrium values were determined through spectral analysis and manipulation of the fluid and particle energy spectra, where the fluid spectrum is assumed to have an exponential form. This derivation can be cumbersome and the physical meaning of the exponential form is not obvious. On the other hand, the same relations are derived from the stochastic differential equations in a straightforward way. The exponential form is easily simulated by a simple Langevin model for the fluid seen (although this is not the only justification of this model), as transpires from the preceding subsection. Here again, we limit ourselves to isotropic turbulence for the sake of simplicity, since we actually have a 1D formulation. When no mean gradients are present, the model equations have the simplified form already used in Section 7.5.3

\[ dU_p = \frac{U_s - U_p}{\tau_p}\,dt, \tag{357a} \]
\[ dU_s = -\frac{U_s}{T_L^*}\,dt + B_s\,dW_s. \tag{357b} \]

After a transient period, all the statistics reach their limit value and the stochastic process $Z = (U_p, U_s)$ reaches its stationary state. We can then write that $d\langle g(Z)\rangle = 0$ for various functions $g$ using the rules of stochastic calculus explained in Section 2.7. This yields

\[ d\langle U_s^2\rangle = 0 \;\Rightarrow\; B_s^2 = 2\,\frac{\langle U_s^2\rangle}{T_L^*}, \tag{358a} \]
\[ d\langle U_p^2\rangle = 0 \;\Rightarrow\; \langle U_p U_s\rangle = \langle U_p^2\rangle, \tag{358b} \]
\[ d\langle U_p U_s\rangle = 0 \;\Rightarrow\; \frac{\langle U_s^2\rangle}{\tau_p} - \left(\frac{1}{\tau_p} + \frac{1}{T_L^*}\right)\langle U_p U_s\rangle = 0, \tag{358c} \]

and thus

\[ \langle U_p^2\rangle = \langle U_p U_s\rangle = \langle U_s^2\rangle\,\frac{1}{1 + \tau_p/T_L^*}, \tag{359} \]
which are Tchen's expressions [88]. In the original expressions, the fluid time scale was taken as the Lagrangian time scale $T_L$, which is in line with the fact that the crossing-trajectory effect was neglected. In the present relations, we have used $T_L^*$, which is the proper time scale since what matters for the resulting particle statistical properties are the characteristics, not exactly of the fluid particles, but rather of the fluid seen, as explained in Section 7.4.1. Tchen's formulae are algebraic equations relating particle and fluid kinetic energies. They are sometimes used in two-fluid approaches or Eulerian approaches as local relations written at each point across the flow. In that case, Tchen's relations play the role of a 'particle turbulence model'. This is however a very crude model and quite often an inaccurate closure in general non-homogeneous turbulence. Indeed, as follows from the discussion above, Tchen's relations can only be obtained if particles are agitated by the same statistical driving forces for long
enough times. On physical grounds, we can say that the transient period scales with the particle relaxation time scale $\tau_p$. Therefore, these equilibrium relations can only be approximately valid if particles stay for a time much longer than their characteristic time scale $\tau_p$ in a region of the flow that must furthermore be considered as locally homogeneous. Even if we take the largest time scale, thus $T_L$, as a reference for the lifetime of such a locally homogeneous region, $\tau_p$ must be small with respect to $T_L$. In non-homogeneous turbulent flows and for particles whose inertia is not negligible, these relations can therefore be expected to be poor estimates of particle kinetic energies. This is actually not surprising since Tchen's relations neglect convective phenomena. Yet, we have already stressed that particle velocities are unlikely to be easily replaced by a (local) model, contrary to their accelerations. This is precisely the motivation for the choice of the one-particle state vector $Z = (x_p, U_p, U_s)$, see Section 7.3.2, and for the choice of a Lagrangian approach which treats transport phenomena accurately.

7.5.6. Extension to bubbly flows

Up to now, only heavy particles have been under consideration (particles whose density $\rho_p$ is much greater than the fluid density $\rho_f$), and the extension of the present approach to light particles is now discussed. The first problem is to write the instantaneous equation of motion of a bubble in a turbulent flow. This is a more complicated issue than the similar one for heavy particles and is still an open question. A general form of the particle momentum equation can nevertheless be proposed which keeps drag, pressure-gradient, added-mass and gravity forces [73,72] while the history and lift forces are neglected

\[ \frac{dx_p}{dt} = U_p, \tag{360a} \]
\[ \frac{dU_p}{dt} = \frac{U_s - U_p}{\tau_p} + \frac{\rho_f}{\rho_p}\,\frac{DU_s}{Dt} + \frac{1}{2}\,\frac{\rho_f}{\rho_p}\,C_a \left(\frac{DU_s}{Dt} - \frac{dU_p}{dt}\right) + \left(1 - \frac{\rho_f}{\rho_p}\right) g. \tag{360b} \]
Due to the separation effect described in Section 7.4.1, a fluid particle and a discrete particle which coincide at a given time do not have the same velocity. Their trajectories separate and two derivatives can be defined. The notation $d\,\cdot/dt$ represents the derivative of a quantity along the discrete particle trajectory. In other words, it is the time rate of change of a quantity sampled by a discrete particle as it moves through the continuous phase. On the other hand, the notation $D\,\cdot/Dt$ represents a derivative taken along the fluid particle trajectory. In the particle momentum equation, the pressure-gradient force and the added-mass force are expressed with the 'real' fluid particle acceleration $DU_s/Dt$. There is another expression of the particle momentum equation in which the derivatives are taken along the solid particle trajectory, and the equations are written

\[ \frac{dx_p}{dt} = U_p, \tag{361a} \]
\[ \frac{dU_p}{dt} = \frac{U_s - U_p}{\tau_p} + \frac{\rho_f}{\rho_p}\,\frac{dU_s}{dt} + \frac{1}{2}\,\frac{\rho_f}{\rho_p}\,C_a \left(\frac{dU_s}{dt} - \frac{dU_p}{dt}\right) + \left(1 - \frac{\rho_f}{\rho_p}\right) g. \tag{361b} \]
The question of which derivatives should be used has been the subject of some debate [89,74]. In the theoretical derivation of the forces acting on a particle which leads to the notion of drag, pressure-gradient and added-mass forces, the relative Reynolds number is small and, in that case, the two expressions are of the same order and cannot be distinguished. Further, recent studies [74] seem to indicate that the first form, which uses fluid velocity derivatives calculated along fluid particle trajectories, is better founded. Nevertheless, in spite of some of its inaccuracies, the second expression will be used in the rest of this section. A discussion of the possible modifications induced by the first form will be made later on. We are still concerned with a statistical approach to the problem, as explained in Section 7.1, and for bubbles, as well as for heavy particles, we have to come up with a model to simulate the velocity of the fluid seen $U_s$. Actually, given the form of the discrete particle (the bubble) momentum equation, we need the velocity of the fluid seen and its acceleration $dU_s/dt$. We can then think of the previously described Langevin models to represent the velocity of the fluid seen. At first sight, we are faced with a new difficulty that seems to prevent these Langevin or stochastic diffusion processes from being used directly. Indeed, stochastic diffusion processes have trajectories which are continuous but nowhere differentiable, see Section 2, and the derivative $dU_s/dt$ does not exist. However, there is an easy way out of this difficulty, following the steps that lead from an ordinary differential equation to a stochastic one and which were explained in the first sections of the paper, see Section 2 (in particular, Section 2.6) and Section 4. The main idea is simply to write the whole set of equations as increments over small time steps, for example

\[ dx_{p,i} = U_{p,i}\,dt, \tag{362a} \]
\[ \left(1 + \frac{1}{2} C_a \frac{\rho_f}{\rho_p}\right) dU_{p,i} = \frac{U_{s,i} - U_{p,i}}{\tau_p}\,dt + \left(1 - \frac{\rho_f}{\rho_p}\right) g_i\,dt + \left(1 + \frac{1}{2} C_a\right) \frac{\rho_f}{\rho_p} \times dU_{s,i}, \tag{362b} \]
\[ dU_{s,i} = A_{s,i}\,dt, \tag{362c} \]
where $A_s$ is the time rate of change of the velocity of the fluid seen. If $U_s$ is a differentiable process, then $A_s$ represents the acceleration. Yet, this formulation in increments is the key to a generalization to time rates of change which involve white-noise terms, see Section 2.6. It is then easily seen that when $U_s$ becomes a diffusion process, such as a Langevin model, the limit set of equations still makes sense (it involves the choice related to the definitions of the stochastic integrals) and has the form

\[ dx_{p,i} = U_{p,i}\,dt, \tag{363a} \]
\[ \left(1 + \frac{1}{2} C_a \frac{\rho_f}{\rho_p}\right) dU_{p,i} = \frac{U_{s,i} - U_{p,i}}{\tau_p}\,dt + \left(1 - \frac{\rho_f}{\rho_p}\right) g_i\,dt + \left(1 + \frac{1}{2} C_a\right) \frac{\rho_f}{\rho_p} \times dU_{s,i}, \tag{363b} \]
\[ dU_{s,i} = -\frac{1}{\rho_f}\,\frac{\partial \langle P\rangle}{\partial x_i}\,dt + \left(\langle U_{p,j}\rangle - \langle U_{s,j}\rangle\right) \frac{\partial \langle U_{f,i}\rangle}{\partial x_j}\,dt - \frac{U_{s,i} - \langle U_{s,i}\rangle}{T_L^*}\,dt + B_{s,i}\,dW_i. \tag{363c} \]
What happens is that, as the time scale of the 'rapid' part of the fluid acceleration goes to zero, $\tau \to 0$, the procedure of elimination of fast variables described in Section 4 and already used in Section 6.8 shows that both the fluid and the discrete particle velocity equations now become SDEs at the same time. Then both processes, namely $U_p$ and $U_s$, have trajectories which satisfy stochastic equations with the same white-noise term $dW$ in both equations. The third term in the discrete particle equation represents a model for the pressure-gradient and added-mass forces. The mean pressure-gradient accounts for the mean acceleration of the fluid particle, while the fluctuating part including the white-noise term models the fluctuating acceleration. One of the objectives of this subsection was to show that present diffusion models for the fluid velocity seen can be used even for bubbly flows. No supplementary terms have to be introduced a priori to handle this case, at least for theoretical reasons. However, this fact and the appearance of the same white-noise term in both the discrete particle and fluid velocity equations are related to the choice of the expressions of the pressure-gradient and added-mass forces used here. If we had retained the form with the real fluid particle acceleration instead of the acceleration calculated along the solid particle trajectory, the complete model could not have been derived as above. We would need a model for the acceleration of the fluid particles which are sampled by the solid particles along their trajectories. This requires additional assumptions and developments. In particular, different white-noise terms could enter the different equations, contrary to the case analysed in this section. Another purpose of this subsection is to show, perhaps more effectively than with the simpler model for heavy particles, how the equilibrium values (Tchen's relations of the previous subsection) can be readily obtained.
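As in the previous section, we consider homogeneous isotropic turbulence and address the question in a 1D formulation. In that case the governing set of stochastic differential equations is

\[ dU_p = -\frac{U_p}{\tau_p^m}\,dt + \left(\frac{1}{\tau_p^m} - \frac{b}{T_L^*}\right) U_s\,dt + b\,B_s\,dW, \tag{364a} \]
\[ dU_s = -\frac{U_s}{T_L^*}\,dt + B_s\,dW, \tag{364b} \]

where the coefficients $\tau_p^m$ and $b$ stand for

\[ \tau_p^m = \left(1 + \frac{1}{2} C_a \frac{\rho_f}{\rho_p}\right) \tau_p, \tag{365} \]
\[ b = \frac{\left(1 + \frac{1}{2} C_a\right) \rho_f}{\left(1 + \frac{1}{2} C_a\,\rho_f/\rho_p\right) \rho_p}. \tag{366} \]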
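To give a feel for the orders of magnitude in Eqs. (365) and (366), the following minimal sketch evaluates $\tau_p^m$ and $b$ for a heavy particle and for a bubble. The function name, the fluid densities (air at about 1.2 kg/m³, water at about 1000 kg/m³) and the relaxation time are illustrative assumptions and are not taken from the text; only the particle density 2590 kg/m³ appears in the source, in the wall-jet test case.

```python
def bubble_coefficients(rho_f, rho_p, tau_p, C_a=1.0):
    """Return (tau_p^m, b) from Eqs. (365)-(366).

    rho_f : fluid density, rho_p : particle (or bubble) density,
    tau_p : particle relaxation time, C_a : added-mass coefficient.
    """
    ratio = rho_f / rho_p
    added_mass_factor = 1.0 + 0.5 * C_a * ratio
    tau_p_m = added_mass_factor * tau_p
    b = (1.0 + 0.5 * C_a) * ratio / added_mass_factor
    return tau_p_m, b

tau_p = 1.0e-2  # assumed relaxation time in seconds

# Heavy particle: glass in air (air density is an assumed value)
tau_m_heavy, b_heavy = bubble_coefficients(rho_f=1.2, rho_p=2590.0, tau_p=tau_p)

# Light particle: gas bubble in water (both densities are assumed values)
tau_m_bubble, b_bubble = bubble_coefficients(rho_f=1000.0, rho_p=1.2, tau_p=tau_p)

print("glass in air  : tau_p^m / tau_p =", tau_m_heavy / tau_p, " b =", b_heavy)
print("bubble, water : tau_p^m / tau_p =", tau_m_bubble / tau_p, " b =", b_bubble)
```

With $C_a = 1$, the heavy-particle case gives $b$ close to zero and $\tau_p^m$ close to $\tau_p$, whereas for a bubble $b$ approaches $(1 + \tfrac12 C_a)/(\tfrac12 C_a) = 3$ and the added mass dominates the effective relaxation time.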
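After a transient period, all the statistics reach their limit value and the stochastic process $Z = (U_p, U_s)$ reaches its stationary state. Following the same procedure as in the previous subsection and using the rules of stochastic calculus, see Section 2.7, we get

\[ d\langle U_s^2\rangle = 0 \;\Rightarrow\; B_s^2 = 2\,\frac{\langle U_s^2\rangle}{T_L^*}, \tag{367a} \]
\[ d\langle U_p^2\rangle = 0 \;\Rightarrow\; -\frac{\langle U_p^2\rangle}{\tau_p^m} + \left(\frac{1}{\tau_p^m} - \frac{b}{T_L^*}\right) \langle U_p U_s\rangle + \frac{b^2 B_s^2}{2} = 0, \tag{367b} \]
\[ d\langle U_p U_s\rangle = 0 \;\Rightarrow\; -\left(\frac{1}{\tau_p^m} + \frac{1}{T_L^*}\right) \langle U_p U_s\rangle + \left(\frac{1}{\tau_p^m} - \frac{b}{T_L^*}\right) \langle U_s^2\rangle + b\,B_s^2 = 0, \tag{367c} \]

which give immediately the equilibrium relations

\[ \langle U_p U_s\rangle = \langle U_s^2\rangle \left(\frac{T_L^* + b\,\tau_p^m}{T_L^* + \tau_p^m}\right), \tag{368} \]
\[ \langle U_p^2\rangle = \langle U_s^2\rangle \left(\frac{T_L^* + b^2\,\tau_p^m}{T_L^* + \tau_p^m}\right). \tag{369} \]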
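The equilibrium relations can be cross-checked numerically. For a linear system $dZ = A Z\,dt + G\,dW$, the stationary covariance $S$ satisfies the Lyapunov equation $A S + S A^T + G G^T = 0$, which is a standard result and not a method stated in the text. The sketch below, with hypothetical function names and arbitrary parameter values, solves it for the system (364) and compares the result with Eqs. (368) and (369).

```python
import numpy as np

def stationary_covariance(A, G):
    """Solve A S + S A^T + G G^T = 0 for the stationary covariance S."""
    n = A.shape[0]
    I = np.eye(n)
    # Kronecker form of the linear map S -> A S + S A^T
    lhs = np.kron(A, I) + np.kron(I, A)
    rhs = -(G @ G.T).reshape(-1)
    return np.linalg.solve(lhs, rhs).reshape(n, n)

def bubble_moments(us2, T_L, tau_m, b):
    """Stationary <Up^2>, <Up Us>, <Us^2> for Eqs. (364), with Z = (Up, Us)."""
    B_s = np.sqrt(2.0 * us2 / T_L)            # Eq. (367a)
    A = np.array([[-1.0 / tau_m, 1.0 / tau_m - b / T_L],
                  [0.0, -1.0 / T_L]])
    G = np.array([[b * B_s], [B_s]])          # same white noise in both equations
    S = stationary_covariance(A, G)
    return S[0, 0], S[0, 1], S[1, 1]

# Arbitrary illustrative values (assumptions)
us2, T_L, tau_m, b = 2.0, 1.0, 0.5, 0.3
up2, upus, us2_out = bubble_moments(us2, T_L, tau_m, b)

print("<Up^2>  numeric:", up2, " Eq. (369):", us2 * (T_L + b**2 * tau_m) / (T_L + tau_m))
print("<Up Us> numeric:", upus, " Eq. (368):", us2 * (T_L + b * tau_m) / (T_L + tau_m))
```

Setting `b = 0` and `tau_m = tau_p` in `bubble_moments` gives back the heavy-particle result of Eq. (359), so the same routine can be reused to check Tchen's relations.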
In practice, the constant $C_a$ is assumed to be equal to one [74], and we can see that these relations contain the ones derived in Section 7.5.5 as limit cases, since when $\rho_f \ll \rho_p$ we have $b \to 0$ and $\tau_p^m \to \tau_p$.

7.5.7. From Langevin to Kinetic equations

In the preceding subsections, once the one-particle state vector $Z = (x_p, U_p, U_s)$ had been chosen and the complete Langevin models introduced, we concentrated mainly on some characteristics of these models. In this last subsection, it may be interesting to take up the question of the hierarchy of pdf equations and of descriptions that was discussed at length in Sections 3 and 4 and which is very much related to the choice of the state vector. Once we have a closed pdf equation for $p(t; y_p, V_p, V_s)$, we can move down the pdf-equation ladder and consider the form of the pdf closures for a reduced one-particle state vector. Indeed, in Section 7.3.2, we discussed two main choices: one limited to particle location and velocity, $Z^r = (x_p, U_p)$, and the one that was retained, which also includes the velocity of the fluid seen, $Z = (x_p, U_p, U_s)$. The first one is used in the Kinetic Equation approach, see [82,83], and is briefly presented in Section 7.3.2. The second one has been the basis of the present Langevin models. The state vector $Z$ is more general than $Z^r$ since it contains one extra variable, and thus the pdf of $Z^r$, which we write as $p^r$ to emphasize its correspondence with the reduced state vector $Z^r$, is simply the marginal of the pdf of $Z$. In other words, the description in terms of $Z$ is a finer description than the one performed in terms of $Z^r$ and we have

\[ p^r(t; y_p, V_p) = \int p(t; y_p, V_p, V_s)\,dV_s. \tag{370} \]

In Section 7.3.2, we mentioned the closure proposal made in the Kinetic Equation approach, which expresses in sample space the mean conditional value of the velocity of the fluid seen, Eq. (296).
However, the closure relations in terms of $Z$ which lead to the Langevin models are different, and we can ask ourselves, once a Langevin model is chosen, what the resulting model for the reduced state vector looks like. The interesting issue is whether, after integration over the extra variable $U_s$, we can retrieve the Kinetic Equation closure from a Langevin model.
The general form of a Langevin model is given by Eqs. (309) which, with a drag force, are

\[ dx_{p,i} = U_{p,i}\,dt, \tag{371a} \]
\[ dU_{p,i} = \frac{U_{s,i} - U_{p,i}}{\tau_p}\,dt + g_i\,dt, \tag{371b} \]
\[ dU_{s,i} = A_{s,i}(t, Z)\,dt + B_{s,ij}(t, Z)\,dW_j \tag{371c} \]
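and the pdf $p(t; y_p, V_p, V_s)$ satisfies the corresponding Fokker–Planck equation, Eq. (310), which is rewritten here explicitly with the expression of the drift term in the discrete particle momentum equation

\[ \frac{\partial p}{\partial t} + \frac{\partial}{\partial y_{p,i}}\left[V_{p,i}\,p\right] - \frac{\partial}{\partial V_{p,i}}\left[\frac{V_{p,i}}{\tau_p}\,p\right] + \frac{\partial}{\partial V_{p,i}}\left[\frac{V_{s,i}}{\tau_p}\,p\right] + \frac{\partial}{\partial V_{s,i}}\left[A_{s,i}\,p\right] = \frac{1}{2}\,\frac{\partial^2}{\partial V_{s,i}\,\partial V_{s,j}}\left[(B_s B_s^T)_{ij}\,p\right]. \tag{372} \]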
By integration of the pdf equation over all possible values of the extra variable, $U_s$, we obtain the reduced pdf equation, which is identical to the one already given in Section 7.3.2, Eq. (295), where it was obtained directly through the use of the techniques of Section 3

\[ \frac{\partial p^r}{\partial t} + \frac{\partial}{\partial y_{p,i}}\left[V_{p,i}\,p^r\right] - \frac{\partial}{\partial V_{p,i}}\left[\frac{V_{p,i}}{\tau_p}\,p^r\right] + \frac{\partial}{\partial V_{p,i}}\left[\frac{1}{\tau_p}\,\langle U_{s,i} \,|\, y_p, V_p\rangle\,p^r\right] = 0. \tag{373} \]

Note that, implicitly, we have restricted ourselves to cases where $\tau_p$ does not depend on the extra variable $U_s$, whereas it should in the general case, see the discussion of current expressions of the discrete particle relaxation time scale in Section 7.1. If it does, the time scale $\tau_p$ should be included within the conditional expectation involving the velocity of the fluid seen. More importantly, we see that the third term in Eq. (373) would also involve an unclosed form, the conditional expectation of the inverse of $\tau_p$ for a given value of $(y_p, V_p)$. This is actually another reason for choosing the one-particle state vector $Z$ rather than the reduced one $Z^r$, since $\tau_p = \tau_p(Z)$. Yet, for the sake of simplicity, we leave out that question and assume that $\tau_p$ is constant in the following discussion. The conditional average of $U_s$ is

\[ \langle U_s \,|\, y_p, V_p\rangle\,p^r(t; y_p, V_p) = \int V_s\,p(t; y_p, V_p, V_s)\,dV_s \tag{374} \]

and some information on the form of the higher pdf $p$ must be used to work out a closed expression. This can be done in some cases. In situations such as homogeneous turbulence, the form of present Langevin models, see Eq. (330) for example, shows that the drift coefficient for the velocity of the fluid seen, $A_s$, is linear with respect to the variables entering the state vector $Z = (x_p, U_p, U_s)$ and that the diffusion coefficient $B_s$ is constant. Therefore, if we consider the set of stochastic differential equations for the complete state vector $Z$ written in a general form as

\[ dZ_i = A_i(Z)\,dt + B_{ij}(Z)\,dW_j, \tag{375} \]
the drift coefficient $A = (A_i)$ is linear with respect to $Z$ and the diffusion matrix is constant. This is the same reasoning as in Section 7.5.1, applied to the complete set of stochastic variables entering the state vector $Z$. In that case, the resulting stochastic process $Z$ is Gaussian (this can also be seen from the Fokker–Planck equation, Eq. (372)). Then, for Gaussian processes, one can apply a useful result called Gaussian integration by parts [16], which states that for a Gaussian process made up of $(n + 1)$ centred Gaussian random variables, say $(X_1, X_2, \ldots, X_n, X_{n+1})$, whose pdf is $p_{n+1}(y_1, y_2, \ldots, y_n, y_{n+1})$, integration over one variable, say $X_{n+1}$, can be written in terms of the marginal pdf $p_n$ of the reduced $n$-dimensional process $(X_1, X_2, \ldots, X_n)$ as

\[ \int y_{n+1}\,p_{n+1}(y_1, y_2, \ldots, y_n, y_{n+1})\,dy_{n+1} = -\sum_{i=1}^{n} \langle X_i X_{n+1}\rangle\,\frac{\partial p_n}{\partial y_i}. \tag{376} \]
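Equation (376) can be checked numerically in the simplest case $n = 1$, where it reads $\int y_2\,p_2(y_1, y_2)\,dy_2 = -\langle X_1 X_2\rangle\,dp_1/dy_1$. The sketch below is only an illustration under assumed values: the standard deviations, the covariance, the grids and the function name are all hypothetical choices, and the integral is evaluated with a plain rectangle rule.

```python
import numpy as np

def bivariate_gaussian_pdf(y1, y2, s1, s2, c):
    """Centred bivariate Gaussian pdf with variances s1**2, s2**2 and covariance c."""
    det = s1**2 * s2**2 - c**2
    q = (s2**2 * y1**2 - 2.0 * c * y1 * y2 + s1**2 * y2**2) / det
    return np.exp(-0.5 * q) / (2.0 * np.pi * np.sqrt(det))

# Assumed statistics of (X1, X2)
s1, s2, c = 1.0, 1.5, 0.6

y1 = np.linspace(-3.0, 3.0, 13)        # points where Eq. (376) is tested
y2 = np.linspace(-15.0, 15.0, 6001)    # integration grid for the eliminated variable
dy2 = y2[1] - y2[0]

# Left-hand side: integral over y2 of y2 * p2(y1, y2)
p2 = bivariate_gaussian_pdf(y1[:, None], y2[None, :], s1, s2, c)
lhs = np.sum(y2[None, :] * p2, axis=1) * dy2

# Right-hand side: -<X1 X2> dp1/dy1, with p1 the Gaussian marginal of X1
p1 = np.exp(-0.5 * (y1 / s1) ** 2) / (s1 * np.sqrt(2.0 * np.pi))
dp1_dy1 = -(y1 / s1**2) * p1
rhs = -c * dp1_dy1

print("max |lhs - rhs| =", np.max(np.abs(lhs - rhs)))
```

The same identity is what turns the unclosed conditional average of Eq. (374) into gradient terms of the reduced pdf.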
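A more general form of this theorem is called the Furutsu–Novikov theorem and uses functional calculus. Application of Gaussian integration by parts in our case now yields

\[ \langle U_{s,i} \,|\, y_p, V_p\rangle\,p^r(t; y_p, V_p) = \langle U_{s,i}\rangle\,p^r - \big\langle (U_{s,i} - \langle U_{s,i}\rangle)(x_{p,j} - \langle x_{p,j}\rangle) \big\rangle\,\frac{\partial p^r}{\partial y_{p,j}} - \big\langle (U_{s,i} - \langle U_{s,i}\rangle)(U_{p,j} - \langle U_{p,j}\rangle) \big\rangle\,\frac{\partial p^r}{\partial V_{p,j}}. \tag{377} \]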
Therefore, the closed pdf equation for $p^r$ obtained from the higher pdf equation for a Langevin model is

\[ \frac{\partial p^r}{\partial t} + \frac{\partial [V_{p,i}\,p^r]}{\partial y_{p,i}} - \frac{\partial}{\partial V_{p,i}}\left[\frac{V_{p,i} - \langle U_{s,i}\rangle}{\tau_p}\,p^r\right] = \frac{\partial}{\partial V_{p,i}}\left(\lambda_{ij}\,\frac{\partial p^r}{\partial y_{p,j}}\right) + \frac{\partial}{\partial V_{p,i}}\left(D_{ij}\,\frac{\partial p^r}{\partial V_{p,j}}\right) \tag{378} \]

and has the same form as the proposed Kinetic Equation [80,82,83]. The coefficients $\lambda_{ij}$ and $D_{ij}$ appear as the remaining traces of the variable that has been eliminated in the reduced pdf equation (the Kinetic Equation), namely the velocity of the fluid seen, $U_s$. In the simplified situations where Gaussian integration by parts can be applied, it is seen that these coefficients are the correlations between each of the remaining degrees of freedom of $Z^r$ and the external variable $U_s$. The expressions of these coefficients in terms of the history of the velocity of the fluid seen along the discrete particle trajectories can be given from the integrated discrete particle equations, Eqs. (371)

\[ x_{p,i}(t) = x_{p,i}(0) + \tau_p \left[1 - e^{-t/\tau_p}\right] U_{p,i}(0) + \int_0^t \left[1 - e^{(t'-t)/\tau_p}\right] U_{s,i}(t')\,dt', \tag{379a} \]
\[ U_{p,i}(t) = U_{p,i}(0)\,e^{-t/\tau_p} + \frac{e^{-t/\tau_p}}{\tau_p} \int_0^t e^{t'/\tau_p}\,U_{s,i}(t')\,dt'. \tag{379b} \]

Neglecting the correlations of the velocity of the fluid seen with the discrete particle initial values, for an elapsed time $t$ larger than $\tau_p$, we get the expressions of the correlations

\[ \langle U_{p,i}(t)\,U_{s,j}(t)\rangle = \frac{e^{-t/\tau_p}}{\tau_p} \int_0^t e^{t'/\tau_p}\,\langle U_{s,j}(t)\,U_{s,i}(t')\rangle\,dt', \tag{380a} \]
\[ \langle x_{p,i}(t)\,U_{s,j}(t)\rangle = \int_0^t \left[1 - e^{(t'-t)/\tau_p}\right] \langle U_{s,j}(t)\,U_{s,i}(t')\rangle\,dt'. \tag{380b} \]
The general closure relations are therefore non-local in time. One important point is that, since the velocity of the fluid seen $U_s$ is external to the reduced description $Z^r$, its autocorrelation $\langle U_{s,j}(t)\,U_{s,i}(t')\rangle$ must be provided by another source or assumed. From the discussion of Section 7.5.3, it appears that putting forward a satisfactory expression for $\langle U_{s,j}(t)\,U_{s,i}(t')\rangle$ in the general case of non-homogeneous turbulence is not an easy task. Yet, if we again limit ourselves to the simplified case of homogeneous turbulence without mean gradients and use a 1D formulation of the equations (for the sake of simplicity, as in the preceding subsections), the autocorrelation of the velocity of the fluid seen can be well approximated by an exponential

\[ \langle U_s(t)\,U_s(t')\rangle = \langle U_s^2\rangle\,e^{-(t-t')/T_L^*} \tag{381} \]
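and we obtain the stationary values of the correlations as

\[ D = \frac{1}{\tau_p}\,\langle U_p(t)\,U_s(t)\rangle = \frac{1}{\tau_p}\,\langle U_s^2\rangle\,\frac{1}{1 + \tau_p/T_L^*}, \tag{382a} \]
\[ \lambda = \frac{1}{\tau_p}\,\langle x_p(t)\,U_s(t)\rangle = \frac{1}{\tau_p}\,\langle U_s^2\rangle\,\frac{T_L^*}{1 + \tau_p/T_L^*}, \tag{382b} \]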
where the first equation was already given as one of Tchen's relations, see Section 7.5.5. At this stage, using the concepts developed in the first sections of this paper, we are in a position to examine these relations and check their consistency in limit cases. For example, let us consider the limit case where the fluctuating part of the velocity of the fluid seen, $u_s = U_s - \langle U_s\rangle$, tends towards a white-noise term. All the material needed to handle that case has been given in Section 4. Physically speaking, this requires that the time scale of the velocity of the fluid seen, $T_L^*$, becomes very small with respect to the time scale of the velocity of the discrete particle, $\tau_p$. As was explained in Section 4, and in particular with the relation (94), the limit case of a white-noise term is expressed in a continuous sense by

\[ \langle U_s^2\rangle \to \infty, \quad T_L^* \to 0 \quad \text{such that} \quad \langle U_s^2\rangle \times T_L^* \to D_s, \tag{383} \]
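where $D_s$ is a finite and non-zero constant. From the above expressions of $\lambda$ and of $D$ we get in that limit

\[ D = \frac{D_s}{\tau_p^2}, \tag{384a} \]
\[ \lambda = 0 \tag{384b} \]

and the Kinetic Equation becomes

\[ \frac{\partial p^r}{\partial t} + \frac{\partial [V_{p,i}\,p^r]}{\partial y_{p,i}} - \frac{\partial}{\partial V_{p,i}}\left[\frac{V_{p,i} - \langle U_{s,i}\rangle}{\tau_p}\,p^r\right] = \frac{D_s}{\tau_p^2}\,\frac{\partial^2 [p^r]}{\partial V_{p,i}^2}. \tag{385} \]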
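The stationary coefficients of Eqs. (382) and their white-noise limit, Eqs. (384), can be reproduced by direct quadrature of the history integrals (380) with the exponential autocorrelation (381). The sketch below is a minimal check under assumptions: the integrals are written in terms of the lag $s = t - t'$ for $t \to \infty$, a hand-written trapezoidal rule is used, and all names and numerical values are illustrative.

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule on a (possibly non-uniform) grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def kinetic_coefficients(us2, T_L, tau_p, n=200001):
    """Stationary (D, lambda) from Eqs. (380) with the autocorrelation (381)."""
    s = np.linspace(0.0, 50.0 * T_L, n)       # lag s = t - t', truncated at 50 T_L
    R = us2 * np.exp(-s / T_L)                # Eq. (381)
    D = trapezoid(np.exp(-s / tau_p) * R, s) / tau_p**2
    lam = trapezoid((1.0 - np.exp(-s / tau_p)) * R, s) / tau_p
    return D, lam

# Generic case, compared with Eqs. (382a) and (382b)
us2, T_L, tau_p = 2.0, 1.0, 0.3
D_num, lam_num = kinetic_coefficients(us2, T_L, tau_p)
D_ref = us2 / (tau_p * (1.0 + tau_p / T_L))
lam_ref = us2 * T_L / (tau_p * (1.0 + tau_p / T_L))
print("D     :", D_num, "vs", D_ref)
print("lambda:", lam_num, "vs", lam_ref)

# White-noise limit, Eq. (383): T_L -> 0 with <Us^2> T_L = D_s held fixed
D_s, tau_w, T_small = 1.0, 1.0, 1.0e-3
D_lim, lam_lim = kinetic_coefficients(D_s / T_small, T_small, tau_w)
print("limit D     :", D_lim, "vs D_s / tau_p^2 =", D_s / tau_w**2)
print("limit lambda:", lam_lim, "(tends to 0)")
```

In the limit run the velocity coefficient stays finite while the position coefficient scales like $T_L^*$, which is why only the second-derivative term in velocity survives in Eq. (385).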
The Kinetic Equation thus has the form of a standard Fokker–Planck equation. From the equivalence between Fokker–Planck equations and Langevin equations, see Section 2.8, we can write the corresponding stochastic differential equations for the discrete particle variables

\[ dx_{p,i} = U_{p,i}\,dt, \tag{386a} \]
\[ dU_{p,i} = -\frac{U_{p,i}}{\tau_p}\,dt + \frac{\langle U_{s,i}\rangle}{\tau_p}\,dt + \frac{1}{\tau_p}\sqrt{2 D_s}\,dW_i. \tag{386b} \]
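Equations (386) can be integrated with a simple Euler–Maruyama scheme, as sketched below in 1D. For this linear equation the stationary velocity variance is $D_s/\tau_p$, which is also the white-noise limit of Tchen's relation (359), so the simulation provides a rough consistency check. The scheme, the function name, the time step, the number of particles and the seed are assumptions made for illustration; they are not the numerical method of the text.

```python
import numpy as np

def simulate_kinetic_limit(tau_p, D_s, mean_us=0.0, n_particles=20000,
                           n_steps=1000, dt=None, seed=0):
    """Euler-Maruyama integration of Eqs. (386) in 1D; returns final (x_p, U_p)."""
    rng = np.random.default_rng(seed)
    if dt is None:
        dt = 0.01 * tau_p                     # assumed time step, small against tau_p
    sigma = np.sqrt(2.0 * D_s) / tau_p        # diffusion coefficient of Eq. (386b)
    x = np.zeros(n_particles)
    U = np.full(n_particles, mean_us, dtype=float)
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), n_particles)   # Wiener increments
        x += U * dt                                       # Eq. (386a)
        U += -(U - mean_us) / tau_p * dt + sigma * dW     # Eq. (386b)
    return x, U

tau_p, D_s = 0.5, 2.0                          # illustrative values
x_end, U_end = simulate_kinetic_limit(tau_p, D_s)
var_sim = float(np.var(U_end))
var_ref = D_s / tau_p                          # stationary variance of the linear SDE

print("simulated <Up^2>:", var_sim)
print("expected  <Up^2>:", var_ref)
```

The agreement is statistical: it improves with more particles and a smaller time step, and the run must last several $\tau_p$ for the initial condition to be forgotten.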
This is fully consistent with the elimination of fast variables detailed in Section 4 that we could also have performed directly in the trajectory equations

\[ \begin{cases} dx_{p,i} = U_{p,i}\,dt \\[4pt] dU_{p,i} = -\dfrac{U_{p,i} - \langle U_{s,i}\rangle}{\tau_p}\,dt + \dfrac{u_{s,i}}{\tau_p}\,dt \end{cases} \;\xrightarrow[\;T_L^* \to 0\;]{}\; \begin{cases} dx_{p,i} = U_{p,i}\,dt \\[4pt] dU_{p,i} = -\dfrac{U_{p,i} - \langle U_{s,i}\rangle}{\tau_p}\,dt + \dfrac{1}{\tau_p}\sqrt{2 D_s}\,dW_i. \end{cases} \tag{387} \]

Thus, we have seen that by integration over the velocity of the fluid seen, we have retrieved from a Langevin model the closed form of the reduced pdf equation for $Z^r$ known as the Kinetic Equation. In other words, it can be said that the Langevin model contains the solution of the Kinetic Equation model as a marginal pdf. The correspondence was established by assuming that the complete process $Z$ could be regarded as a Gaussian process, so as to apply Gaussian integration by parts. This also appears as an implicit assumption in other derivations of the Kinetic Equation. For example, in [80], the derivation was performed using a cumulant expansion (limited to the first two cumulants). This is a correct procedure for Gaussian processes; otherwise, the existence of a small parameter is required in order to disregard the other terms of the expansion [80]. However, in non-homogeneous turbulence, where the velocity of the fluid seen is bound to deviate from Gaussianity, this may be too strong an approximation. In this general situation, the Langevin model, which can tackle deviations from Gaussianity since the velocity of the fluid seen is explicitly simulated, may have a better chance of giving correct results.

7.6. Numerical examples and typical simulations

7.6.1. General numerical issues

In the following, we describe two numerical applications of the Lagrangian stochastic models for discrete particle properties. The numerical code used to carry out the computations is a mixed or hybrid Eulerian/Lagrangian code.
Using a different terminology, it can also be said that the present hybrid approach is a mixed Moment/pdf approach or, in terms of numerics, a mixed Moment/Monte Carlo approach and even a mixed Particle/Mesh simulation. Indeed, the simulations of the continuous phase (typically a gas) and of the dispersed phase (typically solid particles) are performed with theoretical models and with numerical approaches which are of a completely different nature. On the one hand, the continuous phase is calculated using a classical moment approach (see Section 6.3), and is computed by solving on a mesh the corresponding partial differential equations. The continuous phase is therefore characterized by mean fields obtained at a number of fixed predetermined points (the mesh nodes). On the other hand, statistical properties of the dispersed phase are simulated by computing a large number of trajectories of the stochastic process $Z$ which describes particle variables (for example,
Fig. 30. Sketch of the coupled algorithm for one complete time step.
$Z = (x_p, U_p, U_s)$). The dispersed phase is thus represented by a large number of particles, or samples of the one-point pdf. At each time step, particle statistical properties are obtained by first locating the particles in the cells of the mesh and by computing ensemble averages from the set of particles present in each cell. The complete Eulerian/Lagrangian algorithm is sketched in Fig. 30. This figure represents one time step of the complete algorithm. During each time step, the fluid mean fields are first updated and the fluid mean fields which enter the particle stochastic equations are provided to the Lagrangian solver. These mean fields typically include the mean pressure $\langle P\rangle$ (or its gradients), the mean velocity $\langle U_f\rangle$, the Reynolds stress tensor $R_{ij}$, the mean dissipation rate $\langle \epsilon\rangle$ and additional fields if needed, depending on the application (for instance, the mean fluid temperature). Then, in the Lagrangian solver, the particle stochastic equations are integrated in time (over one time step), particles are located within the grid and, after application of boundary conditions, mean or statistical properties are evaluated by local ensemble averaging. These statistical properties extracted from particle variables include the particle mean velocity field $\langle U_p\rangle$, which enters the coefficients of the particle stochastic equation (through the expression of the time scale of the fluid velocity seen, for instance, Eqs. (314)–(315)), and the source terms accounting for the exchange of momentum and of kinetic energy, which are then fed back into the Eulerian solver as indicated in Fig. 30. On the numerical front, the situation is as open as on the theoretical front. Just as current stochastic models require improvements (see Sections 7.4.1 and 7.4.2), the numerical implementation of the previous ideas involves a number of issues for which further work would be needed.
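The particle-to-grid step described above, locating particles in cells and then forming local ensemble averages, can be sketched as follows for a 1D mesh. This is a minimal illustration assuming the simplest possible estimator (each particle contributes only to the cell that contains it); the function name, the mesh and the sample data are hypothetical and do not describe the code used for the computations reported here.

```python
import numpy as np

def cell_averages(x_p, u_p, edges):
    """Locate particles in the cells of a 1D mesh and average u_p per cell.

    edges : increasing array of cell boundaries (n_cells + 1 values).
    Returns (count, mean); mean is NaN in cells that contain no particle.
    """
    n_cells = len(edges) - 1
    # Index of the cell [edges[k], edges[k+1]) that contains each particle
    cell = np.searchsorted(edges, x_p, side="right") - 1
    inside = (cell >= 0) & (cell < n_cells)   # discard particles outside the mesh
    count = np.bincount(cell[inside], minlength=n_cells)
    total = np.bincount(cell[inside], weights=u_p[inside], minlength=n_cells)
    mean = np.full(n_cells, np.nan)
    filled = count > 0
    mean[filled] = total[filled] / count[filled]
    return count, mean

# Small illustrative data set: three cells, six particles (one outside the mesh)
edges = np.array([0.0, 1.0, 2.0, 3.0])
x_p = np.array([0.1, 0.9, 1.5, 2.2, 2.8, 3.5])
u_p = np.array([1.0, 3.0, 10.0, 4.0, 6.0, 100.0])

count, mean_up = cell_averages(x_p, u_p, edges)
print("particles per cell:", count)
print("mean velocity     :", mean_up)
```

The statistical error of such cell averages decreases only as the inverse square root of the number of particles per cell, which is one reason why the Lagrangian part of the algorithm dominates the cost.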
A comprehensive presentation of the various issues is outside the scope of the present work and we limit ourselves to mentioning some points. Specific information can be found in the classical book of Hockney and Eastwood [90], which however deals mostly with deterministic particle/mesh systems. Issues related to Monte Carlo particle/mesh methods may be found in
Fig. 31. Geometry of the Wall jet test case.
specialized articles [56,91]. Among other issues, two important numerical points are:

• One has to integrate in time the SDEs which form the particle model. This is often a tricky question since numerical schemes have to be consistent with the definition of the stochastic integral and with the objective of a weak approximation (see Section 2 and in particular Section 2.10).

• One must exchange information between the grid-based variables and the particle-based variables. The fluid mean fields calculated at grid nodes must be evaluated at particle locations (which are distributed continuously within the domain). This represents the first problem, of how to go from the grid to the particles. Then, mean variables related to the particles (such as $\langle U_p\rangle$) are evaluated by taking local ensemble averages. This represents the second problem, which is the reverse of the first one, of how to go from the particles to the grid. Generally speaking, these two problems cannot be treated independently as this may lead to inconsistencies [90,91].

7.6.2. Wall jet

The first case described is a turbulent wall jet loaded with solid particles. The geometry and the flow conditions are represented in Fig. 31. The flow is made up of a turbulent plane air jet of width $b$ which moves down a vertical plane wall and is mixed with a co-current flow. The plane jet is seeded with solid glass particles. Both single- and two-phase flows have been studied and measurements have been performed at a number of sections downstream of the injection, namely at $x/b = 1, 5, 10, 15, 20, 30, 40, 50$. Experimental data are published in [92], and this configuration was used as a test case for a workshop on two-phase flows organized in Merseburg (Germany) in 1996 (Tables 1 and 2). A standard k–ε model was used for the predictions of the gas-phase properties, and not a low-Reynolds k–ε model, which could have been more appropriate given the moderate jet Reynolds number. This was done for the sake of simplicity and also in order to test the standard version of the code. Therefore, present results do not claim to be the best possible ones with
J.-P. Minier, E. Peirano / Physics Reports 352 (2001) 1–214
Table 1
Characteristics of the case

Dimensions of the test section            150 × 100 mm²
Length                                    350
Jet width                                 5 mm
Maximum velocity in the jet               10 m/s
Velocity of the co-current stream         2 m/s
Jet Reynolds number                       3300

Table 2
Particle properties

Mean diameter d_p                             49.3 µm
Particle diameter standard deviation σ_p      4.85 µm
Particle density ρ_p                          2590 kg/m³
Particle mass loading                         0.1

Table 3
Computational performances

Eulerian solver      Time for 1000 nodes per time step        0.065 s
Lagrangian solver    Time for 1000 particles per time step    0.16 s
state-of-the-art turbulence models. Calculations were carried out with a Cartesian grid. Since the flow is two-dimensional, computations were performed with only 3 planes in the symmetry direction and the mesh was made up of 102 × 3 × 47 = 14 382 nodes. At the inlet, the profiles of particle volumetric fraction and of axial velocity are given (the particle mass flow rate is therefore known). Particles are then injected at each time step so as to respect this inlet mass flow rate. After a transient period, a stationary regime is reached in which the total number of particles within the domain remains constant (or fluctuates slightly around a constant value). When this stationary state was reached, between 14 100 and 14 200 particles were treated at each time step. The single-phase flow case was first calculated using 1000 iterations with a time step of Δt = 1.E−4 s. Then the coupled two-phase flow was simulated. Computations were performed on a Silicon OCTANE R10 000 workstation and the necessary CPU times for the computations of the gas phase (with the Eulerian solver) and of the particle phase (with the Lagrangian solver) are given in Table 3. Both CPU times have been calculated for the same number of computational 'elements' treated (either mesh points or particles). The computational requirements of the particle solver are higher than those of the Eulerian solver. Once again, this is not totally surprising since the Eulerian solver computes only a small number of moments whereas the particle solver computes the one-point pdf. Furthermore,
Fig. 32. Profiles of vertical mean fluid velocities. The continuous line is for the single-phase flow case and is to be compared with experimental data (•). The dotted line represents results for the two-phase case and is to be compared with experimental data (×).
computational requirements for the gas phase are, for the present calculation, rather low since a k–ε model was used. Yet, it was shown in previous sections that present Lagrangian models have a natural correspondence not with eddy-viscosity type models but rather with full second-order or Reynolds stress models (see Section 6.7 and even Section 8.5). In other words, even in terms of moment equations (which are only a subset of the information calculated by the Lagrangian solver), the models used in the Eulerian and Lagrangian solvers are not equivalent. Consequently, it is actually difficult to draw definitive conclusions from present computational times. The next case, where a second-order model is used for the gas phase, will provide figures that are more relevant for a direct comparison. A number of results are shown in the following in order to illustrate how present models perform for such a case. Statistical results are presented for the last four sections, at x/b = 20, 30, 40, 50. Fig. 32 presents results for the vertical mean fluid velocity. In the two-phase case, particles which are heavier than the fluid and move at a higher velocity tend to increase fluid velocities in the core of the jet and to change the slope of the mean fluid velocity profile. This trend is well reproduced by the model although the peak of the mean fluid velocity appears to be slightly overpredicted in the last two sections. The particle volumetric fraction $\alpha_p$ is a very sensitive variable in this test case and the form of the profiles as well as the position
Fig. 33. Profiles of particle volumetric fraction (×1.E+4).
of the peak value are sensitive to the choice of the model and to the numerics. Therefore, the comparison shown in Fig. 33 displays satisfactory agreement between numerical predictions and experimental values. Further comparisons for particle mean velocity and also for particle fluctuating velocities are presented in the next two figures (Figs. 34 and 35).

7.6.3. Recirculating bluff-body flow

In the previous case, the flow of the fluid and of the particles was co-current. A more difficult situation, and therefore a much more stringent test for computational models, is met with the present case of a recirculating bluff-body flow. The sketch of the geometry and of flow conditions is given in Fig. 36. This experimental setup is characteristic of pulverized coal combustion furnaces where primary air and coal are injected in the centre and secondary air is introduced on the periphery. This is a typical bluff-body flow where the gas (air at ambient temperature, T = 293 K) is injected in the outer region with a velocity high enough to create a recirculation zone downstream of the injection (two honeycombs were used in the experiment in order to stabilize the flow so that no swirl was present). Solid particles (glass particles of density ρ_p = 2470 kg/m³) are then injected from the inner cylinder with a given mass flow rate and from there interact with the gas turbulence. This is a coupled turbulent two-phase flow since the particle mass loading at
Fig. 34. Profiles of particle mean vertical velocities.
the inlet is high enough (22%) for the particles to modify the fluid mean velocities and kinetic energy. This is also a polydispersed flow where particle diameters vary according to a known distribution at the inlet, typically between d_p = 20 µm and d_p = 110 µm around an average of d_p ∼ 60 µm. Experimental data are available for radial profiles (the flow is stationary and axi-symmetric) of different statistical quantities at five axial distances downstream of the injection (x = 0.08, 0.16, 0.24, 0.32 and 0.40 m). These quantities include the mean axial and radial velocities as well as the fluctuating radial and axial velocities for both the fluid and the particle phase. Axial profiles along the axis of symmetry for these quantities have also been measured. All the data were gathered using PDA measurement techniques. Further details on the experimental setup and the measurement techniques can be found in [93]. This is a very interesting test case for two-phase flow modelling and numerical simulations since most of the different aspects of two-phase flows are present. The particles are dispersed by the turbulent flow but in return modify it. Furthermore, the existence of a recirculation zone where particles interact with negative axial fluid velocities constitutes a much more stringent test than cases where the fluid and the particle mean velocities are of the same sign (the problem is then mostly confined to radial dispersion issues).
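To give a feel for what polydispersity means in this case, the following sketch evaluates a particle relaxation time across the quoted diameter range. It is only an order-of-magnitude illustration under assumptions that are not taken from the text: the Stokes drag law $\tau_p = \rho_p d_p^2 / (18 \mu_f)$ (valid only for small particle Reynolds numbers), an air viscosity of about 1.8e-5 Pa s near 293 K, and a hypothetical helper name `stokes_relaxation_time`.

```python
RHO_P = 2470.0    # particle density quoted in the text, kg/m^3
MU_AIR = 1.8e-5   # ASSUMED dynamic viscosity of air near 293 K, Pa.s


def stokes_relaxation_time(d_p, rho_p=RHO_P, mu_f=MU_AIR):
    """Stokes-regime relaxation time (s) of a sphere of diameter d_p (m).

    This is an assumed drag law, used here only for illustration.
    """
    return rho_p * d_p**2 / (18.0 * mu_f)


# Smallest, average and largest diameters quoted for the inlet distribution.
diameters_um = [20.0, 60.0, 110.0]
taus = [stokes_relaxation_time(d * 1e-6) for d in diameters_um]

for d, tau in zip(diameters_um, taus):
    print(f"d_p = {d:5.1f} um -> tau_p = {tau * 1e3:6.2f} ms")
```

Because the relaxation time scales with the square of the diameter, the largest particles respond about thirty times more slowly than the smallest ones under this assumption, which is one reason why a method that carries the diameter of each particle without approximation is attractive here.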
Fig. 35. Profiles of particle fluctuating vertical velocities.
Computations were performed using a curvilinear mesh well suited for axi-symmetrical flows and made up of 74 × 3 × 142 = 31 524 nodes. Two turbulence models, a k–ε and an R_ij–ε model, have been used. For the single-phase flow case, the second-order model performed much better than the simple k–ε model for the prediction of the recirculation zone and was retained. In terms of modelling consistency with Lagrangian stochastic models, this is actually more satisfying since the Eulerian model which naturally corresponds to a Lagrangian stochastic equation is indeed a second-order model (see Section 6.7). A variable time step calculation was performed for the single-phase flow computation. Then particles were injected and the two-phase flow was calculated with a fixed time step, Δt = 1.E−3 s. For the two-phase flow case, it must be noted that each component of the Reynolds stress tensor R_ij is modified by source terms which account for the exchange of energy between the fluid and the particles. Together with the source terms applied in the mean fluid momentum equations, this means that at each time step, nine source terms are calculated from the particle data and are sent back to the Eulerian solver. The stationary regime is then reached as the limit of the unsteady regime. When the stationary regime is reached, around 14 000 particles are treated simultaneously at every time step. Calculations were performed with an HP 785/C3000 workstation. About 1000 time steps were simulated for the single-phase flow problem. Then, 2000 time steps were calculated for the two-phase flow situation. Around 400–500 time steps are needed to reach the stationary
Fig. 36. Geometry of the bluff-body case. The mean streamlines are shown for the fluid (solid lines) and the particles (dashed lines). Two stagnation points in the fluid flow can be observed (S1 and S2). Experimental data are available for radial profiles of different statistical quantities at five axial distances downstream of the injection (x = 0.08, 0.16, 0.24, 0.32, 0.40 m) (experimental data are also available on the symmetry axis).

Table 4
Computational performances

Eulerian solver      Time for 1000 nodes per time step        0.20 s
Lagrangian solver    Time for 1000 particles per time step    0.17 s
regime for the two-phase flow situation. Statistics extracted from the particle data set are then averaged in time (over about 1000 time steps), which ensures that statistical noise is reduced to a negligible level. The CPU times for the calculation of the gas properties with the Eulerian solver and of the particle properties with the Lagrangian solver are given in Table 4. For the same number
Fig. 37. Profiles of vertical mean fluid velocities. The continuous line is for the single-phase flow case and is to be compared with experimental data (•). The dotted line represents results for the two-phase flow case and is to be compared with experimental data ( ).
of computational elements (either mesh points or particles), the Lagrangian solver now appears even slightly faster than the Eulerian one. Compared to the previous case, the computational requirements of the Eulerian solver have increased due to the use of a full second-order turbulence model, which implies the numerical solution of six coupled partial differential equations for the fluctuating velocities (added to the three equations for mean momentum) compared to only two for eddy-viscosity models. In comparison, the Lagrangian solver requires about the same time since the same model was used. This is a numerical illustration of the fact that Lagrangian stochastic models become increasingly competitive, even with classical grid-based approaches, as models become more complicated (not to mention the different level of information contained in the different approaches). This case represents a recent numerical application. Various results can be extracted and compared with experimental data. Since the purpose of this section is to illustrate how present stochastic models work for a practical case rather than to provide a comprehensive validation analysis, we limit ourselves to a subset of numerical outcomes (complete results should be presented and submitted soon). Fig. 37 presents the numerical predictions of the axial profile of the mean axial fluid velocity, both in the single- and in the two-phase flow cases. It is seen that the present coupled calculation predicts the correct trend for the mean fluid velocity from the single- to the two-phase flow situation where particles influence the fluid, though the comparison with experimental data is slightly worse than for the single-phase flow situation (the increase of the fluid velocity downstream of the recirculation zone is underpredicted). The next two figures, Figs. 38 and 39, display detailed comparisons (at the six sections where experimental measurements are available) of particle mean vertical velocities and particle vertical fluctuating velocities (simply obtained as the standard deviation of the distribution of the particle vertical velocity at each point). We can use the present numerical simulation to bring out the kind of information that is available. As already indicated, in a number of engineering applications, one is usually interested in having far more information than simply one or two moments. A typical example is the
Fig. 38. Profiles of particle mean vertical velocities.
particle residence time within a domain or within a certain marked zone. In many situations, one would like to know, at a given point or in a given region, the distribution of the residence time at a certain time. For example, one would like to know how much time particles found in a certain volume have actually spent in that volume, or even how many of the particles present have previously entered another specific zone. One of the advantages of present stochastic models is that they provide such information. This is illustrated in Fig. 40, which presents a plot of the instantaneous locations of the particles that are simulated at that time step. In this plot, particles are coloured by their residence time, which reveals the recirculation. The denser plot indicates the accumulation of particles in the recirculation zone. In the same figure, two distributions of
Fig. 39. Profiles of particle fluctuating vertical velocities.
particle residence time are extracted and shown at two locations. In a cell near the inlet, the distribution is highly peaked: most of the particles present in that cell have just been injected and their residence time is small, contributing to the near delta-value close to the origin of time. A smaller number of particles are found with larger residence times: these are particles which have recirculated and have gone back to the selected cell following different trajectories and thus having different residence times. A second distribution is also shown in Fig. 40, for a location near the outlet of the domain. In that case, we do not find different subclasses but rather a continuous spread of the particle residence time distribution. Particles have been well mixed since the injection and the distribution is smooth. The subdivision of particles into
Fig. 40. A snapshot of particle locations at one time step of the calculation where particles are coloured by their residence time within the domain. Two distributions of particle residence time are plotted at two different positions, near the injection and near the outlet.
Fig. 41. Plot of the instantaneous axial velocity of a number of particles found in a cell close to the injection. The particles are coloured by their residence time within the domain.
different subsets, in a cell close to the inlet, is further illustrated in Fig. 41. In this figure, the instantaneous particle axial velocities are plotted for the particles found, at a given time, in a certain cell. The values are coloured as a function of the particle residence time. It is clearly seen that two different classes co-exist within the cell. The first subclass is formed by the particles which have just been injected and whose residence time is small (coloured in blue). These particles have an axial velocity close to 4 m/s (the inlet value) with very little dispersion around this value. In other words, the particle kinetic energy for this subclass is small. A second subclass is formed by particles with a higher residence time and which have recirculated. Their axial velocity is smaller but the dispersion within that subclass is higher. Therefore, the global fluctuating velocity $u'_p = \sqrt{\langle (u_p)^2 \rangle}$ calculated on the whole set of particles present can be large, but that single number cannot represent the underlying physics of the problem, which is here better described in terms of various subclasses. This kind of analysis is easily accessible with present Lagrangian stochastic models.

8. Two-point fluid–particle pdf models in dispersed two-phase flows

The mathematical tool 'diffusion process' has now been used extensively in the two preceding sections. It was shown how these processes can be applied to model a continuous field (turbulent single-phase flows) and a discrete case (discrete particles carried by a continuous field, a fluid, where the description of the fluid is external to the probabilistic formalism). In the present section, we merge the one-point fluid pdf and the one-point particle pdf notions. The formalism presented in Section 6 is generalized and the models developed in Sections 6 and 7 are extended. In other words, a complete trajectory description of dispersed two-phase flows is proposed.
With respect to that context, the objectives of the present section are (i) to clarify what the present notion of a two-point fluid–particle pdf description is, its relation with macroscopic approaches and its interest,
(ii) to detail the formalism (especially the notions of Lagrangian and Eulerian pdfs and the associated marginals), and study the properties of the new approach. In contrast to the previous sections (6 and 7), the physical content of the approach is not one of the main interests of the present section and we will not dwell on such considerations. Instead, emphasis is put on the general formalism; in other words, it will be demonstrated how the dispersed two-phase flow issue can be addressed in a general probabilistic framework where the mean field equations are the end of the road.

8.1. Motivations and basic ideas

As explained above and in Section 7, one-point particle pdf models in dispersed two-phase flows (such as the Langevin equation model [80] or the Kinetic Equation [82,83]) are hybrid methods where the characteristics of the continuous (fluid) phase are external to the description of the statistical system (the discrete particles) and must be determined by another route, usually classical Reynolds stress modelling. It was shown in that section that pdf models, or particle stochastic models, can handle convection as well as any distribution of particle properties (such as particle diameter) without approximation. They are therefore attractive models for polydispersed particle flows and when complex phenomena affecting particles (for example, droplet evaporation, heterogeneous particle combustion, etc.) have to be included. Similarly, a classical moment approach (for instance, Reynolds stress models) for the continuous phase may be too limiting when chemical reactions take place and/or other effects (intermittency, ...) have to be taken into account. This was detailed in Sections 6.3 and 6.5.3. Consequently, it appears interesting to describe both phases (the fluid and the particles) with stochastic models or using only probabilistic arguments.
In order to do so, it seems logical to introduce a fluid–particle pdf and to discuss the properties of both phases from the same point of view.

8.2. Probabilistic description of dispersed two-phase flows

Here, the motivation for the introduction of the notion of the Eulerian and Lagrangian points of view in the frame of the probabilistic approach is not recalled (this was done in Section 6). It is, however, recalled that, in the continuous phase, both points of view are possible since the problem which is addressed is the probabilistic description of a field. For the discrete phase, the natural choice is the Lagrangian point of view. In other words, it will be seen that, as in Section 6, at the microscopic level the Lagrangian point of view is natural (both points of view can be adopted for the continuous phase) whereas at the macroscopic level all quantities are Eulerian ones. Indeed, the end of the road is the derivation of field equations for the local moments (expected values) of both phases.

8.2.1. Dimension of the state vector

As done in Section 6, we slightly anticipate the next subsections and we directly give an expression for the two-particle state vector (one fluid particle and one discrete particle). In the case of turbulent, reactive, compressible, dispersed two-phase flows, an appropriate state
vector is

$$\mathbf{Z} = (\mathbf{x}_f, \mathbf{U}_f, \boldsymbol{\phi}_f, \mathbf{x}_p, \mathbf{U}_p, \boldsymbol{\phi}_p), \tag{388}$$

where $\boldsymbol{\phi}_p$ has a given dimension (as was done previously, we distinguish between physical space and sample space, the sample-space variables being $(\mathbf{y}_f, \mathbf{V}_f, \boldsymbol{\psi}_f, \mathbf{y}_p, \mathbf{V}_p, \boldsymbol{\psi}_p)$). It will be seen that $\boldsymbol{\phi}_p$ can consist of the fluid velocity seen and several scalars relevant to the discrete particles, for example diameter, enthalpy, mass fractions and so on. Here, we try to stay as general as possible and only position and velocity are included explicitly in the state vector. Once again, it is indeed necessary to introduce two independent variables for the positions of the fluid and the discrete particles since the two kinds of particles are not convected by the same velocities. We are therefore considering a two-particle pdf picture (in a Lagrangian sense) of the whole system composed of the fluid and of the particles, or a two-point pdf picture (in an Eulerian sense). This is however a different notion from 'classical' two-point, or two-particle, descriptions of the same phase (two fluid particles or two discrete particles) described in Section 3.2. The two-point description followed here is a mixed notion, since we are considering one particle in each phase. The intermediate status of the present description is discussed again in Sections 8.2.3 and 8.4.

8.2.2. Eulerian and Lagrangian descriptions

As explained earlier in Section 6, there are two possible points of view for the description of a fluid, or more precisely in this case a fluid–particle mixture: the Lagrangian one, where one is interested, at a fixed time, in the probability to find two particles (a fluid particle and a discrete particle) in a given state, and the Eulerian description (field approach), where one seeks the probability to find, at a given time and at two fixed points in space (a 'fluid point', $\mathbf{x}_f$, and a 'discrete-particle point', $\mathbf{x}_p$), the fluid–particle mixture in a given state. In the case of the Lagrangian description, we define the following pdf:

$$p^L_{fp}(t; \mathbf{y}_f, \mathbf{V}_f, \boldsymbol{\psi}_f, \mathbf{y}_p, \mathbf{V}_p, \boldsymbol{\psi}_p), \tag{389}$$

where (the subscripts f and p for the variables are sometimes dropped for the sake of simplicity as long as the notation does not become ambiguous) the probability to find a pair of particles (a fluid particle and a discrete particle) at time t, whose positions are in the range $[\mathbf{y}, \mathbf{y} + d\mathbf{y}]$, whose velocities are in the range $[\mathbf{V}, \mathbf{V} + d\mathbf{V}]$ and whose associated quantities (scalars and other variables) are in the range $[\boldsymbol{\psi}, \boldsymbol{\psi} + d\boldsymbol{\psi}]$, is

$$p^L_{fp}(t; \mathbf{y}_f, \mathbf{V}_f, \boldsymbol{\psi}_f, \mathbf{y}_p, \mathbf{V}_p, \boldsymbol{\psi}_p)\, d\mathbf{y}_f\, d\mathbf{V}_f\, d\boldsymbol{\psi}_f\, d\mathbf{y}_p\, d\mathbf{V}_p\, d\boldsymbol{\psi}_p. \tag{390}$$

Normalization is given by

$$\int_{\Omega_L} p^L_{fp}(t; \mathbf{y}_f, \mathbf{V}_f, \boldsymbol{\psi}_f, \mathbf{y}_p, \mathbf{V}_p, \boldsymbol{\psi}_p)\, d\mathbf{y}_f\, d\mathbf{V}_f\, d\boldsymbol{\psi}_f\, d\mathbf{y}_p\, d\mathbf{V}_p\, d\boldsymbol{\psi}_p = 1, \tag{391}$$

where $\Omega_L$ represents the obtainable values in sample space: in the velocity spaces, it is ±∞ whereas in the position spaces it is given by the boundary conditions. In the scalar spaces, the obtainable values are essentially defined by realizability conditions.

For the field description (Eulerian point of view), the following distribution function (it is not a pdf) is introduced:

$$p^E_{fp}(t; \mathbf{x}_f, \mathbf{x}_p, \mathbf{V}_f, \boldsymbol{\psi}_f, \mathbf{V}_p, \boldsymbol{\psi}_p), \tag{392}$$

where the probability to find, at time t and at positions $\mathbf{x}_f$ and $\mathbf{x}_p$, the system in a given state in the range $[\mathbf{V}, \mathbf{V} + d\mathbf{V}]$ and $[\boldsymbol{\psi}, \boldsymbol{\psi} + d\boldsymbol{\psi}]$ is

$$p^E_{fp}(t; \mathbf{x}_f, \mathbf{x}_p, \mathbf{V}_f, \boldsymbol{\psi}_f, \mathbf{V}_p, \boldsymbol{\psi}_p)\, d\mathbf{V}_f\, d\boldsymbol{\psi}_f\, d\mathbf{V}_p\, d\boldsymbol{\psi}_p. \tag{393}$$

As far as normalization is concerned, one can state that ($\Omega_E$ represents the obtainable values in sample space, with $\Omega_E \subset \Omega_L$)

$$\int_{\Omega_E} p^E_{fp}(t; \mathbf{x}_f, \mathbf{x}_p, \mathbf{V}_f, \boldsymbol{\psi}_f, \mathbf{V}_p, \boldsymbol{\psi}_p)\, d\mathbf{V}_f\, d\boldsymbol{\psi}_f\, d\mathbf{V}_p\, d\boldsymbol{\psi}_p \le 1, \tag{394}$$
since the physical situation is a fluid–particle mixture where one cannot always find with probability one, at a given time and at two different locations, a fluid and a discrete particle in any state. An expression for the normalization factor will be proposed later in Section 8.2.3 but, here, the present form is kept in order to be consistent with the choices which will be made in the following subsections.

8.2.3. Consistency relation and normalization constraint

In the formalism presented so far, no geometrical information on the relative positions of the fluid particle and the discrete particle has been given. To illustrate that matter, let us go back to Section 6. In this one-point description, one can state that two stochastic (fluid) particles can be located at the same position at the same time since they represent two different realizations of the flow (and consequently the particles are likely to exhibit different velocities and associated scalars). Now, if one goes to a two-point fluid pdf approach, the modelled equations of the trajectories for the pair of (stochastic) particles could be twice Eqs. (236a) and (236b). However, if these equations were retained in their present form, a piece of information would be missing, namely the fact that these two stochastic particles could not be located, at time t, at the same position in space. This spatial information (the relative position between the two fluid particles) has to be introduced in the stochastic differential equations (the form of this short-range interaction and its consequences in the pdf equation will not be discussed here). In the present two-point fluid–particle approach, the problem is slightly different since we are not really dealing with a two-point approach (as in the spirit of single-phase flows) but instead with a two-point approach with one point for each phase.
However, the same constraint is present, that is, for a pair composed of a fluid and a discrete particle, the two particles cannot be located at the same position in physical space at a given time t. For the Lagrangian pdf, when $\mathbf{y}_f = \mathbf{y}_p = \mathbf{y}$, the argument developed above implies that

$$p^L_{fp}(t; \mathbf{y}, \mathbf{V}_f, \boldsymbol{\psi}_f, \mathbf{y}, \mathbf{V}_p, \boldsymbol{\psi}_p) = 0, \tag{395}$$

and consequently, in terms of the Eulerian distribution function (for $\mathbf{x}_f = \mathbf{x}_p = \mathbf{x}$),

$$p^E_{fp}(t; \mathbf{x}, \mathbf{x}, \mathbf{V}_f, \boldsymbol{\psi}_f, \mathbf{V}_p, \boldsymbol{\psi}_p) = 0. \tag{396}$$
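The step from Eq. (395) to Eq. (396) can be made explicit with a short worked relation. This is offered only as a sketch: it assumes that the Eulerian distribution function is obtained from the Lagrangian pdf by fixing the two positions, by analogy with what Eq. (410) states for the mass density functions.

```latex
\begin{align*}
p^E_{fp}(t; \mathbf{x}_f, \mathbf{x}_p, \mathbf{V}_f, \boldsymbol{\psi}_f, \mathbf{V}_p, \boldsymbol{\psi}_p)
  &= p^L_{fp}(t; \mathbf{y}_f = \mathbf{x}_f, \mathbf{V}_f, \boldsymbol{\psi}_f,
                 \mathbf{y}_p = \mathbf{x}_p, \mathbf{V}_p, \boldsymbol{\psi}_p)
  && \text{(assumed, by analogy with Eq. (410))} \\
p^E_{fp}(t; \mathbf{x}, \mathbf{x}, \mathbf{V}_f, \boldsymbol{\psi}_f, \mathbf{V}_p, \boldsymbol{\psi}_p)
  &= p^L_{fp}(t; \mathbf{y}, \mathbf{V}_f, \boldsymbol{\psi}_f,
                 \mathbf{y}, \mathbf{V}_p, \boldsymbol{\psi}_p)\Big|_{\mathbf{y} = \mathbf{x}} = 0
  && \text{(by Eq. (395))}
\end{align*}
```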
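Eq. (395) is an additional constraint which should be present in the pdf equation (the Fokker–Planck equation verified by $p^L_{fp}$) and also in the trajectories of a pair of particles (a fluid particle and a discrete particle) as a short-range interaction. A direct consequence of Eq. (395), and consequently of Eq. (396), is that, at a given point $\mathbf{x}$ in physical space and a given time t, the sum of the probabilities to find a fluid particle or a discrete particle in any state is one. This can be expressed in terms of the marginals of the Eulerian distribution function as

$$\int p^E_f(t; \mathbf{x}, \mathbf{V}_f, \boldsymbol{\psi}_f)\, d\mathbf{V}_f\, d\boldsymbol{\psi}_f + \int p^E_p(t; \mathbf{x}, \mathbf{V}_p, \boldsymbol{\psi}_p)\, d\mathbf{V}_p\, d\boldsymbol{\psi}_p = 1, \tag{397}$$

where

$$p^E_f(t; \mathbf{x}_f, \mathbf{V}_f, \boldsymbol{\psi}_f) = \int p^E_{fp}(t; \mathbf{x}_f, \mathbf{x}_p, \mathbf{V}_f, \boldsymbol{\psi}_f, \mathbf{V}_p, \boldsymbol{\psi}_p)\, d\mathbf{x}_p\, d\mathbf{V}_p\, d\boldsymbol{\psi}_p,$$

$$p^E_p(t; \mathbf{x}_p, \mathbf{V}_p, \boldsymbol{\psi}_p) = \int p^E_{fp}(t; \mathbf{x}_f, \mathbf{x}_p, \mathbf{V}_f, \boldsymbol{\psi}_f, \mathbf{V}_p, \boldsymbol{\psi}_p)\, d\mathbf{x}_f\, d\mathbf{V}_f\, d\boldsymbol{\psi}_f. \tag{398}$$

Eq. (397) can also be re-written by introducing the normalization factors of $p^E_f$ and $p^E_p$, namely $\alpha_f(t, \mathbf{x})$ and $\alpha_p(t, \mathbf{x})$, respectively, to yield

$$\alpha_f(t, \mathbf{x}) + \alpha_p(t, \mathbf{x}) = 1. \tag{399}$$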
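In a particle simulation, the two normalization factors of Eq. (399) are typically estimated cell by cell. The sketch below is a minimal illustration and not the scheme used by the authors: it assumes a one-dimensional row of cells, spherical particles, and a nearest-cell assignment of each particle. The function name `volume_fractions` and all numerical values are hypothetical. Note that the code imposes Eq. (399) to obtain the fluid fraction; it does not verify it.

```python
import numpy as np


def volume_fractions(x_p, d_p, edges, cross_section):
    """Estimate (alpha_f, alpha_p) in each cell of a 1D row of cells.

    x_p: particle positions (m); d_p: particle diameters (m);
    edges: cell boundaries (m); cross_section: cell cross-section (m^2).
    """
    n_cells = len(edges) - 1
    cell = np.digitize(x_p, edges) - 1            # cell index of each particle
    inside = (cell >= 0) & (cell < n_cells)       # ignore particles outside the grid
    v_particle = np.pi * d_p**3 / 6.0             # sphere volume (assumed shape)
    v_solid = np.bincount(cell[inside], weights=v_particle[inside],
                          minlength=n_cells)      # particle volume per cell
    v_cell = np.diff(edges) * cross_section
    alpha_p = v_solid / v_cell                    # local average: particles -> grid
    alpha_f = 1.0 - alpha_p                       # imposed through Eq. (399)
    return alpha_f, alpha_p


rng = np.random.default_rng(0)
n_particles = 5000
x_p = rng.uniform(0.0, 0.4, size=n_particles)     # hypothetical positions, m
d_p = np.full(n_particles, 60e-6)                 # hypothetical diameter, m
edges = np.linspace(0.0, 0.4, 6)                  # five hypothetical cells
alpha_f, alpha_p = volume_fractions(x_p, d_p, edges, cross_section=1e-4)
print(alpha_p)
```

The nearest-cell assignment is the simplest possible particle-to-grid projection; smoother weighting functions are possible, and the choice should be made consistently with the grid-to-particle interpolation.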
$\alpha_f(t, \mathbf{x})$ represents the probability to find the fluid phase, at time t and position $\mathbf{x}$, in any state ($0 \le \alpha_f(t, \mathbf{x}) \le 1$). This probability is not always one, as it is in single-phase flows where the physical space is continuously filled by the fluid. In a fluid–particle mixture, at $(t, \mathbf{x})$ there might be some fluid or a discrete particle. Similarly, the probability to find the discrete phase at time t and position $\mathbf{x}$ in any state is $\alpha_p(t, \mathbf{x})$ ($0 \le \alpha_p(t, \mathbf{x}) \le 1$).

At the beginning of the section, it was seen that $p^E_{fp}$ is not a pdf but rather a distribution function (as a matter of fact, it represents a field of distribution functions). It was also postulated that the normalization factor of $p^E_{fp}$ is always less than or equal to one. This can be clarified in the particular case where the fluid particles and the discrete particles represent independent events, i.e. $p^E_{fp} = p^E_f\, p^E_p$ (strictly speaking, this is not always possible since they cannot be located, for a given time, at the same point in physical space). Under this assumption, the normalization factor of $p^E_{fp}$ becomes

$$\int_{\Omega_E} p^E_{fp}(t; \mathbf{x}_f, \mathbf{x}_p, \mathbf{V}_f, \boldsymbol{\psi}_f, \mathbf{V}_p, \boldsymbol{\psi}_p)\, d\mathbf{V}_f\, d\boldsymbol{\psi}_f\, d\mathbf{V}_p\, d\boldsymbol{\psi}_p = \alpha_f(t, \mathbf{x}_f)\, \alpha_p(t, \mathbf{x}_p), \tag{400}$$

which is consistent with Eq. (394). We shall see now (as in Section 6) that the Eulerian quantity which contains all information is a two-point mass density function, which is in line with the consistency and normalization arguments which have just been addressed.

8.2.4. Marginal pdfs and mass density functions

Two marginal pdfs have a clear meaning and correspond to known proposals. The first one is obtained by integrating over all characteristics of the discrete particles and is the pdf related
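to the fluid characteristics (it is nothing but the pdf which was used in the one-point fluid pdf approach in Section 6),

$$p^L_f(t; \mathbf{y}_f, \mathbf{V}_f, \boldsymbol{\psi}_f) = \int p^L_{fp}(t; \mathbf{y}_f, \mathbf{V}_f, \boldsymbol{\psi}_f, \mathbf{y}_p, \mathbf{V}_p, \boldsymbol{\psi}_p)\, d\mathbf{y}_p\, d\mathbf{V}_p\, d\boldsymbol{\psi}_p. \tag{401}$$

The second marginal pdf is obtained by integrating over all characteristics of the fluid particles and is the pdf related to the characteristics of the discrete particles (it is therefore the pdf which was used for the one-point particle pdf approach in Section 7),

$$p^L_p(t; \mathbf{y}_p, \mathbf{V}_p, \boldsymbol{\psi}_p) = \int p^L_{fp}(t; \mathbf{y}_f, \mathbf{V}_f, \boldsymbol{\psi}_f, \mathbf{y}_p, \mathbf{V}_p, \boldsymbol{\psi}_p)\, d\mathbf{y}_f\, d\mathbf{V}_f\, d\boldsymbol{\psi}_f. \tag{402}$$

As was explained in Section 6, for a complete description of the fluid particles, a mass density function $F^L_f(t; \mathbf{y}_f, \mathbf{V}_f, \boldsymbol{\psi}_f)$ is introduced where

$$F^L_f(t; \mathbf{y}_f, \mathbf{V}_f, \boldsymbol{\psi}_f)\, d\mathbf{y}_f\, d\mathbf{V}_f\, d\boldsymbol{\psi}_f \tag{403}$$

is the probable mass of fluid particles in an element of volume $d\mathbf{y}_f\, d\mathbf{V}_f\, d\boldsymbol{\psi}_f$. As for the fluid particles, the discrete particles might have different densities and sizes (in the case for example of a spray where there is a size distribution or in the case of a fluidized bed where there is a distribution in density and diameter) and a similar mass density function $F^L_p(t; \mathbf{y}_p, \mathbf{V}_p, \boldsymbol{\psi}_p)$ is defined where

$$F^L_p(t; \mathbf{y}_p, \mathbf{V}_p, \boldsymbol{\psi}_p)\, d\mathbf{y}_p\, d\mathbf{V}_p\, d\boldsymbol{\psi}_p \tag{404}$$

is the probable mass of discrete particles in an element of volume $d\mathbf{y}_p\, d\mathbf{V}_p\, d\boldsymbol{\psi}_p$. Both mass density functions are consequently normalized by the total mass of the respective phases, $M_f$ for the continuous phase and $M_p$ for the discrete phase ($M_f$ and $M_p$ are constant in time for the sake of simplicity),

$$M_f = \int F^L_f(t; \mathbf{y}_f, \mathbf{V}_f, \boldsymbol{\psi}_f)\, d\mathbf{y}_f\, d\mathbf{V}_f\, d\boldsymbol{\psi}_f, \qquad M_p = \int F^L_p(t; \mathbf{y}_p, \mathbf{V}_p, \boldsymbol{\psi}_p)\, d\mathbf{y}_p\, d\mathbf{V}_p\, d\boldsymbol{\psi}_p, \tag{405}$$

where the mass density functions are given by

$$F^L_f(t; \mathbf{y}_f, \mathbf{V}_f, \boldsymbol{\psi}_f) = M_f\, p^L_f(t; \mathbf{y}_f, \mathbf{V}_f, \boldsymbol{\psi}_f), \qquad F^L_p(t; \mathbf{y}_p, \mathbf{V}_p, \boldsymbol{\psi}_p) = M_p\, p^L_p(t; \mathbf{y}_p, \mathbf{V}_p, \boldsymbol{\psi}_p). \tag{406}$$

The total masses are of course defined by $M_f = \int_{\mathcal{V}_f} \rho_f(\mathbf{x}_f)\, d\mathbf{x}_f$ and $M_p = \sum_{i=1}^{N_p} m_{p,i}$, where $N_p$ is the total number of discrete particles, $m_{p,i}$ the mass of the discrete particle i, and where the integration which gives $M_f$ is performed over the domain occupied by the continuous fluid phase. If, for example, all discrete particles are identical in mass and size ($M_p = N_p m_p$), then the mass density function is equivalent to the number density function $f_1$ which is usual in kinetic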
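theory, see Section 3, that is (the scalars are dropped in analogy with kinetic theory)

$$f_1(t; \mathbf{y}_p, \mathbf{V}_p) = N_p\, p^L_p(t; \mathbf{y}_p, \mathbf{V}_p) \;\Rightarrow\; m_p\, f_1(t; \mathbf{y}_p, \mathbf{V}_p) = F^L_p(t; \mathbf{y}_p, \mathbf{V}_p). \tag{407}$$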
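The equivalence stated in Eq. (407) can be checked numerically on a sample of identical particles. The sketch below is illustrative only: the particle mass, the uniform positions and the Gaussian velocities are hypothetical, positions and velocities are taken as one-dimensional, and the pdf $p^L_p$ is replaced by a simple histogram estimate.

```python
import numpy as np

rng = np.random.default_rng(1)
N_p = 10_000                        # number of discrete particles (hypothetical)
m_p = 1.5e-10                       # mass of one particle, kg (hypothetical)
y_p = rng.uniform(0.0, 1.0, N_p)    # 1D positions, m (hypothetical)
V_p = rng.normal(4.0, 0.5, N_p)     # 1D velocities, m/s (hypothetical)

# Histogram estimate of the Lagrangian pdf p_p^L(t; y_p, V_p) at one instant.
pdf, y_edges, v_edges = np.histogram2d(y_p, V_p, bins=(10, 20), density=True)
bin_area = np.outer(np.diff(y_edges), np.diff(v_edges))

M_p = N_p * m_p                     # total mass for identical particles
F_p = M_p * pdf                     # mass density function, Eq. (406)
f_1 = N_p * pdf                     # number density function, Eq. (407)

print(np.allclose(m_p * f_1, F_p))  # m_p f_1 = F_p^L
print((F_p * bin_area).sum(), M_p)  # F_p^L integrates to the total mass
```

The same histogram, weighted by the individual masses $m_{p,i}$ instead of a common $m_p$, would give an estimate of $F^L_p$ for a polydispersed sample, in which case it no longer reduces to a rescaled number density.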
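Finally, we define a two-point fluid–particle mass density function

$$\mathcal{F}^L_{fp}(t; \mathbf{y}, \mathbf{V}, \boldsymbol{\psi}) = M_p M_f\, p^L_{fp}(t; \mathbf{y}, \mathbf{V}, \boldsymbol{\psi}), \tag{408}$$

whose marginals, $\mathcal{F}^L_f$ and $\mathcal{F}^L_p$, are related to the mass density function of the continuous phase $F_f^L$ and the mass density function of the discrete phase $F_p^L$ by

$$\mathcal{F}^L_f(t; \mathbf{y}_f, \mathbf{V}_f, \psi_f) = M_p\, F_f^L(t; \mathbf{y}_f, \mathbf{V}_f, \psi_f), \qquad \mathcal{F}^L_p(t; \mathbf{y}_p, \mathbf{V}_p, \psi_p) = M_f\, F_p^L(t; \mathbf{y}_p, \mathbf{V}_p, \psi_p). \tag{409}$$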
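As a quick check of Eq. (409), the fluid marginal can be recovered by integrating Eq. (408) over the particle variables. The sketch below writes $\mathcal{F}^L_{fp}$ for the two-point mass density function of Eq. (408) and assumes that $p_f^L$ is the marginal of $p^L_{fp}$ over $(\mathbf{y}_p, \mathbf{V}_p, \psi_p)$, as in Eqs. (401) and (402); the last step uses Eq. (406):

$$\int \mathcal{F}^L_{fp}\, d\mathbf{y}_p\, d\mathbf{V}_p\, d\psi_p = M_p M_f \int p^L_{fp}\, d\mathbf{y}_p\, d\mathbf{V}_p\, d\psi_p = M_p\, M_f\, p_f^L = M_p\, F_f^L .$$

The particle marginal follows in the same way by integrating over $(\mathbf{y}_f, \mathbf{V}_f, \psi_f)$, which gives $M_f\, F_p^L$.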
8.2.5. General relations between Eulerian and Lagrangian pdfs

The quantities which have been introduced so far in the present section are related to Lagrangian pdfs (or mdfs). As explained before, the ultimate goal of the present formalism is the derivation of field equations (Eulerian equations) for the local moments of both phases, and this can only be done by means of Eulerian tools, i.e. Fokker–Planck-like equations (partial differential equations) on Eulerian distribution functions. Averaging operators (expectations) can then be defined and partial differential equations for the mean quantities can be written, as was done in Section 6. Here, the problem is, however, slightly different since we are dealing with a fluid–particle mixture, that is, two phases sharing the same physical space. Relations between two-point (one fluid particle and one discrete particle) Lagrangian mdfs and Eulerian mdfs must be found.

We shall shortly see that two treatments of the problem are possible and equivalent. Indeed, relations between Eulerian and Lagrangian mdfs can be worked out on the marginals of $\mathcal{F}^L_{fp}$ ($\mathcal{F}^L_f = M_p F_f^L$ and $\mathcal{F}^L_p = M_f F_p^L$) or directly on $\mathcal{F}^L_{fp}$. The fundamental difference is the information which is kept when one arrives at the Eulerian formulation: the latter procedure retains two-point Eulerian information (one point for each phase), whereas in the former only one-point Eulerian information is available in each phase (some information has been lost). By generalization of Eq. (211), we define

$$\mathcal{F}^E_{fp}(t; \mathbf{x}_f, \mathbf{x}_p, \mathbf{V}_f, \psi_f, \mathbf{V}_p, \psi_p) = \mathcal{F}^L_{fp}(t; \mathbf{y}_f = \mathbf{x}_f, \mathbf{V}_f, \psi_f, \mathbf{y}_p = \mathbf{x}_p, \mathbf{V}_p, \psi_p) = \int \mathcal{F}^L_{fp}(t; \mathbf{y}_f, \mathbf{V}_f, \psi_f, \mathbf{y}_p, \mathbf{V}_p, \psi_p)\, \delta(\mathbf{x}_f - \mathbf{y}_f)\, \delta(\mathbf{x}_p - \mathbf{y}_p)\, d\mathbf{y}_f\, d\mathbf{y}_p, \tag{410}$$
where $\mathcal{F}^E_{fp}$ is the Eulerian two-point fluid–particle mass density function. By direct integration of the previous equation over physical space $\mathbf{x}_p$ and phase space $(\mathbf{V}_p, \psi_p)$, or over physical space $\mathbf{x}_f$ and phase space $(\mathbf{V}_f, \psi_f)$, one finds similar relations for the associated marginals, the Eulerian one-point fluid mass density function, $\mathcal{F}^E_f$, and the Eulerian one-point mass density function associated with the particle phase, $\mathcal{F}^E_p$, respectively. One can write for the continuous phase

$$\mathcal{F}^E_f(t; \mathbf{x}_f, \mathbf{V}_f, \psi_f) = \mathcal{F}^L_f(t; \mathbf{y}_f = \mathbf{x}_f, \mathbf{V}_f, \psi_f) = \int \mathcal{F}^L_f(t; \mathbf{y}_f, \mathbf{V}_f, \psi_f)\, \delta(\mathbf{x}_f - \mathbf{y}_f)\, d\mathbf{y}_f \tag{411}$$

and for the discrete phase

$$\mathcal{F}^E_p(t; \mathbf{x}_p, \mathbf{V}_p, \psi_p) = \mathcal{F}^L_p(t; \mathbf{y}_p = \mathbf{x}_p, \mathbf{V}_p, \psi_p) = \int \mathcal{F}^L_p(t; \mathbf{y}_p, \mathbf{V}_p, \psi_p)\, \delta(\mathbf{x}_p - \mathbf{y}_p)\, d\mathbf{y}_p. \tag{412}$$
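Eqs. (411) and (412) are consistent with the following definitions of the Eulerian marginals of $\mathcal{F}^E_{fp}$ ($\mathcal{F}^E_f$ and $\mathcal{F}^E_p$),

$$\mathcal{F}^E_f(t; \mathbf{x}_f, \mathbf{V}_f, \psi_f) = \int \mathcal{F}^E_{fp}(t; \mathbf{x}_f, \mathbf{x}_p, \mathbf{V}_f, \psi_f, \mathbf{V}_p, \psi_p)\, d\mathbf{x}_p\, d\mathbf{V}_p\, d\psi_p \tag{413}$$

and

$$\mathcal{F}^E_p(t; \mathbf{x}_p, \mathbf{V}_p, \psi_p) = \int \mathcal{F}^E_{fp}(t; \mathbf{x}_f, \mathbf{x}_p, \mathbf{V}_f, \psi_f, \mathbf{V}_p, \psi_p)\, d\mathbf{x}_f\, d\mathbf{V}_f\, d\psi_f. \tag{414}$$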
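As was shown previously in the definitions of the Lagrangian mass density functions, we have $\mathcal{F}^L_f = M_p F_f^L$ and $\mathcal{F}^L_p = M_f F_p^L$. Inserting these results in Eqs. (411) and (412), we find by identification that

$$\mathcal{F}^E_f(t; \mathbf{x}_f, \mathbf{V}_f, \psi_f) = M_p\, F_f^E(t; \mathbf{x}_f, \mathbf{V}_f, \psi_f), \qquad \mathcal{F}^E_p(t; \mathbf{x}_p, \mathbf{V}_p, \psi_p) = M_f\, F_p^E(t; \mathbf{x}_p, \mathbf{V}_p, \psi_p), \tag{415}$$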
which goes hand in hand with the fact that the relations between the Eulerian mass density functions ($F_f^E$, $F_p^E$) and the Lagrangian mass density functions ($F_f^L$, $F_p^L$) are also given by Eqs. (411) and (412). Eq. (411) (verified by $F_f^L$ and $F_f^E$) is of course identical to Eq. (211) in the single-phase flow case.

Before we carry on, let us recall our reasoning. In order to write field equations for both phases, where the physical space is shared by the fluid and the particles, the following procedure is adopted. In the first part, where relations between Eulerian and Lagrangian mdfs are worked out at the two-point level,

$$\mathcal{F}^E_{fp}(t; \mathbf{x}_f = \mathbf{x}, \mathbf{x}_p, \mathbf{V}_f, \psi_f, \mathbf{V}_p, \psi_p), \qquad \mathcal{F}^E_{fp}(t; \mathbf{x}_f, \mathbf{x}_p = \mathbf{x}, \mathbf{V}_f, \psi_f, \mathbf{V}_p, \psi_p) \tag{416}$$
are under investigation. Information is still available, in the Eulerian sense, at the two-point level. Of course, these two mdfs can give both marginals, see Eqs. (413) and (414), at the same point in physical space, i.e. $F_f^E(t; \mathbf{x}, \mathbf{V}_f, \psi_f)$ and $F_p^E(t; \mathbf{x}, \mathbf{V}_p, \psi_p)$.

In the second part, where relations between Eulerian and Lagrangian mdfs are worked out at the one-point level,

$$p^L_{fp}(t; \mathbf{y}_f = \mathbf{y}, \mathbf{V}_f, \psi_f, \mathbf{y}_p, \mathbf{V}_p, \psi_p), \qquad p^L_{fp}(t; \mathbf{y}_f, \mathbf{V}_f, \psi_f, \mathbf{y}_p = \mathbf{y}, \mathbf{V}_p, \psi_p) \tag{417}$$
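are under investigation. These two pdfs give both Lagrangian marginals, see Eqs. (401) and (402), at the same point in sample space, i.e. $p_f^L(t; \mathbf{y}, \mathbf{V}_f, \psi_f)$ and $p_p^L(t; \mathbf{y}, \mathbf{V}_p, \psi_p)$. With Eq. (406) and Eqs. (411) and (412), information is obtained in the form of both the one-point fluid and particle mass density functions, $F_f^E(t; \mathbf{x}, \mathbf{V}_f, \psi_f)$ and $F_p^E(t; \mathbf{x}, \mathbf{V}_p, \psi_p)$, respectively (at the same point in physical space).

8.2.6. Two-point relations between Eulerian and Lagrangian pdfs

With Eq. (410), the definition of the two-point fluid–particle Lagrangian mdf, $\mathcal{F}^L_{fp} = M_f M_p\, p^L_{fp}$, and the two-point fluid–particle transitional pdf $p^L_{|fp}$,

$$p^L_{fp}(t; \mathbf{x}_f, \mathbf{V}_f, \psi_f, \mathbf{x}_p, \mathbf{V}_p, \psi_p) = \int p^L_{|fp}(t; \mathbf{x}_f, \mathbf{V}_f, \psi_f, \mathbf{x}_p, \mathbf{V}_p, \psi_p \mid t_0; \mathbf{x}_{f0}, \mathbf{V}_{f0}, \psi_{f0}, \mathbf{x}_{p0}, \mathbf{V}_{p0}, \psi_{p0})\, p^L_{fp}(t_0; \mathbf{x}_{f0}, \mathbf{V}_{f0}, \psi_{f0}, \mathbf{x}_{p0}, \mathbf{V}_{p0}, \psi_{p0})\, d\mathbf{x}_{f0}\, d\mathbf{V}_{f0}\, d\psi_{f0}\, d\mathbf{x}_{p0}\, d\mathbf{V}_{p0}\, d\psi_{p0}, \tag{418}$$

one can write

$$\mathcal{F}^E_{fp}(t; \mathbf{x}_f, \mathbf{x}_p, \mathbf{V}_f, \psi_f, \mathbf{V}_p, \psi_p) = \int p^L_{|fp}(t; \mathbf{x}_f, \mathbf{V}_f, \psi_f, \mathbf{x}_p, \mathbf{V}_p, \psi_p \mid t_0; \mathbf{x}_{f0}, \mathbf{V}_{f0}, \psi_{f0}, \mathbf{x}_{p0}, \mathbf{V}_{p0}, \psi_{p0})\, \mathcal{F}^E_{fp}(t_0; \mathbf{x}_{f0}, \mathbf{x}_{p0}, \mathbf{V}_{f0}, \psi_{f0}, \mathbf{V}_{p0}, \psi_{p0})\, d\mathbf{x}_{f0}\, d\mathbf{V}_{f0}\, d\psi_{f0}\, d\mathbf{x}_{p0}\, d\mathbf{V}_{p0}\, d\psi_{p0}. \tag{419}$$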
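As in Section 6, this relation shows that the Eulerian mass density function $\mathcal{F}^E_{fp}$ is 'propagated' by the transitional pdf or, in the language of statistical physics, that the transitional pdf $p^L_{|fp}$ is the propagator of an information which is the two-point fluid–particle Eulerian mass density function. Consequently, the partial differential equation which is verified by the transitional pdf is also verified by the Eulerian mass density function $\mathcal{F}^E_{fp}$.

The definitions of the expected densities, $\langle\rho_f\rangle(t, \mathbf{x})$ and $\langle\rho_p\rangle(t, \mathbf{x})$, and of the probabilities of presence of both phases, $\alpha_f(t, \mathbf{x})$ and $\alpha_p(t, \mathbf{x})$, can be expressed in terms of the two-point Eulerian mdf. For the expected densities, one can write

$$\alpha_f(t, \mathbf{x})\, \langle\rho_f\rangle(t, \mathbf{x}) = \frac{1}{M_p} \int \mathcal{F}^E_{fp}(t; \mathbf{x}, \mathbf{x}_p, \mathbf{V}_f, \psi_f, \mathbf{V}_p, \psi_p)\, d\mathbf{x}_p\, d\mathbf{V}_p\, d\psi_p\, d\mathbf{V}_f\, d\psi_f,$$

$$\alpha_p(t, \mathbf{x})\, \langle\rho_p\rangle(t, \mathbf{x}) = \frac{1}{M_f} \int \mathcal{F}^E_{fp}(t; \mathbf{x}_f, \mathbf{x}, \mathbf{V}_f, \psi_f, \mathbf{V}_p, \psi_p)\, d\mathbf{x}_f\, d\mathbf{V}_f\, d\psi_f\, d\mathbf{V}_p\, d\psi_p. \tag{420}$$

Similarly, $\alpha_f$ and $\alpha_p$ are defined by

$$\alpha_f(t, \mathbf{x}) = \frac{1}{M_p} \int \frac{1}{\rho_f(\psi_f)}\, \mathcal{F}^E_{fp}(t; \mathbf{x}, \mathbf{x}_p, \mathbf{V}_f, \psi_f, \mathbf{V}_p, \psi_p)\, d\mathbf{x}_p\, d\mathbf{V}_p\, d\psi_p\, d\mathbf{V}_f\, d\psi_f,$$

$$\alpha_p(t, \mathbf{x}) = \frac{1}{M_f} \int \frac{1}{\rho_p(\psi_p)}\, \mathcal{F}^E_{fp}(t; \mathbf{x}_f, \mathbf{x}, \mathbf{V}_f, \psi_f, \mathbf{V}_p, \psi_p)\, d\mathbf{x}_f\, d\mathbf{V}_f\, d\psi_f\, d\mathbf{V}_p\, d\psi_p. \tag{421}$$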
In the general case, no expressions can be worked out between $\mathcal{F}^E_{fp}$ and $p^E_{fp}$, or between the Lagrangian pdf $p^L_{fp}$ conditioned by position and the two-point fluid–particle distribution function $p^E_{fp}$, as was done for the single-phase flow case in Section 6. This can, however, be done in the particular case where the fluid particle and the discrete particle are statistically independent. The two-point fluid–particle mdf is then expressed as the product of the fluid-particle and discrete-particle mdfs, and the results found in the treatment of the continuous and the discrete phase are multiplied. This is not done here since in 'real flows' the assumption of statistical independence is not valid. Finally, the expected densities and the probabilities of presence of both phases can be written in terms of the marginals of $\mathcal{F}^E_{fp}$:

$$\alpha_f(t, \mathbf{x})\, \langle\rho_f\rangle(t, \mathbf{x}) = \int F_f^E(t; \mathbf{x}, \mathbf{V}_f, \psi_f)\, d\mathbf{V}_f\, d\psi_f,$$

$$\alpha_p(t, \mathbf{x})\, \langle\rho_p\rangle(t, \mathbf{x}) = \int F_p^E(t; \mathbf{x}, \mathbf{V}_p, \psi_p)\, d\mathbf{V}_p\, d\psi_p,$$

$$\alpha_f(t, \mathbf{x}) = \int \frac{1}{\rho_f(\psi_f)}\, F_f^E(t; \mathbf{x}, \mathbf{V}_f, \psi_f)\, d\mathbf{V}_f\, d\psi_f,$$

$$\alpha_p(t, \mathbf{x}) = \int \frac{1}{\rho_p(\psi_p)}\, F_p^E(t; \mathbf{x}, \mathbf{V}_p, \psi_p)\, d\mathbf{V}_p\, d\psi_p. \tag{422}$$
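The step from the two-point expressions to the one-point expressions of Eq. (422) can be made explicit for the fluid phase. The sketch below writes $\mathcal{F}^E_{fp}$ for the Eulerian two-point mdf and $\mathcal{F}^E_f$ for its fluid marginal, and only combines Eqs. (413), (415) and (420):

$$\alpha_f \langle\rho_f\rangle = \frac{1}{M_p} \int \left[ \int \mathcal{F}^E_{fp}\, d\mathbf{x}_p\, d\mathbf{V}_p\, d\psi_p \right] d\mathbf{V}_f\, d\psi_f = \frac{1}{M_p} \int \mathcal{F}^E_f\, d\mathbf{V}_f\, d\psi_f = \int F_f^E\, d\mathbf{V}_f\, d\psi_f .$$

The same manipulation applied to Eq. (421), where the weight $1/\rho_f(\psi_f)$ does not depend on the particle variables, gives the one-point expression for $\alpha_f$; the particle-phase relations are obtained by exchanging the roles of the two phases.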
8.2.7. Relations between Eulerian and Lagrangian marginals

As mentioned above, relations between Eulerian and Lagrangian mdfs can be worked out either at the two-point level (one fluid point and one particle point) or at the one-point level. Results at the two-point level have just been presented, and we now work out similar relations at the one-point level, for each phase in turn. Using Eq. (411), the definition of the fluid Lagrangian mdf, $F_f^L = M_f\, p_f^L$, and introducing the fluid transitional pdf $p^L_{|f}$, one can write

$$F_f^E(t; \mathbf{x}, \mathbf{V}_f, \psi_f) = \int p^L_{|f}(t; \mathbf{x}, \mathbf{V}_f, \psi_f \mid t_0; \mathbf{x}_0, \mathbf{V}_{f0}, \psi_{f0})\, F_f^E(t_0; \mathbf{x}_0, \mathbf{V}_{f0}, \psi_{f0})\, d\mathbf{x}_0\, d\mathbf{V}_{f0}\, d\psi_{f0}. \tag{423}$$

As in the single-phase flow case, this relation shows that the fluid Eulerian mass density function $F_f^E$ is 'propagated' by the fluid transitional pdf or, in the language of statistical physics, that the fluid transitional pdf $p^L_{|f}$ is the propagator of an information which is the fluid Eulerian mass density function.
Now, let us express the relations between the Eulerian and Lagrangian marginals (the pdfs related to the fluid particles). Here, we repeat the procedure given in Section 6 but, as we shall see, there are some slight changes due to the fact that we do not treat a single-phase flow but a fluid–particle mixture. By integration of Eq. (411) over $(\mathbf{x} = \mathbf{x}_f, \mathbf{V}_f, \psi_f)$ and using the definition of the mass density function of the continuous phase, one finds that the result of the integration gives the total mass of fluid $M_f$, which means that the integral of $F_f^E$ over phase space $(\mathbf{V}_f, \psi_f)$ is the expected density of the fluid at $(t, \mathbf{x})$ (the probable mass of fluid in a given state per unit volume). The expected density, denoted $\langle\rho_f\rangle(t, \mathbf{x})$, is defined by the following equation:

$$\alpha_f(t, \mathbf{x})\, \langle\rho_f\rangle(t, \mathbf{x}) = \int \rho_f(\psi_f)\, p_f^E(t; \mathbf{x}, \mathbf{V}_f, \psi_f)\, d\mathbf{V}_f\, d\psi_f, \tag{424}$$

where the Eulerian mass density function $F_f^E$ is given by a relation identical to the single-phase flow case, see Eq. (214), that is

$$F_f^E(t; \mathbf{x}, \mathbf{V}_f, \psi_f) = \rho_f(\psi_f)\, p_f^E(t; \mathbf{x}, \mathbf{V}_f, \psi_f). \tag{425}$$
The new quantity (compared to the single-phase flow case), $\alpha_f(t, \mathbf{x})$, is of course defined as the normalization factor of $p_f^E$, that is

$$\alpha_f(t, \mathbf{x}) = \int p_f^E(t; \mathbf{x}, \mathbf{V}_f, \psi_f)\, d\mathbf{V}_f\, d\psi_f, \tag{426}$$

which represents the probability to find the fluid phase, at time $t$ and position $\mathbf{x}$, in any state ($0 \le \alpha_f(t, \mathbf{x}) \le 1$). This probability is not always one, as it is in single-phase flows where the physical space is continuously filled by the fluid. In a fluid–particle mixture, at $(t, \mathbf{x})$ there might be some fluid or a discrete particle. Note that the above equations, i.e. Eqs. (424)–(426), are consistent with the definitions obtained at the two-point level, i.e. Eqs. (420)–(422). As done in Section 6, integration of $F_f^L$ and $F_f^E$ over phase space $(\mathbf{V}_f, \psi_f)$ yields (where the notation $\mathbf{y} = \mathbf{x}$ in the Lagrangian pdfs is, from now on, dropped most of the time for the sake of clarity)

$$p_f^L(t; \mathbf{x}) = \frac{1}{M_f}\, \alpha_f(t, \mathbf{x})\, \langle\rho_f\rangle(t, \mathbf{x}) \tag{427}$$

and therefore the Lagrangian pdf conditioned on position, $p_f^L(t; \mathbf{V}_f, \psi_f \mid \mathbf{x})$, is given by

$$p_f^L(t; \mathbf{V}_f, \psi_f \mid \mathbf{x}) = \frac{\rho_f(\psi_f)\, p_f^E(t; \mathbf{x}, \mathbf{V}_f, \psi_f)}{\alpha_f(t, \mathbf{x})\, \langle\rho_f\rangle(t, \mathbf{x})}. \tag{428}$$
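For completeness, the intermediate step leading to Eq. (428) can be written out. This is only a sketch that combines the definition of a conditional pdf with Eqs. (406), (411), (425) and (427), taking $\mathbf{y} = \mathbf{x}$ in the Lagrangian pdf:

$$p_f^L(t; \mathbf{V}_f, \psi_f \mid \mathbf{x}) = \frac{p_f^L(t; \mathbf{x}, \mathbf{V}_f, \psi_f)}{p_f^L(t; \mathbf{x})} = \frac{F_f^E(t; \mathbf{x}, \mathbf{V}_f, \psi_f) / M_f}{\alpha_f(t, \mathbf{x})\, \langle\rho_f\rangle(t, \mathbf{x}) / M_f} = \frac{\rho_f(\psi_f)\, p_f^E(t; \mathbf{x}, \mathbf{V}_f, \psi_f)}{\alpha_f(t, \mathbf{x})\, \langle\rho_f\rangle(t, \mathbf{x})} .$$

The total mass $M_f$ cancels, so the conditional pdf depends only on local quantities. Integrating the right-hand side over $(\mathbf{V}_f, \psi_f)$ and using Eq. (424) returns one, as required of a pdf.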
As in the single-phase flow case, we find that in a compressible flow, the fluid Lagrangian pdf conditioned by the position is not the fluid Eulerian distribution function but the density-weighted fluid Eulerian pdf, $p_f^E / \alpha_f$. Before we start the treatment of the discrete phase, let us rewrite the results derived above in the particular case of an incompressible flow. In this case, all information is contained in $p_f^E$ and $p_f^L$, and we find that the fluid transitional pdf is the propagator of an Eulerian information which is $p_f^E$,

$$p_f^E(t; \mathbf{x}, \mathbf{V}_f) = \int p^L_{|f}(t; \mathbf{x}, \mathbf{V}_f \mid t_0; \mathbf{x}_0, \mathbf{V}_{f0})\, p_f^E(t_0; \mathbf{x}_0, \mathbf{V}_{f0})\, d\mathbf{x}_0\, d\mathbf{V}_{f0}. \tag{429}$$

Eq. (411) becomes ($\rho_f = M_f / \mathcal{V}_f$, where $\mathcal{V}_f$ is the volume of the physical space occupied by the fluid)

$$\frac{1}{\mathcal{V}_f}\, p_f^E(t; \mathbf{x}, \mathbf{V}_f) = p_f^L(t; \mathbf{y}_f = \mathbf{x}, \mathbf{V}_f) = \int p_f^L(t; \mathbf{y}_f, \mathbf{V}_f)\, \delta(\mathbf{x} - \mathbf{y}_f)\, d\mathbf{y}_f. \tag{430}$$

The probability to find a fluid particle at a given position $\mathbf{x}$ is associated with the marginal $p_f^L(t; \mathbf{x}) = \alpha_f / \mathcal{V}_f$, and the fluid Lagrangian pdf conditioned by position reads

$$p_f^L(t; \mathbf{V}_f \mid \mathbf{x}) = \frac{1}{\alpha_f(t, \mathbf{x})}\, p_f^E(t; \mathbf{x}, \mathbf{V}_f). \tag{431}$$
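The relations relevant to the discrete phase are obtained by following a procedure identical to the one which has just been presented in the treatment of the continuous phase. With Eq. (412), the definition of the particle Lagrangian mdf, $F_p^L = M_p\, p_p^L$, and introducing the discrete particle transitional pdf $p^L_{|p}$, one can write

$$F_p^E(t; \mathbf{x}, \mathbf{V}_p, \psi_p) = \int p^L_{|p}(t; \mathbf{x}, \mathbf{V}_p, \psi_p \mid t_0; \mathbf{x}_0, \mathbf{V}_{p0}, \psi_{p0})\, F_p^E(t_0; \mathbf{x}_0, \mathbf{V}_{p0}, \psi_{p0})\, d\mathbf{x}_0\, d\mathbf{V}_{p0}\, d\psi_{p0}. \tag{432}$$

As explained previously, this relation shows that the Eulerian mass density function $F_p^E$ is 'propagated' by the transitional pdf or, in the language of statistical physics, that the transitional pdf $p^L_{|p}$ is the propagator of an information which is the Eulerian mass density function. The following set of relations is obtained:

$$\alpha_p(t, \mathbf{x})\, \langle\rho_p\rangle(t, \mathbf{x}) = \int \rho_p(\psi_p)\, p_p^E(t; \mathbf{x}, \mathbf{V}_p, \psi_p)\, d\mathbf{V}_p\, d\psi_p, \tag{433}$$

$$F_p^E(t; \mathbf{x}, \mathbf{V}_p, \psi_p) = \rho_p(\psi_p)\, p_p^E(t; \mathbf{x}, \mathbf{V}_p, \psi_p), \tag{434}$$

$$\alpha_p(t, \mathbf{x}) = \int p_p^E(t; \mathbf{x}, \mathbf{V}_p, \psi_p)\, d\mathbf{V}_p\, d\psi_p, \tag{435}$$

$$p_p^L(t; \mathbf{x}) = \frac{1}{M_p}\, \alpha_p(t, \mathbf{x})\, \langle\rho_p\rangle(t, \mathbf{x}), \tag{436}$$

$$p_p^L(t; \mathbf{V}_p, \psi_p \mid \mathbf{x}) = \frac{\rho_p(\psi_p)\, p_p^E(t; \mathbf{x}, \mathbf{V}_p, \psi_p)}{\alpha_p(t, \mathbf{x})\, \langle\rho_p\rangle(t, \mathbf{x})}. \tag{437}$$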
Integration of Eq. (412) over $(\mathbf{x} = \mathbf{x}_p, \mathbf{V}_p, \psi_p)$ yields the total mass of discrete particles $M_p$. The integral of $F_p^E$ over phase space $(\mathbf{V}_p, \psi_p)$ is the expected density of the particles at $(t, \mathbf{x})$, denoted $\langle\rho_p\rangle(t, \mathbf{x})$, and it is defined by Eq. (433). The discrete particle Eulerian mass density function is given by Eq. (434). In this equation, a distinction is made between the full set of additional variables $\psi_p$ and those entering the density law: not all additional variables enter it (for example, particle diameter and fluid velocity seen, as will be demonstrated later). Note that the above equations, i.e. Eqs. (433)–(435), are consistent with the definitions obtained at the two-point level, i.e. Eqs. (420)–(422). The probability to find the discrete phase at time $t$ and position $\mathbf{x}$ in any state is $\alpha_p(t, \mathbf{x})$ ($0 \le \alpha_p(t, \mathbf{x}) \le 1$), which is the normalization factor of $p_p^E$, i.e. Eq. (435).
Integration of $F_p^L$ and $F_p^E$ over phase space $(\mathbf{V}_p, \psi_p)$ yields Eq. (436) (which is the probability to find a discrete particle at a given position and in any state), and therefore the Lagrangian pdf conditioned by position, $p_p^L(t; \mathbf{V}_p, \psi_p \mid \mathbf{x})$, is given by Eq. (437). For discrete particles of variable density, the Lagrangian pdf conditioned by the position is not the equivalent Eulerian pdf but the density-weighted Eulerian pdf, $p_p^E / \alpha_p$. In the particular case of particles of constant density and constant diameter, one obtains relations identical to the fluid case, Eqs. (429)–(431). With particles of constant density but variable diameter, all information is contained in $p_p^L$ and Eq. (412) becomes ($\rho_p = m_p / \vartheta_p$, where $m_p$ and $\vartheta_p$ are the mass and the volume of a particle of given diameter)

$$\frac{m_p}{M_p}\, \frac{1}{\vartheta_p}\, p_p^E(t; \mathbf{x}, \mathbf{V}_p, \psi_p) = p_p^L(t; \mathbf{y}_p = \mathbf{x}, \mathbf{V}_p, \psi_p) \tag{438}$$

$$= \int p_p^L(t; \mathbf{y}_p, \mathbf{V}_p, \psi_p)\, \delta(\mathbf{x} - \mathbf{y}_p)\, d\mathbf{y}_p. \tag{439}$$

The probability to find a discrete particle at a given position $\mathbf{x}$ is associated with the marginal $p_p^L(t; \mathbf{x}) = (\alpha_p / \vartheta_p)(m_p / M_p)$, and the Lagrangian pdf conditioned by position reads

$$p_p^L(t; \mathbf{V}_p, \psi_p \mid \mathbf{x}) = \frac{1}{\alpha_p(t, \mathbf{x})}\, p_p^E(t; \mathbf{x}, \mathbf{V}_p, \psi_p). \tag{440}$$

8.2.8. Discrete representation and mass-weighted averages

Once the general formalism has been introduced, it is useful to clarify the correspondence between averages (defined as mathematical expectations) and Monte Carlo estimations drawn from a finite ensemble of particles. In doing so, we slightly anticipate Section 8.5, where the question of a precise definition of averages is taken up and addressed at greater length. Yet, since averages will already be used naturally in the construction of pdf models, a first discussion is given here. A similar discussion was already proposed for single-phase flows in Section 6.4.4.
It was shown that in the general case, at a given location $\mathbf{x}$, a Monte Carlo calculation based on the local number of particles present in a small volume around $\mathbf{x}$ is an estimation of the Favre-averaged (or density-averaged) quantity, see Eq. (227). For the fluid case, the computational particles are generally chosen to have the same mass. Actually, the mass $m^i$ attached to each of the 'computational particles' is arbitrary; the important constraint to respect is the mean fluid continuity equation, which is related to a proper normalisation of the fluid mdf [6]. For an incompressible fluid, the particle mass concentration should be constant. By assigning the same mass to each particle, the constraint becomes that the particle concentration should be constant, which is attractive from a numerical point of view since it ensures a uniform statistical error of the Monte Carlo estimations in the domain. Then, in the incompressible case, (Reynolds) averages are estimated by the local ensemble averages and Eq. (227) simplifies to

$$\langle H \rangle \simeq H_N = \frac{1}{N_x} \sum_{i=1}^{N_x} H(\mathbf{U}^i(t), \phi^i(t)). \tag{441}$$
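The estimator of Eq. (441) can be illustrated with a short sketch. It is only a minimal one-dimensional example, assuming equal-mass computational particles and an interval `[x_lo, x_hi)` playing the role of the small volume around $\mathbf{x}$; the function name `local_ensemble_average` and the sample data are hypothetical and not part of the original formalism.

```python
import numpy as np

def local_ensemble_average(x, h, x_lo, x_hi):
    """Plain (Reynolds) average of h over the particles found in [x_lo, x_hi).

    Assumes all computational particles carry the same mass, so that the
    local ensemble mean plays the role of the estimator H_N of Eq. (441).
    """
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    in_cell = (x >= x_lo) & (x < x_hi)   # particles present in the cell
    n_x = int(in_cell.sum())             # local number of particles N_x
    if n_x == 0:
        raise ValueError("no particle in the cell, estimate undefined")
    return h[in_cell].sum() / n_x, n_x

# Hypothetical data: particle positions and one sampled value H(U^i, phi^i) each
positions = [0.1, 0.2, 0.6, 0.9]
values = [1.0, 3.0, 10.0, 20.0]
h_n, n_x = local_ensemble_average(positions, values, 0.0, 0.5)
print(h_n, n_x)
```

Only the two particles located in the first half of the domain contribute here. With few particles per cell the statistical error of such an estimate is large, which is why a uniform particle concentration is attractive.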
In the two-phase flow case, even when both densities $\rho_f$ and $\rho_p$ are constant, the natural setting is that of compressible flows, compressibility being due to the variable local fractions $\alpha_f$ and $\alpha_p$. Most of what has been said above remains valid for the fluid case, the relation between the local number of fluid particles $N_x^f$ with mass $m_f^i$ and the density being

$$\alpha_f(t, \mathbf{x})\, \rho_f \simeq \frac{\sum_{i=1}^{N_x^f} m_f^i}{V_x}. \tag{442}$$

Once again, for fluid particles, the mass $m_f^i$ attached to each particle is arbitrary and we can choose to assign the same mass $\Delta m$ to all fluid particles. From the above equation, the local number of fluid particles must then be proportional to $\alpha_f(t, \mathbf{x})$. Then, for any fluid property $H_f$ attached to fluid particles (denoted by $H_f^i$), the estimation of the averaged Eulerian quantity is

$$\langle H_f \rangle \simeq H_{f,N} = \frac{1}{N_x^f} \sum_{i=1}^{N_x^f} H_f^i. \tag{443}$$
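The situation is, however, different for the particle phase. The general definition of an average quantity is

$$\alpha_p(t, \mathbf{x})\, \rho_p\, \langle H_p \rangle = \int H(\mathbf{V}_p, \psi_p)\, F_p^E(t; \mathbf{x}, \mathbf{V}_p, \psi_p)\, d\mathbf{V}_p\, d\psi_p. \tag{444}$$

Therefore, following the same reasoning as in Section 6.4.4, we obtain

$$\alpha_p(t, \mathbf{x})\, \rho_p\, \langle H_p \rangle \simeq \frac{1}{V_x} \sum_{i=1}^{N_x^p} m_p^i\, H(\mathbf{U}_p^i(t), \phi_p^i(t)) \tag{445}$$

and with

$$\alpha_p(t, \mathbf{x})\, \rho_p \simeq \frac{\sum_{i=1}^{N_x^p} m_p^i}{V_x}, \tag{446}$$

we have

$$\langle H_p \rangle \simeq H_{p,N} = \frac{\sum_{i=1}^{N_x^p} m_p^i\, H(\mathbf{U}_p^i(t), \phi_p^i(t))}{\sum_{i=1}^{N_x^p} m_p^i}. \tag{447}$$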
However, for the dispersed phase, the computational particles represent real physical particles and must have identical physical properties, such as diameter and density, to simulate the real particle behaviour. For a polydispersed particle phase, particles have different masses even when $\rho_p$ is constant, since their diameter varies. The important consequence is that, even for constant-density particles, the natural definition or understanding of a mean quantity is the mass-weighted average, or the volume-weighted average when $\rho_p$ is constant.

8.3. Choice of the pdf description

In the previous subsection, a general formalism was proposed in order to give a probabilistic description of dispersed two-phase flows. It was seen that the key quantity of the approach is the transitional Lagrangian pdf $p^L_{|fp}$, from which Lagrangian and Eulerian mass density functions are obtained.
8.3.1. The Lagrangian stochastic point of view

The existence of a propagator ($p^L_{|fp}$ for the fluid–particle mixture, or $p^L_{|f}$ and $p^L_{|p}$ for the fluid and the particles, respectively) indicates that the Lagrangian point of view is the natural choice, and from now on this point of view is retained. We follow the ideas of Sections 6 and 7, that is, the exact equations (in a Lagrangian sense) are replaced by stochastic models which should reproduce the same statistics as the exact description: instantaneous exact equations are replaced by instantaneous modelled ones. Finally, the trajectory point of view is adopted, but the pdf interpretation is presented in order to justify the derivation of the mean field equations.

8.3.2. PDF equation in dispersed two-phase flows

From now on, for the sake of simplicity, only dispersed two-phase flows with two-way coupling are considered. The possible influence of collisional mechanisms (between discrete particles), in the frame of the present formalism, will be discussed at the end of Section 9. In order to write the partial differential equation (Fokker–Planck equation) verified by the propagator, we recall the exact equations for the trajectories of fluid and discrete particles, see Eqs. (230) and (281), respectively. The new set of equations takes the following form,

$$\begin{aligned}
d x^+_{f,i} &= U^+_{f,i}\, dt, \\
d U^+_{f,i} &= A^+_{f,i}\, dt + A^+_{p \to f,i}(t, \mathbf{Z}, \langle \mathbf{Z} \rangle)\, dt, \\
d \phi^+_{f,l} &= \Gamma_f\, \Delta \phi^+_{f,l}\, dt + S_{f,l}(\phi^+_f)\, dt, \\
d x^+_{p,i} &= U^+_{p,i}\, dt, \\
d U^+_{p,i} &= A^+_{p,i}\, dt, \\
d \phi^+_{p,k} &= \Gamma_p\, \Delta \phi^+_{p,k}\, dt + S_{p,k}(\phi^+_p)\, dt,
\end{aligned} \tag{448}$$

where the indexes $l$ and $k$ refer to the dimensions of $\phi_f$ and $\phi_p$, respectively. As in the previous sections, the $+$ superscript is used to indicate the exact trajectories, in contrast to the modelled ones. Both (exact) accelerations ($A^+_{f,i}$ and $A^+_{p,i}$ are the accelerations of the fluid particles and the discrete particles, respectively) are given by

$$A^+_{f,i} = -\frac{1}{\rho_f} \frac{\partial P^+}{\partial x_i} + \nu\, \Delta U^+_{f,i}, \qquad A^+_{p,i} = \frac{1}{\tau_p}\, (U^+_{s,i} - U^+_{p,i}) + g_i. \tag{449}$$
A new term is added in the momentum equation of the fluid to account for the influence of the particles on the fluid. The exact expression for this acceleration, which is induced by the presence of the discrete particles, is not known a priori, and possible models for the trajectories of stochastic fluid particles are discussed in Sections 8.4.2 and 8.5.5.
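Using the techniques presented in Section 2, the transitional pdf $p^{L+}_{|fp}$ verifies the following partial differential equation:

$$\begin{aligned}
\frac{\partial p^{L+}_{|fp}}{\partial t} + V_{f,i} \frac{\partial p^{L+}_{|fp}}{\partial y_{f,i}} + V_{p,i} \frac{\partial p^{L+}_{|fp}}{\partial y_{p,i}}
= &- \frac{\partial}{\partial V_{f,i}} \left( \langle A^+_{f,i} \mid \mathbf{Z} = \mathbf{z} \rangle\, p^{L+}_{|fp} \right) - \frac{\partial}{\partial V_{f,i}} \left( \langle A^+_{p \to f,i}(t, \mathbf{Z}, \langle \mathbf{Z} \rangle) \mid \mathbf{Z} = \mathbf{z} \rangle\, p^{L+}_{|fp} \right) \\
&- \frac{\partial}{\partial V_{p,i}} \left( \langle A^+_{p,i} \mid \mathbf{Z} = \mathbf{z} \rangle\, p^{L+}_{|fp} \right) - \frac{\partial}{\partial \psi_{f,l}} \left( \langle \Gamma_f\, \Delta \phi^+_{f,l} \mid \mathbf{Z} = \mathbf{z} \rangle\, p^{L+}_{|fp} \right) \\
&- \frac{\partial}{\partial \psi_{p,k}} \left( \langle \Gamma_p\, \Delta \phi^+_{p,k} \mid \mathbf{Z} = \mathbf{z} \rangle\, p^{L+}_{|fp} \right) - \frac{\partial}{\partial \psi_{p,k}} \left( S_{p,k}(\psi_p)\, p^{L+}_{|fp} \right) - \frac{\partial}{\partial \psi_{f,l}} \left( S_{f,l}(\psi_f)\, p^{L+}_{|fp} \right),
\end{aligned} \tag{450}$$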
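where all contracted terms $\langle A \mid \mathbf{Z} = \mathbf{z} \rangle$ (the conditioned accelerations $\langle A^+_{p,i} \mid \mathbf{Z} = \mathbf{z} \rangle$, $\langle A^+_{f,i} \mid \mathbf{Z} = \mathbf{z} \rangle$ and $\langle A^+_{p \to f,i}(t, \mathbf{Z}, \langle \mathbf{Z} \rangle) \mid \mathbf{Z} = \mathbf{z} \rangle$, and the conditioned diffusion terms for the scalars of both phases, $\langle \Gamma_p\, \Delta \phi^+_{p,k} \mid \mathbf{Z} = \mathbf{z} \rangle$ and $\langle \Gamma_f\, \Delta \phi^+_{f,l} \mid \mathbf{Z} = \mathbf{z} \rangle$) represent the average value of $A$ conditioned on the values of the state vector $\mathbf{Z} = \mathbf{z}$, cf. Section 3. Then, introducing the definition of the exact fluid–particle pdf in terms of the transitional pdf

$$p^{L+}_{fp}(t; \mathbf{x}_f, \mathbf{V}_f, \psi_f, \mathbf{x}_p, \mathbf{V}_p, \psi_p) = \int p^{L+}_{|fp}(t; \mathbf{x}_f, \mathbf{V}_f, \psi_f, \mathbf{x}_p, \mathbf{V}_p, \psi_p \mid t_0; \mathbf{x}_{f0}, \mathbf{V}_{f0}, \psi_{f0}, \mathbf{x}_{p0}, \mathbf{V}_{p0}, \psi_{p0})\, p^{L+}_{fp}(t_0; \mathbf{x}_{f0}, \mathbf{V}_{f0}, \psi_{f0}, \mathbf{x}_{p0}, \mathbf{V}_{p0}, \psi_{p0})\, d\mathbf{x}_{f0}\, d\mathbf{V}_{f0}\, d\psi_{f0}\, d\mathbf{x}_{p0}\, d\mathbf{V}_{p0}\, d\psi_{p0} \tag{451}$$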
and using Eq. (419), it can be shown that Eq. (450) is satisfied by the Lagrangian pdf, $p^{L+}_{fp}$, and by the Eulerian mass density function, $\mathcal{F}^{E+}_{fp}$. Once Eq. (450) is written for the modelled transitional pdf $p^L_{|fp}$, the same reasoning is valid for $p^L_{fp}$ and $\mathcal{F}^E_{fp}$. Therefore, field equations (for different moments) can be expressed (see Sections 8.5.3 and 8.5.4) following the procedure outlined in Section 8.5 and more especially in Fig. 42.
8.3.3. Interest of the pdf approach

This matter was discussed in Sections 6 and 7 for the continuous phase and the discrete phase, respectively. By introducing a two-point fluid–particle pdf, the features presented in Sections 6 and 7 are gathered, and these are briefly recalled. Convection and scalar source terms appear in closed form for both the fluid and the discrete particles. The first point implies that closure problems encountered in the classical moment approach are not met. The second feature shows that the method is particularly suited for combustion problems because source terms are closed at the local instantaneous level. In addition, it will be shown that, for discrete particles, this approach treats complex closure issues (at the macroscopic level) in a rather simple way (for example, the closure of terms like $\langle \mathbf{U}_p / \tau_p \rangle$ when the diameter of the particles varies considerably from particle to particle, or when we are confronted with a situation where particles have completely different histories), whereas deriving partial differential equations for mean quantities is, in this case, a thorny issue. The Lagrangian approach is attractive since it treats these phenomena without approximation.

Fig. 42. Derivation of the mean field equations from the two-point fluid–particle Eulerian mass density function or derivation of the mean field equations from the marginal Lagrangian pdfs (the level of information in the case of the one-point particle pdf approach is indicated by the ∗ symbol).

8.4. Present 'two-point' models

From now on, the study is further restricted to non-reacting dispersed two-phase flows verifying the following conditions: there are no collisions between particles, and both phases have a constant density. The restriction to non-reactive flows is made for the sake of simplicity. Extension of the present formalism to reactive flows is indeed straightforward (this is precisely one of the main interests of PDF models) through proper introduction of the relevant scalar variables in $\phi_f$ and $\phi_p$. The models presented in this section include interactions between fluid particles and between fluid and solid particles, but not between solid particles (such as collisions) at the moment, though this extension is discussed at the end of Section 9.

8.4.1. Dimension of the state vector

As explained in Sections 6 and 7.4.2, Kolmogorov theory tells us that the acceleration of fluid particles and the acceleration of the fluid sampled along discrete particle trajectories are fast variables (with $dt$ being the reference time scale). Both accelerations are external variables which have to be modelled, and the two-particle (one fluid particle and one discrete particle) state vector is defined by

$$\mathbf{Z} = (\mathbf{x}_f, \mathbf{U}_f, \mathbf{x}_p, \mathbf{U}_p, \mathbf{U}_s, d_p), \tag{452}$$
and in sample space, $(\mathbf{x}, \mathbf{U}, d_p) \leftrightarrow (\mathbf{y}, \mathbf{V}, \psi_p)$. The selection of the velocity of the fluid seen by the discrete particles, namely $\mathbf{U}_s$, as an independent variable linked to the discrete particles is a noteworthy point. This is done for the reasons put forward in Section 7.3.2, where it was explained that the velocity of the fluid seen (which is one of the particle driving forces) should be included in the state vector related to discrete particle properties, since its rate of change along particle trajectories has better chances of being replaced by a stochastic but local model. In Section 7, we were following a hybrid approach where the pdf description was limited to discrete particle properties. However, in the present section, we are dealing with a joint fluid–particle pdf description, and fluid characteristics are already included in the two-particle state vector $\mathbf{Z}$, through the variables $(\mathbf{x}_f, \mathbf{U}_f)$ for example. One could thus wonder whether the introduction of the velocity of the fluid seen as an independent variable is justified when $\mathbf{U}_f$ is already included. Yet, as explained in detail in Section 7.4.1, the statistics of the velocity of the fluid seen are different from the statistics of a fluid particle, due to particle inertia and crossing-trajectory effects. Furthermore, these two fluid velocities do not correspond to the same trajectories: $\mathbf{U}_f$ is the velocity along the fluid particle trajectory $\mathbf{x}_f$, while $\mathbf{U}_s$ is the velocity along the discrete particle trajectory $\mathbf{x}_p$. The reasoning developed in Section 7.4.1 is thus very much valid and justifies the presence of these two fluid velocities, which are treated as independent variables. As far as the choice of the state vector is concerned (with the fluid velocity seen as an independent variable), it should also be pointed out that this stems from a limitation due to the one-point approach (for the fluid). If a two-point pdf were available for the fluid, the closure problem ($\mathbf{U}_s$ for the discrete particle statistical properties) would be solved.
This velocity could be directly calculated, since the fluid velocity at the discrete particle location $\mathbf{x}_{p2}$ at time $t_2$, given the particle location $\mathbf{x}_{p1}$ at time $t_1$, could be directly determined from the conditional pdf $p^L_{|ff}(t_2; \mathbf{y}_{f2}, \mathbf{V}_{f2} \mid t_1; \mathbf{y}_{f1}, \mathbf{V}_{f1})$. This is clearly an indication that the real modelling issue is a multi-point pdf or statistical treatment of the fluid phase. Yet, in that case, the stochastic models (for $\mathbf{U}_s$ in particular) would be changed, but the present fluid–particle pdf formalism would still be necessary.

8.4.2. Stochastic model for $\mathbf{U}_f$ and $\mathbf{U}_s$

The trajectory point of view is adopted, which means that stochastic differential equations are written in order to describe the time increments of the fluid velocity seen along discrete particle trajectories and the time increments of the fluid velocity along fluid particle trajectories. An attractive approach is to use a Langevin equation and to adopt the procedures that were used for a fluid particle in Section 6 and for the fluid velocity seen (discrete particle) in Section 7. Then, the Langevin equation model consists in writing the increments in time of $\mathbf{U}_f$ and $\mathbf{U}_s$ as diffusion processes,

$$d x_{f,i} = U_{f,i}\, dt, \tag{453a}$$

$$d U_{f,i} = [A_{f,i}(t, \mathbf{Z}) + A_{p \to f,i}(t, \mathbf{Z}, \langle \mathbf{Z} \rangle)]\, dt + B_{f,ij}(t, \mathbf{Z})\, dW_j, \tag{453b}$$

$$d x_{p,i} = U_{p,i}\, dt, \tag{453c}$$

$$d U_{p,i} = A_{p,i}(t, \mathbf{Z})\, dt, \tag{453d}$$

$$d U_{s,i} = [A_{s,i}(t, \mathbf{Z}) + A_{p \to s,i}(t, \mathbf{Z}, \langle \mathbf{Z} \rangle)]\, dt + B_{s,ij}(t, \mathbf{Z})\, dW_j, \tag{453e}$$
where the drift vectors $A_{f,i}$, $A_{p \to f,i}$, $A_{p \to s,i}$, $A_{s,i}$ and the diffusion matrices $B_{f,ij}$, $B_{s,ij}$ have to be modelled. The trivial relation, $d\, d_p = 0\, dt$, was not included in Eqs. (453) since, for any discrete particle in physical space, there is no mass variation with time (see previous hypotheses) and the diameter remains constant with time. A first proposal for the modelling of the drift vectors and the diffusion matrices is to adopt the expressions given by Eqs. (236b) and (330). By doing so, it is assumed that the mean transfer rate of energy and energy dissipation $\langle \epsilon \rangle$ is changed by the presence of particles, but that the nature and structure of turbulence remain the same. The drift vectors and the diffusion matrices, which account for the influence of all other (fluid) particles on the fluid particle under consideration, are then of the same nature and the expressions remain unchanged. Therefore, Eqs. (453) are written by adding the additional accelerations, $A_{p \to f,i}(t, \mathbf{Z}, \langle \mathbf{Z} \rangle)$ and $A_{p \to s,i}(t, \mathbf{Z}, \langle \mathbf{Z} \rangle)$, to account for the presence of particles, and the same closures as in Sections 6 and 7 are used for the drift vectors and the diffusion matrices, where, once again, the mean fields $\langle \epsilon \rangle, \langle U_f^2 \rangle, \ldots$ are modified by the presence of the particles. In opposition to the previous hypotheses, recent results of direct numerical simulations in the field of turbulence modulation by particles (in isotropic turbulence) [77] seem to indicate that there is a non-uniform distortion of the energy spectrum. This could mean that, contrary to our previous assumption, the nature and structure of the energy transfer mechanisms of turbulence are modified by the presence of particles. There is no precise 'geometrical' knowledge of the structure of turbulence in the presence of discrete particles, and this makes it extremely difficult to isolate the important variables in order to modify the theory of Kolmogorov (which is used in our closures).
This problem is out of the scope of the present paper and it remains an open question. In Eqs. (453) two di8erent Wiener processes are used for the velocity increments of the uid and the velocity increments of the uid seen. This amounts to neglecting the correlations between the uid accelerations at the two locations xf and xp . Strictly speaking, what is neglected is the correlation between the uid acceleration at location xf and the time rate of change of Us along discrete particle trajectories at location xp . This can be taken as a reasonable Arst guess in the frame of Kolmogorov theory. Moreover, it should be remembered that we are not dealing with two uid velocities, but rather with one uid velocity (at xf ) and with the velocity of the uid seen (at xp ). Even when we consider two close locations (when xp → xf x), since Us represents the velocity of the uid seen or sampled by discrete particles, the statistics of Us and Uf are still not necessarily identical. In other words, for Us , one only records the velocity when there is a discrete particle in the close neighbourhood and thus an ensemble of sampled values of Us form only a subset of all possible values of Uf at location x. Yet, when particle inertia are negligible ($p → 0) and when we consider nearby locations, then the present models are inaccurate. Though this is only a very special case, it indicates that the present form of stochastic models must still be improved. However, it must also be remembered that our real objective is not a two-point description of one of the uid or particle phase but rather a ‘joint one-point’ pdf description from which the two marginal pdf descriptions for each phase can
J.-P. Minier, E. Peirano / Physics Reports 352 (2001) 1–214
be derived. From that point of view, the correlation between the two Wiener processes (when $\mathbf{x}_p \to \mathbf{x}_f \simeq \mathbf{x}$) does not have the same importance as in a real two-point fluid description.

Compared to the forms of the stochastic equations already used for fluid particles in Section 6.6 and for discrete particles in Section 7.4.2, there is a new term entering the equations of $\mathbf{U}_f$ and $\mathbf{U}_s$, $\mathbf{A}_{p\to f}$, which reflects the influence of the discrete particles on the fluid. This is a simple consequence of Newton’s third law: the fluid exerts a force $\mathbf{F}_{f\to p}$ on the discrete particles and, in return, the particles exert a force $\mathbf{F}_{p\to f} = -\mathbf{F}_{f\to p}$ on the fluid. The force corresponds to the exchange of momentum between the fluid and the particles, but should not be confused with the total force acting on particles since the latter includes external forces such as gravity. With current expressions discussed in Section 7.1, the force exerted by one particle on the fluid corresponds to the drag force, written here as

$$\mathbf{F}_{p\to f} = -m_p \mathbf{A}_p^D = -m_p\,\frac{\mathbf{U}_s - \mathbf{U}_p}{\tau_p}. \tag{454}$$
An accurate treatment (in the exact instantaneous equations for the fluid) requires that this total force be converted into a density of force acting on the fluid located in the neighbourhood of the discrete particles in order to express the resulting acceleration on nearby fluid particles. This is not completely known for small particles in turbulent flows. Furthermore, in the stochastic or pdf description, an exact treatment of the reverse force ($\mathbf{F}_{p\to f}$) would mean a multi-point (or multi-particle) pdf description for the discrete particles. This is outside the present scope. Consequently, the reverse forces and the effect of particles on fluid properties are expressed directly in the stochastic equations of fluid particles with simple stochastic models. Since $\mathbf{U}_f$ and $\mathbf{U}_s$ are treated as independent variables, different models can be developed.

The simplest case is the model for the velocity of the fluid seen. Indeed, the variables entering the drag force and the reaction force, Eq. (454), are variables attached only to the discrete particles, namely $\mathbf{U}_p$, $\mathbf{U}_s$ and $d_p$. Therefore, the action of particles on the fluid seen can be accounted for directly in terms of these variables. The first choice is to consider a local model where, at location $\mathbf{x}_p$, the force due to one particle is given by Eq. (454). The total force acting on the fluid element surrounding a discrete particle is then the sum of all elementary forces, $\mathbf{F}_{p\to f}$, due to all neighbouring discrete particles

$$A_{p\to s,i} = -\frac{\alpha_p\rho_p}{\alpha_f\rho_f}\,\frac{U_{s,i} - U_{p,i}}{\tau_p}. \tag{455}$$
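As an illustration only, the local model of Eq. (455) can be sketched numerically. The function name, the argument names and the numerical values below are hypothetical, and the relation $\alpha_f = 1 - \alpha_p$ is an assumption made for this sketch (the two phases are taken to fill the whole volume).

```python
import numpy as np

def reverse_acceleration_on_fluid_seen(u_seen, u_part, tau_p, alpha_p, rho_p, rho_f):
    """Eq. (455): A_{p->s} = -(alpha_p rho_p)/(alpha_f rho_f) * (U_s - U_p)/tau_p."""
    alpha_f = 1.0 - alpha_p  # assumption: fluid and particles fill the whole volume
    # Drag acceleration A_p^D = (U_s - U_p)/tau_p, as in Eq. (454)
    drag_term = (np.asarray(u_seen, float) - np.asarray(u_part, float)) / tau_p
    return -(alpha_p * rho_p) / (alpha_f * rho_f) * drag_term

# Hypothetical values: dilute heavy particles in a gas, fluid seen faster than the particle.
a_ps = reverse_acceleration_on_fluid_seen(
    u_seen=[1.0, 0.0, 0.0], u_part=[0.5, 0.0, 0.0],
    tau_p=1.0e-2, alpha_p=1.0e-3, rho_p=2500.0, rho_f=1.2)
print(a_ps)  # the x component is negative: the fluid seen is decelerated by the particles
```

The prefactor is the ratio of the expected particle mass to the expected fluid mass, so even a small volume fraction can produce a sizeable acceleration when $\rho_p/\rho_f$ is large.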
In this simple model, all neighbouring particles are considered as having the same acceleration term $\mathbf{A}_p^D$, which is multiplied by the expected particle mass at $\mathbf{x}_p$, $\alpha_p\rho_p$, and divided by the expected mass of fluid, $\alpha_f\rho_f$ (since the total force is distributed only on the fluid phase).

The situation is more complicated for the reverse force in the equation of a fluid particle, since a local model at location $\mathbf{x}_f$ cannot be expressed directly in terms of the instantaneous variables attached to the discrete element which has a location $\mathbf{x}_p$. In Eqs. (453), in the equation for the time rate of change of the fluid particle velocity $\mathbf{U}_f$, $A_{p\to f,i}$ is considered, at time $t$ and
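for a fluid particle located at $\mathbf{x}_f = \mathbf{x}$, as a random term which is given by

$$\mathbf{A}_{p\to f} = \begin{cases} 0 & \text{with a probability } 1 - \alpha_p(t,\mathbf{x}_f), \\ \boldsymbol{\Pi}_p & \text{with a probability } \alpha_p(t,\mathbf{x}_f), \end{cases} \tag{456}$$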
where $\boldsymbol{\Pi}_p$ is a random variable which plays the role of an ersatz of the Eulerian random variable which is formed from the discrete particles at the location $\mathbf{x}_p = \mathbf{x}$,

$$\Pi_{p,i} \equiv \frac{\rho_p}{\rho_f}\,\frac{U_{p,i} - U_{s,i}}{\tau_p}. \tag{457}$$
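The two-valued random term of Eq. (456) can be sketched as a Bernoulli mixture. This is a minimal one-component illustration under stated assumptions: the function name, the sample sizes and the Gaussian velocity samples are hypothetical, and the values of $\Pi_p$ are simply resampled from a particle ensemble, so the correlation with $\mathbf{U}_f$ discussed in the text is not reproduced here.

```python
import numpy as np

def sample_reverse_acceleration(rng, alpha_p, pi_samples, n_draws):
    """Draw A_{p->f} as in Eq. (456): 0 with prob. 1 - alpha_p, a value of Pi_p with prob. alpha_p."""
    pi_samples = np.asarray(pi_samples, dtype=float)
    near_particle = rng.random(n_draws) < alpha_p   # Bernoulli(alpha_p) indicator
    picks = rng.choice(pi_samples, size=n_draws)    # resampled values of Pi_p
    return np.where(near_particle, picks, 0.0)

rng = np.random.default_rng(0)
rho_p, rho_f, tau_p = 2500.0, 1.2, 1.0e-2           # hypothetical properties
u_p = rng.normal(0.5, 0.1, 10_000)                  # hypothetical particle velocities
u_s = rng.normal(1.0, 0.1, 10_000)                  # hypothetical velocities of the fluid seen
pi_samples = (rho_p / rho_f) * (u_p - u_s) / tau_p  # Eq. (457), one component

alpha_p = 0.05
a_pf = sample_reverse_acceleration(rng, alpha_p, pi_samples, 400_000)
print(a_pf.mean(), alpha_p * pi_samples.mean())     # the two numbers should be close
```

With this construction the unconditional mean of the sampled term is $\alpha_p\langle\Pi_p\rangle$, which is the combination that reappears when the mean reverse force is evaluated.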
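In other words, from the stochastic models for the discrete particles, or from the one-point particle pdf value at location $\mathbf{x} = \mathbf{x}_f$, we form the random variables $\boldsymbol{\Pi}_p$ with the same distribution. This random term mimics the reverse forces due to the discrete particles and is only non-zero when the fluid particle is in the close neighbourhood of a discrete particle. At the location $\mathbf{x}$ considered, $\boldsymbol{\Pi}_p$ is defined as a random acceleration term in the equation of $\mathbf{U}_f$, correlated with $\mathbf{U}_f$ so that we have

$$\langle \Pi_{p,i} \rangle = \frac{\rho_p}{\rho_f}\left\langle \frac{U_{p,i} - U_{s,i}}{\tau_p} \right\rangle, \tag{458a}$$

$$\langle \Pi_{p,i}\, U_{f,j} \rangle = \frac{\rho_p}{\rho_f}\left\langle \frac{U_{s,j}\,(U_{p,i} - U_{s,i})}{\tau_p} \right\rangle. \tag{458b}$$

According to Section 2, the complete Langevin equation model is equivalent to a Fokker–Planck equation given in closed form for the transitional pdf, $p|^{L}_{fp}$. As demonstrated previously, the Fokker–Planck equation verified by $p|^{L}_{fp}$ is also verified by the two-point fluid–particle Eulerian mass density function $\mathcal{F}^E_{fp}$ and the two-point fluid–particle Lagrangian pdf $p^L_{fp}$. This Fokker–Planck equation is, for the transitional pdf,

$$
\begin{aligned}
\frac{\partial p|^{L}_{fp}}{\partial t} + V_{f,i}\frac{\partial p|^{L}_{fp}}{\partial y_{f,i}} + V_{p,i}\frac{\partial p|^{L}_{fp}}{\partial y_{p,i}}
={}& -\frac{\partial}{\partial V_{f,i}}\Big(\big[A_{f,i} + \langle A_{p\to f,i}\,|\,\mathbf{y}_f,\mathbf{V}_f\rangle\big]\,p|^{L}_{fp}\Big) - \frac{\partial}{\partial V_{p,i}}\big(A_{p,i}\,p|^{L}_{fp}\big) \\
& - \frac{\partial}{\partial V_{s,i}}\Big(\big[A_{s,i} + \langle A_{p\to s,i}\,|\,\mathbf{y}_p,\mathbf{V}_p,\psi_p\rangle\big]\,p|^{L}_{fp}\Big) \\
& + \frac{1}{2}\frac{\partial^2}{\partial V_{f,i}\,\partial V_{f,j}}\big([B_f B_f^T]_{ij}\,p|^{L}_{fp}\big) + \frac{1}{2}\frac{\partial^2}{\partial V_{s,i}\,\partial V_{s,j}}\big([B_s B_s^T]_{ij}\,p|^{L}_{fp}\big).
\end{aligned}
\tag{459}
$$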
Computations of the two-point fluid–particle pdf $p^L_{fp}$ can now be performed using the trajectory point of view or, in other words, by Lagrangian/Lagrangian simulations. Time evolution equations, Eqs. (453), are written for an ensemble made up of fluid and discrete particles which are tracked together. Both have specified variables attached to them which appear as independent variables in the pdf.
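By direct integration of Eq. (459), the Fokker–Planck equations verified by the one-particle transitional pdfs (the fluid transitional pdf $p|^{L}_{f}$ and the particle transitional pdf $p|^{L}_{p}$) can be obtained. In the case of the fluid transitional pdf, $p|^{L}_{f}$,

$$
\frac{\partial p|^{L}_{f}}{\partial t} + V_{f,i}\frac{\partial p|^{L}_{f}}{\partial y_{f,i}}
= -\frac{\partial}{\partial V_{f,i}}\big(A_{f,i}\,p|^{L}_{f}\big) - \frac{\partial}{\partial V_{f,i}}\big(\langle A_{p\to f,i}\,|\,\mathbf{y}_f,\mathbf{V}_f\rangle\,p|^{L}_{f}\big)
+ \frac{1}{2}\frac{\partial^2}{\partial V_{f,i}\,\partial V_{f,j}}\big([B_f B_f^T]_{ij}\,p|^{L}_{f}\big).
\tag{460}
$$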
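The conditional expectation in the equation above shows that, as explained in Section 3, Eq. (71), when a contraction is made, information is lost. When two-way coupling is accounted for, the two-point (one fluid particle and one discrete particle) information is necessary since the dynamics of the fluid phase involve variables attached to the discrete particles. In the case of the particle transitional pdf, $p|^{L}_{p}$,

$$
\frac{\partial p|^{L}_{p}}{\partial t} + V_{p,i}\frac{\partial p|^{L}_{p}}{\partial y_{p,i}}
= -\frac{\partial}{\partial V_{p,i}}\big(A_{p,i}\,p|^{L}_{p}\big) - \frac{\partial}{\partial V_{s,i}}\Big(\big[A_{s,i} + \langle A_{p\to s,i}\,|\,\mathbf{y}_p,\mathbf{V}_p,\psi_p\rangle\big]\,p|^{L}_{p}\Big)
+ \frac{1}{2}\frac{\partial^2}{\partial V_{s,i}\,\partial V_{s,j}}\big([B_s B_s^T]_{ij}\,p|^{L}_{p}\big),
\tag{461}
$$

where $\psi_p \leftrightarrow (\mathbf{V}_s, \delta_p)$.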
8.5. Mean field equations

As mentioned earlier at the beginning of the section, the use of the two-point fluid–particle pdf allows an equal treatment of both phases and it is a compact way to derive a set of field equations. These field equations are often referred to as the ‘Eulerian model’ or sometimes the ‘two-fluid model’. Here, we would like to call it a two-field model: this term describes the spirit of the approach, which is to derive field equations for both phases using arguments from statistical physics. Now, let us discuss how such equations are derived. For the sake of simplicity, we specialize momentarily to the particular case of one-way coupling, i.e. $\mathbf{A}_{p\to f} = \mathbf{A}_{p\to s} = 0$ (for incompressible flows with particles of constant density but variable diameter). At the end of this subsection, the case of two-way coupling will be addressed, see Section 8.5.5.

Before we move to the derivation of the mean field equations, let us recall the pdfs and the mdfs (and their associated tools) that have been defined, see Fig. 42. The end of the road of the present formalism is to be able to derive field equations for both phases. A two-point fluid–particle Lagrangian pdf, $p^L_{fp}$ (extracted from the transitional pdf $p|^{L}_{fp}$), has been introduced and from it separate information on each phase was extracted in the form of the marginals of $p^L_{fp}$, that is $p^L_f$ for the continuous phase and $p^L_p$ for the discrete phase. Corresponding mass density functions were defined ($F^L_f$ and $F^L_p$) and for both of them correspondence with the field (Eulerian) description could be made (this crucial step is indicated with dashed arrows in Fig. 42). After this, we have found that each Eulerian mass density function, $F^E_f$ and $F^E_p$, is propagated by
the corresponding transitional pdf ($p|^{L}_{f}$ and $p|^{L}_{p}$, respectively). This allows us to write immediately the Fokker–Planck (partial differential) equation verified by $F^E_f$ and $F^E_p$ from the Fokker–Planck equations verified by the transitional pdfs $p|^{L}_{f}$ and $p|^{L}_{p}$, or from the Fokker–Planck equation verified by the transitional pdf $p|^{L}_{fp}$. Using an appropriate averaging operator for each phase, (mean) field equations can then be written, see Sections 8.5.3 and 8.5.4.

Fig. 42 shows clearly the level of information contained in the one-point particle pdf approach (hybrid approach, see Section 7), which is marked with the ∗ symbol. For the continuous phase, information is only available for the mean field quantities (at the Eulerian level in the form of the first two velocity moments extracted from the fluid mass density function, $F^E_f$) whereas for the discrete phase information is kept at the local instantaneous level (which is the discrete particle Lagrangian pdf, $p^L_p$).

There is another, yet equivalent, way to derive the mean field equations, Fig. 42. It is indeed possible to keep the joint (one fluid point–one particle point) information for the field description by treating the two-point fluid–particle Eulerian mass density function, $\mathcal{F}^E_{fp}$. This alternative procedure highlights the spirit of the derivation of the field equations for both phases. We study, as mentioned previously, the cases where $\mathbf{x}_f = \mathbf{x}$ for the fluid, and $\mathbf{x}_p = \mathbf{x}$ for the discrete phase, that is the following mass density functions

$$\mathcal{F}^E_{fp}(t;\mathbf{x}_f = \mathbf{x},\mathbf{x}_p,\mathbf{V}_f,\psi_f,\mathbf{V}_p,\psi_p), \qquad \mathcal{F}^E_{fp}(t;\mathbf{x}_f,\mathbf{x}_p = \mathbf{x},\mathbf{V}_f,\psi_f,\mathbf{V}_p,\psi_p). \tag{462}$$
As indicated in Fig. 42, by direct integration, the Fokker–Planck equations verified by the marginals $\mathcal{F}^E_f$ and $\mathcal{F}^E_p$ can be obtained from the Fokker–Planck equation verified by $\mathcal{F}^E_{fp}$ which is, in its turn, obtained from the partial differential equation verified by the transitional pdf $p|^{L}_{fp}$. The latter equations are also verified by $F^E_f$ and $F^E_p$.

8.5.1. Fluid and discrete particle expectations

In the present case (discrete particles of constant density but variable diameter in an incompressible flow) all information is contained in the distribution functions $p^E_p(t,\mathbf{x};\mathbf{V}_p,\psi_p)$ (with $\psi_p = (\mathbf{V}_s,\delta_p)$) for the discrete phase and $p^E_f(t,\mathbf{x};\mathbf{V}_f)$ for the fluid, but the derivation will be addressed in terms of the mass density function $F^E_p(t,\mathbf{x};\mathbf{V}_p,\psi_p) = \rho_p\,p^E_p(t,\mathbf{x};\mathbf{V}_p,\psi_p)$ for the discrete phase. The discrete particle expectations, i.e. the expectations of the variables which are attached to the discrete particles, are now defined. The operator $\langle\cdot\rangle$ could be indexed with a $p$ subscript, $\langle\cdot\rangle_p$, or an $f$ subscript, $\langle\cdot\rangle_f$, so that no confusion between the fluid expectations (the expectations of the variables which are attached to the fluid particles) and the discrete particle expectations would be possible. However, separate variables have been introduced for each kind of particle (for example $\mathbf{V}_p$ and $\mathbf{V}_s$), and such a distinction is not necessary. In the following equations, Eqs. (463) and (464), $(U_i, u_i)$ is used as a short-hand notation for $(U_{p,i}, u_{p,i})$ or $(U_{s,i}, u_{s,i})$. The same procedure is applied to sample space, where the subscripts $p$ and $s$ are momentarily dropped. The mean discrete particle velocity, or the mean fluid velocity
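seen, are defined by

$$\alpha_p(t,\mathbf{x})\,\rho_p\,\langle U_i\rangle(t,\mathbf{x}) = \int V_i\,F^E_p(t,\mathbf{x};\mathbf{V}_p,\psi_p)\,\mathrm{d}\mathbf{V}_p\,\mathrm{d}\psi_p \tag{463}$$

and the associated velocity moments of order $n$ are given by

$$\alpha_p(t,\mathbf{x})\,\rho_p\,\langle u_{i_1}\ldots u_{i_n}\rangle(t,\mathbf{x}) = \int \prod_{k=1}^{n} v_{i_k}\,F^E_p(t,\mathbf{x};\mathbf{V}_p,\psi_p)\,\mathrm{d}\mathbf{V}_p\,\mathrm{d}\psi_p, \tag{464}$$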
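where $i_k \in \{1,2,3\}\ \forall k$. The fluctuating component of the velocity of the discrete phase is defined by $u_{p,i} = U_{p,i} - \langle U_{p,i}\rangle$ and we have of course $\langle u_{p,i}\rangle = 0$. For the fluid velocity seen, we define $u_{s,i} = U_{s,i} - \langle U_{s,i}\rangle$ with $\langle u_{s,i}\rangle = 0$, which should not be confused with the decomposition of the fluid velocity, $U_{f,i} = \langle U_{f,i}\rangle + u_{f,i}$ with $\langle u_{f,i}\rangle = 0$. The fluid seen–particle velocity moments (of order $n+m$) can also be defined

$$\alpha_p(t,\mathbf{x})\,\rho_p\,\langle u_{s,i_1}\ldots u_{s,i_n}\,u_{p,j_1}\ldots u_{p,j_m}\rangle(t,\mathbf{x}) = \int \prod_{k=1}^{n} v_{s,i_k}\prod_{l=1}^{m} v_{p,j_l}\,F^E_p(t,\mathbf{x};\mathbf{V}_p,\psi_p)\,\mathrm{d}\mathbf{V}_p\,\mathrm{d}\psi_p, \tag{465}$$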
where $j_l \in \{1,2,3\}\ \forall l$. Note that Eq. (464), the definition of the velocity moment of order $n$, is actually a particular case of Eq. (465) (moments for the particle velocity or the fluid velocity seen can be obtained with $n = 0$ or $m = 0$, respectively). All expected values given so far represent moments for the whole population of particles. Additional information is necessary to describe how the discrete particle size distribution varies in time and space (for example if large particles gather in preferential locations and so on). This information can be obtained from $\langle d_p\rangle$ and $\langle (d'_p)^2\rangle$, that is the first (mean diameter) and second-order ($d'_p = d_p - \langle d_p\rangle$) moments, respectively (if a continuous approach is used, see the next subsection). The mean diameter is defined by

$$\alpha_p(t,\mathbf{x})\,\rho_p\,\langle d_p\rangle(t,\mathbf{x}) = \int \delta_p\,F^E_p(t,\mathbf{x};\mathbf{V}_p,\psi_p)\,\mathrm{d}\mathbf{V}_p\,\mathrm{d}\psi_p \tag{466}$$

and for the moments, a general definition is introduced, that is a moment of order $n+m+q$,

$$\alpha_p(t,\mathbf{x})\,\rho_p\,\langle (d'_p)^n\,u_{s,i_1}\ldots u_{s,i_m}\,u_{p,j_1}\ldots u_{p,j_q}\rangle(t,\mathbf{x}) = \int (\delta'_p)^n\prod_{k=1}^{m} v_{s,i_k}\prod_{l=1}^{q} v_{p,j_l}\,F^E_p(t,\mathbf{x};\mathbf{V}_p,\psi_p)\,\mathrm{d}\mathbf{V}_p\,\mathrm{d}\psi_p. \tag{467}$$
Eq. (467) is in fact the most general definition and it encompasses the previous ones, that is Eqs. (464) and (465).

The expectations of the variables which are attached to the fluid particles are now given. Similarly to the discrete phase, the mean fluid velocity is

$$\alpha_f(t,\mathbf{x})\,\langle U_{f,i}\rangle(t,\mathbf{x}) = \int V_{f,i}\,p^E_f(t,\mathbf{x};\mathbf{V}_f)\,\mathrm{d}\mathbf{V}_f \tag{468}$$
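and the velocity moments of order $n$ read

$$\alpha_f(t,\mathbf{x})\,\langle u_{f,i_1}\ldots u_{f,i_n}\rangle(t,\mathbf{x}) = \int \prod_{k=1}^{n} v_{f,i_k}\,p^E_f(t,\mathbf{x};\mathbf{V}_f)\,\mathrm{d}\mathbf{V}_f. \tag{469}$$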
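In a particle simulation, the expectations defined above are typically estimated from an ensemble of particles located in a small volume around $\mathbf{x}$. The sketch below is one possible discrete estimator, assuming that averages are taken as mass-weighted averages over the ensemble; the helper name `mass_weighted_mean` and all sample values are hypothetical.

```python
import numpy as np

def mass_weighted_mean(values, masses):
    """Mass-weighted average over the leading (particle) axis of `values`."""
    values = np.asarray(values, dtype=float)
    masses = np.asarray(masses, dtype=float)
    return np.tensordot(masses, values, axes=(0, 0)) / masses.sum()

rng = np.random.default_rng(1)
n = 5000
rho_p = 2500.0                                     # hypothetical particle density
d_p = rng.uniform(50e-6, 150e-6, n)                # hypothetical diameters
m = rho_p * np.pi * d_p**3 / 6.0                   # particle masses (spheres)
U_p = rng.normal([1.0, 0.0, 0.0], 0.2, size=(n, 3))
U_s = U_p + rng.normal(0.0, 0.1, size=(n, 3))      # velocity of the fluid seen

# First-order moments: <U_p>, <U_s>, <d_p>
mean_Up = mass_weighted_mean(U_p, m)
mean_Us = mass_weighted_mean(U_s, m)
mean_dp = mass_weighted_mean(d_p, m)

# Fluctuations and second-order moments, special cases of Eq. (467)
u_p, u_s, d_fl = U_p - mean_Up, U_s - mean_Us, d_p - mean_dp
R_sp = mass_weighted_mean(u_s[:, :, None] * u_p[:, None, :], m)  # <u_s,i u_p,j>
d_up = mass_weighted_mean(d_fl[:, None] * u_p, m)                # <d'_p u_p,i>
print(mean_Up, mean_dp, R_sp.shape, d_up.shape)
```

Because the masses grow like $d_p^3$, the largest particles dominate these averages, which is one reason why diameter–velocity correlations appear explicitly among the moments.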
8.5.2. Treatment of the particle size distribution

Here, we would like to clarify the treatment of the particle size distribution when it comes to the field equations of the discrete phase. It is now common, for hard spheres of different diameters, to encounter models where one derives field equations for each individual subclass of diameters (chosen arbitrarily or corresponding to the physical situation, for example in binary mixtures), see for example Mathiesen et al. [94] and Gourdel et al. [95]. It should be pointed out that this method, which is often referred to as the ‘multi-phase flow approach’ in the literature, should rather be called a discrete approach for polydispersed two-phase flows, since it refers to a distribution in size and not to the physical state of the phases (solid, liquid, gas). Indeed, this method is a discretization (in the diameter space) of the general formalism presented here, that is, in the sample space associated to the particle diameter, $\delta_p$, discretization is made in accordance with the desired description. For a subclass $k$ whose diameters are in the range $Q_k$, the following properties are defined (for each class of particles)

$$\alpha_{p,k}(t,\mathbf{x})\,\rho_p\,\langle\,\cdot\,\rangle_k(t,\mathbf{x}) = \int_{Q_k} (\,\cdot\,)\,F^E_p(t,\mathbf{x};\mathbf{V}_p,\psi_p)\,\mathrm{d}\mathbf{V}_p\,\mathrm{d}\psi_p, \tag{470}$$

where $\alpha_{p,k}(t,\mathbf{x})$ is the probability to find, at time $t$ and position $\mathbf{x}$, subclass $k$ in any state,

$$\alpha_{p,k}(t,\mathbf{x}) = \int_{Q_k} p^E_p(t,\mathbf{x};\mathbf{V}_p,\psi_p)\,\mathrm{d}\mathbf{V}_p\,\mathrm{d}\psi_p. \tag{471}$$

It is then obvious that $\sum_{k=1}^{C_k}\alpha_{p,k} = \alpha_p$, where $C_k$ is the total number of classes. This procedure might be adequate when a binary mixture is under consideration (it becomes, however, quite intricate when collisions are accounted for [95]), but when there is a wide distribution, the number of classes under consideration (which implies that the number of field equations increases tremendously) and the additional closures render the problem very difficult to treat in a practical way. A more logical approach (if one adopts the field approach at the expense of a more detailed description) is a continuous one, that is to write partial differential equations for the local moments of the diameter distribution. This approach is more in line with the general formalism presented here, but it will be seen shortly that one has to come up with other closures involving correlations between the diameter and the velocity fields. This highlights once again the superiority of the pdf approach.
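The discrete approach can be illustrated by binning an ensemble of particles into diameter classes. In this sketch $\alpha_{p,k}$ is estimated as the volume fraction occupied by class $k$ in a control volume, which is an assumption of the example, as are the function name, the class edges and the sample values.

```python
import numpy as np

def class_volume_fractions(d_p, edges, cell_volume):
    """Estimate alpha_{p,k} for diameter classes [edges[k], edges[k+1])."""
    d_p = np.asarray(d_p, dtype=float)
    edges = np.asarray(edges, dtype=float)
    if d_p.min() < edges[0] or d_p.max() >= edges[-1]:
        raise ValueError("class edges must bracket all diameters")
    volumes = np.pi * d_p**3 / 6.0                 # volume of each sphere
    k = np.digitize(d_p, edges) - 1                # class index of each particle
    return np.bincount(k, weights=volumes, minlength=len(edges) - 1) / cell_volume

rng = np.random.default_rng(3)
d_p = rng.uniform(50e-6, 150e-6, 2000)             # hypothetical wide size distribution
cell_volume = 1.0e-6                               # hypothetical control volume (m^3)
edges = np.array([50e-6, 75e-6, 100e-6, 125e-6, 150e-6 + 1e-12])

alpha_k = class_volume_fractions(d_p, edges, cell_volume)
alpha_p = (np.pi * d_p**3 / 6.0).sum() / cell_volume
print(alpha_k, alpha_k.sum(), alpha_p)             # the class fractions sum to alpha_p
```

Each class would carry its own set of field equations, so refining the edges to resolve a wide distribution multiplies the number of equations accordingly.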
8.5.3. Field equations for the discrete phase

Here, we wish to derive the field equations for the discrete particles (the partial differential equations for the expected values of the variables attached to a discrete particle) for the following quantities: the mean discrete particle velocity $\langle U_{p,i}\rangle$, the second-order velocity moment for discrete particles $\langle u_{p,i}u_{p,j}\rangle$, the mean of the fluid velocity seen $\langle U_{s,i}\rangle$, the second-order velocity moment for the fluid seen $\langle u_{s,i}u_{s,j}\rangle$, the fluid seen–discrete particle velocity correlation
tensor, $\langle u_{s,i}u_{p,j}\rangle$, the mean diameter $\langle d_p\rangle$, the diameter–fluid velocity seen correlation vector, $\langle d'_p u_{s,i}\rangle$, the diameter–particle velocity correlation vector, $\langle d'_p u_{p,i}\rangle$, and the diameter second-order moment $\langle (d'_p)^2\rangle$. Note that the procedure which is developed now allows any moment to be calculated, but as the present task is to derive an Eulerian-like model, closure is performed at the second-order level.

In order to obtain the mean field equations a standard procedure is used, in analogy with the derivations which can be found in kinetic theory, Chapman and Cowling [20] and Liboff [21] (this procedure was used in Section 6). Let us define the expectation of a given function $H_p(\mathbf{V}_p,\psi_p)$ as

$$\alpha_p(t,\mathbf{x})\,\rho_p\,\langle H_p\rangle(t,\mathbf{x}) = \int H_p(\mathbf{V}_p,\psi_p)\,F^E_p(t,\mathbf{x};\mathbf{V}_p,\psi_p)\,\mathrm{d}\mathbf{V}_p\,\mathrm{d}\psi_p, \tag{472}$$

where $H_p(\mathbf{V}_p,\psi_p)$ is a scalar (a component of a tensor of order $n+m+q$ as explained in Eq. (467)). Using Eqs. (432) and (461), it is straightforward to prove that $F^E_p(t,\mathbf{x};\mathbf{V}_p,\mathbf{V}_s,\delta_p)$ verifies the following partial differential equation,

$$
\frac{\partial F^E_p}{\partial t} + V_{p,i}\frac{\partial F^E_p}{\partial x_i}
= -\frac{\partial}{\partial V_{p,i}}\big(A_{p,i}\,F^E_p\big) - \frac{\partial}{\partial V_{s,i}}\big(A_{s,i}\,F^E_p\big) + \frac{1}{2}\frac{\partial^2}{\partial V_{s,i}\,\partial V_{s,j}}\big((B_s B_s^T)_{ij}\,F^E_p\big).
\tag{473}
$$
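Let us multiply Eq. (473) by $H_p$ and apply the $\langle\cdot\rangle$ operator, Eq. (472). Expressions similar to the one in Eq. (246) of Section 6 are obtained and we suppose that all of them converge to zero in the limit $V_{s,i} \to \pm\infty$ and $V_{p,i} \to \pm\infty$. Assuming that all generalized integrals converge and that uniform convergence is verified, after some derivations, one can write

$$
\frac{\partial}{\partial t}\big(\alpha_p\rho_p\langle H_p\rangle\big) + \frac{\partial}{\partial x_i}\big(\alpha_p\rho_p\langle V_{p,i}H_p\rangle\big)
= \alpha_p\rho_p\left\langle A_{p,i}\frac{\partial H_p}{\partial V_{p,i}}\right\rangle + \alpha_p\rho_p\left\langle A_{s,i}\frac{\partial H_p}{\partial V_{s,i}}\right\rangle + \frac{1}{2}\alpha_p\rho_p\left\langle (B_s B_s^T)_{ij}\frac{\partial^2 H_p}{\partial V_{s,i}\,\partial V_{s,j}}\right\rangle.
\tag{474}
$$

The partial differential equations for the specified discrete particle expectations can now be derived, simply by choosing the right function for $H_p$. For $H_p = 1$, the continuity equation is obtained,

$$\frac{\partial}{\partial t}(\alpha_p\rho_p) + \frac{\partial}{\partial x_i}\big(\alpha_p\rho_p\langle U_{p,i}\rangle\big) = 0. \tag{475}$$

With $H_p = V_{p,i}$, the momentum equation for the discrete phase reads (where the Eulerian derivative along the path of a discrete particle is denoted $\mathrm{d}/\mathrm{d}t$, with $\mathrm{d}/\mathrm{d}t = \partial/\partial t + \langle U_{p,m}\rangle\,\partial/\partial x_m$)

$$\alpha_p\rho_p\frac{\mathrm{d}}{\mathrm{d}t}\langle U_{p,i}\rangle = -\frac{\partial}{\partial x_j}\big(\alpha_p\rho_p\langle u_{p,i}u_{p,j}\rangle\big) + \alpha_p\rho_p\langle A_{p,i}\rangle. \tag{476}$$

The partial differential equation of the expected fluid velocity seen, $\langle U_{s,i}\rangle$, is derived with $H_p = V_{s,i}$,

$$\alpha_p\rho_p\frac{\mathrm{d}}{\mathrm{d}t}\langle U_{s,i}\rangle = -\frac{\partial}{\partial x_j}\big(\alpha_p\rho_p\langle u_{s,i}u_{p,j}\rangle\big) + \alpha_p\rho_p\langle A_{s,i}\rangle \tag{477}$$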
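and with $H_p = \delta_p$, the partial differential equation for the mean diameter reads

$$\alpha_p\rho_p\frac{\mathrm{d}}{\mathrm{d}t}\langle d_p\rangle = -\frac{\partial}{\partial x_i}\big(\alpha_p\rho_p\langle d'_p u_{p,i}\rangle\big). \tag{478}$$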
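As explained in Section 6, the partial differential equations verified by the second-order moments ($n+m+q = 2$ in Eq. (467)) cannot be obtained directly from the procedure presented above. As in Section 6, a change of coordinates in sample space is introduced, $\mathbf{v}_p = \mathbf{V}_p - \langle\mathbf{U}_p\rangle(t,\mathbf{x})$, $\mathbf{v}_s = \mathbf{V}_s - \langle\mathbf{U}_s\rangle(t,\mathbf{x})$ and $\delta'_p = \delta_p - \langle d_p\rangle(t,\mathbf{x})$. It is straightforward to prove that the partial differential equation verified by $F^E_p(t,\mathbf{x};\mathbf{v}_p,\mathbf{v}_s,\delta'_p)$ reads

$$
\begin{aligned}
\frac{\mathrm{d}F^E_p}{\mathrm{d}t} + v_{p,i}\frac{\partial F^E_p}{\partial x_i}
={}& -\frac{\partial}{\partial v_{p,i}}\big(A_{p,i}\,F^E_p\big) - \frac{\partial}{\partial v_{s,i}}\big(A_{s,i}\,F^E_p\big) + \frac{1}{2}\frac{\partial^2}{\partial v_{s,i}\,\partial v_{s,j}}\big((B_s B_s^T)_{ij}\,F^E_p\big) \\
& + \frac{\mathrm{d}\langle U_{p,i}\rangle}{\mathrm{d}t}\frac{\partial F^E_p}{\partial v_{p,i}} + \frac{\mathrm{d}\langle U_{s,i}\rangle}{\mathrm{d}t}\frac{\partial F^E_p}{\partial v_{s,i}} + \frac{\mathrm{d}\langle d_p\rangle}{\mathrm{d}t}\frac{\partial F^E_p}{\partial \delta'_p} \\
& + v_{p,i}\frac{\partial\langle U_{p,j}\rangle}{\partial x_i}\frac{\partial F^E_p}{\partial v_{p,j}} + v_{p,i}\frac{\partial\langle U_{s,j}\rangle}{\partial x_i}\frac{\partial F^E_p}{\partial v_{s,j}} + v_{p,i}\frac{\partial\langle d_p\rangle}{\partial x_i}\frac{\partial F^E_p}{\partial \delta'_p}.
\end{aligned}
\tag{479}
$$

With $F^E_p(t,\mathbf{x};\mathbf{v}_p,\mathbf{v}_s,\delta'_p)\,\mathrm{d}\mathbf{v}_p\,\mathrm{d}\mathbf{v}_s\,\mathrm{d}\delta'_p = F^E_p(t,\mathbf{x};\mathbf{V}_p,\psi_p)\,\mathrm{d}\mathbf{V}_p\,\mathrm{d}\psi_p$, because we have

$$\left|\frac{\partial(\mathbf{V}_f,\mathbf{V}_p,\mathbf{V}_s,\delta_p)}{\partial(\mathbf{v}_f,\mathbf{v}_p,\mathbf{v}_s,\delta'_p)}\right| = 1, \tag{480}$$

the moments of order $n+m+q$ can also be defined with $F^E_p(t,\mathbf{x};\mathbf{v}_p,\mathbf{v}_s,\delta'_p)$, see Eq. (472). The partial differential equation for a function $H_p(\mathbf{v}_p,\mathbf{v}_s,\delta'_p)$ is derived in the same fashion as for Eq. (474). Expressions similar to the one displayed in Eq. (246) of Section 6 are then obtained and we suppose that all of them converge to zero in the limit $v_{s,i} \to \pm\infty$, $v_{p,i} \to \pm\infty$ and $\delta'_p \to \pm\infty$. Once again, assuming that all generalized integrals converge and that uniform convergence is verified, after some derivations, the partial differential equation verified by $\langle H_p(\mathbf{v}_p,\mathbf{v}_s,\delta'_p)\rangle$ becomes

$$
\begin{aligned}
\frac{\mathrm{d}}{\mathrm{d}t}\big(\alpha_p\rho_p\langle H_p\rangle\big) + \frac{\partial}{\partial x_i}\big(\alpha_p\rho_p\langle v_{p,i}H_p\rangle\big)
={}& \alpha_p\rho_p\left\langle A_{p,i}\frac{\partial H_p}{\partial v_{p,i}}\right\rangle + \alpha_p\rho_p\left\langle A_{s,i}\frac{\partial H_p}{\partial v_{s,i}}\right\rangle + \frac{1}{2}\alpha_p\rho_p\left\langle (B_s B_s^T)_{ij}\frac{\partial^2 H_p}{\partial v_{s,i}\,\partial v_{s,j}}\right\rangle \\
& - \alpha_p\rho_p\frac{\mathrm{d}\langle U_{p,i}\rangle}{\mathrm{d}t}\left\langle\frac{\partial H_p}{\partial v_{p,i}}\right\rangle - \alpha_p\rho_p\frac{\mathrm{d}\langle U_{s,i}\rangle}{\mathrm{d}t}\left\langle\frac{\partial H_p}{\partial v_{s,i}}\right\rangle - \alpha_p\rho_p\frac{\mathrm{d}\langle d_p\rangle}{\mathrm{d}t}\left\langle\frac{\partial H_p}{\partial \delta'_p}\right\rangle \\
& - \alpha_p\rho_p\frac{\partial\langle U_{p,j}\rangle}{\partial x_i}\left\langle\frac{\partial(v_{p,i}H_p)}{\partial v_{p,j}}\right\rangle - \alpha_p\rho_p\frac{\partial\langle U_{s,j}\rangle}{\partial x_i}\left\langle\frac{\partial(v_{p,i}H_p)}{\partial v_{s,j}}\right\rangle - \alpha_p\rho_p\frac{\partial\langle d_p\rangle}{\partial x_i}\left\langle\frac{\partial(v_{p,i}H_p)}{\partial \delta'_p}\right\rangle.
\end{aligned}
\tag{481}
$$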
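The partial differential equations for the velocity moments of order 2 can now be obtained. Inserting $H_p = v_{p,i}v_{p,j}$, $H_p = v_{s,i}v_{p,j}$ and $H_p = v_{s,i}v_{s,j}$ in Eq. (481), the partial differential equations verified by the second-order velocity moment for the discrete particles $\langle u_{p,i}u_{p,j}\rangle$, by the fluid seen–discrete particle velocity correlation tensor $\langle u_{s,i}u_{p,j}\rangle$ and by the second-order velocity moment of the fluid seen $\langle u_{s,i}u_{s,j}\rangle$ can be derived. After some algebra, one finds for $\langle u_{p,i}u_{p,j}\rangle$

$$
\begin{aligned}
\alpha_p\rho_p\frac{\mathrm{d}}{\mathrm{d}t}\langle u_{p,i}u_{p,j}\rangle
={}& -\frac{\partial}{\partial x_k}\big(\alpha_p\rho_p\langle u_{p,i}u_{p,j}u_{p,k}\rangle\big) - \alpha_p\rho_p\langle u_{p,i}u_{p,k}\rangle\frac{\partial\langle U_{p,j}\rangle}{\partial x_k} \\
& - \alpha_p\rho_p\langle u_{p,j}u_{p,k}\rangle\frac{\partial\langle U_{p,i}\rangle}{\partial x_k} + \alpha_p\rho_p\langle A_{p,i}v_{p,j} + A_{p,j}v_{p,i}\rangle,
\end{aligned}
\tag{482}
$$

for $\langle u_{s,i}u_{p,j}\rangle$

$$
\begin{aligned}
\alpha_p\rho_p\frac{\mathrm{d}}{\mathrm{d}t}\langle u_{s,i}u_{p,j}\rangle
={}& -\frac{\partial}{\partial x_k}\big(\alpha_p\rho_p\langle u_{s,i}u_{p,j}u_{p,k}\rangle\big) - \alpha_p\rho_p\langle u_{s,i}u_{p,k}\rangle\frac{\partial\langle U_{p,j}\rangle}{\partial x_k} \\
& - \alpha_p\rho_p\langle u_{p,j}u_{p,k}\rangle\frac{\partial\langle U_{s,i}\rangle}{\partial x_k} + \alpha_p\rho_p\langle A_{s,i}v_{p,j}\rangle + \alpha_p\rho_p\langle A_{p,j}v_{s,i}\rangle,
\end{aligned}
\tag{483}
$$

and for $\langle u_{s,i}u_{s,j}\rangle$ (the transport and mean-gradient terms involve $u_{p,k}$, as follows from the terms in $v_{p,i}H_p$ of Eq. (481))

$$
\begin{aligned}
\alpha_p\rho_p\frac{\mathrm{d}}{\mathrm{d}t}\langle u_{s,i}u_{s,j}\rangle
={}& -\frac{\partial}{\partial x_k}\big(\alpha_p\rho_p\langle u_{s,i}u_{s,j}u_{p,k}\rangle\big) - \alpha_p\rho_p\langle u_{s,i}u_{p,k}\rangle\frac{\partial\langle U_{s,j}\rangle}{\partial x_k} \\
& - \alpha_p\rho_p\langle u_{s,j}u_{p,k}\rangle\frac{\partial\langle U_{s,i}\rangle}{\partial x_k} + \alpha_p\rho_p\langle A_{s,j}v_{s,i} + A_{s,i}v_{s,j}\rangle + \alpha_p\rho_p\langle (B_s B_s^T)_{ij}\rangle.
\end{aligned}
\tag{484}
$$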
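As a short worked step, the diffusion contribution to the equation for $\langle u_{s,i}u_{s,j}\rangle$ can be checked directly from the second-derivative term of Eq. (481) with $H_p = v_{s,i}v_{s,j}$. The dummy indices are renamed $k, l$ here to avoid a clash with the free indices $i, j$; the only property used is the symmetry of $B_s B_s^T$.

```latex
% Diffusion term of Eq. (481) for H_p = v_{s,i} v_{s,j}
\frac{\partial^2 (v_{s,i} v_{s,j})}{\partial v_{s,k}\,\partial v_{s,l}}
  = \delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk},
\qquad
\frac{1}{2}\,\alpha_p\rho_p
  \Big\langle (B_s B_s^T)_{kl}\,\big(\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}\big) \Big\rangle
  = \frac{1}{2}\,\alpha_p\rho_p
    \big\langle (B_s B_s^T)_{ij} + (B_s B_s^T)_{ji} \big\rangle
  = \alpha_p\rho_p \big\langle (B_s B_s^T)_{ij} \big\rangle .
```

This explains why the factor $1/2$ of the Fokker–Planck equation no longer appears in the second-order moment equation. The same term vanishes for $H_p = v_{p,i}v_{p,j}$ and $H_p = v_{s,i}v_{p,j}$, whose second derivatives with respect to $\mathbf{v}_s$ are zero.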
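By replacing $H_p$ by $H_p = \delta'_p v_{p,i}$, $H_p = \delta'_p v_{s,i}$ and $H_p = (\delta'_p)^2$ in Eq. (481), the partial differential equations verified by the second-order discrete particle velocity–diameter moment $\langle d'_p u_{p,i}\rangle$, by the second-order fluid velocity seen–diameter moment $\langle d'_p u_{s,i}\rangle$ and by the second-order diameter moment can be written. After some calculus, one finds for $\langle d'_p u_{p,i}\rangle$

$$
\begin{aligned}
\alpha_p\rho_p\frac{\mathrm{d}}{\mathrm{d}t}\langle d'_p u_{p,i}\rangle
={}& -\frac{\partial}{\partial x_j}\big(\alpha_p\rho_p\langle d'_p u_{p,i}u_{p,j}\rangle\big) - \alpha_p\rho_p\langle d'_p u_{p,j}\rangle\frac{\partial\langle U_{p,i}\rangle}{\partial x_j} \\
& - \alpha_p\rho_p\langle u_{p,i}u_{p,j}\rangle\frac{\partial\langle d_p\rangle}{\partial x_j} + \alpha_p\rho_p\langle A_{p,i}\,d'_p\rangle,
\end{aligned}
\tag{485}
$$

for $\langle d'_p u_{s,i}\rangle$

$$
\begin{aligned}
\alpha_p\rho_p\frac{\mathrm{d}}{\mathrm{d}t}\langle d'_p u_{s,i}\rangle
={}& -\frac{\partial}{\partial x_j}\big(\alpha_p\rho_p\langle d'_p u_{s,i}u_{p,j}\rangle\big) - \alpha_p\rho_p\langle d'_p u_{p,j}\rangle\frac{\partial\langle U_{s,i}\rangle}{\partial x_j} \\
& - \alpha_p\rho_p\langle u_{s,i}u_{p,j}\rangle\frac{\partial\langle d_p\rangle}{\partial x_j} + \alpha_p\rho_p\langle A_{s,i}\,d'_p\rangle,
\end{aligned}
\tag{486}
$$

and for $\langle (d'_p)^2\rangle$

$$
\alpha_p\rho_p\frac{\mathrm{d}}{\mathrm{d}t}\langle (d'_p)^2\rangle = -\frac{\partial}{\partial x_i}\big(\alpha_p\rho_p\langle (d'_p)^2 u_{p,i}\rangle\big) - 2\alpha_p\rho_p\langle d'_p u_{p,i}\rangle\frac{\partial\langle d_p\rangle}{\partial x_i}.
\tag{487}
$$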
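8.5.4. Field equations for the fluid phase

Here, the procedure is identical to the one developed above for the discrete phase. Equations for the mean fluid velocity $\langle U_{f,i}\rangle$ and the second-order velocity moment $\langle u_{f,i}u_{f,j}\rangle$ are written. The expected value of a function $H_f(\mathbf{V}_f)$ is defined by

$$\alpha_f(t,\mathbf{x})\,\langle H_f\rangle(t,\mathbf{x}) = \int H_f(\mathbf{V}_f)\,p^E_f(t,\mathbf{x};\mathbf{V}_f)\,\mathrm{d}\mathbf{V}_f. \tag{488}$$

With Eqs. (423) and (460), it is straightforward to write the Fokker–Planck equation verified by $F^E_f(t,\mathbf{x};\mathbf{V}_f)$

$$
\frac{\partial F^E_f}{\partial t} + V_{f,i}\frac{\partial F^E_f}{\partial x_i}
= -\frac{\partial}{\partial V_{f,i}}\big(A_{f,i}\,F^E_f\big) + \frac{1}{2}\frac{\partial^2}{\partial V_{f,i}\,\partial V_{f,j}}\big([B_f B_f^T]_{ij}\,F^E_f\big).
\tag{489}
$$

In the case of constant fluid density, $\rho_f$, the same equation is verified by $p^E_f(t,\mathbf{x};\mathbf{V}_f)$. As explained before, it is more convenient to make a change of coordinates in velocity space, $\mathbf{v}_f = \mathbf{V}_f - \langle\mathbf{U}_f\rangle(t,\mathbf{x})$, and it is straightforward to prove that the Fokker–Planck equation verified by $p^E_f(t,\mathbf{x};\mathbf{v}_f)$ reads

$$
\begin{aligned}
\frac{\mathrm{d}p^E_f}{\mathrm{d}t} + v_{f,i}\frac{\partial p^E_f}{\partial x_i}
={}& -\frac{\partial}{\partial v_{f,i}}\big(A_{f,i}\,p^E_f\big) + \frac{1}{2}\frac{\partial^2}{\partial v_{f,i}\,\partial v_{f,j}}\big((B_f B_f^T)_{ij}\,p^E_f\big) \\
& + \frac{\mathrm{d}\langle U_{f,i}\rangle}{\mathrm{d}t}\frac{\partial p^E_f}{\partial v_{f,i}} + v_{f,i}\frac{\partial\langle U_{f,j}\rangle}{\partial x_i}\frac{\partial p^E_f}{\partial v_{f,j}},
\end{aligned}
\tag{490}
$$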
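where $\mathrm{d}/\mathrm{d}t = \partial/\partial t + \langle U_{f,i}\rangle\,\partial/\partial x_i$ is the Eulerian derivative along the path of a fluid particle. Let us multiply Eq. (490) by $H_f$ and apply the $\langle\cdot\rangle$ operator, Eq. (488). Expressions similar to the ones in Eq. (246) of Section 6 are obtained and we suppose that all of them converge to zero in the limit $v_{f,i} \to \pm\infty$. Assuming that all generalized integrals converge and that uniform convergence is verified, after some derivations, one finds

$$
\begin{aligned}
\frac{\mathrm{d}}{\mathrm{d}t}\big(\alpha_f\langle H_f\rangle\big) + \frac{\partial}{\partial x_i}\big(\alpha_f\langle v_{f,i}H_f\rangle\big)
={}& \alpha_f\left\langle A_{f,i}\frac{\partial H_f}{\partial v_{f,i}}\right\rangle + \frac{1}{2}\alpha_f\left\langle (B_f B_f^T)_{ij}\frac{\partial^2 H_f}{\partial v_{f,i}\,\partial v_{f,j}}\right\rangle \\
& - \alpha_f\frac{\mathrm{d}\langle U_{f,i}\rangle}{\mathrm{d}t}\left\langle\frac{\partial H_f}{\partial v_{f,i}}\right\rangle - \alpha_f\frac{\partial\langle U_{f,j}\rangle}{\partial x_i}\left\langle\frac{\partial(v_{f,i}H_f)}{\partial v_{f,j}}\right\rangle.
\end{aligned}
\tag{491}
$$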
By replacing $H_f$ by $H_f = 1$, $H_f = V_{f,i}$ and $H_f = v_{f,i}v_{f,j}$, the continuity equation, the momentum equations and the Reynolds stress equations are obtained, respectively. Multiplying these equations by $\rho_f$ yields, for the continuity equation,

$$\frac{\partial}{\partial t}(\alpha_f\rho_f) + \frac{\partial}{\partial x_i}\big(\alpha_f\rho_f\langle U_{f,i}\rangle\big) = 0, \tag{492}$$
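for the momentum equation

$$\alpha_f\rho_f\frac{\mathrm{d}}{\mathrm{d}t}\langle U_{f,i}\rangle = -\frac{\partial}{\partial x_j}\big(\alpha_f\rho_f\langle u_{f,i}u_{f,j}\rangle\big) + \alpha_f\rho_f\langle A_{f,i}\rangle \tag{493}$$

and for the Reynolds stress equations

$$
\begin{aligned}
\alpha_f\rho_f\frac{\mathrm{d}}{\mathrm{d}t}\langle u_{f,i}u_{f,j}\rangle
={}& -\frac{\partial}{\partial x_k}\big(\alpha_f\rho_f\langle u_{f,i}u_{f,j}u_{f,k}\rangle\big) - \alpha_f\rho_f\langle u_{f,i}u_{f,k}\rangle\frac{\partial\langle U_{f,j}\rangle}{\partial x_k} \\
& - \alpha_f\rho_f\langle u_{f,j}u_{f,k}\rangle\frac{\partial\langle U_{f,i}\rangle}{\partial x_k} + \alpha_f\rho_f\langle A_{f,i}v_{f,j} + A_{f,j}v_{f,i}\rangle + \alpha_f\rho_f\langle (B_f B_f^T)_{ij}\rangle.
\end{aligned}
\tag{494}
$$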
8.5.5. Two-way coupling

In the preceding sections on the derivation of mean field equations, we have limited ourselves to one-way coupling for the sake of simplicity and in order to present the methodology that leads from the pdf equations to the mean field equations without handling too many terms. We now consider the extension of the previous results when $A_{p\to f,i} \neq 0$ and $A_{p\to s,i} \neq 0$. With the inclusion of the reverse force due to the discrete particles on the fluid, the complete pdf equation satisfied by $F^E_f(t,\mathbf{x};\mathbf{V}_f)$ becomes

$$
\frac{\partial F^E_f}{\partial t} + V_{f,i}\frac{\partial F^E_f}{\partial x_i}
= -\frac{\partial}{\partial V_{f,i}}\big(A_{f,i}\,F^E_f\big) + \frac{1}{2}\frac{\partial^2}{\partial V_{f,i}\,\partial V_{f,j}}\big([B_f B_f^T]_{ij}\,F^E_f\big) - \frac{\partial}{\partial V_{f,i}}\big(\langle A_{p\to f,i}\,|\,\mathbf{x},\mathbf{V}_f\rangle\,F^E_f\big).
\tag{495}
$$

Applying the classical procedure (to obtain mean field equations) presented in Sections 8.5.3 and 8.5.4, the mean field equations involve additional terms which have the general form

$$\int \langle A_{p\to f,i}\,|\,\mathbf{x},\mathbf{V}_f\rangle\,\frac{\partial H_f}{\partial V_{f,i}}\,F^E_f(t,\mathbf{x};\mathbf{V}_f)\,\mathrm{d}\mathbf{V}_f \tag{496}$$

or similar expressions when we deal with $F^E_f(t,\mathbf{x};\mathbf{v}_f)$ for the fluctuating velocities. This extra term implies that all mean field equations related to the fluid are modified. We define the following operator, $L$, where $L(\,\cdot\,) = 0$ represents all partial differential equations, in one-way coupling, of the moments related to the continuous phase. By inserting $H_f = V_{f,i}$ and $H_f = V_{f,i}V_{f,j}$ the modified mean field equations are obtained. The fluid continuity equation, Eq. (492), is not modified. The mean momentum equation, Eq. (493), is supplemented by a term $I^M_{pf}$,

$$I^M_{pf,i} = \int \langle A_{p\to f,i}\,|\,\mathbf{x},\mathbf{V}_f\rangle\,F^E_f(t,\mathbf{x};\mathbf{V}_f)\,\mathrm{d}\mathbf{V}_f. \tag{497}$$

With the model retained for $\mathbf{A}_{p\to f}$, which is described in Section 8.4.2, this term can be expressed by

$$I^M_{pf,i} = \frac{\alpha_p}{\alpha_f}\int \langle \Pi_{p,i}\,|\,\mathbf{V}_f\rangle\,F^E_f(t,\mathbf{x};\mathbf{V}_f)\,\mathrm{d}\mathbf{V}_f. \tag{498}$$

Indeed, the random force $A_{p\to f,i}$ is only applicable when there is a fluid particle at location $\mathbf{x}$ and, conditioned on the fact that there is indeed a fluid element, its probability of being non-zero
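is given by $\alpha_p(t,\mathbf{x})$, cf. Section 8.4.2. This explains the ratio $\alpha_p/\alpha_f$ in the above equation, which expresses the probability to have a random force conditioned on the fact that a fluid particle is found at $\mathbf{x}$. The random acceleration term due to the reverse action of particles is correlated with the fluid velocity, and from the definition of the conditional expectation, $I^M_{pf,i}$ can be written as

$$I^M_{pf,i} = \frac{\alpha_p}{\alpha_f}\int \pi_{p,i}\,\frac{p(\boldsymbol{\pi}_p,\mathbf{V}_f)}{p(\mathbf{V}_f)}\,F^E_f(t,\mathbf{x};\mathbf{V}_f)\,\mathrm{d}\mathbf{V}_f\,\mathrm{d}\boldsymbol{\pi}_p, \tag{499}$$

where $\boldsymbol{\pi}_p$ is the sample-space value for the random variable $\boldsymbol{\Pi}_p$ at the time $t$ and the location $\mathbf{x}$ considered. In this equation, $p(\boldsymbol{\pi}_p,\mathbf{V}_f)$ represents the joint pdf of $\boldsymbol{\Pi}_p$ and $\mathbf{U}_f$, and here $p(\mathbf{V}_f)$ denotes the normalized pdf of $\mathbf{U}_f$ at location $\mathbf{x}$ since we are already considering the subset of events when there is a fluid element at $\mathbf{x}$. From the definition of $\alpha_f(t,\mathbf{x})$ and the relations given in Section 8.2.7, we have

$$p(\mathbf{V}_f) = \frac{F^E_f(t,\mathbf{x};\mathbf{V}_f)}{\alpha_f\rho_f}. \tag{500}$$

Then, by performing the integration over all possible values of $V_{f,i}$, we obtain with the expression of the mean value of the random force in Eq. (458a)

$$I^M_{pf,i} = \alpha_f\rho_f\,\frac{\alpha_p}{\alpha_f}\,\langle\Pi_{p,i}\rangle = \alpha_p\rho_p\left\langle\frac{U_{p,i} - U_{s,i}}{\tau_p}\right\rangle. \tag{501}$$

The additional term which enters the equation for the Reynolds stress components, Eq. (494), can be obtained from the procedure followed in Section 8.5.4, where pdfs for the fluctuating components are directly manipulated, or by writing first the equation for the total energy $\langle U_{f,i}U_{f,j}\rangle$, from which the one for $\langle u_{f,i}u_{f,j}\rangle$ is easily derived. The new term in the second-order equation can be written as $I^E_{pf,ij} + I^E_{pf,ji}$, where $I^E_{pf,ij}$ is defined as

$$I^E_{pf,ij} = \int \langle A_{p\to f,i}\,|\,\mathbf{x},\mathbf{V}_f\rangle\,V_{f,j}\,F^E_f(t,\mathbf{x};\mathbf{V}_f)\,\mathrm{d}\mathbf{V}_f. \tag{502}$$

Applying the same reasoning outlined just above for the modified momentum equation, this term can be written as

$$
\begin{aligned}
I^E_{pf,ij} &= \frac{\alpha_p}{\alpha_f}\int \langle \Pi_{p,i}\,|\,\mathbf{V}_f\rangle\,V_{f,j}\,F^E_f(t,\mathbf{x};\mathbf{V}_f)\,\mathrm{d}\mathbf{V}_f \\
&= \alpha_f\rho_f\,\frac{\alpha_p}{\alpha_f}\int \pi_{p,i}\,V_{f,j}\,p(\boldsymbol{\pi}_p,\mathbf{V}_f)\,\mathrm{d}\mathbf{V}_f\,\mathrm{d}\boldsymbol{\pi}_p
\end{aligned}
\tag{503}
$$

and from the characteristic properties of the random term $\boldsymbol{\Pi}_p$ given in Eq. (458b), we have

$$I^E_{pf,ij} = \alpha_p\rho_p\left\langle\frac{U_{s,j}\,(U_{p,i} - U_{s,i})}{\tau_p}\right\rangle. \tag{504}$$

The additional term in the Reynolds stress equation for the fluid phase, $I^R_{pf,ij}$, is then easily deduced from $I^E_{pf,ij}$ and has the form

$$
I^R_{pf,ij} = \alpha_p\rho_p\langle A^D_{p,i}u_{s,j} + A^D_{p,j}u_{s,i}\rangle - \alpha_p\rho_p\big(\langle U_{f,i}\rangle - \langle U_{s,i}\rangle\big)\langle A^D_{p,j}\rangle - \alpha_p\rho_p\big(\langle U_{f,j}\rangle - \langle U_{s,j}\rangle\big)\langle A^D_{p,i}\rangle.
\tag{505}
$$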
The expressions obtained for the reverse effect of particles on the fluid can be shown to correspond to direct estimations that can be performed in simple cases. For example, if we assume spatial homogeneity in a volume of fluid, say $\mathcal{V}_f$, which contains $N_p$ discrete particles of volume $\mathcal{V}_p$, then we can simply write in the balance law for the volume of fluid $\mathcal{V}_f$ the total force due to the $N_p$ particles as

$$\mathbf{F}_{p\to f} = \sum_{n=1}^{N_p}\rho_p\,\mathcal{V}_p\,\mathbf{A}_p^D, \tag{506}$$

where $\mathbf{A}_p^D$ is the drag term given by Eq. (454). Then, we obtain

$$\mathbf{F}_{p\to f} = \rho_p\left(\sum_{n=1}^{N_p}\mathcal{V}_p\right)\frac{\sum_{n=1}^{N_p}\mathcal{V}_p\,\mathbf{A}_p^D}{\sum_{n=1}^{N_p}\mathcal{V}_p} \tag{507}$$

and the mean momentum exchange is, using $\alpha_p \simeq \big(\sum_{n=1}^{N_p}\mathcal{V}_p\big)/\mathcal{V}_f$ and for $N_p$ large enough,

$$\mathbf{F}_{p\to f} = \alpha_p\rho_p\,\mathcal{V}_f\,\langle\mathbf{A}_p^D\rangle, \tag{508}$$

since the discrete form of an average is to be understood as the mass-weighted average, see Section 8.2.8. It is thus seen that the total force per unit of volume of fluid is identical to the general expression written above for $I^M_{pf}$.

We can now write the two sets of mean field equations, corresponding to the mean momentum and mean Reynolds stress equations for the fluid phase, which include the effects due to the particles as

$$L(\langle U_{f,i}\rangle) + I^M_{pf,i} = 0, \tag{509}$$

$$L(\langle u_{f,i}u_{f,j}\rangle) + I^R_{pf,ij} = 0. \tag{510}$$
The reverse force acting on the fluid equations due to the discrete particles, $\mathbf{A}_{p\to s}$, also enters the stochastic equation for the velocity of the fluid seen $\mathbf{U}_s$ in Eq. (453). Therefore the complete pdf equation satisfied by $F^E_p(t,\mathbf{x};\mathbf{V}_p,\mathbf{V}_s,\delta_p)$ now has the form

$$
\begin{aligned}
\frac{\partial F^E_p}{\partial t} + V_{p,i}\frac{\partial F^E_p}{\partial x_i}
={}& -\frac{\partial}{\partial V_{p,i}}\big(A_{p,i}\,F^E_p\big) - \frac{\partial}{\partial V_{s,i}}\big(A_{s,i}\,F^E_p\big) + \frac{1}{2}\frac{\partial^2}{\partial V_{s,i}\,\partial V_{s,j}}\big((B_s B_s^T)_{ij}\,F^E_p\big) \\
& - \frac{\partial}{\partial V_{s,i}}\big(\langle A_{p\to s,i}\,|\,\mathbf{x},\mathbf{V}_s,\mathbf{V}_p,\delta_p\rangle\,F^E_p\big).
\end{aligned}
\tag{511}
$$

The last term in the previous equation is an additional term which implies that all mean field equations involving the fluid velocity seen are modified compared to the one-way coupling situation. Let us define the following operator, $L_p$, where $L_p(\,\cdot\,) = 0$ represents all partial differential equations, in one-way coupling, of the moments related to the discrete phase. By performing all the corresponding integrations from the pdf equation, the modified mean field equations can be written as

$$L_p(\langle U_{s,i}\rangle) + \alpha_p\rho_p K\langle A^D_{p,i}\rangle = 0, \tag{512}$$
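Table 5
Two-field model: list of the mean field equations and related unclosed terms

Equation  | Variable | Unclosed terms | Third-order tensors
Eq. (492) | $\alpha_f$ | |
Eq. (475) | $\alpha_p$ | |
Eq. (509) | $\langle U_f \rangle$ | $\langle A_{f,i} \rangle$, $I^M_{pf,i}$ |
Eq. (476) | $\langle U_p \rangle$ | $\langle A_{p,i} \rangle$ |
Eq. (512) | $\langle U_s \rangle$ | $\langle A_{s,i} \rangle$, $\langle A^D_{p,i} \rangle$ |
Eq. (478) | $\langle d_p \rangle$ | $\langle d_p u_{p,i} \rangle$ |
Eq. (510) | $\langle u_{f,i} u_{f,j} \rangle$ | $\langle A_{f,i} u_{f,j} \rangle$, $I^R_{pf,ij}$, $\langle (B_f B_f^T)_{ij} \rangle$ | $\langle u_{f,i} u_{f,j} u_{f,k} \rangle$
Eq. (482) | $\langle u_{p,i} u_{p,j} \rangle$ | $\langle A_{p,i} u_{p,j} \rangle$ | $\langle u_{p,i} u_{p,j} u_{p,k} \rangle$
Eq. (514) | $\langle u_{s,i} u_{s,j} \rangle$ | $\langle A_{s,i} u_{s,j} \rangle$, $\langle A^D_{p,j} u_{s,i} \rangle$, $\langle (B_s B_s^T)_{ij} \rangle$ | $\langle u_{s,i} u_{s,j} u_{s,k} \rangle$
Eq. (513) | $\langle u_{s,i} u_{p,j} \rangle$ | $\langle A_{s,i} u_{p,j} \rangle$, $\langle A_{p,j} u_{s,i} \rangle$, $\langle A^D_{p,i} u_{p,j} \rangle$ | $\langle u_{s,i} u_{p,j} u_{p,k} \rangle$
Eq. (485) | $\langle d_p u_{p,i} \rangle$ | $\langle A_{p,i} d_p \rangle$ | $\langle d_p u_{p,i} u_{p,j} \rangle$
Eq. (515) | $\langle d_p u_{s,i} \rangle$ | $\langle A_{s,i} d_p \rangle$, $\langle A^D_{p,i} d_p \rangle$ | $\langle d_p u_{s,i} u_{p,j} \rangle$
Eq. (487) | $\langle (d_p)^2 \rangle$ | | $\langle (d_p)^2 u_{p,i} \rangle$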
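\[ L_p(\langle u_{s,i} u_{p,j} \rangle) + \alpha_p \rho_p K \langle A^D_{p,i} u_{p,j} \rangle = 0 , \tag{513} \]
\[ L_p(\langle u_{s,i} u_{s,j} \rangle) + \alpha_p \rho_p K \langle A^D_{p,j} u_{s,i} + A^D_{p,i} u_{s,j} \rangle = 0 , \tag{514} \]
\[ L_p(\langle d_p u_{s,i} \rangle) + \alpha_p \rho_p K \langle A^D_{p,i} d_p \rangle = 0 , \tag{515} \]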
respectively, where $K = (\alpha_p \rho_p)/(\alpha_f \rho_f)$. Possible closures for these additional terms will be discussed in the next subsection.

8.5.6. Closure of the two-field model

Mean field equations have now been written, up to the second-order moments, in the case of a non-reactive dispersed two-phase flow where the fluid is incompressible and the discrete particles are hard spheres of constant density. The set of equations and the associated unclosed terms are given in Table 5. Table 5 should convince the reader of the necessity of a Lagrangian approach when one attempts to simulate turbulent reactive dispersed two-phase flows: the physics of the problem has been greatly simplified and yet the system of equations is already nearly intractable, with 13 partial differential equations (the dimension of the system is 46) and 23 terms to be closed. However, in the case of industrial applications which fall into the category of the simplified case under consideration (for example in fluidized beds), the mean field equations can be further simplified, given some additional restrictions and hypotheses. Let us consider the case where the distribution in diameter for the discrete particles is ‘narrow’ enough so that the suspension can be described in terms of a mean diameter only, $d_p$ (we do not specify how this diameter can be chosen, see [96] for more information). If a quasi-equilibrium hypothesis (Boussinesq approximation) can be made on the second-order velocity moments (which implicitly means that the characteristic time scale of the fluctuating motion is much smaller than the time scale of the mean flow), and if, as done in Section 7,
there is no statistical bias between the kinetic energy of the fluid seen $\langle u_{s,i} u_{s,i} \rangle / 2$ and the turbulent kinetic energy of the fluid $\langle u_{f,i} u_{f,i} \rangle / 2$, the system of mean field equations reduces to a set of equations which can be treated with modern computer technology [78]. This model consists of partial differential equations for $\alpha_p$, $\alpha_f$, for the expected velocities $\langle U_f \rangle$, $\langle U_p \rangle$, $\langle U_s \rangle$ and for the traces of the contracted tensors, that is the turbulent kinetic energies in both phases, $\langle u_{f,i} u_{f,i} \rangle / 2$ and $\langle u_{p,i} u_{p,i} \rangle / 2$, and the fluid–particle velocity covariance $\langle u_{s,i} u_{p,i} \rangle$. The treatment of the closures is beyond the scope of the present paper and can be found in [78].

8.6. Concluding remarks

A general formalism, characterized by the introduction of a two-point fluid–particle pdf, has been presented. The formalism is detailed (especially the notion of Lagrangian and Eulerian mass density functions) and its relation with macroscopic approaches (mean field equations) is specified. The complexity of the mean field equations which are obtained in a very simple case indicates, once again, that the natural way to treat complex problems in fluid mechanics is the Lagrangian approach.

9. Summary and propositions for new developments

The main points are summarized in this section. We first go back to the reasons that justify a stochastic approach, we discuss the current state of models and we suggest research directions. Some of these suggestions aim at benefiting from the synthesis of present methods with different particle methods while other suggestions aim at making connections with developments in other fields of physics in order to help improve present closure ideas.

9.1. Difficulties with conventional approaches and interest of a pdf description

In this work, we have tried to propose a consistent and self-contained formalism for the one-particle probabilistic description of turbulent and dispersed two-phase flows from a Lagrangian point of view.
Many different approaches or descriptions have been proposed for single-phase turbulence and the quest is bound to continue [1]. Prediction and simulation of single-phase turbulence is of course central in two-phase flow modelling. Yet, in order to understand how the present probabilistic approach fits into the landscape, one must remember that the aim is not to put forward the most attractive model for single-phase turbulence. Our purpose is to develop one approach which, while treating single-phase turbulence in an acceptable way, can be easily extended to reactive and two-phase flows. The central objective of this probabilistic description is to build a general framework that allows tractable models to be developed for general non-homogeneous flows while still treating accurately the key physical aspects of turbulent two-phase flows. These two aims are actually conflicting and one has to reach a compromise. Simple models can be devised but often at the expense of physical precision. On the other hand, some advanced theoretical methods, such as LHDIA [4] or EDQNM [3] to name a few, may well describe the physics but are limited to a small class of flows. It also seems difficult to hope for a single approach to be the best answer to each and every modelling need
since the notion of the ‘key physical aspects’ may differ from one situation to another. What is perhaps appropriate to understand the formation of trailing vortices behind aircraft may not be adequate if one is to predict the formation of various noxious pollutants released in the combustion of polydispersed coal particles. It is more realistic to hope for one approach to be merely a satisfactory answer to an as-wide-as-possible range of problems. Consequently, the selection of the modelling approach is an important step in practice and one should be aware of the necessity to balance practical use against accuracy when assessing the strong points and the weak points of the present approach. With the above caveat in mind, one can still wonder: why do we have to bother with a probabilistic approach, and is it really worth the trouble? To answer these questions, it is useful to summarize the modelling issue. Turbulent dispersed two-phase flows involve in most practical cases far too many degrees of freedom to be directly simulated, unless we limit ourselves to low or moderate Reynolds numbers and to simplified geometries. This does not mean that direct numerical simulations are of no use but rather that a ‘macroscopic’ description is necessary in order to be able to say something for the vast majority of turbulent flows. The term macroscopic refers here to a limited number of degrees of freedom compared to the exact equations and the issue is therefore to extract from these exact equations another set of equations for a reduced number of degrees of freedom. This represents the classical issue of single-phase turbulence problems, and the modelling issue in dispersed two-phase flows is similar. Following the usual line of reasoning in continuum mechanics, a perhaps natural approach consists in trying to write directly a set of closed partial differential equations satisfied by a small number of statistical averages or moments.
These moments typically include the mean or average velocities of the fluid and of the particles, their mean temperatures or their mean scalar compositions if needed, as well as the mean particle diameter and the mean volume fraction. The mean values can be either the ensemble-average values, following the Reynolds decomposition, which is written here for example for the fluid velocity
\[ \langle U_f \rangle (t, x) = \int V_f \, p^E(t, x; V_f) \, dV_f \tag{516} \]
or the large-scale values, in the spirit of the LES decomposition, which makes use of a spatial filter $G$ and is concerned with
\[ \langle U_f \rangle_l (t, x) = \int U_f(t, y) \, G(t, y - x) \, dy . \tag{517} \]
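The spatial filtering that defines the large-scale velocity in Eq. (517) can be illustrated numerically. The following is a minimal sketch, assuming a one-dimensional periodic domain, a uniform grid and a top-hat kernel for $G$; the function name `box_filter`, the grid size and the test signal are illustrative choices and are not taken from the text.

```python
import numpy as np

def box_filter(u, width):
    """Top-hat filter of odd width (in grid points) on a periodic 1D grid."""
    if width % 2 == 0:
        raise ValueError("width must be odd")
    half = width // 2
    # Average of shifted copies: a discrete analogue of the integral of
    # u(y) G(y - x) dy with a normalized top-hat kernel G.
    shifted = [np.roll(u, s) for s in range(-half, half + 1)]
    return np.mean(shifted, axis=0)

n = 256
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
# Illustrative signal: one large-scale mode plus one small-scale mode.
u = np.sin(x) + 0.3 * np.sin(40.0 * x)
u_l = box_filter(u, 15)  # large-scale (filtered) field
```

With this kernel the long-wavelength component is almost unchanged while the short-wavelength component is strongly damped, which is the sense in which the filtered field retains only the large scales.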
We refer to this approach as the classical or conventional approach. It implies two main prerequisites. If we are certain to be interested only in the mean quantities selected in advance and if we are able to close the corresponding transport equations through physically correct constitutive relations, then this classical approach is satisfactory and there is little need to consider more refined approaches. This is typically what is done in single-phase turbulence modelling or analysis. However, that direct procedure can become very cumbersome in the case of turbulent dispersed two-phase flows. The discussion in Section 8, and in particular in Section 8.5, where the transport equations for mean quantities, such as the fluid and the particle mean velocities and their turbulent kinetic energies, have been presented, illustrates this point. It is indeed quite possible first to write the unclosed field equations and then to introduce all the required closure relations. However, due to the large number of variables and to the complicated forms of the
field equations, this is a much more difficult issue than, for instance, assuming the existence of a turbulent viscosity for the effect of subgrid scales on the large ones, as is sometimes done in numerical models for single-phase turbulence. On the other hand, these mean equations for the fluid and particle properties have been shown to derive from just one set of stochastic differential equations, Eqs. (453) in Section 8.4. The stochastic equations have a more tractable form. Once the probabilistic formalism is grasped, it is simpler to see the physics that is contained in a modelling proposal from the stochastic equations than from the set of mean field equations. In other words, even if the probabilistic path ends up at the same point (in terms of mean equations) as the classical path, it is a much easier one to follow. This is a first reason for a pdf approach. However, the main interest of probabilistic descriptions is to supply more detailed information. As a consequence, this approach will be attractive in situations where the conventional approach actually fails, that is when at least one of the two ‘if’ conditions emphasized above is not satisfied. As was explained in Sections 6.3, 6.5.3 and 7.2, there are two main situations where such a direct route towards macroscopic equations will not be appropriate:

(1) when we want detailed local information on two-phase flows and we want to be able to extract from the models various statistics that can help to analyse carefully and to bring insights into the physics of the problem. Typical examples are conditional statistics, particle residence time distributions, and separate statistics on particles depending upon their previous histories or trajectories.

(2) when the local exact instantaneous equations, from which we want to derive the macroscopic equations, involve complicated source terms that cannot be easily replaced by closed expressions in terms of the macroscopic variables.
The first case (1) is simply related to a question of the definition of information. A small number of mean variables has been selected, implying that, right at the outset, a lot of information is disregarded. If we are faced with this first difficulty, then the selection was too drastic. The only possible solution in the classical approach is to go back to the beginning of the process and to try to derive additional transport equations for all the needed quantities. With an increasing number of additional mean values to model and to calculate from transport equations, this may become cumbersome, as already mentioned, and this approach may lose its interest compared to the probabilistic approach. The second case (2) is far more serious and means that the direct approach will completely fail. This is once again a question of the level of information contained in a proposed approach. That point can be developed using the LES method in single-phase turbulence as a guiding example. The LES approach starts by removing the small scales. Their effect on the large ones is accounted for but they are eliminated from the computed state vector and are external to the description. LES can considerably improve the modelling power compared to Reynolds stress models. A strong point is that LES contains a part of the spatial information and can calculate the length scale of the large-scale fluctuations. As such, it may be the appropriate answer for the kinetic energy governed by the large scales. However, if we are dealing with turbulent reactive flows and a finite-rate chemistry, the key physical phenomena for the chemical species take place precisely at the small scales, since they involve molecular mechanisms. The important phenomena are governed by scales for which insufficient information is available. As a result,
we are faced with a modelling closure issue that is similar to the one encountered in Reynolds stress models, see Section 6.5.3. One could neglect scalar subgrid scales and express the filtered reactive source terms as
\[ \langle S(\phi) \rangle_l = S(\langle \phi \rangle_l) . \tag{518} \]
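The error made by the approximation in Eq. (518) can be made concrete with a small numerical experiment. The sketch below is only an illustration under stated assumptions: a one-dimensional periodic grid, a top-hat filter, and a hypothetical quadratic source term `source` that stands in for a real finite-rate chemistry term; none of these choices comes from the text.

```python
import numpy as np

def box_filter(phi, width):
    """Top-hat filter of odd width (in grid points) on a periodic 1D grid."""
    half = width // 2
    return np.mean([np.roll(phi, s) for s in range(-half, half + 1)], axis=0)

def source(phi):
    """Hypothetical nonlinear source term, chosen only for illustration."""
    return phi ** 2

n = 256
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
# Scalar field dominated by a small-scale fluctuation around a mean of 1.
phi = 1.0 + 0.5 * np.sin(40.0 * x)

filtered_source = box_filter(source(phi), 15)      # filtered S(phi)
source_of_filtered = source(box_filter(phi, 15))   # S of filtered phi
closure_error = filtered_source - source_of_filtered
```

For this quadratic choice the difference is the subgrid variance of the scalar, which is positive everywhere and is not small compared with the source term itself. For a linear source term the two quantities coincide, so the closure issue is specific to nonlinear source terms.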
This can lead to serious discrepancies in numerical predictions, as was demonstrated recently [62,63]. In that case, we need to model all the scales explicitly. For turbulent dispersed two-phase flows, the issue is similar when a polydispersed particle phase is present, the range of particle diameters playing the role of the subgrid scales. A satisfactory closure relation using only the mean particle diameter is too difficult to derive in the mean particle momentum equation, see Section 7.2. These two issues are representative of issues of ‘complex physics’ in fluid dynamics, where the problem is not limited to the sole prediction of isothermal flows but involves heat and mass transfer, combustion and polydispersed two-phase flows, among other effects. In both cases, too much information was eliminated at the beginning by the very choice of the classical approach. In both cases, the probabilistic approach has at least the potential to address the problem.

9.2. Assessment of current modelling state

The second important objective of this work was to describe the physical ideas that are behind present model equations. Present proposals consist in modelling the variables that have been selected in the state vector by stochastic diffusion processes, in single-phase turbulence in Section 6 and in turbulent two-phase flows in Sections 7 and 8. This step may involve important simplifications and some of the assumptions made in the course of the derivations can be questioned. Yet, it has been stressed that these stochastic diffusion models are not mere tricks, or artifices to include fluctuations around mean or macroscopic laws that are known in advance. These macroscopic laws are precisely obtained from the stochastic approach, and the present methods do have the potential to represent and simulate the complete one-point dynamics of turbulent two-phase flows. For this reason, we have tried to clarify the meaning of the present approach as well as its mathematical background.
Stochastic processes are powerful modelling tools and the defining notions and technical aspects that go with them have been presented independently from their applications to single-phase turbulence and to turbulent two-phase flow modelling. Three aspects have been treated separately: the technique, presented in Section 2, the general framework discussed in Sections 3 and 4, and the details of present models used in Sections 6–8. Now, once the general context and the formalism are understood, comes the really interesting question for a physics-minded user of turbulence models: how much of the physics is accounted for by these stochastic models, what is left out and, consequently, what are the weakest points to improve? These questions have been analysed at length in Section 6.8 for single-phase turbulence and in nearly all of Section 7 (in particular, Sections 7.4.1, 7.4.2 and 7.5). Broadly speaking, the present approach consists of the following steps. First, a state vector $Z$ gathering the important local variables is selected. This is a very important step governing the chances of success of later models, as was emphasized in Section 4 for the general case, in Section 6.6
for single-phase turbulence and in Section 7.3.2 for two-phase flow models. Then, a diffusion stochastic model is used where the local exact equations for the variables contained in $Z$ are replaced by stochastic differential equations which represent the time evolution equations of a large number of trajectories of the process and which can be written as general Langevin equations, cf. Eqs. (453),
\[ dZ_i = A_i(t, Z, \langle H(t, Z) \rangle) \, dt + B_{ij}(t, Z, \langle G(t, Z) \rangle) \, dW_j . \tag{519} \]
The stochastic process is meant as a weak approximation of the actual one or, in other words, we expect that statistics extracted from $Z$ will approximate the real ones, see Section 2.1.2. In the Langevin equations, the drift vector $A_i$ and the diffusion matrix $B_{ij}$ are functions of the state vector and also of some statistics, represented by the dependence of the coefficients on the mean functions $\langle H(t, Z) \rangle$ and $\langle G(t, Z) \rangle$. A simple example of that feature is the Langevin equation used for fluid particle velocities, which are assumed to relax to the local mean value $\langle U_f \rangle (t, x)$, Eq. (237), with a relaxation time scale given by Eq. (238). Both terms involve mean values calculated from the ensemble of particles, and implicitly from the calculated pdf $p(t; z)$. In mathematical terms we are dealing with non-linear pdf equations, and with what is referred to as McKean stochastic differential equations [16,56]. Several comments can be made from the physical point of view. First, the form of this pdf is not assumed beforehand. There are no equilibrium assumptions that are used directly to obtain a generic form for $p(t; z)$, as in equilibrium statistical physics, and the present approach appears in that respect as a non-equilibrium statistical description. Yet, from the discussion given in Section 4, it is clear that there are some ‘equilibrium reasonings’. These equilibrium assumptions are, however, made for the fast variables, which are replaced by local models in terms of the slow modes, see Section 4.2, while the slow variables are explicitly calculated by solving the modelled transport equation for their pdf. Second, the general equations, Eq. (519), reveal that we are handling mean-field kinds of models. This is due to the limitation to one-point models: the particles do not interact directly but rather indirectly through potentials (or mean fields) which are computed from them.
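The mean-field character of Eq. (519) can be sketched with a short particle simulation. The following is a minimal illustration, assuming stationary homogeneous turbulence with no mean pressure gradient, a constant time scale `t_l` and a constant value `c0_eps` for the product $C_0 \langle \epsilon \rangle$, and a simple Euler–Maruyama time discretization; the function name, parameter values and initial condition are hypothetical and are not taken from the text.

```python
import numpy as np

def simulate_mean_field_langevin(n_particles=20000, n_steps=1000, dt=0.01,
                                 t_l=1.0, c0_eps=2.0, seed=0):
    """Euler-Maruyama integration of dU = -(U - <U>)/T_L dt + sqrt(C0 eps) dW.

    The mean <U> is not prescribed: it is re-estimated at every step from
    the particle ensemble itself, so the particles interact only through
    this mean field.
    """
    rng = np.random.default_rng(seed)
    # Arbitrary initial state, far from the stationary variance.
    u = rng.normal(loc=1.0, scale=3.0, size=n_particles)
    for _ in range(n_steps):
        mean_u = u.mean()  # mean field computed from the particles
        dw = rng.normal(scale=np.sqrt(dt), size=n_particles)
        u = u - (u - mean_u) / t_l * dt + np.sqrt(c0_eps) * dw
    return u

u_final = simulate_mean_field_langevin()
```

In this simplified setting the drift cannot change the ensemble mean, and the variance relaxes towards $C_0 \langle \epsilon \rangle T_L / 2$, whatever the initial distribution. The form of the pdf is therefore an outcome of the simulation and is not assumed beforehand.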
This limitation allows the general case of non-stationary non-homogeneous turbulent two-phase flows to be addressed, but some physical aspects are left out. In particular, we can make the following comments:

(i) Since direct interactions between particles are not calculated but only ‘mean’ ones, there is no spatial information at present. We can calculate time correlations from the particle properties and deduce time scales. We cannot calculate spatial information and we cannot simulate the energy spectrum or the length scale of the fluctuations. Consequently, we cannot expect this approach to predict the formation or the characteristics of what is referred to as coherent structures [1,3]. These are far-from-equilibrium structures that require the collective spatial motion of many particles.

(ii) If these coherent structures cannot be calculated from present models, they could still be included in the models. If we know the basic statistical signature of coherent structures (their frequencies, their intensity, etc.), we can try to mimic them, see the discussion below in Section 9.3.1. At the moment, this is not the case, and there is about the same physical content (or the lack of it!) in the probabilistic description as in classical one-point
Reynolds stress models. Phrased differently, present stochastic models could be roughly described as Reynolds stress types of models that nevertheless treat chemical source terms and polydispersed flows correctly.

(iii) A consequence of the absence of spatial information in present one-point closures in the stochastic models for single-phase flows is that the particle dispersion modelling issue is unclosed, see Section 7.4.1. This remains a very important issue, even though we are only looking for approximate one-point statistics in turbulent two-phase flows. In spite of the reasons put forward to justify the present closures, see Section 7.4.2, too much uncertainty in the physical understanding of the problem and in the technical steps of the stochastic modelling still prevents the resulting models from being fully satisfactory.

In terms of the physical content of present stochastic models, the feeling is therefore mixed. Nevertheless, there is much room for improvement. The probabilistic framework is a very convenient framework in which the basic stochastic equations can be easily modified to include supplementary physical effects, if these effects are well understood. At the moment, the limiting steps are mostly due to a poor understanding of the basic physics involved.

9.3. Open issues and suggestions

The above comments explain that the situation on the modelling front is not frozen but is evolving, with extensions or improvements of present models being proposed. A first series of proposals has been made in the same context of one-point closures or mean-field interactions. The basic equations have been modified, in particular the position equation with a new random term and the use of elliptic relaxation for near-wall modelling [97]. To treat compressibility effects, the description can be broadened to include the instantaneous particle pressure and internal energy [98].
Another recent interesting development is the exact particle representation of Rapid Distortion Theory in turbulence [60] by the addition of a wave vector. We do not dwell on these extensions in this concluding section, but rather choose to discuss new possible ideas that go beyond the present context or that could be linked to other domains of applied physics. We first discuss how present stochastic models could find a better place within the family of turbulence models and other large-scale methods for single-phase turbulence. We then turn our attention to a number of open issues concerning mainly discrete particle properties for which much work remains to be done. It is believed that these issues can be related to other fields of physics and could probably benefit from developments made there. Interestingly enough, these points involve both immediate practical concerns and interesting physical aspects. They concern

(i) how to benefit from developments and results obtained in theoretical and fundamental works for single-phase turbulence,

(ii) how to include spatial information for the continuous phase (the fluid), keeping the interest of the present stochastic approach for reactive and two-phase flows intact,

(iii) how to properly simulate the effect of molecular transport coefficients in the one-point pdf models, where they manifest themselves by a kind of ‘anti-diffusion’ behaviour,
(iv) how to properly build a dispersion model from a diffusion one, taking into account the various possible fluid and particle paths,

(v) how to marry present ideas with developments in granular matter which do not consider the influence of an underlying turbulent fluid.

9.3.1. A middle path between theoretical and practical approaches

Present stochastic models simulate the instantaneous behaviour of fluid and discrete particles in a turbulent flow and lead directly to the general mean moment equations. They rely both on theoretical predictions and on rough macroscopic estimations. For example, the simplest Langevin model for fluid particle velocities is written (see Section 6.6) as
\[ dU_i = - \frac{1}{\rho} \frac{\partial \langle P \rangle}{\partial x_i} \, dt \; \underbrace{ - \frac{U_i - \langle U_i \rangle}{T_L} }_{D_i} \, dt + \overbrace{ \sqrt{C_0 \langle \epsilon \rangle} }^{B_{ij}} \, dW_i , \tag{520} \]
where $T_L$ is the Lagrangian time scale given by Eq. (238). The closure of the diffusion term, $B_{ij}$, is based on Kolmogorov theory while the closure of the drift term, $D_i$, remains a macroscopic assumption with weaker support. The same is true for the model for the velocity of the fluid seen in two-phase flows, see Section 7.4.2. Provided we have additional information, both terms could be improved without losing the connection with mean equations. Thus, the probabilistic description can include new pictures and new results on the instantaneous nature of turbulent flows obtained in simplified situations (such as stationary isotropic turbulence) and calculate their effects in general non-homogeneous flows. In that sense, this description is an attractive candidate to bridge the gap between fundamental and practical approaches. We can develop that suggestion as follows. Broadly speaking, the different approaches to turbulence can be divided into two main categories. In the first one we find direct numerical simulation (DNS), which is an explicit numerical simulation of the Navier–Stokes equations providing all details on turbulent flows [99], as well as experiments [28] and a number of theoretical developments. These theoretical developments include multifractal approaches [36,5], renormalization-group approaches and functional and diagrammatic approaches [4,100,101], among others. We refer to this category as fundamental approaches. They are mainly concerned with the question of the nature of turbulence. Their aim is to provide insights into the physics of turbulence and hopefully to try to isolate what could be the ‘turbulence molecules’. Along this line of thinking, one assesses the validity of Kolmogorov scaling. In a more geometrical approach, one may try to bring out particular structures, such as the structures called worms which have been found in DNS and in experiments, and analyze their role in the mechanisms of turbulence [34].
Another aim of the fundamental approaches is to supply information needed in coarser or reduced descriptions to help devise better approximations and also to test proposals. At the other end of the spectrum covered by the various approaches to turbulence, we have the category of macroscopic approaches. Under that term, we usually find the classical moment approach, either two-equation models such as the $k$–$\epsilon$ model or Reynolds stress models, see
Section 6.3. These models are now routinely used in Computational Fluid Dynamics software for complex geometries and are often referred to as engineering models. Yet, the term seems to carry a disparaging connotation in most works on the physical aspects of turbulence and we prefer the term macroscopic model. The connection between these two categories is weak. Indeed, it seems very difficult to use information or results obtained from a fundamental approach directly, say in a Reynolds stress model. This is due to the fact that a huge number of degrees of freedom have been eliminated, leaving only the field of mean velocities and of Reynolds stresses. This is also due to the classical approach itself which, as explained in Section 6.3, first derives an open mean equation and then obtains a closed form by resorting to a macroscopic constitutive relation. One approach put forward to fill the gap is large eddy simulation (LES), which has received a great deal of attention in the last decade. It has even been the only subject of concern in most works dealing with new ideas in numerical simulations of turbulence. LES is indeed a very fruitful avenue and appears as the best candidate at the moment for problems mainly governed by the large scales and where the physics is not too complicated. Nevertheless, the method first has to live up to its expectations for practical applications in geometries that are not too simple. Second, it may not be the ultimate answer when complex physics is involved. This has been explained in the introduction of this section for combustion issues.
If one is concerned with practical applications and with assessing approaches in that respect, then it is important to realize that, even with the foreseeable increase in computational power, the most urgent needs are often to include complex physics (heat and mass transfer, combustion, two-phase flows, free surface flows with complex interfaces, etc.) before improving the predictions of the dynamics of the large scales. For this reason, it is proposed to use the present one-particle probabilistic framework to link fundamental and macroscopic approaches. At the moment, DNS results are mainly used to assess some of the mean terms in the moment approach and to provide budgets of production or transport terms in a given flow, see the discussion of the impact of DNS on turbulence modelling in Moin and Mahesh [99]. We suggest going first from the DNS results to a pdf description. In turn, the pdf description will yield the corresponding mean moment equations. Since we are dealing with instantaneous variables in the pdf models, it seems easier to introduce newly understood mechanisms or new pictures within the stochastic equations rather than directly at the macroscopic level. This idea was first followed in Yeung and Pope [29], more than 10 years ago, and has not received enough attention (apart from Yeung’s continuing efforts [102,103]). For instance, the closure of the drift term in the Langevin equation, $D_i$, could be improved with information from DNS. Yet, the different formalisms of the fundamental and the pdf approaches suggest avoiding direct (or blind) applications of statistical results from DNS in statistical closures. Information provided by DNS would be best used if one first builds a simplified picture and then uses DNS to provide quantitative information for this picture. We develop this proposal through two examples (example (a) and example (b)). Both examples will deal with ways to account for structures.
These structures are observed in different situations and should not be confused. In example (a) we consider essentially homogeneous turbulence and in example (b) we consider a typical case of a non-homogeneous flow.
Example (a) concerns the diffusion coefficient and is related to the question of intermittency. At present, the closure relation
\[ B_{ij} = \sqrt{C_0 \langle \epsilon \rangle} \, \delta_{ij} , \tag{521} \]
is based on Kolmogorov 1941 scaling and relies only on the mean dissipation rate of the turbulent kinetic energy $\langle \epsilon \rangle$. The Lagrangian velocity structure function is linear with respect to the time increment $dt$,
\[ D_L(dt) = C_0 \langle \epsilon \rangle \, dt , \tag{522} \]
when $dt$ belongs to the inertial range, $\tau_\eta \ll dt \ll T_L$, see Sections 5.3 and 6.8. There is no account for intermittency due to the fluctuations of the instantaneous dissipation rate sampled by the fluid particle. Current knowledge on the question of intermittency and its importance is still incomplete. Recent Lagrangian measurements performed at relatively high Reynolds numbers, $900 < Re_\lambda < 2000$ for a Reynolds number based on the Taylor scale $\lambda$, have reported results in agreement with Kolmogorov 1941 scaling [104]. On the other hand, results obtained from DNS at Reynolds numbers up to $Re_\lambda \simeq 200$ have shown that the fluid particle acceleration variance normalized by Kolmogorov inner scales, Eq. (135),
\[ \frac{\langle A^2 \rangle}{\langle \epsilon \rangle / \tau_\eta} = \frac{\langle A^2 \rangle}{\langle \epsilon \rangle^{3/2} \, \nu^{-1/2}} , \tag{523} \]
scales as $Re_\lambda^{1/2}$, whereas Kolmogorov 1941 predicts a constant value $a_0$ [103]. It remains unclear whether this dependence on the Reynolds number is due to the low value of $Re_\lambda$ used in DNS and whether this $Re_\lambda^{1/2}$ scaling extends to higher values. This will have to be clarified by further work to decide upon the real importance of that subject. Yet, if intermittency is to be taken into account, this can be achieved quite naturally in the stochastic equation for fluid particle velocities by replacing $\langle \epsilon \rangle$ with an instantaneous value $\epsilon$ attached to each particle. The diffusion coefficient can be written as
\[ B_{ij} = \sqrt{C_0 \, \epsilon} \, \delta_{ij} \tag{524} \]
and the instantaneous value of the dissipation rate sampled by a fluid particle can be simulated, for example, by the log-normal model presented in Section 6.6. The modified picture is then in line with Kolmogorov's refined hypothesis, referred to as Kolmogorov 1962 [25]. The refined hypothesis predicts similar scaling for the Lagrangian velocity structure function
\[ D_L(dt) = C_0 \, \epsilon \, dt , \tag{525} \]
but this is now a conditional result for a given value of $\epsilon$. The unconditional structure function is obtained by averaging over the instantaneous values of $\epsilon$, which are assumed to follow a log-normal distribution [25]. Following the refined hypothesis and this new closure, the stochastic equations that model the particle joint velocity–dissipation rate $(U, \epsilon)$ are directly coupled by the diffusion coefficient. By contrast, in the stochastic equations developed in the frame of Kolmogorov 1941 scaling and given in Section 6.6, the coupling between the particle velocity and dissipation rate equations is weak, since only the mean value $\langle\epsilon\rangle$ enters the instantaneous velocity equation. That situation is now changed. In stationary homogeneous turbulence, when $\langle\epsilon\rangle$ is constant, this new diffusion coefficient, Eq. (524), is a stochastic variable and this leads to deviations of mean
J.-P. Minier, E. Peirano / Physics Reports 352 (2001) 1–214
square velocity increments from normal diffusion. Such a complete stochastic model accounting for internal intermittency was proposed some years ago [57,58] and has been used for realistic non-homogeneous turbulent flows [59]. Other proposals have been made, based for example on multifractal ideas for the instantaneous dissipation rate [105]. These modifications follow the statistical route and try to replace one scaling with another one. On the other hand, recent simulations have revealed the importance of structures that can be isolated within the flow [1].

We make here the following suggestion, which could reconcile the statistical and geometrical pictures. Drawing on DNS results, we adopt a two-fluid picture of a turbulent flow that is a follow-up of a proposal made by She et al. [106]. A turbulent flow is composed first of a background with no marked structures, and second of a number of particular structures. We are considering homogeneous turbulence. In that case, the structures embedded in the structure-less background, which we propose to retain in the overall picture, could be described as the intense vortex tubes that have been isolated in DNS and which are called worms [34]. Their characteristics are still debated but they can be described as vortex tubes having a length of the order of the turbulent length scale $L$, a radius of the order of $\eta$ (or perhaps $\lambda$, as some results suggest) and a vorticity of the order of $u/\eta$. Yet, the key issue is to know the typical number $n_{vt}$ of these structures within a given turbulent flow, or the volumetric fraction occupied by these worms, and how $n_{vt}$ scales with $Re_\lambda$. This is a statistical quantity, but for a geometrical object. That statistical estimate is still unknown at the moment, but having information on $n_{vt}$ would be a crucial result.
The leading idea of the two-fluid picture is that the Kolmogorov 1941 description is valid, but only for the structure-less background, and that these particular structures could explain the deviations from Kolmogorov scaling observed for the high-order velocity structure exponents. Indeed, if we assume that their typical number $n_{vt}$ is not too high, these intense events (the worms), associated with low-pressure zones, would only contribute to high-order statistics while not modifying low-order statistics. Of course, that idea must first be confirmed (or contradicted) by the fundamental approaches. If it is confirmed, one could then develop a corresponding stochastic model based on that two-fluid picture. The particle velocity model would be the Langevin equation based on $\langle\epsilon\rangle$, to which one would add increments that would represent random encounters with the worms (at a frequency that would be of the order of $n_{vt}^{-1}$). This is a way to connect fundamental developments with practical models.

The second example (example (b)) we would like to develop carries this idea of accounting for the statistical signature of particular structures over to another case, wall boundary layers. The near-wall region of the turbulent boundary layer is a thin zone which is of key importance in many practical situations. The traditional description relies on similarity arguments and on universal profiles of mean quantities (the logarithmic variation of the mean velocity outside the viscous layer) [26]. DNS has changed this view of the turbulent boundary layer structure [99] and has pointed out the role of near-wall structures. We are dealing here with a non-homogeneous flow, and these structures are different from the ones described above (the worms of example (a)). Being able to include the near-wall structures in tractable models is of importance in a number of problems.
A first example is the thermohydraulics of nuclear piping systems, where in some regions one would like to have detailed information on thermal near-wall fluctuations that may cause material damage. Detailed information means here access to more than the sole mean velocity and temperature profiles, that is to the pdfs of velocity
and temperature fluctuations. Another example directly concerns the main subject of this paper, turbulent two-phase flows. The boundary layer structure plays a key role in the mechanisms of discrete particle deposition on the walls and of their possible resuspension away from the wall. This is an important concern in many industrial processes. For these problems, DNS is of course the ultimate answer since all details are explicitly calculated. This is however quite impossible in practice, even with LES, due to complex geometries, to the high Reynolds numbers of these flows and also to the fact that we need a detailed description only in a certain region of the whole domain. In that case, it is proposed to use present stochastic models as an intermediate between the fundamental and the macroscopic approaches. Using DNS data as an input, the issue is then to include in the particle velocity and scalar equations the specific effects of the near-wall structures, based on their intensities, frequencies and orientations. Work in that direction has already been done [107,108], but often in different settings, and no clear model has yet emerged.

9.3.2. Ideas for particle-based LES

The two series of improvements mentioned above (extending the one-particle state vector or introducing the statistical signatures of ‘structures’ deduced from direct simulations) represent attempts to extend the accuracy of present methods within the same framework of one-point approaches. However, from the comments on the current state of modelling, it appears that the weakest point is the lack of spatial information. So the question is: how can such spatial information be introduced in the probabilistic description? A first possibility within the stochastic framework is to extend present one-particle pdf models to two-particle pdf models (or $s = 2$ in the notation of Section 3). Instead of following a large number of fluid particles, we would then follow a large number of pairs of fluid particles.
Access to spatial information is not only interesting for the prediction of fluid properties or statistical characteristics, but is also important for the issue of discrete particle dispersion. From the discussion detailed in Section 7 and in particular in Section 7.4.1, it is clear that the additional closure problem for particle dispersion (how do we represent the velocity of the fluid seen $U_s$?) would vanish since enough information would be available. Discrete particle dispersion is an unclosed issue when a one-particle description of the fluid is made, but it becomes a closed problem when a two-particle description of the fluid is available. This is the key reason for striving to improve the statistical description of the underlying fluid, which is the particle driving force, rather than trying to improve the statistical closures of particle kinetic equations, see the discussions in Sections 7.3.2 and 7.5.7. A number of Lagrangian two-particle stochastic models have been developed, aiming mainly at atmospheric pollutant dispersion studies [109–114]. However, these models are limited to the ideal case of homogeneous stationary isotropic turbulence and work remains to be done to extend their range of validity to non-homogeneous flows.

In the present subsection, we want to put forward another suggestion that actually breaks away from this strictly pdf frame and re-introduces spatial information by coupling the stochastic methods with other, not necessarily stochastic, particle methods used in fluid mechanics. We can detail this suggestion by discussing the idea for the description of fluid velocity, taking up the presentation started in Section 6.8.
In the present stochastic models, the fluid particle instantaneous acceleration is decomposed as

$$A = \underbrace{-\frac{1}{\rho}\frac{\partial \langle P\rangle}{\partial x} + G\,(U - \langle U\rangle)}_{A_l} + \gamma, \tag{526}$$
as discussed in that section. An outstanding feature is that, since we are treating the instantaneous variables, the model still retains information on the small scales. This is the good point. The bad point is that one has to express the large-scale term $A_l$, which results from the large-scale part of the force due to the other particles, by a one-point relation in terms of the actual fluid particle velocity and of mean quantities. We are thus dealing with a one-particle pdf description of the full velocity. Instead of making mean-field approximations, we could calculate that contribution from the $N$ particles as a direct force between them. This is done in other particle methods, such as Vortex Methods and more precisely SPH (smoothed particle hydrodynamics). In these methods, one associates a cut-off length $h$ and a smoothing kernel $W_h(x)$ with each particle. The kernel satisfies the conditions

$$\int W_h(x - y)\, dy = 1, \tag{527a}$$

$$\lim_{h\to 0} W_h(x - y) = \delta(x - y). \tag{527b}$$
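The two kernel conditions can be checked numerically for a specific kernel. The sketch below assumes a 1D Gaussian kernel, which is only one common choice and is not prescribed by the text; the function names (`gaussian_kernel`, `kernel_mass`, `delta_action`) are hypothetical. It evaluates the normalization integral and shows that the kernel acts more and more like a delta function on a smooth test function as $h$ decreases.

```python
import math

def gaussian_kernel(r, h):
    """1D Gaussian smoothing kernel W_h(r); an illustrative choice, not the paper's."""
    return math.exp(-(r / h) ** 2) / (h * math.sqrt(math.pi))

def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule on [a, b] with n sub-intervals."""
    dx = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * dx)
    return total * dx

def kernel_mass(h):
    """Numerical value of the normalization integral of W_h (first condition)."""
    return integrate(lambda y: gaussian_kernel(0.0 - y, h), -10.0 * h, 10.0 * h)

def delta_action(func, x, h):
    """Integral of func(y) * W_h(x - y) dy, which should tend to func(x) as h -> 0."""
    return integrate(lambda y: func(y) * gaussian_kernel(x - y, h),
                     x - 10.0 * h, x + 10.0 * h)

if __name__ == "__main__":
    for h in (0.5, 0.1, 0.02):
        err = abs(delta_action(math.cos, 0.3, h) - math.cos(0.3))
        print(f"h={h}: mass={kernel_mass(h):.6f}, deviation from cos(0.3)={err:.2e}")
```

The integration window of ten kernel lengths on each side is an arbitrary truncation that is harmless for a Gaussian; a compactly supported kernel would only need its own support.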
In SPH techniques, the introduction of a kernel makes it possible to calculate derivatives at the particle position [115] and thus to obtain what are called ‘regularized solutions’ with a purely particle method. The basic idea is to use an interpolation method that approximates any function $H(x)$ by $H_I(x)$,

$$H_I(x) = \int H(y)\, W_h(x - y)\, dy, \tag{528}$$

and the regularized version of $H(x)$ is obtained from the set of particles, labelled here $b$, as

$$\tilde{H}(x) = \sum_b \frac{m_b}{\rho_b}\, H(x_b)\, W_h(x - x_b), \tag{529}$$
where $x_b$ is the location of particle $b$, $m_b$ its mass and $\rho_b$ its density (for the incompressible flows generally considered in this work, it would be a constant). Broadly speaking, the velocity field is obtained by regularizing locally around each particle using the analytical kernel (this is actually like ‘spreading’ each particle over a small domain around its centre of mass), and the regularized acceleration of one fluid particle, labelled $a$, is the sum of the regularized (or large scale) forces $\tilde{F}_{b\to a}$ due to all the other particles $b$:

$$A_l^a = \sum_{b=1}^{N} \tilde{F}_{b\to a}. \tag{530}$$
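The particle summation that regularizes a field can be sketched in a few lines. The example below is a minimal 1D illustration under stated assumptions: equal-mass particles on a uniform lattice, constant density, and a Gaussian kernel (all hypothetical choices made for the example, as are the function names). It evaluates the sum of $(m_b/\rho_b)\,H(x_b)\,W_h(x - x_b)$ over the particles and compares it with the underlying smooth field.

```python
import math

def gaussian_kernel(r, h):
    """1D Gaussian smoothing kernel (illustrative choice)."""
    return math.exp(-(r / h) ** 2) / (h * math.sqrt(math.pi))

def sph_interpolate(x, positions, masses, densities, values, h):
    """Particle sum of (m_b / rho_b) * H(x_b) * W_h(x - x_b) over all particles b."""
    return sum(m / rho * v * gaussian_kernel(x - xb, h)
               for xb, m, rho, v in zip(positions, masses, densities, values))

def make_uniform_particles(n, length, rho=1.0):
    """n equal-mass particles on a 1D lattice of total length `length` (hypothetical setup)."""
    dx = length / n
    positions = [(i + 0.5) * dx for i in range(n)]
    masses = [rho * dx] * n      # each particle carries the mass of one lattice cell
    densities = [rho] * n        # constant density, as for an incompressible flow
    return positions, masses, densities

if __name__ == "__main__":
    xs, ms, rhos = make_uniform_particles(n=200, length=2.0 * math.pi)
    field = [math.sin(x) for x in xs]
    h = 4.0 * (xs[1] - xs[0])    # kernel length of a few particle spacings
    x_eval = 2.0
    print("regularized:", sph_interpolate(x_eval, xs, ms, rhos, field, h))
    print("exact      :", math.sin(x_eval))
```

With a kernel length of a few particle spacings, the regularized value differs from the exact one mainly through the smoothing itself, which is the expected behaviour: scales below $h$ are filtered out.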
For example, the pressure-gradient force can be expressed as [115]

$$-\left(\frac{1}{\rho}\frac{\partial \tilde{P}}{\partial x_i}\right)_a = \frac{1}{\rho_a}\sum_b m_b\,(P_b - P_a)\,\nabla_a W_h(x_b - x_a), \tag{531}$$
where the derivatives are taken with respect to the coordinates of particle $a$. The interest of this approach is that one retains the Lagrangian nature and its satisfactory treatment of convection (low diffusivity) and that the method is basically grid free. This is indeed an $N$-particle approach since the $N$ particles are tracked simultaneously and interact directly. Yet, due to the regularizing procedure, these particles are in fact large scale particles. We are thus dealing with an $N$-particle pdf description of the large scale component of the fluid velocity. The approach is on firm ground when the number of particles $N$ and the typical length of the kernel $h$ are such that each particle has a reasonable number of neighbours (say around 10 or 20) within the volume of size $h^3$, which therefore contribute to the above sums. It is also clear from the previous summary that scales smaller than the length of the kernel $h$ are actually smoothed. In turbulence, the number of degrees of freedom is extremely large, and to obtain a tractable model one must then choose the cut-off $h$ somewhere in the inertial range. This amounts to a direct calculation of the large scales using a particle method, but without any specific account of the ‘sub-kernel’ scales. The smoothing kernel plays here the role of the grid size in a classical grid-based LES method, and the basic SPH approach appears as a particle-based LES without any subgrid model.

Therefore, both approaches (one-particle pdf and SPH) are particle methods, but with different characteristics. One suggestion is then to try to benefit from the good points of the two approaches by introducing length information and the notion of a filtered field while retaining the instantaneous nature (including the small scale component) of the modelled variables associated with the particles. We would then have an $N$-large-scale-particle pdf description with still enough information for the full velocity.
This can be achieved by dividing the instantaneous and complete acceleration of each particle into two contributions,

$$A = A_l + \gamma, \tag{532}$$
where the first term refers to the large scale part of the forces that act on a fluid particle and can be provided by an SPH calculation. The large scales are explicitly calculated. Small-scale effects and information are kept in the method to maintain its interest for the calculation of reactive source terms. This can be done by the second process in the decomposition of the particle instantaneous acceleration, which is then a stochastic term in the spirit of the one-particle pdf method. This process reflects subgrid effects (here sub-kernel effects) and could be modelled by a simple Langevin model with a time scale provided by the resolution at length $h$. This approach would be a mixed SPH/pdf stochastic particle coupled approach. In practice, it would appear as a particle method that combines the $N$-particle character of the SPH method with the explicit simulation of small scales obtained with the one-particle pdf description.

We can propose the following sketch for the method. Say we have a large number of particles $N$. These particles are classified to form a hierarchy. They are gathered into $N_l$ clusters, with $N = n_s \times N_l$, from which regularized variables are computed. The interactions between the clusters are calculated directly ($N_l$-body problem) by the SPH formalism and we add to each particle an independent small-scale random process $\gamma$. This small-scale process could be devised, for example, with a simple Langevin model with a return-to-equilibrium term. For a particle at the lowest level of the particle hierarchy, which belongs to one large scale particle or cluster, the velocity of that large scale particle could be taken as the equilibrium value. For the rapid process $\gamma$, we would then in fact perform a local ‘mean field’
approximation, the mean field being here the one calculated explicitly at the higher level of the particle hierarchy. SPH is still a young method and not much used in fluid mechanics, apart from free-surface problems where its grid-free character is very appealing. Its application to turbulence is still at an early stage. It thus remains to be seen whether these ideas have some viability!

9.3.3. The effect of molecular transport coefficients in pdf models

As already indicated several times throughout this paper, one of the main interests of stochastic models appears when averages of complicated source terms have to be calculated. However, one still has to consider the effect in sample space of the molecular transport coefficients, such as the scalar diffusivity $\Gamma$ or the fluid viscosity. To focus the discussion on this issue, we limit ourselves in this section to passive scalars, but we still consider one-particle pdf descriptions of the scalar field. That problem appears as soon as we have to simulate thermal effects, for example heat exchanges between the fluid and the particles. Following our Lagrangian point of view, this is done by assigning to each discrete particle a new variable which represents the particle temperature $T_p$. The simplest model which accounts for the heat exchange between the fluid and the particles (with no mass exchange) relies on a macroscopic coefficient, the heat transfer coefficient $h_{fp}$, and has the form

$$\frac{dT_p}{dt} = \frac{6\,h_{fp}}{\rho_p\, d_p\, C_{pp}}\,(T_s - T_p), \tag{533}$$

where $C_{pp}$ is the particle heat capacity. The heat transfer coefficient $h_{fp}$ is usually given by empirical expressions in terms of the non-dimensional Nusselt and Prandtl numbers

$$Nu = \frac{h_{fp}\, d_p}{\lambda_f}, \qquad Pr = \frac{\nu_f}{\Gamma}, \tag{534}$$

which have for example the form

$$Nu = 2 + 0.6\, Re_p^{1/2}\, Pr^{1/3}. \tag{535}$$
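As a rough illustration of how these relations chain together, the sketch below computes a Nusselt number from the correlation, inverts the Nusselt definition to get $h_{fp}$, and relaxes a particle temperature towards a constant fluid temperature. It assumes a spherical particle, so that the heat balance gives a relaxation time $\rho_p d_p C_{pp}/(6 h_{fp})$ with $\rho_p$ the particle density; the function names and any numerical values passed to them are hypothetical and chosen only for the example.

```python
import math

def nusselt_ranz_marshall(re_p, pr):
    """Correlation Nu = 2 + 0.6 Re_p^(1/2) Pr^(1/3)."""
    return 2.0 + 0.6 * math.sqrt(re_p) * pr ** (1.0 / 3.0)

def heat_transfer_coefficient(nu, lambda_f, d_p):
    """Invert Nu = h_fp d_p / lambda_f for h_fp."""
    return nu * lambda_f / d_p

def thermal_relaxation_time(h_fp, d_p, rho_p, c_pp):
    """Time scale tau_T such that dT_p/dt = (T_s - T_p) / tau_T for a sphere."""
    return rho_p * d_p * c_pp / (6.0 * h_fp)

def integrate_temperature(t_p0, t_s, tau_t, dt, n_steps):
    """Exact exponential update per step; valid here because T_s is held constant."""
    t_p = t_p0
    decay = math.exp(-dt / tau_t)
    for _ in range(n_steps):
        t_p = t_s + (t_p - t_s) * decay
    return t_p

if __name__ == "__main__":
    # Hypothetical values: a 100-micron bead in a gas, in SI units.
    nu = nusselt_ranz_marshall(re_p=1.0, pr=0.7)
    h_fp = heat_transfer_coefficient(nu, lambda_f=0.026, d_p=1e-4)
    tau_t = thermal_relaxation_time(h_fp, d_p=1e-4, rho_p=2500.0, c_pp=800.0)
    print("Nu =", nu, " h_fp =", h_fp, " tau_T =", tau_t)
    print("T_p after 3 tau_T:", integrate_temperature(300.0, 400.0, tau_t, tau_t / 50.0, 150))
```

In an actual two-phase calculation $T_s$ would be the fluctuating temperature of the fluid seen rather than a constant, which is precisely the modelling issue discussed in this section.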
These equations mirror the ones that express the discrete particle momentum equation, Eq. (281) and Eq. (285), with $h_{fp}$ playing in the discrete particle temperature equation the role of the drag coefficient $C_D$ in the momentum equation. In Eq. (533), $T_s$ stands for the instantaneous temperature of the fluid seen, $T_s(t) = T_f(t, x_p)$, which implies modelling issues similar to the ones detailed in Section 7.4.1 for the velocity of the fluid seen. Even if we neglect the crossing-trajectory effect and regard $T_s$ as having the same statistics as for a fluid particle, we still have to model this instantaneous fluid temperature. For the sake of simplicity, and since this does not change the modelling problem that we would like to bring up in this section, we consider only the fluid case. We also generalize the discussion to include, not specifically the fluid temperature, but any scalar $\phi$ whose local exact equation involves a molecular transport coefficient. The background is therefore provided by Section 6. We follow mainly a Lagrangian point of view in the rest of this section, the Eulerian pdf being retrieved from the Lagrangian one through the general relations, see Section 6.4.3. The one-particle pdf equation for the pdf $p^L(t; y, \psi)$ can be obtained either from the exact instantaneous field equation, Eq. (117), by applying directly the techniques of Section 3, or by starting
from the pdf equation satisfied by the joint velocity–scalar pdf, Eq. (231) in Section 6.5.2. Indeed, $p^L(t; y, \psi)$ is simply the marginal of the joint velocity–scalar pdf $p^L(t; y, V, \psi)$,

$$p^L(t; y, \psi) = \int p^L(t; y, V, \psi)\, dV. \tag{536}$$

From Eq. (231), by integration over the velocity variables we obtain the pdf equation for $p^L(t; y, \psi)$,

$$\frac{\partial p^L}{\partial t} + \frac{\partial [\langle U_i \,|\, \phi = \psi\rangle\, p^L]}{\partial y_i} = -\frac{\partial [\langle \Gamma \Delta\phi \,|\, \phi = \psi\rangle\, p^L]}{\partial \psi} - \frac{\partial [S(\psi)\, p^L]}{\partial \psi}. \tag{537}$$

This equation illustrates once again the interplay between closure terms and the hierarchy of different descriptions. At the level where we handle the joint values of the velocity and of the scalar, that is when the one-particle state vector is $Z = (y, V, \psi)$, turbulent fluxes are closed and are treated without approximation. By ‘going down’ to the reduced state vector $Z = (y, \psi)$, the effect of velocity (which has now become an external variable whereas it was an internal variable in the former case), represented by the conditional mean $\langle U_i \,|\, \phi = \psi\rangle$, has to be closed. This shows that one may have an interest in staying at the upper level even though one is mainly interested in the scalar statistics. Yet, we leave out that question and concentrate on the terms on the rhs of the pdf equation. It can be shown that the term which involves the molecular transport coefficient, here the scalar diffusivity $\Gamma$, can be exactly re-expressed as the sum of two contributions [6,7,11],

$$\frac{\partial p^L}{\partial t} + \frac{\partial [\langle U_i \,|\, \phi = \psi\rangle\, p^L]}{\partial y_i} = \frac{\partial^2 [\Gamma\, p^L]}{\partial y_i^2} - \frac{\partial^2 [\langle \epsilon_\phi \,|\, \phi = \psi\rangle\, p^L]}{\partial \psi^2} - \frac{\partial [S(\psi)\, p^L]}{\partial \psi}, \tag{538}$$

where the first term on the rhs is negligible at high Peclet numbers and where $\epsilon_\phi$ is the scalar dissipation

$$\epsilon_\phi = \Gamma \left(\frac{\partial \phi(t, x)}{\partial x_i}\right)^2. \tag{539}$$

In the above scalar pdf equation, we end up with the same form as with the viscous terms of the momentum equation, Eq. (235) in Section 6.6. This is a general result for all molecular transport terms in the exact field equations. Such terms are diffusive in physical space but yield a negative coefficient (anti-diffusion) in sample space.
For the fluid particle velocity model developed in Section 6.6, this negative coefficient was later compensated by a larger positive term arising from the model of the fluctuating pressure gradient, so that the negative square root that one would be tempted to write in the corresponding trajectory equations of the process was a mere formal intermediate. For micro-mixing models there is (unfortunately) no such supplementary term and one is faced with the difficulty of modelling an anti-diffusion coefficient. At high Peclet numbers, without reactive source terms ($S = 0$) and approximating the conditional scalar dissipation by its unconditional form, $\langle \epsilon_\phi \,|\, \phi = \psi\rangle \simeq \langle \epsilon_\phi\rangle$, we have

$$\frac{\partial p^L}{\partial t} + \frac{\partial [\langle U_i \,|\, \phi = \psi\rangle\, p^L]}{\partial y_i} = -\frac{\partial^2 [\langle \epsilon_\phi\rangle\, p^L]}{\partial \psi^2}. \tag{540}$$

The scalar modelling issue, referred to as the micro-mixing problem, is to construct the term in the trajectory equations of the process that corresponds to this second-derivative form in the pdf equation, following the equivalence between the trajectory and
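the pdf point of view for stochastic processes (see Section 2). In the absence of velocity effects, this is illustrated by the following sketch, where the question mark indicates the unknown and required term:

$$d\phi = \;?\; \quad\leftrightarrow\quad \frac{\partial p^L}{\partial t} = -\frac{\partial^2 [\langle \epsilon_\phi\rangle\, p^L]}{\partial \psi^2}. \tag{541}$$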
The scalar pdf equation looks very much like a Fokker–Planck equation (see Section 2.8) and the required model term for the trajectories of the process should be something like a Langevin stochastic differential equation. However, this cannot really be a Fokker–Planck equation since the coefficient appearing in front of the second-order derivative is negative, whereas the similar coefficient in an actual Fokker–Planck equation is always positive, as shown in Eq. (40) or in Eq. (59) for the multi-dimensional case. The micro-mixing issue, or anti-diffusion behaviour, is bound to bring about a great deal of eyebrow-raising and perhaps even downright suspicion, particularly for the mathematically oriented reader, since the equation is not well posed. It is thus useful to try to provide further physical explanations and a general picture for the origin of this behaviour.

Let us go back to the basic physical ideas leading to Langevin and Fokker–Planck diffusive equations, which are developed in Sections 2 and 4, and which seem well in place. From Section 2.8, we know that the existence of a positive second-order term in a partial differential equation of the convection–diffusion type is equivalent to the existence of a white-noise term in the particle evolution equation. If $\phi(t, x)$ is the solution of a 1D heat transfer equation, $\phi(t, x)$ can be interpreted as the law of a stochastic process, say $X$, whose trajectories undergo random walks. This can be represented as

$$dX = \sqrt{2\Gamma}\, dW \quad\leftrightarrow\quad \frac{\partial \phi(t, x)}{\partial t} = \Gamma\, \frac{\partial^2 \phi(t, x)}{\partial x^2}. \tag{542}$$

The stochastic equation for the trajectories of the process (that we will call particle equations) helps to bring out the underlying physics. A particle dynamics leads to a diffusive behaviour because it is subject to random forces or kicks from its environment.
The ‘force’ acting on this particle (the rate of change of the state variables considered) is taken as a fast variable (rapidly changing or with no memory) and as being independent of the actual state of the system considered, resulting in the white-noise term $dW$. The solution $\phi(t, x)$ of the PDE appears as a ‘macroscopic’ quantity and represents the mean or averaged behaviour of the underlying ‘microscopic’ constituents, see Section 6.2. These ‘microscopic’ constituents can be seen as the carriers of the related information (here it would be thermal energy or their kinetic energy), which they carry and propagate through the domain. In our case, these microscopic constituents are the molecules of the fluid. The emerging picture is thus: the temperature field $\phi(t, x)$ diffuses in space because each small (but macroscopic with respect to the molecules) volume exchanges molecules very rapidly with the surrounding small volumes of fluid. If we select an observation time scale (the incremental time interval in the differential equations) small with respect to the evolution of the small element of fluid but much larger than molecular time scales, and a length scale (the dimension of the small fluid elements) much larger than molecular sizes, these molecules can be regarded as being independent and varying infinitely fast. This corresponds to the discussion of the chosen ‘observation’ time scale in Section 4 and is in line with the
general picture given above. Such a model can only represent a system whose variance $\langle X^2\rangle$ constantly increases, as a direct consequence of the Langevin stochastic equation:

$$d\langle X^2\rangle = 2\Gamma\, dt > 0. \tag{543}$$
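The growth of the variance under a pure white-noise forcing is easy to check by direct simulation. The following minimal sketch assumes the random walk $dX = \sqrt{2\Gamma}\,dW$ discretized with Gaussian increments of variance $2\Gamma\,dt$; the function names, the seed and the numerical values are hypothetical and serve only the illustration.

```python
import math
import random

def simulate_random_walk(gamma, dt, n_steps, n_particles, seed=0):
    """Advance n_particles independent walkers with dX = sqrt(2 * gamma) dW."""
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * gamma * dt)   # standard deviation of one increment
    positions = [0.0] * n_particles
    for _ in range(n_steps):
        positions = [x + sigma * rng.gauss(0.0, 1.0) for x in positions]
    return positions

def variance(samples):
    """Sample variance about the sample mean."""
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)

if __name__ == "__main__":
    gamma, dt = 0.5, 0.01
    for n_steps in (25, 50, 100):
        xs = simulate_random_walk(gamma, dt, n_steps, n_particles=5000)
        t = n_steps * dt
        print(f"t={t:.2f}: sample variance={variance(xs):.3f}, 2*gamma*t={2.0 * gamma * t:.3f}")
```

The sample variance should track $2\Gamma t$ up to statistical noise of order $\sqrt{2/N}$ in relative terms for $N$ walkers.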
This variance can be interpreted as the entropy of the system: it increases as the molecules become increasingly well mixed throughout the domain.

The probabilistic description performed in the present pdf methods is different. As emphasized in Section 6.2, we actually try to follow the reverse direction compared with the previous explanation of hydrodynamical diffusion and molecular random walks. We start at the hydrodynamical level and interpret the field as an ensemble of $N$ interacting fluid particles. These fluid particles are therefore small elements of fluid (we are within the framework of continuum mechanics) or, in other words, each fluid particle is a large scale particle (compared to molecules) or a cluster gathering a large number of molecules. We then try to describe not the dynamics of this $N$-particle problem but rather $N$ one-particle dynamical problems. The difference between the points of view may be illustrated by the following sketch:

$$\underbrace{dX = \sqrt{2\Gamma}\, dW}_{\text{molecular level}} \;\rightarrow\; \underbrace{\frac{\partial T(t, x)}{\partial t} = \Gamma\, \frac{\partial^2 T(t, x)}{\partial x^2}}_{\text{hydrodynamical level}} \;\rightarrow\; \underbrace{\frac{\partial p^L}{\partial t} = -\frac{\partial^2 [\langle \epsilon_\phi\rangle\, p^L]}{\partial \psi^2}}_{\text{one-particle PDF level}} \tag{544}$$

$$\xrightarrow{\quad\text{increasingly coarser description}\quad} \tag{545}$$
If we consider a volume of fluid that contains a number of these fluid particles and that we describe as statistically homogeneous, then the fluid particles contained in that volume interact with one another through the exchange of molecules. We could explicitly calculate these interactions, in a particle formulation for example through the use of the particle strength exchange (PSE) method developed in vortex methods [116]. However, in turbulence these interactions are small-scale forces. Most of the energy exchanged in the process is due to interactions that take place within a distance which scales with the Kolmogorov inner scale $\eta$. An explicit calculation of this phenomenon would require a sufficient number of particles to be present within a distance of order $\eta$ and would require following the time evolution with a time step of the order of the Kolmogorov time scale $\tau_\eta$. These are the observation scales mentioned in the previous paragraph, which are macroscopic time scales with respect to molecular scales. This would be like a direct simulation and this is precisely what we would like to avoid in the present pdf models! We now want to express the dynamics and the time evolution of the temperature attached to each particle at a much larger time scale, see Sections 4.2 and 6.8. We are looking for a coarser description without explicitly computing the local interactions of the particle considered with all its neighbours. We simply want to account for the resulting effect of these molecular interactions on the one-particle pdf.

What is this resulting effect? Let us consider a number of fluid elements or particles that have different temperatures (our considered scalar). In the one-particle pdf sample space, this means that the pdf is spread since there is a range (either discrete or continuous) of possible values. Then, as time goes by, these temperatures will be smoothed out by the action of the fluid diffusivity and will tend towards a single value.
This smoothing action reflects the exchange of molecules between the fluid particles. At the time scale
chosen in the pdf description, larger than the Kolmogorov scale which in turn is larger than the molecular time scale, this exchange is random and infinitely fast. The resulting mixing effect is irreversible at the pdf level (thus perhaps the second-order derivative) and, in the corresponding sample space, the pdf tends towards a Dirac delta distribution. In the pdf description, the mean value of the fluid particle scalar does not change but the variance decreases:

$$d\langle \phi'^2\rangle = -2\langle \epsilon_\phi\rangle\, dt < 0. \tag{546}$$
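One particular solution makes this behaviour concrete. Assuming a constant $\langle\epsilon_\phi\rangle$ and a Gaussian initial pdf (both assumptions made here only for illustration), a Gaussian with fixed mean whose variance shrinks linearly in time, $\sigma^2(t) = \sigma_0^2 - 2\langle\epsilon_\phi\rangle t$, satisfies the anti-diffusion equation written in sketch (541). The short check below, with hypothetical function names and values, verifies this by finite differences.

```python
import math

def gaussian_pdf(psi, mean, var):
    """Gaussian density with the given mean and variance."""
    return math.exp(-0.5 * (psi - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def shrinking_gaussian(psi, t, mean=0.0, var0=1.0, eps=0.2):
    """Gaussian pdf whose variance decreases linearly: var(t) = var0 - 2 * eps * t."""
    return gaussian_pdf(psi, mean, var0 - 2.0 * eps * t)

def pde_residual(psi, t, eps=0.2, d=1e-3):
    """Finite-difference residual of dp/dt + eps * d2p/dpsi2 for the shrinking Gaussian."""
    p = lambda x, s: shrinking_gaussian(x, s, eps=eps)
    dp_dt = (p(psi, t + d) - p(psi, t - d)) / (2.0 * d)
    d2p_dpsi2 = (p(psi + d, t) - 2.0 * p(psi, t) + p(psi - d, t)) / d ** 2
    return dp_dt + eps * d2p_dpsi2

if __name__ == "__main__":
    for psi in (-1.0, 0.0, 0.3, 1.5):
        print(f"psi={psi:+.1f}: residual={pde_residual(psi, t=1.0):.2e}")
```

The solution only exists up to the finite time at which the variance reaches zero, which is one way of seeing that the equation cannot be integrated forward indefinitely as an ordinary diffusion equation would be.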
There is no new physical phenomenon involved. The variance (or the entropy) in the pdf description decreases since energy (or order) is transferred in an irreversible way to the molecules, whose disorder increases. Both evolutions, Eqs. (543) and (546), are two manifestations of molecular random motions. The evolution at the pdf level is the reflection of the relaxation of macroscopic systems towards thermodynamical equilibrium through increasing molecular disorder.

The micro-mixing issue, indicated by the sketch (541), is one of the key difficulties of the subject and it remains a much discussed and open question. No satisfactory model representing this pdf behaviour in a trajectory formulation has been proposed, at least in the turbulence community. Interestingly enough, this question has also appeared recently in other physical situations. For example, it is mentioned in an appendix (appendix S.11) of the latest edition of Risken’s book [117], where the idea of ‘doubling the phase space variables’ to retrieve a positive diffusion matrix is briefly put forward. Similar notions are also addressed in the reference book of Gardiner, particularly in Section 7.7.4 of [15], where complex stochastic differential equations are introduced. This is achieved through what is referred to as the Poisson representation, and the direct analogy with our present micro-mixing modelling issue, although striking, remains to be properly established. It seems therefore that there is room for improvement and that theoretical work, perhaps in connection with the above-mentioned works, would greatly help to devise better and physically sound model proposals. At the moment, the modelling problem remains partially and even poorly treated, with the limited objective of representing the correct evolution of the mean scalar $\langle\phi\rangle$ and the decrease of its variance $\langle\phi'^2\rangle$ instead of representing the correct evolution of the pdf $p^L(t; y, \psi)$.
This is really a limiting problem, and any improvement would have important consequences for practical calculations. As an example of current closures, the simplest model replaces the real process by a linear return-to-equilibrium towards the mean (the IEM model) [8],

$$d\phi = -\frac{\phi - \langle\phi\rangle}{\tau_\phi}\, dt, \tag{547}$$

where $\tau_\phi$ is the scalar time scale. It is seen that, in sample space, the second-order derivative term has been replaced by a first-order one (the model involves only a drift term), which highlights its limitations. In sample space, the pdf equation is now of convection type,

$$\frac{\partial p^L}{\partial t} = \frac{\partial}{\partial \psi}\left[\frac{\psi - \langle\phi\rangle}{\tau_\phi}\, p^L\right]. \tag{548}$$

Therefore, in homogeneous turbulence starting from an initial two-value discrete pdf, the IEM model predicts that though the variance $\langle\phi'^2\rangle$ decreases, the shape of the pdf is conserved
contrary to the real one which tends towards a Gaussian distribution with a vanishing standard deviation. A number of other models have been proposed that come more or less close to the exact process. They have been discussed at length in most reviews on the subject, [7–9] among others, and the reader is referred to these works for further details and tests of the various models.

9.3.4. Path-integral ideas for dispersion models

In this section, we go back to the core subject of this paper, two-phase flow modelling. The propositions developed above aim at improving current stochastic models by using more information provided by fundamental approaches (see Section 9.3.1) or by introducing spatial information (see Section 9.3.2). This will obviously increase the complexity of the models and the computational costs, and in turn could limit their range of applications. Yet, there is clearly one aspect, even within the present one-point formalism, that is in need of improvement: the modelling of the velocity of the fluid seen by discrete particles. The modelling issues have been detailed in Section 7.4. It was explained that, even if we accept the present models for the velocity of a fluid particle, the issue in turbulent two-phase flows is to model the successive velocities of the fluid seen or sampled by a discrete particle as it moves across a turbulent flow. A careful analysis of the differences between these two variables, the velocity of a fluid particle $U_f$ and the velocity of the fluid seen $U_s$, has been proposed in Section 7.4. However, it is clear that the derivation of a Langevin model for $U_s$ implies additional assumptions compared to a Langevin model for $U_f$ (see the discussion of the crossing-trajectory effect in Section 7.4.1 and the application of the Kolmogorov hypothesis in Section 7.4.2). Improving the present closure relations of Section 7.4.2 would certainly enhance the precision of the numerical predictions in most practical cases.
This improvement would be obtained within the same one-particle pdf approach, without going to higher-order pdf descriptions. Moreover, such a description would not impair the applicability of the approach to practical problems. It seems difficult to keep on striving to devise better models by fiddling with the different terms of the stochastic equations (the drift and the diffusion coefficients) and by trying to obtain better comparisons against various experimental data sets. A strong reason for this limitation is that we cannot be helped in the construction of these models by the knowledge of the macroscopic laws (see Sections 7.1 and 7.2). Consequently, it is believed that too much uncertainty limits the confidence we can have in present closures and that a breakthrough is needed. Such a breakthrough requires that the modelling approach be first put on firm ground with a clear theoretical setting. In other words, we need the help of a fundamental approach.

The approach we propose to follow is a path-integral and variational approach. We first outline the main characteristics of the general path-integral approach and then suggest how this approach can be used for our modelling issue. Originating in quantum mechanics, this method, which extends the Lagrangian/Hamiltonian variational ideas of classical mechanics, has a direct interpretation for stochastic diffusion processes [118]. In Section 2, we have emphasized that stochastic diffusion processes can be addressed from two points of view: the trajectory and the pdf point of view. In a 1D formulation for a stochastic process $Z$, the trajectory point of view consists in a Langevin equation

$$ dZ = A(t, Z)\,dt + B(t, Z)\,dW, \tag{549} $$
Fig. 43. Representation of one path between the two states $(t_0, z_0)$ and $(t, z)$.
while the pdf point of view consists in the Fokker–Planck equation, satisfied by the pdf $p(t, z)$ of the process in sample space,

$$ \frac{\partial p}{\partial t} = -\frac{\partial [A(t, z)\,p]}{\partial z} + \frac{1}{2}\,\frac{\partial^2 [B^2(t, z)\,p]}{\partial z^2}. \tag{550} $$
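The correspondence between the two points of view can be illustrated numerically. The following is a minimal sketch, not taken from the original text: it integrates Eq. (549) with an explicit Euler scheme for an assumed linear drift $A = -z/T$ and an assumed constant diffusion coefficient $B$ (the names `T_RELAX` and `B_CONST` and all numerical values are illustrative choices), and compares the sample variance with the stationary variance $B^2 T/2$ of the corresponding Fokker–Planck equation (550).

```python
import numpy as np

def euler_maruyama(drift, diffusion, z0, t0, t_end, n_steps, n_paths, rng):
    """Integrate dZ = A(t, Z) dt + B(t, Z) dW with the explicit Euler scheme."""
    dt = (t_end - t0) / n_steps
    z = np.full(n_paths, z0, dtype=float)
    t = t0
    for _ in range(n_steps):
        # Wiener increments: independent Gaussians with zero mean and variance dt
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        z = z + drift(t, z) * dt + diffusion(t, z) * dw
        t += dt
    return z

# Assumed test case (not from the paper): linear drift, constant diffusion.
T_RELAX = 1.0   # relaxation time scale T in A = -z / T
B_CONST = 1.0   # constant diffusion coefficient B

rng = np.random.default_rng(0)
z_end = euler_maruyama(
    drift=lambda t, z: -z / T_RELAX,
    diffusion=lambda t, z: B_CONST,
    z0=2.0, t0=0.0, t_end=5.0, n_steps=500, n_paths=20000, rng=rng,
)

sample_variance = z_end.var()
stationary_variance = B_CONST**2 * T_RELAX / 2.0  # stationary solution of Eq. (550)
print(sample_variance, stationary_variance)
```

The two numbers agree only up to the statistical error of the sample and the time-discretization error of the Euler scheme, which is why no exact value is quoted here.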
The correspondence is explained in the general case in Section 2.8. There is actually a third representation, which is the path-integral. This representation, built from the trajectory and pdf points of view, expresses the probability to follow one particular path between two possible states of the stochastic process at two different times, $(t_1, z_1)$ and $(t_2, z_2)$. We briefly recall the main characteristics of this way of handling stochastic processes, which can be found in a few textbooks [119–121]. It is illustrated in Fig. 43, which represents one particular path connecting the initial state $(t_0, z_0)$ and the final state $(t, z)$ through a number of intermediate values $z_i$ at the intermediate times $t_i$, when the finite time interval is split into small time intervals. The transitional probability density $p(t, z \mid t_0, z_0)$ to have the value $z$ for the process $Z$ at time $t$, given that we had the value $z_0$ at time $t_0$, can be worked out by the successive use of the Chapman–Kolmogorov relation, Eq. (14),

$$ p(t, z \mid t_0, z_0) = \int \cdots \int p(t, z \mid t_n, z_n)\, p(t_n, z_n \mid t_{n-1}, z_{n-1}) \cdots p(t_2, z_2 \mid t_1, z_1)\, p(t_1, z_1 \mid t_0, z_0)\; dz_n\, dz_{n-1} \ldots dz_1. \tag{551} $$
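A single step of this chain can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original derivation: it takes constant coefficients $A = a$ and $B = b$ (hypothetical values), for which the transitional pdf is an exact Gaussian, and verifies by quadrature over one intermediate value $z_1$ that the Chapman–Kolmogorov relation holds.

```python
import numpy as np

def transition_pdf(z, t, z0, t0, a=0.3, b=0.8):
    """Gaussian transitional pdf of dZ = a dt + b dW with constant a and b."""
    var = b**2 * (t - t0)
    return np.exp(-(z - z0 - a * (t - t0))**2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

t0, t1, t2 = 0.0, 0.4, 1.0   # initial, intermediate and final times
z0, z2 = 0.0, 0.5            # initial and final states

# Grid for the intermediate state z1 (wide enough for the Gaussian tails to vanish)
z1 = np.linspace(-10.0, 10.0, 4001)
dz1 = z1[1] - z1[0]

# Right-hand side: integral over z1 of p(t2, z2 | t1, z1) p(t1, z1 | t0, z0)
integrand = transition_pdf(z2, t2, z1, t1) * transition_pdf(z1, t1, z0, t0)
rhs = np.sum(integrand) * dz1

# Left-hand side: direct transitional pdf between (t0, z0) and (t2, z2)
lhs = transition_pdf(z2, t2, z0, t0)
print(lhs, rhs)
```

Repeating the same composition over $n$ intermediate times gives the multiple integral of Eq. (551).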
If we split the time interval $(t_0, t)$ into $N + 1$ identical subintervals of equal duration $\Delta t$, with $\Delta t = (t - t_0)/(N + 1)$, we can approximate the relation between the intermediate states $z_i$ and $z_{i+1}$ by an Euler scheme (see Section 2.10),

$$ \frac{1}{B(t_i, z_i)}\,\bigl(z_{i+1} - z_i - A(t_i, z_i)\,\Delta t\bigr) = dW_i. \tag{552} $$

From the property of the Wiener process given in Section 2.4.2, the incremental conditional pdf is

$$ p(t_{i+1}, z_{i+1} \mid t_i, z_i) = \frac{1}{\sqrt{2\pi B^2(t_i, z_i)\,\Delta t}}\, \exp\left\{ -\frac{\bigl(z_{i+1} - z_i - A(t_i, z_i)\,\Delta t\bigr)^2}{2 B^2(t_i, z_i)\,\Delta t} \right\} \tag{553} $$
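and by adding all the terms within the exponential expression in Eq. (551) we get

$$ p(t, z \mid t_0, z_0) = \lim_{N \to \infty} \int \prod_{i=1}^{N+1} \frac{dz_i}{\sqrt{2\pi B^2(t_i, z_i)\,\Delta t}}\; \exp\left\{ -\Delta t \sum_{i=1}^{N+1} \frac{\bigl((z_{i+1} - z_i)/\Delta t - A(t_i, z_i)\bigr)^2}{2 B^2(t_i, z_i)} \right\}. \tag{554} $$

The short-hand notation for the above integral is

$$ p(t, z \mid t_0, z_0) = C \int \exp\{-S[z(\tau)]\}\; \mathcal{D}[z(\tau)], \tag{555} $$

where $C$ is a normalization factor and where

$$ S[z(\tau)] = \int_{t_0}^{t} L[z(\tau)]\, d\tau, \tag{556a} $$

$$ L[z(\tau)] = L(\dot{z}(\tau), z(\tau)) = \frac{1}{2 B^2(\tau, z(\tau))}\, \bigl[\dot{z}(\tau) - A(\tau, z(\tau))\bigr]^2, \tag{556b} $$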
where $\dot{z}(\tau)$ is the time derivative of $z(\tau)$. In these relations, the notation $z(\tau)$ represents a particular trajectory between $(t_0, z_0)$ and $(t, z)$ and denotes the complete time function (for $\tau$ varying from $t_0$ to $t$). The quantities $S$ and $L$ are functions of the whole trajectory. They are thus functionals, and we use the classical notation $L[\cdot]$ to indicate that $L$ depends on all values of $z(\tau)$, $\tau \in [t_0, t]$. The resulting expression, Eq. (555), yields what is referred to as the path-integral representation of a diffusion process. It has the form of a sum over all histories: the probability to start with the value $z_0$ at time $t_0$ and to end up with the value $z$ at time $t$ is the sum over all the possible paths that connect $(t_0, z_0)$ to $(t, z)$, each path $z(\tau)$ being weighted by the factor $\exp\{-S[z(\tau)]\}$.

The above expressions are similar to the variational approach to classical mechanics [122]. The functional $S[z(\tau)]$ can be regarded as the action along a given path $z(\tau)$ and the functional $L[z(\tau)]$ as the equivalent of the classical Lagrangian extended to a stochastic context. Loosely speaking, Eq. (555) means that the probability $p(t, z \mid t_0, z_0)$ to go from $(t_0, z_0)$ to $(t, z)$ is the probability to follow one particular path, summed over all the possible paths, the probability to follow one path being proportional to $\exp\{-S[z(\tau)]\}$. These forms of the Lagrangian and of the action are referred to as Onsager–Machlup forms, after the authors who first derived this representation [123,124].

The above continuous forms of the path-integral representation, Eqs. (555) and (556), are actually symbolic expressions. From a mathematical point of view, the 'measure' written in Eq. (555), $\mathcal{D}[z(\tau)]$, does not have a well-defined meaning on the ensemble of the trajectories $z(\tau)$. Furthermore, what is loosely called the 'probability' to follow one particular path $z(\tau)$, $\exp\{-S[z(\tau)]\}$, is also not well defined.
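In practice the action is evaluated in its discrete form, i.e. the sum appearing in the exponential of Eq. (554). The following sketch shows one possible way to do this for a path given by its values at equally spaced times. The drift $A = -z$, the constant $B = 0.5$ and the sinusoidal perturbation are assumptions made for illustration only. A path that follows the drift exactly has a vanishing discrete action, and any perturbation of it increases the action and therefore lowers its weight $\exp\{-S\}$.

```python
import numpy as np

def discrete_action(path, t0, t_end, drift, diffusion):
    """Discrete action: dt * sum_i ((z_{i+1} - z_i)/dt - A(t_i, z_i))^2 / (2 B(t_i, z_i)^2)."""
    path = np.asarray(path, dtype=float)
    n_steps = len(path) - 1
    dt = (t_end - t0) / n_steps
    times = t0 + dt * np.arange(n_steps)   # left end of each time step
    z = path[:-1]                           # state at the left end of each step
    velocity = np.diff(path) / dt           # finite-difference 'derivative' of the path
    lagrangian = (velocity - drift(times, z))**2 / (2.0 * diffusion(times, z)**2)
    return dt * np.sum(lagrangian)

# Assumed coefficients (illustrative only)
drift = lambda t, z: -z
diffusion = lambda t, z: 0.5 + 0.0 * z

n_steps, t0, t_end = 100, 0.0, 1.0
dt = (t_end - t0) / n_steps

# Path generated by the Euler scheme without noise: z_{i+1} = z_i + A(t_i, z_i) dt
drift_path = (1.0 - dt) ** np.arange(n_steps + 1)

# Same path with a smooth bump that vanishes at both end points
bump = 0.1 * np.sin(np.pi * np.linspace(0.0, 1.0, n_steps + 1))

action_drift_path = discrete_action(drift_path, t0, t_end, drift, diffusion)
action_perturbed = discrete_action(drift_path + bump, t0, t_end, drift, diffusion)
print(action_drift_path, action_perturbed)
```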
Indeed, the expression of the Lagrangian functional $L$ uses the derivative $\dot{z}(\tau)$ along the path $z(\tau)$. However, we have seen in Section 2 that one of the characteristics of a diffusion process is that the trajectories are continuous but nowhere differentiable. The expression entering the Lagrangian functional $L$ is thus meaningless in the continuous sense. Yet, the formulas can be used in a discrete sense, as in Eq. (554). From the physical point of view, they nevertheless have a clear and appealing meaning. We can consider complete paths $z(\tau)$ and sort them with respect to their relative contribution to the sum. In that sense, the best meaning for
the functional $S[z(\tau)]$ is perhaps as an importance function for the set of trajectories between $(t_0, z_0)$ and $(t, z)$.

The path-integral representation, Eqs. (555) and (556), is just another way to express the properties of a stochastic diffusion process. It does not really add anything new to the knowledge of $Z$. If we already know either the Langevin stochastic differential equation for the trajectories or the Fokker–Planck equation, then this is simply a reformulation of a closed problem. The strong interest of the path-integral approach, for our present concerns, is to reverse that point of view and to follow an action principle. In that approach, a Langevin model is derived from the functionals $S$ and $L$ as follows [125]. If we have an idea of a suitable or reasonable functional $L[z(\tau)]$ that represents the relative importance of the different paths $z(\tau)$, then we can work out a Langevin equation by the following procedure. We first calculate the mean path $\bar{z}(\tau)$, which is the path that minimizes the action functional $S[z(\tau)]$ (assuming there is only one such path),

$$ \bar{z}(\tau) \quad \text{such that} \quad \min_{z(\tau)} S[z(\tau)] = S[\bar{z}(\tau)]. \tag{557} $$
The mean path corresponds to the trajectory on which the functional derivative of $S$ is the zero function,

$$ \frac{\delta S[\bar{z}(\tau)]}{\delta \bar{z}(\tau)} = 0. \tag{558} $$

By writing the 'derivative' along the mean path as

$$ \dot{\bar{z}}(\tau) = A(\tau, \bar{z}(\tau)), \tag{559} $$

we get the desired expression for the drift coefficient $A$ of the Langevin SDE. If we now make a quadratic approximation of the Lagrangian around the mean path,

$$ L[z(\tau)] \simeq D(\tau, z(\tau))\, \bigl(z(\tau) - \bar{z}(\tau)\bigr)^2, \tag{560} $$

where $D(\tau, z(\tau))$ is positive since $\bar{z}(\tau)$ is the minimum of $L$ and of $S$, the diffusion coefficient $B$ for the Langevin equation is then

$$ B(\tau, z(\tau)) = \frac{1}{\sqrt{2 D(\tau, z(\tau))}} \tag{561} $$

and is given by the second-order derivative of the functional $L[z(\tau)]$ for the mean path $\bar{z}$.

How can this approach be put to practical use for our concern, which is to model the successive velocities of the fluid seen by discrete particles in a turbulent flow? To see this, it is necessary to go back to the discussion of Section 7.4.1. The modelling issue is to derive a model for $U_s$ taking into account particle inertia and crossing-trajectory effects. In a one-particle approach, the dispersion model is built on given models for the Lagrangian increments of the velocity of a fluid particle and on Eulerian correlations between two particles at the same time. The classical scheme is given in Fig. 27, where only one fluid particle location is considered. That scheme was analysed in Section 7.4.1 and, to get around the difficulties it creates, a simplified picture based on the mean relative velocity (see Fig. 29) was used as the starting point for the derivation of Langevin models in Section 7.4.2. Compared to what was just a qualitative analysis there, the path-integral representation provides a powerful tool to delve into the question in a systematic
Fig. 44. Sketch of the possible Lagrangian correlation step between $F(t_1)$ and $F(t_2)$ and Eulerian correlation step between $F(t_2)$ and $F'(t_2)$ for the velocity of the fluid seen.
and consistent way. If we consider a discrete particle at two times, say $t_1$ and $t_2$, and the relative motion of the fluid particle $F$ located around the discrete particle at time $t_1$, there is actually an infinite range of possibilities for the Lagrangian and Eulerian correlation steps (see Section 7.4.1 for the details of these correlation steps). This is represented in Fig. 44, where four possibilities are displayed. For the same discrete particle motion, we cannot say that the fluid particle will go to a given location, but we can say that it has a certain probability of going there. Therefore, all four sketches in Fig. 44 are possible but do not have the same 'importance'. In Section 7.4.1, we already used a similar reasoning to point out the shortcomings due to a 'poor choice' of the relative disposition of the fluid and the discrete particle locations (see Fig. 28). The path-integral formalism can now help us to select consistently the 'relevant path' on which to build a stochastic Langevin approximation for $U_s$.

We propose the following notion. From the previous description, we assign to each path $z_s(\tau)$ connecting the value of $U_s$ at time $t_1$ (represented by the velocity of the particle $F(t_1)$ in Fig. 44) to the value of $U_s$ at time $t_2$ (represented by the velocity of the particle $F'(t_2)$ in Fig. 44) an importance function, or loosely speaking a probability, say $L_s[z_s(\tau)]$. This Lagrangian functional $L_s$ for the paths of the velocity of the fluid seen can be proposed directly. Another possibility is to try to build it by a combination of the Lagrangian step and of the Eulerian step. Indeed, we can assign to the Lagrangian step, indicated by [L] in Fig. 44, a Lagrangian functional, say $L_L$. Likewise, we can assign to the Eulerian step, indicated by [E] in Fig. 44, another Lagrangian functional, say $L_E$, for the paths that link the two possible values of the velocities of the fluid particles $F'(t_2)$ and $F(t_2)$.
The sum of these two steps, that is the link between the possible values of $F(t_1)$ and $F'(t_2)$, can be regarded as a complete path and we write $z_s = z_L + z_E$. The complete Lagrangian functional, $L_s$, that will describe the path of the velocity of the fluid seen can be taken as the sum of the two
elementary Lagrangian functionals,

$$ L_s[z(\tau)] = L_L[z_L(\tau)] + L_E[z_E(r(\tau))]. \tag{562} $$

The Eulerian step is dependent on the Lagrangian one, and we loosely indicate this by the notation $r(\tau)$. The total action is then

$$ S_s[z(\tau)] = \int_{t_1}^{t_2} L_s[z(\tau)]\, d\tau. \tag{563} $$

Once we know $S_s$, we can apply an action principle and the above procedure to derive a Langevin equation model. We say that the relevant, or most important, path is the complete path $\bar{z}_s(\tau)$ that minimizes the action,

$$ \bar{z}_s(\tau) \quad \text{such that} \quad \min_{z_s(\tau)} S_s[z_s(\tau)] = S_s[\bar{z}_s(\tau)]. \tag{564} $$
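The minimization itself is straightforward once an action is available in discrete form. Since no expression for the Eulerian functional has been proposed, the sketch below does not implement Eq. (562). As a stand-in, and purely as an assumption, it uses the Onsager–Machlup Lagrangian with a linear drift $A = -a z$ and a constant diffusion coefficient (which then drops out of the minimization), and computes the most important path between two fixed end states. The function name and all numerical values are hypothetical. The result is compared with the solution of the Euler–Lagrange equation of the continuous problem, $\ddot{z} = a^2 z$.

```python
import numpy as np

def minimum_action_path(z_start, z_end, t_span, n_steps, a):
    """Minimise sum_i (z_{i+1} - (1 - a dt) z_i)^2 over the interior points.

    This is the discrete action for the assumed drift A = -a z and a
    constant B, with both end points held fixed.
    """
    dt = t_span / n_steps
    c = 1.0 - a * dt
    n_int = n_steps - 1
    matrix = np.zeros((n_int, n_int))
    rhs = np.zeros(n_int)
    # Stationarity with respect to z_k: (1 + c^2) z_k - c z_{k-1} - c z_{k+1} = 0
    for k in range(n_int):
        matrix[k, k] = 1.0 + c**2
        if k > 0:
            matrix[k, k - 1] = -c
        if k < n_int - 1:
            matrix[k, k + 1] = -c
    # The fixed end points move to the right-hand side
    rhs[0] += c * z_start
    rhs[-1] += c * z_end
    interior = np.linalg.solve(matrix, rhs)
    return np.concatenate(([z_start], interior, [z_end]))

a, t_span, n_steps = 1.0, 2.0, 200
z_start, z_end = 1.0, 0.5
path = minimum_action_path(z_start, z_end, t_span, n_steps, a)

# Continuous-time minimiser for comparison (solution of z'' = a^2 z)
tau = np.linspace(0.0, t_span, n_steps + 1)
analytic = (z_start * np.sinh(a * (t_span - tau)) + z_end * np.sinh(a * tau)) / np.sinh(a * t_span)
max_difference = np.max(np.abs(path - analytic))
print(max_difference)
```

For a non-quadratic action the linear solve would have to be replaced by an iterative minimization, but the structure of the calculation stays the same.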
This should yield the drift terms of the Langevin approximation, while the positive function that is the coefficient in the quadratic expansion around $\bar{z}_s(\tau)$ should give the diffusion coefficient.

At this point, it must be repeated that the previous description does not pretend to lay out a complete and final solution. That description should simply be regarded as a proposal. It is not claimed that following the path-integral representation is a necessity; it is merely suggested that, through this formalism, one can approach the problem in a rigorous framework. Indeed, the knowledge of the action leads to a clear definition of the relevant path and may help to avoid the difficulties encountered in the heuristic models which are mentioned in Section 7.4. To date, this path-integral approach has not been followed. Nevertheless, it appears as an interesting possibility to marry theoretical tools from other fields of physics with the rather practical concerns of two-phase flow modelling. Much work remains to be done, which requires insights from researchers conversant with the path-integral concepts and their manipulation. If we accept a Langevin model for the velocities of a fluid particle, the Lagrangian functional $L_L$ (for the first, Lagrangian, correlation step [L]) is given by an Onsager–Machlup expression, Eq. (556). On the other hand, a proper expression for the Lagrangian functional describing the Eulerian correlation [E] has yet to be proposed. As mentioned in Section 7.4.1, one must also account for the fact that the Eulerian step is actually conditioned on the Lagrangian one. It is here simply hoped that these challenges will be deemed worthy of consideration.

9.3.5. Particle–particle interactions and granular behaviour

Up till now, we have mainly treated the diffusion and dispersion problems (diffusion for fluid particles and dispersion for discrete particles) in Sections 6 and 7, respectively, and the problem of turbulence modulation has been briefly touched upon in Section 8. Broadly speaking, one can state that as the concentration of discrete particles increases, one encounters first the one-way coupling case (where the discrete particles are dispersed by the turbulent fluid), then the two-way coupling situation (where particles modify the intensity and possibly the nature of turbulence) and finally four-way coupling, when the relative distance between particles is small enough for particle–particle interactions to occur (particles start to collide in the case of hard spheres, or there is coalescence and break-up in the case of deformable particles such as bubbles and droplets).
Here, no attempt at classifying turbulent dispersed two-phase flows is proposed, i.e. we do not try to predict, for example from characteristic time scales, when turbulence modulation and particle–particle interactions become relevant mechanisms. Instead, the physics of four-way coupling is briefly presented and discussed from the physical and modelling points of view, and this only for hard spheres, a case which is in line with the field equations presented in Section 8. As far as the physics of particle–particle interactions between non-rigid particles is concerned, the reader can find suitable information in [126,127].

In the case of hard spheres immersed in an interstitial fluid, it is quite intricate to address the problem in a general way. Instead, it is often necessary to consider distinct cases that correspond to different areas of physics. If the fluid flow is not turbulent (which is not the case of the present work), a problem often referred to as sedimentation (discrete particles immersed in a fluid whose density has the same order of magnitude as $\rho_p$), hydrodynamic interactions become important, as mentioned in Section 7, and the nature of the contact between particles, if any, is a subject of controversy [128]. Here, only turbulent fluid flows are under consideration and hydrodynamic interactions are neglected. There is no formal proof for this, but some guidance can be found in the work of Saffman [129], who showed that as long as the relative distance between particles is large enough, the velocity perturbations on a given particle induced by the surrounding particles remain small.

Let us define a time scale $\tau_c$ which characterizes particle–particle interactions, for example the time experienced by a given particle between two consecutive binary collisions, in the spirit of the kinetic theory [20].
One can state that when $\tau_c \ll \tau_p$, there is almost no influence of the fluid on the collisional mechanisms (although energy can still be mainly supplied by the fluid provided its agitation is high enough), a regime which is called dry granular flows in the engineering community and, more recently, granular matter in physics (a large collection of small grains, under conditions in which the Brownian motion of the grains is negligible [2]). When $\tau_p \sim \tau_c$, collisions are controlled by fluid-dynamic properties, a situation which is of course much more complex than the case of granular matter where the influence of the fluid is negligible. In the case where $\tau_p \ll \tau_c$, the motion of particles is mainly controlled by aerodynamic forces. Using $\tau_p$ and $\tau_c$ alone is, of course, not enough for an exhaustive definition of the different regimes. One might need to know how the particles respond to gas-phase turbulence by comparing $\tau_p$ and $T_L$, see Section 6 (the ratio $\tau_p / T_L$ is called the Stokes number).

A typical example of the complexity of the physics when the interstitial fluid influences the collisional motion between the particles ($\tau_p \sim \tau_c$) is the evaluation of the time scale $\tau_c$. In dry granular flows that are rapidly sheared, by analogy with the kinetic theory (this will soon be explained) and assuming molecular chaos (two colliding particles have uncorrelated velocities), $\tau_c$ is given as the ratio between the mean free path (which is a function of $d_p$ and of the particle volumetric fraction) and a characteristic fluctuating velocity. The assumption of molecular chaos is valid when $T_L \ll \tau_p$, that is, when the particle motion is hardly affected by the turbulent motion of the fluid. However, when particle motion is influenced by the turbulent motion of the fluid, $\tau_p \sim T_L$, the assumption of molecular chaos is no longer valid.
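To fix orders of magnitude, the time scales can be estimated as in the following sketch. It rests on assumptions that are not made in the text: Stokes drag for the particle relaxation time, $\tau_p = \rho_p d_p^2 / (18 \mu_f)$, and the dilute hard-sphere kinetic-theory estimate for the collision time, i.e. a mean free path $d_p / (6\sqrt{2}\,\alpha_p)$ divided by the mean speed $\sqrt{8 T / \pi}$ of a Maxwellian distribution, where $T$ is the velocity variance per component. All function names and numerical values (particle size, volumetric fraction, velocity variance, $T_L$) are hypothetical.

```python
import math

def particle_relaxation_time(rho_p, d_p, mu_f):
    """Stokes-drag relaxation time tau_p = rho_p d_p^2 / (18 mu_f) (assumed drag law)."""
    return rho_p * d_p**2 / (18.0 * mu_f)

def collision_time(d_p, alpha_p, velocity_variance):
    """Dilute hard-sphere estimate of tau_c: mean free path / mean speed."""
    mean_free_path = d_p / (6.0 * math.sqrt(2.0) * alpha_p)
    mean_speed = math.sqrt(8.0 * velocity_variance / math.pi)  # Maxwellian velocities
    return mean_free_path / mean_speed

# Illustrative values: 100 micron glass beads in air
rho_p = 2500.0             # particle density (kg/m^3)
d_p = 100.0e-6             # particle diameter (m)
mu_f = 1.8e-5              # dynamic viscosity of the fluid (Pa s)
alpha_p = 0.01             # particle volumetric fraction
velocity_variance = 0.01   # particle velocity variance per component (m^2/s^2)
t_lagrangian = 0.05        # fluid Lagrangian time scale T_L (s), assumed

tau_p = particle_relaxation_time(rho_p, d_p, mu_f)
tau_c = collision_time(d_p, alpha_p, velocity_variance)
stokes_number = tau_p / t_lagrangian

print(f"tau_p = {tau_p:.4e} s, tau_c = {tau_c:.4e} s")
print(f"tau_c / tau_p = {tau_c / tau_p:.3f}, Stokes number = {stokes_number:.3f}")
```

Such estimates only indicate which regime is likely. As stated above, the molecular chaos assumption behind the expression for $\tau_c$ is itself questionable when the particle motion is influenced by the turbulent motion of the fluid.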
If large-scale instabilities (or turbulence) are present in the fluid, one might end up with a situation where particles are dragged by the fluid (particles relax fast to changes in the local instantaneous fluid velocity field) and do not collide, since their motion is well correlated with the large-scale motions of the fluid (or, if they happen to collide, their velocities will be correlated), i.e. $\tau_c \to \infty$. In that
Fig. 45. Energy path in a rapid granular flow (when the interstitial fluid is involved in the granular flow, additional mechanisms are indicated in bold style).
case, spatial information might be needed to solve the problem. Before we go on to some proposals and perspectives for this type of flow, we start with a description of granular matter, which is a simplified case of our general problem.

The interest in granular flows is not new, and outstanding pioneers like Coulomb, Reynolds and Bagnold [130], to name a few, had already gathered knowledge in the science of granular materials. Over the years the subject has received a great deal of attention in chemical and mechanical engineering (where granular flows are ubiquitous) and more recently in physics, where granular matter is a new type of condensed matter and has become a fruitful metaphor for describing microscopic, dissipative dynamical systems and the concept of self-organized criticality [131,132]. Even though there is a striking analogy between granular matter and the other forms of matter (solid, liquid, gas), granular matter exhibits unique properties in both its solid-like and fluid-like behaviour [2,133]. This is mainly due to two reasons: (i) ordinary temperature plays no role (the relevant energy scale is potential energy, and also kinetic energy if particles are dragged by a fluid, but not thermal energy $k_B T$); (ii) the interactions between the particles (grains) are dissipative because of static friction (solid-like state) or the inelasticity of collisions (fluid-like state), and also because of friction if particles are dragged by a fluid.

Granular matter is an unusual solid, liquid or gas. Fill a container with sand and the pressure at the bottom will reach a maximum value independent of the height. Vibrate the container and one will find that the degree of compaction is history dependent. Pour the grains on a flat table and motion will stop almost instantaneously. If a heap is formed, pour some more grains and phase transitions (solid-like and fluid-like) can be observed.
These simple experiments (and many others) clearly indicate that granular matter is a non-conventional fluid or liquid. These non-conventional behaviours raise fundamental questions: is it possible to describe granular matter using a field approach, as is done in continuum mechanics? Is it possible to describe phase transitions with classical arguments from statistical physics? There is no general agreement on the subject. For example, as far as the derivation of hydrodynamic equations in the fluid-like case is concerned, some authors are inclined to say 'no' [134], whereas for others it is a subject of controversy [133].

In the field of classical mechanics, the problem of the derivation of field equations has been addressed for many years for the so-called rapid granular flows, Campbell [135], i.e. the fluid-like behaviour of granular matter that is rapidly sheared, so that there is a constant free motion of the particles where only short-duration contacts are involved. In this particular case, the concept of granular temperature can be introduced, $T_p = \langle u_{p,i}\, u_{p,i} \rangle / 3$. This internal energy, which is supplied by external forces, is dissipated into heat by inelastic collisions. The energy path is described as follows, Fig. 45. Driving forces (gravity, motion of external boundaries,
forces exerted by the fluid) supply the system of grains with kinetic energy. Part of the kinetic energy is converted, via shear work (due to velocity gradients), into random agitation of the particles (granular temperature). This internal energy is dissipated into (thermodynamic) heat due to collisions (small deformations at the surface of the particles). When the interstitial fluid drags the particles, heat is also dissipated due to friction (drag force) if one assumes that the particles are smaller than the Kolmogorov length scale, so that perturbations in the fluid velocity field are directly dissipated into heat. If not, spatial information is once again needed.

Once the concepts of rapid granular flows and granular temperature are accepted, and especially their relevance in physical and engineering applications, the physical similarity with the kinetic theory of gases is a fact. The questions which remain to be answered are: to what extent can dissipative gases fit in the framework of the kinetic theory? Under what assumptions can we derive macroscopic equations? In the particular case of a population of smooth, rigid, non-rotating, identical spheres, these questions were answered mainly by Savage and Jenkins, see for example [136,137], and their results were later extended to gas–solid flows [78,79]. In the case where the interstitial gas drags the particles, the procedure can be sketched as follows:

unclosed Boltzmann equation
↓ (simplified closure on $\langle U_{s,i} \mid y_p, V_p \rangle$; molecular chaos assumption)
closed Boltzmann equation
↓
unclosed mean field equations
↓ (small departure from equilibrium; Grad's 13-moment approximation; simplified collision model and $1 - e \ll 1$)
closed mean field equations

The starting point of the 'kinetic theory of granular flows' is an unclosed Boltzmann-like equation on the one-point particle pdf $p(t, y_p, V_p)$, Eq. (373), where an additional term accounting for the time rate of change of $p(t, y_p, V_p)$ due to collisions is introduced,

$$ \frac{\partial p}{\partial t} + \frac{\partial}{\partial y_{p,i}} \bigl[ V_{p,i}\, p \bigr] - \frac{\partial}{\partial V_{p,i}} \left[ \frac{V_{p,i}}{\tau_p}\, p \right] = -\frac{\partial}{\partial V_{p,i}} \left[ \frac{1}{\tau_p}\, \langle U_{s,i} \mid y_p, V_p \rangle\, p \right] + \left( \frac{\partial p}{\partial t} \right)_c. \tag{565} $$

This Boltzmann-like equation is closed by making a simplified assumption on $\langle U_{s,i} \mid y_p, V_p \rangle$ and by assuming molecular chaos, that is, the two-point pdf (for two discrete particles) is the product of the one-point particle pdfs. Then unclosed mean field equations can be derived using the procedure outlined in Section 8.5.3. Closure at the macroscopic level can be performed
by supposing that the system is close to equilibrium (perturbation analysis), that the form of the pdf is known in advance (Grad's 13-moment approximation) and that simplified collision models can be used (binary collisions are characterized only by the restitution coefficient $e$ and collisions are nearly elastic). Expressions for the mean flux of momentum and energy, and for the mean dissipation rate, can then be derived analytically for the simplified collision integrals, i.e. closed mean field equations are written for the mean density, the mean velocity field and the granular temperature. The form of the equations and the technical aspects of the problem will not be discussed but, roughly speaking, it can be stated that mean field equations can be derived in the case of an (almost) dense gas close to equilibrium. Efforts are still being made in the field to include more physics (binary mixtures, accurate collision models, rotation of the particles, influence of the gas by working on $p(t, y_p, V_p, V_s)$, etc.). The spirit of the method has however not changed, and one is left with the classical drawbacks of closure performed at the macroscopic level, as explained in Section 6.

At the microscopic level, it is possible today, with modern computer technology, to perform calculations of granular flows and turbulent gas–solid flows. The microscopic simulations of granular flows, for example [138–140], are a powerful tool to study granular matter (for example inelastic collapse in the liquid-like form), since precise information can be extracted and more realistic physics can be put into the model (collision models, rotation of the particles). However, it is not yet reasonable to claim that these simulations are truly microscopic ones, since the collisional models are simplified [141]. In the case of gas–solid suspensions the difficulty is increased by the presence of the fluid.
Real DNS (where particles become moving boundaries) is not possible, and most of the time particular turbulent fields are generated and large eddy simulation is used together with a point-particle approximation [142] (the size of the LES filter is chosen so that the unresolved velocity fluctuations do not affect the motion of the particles). Even though these methods provide fruitful 'numerical experiments', they suffer from two major drawbacks: (i) the number of degrees of freedom is huge, (ii) the collision-tracking algorithm imposes stringent numerical constraints [138].

The leading idea of the present paper is that, when the number of degrees of freedom of a system is too large, one has to come up with a contracted description, i.e. to describe the system at a mesoscopic level. The treatment of collisions in granular matter and dispersed two-phase flows can fit in this approach. If one is interested in one-point information for the discrete particles, the exact trajectories of the discrete particles can be approximated by stochastic particles (in order to reproduce the statistical signature of particle–particle collisions), in the spirit of DSMC (direct simulation Monte Carlo) [143]. The trajectories of the stochastic process are no longer continuous (there are velocity discontinuities) and diffusion processes cannot be directly applied as a modelling tool. Instead, more general Markov processes must be used, for example the combination of a diffusion process and a jump process, see Section 2. A first proposal could be to model the statistical signature of the collisions with a generalized Poisson process. For example, for a given particle over a time interval $dt$, the velocity increment becomes

$$ dX_t = A(t, X, F(t, X))\,dt + B(t, X, G(t, X))\,dW + dN_t, \tag{566} $$
where

$$ dN_t = 0 \quad \text{with probability } 1 - \lambda(t, x, h(t, x))\,dt, $$

$$ dN_t = Y_t - X_t \quad \text{with probability } \lambda(t, x, h(t, x))\,dt. $$

Here, we obtain a generalized McKean stochastic equation. $Y_t$ is a random variable which is specified by a conditional pdf $g(y \mid t, x)$. Since $\lambda(t, x, h(t, x))$ and $g(y \mid t, x)$ are independent, the pdf associated with $N_t$ is

$$ W(y \mid t, x) = \lambda(t, x, h(t, x))\; g(y \mid t, x), \tag{567} $$

that is, roughly speaking, the probability to jump from $X_t = x$ to $Y_t = y$ is the product of the probability that a jump occurs and the probability to have the specified amplitude. It is obvious that

$$ \int g(y \mid t, x)\, dy = 1 \tag{568} $$

(since $g(y \mid t, x)$ is a pdf), and inserting Eq. (567) into Eq. (34), one obtains for the transitional pdf $p(t, x \mid t_0, x_0)$

$$ \frac{\partial p}{\partial t} = -\frac{\partial}{\partial x} \bigl( A(t, x, F(t, x))\, p \bigr) + \frac{1}{2}\, \frac{\partial^2}{\partial x^2} \bigl( B^2(t, x, G(t, x))\, p \bigr) + \int \lambda(t, y, h(t, y))\, g(x \mid t, y)\, p(t, y \mid t_0, x_0)\, dy - \lambda(t, x, h(t, x))\, p(t, x \mid t_0, x_0). \tag{569} $$
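A time-stepping scheme for a process of this type can be sketched as follows. This is only an illustration under strong simplifying assumptions, not a collision model: the drift is a linear relaxation, the diffusion coefficient and the jump rate $\lambda$ are constants, and the post-jump value $Y_t$ is drawn from a Gaussian $g$ that does not depend on $x$. The dependence of the coefficients on mean fields ($F$, $G$, $h$) is ignored, and all names and values are hypothetical.

```python
import numpy as np

def simulate_jump_diffusion(x0, t_end, n_steps, n_paths, drift, diffusion,
                            jump_rate, sample_post_jump, rng):
    """Euler scheme for dX = A dt + B dW + dN, where dN = Y - X with probability lambda dt."""
    dt = t_end / n_steps
    x = np.full(n_paths, x0, dtype=float)
    n_jumps = np.zeros(n_paths, dtype=int)
    t = 0.0
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        x_diffused = x + drift(t, x) * dt + diffusion(t, x) * dw
        # A jump occurs during dt with probability lambda(t, x) dt
        jumps = rng.random(n_paths) < jump_rate(t, x) * dt
        # Candidate post-jump values Y drawn from g(y | t, x)
        y = sample_post_jump(t, x, rng)
        # On a jump the state is reset to Y; otherwise the diffusion step is kept
        x = np.where(jumps, y, x_diffused)
        n_jumps += jumps
        t += dt
    return x, n_jumps

# Assumed coefficients (illustrative only)
TAU_RELAX = 0.5   # relaxation time in the drift -x / TAU_RELAX
B_CONST = 0.3     # constant diffusion coefficient
JUMP_RATE = 2.0   # constant jump rate lambda (mean number of jumps per unit time)
JUMP_STD = 0.5    # standard deviation of the assumed Gaussian g

rng = np.random.default_rng(1)
x_end, n_jumps = simulate_jump_diffusion(
    x0=1.0, t_end=1.0, n_steps=200, n_paths=5000,
    drift=lambda t, x: -x / TAU_RELAX,
    diffusion=lambda t, x: B_CONST,
    jump_rate=lambda t, x: JUMP_RATE,
    sample_post_jump=lambda t, x, rng: rng.normal(0.0, JUMP_STD, size=x.shape),
    rng=rng,
)
mean_jumps = n_jumps.mean()   # to be compared with JUMP_RATE * t_end
print(mean_jumps)
```

In a collision model, `jump_rate` and `sample_post_jump` would have to be built from physical arguments, which is precisely the open modelling problem.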
The modelling problem is now to find expressions for $\lambda(t, x, h(t, x))$ and $g(y \mid t, x)$ based on physical arguments. Doing so, i.e. applying a mesoscopic description, could significantly simplify the numerical treatment of collisions.

References

[1] K.R. Sreenivasan, Fluid turbulence, Rev. Mod. Phys. 71 (2) (1999) 383–395.
[2] P.G. de Gennes, Granular matter: a tentative view, Rev. Mod. Phys. (Centenary) 71 (2) (1999) S374–S382. *
[3] M. Lesieur, Turbulence in Fluids, 3rd Edition, Kluwer, Dordrecht, 1997.
[4] W.D. McComb, The Physics of Fluid Turbulence, Clarendon Press, Oxford, 1990.
[5] U. Frisch, Turbulence, The Legacy of A.N. Kolmogorov, Cambridge University Press, Cambridge, 1995. *
[6] S.B. Pope, Pdf methods for turbulent reactive flows, Prog. Energy Combust. Sci. 11 (1985) 119–192. ***
[7] S.B. Pope, Lagrangian pdf methods for turbulent reactive flows, Ann. Rev. Fluid Mech. 26 (1994) 23–63. **
[8] C. Dopazo, Recent developments in PDF methods, in: P.A. Libby, F.A. Williams (Eds.), Turbulent Reactive Flows, Academic, New York, 1994.
[9] R.O. Fox, Computational methods for turbulent reacting flows in the chemical process industry, Rev. Inst. Fr. Pét. 51 (2) (1996).
[10] S.B. Pope, On the relationship between stochastic Lagrangian models of turbulence and second-order closures, Phys. Fluids 6 (2) (1994) 973–985.
[11] J.-P. Minier, J. Pozorski, Derivation of a pdf model for turbulent flows based on principles from statistical physics, Phys. Fluids 9 (6) (1997) 1748–1753. *
[12] D.E. Stock, Particle dispersion in flowing gases, J. Fluids Eng. 118 (1996) 4–17.
[13] L. Arnold, Stochastic Differential Equations: Theory and Applications, Wiley, New York, 1974. ***
[14] B. Øksendal, Stochastic Differential Equations, An Introduction with Applications, Springer, Berlin, 1995.
[15] C.W. Gardiner, Handbook of Stochastic Methods for Physics, Chemistry and the Natural Sciences, 2nd Edition, Springer, Berlin, 1990. ***
[16] H.C. Öttinger, Stochastic Processes in Polymeric Fluids, Tools and Examples for Developing Simulation Algorithms, Springer, Berlin, 1996.
[17] B. Lapeyre, E. Pardoux, R. Sentis, Méthodes de Monte-Carlo pour les équations de transport et de diffusion, Coll. Mathématiques et Applications, Springer, Berlin, 1998.
[18] P.E. Kloeden, E. Platen, Numerical Solution of Stochastic Differential Equations, Springer, Berlin, 1992.
[19] D. Talay, Simulation of stochastic differential equation, in: P. Kree, W. Wedig (Eds.), Probabilistic Methods in Applied Physics, Springer, Berlin, 1995. ***
[20] S. Chapman, T.G. Cowling, The Mathematical Theory of Non-Uniform Gases, Cambridge Mathematical Library, Cambridge, 1970.
[21] R.L. Liboff, Kinetic Theory: Classical, Quantum, and Relativistic Descriptions, 2nd Edition, Prentice-Hall Advanced Reference Series, London, 1998.
[22] R. Balescu, Statistical Dynamics: Matter Out of Equilibrium, Imperial College Press, London, 1997. ***
[23] H. Haken, Synergetics: an overview, Rep. Prog. Phys. 52 (1989) 515–533. ***
[24] M. Bushev, Synergetics, Chaos, Order, Self-Organization, World Scientific, Singapore, 1994.
[25] A.S. Monin, A.M. Yaglom, Statistical Fluid Mechanics, MIT Press, Cambridge, MA, 1975. ***
[26] H. Tennekes, J.L. Lumley, A First Course in Turbulence, The MIT Press, Cambridge, MA, 1990.
[27] S.B. Pope, Turbulent Flows, Cambridge University Press, Cambridge, 2000.
[28] K.R. Sreenivasan, R.A. Antonia, The phenomenology of small-scale turbulence, Annu. Rev. Fluid Mech. 29 (1997) 435–472.
[29] P.K. Yeung, S.B. Pope, Lagrangian statistics from direct numerical simulations of isotropic turbulence, J. Fluid Mech. 207 (1989) 531–586. **
[30] K.D. Squires, J.K. Eaton, Lagrangian and Eulerian statistics obtained from direct numerical simulations of homogeneous turbulence, Phys. Fluids A 3 (1991) 130–143.
[31] E. Deutsch, Dispersion de particules dans une turbulence stationnaire homogène isotrope calculée par simulation directe des grandes échelles, Ph.D. Thesis, Université Paris VI, 1992.
[32] G.K. Batchelor, The Theory of Homogeneous Turbulence, Cambridge University Press, Cambridge, 1953.
[33] C.W. Van Atta, R.A. Antonia, Reynolds number dependence of skewness and flatness factors of turbulent velocity derivatives, Phys. Fluids 23 (1980) 252–257.
[34] J. Jimenez, A.A. Wray, P.G. Saffman, R.S. Rogallo, The structure of intense vorticity in homogeneous isotropic turbulent flows, J. Fluid Mech. 255 (1993) 65–90.
[35] F. Anselmet, Y. Gagne, E.J. Hopfinger, R.A. Antonia, High-order velocity structure functions in turbulent shear flows, J. Fluid Mech. 140 (1984) 63–89.
[36] K.R. Sreenivasan, Fractals and multifractals in fluid turbulence, Annu. Rev. Fluid Mech. 23 (1991) 539–600.
[37] L.P. Wang, S. Chen, J.G. Brasseur, J.C. Wyngaard, Examination of hypotheses in the Kolmogorov refined turbulence theory through high-resolution simulations. Part 1. Velocity field, J. Fluid Mech. 309 (1996) 113–156.
[38] W. George, P.D. Beuther, R.E. Arndt, Pressure spectra in turbulent free shear flows, J. Fluid Mech. 148 (1984) 151–191.
[39] T. Gotoh, R.S. Rogallo, Statistics of pressure and pressure gradient in homogeneous isotropic turbulence, Proceedings of the Summer Program, Stanford, USA, Center for Turbulence Research, 1994.
[40] M. Nelkin, Universality and scaling in fully developed turbulence, Adv. Phys. 43 (1994) 143–181.
[41] R. Benzi, S. Ciliberto, R. Tripiccione, C. Baudet, F. Massaioli, S. Succi, Extended self similarity in turbulent flows, Phys. Rev. E 48 (1993) R29–R32.
[42] Z.S. She, E. Leveque, Universal scaling laws in fully developed turbulence, Phys. Rev. Lett. 72 (1994) 336–339.
[43] C.W. Van Atta, W.Y. Chen, Structure functions of turbulence in the atmospheric boundary layer over the ocean, J. Fluid Mech. 44 (1970) 145–159.
J.-P. Minier, E. Peirano / Physics Reports 352 (2001) 1–214
211
[44] B. Castaing, Y. Gagne, E.J. HopAnger, Velocity probability density functions of high-Reynolds number turbulence, Physica D 46 (1990) 177–200. [45] Y. Gagne, M. Marchand, B. Castaing, Conditional velocity pdf in 3-d turbulence, J. Phys. II France 4 (1994) 1–8. ∗∗ [46] A. Vincent, M. Meneguzzi, The spatial structure and statistical properties of homogeneous turbulence, J. Fluid Mech. 225 (1991) 1–20. [47] Z.S. She, E. Jackson, S.A. Orszag, Scale-dependent intermittency and coherence in turbulence, J. Sci. Comput. 1 (1988) 407–434. [48] A. Pumir, A numerical study of pressure uctuations in three-dimensional incompressible, homogeneous, isotropic turbulence, Phys. Fluids 6 (1994) 2071–2083. [49] G.L. Brown, A. Roshko, On density e8ects and large structures in turbulent mixing layers, J. Fluids Mech. 64 (1974) 775–816. [50] E.D. Siggia, Numerical study of small scale intermittency in three dimensional turbulence, J. Fluid Mech. 107 (1981) 375–406. [51] J. Jimenez, A.A. Wray, On the characteristics of vortex Alaments in isotropic turbulence, J. Fluid Mech. 373 (1998) 225–285. [52] O. Cadot, S. Douady, Y. Couder, Characterisation of the low-pressure Alaments in a three-dimensional turbulent shear ows, Phys. Fluids 7 (1995) 630–646. [53] O. Cadot, D. Bonn, Y. Couder, Turbulent drag reduction in a closed system: boundary layer versus bulk e8ects, Phys. Fluids 10 (1995) 426–436. [54] J.-P. Minier, Lagrangian stochastic modelling of turbulent ows, Lecture Notes of the Von-Karman Institute, Session on Advances in Turbulence Modelling, 23–27 March, 1998. [55] M.H. Kalos, P.A. Whitlock, Monte Carlo Methods, Vol. I, Wiley, New York, 1986. [56] J. Xu, S.B. Pope, Assessment of numerical accuracy of pdf=monte carlo methods for turbulent reacting ows, J. Comput. Phys. 152 (1999) 192. [57] S.B. Pope, Y.L. Chen, The velocity-dissipation probability density function model for turbulent ows, Phys. Fluids A 2 (1990) 1437. [58] S.B. 
Pope, Application of the velocity-dissipation probability density function model to inhomogeneous turbulent ows, Phys. Fluids A 3 (1991) 1947. [59] J.-P. Minier, J. Pozorski, Analysis of a pdf model in a mixing layer case, Proceedings of the 10th Symposium on Turbulent Shear Flows, University Park, PA, 1995. [60] P.R. Van Slooten, Jayesh, S.B. Pope, Advances in pdf modeling for inhomogeneous turbulent ows, Phys. Fluids 10 (1998) 246. [61] H.A. Wouters, T.W.J. Peeters, D. Roekaerts, On the existence of a stochastic lagrangian model representation for second-moment closures, Phys. Fluids A 8 (1996) 1702. [62] P.J. Colucci, F.A. Jaberi, P. Givi, S.B. Pope, The Altered density function for large-eddy simulation of turbulent reactive ows, Phys. Fluids 10 (1998) 499. [63] F.A. Jaberi, P.J. Colucci, S. Givi, S.B. Pope, Filtered mass density function for large-eddy simulation of turbulent reactive ows, J. Fluid Mech. 401 (1999) 85–121. ∗ [64] D.C. Haworth, S.H. El Tahry, Probability density function approach for multimensional turbulent ows calculations with application to an in-cylinder ows in reciprocating engines, AIAA 29 (2) (1991) 208–218. [65] M. Muradoglu, P. Jenny, S.B. Pope, D.A. Caughey, A consistent hybrid Anite volume=particle method for the pdf equations of turbulent reactive ows, J. Comput. Phys. 154 (1999) 342–371. [66] J.-P. Minier, J. Pozorski, Wall boundary conditions in the pdf method and application to a turbulent channel ow, Phys. Fluids 11 (1999) 2632–2644. [67] R.P. Patel, J. AIAA 11 (67) (1973). [68] F.H. Champagne, Y.H. Pao, I.J. Wygnanski, On the two-dimensional mixing region, J. Fluid Mech. 74 (1976) 209–250. [69] I.J. Wygnanski, H.E. Fiedler, The two-dimensional mixing layer, J. Fluid Mech. 41 (1970) 327–361. [70] J. Pozorski, J.-P. Minier, Full velocity–scalar pdf approach for wall-bounded ows and computation of thermal boundary layers, Proceedings of the 8th European Turbulence Conference, Barcelone, June 27–30, 2000.
212
J.-P. Minier, E. Peirano / Physics Reports 352 (2001) 1–214
[71] C.M. Tchen, Mean value and correlation problems connected with the motion of small particles suspended in a turbulent uid, Ph.D. Thesis, Delft University of Technology, 1947. [72] M.R. Maxey, J.J. Riley, Equation of motion for a small rigid sphere in a nonuniform ow, Phys. Fluids 26 (4) (1983) 883–889. [73] R. Gatignol, The Fax\en formulae for a rigid particle in an unsteady non-uniform Stokes ow, J. M\ec. Th\eor. Appl. 1 (2) (1983) 143–160. ∗ [74] J. Magnaudet, M. Rivero, J. Favre, Accelerated ows past a rigid sphere or a spherical bubble, Part 1, steady straining ow, J. Fluid Mech. 284 (1995) 97–135. [75] R. Clift, J.R. Grace, M.E. Weber, Bubbles, Drops and Particles, Academic Press, New York, 1978. [76] D.L. Koch, Kinetic theory for a monodispersed gas-solid suspension, Phys. Fluids A 2 (10) (1990) 1711–1723. [77] M. Boivin, O. Simonin, K.D. Squires, Direct numerical simulation of turbulence modulation by particles in isotropic turbulence, J. Fluid Mech. 375 (1998) 235–263. [78] E. Peirano, B. Leckner, Fundamentals of turbulent gas-solid ows applied to circulating uidized bed combustion, Prog. Energy Combust. Sci. 24 (1998) 259–296. [79] O. Simonin, Continuum modelling of dispersed two-phase ows, Combustion and Turbulence in Two-Phase Flows, 1995 –1996, Lecture Series Programme, von K\arm\an Institute, Belgium, 1996. [80] J. Pozorski, J.-P. Minier, Probability density function modelling of dispersed two-phase turbulent ows, Phys. Rev. E 59 (1) (1998) 855–863. [81] O. Simonin, E. Deutsch, J.-P. Minier, Eulerian prediction of the uid=particle correlated motion in turbulent two-phase ows, Appl. Sci. Res. 51 (1993) 275–283. [82] M.W. Reeks, On the continuum equations for dispersed particles in nonuniform ows, Phys. Fluids A 4 (6) (1992) 1290–1303. [83] M.W. Reeks, On the constitutive relations for dispersed particles in nonuniform ows, I dispersion in a simple shear ow, Phys. Fluids A 5 (3) (1993) 750–761. [84] J. Pozorski, J.-P. 
Minier, On the lagrangian turbulent dispersion models based on the langevin equation, Int. J. Multiphase Flow 24 (1998) 913–945. [85] J.-P. Minier, Closure proposals for the langevin equation model in Lagrangian two-phase ow modelling, Proceedings of the third ASME=JSME Conference, San Francisco, ASME FED, July 28–23 1999, pp. FEDSM99-7885. ∗ [86] J.M. Mc Innes, F.V. Bracco, Stochastic particle dispersion modeling and the tracer-particle limit, Phys. Fluids A 4 (1992) 2809. ∗ [87] S.B. Pope, Consistency conditions for random-walk models of turbulent dispersion, Phys. Fluids 30 (8) (1987) 2374–2378. ∗ [88] J.O. Hinze, Turbulence, 2nd Edition, McGraw Hill, New-York, 1975. [89] T.R. Auton, J.C.R. Hunt, M. Prud’homme, The force exerted on a body in inviscid unsteady non-uniform rotational ow, J. Fluid Mech. 197 (1988) 241–257. [90] Hockney, Eastwood, Computer Simulations Using Particles, Institute of Physics Publishing, Bristol, Philadelphia, 1988. [91] J. Pozorski, J.-P. Minier, Computation and projection of statistical averages in monte carlo particle-mesh methods, J. Comput. Phys. 2000, submitted. [92] Y. Sato, K. Hishida, M. Maeda, E8ect of dispersed phase on modiAcation of turbulent ow in a wall jet, J. Fluids Eng. 118 (1996) 307–314. [93] T. Ishima, J. Boree, P. Fanouillere, I. Flour, Presentation of a data base: conAned blu8 body ow laden with solid particle, Proceedings of the Nineth Workshop on Two-Phase Flow Predictions, Halle-Wittenburg, Germany, Martin-Luther-Universitat, April 13–16, 1999. [94] V. Mathiesen, T. Solberg, B.H. Hjertager, An experimental and computational study of multiphase ow behavior in a circulating uidized bed, Int. J. Multiphase Flow 26 (3) (2000) 387–419. [95] C. Gourdel, O. Simonin, E. Brunier, Modelling and simulation of gas-solid turbulent ows with a binary mixture of particles, Proceedings of the Third International Conference on Multiphase Flow, ICMF 98. Lyon, France, June 8–12, 1998. [96] S.L. 
Soo, Multiphase Fluid Dynamics, Science Press, Gower Technical, New York, 1990.
J.-P. Minier, E. Peirano / Physics Reports 352 (2001) 1–214
213
[97] T.D. Dreeben, S.B. Pope, Probability density function=monte carlo simulation of near-wall turbulent ows, J. Fluid Mech. 357 (1998) 141. [98] B.J. Delarue, S.B. Pope, Application of pdf methods to compressible turbulent ows, Phys. Fluids 9 (9) (1997) 2704. [99] P. Moin, K. Mahesh, Direct numerical simulation: a tool in turbulence research, Ann. Rev. Fluid Mech. 30 (1998) 539–578. [100] V. L’vov, I. Procaccia, Phys. World 9 (1995) 35. [101] V. L’vov, I. Procaccia, Fusion rules in turbulent systems with ow equilibrium, Phys. Rev. Lett. 76 (1996) 2898–2901. [102] P.K. Yeung, One- and two-particle lagrangian acceleration correlations in numerically simulated homogeneous turbulence, Phys. Fluids 9 (10) (1997) 2981–2990. [103] P. Vedula, P.K. Yeung, Similarity scaling of acceleration and pressure statistics in numerical simulations of isotropic turbulence, Phys. Fluids 11 (5) (1999) 1208–1220. [104] G.A. Voth, K. Satyanarayan, E. Bodenschatz, Lagrangian acceleration measurements at large Reynolds numbers, Phys. Fluids 10 (9) (1998) 2268–2280. [105] M.S. Sawford, B.L. Borgas, Stochastic equations with multifractal random increments for modeling turbulent dispersion, Phys. Fluids 6 (2) (1994) 618–632. [106] Z.S. She, E. Jackson, S.A. Orszag, Structure and dynamics of homogeneous turbulence: models and simulations, Proc. Roy. Soc. London A. 434 (1991) 101–124. ∗ [107] R.J. Adrian, P. Moin, Stochastic estimation of organized turbulent structure: homogeneous shear ow, J. Fluid Mech. 190 (1988) 531–559. [108] Y. Nagano, Modelling heat transfer in near-wall ows, Closure strategy for modelling turbulent and transitional ows, Isaac Newton Institute for Mathematical Sciences, Cambridge, April 6 –17, 1999. [109] D.J. Thomson, A stochastic model for the motion of particle pairs in isotropic high-Reynolds number turbulence, and its application to the problem of concentration variance, J. Fluid Mech. 210 (1990) 113–153. [110] B.L. Borgas, M.S. 
Sawford, A family of stochastic models for two-particle dispersion in isotropic homogeneous stationary turbulence, J. Fluids Mech. 279 (1994) 69–99. [111] O.A. Kurbanmuradov, Stochastic lagrangian models for two-particle relative dispersion in high-Reynolds number turbulence, Monte Carlo Methods Appl. 3 (1) (1997) 37–52. [112] K.K. Saberfeld, O.A. Kurbanmuradov, Stochastic lagrangian models for two-particle motion in turbulent ows, Monte Carlo Methods Appl. 3 (1) (1997) 53–72. [113] K.K. Saberfeld, O.A. Kurbanmuradov, Stochastic lagrangian models for two-particle motion in turbulent ows. Numerical results, Monte Carlo Methods Appl. 3 (3) (1997) 199–223. [114] B.M.O. Heppe, Generalized langevin equation for relative turbulent dispersion, J. Fluid Mech. 357 (1998) 167–198. [115] J.J. Monaghan, Smoothed particle hydrodynamics, Ann. Rev. Astron. Astrophys. 30 (1992) 543–574. ∗∗ [116] G.H. Cottet, P. Koumoutsakos, Vortex Methods, Theory and Practice, Cambridge University Press, Cambridge, 2000. [117] H. Risken, The Fokker–Planck Equation, Methods of Solution and Applications, 2nd Edition, Springer, Berlin, 1989. [118] G. Roepstorf, Path Integral Approach to Quantum Physics, Springer, Berlin, 1994. [119] L.S. Schulman, Techniques and Applications of Path Integration, Wiley, New York, 1981. [120] F.W. Wiegel, Introduction to Path-Integral Methods in Physics and Polymer Science, World ScientiAc, Singapore, 1986. [121] M. Namiki, Stochastic Quantization, Springer, Berlin, 1992. [122] H. Goldstein, Classical Mechanics, 2nd Edition, Addison-Wesley Publishing Co., Reading, MA, 1980. [123] L. Onsager, S. Machlup, Phys. Rev. 91 (1953) 1505. [124] L. Onsager, S. Machlup, Phys. Rev. 91 (1953) 1512. [125] G.L. Eyink, Linear stochastic models for nonlinear dynamical systems, Phys. Rev. E 58 (6) (1998) 6975–6991. [126] M. Orme, Experiments on droplet collisions, bounce, coalescence and disruption, Prog. Energy Combust. Sci. 23 (1997) 65–79.
214
J.-P. Minier, E. Peirano / Physics Reports 352 (2001) 1–214
[127] S.P. Lin, R.D. Reitz, Drop and spray formation from a liquid jet, Ann. Rev. Fluid Mech. 30 (1998) 85–105. [128] S. Zeng, E.T. Kerns, R.H. Davis, The nature of particle contacts in sedimentation, Phys. Fluids 8 (1996) 1389. [129] P.G. Safman, On the settling speed of free and Axed suspensions, Stud. Appl. Maths. 52 (1973) 115–127. [130] E.R. Bagnold, Physics of Blown Sand and Sand Dunes, Chapman & Hall, London, 1941. [131] P. Bak, How Nature Works: the Science of Self-Organized Criticality, Oxford University Press, Oxford, 1997. [132] Jensen, Self-organized Criticality, Cambridge Lecture Notes in Physics, Vol. 10, 1998. [133] H.M. Jaeger, S.R. Nagel, R.P. Behringer, Granular solids, liquids, and gases, Rev. Mod. Phys. 68 (4) (1996) 1259–1273. [134] L.P. Kadano8, Built upon sand: theoretical ideas inspired by granular ows, Rev. Mod. Phys. 71 (1) (1999) 435–444. [135] C.S. Campbell, Rapid granular ows, Ann. Rev. Fluid Mech. 22 (1990) 57–92. [136] J.T. Jenkins, S.B. Savage, A theory for rapid ow of identical, smooth, nearly elastic, spherical particles, J. Fluid Mech. 130 (1983) 187–202. [137] J.T. Jenkins, M.W. Richman, Grad’s 13 moment system for a dense gas of inelastic spheres, Arch. Rational Mech. Anal. 87 (1985) 355–377. [138] M.A. Hopkins, M.Y. Louge, Inelastic microstructure in rapid granular ows of smooth disks, Phys. Fluids A 3 (1) (1991) 47–57. [139] S. MacNamara, W.R. Young, Inelastic collapse in two dimensions, Phys. Rev. E 50 (1) (1994) R28–R31. [140] N. SchRorghofer, T. Zhou, Inelastic collapse of rotating spheres, Phys. Rev. E 54 (5) (1996) 5511–5515. [141] S.F. Foerster, M.Y. Louge, H. Chang, K. Allia, Measurements of the collision properties of small spheres, Phys. Fluids 6 (3) (1994) 1108–1115. [142] J. Lavi\eville, E. Deutsch, O. Simonin, Large eddy simulation of interactions between colliding particles and a homogeneous isotropic turbulent Aeld. in: Gas–Solid Flows, ASME FED, Vol. 228, ASME, New York, 1995, pp. 347–357. [143] E.S. Oran, C.K. 
Oh, B.Z. Cybyk, Direct Simulation Monte Carlo: recent advances and applications, Ann. Rev. Fluid Mech. 30 (1998) 403–441.
Renormalization group theory in the new millennium. III
edited by Denjoe O'Connor, C.R. Stephens
editor: I. Procaccia

Contents
D. O'Connor, C.R. Stephens, Renormalization group theory in the new millennium. III
D.V. Shirkov, V.F. Kovalev, The Bogoliubov renormalization group and solution symmetry in mathematical physics
G. Gallavotti, Renormalization group in statistical mechanics and mechanics: gauge symmetries and vanishing beta functions
G. Gentile, V. Mastropietro, Renormalization group for one-dimensional fermions. A review on mathematical results
G. Jona-Lasinio, Renormalization group and probability theory
E.A. Calzetta, B.L. Hu, F.D. Mazzitelli, Coarse-grained effective action and renormalization group theory in semiclassical gravity and cosmology

PII S0370-1573(01)00069-2
RENORMALIZATION GROUP THEORY IN THE NEW MILLENNIUM. III
edited by Denjoe O'CONNOR, C.R. STEPHENS
Électricité de France, Div. R&D, MFTT, 6 Quai Watier, 78400 Chatou, France
Energy Conversion Department, Chalmers University of Technology, S-41296 Göteborg, Sweden
AMSTERDAM – LONDON – NEW YORK – OXFORD – PARIS – SHANNON – TOKYO
Physics Reports 352 (2001) 215–218
Editorial
Renormalization group theory in the new millennium. III

1. Introduction

This volume constitutes the third in a series of reviews based loosely on plenary talks given at the conference “RG2000: Renormalization Group Theory at the Turn of the Millennium” held in Taxco, Mexico in January 1999. The chief purpose of the conference was to bring together a group of people who had made significant contributions to RG Theory and its applications, especially those who had contributed to the development of the subject in quantum field theory/particle physics and statistical mechanics/critical phenomena, i.e. the high- and low-energy regimes of RG theory.

In the last half-century, renormalization group (RG) theory has become a central structure in theoretical physics and beyond, though it is not always clear that different authors mean the same thing when they speak about it. The aim of these reviews is to try and convey some of the power and scope of RG theory and its applications and in the process hopefully convey the underlying unity of the set of ideas involved.

Although RG theory has had a major impact it has tended to be viewed as a tool rather than as a subject in and of itself. Being presented principally in terms of its applications has therefore meant a lack of contact between practitioners from different fields. An important exception to this tendency is the series of RG conferences organized by Dimitri Shirkov and others of the Joint Institute for Nuclear Research, Dubna theoretical physics community. The Taxco conference was in the same spirit. The advent in recent years of conferences on the “exact” RG has also provided an opportunity for practitioners to come together. The only criticism one might have of this latter series is the large emphasis on field theory. This small criticism notwithstanding we hope that there will be continued opportunity to bring together RG practitioners from different fields.
In obtaining contributions for these reviews we did not restrict ourselves to speakers from the conference. A major concern was to avoid producing yet another typical conference proceedings. Hence, the remit given to the contributors was to write as extensively and comprehensively as they saw fit. Naturally, with such a liberal regime the length of the articles varies significantly. Our goal was to try and review the state of the art of RG theory given that it could now be considered mature enough to warrant a large scale overview. We believe that we were to some extent defeated in our purpose by the very size and range of applicability of the RG. Although we have managed to cover a large gamut we know there are glaring omissions. Nevertheless, we feel it is of great benefit to have reviews by leading practitioners all brought together in the same place even if the range of coverage is suboptimal. A possible remedy to this would be for specialists in areas not adequately covered here to submit articles which would naturally fall into the present series.

We particularly wanted to emphasize the idea that although mature enough to warrant a major review, RG theory is young enough, and more significantly, deep enough, that such a review would still only barely scratch its surface. We hope that young researchers will get the feeling that it is still very much an emerging field with a large number of open problems associated with the understanding of RG theory itself and an even larger number associated with its applications.

0370-1573/01/$ - see front matter © 2001 Elsevier Science B.V. All rights reserved. PII: S0370-1573(01)00038-2

2. Introduction to the third volume

This third volume in the series is principally devoted to theory and applications of the “field theoretic RG” treating both theoretical developments and applications. The two principal strands of RG theory are based on two, apparently, quite different concepts: “coarse graining” and “reparametrization”. In the context of field theory, of either statistical or quantum systems, they coexist within the same formalism, coarse grainings naturally introducing coordinate changes on the space of Hamiltonians and reparametrizations being naturally associated with scale changes. However, the last few years have witnessed developments wherein the basic differences between the two approaches have become more apparent. An example of this is how the reparametrization approach has been abstracted away from its field theoretic origins to treat problems such as the solution of non-linear partial or ordinary differential equations. The review of Shirkov and Kovalev gives an overview of the evolution of these developments.
Stated abstractly, the RG in the reparametrization approach is a continuous symmetry of a solution of some problem with respect to transformations of the parameters on which the solution depends. A good example would, for instance, be boundary conditions. Importantly, it is also an exact symmetry of the problem. If the infinitesimal generators of this group can be found perturbatively then the solution of the corresponding RG equation corresponds to an “exponentiation” of this perturbation theory. This fact is at the heart of the applications that involve a RG improvement of the perturbative solutions of differential equations. The authors consider several illustrative examples such as solution of the modified Burgers equation and the self-focusing of laser beams. In the latter case one very interesting feature of the reparametrization method is its generalization to a more than one parameter family of reparametrizations, i.e. more than one RG, which opens the door to solving problems where the singularities one is trying to access are more than one-dimensional. Without doubt applications of this more abstract view of the RG method, outside its origin in the context of Quantum Field Theory, will be a continuing important area of application of RG methods.

The review of Gallavotti discusses two very different yet similar problems: the theory of the ground state of a system of fermions in one dimension and the theory of KAM tori in classical mechanics. He thus demonstrates ‘the conceptual unity that renormalization group methods’ have brought to theoretical physics. In many ways, the most surprising use of RG techniques is the application to the KAM problem. This is a difficult problem of classical mechanics where surprisingly a field theoretic analysis and RG study of an associated Feynman diagrammatic series (which involves only trees) proves to be a powerful and insightful method of attacking the problem.

In the third review, Gentile and Mastropietro give a very comprehensive treatment of one-dimensional Fermionic systems. They review what is known at a rigorous level about the correlation functions of models of interacting one-dimensional Fermi systems emphasizing primarily results obtained in the 1990s. It turns out that the correlation functions of many models can be written as convergent series, in the weak coupling regime, and such expressions provide all the information of interest.

In “Renormalization Group and Probability Theory” Jona-Lasinio discusses the probabilistic interpretation of the RG and argues that the critical point can be characterized by deviations from the central limit theorem. He further argues that to any type of RG transformation one can associate a multiplicative structure and that ‘the characterizing feature of the Green’s function RG is that it is defined directly in terms of this multiplicative structure’ and that this multiplicative structure emerges from the properties of conditional probabilities. It is undoubtedly true that further insights into the connection between the different RG approaches are to be gained by this type of probabilistic analysis. This seems to us to be an aspect of RG theory that has not yet yielded all of its secrets.

In the fifth review Calzetta, Hu and Mazzitelli discuss various applications of the RG in situations involving a non-trivial gravitational field, such as in the early universe.
In particular, they consider both at a qualitative and quantitative level the application of RG theory to non-equilibrium quantum processes and phase transitions in the early universe. For example, in the simplest case (De Sitter space, where the dynamics can be rewritten as a pure scaling transformation) one can easily calculate the running of the couplings in, for example, a $\lambda\phi^4$ theory. The resultant renormalized couplings “run” with time, leading to an exponentially decreasing effective coupling constant. The full consequences of this type of behaviour for phase transitions in an inflationary universe do not seem to have been worked out, but certainly deserve to be. Similar calculations are in principle possible for more general spacetimes. An associated unsolved problem is how to implement a cutoff in these dynamical situations and how to avoid having to confront post-Planckian frequencies.

The authors further go on to consider the Closed Time Path Coarse Grained Effective Action as the key functional for dynamic processes (remember, the effective potential has little meaning in a dynamic situation). This involves a division of the wavemodes into fast and slow type and a subsequent coarse graining (over a spatial volume rather than the space-time of the Euclidean average effective action) of the fast modes which then act as noise for the slow modes. A key point here is the ability to study the origin and properties of the noise from first principles rather than put it in by hand as is done in, for example, RG approaches to the Langevin equation. The authors end their review by considering the interesting possibility that RG equations themselves, when noise and dissipation are inherent in a system, should be stochastic equations. It remains to be seen under what circumstances this is indeed the case and whether it has important quantitative consequences.
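The remark made earlier in this section, that solving the RG equation amounts to an “exponentiation” of perturbation theory, can be made concrete with a small numerical sketch. What follows is only a toy model and is not taken from any of the reviews: it assumes a single coupling g whose first-order perturbative dependence on a logarithmic scale L is g - b g^2 L. The function names and the parameter values are hypothetical and chosen for illustration.

```python
# Toy illustration (assumed model, not taken from the reviews): a coupling g
# whose first-order perturbative dependence on a log-scale L is g - b*g**2*L.


def perturbative(g, b, L):
    """Truncated perturbation theory; only reliable while b*g*L << 1."""
    return g - b * g**2 * L


def generator(g, b):
    """Infinitesimal RG generator read off from the perturbative result,
    i.e. d(g_bar)/dL evaluated at L = 0."""
    return -b * g**2


def rg_improved(g, b, L):
    """Closed-form solution of d(g_bar)/dL = generator(g_bar), g_bar(0) = g.
    Its expansion in g starts with the perturbative terms and sums the
    geometric series they generate."""
    return g / (1.0 + b * g * L)


def integrate_rg(g, b, L, steps=10_000):
    """Solve the same RG equation numerically with a fixed-step RK4 scheme."""
    h = L / steps
    x = g
    for _ in range(steps):
        k1 = generator(x, b)
        k2 = generator(x + 0.5 * h * k1, b)
        k3 = generator(x + 0.5 * h * k2, b)
        k4 = generator(x + h * k3, b)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x


if __name__ == "__main__":
    g, b = 0.5, 1.0  # hypothetical values
    for L in (0.1, 1.0, 3.0):
        print(L, perturbative(g, b, L), rg_improved(g, b, L), integrate_rg(g, b, L))
```

In this toy model the truncated series and the RG-improved result agree at small L, while at large L the truncated series changes sign and the improved expression stays positive. The improved expression also obeys the group composition rule: evolving by L1 and then by L2 gives the same result as evolving by L1 + L2, which the truncated series does not.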
Acknowledgements

We take this opportunity to thank our coorganizers of the Taxco conference, Alberto Robledo, Riccardo Capovilla and Juan Carlos D’Olivo, and the conference secretaries, Trinidad Ramirez and Alejandra Garcia. We thank the conference sponsors for their significant financial support: CONACyT, México; NSF, USA; ICTP, Italy; the Depto de Física, Cinvestav, México; Instituto de Ciencias Nucleares, UNAM, México; Instituto de Física, UNAM, México; Fenomec, UNAM, México; Cinvestav, México; DGAPA, UNAM, México and the Coordinación de Investigación Científica, UNAM, Mexico. It is fair to say that without this generous support a conference of such caliber could not have taken place.

We also take this opportunity to express our gratitude, for their advice and assistance, to the international advisory committee comprised of: A.P. Balachandran, Syracuse University, USA; K. Binder, Mainz, Germany; M.E. Fisher, University of Maryland, USA; N. Goldenfeld, University of Illinois, USA; B.L. Hu, University of Maryland, USA; D. Kazakov, Dubna, Russia; V.B. Priezzhev, Dubna, Russia; I. Procaccia, Weizmann Institute, Israel; M. Shifman, University of Minnesota, USA; D.V. Shirkov, Dubna, Russia; F. Wegner, Heidelberg, Germany; J. Zinn-Justin, Saclay, France. We express our special thanks to Michael Fisher for his cogent advice and organizational help, to Bei Lok Hu for helping organize the US component of the conference and to Itamar Procaccia for organizing an appropriate forum in which to present this overview.

Denjoe O’Connor
Dept. de Fisica, CINVESTAV, A. Postal 14-740, 07360 Mexico D.F., Mexico
E-mail address: [email protected] (D. O’Connor).

C.R. Stephens
Instituto de Ciencias Nucleares, A. Postal 70-543, 04510 Mexico, D.F., Mexico
E-mail address: [email protected] (C.R. Stephens).
Physics Reports 352 (2001) 219–249
The Bogoliubov renormalization group and solution symmetry in mathematical physics
Dmitrij V. Shirkov (a, ∗), Vladimir F. Kovalev (b)
(a) Bogoliubov Laboratory of Theoretical Physics, J.I.N.R., Dubna 141980, Russia
(b) Institute for Mathematical Modelling, Moscow 125047, Russia
Received March 2001; editor: I. Procaccia
Contents
1. The Bogoliubov renormalization group
1.1. Historical introduction
1.2. The Bogoliubov RG: symmetry of a solution
1.3. The renorm-group method
2. Evolution of the renormalization group concept
2.1. Renormalization group evolution
2.2. Difference between the Bogoliubov RG and KW-RG
2.3. Functional self-similarity
3. Solution symmetry in mathematical physics
3.1. Constructing RG-symmetries and their application
3.2. Examples of solution improvement
4. The RG in non-linear optics
4.1. Formulation of a problem
4.2. Plane geometry
4.3. Cylindrical geometry
5. Overview
Acknowledgements
References
Abstract

Evolution of the concept known in theoretical physics as the renormalization group (RG) is presented. The corresponding symmetry, which was first introduced in quantum field theory (QFT) in the mid-1950s, is a continuous symmetry of a solution with respect to transformations involving the parameters (e.g., those that determine boundary conditions) which specify some particular solution. After a short detour into Wilson’s discrete semi-group, we follow the expansion of the QFT RG and argue that the underlying transformation, being considered as a reparametrization, is closely related to the property of self-similarity. It can be treated as its generalization, Functional Self-similarity (FS). Next, we review the essential progress made in the last decade in the application of the FS concept to boundary value problems formulated in terms of differential equations. A summary of a regular approach, recently devised for discovering the RG = FS symmetries with the help of modern Lie group analysis, and some of its applications are given. As the principal physical illustration, we consider the solution of the problem of a self-focusing laser beam in a non-linear medium. © 2001 Elsevier Science B.V. All rights reserved.

PACS: 02.20.−a; 03.70.+k; 11.10.Hi
Keywords: Quantum field theory; Renormalization group; Renorm-group symmetry; Lie groups

∗ Corresponding author.
E-mail addresses: [email protected] (D.V. Shirkov), [email protected] (V.F. Kovalev).
0370-1573/01/$ - see front matter © 2001 Elsevier Science B.V. All rights reserved. PII: S0370-1573(01)00039-4
1. The Bogoliubov renormalization group

1.1. Historical introduction

1.1.1. Discovery of the renormalization group

In 1952–1953 Stückelberg and Petermann [1] discovered¹ a group of infinitesimal transformations related to the finite arbitrariness that arises in S-matrix elements upon elimination of ultraviolet (UV) divergences. These authors introduced the notion of a normalization group as a Lie transformation group generated by differential operators connected with renormalization of a coupling constant, e.

¹ For a more detailed exposition of the RG's early history see our recent reviews [2].

The following year, on the basis of Dyson's (infinite) renormalization transformations formulated in the regularized form, Gell-Mann and Low [3] derived functional equations (FEs) for the QED propagators in the UV limit. The appendix to this article contained the general solution (obtained by T.D. Lee) of the FE for the renormalized transverse photon propagator amplitude $d(Q^2/\lambda^2, e^2)$ (where $\lambda$ is a cutoff defined as a normalization momentum). This solution was used for a qualitative analysis of the small distance behavior of the quantum electromagnetic interaction. Two possibilities, namely, infinite and finite charge renormalizations, were pointed out. However, paper [3] paid no attention to the group character of the analysis and of the qualitative results obtained there. The authors missed a chance to establish a connection between their results and the QED perturbation theory and did not discuss the possibility that a ghost pole solution might exist.

The decisive step was made by Bogoliubov and the present author [4–6] in 1955.² Using the group properties of finite Dyson transformations for the coupling constant, fields and Green's functions, they derived functional group equations for the renormalized propagators and vertices in QED in the general case (i.e., with the electron mass taken into account). In "modern notation", the first equation,

\[ \bar\alpha(x, y; \alpha) = \bar\alpha\!\left(\frac{x}{t}, \frac{y}{t}; \bar\alpha(t, y; \alpha)\right), \qquad x = \frac{Q^2}{\mu^2}, \quad y = \frac{m^2}{\mu^2}, \tag{1} \]

is that for the invariant charge (now also widely known as the effective or running coupling) $\bar\alpha = \alpha\, d(x, y; \alpha = e^2)$, and the second,

\[ s(x, y; \alpha) = s(t, y; \alpha)\, s\!\left(\frac{x}{t}, \frac{y}{t}; \bar\alpha(t, y; \alpha)\right), \tag{2} \]

is that for the electron propagator amplitude.

² See also the two survey papers [7] published in English in 1956.

These equations have a remarkable property: the product $e^2 d \equiv \bar\alpha$ of the electron charge squared and the photon transverse propagator amplitude enters into both FEs. This product is invariant with respect to the finite Dyson transformation (as stated by Eq. (1)), which can now be written in the form

\[ R_t : \{ \mu^2 \to t\mu^2, \quad \alpha \to \bar\alpha(t, y; \alpha) \}. \tag{3} \]

We called this product the invariant charge and introduced the term renormalization group.
We emphasize that, unlike in Refs. [1,3], in the Bogoliubov formulation there is no reference to UV divergences and their subtraction or regularization. At the same time, technically, there is no simplification due to the massless nature of the UV asymptotics. Here, the homogeneity of the transfer momentum scale Q is explicitly violated by the mass m. Nevertheless, the symmetry with respect to the transformation $R_t$ (even though a bit more involved) underlying the RG is formulated as an exact property of the solution. This is what we mean by the term Bogoliubov renormalization group, or renormgroup for short.

The differential Lie equations for $\bar\alpha$ and for the electron propagator,

\[ \frac{\partial \bar\alpha(x, y; \alpha)}{\partial \ln x} = \beta\!\left(\frac{y}{x}, \bar\alpha(x, y; \alpha)\right), \qquad \frac{\partial s(x, y; \alpha)}{\partial \ln x} = \gamma\!\left(\frac{y}{x}, \bar\alpha(x, y; \alpha)\right) s(x, y; \alpha), \tag{4} \]

with

\[ \beta(y, \alpha) = \frac{\partial \bar\alpha(\xi, y; \alpha)}{\partial \xi}, \qquad \gamma(y, \alpha) = \frac{\partial s(\xi, y; \alpha)}{\partial \xi} \qquad \text{at } \xi = 1, \tag{5} \]

were first derived in [4] by differentiating the FEs (1) and (2) with respect to x at the point t = x. On the other hand, by differentiating the same equations with respect to t one obtains [8]

\[ X \bar\alpha(x, y; \alpha) = 0, \qquad X s(x, y; \alpha) = \gamma(y, \alpha)\, s(x, y; \alpha), \tag{6} \]

with

\[ X = x\partial_x + y\partial_y - \beta(y, \alpha)\,\partial_\alpha \qquad (\partial_x \equiv \partial/\partial x) \tag{7} \]

being the Lie infinitesimal operator.

1.1.2. Creation of the RG method

Another important achievement of [4] was the formulation of a simple algorithm for improving an approximate perturbative solution by combining it with Lie group equations; for details, see Section 1.3 below.

In the adjacent publication [5] this algorithm was effectively used to analyze the UV and infrared (IR) behavior in QED. In particular, the one-loop UV asymptotics of the photon propagator as well as the IR behavior of the electron propagator in the transverse gauge,

\[ \bar\alpha^{(1)}_{rg}(x, \alpha) = \frac{\alpha}{1 - (\alpha/3\pi) \ln x}, \qquad s(x, y; \alpha) \approx (p^2/m^2 - 1)^{-3\alpha/2\pi}, \tag{8} \]

were derived. At that time, these expressions, which sum the leading logs, were already known from papers by Landau and collaborators [9]. However, Landau's approach did not provide a means for constructing subsequent approximations. A simple technique for calculating higher approximations was found only within the new renormgroup method. In the same paper, starting with the next order perturbation expression $\bar\alpha^{(2)}_{pt}(x, \alpha)$ containing the $\alpha^3 \ln x$ term, we arrived at the second renormgroup approximation (see Section 1.3.2 below):

\[ \bar\alpha^{(2)}_{rg}(x, \alpha) = \frac{\alpha}{1 - (\alpha/3\pi) \ln x + (3\alpha/4\pi) \ln\bigl(1 - (\alpha/3\pi) \ln x\bigr)}, \tag{9} \]

which performs an infinite summation of the $\alpha^2 (\alpha \ln)^n$ terms. This two-loop solution for the invariant coupling, first obtained in [5], contains the non-trivial log-of-log dependence which
is now widely known as the "next-to-leading logs" approximation for the running coupling in quantum chromodynamics (QCD); see Eq. (21) below.

Comparing (9) with (8), one concludes that the two-loop correction is essential in the vicinity of the ghost pole at $x_1 = \exp(3\pi/\alpha)$. This also shows that the RG method is a regular procedure, within which it is easy to estimate the range of applicability of its results.

Quite soon, this approach was formulated [6] for the case of QFT with two coupling constants. To the system of FEs for two invariant couplings there corresponds a coupled system of non-linear differential equations (DEs). The latter was used in [10] to study the UV behavior of the π–N interaction at the one-loop level.

Thus, in Refs. [4–6,10] the RG was directly connected with practical computations of the UV and IR asymptotics. Since then, this technique, the renormalization group method (RGM),³ has become the principal means of asymptotic analysis in local QFT.

³ Summarized in the special chapter of the first edition of the monograph [11].

1.2. The Bogoliubov RG: symmetry of a solution

The RG transformation: Generally, the RG can be defined as a continuous one-parameter group of specific transformations of a particular solution (or the solution characteristic) of a problem, a solution that is fixed by boundary conditions. The RG transformation involves boundary condition parameters and corresponds to some change in the way of imposing this condition.

For illustration, imagine a one-argument solution characteristic f(x) that has to be specified by the boundary condition $f(x_0) = f_0$. Formally, one can represent a given characteristic of a particular solution as a function of the boundary parameters as well: $f(x) = f(x, x_0, f_0)$. This step can be treated as an embedding operation. Without loss of generality, f can be written in the form of a two-argument function $F(x/x_0, f_0)$ with the property $F(1, \gamma) = \gamma$. The RG transformation then corresponds to a change in the parameterization, say from $\{x_0, f_0\}$ to $\{x_1, f_1\}$, for the same solution.

In other words, the value of the argument x at which the boundary condition is given can be changed to $x_1$, with $f(x_1) = f_1$. The equality $F(x/x_0, f_0) = F(x/x_1, f_1)$ now reflects the fact that under such a change the form of the function F itself is not modified. Noting that $f_1 = F(x_1/x_0, f_0)$, we get

\[ F(\xi, f_0) = F(\xi/t, F(t, f_0)), \qquad \xi = x/x_0, \quad t = x_1/x_0 . \]

The group transformation here is $\{\xi \to \xi/t,\; f_0 \to F(t, f_0)\}$.

The renormgroup transformation for a given solution of some physical problem in the simplest case can now be defined as a simultaneous one-parameter transformation of two variables, say x and g, by

\[ R_t : \{ x \to x' = x/t, \quad g \to g' = \bar g(t, g) \}, \tag{10} \]

the first being a scaling of a coordinate x (or reference point) and the second a more complicated functional transformation of the solution characteristic. The equation

\[ \bar g(x, g) = \bar g(x/t, \bar g(t, g)) \tag{11} \]

for the transformation function $\bar g$ provides the group property $T_{\tau t} = T_\tau T_t$ of the transformation (10).

These are just the RG FEs and transformation for a massless QFT model with one coupling constant g. In that case $x = Q^2/\mu^2$ is the ratio of a four-momentum Q squared to a "normalization" momentum $\mu$ squared, and g is the coupling constant.

The RG transformation (10) of a QFT amplitude s is of the form (compare with Eq. (2))

\[ R_t \cdot s(x, g) \equiv e^{-\ln t\, X} s(x, g) = s(x/t, \bar g(t, g)) = z_s^{-1} s(x, g), \qquad z_s = s(t, g). \tag{12} \]
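The functional equations (11) and (12) are easy to check numerically once a concrete function is chosen. The sketch below is only an illustration: it assumes a coupling of the one-loop form that appears in Eq. (20), with a made-up coefficient `B1` and a made-up exponent `nu`, and evaluates the residuals of both equations.

```python
import math

B1 = 0.3  # hypothetical coefficient, chosen only for illustration


def g_bar(x, g):
    """Assumed example of an invariant coupling: g / (1 + g*B1*ln x)."""
    return g / (1.0 + g * B1 * math.log(x))


def s_amp(x, g, nu=0.5):
    """Assumed example amplitude (g_bar/g)**nu, normalized so that s(1, g) = 1."""
    return (g_bar(x, g) / g) ** nu


def fe11_residual(x, t, g):
    """Left side minus right side of the functional equation (11)."""
    return g_bar(x, g) - g_bar(x / t, g_bar(t, g))


def fe12_residual(x, t, g):
    """Residual of s(x, g) = s(t, g) * s(x/t, g_bar(t, g)), cf. Eq. (12)."""
    return s_amp(x, g) - s_amp(t, g) * s_amp(x / t, g_bar(t, g))


if __name__ == "__main__":
    print(fe11_residual(50.0, 7.0, 0.2))  # vanishes up to rounding
    print(fe12_residual(50.0, 7.0, 0.2))  # vanishes up to rounding
```

Both residuals vanish up to floating-point rounding for any admissible x, t and g, which is the statement that shifting the reference point from 1 to t and simultaneously replacing g by ḡ(t, g) leaves the solution unchanged.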
Several generalizations are in order.

(a) "Massive" case: For example, in QFT, if we do not neglect the mass, m, of a particle, we have to insert an additional dimensionless argument into the invariant coupling $\bar g$, which now has to be considered as a function of three variables: $x = Q^2/\mu^2$, $y = m^2/\mu^2$, and g. The presence of a new "mass" argument y modifies the group transformation (10) and the FE (11):

\[ R_t : \left\{ x' = \frac{x}{t}, \; y' = \frac{y}{t}, \; g' = \bar g(t, y; g) \right\}, \qquad \bar g(x, y; g) = \bar g\!\left(\frac{x}{t}, \frac{y}{t}; \bar g(t, y; g)\right). \tag{13} \]

Here, it is important that the new parameter y (which, physically, should be close to the x variable, as it scales similarly) also enters into the transformation law of g.

If the considered QFT model, like QCD, contains several masses, there will be several mass arguments $y \to \{y\} \equiv y_1, y_2, \ldots, y_n$.

(b) Multi-coupling case: A more involved generalization corresponds to the case of several coupling constants: $g \to \{g\} = g_1, \ldots, g_k$. Here, there arises a "family" of effective couplings

\[ \bar g \to \{\bar g\}, \qquad \bar g_i = \bar g_i(x, y; \{g\}), \qquad i = 1, 2, \ldots, k, \tag{14} \]

satisfying the system of coupled functional equations

\[ \bar g_i(x, y; \{g\}) = \bar g_i\bigl(x/t, y/t; \{\bar g(t, y; \{g\})\}\bigr). \tag{15} \]

The RG transformation now is

\[ R_t : \{ x \to x/t, \; y \to y/t, \; \{g\} \to \{g(t)\} \}, \qquad g_i(t) = \bar g_i(t, y; \{g\}). \tag{16} \]
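To see how the mass argument enters the functional equation (13), one can test it on an explicit function. The example below is an assumption made purely for illustration and is not taken from the text: any coupling of the form 1/ḡ = 1/g + h(x/y) − h(1/y) satisfies (13), and the choice h(u) = B ln(1 + u), with a made-up coefficient `B`, reduces to the massless logarithm B ln x as y → 0.

```python
import math

B = 0.3  # hypothetical coefficient, chosen only for illustration


def h(u):
    """Assumed mass-dependent logarithm; h(x/y) - h(1/y) = B*ln((x + y)/(1 + y))."""
    return B * math.log(1.0 + u)


def g_bar(x, y, g):
    """Illustrative massive coupling: 1/g_bar = 1/g + h(x/y) - h(1/y)."""
    return g / (1.0 + g * (h(x / y) - h(1.0 / y)))


def fe13_residual(x, y, t, g):
    """Left side minus right side of the functional equation in (13)."""
    return g_bar(x, y, g) - g_bar(x / t, y / t, g_bar(t, y, g))


def g_bar_massless(x, g):
    """Massless counterpart, g / (1 + g*B*ln x), used for comparison."""
    return g / (1.0 + g * B * math.log(x))


if __name__ == "__main__":
    print(fe13_residual(50.0, 2.0, 7.0, 0.2))                   # vanishes up to rounding
    print(g_bar(50.0, 1e-9, 0.2) - g_bar_massless(50.0, 0.2))   # tiny for small y
```

Note that y is rescaled together with x inside ḡ on the right-hand side, while the transformed coupling ḡ(t, y; g) is evaluated at the original y; this is exactly the structure of (13).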
1.3. The renorm-group method

1.3.1. The algorithm

The unification of an approximate solution [4,5] and the abstract group symmetry can be realized with the help of the group DEs. If we define $\beta$ and $\gamma$ (the so-called group "generators") via some approximate solutions and then solve the evolutional DEs, we obtain the RG improved solutions that obey the group symmetry and correspond to the approximate solutions used as an input.

Now, we can formulate an algorithm for improving an approximate solution. The procedure is given by the following prescription, which we illustrate in the massless one-coupling case of Eqs. (4) and (5):
Assume some approximate solution $\bar g_{appr}(x, g)$, $s_{appr}(x, g)$ is known:

1. On the basis of Eq. (5) define the beta- and gamma-functions

\[ \beta(g) \stackrel{\rm def}{=} \left.\frac{\partial}{\partial \xi}\, \bar g_{appr}(\xi, g)\right|_{\xi=1}, \qquad \gamma(g) \stackrel{\rm def}{=} \left.\frac{\partial}{\partial \xi}\, s_{appr}(\xi, g)\right|_{\xi=1}. \tag{17} \]

2. Integrate the first of Eqs. (4), i.e., construct the function

\[ f(g) \stackrel{\rm def}{=} \int^{g} \frac{d\gamma}{\beta(\gamma)}. \tag{18} \]

3. Solve the resulting equation to obtain

\[ \bar g_{rg}(x, g) = f^{-1}\{ f(g) + \ln x \}. \tag{19} \]

4. Integrate the second of Eqs. (4) using this expression $\bar g_{rg}$ on its right hand side to explicitly obtain $s_{rg}(x, g)$.

5. The expressions $\bar g_{rg}$ and $s_{rg}$ precisely satisfy the RG symmetry, i.e., they are exact solutions of Eqs. (11) and (12) corresponding to $\bar g_{appr}$ and $s_{appr}$ used as input.

1.3.2. A simple illustration

As a concrete illustration, take the simplest perturbative expressions: $\bar g^{(1)}_{pt} = g - g^2 \beta_1 \ln x$ for $\bar g_{appr}$ and $s^{(1)}_{pt} = 1 - g\gamma_1 \ln x$. Here, $\beta(g) = -\beta_1 g^2$, $\gamma(g) = -\gamma_1 g$, and integration of (4) gives the explicit expressions

\[ \bar g^{(1)}_{rg}(x, g) = \frac{g}{1 + g\beta_1 \ln x}, \qquad s^{(1)}_{rg}(x, g) = \bigl(\bar g(x, g)/g\bigr)^{\nu_1}, \quad \nu_1 = \gamma_1/\beta_1, \tag{20} \]

which, on the one hand, exactly satisfy the RG symmetry and, on the other, being expanded in powers of g, correlate with $\bar g_{pt}$ and $s_{pt}$.

Now, on the basis of the geometric progression (20), let us present the two-loop perturbative approximation for $\bar g$ in the form $\bar g^{(2)}_{pt} = g - g^2 \beta_1 \ln x + g^3 (\beta_1^2 \ln^2 x - \beta_2 \ln x)$. By using this expression as an input in Eq. (17), we have $\beta^{(2)}(g) = -\beta_1 g^2 - \beta_2 g^3$ and then (performing step 2),

\[ \beta_1 f^{(2)}(z) = -\int^{z} \frac{d\gamma}{\gamma^2 + b\gamma^3} = \frac{1}{z} + b \ln \frac{z}{1 + bz}, \qquad b = \frac{\beta_2}{\beta_1}. \]

To make the last step, we have to start with Eq. (19) multiplied by $\beta_1$,

\[ \beta_1 f^{(2)}\bigl[\bar g^{(2)}_{rg}(x, g)\bigr] = \beta_1 f^{(2)}(g) + \beta_1 \ln x, \]

which is transcendental and has no simple explicit solution.⁴ Due to this, one usually solves the equation approximately by noting that the second, logarithmic, contribution to $f^{(2)}(z)$ is a small correction to the first one at $bz \ll 1$. With this caveat we can substitute the one-loop RG expression (20), instead of $\bar g^{(2)}_{rg}$, into this correction and obtain the explicit "iterative" solution

\[ \bar g^{(2)}_{rg} = \frac{g}{1 + g\beta_1 l + g(\beta_2/\beta_1) \ln[1 + g\beta_1 l]}, \qquad l = \ln x. \tag{21} \]

⁴ It can be expressed in terms of the special Lambert W-function, defined by $W(z) \exp W(z) = z$; see, e.g., Ref. [12].
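The prescription of Section 1.3.1 can also be carried out numerically. The sketch below is only an illustration under stated assumptions: the coefficient `BETA1` is made up, step 1 is done by a finite difference, and steps 2 and 3 are replaced by a direct Runge–Kutta integration of the first of Eqs. (4) (equivalent to inverting f, but not the analytic route described in the text). The one-loop perturbative input then reproduces the closed form (20).

```python
import math

BETA1 = 0.3  # hypothetical one-loop coefficient, chosen only for illustration


def g_pt(x, g):
    """One-loop perturbative input: g - g^2 * beta_1 * ln x."""
    return g - g * g * BETA1 * math.log(x)


def beta(g, eps=1e-6):
    """Step 1, Eq. (17): derivative of g_pt(xi, g) in xi at xi = 1 (central difference)."""
    return (g_pt(1.0 + eps, g) - g_pt(1.0 - eps, g)) / (2.0 * eps)


def g_rg(x, g, steps=2000):
    """Steps 2-3 done numerically: integrate d g_bar / d ln x = beta(g_bar)
    from ln x = 0, where g_bar = g, using the classical RK4 scheme."""
    h = math.log(x) / steps
    y = g
    for _ in range(steps):
        k1 = beta(y)
        k2 = beta(y + 0.5 * h * k1)
        k3 = beta(y + 0.5 * h * k2)
        k4 = beta(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y


def g_closed(x, g):
    """Closed-form one-loop RG solution, first expression in Eq. (20)."""
    return g / (1.0 + g * BETA1 * math.log(x))


if __name__ == "__main__":
    x, g = 1000.0, 0.2
    print("perturbative input :", g_pt(x, g))
    print("RG-improved (ODE)  :", g_rg(x, g))
    print("closed form (20)   :", g_closed(x, g))
```

For these values g β₁ ln x is about 0.4, so the truncated perturbative expression already deviates visibly from the RG-improved result, while the numerically integrated solution and the closed form (20) agree.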
An analogous procedure for $s^{(2)}_{pt} = 1 - g\gamma_1 \ln x + g^2 \bigl(\tfrac{1}{2}\gamma_1(\gamma_1 + \beta_1) \ln^2 x - \gamma_2 \ln x\bigr)$ yields

\[ s^{(2)}_{rg} = \frac{S(\bar g^{(2)}_{rg}(x, g))}{S(g)} \quad \text{with} \quad S(g) = g^{\nu_1} e^{\nu_2 g} \quad \text{and} \quad \nu_2 = \frac{\beta_1\gamma_2 - \gamma_1\beta_2}{\beta_1^2}. \tag{22} \]

These results are interesting from several aspects.

• Firstly, being expanded in powers of g and gl, they produce an infinite series containing "leading" and "next-to-leading" UV logarithmic contributions.
• Secondly, they contain a new analytic dependence $\ln(1 + g\beta_1 l) \sim \ln(\ln Q^2)$ which is absent in the perturbative input.
• Thirdly, when compared with the one-loop solution, Eq. (20), they illustrate an algorithm for the improvement of a solution's accuracy, i.e., the regularity of the RGM.
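The quality of the iterative two-loop expression (21) can be gauged against a direct numerical solution of the two-loop equation dḡ/d ln x = −β₁ḡ² − β₂ḡ³. The sketch below is an illustration only; the coefficients `BETA1` and `BETA2` and the chosen values of g and x are made up and carry no physical meaning.

```python
import math

BETA1, BETA2 = 0.3, 0.1  # hypothetical coefficients, chosen only for illustration
B = BETA2 / BETA1


def beta_two_loop(g):
    """Two-loop generator obtained from the perturbative input via Eq. (17)."""
    return -BETA1 * g**2 - BETA2 * g**3


def g_numeric(l, g, steps=4000):
    """Integrate d g_bar / d l = beta(g_bar), l = ln x, with the classical RK4 scheme."""
    h = l / steps
    y = g
    for _ in range(steps):
        k1 = beta_two_loop(y)
        k2 = beta_two_loop(y + 0.5 * h * k1)
        k3 = beta_two_loop(y + 0.5 * h * k2)
        k4 = beta_two_loop(y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return y


def f2_scaled(z):
    """beta_1 * f^(2)(z) = 1/z + b*ln(z / (1 + b*z)), with b = beta_2 / beta_1."""
    return 1.0 / z + B * math.log(z / (1.0 + B * z))


def g_one_loop(l, g):
    """One-loop RG solution, Eq. (20)."""
    return g / (1.0 + g * BETA1 * l)


def g_iterative(l, g):
    """Iterative two-loop solution, Eq. (21)."""
    d1 = 1.0 + g * BETA1 * l
    return g / (d1 + g * B * math.log(d1))


if __name__ == "__main__":
    l, g = math.log(1000.0), 0.2
    exact = g_numeric(l, g)
    print("numerical two-loop :", exact)
    print("iterative, Eq. (21):", g_iterative(l, g))
    print("one-loop, Eq. (20) :", g_one_loop(l, g))
    # The numerical solution also satisfies the transcendental equation:
    print("implicit residual  :", f2_scaled(exact) - f2_scaled(g) - BETA1 * l)
```

For these sample values the iterative formula tracks the numerical solution far more closely than the one-loop expression does, in line with the remark that the logarithmic term in f⁽²⁾ is a small correction when bz ≪ 1.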
1.3.3. RGM usage in QFT

As we have seen, finite order perturbative expressions in QFT do not obey the RG symmetry. On the other hand, it was shown that the one- and two-loop approximations, used as an input for the construction of the "generators" $\beta(g)$ and $\gamma(g)$, yield expressions (20)–(22) that obey the group symmetry and exactly satisfy the FEs (11) and (12).

More generally, one can state the following logical structure of the RGM procedure.

• Solve the group equation(s) for the invariant coupling(s) $\bar g_{rg}(x, g)$ using some approximate solution $\bar g_{pt}$ as an input.
• Obtain the RG solutions for some other QFT objects (like vertices and propagator amplitudes) on the basis of the expression(s) for $\bar g_{rg}$ just derived.

Typically, they satisfy the equation

\[ X M(x, y; g) = \gamma_M(y, g)\, M(x, y; g). \tag{23} \]

The general structure of the corresponding solutions has the form

\[ M(x, y; g) = z_M^{-1}(y, g)\, \mathcal{M}\bigl(x/y, \bar g(x, y; g)\bigr). \tag{24} \]

Note that the function $\mathcal{M}$ on the right hand side depends only on the RG invariants, that is, on the first integrals of the RG operator X introduced in Eqs. (6) and (7). It satisfies the homogeneous partial differential equation (PDE) $X\mathcal{M} = 0$. For RG invariant objects, like observables, $z_M = 1$, $\gamma_M = 0$.

Now we can summarize the properties of the RGM. The RGM is a regular procedure for combining dynamical information (taken from an approximate solution) with the RG symmetry. The essence of the RGM is the following:

(1) The mathematical tool used in the RGM is Lie differential equations.
(2) The key element of the RGM is the possibility of an (approximate) determination of the "generators", such as $\beta(g)$, $\gamma(g)$, from the dynamics.
(3) The RGM works most effectively in the case where the solution has a singular behavior. It restores the structure of the singularity compatible with the RG symmetry.
2. Evolution of the renormalization group concept

In the 1970s and 1980s RG ideas were applied to critical phenomena (spontaneous magnetization, polymerization, percolation), non-coherent radiation transfer, dynamic chaos, and so on. Perhaps it was the less sophisticated motivation given by Wilson in the context of spin lattice phenomena (rather than in QFT) that made this "explosion" of RG applications possible.

2.1. Renormalization group evolution

2.1.1. The Kadanoff–Wilson RG in critical phenomena

(a) Spin lattice: The so-called renormalization group in critical phenomena is based on the Kadanoff–Wilson procedure [13,14] of "decimation" or "blocking". Initially, it emerged from the problem of critical phenomena on spin lattices. Imagine a regular (two- or three-dimensional) lattice consisting of $N^d$ ($d = 2, 3$) sites with an "elementary step" a between them. Suppose that at every site a spin vector $\sigma$ is located. The Hamiltonian describing the spin interaction between nearest neighbors,

\[ H = k \sum_i \sigma_i \cdot \sigma_{i\pm 1}, \]

contains k, the coupling constant. A statistical sum is obtained from the partition function, $S = \langle \exp(-H/\theta) \rangle_{aver}$.

To realize blocking, one has to perform a "spin averaging" over blocks consisting of $n^d$ elementary sites. This step diminishes the number of degrees of freedom from $N^d$ to $(N/n)^d$. It also destroys the short-range properties of the system, since some information is lost in the averaging procedure. However, the long-range physics (such as the correlation length, essential for understanding the phase transition) is not affected by it, and thus we gain a simplification of the problem.

As a result of this blocking procedure, new effective spins, $\Sigma$, arise at new sites, forming a new effective lattice with lattice spacing na. We also arrive at the new effective Hamiltonian

\[ H_{eff} = K_n \sum_I \Sigma_I \cdot \Sigma_{I\pm 1} + \Delta H \]

with the effective coupling $K_n$ between new spins $\Sigma_I$ of new neighboring sites; $K_n$ has to be defined by the averaging process as a function of k and n. Here, $\Delta H$ contains quartic and higher spin forms which are irrelevant for the IR (long-distance) properties. Due to this, one can drop $\Delta H$ and conclude that the spin averaging leads to an approximate transformation,

\[ k \sum_i \sigma \cdot \sigma \;\to\; K_n \sum_I \Sigma \cdot \Sigma, \]

or, taking into account the "elementary step" change, to $\{a \to na,\; k \to K_n\}$. The latter is the Kadanoff–Wilson transformation.

It is convenient to write down the new coupling $K_n$ in the form $K_n = K(1/n, k)$. Then, the KW transformation reads as

\[ KW_n : \{ a \to na, \quad k \to K_n = K(1/n, k) \}. \tag{25} \]
These transformations obey a composition law $KW_n \cdot KW_m = KW_{nm}$ if the relation

\[ K(x, k) = K(x/t, K(t, k)), \qquad x = 1/nm, \quad t = 1/n \tag{26} \]

holds. This is very close to the RG symmetry.

We observe the following points:

• The RG symmetry in this case is approximate (due to neglecting $\Delta H$).
• The transformations $KW_n$ are discrete.
• There exists no inverse transformation to $KW_n$.
• The transformations $KW_n$ relate different auxiliary models.
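The composition law behind Eq. (26) can be made concrete on a case that is not discussed in the text and is brought in here only as an illustration: the one-dimensional Ising chain in zero field, where summing over all but every n-th spin gives the effective dimensionless coupling through tanh Kₙ = (tanh k)ⁿ. In this special case no extra ΔH term is generated, so the law holds exactly rather than approximately. The function names below are hypothetical.

```python
import math


def kw(n, k):
    """Effective coupling K_n = K(1/n, k) after decimating a 1D Ising chain
    by an integer factor n, using tanh(K_n) = tanh(k)**n (illustrative model)."""
    return math.atanh(math.tanh(k) ** n)


def composition_residual(n, m, k):
    """Difference between decimating by n*m at once and by n, then by m."""
    return kw(n * m, k) - kw(m, kw(n, k))


if __name__ == "__main__":
    k = 0.8
    print(composition_residual(2, 3, k))  # vanishes up to rounding
    print([round(kw(n, k), 4) for n in (1, 2, 4, 8)])
```

With x = 1/nm and t = 1/n the residual is exactly the difference of the two sides of Eq. (26); it vanishes up to rounding, while the sequence of couplings for n = 1, 2, 4, 8 decreases towards zero as the blocks grow.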
Hence, the "Kadanoff–Wilson renormalization group" (KW-RG) is an approximate and discrete semi-group. For the long-distance (IR limit) physics, however, the increment $\delta(1/n)$ is small and it is possible to use differential Lie equations.⁵

⁵ In applications of these transformations to critical phenomena the notion of a fixed point is important. Generally, a fixed point is associated with power-type asymptotic behavior. Note here that, contrary to the QFT case considered in Section 1.3.2, in phase transitions we deal with an IR stable point.

(b) Polymer theory: In polymer physics, one considers the statistical properties of polymer macromolecules, which can be imagined as very long chains of identical elements (with the number of elements N as big as $10^5$). Molecules swim in a solvent and form globules. This big molecular chain forms a specific pattern resembling that formed by a random walk.

The central problem of polymer theory is very close to that of a random walk and can be formulated as follows. For a long chain of N "steps" (with step size a), one has to find the "chain size" $R_N$, i.e. the distance between the "start" and the "finish" points (the size of a "globule"), with the distribution function $f(\vartheta)$ of angles between the neighboring elements being given. For large values of N, the molecular size $R_N$ obeys the power (Fleury) law $R_N \sim N^{\nu}$, with $\nu$ the Fleury index.

When N is given, $R_N$ is a functional of $f(\vartheta)$ which depends on external conditions (e.g., temperature T, properties of the solvent, etc.). If T grows, $R_N$ increases and at some moment the globules touch one another. This is the polymerization process, which is very similar to a phase transition phenomenon.

The Kadanoff–Wilson blocking ideology was introduced into polymer physics by De Gennes [15]. The key idea is a grouping of n neighboring elements of a chain into a new "elementary block". It leads to the transformation $\{1 \to n,\; a \to A_n\}$, which is analogous to the one for spin lattice decimation. This transformation must be specified by a direct calculation which gives an explicit form of $A_n = \bar a(n, a)$. Here, we have a discrete semi-group. Then, by using the KW-RG technique, one finds the fixed point, obtains the Fleury power law and calculates its index $\nu$.

An essential feature of a polymer chain is the impossibility of a self-intersection. This is known as the excluded volume effect in the random walk problem. Generally, the excluded volume effect yields some complications. However, using the QFT RG approach to polymers [16], it can be treated rather simply by introducing another argument which is analogous to the finite length L in the transfer problem or the particle mass m in QFT.
Besides polymers, the KW-RG technique has been used in other fields of physics, such as percolation, non-coherent radiation transfer [17], dynamical chaos [18] and others.

2.1.2. The Bogoliubov symmetry outside QFT

The original QFT-RG approach has also proliferated into other parts of theoretical physics. In the late 1950s, it was used [19] for the summation of Coulomb singularities in Bogoliubov's theory of superconductivity based on the Fröhlich electron–phonon interaction. Twenty years later it was used in the theory of turbulence.

(a) Turbulence: To formulate the turbulence problem in terms of the RG, one has to perform the following steps [20,21]:

1. Introduce the generating functional for correlation functions.
2. Write down the path integral representation for this functional.
3. By changing the functional integration variable, find the equivalence of the statistical system to some QFT model.
4. Construct the system of Schwinger–Dyson equations for this equivalent QFT model.
5. Perform the finite renormalization procedure and derive the RG equations.

Here, the reparametrization degree of freedom physically corresponds to a change of the long wavelength cutoff which is built into the definition of a few effective parameters.

(b) Weak shock wave: Another example can be taken from hydrodynamics. Consider a weak shock wave in the one-dimensional case at a large distance l from the starting (implosion) point. The dependence of the velocity, v, of the matter on l at a given moment of time, t, has a simple triangular shape and can be described by the expression

\[ v(l) = \frac{l}{L}\, V \quad \text{at } l \le L; \qquad v(l) = 0 \quad \text{for } l > L, \]

where L = L(t) is the front position and V = v(L) is the front velocity. They are functions of time. In the absence of viscosity, the "conservation law" LV = Const. holds. Due to this, the front velocity can be treated as a function of the front position as well: L ≡ x, V = V(x).

If the physical situation is homogeneous, the front velocity V(x) can be considered to be a function of only two additional relevant arguments: its own value $V_0 = V(x_0)$ at some other point ($x_0 < x$) and the coordinate $x_0$. It can be written in the form $V(x) = G(x/x_0, V_0)$. If we pick three points $x_0$, $x_1$ and $x_2$ (for details, see Refs. [22,23]), then the initial condition may be given either at $x_0$ or at $x_1$. Thus, we obtain the FE equivalent to (11):

\[ V_2 = G(x_2/x_0, V_0) = G(x_2/x_1, V_1) = G(x_2/x_1, G(x_1/x_0, V_0)). \]

(c) One-dimensional transfer: A similar argument has been given by Mnatzakanian [26] in the one-dimensional transfer problem. Imagine a half-space filled with a homogeneous medium on the surface of which some flow (of radiation or particles) with intensity $g_0$ falls from the vacuum half-space. We follow the flow as it moves into the medium to a distance l from the boundary. Due to homogeneity along the l coordinate, the intensity of the penetrated flow g(l) depends on
two essential arguments, $g(l) = G(l, g_0)$. The values of the flow at three different points, $g_0$ (on the boundary), $g_1$ and $g_2 > g_1$, can be connected to each other by the transitivity relations $g_1 = G(\lambda, g_0)$, $g_2 = G(\lambda + l, g_0) = G(l, g_1)$, which lead to the FE

\[ G(l, g) = G(l - \lambda, G(\lambda, g)). \tag{27} \]

Performing a logarithmic change of variables $l = \ln x$, $\lambda = \ln t$, $G(l, g) = \bar g(x, g)$, we see that (27) is equivalent to (11).

Consider now the intensity of a reverse flow, i.e. the total amount of particles at the point l moving in the backward direction. It is completely defined by $g_0$ and can be written down as $R(l, g_0)$. This function can be represented in the form $R(l, g) = R_0(g) N(l, g)$ with $R_0 \equiv R(0, g)$ and the function N "normalized" on the boundary, $N(0, g) = 1$. Playing the same game with the transitivity, we arrive at the FE

\[ N(l, g) = Z(\lambda, g)\, N(l - \lambda, G(\lambda, g)), \qquad Z = R_0(g_1)/R_0(g), \quad g_1 = G(\lambda, g), \tag{28} \]

related to Eq. (12) by a logarithmic change of variables. One can refer to (27) and (28) as the additive version of the RG FEs, while the previous equations of Section 1, like (11), (12) and (13), are the multiplicative one.

The transfer problem admits a modification connected with discrete inhomogeneity. Imagine the case of two different kinds of homogeneous materials separated by an inner boundary surface at l = L. The separation point l = L may correspond, for instance, to the boundary with empty space, in which case the resulting equation is equivalent to Eq. (13).

One more generalization is related to the "multiplication" of the argument g as expressed by Eq. (14). Physically, this relates to the case of radiation on different frequencies $\nu_i$, $i = 1, 2, \ldots, k$ (or particles of different energies or of different types). Take the case k = 2 and suppose that the material of the medium has such properties that the transfer processes of the two flows are not independent. In this case, the characteristic functions of these flows, G and H, depend on both boundary values $g_0$ and $h_0$ and can be taken as functions $g(l) = G(l, g_0, h_0)$, $h(l) = H(l, g_0, h_0)$. After a group operation $l \to l - \lambda$, we arrive at a coupled set of functional equations

\[ G(l + \lambda, g, h) = G(l, g_\lambda, h_\lambda), \qquad H(l + \lambda, g, h) = H(l, g_\lambda, h_\lambda), \]
\[ g_\lambda \equiv G(\lambda, g, h), \qquad h_\lambda \equiv H(\lambda, g, h), \]

which is just an additive version of system (15) for k = 2.

Now, we can make the important conclusion that a common property yielding functional group equations is just the transitivity property of some physical quantity with respect to the way of giving its boundary or initial value. Hence, the RG symmetry is not a symmetry of equations but a symmetry of equation solutions, that is, of equations and boundary conditions considered as a whole.
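The transitivity argument for the transfer problem can be tried out on a simple model. The attenuation law below is an assumption introduced only for illustration (it is not taken from the text): the flow obeys dg/dl = −κg − cg² with made-up constants `KAPPA` and `C`, which has a closed-form solution G(l, g). The sketch checks the additive FE (27) and its multiplicative counterpart (11) obtained by the logarithmic change of variables.

```python
import math

KAPPA, C = 0.5, 0.8  # hypothetical medium constants, chosen only for illustration


def G(l, g):
    """Intensity at depth l for boundary value g, solving dg/dl = -KAPPA*g - C*g**2."""
    e = math.exp(-KAPPA * l)
    return g * e / (1.0 + (C * g / KAPPA) * (1.0 - e))


def additive_residual(l, lam, g):
    """Residual of Eq. (27): G(l, g) - G(l - lam, G(lam, g))."""
    return G(l, g) - G(l - lam, G(lam, g))


def g_bar(x, g):
    """Multiplicative form after the change of variables l = ln x."""
    return G(math.log(x), g)


def multiplicative_residual(x, t, g):
    """Residual of Eq. (11): g_bar(x, g) - g_bar(x/t, g_bar(t, g))."""
    return g_bar(x, g) - g_bar(x / t, g_bar(t, g))


if __name__ == "__main__":
    print(additive_residual(3.0, 1.2, 2.0))         # vanishes up to rounding
    print(multiplicative_residual(20.0, 4.0, 2.0))  # vanishes up to rounding
```

Both residuals vanish up to rounding: moving the boundary from depth 0 to depth λ and using the intensity G(λ, g) found there as the new boundary value reproduces the same solution, which is the symmetry of the solution (equation plus boundary condition) emphasized above.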
2.2. Difference between the Bogoliubov RG and KW-RG

As mentioned above, RG ideas have expanded into diverse fields of physics in two different ways:

• Via direct analogy with the Kadanoff–Wilson construction (averaging over some set of degrees of freedom) in polymers, non-coherent transfer and percolation, i.e., constructing a set of models for a given physical problem.
• Via finding an exact RG symmetry by proof of the equivalence with a QFT model (e.g., in turbulence [20,21] and plasma turbulence [27]) or by some other reasoning (as in the transfer problem).

To the question "Are there different renormalization groups?" the answer is positive:

1. In QFT and some simple macroscopic examples, RG symmetry is an exact symmetry of a solution formulated in terms of its natural variables.
2. In turbulence, continuous spin-field models and in some others, it is a symmetry of an equivalent QFT model.
3. In polymers, percolation, etc. (with KW blocking), the RG transformation is a transformation between different auxiliary models (specially constructed for this purpose) of a given system.

As we have shown, there is no essential difference in the mathematical formulation. There exists, however, a profound difference in physics:

• In cases 1 and 2 (as well as in some macroscopic examples), the RG is an exact symmetry of a solution.
• In the Kadanoff–Wilson type problem (spin lattice, polymers, etc.), one has to construct a set $\mathcal{M}$ of models $M_i$. The KW-RG transformation

\[ KW_n M_i = M_{ni} \quad \text{with integer } n \tag{29} \]

acts inside a set of models.

2.3. Functional self-similarity

The RG transformations have a close connection to the concept of self-similarity. Self-similarity transformations for problems formulated by using non-linear PDEs have been well known since the last century, mainly in the dynamics of liquids and gases. They are one-parameter transformations defined as a simultaneous power scaling of independent variables $z = \{x, t, \ldots\}$, solutions $f_k(z)$ and other functions $V_i(z)$ (like external forces),

\[ S_\lambda : \{ x' = x\lambda, \quad t' = t\lambda^{a}, \quad f_k' = \lambda^{\varphi_k} f_k, \quad V_i' = \lambda^{\nu_i} V_i \}, \]

entering into the equations.

To emphasize their power structure, we use the term power self-similarity (PS). According to Zel'dovich and Barenblatt [28], PS can be classified into two types:

(a) PS of the 1st kind with all indices $a, \ldots, \varphi, \nu, \ldots$ being integers or rational numbers (rational PS) that are usually found from the theory of dimensions;
(b) PS of the 2nd kind with irrational indices (fractal PS) which should be defined from the system's dynamics.

To relate the RG to PS, consider the renormgroup FE $\bar g(xt, g) = \bar g(x, \bar g(t, g))$. Its general solution is known; it depends on an arbitrary function of one argument, see Eq. (19). However, at the moment, we are interested in a special solution linear in the second argument: $\bar g(x, g) = g X(x)$. The function X(x) should satisfy the equation $X(xt) = X(x)X(t)$ with the solution $X(x) = x^{\nu}$. Hence, $\bar g(x, g) = g x^{\nu}$. This means that in our special case, linear in g, the RG transformation (10) is reduced to the PS transformation,

\[ R_t \Rightarrow S_t : \{ x' = x t^{-1}, \quad g' = g t^{\nu} \}. \tag{30} \]

More generally, with the RG, instead of a power law we have an arbitrary functional dependence. Thus, one can consider transformations (10), (13) and (16) as functional generalizations of the usual (i.e., power) self-similarity transformations. Hence, it is natural to refer to them as transformations of functional scaling or functional (self-)similarity (FS) rather than as RG-transformations. In short, RG ≡ FS, with FS standing for functional similarity.⁶

⁶ This notion was first mentioned in [29] and formally introduced [30] at the beginning of the 1980s.

Now, we can answer the question on the physical meaning of the symmetry underlying FS and the Bogoliubov renormgroup. As we have mentioned, it is not a symmetry of a physical system or of the equation(s) for the problem at hand, but rather a symmetry of a solution considered as a function of the relevant physical variables and suitable boundary parameters. A symmetry like that can be related, in particular, to the invariance of a physical quantity described by a solution. It means that this quantity remains unaltered under group transformations changing the way in which boundary conditions are imposed. For instance, this happens in the illustration of Section 1.2, where the changing of the reference point constitutes the group operation. Homogeneity is an important feature of a physical system under consideration.

However, homogeneity can be violated in a discrete manner. Imagine that such a discrete violation is connected with a certain value of x, say, x = y. In this case the RG transformation with the canonical parameter t has the form (13).

The symmetry connected to FS is a very simple and frequently encountered property of physical solutions. It can be easily "discovered" in numerous problems of theoretical physics like classical mechanics, transfer theory, classical hydrodynamics, and so on [30,26,22,23]; see Section 2.1.2 above.

3. Solution symmetry in mathematical physics

3.1. Constructing RG-symmetries and their application

From the discussion in Sections 1.1 and 1.2 it follows that the FS transformation in QFT is the scaling transformation of an independent variable x (and, possibly, a parameter y) accompanied
D.V. Shirkov, V.F. Kovalev / Physics Reports 352 (2001) 219–249
Fig. 1. RGS construction and application to a BVP in mathematical physics.
by a functional transformation of the solution characteristic g. It is introduced by means of either the finite transformations (10), (13) and (16) or the infinitesimal operator (7). Hence, the symmetry of a solution, i.e., FS symmetry, is commonly understood in QFT as the Lie point symmetry of a one-parameter transformation group defined by an operator of (7)-type.

Now, we are interested in answers to the following questions:
• is it possible to extend the notion of RG symmetry (RGS) and generalize the form of RGS implementation so that it may differ from that given by (7)? And if "yes",
• is it possible to create a regular algorithm for finding these symmetries?
The answer is yes to both these questions, and below we demonstrate a regular algorithm for constructing an RGS in mathematical physics that up to now has been devised only for boundary value problems (BVPs) for the (system of) differential equation(s), which we shall refer to as basic equations (BEs). The point is that these models can be analyzed by methods of Lie group analysis, which employ infinitesimal group transformations instead of finite ones.

The general idea of the algorithm is to find a specific renormgroup manifold RM that contains the desired solution of a BVP. Then, construction of an RGS that leaves this solution unaltered is performed by using standard methods of group analysis of DEs.

The regular algorithm for constructing RGS (and their application) can be formulated in the form of a scheme⁷ which comprises a few steps. It is illustrated in Fig. 1.

(I) First of all, a specific renormgroup manifold RM for the given BVP is constructed, which is identified below with a system of kth-order DEs
$$F_\sigma(z, u, u_{(1)}, \ldots, u_{(k)}) = 0, \qquad \sigma = 1, \ldots, s. \tag{31}$$
⁷ In the present form this scheme was described in [31]. One can find there historical comments and references to the pioneering publications.
In (31) and what follows we use the terminology of group analysis and the notation of differential algebra. In contrast to mathematical analysis, where we usually deal with functions $u^\alpha$, $\alpha = 1, \ldots, m$, of independent variables $x^i$, $i = 1, \ldots, n$, and derivatives $u^\alpha_i(x) \equiv \partial u^\alpha/\partial x^i$, $u^\alpha_{ij}(x) \equiv \partial^2 u^\alpha/\partial x^i \partial x^j, \ldots$ that are also considered as functions of x, in differential algebra we also treat $u^\alpha, u^\alpha_i, u^\alpha_{ij}, \ldots$ as variables. Therefore, in differential algebra we deal with an infinite number of variables
$$x = \{x^i\}, \quad u = \{u^\alpha\}, \quad u_{(1)} = \{u^\alpha_i\}, \quad u_{(2)} = \{u^\alpha_{i_1 i_2}\}, \ldots, \qquad (i, i_1, \ldots = 1, \ldots, n), \tag{32}$$
where the $x^i$ are called independent variables, $u^\alpha$ dependent variables and $u_{(1)}, u_{(2)}, \ldots$ derivatives. A locally analytic function $f(x, u, u_{(1)}, \ldots, u_{(k)})$ of the variables (32), with the highest-order derivative being of the kth order, is called a differential function of order k. The set of all differential functions of a given order forms a space of differential functions $\mathcal{A}$, the universal space of modern group analysis [32–35].

The realization of the first step is not unique, as it depends on both the form of the basic equations and the boundary conditions; generally, the RM does not coincide with the BEs. We indicate here a few possibilities for achieving this step.
• One can use an extension of the space of variables involved in the group transformations. These variables, for example, may be parameters, $p = \{p_j\}$, $j = 1, \ldots, l$, entering into a solution via the equations and/or boundary conditions. Adding the parameters p to the list of independent variables, $z = \{x, p\}$, we treat the BEs in this extended space as the RM (31). Similarly, one can extend the space of differential variables by treating derivatives with respect to p as additional differential variables.
• Another possibility employs reformulating the boundary conditions in terms of embedding equations or differential constraints, which are then combined with the BEs. The key idea here is to treat the solution of the BVP simultaneously as an analytic function of the independent variables and the boundary parameters $b = \{x_0^i, u_0^\alpha\}$. Differentiation with respect to these parameters leads to additional DEs (embedding equations) that, together with the BEs, form an RM. In some cases, while calculating the Lie point RGS, the role of the embedding equations can be played by differential constraints (for details see [31]) that come from an invariance condition for the BEs with respect to the Lie–Bäcklund⁸ symmetry group.
• In the case when the BEs contain a small parameter $\alpha$, the desired RM can be obtained by simplification of these equations and use of "perturbation methods of group analysis" (see Vol. 3, Chapter 2, p. 31 in [34]). The main idea here is to consider a simplified ($\alpha = 0$) model, which admits a wider symmetry group (see examples in Section 4.2 below) in comparison with the case $\alpha \neq 0$. When we take the contributions from small $\alpha$ into account, this symmetry is inherited by the BEs, which results in some additional terms, corrections in powers of $\alpha$, in the RGS generator.

(II) The next step consists of calculating the most general symmetry group G that leaves the RM unaltered. The term "symmetry group", as used in classical group analysis, means
⁸ We use here the terminology adopted in the Russian literature [32,35]. This symmetry is also known as a generalized or higher-order symmetry [33,34].
the property of system (31) that it admits a local Lie group of point transformations in the space $\mathcal{A}$. The Lie algorithm for finding such symmetries consists of constructing tangent vector fields defined by the operator
$$X = \xi^i \partial_{x^i} + \eta^\alpha \partial_{u^\alpha}, \qquad \xi^i, \eta^\alpha \in \mathcal{A}, \tag{33}$$
where the coordinates $\xi^i, \eta^\alpha$ are functions of the group variables and have to be determined by a system of equations
$$X F_\sigma \big|_{(31)} = 0, \qquad \sigma = 1, \ldots, s, \tag{34}$$
that follow from the invariance of the RM. Here X is extended⁹ to all derivatives involved in $F_\sigma$, and the symbol $|_{(31)}$ means calculated on frame (31).

The linear homogeneous PDEs (34) for the coordinates $\xi^i, \eta^\alpha$ are known as determining equations and, as a rule, form an overdetermined system. The solution of Eqs. (34) defines a set of infinitesimal operators (33) (also known as group generators), which correspond to the admitted vector field and form a Lie algebra. In the case that the general element of this algebra
$$X = \sum_j A^j X_j, \tag{35}$$
where the $A^j$ are arbitrary constants, contains a finite number of operators, $1 \le j \le l$, the group is called finite dimensional (or simply finite) with the dimension l; otherwise, for unlimited j or in the case that the coordinates $\xi^i, \eta^\alpha$ depend upon arbitrary functions of the group variables, the group is called infinite.

The use of the infinitesimal criterion (34) for calculating the symmetry groups makes the whole procedure algorithmic, so that it can be carried out not only "by hand" but also with symbolic computer algebra packages (see, e.g., Vol. 3 in [34]). In modern group analysis, different modifications of the classical Lie scheme are in use (see [32–34] and references therein).

Generator (33) of the group G is equivalent to the canonical Lie–Bäcklund operator
$$Y = \kappa^\alpha \partial_{u^\alpha}, \qquad \kappa^\alpha \equiv \eta^\alpha - \xi^i u^\alpha_i, \tag{36}$$
that is known as a canonical representation of X and plays an essential role in RGS construction.

The group defined by generators (33) and (36) is, in general, wider than the desired RG, which usually appears as its subgroup. As the RGS is related to a particular BVP solution, it can be revealed by restricting the admitted group G to a manifold defined by this given solution.

(III) Hence, to obtain the RGS, the restriction of the group G on a particular BVP solution should be made, and this forms the third step. Mathematically, this procedure appears as checking the vanishing condition for the linear combination of coordinates $\kappa^\alpha_j$ of the canonical operator equivalent to (35) on a particular approximate (or exact) BVP solution $U^\alpha(z)$:
$$\Big\{ \sum_j A^j \kappa^\alpha_j \equiv \sum_j A^j \big(\eta^\alpha_j - \xi^i_j u^\alpha_i\big) \Big\}\Big|_{u^\alpha = U^\alpha(z)} = 0. \tag{37}$$
⁹ The extension of the generators to the derivatives employs the prolongation formulas and is a regular procedure in group analysis (see, e.g., [34]).
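As a small worked instance of (36) and (37), consider a single dilation generator acting on one dependent variable u of two independent variables t and x. The variable names and the initial profile f(x) are chosen here purely for illustration; the same operator reappears as $X_2$ in the example of Section 3.2.1.

$$X = 2t\,\partial_t + x\,\partial_x \;\Longrightarrow\; \xi^{t} = 2t,\quad \xi^{x} = x,\quad \eta = 0, \qquad \kappa = \eta - \xi^{t}u_t - \xi^{x}u_x = -2t\,u_t - x\,u_x .$$

Restricting this coordinate to a solution $u = U(t,x)$ with $U(0,x) = f(x)$ and evaluating at $t = 0$ turns the differential expression into a known function of x,

$$\kappa\big|_{u=U,\; t=0} = -x\, f_x(x),$$

which is the kind of algebraic relation that condition (37) imposes on the constants $A^j$ and on any arbitrary functions present in the generator.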
Evaluating (37) on a particular BVP solution, $U^\alpha(z)$, transforms the system of DEs for the group invariants into algebraic relations.¹⁰ Firstly, it gives relations between the $A^j$, thus "combining" different coordinates of the group generators $X_j$ admitted by the RM (31). Secondly, it eliminates (partially or entirely) the arbitrariness that may appear in the coordinates $\xi^i, \eta^\alpha$ in the case of an infinite group G. In terms of the "classic" QFT RG terminology, where there exists only one operator, X, of (7)-type (i.e., all $A^j$ except one are equal to zero), the procedure of group restriction on a particular BVP solution $\bar g_{\rm appr}$ eliminates the arbitrariness in the form of the $\beta(g)$-function.

While the general form of the condition given by Eq. (37) is the same for any BVP solution, the way the restriction procedure is realized in every particular case employs a particular perturbation approximation (PA) for the concrete BVP.

Generally, the restriction procedure reduces the dimension of G. It also "fits" the boundary conditions into operator (35) by a special choice of the coefficients $A^j$ and/or by choosing the particular form of the arbitrary functions in the coordinates $\xi^i, \eta^\alpha$. Hence, the general element (35) of the group G after the fulfillment of a restriction procedure is expressed as a linear combination of the new generators $R_j$ with the coordinates $\tilde\xi^i, \tilde\eta^\alpha$,
$$X \Rightarrow R = \sum_j B^j R_j, \qquad R_j = \tilde\xi^i_j \partial_{x^i} + \tilde\eta^\alpha_j \partial_{u^\alpha}, \tag{38}$$
where the $B^j$ are arbitrary constants. The set of RGS generators $R_j$, each containing the desired BVP solution in its invariant manifold, defines a group of transformations that we also refer to as a renormgroup. This symmetry group is wider than the one considered in QFT, as the set of generators $R_j$ generally forms a finite- or infinite-dimensional algebra. Moreover, $R_j$ may correspond to a Lie–Bäcklund symmetry. Therefore, here we extend the notion of renormgroup and RG symmetry, the direct analogy with the "Bogoliubov RG" being preserved only for a one-parameter group of point transformations.

(IV) The three steps prescribed above entirely define a regular algorithm for RGS construction but do not touch on how a BVP solution is found. Hence, one more important, fourth, step should be added. It consists of using the RGS generators to find analytical expressions for the new, "improved", solution of the BVP.

Mathematically, this step makes use of the RG = FS invariance conditions that are given by a combined system of (31) and the vanishing condition for the linear combination of coordinates $\tilde\kappa^\alpha_j$ of the canonical operator equivalent to (38),
$$\sum_j B^j \tilde\kappa^\alpha_j \equiv \sum_j B^j \big(\tilde\eta^\alpha_j - \tilde\xi^i_j u^\alpha_i\big) = 0. \tag{39}$$
One can see that conditions (39) are akin to (37). However, in contrast to the previous step, the differential variables u in (39) should not be replaced by an approximate expression for the BVP solution U(z), but should be treated as normal dependent variables.
¹⁰ Similar relations were discussed in [32, Chapter 8], when constructing invariant solutions for the Cauchy problem for a quasi-linear system of first-order PDEs.
For the one-parameter Lie point renormgroup, the RG invariance conditions lead to a first-order PDE that gives rise to the so-called group invariants (such as invariant couplings in QFT), which arise as solutions of the associated characteristic equations. A general solution of the BVP is now expressed in terms of these invariants. On the one hand, this is in direct analogy with the structure of RG invariant solutions in QFT (compare with Eqs. (22) and (24)). On the other hand, it reminds one of the so-called Π-theorem from the theory of dimensional analysis and similitude (see Section 19 in [32], Section 6 of Chapter 1 in [36] and the historical comment to Section 43 in [37]), directly related to power self-similarity, discussed above in Section 2.3.

However, as we shall see later, in the general case of arbitrary RGS the group invariance conditions obtained for a BVP are not necessarily characteristic equations for the Lie point group operator. They may appear in a more complicated form, e.g., as a combination of PDEs and higher-order ODEs (see Section 4.2). Nevertheless, the general idea of finding solutions to the BVP in terms of RG invariant solutions remains valid.

3.2. Examples of solution improvement

We now present a few examples of RGS construction with further use of the symmetry for "improving" an approximate solution.

3.2.1. Modified Burgers equation

As the first example, we take the initial value problem for the modified Burgers equation
$$u_t - a u_x^2 - \nu u_{xx} = 0, \qquad u(0, x) = f(x). \tag{40}$$
It is connected to the heat equation
$$\tilde u_t = \nu \tilde u_{xx} \tag{41}$$
by the transformation $\tilde u = \exp(au/\nu)$ and has an exact solution, which therefore allows us to check the validity of our approach. The RGS construction for (40) is an apt illustration of the general scheme shown in Fig. 1, which may be helpful in understanding other examples of the implementation of the general algorithm. We review here briefly the procedure and results of Ref. [38].

The RG-manifold RM (step (I)) is given by Eq. (40) with the parameters of non-linearity a and dissipation $\nu$ included in the list of independent variables. The Lie calculational algorithm applied to the RM gives, for the admitted group G (step (II)), nine independent terms in the general expression for the group generator
$$X = \sum_{i=1}^{8} A^i(a, \nu) X_i + \alpha(t, x, a, \nu)\, e^{-au/\nu}\, \partial_u, \tag{42}$$
$$X_1 = 4\nu t^2 \partial_t + 4\nu t x\, \partial_x - (\nu/a)(x^2 + 2\nu t)\, \partial_u, \qquad X_2 = 2t\, \partial_t + x\, \partial_x,$$
$$X_3 = (1/\nu)\, \partial_t, \qquad X_4 = 2\nu t\, \partial_x - (\nu/a)\, x\, \partial_u, \qquad X_5 = \partial_x, \qquad X_6 = -(\nu/a)\, \partial_u,$$
$$X_7 = a\, \partial_a + [(\nu/a) - u]\, \partial_u, \qquad X_8 = 2\nu\, \partial_\nu + x\, \partial_x + 2[u - (\nu/a)]\, \partial_u.$$
Here, $A^i(a, \nu)$ are arbitrary functions of their arguments and $\alpha(t, x, a, \nu)$ is an arbitrary function of four variables satisfying the heat equation (41). The set of operators $X_i$ forms an eight-dimensional Lie algebra, $L_8$. The first six generators are related to the well-known symmetries of the modified (potential) Burgers equation (see, e.g., Vol. 1, p. 183 in [34]). They describe projective transformations in the (t, x)-plane ($X_1$), dilatations in the same plane ($X_2$), translations along the t-, x- and u-axes ($X_3$, $X_5$ and $X_6$) and Galilean transformations ($X_4$). The last two generators, $X_7$ and $X_8$, relate to dilatations of the parameters a and $\nu$, now involved in the group transformations.

The procedure of restriction (step (III)) of the group (42) admitted by the RM (40) amounts to checking the invariance condition (37) on a particular BVP solution $u = U(t, x, a, \nu)$:
$$\Big\{ \alpha\, e^{-au/\nu} + \sum_{i=1}^{8} A^i(a, \nu)\, \kappa_i \Big\}\Big|_{u = U(t, x, a, \nu)} = 0, \qquad \kappa_i \equiv \eta_i - \xi^1_i u_t - \xi^2_i u_x - \xi^3_i u_a - \xi^4_i u_\nu. \tag{43}$$
This formula expresses the coordinate of the last term in (42) in terms of the remaining coordinates of the eight generators $X_i$ for arbitrary t, and hence for t = 0, when $U(0, x, a, \nu) = f(x)$. As a result, we obtain the "initial" value $\alpha(0, x, a, \nu)$ and then, using the standard representation for the solution to the linear parabolic equation (41), the value of $\alpha$ at arbitrary $t \neq 0$:
$$\alpha(t, x, a, \nu) = -\sum_{i=1}^{8} A^i(a, \nu)\, \langle \bar\kappa_i(x, a, \nu) \rangle. \tag{44}$$
Here, $\bar\kappa_i(x, a, \nu)$ denote the "partial" canonical coordinates $\kappa_i$ taken at t = 0 and u = f(x). The symbol $\langle F \rangle$ designates the convolution of a function F with the fundamental solution of (41), multiplied by the exponential function of f entering into the boundary condition:
$$\langle F(x, t, a, \nu) \rangle \equiv \frac{1}{\sqrt{4\pi\nu t}} \int_{-\infty}^{\infty} dy\, F(y, t, a, \nu) \exp\left( -\frac{(x - y)^2}{4\nu t} + \frac{a f(y)}{\nu} \right).$$
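Substitution of (44) in the general expression (42) gives the desired RG generators
$$R_i = X_i + \varrho_i\, e^{-au/\nu}\, \partial_u,$$
$$\varrho_1 = \frac{\nu}{a} \langle x^2 \rangle, \quad \varrho_2 = \langle x f_x \rangle, \quad \varrho_3 = \frac{1}{\nu} \langle a f_x^2 + \nu f_{xx} \rangle, \quad \varrho_4 = \frac{\nu}{a} \langle x \rangle,$$
$$\varrho_5 = \langle f_x \rangle, \quad \varrho_6 = \frac{\nu}{a} \langle 1 \rangle, \quad \varrho_7 = \Big\langle f - \frac{\nu}{a} \Big\rangle, \quad \varrho_8 = \Big\langle x f_x - 2f + 2\frac{\nu}{a} \Big\rangle.$$
The operators $R_i$ form an eight-dimensional RG algebra $RL_8$ that has the same tensor of structural constants as $L_8$, i.e., $RL_8$ and $L_8$ are isomorphic. Hence, the group restriction procedure eliminates the arbitrariness presented by the function $\alpha$ and "fits" the boundary conditions into the RG generators by means of the $\varrho_i$.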
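The link between (40) and (41) through $\tilde u = \exp(au/\nu)$, together with the weighted convolution $\langle\cdot\rangle$ defined above, can be checked numerically. The sketch below is a minimal illustration under stated assumptions: the parameter values, the helper names and the particular initial profile $f(y) = -(\nu/a)y^2$ (chosen so that a closed form is available for comparison) are introduced here and do not come from Ref. [38]. It evaluates $(\nu/a)\ln\langle 1\rangle$ by the trapezoid rule and tests the residual of (40) by central differences.

```python
import math

NU, A = 0.5, 2.0  # illustrative dissipation and non-linearity parameters

def f(y):
    """Initial profile chosen so that exp(a*f/nu) = exp(-y**2) (test assumption)."""
    return -(NU / A) * y * y

def mean(F, t, x, half_width=12.0, steps=4000):
    """<F>: heat-kernel convolution weighted by exp(a*f(y)/nu), trapezoid rule."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        y = -half_width + k * h
        wgt = 0.5 if k in (0, steps) else 1.0
        total += wgt * F(y) * math.exp(-(x - y) ** 2 / (4.0 * NU * t) + A * f(y) / NU)
    return total * h / math.sqrt(4.0 * math.pi * NU * t)

def u(t, x):
    """Candidate solution u = (nu/a) * ln <1>."""
    return (NU / A) * math.log(mean(lambda y: 1.0, t, x))

def u_closed(t, x):
    """Closed form of the same quantity for the Gaussian test profile."""
    s = 1.0 + 4.0 * NU * t
    return (NU / A) * (-x * x / s - 0.5 * math.log(s))

def residual(t, x, d=1e-3):
    """Central-difference residual of u_t - a*u_x**2 - nu*u_xx."""
    u_t = (u(t + d, x) - u(t - d, x)) / (2 * d)
    u_x = (u(t, x + d) - u(t, x - d)) / (2 * d)
    u_xx = (u(t, x + d) - 2 * u(t, x) + u(t, x - d)) / (d * d)
    return u_t - A * u_x ** 2 - NU * u_xx

if __name__ == "__main__":
    print(u(0.3, 0.4), u_closed(0.3, 0.4), residual(0.3, 0.4))
```

The quadrature window `half_width` and the step `d` are tuning choices for this test profile only; a slowly decaying f would need a wider window.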
It can be verified that the exact solution of the initial-value problem (40),
$$u(t, x, a, \nu) = \frac{\nu}{a} \ln \langle 1 \rangle \equiv \frac{\nu}{a} \ln \left\{ \frac{1}{\sqrt{4\pi\nu t}} \int_{-\infty}^{\infty} dy\, \exp\left( -\frac{(x - y)^2}{4\nu t} + \frac{a f(y)}{\nu} \right) \right\}, \tag{45}$$
is the invariant manifold for any of the above RGS operators. Conversely, (45) can be reconstructed from an approximate solution with the help of any of the RGS operators or their linear combination. For example, two such operators, $\nu R_3 \equiv R_t$ and $(1/a)(R_6 + R_7) \equiv R_a$, were used in [38] to reconstruct the exact solution from perturbative (in time and in the non-linearity parameter a) solutions. Below, we describe this procedure (step (IV)) using the operator $R_a$,
$$R_a = \partial_a + \frac{1}{a}\Big( -u + e^{-au/\nu} \langle f(x) \rangle \Big)\, \partial_u. \tag{46}$$
It is evident that t, x and $\nu$ are invariants of the group transformations with (46), whilst the finite RG transformations of the two remaining variables, a and u, are obtained by solving the Lie equations for (46), with $\ell$ the group parameter:
$$\frac{da'}{d\ell} = 1, \quad a'|_{\ell=0} = a; \qquad \frac{du'}{d\ell} = \alpha(t, x, a', \nu)\, e^{-a'u'/\nu} - \frac{u'}{a'}, \quad u'|_{\ell=0} = u. \tag{47}$$
Combining these equations yields one more invariant, $J = e^{au/\nu} - \langle 1 \rangle$, for the RGS generator (46). The solution of (47) along with (44) gives the formulae for the finite RG transformations of the group variables $\{t, x, a, \nu, u\}$:
$$t' = t, \quad x' = x, \quad \nu' = \nu, \quad a' = a + \ell, \quad u' = \frac{\nu}{a + \ell} \ln\Big( e^{au/\nu} + \big\langle e^{\ell f(x)/\nu} - 1 \big\rangle \Big). \tag{48}$$
Choosing the value a equal to zero, which is the starting point of the PA in a, we get $a' = \ell$. Then, after excluding t, x, $\nu$ and $\ell$ from the expression for $u'$ in (48) and omitting the primes over $t', x', \nu', u'$ and $a'$, the desired BVP solution (45) is obtained. It also follows directly from J in view of the initial condition $J|_{a=0} = 0$.

A similar procedure can be followed for the other RG operator,
$$R_t = \partial_t + e^{-au/\nu} \langle a f_x^2 + \nu f_{xx} \rangle\, \partial_u,$$
which is consistent with the PA in time t. Although the invariants for $R_t$ and the finite RG transformations differ from those for (46), the final result, i.e., the exact solution of BVP (40) given by (45), is the same. This clearly demonstrates the ability of the multi-dimensional RGS to reconstruct the unique BVP solution from different PAs: either in the parameter a or in t (though we used only two one-dimensional subalgebras here¹¹).

3.2.2. BVPs for ODEs: a simple example

Quite recently, the QFT RG ideology has been applied, in a rather straightforward fashion, in mathematical physics to the asymptotic analysis of solutions to DEs [41,42] and in constructing an envelope for a family of solutions [43]. Our second methodological example, with a linear ODE, is presented here in order to illustrate the difference between our approach and the "perturbative RG theory" devised in [41] for a global analysis of BVP solutions in mathematical physics.
¹¹ This can be considered as a construction parallel to the one used in Ref. [39].
Consider a linear second-order ODE for y(t) with the initial conditions at $t = \tau$,
$$y_{tt} + y_t + \varepsilon y = 0, \qquad y(\tau) = \tilde u, \quad y_t(\tau) = \tilde w, \tag{49}$$
which has the exact solution
$$y = C_+ e^{-\lambda_+ (t - \tau)} + C_- e^{-\lambda_- (t - \tau)}, \qquad \lambda_\pm = \frac{1 \pm K}{2}, \quad K = \sqrt{1 - 4\varepsilon}, \quad C_\pm = \mp \frac{\tilde w + \lambda_\mp \tilde u}{K}. \tag{50}$$
Provided that the parameter $\varepsilon$ is small, the solutions to Eq. (49) have been treated in [41] with the goal of demonstrating the effectiveness of "perturbative RG theory" in the asymptotic analysis of a solution's behavior. The main goal of this treatment was to improve a perturbative expansion in powers of $\varepsilon$ with secular terms $\propto \varepsilon(t - \tau)$ and obtain¹² a uniformly valid asymptotic of a solution
$$y = c_+ e^{(-1 + \varepsilon(1 + \varepsilon))(t - \tau)} + c_- e^{-\varepsilon(1 + \varepsilon)(t - \tau)} + O(\varepsilon^2), \tag{51}$$
$$c_+ \approx -\big((1 + 2\varepsilon)\tilde w + \varepsilon \tilde u\big), \qquad c_- \approx (1 + 2\varepsilon)\tilde w + (1 + \varepsilon)\tilde u,$$
which is accurate for small values $\varepsilon \ll 1$ but for arbitrary values of the product $\varepsilon(t - \tau)$.

We are going to show that the use of our regular RG algorithm enables one to improve a PA solution (either in powers of $\varepsilon$ or in $t - \tau$) up to the exact BVP solution (50). Rewriting (49) in the form of a system of two first-order ODEs for the functions $u \equiv y$ and $w \equiv y_t$,
$$u_t = w, \qquad w_t = -w - \varepsilon u, \tag{52}$$
we construct the RM (step (I)) using the invariant embedding method (this approach was first realized in [44]). Then, the RM is presented as a joint system of BEs (52) and embedding equations
$$u_\tau - (\tilde w + \varepsilon \tilde u)\, u_{\tilde w} + \tilde w\, u_{\tilde u} = 0, \qquad w_\tau - (\tilde w + \varepsilon \tilde u)\, w_{\tilde w} + \tilde w\, w_{\tilde u} = 0,$$
treated in the extended space of group variables, which include the parameters $\tau, \tilde w, \tilde u$ associated with the boundary conditions in addition to t and the dependent variables u, w. Omitting tedious calculations related to the following two steps (steps (II) and (III)), we present here two examples of the resulting RGS generators,
$$R_\tau = \partial_\tau - (\tilde w + \varepsilon \tilde u)\, \partial_{\tilde w} + \tilde w\, \partial_{\tilde u},$$
$$R_\varepsilon = \partial_\varepsilon - \left[ \frac{1}{K^2}(2w + u)\Big(1 - \frac{t}{2}\Big) + \frac{t}{2} u \right] \partial_w - \frac{t}{K^2}(2w + u)\, \partial_u - \left[ \frac{1}{K^2}(2\tilde w + \tilde u)\Big(1 - \frac{\tau}{2}\Big) + \frac{\tau}{2} \tilde u \right] \partial_{\tilde w} - \frac{\tau}{K^2}(2\tilde w + \tilde u)\, \partial_{\tilde u}, \tag{53}$$
that involve the initial values $\tilde w$, $\tilde u$ and the initial point $\tau$ in the RG transformations. In addition, $R_\varepsilon$ transforms the parameter $\varepsilon$.
¹² The algorithm used in [41] for improving PA solutions with secular terms involves: (a) an introduction of some additional parameters in the solutions, (b) a special choice of these parameters that eliminates secular divergences, and (c) imposing an independence condition on the solution with respect to the way these parameters are introduced. In some cases, this algorithm, directly borrowed from the QFT RG-method, gives an exact solution. However, the question of correspondence of this construction to a transformation group of a solution of the BEs remains open.
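The statements around (49) to (53) lend themselves to a direct numerical check. The sketch below is illustrative only and rests on assumptions stated here: the function names are hypothetical, $\varepsilon < 1/4$ is assumed so that K is real, and the coordinates of $R_\varepsilon$ are taken in the form $\delta u = -(t/K^2)(2w+u)$, $\delta w = -[(1/K^2)(2w+u)(1-t/2) + (t/2)u]$ per unit change of $\varepsilon$, which is a reading of the typeset formula (53) rather than a quotation. It tests (i) that (50) meets the initial data, (ii) that shifting the initial point along the trajectory leaves the solution unchanged (the finite action of $R_\tau$), (iii) that an infinitesimal $R_\varepsilon$ step maps the solution for $\varepsilon$ onto the solution for $\varepsilon + h$ up to $O(h^2)$, and (iv) that (51) stays close to (50) for large $\varepsilon(t-\tau)$.

```python
import math

def exact(t, tau, u0, w0, eps):
    """Exact solution (50) of y'' + y' + eps*y = 0; returns (y, y_t). Needs eps < 1/4."""
    K = math.sqrt(1.0 - 4.0 * eps)
    lam_p, lam_m = (1.0 + K) / 2.0, (1.0 - K) / 2.0
    c_p = -(w0 + lam_m * u0) / K
    c_m = (w0 + lam_p * u0) / K
    e_p = math.exp(-lam_p * (t - tau))
    e_m = math.exp(-lam_m * (t - tau))
    return c_p * e_p + c_m * e_m, -lam_p * c_p * e_p - lam_m * c_m * e_m

def approx(t, tau, u0, w0, eps):
    """Uniform approximation (51), dropping O(eps**2) terms."""
    s = t - tau
    c_p = -((1.0 + 2.0 * eps) * w0 + eps * u0)
    c_m = (1.0 + 2.0 * eps) * w0 + (1.0 + eps) * u0
    rate = eps * (1.0 + eps)
    return c_p * math.exp((-1.0 + rate) * s) + c_m * math.exp(-rate * s)

def r_eps(time, u, w, eps):
    """(du, dw) per unit d(eps) for R_eps; the same form is used at t and at tau."""
    K2 = 1.0 - 4.0 * eps
    p = 2.0 * w + u
    return -(time / K2) * p, -((p / K2) * (1.0 - time / 2.0) + (time / 2.0) * u)

def r_eps_defect(t, tau, u0, w0, eps, h=1e-6):
    """Mismatch after one infinitesimal R_eps step; should be O(h**2)."""
    du0, dw0 = r_eps(tau, u0, w0, eps)
    y, yt = exact(t, tau, u0, w0, eps)
    du, dw = r_eps(t, y, yt, eps)
    y2, yt2 = exact(t, tau, u0 + h * du0, w0 + h * dw0, eps + h)
    return y2 - (y + h * du), yt2 - (yt + h * dw)

if __name__ == "__main__":
    print(exact(0.3, 0.3, 1.0, -0.5, 0.1))
    print(r_eps_defect(2.0, 0.3, 1.0, -0.5, 0.1))
    print(exact(50.0, 0.0, 1.0, -0.5, 0.01)[0], approx(50.0, 0.0, 1.0, -0.5, 0.01))
```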
Now, the procedure for constructing the BVP solution (50) (step (IV)) is similar to that used in the previous Section 3.2.1 and employs finite transformations that are defined by the Lie equations for operators (53). For $R_\tau$, the functions u, w and the parameter $\varepsilon$ are group invariants, while the translations of $\tau$ and the corresponding transformations of $\tilde u, \tilde w$ restore the exact solution (50) from the PA in powers of $t - \tau$ (note that the parameter $\varepsilon$ is not necessarily small in this PA!). For $R_\varepsilon$ the difference $t - \tau$ is a group invariant, whilst the transformation of $\varepsilon$ and the related transformations of $u, w, \tilde u, \tilde w$ restore the exact solution (50) from the PA (discussed in [41]) in powers of $\varepsilon$. Hence, as in the previous Section 3.2.1, both the RGS generators (53) reconstruct the unique BVP solution, but from different PAs.

4. The RG in non-linear optics

4.1. Formulation of a problem

As a problem of real physical interest, take the BVP that describes self-focusing of a high-power light beam. While the problem has played an important role in non-linear electrodynamics since the 1960s, a detailed quantitative understanding of self-focusing is still missing [45], and there is no method which allows one to find an analytic solution to the corresponding equations with arbitrary boundary conditions. Here, we demonstrate the potential of the RGS approach in constructing analytic solutions to BVP equations with arbitrary boundary conditions.

The RGS method allows one to consider different types of BEs for self-focusing processes, which include plane and cylindrical beam geometry, non-linear refraction and diffraction. The merit of the RGS method is that it describes BVP solutions with one- or two-dimensional singularities in the entire range of variables from the boundary up to the singularity point.

Let us start with the BVP for a system of two DEs
$$v_z + v v_x - \alpha n_x = 0, \qquad n_z + n v_x + v n_x + (\nu - 1)(nv/x) = 0, \tag{54}$$
$$v(0, x) = 0, \qquad n(0, x) = N(x), \tag{55}$$
which are used in the non-linear optics of self-focusing wave beams when diffraction is negligible. We study the spatial evolution of the derivative of the beam eikonal v and the beam intensity n in the direction into the medium z and in the transverse direction x. The term proportional to $\alpha$ is related to non-linear refraction effects; $\nu = 1$ and 2 refer to the plane and cylindrical beam geometry, respectively. Boundary conditions (55) correspond to the plane front of the beam and an arbitrary transverse intensity distribution.

4.2. Plane geometry

In the plane beam geometry (at $\nu = 1$) Eqs. (54) can be reduced to the system of BEs
$$\tau_w - n \chi_n = 0, \qquad \chi_w + \alpha \tau_n = 0 \tag{56}$$
for the functions $\tau = nz$ and $\chi = x - vz$ of the arguments $w = v/\alpha$ and n, with boundary conditions
$$\tau(0, n) = 0, \qquad \chi(0, n) = H(n), \tag{57}$$
where H(n) is the inverse to N(x). Here, the procedure of RGS construction makes use of the Lie–Bäcklund symmetry and is described as follows [31].

The manifold RM (step (I)) is defined by Eqs. (56) treated in the extended space that includes the dependent and independent variables $\tau, \chi, w, n$ and derivatives of $\tau$ and $\chi$ with respect to n of arbitrarily high order.

The admitted symmetry group G (step (II)) is represented by the canonical Lie–Bäcklund operator
$$X = f\, \partial_\tau + g\, \partial_\chi \tag{58}$$
with the coordinates f and g that are linear combinations of $\tau$ and $\chi$ and their derivatives $\partial^i \tau/\partial n^i$ and $\partial^i \chi/\partial n^i$, $i \ge 1$, with coefficients depending on w and n.

The restriction of the group admitted by RM (56) (step (III)) implies the check of the invariance condition (37), which yields two relations
$$f = 0, \qquad g = 0. \tag{59}$$
These relations should be valid on a particular solution of the BVP with the boundary data (57). For example, choosing the so-called "soliton" profile, $N(x) = \cosh^{-2}(x)$, i.e., $H(n) = \mathrm{Arccosh}(1/\sqrt{n})$, we have
$$f = 2n(1 - n)\tau_{nn} - n\tau_n - 2nw(\chi_n + n\chi_{nn}) + (\alpha w^2/2)\, n\tau_{nn},$$
$$g = 2n(1 - n)\chi_{nn} + (2 - 3n)\chi_n + w(2n\tau_{nn} + \tau_n) + (\alpha w^2/2)(n\chi_{nn} + \chi_n). \tag{60}$$
The dependence on $\tau_{nn}$ and $\chi_{nn}$ indicates that here the RGS is a second-order Lie–Bäcklund symmetry.

In order to find a particular solution of a BVP (step (IV)), one should solve the joint system of BEs (56) and the second-order ODEs that follow from the RG = FS invariance conditions (59) and (60). The resulting expressions [46], the well-known Khokhlov solutions¹³
$$v = -2\alpha n z \tanh(x - vz), \qquad \alpha n^2 z^2 = n \cosh^2(x - vz) - 1, \tag{61}$$
describe the process of self-focusing of a soliton beam: the sharpening of the beam intensity profile with increasing z is accompanied by intensity growth on the beam axis. Solution (61) is valid up to the singularity point, where the derivatives $v_x$ and $n_x$ tend to infinity whilst the beam intensity n remains finite:
$$z^{\rm sol}_{\rm sing} = 1/(2\sqrt{\alpha}), \qquad n^{\rm sol}_{\rm sing} = 2. \tag{62}$$
Here, the Lie–Bäcklund RGS enables one to reconstruct the BVP solution and describe the solution singularity for the light beam with the soliton initial intensity profile. One more example of an exact BVP solution obtained with the help of the Lie–Bäcklund RGS (with the initial beam profile in the form of a "smoothed" step) can be found in [46].

For arbitrary boundary data, it turns out to be impossible to fulfill condition (59) with the help of Lie–Bäcklund symmetries of any finite order, and one is forced to use a different algorithm [31,46] of RGS construction, based on approximate group methods. Here, (step (I)) RM
¹³ In Ref. [47], where this solution was first obtained, it did not result from a regular procedure.
is given by BEs (56) with a small parameter $\alpha$, and the coordinates of the group generator (58) (and, hence, the coordinates of the RGS operators) appear as infinite series in powers of $\alpha$:
$$f = \sum_{i=0}^{\infty} \alpha^i f^i, \qquad g = \sum_{i=0}^{\infty} \alpha^i g^i. \tag{63}$$
The procedure of finding the coefficients $f^i, g^i$ (step (II)) leads to a system of recurrence relations that express the higher-order coefficients $f^{i+1}, g^{i+1}$ in terms of the previous ones $f^i, g^i$. This means that once the zero-order terms are specified, the other terms are reconstructed by the recurrence relations. The coefficients $f^i$ and $g^i$ contain an arbitrary function of n, $\chi_{[s]}$ and $\tau_{[s]} - w(s\chi_{[s]} + n\chi_{[s+1]})$, where the subscript [s] denotes the partial derivative of order s with respect to n. This arbitrariness is eliminated by the procedure of group restriction (step (III)), i.e., by imposing the invariance condition (59).

For particular forms of $f^0$ and $g^0$, that is for particular boundary conditions (57), the infinite series are truncated automatically, and we arrive at the exact RGS. One example of this kind is given by Eqs. (60), which have a binomial structure $f = f^0 + \alpha f^1$, $g = g^0 + \alpha g^1$.

If we neglect the higher-order terms in the case of arbitrary boundary conditions (when series (63) are not truncated automatically), then we get an approximate RGS which produces an approximate solution to the BVP. As an example, we give here two sets of expressions for the coordinates $f^i$ and $g^i$ for the Gaussian initial profile with $N(x) = \exp(-x^2)$, i.e., $H(n) = (\ln(1/n))^{1/2}$, which define approximate RGS:
(a) $f^0 = 1 + 2n\chi\chi_n$; $g^0 = 0$; $f^1 = -2\tau\tau_n + \tau^2/n$; $g^1 = -2(\tau\chi_n + \chi\tau_n)$,
(b) $f^0 = 2n(\tau\chi_n + \tau_n\chi)$; $g^0 = 1 + 2n\chi\chi_n$; $f^1 = 2\chi\tau_\alpha$; $g^1 = 2(\chi\chi_\alpha - \tau\tau_n)$.
Here, the linear dependence of f and g upon first-order derivatives indicates that the RGS is equivalent to a Lie point symmetry. The peculiarity of case (b) is a dependence of f and g not only on derivatives with respect to n but also on derivatives with respect to $\alpha$: this means that the parameter $\alpha$ is also involved in the group transformations. In the non-canonical representation (33), the RGS generator in this case has the form
$$R_{\rm Gauss1} = 2\tau\, \partial_w + 2n\chi\, \partial_n + 2\alpha\chi\, \partial_\alpha - \partial_\chi. \tag{64}$$
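The passage between the point generator (64) and the canonical coordinates of case (b) can be written out explicitly. The following short derivation is a sketch under the reading adopted here, namely that the independent variables are $(w, n, \alpha)$, the dependent variables are $(\tau, \chi)$, and the generator is $R = 2\tau\,\partial_w + 2n\chi\,\partial_n + 2\alpha\chi\,\partial_\alpha - \partial_\chi$. Applying (36) gives

$$\kappa^{\tau} = -2\tau\,\tau_w - 2n\chi\,\tau_n - 2\alpha\chi\,\tau_\alpha, \qquad \kappa^{\chi} = -1 - 2\tau\,\chi_w - 2n\chi\,\chi_n - 2\alpha\chi\,\chi_\alpha .$$

On the frame of the BEs (56), $\tau_w = n\chi_n$ and $\chi_w = -\alpha\tau_n$, so that

$$\kappa^{\tau} = -\big[\,2n(\tau\chi_n + \chi\tau_n) + \alpha\, 2\chi\tau_\alpha\big] = -(f^0 + \alpha f^1), \qquad \kappa^{\chi} = -\big[\,1 + 2n\chi\chi_n + \alpha\, 2(\chi\chi_\alpha - \tau\tau_n)\big] = -(g^0 + \alpha g^1).$$

The overall sign is immaterial for the invariance conditions (59). One can also check directly that $R(w + 2\tau\chi) = 0$ and $R\big(\chi^2 + \ln n + \ln(1 - \alpha\tau^2/n)\big) = 0$, so these two combinations are invariants of the generator.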
The last step (IV) is performed in the usual way by solving the joint system of BEs (56) and the equations that follow from the RG = FS invariance condition (59), or else by using invariants of the associated characteristic equations for the RG operator, provided that the RGS is a Lie point symmetry. We give here the solution that follows from RGS (64),
$$x^2 = (1 - 2\alpha n z^2)^2 \ln \frac{1}{n(1 - \alpha n z^2)}, \qquad v = -\frac{2\alpha x n z}{1 - 2\alpha n z^2}. \tag{65}$$
These expressions describe a self-focusing Gaussian beam (the plot n(x) for this solution is presented at the end of the section in Fig. 2) that is qualitatively very similar to the spatial evolution of the soliton beam (61). Moreover, the singularity position and the value of the maximum beam intensity at this point coincide with the analogous values (62) for the soliton beam. Although formulae (65) correspond to an approximate BVP solution, they exactly describe the behavior of n on the beam axis at x = 0. To estimate the reliability of result (65) in the off-axis region, we compared it with another approximate BVP solution which arises from the approximate RGS
Fig. 2. Intensity n versus transverse coordinate x for a plane (left panel) and cylindrical (right panel) beam geometry for a few values of the distance z from the boundary, $z_2 > z_1 > z_0 = 0$.
in case (a). These approximations agree very well (details are presented in [46]), thus proving the accuracy 14 of the RG approach. 4.3. Cylindrical geometry In the above discussion we dealt with the plane beam geometry and took into account only e6ects of non-linear beam refraction, neglecting di6raction. The Rexibility of RGS algorithm allows one to apply it in a similar way to a more complicated model as compared to (56), e.g., for the cylindrical beam geometry, ! = 2. Omitting technical details, we present the RGS generator for the cylindrical parabolic beam with N = 1 − x2 Rpar = (1 − 2 z 2 )9z − 2 zx9x − 2 (x − vz)9v + 4 nz 9n : The BVP solution is expressed in terms of group invariants for this generator: J1 =
x2 ; %
J2 = n%;
J3 = 2 x2 − v2 % +
xv %z ; 2
% = (1 − 2 z 2 ) :
(66)
The explicit form of dependencies of J2 = 1 − J12 ; J3 = 2 J1 upon J1 follows from the boundary conditions (55). They lead to the well-known solution [47] v = (x=(2%))%z ;
n = (1=%)(1 − (x2 =%)) par zsing
√
(67)
= 1= 2 where % = 0 that describes the convergence of the beam to the singularity point and n → ∞. The solution singularity is two-dimensional here: the in?nite growth of beam par intensity in the vicinity of the singularity z → zsing is accompanied by the in?nite growth of the derivative vx and collapsing of the beam size in the transverse direction. The RGS algorithm based on approximate group methods can also be applied in the case when besides non-linear refraction also di6raction e6ects are taken into account. Then, the ?rst equation in (54) should be modi?ed by adding the di6raction term √ √ −9x {(x1−! = n)9x (x!−1 9x n)} : 14
Further evidence is provided by the comparison of the approximate and exact BVP solutions for the soliton beam performed in [46].
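The invariance of the combinations in (66) under the generator of the cylindrical parabolic beam can be checked numerically. The following is a minimal sketch, not the authors' procedure: it assumes that the nonlinearity parameter (written `alpha` here) enters the generator as $(1 - 2\alpha z^2)\partial_z - 2\alpha zx\partial_x - 2\alpha(x - vz)\partial_v + 4\alpha nz\partial_n$, that $\rho = 1 - 2\alpha z^2$, and it replaces partial derivatives by central differences. All function names are illustrative.

```python
# Sketch under stated assumptions (not the authors' code): check that
# J1, J2, J3 are annihilated by the generator R_par and that the
# parabolic solution satisfies J2 = 1 - J1 and J3 = 2*alpha*J1.
# Assumed notation: alpha = nonlinearity parameter, rho = 1 - 2*alpha*z**2.

def rho(z, alpha):
    return 1.0 - 2.0 * alpha * z * z

def rho_z(z, alpha):
    # derivative of rho with respect to z
    return -4.0 * alpha * z

def J1(z, x, v, n, alpha):
    return x * x / rho(z, alpha)

def J2(z, x, v, n, alpha):
    return n * rho(z, alpha)

def J3(z, x, v, n, alpha):
    return 2.0 * alpha * x * x - v * v * rho(z, alpha) + x * v * rho_z(z, alpha)

def apply_R_par(F, z, x, v, n, alpha, h=1e-5):
    """Apply the first-order operator R_par to F using central differences."""
    dF_dz = (F(z + h, x, v, n, alpha) - F(z - h, x, v, n, alpha)) / (2.0 * h)
    dF_dx = (F(z, x + h, v, n, alpha) - F(z, x - h, v, n, alpha)) / (2.0 * h)
    dF_dv = (F(z, x, v + h, n, alpha) - F(z, x, v - h, n, alpha)) / (2.0 * h)
    dF_dn = (F(z, x, v, n + h, alpha) - F(z, x, v, n - h, alpha)) / (2.0 * h)
    return (rho(z, alpha) * dF_dz
            - 2.0 * alpha * z * x * dF_dx
            - 2.0 * alpha * (x - v * z) * dF_dv
            + 4.0 * alpha * n * z * dF_dn)

def parabolic_solution(z, x, alpha):
    """Return (v, n) with v = x*rho_z/(2*rho) and n = (1 - x**2/rho)/rho."""
    r = rho(z, alpha)
    return x * rho_z(z, alpha) / (2.0 * r), (1.0 - x * x / r) / r

if __name__ == "__main__":
    point = dict(z=0.3, x=0.7, v=-0.2, n=0.9, alpha=0.5)
    for name, J in (("J1", J1), ("J2", J2), ("J3", J3)):
        print(name, "residual of R_par:", apply_R_par(J, **point))
    v, n = parabolic_solution(0.3, 0.4, 0.5)
    print("J2 - (1 - J1):", J2(0.3, 0.4, v, n, 0.5) - (1.0 - J1(0.3, 0.4, v, n, 0.5)))
    print("J3 - 2*alpha*J1:", J3(0.3, 0.4, v, n, 0.5) - 2.0 * 0.5 * J1(0.3, 0.4, v, n, 0.5))
```

The residuals are zero up to finite-difference and rounding error, which is what one expects if the three combinations are indeed invariants and the parabolic solution lies on the manifold $J_2 = 1 - J_1$, $J_3 = 2\alpha J_1$.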
Standard calculations performed in accordance with the general scheme for the RM modified in this way (for details see [48]) give the RGS generator for the cylindrical beam geometry ($\nu = 2$),
$$R^{Gauss2} = (1 + z^2 S_{\chi\chi})\,\partial_z + (z S_\chi + v z^2 S_{\chi\chi})\,\partial_x + S_\chi\,\partial_v - \left[ nz\left(1 + \frac{vz}{x}\right) S_{\chi\chi} + \frac{nz}{x}\,S_\chi \right]\partial_n . \qquad (68)$$
Here the function S, defined by the form of the intensity boundary distribution,
$$S(\chi) = \alpha N(\chi) + \frac{\beta}{\chi\sqrt{N(\chi)}}\,\partial_\chi\left(\chi\,\partial_\chi\sqrt{N(\chi)}\right),$$
contains two small parameters, $\alpha$ and $\beta$, and, as in the case $\beta = 0$, there exist specific forms of the boundary distribution N for which the RGS operator (68) defines an exact (not approximate) symmetry valid for arbitrary values of $\alpha$ and $\beta$. Constructing a particular BVP solution (step (IV)) implies the use of group invariants related to (68), and the procedure is similar to that for the parabolic beam. For the Gaussian wave beam, $N = \exp(-x^2)$, the result is as follows:
$$v(z,x) = \frac{x - \chi}{z}, \qquad n(z,x) = e^{-\mu^2}\,\frac{\chi}{x}\,\frac{\beta - \alpha e^{-\chi^2}}{\beta - \alpha e^{-\mu^2}}. \qquad (69)$$
Here $\chi$ and $\mu$ are expressed in terms of z and x by the implicit relations
$$\beta(\mu^2 - \chi^2) + \alpha\left(e^{-\mu^2} - e^{-\chi^2}\right) = 2z^2\chi^2\left(\beta - \alpha e^{-\chi^2}\right)^2, \qquad x = \chi\left(1 + 2z^2\left(\beta - \alpha e^{-\chi^2}\right)\right).$$
Solution (69) describes the self-focusing of the cylindrical Gaussian beam that gives rise to the two-dimensional singularity: both the beam intensity n and the derivatives $v_x$, $n_x$ go to infinity at the point $z^{Gauss}_{sing} = 1/\sqrt{2(\alpha - \beta)}$, provided that $\alpha > \beta$. A detailed analysis of (69) and of more general solutions with a parabolic form of the eikonal at z = 0, $v(0,x) = -x/T$, is given in [48,45].

To illustrate the difference between the one- and two-dimensional solution singularities, in Fig. 2 we present the typical behavior of the wave beam intensity defined by Eqs. (65) and (69). The left panel corresponds to the plane beam geometry, $\nu = 1$, without diffraction, $\beta = 0$, while the right one refers to a cylindrical wave beam, $\nu = 2$, with both non-linearity and diffraction effects included. The different curves describe the distribution of the beam intensity over the coordinate x at different distances from the medium boundary, where we have the collimated Gaussian beam, $N = \exp(-x^2)$. It is clear that in the plane geometry the derivative of the beam intensity with respect to x becomes infinite at some singular point, while the value of the intensity on the axis remains finite. In the cylindrical case the solution singularity is two-dimensional: both the beam intensity and its derivative with respect to the transverse coordinate become infinite simultaneously at the point $z_{sing}$. This last example demonstrates that the RGS approach is able to analyze a two-dimensional singularity. In the practice of RG application to critical phenomena, this correlates with the case of “two renormalization groups” [39].
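The contrast between a finite and an infinite on-axis intensity can be illustrated with the two closed-form cases available in this section. The sketch below is hedged in two ways. First, it assumes that on the axis x = 0 the plane Gaussian solution (65) reduces to $n(1 - \alpha n z^2) = 1$, with $\alpha$ the nonlinearity parameter, and it takes the branch with n = 1 at z = 0. Second, for the cylindrical case it uses the parabolic solution (67), $n(0,z) = 1/(1 - 2\alpha z^2)$, rather than the Gaussian solution (69) plotted in Fig. 2. Function names are illustrative.

```python
import math

# Sketch under stated assumptions (not taken from the paper's code):
# alpha is the nonlinearity parameter; the plane case uses the on-axis
# reduction n*(1 - alpha*n*z**2) = 1 assumed for solution (65); the
# cylindrical case uses the parabolic solution (67) on the axis.

def n_axis_plane_gaussian(z, alpha):
    """On-axis intensity, branch with n(0) = 1; finite at the singular point."""
    disc = 1.0 - 4.0 * alpha * z * z
    if disc < 0.0:
        raise ValueError("z lies beyond the singular point 1/(2*sqrt(alpha))")
    # Equivalent to (1 - sqrt(disc)) / (2*alpha*z**2), but regular at z = 0.
    return 2.0 / (1.0 + math.sqrt(disc))

def n_axis_cyl_parabolic(z, alpha):
    """On-axis intensity 1/rho with rho = 1 - 2*alpha*z**2; diverges as rho -> 0."""
    r = 1.0 - 2.0 * alpha * z * z
    if r <= 0.0:
        raise ValueError("z lies at or beyond the singular point 1/sqrt(2*alpha)")
    return 1.0 / r

if __name__ == "__main__":
    alpha = 0.25  # illustrative value; singular points are then z = 1 and z = sqrt(2)
    for z in (0.0, 0.5, 0.9, 1.0):
        print("plane, z =", z, "n_axis =", n_axis_plane_gaussian(z, alpha))
    for z in (0.0, 1.0, 1.3, 1.4):
        print("cylindrical, z =", z, "n_axis =", n_axis_cyl_parabolic(z, alpha))
```

With these assumptions the plane-geometry value stays bounded up to the singular point, while the cylindrical parabolic value grows without bound as $\rho \to 0$, in line with the one-dimensional versus two-dimensional character of the singularities discussed above.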
Fig. 3. Early development of concept: from Bogoliubov RG to Wilson RG and FS.
5. Overview

To complete our review, we indicate some milestones in the evolution of the RG concept. Since its appearance in QFT, the RG has served as a powerful tool for analyzing diverse physical problems and improving solution singularities disturbed by a PA. The development of the RG concept can be divided into two stages. The first one (from the mid-1950s up to the mid-1980s) is summarized in Fig. 3. Besides the early history (discovery of the RG, formulation of the RG method and application to UV and IR asymptotics), it comprises the development of the Kadanoff–Wilson RG in the 1970s and the subsequent explosive expansion into other fields of theoretical physics. During this stage, the formulation of the RG method was based on the unified scaling transformation of an independent variable (and/or some parameters) accompanied by a more complicated transformation of a solution characteristic g (see Eqs. (10), (13) and (16) in Section 1.2). Here, the main role of RG = FS was the a priori establishment of the fact that the solution under consideration admits functional transformations that form a group. Particular implementations of the RG symmetry differ in the form of the function(s) $\bar g$ (or $\beta(g)$), which in each particular case is obtained from some approximate solution.

The next stage, after the mid-1980s, is depicted in Fig. 4. The scheme describes the entire evolution of the Bogoliubov RG. There were several important reasons for further developing the RG concept in theoretical physics in this period. On the one hand, it was due to the extension of the notions of FS and RG symmetry, which until then had been based on one-parameter Lie groups of point transformations. Adding multi-dimensional Lie point groups and Lie–Bäcklund groups to the possible realizations of the group symmetry enhanced the usefulness of the RG method.
Fig. 4. Evolution of concept: from the Bogoliubov RG via FS to RG-symmetries.
On the other hand, this additional possibility arose due to the mathematical apparatus that was used in mathematical physics to reveal RGS. The advantage came from infinitesimal transformations, which enabled one to describe the RGS by an algebra of RG generators. However, in contrast to the situation typical of QFT models with only one operator, in mathematical physics we have finite- or infinite-dimensional algebras. Both their dimension and the method of construction depend upon the model employed and upon the form of the boundary conditions. The use of the infinitesimal approach results in constructing an RG-type symmetry with the help of the regular methods of group analysis of DEs. More precisely, this regular algorithm naturally includes the RG = FS invariance condition in the general scheme of construction and application of the RGS generators (see also our recent review [49]). Within the infinitesimal approach this condition is formulated in terms of the vanishing of the canonical RG operator coordinates, which is especially important for the Lie–Bäcklund RGS because finite transformations in this case are expressed as formal series. In particular, this property attributes a new feature to the RG analysis of a BVP solution with singular behavior, making the singularity analysis more powerful. At the same time, as the group analysis technique is still developing (here we mean both the extension to new types of symmetries and the application to more complicated mathematical models, e.g., those including integro-differential equations), we have a clear perspective that the possibilities of a regular scheme based upon the Bogoliubov RG method are far from being exhausted.

Acknowledgements

The authors are grateful to Professors Chris Stephens and Denjoe O’Connor for the invitation to participate in the Conference “Renormalization Group 2000”. They are indebted to these
gentlemen for useful discussions and comments. This work was partially supported by grants of the Russian Foundation for Basic Research (RFBR projects Nos. 96-15-96030, 99-01-00232 and 99-01-00091) and by INTAS grant No. 96-0842, as well as by the Organizing Committee of the above-mentioned meeting.

References

[1] E.E.C. Stueckelberg, A. Petermann, Helv. Phys. Acta 22 (1953) 499–520.
[2] D.V. Shirkov, Historical remarks on the renormalization group, in: L.M. Brown (Ed.), Appendix in the collective monograph Renormalization: From Lorentz to Landau (and Beyond), Springer, New York, 1993, pp. 167–186; D.V. Shirkov, On the early days of renormalization group, in: L. Hoddeson et al. (Eds.), The Rise of the Standard Model, Proceedings of the Third International Symposium on the History of Particle Physics, SLAC, 1992, Cambridge Univ. Press, Cambridge, 1997, pp. 250–258.
[3] M. Gell-Mann, F. Low, Phys. Rev. 95 (1954) 1300–1312.
[4] N.N. Bogoliubov, D.V. Shirkov, Dokl. Akad. Nauk SSSR 103 (1955) 203–206 (in Russian); see also in Nuovo Cimento 3 (1956) 845–863.
[5] N.N. Bogoliubov, D.V. Shirkov, Dokl. AN SSSR 103 (1955) 391–394 (in Russian); see also in Nuovo Cimento 3 (1956) 845–863.
[6] D.V. Shirkov, Dokl. AN SSSR 105 (1955) 972–975 (in Russian); see also in Nuovo Cimento 3 (1956) 845–863.
[7] N.N. Bogoliubov, D.V. Shirkov, Nuovo Cimento 3 (1956) 845–863.
[8] L.V. Ovsyannikov, Dokl. AN SSSR 109 (1956) 1112–1115 (in Russian) (for English translation see in: Yu. Trutnev (Ed.), Intermissions ..., WS, 1998, pp. 76–79); C. Callan, Phys. Rev. D 2 (1970) 1541–1547; K. Symanzik, Comm. Math. Phys. 18 (1970) 227–246.
[9] L.D. Landau et al., Nuovo Cimento 3 (Supp.) (1955) 80–104.
[10] I.F. Ginzburg, Dokl. AN SSSR 110 (1956) 535–538 (in Russian); see Chapter “Renormalization Group” in N. Bogoliubov, D. Shirkov, Introduction to the Theory of Quantized Fields, 1959; Wiley-Interscience, New York, 1980.
[11] N. Bogoliubov, D. Shirkov, Introduction to the Theory of Quantized Fields, 1959; Wiley-Interscience, New York, 1980.
[12] D.V. Shirkov, Theor. Math. Phys. 119 (1999) 55–66; hep-th/9810246.
[13] L. Kadanoff, Physics 2 (1966) 263–272.
[14] K. Wilson, Phys. Rev. B 4 (1971) 3174–3183.
[15] P.G. De Gennes, Phys. Lett. 38A (1972) 339–340; J. des Cloiseaux, J. Phys. (Paris) 36 (1975) 281–292.
[16] V.I. Alkhimov, Theor. Math. Phys. 39 (1979) 422–424; 59 (1984) 591–597.
[17] T.L. Bell et al., Phys. Rev. A 17 (1978) 1049–1057; G.F. Chapline, Phys. Rev. A 21 (1980) 1263–1271.
[18] B.V. Chirikov, Lect. Notes Phys. 179 (1983) 29–46; B.V. Chirikov, D.L. Shepelansky, Chaos Border and Statistical Anomalies, in: D.V. Shirkov, D.I. Kazakov, A.A. Vladimirov (Eds.), Renormalization Group, Proceedings of 1986 Dubna Conference, WS, Singapore, 1988, pp. 221–250; Yu.G. Sinai, K.M. Khanin, Renormalization group method in the theory of dynamical systems, in: D.V. Shirkov, D.I. Kazakov, A.A. Vladimirov (Eds.), Renormalization Group, Proceedings of 1986 Dubna Conference, WS, Singapore, 1988, pp. 251–277; A. Peterman, A. Zichichi, Nuovo Cimento 109A (1996) 341–355.
[19] D.V. Shirkov, Sov. Phys. JETP 9 (1959) 421–424.
[20] C. DeDominicis, P. Martin, Phys. Rev. A 19 (1979) 419–422; L.Ts. Adzhemyan et al., Theor. Math. Phys. 58 (1984) 47–51; 64 (1985) 777–784; A.N. Vasiliev, Quantum Field Renormalization Group in the Theory of Turbulence and in Magnetic Hydrodynamics, in: D.V. Shirkov, D.I. Kazakov, A.A. Vladimirov (Eds.), Renormalization Group, Proceedings of 1986 Dubna Conference, WS, Singapore, 1988, pp. 146–159.
[21] A.N. Vasiliev, Quantum Field Renormalization Group in the Theory of Critical Behavior and Stochastic Dynamics, PINF Publ., St-Petersburg, 1998, 773 pp (in Russian), also Taylor and Francis group PLC, London, in preparation.
[22] D.V. Shirkov, Teor. Mat. Fiz. 60 (1984) 778–782; The RG method and functional self-similarity in physics, in: R.Z. Sagdeev (Ed.), Nonlinear and Turbulent Processes in Physics, Vol. 3, Harwood Acad. Publ., New York, 1984, pp. 1637–1647.
[23] D.V. Shirkov, Renormalization group in modern physics, in: D.V. Shirkov, D.I. Kazakov, A.A. Vladimirov (Eds.), Renormalization Group, Proceedings of 1986 Dubna Conference, WS, Singapore, 1988, pp. 1–32; Int. J. Mod. Phys. A 3 (1988) 1321–1341; Several topics on renorm-group theory, in: D.V. Shirkov, V.B. Priezzhev (Eds.), Renormalization group ‘91, Proceedings of Second International Conference, September 1991, Dubna, USSR, WS, Singapore, 1992, pp. 1–10; Renormalization group in different fields of theoretical physics, KEK Report 91-13 (February 1992), 85 pp.
[26] M.A. Mnatsakanyan, Sov. Phys. Dokl. 27 (1982) 856–859.
[27] G. Pelletier, Plasma Phys. 24 (1980) 421–443.
[28] Ja.B. Zel’dovich, G.I. Barenblatt, Sov. Phys. Dokl. 3 (1) (1958) 44–47; see also G.I. Barenblatt, Scaling, Self-Similarity, and Intermediate Asymptotics, Cambridge Univ. Press, Cambridge, 1996 (Chapter 5).
[29] V.Z. Blank, V.L. Bonch-Bruevich, D.V. Shirkov, Sov. Phys. JETP 3 (1956) 845–863.
[30] D.V. Shirkov, Sov. Phys. Dokl. 27 (1982) 197–200.
[31] V.F. Kovalev, V.V. Pustovalov, D.V. Shirkov, J. Math. Phys. 39 (1998) 1170–1188; hep-th/9706056.
[32] L.V. Ovsyannikov, Group Analysis of Differential Equations, Academic Press, New York, 1982.
[33] Peter J. Olver, Applications of Lie Groups to Differential Equations, Springer, New York, 1986.
[34] N.H. Ibragimov (Ed.), CRC Handbook of Lie Group Analysis of Differential Equations, 3 Vols., CRC Press, Boca Raton, FL, USA, 1994–1996.
[35] N.H. Ibragimov, Transformation Groups Applied to Mathematical Physics, Reidel Publ., Dordrecht-Lancaster, 1985.
[36] L.I. Sedov, Similarity and Dimensional Analysis, Academic Press, New York, 1959.
[37] G. Birkhoff, Hydrodynamics, A Study in Logic, Fact and Similitude, Princeton Univ. Press, Princeton, 1960.
[38] V.F. Kovalev, V.V. Pustovalov, Lie Groups Appl. 1 (1994) 104–120.
[39] C.R. Stephens, Why two renormalization groups are better than one, in: D.V. Shirkov, D.I. Kazakov, V.B. Priezzhev (Eds.), Renormalization group ‘96, Proceedings of Third International Conference, August 1996, Dubna, Russia, JINR publ., Dubna, 1997, pp. 392–407; Int. J. Mod. Phys. 12 (1998) 1379–1396.
[41] L.-Y. Chen, N. Goldenfeld, Y. Oono, Phys. Rev. E 54 (1996) 376–394.
[42] J. Bricmont, A. Kupiainen, G. Lin, Comm. Pure Appl. Math. 47 (1994) 893–922.
[43] T. Kunihiro, Progr. Theor. Phys. 94 (1995) 503–514.
[44] V.F. Kovalev, S.V. Krivenko, V.V. Pustovalov, The Renormalization group method based on group analysis, in: D.V. Shirkov, V.B. Priezzhev (Eds.), Renormalization group ‘91, Proceedings of Second International Conference, September 1991, Dubna, USSR, WS, Singapore, 1992, pp. 300–314.
[45] V.F. Kovalev, V.Yu. Bychenkov, V.T. Tikhonchuk, Phys. Rev. A 61(3) (2000) 0338098(1-10).
[46] V.F. Kovalev, Theor. Math. Phys. 111 (1997) 686–702; V.F. Kovalev, D.V. Shirkov, J. Nonlinear Opt. Phys. Mater. 6 (1997) 443–454.
[47] S.A. Akhmanov, R.V. Khokhlov, A.P. Sukhorukov, Sov. Phys. JETP 23 (1966) 1025–1033.
[48] V.F. Kovalev, Theor. Math. Phys. 119 (1999) 719–730.
[49] V.F. Kovalev, D.V. Shirkov, Theor. Math. Phys. 121 (1999) 1315–1322.
Physics Reports 352 (2001) 251–272
Renormalization group in statistical mechanics and mechanics: gauge symmetries and vanishing beta functions
Giovanni Gallavotti
Dipartimento di Fisica, Università di Roma La Sapienza 1, Piazzale Aldo Moro 2, 00185 Roma, Italy
Received March 2001; editor: I. Procaccia
Contents
1. Introduction
2. Fermi systems in one dimension
3. The conceptual scheme of the renormalization group approach followed above
4. The KAM problem
Acknowledgements
References
Abstract

Two very different problems that can be studied by renormalization group methods are discussed with the aim of showing the conceptual unity that the renormalization group has introduced in some areas of theoretical physics. The two problems are: the ground state theory of a one-dimensional quantum Fermi liquid and the existence of quasi-periodic motions in classical mechanical systems close to integrable ones. I summarize here the main ideas and show that the two treatments, although completely independent of each other, are strikingly similar. © 2001 Elsevier Science B.V. All rights reserved.

PACS: 05.10.Cc
0370-1573/01/$ - see front matter. PII: S0370-1573(01)00040-0
G. Gallavotti / Physics Reports 352 (2001) 251–272
1. Introduction

There are few cases in which a renormalization group analysis can be performed in full detail and without approximations. The best known case is the hierarchical model theory of Wilson (1970) and Wilson and Kogut (1974). Other examples are the (Euclidean) $\varphi^4$ quantum field theories in two and three space-time dimensions (Wilson, 1973) (for an analysis in the spirit of what follows see Gallavotti (1985) or Benfatto and Gallavotti (1995)), and the universality of critical points (Wilson and Fisher, 1972). In all such examples there is a basic difficulty to overcome: the samples of the fields can be unboundedly large. This does not destroy the method because such large values have extremely small probability (Gallavotti, 1985). The necessity of a different treatment of the large and the small field values hides, to some extent, the intrinsic simplicity and elegance of the approach. It does so unnecessarily, as the end result is that one can essentially ignore the large field values (to the extent that they are not even mentioned in most application-oriented discussions) and treat the renormalization problem perturbatively, as if the large fields were not possible.

Here I shall discuss two (nontrivial) problems in which the large field difficulties are not at all present, and the theory leads to a convergent perturbative solution of the problem (unlike the above-mentioned classical cases, in which the perturbation expansion cannot be analytic in the perturbation parameter). The problems are: (1) the theory of the ground state of a system of (spinless, for simplicity) fermions in one dimension (Berretti et al., 1994; Benfatto and Gallavotti, 1990; Bonetto and Mastropietro, 1995); (2) the theory of KAM tori in classical mechanics (Eliasson, 1996; Gallavotti, 1995; Bonetto et al., 1998; Gallavotti et al., 1995). The two problems will be treated independently, for completeness, although it will appear that they are closely related.
Since the discussion of problem (1) is quite technical, we summarize it at the end (in Section 3) in a form that shows the generality of the method, which will then be applied to problem (2) in Section 4. The analysis of the above examples suggests methods to study and solve several problems in the theory of rapidly perturbed quasi-periodic unstable motion (Gallavotti, 1995; Gallavotti et al., 1999), but for brevity we shall only refer to the literature for such applications.

2. Fermi systems in one dimension

The Hamiltonian for a system of N spinless fermions at $x_1, \ldots, x_N$ enclosed in a box (actually an interval) of size V is
$$H = \sum_{i=1}^{N}\left(-\frac{1}{2m}\Delta_{x_i} - \mu\right) + 2\lambda\sum_{i<j} v(x_i - x_j) + \nu N, \qquad (2.1)$$
where $\mu$ is the chemical potential, v is a smooth interaction pair potential, $\lambda$ is the strength of the coupling; $\nu$ is a correction to the chemical potential that vanishes for $\lambda = 0$ and that has
Fig. 1. The two basic building blocks (“graph elements”) of the Feynman graphs for the description of the ground state: the first represents the potential term ($2\lambda v$ in (2.1)) and the second the chemical potential term ($\nu$).
Fig. 2. Graphical representation of the “external” lines and vertices in Feynman graphs.
to be adjusted as a function of $\lambda$; it is introduced in order that the Fermi momentum stays $\lambda$-independent and equal, therefore, to $p_F = (2m\mu)^{1/2}$. It is in fact convenient to develop the theory at fixed Fermi momentum because the latter has a more direct physical meaning than the chemical potential, as it marks the location of important singularities of the functions that describe the theory. The parameter m is the mass of the particles in the absence of interaction.

It is well known (Luttinger and Ward, 1960) that the ground state of the above Hamiltonian is described by the Schwinger functions of a fermionic theory whose fields will be denoted $\psi^\pm_{\mathbf{x}}$. For instance, the occupation number function $n_k$ which, in the absence of interaction, is the simple characteristic function $n_k = 1$ if $|k| < p_F$ and $n_k = 0$ if $|k| > p_F$ is, in general, the Fourier transform of $S(x,t)$ with $\mathbf{x} = (x,t) = (x_1 - x_2, t_1 - t_2)$ and $t = 0^+$:
$$S(\mathbf{x}) = S(x_1,t_1;x_2,t_2)\Big|_{t_1 = t_2^+} = \lim_{\substack{\beta\to\infty \\ V\to\infty}} \frac{\mathrm{Tr}\; e^{-(\beta - t_1)H}\,\psi^+_{x_1}\, e^{-(t_1 - t_2)H}\,\psi^-_{x_2}\, e^{-t_2 H}}{\mathrm{Tr}\; e^{-\beta H}}\Bigg|_{t_1 = t_2^+}. \qquad (2.2)$$
Formal perturbation analysis of the 2-point Schwinger function $S(\mathbf{x})$ and of its natural n-point extensions $S(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n)$ can be done, and the (heuristic) theory is very simple in terms of Feynman graphs. The n-point Schwinger function is expressed as a power series in the couplings $\lambda, \nu$, $\sum_{p,q=0}^{\infty} \lambda^p \nu^q S^{(p,q)}(\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n)$, with the coefficients $S^{(p,q)}$ computed by considering the (connected) Feynman graphs composed by linking together in all possible ways the following basic “graph elements”: (1) p “internal 4-lines graph elements” (also called “coupling graphs”) and q “internal 2-lines graph elements” (or “chemical potential vertices”) of the form in Fig. 1, where the incoming or outgoing arrows represent $\psi^-_{\mathbf{x}}$ or $\psi^+_{\mathbf{x}}$, respectively, and (2) n single lines attached to “external” vertices $\mathbf{x}_j$, the first half of which are oriented towards the vertex $\mathbf{x}$ and the other half oriented away from it (Fig. 2). The graphs are formed by contracting (i.e. joining) together lines with equal orientation. The lines emerging from different nodes are regarded as distinct: we can imagine that each line carries a label distinguishing it from any other, e.g. the lines are thought to be numbered from 1 to 4 or from 1 to 2, depending
Fig. 3. An example of a Feynman graph: in spite of its involved structure it is far simpler than its numerical expression, see (2.5). A systematic consideration of graphs as “short cuts” for formulae permits us to visualize more easily various quantities and makes it possible to recognize cancellations due to symmetries.
on the structure of the graph element to which they belong. Hence there are many graphs giving the same contribution. Each graph is assigned a value which is $\pm(p!\,q!)^{-1}\lambda^p\nu^q$ times a product of propagators, one per line. The propagator for a line joining $\mathbf{x}_1$ to $\mathbf{x}_2$ is, if $\mathbf{x}_1 = (x_1, t_1)$, $\mathbf{x}_2 = (x_2, t_2)$:
$$g(\mathbf{x}_1 - \mathbf{x}_2) = \frac{1}{(2\pi)^2}\int \frac{e^{-i(k_0(t_1 - t_2) + k(x_1 - x_2))}}{-ik_0 + (k^2 - p_F^2)/2m}\, dk_0\, dk . \qquad (2.3)$$
A wavy line, see Fig. 1, joining $\mathbf{x}_1$ with $\mathbf{x}_2$ is also given a propagator
$$\tilde g(\mathbf{x}_2 - \mathbf{x}_1) = v(x_2 - x_1)\,\delta(t_2 - t_1), \qquad (2.4)$$
associated with the “potential”. However, the wavy lines are necessarily internal, as they can only arise from the first graph element in Fig. 1. The p + q internal node labels (x, t) must be integrated over the volume occupied by the system (i.e. the whole space-time when $V, \beta \to \infty$): the result will be called the “integrated value” of the graph or simply, if not ambiguous, the graph value.

Since the value of a graph has to be integrated over the labels $\mathbf{x} = (x, t)$ of the internal nodes, we shall often consider also the value of a graph $\vartheta$ without the propagators corresponding to the external lines but integrated with respect to the positions of all nodes that are not attached to external lines, and we call it the “kernel” of the graph $\vartheta$: the value of a graph $\vartheta$ will often be denoted by $\mathrm{Val}\,\vartheta$ and the kernel by $K_\vartheta$. Note that the kernel of a graph depends on fewer variables: in particular it depends only on the positions of the internal nodes; it also depends on the labels $\omega$ of the branches external to them through which they are connected to the external vertices. Introducing the notion of kernel is useful because it makes it natural to collect together values of graphs which contain subgraphs with the same number of lines exiting them, i.e. whose kernels have the same number of variables.

The function $\lambda^p\nu^q S^{(p,q)}$ is given by the sum of the values of all Feynman graphs with p vertices of the first type in Fig. 1, q of the second type in Fig. 1 and, of course, n external vertices, integrated over the positions of the internal vertices. As an example consider the contribution to $S^{(4,2)}$ in Fig. 3.
The value of the graph in Fig. 3 can be easily written in formulae: apart from a global sign that has to be computed by a careful examination of the order in which the $\mathbf{x}_j$-labels are written, it is
$$S^{(4,2)}_\vartheta(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_8, \mathbf{x}_9) = \pm\frac{1}{4!\,2!}\int g(x_{10} - x_2)\, g(x_3 - x_1)\, g(x_3 - x_{10})\, g(x_4 - x_3)\, g(x_5 - x_4)\, g(x_4 - x_5)\, g(x_5 - x_4)\, g(x_4 - x_5)\, g(x_6 - x_5)$$
$$\times\, g(x_4 - x_5)\, g(x_5 - x_4)\, g(x_7 - x_6)\, g(x_9 - x_7)\, g(x_8 - x_7)\,\delta(t_4 - t_4')\,\delta(t_5 - t_5')\, v(x_3 - x_3')\,\delta(t_3 - t_3')\, v(x_7 - x_7')\,\delta(t_7 - t_7')$$
$$\times\, v(x_4 - x_4')\, v(x_5 - x_5')\; dx_3\, dx_3'\, dx_4\, dx_4'\, dx_5\, dx_5'\, dx_6\, dx_7\, dx_7'\, dx_{10}, \qquad (2.5)$$
which is easily derived from the figure. One hardly sees how this formula could be useful, particularly if one thinks that this is but one of a large number of possibilities that arise in evaluating S, not to mention what we shall get when looking at higher orders, i.e. at $S^{(p,q)}$ when p is a bit larger than 2.

Many (in fact most) of the integrals over the node variables $\mathbf{x}_v$ will, however, diverge. This is a “trivial” divergence due to the fact that the interaction tends to change the value of the chemical potential. The chemical potential is related (or can be related) to the singularities of the Fermi field propagator, and the chemical potential is changed (or may be changed) by the interaction: the divergences are due to the naiveté of the attempt at expanding the functions S in a power series involving functions with singularities located “at the wrong places”. The divergences disappear if the (so far free) parameter $\nu$ is chosen to depend on $\lambda$ as
$$\nu = \sum_{k=1}^{\infty} \nu_k \lambda^k, \qquad (2.6)$$
with the coefficients $\nu_k$ suitably defined so that the resulting power series in the single parameter $\lambda$ has coefficients free of divergences (Luttinger and Ward, 1960). This leads to a power series in just one parameter, and the “only” problem left is therefore that of the convergence of the expansion of the Schwinger functions in powers of $\lambda$. This is nontrivial because naive estimates of the sum of all graphs contributing to a given order p yield bounds that grow like p!, thus giving a vanishing estimate for the radius of convergence.

The idea is that there are cancellations between the values of the various graphs contributing to a given order in the power series for the Schwinger functions, and that such cancellations can be best exhibited by further breaking up the values of the graphs and by combining them again conveniently. The “renormalization group method” can be seen in different ways: here I am proposing to see it as a resummation method for (possibly divergent) power series.
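Why a bound growing like p! gives a vanishing estimate for the radius of convergence can be made explicit with the Cauchy–Hadamard formula. The following is a standard estimate and not a computation from the text: the coefficients $c_p$ and the constant $C > 0$ are introduced here only for illustration, under the assumption that the order-p coefficient is known to obey nothing better than $|c_p| \le C^p\, p!$.

```latex
% Assumption (illustrative): the only available bound is |c_p| \le C^p\, p!,
% with c_p the order-p coefficient and C > 0 some constant.
\[
  R_{\mathrm{est}}
  \;=\; \Bigl(\limsup_{p\to\infty}\bigl(C^{p}\,p!\bigr)^{1/p}\Bigr)^{-1}
  \;=\; \Bigl(\lim_{p\to\infty} C\,(p!)^{1/p}\Bigr)^{-1}
  \;=\; 0,
  \qquad\text{since}\qquad
  (p!)^{1/p} \sim \frac{p}{e} \;\longrightarrow\; \infty
  \quad\text{(Stirling)} .
\]
```

The estimate says nothing about the true radius of convergence: it only shows that a bound of this type cannot exclude divergence, which is why the cancellations between graphs have to be exhibited.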
Fig. 4. A (non-smooth) scaling (by a factor of 2) decomposition of unity.
Keeping the original power series in $\lambda, \nu$, i.e. postponing the choice of $\nu$ as a function of $\lambda$, one checks the elementary fact that the propagator $g(\mathbf{x})$ can be written, setting $k = (k_0, k) \in R^2$, also as
$$g(\mathbf{x}) = \sum_{h=-\infty}^{1}\;\sum_{\omega=\pm 1} e^{i\omega p_F x}\, 2^h\, g^{(h)}_\omega(2^h p_F \mathbf{x}), \qquad \hat g^{(h)}_\omega(k) = \frac{\chi^{(h)}(k)}{-ik_0 + \omega k} + \text{“negligible corrections”}, \qquad (2.7)$$
where $\chi^{(1)}(k)$ is a function increasing from 0 to 1 between $\frac{1}{2}p_F$ and $p_F$, while the functions $\chi^{(h)}(k)$ are the same function scaled to have support in $2^{h-2}p_F < |k| < 2^h p_F$. This means that for $h \le 0$ it is $\chi^{(h)}(k) = \chi(2^{-h} k\, p_F^{-1})$. The simplest choice is to take $\chi^{(1)}(k)$ to be the characteristic function of $z \equiv p_F^{-1}|k| > 1$ and $\chi(z)$ to be the characteristic function of the interval $[\frac{1}{2}, 1]$ (Fig. 4), so that
$$\sum_{h=-\infty}^{1} \chi^{(h)}(k) \equiv 1 . \qquad (2.8)$$
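The sharp scale decomposition of unity in (2.8) can be checked directly. The sketch below is illustrative only. It writes `chi` for the cutoff function, takes $p_F = 1$ by default, uses the half-open interval (1/2, 1] for the sharp cutoff so that neighbouring scales do not both count their common endpoint (a convention assumed here; the text writes the closed interval), and truncates the sum at a hypothetical lowest scale `h_min`.

```python
# Sketch under stated assumptions: sharp dyadic decomposition of unity.
# chi is the characteristic function of (1/2, 1] (half-open by assumption),
# chi_h(1, .) is the characteristic function of |k|/p_F > 1, and for h <= 0
# chi_h(h, k) = chi(2**(-h) * |k| / p_F).

def chi(zeta):
    return 1.0 if 0.5 < zeta <= 1.0 else 0.0

def chi_h(h, k_norm, p_F=1.0):
    zeta = k_norm / p_F
    if h == 1:
        return 1.0 if zeta > 1.0 else 0.0   # ultraviolet part
    if h > 1:
        raise ValueError("scales are labelled by h <= 1")
    return chi(zeta * 2.0 ** (-h))          # infrared scales h <= 0

def partition_sum(k_norm, p_F=1.0, h_min=-60):
    """Sum of chi_h over h_min <= h <= 1 (h_min is a hypothetical truncation)."""
    return sum(chi_h(h, k_norm, p_F) for h in range(h_min, 2))

def active_scale(k_norm, p_F=1.0, h_min=-60):
    """Return the single scale h on which |k| lives, or None if below h_min."""
    for h in range(h_min, 2):
        if chi_h(h, k_norm, p_F) == 1.0:
            return h
    return None

if __name__ == "__main__":
    for k in (1e-9, 0.3, 0.5, 1.0, 1.0001, 7.0):
        print("k =", k, "sum =", partition_sum(k), "scale h =", active_scale(k))
```

For every nonzero $|k|$ above the truncation exactly one term of the sum is nonzero, so the scale label h simply records the dyadic shell $2^{h-1}p_F < |k| \le 2^h p_F$ (or $|k| > p_F$ for h = 1) in which the momentum lies.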
To avoid technical problems it would be convenient to smooth the discontinuities in Fig. 4 of $\chi$ and $\chi^{(1)}$, turning them into $C^\infty$-functions which in a small vicinity of the jump increase from 0 to 1 or decrease from 1 to 0; this is possible while still keeping the scaling decomposition (2.8) (i.e. with $\chi^{(h)}(k) \equiv \chi(2^{-h} k\, p_F^{-1})$). However, the formalism that this smoothing would require is rather heavy and hides the structure of the approach; therefore we shall continue with the decomposition of unity in (2.8) with the sharply discontinuous functions in Fig. 4, warning the reader (cf. footnote 2) when this should cause a problem.

The “negligible terms” in (2.7) are terms of a similar form but which are smaller by a factor $2^h$ at least: their presence does not alter the analysis other than notationally. We shall henceforth set them equal to 0 because taking them into account only introduces notational complications.

The above is an infrared scale decomposition of the propagator $g(\mathbf{x})$: in fact the propagator $g^{(h)}$ contains only momenta k of $O(2^h p_F)$ for $h \le 0$, while the propagator $g^{(1)}$ contains all (and only) large momenta (i.e. the ultraviolet part of the propagator $g(\mathbf{x})$). The representation (2.7) is called a quasi-particle representation of the propagator and the quantities $\omega p_F$
are called quasi-particle momenta. The function $g^{(h)}_\omega$ is the “quasi-particle propagator on scale h”. After extracting the exponentials $e^{i\omega p_F x}$ from the propagators, the Fourier transforms $\hat g^{(h)}_\omega(k)$ of $g^{(h)}_\omega(\mathbf{x})$ will no longer be oscillating on the scale of $p_F$ and the variable k will have the interpretation of “momentum measured from the Fermi surface”.

The mentioned divergences are still present because we do not yet relate $\nu$ and $\lambda$: they will be eliminated temporarily by introducing an infrared cut-off, i.e. by truncating the sum in (2.7) to $h \ge -R$. We then proceed keeping in mind that we must get results which are uniform as $R \to \infty$: this will eventually be possible only if $\nu$ is suitably fixed as a function of $\lambda$.

We write $g(\mathbf{x}) = Z_1^{-1} g^{(1)}(\mathbf{x}) + Z_1^{-1} g^{(\le 0)}(\mathbf{x})$ with $Z_1 \stackrel{def}{=} 1$, where $g^{(\le m)}$ is defined in general, see (2.7), as
$$g^{(\le m)}(\mathbf{x}) = \sum_{h=-R}^{m}\;\sum_{\omega=\pm 1} 2^h\, e^{i\omega p_F x}\, g^{(h)}_\omega(2^h p_F \mathbf{x}), \qquad m \le 0 . \qquad (2.9)$$
Each graph can now be decomposed as a sum of graphs, each of which has internal lines carrying extra labels “1” and “$\omega$” or “$\le 0$” and “$\omega$” (signifying that the value of the graph has to be computed by using the propagator $Z_1^{-1} g^{(1)}_\omega(\mathbf{x} - \mathbf{x}')$ or $Z_1^{-1} g^{(\le 0)}_\omega(\mathbf{x} - \mathbf{x}')$ for the line in question, if it goes from $\mathbf{x}$ to $\mathbf{x}'$).

We now define clusters of scale 1: a “cluster” on scale 1 will be any set C of vertices connected by lines bearing the scale label 1 and which is maximal in size (i.e. it is not part of a larger cluster of the same type). Wavy lines are regarded as bearing a scale label 1. The graph is thus decomposed into smaller graphs formed by the clusters and connected by lines of scale $\le 0$: it is convenient to visualize the clusters as enclosed into contours that include the vertices of each cluster as well as all the lines that connect two vertices of the same cluster. The latter can be naturally called lines internal to the cluster C.

The integrated value of a graph will be represented, up to a sign which can be determined as described above, as a sum over the quasi-particle labels $\omega$ of the cluster lines and as an integral over the locations of the inner vertices of the various cluster lines. The integrand is a product of (a) the kernels $K_{C_i}$ associated with the clusters $C_i$ and depending only on the locations of the vertices inside the cluster $C_i$ which are extremes of lines external to the cluster and on the quasi-particle labels $\omega$ of the lines that emerge from it,¹ and (b) the propagators $Z_1^{-1} g^{(\le 0)}_\omega$ corresponding to the lines that are external to the clusters (in the sense that they have at least one vertex not inside the cluster).

We now look at the clusters C that have just |C| = 2 or |C| = 4 external lines and that are therefore associated with kernels $K_C(\{\mathbf{x}_j, \omega_j\}_{j=1,2})$ or $K_C(\{\mathbf{x}_i, \omega_i\}_{i=1,\ldots,4})$. Such kernels, by
¹ By definition the kernel $K_C$ also involves integration over the locations of its inner vertices and the sum over the quasi-particle momenta of the inner propagators.
the structure of the propagators, see (2.7) and (2.9), will have the form
$$K_C = e^{i p_F \left(\sum_j \omega_j x_j\right)}\, \bar K_C(\{\mathbf{x}_j, \omega_j\}_{j=1,2,\ldots,|C|}), \qquad (2.10)$$
where the $\mathbf{x}_j$ are the vertices of the cluster C to which the entering and exiting lines are attached; the cluster may contain more vertices than just the ones to which the external lines are attached: the positions of such “extra” vertices must be considered as integration variables (and as integrated), and a sum is understood to act over all the quasi-particle labels of the internal lines (consistent with the values $\omega_j$ of the external lines). If |C| = 2, 4 we write the Fourier transform at $k = (k_0, k)$ of the kernels $\bar K_C(\ldots)$:
$$Z_1 2^{-1}\nu^{(1)}_C\,\delta_{\omega_1,\omega_2} + Z_1\left(-ik_0\,\zeta^{(1)}_C + \omega k\,\alpha^{(1)}_C\right)\delta_{\omega_1,\omega_2} + \text{“remainder”},$$
$$Z_1^2\,\lambda^{(1)}_C\,\delta_{\omega_1+\omega_2+\omega_3+\omega_4=0} + \text{“remainder”}, \qquad (2.11)$$
where the first equation ($|C| = 2$) is a function of one $k$ only while the second equation ($|C| = 4$) depends on four momenta $k$: one says that the remainders are obtained by “subtracting from the kernels their values at the Fermi surface” or by collecting terms that do not conserve the quasi-particle momenta (like terms with $\delta_{\omega_1,-\omega_2}$ in the first equation or with $\omega_1 + \cdots + \omega_4 \ne 0$ in the second). The remainder contains various terms which do not have the form of the terms explicitly written in (2.11): a form which could be as simple as $\omega\cdot\vec k\,\delta_{\omega_1,-\omega_2}$ but that will in general be far more involved.

In evaluating graphs we imagine them, as described, as made of clusters, the graph value being obtained by integrating the product of the values of the kernels associated with the graph times the product of the propagators of the lines that connect different clusters. Furthermore we imagine attaching to each cluster with 2 external lines a label indicating that it contributes to the graph value with the first term in the decomposition in (2.11) only (which is the term proportional to $\nu^{(1)}$), or with the second term (which is proportional to $(-ik_0)\zeta^{(1)} + \omega\vec k\,\alpha^{(1)}$), or with the remainder. This is easily taken into account by attaching to the cluster an extra label 1, 2 or r. Likewise, we imagine attaching to each cluster with 4 external lines a label indicating that it contributes to the value with the first term in the decomposition in (2.11) only (which is the term proportional to $\lambda^{(1)}$), or with the remainder. This is again easily taken into account by attaching to the cluster an extra label 1 or r. The label r stands for “remainder term” or “irrelevant term”; however, as usual in the renormalization group nomenclature, irrelevant does not mean negligible (on the contrary, these are in a way the most important terms).

The next idea is to collect together all graphs with the same cluster structure, i.e. which become identical once the clusters with 2 or 4 external lines are “shrunk” to points. Since the internal structure of such graphs is different, this means that we are collecting together graphs of different perturbative order. In this way we obtain a representation of the Schwinger functions that is no longer a power series representation, and evaluation rules for graphs in which single vertex subgraphs (or single node subgraphs) with 2 or 4 external lines have a new meaning. Namely a four external
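lines vertex will mean a quantity $Z_1^2\lambda'$ equal to the sum of $Z_1^2\lambda_C^{(1)}$ of all the values of the clusters $C$ with 4 external lines and with label 1. The 2 external lines nodes will mean
$$e^{i(\omega_1 x_1 - \omega_2 x_2)p_F}\,\delta_{\omega_1,\omega_2}\, Z_1\big(\nu' + \zeta'\partial_t - i\alpha'\omega\,\partial_x\big)\,\delta(x_1 - x_2),$$ (2.12)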
where again $\nu'$ or $\zeta', \alpha'$ are the sums of the contributions from all the graphs with 2 external lines and with label 0 or 1, respectively. It is convenient to define $\delta' = \zeta' - \alpha'$ and to rewrite the contributions of the 2 external lines nodes to the product generating the value of a graph simply as
$$e^{i(\omega_1 x_1 - \omega_2 x_2)p_F}\,\delta_{\omega_1,\omega_2}\, Z_1\big(2\nu' + \delta'\partial_t + \alpha'(\partial_t - i\omega\cdot\partial_x)\big)\,\delta(x_1 - x_2).$$ (2.13)
One then notes that this can be represented graphically by saying that 2 external lines nodes in graphs which do not carry the label r can contribute in 3 different ways to the product determining the graph value. The 3 ways can be distinguished by a label 0, 1 and z corresponding to the three addends in (2.13). Any graph without z-type nodes can be turned into a graph which contains an arbitrary number of them, on each line connecting the clusters. This amounts to saying that we can compute the series by imposing that there is not even a single vertex with two external lines and with label z, simply by modifying the propagators of the lines connecting the graphs: changing them from $Z_1^{-1} g^{(\le 0)}$ to $Z_0^{-1} g^{(\le 0)}$ with
$$Z_0 = Z_1(1 + \alpha').$$ (2.14)
This can be seen either elementarily, by remarking that adding values of graphs which contain chains of nodes with label z amounts to summing a geometric series (i.e. precisely the series $\sum_{k=0}^{\infty} (-1)^k (\alpha')^k = (1 + \alpha')^{-1}$), or, much more easily, by recalling that the graphs are generated by a formal functional integral over Grassmannian variables and checking (2.14) from this remark without any real calculation, see Benfatto and Gallavotti (1995). In the first approach care is needed to get the correct relation (2.14) and it is wise to check it first in a few simple cases (starting with the “linear” graphs which only contain nodes with one entering line and one exiting line, see Fig. 1: the risk is to get $Z_0 = Z_1(1 - \alpha')$ instead of (2.14)) (see footnote 2).
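The geometric-series argument can be checked numerically. The following is a minimal sketch under stated assumptions: the numerical value of $\alpha'$ is hypothetical (any value with $|\alpha'| < 1$ would do), the helper name `chain_sum` is introduced only for illustration, and each z-type node is assumed to contribute a bare factor $-\alpha'$. It illustrates only the resummation behind (2.14), not the computation of actual graph values.

```python
def chain_sum(alpha_prime: float, max_nodes: int) -> float:
    """Sum the contributions of chains with 0, 1, ..., max_nodes z-type nodes.

    Assumption of this sketch: each z-type node inserted on a line
    contributes a factor (-alpha_prime), so a chain of k nodes gives
    (-alpha_prime)**k.
    """
    return sum((-alpha_prime) ** k for k in range(max_nodes + 1))


Z1 = 1.0
alpha_prime = 0.3                    # hypothetical value, |alpha_prime| < 1
Z0 = Z1 * (1.0 + alpha_prime)        # relation (2.14)

# Propagator prefactor 1/Z1 dressed by all chains of z-type nodes.
dressed = chain_sum(alpha_prime, 200) / Z1
print(dressed, 1.0 / Z0)
```

With the opposite sign convention for the node factor the same sum would give $(1 - \alpha')^{-1}$, which is the pitfall mentioned in the text.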
Footnote 2. It is at this point that replacing the sharply discontinuous $\chi$-functions by smooth ones would cause a problem. In fact if one uses the smooth decomposition, (2.14) is no longer correct: it would become $Z_0 = Z_1(1 + \chi^{(0)}(k)\,\alpha')$, with the consequence that $Z_0$ would no longer be a constant. At this point there are two possible ways out: the first is to live with a $Z_0$ which depends on $k$ and with the fact that the quantities $Z_j$, $j \le -1$, introduced below will also be $k$-dependent; this is possible but it is perhaps too different from what one is used to in the phenomenological renormalization group approaches, in which quantities like $Z_j$ are usually constants. The other possibility is to modify the propagator on scale 0 from $Z_0^{-1} g^{(\le 0)}(k)$ to $g^{(\le 0)}(k)\,(1 + \alpha')/(1 + \chi^{(0)}(k)\,\alpha')$. The second choice implies that $g^{(\le h)}$ will no longer be exactly $\chi^{(h)}(k)/(-ik_0 + \omega\cdot\vec k)$ but will be gradually modified as $h$ decreases, and the modification has to be computed step by step. This is also unusual in the phenomenological renormalization group approaches: the reason being simply that in such approaches the decomposition with sharp discontinuities is always used. The latter is not really convenient if one wants to make estimates of large order graphs. Here this will not be a problem for us because we shall not do the technical work of deriving estimates. In Benfatto and Gallavotti (1990) as well as in Berretti et al. (1994) the second choice has been adopted.
Correspondingly we set:
$$\nu^{(0)} = 2\,\frac{Z_0}{Z_1}\,\nu', \qquad \delta^{(0)} = \frac{Z_0}{Z_1}\,\delta', \qquad \lambda^{(0)} = \frac{Z_0^2}{Z_1^2}\,\lambda'.$$ (2.15)
We can now iterate the analysis: we imagine writing the propagators of the lines connecting the clusters considered so far, which we shall call clusters of scale 1, as
$$\frac{1}{Z_0}\, g^{(\le 0)} = \frac{1}{Z_0}\, g^{(0)} + \frac{1}{Z_0}\, g^{(\le -1)}$$ (2.16)
and proceed to decompose all the propagators of lines outside the clusters of scale 1 into propagators of scale 0 or of scale $\le -1$. In this way, imagining all clusters of scale 0 as points, we build a new level of clusters (whose vertices are either vertices or clusters of scale 0): they consist of maximal sets of clusters of scale 1 connected via paths of lines of scale 0. Proceeding in the same way as in the above “step 1” we represent the Schwinger functions as sums of values of graphs built with clusters of scale 0 connected by lines with propagators on scale $\le -1$ given by $Z_0^{-1} g^{(\le -1)}$ and with the clusters carrying labels 1, 2 or r. Again we rearrange the 2 external lines clusters with labels 1, 2 as in (2.13), introducing the parameters $\nu', \delta', \alpha', \lambda'$ and graphs with nodes of type z, by defining $\alpha'$ in a way analogous to the previous quantity with the same name (relative to the scale 1 analysis). The one vertex nodes of such graphs with 2 or 4 external lines of scale $\le -1$ will contribute to the product defining the graph value a factor $Z_{-1}\,2\,\nu^{(-1)}$ or $Z_{-1}\,\delta^{(-1)}$ or $Z_{-1}^2\,\lambda^{(-1)}$ (while the propagators in the clusters of scale 1 and the 2 or 4 nodes with two lines of scale 0 emerging from them retain the previous meaning). Again one sets
$$Z_{-1} = Z_0(1 + \alpha')$$ (2.17)
and correspondingly we set
$$\nu^{(-1)} = 2\,\frac{Z_{-1}}{Z_0}\,\nu', \qquad \delta^{(-1)} = \frac{Z_{-1}}{Z_0}\,\delta', \qquad \lambda^{(-1)} = \frac{Z_{-1}^2}{Z_0^2}\,\lambda'$$ (2.18)
and now we shall only have graphs with 2 or 4 external lines clusters which carry a label 0, 1 or r as in the previous analysis of the scale 1, and the propagators connecting clusters of scale 0 are changed from $Z_0^{-1} g^{(\le -1)}$ to $Z_{-1}^{-1} g^{(\le -1)}$. Having completed step 0 we then “proceed in the same way” and perform “step −1” and so on.

One can wonder why the scaling factor 2 was chosen in (2.13) and (2.18) to multiply the ratio of the renormalization factors in the definition of the new $\nu_j$ or, for that matter, why the factor 1 was chosen in the definition of the new $\delta_j, \lambda_j$: these are dimensional factors that come out naturally, and any attempt at modifying the above choices leads to a beta function, defined below, which is not uniformly bounded as we remove the infrared cut-off. In other words: different scalings can be considered but there is only one which is useful. It could also be found by using an arbitrary scaling and then looking for the one for which the estimates needed to get a convergent expansion can be made.

The conclusion is a complete rearrangement of the perturbation expansion, which is now expressed in terms of graphs which bear various labels and, most important, contain propagators
that bear a scale index which gives us information on the scale on which they are sizably different from 0. The procedure, apart from convergence problems, leads us to define recursively a sequence $\lambda^{(j)}, \delta^{(j)}, \nu^{(j)}, Z_j$ of constants, each of which is the sum of a formal power series involving values of graphs with 2 or 4 external lines. The quantities $g_j = (\lambda^{(j)}, \delta^{(j)}, \nu^{(j)})$ can be called the running coupling constants while the $Z_j$ can be called the running wave function renormalization constants: here $j = 1, 0, -1, -2, \ldots$. Of course all the above is nothing but algebra, made simple by the graphical representation of the objects that we wish to compute. The reason why it is of any interest is that, since the construction is recursive, one derives expressions of the $g_j, Z_j$ in terms of the $g_n, Z_n$ with $n > j$:
$$\frac{Z_{j+1}}{Z_j} = 1 + B'_j(g_{j+1}, g_j, \ldots, g_0), \qquad g_j = \Lambda_j\, g_{j+1} + C_j(g_{j+1}, g_j, \ldots, g_0),$$ (2.19)
where $\Lambda_j$ is the matrix
$$\Lambda_j = \begin{pmatrix} (Z_h/Z_{h-1})^2 & 0 & 0 \\ 0 & (Z_h/Z_{h-1}) & 0 \\ 0 & 0 & 2\,(Z_h/Z_{h-1}) \end{pmatrix}$$ (2.20)
and the functions $B'_j, C_j$ are given by power series, so far formal, in the running couplings. The expression of $Z_{j+1}/Z_j$ can be used to eliminate such ratios in the second relation of (2.19), which therefore becomes
$$\frac{Z_{j+1}}{Z_j} = 1 + B'_j(g_{j+1}, g_j, \ldots, g_0), \qquad g_j = \Lambda\, g_{j+1} + \mathbf{B}_j(g_{j+1}, g_j, \ldots, g_0),$$ (2.21)
where $\Lambda$ is the diagonal matrix with diagonal (1, 1, 2). The scalar functions $B'_j$ and the three-component vector functions $\mathbf{B}_j = (B_{j,1}, B_{j,2}, B_{j,3})$ are called the beta functional of the problem. There are two key points, which are nontrivial at least if compared to the above simple algebra, and which we state as propositions.

Proposition 1 (regularity and boundedness of the beta function). Suppose that there is $\varepsilon > 0$ such that $|g_j| < \varepsilon$, $|Z_j/Z_{j-1} - 1| < \varepsilon$ for all $j \le 1$; then if $\varepsilon$ is small enough the power series defining the beta functionals converge. Furthermore the functions $B'_j, \mathbf{B}_j$ are uniformly bounded and have a dependence on the arguments with label $j + n$ exponentially decaying as $n$ grows; namely there exist constants $D, \kappa$ such that if $G = (g_{j+1}, \ldots, g_0)$ and $G' = (g'_{j+1}, \ldots, g'_0)$, with $G$ and $G'$ differing only by the $(j + n)$th “component”, $d = g'_{j+n} - g_{j+n} \ne 0$, then for all $j \le 0$ and all $n > 0$
$$|B'_j(G)|,\ |\mathbf{B}_j(G)| \le D\varepsilon^2, \qquad |B'_j(G') - B'_j(G)|,\ |\mathbf{B}_j(G') - \mathbf{B}_j(G)| \le D e^{-\kappa n}\,\varepsilon\,|d|,$$ (2.22)
if $\varepsilon$ is small enough: i.e. the “memory” of the “beta functionals” $B'_j, \mathbf{B}_j$ is short ranged. The Schwinger functions are expressed as convergent power series in the $g_j$ in the same domain $|g_j| < \varepsilon$.

The difficult part of the proof of the above proposition is to get the convergence of the series under the hypotheses $|g_j| < \varepsilon$, $|Z_j/Z_{j-1} - 1| < \varepsilon$ for all $j$: this is possible because the system is a fermionic system and one can collect the contributions of all graphs of a given order $k$ into a few, i.e. not more than an exponential in $k$, groups, each of which gives a contribution that is expressed as a determinant which can be estimated without really expanding it into products of matrix elements (which would lead to bounding the order $k$ by a quantity growing with $k!$) by making use of the Gram–Hadamard inequality. Thus the $k!^{-1}$ that is in the definition of the values compensates the number of labels that one can put on the trees, and the number of Feynman graphs, which is also of order $k!$, is controlled by their representability as determinants that can be bounded without generating a $k!$ via the Hadamard inequality. The basic technique for achieving these bounds is well established after the work of Lesniewski (1987). A second nontrivial result is

Proposition 2 (short range and asymptotics of the beta function). Let $G^0 = (g, g, \ldots, g)$ with $g = (\lambda, \delta, \nu)$; then the function $\mathbf{B}_j(G^0)$ defines an analytic function of $g$, that we shall call “beta functional”, by setting
$$\beta(g) = \lim_{j \to -\infty} \mathbf{B}_j(G^0)$$ (2.23)
for $|g| < \varepsilon$. The limit is reached exponentially: $|\beta(g) - \mathbf{B}_j(G^0)| < \varepsilon^2 D e^{-\kappa|j|}$, for some $\kappa > 0$, $D > 0$, provided $|g| < \varepsilon$. Finally the key result (Benfatto and Gallavotti, 1990; Berretti et al., 1994) is

Proposition 3 (vanishing of the beta function). If $g = (\lambda, \delta, 0)$ then the function $\beta(g) = 0$ provided $|g| < \varepsilon$. Furthermore for some $D, \kappa > 0$ it is, for all $j \le 0$,
$$B_{j3}(g_{j+1}, \ldots, g_0) = \nu_{j+1}\lambda_{j+1}^2\, \bar B_{j3}(g_{j+1}, \ldots, g_0) + e^{\kappa j}\, \hat B_{j3}(g_{j+1}, \ldots, g_0),$$
$$|\bar B_{j3}(g_{j+1}, \ldots, g_0)| < D, \qquad |\hat B_{j3}(g_{j+1}, \ldots, g_0)| < D\varepsilon^2$$ (2.24)
provided, for $h = 0, \ldots, j + 1$, $|g_h| < \varepsilon$.

The above propositions are proved in Benfatto and Gallavotti (1990), Berretti et al. (1994) and Benfatto and Mastropietro (2000). The vanishing of $\beta(\lambda, \delta, 0)$ is proved in a rather indirect way. We proved that the function $\beta$ is the same for model (2.1) and for a similar model, the Luttinger model, which is exactly soluble but which can also be studied with the technique described above: and the only way the exactly soluble model results could hold is to have $\beta = 0$. The vanishing of the beta function seems to be a kind of Ward identity: it is easy to prove it directly if one is willing to accept a formal proof. This was pointed out, after the work
(Benfatto and Gallavotti, 1990), in other papers and it was believed to be true probably much earlier in some equivalent form, see Solyom (1979); note that the notion of the beta function is intrinsic to the formalism of the renormalization group and therefore a precise conjecture on it could not even be stated before the 1970s; but of course the existence and importance of infinitely many identities had already been noted.

Given the above propositions one shows that “things go as if” the recursion relation for the running couplings was, up to exponentially small corrections, a simple memoryless evolution $g_{j-1} = \Lambda g_j + \beta(g_j) + O(e^{-\kappa|j|})$: the propositions say in a precise way that this is asymptotically, as $j \to -\infty$, true. This tells us that the running couplings $\lambda_j, \delta_j$ stay constant (because $\beta_1, \beta_2$ vanish): however they in fact tend to a limit as $j \to -\infty$ exponentially fast because of the corrections in the above propositions, provided we can guarantee that also $\nu_j \to 0$ exponentially fast as $j \to -\infty$ and that the limits of $\lambda_j, \delta_j$ do not exceed $\varepsilon$ (so that the beta functionals and the beta function still make sense).

It is now important to recall that we can adjust the initial value of the chemical potential (see footnote 3). This freedom corresponds to the possibility of changing the chemical potential “correction” $\nu$ in (2.1) and tuning its value so that $\nu_h \to 0$ as $h \to -\infty$. Informally, if $\nu_0$ is chosen “too positive” then $\nu_j$ will grow (exponentially) in the positive direction (becoming larger than $\varepsilon$, a value beyond which the series that we are using become meaningless); if $\nu_0$ is chosen “too negative” then $\nu_j$ will also grow (exponentially), in the negative direction: so there is a unique choice such that $\nu_j$ can stay small (and, actually, it can be shown to converge to 0) (see footnote 4).

The vanishing of the beta function gives us the existence of a sequence of running couplings $g_j = (\lambda_j, \delta_j, \nu_j)$ which converge exponentially fast to $(\lambda_{-\infty}, \delta_{-\infty}, 0)$ as $j \to -\infty$ if $\nu_0$ is conveniently chosen: and one can prove that $\lambda_{-\infty}, \delta_{-\infty}, \nu_0$ are analytic in $\lambda$ for $\lambda$ small enough (Berretti et al., 1994). In this way one gets a convergent expansion of the Schwinger functions, which leads to an essentially complete theory of the one-dimensional Fermi gas with spin zero and short-range interaction.

3. The conceptual scheme of the renormalization group approach followed above

The above schematic exposition of the method is a typical example of how one tries to apply the multiscale analysis that is commonly called a “renormalization group approach”:

(1) One has series that are easily shown to be finite order by order, possibly provided that some free parameters are suitably chosen (“formal renormalizability theory”: this is the proof
Footnote 3. Which is a “relevant operator”, in the sense that if regarded as a running coupling it is roughly multiplied by 2 at each change of scale, i.e. $\nu_{j-1} \sim 2\nu_j$.

Footnote 4. A simplified analysis is obtained by “neglecting memory corrections”, i.e. using as a recursion relation $g_j = \Lambda g_{j+1} + \beta(g_{j+1})$ with $\beta(g)$ verifying (2.24): this gives that $(\lambda_j, \delta_j) \to (\lambda_{-\infty}, \delta_{-\infty})$ exponentially fast as $j \to -\infty$ and $\nu_j \to 0$ exponentially fast as $j \to -\infty$, provided $\nu_0$ is suitably chosen in terms of $\lambda_0, \delta_0$: otherwise everything diverges.
in Luttinger and Ward (1960)) that if the $\nu_h$ in (2.2) are suitably chosen we obtain a well-defined perturbation series in powers of $\lambda$.

(2) However the series, even when finite term by term, come with poor bounds which grow at order $k$ as $k!$ and which, nevertheless, are often nontrivial to obtain (although this is not so in case (2.1) discussed here, unlike the case discussed in the next sections).

(3) One then tries to reorganize the series by leaving the original parameters ($\lambda, \nu$ in the present case, as $\delta$ is fixed) as independent parameters and collecting terms together. The aim is to show that they become very convergent power series in a sequence of new parameters, the “running couplings” ($\lambda^{(h)}, \delta^{(h)}$ and $\nu^{(h)}$ in the present case), under the assumption that such parameters are small (they are functions, possibly singular, of the initial parameters of the theory, $\lambda, \nu$ in case (2.1), as $\delta$ has to be imagined to be 0).

(4) The running couplings, essentially by construction, also verify a recursion relation that makes sense, again under the assumption that the parameters are small. This relation allows us to express (if it makes sense) successively the running couplings in terms of the preceding ones: the running couplings are ordered into a sequence by “scale labels” $h = 1, 0, -1, -2, \ldots$. The recursion relation is interpreted as an evolution equation for a dynamical system (a map defined by the beta function(al)): it generates a “renormalization group trajectory” (the sequence $(\lambda_h, \delta_h, \nu_h)$ out of the original parameters $\lambda, \nu$ present in (2.1), as $\delta$ has to be taken as 0).

(5) One then shows that if the free parameters in the problem (i.e. $\lambda, \nu$ in (2.1)) are conveniently chosen, then the recursion relation implies that the trajectory stays bounded and small, thus giving a precise meaning to (2.2), and actually the limit relation $(\lambda^{(h)}, \delta^{(h)}, \nu^{(h)}) \to (\lambda_\infty, \delta_\infty, 0)$ as $h \to -\infty$ holds (this is achieved in the above fermionic problem by fixing $\nu$ as a suitable function of $\lambda$, see Berretti et al. (1994)).

(6) Hence the whole scheme is self-consistent and it remains to check that the expressions that one thus attributes to the sum of the series are indeed solutions of the problem that has generated them: not unexpectedly this is the easy part of the work, because we have always worked with formal solutions which “only missed, perhaps, to be convergent”.

(7) The first step, i.e. going to scale 0, is different from the others as the propagators have no ultraviolet cut-off (see the graph of $\chi^{(1)}$ in Fig. 2). Although there are no ultraviolet divergences, the control of this first step offers surprising difficulties (due to the fact that in the direction of $k_0$ the decay of the propagators is slow, making various integrals improperly convergent): the analysis is done in Berretti et al. (1994) and Gentile and Scoppola (1993).

Note that the above scheme leaves room for the possibility that the running couplings, rather than being analytic functions of a few of the initial free parameters, are singular: this does not happen in the above fermionic problem because some components of the beta function vanish identically: this is however a peculiarity of the fermionic models. In other applications to field theory, and particularly in the very first example of the method, which is the hierarchical model of Wilson, this is by far not the case and the perturbation series are not analytic in the running couplings but just asymptotic in the actual free parameters of the theory. The method however “reduces” the perturbation analysis to a recursion relation in small dimension (namely 3 in case (2.1)) which is also usually easy to treat heuristically. The d = 2 ground state fermionic problem (i.e. (2.1) in 2 space dimensions) provides, however, an example in which even the heuristic analysis is not easy.
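Step (5), together with the simplified memoryless analysis of footnote 4, can be illustrated by a toy computation. The sketch below is not the actual beta function of model (2.1): the recursion, the forcing term `lam * 2**(-n)` (standing in for exponentially decaying corrections), the factor `lam**2` and the function names `flow_nu` and `tune_nu0` are all hypothetical choices made only to show how a relevant coupling that is roughly doubled at each scale must be tuned, here by bisection, for the trajectory to stay small.

```python
def flow_nu(nu0: float, lam: float, steps: int) -> list:
    """Iterate a toy recursion nu -> (2 + lam**2) * nu + forcing.

    Step n stands for the scale j = -n.  The forcing lam * 2**(-n) is a
    hypothetical stand-in for exponentially small corrections.
    """
    growth = 2.0 + lam ** 2
    trajectory = [nu0]
    nu = nu0
    for n in range(steps):
        nu = growth * nu + lam * 2.0 ** (-n)
        trajectory.append(nu)
    return trajectory


def tune_nu0(lam: float, steps: int = 40, lo: float = -1.0,
             hi: float = 1.0, iterations: int = 200) -> float:
    """Bisect on nu0: 'too positive' runs off upwards, 'too negative' downwards."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if flow_nu(mid, lam, steps)[-1] > 0.0:
            hi = mid          # trajectory escaped in the positive direction
        else:
            lo = mid          # trajectory escaped in the negative direction
    return 0.5 * (lo + hi)


lam = 0.1
nu0 = tune_nu0(lam)
tuned = flow_nu(nu0, lam, 30)
print(nu0, max(abs(x) for x in tuned))
```

For this linear toy model the tuned value can be compared with the bounded solution of the recursion, which for many steps approaches $-\lambda/(3/2 + \lambda^2)$; in the real problem the analogous statement is that $\nu$ must be fixed as a suitable function of $\lambda$.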
In the following section we discuss another problem where the beta function does not vanish, but one can guarantee the existence of a bounded and small solution for the running couplings thanks to a “gauge symmetry” of the problem. This is an interesting case as the theory has no free parameters, so that it would not be possible to play on them to find a bounded trajectory for the renormalization group running constants. This also illustrates another very important mechanism that can save the method in case there seemed to be no hope for its use, namely a symmetry that magically eliminates all terms that one would fear to produce “divergences” in formal expansions. Again the case studied is far from the complexity of gauge field theory because it again leads to the result that the perturbation series itself is summable (unlike gauge field theories which can only yield asymptotic convergence): but it has the advantage of being a recognized difficult problem and therefore is a nice illustration of the role of symmetries in the resummation of (possibly) divergent series and the power of the renormalization group approach in dealing with complex problems.

4. The KAM problem

Consider $d$ rotators with angular momenta $A = (A_1, \ldots, A_d) \in R^d$ and positions $\alpha = (\alpha_1, \ldots, \alpha_d) \in T^d = [0, 2\pi]^d$; let $J > 0$ be their moment of inertia and suppose that $\varepsilon f(\alpha)$ is the potential energy in the configuration $\alpha$, which we suppose to be an even trigonometric polynomial (for simplicity) of degree $N$. Then the system is Hamiltonian with Hamiltonian function
$$H = \frac{1}{2J} A^2 + \varepsilon f(\alpha)$$ (4.1)
giving rise to a model called the “Thirring model” (see footnote 5). For $\varepsilon = 0$ motions are quasi-periodic (being $t \to (A_0, \alpha_0 + \omega_0 t)$ with $\omega_0 = J^{-1} A_0$) and their “spectrum” $\omega_0$ fills the set $S_0 \equiv R^d$ of all vectors $\omega_0$: there is a 1-to-1 correspondence between the spectra $\omega_0$ and the angular momenta $A_0$.

Question: If $\varepsilon \ne 0$ can we find, given $\omega_0 \in S_0$, a perturbed motion, i.e. a solution of the Hamilton equations of (4.1), which has spectrum $\omega_0$ and, as $\varepsilon \to 0$, reduces with continuity to the unperturbed motion with the same spectrum? Or, less formally: which among the possible spectra $\omega \in S_0$ survive perturbation?

Footnote 5.
(1) The global canonical transformations $C$ of $R^d \times T^d$ with generating functions $S(A, \alpha) = N A\cdot\alpha + \gamma(\alpha)\cdot A + \varphi(\alpha)$, parameterized by a nonsingular matrix $N$ with integer components and by analytic functions $g(\alpha), f(\alpha)$, leave invariant the class of Hamiltonians of the form $H = (A, M(\alpha)A)/2 + A\cdot g(\alpha) + f(\alpha)$. The subgroup $CL_d(R)$ of the global canonical coordinate transformations $C$ was (remarked and) used by Thirring, so that (4.1) is called the “Thirring model”, see Thirring (1983).
(2) The function $H_\varepsilon(\psi)$ in (4.2) must have zero average over $\psi$ or, if $\psi \to \psi + \omega_0 t$, over time: hence the surviving quasi-periodic motions can be parameterized by their spectrum $\omega_0$ or, equivalently, by their average action $A_0 = J\omega_0$. The “spectral dispersion relation” between the average action $A_0$ and the frequency spectrum is not twisted by the perturbation. Furthermore, the function $\varepsilon(\omega_0, J)$ can be taken monotonically increasing in $J$: $J^{-1}$ is called the “twist rate”. The latter two properties motivated the name of “twistless motions” given to the quasi-periodic motions of form (4.2) for Hamiltonians like (4.1).
(3) The invariance under the group $CL_d(R)$ has been used widely in the numerical studies of the best threshold value $\varepsilon(\omega_0, J)$, and a deeper analysis of this group would be desirable, particularly a theory of its unitary representations.
Analytically this means asking whether two functions $H_\varepsilon(\psi), h_\varepsilon(\psi)$ on $T^d$ exist, are divisible by $\varepsilon$ and are such that if $A_0 = J\omega_0$ and if we set
$$A = A_0 + H_\varepsilon(\psi), \qquad \alpha = \psi + h_\varepsilon(\psi), \qquad \psi \in T^d,$$ (4.2)
then $\psi \to \psi + \omega_0 t$ yields a solution of the equations of motion for $\varepsilon$ small enough. It is well known that in general only “nonresonant” spectra can survive: for instance (KAM theorem) those which verify
$$|\omega_0\cdot\nu|^{-1} < C|\nu|^{\tau} \qquad \forall\, \nu \in Z^d = \{\text{integer vectors}\},\ \nu \ne 0$$ (4.3)
for some $C, \tau > 0$ (Diophantine vectors), and we restrict for simplicity to such vectors. Furthermore, given $\omega_0$, the perturbation size $\varepsilon$ has to be small enough: $|\varepsilon| \le \varepsilon_0(\omega_0, J)$. To find $H_\varepsilon, h_\varepsilon$ we should solve the equation (setting $J \equiv 1$)
$$(\omega_0\cdot\partial_\psi)^2\, h(\psi) = -\varepsilon\,(\partial_\alpha f)(\psi + h(\psi))$$ (4.4)
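and if such $h$ is given then, setting $h_\varepsilon(\psi) = h(\psi)$, $H_\varepsilon(\psi) = (\omega_0\cdot\partial_\psi)h(\psi)$, one checks that (4.2) has the wanted property (i.e. $\psi \to \psi + \omega_0 t$ is a motion for (4.1)). To a trained eye (4.4) defines the expectation value of the 1-particle Schwinger function of the euclidean field theory on the torus $T^d$ for two vector fields $F^\pm(\psi) = (F_1^\pm(\psi), \ldots, F_d^\pm(\psi))$ whose free propagator is
$$\langle F_\ell^{\sigma}(\psi)\, F_m^{\sigma'}(\psi') \rangle = \delta_{\sigma,-\sigma'}\,\delta_{\ell,m} \sum_{\nu \ne 0} \frac{-1}{(2\pi)^d}\, \frac{e^{i(\psi - \psi')\cdot\nu}}{(\omega_0\cdot\nu)^2}$$ (4.5)
and the interaction Lagrangian is, see Gallavotti (1995),
$$L(F) = \varepsilon \int_{T^d} F^+(\psi)\cdot\partial_\alpha f(\psi + F^-(\psi))\, d\psi.$$ (4.6)
If $P_0(dF)$ is the “functional integral” defined by Wick's rule with propagator (4.5) then
$$h(\psi) = \frac{\int F^-(\psi)\, e^{\varepsilon \int_{T^d} F^+(\psi')\cdot\partial_\alpha f(\psi' + F^-(\psi'))\, d\psi'}\, P_0(dF)}{\int e^{\varepsilon \int_{T^d} F^+(\psi')\cdot\partial_\alpha f(\psi' + F^-(\psi'))\, d\psi'}\, P_0(dF)}.$$ (4.7)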
At first sight this is a “sick field theory”. Not only do the fields $F^\pm(\psi)$ not come from a positive definite propagator, hence (4.7) has to be understood as generating a formal expansion in $\varepsilon$ of $h$ with integrals over $F$ being defined by the Wick rule, but also the theory is nonpolynomial and naively nonrenormalizable. The “only” simplification is that the Feynman diagrams of (4.7) are (clearly) tree-graphs, i.e. loopless: this greatly simplifies the theory which, however, remains nonrenormalizable and nontrivial (being equivalent to a nontrivial problem). It is not difficult to work out the Feynman rules for the diagrams expressing the $k$th order coefficient of the power series expansion in $\varepsilon$ of (4.7). Consider a rooted tree with $k$ nodes: the branches are considered oriented towards the root, which is supposed to be reached by a single branch and which is not regarded as a node of the tree (hence the number of nodes and the number of branches are equal) (Fig. 5).
Fig. 5. A graph $\vartheta$ with $p_{v_0} = 2$, $p_{v_1} = 2$, $p_{v_2} = 3$, $p_{v_3} = 2$, $p_{v_4} = 2$ and $k = 12$, and some labels. The line numbers, distinguishing the lines, are not shown. The lines should all have the same length but are drawn of arbitrary size. The momentum flowing on the root line is $\nu = \nu(v_0)$ = sum of all the node momenta, including $\nu_{v_0}$.
We attach to each node (or “vertex”) $v$ of the tree a vector $\nu_v \in Z^d$, called a “mode label”, and to a line oriented from $v$ to $v'$ we attach a “current” or “momentum” $\nu(v)$ and a “propagator” $g(\omega_0\cdot\nu(v))$:
$$\nu(v) \stackrel{\rm def}{=} \sum_{w \le v} \nu_w, \qquad g(\omega_0\cdot\nu) \stackrel{\rm def}{=} (\omega_0\cdot\nu)^{-2};$$ (4.8)
if $v$ is the “first node” of the tree, see $v_0$ in Fig. 5, then the momentum $\nu(v)$ is called the total momentum of the tree. One defines the value ${\rm Val}(\vartheta)$ of a tree $\vartheta$ decorated with the described labels as
$${\rm Val}(\vartheta) = k!^{-1} \prod_{v \in {\rm nodes}} \frac{\nu_{v'}\cdot\nu_v\, f_{\nu_v}}{(\omega_0\cdot\nu(v))^2},$$ (4.9)
where the $f_\nu$ are the Fourier coefficients of the perturbation $f(\alpha) = \sum_\nu f_\nu e^{i\nu\cdot\alpha}$, with $|\nu| \le N$, $f_\nu = f_{-\nu}$, and $v'$ denotes the node following $v$; if $v$ is the first node of the tree (so that $v'$ would be the root, which by our conventions is not a vertex) then $\nu_{v'}$ is some unit vector $e$, see Fig. 3. Given the above Feynman rules, which one immediately derives from (4.6), (4.7), the component along the vector $e$, labeling the root, of the $k$th order Fourier coefficient $h_\nu^{(k)}$ of the function $h(\psi)\cdot e$, which we write as
$$h_\nu\cdot e = \varepsilon\, h_\nu^{(1)}\cdot e + \varepsilon^2 h_\nu^{(2)}\cdot e + \cdots,$$ (4.10)
is simply the sum of the values ${\rm Val}\,\vartheta$ over all trees $\vartheta$ which have total momentum $\nu$, $k$ nodes, and no branch carrying 0-momentum. One can check directly that $h(\psi)$ so defined is a formal solution of (4.4): the series (4.10) with the coefficients defined as above is called the Lindstedt series of the KAM problem: it was introduced, at least as a method for computing the low order coefficients $h^{(k)}$, by Lindstedt and Newcomb in celestial mechanics problems and it was shown to be possible to all orders by Poincaré (one has to show that the algorithm generating the series does not produce graphs with branches carrying 0 momentum which, by (4.9), would yield meaningless expressions for the corresponding tree values), see (Poincaré, 1987, Vol. 3).
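The bookkeeping of momenta in (4.8) can be made concrete with a small sketch. It assumes a hypothetical encoding of a tree as a `parent` map (the first node has parent `None`, its line going to the root) and a `modes` map of integer mode labels; the function names and the sample tree are illustrative only. The sketch computes the momentum $\nu(v)$ flowing on each line and the product of the propagators $(\omega_0\cdot\nu(v))^{-2}$; it deliberately leaves out the numerators and the $k!^{-1}$ of (4.9).

```python
import math


def line_momenta(parent: dict, modes: dict) -> dict:
    """Return nu(v): the sum of the mode labels nu_w over all nodes w <= v,
    i.e. over v itself and every node whose path to the root passes through v."""
    dim = len(next(iter(modes.values())))
    momenta = {v: [0] * dim for v in modes}
    for w, mode in modes.items():
        v = w
        while v is not None:              # walk from w towards the root
            for i in range(dim):
                momenta[v][i] += mode[i]
            v = parent[v]
    return {v: tuple(m) for v, m in momenta.items()}


def propagator_product(parent: dict, modes: dict, omega0: tuple) -> float:
    """Product over all lines of g(omega0 . nu(v)) = (omega0 . nu(v))**-2."""
    product = 1.0
    for v, nu in line_momenta(parent, modes).items():
        divisor = sum(o * n for o, n in zip(omega0, nu))
        if divisor == 0.0:
            raise ValueError(f"line leaving {v} carries zero momentum")
        product *= divisor ** -2
    return product


# Illustrative data: golden-mean rotation vector and a 3-node tree.
golden = (math.sqrt(5.0) - 1.0) / 2.0
omega0 = (1.0, golden)
parent = {"v0": None, "v1": "v0", "v2": "v0"}
modes = {"v0": (1, 0), "v1": (0, 1), "v2": (1, -1)}

momenta = line_momenta(parent, modes)
value = propagator_product(parent, modes, omega0)
print(momenta, value)
```

A tree containing a line with zero momentum is rejected, mirroring the requirement that no branch carries 0-momentum.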
Fig. 6. A tree of order $k$ with momentum $\nu = \nu_1 + \cdots + \nu_{k/3}$ and value of size of order $(k/3)!^{\tau}$ if $\nu = \nu_1 + \cdots + \nu_{k/3}$ is “as small as it can possibly be”, or is “almost resonant”, i.e. such that $\omega_0\cdot\nu \sim C^{-1}|\nu|^{-\tau}$. The tree has $k/3 + 1$ branches carrying momentum $\nu = \sum_{i=1}^{k/3} \nu_i$, i.e. the $k/3 + 1$ horizontal branches. The last $k/3 - 1$ branches have momenta $\nu_i$, $i = 2, \ldots, k/3$, so arranged that their sum plus $\nu_1$, i.e. $\nu$, is “almost resonant”.
The number of trees of order $k$ that do not differ only by the labeling of the lines (i.e. that are topologically the same) is bounded exponentially in $k$, while the total number of trees (labels included) is, therefore, of order $k!$ times an exponential in $k$. Therefore, taking into account the $k!^{-1}$ in (4.9), we see that the perturbation series might have convergence problems only if there exist individual graphs whose value is too large, e.g. $O(k!^{\gamma})$ for some $\gamma > 0$. Such graphs do exist; an example is Fig. 6. Therefore, even though the theory is loopless and its perturbation series is well defined to all orders, yet it is nontrivial because the $k$th order might be “too large”. This is a typical situation in “infrared divergences” due to too large propagators: in fact the “bad” graphs (like the one in Fig. 5) are such because $(\omega\cdot\nu)^{-2}$, i.e. the propagator, is too large.

It is remarkable that the same strategy used in the analysis of the Fermi gas theory, i.e. the renormalization group approach outlined in general terms in Section 3, works in this case. We decompose the propagator as
$$\frac{1}{(\omega_0\cdot\nu)^2} = \sum_{h=-\infty}^{1} \frac{\chi^{(h)}(\omega_0\cdot\nu)}{(\omega_0\cdot\nu)^2} = \sum_{h=-\infty}^{1} 2^{-2h}\, g^{(h)}(2^{-h}\omega_0\cdot\nu),$$ (4.11)
where if $h \le 0$ we have set $\chi^{(h)}(x) \stackrel{\rm def}{=} \chi(2^{-h}x)$ with $\chi(x) = 0$ unless $x$ is in the interval $2^{-1} < |x| \le 1$, where $\chi(x) \equiv 1$, and $\chi^{(1)}$ is defined to be identically 1 for $|x| > 1$ and 0 otherwise, so that $1 \equiv \sum_{h=-\infty}^{1} \chi^{(h)}(x)$: see Fig. 2 and (2.9). Given a Feynman graph, i.e. a tree $\vartheta$ (with the decorating labels), one replaces $g(\omega_0\cdot\nu)$ by the last sum in (4.11) and obtains trees with branches bearing a “scale label” (see footnote 6).

Footnote 6. In this problem there is no need to use functions $\chi^{(h)}$ which are not as in Fig. 4, i.e. with smoothed out discontinuities: there is, however, a minor difficulty also in this case. The decomposition above, (4.11), can be done exactly as written only if $\omega_0 \in R^d$ is outside a certain set of zero volume in $R^d$. Although the Diophantine property (4.3) already holds only outside a set of zero volume in $R^d$, as is well known, the set of $\omega_0 \in R^d$ for which what follows can be done literally as described is slightly smaller (although still with a complement of zero volume). However the following discussion can be repeated under the only condition that $\omega_0$ verifies (4.3), provided one does not insist on taking a sequence of scales that are exactly equal to $2^h$, $h = 1, 0, -1, \ldots$, and one takes a sequence of scales that have bounded but suitably variable successive ratios. Here we ignore this problem, see Gallavotti (1994a) for the cases that work as described here (first reference) and for the general case (second reference) discussed under the only “natural” condition (4.3).
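The assignment of a scale label to a line can be sketched as follows, assuming the sharp cut-offs described above: $\chi^{(1)}$ selects $|x| > 1$ and, for $h \le 0$, $\chi^{(h)}$ selects $2^{h-1} < |x| \le 2^h$. The function names `chi` and `scale_label`, and the sample rotation vector and mode, are hypothetical and serve only as an illustration.

```python
import math


def chi(h: int, x: float) -> int:
    """Sharp cut-off: chi^(1) selects |x| > 1; for h <= 0,
    chi^(h) selects 2**(h-1) < |x| <= 2**h."""
    ax = abs(x)
    if h == 1:
        return 1 if ax > 1.0 else 0
    return 1 if 2.0 ** (h - 1) < ax <= 2.0 ** h else 0


def scale_label(x: float) -> int:
    """Return the unique h <= 1 with chi(h, x) == 1, for x != 0."""
    ax = abs(x)
    if ax == 0.0:
        raise ValueError("a zero divisor carries no scale")
    if ax > 1.0:
        return 1
    # ax = mantissa * 2**exponent with 0.5 <= mantissa < 1
    mantissa, exponent = math.frexp(ax)
    # An exact power of two sits at the closed upper end of its interval.
    return exponent - 1 if mantissa == 0.5 else exponent


# Illustrative small divisor: golden-mean rotation vector, mode (-3, 5).
golden = (math.sqrt(5.0) - 1.0) / 2.0
divisor = 1.0 * (-3) + golden * 5
print(divisor, scale_label(divisor))
```

Since exactly one cut-off is nonzero for each $x \ne 0$, summing `chi(h, x)` over all scales reproduces the partition of unity used in (4.11).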
We collect the graph lines into clusters of scale $\ge h$ with at least one line of scale $h$, and define the order $n$ of the cluster to be the number of nodes in it. It is not difficult, see Pöschel (1986) and Gallavotti (1994a), to see that only graphs which contain one incoming and one outgoing line with the same momentum (hence same scale) are the source of the problem: for instance, in the case of the graph in Fig. 4, all horizontal lines have the same momentum of scale $h\simeq\log k^{-\tau}$, while all the other lines have scale $O(1)$ because $|\nu_j|\le n$.

The subgraphs that contain one incoming and one outgoing line with the same momentum of scale $h\le 0$ have been called resonances in Eliasson (1996), perhaps not very appropriately given the meaning usually associated with the word resonance, but we adopt here the nomenclature of that breakthrough work. The subgraphs which would be resonances but have maximal scale $h=1$ are not considered resonances. Therefore, it is natural to collect together all graphs which contain chains of $1,2,3,\ldots$ clusters with equal incoming and outgoing momenta on scale $h\le 0$. For instance, the graph in Fig. 4 contains a chain of $k/3$ clusters, each containing a single line (namely the lines with $\nu_0,-\nu_0$ modes at their extremes), one cluster with $k/3-1$ lines (the lines entering the node $2k/3+1$), and $k/3+1$ lines external to the resonant clusters, the horizontal lines. Call

$$D^{(h)}(\nu)=\text{sum of }\varepsilon^{n}\text{ times the value of the clusters with }n\text{ nodes and single incoming and outgoing lines of equal momentum and scale }h, \tag{4.12}$$
which can be given meaning because, disregarding the incoming line, we can regard the subcluster with one entering and one exiting line as a tree $\vartheta$ (or subtree) with root at the node where the exiting line ends, so that its value $D^{(h)}(\nu)$ will be naturally defined as the value of $\vartheta$ times $(\omega_0\cdot\nu)^{-2}$ (which takes into account the propagator of the line entering the cluster). This leads to a rearrangement of the series for $\mathbf{h}$ in which the propagator of a line with momentum $\nu$ on scale $h$ is $D^{(h)}(\nu)$ rather than $(\omega_0\cdot\nu)^{-2}\chi^{(h)}(\omega_0\cdot\nu)$, and there are no more clusters with one incoming and one outgoing line of equal momentum. Of course the graphs with $k$ nodes give a contribution to $\mathbf{h}$ which is no longer proportional to $\varepsilon^{k}$, because they contain quantities $D^{(h)}(\nu)$ which are (so far formally) power series in $\varepsilon$. The same argument invoked above (Pöschel, 1986; Gallavotti, 1994a) gives again that if, for some constant $R$ and for all $\varepsilon$ small enough, one could suppose that

$$|D^{(h)}(\nu)|\le R\,(\omega_0\cdot\nu)^{-2}, \tag{4.13}$$
then the series for $\mathbf{h}$ would be convergent for $\varepsilon$ small enough. Furthermore, if (4.13) holds for scales $0,-1,\ldots,h+1$, then one sees that there exist $a_h,b_h,r_h$ such that

$$D^{(h)}(\nu)=\frac{a_h}{(\omega_0\cdot\nu)^{4}}+\frac{b_h}{(\omega_0\cdot\nu)^{3}}+(1+r_h(\nu))\,\tilde g^{(h)}(\omega_0\cdot\nu) \tag{4.14}$$

with $|r_h|<R\varepsilon$, $|\tilde g^{(h)}(\omega_0\cdot\nu)|<N^{2}(\omega_0\cdot\nu)^{-2}$, and we see therefore that $a_h,b_h$ play the role of “running couplings”: they are nontrivial functions of $\varepsilon$, but they are the only quantities to control, because if we can show that they are bounded by $\varepsilon R$, say, then the convergence of the perturbation series would be under control, since we are reduced essentially to the case in which no graphs with resonances are present.
Naturally, by the principle of conservation of difficulties, the quantities $a_h$ and $b_h$ are given by power series in $\varepsilon$ whose convergence is a priori difficult to ascertain. As in the case of the Fermi gas, setting $c_h=(a_h,b_h)$, the very definition of such constants in terms of sums of infinitely many Feynman graphs implies that they verify a recursion relation

$$c_h=\Lambda c_{h+1}+B_h(c_{h+1},\ldots,c_0),\qquad h\le 0, \tag{4.15}$$

where $\Lambda$ is a $2\times 2$ diagonal matrix with diagonal elements $2^{2},2$, and $c_1\equiv 0$ because by definition $D^{(h)}\ne g^{(h)}$ only for $h\le 0$. The analogy with the previous Fermi system problem seems quite strong. The function $B_h(\{c_{h'}\}_{h'=h+1,\ldots,0})$ is not homogeneous in the $c_h$: this is an important difference with respect to the previous fermionic case.

Clearly, if $a_h,b_h\ne 0$ for some $h$ we are “lost”: although, under the boundedness condition (4.13) on the running couplings, the beta function $B_h$ is well defined by a convergent series, and although the whole rearranged perturbation series is convergent under the same condition, we shall not be able to prove that the running couplings stay small so that (4.13) is self-consistent. In fact both $a_h,b_h$ are “relevant couplings” (because the elements of the matrix $\Lambda$ are $>1$), and the two data $a_1,b_1$ (which are 0 because, by definition, there are no resonances of scale 1) would need to be carefully tuned so that the renormalization group trajectory $a_h,b_h$ that (4.15) generates from $a_1,b_1$ is bounded: there are, however, no free parameters to adjust in (4.1)!

The situation is very similar to the one met in the Fermi liquid theory: in that case one solves the problem by showing that the beta function vanishes, at least asymptotically, for the marginal couplings $\lambda_h,\delta_h$; there remains the relevant coupling $\nu_h$, which can be bounded only because we have in that problem the freedom of “adjusting” a free parameter (the chemical potential $\mu$). In the present case we have two relevant parameters, $a_h,b_h$, and no free parameter in the Hamiltonian: the situation would be hopeless unless it just happened that the correct initial data for a bounded renormalization group trajectory were precisely the ones that we have, namely $a_1=b_1=0$. This means that one should prove the identity $B_h(0)\equiv 0$ for all $h\le 0$, so that $c_h\equiv 0$ for all $h\le 1$.
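The role of the relevant directions in (4.15) can be made concrete with a small Python sketch built on a toy beta function. This is not the actual $B_h$ of the KAM problem: `beta_vanishing` and `beta_inhomogeneous` are hypothetical stand-ins, chosen only to contrast the case $B_h(0)=0$ with the case $B_h(0)\neq 0$, and all names and the value `eps = 0.1` are assumptions.

```python
def rg_trajectory(beta, n_steps, lam=(4.0, 2.0)):
    """Iterate c_h = Lambda c_{h+1} + beta(history), starting from c_1 = (0, 0).

    `history` lists the couplings from c_1 down to c_{h+1}; the returned list
    has n_steps + 1 entries, the last one being c_{1 - n_steps}.
    """
    history = [(0.0, 0.0)]  # c_1 = 0: no resonances on scale 1
    for _ in range(n_steps):
        a_prev, b_prev = history[-1]
        da, db = beta(history)
        history.append((lam[0] * a_prev + da, lam[1] * b_prev + db))
    return history

def beta_vanishing(history, eps=0.1):
    """Toy beta function with B(0) = 0 (hypothetical, for illustration only)."""
    a, b = history[-1]
    return (eps * (a * a + a * b), eps * b * b)

def beta_inhomogeneous(history, eps=0.1):
    """Toy beta function with a constant term, so that B(0) != 0 (hypothetical)."""
    return (eps, eps)

if __name__ == "__main__":
    print(rg_trajectory(beta_vanishing, 5)[-1])
    print(rg_trajectory(beta_inhomogeneous, 5)[-1])
```

In this toy model the trajectory starting at $c_1=0$ stays at zero when the beta function vanishes at the origin, while any constant term is amplified by the factors $2^2$ and $2$ at every step. With no free parameter available to retune the initial data, only the first situation is compatible with (4.13).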
The vanishing of the beta function at $c=0$ was understood in Eliasson (1996) and it can be seen in various ways, see also Gallavotti (1994a). It is however always based on symmetry properties of the model (the choice of the origin of the angles $\alpha$ plays the role of a “gauge symmetry”: the interpretation of the cancellations used by Eliasson (1996) as a consequence of a gauge symmetry was pointed out and clearly stated in Bricmont et al. (1999), although of course the symmetry is used in the analysis in Eliasson (1996) and in great detail in Gallavotti (1994a)). We are back to a very familiar phenomenon in field theory: a nonrenormalizable theory becomes asymptotically free, and in fact analytic in the parameter $\varepsilon$ measuring the strength of the perturbation, because of special symmetries which forbid the exponential growth of the relevant couplings in the absence of free parameters in the Lagrangian of the model which could possibly be used to control them.

Concerning the originality of the results obtained with the techniques exposed in this paper, the following comments may give an idea of the status of the matter:

(1) The KAM theory presented above is only a reinterpretation of the original proofs (Eliasson, 1996), following Gallavotti (1994a) and Gentile and Mastropietro (1996a); see also Bricmont et al. (1999). However, even in classical mechanics the method has generated new
results. Other problems of the same type can be naturally interpreted in terms of renormalization group analysis of suitable quantum fields, see Gentile and Mastropietro (1996a, b), as can problems in the theory of the separatrix splitting, see Gallavotti (1995) and Gallavotti et al. (1999).

(2) The results on the theory of the Fermi systems were obtained for the first time by the method described above (including the vanishing of the beta function, Benfatto and Gallavotti (1990)) and have led to the understanding of several other problems (Bonetto and Mastropietro, 1995; Benfatto et al., 1997; Benfatto and Mastropietro, 1999; Mastropietro, 1997, 1998a, b, 1999).

5. Uncited references

Berretti and Gentile, 1999; Lieb and Mattis, 1966; Luttinger, 1963; Polchinski, 1992; Gallavotti, 1994b; Polchinski, 1984.

Acknowledgements

I am grateful to the Organizing committee of the conference “Renormalization Group 2000” for giving me the opportunity to write and present this review, and for travel support to the conference held in Taxco, Mexico. I am indebted to G. Gentile and V. Mastropietro for precious comments on the draft of this paper.

References

Benfatto, G., Gallavotti, G., 1990. Perturbation theory of the Fermi surface in a quantum liquid. A general quasi particle formalism and one dimensional systems. J. Stat. Phys. 59, 541–664.
Benfatto, G., Gallavotti, G., 1995. Renormalization Group. Princeton University Press, Princeton, NJ, pp. 1–144.
Berretti, A., Gallavotti, G., Procacci, A., Scoppola, B., 1994. Beta function and Schwinger functions for a many body system in one dimension. Anomaly of the Fermi surface. Commun. Math. Phys. 160, 93–172.
Berretti, A., Gentile, G., 1999. Scaling properties of the radius of convergence of a Lindstedt series: the standard map. J. Math. 78, 159–176. And: Scaling properties of the radius of convergence of a Lindstedt series: generalized standard map, preprint, 1999.
Bricmont, J., Gawedzki, K., Kupiainen, A., 1999. KAM theorem and quantum field theory. Commun. Math. Phys. 201, 699–727.
Benfatto, G., Gentile, G., Mastropietro, V., 1997. Electrons in a lattice with incommensurate potential. J. Stat. Phys. 89, 655–708.
Benfatto, G., Mastropietro, V., 2000. A renormalization group computation of the spin correlation functions in the XYZ model. Preprint, U. Roma 2, January, pp. 1–77.
Bonetto, F., Gallavotti, G., Gentile, G., Mastropietro, V., 1998. Quasi linear flows on tori: regularity of their linearization. Commun. Math. Phys. 192, 707–736. And Bonetto et al., 1998. Lindstedt series, ultraviolet divergences and Moser’s theorem. Annali della Scuola Normale Superiore di Pisa 26, 545–593. A review in: Gallavotti, G., 1997. Methods in the theory of quasi periodic motions. In: Spigler, R., Venakides, S. (Eds.), Proceedings of Symposia in Applied Mathematics, Vol. 54. American Mathematical Society, Providence, RI, pp. 163–174.
Bonetto, F., Mastropietro, V., 1995. Beta function and anomaly of the Fermi surface for a d = 1 system of interacting fermions in a periodic potential. Commun. Math. Phys. 172, 57–93. See also Bonetto, F., Mastropietro, V., 1995. Filled band Fermi systems. Math. Phys. Electron. J. 2, 1–43, 1996. And Bonetto, F., Mastropietro, V., 1997. Critical indices in a d = 1 filled band Fermi system. Phys. Rev. B 56, 1296–1308.
Eliasson, H., 1996. Absolutely convergent series expansions for quasi-periodic motions. Math. Phys. Electron. J. 2.
Gallavotti, G., 1985. Renormalization theory and ultraviolet stability via renormalization group methods. Rev. Mod. Phys. 57, 471–569.
Gallavotti, G., 1994a. Twistless KAM tori. Commun. Math. Phys. 164, 145–156. And Gallavotti, G., Gentile, G., 1995. Majorant series convergence for twistless KAM tori. Ergodic Theory Dynamical Systems 15, 857–869.
Gallavotti, G., 1994b. Twistless KAM tori, quasi flat homoclinic intersections, and other cancellations in the perturbation series of certain completely integrable Hamiltonian systems. A review. Rev. Math. Phys. 6, 343–411.
Gallavotti, G., 1995. Invariant tori: a field theoretic point of view on Eliasson’s work. In: Figari, R. (Ed.), Advances in Dynamical Systems and Quantum Physics. World Scientific, Singapore, pp. 117–132.
Gallavotti, G., Gentile, G., Mastropietro, V., 1995. Field theory and KAM tori. Math. Phys. Electron. J. 1, pp. 1–9 (http://mpej.unige.ch).
Gallavotti, G., Gentile, G., Mastropietro, V., 1999. Separatrix splitting for systems with three time scales. Commun. Math. Phys. 202, 197–236. And Gallavotti et al., 1999. Melnikov’s approximation dominance. Some examples. Rev. Math. Phys. 11, 451–461.
Gentile, G., Mastropietro, V., 1996a. KAM theorem revisited. Physica D 90, 225–234. And Gentile, G., Mastropietro, V., 1995. Tree expansion and multiscale analysis for KAM tori. Nonlinearity 8, 1159–1178.
Gentile, G., Mastropietro, V., 1996b. Methods for the analysis of the Lindstedt series for KAM tori and renormalizability in classical mechanics. A review with some applications. Rev. Math. Phys. 8, 393–444.
Gentile, G., Scoppola, B., 1993. Renormalization group and the ultraviolet problem in the Luttinger model. 154, 135–179.
Lesniewski, A., 1987. Effective action for the Yukawa$_2$ quantum field theory. Commun. Math. Phys. 108, 437–467.
Luttinger, J., Ward, J., 1960. Ground state energy of a many fermion system. Phys. Rev. 118, 1417–1427.
Lieb, E., Mattis, D., 1966. Mathematical Physics in One Dimension. Academic Press, New York.
Mattis, D., Lieb, E., 1965. Exact solution of a many fermions system and its associated boson field. J. Math. Phys. 6, 304–312. Reprinted in Lieb and Mattis (1966).
Luttinger, J., 1963. An exactly soluble model of a many fermion system. J. Math. Phys. 4, 1154–1162.
Mastropietro, V., 1997. Small denominators and anomalous behaviour in the uncommensurate Hubbard-Holstein model. mp_arc #97-652.
Mastropietro, V., 1998a. Renormalization group for the XYZ model. FM 98-13, http://ipparco.roma1.infn.it.
Mastropietro, V., 1998b. Renormalization group for the Holstein Hubbard model. FM 98-12, http://ipparco.roma1.infn.it.
Mastropietro, V., 1999. Anomalous BCS equation for a Luttinger superconductor. FM 99-1, http://ipparco.roma1.infn.it.
Poincaré, H., 1987. Les Méthodes nouvelles de la mécanique céleste, 1892. Reprinted by Blanchard, Paris.
Pöschel, J., 1986. Invariant manifolds of complex analytic mappings. In: Osterwalder, K., Stora, R. (Eds.), Les Houches, XLIII (1984), vol. II. North-Holland, Amsterdam, pp. 949–964.
Polchinski, J., 1992. Effective field theory and the Fermi surface, University of Texas, preprint UTTC-20-92.
Polchinski, J., 1984. Renormalization group and effective Lagrangians. Nucl. Phys. B 231, 269–295.
Thirring, W., 1983. Course in Mathematical Physics, Vol. 1. Springer, Wien, p. 133.
Solyom, J., 1979. The Fermi gas model of one dimensional conductors. Adv. Phys. 28, 201–303.
Wilson, K.G., 1970. Model of coupling constant renormalization. Phys. Rev. D 2, 1438–1472.
Wilson, K.G., Fisher, M., 1972. Critical exponents in 3.99 dimensions. Phys. Rev. Lett. 28, 240–243.
Wilson, K.G., 1973. Quantum field theory models in less than four dimensions. Phys. Rev. D 7, 2911–2926.
Wilson, K.G., Kogut, J.B., 1974. The renormalization group and the ε-expansion. Phys. Rep. 12, 76–199.
Physics Reports 352 (2001) 273–437
Renormalization group for one-dimensional fermions. A review on mathematical results

Guido Gentile (a,*), Vieri Mastropietro (b)

(a) Dipartimento di Matematica, Università di Roma Tre, I-00146 Roma, Italy
(b) Dipartimento di Matematica, Università di Roma “Tor Vergata”, I-00139 Roma, Italy

Received March 2001; editor: I. Procaccia
Contents

1. Introduction
1.1. A general overview of the state of the art
1.2. More recent results
1.3. Contents
2. One-dimensional interacting Fermi systems
2.1. Free systems
2.2. Interaction with the lattice
2.3. Interaction between the electrons
2.4. Interaction with the phonons
2.5. Spin-Hamiltonians
2.6. General interacting systems
2.7. Other Hamiltonian models
3. Schwinger functions and physical observables
3.1. Definition
3.2. Physical relevance
3.3. Schwinger functions for free systems
4. Fermionic functional integrals
4.1. Grassman integrals and truncated expectations
4.2. Truncated expectations and Feynman diagrams
5. The multiscale decomposition and power counting
5.1. Tree expansion
5.2. Clusters
5.3. Values of Feynman diagrams
5.4. Power counting
5.5. Comparison with Wilson’s method
6. Nonperturbative estimates for the nonrenormalized expansion
6.1. Kernels of the effective potentials
6.2. Proof of (6.12)
7. Schwinger functions as Grassman integrals
7.1. Perturbation theory and Euclidean formalism
7.2. Feynman graphs and origin of divergences
8. The Holstein–Hubbard model: a paradigmatic example
8.1. The model
8.2. Effective potentials
8.3. Renormalization
8.4. Renormalized trees
8.5. Renormalized bounds
8.6. Anomalous integration
8.7. Bounds for the renormalized expansion
9. Relationship between lattice and continuum models
9.1. Continuum models
9.2. The ultraviolet problem
9.3. The Luttinger model and the ultraviolet problem
10. Hidden symmetries and flow equations
11. Vanishing of the Luttinger model beta function
12. The two-point Schwinger functions
13. Two-point Schwinger functions for spinless fermions
13.1. Free fermions
13.2. Noninteracting fermions in a periodic potential
13.3. Noninteracting fermions in a quasi-periodic potential
13.4. Interacting spinless fermions
13.5. Interacting spinless fermions with a periodic potential in the not filled band case
13.6. Interacting spinless fermions with a periodic potential in the filled band case
13.7. Interacting spinless fermions with an incommensurate potential
13.8. Open problems
14. Density–density response function
14.1. The expansion
14.2. The results
15. Approximate Ward identities
16. Spin chains
17. Spinning fermions
17.1. The repulsive case
17.2. The attractive case
18. Fermions interacting with phonon fields
18.1. Interaction with a quantized phonon field
18.2. Classical limit: the static Holstein model
19. The variational Holstein model
19.1. Old results
19.2. New results
20. Coupled Luttinger liquids
21. Bidimensional Fermi liquids
Uncited references
Appendix A
A.1. Graphs, diagrams and trees
A.2. Discrete versus continuum
A.3. Truncated expectations and Gram–Hadamard inequality
A.4. Dimensional bounds
A.5. Diophantine numbers
A.6. Some technical results
References

PACS: 05.10.Cc

* Corresponding author.

© 2001 Elsevier Science B.V. All rights reserved. 0370-1573/01/$ - see front matter. PII: S0370-1573(01)00041-2
G. Gentile, V. Mastropietro / Physics Reports 352 (2001) 273–437
1. Introduction

1.1. A general overview of the state of the art

The study of one-dimensional nonrelativistic interacting Fermi systems has attracted vast interest over the years among physicists and mathematicians. The mathematical interest is motivated by the possibility, due to the low dimensionality, of obtaining some rigorous nontrivial results about such systems (by contrast, up to now this is almost impossible in higher dimensions). The physical motivations arise from the fact that such systems can model some real materials, like organic anisotropic compounds. A new wave of interest among physicists was generated in 1990 by the Anderson theory of high-$T_c$ superconductivity [1], which relies on the assumption that the physics of two-dimensional interacting Fermi systems is somehow similar to the physics of one-dimensional ones.

As far as rigorous results are concerned, they fall mainly into two classes: results obtained by exact solutions and results obtained by the study of the Schrödinger equation. An excellent review of exact solutions is in [3]: we can recall the classical exact solutions of the Luttinger model [4], of the Hubbard model [5], and of the spin chains (which can be seen as interacting Fermi systems by performing a Jordan–Wigner transformation) like the XY model [6], the XXY model [7], and the XYZ model [8]. Such solutions are truly nonperturbative, as they hold also for large coupling, and are based mainly on (rigorous) bosonization, the Bethe ansatz or the transfer matrix method. However, a limitation of such solutions is that they cannot be extended to other models, even to very similar ones, as they depend crucially on the fine details of the models. Moreover, with the remarkable exception of the Luttinger model, the exact solutions provide detailed information on the Hamiltonian spectrum, but it is generally not possible to derive from them the correlations (in terms of which the physical observables are expressed).

In a completely different framework, other rigorous results about one-dimensional Fermi systems in an external field can be derived by the analysis of the one-dimensional single-particle Schrödinger equation (see for instance [9] or [10] for reviews). We can mention [11] for the periodic potential, [12,13] for the stochastic potential and [14–16] for the quasi-periodic potential. From such results about the spectrum of the Schrödinger equation one can obtain, in principle, the asymptotic behaviour of the correlation functions for a system of fermions in an external field; this is however nontrivial in general (one has to use some properties of the wave functions in the complex plane) and, as far as we know, it has been done only in the case of random potentials in [17] and in the case of periodic potentials in [18].

It is very difficult to review the large number of works in the physics literature about one-dimensional Fermi systems (we can refer to the classical [19] or to the more recent [20–22]). Many results are found by third-order multiplicative renormalization group [19], but the relevance of the higher-order terms and the validity of the third-order approximation are not clear. Moreover, such methods can be applied only to models with linear dispersion relations (so not really fundamental ones) and only if there is no lattice and if the volume is infinite. Such limitations are particularly annoying as they make a detailed comparison with numerical simulations difficult. Other results are found by the “bosonization” techniques, in which the ultraviolet problem is not treated in a consistent way, so that an extra parameter, not present in the original model, appears in the expressions found for the correlations, see [23]; this means that such expressions can be, in any case, only approximately true. While it is likely that many of the physical conclusions are valid, the lack of distinction between rigorous results and results not really proved at a mathematical level generally makes the dialogue between theoretical physicists, mathematical physicists and mathematicians working more or less on the same problems very difficult. We mention finally the approach based on conformal quantum field theory, see for instance [24]. This approach is quite powerful as it can provide the critical indices, but it can generally be applied only to models for which exact solutions are possible.
1.2. More recent results

In this work, we shall review what is known at a rigorous level about the correlation functions of many (generally not soluble) models of interacting one-dimensional Fermi systems. A main novelty (with respect to the framework briefly described in Section 1.1) was the application (starting from [25,26]) to solid state models of the techniques based on the rigorous implementation of the Wilsonian renormalization group [27], developed in the context of constructive quantum field theory (see [28–31] for reviews): this was a quite natural development, as field theory methods have been applied to solid state physics for many years (see for instance [32]). Such techniques allow one, in principle, to express the correlation functions of a quantum field theory describing Fermi systems as convergent series (even if they are generally nonanalytic in the perturbative parameters). One of the first realizations of this was the theory of the Gross–Neveu model (a system of relativistic one-dimensional fermions) developed in [33–35].

The application of such techniques to one-dimensional nonrelativistic Fermi systems was originally discussed in [25,36–38,18,39–46], and it will be the main content of the present review. The result is that the correlation functions of many not soluble models can be written as convergent series, in the weak coupling regime, and such expressions provide all the information one is interested in.
1.3. Contents

The aim of this paper is, on the one hand, to review in a systematic way results spread out in a number of works and, on the other, to provide the technical tools necessary to read the original papers.

The physical observables are expressed in terms of Schwinger functions, which in turn are expressed by functional integrals defined in terms of Grassman variables; in Section 4 we summarize some properties of the fermionic functional integrals which will be used to define a constructive algorithm for the computation of the Schwinger functions. The renormalization group ideas are implemented by writing each integration as a product of many integrations “on different scales”, and the integration of a scale leads to a new effective interaction; the technical tools for defining the expansions (trees, clusters, Feynman diagrams, and so on) are defined in Section 5. This leads to a sequence of effective interactions whose expansion converges provided that the previous scale interaction is small, due to cancellations based on the Fermi statistics, see Section 6.

In Section 8, to fix ideas, we consider a particular model and define an anomalous expansion for its Schwinger function: as a paradigmatic model we choose the Holstein–Hubbard model for spinless fermions, as it contains essentially all the possible difficulties encountered for spinless fermions; it describes in fact fermions subject to a quasi-periodic potential and interacting through a short range two-body potential. We start by defining an expansion for the effective potential. The presence of a quasi-periodic potential has the effect that the expansion is afflicted by a small divisor problem, so that a comparison with the series appearing in classical mechanics is natural. The theory has an anomalous dimension and the bare parameters are modified by critical indices. The flow of the running coupling constants is controlled using some hidden symmetry of this model, see Section 10. In particular, one exploits remarkable cancellations in the “beta function”, proved by a nonperturbative argument based on the exact solution of the Luttinger model [112], see Section 11; in other words, we extract from the exact solution of the Luttinger model information for not exactly solvable models, using the fact that they are “close” in a renormalization group sense. An expansion for the two-point Schwinger function is defined in Section 12, while an expansion for the density–density correlation function is defined in Section 14. In order to compute the asymptotic behaviour of the density–density response function one has to prove an approximate Ward identity, see Section 15.

In Section 13 we collect the results on the Schwinger functions for a number of spinless models, discussing briefly how the above scheme has to be adapted for each of them.
Such models are on a lattice or on the continuum, they interact with a periodic or quasi-periodic external potential, and include a two-body short range interaction. Our results are limited to the case of small external and two-body potential, with the exception of the case of large external quasi-periodic field (considered in the absence of the two-body interaction), in which the phenomenon of Anderson localization is found. The XYZ Heisenberg spin chain is included in the class of models we can treat, as it can be written as an interacting fermionic model with an anomalous potential, and the spin–spin correlation function is related to the density–density response function, see Section 16.

We then consider the presence of the spin: the number of running coupling constants increases, see Section 17, and it turns out that the running coupling constants remain small only if the two-body interaction is repulsive. This is due to cancellations in the beta function based on the solution of the Mattis model [113], the analogue of the Luttinger model with spinning fermions. In the repulsive case the behaviour of the Schwinger function is similar to the one in the spinless case. In the attractive case, only mean field approximations are possible at the moment (with the remarkable exception of the Hubbard model, which was solved in [5]). If the fermions are on the lattice even the mean field theory is not trivial: we show, see Section 19, that a mean field theory predicts the formation of collective excitations called density waves for any rational fermionic density. We discuss also a mean field theory for a two-chain system exchanging Cooper pairs, in which a version of the BCS equation for Luttinger liquids is found. Finally, we discuss some results for finite temperature bidimensional fermions, see [47,48].
2. One-dimensional interacting Fermi systems

2.1. Free systems

Let $\psi^{\pm}_{x,\sigma}$ be fermionic creation or annihilation operators defined in the standard fermion Fock space [49]. If $\sigma=\pm 1/2$ we say that the fermions are spinning (so such operators can describe real electrons), while if $\sigma=0$ we say that the fermions are spinless. Despite the fact that spinless fermions have no physical meaning, they are widely studied in the literature; one can say (tautologically) that they are easier to study. Furthermore, the results for spinless systems can be used to understand phenomena in which the spin does not play any role.

The physical systems one aims to model are crystals so anisotropic that they can be approximately described by one-dimensional systems: the conduction electrons are supposed to be confined on a segment, and they interact with each other, with the periodic or quasi-periodic background potential generated by the ions of the crystal, with phonons, with stochastic impurities and so on (for physical motivations see [2,51,52,20,21]).

There are two main classes of models describing one-dimensional fermion systems. The first class consists of the lattice models, in which $x$ is an integer, say between $-[L/2]$ and $[(L-1)/2]$: we shall write $x\in\Lambda$ in such a case, with $\Lambda=\{x\in\mathbb{Z}: -[L/2]\le x\le[(L-1)/2]\}$. One describes in this way fermions on a chain with length $L$ and step $a=1$, thinking that the electrons are localized on atomic sites and can hop to neighbouring sites. Considering only the possibility of hopping between nearest neighbour sites (i.e. neglecting the interaction of the electrons with themselves and with the environment), the hopping Hamiltonian (setting $S=0$ if the fermions are spinless and $S=1/2$ if they are spinning) is given by

$$\begin{aligned}
H_0&=T_0-\mu_0N_0,\\
T_0&=\frac{1}{2S+1}\sum_{\sigma=\pm S}\sum_{x\in\Lambda}\frac{1}{2}\left(-\psi^{+}_{x,\sigma}\psi^{-}_{x+1,\sigma}-\psi^{+}_{x,\sigma}\psi^{-}_{x-1,\sigma}+2\psi^{+}_{x,\sigma}\psi^{-}_{x,\sigma}\right),\\
N_0&=\frac{1}{2S+1}\sum_{\sigma=\pm S}\sum_{x\in\Lambda}\left[\psi^{+}_{x,\sigma}\psi^{-}_{x,\sigma}\right].
\end{aligned} \tag{2.1}$$
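As a quick consistency check of the hopping term $T_0$ in (2.1), the following Python sketch applies the corresponding single-particle operator to plane waves and verifies that they are eigenvectors with eigenvalue $1-\cos k$. Periodic boundary conditions, the relabeling of the sites as $0,\ldots,L-1$, and all function names are assumptions made here for illustration; the text does not specify them at this point.

```python
import cmath
import math

def apply_hopping(phi):
    """(T phi)(x) = (1/2) * (-phi(x+1) - phi(x-1) + 2 phi(x)) on a ring of len(phi) sites."""
    L = len(phi)
    return [0.5 * (-phi[(x + 1) % L] - phi[(x - 1) % L] + 2.0 * phi[x])
            for x in range(L)]

def plane_wave(n, L):
    """Return k = 2 pi n / L and the normalized plane wave exp(i k x) / sqrt(L)."""
    k = 2.0 * math.pi * n / L
    return k, [cmath.exp(1j * k * x) / math.sqrt(L) for x in range(L)]

def dispersion(k):
    """Single-particle energy of the hopping term for momentum k."""
    return 1.0 - math.cos(k)

def max_residual(L):
    """Largest component of T phi_k - dispersion(k) phi_k over all allowed momenta."""
    worst = 0.0
    for n in range(L):
        k, phi = plane_wave(n, L)
        t_phi = apply_hopping(phi)
        e_k = dispersion(k)
        worst = max(worst, max(abs(t - e_k * p) for t, p in zip(t_phi, phi)))
    return worst

if __name__ == "__main__":
    print(max_residual(12))
```

Since $H_0=T_0-\mu_0N_0$, in this sketch the single-particle levels of $H_0$ are $1-\cos k-\mu_0$, so the value of $\mu_0$ selects which momenta are occupied in the ground state.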
In (2.1), $\mu_0$ is the chemical potential, and it is fixed by the density (we shall work in the grand canonical ensemble). The Hamiltonian (2.1) is also called the tight binding Hamiltonian.

Another class of models are the continuum models, in which the fermions are on the continuum, and in such a case $x$ assumes values on the segment $[-L/2,L/2]$. One imagines in this case that the positive charge of the ions is spread out in the metal (jellium). Then the corresponding Hamiltonian, again neglecting any form of interaction, is given simply by the kinetic energy operator

$$\begin{aligned}
H_0&=T_0-\mu_0N_0,\\
T_0&=\frac{1}{2S+1}\sum_{\sigma=\pm S}\int_{-L/2}^{L/2}dx\,\psi^{+}_{x,\sigma}\left(-\frac{1}{2m}\partial_x^{2}\right)\psi^{-}_{x,\sigma},\\
N_0&=\frac{1}{2S+1}\sum_{\sigma=\pm S}\int_{-L/2}^{L/2}dx\,\psi^{+}_{x,\sigma}\psi^{-}_{x,\sigma}.
\end{aligned} \tag{2.2}$$

As in (2.1), $\mu_0$ denotes the chemical potential, while $m$ is the fermion mass.

2.2. Interaction with the lattice

We assume, as is usual, that the Hamiltonian describing the interacting fermions is obtained by adding to the free Hamiltonian $H_0$ some other terms, according to the kind of interaction one wants to describe. In this way we get more realistic models than the ones considered in Section 2.1.

The conduction electrons interact through electric forces with the lattice of ions; in the first approximation this interaction can be described in terms of a pseudopotential, which is assumed to be a regular periodic function which takes into account the lattice periodicity. In the continuum models, one then adds to the Hamiltonian $H_0$ a term

$$uP=u\int_{-L/2}^{L/2}dx\,\varphi(x)\,\psi^{+}_{x,\sigma}\psi^{-}_{x,\sigma} \tag{2.3}$$
with ’ periodic with period T , i.e. ’(x) = ’(x + T ), and regular in its argument (what we mean exactly by “regular” will become clear later when we shall discuss in detail the model). It is well known that the presence of such a periodic potential leads to the formation of energy bands. In lattice models the presence of the ion lattice is already described by the fact that one has x ∈ ; however to describe energy bands one can still add to the Hamiltonian H0 a term uP = u ’(x) x;+ x;− (2.4) x∈
with $\varphi(x) = \varphi(x+T)$ for some integer $T > 1$.

For a long time, solid state systems were considered as either crystalline (i.e. lattice periodic) or amorphous. The lattice periodicity was then described in terms of interactions with periodic pseudopotentials like (2.3) and (2.4). However, in recent times several solid state systems with a quasi-periodic structure have been discovered (see for instance [52]). In some cases, such materials have a basic structure and a periodic modulation superimposed on it, such that the periodicity of the modulation is incommensurate with the periodicity of the basic structure. Another possibility is that of structures composed of two periodic lattice subsystems, with mutually incommensurate periods. In order to study the electronic properties of quasi-periodic systems, in the case of lattice systems one can add to the Hamiltonian $H_0$ a term like (2.4), but with $\varphi(x) = \varphi(x+T)$ for an irrational $T$, so that $T$ is incommensurate with the period of the lattice (which is 1 in the units we have chosen); in the case of continuum systems one can write (2.3) with $\varphi(x)$ a quasi-periodic function, i.e. a function with two incommensurate intrinsic periods.

The lattice may fail to be exactly periodic or quasi-periodic because of the unavoidable presence of impurities: their presence can be modelled by introducing an additional term in the Hamiltonian describing the interaction with a white noise (for instance). Of course, such possibilities are not incompatible, i.e. one can consider together both a stochastic and a periodic interaction.

2.3. Interaction between the electrons

The conduction electrons interact with each other: taking into account such interactions is essential for the understanding of many properties (superconductivity, magnetism, Mott transition and so on). We can assume that the interaction between the fermions is given by a two-body potential. The interaction is assumed to have short range, as the Coulomb interaction should be screened in metals. Then one can add to the Hamiltonian $H_0$ (or $H_0 + uP$), in the case of lattice systems, a term of the form

$$
V = \frac{1}{2(2S+1)} \sum_{\sigma,\sigma'=\pm S} \sum_{x,y\in\Lambda} v(x-y)\, \psi^+_{x,\sigma}\psi^+_{y,\sigma'}\psi^-_{y,\sigma'}\psi^-_{x,\sigma}
\tag{2.5}
$$

with

$$
|v(x-y)| \le v_0\, e^{-\kappa|x-y|}
\tag{2.6}
$$

for some positive constants $\kappa$ and $v_0$. In some special cases, e.g. in the so-called Hubbard model, $v(x-y) = \delta_{|x-y|,1}$. In the continuum case, one has

$$
V = \frac{1}{2(2S+1)} \sum_{\sigma,\sigma'=\pm S} \int_{-L/2}^{L/2} dx \int_{-L/2}^{L/2} dy\; v(x-y)\, \psi^+_{x,\sigma}\psi^+_{y,\sigma'}\psi^-_{y,\sigma'}\psi^-_{x,\sigma} ,
\tag{2.7}
$$
where v can be assumed to be a smooth function satisfying (2.6). 2.4. Interaction with the phonons It is also important to consider the interaction with phonons, which are the quantized oscillations of the ion positions, i.e. of the lattice. One has to add to the Hamiltonian a term of the form 1 HB + x x;+ x;− − (2.8) 2 x∈
with HB = −
1 92 + (2 + b2 (x − x+1 )2 ) ; 02 x∈ 92x x∈ x
(2.9)
where x is a boson quantum field, corresponding to a discretized vibrating string with linear density 02, optical frequency $\omega$ and maximum wave propagation speed $c$, so that $b = c\,\omega^{-1}$. One could also take into account acoustic phonons.

2.5. Spin-Hamiltonians

Another class of models closely related to the ones we are considering are the spin Hamiltonians, like the Heisenberg Hamiltonians, where there is a spin 1/2 on each site of a lattice and the interaction is between nearest neighbours. In dimension $d = 1$ a very general model is the XYZ model (which contains as limiting cases the XY model, the XXZ model and others), which is described by the Hamiltonian

$$
H = -\sum_{x=1}^{L-1} \left[ J_1 S^1_x S^1_{x+1} + J_2 S^2_x S^2_{x+1} + J_3 S^3_x S^3_{x+1} + h S^3_x \right] + U^1_L ,
\tag{2.10}
$$

where $S^j_x = \sigma^j_x/2$, and $\sigma^1_x$, $\sigma^2_x$ and $\sigma^3_x$ are the Pauli matrices,

$$
\sigma^1_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
\sigma^2_x = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad
\sigma^3_x = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},
\tag{2.11}
$$

while $U^1_L$ is a boundary interaction term. Hamiltonian (2.10) can be written as an interacting spinless fermion Hamiltonian. In fact, it is easy to check that, if $\sigma^\pm_x = (\sigma^1_x \pm i\sigma^2_x)/2$, the operators

$$
\psi^\pm_x \equiv \left[ \prod_{y=1}^{x-1} \left( -\sigma^3_y \right) \right] \sigma^\pm_x
\tag{2.12}
$$

are a set of anticommuting operators, and that we can write

$$
\sigma^-_x = e^{-i\pi \sum_{y=1}^{x-1} \psi^+_y \psi^-_y}\, \psi^-_x , \qquad
\sigma^+_x = \psi^+_x\, e^{i\pi \sum_{y=1}^{x-1} \psi^+_y \psi^-_y} , \qquad
\sigma^3_x = 2\,\psi^+_x \psi^-_x - 1 .
\tag{2.13}
$$
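The anticommutation property of the operators (2.12) and the last relation in (2.13) can be verified on a short chain. The following is a minimal sketch, assuming the standard Pauli matrices and an explicit tensor-product representation on $L$ sites; the helper names `site_op`, `fermion` and `anticommutator` are illustrative.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
s_plus, s_minus = (s1 + 1j * s2) / 2, (s1 - 1j * s2) / 2

def site_op(op, x, L, left=I2):
    """`left` on sites 1..x-1, `op` on site x, identity on the remaining sites."""
    return reduce(np.kron, [left] * (x - 1) + [op] + [I2] * (L - x))

def fermion(sign, x, L):
    """Operator (2.12): string of (-sigma^3) on sites y < x times sigma^{+/-} on x."""
    return site_op(s_plus if sign > 0 else s_minus, x, L, left=-s3)

def anticommutator(A, B):
    return A @ B + B @ A

L = 3
identity = np.eye(2 ** L)
for x in range(1, L + 1):
    for y in range(1, L + 1):
        mixed = anticommutator(fermion(-1, x, L), fermion(+1, y, L))
        same = anticommutator(fermion(-1, x, L), fermion(-1, y, L))
        assert np.allclose(mixed, identity if x == y else 0 * identity)
        assert np.allclose(same, 0 * identity)
    number = fermion(+1, x, L) @ fermion(-1, x, L)
    assert np.allclose(site_op(s3, x, L), 2 * number - identity)
```

The string of $-\sigma^3_y$ factors is what turns the spin operators, which commute on different sites, into operators that anticommute.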
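Hence, if we normalize the interaction so that $J_1 + J_2 = 2$ and we introduce the anisotropy

$$
u = \frac{J_1 - J_2}{J_1 + J_2} ,
\tag{2.14}
$$

we get

$$
\begin{aligned}
H = {} & \sum_{x=1}^{L-1} \left\{ -\frac{1}{2} \left[ \psi^+_x \psi^-_{x+1} + \psi^+_{x+1} \psi^-_x \right] - \frac{u}{2} \left[ \psi^+_x \psi^+_{x+1} + \psi^-_{x+1} \psi^-_x \right] - J_3 \left( \psi^+_x \psi^-_x - \frac{1}{2} \right) \left( \psi^+_{x+1} \psi^-_{x+1} - \frac{1}{2} \right) \right\} \\
& - h \sum_{x=1}^{L} \left( \psi^+_x \psi^-_x - \frac{1}{2} \right) + U^2_L ,
\end{aligned}
\tag{2.15}
$$

where $U^2_L$ is the boundary term in the new variables. We choose it so that the fermionic Hamiltonian (2.15) coincides with the Hamiltonian of a fermion system on the lattice with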
periodic boundary conditions, that is we put UL2 equal to the term in the 6rst sum in the r.h.s. ± of (2.15) with x = L and L+1 = 1± (in [6] this choice for the XY chain is called “c-cyclic”). Then the XYZ model can be considered as a fermionic model of the class we are discussing. The XYZ Hamiltonian has a sort of anomalous potential of the form (generalizing it to the case of spinning fermions) (B = ( ( x;+ x;+− + x;−− x;− ) : (2.16) x∈
Such a potential appears in mean 6eld BCS theory in which the superconductivity phenomenon is approximately described in terms of an anomalous potential like (2.16). We shall consider the case of two one-dimensional interacting fermionic systems coupled by a Cooper interaction and we shall see that, analogous to the Bardeen approximation, one is led to consider an interacting fermion system with a term like (2.16). 2.6. General interacting systems So in the following we can consider Hamiltonians which, in the most general case, could be of the form H = H0 + uP + V + (B + HB :
(2.17)
Usually, not all the possible interaction terms are considered together, as the corresponding analysis would be very intricate. So we shall begin by considering a particular case, both for propaedeutic and for physical reasons: the analysis will be easier to perform (and still not so easy!), and in describing physical situations not all the interaction terms are expected to be at the same level at the same time.

2.7. Other Hamiltonian models

There are many other one-dimensional interacting fermionic models. One is the Luttinger model [112,54], which will play an important role in our analysis; there are many extensions of this model to spinning fermions, called the Mattis model, the g-ological model, the Luther–Emery model and so on. None of these models is truly "fundamental", in the sense that they are considered approximations, in some physical situations, of the models with the Hamiltonians listed above; so we shall not discuss them here. We shall see that our methods lead us to introduce such models in a natural way and to give a rigorous meaning to the intuition that such models are "close" to the ones we are considering. There are also many relativistic models, like the Thirring model or the Yukawa$_2$ model, which are closely related to the models with the Hamiltonians listed above, in some particular limit.

3. Schwinger functions and physical observables

3.1. Definition

Fix $\beta > 0$. Setting $\mathbf{x} = (x, x_0)$, with $x \in \Lambda$ and $x_0 \in [-\beta/2, \beta/2)$, define

$$
\psi^{\varepsilon}_{\mathbf{x},\sigma} = e^{x_0 H}\, \psi^{\varepsilon}_{x,\sigma}\, e^{-x_0 H} .
$$

If $\{t_1, \ldots, t_n\}$ is a collection of time variables $t_i \in (-\beta/2, \beta/2)$, we shall denote by $\{\pi(1), \ldots, \pi(n)\}$ the permutation of $\{1, \ldots, n\}$ of parity $p_\pi$ such that $t_{\pi(1)} > \cdots > t_{\pi(n)}$.

At temperature $T = \beta^{-1}$ the finite-temperature imaginary-time correlation functions, or Schwinger functions, are defined by

$$
S^{L,\beta}(\mathbf{x}_1, \varepsilon_1, \sigma_1; \ldots; \mathbf{x}_n, \varepsilon_n, \sigma_n) = (-1)^{p_\pi}\,
\frac{\mathrm{Tr}\left[ e^{-\beta H}\, \psi^{\varepsilon_{\pi(1)}}_{\mathbf{x}_{\pi(1)},\sigma_{\pi(1)}} \cdots \psi^{\varepsilon_{\pi(n)}}_{\mathbf{x}_{\pi(n)},\sigma_{\pi(n)}} \right]}{\mathrm{Tr}\left[ e^{-\beta H} \right]} .
\tag{3.1}
$$
In the spinless case, we shall write simply $S^{L,\beta}(\mathbf{x}_1, \varepsilon_1; \ldots; \mathbf{x}_n, \varepsilon_n)$. In the limit $\beta \to \infty$, functions (3.1) define the zero-temperature Schwinger functions: they describe the properties of the ground state of the system with Hamiltonian $H$ given by (2.17) in the grand canonical ensemble with chemical potential $\mu_0$.

3.2. Physical relevance

Most of the physical properties can be derived, at least in principle, from the knowledge of the Schwinger functions. For instance, from the two-point Schwinger function one can get information on the spectrum. If we consider the Fourier transform of the two-point Schwinger function, from the imaginary poles in $k_0$ one can compute the spectral gap; if $i k_{0,\alpha}$, $-i k_{0,\beta}$, with $k_{0,\alpha}, k_{0,\beta} > 0$, are such poles then it is well known that $\Delta_S = \min_\alpha (k_\alpha) + \min_\beta (k_\beta)$ [55].

Another important quantity is the occupation number, defined as the average number of particles with "momentum" $k$. The momentum is the quantum number which allows us to classify the states of a "free" Hamiltonian, so the definition of occupation number depends on what we consider the free Hamiltonian. In a system with Hamiltonian $H_0$ (see (2.1) or (2.2)), the states are obtained considering Slater determinants of plane waves, so that the occupation number is just given by

$$
n_k \equiv \hat S(k; 0^-) ,
\tag{3.2}
$$

if $\hat S(k; t)$ is the Fourier transform of the two-point Schwinger function with respect to the space variable only (in such a case the Schwinger function is translationally invariant). On the other hand, if the Hamiltonian is $H_0 + uP$, with $P$ given by (2.3) or (2.4), in the periodic case, the good quantum number is the crystalline momentum indexing the Bloch waves, so that the good definition for the occupation number can still be written in the form (3.2), provided that the "Fourier transform" is taken with respect to the Bloch waves (instead of plane waves).

Other important physical quantities are the response functions; they measure the response of a physical observable to an infinitesimal external perturbation. For instance the density–density response measures the response of the system density to a perturbation proportional to the density of particles; it can be computed from the density–density correlation function (in terms of which the dielectric constant can be written [56]), given, in the spinless case, by

$$
S^{L,\beta}(\mathbf{x}, +; \mathbf{x}, -; \mathbf{0}, +; \mathbf{0}, -) - S^{L,\beta}(\mathbf{x}, +; \mathbf{0}, -)\, S^{L,\beta}(\mathbf{x}, +; \mathbf{0}, -) .
\tag{3.3}
$$
The magnetic response function measures the response of the spin to a magnetic perturbation, and the current–current response function measures the response of the current to an electric field.

3.3. Schwinger functions for free systems

Finally, let us consider explicitly the Schwinger functions for free systems, in which $H = H_0$. Suppose, for instance, $H_0$ to be given by (2.1). The model described by $H_0$ is, of course, exactly solvable and all the Schwinger functions can be computed; they are obtained by the anticommutative Wick rule (for more details see Section 4 later) from the two-point Schwinger function. The latter is given, if the fermions are on a lattice, by

$$
S_0^{L,\beta}(\mathbf{x}, -; \mathbf{y}, +) \equiv g(\mathbf{x} - \mathbf{y}) = \frac{1}{L\beta} \sum_{k_0 \in D_\beta} \sum_{k \in D_L} \frac{e^{i \mathbf{k}\cdot(\mathbf{x}-\mathbf{y})}}{-i k_0 + E(k)} ,
\tag{3.4}
$$

where $E(k) = 1 - \cos k - \mu_0$, with $|k| \le 2\pi$, is the dispersion relation, with the convention that $\mathbf{x} = (x, x_0)$, $\mathbf{y} = (y, y_0)$, $\mathbf{k} = (k, k_0)$, denoting by $\cdot$ the scalar product in $\mathbb{R}^2$, and defining

$$
\begin{aligned}
D_L &\equiv \{ k = 2\pi n/L , \; n \in \mathbb{Z} , \; -[L/2] \le n \le [(L-1)/2] \} , \\
D_\beta &\equiv \{ k_0 = 2(n + 1/2)\pi/\beta , \; n \in \mathbb{Z} , \; -M \le n \le M - 1 \} ,
\end{aligned}
\tag{3.5}
$$

where $M$ is a suitable cut-off to be removed at the end (see below). If the fermions are on the continuum the dispersion relation becomes $E(k) = (k^2/2m) - \mu_0$ and the two-point Schwinger function is still given by (3.4), with the new definition of $E(k)$ and with $D_L$ defined as

$$
D_L \equiv \{ k = 2\pi n/L , \; n \in \mathbb{Z} , \; -N \le n \le N - 1 \} ,
\tag{3.6}
$$

with $N$ a suitable cut-off (to be removed like $M$). Of course, at the end, we will be interested in removing the cut-offs $M$ and $N$: we work with $M$ and $N$ finite, so that we are able to interpret the Schwinger functions as integrals on finite-dimensional Grassmann algebras (see next section), and we find results which will be uniform in $M$ and $N$, so that we can take the limits $M \to \infty$ and $N \to \infty$.

We can see that $g(\mathbf{x} - \mathbf{y})$ is the Fourier transform of a function singular for $k_0 = 0$, $k = p_F$, where $p_F$ is the Fermi momentum, defined by the condition $E(p_F) = 0$. In general, when adding to $H_0$ the interaction between particles, there is no reason for the Fourier transform of the Schwinger function to be singular for $k = p_F$; it could be singular at some interaction-dependent value. In order to take this fact into account it is useful to write

$$
\mu_0 = \mu + \nu ,
\tag{3.7}
$$

where $\nu$ is a counterterm which will eventually be suitably chosen in order to fix the position of the singularity at some interaction-independent point.
The Schwinger functions (3.1) can be expressed as functional integrals. In the next section, we shall review the basic concepts which allow us to introduce a functional integral representation in a fermionic theory; then in Sections 5 and 6 we shall discuss the notion of effective potential, and in Section 7 we shall come back to the problem of studying the Schwinger functions.
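4. Fermionic functional integrals

4.1. Grassmann integrals and truncated expectations

The Schwinger functions we shall be interested in are written as Grassmann integrals (see the classical [57] or any modern textbook like [49]; see also Section 7). One introduces a finite-dimensional Grassmann algebra, which is a set of anticommuting Grassmann variables $\psi \equiv \{\psi^+_\alpha, \psi^-_\alpha\}$, with $\alpha$ an index belonging to some finite set $A$. This means that

$$
\psi^{\sigma}_{\alpha} \psi^{\sigma'}_{\alpha'} + \psi^{\sigma'}_{\alpha'} \psi^{\sigma}_{\alpha} = 0 , \qquad \forall \alpha, \alpha' \in A , \quad \forall \sigma, \sigma' = \pm ;
\tag{4.1}
$$

in particular $(\psi^\sigma_\alpha)^2 = 0$ $\forall \alpha \in A$ and $\forall \sigma = \pm$. Note that here and henceforth we use the same symbols to denote both the fermionic fields and the Grassmann variables: this can be a little misleading, but it is the convention usually followed in quantum field theory, so we shall adopt it. Note also that the label $\sigma = \pm$ in (4.1) should not be confused with the spin label used in the previous sections.

Let us introduce another set of Grassmann variables $\{d\psi^+_\alpha, d\psi^-_\alpha\}$, $\alpha \in A$, anticommuting with $\psi^+_\alpha, \psi^-_\alpha$, and an operation (Grassmann integration) defined by

$$
\int d\psi^\sigma_\alpha = 0 , \qquad \int d\psi^\sigma_\alpha\, \psi^\sigma_\alpha = 1 , \qquad \alpha \in A , \quad \sigma = \pm .
\tag{4.2}
$$

If $F(\psi)$ is any analytic function of the $\psi^+_\alpha, \psi^-_\alpha$, $\alpha \in A$, the operation

$$
\int \prod_{\alpha \in A} d\psi^+_\alpha\, d\psi^-_\alpha\; F(\psi)
\tag{4.3}
$$

is simply defined by iteratively applying (4.2) and taking into account the anticommutation rules (4.1). It is easy to check that for all $\alpha \in A$ and $C \in \mathbb{C}$

$$
\frac{\int d\psi^+_\alpha\, d\psi^-_\alpha\; e^{-\psi^+_\alpha C \psi^-_\alpha}\, \psi^-_\alpha \psi^+_\alpha}{\int d\psi^+_\alpha\, d\psi^-_\alpha\; e^{-\psi^+_\alpha C \psi^-_\alpha}} = C^{-1} ;
\tag{4.4}
$$

in fact $e^{-\psi^+_\alpha C \psi^-_\alpha} = 1 - \psi^+_\alpha C \psi^-_\alpha$ and by (4.2)

$$
\int d\psi^+_\alpha\, d\psi^-_\alpha\; e^{-\psi^+_\alpha C \psi^-_\alpha} = C ,
\tag{4.5}
$$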
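while

$$
\int d\psi^+_\alpha\, d\psi^-_\alpha\; e^{-\psi^+_\alpha C \psi^-_\alpha}\, \psi^-_\alpha \psi^+_\alpha = 1 .
\tag{4.6}
$$

In the following, we shall need also more complicated expressions involving more than a pair of Grassmann variables, like

$$
\frac{\int d\psi^+_\alpha\, d\psi^-_\alpha\, d\psi^+_\beta\, d\psi^-_\beta\; e^{-\sum_{i,j=\alpha,\beta} \psi^+_i M_{ij} \psi^-_j}\, \psi^-_{\alpha'} \psi^+_{\beta'}}{\int d\psi^+_\alpha\, d\psi^-_\alpha\, d\psi^+_\beta\, d\psi^-_\beta\; e^{-\sum_{i,j=\alpha,\beta} \psi^+_i M_{ij} \psi^-_j}} = [M^{-1}]_{\alpha'\beta'}
\tag{4.7}
$$

with $M \in GL(2, \mathbb{C})$, for $\alpha \ne \beta \in A$ and $\alpha', \beta' \in \{\alpha, \beta\}$. Again (4.7) can be easily verified by using (4.2) and the anticommutation rules (4.1), which allow us to write

$$
\int d\psi^+_\alpha\, d\psi^-_\alpha\, d\psi^+_\beta\, d\psi^-_\beta\; e^{-\sum_{i,j=\alpha,\beta} \psi^+_i M_{ij} \psi^-_j} = M_{11} M_{22} - M_{12} M_{21} \equiv \det M
\tag{4.8}
$$

and

$$
\int d\psi^+_\alpha\, d\psi^-_\alpha\, d\psi^+_\beta\, d\psi^-_\beta\; e^{-\sum_{i,j=\alpha,\beta} \psi^+_i M_{ij} \psi^-_j}\, \psi^-_{\alpha'} \psi^+_{\beta'} = M'_{\alpha'\beta'} ,
\tag{4.9}
$$

if $M'_{\alpha'\beta'}$ is the minor complementary to the entry $M_{\alpha'\beta'}$ (i.e. $M'_{\alpha\alpha} = M_{\beta\beta}$, $M'_{\beta\beta} = M_{\alpha\alpha}$, $M'_{\alpha\beta} = -M_{\beta\alpha}$ and $M'_{\beta\alpha} = -M_{\alpha\beta}$).

The above formulae (4.4) and (4.7) closely resemble Gaussian integrals: note, however, that there is no need for $C$ or $M$ to be real or positive definite (but of course they have to be invertible). Pursuing further the analogy with Gaussian integrals, we can consider a "measure" (a similar expression is found replacing $g$ with a matrix, see (4.26) below)

$$
P(d\psi) = \prod_{\alpha \in A} d\psi^+_\alpha\, d\psi^-_\alpha\; g_\alpha\; e^{-\sum_{\alpha \in A} \psi^+_\alpha g_\alpha^{-1} \psi^-_\alpha} ;
\tag{4.10}
$$

by construction one has

$$
\int P(d\psi) = 1 , \qquad \int P(d\psi)\; \psi^-_\alpha \psi^+_\beta = \delta_{\alpha,\beta}\, g_\alpha .
\tag{4.11}
$$

In general, $P(d\psi)$ will be called a Gaussian fermionic integration measure (or Grassmann integration measure or, as we shall do in the following, integration tout court) with covariance $g$: for any analytic function $F$ defined on the Grassmann algebra we can write

$$
\int P(d\psi)\, F(\psi) = E(F) .
\tag{4.12}
$$

However, note that $P(d\psi)$ is not at all a real measure, as it does not satisfy the necessary positivity conditions, so that the terminology is only formal and the use of the symbol $E$ (which stands for expectation value) is meant only by analogy.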
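The elementary identities (4.4), (4.7) and (4.8) can be checked mechanically. The sketch below is an illustration, not part of the original construction: it represents an element of the Grassmann algebra as a dictionary mapping ordered tuples of generator indices to coefficients, and it fixes one possible convention for (4.3), namely that the differentials act from the right (innermost first), each one removing its variable after anticommuting it to the front. The generators 0, 1, 2, 3 stand for $\psi^+_\alpha, \psi^-_\alpha, \psi^+_\beta, \psi^-_\beta$; all function names are hypothetical.

```python
import numpy as np

def mono_mul(a, b):
    """Product of two monomials given as increasing tuples of generator indices."""
    if set(a) & set(b):
        return 0, ()                          # a repeated generator squares to zero
    seq = a + b
    inversions = sum(seq[i] > seq[j]
                     for i in range(len(seq)) for j in range(i + 1, len(seq)))
    return (-1) ** inversions, tuple(sorted(seq))

def mul(F, G):
    out = {}
    for a, ca in F.items():
        for b, cb in G.items():
            sign, m = mono_mul(a, b)
            if sign:
                out[m] = out.get(m, 0.0) + sign * ca * cb
    return out

def add(F, G):
    out = dict(F)
    for m, c in G.items():
        out[m] = out.get(m, 0.0) + c
    return out

def scale(F, c):
    return {m: c * v for m, v in F.items()}

def gen(i):
    return {(i,): 1.0}

def gexp(X, n_gen):
    """Exponential of an even element with no constant term (the series terminates)."""
    result, term = {(): 1.0}, {(): 1.0}
    for n in range(1, n_gen + 1):
        term = scale(mul(term, X), 1.0 / n)
        result = add(result, term)
    return result

def berezin(F, order):
    """Integrate d psi_{order[0]} ... d psi_{order[-1]} F, innermost (rightmost) first."""
    for i in reversed(order):
        out = {}
        for m, c in F.items():
            if i in m:
                p = m.index(i)                # anticommute psi_i to the front: (-1)^p
                rest = m[:p] + m[p + 1:]
                out[rest] = out.get(rest, 0.0) + (-1) ** p * c
        F = out
    return F.get((), 0.0)

def gaussian_checks(M):
    """Return the integral (4.8) and the matrix of ratios (4.7) for a 2x2 matrix M."""
    M = np.asarray(M, dtype=float)
    plus, minus, order = [0, 2], [1, 3], [0, 1, 2, 3]
    X = {}
    for i in range(2):
        for j in range(2):
            X = add(X, scale(mul(gen(plus[i]), gen(minus[j])), -M[i, j]))
    weight = gexp(X, 4)
    Z = berezin(weight, order)
    ratio = np.array([[berezin(mul(mul(weight, gen(minus[a])), gen(plus[b])), order) / Z
                       for b in range(2)] for a in range(2)])
    return Z, ratio

M = np.array([[2.0, 1.0], [0.5, 3.0]])
Z, ratio = gaussian_checks(M)
print(np.isclose(Z, np.linalg.det(M)), np.allclose(ratio, np.linalg.inv(M)))
```

With this convention the single-pair formulas (4.5) and (4.6) are recovered by restricting to the generators 0 and 1.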
Given $p$ functions $X_1, \ldots, X_p$ defined on the Grassmann algebra and $p$ positive integer numbers $n_1, \ldots, n_p$, the truncated expectation is defined as

$$
E^T(X_1, \ldots, X_p; n_1, \ldots, n_p) = \left. \frac{\partial^{n_1 + \cdots + n_p}}{\partial \lambda_1^{n_1} \cdots \partial \lambda_p^{n_p}} \log \int P(d\psi)\; e^{\lambda_1 X_1(\psi) + \cdots + \lambda_p X_p(\psi)} \right|_{\lambda = 0} ,
\tag{4.13}
$$

where $\lambda = \{\lambda_1, \ldots, \lambda_p\}$. It is easy to check that $E^T$ is a linear operation, that is, formally,

$$
E^T(c_1 X_1 + \cdots + c_p X_p; n) = \sum_{n_1 + \cdots + n_p = n} \frac{n!}{n_1! \cdots n_p!}\; c_1^{n_1} \cdots c_p^{n_p}\; E^T(X_1, \ldots, X_p; n_1, \ldots, n_p) ,
\tag{4.14}
$$

so that the following relations immediately follow:

$$
\begin{aligned}
&(1) \quad E^T(X; 1) = E(X) , \\
&(2) \quad E^T(X; 0) = 0 , \\
&(3) \quad E^T(X, \ldots, X; n_1, \ldots, n_p) = E^T(X; n_1 + \cdots + n_p) .
\end{aligned}
\tag{4.15}
$$

Moreover, one has

$$
E^T(X_1, \ldots, X_1, \ldots, X_p, \ldots, X_p; 1, \ldots, 1, \ldots, 1, \ldots, 1) = E^T(X_1, \ldots, X_p; n_1, \ldots, n_p) ,
\tag{4.16}
$$

where, for any $j = 1, \ldots, p$, in the l.h.s. the function $X_j$ is repeated $n_j$ times and 1 is repeated $n_1 + \cdots + n_p$ times. We define also

$$
E^T(X_1, \ldots, X_p) \equiv E^T(X_1, \ldots, X_p; 1, \ldots, 1) .
\tag{4.17}
$$

By (4.16) we see that all truncated expectations can be expressed in terms of (4.17); it is easy to see that (4.17) is vanishing if $X_j = 0$ for at least one $j$; see Section A.3.

The truncated expectation appears naturally when considering the integration of an exponential; in fact as a particular case of (4.13) one has

$$
E^T(X; n) = \left. \frac{\partial^n}{\partial \lambda^n} \log \int P(d\psi)\; e^{\lambda X(\psi)} \right|_{\lambda = 0} ,
\tag{4.18}
$$

so that

$$
\log \int P(d\psi)\; e^{X(\psi)} = \sum_{n=0}^{\infty} \frac{1}{n!} \left. \frac{\partial^n}{\partial \lambda^n} \log \int P(d\psi)\; e^{\lambda X(\psi)} \right|_{\lambda = 0} = \sum_{n=0}^{\infty} \frac{1}{n!}\, E^T(X; n) .
\tag{4.19}
$$
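As a short worked example of (4.18), assume that $X$ is an even element of the Grassmann algebra, so that its powers commute and $\int P(d\psi)\, e^{\lambda X} = \sum_n \lambda^n E(X^n)/n!$. Expanding the logarithm to third order in $\lambda$ then gives the first truncated expectations in the familiar form of cumulants:

$$
\begin{aligned}
E^T(X; 1) &= E(X) , \\
E^T(X; 2) &= E(X^2) - E(X)^2 , \\
E^T(X; 3) &= E(X^3) - 3\, E(X^2)\, E(X) + 2\, E(X)^3 .
\end{aligned}
$$

The first line is relation (1) in (4.15); the other two illustrate in which sense the truncated expectation subtracts the "disconnected" products of lower-order expectations.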
The following properties, which are immediate consequences of (4.2) and are very similar to the properties of Gaussian integrations, hold; see also Section A.3.
(1) Wick rule. Given two sets of labels $\{\alpha_1, \ldots, \alpha_n\}$ and $\{\beta_1, \ldots, \beta_m\}$ in $A$, one has

$$
\int P(d\psi)\; \psi^-_{\alpha_1} \cdots \psi^-_{\alpha_n}\, \psi^+_{\beta_1} \cdots \psi^+_{\beta_m} = \delta_{n,m} \sum_{\pi} (-1)^{p_\pi} \prod_{i=1}^{n} \delta_{\alpha_i, \beta_{\pi(i)}}\; g_{\alpha_i} ,
\tag{4.20}
$$

where the sum is over all the permutations $\pi = \{\pi(1), \ldots, \pi(n)\}$ of the indices $\{1, \ldots, n\}$ with parity $p_\pi$ with respect to the fundamental permutation.

(2) Addition principle. Given two integrations $P(d\psi_1)$ and $P(d\psi_2)$, with covariance $g_1$ and $g_2$, respectively, then, for any function $F$ which can be written as a sum over monomials of Grassmann variables, i.e. $F = F(\psi)$, with $\psi = \psi_1 + \psi_2$, one has

$$
\int P(d\psi_1) \int P(d\psi_2)\; F(\psi_1 + \psi_2) = \int P(d\psi)\; F(\psi) ,
\tag{4.21}
$$

where $P(d\psi)$ has covariance $g \equiv g_1 + g_2$. It is sufficient to prove it for $F(\psi) = \psi^- \psi^+$; then one uses the anticommutation rules (4.1). One has

$$
\int P(d\psi_1) \int P(d\psi_2)\; (\psi_1^- + \psi_2^-)(\psi_1^+ + \psi_2^+) = \int P(d\psi_1)\; \psi_1^- \psi_1^+ \int P(d\psi_2) + \int P(d\psi_1) \int P(d\psi_2)\; \psi_2^- \psi_2^+ = g_1 + g_2 ,
\tag{4.22}
$$

where (4.11) has been used.

(3) Invariance of exponentials. From the definition of truncated expectations, it follows that, if $\phi$ is an "external field", i.e. a nonintegrated field, then

$$
\int P(d\psi)\; e^{X(\psi + \phi)} = \exp \left[ \sum_{n=0}^{\infty} \frac{1}{n!}\, E^T(X(\cdot + \phi); n) \right] \equiv e^{X'(\phi)} ,
\tag{4.23}
$$
which is a main technical point: (4.23) says that integrating an exponential one still gets an exponential, whose argument is expressed by the sum of truncated expectations.

(4) Change of integration. If $P_g(d\psi)$ denotes the integration with covariance $g$, then, for any analytic function $F(\psi)$, one has

$$
\frac{1}{N_\nu} \int P_g(d\psi)\; e^{-\nu \psi^+ \psi^-}\, F(\psi) = \int P_{\tilde g}(d\psi)\; F(\psi) , \qquad \tilde g^{-1} = g^{-1} + \nu ,
\tag{4.24}
$$

where

$$
N_\nu = \frac{g^{-1} + \nu}{g^{-1}} = 1 + g \nu = \int P_g(d\psi)\; e^{-\nu \psi^+ \psi^-} .
\tag{4.25}
$$

The proof is very easy from the definitions. More generally one has that, if $M$ is an invertible $2 \times 2$ matrix and $P_M(d\psi)$ is given by

$$
P_M(d\psi) = d\psi^+_\alpha\, d\psi^-_\alpha\, d\psi^+_\beta\, d\psi^-_\beta\; \det M\; e^{-\sum_{i,j=\alpha,\beta} \psi^+_i M^{-1}_{ij} \psi^-_j} ,
\tag{4.26}
$$
then, for ∈ C, 1 PM (d ) e− N
+ − + − 1 2 − 2 1
F( ) =
PM˜ (d ) F( );
where x1 is the Pauli matrix (see (2.11)) and N = det(5 +
Sx1 M ) =
det(M −1 + Sx1 ) = det M −1
PM (d ) e−
−1 M˜ = M −1 + x1 ;
+ − + − 1 2 − 2 1
:
(4.27)
(4.28)
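Moreover, if $P_M(d\psi)$ is the integration measure defined by (4.26), one has

$$
\frac{1}{N_N} \int P_M(d\psi)\; e^{-\sum_{i,j=\alpha,\beta} \psi^+_i N^{-1}_{ij} \psi^-_j}\, F(\psi) = \int P_{\tilde M}(d\psi)\; F(\psi) ,
\tag{4.29}
$$

where

$$
\tilde M^{-1} = M^{-1} + N^{-1}
\tag{4.30}
$$

and

$$
N_N = \det(\mathbb{1} + N^{-1} M) = \frac{\det(M^{-1} + N^{-1})}{\det M^{-1}} = \int P_M(d\psi)\; e^{-\sum_{i,j=\alpha,\beta} \psi^+_i N^{-1}_{ij} \psi^-_j} .
\tag{4.31}
$$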
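As a check of the change of integration in its simplest form, (4.24) and (4.25), consider a single pair of Grassmann variables (an assumption made here only to keep the computation short). Since $\psi^+ \psi^-$ is an even element, the two exponentials can be merged, and (4.10) and (4.5) with $C = g^{-1} + \nu$ give

$$
\int P_g(d\psi)\; e^{-\nu \psi^+ \psi^-} = g \int d\psi^+\, d\psi^-\; e^{-\psi^+ (g^{-1} + \nu) \psi^-} = g\, (g^{-1} + \nu) = 1 + g \nu = N_\nu .
$$

Dividing by $N_\nu$, the weight that is left is $(g^{-1} + \nu)^{-1}\, e^{-\psi^+ (g^{-1} + \nu) \psi^-}$ multiplied by $(g^{-1}+\nu)$ from the normalization, i.e. the integration (4.10) with covariance $\tilde g = g/(1 + g\nu)$, in agreement with $\tilde g^{-1} = g^{-1} + \nu$.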
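4.2. Truncated expectations and Feynman diagrams

We introduce a finite set of Grassmann variables $\{\psi^\pm_{\mathbf{k}}\}$, one for each $\mathbf{k} \in D_{L,\beta}$, $D_{L,\beta} \equiv D_L \times D_\beta$, with $D_L$ and $D_\beta$ defined in (3.5). Let

$$
P(d\psi) = \prod_{\mathbf{k} \in D_{L,\beta}} \left( L\beta\, \hat g(\mathbf{k}) \right) d\psi^+_{\mathbf{k}}\, d\psi^-_{\mathbf{k}}\; \exp \left\{ - \sum_{\mathbf{k} \in D_{L,\beta}} \left( L\beta\, \hat g(\mathbf{k}) \right)^{-1} \psi^+_{\mathbf{k}} \psi^-_{\mathbf{k}} \right\}
\tag{4.32}
$$

with

$$
\hat g(\mathbf{k}) = \frac{1}{-i k_0 + E(k)} = \frac{1}{-i k_0 + \cos p_F - \cos k} ,
\tag{4.33}
$$

where (see (3.4))

$$
E(k) = 1 - \mu_0 - \cos k \equiv \cos p_F - \cos k .
\tag{4.34}
$$

So we are in the situation of Section 4.1 with the set of indices $A = D_{L,\beta}$. We introduce the Grassmann fields $\psi^\pm_{\mathbf{x}}$ defined by

$$
\psi^\pm_{\mathbf{x}} = \frac{1}{L\beta} \sum_{\mathbf{k} \in D_{L,\beta}} \hat\psi^\pm_{\mathbf{k}}\, e^{\pm i \mathbf{k}\cdot\mathbf{x}} ,
\tag{4.35}
$$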
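where $\mathbf{k} = (k, k_0)$ and $\mathbf{k}\cdot\mathbf{x} = k_0 x_0 + k x$, and such that

$$
\int P(d\psi)\; \psi^-_{\mathbf{x}} \psi^+_{\mathbf{y}} = \frac{1}{L\beta} \sum_{\mathbf{k} \in D_{L,\beta}} e^{-i \mathbf{k}\cdot(\mathbf{x}-\mathbf{y})}\, \hat g(\mathbf{k}) \equiv g(\mathbf{x} - \mathbf{y}) .
\tag{4.36}
$$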
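The role of the cut-off $M$ in the sum over $k_0$ can be illustrated on a single mode. The sketch below assumes one fixed energy $E$ (no sum over $k$), the sign convention $e^{-i k_0 \tau}$ of (4.36) with $\tau = x_0 - y_0 \in (0, \beta)$, and the frequencies $k_0 = 2(n + 1/2)\pi/\beta$, $-M \le n \le M - 1$, of (3.5). Under these assumptions the truncated sum should approach the standard closed form $e^{-E\tau}/(1 + e^{-\beta E})$ as $M$ grows; the function names are illustrative.

```python
import numpy as np

def matsubara_sum(tau, E, beta, M):
    """(1/beta) * sum over k0 in D_beta of exp(-i k0 tau) / (-i k0 + E), cut-off M."""
    n = np.arange(-M, M)
    k0 = (2 * n + 1) * np.pi / beta
    return np.sum(np.exp(-1j * k0 * tau) / (-1j * k0 + E)) / beta

def closed_form(tau, E, beta):
    """Limit M -> infinity for 0 < tau < beta (standard fermionic frequency sum)."""
    return np.exp(-E * tau) / (1 + np.exp(-beta * E))

beta, E, tau = 10.0, 0.3, 3.0
for M in (100, 1000, 20000):
    approx = matsubara_sum(tau, E, beta, M)
    print(M, abs(approx - closed_form(tau, E, beta)))
```

The convergence is slow (the summand decays only like $1/k_0$), which is one reason for keeping $M$ finite in the intermediate steps and looking for bounds uniform in $M$.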
Of course, the properties of the Grassmann variables seen in Section 4.1 extend trivially to the Grassmann fields.

In order to compute the truncated expectations there are two possible main ways. One is the representation in terms of the ordinary connected Feynman diagrams defined in the following way. For a given set of indices $P$, define

$$
\tilde\psi(P) = \prod_{f \in P} \psi^{\sigma(f)}_{\mathbf{x}(f)}
\tag{4.37}
$$

with $\sigma(f) \in \{\pm\}$ and $\mathbf{x}(f) = (x(f), x_0(f)) \in \Lambda \times [-\beta, \beta]$, and call $|P|$ the number of elements in $P$. Then, given $s$ sets of indices $P_1, \ldots, P_s$, consider

$$
E^T(\tilde\psi(P_1), \ldots, \tilde\psi(P_s))
\tag{4.38}
$$

for $s \ge 1$ (recall (4.17)). First of all note that, by writing

$$
P_j = P_j^+ \cup P_j^- , \qquad P_j^\pm = \{ f \in P_j : \sigma(f) = \pm \}
\tag{4.39}
$$

for each $j = 1, \ldots, s$, one must have

$$
\sum_{j=1}^{s} |P_j^+| = \sum_{j=1}^{s} |P_j^-| ,
\tag{4.40}
$$

because the truncated expectations can be written in terms of simple expectations (see Section A.3) and the Wick rule (4.20) holds.

For any $\mathbf{x} = \mathbf{x}(f)$ and $\sigma = \sigma(f)$, we can represent each field $\psi^\sigma_{\mathbf{x}}$ as an oriented half-line emerging from a point $\mathbf{x}$ and carrying an arrow, pointing towards the point if $\sigma = -$ and away from the point if $\sigma = +$. We can enclose the points $\mathbf{x}(f)$ belonging to the set $P_j$, for some $j = 1, \ldots, s$, in a box; in this way we obtain $s$ disjoint boxes. Then, given $s$ sets $P_1, \ldots, P_s$, we associate to them a set of graphs $\Gamma$, called Feynman diagrams, obtained by joining pairwise the half-lines with consistent orientation (i.e. a half-line representing a field $\psi^-$ with a half-line representing a field $\psi^+$ and vice versa) in such a way that the boxes are all connected; see Fig. 1. A line obtained by joining two half-lines will be denoted by $\ell$ and, if $\ell$ is a line contained in a diagram $\Gamma$, we shall write $\ell \in \Gamma$: the two half-lines are said to be contracted or to form a contraction.

To each line $\ell$ obtained by joining the half-line representing $\psi^-_{\mathbf{x}(i)}$ with the half-line representing $\psi^+_{\mathbf{x}(j)}$ we associate a propagator $g_\ell \equiv g(\mathbf{x}(i) - \mathbf{x}(j))$; as the line $\ell$ uniquely determines the points $i$ and $j$, we shall also write $\mathbf{x}(i) - \mathbf{x}(j) = \mathbf{x}_\ell$.
Fig. 1. A Feynman diagram $\Gamma$ obtained by joining all the half-lines with consistent orientation emerging from the boxes enclosing the sets $P_1, \ldots, P_s$. The diagram $\Gamma$ belongs to the set $G_0$ in (4.42).

Then to each diagram $\Gamma$ there corresponds a number, which will be called the value of the graph, given by the product of the propagators of the lines $\ell \in \Gamma$ (possibly up to a sign):

$$
\mathrm{Val}(\Gamma) = (-1)^{\pi} \prod_{\ell \in \Gamma} g_\ell ,
\tag{4.41}
$$

where $\pi$ is a parity which depends on the way the lines are contracted between themselves. Then, if we denote by $G_0$ the set of all Feynman diagrams which can be obtained by following the given prescription, one has

$$
E^T(\tilde\psi(P_1), \ldots, \tilde\psi(P_s)) = \sum_{\Gamma \in G_0} \mathrm{Val}(\Gamma) .
\tag{4.42}
$$

As a consequence, we see that all the sets $P_1, \ldots, P_s$ have to be nonempty if $s > 1$, while one can have $P_1 = \emptyset$ if $s = 1$.

There is another possible (more compact) representation of the truncated expectations. Consider (4.38) and set $f = (j, i)$ for $f \in P_j$, with $i = 1, \ldots, |P_j|$, and $n = |P_1| + \cdots + |P_s|$. It is known [35] (see Section A.3) that, up to a sign, if $s > 1$,

$$
E^T(\tilde\psi(P_1), \ldots, \tilde\psi(P_s)) = \sum_{T} \prod_{\ell \in T} g_\ell \int dP_T(\mathbf{t})\; \det G^T(\mathbf{t}) ,
\tag{4.43}
$$

where

(1) $T$ is a set of lines forming an anchored tree between the clusters of points $P_1, \ldots, P_s$, i.e. $T$ is a set of lines which becomes a tree (see Section A.1 for a formal definition of tree) if one identifies all the points in the same cluster,

(2) $\mathbf{t}$ is a set of parameters

$$
\mathbf{t} = \{ t_{j,j'} \in [0, 1] , \; 1 \le j, j' \le s \} ,
\tag{4.44}
$$

(3) $dP_T(\mathbf{t})$ is a suitable (normalized) probability measure with support on a set of $\mathbf{t}$ such that $t_{j,j'} = \mathbf{u}_j \cdot \mathbf{u}_{j'}$, for some family of vectors $\mathbf{u}_j \in \mathbb{R}^s$ of unit norm, and
(4) $G^T(\mathbf{t})$ is a $(n - s + 1) \times (n - s + 1)$ matrix, whose elements are given by

$$
[G^T(\mathbf{t})]_{(j,i),(j',i')} = t_{j,j'}\; g(\mathbf{x}(j,i) - \mathbf{x}(j',i')) ,
\tag{4.45}
$$

where $1 \le j, j' \le s$ and $1 \le i \le |P_j|$, $1 \le i' \le |P_{j'}|$, such that the lines $\ell = \mathbf{x}(j,i) - \mathbf{x}(j',i')$ do not belong to $T$.

Fig. 2. A term contributing to the truncated expectation (4.38) according to expansion (4.43). The lines connecting the sets $P_1, \ldots, P_s$ form the anchored tree $T$. The other lines are left uncontracted, as the determinant in (4.47) takes into account all the possible ways to contract them.

If $s = 1$, the sum over $T$ is empty, but we can still use the above equation, by interpreting the r.h.s. as

$$
\begin{cases}
1 & \text{if } P_1 \text{ is empty} , \\
\det G(\mathbf{1}) & \text{otherwise} ,
\end{cases}
\tag{4.46}
$$

where $\mathbf{1}$ is obtained from (4.44) by setting $t_{j,j'} = 1$ $\forall j, j'$.

Note that, while in the first representation $E^T$ was written as a sum over Feynman diagrams, in this second representation it is written as a sum over trees connecting the boxes. Fixing a tree $T$ and expanding the determinant $\det G^T(\mathbf{t})$, one gets all the possible graphs which can be obtained by contracting the half-lines not belonging to $T$, i.e. one gets the Feynman diagrams, and the representation (4.42) follows. Of course, the number of addends in the first representation (4.42) is much larger than in the second one, i.e. (4.43), where a large quantity of Feynman diagrams are grouped together.

It is important to stress the difference between the two representations of the truncated expectations, more precisely the difference between the number of addends appearing in the two representations. In the first one, (4.42), a truncated expectation is written in terms of Feynman diagrams and their number can be quite high: for instance, if $|P_i| = 4$ in (4.38), they are $O(s!^2)$ (see Section A.1), so that with such a representation it is difficult to verify the convergence of the perturbative series [98]. In the other representation, (4.43), we do not sum over the Feynman diagrams, but over the anchored trees (see Fig. 2), whose number is only $O(s!)$ (see Section A.1). Of course there can be really a gain in expressing (4.38) by using (4.43) instead of (4.42)
only if each summand of the two expressions admits the same bound, for instance a $C^n$ bound for some constant $C$. If the propagators are bounded by some constant $C_0$, $|g_\ell| \le C_0$, then one has $|\mathrm{Val}(\Gamma)| \le C_0^{L}$, where $L$ is the number of lines in $\Gamma$ (see (4.41)); as the number of Feynman diagrams in $G_0$ is bounded by $O(s!^2)$, we obtain a bound $s!^2 C^n$ from (4.42), for some constant $C$. On the other hand, it is a remarkable inequality that the determinant in (4.43) can still be bounded by a constant to the power $n$ (Gram–Hadamard inequality; see Section A.3), so that a bound $s!\, C^n$ can be obtained for (4.38) by using the representation (4.43) instead of (4.42). Of course, if one expands the determinant in (4.43) one obtains the expansion in Feynman diagrams (4.42): the dramatic improvement of the bound is due to the fact that one exploits cancellations among the Feynman diagrams (due to the Fermi statistics), which are lost when bounding each addend in (4.42) by its absolute value. More precisely one has

$$
\left| \sum_{T} \prod_{\ell \in T} g_\ell \int dP_T(\mathbf{t})\; \det G^T(\mathbf{t}) \right| \le \sum_{T} \prod_{\ell \in T} |g_\ell|\; C_1^{n-s+1} \le \sum_{T} C_0^{s-1}\, C_1^{n-s+1} \le s!\, (C \max\{C_0, C_1\})^n ,
\tag{4.47}
$$

where $C_1$ is a constant (proportional to $C_0$) such that $|\det G^T(\mathbf{t})| \le C_1^{n-s+1}$ and $s!\, C^n$ takes into account the number of anchored trees which one has to sum over in (4.43); see Section A.3. We shall see that, as anticipated above, this will allow us to pass from a factorial $s!^2$ to a factorial $s!$ in the estimates, and that this will be enough in order to obtain convergence, as a factor $1/s!$ arises from the perturbative expansion (see (5.22) and the comments around (5.46)); see the end of Section 5.4.

5. The multiscale decomposition and power counting

5.1. Tree expansion

The Schwinger functions (3.1) can be written also as Grassmann integrals of exponential functions (as we shall see in detail in Section 7). Here we shall see that such integrals can be written as sums over trees by following two possible routes. (Note that the trees involved in the construction below must not be confused with the (anchored) trees introduced in the previous section.)

The first route consists in looking at the Feynman diagrams and realizing that it is convenient to associate with each of them a set of boxes, called clusters, establishing a hierarchical order between the sizes of the momenta of the lines of the propagators. The reason for doing this is the following: if the momenta of the lines in some box are larger than the momenta of the lines outside the box, one has a possibly "dangerous" contribution, while this is not the case in the opposite situation: it is natural that two such different contributions have to be treated in a different way. This argument will become clearer below and in Section 7. Note that such reasoning was followed by Bogolubov, Hepp and Zimmermann (see [58,59]). We shall see
that the set of clusters associated to any graph can be very conveniently represented in terms of trees.

The other way for introducing trees follows the ideas of Wilson on the Renormalization Group, see [27]; one wants to implement the idea that, integrating the "irrelevant" degrees of freedom of a theory, one gets an "effective theory" much simpler than the preceding one, such that all the important physical information is encoded in it. We will follow this route.

For concreteness, we consider the discrete case, in which the free Hamiltonian is given by (2.1) (anyway the following discussion can be easily adapted to the continuum case). So if we denote by $\mathbf{k} = (k, k_0)$ the momentum (see (3.5)), we have that $k$ is defined modulo $2\pi$. Let $\|\cdot\|_{\mathbb{T}}$ denote the distance on the one-dimensional torus $\mathbb{T} \equiv \mathbb{R}/2\pi\mathbb{Z}$, i.e.

$$
\|k\|_{\mathbb{T}} = \min_{n \in \mathbb{Z}} |k - 2\pi n| .
\tag{5.1}
$$

Fix $p_F = 2\pi n_F/L$, with $n_F \in \mathbb{N}$, such that $1 - \cos p_F = \mu_0$. We introduce a smooth $C^\infty$ function $\chi(\mathbf{k}')$ such that, if

$$
|\mathbf{k}'| = \sqrt{k_0^2 + v_0^2\, \|k'\|_{\mathbb{T}}^2} , \qquad v_0 = \left. \frac{dE}{dk} \right|_{k = p_F} = \sin p_F ,
\tag{5.2}
$$

with $E(k)$ defined in (4.34), then

$$
\chi(\mathbf{k}') = \begin{cases}
1 & \text{if } |\mathbf{k}'| \le t_0 = a_0/\gamma , \\
0 & \text{if } |\mathbf{k}'| \ge a_0 ,
\end{cases}
\tag{5.3}
$$

where $a_0 = \min\{p_F/2, \pi - p_F/2\}$ and $\gamma > 1$; see Fig. 3.

Fig. 3. The function $\chi(\mathbf{k}')$.

We can write in (4.33)

$$
\begin{aligned}
\hat g(\mathbf{k}) &= \hat g^{(u.v.)}(\mathbf{k}) + \hat g^{(i.r.)}(\mathbf{k}) , \\
\hat g^{(u.v.)}(\mathbf{k}) &\equiv \frac{1 - \chi(k_0, k + p_F) - \chi(k_0, k - p_F)}{-i k_0 + \cos p_F - \cos k} , \\
\hat g^{(i.r.)}(\mathbf{k}) &\equiv \frac{\chi(k_0, k + p_F) + \chi(k_0, k - p_F)}{-i k_0 + \cos p_F - \cos k} .
\end{aligned}
\tag{5.4}
$$
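The splitting (5.4) can be visualized numerically. The sketch below makes several assumptions that are not in the text: a specific $C^\infty$ cut-off built from $e^{-1/t}$ (any function satisfying (5.3) would do), the values $p_F = \pi/2$, $\gamma = 2$, $L = 64$, $\beta = 50$, and a finite set of frequencies $k_0$ as in (3.5). It checks that the two pieces add up to $\hat g$, that the ultraviolet part stays bounded, and that the infrared part carries the large values near $k_0 \approx 0$, $k = \pm p_F$.

```python
import numpy as np

def smooth_step(t):
    """C-infinity step (one possible choice): 0 for t <= 0, 1 for t >= 1."""
    t = np.asarray(t, dtype=float)
    def h(s):
        out = np.zeros_like(s)
        pos = s > 0
        out[pos] = np.exp(-1.0 / s[pos])
        return out
    return h(t) / (h(t) + h(1.0 - t))

def chi(r, a0, gamma):
    """Cut-off as in (5.3): 1 for r <= a0/gamma, 0 for r >= a0, smooth in between."""
    t0 = a0 / gamma
    return 1.0 - smooth_step((r - t0) / (a0 - t0))

def torus_norm(k):
    """Distance (5.1) from k to the nearest multiple of 2*pi."""
    return np.abs((k + np.pi) % (2 * np.pi) - np.pi)

def split_propagator(L, beta, M, pF, gamma=2.0):
    """Return g, g_uv, g_ir of (5.4) on the grid D_beta x D_L."""
    k = 2 * np.pi * np.arange(-(L // 2), (L - 1) // 2 + 1) / L
    k0 = (2 * np.arange(-M, M) + 1) * np.pi / beta
    K0, K = np.meshgrid(k0, k, indexing="ij")
    v0, a0 = np.sin(pF), min(pF / 2, np.pi - pF / 2)
    g = 1.0 / (-1j * K0 + np.cos(pF) - np.cos(K))
    chi_plus = chi(np.hypot(K0, v0 * torus_norm(K + pF)), a0, gamma)
    chi_minus = chi(np.hypot(K0, v0 * torus_norm(K - pF)), a0, gamma)
    g_uv = (1.0 - chi_plus - chi_minus) * g
    g_ir = (chi_plus + chi_minus) * g
    return g, g_uv, g_ir

g, g_uv, g_ir = split_propagator(L=64, beta=50.0, M=64, pF=np.pi / 2)
print(np.max(np.abs(g_uv)), np.max(np.abs(g_ir)))
```

For these parameters the two cut-off functions have disjoint supports, so the ultraviolet part vanishes identically in a neighbourhood of the two singular points, while the infrared part reaches values of order $\beta/\pi$.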
Fig. 4. Graphic representation of expansion (5.7). We can associate some labels to the points: a label h = 0 to the leftmost point, a label h = 1 to the middle point and a label h = 2 to all the rightmost points (endpoints).
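We introduce, for any $\mathbf{k} \in D_{L,\beta}$, two Grassmann variables, $\psi_{\mathbf{k}}^{(u.v.)}$ and $\psi_{\mathbf{k}}^{(i.r.)}$, with propagators, respectively, $\hat g^{(u.v.)}(\mathbf{k})$ and $\hat g^{(i.r.)}(\mathbf{k})$; given a potential $\mathcal V(\psi)$, by the addition principle, we can write
$$ \int P(d\psi)\, e^{\mathcal V(\psi)} = \int P(d\psi^{(i.r.)}) \int P(d\psi^{(u.v.)})\, e^{\mathcal V(\psi^{(u.v.)} + \psi^{(i.r.)})} \tag{5.5} $$
and, by using the invariance of exponentials property, we have
$$ \int P(d\psi^{(u.v.)})\, e^{\mathcal V(\psi^{(u.v.)} + \psi^{(i.r.)})} = \exp\left[ \sum_{n=0}^{\infty} \frac{1}{n!}\, \mathcal E^T_{u.v.}\big(\mathcal V(\cdot + \psi^{(i.r.)}); n\big) \right] \equiv e^{\mathcal V^{(0)}(\psi^{(i.r.)})} . \tag{5.6} $$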
We shall see later in more detail why it can be of interest to consider an expression like (5.5). It is convenient to represent the expansion for
$$ \mathcal V^{(0)}(\psi^{(i.r.)}) = \sum_{n=0}^{\infty} \frac{1}{n!}\, \mathcal E^T_{u.v.}\big(\mathcal V(\cdot + \psi^{(i.r.)}); n\big) \tag{5.7} $$
as in Fig. 4. One can say that we have "integrated out the high energy degrees of freedom", obtaining an "effective" theory describing fermions with momenta close to the Fermi surface. As $\hat g^{(i.r.)}(\mathbf{k})$ is singular at two different points ($k = \pm p_F$, at $k_0 = 0$), it is natural to write
$$ \hat g^{(i.r.)}(\mathbf{k}) = \frac{\chi(k_0, k + p_F)}{-ik_0 + \cos p_F - \cos k} + \frac{\chi(k_0, k - p_F)}{-ik_0 + \cos p_F - \cos k} \equiv \sum_{\omega = \pm 1} \hat g_\omega^{(i.r.)}(\mathbf{k}) \tag{5.8} $$
and correspondingly we write
$$ P(d\psi^{(i.r.)}) = \prod_{\omega = \pm 1} P(d\psi_\omega^{(i.r.)}) ; \tag{5.9} $$
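the fields $\psi_\omega^{(i.r.)\pm}$ are called quasi-particle Grassmann fields: the label $\omega$ is sometimes called the branch label. Moreover, we decompose each propagator $\hat g_\omega^{(i.r.)}(\mathbf{k})$ as an infinite sum of propagators
$$ \hat g_\omega^{(i.r.)}(\mathbf{k}) = \sum_{h=-\infty}^{0} \frac{f_h(k + \omega p_F, k_0)}{-ik_0 + \cos p_F - \cos k} \equiv \sum_{h=-\infty}^{0} \hat g_\omega^{(h)}(\mathbf{k}) , \tag{5.10} $$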
Fig. 5. The function $f_h(\mathbf{k}')$.
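where
$$ f_h(\mathbf{k}') \equiv \chi(\gamma^{-h} \mathbf{k}') - \chi(\gamma^{-h+1} \mathbf{k}') \tag{5.11} $$
is such that $f_h(\mathbf{k}') = 0$ both for $|\mathbf{k}'| \le t_0 \gamma^{h-1}$ and $|\mathbf{k}'| \ge t_0 \gamma^{h+1}$, while $f_h(\mathbf{k}') = 1$ for $|\mathbf{k}'| = t_0 \gamma^{h}$; see Fig. 5. Note that in fact the series in (5.10) is a finite sum, if $L$, $\beta$ are finite (that is, only a finite number of terms can really be different from zero). In fact if $L$ and $\beta$ are fixed, one has $|k_0| \ge 2\pi/\beta$, so that $f_h(\mathbf{k}') = 0$ for any $h < h_\beta$, with
$$ h_\beta = \min\{h : t_0 \gamma^{h+1} > \pi/\beta\} . \tag{5.12} $$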
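The support properties of the scale functions defined in (5.11), and the way their sum telescopes back to the cutoff function of (5.3), can be checked numerically. The following is only a minimal sketch: the particular profile `smooth_step`, the radial variable `r` standing for $|\mathbf{k}'|$, and the values of `t0` and `gamma` are assumptions made for illustration, since the text only requires a smooth cutoff equal to 1 below $t_0$ and 0 above $a_0 = \gamma t_0$.

```python
import math


def smooth_step(t):
    """C-infinity step (an assumed profile): 0 for t <= 0, 1 for t >= 1."""
    def bump(s):
        return math.exp(-1.0 / s) if s > 0 else 0.0
    return bump(t) / (bump(t) + bump(1.0 - t))


def chi(r, t0, gamma):
    """Smooth cutoff: 1 for r <= t0, 0 for r >= a0 = gamma * t0."""
    a0 = gamma * t0
    return 1.0 - smooth_step((r - t0) / (a0 - t0))


def f_h(r, h, t0, gamma):
    """Scale-h slice: chi(gamma^{-h} r) - chi(gamma^{-h+1} r)."""
    return chi(gamma ** (-h) * r, t0, gamma) - chi(gamma ** (-h + 1) * r, t0, gamma)


def partial_sum(r, h_min, t0, gamma):
    """Sum of f_h over h = h_min, ..., 0.

    By telescoping it equals chi(r) - chi(gamma^{-h_min+1} r), i.e. chi(r)
    as soon as r >= t0 * gamma^{h_min}.
    """
    return sum(f_h(r, h, t0, gamma) for h in range(h_min, 1))


if __name__ == "__main__":
    t0, gamma = 0.5, 2.0
    for r in (1e-5, 0.3, 0.7, 1.2):
        print(r, chi(r, t0, gamma), partial_sum(r, -10, t0, gamma))
```

Truncating the sum at a finite `h_min` plays the role of the infrared cut-off: momenta smaller than $t_0\gamma^{h_{\min}-1}$ are not seen by any of the retained slices.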
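Note that $h_\beta = O(\log \beta)$. Therefore, as long as $\beta$ remains finite, one has a natural infrared cut-off $h_\beta$: of course we are interested in bounds uniform in such a cut-off, i.e. we want to consider the possibility of removing such a cut-off.

Using again the addition principle and the invariance of exponential property, calling $\psi_\omega^{(\le -1)}$ and $\psi_\omega^{(0)}$ the Grassmann fields with propagators $\hat g_\omega^{(\le -1)}(\mathbf{k})$, if
$$ \hat g_\omega^{(\le -1)}(\mathbf{k}) \equiv \sum_{h=-\infty}^{-1} \hat g_\omega^{(h)}(\mathbf{k}) , \tag{5.13} $$
and $\hat g_\omega^{(0)}(\mathbf{k})$, respectively, and writing
$$ P(d\psi^{(h)}) = \prod_{\omega = \pm 1} P(d\psi_\omega^{(h)}) , \tag{5.14} $$
we obtain
$$ \int P(d\psi^{(i.r.)})\, e^{\mathcal V^{(0)}(\psi^{(i.r.)})} = \int P(d\psi^{(\le -1)}) \int P(d\psi^{(0)})\, e^{\mathcal V^{(0)}(\psi^{(\le -1)} + \psi^{(0)})} \equiv \int P(d\psi^{(\le -1)})\, e^{\mathcal V^{(-1)}(\psi^{(\le -1)})} , \tag{5.15} $$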
Fig. 6. Graphic representation of expansion (5.16). The first line represents $\mathcal V^{(-1)}$ in terms of $\mathcal V^{(0)}$, while the second line defines a unique graph representation for all the contributions to $\mathcal V^{(0)}$ (and it is the same as in Fig. 4).
Fig. 7. Graphic representation of $\mathcal V^{(-1)}$ in terms of $\mathcal V$: each term representing $\mathcal V^{(0)}$ in the first line of Fig. 6 is expanded by using the second line of Fig. 6. One should imagine that the leftmost node lies on a vertical line $h = -1$, the nodes immediately following it on a vertical line $h = 0$, the endpoints on a vertical line $h = 2$, while all the other nodes on a vertical line $h = 1$, as it will be in Fig. 8 below.
where
$$ \mathcal V^{(-1)}(\psi^{(\le -1)}) = \sum_{n=0}^{\infty} \frac{1}{n!}\, \mathcal E_0^T\big(\mathcal V^{(0)}(\cdot + \psi^{(\le -1)}); n\big) = \sum_{n=0}^{\infty} \frac{1}{n!}\, \mathcal E_0^T\left( \sum_{n'=0}^{\infty} \frac{1}{n'!}\, \mathcal E^T_{u.v.}(\mathcal V; n'); n \right) . \tag{5.16} $$
A graphical representation of (5.16) is in Fig. 6, where the circles represent $\mathcal V^{(0)}$. Writing the circles as in the second line of Fig. 6 we immediately get Fig. 7.

So $\mathcal V^{(-1)}$ is represented by a graph consisting of a set of lines and points arranged on the plane $(x, y)$ in the following way. A line enters a point $v_0$ and $s \ge 1$ lines connect $v_0$ to other $s$ points $v_1, \dots, v_s$: for each point $v_j$, with $j = 1, \dots, s$, there are $s_j \ge 1$ exiting lines leading to $s_j$ points $v_{j1}, \dots, v_{js_j}$, which we call endpoints. The endpoints (with the lines entering them)
provide a graphic representation of $\mathcal V$, while the subgraphs consisting of a point $v_j$ (with the line entering it) and of all the lines and points following $v_j$ are graphic representations of $\mathcal V^{(0)}$; note that the circles are in fact expanded into such subgraphs. In conclusion, one obtains a graph with a tree structure (see Section A.1 for an introduction to tree graphs).

In order to have an aesthetically pleasing picture we can draw all the points $v_j$, $j = 1, \dots, s$, on the same vertical line $r_1$ and all the points $v_{jj'}$, $j = 1, \dots, s$ and $j' = 1, \dots, s_j$, on the same vertical line $r_2$. By introducing a coordinate system $(x, y)$ we can denote by $x = 1$ and $x = 2$ the two lines $r_1$ and $r_2$, respectively; the point $v_0$ is on the line $x = 0$, while the root is on the line $x = -1$.

Now we can iterate further the above procedure, by integrating all the fields $\psi^{(u.v.)}, \psi^{(0)}, \psi^{(-1)}, \dots, \psi^{(h+1)}$, thus obtaining a contribution to $\mathcal V^{(h)}$, which is defined as
$$ e^{\mathcal V^{(h)}(\psi^{(\le h)})} = \int P(d\psi^{(h+1)}) \cdots \int P(d\psi^{(0)}) \int P(d\psi^{(u.v.)})\, e^{\mathcal V(\psi^{(\le h)} + \psi^{(h+1)} + \cdots + \psi^{(0)} + \psi^{(u.v.)})} ; \tag{5.17} $$
the function $\mathcal V^{(h)}(\psi^{(\le h)})$ is the effective potential on scale $h$. We can also introduce a scale label $h = 1$ to denote the ultraviolet scale, $\psi^{(1)} = \psi^{(u.v.)}$, so that $\psi^{(\le 1)} \equiv \psi$ and $\mathcal V^{(1)}(\psi^{(\le 1)}) = \mathcal V(\psi)$.

By using iteratively the invariance of exponential property we see that $\mathcal V^{(h)}$ can be expressed in terms of $\mathcal V^{(h+1)}$ as
$$ \mathcal V^{(h)}(\psi^{(\le h)}) = \sum_{n=0}^{\infty} \frac{1}{n!}\, \mathcal E^T_{h+1}\big(\mathcal V^{(h+1)}(\cdot + \psi^{(\le h)}); n\big) , \tag{5.18} $$
where $\mathcal V^{(h+1)}$ in turn can be expressed in terms of $\mathcal V^{(h+2)}$ as in (5.18) with $h$ replaced by $h + 1$, and so on until $\mathcal V^{(h)}$ is expressed in terms of $\mathcal V^{(1)} \equiv \mathcal V$. At each step of the iterative procedure a circle representing $\mathcal V^{(h')}$, for some $h < h' < 1$, is transformed into a point $v$ on a vertical line $x = h' + 1$ (we use the coordinate system introduced above) with $s_v \ge 1$ exiting lines leading to $s_v$ circles representing $\mathcal V^{(h'+1)}$, and so on. At the end only points are left (i.e. no circles remain): the ones on the line $x = 2$ are called endpoints.

Summarizing the above discussion, we see that we can introduce a graph representation of $\mathcal V^{(h)}$ in terms of labelled trees. We refer to Section A.1 for a systematic discussion of trees: here we confine ourselves to the basic notions, in order to make the following analysis self-contained.

On the plane $(x, y)$ one draws the vertical lines $x = h, h + 1, \dots, 0, 1, 2$ and one considers all the possible planar graphs obtained as follows [60]. One draws a horizontal line (a branch or a line) starting from a point $r$ on the line $x = h$, the root, and leading to a point $v_0$ with coordinate $x = h_{v_0} > h$, the first nontrivial vertex. Such a point is the branching point of $s_{v_0} \ge 2$ lines (also branches or lines) forming angles $\theta_j \in (-\pi/2, \pi/2)$, $j = 1, \dots, s_{v_0}$, with the $x$-axis and ending at points each of which is located on some vertical line $x = h_{v_0} + 1, h_{v_0} + 2, \dots$ (and it becomes another branching point). One proceeds in such a way until $n$ points on the line $x = 2$ are reached, the endpoints. All the branching points between the root and the endpoints will be called the nontrivial vertices. The trivial vertices will be the points located at the intersections of the lines connecting two
Fig. 8. A tree appearing in the graphic representation of $\mathcal V^{(h)}$. Such a tree is obtained by iterating the graph representations of the previous figures. All the endpoints are on the vertical line corresponding to the line $h = 2$.
nontrivial vertices with the vertical lines. The integer $n$ denoting the number of endpoints will be called the order of the tree. We associate to the endpoints a number from 1 to $n$, ordered from top to bottom. See Fig. 8. If the tree has only one line connecting the root to a vertex on the line $x = 2$, we say that the tree is trivial and we shall write $\tau = \tau_0$. Note that in such a case the root has scale $h = 1$.

The graph so obtained is a tree graph: it consists of a set of lines connecting a partially ordered set of points (the vertices). The partial ordering of the vertices will be denoted by the symbol $\preceq$: if $v \prec w$ are two vertices, then $h_v < h_w$. Of course the lines are ordered as well; note that there is a one-to-one correspondence between vertices and lines, as a line uniquely identifies the vertex which it enters.

Note that to each vertex $v$ an integer $h_v$ is associated by construction: it is called the scale label. In particular we can associate the scale label $h$ to $r$. We can also associate with the unlabelled trees some other labels: the values of such labels will depend on the particular problem we are studying. Therefore, we shall also consider the labelled trees (to be called simply trees in the following): we shall denote by the same symbol $\tau$ the labelled trees (in the following we shall deal only with labelled trees) and by $\mathcal T_{h,n}$ the set of all labelled trees with $n$ endpoints (i.e. of order $n$) and with a scale label $h$ associated to the root. It is then easy to see that the number of unlabelled trees with $n$ endpoints is bounded by $4^n$; see Section A.1.

If we also include the endpoints in the set of vertices, we have that the vertices can be either trivial vertices or nontrivial vertices (which include also the endpoints). We shall denote by $V(\tau)$ the set of vertices of a tree $\tau$ and by $V_f(\tau)$ the set of vertices in $V(\tau)$ which are endpoints. By construction $h_v = 2$ for any $v \in V_f(\tau)$, while $h < h_v < 2$ for any $v \in V(\tau) \setminus V_f(\tau)$.
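The constraints on the scale labels stated above, and the nested sets of endpoints that a tree determines, can be made concrete with a small data structure. This is only an illustrative sketch: the class `Vertex` and the functions `endpoints`, `is_valid_tree` and `nested_endpoint_sets` are hypothetical names introduced here, and trivial vertices are left implicit (a branch is allowed to jump over several scales) as a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Vertex:
    scale: int                                   # scale label h_v
    children: List["Vertex"] = field(default_factory=list)

    def is_endpoint(self) -> bool:
        return not self.children


def endpoints(v):
    """Endpoints following v, listed from top to bottom (v itself if it is one)."""
    if v.is_endpoint():
        return [v]
    return [e for c in v.children for e in endpoints(c)]


def is_valid_tree(root_scale, v0):
    """Check h_v = 2 on endpoints, h < h_v < 2 elsewhere, and h_v < h_w whenever v precedes w."""
    def check(v, parent_scale):
        if v.scale <= parent_scale:
            return False
        if v.is_endpoint():
            return v.scale == 2
        return v.scale < 2 and all(check(c, v.scale) for c in v.children)
    return check(v0, root_scale)


def nested_endpoint_sets(v0):
    """For each branching vertex (s_v >= 2), the scale h_v and the set of endpoint numbers following it."""
    number = {id(e): i + 1 for i, e in enumerate(endpoints(v0))}
    result = []

    def visit(v):
        if v.is_endpoint():
            return
        if len(v.children) >= 2:
            result.append((v.scale, frozenset(number[id(e)] for e in endpoints(v))))
        for c in v.children:
            visit(c)

    visit(v0)
    return result


if __name__ == "__main__":
    w = Vertex(1, [Vertex(2), Vertex(2)])
    u = Vertex(0, [w, Vertex(2)])
    v0 = Vertex(-1, [u, Vertex(2)])              # tree of order 4, root on scale h = -2
    print(is_valid_tree(-2, v0), nested_endpoint_sets(v0))
```

In the example each set of endpoints attached to a vertex contains the sets attached to the vertices following it, which is the inclusion structure used for the clusters of Section 5.2.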
To each endpoint there corresponds one of the contributions to the interaction part of the Hamiltonian. With respect to Hamiltonian (2.17), it is more convenient to consider a Hamiltonian containing some extra terms having the same form as the terms defining the free Hamiltonian $H_0$ times some parameter: physically this is interpreted by saying that the interaction changes the "free" values of the parameters, i.e. the values of the parameters of the Hamiltonian
describing the free system (the mass and the chemical potential). By using the decomposition in (2.1) and (2.2) for H0 , we shall consider Hamiltonians of the form H = H0 + V = H0 + 2V1 + 7V2 + uV3 + V4 + (V5 ; V1 = T0 ; V2 = N0 ; V3 = P ; V4 = V ; V5 = B :
(5.19)
Then with each endpoint $v$ of scale $h_v = 2$ we associate one of the five contributions to $\mathcal V$; so we can associate with $v$ a label $i \equiv i_v \in \{1, \dots, 5\}$ uniquely identifying the contribution $V_i$ to $\mathcal V$ in (5.19). We shall say that the endpoint is of the type named after the coupling constant multiplying $V_i$ in (5.19): for instance, it is of type $\nu$ if $i = 2$, of type $u$ if $i = 3$ and of type $\lambda$ if $i = 4$. We can also introduce a label $r_v$ for $v \in V_f(\tau)$, equal to the coupling constant multiplying $V_{i_v}$ in (5.19). If $n$ is the number of endpoints, $n = |V_f(\tau)|$, we shall write $n = n_1 + \cdots + n_5$, where $n_i$ is the number of endpoints $v \in V_f(\tau)$ with $i_v = i$. Moreover, with such an endpoint $v$ we also associate a set $\{\mathbf{x}_v\}$ of space-time points, which are the integration variables corresponding to the particular interaction contribution $V_i$: in particular $\{\mathbf{x}_v\}$ contains one point for any $i \ne 4$ and two points for $i = 4$. Given a vertex $v$, which is not an endpoint, $\{\mathbf{x}_v\}$ will denote the family of all space-time points associated with the endpoints following $v$, i.e. with the endpoints $w \in V_f(\tau)$ such that $v \prec w$.

We introduce a field label $f$ to distinguish the fields appearing in the terms associated with the endpoints: the set of field labels associated with the endpoint $v$ will be called $I_v$. Then $\mathbf{x}(f)$, $\sigma(f)$ and $\omega(f)$ will denote the space-time point, the $\sigma$ index and the $\omega$ index, respectively, of the field with label $f$. For instance, for $v \in V_f(\tau)$ with $i_v = 4$, then $\{\mathbf{x}_v\} = \{\mathbf{x}, \mathbf{y}\}$ and $I_v = \{f_1, f_2\}$, if $\mathbf{x}(f_1) = \mathbf{x}$ and $\mathbf{x}(f_2) = \mathbf{y}$. We shall also write $\mathbf{x}(I_v) = \{\mathbf{x}(f) : f \in I_v\}$. Analogously, if $v$ is not an endpoint, we shall call $I_v$ the set of field labels associated with the endpoints following the vertex $v$.

5.2. Clusters

It is clear that, if $h \le 0$, the effective potential (if $\tilde E_h$ are normalization factors for any $h \le 2$) can be written in the following way:
$$ \mathcal V^{(h)}(\psi^{(\le h)}) + L\beta \tilde E_{h+1} = \sum_{n=1}^{\infty} \sum_{\tau \in \mathcal T_{h,n}} \mathcal V^{(h)}(\tau, \psi^{(\le h)}) , \tag{5.20} $$
where $\mathcal V^{(h)}(\tau, \psi^{(\le h)})$ is defined iteratively as follows.
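If $\tau$ is the trivial tree $\tau_0$, then $h = 1$ and $\mathcal V^{(1)}(\tau_0, \psi^{(\le 1)})$ is given by one of the contributions to $\mathcal V(\psi)$ listed in (5.19). If $\tau$ is nontrivial and $v_0$ is the first vertex of $\tau$ and $\tau_1, \dots, \tau_s$ (with $s = s_{v_0}$) are the subtrees of $\tau$ with root $v_0$, then
$$ \mathcal V^{(h)}(\tau, \psi^{(\le h)}) = \frac{1}{s!}\, \mathcal E^T_{h+1}\big( \mathcal V^{(h+1)}(\tau_1, \psi^{(\le h+1)}), \dots, \mathcal V^{(h+1)}(\tau_s, \psi^{(\le h+1)}) \big) . \tag{5.21} $$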
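In general, for each $v \in V(\tau)$ we denote by $s_v$ the number of lines exiting from $v$ ($s_v = 0$ if $v \in V_f(\tau)$), so that, by iterating (5.21), one obtains
$$ \mathcal V^{(h)}(\tau, \psi^{(\le h)}) = \left[ \prod_{v \in V(\tau)} \frac{1}{s_v!} \right] \mathcal E^T_{h+1}\big( \mathcal E^T_{h+2}\big( \mathcal E^T_{h+3} \cdots \mathcal E^T_{-2}\big( \mathcal E^T_{-1}\big( \mathcal E^T_0\big( \mathcal V(\tau_0, \psi^{(\le 1)}), \dots \big), \dots \big), \dots \big), \dots \big), \dots \big) , \tag{5.22} $$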
where $\tau_0$ is the trivial tree. The truncated expectations in (5.21) are meant to be computed starting from the endpoints towards the root. The above expression can look a little intricate at first sight: the best way to understand it is to work out some examples explicitly (for instance for values of $h$ like $h = 0, -1, -2, \dots$) and try to generalize them to any value of $h \le 0$.

Once a vertex $v$ is reached, one has to consider an expression of the kind
$$ \frac{1}{s_v!}\, \mathcal E^T_{h_v}\big( \tilde\psi^{(\le h_v)}(P_{v_1}), \dots, \tilde\psi^{(\le h_v)}(P_{v_{s_v}}) \big) , \tag{5.23} $$
where $s_v$ is the number of lines exiting from $v$ and $P_{v_j}$, with $j = 1, \dots, s_v$, is a set of indices such that
$$ \tilde\psi^{(\le h_v)}(P_{v_j}) = \prod_{f \in P_{v_j}} \psi^{(\le h_v)\sigma(f)}_{\mathbf{x}(f), \omega(f)} , \qquad j = 1, \dots, s_v , \tag{5.24} $$
is a product of $|P_{v_j}|$ fields on scale $\le h_v$. This can be proven by induction on the scale $h_v$; see Section A.6. Therefore, the effect of the truncated expectation $\mathcal E^T_{h_v}$ is to contract the fields on scale $h_v$ appearing in the products (5.24) in all possible ways. If one uses expansion (4.42) one obtains a sum over all the possible Feynman diagrams which can be obtained by contracting the half-lines emerging from the sets $P_{v_1}, \dots, P_{v_{s_v}}$. This means that, when the vertex $v$ is reached moving along the tree $\tau$, we construct a "diagram" formed by lines $\ell$ on scales $h_\ell \ge h_v$. To any vertex $w \succeq v$ there corresponds a subdiagram $\Gamma_w$ such that all the lines on scale $h_w$ form a connected set if all the subdiagrams $\Gamma_{w_j}$, $j = 1, \dots, s_w$, corresponding to the vertices immediately following $w$, are thought of as contracted into points (this simply follows from the very definition of truncated expectation). We call $P_v$ the set of labels corresponding to the fields associated to the external lines of
Fig. 9. An example of a Feynman graph $\Gamma$ with its clusters. The cluster structure uniquely identifies a tree $\tau$ and vice versa. All the endpoints are supposed to be of type $\lambda$ (i.e. $i_v = 4$ $\forall v \in V_f(\tau)$); the graph elements corresponding to the endpoints are as will be shown in Fig. 11 below. It is customary to draw the graph elements representing $V$ without explicitly drawing the wavy line (representing the two-body potential), so that the two coordinates $\mathbf{x}$ and $\mathbf{y}$ appear as if they were superimposed on each other.
in (5.23) we have
$$ P_v = \bigcup_{j=1}^{s_v} Q_{v_j} , \tag{5.25} $$
if $Q_{v_j}$ is the collection of the labels of the fields associated to the external fields of
Fig. 10. A tree of order 5 and the corresponding clusters. Only the clusters corresponding to the nontrivial vertices are explicitly taken in consideration.
inequality for grouping together classes of diagrams, and on each Feynman diagram a bound is given.

The tree structure underlying (5.22) provides an arrangement of endpoints into a hierarchy of clusters contained in each other. With each vertex $v$ we can associate the cluster $G_v$ formed by the endpoints following $v$. Then, by construction, there will be an inclusion relation between clusters such that $G_v \supset G_w$ if $v \prec w$. So, given a tree, we can represent it as a set of clusters and vice versa; see Fig. 10, where only the clusters associated to nontrivial vertices are drawn. As we said above, given a cluster $G_v$, if all the maximal subclusters $G_{v_1}, \dots, G_{v_{s_v}}$ contained inside $G_v$ are thought of as points, then the set of points so obtained is connected; so it is possible to single out a set of $s_v - 1$ lines connecting them. Such a set will be called an anchored tree: it realizes a minimal connection between the maximal subclusters of $G_v$. For each cluster $G_v$ the set $P_v$ determines the external lines of any diagram
will have been introduced). As we shall see, the procedure described here will be too naive to produce a meaningful description of the physics underlying the model we are studying: a more careful analysis will be necessary in order to correctly describe the model.

As said just at the beginning of Section 5.1, even without using the Gram–Hadamard inequality, the introduction of the clusters turns out to be a useful device in order to identify which propagators in a (class of) Feynman diagram are really dangerous. Given a Feynman diagram $\Gamma$, suppose we consider a (connected) subdiagram $\Gamma'$ formed by some points and by the lines connecting them; we shall see later (see Section 5.4) that bad estimates can arise from such a subdiagram only if the number of external lines (i.e. of lines emerging from the vertices internal to $\Gamma'$ but not belonging to $\Gamma'$) is equal to 2 or 4, for the class of models we are considering. A more careful analysis would show that such a contribution can really create problems only if all lines internal to $\Gamma'$ have a momentum of size larger than the size of the momenta of the external lines. So if the subdiagram is a cluster, such a property of the subdiagrams is automatically taken into account and, in terms of clusters, we can say that only clusters with 2 or 4 external lines can be a source of problems: such an argument will be given a more rigorous formulation in Section 5.4 below.

5.3. Values of Feynman diagrams

Suppose (for simplicity and for concreteness) that each endpoint is of type $\lambda$: then the Feynman diagrams with $p$ external lines are all the possible diagrams obtained by connecting all the clusters and leaving $p$ uncontracted lines. Expanding the truncated expectation in (5.21) by using the Feynman diagram expansion (see Fig. 1), one obtains a representation of $\mathcal V^{(h)}(\tau, \psi^{(\le h)})$ as a sum over Feynman diagrams of quantities which are given by the product of fields times suitable coefficients called the values of the Feynman diagrams.
As we said before, for the moment we are supposing that all the truncated expectations are written in terms of Feynman diagrams: then we shall obtain some bounds on the values of the Feynman diagrams. A final bound on the kernels of the effective potentials can be obtained simply by multiplying the bound holding for a generic Feynman diagram times the number of Feynman diagrams. In Section 6 we shall prove that the same dimensional arguments can still be performed by directly studying (5.22) and making use also of expansion (4.43) for the truncated expectations. The final expression for the effective potential will be called the nonrenormalized expansion for reasons which will become clear later. As we shall see, the procedure described here will be too naive to produce a meaningful description of the physics underlying the model we are studying: a more careful analysis will be necessary in order to correctly describe the model.

Now let us come back to the bounds on Feynman diagrams. The value to be assigned to any Feynman diagram is obtained in the following way (for instance, in momentum space). Given a line $\ell$ of a Feynman diagram, there will be a cluster $G_v$ on scale $h_v$ such that $\ell$ is contained in $G_v$ but it is outside any other cluster internal to $G_v$; moreover, the momentum of the propagator corresponding to such a line will be of the form $k = k' + \omega p_F$, for some values of $k'$ of size $\gamma^{h_v}$ (otherwise, by the support properties of the $\chi$ functions, the value of
the corresponding diagram is vanishing) and of $\omega = \pm 1$. Then with the line $\ell$ the following labels will be associated: $\mathbf{k}_\ell = \mathbf{k}$, $\omega_\ell = \omega$ and $h_\ell = h_v$. One associates with each contracted line $\ell$ the propagator
$$ g_\ell^{(h_\ell)} = g_{\omega_\ell}^{(h_\ell)}(\mathbf{x} - \mathbf{y}) = \frac{1}{L\beta} \sum_{\mathbf{k}_\ell \in D_{L,\beta}} e^{-i \mathbf{k}_\ell \cdot (\mathbf{x} - \mathbf{y})}\, \hat g_{\omega_\ell}^{(h_\ell)}(\mathbf{k}_\ell) , \tag{5.26} $$
where $\mathbf{x}$ and $\mathbf{y}$ are the points connected by the line $\ell$: here again, for concreteness, we are supposing that the model with free Hamiltonian $H_0$ given by (2.1) is considered. Each line has a momentum according to the usual momentum conservation rules, the independent momenta are integrated and the external fields $\psi^{(\le h)}$ are associated with the lines which are noncontracted, if $h$ is the scale of the root of the tree. Then, the coefficient by which the product of external fields is multiplied is the value of the Feynman diagram.

Note that Feynman diagrams associated with a set of clusters naturally appear. We have seen that if one looks at standard Feynman diagrams, one is naturally led to introduce clusters to identify the subgraphs responsible for divergences (which are the subdiagrams such that the momenta of their internal lines are larger than the momenta of their external lines).

5.4. Power counting

It is quite easy to estimate the above Feynman diagrams. First note that each propagator $g_\omega^{(h)}(\mathbf{x})$ is finite and, for any integer $N$, it is bounded by
$$ |g_\omega^{(h)}(\mathbf{x})| \le \gamma^h\, \frac{C_N}{1 + (\gamma^h |\mathbf{x}|)^N} , \tag{5.27} $$
as is easy to derive by using (5.10); see Section A.4. We perform the estimates in coordinate space; at this level the estimates could also be performed in momentum space and no conceptual difference would arise, but we shall see that in the nonperturbative estimates it is convenient to work in coordinate space.

Note that, given a Feynman diagram $\Gamma$, there is a tree which can be associated to it, uniquely determined by the cluster structure of $\Gamma$: let us call it $\tau$. Then, as all the clusters have to be connected, by the very definition of the truncated expectation (see Section 4.2), the integrations, up to a constant $C^n$, produce a factor
$$ \prod_{v \notin V_f(\tau)} \gamma^{-2h_v(s_v - 1)} , \tag{5.28} $$
if $s_v$ is the number of subtrees coming out from $v$ and $v \notin V_f(\tau)$ stands for $v \in V(\tau) \setminus V_f(\tau)$; see Section A.4. Moreover, for any cluster $G_v$, $v \notin V_f(\tau)$, by using (5.27) we get, up to a constant $C^n$, a factor
$$ \gamma^{h_v n_v^0} , \tag{5.29} $$
if $n_v^0$ is the number of propagators internal to a cluster $G_v$ but not to any smaller one.
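So the bound for the value of a generic Feynman diagram $\Gamma$ is given by
$$ \int d\mathbf{x}(I_{v_0})\, |\mathrm{Val}(\Gamma)| \le C^n \prod_{v \notin V_f(\tau)} \gamma^{h_v (n_v^0 - 2(s_v - 1))} , \tag{5.30} $$
where $\tau$ is the tree associated to $\Gamma$, $I_{v_0} = \{1, \dots, n + n_4\}$ (if $n$ is the number of endpoints and $n_4$ is the number of endpoints $v$ with $i_v = 4$) and
$$ \int d\mathbf{x}(I_{v_0}) = \prod_{f \in I_{v_0}} \sum_{x(f) \in \Lambda} \int_{-\beta/2}^{\beta/2} dx_0(f) . \tag{5.31} $$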
For simplicity, we shall consider only the case in which for $v \in V_f(\tau)$ one has either $i_v = 4$ or $i_v = 2$, see (5.19). Let $m_{4,v}$ be the number of endpoints contained in the cluster $G_v$ to which is associated a label $i = 4$ and let $m_{2,v}$ be the number of endpoints contained in the cluster $G_v$ to which is associated a label $i = 2$. Moreover, let $n_v^e$ be the number of fields external to the cluster $G_v$. Then the following relations can be easily checked to hold, if $v'$ is the vertex preceding $v$ on the tree:
$$ \sum_{v \notin V_f(\tau)} (h_v - h)(s_v - 1) = \sum_{v \notin V_f(\tau)} (h_v - h_{v'})(m_{4,v} + m_{2,v} - 1) \tag{5.32} $$
and
$$ \sum_{v \notin V_f(\tau)} (h_v - h)\, n_v^0 = \sum_{v \notin V_f(\tau)} (h_v - h_{v'}) \left( 2m_{4,v} + m_{2,v} - \frac{n_v^e}{2} \right) . \tag{5.33} $$
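Note that $h_v - h_{v'} = 1$ by construction. Inserting the above two equalities into (5.30), one gets
$$ \int d\mathbf{x}(I_{v_0})\, |\mathrm{Val}(\Gamma)| \le C^n\, \gamma^{-h(n^e_{v_0}/2 - 2 + m_{2,v_0})} \prod_{v \notin V_f(\tau)} \gamma^{-(h_v - h_{v'})(n_v^e/2 - 2)}\, \gamma^{-(h_v - h_{v'}) m_{2,v}} , \tag{5.34} $$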
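where $v_0$ is the node immediately following the root. Note that in Section 8 it is more convenient to redecompose
$$ h\, m_{2,v_0} + \sum_{v \in V(\tau)} (h_v - h_{v'})\, m_{2,v} = \sum_{v \in V_f(\tau)} h_v\, m_{2,v} , \tag{5.35} $$
where, for an endpoint $v$, one has
$$ m_{2,v} = \begin{cases} 1 & \text{if } v \text{ is of type } \nu , \\ 0 & \text{otherwise} . \end{cases} \tag{5.36} $$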
Identity (5.35) can be easily verified analogously to (5.32) and (5.33). In this way, we obtain a factor $\gamma^{-h_v}$ for each endpoint $v$ with $i_v = 2$ (see (5.19)). As all endpoints are on scale $h = 2$, this means that we have a factor $\gamma^{-1}$ for each endpoint of type $\nu$: we prefer to maintain the writing $h_v$ for reasons that will become clear in the following.
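The role of the exponent $n_v^e/2 - 2$ appearing in (5.34) can be illustrated by summing the corresponding factor over the possible scale differences of a single vertex. The sketch below is not the estimate used in the text (which sums over whole trees); it only shows, under the simplifying assumption of a single vertex and with hypothetical function names, how the behaviour changes with the number of external lines.

```python
def dimension(n_ext):
    """Exponent n_ext/2 - 2 attached to a cluster with n_ext external lines."""
    if n_ext % 2:
        raise ValueError("the number of external lines is assumed to be even")
    return n_ext // 2 - 2


def scale_sum(n_ext, gamma, depth):
    """Sum of gamma^(-d * dimension) over scale differences d = 1, ..., depth."""
    D = dimension(n_ext)
    return sum(gamma ** (-d * D) for d in range(1, depth + 1))


if __name__ == "__main__":
    for n_ext in (2, 4, 6, 8):
        print(n_ext, [scale_sum(n_ext, 2.0, depth) for depth in (5, 10, 20)])
```

With `gamma = 2.0` the sum stays below 1 for six external lines whatever the depth, grows linearly with the depth for four external lines and exponentially for two, which is the elementary mechanism behind the need to treat clusters with 2 or 4 external lines separately.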
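Given a cluster $G_v$ we denote by $P_v$ the set of labels $f$ such that $\mathbf{x}(f)$ is an endpoint contained in $G_v$ and $\psi^{(\le h)\sigma(f)}_{\mathbf{x}(f), \omega(f)}$ is the field associated to a line external to $G_v$, so that $n_v^e = |P_v|$. By defining
$$ D(P_v) = \frac{n_v^e}{2} - 2 , \tag{5.37} $$
we can rewrite bound (5.34) as
$$ \int d\mathbf{x}(I_{v_0})\, |\mathrm{Val}(\Gamma)| \le C^n\, \gamma^{-h D(P_{v_0})} \prod_{v \notin V_f(\tau)} \gamma^{-(h_v - h_{v'}) D(P_v)} \prod_{v \in V_f(\tau)} \gamma^{-h_v m_{2,v}} . \tag{5.38} $$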
The above estimate is of course finite (contrary to the power counting of the whole theory, whose propagators are singular), but problems arise if one wants to perform the sum over the scales of a tree. If $n_v^e \ge 6$, then $(n_v^e/2 - 2) \ge 1$ and so, by using that $h_v - h_{v'} > 0$,
$$ \sum_{\tau \in \mathcal T_{h,n}} \sum_{\{P_v\}} \prod_{v \notin V_f(\tau)} \gamma^{-(h_v - h_{v'})(n_v^e/2 - 2)} \le \sum_{\tau \in \mathcal T_{h,n}} \sum_{\{P_v\}} \prod_{v \notin V_f(\tau)} \gamma^{-(h_v - h_{v'})} \le C^n , \tag{5.39} $$
for some (different) constant $C$, as is proven in Section A.6.1. Note that (5.37) could suggest that each time we have an endpoint $v \in V_f(\tau)$ of type $\nu$, we gain an extra unit contributing to $D(P_w)$ for all $w \preceq v$, so that one could think that no problems arise for $n_v^e = 4$ when $m_{2,v} \ge 1$ and for $n_v^e = 2$ when $m_{2,v} \ge 2$. Nevertheless, this is not true, as all such gains are paid by an extra bad factor $\gamma^{-h n_2}$ in front of the product in (5.37).

Then we identify immediately the following problem. If $n_v^e \le 4$ the above sum cannot be performed; then the clusters with 2 or 4 external lines have to be renormalized. At this level, this simply means that there is something to do if one wants to obtain something meaningful: one will have to consider a different expansion.

The above problem manifests itself at a perturbative level, as the effect of the bounds for single diagrams, if the sum over the trees is performed. However, there is also a nonperturbative problem; even if $n_v^e \ge 6$, we cannot conclude from such bounds that the theory has a meaning. The reason is the following. As we see from (5.22) we have a factor $1/s_v!$ for each vertex $v \in V(\tau)$. If we expand the truncated expectation $\mathcal E^T_{h_v}$ in terms of Feynman diagrams we obtain $O(s_v!^2)$ terms (see Section A.1). Then the overall combinatorial factor is proportional to
$$ \prod_{v \in V(\tau)} \frac{s_v!^2}{s_v!} = \prod_{v \in V(\tau)} s_v! , \tag{5.40} $$
where $n$ is the number of endpoints in $\tau$. This means that for any vertex there are too many diagrams and the factor $1/s_v!$ arising from the expansion into products of truncated expectations (5.22) is not enough to compensate the number of Feynman diagrams. So the sum over $n$ cannot be performed.

We shall consider a particular model for introducing all the renormalization group formalism, rather than developing it in abstracto. By following [43] we choose the Holstein–Hubbard
model in which, with respect to the adiabatic Holstein model, there is also a quartic term in the interaction, so that we have at our disposal a model which presents most of the interesting features of one-dimensional fermionic systems: by simply putting $\lambda = 0$ (i.e. by neglecting the two-body interaction), we recover the adiabatic Holstein model.

The problem arising from the bound (5.40) is then solved through the use of the Gram–Hadamard inequality, which allows us to obtain $s_v!$ terms for each $v \in V(\tau)$ instead of $s_v!^2$ terms. The technical details are deferred to the next section and to Section A.3. After treating the Holstein–Hubbard model we shall show (see Section 12 below) how similar methods can produce many results in a number of fermionic models.

5.5. Comparison with Wilson's method

In the previous sections we have seen how to define a sequence of effective potentials $\mathcal V^{(0)}, \mathcal V^{(-1)}, \dots, \mathcal V^{(h)}$ by integrating the fields $\psi^{(1)}, \psi^{(0)}, \dots, \psi^{(h+1)}$. It is interesting to note the similarity of this approach with the renormalization group of Wilson [27]. Calling $\psi^{(>\Lambda)}$ and $\psi^{(\le\Lambda)}$ the fields with momentum $\mathbf{k} = (k, k_0)$ with $|\mathbf{k}|$ bigger or lower than some prefixed scale $\Lambda$, in the approach of Wilson, one computes (see for instance [22])
$$ \int P(d\psi^{(>\Lambda)})\, e^{\mathcal V(\psi)} = e^{\mathcal V^{(\Lambda)}(\psi^{(\le\Lambda)})} . \tag{5.41} $$
Comparing $\mathcal V^{(\Lambda)}$ and $\mathcal V^{(\Lambda + d\Lambda)}$, for $|d\Lambda| \ll 1$, one gets in the limit $d\Lambda \to 0$ some differential equation for the running coupling constants which will be introduced in Section 8 below. One can see that this is what we do in the limit $\gamma \to 1$ and considering a sharp partition of unity through $\theta$-functions instead of the $\chi$-functions introduced through (5.3). The reason why we do not do this will become clear in the following: essentially, it is that one has to perform derivatives and the derivative of a $\theta$-function is a $\delta$-function, so that this causes some technical difficulties. We think that it is possible to extend our formalism closer to Wilson's original formulation, but there is essentially no simplification in doing this; therefore, we will not discuss such a point further here.

6. Nonperturbative estimates for the nonrenormalized expansion

6.1. Kernels of the effective potentials

By the analysis of the previous section we have that $\mathcal V^{(h)}(\psi^{(\le h)})$, the effective potential on scale $h$, can be written as
$$ \mathcal V^{(h)}(\psi^{(\le h)}) = \sum_{n=1}^{\infty} \sum_{\tau \in \mathcal T_{h,n}} \mathcal V^{(h)}(\tau, \psi^{(\le h)}) , $$
$$ \mathcal V^{(h)}(\tau, \psi^{(\le h)}) = \int d\mathbf{x}(I_{v_0}) \sum_{\mathbf{P}_{v_0} \subset I_{v_0}} \tilde\psi^{(\le h)}(\mathbf{P}_{v_0})\, W^{(h)}(\tau, \mathbf{P}_{v_0}, \mathbf{x}(I_{v_0})) , \tag{6.1} $$
where $\mathcal T_{h,n}$ is the set of labelled trees of order $n$ contributing to $\mathcal V^{(h)}(\psi^{(\le h)})$ and
$$ \int d\mathbf{x}(I_{v_0}) = \prod_{f \in I_{v_0}} \sum_{x(f) \in \Lambda} \int_{-\beta/2}^{\beta/2} dx_0(f) , \tag{6.2} $$
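with $I_{v_0} = \{1, \dots, n + n_4\}$, if $n$ is the number of endpoints and $n_4$ is the number of endpoints $v$ with $i_v = 4$. By using (5.21) and (6.1) we obtain for the kernel $W^{(h)}(\tau, \mathbf{P}_{v_0}, \mathbf{x}(I_{v_0}))$ the following recursive relation:
$$ W^{(h)}(\tau, \mathbf{P}_{v_0}, \mathbf{x}(I_{v_0})) = \sum_{\mathbf{P}_{v_1}, \dots, \mathbf{P}_{v_{s_{v_0}}}} \left[ \prod_{j=1}^{s_{v_0}} W^{(h+1)}(\tau_j, \mathbf{P}_{v_j}, \mathbf{x}(I_{v_j})) \right] \frac{1}{s_{v_0}!}\, \mathcal E^T_{h+1}\big( \tilde\psi^{(h+1)}(\mathbf{P}_{v_1} \setminus Q_{v_1}), \dots, \tilde\psi^{(h+1)}(\mathbf{P}_{v_{s_{v_0}}} \setminus Q_{v_{s_{v_0}}}) \big) , \tag{6.3} $$
where
$$ Q_{v_j} = \mathbf{P}_{v_0} \cap \mathbf{P}_{v_j} , \qquad j = 1, \dots, s_{v_0} . \tag{6.4} $$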
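Then (6.3) can be iterated, leading to
$$ W^{(h)}(\tau, \mathbf{P}_{v_0}, \mathbf{x}(I_{v_0})) = \sum_{\{\mathbf{P}_v\}_{v \in V(\tau)}} \left[ \prod_{v \notin V_f(\tau)} \frac{1}{s_v!}\, \mathcal E^T_{h_v}\big( \tilde\psi^{(h_v)}(\mathbf{P}_{v_1} \setminus Q_{v_1}), \dots, \tilde\psi^{(h_v)}(\mathbf{P}_{v_{s_v}} \setminus Q_{v_{s_v}}) \big) \right] \left[ \prod_{v \in V_f(\tau)} r_v \right] , \tag{6.5} $$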
where $s_v$ is the number of lines exiting from the vertex $v$ (whose value is fixed by the tree $\tau$), while $r_v$ is the coupling constant appearing in (5.19) associated to the endpoint $v$ ($r_v = \lambda$ if $v$ is of type $\lambda$ and so on; see (5.19)). The sum
$$ \sum_{\{\mathbf{P}_v\}_{v \in V(\tau)}} \tag{6.6} $$
in (6.5) is over all the possible choices of the sets $\mathbf{P}_v$ corresponding to the vertices of $\tau$, except $\mathbf{P}_{v_0}$ which is fixed. The sets $Q_v$ are uniquely determined by the sets $\{\mathbf{P}_v\}$ by taking into account that for any $v \in V(\tau)$ one has
$$ Q_v \subset \mathbf{P}_v , \qquad \mathbf{P}_v = \bigcup_{j=1}^{s_v} Q_{v_j} , \tag{6.7} $$
so that for any $v \in V(\tau)$ and for any $v_j$ immediately following $v$ one has
$$ Q_{v_j} = \mathbf{P}_v \cap \mathbf{P}_{v_j} , \qquad j = 1, \dots, s_v , \tag{6.8} $$
which extends (6.4) to any vertex in $V(\tau)$.
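Then we can write (6.1) as
$$ \mathcal V^{(h)}(\psi^{(\le h)}) = \sum_{n=1}^{\infty} \sum_{\tau \in \mathcal T_{h,n}} \mathcal V^{(h)}(\tau, \psi^{(\le h)}) , $$
$$ \mathcal V^{(h)}(\tau, \psi^{(\le h)}) = \sum_{\mathbf{P}_{v_0} \subset I_{v_0}} \int d\mathbf{x}(\mathbf{P}_{v_0})\, \tilde\psi^{(\le h)}(\mathbf{P}_{v_0})\, W^{(h)}(\tau, \mathbf{P}_{v_0}, \mathbf{x}(\mathbf{P}_{v_0})) , \tag{6.9} $$
where
$$ W^{(h)}(\tau, \mathbf{P}_{v_0}, \mathbf{x}(\mathbf{P}_{v_0})) = \int d\mathbf{x}(I_{v_0} \setminus \mathbf{P}_{v_0})\, W^{(h)}(\tau, \mathbf{P}_{v_0}, \mathbf{x}(I_{v_0})) . \tag{6.10} $$
The kernels $W^{(h)}(\tau, \mathbf{P}_{v_0}, \mathbf{x}(\mathbf{P}_{v_0}))$ depend only on the variables
$$ \mathbf{x}(\mathbf{P}_{v_0}) = \{\mathbf{x}(f)\}_{f \in \mathbf{P}_{v_0}} \tag{6.11} $$
and, as we are going to prove, they satisfy the bound
$$ \int d\mathbf{x}(\mathbf{P}_{v_0})\, |W^{(h)}(\tau, \mathbf{P}_{v_0}, \mathbf{x}(\mathbf{P}_{v_0}))| \le \beta L\, \gamma^{-h D(\mathbf{P}_{v_0})} \sum_{\{\mathbf{P}_v\}} \prod_{v \notin V_f(\tau)} \gamma^{-(h_v - h_{v'}) D(\mathbf{P}_v)}\, (C\varepsilon)^n , \tag{6.12} $$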
where $\varepsilon = \max\{|\nu|, |\lambda|\}$, $C$ is a suitable constant depending on $N$ (through bound (5.27) holding for the propagators) and $D(\mathbf{P}_v)$ is defined in (5.37). The sums over the sets $\{\mathbf{P}_v\}$ in (6.12) and over $\tau \in \mathcal T_{h,n}$ in (6.9), needed in order to recover the complete kernels of the effective potential, require $D(\mathbf{P}_v) > 0$. However, this is not true when $n_v^e \le 4$: it will become possible only after the renormalization procedure has been applied (so that $D(\mathbf{P}_v)$ will be modified into $D(\mathbf{P}_v) + z_v$, with $z_v$ such that $D(\mathbf{P}_v) + z_v > 0$).

6.2. Proof of (6.12)

First note that (6.12) involves the integrations of all the endpoints. For all the endpoints $v \in V_f(\tau)$ with $i_v = 4$ we can use the potential $v(x - y)\,\delta(x_0 - y_0)$ in order to integrate one of the two variables $\{\mathbf{x}, \mathbf{y}\}$: so we are left with $n$ integrations. Recall that, by (4.47),
$$ |\mathcal E^T(\tilde\psi(P_1), \dots, \tilde\psi(P_s))| \le \sum_{T} \prod_{\ell \in T} |g_\ell|\; C_1^{n - s + 1} , \tag{6.13} $$
if $n = |P_1| + \cdots + |P_s|$ and $C_1$ is a constant proportional to the bound $C_0$ on the propagator $\mathcal E(\psi^-_{\mathbf{x}} \psi^+_{\mathbf{y}})$. If $\mathcal E = \mathcal E_h$ one has $C_0 = C_N \gamma^h$, see (5.27).
Then, by introducing (4.43) into (6.5) and using (6.13), we can bound

$$|\mathcal{E}^T_{h_v}(\tilde\psi^{(h_v)}(P_{v_1}\setminus Q_{v_1}),\dots,\tilde\psi^{(h_v)}(P_{v_{s_v}}\setminus Q_{v_{s_v}}))| \le \sum_{T}\prod_{\ell\in T}|g_\ell|\; (CC_N)^{\sum_{j=1}^{s_v}|P_{v_j}|-|P_v|}\; \gamma^{h_v\left(\sum_{j=1}^{s_v}|P_{v_j}|-|P_v|\right)}; \tag{6.14}$$

also the propagators $g_\ell$, $\ell\in T$, are on scale $h_v$ and we used the fact that the number of lines internal to $G_v$ which are contracted on scale $h_v$ is given by

$$\sum_{j=1}^{s_v}|P_{v_j}| - |P_v|. \tag{6.15}$$

Then for each anchored tree $T$ contributing to the sum we can use the $s_v-1$ propagators $g_\ell$, with $\ell\in T$, in order to perform $s_v-1$ integrations: this gives a factor

$$\gamma^{-2h_v(s_v-1)}, \tag{6.16}$$

as can be easily proved by using (5.27) (compare with (5.28)); see Section A.4. As the number of integration variables is $n$ (see the initial comments of this section) and

$$\sum_{v\notin V_f(\tau)} (s_v-1) = |V_f(\tau)| - 1 = n-1, \tag{6.17}$$

we see that, at the end, all the integrations can be performed, up to one, corresponding to a single endpoint of the tree: such an integration gives the factor $(\beta L)$ in (6.12). Moreover, we have

$$\prod_{v\in V_f(\tau)} |r_v| \le \varepsilon^n, \tag{6.18}$$

by the definition of $\varepsilon$ after (6.12) and by the fact that $|V_f(\tau)| \le n$. Noting that

$$\sum_{j=1}^{s_v}|P_{v_j}| - |P_v| = n^0_v, \tag{6.19}$$

where $n^0_v$ is defined after (5.29), we obtain, for the left-hand side of (6.12), a bound

$$\beta L \prod_{v\notin V_f(\tau)} \gamma^{h_v(n^0_v-2(s_v-1))}\, (C\varepsilon)^n, \tag{6.20}$$

for some constant $C$, depending on $N$, so that, by using the relations (5.32) and (5.33) and the definition (5.37), (6.12) immediately follows (simply reason as in Section 5.4 about the Feynman diagrams).
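The counting identity (6.17), on which the argument above relies, is a general property of rooted trees: summing $s_v - 1$ over the vertices that are not endpoints gives the number of endpoints minus one. The following Python sketch checks it on randomly generated trees. The tree generator, the dictionary representation and all function names are illustrative assumptions and are not taken from the paper; trivial vertices ($s_v = 1$) are allowed and do not affect the count.

```python
import random


def random_tree(n_endpoints, rng):
    """Build a random rooted tree with exactly n_endpoints endpoints.

    The tree is a dict mapping each vertex to the list of its children;
    vertex 0 is the root. This representation is only an illustration.
    """
    children = {0: []}
    leaves = [0]
    next_id = 1

    def branch(v, s):
        # Attach s new children to the current leaf v.
        nonlocal next_id
        kids = list(range(next_id, next_id + s))
        next_id += s
        children[v] = kids
        for w in kids:
            children[w] = []
        return kids

    # Branch leaves (s_v >= 2) until the required number of endpoints is reached.
    while len(leaves) < n_endpoints:
        v = leaves.pop(rng.randrange(len(leaves)))
        s = rng.randint(2, min(4, n_endpoints - len(leaves)))
        leaves.extend(branch(v, s))

    # Add a few trivial vertices (s_v = 1); they keep the number of endpoints fixed.
    for _ in range(rng.randint(0, 3)):
        v = leaves.pop(rng.randrange(len(leaves)))
        leaves.extend(branch(v, 1))
    return children


def check_identity(children):
    """Return (sum over non-endpoints of (s_v - 1), number of endpoints)."""
    n_end = sum(1 for kids in children.values() if not kids)
    total = sum(len(kids) - 1 for kids in children.values() if kids)
    return total, n_end


if __name__ == "__main__":
    rng = random.Random(1)
    for n in (1, 2, 5, 12):
        total, n_end = check_identity(random_tree(n, rng))
        print(n, n_end, total)
```

Since the identity holds tree by tree, the $n-1$ integrations performed with the anchored-tree propagators always leave exactly one free integration, which is the origin of the factor $\beta L$.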
7. Schwinger functions as Grassmann integrals

7.1. Perturbation theory and Euclidean formalism

The Schwinger functions have been introduced in Section 3.1. The standard perturbation theory allows us to express them in terms of Feynman graphs. By using the representation

$$e^{-tH} = \lim_{n\to\infty}\left[e^{-tH_0/n}\left(1-\frac{t\mathcal{V}}{n}\right)\right]^n, \tag{7.1}$$

where $H_0$ is defined in (2.1) in the discrete case and in (2.2) in the continuum case, while (for instance, see (2.17) and (5.19))

$$\mathcal{V} = uP + \lambda V + \nu H_0, \tag{7.2}$$

one finds for the numerator of (3.1) the following representation. By introducing $p_1+\cdots+p_{s+1}$ variables $t_j$ such that one has $t_j > t_{j+1}$ for any $1 \le j < p_1+\cdots+p_{s+1}$ and the values $t_{p_1},\dots,t_{p_1+\cdots+p_s}$ are fixed to be $t_1,\dots,t_s$, respectively, we define $\mathbf{t} = \{t_j\}$ and set

$$\mathcal{V}(t) = e^{H_0t}\,\mathcal{V}\,e^{-tH_0}. \tag{7.3}$$

Then the numerator of (3.1) becomes

$$\sum \pm \int d\mathbf{t}\; \mathrm{Tr}\; e^{-\beta H_0}\, \mathcal{V}(t_1)\cdots\mathcal{V}(t_{p_1-1})\, \psi^{\varepsilon_1}_{x_1,\sigma_1}\, \mathcal{V}(t_{p_1+1})\cdots \psi^{\varepsilon_s}_{x_s,\sigma_s}\cdots \mathcal{V}(t_{p_1+\cdots+p_{s+1}}), \tag{7.4}$$

where the sum is over the integers $p_1+\cdots+p_{s+1}$, the integral is over all the variables $t_j$, with the constraints described above, and the sign $\pm$ is $+$ if the number of the $\mathcal{V}$ factors is even and $-$ otherwise. By taking into account the fact that each term contributing to $\mathcal{V}$ in (7.2) is an integral on space variables and that $H_0$ is quadratic in the field operators, the terms in (7.4) can be expressed as integrals of sums of products of propagators

$$g(x-y) = \begin{cases} \mathrm{Tr}\, e^{-\beta H_0}\psi^-_x\psi^+_y \,/\, \mathrm{Tr}\, e^{-\beta H_0} & \text{if } x_0 > y_0, \\ -\mathrm{Tr}\, e^{-\beta H_0}\psi^+_y\psi^-_x \,/\, \mathrm{Tr}\, e^{-\beta H_0} & \text{if } x_0 \le y_0. \end{cases} \tag{7.5}$$

Then each term can be graphically represented in terms of Feynman diagrams, which are obtained by contracting in all the possible ways the graph elements represented in Fig. 11. One has $s$ elements of the last form (b) in Fig. 11 and $n$ elements of one of the remaining forms (a). The lines are then contracted as described in Section 4.2. It is a remarkable result [32] that all the nonconnected graphs cancel exactly the denominator of (3.1), which of course can be dealt with as the numerator and gives a formula analogous to (7.4), with the only difference that only $\mathcal{V}$ factors appear and only graph elements of the form (a) in Fig. 11. This explains why only connected graphs have to be considered.
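The product representation (7.1) can be illustrated numerically on a toy example. The sketch below replaces $H_0$ and $\mathcal{V}$ by two fixed $2\times 2$ real matrices (an arbitrary choice made here for illustration, unrelated to the fermionic operators of the text) and compares $[e^{-tH_0/n}(1-t\mathcal{V}/n)]^n$ with $e^{-t(H_0+\mathcal{V})}$. All names are hypothetical; the matrix exponential is computed by a truncated Taylor series, which is adequate only because the matrices are small.

```python
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def mat_scale(a, c):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def mat_exp(a, terms=30):
    """Truncated Taylor series for exp(a); fine for matrices of norm O(1)."""
    result = [row[:] for row in IDENTITY]
    term = [row[:] for row in IDENTITY]
    for n in range(1, terms):
        term = mat_scale(mat_mul(term, a), 1.0 / n)
        result = mat_add(result, term)
    return result


# Toy stand-ins for H_0 and V (assumptions of this sketch).
H0 = [[0.0, 0.0], [0.0, 1.0]]
V = [[0.0, 0.3], [0.3, 0.0]]
T = 1.0


def trotter_product(n):
    """Right-hand side of (7.1) at finite n."""
    step = mat_mul(mat_exp(mat_scale(H0, -T / n)),
                   mat_add(IDENTITY, mat_scale(V, -T / n)))
    result = [row[:] for row in IDENTITY]
    for _ in range(n):
        result = mat_mul(result, step)
    return result


def trotter_error(n):
    """Largest entrywise deviation from exp(-T (H0 + V))."""
    exact = mat_exp(mat_scale(mat_add(H0, V), -T))
    approx = trotter_product(n)
    return max(abs(exact[i][j] - approx[i][j])
               for i in range(2) for j in range(2))


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, trotter_error(n))
```

In this toy case the deviation decreases roughly like $1/n$, in agreement with the fact that each factor differs from $e^{-t(H_0+\mathcal{V})/n}$ at order $n^{-2}$.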
Fig. 11. The graph elements for the model described by the Hamiltonian with interaction given by (7.2). Note that the wavy lines appearing in two of the graph elements of the form (a) have a different meaning: for the graph element associated with the two-body interaction the wavy line represents the potential $v(x-y)\delta(x_0-y_0)$, while for the graph element associated with the external potential it represents $\varphi(x)\delta(x_0)$.
The Schwinger functions can be expressed also in terms of the fermionic functional integrations introduced in Section 4. Expansion (7.4) in terms of fermionic fields can be shown [28] to be equivalent to the expansion in terms of Grassmann variables given by

$$S(x_1,\varepsilon_1,\sigma_1;\dots;x_s,\varepsilon_s,\sigma_s) = \frac{\partial^n}{\partial\phi^{\varepsilon_1}_{x_1,\sigma_1}\cdots\partial\phi^{\varepsilon_s}_{x_s,\sigma_s}} \log\int P(d\psi)\, \exp\left\{\mathcal{V}(\psi) + \sum_{x\in\Lambda}\int_{-\beta/2}^{\beta/2} dx_0\, \left(\phi^+_{x,\sigma}\psi^-_{x,\sigma} + \psi^+_{x,\sigma}\phi^-_{x,\sigma}\right)\right\}, \tag{7.6}$$

where the derivatives are meant as (formal) functional derivatives. The equivalence is formally an identity: it is enough to interpret propagator (7.5) as an expectation value of the product of two Grassmann fields (see (4.36)). Therefore, one finds for the Schwinger functions a graphical expression analogous to that of the effective potentials: the only difference is that the interaction is slightly changed by allowing an interaction with a fictitious “external field”.

Without considering the multiscale decomposition of the propagators and the renormalization effects, the relation between the effective potential (obtained by integrating all the scales) and the Schwinger functions would be easy to derive (see for instance [28]; see also Section 11 later); however, the multiscale decomposition and the renormalization, mostly the change introduced into the “free measure”, make such a relation not so obvious, and the explicit representation of the Schwinger functions in terms of truncated expectations becomes a little involved. This will be carried out later in Section 11, by starting from (7.6). For instance, in the case of the two-point Schwinger functions, one has to compute

$$S(x,-,\sigma;y,+,\sigma) \equiv S(x;y) = \frac{\int P(d\psi)\, e^{\mathcal{V}(\psi)}\, \psi^-_{x,\sigma}\psi^+_{y,\sigma}}{\int P(d\psi)\, e^{\mathcal{V}(\psi)}} = \frac{\partial^n}{\partial\phi^+_x\,\partial\phi^-_y} \log\int P(d\psi)\, \exp\left\{\mathcal{V}(\psi) + \sum_{x\in\Lambda}\int_{-\beta/2}^{\beta/2} dx_0\, \left(\phi^+_{x,\sigma}\psi^-_{x,\sigma} + \psi^+_{x,\sigma}\phi^-_{x,\sigma}\right)\right\}, \tag{7.7}$$
where the interaction $\mathcal{V}(\psi)$ is as above. We note that the second expression in (7.7), as well as (7.6) in the general case of any $s$-point Schwinger function, is more convenient for practical purposes, as it allows us to follow the same strategy adopted for the effective potentials (simply with a different “interaction Hamiltonian”), consisting in integrating the scales in a hierarchical way.

7.2. Feynman graphs and origin of divergences

The expansion given above for the effective potentials and the one hinted at for the Schwinger functions (which, as anticipated, will be carried out in detail in Section 11) are finite sums with finite coefficients if $L,\beta$ are finite; however, in general, there is no hope that the above series are still convergent in the limits $L,\beta\to\infty$. The reasons are numerous and quite easy to understand.

If $\lambda = 0$ (in the limit $L,\beta\to\infty$) the Fourier transform of the Schwinger function is singular at $k_0 = 0$, $|k| = p_F$; even in the most favourable case, in which the interacting Schwinger function has the same kind of singularity as the free one (such systems are generally called Fermi liquids), there is no reason for the singularity of the free and the interacting Schwinger functions to be located at the same point, i.e. $k_0 = 0$, $|k| = p_F$; instead, in general, the interacting singularity will be at some other point $k_0 = 0$, $|k| = p_F + O(\lambda)$. This phenomenon is quite general (there is the remarkable exception of the Luttinger model in which, as we shall see, the singularity is $\lambda$-independent, due to the relativistic invariance of the model) and not limited to the case $d = 1$. By the way, in more than one dimension the situation is even more complicated as, in absence of rotation invariance, the singularity is shifted by an angle-dependent quantity, see [101–103]. It is quite clear that this produces problems in a naive expansion for the correlation function. Assume that the interacting Schwinger function is simply

$$\frac{1}{-ik_0+\cos k-\mu-\nu(\lambda)}, \tag{7.8}$$

which has the same nature of singularity as in the $\lambda = 0$ case, but at the point $\cos^{-1}(\mu+\nu(\lambda))$; an expansion in powers of $\lambda$ needs a preliminary expansion

$$\frac{1}{-ik_0+\cos k-\mu}\sum_{n=0}^{\infty}\left(\frac{\nu(\lambda)}{-ik_0+\cos k-\mu}\right)^n, \tag{7.9}$$
which of course has no meaning for $k$ close to $p_F$. This is one of the reasons for which we expect that the expansion in terms of Feynman diagrams cannot be well defined and why it is not the right expansion to consider. It is also very easy to isolate some of the diagrams reflecting the above shift of the singularity; for instance the diagram represented in Fig. 12. Note that the divergences occur when the momenta of the propagators of the lines in the boxes are larger than the momenta of the lines external to the boxes; in fact, in the other case the small momenta of the external lines are compensated by the momenta of the lines internal to the boxes, and no accumulation of propagators with small momenta is present. Then, by the above simple example, we learn that we have to divide the integration domains for each Feynman diagram to single out the truly dangerous contributions; in other words, we have to factorize the product of propagators for each Feynman diagram according to the relative size of the momenta of the propagators associated to the lines. Such a “factorization” is essential in a consistent theory of renormalization; if one does not do it, one finds well-known problems like the overlapping divergences problem, or the renormalon problem [31], due to the fact that one “subtracts too much”. We have seen in Section 5 that diagrams including the above factorization are naturally generated in a Wilsonian Renormalization Group framework: mathematically the key notion is that of clusters.

Fig. 12. A chain of clusters with two external lines. As said in Fig. 9, the graph elements with four external lines are drawn by not explicitly representing the two-body potential as a wavy line, but simply gluing together the two points $x$ and $y$.

The one explained above is the simplest source of problems in the expansion. A more serious one is the change of the exponent of the singularity (anomalous dimension), leading to logarithmic divergences; for instance one can think of the function $x^{-(1+\varepsilon)}$ and its expansion $x^{-1}\sum_{n=0}^{\infty}(-1)^n[\varepsilon\log x]^n/n!$: each addend has an $O(\log^n|x|)$ behaviour. Even more serious is the change of the nature of the singularity, for instance in the case in which there is a gap generation, so that the Fourier transform of the Schwinger function is not singular at all: this is what is believed to happen in superconductivity or in $d = 1$ when there is the formation of a charge or spin density wave. From such considerations the necessity of different expansions is clear; they will be described in the next section. From a mathematical point of view, it is remarkable that one is attempting to construct perturbatively, by a suitable expansion in the perturbative parameters, quantities which are not analytic in such parameters (so that a power expansion fails).

8. The Holstein–Hubbard model: a paradigmatic example

8.1. The model

To fix ideas we study a system of interacting fermions on a lattice subject to a quasi-periodic potential, following the analysis in [43].
In the physical literature such systems are studied in connection with the so-called quasi-crystals, see for instance [61,62]. Such a case contains all the relevant features (anomalous dimension, dynamical Bogolubov transformations, small divisors problem); we shall see that the results for all the models listed in Section 13 can be obtained through suitable changes and adaptations of the arguments we explain here in detail. The Hamiltonian of the Holstein–Hubbard model is given by

$$H = H_0 + uP + \lambda V + \nu N_0, \tag{8.1}$$
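where $H_0$ and $N_0$ are given by (2.1), $P$ by (2.3) and $V$ by (2.5), with $S = 0$ and with $\varphi(x)$ a periodic function with period incommensurate with the lattice step (which is assumed to be 1, see Section 2.1). As there is no dependence on the spin we can write simply $\psi^\pm_{x,\sigma}$ as $\psi^\pm_x$ in (8.1), so that the Hamiltonian becomes

$$H = \sum_{x\in\Lambda}\frac{1}{2}\left(-\psi^+_x\psi^-_{x+1} - \psi^+_x\psi^-_{x-1} + 2\psi^+_x\psi^-_x\right) - \mu_0\sum_{x\in\Lambda}\psi^+_x\psi^-_x + u\sum_{x\in\Lambda}\varphi(x)\,\psi^+_x\psi^-_x + \lambda\sum_{x,y\in\Lambda} v(x-y)\,\psi^+_x\psi^+_y\psi^-_y\psi^-_x + \nu\sum_{x\in\Lambda}\psi^+_x\psi^-_x. \tag{8.2}$$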
Let us fix $p_F = \bar m p$, with $\bar m \in \mathbb{N}$ and $p = \pi/T$, if $T$ is the period of the potential $\varphi$ (i.e. $\varphi(x+T) = \varphi(x)$ for any $x$; see (2.4)). Suppose also the function $\varphi$ to be analytic (in a strip around the real axis). In the following, we assume also the functions $\varphi$ in (2.4) and $v$ in (2.5) to be even in their arguments: this is not essential, but parity considerations simplify a few aspects of the following analysis.

By the definition of $p$ we can write $\varphi(x) = \bar\varphi(2px)$ where $\bar\varphi$ is a $2\pi$-periodic function and $p/\pi$ is an irrational number; moreover, the Fourier transform of $\bar\varphi$ is exponentially decreasing (i.e. $\bar\varphi$ is supposed to be analytic in a strip around the real axis). In order to perform a rigorous analysis one cannot assume that $p/\pi$ is a generic irrational number, but it has to belong to a class of numbers called Diophantine, characterized by the following arithmetic property: there exist two constants $C_0$ and $\tau$ such that, for any integers $k, n$,

$$|2np + 2k\pi| \ge C_0|n|^{-\tau} \qquad \forall (n,k)\in\mathbb{Z}^2\setminus\{(0,0)\}; \tag{8.3}$$

the Diophantine vectors $(p,\pi)$ are of full measure for $\tau > 1$ [63]. Note that we can write (8.3) as

$$\|2np\|_{\mathbb{T}} \ge C_0|n|^{-\tau} \qquad \forall n\in\mathbb{Z}\setminus\{0\}, \tag{8.4}$$

where $\|\cdot\|_{\mathbb{T}}$ denotes the distance from the nearest integer multiple of $2\pi$,
which is satisfied by a full measure set of $p$'s in the real axis. We can apply the iterative procedure seen in Section 5.1 by introducing the quasi-particle fields $\psi^{(h)\pm}_{\omega}$; after integrating the ultraviolet scale and denoting

$$\hat v(\mathbf{k}) = \delta_{k_0,0}\sum_{x\in\Lambda} v(x)\, e^{-ikx}, \qquad \hat\varphi_m = \sum_{x\in\Lambda}\varphi(x)\, e^{-2impx}, \tag{8.5}$$

where, by the analyticity assumption,

$$|\hat\varphi_m| \le F_0\, e^{-\xi|m|} \qquad \forall m\in\mathbb{Z} \tag{8.6}$$
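for suitable positive constants $F_0, \xi$, we obtain

$$\begin{aligned}
\mathcal{V}^{(0)}(\psi^{(\le 0)}) ={}& \lambda\,\frac{1}{(L\beta)^4}\sum_{\mathbf{k}_1,\dots,\mathbf{k}_4\in D_{L,\beta}} \psi^{(\le 0)+}_{\mathbf{k}_1}\psi^{(\le 0)-}_{\mathbf{k}_2}\psi^{(\le 0)+}_{\mathbf{k}_3}\psi^{(\le 0)-}_{\mathbf{k}_4}\; \hat v(\mathbf{k}_1-\mathbf{k}_2)\,\delta(\mathbf{k}_1+\mathbf{k}_3-\mathbf{k}_2-\mathbf{k}_4) \\
&+ \frac{1}{(L\beta)^4}\sum_{\mathbf{k}_1,\dots,\mathbf{k}_4\in D_{L,\beta}} \psi^{(\le 0)+}_{\mathbf{k}_1}\psi^{(\le 0)-}_{\mathbf{k}_2}\psi^{(\le 0)+}_{\mathbf{k}_3}\psi^{(\le 0)-}_{\mathbf{k}_4}\; W(\mathbf{k}_1,\dots,\mathbf{k}_4)\,\delta(\mathbf{k}_1+\mathbf{k}_3-\mathbf{k}_2-\mathbf{k}_4) \\
&+ \frac{1}{L\beta}\sum_{\mathbf{k}\in D_{L,\beta}} (\nu+F(\mathbf{k}))\,\psi^{(\le 0)+}_{\mathbf{k}}\psi^{(\le 0)-}_{\mathbf{k}} \\
&+ u\sum_{m=1}^{\infty}\hat\varphi_m\,\frac{1}{L\beta}\sum_{\mathbf{k}\in D_{L,\beta}} \left(\psi^{(\le 0)+}_{\mathbf{k}}\psi^{(\le 0)-}_{\mathbf{k}+2m\mathbf{p}} + \psi^{(\le 0)+}_{\mathbf{k}}\psi^{(\le 0)-}_{\mathbf{k}-2m\mathbf{p}}\right) \\
&+ \sum_{n=2}^{\infty}\sum_{m=1}^{\infty}\frac{1}{(L\beta)^n}\sum_{\mathbf{k}_1,\dots,\mathbf{k}_n\in D_{L,\beta}} \psi^{(\le 0)\sigma_1}_{\mathbf{k}_1}\cdots\psi^{(\le 0)\sigma_n}_{\mathbf{k}_n}\; \hat W^{(0)}_{n,m}(\mathbf{k}_1,\dots,\mathbf{k}_n)\,\delta\Big(\sum_{i=1}^{n}\sigma_i\mathbf{k}_i + 2m\mathbf{p}\Big),
\end{aligned} \tag{8.7}$$

where $\sigma_i = \pm$, $|F(\mathbf{k})| \le C|\lambda|$, $|W(\mathbf{k}_1,\dots,\mathbf{k}_4)| \le C|\lambda|^2$ and the kernels $\hat W^{(0)}_{n,m} = \hat W^{(0)}_{n,m}(\mathbf{k}_1,\dots,\mathbf{k}_n)$ satisfy the conditions:

1. $\hat W^{(0)}_{n,m} = \hat W^{(0)}_{n,-m}$, if $\varphi$ and $v$ are even functions;
2. $|\hat W^{(0)}_{n,m}| \le C^n\varepsilon^{\max(2,\,n/2-1)}$ if $\varepsilon = \max\{|\lambda|,|u|,|\nu|\}$;

moreover $\mathbf{p} = (p,0)$ and the delta-function $\delta(\mathbf{k}) = L\beta\,\delta_{k_0,0}\,\delta_{k,0}$ is defined modulo $2\pi$ in $k$.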
Such conditions are easily verified: it is enough to express $\mathcal{V}^{(0)}$ in terms of Feynman diagrams by using the rules given in Section 3 and to check that the parity properties of the interaction imply condition (1), while condition (2) follows from the fact that in order to have a cluster on scale $h = 0$ with $n$ external lines one needs at least $N \ge 2$ points such that $N \ge 2 + (n-3)/2$.

8.2. Effective potentials

We decompose the fields and their propagators as in Section 5.1. For each field $\psi^{(\le h)}_{\mathbf{k},\omega}$ we write

$$\mathbf{k} = \mathbf{k}' + \omega\mathbf{p}_F, \tag{8.8}$$

where $\mathbf{p}_F = (p_F,0)$, so that $\mathbf{k}' = (k',k_0)$ measures the distance from the Fermi surface (if the field is on scale $h$ then $|\mathbf{k}'| \approx \gamma^h$; see (5.2) for notations).
Then by integrating iteratively the fields as shown in Section 5.1, one obtains the effective potentials $\mathcal{V}^{(h)}$, which can be written as $\mathcal{V}^{(0)}$ in (8.7). More precisely, one can write

$$\mathcal{V}^{(h)}(\psi^{(\le h)}) = \sum_{n=1}^{\infty}\sum_{m=1}^{\infty}\frac{1}{(L\beta)^n}\sum_{\mathbf{k}_1,\dots,\mathbf{k}_n\in D_{L,\beta}} \psi^{(\le h)\sigma_1}_{\mathbf{k}_1}\cdots\psi^{(\le h)\sigma_n}_{\mathbf{k}_n}\; \hat W^{(h)}_{n,m}(\mathbf{k}_1,\dots,\mathbf{k}_n)\,\delta\Big(\sum_{i=1}^{n}\sigma_i\mathbf{k}_i + 2m\mathbf{p}\Big). \tag{8.9}$$

We shall call $\hat W^{(h)}_{n,m}(\mathbf{k}_1,\dots,\mathbf{k}_n)$ the value of the cluster with external lines $\psi^{(\le h)\sigma_1}_{\mathbf{k}_1},\dots,\psi^{(\le h)\sigma_n}_{\mathbf{k}_n}$. If $h$ is not the scale of the root then the cluster is a subcluster of a bigger cluster and some of its external lines $\ell$ can be contracted on scales $h_\ell < h$.

The power counting argument of the previous section tells us that we have to renormalize all the clusters with two and four external lines. More precisely, bound (6.13) and definition (5.37) show that we need at least a gain $\gamma^{2(h_{v'}-h_v)}$ when $|P_v| = 2$ and a gain $\gamma^{h_{v'}-h_v}$ when $|P_v| = 4$. However, in this case, there are infinitely many kinds of clusters with two and four external lines (depending on the value of $m$) and renormalizing all of them would clearly be a problem. So we shall try to improve the power counting: this is a typical phenomenon arising in many fermionic systems studied by RG methods. The idea in this case is the following one.
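Lemma 1. Assume that in

$$\sum_{n=1}^{\infty}\sum_{m=1}^{\infty}\frac{1}{(L\beta)^n}\sum_{\mathbf{k}_1,\dots,\mathbf{k}_n\in D_{L,\beta}} \psi^{(\le 0)\sigma_1}_{\mathbf{k}_1}\cdots\psi^{(\le 0)\sigma_n}_{\mathbf{k}_n}\; \hat W^{(0)}_{n,m}(\mathbf{k}_1,\dots,\mathbf{k}_n)\,\delta\Big(\sum_{i=1}^{n}\sigma_i\mathbf{k}_i + 2m\mathbf{p}\Big) \tag{8.10}$$

one has

$$\sum_{i=1}^{n}\sigma_i\omega_i p_F + 2mp \;\bmod 2\pi \ne 0. \tag{8.11}$$

Then contribution (8.9) to $\mathcal{V}^{(h)}$ is vanishing unless one has

$$|m| \ge C_1\frac{\gamma^{-h/\tau}}{n^{1/\tau}} - \bar m n, \tag{8.12}$$

if $C_1$ is a suitable positive constant.

Proof. Consider a contribution to $\mathcal{V}^{(h)}$ as in (8.9), arising from a cluster with $n$ external lines (on scale $\le h$): by the momentum conservation one has

$$\sum_{i=1}^{n}\sigma_i\mathbf{k}_i + 2m\mathbf{p} = 0, \tag{8.13}$$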
so that

$$\sum_{i=1}^{n}\sigma_i\mathbf{k}'_i = -\Big(\sum_{i=1}^{n}\sigma_i\omega_i\mathbf{p}_F + 2m\mathbf{p}\Big). \tag{8.14}$$

Using the compact support property of the propagators corresponding to the Grassmann fields $\psi^{(\le h)\sigma_i}_{\mathbf{k}'_i+\omega_i\mathbf{p}_F,\omega_i}$ (see (5.2) and (5.11)) and the Diophantine condition (8.4), we can bound

$$n a_0\gamma^h \ge \Big\|\sum_{i=1}^{n}\sigma_i k'_i\Big\|_{\mathbb{T}} \ge \Big\|\sum_{i=1}^{n}\sigma_i\omega_i p_F + 2mp\Big\|_{\mathbb{T}} \ge C_0\,(n\bar m + |m|)^{-\tau}, \tag{8.15}$$

from which (8.12) follows with $C_1 = (C_0/a_0)^{1/\tau}$.

Using a terminology coming from classical mechanics (introduced by Eliasson [64]), the clusters with two or four external lines for which the above condition (8.11) is not verified are called resonances or resonant clusters. Let us denote by $N_v$ the integer number such that, if $\mathbf{k}_i = \mathbf{k}'_i + \omega_i\mathbf{p}_F$ are the momenta of the $n^e_v$ lines entering or exiting the cluster $G_v$, one has

$$\sum_{i=1}^{n^e_v}\sigma_i\mathbf{k}_i = \sum_{i=1}^{n^e_v}\sigma_i(\mathbf{k}'_i+\omega_i\mathbf{p}_F) = 2N_v\mathbf{p}. \tag{8.16}$$

We can define inductively

$$N_v = \begin{cases} N_{v_1}+\cdots+N_{v_{s_v}} & \text{if } v\in V(\tau)\setminus V_f(\tau) \text{ and } w' = v\ \ \forall w\in\{v_1,\dots,v_{s_v}\}, \\ m_v & \text{if } v\in V_f(\tau). \end{cases} \tag{8.17}$$

The above equation (8.12) says that, up to the case of resonances, in order to have a cluster with scale $h_v$ one needs $N_v$ to be greater than $\gamma^{-h_v/\tau}$, a big number if $h_v$ is very negative, but it is clear that the larger $N_v$ the smaller the value associated with the cluster. This is obvious if the cluster contains only endpoints, as $|\hat\varphi_n| \le F_0\,e^{-\xi|n|}$. The general case will be discussed below.

8.3. Renormalization

The above lemma says that the clusters such that (8.11) is not satisfied, i.e.

$$\sum_{i=1}^{n}\sigma_i\omega_i p_F + 2mp \;\bmod 2\pi = 0, \tag{8.18}$$

are in some sense special, as $N_v$ can be small without limit and in such cases there is no power counting improvement exploiting the fact that $N_v$ has to be large and the exponential decay of the factors $|\hat\varphi_n|$. Note that (8.18) can be a source of problems only in a few particular cases, depending on $G_v$ and $N_v$, as only the clusters with two and four external lines have to be renormalized in order to improve the power counting.
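Both the Diophantine condition (8.4) and the threshold (8.12) of Lemma 1 are easy to explore numerically. The Python sketch below checks (8.4) over a finite range of $n$ for a sample value of $p$ (with $p/\pi$ equal to the golden mean conjugate, a choice made here only for illustration, together with $\tau = 1$ and $C_0 = 2$) and evaluates the lower bound on $|m|$ obtained by solving (8.15) for $|m|$. The function names and all numerical values are assumptions of this sketch, and a finite check can of course only falsify the condition, never prove it.

```python
import math


def torus_norm(x):
    """Distance of x from the nearest integer multiple of 2*pi."""
    r = abs(math.fmod(x, 2.0 * math.pi))
    return min(r, 2.0 * math.pi - r)


def diophantine_ok(p, c0, tau, n_max):
    """Check ||2 n p||_T >= c0 * n**(-tau) for n = 1, ..., n_max (cf. (8.4))."""
    return all(torus_norm(2.0 * n * p) >= c0 * n ** (-tau)
               for n in range(1, n_max + 1))


def min_abs_m(n, h, gamma, tau, c0, a0, m_bar):
    """Lower bound (8.12) on |m|, with C_1 = (c0 / a0)**(1 / tau)."""
    c1 = (c0 / a0) ** (1.0 / tau)
    return c1 * gamma ** (-h / tau) / n ** (1.0 / tau) - m_bar * n


if __name__ == "__main__":
    golden = (math.sqrt(5.0) - 1.0) / 2.0
    print(diophantine_ok(math.pi * golden, c0=2.0, tau=1.0, n_max=5000))
    print(diophantine_ok(math.pi / 3.0, c0=2.0, tau=1.0, n_max=5000))
    for h in (-2, -6, -10):
        print(h, min_abs_m(n=2, h=h, gamma=2.0, tau=1.0, c0=2.0, a0=1.0, m_bar=1))
```

For a rational $p/\pi$ the check fails at the first $n$ for which $2np$ is a multiple of $2\pi$, while for the sample Diophantine value the threshold on $|m|$ grows like $\gamma^{-h/\tau}$ as $h \to -\infty$, which is the mechanism exploited for the nonresonant clusters.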
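As we said we call such contributions resonances. In classical mechanics the resonances have only two external lines ([65]; see also [66]): if $\lambda = 0$ the model is technically similar to the perturbative series for invariant tori.

The renormalization operator $\mathcal{R} = 1 - \mathcal{L}$ is a linear operator defined in the following way. (The definitions below should be slightly modified for $L,\beta$ finite; anyway we prefer to ignore such a technical aspect in order not to overwhelm the notations; see [46] for a technically more satisfactory discussion.)

• If $n > 4$ then

$$\mathcal{L}\,\frac{1}{(L\beta)^n}\sum_{\mathbf{k}'_1,\dots,\mathbf{k}'_n\in D_{L,\beta}}\prod_{i=1}^{n}\psi^{(\le h)\sigma_i}_{\mathbf{k}'_i+\omega_i\mathbf{p}_F,\omega_i}\; \hat W^{(h)}_{n,m}(\mathbf{k}'_1+\omega_1\mathbf{p}_F,\dots,\mathbf{k}'_n+\omega_n\mathbf{p}_F)\,\delta\Big(\sum_{i=1}^{n}\sigma_i(\mathbf{k}'_i+\omega_i\mathbf{p}_F)+2m\mathbf{p}\Big) = 0. \tag{8.19}$$

• If $n = 4$ then

$$\begin{aligned}
&\mathcal{L}\,\frac{1}{(L\beta)^4}\sum_{\mathbf{k}'_1,\dots,\mathbf{k}'_4\in D_{L,\beta}}\prod_{i=1}^{4}\psi^{(\le h)\sigma_i}_{\mathbf{k}'_i+\omega_i\mathbf{p}_F,\omega_i}\; \hat W^{(h)}_{4,m}(\mathbf{k}'_1+\omega_1\mathbf{p}_F,\dots,\mathbf{k}'_4+\omega_4\mathbf{p}_F)\,\delta\Big(\sum_{i=1}^{4}\sigma_i(\mathbf{k}'_i+\omega_i\mathbf{p}_F)+2m\mathbf{p}\Big) \\
&\quad = \delta_{(\sigma_1\omega_1+\sigma_2\omega_2+\sigma_3\omega_3+\sigma_4\omega_4)p_F+2mp,\,0}\;\frac{1}{(L\beta)^4}\sum_{\mathbf{k}'_1,\dots,\mathbf{k}'_4\in D_{L,\beta}}\prod_{i=1}^{4}\psi^{(\le h)\sigma_i}_{\mathbf{k}'_i+\omega_i\mathbf{p}_F,\omega_i}\; \hat W^{(h)}_{4,m}(\omega_1\mathbf{p}_F,\dots,\omega_4\mathbf{p}_F)\,\delta\Big(\sum_{i=1}^{4}\sigma_i\mathbf{k}'_i\Big).
\end{aligned} \tag{8.20}$$

• If $n = 2$ then

$$\begin{aligned}
&\mathcal{L}\,\frac{1}{(L\beta)^2}\sum_{\mathbf{k}'_1,\mathbf{k}'_2\in D_{L,\beta}}\prod_{i=1}^{2}\psi^{(\le h)\sigma_i}_{\mathbf{k}'_i+\omega_i\mathbf{p}_F,\omega_i}\; \hat W^{(h)}_{2,m}(\mathbf{k}'_1+\omega_1\mathbf{p}_F,\mathbf{k}'_2+\omega_2\mathbf{p}_F)\,\delta\Big(\sum_{i=1}^{2}\sigma_i(\mathbf{k}'_i+\omega_i\mathbf{p}_F)+2m\mathbf{p}\Big) \\
&\quad = \delta_{(\omega_1-\omega_2)p_F,\,0}\;\frac{1}{L\beta}\sum_{\mathbf{k}'_1,\mathbf{k}'_2\in D_{L,\beta}}\prod_{i=1}^{2}\psi^{(\le h)\sigma_i}_{\mathbf{k}'_i+\omega_i\mathbf{p}_F,\omega_i}\,\Big[\hat W^{(h)}_{2,m}(\omega_1\mathbf{p}_F,\omega_2\mathbf{p}_F) + \omega_1E(k'+\omega_1p_F)\,\partial_k\hat W^{(h)}_{2,m}(\omega_1\mathbf{p}_F,\omega_2\mathbf{p}_F) + k_0\,\partial_{k_0}\hat W^{(h)}_{2,m}(\omega_1\mathbf{p}_F,\omega_2\mathbf{p}_F)\Big] \\
&\qquad + \delta_{(\omega_1+\omega_2)p_F,\,0}\;\frac{1}{L\beta}\sum_{\mathbf{k}'_1,\mathbf{k}'_2\in D_{L,\beta}}\prod_{i=1}^{2}\psi^{(\le h)\sigma_i}_{\mathbf{k}'_i+\omega_i\mathbf{p}_F,\omega_i}\; \hat W^{(h)}_{2,m}(\omega_1\mathbf{p}_F,\omega_2\mathbf{p}_F),
\end{aligned} \tag{8.21}$$

where

$$E(k'+\omega p_F) = \cos p_F - \cos k = v_0\,\omega\sin k' + (1-\cos k')\cos p_F, \qquad v_0 = \sin p_F, \tag{8.22}$$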
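the delta function is always defined modulo $2\pi$ in $k$ and the symbols $\partial_k$, $\partial_{k_0}$ denote discrete derivatives, see Appendix A.2. Note that the action of the localization operator is nontrivial (i.e. different from zero) only for the resonant clusters, i.e. for the clusters with two or four external lines such that

$$\sum_{i=1}^{n}\sigma_i\omega_i p_F + 2mp = -\sum_{i=1}^{n}\sigma_i k'_i = 0 \;\bmod 2\pi, \qquad n = 2,4. \tag{8.23}$$

By setting

$$\mathcal{L}\mathcal{V}^{(h)}(\psi^{(\le h)}) = \sum_{n=2}^{\infty}\sum_{m=1}^{\infty}\frac{1}{(L\beta)^n}\sum_{\mathbf{k}_1,\dots,\mathbf{k}_n\in D_{L,\beta}} \psi^{(\le h)\sigma_1}_{\mathbf{k}_1}\cdots\psi^{(\le h)\sigma_n}_{\mathbf{k}_n}\; \mathcal{L}\hat W^{(h)}_{n,m}(\mathbf{k}_1,\dots,\mathbf{k}_n)\,\delta\Big(\sum_{i=1}^{n}\sigma_i\mathbf{k}_i+2m\mathbf{p}\Big), \tag{8.24}$$

we can write (8.19)–(8.21) as

$$\begin{aligned}
\mathcal{L}\hat W^{(h)}_{2,m}(\mathbf{k}'_1+\omega_1\mathbf{p}_F,\mathbf{k}'_2+\omega_2\mathbf{p}_F) ={}& \delta_{(\omega_1-\omega_2)p_F,\,0}\,\Big[\hat W^{(h)}_{2,m}(\omega_1\mathbf{p}_F,\omega_2\mathbf{p}_F) + \omega_1E(k'+\omega_1p_F)\,\partial_k\hat W^{(h)}_{2,m}(\omega_1\mathbf{p}_F,\omega_2\mathbf{p}_F) + k_0\,\partial_{k_0}\hat W^{(h)}_{2,m}(\omega_1\mathbf{p}_F,\omega_2\mathbf{p}_F)\Big] \\
&+ \delta_{(\omega_1+\omega_2)p_F,\,0}\,\hat W^{(h)}_{2,m}(\omega_1\mathbf{p}_F,\omega_2\mathbf{p}_F), \\
\mathcal{L}\hat W^{(h)}_{4,m}(\mathbf{k}'_1+\omega_1\mathbf{p}_F,\dots,\mathbf{k}'_4+\omega_4\mathbf{p}_F) ={}& \delta_{(\sigma_1\omega_1+\sigma_2\omega_2+\sigma_3\omega_3+\sigma_4\omega_4)p_F+2mp,\,0}\;\hat W^{(h)}_{4,m}(\omega_1\mathbf{p}_F,\dots,\omega_4\mathbf{p}_F), \\
\mathcal{L}\hat W^{(h)}_{n,m}(\mathbf{k}'_1+\omega_1\mathbf{p}_F,\dots,\mathbf{k}'_n+\omega_n\mathbf{p}_F) ={}& 0, \qquad n \ge 6.
\end{aligned} \tag{8.25}$$

Note that the r.h.s. of (8.20) and (8.21) are vanishing unless (8.18) is verified. The localization operator $\mathcal{L}$ is aimed to characterize the resonances, i.e. the terms such that $\sum_{i=1}^{n}\sigma_i\mathbf{k}'_i = 0$, with $n = 2,4$ (see (8.23)). One can wonder why, for $n = 2$, we localize the term with $\omega_1 = \omega_2$ at the second order while for the term with $\omega_1 = -\omega_2$ only a first-order localization is performed: the reason is that the marginal (according to a naive power counting) terms of the form $k_0\,\psi^+_{\mathbf{k},\omega}\psi^-_{\mathbf{k},-\omega}$ are indeed irrelevant; as we shall see such terms contain a factor $\sigma_h\gamma^{-h}$ and this will improve the power counting, see below.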
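The second equality in (8.22) is the addition formula for the cosine, applied with $k = k' + \omega p_F$ and $\omega = \pm 1$. The short Python sketch below checks it numerically and also shows that, for small $k'$, the function $E(k'+\omega p_F)$ is close to the linear dispersion $v_0\,\omega k'$. The function names and the sample values are illustrative assumptions.

```python
import math


def e_direct(k_prime, omega, p_f):
    """cos p_F - cos k, with k = k' + omega * p_F."""
    return math.cos(p_f) - math.cos(k_prime + omega * p_f)


def e_expanded(k_prime, omega, p_f):
    """v_0 * omega * sin k' + (1 - cos k') * cos p_F, with v_0 = sin p_F."""
    v0 = math.sin(p_f)
    return v0 * omega * math.sin(k_prime) + (1.0 - math.cos(k_prime)) * math.cos(p_f)


if __name__ == "__main__":
    p_f = 1.1
    for omega in (1, -1):
        for k_prime in (0.5, 0.05, 0.005):
            linear = math.sin(p_f) * omega * k_prime
            print(omega, k_prime,
                  e_direct(k_prime, omega, p_f),
                  e_expanded(k_prime, omega, p_f),
                  linear)
```

The correction to the linear term is of order $k'^2$, which is consistent with treating $E(k'+\omega p_F)$ as a quantity of size $\gamma^h$ on scale $h$.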
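We can then write $\mathcal{L}\mathcal{V}^{(h)}$ in the following more compact way:

$$\mathcal{L}\mathcal{V}^{(h)}(\psi^{(\le h)}) = \gamma^h n_h F^{(\le h)}_\nu + \gamma^h s_h F^{(\le h)}_\sigma + z_h F^{(\le h)}_\zeta + a_h F^{(\le h)}_\alpha + l_h F^{(\le h)}_\lambda, \tag{8.26}$$

where

$$\begin{aligned}
F^{(\le h)}_\nu &= \sum_{\omega=\pm1}\frac{1}{L\beta}\sum_{\mathbf{k}'\in D_{L,\beta}} \psi^{(\le h)+}_{\mathbf{k}'+\omega\mathbf{p}_F,\omega}\,\psi^{(\le h)-}_{\mathbf{k}'+\omega\mathbf{p}_F,\omega}, \\
F^{(\le h)}_\sigma &= \sum_{\omega=\pm1}\frac{1}{L\beta}\sum_{\mathbf{k}'\in D_{L,\beta}} \psi^{(\le h)+}_{\mathbf{k}'+\omega\mathbf{p}_F,\omega}\,\psi^{(\le h)-}_{\mathbf{k}'-\omega\mathbf{p}_F,-\omega}, \\
F^{(\le h)}_\alpha &= \sum_{\omega=\pm1}\frac{1}{L\beta}\sum_{\mathbf{k}'\in D_{L,\beta}} E(k'+\omega p_F)\,\psi^{(\le h)+}_{\mathbf{k}'+\omega\mathbf{p}_F,\omega}\,\psi^{(\le h)-}_{\mathbf{k}'+\omega\mathbf{p}_F,\omega}, \\
F^{(\le h)}_\zeta &= \sum_{\omega=\pm1}\frac{1}{L\beta}\sum_{\mathbf{k}'\in D_{L,\beta}} (-ik_0)\,\psi^{(\le h)+}_{\mathbf{k}'+\omega\mathbf{p}_F,\omega}\,\psi^{(\le h)-}_{\mathbf{k}'+\omega\mathbf{p}_F,\omega}, \\
F^{(\le h)}_\lambda &= \frac{1}{(L\beta)^4}\sum_{\mathbf{k}'_1,\dots,\mathbf{k}'_4\in D_{L,\beta}} \psi^{(\le h)+}_{\mathbf{k}'_1+\mathbf{p}_F,1}\,\psi^{(\le h)+}_{\mathbf{k}'_2-\mathbf{p}_F,-1}\,\psi^{(\le h)-}_{\mathbf{k}'_3+\mathbf{p}_F,1}\,\psi^{(\le h)-}_{\mathbf{k}'_4-\mathbf{p}_F,-1}\;\delta\Big(\sum_{i=1}^{4}\sigma_i\mathbf{k}'_i\Big)
\end{aligned} \tag{8.27}$$

and, for $h = 0$,

$$n_0 = \nu + O(\lambda^2), \quad s_0 = u\hat\varphi_{\bar m} + O(u\lambda^2), \quad a_0 = O(\lambda^2), \quad z_0 = O(\lambda^2), \quad l_0 = \lambda(\hat v(0) - \hat v(2p_F)) + O(\lambda^2). \tag{8.28}$$

We call $n_h, s_h, a_h, z_h, l_h$ the running coupling constants. As a matter of fact, we shall see that the renormalization performed until now will not be enough in order to solve all problems, so that we shall be forced to introduce other running coupling constants and modify the ones defined in (8.23). So the “final” running coupling constants will not be exactly the ones defined so far: this is the reason why we denote them by Latin characters, while the final ones will be denoted by Greek characters; see Section 8.44 below.

Let us recall that $V_f(\tau)$ denotes the vertices of $\tau$ which are endpoints (see Section 5.1). We can define $V_f^*(\tau) \subset V_f(\tau)$ as the subset of endpoints with which no running coupling constants are associated. Such endpoints will be all endpoints on scale $h = 2$ associated with the (nonlocalized) contributions to $uP$, i.e. $v \in V_f^*(\tau)$ if $h_v = 2$ and $N_v = m_v \ne 0$.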
Fig. 13. Splitting of the effective potential $\mathcal{V}^{(0)}$ as sum of two contributions: the renormalized part $\mathcal{R}\mathcal{V}^{(0)}$ and the localized part $\mathcal{L}\mathcal{V}^{(0)}$.
8.4. Renormalized trees

The iterative integration is done then in the following way:

$$\begin{aligned}
\int P(d\psi)\, e^{\mathcal{V}(\psi)} &= \int P(d\psi^{(\mathrm{i.r.})})\int P(d\psi^{(\mathrm{u.v.})})\, e^{\mathcal{V}(\psi^{(\mathrm{u.v.})}+\psi^{(\mathrm{i.r.})})} = \int P(d\psi^{(\mathrm{i.r.})})\, e^{\mathcal{V}^{(0)}(\psi^{(\mathrm{i.r.})})} \\
&= \int P(d\psi^{(<0)})\int P(d\psi^{(0)})\, e^{\mathcal{L}\mathcal{V}^{(0)}(\psi^{(\le 0)})+\mathcal{R}\mathcal{V}^{(0)}(\psi^{(\le 0)})} \\
&= \int P(d\psi^{(<-1)})\int P(d\psi^{(-1)})\, e^{\mathcal{L}\mathcal{V}^{(-1)}(\psi^{(\le -1)})+\mathcal{R}\mathcal{V}^{(-1)}(\psi^{(\le -1)})}
\end{aligned} \tag{8.29}$$

and so on. Of course, one can represent these operations in terms of a new kind of trees, which will be called renormalized trees, and which can be obtained in the following way. One writes $\mathcal{V}^{(0)}$ as in Fig. 13: there can be an endpoint on scale $h < 2$, representing contributions arising from $\mathcal{L}\mathcal{V}^{(1)}$. Then one writes $\mathcal{V}^{(-1)}$ as in Fig. 6, by using the representation in Fig. 13 for $\mathcal{V}^{(0)}$, so obtaining the expansion given in Fig. 14 for $\mathcal{L}\mathcal{V}^{(-1)}$. The same expansion holds also for $\mathcal{R}\mathcal{V}^{(-1)}$, the only difference being that an $\mathcal{R}$ operator is associated also to the first node (compare $\mathcal{L}\mathcal{V}^{(0)}$ and $\mathcal{R}\mathcal{V}^{(0)}$ in Fig. 13).

In conclusion the renormalized trees are given by the same trees as in the previous sections, with the following differences. See Fig. 15.

(1) With each vertex $v \notin V_f(\tau)$ an $\mathcal{R}$ operation is associated, up to the first vertex $v_0$ which can have associated either an $\mathcal{R}$ operation or an $\mathcal{L}$ operation.
Fig. 14. Graphic representation of the localized effective potential $\mathcal{V}^{(-1)}$.
Fig. 15. A renormalized tree appearing in the graphic representation of RV(h) or LV(h) .
(2) There are endpoints $v$ with scale $h_v$ (before each endpoint was at scale $h_v = 2$). If $h_v < 2$ with the endpoint $v$ a contribution $\mathcal{L}\mathcal{V}^{(h)}$ is associated, while if $h_v = 2$ either a contribution $\mathcal{L}\mathcal{V}^{(0)}$ or a contribution $\mathcal{R}\mathcal{V}^{(0)}$ is associated with $v$. If $v$ is an endpoint and $h_v \le -1$ then $h_v = h_{v'}+1$ if $v'$ is the nontrivial vertex immediately preceding $v$. The running coupling constants corresponding to the endpoint $v$ will be denoted by $r_v$: one has $r_v = \nu_h$ if $h = h_v$ and the contribution $F^{(\le h)}_\nu$ to $\mathcal{L}\mathcal{V}^{(h)}(\psi)$ is considered, and so on.

Of course, we can write a Feynman diagram expansion, in which each cluster value is written as $W^{(h)} = (1-\mathcal{L})W^{(h)} + \mathcal{L}W^{(h)}$ (see (8.23)). We shall see that the bound for $(1-\mathcal{L})W^{(h)}$ has an extra factor $\gamma^{z_v(h_{v'}-h_v)}$ for each $v \in V(\tau)$, with respect to the bound for $W^{(h)}$, for a suitable integer $z_v$. It will turn out to be $z_v = 1, 2$ for the clusters on which $\mathcal{R}$ acts; such a factor is just what we need in order to perform the sum over the trees, as it converts the exponent in (5.34) from $n^e_v/2 + m_{2,v} - 2$ to $n^e_v/2 + m_{2,v} - 2 + z_v$. Therefore, by taking into account the analysis performed in Section 5.3 and the value of $z_v$, the factor $n^e_v/2 + m_{2,v} - 2 + z_v$ becomes positive.

In order to understand how the gain factor $\gamma^{z_v(h_{v'}-h_v)}$ arises, we can consider explicitly an example. Consider a resonant cluster with two external fields: if $\mathbf{k}_1$ and $\mathbf{k}_2$ are the momenta associated to the external lines of the cluster, one has $\mathbf{k}_1 = \mathbf{k}_2 = \mathbf{k}'+\omega\mathbf{p}_F$, so that we can set $W^{(h)}_{2,0}(\mathbf{k}_1,\mathbf{k}_2) = M^{(h)}(\mathbf{k}')$. We know from the previous analysis that for such a cluster a second-order renormalization is required if $\omega_1 = \omega_2$, while a first-order renormalization is enough if $\omega_1 = -\omega_2$: this should produce a gain factor $\gamma^{z(h_{v'}-h_v)}$ where $z = 1, 2$, respectively. For simplicity, we explicitly consider now the case of clusters with only two external lines with $\omega_1 = -\omega_2$, so a first-order renormalization is enough in order to obtain a “first-order gain” $\gamma^{h_{v'}-h_v}$. As far as the following heuristic discussion is concerned, we suppose that all the involved clusters on which the action of the renormalization operator is nontrivial are clusters with two external lines and with $\omega_1 = -\omega_2$, i.e. such that

$$\begin{aligned}
&\mathcal{L}\,\frac{1}{(L\beta)^2}\sum_{\mathbf{k}'_1,\mathbf{k}'_2\in D_{L,\beta}}\prod_{i=1}^{2}\psi^{(\le h)\sigma_i}_{\mathbf{k}'_i+\omega_i\mathbf{p}_F,\omega_i}\; \hat W^{(h)}_{2,m}(\mathbf{k}'_1+\omega_1\mathbf{p}_F,\mathbf{k}'_2+\omega_2\mathbf{p}_F)\,\delta\Big(\sum_{i=1}^{2}\sigma_i(\mathbf{k}'_i+\omega_i\mathbf{p}_F)+2m\mathbf{p}\Big) \\
&\quad = \delta_{(\omega_1+\omega_2)p_F,\,0}\;\frac{1}{L\beta}\sum_{\mathbf{k}'_1,\mathbf{k}'_2\in D_{L,\beta}}\prod_{i=1}^{2}\psi^{(\le h)\sigma_i}_{\mathbf{k}'_i+\omega_i\mathbf{p}_F,\omega_i}\; \hat W^{(h)}_{2,m}(\omega_1\mathbf{p}_F,\omega_2\mathbf{p}_F).
\end{aligned} \tag{8.30}$$

Then, as the argument is simply dimensional, one can easily convince oneself that, when needed, a second-order renormalization produces a “second-order gain”. Of course, as we said, the clusters with two external lines and with $\omega_1 = \omega_2$ have to be renormalized twice according to the prescription given in the previous section. Anyway the following discussion can be performed for a second-order renormalization without any relevant change except from a notational point of view, so that, in order not to make the analysis uselessly cumbersome, we presume to deal only with a first-order renormalization.

For the first-order renormalization of the resonant cluster we are considering, write

$$M^{(h)}(\mathbf{k}') = M^{(h)}(0) + \mathbf{k}'\cdot\int_0^1 dt\; \partial_{\mathbf{k}'}M^{(h)}(t\mathbf{k}') \tag{8.31}$$

with $\partial_{\mathbf{k}'} = (\partial_{k'},\partial_{k_0})$. By (8.30) the first term in the right-hand side of (8.31) would take into account the localized contribution to the effective potential, while the second term would represent the renormalized contribution. Recall that $\hat W^{(h)}$ is the integral of a product of propagators $g_\ell$ with scales $\ge h_v$; the derivative in (8.31) produces an extra dimensional factor $\gamma^{-h_v}$, while the “zero” $\mathbf{k}'$ produces an extra factor $\gamma^{h_{v'}}$ (by the compact support of the propagator).
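The interpolation formula (8.31) is the first-order Taylor formula with integral remainder, and the way the remainder inherits a factor $|\mathbf{k}'|$ can be seen numerically. In the Python sketch below a smooth sample function of $\mathbf{k}' = (k',k_0)$ stands in for $M^{(h)}$ (it is an arbitrary choice made for illustration, not a cluster value from the text), and the integral over $t$ is computed by the midpoint rule. All names are hypothetical.

```python
import math


def sample_m(k):
    """Smooth stand-in for M(k'), with k = (k', k0)."""
    kp, k0 = k
    return math.exp(-(kp * kp + k0 * k0)) + 0.5 * math.sin(kp)


def sample_grad(k):
    """Gradient of sample_m with respect to (k', k0)."""
    kp, k0 = k
    g = math.exp(-(kp * kp + k0 * k0))
    return (-2.0 * kp * g + 0.5 * math.cos(kp), -2.0 * k0 * g)


def remainder(k, steps=2000):
    """k . int_0^1 dt grad M(t k), by the midpoint rule."""
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) / steps
        gx, g0 = sample_grad((t * k[0], t * k[1]))
        total += k[0] * gx + k[1] * g0
    return total / steps


def interpolated(k):
    """Right-hand side of (8.31): localized part plus renormalized part."""
    return sample_m((0.0, 0.0)) + remainder(k)


if __name__ == "__main__":
    for k in [(0.3, -0.2), (0.03, -0.02), (0.003, -0.002)]:
        print(k, sample_m(k), interpolated(k), remainder(k))
```

Shrinking $\mathbf{k}'$ by a factor of ten shrinks the remainder by roughly the same factor, which is the elementary mechanism behind the gain: the external momentum is of order $\gamma^{h_{v'}}$ while each derivative of the kernel costs $\gamma^{-h_v}$.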
There is a technical point that should be stressed. Of course, it is possible that there are many clusters inside each other to be renormalized. Suppose that $G_{v_1},\dots,G_{v_m}$ are clusters to be renormalized, with $v_1 \prec v_2 \prec \dots \prec v_m$: so $G_{v_m} \subset \dots \subset G_{v_1}$. Start by renormalizing $G_{v_1}$, i.e. the most external one: then a derivative can be applied on all the propagators corresponding to the lines inside $G_{v_1}$. In particular, it can be applied to the propagator of a line inside $G_{v_m}$. Next, we renormalize $G_{v_2}$; again the derivative can be applied on all the propagators corresponding to the lines inside $G_{v_2}$, and so on. After $m$ renormalization steps all the clusters $G_{v_1},\dots,G_{v_m}$ have been renormalized, but among all the contributions which have been obtained, terms like $\partial^m_k g_\ell$, with $\ell \in G_{v_m}$, have also been obtained. This, in addition to the right dimensional factor, contributes to the bound with a factor $O(m!^{\alpha})$, $\alpha \ge 1$ (one would have $\alpha = 1$ if the support function was analytic, and it is even worse for the choice made in Section 5.1). Therefore, the graph value in general can only be bounded by $O(n!^{\alpha})$.

There are several ways to solve this problem. In the case of an exponential (analytic) cut-off function [38] or a Gevrey class function (nonanalytic and with compact support, but with Fourier transform bounded by $e^{-(\kappa n)^{1/s}}$ for a suitable constant $\kappa$) [31], one can still bound these extra $O(m!^{\alpha})$ terms; see for instance [38,47]. Another way to see that there is no problem is to show simply that all the propagators are at most derived twice (see [18]), essentially by exploiting the (simple) idea that once a gain has been obtained corresponding to some resonance there is no further need to renormalize it (a fact already used in [37]). To be more precise, the argument is the following. Note that now no assumptions on the cut-off function are necessary, except the smoothness one (and in fact one can weaken also such a hypothesis).

The argument is very simple if we consider a first-order renormalization, as the one we are discussing here, and it can be trivially extended. Consider the cluster $G_{v_1}$ and assume that the derivative is applied just on a propagator inside $G_{v_n}$ and outside $G_{v_{n+1}}$, for some $1 < n < m$; in this way we get a factor $\gamma^{(h_{v'_1}-h_{v_n})}$, which we can rewrite as

$$\gamma^{(h_{v'_1}-h_{v_n})} = \gamma^{(h_{v'_1}-h_{v_1})}\,\gamma^{(h_{v'_2}-h_{v_2})}\cdots\gamma^{(h_{v'_n}-h_{v_n})}, \tag{8.32}$$

so that each cluster has the factor $\gamma^{h_{v'}-h_v}$ and there is no need of any other renormalization. In fact, all the other renormalizations are vanishing as

$$\partial_{\mathbf{k}}\mathcal{R}M^{(h)}(\mathbf{k}) = \partial_{\mathbf{k}}M^{(h)}(\mathbf{k}), \tag{8.33}$$

which means that there are no renormalizations acting on the clusters included between $G_{v_1}$ and $G_{v_n}$.

The above analysis is performed in Fourier space and skips the problem of implementing the Gram–Hadamard inequality in order to control the number of terms arising from the perturbative expansion. On the other hand, as we shall see better in Appendix A.3, the Gram–Hadamard inequality is applied in the coordinate space. The renormalization procedure gives rise to factors $\mathbf{k}$ (see (8.31)) which, in the coordinate space, correspond to derivatives of fields, hence to derivatives of propagators once such lines are contracted. This creates a series of intricacies and technical problems, for the discussion and solution of which we refer to the original papers: see [38,43,46].
G. Gentile, V. Mastropietro / Physics Reports 352 (2001) 273–437
8.5. Renormalized bounds

Proceeding as in Section 6 we get for the renormalized expansion
\[
\int d\mathbf{x}(P_{v_0})\,|W^{(h)}(\tau,P_{v_0},\mathbf{x}(P_{v_0}))| \le C^n \gamma^{-h[D(P_{v_0})+z_{v_0}(N_{v_0},P_{v_0})]} \prod_{v\notin V_f(\tau)} \gamma^{-[D(P_v)+z_v(N_v,P_v)](h_v-h_{v'})} \Big[\prod_{v\in V_f(\tau)} \gamma^{-h_v m_{2,v}}\Big] \Big[\prod_{v\in V_f^*(\tau)} |\hat\varphi_{m_v}|\Big] \Big[\prod_{v\in V_f(\tau)\setminus V_f^*(\tau)} |r_v|\Big] , \tag{8.34}
\]
where $N_v$ is the integer such that (8.16) holds (but the second relation in (8.17) has to be replaced with $N_v \le N_{v_1}+\cdots+N_{v_{s_v}} + 2\times$(number of lines with nondiagonal propagator)), $V_f^*(\tau)$ is the set of endpoints such that no running coupling constant is associated with them (see the end of Section 8.3) and $m_{2,v}$ is defined in (5.36), while
(1) $z_v(N_v,P_v) = 1$ if $G_v$ has four external lines ($|P_v| = 4$) and it is a resonance, i.e. $\sum_{i=1}^4 \sigma_i\omega_i p_F + 2N_v p = 0$,
(2) $z_v(N_v,P_v) = 2$ if $G_v$ has two external lines ($|P_v| = 2$) and it is a resonance, i.e. $(\omega_1-\omega_2)p_F + 2N_v p = 0$, such that $\omega_1 = \omega_2$,
(3) $z_v(N_v,P_v) = 1$ if $G_v$ has two external lines ($|P_v| = 2$) and it is a resonance, i.e. $(\omega_1-\omega_2)p_F + 2N_v p = 0$, such that $\omega_1 = -\omega_2$.
Note that now the endpoints $v$ can have also a scale $h_v < 2$, so that we cannot set $\gamma^{-h_v} = \gamma^{-1}$ in the last line of (8.34). Bound (8.34) is obtained by using the Gram–Hadamard inequality as in Section 6: the presence of the renormalization makes the construction a little involved, as derived fields also have to be considered in the space-time coordinates (in which the inequality can be applied).

However, bound (8.34) obtained for the renormalized expansion is not yet sufficient for proving nondiverging bounds when the sum over trees is performed, for a number of reasons.
(1) The factor $D(P_v)+z_v(N_v,P_v)$ can still be equal to $-1$ or $0$, in correspondence of nonresonant clusters with two and four external lines; we have to extract from
\[
\prod_{v\in V_f^*(\tau)} |\hat\varphi_{m_v}| \tag{8.35}
\]
some good factor by using the lemma in Section 8.2.
(2) Also for resonances with two external lines such that $\omega_1 = -\omega_2$, by definition of $\mathcal{R}$, one can have $D(P_v)+z_v(N_v,P_v) = 0$, as $z_v(N_v,P_v) = 1$ in such a case.
(3) There are two relevant running coupling constants, namely $\gamma^h n_h$ and $\gamma^h s_h$. We deduce from the above discussion that it is necessary to put a factor $\gamma^h$ in front of them (i.e. to assume that they are decreasing at least as $\gamma^h$) to have a renormalizable power counting. In fact each endpoint $v$ with $m_{2,v} = 2$ carries a factor $\gamma^{-h_v}$, which we choose to delete by putting a factor $\gamma^{h_v}$ in front of the corresponding running coupling constants themselves; of course, such an operation is meaningful only after one can prove that $n_h$ and $s_h$ remain bounded. While there is a counterterm $\nu$ in Hamiltonian (8.1) which can be fixed (hopefully) in order that this can be really done, this is not the case for $s_h$. We shall see in the next section that, while $n_h$ is related to the shift of the singularity of the interacting two-point Schwinger function, $s_h$ is due to the effect of the opening of a gap in the spectrum; because of such a term the propagator becomes essentially "of the form" $k f_h/(k^2+\sigma_h^2)$, for some constant $\sigma_h$, so that its expansion in terms of $\sigma_h$ gives an expression "of the form"
\[
\frac{1}{k}\sum_{n=0}^{\infty}\Big(-\frac{\sigma_h^2}{k^2}\Big)^n , \tag{8.36}
\]
which would be convergent only if $\sigma_h \sim \gamma^h s_h$, with $s_h$ bounded, since $k \sim \gamma^h$. It is clear that by a Bogolubov transformation (see [32]) we can put the gap term in the fermionic integration: however, see below, as the true gap is not of order $O(u)$, but of order $O(u^{1-\cdots})$, many Bogolubov transformations are necessary, one for each scale, as the gap itself has a nontrivial flow.
(4) Finally, $z_h$ and $a_h$ are not bounded uniformly in $h$. In fact one can write the flow to second order as
\[
l_{h-1} = l_h , \qquad a_{h-1} = a_h + \beta_1\lambda_h^2 , \qquad z_{h-1} = z_h + \beta_1\lambda_h^2 , \tag{8.37}
\]
where $\beta_1$ is a constant, so that one obtains $a_h = z_h = O(\lambda^2 h)$. We shall see that the arising logarithmic divergence is due to the "infinite" wave function renormalization: if $u = 0$, the large distance behaviour of the two-point Schwinger function is not $|x|^{-1}$, but $|x|^{-(1+\cdots)}$, with a correction of order $\lambda^2$ in the exponent; see Sections 10, 12 and 13.
The above considerations show the necessity of an anomalous expansion.

8.6. Anomalous integration

The integration is performed iteratively: at each step $h \le 0$ the Grassmann integration measure is changed by using the results in Section 2.1 and the fields are rescaled by a suitable factor. The change of the integration measure can be interpreted as a shift of some terms contributing to the effective potential into the integration measure. In practice, one proceeds by introducing a sequence of constants $Z_h$, with $h \le 0$ and $Z_0 = 1$, in the following way. Define
\[
C_h(\mathbf{k}')^{-1} = \sum_{j=h_\beta}^{h} f_j(\mathbf{k}') , \tag{8.38}
\]
where $h_\beta$ is given by (5.12).
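Once the fields $\psi^{(0)},\ldots,\psi^{(h+1)}$ have been integrated, we have
\[
\int P_{Z_h}(d\psi^{(\le h)})\, e^{-V^{(h)}(\sqrt{Z_h}\,\psi^{(\le h)})} , \tag{8.39}
\]
where, up to a constant,
\[
P_{Z_h}(d\psi^{(\le h)}) = \prod_{\mathbf{k}'\in D_{L,\beta}}\prod_{\omega=\pm1} d\psi^{(\le h)+}_{\mathbf{k}'+\omega\mathbf{p}_F,\omega}\, d\psi^{(\le h)-}_{\mathbf{k}'+\omega\mathbf{p}_F,\omega}
\exp\Big\{ -\frac{1}{L\beta} \sum_{\mathbf{k}'\in D_{L,\beta}}\sum_{\omega=\pm1} C_h(\mathbf{k}') Z_h \big[ (-ik_0 + (1-\cos k')\cos p_F + \omega v_0 \sin k')\, \psi^{(\le h)+}_{\mathbf{k}'+\omega\mathbf{p}_F,\omega}\psi^{(\le h)-}_{\mathbf{k}'+\omega\mathbf{p}_F,\omega} + \sigma_h(\mathbf{k}')\, \psi^{(\le h)+}_{\mathbf{k}'+\omega\mathbf{p}_F,\omega}\psi^{(\le h)-}_{\mathbf{k}'-\omega\mathbf{p}_F,-\omega} \big] \Big\} , \tag{8.40}
\]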
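where $\sigma_h(\mathbf{k}')$ is defined iteratively (see (8.43) below). As before, it is convenient to split $V^{(h)}$ as a sum of two summands $\mathcal{L}V^{(h)} + \mathcal{R}V^{(h)}$, with $\mathcal{R} = 1 - \mathcal{L}$, where $\mathcal{L}$, the localization operator, is the operator defined in the previous section. We write, if $N_h$ is a constant,
\[
\int P_{Z_h}(d\psi^{(\le h)})\, e^{-V^{(h)}(\sqrt{Z_h}\,\psi^{(\le h)})} = \frac{1}{N_h} \int \tilde P_{Z_{h-1}}(d\psi^{(\le h)})\, e^{-\tilde V^{(h)}(\sqrt{Z_h}\,\psi^{(\le h)})} , \tag{8.41}
\]
where
\[
\tilde P_{Z_{h-1}}(d\psi^{(\le h)}) = \prod_{\mathbf{k}'\in D_{L,\beta}}\prod_{\omega=\pm1} d\psi^{(\le h)+}_{\mathbf{k}'+\omega\mathbf{p}_F,\omega}\, d\psi^{(\le h)-}_{\mathbf{k}'+\omega\mathbf{p}_F,\omega}
\exp\Big\{ -\frac{1}{L\beta} \sum_{\mathbf{k}'\in D_{L,\beta}}\sum_{\omega=\pm1} C_h(\mathbf{k}') Z_{h-1}(\mathbf{k}') \big[ (-ik_0 + (1-\cos k')\cos p_F + \omega v_0 \sin k')\, \psi^{(\le h)+}_{\mathbf{k}'+\omega\mathbf{p}_F,\omega}\psi^{(\le h)-}_{\mathbf{k}'+\omega\mathbf{p}_F,\omega} + \sigma_{h-1}(\mathbf{k}')\, \psi^{(\le h)+}_{\mathbf{k}'+\omega\mathbf{p}_F,\omega}\psi^{(\le h)-}_{\mathbf{k}'-\omega\mathbf{p}_F,-\omega} \big] \Big\} , \tag{8.42}
\]
with
\[
Z_{h-1}(\mathbf{k}') = Z_h\,(1 + C_h^{-1}(\mathbf{k}')\, z_h) , \qquad Z_{h-1} = Z_h\,(1+z_h) , \qquad
Z_{h-1}(\mathbf{k}')\,\sigma_{h-1}(\mathbf{k}') = \begin{cases} Z_h\,(\sigma_h(\mathbf{k}') + C_h^{-1}(\mathbf{k}')\, s_h) & \text{if } h < 0 , \\ C_0^{-1}(\mathbf{k}')\, s_0 & \text{if } h = 0 , \end{cases} \tag{8.43}
\]
and
\[
\tilde V^{(h)} = \mathcal{L}\tilde V^{(h)} + (1-\mathcal{L})V^{(h)} , \qquad
\mathcal{L}\tilde V^{(h)}(\psi^{(\le h)}) = \gamma^h n_h F_\nu^{(\le h)} + (a_h - z_h) F_\alpha^{(\le h)} + l_h F_\lambda^{(\le h)} .
\]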
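The recursion $Z_{h-1} = Z_h(1+z_h)$ in (8.43) is what turns an unbounded $z_h$ into a power-law (anomalous) rescaling of the fields. The following minimal Python sketch iterates it from $Z_0 = 1$. It assumes, purely for illustration, that $z_h$ takes the same constant value `z` on every scale, whereas in the text $z_h$ is determined by the flow; the function name, `gamma`, `z` and the numerical values are hypothetical.

```python
import math

def wave_function_renormalization(z_of_h, h_min):
    """Iterate Z_{h-1} = Z_h * (1 + z_h) from Z_0 = 1 down to scale h_min <= 0.

    z_of_h: callable returning z_h for an integer scale h <= 0 (assumed input).
    Returns a dict mapping each scale h in [h_min, 0] to Z_h.
    """
    Z = {0: 1.0}
    for h in range(0, h_min, -1):          # h = 0, -1, ..., h_min + 1
        Z[h - 1] = Z[h] * (1.0 + z_of_h(h))
    return Z

gamma = 2.0   # scaling parameter (illustrative value)
z = 0.01      # constant z_h: an assumption of this sketch only

Z = wave_function_renormalization(lambda h: z, h_min=-20)

# For constant z one has Z_h = gamma**(-eta * h) with eta = log_gamma(1 + z).
eta = math.log(1.0 + z, gamma)
for h in (0, -5, -10, -20):
    print(h, Z[h], gamma ** (-eta * h))
```

Under this assumption $Z_h$ grows like $\gamma^{-\eta h}$ as $h \to -\infty$, with $\eta = \log_\gamma(1+z)$, instead of staying bounded.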
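Note that the functions $Z_h(\mathbf{k}')$ and $\sigma_h(\mathbf{k}')$ are defined iteratively for $h \le 0$ by (8.43) itself (for a better understanding of the integration procedure one can work out explicitly the first scales $h = 0, -1, \ldots$). In particular, one has
\[
\sigma_h(\mathbf{k}') = \sum_{h'=h}^{0} C_{h'}^{-1}(\mathbf{k}')\, s_{h'} , \tag{8.44}
\]
so that, if $\mathbf{k}'$ is such that $C_h^{-1}(\mathbf{k}') \neq 0$ (i.e. $|\mathbf{k}'| \le t_0\gamma^{h+1}$), one has
\[
\sigma_h(\mathbf{k}') = C_h^{-1}(\mathbf{k}')\, s_h + \sum_{h'=h+1}^{0} s_{h'} , \tag{8.45}
\]
as $C_{h'}^{-1}(\mathbf{k}') = 1$ for $h' \ge h+1$ for such $\mathbf{k}'$. Therefore $\sigma_h(\mathbf{k}')$ is a smooth function on $\mathbb{T}\times\mathbb{R}$. We define
\[
\sigma_h = \sum_{h'=h}^{0} s_{h'} . \tag{8.46}
\]
The right-hand side of (8.41) can be written as
\[
\frac{1}{N_h} \int P_{Z_{h-1}}(d\psi^{(\le h-1)}) \int \tilde P_{Z_{h-1}}(d\psi^{(h)})\, e^{-\tilde V^{(h)}(\sqrt{Z_h}\,\psi^{(\le h)})} , \tag{8.47}
\]
where $P_{Z_{h-1}}(d\psi^{(\le h-1)})$ is given by (8.42) with
(1) $Z_{h-1}(\mathbf{k}')$ replaced by $Z_{h-1}$,
(2) $C_h(\mathbf{k}')$ replaced with $C_{h-1}(\mathbf{k}')$,
(3) $\psi^{(\le h)}$ replaced with $\psi^{(\le h-1)}$,
while $\tilde P_{Z_{h-1}}(d\psi^{(h)})$ is given by (8.42) with
(1) $Z_{h-1}(\mathbf{k}')$ replaced by $Z_{h-1}$,
(2) $C_h(\mathbf{k}')$ replaced with $\tilde f_h^{-1}(\mathbf{k}')$, if
\[
\tilde f_h(\mathbf{k}') = Z_{h-1}\Big[ \frac{C_h^{-1}(\mathbf{k}')}{Z_{h-1}(\mathbf{k}')} - \frac{C_{h-1}^{-1}(\mathbf{k}')}{Z_{h-1}} \Big] , \tag{8.48}
\]
(3) $\psi^{(\le h)}$ replaced with $\psi^{(h)}$.
This can be easily proven by using the addition principle and the change of integration for fermionic integrations discussed in Section 4. Note also that $\tilde f_h(\mathbf{k}')$ is a compact support function, with support of width $O(\gamma^h)$ and at a distance $O(\gamma^h)$ from the "singularity", i.e. from $p_F$.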
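The Grassmann integration $\tilde P_{Z_{h-1}}(d\psi^{(h)})$ has propagator given by
\[
\frac{g^{(h)}(\mathbf{x}-\mathbf{y})}{Z_{h-1}} = \sum_{\omega,\omega'=\pm1} e^{-i(\omega x-\omega' y)p_F}\, \frac{g^{(h)}_{\omega,\omega'}(\mathbf{x}-\mathbf{y})}{Z_{h-1}} , \tag{8.49}
\]
with
\[
\frac{g^{(h)}_{\omega,\omega'}(\mathbf{x}-\mathbf{y})}{Z_{h-1}} = \int \tilde P_{Z_{h-1}}(d\psi^{(h)})\, \psi^{(h)-}_{\mathbf{x},\omega}\psi^{(h)+}_{\mathbf{y},\omega'} , \tag{8.50}
\]
such that
\[
g^{(h)}_{\omega,\omega'}(\mathbf{x}-\mathbf{y}) = \frac{1}{L\beta}\sum_{\mathbf{k}'\in D_{L,\beta}} e^{-i\mathbf{k}'\cdot(\mathbf{x}-\mathbf{y})}\, \tilde f_h(\mathbf{k}')\, [T_h^{-1}(\mathbf{k}')]_{\omega,\omega'} , \tag{8.51}
\]
where the $2\times2$ matrix $T_h(\mathbf{k}')$ has elements
\[
\begin{aligned}
[T_h(\mathbf{k}')]_{1,1} &= -ik_0 + (1-\cos k')\cos p_F + v_0\sin k' , \\
[T_h(\mathbf{k}')]_{1,2} &= [T_h(\mathbf{k}')]_{2,1} = \sigma_{h-1}(\mathbf{k}') , \\
[T_h(\mathbf{k}')]_{2,2} &= -ik_0 + (1-\cos k')\cos p_F - v_0\sin k' ,
\end{aligned} \tag{8.52}
\]
and its inverse is well defined on the support of $\tilde f_h(\mathbf{k}')$, so that, if we set
\[
A_h(\mathbf{k}') = \det T_h(\mathbf{k}') = [-ik_0 + (1-\cos k')\cos p_F]^2 - (v_0\sin k')^2 - [\sigma_{h-1}(\mathbf{k}')]^2 , \tag{8.53}
\]
then
\[
T_h^{-1}(\mathbf{k}') = \frac{1}{A_h(\mathbf{k}')} \begin{pmatrix} [\tau_h(\mathbf{k}')]_{1,1} & [\tau_h(\mathbf{k}')]_{1,2} \\ [\tau_h(\mathbf{k}')]_{2,1} & [\tau_h(\mathbf{k}')]_{2,2} \end{pmatrix} , \tag{8.54}
\]
with
\[
\begin{aligned}
[\tau_h(\mathbf{k}')]_{1,1} &= -ik_0 + (1-\cos k')\cos p_F - v_0\sin k' , \\
[\tau_h(\mathbf{k}')]_{1,2} &= [\tau_h(\mathbf{k}')]_{2,1} = -\sigma_{h-1}(\mathbf{k}') , \\
[\tau_h(\mathbf{k}')]_{2,2} &= -ik_0 + (1-\cos k')\cos p_F + v_0\sin k' .
\end{aligned} \tag{8.55}
\]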
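As a consistency check on (8.52)–(8.55), the following short Python sketch builds the matrix with entries (8.52), forms its inverse as the adjugate (8.55) divided by the determinant (8.53), and multiplies the two. The numerical values of $k_0$, $k'$, $p_F$, $v_0$ and of the off-diagonal entry are arbitrary illustrative choices, and the function names are hypothetical.

```python
import math

def T_matrix(k0, k, p_F, v0, sigma):
    """2x2 matrix of (8.52); `sigma` stands for the off-diagonal entry."""
    a = -1j * k0 + (1.0 - math.cos(k)) * math.cos(p_F)
    b = v0 * math.sin(k)
    return [[a + b, sigma], [sigma, a - b]]

def T_inverse(k0, k, p_F, v0, sigma):
    """Inverse built as in (8.53)-(8.55): adjugate divided by determinant."""
    a = -1j * k0 + (1.0 - math.cos(k)) * math.cos(p_F)
    b = v0 * math.sin(k)
    det = a * a - b * b - sigma * sigma           # determinant, as in (8.53)
    adj = [[a - b, -sigma], [-sigma, a + b]]      # entries of (8.55)
    return [[entry / det for entry in row] for row in adj]

def matmul2(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][m] * B[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

# Arbitrary illustrative values (not taken from the text).
params = dict(k0=0.3, k=0.2, p_F=1.0, v0=0.8, sigma=0.05)
product = matmul2(T_matrix(**params), T_inverse(**params))
print(product)   # should be the identity matrix up to rounding errors
```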
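Note that there exist two positive constants $c_1, c_2$ such that
\[
c_1\sigma_h \le \sigma_h(\mathbf{k}') \le c_2\sigma_h , \qquad \sigma_h \equiv \sum_{h'=h}^{0} s_{h'} , \tag{8.56}
\]
where definition (8.46) has been used. The large distance behaviour of propagator (8.51) is given by the following lemma (which can be proven by reasoning as for proving (5.27) in Section A.4).

Lemma 2. The propagator $g^{(h)}_{\omega,\omega'}(\mathbf{x}-\mathbf{y})$ in (8.51) can be bounded as follows. For $\omega = \omega'$ one has
\[
g^{(h)}_{\omega,\omega}(\mathbf{x}-\mathbf{y}) = g^{(h)}_{0,\omega}(\mathbf{x}-\mathbf{y}) + C^{(h)}_{1,\omega}(\mathbf{x}-\mathbf{y}) + C^{(h)}_{2,\omega}(\mathbf{x}-\mathbf{y}) , \tag{8.57}
\]
where
\[
g^{(h)}_{0,\omega}(\mathbf{x}-\mathbf{y}) = \frac{1}{L\beta}\sum_{\mathbf{k}'\in D_{L,\beta}} e^{-i\mathbf{k}'\cdot(\mathbf{x}-\mathbf{y})}\, \frac{\tilde f_h(\mathbf{k}')}{-ik_0 + \omega v_0 k'} , \tag{8.58}
\]
$C^{(h)}_{1,\omega}(\mathbf{x}-\mathbf{y})$ is independent of $\sigma_h(\mathbf{k}')$, and $C^{(h)}_{1,\omega}$ and $C^{(h)}_{2,\omega}$ are such that, for any integer $N > 1$ and for $|x-y| \le L/2$, $|x_0-y_0| \le \beta/2$, one has
\[
|C^{(h)}_{1,\omega}(\mathbf{x}-\mathbf{y})| \le \frac{\gamma^{2h} C_N}{1+(\gamma^h|\mathbf{x}-\mathbf{y}|)^N} , \qquad
|C^{(h)}_{2,\omega}(\mathbf{x}-\mathbf{y})| \le \Big(\frac{|\sigma_h|}{\gamma^h}\Big)^2 \frac{\gamma^h C_N}{1+(\gamma^h|\mathbf{x}-\mathbf{y}|)^N} \tag{8.59}
\]
for a suitable constant $C_N$. For $\omega = -\omega'$ one has
\[
|g^{(h)}_{\omega,-\omega}(\mathbf{x}-\mathbf{y})| \le \frac{|\sigma_h|}{\gamma^h}\, \frac{\gamma^h C_N}{1+(\gamma^h|\mathbf{x}-\mathbf{y}|)^N} , \tag{8.60}
\]
where $C_N$ can be chosen the same as the constant in (8.59).

Note that $g^{(h)}_{0,\omega}(\mathbf{x}-\mathbf{y})$ coincides with the propagator "at scale $\gamma^h$" of the Luttinger model [53] (this remark will be crucial for studying the renormalization group flow); it admits the bound
\[
|g^{(h)}_{0,\omega}(\mathbf{x}-\mathbf{y})| \le \frac{\gamma^h C_N}{1+(\gamma^h|\mathbf{x}-\mathbf{y}|)^N} , \tag{8.61}
\]
so that we see that, with respect to $g^{(h)}_{0,\omega}$, the propagators $C^{(h)}_{1,\omega}$, $C^{(h)}_{2,\omega}$ and $g^{(h)}_{\omega,-\omega}$ have some extra good factors, which are, respectively, $\gamma^h$, $(|\sigma_h|/\gamma^h)^2$ and $|\sigma_h|/\gamma^h$.

We rescale the fields by rewriting (8.47) as
\[
\frac{1}{N_h} \int P_{Z_{h-1}}(d\psi^{(\le h-1)}) \int \tilde P_{Z_{h-1}}(d\psi^{(h)})\, e^{-\hat V^{(h)}(\sqrt{Z_{h-1}}\,\psi^{(\le h)})} , \tag{8.62}
\]
so that
\[
\mathcal{L}\hat V^{(h)}(\psi^{(\le h)}) = \gamma^h\nu_h F_\nu^{(\le h)} + \delta_h F_\alpha^{(\le h)} + \lambda_h F_\lambda^{(\le h)} , \tag{8.63}
\]
where, by definition,
\[
\nu_h = \frac{Z_h}{Z_{h-1}}\, n_h , \qquad \delta_h = \frac{Z_h}{Z_{h-1}}\,(a_h - z_h) , \qquad \lambda_h = \Big(\frac{Z_h}{Z_{h-1}}\Big)^2 l_h . \tag{8.64}
\]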
We call the set $\vec v_h = (\nu_h,\delta_h,\lambda_h)$ the running coupling constants. They will be the true running coupling constants of the model and replace the ones defined through (8.28). We perform the integration
\[
\int \tilde P_{Z_{h-1}}(d\psi^{(h)})\, e^{-\hat V^{(h)}(\sqrt{Z_{h-1}}\,\psi^{(\le h)})} = e^{-V^{(h-1)}(\sqrt{Z_{h-1}}\,\psi^{(\le h-1)}) + \tilde E_h} , \tag{8.65}
\]
where $\tilde E_h$ is a suitable constant and
\[
\mathcal{L}V^{(h-1)}(\psi^{(\le h-1)}) = \gamma^{h-1} n_{h-1} F_\nu^{(\le h-1)} + s_{h-1} F_\sigma^{(\le h-1)} + a_{h-1} F_\alpha^{(\le h-1)} + z_{h-1} F_\zeta^{(\le h-1)} + l_{h-1} F_\lambda^{(\le h-1)} . \tag{8.66}
\]
Note that the above procedure allows us to write the running coupling constants $\vec v_h$ in terms of $\vec v_{h'}$, $h' \ge h+1$,
\[
\vec v_h = \vec\beta(\vec v_{h+1},\ldots,\vec v_0) ; \tag{8.67}
\]
the function $\vec\beta(\vec v_{h+1},\ldots,\vec v_0)$ is called the beta function.

Recall that, if no renormalization is performed, the effective potential $V^{(h)}(\psi)$ is a sum of terms of the form
\[
\frac{1}{(L\beta)^n} \sum_{\mathbf{k}_1,\ldots,\mathbf{k}_n\in D_{L,\beta}} \hat W^{(h)}_{n,m}(\mathbf{k}_1,\ldots,\mathbf{k}_n) \Big[\prod_{i=1}^{n} \psi^{(\le h)\sigma_i}_{\mathbf{k}'_i+\omega_i\mathbf{p}_F,\omega_i}\Big]\, \delta\Big(\sum_{i=1}^{n}\sigma_i(\mathbf{k}'_i+\omega_i\mathbf{p}_F) + 2m\mathbf{p}\Big) , \tag{8.68}
\]
see (8.8). The renormalization procedure described above produces a new sequence of (renormalized) effective potentials which are of the form (8.68) with the fields $\psi^{(\le h)}$ replaced with $\sqrt{Z_h}\,\psi^{(\le h)}$ and the kernels $\hat W^{(h)}_{n,m}(\mathbf{k}_1,\ldots,\mathbf{k}_n)$ computed with the new rules: we shall call them the renormalized values of the clusters. The effective potentials can be written as
\[
V^{(h)}(\sqrt{Z_h}\,\psi^{(\le h)}) = \sum_{n=1}^{\infty}\sum_{\tau\in T_{h,n}} V^{(h)}(\tau,\sqrt{Z_h}\,\psi^{(\le h)}) , \qquad
V^{(h)}(\tau,\sqrt{Z_h}\,\psi^{(\le h)}) = \int d\mathbf{x}(I_{v_0}) \sum_{P_{v_0}\subset I_{v_0}} \sqrt{Z_h}^{\,|P_{v_0}|}\, \tilde\psi^{(\le h)}(P_{v_0})\, W^{(h)}(\tau,P_{v_0},\mathbf{x}(I_{v_0})) . \tag{8.69}
\]
Here the kernels
\[
W^{(h)}(\tau,P_{v_0},\mathbf{x}(P_{v_0})) = \int d\mathbf{x}(I_{v_0}\setminus P_{v_0})\, W^{(h)}(\tau,P_{v_0},\mathbf{x}(I_{v_0})) \tag{8.70}
\]
are the functions of which the renormalized values $\hat W^{(h)}_{n,m}(\mathbf{k}_1,\ldots,\mathbf{k}_n)$ mentioned above represent the Fourier transforms.

Define
\[
h^* = \inf\{h \ge h_\beta : a_0\gamma^{h+1} \ge 2|\sigma_h|\} . \tag{8.71}
\]
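Definition (8.71) can be read as a simple search over the integer scales. The sketch below is only an illustration: it assumes that the scales run over the integers between a lowest scale `h_beta` and $0$, that both inequalities in (8.71) are non-strict, and that the sequence of gap parameters is supplied by the caller as a function `sigma`; the function name and the sample values are hypothetical.

```python
def h_star(sigma, a0, gamma, h_beta):
    """Smallest integer scale h with h_beta <= h <= 0 such that
    a0 * gamma**(h + 1) >= 2 * |sigma(h)|, following (8.71).

    sigma: callable giving the gap parameter on scale h (assumed input).
    Returns None if no scale in [h_beta, 0] satisfies the condition.
    """
    for h in range(h_beta, 1):             # h = h_beta, ..., 0 in increasing order
        if a0 * gamma ** (h + 1) >= 2.0 * abs(sigma(h)):
            return h
    return None

# Toy example: a scale-independent gap parameter (an assumption for illustration).
example = h_star(lambda h: 0.01, a0=1.0, gamma=2.0, h_beta=-30)
print(example)
```

With these sample values the condition reads $2^{h+1} \ge 0.02$, which first holds at $h = -6$, so the scale $h^*$ is where $\gamma^{h}$ becomes comparable with the gap.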
We shall prove in the following that the running coupling constants $\sigma_h$ remain bounded from below (uniformly in $\beta$): as $\gamma^{h+1}$ can be arbitrarily small for $\beta\to\infty$ and $h$ small enough, definition (8.71) of $h^*$ makes sense. By the previous lemma one immediately gets the following result.

Lemma 3. For $h > h^*$ and for any integer $N > 1$, it is possible to find a constant $C_N$ such that
\[
|g^{(h)}_{\omega,\omega'}(\mathbf{x}-\mathbf{y})| \le \frac{C_N\gamma^h}{1+(\gamma^h|\mathbf{x}-\mathbf{y}|)^N} , \tag{8.72}
\]
for $|x-y| \le L/2$ and $|x_0-y_0| \le \beta/2$.

We shall see that, using the above lemmata and assuming that the running coupling constants are bounded (an assumption which will be checked to hold a posteriori), the integrations $\tilde P_{Z_{h-1}}(d\psi^{(\le h)})$ are well defined for $0 \ge h > h^*$. The integration of the scales from $h^*$ to $h_\beta$ can be performed "in a single step" as
\[
\int P_{Z_{h^*}}(d\psi^{(\le h^*)})\, e^{-V^{(h^*)}(\sqrt{Z_{h^*}}\,\psi^{(\le h^*)})} = \frac{1}{N_{h^*}} \int \tilde P_{Z_{h^*-1}}(d\psi^{(\le h^*)})\, e^{-\tilde V^{(h^*)}(\sqrt{Z_{h^*}}\,\psi^{(\le h^*)})} , \tag{8.73}
\]
where the integration measure $\tilde P_{Z_{h^*-1}}(d\psi^{(\le h^*)})$ is defined by the propagator
\[
\int \tilde P_{Z_{h^*-1}}(d\psi^{(\le h^*)})\, \psi^{-(\le h^*)}_{\mathbf{x}}\psi^{+(\le h^*)}_{\mathbf{y}} \equiv \frac{g^{(\le h^*)}(\mathbf{x}-\mathbf{y})}{Z_{h^*-1}} . \tag{8.74}
\]
The integration in the r.h.s. of (8.73) is well defined, as follows from the following bound.

Lemma 4. Assume that $h^*$ is finite uniformly in $L$, $\beta$, so that $|\sigma_{h^*-1}|/\gamma^{h^*} \ge \bar\kappa$, for a suitable constant $\bar\kappa$. Then for any integer $N$ it is possible to find a constant $C_N$ such that one has
\[
|g^{(\le h^*)}_{\omega,\omega'}(\mathbf{x}-\mathbf{y})| \le \frac{C_N\gamma^{h^*}}{1+(\gamma^{h^*}|\mathbf{x}-\mathbf{y}|)^N} \tag{8.75}
\]
for $|x-y| \le L/2$ and $|x_0-y_0| \le \beta/2$.

Comparing the previous lemmata, we see that the propagator of the integration of all the scales between $h^*$ and $h_\beta$ has the same bound as the propagator of the integration of a single scale greater than $h^*$: this will be used to perform the integration of the scales $\le h^*$ altogether. In fact $\gamma^{h^*}$ is a momentum scale and, roughly speaking, for momenta larger than $\gamma^{h^*}$ the theory is "essentially" a massless theory (up to corrections $O(\sigma_h\gamma^{-h})$), while for momenta smaller than $\gamma^{h^*}$ it is a "massive" theory with mass $O(\gamma^{h^*})$.

By the lemma in Section 8.2 we see that it is possible to have quartic or bilinear contributions to $V^{(h)}$, if $|h|$ is large enough, such that (8.18) with $n = 2, 4$ is not satisfied only with an extremely large $N_v = m$, namely $|m| \ge C\gamma^{-h/\tau}$, for some constant $C$. In order to show that such terms are irrelevant, we shall have to use the fact that $|\hat\varphi_m| \le C e^{-\xi|m|}$, for some constants $C$ and $\xi$; see below.

8.7. Bounds for the renormalized expansion

We want to prove the following result.

Theorem 1. Let $h > h^*$, with $h^*$ defined by (8.71). If, for some constant $c_1$, one has
\[
\sup_{h'>h}|\vec v_{h'}| \equiv \varepsilon_h , \qquad \sup_{h'>h}\Big|\frac{Z_{h'}}{Z_{h'-1}}\Big| \le e^{c_1\varepsilon_h^2} , \qquad \sup_{h'>h}\Big|\frac{\sigma_{h'}}{\sigma_{h'-1}}\Big| \le e^{c_1\varepsilon_h} , \tag{8.76}
\]
and if there exists a constant $\bar\varepsilon$ (depending on $c_1$) such that $\varepsilon_h \le \bar\varepsilon$, then, for a suitable constant $c_0$, independent of $c_1$, as well as of $u$, $L$ and $\beta$, one has
\[
\sum_{\tau\in T_{h,n}} \big[ |n_h(\tau)| + |z_h(\tau)| + |a_h(\tau)| + |l_h(\tau)| \big] \le (c_0\varepsilon_h)^n , \tag{8.77}
\]
\[
\sum_{\tau\in T_{h,n}} |s_h(\tau)| \le |\sigma_h|\,(c_0\varepsilon_h)^n , \tag{8.78}
\]
\[
\sum_{\tau\in T_{h,n}} |\tilde E_{h+1}(\tau)| \le \gamma^{2h}(c_0\varepsilon_h)^n , \tag{8.79}
\]
and, setting $D(P_v) = (n^e_v/2) - 2 + z_v(N_v,P_v) + \tilde z_v(N_v,P_v)/2$, with $\tilde z_v = 1$ if $|P_v| = 2$ and $\omega_1 = -\omega_2$ and $0$ otherwise,
\[
\sum_{\tau\in T_{h,n}} \int d\mathbf{x}(P_{v_0})\, |\mathcal{R}W^{(h)}(\tau,P_{v_0},\mathbf{x}(P_{v_0}))| \le L\beta\,\gamma^{-D(P_{v_0})h}(c_0\varepsilon_h)^n , \tag{8.80}
\]
where (8.80) corresponds to trees such that an $\mathcal{R}$ operation is associated with the first vertex, (8.77) and (8.78) correspond to trees such that an $\mathcal{L}$ operation is associated with the first vertex, and (8.79) represents a constant (i.e. field-independent) contribution to the effective potential.

Let us recall the discussion in Section 8.5 and let us consider first the case of a cluster $G_v$ with two external lines such that $\hat W^{(h)}_{2,m}(\mathbf{k}_1+\omega_1\mathbf{p}_F,\mathbf{k}_2+\omega_2\mathbf{p}_F)$ has $\omega_1 = -\omega_2$. Contributions to the effective potential corresponding to such clusters will be generated by at least one nondiagonal propagator. Therefore, if for some term contributing to the effective potential on scale $h$ there is a nondiagonal propagator on scale $h'$, then one has an extra factor $|\sigma_{h'}\gamma^{-h'}|$ with respect to the bound holding in the absence of nondiagonal propagators (compare (8.60) with (8.61)). If $G_v$ is a cluster containing the line corresponding to such a propagator one has $h' \ge h_v$: the scale $h'$ will be the scale of the minimal cluster containing the line such that the nondiagonal propagator is associated with it. We shall prove in the following that
\[
\Big|\frac{\sigma_h}{\sigma_{h-1}}\Big| \le e^{c\varepsilon} , \tag{8.81}
\]
for some constant $c$, so that
\[
\Big|\frac{\sigma_{h_v}}{\sigma_{h^*}}\Big| \le e^{c\varepsilon(h_v-h^*)} . \tag{8.82}
\]
Then one has
\[
|\sigma_{h_v}\gamma^{-h_v}| \le \Big|\frac{\sigma_{h_v}}{\sigma_{h^*}}\Big|\, \frac{|\sigma_{h^*}|}{\gamma^{h_v}} \le \gamma^{(h-h_v)(1-O(\varepsilon))} \le \gamma^{(h-h_v)/2} , \tag{8.83}
\]
as $|\sigma_{h^*}| \le C\gamma^{h^*} \le C\gamma^h$ (for a suitable constant $C$), by (8.71). Let us consider vertices $v_1, v_2, \ldots, v_m$ ordered so that $v_1 \prec v_2 \prec \ldots \prec v_m$, with $v_{i-1} = v_i'$ for $i = 2,\ldots,m$ and $v_1$ is the root. Suppose that $G_{v_1}$ is a cluster containing a nondiagonal propagator on scale $h' = h_{v_1}$. Then one has
\[
|\sigma_{h_{v_1}}\gamma^{-h_{v_1}}| \le \prod_{i=1}^{m} \gamma^{(h_{v_i'}-h_{v_i})/2} . \tag{8.84}
\]
This means that we can associate a factor $\gamma^{(h_{v'}-h_v)/2}$ to each cluster containing a cluster $G_v$ with two external lines such that $\omega_1 = -\omega_2$. Note also that, as the value of such a cluster $G_v$ is marginal in a naïve power counting analysis (and $z_v = 1$), the corresponding gain factor is enough to ensure the convergence, as far as the value of $G_v$ is concerned.

It remains to improve the power counting of the nonresonant clusters with two or four external lines (i.e. with $|P_v| = 2$ or $|P_v| = 4$), which are such that
\[
\sum_{f\in P_v} \sigma(f)\,\omega(f)\,p_F + 2N_v p \neq 0 . \tag{8.85}
\]
We have a bound analogous to (8.34), namely, if $\rho_v \in \{\lambda_{h_v},\nu_{h_v},\delta_{h_v}\}$,
\[
\int d\mathbf{x}(P_{v_0})\, |W^{(h)}(\tau,P_{v_0},\mathbf{x}(P_{v_0}))| \le C^n \gamma^{-h[D(P_{v_0})+z_{v_0}(N_{v_0},P_{v_0})+\tilde z_{v_0}(N_{v_0},P_{v_0})/2]} \prod_{v\notin V_f(\tau)} \gamma^{-[D(P_v)+z_v(N_v,P_v)+\tilde z_v(N_v,P_v)/2](h_v-h_{v'})} \prod_{v\in V_f(\tau)} |\rho_v| \prod_{v\in V_f^*(\tau)} |\hat\varphi_{m_v}| , \tag{8.86}
\]
and we can write in (8.86)
\[
\prod_{v\notin V_f(\tau)} \gamma^{-[D(P_v)+z_v(N_v,P_v)](h_v-h_{v'})} \le \prod_{v\notin V_f(\tau)} \gamma^{-|P_v|/6} \prod_{v\in V_4(\tau)} \gamma^{h_v-h_{v'}} \prod_{v\in V_2(\tau)} \gamma^{h_v-h_{v'}} \prod_{v\in V_2'(\tau)} \gamma^{2(h_v-h_{v'})} , \tag{8.87}
\]
where the following notations have been used.
(1) $V_4(\tau)$ is the set of vertices $v \in V(\tau)$ with $G_v$ nonresonant and $|P_v| = 4$.
(2) $V_2(\tau)$ is the set of vertices $v \in V(\tau)$ with $G_v$ nonresonant, $|P_v| = 2$ and a derivative acting on one of the external lines.
(3) $V_2'(\tau)$ is the set of vertices $v \in V(\tau)$ with $G_v$ nonresonant, $|P_v| = 2$ and no derivative acting on the external lines.
With any vertex $v \in V_{nt}(\tau)$ we associate a depth label $D_v$ defined inductively as follows. If $v$ is an endpoint then $D_v = 1$, otherwise
\[
D_v = 1 + \max_{w\in V_{nt}(\tau)} \{D_w : w \succ v\} . \tag{8.88}
\]
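The inductive definition of the depth label can be sketched as a recursion on a rooted tree. In the following illustrative Python snippet the tree is encoded, as an assumption of the sketch, by a dictionary mapping each nontrivial vertex to the list of nontrivial vertices immediately following it; taking the maximum over these immediate followers gives the same value as taking it over all followers, since the label increases towards the root. The vertex names and the sample tree are hypothetical.

```python
def depth_labels(children, root):
    """Depth label D_v on a rooted tree of nontrivial vertices.

    children: dict mapping a vertex to the list of vertices immediately
              following it (endpoints map to an empty list or are absent).
    Returns a dict v -> D_v, with D_v = 1 on endpoints and
    D_v = 1 + max(D_w) over the vertices w following v otherwise.
    """
    D = {}

    def visit(v):
        kids = children.get(v, [])
        D[v] = 1 if not kids else 1 + max(visit(w) for w in kids)
        return D[v]

    visit(root)
    return D

# Hypothetical tree: v0 -> (v1, e1), v1 -> (e2, v2), v2 -> (e3, e4).
tree = {"v0": ["v1", "e1"], "v1": ["e2", "v2"], "v2": ["e3", "e4"]}
labels = depth_labels(tree, "v0")
print(labels)
```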
Note that $D_v \le -h_{v'}+2$, if $v'$ is the nontrivial vertex immediately preceding $v$. We call $B_D$ the set of $v \in V_f^*(\tau)$ such that the nontrivial vertex immediately preceding $v$ has depth $D$. Now consider the factors $\gamma^{h_v-h_{v'}}$ in (8.87) corresponding to the vertices $v \in V_4(\tau)\cup V_2(\tau)\cup V_2'(\tau)$. For any pair $v_1, v_2$ of nontrivial vertices such that $v_2' = v_1$ we can consider all the trivial vertices following $v_1$ and preceding $v_2$ together with $v_2$ and write
\[
\prod_{\substack{v_1\prec w\preceq v_2 \\ w\in V_4(\tau)\cup V_2(\tau)}} \gamma^{h_w-h_{w'}} \prod_{\substack{v_1\prec w\preceq v_2 \\ w\in V_2'(\tau)}} \gamma^{2(h_w-h_{w'})} \le \gamma^{-h_{v_1}} , \tag{8.89}
\]
so that we can bound (8.87) by
\[
\prod_{v\notin V_f(\tau)} \gamma^{-[D(P_v)+z_v(N_v,P_v)+\tilde z_v(N_v,P_v)/2](h_v-h_{v'})} \le \prod_{v\notin V_f(\tau)} \gamma^{-|P_v|/12} \prod_{v\in V_4(\tau)\cup V_2(\tau)\cup V_2'(\tau)} \gamma^{-2h_{v'}} , \tag{8.90}
\]
simply using (8.89) for any path of trivial vertices between two nontrivial vertices. It is obvious that in this way we exhaust all the vertices in $V(\tau)$.
One can prove inductively that
\[
\prod_{v\in V_f^*(\tau)} |\hat\varphi_{m_v}| \le \prod_{v\in V_f^*(\tau)} e^{-\xi|m_v|/2} \prod_{v\notin V_f^*(\tau)} e^{-\xi|N_v|/2^{D_v+1}} , \tag{8.91}
\]
see Appendix A.6 for a proof. Using the fact that $D_v \le -h_{v'}+2$ and defining the indicator function $\chi$ as
\[
\chi(\text{property}) = \begin{cases} 1 & \text{if property holds} , \\ 0 & \text{otherwise} , \end{cases} \tag{8.92}
\]
one has
\[
\prod_{v\notin V_f(\tau)} \chi^*\big(|N_v| \ge C_1(\gamma^{-h_{v'}/\tau}|P_v|^{-1/\tau} - \bar m|P_v|)\big) \prod_{v\in V_f^*(\tau)} |\hat\varphi_{m_v}| \le C^n \prod_{v\in V_f^*(\tau)} e^{-\xi|m_v|/2} \prod_{v\in V_4(\tau)\cup V_2(\tau)\cup V_2'(\tau)} e^{-\xi C_2\gamma^{-h_{v'}/\tau}/2^{-h_{v'}+3}} , \tag{8.93}
\]
where the star on $\chi$ means that the constraint represented by the indicator function is used only for the vertices $v \in V_4(\tau)\cup V_2(\tau)\cup V_2'(\tau)$, for which $|P_v| \le 4$. At the end we obtain
\[
\sum_{\{P_v\}} \sum_{\{m_v\}_{v\in V_f^*(\tau)}} \int d\mathbf{x}(P_{v_0})\, |W^{(h)}(\tau,P_{v_0},\mathbf{x}(P_{v_0}))| \le C^n\varepsilon^n \gamma^{-h[D(P_{v_0})+z_{v_0}+\tilde z_{v_0}/2]} \prod_{v\notin V_f(\tau)} \gamma^{-|P_v|/12} \prod_{v\in V_4(\tau)\cup V_2(\tau)} \gamma^{-h_{v'}} e^{-\xi C_2\gamma^{-h_{v'}/\tau}/2^{-h_{v'}+3}} \prod_{v\in V_2'(\tau)} \gamma^{-2h_{v'}} e^{-\xi C_2\gamma^{-h_{v'}/\tau}/2^{-h_{v'}+3}} . \tag{8.94}
\]
Choosing $\gamma$ so that $\gamma^{1/\tau}/2 > 1$ and using the fact that, if $n$ is the number of endpoints, the number of nontrivial vertices is bounded by $2n$ (see Appendix A.1), we have that, for $\alpha = 4, 2, 2'$, there exists a constant $C$ such that
\[
\prod_{v\in V_\alpha(\tau)} \gamma^{-h_{v'}} e^{-(\xi C_2\gamma^{-h_{v'}/\tau})/(2^{-h_{v'}+3})} \le C^n . \tag{8.95}
\]
By a standard calculation (see Appendix A.6)
\[
\sum_{\tau\in T_{h,n}} \sum_{\{P_v\}} \prod_{v\notin V_f(\tau)} \gamma^{-|P_v|/8} \le C^n , \tag{8.96}
\]
for some constant $C$.

Formula (8.93) is the analogue of the Bryuno lemma for the problem we are considering: it ensures that the small denominator problem arising in the series, due to the incommensurability of the potential $\varphi_x$, can be controlled by taking into account the Diophantine condition. The same role is played by the original Bryuno lemma in the proof of the convergence of the Lindstedt series (see [65,66] for a discussion). Note, however, that while the Lindstedt series in classical mechanics have no loops, here there are loops and one has to use the Gram–Hadamard inequality.

The formal proof can be found in Appendix A.6. The idea is quite simple. We can associate with each cluster $G_v$ with four or two external lines an exponential factor $e^{-\xi|N_v|/2^{-h_{v'}+3}}$, which is indeed quite small if $|h_v|$ is large. It is due to the analyticity of the incommensurate potential $\varphi_x$. But $|N_v|$ has to be very large because of the Diophantine condition, as noted in the lemma in Section 8.2, and the resulting factor compensates the "bad" factors $\gamma^{-h_{v'}}$ or $\gamma^{-2h_{v'}}$ due to the power counting. The relevant or marginal terms in (8.21) and (8.20) are the analogue of the resonances in KAM theory.

9. Relationship between lattice and continuum models

9.1. Continuum models

By the renormalization group methods we have described in the preceding sections one can treat equally systems of fermions on the continuum or on a lattice; we applied such methods to fermions on a lattice, but this was done only for fixing ideas. This feature of such methods should not be undervalued; if on the one hand it is "physically reasonable" that, as far as the low energy properties are concerned and for weak interactions, the qualitative behaviour for fermions on a lattice or on the continuum is the same, it is also true, on the other hand, that the methods commonly used are very sensitive to the fact that the fermions are on a lattice or on the continuum. For instance, the continuous renormalization group of [19] can be used only in the continuum case (with linearized dispersion relation), and the same holds for the bosonization techniques.
All the attempts by these techniques to study models defined on a lattice, like Hubbard models or spin chains, have to approximate the lattice with a continuum (see for instance [70]). On the other hand, the exact solutions based on the Bethe ansatz, like the one of the Hubbard model by [5] or the solution of the XYZ chain, are based on the presence of a lattice.

Let us consider the simplest interacting fermionic model on the continuum, with Hamiltonian
\[
H = H_0 + \lambda V + \nu N_0 , \tag{9.1}
\]
with $H_0$, $N_0$ given by (2.2) and $V$ given by (2.7). Following the scheme in Section 8 we decompose the Grassmann variables in the sum of an ultraviolet part and an infrared one. Suppose that we can prove that after the ultraviolet integration one gets $V^{(0)}$ similar to (8.7) with $u = 0$ and the sums over $x$ replaced by integrals. It is clear that, as far as the integration of $\psi^{(i.r.)}$ is considered, the presence of the lattice or of the continuum plays essentially no role. In fact the differences in the expansion for the infrared part of the effective potential due to the fact that the fermions are on the lattice or on the continuum are only that in one case the $\delta$-functions of the conservation of momenta are defined modulo $2\pi$ and not in the other case; this can lead to a difference in the beta function. However, this difference is present only on trees containing $\mathcal{R}V^{(0)}$ and such terms are $O(\gamma^{\theta h})$ for some $0 < \theta < 1$. Moreover, the integrations over $k$ are in one case for $|k| \le \pi$ and in the other case for $k \in \mathbb{R}$; however the presence of the compact support cut-off functions makes the integrals in both cases extend over the same interval.

9.2. The ultraviolet problem

What is really different is the ultraviolet integration, and this is related to the fact that the behaviour of the Schwinger functions for small distances or large momenta is very different in lattice or continuum models. The analysis of the ultraviolet part of model (9.1) was done in [38] using a tree analysis similar to the one (with $\mathcal{R} = 1$) discussed above. One decomposes the ultraviolet part of the propagator as
\[
g^{(u.v.)}(\mathbf{x}-\mathbf{y}) = \sum_{h=0}^{\infty} C^{(h)}(\mathbf{x}-\mathbf{y}) \tag{9.2}
\]
with
\[
C^{(h)}(\mathbf{x}-\mathbf{y}) = \gamma^{h/2}\, \bar C_h(\gamma^h(x_0-y_0), \gamma^{h/2}(x-y)) , \tag{9.3}
\]
such that
\[
|\bar C_h(x_0,x)| \le \frac{C_N}{1+|\mathbf{x}|^N} . \tag{9.4}
\]
Then $V^{(0)}(\psi^{(i.r.)})$ is written as a sum over trees $\tau \in T_{n,0}$ which is bounded, if $|\nu| \le C|\lambda|$, by
\[
C^n|\lambda|^n \sum_{\tau\in T_{n,0}} \prod_{v\in\tau} \gamma^{(h_{v'}-h_v)D_v/4} \tag{9.5}
\]
with
\[
D_v = |P_v| + 2n_{4v} + 4n_{2v} - 6 , \tag{9.6}
\]
where $|P_v|$ is the number of external fields of the cluster $v$, while $n_{4v}$, $n_{2v}$ are the numbers of $\lambda$ or $\nu$ vertices inside the cluster $G_v$. We have seen that the sum over trees can be done for $D_v > 0$; this is what happens in this case except in a finite number of cases, namely
(1) $|P_v| = 2$, $n_{4v} = 2$, $n_{2v} = 0$;
(2) $|P_v| = 2$, $n_{4v} = 1$, $n_{2v} = 0$.
However, an explicit analysis of the above cases shows that the bounds can be improved and $D_v > 0$ also in these cases, see [38]. Note that the ultraviolet part of the theory is a superrenormalizable theory, while the infrared part is a renormalizable one.
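The finite list of exceptional cases can be recovered by a direct enumeration of formula (9.6). The Python sketch below is illustrative only and rests on assumptions that are not stated in the text: the number of external fields is taken even and at least 2, each quartic vertex is assumed to carry four fields and each quadratic vertex two, and a cluster is required to contain at least one internal line, i.e. $4n_{4v} + 2n_{2v} - |P_v| \ge 2$. The function names are hypothetical.

```python
def uv_dimension(n_ext, n4, n2):
    """D_v = |P_v| + 2 n_4v + 4 n_2v - 6, as in (9.6)."""
    return n_ext + 2 * n4 + 4 * n2 - 6

def non_positive_cases(max_vertices=6):
    """List the tuples (|P_v|, n_4v, n_2v, D_v) with D_v <= 0.

    Assumptions of this sketch (not taken from the text): |P_v| is even and
    at least 2, and the cluster has at least one internal line, that is
    4*n4 + 2*n2 - |P_v| >= 2.
    """
    cases = []
    for n4 in range(max_vertices + 1):
        for n2 in range(max_vertices + 1 - n4):
            total_fields = 4 * n4 + 2 * n2
            # n_ext runs over 2, 4, ..., total_fields - 2
            for n_ext in range(2, total_fields - 1, 2):
                d = uv_dimension(n_ext, n4, n2)
                if d <= 0:
                    cases.append((n_ext, n4, n2, d))
    return sorted(cases)

exceptional = non_positive_cases()
print(exceptional)
```

Since $D_v$ grows with the number of vertices (at least $2$ as soon as $n_{4v}+n_{2v} \ge 3$ and $|P_v| \ge 2$), enlarging `max_vertices` does not add new cases, in agreement with the superrenormalizable character of the ultraviolet problem.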
9.3. The Luttinger model and the ultraviolet problem

A similar result holds for the ultraviolet part of the Luttinger model, see [37]. The analysis is more difficult, because of the weaker decay properties of the ultraviolet propagator of the Luttinger model, and it is inspired by a proof for the ultraviolet Yukawa$_2$ model in [35]. We do not present here the details of the proof (also because we did not even give the exact definition of the Hamiltonian of the Luttinger model), and, as elsewhere in this paper, we refer to the original paper [37]; here we confine ourselves to outlining the idea of the proof. By analysing carefully the structure of the clusters of the Feynman diagrams, as done for the infrared problem, one can easily identify the contributions which are responsible for bad dimensional bounds as the clusters with two and four external lines (the scaling properties of the propagator are exactly the same both in the ultraviolet and in the infrared region). However, by using the connectedness properties of the Feynman diagrams and the nonlocality of the interaction, one can easily realize that not all clusters present the same problems: more precisely, the dimensional bounds can be improved for a suitable subclass of clusters with two and four external lines. As far as the remaining ones are concerned, one can then use the symmetry properties of the propagator of the Luttinger model (it is an odd function) to show that there are cancellations such that the dominant terms (which would lead to divergences when summing over all scales) are in fact vanishing. The real difficulty arises when trying to take care of the cancellations without separately bounding the single Feynman diagrams (for which the above argument simply applies), but directly working with the truncated expectations in order to use the Gram–Hadamard inequality to obtain summability.
Then one has to disentangle the classes of graphs lumped together in the definition of truncated expectation in order to still keep the good ultraviolet bound valid for them, but not so much as to lose the good combinatorial behaviour of the estimates; with respect to [25] the decomposition of the graphs has to be carried out to a much deeper extent.

10. Hidden symmetries and flow equations

In the preceding sections we have defined an expansion for the effective potential which is convergent if the running coupling constants are small, see Theorem 1 and (8.76). An easy consequence of its proof is that, if $\sigma_k = 0$ for any $k$, then the running coupling constants $\vec v_h$ are analytic as functions of $\vec v_k$, $k > h$, and so is the effective potential.

Before considering the problem of proving that the running coupling constants verify (8.76) if the perturbative parameters $\lambda$, $u$ are small enough, we mention that the value of $\bar\varepsilon$ obtained by collecting patiently all the numerical constants obtained in the bounds is, in adimensional units, very small and "unrealistic". It is likely that the value of $\bar\varepsilon$ could be greatly improved, using perhaps computer assisted proof techniques. Again the analogy with classical mechanics is helpful: in the classical estimate of the convergence radius of KAM tori (whose perturbative expansion, as we have seen, is quite similar to the perturbative series of quantum field theory, except for the presence of loops) one gets a really inapplicable bound for the convergence radius, which can however be largely improved up to a realistic value using computer assisted proofs. So it is reasonable to assume that one can largely improve the estimate on $\bar\varepsilon$. However, from the exact solutions we can see that one cannot enlarge $\lambda$ indefinitely; for instance if $u = 0$ and $v(x-y) = \delta_{|x-y|,1}$ the model reduces to the XXZ spin chain for which it is known, see [71], that the spectrum has a gap for $\lambda \le -1$, so there is no hope that our series converges for any value of $\lambda$. The same holds for the Luttinger model, in which the solution in [4] is valid only for $\lambda > -\hat v(0)/4\pi$. Note also that $\bar\varepsilon$ depends in a critical way on $p_F$ and it is vanishing if $p_F = 0, \pi$.

The validity of (8.76) is a nontrivial property and it is essentially due to the fact that the Holstein–Hubbard model is "close", in a renormalization group sense, to an exactly solvable model, the Luttinger model, verifying many important symmetries (Lorentz invariance, gauge invariance, etc.) which are not verified by the Holstein–Hubbard Hamiltonian (we are speaking here of this model only for fixing the ideas, but the same considerations hold for a class of models, see Section 13, if the fermions are spinless). One can say that symmetries are "hidden". These kinds of ideas are very old; they date back to Tomonaga [116], who discovered that $d = 1$ nonrelativistic fermions can be "approximated" by two types of fermions with linear dispersion relation, and this led Luttinger to propose his model [112]. This idea was so successful that since the seventies the study of $d = 1$ metals has been done directly in terms of these fermions with linear dispersion relation, see [19,20]. The new point in the approach we are discussing is that we can really check the validity of this approximation in a quantitative way, obtaining rigorous estimates on the size of the corrections.
In fact, while the validity of such an approximation is usually justified using qualitative arguments coming from the renormalization group (for instance, saying that many models are in the same “universality class” as the Luttinger model, that they differ by “irrelevant terms” from the Luttinger model and so on), in the approach we are discussing it is possible to give such arguments a rigorous meaning and to substantiate them by rigorous bounds. To implement the above considerations we consider the integration (with an ultraviolet cut-off, as there is no h = 1):

$$P^{(l)}(d\psi^{(\le 0)}) = \prod_{\mathbf k}\prod_{\omega=\pm1} \frac{d\hat\psi^{(\le 0)+}_{\mathbf k,\omega}\, d\hat\psi^{(\le 0)-}_{\mathbf k,\omega}}{N_L(\mathbf k)} \exp\Big\{-\frac{1}{L\beta}\sum_{\omega=\pm1}\sum_{\mathbf k} C_0(\mathbf k)\,(-ik_0 + \omega v_0 k)\, \hat\psi^{(\le 0)+}_{\mathbf k,\omega}\hat\psi^{(\le 0)-}_{\mathbf k,\omega}\Big\}, \tag{10.1}$$

where $C_0^{-1} = \sum_{k=h_\beta}^{0} f_k$. We can perform the integration

$$\int P^{(l)}(d\psi^{(\le 0)})\, e^{-V(\psi^{(\le 0)})} \tag{10.2}$$

with

$$V = \lambda \int d\mathbf x\; \psi^{(\le 0)+}_{\mathbf x,1}\,\psi^{(\le 0)+}_{\mathbf x,-1}\,\psi^{(\le 0)-}_{\mathbf x,-1}\,\psi^{(\le 0)-}_{\mathbf x,1}. \tag{10.3}$$
We can integrate (10.2) using a procedure similar to the one discussed in the previous sections, but: (1) $\sigma_h = 0$ for any $h$, as the propagator is diagonal in the $\omega$-indices; (2) $\nu_h = 0$, as the propagator is an odd function of $\mathbf x$.
This means, see Section 5, that there are only two running coupling constants in this theory, namely $\lambda_h, \delta_h$. We will call the theory defined by the integration (10.1) the infrared Luttinger model; note that, contrary to the true Luttinger model (obtained from (10.1) by replacing $C_0^{-1}$ with 1 and Wick ordering $V$), the infrared Luttinger model is not exactly solvable. Returning to the beta function of the Hubbard–Holstein model (8.1), the flow equations for the running coupling constants $\vec v_h$ are given by

$$\nu_{h-1} = \gamma\nu_h + G^h_\nu(\vec v_h,\ldots,\vec v_0), \quad \sigma_{h-1} = \sigma_h + G^h_\sigma(\vec v_h,\ldots,\vec v_0), \quad \lambda_{h-1} = \lambda_h + G^h_\lambda(\vec v_h,\ldots,\vec v_0),$$
$$\delta_{h-1} = \delta_h + G^h_\delta(\vec v_h,\ldots,\vec v_0), \quad \frac{Z_{h-1}}{Z_h} = 1 + G^h_z(\vec v_h,\ldots,\vec v_0).$$

Let us call $\mu_h = (\lambda_h, \delta_h)$ the running coupling constants appearing in the renormalization group analysis of the infrared Luttinger model. We want to write the $\beta_i^{(h)}$ functions as a Luttinger model part plus a “correction”. We know from Lemma 2 that we can write each propagator $g^{(k)}_{\omega,\omega'}$ as a term independent of $\sigma_k$, given by $\delta_{\omega,\omega'}[g^{(k)}_{0,\omega} + C^{(k)}_{1,\omega}]$, plus a term $\delta_{\omega,-\omega'}\, g^{(k)}_{\omega,-\omega} + \delta_{\omega,\omega'}\, C^{(k)}_{2,\omega}$; this second addend satisfies the same bound as the first one, times $|\sigma_k|/\gamma^k$. We can then write, if $i = \nu, \sigma, \mu, z$,

$$G_i^h(\mu_h,\nu_h,\sigma_h;\ldots;\mu_0,\nu_0,\sigma_0) = G_i^{1,h}(\mu_h,\nu_h;\ldots;\mu_0,\nu_0) + G_i^{2,h}(\mu_h,\nu_h,\sigma_h;\ldots;\mu_0,\nu_0,\sigma_0),$$

where the first addend is independent of $\sigma_k$, and by symmetry reasons

$$G_\sigma^{1,h}(\mu_h,\nu_h;\ldots;\mu_0,\nu_0) = 0. \tag{10.4}$$

Moreover,

$$|G_\sigma^{2,h}(\mu_h,\nu_h,\sigma_h;\ldots;\mu_0,\nu_0,\sigma_0)| \le C\,|\sigma_h\mu_h| \tag{10.5}$$

and, for $i = \nu, \mu$,

$$|G_i^{2,h}(\mu_h,\nu_h,\sigma_h;\ldots;\mu_0,\nu_0,\sigma_0)| \le C\,|\sigma_h|\,\gamma^{-h}\,|\mu_h|^2, \tag{10.6}$$

as a consequence of the bounds (8.59), (8.60) on the propagator $\delta_{\omega,-\omega'}\, g^{(k)}_{\omega,-\omega} + \delta_{\omega,\omega'}\, C^{(k)}_{2,\omega}$ and of the short memory property, see below. As in the infrared Luttinger model there is no running coupling constant $\nu_h$, it is convenient to write, if $i = \mu, \nu$,

$$G_i^{1,h}(\mu_h,\nu_h;\ldots;\mu_0,\nu_0) = \bar G_i^{1,h}(\mu_h,\ldots,\mu_0) + \hat G_i^{1,h}(\mu_h,\nu_h;\ldots;\mu_0,\nu_0), \tag{10.7}$$
where the first term in the r.h.s. of (10.7) is obtained by putting $\nu_k = 0$, $k > h$, in the l.h.s. Finally, we write

$$\bar G_i^{1,h}(\mu_h,\ldots,\mu_0) = \bar G_i^{1,h,l}(\mu_h,\ldots,\mu_0) + \bar G_i^{1,h,nl}(\mu_h,\ldots,\mu_0), \tag{10.8}$$

where $\bar G_i^{1,h,l}(\mu_h,\ldots,\mu_0)$ involves only the propagators $g^{(k)}_{0,\omega}(\mathbf x;\mathbf y)$ and coincides with the beta function of the infrared Luttinger model. It is easy to see, from the oddness of $g^{(h)}_{0,\omega}(\mathbf x;\mathbf y)$, that

$$\bar G_\nu^{1,h}(\mu_h,\ldots,\mu_0) = 0. \tag{10.9}$$

On the other hand,

$$|\bar G_\mu^{1,h,nl}(\mu_h,\ldots,\mu_0)| \le C\,\gamma^{Qh}\,|\mu_h|^2. \tag{10.10}$$

In fact, each contribution to $\bar G_i^{1,h,nl}$ has a propagator $C^{(k)}_{1,\omega}$ replacing a $g^{(k)}_{0,\omega}$ in the analogous contribution to $\bar G_i^{1,h,l}$; one then gains, with respect to the bounds for $\bar G_i^{1,h,l}$, a factor $\gamma^k$ for some $k > h$ (see Lemma 2 in Section 8). One then uses an immediate consequence of (8.87), saying that any contribution to the effective potential associated with a tree with a vertex at scale $k$ is bounded by $C^n \varepsilon_h^n \gamma^{Q(h-k)}$, for some $Q < 1$; this property is often called short memory, as the exponential factor suppresses the contributions to the effective potential on scale $h$ from trees involving the integration of fields with scale $k$ very far from $h$. Let us assume that

$$\bar G_\mu^{1,h,l}(\mu_h,\ldots,\mu_h) = 0, \tag{10.11}$$
which means that the beta function of the infrared Luttinger model with equal arguments is vanishing. Of course one can check this statement at the first perturbative orders, but to really prove (10.11) one needs a nonperturbative argument, see Section 11. Assuming (10.11), one can check inductively that the running coupling constants remain small, if the counterterm $\nu$ is chosen properly. We do not do this here, see [38,43,46], but simply give some idea of the proof. The first step is to choose $\nu$ in a proper way so that $\nu_h$ is bounded. We have seen in Section 7 that the interaction moves the singularity, thus producing divergences if the counterterm is not chosen properly. The RG flow equation is given by

$$\nu_{h-1} = \gamma\nu_h + \beta_\nu^{(h)} \tag{10.12}$$

with, by (10.19), $|\beta_\nu^{(h)}| \le C\mu_h^2\,[\gamma^{Qh} + |\sigma_h|\gamma^{-h}]$. Solving (10.12) by iteration gives

$$\nu_h = \gamma^{-h+1}\Big[\nu_0 + \sum_{k=h+1}^{0}\gamma^{k-2}\beta_\nu^{k}\Big] \tag{10.13}$$

and, fixing $\nu_0 = -\sum_{k=h^*+1}^{0}\gamma^{k-2}\beta_\nu^{k}$ (note that the r.h.s. of this equation also depends on $\nu_0$, and that $\nu_0$ is a function of $\nu$), it is possible to show that

$$|\nu_h| \le C\mu_h^2\,[\gamma^{Qh} + |\sigma_h|\gamma^{-h}]. \tag{10.14}$$
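The role of the counterterm in (10.12)–(10.14) can be illustrated numerically. The sketch below is a toy model, not the actual beta function: it assumes a prescribed sequence `betas[h]` that decays like a short-memory factor (the amplitude `0.01` and the exponent `q = 0.5` are arbitrary choices), and it uses its own indexing, in which the recursion runs from scale 0 down to a hypothetical last scale `h_star`, so the powers of `gamma` are shifted with respect to (10.13). With the fine-tuned initial datum the relevant coupling stays small on all scales, while a tiny detuning is amplified by a factor of order `gamma ** (-h_star)`.

```python
def run_flow(nu0, betas, gamma, h_star):
    """Iterate nu_{h-1} = gamma * nu_h + beta_h for h = 0, -1, ..., h_star + 1.

    Returns a dict mapping each scale h in [h_star, 0] to nu_h.
    """
    nu = {0: nu0}
    for h in range(0, h_star, -1):
        nu[h - 1] = gamma * nu[h] + betas[h]
    return nu


def tuned_nu0(betas, gamma, h_star):
    """Initial datum that makes nu_{h_star} vanish (toy analogue of fixing nu_0)."""
    return -sum(gamma ** (k - 1) * betas[k] for k in range(h_star + 1, 1))


# Toy parameters (assumptions made only for this illustration).
gamma, q, h_star = 2.0, 0.5, -30
betas = {h: 0.01 * gamma ** (q * h) for h in range(h_star + 1, 1)}

nu0 = tuned_nu0(betas, gamma, h_star)
tuned = run_flow(nu0, betas, gamma, h_star)            # stays bounded
detuned = run_flow(nu0 + 1e-6, betas, gamma, h_star)   # error grows like gamma^{-h}

print(max(abs(v) for v in tuned.values()), abs(detuned[h_star]))
```

The detuned flow shows why the initial datum cannot be chosen freely: any error in it is multiplied by `gamma` at every step of the iteration.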
By (10.14) we see that the bound for $\hat G_i^{1,h}$, which by definition (see (10.7)) is given by trees with at least one $\nu$ endpoint, has at least an extra factor $\gamma^{Qh}$ or $|\sigma_h|\gamma^{-h}$ with respect to the bound for $\bar G_i^{1,h}$, i.e.

$$|\hat G_i^{1,h}(\mu_h,\nu_h;\ldots;\mu_0,\nu_0)| \le C\,|\sigma_h|\,\gamma^{-h}\,|\mu_h|^2. \tag{10.15}$$
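By an explicit lowest-order computation,

$$\frac{\sigma_{h-1}}{\sigma_h} = 1 + \lambda_h\,[-\beta_1 + \hat\beta_\sigma^{(h)}], \qquad \frac{Z_{h-1}}{Z_h} = 1 + \lambda_h^2\,[\beta_2 + \hat\beta_z^{(h)}],$$

with $\beta_1, \beta_2$ positive nonvanishing constants and $|\hat\beta_\sigma^{(h)}|, |\hat\beta_z^{(h)}| \le C|\lambda_h|$. From such equations it immediately follows that

$$\gamma^{\beta_1 c_1 \lambda h} \le \frac{|\sigma_{h-1}|}{|\sigma_0|} \le \gamma^{c_2 \beta_1 \lambda h}, \qquad \gamma^{-c_3\beta_2\lambda^2 h} \le Z_{h-1} \le \gamma^{-c_4\beta_2\lambda^2 h}, \tag{10.16}$$

and an immediate consequence of this is an estimate for $h^*$. We now consider the flow for $\mu_h$; we want to prove that

$$|\lambda_{h-1} - \lambda| < C|\lambda|^{3/2}, \qquad |\delta_{h-1}| \le C|\lambda|^{3/2}. \tag{10.17}$$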
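The power-law behaviour of $Z_h$ in (10.16) can be seen in a minimal numerical sketch. It assumes, for illustration only, that the coupling is frozen at a constant value `lam` (in the spirit of (10.17)) and that the correction term is dropped; `beta2`, `lam` and `gamma` are arbitrary toy values. Under these assumptions the wave function renormalization grows exactly like `gamma ** (-eta * h)`, with an exponent `eta` of order `lam ** 2`.

```python
import math


def flow_Z(lam, beta2, h_min):
    """Iterate Z_{h-1} = Z_h * (1 + beta2 * lam**2) from Z_0 = 1 down to scale h_min."""
    Z = {0: 1.0}
    for h in range(0, h_min, -1):
        Z[h - 1] = Z[h] * (1.0 + beta2 * lam ** 2)
    return Z


# Toy parameters (assumptions made only for this illustration).
gamma, beta2, lam, h_min = 2.0, 1.0, 0.1, -50

Z = flow_Z(lam, beta2, h_min)

# Exponent eta defined by 1 + beta2 * lam^2 = gamma^eta, so that Z_h = gamma^(-eta * h).
eta = math.log(1.0 + beta2 * lam ** 2) / math.log(gamma)

print(eta, Z[h_min], gamma ** (-eta * h_min))
```

For small `lam` the exponent is close to `beta2 * lam ** 2 / math.log(gamma)`, which is the kind of coupling-dependent index that the two-sided bound (10.16) brackets.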
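Assume inductively that, for any $h < k \le 0$,

$$|\mu_{k-1} - \mu_k| \le |\lambda|^{3/2}\Big[\gamma^{Qk} + \frac{|\sigma_k|}{\gamma^k}\Big]. \tag{10.18}$$

We can write

$$\bar G_\mu^{1,h}(\mu_h,\mu_{h+1},\ldots,\mu_0) = \bar G_\mu^{1,h}(\mu_h,\ldots,\mu_h) + \sum_{k=h+1}^{0} D_{h,k}, \tag{10.19}$$

where

$$D_{h,k} = \bar G_\mu^{1,h}(\mu_h,\ldots,\mu_h,\mu_k,\mu_{k+1},\ldots,\mu_0) - \bar G_\mu^{1,h}(\mu_h,\ldots,\mu_h,\mu_h,\mu_{k+1},\ldots,\mu_0) \tag{10.20}$$

and $\bar G_\mu^{1,h}(\mu_h,\ldots,\mu_h)$ is estimated by (10.10) and (10.11). From the short memory property it follows that

$$|D_{h,k}| \le C\,|\lambda|\,\gamma^{-2Q(k-h)}\,|\mu_k - \mu_h|. \tag{10.21}$$

From (10.6), (10.7), (10.10), (10.15), (10.18) and (10.21) it follows that

$$|\mu_{h-1} - \mu_h| \le C|\lambda|^{5/2}\sum_{k=h+1}^{0}\gamma^{-2Q(k-h)}\sum_{k'=h+1}^{k}\Big[\gamma^{Qk'} + \frac{|\sigma_{k'}|}{\gamma^{k'}}\Big] + 2C\lambda^2\,\frac{|\sigma_h|}{\gamma^h} + 2C\lambda^2\gamma^{Qh},$$

which immediately implies (10.18) with $k = h$.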
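The decomposition (10.19), (10.20) is a telescoping identity and holds for any function of the couplings on scales $h, \ldots, 0$. The following sketch checks it numerically for an arbitrary stand-in function; `toy_F` and the values in `mu` are hypothetical and have nothing to do with the true beta function, they only exercise the bookkeeping of which arguments are frozen at the value on scale $h$.

```python
def telescoping_terms(F, mu, h):
    """Return {k: D_{h,k}} for k = h+1, ..., 0, following the pattern of (10.20).

    F takes the list of arguments attached to the scales h, h+1, ..., 0;
    mu maps each scale to the value of the coupling on that scale.
    """
    scales = list(range(h, 1))  # h, h+1, ..., 0

    def arguments(k, keep_mu_k):
        out = []
        for j in scales:
            if j < k:
                out.append(mu[h])                         # already frozen at mu_h
            elif j == k:
                out.append(mu[k] if keep_mu_k else mu[h])  # slot being replaced
            else:
                out.append(mu[j])                         # still the true value
        return out

    return {k: F(arguments(k, True)) - F(arguments(k, False))
            for k in range(h + 1, 1)}


def toy_F(args):
    """Arbitrary nonlinear stand-in (an assumption, used only for this check)."""
    return sum((i + 1) * x * x for i, x in enumerate(args)) + args[0] * args[-1]


h = -6
mu = {j: 0.1 + 0.01 * j * j for j in range(h, 1)}

D = telescoping_terms(toy_F, mu, h)
full = toy_F([mu[j] for j in range(h, 1)])   # F(mu_h, mu_{h+1}, ..., mu_0)
equal = toy_F([mu[h]] * (1 - h))             # F(mu_h, ..., mu_h)

print(sum(D.values()), full - equal)
```

The identity itself carries no analytic information; the content of the argument is the bound (10.21) on each difference, which comes from the short memory property.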
11. Vanishing of the Luttinger model beta function

We have seen in Section 10 that an essential ingredient in the study of the flow of the running coupling constants is (10.11), stating that the beta function of the Luttinger model is vanishing. We have in fact seen in Section 10 that, given (10.11), the flow of the running coupling constants remains bounded; this is a remarkable property of spinless fermions. In order to prove (10.11) one has to use some nonperturbative argument; one possibility would be to use some Ward identity. Another possibility [25], which is the one discussed here, is to use the exact solution of the Luttinger model [4] and the fact that all the Schwinger functions of such a model can be explicitly computed. The idea of using the exact solution is really in the spirit of the renormalization group approach; one shows that a model is in the “universality class” of some special model which is exactly solvable, and takes all possible advantage of such an exact solution (see for instance [2], in particular for the idea of “continuation”). On the other hand, it is likely that one is able to prove (10.11) also directly by Ward identities, without using exact solutions; this is done (with no pretense of rigor) in the context of the multiplicative Renormalization group in [72]. Again we refer to the original papers, especially [38,37,18], for the proofs, and we give here only some ideas. The strategy is very simple. The Hamiltonian of the Luttinger model is given by

$$H = H_0 + V, \qquad H_0 = \sum_{\omega=\pm1}\int_0^L dx\; \psi^+_{\omega,x}\,(i\omega\partial_x - p_F)\,\psi^-_{\omega,x},$$
$$V = \lambda\int_0^L\!\!\int_0^L dx\,dy\; v(x-y)\,\psi^+_{1,x}\psi^-_{1,x}\psi^+_{-1,y}\psi^-_{-1,y} + a\int_0^L dx\,\big(\psi^+_{1,x}\psi^-_{1,x} + \psi^+_{-1,x}\psi^-_{-1,x}\big) + b\int_0^L dx, \tag{11.1}$$

where $|v(x-y)| \le e^{-p_0|x-y|}$ is a short range potential and $a, b$ have to be computed by introducing an ultraviolet cut-off in (11.1) (which otherwise does not have a well-defined meaning) and by imposing that the Schwinger functions of the model are well defined uniformly in the cut-off, see [41,42]; this corresponds to a Wick ordering of the interaction. Finally, $p_F = (2\pi/L)(n_F + 1/2)$ with $n_F$ a positive integer. By the exact solution of [4] it is possible to compute all the $\beta = \infty$, finite $L$ Schwinger functions for this model, see [36]. We stress that, up to now, the Schwinger functions can be really computed in a rigorous way only for the model with Hamiltonian (11.1). In the literature, the name “Luttinger model” is improperly used for many other models with slightly different Hamiltonians; for instance the Thirring model [73], in which $v(x-y)$ is replaced by $\delta(x-y)$ and the theory is defined with an ultraviolet cut-off which selects momenta $|k| \le 1$; for such models, as far as we know, there exists no exact solution. One can study the Luttinger model (11.1) by Renormalization group methods. We have already discussed the renormalization group analysis of the infrared Luttinger model defined by
(10.1) and (10.2). In order to study the model (11.1) we can write as in Section 5 = (u:v:) + (i:r:) , and (i:r:) has exactly the same propagator as the 6eld with integration P (l) given by ˜ (10.1); in other words, if P (l) (d ) is the Grassman integration of the Luttinger model (11.1) P˜ (l) (d ) = P(l) (d
(i:r:)
)P(l) (d
(u:v:)
)
where P(l) (d 60 ) is given by (10.1). It is possible to prove, see [37], that if V is the Luttinger model interaction in (11.1), then (u :v:) (i:r :) ˜ (i:r:) + (i:r:) ) P(l) (d (u:v:) )eV ( = e0 V ( )+V ( ) where, if (i:r:) ≡ (60) , V ( is similar to (8.6) such that LV˜ (
60
60 )
is given by (10.2), 0 is an analytic function of and V˜ (
60 )
)=0 :
This means that the only di=erence between the infrared Luttinger model (10.1) and the Luttinger model is that of irrelevant terms; then the two models have the same beta function up to terms O(AQh ) for the short memory property. Let us call hL; + the running coupling constants of the Luttinger model, and set lim+→∞ hL; + = hL and limL→∞ hL = h . Note also that if pF = (2'=L)(2nF + 1=2) than the analogue of (5.10) is a 6nite sum starting from hL; + ; moreover, lim+→∞ hL; + = hL = O(log L−1 ). One can prove the following result. Lemma 5. There exists an ,˜ such that; for || 6 ,˜ and for any h; h is analytic as a function of and |h − 0 | 6 C ||3=2 :
(11.2)
Proof. Let us resume the proof of the above lemma referring for detailed proofs to the original papers. Let us prove 6rst that if || 6 ,˜ than |h | 6 2,˜ for any h. Suppose that this is not true; then there exists h0 ¿ − ∞ such that |k | 6 2,˜ for k ¿ h0 but |h0 | ¿ 2,˜ :
(11.3)
We will show that this gives a contradiction. Let us 6x L0 Ah0 = 1=n, if n is some 6xed real number. By the analogue of Theorem 1 in Section 8 for the Luttinger model we can say that the running coupling constants at scale h are analytic in k , k ¿ h if h ¿ h0 and 2,˜ 6 ,. S In general, h0 = hL0 . As we know from the exact solution of the Schwinger functions of the Luttinger model, we want to use this knowledge for showing that (11.3) is not possible. Let us consider the Luttinger model in a volume L0 and WL0 () e = P˜ (l) (d )eV ( +) (11.4)
and we compute the above integral in a single step i.e. without performing a multiscale decomposition analysis. It holds that − WL0 () = d x dyW2; L0 (x − y)+ !; x !; y !
+
!
− + − 6 d x1 d x2 d x3 d x4 W4; L0 (x1 ; x2 ; x3 ; x4 )+ !; x1 −!; x2 −!; x3 !; x4 + O( ) :
T T If Sˆ 2; L0 ;! (k); Sˆ 4; L0 ;! (k1 ; : : : ; k4 ) are the Fourier transforms of the truncated two- and four-point Luttinger model Schwinger functions (i.e. Schwinger functions expressed by connected Feynman diagrams) in a volume L0 then T Sˆ 2; L0 ;! (k) =
1 −ik0 + !k + Wˆ 2; irr; L0 (k)
T T T T T Sˆ 4; L0 ;! (k1 ; : : : ; k4 ) = Sˆ 2; L0 ;! (k1 )Sˆ 2; L0 ;−! (k2 )Sˆ 2; L0 ;−! (k3 )Sˆ 2; L0 ;! (k4 )Wˆ 4; irr; L0 (k1 ; : : : ; k4 )
(11.5) where Wˆ 2; irr; L0 ; Wˆ 4; irr; L0 are the irreducible parts of Wˆ 2; L0 ; Wˆ 4; L0 , i.e. they are given by Feynman graphs which cannot be disconnected by cutting a single line, see for instance [32]. The above relations can be proved using the analyticity in of the r.h.s. and the l.h.s. of (11.5) for || 6 ,L0 , with ,L0 → 0 for L0 → ∞, developing (11.5) in series of and showing the equality T of the coeKcients. It is straightforward then to express Wˆ 2; irr; L0 ; Wˆ 4; irr; L0 as functions of Sˆ 2; L0 ;! , T Sˆ 4; L0 ;! . Moreover, it holds that '' ' ( ' ' (( '' ' ( ' ' (( ; : : : ; 0; = W4;hLL 0; ; : : : ; 0; (11.6) Wˆ 4; irr; L 0; nL ˜ nL ˜ nL ˜ nL ˜ with n˜ ¿ 1, and similar expressions hold for Wˆ 2; irr; L and their derivatives. The r.h.s. of (11.6) is obtained from (11.4) integrating scale by scale from 1 to hL the fermionic integration as in Section 5, and noting that g(k) (k), k ¿ hL , is vanishing when computed at k = (0; '= nL) ˜ so that all the nonirreducible contributions are vanishing. At the end, as from Section 8, the running coupling constants are expressed in terms of h W4; L (k1 ; : : : ; k4 ); W2;h L (k) computed at the Fermi surface (or, more exactly, at the admissible momenta closest to pF ). We can express the running coupling constants ZhLL0 ; LhL0 ; hLL0 in terms 0
0
0
T T of Sˆ 2; L0 ;! , Sˆ 4; L0 ;! or their derivatives with momenta at the Fermi surface; for instance,
nL ˜ 0 1 ˜ 0) : = S2;T L0 ;! ('= nL ' Z˜ L0 (1 + ˜L0 ) hL 0 hL 0
(11.7)
L0 L0 L0 The running coupling constants Z˜ hL0 ; ˜hL0 ; ˜hL0 appearing in (11.7) are not exactly the running coupling constants ZhLL0 ; LhL0 ; hLL0 ; the last ones are related to the Fourier transform of the e=ective 0
0
0
potential computed at '=L0 while the others are related to the Fourier transform of the e=ective potential computed at '= nL ˜ 0 ; however it is easy to show that |hLL − ˜ LhL | 6 C,h2L . Eq. (11.7) is valid for || 6 ,L0 ; however the r.h.s. is analytic in by looking at the explicit expression which is obtained from the exact solution and the l.h.s. by Theorem 1 as 2,˜ 6 ,, S so one extends its validity to a domain L0 -independent by using analytic continuation. At the end, by using the explicit expression of the Luttinger model Schwinger functions one 6nds, see [18] |hLL0 − | 6 C2 0
|LhL0 | 6 C2 :
(11.8)
0
S the di=erence between hL00 and h0 is such On the other hand, one can prove that, for ,h0 6 ,, that, see [18], |hL00 − h0 | 6 C,h3=2 0 +1
A−h0 : L0
(11.9)
. On the other hand, it is clear, by the convergence of the beta function, that |hL00 − hLL0 | 6 Cn,h3=2 0 +1 0 ˜ we can write from (11.8) and (11.9) Let |0 | 6 ,; |h0 − 0 | = |h0 − hL00 | + |hL00 − hLL0 | + |hLL0 − 0 | 6 Cn,h3=2 + C2 6 0 +1 0
0
,˜ 8
(11.10)
in contradiction to (11.3). We have proved, then, that if $|\lambda| \le \tilde\varepsilon$ then $|\lambda_h| \le 2\tilde\varepsilon$ for any $h$. This means that the running coupling constants remain inside the analyticity radius of the beta function, so that $\lambda_h$ is analytic as a function of $\lambda$, and (11.2) follows from (11.10). In order to prove (10.11) note that

$$\lambda_h = \lambda_0 + \sum_{n=2}^{r} c_n^{(h)}\lambda_0^n + O(\lambda_0^{r+1}). \tag{11.11}$$

The flow equation is given by

$$\lambda_{h-1} = \lambda_h + \beta_h^L(\lambda_h,\ldots,\lambda_h) + \sum_k (\lambda_h - \lambda_k)\,D_{h,k}. \tag{11.12}$$

Assume by contradiction that there exists a $b_r \ne 0$ such that $\beta_h^L(\lambda_h,\ldots,\lambda_h) = b_r(\lambda_h)^r + O(\lambda_h^{r+1})$. Inserting (11.11) into (11.12) one gets $c_r^{(h-1)} = c_r^{(h)} + b_r + O(\gamma^{Qh})$, implying that $c_r^{(h)}$ is a diverging sequence, in contradiction to (11.2).
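The divergence invoked in this last step can be made explicit by iterating the recursion for the coefficient from scale 0 down to a scale $h < 0$. The following is only a sketch: $K$ is a generic constant bounding the $O(\gamma^{Qh})$ remainder, introduced here for illustration, and $Q > 0$, $\gamma > 1$ are assumed.

```latex
% Iterating c_r^{(k-1)} = c_r^{(k)} + b_r + R_k, with |R_k| \le K \gamma^{Qk}, for k = 0, -1, ..., h+1:
c_r^{(h)} = c_r^{(0)} + |h|\, b_r + \sum_{k=h+1}^{0} R_k ,
\qquad
\Big| \sum_{k=h+1}^{0} R_k \Big| \le K \sum_{k \le 0} \gamma^{Qk} = \frac{K}{1-\gamma^{-Q}} ,
% so the remainder stays bounded uniformly in h, while the drift term grows linearly:
\qquad
|c_r^{(h)}| \ge |h|\,|b_r| - |c_r^{(0)}| - \frac{K}{1-\gamma^{-Q}} \xrightarrow[h\to-\infty]{} \infty .
```

A coefficient growing linearly in $|h|$ is incompatible with a bound on the running coupling constant that is uniform in $h$, which is why $b_r$ must vanish at every order.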
12. The two-point Schwinger functions

In naive perturbation theory (not convergent) an expansion for the correlation functions follows immediately from the expansion for the effective potential; for instance, the two-point Schwinger function is given by $S(\mathbf k) = g(\mathbf k) + g(\mathbf k)V_2(\mathbf k)g(\mathbf k)$, where $V_2$ is the effective potential with two external fields. In a perturbation theory based on the renormalization group, like the one discussed in the preceding sections, the relationship between the effective potential and the correlation functions is not so immediate. In fact, $V^{(h)}$ has external lines with a smaller scale than the scale of the lines contracted to form the kernels of $V^{(h)}$; contracting the external lines of $V^{(h)}$ with fields representing the external fields one gets a contribution to the Schwinger functions, but there are many other terms contributing to the Schwinger functions that cannot be obtained in this way (the contributions in which the propagators connecting the external fields have the smallest scale among all the propagators). We will see that new expansions are necessary for the Schwinger functions and the response functions; new critical indices which were not present in the theory of the effective potential will appear. We start with the two-point Schwinger function, which is given by the following functional integral, if $\phi^+_{\mathbf x}, \phi^-_{\mathbf y}$ are Grassmann variables:

$$S(\mathbf x;\mathbf y) = \frac{\partial^2}{\partial\phi^+_{\mathbf x}\,\partial\phi^-_{\mathbf y}}\, S(\phi)\Big|_{\phi^+=\phi^-=0}$$

with

$$e^{S(\phi)} = \int P(d\psi)\, e^{V(\psi) + \int d\mathbf x\,[\phi^+_{\mathbf x}\psi^-_{\mathbf x} + \psi^+_{\mathbf x}\phi^-_{\mathbf x}]}.$$
We proceed as in the expansion of the e=ective potential integrating the 6elds thus obtaining S() −L+Eh +S (¿h+1) () e =e PZh (d (6h) ) e
√ −V(h) ( Zh
(6h)
√ )+B(h) ( Zh
(6h)
;)+ dx[
(h+1) (h+1) − − + + x; ! Q!; ! (x)x; ! +x; ! Q!; ! (x) x; ! ]
;
(1) ; : : : ;
(h+1) ,
(12.1)
where PZh (d (6h) ) and V(h) are given by (8.43), while S (¿h+1) () denotes the sum over all the terms dependent on but independent of the 6eld, and B(h) ( (6h) ; ) can be written as $ $ 9 9 (h) (h) + (h) − ! ∗ G!; ! ∗ + V ( Zh ) + − V(h) ( Zh ) ∗ G!; ! ∗ ! 9 ! 9 ! 2 $ 9 (h) (h) (h) − S (h) + + (12.2) ! ∗ G!; ! ∗ − V ( Zh ) ∗ G! ; ! ∗ ! + W R ; + 9 ! 9 !
where (h) G!; ! =
1 k¿h+1
Zk
(k) (k) g!; ! ∗ Q! ; !
(12.3)
(h) and WS R contains terms of higher order in . The above formula can be proved by induction, (2) with Q!; ! = 0. Now we write in (12.2)
9 9
+V !
(h)
$ $ $ 9 9 ( Zh ) = + LV(h) ( Zh ) + + RV(h) ( Zh ) 9 ! 9 !
√ and the same decomposition is done for (9= 9 !− )V(h) ( Zh ), but not on the terms in the second line of (12.2) (the reason for this choice will be clear at the end). Note that one has to avoid the application of a derivative generated by the renormalization procedure on the propagator G (h) (if this happens one does not get a gain factor Ahv −hv ), and this can be ensured by choosing as a localization point the coordinate of the 6eld contracted in G (h) . In the integration of the e=ective potential one has to put part of the relevant part of the e=ective potential in the free integration; the same has to be done in the expansion for the two-point Schwinger function for B(h) ( (6h) ; ), changing Qh . We de6ne (h) (h+1) (h) (h) Q!; ! = Q!; ! + zh Zh [9t + ,(i9x )]G!; ! + sh Zh G!; −! :
(12.4)
We can write the integral in the r.h.s. of (12.1) as −L+th e PZ˜h−1 (d 6h ) √ ˜ (h) ( Zh −V
×e
(6h)
(h) √ )+B˜ ( Zh
(6h)
;)+ dx[
(h) (h) − + + x; ! Q!; ! (x)x; ! +x; ! Q!; ! (x) x− ; ! ]
;
(12.5)
√ √ (h) ˜ (h) ( Zh ) where is equal to B(h) with (9= 9 !+ )V(h) ( Zh ) replaced by (9= 9 !+ )V where B˜ (h) LV˜ = LV (h) − sh F − zh (F2 + FK ). Now we rescale the 6elds e−L+th PZ˜h−1 (d 6h ) √ √ (h) − ˆ (h) ( Zh−1 (6h) )+Bˆ (h) ( Zh−1 (6h) ;)+ dx[ x;+ ! Q(h) (x)− ++ −V x; ! Q!; ! (x) x; ! ] x; ! !; ! ×e ; (12.6) (h) $ (h) √ where Bˆ ( Zh−1 (6h) ; ) = B˜ ( Zh (6h) ; ), we integrate with respect to (h) and the procedure can be iterated. − At the end, after taking the functional derivatives with respect to + x ; x we get an expansion in terms of a new class of trees E ∈ Tn; h; k , which are similar to the trees of the e=ective potential, see Fig. 8, with the following modi6cations.
Fig. 16. A tree appearing in the graphic representation of the two-point Schwinger function (see Eq. (12.8) below). The two endpoints $v_x$ and $v_y$ are both connected to the vertex $v_{xy}$ on scale $\bar h$.
(1) There are n+2 endpoints and to two of them, called vx ; vy , are associated with the following function: 9 ˆ (h) $ (h) (h) − + d x x; ! Q!; ! (x)x; ! + d xG!; ! ∗ V ( Zh−1 )− x; ! 9 !+ or
(h)
d y + y; ! Q! ; ! (y)
− y; !
$ ˆ (h) − V ( Zh−1 !
9
(h)
d y+ !; ( ∗ G!; ! ∗
+
9
):
(2) Let be vxy the 6rst vertex whose cluster contains both vx and vy ; its scale is named hS and no R-operation is de6ned on the vertices on the line from vxy to the root (this follows from the fact that we have made no decomposition in the relevant and irrelevant parts in the terms in the second line of (12.2)). (3) There are no external lines in the root of the tree. The scale of the root is kS (Fig. 16). In order to perform the bounds we need some information on G (h) given by (12.3); it is easy to show that |g˜ (k) ∗ Q(k) | 6 Ak
1+
CN k (A |x −
y|)N
This simply follows from the fact that the Fourier transform of Qk is bounded by a constant. In fact, by (12.3) and (12.4), we obtain 1 1 (h) (k) h+1 k = Q + z Z [ 9 + ,( i9 )] g ∗ Q + s Z gk ∗ Q!(k) ; −! ; Q!; t x h h h h !; ! ! ! ;! Zk !; ! Zk !; ! k¿h+1
k¿h+1
(12.7) which can be solved by iteration. In bounding the convolution g˜ (k) ∗ Qk we have to evaluate (k) (k) Qˆ (k) only on the support of f˜ (k). Considering the Fourier transform of (12.7) one obtains
that only one term survives in the sum in (12.7) (h) (h+1) 1 (h+1) (−ik0 + !k) g!; ! (k)Qˆ ! ; ! (k) Qˆ !; ! (k) = 1 + zh Zh Zh+1 !
+sh Zh
1
Zh+1
!
(h+1) h+1 g!; ! (k)Q! ; −!
(h) and by induction one can deduce that |Qˆ !; ! (k)| 6 1 + O(,h ). Then one can easily obtain the bound
|G (h) (x; v)| 6
Ah CN : (h) Zh 1 + (A |x − y|)N
− After deriving with respect to + x , y we obtain
S(x; y) =
e
−ipF (!x−! y)
!;!
1 S ∗ h=h
S g˜(!;h)! (x; y)
+
S 1 ∞ h−1 S ∗ k=h S ∗ n=1 E∈Tn h=h
SS h;S k;S E;!; ! (x; y) ;
(12.8)
h; k
where SS h;S k;S E;!; ! (x; y) is obtained by the expansion described above after taking the functional derivative. Calling S
(h) SS (x; y) =
S ∞ h−1 S ∗ n=1 E∈Tn h; k k=h
SS h;S k;S E;!; ! (x; y) ;
we obtain S (h)
|SS !; ! (x; y)| 6
S
Ah CN S ZhS 1 + (Ah |x − y|)N
(12.9)
and S (h)
|SS !; −! (x; y)| 6
| hS|
AhSZhS
1+
CN S h (A |x −
y|)N
:
(12.10)
The proof of (12.9) and (12.10) is obtained by a modi6cation of the arguments used to bound the e=ective potential. In fact, as far as the bounds are concerned, the vertices vx and vy are S $ like two 7 vertices with an external line (the line) and an extra A−h = ZhS factor for each one; moreover, no integration over the coordinate is associated with such vertices. Another important di=erence is that no R is applied on all the vertices between vxy and the root; the clusters associated with such vertices can be at most marginal (by de6nition, at least the two 6elds are external to the cluster vxy so that if h = k all the clusters between vxy and v0 have S S S S S S at most four external lines). To renormalize them we can multiply by Ah−k Ak−h ; the factor Ak−h is enough to renormalize all the clusters between vxy and v0 . It is then natural to compare the bounds for SS h; k; E;!; ! (x; y) and the bounds for a contribution to the e=ective potential with two external lines; with respect to the bound for the e=ective
S
potential with two external lines (which gives dimensionally a factor Ak ) there is a A2h more for the fact that there are two integrations less (one integration kills the factor L+); moreover, S S S there is an extra factor (A−2h =ZhS)Ah−k by the preceding considerations. Collecting together such factors with a decay factor 1+
CN S h (A |x −
y|)N
;
(12.11)
which one can extract from the tree connecting $v_x$ with $v_y$, one gets (12.9); moreover, (12.10) is obtained by taking into account that there is at least one $\sigma$ vertex. A similar expansion can be obtained also for the four-point Schwinger functions, by simply differentiating with respect to four $\phi$-fields; however such an expansion is not suitable for the computation of the response functions and another one is necessary, see Section 14.

13. Two-point Schwinger functions for spinless fermions

In this section, we show how the renormalization group analysis described above can be used to obtain properties of many systems of spinless fermions in one dimension. To fix the ideas we have considered up to now the model with Hamiltonian (8.1). The reason is that the properties of the other models described in this section can be easily deduced from the discussion of model (8.1), which is in a sense the most general case, if the fermions are spinless.

13.1. Free fermions

Let us briefly summarize the properties of the two-point Schwinger function for a system of free fermions in the continuum, with Hamiltonian $H = H_0$ given by (2.2). The eigenstates of $H_0$ can be easily constructed from the solutions of the Schrödinger equation

$$-\frac{1}{2m}\frac{\partial^2\psi(x)}{\partial x^2} = E\,\psi(x). \tag{13.1}$$
The $n$-point Schwinger function can be written, using the Wick rule, in terms of the two-point Schwinger function

$$g(\mathbf x - \mathbf y) = \int\frac{d\mathbf k}{(2\pi)^2}\,\frac{e^{i\mathbf k(\mathbf x-\mathbf y)}}{-ik_0 + k^2/2m - \mu} \tag{13.2}$$

with $\int d\mathbf k/(2\pi)^2 \equiv \int_{-\infty}^{\infty} dk_0/2\pi \int_{-\infty}^{\infty} dk/2\pi$. It can be written as

$$g(\mathbf x;\mathbf y) = -R(\mathbf x-\mathbf y) + \theta(x_0-y_0)\, e^{(x_0-y_0)p_F^2/2m}\,\sqrt{\frac{m}{2\pi(x_0-y_0)}}\; e^{-m(x-y)^2/2(x_0-y_0)},$$

where

$$R(\mathbf x-\mathbf y) = \frac{1}{\pi}\int_0^{p_F} dk\,\cos k(x-y)\; e^{-\frac{x_0-y_0}{2m}(k^2-p_F^2)}$$

is a smooth function such that

$$\lim_{x_0-y_0\to0} R(\mathbf x-\mathbf y) = \frac{1}{\pi}\,\frac{\sin p_F(x-y)}{x-y}.$$

For large $|\mathbf x-\mathbf y|$ the free Schwinger function decays as the inverse of $|\mathbf x-\mathbf y|$ times an oscillating factor. Note also that the function is singular for $\mathbf x = \mathbf y$. Similar results can be found for $H = H_0$ in the discrete case (2.2); the large distance asymptotic behaviour of the two-point Schwinger function is the same (but only if $p_F \ne 0, \pi$), while the function is finite for $\mathbf x = \mathbf y$ (but its time derivative is singular at $x_0 = y_0$).

13.2. Noninteracting fermions in a periodic potential

Let us consider a system of fermions in the continuous case subject to a periodic potential, with Hamiltonian $H = H_0 + uP$, with $P$ given by (2.3); we fix $T = 1$ for simplicity. This model is also exactly solvable, the eigenstates of $H$ being expressed in terms of the solutions of the Schrödinger equation

$$\Big[-\frac{1}{2m}\frac{\partial^2}{\partial x^2} + u\varphi(x)\Big]\psi(x) = E\,\psi(x). \tag{13.3}$$

The theory of the solutions of such equations is rather well developed, see [9]. By taking a linear combination, with suitable coefficients, of two independent solutions of (13.3) one obtains solutions $\psi(k,x,u)$, called Floquet solutions, such that $\psi(k,x+1,u) = e^{ika}\psi(k,x,u)$. If $k$ is real they are called Bloch waves: they are indexed by the real number $k$, the crystal momentum, and they verify (13.3) with $E = \varepsilon(k,u)$, which is a continuous function except for $k = n\pi$, $n$ integer, where it is generically discontinuous. The values $\Delta_n = \varepsilon((n\pi/a)^+,u) - \varepsilon((n\pi/a)^-,u)$ are called gaps; sometimes $\Delta_n = 0$ and in this case one speaks of closed gaps. The theory of Bloch waves can be adapted without difficulty to the case of the finite difference Schrödinger equation. The two-point Schwinger function is given by

$$S_0(\mathbf x;\mathbf y) = \int\frac{d\mathbf k}{(2\pi)^2}\,\frac{\psi(k,x,u)\,\psi(k,-y,u)\,e^{ik_0(x_0-y_0)}}{-ik_0 + (\varepsilon(k,u) - \mu)}. \tag{13.4}$$
The spectral gap is equal to $\Delta_n$ when $p_F = n\pi$ and it is 0 for all the other values of $p_F$. For small $u$ we have $\Delta_n = c_n u + O(u^2)$, where $c_n$ is the $n$th Fourier coefficient of $\varphi(x)$. If $p_F = n\pi$ the system is called a filled band Fermi system. The asymptotic behaviour for large values of $|\mathbf x-\mathbf y|$ of the two-point Schwinger function depends critically on the value of the Fermi momentum; it holds, see [18,39]:

(1) If $p_F \ne n\pi$ then, for a suitable constant $C$ and $|\mathbf x-\mathbf y| > 1$,

$$|S_0(\mathbf x;\mathbf y)| \le \frac{C}{|\mathbf x-\mathbf y|} \tag{13.5}$$

and, for small $u$,

$$|S_0(\mathbf x;\mathbf y) - g(\mathbf x-\mathbf y)| \le \frac{C|u|}{|\mathbf x-\mathbf y|}.$$

(2) If $p_F = n\pi$, for any $N > 1$ one can find constants $C_N$ such that, if $|\mathbf x-\mathbf y| > 1$,

$$|S_0(\mathbf x;\mathbf y)| \le \frac{1}{|\mathbf x-\mathbf y|}\,\frac{C_N}{1 + \Delta_n^N|\mathbf x-\mathbf y|^N}. \tag{13.6}$$
Probably an optimal bound is an exponential one. These two cases correspond to the metallic and the insulating phase of the system; in one case the ground state energy has no gap and in the other case it has a gap. It can be of some interest to have some insight into how (13.5) and (13.6), which are true for any value of $u$, can be derived from the properties of Bloch waves. A very common technique to obtain similar bounds is to shift the integration domain in the complex plane; this means that a detailed knowledge of the analytic properties of Bloch waves in the complex plane is required. A study of this problem was done in [11], and we quickly review the results. The function $\varepsilon(k,u)$ as a function of complex $k$ may be represented on a Riemann surface with an infinite sequence of sheets $S_n$, in such a way that on each $S_n$, for $k$ real, one has the value of $\varepsilon(k,u)$ corresponding to the $n$th energy band. Each sheet $S_n$ is connected to $S_{n+1}$ by an infinite sequence of branch points of order two, given by $k_{2m} = \pm[2(j+1)\pi + ih_{2m}]$ for $j = 0, \pm1, \ldots$ and by $k_{2m+1} = \pm[2j\pi + ih_{2m+1}]$ for $j = 0, \pm1, \ldots$; such branch points get closer and closer to the real axis, as $\lim_{n\to\infty} h_n = 0$. Then, starting at a real value of $k$ on the band $n$, passing around $k_n$ and returning to the real axis, one is in the band $n+1$. Close to the branch points one has

$$\varepsilon = \varepsilon_n + \beta_n(k-k_n)^{1/2} + o((k-k_n)^{1/2}).$$

Analogous properties hold for $\psi(k,x,u)$, with the only difference that the branch points are now of order 4; close to the branch points it can be written as

$$\psi(k,x,u) = \frac{A_n}{(k-k_n)^{1/4}}\,\big[1 + B_n(k-k_n)^{1/2} + o((k-k_n)^{1/2})\big].$$

Finally, on each $S_n$ the functions are periodic or antiperiodic with period $2\pi$. The functions $\varepsilon(k,u)$, $\psi(k,x,u)$ appearing in (13.4) are the restriction to the real axis of functions defined in the complex plane, with cuts from $k_n$ to $\bar k_n$, once the value corresponding to the segment $(-\pi,\pi)$ is fixed to the value of the first band. Let us return now to the problem of shifting the contour of (13.4); as the singularities get closer and closer to the real axis (unless one chooses some special periodic functions for which $h_n$ is bounded), one can consider a path circumventing the singularities with infinitesimal circles, see [18]: one uses the fact that the singularity is integrable, and the periodicity properties. Then, estimates (13.5) and (13.6) are obtained. The same results can be obtained in a different way, at least if $u$ is small, without using any property of the solutions of the Schrödinger equation. In fact, one can apply the renormalization group techniques introduced above with $\lambda = 0$, $\varphi(x)$ a periodic function and $x \in \mathbb{R}$. The
expansions of the preceding sections can be easily adapted (in some sense they become trivial) to the case of $H = H_0 + uP + \nu N_0$. If $\lambda = 0$ all the contributions to the effective potential are bilinear in the fields, so that the definition of localization is given by the analogue of (8.20) but the Kronecker delta is not defined mod $2\pi$; the running coupling constants are, if $p_F = n\pi$, $\nu_h$, $\sigma_h$, $z_h$, $\alpha_h$; if $p_F \neq n\pi$ they are the same but $\sigma_h = 0$. As the interaction is bilinear in the fields, a bound on each Feynman graph is enough to prove the convergence and there is no need for Gram–Hadamard bounds; moreover, there are no small divisor problems. In the filled band case, $p_F = n\pi$, one can choose $\nu = 0$; this follows noting that, from (10.13), $\nu_h = \gamma^{-h+1}\big[\nu + \sum_{k=h+1}^{0} \gamma^{k-2}\beta_\nu^{(k)}\big]$ and $|\beta_\nu^{(k)}| \le C|u|^2$ and $\gamma^{-h} \le C|u|^{-1}$. It is easy to show that the running coupling constants $\vec v_h$ remain $O(u^2)$-close to their values at $h = 0$. From (12.8) the infrared part of the Schwinger function is given by
$$S_0(x;y) = \sum_{h=h^*}^{1} g^{(h)}(x;y) + u \sum_{h=h^*}^{1} \bar S^{(h)}(x;y), \qquad (13.7)$$
where
$$g^{(h)}(x;y) = \sum_{\omega,\omega'=\pm 1} e^{-i(\omega x - \omega' y)p_F}\, g^{(h)}_{\omega,\omega'}(x;y)$$
and
$$|\bar S^{(h)}(x;y)| \le \gamma^h \frac{C_N}{1 + (\gamma^h |x-y|)^N}.$$
Remember that $h^* = O(\log(|c_n u|)^{-1})$ and let $\sigma = |c_n u|$ (we are assuming that $c_n \neq 0$). If $1 \le |x-y| \le (2\sigma)^{-1}$ and $h_x \ge h^*$ is such that $\gamma^{-h_x-1} < |x-y| \le \gamma^{-h_x}$, (13.7) gives, if $N > 1$,
$$|S_0(x;y)| \le C_N \Big[ \sum_{h=h^*}^{h_x-1} \gamma^h + \sum_{h=h_x}^{1} \frac{\gamma^h}{\gamma^{Nh}|x-y|^N} \Big] \le \gamma^{h_x} C_N \le \frac{C_N}{1+|x-y|}. \qquad (13.8)$$
On the other hand, if $|x-y| \ge (2\sigma)^{-1}$, (13.7) implies that
$$|S_0(x;y)| \le \frac{C_N}{|x-y|^N} \sum_{h=h^*}^{1} \gamma^{-(N-1)h} \le \frac{C_N}{|x-y|^N}\,|\sigma|^{-N+1} \le \frac{C_N |\sigma|}{1 + |\sigma|^N |x-y|^N}, \qquad (13.9)$$
provided that $N > 1$. By a slight refinement of the above bounds one obtains (13.6). Moreover, $\sum_{h=h^*}^{0} g^{(h)}(x;y)$ in (13.7) can be written, in the limit $L, \beta \to \infty$, as
$$\int \frac{dk}{(2\pi)^2}\, f^{(i.r.)}(k)\, \tilde\phi(k,x,u)\, \tilde\phi(k,-y,u)\, \frac{e^{-ik_0(x_0-y_0)}}{-ik_0 + \tilde\varepsilon(k,u)}, \qquad (13.10)$$
where $f^{(i.r.)}(k)$ is the numerator of the r.h.s. of the third of (5.4), $\tilde\phi(k,x,u) = e^{-ikx}\tilde u(k,x,u)$ and
$$\tilde u(k,x,u) = e^{i\,\mathrm{sign}(k) p_F x} \Bigg[ \cos(p_F x) \sqrt{1 - \frac{\mathrm{sign}(|k|-p_F)\,\sigma}{\sqrt{((|k|-p_F)v_0)^2 + \sigma^2}}} - i\,\mathrm{sign}(k) \sin(p_F x) \sqrt{1 + \frac{\mathrm{sign}(|k|-p_F)\,\sigma}{\sqrt{((|k|-p_F)v_0)^2 + \sigma^2}}} \Bigg], \qquad (13.11)$$
$$\tilde\varepsilon(k,u) = \frac{(|k|-p_F)^2}{2m} + \mathrm{sign}(|k|-p_F) \sqrt{v_0^2 (|k|-p_F)^2 + \sigma^2} \qquad (13.12)$$
and $v_0 = p_F/m$. The first-order term of (13.7) is very similar to (13.4), and the $\tilde\phi(k,x,u)$ are just the Bloch waves computed at first order by degenerate perturbation theory. In the not filled band case, $p_F \neq n\pi$, one has $\sigma_h = 0$, so that the propagator is diagonal and, by choosing a suitable $\nu$,
$$S_0(x;y) = \sum_{h=-\infty}^{1} g^{(h)}(x;y) + u \sum_{h=-\infty}^{1} \bar S^{(h)}(x;y), \qquad (13.13)$$
where
$$g^{(h)}(x;y) = \sum_{\omega=\pm 1} e^{-i\omega(x-y)p_F}\, g^{(h)}_\omega(x;y),$$
$g^{(h)}_\omega(x;y)$ is given by the analogue of (8.50) with $\sigma_h = 0$ and $|\bar S^{(h)}(x;y)| \le \gamma^h C_N/[1+(\gamma^h|x-y|)^N]$. If $h_x$ is such that $\gamma^{-h_x-1} < |x-y| \le \gamma^{-h_x}$, (13.13) gives, if $N > 1$,
$$|S_0^{i.r.}(x;y)| \le C_N \Big[ \sum_{h=-\infty}^{h_x-1} \gamma^h + \sum_{h=h_x}^{1} \frac{\gamma^h}{\gamma^{Nh}|x-y|^N} \Big] \le \gamma^{h_x} C_N \le \frac{C_N}{1+|x-y|}. \qquad (13.14)$$
A decomposition analogous to (13.7) holds. Note that if the occupation number is defined with respect to the "Bloch waves" then it is of course (tautologically) discontinuous. If we consider the occupation number with respect to the plane waves, considering the Fourier transform of the Schwinger function, there is no discontinuity in the filled band case, while it is discontinuous in the not filled band case. Finally, one can consider fermions on a lattice with Hamiltonian $H = H_0 + uP + \nu N_0$ with $P$ given by (2.4) and $p/\pi$ a rational number, and one can prove bounds similar to (13.5) and (13.6).

13.3. Noninteracting fermions in a quasi-periodic potential

We consider now a less trivial case in which the periodic potential is replaced by a quasi-periodic one; more exactly we prefer to study the essentially equivalent case of noninteracting
fermions on a lattice with an incommensurate potential. As in the commensurate case, such a problem could be studied by analysing the spectrum of the finite difference Schrödinger equation
$$-\psi(x+1) - \psi(x-1) + u\varphi(x)\psi(x) = E\psi(x), \qquad (13.15)$$
where $\varphi(x)$ is defined as in Section 8 and $p/\pi$ is irrational. In (13.15) there are two periods, the one of the potential and the intrinsic one of the lattice, and this makes the properties of (13.15) and of the continuous Schrödinger equation with a quasi-periodic potential very similar. The eigenfunctions and the spectrum strongly depend on $u$, contrary to the case of a periodic potential, in which the eigenfunctions are always Bloch waves whether $u$ is large or small. In the present case, instead, for large $u$ there are eigenfunctions with an exponential decay at large distances; this phenomenon is called Anderson localization (for details, see for instance [10,73,50]), while for small $u$ there are eigenfunctions which are quasi-Bloch waves of the form $e^{ik(E)x}u(x)$ with $u(x) = \bar u(px)$ for (13.15), $\bar u$ being $2\pi$-periodic in its arguments. This is proved by using KAM techniques (see [14–16,64,73,74]), if $p$ verifies a Diophantine condition, i.e. $\|np\|_{T^1} \ge C_0|n|^{-\tau}$ for any $n \neq 0$, and with the additional condition that, if $k(E) \equiv k$, then either (a) $k$ is such that $\|k + np\|_{T^1} \ge C_0|n|^{-\tau}$ $\forall n \in \mathbb{Z}\setminus\{0\}$, or (b) $k = \bar n p$, $\bar n \in \mathbb{N}$. Of course, these two cases do not cover all the possible $k$. Probably one can get the asymptotic behaviour of the Schwinger functions just by studying the properties of the solutions of the Schrödinger equation, as was done for the periodic potential case; this result is however missing in the literature. On the other hand, it is possible to obtain the Schwinger functions by writing them as Grassmann integrals, using the methods seen in the previous sections. We consider a model on a lattice with Hamiltonian $H = H_0 + uP + \nu N_0$
(13.16)
with $P$ given by (2.4) and $p/\pi$ irrational.

Small u case: We start with the case in which the incommensurate potential is weak with respect to the kinetic energy. It is natural to distinguish the case $p_F = \bar n p$, which is the analogue of the filled band case in the commensurate case, from the case $p_F \neq np$. However, if we assume simply $p_F \neq np$ one cannot prove the convergence of the series, due to the small divisor problem, see Section 8; one needs a stronger condition, namely that $\|p_F + np\|_{T^1} \ge C_0|n|^{-\tau}$, $\forall n \in \mathbb{Z}\setminus\{0\}$. Note that the condition $p_F = \bar n p$ mod $2\pi$ can be verified by a finite number of $\bar n$ if $p/\pi$ is a rational number, but by an infinite number in the irrational case. In other words, the values of $p_F$ in $(-\pi,\pi)$ such that the system has a gap in the ground state form a dense set. In order to perform a rigorous analysis we have to consider $L$ finite with periodic boundary conditions; in this way the Grassmann algebra is finite dimensional and the Grassmann integrals are well defined. This means that we cannot choose a $p/\pi$ given by an irrational number, but we have to consider a sequence of rational numbers converging to a Diophantine one as the volume tends to infinity. One can prove the following theorem, see [41].
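As an illustration of the finite-volume approximation just described, the following sketch takes the golden-mean ratio as a stand-in for $p/\pi$ (an assumption made only for this example), uses its continued-fraction convergents $P_i/Q_i$ as rational approximants, and sets $L_i = Q_i$ (also an assumption). It then computes the smallest value of $|n|\,\mathrm{dist}(nP_i/Q_i,\mathbb{Z})$ over $0 < n \le L_i/2$, i.e. a Diophantine-type constant with exponent 1 on a circle normalized to unit length. The function names are hypothetical.

```python
from fractions import Fraction


def golden_convergents(count):
    """Return the first `count` convergents (P, Q) of (sqrt(5) - 1) / 2.

    They are ratios of consecutive Fibonacci numbers: 1/2, 2/3, 3/5, ...
    """
    convergents = []
    p, q = 1, 2
    for _ in range(count):
        convergents.append((p, q))
        p, q = q, p + q
    return convergents


def diophantine_constant(p, q, tau=1):
    """Smallest value of n**tau * dist(n*p/q, Z) over 0 < n <= q // 2.

    Exact rational arithmetic is used, so no rounding is involved.
    """
    best = None
    for n in range(1, q // 2 + 1):
        r = (n * p) % q
        dist = Fraction(min(r, q - r), q)  # distance of n*p/q from the integers
        value = n**tau * dist
        if best is None or value < best:
            best = value
    return best


if __name__ == "__main__":
    for p, q in golden_convergents(15):
        c = diophantine_constant(p, q)
        print(f"L = {q:5d}  approximant = {p}/{q}  min n*dist = {float(c):.4f}")
```

In this toy setting the computed constant stays bounded away from zero along the whole sequence, which is the kind of uniformity in $i$ that a condition such as (13.17) asks for.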
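Theorem 2. Let us consider Hamiltonian (13.16) with $\nu = 0$ and a sequence $L_i$, $i \in \mathbb{Z}^+$, such that
$$\lim_{i\to\infty} L_i = \infty, \qquad \lim_{i\to\infty} p_{L_i} = p.$$
Suppose also that there is a positive integer $\bar n$ such that $p_F = \bar n p_{L_i}$ (mod $2\pi$), $\hat\varphi_{\bar n} \neq 0$, and $p_{L_i}$ satisfies the Diophantine condition
$$\|2np_{L_i}\|_{T^1} \ge C_0 |n|^{-\tau}, \qquad \forall n \in \mathbb{Z}\setminus\{0\},\ |n| \le \frac{L_i}{2}, \qquad (13.17)$$
for some positive constants $C_0$ and $\tau$ independent of $i$. Set $\sigma = |\hat u_{\bar n}|$. Then there exists $\varepsilon_0 > 0$ such that, if $|u| \le \varepsilon_0$, in the limit $i \to \infty$, $\beta \to \infty$, for any $N > 1$ there is a constant $C_N$ such that
$$|S(x;y)| \le \frac{1}{1+|x-y|}\, \frac{C_N}{1 + (|\sigma|\,|x-y|)^N}. \qquad (13.18)$$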
Moreover; for 1 6 |x − y| 6 | |−1 S(x; y) = g(x − y) + C2 (x; y) ; where g(x − y) is given by (3:4) and $ | x − y| |C2 (x; y)| 6 C |x − y| for a suitable constant C. For any i; there is a spectral gap D ¿ | |=2 around 0 . The above results can also be proved, specializing the analysis on the Hubbard–Holstein model to the case = 0. The existence of the sequence of Li is proved in [41] by choosing them as the denominators of the best approximants. A decomposition of the Schwinger function given by (13.7) and (13.11) holds so that the above theorem states that, for small u, the Schwinger J if p is rational or Diophantine is essentially the same; the crucial function behaviour for pF = np J while in the second di=erence is that in one case there is a 6nite number of pF of the form np case there is a dense set. It is also possible to prove the following result. Theorem 3. Let us consider Hamiltonian (13:16) and a sequence Li ; i ∈ Z+ ; such that limi→∞ Li = ∞, limi→∞ pLi = p; if pLi satis=es the Diophantine condition (13:17) and Li (13.19) 2 with the same positive constants C0 and E. Then there exist , ¿ 0 such that; for |u| 6 ,; there exists a function 7 = 7(u) such that pF; Li + npLi T1 ¿ C0 |n|−E ;
|S(x; y)| 6
C 1 + |x − y|
∀n ∈ Z\{(0)}; |n| 6
(13.20)
for some constant $C$. Moreover, $S(x;y) = g(x-y) + uC_3(x;y)$, where $g(x-y)$ is given by (3.4) and $C_3(x;y)$ verifies the same bound as (13.20).

The proof follows along the lines of the preceding sections; in fact Lemma 1 is still valid if one assumes (13.19) instead of $p_F = \bar n p$. By the definition of localization, see (8.19), (8.20) and (8.21), one gets $\sigma_h = 0$ for any $h$. However, the construction of a sequence of $L_i$, $p_{F,L_i}$, $p_{L_i}$ verifying (13.19) seems to be much more involved and so far no construction has been exhibited (but we think that this is only a technical problem). In any case, contrary to the commensurate case, the results obtained are not for all the possible values of $p_F$; the behaviour of the system for $p_F$ verifying neither $p_F = \bar n p$ nor $\|p_F + np\|_{T^1} \ge C_0|n|^{-\tau}$, $\forall n \in \mathbb{Z}\setminus\{0\}$, is an open problem; likewise it is not known what happens if $p$ is neither rational nor Diophantine.

Large u case: We have seen that the asymptotic behaviours of the Schwinger functions for fermions with an external commensurate or incommensurate potential are similar in the small $u$ case, at least if proper Diophantine conditions are imposed on the Fermi momentum. Such similarity is completely lost in the large $u$ case. In this case, from the study of the Schrödinger equation we expect, see for instance [10], the phenomenon of Anderson localization (an exponential decay of correlation functions which is not due to the presence of a gap in the spectrum and delocalized states, but due to the fact that the states are exponentially localized). Again we write the Schwinger functions as Grassmann integrals; however we develop a series in $\varepsilon = 1/u$, considering $H_0$ as the perturbation. In other words, we write $H = \bar H_0 + \bar V$; $\bar H_0 =$
− ’(x))
+ − x x
;
x∈
, [ VS = − 2 x∈
+ − x x+1
+
+ − x x−1
−2
+ − x x ]
+7
+ − x x
(13.21)
x∈
with VS ≡ H0 , HS 0 = P. The , = 0 Schwinger function is given by +=2 x; y d E eik0 E g(x; y; E) = x; y g(x; ˆ k0 ) ≡ g(x; ˆ y; k0 ) = : −ik0 − ’(x) + −+=2
(13.22)
So one can see the analogy with the small $u$ case; the two propagators are the same, replacing $x$ with $k$ and $\varphi(x) - \mu$ with $E(k)$. If $\varphi(x)$ is even one can introduce quasi-particles and apply RG methods similar to the ones discussed for the small $u$ case. Then in [69] the following theorem is proven.

Theorem 4. Let us consider Hamiltonian (13.21) and let $\varphi(x) = \bar\varphi(\omega x)$ be an even periodic function of its argument, $\varphi(x) = \varphi(-x)$, $\bar\varphi(x) = \bar\varphi(x+1)$, and with $\omega$ verifying a Diophantine
condition !nT ¿ C0 |n|−E ;
∀n ∈ Z\{0}
(13.23)
for some constants E ¿ 1 and C0 ¿ 0. Let us de=ne !S ≡ !xS such that = ’( S !) S and assume that there is only one xS ∈ (0; 1=2) satisfying such a condition and that ’S (!) S = 0 (the prime denotes derivative with respect to the argument). Then there exists ,0 ¿ 0; depending on ! and !; S and; for |,| ¡ ,0 ; a function 7 ≡ 7(,) = 0; such that (1) if !S ∈ !Z mod 1 and the additional Diophantine condition !n ± 2!S T ¿ C0 |n|−E ;
∀n ∈ Z\{0} ;
(13.24)
is veri=ed; then S(x; y) is bounded by |S(x; y)| 6 log(1 + min{|x|; |y|})E
CN exp{−4−1 |x − y| log |,−1 |} 1 + [(1 + min{|x|; |y|})−E |x0 − y0 |]N
(13.25)
for any N ¿ 1 and for some constant CN depending on N ; (2) if 2!S = (2k + 1)! mod 1; k ∈ N; then; for 2 = 2(k + 1) and for some constant CN depending on N; |S(x; y)| 6 log max{(1 + min{|x|; |y|})−E ; } ×
CN exp{−(42)−1 |x − y|log|,−1 |} 1 + [max{(1 + min{|x|; |y|})−E ; }|x0 − y0 |]N
with 0 6 6 C |,|Q(k) ; where
(2k + 1)=4; k ¿ 1 ; Q(k) = 1; k =1 ;
(13.26)
(13.27)
for some constant $C$. We see, under suitable conditions, that the Schwinger functions decay exponentially fast. The decay, faster than any power, was due in the small $u$ case to the presence of a gap in the spectrum, while in the large $u$ case it is due to the localization of the eigenstates. In the first case the decay rate is of order $O(\hat u_m)$, if $p_F = mp$, while in the second case it is $O(\log u^{-1})$.

13.4. Interacting spinless fermions

The case of spinless fermions interacting only through a two-body potential was studied in [38] in the continuum and in [55] on the lattice, and can be derived from the considerations in the preceding sections by putting $u = 0$, so that $\sigma_h = 0$ for any $h$, the propagator becomes diagonal in the $\omega$-indices and $h^* = -\infty$. In this case an expansion in Feynman diagrams does not
lead to a convergent expansion and one has to bound directly the truncated expectation, as explained in the previous sections. By Section 10 it follows that, for a suitable $\nu = \nu(\lambda)$, $\lambda_h$, $\delta_h$, $\nu_h$ converge to a nontrivial fixed point lying in the analyticity domain of the expansion; moreover, as $\lim_{h\to-\infty} Z_{h-1}/Z_h = \gamma^\eta$, with $\eta = a\lambda^2 + O(\lambda^3)$ (the rules for computing it at every order were explained in the preceding sections), we can write, from Section 12,
$$S(x;y) = \sum_{h=-\infty}^{1} \frac{g^{(h)}(x;y)}{Z_h} + \sum_{h=-\infty}^{1} \bar S^{(h)}(x;y) \qquad (13.28)$$
with
$$|\bar S^{(h)}(x;y)| \le \frac{\gamma^h}{Z_h}\, \frac{C_N}{1 + (\gamma^h|x-y|)^N}.$$
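If $h_x$ is such that $\gamma^{-h_x-1} < |x-y| \le \gamma^{-h_x}$, (13.28) gives, if $N > 1$,
$$\sum_{h=-\infty}^{1} |\bar S^{(h)}(x;y)| \le C_N \Big[ \sum_{h=h^*}^{h_x-1} \frac{\gamma^h}{Z_h} + \sum_{h=h_x}^{1} \frac{1}{Z_h} \frac{\gamma^h}{\gamma^{Nh}|x-y|^N} \Big] \le \gamma^{h_x} C_N \le \frac{C_N}{1 + |x-y|^{1+\eta}}. \qquad (13.29)$$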
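The mechanism behind (13.29) can be checked numerically. The sketch below assumes the model form $Z_h = \gamma^{-\eta h}$ (suggested by $\lim_{h\to-\infty} Z_{h-1}/Z_h = \gamma^\eta$, but not stated in this form in the text), sums the single-scale bounds $\gamma^h Z_h^{-1}/(1+(\gamma^h r)^N)$ over $h \le 1$ with $r = |x-y|$, and estimates the resulting power-law exponent, which should be close to $1+\eta$. The values of `gamma`, `eta`, `N` and the cutoff `h_min` are arbitrary illustrative choices.

```python
import math


def scale_sum(r, gamma=2.0, eta=0.1, N=3, h_min=-80):
    """Sum over scales h_min <= h <= 1 of gamma^h / Z_h / (1 + (gamma^h r)^N).

    Z_h = gamma**(-eta*h) is an assumed model for the wave-function
    renormalization, so that 1/Z_h = gamma**(eta*h).
    """
    total = 0.0
    for h in range(h_min, 2):
        scale = gamma**h
        total += scale * gamma ** (eta * h) / (1.0 + (scale * r) ** N)
    return total


def decay_exponent(r1, r2, **kwargs):
    """Estimate a in scale_sum(r) ~ r**(-a) from two distances r1 < r2."""
    s1 = scale_sum(r1, **kwargs)
    s2 = scale_sum(r2, **kwargs)
    return math.log(s1 / s2) / math.log(r2 / r1)


if __name__ == "__main__":
    for eta in (0.0, 0.1, 0.3):
        a = decay_exponent(2.0**6, 2.0**16, eta=eta)
        print(f"eta = {eta:.1f}  fitted exponent = {a:.4f}")
```

With $\eta = 0$ the same sum gives the $1/|x-y|$ decay of the noninteracting bounds (13.8) and (13.14); a nonzero $\eta$ shifts the exponent accordingly.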
At the end the following result can be proved, see [38].

Theorem 5. Given a Hamiltonian of the form $H = H_0 + \lambda V + \nu N_0$, with $H_0$, $N_0$, $V$ given by (2.1) and (2.5), for spinless fermions on the continuum, one can find an $\varepsilon > 0$ such that, for $|\lambda| \le \varepsilon$, there are functions $\nu = \nu(\lambda)$, $\eta = \eta(\lambda)$ such that the two-point Schwinger function of $H$ is given by
$$S(x;y) = \frac{g(x-y)}{|x-y|^\eta} + \frac{A(x;y)}{|x-y|^{1+\eta}}, \qquad (13.30)$$
where g(x − y) is given by (13:2); A (x; y) is bounded by a constant; 7 = O() and Q = a2 + O(3 ); with a ¿ 0. It is natural to compare (13:30) with the large distance asymptotic behaviour of the Luttinger model (11:1); from [36,4], the sum of the ! = 1 and −1 Luttinger model two-point Schwinger functions is given by 1 !=±1
ei!pF (x−y)
2' i!(x − y) +
v0∗ (x0
− y0 )
(x2
+
1 + A! () + (v0∗ (x0 − y0 ))2 )Q
y2
with $A_\omega(\lambda)$ bounded by a constant and $v_0^*$, $\eta$ suitable functions of $\lambda$, see [36] for the explicit expression. The similarity of the above equations with (13.30) is clear, but there are also some differences: for instance, $p_F$ is changed by the interaction, while it is not changed in the Luttinger model. Moreover, the behaviour for small $x - y$ is completely different in the two models, the dependence on $p_F$ is much more complicated in the function $S(x;y)$ given by
(13.30) than in its analogue for the Luttinger model, and so on. A more exhaustive comparison between the Luttinger model correlation functions and the ones we are considering is given, in the case of the density–density one, in Section 16 below. An analogous theorem can be proven if the fermions are on a lattice, see [55]. In this case however $\varepsilon_0$ is proportional to $(\sin p_F)^\alpha$, if $\alpha$ is a positive integer, so $\varepsilon_0$ is vanishing for $p_F \to 0$ or $p_F \to \pi$.

13.5. Interacting spinless fermions with a periodic potential in the not filled band case

A result very similar to Theorem 5 holds adding to the Hamiltonian a periodic potential, in the not filled band case. The result is valid for small $\lambda$ and any $u$ and is found by carrying out a renormalization group analysis similar to the one seen in Section 8, performing a multiscale decomposition not on $g(x-y)$ but on $S_0(x;y)$; this means that we are considering as free Hamiltonian not $H_0$ but $H_0 + uP$. The analysis uses many properties of the Bloch waves found in [11]. Note that $|\lambda| \le \varepsilon$ and $\varepsilon$ is proportional to $\partial\varepsilon(k,u)/\partial k|_{k=p_F}$, which is vanishing for $k = n\pi$. In [18] the following result is proved.

Theorem 6. Given a Hamiltonian of the form $H = H_0 + \lambda V + uP + \nu N_0$, with $H_0$, $N_0$, $V$ given by (2.1) and (2.5), for spinless fermions on the continuum with $p_F \neq n\pi$, $n \in \mathbb{N}$, one can find an $\varepsilon > 0$ such that, for $|\lambda| \le \varepsilon$, there are functions $\nu = \nu(\lambda)$, $\eta = \eta(\lambda)$ such that the two-point Schwinger function of $H$ is given by
$$S(x;y) = \frac{S_0(x;y)}{|x-y|^\eta} + \frac{A(x;y)}{|x-y|^{1+\eta}}, \qquad (13.31)$$
where $S_0(x;y)$ is given by (13.4), $A(x;y)$ is bounded by a constant, $\nu = O(\lambda)$ and $\eta = a\lambda^2 + O(\lambda^3)$, with $a > 0$.

13.6. Interacting spinless fermions with a periodic potential in the filled band case

Let us consider the Hamiltonian on the continuum $H = H_0 + \lambda V + uP + \nu N_0$,
(13.32)
S nS ∈ N+ . An analysis similar to the one for the Holstein–Hubbard in 6lled band case pF = n', model can be performed and the following result holds, see [18,40]. Theorem 7. Given Hamiltonian (13:32); assume pF = n' S and ’ˆ nS = 0. There exists , ¿ 0 and functions 7 ≡ 7(; u), Q2 ≡ Q2 (; u) and Q3 ≡ Q3 (; u), continuous for |u|; || 6 , such that 7 = O() and Q3 = +1 2 +2 O(; u; u); ˆ Q2 = +2 + ||O(; u; u); ˆ with +1 ; +2 positive generically nonvanishing constants; and such that the Schwinger function; if |x − y| ¿ 1 and for any positive N; satis=es |S(x; y)| 6
1 CN ; 1+Q |x − y| 3 1 + (|3| |x − y|)N
(13.33)
if CN is a suitable constant and 3 = |u’ˆ nS|1+Q2 :
(13.34)
Moreover; for 1 6 |x − y| 6 3−1 S(x; y) =
1 [g(x − y) + C2 (x; y)] |x − y|Q3
where g(x − y) is given by (13:2) and $ || + 3|x − y| |C2 (x; y)| 6 C |x − y| for a suitable constant C. Moreover; there is a spectral gap D verifying D¿
3 : 2
(13.35)
Let us state some physical consequences of this theorem. There is a nonvanishing spectral gap also in the presence of an interaction but, at least if the interaction is attractive ( ¡ 0) and ue−1 || , it is strongly renormalized by the interaction as the ratio between the bare gap and the dressed gap is 1 for small u and vanishing as u → 0. The two-point Schwinger function can be written as S(x; y) = S1 (x; y) + O(; u; u)S ˆ 2 (x; y) ;
(13.36)
where, looking at (12.8) replacing the h dependence with a momentum dependence dk 1 e−ik0 (x0 −y0 ) ˜ x; u(k)) ˜ −y; u(k)) ˆ (k; ˆ f(i:r:) (k) (k; 2 (2') Z(k) −ik0 + ,(k; ˜ u(k)) ˆ ˜ x; u(k)) with (k; ˆ = e−ikx u(k; ˜ x; u(k)) ˆ and * u(k; ˜ x; u(k)) ˆ = eisign(k)pF x cos(pF x) 1 − $
sign(|k | − pF )u(k) ˆ
((|k | − pF )v0 )2 + u(k) ˆ 2 * sign(|k | − pF )u(k) ˆ −i sign(k) sin(pF x) 1 + $ ; ((|k | − pF )v0 )2 + u(k) ˆ 2 ,(k; ˜ u(k)) ˆ = (|k | − pF )2 =2m + sign(|k | − pF ) v02 (|k | − pF )2 + u(k) ˆ 2
(13.37) (13.38)
ˆ J and u(k); if pF = np ˆ Z(k) are two bounded functions such that |u(k) ˆ − u| = O(u)|, −1 ˆ |Z(k) − 1| = O() for k | − pF | ¿ pF =2, and u(p ˆ F ) = |uˆ nS|1+Q2 ;
ˆ F ) = |uˆ nS|−Q3 (1+Q2 ) : Z(p
Finally, $S_1(x;y)$ and $S_2(x;y)$ obey the same bound (13.33).
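To see how a dispersion relation of the form (13.38) encodes a gap at the Fermi momentum, the following sketch evaluates it with the momentum-dependent function $\hat u(k)$ frozen to a constant (a simplification made only for this illustration). The parameter names `p_f`, `mass` and `gap_parameter` are hypothetical. The jump of the dispersion across $|k| = p_F$ then equals twice the constant, while far from $p_F$ the free value $(k^2 - p_F^2)/2m$ is approached.

```python
import math


def dispersion(k, p_f=1.0, mass=1.0, gap_parameter=0.05):
    """Dispersion of the form (13.38) with u_hat(k) frozen to a constant.

    v0 = p_f / mass is the Fermi velocity; k must differ from +-p_f,
    where the sign function (and hence the dispersion) jumps.
    """
    v0 = p_f / mass
    dk = abs(k) - p_f
    if dk == 0:
        raise ValueError("dispersion is discontinuous at |k| = p_f")
    sign = 1.0 if dk > 0 else -1.0
    return dk**2 / (2.0 * mass) + sign * math.sqrt((v0 * dk) ** 2 + gap_parameter**2)


def jump_at_fermi_momentum(delta=1e-9, **kwargs):
    """Difference of the dispersion just above and just below p_f."""
    p_f = kwargs.get("p_f", 1.0)
    return dispersion(p_f + delta, **kwargs) - dispersion(p_f - delta, **kwargs)


if __name__ == "__main__":
    print("jump at p_F:", jump_at_fermi_momentum())
    k = 3.0
    print("dispersion at k = 3:", dispersion(k), "free value:", (k**2 - 1.0) / 2.0)
```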
It is natural to compare (13:36) with (13:10), valid in the = 0 case; one could describe the result saying that the particles near the Fermi momentum are still Bloch waves, but dressed and ˆ of the form (k; x; u(k))= ˆ Z(k), i.e. a sort of interacting Bloch waves. This extra momentum dependence is natural as we expect that the interaction changes the one-particle wave functions mainly for momenta near the Fermi surface. One can expect then that the spectral gap, which in the noninteracting = 0 case is O(u), is deeply renormalized by the interaction between electrons becoming O(u1+Q2 ), so becoming much larger or much smaller, if ue−1 || , depending on the attractive or repulsive nature of the interaction. Similar results can be found if the fermions are on a lattice with a commensurate potential. 13.7. Interacting spinless fermions with an incommensurate potential This is the Holstein–Hubbard model we discussed in Section 8. It is a model for interacting fermions on a lattice with an incommensurate potential. In the physical literature such systems are studied in connection with the so-called quasi-crystals, see for instance [61]. The following theorem holds, see [43]. Theorem 8. Let us consider Hamiltonian (8:1) and a sequence Li ; i ∈ Z+ ; such that lim Li = ∞;
i→∞
lim pLi = p :
i→∞
J Li (mod 2'), ’ˆ nS = 0; pLi satis=es Suppose also that there is a positive integer nS such that pF = np the Diophantine condition 2npLi T1 ¿ C0 |n|−E ;
0 = n ∈ Z; |n| 6
Li 2
(13.39)
for some positive constants C0 and E independent of i. There exist , ¿ 0 and functions 7 ≡ 7(; u); Q2 ≡ Q2 (; u) and Q3 = Q3 (; u); continuous for |u|; || 6 , and such that 7 = O(); Q3 = +1 2 + 2 O(; u; u); ˆ Q2 = +2 + ||O(; u; u); ˆ with +1 ; +2 positive generically nonvanishing constants; and such that the Schwinger function; if |x − y| ¿ 1 and for any positive N; satis=es |S(x; y)| 6
1 CN 1+Q 3 |x − y| 1 + (3 |x − y|)N
(13.40)
if 3 = |u’ˆ nS|1+Q2 and CN is a constant. Moreover; for 1 6 |x − y| 6 3−1 S(x; y) =
1 [g(x; y) + C2 (x; y)] ; |x − y|Q3
(13.41)
where g(x; y) is given by (3:4) and $ || + 3|x − y| |C2 (x; y)| 6 C |x − y| for a suitable constant C. Moreover; there is a spectral gap D verifying D¿
3 : 2
(13.42)
The considerations in the preceding sections hold also in this case; in the same sense the quasi-Bloch waves are replaced by interacting quasi-Bloch waves. Finally, the analogue of Theorem 3 for the interacting case holds. Theorem 9. Let us consider Hamiltonian (13:16) and a sequence Li ; i ∈ Z+ ; such that limi→∞ Li = ∞, limi→∞ pLi = p; if pLi satis=es the Diophantine condition (13:39) and (13:19) then there exist an , such that; for ||; |u| 6 , there are functions Q(; u); 7(; u) with Q(; u) = +1 2 + 2 O(; u) such that S(x; y) =
A (x; y) g(x; y) + ; Q |x − y| 1 + |x − y|1+Q
(13.43)
where $g(x;y)$ is given by (3.4), $A(x;y)$ is bounded by a constant, $\nu = O(\lambda)$ and $\eta = a\lambda^2 + O(\lambda^3)$, with $a > 0$. While a sequence of $L_i$ verifying the conditions of the first theorem is found in [41], a sequence verifying the conditions of the second theorem has not been constructed so far.

13.8. Open problems

Finally, let us give a list of open problems about $d = 1$ spinless fermions weakly interacting with short range.
(1) In the case of a periodic potential there are two different expansions for $p_F = n\pi$ and $p_F \neq n\pi$; the second one is such that $\lambda$ has to be taken smaller and smaller as $p_F \to n\pi$. One would like to know the correlation functions also for $p_F \simeq n\pi$ and $\lambda$ fixed, i.e. not vanishing as $p_F \to n\pi$.
(2) One would like to study by these methods interacting fermions with a stochastic external potential. For $\lambda = 0$, the Schwinger function was obtained in [17] from the properties of the solution of the Schrödinger equation. It is likely that the interaction between fermions produces a renormalization of the decay rate of the Anderson localization similar to the one for the quasi-periodic potential (see [21]). To prove this, cluster expansion techniques probably have to be used (see [76]).
(3) Another interesting case is when there is a large incommensurate potential and an interaction between fermions (Holstein–Hubbard model for large $u$).
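For orientation on the noninteracting large-$u$ regime that underlies problem (3), the growth rate of solutions of the finite difference Schrödinger equation (13.15) can be estimated numerically by transfer matrices. The sketch below assumes the specific potential $\cos(2\pi\omega x)$ with golden-mean $\omega$ (the text only requires a suitable quasi-periodic potential) and computes a finite-length estimate of the Lyapunov exponent. For this cosine potential Herman's bound gives a Lyapunov exponent of at least $\log(|u|/2)$, so a strictly positive rate is expected for $|u| > 2$. Function and parameter names are hypothetical.

```python
import math


def lyapunov_estimate(u, energy, steps=20000, omega=(math.sqrt(5.0) - 1.0) / 2.0):
    """Finite-length Lyapunov exponent for
    -psi(x+1) - psi(x-1) + u*cos(2*pi*omega*x)*psi(x) = E*psi(x).

    The recursion psi(x+1) = (u*cos(...) - E)*psi(x) - psi(x-1) is iterated
    from a fixed initial vector, renormalizing at every step to avoid overflow.
    """
    prev, curr = 0.0, 1.0
    log_growth = 0.0
    for x in range(1, steps + 1):
        potential = u * math.cos(2.0 * math.pi * omega * x)
        nxt = (potential - energy) * curr - prev
        prev, curr = curr, nxt
        norm = math.hypot(prev, curr)
        log_growth += math.log(norm)
        prev /= norm
        curr /= norm
    return log_growth / steps


if __name__ == "__main__":
    for u in (0.5, 6.0):
        rate = lyapunov_estimate(u, energy=0.0)
        print(f"u = {u}: estimated growth rate = {rate:.4f}")
```

A rate close to zero for small $u$ and a rate of order $\log|u|$ for large $u$ is consistent with the contrast between quasi-Bloch waves and exponentially localized states discussed in Section 13.3.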
14. Density--density response function 14.1. The expansion We have seen in Section 3 that the density–density response functions, in terms of which many important physical quantities can be expressed, are related to the four-point Schwinger functions. The expansion in Section 12 for the two-point Schwinger functions could be adapted to the case of the n-point Schwinger function. However, while such expansion is suitable for the analysis of the asymptotic behaviour of S T (x1 ; x2 ; x3 ; x4 ) when |x1 − x2 |; |x2 − x3 |; |x3 − x4 |, |x1 − x4 | are large, it cannot provide the asymptotic behaviour of the density–density correlation functions, related to four-point Schwinger functions with coordinates pairwise equal; see [44]. The reason is that while in the two-point Schwinger function, or in the four point one with all the di=erence of coordinates very large, the asymptotic behaviour is described in terms of the same critical indices appearing in the e=ective potential, in the expression for the density– density correlation function there are new ones. We give here an idea of the expansion referring for details and proofs to [46]. The density–density correlation function Hx is given by 92 log S() 3 ; (14.1) Hx = 9(x)(0) =0 where (x) is a bosonic external 6eld, periodic in x and x0 , of period L and +, respectively, and (61) S() )+ dx (x) x(61)+ x(61)− e = P(d (61) )e−V ( : (14.2) For 6xing ideas we study (14.2) for the Hamiltonian H0 + V + 7N0 on a lattice, and in Section 16 we discuss what happens for more complex Hamiltonians. We shall evaluate the integral in the r.h.s. of (14.2) introducing a scale decomposition and performing iteratively the integration of the single scale 6elds, starting from the 6eld of scale 1. 
After integrating the 6elds (1) ; : : : ; (h+1) , 0 ¿ h ¿ h∗ , we 6nd √ √ (h) (6h) S() −L+Eh +S (¿h+1) () )+B(h) ( Zh (6h) ;) e =e PZh (d 6h )e−V ( Zh ; (14.3) where PZh (d (6h) ) and Vh are given by (8.40) while S (¿h+1) (), which denotes the sum over all the terms dependent on but independent of the 6eld, and B(h) ( (6h) ; ), which denotes the sum over all the terms containing at least one 6eld and two 6elds, can be represented in the form m m ∞ 1 (¿h+1) (¿h+1) S () = Sm (p1 ; : : : ; pm ) (pi ) pi ; (14.4) m (L+) p ;p m=1
1
m
i=1
i=1
B(h) (
(6h)
; ) =
∞ ∞ m=1 n=1
×
m
!
p1 ;:::;pm k1 ;:::;k2n
(pi )
i=1
2n i=1
(h) Bm; 2n; ! (p1 ; : : : ; pm ; k1 ; : : : ; k2n )
(6h) i ki ;!i
m i=1
pi +
2n
i ki
:
(14.5)
k=1
It is easy to see that the 6eld is equivalent, from the point of view of dimensional considerations, to two 6elds so that the only terms in the r.h.s. of (14.5) which are not irrelevant are those with m = 1 and n = 1, which are marginal. Hence, we extend the de6nition of the localization operator L (8.19), (8.20) and (8.21) so that its action on B(h) ( (6h) ; ) is trivial unless n = m = 1 and in that case 1 (6h)− (h) L (p) k(6h)+ k2 ;!2 B1; 2; ! (p; k1 ; k2 )(k1 − k2 + p) 1 ;!1 (L+)3 k1 ;k2 ;p
×L
1 (p) (L+)3 k1 ;k2 ;p
(6h)+ (6h)− (h) k1 ;!1 k2 ;!2 B1; 2; ! ((!2
− !1 )pF ; !1 pF ; !2 pF )(k1 − k2 + p) :
(14.6)
Then, we can write LB(h) (
(6h)
; ) =
Zh(1) (6h) Zh(2) (6h) F + F ; Zh 1 Zh 2
where Zh(1) and Zh(2) are real numbers, such that Z1(1) = Z1(2) = 1 and (6h) (6h)− d x (x)e2i!pF x x;(6h)+ = F1 ! x; −! ;
(14.7)
(14.8)
!=±1
F2(6h) =
d x (x)
(6h)+ (6h)− x; ! x; !
:
(14.9)
!=±1
Of course, we could write the above expressions in momentum space, like in (8.27); we prefer however to write them in coordinate space to make it evident that in F1(6h) there is an oscillating factor absent in F2(6h) . Note also that Zh(1) and Zh(2) are new running coupling constants. By using the notation of Section 8, we can write the integral in the r.h.s. of (14.3) as √ √ (6h) ˜ (h) −L+th )+B(h) ( Zh (6h) ;) P˜ Zh−1 (d (6h) )e−V ( Zh e √ (h) √ (6h) ˆ (h) −L+th (6h−1) )−Bˆ ( Zh−1 (6h) ;) =e PZh−1 (d ) P˜ Zh−1 (d (h) )e−V ( Zh−1 ; (14.10) $ ˆ (h) ( Zh−1 (6h) ) is de6ned as in (8:63) and where V $ (h) $ Bˆ ( Zh−1 (6h) ; ) = B(h) ( Zh (6h) ; ) :
(14.11)
B(h−1) (
$
Zh−1
e−V
(h−1)
(
=
(6h−1) ; )
√
Zh− 1
(6h−1)
PZh−1 (d
and S (h) () are then de6ned through the analogues of (8.65), that is √ (h−1) (6h−1) ˜ ˜(h)
)+B
(h)
(
ˆ (h) (
)e−V
Zh − 1
√
Zh− 1
;)−L+E h +S
(6h)
(h) )+Bˆ (
√
Zh− 1
()
(6h)
;)
:
(14.12)
De6nitions (14.11) and (14.7) easily imply that (i) Zh−1
Zh(i)
= 1 + zh(i) ;
i = 1; 2 ;
(14.13)
where zh(1) and zh(2) are some quantities of order ,h , which can be written in terms of a tree expansion similar to that described in Section 8. It follows that 1
S() = − L+EL; + +
(h) S˜ ()
(14.14)
h=−∞
and S (¿h+1) = Hx =
1
1
k=h+1
(k) S˜ (), see (14.3); moreover deriving with respect to (x); (0)
(h) S˜ 2 (x; 0) :
(14.15)
h=−∞
√ The functionals B(h) ( Zh (6h) ; ) and S (h) () can be written in terms of a tree expansion similar to the one described in Section 8. We introduce, for each n ¿ 0 and each m ¿ 1, a family Th;mn of trees, which are de6ned as in Section 8.4, with some di=erences.
(1) If E ∈ Th;mn , the tree has 2n + m (instead of 2n) endpoints. Moreover, among the 2n + m endpoints, there are n endpoints, which we call normal endpoints, which are associated with a contribution to the e=ective potential on scale hv − 1. The m remaining endpoints, which we call special endpoints, are associated with a local term of the form (14.8) or (14.9); we shall say that they are of type Z (1) or Z (2) , respectively. (2) There are clusters with external 6elds, and on such clusters the R = 1 − L operation, if L is de6ned as in (14.6), acts if there is only one external 6eld and two external 6elds. As dimensionally a 6eld is like a couple of 6elds, (14.7) is enough to get a convergent renormalized expansion. 14.2. The results The running coupling constant Zh(1) veri6es a Cow equation of the form (1) Zh−1 Z (1) = h [+3 h + +3(h) (˜vh ; : : : ;˜v0 )] Zh−1 Zh
with +3 ¿ 0 and |+3(h) (˜vh ; : : : ;˜v0 )| 6 C |h |2 . By performing an analysis similar to the one in Section 10 for the running coupling constants one can prove, given c2 6 c1 are constants, that A−c2 +3 1 h 6
Zh(1) 6 A−c1 +3 1 h : Zh
(14.16)
A similar results holds of course also for Zh(2) =Zh , namely (2) Zh−1 Z (2) = h [+4 h + +4(h) (˜vh ; : : : ;˜v0 )] Zh−1 Zh
and it is easy to check, by explicit calculation, that one has +4 = 0. We shall show in the next section that we can decompose +4(h) in a Luttinger model part plus a correction, and that the Luttinger model part is vanishing. This will be proved by a Ward identity. In particular, one can show that −C||
A
Zh(2) 6 6 AC|| : Zh
(14.17)
This means that there is a remarkable cancellation such that the ratio between Zh(2) and Zh remains close to 1: they are two diverging quantities given by an (apparently) di=erent perturbative series; however, such two series are equal to any order up to irrelevant terms. At the end one 6nds for H = H0 + V + 7N , if the conditions of Theorem 1 are veri6ed i.e. small enough, pF = 0; ' and 7 chosen in a proper way 1
Hx =
e
2i!pF x
h; k=−∞!=±1
(1) 2 (2) 2 (Zh∨k ) (h) (Zh∨k ) (h) (k) (k) g!; ! (−!x)g−!; −! (−!x) − g (−!x)g!; ! (!x) Zh−1 Zk−1 Zh−1 Zk−1 !; !
(2) 2 1 (1) (2) Z (1) 2 Zh Zh Zh (h) (h) (h) h G1 (x) + G2 (x) + G3 (x) ; + Zh Zh Zh2 h=−∞
(14.18)
where h ∨ k = max{h; k } and given any integer N ¿ 0 (h)
(h)
(h)
|G1 (x)| + |G2 (x)| + A−Qh |G3 (x)| 6 CN ||
A2h 1 + [A(h) |x|]N
(14.19)
for a suitable constant CN (recall that the propagator is diagonal in the case we are considering here). Moreover, we can write G1(h) (x) =
(h)
e2i pF x GS 1; (x) + r1 (x) ; (h)
=±1
(h) G2(h) (x) = GS 2 (x) + r2(h) (x) ;
(14.20)
such that (h)
(h)
|r1 (x)| + |r2 (x)| 6 CN |1 |
A(2+Q)h ; 1 + [Ah |x|]N
(14.21)
0 m1 and, if we de6ne Dm0 ;m1 = 9m x0 9x (9x is a discrete derivative; see Section A.2), given any integers m0 ; m1 ¿ 0, there exists a constant CN; m0 ;m1 , such that Ah(m0 +m1 ) (h) (h) |Dm0 ;m1 GS 1; (x)| + |Dm0 ;m1 GS 2 (x)| 6 CN; m0 ;m1 |1 |A2h : (14.22) 1 + [A(h) |x|]N =±1
An easy corollary of the above equations is that the density–density correlation function can be written as 3; a 3; b 3; c Hx = cos(2pF x)UL; +; x + UL; +; x + UL; +; x
(14.23)
with 3; a |9ix UL; +; x | 6
C1 ; 1 + |x|2+2Q2 +i
3; b |9ix UL; +; x | 6
C1 ; 1 + |x|2+i
3; c |UL; +; x | 6
C1 ; 1 + |x|2+A˜
(14.24)
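where $\partial_{\mathbf{x}} = (\partial_{x_0}, \partial_x)$, $\tilde\gamma, C_1 > 0$ depend only on $p_F$, and $\eta_2 = -b_3\lambda + O(\lambda^2)$. We will see in Section 16 that many properties of the density–density correlation function can be obtained from (14.18).

Finally, it is interesting to compare the above expression with the density–density correlation function in the Luttinger model (11.1), obtained by the exact solution. The Luttinger model is defined in terms of two fields $\psi_{x,\omega}$, $\omega = \pm1$, and one expects that the large distance asymptotic behaviour of $\Omega^3(\mathbf{x})$ is qualitatively similar to that of the truncated correlation of the operator $\rho_{\mathbf{x}} = \psi^+_{\mathbf{x}}\psi^-_{\mathbf{x}}$, where $\psi^\sigma_{\mathbf{x}} = \sum_\omega \exp(i\sigma\omega p_F x)\,\psi^\sigma_{\mathbf{x},\omega}$. There is apparently a problem, since the expectation of $\rho_{\mathbf{x}}$ is infinite; however, it is possible to see that there exists the limit, as $\varepsilon_1, \varepsilon_2 \to 0^+$, of $[\langle\rho_{\mathbf{x},\varepsilon_1}\rho_{\mathbf{y},\varepsilon_2}\rangle - \langle\rho_{\mathbf{x},\varepsilon_1}\rangle\langle\rho_{\mathbf{y},\varepsilon_2}\rangle]$, where $\rho_{\mathbf{x},\varepsilon} = \psi^+_{(x,x_0+\varepsilon)}\psi^-_{(x,x_0)}$, and it is natural to take this quantity, let us call it $G(\mathbf{x}-\mathbf{y})$, as the truncated correlation of $\rho_{\mathbf{x}}$. From (2.5) of [36] (by inserting in the last sum a factor $(-\varepsilon_i\varepsilon_j)$, missing because of a typo), it follows that, for $|\mathbf{x}| \to \infty$,
\[ G(\mathbf{x}) \simeq [1+\lambda a_1(\lambda)]\,\frac{\cos(2p_F x)}{2\pi^2[(v_0^* x_0)^2 + x^2]^{1+\lambda a_3(\lambda)}} + \frac{(v_0 x_0)^2 - x^2}{2\pi^2[(v_0 x_0)^2 + x^2]^2}, \tag{14.25} \]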
where $v_0^* = v_0[1+\lambda a_2(\lambda)]$ and $a_i(\lambda)$, $i = 1, 2, 3$, are bounded functions. Note that, in the second term in the r.h.s. of (14.25), the bare Fermi velocity $v_0$ appears, instead of the renormalized one, $v_0^*$, as one could perhaps expect.

15. Approximate Ward identities

In order to prove (14.17) we can reason as in Section 10 and write the beta function for $Z_h^{(2)}/Z_h$ as a sum of several terms, as in (10.7) and (10.5); one of them coincides with the beta function of the infrared Luttinger model and is the crucial one in order to control the flow,
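while the others have little effect on the flow of $Z_h^{(2)}/Z_h$. Then one is led to consider the infrared Luttinger model beta function for $Z_h^{(2)}/Z_h$ in order to prove the analogue of (10.11); once we have proved this, the flow for the original model can be controlled by repeating the considerations in Section 10. In this section, we give a sketch of the proof of (14.17), following [46].

Let us consider the infrared Luttinger model with integration given by (10.1), and let $Z_h^{(2),l}$ and $Z_h^l$ be the analogues of $Z_h^{(2)}$ and $Z_h$ for such a model. Let us introduce a new external field $J_{\mathbf{x}}$, commuting with the fields $\phi$ and $\psi^{[h,0]}$, and let us consider the functional
\[ W(\phi, J) = -\log \int P^{(h)}_{(l)}(d\psi^{[h,0]})\, \exp\Big\{-V(\psi^{[h,0]}+\phi) + \int d\mathbf{x}\, J_{\mathbf{x}} \sum_\omega \psi^{[h,0]+}_{\mathbf{x},\omega}\psi^{[h,0]-}_{\mathbf{x},\omega}\Big\} \tag{15.1} \]
with $P^{(h)}_{(l)}(d\psi^{[h,0]})$ defined by (10.1) with $C_0^{-1}$ replaced by $C^{-1}_{h,0} = \sum_{k=h}^{0} f_k$. We also define the functions
\[ \Sigma_{h,\omega}(\mathbf{x}-\mathbf{y}) = \frac{\partial^2}{\partial\phi^+_{\mathbf{x},\omega}\,\partial\phi^-_{\mathbf{y},\omega}}\, W(\phi,J)\Big|_{\phi=J=0}, \tag{15.2} \]
\[ \Gamma_{h,\omega}(\mathbf{x};\mathbf{y},\mathbf{z}) = \frac{\partial}{\partial J_{\mathbf{x}}}\frac{\partial^2}{\partial\phi^+_{\mathbf{y},\omega}\,\partial\phi^-_{\mathbf{z},\omega}}\, W(\phi,J)\Big|_{\phi=J=0}. \tag{15.3} \]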
These functions have here the role of the self-energy and the vertex part in the usual treatment of the Ward identities. However, they do not coincide with them, because the corresponding Feynman graph expansions are not restricted to the one-particle irreducible graphs. Nevertheless, their Fourier transforms at zero external momenta, which are the interesting quantities in the limit $L, \beta \to \infty$, are the same; in fact, because of the support properties of the fermion fields, the propagators vanish at zero momentum, hence the one-particle reducible graphs give no contribution to those quantities. Let us define
\[ \tilde Z^{(2)}_h = 1 + \hat\Gamma_{h,\omega}(0,0), \tag{15.4} \]
\[ \tilde Z_h = 1 + \hat\Sigma_{h,\omega}(0). \tag{15.5} \]
If we did not perform any anomalous integration, $\tilde Z^{(2)}_h$ and $\tilde Z_h$ would coincide with $Z_h^{(2),l}$ and $Z_h^l$; the anomalous integration makes them slightly different, but one can prove, see [46], that
\[ \left|\frac{\tilde Z^{(2)}_h}{Z_h^{(2),l}} - 1\right| \le C|\lambda|, \qquad \left|\frac{\tilde Z_h}{Z_h^{l}} - 1\right| \le C|\lambda|. \]
Let us consider a Feynman graph expansion for $\tilde Z^{(2)}_h$ and $\tilde Z_h$ similar to the one discussed in Section 7, with propagator $C^{-1}_{h,0}(\mathbf{k})/(-ik_0+\omega v_0 k)$; it is easy to see that the graphs contributing to $\tilde Z_h$ have two external fermionic lines and a derived internal propagator (in momentum space), while the graphs contributing to $\tilde Z^{(2)}_h$ have two external fermionic lines and an external bosonic line,
Fig. 17. Graphic representation of the formal Ward identity.
representing the external field. If we proceed formally, replacing the propagator $C^{-1}_{h,0}(\mathbf{k})/(-ik_0+\omega v_0 k)$ with $g_{\omega,F}(\mathbf{k}) \equiv 1/(-ik_0+\omega v_0 k)$, i.e. neglecting the infrared and ultraviolet cut-offs, one finds the formal equality of the two expansions, as a consequence of the equality $\partial_k g_{\omega,F}(\mathbf{k}) = -\omega[g_{\omega,F}(\mathbf{k})]^2$; see Fig. 17. Of course such an argument is only formal, as the two expansions are both meaningless if the cut-offs are neglected, but it suggests that formally $\tilde Z^{(2)}_h = \tilde Z_h$. What can be proved in a rigorous way is
\[ \tilde Z^{(2)}_h = \tilde Z_h + \delta\tilde Z^{(2)}_h \tag{15.6} \]
with
\[ |\delta\tilde Z^{(2)}_h| \le C|\lambda|\, Z^{(2)}_h. \tag{15.7} \]
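This shows that the corrections due to the presence of cut-off functions to the "formal" Ward identity $\tilde Z^{(2)}_h = \tilde Z_h$ are bounded by a quantity diverging as $h \to -\infty$, so that the cancellations seen above seem to capture only the "leading-log" behaviour. In order to prove (15.6) we will find it convenient to write the integration in (15.1) in terms of the space-time field variables
\[ P^{(h)}_{(l)}(d\psi^{[h,0]}) = \mathcal{D}\psi^{[h,0]} \exp\Big\{-\sum_\omega \int d\mathbf{x}\; \psi^{[h,0]+}_{\mathbf{x},\omega}\, D^{[h,0]}_\omega\, \psi^{[h,0]-}_{\mathbf{x},\omega}\Big\}, \tag{15.8} \]
where
\[ D^{[h,0]}_\omega \psi^{[h,0]\sigma}_{\mathbf{x},\omega} = \frac{1}{L\beta}\sum_{\mathbf{k}} e^{i\sigma\mathbf{k}\cdot\mathbf{x}}\, C_{h,0}(\mathbf{k})\,(i\sigma k_0 - \omega\sigma v_0 k)\, \hat\psi^{[h,0]\sigma}_{\mathbf{k},\omega}. \tag{15.9} \]
$D^{[h,0]}_\omega$ has to be thought of as a "regularization" of the linear differential operator
\[ D_\omega = \frac{\partial}{\partial x_0} + i\omega v_0 \frac{\partial}{\partial x}. \tag{15.10} \]
Let us now introduce the external field variables $\phi^\sigma_{\mathbf{x},\omega}$, $\omega = \pm1$, anticommuting with themselves and $\psi^{[h,0]\sigma}_{\mathbf{x},\omega}$, and let us define
\[ \mathcal{U}(\phi) = -\log \int P^{(h)}_{(l)}(d\psi^{[h,0]})\, e^{-V(\psi^{[h,0]}+\phi)}. \tag{15.11} \]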
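If we perform the gauge transformation
\[ \psi^{[h,0]\sigma}_{\mathbf{x},\omega} \to e^{i\sigma\alpha_{\mathbf{x}}}\, \psi^{[h,0]\sigma}_{\mathbf{x},\omega} \tag{15.12} \]
and we define $(e^{-i\alpha}\phi)^\sigma_{\mathbf{x},\omega} = e^{-i\sigma\alpha_{\mathbf{x}}}\phi^\sigma_{\mathbf{x},\omega}$, we get
\[ \mathcal{U}(\phi) = -\log \int P^{(h)}_{(l)}(d\psi^{[h,0]}) \exp\Big\{-V(\psi^{[h,0]} + e^{-i\alpha}\phi) - \sum_\omega \int d\mathbf{x}\; \psi^{[h,0]+}_{\mathbf{x},\omega}\big(e^{i\alpha_{\mathbf{x}}} D^{[h,0]}_\omega e^{-i\alpha_{\mathbf{x}}} - D^{[h,0]}_\omega\big)\psi^{[h,0]-}_{\mathbf{x},\omega}\Big\}. \tag{15.13} \]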
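Note that the integration $P^{(h)}_{(l)}(d\psi^{[h,0]})$ is not gauge invariant, due to the presence of the cut-off. Since $\mathcal{U}(\phi)$ is independent of $\alpha$, the functional derivative of the r.h.s. of (15.13) w.r.t. $\alpha_{\mathbf{x}}$ is equal to 0 for any $\mathbf{x}$. Hence, we find the following identity:
\[ \sum_\omega \Big[ -\phi^+_{\mathbf{x},\omega}\frac{\partial\mathcal{U}}{\partial\phi^+_{\mathbf{x},\omega}} + \phi^-_{\mathbf{x},\omega}\frac{\partial\mathcal{U}}{\partial\phi^-_{\mathbf{x},\omega}} + \frac{1}{Z(\phi)}\int P^{(L,h)}(d\psi^{[h,0]})\, T_{\mathbf{x},\omega}\, e^{-V^{(L)}(\psi^{[h,0]}+\phi)} \Big] = 0, \tag{15.14} \]
where
\[ Z(\phi) = \int P^{(h)}_{(l)}(d\psi^{[h,0]})\, e^{-V(\psi^{[h,0]}+\phi)}, \tag{15.15} \]
\[ T_{\mathbf{x},\omega} = \psi^{[h,0]+}_{\mathbf{x},\omega}\,[D^{[h,0]}_\omega \psi^{[h,0]-}_{\mathbf{x},\omega}] + [D^{[h,0]}_\omega \psi^{[h,0]+}_{\mathbf{x},\omega}]\,\psi^{[h,0]-}_{\mathbf{x},\omega} = \frac{1}{(L\beta)^2}\sum_{\mathbf{p},\mathbf{k}} e^{-i\mathbf{p}\mathbf{x}}\, \hat\psi^{[h,0]+}_{\mathbf{k},\omega}\,[C_{h,0}(\mathbf{p}+\mathbf{k})D_\omega(\mathbf{p}+\mathbf{k}) - C_{h,0}(\mathbf{k})D_\omega(\mathbf{k})]\, \hat\psi^{[h,0]-}_{\mathbf{p}+\mathbf{k},\omega}, \tag{15.16} \]
\[ D_\omega(\mathbf{k}) = -ik_0 + \omega v_0 k. \tag{15.17} \]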
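Note that (15.16) can be rewritten as
\[ T_{\mathbf{x},\omega} = D_\omega[\psi^{[h,0]+}_{\mathbf{x},\omega}\psi^{[h,0]-}_{\mathbf{x},\omega}] + \delta T_{\mathbf{x},\omega}, \tag{15.18} \]
where
\[ \delta T_{\mathbf{x},\omega} = \frac{1}{(L\beta)^2}\sum_{\mathbf{p},\mathbf{k}} e^{-i\mathbf{p}\mathbf{x}}\, \hat\psi^{[h,0]+}_{\mathbf{k},\omega}\, \{[C_{h,0}(\mathbf{p}+\mathbf{k}) - 1]D_\omega(\mathbf{p}+\mathbf{k}) - [C_{h,0}(\mathbf{k}) - 1]D_\omega(\mathbf{k})\}\, \hat\psi^{[h,0]-}_{\mathbf{p}+\mathbf{k},\omega}. \tag{15.19} \]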
It follows that, if $C_{h,0}$ is replaced by 1, that is if we consider the formal theory without any ultraviolet and infrared cut-off, $T_{\mathbf{x},\omega} = D_\omega[\psi^{[h,0]+}_{\mathbf{x},\omega}\psi^{[h,0]-}_{\mathbf{x},\omega}]$ and we would get the usual Ward identities. The presence of the cut-offs makes the analysis a bit more involved and adds some
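corrections to the Ward identities, which however, for $\lambda$ small enough, can be controlled by the same type of multiscale analysis that we used before. If we differentiate the l.h.s. of (15.14) with respect to $\phi^+_{\mathbf{y},\omega}$ and to $\phi^-_{\mathbf{z},\omega}$ and we put $\phi = 0$, we get
\[ 0 = -\delta(\mathbf{x}-\mathbf{y})\Sigma_{h,\omega}(\mathbf{x}-\mathbf{z}) + \delta(\mathbf{x}-\mathbf{z})\Sigma_{h,\omega}(\mathbf{y}-\mathbf{x}) + \Big\langle \frac{\partial^2 V}{\partial\psi^{[h,0]+}_{\mathbf{y},\omega}\,\partial\psi^{[h,0]-}_{\mathbf{z},\omega}} - \frac{\partial V}{\partial\psi^{[h,0]+}_{\mathbf{y},\omega}}\frac{\partial V}{\partial\psi^{[h,0]-}_{\mathbf{z},\omega}}\; ;\; \sum_{\tilde\omega}\big[D_{\tilde\omega}(\psi^{[h,0]+}_{\mathbf{x},\tilde\omega}\psi^{[h,0]-}_{\mathbf{x},\tilde\omega}) + \delta T_{\mathbf{x},\tilde\omega}\big] \Big\rangle^T, \tag{15.20} \]
where $\langle\cdot\,;\cdot\rangle^T$ denotes the truncated expectation w.r.t. the measure $Z(0)^{-1}P^{(L,h)}(d\psi^{[h,0]})\,e^{-V^{(L)}(\psi^{[h,0]})}$. By using the definitions (15.2) and (15.3), Eq. (15.20) can be rewritten as
\[ 0 = -\delta(\mathbf{x}-\mathbf{y})\Sigma_{h,\omega}(\mathbf{x}-\mathbf{z}) + \delta(\mathbf{x}-\mathbf{z})\Sigma_{h,\omega}(\mathbf{y}-\mathbf{x}) - \sum_{\tilde\omega} D_{\mathbf{x},\tilde\omega}\,\Gamma_{h,\omega,\tilde\omega}(\mathbf{x};\mathbf{y},\mathbf{z}) + \Delta_{h,\omega}(\mathbf{x};\mathbf{y},\mathbf{z}), \tag{15.21} \]
where
\[ \Delta_{h,\omega}(\mathbf{x};\mathbf{y},\mathbf{z}) = \Big\langle \frac{\partial^2 V}{\partial\psi^{[h,0]+}_{\mathbf{y},\omega}\,\partial\psi^{[h,0]-}_{\mathbf{z},\omega}} - \frac{\partial V}{\partial\psi^{[h,0]+}_{\mathbf{y},\omega}}\frac{\partial V}{\partial\psi^{[h,0]-}_{\mathbf{z},\omega}}\; ;\; \sum_{\tilde\omega}\delta T_{\mathbf{x},\tilde\omega} \Big\rangle^T. \tag{15.22} \]
In terms of the Fourier transforms, (15.21) can be written as
\[ 0 = \hat\Sigma_{h,\omega}(\mathbf{k}-\mathbf{p}) - \hat\Sigma_{h,\omega}(\mathbf{k}) + \sum_{\tilde\omega}(-ip_0 + \tilde\omega p)\,\hat\Gamma_{h,\omega,\tilde\omega}(\mathbf{p},\mathbf{k}) + \hat\Delta_{h,\omega}(\mathbf{p},\mathbf{k}), \tag{15.23} \]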
and by (15.4) and (15.5) we get (15.6). In order to bound the correction term $\hat\Delta_{h,\omega}(\mathbf{p},\mathbf{k})$ one can define for it a renormalized multiscale expansion similar to the one of the density–density correlation function, see [46], and the bound (15.7) can be proved.

Note finally that we have treated in a different way the vanishing of the Luttinger model part of the beta function for $\lambda_h$ and for $Z^{(2)}_h/Z_h$: in the first case we have used the exact solution of the Luttinger model, and in the second one a Ward identity. It is very likely that a proof of the vanishing of the Luttinger model part of the beta function for $\lambda_h$ by the use of Ward identities is also possible.

16. Spin chains

We apply the results of the preceding two sections to obtain the spin–spin correlation function in the direction of the magnetic field of the Heisenberg XYZ model (2.10), for small anisotropy $u$ and $J_3$, see [44,46]. This means that we have to generalize the analysis in Section 14, see (2.15), to the lattice Hamiltonian $H_0 + \lambda V + \nu N_0 + uB$ with
\[ B = -\frac{1}{2}\sum_x [\psi^+_x\psi^+_{x+1} + \psi^-_{x+1}\psi^-_x]. \]
Of course, similar results hold for density–density correlation functions of all the models discussed in Section 13.

The Heisenberg XYZ chain has been the subject of very active research over many years with a variety of methods. A first class of results is based on the exact solutions. If one of the three parameters is vanishing (e.g. $J_3 = 0$), the model is called XY chain. Its solution is based on the fact that the Hamiltonian in the fermionic form is quadratic in the fermionic fields, so that it can be diagonalized (see [6]) by a Bogoliubov transformation. The equal time correlation functions $\Omega^\alpha_{(x,0)}$ were explicitly calculated in [77] (even at finite $L$ and $\beta$), in the case $h = 0$, that is $p_F = \pi/2$. Note that, while $\Omega^3_{\mathbf{x}}$ coincides with the correlation function of the density in the fermionic representation of the model, $\Omega^1_{\mathbf{x}}$ and $\Omega^2_{\mathbf{x}}$ are given by quite complicated expressions. It turns out that $\Omega^3_{(x,0)}$ is of the following form:
\[ \Omega^3_{(x,0)} = -\frac{\alpha^{2|x|}}{\pi^2x^2}\,\sin^2\Big(\frac{\pi x}{2}\Big)\, F(-|x|\log\alpha, |x|), \qquad \alpha = (1-|u|)/(1+|u|), \tag{16.1} \]
where $F(\gamma, n)$ is a bounded function, such that, if $\gamma \le 1$, $F(\gamma,n) = 1 + O(\gamma\log\gamma) + O(1/n)$, while, if $\gamma \ge 1$ and $n \ge 2\gamma$, $F(\gamma,n) = \pi/2 + O(1/\gamma)$. For $|h| > 0$, it is not possible to get such an explicit expression for $\Omega^3_{(x,0)}$. However, it is not difficult to prove that, if $|u| < \sin p_F$, $|\Omega^3_{(x,0)}| \le \alpha^{|x|}$ and, if $x \ne 0$ and $|ux| \le 1$,
\[ \Omega^3_{(x,0)} = -\frac{1}{\pi^2x^2}\,\sin^2(p_F x)\,[1 + O(|ux|\log|ux|) + O(1/|x|)]. \tag{16.2} \]
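The leading term of (16.2) is the free-fermion result, which follows from Wick's theorem: for $x \ne 0$ the truncated density correlation equals minus the square of the two-point function. The sketch below compares the two on a finite ring. It is an illustration only; the function names are hypothetical, and it assumes a zero-temperature Fermi sea $|k| < p_F$ on a periodic chain of $L$ sites.

```python
import math


def two_point_ring(x, p_F, L):
    """(1/L) * sum over occupied momenta |k| < p_F of exp(i k x) on a ring of L sites.

    The occupied set is symmetric in k, so the sum is real and only cosines survive.
    """
    total = 0.0
    for n in range(-(L // 2), L - L // 2):
        k = 2.0 * math.pi * n / L
        if abs(k) < p_F:
            total += math.cos(k * x)
    return total / L


def density_correlation_ring(x, p_F, L):
    """Truncated density-density correlation at x != 0 via Wick's theorem."""
    return -two_point_ring(x, p_F, L) ** 2


def density_correlation_asymptotic(x, p_F):
    """Infinite-volume expression -sin^2(p_F x) / (pi^2 x^2)."""
    return -math.sin(p_F * x) ** 2 / (math.pi ** 2 * x ** 2)
```

For example, `density_correlation_ring(3, 1.0, 20000)` and `density_correlation_asymptotic(3, 1.0)` differ only by a finite-size correction that vanishes as $L$ grows.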
Note that, if $u = 0$, a very easy calculation shows that $\Omega^3_{(x,0)} = -(\pi x)^{-2}\sin^2(p_F x)$. We want to stress that the only case in which the correlation functions and their asymptotic behaviour can be computed explicitly in a rigorous way is just the $J_3 = 0$ case.

If two parameters are equal (e.g. $J_1 = J_2$) and there is no external magnetic field ($h = 0$), but $J_3 \ne 0$, the model is called XXZ model. It was solved in [7] via the Bethe ansatz, in the sense that the Hamiltonian was diagonalized and the ground state energy was computed. However, it has not been possible so far to obtain the correlation functions from the exact solution. Such a solution is contained in the general solution of the XYZ model by Baxter [8], but again only in the case of zero magnetic field. The ground state energy was computed, showing for instance that there may be a gap in the spectrum above the ground state which, if $J_1 - J_2$ and $J_3$ are not too large, is given approximately by (see [23])
\[ \Delta = 8\pi\,\frac{\sin\mu}{\mu}\,|J_1|\left(\frac{|J_1^2 - J_2^2|}{16(J_1^2 - J_3^2)}\right)^{\pi/2\mu} \tag{16.3} \]
with $\cos\mu = -J_3/J_1$. The solution is based on the fact that the XYZ chain with periodic boundary conditions is equivalent to the eight vertex model, in the sense that $H$ is proportional to the logarithmic derivative with respect to a parameter of the eight vertex transfer matrix, if a suitable identification of the parameters is done, see [50,8]. The eight vertex model is obtained by putting
arrows in a suitable way on a bidimensional lattice with $M$ rows, $L$ columns and periodic boundary conditions. There are eight allowed vertices, and with each of them an energy is associated in a suitable way (there are four different values of the energy). With the above choice of the parameters and $T - T_c < 0$ and small, $u = O(|T - T_c|)$, so that the critical temperature of the eight vertex model corresponds to no anisotropy in the XYZ chain. Moreover, see [79], the correlation function $C_x$ between two vertical arrows in a row, separated by $x$ vertices, in the limit $M \to \infty$, is given by $C_x = \langle S^3_0 S^3_x\rangle$. Again an explicit expression for the correlation functions cannot be derived for the XYZ or the eight vertex model. In [79] the correlation length of $C_x$ was computed heuristically under some physical assumptions (an exact computation is difficult because it does not depend only on the largest and the next largest eigenvalues). The result is $\xi^{-1} = (T - T_c)^{\pi/2\mu}$, if $\xi$ is the correlation length. One sees that the critical index of the correlation length is nonuniversal. Another interesting observation is that the XYZ model is equivalent to two bidimensional interpenetrating Ising lattices with nearest-neighbour coupling, interacting via a four spin coupling (which is proportional to $J_3$). The four spin correlation function is identical to $C_x$. In the decoupling limit $J_3 = 0$ the two Ising lattices are independent and one can see that the Ising model solution can be reduced to the diagonalization, via a Bogoliubov transformation, of a quadratic Fermi Hamiltonian, see [1].
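The gap estimate (16.3) and the nonuniversal exponent $\pi/2\mu$ are easy to tabulate. The following is a minimal sketch, assuming the form of (16.3) with $\cos\mu = -J_3/J_1$ and the regime $|J_3| < |J_1|$; the function names are hypothetical and the formula is only approximate, as stated in the text.

```python
import math


def anisotropy_angle(J1, J3):
    """Angle mu defined by cos(mu) = -J3 / J1; requires |J3| < |J1|."""
    return math.acos(-J3 / J1)


def xyz_gap_estimate(J1, J2, J3):
    """Approximate spectral gap of the XYZ chain in zero field, following (16.3)."""
    mu = anisotropy_angle(J1, J3)
    ratio = abs(J1 ** 2 - J2 ** 2) / (16.0 * (J1 ** 2 - J3 ** 2))
    prefactor = 8.0 * math.pi * (math.sin(mu) / mu) * abs(J1)
    return prefactor * ratio ** (math.pi / (2.0 * mu))


def correlation_length_exponent(J1, J3):
    """Exponent pi / (2 mu) of the inverse correlation length."""
    return math.pi / (2.0 * anisotropy_angle(J1, J3))
```

At $J_3 = 0$ one has $\mu = \pi/2$, the exponent equals 1 and the estimate is linear in $|J_1^2 - J_2^2|$; for $J_3 \ne 0$ the exponent moves away from 1, which is the nonuniversality mentioned above.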
One can presume that, if $J_1 = J_2$, the large distance asymptotic behaviour of $\Omega^3_{\mathbf{x}}$ is similar to that of the density–density Luttinger model correlation function (14.25), if some "reasonable" relationship between the parameters of the two models is assumed (one can make, for instance, the substitutions $\lambda \to -J_3$, $p_F \to \arccos(J_3 - h)$, $p_0^{-1} \to a = 1$, if $a$ is the chain step and $p_0^{-1}$ is the potential range). Of course, such an identification is completely arbitrary, but one can hope that for large distances the function $\Omega^3_{\mathbf{x}}$ has something to do with the density–density Luttinger model correlation function. If $J_1 \ne J_2$, there is no solvable model suitable for a similar analysis.

As we said before, $\Omega^3_{\mathbf{x}}$ can be obtained from the exact solution only in the case $J_3 = 0$, when the fermionic theory is a noninteracting one. In particular, if $\mathbf{x} = (x, 0)$ and $|ux| \ll 1$, (16.2) and a more detailed analysis of the "small" terms in the r.h.s. (in order to prove that their derivatives of order $n$ decay as $|x|^{-n}$) show that $\Omega^3_{L,\beta,\mathbf{x}}$ is a sum of oscillating functions with frequency $(np_F)/\pi$ mod 1, $n = 0, \pm1$. The frequencies are then measured by $p_F$, so they depend only on the external magnetic field $h$. If $J_3 \ne 0$, a similar property is satisfied for the leading terms in the asymptotic behaviour, but the value of $p_F$ depends, in general, also on $u$ and $J_3$. For example, if $u = 0$, the Hamiltonian is equal, up to a constant, to the Hamiltonian of a free fermion gas with Fermi momentum $p_F = \arccos(J_3 - h)$ plus an interaction term proportional to $J_3$. As is well known, the interaction modifies the Fermi momentum of the system by terms of order $J_3$ and it is convenient, in order to study the interacting model, to fix the Fermi momentum to an interaction independent value, by adding a counterterm to the Hamiltonian. We proceed here in a similar way, that is we fix $p_F$ and $h_0$ so that
\[ h = h_0 - \nu, \qquad \cos p_F = J_3 - h_0 \tag{16.4} \]
and we look for a value of $\nu$, depending on $u, J_3, h_0$, such that, as in the $J_3 = 0$ case, the leading terms in the asymptotic behaviour of $\Omega^3_{L,\beta}(\mathbf{x})$ can be represented as a sum of oscillating functions with frequencies $(np_F)/\pi$ mod 1, $n = 0, \pm1$.
As we shall see, we can realize this programme only if $J_3$ is small enough and it turns out that $\nu$ is of order $J_3$. It follows that we can only consider magnetic fields such that $|h| < 1$. Moreover, it is clear that the equation $h = h_0 - \nu(u, J_3, h_0)$ can be inverted, once the function $\nu(u, J_3, h_0)$ has been determined, so that $p_F$ is indeed a function of the parameters appearing in the original model.

If $J_1 = J_2$, it is conjectured, on the basis of heuristic calculations, that to fix $p_F$ is equivalent to imposing the condition that, in the limit $L, \beta \to \infty$, the density is fixed ("Luttinger Theorem") to the free model value $\rho = p_F/\pi$. Remembering that $\rho - 1/2$ is the magnetization in the 3-direction for the original spin variables, this would mean that to fix $p_F$ is equivalent to fixing the magnetization in the 3-direction, by suitably choosing the magnetic field. If $J_1 \ne J_2$, there is in any case no simple relation between $p_F$ and the mean magnetization, as one can see directly in the case $J_3 = 0$, where one can do explicit calculations. The only exception is the case $p_F = \pi/2$, where one can see that, in the limit $L \to \infty$, $\nu = J_3$ (so that $h = 0$ by (16.4)), and that $\langle S^3_x\rangle = 0$.

By using the results in Sections 14 and 15 and adapting the scheme followed in Section 8 one can prove, see [44,46], the following result.

Theorem 9. Suppose that $v_0 = \sin p_F \ge \bar v_0 > 0$, for some value of $\bar v_0$ fixed once for all, and let us define $a_0 = \min\{p_F/2, (\pi - p_F)/2\}$; then the following is true.

(a) There exists a constant $\varepsilon$, such that, if $(u, J_3) \in A$, with
\[ A = \Big\{(u, J_3): |u| \le \frac{a_0}{8(1+\sqrt2)},\ |J_3| \le \varepsilon\Big\}, \tag{16.5} \]
it is possible to choose $\nu$, so that $|\nu| \le c|J_3|$, for some constant $c$ independent of $L$, $\beta$, $u$, $J_3$, $p_F$, and the spin correlation function $\Omega^3_{L,\beta}(\mathbf{x})$ is a bounded (uniformly in $L$, $\beta$, $p_F$ and $(u, J_3) \in A$) function of $\mathbf{x} = (x, x_0)$, $x = 1, \ldots, L$, $x_0 \in [0, \beta]$, periodic in $x$ and $x_0$ of period $L$ and $\beta$, respectively, continuous as a function of $x_0$.

(b) We can write
\[ \Omega^3(\mathbf{x}) = \cos(2p_F x)\,\Omega^{3,a}(\mathbf{x}) + \Omega^{3,b}(\mathbf{x}) + \Omega^{3,c}(\mathbf{x}) \tag{16.6} \]
with $\Omega^{3,i}(\mathbf{x})$, $i = a, b, c$, continuous bounded functions, which are infinitely many times differentiable as functions of $x_0$, if $i = a, b$. Moreover, there exist two constants $\eta_1$ and $\eta_2$ of the form
\[ \eta_1 = -a_1J_3 + O(J_3^2), \qquad \eta_2 = a_2J_3 + O(J_3^2), \tag{16.7} \]
$a_1$ and $a_2$ being positive constants, uniformly bounded in $L$, $\beta$, $p_F$ and $(u, J_3) \in A$, such that the following is true. Given any positive integers $n$ and $N$, there exist positive constants $\tilde\gamma < 1$ and $C_{n,N}$, independent of $L$, $\beta$, $p_F$ and $(u, J_3) \in A$, so that, for any integers $n_0, n_1 \ge 0$ and putting $n = n_0 + n_1$,
\[ |\partial^{n_0}_{x_0}\bar\partial^{n_1}_x\,\Omega^{3,a}(\mathbf{x})| \le \frac{1}{|\mathbf{x}|^{2+2\eta_2+n}}\,\frac{C_{n,N}}{1+[\Delta|\mathbf{x}|]^N}, \tag{16.8} \]
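\[ |\partial^{n_0}_{x_0}\bar\partial^{n_1}_x\,\Omega^{3,b}(\mathbf{x})| \le \frac{1}{|\mathbf{x}|^{2+n}}\,\frac{C_{n,N}}{1+[\Delta|\mathbf{x}|]^N}, \tag{16.9} \]
\[ |\Omega^{3,c}(\mathbf{x})| \le \frac{1}{|\mathbf{x}|^2}\left[\frac{1}{|\mathbf{x}|^{\tilde\gamma}} + \frac{(\Delta|\mathbf{x}|)^{1/2}}{|\mathbf{x}|^{\min(0,2\eta_2)}}\right]\frac{C_{0,N}}{1+[\Delta|\mathbf{x}|]^N}, \tag{16.10} \]
where $\bar\partial_x$ denotes the discrete derivative and
\[ \Delta = \max\Big\{|u|^{1+\eta_1},\ \sqrt{(v_0\beta)^{-2} + L^{-2}}\Big\}. \tag{16.11} \]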
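(c) $\Omega^{3,a}(\mathbf{x})$ and $\Omega^{3,b}(\mathbf{x})$ are even functions of $\mathbf{x}$ and there exists a constant $\delta^*$, of order $J_3$, such that, if $1 \le |\mathbf{x}| \le \Delta^{-1}$ and $v_0^* = v_0(1+\delta^*)$,
\[ \Omega^{3,a}(\mathbf{x}) = \frac{1 + A_1(\mathbf{x})}{2\pi^2[x^2 + (v_0^*x_0)^2]^{1+\eta_2}}, \qquad \Omega^{3,b}(\mathbf{x}) = \frac{1}{2\pi^2[x^2 + (v_0^*x_0)^2]}\left\{\frac{x_0^2 - (x/v_0^*)^2}{x^2 + (v_0^*x_0)^2} + A_2(\mathbf{x})\right\}, \tag{16.12} \]
\[ |A_i(\mathbf{x})| \le c_1\{|J_3| + (\Delta|\mathbf{x}|)^{1/2}\} \tag{16.13} \]
for some constant $c_1$. The function $\Omega^{3,a}(\mathbf{x})$ is the restriction to $\mathbb{Z}\times\mathbb{R}$ of a function on $\mathbb{R}^2$, satisfying the symmetry relation
\[ \Omega^{3,a}(x, x_0) = \Omega^{3,a}\Big(x_0v_0^*, \frac{x}{v_0^*}\Big). \tag{16.14} \]

(d) Let $\hat\Omega^3(\mathbf{k})$, $\mathbf{k} = (k, k_0) \in [-\pi, \pi]\times\mathbb{R}^1$, be the Fourier transform of $\Omega^3(\mathbf{x})$. For any fixed $\mathbf{k}$ with $\mathbf{k} \ne (0,0), (\pm2p_F, 0)$, $\hat\Omega^3(\mathbf{k})$ is uniformly bounded as $u \to 0$; moreover, for some constant $c_2$,
\[ |\hat\Omega^3(0,0)| \le c_2 + c_2|J_3|\log\frac{1}{\Delta}, \qquad |\hat\Omega^3(\pm2p_F, 0)| \le c_2\,\frac{1 - \Delta^{\eta_2}}{\eta_2}. \tag{16.15} \]
Finally, if $u = 0$, $\hat\Omega^3(\mathbf{k})$ is at most logarithmically divergent at $\mathbf{k} = (0,0)$ for any $J_3$, and, at $\mathbf{k} = (\pm2p_F, 0)$, it is singular only if $J_3 < 0$; in this case it diverges as $|\mathbf{k} - (\pm2p_F, 0)|^{\eta_2}/|\eta_2|$.

(e) Let $G(x) = \Omega^3(x, 0)$ and $\hat G(k)$ its Fourier transform. For any fixed $k \ne 0, \pm2p_F$, $\hat G(k)$ is uniformly bounded as $u \to 0$, together with its first derivative; moreover
\[ |\partial_k\hat G(0)| \le c_2, \qquad |\partial_k\hat G(\pm2p_F)| \le c_2(1 + \Delta^{\eta_2}). \tag{16.16} \]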
Finally, if $u = 0$, $\partial_k\hat G(k)$ has a first-order discontinuity at $k = 0$, with a jump equal to $1 + O(J_3)$, and, at $k = \pm2p_F$, it is singular only if $J_3 < 0$; in this case it diverges as $|k - (\pm2p_F)|^{\eta_2}$.

We comment on the above very elaborate theorem.

(a) The above theorem holds for any magnetic field $h$ such that $\sin p_F > 0$, if $p_F = h - J_3$. Remember that the exact solution [8] is valid only for $h = 0$. Moreover, $u$ need not be small, see (16.5), and the only small parameter is $J_3$; however the interesting (and more difficult) case is when $u$ is small.

(b) A naïve estimate of $\varepsilon$ is $\varepsilon = c(\sin p_F)^\alpha$, with $c, \alpha$ positive numbers; in other words we must take smaller and smaller $J_3$ for $p_F$ closer and closer to 0 or $\pi$, i.e. for magnetic fields of size close to 1. It is unclear at the moment if this is only a technical problem or a property of the model.

(c) If $J_1 \ne J_2$ and $J_3 \ne 0$ one can distinguish, as in the $J_3 = 0$ case (16.1), two regions in the behaviour of the correlation function $\Omega^3(\mathbf{x})$, discriminated by an intrinsic length which is given approximately by the inverse of the spectral gap. In the first region the bounds for the correlation function are the same as in the gapless $J_1 = J_2$ case, while in the second region there is a faster than any power decay with rate given essentially by the gap size, which is $O(|u|^{1+\eta_1})$, see (16.11), in agreement with (16.3), found by the exact solution. The interaction $J_3$ has the effect that the gap becomes anomalous and acquires a critical index $\eta_1$; the ratio between the renormalized and bare gap is very small or very large, if $u$ is small, depending on the sign of $J_3$. In the first region one can obtain the large distance asymptotic behaviour of $\Omega^3(\mathbf{x})$, see (16.12) and (16.13); in the second region only an upper bound is obtained, but even in the $J_3 = 0$ case we are not able to obtain more from the exact solution if $h \ne 0$. If $u = 0$ only the first region is present, as the spectral gap is vanishing.
(d) It is useful to compare the expression for the large distance behaviour of $\Omega^3(\mathbf{x})$ in the case $u = 0$ with its analogue for the Luttinger model (14.25). A first difference is that, while in the Luttinger model the Fermi momentum is independent of the interaction, in the XYZ model in general it is changed nontrivially by the interaction, unless the external magnetic field is zero, i.e. $p_F = \pi/2$. The reason is that the Luttinger model has special parity properties which are not satisfied by the XYZ chain (except if the magnetic field is vanishing).

(e) Another peculiar property of the Luttinger model correlation function is that it depends on $p_F$ only through the factor $\cos(2p_F x)$; this is true not only asymptotically (i.e. it is true not only in (14.25) but in the complete expression in [41,42]) and is due to a special symmetry of the Luttinger model (the Fermi momentum disappears from the Hamiltonian if a suitable redefinition of the fermionic fields is done, see [41,42]). This is of course not true in the XYZ model and in fact the dependence on $p_F$ of $\Omega^3(\mathbf{x})$ is very complicated. However, we will see that $\Omega^3(\mathbf{x})$ can be written as the sum of three terms, see (16.6), and from (16.8) and (16.9) we have that the derivatives of the first two terms verify the same bounds as their analogues in the Luttinger model (which were $p_F$ independent). This is not true for the third term $\Omega^{3,c}(\mathbf{x})$, in which there are possibly oscillating terms which make a bound on the derivatives like (16.8) and (16.9) false. However, we can prove that such a term is smaller for large distances, see (16.10) (note that $\tilde\gamma$ is $J_3$ and $u$ independent, contrary to $\eta_2$). Of course this is true only for small $J_3$ and it could be that such a third term plays an important role for larger $J_3$. If we compare (16.12) with $u = 0$ with (14.25) we see that the expressions
differ essentially by the factors $A_i(\mathbf{x})$, containing terms of higher order in our expansion. We can prove that the $A_i(\mathbf{x})$ verify (16.13) and that the derivatives verify bounds like (16.8) and (16.9), which means that the higher-order terms verify the same bound as the first-order terms, or the same bound as their analogues in the Luttinger model. However, the first-order terms, or (14.25), have subtle symmetry properties which are very important in analysing the Fourier transform. We are able to prove that $\Omega^{3,a}$ verifies (16.14), which says essentially that $v_0^*$ is the renormalized Fermi velocity; in fact the decomposition of $\Omega^{3,a}$ in the form of (16.12) with $A_1(\mathbf{x})$ verifying (16.13) is not unique, as one can replace $v_0^*$ with any velocity $\bar v_0^*$ of the form $\bar v_0^* = v_0^*(1 + O(\lambda))$ and an expression similar to (16.12) with $A_1(\mathbf{x})$ verifying (16.13) is still found; however, with $\bar v_0^*$ property (16.14) is not true, unless $\bar v_0^* = v_0^*$, and this allows us to say that $v_0^*$ is the renormalized Fermi velocity. We are not able, however, to prove a similar property for $A_2(\mathbf{x})$, see below.

(f) Another important property of the Luttinger model correlation function is the fact that the non-oscillating term does not acquire a critical index, contrary to what happens for the term oscillating with frequency $p_F/\pi$. In the Luttinger model the non-oscillating term of the correlation function is exactly (i.e. not asymptotically) equal to the noninteracting one. Again in the XYZ model this is not true, but one is naturally led to the conjecture that the critical index of $\Omega^{3,b}_{L,\beta}(\mathbf{x})$ is still vanishing, see for instance [Sp]. In our expansion, we have a series also for the critical index of $\Omega^{3,b}_{L,\beta}(\mathbf{x})$, and while an explicit computation of the first order gives a vanishing result, it is not obvious that this is true at any order. However, due to some hidden symmetries of the model (i.e.
symmetries enjoyed approximately by the relevant part of the effective action) we can prove that all the coefficients are vanishing, by proving a Ward identity. We want to stress that this is, to our knowledge, the first example in which an approximate Ward identity is proved in a rigorous way. The Ward identity we find is not the same as the one obtained by neglecting the regularizations and proceeding formally.

(g) The above properties can be used to study the equal time density correlation Fourier transform; if $J_3 = 0$ its first derivative at $k = \pm2p_F$ is logarithmically divergent at $u = 0$ and it is finite at $k = 0$; if $J_3 \ne 0$ the behaviour of the first derivative at $k = \pm2p_F$ is completely different, as it is finite if $J_3 > 0$ while it has a power-like singularity, if $u = 0$, if $J_3 < 0$, see item (e) in the theorem. This is due to the fact that the critical index $\eta_2$ appearing in the oscillating term in $\Omega^3_{L,\beta}(\mathbf{x})$ has the same sign as $J_3$ (note that $\eta_2$ has nothing to do with the critical index $\eta$ appearing in the two-point fermionic Schwinger function, which is $O((J_3)^2)$). On the other hand, the equal time density correlation Fourier transform near $k = 0$ of the Luttinger model, of the XYZ model or of the free fermionic gas ($J_1 = J_2$, $J_3 = 0$) behaves in the same way (see also [Sp] for a heuristic explanation). This is due to a parity cancellation in the expansion eliminating the apparent dimensional logarithmic divergence.

(h) From (14.25) in the $u = 0$ case we can see that the (bidimensional) Fourier transform can be singular only at $\mathbf{k} = (0,0)$ and $\mathbf{k} = (\pm2p_F, 0)$. If $J_3 = 0$ the singularity is logarithmic at $\mathbf{k} = (\pm2p_F, 0)$, but there is no singularity if $J_3 > 0$ and there is a power-like singularity if $J_3 < 0$, see item (d) in the theorem. Then the singularity at $\mathbf{k} = (\pm2p_F, 0)$ is of the same type as in the Luttinger model, see (14.25). However, we cannot conclude that the same is true for the Fourier transform at $\mathbf{k} = 0$, which is bounded in the Luttinger model, while here we cannot exclude a logarithmic divergence.
In order to get such a stronger result, it would be sufficient
to prove that the function $\Omega^{3,b}(\mathbf{x})$ is odd in the exchange of $(x, x_0)$ with $(x_0v, x/v)$, for some $v$; this property is true for the leading term corresponding to $\Omega^{3,b}(\mathbf{x})$ in (14.25), with $v = v_0$, but seems impossible to prove on the basis of our expansion. We can only see this symmetry for the leading term, with $v = v_0^*$.

(i) Note that our theorem cannot be proved by building a multiscale renormalized expansion, neither by taking the XY model as the "free model" and $J_3$ as the perturbative parameter, nor by taking the XXZ model as the free model and $u$ as the perturbative parameter. In fact, in order to solve the model, one cannot perform a single Bogoliubov transformation as in the $J_3 = 0$ case; the gap has a nontrivial flow and one has to perform a different Bogoliubov transformation for each renormalization group integration.

(l) If $u = 0$ the critical indices and $\nu$ can be computed with any prefixed precision; we write explicitly in the theorem only the first order for simplicity. However, if $u \ne 0$, they are not fixed uniquely; as far as $\nu$ is concerned, this means that, in the gapped case, the system is insensitive to variations of the magnetic field much smaller than the gap size.

(m) Finally, there is no reason for considering only a nearest-neighbour Hamiltonian like (2.10); it will be clear from the following analysis that our results still hold for non-nearest-neighbour spin Hamiltonians.
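The two regions described in comment (c) can be made concrete by evaluating the scale (16.11) and the right-hand sides of (16.8) and (16.9). This is a sketch only: the function names are hypothetical, and the constant $C_{n,N}$, which the theorem does not specify, is set to 1 as a placeholder.

```python
import math


def intrinsic_scale(u, eta1, v0, beta, L):
    """Scale Delta of (16.11): max(|u|^(1 + eta1), sqrt((v0 beta)^-2 + L^-2))."""
    return max(abs(u) ** (1.0 + eta1), math.sqrt((v0 * beta) ** -2 + L ** -2))


def envelope_a(r, Delta, eta2, n=0, N=4, C=1.0):
    """Right-hand side of (16.8) at |x| = r; C stands in for the unspecified C_{n,N}."""
    return C / (r ** (2.0 + 2.0 * eta2 + n) * (1.0 + (Delta * r) ** N))


def envelope_b(r, Delta, n=0, N=4, C=1.0):
    """Right-hand side of (16.9) at |x| = r; C stands in for the unspecified C_{n,N}."""
    return C / (r ** (2.0 + n) * (1.0 + (Delta * r) ** N))
```

For $r \ll \Delta^{-1}$ the envelopes reduce to pure power laws, with the anomalous exponent $2 + 2\eta_2$ in the oscillating part; for $r \gg \Delta^{-1}$ the factor $1 + (\Delta r)^N$ takes over, and since $N$ is arbitrary the decay is faster than any power.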
17. Spinning fermions

17.1. The repulsive case

If the fermions are spinning, the general scheme is the same as the one discussed for spinless fermions, but new complications arise from the fact that the number of running coupling constants is much higher. Let us consider a system of spinning fermions on a lattice in the not filled band case with Hamiltonian
\[ H = H_0 + \lambda V + \nu N_0 \tag{17.1} \]
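with $H_0, N_0$ given by (2.1), and $V$ given by (2.5). This case was studied in [18], to which we refer for details. One can define an anomalous integration similar to the one in Section 8 for spinless fermions; the localization operator is defined by (8.19)–(8.21). The spin has the effect that there are more running coupling constants; in fact the relevant part of the effective potential, which in the spinless case is given by (8.25), is, if $p_F \ne 0, \pi$ for any integer $n$:
\[ \mathcal{L}V^{(h)} = \gamma^h\nu_hF_\nu^{(h)} + \delta_hF_z^{(h)} + g^1_hF_1^{(h)} + g^2_hF_2^{(h)} + g^4_hF_4^{(h)} + \delta_{p_F,\pi/2}\,g^3_hF_3^{(h)}, \tag{17.2} \]
where
\[ F_1^{(h)} = \frac{1}{(L\beta)^4}\sum_{\mathbf{k}_1,\ldots,\mathbf{k}_4\in D_{L,\beta}}\ \sum_{\omega,\sigma,\sigma'} \psi^{(\le h)+}_{\mathbf{k}_1+\omega p_F,\omega,\sigma}\,\psi^{(\le h)+}_{\mathbf{k}_2-\omega p_F,-\omega,\sigma'}\,\psi^{(\le h)-}_{\mathbf{k}_3+\omega p_F,\omega,\sigma'}\,\psi^{(\le h)-}_{\mathbf{k}_4-\omega p_F,-\omega,\sigma}\ \delta\Big(\sum_{i=1}^4\sigma_i\mathbf{k}_i\Big), \]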
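\[ F_2^{(h)} = \frac{1}{(L\beta)^4}\sum_{\mathbf{k}_1,\ldots,\mathbf{k}_4\in D_{L,\beta}}\ \sum_{\omega,\sigma,\sigma'} \psi^{(\le h)+}_{\mathbf{k}_1+\omega p_F,\omega,\sigma}\,\psi^{(\le h)+}_{\mathbf{k}_2-\omega p_F,-\omega,\sigma'}\,\psi^{(\le h)-}_{\mathbf{k}_3-\omega p_F,-\omega,\sigma'}\,\psi^{(\le h)-}_{\mathbf{k}_4+\omega p_F,\omega,\sigma}\ \delta\Big(\sum_{i=1}^4\sigma_i\mathbf{k}_i\Big), \tag{17.3} \]
\[ F_4^{(h)} = \frac{1}{(L\beta)^4}\sum_{\mathbf{k}_1,\ldots,\mathbf{k}_4\in D_{L,\beta}}\ \sum_{\omega,\sigma,\sigma'} \psi^{(\le h)+}_{\mathbf{k}_1+\omega p_F,\omega,\sigma}\,\psi^{(\le h)+}_{\mathbf{k}_2+\omega p_F,\omega,\sigma'}\,\psi^{(\le h)-}_{\mathbf{k}_3+\omega p_F,\omega,\sigma'}\,\psi^{(\le h)-}_{\mathbf{k}_4+\omega p_F,\omega,\sigma}\ \delta\Big(\sum_{i=1}^4\sigma_i\mathbf{k}_i\Big), \tag{17.4} \]
\[ F_3^{(h)} = \frac{1}{(L\beta)^4}\sum_{\mathbf{k}_1,\ldots,\mathbf{k}_4\in D_{L,\beta}}\ \sum_{\omega,\sigma,\sigma'} \psi^{(\le h)+}_{\mathbf{k}_1+\omega p_F,\omega,\sigma}\,\psi^{(\le h)+}_{\mathbf{k}_2+\omega p_F,\omega,\sigma'}\,\psi^{(\le h)-}_{\mathbf{k}_3-\omega p_F,-\omega,\sigma'}\,\psi^{(\le h)-}_{\mathbf{k}_4-\omega p_F,-\omega,\sigma}\ \delta\Big(\sum_{i=1}^4\sigma_i\mathbf{k}_i\Big), \]
and
\[ g^2_0 = \lambda\hat v(0) + O(\lambda^2), \quad g^4_0 = \lambda\hat v(0) + O(\lambda^2), \quad g^1_0 = \lambda\hat v(2p_F) + O(\lambda^2), \quad g^3_0 = \lambda\hat v(2p_F) + O(\lambda^2). \]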
Note that $g^2_h, g^4_h$ correspond to an interaction with a small exchange of momentum and are called forward scattering processes; $g^1_h$ corresponds to an interaction with a big exchange of momentum and is called backward scattering. Finally $g^3_h$ is possible only at $p_F = \pi/2$ and it is an Umklapp scattering. Of course one can obtain the analyticity of the beta function if the running coupling constants are small enough, proving a result similar to Theorem 1 in Section 8. However the flow of the running coupling constants is now much more complex. We consider the case $p_F \ne 0, \pi/2, \pi$; the renormalization group flow equations for the running coupling constants $g^1_h, g^2_h, g^4_h$ are given by, if $\mu_h = (g^2_h, g^4_h, \delta_h)$,
\[ g^1_{h-1} = g^1_h + g^1_h\big[-\beta g^1_h + \beta^1_h(\vec v_h, \ldots, \vec v_0)\big], \]
\[ g^2_{h-1} = g^2_h + g^1_h\Big[-\frac{\beta}{2}g^1_h + \hat\beta^2_h(\vec v_h, \ldots, \vec v_0)\Big] + \beta^2_h(\mu_h, \nu_h; \ldots; \mu_0, \nu_0), \]
\[ g^4_{h-1} = g^4_h + g^1_h\,\hat\beta^4_h(\vec v_h, \ldots, \vec v_0) + \beta^4_h(\mu_h, \nu_h; \ldots; \mu_0, \nu_0) \]
with $\beta > 0$, where we have written explicitly the second-order terms. Note that, by trivial symmetry considerations, any contribution to $g^1_{h-1}$ has at least a $g^1$ endpoint. Truncating the above equations at the second order we see that $g^1_h \to 0$ if $g^1_0 > 0$, while it grows, exiting the radius of convergence of the beta function, if $g^1_0 < 0$. We consider for the moment the repulsive case $\hat v(2p_F) > 0$. One can proceed as in Section 10, dividing the beta function into a part depending only on the Luttinger model part of the propagator $g^{(h)}_\omega$ (see Lemma 2 in Section 8) plus a "correction" which is smaller by a factor $\gamma^{\eta h}$. Moreover, one can fix the counterterm $\nu$ so that $\nu_h = O(\gamma^{\eta h})$, so dividing, as in Section 10, the beta function into a part independent of $\nu_h$
plus a correction smaller by a factor $\gamma^{\eta h}$. Let $\beta_h^i(\vec\mu_h,\nu_h;\ldots;\vec\mu_0,\nu_0)$ be the function obtained from $\beta_h^i(\vec v_h,\ldots,\vec v_0)$ by setting $g_h^1 = \nu_h = 0$; one can show, see [18], that if

$$\beta_h^2(\vec\mu_h,0;\ldots;\vec\mu_h,0) = 0, \qquad (17.5)$$
$$\beta_h^4(\vec\mu_h,0;\ldots;\vec\mu_h,0) = 0, \qquad (17.6)$$
$$\beta_h^1(\vec\mu_h,0;\ldots;\vec\mu_h,0) = 0, \qquad (17.7)$$
$$\hat\beta_h^2(\vec\mu_h,0;\ldots;\vec\mu_h,0) = 0, \qquad (17.8)$$
$$\hat\beta_h^4(\vec\mu_h,0;\ldots;\vec\mu_h,0) = 0, \qquad (17.9)$$
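then it is possible to choose a counterterm $\nu$ such that, if $\lambda\hat v(2p_F) > 0$, then

$$g_h^1 \to_{h\to-\infty} 0, \qquad \nu_h \to_{h\to-\infty} 0, \qquad \frac{Z_{h-1}}{Z_h} \to_{h\to-\infty} \gamma^{\eta},$$

and $g_h^2, g_h^4, \delta_h \to_{h\to-\infty} g_\infty^2, g_\infty^4, \delta_\infty$, with $\eta = a\lambda^2 + O(\lambda^3)$, $a > 0$, and $g_\infty^2 = g_0^2 + O(\lambda^2)$, $g_\infty^4 = g_0^4 + O(\lambda^2)$, $\delta_\infty = O(\lambda^2)$. In order to prove (17.5)–(17.9) we follow essentially the same strategy as in the spinless case, see Section 11, but in the spinning case the role of the Luttinger model is played by the Mattis model, with Hamiltonian

$$H = \sum_{\omega=\pm1}\sum_{\sigma=\pm1/2}\int_0^L dx\,(1+\delta)\,\psi^+_{\omega,\sigma,x}(i\omega\partial_x - p_F)\psi^-_{\omega,\sigma,x}$$
$$\quad + g_{2,p}\sum_{\omega,\sigma}\int_0^L dx\int_0^L dy\, v(x-y)\,\psi^+_{\omega,x,\sigma}\psi^-_{\omega,x,\sigma}\psi^+_{-\omega,y,\sigma}\psi^-_{-\omega,y,\sigma}$$
$$\quad + g_{2,o}\sum_{\omega,\sigma}\int_0^L dx\int_0^L dy\, v(x-y)\,\psi^+_{\omega,x,\sigma}\psi^-_{\omega,x,\sigma}\psi^+_{-\omega,y,-\sigma}\psi^-_{-\omega,y,-\sigma}$$
$$\quad + g_{4,p}\sum_{\omega,\sigma}\int_0^L dx\int_0^L dy\, v(x-y)\,\psi^+_{\omega,x,\sigma}\psi^-_{\omega,x,\sigma}\psi^+_{\omega,y,\sigma}\psi^-_{\omega,y,\sigma}$$
$$\quad + g_{4,o}\sum_{\omega,\sigma}\int_0^L dx\int_0^L dy\, v(x-y)\,\psi^+_{\omega,x,\sigma}\psi^-_{\omega,x,\sigma}\psi^+_{\omega,y,-\sigma}\psi^-_{\omega,y,-\sigma}.$$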
This model is also exactly solvable, see [113], and its Schwinger functions can be explicitly computed [81].
Reasoning as in Section 11 one can study the above model by the renormalization group. Let us start from the spin symmetric Mattis model, $g_{2,p} = g_{2,o}$ and $g_{4,p} = g_{4,o}$, in which one obtains an expression for the relevant part of the effective potential similar to (17.2) but with $g_h^1 = g_h^3 = \nu_h = \delta_h \equiv 0$. As the finite volume Schwinger functions of the Mattis model are known, we can reason exactly as in Section 11 and we obtain (17.5) and (17.6). In order to prove (17.7) and (17.8) we study by the renormalization group the nonspin symmetric Mattis model, in which $g_{2,p} \neq g_{2,o}$ and $g_{4,p} \neq g_{4,o}$. One obtains an expression for the relevant part of the effective potential similar to (17.2) but with $g_h^1 = g_h^3 = \nu_h = 0$, namely

$$\mathcal{L}V^{(h)} = g_h^{2,p}F^{(h)}_{2,p} + g_h^{2,o}F^{(h)}_{2,o} + g_h^{4,o}F^{(h)}_{4,o},$$

where $F^{(h)}_{2,p}$ and $F^{(h)}_{2,o}$ are given by (17.3) with $\sigma' = \sigma$ and $\sigma' = -\sigma$, respectively, and in the same way one defines $F^{(h)}_{p,4} = 0$ and $F^{(h)}_{o,4}$, see (17.4). The beta function driving the flow of $g_h^{i,\alpha}$, with $i = 2, 4$ and $\alpha = o, p$, of the nonspin symmetric Mattis model, with all the running coupling constants taken at the same scale, can be written as

$$\sum_{n_1,\ldots,n_4}[g_h^{4,o}]^{n_1}[g_h^{2,p}]^{n_2}[g_h^{2,o}]^{n_3}[\delta_h]^{n_4}\,\beta^{(h)}_{i,\alpha;n_1,\ldots,n_4}. \qquad (17.10)$$
Again reasoning as in Section 11, the comparison with the Schwinger functions of the nonspin symmetric Mattis model implies the vanishing of (17.10), and from the independence of $g_h^{4,o}$, $g_h^{2,p}$, $g_h^{2,o}$ it follows that

$$\beta^{(h)}_{i,\alpha;n_1,\ldots,n_4} = 0. \qquad (17.11)$$
Let us return to the spin symmetric model with effective potential given by (17.2) and $g_h^{2,p} = g_h^{2,o}$, $g_h^{1,p} = g_h^{1,o}$. By the conservation of the quasi-particle and spin indices, it is not possible to have a contribution to $g^2_{h-1}$ involving only one $g_h^{1,o}$ and any number of $\vec\mu_h$; then the only possibility is to have a contribution to $g^2_{h-1}$ involving only one $g_h^{1,p}$ and any number of $\vec\mu_h$. But such a contribution is equal to

$$[g_h^{o,4}]^{n_1}[g_h^{p,2}]^{n_2-1}[g_h^{o,2}]^{n_3}[g_h^{p,1}][\delta_h]^{n_4}\,\beta^{(h)}_{2,\alpha;n_1,\ldots,n_4}, \qquad (17.12)$$
so it vanishes. In fact the functions $\beta^{(h)}_{2,\alpha;n_1,\ldots,n_4}$ in (17.12) and (17.10) are the same, as $F^{(h)}_{p,2} = F^{(h)}_{p,1}$. This proves (17.8). The same argument can be repeated for $i = 4$, so proving (17.9). Finally, let us consider the contribution to $g^1_{h-1}$ involving only one $g_h^1$ and any number of $\vec\mu_h$. We consider a contribution to $g^{p,1}_{h-1}$; by symmetry considerations it follows that there is no contribution to $g^{p,1}_{h-1}$ involving one $g_h^{o,1}$ and any number of $\vec\mu_h$, and the only possibility is a contribution involving one $g_h^{p,1}$ and any number of $\vec\mu_h$. But replacing $g_h^{p,1}$ with $g_h^{p,2}$ and remembering that $F^{(h)}_{p,2} = F^{(h)}_{p,1}$, this contribution coincides with a contribution to $g^{p,2}_{h-1}$, so it vanishes by (17.11). On the other hand, we are considering the spin symmetric case, so $g^{p,1}_{h-1} = g^{o,1}_{h-1}$ and (17.7) is proved.
In conclusion, the following theorem can be proved (the proof in [18] refers to the continuum case):

Theorem 10. Given Hamiltonian (17.1) for spinning fermions with $p_F \neq 0, \pi/2, \pi$, if $\lambda\hat v(2p_F) > 0$ there exists an $\varepsilon > 0$ such that, for $|\lambda| \le \varepsilon$, there are functions $\nu(\lambda)$, $\eta(\lambda)$ such that the two-point Schwinger function is given by

$$S(x,y) = \frac{g(x,y)}{|x-y|^{\eta}} + \frac{A(x,y)}{|x-y|^{1+\eta}},$$

with $A(x,y)$ bounded by a constant, $\nu(\lambda) = O(\lambda)$ and $\eta = a\lambda^2 + O(\lambda^3)$, with $a > 0$.

In the half-filled band case $p_F = \pi/2$ there is one more running coupling constant, $g_h^3$, whose second-order flow is not trivial and is given by

$$g^3_{h-1} = g^3_h + \beta g^3_h(g^1_h - 2g^2_h),$$

so that the flow of the running coupling constants becomes much more complex to study. It is quite clear that one can add to Hamiltonian (17.1) a term $uP$ representing the interaction with a commensurate or an incommensurate potential; in the repulsive case $\lambda\hat v(2p_F) > 0$, and under suitable conditions on $p_F$ forbidding the appearance of extra running coupling constants (for instance, if $p/\pi$ is a rational number we require $p_F \neq np/2$ for any integer $n$), one can prove results similar to their analogues in the spinless case, see Section 13.
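The qualitative behaviour of the second-order truncation of the flow of $g^1_h$ and $g^2_h$ can be checked numerically. The following is only a minimal sketch: the function name `truncated_flow`, the value `beta = 1.0`, the number of steps and the escape radius are illustrative assumptions, and all higher-order terms of the beta function are simply dropped.

```python
def truncated_flow(g1_0, g2_0, beta=1.0, steps=200, radius=1.0):
    """Iterate the second-order truncation of the flow,
        g1_{h-1} = g1_h - beta * g1_h**2
        g2_{h-1} = g2_h - (beta / 2) * g1_h**2
    starting from (g1_0, g2_0). The iteration stops early once |g1|
    exceeds `radius`, a stand-in for leaving the convergence radius.
    Returns the list of (g1, g2) pairs visited, initial point included.
    """
    g1, g2 = g1_0, g2_0
    history = [(g1, g2)]
    for _ in range(steps):
        # Both updates use the value of g1 at the current scale h.
        g1, g2 = g1 - beta * g1 * g1, g2 - 0.5 * beta * g1 * g1
        history.append((g1, g2))
        if abs(g1) > radius:
            break
    return history


if __name__ == "__main__":
    repulsive = truncated_flow(0.1, 0.1)
    attractive = truncated_flow(-0.1, 0.1)
    print("repulsive: final g1 =", repulsive[-1][0], "after", len(repulsive) - 1, "steps")
    print("attractive: final g1 =", attractive[-1][0], "after", len(attractive) - 1, "steps")
```

In this truncation a positive initial $g^1_0$ decays roughly like the inverse of the number of steps while $g^2_h$ settles to a finite value, whereas a negative $g^1_0$ of size $|g^1_0|$ leaves the unit ball after a number of steps of order $1/(\beta|g^1_0|)$.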
17.2. The attractive case

The analysis above shows that the presence of the spin, if $p_F \neq 0, \pi/2, \pi$ and the interaction is repulsive, is in some sense irrelevant, as the asymptotic behaviour of the two-point Schwinger function is similar to that of the spinless case. The situation is completely different in the attractive case $\lambda\hat v(2p_F) < 0$, in which the running coupling constants do not remain within the convergence radius of the series for the Beta function unless, in the infinite volume limit, the temperature is larger than $e^{-\kappa/|\lambda\hat v(2p_F)|}$ for some suitable constant $\kappa$. It is easy in fact to check that for $h > h_{\bar\beta} = O(\log\bar\beta^{-1})$, with $\bar\beta \le e^{\kappa/|\lambda\hat v(2p_F)|}$, the running coupling constants remain $O(\lambda)$. It is generally believed that the growth of the coupling $g_1^{(h)}$ in the attractive case, or of $g_3^{(h)}$ if $p_F = \pi/2$ (again in the attractive case), is related to the opening of a gap and to the exponential decay of correlations. Our result gives an upper bound on a possible gap in the ground state energy, saying that $|\Delta| \le e^{-\kappa/|\lambda\hat v(2p_F)|}$. A proof that there really is a gap in the spectrum is up to now lacking, except in the remarkable case of the Hubbard model; it is a particular case of the model we are considering, in which $v(x-y) = \delta_{x,y}$ and $p_F = \pi/2$. In this case it was proved in [5] that the ground state has a gap for any $\lambda < 0$; moreover, the ground state is such that each site is occupied by an electron and the spins are alternating (hence a spin density wave with period $1/\rho$). In the general situation, only mean-field approximations are at our disposal; a very simple heuristic mean-field argument, from which one can deduce the appearance of a gap from the growth of $g_1^{(h)}$, is the following one. As $g_h^1$ is the unstable process, this suggests that the relevant interactions involve the exchange of a momentum of order $2p_F$, so that the important terms in the
interaction are of the form, for $|k|, |k'| \le p_F/4$ (say),

$$\sum_{\omega,\sigma}\frac{1}{L}\sum_{k}\psi^+_{k+\omega p_F,\sigma}\psi^-_{k-\omega p_F,\sigma}\,\frac{1}{L}\sum_{k'}\psi^+_{k'-\omega p_F,-\sigma}\psi^-_{k'+\omega p_F,-\sigma}. \qquad (17.13)$$

Making a BCS-type mean-field theory we write

$$|S|e^{i\alpha} = \frac{1}{L}\sum_k\psi^+_{k+p_F,\sigma}\psi^-_{k-p_F,\sigma}$$

and neglecting quantum fluctuations one obtains an effective interaction $\sum_{x,\sigma}|S|\cos(2p_Fx+\alpha)\,\psi^+_{x,\sigma}\psi^-_{x,\sigma}$, from which the existence of a gap at the Fermi surface can be deduced. In this argument there is, however, a flaw: it does not take into account the fact that, if $p_F/\pi$ is irrational, it can happen that $2np_F \simeq 2p_F \mod 2\pi$ for very large $n$, so that it is not a priori true that the interactions exchanging momenta $O(2np_F)$ are negligible. A more correct way to perform a mean-field analysis is the following one. In the interaction (assumed local for simplicity) $\psi^+_{x,\sigma}\psi^-_{x,\sigma}\psi^+_{x,-\sigma}\psi^-_{x,-\sigma}$ one can replace two fermionic fields with a classical field,

$$\psi^+_{x,\sigma}\psi^-_{x,\sigma} \to \varphi(x) + [\psi^+_{x,\sigma}\psi^-_{x,\sigma} - \varphi(x)], \qquad (17.14)$$

neglecting (this is the approximation) terms quadratic in the "fluctuations" $[\psi^+_{x,\sigma}\psi^-_{x,\sigma} - \varphi(x)]$, thus obtaining a model

$$H_0 + \sum_{x\in\Lambda}\varphi(x)\,\psi^+_{x,-\sigma}\psi^-_{x,-\sigma} - \sum_{x\in\Lambda}\varphi_x^2. \qquad (17.15)$$
This model is called the variational Holstein model, and the nontrivial problem is to minimize the ground-state energy with respect to $\varphi$. One arrives at the same model also by considering the interaction of fermions with a phonon field and neglecting quantum fluctuations; this will be discussed in Section 19. We anticipate that even in this approximation the existence of periodic ground states (which can be commensurate or incommensurate depending on whether $p_F/\pi$ is a rational or an irrational number) is not trivial (for instance, it is not proved for small $\lambda$ and $p_F/\pi$ irrational, see below). In other words, even in a mean-field model the existence of a gap is not proven, in general, in the attractive case.
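The replacement (17.14) can be made explicit with a short algebraic identity. The sketch below assumes, purely for illustration, that both spin densities are shifted by the same classical field; the shorthand $n_{x,\sigma} = \psi^+_{x,\sigma}\psi^-_{x,\sigma}$ is introduced here and is not notation used in the text.

```latex
n_{x,\sigma}\, n_{x,-\sigma}
  = \varphi(x)\, n_{x,-\sigma} + \varphi(x)\, n_{x,\sigma} - \varphi(x)^2
  + \bigl[n_{x,\sigma} - \varphi(x)\bigr]\bigl[n_{x,-\sigma} - \varphi(x)\bigr]
```

Dropping the last term, which is quadratic in the fluctuations, leaves an expression that is quadratic in the fermionic fields (fermions in an external field $\varphi$) plus a term in $\varphi(x)^2$, which is the structure of (17.15).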
18. Fermions interacting with phonon fields

18.1. Interaction with a quantized phonon field

The Hamiltonian of a system of one-dimensional fermions on a lattice interacting locally with the optical modes of a quantized phonon field is given by (2.8) and (2.9). We refer to [78] for
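more details. The two-point Schwinger function can be written as

$$S(\mathbf{x},\mathbf{y}) = \frac{\int P(d\Phi)\int P(d\psi)\,e^{-g\tilde V}\,\psi^+_{\mathbf{x},\sigma}\psi^-_{\mathbf{y},\sigma}}{\int P(d\Phi)\int P(d\psi)\,e^{-g\tilde V}}, \qquad (18.1)$$

where $P(d\Phi)$ is a bosonic integration with propagator

$$v(\mathbf{x},\mathbf{y}) = \frac{1}{L\beta}\sum_{\mathbf{k}:\ e^{i\beta k_0}=1,\ e^{ikL}=1,\ |k|\le\pi}\frac{e^{-i\mathbf{k}(\mathbf{x}-\mathbf{y})}}{\mu_0^2k_0^2 + 1 + 2b^2(1-\cos k)}, \qquad (18.2)$$

with

$$|v(\mathbf{x},\mathbf{0})| \le \frac{C(b)}{\mu_0}\,e^{-\mu_0^{-1}|x_0|}\,e^{-\alpha(b)|x|}$$

and

$$\alpha(b) = \begin{cases} O(b^{-1}) & \text{for } b\to\infty, \\ O(\log b^{-1}) & \text{for } b\to 0, \end{cases} \qquad (18.3)$$

$$C(b) = \begin{cases} O(1) & \text{for } b\to 0, \\ O(b^{-1}\log b) & \text{for } b\to\infty. \end{cases} \qquad (18.4)$$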
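Integrating out the boson fields in (18.1) we obtain

$$S(\mathbf{x},\mathbf{y}) = \frac{\int P(d\psi)\,e^{g^2V}\,\psi^+_{\mathbf{x},\sigma}\psi^-_{\mathbf{y},\sigma}}{\int P(d\psi)\,e^{g^2V}}, \qquad (18.5)$$

with

$$V = \frac{1}{8}\sum_{x,y\in\Lambda}\int_{-\beta/2}^{\beta/2}dx_0\int_{-\beta/2}^{\beta/2}dy_0\; v(\mathbf{x}-\mathbf{y})\,\psi^+_{\mathbf{x},\sigma}\psi^-_{\mathbf{x},\sigma}\psi^+_{\mathbf{y},\sigma'}\psi^-_{\mathbf{y},\sigma'}. \qquad (18.6)$$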
The only difference with respect to the previously considered interacting spinless Hamiltonian is that the interaction is not local in time; it is easy to check that this changes nothing in the previous discussion. Then in the spinless case one can prove that the Schwinger function has an anomalous behaviour; of course the convergence radius vanishes as $b \to \infty$ (corresponding to long range interactions, i.e. $p_0 \to 0$); it also vanishes if $\mu_0 \to \infty$. In the spinning case one is in the situation of the preceding section, so results are found only for temperatures greater than $e^{-\kappa/g^2}$.

18.2. Classical limit: the static Holstein model

We can study the above model also in the "static" limit in which the quantum fluctuations are neglected, formally corresponding to $\mu_0 = \infty$, $b = 0$; in this way one gets again the variational Holstein model [107] found at the end of the previous section.
The ground-state problem is now equivalent to finding the field minimizing the fermionic ground-state energy. Before discussing this model, we stress again that the relationship between the variational Holstein model and the models considered in this and in the preceding sections is not well understood. Surely, if there is no spin, the quantum fluctuations completely change the behaviour (the static Holstein model makes no distinction between spinning and spinless fermions), at least for small interactions.
19. The variational Holstein model

19.1. Old results

In the two preceding sections, we arrived at the variational Holstein model either by considering a mean-field model for spinning fermions with an attractive potential or by considering a semi-classical model for the phonon–fermion interaction. The problem is to find the function $\varphi(x)$ minimizing the ground-state energy of a system of fermions with Hamiltonian

$$H \equiv H_L^{el} + \frac{1}{2}\sum_{x\in\Lambda}\varphi^2(x) = \sum_{x,y\in\Lambda}t_{xy}\,\psi^+_x\psi^-_y - \mu\sum_{x\in\Lambda}\psi^+_x\psi^-_x - \lambda\sum_{x\in\Lambda}\varphi(x)\,\psi^+_x\psi^-_x + \frac{1}{2}\sum_{x\in\Lambda}\varphi^2(x). \qquad (19.1)$$
At finite $L$, the fermionic Fock space is finite dimensional, hence there is a minimum eigenvalue $E_L^{el}(\varphi,\mu)$ of the operator $H_L^{el}$, for each given phonon field $\varphi$ and each value of $\mu$; let $\rho_L(\varphi,\mu)$ be the corresponding fermionic density. The aim is to minimize the functional

$$F_L(\varphi,\mu) = E_L^{el}(\varphi,\mu) + \frac{1}{2}\sum_{x\in\Lambda}\varphi_x^2, \qquad (19.2)$$

subject to the condition

$$\rho_L(\varphi,\mu) = \rho_L, \qquad (19.3)$$

where $\rho_L$ is a fixed value of the density, converging for $L \to \infty$, say to $\rho$. It is generally believed that, as a consequence of Peierls' instability argument [82], in the limit $L \to \infty$ there is a field $\varphi^{(0)}$, uniquely defined up to a spatial translation, which minimizes (19.2) with constraint (19.3), and that it is a function of the form $\bar\varphi(2\pi\rho x)$, where $\bar\varphi(u)$ is a $2\pi$-periodic function of $u$. This is physically interpreted by saying that one-dimensional metals are unstable at low temperature, in the sense that they can lower their energy through a periodic distortion of the "physical lattice" with period $1/\rho$ (in the continuous version of the model, since $1/\rho$ is not an integer in general). There are a few results about this model in the literature.

(1) An exact result [84,110] makes rigorous the theory of Peierls instability for model (19.1) in the case $\rho = \rho_L = 1/2$ (half-filled band case), for any value of $\lambda$. In fact, in this case it has
been proved that there is a global minimum of $F_L(\varphi)$ of the form $\varepsilon(\lambda)(-1)^x$, where $\varepsilon(\lambda)$ is a suitable function of $\lambda$. This means that the periodicity of the ground state phonon field is 2 (recall that in our units 1 is just the lattice spacing): this phenomenon is called dimerization. The proof relies heavily on symmetry properties which hold only in the half-filled case. As in the case of the Hubbard model, the special symmetries at $p_F = \pi/2$ play a crucial role.

(2) In [85,95] Peierls instability for the Holstein model is proven assuming $\lambda$ large enough: in that case the fermions are almost classical particles and the quantum effects are treated as perturbations. The results hold for the commensurate and the incommensurate case; in particular, in the incommensurate case the function $\bar\varphi(u)$, related to the minimizing field through the relation $\varphi(x) = \bar\varphi(2\pi\rho x)$, has infinitely many discontinuities. On the contrary, in the small $\lambda$ case, according to numerical results, $\bar\varphi(u)$ has been conjectured to be an analytic function of its argument, both in the commensurate and in the incommensurate case [85]. These results are closely related to the existence of the so-called "Aubry–Mather" sets in Classical Mechanics.

19.2. New results

We discuss here a result of [42], found using the RG methods reviewed above, in the case of small $\lambda$ and any $p_F$. A local minimum of (19.2) satisfying (19.3) must fulfil the conditions

$$\varphi(x) = \lambda\rho_x(\varphi,\mu), \qquad \rho_L = \frac{1}{L}\sum_x\rho_x(\varphi,\mu), \qquad (19.4)$$

and

$$M_{xy} \equiv \delta_{xy} - \lambda\frac{\partial}{\partial\varphi_x}\rho_y(\varphi,\mu) \ \text{ is positive definite}. \qquad (19.5)$$
If $\varphi$ is a solution of (19.4), it must satisfy the condition $\hat\varphi_0 = L^{-1}\sum_x\varphi(x) = \lambda\rho_L$. On the other hand, if we define $\chi_x = \varphi(x) - \hat\varphi_0$, we see immediately that $\rho_L(\varphi,\mu) = \rho_L(\chi,\mu+\lambda\hat\varphi_0)$. It follows that we can restrict our search for local minima of (19.2) to fields $\varphi$ with zero mean, satisfying the conditions

$$\varphi(x) = \lambda(\rho_x(\varphi,\mu) - \rho_L), \qquad \rho_L = \frac{1}{L}\sum_x\rho_x(\varphi,\mu), \qquad (19.6)$$
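The reduction to zero-mean fields rests on a one-line rearrangement of the fermionic part of the Hamiltonian. The display below is a sketch that assumes the sign conventions of (19.1), namely a chemical potential term $-\mu\sum_x\psi^+_x\psi^-_x$ and a coupling $-\lambda\sum_x\varphi(x)\psi^+_x\psi^-_x$.

```latex
-\mu \sum_{x} \psi^+_x\psi^-_x - \lambda \sum_{x} \varphi(x)\,\psi^+_x\psi^-_x
  = -(\mu + \lambda\hat\varphi_0) \sum_{x} \psi^+_x\psi^-_x
    - \lambda \sum_{x} \chi_x\,\psi^+_x\psi^-_x ,
\qquad \chi_x = \varphi(x) - \hat\varphi_0
```

The mean of the field therefore only renormalizes the chemical potential, which is why the density computed with $(\varphi,\mu)$ coincides with the one computed with $(\chi,\mu+\lambda\hat\varphi_0)$.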
and condition (19.5). Of course, if the field $\varphi(x)$ satisfies (19.6), the same is true for the translated field $\varphi(x+n)$, for any integer $n$. On the other hand, one expects that the solutions of (19.6) are even with respect to some point of $\Lambda$; hence we can eliminate the trivial source of nonuniqueness described above by imposing the further condition $\varphi(x) = \varphi(-x)$. We shall then consider only fields of
the form

$$\varphi(x) = \sum_{n=-[L/2]}^{[(L-1)/2]}\hat\varphi_n\,e^{i2n\pi x/L}, \qquad \hat\varphi_{-n} = \hat\varphi_n \in \mathbb{R}, \qquad \hat\varphi_0 = 0. \qquad (19.7)$$
We want to consider the case of rational density, $\rho = P/Q$, with $P$ and $Q$ relatively prime, and we want to look for solutions such that $\varphi(x) = \varphi(x+Q)$. Hence, we shall look for solutions of (19.6) with $L = L_i = iQ$, $\rho_L = \rho$, and

$$\varphi(x) = \sum_{n=-[Q/2]}^{[(Q-1)/2]}\hat\varphi_n\,e^{i2\pi\rho nx}, \qquad \hat\varphi_n = \hat\varphi_{-n} \in \mathbb{R}, \qquad \hat\varphi_0 = 0. \qquad (19.8)$$
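The properties encoded in the representation (19.8) (a real field, even in $x$, of period $Q$ and zero mean) can be verified directly. The snippet below is a minimal sketch; the function name `field_from_coefficients` and the sample values of $P$, $Q$ and of the coefficients are illustrative choices, not data from the text.

```python
import cmath
import math


def field_from_coefficients(coeffs, P, Q, x):
    """Evaluate phi(x) = sum_n phihat_n * exp(i 2 pi (P/Q) n x) for
    n = -[Q/2], ..., [(Q-1)/2], with phihat_{-n} = phihat_n real and
    phihat_0 = 0. `coeffs` maps |n| >= 1 to the real value phihat_n.
    """
    total = 0j
    for n in range(-(Q // 2), (Q - 1) // 2 + 1):
        if n == 0:
            continue  # zero-mean condition: phihat_0 = 0
        amplitude = coeffs.get(abs(n), 0.0)
        total += amplitude * cmath.exp(2j * math.pi * P * n * x / Q)
    return total


if __name__ == "__main__":
    P, Q = 2, 5  # relatively prime, density rho = 2/5
    coeffs = {1: 0.3, 2: -0.1}
    for x in range(Q):
        print(x, field_from_coefficients(coeffs, P, Q, x).real)
```

For integer lattice sites the imaginary part cancels between the terms $n$ and $-n$, so only the real part is printed.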
Note that the condition on $L$ allows us to rewrite in a trivial way the field $\varphi(x)$ of (19.8) in the general form (19.7), by putting $\hat\varphi_n = 0$ for all $n$ such that $(2n\pi)/L \neq 2\pi\rho m$, $\forall m$, and by relabelling the other Fourier coefficients. Conditions (19.6) can be easily expressed in terms of the variables $\hat\varphi_n$; if we define $\hat\rho_n$ so that

$$\rho_x(\varphi,\mu) = \sum_{n=-[Q/2]}^{[(Q-1)/2]}\hat\rho_n(\varphi,\mu)\,e^{i2n\pi\rho x}, \qquad (19.9)$$

we get

$$\hat\varphi_n = \lambda\hat\rho_n(\varphi,\mu), \qquad n \neq 0,\ n = -[Q/2],\ldots,[(Q-1)/2], \qquad (19.10)$$
$$\hat\rho_0(\varphi,\mu) = \rho_L. \qquad (19.11)$$
Also the minimum condition (19.5) can be expressed in terms of the Fourier coefficients; we obtain the $L \times L$ matrix

$$\bar M_{nm} \equiv \delta_{nm} - \lambda\frac{\partial}{\partial\hat\varphi_n}\hat\rho_m(\varphi,\mu), \qquad (19.12)$$

which has to be positive definite, if the field $\varphi$ satisfies (19.10) and (19.11) and $\hat\rho_m(\varphi,\mu)$ is defined analogously to $\hat\varphi_m$ in (19.8). Hence, if we restrict the space of phonon fields to those of the form (19.8), we have to show that the $Q \times Q$ matrix

$$\tilde M_{nm} \equiv \delta_{nm} - \lambda\frac{\partial}{\partial\hat\varphi_n}\hat\rho_m(\varphi,\mu) \qquad (19.13)$$

is positive definite, if the field $\varphi$ satisfies (19.10) and (19.11). Then the following result holds.

Theorem 11. Let $\rho = P/Q$, with $P$, $Q$ relatively prime integers, and $L = L_i \equiv iQ$. Then, for any positive integer $N$, there exist positive constants $\varepsilon$, $\tilde\varepsilon$, $c$ and $K$, independent of $i$, $\rho$ and $N$, such that, if

$$0 \le \frac{4\pi v_0}{\log(\tilde\varepsilon v_0L)} \le \lambda^2 \le \varepsilon\,\frac{v_0^2\,(1+\log v_0^{-1})^{-1}}{K^N N!\,\log(cQ/v_0^4)}, \qquad (19.14)$$
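where

$$v_0 = \sin(\pi\rho), \qquad (19.15)$$

there exist two solutions $\varphi^{(\pm)}$ of (19.6), with $L = L_i$, $1 - \mu = \cos(\pi\rho)$ and $\rho_L = \rho$, of the form (19.8). The matrices $\tilde M$ corresponding to these solutions, defined as in (19.13), are positive definite. Moreover, the Fourier coefficients $\hat\varphi_n^{(\pm)}$ verify, for $|n| > 1$, the bound

$$|\hat\varphi_n^{(\pm)}| \le \left(\frac{\lambda^2}{v_0|n|}\right)^N|\hat\varphi_1^{(\pm)}|. \qquad (19.16)$$

Finally, $\hat\varphi_1^{(\pm)}$ is of the form

$$\hat\varphi_1^{(\pm)} = \pm v_0^2\exp\left\{-\frac{2\pi v_0 + \beta^{(\pm)}(\lambda,L)}{\lambda^2}\right\}, \qquad (19.17)$$

with

$$|\beta^{(\pm)}(\lambda,L)| \le C\lambda^2\left(1 + \log\frac{1}{v_0}\right), \qquad (19.18)$$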
where $C$ is a suitable constant. The one-particle Hamiltonian corresponding to this solution has a gap of order $|\lambda\hat\varphi_1|$ around $\mu$, uniformly in $i$.

The above theorem proves that there are two stationary points of the ground-state energy corresponding to a periodic function with period equal to the inverse of the density, if the coupling is small enough and the density is rational, and that these stationary points are local minima at least in the space of periodic functions with that period. The energies associated with such minima are different, so that the ground-state energy is not degenerate. The theorem is proved by writing $\rho_x(\varphi,\mu)$ as an expansion convergent for small $\lambda$ and solving the set of equations (19.10) by a contraction method. As a byproduct it is found that the $\hat\varphi_n$ are fast decaying (see (19.16)), so that $\varphi(x)$ is really well approximated by its first harmonics (this remark is important, as the number of harmonics could be very large). The results are uniform in the volume, so they are interesting from a physical point of view (a solution defined only for $|\lambda| \le O(1/L)$ would be outside any reasonable physical value for $\lambda$). The half-filled case treated in [84] is contained in Theorem 11, but in [84] it is also proved that the solution is a global minimum. Finally, the lower bound in (19.14) is a large volume condition: this is not a technical condition as, if the number of fermions is odd, there is Peierls instability only for $L$ large enough. The upper bound on $\lambda$ in (19.14) requires $\lambda$ to decrease as $Q$ increases; in particular, irrational density is forbidden. This requirement is due to the discreteness of the lattice and to Umklapp phenomena. Note that the dependence of the maximum allowed $\lambda$ on $Q$ is not very strong, as it is only logarithmic.
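The half-filled case lends itself to a direct numerical illustration of the energy gain produced by dimerization. The sketch below makes several assumptions that are not taken from the text: spinless fermions, nearest-neighbour hopping with one-particle dispersion $-2t\cos k$ and `t = 0.5`, and the specific values of `lam`, `eps` and `L`. For a staggered field $\varphi(x) = \varepsilon(-1)^x$ the one-particle spectrum is $\pm\sqrt{(2t\cos k)^2 + (\lambda\varepsilon)^2}$, and at half filling the lower band is completely filled.

```python
import math


def energy_per_site(eps, lam, L=400, t=0.5):
    """Energy per site of the functional F_L for the dimerized field
    phi(x) = eps * (-1)**x at half filling (spinless fermions,
    nearest-neighbour hopping t; both are assumptions of this sketch).
    The lower band -sqrt((2 t cos k)^2 + (lam*eps)^2) is filled over
    the L/2 momenta of the reduced Brillouin zone.
    """
    gap = lam * eps
    electronic = 0.0
    for n in range(L // 2):
        k = 2.0 * math.pi * n / L
        electronic -= math.sqrt((2.0 * t * math.cos(k)) ** 2 + gap ** 2)
    elastic = 0.5 * eps ** 2  # (1/L) * (1/2) * sum_x phi(x)^2
    return electronic / L + elastic


if __name__ == "__main__":
    lam = 1.5
    for eps in (0.0, 0.1, 0.2, 0.4, 3.0):
        print(eps, energy_per_site(eps, lam))
```

For small distortions the electronic gain, which behaves like $\Delta^2\log(1/\Delta)$ with $\Delta = \lambda\varepsilon$, beats the quadratic elastic cost, so the undistorted configuration is not the minimum; for large distortions the elastic term dominates.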
We know that $\rho_x(\varphi,\mu)$ is well defined for small $\lambda$ not only in the rational density case (in which the proof is almost trivial), but also in the irrational case: in fact the small divisor problem due to the irrationality of the density can be controlled thanks to a Diophantine condition (see Theorem 2). However, to solve the set of equations (19.10) a contraction method is used, which is not trivially adaptable to the latter case. The same kind of problem arises in proving the positive definiteness of $\bar M_{nm}$ in the rational case (and this is the reason why we are able to prove that the stationary points are local minima only in the space of periodic functions with prefixed period). It is not known whether such problems are only technical or whether there is some physical reason for them.
20. Coupled Luttinger liquids

A natural question is what happens if we consider two or more fermionic chains coupled by a hopping term from one chain to another. This problem is surprisingly difficult, as the number of running coupling constants is very high (15 or more, see [83]) and many of them grow, so that a rigorous analysis in the limit $\beta \to \infty$ based on RG seems impossible. We can consider a simple model of two Mattis models exchanging Cooper pairs. Even for this model a renormalization group analysis of the $\beta \to \infty$ limit is not possible (the flow equations are similar to those for spinning fermions in the attractive case), but it is possible to perform a sort of mean-field theory, see [45,87], obtaining the equivalent of a BCS theory in which, however, the corresponding critical temperature $T_c$ is not exponentially small (see also [88] for a perturbative third-order analysis). We consider the following functional integral:

$$Z_{L,\beta,r} = \int P_a(d\psi)\,P_b(d\psi)\,e^{-V_a - V_b - V_{ab} - h_r}, \qquad (20.1)$$

where, calling $2g^2 \equiv g_t$,

$$V_i = -\lambda\frac{1}{(L\beta)^4}\sum_{k_1,k_2,k_3,k_4}\sum_{\omega,\sigma,\sigma'}\psi^+_{k_1,\omega,\sigma,i}\psi^-_{k_2,\omega,\sigma,i}\psi^+_{k_3,-\omega,\sigma',i}\psi^-_{k_4,-\omega,\sigma',i}\,\delta(k_1 - k_2 + k_3 - k_4),$$

$$V_{ab} = -2\left[\frac{g}{(\beta L)^{3/2}}\sum_{k_1,\omega_1}\psi^+_{k_1,\omega_1,1/2,a}\psi^+_{-k_1,-\omega_1,-1/2,a}\right]\left[\frac{g}{(\beta L)^{3/2}}\sum_{k_2,\omega_2}\psi^-_{-k_2,-\omega_2,-1/2,b}\psi^-_{k_2,\omega_2,1/2,b}\right]$$
$$\qquad - 2\left[\frac{g}{(\beta L)^{3/2}}\sum_{k_1,\omega_1}\psi^+_{k_1,\omega_1,1/2,b}\psi^+_{-k_1,-\omega_1,-1/2,b}\right]\left[\frac{g}{(\beta L)^{3/2}}\sum_{k_2,\omega_2}\psi^-_{-k_2,-\omega_2,-1/2,a}\psi^-_{k_2,\omega_2,1/2,a}\right],$$

$$h_r = \frac{1}{L\beta}\sum_{\omega,i}\sum_k\left[r\,\psi^-_{k,\omega,1/2,i}\psi^-_{-k,-\omega,-1/2,i} + r\,\psi^+_{k,\omega,1/2,i}\psi^+_{-k,-\omega,-1/2,i}\right], \qquad (20.2)$$
where $\psi^\pm_{k,\omega,\sigma,i}$ is the Grassmann variable describing a fermion with momentum $k$ and spin $\sigma = \pm1/2$ associated with the chain $i = a, b$; $V_i$ describes the interaction between fermions belonging to the same chain and $V_{ab}$ describes the tunnelling of Cooper pairs from one chain to the other, in the Bardeen approximation. The term $h_r$ represents the interaction with an external field and the parameter $r$ is real and positive (to fix ideas). If $g = 0$ the system reduces to two independent Mattis models, and the Schwinger functions have an anomalous behaviour like (13.30). It is convenient to write the interaction in terms of Gaussian variables. We write

$$V_{ab} = -2[\Delta_a\bar\Delta_b + \Delta_b\bar\Delta_a],$$

where

$$\Delta_i = \frac{g}{(\beta L)^{3/2}}\sum_{k,\omega}\psi^+_{k,\omega,1/2,i}\psi^+_{-k,-\omega,-1/2,i}, \qquad \bar\Delta_i = \frac{g}{(\beta L)^{3/2}}\sum_{k,\omega}\psi^-_{-k,-\omega,-1/2,i}\psi^-_{k,\omega,1/2,i}.$$
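By using the identity (Hubbard–Stratonovich transformation), with $\phi = u + iv$, $\bar\phi = u - iv$, $u, v \in \mathbb{R}$,

$$e^{2ab} = \frac{1}{2\pi}\int_{\mathbb{R}^2}du\,dv\;e^{-(1/2)|\phi|^2}\,e^{a\phi + b\bar\phi}, \qquad (20.3)$$

we can rewrite the partition function as

$$Z_{L,\beta,r} = \frac{1}{2\pi}\int_{\mathbb{R}^2}du_1\,dv_1\,e^{-(1/2)|\phi_1|^2}\,\frac{1}{2\pi}\int_{\mathbb{R}^2}du_2\,dv_2\,e^{-(1/2)|\phi_2|^2}$$
$$\qquad\times\int P_a(d\psi)\,e^{-V_a}\int P_b(d\psi)\,e^{-V_b}\,e^{-h_r}\,e^{\phi_1\Delta_a + \bar\phi_1\bar\Delta_b}\,e^{\phi_2\Delta_b + \bar\phi_2\bar\Delta_a}. \qquad (20.4)$$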
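The Gaussian identity (20.3) can be checked numerically when $a$ and $b$ are ordinary real numbers (in the text they are even elements of a Grassmann algebra, so this is only a consistency check under that simplifying assumption). Since $e^{a\phi + b\bar\phi} = e^{(a+b)u}e^{i(a-b)v}$, the double integral factorizes into two one-dimensional integrals, which the sketch below evaluates with a midpoint rule; the function name, the cutoff and the grid size are illustrative choices.

```python
import cmath
import math


def hs_integral(a, b, cutoff=12.0, n=4800):
    """Midpoint-rule evaluation of
        (1 / 2 pi) * int du dv exp(-(u^2 + v^2)/2) exp(a*phi + b*conj(phi)),
    with phi = u + i v, for real numbers a and b. The integrand
    factorizes as exp((a+b) u) * exp(i (a-b) v) times the Gaussian.
    """
    h = 2.0 * cutoff / n
    integral_u = 0.0
    integral_v = 0j
    for j in range(n):
        s = -cutoff + (j + 0.5) * h
        weight = math.exp(-0.5 * s * s) * h
        integral_u += weight * math.exp((a + b) * s)
        integral_v += weight * cmath.exp(1j * (a - b) * s)
    return integral_u * integral_v / (2.0 * math.pi)


if __name__ == "__main__":
    a, b = 0.3, 0.5
    print(hs_integral(a, b), math.exp(2.0 * a * b))
```

Analytically the $u$ integral gives $\sqrt{2\pi}\,e^{(a+b)^2/2}$ and the $v$ integral gives $\sqrt{2\pi}\,e^{-(a-b)^2/2}$, whose product divided by $2\pi$ is $e^{2ab}$.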
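Performing the change of variables

$$(u_i, v_i) \to \sqrt{\beta L}\,(u_i, v_i),$$

we obtain

$$Z_{L,\beta,r} = \frac{\beta L}{2\pi}\int_{\mathbb{R}^2}du_1\,dv_1\,e^{-(\beta L/2)|\phi_1|^2}\,\frac{\beta L}{2\pi}\int_{\mathbb{R}^2}du_2\,dv_2\,e^{-(\beta L/2)|\phi_2|^2}$$
$$\qquad\times\int P_a(d\psi)\,e^{-V_a}\int P_b(d\psi)\,e^{-V_b}\,e^{g(\phi_1 - r/g)D_a + g(\bar\phi_1 - r/g)\bar D_b}\,e^{g(\phi_2 - r/g)D_b + g(\bar\phi_2 - r/g)\bar D_a},$$

where

$$D_i = \frac{1}{\beta L}\sum_{k,\omega}\psi^+_{k,\omega,1/2,i}\psi^+_{-k,-\omega,-1/2,i}, \qquad \bar D_i = \frac{1}{\beta L}\sum_{k,\omega}\psi^-_{-k,-\omega,-1/2,i}\psi^-_{k,\omega,1/2,i}.$$

After the integration of the Fermi fields, if $\vec v = (u_1, u_2, v_1, v_2)$,

$$Z_{L,\beta,r} = \left[\frac{\beta L}{2\pi}\right]^2\int_{\mathbb{R}^4}du_1\,dv_1\,du_2\,dv_2\;e^{-(\beta L/2)[(u_1 + r/g)^2 + (u_2 + r/g)^2 + v_1^2 + v_2^2]}\,e^{-\beta L F^{L,\beta}_{\lambda,g}(\vec v)}$$
$$\qquad = \left[\frac{\beta L}{2\pi}\right]^2\int_{\mathbb{R}^4}du_1\,dv_1\,du_2\,dv_2\;e^{-\beta L H^{L,\beta,r}_{\lambda,g}(\vec v)}, \qquad (20.5)$$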
where

$$e^{-\beta L F^{L,\beta}_{\lambda,g}(\vec v)} = \int P_a(d\psi)\int P_b(d\psi)\,e^{-V_a - V_b}\,e^{g\phi_1D_a + g\bar\phi_1\bar D_b}\,e^{g\phi_2D_b + g\bar\phi_2\bar D_a}. \qquad (20.6)$$
The partition function is then written as the (four dimensional) integral of the exponential $e^{-\beta L H^{L,\beta,r}_{\lambda,g}(\vec v)}$. If the function

$$H^{\beta,r}_{\lambda,g}(\vec v) = \lim_{L\to\infty}H^{L,\beta,r}_{\lambda,g}(\vec v)$$

is twice differentiable and admits a nondegenerate global minimum $\vec v^*$ for $\beta$ large enough (the parameter $r$ is introduced just to remove a possible degeneracy), then

$$\lim_{r\to0}\lim_{L\to\infty}\frac{e^{-\beta L H^{L,\beta,r}_{\lambda,g}(\vec v)}}{\int du_1\,du_2\,dv_1\,dv_2\;e^{-\beta L H^{L,\beta,r}_{\lambda,g}(\vec v)}} = \delta(\vec v - \vec v^*). \qquad (20.7)$$
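The concentration mechanism behind (20.7) can be illustrated with a one-dimensional toy example. Everything in the sketch below is hypothetical: the tilted double well `(v*v - 1)**2 + r*v` merely stands in for $H^{\beta,r}_{\lambda,g}$, with the tilt `r` playing the role of the symmetry-breaking parameter, and `N` stands in for $\beta L$.

```python
import math


def tilted_double_well(v, r=0.3):
    """Hypothetical stand-in for H: a double well whose degeneracy
    is removed by the linear tilt r * v."""
    return (v * v - 1.0) ** 2 + r * v


def mass_near_minimum(N, radius=0.2, r=0.3, lo=-3.0, hi=3.0, n=6000):
    """Fraction of the normalized weight exp(-N * H(v)) carried by the
    interval |v - v_star| <= radius, where v_star is the grid point at
    which H is smallest. Returns (v_star, fraction).
    """
    h = (hi - lo) / n
    grid = [lo + (j + 0.5) * h for j in range(n)]
    values = [tilted_double_well(v, r) for v in grid]
    h_min = min(values)
    v_star = grid[values.index(h_min)]
    # Subtract the minimum before exponentiating for numerical stability.
    weights = [math.exp(-N * (val - h_min)) for val in values]
    total = sum(weights)
    near = sum(w for v, w in zip(grid, weights) if abs(v - v_star) <= radius)
    return v_star, near / total


if __name__ == "__main__":
    for N in (1, 10, 50, 200):
        print(N, mass_near_minimum(N))
```

As `N` grows the normalized weight piles up around the unique global minimum, which is the one-dimensional analogue of the delta function on the right-hand side of (20.7); with `r = 0` the two wells would share the weight equally.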
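If we can prove that $H^{\beta,r}_{\lambda,g}(\vec v)$ has a global minimum, the model is solved; all the Schwinger functions can be computed using (20.7) and, if $\vec v^* \neq 0$, there is a spontaneous gap generation. So the problem is reduced to the computation of $H^{\beta,r}_{\lambda,g}(\vec v)$ and to the determination of its global minimum. However, $H^{\beta,r}_{\lambda,g}(\vec v)$ is given by the Grassmann integral (20.6), which is not quadratic in the Grassmann variables and is nontrivial to compute, especially in the g0t case. One has to take into account the interaction $V_a + V_b$ which is responsible for the g0t = 0 case of the Luttinger liquid behaviour of the model.

Let us assume that, given $\vec v^*$, the function $H^{L,\beta,r}_{\lambda,g}(\vec v)$ is differentiable in a small neighbourhood of $\vec v^*$ (uniformly in $L$, $\beta$) and

$$\left.\frac{\partial H^{L,\beta,r}_{\lambda,g}(\vec v)}{\partial u_i}\right|_{\vec v = \vec v^*} = 0, \qquad \left.\frac{\partial H^{L,\beta,r}_{\lambda,g}(\vec v)}{\partial v_i}\right|_{\vec v = \vec v^*} = 0. \qquad (20.8)$$

This means that $\vec v^*$ is an extremal point for $H^{L,\beta,r}_{\lambda,g}(\vec v)$. An extremal point satisfies the following extremality equations:

$$u_1 + \frac{r}{g} - g\frac{1}{L\beta}\sum_{k,\omega}\left[\langle\psi^+_{k,\omega,1/2,a}\psi^+_{-k,-\omega,-1/2,a}\rangle_{L\beta} + \langle\psi^-_{-k,-\omega,-1/2,b}\psi^-_{k,\omega,1/2,b}\rangle_{L\beta}\right] = 0,$$
$$u_2 + \frac{r}{g} - g\frac{1}{L\beta}\sum_{k,\omega}\left[\langle\psi^+_{k,\omega,1/2,b}\psi^+_{-k,-\omega,-1/2,b}\rangle_{L\beta} + \langle\psi^-_{-k,-\omega,-1/2,a}\psi^-_{k,\omega,1/2,a}\rangle_{L\beta}\right] = 0,$$
$$v_1 + ig\frac{1}{L\beta}\sum_{k,\omega}\left[\langle\psi^+_{k,\omega,1/2,a}\psi^+_{-k,-\omega,-1/2,a}\rangle_{L\beta} - \langle\psi^-_{-k,-\omega,-1/2,b}\psi^-_{k,\omega,1/2,b}\rangle_{L\beta}\right] = 0,$$
$$v_2 + ig\frac{1}{L\beta}\sum_{k,\omega}\left[\langle\psi^+_{k,\omega,1/2,b}\psi^+_{-k,-\omega,-1/2,b}\rangle_{L\beta} - \langle\psi^-_{-k,-\omega,-1/2,a}\psi^-_{k,\omega,1/2,a}\rangle_{L\beta}\right] = 0, \qquad (20.9)$$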
where

$$\langle\psi^+_{k,\omega,1/2,i}\psi^+_{-k,-\omega,-1/2,i}\rangle_{L\beta} = \frac{\int P_a(d\psi)\,e^{-V_a}\int P_b(d\psi)\,e^{-V_b}\,e^{g\phi_1D_a + g\bar\phi_1\bar D_b}\,e^{g\phi_2D_b + g\bar\phi_2\bar D_a}\,\psi^+_{k,\omega,1/2,i}\psi^+_{-k,-\omega,-1/2,i}}{\int P_a(d\psi)\,e^{-V_a}\int P_b(d\psi)\,e^{-V_b}\,e^{g\phi_1D_a + g\bar\phi_1\bar D_b}\,e^{g\phi_2D_b + g\bar\phi_2\bar D_a}} \qquad (20.10)$$

and a similar expression holds for $\langle\psi^-_{-k,-\omega,-1/2,i}\psi^-_{k,\omega,1/2,i}\rangle_{L\beta}$. One has then to compute the r.h.s. of (20.9); if $\lambda = 0$ such a computation is trivial and one obtains, as in BCS theory, that the gap and the critical temperature are exponentially small in $1/g^2$. However, the presence of the interaction along the chain, which is responsible for the anomalous behaviour, has a dramatic effect. One could think that the r.h.s. of the self-consistency equation (20.9) is obtained from the one found in the $\lambda = 0$ case simply by replacing propagator (3.4) with the Mattis model Schwinger function (see [1], p. 209). This is in fact what is found by a naive first-order perturbation theory. However, the true result is more complex, as the gap also acquires a critical index. In fact one can compute (20.10) by the techniques described above and the following result holds, see [45,87].

Theorem 12. There exists an $\varepsilon$ such that, if $\lambda > 0$ and $\lambda, |g| \le \varepsilon$, the function $H^{\beta,r}_{\lambda,g}(\vec v)$ defined in (2.10) is differentiable at $u_1 = u_2$, $v_1 = v_2$ and the extremality equations (20.9) are pairwise equal. In particular the l.h.s. of the third and the fourth are vanishing, while the first and the second are equal to, if $1/\beta \le K|gu|$, $K < 1$,

$$u + r/g - g^2u\,\frac{1}{\eta}\left[\left|\frac{gu}{A}\right|^{-\eta} - 1\right][a^{-1} + f(g,\lambda,u)] + g^2u\left|\frac{gu}{A}\right|^{-\eta}\tilde f(g,\lambda,u) = 0, \qquad (20.11)$$

where $\eta = \beta_1\lambda + \tilde\eta$, $|\tilde\eta| \le C\lambda^2$, $|f|, |\tilde f| \le C$, and $C$, $a$, $\beta_1$, $A$ are positive constants.

Note that (20.11) is a non-BCS, or anomalous, self-consistency equation describing a superconductor whose normal state is a Luttinger liquid; the Luttinger interaction modifies the self-consistency equation for the gap from the BCS-like one to (20.11). Note that $\lambda$, $g^2$ have to be small but there is no restriction on their ratio; in particular it can be $\lambda/g^2 \gg 1$.

Corollary. There exist $\varepsilon$ and $K < 1$ such that, if $\lambda > 0$ and $\lambda, |g| \le \varepsilon$, then $H^{\beta,r}_{\lambda,g}(\vec v)$ admits two extremal points, both if $\lambda/g^2 < K$ and if $\lambda/g^2 > K^{-1}$. In the limit $\beta \to \infty$, $r \to 0$ they become of the form $(\pm\Delta, \pm\Delta, 0, 0)$. In particular, if $\lambda/g^2 > K^{-1}$,

$$|g\Delta| = A\left[\frac{g^2}{a\eta}\right]^{1/\eta}\left(1 + O(\lambda) + O\!\left(\left[\frac{g^2}{\lambda}\right]^{1/\eta}\right)\right), \qquad (20.12)$$

while if $\lambda/g^2 < K$

$$|g\Delta| = A\,e^{(-a + O(g))/g^2}.$$
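The two regimes of the Corollary, a power-law gap of the form $A[g^2/(a\eta)]^{1/\eta}$ and a BCS-like gap of the form $Ae^{-a/g^2}$, can be compared numerically. The sketch below uses a hypothetical interpolating expression, $A(1 + a\eta/g^2)^{-1/\eta}$, chosen here only because it reduces to those two forms in the respective limits; it is not a formula stated in the text, and all error terms are ignored. The function names and parameter values are likewise illustrative.

```python
import math


def gap_interpolation(g2, eta, a=1.0, A=1.0):
    """Hypothetical interpolation |g Delta| = A * (1 + a*eta/g2)**(-1/eta).
    g2 stands for g^2 and eta for the critical index; error terms are
    neglected."""
    return A * (1.0 + a * eta / g2) ** (-1.0 / eta)


def bcs_limit(g2, a=1.0, A=1.0):
    """BCS-like form A * exp(-a / g^2), expected when eta/g^2 is small."""
    return A * math.exp(-a / g2)


def anomalous_limit(g2, eta, a=1.0, A=1.0):
    """Power-law form A * (g^2 / (a*eta))**(1/eta), expected when
    eta/g^2 is large."""
    return A * (g2 / (a * eta)) ** (1.0 / eta)


if __name__ == "__main__":
    print(gap_interpolation(0.5, 1e-6), bcs_limit(0.5))
    print(gap_interpolation(0.001, 0.2), anomalous_limit(0.001, 0.2))
    print(bcs_limit(0.001))
```

With $g^2 = 0.001$ the BCS-like expression is far below floating-point range, while the power-law expression is of order $10^{-12}$; this is the sense in which the gap, and with it the critical temperature, is not exponentially small when the intrachain coupling dominates.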
The above analysis says that two one-dimensional spinning Fermi systems, with an intrachain interaction given only by forward scattering and an interchain interaction expressed by a Cooper pair tunnelling Hamiltonian in the Bardeen approximation, are such that the two-point Schwinger function has a behaviour similar to the Mattis model Schwinger function if $T > T_c$, while for $T \le T_c$ there is long distance exponential decay related to the opening of a gap $\Delta$; $T_c \sim \Delta$, and $\Delta$ has the non-BCS form given by (20.12) if the intrachain interaction is smaller than the interchain one.
21. Bidimensional Fermi liquids

The techniques we have applied to one-dimensional fermions are general and can be applied also in $d \ge 2$. In this case much less is known, and there is so far no rigorous construction of the theory in the $\beta \to \infty$ limit. The study of $d \ge 2$ fermions was started in [25,26] and has been pursued in [101–103]: a renormalization group analogous to the $d = 1$ case was defined; many new problems appear due to the fact that the singularity (i.e. the Fermi surface) is not a pair of points but a circle or a sphere. The main results obtained in those papers were a well-defined mathematical setting, $n!$ bounds for the perturbative series and the definition of the beta function. However, it appears that even truncating arbitrarily at second order (as there is no proof of the convergence of the beta function, but only $n!$ bounds) there are problems; one has infinitely many running coupling constants and: (1) if the interaction is attractive, the flow is not bounded due to the BCS instability, while (2) if it is repulsive, due to the Kohn–Luttinger phenomenon it is likely that, except for very particular interactions with special symmetries, the flow is still not bounded. As there is the generation of a gap, the fermionic techniques discussed so far probably have to be supplemented by cluster expansion techniques (the theory becomes partly bosonic due to the appearance of a Goldstone boson). At the moment, the only rigorous construction for a problem of interacting fermions in $d = 2$ is for temperatures $T \ge e^{-k/|\lambda|}$ [89,47,48]; note that we cannot expect to reach a colder region due to the appearance of the BCS instability at $T_c = e^{-a/|\lambda|}$ (but =c1, see below; so perhaps fermionic techniques will allow us to reach at least =c 1). Let us consider a model in $d = 2$ of interacting fermions with Hamiltonian $H = H_0 + \lambda V + \nu N_0$, where $H_0$ and $V$ are defined by the analogue of (2.2), (2.7) in two dimensions with an ultraviolet cut-off.

In $d = 2$ the Fermi surface is the circle $E(\mathbf{k}) = p_F^2$, with $E(\mathbf{k}) = k_1^2 + k_2^2$, and the propagator is given by $\sum_{h=-\infty}^{0}g^{(h)}(\mathbf{x}-\mathbf{y})$ with

$$g^{(h)}(\mathbf{x}-\mathbf{y}) = \int dk_0\,d\mathbf{k}\;f_h(k_0^2 + [E(\mathbf{k}) - p_F^2]^2)\,\frac{e^{ik_0(t-s) + i\mathbf{k}(x-y)}}{-ik_0 + E(\mathbf{k}) - p_F^2}. \qquad (21.1)$$

Passing to polar coordinates we find

$$g^{(h)}(\mathbf{x}-\mathbf{y}) = \int dk_0\int d\theta\int|\mathbf{k}|\,d|\mathbf{k}|\;f_h(k_0^2 + [E(\mathbf{k}) - p_F^2]^2)\,\frac{e^{ik_0(t-s) + i\mathbf{k}(x-y)}}{-ik_0 + E(\mathbf{k}) - p_F^2} \qquad (21.2)$$
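and we can introduce another decomposition, over the integration in $\theta$, in the following way. The annulus of width $\gamma^h$ around the Fermi surface is divided into sectors centred at $\theta = \theta_\omega$ and of angular width $\gamma^{h/2}$ (the choice $\gamma^{h/2}$ is not arbitrary, see below). Then $1 = \sum_\omega\chi_{h,\omega}(\theta)$, where $\chi_{h,\omega}(\theta)$ are compact support functions with support in $\gamma^{h/2-1/2} \le |\theta - \theta_\omega| \le \gamma^{h/2+1/2}$, $\sum_\omega 1 = \gamma^{-h/2}$ and $g^h_\omega(\vec x - \vec y) = e^{i\omega p_F(x-y)}\,\bar g^h_\omega(\vec x - \vec y)$ with

$$\bar g^h_\omega(\vec x - \vec y) = \int dk_0\int d\theta\,\chi^h_\omega(\theta)\int k\,dk\;f_h(k_0^2 + [E(k) - p_F^2]^2)\,\frac{e^{ik_0(t-s) + i(\mathbf{k} - \omega p_F)(x-y)}}{-ik_0 + E(k) - p_F^2}, \qquad (21.3)$$

which is bounded by

$$|\bar g^h_\omega(\vec x - \vec y)| \le \gamma^{3h/2}\,\frac{C_N}{1 + [\gamma^h|t-s| + \gamma^h|(x-y)_r| + \gamma^{h/2}|(x-y)_t|]^N}, \qquad (21.4)$$

where $(x-y)_r = |x-y|\cos\theta_\omega$ and $(x-y)_t = |x-y|\sin\theta_\omega$. As in $d = 1$ one can write

$$\psi_{\vec x} = \sum_h\sum_\omega e^{i\omega p_Fx}\,\psi^{(h)}_{\omega,\vec x}, \qquad (21.5)$$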
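The bookkeeping of the sector decomposition at a given scale can be sketched in a few lines. The code below is illustrative only: the function name `sectors`, the values of `gamma` and `p_f`, and the retention of the constant factor $2\pi$ (which the counting $\sum_\omega 1 = \gamma^{-h/2}$ in the text drops) are assumptions. The sagitta it returns, i.e. the maximal distance between an arc of the Fermi circle and its chord, is one common geometric way to motivate the angular width $\gamma^{h/2}$: for that width the arc deviates from a straight segment by an amount of order $\gamma^h$, comparable to the width of the annulus.

```python
import math


def sectors(h, gamma=2.0, p_f=1.0):
    """Split the full angle 2*pi into equal sectors of angular width at
    most gamma**(h/2), for a scale h <= 0.
    Returns (count, actual_width, centers, sagitta), where sagitta is
    p_f * (1 - cos(actual_width / 2)), the maximal deviation of the arc
    of radius p_f from its chord within one sector.
    """
    target_width = gamma ** (h / 2.0)
    count = math.ceil(2.0 * math.pi / target_width)
    actual_width = 2.0 * math.pi / count
    centers = [(j + 0.5) * actual_width for j in range(count)]
    sagitta = p_f * (1.0 - math.cos(actual_width / 2.0))
    return count, actual_width, centers, sagitta


if __name__ == "__main__":
    for h in (-2, -4, -8, -12):
        count, width, _, sagitta = sectors(h)
        print(h, count, width, sagitta, 2.0 ** h)
```

The number of sectors grows like $\gamma^{-h/2}$ as $h \to -\infty$, which is the feature that distinguishes the $d = 2$ case from $d = 1$, where the index $\omega$ takes only two values.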
where $\psi^{(h)}_{\omega,\vec x}$ has propagator given by $\bar g^h_\omega(\vec x - \vec y)$. The difference with respect to the $d = 1$ case is that $\sum_\omega 1 = \gamma^{-h/2}$. We write a tree expansion as in the preceding sections and we write the truncated expectations as sums over anchored trees times determinants; the Gram–Hadamard inequality can be applied, as there is always a finite number of kinds of fermions (on the contrary, if as in [25,28] one considers continuous $\omega$ variables, one finds technical difficulties in performing the Gram–Hadamard bound). Then we get the following bound for the effective potential; for a fixed tree $\tau$ and an anchored tree $T$ we get:

(1) a factor $\gamma^{-(5/2)h_v(s_v-1)}$ for the integration over the coordinates, if $s_v$ is the number of subtrees coming out of the vertex $v$;

(2) a factor $\gamma^{(3/2)h_v\tilde n_v}$, where $\tilde n_v$ is the number of propagators (in the anchored tree $T$ or in the determinants) in the cluster $v$ and not in any smaller one; calling $m_v^4$ the number of vertices with 4 external lines we get, using (5.32), (5.33), a factor

$$C^n\gamma^{hD}\prod_v\gamma^{(h_v - h_{v'})((3/2)(2m_v^4 - n_v^e/2) - (5/2)(m_v^4 - 1))}, \qquad (21.6)$$

if $D$ is a proper dimension;

(3) we now have to sum over $\omega$, which is the crucial point. In order to perform this sum, suppose that we have a number of vertices $v$ with all the external lines fixed to some scale
400
G. Gentile, V. Mastropietro / Physics Reports 352 (2001) 273–437
hv , with nev external lines; then the sum over ! gives e e [A(−hv =2)(nv −3)@(nv ¿3) ] :
(21.7)
v
In order to understand this formula one has to note that for each vertex $v$ there are $n^e_v$ sums over $\omega$, but (a) the conservation of momentum on each vertex eliminates one sum; (b) the vertices are connected by an anchored tree in the truncated expectations, so if $v_1$, $v_2$ are two vertices connected by a line $l$ of the spanning tree, fixing the sector of the half-line of $v_1$ forming $l$ automatically fixes the sector of the half-line of $v_2$ which forms $l$; (c) by geometrical considerations [34], the fact that the momenta have to stay in an annulus around the Fermi surface of radius $\gamma^h$ and that the sectors are $O(\gamma^{h/2})$ cancels another sum. However, in general, the external lines are not all on the same scale and we need a slightly more complicated argument. One can perform an iterative argument for summing over $\omega$; let us consider the endpoints (assume only four-field interactions, for simplicity). In general, the scales of the external lines are different; let us fix all of them equal to the largest one. By the above argument we get a factor (all the lines are fixed to have the same scale)
$$ \prod_v \gamma^{-(1/2)(h_v - h_{v'})\, m^4_v} . \tag{21.8} $$
Now we have to sum on the lines of the vertices whose scale was not the largest one. We contract all the minimal clusters into points, and we iterate the above argument; the lines external to the minimal clusters $v$ were fixed to a sector of width $\gamma^{h_v/2}$, so summing on the sectors of these lines (fixing all of them to the smallest scale) gives a factor $\gamma^{-(1/2)(h_{v'} - h_v)}$ and at the end we get
$$ \prod_v \gamma^{(1/2)(h_v - h_{v'})(n^e_v - 3)\chi(n^e_v \ge 3)} . \tag{21.9} $$
Putting together all terms we get
$$ \prod_v \gamma^{(h_v - h_{v'})\left[(3/2)(2m^4_v - n^e_v/2) - (5/2)(m^4_v - 1) - (1/2)m^4_v + (1/2)(n^e_v - 3)\chi(n^e_v \ge 3)\right]} , \tag{21.10} $$
which gives
$$ \prod_v \gamma^{(h_v - h_{v'})\left[-(3/4)n^e_v + 5/2 + (1/2)(n^e_v - 3)\chi(n^e_v \ge 3)\right]} . \tag{21.11} $$
From the above formula, we see that the power counting is exactly the same as in the $d = 1$ case, i.e. the dimension of the cluster with two external lines is $-1$ and the one with 4 is 0. Then, if one can restrict the summation to $|P_v| > 4$, the series for the effective potential would be convergent (the above argument really works for trees in which, for any $v$, $20 \ge |P_v| > 4$, see [47]; in fact the sector sums done as above produce a constant $K^{|P_v|}$ which would develop a factorial. For $v$ with $|P_v| > 20$ one uses the fact that the dimension is very negative. For this technical point, see [47]).
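The power counting read off from (21.11) can be tabulated directly. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and the sign convention (dimension equal to minus the square bracket of (21.11)) is an assumption chosen so that the values $-1$ and $0$ quoted in the text are reproduced.

```python
from fractions import Fraction


def exponent_bracket(n_e):
    """Square bracket of (21.11) for a cluster with n_e external lines."""
    n_e = Fraction(n_e)
    indicator = 1 if n_e >= 3 else 0  # chi(n_e >= 3)
    return (-Fraction(3, 4) * n_e + Fraction(5, 2)
            + Fraction(1, 2) * (n_e - 3) * indicator)


def dimension(n_e):
    # Assumed convention: the dimension is minus the bracket of (21.11).
    return -exponent_bracket(n_e)


if __name__ == "__main__":
    for n_e in (2, 4, 6, 8):
        print(n_e, dimension(n_e))
```

Under this convention, clusters with more than four external lines have strictly positive dimension $n^e_v/4 - 1$, which is the regime in which the text expects convergence.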
To renormalize the above theory, one uses a definition very similar to the one for $d = 1$ fermions. If we allow logarithmic divergences, we have only to renormalize at the first order the clusters with two external lines (logarithmic divergences give a factor in the bounds $C^n |\lambda|^n |h_\beta|^n \simeq C^n |\lambda|^n (\log\beta)^n$, which allows us to get convergence for $T \ge e^{-k/|\lambda|}$, with $kC \le 1$). The definition of localization is the same as in the $d = 1$ case (note that, by the conservation of momenta, the $\omega$ indices of the external lines of the clusters with two external lines are the same):
$$ \mathcal{L} \int dk\, dk_0\; \psi^{(\le h)+}_{k,\omega}\, \psi^{(\le h)-}_{k,\omega}\, W^{(h)}(k_0, k) = \int dk\, dk_0\; \psi^{(\le h)+}_{k,\omega}\, \psi^{(\le h)-}_{k,\omega}\, W^{(h)}(0, \omega p_F) . \tag{21.12} $$
Note that the theory is rotation invariant, so that $W^{(h)}(0, \omega p_F)$ is in fact independent of $\omega$. There is, however, a difference with respect to the $d = 1$ case (see [48]). The effect of $\mathcal{R}$ gives
$$ \mathcal{R} \int dk\, dk_0\; \psi^{(\le h)+}_{k,\omega}\, \psi^{(\le h)-}_{k,\omega}\, W^{(h)}(k_0, k) = \int dk\, dk_0\; \psi^{(\le h)+}_{k,\omega}\, \psi^{(\le h)-}_{k,\omega}\, \big[(k - \omega p_F)\,\partial_k W^{(h)} + k_0\, \partial_{k_0} W^{(h)}\big] . \tag{21.13} $$
Let us fix a reference frame in which axis 1 is directed as $\omega$ and axis 2 is orthogonal to it; then $k = (k_1, k_2)$, and $(1, 0)$ is a radial vector while $(0, 1)$ is a tangential vector. Then, if $k - \omega p_F = k'$ ($k'$ is the momentum measured from the Fermi surface), we can write the above equation as
$$ k'_1 \int_0^1 dt\; \partial_{k_1} W^{(h)} + \int_0^1 dt\; k'_2\, \partial_{k_2} W^{(h)} , \tag{21.14} $$
where $k'_1 = O(\gamma^{h})$, $k'_2 = O(\gamma^{h/2})$. The first addend gives a factor $\gamma^{h_{v'} - h_v}$, which is the right factor to leave only a logarithmic divergence; however the second addend gives a factor
$$ \gamma^{(h_{v'} - h_v)/2}\, \gamma^{-h_v/2} , \tag{21.15} $$
which is not the correct one to have only logarithmic divergences. This (apparent) problem is solved using the rotational invariance of the theory. In fact
$$ W^{(h)}(k_0, k) = \bar W^{(h)}\Big(k_0, \sqrt{k_1^2 + k_2^2}\Big) \tag{21.16} $$
and if $P(t) = \sqrt{(p_F + t k'_1)^2 + t^2 k_2'^2}$ then
$$ W^{(h)}(k_0, k) - W^{(h)}(k_0, \omega p_F) = \int_0^1 dt\, \frac{d}{dt}\, \bar W^{(h)}(k_0, P(t)) = \int_0^1 dt\; \partial \bar W^{(h)}(k_0, P(t))\, \frac{k'_1 (p_F + t k'_1) + t k_2'^2}{P(t)} , \tag{21.17} $$
and from the absence of terms linear in $k'_2$ we see that the renormalization produces the right dimensional gain.
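The identity (21.17) and the absence of a term linear in the tangential component can be checked numerically. The sketch below is only an illustration under explicit assumptions: the radial profile `w_bar` is a hypothetical stand-in for $\bar W^{(h)}(k_0,\cdot)$ at fixed $k_0$, the value of `P_F` is arbitrary, and the integral is evaluated with a plain midpoint rule.

```python
import math

P_F = 1.0  # Fermi momentum; arbitrary value chosen for the illustration


def w_bar(rho):
    """Hypothetical smooth radial profile standing in for W-bar at fixed k_0."""
    return math.sin(rho) + rho ** 3


def w_bar_prime(rho):
    """Derivative of the hypothetical profile with respect to the radius."""
    return math.cos(rho) + 3 * rho ** 2


def radius(t, k1, k2):
    """P(t) = sqrt((p_F + t k1)^2 + t^2 k2^2), with k1, k2 measured from the Fermi surface."""
    return math.sqrt((P_F + t * k1) ** 2 + (t * k2) ** 2)


def lhs_21_17(k1, k2):
    """W(k_0, k) - W(k_0, omega p_F) for a rotation-invariant W."""
    return w_bar(radius(1.0, k1, k2)) - w_bar(P_F)


def rhs_21_17(k1, k2, steps=2000):
    """Midpoint-rule value of the t-integral on the right-hand side of (21.17)."""
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) / steps
        p = radius(t, k1, k2)
        total += w_bar_prime(p) * (k1 * (P_F + t * k1) + t * k2 ** 2) / p
    return total / steps
```

With the radial component set to zero, doubling the tangential component multiplies the difference by approximately four, which is the quadratic behaviour the text relies on.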
Uncited references

[68,75,80,86,97,100,104,105,108,109,115,117]

Appendix A

A.1. Graphs, diagrams and trees

A.1.1. Graphs

Given a set $V$ with $n$ elements, we shall call graph $\tau$ on $V$ a couple $(V, E)$, where $E$ is a subset of unordered pairs of elements in $V$; we shall write $V = V(\tau)$ and $E = E(\tau)$ and shall call points the elements of $V(\tau)$ and lines the elements of $E(\tau)$. We shall denote by $|V(\tau)|$ and by $|E(\tau)|$ the number of elements in $V(\tau)$ and in $E(\tau)$, respectively; of course $|V(\tau)| = n$. We shall also write $\ell \in \tau$ for $\ell \in E(\tau)$. See Fig. 18. If a line $\ell$ connects two points $v, w \in V(\tau)$ we shall also write $\ell = (vw)$: we say that the line $\ell$ is incident with the points $v$ and $w$. Two points $v, w \in V(\tau)$ are adjacent if $(vw) \in E(\tau)$, while two lines are adjacent if they are incident on the same point. Given a point $v \in V(\tau)$ we define as degree of the point $v$ the number $d(v)$ of lines incident on $v$; a point such that $d(v) = 1$ is called an endpoint. Of course
$$ \sum_{v \in V(\tau)} d(v) = 2|E(\tau)| . \tag{A.1.1} $$
A subgraph $\tau'$ of $\tau$ is a couple $(V', E')$ with $V' = V(\tau') \subset V(\tau)$ and $E' = E(\tau')$ a subset of lines $(vw)$ in $E(\tau)$ with $v, w \in V(\tau')$; we shall write $\tau' \subset \tau$. A graph $\tau$ is connected if for any $v, w \in V(\tau)$ there exist $p \in \mathbb{N}$ and $p$ points $v_1, \dots, v_p$, with $v_1 = v$ and $v_p = w$, such that $v_j$ and $v_{j+1}$ are adjacent for each $j = 1, \dots, p-1$: in such a case we say that the lines $(v_1 v_2), \dots, (v_{p-1} v_p)$ form a path $P$ on $\tau$ connecting the point $v$ with the point $w$. We shall also say that $P$ crosses or intersects the points $v_1, \dots, v_p$. See Fig. 19. A graph is disconnected if it is not connected. A graph is acyclic if it has no cycle (or loop), i.e. if for any two points $v, w \in V(\tau)$ there is at most one path connecting them.
Fig. 18. A graph E with 14 points and 18 lines.
Fig. 19. A path P connecting v1 with v6 .
Fig. 20. A rooted tree of order 9 with 27 vertices.
A.1.2. Trees

A tree graph (or tree tout court) $\tau$ is a connected acyclic graph. If $|V(\tau)| = n$ we say that $\tau$ is a tree with $n$ points [106]. Given a tree one has
$$ |E(\tau)| = |V(\tau)| - 1 . \tag{A.1.2} $$
Note that given a tree $\tau$ any subgraph (subtree) of $\tau$ is still connected and acyclic: so any subtree is a tree. A rooted tree is a tree with a distinguished point $v_0$. A rooted tree can be seen as a partially ordered set of points connected by lines. The partial ordering relation can be denoted by $\preceq$: we shall say that $v \prec w$ if there is a path $P$ connecting $w$ with $v_0$ and $v$ is crossed by $P$. We can also superpose an arrow on each line pointing towards $v_0$: we say that the lines of the tree are oriented; by extension the tree is also said to be oriented. We shall also call vertices the points in $V(\tau)$. The point $v_0$ is called the first vertex of $\tau$. To identify the first vertex $v_0$, we can draw an extra point $r$ and an extra oriented line $\ell$ connecting $v_0$ with $r$ (see Fig. 20). We shall call $r$ the root of $\tau$ and $\ell$ the root line. Such a line is added to the lines in $E(\tau)$, while the root is not considered a vertex. With such a convention, (A.1.2) has to be replaced with
$$ |E(\tau)| = |V(\tau)| = n . \tag{A.1.3} $$
Fig. 21. Two unequivalent unlabelled trees of order 3.
Note also that in this way (A.1.1) becomes
$$ \sum_{v \in V(\tau)} d(v) = 2|E(\tau)| - 1 . \tag{A.1.4} $$
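The counting identities (A.1.1) to (A.1.4) are easy to verify on a concrete example. The sketch below is a small illustration with hypothetical names: a tree is stored as a list of points and a list of lines, and the root `"r"` is deliberately left out of the vertex list, following the convention that the root is not a vertex.

```python
def degree_sum(points, lines):
    """Sum of d(v) over the listed points; line ends outside `points` are ignored."""
    degree = {v: 0 for v in points}
    for line in lines:
        for end in line:
            if end in degree:
                degree[end] += 1
    return sum(degree.values())


# A tree with first vertex 0 and endpoints 1, 3, 4.
POINTS = [0, 1, 2, 3, 4]
LINES = [(0, 1), (0, 2), (2, 3), (2, 4)]

# The same tree with the root line attached to the first vertex;
# the root "r" is not counted among the vertices.
ROOTED_LINES = LINES + [("r", 0)]

if __name__ == "__main__":
    print(degree_sum(POINTS, LINES), 2 * len(LINES))                    # (A.1.1)
    print(len(LINES), len(POINTS) - 1)                                  # (A.1.2)
    print(len(ROOTED_LINES), len(POINTS))                               # (A.1.3)
    print(degree_sum(POINTS, ROOTED_LINES), 2 * len(ROOTED_LINES) - 1)  # (A.1.4)
```

The missing unit in (A.1.4) is exactly the end of the root line that sits on the root rather than on a vertex.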
Given a vertex $v \in V(\tau)$ we denote by $v'$ the node immediately preceding $v$, i.e. the vertex $v' \prec v$ such that $(v'v) \in E(\tau)$. We say that the line $\ell = (v'v)$ exits from $v$ and enters $v'$. Note that the vertex $v'$ is uniquely defined, as the ordering relation implies a bijective correspondence between lines and vertices: given a vertex there is one and only one line exiting from it. For any vertex there are $s_v \ge 0$ entering lines: one has $s_v = 0$ if $v$ is an endpoint. We define the order of a tree as the number of its endpoints. We call trivial a vertex $v$ with $s_v = 1$ and nontrivial a vertex $v$ either with $s_v \ge 2$ or with $s_v = 0$ (this means that the endpoints are counted as nontrivial vertices). Denote by $V_f(\tau)$ the set of endpoints in $\tau$, by $V_t(\tau)$ the set of trivial vertices in $\tau$ and by $V_{nt}(\tau)$ the set of nontrivial vertices in $\tau$: of course $V(\tau) = V_t(\tau) \cup V_{nt}(\tau)$ and
$$ V_f(\tau) = \{ v \in V_{nt}(\tau) : s_v = 0 \} . \tag{A.1.5} $$
By the notation $v \notin V_f(\tau)$ we mean $v \in V(\tau) \setminus V_f(\tau)$. Given a vertex $v \in V(\tau)$ the subgraph $(V', E')$ with
$$ V' = \{ w \in V(\tau) : w \succeq v \} , \qquad E' = \{ \ell \in E(\tau) : \ell = (w'w),\ w \succeq v \} , \tag{A.1.6} $$
is a rooted subtree with root $v'$. The trees just defined are sometimes called unlabelled trees, in order to distinguish them from the “labelled trees” (to be defined). Two unlabelled trees are identified if they can be superposed up to a continuous deformation of the lines on the plane such that the endpoints coincide: in such a case we say that they are equivalent. In Fig. 21 two inequivalent unlabelled trees of order $n = 3$ are drawn. Note that the indices used to identify the vertices $v \in V_f(\tau)$ play no role. The notions which will be used are that of unlabelled tree and, mostly, that of labelled tree. A (rooted) labelled tree can be obtained from an unlabelled tree by assigning labels $h_v$ to its vertices $v \in V(\tau)$ in the following way. A label $h \le 0$ is associated to the root. If $\mathcal{T}_{h,n}$
Fig. 22. A rooted tree and the corresponding walk W .
denotes the corresponding set of labelled trees of order $n$ (i.e. with $n$ endpoints), we introduce a set of vertical lines, labelled by an integer assuming values in $[h, 2]$, such that each vertex $v \in V(\tau)$ is contained in some vertical line $h' \in [h, 2]$ (this will always be possible, as the lines can be continuously deformed): then we set $h_v = h'$. The label $h_v$ will be called the frequency or the scale of the vertex $v$. By construction $h_v > h$ for all $v \in V(\tau)$ and $h_v > h + 1$ for all $v \in V_f(\tau)$. Moreover, if $v \prec w$ then $h_v < h_w$. The number of trees is controlled through the following result.

Lemma A.1. The number of (rooted) unlabelled trees with $n$ points is bounded by $C^n$ for some constant $C$.

Proof. The number of (rooted) unlabelled trees is bounded by the number of one-dimensional random walks $W$ with $2n$ steps. This can be proved as follows. We can imagine moving along the tree by remaining to the left of the lines and starting from the root line. We move forward until an endpoint is reached: in this case we turn backwards until we meet a nontrivial vertex; then we turn once more forward and so on, until we come back to the root line. See Fig. 22: $+$ means that we move from left to right along the line, while $-$ means that we move from right to left. Each time we move forward along a line we associate to it a sign $+$, while we associate to it a sign $-$ when we move backwards. So the tree can be characterized by a collection of $2n$ signs $\pm$ which define a walk $W = \{\pm \pm \dots \pm\}$. Note that not all one-dimensional random walks with $2n$ steps correspond to unlabelled trees: we call compatible the random walks for which this happens. For instance, the first sign is always a $+$ and the last one is always a $-$; moreover, the overall number of $+$ signs has to be equal to the overall number of $-$ signs. Note that the correspondence between unlabelled trees and one-dimensional compatible random walks is one-to-one. By neglecting all the constraints we can bound the number of collections of $2n$
signs, hence the number of unlabelled trees with $n$ nodes, by $2^{2n}$, that is the overall number of random walks with $2n$ steps. So we can choose $C = 4$ and the assertion follows.

Given a tree with $n$ vertices one has, as it is straightforward to check,
$$ 1 \le |V_f(\tau)| \le \begin{cases} n - 1 & \text{if } n \ge 2 , \\ 1 & \text{if } n = 1 , \end{cases} \qquad |V_{nt}(\tau)| \le 2|V_f(\tau)| - 1 . \tag{A.1.7} $$
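The encoding used in the proof of Lemma A.1 and the bounds (A.1.7) can be tested by brute force for small $n$. The sketch below rests on an assumption made only for the illustration: rooted unlabelled trees are modelled as plane trees, stored as nested tuples of subtrees (an endpoint is the empty tuple), and all function names are hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def plane_trees(n):
    """All rooted plane trees with n vertices, as nested tuples of subtrees."""
    if n == 1:
        return ((),)
    result = []
    # Split off the first subtree (k vertices); the rest is a tree on n - k vertices
    # sharing the same first vertex.
    for k in range(1, n):
        for first in plane_trees(k):
            for rest in plane_trees(n - k):
                result.append((first,) + rest)
    return tuple(result)


def walk(tree):
    """Sign sequence of the proof: '+' forward along a line, '-' backward."""
    return "+" + "".join(walk(child) for child in tree) + "-"


def vertex_counts(tree):
    """Return (vertices, endpoints, nontrivial vertices) of a nested-tuple tree."""
    n, f, nt = 1, 0, 0
    if len(tree) == 0:
        f = 1   # s_v = 0: endpoint
    if len(tree) != 1:
        nt = 1  # s_v = 0 or s_v >= 2: nontrivial
    for child in tree:
        cn, cf, cnt = vertex_counts(child)
        n, f, nt = n + cn, f + cf, nt + cnt
    return n, f, nt


if __name__ == "__main__":
    for n in range(1, 8):
        trees = plane_trees(n)
        print(n, len(trees), 4 ** n)
```

The outermost pair of signs corresponds to the root line, so a tree with $n$ vertices gives a walk with $2n$ steps, in agreement with (A.1.3).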
The number of labelled trees in $\mathcal{T}_{h,n}$ cannot be bounded uniformly in $h$: there are at most $2n - 1$ nontrivial vertices, by (A.1.7), but once they have been fixed, one can add many trivial vertices between them, and the number of possible insertions goes to infinity for $h \to -\infty$. Nevertheless, we have the following result on labelled trees.

Lemma A.2. Let $\mathcal{T}_{h,n}$ be the set of labelled trees of order $n$ and with scale $h$ assigned to the root. If $\gamma > 1$ and $\alpha > 0$, then
$$ \sum_{\tau \in \mathcal{T}_{h,n}} \prod_{v \notin V_f(\tau)} \gamma^{-\alpha(h_v - h_{v'})} \le C_\alpha^n \tag{A.1.8} $$
for some constant $C_\alpha$.

Proof. Let us denote by $\mathcal{T}^*_{h,n}$ the set of labelled trees of order $n$ having only nontrivial vertices, and by $\tau^*$ any element in $\mathcal{T}^*_{h,n}$. A labelled tree $\tau$ of order $n$ can be imagined as formed from a tree $\tau^*$ of order $n$, by inserting trivial vertices between the (nontrivial) vertices of $\tau^*$: the number of inserted vertices automatically determines the values of the scale labels. Fixing a tree $\tau$, so that the corresponding tree $\tau^*$ is determined, we can write
$$ \prod_{v \notin V_f(\tau)} \gamma^{-\alpha(h_v - h_{v'})} = \prod_{v \in V_{nt}(\tau^*) \setminus V_f(\tau^*)} \gamma^{-\alpha(h_v - h_{v'})} , \tag{A.1.9} $$
where, for $v$ seen as a vertex of $\tau^*$, $v'$ denotes the vertex in $\tau^*$ immediately preceding $v$. The tree $\tau$ can be obtained by inserting $h_v - h_{v'}$ trivial vertices between $v \in \tau^*$ and $v' \in \tau^*$. Then we have
$$ \sum_{\tau \in \mathcal{T}_{h,n}} \prod_{v \notin V_f(\tau)} \gamma^{-\alpha(h_v - h_{v'})} = \sum_{\tau^* \in \mathcal{T}^*_{h,n}} \prod_{v \in V(\tau^*) \setminus V_f(\tau^*)} \gamma^{-\alpha(h_v - h_{v'})} . \tag{A.1.10} $$
Denote by $\mathcal{T}^*_n$ the set of unlabelled trees of order $n$ having only nontrivial vertices. Then
$$ \sum_{\tau^* \in \mathcal{T}^*_{h,n}} = \sum_{\tau^* \in \mathcal{T}^*_n} \sum_{\{h_v\}_{v \in \tau^*}} , \tag{A.1.11} $$
so that
$$ \sum_{\tau^* \in \mathcal{T}^*_{h,n}} \prod_{v \in V(\tau^*) \setminus V_f(\tau^*)} \gamma^{-\alpha(h_v - h_{v'})} = \sum_{\tau^* \in \mathcal{T}^*_n} \sum_{\{h_v\}_{v \in \tau^*}} \prod_{v \in V(\tau^*) \setminus V_f(\tau^*)} \gamma^{-\alpha(h_v - h_{v'})} \le \sum_{\tau^* \in \mathcal{T}^*_n} \left( \frac{1}{\gamma^\alpha - 1} \right)^{n} \le C_\alpha^n , \tag{A.1.12} $$
where we used $|V(\tau^*)| = |V_{nt}(\tau)| \le 2n$ (see (A.1.7)), so that the number of elements in $\mathcal{T}^*_n$ is bounded by $C^{2n}$, for a constant $C$ (see Lemma A.1); moreover, in performing the sum over the scales we neglected all constraints except that of $h_v - h_{v'} \ge 1$.

A.1.3. Feynman diagrams

A graph can be imagined as formed by giving $n$ points $v_1, \dots, v_n$ with $d_{v_1}, \dots, d_{v_n}$ outcoming lines, respectively, and contracting (some of) such lines between themselves. We can also associate to each line a sign $\sigma = \pm 1$ and allow only contractions such that a line with a sign $+$ is contracted with a line with a sign $-$. In particular, we can consider points with 2 or 4 outcoming lines: in the first case there is one line with a sign $+$ and one line with a sign $-$, while in the second one there are two lines with a sign $+$ and two lines with a sign $-$. We denote by $n_2$ the number of points $v$ with $d_v = 2$ and by $n_4$ the number of points $v$ with $d_v = 4$: of course $n = n_2 + n_4$. The points can also have a structure: when $d_v = 4$ the point $v$ is formed by two disjoint points connected through a wavy line, while when $d_v = 2$ the point can be characterized by an extra label. We shall call graph elements the points with structure. We shall consider only graphs of the above type which are connected: such graphs will be called Feynman diagrams and will be denoted by $\Gamma$. Note that if all the lines are contracted then for each $v \in \Gamma$ one has $d(v) = d_v$, while if we allow some lines to remain uncontracted then $d(v) \le d_v$: in such a case the uncontracted lines are called the external lines of the diagram. The number of Feynman diagrams is controlled through the following result.

Lemma A.3. Consider a Feynman diagram formed with $n$ graph elements $v_1, \dots, v_n$ such that $d_{v_j} \in \{2, 4\}$ for all $j = 1, \dots, n$, and with $2p$ uncontracted lines ($p$ with the sign $+$ and $p$ with the sign $-$). Then the number of Feynman graphs is bounded by $C^n (2n)!$ uniformly in $p$.

Proof.
A generic Feynman graph can be obtained in the following way. First construct a tree graph between the $n$ graph elements: such a tree will be formed by contracting $2(n-1)$ lines. The number of trees which can be obtained in this way is bounded by $C^n n!$ (by Lemma A.1). Then contract all the remaining $4n - 2p - 2(n-1) = 2(n - p + 1)$ lines (one has to exclude the $2p$ lines which have to be left uncontracted), by using the fact that only lines with opposite signs can be contracted between themselves. Of course, among the $2(n - p + 1)$ lines there are $n - p + 1$ lines with a sign $+$ and $n - p + 1$ lines with a sign $-$; therefore, such lines can be contracted in $(n - p + 1)!$ possible ways, so that the number of diagrams which can be obtained starting from a fixed tree between the graph elements is bounded by $n!$ uniformly in $p$. By collecting together the two bounds the assertion follows.
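A.2. Discrete versus continuum

A.2.1. Discrete derivatives

Given a function $\hat F(k)$ with $k = (k, k_0) \in D_{L,\beta}$, we set $\partial_k = (\partial_k, \partial_{k_0})$, where
$$ \partial_k \hat F(k, k_0) = \frac{\hat F(k + \Delta k, k_0) - \hat F(k, k_0)}{\Delta k} , \qquad \Delta k = \frac{2\pi}{L} \tag{A.2.1} $$
and, analogously,
$$ \partial_{k_0} \hat F(k, k_0) = \frac{\hat F(k, k_0 + \Delta k_0) - \hat F(k, k_0)}{\Delta k_0} , \qquad \Delta k_0 = \frac{2\pi}{\beta} . \tag{A.2.2} $$
Note that, if
$$ F(x) = \sum_{k \in D_{L,\beta}} e^{-ik\cdot x}\, \hat F(k) , \tag{A.2.3} $$
then
$$ \sum_{k \in D_{L,\beta}} e^{-ik\cdot x}\, \partial_k \hat F(k) = \frac{e^{-i\Delta k\, x} - 1}{\Delta k} \sum_{k \in D_{L,\beta}} e^{-ik\cdot x}\, \hat F(k) , \tag{A.2.4} $$
so that, for $|x| \le L/2$,
$$ |x F(x)| \le C \left| \frac{e^{-i\Delta k\, x} - 1}{\Delta k}\, F(x) \right| \le C \sum_{k \in D_{L,\beta}} \big| e^{-ik\cdot x}\, \partial_k \hat F(k) \big| \le C \sum_{k \in D_{L,\beta}} \big| \partial_k \hat F(k) \big| , \tag{A.2.5} $$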
where $C$ denotes some constant.

A.3. Truncated expectations and Gram–Hadamard inequality

A.3.1. Truncated expectations and graphic representations

Given a Grassmann algebra as in (4.1) and an integration measure like (4.10) we define a simple expectation as in (4.12). Then
$$ g_\alpha = E(\psi^-_\alpha \psi^+_\alpha) . \tag{A.3.1} $$
Given a monomial
$$ X(\psi) = \tilde\psi_B = \prod_{\alpha \in B} \psi^{\sigma_\alpha}_\alpha , \tag{A.3.2} $$
where $B$ is a subset of $A$ and $\sigma_\alpha \in \{\pm\}$, the expectation $E(\tilde\psi_B)$ can be graphically represented in the following way.
Represent the indices $\alpha \in B$ as points on the plane. With each $\psi^+_\alpha$, $\alpha \in B$, we associate a line exiting from $\alpha$, while with each $\psi^-_\alpha$, $\alpha \in B$, we associate a line entering $\alpha$. Let $\mathcal{T}$ be the set of graphs obtained by contracting such lines in all possible ways so that only lines with opposite $\sigma$ are contracted: given $\alpha, \beta \in B$, denote by $(\alpha\beta)$ the line joining $\alpha$ and $\beta$ and by $\tau$ an element of $\mathcal{T}$, i.e. a graph in $\mathcal{T}$. Then we can easily verify that
$$ E(\tilde\psi_B) = \sum_{\tau \in \mathcal{T}} (-1)^{\pi_\tau} \prod_{(\alpha\beta) \in \tau} g_\alpha\, \delta_{\alpha,\beta} , \tag{A.3.3} $$
which is the Wick rule stated in Section 4.1: here $(-1)^{\pi_\tau}$ is a sign which depends on the graph $\tau$ (see (4.20)). Then define the truncated expectation
$$ E^T(\tilde\psi_{B_1}, \dots, \tilde\psi_{B_p}; n_1, \dots, n_p) , \tag{A.3.4} $$
with $B_j \subset A$ for any $j$, as in (4.13). One can easily check that, if $X_j$ are analytic functions of the Grassmann variables (each depending on an even number of variables, for simplicity, so that no change of sign intervenes in permuting the order of the $X_j$), then

(1) $E^T(X_1, X_2) = E(X_1 X_2) - E(X_1)E(X_2)$;

(2) $E^T(X_1, X_2, X_3) = E(X_1 X_2 X_3) - E(X_1 X_2)E(X_3) - E(X_1 X_3)E(X_2) - E(X_2 X_3)E(X_1) + 2E(X_1)E(X_2)E(X_3)$;

(3) $E^T(X_1, X_2, X_3, X_4) = E(X_1 X_2 X_3 X_4) - E(X_1 X_2 X_3)E(X_4) - E(X_1 X_2 X_4)E(X_3) - E(X_1 X_3 X_4)E(X_2) - E(X_2 X_3 X_4)E(X_1) - E(X_1 X_2)E(X_3 X_4) - E(X_1 X_3)E(X_2 X_4) - E(X_1 X_4)E(X_2 X_3) + 2E(X_1 X_2)E(X_3)E(X_4) + 2E(X_1 X_3)E(X_2)E(X_4) + 2E(X_1 X_4)E(X_2)E(X_3) + 2E(X_2 X_3)E(X_1)E(X_4) + 2E(X_2 X_4)E(X_1)E(X_3) + 2E(X_3 X_4)E(X_1)E(X_2) - 6E(X_1)E(X_2)E(X_3)E(X_4)$ $\qquad$ (A.3.5)
and so on. One can always write the truncated expectations in terms of simple expectations: it is easy to check that, in general, one has
$$ E^T(X_1, \dots, X_s) = \sum_{p=1}^{s} \sum_{Y_1, \dots, Y_p} (-1)^{\pi}\, E(Y_1) \cdots E(Y_p) , \tag{A.3.6} $$
where (1) the sums are over all the possible partitions of $\{1, \dots, s\}$ into $p$ subsets such that $\bigcup_{j=1}^{s} X_j = \bigcup_{k=1}^{p} Y_k$ and each $Y_k$ is the union of sets $X_j$, (2) $\pi$ is the parity leading to $\{Y_1, \dots, Y_p\}$ with respect to the initial ordering.
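For commuting variables the expansions (1) to (3) of (A.3.5) are the usual relations between joint cumulants and moments, and they can be checked exactly on a toy model. The sketch below is an illustration under explicit assumptions: the "expectation" is the average over a small hypothetical sample table, the variables commute (so no parity sign appears), and each partition into $p$ blocks is given the standard cumulant weight $(-1)^{p-1}(p-1)!$, which is what reproduces the coefficients $2$ and $-6$ appearing in (A.3.5).

```python
from fractions import Fraction
from math import factorial

# Hypothetical data: each row is one joint sample of (X_1, X_2, X_3, X_4).
SAMPLES = [(1, 2, 0, 3), (2, -1, 1, 0), (0, 3, 2, 1), (1, 1, -2, 2), (3, 0, 1, -1)]


def expectation(indices):
    """Exact average of the product of the variables with the given indices."""
    total = Fraction(0)
    for sample in SAMPLES:
        term = Fraction(1)
        for j in indices:
            term *= sample[j]
        total += term
    return total / len(SAMPLES)


def set_partitions(items):
    """Yield every partition of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        yield [[first]] + partition


def truncated(indices):
    """Truncated expectation as a weighted sum over partitions of the indices."""
    total = Fraction(0)
    for partition in set_partitions(list(indices)):
        p = len(partition)
        term = Fraction((-1) ** (p - 1) * factorial(p - 1))
        for block in partition:
            term *= expectation(block)
        total += term
    return total


def moment_from_truncated(indices):
    """Inverse relation: a simple expectation as a sum of products of truncated ones."""
    total = Fraction(0)
    for partition in set_partitions(list(indices)):
        term = Fraction(1)
        for block in partition:
            term *= truncated(block)
        total += term
    return total
```

The inverse relation implemented in `moment_from_truncated` is the commuting analogue of the expansion of simple expectations in terms of truncated ones used later in this appendix.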
Also the truncated expectation (A.3.4) can be graphically represented. Draw in the plane $n_1$ boxes $G_{11}, \dots, G_{1n_1}$, such that each of them contains all points representing the indices belonging to $B_1$, $n_2$ boxes $G_{21}, \dots, G_{2n_2}$, such that each of them contains all points representing the indices belonging to $B_2$, and so on: we call such boxes clusters (for obvious reasons, if one recalls the definition of clusters given in Section 5.2). Then consider all possible graphs $\tau$ obtained by contracting, as before, all the lines emerging from the points in such a way that no line is left uncontracted and with the property that if the clusters were considered as points then $\tau$ would be connected. If we denote the lines as before we have
$$ E^T(\tilde\psi_{B_1}, \dots, \tilde\psi_{B_p}; n_1, \dots, n_p) = \sum_{\tau \in \mathcal{T}_0} (-1)^{\pi_\tau} \prod_{(\alpha\beta) \in \tau} g_\alpha\, \delta_{\alpha,\beta} , \tag{A.3.7} $$
where $\mathcal{T}_0$ denotes the set of all graphs obtained following the prescription just given; again $(-1)^{\pi_\tau}$ is a sign depending on $\tau$. If $A = A_1 \oplus A_2$, we can define $E_1$ and $E_2$ as the expectations defined as $E$ in (4.10) and (4.12), with the difference that we have the constraint $a \in A_1$ and $a \in A_2$, respectively. Then if each field $\psi^{\sigma_\alpha}_\alpha$ appearing in the products $\tilde\psi_{B_j}$ is replaced by
$$ \psi^{\sigma_\alpha}_\alpha \to \psi^{\sigma_{\alpha_1}}_{\alpha_1} + \psi^{\sigma_{\alpha_2}}_{\alpha_2} , \qquad \sigma_\alpha = \sigma_{\alpha_1} = \sigma_{\alpha_2} , \tag{A.3.8} $$
with $\alpha_1 \in A_1$ and $\alpha_2 \in A_2$, we can consider
$$ E_2^T(\tilde\psi_{B_1}, \dots, \tilde\psi_{B_p}; n_1, \dots, n_p) , \tag{A.3.9} $$
where $E_2^T$ denotes the truncated expectation corresponding to the simple expectation $E_2$. Consider for simplicity the case $n_j = 1$ for all $j$; by (4.18) this is not restrictive. We have for (A.3.9) the following graphic representation. For each $B_j$ write $B_j = B_{j1} \cup B_{j2}$, with $B_{j1} \cap B_{j2} = \emptyset$. Fixing the sets $B_{11}, \dots, B_{p1}$, define $B = \bigcup_{j=1}^{p} B_{j1}$ and $\mathcal{T}_0(B)$ as the set of graphs obtained by contracting the lines emerging from the points contained inside the boxes corresponding to $B_{12}, \dots, B_{p2}$. Then
$$ E_2^T(\tilde\psi_{B_1}, \dots, \tilde\psi_{B_p}; n_1, \dots, n_p) = \sum_{B} \tilde\psi_B \sum_{\tau \in \mathcal{T}_0(B)} (-1)^{\pi_\tau} \prod_{(\alpha\beta) \in \tau} g_\alpha\, \delta_{\alpha,\beta} , \qquad g_\alpha = E(\psi^-_\alpha \psi^+_\alpha) . \tag{A.3.10} $$
If $A = A_1 \oplus A_2 \oplus \cdots \oplus A_N$ for some $N \in \mathbb{N}$, the above procedure can be iterated (in the obvious way).

A.3.2. Proof of (4.43)

Given $s$ sets of indices $P_1, \dots, P_s$, consider quantity (4.38). Define $P_j^\pm = \{ f \in P_j : \sigma(f) = \pm \}$ and set $f = (j, i)$ for $f \in P_j^\pm$, with $i = 1, \dots, |P_j^\pm|$. Note that
$$ \sum_{j=1}^{s} |P_j^+| = \sum_{j=1}^{s} |P_j^-| , \tag{A.3.11} $$
otherwise (4.38) is vanishing.
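Define also
$$ P(d\psi) = \prod_{j=1}^{s} \Bigg[ \prod_{f \in P_j^+} d\psi^+_{x(f)} \prod_{f \in P_j^-} d\psi^-_{x(f)} \Bigg] , \qquad (\psi^+, \Gamma \psi^-) = \sum_{j,j'=1}^{s} \sum_{i=1}^{|P_j^-|} \sum_{i'=1}^{|P_{j'}^+|} \psi^+_{(j',i')}\, \Gamma_{(j,i),(j',i')}\, \psi^-_{(j,i)} , \tag{A.3.12} $$
where, if
$$ n = \sum_{j=1}^{s} |P_j^+| = \sum_{j=1}^{s} |P_j^-| , \tag{A.3.13} $$
then $\Gamma$ is the $n \times n$ matrix with entries
$$ \Gamma_{(j,i),(j',i')} = g(x(j,i) - x(j',i')) . \tag{A.3.14} $$
Then one has
$$ E\Bigg( \prod_{j=1}^{s} \tilde\psi(P_j) \Bigg) = \det \Gamma = \int P(d\psi)\, \exp\big[ -(\psi^+, \Gamma \psi^-) \big] , \tag{A.3.15} $$
which is known as Berezin integral [57]. Setting $X \equiv \{1, \dots, s\}$ and
$$ \bar V_{jj'} = \sum_{i=1}^{|P_j^-|} \sum_{i'=1}^{|P_{j'}^+|} \psi^+_{(j',i')}\, \Gamma_{(j,i),(j',i')}\, \psi^-_{(j,i)} , \tag{A.3.16} $$
write
$$ V(X) = \sum_{j,j' \in X} \bar V_{jj'} = \sum_{j \le j'} V_{jj'} , \tag{A.3.17} $$
thus defining the quantity $V_{jj'}$ as
$$ V_{jj'} = \begin{cases} \bar V_{jj'} & \text{if } j = j' , \\ \bar V_{jj'} + \bar V_{j'j} & \text{if } j < j' . \end{cases} \tag{A.3.18} $$
Then (A.3.15) can be written as
$$ E\Bigg( \prod_{j=1}^{s} \tilde\psi(P_j) \Bigg) = \int P(d\psi)\, e^{-V(X)} . \tag{A.3.19} $$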
Fig. 23. The sets Xk for k = 1; 2; 3. One has X1 = {1}; X2 = {1; 2} and X3 = {1; 2; 3}.
We want to express (A.3.17) in terms of the following quantities. Define
$$ W_X(X_1, \dots, X_r; t_1, \dots, t_r) = \sum_{\ell} \prod_{k=1}^{r} t_k(\ell)\, V_\ell , \tag{A.3.20} $$
where

(1) $X_k$ are subsets of $X$ with $|X_k| = k$, inductively defined as
$$ X_1 = \{1\} , \qquad X_{k+1} \supset X_k ; \tag{A.3.21} $$

(2) $\ell = (jj')$ is a pair of elements $j, j' \in X$ and the sum in (A.3.20) is over all the possible pairings $(jj')$;

(3) the functions $t_k(\ell)$ are defined as
$$ t_k(\ell) = \begin{cases} t_k & \text{if } \ell \sim \partial X_k , \\ 1 & \text{otherwise} , \end{cases} \tag{A.3.22} $$
where $\ell \sim \partial X_k$ means that $\ell = (jj')$ “intersects the boundary” of $X_k$, i.e. it connects a point belonging to some $P_j$ with $j \in X_k$ to a point contained inside some $P_{j'}$ with $j' \notin X_k$. See Fig. 23.

One has
$$ W_X(X_1; t_1) = \sum_{j=2}^{s} t_1 V_{1j} + V_{11} + \sum_{1 < j' \le j} V_{j'j} = (1 - t_1)\big[ V(X_1) + V(X \setminus X_1) \big] + t_1 V(X) , \tag{A.3.23} $$
so that
$$ e^{-V(X)} = \int_0^1 dt_1\, \frac{\partial}{\partial t_1}\, e^{-W_X(X_1; t_1)} + e^{-W_X(X_1; 0)} = - \sum_{\ell_1 \sim \partial X_1} V_{\ell_1} \int_0^1 dt_1\, e^{-W_X(X_1; t_1)} + e^{-W_X(X_1; 0)} . \tag{A.3.24} $$
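The first interpolation step (A.3.24) is an application of the fundamental theorem of calculus to $t_1 \mapsto e^{-W_X(X_1;t_1)}$, and it can be illustrated with ordinary numbers. The sketch below makes an assumption that goes beyond the text: the quantities $V_\ell$ (which are even in the Grassmann fields and therefore commute) are replaced by real numbers, with `v_inside` standing for the lines that do not cross $\partial X_1$ and `v_crossing` for the list of the $V_{\ell_1}$ with $\ell_1 \sim \partial X_1$; both names are hypothetical.

```python
import math


def interpolated_weight(v_inside, v_crossing, t):
    """W_X(X_1; t_1): crossing lines are multiplied by t, the others are untouched."""
    return v_inside + t * sum(v_crossing)


def rhs_a324(v_inside, v_crossing, steps=10000):
    """Right-hand side of (A.3.24), with the t-integral done by the midpoint rule."""
    integral = 0.0
    for i in range(steps):
        t = (i + 0.5) / steps
        integral += math.exp(-interpolated_weight(v_inside, v_crossing, t))
    integral /= steps
    return -sum(v_crossing) * integral + math.exp(-interpolated_weight(v_inside, v_crossing, 0.0))


def lhs_a324(v_inside, v_crossing):
    """e^{-V(X)}, i.e. the interpolated weight evaluated at t = 1."""
    return math.exp(-interpolated_weight(v_inside, v_crossing, 1.0))
```

At $t_1 = 0$ the cluster $X_1$ is decoupled from the rest, which is the mechanism that later isolates the connected (truncated) part.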
If we de6ne X2 ≡ X1 ∪ ‘1 , i.e. X2 = {1; point connected by ‘1 with 1}, then WX (X1 ; X2 ; t1 ; t2 )
s s = t2 WX (X1 ; t1 ) + (1 − t2 ) t1 V1j + V11 + Vj j − (t1 V1j + V2j ) 1¡j 6j
j=2
j=3
= (1 − t2 )[WX2 (X1 ; t1 ) + V (X \X2 )] + t2 WX (X1 ; t1 ) ;
(A.3.25)
so that e
−WX (X1 ;t1 )
9 −WX (X1 ; X2 ;t1 ; t2 ) = d t2 e + e−WX (X1 ; X2 ;t1 ; 0) 9t2 0 1 =− V‘2 d t2 t1 (‘2 )e−WX (X1 ; X2 ;t1 ; t2 ) + e−WX (X1 ; X2 ;t1 ; 0) :
1
0
‘2 ∼9X2
(A.3.26)
Therefore,
$$ e^{-V(X)} = \sum_{\ell_1 \sim \partial X_1} \sum_{\ell_2 \sim \partial X_2} \int_0^1 dt_1 \int_0^1 dt_2\, (-1)^2\, V_{\ell_1} V_{\ell_2}\, t_1(\ell_2)\, e^{-W_X(X_1, X_2; t_1, t_2)} + \sum_{\ell_1 \sim \partial X_1} \int_0^1 dt_1\, (-1)\, V_{\ell_1}\, e^{-W_X(X_1, X_2; t_1, 0)} + e^{-W_X(X_1; 0)} \tag{A.3.27} $$
and, iterating $s - 1$ times,
$$ e^{-V(X)} = \sum_{r=0}^{s-1} \sum_{\ell_1 \sim \partial X_1} \cdots \sum_{\ell_r \sim \partial X_r} \int_0^1 dt_1 \cdots \int_0^1 dt_r\, (-1)^r\, V_{\ell_1} \cdots V_{\ell_r} \times \prod_{k=1}^{r-1} t_1(\ell_{k+1}) \cdots t_k(\ell_{k+1})\; e^{-W_X(X_1, \dots, X_{r+1}; t_1, \dots, t_r, 0)} , \tag{A.3.28} $$
where the factors which are meaningless have to be set equal to 1 and for $r = s - 1$ one has
$$ W_X(X_1, \dots, X_s; t_1, \dots, t_{s-1}, 0) = W_X(X_1, \dots, X_{s-1}; t_1, \dots, t_{s-1}) . \tag{A.3.29} $$
One can easily check that
$$ W_X(X_1, \dots, X_r; t_1, \dots, t_{r-1}, 0) = W_{X_r}(X_1, \dots, X_{r-1}; t_1, \dots, t_{r-1}) + V(X \setminus X_r) . \tag{A.3.30} $$
Let us introduce a tree graph $T$ between the sets $X_1, \dots, X_r$, such that

(1) for each $k = 1, \dots, r$, it is “anchored” to some point $(j, i)$, i.e. it contains a line incident with $(j, i)$, where $j \in X_k$ and $i \in \{1, \dots, |P_j^\pm|\}$,

(2) each line $\ell \in T$ intersects at least one boundary $\partial X_k$,

(3) the lines $\ell_1, \ell_2, \dots$ are ordered so that $\ell_1 \sim \partial X_1$, $\ell_2 \sim \partial X_2, \dots$,

(4) for each $\ell \in T$ one defines two indices $n(\ell)$ and $n'(\ell)$ such that
$$ n(\ell) = \max\{ k : \ell \sim \partial X_k \} , \qquad n'(\ell) = \min\{ k : \ell \sim \partial X_k \} . \tag{A.3.31} $$
We shall call $T$ an anchored tree. Then, we can rewrite (A.3.28) as
$$ e^{-V(X)} = \sum_{r=1}^{s} (-1)^{r-1} \sum_{X_r \subset X} \sum_{X_2 \dots X_{r-1}} \sum_{T \text{ on } X_r} \prod_{\ell \in T} V_\ell \int_0^1 dt_1 \cdots \int_0^1 dt_{r-1} \prod_{\ell \in T} \frac{\prod_{k=1}^{r-1} t_k(\ell)}{t_{n(\ell)}}\; e^{-W_{X_r}(X_1, \dots, X_{r-1}; t_1, \dots, t_{r-1})}\, e^{-V(X \setminus X_r)} , \tag{A.3.32} $$
where “$T$ on $X_r$” means that $T$ is an anchored tree for the clusters $P_j$ such that $j \in X_r$. Define
$$ K(X_r) = \sum_{X_2 \dots X_{r-1}} \sum_{T \text{ on } X_r} \prod_{\ell \in T} V_\ell \int_0^1 dt_1 \cdots \int_0^1 dt_{r-1} \prod_{\ell \in T} \frac{\prod_{k=1}^{r-1} t_k(\ell)}{t_{n(\ell)}}\; e^{-W_{X_r}(X_1, \dots, X_{r-1}; t_1, \dots, t_{r-1})} , \tag{A.3.33} $$
so that (A.3.32) becomes
$$ e^{-V(X)} = \sum_{\substack{Y \subset X \\ Y \ni 1}} (-1)^{|Y| - 1}\, K(Y)\, e^{-V(X \setminus Y)} \tag{A.3.34} $$
and, iterating,
$$ e^{-V(X)} = \sum_{Q_1, \dots, Q_m} (-1)^{|X|} (-1)^m \prod_{q=1}^{m} K(Q_q) . \tag{A.3.35} $$
Note that the constraint $1 \in Y$ in (A.3.34) would yield a constraint like $1 \in Q_1$, $\min\{ k : k \in X \setminus Q_1 \} \in Q_2$ and so on in (A.3.35), but, as a rearrangement of the sets $Q_q$ inside the partition $\{Q_1, \dots, Q_m\}$ does not change (A.3.35) because the Grassmann fields $\psi^\pm$ always appear in pairs, we can forget such a constraint. Therefore, by (A.3.19) and (A.3.35), one has (recall also the first of (A.3.12))
$$ E\Bigg( \prod_{j=1}^{s} \tilde\psi(P_j) \Bigg) = \int P(d\psi) \sum_{Q_1, \dots, Q_m} (-1)^s (-1)^m \prod_{q=1}^{m} K(Q_q) . \tag{A.3.36} $$
In (A.3.33) we can sum first over the trees $T$, then over the sets $X_k$,
$$ \sum_{X_2 \dots X_{r-1}} \sum_{T \text{ on } X_r} = \sum_{T \text{ on } X_r} \sum_{\substack{X_2 \dots X_{r-1} \\ \text{fixed } T}} , \tag{A.3.37} $$
where “fixed $T$” recalls that the sets $X_2, \dots, X_r$ have to be compatible with the tree $T$. Moreover we can write, by (A.3.20),
$$ W_{X_r}(X_1, \dots, X_{r-1}; t_1, \dots, t_{r-1}) = \sum_{\ell \in X_r} t_1(\ell) \cdots t_{r-1}(\ell)\, V_\ell = \sum_{\ell \in X_r} t_{n'(\ell)} \cdots t_{n(\ell)}\, V_\ell \tag{A.3.38} $$
and set in (A.3.33)
$$ \frac{\prod_{k=1}^{r-1} t_k(\ell)}{t_{n(\ell)}} = t_{n'(\ell)} \cdots t_{n(\ell)-1} , \tag{A.3.39} $$
thus obtaining
$$ K(X_r) = \sum_{T \text{ on } X_r} \sum_{\substack{X_2 \dots X_{r-1} \\ \text{fixed } T}} \prod_{\ell \in T} V_\ell \int_0^1 dt_1 \cdots \int_0^1 dt_{r-1} \prod_{\ell \in T} \big( t_{n'(\ell)} \cdots t_{n(\ell)-1} \big)\; e^{-\sum_{\ell \in X_r} t_{n'(\ell)} \cdots t_{n(\ell)} V_\ell} . \tag{A.3.40} $$
We can reorder the integration measure $P(d\psi)$ in (A.3.12) as
$$ P(d\psi) = \prod_{j=1}^{s} \Bigg[ \prod_{i=1}^{|P_j^-|} d\psi^-_{(j,i)} \prod_{i'=1}^{|P_j^+|} d\psi^+_{(j,i')} \Bigg] = (-1)^{\pi} \prod_{q=1}^{m} \Bigg[ \prod_{i=1}^{|Q_q^-|} d\psi^{(q)-}_{i} \prod_{i'=1}^{|Q_q^+|} d\psi^{(q)+}_{i'} \Bigg] \equiv (-1)^{\pi}\, \bar P(d\psi) , \tag{A.3.41} $$
where (i) $\psi^-_{(j,i)}$ and $\psi^+_{(j,i')}$ correspond to indices $f \in P_j$, while $\psi^{(q)-}_{i}$ and $\psi^{(q)+}_{i'}$ correspond to indices $(q, i)$ and $(q, i')$ in $Q_q = Q_q^+ \cup Q_q^-$, (ii) $\sum_{q=1}^{m} |Q_q^-| = \sum_{q=1}^{m} |Q_q^+|$, (iii) $\pi$ is the parity of the permutation leading the Grassmann fields $\psi^\pm$ from the initial ordering (left-hand side) to the final one (right-hand side). The simple expectations can be expressed in terms of truncated expectations through the relation
$$ E\Bigg( \prod_{j=1}^{s} \tilde\psi(P_j) \Bigg) = \sum_{Q_1, \dots, Q_m} (-1)^{\pi}\, E^T(\tilde\psi(Q_1)) \cdots E^T(\tilde\psi(Q_m)) , \tag{A.3.42} $$
where (1) the sum is over all the possible partitions of $\{1, \dots, s\}$ into $m$ subsets $Q_1, \dots, Q_m$ such that each $Q_k$, $k = 1, \dots, m$, is the union of sets $P_j$ and $\bigcup_{j=1}^{s} P_j = \bigcup_{q=1}^{m} Q_q$, (2) $\pi$ is the parity leading to $\{Q_1, \dots, Q_m\}$ with respect to the initial ordering. It is easy to realize that the parity $\pi$ in (A.3.41) is equal to the parity $\pi$ in (A.3.42), if the sets $Q_1, \dots, Q_m$ are chosen in the same way (i.e. if the sets $Q_q$ in (A.3.41) are the same sets $Q_q$ as in (A.3.42)). Therefore, by comparing (A.3.42) with (A.3.36) (by taking into account also (A.3.40) and (A.3.41)), we find the following expression for the truncated expectations:
$$ E^T(\tilde\psi(Q_1), \dots, \tilde\psi(Q_m)) = (-1)^{m+1} \int \bar P(d\psi) \sum_{T \text{ on } X_m} \sum_{\substack{X_2 \dots X_{m-1} \\ \text{fixed } T}} \prod_{\ell \in T} V_\ell \int_0^1 dt_1 \cdots \int_0^1 dt_{m-1} \prod_{\ell \in T} \big( t_{n'(\ell)} \cdots t_{n(\ell)-1} \big)\; e^{-\sum_{\ell \in X} t_{n'(\ell)} \cdots t_{n(\ell)} V_\ell} . \tag{A.3.43} $$
A remarkable property of (A.3.43) is the following result.

Lemma A.4. In (A.3.43) one has
$$ \sum_{\substack{X_2 \dots X_{m-1} \\ \text{fixed } T}} \int_0^1 dt_1 \cdots \int_0^1 dt_{m-1} \prod_{\ell \in T} \big( t_{n'(\ell)} \cdots t_{n(\ell)-1} \big) = 1 \tag{A.3.44} $$
for any anchored tree $T$. As in (A.3.44)
$$ dP_T(t) \equiv \sum_{\substack{X_2 \dots X_{m-1} \\ \text{fixed } T}} \prod_{\ell \in T} \big( t_{n'(\ell)} \cdots t_{n(\ell)-1} \big) \prod_{q=1}^{m-1} dt_q \tag{A.3.45} $$
is positive and $\sigma$-additive, it can be interpreted as a probability measure in the variable $t = (t_1, \dots, t_{m-1})$.

Proof. Let us denote by $b_k$ the number of lines $\ell \in T$ exiting from points $x(j, i)$, with $j \in X_k$. By construction the parameter $t_k$ inside the integral in the left-hand side of (A.3.44) appears to the power $b_k - 1$, as all the lines intersecting $\partial X_k$ contribute to $t_k$, except the one connecting $X_k$ with the point whose union with $X_k$ gives the set $X_{k+1}$ (this is clear by using the notations introduced after (A.3.20)). See Fig. 24. Then
$$ \prod_{\ell \in T} \big( t_{n'(\ell)} \cdots t_{n(\ell)-1} \big) = \prod_{k=1}^{m-2} t_k^{b_k - 1} \tag{A.3.46} $$
Fig. 24. The sets X1 ; : : : ; X6 , the (anchored) tree T and the lines belonging to T .
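and in (A.3.44) one has $m - 1$ independent integrations
$$ \int_0^1 dt_{m-1} \prod_{k=1}^{m-2} \int_0^1 dt_k\, t_k^{b_k - 1} = \prod_{k=1}^{m-2} \frac{1}{b_k} , \tag{A.3.47} $$
which is a well-defined expression as $b_k \ge 1$ for $k = 1, \dots, m - 2$. Moreover, we can write
$$ \sum_{\substack{X_2 \dots X_{m-1} \\ \text{fixed } T}} = \sum_{\substack{X_2 \\ \text{fixed } X_1}} \; \sum_{\substack{X_3 \\ \text{fixed } X_1, X_2}} \cdots \sum_{\substack{X_{m-1} \\ \text{fixed } X_1, \dots, X_{m-2}}} , \tag{A.3.48} $$
where the number of possible choices in summing over $X_k$, once $X_1, \dots, X_{k-1}$ have been fixed, is exactly $b_{k-1}$: if $b_{k-1}$ lines exit from $X_{k-1}$ then $X_k$ is obtained by adding to $X_{k-1}$ one of the $b_{k-1}$ points connected to $X_{k-1}$ through one of the lines of the tree. Then
$$ \sum_{\substack{X_2 \dots X_{m-1} \\ \text{fixed } T}} 1 = b_1 \cdots b_{m-2} \tag{A.3.49} $$
and, at the end,
$$ \sum_{\substack{X_2 \dots X_{m-1} \\ \text{fixed } T}} \int_0^1 dt_1 \cdots \int_0^1 dt_{m-1} \prod_{\ell \in T} \big( t_{n'(\ell)} \cdots t_{n(\ell)-1} \big) = \prod_{k=1}^{m-2} \frac{b_k}{b_k} , \tag{A.3.50} $$
which yields (A.3.44).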
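Lemma A.4 can be checked by brute force on small trees. The sketch below is an illustration under a simplifying assumption: each cluster is treated as a single point labelled $1, \dots, m$, the anchored tree is a list of pairs of such labels, and a chain $X_1 \subset \cdots \subset X_m$ is called compatible when each $X_{k+1}$ is obtained from $X_k$ by adding the far end of a tree line crossing $\partial X_k$. All function names are hypothetical.

```python
from fractions import Fraction


def crosses(line, subset):
    """True if exactly one end of the line lies in the subset (line ~ boundary of subset)."""
    a, b = line
    return (a in subset) != (b in subset)


def compatible_chains(tree_lines, m):
    """Yield all chains X_1 = {1}, X_2, ..., X_m compatible with the tree."""
    def grow(chain):
        current = chain[-1]
        if len(current) == m:
            yield list(chain)
            return
        for a, b in tree_lines:
            if crosses((a, b), current):
                new_point = b if a in current else a
                yield from grow(chain + [current | {new_point}])
    yield from grow([frozenset({1})])


def chain_weight(tree_lines, chain):
    """Exact value of the t-integral in (A.3.44) for one fixed chain."""
    m = len(chain)
    exponent = {k: 0 for k in range(1, m)}
    for line in tree_lines:
        ks = [k for k in range(1, m) if crosses(line, chain[k - 1])]
        # The line contributes t_{n'} ... t_{n-1}, with n' = min(ks) and n = max(ks).
        for k in range(min(ks), max(ks)):
            exponent[k] += 1
    weight = Fraction(1)
    for k in range(1, m):
        weight *= Fraction(1, exponent[k] + 1)  # integral of t^e over [0, 1]
    return weight


def lemma_a4_sum(tree_lines, m):
    """Left-hand side of (A.3.44) for the given tree on m points."""
    return sum(chain_weight(tree_lines, chain) for chain in compatible_chains(tree_lines, m))
```

For a star with centre 1 and three leaves there are $3 \cdot 2 \cdot 1$ compatible chains, each of weight $\frac{1}{3} \cdot \frac{1}{2}$, so the total is 1, in agreement with the counting $b_1 \cdots b_{m-2}$ against $\prod_k 1/b_k$ in the proof.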
Set V (t) ≡
tn (‘) : : : tn(‘) V‘ ;
(A.3.51)
‘∈X
so that, in (A.3.43), we can rewrite V‘ = (VS ij + VSji ) ‘∈T
(A.3.52)
(i; j)
and use de6nition (A.3.45) to obtain ET ( ˜ (Q1 ); : : : ; ˜ (Qm )) = (−1)m+1
S d ) P(
(VSj j + VSjj )
d PT (t) e−V (t) ;
T on {xi(q) } (jj )∈T
(A.3.53)
where $\sum_{T \text{ on } \{x_i^{(q)}\}}$ denotes the sum over the trees on $X$, seen as a sum over the trees anchored on some point $x_i^{(q)}$, $q = 1, \ldots, m$ and $i = 1, \ldots, |Q_q|$. If we integrate the Grassmann fields appearing in the product
\[ \prod_{(jj') \in T} \left( \bar V_{jj'} + \bar V_{j'j} \right) \tag{A.3.54} \]
in (A.3.53), we obtain
E ( ˜ (Q1 ); : : : ; ˜ (Qm )) = (−1)m+1 T
T on
{xi(q) }
g‘
∗ PS (d )
d PT (t) e−V (t) ;
(A.3.55)
‘∈T
∗
where PS (d ) means that the Grassman 6elds which are left to integrate are the ones not appearing in (A.3.54). The term ∗ PS (d ) d PT (t) e−V (t) (A.3.56) in (A.3.55) is the determinant of a suitable matrix G T (t) with elements G(Tj; i)( j ; i ) = tn ( jj ) : : : tn( jj ) g(x(j; i) − x(j ; i )) :
(A.3.57)
So (4.43) is proven, with $t_{j,j'} = t_{n'(jj')} \cdots t_{n(jj')}$.

A.3.3. Estimates for the truncated expectations

The following results hold.

Lemma A.5. Given $m$ sets of indices $Q_1, \ldots, Q_m$ such that
\[ \{ x(f) : f \in Q_q \} = \{ x_1^{(q)}, \ldots, x_{|Q_q|}^{(q)} \}, \qquad q = 1, \ldots, m \tag{A.3.58} \]
and $\sum_{q=1}^m |Q_q^+| = \sum_{q=1}^m |Q_q^-| = n$, then the number of trees $T$ anchored on $Q = \{Q_1, \ldots, Q_m\}$ is bounded by
\[ \sum_{T \text{ on } \{x_i^{(q)}\}} 1 \le m! \, C^n \tag{A.3.59} \]
for some constant $C$.

Proof. The proof goes through the following steps.

(1) First suppose that each set $Q_q$ is a point: we shall see at the end what happens if the sets contain several points. We can write
\[ \sum_{T \text{ on } \{x_i^{(q)}\}} 1 = \sum_{\{d_q\}} \; \sum_{\substack{T \\ \text{fixed } \{d_q\}}} 1, \tag{A.3.60} \]
where in the right-hand side the first sum is over all the possible configurations $\{d_q\}$, if we denote by $d_q$ the number of lines emerging from (i.e. entering or exiting from) $Q_q \equiv q$, while the second sum is over all the trees compatible with a fixed configuration $\{d_q\}$.

(2) The second sum in the right-hand side of (A.3.60) can be exactly computed and it gives
\[ \sum_{\substack{T \\ \text{fixed } \{d_q\}}} 1 = \frac{(m-2)!}{(d_1 - 1)! \cdots (d_m - 1)!}. \tag{A.3.61} \]
In fact, by definition of $T$, there are at least 2 points (which we can call 1 and $m$) such that there is only one line emerging from them: then $d_1 = d_m = 1$. The line emerging from 1 can reach one of the other $m - 2$ points: we call 2 the point it reaches. Then there are $d_2 - 1$ lines emerging from 2 leading the first one to one of the other $m - 3$ points, the second one to one of the other $m - 4$ points, ..., the $(d_2 - 1)$th one to one of the other $m - d_2 - 1$ points; moreover, if we permute between themselves the $d_2 - 1$ lines there is no change in the above discussion. Therefore, so far we have obtained
\[ \frac{(m-2)}{(d_1 - 1)!} \, \frac{(m-3)(m-4) \cdots (m - d_2 - 1)}{(d_2 - 1)!} \tag{A.3.62} \]
possible contributions. By iterating until the $m$th point is reached we find (A.3.61).

(3) The first sum in (A.3.60) can be bounded by
\[ \sum_{\{d_q\}} 1 \le C^m, \tag{A.3.63} \]
where one can choose $C = 2$. In fact one has two constraints, $\sum_{q=1}^m d_q = 2(m-1)$ and $1 \le d_q \le m - 1$ $\forall q = 1, \ldots, m$, as the tree $T$ has $m - 1$ lines, each line emerges from two points and each point is connected with no less than 1 point and no more than all the others. Then, if we set $M = 2(m-1)$ and ignore, for simplicity, the second constraint on $\{d_q\}$, we have
\begin{align*}
\sum_{\{d_q\}} 1 &\le \int_0^M dx_1 \int_0^{M - x_1} dx_2 \cdots \int_0^{M - \sum_{q=1}^{m-1} x_q} dx_m \\
&\le \int_0^M dx_1 \cdots \int_0^{M - \sum_{q=1}^{m-2} x_q} dx_{m-1} \left( M - \sum_{q=1}^{m-1} x_q \right) \\
&\le \int_0^M dx_1 \cdots \int_0^{M - \sum_{q=1}^{m-3} x_q} dx_{m-2} \, \frac{1}{2!} \left( M - \sum_{q=1}^{m-2} x_q \right)^2 \\
&\le \int_0^M dx_1 \cdots \int_0^{M - \sum_{q=1}^{m-4} x_q} dx_{m-3} \, \frac{1}{3!} \left( M - \sum_{q=1}^{m-3} x_q \right)^3 \\
&\le \cdots \le \frac{1}{m!} M^m = \frac{1}{m!} [2(m-1)]^m \le 2^m \frac{m^m}{m!} \tag{A.3.64}
\end{align*}
and as $e^{-m} \le m^m/m! \le 1$, then (A.3.63) immediately follows with $C = 2$.

(4) As $1/(d_q - 1)! \le 1$, by using (A.3.61) and (A.3.63), we see that (A.3.58) follows with $C = 2$.

(5) Now we take into account that, for each $q = 1, \ldots, m$, $Q_q$ is a collection of points. Then (A.3.61) has to be replaced with
\[ \sum_{T \text{ on } \{x_i^{(q)}\}} 1 = \sum_{\{d_q\}} \; \sum_{\substack{\text{anchored} \\ \text{fixed } \{d_q\}}} 1. \tag{A.3.65} \]
Fixed $T$ on $Q$, the number of anchored trees is
\[ \prod_{q=1}^m \frac{|Q_q|!}{(|Q_q| - d_q)!}, \tag{A.3.66} \]
as we have to consider the |Qq |! permutations of the |Qq | elements of the set Qq and divide by the (|Qq | − dq )! permutations of the elements of Qq which no line emerges from. So, by using 2 dq 2(m−1) 6 4m , we obtain that [(dq − 1)!]−1 6 [dq !]−1 2dq and m q=1 2 = 2 2(n+m) 1 6 C˜ (m − 2)! ; (A.3.67) anchored 6xed {dq }
where one can take C˜ = 22 . (6) From the previous bounds one has 1 6 m!C n ; {dq } anchored 6xed {dq }
where one can take C = 25 , if 2n =
m
q=1
(A.3.68) |Qq |. Then the proof of the lemma is complete.
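The counting in steps (1) to (3) can be checked by brute force for small $m$. The sketch below is not the argument used in the proof: it enumerates labelled trees through Prüfer sequences (a classical bijection in which a point of degree $d$ appears $d - 1$ times), groups them by degree configuration $\{d_q\}$, and compares each group with the right-hand side of (A.3.61). The function names are illustrative choices, not notation from the text.

```python
from collections import Counter
from itertools import product
from math import comb, factorial


def degree_sequence(prufer, m):
    """Degrees of the labelled tree on m points encoded by a Pruefer sequence."""
    counts = Counter(prufer)
    # A point appears (degree - 1) times in the Pruefer sequence.
    return tuple(counts[q] + 1 for q in range(m))


def trees_by_degree(m):
    """Count labelled trees on m points, grouped by degree configuration {d_q}."""
    tally = Counter()
    for prufer in product(range(m), repeat=m - 2):
        tally[degree_sequence(prufer, m)] += 1
    return tally


def multinomial_count(degrees):
    """Right-hand side of (A.3.61): (m-2)! / prod_q (d_q - 1)!."""
    m = len(degrees)
    denom = 1
    for d in degrees:
        denom *= factorial(d - 1)
    return factorial(m - 2) // denom


def count_degree_configurations(m):
    """Number of {d_q} with d_q >= 1 and sum_q d_q = 2(m-1) (stars and bars)."""
    return comb(2 * m - 3, m - 1)
```

For instance, `trees_by_degree(5)` groups the $5^3 = 125$ labelled trees on five points into `count_degree_configurations(5)` classes, each of size `multinomial_count(d)`. Since $\binom{2m-3}{m-1} \le 2^{2m-3}$, the number of configurations is at most $4^m$, so a bound of the form $C^m$ in (A.3.63) certainly holds with $C = 4$.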
Lemma A.6. If in (A:3:57) one has |g(x−y)| 6 C0 for some constant C0 ; then the term (A:3:56) is bounded by PS ∗ (d ) PT (d t)e−V (t) ≡ |det G T | 6 (C0 C)n−m+1 (A.3.69) for some constant C. Proof. As the entries of the matrix G T are given by (A.3.57), we try to write tn ( jj ) : : : tn( jj )−1 g(x(j; i) − x(j ; i )) = (uj ⊗ A(x(j; i) − ·); uj ⊗ B(x(j ; i ) − ·)) ≡ (f2 ; g+ ) ; (A.3.70) where (·; ·) denotes the inner product, i.e.
(uj ⊗ A(x(j; i) − ·); uj ⊗ B(x(j ; i ) − ·)) = uj · uj
d yA(x(j; i) − y)B(x(j ; i ) − y)
(A.3.71)
and the vectors $f_\alpha$ and $g_\beta$, with $\alpha, \beta = 1, \ldots, n$, are implicitly defined by (A.3.70). The reason why we rewrite (A.3.57) as in (A.3.70) is that then we can apply the Gram–Hadamard inequality [90], in order to bound the determinant of the matrix with entries
\[ M_{\alpha,\beta} = (f_\alpha, g_\beta) \tag{A.3.72} \]
as
\[ |\det M| \le \prod_{\alpha=1}^n \|f_\alpha\| \, \|g_\alpha\|, \tag{A.3.73} \]
so that, if
\[ \max_{\alpha = 1, \ldots, m} \{ \|f_\alpha\| \} \le C_0^a, \qquad \max_{\alpha = 1, \ldots, m} \{ \|g_\alpha\| \} \le C_0^b, \qquad a + b = 1, \tag{A.3.74} \]
then (A.3.69) follows. The bound (A.3.73) is a standard result: a proof is given in Section A.3.4 just for completeness. So we are left with verifying that (A.3.70) is possible and that the bounds (A.3.74) hold.

We can define a family of vectors in $\mathbb{R}^m$ inductively as
\[ u_1 = v_1, \qquad u_j = t_{j-1} u_{j-1} + v_j \sqrt{1 - t_{j-1}^2}, \qquad j = 2, \ldots, m, \tag{A.3.75} \]
where $\{v_i\}_{i=1}^m$ is an orthonormal basis and the sets $X_k$ have been relabelled so that $X_1 = \{1\}$, $X_2 = \{1, 2\}$, ..., $X_m = \{1, 2, \ldots, m\}$, hence
\[ t_{n'(jj')} \cdots t_{n(jj')} = t_j \cdots t_{j'-1} \tag{A.3.76} \]
for a line $(jj')$.
By definitions (A.3.75) one has
\[ u_j \cdot u_{j'} = t_j \cdots t_{j'-1}. \tag{A.3.77} \]
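Identity (A.3.77) is easy to test numerically. The following sketch builds the vectors of (A.3.75) taking for $\{v_j\}$ the canonical basis of $\mathbb{R}^m$ (an assumption made only for the illustration; any orthonormal basis works) and compares the scalar products with the products $t_j \cdots t_{j'-1}$. The helper names are hypothetical.

```python
import math
import random


def build_vectors(t):
    """Build u_1..u_m in R^m from t = [t_1, ..., t_{m-1}] following (A.3.75).

    The orthonormal basis v_j is taken to be the canonical basis.
    """
    m = len(t) + 1
    u = [[0.0] * m for _ in range(m)]
    u[0][0] = 1.0  # u_1 = v_1
    for j in range(1, m):
        tj = t[j - 1]
        # u_{j+1} = t_j u_j + sqrt(1 - t_j^2) v_{j+1} (0-based index j)
        u[j] = [tj * c for c in u[j - 1]]
        u[j][j] += math.sqrt(1.0 - tj * tj)
    return u


def dot(a, b):
    """Euclidean scalar product."""
    return sum(x * y for x, y in zip(a, b))


def interpolation_product(t, j, jp):
    """t_j ... t_{j'-1} for 1 <= j <= j' (1-based); the empty product is 1."""
    out = 1.0
    for k in range(j, jp):
        out *= t[k - 1]
    return out
```

With `t` drawn in $[0,1]$, `dot(u[j-1], u[jp-1])` agrees with `interpolation_product(t, j, jp)` up to rounding, and every `u[j]` has unit norm, which is what makes the norms in (A.3.74) independent of the interpolation parameters.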
Therefore, if we define the vectors $A(x(j,i) - y)$ and $B(x(j',i') - y)$ so that
\[ g(x(j,i) - x(j',i')) = \left( A(x(j,i) - \cdot), B(x(j',i') - \cdot) \right) = \int dy \, A(x(j,i) - y) \, B(x(j',i') - y) \tag{A.3.78} \]
and, simultaneously,
\[ \left( A(x(j,i) - \cdot), A(x(j',i') - \cdot) \right) < \infty, \qquad \left( B(x(j,i) - \cdot), B(x(j',i') - \cdot) \right) < \infty, \tag{A.3.79} \]
we can apply (A.3.70) and (A.3.74). Then the proof of the lemma is complete. How to de6ne the vectors A(x(j; i) − y) and B(x(j ; i ) − y) depends on the problem one has to study. For instance for propagators g(x) = g!(h) (x) such that 1 −ik·x (h) fh (k ) e gˆ! (k); gˆ(h) (k) = g!(h) (x) = ; (A.3.80) ! L+ −ik0 + E(k) k∈DL;+
a possible de6nition for such vectors, in terms of their Fourier transforms, is $ $ fh (k ) ˆ ˆ ; B(k) = − fh (k )(ik0 + E(k)) ; A(k) = 2 k0 + E 2 (k)
(A.3.81)
so that $|g(x - y)| \le C_0$, with $C_0 = C_N \gamma^h$ (see (5.27)).

A.3.4. Gram–Hadamard inequality

Let $x_1, \ldots, x_m$ be $m$ vectors of a Euclidean space $E$ of finite dimension $n$. We define the Gram determinant as
\[ \Gamma(x_1, \ldots, x_m) \equiv \det \Gamma = \det \begin{pmatrix} (x_1, x_1) & \ldots & (x_1, x_m) \\ \ldots & \ldots & \ldots \\ (x_m, x_1) & \ldots & (x_m, x_m) \end{pmatrix}, \tag{A.3.82} \]
where $(\cdot, \cdot)$ denotes the inner product in $E$. The following results hold.

Lemma A.7. Given a Euclidean space $E$ and $m$ vectors $x_1, \ldots, x_m$ in $E$, the Gram determinant (A.3.82) satisfies
\[ \Gamma(x_1, \ldots, x_m) = 0, \tag{A.3.83} \]
if and only if the vectors $x_1, \ldots, x_m$ are linearly dependent. If the vectors $x_1, \ldots, x_m$ are linearly independent then one has
\[ \Gamma(x_1, \ldots, x_m) > 0. \tag{A.3.84} \]
Proof. If the vectors $x_1, \ldots, x_m$ are linearly dependent then there exist $m$ coefficients $c_1, \ldots, c_m$, not all vanishing, such that the vector $\sum_{j=1}^m c_j x_j$ is vanishing. By considering its inner product with the vectors $x_1, \ldots, x_m$, we obtain the system
\[ \begin{cases} \bar c_1 (x_1, x_1) + \cdots + \bar c_m (x_1, x_m) = 0 \\ \qquad \cdots \\ \bar c_1 (x_m, x_1) + \cdots + \bar c_m (x_m, x_m) = 0 \end{cases} \tag{A.3.85} \]
which is a homogeneous system admitting a nontrivial solution $\bar c_1, \ldots, \bar c_m$: therefore, the determinant of the matrix of the coefficients is zero, thus implying (A.3.83). Vice versa, if (A.3.83) holds, system (A.3.85) admits a nontrivial solution $\bar c_1, \ldots, \bar c_m$. If we multiply the $m$ equations defining the system by $c_1, \ldots, c_m$, respectively, and then sum them, we obtain
\[ \| c_1 x_1 + \cdots + c_m x_m \| = 0, \tag{A.3.86} \]
where $\| \cdot \|$ is the norm induced by the inner product $(\cdot, \cdot)$. Therefore, the vector $\sum_{j=1}^m c_j x_j$ has to be identically vanishing: as the coefficients $c_1, \ldots, c_m$ are not all vanishing, the vectors $x_1, \ldots, x_m$ have to be linearly dependent.

To prove (A.3.84) consider a subset $S \subset E$, and set, for any $x \in E$, $x = x_S + x_N$, where $x_S \in S$ and $x_N$ belongs to the orthogonal complement to $S$. We can write $x_N$ as $x_N = x_1 + \cdots + x_p$, with $p = n - \dim(S)$; then we have that the vectors
\[ \det \begin{pmatrix} (x_1, x_1) & \ldots & (x_1, x_p) & x_1 \\ \ldots & \ldots & \ldots & \ldots \\ (x_p, x_1) & \ldots & (x_p, x_p) & x_p \\ (x, x_1) & \ldots & (x, x_p) & x_N \end{pmatrix} \tag{A.3.87} \]
are identically vanishing. In particular, it follows that
\[ x_N = - \frac{1}{\det \Gamma} \det \begin{pmatrix} & & & x_1 \\ & \Gamma & & \ldots \\ & & & x_p \\ (x, x_1) & \ldots & (x, x_p) & 0 \end{pmatrix}, \qquad \Gamma = \Gamma(x_1, \ldots, x_p) \tag{A.3.88} \]
and, analogously,
\[ x_S \equiv x - x_N = \frac{1}{\det \Gamma} \det \begin{pmatrix} & & & x_1 \\ & \Gamma & & \ldots \\ & & & x_p \\ (x, x_1) & \ldots & (x, x_p) & x \end{pmatrix}, \tag{A.3.89} \]
so that
\[ 0 \le h^2 \equiv (x_N, x) = \frac{1}{\det \Gamma} \det \begin{pmatrix} & & & (x_1, x) \\ & \Gamma & & \ldots \\ & & & (x_p, x) \\ (x, x_1) & \ldots & (x, x_p) & (x, x) \end{pmatrix} = \frac{\Gamma(x_1, \ldots, x_p, x)}{\Gamma(x_1, \ldots, x_p)}. \tag{A.3.90} \]
By setting $x \equiv x_{p+1}$ and $h^2 = h_p^2$, we can write (A.3.90) as
\[ \frac{\Gamma(x_1, \ldots, x_p, x_{p+1})}{\Gamma(x_1, \ldots, x_p)} = h_p^2 \ge 0, \tag{A.3.91} \]
where $x_1, \ldots, x_p$ are $p$ linearly independent vectors and $x_{p+1}$ is arbitrary. The sign = in (A.3.91) can hold if and only if $x_{p+1}$ is a linear combination of the vectors $x_1, \ldots, x_p$, so that if $x_1, \ldots, x_p, x_{p+1}$ are linearly independent, then (A.3.91) holds with the strict sign, i.e.
\[ \frac{\Gamma(x_1, \ldots, x_p, x_{p+1})}{\Gamma(x_1, \ldots, x_p)} = h_p^2 > 0. \tag{A.3.92} \]
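As $\Gamma(x_1) = (x_1, x_1) = \|x_1\|^2 > 0$ for $x_1 \ne 0$, (A.3.92) implies (A.3.84).

Lemma A.8 (Hadamard inequality). The Gram determinant (A.3.82) satisfies the inequality
\[ \Gamma(x_1, \ldots, x_m) \le \Gamma(x_1) \cdots \Gamma(x_m), \tag{A.3.93} \]
where the sign = holds if and only if the vectors are orthogonal to each other.

Proof. By (A.3.92) and by using the fact that $(x_N, x_N) \le (x, x) = \Gamma(x)$, we have
\[ \Gamma(x_1, \ldots, x_m, x) \le \Gamma(x_1, \ldots, x_m) \, \Gamma(x) \tag{A.3.94} \]
for any vectors $x_1, \ldots, x_m, x \in E$. By iterating and recalling the above arguments (A.3.93) follows.

Let $x_1, \ldots, x_m$ be $m$ linearly independent vectors in $E$, with $m = n$ if $n = \dim(E)$. Let $\{e_j\}_{j=1}^m$ be an orthonormal basis in $E$: set $x_{jk} = (e_j, x_k)$, so that $x_k = \sum_{j=1}^m x_{jk} e_j$, $k = 1, \ldots, m$. Then
\begin{align*}
\Gamma(x_1, \ldots, x_m) &= \det \begin{pmatrix} (x_1, x_1) & \ldots & (x_1, x_m) \\ \ldots & \ldots & \ldots \\ (x_m, x_1) & \ldots & (x_m, x_m) \end{pmatrix} \\
&= \det \begin{pmatrix} \sum_{i_1, j_1} \bar x_{i_1 1} x_{j_1 1} (e_{i_1}, e_{j_1}) & \ldots & \sum_{i_1, j_m} \bar x_{i_1 1} x_{j_m m} (e_{i_1}, e_{j_m}) \\ \ldots & \ldots & \ldots \\ \sum_{i_m, j_1} \bar x_{i_m m} x_{j_1 1} (e_{i_m}, e_{j_1}) & \ldots & \sum_{i_m, j_m} \bar x_{i_m m} x_{j_m m} (e_{i_m}, e_{j_m}) \end{pmatrix} \\
&= \det \begin{pmatrix} \sum_{i_1} \bar x_{i_1 1} x_{i_1 1} & \ldots & \sum_{i_1} \bar x_{i_1 1} x_{i_1 m} \\ \ldots & \ldots & \ldots \\ \sum_{i_m} \bar x_{i_m m} x_{i_m 1} & \ldots & \sum_{i_m} \bar x_{i_m m} x_{i_m m} \end{pmatrix} \\
&= \det \left[ \begin{pmatrix} x_{11} & \ldots & x_{m1} \\ \ldots & \ldots & \ldots \\ x_{1m} & \ldots & x_{mm} \end{pmatrix} \begin{pmatrix} \bar x_{11} & \ldots & \bar x_{1m} \\ \ldots & \ldots & \ldots \\ \bar x_{m1} & \ldots & \bar x_{mm} \end{pmatrix} \right] \\
&= \det X^T \bar X = \det X^T \det \bar X = |\det X|^2, \tag{A.3.95}
\end{align*}
where the matrix $X$ is defined as
\[ X = \begin{pmatrix} x_{11} & x_{12} & \ldots & x_{1m} \\ x_{21} & x_{22} & \ldots & x_{2m} \\ \ldots & \ldots & \ldots & \ldots \\ x_{m1} & x_{m2} & \ldots & x_{mm} \end{pmatrix}. \tag{A.3.96} \]
This yields that the Gram determinant (A.3.93) can be written as
\[ \Gamma(x_1, \ldots, x_m) = |\det X|^2, \tag{A.3.97} \]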
so that from the lemma above the following result follows immediately.

Lemma A.9. Given $m$ linearly independent vectors of a Euclidean space $E$, and defining the matrix $X$ through (A.3.96), one has
\[ |\det X|^2 = |\det(e_i, x_j)|^2 \le \prod_{j=1}^m \|x_j\|^2, \tag{A.3.98} \]
where $(e_i, x_j)$ stands for the matrix with entries $x_{ij} = (e_i, x_j)$.

The lemma above is simply a reformulation of the preceding lemma: it implies the following inequality.

Theorem A.1 (Gram–Hadamard inequality). Let $\{f_j\}_{j=1}^m$ and $\{g_j\}_{j=1}^m$ be two families of $m$ linearly independent vectors in a Euclidean space $E$, and let $(\cdot, \cdot)$ be an inner product in $E$ and $\| \cdot \|$ the norm induced by that inner product. Then
\[ |\det(f_i, g_j)| \le \prod_{j=1}^m \|f_j\| \, \|g_j\|, \tag{A.3.99} \]
where $(f_i, g_j)$ stands for the $m \times m$ matrix with entries $(f_i, g_j)$.
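As a sanity check of (A.3.99), one can compare both sides on random real vectors. The sketch below is only a numerical illustration, not part of the proof; it uses a small pure-Python determinant (Gaussian elimination with partial pivoting) so that no external library is assumed, and all function names are illustrative.

```python
import math
import random


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    sign = 1.0
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return sign * result


def inner(x, y):
    """Real Euclidean inner product."""
    return sum(p * q for p, q in zip(x, y))


def norm(x):
    """Norm induced by the inner product."""
    return math.sqrt(inner(x, x))


def gram_hadamard_sides(f, g):
    """Return (|det (f_i, g_j)|, prod_j ||f_j|| ||g_j||) for two families of m vectors."""
    m = len(f)
    matrix = [[inner(f[i], g[j]) for j in range(m)] for i in range(m)]
    lhs = abs(det(matrix))
    rhs = 1.0
    for fj, gj in zip(f, g):
        rhs *= norm(fj) * norm(gj)
    return lhs, rhs
```

Calling `gram_hadamard_sides(f, g)` on two lists of vectors returns the two sides of (A.3.99); the first never exceeds the second, with equality for instance when both families coincide with an orthonormal basis. The point of the inequality in the text is that the right-hand side grows only like a power of $m$, with no factorial.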
Proof. If $\{g_j\}_{j=1}^m$ is an orthogonal basis in $E$ (so that $\{e_j\}_{j=1}^m$, with $e_j = \|g_j\|^{-1} g_j$, is an orthonormal basis) then (A.3.98) gives
\[ |\det(g_i, x_j)| = |\det(e_i, x_j)| \prod_{j=1}^m \|g_j\| \le \prod_{j=1}^m \|g_j\| \, \|x_j\|. \tag{A.3.100} \]
Now consider the case in which the only condition on the vectors $\{g_j\}_{j=1}^m$ is that they are linearly independent. Set $\tilde g_j = \|g_j\|^{-1} g_j$, so that $\|\tilde g_j\| = 1$, and define inductively the family of vectors
\[ \tilde e_1 = \tilde g_1, \qquad \tilde e_2 = \frac{\tilde g_2 - (\tilde g_2, \tilde g_1) \tilde g_1}{\sqrt{1 - (\tilde g_2, \tilde g_1)^2}} \tag{A.3.101} \]
and so on, in such a way that one has $(\tilde e_i, \tilde e_j) = \delta_{i,j}$. The basis $\{e_1, \ldots, e_m\}$, with $e_j = \tilde e_j$ $\forall j = 1, \ldots, m$, is by construction an orthonormal basis. If $c_2 = \sqrt{1 - (\tilde g_2, \tilde g_1)^2}$, with $0 \le c_2 \le 1$, one has
\[ \tilde g_2 = c_2 \tilde e_2 + (\tilde g_2, \tilde g_1) \tilde g_1, \tag{A.3.102} \]
i.e. $\tilde g_2 \sim c_2 \tilde e_2$, if by $\sim$ we mean that, by computing $\det(\tilde g_i, f_j)$, no difference is made by the fact that one has the vector $\tilde g_2$ instead of $c_2 \tilde e_2$; in fact, the contributions arising from the remaining part in (A.3.102) sum up to zero. We can reason analogously for the terms with $j = 3, \ldots, m$, and we find $\tilde g_j \sim c_j \tilde e_j$, where $\sim$ is meant as above and the coefficients $c_j$ are such that $0 \le c_j \le 1$ $\forall j = 1, \ldots, m$. In conclusion,
\begin{align*}
|\det(g_i, f_j)| &= |\det(\tilde g_i, f_j)| \prod_{j=1}^m \|g_j\| = |\det(e_i, f_j)| \prod_{j=1}^m c_j \|g_j\| \\
&\le \prod_{j=1}^m c_j \|g_j\| \, \|f_j\| \le \prod_{j=1}^m \|g_j\| \, \|f_j\|, \tag{A.3.103}
\end{align*}
so that (A.3.99) follows. A.4. Dimensional bounds A.4.1. Proof of (5.27) We call CN any constant depending on N; and C; any constant independent of N and denote by 9k = (9k ; 9k0 ) the discrete derivative (see Section A.2). One has ) f (k h (h) −ik·x |g! (x)| = e −ik0 + E(k) k∈DL;+ 6
k∈DL;+
fh (k ) 6 CA−h fh (k ) 6 CA−h A2h 6 CAh : | − ik0 + E(k)| k∈DL;+
(A.4.1)
In the same way one has fh (k ) h N (h) h N −ik·x |(A |x|) g! (x)| = [(A 9k ) e ] −ik0 + E(k) k∈DL;+ fh (k ) = e−ik·x (Ah 9k )N −ik0 + E(k) k∈DL;+ fh (k ) h N Nh −(N +1)h 2h A 6 CN Ah ; 6 (A 9k ) −ik0 + E(k) 6 CN A A
(A.4.2)
k∈DL;+
so that, by using the two bounds together, one obtains (h) |1 + (Ah |x|)N g! (x)| 6 CN Ah ;
(A.4.3)
thus implying (5.27). A.4.2. Proof of (5.28) for Feynman diagrams Consider a Feynman diagram < and call E the tree associated with it. The diagram < consists of n+n4 points (here n4 is the number of endpoints v ∈ Vf (E) with iv = 4). After integrating over n4 variables by using the potentials v(x − y), we are left with n integrations. As the diagram has to be connected, for any cluster Gv containing sv subclusters Gv1 ; : : : ; Gvsv one has sv − 1 lines on scale hv assuring the connection between the subclusters; such lines form an anchored tree Tv for the cluster Gv . Of course, the union of the anchored trees Tv corresponding to all the clusters Gv , v ∈ V (E), is a tree T for the Feynman diagram <. So we can perform a change of coordinates and integrate over the variables x‘ − y‘ , where x‘ ; y‘ are the extremes of the lines ‘ such that ‘ ∈ Tv for some v ∈ V (E), i.e. they are the points on which the line ‘ is incident. For each line ‘ ∈ Tv we obtain a factor A−2hv , by the compact support properties of the propagators g‘ . In fact, by using (5.27), one obtains +=2 (h) d x0 |g! (x)| 6 CA−2h Ah ; (A.4.4) −+=2
x∈
where the factor Ah is taken into account by the factor 0
Ahv nv ;
(A.4.5)
(see (5.29)). As for any v there are sv − 1 lines on scale hv , then (5.28) follows. A.4.3. Proof of (6.16) for trees By using (6.13) one immediately sees that one has to integrate d x(Iv0 ) g‘ ; v∈V (E) Tv
‘∈Tv
(A.4.6)
where, for each $v \in V(\tau)$, $T_v$ is an anchored tree between the $s_v$ maximal subclusters contained inside $G_v$. For each line $\ell$ representing a propagator $g_\ell$ let us call $x_\ell$ and $y_\ell$ its extremes; analogously, we can define as $x_\ell$ and $y_\ell$ the extremes of an ondulated line representing the potential $v(x - y)$, if $x_\ell = x$ and $y_\ell = y$. The overall number of lines is $\sum_{v \in V(\tau)} (s_v - 1) = n - 1$, by (6.17), while the overall number of integration variables is $n + n_4$. As the union of all trees $T_v$ and of all ondulated lines representing the two-body interaction potentials assures the connectedness of the clusters $G_v$, we can make a change of coordinates and integrate over (1) the $n - 1$ differences $x_\ell - y_\ell$, such that $g_\ell \in T_v$ for some $v \in V(\tau)$, (2) over the $n_4$ differences $x_\ell - y_\ell$, such that $v(x_\ell - y_\ell)$ represents an ondulated line, and (3) over a fixed variable, say $x_1$. The last integrations give simply a constant to the power $n_4 \le 4$, while the first one gives a factor $\gamma^{-2h_v}$ for each line $g_\ell$ on scale $h_\ell = h_v$ (see the analogous discussion in Section A.4.2). The integration over $x_1$ gives a factor $(L\beta)$.

A.5. Diophantine numbers

A.5.1. Definitions

Given a vector $\boldsymbol{\omega} \in \mathbb{R}^n$, we say that $\boldsymbol{\omega}$ is a Diophantine vector if
\[ |\boldsymbol{\omega} \cdot \mathbf{n}| \ge C_0 |\mathbf{n}|^{-\tau}, \qquad \forall \mathbf{n} \in \mathbb{Z}^n \setminus \{0\}, \tag{A.5.1} \]
where $|\mathbf{n}| = |n_1| + \cdots + |n_n|$ and $C_0, \tau$ are suitable positive constants, which are called Diophantine constants. If $n = 2$ and $\boldsymbol{\omega} = (2\pi, \omega)$ the above inequality can be rewritten as
\[ \| n \omega \|_T = \inf_{p \in \mathbb{Z}} |n \omega - 2 p \pi| \ge C_0 |n|^{-\tau}, \qquad \forall n \in \mathbb{Z} \setminus \{0\}, \tag{A.5.2} \]
by renaming the constant $C_0$. In fact (A.5.1) for $n = 2$ would give $|n_1 \omega + 2 n_2 \pi| \ge C_0 (|n_1| + |n_2|)^{-\tau}$. Of course, $|n_1 \omega + 2 n_2 \pi|$ can be small only if, say, $|n_1 \omega + 2 n_2 \pi| < 1/2$, i.e. if $n_2$ is such that $2 n_2 \pi$ differs from $n_1 \omega$ by less than $1/2$; therefore, the infimum in the left-hand side of (A.5.2) is attained for $n_2$ such that $a_1 n_1 \le |n_2| \le a_2 n_1$, for some constants $a_1$ and $a_2$. So, by redefining the constant $C_0$, the inequality in (A.5.2) follows. If we write $\boldsymbol{\omega} = (2\pi, \omega)$ we call $\omega$ the rotation number.

A.5.2. Properties

The Diophantine vectors are of great interest as they are of full measure in $\mathbb{R}^n$, provided that $\tau > n - 1$ in (A.5.1); see for instance [67]. Most likely the results in Sections 8 and 13 could be obtained by relaxing the hypothesis on the rotation number, e.g. by imposing the weaker Bryuno condition: also in KAM theory the persistence of invariant tori (for flows) and invariant curves (for diffeomorphisms) has been proven under such a condition; see [91–93].
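To get a feeling for condition (A.5.2), one can evaluate $|n|^\tau \|n\omega\|_T$ over a finite range of $n$, reading $\|n\omega\|_T$ as the distance of $n\omega$ from the nearest integer multiple of $2\pi$. The sketch below does this for a rotation number proportional to the golden mean, which is an illustrative choice not taken from the text; a finite scan of course proves nothing about all $n$, it only shows how the quantity behaves.

```python
import math


def torus_norm(x):
    """Distance from x to the nearest integer multiple of 2*pi."""
    r = abs(math.fmod(x, 2.0 * math.pi))
    return min(r, 2.0 * math.pi - r)


def diophantine_margin(omega, tau, n_max):
    """Smallest value of n^tau * ||n omega||_T over 1 <= n <= n_max.

    Negative n give the same values, so only positive n are scanned.
    """
    return min(n ** tau * torus_norm(n * omega) for n in range(1, n_max + 1))


# Illustrative rotation number: 2*pi times the golden mean (sqrt(5) - 1) / 2.
GOLDEN_OMEGA = 2.0 * math.pi * (math.sqrt(5.0) - 1.0) / 2.0
```

With $\tau = 1$, `diophantine_margin(GOLDEN_OMEGA, 1.0, 5000)` stays bounded away from zero over the scanned range, consistent with a bound of the form (A.5.2), whereas for a rational multiple of $2\pi$ such as $2\pi/3$ the margin collapses to (numerically) zero at $n = 3$.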
Anyway, we note that the fact that the Diophantine vectors are of full measure makes such an extension to more general vectors of secondary importance, unless some explicit questions are asked (such as the dependence on the rotation vector of the domain of convergence for the perturbative series, or the optimal condition on the rotation vector, and so on).

A.6. Some technical results

A.6.1. Proof of (5.39) and (8.96)

We want to show that, for any constant $\alpha > 0$,
\[ S(P_{v_0}, \tau, n) = \sum_{\tau \in T_{h,n}} \sum_{\{P_v\}} \prod_{v \in V_f(\tau)} \gamma^{-2\alpha |P_v|} \le C^n, \tag{A.6.1} \]
and
v∈Vf (E)
=
{Pv }
v∈V (E) pv
v∈Vf (E)
v∈Vf (E)
(A.6.2)
v∈Vf (E)
@ (constraint on {pv }) ;
(A.6.3)
Pv |Pv |=pv
if the @ denotes the constraint 1 6 pv 6
sv
qvj ;
(A.6.4)
j=1
where qvj = |Qvj | and v1 ; : : : ; vsv are the vertices immediately following v along the tree E (we use the notations (5.25)). If we neglect the constraint @ and remove also the constraint that Pv0 is 6xed (i.e. we sum over all the possible Pv0 ), we can bound p + · · · + p v v 1 sv A−2|Pv | 6 A−2pv ; (A.6.5) p v v∈V (E) v∈V (E) pv {Pv }
where we used that
v∈V (E)
Pv |Pv |=pv
@(constraint on {pv }) 6
v∈V (E)
Pv |Pv |=pv
16
pv1 + · · · + pvsv pv
:
(A.6.6)
We can write (A.6.5) as pv1 + · · · + pvsv −2pv A = Iv ; p v v∈V (E) pv v∈V (E)
(A.6.7)
which de6nes the factors Iv . In particular, we have pv01 + · · · + pv0sv0 −2pv0 Iv0 = A pv0 pv 0
−2 pv01 +···+pv0sv0
= (1 + A )
=
sv 0
(1 + A−2 )pv0j ;
(A.6.8)
j=1
where v01 ; : : : ; v0sv0 are the vertices immediately following v0 , so that sv 0 pv1 + · · · + pvsv −2pv A = Iv (1 + A−2 )pv0j : pv j=1 v∈V (E) pv v∈V (E)
(A.6.9)
vv0
If we iterate the procedure we obtain s
sv 0 sv 0 vj p Iv0 Iv0j = (1 + A−2 (1 + A−2 )) vjj ; j=1
(A.6.10)
j=1 j =1
where vj1 ; : : : ; vjsv are the vertices immediately following v0j ; and so on until we reach all the endpoints of the tree E. If we denote by P a path (i.e. an oriented connected set of lines) from the root to an endpoint we 6nd Iv = [(1 + A−2 (1 + A−2 (1 + A−2 (: : :))))]4 ; (A.6.11) v∈V (E)
P
where we used the fact that the endpoints have, at most, four external lines (see (5.19)) and the product is over all the possible paths on E. Then, if we denote by ‘(P) the “length” of the path P, i.e. number of vertices along the path P, we have 4 2 4n ‘(P) A −2k Iv = A 6 2 ≡ C1n ; (A.6.12) A −1 v∈V (E)
P
k=0
where C1 = A42 (A2 − 1)4 . By using the results in Section A.1 one has A−2(hv −hv ) 6 C2n ; E∈Th; n v∈Vf (E)
for some constant C2 , thus implying (A.6.1) with C = C1 C2 , hence (8.96).
(A.6.13)
A.6.2. Proof of (5.23) First note that V is a sum of contributions (see (5.19)) which can be expressed as (61) d x(Iv )W(1) (x(Iv )) ˜ (Iv ) ; (A.6.14) where, for instance, x(Iv ) = {x; y}, (61) (Iv ) = x;+ y;+ y;− x;− , W(1) (x(Iv )) = v(x − y) and +=2 +=2 1 d x(Iv ) = d x d y0 ; (A.6.15) 0 (2S + 1)2 −+=2 −+=2 ; =±S x∈
y∈
for iv = 4 in the discrete case. So if hv = 1 one has (see (5.7) with h = 1 denoting the ultraviolet scale) (61) (61) 1 T (1) (1) ˜ ˜ d x(Iv1 )W (x(Iv1 )) E (Iv1 ); : : : ; d x(Ivn )W (x(Ivn )) (Ivn ) n! 1 (61) (61) 1 = d x(Iv1 ) : : : d x(Ivn )W(1) (x(Iv1 )) : : : W(1) (x(Ivn )) E1T ( ˜ (Iv1 ); : : : ; ˜ (Ivn )) ; n! (A.6.16) which contains an expression like (5.23) with Pvj = Ivj for j = 1; : : : ; n. If hv 6 0, then one has, by the inductive hypothesis (see (5.22) and (5.23)) (6hv +1) (6hv +1) 1 T T E (E (˜ (Pv11 ); : : : ; ˜ (Pv1sv1 )); : : : ; ) ; sv ! hv hv +1
(A.6.17)
where v1 ; : : : ; vsv are the sv vertices following v and vj1 ; : : : ; vjsvj are the svj vertices following vj , j = 1; : : : ; sv . Then, by the de6nitions, (6hv +1)
EhTv +1 ( ˜
=
Qvj1 ⊂Pjv1
(6hv +1) (Pvj1 ); : : : ˜ (Pvjsvj )) ˜ (6hv ) (Qvj1 ) : : : ˜ (6hv ) (Qvjs ) ::: vj Qvjsv ⊂Pvjsv j
×EhTv +1 ( ˜
(hv +1)
j
(hv +1) (Pvj1 \Qvj1 ); : : : ; ˜ (Pvjsvj \Pvjsvj )) ;
(A.6.18)
where EhTv +1 ( ˜
(hv +1)
(hv +1) (Pvj1 \Qvj1 ); : : : ; ˜ (Pvjsvj \Pvjsvj ))
(A.6.19)
gives a constant (i.e. a quantity which does not depend on the 6elds). Then in (A.6.17) one is left with an expression like (6hv ) 1 T ˜ (6hv ) Ehv ( (Pv1 ); : : : ; ˜ (Pvsv )) ; sv !
(A.6.20)
with s
Pvj =
vj
Qvji ;
(A.6.21)
i=1
so that (5.23) is proven.

A.6.3. Proof of (8.91)

Recall the definition of depth of nontrivial vertices given in Section 8. We call $B_D$ the set of $v \in V_f^*(\tau)$ such that the nontrivial vertex immediately preceding $v$ has depth $D$. Given a tree $\tau$ we define the depth of the tree as
\[ D_\tau = \max \{ D_v : v \in V_{nt}(\tau) \} \tag{A.6.22} \]
and set BD =
D
Bp ;
(A.6.23)
p=0
by construction, BD is the collection of all endpoints in Vf∗ (E) contained inside a cluster Gv , for some v with depth Dv = D. We prove by induction on the depth D ∈ [0; DE ] ∩ N the following bound: |’ˆ mv |1=2 v∈Vf∗ (E)∩BD
6
D−1
v∈Vf∗ ∩BD
F01=2
e−|Nv |=2
p+2
p=0 v∈Vf∗ (E)∩Bp
p+1
e−|Nv |=2
;
(A.6.24)
v∈Vf∗ (E)∩BD
where the product in the 6rst parentheses has to be thought of as 1 for D = 0. For D = 0, (A.6.24) is a trivial identity: it is enough to recall that |’ˆ m | 6 F0 e−m (see (8.6)) and that Nv = mv if v ∈ Vf (E) (see (8.17)). Suppose that (A.6.24) holds for some D − 1; then we want to show that it holds also for D. In fact, by using that, for any vertex v ∈ V (E)\Vf (E), one has (see comments after (8.34)) Nv =Nw1 + · · · +Nwsv +2×(number of lines with nondiagonal propagator in Gv) ;
(A.6.25)
where w1 ; : : : ; wsv are the sv nontrivial vertices immediately following v: this simply follows from the de6nition (8.17) and from the fact that if v˜ is a trivial vertex then Nv˜ = Nw , where w is the nontrivial vertex immediately following v. ˜ Then one has |’ˆ mv |1=2 = |’ˆ mv |1=2 |’ˆ mv |1=2 v∈Vf∗ (E)∩BD
6 Cn
v∈Vf∗ (E)∩BD
v∈Vf∗ (E)∩BD
F0 e−|mv |=2
v∈Vf∗ (E)∩BD−1
v∈Vf∗ ∩BD−1
F01=2
×
D−2
e−|Nv |=2
p=0 v∈Vf∗ (E)∩Bp
6 Cn
v∈V ∗ ∩B f
F01=2 D
×
v∈V ∗ (E)∩B
6 Cn
f
p+2
k−1
p=0
v∈V ∗ (E)∩B
v∈Vf∗ ∩BD
F01=2
e−|Nv |=2 D
v∈Vf∗ (E)∩BD−1
f
v∈V ∗ (E)∩B f
k−1
e−|Nv |=2
p+2
p
e−|mv |=2 D
e−|Nv |=2
D+1
D −1
e−|Nv |=2
p+2
p=0 v∈Vf∗ (E)∩Bp
D+1
e−|Nv |=2
;
v∈Vf∗ (E)∩BD
(A.6.26)
thus proving (A.6.24). By taking $k = D_\tau$, (A.6.23) follows.

References

[1] P.W. Anderson, The theory of superconductivity in high-Tc cuprates, Princeton University Press, Princeton, NJ, 1997.
[2] P.W. Anderson, Basic Principles in Solid State Physics, Benjamin/Cummings, Menlo Park, CA, 1985.
[3] D. Mattis, The Many-body Problem: an Encyclopedia of Exactly Soluble Models, World Scientific, Singapore, 1993.
[4] D. Mattis, E. Lieb, Exact solution of a many fermion system and its associated boson field, J. Math. Phys. 6 (1965) 304–312.
[5] E.H. Lieb, F.Y. Wu, Absence of Mott transition in an exact solution of the short-range, one-band model in one-dimension, Phys. Rev. Lett. 20 (1968) 1445–1448.
[6] E.H. Lieb, T. Schultz, D. Mattis, Two soluble models of an antiferromagnetic chain, Ann. Phys. 16 (1961) 407–466.
[7] C.N. Yang, C.P. Yang, One dimensional chain of anisotropic spin-spin interactions. I. Bethe’s hypothesis for ground state in a finite system, Phys. Rev. 150 (1) (1966) 321–327; II. Properties of the ground-state energy per lattice site for an infinite system. Phys. Rev. 150 (1) (1966) 327–339.
[8] R.J. Baxter, One dimensional anisotropic Heisenberg chain, Ann. Phys. 70 (1972) 323–327.
[9] E.C. Titchmarsh, Eigenfunctions Expansions Associated with Second Order Differential Equations, Clarendon, Oxford, 1955.
[10] L. Pastur, A. Figotin, Spectra of Random and Almost Periodic Operators, Springer, Berlin, 1991.
[11] W. Kohn, Analytic properties of Bloch waves and Wannier functions, Phys. Rev. 114 (4) (1959) 809–821.
[12] J. Fröhlich, T. Spencer, Absence of diffusion in the Anderson tight binding model for large disorder and low energy, Comm. Math. Phys. 88 (1983) 151–184.
[13] M. Aizenman, S. Molchanov, Localization at large disorder and at extreme energies: an elementary derivation, Comm. Math. Phys. 157 (2) (1993) 245–278.
[14] E.I. Dinaburg, Ya.G. Sinai, On the one dimensional Schroedinger equation with a quasiperiodic potential, Functional Anal. Appl. 9 (1975) 279–289.
[15] J. Moser, J. Pöschel, An extension of a result by Dinaburg and Sinai on quasi periodic potentials, Comment. Math. Helv. 59 (1984) 39–85.
[16] L.H. Eliasson, Floquet solutions for the one dimensional quasi periodic Schroedinger equation, Comm. Math. Phys. 146 (1992) 447–482.
[17] G. Aizenman, G.M. Graf, Localization bounds for an electron gas, J. Phys. A 31 (32) (1998) 6783–6806.
[18] F. Bonetto, V. Mastropietro, Beta function and anomaly of the Fermi surface for a d = 1 system of interacting fermions in a periodic potential, Comm. Math. Phys. 172 (1995) 57–93.
[19] J. Sólyom, The Fermi gas model of one dimensional conductors, Adv. Phys. 28 (1979) 201–303.
[20] J. Voit, One-dimensional Fermi liquids, Rep. Progr. Phys. 58 (9) (1995) 977–1116.
[21] K.J. Schultz, G. Cuniberti, P. Pieri, Fermi liquids and Luttinger liquids, in: G. Morandi et al. (Eds.), Field Theories for Low-dimensional Condensed Matter Systems, Springer, Berlin, 2000.
[22] W. Metzner, C. Castellani, C. di Castro, Fermi systems with strong forward scattering, Adv. Phys. 47 (3) (1998) 317–445.
[23] A. Luther, I. Peschel, Calculation of critical exponents in two dimensions from quantum field theory in one dimension, Phys. Rev. B 12 (9) (1975) 3908–3917.
[24] H. Frahm, V.E. Korepin, Critical exponents for the one dimensional Hubbard model, Phys. Rev. B 42 (16) (1990) 10553–10565.
[25] G. Benfatto, G. Gallavotti, Perturbation theory of the Fermi surface in a quantum liquid. A general quasiparticle formalism and one dimensional systems, J. Stat. Phys. 59 (3–4) (1990) 541–664.
[26] J. Feldman, M. Trubowitz, Perturbation theory for many fermion systems, Helv. Phys. Acta 63 (1) (1999) 114–149.
[27] K.G. Wilson, The renormalization group and critical phenomena, Nobel Prize Lecture, Rev. Mod. Phys. 47 (1984) 773–840.
[28] G. Benfatto, G. Gallavotti, Renormalization Group, Physics Notes 1, Princeton University Press, Princeton, NJ, 1995.
[29] D. Brydges, A short course on cluster expansions, in: K. Osterwalder, R. Stora (Eds.), Les Houches Summer School on Critical Phenomena, Random Systems, Gauge Theories, North-Holland, Amsterdam, 1984.
[30] K. Gawedzki, A. Kupiainen, Asymptotic freedom beyond perturbation theory, Les Houches Summer School on Critical Phenomena, Random Systems, Gauge Theories (Les Houches 1984), North-Holland, Amsterdam, 1986.
[31] V. Rivasseau, From Perturbative to Constructive Renormalization, Princeton University Press, Princeton, NJ, 1994.
[32] A.A. Abrikosov, L.P. Gorkov, I.Y. Dzyaloshinski, Methods of Quantum Field Theory in Statistical Physics, Dover, New York, 1965.
[33] K. Gawedzki, A. Kupiainen, Gross–Neveu model through convergent perturbative expansions, Comm. Math. Phys. 102 (1) (1986) 1–30.
[34] J. Feldman, J. Magnen, V. Rivasseau, E. Trubowitz, A renormalizable field theory: the massive Gross–Neveu model in two dimensions, Comm. Math. Phys. 103 (1986) 67–103.
[35] A. Lesniewski, Effective action for the Yukawa$_2$ quantum field theory, Comm. Math. Phys. 108 (1987) 437–467.
[36] G. Benfatto, G. Gallavotti, V. Mastropietro, Renormalization group and the Fermi surface in the Luttinger model, Phys. Rev. B 45 (1992) 5468–5480.
[37] G. Gentile, B. Scoppola, Renormalization group and the ultraviolet problem in the Luttinger model, Comm. Math. Phys. 154 (1993) 135–179.
[38] G. Benfatto, G. Gallavotti, A. Procacci, B. Scoppola, Beta functions and Schwinger functions for a many fermions system in one dimension, Comm. Math. Phys. 160 (1994) 93–171.
[39] F. Bonetto, V. Mastropietro, Filled band Fermi systems, Math. Phys. Electron. J. 2 (1996) 1–43 (paper 1).
[40] F. Bonetto, V. Mastropietro, Critical indices in a d = 1 filled band Fermi system, Phys. Rev. B 56 (3) (1997) 1296–1308.
[41] G. Benfatto, G. Gentile, V. Mastropietro, Electrons in a lattice with incommensurate potential, J. Stat. Phys. 89 (3–4) (1997) 655–708.
[42] G. Benfatto, G. Gentile, V. Mastropietro, Peierls instability for the Holstein model with rational density, J. Stat. Phys. 92 (5–6) (1998) 1071–1113.
[43] V. Mastropietro, Small divisors and anomalous behaviour in the Holstein–Hubbard model, Comm. Math. Phys. 201 (1999) 81–115.
[44] V. Mastropietro, A renormalization group computation of the XYZ correlation functions, Lett. Math. Phys. 47 (4) (1999) 339–352.
[45] V. Mastropietro, Anomalous BCS equation for a Luttinger superconductor, Mod. Phys. Lett. 13 (17) (1999) 585–597.
[46] G. Benfatto, V. Mastropietro, Renormalization Group, hidden symmetries and approximate Ward identities in the XYZ model, Rev. Math. Phys., in press.
[47] M. Disertori, V. Rivasseau, Interacting Fermi liquid in two dimensions at finite temperature, I. Convergent attributions, Comm. Math. Phys. 215 (2) (2000) 251–290.
[48] M. Disertori, V. Rivasseau, Interacting Fermi liquid in two dimensions at finite temperature, II. Renormalization, Comm. Math. Phys. 215 (2) (2000) 291–341.
[49] J.W. Negele, H. Orland, Quantum Many-Particle Systems, Addison-Wesley, New York, 1988.
[50] S.B. Sutherland, Two-dimensional hydrogen bonded crystals, J. Math. Phys. 11 (1970) 3183–3186.
[51] C. Bourbonnais, D. Jerome, The normal phase of quasi one dimensional superconductors, in: P. Bernier, S. Lefrant, G. Bidani (Eds.), Advances in Synthetic Metals, Twenty Years of Progress in Science and Technology, Elsevier, New York, 1999.
[52] F. Axel, D. Gratis, Beyond Quasi-crystals, Les Houches, Springer, Berlin, 1984.
[53] P.A. Lee, Sliding charge density waves, Nature 291 (5810) (1981) 11–12.
[54] E.H. Lieb, D. Mattis, Mathematical Physics in One Dimension: Exactly Soluble Models of Interacting Particles, Academic Press, New York, 1966.
[55] G. Benfatto, G. Gallavotti, J. Lebowitz, Disorder in the 1D Holstein model, Helv. Phys. Acta 68 (3) (1995) 312–327.
[56] G.D. Mahan, Many-particle Physics, Plenum Press, New York, 1990.
[57] F.A. Berezin, The Methods of Second Quantization, Academic Press, New York, 1966.
[58] K. Hepp, Proof of the Bogolubov–Parasyuk theorem on renormalization, Comm. Math. Phys. 2 (4) (1966) 301–326.
[59] W. Zimmermann, Convergence of Bogolubov’s method of renormalization in momentum space, Comm. Math. Phys. 15 (3) (1969) 208–234.
[60] G. Gallavotti, F. Nicolò, Renormalization theory for four dimensional scalar fields, I, Comm. Math. Phys. 100 (1985) 545–590; II, Comm. Math. Phys. 101 (1985) 471–562.
[61] J. Vidal, D. Mouhanna, T. Giamarchi, Correlated fermions in a one-dimensional quasi-periodic potential, Phys. Rev. Lett. 83 (1999) 3908–3911.
[62] J.C. Chaves, I.I. Satjia, Transport properties of one-dimensional interacting fermions in aperiodic potentials, Phys. Rev. B 55 (1997) 14076.
[63] W.M. Schmidt, in: Diophantine Approximation, Lecture Notes in Mathematics, Vol. 785, Springer, Berlin, 1980.
[64] L.H. Eliasson, Absolutely convergent series expansions for quasi-periodic motions, Report 2-88, 1-31, Department of Mathematics, University of Stockholm (1988) and Math. Phys. Electron. J. 2 (1996) 1–33 (paper 4).
[65] G. Gallavotti, Twistless KAM tori, Comm. Math. Phys. 164 (1994) 145–156.
[66] G. Gentile, V. Mastropietro, Methods for the analysis of the Lindstedt series for KAM tori and renormalizability in classical mechanics. A review with some applications, Rev. Math. Phys. 8 (1996) 393–444.
[67] G. Gallavotti, The Elements of Mechanics, Springer, New York, 1983.
[68] G. Gallavotti, Renormalization group and ultraviolet stability for scalar fields via renormalization group methods, Rev. Mod. Phys. 57 (1985) 471–562.
[69] G. Gentile, V. Mastropietro, Anderson localization for the Holstein model, Comm. Math. Phys. 215 (1) (2000) 69–103.
[70] I. Affleck, Field theory methods and quantum critical phenomena, Les Houches Summer School on Critical Phenomena, Random Systems, Gauge Theories, North-Holland, Amsterdam, Springer, Berlin, 1984.
[71] B. McCoy, T. Wu, Dynamic mass generation and the Thirring model, Phys. Lett. B 87 (1–2) (1979) 50–51.
436
G. Gentile, V. Mastropietro / Physics Reports 352 (2001) 273–437
[72] W. Metzner, C. di Castro, Conservation laws and correlation functions in the Luttinger liquids, Phys. Rev. B 47 (24) (1993) 16107–16123. [73] W. Thirring, A soluble relativistic 6eld theory, Ann. Phys. 3 (1958) 91–112. [74] J. Belissard, R. Lima, D. Testard, A metal–insulator transition for almost Mathieu model, Comm. Math. Phys. 88 (1983) 207–234. [75] R.A. Johnson, J. Moser, The rotation number for almost periodic potentials, Comm. Math. Phys. 84 (1982) 403–438. [76] J. Magnen, G. Poirot, V. Rivasseau, Ward type identities for the 2d Anderson model with weak disorder, J. Stat. Phys. 93 (1998) 331. [77] B.M. McCoy, Spin correlation functions of the X –Y model, Phys. Rev. 173 (1968) 531–541. [78] G. Benfatto, G. Gallavotti, J.L. Lebowitz, Disorder in 1D spinless Holstein model, Helv. Phys. Acta 68 (3) (1995) 312–327. [79] J.D. Johnson, S. Krinsky, B.M. McCoy, Vertical-arrow correlation length in the eight-vertex model and the low-lying excitations of the X –Y –Z Hamiltonian, Phys. Rev. A 8 (1973) 2526–2547. [80] J. Moser, On invariant curves of area preserving mappings of the annulus, Nachr. Akad. Wiss. GIottingen Math. Phys. Kl. II 1962 (1962) 1–20. [81] V. Mastropietro, Interacting soluble Fermi systems in one dimension, Nuovo Cimento B 1 (1994) 304–312. [82] R.E. Peierls, Quantum Theory of Solids, Clarendon Press, Oxford, 1955. [83] M. Fabrizio, Role of transverse hopping in a two-coupled chains model, Phys. Rev. B 48 (21) (1993) 15838–15859. [84] T. Kennedy, E.H. Lieb, An itinerant electron model with crystalline or magnetic long range order, Physica A 138 (1986) 320–358. [85] S. Aubry, G. Abramovici, J. Raimbault, Chaotic polaronic and bipolaronic states in the adiabatic Holstein model, J. Stat. Phys. 67 (3– 4) (1992) 675–780. [86] F. Bonetto, V. Mastropietro, Critical indices for the Yukawa2 quantum 6eld theory, Nucl. Phys. B 497 (1997) 541–554. [87] V. Mastropietro, Anomalous superconductivity for coupled Luttinger liquids, Rev. Math. Phys. 
12 (12) (2000) 1627–1654. [88] A.H. Castro-Neto, F. Guinea, Superconductivity, Josephson coupling and ordr parameter symmetry in striped cuprates, Phys. Rev. Lett. 80 (18) (1998) 4040–4043. [89] J. Feldman, J. Magnen, V. Rivasseau, E. Trubowitz, An in6nite volume expansion for many fermions Freen functions, Helv. Phys. Acta 65 (1992) 679–721. [90] F.R. Gantmacher, The Theory of Matrices, Chelsea, New York, 1960. [91] H. RIussmann, Invariant tori in the perturbation theory of weakly non-degenerate integrable Hamiltonian systems, Preprint, Rehie Des Fachbereichs Mathematik der Johannes Gutenberg-UniversitIat Mainz, Nr. 14, 1998. [92] A.M. Davie, The critical function for the semistandard map, Nonlinearity 7 (1) (1994) 219–229. [93] A. Berretti, G. Gentile, Bryuno function and the standard map, Comm. Math. Phys., in press. [95] C. Baesens, R.S. MacKay, Improved proof of existence of chaotic polaronic and bipolaronic states for the adiabatic Holstein model and generalizations, Nonlinearity 7 (1994) 59–84. [97] C. Bourbonnais, L.G. Caron, Renormalization-group approach to quasi-one-dimensional conductors, Int. J. Mod. Phys. B 5 (6 –7) (1991) 1033–1096. [98] E.R. Caianiello, Number of Feynman graphs and convergence, Nuovo Cimento 3 (1) (1956) 223–225. [100] I.E. Dzyaloshinskii, A.I. Larkin, Correlation functions for a one-dimensional Fermi system with long-range interaction (Tomonaga model), Soviet Phys. JETP 38 (1974) 202–208. [101] J. Feldman, M. Salmhofer, M. Trubowitz, Perturbation theory around nonnested Fermi surface. I. Keeping the Fermi surface 6xed, J. Stat. Phys. 84 (5 – 6) (1996) 1209–1336. [102] J. Feldman, M. Salmhofer, M. Trubowitz, Regularity of the moving Fermi surface: RPA contributions, Comm. Pure Appl. Math. 51 (9 –10) (1998) 1133–1246. [103] J. Feldman, M. Salmhofer, M. Trubowitz, Regularity of interacting nonspherical Fermi surfaces: the full self-energy, Comm. Pure Appl. Math. 52 (3) (1999) 273–324.
G. Gentile, V. Mastropietro / Physics Reports 352 (2001) 273–437
437
[104] T. Giamarchi, H.J. Schultz, Correlation functions of one dimensional quantum systems, Phys. Rev. B 39 (1989) 4620–4629. [105] F.D.M. Haldane, Luttinger liquid theory of the one dimensional quantum Cuids, J. Phys. C 14 (1981) 2585– 2609. [106] F. Harary, E. Palmer, Graphical Enumerations, Academic Press, New York, 1973. [107] T. Holstein, Studies of polaron motion, part 1. The molecular-crystal model, Ann. Phys. 8 (1959) 325–342. [108] A.N. Kolmogorov, On the preservation of conditionally periodic motions, Dokl. Akad. Nauk. 96 (1954) 527–530. English translation published in G. Casati, J. Ford, Stochastic Behavior in Classical and Quantum Hamiltonians, Lecture Notes in Physics, Vol. 93, Springer, Berlin, 1979. [109] J.L. Lebowitz, N. Macris, Low-temperature phases of itinerant fermions interacting with classical phonons: the static Holstein model, J. Stat. Phys. 76 (1–2) (1994) 91–123. [110] E.H. Lieb, A model for crystallization: a variation of the Hubbard model, Physica A 140 (1986) 240–250. [112] J.M. Luttinger, An exactly soluble model of a many fermions system, J. Math. Phys. 4 (1963) 1154–1162. [113] D. Mattis, Band theory of magnetism in metals in the context of band theory of magnetism, Physics 1 (1964) 183–193. [115] R. Shankar, Renormalization group approach to interacting fermions, Rev. Mod. Phys. 66 (1) (1994) 129–192. [116] S. Tomonaga, Remarks on Bloch’s methods of sound waves applied to many fermion systems, Progr. Theor. Phys. 5 (1950) 544–569. [117] K.G. Wilson, M. Fisher, Critical exponents in 3.99 dimension, Phys. Rev. Lett. 28 (1976) 2240–2243.
Physics Reports 352 (2001) 439–458
Renormalization group and probability theory
G. Jona-Lasinio
Dipartimento di Fisica and INFN, Università “La Sapienza”, Piazzale Aldo Moro 2, 00185 Roma, Italy
Received March 2001; editor: I. Procaccia
Contents
1. Introduction
2. A renormalization group derivation of the central limit theorem
3. Hierarchical models
4. Eigenvalues of the linearized RG and critical indices
5. Self-similar random fields
6. Some properties of self-similar random fields
7. Multiplicative structure
8. RG and effective potentials
9. Coexistence of phases in hierarchical models
10. Weak perturbations of Gaussian measures: a non-Gaussian fixed point
11. Concluding remarks
Acknowledgements
References
Abstract
The renormalization group has played an important role in the physics of the second half of the 20th century both as a conceptual and a calculational tool. In particular, it provided the key ideas for the construction of a qualitative and quantitative theory of the critical point in phase transitions and started a new era in statistical mechanics. Probability theory lies at the foundation of this branch of physics and the renormalization group has an interesting probabilistic interpretation, as was recognized in the mid-1970s. This paper intends to provide a concise introduction to this aspect of the theory of phase transitions which clarifies the deep statistical significance of critical universality. © 2001 Elsevier Science B.V. All rights reserved.
PACS: 02.50.−r; 05.10.Cc; 05.20.−y
E-mail address: [email protected] (G. Jona-Lasinio).
PII: S0370-1573(01)00042-4
G. Jona-Lasinio / Physics Reports 352 (2001) 439–458
1. Introduction
The renormalization group (RG) is both a way of thinking and a calculational tool which acquired its full maturity in connection with the theory of the critical point in phase transitions. The basic physical idea of the RG is that when we deal with systems with infinitely many degrees of freedom, like thermodynamic systems, there are relatively simple relationships between properties at different space scales, so that in many cases we are able to write down explicit exact or approximate equations which allow us to study asymptotic behaviour at very large scales.
The first systematic use of probabilistic methods in statistical mechanics was made by Khinchin who showed, using the central limit theorem, that the Boltzmann distribution of the single-molecule energy in systems of weakly correlated molecules is universal, that is, independent of the form of the interaction, provided it is of short range. In his well-known book [1] he emphasizes that physicists had not fully appreciated the generality of probabilistic methods so that, for example, they provided a new derivation (usually heuristic) of the Boltzmann law for every type of interaction.
Similar remarks apply to the first applications of the RG in statistical mechanics. The RG was introduced as a tool to explain theoretically the universality phenomena, the scaling laws, discovered experimentally near the critical point of a phase transition. In the first period RG calculations used different formal devices, mostly borrowed from quantum field theory, which gave good qualitative and quantitative results. It was soon realized that a new class of limit theorems in probability was involved. This class referred to situations of strongly correlated variables to which the central limit theorem does not apply, that is, situations opposite to those considered by Khinchin. In fact, it was discovered that a critical point can be characterized by deviations from the central limit theorem.
Before providing a description of the content of the present paper we give a short account of the development of RG ideas in statistical mechanics. It is useful to distinguish two different conceptual approaches. The first use of the RG in the study of critical phenomena [2] was based on a Green’s function approach to statistical mechanics which paralleled quantum field theory. We recall Eq. (1) of [2]
$$G(x; \{y_i\}; \lambda) = Z(t; \{y_i\}; \lambda)\, G\big(x/t; \{y_i/t\}; Z_V^{-1}(t; \{y_i\}; \lambda)\, Z^2(t; \{y_i\}; \lambda)\, \lambda\big), \tag{1.1}$$
where $G$ is a dimensionless two-point Green’s function depending on a momentum variable $x$, a set of physical parameters $y_i$ and the intensity of the interaction $\lambda$ (all dimensionless). This is an exact generalized scaling relation which in the vicinity of the critical point reduces to the phenomenological scaling due to the disappearance of the irrelevant parameters. The scaling functions $Z$ and $Z_V$ can be expressed in terms of Green’s functions themselves via certain normalization conditions. This equation provided a qualitative explanation of scaling and, after the introduction of a non-integer space dimension $d$ and the use of $\varepsilon = 4 - d$ as a perturbation parameter [3], became the basis for systematic quantitative calculations [4–6].
The second approach started with Wilson’s use of a different notion of RG [7] that he had already introduced in a different context, the fixed source meson theory [8,9], with no reference to critical phenomena. This was akin to certain intuitive ideas of Kadanoff [10] about the mechanism of reduction of relevant degrees of freedom near the critical point. Kadanoff’s idea was that in the critical regime a thermodynamic system, due to the strong
correlations among the microscopic variables, behaves as if constituted by blocks of arbitrary size. In Wilson’s approach, in fact, the calculation of a statistical sum consisted in a progressive elimination of the microscopic degrees of freedom to obtain the asymptotic large-scale properties of the system. Formally, the Green’s function and the Wilson method were very different and in particular the first one implied a true group structure while the second was a semigroup. Both gave exactly the same results and the problem arose of clarifying the conceptual structures underlying these methods. In fact, many people were confused by this situation and some thought that the two methods had little connection with each other. Actually, the possibility of different RG transformations equally effective in the study of critical properties could be easily understood using concepts from the theory of dynamical systems. The critical point corresponds to a fixed point of these transformations and the quantities of physical interest, i.e. the critical indices, are connected with the hyperbolic behaviour in its neighbourhood, which is preserved if the different transformations are related by a differentiable map [11]. Still the multiplicative structure of the Green’s function RG and the elimination of degrees of freedom typical of Wilson’s approach did not appear easy to reconcile. The formal connection was clarified in [12] where it was shown that to any type of RG transformation one can associate a multiplicative structure, a cocycle, and the characterizing feature of the Green’s function RG is that it is defined directly in terms of this structure. In the probabilistic setting the multiplicative structure is related to the properties of conditional expectations as discussed in [30] and illustrated in the present paper.
The relationship between the two approaches is an aspect that has not been fully appreciated in the literature and even an authoritative recent exposition of the RG history seems to suggest the existence of basic conceptual differences [13]. In particular, Fisher discusses whether equations like (1.1) did anticipate Wilson and concludes negatively. Different interpretations in the history of scientific ideas are, of course, legitimate but after 30 years of applications of RG in critical phenomena and other fields this conclusion does not appear justified. In his 1970 paper [9] on meson theories Wilson had emphasized analogies of his approach with the Green’s function RG of Gell-Mann and Low even though a detailed comparison was not yet available. A balanced presentation of the different RG approaches to critical phenomena can be found in the second edition of [14] of which, to my knowledge, there is no English translation.
It was more difficult to understand at a deeper level the statistical nature of the critical universality. After 25 years I still think that the language of probability provides the clearest description of what is involved. The first hint came from a study by Bleher and Sinai [15] of Dyson’s hierarchical models where they showed that at the critical point the increase of fluctuations required a normalization different from the square root of the number of variables in order to obtain a non-singular distribution for sums of spin variables (block spin). This normalization factor is directly related to the rescaling parameter of the fields in the RG. The limit distribution could be either a Gaussian as in the central limit theorem (CLT) or a different one which could be calculated approximately. The next step consisted in the recognition that new limit theorems for random fields were involved [16–18]. The random fields appearing in these limit theorems have scaling properties and some examples had already appeared in the probabilistic literature. However, these examples [19] were not of a kind natural in statistical mechanics. The new challenging problem posed by the theory of phase transitions was the case of short-range interactions producing at the critical point long
range correlations whose scaling behaviour cannot be easily guessed from the microscopic parameters. A general theory of such limit theorems is still missing and so far rigorous progress has been obtained in situations which are not hierarchical but share with these the fact that some form of scaling is introduced from the beginning.
The main part of the present article will review the connections of RG with limit theorems as they were understood in the decade 1975–1985. The justification for presenting old material resides in the fact that these results are scattered in many different publications, very often with different perspectives. Here an effort is made to present the probabilistic point of view in a synthetic and coherent way. Section 10 will give an idea of more recent work, trying to extract some general features from the hard technicalities which characterize it.
2. A renormalization group derivation of the central limit theorem
The CLT asserts the following. Let $\xi_1, \xi_2, \ldots, \xi_n, \ldots$ be a sequence of independent identically distributed (i.i.d.) random variables with finite variance $\sigma^2 = E(\xi_i - E(\xi_i))^2$, where $E$ means expectation with respect to their common distribution. Then
$$\frac{1}{\sigma n^{1/2}} \sum_{i=1}^{n} (\xi_i - E(\xi_i)) \xrightarrow[n\to\infty]{} N(0,1), \tag{2.1}$$
where the convergence is in law and $N(0,1)$ is the normal centred distribution of variance 1. To visualize things consider the random variables $\xi_i$ as discrete or continuous spins associated to the points of a one-dimensional lattice $\mathbb{Z}$ and introduce the block variables $\zeta_n^1 = 2^{-n/2} \sum_{i=1}^{2^n} \xi_i$ and $\zeta_n^2 = 2^{-n/2} \sum_{i=2^n+1}^{2^{n+1}} \xi_i$. Then
$$\zeta_{n+1} = \frac{1}{\sqrt{2}} (\zeta_n^1 + \zeta_n^2). \tag{2.2}$$
Therefore, we can write the recursive relation for the corresponding distributions
$$p_{n+1}(x) = \sqrt{2} \int dy\, p_n(\sqrt{2}x - y)\, p_n(y) = (R p_n)(x). \tag{2.3}$$
The non-linear transformation $R$ is what we call a renormalization transformation. Let us find its fixed points, i.e. the solutions of the equation $Rp = p$. An easy calculation shows that the family of Gaussians
$$p_{G,\sigma}(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-x^2/2\sigma^2} \tag{2.4}$$
are fixed points. To prove the CLT we have to discuss the conditions under which the iteration of $R$ converges to a fixed point of variance $\sigma^2$. The standard analytical way is to use the Fourier transform since $R$ is a convolution. In view of the subsequent developments, here we shall illustrate the mechanism of convergence in the neighbourhood of a fixed point from the point of view of non-linear analysis. There are three conservation laws associated with
$R$: normalization, centring and variance. In formulas
$$\int p_{n+1}(x)\, dx = \int p_n(x)\, dx, \tag{2.5}$$
$$\int x\, p_{n+1}(x)\, dx = \int x\, p_n(x)\, dx, \tag{2.6}$$
$$\int x^2 p_{n+1}(x)\, dx = \int x^2 p_n(x)\, dx. \tag{2.7}$$
Therefore, only distributions with variance $\sigma^2$ can converge to a Gaussian $p_{G,\sigma}(x)$. We fix $\sigma = 1$ and write $p_G$ for $p_{G,1}$. Let us write the initial distribution as a centred deformation of the Gaussian with the same variance
$$p_\eta(x) = p_G(x)(1 + \eta h(x)), \tag{2.8}$$
where $\eta$ is a parameter. The function $h(x)$ must satisfy
$$\int p_G(x)\, h(x)\, dx = 0, \tag{2.9}$$
$$\int p_G(x)\, x\, h(x)\, dx = 0, \tag{2.10}$$
$$\int p_G(x)\, x^2 h(x)\, dx = 0. \tag{2.11}$$
Suppose now $\eta$ small. In linear approximation we have
$$(R p_\eta) = p_G (1 + \eta (L h)) + O(\eta^2), \tag{2.12}$$
where $L$ is the linear operator
$$(L h)(x) = 2\pi^{-1/2} \int dy\, e^{-y^2} h(y + x 2^{-1/2}). \tag{2.13}$$
The eigenvalues of $L$ are
$$\lambda_k = 2^{1-k/2} \tag{2.14}$$
and the eigenfunctions the Hermite polynomials. The three conditions above on $h(x)$ can be read as the vanishing of its projections on the first three Hermite polynomials. The mechanism of convergence of the deformed distribution to the normal law is now clear in linear approximation: if we develop $h$ in Hermite polynomials only terms with $k > 2$ will appear so that upon iteration of the RG transformation they will contract to zero exponentially as the corresponding eigenvalues are $< 1$. To complete the proof one must show that the non-linear terms do not alter the conclusion. This is less elementary and will not be pursued here.
A terminological remark. The Gaussian is an example of what is called in probability theory a stable distribution. These are distributions which are fixed points of convolution equations and, with the exception of the Gaussian, have infinite variance.
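As an illustrative numerical check, not part of the original derivation, the eigenvalue formula (2.14) can be tested by applying the operator $L$ of (2.13) to Hermite polynomials with Gauss–Hermite quadrature. The sketch assumes the probabilists' normalization $He_k$ of the Hermite polynomials (orthogonal with respect to $p_G$); the helper names `hermite`, `apply_L` and `eigenvalue_estimate` are hypothetical and introduced only for this example.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss    # quadrature rule for weight exp(-y^2)
from numpy.polynomial.hermite_e import hermeval   # evaluates series in He_k


def hermite(k):
    """Return a callable evaluating He_k (assumed normalization, see lead-in)."""
    coeffs = np.zeros(k + 1)
    coeffs[k] = 1.0
    return lambda t: hermeval(t, coeffs)


def apply_L(h, x, n_nodes=40):
    """Evaluate (L h)(x) = 2 pi^{-1/2} * int dy exp(-y^2) h(y + x / sqrt(2))."""
    y, w = hermgauss(n_nodes)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    shifted = y[None, :] + x[:, None] / np.sqrt(2.0)
    return 2.0 / np.sqrt(np.pi) * (h(shifted) * w[None, :]).sum(axis=1)


def eigenvalue_estimate(k, x=(0.5, 1.2, 2.5)):
    """Ratio (L He_k)(x) / He_k(x) at sample points away from the zeros of He_k."""
    x = np.asarray(x, dtype=float)
    he_k = hermite(k)
    return apply_L(he_k, x) / he_k(x)


if __name__ == "__main__":
    for k in range(7):
        # Second column: quadrature estimate; third column: 2^(1 - k/2).
        print(k, eigenvalue_estimate(k), 2.0 ** (1.0 - k / 2.0))
```

The ratio is independent of the sample point when the input is an eigenfunction, which is why three points are compared at once. Only the values with $k > 2$ fall below one, the linear contraction invoked in the text.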
3. Hierarchical models
Suppose now that the $\xi_i$ are not independent. A case which has played a very important role in the development of the RG theory of critical phenomena is that of hierarchical models. To keep the notation close to that of the previous section we write the recursion relation connecting the distribution at level $n$ to that at level $n+1$
$$p_{n+1}(x) = (\hat{R} p_n)(x) = g_n(x^2)(R p_n)(x), \tag{3.1}$$
where $g_n(x^2)$ is a sequence of positive increasing functions and $R$ has the same meaning as in the previous section. It is clear that such a dependence tends to favour large values of the block variable $x$ and therefore values of the $\xi_i$’s of the same sign. We call this a ferromagnetic dependence. We make the following choice $g_n(x^2) = L_n e^{\beta (c/2)^n x^2}$, where the constant $L_n$ is determined by the normalization condition. This type of recursion arises from the following Gibbs distribution:
$$d\mu = Z_n^{-1} e^{-\beta H_n(x_1, \ldots, x_{2^n})} \prod_{i=1}^{2^n} dp_0(x_i), \tag{3.2}$$
where $H_n$ has the following hierarchical structure:
$$H_n(x_1, \ldots, x_{2^n}) = H_{n-1}(x_1, \ldots, x_{2^{n-1}}) + H_{n-1}(x_{2^{n-1}+1}, \ldots, x_{2^n}) - c^n \left( \frac{\sum_{i=1}^{2^n} x_i}{2^n} \right)^2, \tag{3.3}$$
$H_0 = 0$ and $p_0(x)$ is a single spin distribution which characterizes the model. The constant $c$ satisfies $1 < c < 2$. For $c < 1$ the model is trivial while for $c > 2$ it becomes thermodynamically unstable. To understand what happens in the case of dependent variables let us consider the hierarchical model defined by a Gaussian single spin distribution where the iteration can be performed exactly, that is
$$p_0(x) = (2\pi)^{-1/2} e^{-x^2/2}, \tag{3.4}$$
$$(\hat{R}^n p_0)(x) = (2\pi\sigma_n^2)^{-1/2} e^{-x^2/2\sigma_n^2}, \tag{3.5}$$
$$\sigma_n^2 = \left( 1 - 2\beta \sum_{k=1}^{n} (c/2)^k \right)^{-1}. \tag{3.6}$$
We see that the conservation of variance does not hold anymore under the transformation $\hat{R}$ and, in fact, the variance increases at each iteration and is $\beta$ dependent. When $n$ tends to infinity the iteration converges to the distribution
$$p(x) = (2\pi)^{-1/2} (1 - 2\beta c/(2-c))^{1/2} e^{-(1 - 2\beta c/(2-c)) x^2/2} \tag{3.7}$$
provided $\beta < \beta_{cr} = 1/c - \frac{1}{2}$, that is the CLT holds if the temperature is sufficiently large. At $\beta = \beta_{cr}$ the variance of the limit distribution explodes which means that the fluctuations
increase faster than $O(2^{n/2})$. We try a new normalization of the block variable and consider $\zeta_n = \sum_{i=1}^{2^n} \xi_i / (2^n c^{-n/2})$. The recursion for the distribution of this variable is
$$p_{n+1}(x) = L_n e^{\beta x^2} \int dy\, p_n(2c^{-1/2}x - y)\, p_n(y). \tag{3.8}$$
We now follow the same pattern as in the previous section: calculate the fixed points and see whether they admit a stable manifold or, in probabilistic language, a domain of attraction. The fixed points are the solutions of the equation
$$p(x) = L e^{\beta x^2} \int dy\, p(2c^{-1/2}x - y)\, p(y) \tag{3.9}$$
with $L$ determined by normalization. A Gaussian solution is easily found
$$p_G(x) = (a_0/\pi)^{1/2} e^{-a_0 x^2} \tag{3.10}$$
with $a_0 = \beta c/(2-c)$. We shall again discuss stability in linear approximation by considering a centred deformation of (3.10), $p_\eta(x) = p_G(x)(1 + \eta h(x))$. For small $\eta$
$$(\hat{R} p_\eta) = p_G (1 + \eta (\hat{L} h)) + O(\eta^2), \tag{3.11}$$
where $\hat{L}$ is the linear operator
$$(\hat{L} h)(x) = \int dy\, e^{-2a_0 y^2} \left( h((x+y)/c^{1/2}) + h((x-y)/c^{1/2}) \right). \tag{3.12}$$
The corresponding eigenvalues for even eigenfunctions are $\lambda_{2k} = 2/c^k$, with $k = 0, 1, 2, 3, \ldots$, and the eigenfunctions are rescaled Hermite polynomials of even degree. Here $h(x)$ must be considered as the effect of many iterations starting from some initial distribution which characterizes the model and is therefore dependent on $\beta$. We now see that for $2 > c > 2^{1/2}$ the eigenvalues $\lambda_0$ and $\lambda_2$ are $> 1$. The projection of $h$ over the constants vanishes due to the normalization condition so that for the iteration to converge we have to impose the vanishing of the projection of $h$ on the second Hermite polynomial. In view of the previous remark this will select a special value $\beta_{cr}$, the critical temperature for the model considered. In conclusion, for $2 > c > 2^{1/2}$ and $\beta = \beta_{cr}$ the fixed point (3.10) has a non-empty domain of attraction.
When $c < 2^{1/2}$ the Gaussian fixed point becomes unstable and we must investigate the existence of other fixed points. Bifurcation theory tells us that most likely there is an exchange of stability between two fixed points and we should look for the new one in the direction which has just become unstable. For $c < 2^{1/2}$ we have $\lambda_4 > 1$ so the instability is in the direction $H_4$, the Hermite polynomial of fourth order. Define $\varepsilon = 2^{1/2} - c$ and look, for $\varepsilon$ small, for a solution of (3.9) of the form
$$p_{NG}(x) = p_G(x)(1 - \varepsilon a H_4(\gamma x)) + O(\varepsilon^2) \approx e^{-r^*(\varepsilon) x^2/2 - u^*(\varepsilon) x^4/4}, \tag{3.13}$$
where $r^*$ and $u^*$ are the fixed point couplings. The analysis of this case is considerably more complicated and gives the following results: the linearization of the RG transformation around (3.13) has only one unstable direction so that by requiring the vanishing of an appropriate projection along this direction we obtain a non-empty domain of attraction for some $\beta_{cr}$ [20,21].
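The exactly solvable Gaussian case lends itself to a small numerical sketch. The code below iterates the inverse variance of the Gaussian hierarchical model, compares it with the closed forms (3.6) and (3.7), and evaluates the eigenvalues $\lambda_{2k} = 2/c^k$ of the critical recursion. It assumes, as read off from (3.1) and (3.3), that $R$ leaves the variance of a Gaussian unchanged and that the factor applied at level $k$ lowers the inverse variance by $2\beta (c/2)^k$; all function names are hypothetical.

```python
def iterate_inverse_variance(n, beta, c):
    """Inverse variance 1/sigma_n^2 after n steps, starting from p_0 = N(0, 1).

    Assumption: R conserves the variance of a Gaussian, and the factor
    exp(beta * (c/2)**k * x**2) applied at level k lowers 1/sigma^2 by
    2 * beta * (c/2)**k.
    """
    inv_var = 1.0
    for k in range(1, n + 1):
        inv_var -= 2.0 * beta * (c / 2.0) ** k
    return inv_var


def limit_inverse_variance(beta, c):
    """n -> infinity limit, 1 - 2*beta*c/(2 - c), as in Eq. (3.7)."""
    return 1.0 - 2.0 * beta * c / (2.0 - c)


def beta_critical(c):
    """Value of beta at which the limiting variance diverges."""
    return 1.0 / c - 0.5


def gaussian_eigenvalue(degree, c):
    """lambda_{2k} = 2 / c**k for the even eigenfunction of degree 2k."""
    if degree % 2:
        raise ValueError("only even degrees are covered by the formula")
    return 2.0 / c ** (degree // 2)


if __name__ == "__main__":
    c = 1.5
    beta = 0.5 * beta_critical(c)
    for n in (1, 5, 20, 80):
        print(n, 1.0 / iterate_inverse_variance(n, beta, c))
    print("limit variance:", 1.0 / limit_inverse_variance(beta, c))
    for degree in (0, 2, 4, 6):
        print(degree, gaussian_eigenvalue(degree, c))
```

With $c = 1.5 > 2^{1/2}$ the degree-4 eigenvalue is below one, whereas choosing $c < 2^{1/2}$ pushes it above one, which is the loss of stability of the Gaussian fixed point described above.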
4. Eigenvalues of the linearized RG and critical indices
We illustrate the interpretation of the eigenvalues of the linearized RG at a fixed point in the context of hierarchical models, which is especially simple. Notations are as in the previous section. Consider for definiteness the region $2^{1/3} < c < 2^{1/2}$ so that the Gaussian fixed point is unstable, but in its neighbourhood there exists a non-trivial non-Gaussian fixed point of the form (3.13) with a non-empty domain of attraction. Suppose now that we start the iteration of the RG from some initial distribution which is close to it but not in its domain of attraction. For example we may consider a distribution $p(x; \beta)$ of the form (3.13) with parameters $r, u$ slightly different from $r^*, u^*$ which are the values taken at the critical temperature $\beta_{cr}$. Application of the RG transformation will eventually drive this distribution away from the fixed point due to the presence of an unstable direction. However, we can “renormalize” our parameters $r, u$ by compensating the instability at each iteration and find a sequence of parameters $r_n, u_n$ such that when $n$ tends to infinity we have a sequence of distributions $p_n(x; \beta_n)$ approaching a definite limit. Since we have assumed that the parameters $r_n, u_n$ are close to the fixed point values, the renormalized parameters can be simply expressed in terms of rescalings determined by the eigenvalues of a linear operator analogous to $\hat{L}$ introduced in the previous section in connection with the Gaussian fixed point. This can be seen as follows. Let us write our initial distribution as a deformation of (3.13)
$$p(x; \beta) = p_{NG}(x; \beta_{cr})(1 + h(x; \beta)). \tag{4.1}$$
If we develop $h$ in terms of eigenfunctions of the linearized RG at (3.13), the iteration of the RG transformation will multiply the projections of $h$ along these eigenfunctions by powers of the corresponding eigenvalues. The explosion in the unstable direction can then be controlled by rescaling at each step the projection in this direction with a factor proportional to the inverse eigenvalue. Let us define the susceptibility of a block of $2^n$ spins
$$\chi_n(\beta) = \frac{1}{2^n} E\left( \sum_{i=1}^{2^n} \xi_i \right)^2. \tag{4.2}$$
This quantity can be easily expressed in terms of the distribution $p_n(x; \beta)$ of the block with the critical normalization $2^n c^{-n/2}$
$$\chi_n(\beta) = (2/c)^n \int dx\, x^2 p_n(x; \beta). \tag{4.3}$$
As $n \to \infty$, $\chi_n$ diverges if $\beta = \beta_{cr}$. To calculate the susceptibility critical index as $\beta$ approaches the critical value we assume that the two limits $n \to \infty$ and $\beta \to \beta_{cr}$ can be interchanged so that they can be calculated over subsequences. We want to compute
$$\nu_\chi = \lim_{n\to\infty} \frac{\log \chi_n(\beta_n)}{\log|\beta_n - \beta_{cr}|} = \lim_{n\to\infty} \frac{-n \log(c/2) + \log \int dy\, y^2 p_n(y; \beta_n)}{\log|\beta_n - \beta_{cr}|}. \tag{4.4}$$
If we now choose $|\beta_n - \beta_{cr}| \approx \lambda^{-n}$, where $\lambda$ is the eigenvalue corresponding to the unstable direction of the RG linearized at the fixed point, the integral appearing in this formula will be almost constant and we obtain
$$\nu_\chi = \frac{\log(c/2)}{\log \lambda}. \tag{4.5}$$
Similar calculations can be done for other thermodynamical quantities like the free energy or the magnetization. We can summarize the situation as follows: given a model defined by an initial distribution $p_0$, for $\beta < \beta_{cr}$ we expect the CLT to hold. For $\beta = \beta_{cr}$ by properly normalizing the block variables we have new limit theorems where the limit law has a domain of attraction which is a non-trivial submanifold in the space of probability distributions called the critical manifold. If we start from a distribution which is not in the domain of attraction of a given fixed point, but not too far from it, it can still be driven to a regular limit by rescaling at each step its coefficients in a way dictated by the fixed point. This defines the so-called scaling limits of the theory associated to a given fixed point. For further reading see [21–23].
5. Self-similar random fields
The notion of self-similar random field was introduced informally in [16] and rigorously in [17] and independently in [18]. It was then developed more systematically in [24,25]. The idea was to construct a proper mathematical setting for the notion of RG à la Kadanoff–Wilson. This led to a generalization of limit theorems for random fields to the situation in which the variables are strongly correlated. Let $\mathbb{Z}^d$ be a lattice in $d$-dimensional space and $j$ a generic point of $\mathbb{Z}^d$, $j = (j_1, j_2, \ldots, j_d)$ with integer coordinates $j_i$. We associate to each site a centred random variable $\xi_j$ and define a new random field
$$\xi_j^n = (R_{\alpha,n} \xi)_j = n^{-d\alpha/2} \sum_{s \in V_j^n} \xi_s, \tag{5.1}$$
where
$$V_j^n = \{ s : j_k n - n/2 < s_k \le j_k n + n/2 \} \tag{5.2}$$
and $1 \le \alpha < 2$. Transformation (5.1) induces a transformation on probability measures according to
$$(R^*_{\alpha,n} \mu)(A) = \mu'(A) = \mu(R^{-1}_{\alpha,n} A), \tag{5.3}$$
where $A$ is a measurable set and $R^*_{\alpha,n}$ has the semigroup property
$$R^*_{\alpha,n_1} R^*_{\alpha,n_2} = R^*_{\alpha,n_1 n_2}. \tag{5.4}$$
A measure $\mu$ will be called self-similar if
$$R^*_{\alpha,n} \mu = \mu \tag{5.5}$$
and the corresponding field will be called a self-similar random field. We briefly discuss the choice of the parameter $\alpha$. It is natural to take $1 \le \alpha < 2$. In fact, $\alpha = 2$ corresponds to the law of large numbers so that the block variable (5.1) will tend for large $n$ to zero in probability. The case $\alpha > 1$ means that we are considering random systems which fluctuate more than a collection of independent variables and $\alpha = 1$ corresponds to the CLT. Mathematically, the lower bound is not natural but it becomes so when we restrict ourselves to the consideration of ferromagnetic-like systems.

A general theory of self-similar random fields so far does not exist and presumably is very difficult. However, Gaussian fields are completely specified by their correlation function and self-similar Gaussian fields can be constructed explicitly [18,26]. It is easier if we represent the correlation function in terms of its Fourier transform
\[ E(\xi_i \xi_j) = \int_{-\pi}^{\pi} \prod_{k=1}^{d} d\lambda_k \, \rho(\lambda_1,\ldots,\lambda_d)\, e^{i\sum_k \lambda_k (i-j)_k}. \tag{5.6} \]
The prescription to construct $\rho$ in such a way that the corresponding Gaussian field satisfies (5.5) is as follows. Take a positive homogeneous function $f(\lambda_1,\ldots,\lambda_d)$ with homogeneity exponent $d(1+\alpha)$, that is
\[ f(c\lambda_1,\ldots,c\lambda_d) = c^{d(1+\alpha)} f(\lambda_1,\ldots,\lambda_d). \tag{5.7} \]
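Next, we construct a periodic function $g(\lambda_1,\ldots,\lambda_d)$ by taking an average over the lattice $Z^d$
\[ g(\lambda_1,\ldots,\lambda_d) = \sum_{i_k} \frac{1}{f(\lambda_1 + i_1,\ldots,\lambda_d + i_d)}. \tag{5.8} \]
If we take now
\[ \rho(\lambda_1,\ldots,\lambda_d) = \prod_i |1 - e^{i\lambda_i}|^2 \, g(\lambda_1,\ldots,\lambda_d), \tag{5.9} \]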
it is not difficult to see that the corresponding Gaussian measure satisfies (5.5). The periodicity of $\rho$ ensures translational invariance. For $d = 1$ there is only one homogeneous function, apart from a multiplicative constant, and one can show that the above construction exhausts all possible Gaussian self-similar distributions. For $d > 1$ it is not known whether a similar conclusion holds.

From this point on one can follow in the discussion the same pattern as for hierarchical models and investigate the stability of the Gaussian fixed points $P_G$ of (5.5). Consider a deformation $P_G(1 + h)$ and the action of $R^*_{\alpha,n}$ on this distribution. It is easily seen that
\[ R^*_{\alpha,n} P_G h = E(h|\{\xi^n_j\})\, R^*_{\alpha,n} P_G = E(h|\{\xi^n_j\})\, P_G(\{\xi^n_j\}). \tag{5.10} \]
The conditional expectation on the right hand side of (5.10) will be called the linearization of the RG at the fixed point $P_G$. To proceed further in the study of the stability we have to find its eigenvectors and eigenvalues. These have been calculated by Sinai. The eigenvectors are appropriate infinite dimensional generalizations of Hermite polynomials $H_k$ which are described in full detail in [26]. They satisfy the eigenvalue equation
\[ E(H_k|\{\xi^n_j\}) = n^{[k(\alpha/2-1)+1]d}\, H_k(\{\xi^n_j\}). \tag{5.11} \]
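The stability pattern encoded in (5.11) can be read off by tabulating the exponent $[k(\alpha/2-1)+1]d$ of $n$: a positive exponent means the direction $H_k$ grows under the RG, a negative one that it shrinks. The following is a small illustrative sketch; the function name `rg_exponent` and the sample values of $\alpha$ and $d$ are assumptions chosen for the example.

```python
from fractions import Fraction

def rg_exponent(k, alpha, d):
    """Exponent of n in the eigenvalue of (5.11): [k(alpha/2 - 1) + 1] d."""
    return (k * (alpha / 2 - 1) + 1) * d

d = 3  # illustrative dimension
for alpha in (Fraction(1), Fraction(5, 4), Fraction(3, 2), Fraction(7, 4)):
    exponents = [rg_exponent(k, alpha, d) for k in (2, 4, 6)]
    print(alpha, exponents)
```

Exact rational arithmetic is used so that the marginal cases (exponent exactly zero) are not blurred by rounding.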
We see immediately that $H_2$ is always unstable while $H_4$ becomes unstable when $\alpha$ crosses from below the value $3/2$. By introducing the parameter $\epsilon = \alpha - 3/2$, in principle, one can construct, as in the hierarchical case, a non-Gaussian fixed point. The formal construction is explained in Sinai's book [26] where one can also find an exhaustive discussion of the questions, mostly unsolved, arising in this connection. A different construction of a non-Gaussian fixed point, for $d = 4$, has been made recently by Brydges, Dimock and Hurd. This will be briefly discussed in Section 10.

6. Some properties of self-similar random fields

We have already characterized the critical point as a situation of strongly dependent random variables, in which the CLT fails. We want to give here a characterization which refers to the random field globally. Consider in the product space of the variables $\xi_i$ the cylinder sets, that is the sets of the form
\[ \{ \xi_{i_1} \in A_1, \ldots, \xi_{i_n} \in A_n \}, \tag{6.1} \]
with $i_1,\ldots,i_n \in \Lambda$, $\Lambda$ being an arbitrary finite region in $Z^d$ and the $A_i$ measurable sets in the space of the variables $\xi_i$. We denote with $\Sigma_\Lambda$ the $\sigma$-algebra generated by such sets. We say that the variables $\xi_i$ are weakly dependent or that they are a strong mixing random field if the following holds. Given two finite regions $\Lambda_1$ and $\Lambda_2$ separated by a distance
\[ d(\Lambda_1,\Lambda_2) = \min_{i\in\Lambda_1,\ j\in\Lambda_2} |i-j|, \tag{6.2} \]
where $|i-j|$ is for example the Euclidean distance, define
\[ \tau(\Lambda_1,\Lambda_2) = \sup_{A\in\Sigma_{\Lambda_1},\ B\in\Sigma_{\Lambda_2}} |\mu(A\cap B) - \mu(A)\mu(B)|. \tag{6.3} \]
Then $\tau(\Lambda_1,\Lambda_2) \to 0$ when $d(\Lambda_1,\Lambda_2) \to \infty$. Intuitively, the strong mixing idea is that one cannot compensate for the weakening of the dependence of the variables due to an increase of their space distance by increasing the size of the sets. This situation is typical when one has exponential decay of correlations. This has been proved for a wide class of random fields including ferromagnetic non-critical spin systems [27].

The situation is entirely different at the critical point where one expects the correlations to decay as an inverse power of the distance. In this connection the following result has been proved in [28]: a ferromagnetic translational invariant system with pair interactions with correlation function
\[ C(i) = E(\xi_0 \xi_i) - E(\xi_0)E(\xi_i) \tag{6.4} \]
such that
\[ \lim_{L\to\infty} \frac{\sum_{L(s_k-1)\le i_k < L(s_k+1)} C(i)}{\sum_{0\le i_k < L} C(i)} \neq 0 \tag{6.5} \]
for arbitrary $s_k$, does not satisfy the strong mixing condition.
This theorem implies in particular that a critical two-dimensional Ising model violates strong mixing. Therefore, violation of strong mixing seems to provide a reasonable characterization of the type of strong dependence encountered in critical phenomena. On the other hand, under very general conditions, if strong mixing holds the one-block distribution satisfies the CLT [29].

An interesting question is whether we can describe the structure of the limit one-block distributions that can appear at the critical point besides the Gaussian. It was shown in [28], building on previous results by Newman, that for ferromagnetic systems the Fourier transform (characteristic function in probabilistic language) of the limit distribution must be of the form
\[ E(e^{it\xi}) = e^{-bt^2} \prod_j (1 - t^2/\alpha_j^2) \tag{6.6} \]
with $\sum_j 1/\alpha_j^2 < \infty$. In the probabilistic literature these distributions are called the D-class [36]. The Gaussian is the only infinitely divisible distribution belonging to this class.

7. Multiplicative structure

In this section we show that there is a natural multiplicative structure associated with transformations on probability distributions like those induced by the RG. This multiplicative structure is related to the properties of conditional expectations. We use the notations of Section 5. Suppose we wish to evaluate the conditional expectation
\[ E(h|\{\xi^n_j\}), \tag{7.1} \]
where the collection of block variables $\xi^n_j$ indexed by $j$ is given a fixed value. Here $h$ is a function of the individual spins $\xi_i$. It is an elementary property of conditional expectations that
\[ E(E(h|\{\xi^n_j\})|\{\xi^{nm}_j\}) = E(h|\{\xi^{nm}_j\}). \tag{7.2} \]
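Property (7.2) can be checked exactly on a toy system. The sketch below is only an illustration under stated assumptions: four $\pm 1$ spins on a line with hypothetical ferromagnetic weights (a factor 2 for each aligned neighbouring pair), pair sums playing the role of the block variables $\xi^n_j$ and the total sum playing the role of $\xi^{nm}_j$; normalization factors are omitted since they do not affect conditioning.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

def weight(s):
    """Unnormalized ferromagnetic weight: factor 2 per aligned neighbour pair."""
    w = Fraction(1)
    for a, b in zip(s, s[1:]):
        w *= 2 if a == b else 1
    return w

def conditional_expectation(states, f, key):
    """Map each value of key(state) to E(f | key) under the weights above."""
    num = defaultdict(Fraction)
    den = defaultdict(Fraction)
    for s in states:
        num[key(s)] += weight(s) * f(s)
        den[key(s)] += weight(s)
    return {k: num[k] / den[k] for k in den}

states = list(product((-1, 1), repeat=4))
pair_blocks = lambda s: (s[0] + s[1], s[2] + s[3])  # finer block variables
total_block = lambda s: sum(s)                      # coarser block variable
h = lambda s: s[0] * s[2] + s[1]                    # function of individual spins

inner = conditional_expectation(states, h, pair_blocks)
lhs = conditional_expectation(states, lambda s: inner[pair_blocks(s)], total_block)
rhs = conditional_expectation(states, h, total_block)
print(lhs == rhs)
```

The identity holds because the coarser block variable is a function of the finer ones, which is exactly the situation produced by iterating the block transformation.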
Let $P$ be the probability distribution of the $\xi_i$ and $R^*_{\alpha,n}P$ the distribution obtained by applying the RG transformation, that is the distribution of the block variables $\xi^n_j$. By specifying in (7.2) the distribution with respect to which expectations are taken we can rewrite it as
\[ E_{R^*_{\alpha,n}P}(E_P(h|\{\xi^n_j\})|\{\xi^{nm}_j\}) = E_P(h|\{\xi^{nm}_j\}). \tag{7.3} \]
This is the basic equation of this section and we want to work out its consequences. For this purpose we generalize the eigenvalue equation (5.11) to the case in which the probability distribution is not a fixed point of the RG. In analogy with the theory of dynamical systems we interpret the conditional expectation as a linear transformation from the linear space tangent to $P$ to the linear space tangent to $R^*_{\alpha,n}P$ and we assume that in each of these spaces there is a basis of vectors $H^P_k$, $H^{R^*_{\alpha,n}P}_k$ connected by the following generalized eigenvalue equation [31]:
\[ E_P(H^P_k|\{\xi^n_j\}) = \lambda_k(n,P)\, H^{R^*_{\alpha,n}P}_k(\{\xi^n_j\}). \tag{7.4} \]
Eq. (7.3) implies that the $\lambda$'s must satisfy the relationship
\[ \lambda_k(m, R^*_{\alpha,n}P)\,\lambda_k(n,P) = \lambda_k(mn,P). \tag{7.5} \]
From (7.4) and (7.5) we find that the $\lambda_k$ are given by the following expectations:
\[ \lambda_k(n,P) = E\big(\tilde H^{R^*_{\alpha,n}P}_k(\{\xi^n_j\})\, H^P_k(\{\xi_j\})\big), \tag{7.6} \]
where the $\tilde H^P_k$ are dual to the $H^P_k$ according to the orthogonality relation $\int \tilde H^P_k H^P_j \, dP = \delta_{kj}$. The $\lambda_k$ are therefore special correlation functions. The similarity between Eqs. (7.5) and (1.1) is then obvious. The Green's function RG corresponds to a very simple transformation on the probability distribution such that its form is unchanged and only the values of its parameters are modified.
8. RG and effective potentials

In this section we want to illustrate a connection between RG and the theory of large deviations [32]. By large deviations we mean fluctuations with respect to the law of large numbers, e.g., fluctuations of the magnetization in a large but finite volume. In view of the connection of RG with limit theorems our discussion will parallel, and actually generalize, some well known facts in the theory of sums of independent random variables. This will lead to a probabilistic interpretation of a widely used concept in physics, the effective potential, and will clarify its relationship with the effective Hamiltonian in RG theory.

We continue with our model system of continuous spins $\xi_i$ indexed by the sites of a lattice in $d$ dimensions and try to estimate the probability that the magnetization in a volume $\Lambda$ be larger than zero at some temperature above criticality. From the exponential Chebyshev inequality we have for $\theta > 0$, $x > 0$
\[ P\Big(\sum_{i\in\Lambda} \xi_i/|\Lambda| > x\Big) \le e^{-|\Lambda|\theta x}\, E\big(e^{\theta\sum_{i\in\Lambda}\xi_i}\big) \le e^{-|\Lambda|\Gamma(|\Lambda|,x)}, \tag{8.1} \]
where
\[ \Gamma(|\Lambda|,x) = \sup_{\theta>0} \Big[\theta x - \frac{1}{|\Lambda|}\log E\big(e^{\theta\sum_{i\in\Lambda}\xi_i}\big)\Big] \tag{8.2} \]
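is the Legendre transform of $(1/|\Lambda|)\log E(e^{\theta\sum_{i\in\Lambda}\xi_i})$. With some more work one can establish also a lower bound
\[ P\Big(\sum_{i\in\Lambda} \xi_i/|\Lambda| > x\Big) \ge e^{-|\Lambda|(\Gamma(|\Lambda|,x)+\eta(|\Lambda|)+\delta)} \tag{8.3} \]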
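with $\eta(|\Lambda|) \to 0$ for $\Lambda \to \infty$, and $\delta > 0$ arbitrarily small. We then conclude
\[ -\lim_{\Lambda\to\infty} \frac{1}{|\Lambda|}\log P\Big(\sum_{i\in\Lambda} \xi_i/|\Lambda| > x\Big) = \lim_{\Lambda\to\infty}\Gamma(|\Lambda|,x) = V_{eff}(x), \tag{8.4} \]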
where $V_{eff}(x)$ is known in the physical literature as the effective potential. An important remark. While $\Gamma(|\Lambda|,x)$, being the Legendre transform of a convex function, is always convex for any
$\Lambda$, this is not the case with $-(1/|\Lambda|)\log P(\sum_{i\in\Lambda}\xi_i/|\Lambda| > x) = V(|\Lambda|,x)$ for finite $\Lambda$ and one has to be careful in interpreting, for example, results from numerical simulations.

To understand the connection with the RG it is convenient to consider first the case of independent random variables, that is the situation considered in Section 2. A classical problem in limit theorems for independent variables is the estimate of the corrections to the CLT when the argument of the limit distribution increases with $n$. A well known result in this domain is the following [29]: suppose we want to estimate $P(\sum_1^n \xi_i/n^{1/2} > x)$ when $x = o(n^{1/2})$. Then
\[ P\Big(\sum_1^n \xi_i/n^{1/2} > x\Big) \approx e^{-n\sum_{k=2}^{s}\Gamma_k (x n^{-1/2})^k} \tag{8.5} \]
for $n \to \infty$ and $\lim_{n\to\infty} x^{s+1} n^{-(s-1)/2} = 0$. The function $\Gamma(z) = \sum_{k=2}^{\infty}\Gamma_k z^k$ is the Legendre transform of $\log E(e^{\theta\xi_i})$. The sign $\approx$ has to be understood as logarithmic dominance. If $x = O(n^{1/2})$, the whole function $\Gamma$ contributes and we are back to the large deviation estimate at the beginning of this section.

We expect a result like (8.5) to hold for the one-block distribution in the case of dependent variables as in statistical mechanics away from the critical point. We then see that the coefficients of an expansion of $\Gamma(|\Lambda|,x)$ in powers of $x$ determine the corrections to the CLT for the one-block distribution. More interesting is the situation at the critical point. Suppose first that the one-block limit distribution is Gaussian but the normalization is anomalous, as is the case in hierarchical models for a range of values of the parameter $c$. Instead of (8.5) we expect an estimate of the form
\[ P\Big(\sum_{i\in\Lambda}\xi_i/|\Lambda|^{\rho} > x\Big) \approx e^{-|\Lambda|\sum_{k=2}^{s}\Gamma_k(|\Lambda|)(x/|\Lambda|^{1-\rho})^k} \tag{8.6} \]
with $\rho > 1/2$ and $\lim_{|\Lambda|\to\infty} x^{s+1}/|\Lambda|^{(1-\rho)s-\rho} = 0$. We see that for the quadratic term to survive the coefficient $\Gamma_2$ must vanish when $|\Lambda| \to \infty$ as $|\Lambda|^{1-2\rho}$. If the one-block limit distribution is not Gaussian we can establish a general relationship between its logarithm and the effective potential. Let us write $-\log P = V_{RG}$ where $P$ is the limit distribution. We now rewrite the large deviation estimate in the following way:
\[ P\Big(\sum_{i\in\Lambda}\xi_i/|\Lambda|^{\rho} > x|\Lambda|^{1-\rho}\Big) \approx e^{-|\Lambda|\Gamma(|\Lambda|,x)}. \tag{8.7} \]
Scale $x \to x/|\Lambda|^{1-\rho}$. Then we obtain
\[ V_{RG}(x) = \lim_{|\Lambda|\to\infty} |\Lambda|\,\Gamma(|\Lambda|, x/|\Lambda|^{1-\rho}). \tag{8.8} \]
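For orientation, the upper bound (8.1) and the Legendre transform (8.2) can be evaluated explicitly in the simplest independent case. The sketch below assumes $n$ independent fair $\pm 1$ spins (an assumption made only for this illustration), for which $(1/n)\log E(e^{\theta\sum\xi_i}) = \log\cosh\theta$ and the supremum has the closed form $\frac{1}{2}[(1+x)\log(1+x)+(1-x)\log(1-x)]$. All function names are hypothetical.

```python
import math

def gamma_numeric(x, theta_max=20.0, steps=20000):
    """Grid approximation of sup over theta > 0 of theta*x - log cosh(theta)."""
    best = 0.0
    for i in range(1, steps + 1):
        theta = theta_max * i / steps
        best = max(best, theta * x - math.log(math.cosh(theta)))
    return best

def gamma_exact(x):
    """Closed-form Legendre transform of log cosh for 0 <= x < 1."""
    return 0.5 * ((1 + x) * math.log(1 + x) + (1 - x) * math.log(1 - x))

def tail_probability(n, x):
    """P(sum / n >= x) for n independent fair +-1 spins (exact binomial sum)."""
    k_min = math.ceil(n * (1 + x) / 2)  # minimal number of up spins
    return sum(math.comb(n, k) for k in range(k_min, n + 1)) / 2 ** n

n = 200
for x in (0.25, 0.5):
    g = gamma_exact(x)
    p = tail_probability(n, x)
    print(x, gamma_numeric(x), g, -math.log(p) / n)
```

The last printed column is the finite-volume quantity $-(1/n)\log P$; it lies above the Legendre transform, as the Chebyshev bound requires, and approaches it as $n$ grows.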
Therefore $\Gamma(|\Lambda|,x)$ determines in different limits either $V_{eff}$ or $V_{RG}$. The discussion in the present section can be made rigorous in the case of hierarchical models.

9. Coexistence of phases in hierarchical models

In the case of hierarchical models the RG recursion relation for the one-block probability distribution can be easily rewritten as a recursion for the quantity $V(|\Lambda|,x)$ introduced in the
previous section which coincides with the effective potential in the limit $|\Lambda| \to \infty$. In fact if we normalize the block-spin with its volume, that is consider the mean magnetization, a simple calculation gives the following iteration for the corresponding probability distribution $\pi_n(x)$:
\[ \pi_n(x) = L_n\, e^{\beta c^n x^2} \int dx'\, \pi_{n-1}(2x - x')\,\pi_{n-1}(x'). \tag{9.1} \]
Taking the logarithm and dividing by the number of spins $2^n$, we obtain
\[ V_n(x) = -\frac{1}{2^n}\log L_n - \beta\Big(\frac{c}{2}\Big)^n x^2 - \frac{1}{2^n}\log\int dx'\, e^{-2^{n-1}(V_{n-1}(2x-x') + V_{n-1}(x'))}. \tag{9.2} \]
To illustrate the difference between (9.2) and (3.8) let us consider again the simple case in which the model is defined by a Gaussian single spin distribution. The iteration of (9.2) gives for small $\beta$
\[ V_n(x) = \frac{1}{2}\Big[1 - 2\beta\sum_{k=1}^{n}(c/2)^k\Big]x^2 + \omega_n, \tag{9.3} \]
where $\omega_n$ tends to zero when $n \to \infty$. In this limit then
\[ V_{eff} = \tfrac{1}{2}[1 - 2\beta c/(2-c)]\,x^2. \tag{9.4} \]
The critical temperature is defined as the value $\beta_{cr}$ for which the coefficient of $x^2$ vanishes and coincides with that found in Section 3. On the other hand, in that section, it was the only temperature for which recursion (3.8) converges to the Gaussian fixed point, i.e. the only temperature for which the following difference between two diverging expressions converges:
\[ \Big(\frac{2}{c}\Big)^{n+1} - 2\beta\sum_{k=0}^{n}\Big(\frac{2}{c}\Big)^k. \tag{9.5} \]
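The Gaussian case can be followed numerically. The sketch below rests on an assumption that should be checked against (9.2): substituting a quadratic form $V_{n-1}(x) = \frac{1}{2}A_{n-1}x^2$ (starting from a unit-variance Gaussian single-spin law, $A_0 = 1$) into (9.2), the integral over $x'$ is dominated by $x' = x$ and the coefficient obeys $A_n = A_{n-1} - 2\beta(c/2)^n$. The function names and the sample values of $\beta$ and $c$ are hypothetical.

```python
def quadratic_coefficient(beta, c, n):
    """A_n in V_n(x) = A_n x^2 / 2, from A_n = A_{n-1} - 2 beta (c/2)^n, A_0 = 1."""
    a = 1.0
    for k in range(1, n + 1):
        a -= 2.0 * beta * (c / 2.0) ** k
    return a

def limit_coefficient(beta, c):
    """Coefficient of x^2 / 2 in the n -> infinity limit, as in (9.4)."""
    return 1.0 - 2.0 * beta * c / (2.0 - c)

def beta_critical(c):
    """Value of beta at which the coefficient in (9.4) vanishes."""
    return (2.0 - c) / (2.0 * c)

def rescaled_difference(beta, c, n):
    """(2/c)^(n+1) - 2 beta sum_{k=0}^{n} (2/c)^k, the expression in (9.5)."""
    r = 2.0 / c
    return r ** (n + 1) - 2.0 * beta * sum(r ** k for k in range(n + 1))

c = 1.5
b_cr = beta_critical(c)
print(quadratic_coefficient(0.1, c, 200), limit_coefficient(0.1, c))
print(rescaled_difference(b_cr, c, 40), rescaled_difference(0.9 * b_cr, c, 40))
```

Under these assumptions the difference in (9.5) stays bounded only at $\beta = \beta_{cr}$, the same value at which the coefficient of $x^2$ in (9.4) vanishes.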
We want to apply now (9.2) to the study of the magnetization in the phase coexistence region for a general hierarchical model [33,34]. The problem we want to discuss is the following. In the hierarchical case at level $n$ we have blocks, each containing $2^{n-1}$ spins, interacting in pairs through the Hamiltonian
\[ c^n(\xi^1_{n-1} + \xi^2_{n-1})^2/4 = c^n(\xi_n)^2, \tag{9.6} \]
where $\xi^1_{n-1}$, $\xi^2_{n-1}$, $\xi_n$ are mean magnetizations. Suppose now that $\xi_n$ is assigned the value $\alpha M$, where $0 < \alpha < 1$, $M$ being the spontaneous magnetization corresponding to the temperature $\beta$. We want to calculate the conditional distribution of $\xi^1_{n-1}$ given $\xi_n$, for large $n$. The remarkable result is that one of the quantities $\xi^1_{n-1}$ or $\xi^2_{n-1}$ with probability close to 1 is equal to the full magnetization $M$. To compute the desired distribution we have to estimate $\pi_n(x)$ or, what is the same, $V_n(x)$ for large $n$. From (9.2) we expect asymptotically
\[ V_n(x) = V_{eff}(x) + (c/2)^n\, Y(x) + \cdots. \tag{9.7} \]
Since in the phase coexistence region $V_{eff}(x)$ is flat, i.e. constant, the whole $x$ dependence is given by $Y(x)$. In order to compute this function we perform a subtraction and consider
$V_n(x) - V_n(x_0)$, choosing $x_0$ in the flat region of $V_{eff}(x)$. From (9.2) it is easily seen that the quantity
\[ \Delta_n(x) = (2/c)^n\,(V_n(x) - V_n(x_0)) \tag{9.8} \]
−n
2
log An − x − c
−n
log
d x e−c
n−1
(:n−1 (x+x )+:n−1 (x−x ))
;
(9.9)
where $A_n$ is determined by the condition $\Delta_n(x_0) = 0$. Let us choose $x_0 = M$. By symmetry $\Delta_n(\pm M) = 0$. For $0 \le x \le M$ and large $n$ the main contribution to the integral on the right hand side of (9.9) comes from the region $x \pm x' \approx M$, while for $-M \le x \le 0$ from the region $x \pm x' \approx -M$. We can write therefore the approximate recursion equations
\[ \Delta_n(x) = \beta(M^2 - x^2) + c^{-1}\Delta_{n-1}(2x \mp M), \tag{9.10} \]
where the $\mp$ in the second term on the right corresponds to $0 \le x \le M$ or $-M \le x \le 0$. This type of equation has been rigorously studied by Bleher [33] and the asymptotic solutions show a complicated fractal structure. The conditional probability of interest to us is
\[ P(|\xi^1_{n-1} - M| < \epsilon M \,|\, \xi_n = \alpha M) = \frac{\int_{|x'-M|<\epsilon M} dx'\, e^{-c^{n-1}(\Delta_{n-1}(x') + \Delta_{n-1}(2\alpha M - x'))}}{\int_{-\infty}^{\infty} dx'\, e^{-c^{n-1}(\Delta_{n-1}(x') + \Delta_{n-1}(2\alpha M - x'))}}. \tag{9.11} \]
Since the main contribution to the integral in the denominator comes from the same region appearing in the numerator, our conditional probability is, for sufficiently large $n$, as close as we want to 1.

10. Weak perturbations of Gaussian measures: a non-Gaussian fixed point

Starting at the end of the seventies the RG has become a very important and effective tool for proving rigorous results in statistical mechanics and Euclidean quantum field theory. An impressive amount of work has been done and it is not possible to give even a schematic account of it [35]. Many different versions of the RG idea have been used, each author or group of authors following his own linguistic propensities. Probability theory is always in the background and we want to try to recover some conceptual feature common to all of them.

As in the previous part of this review limit theorems will be a relevant reference. However the limit theorems to be considered are of a different kind: they are those which in probability are called non-classical and are related to the following problem. Given a probability distribution $P$ and an integer $n$, can one consider it as resulting from the composition (convolution) of $n$ distributions $P_k$, $k = 1, 2, \ldots, n$? In other words, can one consider the random variable described by $P$ as the sum of $n$ independent random variables? In formulas
\[ P = P_1 \ast P_2 \ast \cdots \ast P_n, \tag{10.1} \]
where $\ast$ means convolution. It is clear, for example, that a Gaussian distribution of variance $\sigma^2$ can be thought of as the composition of any number $n$ of Gaussians with variances $\sigma_i^2$ provided
\[ \sum_{i=1}^{n} \sigma_i^2 = \sigma^2. \]
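The additivity of variances under convolution, which underlies the Gaussian example, holds for any independent summands and can be checked exactly. The sketch below uses small discrete laws with rational weights rather than Gaussians, purely so that the arithmetic is exact; the helper names and the sample laws are hypothetical.

```python
from fractions import Fraction

def convolve(p, q):
    """Law of X + Y for independent X ~ p and Y ~ q, given as {value: probability}."""
    out = {}
    for x, px in p.items():
        for y, qy in q.items():
            out[x + y] = out.get(x + y, 0) + px * qy
    return out

def variance(p):
    """Variance of a discrete law given as {value: probability}."""
    mean = sum(x * w for x, w in p.items())
    return sum((x - mean) ** 2 * w for x, w in p.items())

coin = {-1: Fraction(1, 2), 1: Fraction(1, 2)}
skew = {0: Fraction(2, 3), 3: Fraction(1, 3)}
flat = {0: Fraction(1, 3), 1: Fraction(1, 3), 2: Fraction(1, 3)}

# Composition P = P1 * P2 * P3 in the sense of (10.1).
total = convolve(convolve(coin, skew), flat)
print(variance(total), variance(coin) + variance(skew) + variance(flat))
```

What is special about the Gaussian case is not the additivity itself but the fact that the composed law stays in the same family for every $n$.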
The problem arises naturally of investigating under what conditions convolutions like the right hand side of (10.1) converge to a regular distribution as $n \to \infty$. The right hand side of (10.1) can be considered as a recurrence relation
\[ \hat P_{n+1} = \hat P_n \ast P_n, \tag{10.2} \]
where $\hat P_n = P_1 \ast P_2 \ast \cdots \ast P_{n-1}$. Comparing (2.3) with (10.1) we see that while in the case of limit theorems for independent identically distributed random variables we have a natural fixed point problem, this is not in general the case for non-identically distributed variables. As we shall see below, the RG approach to Euclidean field theory and the statistical mechanics of the critical point has led to formulations which have analogies with these problems. In fact, infinite dimensional equations structurally similar to (10.2) are constructed which can be transformed into equations admitting fixed points after a rescaling.

In the following exposition, we shall follow the recent article by Brydges et al. [37]. The goal of these authors is the construction of a quantum field theory in $R^4$ with non-trivial scaling behaviour at long distances, that is in the infrared region, determined by a non-Gaussian fixed point of an appropriate RG transformation. The starting point is a $\phi^4$ theory in finite volume regularized at small distances to eliminate ultraviolet singularities. This model is believed to have a non-Gaussian fixed point in $4 - \epsilon$ dimensions and to simulate such a situation in 4 dimensions the authors introduce a special covariance for the Gaussian part of the measure.

The first step consists in the construction of a covariance $v(x - y)$ which behaves at large distances like $(-\Delta)^{-1-\epsilon/2}$. This means that it scales like $|x|^{-2+\epsilon}$ for large $|x|$. Their choice is
\[ v(x - y) = \int_1^{\infty} d\alpha\, \alpha^{\epsilon/2-2}\, e^{-|x-y|^2/4\alpha}. \tag{10.3} \]
Such a covariance can be decomposed in the following way:
\[ v(x - y) = \sum_{j=0}^{\infty} L^{-(2-\epsilon)j}\, C(L^{-j}(x - y)), \tag{10.4} \]
where $L > 1$ is a scaling factor and
\[ C(x) = \int_1^{L^2} d\alpha\, \alpha^{\epsilon/2-2}\, e^{-|x|^2/4\alpha}. \tag{10.5} \]
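The decomposition (10.4) follows from splitting the integration range of (10.3) into the shells $[L^{2j}, L^{2j+2}]$ and rescaling the integration variable in each shell. A minimal numerical sketch of this statement follows, assuming the integration variable in (10.3) and (10.5) is a positive scale parameter (called `alpha` here) and using a plain Simpson rule; all function names and the sample values of $r = |x-y|$, $\epsilon$ and $L$ are hypothetical.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def heat_integral(r, eps, lo, hi):
    """Integral over alpha in [lo, hi] of alpha^(eps/2 - 2) exp(-r^2 / (4 alpha)),
    computed with the substitution alpha = e^t."""
    g = lambda t: math.exp(t * (eps / 2 - 1) - r * r * math.exp(-t) / 4)
    return simpson(g, math.log(lo), math.log(hi))

def slice_covariance(r, eps, L):
    """C of (10.5) evaluated at |x| = r."""
    return heat_integral(r, eps, 1.0, L * L)

def partial_sum(r, eps, L, J):
    """First J + 1 terms of the series (10.4)."""
    return sum(L ** (-(2 - eps) * j) * slice_covariance(r / L ** j, eps, L)
               for j in range(J + 1))

r, eps, L, J = 3.0, 0.1, 2.0, 5
lhs = partial_sum(r, eps, L, J)
rhs = heat_integral(r, eps, 1.0, L ** (2 * (J + 1)))  # (10.3) truncated at L^(2J+2)
print(lhs, rhs)
```

The partial sum over $j \le J$ reproduces the integral in (10.3) truncated at $L^{2(J+1)}$; the omitted tail is itself a rescaled copy of $v$.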
Each term in the expansion can be interpreted as the covariance of a rescaled field
(10.6)
which has reduced fluctuations and varies over larger distances. The aim is to study the measure
\[ d\mu_\Lambda = Z^{-1} e^{-V_\Lambda(\phi)}\, d\mu_v, \tag{10.7} \]
where
\[ V_\Lambda(\phi) = \lambda\int_\Lambda :\phi^4:_v + \zeta\int_\Lambda :(\partial\phi)^2:_v + \mu\int_\Lambda :\phi^2:_v \tag{10.8} \]
when $\Lambda$ tends to $R^4$. The double dots indicate the Wick polynomials with respect to the covariance $v$.
Take for $\Lambda$ a large cube of side $L^N$ so that the measure is well defined. We want to calculate
\[ (\mu_v \ast e^{-V})(\phi) = (\mu_{\hat C_N} \ast \mu_{\hat C_{N-1}} \ast \cdots \ast \mu_{\hat C_0} \ast e^{-V})(\phi), \tag{10.9} \]
having used the above decomposition of the covariance with
\[ \hat C_j(x - y) = L^{-(2-\epsilon)j}\, C(L^{-j}(x - y)). \tag{10.10} \]
Actually, in finite volume we should specify some boundary conditions but we shall ignore this aspect. Next by defining
\[ \hat Z_j(\phi) = (\mu_{\hat C_{j-1}} \ast \cdots \ast \mu_{\hat C_0} \ast e^{-V})(\phi), \tag{10.11} \]
we find the recursion relation
\[ \hat Z_{j+1}(\phi) = (\mu_{\hat C_j} \ast \hat Z_j)(\phi). \tag{10.12} \]
In this way, the calculation is performed by successive integrations over variables which exhibit decreasing fluctuations. This is not yet our RG equation because as $j \to \infty$, $\hat C_j$ becomes a singular distribution and we do not obtain a fixed point equation. However by introducing the rescaled fields
(10.14)
with initial condition $Z_0 = e^{-V}$. We emphasize that the last step is possible due to the special structure of the measures $\mu_{\hat C_j}$. It is now meaningful to look for the fixed points of (10.14). Brydges, Dimock and Hurd have proved that in $d = 4$ for $\epsilon$ small there exists a non-Gaussian fixed point of (10.14) characterized by a value $\hat\lambda(\epsilon, L)$ of the coupling and that for certain values $\zeta(\lambda)$, $\mu(\lambda)$ the iteration of (10.14) with initial condition $e^{-V}$ converges to this fixed point. Technically, the proof is very complicated and its description is beyond the aims of this review. A very good exposition with some simplifications of the techniques employed can be found also in [38].

11. Concluding remarks

The question we want to consider is the following: what are the benefits for our understanding of critical phenomena, and more generally of statistical physics, deriving from the use of probabilistic language? Feynman thought that it is worth spending one's time formulating a theory in every physical and mathematical way possible. In our case there is an intuition associated with probabilistic reasoning that is foreign to the usual formalisms of statistical mechanics based on correlation functions and equations connecting them. Apart from this general remark we must consider that the rigorous results obtained so far in RG theory have been strongly influenced by the probabilistic language, as this appears the most natural for the mathematical study of statistical mechanics and Euclidean field theory when a functional integral approach is used. New technical ideas however are needed to deal with
concrete problems like calculating the critical indices of the three-dimensional Ising model or establishing in a conclusive way whether the field theory $\phi^4_4$ is ultraviolet non-trivial.

The formal apparatus of RG has been easily extended to the analysis of fermionic systems when these are described by a Grassmannian functional integral [39], that is by the analog of a Gibbs distribution over anticommuting variables. In this case the convergence of perturbation theory plays a major role on the way to rigorous results. Recently, it has been possible to give a true probabilistic expression to general Grassmannian integrals in terms of discrete jump processes (Poisson processes) [40,41], so that classical probability may become a main tool also in the study of fermionic systems, especially in view of developing non-perturbative methods. For an early example of connection between anticommutative calculus and probability see [42].

In a wider perspective one may remark that the theory of Gibbs distributions is becoming instrumental also in various sectors of mathematical statistics, for example in image reconstruction, and critical situations appear also in this domain. Transfer of ideas from statistical mechanics to stochastic analysis is currently an ongoing process which shows the relevance of a language capable of unifying different areas of research. Probability theory for a long time has not been included among the basic mathematical tools of a physics curriculum but the situation is slowly changing and hopefully this will help cross fertilization among different disciplines.

Acknowledgements

This paper is an expanded version of a talk given at the meeting RG 2000, Taxco, Mexico, January 1999. It is a pleasure to thank C. Stephens and D. O'Connor for their kind invitation.

References

[1] A.I. Khinchin, Mathematical Foundations of Statistical Mechanics, Dover Publications Inc., New York, 1949.
[2] C. Di Castro, G. Jona-Lasinio, Phys. Letts. 29A (1969) 322.
[3] K.G. Wilson, M.E. Fisher, Phys. Rev. Letts. 28 (1972) 240.
[4] C. Di Castro, Lett. Nuovo Cimento 5 (1972) 69.
[5] P.K. Mitter, Phys. Rev. D7 (1973) 2927.
[6] E. Brezin, J.C. Le Guillou, J. Zinn Justin, Phys. Rev. D8 (1973) 434.
[7] K.G. Wilson, Phys. Rev. B4 (1971) 3174.
[8] K.G. Wilson, 140 (1965) B445.
[9] K.G. Wilson, Phys. Rev. D2 (1970) 1438.
[10] L.P. Kadanoff, Physics (N. Y.) 2 (1966) 263.
[11] G. Jona-Lasinio, Nobel Symp. 24 (1973) 38.
[12] G. Benettin, C. Di Castro, G. Jona-Lasinio, L. Peliti, A.L. Stella, in: M. Levy, P.K. Mitter (Eds.), New Developments in Quantum Field Theory and Statistical Mechanics, Plenum Press, New York, London, 1977.
[13] M.E. Fisher, Rev. Mod. Phys. 70 (1998) 653.
[14] A.Z. Patashinski, V.L. Pokrovski, Fluctuation Theory of Phase Transitions, 2nd Edition, Nauka, Moskva, 1982 (in Russian).
[15] P.M. Bleher, Ya.G. Sinai, Comm. Math. Phys. 33 (1973) 23.
[16] G. Jona-Lasinio, Nuovo Cimento B 26 (1975) 9.
[17] G. Gallavotti, G. Jona-Lasinio, Comm. Math. Phys. 41 (1975) 301.
[18] Ya.G. Sinai, Theor. Prob. Appl. 21 (1976) 273.
[19] G. Jona-Lasinio, in: M. Levy, P.K. Mitter (Eds.), New Developments in Quantum Field Theory and Statistical Mechanics, Plenum Press, New York, London, 1977.
[20] P.M. Bleher, Ya.G. Sinai, Comm. Math. Phys. 45 (1975) 347.
[21] P. Collet, J.P. Eckmann, A Renormalization Group Analysis of the Hierarchical Model in Statistical Mechanics, Lecture Notes in Physics, Vol. 74, Springer, Berlin, 1978.
[22] M. Cassandro, G. Gallavotti, Nuovo Cimento 25B (1975) 691.
[23] G. Gallavotti, H. Knops, Rivista del Nuovo Cimento 5 (1975) 341.
[24] R.L. Dobrushin, in: R.L. Dobrushin, Ya.G. Sinai (Eds.), Multicomponent Random Systems, Dekker, New York, 1978.
[25] R.L. Dobrushin, Ann. Prob. 7 (1979) 1.
[26] Ya.G. Sinai, Theory of Phase Transitions: Rigorous Results, Pergamon Press, London, 1982.
[27] G. Hegerfeldt, C. Nappi, Comm. Math. Phys. 53 (1977) 1.
[28] M. Cassandro, G. Jona-Lasinio, in: L. Streit (Ed.), Many Degrees of Freedom in Field Theory, Plenum Press, New York, 1978.
[29] I.A. Ibragimov, Yu.V. Linnik, Independent and Stationary Sequences of Random Variables, Wolters-Noordhoff, Groningen, 1971.
[30] M. Cassandro, G. Jona-Lasinio, Adv. Phys. 27 (1978) 913.
[31] V.I. Oseledec, Trans. Moscow Math. Soc. 19 (1968) 197.
[32] G. Jona-Lasinio, in: J. Fröhlich (Ed.), Scaling and Self-similarity in Physics, Birkhäuser, Basel, 1983.
[33] P.M. Bleher, Theor. Math. Phys. 61 (1985) 1107.
[34] G. Jona-Lasinio, Helv. Phys. Acta 59 (1986) 234.
[35] A basic reference is G. Gallavotti, Rev. Mod. Phys. 57 (1985) 471.
[36] Yu.V. Linnik, I.V. Ostrovski, Decomposition of Random Variables and Vectors, Translation of Mathematical Monographs, AMS, Providence, RI, 1977.
[37] D. Brydges, J. Dimock, T.R. Hurd, Comm. Math. Phys. 198 (1998) 111.
[38] P.K. Mitter, B. Scoppola, Comm. Math. Phys. 209 (2000) 207.
[39] G. Benfatto, G. Gallavotti, Renormalization Group, Physics Notes, Princeton University Press, Princeton, 1995.
[40] G.F. De Angelis, G. Jona-Lasinio, V. Sidoravicius, J. Phys. A 31 (1998) 289.
[41] M. Beccaria, C. Presilla, G.F. De Angelis, G. Jona-Lasinio, Europhys. Lett. 48 (1999) 243.
[42] D. Brydges, I. Munoz Maya, J. Theor. Prob. 4 (1991) 371.
Physics Reports 352 (2001) 459–520
Coarse-grained effective action and renormalization group theory in semiclassical gravity and cosmology

E.A. Calzetta a,∗, B.L. Hu b, Francisco D. Mazzitelli c

a Departamento de Física and IAFE, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Universitaria, Pabellón I, 1428 Buenos Aires, Argentina
b Department of Physics, University of Maryland, College Park, MD 20742, USA
c Departamento de Física, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Ciudad Universitaria, Pabellón I, 1428 Buenos Aires, Argentina

Received March 2001; editor: I. Procaccia
Contents

1. Aim and scope
2. Phase transitions in the early universe
   2.1. Effective action for dynamic order parameter fields
   2.2. Inflation as scaling: static critical phenomena
   2.3. Quasi-static field, 'slow-roll' as dynamical critical phenomena
3. Coarse-graining, scaling and inflation
   3.1. Inflation
   3.2. Scaling
   3.3. Coarse-graining and stochastic inflation
4. Coarse-grained effective action
   4.1. Coarse graining and backreaction
   4.2. 'In–out' coarse grained effective action
5. Backreaction in the inflationary universe: renormalization group equations and the running of coupling constants
   5.1. Cosmological consequences
   5.2. Theoretical implications
6. The closed-time-path coarse-grained effective action
   6.1. Perturbative evaluation of the CTP CGEA
   6.2. Applications: dynamics of system fields/modes with backreaction of environment fields/modes
7. Master and Langevin equations in quantum field theory
   7.1. Master equation and decoherence of long wavelengths
   7.2. The Langevin equation
8. Renormalization group from CTP CGEA
   8.1. Towards a nonperturbative evaluation of the CTP CGEA: the exact RG equation
   8.2. Derivative expansion
∗ Corresponding author. E-mail addresses: [email protected] (E.A. Calzetta), [email protected] (B.L. Hu), [email protected] (F.D. Mazzitelli).
0370-1573/01/$ - see front matter © 2001 Elsevier Science B.V. All rights reserved. PII: S0370-1573(01)00043-6
E.A. Calzetta et al. / Physics Reports 352 (2001) 459–520
9. Renormalization group and stochastic semiclassical gravity
10. Renormalization group and quantum corrections to the Newtonian potential
11. Renormalization group theory for nonequilibrium systems
   11.1. Summary remarks
   11.2. Towards a nonequilibrium renormalization group theory
Acknowledgements
References
Abstract

In this report we introduce the basic techniques (of the closed-time-path (CTP) coarse-grained effective action (CGEA)) and ideas (scaling, coarse-graining and backreaction) behind the treatment of quantum processes in dynamical background spacetimes and fields. We show how they are useful for the construction of renormalization group (RG) theories for studying these nonequilibrium processes and discuss the underlying issues. Examples are drawn from quantum field processes in an inflationary universe, semiclassical cosmology and stochastic gravity.

In Part I (Sections 2, 3) we begin by establishing a relation between scaling and inflation, and show how eternal inflation (where the scale factor of the universe grows exponentially) can be treated as static critical phenomena, while a 'slow-roll' or power-law inflation can be treated as dynamical critical phenomena.

In Part II (Sections 4, 5) we introduce the key concepts in open systems and discuss the relation of coarse-graining and backreaction. We recount how the (in–out, or Schwinger–DeWitt) CGEA devised by Hu and Zhang can be used to treat some aspects of the effects of the environment on the system. This is illustrated by the stochastic inflation model where quantum fluctuations appearing as noise backreact on the inflaton field. We show how RG techniques can be usefully applied to obtain the running of coupling constants in the inflaton field, followed by a discussion of the cosmological and theoretical implications.

In Part III (Sections 6–8) we present the CTP (in–in, or Schwinger–Keldysh) CGEA introduced by Hu and Sinha. We show how to calculate perturbatively the CTP CGEA for the $\phi^4$ model. We mention how it is useful for calculating the backreaction of environmental fields on the system field (e.g. light on heavy, fast on slow) or one sector of a field on another (e.g. high momentum modes on low, inhomogeneous modes on homogeneous), and problems in other areas of physics where this method can be usefully applied. This is followed by an introduction to the influence functional in the (Feynman–Vernon) formulation of quantum open systems, illustrated by the quantum Brownian motion models. We show its relation to the CTP CGEA, and indicate how to identify the noise and dissipation kernels therein. We derive the master and Langevin equations for interacting quantum fields, represented in the works of Lombardo and Mazzitelli, and indicate how they can be applied to the problem of coarse-graining, decoherence and structure formation in de Sitter universe. We perform a nonperturbative evaluation of the CTP CGEA and show how to derive the renormalization group equations under an adiabatic approximation adopted for the modes by Dalvit and Mazzitelli. We assert that this approximation is incomplete as the effect of noise is suppressed. We then discuss why noise is expected in the RG equations for nonequilibrium processes.

In Part IV (Sections 9, 10), following Lombardo and Mazzitelli, we use the RG equations to derive the Einstein–Langevin equation in stochastic semiclassical gravity. As an example, we calculate the quantum correction to the Newtonian potential. We end with a discussion on why a stochastic component of RG equations is expected for nonequilibrium processes. © 2001 Elsevier Science B.V. All rights reserved.

PACS: 05.10.Cc
E.A. Calzetta et al. / Physics Reports 352 (2001) 459–520
1. Aim and scope

We discuss how the concepts of open systems and the techniques of the coarse-grained effective action and influence functionals can be applied to nonequilibrium quantum processes in the early universe and in semiclassical gravity, leading to renormalization group (RG) theories for the description of the interaction dynamics of these theories. Our wish is that by examining a sample class of problems of fully dynamic nature, as different from equilibrium (finite temperature) or near-equilibrium (linear response theory), we can lay out the issues and approaches useful for the construction of a RG theory for nonequilibrium (NEq) processes involving quantum fields.

Description of phase transitions involves a scale that measures the behavior of the order parameter field in the critical region. The energetics of the system is characterized by its quantum dynamical and statistical mechanical properties. At the heart of NEq statistical mechanics is the interplay of the dynamical scales of the system (from the time-dependent order-parameter field) and some background (e.g., all physical processes in an expanding universe are measured against the time-dependent metric function such as the scale factor $a(t)$). In an equilibrium treatment, this is usually captured by a finite temperature effective potential. But since the order parameter field is generally time dependent, and there may not be a thermal equilibrium environment present in these dynamical processes, one should really be working with an effective action or a free energy density functional.

Unlike scattering problems commonly found in particle physics, where one can determine the transition amplitude between the in and out states based on the in–out or Schwinger–DeWitt effective actions, in evolutionary problems frequently encountered in statistical mechanics the development of the expectation value (of an operator associated with some physical variable, such as the energy momentum tensor) needs to be obtained from the in–in or Schwinger–Keldysh effective actions. When open system concepts like coarse-graining and backreaction of the system and environment are applied, the closed-time-path (CTP) coarse-grained effective action (CGEA) or the influence action are more appropriate. (For the development and application of these ideas applied to problems in gravitation and cosmology, see, e.g., [1–8].) This paper is in the nature of a report rather than a review, in that we will present or develop only those works which are useful for the construction of the conceptual and technical frameworks to treat this broad class of problems, taking specific examples from semiclassical gravity and inflationary cosmology as illustrations.

RG concepts and techniques have been used in other areas of gravitation and cosmology. We mention some representative works: (1) RG in gravitational collapse: The use of universality and scaling ideas in classical gravitational collapse first discussed by Choptuik [9] has grown since then into an interesting area of classical gravitational research. Fractal structure and scaling laws in a self-gravitating gas have been investigated by de Vega et al. [10]. (2) RG in quantum field theory in curved spacetime: notable work since the 80’s by Calzetta, Hu, O’Connor, Jack, Parker, Toms and others as well as the Tomsk group can be found in the book of Buchbinder et al. [11]. For more recent works see [12,13]. (3) Scaling in quantum gravity has been pursued by Ambjorn in a simplicial gravity approach [14] and by Antoniadis et al. [15], to mention just a few notable avenues of inquiry.
The organization of this paper is as follows: In Section 2 we first give a descriptive summary of the nature of the problems encountered in phase transitions in the early universe, focussing on the inflationary cosmology. We distinguish the case of eternal inflation from that of slow roll and indicate why they can be viewed as static and dynamic critical phenomena respectively. In Section 3 we begin a discussion of the relation of scaling, coarse-graining and backreaction with the example of stochastic inflation, thus bringing out the basic concepts of open systems. In Section 4 we introduce the (‘in–out’) CGEA for a $\lambda\phi^4$ field to incorporate the backreaction effect of the short wavelength modes, viewed as the environment, on the long wavelength modes of the system. In Section 5 we perform a rescaling of the modes and the field in the spirit of RG transformations and derive the corresponding RG equations for the coupling constants of the system field. We briefly discuss the theoretical and cosmological implications. In Section 6, we present the ‘in–in’ or CTP CGEA and show how it is useful for the derivation of real and causal dynamical equations for expectation values of operators of the system fields with backreaction from the environment field. We carry out a perturbative evaluation of the CTP CGEA and show how it could be useful for the consideration of backreaction of environment field (modes) on the system field (modes). In Section 7 we introduce the influence functional formulation via the quantum Brownian motion model and show its relation with the CTP CGEA. In this process we obtain the master and semiclassical Langevin equations for interacting quantum fields. We show how the noise kernels can be identified in this equation and how their behavior can be used as a measure of decoherence. In Section 8 we carry out a nonperturbative evaluation of the CTP CGEA, and derive the RG equations under different approximations. We indicate how our real-time CTP CGEA approach differs from the euclidean averaged effective action approach. In Section 9 we use RG theory to derive the Einstein–Langevin equation in stochastic semiclassical gravity. In Section 10 we show how the RG equations change the Newtonian potential. In Section 11, following a short summary, we discuss the salient features of RG theory for nonequilibrium processes and argue why there is a stochastic component in the RG equations for such systems.

Those readers who only wish to learn the methodology which underlies nonequilibrium quantum field processes, but have no special interest in the specifics of such processes in gravitation and cosmology, can read just Sections 4, 6, 8 and 11, from which they will be able to apply the methods to their own areas of research (e.g., nuclear/particle, atomic/optical physics).

Part I. Scaling and inflation

2. Phase transitions in the early universe

We use phase transitions in the early universe (from the Planck time to the GUT time, or even at the later electroweak era) as working examples to illustrate the basic issues and methods involved. This is because phase transitions in the early universe are mediated by and involve many NEq processes of fundamental interest (e.g., nucleation, spinodal decomposition, particle creation, decoherence) that lead to many important physical consequences such as entropy generation and structure formation. We will only dwell on those aspects which illustrate the applications of the RG theory and suggest its extension to dynamical processes. The development
of a RG theory for NEq processes here, from the construction of the CGEA to the derivation of the RG equations, is based on the concepts (scaling, coarse-graining, backreaction) and techniques (CTP effective action and influence functional) of nonequilibrium mechanics and quantum field theory. These ideas and techniques are generally applicable also to problems outside of gravitation and cosmology, such as the quark–gluon plasma and atoms interacting with a Bose–Einstein condensate. On the other hand, this parallel presentation of RG theory in the context of NEq processes may be helpful to cosmologists who wish to find a more solid theoretical anchor for the discussion of phase transitions in the early universe.

2.1. Effective action for dynamic order-parameter fields

As noted earlier, knowledge of the exact form of the CTP CGEA holds the key to a complete description of a phase transition. One can deduce not only the qualitative features (first or second order) but also the quantitative details (mechanisms and processes). Therefore the construction of the effective action for the order parameter field in different cosmological spacetimes is usually the necessary first step towards a description of phase transitions in the very early universe. For the purpose of illustration it is useful to establish the connection with ideas of RG theory at the outset. The two themes established in [4,5], which we follow here, are: (1) Eternal inflation can be described equivalently as an exponential scale transformation, thus rendering this special class of dynamics as effectively static. (2) The class of ‘slow-roll’ inflation can be treated as a dynamical perturbation off the effectively static class of exponential inflation and be understood as a dynamical critical phenomenon in cosmology (see footnote 1). Let us concentrate on situations where the order parameter field changes either with space or time.
(A familiar example in condensed matter physics is anisotropic superconductivity where one can use a gradient expansion in the Landau–Ginzburg–Wilson effective potential to account for the differences coming from the next-to-nearest neighbor interactions.) For cosmological problems, it is the time-dependence of the background field which one needs to deal with (see footnote 2). Thus for a realistic description of many inflationary transitions one needs to treat the case of a dynamical field and a nonflat or even quasi-static potential. The form of the potential and the metric of the background spacetime together determine the behavior of the scalar field in the Laplace–Beltrami equation, but the field in turn provides the source of the Einstein equation which determines the behavior of the background spacetime metric. Hence they ought to be solved self-consistently. (One usually considers only the homogeneous mode of the scalar field for the dynamics of inflation and the inhomogeneous modes of quantum fluctuations for processes like structure formation.)

Footnote 1. A comment on the meaning of the term ‘critical dynamics’ as used in the context of cosmological phase transitions is in order. By it we refer to studies of phase transitions mediated by a time-dependent order parameter field, in contrast to static critical phenomena where the order-parameter field is constant in time. We are using this term in a general sense, not necessarily referring to the specific conditions of critical phenomena as discussed in condensed matter systems [16]. For example, critical phenomena usually deal with the change of the order parameter field near the critical point as a function of temperature. In cosmology, temperature $T$ is a parameter usually (e.g. under the assumption of adiabatic expansion) tied in with the scale factor $a$ and does not play the same role as in critical phenomena. In the new inflationary scenario, the critical temperature $T_c$ is defined as the temperature at which a global ground state (the true vacuum) first appears. The stage when vacuum energy begins to dominate and inflation starts is the beginning of the phase transition. The stage when the system begins to enter the true vacuum and reheat can be regarded for practical purposes as the end of the phase transition. Throughout the process of inflation the system is in a ‘critical’ state. The progression of a cosmological phase transition is measured not by temperature, but by the change of field configurations in time. Criticality corresponds to the physical condition that the correlation length $\xi = m_{\rm eff}^{-1} \to \infty$, or $m_{\rm eff}^2 = d^2V/d\phi^2|_{\phi=0} \to 0$ (which may or may not be possible). In critical dynamics studies of condensed matter systems one usually analyses the time-dependent Landau–Ginzburg equation, with a noise term representing the effect of a thermal bath, and studies how the system (order-parameter field) settles into equilibrium as it approaches the critical point. We are not concerned with the corresponding cosmological problem here. An attempt to describe this aspect of the inflationary transition was made in [17]. See also [18].

At the classical level, the wave equation for the background scalar field $\phi$ (assumed homogeneous) with self-interaction potential $V(\phi)$ in a spatially flat Robertson–Walker (RW) spacetime is given by

$$\ddot\phi + 3H(t)\dot\phi + V'(\phi) = 0 \tag{2.1}$$

and the Einstein equations read

$$\dot H + 3H^2 = 8\pi G\,V(\phi), \qquad \dot H = -4\pi G\,\dot\phi^2, \tag{2.2}$$
where $H(t) \equiv \dot a/a$ is the Hubble rate, a dot denotes a derivative with respect to cosmic time $t$, and a prime denotes a derivative taken with respect to its argument. A trivial but important solution to these equations is obtained by assuming that $V(\phi) = V_0 =$ constant, $\phi = \phi_0 =$ constant and $H = H_0 =$ constant, which is the de Sitter (DS) universe $a = e^{H_0 t}$ with a constant field. A less trivial but useful solution is the so-called ‘power-law’ inflation model [22], with an exponential potential and a slowly varying inflaton field. One can carry out a derivative expansion of the background field to obtain a quasi-local effective action for the description of such classes of spacetimes and fields. (An example of ‘slow-roll’ transition.)

Footnote 2. Strictly speaking, phase transition studies usually carried out assuming a constant field in the de Sitter universe are unrealistic, in that they only address the situation after the universe has entered the inflationary stage and inflates indefinitely. This model cannot be used to answer questions raised concerning the likelihood that the universe will still inflate if it had started from a more general, less symmetric initial state, such as the mixmaster universe [19–21]. Nor can one use this model to study the actual process of phase transition (e.g., slow roll-over), and the problem of exit (graceful or not) to the ‘true’ Friedmann phase. To do this, as is well known, one usually assumes that the potential is not exactly flat, but has a downward slope which enables the inflaton field to gradually (so as to give sufficient inflation) settle into a global ground state. The cosmological solution is, of course, no longer a de Sitter universe.

2.2. Inflation as scaling: static critical phenomena

This idea arose from the work of Hu and Zhang [4] on coarse-graining and backreaction in stochastic inflation. In trying to compare the inflationary universe with phase transitions in the Landau–Ginzburg model, using a $\lambda\phi^4$ field as example, they realized that the exponential expansion of the scale factor can be viewed as the system undergoing a Kadanoff–Migdal scale transformation [23] (this is explained in detail in Section 3.2). In other words, time in this case plays the role of a scaling parameter. It does not have to be viewed as a dynamical parameter. Thus for this special class of expansion, the dynamics of spacetime can be replaced equivalently by a scaling transformation. In so doing one renders eternal inflation into a static setting. By contrast, the larger class of power-law expansion $a = t^p$ does not possess this scaling property. A useful parameter which marks the difference between these two classes of dynamics is $\epsilon = |\dot H|/H^2 = |\ddot\alpha|/\dot\alpha^2$, where $\alpha \equiv \ln a$, which can be regarded as a ‘nonadiabaticity parameter’ of dynamics: the DS exponential behavior with $\epsilon = 0$ is ‘static’, the slow-roll with small $\epsilon \ll 1$ is ‘adiabatic’, while the RW low-power-law with $\epsilon \approx 1$ is ‘nonadiabatic’.

2.3. Quasi-static field, ‘slow-roll’ as dynamical critical phenomena

The effective potential $V(\phi)$ gives a well-defined description of phase transition only for a constant background (order-parameter) field. If the order-parameter field is dynamic, the effective potential is ill-defined and a host of problems will arise. Indeed, the very meaning of phase transition can become questionable. This is because as the field changes the effective action functional changes, and the location of the minima changes also. The notion of symmetry breaking and restoration is meaningful only when there exist well-defined global and local minima which do not change much in the time scale of the phase transition. Changing background fields will also engender particle creation, which affects the nature and energetics of phase transition as well. Therefore, in the context of phase transitions involving dynamic fields, short of creating a new framework, one can at best discuss the problem in a perturbative sense, where the background field is nearly constant (quasi-static), so that an effective quasi-potential can still be defined [24,25]. An effective Lagrangian for a slowly varying background field can be obtained by carrying out a quasilocal expansion in derivatives of the field, the leading term being the effective potential [26]:

$$\mathcal{L} = \mathcal{L}(\phi, \partial_\mu\phi, \partial_\mu\partial_\nu\phi, \ldots). \tag{2.3}$$
One can use this method to derive effective quasi-potentials for scalar fields in flat space (for an example of its application to electroweak finite temperature transition see, e.g., Moss et al. [27]) or (in conformal time) for the conformally flat RW spacetimes. This is useful for studying cosmological phase transitions where the background spacetime changes only gradually, as in the Friedmann (low-power law) solutions $a = t^p$, $p < 1$. (For a description see [25].) However, for the inflationary universe where the scale factor undergoes rapid expansion following either an exponential $a = e^{Ht}$ or a high power-law behavior, the quasi-local expansion which assumes that the background field varies slowly is usually inadequate. Using the conceptual framework introduced above, one can understand why the particular subclass of high-power-law expansion associated with an exponential potential can be viewed as quasi-static. It is in this context that one can introduce the quasi-‘static’ approximation to derive the effective action for scalar fields to depict this more realistic ‘slow-roll’ inflation, now carried out as a quasi-local perturbation from the DS space, which is viewed as effectively static.

For slowly varying background fields one can use the method of derivative expansion to derive the quasi-local effective Lagrangian. Usually this makes sense only for static (or conformally static, like the RW) spacetimes. However, one can view the special class of exponential expansion as effectively ‘static’. This can be understood with the ideas of ‘dynamical finite size effect’ [28] and implemented by treating inflation as ‘scaling’ transformations [4]. The ‘slow roll-over’ type of phase transition used in many inflationary models can be viewed
as a quasi-static case, and derived as a dynamic perturbation from the de Sitter universe [5]. This view reveals a close analogy of this case with dynamical critical phenomena where the scaling parameter $s$ plays the role of inflation and the time parameter $t$ measures how much the system departs from the exponential expansion solution. The main points proposed in [5] can be summarized by the following schematic diagram:

[Constant Field in Static or Conformally Static Spacetimes (Finite Size Effect)]  --- SCALING --->  [Exponential Expansion $a = e^{Ht}$, ‘Eternal Inflation’ (Dynamical Finite Size Effect)]
        | Quasilocal Approximation                                                        | Derivative Expansion
        v                                                                                 v
[Slowly Varying Field in RW Universe (Low Power-Law)]  --------------->  [High Power-Law expansion $a = t^p$, ‘Slow Roll-Over’]
The inadequacy of the finite temperature effective potential for the description of phase transitions in the early universe was what motivated us to look for more general methods useful for dynamic and nonequilibrium processes, especially involving quantum fields. We will trace out this pathway with two themes, one for the exposition of methods to treat NEq processes, and the other for the development of RG theories for such processes. This will lead us from the effective action to the influence functional methods, and for RG theory, from CTP CGEA to RG equations. It is better to start with a physical example to motivate, so we first discuss a simple conceptual point which will enable us to see inflation in the light of scaling.

3. Coarse-graining, scaling and inflation [footnote 3]

Footnote 3. Sections 3–5 are excerpted from Lectures I and II of [4].

3.1. Inflation

Consider a massive ($m$), self-coupled ($\lambda$) scalar field $\phi(\vec x, t)$ in a background spacetime with action

$$S[\phi] = \int d^4x\,\sqrt{-g}\left[-\frac{1}{2}\phi\,\Box\,\phi - \frac{1}{2}(m^2 + \xi R)\phi^2 - \frac{1}{4!}\lambda\phi^4\right], \tag{3.1}$$

where $\Box = \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x^\mu}\left(g^{\mu\nu}\sqrt{-g}\,\frac{\partial}{\partial x^\nu}\right)$ is the Laplace–Beltrami operator and $\xi = 0, \frac{1}{6}$ correspond, respectively, to minimal and conformal coupling of the field with the background spacetime with scalar curvature $R$. We will consider RW and DS spacetimes with line element

$$ds^2 = dt^2 - a^2\,d\Omega^2, \tag{3.2}$$
where $d\Omega^2$ is that of the 3-space ($d\Omega^2 = d\vec x^{\,2}$ for spatially flat). We restrict our attention here only to the spatially homogeneous and isotropic RW, or the spacetime-homogeneous and isotropic DS cases, where a single isotropic scaling parameter is applicable. For the RW universe $a(t) = t^p$, where $p = 1/2$ for a radiation-dominated source and $p = 2/3$ for a pressureless dust source. For a DS universe (in the spatially flat RW coordinatization)

$$a(t) = \exp(Ht). \tag{3.3}$$
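The difference between the power-law and exponential scale factors above can be checked numerically. The following is a minimal sketch, assuming the nonadiabaticity parameter is defined as $|\dot H|/H^2$ with $H = d(\ln a)/dt$; the function name `nonadiabaticity` and the finite-difference step `h` are illustrative choices, not part of the original text.

```python
import math

def nonadiabaticity(a, t, h=1e-3):
    """Estimate |dH/dt| / H^2 at time t for a scale factor function a(t).

    Uses central finite differences on alpha = ln a, so that
    H = d(alpha)/dt and dH/dt = d^2(alpha)/dt^2.
    """
    alpha = lambda x: math.log(a(x))
    alpha_dot = (alpha(t + h) - alpha(t - h)) / (2.0 * h)
    alpha_ddot = (alpha(t + h) - 2.0 * alpha(t) + alpha(t - h)) / h**2
    return abs(alpha_ddot) / alpha_dot**2

# Radiation- and dust-dominated RW universes, and a de Sitter universe (H = 1).
radiation = nonadiabaticity(lambda t: t**0.5, 1.0)
dust = nonadiabaticity(lambda t: t**(2.0 / 3.0), 1.0)
de_sitter = nonadiabaticity(lambda t: math.exp(t), 1.0)
```

For $a = t^p$ the parameter equals $1/p$ analytically, so the low-power-law cases come out of order one, while the exponential case vanishes up to rounding error.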
Spatially homogeneous but anisotropic spacetimes are more complicated as they require different scaling parameters in different directions. In a spatially homogeneous spacetime, the scalar wave function $\phi(\vec x, t)$

$$\phi(\vec x, t) = \phi_k(t)\,u_k(\vec x) \tag{3.4}$$

is the product of a function of time $\phi_k(t)$ and a function of space $u_k(\vec x)$ [$= e^{i\vec k\cdot\vec x}$ for spatially flat, and $= Y_{lmn}(\chi, \theta, \varphi)$, $k = (l, m, n)$, for spatially closed cases]. The wave equation for the time-dependent amplitude function $\phi_k(t)$ associated with the $k$th mode of a scalar field $\phi(\vec x, t)$ with self-interaction potential $V(\phi)$ in a RW spatially flat spacetime is given by

$$\frac{d^2\phi_k}{dt^2} + 3H(t)\frac{d\phi_k}{dt} + \left(\frac{k^2}{a^2} + m^2 + \xi R\right)\phi_k + V'_k(\phi) = 0, \tag{3.5}$$

where $H(t) \equiv \dot a/a = \dot\alpha$, or $\alpha = \ln a = \int H\,dt$. (Here a dot denotes derivative with respect to cosmic time $t$.) The solution to (3.5) depends on $V'(\phi) = dV/d\phi$. For inflation we are interested in cases where (i) $V'(\phi) \ll 3H\dot\phi$. This would correspond to considering only the relatively flat portion of the potential, where inflation takes place for an extended period of time (usually assumed $\Delta\alpha = \int_{t_0}^{t_1} H\,dt > 68$ e-folding time to engender sufficient entropy corresponding to the observed universe). In new inflation, this corresponds to assuming a flat plateau; and in chaotic inflation, a gradual downward slope. In the realistic DS case, the order parameter field changes very slowly in this inflation regime (slow roll-over), its rate controlled by $3H\,d\phi/dt$, which can be regarded as a viscous term in the dynamics. One can safely also assume that (ii) $\ddot\phi \ll 3H\dot\phi$ in this regime. (By contrast the $\ddot\phi$ term dominates the dynamics in the reheating regime.) A further simplification is to assume that $\phi =$ constant in this regime throughout, corresponding physically to eternal inflation, i.e., $a = e^{Ht}$ for all time. This idealized case is of course unphysical in that the universe will never roll-over (in new and chaotic inflation), or tunnel (in old inflation) to the true vacuum. Nevertheless it captures the salient features of inflation. We shall consider this idealized case in detail here in the light of static scaling. In summary one can distinguish the following cases:

(a) Minkowski field theory (static): $H = 0$;
(b) Eternal inflation (stationary): $H =$ constant, $\dot\phi = 0$;
(c) Slow-roll inflation (quasi-stationary): $H \approx$ constant, $\dot\phi \neq 0$, $\ddot\phi = 0$;
(d) Realistic inflation (dynamic): $H \neq$ constant, $\dot\phi \neq 0$, $\ddot\phi \neq 0$.
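The regimes listed above can be explored with a small numerical experiment on the homogeneous mode. The sketch below is only illustrative and rests on several assumptions that are not in the text: a quadratic potential $V = m^2\phi^2/2$, units with $8\pi G = 1$, the Friedmann constraint $H^2 = (\dot\phi^2/2 + V)/3$ for a spatially flat RW universe, and a hypothetical helper name `slow_roll_run`. It integrates $\ddot\phi + 3H\dot\phi + V'(\phi) = 0$ with a fourth-order Runge–Kutta step until $|\dot H|/H^2$ reaches 1, and records how small $|\ddot\phi|$ is compared with $|3H\dot\phi|$ after the initial transient.

```python
import math

def slow_roll_run(phi0=17.0, m=1.0, dt=1e-3, sample_efold=5.0):
    """Integrate the homogeneous field equation for V = m^2 phi^2 / 2
    (assumed potential, units 8 pi G = 1) until |dH/dt| / H^2 >= 1.

    Returns (number of e-folds, |phi''| / |3 H phi'| sampled once the
    e-fold count first exceeds sample_efold).
    """
    def hubble(phi, v):
        # Friedmann constraint for a spatially flat RW universe
        return math.sqrt((0.5 * v * v + 0.5 * m * m * phi * phi) / 3.0)

    def rhs(phi, v):
        # (d phi/dt, d v/dt, d alpha/dt) with v = d phi/dt, alpha = ln a
        h = hubble(phi, v)
        return v, -3.0 * h * v - m * m * phi, h

    phi, v, alpha = phi0, 0.0, 0.0
    ratio = None
    while True:
        h = hubble(phi, v)
        # In these units dH/dt = -v^2 / 2, so |dH/dt| / H^2 = v^2 / (2 H^2)
        if 0.5 * v * v / (h * h) >= 1.0:
            break
        if ratio is None and alpha >= sample_efold:
            acc = -3.0 * h * v - m * m * phi
            ratio = abs(acc) / abs(3.0 * h * v)
        # One classical RK4 step
        k1 = rhs(phi, v)
        k2 = rhs(phi + 0.5 * dt * k1[0], v + 0.5 * dt * k1[1])
        k3 = rhs(phi + 0.5 * dt * k2[0], v + 0.5 * dt * k2[1])
        k4 = rhs(phi + dt * k3[0], v + dt * k3[1])
        phi += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        v += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
        alpha += dt * (k1[2] + 2 * k2[2] + 2 * k3[2] + k4[2]) / 6.0
    return alpha, ratio

efolds, ratio = slow_roll_run()
```

With these assumed initial data the field spends most of the run in a regime resembling case (c), where the second-derivative term is a small correction to the friction term, before the run ends in the dynamic case (d).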
In both cases (a) and (b), the term $3H\dot\phi$ in (3.5) is ignored, while in cases (b) and (c) the term $\ddot\phi$ is ignored.

We now discuss the effect of (eternal) inflation on the scalar field. We will show that a scalar field with $\lambda\phi^4$ self-interaction in an isotropically expanding spacetime undergoing eternal inflation behaves exactly like the same field in a static spacetime undergoing a scaling transformation. The physics of inflation (we refer specifically to the initial stage only, but not the subsequent slow roll-over nor the reheating stages) can thus be understood in terms of scaling completely.

3.2. Scaling

To fix ideas, consider the spatially flat RW metric Eq. (3.2) with $a =$ constant. This is just the Minkowski spacetime. Let us consider an ordered sequence of such static hyperspaces (foliation) with scales $a_0, a_1, a_2$, etc., parameterized by $t_n = t_0 + n\Delta t$, $n = 0, 1, 2, \ldots$. These spacetimes have the same geometry and topology but differ only in the physical scale in space. One can always redefine the physical scale length $x^{(n)} = a_n x$ to render them equivalent. If each copy has scale length magnified by a fixed factor $e^{H\Delta t}$ over the previous one in the sequence, i.e. $a_{n+1}/a_n = e^{H\Delta t}$, we get exactly the physical picture as in an eternal inflation. After $n$ iterations, $a_n/a_0 = e^{n(H\Delta t)}$, or, in terms of a continuous parameter, $a(t) = a_0 e^{Ht}$. It is important to recognize here that $t$ can be any real parameter not necessarily describing the dynamics. Time in this depiction plays the role of a scaling parameter, and dynamics is nothing other than scaling.

To see this simple point in another light, let us describe a totally different physical situation (which we will see at the end is exactly identical) where scaling plays the dominant role but which one would otherwise not be inclined to associate with the physics of inflationary cosmology. This is the Kadanoff–Migdal transformation [23,29] widely applied to condensed matter systems for the study of critical behavior.
Consider a three-dimensional cubic lattice with lattice length $L$, and an order parameter field $\phi$ describing the magnetization of the system. The Ginzburg–Landau–Wilson Hamiltonian for a Heisenberg magnet bears similarity with the $\lambda\phi^4$ theory in a well-known manner [30]. The Kadanoff–Migdal (KM) transform is an artificial procedure for relating the microscopic and macroscopic properties of a system based on the existence of scaling properties in the system in the infrared limit. It involves taking a certain number of fixed spin sites (e.g. 4 in 2 dimensions, 8 in 3 dimensions) and replacing them by one block spin with adjusted (e.g. doubled bond strength) interaction strength between nearest neighbors. Carrying out this transformation $n$ times leads to a coarse-grained system. If the system (ferromagnet) manifests scaling properties near the critical point, as had been observed before the renormalization group theory was invented for its description, then the resulting rescaled macroscopic system should preserve the same properties as the original microscopic system. The appearance of long range order near the critical point makes this procedure a viable one, permitting enormous simplification while preserving the salient features of the system.

Let us examine a two-dimensional example of this process, sometimes also called decimation. Denote by $a_0$ the original length and $a_n$ the scale length after carrying out the KM transformation $n$ times. If 4 spins are combined into 1 block spin in 2 dimensions, $a_1 = 2a_0$, and $a_n/a_0 = 2^n$.
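The two-dimensional decimation just described can be sketched in a few lines. This is an illustration only: the majority rule used to pick the block spin (with random tie-breaking) is an assumed choice, the adjustment of the bond strength between block spins is not modelled, and the names `block_spin` and `decimate` are hypothetical.

```python
import random

def block_spin(lattice):
    """One block-spin step on a square lattice of +1/-1 spins.

    Each 2x2 block (4 spins) is replaced by a single spin chosen by
    majority rule; ties are broken at random (assumed rule).
    """
    n = len(lattice) // 2
    coarse = []
    for i in range(n):
        row = []
        for j in range(n):
            total = (lattice[2 * i][2 * j] + lattice[2 * i][2 * j + 1]
                     + lattice[2 * i + 1][2 * j] + lattice[2 * i + 1][2 * j + 1])
            if total > 0:
                row.append(1)
            elif total < 0:
                row.append(-1)
            else:
                row.append(random.choice((1, -1)))
        coarse.append(row)
    return coarse

def decimate(lattice, a0=1.0):
    """Apply block_spin until one spin remains; return the scale lengths a_n."""
    scales = [a0]
    while len(lattice) > 1:
        lattice = block_spin(lattice)
        scales.append(2.0 * scales[-1])  # each step doubles the scale length
    return scales

random.seed(0)
spins = [[random.choice((1, -1)) for _ in range(16)] for _ in range(16)]
scales = decimate(spins)
```

Starting from a 16 × 16 lattice, four iterations reduce it to a single block spin, and the recorded scale lengths follow $a_n/a_0 = 2^n$.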
The decimation result $a_n/a_0 = 2^n$ is the same as for inflation, except that the scaling factor $s \equiv a_{n+1}/a_n$ here is 2, instead of $s = e^{H\Delta t}$. Defining

$$s \equiv \frac{a_{n+1}}{a_n} \equiv e^{\sigma}, \tag{3.6}$$

$\sigma$ for inflation is $H\Delta t$, a constant, while for decimation in the above example it is $\ln 2$. We shall see that it is precisely this scaling property in inflation which imparts all the distinct and desirable physical features, from the scale-invariant Zel’dovich–Harrison perturbation spectrum to the Hawking effect.

While the above real-space scaling explains well the physical idea, the dual transformation in momentum space is easier to calculate. This is the original Wilson–Fisher RG transformation [29,30]. In momentum space, the lattice spacing acts like an ultraviolet cutoff $\Lambda$ of wave numbers $k$ ($\Lambda = \infty$ for continuous systems). The block spin transformation corresponds to (a) eliminating the higher wave number modes (e.g. $k > \Lambda/s$) and (b) rescaling (e.g., $k \to k' = ks$). The real parameter $s > 0$ acts in (a) like a coarse-graining (cut) parameter and in (b) like a rescaling parameter. In our example above, $s = 2$: in every iteration, wavemodes with $k > \Lambda/2$ are integrated out first, then the remaining modes from $0 < k < \Lambda/2$ are rescaled such that the new $k$ space covers the full range with the same UV cutoff. For definiteness, we will refer to these two steps as coarse-graining (separation) and rescaling. Together these two steps constitute the renormalization group transformation. Thus in this example, at any iteration, wave numbers $\Lambda/s < k < \Lambda$ are being integrated away, and only the lower wave mode sector is kept. For the long wavelength, low $k$ cutoff, we assume the horizon size $H^{-1}$, because those with lower $k$ having wavelengths greater than the horizon size are of no physical significance to the system, at least during the inflationary stage. (Some of these wavelengths will reenter the horizon in the reheating phase and influence later events.)
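The two steps of the momentum-space transformation, coarse-graining and rescaling, can be illustrated as pure bookkeeping on a set of wave numbers. The sketch below does not integrate any action; it only tracks which modes are eliminated and how the kept modes are rescaled. The function name `rg_step` and the uniform grid of 64 modes are assumptions made for the example.

```python
def rg_step(modes, cutoff, s=2.0):
    """One momentum-space RG step on a list of wave numbers k <= cutoff.

    (a) Coarse-graining: modes with cutoff/s < k <= cutoff are eliminated.
    (b) Rescaling: kept modes are mapped k -> k*s, so they again span
        the full range up to the same UV cutoff.
    Returns (rescaled kept modes, eliminated modes).
    """
    kept = [k for k in modes if k <= cutoff / s]
    eliminated = [k for k in modes if k > cutoff / s]
    return [k * s for k in kept], eliminated

cutoff = 1.0
modes = [cutoff * (i + 1) / 64 for i in range(64)]  # illustrative uniform grid
system, environment = rg_step(modes, cutoff)
```

After one step with $s = 2$, half of the modes have been moved into the eliminated sector and the kept modes again reach up to the original cutoff.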
In the language of subdynamics one can call the integrated-out high-$k$ modes the environment (irrelevant sector), and the retained low-$k$ modes the system (relevant sector). The division of these two sectors changes with the order of iteration (in the language of KM transformation) or time-evolution (for inflationary cosmology) of the system. The coupling of the environment with the system is determined by $V(\phi)$. In addition, for a self-interacting field there are low–low and also high–high mode couplings. A useful way to keep track of these interactions is by way of the CGEA, which we will discuss in the next section.

3.3. Coarse-graining and stochastic inflation

In the above discussion we have shown how one can transcribe the dynamical process of inflation to the static transformation of scaling. We also showed how the evolution of the universe during inflation can be viewed as a succession of coarse-grainings, with the higher wave number modes cast away and their effect on the remaining (low $k$) modes accounted for by a coarse-grained effective action. The resulting RG equations describe the flow of the coupling constants. Using this equivalence, we can conceptually (and technically) replace the inflationary expansion process in time by a running of the field towards the infrared regime.

Fig. 1. The evolution of physical lengths during expansion of the Universe.

Let us compare the coarse-graining scheme used in the critical phenomena example above (CP, or Case A) with that of an important class of inflationary cosmologies called stochastic inflation (SI, or Case B), because there are close similarities and basic differences. Starobinsky’s
scheme [31] is to divide a free scalar 3eld into two parts: the system contains long-wavelength modes outside the horizon (p = k=a ¡ jH; ‘ = p−1 being the physical wavelength and j ¡ 1 a small parameter), which are treated classically; and the bath contains quantum 4uctuations of wavelengths ‘ shorter than the horizon. Thus the bath in stochastic in4ation is obtained from the full spectrum by a window function %(k −jaH ), and changes at each moment. Many authors have claimed that these high k modes with a dynamic cuto which obey the wave equation (3.5) (with 3rst order or with both 3rst and second order time derivative terms) give rise to a white noise. They then proceed to set up and solve a Langevin equation for the classical 3eld driven by the de Sitter dynamics and with this white noise source. Thus the name stochastic in4ation. 4 The physical setup of Starobinsky’s stochastic in4ation diers from the example we used above to illustrate the relation of scaling and in4ation in how the system is de3ned. Technically there exists a simple transformation between coarse-graining in critical phenomena (CP, Case A) and that in stochastic in4ation (SI, Case B). Let us explain: Compare the composition of system (S) and environment (E) at the beginning and end of in4ation for critical phenomena (Case A) and for stochastic in4ation (Case B). For Case B, at the start t0 , assume a(t0 ) = 1 and . = 1 for simplicity. At any later time the system consists of all modes k ¡ Ha or ‘ ¿ H −1 . As the universe in4ates a = eHt , more k modes are being red-shifted out of the horizon into the system. Thus S of stochastic in4ation increases (from the k region H0 K at t0 to HK at t∗H to GK at t∗G , see Fig. 1). The reason why in the stochastic in4ationary program the system is taken to consist of modes with wavelengths greater than the horizon H −1 4
Authors of [4,6,32] question how a white noise can be deduced by this dynamic partitioning of a free field. They suggested instead how colored noise can be generated from coarse-graining an interacting field without assuming any dynamic partitioning (see Section 6.2). This provides a more general scheme for stochastic inflation, where one can discuss issues like non-Gaussian galaxy distribution from vacuum fluctuations.
in the inflation regime is because these are the modes which would reenter the RW horizon in the radiation-dominated RW phase; and only those higher ($k > k_G$) modes (here treated as environment) which reenter the horizon before galaxy formation can physically influence $k_G$ in the RW phase. So the system in stochastic inflation is really designed for and chosen by the physics in the later radiation-dominated RW regime, rather than during the inflationary regime. Note that modes with wavelengths greater than the horizon size at the onset of inflation have not yet reentered today’s horizon, because in, say, the new inflationary scenario, today’s observable universe is believed to be only a small portion ($l_H$) of a much larger bubble inflated from a fluctuation domain ($l_D$) at $t_0$. The mode $k_H$ exits the horizon at time $t^*_H$. The wave mode $k_H$ of physical wavelength $\ell_H$ (today’s Hubble radius) which is just entering the RW horizon today (and which disappeared outside the de Sitter horizon earlier than any observable physical scale (HRH Hnow in Fig. 1)) forms the upper limit of $\ell_{\rm phys}$. Of course more wave modes with $\ell > \ell_H$ will be entering into the RW horizon tomorrow. These modes have not yet come into contact with our universe so they should be considered as unphysical by today’s observers. However, they had interacted previously in the de Sitter phase at times $t < t^*_H$ with all other $k$ modes $< k_{H_0}$ within the de Sitter horizon. In the inflationary phase, it is this interaction which one should address.

By contrast, in critical phenomena (Case A) depicted earlier, both the system and the bath are contained in the horizon throughout the period of inflation from $t_0$ to $t_1$ (the OLH1 portion in Fig. 1). $H^{-1}$ is the lower (IR) cutoff. The partition is defined by the incremental fraction $s = e^{H\Delta t} = a(t + \Delta t)/a(t)$ of $k$ modes slipping into the bath at each moment $\Delta t$ (or each iteration) by coarse-graining. For example, one possible partition $k_L$ (point L in Fig. 1) is determined by the last wave mode which left the horizon at the end of inflation at $t_1$. All modes with $k > k_L$ never made it out and always stayed inside of the DS horizon. These will be the common “bath” to all systems at all times in the DS phase. Another choice more geared towards the galaxy formation problem is defined by the $k$ modes of interest (e.g. galaxy scale $k_G$) and made up of $k < k_G$. The environment then contains modes $k > k_G$, including the common portion $k > k_L$. In this designation we make sure that: (1) The system consists of all scales within today’s Hubble radius (RW horizon). (2) Their interactions in the inflationary phase with the higher and lower modes are incorporated. (3) The environment is always within the de Sitter horizon.

We are interested in two effects: (1) The effect of the higher $k$ modes field $\phi_>$ on the system field made up of lower $k$ modes $\phi_<$, and (2) How the field parameters (such as mass $m$ and coupling constants $\lambda$) change in this process of inflation (or scaling). These are answered, respectively, by (1) deriving the effective action and the associated effective equations for $\phi_<$, and (2) deriving the renormalization group equation for these field parameters.

Despite the difference in the aims of the two programs the coarse-graining techniques used in both cases are related in a simple manner. By replacing $s \to s^{-1}$, or equivalently $(\Delta t) \to -(\Delta t)$ in Case A, one gets the procedure in Case B for stochastic inflation. This amounts to changing inflation to deflation (time-reversal), or interchanging system and bath. One can also view the difference as representing a passive versus an active view. By this we mean inflating the background (spacetime) has, for a local observer in a scale-invariant system, the same effect as coarse-graining the system in a fixed background. A figurative description which encapsulates the main ideas in this comparison is the effect of a zoom lens. The physics of a scale-independent system under inflation is equivalent to holding
the system static and viewing it through a zoom lens. Here zooming (scaling) replaces dynamics (inflation). Features of an evolving system undergoing inflation (the passive interpretation) can be depicted equally by the magnifying view through a zoom lens (scaling) of a fixed system (the active interpretation). Obviously this relation holds only if the structure is homogeneous and the lens has uniform magnification. (This corresponds to homogeneity in space and uniform scaling.) In the next section we present an effective action formalism for carrying out the coarse-graining. We shall keep the formalism general and nonspecific to the system–environment stipulation so that it can be applied to different situations in a variety of problems.

Part II. Coarse-graining and renormalization group

4. Coarse-grained effective action

4.1. Coarse graining and backreaction

In quantum field theory, when the background field or spacetime dynamics follows some simple discernible (e.g., classical) behavior one commonly performs what is called a background field decomposition, $\phi = \phi_c + \phi_q$, on a (e.g., self-interacting) scalar field $\phi$. One then calculates the effect of the fluctuation field $\phi_q$ on the background field $\phi_c$ (as well as the background geometry $g$ when necessary) using a loop expansion in the in–out effective action. For many problems, such as those encountered in statistical mechanics, where one is interested in the detailed behavior of only a part of the overall system (call it the system) interacting with its surroundings (call it the environment), one can decompose the field describing the overall system according to $\phi = \phi_S + \phi_E$, where $\phi_S$ denotes the system field and $\phi_E$ the environment field.
One can calculate the backreaction of the latter on the former with the ‘coarse-grained effective action’ (CGEA) first introduced by Hu and Zhang [4]. Here the expansion is not in orders of $\hbar$, as in the conventional (Schwinger–DeWitt) definition of the effective action obtained by integrating over the quantum fluctuations, but rather in a small parameter $\epsilon$ which can generally be expressed as the ratio of two time, length, mass or momentum scales characterizing the discrepancy between system and environment (examples are slow–fast time scales, long–short length scales, light–heavy mass scales or soft–hard momentum scales). In the stochastic inflation scheme [31], one regards the system field as containing only the lower modes and the environmental field as containing the higher modes, with the division provided by the event horizon in de Sitter spacetime. In post-inflationary reheating, the quantum (inhomogeneous) fluctuation fields are parametrically amplified by the time-dependent (homogeneous) inflaton (background) field, resulting in particle creation. Its backreaction effect is to damp away the oscillations of the inflaton field and reheat the universe. (See, e.g., [33,34].) Similar processes of particle creation at the Planck time leading to damping of anisotropy of the background spacetime were earlier treated by means of a CTP, or in–in, effective action. (See, e.g., [1].) In quantum cosmology one usually makes the
so-called ‘minisuperspace approximation’ of quantizing only the homogeneous universes and ignoring the inhomogeneous ones. The validity of this truncation scheme can be examined as a backreaction problem. One can view the homogeneous and inhomogeneous universes as the low and high lying modes of excitation of spacetime (this notion can be made more precise in terms of gravitational perturbations off a background spacetime as described by the Lifshitz wave operator), treat the former as constituting the system field (minisuperspace) and the latter as the environmental field. The magnitude of the backreaction of the latter on the former will give a measure of the validity of the minisuperspace approximation. This approach was taken by Hu and Sinha, who introduced the CTP CGEA [35]. The same idea and method should be applicable to backreaction effects of hard thermal loops on the soft sector in quark–gluon plasma [36].

In the high–low mode division, one can think of the system as the average of the field over a spatial volume $\Lambda_c^{-3}$, with $\Lambda_c = \Lambda/s$ the inverse of the critical wavelength that separates the system from the environment. The Euclidean version of this coarse-graining leads to the so-called average effective action (AEA), introduced by Wetterich [37]. These two formalisms, the Euclidean AEA and the real-time CGEA, though sharing similar ideas, have rather different ranges of application. Wetterich’s AEA averages the fields over a Euclidean space-time region, while Hu’s [4] CGEA averages the field over a spatial region. The Euclidean AEA is more suitable for near-equilibrium systems, and has indeed been applied to QCD problems like quark–gluon plasma in a quasi-stationary state, while the real-time CTP CGEA gives causal evolution equations for expectation values of quantum operators and is more suitable for the analysis of dynamical problems. We will invoke the AEA in Section 8.

4.2. ‘In–out’ coarse grained effective action

We now introduce the in–out version of the CGEA.
We start with this version because it is pedagogically simpler, and is closer to what one learns from textbooks, i.e., the Schwinger–DeWitt effective action. Functionally it is also adequate for the introduction of ideas on coarse-graining and backreaction to the extent of seeing the running of coupling constants leading to a RG theory. But the effective equation of motion with backreaction obtained from an in–out version contains nonreal and noncausal terms. This was first noticed by DeWitt and Jordan [38,39] and Calzetta and Hu [1,2] in a study of the backreaction of particle creation on a background spacetime or field. For a correct treatment of the backreaction of the environment on the system one needs the in–in or CTP version, which we will discuss in Section 6.

Consider the $\lambda\phi^4$ model with action (3.1) in a spatially flat Robertson–Walker universe. The scale factor $a(t)$ is left unspecified for now. Rather than viewing $a(t)$ as a dynamical function determined from Einstein’s equations we will, in the spirit of scaling as explained in an earlier section, regard $a$ as a constant and $t$ as a parameter. Although the space can have different scales $a(t)$ at different times $t$, spacetime remains flat at all times. (One can rescale $\vec{x}$ at different times to make them all equal in value.) Thus we can simply think of doing Minkowski field theory here, leaving $a(t)$ as a parameter while carrying out coarse-graining in the three-dimensional space. The content of this subsection can thus be used without reference to cosmology (simply set $a(t) = 1$). However, we do want to carry the scale factor $a(t)$ along so that later we can perform scaling transformations and discuss inflationary cosmology.
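For the RW spacetime, the action (3.1) reads
$$
S[\phi] = \int d^3x \int dt\, a(t) \left\{ -\frac{1}{2}\,\phi\,(\hat{d}_2 + \hat{d}_1 - \hat{\nabla}^2 + \tilde{m}^2)\,\phi - \frac{1}{4!}\,\tilde{\lambda}\,\phi^4 \right\}, \tag{4.1}
$$
where
$$
\hat{d}_1 = a^2(t)\,3H\,\frac{\partial}{\partial t}, \qquad \hat{d}_2 = a^2(t)\,\frac{\partial^2}{\partial t^2}, \qquad \hat{\nabla}^2 = a^2(t)\,\nabla^2 \tag{4.2}
$$
are the first and second order time derivative operators and the spatial Helmholtz operator, respectively. Here
$$
\tilde{m}^2(t) = a^2(t)\,(m^2 + \xi R), \tag{4.3}
$$
$$
\tilde{\lambda}(t) = a^2(t)\,\lambda(t) \tag{4.4}
$$
are the conformally related generalized mass and field self-coupling parameter, respectively. We will now assume $\hat{d}_1 = \hat{d}_2 = 0$, i.e., consider the case of eternal inflation. In this ‘static’ limit, one can introduce the momentum space representation on the spatial section
$$
\phi(\vec{x}, t) = (2\pi)^{-3/2} \int d^3k\, e^{i\vec{k}\cdot\vec{x}}\, \phi(\vec{k}, t). \tag{4.5}
$$
Then
$$
\begin{aligned}
S[\phi] ={}& \int dt\, a(t) \int d^3k \left\{ -\frac{1}{2}\,(\vec{k}^2 + \tilde{m}^2(t))\,\phi(\vec{k})\,\phi(-\vec{k}) \right\} \\
&+ (2\pi)^{-3} \int dt\, a(t) \int d^3k_1\, d^3k_2\, d^3k_3\, d^3k_4 \\
&\quad \times \left\{ -\frac{1}{4!}\,\tilde{\lambda}(t)\,\delta(\vec{k}_1 + \vec{k}_2 + \vec{k}_3 + \vec{k}_4)\,\phi(\vec{k}_1, t)\,\phi(\vec{k}_2, t)\,\phi(\vec{k}_3, t)\,\phi(\vec{k}_4, t) \right\}.
\end{aligned}
\tag{4.6}
$$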
For interacting fields in a static spacetime mode-mixing is manifest. If $\hat{d}_1 \neq 0$, “time” translation invariance is lost and there will be frequency mixing. This becomes relevant in the context of critical dynamics, as mentioned in Section 2. Now we separate the field $\phi(\vec{x}, t)$ into two parts, $\phi_<$ and $\phi_>$, which one can refer to as the system and the environment, $\phi = \phi_< + \phi_>$. We assume that $\phi_<$ contains the lower $k$ wave modes and $\phi_>$ the higher $k$ modes and consider these two cases:

Case A (critical phenomena):
$$
\phi_<:\ |\vec{k}| < \Lambda/s; \qquad \phi_>:\ \Lambda/s < |\vec{k}| < \Lambda, \tag{4.7}
$$
Case B (stochastic inflation):
$$
\phi_<:\ |\vec{k}| < \epsilon H a; \qquad \phi_>:\ |\vec{k}| > \epsilon H a; \qquad \epsilon \approx 1. \tag{4.8}
$$
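The two partitions (4.7) and (4.8) can be made concrete with a small sketch. The function name `classify_mode`, its keyword arguments and the label "excluded" (for modes above the ultraviolet cutoff in Case A) are illustrative assumptions, not notation from the text; `cutoff` stands for $\Lambda$, `eps` for $\epsilon$, `hubble` for $H$, and the scale factor is taken as $a = e^{Ht}$ with $a(0) = 1$.

```python
import math


def classify_mode(k, case, *, cutoff=1.0, s=2.0, eps=1.0, hubble=1.0, t=0.0):
    """Label a mode of wavenumber magnitude k as system or environment.

    Case "A" follows Eq. (4.7): system for k < cutoff/s, environment for
    cutoff/s <= k < cutoff. Modes at or above the cutoff are not part of
    the theory and are labelled "excluded" (a hypothetical label).
    Case "B" follows Eq. (4.8): system for k < eps*H*a(t), with a = exp(H t).
    """
    if case == "A":
        if k < cutoff / s:
            return "system"
        if k < cutoff:
            return "environment"
        return "excluded"
    if case == "B":
        a = math.exp(hubble * t)  # de Sitter scale factor, a(0) = 1 assumed
        return "system" if k < eps * hubble * a else "environment"
    raise ValueError("case must be 'A' or 'B'")


# In Case B the same comoving mode migrates from environment to system as
# the universe inflates, whereas in Case A the boundary is set by s alone.
for t in (0.0, 1.0, 2.0):
    print(t, classify_mode(5.0, "B", t=t))
```

The sketch only encodes the inequalities; it says nothing about the dynamics of the fields on either side of the boundary.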
Here $\Lambda$ is the ultraviolet cutoff and $s > 1$ is the coarse-graining parameter which gives the fraction of total $k$ modes counted in the environment. The separation of $\phi$ can also be made
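in other manners, depending on the physical setup of the problem and the questions one asks. The formalism we present here can be easily adapted to other types of decomposition. $S[\phi]$ in (4.6) can be written as
$$
\begin{aligned}
S[\phi] ={}& \int dt\, a(t) \int d^3k \left\{ -\frac{1}{2}\,(\vec{k}^2 + \tilde{m}^2(t))\,\phi_<(\vec{k}, t)\,\phi_<(-\vec{k}, t) \right\} \\
&+ \int dt\, a(t) \int d^3q \left\{ -\frac{1}{2}\,(\vec{q}^2 + \tilde{m}^2(t))\,\phi_>(\vec{q}, t)\,\phi_>(-\vec{q}, t) \right\} \\
&+ (2\pi)^{-3} \int dt\, a(t) \int d^3k_1\, d^3k_2\, d^3k_3\, d^3k_4 \left\{ -\frac{1}{4!}\,\tilde{\lambda}(t) \right\} \\
&\quad \times \phi_<(\vec{k}_1, t)\,\phi_<(\vec{k}_2, t)\,\phi_<(\vec{k}_3, t)\,\phi_<(\vec{k}_4, t)\,\delta(\vec{k}_1 + \vec{k}_2 + \vec{k}_3 + \vec{k}_4) \\
&+ (2\pi)^{-3} \int dt\, a(t) \int d^3k_1\, d^3k_2\, d^3k_3\, d^3q_1 \left\{ -\frac{1}{4!}\,\tilde{\lambda}(t) \right\} \\
&\quad \times 4\,\phi_<(\vec{k}_1, t)\,\phi_<(\vec{k}_2, t)\,\phi_<(\vec{k}_3, t)\,\phi_>(\vec{q}_1, t)\,\delta(\vec{k}_1 + \vec{k}_2 + \vec{k}_3 + \vec{q}_1) \\
&+ (2\pi)^{-3} \int dt\, a(t) \int d^3k_1\, d^3k_2\, d^3q_1\, d^3q_2 \left\{ -\frac{1}{4!}\,\tilde{\lambda}(t) \right\} \\
&\quad \times 6\,\phi_<(\vec{k}_1, t)\,\phi_<(\vec{k}_2, t)\,\phi_>(\vec{q}_1, t)\,\phi_>(\vec{q}_2, t)\,\delta(\vec{k}_1 + \vec{k}_2 + \vec{q}_1 + \vec{q}_2) \\
&+ (2\pi)^{-3} \int dt\, a(t) \int d^3k_1\, d^3q_1\, d^3q_2\, d^3q_3 \left\{ -\frac{1}{4!}\,\tilde{\lambda}(t) \right\} \\
&\quad \times 4\,\phi_<(\vec{k}_1, t)\,\phi_>(\vec{q}_1, t)\,\phi_>(\vec{q}_2, t)\,\phi_>(\vec{q}_3, t)\,\delta(\vec{k}_1 + \vec{q}_1 + \vec{q}_2 + \vec{q}_3) \\
&+ (2\pi)^{-3} \int dt\, a(t) \int d^3q_1\, d^3q_2\, d^3q_3\, d^3q_4 \left\{ -\frac{1}{4!}\,\tilde{\lambda}(t) \right\} \\
&\quad \times \phi_>(\vec{q}_1, t)\,\phi_>(\vec{q}_2, t)\,\phi_>(\vec{q}_3, t)\,\phi_>(\vec{q}_4, t)\,\delta(\vec{q}_1 + \vec{q}_2 + \vec{q}_3 + \vec{q}_4),
\end{aligned}
\tag{4.9}
$$
where all integrals over $k$ and $q$ have the limits indicated in (4.7) or (4.8), depending on the case in question. Henceforth we shall use the shorthand notation
$$
\phi_<(1) \equiv \phi_<(\vec{k}_1, t); \qquad \phi_>(\tilde{1}) \equiv \phi_>(\vec{q}_1, t) \tag{4.10}
$$
and denote
$$
G_<(1) = [\vec{k}^2 + \tilde{m}^2(t) - i\epsilon]^{-1}; \qquad G_>(\tilde{1}) = [\vec{q}^2 + \tilde{m}^2(t) - i\epsilon]^{-1}. \tag{4.11}
$$
We can separate terms in (4.9) into three groups
$$
S[\phi] = S[\phi_<] + S_0[\phi_>] + S_I[\phi_<, \phi_>], \tag{4.12}
$$
where
$$
S[\phi_<] = -\frac{1}{2}\,G_<^{-1}(1)\,\phi_<(1)\,\phi_<(-1) - \frac{1}{4!}\,\tilde{\lambda}(t)\,\phi_<(1)\,\phi_<(2)\,\phi_<(3)\,\phi_<(4)\,\delta(1 + 2 + 3 + 4),
$$
$$
S_0[\phi_>] = -\frac{1}{2}\,G_>^{-1}(\tilde{1})\,\phi_>(\tilde{1})\,\phi_>(-\tilde{1})
$$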
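and
$$
\begin{aligned}
S_I[\phi_<, \phi_>] ={}& -\frac{1}{4!}\,\tilde{\lambda}(t)\,\big[\, 4\,\phi_<(1)\,\phi_<(2)\,\phi_<(3)\,\phi_>(\tilde{1})\,\delta(1 + 2 + 3 + \tilde{1}) \\
&+ 6\,\phi_<(1)\,\phi_<(2)\,\phi_>(\tilde{1})\,\phi_>(\tilde{2})\,\delta(1 + 2 + \tilde{1} + \tilde{2}) \\
&+ 4\,\phi_<(1)\,\phi_>(\tilde{1})\,\phi_>(\tilde{2})\,\phi_>(\tilde{3})\,\delta(1 + \tilde{1} + \tilde{2} + \tilde{3}) \\
&+ \phi_>(\tilde{1})\,\phi_>(\tilde{2})\,\phi_>(\tilde{3})\,\phi_>(\tilde{4})\,\delta(\tilde{1} + \tilde{2} + \tilde{3} + \tilde{4}) \,\big] \\
\equiv{}& S_1 + S_2 + S_3 + S_4 .
\end{aligned}
\tag{4.13}
$$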
Here integration over the $k$’s and $q$’s with the respective ranges is understood, and we have used the simplified notation $1, \tilde{1}$ to denote $\vec{k}_1, \vec{q}_1$, etc. It is easier to use Feynman diagrams for the propagators and vertices corresponding to the two fields $\phi_<$ and $\phi_>$. They are listed in Fig. 2.

The functional integral formalism for quantum fields is set up in the usual way. Assume that at $t = \pm\infty$ the interaction is turned off so that the in-vacuum $|0_-\rangle$ and the out-vacuum $|0_+\rangle$ are defined. The vacuum persistence amplitude, or the generating functional $\langle 0_+|0_-\rangle$, gives the probability amplitude of the field remaining in the vacuum after the interacting system evolves in time. With the splitting $\phi = \phi_< + \phi_>$, the Schwinger–DeWitt (in–out) generating functional becomes
$$
Z[\phi] = N \int D\phi\, e^{iS[\phi]} = N \int D\phi_<\, D\phi_>\, e^{i\{S[\phi_<] + S_0[\phi_>] + S_I[\phi_<, \phi_>]\}}. \tag{4.14}
$$
Introducing an integral over $\phi_>$ of $\exp iS_0[\phi_>]$ as the norm for the functional average in the denominator and changing the constant $N$ to $N'$ accordingly, we get
$$
Z[\phi] = N' \int D\phi_<\, e^{iS[\phi_<]}\, \frac{\int D\phi_>\, e^{iS_0[\phi_>]}\, e^{iS_I[\phi_<, \phi_>]}}{\int D\phi_>\, e^{iS_0[\phi_>]}} = N' \int D\phi_<\, e^{iS[\phi_<]} \times \langle e^{iS_I[\phi_<, \phi_>]} \rangle_> . \tag{4.15}
$$
Here $\langle\ \rangle_>$ defines averaging over the $\phi_>$ field under the free action $S_0[\phi_>]$. (We write $\langle\ \rangle_0$ interchangeably to emphasize that the averaging is with the free action $S_0[\phi_>]$.) Denoting
$$
\langle e^{iS_I[\phi_<, \phi_>]} \rangle_> = \exp i\Delta S[\phi_<], \tag{4.16}
$$
we get
$$
Z[\phi_<] = N' \int D\phi_<\, \exp iS_{\rm eff}[\phi_<]. \tag{4.17}
$$
The CGEA is given by
$$
S_{\rm eff}[\phi_<] = S[\phi_<] + \Delta S[\phi_<], \tag{4.18}
$$
where
$$
\Delta S[\phi_<] = -i \ln \langle \exp iS_I[\phi_<, \phi_>] \rangle_> . \tag{4.19}
$$
Fig. 2. Feynman diagrams for the in–out CGEA.
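If the interaction between the $\phi_<$ and $\phi_>$ fields is small (e.g. $\lambda \ll 1$ in a $\lambda\phi^4$ theory), one can expand this in a Dyson–Feynman series
$$
\begin{aligned}
\Delta S[\phi_<] &= -i \ln \sum_{n=0}^{\infty} \frac{i^n}{n!}\, \langle S_I^n[\phi_<, \phi_>] \rangle_> \\
&= \langle S_I[\phi_<, \phi_>] \rangle_> + \frac{i}{2}\,\big[ \langle S_I[\phi_<, \phi_>]^2 \rangle_> - \langle S_I[\phi_<, \phi_>] \rangle_>^2 \big] + \cdots ,
\end{aligned}
\tag{4.20}
$$
where in the second line terms up to the second order in $S_I$ are written out explicitly.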
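The structure of the expansion (4.20) can be checked on a toy model in which the functional average is replaced by an ordinary Gaussian average. The sketch below assumes a single environment variable $X$ with unit variance and a toy interaction $S_I = gX^2$ (loosely mimicking the $\phi_<\phi_<\phi_>\phi_>$ term); `g`, `delta_s_exact` and `delta_s_second_order` are hypothetical names, and the closed form $\langle e^{igX^2}\rangle = (1 - 2ig)^{-1/2}$ is the standard Gaussian integral, not a result from the text.

```python
import cmath


def delta_s_exact(g):
    """-i ln <exp(i g X^2)> for X ~ N(0, 1), using <.> = (1 - 2 i g)^(-1/2)."""
    return -1j * cmath.log((1 - 2j * g) ** -0.5)


def delta_s_second_order(g):
    """Truncation of Eq. (4.20): <S_I> + (i/2) (<S_I^2> - <S_I>^2)."""
    mean = g              # <g X^2> = g
    second = 3 * g ** 2   # <(g X^2)^2> = 3 g^2, since <X^4> = 3
    return mean + 0.5j * (second - mean ** 2)


for g in (0.01, 0.05, 0.1):
    exact = delta_s_exact(g)
    approx = delta_s_second_order(g)
    # The mismatch should shrink like g^3, the first neglected order.
    print(g, abs(exact - approx), g ** 3)
```

In this toy setting the connected (variance) combination in the square bracket is what survives the logarithm, which is the same mechanism that removes disconnected graphs in the field-theoretic expansion.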
Let us now examine each term in (4.20) in detail, starting with the first order term $\langle S_I[\phi_<, \phi_>] \rangle_>$. We see that terms in (4.13) involve the average of products of $\phi_>$ fields, but those containing an odd number of fields in the product average to zero. Also any product containing four $\phi_>$ fields (last term in (4.13)) averages out to a quantity which is independent of $\phi_<$, i.e., $\langle \phi_>(\tilde{1}) \rangle_0 = \langle \phi_>(\tilde{1})\,\phi_>(\tilde{2})\,\phi_>(\tilde{3}) \rangle_0 = 0$ and $\langle \phi_>(\tilde{1})\,\phi_>(\tilde{2})\,\phi_>(\tilde{3})\,\phi_>(\tilde{4}) \rangle_0$ is independent of $\phi_<(\vec{k}, t)$. Thus the only contribution to $\langle S_I[\phi_<, \phi_>] \rangle_0$ is the quadratic product
$$
\langle \phi_>(\tilde{1})\,\phi_>(\tilde{2}) \rangle_0 = -i\,a^{-1}(t)\,G_>(\tilde{1})\,\delta(\tilde{1} + \tilde{2})\,\delta(t_1 - t_2). \tag{4.21}
$$
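We denote by $S_I[\phi_<, \phi_>]_j$ the terms containing $j$ $\phi_>$ fields, i.e. $S_{I1} \sim \phi_<\phi_<\phi_<\phi_>$, $S_{I2} \sim \phi_<\phi_<\phi_>\phi_>$, etc. In the Feynman diagram depiction (Fig. 2c), the averaging over $\phi_>$ amounts to closing the lines representing the environment field propagators on themselves, and it is easy to see how the above conclusion is reached. Explicitly, the only contributing first order term is
$$
\begin{aligned}
\langle S_I[\phi_<, \phi_>]_2 \rangle_0 ={}& (2\pi)^{-3} \int a(t)\, dt \int d^3k_1\, d^3k_2\, d^3q_1\, d^3q_2\; 6 \left\{ -\frac{1}{4!}\,\tilde{\lambda}(t) \right\} \\
&\times \langle \phi_<(\vec{k}_1, t)\,\phi_<(\vec{k}_2, t)\,\phi_>(\vec{q}_1, t)\,\phi_>(\vec{q}_2, t) \rangle_0\; \delta(\vec{k}_1 + \vec{k}_2 + \vec{q}_1 + \vec{q}_2).
\end{aligned}
$$
Using (4.21), we get
$$
\langle S_I[\phi_<, \phi_>]_2 \rangle_0 = \int a(t)\, dt \int d^3k \left\{ -\frac{1}{2}\,\phi_<(\vec{k}, t)\,\delta\tilde{m}^2(t; s)_1\,\phi_<(-\vec{k}, t) \right\}, \tag{4.22}
$$
where
$$
\delta\tilde{m}^2(t; s)_1 = -\frac{i}{2}\,\delta(0)\,\tilde{\lambda}(t)\,a^{-1}(t)\,(2\pi)^{-3} \int d^3q\, [\vec{q}^2 + \tilde{m}^2(t) - i\epsilon]^{-1}. \tag{4.23}
$$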
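To get a feel for the momentum-shell integral appearing in (4.23), the sketch below evaluates $\int d^3q\,(q^2 + m^2)^{-1}$ over the Case A shell $\Lambda/s < |\vec{q}| < \Lambda$. It assumes a real, positive mass squared and drops the $i\epsilon$ prescription (a Euclidean-style simplification), and it omits the prefactors $\delta(0)$, $\tilde{\lambda}$, $a^{-1}$ and $(2\pi)^{-3}$; the function names and parameters are hypothetical. The closed form follows from $\int q^2\,dq/(q^2 + m^2) = q - m\arctan(q/m)$.

```python
import math


def shell_integral_closed(cutoff, s, m):
    """4*pi * integral of q^2/(q^2 + m^2) dq over cutoff/s < q < cutoff."""
    lo, hi = cutoff / s, cutoff
    return 4 * math.pi * ((hi - lo) - m * (math.atan(hi / m) - math.atan(lo / m)))


def shell_integral_numeric(cutoff, s, m, n=20000):
    """Same integral by the midpoint rule, as an independent cross-check."""
    lo, hi = cutoff / s, cutoff
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        q = lo + (i + 0.5) * h
        total += q * q / (q * q + m * m)
    return 4 * math.pi * total * h


# The shell contribution vanishes as s -> 1 (no modes integrated out)
# and grows as more of the spectrum is assigned to the environment.
for s in (1.0, 1.1, 2.0, 10.0):
    print(s, shell_integral_closed(1.0, s, 0.1))
```

For a thin shell, $s = 1 + d\tau$, the integral is linear in $d\tau$, which is what allows the finite coarse-graining step to be turned into a differential flow equation.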
This is depicted as (1) in Fig. 2c. The subscript 1 on $\delta\tilde{m}^2$ denotes that it is the first order correction to $\tilde{m}^2$. The second order correction is given by (5) and (6) of Fig. 2c and computed below. Now we examine the second order terms in (4.20). The second term in the square bracket, $\langle S_I[\phi_<, \phi_>] \rangle_0^2$, gives three non-contributing disconnected graphs. The first term, $\langle (S_I[\phi_<, \phi_>])^2 \rangle_0$, has the following components:
$$
\begin{aligned}
\langle S_I^2 \rangle_0 ={}& \langle S_1^2 \rangle_0 + \langle S_2^2 \rangle_0 + \langle S_3^2 \rangle_0 + \langle S_4^2 \rangle_0 + 2\,\big[ \langle S_1 S_2 \rangle_0 + \langle S_1 S_3 \rangle_0 + \langle S_1 S_4 \rangle_0 \\
&+ \langle S_2 S_3 \rangle_0 + \langle S_2 S_4 \rangle_0 + \langle S_3 S_4 \rangle_0 \big],
\end{aligned}
$$
where we have written in short $S_j = S_I[\phi_<, \phi_>]_j$ $(j = 1, 2, 3, 4)$ as in Eq. (4.13). In terms of the Feynman diagrams in Fig. 2c constructed from Fig. 2b, there are four contributing connected diagrams of the second order. They are denoted by (3)–(6) in Fig. 2c. We see that diagram (1) is the first order and (5), (6) are the second order corrections to the mass $\tilde{m}^2$, (4) is a second order correction to the coupling constant $\tilde{\lambda}$, while (3) generates a $\phi^6$ correction.
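Adding these contributions together we obtain finally for the CGEA up to second order in $\lambda$:
$$
\begin{aligned}
S_{\rm eff}[\phi_<] ={}& \int a(t)\, dt \int d^3k \left\{ -\frac{1}{2}\,\phi_<(\vec{k}, t)\,[\vec{k}^2 + \tilde{m}^2(t; s)]\,\phi_<(-\vec{k}, t) \right\} \\
&+ \int a(t)\, dt \int d^3k_1\, d^3k_2\, d^3k_3\, d^3k_4 \left\{ -\frac{1}{4!}\,\tilde{\lambda}(t; s; \{\vec{k}\}) \right\} \\
&\quad \times \phi_<(\vec{k}_1, t)\,\phi_<(\vec{k}_2, t)\,\phi_<(\vec{k}_3, t)\,\phi_<(\vec{k}_4, t)\,\delta(\vec{k}_1 + \vec{k}_2 + \vec{k}_3 + \vec{k}_4) \\
&+ \int a(t)\, dt \int d^3k_1 \cdots d^3k_6 \left\{ -\frac{1}{6!}\,\lambda_6(t; s; \{\vec{k}\}) \right\} \\
&\quad \times \phi_<(\vec{k}_1, t) \cdots \phi_<(\vec{k}_6, t)\,\delta(\vec{k}_1 + \cdots + \vec{k}_6),
\end{aligned}
\tag{4.24}
$$
where the integration range of $\vec{k}_i$ is as indicated in (4.7) and (4.8): $|\vec{k}| \le \Lambda/s$ for the case of critical phenomena and $|\vec{k}| < \epsilon H a$ for stochastic inflation. Here,
$$
\begin{aligned}
\tilde{m}^2(t; s) &= \tilde{m}^2(t) + \delta\tilde{m}^2(t; s)_1 + \delta\tilde{m}^2(t; s)_{2a} + \delta\tilde{m}^2(t; s)_{2b}, \\
\tilde{\lambda}(t; s; \{\vec{k}\}) &= \tilde{\lambda}(t) + \delta\tilde{\lambda}(t; s; \{\vec{k}\})_1 .
\end{aligned}
\tag{4.25}
$$
The subscripts under $\delta\tilde{m}^2$ and $\delta\tilde{\lambda}$ denote the order of correction to the mass and the coupling constants arising from averaging over the $\phi_>$ fields. They are given by
$$
\begin{aligned}
\delta\tilde{m}^2(t; s)_{2a} ={}& -\frac{1}{4}\,[i\delta(0)]^2\,[\tilde{\lambda}(t)/a(t)]^2\,(2\pi)^{-3} \int d^3q_1\, [\vec{q}_1^{\,2} + \tilde{m}^2(t) - i\epsilon]^{-2} \\
&\times (2\pi)^{-3} \int d^3q_2\, [\vec{q}_2^{\,2} + \tilde{m}^2(t) - i\epsilon]^{-1},
\end{aligned}
\tag{4.26}
$$
$$
\begin{aligned}
\delta\tilde{m}^2(t; s)_{2b} ={}& -\frac{1}{6}\,[i\delta(0)]^2\,[\tilde{\lambda}(t)/a(t)]^2\,(2\pi)^{-3} \int d^3q_1\, (2\pi)^{-3} \int d^3q_2\, [\vec{q}_1^{\,2} + \tilde{m}^2(t) - i\epsilon]^{-1} \\
&\times [\vec{q}_2^{\,2} + \tilde{m}^2(t) - i\epsilon]^{-1}\,[(\vec{k} - \vec{q}_1 - \vec{q}_2)^2 + \tilde{m}^2(t) - i\epsilon]^{-1},
\end{aligned}
\tag{4.27}
$$
$$
\begin{aligned}
\delta\tilde{\lambda}(t; s; \{\vec{k}\}) ={}& \frac{3}{2}\,[i\delta(0)]\,a(t)^{-1}\,\tilde{\lambda}^2(t)\,(2\pi)^{-3} \int d^3q\, [\vec{q}^{\,2} + \tilde{m}^2(t) - i\epsilon]^{-1} \\
&\times [(\vec{k}_1 + \vec{k}_2 - \vec{q})^2 + \tilde{m}^2(t) + i\epsilon]^{-1},
\end{aligned}
\tag{4.28}
$$
$$
\lambda_6(t; s; \{\vec{k}\}) = -10\,\tilde{\lambda}^2(t)\,[(\vec{k}_1 + \vec{k}_2 + \vec{k}_3)^2 + \tilde{m}^2(t) - i\epsilon]^{-1}. \tag{4.29}
$$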
5. Backreaction in the inflationary universe: renormalization group equations and the running of coupling constants

We have so far discussed the separation and averaging processes leading to a CGEA which contains the averaged effect of the environment on the system. Now we will introduce a rescaling
of the $k$-space (truncated in critical phenomena, enlarged in stochastic inflation) to bring it back to the original size (but with less content in CP, more in SI). By requiring that this transformed Lagrangian has the same form as the one we started out with we obtain a set of renormalization group equations for the field parameters. At this point our earlier observation on the equivalence of scaling and inflation for the field and spacetime in question becomes relevant. That is, rescaling in a static picture serves the function of inflation in a dynamic picture. Our derivation here of the differential renormalization group equations is similar to the Wilson–Fisher [29] or the Wegner–Houghton methods [40]. After introducing a rescaling of the $k$-space (spatial dimension $D = 3$),
$$
\vec{k}' = s\,\vec{k}; \qquad \phi'(\vec{k}', t) = s^{-(D+2)/2}\,\phi_<(\vec{k}, t), \tag{5.1}
$$
the effective action (4.24) becomes
$$
\begin{aligned}
S_{\rm eff}[\phi'] ={}& \int dt\, a(t) \int d^3k' \left\{ -\frac{1}{2}\,\phi'(\vec{k}', t)\,[\vec{k}'^2 + s^2\,\tilde{m}^2(t; s)]\,\phi'(-\vec{k}', t) \right\} \\
&+ \int dt\, a(t) \int d\vec{k}'_1 \cdots d\vec{k}'_4 \left\{ -\frac{1}{4!}\,s^{4-D}\,\tilde{\lambda}(t; s; \{\vec{k}\}) \right\} \phi'(\vec{k}'_1, t) \cdots \phi'(\vec{k}'_4, t) \\
&\quad \times \delta(\vec{k}'_1 + \vec{k}'_2 + \vec{k}'_3 + \vec{k}'_4) \\
&+ \int dt\, a(t) \int d\vec{k}'_1 \cdots d\vec{k}'_6 \left\{ -\frac{1}{6!}\,s^{6-2D}\,\lambda_6(t; s; \{\vec{k}\}) \right\} \phi'(\vec{k}'_1, t) \cdots \phi'(\vec{k}'_6, t) \\
&\quad \times \delta(\vec{k}'_1 + \cdots + \vec{k}'_6).
\end{aligned}
\tag{5.2}
$$
Note that the $k'$-integrations now resume the full range of $k$ in (4.7) or (4.8). In terms of the rescaled variables $\phi'$, $S_{\rm eff}$ will have the same form, up to a certain order, as the original action in terms of $\phi$, provided that we identify
$$
\tilde{m}'^2(t; s) = s^2\,\tilde{m}^2(t; s), \qquad \tilde{\lambda}'(t; s) = s^{4-D}\,\tilde{\lambda}(t; s). \tag{5.3}
$$
The effective mass and coupling constants are given in (4.25)–(4.29). We can now proceed to tackle the coarse-grained integrals therein. At this point one needs to stipulate the system and bath separation, such as the Cases A and B given as examples in Section 3. We will work out the details for CP (Case A) here. The case for SI (Case B) can be obtained via the simple relation between Case A and B also mentioned in Section 3. For small changes in $s$, $s \simeq 1 + d\tau$, one can derive a set of differential renormalization group equations for the mass and the coupling constant as follows:
$$
\frac{dx}{d\tau} = 2x - \frac{1}{2}\,\frac{y}{1 + x}, \qquad \frac{dy}{d\tau} = \epsilon\,y + \frac{3}{2}\,\frac{y^2}{(1 + x)^2}, \tag{5.4}
$$
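where
$$
x \equiv \tilde{m}^2(t; \tau)\,\Lambda^{-2}, \qquad y \equiv \tilde{\lambda}(t; \tau)\,\Lambda^{-\epsilon}\,a^{-1}(t). \tag{5.5}
$$
Here, $\epsilon = 4 - D$, and ... is the solid angle integration in $D$ dimensions. As expected, these equations have the same form as in the Ginzburg–Landau–Wilson model [29,30,41].$^{5}$ They govern the flow of the field parameters $(\tilde{m}^2, \tilde{\lambda})$ as the scaling changes. There exist certain points in this parameter space known as the fixed points, where further application of the RG transformation leads to an invariant result. The fixed points are thus the steady-state solutions to the RG equations $dx/d\tau = 0$ and $dy/d\tau = 0$. For the $\lambda\phi^4$ theory, it is well known that there is a trivial fixed point at
$$
x_{f0} = 0; \qquad y_{f0} = 0, \tag{5.6}
$$
and a nontrivial fixed point at
$$
x_f^* = -\epsilon/6; \qquad y_f^* = -2\epsilon/3. \tag{5.7}
$$
Near the trivial fixed point the solution to (5.4) is
$$
\begin{pmatrix} x \\ y \end{pmatrix} = A \begin{pmatrix} 1 \\ 0 \end{pmatrix} e^{2\tau} + B \begin{pmatrix} 1 \\ 2(2 - \epsilon) \end{pmatrix} e^{\epsilon\tau}. \tag{5.8}
$$
The critical point of interest to us is the nontrivial fixed point. We want to find out how in its neighborhood the field parameters flow towards this fixed point. By setting $x = x_f^* + \Delta x$ and $y = y_f^* + \Delta y$ we get
$$
\frac{d(\Delta x)}{d\tau} = \left( 2 - \frac{1}{3}\,\epsilon \right) \Delta x - \frac{1}{2} \left( 1 + \frac{1}{6}\,\epsilon \right) \Delta y \qquad \text{and} \qquad \frac{d(\Delta y)}{d\tau} = -\epsilon\,\Delta y \tag{5.9}
$$
with the solution
$$
\begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix} = A \begin{pmatrix} 1 \\ 0 \end{pmatrix} e^{(2 - \epsilon/3)\tau} + B \begin{pmatrix} 1 \\ 4(1 + \epsilon/6) \end{pmatrix} e^{-\epsilon\tau}. \tag{5.10}
$$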
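The linearized results (5.8) and (5.10) can be cross-checked numerically. The sketch below assumes the flow equations in the form $dx/d\tau = 2x - \tfrac{1}{2}y/(1+x)$, $dy/d\tau = \epsilon y + \tfrac{3}{2}y^2/(1+x)^2$ (the reading of (5.4) that reproduces the fixed points (5.6) and (5.7) to first order in $\epsilon$), builds the Jacobian by central differences, and returns its eigenvalues. The function names `beta`, `jacobian` and `flow_exponents` are hypothetical.

```python
import math


def beta(x, y, eps):
    """Right-hand sides (dx/dtau, dy/dtau) of the assumed form of Eq. (5.4)."""
    dx = 2 * x - 0.5 * y / (1 + x)
    dy = eps * y + 1.5 * y ** 2 / (1 + x) ** 2
    return dx, dy


def jacobian(x, y, eps, h=1e-6):
    """2x2 Jacobian of beta at (x, y), by central finite differences."""
    fxp, fxm = beta(x + h, y, eps), beta(x - h, y, eps)
    fyp, fym = beta(x, y + h, eps), beta(x, y - h, eps)
    return [
        [(fxp[0] - fxm[0]) / (2 * h), (fyp[0] - fym[0]) / (2 * h)],
        [(fxp[1] - fxm[1]) / (2 * h), (fyp[1] - fym[1]) / (2 * h)],
    ]


def flow_exponents(x, y, eps):
    """Eigenvalues (larger, smaller) of the Jacobian; real in this problem."""
    m = jacobian(x, y, eps)
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2


eps = 0.01
# Trivial fixed point: exponents 2 and eps, as in Eq. (5.8).
print(flow_exponents(0.0, 0.0, eps))
# Nontrivial fixed point: exponents close to 2 - eps/3 and -eps, as in Eq. (5.10).
print(flow_exponents(-eps / 6, -2 * eps / 3, eps))
```

Agreement with $2 - \epsilon/3$ and $-\epsilon$ holds only up to corrections of order $\epsilon^2$, since the fixed point values themselves are first-order results.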
5.1. Cosmological consequences

As we explained earlier, in eternal inflation the renormalization group transformation parametrized by $s$ is equivalent to time evolution parametrized by $t$. Thus under such conditions we can treat the coupling constants like $\lambda(t; s)$ as depending on either $s$ (flow) or $t$ (evolution). The fixed point corresponds to the extreme infrared limit which in the case of eternal inflation
$^{5}$
An interesting discussion of how this consideration motivated the use of scaling ideas and renormalization group techniques for the investigation of critical phenomena can be found in [41].
lies at infinitely late time $t_f$. In practice it corresponds to the time the inflation regime ends, $t_1 \simeq t_f$, before reheating sets in. Thus in (5.10) $\Delta y$ is a measure of how $y$ changes from $t$ to $t_1$. Explicitly, with $\epsilon = 1$ and $\tau = \int H\,dt$, the $O(\epsilon)$ expansion result gives
$$
y(t) = y(t_f) + \frac{14B}{3}\,e^{-H(t_f - t)}. \tag{5.11}
$$
At the fixed point $t_f$, $y(t_f) = -\frac{2}{3}\,\epsilon$. Recalling that $y = \tilde{\lambda}(t)\,\Lambda^{-\epsilon}\,a^{-1}(t)$ and $\tilde{\lambda}(t) = \lambda(t)\,a^2(t)$, this yields
$$
\lambda(t) = -\frac{2}{3}\,e^{-H(t - t_f)}\,[1 - 7B\,e^{H(t - t_f)}]. \tag{5.12}
$$
Comparing $\lambda(t)$ at two times $t > t_0$,
$$
\frac{\lambda(t)}{\lambda(t_0)} \simeq e^{-H(t - t_0)}\,[1 - 7B\,(e^{H(t - t_f)} - e^{H(t_0 - t_f)})]. \tag{5.13}
$$
Assuming $t_f \gg t, t_0$ (which is clearly satisfied for realistic inflation, $t_f > t_0\,e^{68}$) the second term in the square bracket can be neglected. We get
$$
\lambda(t)/\lambda(t_0) \simeq e^{-H(t - t_0)}. \tag{5.14}
$$
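As a purely arithmetic illustration of (5.14), the sketch below converts between the number of e-folds $N = H(t - t_0)$ and the suppression factor $\lambda(t)/\lambda(t_0)$. The function names are hypothetical, and the example value $10^{-12}$ is used only because that order of magnitude is quoted for the coupling in the galaxy formation discussion; no claim is made that the mechanism actually delivers it.

```python
import math


def coupling_ratio(efolds):
    """lambda(t)/lambda(t0) from Eq. (5.14), with efolds = H (t - t0)."""
    return math.exp(-efolds)


def efolds_for_suppression(ratio):
    """Number of e-folds needed for the coupling to drop by the given ratio."""
    return -math.log(ratio)


# A suppression by twelve orders of magnitude corresponds to
# 12 ln(10) e-folds, i.e. fewer than thirty.
print(efolds_for_suppression(1e-12))
print(coupling_ratio(60.0))
```

The exponential dependence means that the result is very sensitive to which two horizon-exit times are compared.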
Thus the coupling constant of the effective field theory (in a system which is depleting in content) actually decreases with time.$^{6}$ This is a new effect which could have important consequences for the galaxy formation problem. As we know from standard calculations [43], the density contrast $\delta\rho/\rho$ depends strongly on $\lambda$. For GUT processes one needs to assume an unnaturally weak $\lambda \approx 10^{-12}$ in order that $\delta\rho/\rho \approx 10^{-4}$ at the time the fluctuations reenter the (RW) horizon. What (5.14) implies is that the strength of $\lambda$ at time $t^*_G$, when the fluctuation mode corresponding to the galaxy scale left the de Sitter horizon, is much weaker than at $t^*_H$, the time the Hubble size left the de Sitter horizon. The running of the coupling constants seems to provide a way to reduce their strength. The reason behind this phenomenon is, as we recall, the backreaction of the environment of high $k$ modes on the system of lower modes, which in this particular choice is diminishing in time.

In contrast, for stochastic inflation, where the system (consisting of modes with physical wavelength greater than the de Sitter horizon) increases in content, the coupling constants will, from this analysis, actually increase exponentially with time during the inflationary regime (the era from $t_0$ to $t_1$). Because a formal analogy exists between this case (Case B) of SI and CP (Case A), one can simply read off the result from the results we obtained earlier. The transcription is simply (i) Changing $\Lambda$ in CP to $\epsilon H$ in SI. Recall that $\Lambda$ refers to the ultraviolet cutoff in CP, whereas $H^{-1}$ is the horizon size. The ultraviolet cutoff for SI is assumed to be $\infty$. (ii) Changing $s$ in CP to $s^{-1}$ in SI. Recall that the coarse-graining parameter $s$ plays the role of the scale function $a(t)$. Thus $s^{-1}$ in CP is now acting like $a(t)$ in SI [see Eq. (4.8)].
$^{6}$ Of course, even though the running coupling goes like $\lambda(t)/\lambda(t_0) \simeq e^{-H(t - t_0)}$, a loop expansion of a quantity such as the density–density correlation function will show that in the same limit each loop appears with a factor of $e^{H(t - t_0)}$. The self-consistent cancellation of these two factors is associated with a crossover phenomenon and in principle could be accounted for by using an $H$-dependent (environment-friendly) RG [42]. We thank C.R. Stephens for this remark.
Note that $s > 1$ in CP but $s < 1$ in SI. Note also that transformation (ii) is equivalent to a time reversal, i.e., changing $\Delta t \to -\Delta t$. The way $\lambda$ runs in stochastic inflation is given by Eq. (5.12), with $-t$ replacing $t$ there. The ensuing discussion of the implications on galaxy formation is similar, except that, of course, one reaches the opposite conclusion, i.e., the strengthening of $\lambda$ in time increases the density contrast as each mode reenters the RW horizon. The opposite results arising from these two cases should not be viewed as unsettling. They arise from very different stipulations of the system (CP decreasing in time, SI increasing). The cosmological consequences will depend on whether and how $\lambda$ runs, but the theoretical issues this investigation brings out should be valid. What is shown here is that if one takes into account the interaction between the environment and the system, the running of coupling parameters in the theory as observed in the system is an unavoidable consequence. How they run (with time or scale or energy) will depend sensitively on how the system is selected with respect to the environment, and, of course, on the nature of the interaction (gauge fields would presumably run differently from $\lambda\phi^4$, as their ultraviolet (UV) and infrared (IR) behaviors differ).

On this point, let us reexamine the physical criteria for the choice of the system versus the environment in our examples. The cutoff is determined by the highest $k$ mode which left the de Sitter horizon, which in turn depends on the duration of inflation. At the start of inflation there is of course no knowledge of when inflation will end, and the modes do not know which ones among them are to be counted in (the system) and which are to be out (the environment). A more straightforward division would be to have a fixed cutoff at the beginning of inflation, e.g., those with $k$ greater than the galaxy size ($k_G$) constitute the environment and those smaller than $k_G$ but greater than $H^{-1}$ constitute the system.
This would give rise to running, but presumably at a different rate. Indeed, if one is interested in the behavior of a particular mode, say, that of the galaxy scale $k_G$, one can regard this one mode as the system and the rest as the environment. At all times this one mode is interacting with the rest of the whole spectrum, some of which leave the DS horizon earlier, some later. The behavior of this particular mode determines at a critical time $t^*_G$ (when it leaves the DS horizon) its own amplitude when it reenters the RW horizon, but it is influenced by all the lower and higher modes at earlier times. The effect of coarse-graining for this choice of the system (single mode) is similar to the minisuperspace problem in quantum cosmology. It is expected that the coupling constants will not run in these circumstances. Even in more general cases it may also be that the changing $\lambda$ as experienced by modes in the system does not exert any effect on galaxy formation, because at the moment any $k$th mode departs the de Sitter horizon it will have the same amplitude no matter what history of interaction it has experienced, and it is this amplitude which determines the density contrast for galaxy formation in the RW era. This point, however, raises an interesting theoretical issue.

5.2. Theoretical implications

We see that different choices of the integration range in the calculation of the CGEA give rise to very different behavior of the coupling constants. In critical phenomena we stipulated the system in the same way as in the real space renormalization group treatment of critical phenomena, i.e., that the ultraviolet cutoff is fixed at the start of inflation; this corresponds to the inverse of the lattice size, which is a fixed value for condensed matter considerations. In each iteration the UV cutoff is redshifted, making the effective range of frequency smaller
at successive periods. In critical phenomena scaling is an artificial procedure to facilitate the approach to the infrared limit. However, in inflation every different time interval corresponds to a realistic physical situation. If one adopts the coarse-graining scheme similar to critical phenomena, one would get a diminishing range for the system as the universe expands. This is the cause of the changing coupling constant, and is also the source of a serious problem. Imagine that one turns the problem around by demanding that the physics observed in each of the subsequent moments should be identical, specifically, that it possesses the full range from the Planck scale to infinity ($k$ from 0 to the inverse Planck length). This is certainly what one usually assumes; otherwise the physical range we observe in the post-inflationary era, such as in today’s universe, would only consist of the low $k$ modes. If one makes this reasonable assertion, then we would not have the running coupling constants problem. This is equivalent to assuming that the physical wavemodes $p = k/a$, rather than the intrinsic wavemodes $k$, have the integration range between zero and the inverse Planck length. However, making such an adjustment in order to make the range identical at each subsequent moment would require a mechanism to ‘replenish’ the high frequency modes between the redshifted UV limit and the Planck length, which would open up another serious problem. Notice that this problem is not particular to inflation or to curved spacetime. It is already there for any theory in a dynamical setting, e.g., in cosmology. It only becomes more serious in inflation because the redshifting is exponential. Similar questions would arise in black holes when one tries to compare what is observed by different observers from the event horizon to asymptotic infinity. The relation between black holes and inflation can be easily understood by viewing them in terms of exponential scaling, as in the dynamical finite size effects [28].
One can analyze this problem in the context of renormalization group theory by using an explicit cutoff (defined at one instant as the Planck scale). By demanding that this cutoff be the same at all instants and at all energies, one would presumably not get any flow. This would in turn demand some theory on the physical consequence of trans-Planckian frequencies.⁷

⁷ This issue, raised originally in the context of inflationary cosmology in [4], has a parallel in black holes, which was pursued independently by Jacobson, Unruh and others. For a discussion of the trans-Planckian modes problem and explorations of its implications for quantum gravity, see, e.g., [44].

Part III. Backreaction and noise from CTP-CGEA

6. The closed-time-path coarse-grained effective action

As we introduced the CGEA in the 'in–out' version [4] we remarked that for a correct treatment of backreaction problems, the 'in–in' or CTP version of the CGEA is the right way to proceed, as it gives a real and causal equation of motion for the effective dynamics of the open system. The CTP CGEA was first introduced by Hu and Sinha [35] to analyze the validity of the minisuperspace approximation in quantum cosmology. The influence functional method [45] for interacting quantum fields (based on quantum Brownian motion [46]) was introduced by Hu et al. in [6,47,8] and Calzetta and Hu [7], and applied to a range of problems in semiclassical gravity and cosmology such as decoherence, structure formation, entropy generation
E.A. Calzetta et al. / Physics Reports 352 (2001) 459–520
and reheating [32,33,48,49]. Lombardo and Mazzitelli [50] computed perturbatively the CTP CGEA for a self-interacting theory in a Robertson–Walker spacetime to discuss the quantum to classical transition of the low modes of the field, and to put on firmer grounds the quantum theory of structure formation in inflationary models. Greiner and Müller [51] also used the CTP CGEA to analyze the classical limit of the soft modes of a quantum field when the hard modes are in thermal equilibrium. Dalvit and Mazzitelli [52] showed that it is possible to write an exact equation for the dependence of the CGEA on the scale that separates the system and the environment. This is expected to be useful for a nonperturbative calculation of the CGEA, and to discuss the appearance of noise in the renormalization group equations.

In this part we will review the above mentioned works. We will start with the necessary definitions and the perturbative evaluation of the CTP CGEA for a $\lambda\phi^4$ scalar field in a RW background spacetime. We will describe how to use the CTP CGEA to analyze the issue of the quantum to classical transition of the fluctuations which give rise to structure formation in inflationary models. We will finally describe attempts to compute the CTP CGEA using nonperturbative approximations.

6.1. Perturbative evaluation of the CTP CGEA

We consider again the scalar field action given by Eq. (3.1). In a flat Robertson–Walker spacetime with metric $ds^2 = dt^2 - a^2(t)\,d\vec{x}^{\,2} = a^2(\eta)[d\eta^2 - d\vec{x}^{\,2}]$, the action can be written as
$$S(a,\phi) = \int d^4x \left[ \frac{1}{2}\eta^{\mu\nu}\partial_\mu\phi\,\partial_\nu\phi - \frac{1}{2}m^2a^2\phi^2 - \frac{1}{2}\left(\xi - \frac{1}{6}\right)Ra^2\phi^2 - \frac{\lambda}{4!}\phi^4 \right], \tag{6.1}$$
where $\phi$ is the scale factor $a$ times the original field of Eq. (3.1). From now on, $d^4x$ denotes $d^3x\,d\eta$. To see the flat space results simply set $a = 1$ and $R = 0$. Let us make a system–environment field splitting
$$\phi(x) = \phi_<(x) + \phi_>(x), \tag{6.2}$$
where we define the system by
$$\phi_<(\vec{x},\eta) = \int_{|\vec{k}|<\Lambda_c} \frac{d^3\vec{k}}{(2\pi)^3}\, \phi(\vec{k},\eta)\, \exp(i\vec{k}\cdot\vec{x}), \tag{6.3}$$
and the environment by
$$\phi_>(\vec{x},\eta) = \int_{|\vec{k}|>\Lambda_c} \frac{d^3\vec{k}}{(2\pi)^3}\, \phi(\vec{k},\eta)\, \exp(i\vec{k}\cdot\vec{x}). \tag{6.4}$$
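The splitting of Eqs. (6.2)–(6.4) can be illustrated numerically. The following is only a minimal sketch on a one-dimensional periodic lattice, not part of the original treatment: the function names (`dft`, `inverse_dft`, `split_modes`), the parameters `box_length` and `k_cut` (playing the role of $\Lambda_c$), and the sample field are all assumptions made for illustration.

```python
import cmath
import math


def dft(values):
    """Naive O(n^2) discrete Fourier transform (adequate for small n)."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * j * m / n)
                for m, v in enumerate(values))
            for j in range(n)]


def inverse_dft(coeffs):
    """Inverse of dft()."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * math.pi * j * m / n)
                for j, c in enumerate(coeffs)) / n
            for m in range(n)]


def split_modes(phi, box_length, k_cut):
    """Split a real periodic field into (phi_lt, phi_gt).

    phi_lt keeps the Fourier modes with |k| < k_cut (the "system"),
    phi_gt keeps the modes with |k| >= k_cut (the "environment").
    """
    n = len(phi)
    phi_k = dft(phi)
    low, high = [], []
    for j, coeff in enumerate(phi_k):
        # Map the DFT index to a signed wavenumber on the periodic box.
        signed = j if j <= n // 2 else j - n
        k = 2.0 * math.pi * signed / box_length
        if abs(k) < k_cut:
            low.append(coeff)
            high.append(0j)
        else:
            low.append(0j)
            high.append(coeff)
    # The mask is symmetric under k -> -k, so both parts are real up to roundoff.
    phi_lt = [z.real for z in inverse_dft(low)]
    phi_gt = [z.real for z in inverse_dft(high)]
    return phi_lt, phi_gt


# Hypothetical sample field: one long-wavelength and one short-wavelength mode.
n = 32
phi = [math.sin(2 * math.pi * m / n) + 0.3 * math.cos(2 * math.pi * 5 * m / n)
       for m in range(n)]
phi_lt, phi_gt = split_modes(phi, box_length=2 * math.pi, k_cut=3.0)
```

In this sketch the two parts add back up to the full field and are orthogonal under the lattice sum, which is the discrete counterpart of the vanishing of the cross term in the quadratic part of the action after the splitting.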
The system field contains the modes with wavelengths longer than the critical value $\Lambda_c^{-1}$, while the bath or environment field contains wavelengths shorter than $\Lambda_c^{-1}$. The cutoff $\Lambda_c$ corresponds to $\Lambda/s$ of previous sections. After the splitting, the total action (6.1) can be written as
$$S[a,\phi] = S_0[\phi_<] + S_0[\phi_>] + S_{\rm int}[a,\phi_<,\phi_>], \tag{6.5}$$
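where $S_0$ denotes the kinetic term and the interaction part is given by
$$S_{\rm int}[a,\phi_<,\phi_>] = -\int d^4x \left\{ \frac{M^2}{2}\phi_<^2 + \frac{\lambda}{4!}\phi_<^4 + \frac{M^2}{2}\phi_>^2 + \frac{\lambda}{4!}\phi_>^4 + \frac{\lambda}{4}\phi_<^2(x)\phi_>^2 + \frac{\lambda}{6}\phi_<^3\phi_> + \frac{\lambda}{6}\phi_<\phi_>^3 \right\}, \tag{6.6}$$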
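with $M^2 = m^2a^2 + (\xi - \frac{1}{6})Ra^2$.

We are interested in the influence of the environment on the evolution of the system. Therefore the CTP CGEA $S_{\Lambda_c}[a,\phi_<,a',\phi'_<]$ is the object of relevance. It is defined by
$$\exp iS_{\Lambda_c}[a,\phi_<,a',\phi'_<] = \exp i\left(S_0[\phi_<] - S_0[\phi'_<]\right) \int d\phi_{>f} \int^{\phi_{>f}} \mathcal{D}\phi_> \int^{\phi_{>f}} \mathcal{D}\phi'_> \, \exp i\left\{ S_0[\phi_>] + S_{\rm int}[a,\phi_<,\phi_>] - S_0[\phi'_>] - S_{\rm int}[a',\phi'_<,\phi'_>] \right\}. \tag{6.7}$$
The integration is over all fields $\phi_>$ (and $\phi'_>$) with positive (and negative) frequency modes in the remote past that coincide at the final time, $\phi_> = \phi'_> = \phi_{>f}$. We will calculate the CTP CGEA perturbatively in $\lambda$ and $M^2$, up to quadratic order in both quantities. After a simple calculation we obtain
$$\begin{aligned} S_{\Lambda_c}[a,\phi_<,a',\phi'_<] ={}& S_0[\phi_<] - S_0[\phi'_<] + \left\{ \langle S_{\rm int}[a,\phi_<,\phi_>]\rangle_0 - \langle S_{\rm int}[a',\phi'_<,\phi'_>]\rangle_0 \right\} \\ &+ \frac{i}{2}\left\{ \langle S_{\rm int}^2[a,\phi_<,\phi_>]\rangle_0 - \left[\langle S_{\rm int}[a,\phi_<,\phi_>]\rangle_0\right]^2 \right\} \\ &- i\left\{ \langle S_{\rm int}[a,\phi_<,\phi_>]\,S_{\rm int}[a',\phi'_<,\phi'_>]\rangle_0 - \langle S_{\rm int}[a,\phi_<,\phi_>]\rangle_0 \langle S_{\rm int}[a',\phi'_<,\phi'_>]\rangle_0 \right\} \\ &+ \frac{i}{2}\left\{ \langle S_{\rm int}^2[a',\phi'_<,\phi'_>]\rangle_0 - \left[\langle S_{\rm int}[a',\phi'_<,\phi'_>]\rangle_0\right]^2 \right\}, \end{aligned} \tag{6.8}$$
where the quantum average of a functional of the fields is defined with respect to the free action $S_0$:
$$\langle B[\phi_>,\phi'_>]\rangle_0 = \int d\phi_{>f} \int^{\phi_{>f}} \mathcal{D}\phi_> \int^{\phi_{>f}} \mathcal{D}\phi'_> \, \exp i\left\{S_0[\phi_>] - S_0[\phi'_>]\right\} B. \tag{6.9}$$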
Eq. (6.8) is the in–in version of the Dyson–Feynman series (4.20). We define the propagators of the environment field as
$$\langle \phi_>(x)\phi_>(y)\rangle_0 = iG^{\Lambda_c}_{++}(x-y), \tag{6.10}$$
$$\langle \phi_>(x)\phi'_>(y)\rangle_0 = -iG^{\Lambda_c}_{+-}(x-y), \tag{6.11}$$
$$\langle \phi'_>(x)\phi'_>(y)\rangle_0 = -iG^{\Lambda_c}_{--}(x-y). \tag{6.12}$$
These propagators are not the usual Feynman, positive-frequency Wightman, and Dyson propagators of the scalar field since, in this case, the momentum integration is restricted by the presence of the (infrared) cutoff $\Lambda_c$. The explicit expressions are
$$G^{\Lambda_c}_{++}(x-y) = \int_{|\vec{p}|>\Lambda_c} \frac{d^4p}{(2\pi)^4}\, e^{ip(x-y)}\, \frac{1}{p^2 + i\epsilon}, \tag{6.13}$$
$$G^{\Lambda_c}_{+-}(x-y) = -\int_{|\vec{p}|>\Lambda_c} \frac{d^4p}{(2\pi)^4}\, e^{ip(x-y)}\, 2\pi i\,\delta(p^2)\,\theta(p^0), \tag{6.14}$$
$$G^{\Lambda_c}_{--}(x-y) = \int_{|\vec{p}|>\Lambda_c} \frac{d^4p}{(2\pi)^4}\, e^{ip(x-y)}\, \frac{1}{p^2 - i\epsilon}. \tag{6.15}$$
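As an example, we show the expression for the propagator $G^{\Lambda_c}_{++}$. The usual Feynman propagator is
$$G_{++}(x) = \frac{i}{8\pi^2}\frac{1}{\sigma} - \frac{1}{8\pi}\delta(\sigma), \tag{6.16}$$
while
$$\begin{aligned} G^{\Lambda_c}_{++}(x) ={}& \frac{i}{8\pi^2}\left[ \frac{\cos[\Lambda_c(r - x^0)]}{r(r - x^0)} + \frac{\cos[\Lambda_c(r + x^0)]}{r(r + x^0)} \right] - \frac{1}{8\pi^2}\left[ \frac{\sin[\Lambda_c(r - x^0)]}{r(r - x^0)} - \frac{\sin[\Lambda_c(r + x^0)]}{r(r + x^0)} \right] \\ \equiv{}& G_{++}(x) - G^{|\vec{p}|<\Lambda_c}_{++}(x), \end{aligned} \tag{6.17}$$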
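where $\sigma = \frac{1}{2}x^2$. The CTP CGEA can be computed from Eqs. (6.8)–(6.12) using standard techniques. After some algebra we find
$$\begin{aligned} S_{\Lambda_c}[a,\phi_<,a',\phi'_<] ={}& S_0(\phi_<) - S_0(\phi'_<) - \frac{1}{2}\int d^4x \left( M^2(x)\phi_<^2 - M'^2(x)\phi'^2_< \right) \\ &- \int d^4x \left\{ \frac{\lambda}{24}\left(\phi_<^4(x) - \phi'^4_<(x)\right) + \frac{1}{2}iG^{\Lambda_c}_{++}(0)\left(\tilde{M}^2(x) - \tilde{M}'^2(x)\right) \right\} \\ &+ \int d^4x \int d^4y \left\{ -\frac{\lambda^2}{72}\phi_<^3(x)G^{\Lambda_c}_{++}(x-y)\phi_<^3(y) - \frac{\lambda^2}{36}\phi_<^3(x)G^{\Lambda_c}_{+-}(x-y)\phi'^3_<(y) + \frac{\lambda^2}{72}\phi'^3_<(x)G^{\Lambda_c}_{--}(x-y)\phi'^3_<(y) \right. \\ &\quad - \frac{1}{4}\tilde{M}^2(x)\,iG^{\Lambda_c\,2}_{++}(x-y)\,\tilde{M}^2(y) + \frac{1}{2}\tilde{M}^2(x)\,iG^{\Lambda_c\,2}_{+-}(x-y)\,\tilde{M}'^2(y) - \frac{1}{4}\tilde{M}'^2(x)\,iG^{\Lambda_c\,2}_{--}(x-y)\,\tilde{M}'^2(y) \\ &\quad \left. + \frac{\lambda^2}{18}\phi_<(x)G^{\Lambda_c\,3}_{++}(x-y)\phi_<(y) + \frac{\lambda^2}{9}\phi_<(x)G^{\Lambda_c\,3}_{+-}(x-y)\phi'_<(y) - \frac{\lambda^2}{18}\phi'_<(x)G^{\Lambda_c\,3}_{--}(x-y)\phi'_<(y) \right\}, \end{aligned} \tag{6.18}$$
where we introduced the notation $\tilde{M}^2 = M^2 + \frac{\lambda}{2}\phi_<^2$ (and similarly $\tilde{M}'^2 = M'^2 + \frac{\lambda}{2}\phi'^2_<$).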
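Defining
$$P_\pm = \frac{1}{2}\left(\phi_<^4 \pm \phi'^4_<\right), \quad R_\pm = \frac{1}{2}\left(\phi_<^3 \pm \phi'^3_<\right), \quad Q_\pm = \frac{1}{2}\left(\tilde{M}^2 \pm \tilde{M}'^2\right), \quad \phi_\pm = \frac{1}{2}\left(\phi_< \pm \phi'_<\right),$$
and using simple identities for the propagators, the real and imaginary parts of the CTP CGEA can be written as
$$\begin{aligned} \mathrm{Re}\,S_{\Lambda_c} ={}& S_0(\phi_<) - S_0(\phi'_<) - \lambda\int d^4x \left\{ \frac{1}{12}P_-(x) + \frac{i}{2}G^{\Lambda_c}_{++}(0)\,Q_-(x) \right\} \\ &+ \lambda^2 \int d^4x \int d^4y\, \theta(y^0 - x^0) \left\{ -\frac{1}{18}R_+(x)\,\mathrm{Re}\,G^{\Lambda_c}_{++}(x-y)\,R_-(y) + \frac{1}{4}Q_+(x)\,\mathrm{Im}\,G^{\Lambda_c\,2}_{++}(x-y)\,Q_-(y) + \frac{1}{3}\phi_+(x)\,\mathrm{Re}\,G^{\Lambda_c\,3}_{++}(x-y)\,\phi_-(y) \right\}, \end{aligned} \tag{6.19}$$
$$\mathrm{Im}\,S_{\Lambda_c} = \lambda^2 \int d^4x \int d^4y \left\{ -\frac{1}{18}R_-(x)\,\mathrm{Im}\,G^{\Lambda_c}_{++}(x-y)\,R_-(y) - \frac{1}{4}Q_-(x)\,\mathrm{Re}\,G^{\Lambda_c\,2}_{++}(x-y)\,Q_-(y) + \frac{1}{3}\phi_-(x)\,\mathrm{Im}\,G^{\Lambda_c\,3}_{++}(x-y)\,\phi_-(y) \right\}. \tag{6.20}$$
The real part of the CTP CGEA in Eq. (6.19) contains divergences and must be renormalized. As the propagators in Eqs. (6.10)–(6.12) differ from the usual ones only by the presence of the infrared cutoff, the ultraviolet divergences coincide with those of the usual $\lambda\phi^4$-theory. The effective action can therefore be renormalized using the standard procedure. Consider the square of the Feynman propagator. Using dimensional regularization we find
$$G^{\Lambda_c\,2}_{++}(x) = G^2_{++}(x) + \left(G^{(|\vec{p}|<\Lambda_c)}_{++}(x)\right)^2 - 2\,G_{++}(x)\,G^{(|\vec{p}|<\Lambda_c)}_{++}(x),$$
where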
1 i + (1) − 4 54 (x) + iI(x) − B(x) − Log[42 ] ; 162 n − 4 1 I(x) = d 4 peipx Log|p2 | ; (2)4 B(x) = d 4 peipx D(p2 ) : (2)4
2 G++ (x) =
(6.21)
(6.22) (6.23) (6.24)
Note that the divergence is the usual one, i.e., proportional to $\frac{1}{n-4}\delta^4(x-y)$ and independent of $\Lambda_c$. Consequently, the term $\mathrm{Im}\,G^{\Lambda_c\,2}_{++}(x-y)\,Q_+(x)Q_-(y)$ in Eq. (6.19) is also divergent and renormalizes the coupling constant and the constants that appear in the gravitational action
(as usual, in order to renormalize the theory of a quantum field in a curved space, it is necessary to include in the gravitational action the Einstein–Hilbert term, a cosmological constant, and terms quadratic in the curvature tensor). The other divergences can be treated in a similar way. One can also check that the imaginary part of the effective action does not contain divergences.⁸

As we will show with the help of QBM models, the (nonlocal) real and imaginary parts of $S_{\Lambda_c}[a,\phi_<,a',\phi'_<]$ can be associated with the dissipation and noise, respectively, and can be related by an integral equation known as the fluctuation–dissipation relation.

Similar perturbative results for the CTP CGEA have been obtained by Greiner and Müller in Ref. [51]. In order to derive effective field equations for the soft modes of the scalar field, they computed the CTP CGEA assuming that the modes in the environment are in thermal equilibrium at a temperature $T$ such that $\Lambda_c \ll T$. Their result for the effective action is essentially given by expressions (6.19) and (6.20), for the particular case $a(t) = 1$ and with the vacuum propagators replaced by thermal propagators in order to take into account the state of the environment. In Section 7.2 we will see that the field equations derived from the CTP CGEA are real and causal.

⁸ Of course, a successful ultraviolet renormalization does not guarantee that an approximation scheme such as RG-improved perturbation theory will be well behaved. A good example is given in Section 5, where loops depend on a factor $e^{H(t-t_0)}$ which would invalidate perturbation theory. Further 'infrared' $H$-dependent (environment-friendly) renormalization of $\lambda$ is needed.

6.2. Applications: dynamics of system fields/modes with backreaction of environment fields/modes

The coarse-grained effective action is a very useful method for treating coarse-graining and backreaction problems. Examples include stochastic inflation [6,32,50] and reheating in inflationary cosmology [33,53], the effect of hard thermal loops in a QCD plasma, and the effect of individual atoms on a Bose–Einstein condensate. The CTP effective action was first applied to the backreaction of quantum fluctuations and particle creation on the background spacetime [1,39] and to interacting quantum fields [2,54].

The CTP CGEA was first introduced [35] to address the validity of the so-called minisuperspace approximation. Although it was introduced in the context of quantum cosmology, the method has a wide range of applications. The relevant issues are the backreaction effect of the inhomogeneous modes in an interacting quantum field on the homogeneous mode, and the validity of quantizing a truncated theory: does it preserve the salient features of a fully quantized theory? These issues underlie many problems in physics, especially when we view effective theories as playing a more fundamental role in the description of Nature [55].

7. Master and Langevin equations in quantum field theory

We now show how to use the open system concepts and techniques to give a first principles derivation of the evolution equation for the reduced density matrix that describes the system under the influence of the environment. We will use the influence functional method to extract
the noise and dissipation kernels. The open system framework is best illustrated by the quantum mechanical problem of Brownian motion (QBM). Feynman and Vernon [45] first treated this problem with the influence functional method. Using this method, the master equation was derived by Caldeira and Leggett [46] for Markovian processes (Ohmic environment at high temperature), and by Hu et al. [56] for a general environment including non-Markovian processes. (See also [57,58].) As we will soon find out, the influence functional method is intimately related to the CTP coarse-grained effective action [59,7]. Hu et al. [47,6] first generalized the quantum mechanical problem of Brownian motion to quantum fields, taking the system and environment as two independent fields. Lombardo and Mazzitelli [50] treated the same problem, taking as system and environment the low and high frequency modes of a single, self-interacting scalar field. We will follow their treatment here.

We begin with a brief review of the problem of quantum Brownian motion. Denote by $x$ the coordinate of the Brownian particle, by $\Omega$ its frequency, and by $q_i$ the coordinates of the oscillators in the environment. The influence of the environment on the Brownian particle can be described by the reduced density matrix $\rho_r(x,x',t)$ that is obtained from the full density matrix by integrating out the environmental degrees of freedom $q_i$. For a linear coupling $xq_i$, the master equation for the reduced density matrix $\rho_r(x,x',t)$ is of the form [60]
$$i\hbar\,\partial_t\rho_r(x,x',t) = \langle x|[H,\rho_r]|x'\rangle - i\gamma(t)(x-x')(\partial_x - \partial_{x'})\rho_r(x,x',t) + f(t)(x-x')(\partial_x + \partial_{x'})\rho_r(x,x',t) - iD(t)(x-x')^2\rho_r(x,x',t), \tag{7.1}$$
where the coefficients $\gamma(t)$, $D(t)$ and $f(t)$ depend on the properties of the environment (temperature $\beta^{-1}$ and spectral density $I(\omega)$). The first term on the RHS of Eq. (7.1) gives the usual Liouvillian evolution. The second is a dissipative term with a time-dependent dissipative coefficient $\gamma(t)$. The last two are diffusive terms. The one proportional to the anomalous diffusion coefficient $f(t)$ does not produce decoherence, i.e., the off-diagonal terms in the density matrix are not suppressed in time. To see this in a simple example, assume that $\rho_r = \rho_0\,e^{-\eta(t)(x-x')^2}$. Inserting this into Eq. (7.1) it is easy to prove that the $f$-term produces an oscillating function $\eta(t)$ but no damping. (We refer the reader to Refs. [60,61] for a more detailed justification.) The last term, with diffusion coefficient
$$D(t) = \int_0^t ds\, \cos(\Omega s) \int_0^\infty d\omega\, I(\omega)\, \coth\left(\tfrac{1}{2}\beta\hbar\omega\right) \cos(\omega s), \tag{7.2}$$
gives the main contribution to decoherence. Indeed, an approximate solution of Eq. (7.1) is [56,60]
$$\rho_r[x,x';t] \approx \rho_r[x,x';0]\, \exp\left[ -(x-x')^2 \int_0^t D(s)\,ds \right], \tag{7.3}$$
and we see that the off-diagonal terms of the density matrix are suppressed as long as $\int_0^t D(s)\,ds$ is large enough. For nonlinear couplings like $x^nq_i^m$, one expects the master equation to contain terms of the form $iD^{(n,m)}(t)(x^n - x'^n)^2\rho_r$.

We now proceed to quantum field theory. The total density matrix (for the system and environment fields) is defined by
$$\rho[\phi_<,\phi_>,\phi'_<,\phi'_>,\eta] = \langle\phi_<\phi_>|\hat\rho|\phi'_<\phi'_>\rangle, \tag{7.4}$$
where $|\phi_<\rangle$ and $|\phi_>\rangle$ are the eigenstates of the field operators $\hat\phi_<$ and $\hat\phi_>$, respectively. For simplicity, we will assume that the interaction is turned on at the initial time $\eta_0$ and that, at this time, the system and the environment are not correlated. (The physical consequences of such a choice are elaborated in [56]. More general initial conditions can be introduced by a stipulated preparation function [62].) As such, the total density matrix can be written as the product of the density matrix operators for the system and for the bath:
$$\hat\rho[\eta_0] = \hat\rho_<[\eta_0]\,\hat\rho_>[\eta_0]. \tag{7.5}$$
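As a numerical aside on the QBM estimate of Eq. (7.3), the suppression of the off-diagonal elements can be evaluated for any given diffusion coefficient. The sketch below is only an illustration under stated assumptions: the function name `decoherence_factor`, the use of a simple trapezoidal rule, and the constant coefficient `D0` are hypothetical choices (with units in which the exponent of Eq. (7.3) is dimensionless), not results taken from the text.

```python
import math


def decoherence_factor(delta_x, diffusion, t, steps=1000):
    """Suppression factor exp[-(x - x')^2 * int_0^t D(s) ds] of Eq. (7.3).

    `diffusion` is a user-supplied callable D(s); the time integral is
    evaluated with the composite trapezoidal rule on `steps` intervals.
    """
    h = t / steps
    integral = 0.5 * (diffusion(0.0) + diffusion(t))
    for j in range(1, steps):
        integral += diffusion(j * h)
    integral *= h
    return math.exp(-delta_x ** 2 * integral)


# Hypothetical constant diffusion coefficient, for illustration only.
D0 = 2.0
factor = decoherence_factor(delta_x=0.5, diffusion=lambda s: D0, t=3.0)
```

For a constant coefficient the factor reduces to $\exp[-(x-x')^2 D_0 t]$, so elements further from the diagonal are suppressed faster; a time-dependent $D(s)$, such as the one in Eq. (7.2), can be passed in the same way.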
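We will further assume that the initial state of the environment is the vacuum. The reduced density matrix is defined by
$$\rho_r[\phi_<,\phi'_<,\eta] = \int \mathcal{D}\phi_>\, \rho[\phi_<,\phi_>,\phi'_<,\phi_>,\eta], \tag{7.6}$$
and evolves in time according to
$$\rho_r[\phi_{<f},\phi'_{<f},\eta] = \int d\phi_{<i} \int d\phi'_{<i}\, J_r[\phi_{<f},\phi'_{<f},\eta\,|\,\phi_{<i},\phi'_{<i},\eta_0]\, \rho_r[\phi_{<i},\phi'_{<i},\eta_0], \tag{7.7}$$
where $J_r[\eta,\eta_0]$ is the reduced evolution operator
$$J_r[\phi_{<f},\phi'_{<f},\eta\,|\,\phi_{<i},\phi'_{<i},\eta_0] = \int_{\phi_{<i}}^{\phi_{<f}} \mathcal{D}\phi_< \int_{\phi'_{<i}}^{\phi'_{<f}} \mathcal{D}\phi'_<\, \exp\left\{\frac{i}{\hbar}\left(S[\phi_<] - S[\phi'_<]\right)\right\} F[\phi_<,\phi'_<]. \tag{7.8}$$
The influence functional (or Feynman–Vernon functional) $F[\phi_<,\phi'_<]$ is defined as
$$F[\phi_<,\phi'_<] = \int d\phi_{>i} \int d\phi'_{>i}\, \rho_>[\phi_{>i},\phi'_{>i},\eta_0] \int d\phi_{>f} \int_{\phi_{>i}}^{\phi_{>f}} \mathcal{D}\phi_> \int_{\phi'_{>i}}^{\phi_{>f}} \mathcal{D}\phi'_>\, \exp\left\{\frac{i}{\hbar}\left(S[\phi_>] + S_{\rm int}[\phi_<,\phi_>] - S[\phi'_>] - S_{\rm int}[\phi'_<,\phi'_>]\right)\right\}. \tag{7.9}$$
We see that when the environment is initially in its vacuum state, the influence functional coincides with the CTP CGEA [59,7]. A treatment of two self-interacting fields via the influence functional leading to noise is contained in [8]. As the propagator $J_r$ is defined through a path integral, we can obtain an estimate using the saddle point approximation:
$$J_r[\phi_{<f},\phi'_{<f},\eta\,|\,\phi_{<i},\phi'_{<i},\eta_0] \approx \exp\left\{\frac{i}{\hbar}S_{\Lambda_c}[\phi^{\rm cl}_<,\phi'^{\rm cl}_<]\right\}, \tag{7.10}$$
where $\phi^{\rm cl}_<$ ($\phi'^{\rm cl}_<$) is the solution of the equation of motion $\delta\,\mathrm{Re}\,S_{\Lambda_c}/\delta\phi_<|_{\phi_<=\phi'_<} = 0$ with boundary conditions $\phi^{\rm cl}_<(\eta_0) = \phi_{<i}$ ($\phi'_{<i}$) and $\phi^{\rm cl}_<(\eta) = \phi_{<f}$ ($\phi'_{<f}$). This formula enables us to analyze the quantum to classical transition of the system field using the perturbative evaluation of the CTP CGEA described in the previous section.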
7.1. Master equation and decoherence of long wavelengths

The decoherence effects are contained in the imaginary part of the CTP CGEA, which is already of order $\lambda^2$. As a consequence, in the evaluation of $\mathrm{Im}\,S_{\Lambda_c}$ we can approximate $\phi^{\rm cl}_<$ by the solution of the free field equation satisfying the appropriate boundary conditions. For simplicity we will consider massless, conformally coupled fields. Therefore the classical solution reads
$$\phi^{\rm cl}_<(\vec{x},s) = \left[ \phi_{<f}\frac{\sin(k_0s)}{\sin(k_0\eta)} + \phi_{<i}\frac{\sin[k_0(\eta - s)]}{\sin(k_0\eta)} \right] \cos(\vec{k}_0\cdot\vec{x}) \equiv \phi^{\rm cl}_<(s)\cos(\vec{k}_0\cdot\vec{x}), \tag{7.11}$$
where we assumed that the system field contains only one Fourier mode with $\vec{k} = \vec{k}_0$. This is a sort of "minisuperspace" approximation for the system field that will greatly simplify the calculations.

As in the QBM problem, we can derive the diffusion coefficients from the master equation, which provide relevant information about decoherence. Hu et al. [47,6] derived the noise kernel of two interacting $\lambda\phi^4$ quantum fields in DS space and analyzed its behavior in relation to decoherence. Lombardo and Mazzitelli [50] did the same, with one self-interacting field split into two mode sectors. They followed the method proposed in [60], and computed the time derivative of the propagator $J_r$, eliminating the dependence on the initial field configurations $\phi_{<i}$ and $\phi'_{<i}$ that enters through $\phi^{\rm cl}_<$ and $\phi'^{\rm cl}_<$. The master equation is of the form
$$i\hbar\,\partial_\eta\rho_r[\phi_{<f},\phi'_{<f},\eta] = \langle\phi_{<f}|[\hat{H}_{\rm ren},\hat\rho_r]|\phi'_{<f}\rangle - i\lambda^2\left[ \frac{(\phi^3_{<f} - \phi'^3_{<f})^2\,V}{1152}D_1(k_0;\eta) + \frac{(\phi^2_{<f} - \phi'^2_{<f})^2\,V}{32}D_2(k_0;\eta) - \frac{(\phi_{<f} - \phi'_{<f})^2\,V}{6}D_3(k_0;\eta) \right]\rho_r[\phi_{<f},\phi'_{<f},\eta] + \cdots. \tag{7.12}$$
Due to the complexity of the equation, we only show the correction to the usual unitary evolution term coming from the noise kernels. This equation contains three time-dependent diffusion coefficients $D_i(\eta)$. Up to one loop, only $D_1$ and $D_2$ survive and are given by
$$\begin{aligned} D_1(k_0;\eta) &= \int_0^\eta ds\, \cos^3(k_0s)\, \mathrm{Im}\,G^{\Lambda_c}_{++}(3k_0;\eta - s) = \frac{1}{6k_0}\int_0^\eta ds\, \cos^3(k_0s)\cos(3k_0s)\,\theta(3k_0 - \Lambda_c) \\ &= \frac{2k_0\eta + 3\sin(2k_0\eta) + \frac{3}{2}\sin(4k_0\eta) + \frac{1}{3}\sin(6k_0\eta)}{576\,k_0^2}, \qquad \frac{\Lambda_c}{3} < k_0 < \Lambda_c, \end{aligned} \tag{7.13}$$
$$D_2(k_0;\eta) = \int_0^\eta ds\, \cos^2(k_0s)\left( \mathrm{Re}\,G^{\Lambda_c\,2}_{++}(2k_0;\eta - s) + 2\,\mathrm{Re}\,G^{\Lambda_c\,2}_{++}(0;\eta - s) \right). \tag{7.14}$$
Using that *c 2 (2k0 ; B Re G++
− s) = k0
+ *c 2 Re G++ (0; B
2k0 +*c
*c ∞
2k0 +*c
dp
dp
2k0 +p
*c
p+2k0
p−2k0
d z cos[(p + z)s]
d z cos[(p + z)s] ;
sin(2*c s) − s) = 25(s) − 2 s
(7.15)
;
(7.16)
the $D_2$ diffusion coefficient can be written as 3 *c *c 3 − − Si[2B(*c − k0 )] − 2 − Si[2*c B] D2 (k0 ; B) = 4 2 2k0 2k0 3 *c cos[2*c B] *c − Si[2B(*c + k0 )] − 1 + Si[2B(2k0 + *c )] + + 2 2k0 2k0 4k0 t cos[2B(*c + k0 )] cos[2B(*c − k0 )] cos[2B(2k0 + *c )] − ; (7.17) − + 4k0 B 4k0 B 4k0 B where Si[z] denotes the sine-integral function [63].

Eq. (7.12) is the field-theoretical version of the QBM master equation we were looking for. In our case, the system is coupled in a nonlinear form. Owing to the existence of three interaction terms ($\phi_<^3\phi_>$, $\phi_<^2\phi_>^2$, and $\phi_<\phi_>^3$) there are three diffusion coefficients in the master equation. The form of the coefficients is fixed by these couplings and by the particular choice of the quantum state of the environment.

Our results are valid in the single-mode approximation of Eq. (7.11). In this approximation one obtains a reduced density matrix for each mode $\vec{k}_0$, and neglects the interaction between different system modes. Due to this interaction, in the general case, $\rho_r$ will be different from $\prod_{\vec{k}_0}\rho_r(\vec{k}_0)$. This point deserves further study.

A detailed analysis of the quantum to classical transition in the model we are considering is a very complicated task. One should analyze in detail the master equation and see whether the off-diagonal elements of the reduced density matrix are suppressed or not. One should also study the form of the Wigner function, and see whether it predicts classical correlations or not [64,65]. Having in mind the analogy with the QBM, here we will only be concerned with the diffusive terms of the master equation. By examining how the values of the diffusion coefficients change in time, one can get an indication of how effective decoherence is [6,50]. This simplified method can only provide a rough approximation.

A mode in the system will decohere if the diffusion coefficients are different from zero during an appreciable period of time. Therefore we ask the following question: what is the maximum value of the cutoff $\Lambda_c$ such that, a few e-foldings after the initial time, all modes with $k_0 \le \Lambda_c$ still suffer the diffusive effects? The value of the diffusion coefficients at a given time depends on
the value of the dimensionless quantity $l = \Lambda_c\eta$. For the particular case of a DS spacetime we have $a(t) = \exp(Ht)$ and
$$l = \Lambda_c\eta = \frac{\Lambda_c}{H}\left(1 - \frac{1}{a}\right),$$
which, a few e-foldings after the initial time, is approximately given by $l = \Lambda_c/H$. We can distinguish two possibilities. When $l \sim 1$, both coefficients $D_1$ and $D_2$ are appreciably different from zero for all values of $k_0$. On the other hand, when $l \gg 1$ the situation is completely different: $D_1$ is very small everywhere while $D_2$ is also very small in the infrared sector. The diffusive effects are important only for $k_0 \sim \Lambda_c$.

The conclusion is that, in order to have decoherence for all modes with $k_0 < \Lambda_c$, we can include in our system only those modes with wavelength larger than $H^{-1}$. This is consistent with Starobinsky's original suggestion. If we include wavelengths shorter than $H^{-1}$, the frequency threshold in the environment increases, the infrared sector of the system cannot excite the environment, and therefore it does not decohere.

A more realistic calculation should include a time dependent cutoff [66], since the system should contain, at each time, the modes with $k_{\rm ph} = k_0/a < H$. More importantly, the scalar field should be minimally coupled ($\xi = 0$) to the curvature. In spite of this, we think that our example illustrates the main aspects of the problem. Indeed, one could repeat the calculations for a given mode of the minimally coupled scalar field, and obtain a master equation similar to our Eq. (7.12). The main difference would be that, for $\xi = 0$, the propagators in DS spacetime would differ from their flat-spacetime counterparts, Eqs. (6.13)–(6.15). (See Zhang's thesis [47].) The evaluation of the diffusion coefficients using curved spacetime propagators would be more realistic for discussions of structure formation in inflationary cosmology.

7.2. The Langevin equation

We now show how to derive a stochastic (Langevin) equation for the system field from the CTP CGEA.
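This equation takes into account the three fundamental effects of the environment on the system: renormalization, dissipation and noise. The real and imaginary parts of the CTP CGEA for our model are given in Eqs. (6.19) and (6.20). One can regard the imaginary part of $S_{\Lambda_c}$ as coming from three noise sources $\xi(x)$, $\nu(x)$, and $\eta(x)$ with a Gaussian functional probability distribution given by
$$\begin{aligned} P[\xi(x),\nu(x),\eta(x)] = N_\xi N_\nu N_\eta\, &\exp\left\{ -\frac{1}{2}\int d^4x \int d^4y\, \xi(x)\left[\frac{\lambda^2}{9}\mathrm{Im}\,G^{\Lambda_c}_{++}\right]^{-1}\xi(y) \right\} \\ \times &\exp\left\{ -\frac{1}{2}\int d^4x \int d^4y\, \nu(x)\left[\frac{\lambda^2}{2}\mathrm{Re}\,G^{\Lambda_c\,2}_{++}\right]^{-1}\nu(y) \right\} \\ \times &\exp\left\{ -\frac{1}{2}\int d^4x \int d^4y\, \eta(x)\left[\frac{-2\lambda^2}{3}\mathrm{Im}\,G^{\Lambda_c\,3}_{++}\right]^{-1}\eta(y) \right\}, \end{aligned} \tag{7.18}$$
where $N_\xi$, $N_\nu$, and $N_\eta$ are normalization factors. Indeed, we can write the imaginary part of the influence action as three functional integrals over the Gaussian fields $\xi(x)$, $\nu(x)$, and $\eta(x)$:
$$\begin{aligned} &\int \mathcal{D}\xi(x) \int \mathcal{D}\nu(x) \int \mathcal{D}\eta(x)\, P[\xi,\nu,\eta]\, \exp\left\{ -\frac{i}{\hbar}\left[ R_-(x)\xi(x) + Q_-(x)\nu(x) + \phi_-(x)\eta(x) \right] \right\} \\ &\quad = \exp\left\{ -\frac{i}{\hbar}\int d^4x \int d^4y \left[ \frac{\lambda^2}{18}R_-(x)\,\mathrm{Im}\,G^{\Lambda_c}_{++}(x,y)\,R_-(y) + \frac{\lambda^2}{4}Q_-(x)\,\mathrm{Re}\,G^{\Lambda_c\,2}_{++}(x,y)\,Q_-(y) - \frac{\lambda^2}{3}\phi_-(x)\,\mathrm{Im}\,G^{\Lambda_c\,3}_{++}(x,y)\,\phi_-(y) \right] \right\}. \end{aligned} \tag{7.19}$$
Therefore, the CTP CGEA can be rewritten as
$$S_{\Lambda_c}[\phi_<,\phi'_<] = -\frac{1}{i}\ln \int \mathcal{D}\xi\, P[\xi] \int \mathcal{D}\nu\, P[\nu] \int \mathcal{D}\eta\, P[\eta]\, \exp\left\{ iS_{\rm eff}[\phi_<,\phi'_<,\xi,\nu,\eta] \right\}, \tag{7.20}$$
where
$$S_{\rm eff}[\phi_<,\phi'_<,\xi,\nu,\eta] = \mathrm{Re}\,S_{\Lambda_c}[\phi_<,\phi'_<] - \int d^4x \left[ R_-(x)\xi(x) + Q_-(x)\nu(x) + \phi_-(x)\eta(x) \right]. \tag{7.21}$$
From this effective action it is easy to derive the stochastic field equation for the system:
$$\left.\frac{\partial S_{\rm eff}[\phi_<,\phi'_<,\xi,\nu,\eta]}{\partial\phi_<}\right|_{\phi_<=\phi'_<} = 0. \tag{7.22}$$
It is given by
$$\begin{aligned} &\Box\phi_< + \frac{\lambda}{6}\phi_<^3 + \frac{\lambda^2}{12}\phi_<^2(x)\int d^4y\, \theta(x^0 - y^0)\, \mathrm{Re}\,G^{\Lambda_c}_{++}(x,y)\,\phi_<^3(y) + \frac{\lambda^2}{4}\phi_<(x)\int d^4y\, \theta(x^0 - y^0)\, \mathrm{Im}\,G^{\Lambda_c\,2}_{++}(x,y)\,\phi_<^2(y) \\ &\quad + \frac{\lambda^2}{6}\int d^4y\, \theta(x^0 - y^0)\, \mathrm{Re}\,G^{\Lambda_c\,3}_{++}(x,y)\,\phi_<(y) = \frac{3}{2}\xi(x)\phi_<^2(x) + \nu(x)\phi_<(x) + \frac{1}{2}\eta(x). \end{aligned} \tag{7.23}$$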
The field equation is real and causal, as expected. We see that it contains multiplicative and additive colored noise. The nonlinear coupling between modes makes the Langevin equation much more complicated than the usually assumed white noise equation (see Section 3.3). Greiner and Müller [51] obtained a similar stochastic equation in flat spacetime, for a thermal environment. They analyzed in detail the dissipative terms in the Langevin equation. In particular they found explicit expressions for momentum dependent dissipation coefficients using a Markovian approximation for the soft modes.
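The qualitative structure of such an equation, with both multiplicative and additive colored noise, can be explored in a toy model. The sketch below is not the field equation (7.23): it replaces the nonlocal dissipation kernels by a local friction term, keeps a single mode amplitude, and models each noise source as an Ornstein–Uhlenbeck process with a finite correlation time. All names and parameter values (`simulate_mode`, `ou_step`, `gamma`, `tau`, `sigma_add`, `sigma_mult`) are assumptions made for illustration.

```python
import math
import random


def ou_step(x, tau, sigma, dt, rng):
    """One Euler-Maruyama step of an Ornstein-Uhlenbeck process with
    correlation time tau and stationary standard deviation sigma."""
    return x - (x / tau) * dt + sigma * math.sqrt(2.0 * dt / tau) * rng.gauss(0.0, 1.0)


def simulate_mode(steps, dt, omega=1.0, gamma=0.1, lam=0.1,
                  tau=0.5, sigma_add=0.05, sigma_mult=0.05, seed=0):
    """Integrate the toy equation
        phi'' + gamma*phi' + omega^2*phi + (lam/6)*phi^3 = nu(t)*phi + eta(t),
    where nu (multiplicative) and eta (additive) are colored noises."""
    rng = random.Random(seed)
    phi, mom = 1.0, 0.0      # initial amplitude and conjugate momentum
    eta = nu = 0.0           # noise sources start at zero
    trajectory = [phi]
    for _ in range(steps):
        eta = ou_step(eta, tau, sigma_add, dt, rng)
        nu = ou_step(nu, tau, sigma_mult, dt, rng)
        force = (-omega ** 2 * phi - lam * phi ** 3 / 6.0
                 - gamma * mom + nu * phi + eta)
        mom += force * dt    # semi-implicit Euler: update momentum first,
        phi += mom * dt      # then the amplitude with the new momentum
        trajectory.append(phi)
    return trajectory


noisy = simulate_mode(steps=5000, dt=0.01)
quiet = simulate_mode(steps=10000, dt=0.01, sigma_add=0.0, sigma_mult=0.0)
```

With the noise switched off the amplitude simply decays under the friction term, while with the noise on it keeps fluctuating at late times; the finite correlation time `tau` is what distinguishes this from the white noise idealization.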
8. Renormalization group from CTP CGEA

8.1. Towards a nonperturbative evaluation of the CTP CGEA: the exact RG equation

The main drawback of the results presented in the previous sections is that we have evaluated the CTP CGEA only perturbatively. As is well known, several applications, in particular the analysis of phase transitions in the early universe and in condensed matter physics, require nonperturbative calculations. In this section we will derive an exact evolution equation for the dependence of the CGEA on the coarse graining scale [52].

In order to simplify the notation, in this section we will denote by $\Lambda_0$ the ultraviolet cutoff and by $\Lambda$ the coarse graining scale. The CTP CGEA interpolates between the bare theory at $\Lambda = \Lambda_0$ and the physical theory at the scale $\Lambda$. We will now consider a $\lambda\phi^4$ field theory in Minkowski spacetime.

To derive such an exact evolution equation, we follow the approach of Wegner and Houghton [40], which is designed for Euclidean spacetime. Therefore we will invoke the Euclidean AEA of Wetterich et al. [37,67–71]. (As already mentioned, the main difference between both actions is that the Euclidean action averages the field over a space-time volume, while our CTP CGEA averages the field over a spatial volume, and is therefore better suited to the study of non-equilibrium conditions.) The Euclidean CGEA is defined by
$$e^{-S_\Lambda(\phi)} \equiv \int_{\Lambda_0>q>\Lambda} \mathcal{D}[\phi(q)]\, e^{-S_{\rm cl}[\phi]}. \tag{8.1}$$
By considering an infinitesimal variation $\Lambda \to \Lambda - \delta\Lambda$ it is possible to obtain an evolution equation for $S_\Lambda$, which reads
$$\Lambda\frac{\partial S_\Lambda}{\partial\Lambda} = -\frac{\Lambda}{2\delta\Lambda}\left[ \int' \frac{d^4q}{(2\pi)^4} \ln\left(\frac{\delta^2S_\Lambda}{\delta\phi_q\,\delta\phi_{-q}}\right) - \int' \frac{d^4q}{(2\pi)^4} \frac{\delta S_\Lambda}{\delta\phi_q}\frac{\delta S_\Lambda}{\delta\phi_{-q}}\left(\frac{\delta^2S_\Lambda}{\delta\phi_q\,\delta\phi_{-q}}\right)^{-1} \right] \tag{8.2}$$
(in the original equation of Wegner and Houghton there are additional terms coming from a rescaling of the modes after the coarse graining). The prime in the momentum integrals means that the integration is restricted to the shell $\Lambda > q > \Lambda - \delta\Lambda$. A typical approximation taken to solve this evolution equation is to assume that $S_\Lambda$ is of the form
$$S_\Lambda[\phi] = \int d^4x \left[ \frac{1}{2}(\partial_\mu\phi)^2 + V_\Lambda(\phi) \right]. \tag{8.3}$$
Eq. (8.2) reduces in this case to an evolution equation for the effective potential $V_\Lambda(\phi)$. We will discuss this kind of approximation for the CTP CGEA in the next section.

Now we start the CTP calculation by writing the CGEA for a scale $\Lambda - \delta\Lambda$, namely
$$e^{iS_{\Lambda-\delta\Lambda}(\phi_+,\phi_-)} \equiv \int_{\Lambda_0>|\vec{q}|>\Lambda-\delta\Lambda} \mathcal{D}[\phi_+(\vec{q},t)]\,\mathcal{D}[\phi_-(\vec{q},t)]\, e^{iS_{\rm cl}[\phi_+,\phi_-]}. \tag{8.4}$$
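The modes to be integrated can be split into two parts: one within the shell $\Lambda > |\vec{q}| > \Lambda - \delta\Lambda$ and another containing modes with $\Lambda_0 > |\vec{q}| > \Lambda$. Expanding the action in powers of the modes within the shell, one obtains
$$e^{iS_{\Lambda-\delta\Lambda}(\phi_+,\phi_-)} = e^{iS_\Lambda(\phi_+,\phi_-)} \times \int_{\Lambda>|\vec{q}|>\Lambda-\delta\Lambda} \mathcal{D}[\phi_+]\,\mathcal{D}[\phi_-]\, e^{i(S_1+S_2+S_3)}\, \exp\left\{ \frac{i}{2}\int \frac{d^3q}{(2\pi)^3}\int dt\, \frac{d}{dt}\left(\phi_a(-\vec{q},t)\,\dot\phi_b(\vec{q},t)\,g_{ab}\right) \right\}, \tag{8.5}$$
where
$$S_1 = \int dt \int \frac{d^3q}{(2\pi)^3}\, \frac{\partial S_\Lambda}{\partial\phi_a(-\vec{q},t)}\,\phi_a(\vec{q},t), \qquad S_2 = \frac{1}{2}\int dt\,dt' \int \frac{d^3q}{(2\pi)^3}\, \phi_a(\vec{q},t)\, \frac{\partial^2S_\Lambda}{\partial\phi_a(-\vec{q},t)\,\partial\phi_b(\vec{q},t')}\, \phi_b(\vec{q},t'). \tag{8.6}$$
In taking the functional derivatives of $S_\Lambda$ (which contains modes whose wave vectors satisfy $|\vec{q}| < \Lambda$) the modes within the shell are set to zero. We use the notation
$$\phi_a(\vec{q},t) = \begin{pmatrix} \phi_+(\vec{q},t) \\ \phi_-(\vec{q},t) \end{pmatrix}, \qquad g_{ab} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{8.7}$$
The $S_3$ term is cubic in the modes within the shell and, as in the Euclidean case, it does not contribute in the limit $\delta\Lambda \to 0$ (basically, this is because one is doing a one loop calculation for the shell modes).

The functional integrals over the shell modes have the CTP boundary conditions. A comment about the last exponential factor in Eq. (8.5) is in order. Usually one discards it because it is a surface term, but in the CTP formalism it must be kept since the boundary conditions are that $\phi_+(\vec{q},T) = \phi_-(\vec{q},T)$ with $T \to \infty$ for the modes $\vec{q}$ within the shell.

In order to evaluate the functional integrals we split the field as $\phi_a = \bar\phi_a + \varphi_a$ and impose the boundary conditions on the "classical" fields $\bar\phi_\pm$, i.e. they vanish in the past $-T$ (negative and positive frequencies respectively) and match on the Cauchy surface at time $T$. The fluctuations $\varphi_a$ vanish both in the past and in the future. The classical fields are solutions to
$$\left(-\frac{d^2}{dt^2} - q^2\right)g_{ab}\,\bar\phi_b(\vec{q},t) + \int dt'\, \frac{\partial^2S_{\rm int}}{\partial\varphi_a(-\vec{q},t)\,\partial\varphi_b(\vec{q},t')}\,\bar\phi_b(\vec{q},t') = 0, \tag{8.8}$$
where we have split the CGEA as $S_\Lambda(\phi_\pm) = S_{\rm kin}(\phi_\pm) + S_{\rm int}(\phi_\pm)$ with
$$S_{\rm kin} = \int d^4x \left[ \frac{1}{2}(\partial_\mu\phi_+)^2 + \frac{i\epsilon}{2}\phi_+^2 \right] - \int d^4x \left[ \frac{1}{2}(\partial_\mu\phi_-)^2 - \frac{i\epsilon}{2}\phi_-^2 \right]. \tag{8.9}$$
As before, in the functional derivatives the modes within the shell are set to zero.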
Let $h_a$ be solutions to Eq. (8.8), vanishing in the past and satisfying an arbitrary normalization in the future, and let $\phi(\vec{q})$ be the common value of the fields taken in the future. We can then write
$$\bar\phi_a(\vec{q},t) = \phi(\vec{q})\,\frac{h_a(\vec{q},t)}{h_a(\vec{q},T)}. \tag{8.10}$$
We first integrate over the common value $\phi(\vec{q})$ and then proceed with the functional integration over the fluctuations $\varphi_a$ (both are Gaussian integrals with "source" terms). One finally gets
$$\begin{aligned} \Lambda\frac{\partial S_\Lambda}{\partial\Lambda} ={}& -\frac{i\Lambda}{2\delta\Lambda}\int \frac{d^3q}{(2\pi)^3} \ln\left[ \frac{\dot{h}_+(\vec{q},T)}{h_+(\vec{q},T)} - \frac{\dot{h}_-(\vec{q},T)}{h_-(\vec{q},T)} \right] \\ &+ \frac{\Lambda}{2\delta\Lambda}\int \frac{d^3q}{(2\pi)^3} \left[ \frac{\dot{h}_+(\vec{q},T)}{h_+(\vec{q},T)} - \frac{\dot{h}_-(\vec{q},T)}{h_-(\vec{q},T)} \right]^{-1} \left[ \int dt\, \frac{h_a(\vec{q},t)}{h_a(\vec{q},T)}\frac{\partial S_\Lambda}{\partial\varphi_a(-\vec{q},t)} \right]^2 \\ &- \frac{i\Lambda}{2\delta\Lambda}\ln{\det}'(A_{ab}) + \frac{\Lambda}{2\delta\Lambda}\int dt\,dt' \int \frac{d^3q}{(2\pi)^3}\, \frac{\partial S_\Lambda}{\partial\varphi_a(\vec{q},t)}\, A^{-1}_{ab}(-\vec{q},t;\vec{q},t')\, \frac{\partial S_\Lambda}{\partial\varphi_b(\vec{q},t')}. \end{aligned} \tag{8.11}$$
The $2\times2$ matrix $A_{ab}$ has the following elements:
$$\begin{aligned} A_{++}(-\vec{q},t;\vec{q}\,',t') &= \left(-\frac{d^2}{dt^2} - q^2 + i\epsilon\right)\delta(t-t')\,\delta^3(\vec{q}+\vec{q}\,') + \frac{\partial^2S_{\rm int}}{\partial\varphi_+(-\vec{q},t)\,\partial\varphi_+(\vec{q}\,',t')}, \\ A_{--}(-\vec{q},t;\vec{q}\,',t') &= \left(\frac{d^2}{dt^2} + q^2 + i\epsilon\right)\delta(t-t')\,\delta^3(\vec{q}+\vec{q}\,') + \frac{\partial^2S_{\rm int}}{\partial\varphi_-(-\vec{q},t)\,\partial\varphi_-(\vec{q}\,',t')}, \\ A_{+-}(-\vec{q},t;\vec{q}\,',t') &= A_{-+}(\vec{q}\,',t';-\vec{q},t) = \frac{\partial^2S_{\rm int}}{\partial\varphi_+(-\vec{q},t)\,\partial\varphi_-(\vec{q}\,',t')}. \end{aligned} \tag{8.12}$$
The primed determinant must be calculated as the product of the eigenvalues of $A_{ab}$ in a space of functions with wave vectors within the shell ($\Lambda - \delta\Lambda < |\vec{q}| < \Lambda$) and satisfying null conditions both in the past and in the future. Similar conditions are to be used to evaluate the inverse $A^{-1}_{ab}$.

Eq. (8.11) is exact in the sense that no perturbative approximation has so far been used. It is similar to its Euclidean counterpart (8.2), but involves two fields and CTP boundary conditions. It contains all the information on the influence of the short wavelength modes on the long wavelength ones, and should be the starting point for a nonperturbative analysis of decoherence, dissipation, domain formation and out of equilibrium evolution.

8.2. Derivative expansion

The exact renormalization group equation is too complex to be solved without some approximation. The usual ones are expansions in the number of powers of the fields (see Ref. [72] for a detailed analysis) or in derivatives of them [73–75]. In the following we shall make use of the derivative expansion approach.
E.A. Calzetta et al. / Physics Reports 352 (2001) 459–520
We will prove that, within this approach, the exact RG Eq. (8.11) admits a solution of the form
$$
S_\Lambda(\phi_+,\phi_-)=S_\Lambda(\phi_+)-S_\Lambda(\phi_-) .
\tag{8.13}
$$
Clearly this is not the most general form that can be imagined for the CGEA, because contributions mixing both fields are not taken into account. The main drawback of this approach is therefore that we miss the stochastic aspects of the theory, since there can be no noise terms. However, the proposed form for the CGEA will be enough for studying the renormalization group flow of real time field theories. The great technical advantage of the form Eq. (8.13) is that the second functional derivative of the action has no crossed terms, leading to a diagonal matrix $A_{ab}$ whose determinant is easily computed as the product of two determinants, one for $A_{++}$ and one for $A_{--}$. Following Ref. [76] one can express both $\det A_{++}$ and $\det A_{--}$ as the product over momenta of a constant (momentum independent) times the mode $h(\vec q,T)$ evaluated at the final time $T$. Therefore the determinant term of the exact RG equation can be written as
$$
\ln\det{}'(A_{ab})=\ln[\det{}'(A_{++})\,\det{}'(A_{--})]=\int\frac{d^3q}{(2\pi)^3}\,\ln\big(h_+(\vec q,T)\,h_-(\vec q,T)\big) .
\tag{8.14}
$$
The first and the third terms can then be cast in the form of a single logarithm, and we arrive at
$$
\begin{aligned}
\Lambda\frac{\partial S_\Lambda}{\partial\Lambda}={}&-\frac{i\Lambda}{2\delta\Lambda}\int\frac{d^3q}{(2\pi)^3}\,\ln\big(h_-(\vec q,T)\,\dot h_+(\vec q,T)-h_+(\vec q,T)\,\dot h_-(\vec q,T)\big)\\
&+\frac{\Lambda}{2\delta\Lambda}\int\frac{d^3q}{(2\pi)^3}\left[\frac{\dot h_+(\vec q,T)}{h_+(\vec q,T)}-\frac{\dot h_-(\vec q,T)}{h_-(\vec q,T)}\right]^{-1}\left[\int dt\,\frac{h_a(\vec q,t)}{h_a(\vec q,T)}\,\frac{\partial S_\Lambda}{\partial\varphi_a(-\vec q,t)}\right]^2\\
&+\frac{\Lambda}{2\delta\Lambda}\int dt\,dt'\int\frac{d^3q}{(2\pi)^3}\,\frac{\partial S_\Lambda}{\partial\varphi_a(\vec q,t)}\,A^{-1}_{ab}(-\vec q,t;\vec q,t')\,\frac{\partial S_\Lambda}{\partial\varphi_b(\vec q,t')} .
\end{aligned}
\tag{8.15}
$$
Note that the equations for the two modes $h_+$ and $h_-$ (Eq. (8.8)) simplify considerably, since the two equations are decoupled. What we still have to prove is that the proposed form for the action makes the r.h.s. of the exact RG equation split in the same form. Next we perform a derivative expansion of the interaction term. As our coarse graining explicitly breaks Lorentz invariance, we allow different coefficients for the temporal and spatial derivatives, namely
$$
S_{\rm int}(\phi_\pm)=\int d^4x\left[-V_\Lambda(\phi_\pm)+\frac12 Z_\Lambda(\phi_\pm)\,\dot\phi_\pm^2-\frac12 Y_\Lambda(\phi_\pm)\,(\vec\nabla\phi_\pm)^2+\cdots\right] .
\tag{8.16}
$$
We expand the fields around a time dependent background, $\phi_\pm=\phi_\pm(t)+\varphi_\pm(\vec x,t)$, and Fourier transform in space. We shall solve Eq. (8.8) for the modes to zeroth order in the inhomogeneities, i.e. we equate terms in the equations for $h_\pm$ that are independent of the $\varphi_\pm$'s. Since the first functional derivative of the CGEA ($S'$) is linear in the inhomogeneities $\varphi_\pm$, we put $S'=0$ and keep the $\varphi_\pm$-independent contributions to $S''_{\rm int}$. After a little algebra and functional derivations, we get
$$
\frac{\partial^2S_{\rm int}}{\partial\varphi(\vec q,t)\,\partial\varphi(-\vec q\,',t')}=\left[-V''-\frac12Z''\dot\phi^2-Yq^2-Z'\dot\phi\frac{d}{dt}-Z\frac{d^2}{dt^2}-Z'\ddot\phi+\cdots\right]\delta(t-t')\,\delta^3(\vec q-\vec q\,') ,
\tag{8.17}
$$
where the primes denote derivation with respect to the field and the ellipsis denotes terms linear in the fluctuations. In this expression and hereafter we omit (unless explicitly stated) the $\pm$ subscripts in the background fields $\phi_\pm(t)$, in the potential $V_\Lambda(\phi_\pm(t))$, and in the wave function factors $Z_\Lambda(\phi_\pm(t))$ and $Y_\Lambda(\phi_\pm(t))$. Note that the effective mass of the modes depends on the time-dependent background $\phi(t)$. The equations of motion for the modes $h_a$ become localized and take the form of harmonic oscillators with variable frequency and a damping term. The boundary conditions to be imposed are the aforementioned CTP ones. If one defines new modes as $f(\vec q,t)=(1+Z_\Lambda)^{1/2}h(\vec q,t)$, the damping terms cancel out and the new modes are harmonic oscillators with frequency
$$
w_q^2(t)=q^2\,\frac{1+Y_\Lambda}{1+Z_\Lambda}+\frac{V''_\Lambda}{1+Z_\Lambda}+\frac14\frac{Z'^2_\Lambda}{(1+Z_\Lambda)^2}\,\dot\phi^2+\frac12\frac{Z'_\Lambda}{1+Z_\Lambda}\,\ddot\phi .
\tag{8.18}
$$
Using an adiabatic expansion for the modes,
$$
h_\pm(\vec q,t)=(1+Z_\Lambda)^{-1/2}\,\frac{1}{\sqrt{2W_\pm(\vec q,t)}}\,\exp\left[\pm i\int_{-T}^{t}W_\pm(\vec q,t')\,dt'\right] ,
\tag{8.19}
$$
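we can easily evaluate the logarithmic term in the r.h.s. of the exact RG equation (8.15), which is the only term that survives in the approximation we are working in. The RG equation (8.15) reduces to
$$
\int dt\,\Lambda\left[-\frac{dV_\Lambda(\phi_+)}{d\Lambda}+\frac12\frac{dZ_\Lambda(\phi_+)}{d\Lambda}\dot\phi_+^2+\frac{dV_\Lambda(\phi_-)}{d\Lambda}-\frac12\frac{dZ_\Lambda(\phi_-)}{d\Lambda}\dot\phi_-^2\right]=\frac{\Lambda}{2\delta\Lambda}\int\frac{d^3q}{(2\pi)^3}\int dt\,[W_+(\vec q,t)-W_-(\vec q,t)] .
\tag{8.20}
$$
In the adiabatic expansion, the $W$'s read
$$
W^2=A_\Lambda+B_\Lambda\,\dot\phi^2(t)+C_\Lambda\,\ddot\phi(t) ,
\tag{8.21}
$$
where the coefficients are
$$
A_\Lambda=\Lambda^2\,\frac{1+Y_\Lambda}{1+Z_\Lambda}+\frac{V''_\Lambda}{1+Z_\Lambda},\qquad
B_\Lambda=\frac{Z'^2_\Lambda}{4(1+Z_\Lambda)^2}+\frac{5A'^2_\Lambda}{16A_\Lambda^2}-\frac{A''_\Lambda}{4A_\Lambda},\qquad
C_\Lambda=\frac{Z'_\Lambda}{2(1+Z_\Lambda)}-\frac{A'_\Lambda}{4A_\Lambda} .
\tag{8.22}
$$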
Integrating by parts we get
$$
\int dt\left[-\Lambda\frac{dV_\Lambda}{d\Lambda}+\frac12\Lambda\frac{dZ_\Lambda}{d\Lambda}\,\dot\phi^2\right]=\frac{\Lambda^3}{4\pi^2}\int dt\left\{\sqrt{A_\Lambda}+\frac12\dot\phi^2\left[\frac{B_\Lambda}{\sqrt{A_\Lambda}}-\left(\frac{C_\Lambda}{\sqrt{A_\Lambda}}\right)'\right]\right\} .
\tag{8.23}
$$
Therefore the dependence of the potential and the wave function renormalization on the infrared scale is given by
$$
\Lambda\frac{dV_\Lambda}{d\Lambda}=-\frac{\Lambda^3}{4\pi^2}\sqrt{\Lambda^2\,\frac{1+Y_\Lambda}{1+Z_\Lambda}+\frac{V''_\Lambda}{1+Z_\Lambda}},\qquad
\Lambda\frac{dZ_\Lambda}{d\Lambda}=\frac{\Lambda^3}{4\pi^2}\left[\frac{B_\Lambda}{\sqrt{A_\Lambda}}-\left(\frac{C_\Lambda}{\sqrt{A_\Lambda}}\right)'\right] .
\tag{8.24}
$$
These equations are valid both for the $+$ field and for the $-$ field. They describe the flow of the CGEA with the infrared scale in the derivative expansion of the exact CTP renormalization group equation. It is interesting to note that the higher derivative terms modify the differential equation for the effective potential. We have obtained two equations for the three independent unknown functions $V_\Lambda$, $Z_\Lambda$ and $Y_\Lambda$. In order to find an additional relation between the spatial and temporal wave function renormalization functions $Z_\Lambda$ and $Y_\Lambda$, it is necessary to write the exact RGE up to quadratic order in the inhomogeneities. We will not present this long calculation here. For simplicity, we will assume that $Z_\Lambda$ and $Y_\Lambda$ are small numbers, and therefore we will set them to zero on the r.h.s. of Eq. (8.24). This assumption is partially confirmed by numerical calculations [52]. Note that in this approximation we recover the RG improved equation proposed in Ref. [77] for the coarse grained effective potential
$$
\Lambda\frac{dV_\Lambda}{d\Lambda}=-\frac{\Lambda^3}{4\pi^2}\sqrt{\Lambda^2+V''_\Lambda} .
\tag{8.25}
$$
There are other points worth noting. First, when we replace $V_\Lambda$, $Z_\Lambda$ and $Y_\Lambda$ by their classical values $V_\Lambda=V$, $Z_\Lambda=Y_\Lambda=0$ on the r.h.s. of both equations, we obtain the one loop evolution equations [52]. Second, while in the one loop approximation it is possible to take the limit $\Lambda_0\to\infty$ (the infinities can be absorbed into the bare mass and coupling constant), in this nonperturbative calculation it is not possible to renormalize the theory (as is the case for Hartree, Gaussian and $1/N$ approximations). For these reasons we keep $\Lambda_0$ as a large (compared with the mass) but finite number.
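The prefactor $\Lambda^3/4\pi^2$ in Eqs. (8.24) and (8.25) comes from the thin-shell limit of the momentum integral in Eq. (8.20). The following is a minimal numerical sketch of that step, not the authors' code: the function names, the midpoint quadrature and the sample values (a constant $V''=m^2$, as for a quadratic potential) are illustrative assumptions.

```python
import math

def shell_average(f, lam, dlam, n=2000):
    """(lam / (2*dlam)) times the integral of f(|q|) d^3q/(2*pi)^3 over the
    shell lam - dlam < |q| < lam, evaluated with the midpoint rule."""
    step = dlam / n
    total = 0.0
    for i in range(n):
        q = lam - dlam + (i + 0.5) * step
        total += 4.0 * math.pi * q * q * f(q) * step
    return lam / (2.0 * dlam) * total / (2.0 * math.pi) ** 3

def thin_shell_limit(f, lam):
    """Limit dlam -> 0 of shell_average: lam^3 f(lam) / (4 pi^2)."""
    return lam ** 3 * f(lam) / (4.0 * math.pi ** 2)

def flow_rhs(lam, v_second):
    """Right-hand side of Eq. (8.25) for a given value of V''."""
    return -lam ** 3 / (4.0 * math.pi ** 2) * math.sqrt(lam ** 2 + v_second)

# Illustrative numbers only (assumed, not taken from the text).
lam, dlam, m2 = 10.0, 1e-4, 1.0
frequency = lambda q: math.sqrt(q * q + m2)   # W at lowest adiabatic order
numeric = shell_average(frequency, lam, dlam)
analytic = thin_shell_limit(frequency, lam)
```

For a thin shell the two numbers agree to the order of `dlam / lam`, and `flow_rhs(lam, m2)` equals `-analytic`, which is the content of Eq. (8.25) at fixed background field.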
Once the functions $V_\Lambda$ and $Z_\Lambda$ are known, one can write the effective dynamical equations for the coarse grained field.

Part IV. Renormalization group in semiclassical gravity

9. Renormalization group and stochastic semiclassical gravity

In this section we will describe a different relation between the RG and the CTP CGEA. We will consider the backreaction of quantum matter fields on the spacetime geometry, assumed classical. To analyze this problem, we will use the formulation of quantum open systems and compute the CTP CGEA using an expansion in powers of the spacetime curvature and discuss its relation with the RG equations. The usual approach to analyze backreaction in semiclassical gravity is based on the use of the semiclassical Einstein equations (SEE) [78]
$$
\frac{1}{8\pi G}\left(R_{\mu\nu}-\frac12Rg_{\mu\nu}\right)-\alpha H^{(1)}_{\mu\nu}-\beta H^{(2)}_{\mu\nu}=T^{\rm clas}_{\mu\nu}+\langle T_{\mu\nu}\rangle .
\tag{9.1}
$$
In the SEE, which can be derived from the real part of the CTP CGEA [1,2,39], the effect of quantum matter fields is taken into account by including as a source the quantum mean value of the energy–momentum tensor. The terms proportional to
$$
H^{(1)}_{\mu\nu}=[4R_{;\mu\nu}-4g_{\mu\nu}\Box R]+O(R^2)
\tag{9.2}
$$
and
$$
H^{(2)}_{\mu\nu}=[4R_{\mu\alpha;\nu}{}^{;\alpha}-2\Box R_{\mu\nu}-g_{\mu\nu}\Box R]+O(R^2)
\tag{9.3}
$$
come from terms quadratic in the curvature in the gravitational action, which are needed to renormalize the theory. These equations cannot provide a full description of the problem [3], since they do not take into account the fluctuations of the energy momentum tensor around its mean value. The fluctuations can be incorporated by including an additional stochastic term [7,79,80] on the right hand side of Eq. (9.1). This noise term can be derived from the imaginary part of the CTP CGEA, in the same way we proceeded for the soft modes of the scalar field (see Section 7.2). When it is incorporated in the SEE, one obtains the "Einstein–Langevin Equations" (ELE), which include both the dissipative and diffusive effects of the quantum matter on the geometry of spacetime [7], in complete analogy with what happens in quantum Brownian motion typical of quantum open systems. For a review of the semiclassical stochastic gravity program based on the ELE, see [81]. The ELE have been derived for arbitrary small metric perturbations conformally coupled to a massless quantum scalar field in a spatially flat background [80], and, in a cosmological setting, for a massive field in a spatially flat RW universe [82], and in a Bianchi type-I spacetime [83]. Further elaboration of its physical meaning can be found in the papers by Verdaguer and Roura [84]. In Ref. [85] it is proven that the ELE may be used to compute certain quantum averages, even in conditions where there is no decoherence. Here we present the derivation of the ELE based on the renormalization group equations of Lombardo and Mazzitelli [86]. Using a covariant expansion in powers of the curvature we will see that, for massless quantum fields, the ELE are determined to leading order by the running couplings of the theory. This example will show an interesting connection between the "usual" renormalization group and the different versions of the effective action.
Indeed, in a naive "Wilsonian" approach, some quantum effects can be taken into account by replacing the coupling constants of the theory by their running counterparts. However, the (usual) running of coupling constants is defined in momentum space, and a careful implementation of the Wilsonian approach leads to a nonlocal effective action in configuration space. We will compute explicitly this nonlocal effective action first in its Euclidean version, and show that it coincides with calculations based on resummations of the Schwinger–DeWitt expansion. Using general relations between Euclidean propagators and CTP propagators, we will be able to obtain the CTP CGEA from the Euclidean effective action. From the CTP CGEA we will derive the ELE, thus showing that they are a consequence of the usual renormalization group equations of the theory. As an application, we will compute the leading quantum corrections to the Newtonian potential.

Consider a quantum scalar field $\phi$ on a classical, Euclidean curved background. The classical action is given by
$$
S=S_{\rm grav}+S_{\rm matter} ,
\tag{9.4}
$$
where
$$
S_{\rm grav}=-\int d^4x\sqrt g\left[\frac{1}{16\pi G_0}(R-2\Lambda_0)+\alpha_0R^2+\beta_0R_{\mu\nu}R^{\mu\nu}\right]
\tag{9.5}
$$
and
$$
S_{\rm matter}=\frac12\int d^4x\sqrt g\,[\partial_\mu\phi\,\partial^\mu\phi+m^2\phi^2+\xi R\phi^2] .
\tag{9.6}
$$
Here $\xi$ is the coupling to the curvature. $G_0$, $\Lambda_0$, and the dimensionless constants $\alpha_0$ and $\beta_0$ are bare constants. The effective action for this theory is a complicated, nonlocal object. It is defined by integrating out the quantum scalar field, that is
$$
e^{-S_{\rm eff}}=N\int D\phi\,e^{-S[g_{\mu\nu},\phi]} ,
\tag{9.7}
$$
where $N$ is a normalization constant. It is in general not possible to find a closed form for it. If we compute it using a covariant expansion in powers of the curvature, the different terms must be constructed with the Riemann tensor and its derivatives $\nabla\nabla\ldots R$. Using integration by parts and the Gauss–Bonnet identity, the effective action can be written only in terms of the Ricci tensor $R_{\mu\nu}$ and $R$ [87]. These arguments suggest that the effective action must have the general form
$$
S_{\rm eff}=-\int d^4x\sqrt g\left[\frac{1}{16\pi G}R+\alpha R^2+\beta R_{\mu\nu}R^{\mu\nu}\right]+\frac{1}{32\pi^2}\int d^4x\sqrt g\,[F_0R+RF_1(\Box)R+R_{\mu\nu}F_2(\Box)R^{\mu\nu}+\cdots] ,
\tag{9.8}
$$
where the ellipsis denotes terms cubic in the curvature. For simplicity, in the above equation and in what follows we will omit the cosmological constant term. Note that the bare constants in Eq. (9.5) have been replaced by dressed couplings in Eq. (9.8). The expansion is adequate for weak gravitational fields, i.e. $\nabla\nabla R\gg R^2$. Up to this order, all the information about the effect of the quantum field is encoded in the constant $F_0$ and in the form factors $F_1$ and $F_2$. The form factors are, in general, nonlocal two point functions constructed with the d'Alembertian $\Box$ and the parameters $\xi$ and $m^2$. $F_0$, $F_1$, and $F_2$ also depend on an energy scale $\mu$, introduced by the regularization method.
The dressed coupling constants depend on the energy scale $\mu$ according to the RG equations. Using minimal subtraction these equations read [11]
$$
\mu\frac{dG}{d\mu}=\frac{G^2m^2}{\pi}\left(\xi-\frac16\right) ,
\tag{9.9}
$$
$$
\mu\frac{d\alpha}{d\mu}=-\frac{1}{32\pi^2}\left[\left(\xi-\frac16\right)^2-\frac{1}{90}\right] ,
\tag{9.10}
$$
$$
\mu\frac{d\beta}{d\mu}=-\frac{1}{960\pi^2} .
\tag{9.11}
$$
The dependence of $F_0$, $F_1$ and $F_2$ on $\mu$ is such that the full effective action is $\mu$-independent. For example, from Eqs. (9.8) and (9.9), we see that $F_0=m^2\ln(m^2/\mu^2)(\xi-\frac16)+{\rm const}$. When the scalar field is massless, this information is enough to fix the form factors completely. Indeed, as the $F_i$, $i=1,2$, are dimensionless two point functions, by simple dimensional analysis we obtain $F_i(\Box;\mu^2;\xi)=F_i(\Box/\mu^2;\xi)$. Inserting this into Eq. (9.8), using Eqs. (9.10) and (9.11), and the fact that $S_{\rm eff}$ must be independent of $\mu$, we obtain
$$
F_1(\Box)=\frac12\left[\left(\xi-\frac16\right)^2-\frac{1}{90}\right]\ln\left(\frac{-\Box}{\mu^2}\right)+{\rm const},\qquad
F_2(\Box)=\frac{1}{60}\ln\left(\frac{-\Box}{\mu^2}\right)+{\rm const} .
\tag{9.12}
$$
The final result for the effective action has a clear interpretation: it is just the classical action in which the coupling constants $\alpha$ and $\beta$ have been replaced by nonlocal two point functions that take into account their running in configuration space. For a massive field, the situation is more complex because there is an additional dimensional parameter. The form factors also depend on $m^2/\mu^2$, and the $\mu$-independence of the effective action is not enough to fix their form. They have already been computed in the literature [88,89]:
$$
F_i(\Box)=\int_0^1d\gamma\,\chi_i(\xi;\gamma)\,\ln\left[\frac{m^2-\frac14(1-\gamma^2)\Box}{\mu^2}\right] ,
\tag{9.13}
$$
where
$$
\chi_1(\xi;\gamma)=\frac12\left[\xi^2-\frac12\xi(1-\gamma^2)+\frac{1}{48}(3-6\gamma^2-\gamma^4)\right],\qquad
\chi_2(\xi;\gamma)=\frac{1}{12}\gamma^4 .
\tag{9.14}
$$
These equations can be obtained through a covariant perturbation expansion [88], or by a resummation of the Schwinger–DeWitt expansion [89]. Of course these form factors coincide with our previous Eq. (9.12) in the massless case. For the sake of completeness we will consider in what follows the massive case, although it is clear that only in the massless case are the form factors determined by the RG equations.
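The statement that Eqs. (9.13) and (9.14) reduce to Eq. (9.12) in the massless case can be checked numerically: for $m=0$ the coefficient of the logarithm in $F_i$ is the integral of $\chi_i$ over the parameter. The sketch below assumes the reading of Eq. (9.14) used in this section, with the integration parameter called `g` and running from 0 to 1; the function names and the Simpson quadrature are illustrative choices, not part of the original derivation.

```python
def chi1(xi, g):
    """chi_1 of Eq. (9.14), as read here (g is the integration parameter)."""
    return 0.5 * (xi ** 2 - 0.5 * xi * (1.0 - g ** 2)
                  + (3.0 - 6.0 * g ** 2 - g ** 4) / 48.0)

def chi2(xi, g):
    """chi_2 of Eq. (9.14); it does not depend on xi."""
    return g ** 4 / 12.0

def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def massless_log_coefficients(xi):
    """Coefficients of the logarithm in F_1 and F_2 for m = 0, from Eq. (9.13)."""
    c1 = simpson(lambda g: chi1(xi, g), 0.0, 1.0)
    c2 = simpson(lambda g: chi2(xi, g), 0.0, 1.0)
    return c1, c2

def coefficients_from_running(xi):
    """Coefficients fixed by the RG equations (9.10) and (9.11), see Eq. (9.12)."""
    return 0.5 * ((xi - 1.0 / 6.0) ** 2 - 1.0 / 90.0), 1.0 / 60.0
```

For any value of `xi`, `massless_log_coefficients(xi)` and `coefficients_from_running(xi)` agree within the quadrature error, which is the consistency between the form factors and the running of $\alpha$ and $\beta$ referred to in the text.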
In order to clarify the meaning of the two point functions appearing in Eqs. (9.12) and (9.13), it is useful to introduce the following integral representation
$$
\ln\left[\frac{m^2-\frac14(1-\gamma^2)\Box}{\mu^2}\right]=\ln\left[\frac{1-\gamma^2}{4}\right]+\int_0^\infty dz\left[\frac{1}{z+\mu^2}-G_E^{(z)}\right] ,
\tag{9.15}
$$
so the logarithm of the d'Alembertian is written in terms of the massive Euclidean propagator $G_E^{(z)}=(z+4m^2/(1-\gamma^2)-\Box)^{-1}$. This representation will also be useful to construct the CTP version of the effective action.

Up to here we considered the Euclidean effective action. What about the in–out and CTP CGEA? Of course one can compute them from first principles using the covariant expansion, and indeed there are some calculations in the literature for the CTP CGEA [90]. However, in order to emphasize the relation with the RG equations, we will construct the CTP CGEA from its Euclidean counterpart. Replacing the Euclidean propagator by the Feynman one in the integral representation Eq. (9.15), one obtains the usual in–out effective action. As we already pointed out, the effective equations derived from this action are neither real nor causal because they are equations for in–out matrix elements and not for mean values. The CTP CGEA can be written as
$$
e^{iS_{\rm eff}[g^+,g^-]}=N\,e^{i(S_{\rm grav}[g^+]-S_{\rm grav}[g^-])}\int D\phi^+D\phi^-\,e^{i(S_{\rm matter}[g^+,\phi^+]-S_{\rm matter}[g^-,\phi^-])} ,
\tag{9.16}
$$
and the field equations are obtained by taking the variation of this action with respect to the $g^+_{\mu\nu}$ metric, and then setting $g^+_{\mu\nu}=g^-_{\mu\nu}$. In an alternative, and more concise notation, we can write this effective action as [91]
$$
e^{iS^{C}_{\rm eff}[g]}=N\,e^{iS^{C}_{\rm grav}[g]}\int D\phi\,e^{iS^{C}_{\rm matter}[g,\phi]} ,
\tag{9.17}
$$
where we have introduced the CTP complex temporal path $C$, going from minus to plus infinity ($C_+$) and backwards ($C_-$), with a decreasing (infinitesimal) imaginary part. Time integration over the contour $C$ is defined by $\int_Cdt=\int_{C_+}dt-\int_{C_-}dt$. The field $\phi$ appearing in Eq. (9.17) is related to those in Eq. (9.16) by $\phi(t,\vec x)=\phi_\pm(t,\vec x)$ if $t$ belongs to $C_\pm$. The same applies to $g_{\mu\nu}$.

This equation is useful because it has the structure of the usual in–out or the Euclidean effective action. Feynman rules are therefore the ordinary ones, replacing the Euclidean propagator by
$$
G(x,y)=\begin{cases}
G_F(x,y)=i\langle 0,{\rm in}|T\phi(x)\phi(y)|0,{\rm in}\rangle, & t,t'\ \text{both on}\ C_+,\\
G_D(x,y)=-i\langle 0,{\rm in}|\tilde T\phi(x)\phi(y)|0,{\rm in}\rangle, & t,t'\ \text{both on}\ C_-,\\
G_+(x,y)=-i\langle 0,{\rm in}|\phi(x)\phi(y)|0,{\rm in}\rangle, & t\ \text{on}\ C_-,\ t'\ \text{on}\ C_+,\\
G_-(x,y)=i\langle 0,{\rm in}|\phi(y)\phi(x)|0,{\rm in}\rangle, & t\ \text{on}\ C_+,\ t'\ \text{on}\ C_- .
\end{cases}
\tag{9.18}
$$
Introducing Riemann normal coordinates, we can write, up to lowest order in the curvature d4 p eip(x−y) GF (x; y) = (9.19) = GD∗ (x; y) ; 4 2 (2) p + m2 − i. d 4 p ip(x−y) e 2i5(p2 − m2 )%(±p0 ) : (9.20) G± (x; y) = ∓ (2)4 All of the preceeding formulation of the eective action is valid for any 3eld theory. In our particular case, we must replace the Euclidean propagator GE(z) in Eq. (9.15) by the propagator G(x; y) of Eq. (9.18) with a mass given by 4m2 =(1 − 2 ) + z. After integration in z we obtain 2 4m =(1 − 2 ) − ln 2 CTP 4 2 2 2 d p ip(x−y) ln( (1− )(p−i.)+4m ); t; t both on C+ ; 4e 2 (2) 2 2 2 d 4 p ip(x−y) ln( (1− )(p+i.)+4m ); t; t both on C− ; 2 (2)4 e (9.21) = 4 2 d p ip(x−y) 0 2 4m 2i%(p )%(−p − 1−2 ); t on C− ; t on C+ ; (2)4 e d 4 p ip(x−y) 4m2 − 2i%(−p0 )%(−p2 − 1− 2 ); t on C+ ; t on C− : (2)4 e With the expression for the CTP logarithm of the d’Alambertian we can calculate explicitly + and g− the CTP eective action the CTP eective action. Using the previous notation with g reads r r Se [g+ ; g− ] = Sgrav [g+ ] − Sgrav [g− ] i 1 4 4 4 + 2 d x d y4(x)4(y)N1 (x; y) − 2 d x d 4 y4(x)I(y)D1 (x; y) 8 8 i 1 4 + 2 d 4 x d 4 y4 (x)4 (y)N2 (x; y) − d x d 4 y4 (x)I (y)D2 (x; y) ; 8 82 (9.22) − + − where 4 = (R+ − R− )=2, I = (R+ + R− )=2, 4 = (R+ − R )=2, I = (R + R )=2. The classical r gravitational action Sgrav contains the dressed, -dependent coupling constants and we absorbed F0 into the gravitational constant G. The real and imaginary parts of Se can be associated with the dissipation and noise, respectively. The dissipation Di and noise Ni kernels are given by 1 (1 − 2 )p2 + 4m2 d4 p ; d$i ( ; ) cos[p(x − y)] ln (9.23) Di (x; y) = (2)4 2 0 1 d4 p 4m2 2 Ni (x; y) = d$i ( ; ) cos[p(x − y)]% −p − : (9.24) (2)4 1 − 2 %
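It is important to note that the imaginary part of this effective action must be positive definite. To make this point explicit, one can write the imaginary part in terms of the Weyl tensor $C_{\mu\nu\alpha\beta}$ and the scalar curvature $R$ by means of the following relation: $C_{\mu\nu\alpha\beta}C^{\mu\nu\alpha\beta}=2R_{\mu\nu}R^{\mu\nu}-\frac23R^2$. It is not difficult to show that the scalar and tensor contributions to the imaginary part of the effective action are both positive.

In order to derive the ELE we proceed as in Section 7.2. One can regard the imaginary part of the CTP CGEA as coming from two classical stochastic sources $\eta(x)$ and $\eta_{\mu\nu\alpha\beta}(x)$, where the last tensor has the symmetries of the Weyl tensor. In fact, we can write the imaginary part as
$$
\int D\eta(x)\int D\eta_{\mu\nu\alpha\beta}(x)\,P[\eta,\eta_{\mu\nu\alpha\beta}]\,\exp\left(i\int d^4x\,\{\Delta(x)\eta(x)+\Delta_{\mu\nu\alpha\beta}(x)\eta^{\mu\nu\alpha\beta}(x)\}\right)
=\exp\left\{-\int d^4x\int d^4y\,[\Delta(x)\tilde N(x-y)\Delta(y)+\Delta_{\mu\nu\alpha\beta}(x)N_2(x-y)\Delta^{\mu\nu\alpha\beta}(y)]\right\} ,
\tag{9.25}
$$
where $\tilde N(x,y)=N_1(x,y)+\frac13N_2(x,y)$ and $\Delta_{\mu\nu\alpha\beta}=\frac12(C^+_{\mu\nu\alpha\beta}-C^-_{\mu\nu\alpha\beta})$. The Gaussian functional probability distribution $P[\eta,\eta_{\mu\nu\alpha\beta}]$ is given by
$$
P[\eta,\eta_{\mu\nu\alpha\beta}]=A\exp\left\{-\frac12\int d^4x\int d^4y\,\eta(x)[\tilde N(x,y)]^{-1}\eta(y)\right\}\times\exp\left\{-\frac12\int d^4x\int d^4y\,\eta_{\mu\nu\alpha\beta}(x)[N_2(x,y)]^{-1}\eta^{\mu\nu\alpha\beta}(y)\right\} ,
\tag{9.26}
$$
with $A$ a normalization factor. Therefore, the CTP CGEA can be written as
$$
\exp\{iS_{\rm eff}\}=\int D\eta\int D\eta_{\mu\nu\alpha\beta}\,P[\eta,\eta_{\mu\nu\alpha\beta}]\,\exp\{iA_{\rm eff}[\Delta,\Delta_{\mu\nu\alpha\beta},\Sigma,\Sigma_{\mu\nu\alpha\beta},\eta,\eta_{\mu\nu\alpha\beta}]\} ,
\tag{9.27}
$$
where
$$
A_{\rm eff}={\rm Re}\,S_{\rm eff}+\int d^4x\,[\Delta(x)\eta(x)+\Delta_{\mu\nu\alpha\beta}(x)\eta^{\mu\nu\alpha\beta}(x)] .
\tag{9.28}
$$
The field equations
$$
\left.\frac{\delta A_{\rm eff}}{\delta g^+_{\mu\nu}}\right|_{g^+_{\mu\nu}=g^-_{\mu\nu}}=0 ,
$$
the Einstein–Langevin equations, are
$$
\frac{1}{8\pi G}\left(R_{\mu\nu}-\frac12g_{\mu\nu}R\right)-\tilde\alpha H^{(1)}_{\mu\nu}-\tilde\beta H^{(2)}_{\mu\nu}
=-\frac{1}{32\pi^2}\int d^4y\,D_1(x,y)H^{(1)}_{\mu\nu}(y)-\frac{1}{32\pi^2}\int d^4y\,D_2(x,y)H^{(2)}_{\mu\nu}(y)+g_{\mu\nu}\Box\eta-\eta_{;\mu\nu}+2\eta_{\mu\alpha\nu\beta}{}^{;\alpha\beta} ,
\tag{9.29}
$$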
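where $\tilde\alpha$ and $\tilde\beta$ differ from $\alpha$ and $\beta$ by $\mu$-dependent finite constants. Eq. (9.29) is the main result of this section. The r.h.s. consists of the mean value of the energy–momentum tensor for the scalar field plus a stochastic correction characterized by the two point correlation functions
$$
\langle\eta(x)\eta(y)\rangle=\tilde N(x,y),\qquad
\langle\eta_{\mu\nu\alpha\beta}(x)\,\eta_{\rho\sigma\lambda\theta}(y)\rangle=T_{\mu\nu\alpha\beta\rho\sigma\lambda\theta}\,N_2(x,y) ,
\tag{9.30}
$$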
where the tensor TJA)S is a linear combination of four-metric products in such a way that the r.h.s of Eq. (9.30) keeps the Weyl’s symmetries. The scalar-noise kernel is given by 1 N˜ (x; y) = 2
%
1
d
(1 − 2 )
− 4
2
4 − 36
d4 p 4m2 2 − cos[p(x − y)]% p − : (2)4 1 − 2
(9.31)
In the massless case $\tilde N$ is proportional to $(\xi-1/6)^2$, and vanishes for conformal coupling. Therefore this term is present when the quantum fields are massive and/or when the coupling is not conformal. This is to be expected, since the imaginary part of the CTP CGEA signifies particle creation. For massless, conformally coupled quantum fields, particle creation takes place only when the spacetime is not conformally flat. Therefore in this case the only contribution to the imaginary part of the CTP CGEA is proportional to the square of the Weyl tensor. When the fields are massive and/or nonconformally coupled, particle creation takes place even when the Weyl tensor vanishes. This is why an additional contribution proportional to $R^2$ appears in the imaginary part of the effective action. From Eq. (9.29) we can define the effective energy–momentum tensor
$$
T^{\rm eff}_{\mu\nu}=\langle T_{\mu\nu}\rangle+T^{\rm stoch}_{\mu\nu}=\langle T_{\mu\nu}\rangle+g_{\mu\nu}\Box\eta-\eta_{;\mu\nu}+2\eta_{\mu\alpha\nu\beta}{}^{;\alpha\beta} ,
\tag{9.32}
$$
where $\langle T_{\mu\nu}\rangle$ is the quantum expectation value of the energy–momentum tensor of the quantum field and $T^{\rm stoch}_{\mu\nu}$ is the contribution of the stochastic force, which in turn has contributions from the scalar and tensor noises. In the massless conformal case the scalar-noise kernel vanishes, and $(T^\mu{}_\mu)^{\rm stoch}=0$, because the noise source $\eta_{\mu\nu\alpha\beta}$ has vanishing trace. This means that there is no stochastic correction to the trace anomaly [80].

To summarize, we have obtained the ELE using a covariant expansion in powers of the curvature. Our results are valid for quantum scalar fields with arbitrary mass and coupling $\xi$ to the curvature. In the massless case, still for arbitrary $\xi$, we have shown that it is possible to obtain the noise and dissipation kernels using only dimensional analysis, the running of the coupling constants and the relation between the Euclidean and CTP CGEA. From this point of view, we can therefore conclude that the RG equations already contain information about the dissipation and noise kernels. The results of this section can be easily generalized to any field theory with massless quantum fields in gravitational or Yang–Mills backgrounds.
10. Renormalization group and quantum corrections to the Newtonian potential

Two major areas of application for the stochastic and semiclassical gravity programs are the study of the final states of black holes and the physics of the early Universe. These are enormously complicated problems. We will describe here one of the simplest applications of semiclassical gravity, the calculation of quantum corrections to the Newtonian potential. Although a very simple example, it will shed light on important conceptual issues, in particular on the relation between the quantum corrections and the renormalization group. Duff [92] computed the corrections to the Newtonian potential produced by the vacuum polarization of gravitons. His result is, schematically,
$$
V_{\rm eff}(r)=-\frac{GM}{r}\left(1+a\,\frac{G}{r^2}\right) .
\tag{10.1}
$$
This result has been rederived more recently by a number of authors [93–95]. It was also shown that corrections of this form are the leading quantum corrections for massless quantum fields. The constant $a$ depends on the number and spin of the massless fields. The first derivation based on the SEE was presented in Ref. [96]. Recently there has been renewed interest in this type of correction, since it is relevant to relating two different developments in quantum gravity: Maldacena's AdS/CFT correspondence and the Randall–Sundrum alternative to compactification, see for example [97].

There is an alternative, intuitive "Wilsonian" way of taking into account, at least partially, the quantum effects: just modify the classical potential by replacing the Newton constant by its running counterpart,
$$
V(r)=-\frac{G(\mu=1/r)\,M}{r} ,
\tag{10.2}
$$
where $G(\mu)$ is the solution to the renormalization group equations in the theory considered. As can be seen from Eqs. (9.9) and (10.1), this argument does not reproduce the leading quantum correction for massless fields. We will now show that the result (10.1) can be derived from the ELE, and is therefore a consequence of the RG equations for the parameters $\alpha$ and $\beta$.
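To make the "Wilsonian" prescription of Eq. (10.2) concrete, the sketch below integrates Eq. (9.9) in closed form, $1/G(\mu)=1/G_0-(m^2/\pi)(\xi-\frac16)\ln(\mu/\mu_0)$, and inserts $\mu=1/r$. It is only an illustration under stated assumptions: natural units, a reference scale `mu0` at which `G(mu0) = G0`, and hypothetical function names. For $m=0$ the coupling does not run at all, which is why this prescription cannot reproduce the $r^{-3}$ term of Eq. (10.1).

```python
import math

def running_G(mu, G0, mu0, m2, xi):
    """Solution of mu dG/dmu = (G^2 m^2 / pi) (xi - 1/6) with G(mu0) = G0."""
    inv_G = 1.0 / G0 - (m2 / math.pi) * (xi - 1.0 / 6.0) * math.log(mu / mu0)
    return 1.0 / inv_G

def wilsonian_potential(r, M, G0, mu0, m2, xi):
    """Eq. (10.2): classical potential with G evaluated at the scale mu = 1/r."""
    return -running_G(1.0 / r, G0, mu0, m2, xi) * M / r
```

A quick use: `wilsonian_potential(r, M, G0, mu0, 0.0, xi)` is exactly `-G0 * M / r` for every `r`, while a nonzero `m2` produces a logarithmic dependence on `r`.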
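For simplicity, and to make contact with previous works, we will solve the ELE without including the noise source. Solutions to the ELE including this source can be found in [98]. In the static, weak field approximation $g_{\mu\nu}=\eta_{\mu\nu}+h_{\mu\nu}$, we have
$$
R_{\mu\nu}=-\frac12\Box h_{\mu\nu} ,
\tag{10.3}
$$
$$
R=-\frac12\Box h ,
\tag{10.4}
$$
$$
H^{(1)}_{\mu\nu}=(-2\partial_\mu\partial_\nu+2\eta_{\mu\nu}\Box)\,\Box h ,
\tag{10.5}
$$
$$
H^{(2)}_{\mu\nu}=\left(-\partial_\mu\partial_\nu+\frac12\eta_{\mu\nu}\Box\right)\Box h+\Box\Box h_{\mu\nu} ,
\tag{10.6}
$$
where $h=\eta^{\mu\nu}h_{\mu\nu}$ and we assumed the Lorentz gauge condition $(h_{\mu\nu}-\frac12h\eta_{\mu\nu})^{;\nu}=0$.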
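Including a point particle source with $T_{\mu\nu}=M\delta^0_\mu\delta^0_\nu\delta^3(\vec x)$, the linearized field equations become
$$
\left[-\frac{1}{16\pi G}+\frac{m^2}{32\pi^2}\left(\xi-\frac16\right)\ln\frac{m^2}{\mu^2}\right]\Box\bar h_{\mu\nu}-\alpha H^{(1)}_{\mu\nu}-\beta H^{(2)}_{\mu\nu}=T_{\mu\nu}+\langle T_{\mu\nu}\rangle ,
\tag{10.7}
$$
with $\bar h_{\mu\nu}=h_{\mu\nu}-\frac12h\eta_{\mu\nu}$. We define the quantum corrected Newtonian potential as $V(r)=-\frac12h_{00}$. The trace of the field equations is
$$
\left[\frac{1}{16\pi G}-\frac{m^2}{32\pi^2}\left(\xi-\frac16\right)\ln\frac{m^2}{\mu^2}\right]\nabla^2h-2(3\alpha+\beta)\nabla^2\nabla^2h=T+\langle T\rangle ,
\tag{10.8}
$$
where, to first order in $m^2/\nabla^2$,
$$
\langle T\rangle=\frac{1}{32\pi^2}\left[-3\left(\xi-\frac16\right)^2\ln\left(\frac{-\nabla^2}{\mu^2}\right)\nabla^2\nabla^2-6m^2\left(\xi^2-\frac{1}{36}\right)\ln\left(\frac{-\nabla^2}{m^2}\right)\nabla^2\right]h .
\tag{10.9}
$$
We shall solve Eq. (10.8) perturbatively, $h=h^{(0)}+h^{(1)}$. The classical contribution $h^{(0)}$ satisfies
$$
(\nabla^2-\sigma^{-2}\nabla^2\nabla^2)\,h^{(0)}=-16\pi GM\,\delta^3(\vec x),\qquad \sigma^{-2}=32\pi G(3\alpha+\beta) .
\tag{10.10}
$$
The time independent and spherically symmetric solution is
$$
h^{(0)}=\frac{4GM}{r}\,(1-e^{-\sigma r}) .
\tag{10.11}
$$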
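As a sanity check on Eq. (10.11), one can verify numerically that it solves the homogeneous part of Eq. (10.10) away from the origin, using the radial form of the Laplacian, $\nabla^2f=r^{-1}d^2(rf)/dr^2$. The sketch below is an illustration only: the unit values of `G`, `M` and of the inverse length written here as `sigma`, the step `d`, and the function names are all assumptions, and the delta-function source at $r=0$ is not tested.

```python
import math

G, M, sigma = 1.0, 1.0, 1.0   # illustrative units, not values from the text

def h0(r):
    """Classical solution of Eq. (10.11)."""
    return 4.0 * G * M * (1.0 - math.exp(-sigma * r)) / r

def radial_laplacian(f, r, d=1e-2):
    """Laplacian of a spherically symmetric function, (1/r) d^2(r f)/dr^2,
    approximated with central differences of step d."""
    return ((r + d) * f(r + d) - 2.0 * r * f(r) + (r - d) * f(r - d)) / (d * d * r)

def residual(r):
    """Left-hand side of Eq. (10.10) at r > 0, together with the size of
    its first term for comparison."""
    lap = radial_laplacian(h0, r)
    lap_lap = radial_laplacian(lambda x: radial_laplacian(h0, x), r)
    return lap - lap_lap / sigma ** 2, lap
```

At, say, `r = 2.0` the residual is smaller than the individual terms by several orders of magnitude, limited only by the finite-difference step.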
The equation for the 3rst quantum correction is (∇2 − )−2 ∇2 ∇2 )h(1) = D(∇2 )h(0) ; where
3G 1 2 ∇2 D(∇ ) = −
− ln − 2 ∇2 ∇2 2 6 2 Gm2 1 ∇2 1 m 1 2 +
− ln 2 + 3 − ln − 2 ∇2 : 2 6 36 m
(10.12)
2
(10.13)
To 3nd a solution to this equation, we will consider the limit )r → ∞ (we are interested in long distance quantum corrections). In this limit h(0) = 4GM ( 1r + 4)−2 53 (x)) and the nonlocal operator in the rhs of Eq. (10.12) can be easily evaluated. After a long calculation we obtain 24G 2 Mm2 1 ln r=r0 12G 2 M 1 2 1 (1) 2 h =−
− −
− + ··· : (10.14) 36 r 6 r3 The dots denote corrections at the origin, which are proportional to 53 (x) and its derivatives. We have not included them because our quantum corrections are not accurate near the origin. Indeed, we have derived the modi3ed Einstein equations under the assumptions ∇∇RR2 and m2 R∇∇R. Both conditions are satis3ed for the GM=r potential if GM r m−1 , so the origin is excluded.
A similar analysis can be carried out for h[00 . We omit the details [99]. The 3nal answer for the Newtonian potential is
1 GM 1 2 2G 1 1 1 2m2 G 2 1 r V (r) = − h00 = − 1+
− + +
+ ln : 2 r 2 6 90 r 2 12 r0 (10.15)

From Eq. (10.15) we see that there are two different terms in the quantum correction. The term containing the logarithm is qualitatively what we expected from "Wilsonian" arguments (see Eq. (10.2)). However, the coefficient is not exactly the same as the one derived from the renormalization group equation (9.9), unless $\xi=0$. Besides the running of $G$, we have obtained the leading quantum $r^{-3}$ correction, which is a consequence of the running of $\alpha$ and $\beta$ in configuration space.

11. Renormalization group theory for nonequilibrium systems

11.1. Summary remarks

We first give a schematic summary of this report and then discuss some general issues concerning RG theory for NEq systems. The CGEA contains all the information about the influence of the environment on the system. The CTP version of the CGEA is suitable for the analysis of the dynamical evolution of the system under the influence of the environment, and takes into account effects of renormalization, dissipation and noise. In this review we have described several applications of the in–out and in–in effective action in semiclassical gravity and cosmology, paying particular attention to the relation with the RG equations. The Euclidean averaged effective action is reviewed by Wetterich and his collaborators in an article in this conference. We have described a perturbative calculation of the CTP CGEA in RW spacetimes. We have shown that the CTP CGEA is useful to derive the master and Langevin equations in quantum field theory. The imaginary part of the CTP CGEA produces diffusive terms in the master equation. These terms can induce the reduced density matrix to become diagonal as it evolves, and are therefore crucial for decoherence and the quantum to classical transition of the system. We applied this formalism to investigate the decoherence of mean fields due to quantum fluctuations relevant for theories of structure formation in inflationary models.
We adapted the Wegner and Houghton Euclidean approach to the in–in CGEA in order to obtain an exact renormalization group equation for the dependence of $S_\Lambda$ on the coarse graining scale. The exact equation is extremely complicated. We have solved it using a derivative expansion. In this approximation, the CTP CGEA "decouples" [i.e. it is of the form $S_\Lambda(\phi_+,\phi_-)=S_\Lambda(\phi_+)-S_\Lambda(\phi_-)$] and contains neither dissipative nor noise terms. Previous results on the RG improved effective potential are recovered under this approximation. Efforts to find solutions beyond the adiabatic approximation are under way. We believe that this equation (or a simplified version of it) will play an essential role in the development of a renormalization group theory for nonequilibrium systems. We expect that, as soon as we decrease the scale from $\Lambda_0$, dissipative and noise terms will grow: the CGEA will develop an imaginary part (related to noise) and a real part containing interactions between the $\phi_\pm$ fields (dissipation). This can be easily checked both in the one loop approximation and from the exact RGE. Indeed, we have seen that the one loop CTP CGEA is in general nonreal. On the other hand, the real and imaginary parts of the CGEA are not decoupled in Eq. (8.11), and a nonvanishing real part at $\Lambda=\Lambda_0$ will induce an imaginary part at lower scales. One should be able to find a nonperturbative, $\Lambda$-dependent fluctuation–dissipation relation and RG equations for the coupling constants of the theory that include the noise effects.

Finally, in the context of semiclassical gravity we described an interesting relation between the usual RG equations (running coupling constants) and the CTP CGEA. We have seen that one can take into account backreaction effects of quantum fields on the spacetime metric using a "Wilsonian" effective action in which the parameters of the theory are replaced by their running counterparts (this form of the effective action suggests itself by demanding the theory to be independent of the scale $\mu$ introduced by dimensional regularization). As the running is local in momentum space, it becomes nonlocal in configuration space, and thus one obtains a Euclidean, nonlocal effective action for the spacetime metric previously obtained by using different approaches. This Euclidean action can be transformed into the CTP CGEA taking into account formal analogies between their definitions. The final result, which contains dissipation and noise effects, can be viewed as a consequence of the running couplings of the theory.

11.2. Towards a nonequilibrium renormalization group theory

There are three important aspects in the construction of a RG theory for nonequilibrium systems: (a) the idea behind the introduction of the RG description of critical phenomena, characterized by the running of the interaction parameters of the theory; (b) the scale characteristic of the dynamical interaction, absent in static critical phenomena; (c) how noise and dissipation in the stochastic equations describing the effective dynamics are reflected in the renormalization group equations governing the parameters of the theory. Point (a) is discussed in all theories of static critical phenomena. We approached this problem stressing the coarse graining and backreaction aspects. Point (b) is discussed in dynamical critical phenomena (see e.g., [100]). We note, however, that in the literature the noise in the dynamical Ginzburg–Landau equation is usually introduced by hand, and what is at issue is how the system behaves towards the critical point; neither is the origin of noise accounted for nor is the dissipative aspect of the system dynamics incorporated in the flow. The first part of Point (c), i.e., how dissipation and noise appear in an open system, is discussed in this report. (For a discussion of how noise is identified in interacting quantum fields in both a prescribed open system and an effectively open system, see [8,101].) The crucial remaining issues are how they affect the RG flow and how they manifest in the approach to critical points. We will make some general comments here, as research on NEq RG is still in its infancy, its application to self-organized criticality, driven-diffusive systems and turbulence notwithstanding.
E.A. Calzetta et al. / Physics Reports 352 (2001) 459–520
It is perhaps helpful to reflect upon the basic ideas and procedures behind the introduction of the RG, i.e., coarse graining and scaling. When a system has a certain degree of regularity (homogeneity, periodicity) depicted by some symmetry, one can choose to represent it with an equivalent coarse-grained description. An example is decimation in real space RG applied to a lattice, where, e.g., in the Ising model every other spin is eliminated and the effective bond strength between remaining spins is doubled. This Kadanoff–Migdal procedure produces a new Hamiltonian, which, upon iteration, can transform an original system into a simpler one. In the momentum space description, this procedure transcribes (runs) the ultraviolet (short range) behavior of the system to its infrared (long range) domain, and hence is useful for the description of critical phenomena, as the behavior of the system near the critical point is dominated by the appearance of long range order. Whether this procedure, which is one of many possible coarse-graining schemes, can faithfully depict or capture the essential physics depends on the compatibility of the procedure with the system and on the properties of the system near the critical point. If the coarse graining respects the symmetry of the system (e.g., for an anisotropic medium, use a different decimation grading in different directions commensurate with the symmetry of the medium) and if the system possesses some scaling properties near the critical point, then the transformed problem via the RG would preserve the same critical behavior as the original problem. Otherwise it fails. The applicability of the RG idea thus depends crucially on the choice of a coarse-graining procedure and the use of scaling concepts.
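The decimation step described above can be made concrete with a small numerical sketch. It assumes the standard Migdal–Kadanoff approximation for the nearest-neighbour Ising coupling $K = J/k_B T$ on a square lattice with scale factor 2: the bonds attached to the eliminated spins are moved onto the remaining ones (doubling $K$), and every other spin is then summed exactly along each chain, which gives $\tanh K' = \tanh^2(2K)$. The function names are illustrative only and do not come from this report.

```python
import math

def mk_step(K):
    """One Migdal-Kadanoff step for the 2D Ising coupling K (scale factor 2).

    Bond moving doubles the coupling, K -> 2K; summing over every other spin
    then gives tanh(K') = tanh(2K)**2, i.e. K' = 0.5*ln(cosh(4K)).
    The form below is algebraically identical but does not overflow at large K.
    """
    return 2.0 * K - 0.5 * math.log(2.0) + 0.5 * math.log1p(math.exp(-8.0 * K))

def mk_fixed_point(lo=0.1, hi=1.0, tol=1e-12):
    """Locate the nontrivial fixed point K* = mk_step(K*) by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Below K* the coupling shrinks under iteration, above K* it grows.
        if mk_step(mid) < mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def flow(K, steps):
    """Return the sequence of couplings produced by repeated coarse graining."""
    history = [K]
    for _ in range(steps):
        K = mk_step(K)
        history.append(K)
    return history

K_star = mk_fixed_point()
high_T = flow(0.2, 20)   # starts below K*: runs to the K = 0 fixed point
low_T = flow(0.4, 5)     # starts above K*: runs to strong coupling
```

In this approximation the unstable fixed point separating the two flows sits near $K^* \approx 0.305$, to be compared with the exact square-lattice value $K_c \approx 0.441$; the gap is a reminder that a given coarse-graining scheme captures the critical behavior only approximately.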
Leaving scaling aside for now, which has more to do with the properties of the particular systems of interest than with the procedure for accessing relevant information about the system, our approach to nonequilibrium processes and NEqRG theory starts from examining closely the coarse-graining procedure.

11.2.1. RG procedures in the light of open system concepts

As we stated clearly at the beginning of this report, the necessary steps to capture the essence of a physical system with a simplified depiction are: (1) distinguishing the system from the environment; (2) coarse graining the environment; and (3) measuring how the coarse-grained environment influences the system in providing an effective kinematics or dynamics of the reduced system.

Let us examine the RG procedure in the light of the open systems scheme of nonequilibrium statistical mechanics. The first step consists of (a) separating the order parameter field $\phi(x, t)$ into two parts, $\phi = \phi_S + \phi_E$, where $\phi_S$ and $\phi_E$ are, respectively, our system and environment. We assume that $\phi_S$ contains the lower $k$ wave modes and $\phi_E$ the higher $k$ modes. In critical phenomena, $\phi_S: |\vec{k}| < \Lambda/s$; $\phi_E: \Lambda/s < |\vec{k}| < \Lambda$. Here $\Lambda$ is the ultraviolet cutoff and $s > 1$ is the coarse-graining parameter which gives the fraction of total $k$ modes counted in the environment. Then (b) one ‘integrates out’ the short wavelength sector, which amounts to finding out the effect or backreaction of the coarse-grained environment (the short wavelength modes) on the system. This is most succinctly and forcefully executed by the use of the coarse-grained effective action [4], as illustrated in the examples of this report. The last step in the RG procedure is (c) to introduce a rescaling of the $k$-space and a rescaled field, e.g. $\vec{k}' = s\vec{k}$, $\phi'_S(\vec{k}', t) = s^{-(d+2)/2}\,\phi_S(\vec{k}, t)$, $d$ being the spatial dimension.
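Step (a) above, the split of the field into low-$k$ system modes and high-$k$ environment modes, can be sketched on a small periodic lattice. This is a minimal illustration under stated assumptions, not the procedure used in this report: the field is one-dimensional, the cutoff $\Lambda$ is identified with the largest lattice wavenumber, modes sitting exactly at $|k| = \Lambda/s$ are assigned to the environment, and all function names are hypothetical. Steps (b) and (c) are not attempted.

```python
import cmath
import math

def dft(values):
    """Discrete Fourier transform of a periodic 1D sample (O(N^2), for small N)."""
    n = len(values)
    return [sum(v * cmath.exp(-2j * math.pi * k * j / n) for j, v in enumerate(values))
            for k in range(n)]

def idft(coeffs):
    """Inverse of dft()."""
    n = len(coeffs)
    return [sum(c * cmath.exp(2j * math.pi * k * j / n) for k, c in enumerate(coeffs)) / n
            for j in range(n)]

def split_modes(phi, s):
    """Split phi into a system part (|k| < Lambda/s) and an environment part.

    Lambda is taken to be the largest wavenumber on the lattice, N // 2 in
    integer units; modes with |k| >= Lambda/s go to the environment.
    """
    n = len(phi)
    cutoff = n // 2
    sys_k, env_k = [], []
    for k, c in enumerate(dft(phi)):
        k_abs = min(k, n - k)          # |k| on a periodic lattice
        if k_abs < cutoff / s:
            sys_k.append(c)
            env_k.append(0j)
        else:
            sys_k.append(0j)
            env_k.append(c)
    # +k and -k are always assigned together, so both parts stay real.
    phi_s = [z.real for z in idft(sys_k)]
    phi_e = [z.real for z in idft(env_k)]
    return phi_s, phi_e

# A long-wavelength mode (k = 1) plus a short-wavelength ripple (k = 6).
n = 16
phi = [math.cos(2 * math.pi * j / n) + 0.5 * math.cos(2 * math.pi * 6 * j / n)
       for j in range(n)]
phi_s, phi_e = split_modes(phi, s=2.0)   # Lambda = 8, Lambda/s = 4
```

With $s = 2$ the $k = 1$ component ends up entirely in `phi_s` and the $k = 6$ ripple entirely in `phi_e`, and the two parts add back to the original field, which is all that step (a) requires.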
In identifying (or rather, demanding that they are equivalent, up to a certain order in the perturbation expansion) the new action in terms of the rescaled variables with the old action, one can obtain a set of
differential RG equations, or the Wegner–Houghton equation. At the critical point itself, this effective theory containing only the low frequency modes is insensitive to further coarse graining, and therefore is described by a fixed point of the RG flow. For dynamical systems, in addition to the scaling transformation (an artificial device of the RG theory depicting the running of the coupling constants of the system towards the infrared region), we find a real time-dependence depicting the dynamics of the system.⁹

Now that we have identified the steps of the RG procedure in parallel to the treatment of open systems we can ask a few questions to illuminate the relevant issues. It is easy to see where dissipation and noise occur in the open system. But the appearance or nonappearance of noise and dissipation in the RG equations for nonequilibrium dynamics is not a simple issue. That is why we deem it necessary to first provide a clear and thorough discussion of the conceptual and technical basis for addressing the origin and nature of noise and dissipation in nonequilibrium field theory, as this report attempts to do, before we move on to investigate the related issues in RG theory for nonequilibrium processes. When the RG procedure is viewed in the light of an open system theory (in effect the modes of interest in the system after the Kadanoff transformation form an open system, and the techniques of nonequilibrium field theory are the natural language to address these problems), the long wavelength/slow modes dynamics will display, in general, both noise and dissipation. Indeed, the possibility that a renormalization group flow may acquire stochastic features for nonequilibrium processes is even clearer if we think of the RG as encoding the process of eliminating irrelevant degrees of freedom from our description of a system [102]. These elimination processes lead as a rule to dissipation and noise, as was made manifest in the influence action and the CTP CGEA approaches shown.
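The statement that a reduced slow mode carries both noise and dissipation can be illustrated with the simplest possible stand-in: a single variable obeying a linear Langevin equation, $\dot{x} = -\gamma x + \xi(t)$ with $\langle \xi(t)\xi(t') \rangle = 2D\,\delta(t-t')$. The sketch below is a toy under stated assumptions and is not derived from the influence action or the CTP CGEA: the damping rate `gamma`, the white-noise strength `noise_strength` and the function name are hypothetical, and the noise is taken to be white and additive. It estimates the stationary variance, which for this equation is $D/\gamma$.

```python
import math
import random

def ou_stationary_variance(gamma, noise_strength, n_samples=20000,
                           dt=0.5, n_steps=20, seed=0):
    """Ensemble estimate of <x^2> at late times for dx/dt = -gamma*x + xi(t).

    Uses the exact one-step update of the Ornstein-Uhlenbeck process,
    so dt does not have to be small compared with 1/gamma.
    """
    rng = random.Random(seed)
    decay = math.exp(-gamma * dt)   # dissipation over one step
    # Standard deviation of the noise accumulated over one step.
    kick = math.sqrt(noise_strength / gamma * (1.0 - decay ** 2))
    total = 0.0
    for _ in range(n_samples):
        x = 0.0                     # every ensemble member starts at rest
        for _ in range(n_steps):
            x = decay * x + kick * rng.gauss(0.0, 1.0)
        total += x * x
    return total / n_samples

var_ref = ou_stationary_variance(gamma=1.0, noise_strength=1.0)
var_scaled = ou_stationary_variance(gamma=4.0, noise_strength=4.0)
var_noisier = ou_stationary_variance(gamma=1.0, noise_strength=2.0)
```

Scaling damping and noise together (`var_scaled`) leaves the stationary variance unchanged, while changing the noise alone (`var_noisier`) shifts it. In this toy setting, at least, the stationary state is fixed by the balance between the two, so neither can be varied independently without changing it.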
But why, one may ask, is it that we do not usually talk about dissipation in the system or noise in the environment in a RG theory? A simple answer is that RG running is different from real-time dynamics, and dissipation in the dynamics of the effective system does not show up as dissipation in the RG running. Another simple answer is that the bulk of RG research has been focused on equilibrium, stationary properties rather than the nonequilibrium dynamics [103].¹⁰ A major task in NEqRG is to make this point explicit, i.e., where does dissipation in the open system (Langevin) dynamics show up in the RG equations? We can offer only some speculative observations:

⁹ In treating dynamical systems, we may also coarse grain the system according to some time scale, obtaining the effective theory for slow modes after coarse graining the fast modes. Sometimes what is slow compared to what is fast may not be taken at face value, but has to be determined relative to the dynamics and symmetry of the system. (Viewing eternal inflation as ‘static’ and ‘slow-roll’ as dynamic is an example we showed.)

¹⁰ As for whether it has to do with equilibrium versus nonequilibrium conditions, we know that even under equilibrium conditions there is always noise and dissipation, as can be seen from linear response theory. They are related by the fluctuation–dissipation theorem. Indeed it is the precise balance of these processes which sustains the equilibrium.

11.2.2. Stochastic RG equations

(A) For apparently closed (yet effectively open) systems described by the Boltzmann dynamics, this issue is more involved. We learned that when the hierarchy of correlation functions is simply truncated (with no slaving) [101], the equation of motion for the finite set of low order correlation functions is unitary, such as is the case in Vlasov dynamics, which is obviously
nonequilibrium, yet no dissipation or noise appears as such. A similar case is the n-loop effective action in ordinary quantum field theory. If we view the separation between the classical background field and the quantum fluctuation field as between a system and an environment, then the ordinary n-loop effective action is only one special case of the coarse-grained effective action, with quantum fluctuations being integrated or coarse-grained away. This results in radiative corrections to the bare mass and charge as we learned from renormalization theory, which, in the nonequilibrium language, amounts to backreaction of the environment on the system. We would expect to see dissipation in the effective dynamics of the background field if we follow the general arguments of NEq field theory, but of course one never talks about dissipation or noise in the equations of motion derived from an n-loop effective action. So what is missing? The reason is similar for both cases: the mean field or background field is the lowest order of the hierarchy of correlation functions in the Schwinger–Dyson equations. Any loop expansion entails a truncation of the hierarchy and, just like Vlasov dynamics in classical mechanics, the equation of motion from the effective action is unitary. So again it is not just the dynamics which prompts the appearance of dissipation: the causal factorization condition which we call slaving (as in the molecular chaos assumption of Boltzmann) is responsible for it. For example, the circumstances whereby the ordinary RG equations are derived could be similar to the Vlasov or the loop approximation examples mentioned above, in that the correlation functions are truncated from the hierarchy with a factorization condition (rather than from slaving, where the correlation noise arises).
This results in the coupling constants being modified only partially, in a way similar to the renormalized mass or charge in one-loop (quantum) field theory or the (classical) Vlasov dynamics following the averaged potential which replaces the particle interactions. In other words, the backreaction of the coarse-grained modes is not taken into account fully. We see this, e.g., for the $\phi^4$ theory: the rescaled theory differs from the original theory in the $\phi^6$ and higher order terms. By identifying the two theories in the way prescribed by the ordinary RG procedure, we assumed that the error of ignoring the higher order corrections is immaterial in our range of consideration. As such it is a good description of the theory only for systems which scale near the critical point, because there the higher order correlation functions can be ignored or incorporated in a way simply related to that of the lower order ones. Away from the critical point this is not true and this procedure would give bad results. That is the criterion whereby one can assert the equivalence of the RG transformed theory and the original one. Thinking about the relation between the general nonequilibrium dynamics of an open system and that depicted by the RG transformation in this light, we can see that the former (say, starting from a nonlocal Langevin equation governing the system arising from the backreaction of another subsystem or environment via a Zwanzig–Mori projection operator) keeps track of much more information about the environment than that in the RG treatment of the critical regime (slaving being more than truncation).
Although this is not sufficient for a proper description of the effective dynamics, as dissipation and fluctuations carry important information about the system and the environment, it is sufficient for the purpose of depicting the (static) critical phenomena of the system in the critical region, as scaling behavior simplifies the relation between the correlation functions. That is why in actual RG calculations one can assume that the cutoff is in effect infinitely far above the scales of interest. This advantage near the critical point will still come to the aid of dynamical critical phenomena, but when the full dynamics
goes beyond an approach to the equilibrium (which is manifest in, say, critical slowing down) or when one is interested in the behavior of the system farther away from the critical point, then more of the dissipative and noise attributes which are always present in general circumstances will have to be included in a nonequilibrium RG theory. The art in the design of such a theory for NEq processes is to find out what additional information (noise and dissipation) needs to be kept, or what degree of coarse graining is sufficient for the RG procedure which can still capture the essence of the critical behavior for dynamical systems.

(B) One place where one can indeed see the ‘disappearance’ of the effects of noise from the RG equations is when an adiabatic or quasilocal approximation is introduced on the mode function dynamics. In [52] Dalvit and Mazzitelli show from the Wegner–Houghton equation derived from the CTP effective action that, for static long wavelength fields, the usual RG is recovered. In view of this result it is clear that an understanding of the running of noise and dissipation requires consideration of dynamical long wavelength fields. However, if these fields are slowly varying, they may be considered as actually linear or quadratic in the space–time coordinates, in which case the functional integration of the high momentum modes may be carried out exactly, at least to one loop order [26,104]. Alternatively, we may consider that the low momentum modes are following a prescribed, though not necessarily slow, evolution, and compute the back reaction of the high momentum modes keeping track of nonadiabatic effects. Such an analysis is carried out in [105] (see also [53]), where the fast modes are represented by a massive heavy field interacting with slow modes, described as a light field. As expected, noise and dissipation display significant running (they increase exponentially) as the gap between the characteristic frequencies of the heavy and light fields narrows.
This result suggests that a nonequilibrium RG should include, besides the usual interactions, the running of suitable parameters describing both noise and dissipation. Therefore the challenge is to construct a RG flow where the dissipative and noisy characteristics of the dynamics are ingrained therein. There is a precedent for this problem in studies of RG flow in the Navier–Stokes equation [106], where one is interested, for example, in the running of the viscosity. In this kind of research, the starting point is generally a Langevin type equation, whose solutions are given through functional representations following the methods of Zinn-Justin, e.g., [100] (see also [107]). We would like to point out that even in this context, closed time path techniques offer an alternative way to obtain a functional representation of the solutions of the Langevin equation [108], which is in several respects simpler than the more familiar forward time path one.

(C) Another lead can be found in the stochastic correlation function G introduced in a quantum field theoretical derivation of the Boltzmann–Langevin equation [101]. Its expectation value reproduces the usual propagators (Green functions), while its fluctuations account for the quantum fluctuations in the binary product of (operator) fields. The dynamical equation for G takes the form of an explicitly stochastic Dyson equation. In the kinetic limit, the fluctuations in G become the classical fluctuations in the one particle distribution function, and the dynamical equation for G’s Wigner transform becomes the Boltzmann–Langevin equation. (Each of these results has an interest of its own. A priori, there is no simple reason why the fluctuations derived from quantum field theory should have a physical meaning corresponding to a phenomenological entropy flux and Einstein’s relation.) The studies of the fluctuating character of these field theoretic Green functions also suggest new avenues in the development of RG theory. For
example, we are used to fixing the ambiguities of renormalization theory by demanding certain Green functions to take on given values under certain conditions (conditions which should resemble the physical situation of interest as much as possible, as stressed by O’Connor and Stephens [109]). If the Green functions themselves are to be regarded as fluctuating, then the same ought to hold for the renormalized coupling constants defined from them, and for the RG equations describing their scale dependence. The notion that Green functions (and indeed, higher correlations as well) may or even ought to be seen as possessing fluctuating characters (when placed in the larger context of the whole hierarchy) with clearly discernible physical meanings is likely to have an impact on the way we perceive the statistical properties of field theory.

To end, we note that while the application of RG methods to stochastic equations is presented in well-known monographs [100], our proposal here goes beyond these results in at least two ways. First, in our approach the noise is not put in by hand or brought in from outside (e.g., the environment of an open system), as in the usual Langevin equation approach, but follows from the (quantum) dynamics of the system itself. Actually, the possibility of learning about the system from the noise properties (whether it is white or coloured, additive or multiplicative, etc.), that is, unraveling the noise or treating noise creatively, is a subtext in our program. Second, our result suggests that stochasticity may, or should, appear not only at the level of the equations of motion, but also at the level of the RG equations, as they describe the running of ‘constants’ which are themselves fluctuating. This we feel is the first task in the construction of a RG theory for NEq processes.
Acknowledgements

We wish to thank the organizers of the RG2000 meeting in Taxco (Mexico), January 1999, for their warm hospitality, especially Denjoe O’Connor and Chris Stephens, with whom we enjoyed many close discussions over the years. We also enjoyed the exchanges with David Huse, Michael Fisher and Jean Zinn-Justin during the meeting on the role of noise in nonequilibrium renormalization group theory. EC and FDM are supported in part by CONICET, UBA, Fundación Antorchas and Agencia Nacional de Promoción Científica y Tecnológica. BLH is supported in part by NSF grant PHY98-00967 and their collaboration is supported in part by NSF grant INT95-09847.
References

[1] E. Calzetta, B.L. Hu, Phys. Rev. D 35 (1987) 495.
[2] E. Calzetta, B.L. Hu, Phys. Rev. D 40 (1989) 656.
[3] B.L. Hu, Physica A 158 (1989) 399.
[4] B.L. Hu, Y. Zhang, Coarse-graining, scaling, and inflation, Univ. Maryland Preprint 90-186; B.L. Hu, in: J.C.D. Olivo et al. (Eds.), Relativity and Gravitation: Classical and Quantum, Proceedings of the SILARG VII, Cocoyoc, Mexico 1990, World Scientific, Singapore, 1991.
[5] B.L. Hu, Class. Quant. Grav. 10 (1993) S93.
[6] B.L. Hu, J.P. Paz, Y. Zhang, Quantum origins of noise and fluctuations in cosmology, in: E. Gunzig, P. Nardone (Eds.), The Origins of Structures in the Universe, NATO ASI Series, Plenum Press, New York, 1993, p. 227.
[7] E. Calzetta, B.L. Hu, Phys. Rev. D 49 (1994) 6636.
[8] B.L. Hu, in: R. Kobes, G. Kunstatter (Eds.), Proceedings of the Third International Workshop on Thermal Fields and its Applications, CNRS Summer Institute, Banff, August 1993, World Scientific, Singapore, 1994.
[9] M.W. Choptuik, Phys. Rev. Lett. 70 (1993) 9.
[10] H.J. de Vega, N. Sanchez, F. Combes, Phys. Rev. D 54 (1996) 6008–6020.
[11] I.L. Buchbinder, S.D. Odintsov, I.L. Shapiro, Effective Action in Quantum Gravity, IOP Publishing Ltd, London, 1992.
[12] K. Kirsten, G. Guido, L. Vanzo, Phys. Rev. D 48 (1993) 2813.
[13] A. Bonanno, Phys. Rev. D 52 (1995) 9697.
[14] J. Ambjorn, Nucl. Phys. B, Proc. (Suppl.) 42 (1995) 3–16. In *Bielefeld 1994, Lattice ’94* 3–16. [HEP-LAT 9412006].
[15] I. Antoniadis, P. Mazur, E. Mottola, Phys. Lett. B 323 (1994) 284.
[16] See, e.g., P.C. Hohenberg, B.I. Halperin, Rev. Mod. Phys. 49 (1977) 435; S.K. Ma, Modern Critical Phenomena, Benjamin, N.Y. 1976 (Chapter XII); K. Kawasaki, J. Gunton, Phys. Rev. B 13 (1976) 4658.
[17] J.M. Cornwall, R. Bruinsma, Phys. Rev. D 38 (1988) 3146.
[18] B.L. Hu, A. Raval, in preparation.
[19] R.M. Wald, Phys. Rev. D 28 (1983) 2118.
[20] W. Boucher, G.W. Gibbons, G.T. Horowitz, Phys. Rev. D 30 (1984) 2447.
[21] T.C. Shen, B.L. Hu, D.J. O’Connor, Phys. Rev. D 31 (1985) 2401.
[22] F. Lucchin, S. Matarrese, Phys. Rev. D 32 (1988) 1316.
[23] L. Kadanoff, Ann. Phys. (N.Y.) 2 (1966) 263; A.A. Migdal, Sov. Phys.-JETP 69, L457 (1975) 810.
[24] B.L. Hu, Phys. Lett. 123B (1983) 189; B.L. Hu, L.F. Chen, Phys. Lett. 160B (1985) 36.
[25] S. Sinha, B.L. Hu, Phys. Rev. D 38 (1989) 2423.
[26] B.L. Hu, D.J. O’Connor, Phys. Rev. D 30 (1984) 743.
[27] I. Moss, D.J. Toms, A. Wright, Phys. Rev. D 46 (1992) 1670.
[28] B.L. Hu, in: F.C. Khanna, H. Umezawa, G. Kunstatter, P. Lee (Eds.), Proceedings of the CAP-NSERC Summer Institute in Theoretical Physics, Vol. II, World Scientific, Singapore, 1988; B.L. Hu, D.J. O’Connor, Phys. Rev. D 36 (1987) 1701; D.J. O’Connor, C.R. Stephens, B.L. Hu, Ann. Phys. (N.Y.) 190 (1990) 310.
[29] L. Kadanoff, Ann. Phys. (N.Y.) 100 (1976) 359; Rev. Mod. Phys. 49 (1977) 267; K. Wilson, J. Kogut, Phys. Rep. C 12 (1974) 75; M.E. Fisher, Rev. Mod. Phys. 46 (1974) 597; M.E. Fisher, in: F.J.W. Hahne (Ed.), Critical Phenomena, Springer, Berlin, 1983; J.F. Nicoll, T.S. Chang, H.E. Stanley, Phys. Rev. A 13 (1976) 1251; P. Pfeuty, G. Toulouse, Introduction to the Renormalization Group and to Critical Phenomena, Wiley, N.Y. 1977; for a simple introduction, see K. Huang, Statistical Mechanics, Wiley, N.Y. 1989 (Chapter 18); Bambi Hu, Phys. Rep. 91 (1982) 233.
[30] See, e.g., E. Brezin, J.C. Le Guillou, J. Zinn-Justin, Phase Transitions and Critical Phenomena, Vol. VI, Academic Press, N.Y. 1976; D.J. Amit, Field Theory, The Renormalization Group and Critical Phenomena, 2nd Edition, World Scientific, Singapore, 1985.
[31] A.A. Starobinsky, in: H.J. de Vega, N. Sanchez (Eds.), Field Theory, Quantum Gravity and Strings, Springer, Berlin, 1986; J.M. Bardeen, G.J. Bublik, Class. Quant. Grav. 4 (1987) 573; S.J. Rey, Nucl. Phys. B 284 (1987) 706.
[32] E. Calzetta, B.L. Hu, Phys. Rev. D 52 (1995) 6770.
[33] S.A. Ramsey, B.L. Hu, Phys. Rev. D 56 (1997) 678–705. Erratum-ibid. D 57 (1998) 3798.
[34] F.D. Mazzitelli, J.P. Paz, C. El Hasi, Phys. Rev. D 40 (1989) 955.
[35] S. Sinha, B.L. Hu, Phys. Rev. D 44 (1991) 1028.
[36] E. Braaten, R.D. Pisarski, Phys. Rev. D 45 (1992) 1827–1830.
[37] C. Wetterich, Nucl. Phys. B 352 (1991) 529; Phys. Lett. B 301 (1993) 90; Z. Phys. C 60 (1993) 461.
[38] B.S. DeWitt, in: R. Penrose, C.J. Isham (Eds.), Quantum Concepts in Space and Time, Clarendon Press, Oxford, 1986.
[39] R.D. Jordan, Phys. Rev. D 33 (1986) 44.
[40] F.J. Wegner, A. Houghton, Phys. Rev. A 8 (1973) 401.
[41] K. Wilson, Rev. Mod. Phys. 55 (1982) 583; see also S.K. Ma, Modern Critical Phenomena, Benjamin, N.Y., 1976.
[42] C.R. Stephens, private communication.
[43] J.M. Bardeen, P.J. Steinhardt, M.S. Turner, Phys. Rev. D 28 (1983) 629; A. Guth, S.Y. Pi, Phys. Rev. Lett. 49 (1982) 1110; A.A. Starobinsky, Phys. Lett. 117B (1982) 175; S.W. Hawking, Phys. Lett. 115B (1982) 295.
[44] T.A. Jacobson, Phys. Rev. D 44 (1991) 1731.
[45] R. Feynman, F. Vernon, Ann. Phys. (N.Y.) 24 (1963) 118; R. Feynman, A. Hibbs, Quantum Mechanics and Path Integrals, McGraw-Hill, New York, 1965; H. Kleinert, Path Integrals in Quantum Mechanics, Statistics, and Polymer Physics, World Scientific, Singapore, 1990.
[46] A.O. Caldeira, A.J. Leggett, Physica 121A (1983) 587; Ann. Phys. (N.Y.) 149 (1983) 374.
[47] Y. Zhang, Ph.D. Thesis, University of Maryland, 1991 (unpublished).
[48] B.L. Hu, A. Matacz, Phys. Rev. D 51 (1995) 1577.
[49] D. Koks, A. Matacz, B.L. Hu, Phys. Rev. D 59 (1997) 5917.
[50] F.C. Lombardo, F.D. Mazzitelli, Phys. Rev. D 53 (1996) 2001.
[51] C. Greiner, B. Müller, Phys. Rev. D 55 (1997) 1026.
[52] D.A.R. Dalvit, F.D. Mazzitelli, Phys. Rev. D 54 (1996) 6338.
[53] D. Boyanovsky, M. D’Attanasio, H.J. de Vega, R. Holman, D.-S. Lee, Phys. Rev. D 52 (1995) 6805–6827.
[54] J.P. Paz, Phys. Rev. D 41 (1990) 1054; D 42 (1990) 529.
[55] S. Weinberg, Phys. Lett. 83B (1979) 339.
[56] B.L. Hu, J.P. Paz, Y. Zhang, Phys. Rev. D 45 (1992) 2843; B.L. Hu, J.P. Paz, Y. Zhang, Phys. Rev. D 47 (1993) 1576.
[57] H. Grabert, P. Schramm, G.L. Ingold, Phys. Rep. 168 (1988) 115.
[58] W.G. Unruh, W.H. Zurek, Phys. Rev. D 40 (1989) 1071.
[59] Z. Su et al., Phys. Rev. B 37 (1988) 9810.
[60] J.P. Paz, in: J. Halliwell, J. Perez Mercader, W. Zurek (Eds.), The Physical Origin of Time Asymmetry, Cambridge University Press, Cambridge, 1994.
[61] F.C. Lombardo, F.D. Mazzitelli, D. Monteoliva, Phys. Rev. D 62 (2000) 045016.
[62] L. Davila Romero, J.P. Paz, Phys. Rev. A 55 (1997) 4070.
[63] M. Abramowitz, I. Stegun (Eds.), Handbook of Mathematical Functions, Dover Publications, N.Y., 1972.
[64] S. Habib, R. Laflamme, Phys. Rev. D 42 (1990) 4056.
[65] J.J. Halliwell, T. Yu, Phys. Rev. D 53 (1996) 2012–2019.
[66] S. Habib, Phys. Rev. D 46 (1992) 2408.
[67] S.B. Liao, J. Polonyi, Ann. Phys. 222 (1993) 122.
[68] J. Polchinski, Nucl. Phys. B 231 (1984) 269.
[69] A. Hasenfratz, P. Hasenfratz, Nucl. Phys. B 270 (1986) 687.
[70] T.R. Morris, Int. J. Mod. Phys. A 9 (1994) 2411.
[71] M. Bonini, M. D’Attanasio, G. Marchesini, Nucl. Phys. B 490 (1993) 441.
[72] T.R. Morris, Phys. Lett. B 334 (1994) 355.
[73] T.R. Morris, Phys. Lett. B 329 (1994) 241.
[74] R. Ball et al., Phys. Lett. B 347 (1995) 80; P. Haagensen et al., Phys. Lett. B 323 (1994) 330.
[75] J. Adams et al., Mod. Phys. Lett. A 10 (1995) 2367.
[76] I. Gel’fand, A. Yaglom, J. Math. Phys. 1 (1960) 48; S. Coleman, Aspects of Symmetry, Cambridge University Press, Cambridge, England, 1985.
[77] S.B. Liao, J. Polonyi, D. Xu, Phys. Rev. D 51 (1995) 748; S.B. Liao, S. Strickland, Phys. Rev. D 52 (1995) 3653; M. D’Attanasio, M. Pietroni, Nucl. Phys. 472 (1996) 711.
[78] N.D. Birrell, P.C.W. Davies, Quantum Fields in Curved Space, Cambridge University Press, London, 1982.
[79] A. Campos, E. Verdaguer, Phys. Rev. D 49 (1994) 1861.
[80] A. Campos, E. Verdaguer, Phys. Rev. D 53 (1996) 1927.
[81] B.L. Hu, Int. J. Theoret. Phys. 38 (1999) 2987.
[82] B.L. Hu, A. Matacz, Phys. Rev. D 51 (1995) 1577.
[83] B.L. Hu, S. Sinha, Phys. Rev. D 51 (1995) 1587.
[84] A. Roura, E. Verdaguer, Phys. Rev. D 60 (1999) 107503.
[85] E. Calzetta, A. Roura, E. Verdaguer, quant-ph/0011097.
[86] F.C. Lombardo, F.D. Mazzitelli, Phys. Rev. D 55 (1997) 3889.
[87] G.A. Vilkovisky, in: S.M. Christensen (Ed.), Quantum Theory of Gravity, Hilger, Bristol, 1984.
[88] A.O. Barvinsky, G.A. Vilkovisky, Nucl. Phys. B 282 (1987) 163; B 333 (1990) 471.
[89] I.G. Avramidi, Yad. Fiz. 49 (1989) 1185; (Sov. J. Nucl. Phys. 49 (1989) 735).
[90] W. Tichy, E.E. Flanagan, Phys. Rev. D 58 (1998) 124007.
[91] F. Cooper et al., Phys. Rev. D 50 (1994) 2848.
[92] M.J. Duff, Phys. Rev. D 9 (1974) 1837.
[93] J.F. Donoghue, Phys. Rev. Lett. 72 (1994) 2996.
[94] H. Hamber, S. Liu, Phys. Lett. B 357 (1995) 51.
[95] I. Muzinich, S. Vokos, Phys. Rev. D 52 (1995) 3472.
[96] D.A.R. Dalvit, F. Mazzitelli, Phys. Rev. D 50 (1994) 1001; see also Phys. Rev. D 52 (1995) 2577; Phys. Rev. D 56 (1997) 7779.
[97] M.J. Duff, J.T. Liu, Phys. Rev. Lett. 85 (2000) 2052; E. Alvarez, F.D. Mazzitelli, Phys. Lett. B 505 (2001) 236.
[98] R. Martín, E. Verdaguer, Int. J. Theoret. Phys. 38 (1999) 3049; R. Martín, E. Verdaguer, Phys. Lett. B 465 (1999) 113; R. Martín, E. Verdaguer, Phys. Rev. D 60 (1999) 084008; R. Martín, E. Verdaguer, Phys. Rev. D 61 (2000) 124024.
[99] D.A.R. Dalvit, Doctoral Thesis, Univ. of Buenos Aires, hep-th/9807112.
[100] J. Zinn-Justin, Field Theory and Critical Phenomena, Oxford University Press, Oxford, 1989; K. Kawasaki, J. Gunton, Phys. Rev. B 13 (1976) 4658; D. Forster, D. Nelson, M. Stephen, Phys. Rev. A 16 (1977) 732.
[101] E. Calzetta, B.L. Hu, Phys. Rev. D 61 (2000) 025012.
[102] S.K. Ma, Modern Theory of Critical Phenomena, Benjamin, London, 1976; J. Zinn-Justin, Statistical Field Theory, John Wiley, New York, 1989; E. Brezin, J.C. Le Guillou, J. Zinn-Justin, Field Theoretical Approach to Critical Phenomena, in: C. Domb, M.S. Green (Eds.), Phase Transitions and Critical Phenomena, Academic Press, London, 1976; M.E. Fisher, Rev. Mod. Phys. 70 (1998) 653.
[103] P. Hohenberg, B. Halperin, Rev. Mod. Phys. 49 (1977) 435; J. Cardy, Scaling and Renormalization in Statistical Physics, Cambridge University Press, Cambridge, 1996; M. Peskin, D. Schroeder, Quantum Field Theory, Addison-Wesley, New York, 1995.
[104] J. Schwinger, Phys. Rev. 82 (1951) 664–679; M. Brown, M. Duff, Phys. Rev. D 11 (1975) 2124.
[105] E. Calzetta, B.L. Hu, Phys. Rev. D 55 (1997) 3536.
[106] U. Frisch, Turbulence, Cambridge University Press, Cambridge, 1995 and references therein.
[107] D. Hochberg et al., J. Stat. Phys. 99 (2000) 903; D. Hochberg, C. Molina-Paris, M. Visser, cond-mat/0009424.
[108] E. Calzetta, B.L. Hu, Phys. Rev. D 59 (1999) 065018.
[109] D. O’Connor, C.R. Stephens, Int. J. Mod. Phys. A 9 (1994) 2805; Phys. Rev. Lett. 72 (1994) 506; M. Van Eijck, D. O’Connor, C.R. Stephens, Int. J. Mod. Phys. A 10 (1995) 3343; C.R. Stephens, preprint hep-th/9611062; F. Freire, M. Van Eijck, D. O’Connor, C.R. Stephens, preprint hep-th/9601165.